100% found this document useful (1 vote)

464 views247 pages

Discrete Mathematics and Functional Programming PDF

This document is a table of contents for a book on discrete mathematics and functional programming. It contains 8 chapters that cover topics like sets, logic, proofs, algorithms, relations, functions, programs, and graphs. The chapters progress from basic concepts like sets and logic to more advanced topics such as proofs, algorithms, and graph theory. The book uses examples from the functional programming language ML to illustrate concepts.

Uploaded by

jib90957

We take content rights seriously. If you suspect this is your content, claim it here.

Available Formats

Download as PDF, TXT or read online on Scribd

100% found this document useful (1 vote)

464 views247 pages

Discrete Mathematics and Functional Programming PDF

Uploaded by

jib90957

We take content rights seriously. If you suspect this is your content, claim it here.

Available Formats

Download as PDF, TXT or read online on Scribd

You are on page 1/ 247

Discrete Mathematics and Functional Programming

Thomas VanDrunen

June 23, 2006

2
Brief contents

I Set 1

II Logic 33

III Proof 71

IV Algorithm 91

V Relation 127

VI Function 155

VII Program 181

VIII Graph 215

i
BRIEF CONTENTS BRIEF CONTENTS

ii
Contents

Preface ix

I Set 1
1 Sets and elements 3
1.1 Your mathematical biography . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3
1.2 Reasoning about items collectively . . . . . . . . . . . . . . . . . . . . . . . . . . . . 6
1.3 Intuiton about sets . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 6
1.4 Set notation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7

2 Expressions and Types 11

2.1 Expressions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 11
2.2 Types . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 12
2.3 Variables . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 14
2.4 Making your own types . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 15

3 Set Operations 19
3.1 Axiomatic foundations . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 19
3.2 Operations and visualization . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 19
3.3 Powersets, cartesian products, and partitions . . . . . . . . . . . . . . . . . . . . . . 22

4 Tuples and Lists 25

4.1 Tuples . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 25
4.2 Lists . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 27
4.3 Lists vs. tuples vs. arrays . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 30

II Logic 33
5 Logical Propositions and Forms 35
5.1 Forms . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 35
5.2 Symbols . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 36
5.3 Boolean values . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 37
5.4 Truth tables . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 38
5.5 Logical equivalence . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 39

6 Conditionals 43
6.1 Conditional propositions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 43
6.2 Negation of a conditional . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 44
6.3 Converse, inverse, and contrapositive . . . . . . . . . . . . . . . . . . . . . . . . . . . 45
6.4 Writing conditionals in English . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 46
6.5 Conditional expressions in ML . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 46

iii
CONTENTS CONTENTS

7 Argument forms 49
7.1 Arguments . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 49
7.2 Common syllogisms . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 50
7.3 Using argument forms for deduction . . . . . . . . . . . . . . . . . . . . . . . . . . . 51

8 Predicates and quantifiers 55

8.1 Predication . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 55
8.2 Universal quantification . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 56
8.3 Existential quantification . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 57
8.4 Implicit quantification . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 58
8.5 Negation of quantified propositions . . . . . . . . . . . . . . . . . . . . . . . . . . . . 59

9 Multiple quantification; representing predicates 61

9.1 Multiple quantification . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 61
9.2 Ambiguous quantification . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 63
9.3 Predicates in ML . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 64
9.4 Pattern-matching . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 65

III Proof 71
10 Subset proofs 73
10.1 Introductory remarks . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 73
10.2 Forms for proofs . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 73
10.3 An example . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 75
10.4 Closing remarks . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 76

11 Set equality and empty proofs 79

11.1 Set equality . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 79
11.2 Set emptiness . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 80
11.3 Remarks on proof by contradiction . . . . . . . . . . . . . . . . . . . . . . . . . . . . 80

12 Conditional proofs 83
12.1 Worlds of make believe . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 83
12.2 Integers . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 84
12.3 Biconditionals . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 85
12.4 Warnings . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 85

13 Special Topic: Russell’s paradox 89

IV Algorithm 91
14 Algorithms 93
14.1 Problem-solving steps . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 93
14.2 Repetition and change . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 94
14.3 Packaging and parameterization . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 96
14.4 Example . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 98

15 Induction 101
15.1 Calculating a powerset . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 101
15.2 Proof of powerset size . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 102
15.3 Mathematical induction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 104
15.4 Induction gone awry . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 105
15.5 Example . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 105

iv
CONTENTS CONTENTS

16 Correctness of algorithms 109

16.1 Defining correctness . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 109
16.2 Loop invariants . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 110
16.3 Big example . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 111
16.4 Small example . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 112

17 From Theorems to Algorithms 115

17.1 The Division Algorithm . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 115
17.2 The Euclidean Algorithm . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 116
17.3 The Euclidean Algorithm, another way . . . . . . . . . . . . . . . . . . . . . . . . . . 118

18 Recursive algorithms 121

18.1 Analysis . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 121
18.2 Synthesis . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 123

V Relation 127
19 Relations 129
19.1 Definition . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 129
19.2 Representation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 130
19.3 Manipulation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 132

20 Properties of relations 135

20.1 Definitions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 135
20.2 Proofs . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 136
20.3 Equivalence relations . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 136
20.4 Computing transitivity . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 138

21 Closures 141
21.1 Transitive failure . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 141
21.2 Transitive and other closures . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 143
21.3 Computing the transitive closure . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 143
21.4 Relations as predicates . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 145

22 Partial orders 149

22.1 Definition . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 149
22.2 Comparability . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 150
22.3 Topological sort . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 151

VI Function 155
23 Functions 157
23.1 Intuition . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 157
23.2 Definition . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 158
23.3 Examples . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 159
23.4 Representation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 160

24 Images 163
24.1 Definition . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 163
24.2 Examples . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 164
24.3 Map . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 165

v
CONTENTS CONTENTS

25 Function properties 167

25.1 Definitions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 167
25.2 Cardinality . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 168
25.3 Inverse functions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 169

26 Function composition 171

26.1 Definition . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 171
26.2 Functions as components . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 172
26.3 Proofs . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 173

27 Special Topic: Countability 177

VII Program 181

28 Recursion Revisited 183
28.1 Scope . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 183
28.2 Recurrence relations . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 186

29 Recursive Types 189

29.1 Datatype constructors . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 189
29.2 Peano numbers . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 191
29.3 Parameterized datatype constructors . . . . . . . . . . . . . . . . . . . . . . . . . . . 193

30 Fixed-point iteration 197

30.1 Currying . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 197
30.2 Problem . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 198
30.3 Analysis . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 200
30.4 Synthesis . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 202

31 Combinatorics 205
31.1 Counting . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 205
31.2 Permutations and combinations . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 206
31.3 Computing combinations . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 207

32 Special Topic: Computability 209

33 Special topic: Comparison with object-oriented programming 211

VIII Graph 215

34 Graphs 217
34.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 217
34.2 Definitions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 218
34.3 Proofs . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 219
34.4 Game theory . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 220

35 Paths and cycles 223

35.1 Walks and paths . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 223
35.2 Circuits and cycles . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 224
35.3 Euler circuits and Hamiltonian cycles . . . . . . . . . . . . . . . . . . . . . . . . . . . 225

vi
CONTENTS CONTENTS

36 Isomorphisms 229
36.1 Definition . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 229
36.2 Isomorphic invariants . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 230
36.3 The isomorphic relation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 231
36.4 Final bow . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 231

vii
CONTENTS CONTENTS

viii
Preface

If you have discussed your schedule this semester with anyone, you have probably been asked what
discrete mathematics is—or perhaps someone has asked would could make math indiscreet. While
discrete mathematics is something few people outside of mathematical fields have heard of, it is
comprised of topics that are fundamental to mathematics; to gather these topics together into one
course is a more recent phenomenon in mathematics curriculum. Because these topics are sometimes
treated separately or in various other places in an undergraduate course of study in mathematic,
discrete math texts and courses appear like hodge-podges, and unifying themes are sometimes hard
to identify. Here we will attempt to shed some light on the matter.
Discrete mathematics topics include symbolic logic and proofs, including proof by induction;
number theory; set theory; functions and relations on sets; graph theory; algorithms, their analysis,
and their correctness; matrices; sequences and recurrence relations; counting and combinatorics;
discrete probability; and languages and automata. All of these would be appropriate in other courses
or in their own course. Why teach them together? For one thing, students in a field like computer
science need a basic knowledge of many of these topics but do not have time to take full courses in
all of them; one course that meanders through these topics is thus a practical compromise. (And
no one-semester course could possibly touch all of them; we will be completely skipping matrices,
probability, languages, and automata, and number theory, sequences, recurrence relations, counting,
and combinatorics will receive only passing attention.)
However, all these topics do have something in common which distinguishes them from much of
the rest of mathematics. Subjects like calculus, analysis, and differential equations, anything that
deals with the real or complex numbers, can be put under the heading of continuous mathematics,
where a continuum of values is always in view. In contrast to this, discrete mathematics always has
separable, indivisible, quantized (that is, discrete) objects in view—things like sets, integers, truth
values, or vertices in a graph. Thus discrete math stands towards continuous math in the same way
that digital devices stand toward analog. Imagine the difference between an electric stove and a gas
stove. A gas stove has a nob which in theory can be set to an infinite number of positions between
high and low, but the discrete states of an electric stove are a finite, numbered set.
This particular course, however, does something more. Here we also intertwine functional pro-
gramming with the discrete math topics. Functional programming is a different style or paradigm
from the procedural, imperative, and/or object-oriented approach that those of you who have pro-
grammed before have seen (which, incidentally, should place students who have programming ex-
perience and those who have not on a level playing field). Instead of viewing a computer program
as a collection of commands given to a computer, we see a program as a collection of interacting
functions, in the mathematical sense. Since functions are a major topic of discrete math anyway,
the interplay is natural. As we shall see, functional programming is a useful forum for illustrating
the other discrete math topics as well.
But like any course, especially at a liberal arts college, our main goal is to make you think better.
You should leave this course with a sharper understanding of categorical reasoning, the ability to
analyze logic formally, an appreciation for the precision of mathematical proofs, and the clarity of
thought necessary to arrange tasks into an algorithm. The slogan of this course is, “Math majors
should learn to write programs and computer science majors should learn to write proofs together.”
Math majors will spend most of their time as undergraduates proving things, and computer science
majors will do a lot of programming; yet they both need to do a little of the other. The fact is,

ix
CHAPTER 0. PREFACE

robust programs and correct proofs have a lot to do with each other, not the least of which is that
they both require clear, logical thinking. We will see how proof techniques will allow us to check
that an algorithm is correct, and that proofs can prompt algorithms. Moreover, the programming
component motivates the proof-based discrete math for the computer science majors and keeps it
relevant; the proof component should lighten the unfamiliarity that math majors often experience
in a programming course.
There are three major theme pairs that run throughout this course. The theme of proof and
program has already been explained. The next is symbol and representation. So much of
precise mathematics relies on accurate and informative notation, and it is important to distinguish
the difference between a symbol and the idea it represents. This is also a point where math and
computer science streams of thought swirl; much of our programming discussions will focus on the
best ways to represent mathematical concepts and structure on a computer. Finally, the theme of
analysis and synthesis will recur. Analysis is the taking of something apart; synthesis is putting
something together. This pattern occurs frequently in proofs. Take any proposition in the form “if
q then p.” q will involve some definition that will need to be analyzed straight away to determine
what is really being asserted. The proof will end by assembling the components according to the
definitions of the terms used in p. Likewise in functional programming, we will practice decomposing
a problem into its parts and synthesizing smaller solutions into a complete solution.
This course covers a lot of material, and much of it is challenging. However, with careful practice,
none of it is beyond the grasp of anyone with the mathematical maturity that is usually achieved
around the time a student takes calculus.

x
Part I

Set

1
Chapter 1

Sets and elements

1.1 Your mathematical biography

Much of your mathematical eduction can be retraced by considering your expanding awareness of
different kinds of numbers. In fact, human civilization’s understanding of mathematics progressed
in much the same way.
When you first conceptualized differences in quantities (10 Cheerios is more than 3 Cheerios)
and learned to count, the numbers you used where 1, 2, 3, . . . up to the highest number to which you
could count. We call these numbers the natural numbers, and we will represent all of them with the
symbol N. It is critical to note that we are using this to symbolize all or them, not each of them. N
is not a variable that can stand for any natural number, but a constant that stands for all natural
numbers as a group.
In the early grades, you learned to count ever higher (learned of more natural numbers) and
learned operations you could perform on them (addition, subtraction, etc.). You also learned of an
extra number, 0 (an important step in the history of mathematics). We will designate all natural
numbers and 0 to be whole numbers, and represent them collectively by W.
0 is a whole number but not a natural number. 5 is both. In fact, anything that is a natural
number is also a whole number, but only some things (specificly, everything but zero) that are whole
numbers are also natural numbers. Thus W is a broader category than N. We can visualize this in
the following diagram.

The whole number operation of subtraction eventually forced you to face a dilemma: what
happens if you subtract a larger number from a smaller number? Since W is insufficient to answer
this, negative numbers were invented. We call all whole numbers with their opposites (that is, their
negative counterparts) integers, and we use Z (from Zahlen, the German word for “numbers”) to
symbolize the integers.

−5 5

Division, as an operation on integers, results in a similar problem. What happens if we divide 5

by 2? In the history of mathematical thinking, we can imagine two cavemen arguing over how they

3
1.1. YOUR MATHEMATICAL BIOGRAPHY CHAPTER 1. SETS AND ELEMENTS

can split five apples. Physically, they could chop one of the apples into two equal parts and each get
one part, but how can you describe the resulting quantity that each caveman would get? Human
languages handle this with words like “half”; mathematics handles this with fractions, like 25 or the
equivalent 2 21 , which is shorthand for 2 + 21 . We call numbers that can be written as fractions (that
is, ratios of integers) rational numbers, symbolized by Q (for quotient). Since a number like 5 can
be written as 15 , all integers are rational numbers.

5 3
−5 7

Geometry exposed the insufficiency of rational numbers.

√ For example, an isosceles right triangle
with sides of length 1 inch has a hypotenuse of length 2 by the Pythagorean theorem. A circle
with diameter
√ 1 inch has a circumference of π. Obviously such a triangle and circle can be drawn on
paper, so 2 and π are measurable quantities possible in the physical world. However, they cannot
be written as fractions, and hence they are not rational numbers. We call all of these “possible real
world quantities” real numbers, and symbolize them by R.

5 3
−5 7

All of these designations ought to be second nature to you. A lesser known distinction that
you may or may not remember is that real numbers can be split up√into two camps: algebraic
numbers (A), each of which is a root to some polynomial function, like 2 and all the integers; and
transcendental numbers (T), which are not.

4
CHAPTER 1. SETS AND ELEMENTS 1.1. YOUR MATHEMATICAL BIOGRAPHY

5 3
−5 7

We first considered negative numbers when we invented the integers. However, as we expanded
to rationals and reals, we introduced both new negative numbers and new positive numbers. Thus
negative (real) numbers considered as a collection (R− ) cut across all of these other collections,
except W and N.

5 3
−5 7

To finish off the picture, remember how N, Z, and Q each in turn proved to be inadequate√because
of operations we wished to perform on them. Likewise R is inadequate for operations like −1. To
handle that, we have complex numbers, C.

5 3
−5 7

5
1.3. INTUITON ABOUT SETS CHAPTER 1. SETS AND ELEMENTS

1.2 Reasoning about items collectively

What is the meaning of this circle diagram and these symbols to represent various collections of
numbers? Like all concepts and their representations, these are tools for reasoning and communi-
cating ideas. The following table lists various statements that express ideas using the terminology
introduced in the preceding exercise. Meanwhile, in the right column we write these statements
symbolically, using symbols that will be formally introduced later.

5 is a natural number; or the collection of natural 5∈N

numbers contains 5.

Adding 0 to the collection of natural numbers W = {0} ∪ N

makes the collection of whole numbers.

Merging the algebraic numbers and the transcen- R=A∪T

dental numbers makes the real numbers.

Transcendental numbers are those real numbers T=R−A

which are not algebraic numbers.

Nothing is both transcendental and algebraic, or T∩A=∅

the collection of things both transcendental and
algebraic is empty.

Negative integers are both negative and integers. Z − = R− ∩ Z

All integers are rational numbers. Z⊆Q

Since all rational numbers are algebraic numbers Q⊆A

and all algebraic numbers are real numbers, it fol- A⊆R
lows that all rational numbers are real numbers. ∴ Q⊆R

We also note that in the circle diagram and the acompanying

√ discussion, we are dealing with two
very different sorts of things: on one hand we have 5, 73 , 2, π, 2i + 3, and the like; on the other
hand we have things like N, Z, Q, R, and C. What is the difference? We have been referring to the
element former by terms such as “number” or “item”, but the standard mathematical term is element. We
have called any of the latter category a “collection” or “group”; in mathematics, we call such a thing
set a set. Informally, a set is a group or collection of items categorized together because of perceived
common properties.
This presents one of the fundamental shifts in thinking from continuous mathematics. such as
pre-calculus and calculus. Up till now, you have concerned yourself with the the contents of these
sets; in discrete mathematics, we will be reasoning about sets themselves.

1.3 Intuiton about sets

There is nothing binding about the sets we mentioned earlier. We can declare sets arbitrarily—such
as the set of even whole numbers, or simply the set containing only 1, 15, and 23. We can have
sets of things other than numbers. For example, other mathematical objects can be considered
collectively—a set of geometric points, a set of matrices, or a set of functions. But we are not
limited to mathematical objects—we can declare sets of physical objects or even abstract ideas.
Grammatically, anything that is a noun can be discussed in terms of sets. We may speak of

The set of students in this course.

6
CHAPTER 1. SETS AND ELEMENTS 1.4. SET NOTATION

The set of car models produced by Ford (different from the set of cars produced by Ford).
The set of entrees served at Bon Appetit.
The set of the Fruits of the Spirit.

Since set is a noun, we can even have a set of sets; for example, the set of number sets included
in R, which would contain Q, Z, W, and N. In theory, a set can even contain itself—the set of things
mentioned on this page is itself mentioned on this page and thus includes itself—though that leads
to some paradoxes.
Hrbacek and Jech give an important clarification to our intuition of what a set is:

Sets are not objects of the real world, like tables or stars; they are created by our mind,
not by our hands. A heap of potatoes is not a set of potatoes, the set of all molecules in
a drop of water is not the same object as that drop of water[9].

It is legitimate, though, to speak of the set of molecules in the drop and of the set of potatoes in
the heap.

1.4 Set notation

Finally, we explain some of the notation we used earlier:
We can describe a set explicitly by listing the elements of the set inside curly braces. Remem- {}
bering that order does not matter in a set, then for example, to define the set of the colors red and
green,

X = {Red, Green} = {Green, Red}

. . . but do not forget that

Red 6= {Red}
Using this system, {} stands for a set with no elements, that is, the empty set, but we also have a
special symbol for that, ∅. The symbol ∈ stands for set membership and should be read “an element ∅
of” or “is an element of”, depending on the grammatical context (sometimes just “in” works if you
∈
are reading quickly).

Red ∈ {Green, Red}

The curly braces can be used more flexibly if you want to specify the elements of a set by property
rather than listing them explicity. Begin an expression like this by giving a variable to stand for an
arbitrary element of the set being defined, a vertical bar (read as “such that”), a statement that
the element is already a member in another set, and finally a statement that some other property is
true for these elements. For example, one way to define the set of natural numbers is

N = {x|x ∈ Z, x > 0}
which reads “the set of natural numbers is the set of all x such that that x is an integer and x is
greater than 0.” Recall from analysis that you can specify a range on the real number line, say all
from one exclusive to 5 inclusive, by the notation (1, 5]. Indeed, a range is a set; note

(1, 5] = {x|x ∈ R, 1 < x ≤ 5}

If one set is completely contained in another, as {Green, Red} is completely contained in {Red, Green, Blue},
was say that the first is a subset of the second. The symbol ⊆ reads “a subset of” or “is a subset subset, ⊆
of”, as in

{Green, Red} ⊆ {Red, Green, Blue}

7
1.4. SET NOTATION CHAPTER 1. SETS AND ELEMENTS

Note that with this definition, it so happens that

For any set X, ∅ ⊆ X

For any set X, X ⊆ X

For any sets X and Y , X ⊆ Y and Y ⊆ X iff X = Y

superset, ⊆ Those with Hebraic tendencies will appreciate the less-used ⊇, standing for superset, that is,
X ⊇ Y if Y is completely contained in X. Also, if you want to exclude the possibility that a subset
is equal to the larger set (say X is contained in Y , but Y has some elements not in X), what you
proper subset, ⊂ have in mind is called a proper subset, symbolized by X ⊂ Y . Compare ⊆ and ⊂ with < and ≤.
Rarely, though, will we want to restrict ourselves to proper subsets.
Often it is useful to take all the elements in two or more sets and consider them together. The
union, ∪ resulting set is called the union, and its construction is symbolized by ∪. The union of two sets is
the set of elements in either set.

{Orange, Red} ∪ {Green, Red} = {Orange, Red, Green}

If any element occurs in both of the original sets, it still occurs only once in the resulting set.
There is no notion of something occuring twice in a set. On the other hand, sometimes we will want
intersection, ∩ to consider only the elments in both sets. We call that the intersection, and use ∩.

{Orange, Red} ∩ {Green, Red} = {Red}

At this point it is very important to understand that X ∩ Y means “the set where X and Y
overlap.” It does not mean “X and Y overlap at some point.” It is a noun, not a sentence. This
will be reemphasized in the next chapter.
difference, − The fanciest operation on sets for this chapter is set difference, which expresses the set the
resulting when we remove all the element of one set from another. We use the subtraction symbol
for this, say X − Y . Y may or may not be a subset of X.

{Orange, Red, Green, Blue} − {Blue, Orange} = {Red, Green}

{Red, Green, Blue} − {Blue, Orange} = {Red, Green}

Finally, those circle diagrams have a name. Venn diagrams are so called after their inventor
John Venn. They help visualize how different sets relate to each other in terms of containment and
overlap. Note that the areas of the regions have no meaning—a large area might not contain more
elements than a small area, for example.

8
CHAPTER 1. SETS AND ELEMENTS 1.4. SET NOTATION

Exercises

9. A ⊆ C.
Let T be the set of trees, D be the set of deciduous trees, and C 10. R ⊆ C ∩ R−1
be the set of coniferous trees. In exercises 1–6, write the statement
symbolically. 11. 4 ∈ C.
12. Q ∩ T = ∅.
1. Oak is a deciduous tree.
1
13. ∈ Q − R.
2. Pine is not a decidous tree. 63

14. Z − R−1 = W.
3. All coniferous trees are trees.
15. T ∪ Z ⊆ A.
4. Decidous trees are those that are trees but are not conifer-
ous. 16. All of the labeled sets we considered in Section 1.1 have an in-
5. Decidous trees and conferous trees together make all trees. finite number of elements, even though some are completely
contained in others. (We will later consider whether all in-
6. There is no tree that is both deciduous and coniferous. finities should be considered equal.) However, two regions
7. Write [2.3, 9.5) in set notation. have a finite number of elements.

a. Describe the region shaded . How many elements does

In exercises 8–15, determine whether each statement is true or

it have?
false.
b. Describe the region shaded . How many elements
8. −12 ∈ N. does it have?

0

0

3

5

−5 7

9
1.4. SET NOTATION CHAPTER 1. SETS AND ELEMENTS

10
Chapter 2

Expressions and Types

2.1 Expressions
One of the most important themes in this course is the modeling or representation of mathematical
concepts in a computer system. Ultimately, the concepts are modeled in computer memory; the
arrangements of bits is information which we interpret as representing certain concepts, and we
program the computer to operate on that information in a way consistent with our interpretation.
An expression is a programming language construct that expresses something. ML’s interactive expression
mode works as a cycle: you enter an expression, which it will evaluate. A value is the result of the
evaluation of an expression. To relate this to the previous chapter, a value is like an element. An evaluate
expression is a way to describe that element. For example, 5 and 7 − 2 are two ways to express the
value
same element of N.
When you start ML, you will see a hyphen, which is ML’s prompt, indicating it is waiting for
you to enter an expression. The way you communicate to ML is to enter an expression followed by
a semicolon and pressing the “enter” key. (If you press “enter” before the expression is finished, you
will get a slightly different prompt, marked by an equals sign; this indicates ML assumes you have
more to say.)
Try entering 5 into the ML prompt. Text that the user types into the prompt will be in
typewriter font; ML’s response will be in slanted typewriter font .

- 5;

val it = 5 : int

The basic form you use is

<expression> ;

This is what the response means:

val is short for “value”, indicating this is the value the ML interpreter has found for the expression
you entered.
it is a variable. A variable is a symbol that represents a value in a given context. Note that this variable
means that a variable, too, is an expression; however, unlike the symbol 5, the value associated
with the variable changes as you declare it to. Variables in ML are like those you are familiar
with from mathematics (and other programming languages, if you have programmed before),
and you can think of a variable as a box that stores values. Unless directed otherwise, ML
automatically stores the value of the most recently evaluated expression in a variable called
it .
5 is the value of the expression (not surprisingly, the value of 5 is 5 ).

11
2.2. TYPES CHAPTER 2. EXPRESSIONS AND TYPES

int is the type of the expression (in this case, short for “integer”), about which we will say more
soon.

We can make more interesting expressions using mathematical operators. We can enter

- 7 - 2;

val it = 5 : int

Note that this expression itself contains two other expressions, 7 and 2. Smaller expressions that
subexpression compose a larger expression are called subexpressions of that expression. - is an operator , and the
operator subexpressions are the operands of that operator. + means what you would expect, * stands for
operand multiplication, and ∼ is used as a negative sign (having one operand, to distinguish it from -, which
has two); division we will discuss later. To express (and calculate) 67 + 4 × −13, type

- 67 + 4 * ~ 13;

val it = 15 : int

2.2 Types
type So far, all these values have had the type int (we will use sans serif font for types). A type is a set
of values that are related by the operations that can be performed on them. This provides another
example of modeling concepts on a computer: a type models our concept of set.
Nevertheless, this also demonstrates the limitations of modeling because types are more restricted
than our general concept of a set. ML does not provide a way to use the concepts of subsets, unions,
or intersections on types. We will later study other ways to model sets to support these concepts.
Moreover, the type int, although it corresponds to the set Z in terms of how we interpret it, does
not equal the set Z. The values (elements) of int are computer representations of integers, not the
integers themselves, and since computer memory is limited, int comprises only a finite number of
values. On the the computer used to write this book, the largest integer ML recognizes is 1073741823.
Although 1073741824 ∈ Z, it is not a valid ML int.

- 1073741824;

stdIn:6.1-6.11 Error: int constant too large

ML also has a type real corresponding to R. The operators you have already seen are also defined
for reals, plus / for division.

- ~4.73;

val it = ~4.73 : real

- 7.1 - 4.8 / 63.2;

val it = 7.02405063291 : real

- 5.3 - 0.3;

val it = 5.0 : real

12
CHAPTER 2. EXPRESSIONS AND TYPES 2.2. TYPES

Notice that 5.0 has type real, not type int. Again the set modeling breaks down. int is not a
subset (or subtype) of real, and 5.0 is a completely different value from 5 .
A consequence of int and real being unrelated is that you cannot mix them in arithmetic expres-
sions. English requires that the subject of a sentence have the same number (singular or plural) as
the main verb, which is why it does not allow a sentence like, “Two dogs walks down the street.”
This is called subject-verb agreement. In the same way, these ML operators require type agreement. type agreement
That is, +, for example, is defined for adding two reals and for adding two ints, but not one of each.
Attempting to mix them will generate an error.

- 7.3 + 5;

stdIn:16.1-16.8 Error: operator and operand don’t agree [literal]

operator domain: real * real
operand: real * int
in expression:
7.3 + 5

This rule guarantees that the result of an arithmetic operation will have the same type as the
operands. This complicates the division operation on ints. We expect that 5 ÷ 4 = 1.25—as we
noted in the previous chapter, division takes us out of the circle of integers. Actually, the / operator
is not defined for ints at all.

- 5/4;

stdIn:20.2 Error: overloaded variable not defined at type

symbol: /
type: int

Instead, another operator performs integer division, which computes the integer quotient (that is, integer division
ignoring the remainder) resulting from dividing two integers. Such an operation is different enough
from real number division that it uses a different symbol: the word div. The remainder is calculated
by the modulus operator, mod.

- 5 div 3;

val it = 1 : int

- 5 mod 3;

val it = 2 : int

But would it not be useful to include both reals and ints in some computations? Yes, but to
preserve type purity and reduce the chance of error, ML requires that you convert such values
explicitly using one of the converters in the table below. Note the use of parentheses. These
“converters” are functions, as we will see in a later chapter.

Converter converts from to by

real() int real appending a 0 decimal portion
round() real int conventional rounding
floor() real int rounding down
ceil() real int rounding up
trunc() real int throwing away the decimal portion

For example,

13
2.3. VARIABLES CHAPTER 2. EXPRESSIONS AND TYPES

- 15.3 / real(6);

val it = 2.55 : real

- trunc(15.3) div 6;

val it = 2 : int

2.3 Variables
Since variables are expressions, they are fair game for entering into a prompt.

- it;

val it = 2 : int

We too little appreciate what a powerful thing it is to know a name. Having a name by which
to call something allows one to exercise a certain measure of control over it. As Elwood Dowd said
when meeting Harvey the rabbit, “You have the advantage on me. You know my name—and I don’t
know yours”[2]. More seriously, the Third Commandment shows how zealous God is for the right
use of his name. Geerhardus Vos comments:

It is not sufficient to think of swearing and blasphemy in the present-day common sense
of these terms. The word is one of the chief powers of pagan superstition, and the most
potent form of word-magic is name-magic. It was believed that through the pronouncing
of the name of some supernatural entity this can be compelled to do the bidding of
the magic-user. The commandment applies to the divine disapproval of such practices
specifically to the name “Jehovah.” [13].

Compare also Ex 6:2 and Rev 19:12. A name is a blessing, as in Gen 32:26-29 and Rev 2:17.
Even more so, to give a name to something is act of dominion over it, as in Gen 2:19-20. Think of
how in the game of tag the player who is “it” has the power to place that name on someone else.
The name it in ML gives us the power to recall the previous value

- it * 15;

val it = 30 : int

- it div 10;

val it = 3 : int

To name a value something other than it, imitate the interpreter’s response using val, the
desired variable, and equals, something in the form of

val <identifier> = <expression>;

- val x = 5;

val x = 5 : int

You could also add a colon and a type after the expression, like the interpreter does in its response,
but there is no need to—the interpreter can figure that out on its own. However, we will see a few
occasions much later when complicated expressions need an explicit typing for disambiguation.
identifier An identifier is a programmer-given name, such as a variable. ML has the following rules for
valid identifiers:

14
CHAPTER 2. EXPRESSIONS AND TYPES 2.4. MAKING YOUR OWN TYPES

1. The first character must be a letter1 .

2. Subsequent characters must be letters, digits, or underscores.

3. Identifiers are case sensitive.

It is convention to use mainly lowercase letters in variables. If you use several words joined
together to make a variable, capitalize the first letter of the subsequent words.

- val secsInMinute = 60;

val secsInMinute = 60 : int

- val minutesInHour = 60;

val minutesInHour = 60 : int

- val hoursInDay = 24;

val hoursInDay = 24 : int

- val daysInYear = 365;

val daysInYear = 365 : int

- val secsInHour = secsInMinute*minutesInHour;

val secsInHour = 3600 : int

- val hoursInYear = hoursInDay * daysInYear;

val hoursInYear = 8760 : int

- val secsInYear = secsInHour * hoursInYear;

val secsInYear = 31536000 : int

2.4 Making your own types

Finally, we will briefly look at how to define your own type. A programmer-defined type is called
a datatype, and you can think of such a type as a set of arbitrary elements. Suppose we want to datatype
model species of felines or trees.

- datatype feline = Leopard | Lion | Tiger | Cheetah | Panther;

datatype feline = Cheetah | Leopard | Lion | Panther | Tiger

- datatype tree = Oak | Maple | Pine | Elm | Spruce;

datatype tree = Elm | Maple | Oak | Pine | Spruce

1 Identifiers may also begin with an apostrophe, but only when used for a special kind of variable

15
2.4. MAKING YOUR OWN TYPES CHAPTER 2. EXPRESSIONS AND TYPES

When defining a datatype, separate the elements by vertical lines called “pipes,” a character
that appears with a break in the middle on some keyboards. The name of the type and the elements
must be valid identifiers. As demonstrated here, it is conventional for the names of types to be all
lower case, whereas the elements have their first letters capitalized. Notice that in its response, ML
alphabetizes the elements; as with any set, their order does not matter. Until we learn to define
functions, there is little of interest we can use datatypes for. Arithmetic operators, not surprisingly,
cannot be used.

- val cat = Leopard;

val cat = Leopard : feline

- val evergreen = Spruce;

val evergreen = Spruce : tree

- cat + evergreen;

stdIn:49.1-49.16 Error: operator and operand don’t agree [tycon mismatch]

operator domain: feline * feline
operand: feline * tree
in expression:
cat + evergreen
stdIn:49.5 Error: overloaded variable not defined at type
symbol: +
type: feline

16
CHAPTER 2. EXPRESSIONS AND TYPES 2.4. MAKING YOUR OWN TYPES

Exercises

(e) wHeAtoN
1. Determine the type of each of the following. (f) wheaton
(a) 5.3 + 0.3 (g) wheaton12
(b) 5.3 - 0.3 (h) 12wheaton
(c) 5.3 < 0.3
5. Redo the computation of the number of seconds in a year in
(d) 24.0 / 6.0 Section 2.3, but take into consideration that there are actu-
(e) 24.0 * 6.0 ally 365.25 days in a year, and so daysInYear should be of
(f) 24 * 6 type real. Your final answer should still be an int.

2. Is ceil(15.2) an expression? If no, why not? If so, what is 6. Mercury orbits the sun in 87.969 days. Calculate how old
its value and what is its type? Would ML accept ceil(15)? you will be in 30 Mercury-years. Your answer should de-
Why or why not? pend on your birthday (that is, don’t simply add a number
of years to your age), but it should be an int.
3. Make two points, stored in variables point1 and point2, and
calculuate the distance between the two points. (Recall that 7. Create a datatype of varieties of fish.
this can be done using the Pythagorean theorem.) 8. Use ML to compute the circumference and area of a circle
4. Which of the following are valid ML identifiers? and the volume and surface area of a sphere, first each of
radius 12, then of radius 12.75.
(a) wheaton
(b) wheaton college 9. Store the values 4.5 and 6.7 in variables standing for base
and height, respectively, of a rectangle, and use the variables
(c) wheaton’college to calculate the area. Then do the same but assuming they
(d) wheatonCollege are the base and height of a triangle.

17
2.4. MAKING YOUR OWN TYPES CHAPTER 2. EXPRESSIONS AND TYPES

18
Chapter 3

Set Operations

3.1 Axiomatic foundations

It is worth reemphasizing that the definitions we gave for set and element in Chapter 1 were in-
formal. Formal definitions for them are, in fact, impossible. They constitute basic building blocks
of mathematical language. In geometry, the terms “point,” “straight line,” and “planes” have a
similar primitivity[12]. Instead of formal definitions, these sorts of objects were described using
axioms, propositions that were assumed rather than proven, and by which all other propositions
were proven.
Here are two axioms (of many others) that are used to ground set theory.

Axiom 1 (Existence.) There is a set with no elements.

Axiom 2 (Extensionality.) If every element of a set X is an element of a set Y and every element
of Y is an element of X, then X = Y .

We may not know what sets and elements are, but we know that it is possible for a set to have
no elements; Axiom 1 tells us that there is an empty set. Axiom 2 tells us what it means for sets to
be equal, and this implicit definition captures what we mean when we say that sets are unordered,
since if two different sets had all the same elements but in different orders, they in fact would not
be two different sets, but one and the same.
Moreover, putting these two axioms together confirms that we may speak meaningfully not only
of an empty set, but the empty set, since there is only one. Suppose for the sake of argument there
were two empty sets. Since they have all the same elements—that is, none at all—they are actually
the same set. This is what we would call a trivial application of the axiom, but it is still valid.
Hence the empty set is unique.
A complete axomatic foundation for set theory is tedious and beyond the scope of our purposes.
We will touch on these axioms once more when we study proof in Chapter 10, but the important
lesson for now is the use of axioms to describe basic and undefinable terms. Axioms are good if they
correctly capture our intuion; notice that on page 8 we essentially derived the Axiom 2 from our
informal definition of set.

3.2 Operations and visualization

One of the two main objectives is this course is learning how to write formal proofs. To prime you
for that study, we will verify some propositions visually.
Any discussion that uses set theory takes places within an arbitrarily (and sometimes implicitly)
but fixed set. If we were talking about the sets { Panther, Lion, Tiger } and sets { Bob cat, Cheetah,
Panther, Tiger }, then the context likely implies that all sets in the discussion are subsets of the set
of felines. However, if the set { Panther, Cheetah, Weasel, Sponge } is mentioned, then the backdrop

19
3.2. OPERATIONS AND VISUALIZATION CHAPTER 3. SET OPERATIONS

is probably the set of animals, or the set { Cheetah, Sponge, Apple tree } would imply the set of
living things. Either way, there is some context of which all sets in the discussion are a subset. It
is unlikely one would ever speak of the set { Green, Sponge, Acid reflux, Annuciation } unless the
context is, say, the set of English words.
universal set That background set relevant to the context is called the universal set, and designated U . In
Venn diagrams, it is often drawn as a rectangle framing the other sets. Further, shading is often
used to highlight particular sets or regions among sets. A simple diagram showing a single set might
look like this:

cardinality We also pause to define the cardinality of a finite set, which is the number of elements in the
set. This is symbolized by veritcal bars, like absolute value. If U is the set of lowercase letters, then
|{a, b, c, d}| = 4. Note that this definition does not allow you to say, for example, that the cardinality
of Z is infinity; rather, cardinality is defined only for finite sets. Expanding the idea of cardinality
to infinite sets brings up interesting problems we will explore sometime later.

We can visualize basic set operations by drawing two overlapping circles, shading one of them
(X, defined as above) and the other (Y = {c, d, e, f })
is shaded at all, and the intersection X ∩ Y is the region shaded
. The union X ∪ Y is anything that
, all shown on the left; on the

right is the intersection alone.

U U

X X

Y

X Y

Note that it is not true that |X ∪ Y | = |X| + |Y |, since the elements in the intersection, X ∩ Y =
{c, d, }, would be counted twice. However, do not assume that just because sets are drawn so that
they overlap that they in fact share some elements. The overlap region may be empty. We say that
disjoint two sets X and Y are disjoint is they have no elements in common, that is, X ∩ Y = ∅. Note that
if X and Y are disjoint, then |X ∪ Y | = |X| + |Y |.
We can expand the notion of disjoint by considering a larger collection of sets. A set of sets
pairwise disjoint {A1 , A2 , . . . An } is pairwise disjoint if no pair of sets have any elements in common, that is, if for

20
CHAPTER 3. SET OPERATIONS 3.2. OPERATIONS AND VISUALIZATION

all i, j, 1 ≤ i, j ≤ n, i 6= j, Ai ∩ Aj = ∅.
Remember that the difference between two sets X and Y is X − Y = {x|x ∈ X and x ∈ / Y }.
From our earlier X and Y , X − Y = {a, b}. Having introduced the universal set, it now makes sense
also to talk about the complement of a set, X = {x|x ∈/ X}, everything that (is in the universal set complement
but) is not in the given set. In our example, X = {e, f, g, h, . . . , z}. Difference is illustrated to the
left and complement to the right; Y does not come into play in set complement, but is drawn for
consistency.

U

U

X

Y

Y
X

Y
Now we can use this drawing and shading method to verify propositions about set operations.
For example, suppose X, Y , and Z are sets, and consider the proposition

X ∪ (Y ∩ Z) = (X ∪ Y ) ∩ (X ∪ Z)

This is called the distributive law ; compare with the law from algebra x · (y + z) = x · y + x · y
for x, y, z ∈ R. First we draw a Venn diagram with three circles.

U
Y

X Z

Then we shade it according to the left side of the equation and, separately, according to the right
and compare the two drawings. First, shade X with and Y ∩ Z with . The union operation

indicates all the shaded regions together. (Note that X ∩ (Y ∩ Z) is shaded , but that is not
important for our present task.) Thus the left side of the equation:

21
3.3. POWERSETS, CARTESIAN PRODUCTS, AND PARTITIONS CHAPTER 3. SET OPERATIONS

U
Y

X

Z

. The intersection of these

Now, in a separate picture, shade X ∪ Y with
two sets is the region that is double-shaded, .

and X ∪ Z with

U
Y

X

Z

Since the total shaded region in the first picture is the same as the double-shaded region in the
second picture, we have verified the proposition. Notice how graphics, will a little narration to help,
can be used for an informal, intutive proof.

3.3 Powersets, cartesian products, and partitions

powerset Finally, we introduce three specialized set definitions. The powerset of a set X is the set of all
subsets of X and is symbolized by P(X). Formally,

P(X) = {Y |Y ⊆ X}

If X = {1, 2, 3}, then P(X) = {{1, 2, 3}, {1, 2}, {2, 3}, {1, 3}, {1}, {2}, {3}, ∅}. It is important to
notice that for any set X, X ∈ P(X) and ∅ ∈ P(X), since X is a subset of itself and ∅ is a subset
of everything. It so happens that for finite sets, |P(X)| = 2|X| .
ordered pair An ordered pair is two elements (not necessarily of the same set) written in a specific order.
Suppose X and Y are sets, and say x ∈ X and y ∈ Y . Then we say that (x, y) is an ordered pair
over X and Y . We say two ordered pairs are equal, say (x, y) = (w, x) if x = w and y = z. An
ordered pair is different from a set of cardinality 2 in that it is ordered. Moreover, the Cartesian
Cartesian product product of two sets, X and Y , written X × Y , is the set of all order pairs over X and Y . Formally,

X × Y = {(x, y)|x ∈ X and y ∈ Y }

22
CHAPTER 3. SET OPERATIONS 3.3. POWERSETS, CARTESIAN PRODUCTS, AND PARTITIONS

If X = {1, 2} and Y = {2, 3}, then X × Y = {(1, 2), (1, 3), (2, 2), (2, 3)}. The Cartesian product,
named after Decartes, is nothing new to you. The most famous Cartesian product is R × R, that
is, the Cartesian plane. Similarly, we can define ordered triples, quadrupals, and n-tuples, and
corresponding higher-ordered products.
If X is a set, then a partition of X is a set of non-empty sets {X1 , X2 , . . . , Xn } such that partition
X1 , X2 , . . . , Xn are pairwise disjoint and X1 ∪ X2 ∪ . . . ∪ Xn = X. Intuitively, a partition of a set is
a bunch of non-overlapping subsets that constitute the entire set. From Chapter 1, T and A make
up a partition of R. Here is how we might draw a partition:

X
X2
X1 X5
X4
X3

23
3.3. POWERSETS, CARTESIAN PRODUCTS, AND PARTITIONS CHAPTER 3. SET OPERATIONS

Exercises

5. (A ∩ B) ∩ (A ∪ C).
1. Complete the on-line Venn diagram drills found at
Describe the powerset (by listing all the elements) of the following.
www.ship.edu/~deensl/DiscreteMath/flash/ch3/sec3 1
/venntwoset.html and www.ship.edu/~deensl/DiscreteMath 6. {1, 2}
/flash/ch3/sec3 1/vennthreeset.html
7. {a, b, c, d}
Let A, B, and C be sets, subsets of the universal set U . Draw Venn
diagrams to show the following (do not draw C in the cases where 8. ∅
it is not used). 9. P({1, 2}.
2. (A ∩ B) − A.
10. Describe three distinct partitions of the set Z. For exam-
3. (A − B) ∪ (B − A). ple, one partition is the set of evens and the set of odds
4. (A ∪ B) ∩ (A ∪ C). (remember that these two sets make one partition).

24
Chapter 4

Tuples and Lists

4.1 Tuples
One of the last things we considered in the previous chapter was the Cartesian product over sets. ML
has a ready-made way to represent ordered pairs—or their generalized counterparts, tuples, as they tuples
are more frequently spoken of in the context of ML. In ML, a tuple is made by listing expressions,
separated by commas and enclosed in parentheses, following standard mathematical notation. The
expressions are evaluated, and the values are displayed, again in a standard way.

- (2.3 - 0.7, ~8.4);

val it = (1.6,~8.4) : real * real

Note that the type is real * real, corresponding to R×R. We can think of this as modeling a point
in the real plane. (1.6, 8.4) is itself a value of that type. We can store this value in a variable;
we also can extract the components of this value using #1 and #2 for the first and second number
in the pair, respectively.

- val point = (7.3, 27.85);

val point = (7.3,27.85) : real * real

- #1(point);

val it = 7.3 : real

- #2(point);

val it = 27.85 : real

Suppose we want to shift the point up 5 units and over .3 units.

- val newx = #1(point) + 5.0;

val newx = 12.3 : real

- val newy = #2(point) + 0.3;

val newy = 28.15 : real

25
4.1. TUPLES CHAPTER 4. TUPLES AND LISTS

- (newx, newy);

val it = (12.3,28.15) : real * real

Note that although we mentioned “shifting the point,” we really are not making effecting a
change on the variable point—it has stayed the same. (If you have programmed before, note well
that this is different from changing the state of an object or array.)
- point;

val it = (7.3,27.85) : real * real

We can make tuples of any size and of non-uniform types. We can make tuples of any types,
even of tuple types.
- (4.3, 7.9, ~0.002);

val it = (4.3,7.9,~0.002) : real * real * real

- datatype Bird = Sparrow | Finch | Robin | Owl | Gull | Dove | Eagle

datatype Bird
= Chicken
| Dove
...

- (4.5, Eagle, 17, Finch);

val it = (4.5,Eagle,17,Finch) : real * Bird * int * Bird

- (~3.1, Owl, (3, 5));

val it = (~3.1,Owl,(3,5)) : real * Bird * (int * int)

- #2(#3(it));

val it = 5 : int

It is most important to observe the types of the various expressions we are considering. Type
correctness is how parts of a computer program are assembled in a meaningful way. For the simplic-
ities sake, let us confine ourselves to thinking of pairs of reals. The operators #1 and #2 consume a
pair (a value of type real * real) and produces a real. The parentheses and comma consume two reals
and produces a pair of reals (real * real). Notice the difference between “two reals” (two distinct
values) and “a pair of reals” (one value). Consider again
- (#1(point) + 5.0, #2(point) + 0.3);

val it = (12.3,28.15) : real * real

We can analyze the types of this expression and its subexpressions:

(#1(point) + 5.0, #2(point) + 0.3)
real * real real * real

real real real real

real * real

26
CHAPTER 4. TUPLES AND LISTS 4.2. LISTS

4.2 Lists
At first glance, it might seem that tuples are reasonable candidates for representing sets in ML. We
can group three values together and consider them a single entity, as in
- (Robin, Duck, Chicken);

val it = (Robin,Duck,Chicken) : Bird * Bird * Bird

This is a poor solution because even though a tuple’s length is arbitrary, it is still fixed. The
type Bird * Bird * Bird is more restricted than a “set of Birds.” A 4-tuple of Birds has a completely
different type.
- (Finch, Goose, Penguin, Dove);

val it = (Finch,Goose,Penguin,Dove) : Bird * Bird * Bird * Bird

An alternative which will enable us better to represent the mathematical concept of sets is a
list. Cosmetically, the difference between the two is to use square brackets instead of parentheses. list
Observe how the interpreter responds to these various attempts at using lists.
- [Finch, Robin, Owl];

val it = [Finch,Robin,Owl] : Bird list

- [Vulture, Sparrow];

val it = [Vulture,Sparrow] : Bird list

- [Eagle];

val it = [Eagle] : Bird list

- [];

val it = [] : ’a list

- [13, 28, 22, 23, 4, 57, 86];

val it = [13,28,22,23,4,57,86] : int list

- [5, 5.5, Sparrow];

stdIn:29.1-29.18 Error: operator and operand don’t agree [tycon mismatch]

operator domain: real * real list
operand: real * Bird list
in expression:
5.5 :: Sparrow :: nil

Every list made up of birds has the same type, Bird list, regardless of how many Birds there are.
Likewise we can have a list of ints, but unlike tuples we cannot have a list of mixed types. The type
of which the list is made up is called its base type. The interpreter typed the expression [] as ’a list; base type
’a is a type variable, a symbol ML uses to stand for an unknown type. Two lone square braces is
obviously an empty list, but even an empty list must have a base type. This is the first case we have type variable
seen of an expression whose type ML cannot infer from context. We can disambiguate this using
explicit typing, which is done by following an expression with a colon and the type. For example, explicit typing
we can declare that we want an empty list to be considered a list of Birds.

27
4.2. LISTS CHAPTER 4. TUPLES AND LISTS

- [] : Bird list;

val it = [] : Bird list

It is perfectly logical to speak of lists of lists—that is, the base type of a list may be itself a list
type.

- [[Loon], [Robin, Duck]];

val it = [[Loon],[Robin,Duck]] : Bird list list

Make sure the difference between tuples and lists is understood. An n-tuple has n components,
and n is a fundamental aspect of the type. Lists (with base type ’a), on the other hand, are considered
head to have exactly two components: the first element (called the head , of type ’a) and the rest of the
tail list (called the tail , of type ’a list)—all this, of course, unless the list is empty. Thus we defy the
grammar school rule that one cannot define something in terms of itself by saying an ’a list is

• an empty list, or
• an ’a followed by an ’a list.

Corresponding to the difference in definition, a programmer interacts with lists in a way different
from how tuples are used. A tuple has n components referenced by #1, #2, etc. A list has these two
accessors components referenced by the accessors hd for head and tl for tail.

- val sundryBirds = [Robin, Penguin, Gull, Loon];

val sundryBirds = [Robin,Penguin,Gull,Loon] : Bird list

- hd(sundryBirds);

val it = Robin : Bird

- tl(sundryBirds);

val it = [Penguin,Gull,Loon] : Bird list

- tl(it);

val it = [Gull,Loon] : Bird list

- tl(it);

val it = [Loon] : Bird list

- tl(it);

val it = [] : Bird list

- tl(it);

uncaught exception Empty

28
CHAPTER 4. TUPLES AND LISTS 4.2. LISTS

hd and tl can be used to slice up a list, one element at a time. However, it is an error to try to
extract the tail (or head, for that matter) on an empty list.
We have seen how to make a list using square braces and how to takes lists apart. We can take
two lists and concatenate them—that is, tack one on the back of the other to make a new list—using concatenate
the cat operator, @.
cat, @
- [Owl, Finch] @ [Eagle, Vulture];

val it = [Owl,Finch,Eagle,Vulture] : Bird list

We also can take a value of the base type and a list and considering them to be the head and
tail, respectively, of a new list. This is by the construct or cons operator, ::. cons, ::

- Robin::it;

val it = [Robin,Owl,Finch,Eagle,Vulture] : Bird list

Cons must take an item and a list; cat must take two lists.

- it::Hawk;

stdIn:40.1-40.9 Error: operator and operand don’t agree [tycon mismatch]

operator domain: Bird list * Bird list list
operand: Bird list * Bird
in expression:
it :: Hawk

- Hawk@[Loon];

stdIn:1.1-1.12 Error: operator and operand don’t agree [tycon mismatch]

operator domain: ’Z list * ’Z list
operand: Bird * Bird list
in expression:
Hawk @ Loon :: nil

- Sparrow::Robin::Turkey;

stdIn:1.1-30.7 Error: operator and operand don’t agree [tycon mismatch]

operator domain: Bird * Bird list
operand: Bird * Bird
in expression:
Robin :: Turkey

However, cons works from right to left, so these next two are fine:

- Sparrow::Robin::[Turkey];

val it = [Sparrow,Robin,Turkey] : Bird list

- Sparrow::Robin::Turkey::[];

val it = [Sparrow,Robin,Turkey] : Bird list

And, once more, using cat and cons on lists of lists:

29
4.3. LISTS VS. TUPLES VS. ARRAYS CHAPTER 4. TUPLES AND LISTS

- [Loon]::[];

val it = [[Loon]] : Bird list list

- [Robin]::it;

val it = [[Robin],[Loon]] : Bird list list

- [[Duck, Vulture]]@it;

val it = [[Duck,Vulture],[Robin],[Loon]] : Bird list list

4.3 Lists vs. tuples vs. arrays

By using a list instead of a tuple, we are surrendering the ability to extract an element from an
random access arbitrary position in one step, a feature of tuples called random access. The way we consider items
sequential access in a list is called sequential access. This is the price of using lists—we sacrifice random access for
indefinite length. ML also provides another sort of container called an array, something we will
touch on only briefly here, though it certainly is familiar to all who have programmed before.
array An array is a finite, ordered sequence of values, each value indicated by an integer index , starting
index from 0. Since it is a non-primitive part of ML, the programmer must load a special package that
allows the use of arrays.

- open Array;

One creates a new array by typing array(n, v), which will evaluate to an array of size n, with
each position of the array initialized to the value v. The value at positioni in an array is produced
by sub(A, i), and the value at that position is modified to contain v by update(A, i, v).

- val A = array(10, 0);

val A = [|0,0,0,0,0,0,0,0,0,0|] : int array

- update(A, 2, 16);

val it = () : unit

- update(A, 3, 21);

val it = () : unit

- A;

val it = [|0,0,16,21,0,0,0,0,0,0|] : int array

- sub(A, 3);

val it = 21 : int

30
CHAPTER 4. TUPLES AND LISTS 4.3. LISTS VS. TUPLES VS. ARRAYS

The interpreter’s response val it = () : unit will be explained later. Note that arrays can
be changed—they are mutable. Although we can generate new tuples and lists, we cannot change mutable
the value of a tuple or list. However, unlike lists, new arrays cannot be made by concatenating two
arrays together.
Much of the work of programming is weighting trade-offs among options. In this case, we are
considering the appropriateness of various data structures, each of which has its advantages and
liabilities. The following table summarized the differences among tuples, lists, and arrays.

Access Concatenation Length Element types Mutability

Tuples random unsupported fixed unrelated immutable
Lists sequential supported indefinite uniform immutable
Arrays random unsupported indefinite uniform mutable

In this case, we want a data structure suitable for representing sets. Our choice to use lists comes
because the concept of “list of X” is so similar to “set of X”—and because ML is optimized to operate
on lists rather than arrays. However, there are downsides. A list, unlike a set, may contain multiple
copies of the same element. The cat operator is, for example, a poor union operation because it will
keep both copies if an element belongs to both subsets that are being unioned. Later we will learn
to write our own set operations to operate on lists.

31
4.3. LISTS VS. TUPLES VS. ARRAYS CHAPTER 4. TUPLES AND LISTS

Exercises

10. Remove the second item of collection and tack it on the

In Exercises 1–9, analyze the type of the expression, as on page front.
26, or indicate that it is an error. 11. Remove the third item of collection and tack it on the
1. [[(Loon, 5), (Vulture, 6)][]] back.

2. [tl([Sparrow, Robin, Turkey])]@[Owl, Finch] 12. Remove the first two items of collection and tack them on
the back.
3. [Owl, Finch]@tl([Sparrow, Robin, Turkey])
In Exercises 13–15, state whether it would be best to use an array,
4. [Owl, Finch]::tl([Sparrow, Robin, Turkey]) a tuple, or a list.
5. [Owl, Finch]::[tl([Sparrow, Robin, Turkey])]
13. You have a collection of numbers that you wish to sort. The
6. hd([Sparrow, Robin, Turkey])::[Owl, Finch] sorting method you wish to use involves cutting the collec-
7. ([Finch, Robin], [2, 4]) tion into smaller parts and then joining them back together.

8. [(Finch, Robin), (2, 4)] 14. The units of data you are using have components of differ-
ent type, but the units are always the same length, with the
9. [5, 12, ceil(7.3 * 2.1), #2(Owl, 17)] same types of components in the same order.
In Exercises 10–12, assume collection is a list with sufficient 15. You have a collection of numbers that you wish to sort. The
length for the given problem. Write an ML expression to produce sorting method you wish to use involves interchanging pairs
a list as indicated. of values from various places in the collection.

32
Part II

Logic

33
Chapter 5

Logical Propositions and Forms

In this chapter, we begin our study of formal logic. Logic is the set of rules for making deductions,
that is, for producing new bits of information by piecing together other known bits of information.
We study logic, in this course in particular, because logic is a foundational part of the language of
mathematics, and it is the basis for all computing. The circuits of microchips, after all, are first
modeling logical operations; that logic is then used to emulate other work, such as arithmetic. Logic,
further, trains the mind and is a tool for any field, whether it be natural science, rhetoric, philosophy,
or theology. Two things should be clarified before we begin.
First, you must understand that logic studies the form of arguments, not their contents. The
argument

If the moon is made of blue cheese, then Napoleon Dynamite is President.

The moon is made of blue cheese.
Therefore, Napoleon Dynamite is President.

is perfectly logical. Its absurdity lies in the content of its premises. Similarly, the argument

If an integer is a multiple of 3, then its digits sum to a multiple of 3.

The digits of 216 sum to 9, a multiple of 3.
Therefore 216 is a multiple of 3.

is illogical. Nevertheless, everything except for therefore is true.

Second, and consequently, we must recognize logic as a tool for thinking; it is not thinking itself,
nor is it a substitute for thinking. Logic just as easily can be employed to talk about unreality as
it can reality. Common sense, observation, and intuition are necessary to find proper raw materials
(premises) on which to apply logic. G. K. Chesterton observed that a madman’s “most sinister
quality is a horrible clarity of detail; a connection of one thing with another in a map more elaborate
than a maze. . . He is not hampered by a sense of humor or by charity. . . He is the more logical for
losing certain sane affections. . . The madman is not the man who has lost his reason. The madman
is the man who has lost everything except his reason” [3].

5.1 Forms
Consider these two arguments:

If it is Wednesday or spinach is on sale, then I go to the store.

So, if I do not go to the store, then it is not Wednesday and spinach is not on sale.
If x < −5 or x > 5, then |x| > 5.
So, if |x| ≯ 5, then x ≮ −5 and x ≯ 5.

35
5.2. SYMBOLS CHAPTER 5. LOGICAL PROPOSITIONS AND FORMS

What do these have in common? If we strip out everything except except for the words if, or,
then, so, not and and (that is, if we strip out all the content but leave the logical connectors), we
are left in either case with
If p or q, then r.
So, if not r, then not p and not q.
Notice that we replaced the content with variables. This allowed us to abstract from the two
arguments to find a form common to both. These variables are something new because they do not
stand for numbers but for independent clauses, grammatical items that have a complete meaning
proposition and can be true or false. In the terminology of logic, we say a proposition is a sentence that is true
or false, but not both. The words true, false, and sentence we leave undefined, like set and element.
Since a proposition can be true or false, the following qualify as propositions:
7−3=4
7−4=3
Bob is taking discrete mathematics.
By saying a proposition must be one or the other and not both, we disallow the following:
7−x=4
He is taking discrete mathematics.
In other words, a proposition must be complete enough (no variables) to be true or false. 1

5.2 Symbols
Logic uses a standard system of symbols, much of which was formulated by Alfred Tarski. We have
already seen that variables can be used to stand for propositions, and you may have noticed that
propositions can be joined together by connective words to make new propositions. These connective
words stand for logical operations, similar to the operations used on numbers to perform arithmetic.
The three basic operations are
Symbolization Meaning

∼p Not p— having truth value opposite that of p

p∧q p and q— true if both are true, false otherwise
p∨q p or q— true if at least one is true, false otherwise
negation We call these operations negation, conjunction, and disjunction, respectively. Negation has
conjunction a higher precedence than conjunction and disjunction; ∼ p ∨ q means (∼ p) ∨ q, not ∼ (p ∨ q).
disjunction Conjunction and disjunction are performed from left to right.
Take notice of how things transfer from everyday speech to the precision of mathematical nota-
tion. The words and, but, and not are not grammatical peers—that is, they are not all the same
part of speech. The words but and and are conjunctions; not is an adverb. How can they become
equivalent in their applicability when treated formally? In fact, they are not equivalent, and the
difference appears in the number of operands they take. Remember that truth values correspond to
independent clauses in English. ∨ and ∧ are binary—they operate on two things, just as in natural
language conjunctions hook two independent clauses together. ∼ is unary—it operates on only one
thing, just as the adverb not modifies a single independent clause.
To grow accustomed to mathematical notation, it is helpful to understand how to translate from
English to equivalent symbols. In the following examples, assume r stands for “it is raining” and s
stands for “it is snowing.” Compare the propositions in English with their symbolic representation.
1 Technically, Bob is taking discrete mathematics might not be a proposition if the context does not determine

which Bob we are talking about; similarly, He is taking discrete mathematics might be a proposition if the context
makes plain who the antecedent is of he. We use these only as examples of the need for a sentence to be well-defined
to be a proposition.

36
CHAPTER 5. LOGICAL PROPOSITIONS AND FORMS 5.3. BOOLEAN VALUES

It is neither raining nor snowing. ∼ r∧ ∼ s

It is raining and not snowing. r∧ ∼ s
It is both raining and snowing. r∧s
It is raining but not snowing. r∧ ∼ s
It is either raining or snowing but not both. (r ∨ s)∧ ∼ (r ∧ s).

Notice that “and” and “but” have the same symbolic translation. This is because both conjunc-
tions have the same denotational, logical meaning. Their difference in English is in their connota-
tions. Whichever we choose, we are asserting that two things are both true; “a but b” merely spins
the statement to imply something like “a and b are both true, and b is surprising in light of a.”
If it is hard to swallow the idea that “and” and “but” mean the same thing, observe how another
language differentiates things more subtly. Greek has three words to cover the same semantic range
as our “and” and “but”: kai, meaning “and”; alla, meaning “but”; and de, meaning something
halfway between “and” and “but,” joining two things together in contrast but not as sharply as alla.

5.3 Boolean values

Truth value is the most basic idea that digital computation machinery models because there are only
two possible values—true and false—which can correspond to the two possible switch configurations—
on and off. ML provides a type to stand for truth values, bool, short for boolean, named after George
Boole. The two values of bool are true and false. Expressions and variables, accordingly, can have
the type bool.

- true;

val it = true : bool

- false;

val it = false : bool

- it;

val it = false : bool

- val p = true;

val p = true : bool

- val q = false;

val q = false : bool

The basic boolean operators are named not, andalso, and orelse.

- p andalso q;

val it = false : bool

- p orelse q;

val it = true : bool

37
5.4. TRUTH TABLES CHAPTER 5. LOGICAL PROPOSITIONS AND FORMS

- (p orelse q) andalso not (p andalso q);

val it = true : bool

Comparison operators, testing for equality and related ideas, are different from others we have
seen in that their results have types different from their operands. If we compare two ints, we do
not get another int, but a bool. The ML comparison operators are =, <, >, <=, >=, and <>, the last
three for ≤, ≥, and 6=, respectively.

- 5 <> 4;

val it = true : bool

- 5 <= 4;

val it = false : bool

Round-off error makes testing reals for equality and inequality unreliable, so ML disallows it.
Instead, check for both < and >.

- 5.2 <> 4.3;

stdIn:24.1-24.11 Error: operator and operand don’t agree [equality type required]
operator domain: ’’Z * ’’Z
operand: real * real
in expression:
5.2 <> 4.3

- 5.2 < 4.3 orelse 5.2 > 4.3;

val it = true : bool

5.4 Truth tables

Our definitions of the logical operators was informal, based on an intuitive appeal to our under-
standing of English words not, and, and or. Moreover, since we accepted true and false as primitive,
undefined terms, you may have guessed that an axiomatic system underlies symbolic logic indicating
how true and false are used. Our axioms are the formal descriptions of the basic operators, defining
them as functions, with ∼ taking one argument and ∧ and ∨ taking two. We specify these functions
truth tables using truth tables, tabular representations of logical operations where arguments to a function ap-
pear in columns on the left side and results appear in columns on the right. The rows stand for the
possible combinations of argument values.
p q p∧q p∨q
p ∼p T T T T
T F T F F T
F T F T F T
F F F F
Truth tables also can be used as a handy way to evaluate composite logical propositions. Suppose
we wanted to determine the value of (p ∧ q)∨ ∼ (p ∨ q) for the various assignments to p and q. We
do this by making a truth table with columns on the left for the arguments; a set of columns in
the middle for intermediate propositions, each one the result of applying the basic operators to the
arguments or to other intermediate propositions; and a column on the right for the final answer.

38
CHAPTER 5. LOGICAL PROPOSITIONS AND FORMS 5.5. LOGICAL EQUIVALENCE

p q p∧q p∨q ∼ (p ∨ q) (p ∧ q)∨ ∼ (p ∨ q)

T T T T F T
T F F T F F
F T F T F F
F F F F T T
Truth tables are a “brute force” way of attacking logical problems. For complicated propositions
and arguments, they can be tedious, and we will learn more expeditious ways of exploring logical
properties. However, truth tables are a sure method of evaluation to which we can fall back when
necessary.

5.5 Logical equivalence

Consider the forms ∼ (p ∧ q) and ∼ p∨ ∼ q. This truth table evaluates them for all possible
arguments.
p q ∼p ∼q p∧q ∼ (p ∧ q) ∼ p∨ ∼ q
T T F F T F F
T F F T F T T
F T T F F T T
F F T T F T T

The two rightmost columns are identical. This is because the forms ∼ (p ∧ q) and ∼ p∨ ∼ q are
logically equivalent, that is, they have the same truth value for any assignments of their arguments. logically equivalent
(We also say that propositions are logically equivalent if they have logically equivalent forms.) We
use ≡ to indicate that two forms are logically equivalent. For example, similar to the equivalence
demonstrated above, it is true that ∼ (p ∨ q) ≡∼ q∧ ∼ q. These two equivalences are called
DeMorgan’s laws, after Augustus DeMorgan.
It is important to remember that the operators ∨ and ∧ flip when they are negated. For example,
take the sentence
x is even and prime.
We do not call this a proposition, because x is an unknown, but by supplying a value for x (taking
N as the universal set) we would make it a proposition. The set of values that make this a true
proposition is {2}. The set of values that make this proposition false needs to be the complement
of that set—that is, the set of all natural numbers besides 2. It may be tempting to negate the
not-quite-a-proposition as
x is not even and not prime.
But this is wrong. The set of values that makes this a true proposition is the set of all numbers
except evens and primes—a much different set from what is required. The correct negation is
x is not even or not prime.
A form that is logically equivalent with the constant value T (something always true, no matter
what the assignments are to the variables) is called a tautology. A form that is logically equivalent tautology
to F (something necessarily false) is called a contradiction. Obviously all tautologies are logically
equivalent to each other, and similarly for contradictions. The following truth table explores some contradiction
tautologies and contradictions.
p ∼p p∨ ∼ p p∧ ∼ p p∧T p∨T p∧F p∨F
T F T F T T F T
F T T F F T F F

Hence p∨ ∼ p and p ∨ T are tautologies, p∧ ∼ p and p ∧ F are contradictions, p ∧ T ≡ p, and

p ∨ F ≡ p.
The following Theorem displays common logical equivalences.

39
5.5. LOGICAL EQUIVALENCE CHAPTER 5. LOGICAL PROPOSITIONS AND FORMS

Theorem 5.1 (Logical equivalences.) Given logical variables p, q, and r, the following equiva-
lences hold.
Commutative laws: p∧q ≡ q∧p p∨q ≡ q∨p

Associative laws: (p ∧ q) ∧ r ≡ p ∧ (q ∧ r) (p ∨ q) ∨ r ≡ p ∨ (q ∨ r)

Distributive laws: p ∧ (q ∨ r) ≡ (p ∧ q) ∨ (p ∧ r) p ∨ (q ∧ r) ≡ (p ∨ q) ∧ (p ∨ r)

Absorption laws: p ∧ (p ∨ q) ≡ p p ∨ (p ∧ q) ≡ p

Idempotent laws: p∧p ≡ p p∨p ≡ p

Double negative law: ∼∼ p ≡ p

DeMorgan’s laws: ∼ (p ∧ q) ≡ ∼ p∨ ∼ q ∼ (p ∨ q) ≡ ∼ p∧ ∼ q

Negation laws: p∨ ∼ p ≡ T p∧ ∼ p ≡ F

Universal bound laws: p∨T ≡ T p∧F ≡ F

Identity laws: p∧T ≡ p p∨F ≡ p

Tautology and contradiction laws: ∼T ≡ F ∼F ≡ T

These can be verified using truth tables. They also can be used to prove other equivalences
without using truth tables by means of a step-by-step reduction to a simpler form. For example,
q ∧ (p ∨ T ) ∧ (p∨ ∼ (∼ p∨ ∼ q)) is equivalent to p ∧ q:

q ∧ (p ∨ T ) ∧ (p∨ ∼ (∼ p∨ ∼ q))
≡ q ∧ T ∧ (p∨ ∼ (∼ p∨ ∼ q)) by universal bounds
≡ q ∧ (p∨ ∼ (∼ p∨ ∼ q)) by identity
≡ q ∧ (p∨ ∼∼ (p ∧ q)) by DeMorgan’s
≡ q ∧ (p ∨ (p ∧ q)) by double negative
≡ q∧p by absorption
≡ p∧q by commutativity.

40
CHAPTER 5. LOGICAL PROPOSITIONS AND FORMS 5.5. LOGICAL EQUIVALENCE

Exercises

Determine which of the following are propositions. four possible assignments to q and p, checking that they agree each
time).
1. Spinach is on sale.
2. Spinach on sale. 10. ∼ (p ∧ q) ≡∼ p∨ ∼ q.
3. 3 > 5. 11. p ∧ (p ∨ q) ≡ p.
4. If 3 > 5, then Spinach is on sale. Verify the following equivalences using a truth table.
5. Why is Spinach on sale? 12. (p ∧ q) ∧ r ≡ p ∧ (q ∧ r).
Let s stand for “spinach is on sale” and k stand for “kale is on 13. p ∧ (q ∨ r) ≡ (p ∧ q) ∨ (p ∧ r).
sale.” Write the following using logical symbols.
Verify the following equivalences by applying known equivalences
6. Kale is on sale, but spinach is not on sale. from Theorem 5.1.
7. Either kale is on sale and spinach is not on sale or kale and
spinach are both on sale. 14. ∼ (∼ p ∨ (∼ p∧ ∼ q))∨ ∼ p ≡ T .

8. Kale is on sale, but spinach and kale are not both on sale. 15. p ∧ (∼ q ∨ (p∧ ∼ p)) ≡ p∧ ∼ q.

9. Spinach is on sale and spinach is not on sale. 16. (q ∧ p)∨ ∼ (p∨ ∼ q) ≡ q.

Verify the following equivalences using a truth table. Then ver- 17. ((q ∧ (p ∧ (p ∨ q))) ∨ (q∧ ∼ p))∧ ∼ q ≡ F .
ify them using ML (that is, type in the left and right sides for all 18. ∼ (∼ (p ∧ p) ∨ (∼ q ∧ T )).

41
5.5. LOGICAL EQUIVALENCE CHAPTER 5. LOGICAL PROPOSITIONS AND FORMS

42
Chapter 6

Conditionals

6.1 Conditional propositions

In the previous chapter, we took intuitive notions of and, or, and not and considered them as
mathematical operators. However, our initial example

If it is Wednesday or spinach is on sale, then I go to the store.

or, with the details abstracted by variables,

If p or q, then r.

has more than just an or indicating its logical form. We also have the words if and then, which
together knit “p or q” and “r” into a logical form. Let us simplify this a bit to

If spinach is on sale, then I go to the store.

which has the form

If p, then q.

A proposition in this form is called a conditional , and is symbolized by the operator →. The conditional
symbolism p → q reads “if p then q” or “p implies q.” p is called the hypothesis and q is called the
conclusion. hypothesis
To define this operator formally, consider the various scenarios for the truth of the hypothesis
conclusion
and conclusion and how they affect the truth of the conditional proposition.

Is spinach on sale? Do I go to the store? Is it true that if spinach is on sale,

I go to the store?
Yes. Yes. It would be consistent with the
evidence, but not proved abso-
lutely
Yes. No. No, it cannot be true because
spinach is on sale and I do not
go to the store.
No. Yes. Not enough information—the
proposition does not seem to ad-
dress the situation when spinach
is not on sale.
No. No. Similar to above.
From this we see that we need to be more precise about what is intended when we say that a
conditional proposition is true. In only one case were we able to say that it is definitely false—in the

43
6.2. NEGATION OF A CONDITIONAL CHAPTER 6. CONDITIONALS

other three cases, the hypothesis and conclusion do not disprove the conditional, but they do not
prove it either. To clarify this, we say that a conditional proposition is true if the truth values of
the hypothesis and conclusion are consistent with the proposition being true. This means the cases
vacuously true where the hypothesis is false are both true, by default. (We call this being vacuously true). Thus
we have this truth table for →:

p q p→q
T T T
T F F
F T T
F F T

We can further use a truth table to show, for example, that p → q ≡ q∨ ∼ (p ∨ q).

p q p∨q ∼ (p ∨ q) q∨ ∼ (p ∨ q) p→q
T T T F T T
T F T F F F
F T T F T T
F F F T T T

In other words, “If spinach is on sale, then I go to the store” is equivalent to “I go to the store
or it is not true that either spinach is on sale or I go to the store.”

6.2 Negation of a conditional

Negating a conditional is tricky. Obviously one can simply put a “not” in front of the whole
proposition (∼ (p → q) or “it is not true that if spinach is on sale, then I go to the store”), but we
would like something more useful, a way to evaluate such a negation to get a simpler, equivalent
proposition—a simplification rule like De Morgan’s laws.
The negation of a proposition must be true exactly when the proposition is not true. In other
words, it must have the opposite truth value. What has opposite truth to “If spinach is on sale,
then I go to the store”? Consider these attempts:

If spinach is not on sale, then I go to the store. This is not right because it does not
adequately address the situation where one goes to the store every day, whether spinach
is on sale or not. In that case, both this and the original proposition would be true, so
this is not a negation.
If spinach is not on sale, I do not go to the store. Merely propagating the negation to
hypothesis and conclusion does not work at all. If spinach is on sale and I go, or spinach
is not on sale and I do not go, both this and the original proposition hold.
If spinach is on sale, I do not go to the store. This attempt is perhaps the most attractive,
because it does indeed contradict the original proposition. However, it can be considered
“too strong” and so not a negation—both it and the original proposition are vacuously
true if spinach is not on sale.

To find a true negation, use a truth table to identify the truth values for ∼ (p → q); then we will
try to construct a simple form equivalent to it.

p q p→q ∼ (p → q)
T T T F
T F F T
F T T F
F F T F

44
CHAPTER 6. CONDITIONALS 6.3. CONVERSE, INVERSE, AND CONTRAPOSITIVE

That is, we are looking for a proposition that is true only when both p is true and q is not true.
Thus we have ∼ (p → q) ≡ p∧ ∼ q.
This is a surprising result. The negation of a conditional is not itself a conditional. The negation
of “If spinach is on sale, then I go to the store” is “Spinach is on sale and I do not go to the store.”

6.3 Converse, inverse, and contrapositive

There are three common forms that are variations on the conditional. Switching the order of the
hypothesis and the conclusion, q → p, is the converse of p → q. converse

If I go to the store, then spinach is on sale.

The converse is not logically equivalent to the proposition.

p q p→q q→p
T T T T
T F F T
F T T F
F F T T

Many common errors in reasoning come down to a failure to recognize this. p being correlated
to q is not the same thing as q being correlated to p. I may go to the store every time spinach is on
sale, but that does not mean that I will never go to the store if spinach is not on sale, and so my
going to the store does not imply that spinach is on sale.
The inverse is formed by negating each of the hypothesis and conclusion separately (not negating inverse
the entire conditional), ∼ p →∼ q.

If I do not go to the store, then spinach is not on sale.

For the same reason as above, the inverse is not logically equivalent to the proposition either.

p q p→q ∼ p →∼ q
T T T T
T F F T
F T T F
F F T T

The contrapositive is formed by negating and switching the components of a conditional, ∼ q →∼ contrapositive
p.

If I do not go to the store, then spinach is not on sale.

The contrapositive is logically equivalent to the proposition.

p q p→q ∼ q →∼ p
T T T T
T F F F
F T T T
F F T T

Compare the truth tables of the converse and the inverse. Notice that they are logically equiva-
lent. In fact, the converse and inverse are contrapositives of each other.

45
6.4. WRITING CONDITIONALS IN ENGLISH CHAPTER 6. CONDITIONALS

6.4 Writing conditionals in English

There is a variety of phrasings in mathematical parlance alone (let alone everyday speech) that
express the idea of conditionals and their variations. For example, think about what is meant by
x > 7 only if x is even.
This certainly does not say that if x is even, then x > 7, but rather leaves open the possibility
that x is even although it is 7 or less. No, this means if x > 7, then x is even In other words, only
if is a way of saying a converse proposition. p only if q means q → p.
Along the same lines, the phrase “if and only if,” often abbreviated “iff,” is of immense use in
mathematics. If we expand
x mod 2 = 0 iff x is even.
we get
If x mod 2 = 0, then x is even, and if x is even, then x mod 2 = 0.
That is, we get both the conditional and its converse. The connector iff is sometimes called the
biconditional biconditional , is symbolized by a double arrow, p ↔ q, and is defined to be equivalent to (p →
q) ∧ (q → p).

p q p→q q→p p↔q

T T T T T
T F F T F
F T T F F
F F T T T

necessary conditions We also sometimes speak of necessary conditions and sufficient conditions, which refer to converse
sufficient conditions conditional and conditional propositions, respectively.
An even degree is a necessary condition for a polynomial to have no real roots
means
If a polynomial function has no real roots, then it has an even degree.
A positive global minimum is a sufficient condition for a polynomial to have no real roots
means
If a polynomial function has a positive global minimum, then it has no real roots.
Values all of the same sign is a necessary and sufficient condition for a polynomial to
have no real roots.
means
A polynomial function has values all of the same sign if and only if the function has no
real roots.

6.5 Conditional expressions in ML

ML does not have an operator that performs logical conditionals, because that would be rarely used.
However, it is appropriate now to introduce functionality it has to make decisions based on the truth
or falsity of an expression.
A step function is a function over real numbers whose value is zero up until some number, after
which it is one, for example,

0 if x < 5
f (x) =
1 otherwise
Step functions have applications in electronics and statistics. Notice that computing them re-
quires making a decision, and that decision is based on evaluating a statement: is x less than 5 or
not? We can express this in ML using the form

46
CHAPTER 6. CONDITIONALS 6.5. CONDITIONAL EXPRESSIONS IN ML

if <expression>1 then <expression>2 else <expression>3

The first expression is called the condition. The second two are called the then-clause and else-
clause, respectively. The condition must have type bool. The other expressions must have the same
type, but that type can be anything. If the condition is true, then the value of the entire expression is
the value of the then-clause; if it is false, then the entire expression’s value is that of the else-clause.
Thus the type of the second two expressions is also the type of the entire expression.

- val x = 3;

val x = 3 : int

- if x < 5 then 0 else 1;

val it = 0 : int

- datatype mammal = Dolphin | Monkey | Hog;

datatype mammal = Dolphin | Hog | Monkey

- datatype collective = Pod | Troop | Herd;

datatype collective = Herd | Pod | Troop

- val a = Hog;

val a = Hog : mammal

- if a = Dolphin then Pod else if a = Monkey then Troop else Herd;

val it = Herd : collective

- if a = Dolphin then 12 else Pod;

stdIn:1.1-2.21 Error: types of rules don’t agree [literal]

earlier rule(s): bool -> int
this rule: bool -> collective
in rule:
false => Pod

At this point, the number of interesting things for which we can use this are limited, but it is
important to understand how types fit together in an expression like this for later use.

47
6.5. CONDITIONAL EXPRESSIONS IN ML CHAPTER 6. CONDITIONALS

Exercises

In Exercises 8–13, indicate if the sentence is true or false or if you

1. Use a truth table to show that (p ∨ q) → r ≡∼ r → (∼ p∧ ∼ cannot tell.
q).
8. Bob passed through P .
2. The equivalence p → q ≡∼ q →∼ p is called the contrapos-
itive law. Use this and known equivalences from Theorem 9. Bob passed through N .
5.1 to show that (p ∨ q) → r ≡∼ r → (∼ p∧ ∼ q).
10. Bob passed through M .
Let r stand for “It is raining,” s stand for “It is snowing,” and p
stand for “spinach is on sale.” Write the following using logical 11. If Bob passed through O, then Bob passed through F .
symbols.
12. If Bob passed through K, then Bob passed through L.
3. If it is not snowing, then spinach is on sale.
4. If spinach is on sale and it is raining, then it is not snowing. 13. If Bob passed through L, then Bob passed through K. (Be
careful.)
5. If it is raining, then spinach is not on sale or it is snowing.
6. If spinach is on sale, then if it is not raining, then it is snow- 14. Write the negation, converse, inverse, and contrapositive (all
ing. in English) of the statement in exercise 3.
7. In Illinois, the drinking age is 21. Each of the following cards
gives information about one of the four people at a table in a 15. Write the negation , converse, inverse, and contrapositive of
certain restaurant. On each card, one side gives the person’s the statement in exercise 4.
age and the other side indicates what that person is drink- 16. Write the negation , converse, inverse, and contrapositive of
ing. The manager claims, “If a person is drinking beer, then the statement in exercise 5.
that person is at least 21.” Which card or cards would you
need to turn over to determine whether the claim is false? 17. Write the negation , converse, inverse, and contrapositive of
Explain. the statement in exercise 6.
a. b. c. d.
18. Suppose you wanted to divide x by some value y, but only
Beer Coke 20 24
if y was non-zero; you would like the result simply to be x
A person named Bob passed through the following maze without otherwise. Write an ML expression for this calculation, but
using the same door twice. let x appear in the expression only once.

T S R Q P
19. Find a form that will always produce the same result (that
is, is equivalent to)

K L M N O if <boolean expression>1 then true else <boolean expression>2

If you do not see it immediately, experiment on some specific

J I H G F
expressions.

E D C B A Exercises 7 and 8–13 were presented by Susanna Epp at a workshop

in June 2006.

48
Chapter 7

Argument forms

7.1 Arguments
So far we have considered symbolic logic on the proposition level. We have considered the logical
connectives that can be used to knit propositions together into new propositions, and how to evaluate
propositions based on the value of their component propositions. However, to put this to use—either
for writing mathematical proofs or engaging in any other sort of discourse—we need to work in units
larger than propositions. For example,

During the full moon, spinach is on sale.

The moon is full.
Therefore, spinach is on sale.

contains several propositions, and they are not connected to become a single proposition. This,
instead, is an argument, a sequence of propositions, with the last proposition beginning with the word argument
“therefore”—or “so” or “hence” or some other such word, and possibly in a postpositive position, as
in “Spinach, therefore, is on sale.” Similarly, an argument form is a sequence of proposition forms, argument form
with the last prefixed by the symbol ∴. All except the last in the sequence are called premises; the
last proposition is called the conclusion. premises

Propositions are true or false. Arguments, however, are not spoken of as being true or false;
conclusion
instead, they are valid or invalid. We say that an argument form is valid if, whenever all the
premises are true (depending on the truth values of their variables), the conclusion is also true. valid
Consider another argument:

If my crystal wards off alligators, then there will be no alligators around.

There are no alligators around.
Therefore, my crystal wards off alligators.

The previous argument and this one have the following argument forms, respectively (rephrasing
“During the full moon. . . ” as “If the moon is full, then. . . ”):

p→q p→q
p q
∴q ∴p

It is to be hoped that you readily identify the left argument form as valid and the right as invalid.
The truth table verifies this.

49
7.2. COMMON SYLLOGISMS CHAPTER 7. ARGUMENT FORMS

←−−−−−−−
←−−−−−−−

conclusion
conclusion

←−−−−−
←−−−−−
←−−−−−

←−−−−−

premise

premise
premise

p q premise
p→q q p q p→q p
critical row critical row
T T T T ←−−−−−−−
− T T T T ←−−−−−−−
−
T F F F T F F T
critical row
F T T T F T T F ←−−−−−−−
−
F F T F F F T F
The first argument has only one case where both premises are true, and we see there that the
conclusion is also true. The rest of the truth table does not matter—only the rows where all premises
critical rows are true count. We call these critical rows, and when you are evaluating large argument forms, it is
acceptable to leave the entries in non-critical rows blank. Moreover, once you have found a critical
row where the conclusion is false, nothing more needs to be done. The second argument has a critical
row where the conclusion is false; hence it is an invalid argument.

7.2 Common syllogisms

syllogism A syllogism is a short argument. Usually the definition restricts the term to apply only to arguments
having exactly two premises, but we will include one- and three-premised arguments as well in our
modus ponens discussion. The correct argument form above is the most famous, and is called modus ponens, Latin
for “method of affirming,” or, more literally, “method of placing” (or, more literally still, “placing
method,” since ponens is a participle, not a gerund).
p→q If Socrates is a man, then he is mortal.
p Socrates is a man.
∴q Therefore, Socrates is mortal.

Since the contrapositive of a conditional is logically equivalent to the conditional itself, a truth
table from the previous chapter proves

p→q
∴∼ q →∼ p

If we substitute “∼ q” for “p” and “∼ p” for “q in modus ponens, we get

∼ q →∼ p
∼q
∴∼ p

modus tollens Putting these two together results in the second most famous syllogism, modus tollens, “lifting
[i.e., denying] method.”
p→q If the moon is full, then Lupin will be a werewolf.
∼q Lupin is not a werewolf.
∴∼ p Therefore, the moon is not full.

We can also prove this directly with a truth table, with only the critical row completed for the
conclusion column.
p q p→q ∼q ∼p
T T T F
T F F T
F T T F
F F T T T

50
CHAPTER 7. ARGUMENT FORMS 7.3. USING ARGUMENT FORMS FOR DEDUCTION

The form generalization may seem trivial and useless, but, in fact, it captures a reasoning tech- generalization
nique we often use subconsciously. It relies on the fact that a true proposition or ’ed to any other
proposition is still a true proposition.
p We are in Pittsburgh.
∴ p∨q Therefore, we are in Pittburgh or Mozart wrote The Nutcracker.

A similar (though symmetric) argument form using and is called specialization. specialization

p∧q x > 5 and x is even.

∴p Therefore, x > 5.

Sherlock Holmes describes the process of elimination as, “. . . when you have eliminated the
impossible, whatever remains, however improbable, must be the truth.” Formally, elimination is elimination

p∨q x is even or x is odd.

∼p x is not even.
∴q Therefore, x is odd.

We will later prove that, given sets A, B, and C, if A ⊆ B and B ⊆ C, then A ⊆ C. This means
that the ⊆ relation is transitive. The analogous logical form transitivity is transitivity

p→q If x > 5, then x > 3.

q→r If x > 3, then x > 0.
∴p→r Therefore, if x > 5, then x > 0.

Sometimes a phenomenon has two possible causes (or, at least, two things that are correlated to
it). Then all that needs to be shown is that at least one such cause is true. Or, if we know that at
least one of two possibilities is true and that they each imply a fact, that fact is true. We call this
form division into cases. division into cases

p∨q It’s either snowing or raining.

p→r If it’s snowing, then spinach is on sale.
q→r If it’s raining, then spinach is on sale.
∴r Therefore, spinach is on sale.

Finally, we have proof by contradiction. contradiction

p→F If x − 2 = x then −2 = 0 and −2 6= 0.

∴∼ p Therefore, x − 2 6= x.

Because of certain paradoxes that have arisen in the study of the foundations of mathematics,
some logicians call into question the validity of proof by contradiction. If p leads to a contradiction,
it might not be that p is false; it could be that p is neither true nor false, that is, p might not be a
proposition at all. For our purposes, however, we can rely on proof by contradiction.

7.3 Using argument forms for deduction

Non-trivial argument forms would require huge truth tables to verify. However, we can use known
argument forms to verify larger argument forms in a step-wise fashion. Take the argument form

(a) ∼ p∧ ∼ r → s
(b) p →∼ q
(c) ∼ t
(d) t∨ ∼ s
(e) r →∼ q
(f) ∴∼ q

51
7.3. USING ARGUMENT FORMS FOR DEDUCTION CHAPTER 7. ARGUMENT FORMS

This does not follow immediately from the argument forms we have given. However, we can
deduce immediately that ∼ s by applying division into cases to (c) and (d). Our goal is to generate
new propositions from known argument forms until we have verified proposition (f).
(i) ∼s by (c), (d), and division into cases.
(ii) ∼ (∼ p∧ ∼ r) by (a), (i), and modus tollens
(iii) p∨r by (ii) and DeMorgan’s laws
(iv) ∼q by (iii), (b), (c), and division into cases.

Notice that using logical equivalences from Theorem 5.1 is fair game.

52
CHAPTER 7. ARGUMENT FORMS 7.3. USING ARGUMENT FORMS FOR DEDUCTION

Exercises

9. (a) p ∨ q
Verify the following syllogisms using truth tables. (b) q → r
1. Generalization. (c) p ∧ s → t
2. Specialization.
(d) ∼ r
3. Elimination.
(e) ∼ q → u ∧ s
4. Transitivity.
(f) ∴ t
5. Division into cases.
10. (a) ∼ p → r∧ ∼ s
6. Contradiction.
(b) t → s
Use known syllogisms and logical equivalences to verify the follow-
ing arguments. (c) u →∼ p
7. (a) p → q (d) ∼ w
(b) r ∨ s (e) u ∨ w
(c) r → t (f) ∴∼ t
(d) ∼ q 11. (a) p → q
(e) u → v (b) r ∨ s
(f) s → p
(c) ∼ s →∼ t
(g) ∴ t
(d) ∼ q ∨ s
8. (a) ∼ p ∨ q → r
(e) ∼ s
(b) s∨ ∼ q
(f) ∼ p ∧ r → u
(c) ∼ t
(g) w ∨ t
(d) p → t
(e) ∼ p ∧ r →∼ s (h) ∴ u ∧ w
(f) ∴∼ q Exercises 7–11 are taken from Epp [5].

53
7.3. USING ARGUMENT FORMS FOR DEDUCTION CHAPTER 7. ARGUMENT FORMS

54
Chapter 8

Predicates and quantifiers

Recall the most famous syllogism, modus ponens,

If Socrates is a man, then he is mortal.

Socrates is a man.
Therefore, Socrates is mortal.

It is unlikely that we would either presume or prove such a narrow premise as “If Socrates is a
man, then he is mortal.” What is so special about Socrates that this conditional mortality accrues
to him? Rather, we would be more likely to say

All men are mortal.

Socrates is a man.
Therefore, Socrates is mortal.

Our premise, “All men are mortal,” now addresses a wider scope, and our syllogism merely
applies this universal truth to a specific case. However, we have not yet discussed how do deal with
concepts like “all” in formal logic. Is it necessary to expand our notation (and reasoning rules), or
can this be captured by the logical forms we already know? For example, could we not express the
first premise using a conditional?

If someone is a man, then he is mortal.

This is equivalent, but now we have introduced the pronouns “someone” and “he,” which is
English’s way of referring to the same but unknown value. Therefore we could simplistically replace
the pronouns with pseudo-mathematical notation to get

If x is a man, then x is mortal.1

However, variables (and pronouns with uncertain antecedents) mean that a sentence is no longer
a proposition. In this chapter, we will see the use of unknowns in formal reasoning.

8.1 Predication
When we introduced conditionals, we moved from the specific case to the general case by replacing
parts of a conditional sentence with variables.

If Socrates is a man, then he is mortal. If p then q.

These variables are parameters, that is, independent variables that can be supplied with any parameters
1 This is not quite right; we still need to say that this is true “for all x.”

55
8.2. UNIVERSAL QUANTIFICATION CHAPTER 8. PREDICATES AND QUANTIFIERS

values from a certain set (in this case, the set of propositions) and that affect the value of the
entire expression (in this case, also a proposition, once the values have been supplied). When we
replace parts of a mathematical expression with independent variables, we are parameterizing that
expression. We see the same process here:

Socrates is a man x is a man.

This makes a proposition with a hole in it. A sentence that is a proposition but for an independent
predicate variable is a predicate. This is the same term predicate that you remember from grammar school; a
predicate is the part of a clause that expresses something about the subject.

hit the
The boy |{z} ball.}
| {z Spinach is
|{z} green.
| {z } | {z } | {z }
subject transitive direct subject linking predicate
verb object verb nominative
| {z } | {z }
predicate predicate

If we want to represent a given predicate with a symbol, it would be useful to incorporate the
parameter. Hence we can define, for example,

P (x) = x is mortal
You should recognize this notation as being the same as that used for functions in algebra and
calculus. We will study functions formally in Part VI, but you can use what you remember from
prior math courses to recognize that a predicate is alternately defined as a function whose codomain
is the set of truth values, { true, false }. In fact, another term for predicate is propositional function.
domain The domain of a predicate is the set of values that may be substituted in place of the independent
variable.
To play with the two sentences above, let

P (x) = x hit the ball.

Q(x) = x is green.

And so we can note P (the boy), P (the bat), Q(spinach), Q(Kermit), and ∼ Q(ruby). It is
important to note that the domain of a predicate is not just those things that make it true, but
rather all things it would make sense to talk about in that context (here we can assume something
like “visible objects and substances”). It is not invalid to say Q(ruby) or “ruby is green.” It merely
happens to be false.
Here is a mathematical example. Let P (x) = x2 > x. What is P (x) for various values of x, if
we assume R as the domain?
1
x 5 2 1 2 0 − 12 −1
P (x) T T F F F T T
It is not too difficult to characterize the numbers that make this predicate true—positive numbers
truth set greater than or equal to one, and all negative numbers. The truth set of a predicate P (x) with domain
D is the set of all elements in D that make P (x) true when substituted for x. We can denote this
using set notation as {x ∈ D|P (x)}. In this case,

{x ∈ R|P (x)} = (−∞, 0) ∪ [1, ∞)

8.2 Universal quantification

Now we return to the original question. How can we express “All men are mortal” formally? It does
not do to define a predicate

P (x) = if x is a man, then x is mortal.

56
CHAPTER 8. PREDICATES AND QUANTIFIERS 8.3. EXISTENTIAL QUANTIFICATION

because “All men are mortal” is a proposition, no mere predicate. It truly does make a claim that is
either true or false. In other words, the predicate does not capture the assertion being made about
all men. The problem is that the variable x is not truly free, but rather we want to remark on the
extent of x, the values for which we are asserting this predicate. Words that express this are called
quantifiers. Here is a rephrasing of “all men are mortal” that uses variables correctly: quantifiers

For all men x, x is mortal.

The symbol ∀ stands for “for all.” Then, if we let M stand for the set of all men and M (x) = x
is mortal, we have

∀ x ∈ M, M (x)
∀ is called the universal quantifier . Likewise, a universal proposition is a proposition in the form universal quantifier
∀x ∈ D, P (x), where P (x) is a predicate and D is the domain (or a subset of the domain) of P (x).
Unfortunately, defining the meaning of a universal proposition cannot be done simply with a truth universal proposition
table. Instead we say, almost banally, that the proposition is true if P (x) is true for every element
in the domain. For example, let D = {3, 54, 219, 318, 471}. Which of the following are true?

∀ x ∈ D, x2 ≤ x No, this is actually false for all of them.

∀ x ∈ D, x is even No, this fails for 3.

∀ x ∈ D, x is a multiple of 3 Yes: 3 = 3 · 1, 54 = 3 · 18, 219 = 3 · 73, 318 = 3 · 106,

and 471 = 3 · 157.

What we used on the last proposition was the method of exhaustion, that is, we tried all possible method of exhaustion
values for x until we exhausted the domain, demonstrating each one made the predicate true.
Obviously this method of proof is possible only with finite sets, and it is impractical for any set
much larger than the one in this example. On the other hand, disproving a universal proposition
is quite easy, since it takes only one hole to sink the ship. If for any element of D, P (x) is false,
then the entire proposition is false. Having found 3, not an even number, we disproved the second
proposition, without even noting that the predicate fails also for 219 and 471. An element of the
domain for which the predicate is false is called a counterexample. counterexample

8.3 Existential quantification

It is not true that

x2 = 16 for all x ∈ R.
It is true, however, that

x2 = 16 for some x ∈ R.
namely, for 4 and −4. While it is true that

∼ (∀ x ∈ R, x2 = 16)
it is not true that

∀ x ∈ R, ∼ (x2 = 16)
The word “some” expresses the situation that falls between being true for all and being true
for none—in other words, the predicate is true for at least one, perhaps more. It is an existential
quantifier , because it asserts that at least one thing exists with the given predicate as a property. existential quantifier
We can rephrase the second proposition of this section to get

57
8.4. IMPLICIT QUANTIFICATION CHAPTER 8. PREDICATES AND QUANTIFIERS

There exists an x ∈ R such that x2 = 16.

Now, the symbol ∃ means “there exists.” Hence we have

∃ x ∈ R | x2 = 16
Formally, an existential proposition is a proposition of the form ∃ x ∈ D | P (x) for some predicate
P (x) with domain (or domain subset) D.
Revisiting our earlier example with D = {3, 54, 219, 318, 471}, which of the following are true?

∃ x ∈ D, x2 ≤ x No, this is false for all of them.

∃ x ∈ D, x is even Yes, 54 = 2 · 27.

∃ x ∈ D, x is a multiple of 3 Yes: 3 = 3 · 1.

Notice that with existentially quantified propositions, we must use the method of exhaustion to
disprove it. Only one specimen is needed to show that it is true.

8.4 Implicit quantification

The most difficult part of getting used to quantified propositions is taking note of the various forms
in which they appear in English and various mathematical forms that are equivalent. For example,
∀ can just as easily stand in for “for any,” “for every,” “for each,” and “given any.” Consider the
proposition
A positive integer evenly divides 121.
While this is awkward and ambiguous (is this supposed to define the term positive integer ?), it can
be read to mean “there exists a positive integer that evenly divides 121,” or,

∃ x ∈ Z+ | x divides 121
which is true, letting x = 11. The existential quantification is implicit, hanging on the indefinite
article a. However, the indefinite article can also imply universal quantification, depending on the
context. Hence
If a number is a rational number, then it is a real number.
becomes

∀ x ∈ Q, x ∈ R
Notice also that this is equivalent to

∀ x ∈ U, x ∈ Q → x ∈ R
where we take the universal set to be all numbers. More generally,

∀ x ∈ D, Q(x) ≡ ∀ x ∈ U, x ∈ D → Q(x)
Quantification in natural language is subtle and a frequence source of ambiguity for the careless
(though the intended meaning is usually clear from context or voice inflection). Adverbs (besides
not) usually do not affect the logical meaning of a sentence, but notice how just turns
I wouldn’t give that talk to any audience.
into
I wouldn’t give that talk to just any audience.

58
CHAPTER 8. PREDICATES AND QUANTIFIERS 8.5. NEGATION OF QUANTIFIED PROPOSITIONS

8.5 Negation of quantified propositions

Finally, we consider how to negate propositions with universal or existential quantifiers. We saw
earlier that

∼ (∀ x ∈ R, x2 = 16) 6≡ ∀ x ∈ R, ∼ (x2 = 16)

So negation is not a simple matter of propagating the negation symbol through the quantifier.
What, then, is the negation of

All men are mortal.

There exists a bag of spinach that is on sale.

What will help us here is to think about what the quantifiers are actually saying. If a predicate
is true for all elements in the set, then if we could order those elements, it would be true for the first
one, and the next one, and one after that. In other words, we can think of universal quantification as
an extension of conjunction. If there merely exists an element for which the predicate is true, then it
is true for the first, or the next one, or one of the elements after that. Hence if D = {d 1 , d2 , d3 , . . .},

∀ x ∈ D, P (x) ≡ P (d1 ) ∧ P (d2 ) ∧ P (d3 ) . . .

and

∃ x ∈ D | P (x) ≡ P (d1 ) ∨ P (d2 ) ∨ P (d3 ) . . .

Now, we can apply an extended version of DeMorgan’s laws.

∼ (∀ x ∈ D, P (x)) ≡ ∼ (P (d1 )∧P (d2 )∧P (d3 ) . . .) ≡∼ P (d1 )∨ ∼ P (d2 )∨ ∼ P (d3 ) . . . ≡ ∃ x ∈ D, ∼ P (x)

and

∼ (∃ x ∈ D | P (x)) ≡ ∼ (P (d1 )∨P (d2 )∨P (d3 ) . . .) ≡∼ P (d1 )∧ ∼ P (d2 )∧ ∼ P (d3 ) . . . ≡ ∀ x ∈ D, ∼ P (x)

Hence the negation of a universal proposition is an existential proposition, and the negation of
an existential proposition is a universal proposition. To negate our examples above, we would say

There exists a man who is not mortal.

All bags of spinach are not on sale.

59
8.5. NEGATION OF QUANTIFIED PROPOSITIONS CHAPTER 8. PREDICATES AND QUANTIFIERS

Exercises

2. Some bags of spinach are on sale.

Let S be the set of bags of spinach, g(x) be the predicate where x is 3. All bags of spincah that are not green are on sale.
green, and s(x) be the predicate where x is on sale. Write the fol-
lowing symbolically, then negate them, then express the negations 4. Every bag of spinach is green.
in English. 5. Some bags of spinach are not green.
1. All bags of spinach are on sale. 6. Any bag of spinach is green and on sale.

60
Chapter 9

Multiple quantification;
representing predicates

In this chapter, we have two separate concerns, both of which extend the concepts of predication from
the previous chapter. First, we consider how to interpret propositions that are multiply quantified,
that is, have two (or more) quantifiers. Second, we consider how to represent predicates in ML.

9.1 Multiple quantification

Soon we will take up the game of writing mathematical proofs. The most important consideration
for determining a strategy in a given proof is how the proposition to be proved is quantified. You
will be trained to think, “What would it take to prove to someone that this proposition is true?”
or, equivalently, “How would a skeptic challenge this proposition?” A proof is a ready-made answer
to a possible challenge.
Suppose we have the (fairly banal) proposition

There exists an integer greater than 5.

Written symbolically, this is ∃ x ∈ Z | x > 5. A doubter would have to challenge this by saying
“So, you think there is a number greater than five, do you? Well then, show me one.” To satisfy
this doubter, your response would be to name any number that is both an integer and greater than
5— 6 will do fine. On the other hand, suppose you were asserting that

Any integer is a rational.

or, ∀ x ∈ Z, x ∈ Q. It would be unfair for a challenger to ask you to verify this proposition for every
element in Z. Even the most hardened skeptic would not have the patience to hear you enumerate an
infinite number of possibilities. However, this does demonstrate the futility of proving by example:

Doubter: Prove to me that any integer is rational.

Prover: Well, 0 is an integer and it is rational.
Doubter: Ok, one down, infinitely many to go.
Prover: Also, 1 is an integer and it is rational.
...
Prover: And furthermore, 1, 405, 341 is an integer and it is rational.
Doubter: Ok, 1, 405, 342 down, infinitely many to go. And you haven’t even begun to
talk about negative integers.

What makes proving by example unconvincing is that the prover is picking the test cases. What
if he is picking only the ones on which the proposition seems to be true, and ignoring the coun-
terexamples? It is like a huckster picking his own shills out of the crowd on whom to demonstrate

61
9.1. MULTIPLE QUANTIFICATIONCHAPTER 9. MULTIPLE QUANTIFICATION; REPRESENTING PREDICATES

his snake oil. Instead, to make the game both fair and finite, it must be the doubter who picks the
sample from the domain and challenges the prover to demonstrate the predicate is true for it.
Thus an existentially quantified proposition is proven by providing an example, and a universally
quantified proposition is proven by providing a way to confirm any given example.
All this serves to build intuition for how to interpret propositions with nested levels of quantifi-
cation. How would we symbolically represent the proposition

Every integer has an opposite.

To work this out, let us follow the kind of reasoning above in reverse. Suppose you were to
prove this. What kind of challenge would you expect? The idea here is that the integer 5 has for
its opposite -5, -10 has 10, 0 has 0, and so on. Since you want to show this pattern holds for every
integer, it makes sense that challenger gets to pick the integer on which to argue. However, once
that integer is picked, how is the rest of the game played? You, the prover, must come up with
an opposite to match that integer. Hence the game has two steps: the doubter picks an item to
challenge you, and you counter that challenge with another item. The two steps correspond to two
levels of quantification. First, you are claiming that some predicate is true for all integers, so we
have something in the form

∀ x ∈ Z, P (x)

But what is P(x)? What are we claiming about all integers? We claim that something exists
corresponding to it, namely an opposite. P (x) = ∃ y ∈ Z | y = −x. All together,

∀ x ∈ Z, ∃ y ∈ Z | y = −x

multiply quantified This is a multiply quantified proposition, meaning that the predicate of the proposition is itself
quantified. This is more specific than that the proposition merely has more than one quantifier. The
proposition “Every frog is green, and there exists a brown toad” has more than one quantifier, but
this is not what we have in mind by multiple quantification, because one quantified subproposition
is not nested within the other.
Do not fail to notice that quantifiers are not commutative. We would have a very different (and
false) proposition if we were to say

∃ y ∈ Z | ∀ x ∈ Z, y = −x

There is an integer such that every integer is its opposite.

Also notice that the innermost predicate (y = −x) has two independent variables. The general
form is

∀ x ∈ D, ∃ y ∈ E | P (x, y)

where P is a two-argument predicate, with arguments of domains D and E, respectively; or, equiv-
alently, P is a single-argument predicate with domain D × E.
Let us try another example. How would you translate to symbols the proposition

There is no greatest prime number.

To make this process easier, let us temporarily ignore the negation.

There is a greatest prime number.

62
CHAPTER 9. MULTIPLE QUANTIFICATION; REPRESENTING PREDICATES
9.2. AMBIGUOUS QUANTIFICATION

Obviously the outer proposition is existential. Picking P to stand for the set of prime numbers and
writing half-symbolically we have

∃ x ∈ P | x is the greatest prime number

What does it mean to be the greatest prime number? It means that all other prime numbers must
be less than x.

∃ x ∈ P | ∀ y ∈ P, y ≤ x
(Why did we say “≤” rather than “<”?) Now we negate this.

∼ ∃ x ∈ P | ∀ y ∈ P, y ≤ x
Evaluate the negation of the existential quantifier.

∀ x ∈ P, ∼ ∀ y ∈ P, y ≤ x
Evaluate the negation of the universal quantifier.

∀ x ∈ P, ∃ y ∈ P | y > x
or

For every prime, there exists a greater prime.

As a final example, think back to the beginning of calculus when you first encountered the formal
definition of a limit. You may remember it as a less than pleasant experience. It is likely that one
of the main frustrations was that it is multiply quantified. lim x→a f (x) = L means

∀ ∈ R+ , ∃ δ ∈ R+ | 0 < |x − a| < δ → |f (x) − L| <

9.2 Ambiguous quantification

The use of symbols becomes increasingly critical when multiple quantification is involved because
natural language becomes desperately ambiguous at this point. Consider the sentence

There is a professor teaching every class.

This could mean

∃ x ∈ (The set of professors) | ∀ y ∈ (The set of classes), x teaches y

A very busy professor indeed. More likely, this is meant to indicate an adequate staffing situation,
that is,

∀ y ∈ (The set of classes), ∃ x ∈ (The set of professors) | x teaches y

Similarly, when someone says

A man loves every woman.

does it indicate a particularly promiscuous fellow

∃ x ∈ (Men) | ∀ y ∈ (Women), x loves y

or a Hollywood ending?

∀ y ∈ (Women), ∃ x ∈ (Men) | x loves y

If we say

63
9.3. PREDICATES IN ML CHAPTER 9. MULTIPLE QUANTIFICATION; REPRESENTING PREDICATES

Every man loves a woman.

do we mean

∀ x ∈ (Men), ∃ y ∈ (Women) | x loves y

that every guy has a star after which he pines, or

∃ y ∈ (Women) | ∀ x ∈ (Men), x loves y

some gal will be a shoe-in for homecoming queen?

9.3 Predicates in ML
We have already seen how to write boolean expressions in ML—expressions that represent proposi-
tions. However, ML understandably rejects an expression containing an undefined variable.

- x < 15.3;

stdIn:1.7 Error: unbound variable or constructor: x

However, we can capture the independent variable by giving a name to the expression, that is,
defining a predicate. In ML, we define predicates using the form

fun <identifier> ( <identifier>) = <expression>

The keyword fun is similar to val in that it assigns a meaning to an identifier (namely the first
identifier), which is the name of the predicate. The second identifier, enclosed in parentheses, is the
independent variable.

- fun P(x) = x < 15.3;

val P = fn : real -> bool

- P(1.5);

val it = true : bool

- P(15.3);

val it = false : bool

- P(27.4);

val it = false : bool

Let us compose predicates to test if a real number is within the range [−3.4, 47.3) and if an
integer is divisible by three.

- fun Q(x) = x >= 3.4 andalso x < 47.3;

val Q = fn : real -> bool

- Q(16.5);

64
CHAPTER 9. MULTIPLE QUANTIFICATION; REPRESENTING PREDICATES 9.4. PATTERN-MATCHING

val it = true : bool

- Q(0.1);

val it = false : bool

- Q(57.9);

val it = false : bool

- fun R(n) = n mod 3 = 0;

val R = fn : int -> bool

- R(2);

val it = false : bool

- R(4);

val it = false : bool

- R(6);

val it = true : bool

Notice several things. First, the keyword fun is used because we are defining a function—
specifically, a function whose co-domain is bool. Next, note ML’s response to the definition of a
predicate, for example val Q = fn : real -> bool . This means that the variable Q is assigned a
certain value. Indeed, Q is a variable, but not a variable that stores a int or bool value, but one that
stores a function value. The value is not printed, but fn (another keyword based on an abbreviation
for “function”) stands in for it. The type of the value is real →bool, which essentially means a
function whose domain is real and whose co-domain is bool.
Remember that * corresponds to × in mathematical notation for Cartesian products. In math-
ematical notation, we would say that a predicate like Q is a function R → {true, false}. You should
use your knowledge of functions from earlier study in mathematics to understand this for now, but
general functions and the concept of function types with be examined more carefully in a later
chapter. The function is quite the crucial concept in ML programming.

9.4 Pattern-matching
As a finale for this chapter, we consider writing predicates for non-numerical data. In an earlier
chapter, we learned how to create a new datatype. For example, we considered the set of trees (here
expanded):

- datatype tree = Oak | Maple | Pine | Elm | Spruce | Fir | Willow;

datatype tree = Elm | Fir | Maple | Oak | Pine | Spruce | Willow

We know that equality is defined automatically. By chaining comparisons using orelse, we can
create more complicated statements and encapsulate them in predicates.

65
9.4. PATTERN-MATCHING CHAPTER 9. MULTIPLE QUANTIFICATION; REPRESENTING PREDICATES

- fun isConiferous(tr) =
= tr = Pine orelse tr = Spruce orelse tr = Fir;

val isConiferous = fn : tree -> bool

- isConiferous(Willow);

val it = false : bool

- isConiferous(Pine);

val it = true : bool

Notice that we never explicitly consider the cases for non-coniferous trees. An equivalent way of
writing this is

- fun isConiferous2(tr) =
= if (tr = Pine)
= then true
= else if (tr = Spruce)
= then true
= else if (tr = Fir)
= then true
= else false;

val isConiferous2 = fn : tree -> bool

- isConiferous2(Oak);

val it = false : bool

- isConiferous2(Spruce);

val it = true : bool

Although this is much more verbose, a pattern like this would be necessary if such a predicate (or,
more generally, function) needed to perform more computation to determine its result (as opposed
to merely returning literals true and false). ML does, however, provide a cleaner way of using a
pattern conceptually the same as that used above. Instead of naming a variable for the predicate to
receive, we write a series of expressions for explicit input values:

- fun isConiferous3(Pine) = true

= | isConiferous3(Spruce) = true
= | isConiferous3(Fir) = true
= | isConiferous3(x) = false;

val isConiferous3 = fn : tree -> bool

- isConiferous3(Maple);

val it = false : bool

- isConiferous3(Fir);

66
CHAPTER 9. MULTIPLE QUANTIFICATION; REPRESENTING PREDICATES 9.4. PATTERN-MATCHING

val it = true : bool

This is referred to as pattern matching because the predicate is evaluated by finding the definition
that matches the input. Patterns can become more complicated than we have seen here. Notice also
that we still use a variable in the last line of the definition of isConiferous3 as a default case.
It is legal to define a predicate so that it leaves some cases undefined. However, that will generate
a warning when you define it and an error message if you try to use it on a value for which it is not
defined.

- fun isConiferous4(Pine) = true

= | isConiferous4(Spruce) = true
= | isConiferous4(Fir) = true;

stdIn:38.1-40.31 Warning: match nonexhaustive

Pine => ...
Spruce => ...
Fir => ...

val isConiferous4 = fn : tree -> bool

- isConiferous4(Spruce);

val it = true : bool

- isConiferous4(Pine);

val it = true : bool

- isConiferous4(Elm);

uncaught exception nonexhaustive match failure

As an example of a more complicated pattern, consider a 2-place predicate. Suppose you are
having guests over and want to serve food that will not violate any of your guests’ dietary restrictions.
Arwen tells you that she is a vegetarian, Luca says he is on a low-sodium diet, and Jael tells you
that she eats kosher. Estella says she will eat anything, and Bogdan claims he eats nothing, but you
figure that everyone will eat ice cream. To test what combination of foods would be palatable, you
use the following ML system:

- datatype guests = Arwen | Luca | Jael | Estella | Bogdan;

datatype guests = Arwen | Bogdan | Estella | Jael | Luca

- datatype foods = FrenchFries | Chicken | Bacon | Spinach | IceCream;

datatype foods = Bacon | Chicken | FrenchFries | IceCream | Spinach

- fun isKosher(Bacon) = false

= | isKosher(x) = true;

val isKosher = fn : foods -> bool

67
9.4. PATTERN-MATCHING CHAPTER 9. MULTIPLE QUANTIFICATION; REPRESENTING PREDICATES

- fun isMeat(Chicken) = true

= | isMeat(Bacon) = true
= | isMeat(x) = false;

val isMeat = fn : foods -> bool

- fun isSalty(Bacon) = true

= | isSalty(FrenchFries) = true
= | isSalty(x) = false;

val isSalty = fn : foods -> bool

- fun eats(x, IceCream) = true

val eats = fn : guests * foods -> bool

68
CHAPTER 9. MULTIPLE QUANTIFICATION; REPRESENTING PREDICATES 9.4. PATTERN-MATCHING

Exercises

After your dinner party you and your guests (from the earlier ex-
Let M be the set of men, U be the set of unicorns, h(x, y) be ample) will be watching a movie. Nobody wants to watch Titanic.
the predicate that x hunts y, and s(x, y) be the predicate that x Luca, a Californian, wants to see the Governator in Terminator.
is smarter than y. Write the following symbolically, then negate Jael is up for anything but a drama. Estella is in the mood for a
them, and then write the negation in English. good comedy. Bogdan has a crush on Estella and wants to watch
whatever she wants.
1. A certain man hunts every unicorn.
8. Create a movie genre datatype with elements Drama, Com-
2. Every man is smarter than every unicorn. edy, and Action. Create a movie datatype with elements
3. Any man is mortal if he hunts a unicorn. Terminator, Shrek, Titanic, Alien, GoneWithTheWind, and
Airplane. Copy the guest datatype from the example.
4. There is a smartest unicorn.
9. Write three predicates using pattern-matching to determine
5. No man hunts every unicorn. the genre of a movie (i.e. isDrama, isComedy, isAction).
Evaluate the negation 10. Write a predicate wantsToWatch(guest, movie) that deter-
mines if a guest will watch a particular movie.
6. ∼ ∀ x ∈ D, ∃ y ∈ E|P (x, y).
11. Will Bogdan watch Shrek ? Will Jael watch Gone With The
7. ∼ ∃ x ∈ D|∀ y ∈ E, P (x, y). Wind ?

69
9.4. PATTERN-MATCHING CHAPTER 9. MULTIPLE QUANTIFICATION; REPRESENTING PREDICATES

70
Part III

Proof

71
Chapter 10

Subset proofs

10.1 Introductory remarks

There are two principle goals for this course: writing proofs and writing programs. Writing programs
develops as a crescendo in that you gather pieces throughout this book until their grand assembling
in Part VII. Writing proofs, on the other hand, is a skill you will practice over and over, as a figure
skater does a triple axel, for the entire course. It is therefore difficult to overstate the importance of
these next few chapters.
A traditional high school mathematics curriculum teaches proof writing in a course on geometry.
You may remember the simple two-column format for proof-writing, with succeeding propositions
in the left column and corresponding justifications in the right (this is not too different conceptually
from what we did in Section 7.3). Let us apply this method to proving the Pythagorean Theorem,
which states that the square of the hypotenuse of a right triangle is equal to the sum of the squares
of the other two sides, or, for the triangle below left, a2 + b2 = c2 . One way to prove this is to
reduplicate the triangle, rotating and arranging as shown below center, where T is the inner square
and S is the outer square. We then reason as we see below right (using charming high school
abbreviations and symbols).
a b
S

a 4A ∼ = 4B SSS
b
c c ∠1 + ∠2 = 90◦ 4 angles sum to 180◦
∠1 + ∠20 = 90◦ ∠2 ∼
= ∠2
0

T
∠3 = 90◦ Supplementary ∠s
T is a square Equal sides, 90◦ ∠s
2
Area of T = c2 Area of
c
c
c b
Area of S = (a + b)2 Area of
b
a A
Area of each 4 = ab 2 Area of 4
B 3 (a + b)2 = c2 + 4 ab
2 Sum of areas
2’ 1 2 2 2
a + 2ab + b = c + 2ab Algebra (FOIL, simplification)
a b a ∴ c2 = a 2 + b2 Subtract 2ab from both sides.

Proofs in real mathematics, however, require something more professional. Proofs should be
written as paragraphs of complete English sentences—though mathematical symbolism is often useful
for conciseness and precision.

10.2 Forms for proofs

A theorem is a proposition that is proven to be true. Thus a paradox (or any other non-proposition) theorem
cannot be a theorem, because a non-proposition is neither true nor false (or it is both). A false
proposition also cannot be a theorem because it is false. Like a theorem, an axiom is true, but
unlike a theorem, an axiom is assumed to be true rather than proven true. Finally, a theorem is
different from a conjecture, even one that happens to be true, in that a theorem is proven to be true,
whereas a conjecture is not proven (and therefore not known for certain) to be true. Our business is

73
10.2. FORMS FOR PROOFS CHAPTER 10. SUBSET PROOFS

the promotion of conjectures to theorems by writing proofs for them. However, we will often speak
of this as “to prove a theorem,” since all propositions you will be asked to prove will have been
proven before.
Basic theorems take on one of three General Forms:
1. Facts. p
2. Conditionals. If p then q.
3. Biconditionals. p iff q.
Of these, General Form 2 is the most important, since facts can often be restated as conditionals,
and biconditionals are actually just two separate conditionals. The theorems we shall prove in this
part of the book all come out of set theory, and basic facts in set theory take on one of three Set
Proposition Forms:
1. Subset. X ⊆ Y .
2. Set equality. X = Y .
3. Set empty. X = ∅.
In this chapter, we will work on proving the simplest kinds of theorems: those that conform to
General Form 1 and Set Form 1. In the next chapter, we will consider theorems of General Form 1
and Set Forms 2 and 3, and the chapter after that will cover General Forms 2 and 3.
Let A and B be sets, subsets of the universal set U . An example of a proposition in General
Form 1, Set Form 1 is
Theorem 10.1 A ∩ B ⊆ A
This is a simple fact (not modified by a conditional) expressing a subset relation, that one set (A∩B)
is a subset of another (A). Our task is to prove that this is always the case, no matter what A and
B are. To prove that, we need to ask ourselves What does it mean for one set to be a subset of
another? and Why is it the case that these two sets are in that relationship?
The first question appeals to the definition of subset. Formal, precise definitions are extremely
important for doing proofs. Chapter 1 gave an informal definition of the subset relation, but we
need a formal one in order to reason precisely. Our knowledge of quantified logic allows us to define

X ⊆ Y if ∀ x ∈ X, x ∈ Y
(Observe how this fact now breaks down into a conditional. “A ∩ B ⊆ A” is equivalent to “if
a ∈ A ∩ B then a ∈ A.” This observation will make proving conditional propositions more familiar
when the time comes. More importantly, you should notice that definitions, although expressed
merely as conditionals, really are biconditionals; the “if” is an implied “iff.”)
The burden of our proof to show A ⊆ B is, then, to show that

∀ a ∈ A, a ∈ B
Notice that this is a special case of the more general form

∀ a ∈ A, P (a)
letting P (a) = a ∈ B. Think back to the previous chapter. How would you persuade someone that
this is the case? You allow the doubter to pick an element of A and then show that that element
makes the predicate true. The way we express an invitation to the doubter to pick an element is
with the word suppose. “Suppose a ∈ A. . . ” is math-speak for “choose for yourself an element of A,
and I will tell you what to do with it, which will persuade you of my point.”
In our case, the set that is a subset is A ∩ B. The formal definition of intersect is

X ∩ Y = {z | z ∈ X ∧ z ∈ Y }
Now follow the proof:

74
CHAPTER 10. SUBSET PROOFS 10.3. AN EXAMPLE

Proof. Suppose a ∈ A ∩ B.
By definition of intersection, a ∈ A and a ∈ B.
a ∈ A by specialization.
Therefore, by definition of subset, A ∩ B ⊆ A. 2

One line is italicized because it really could be omitted for the sake of brevity (not that this
particular proof needs to be more brief). The proofs in this text frequently will have italicized
sentences which will add clarity (but also clutter) to proofs; you may omit similar sentences when
you write proofs yourself. Specialization is the sort of logical step that it is fair to assume your
audience will perform automatically, as long as you recognize that a real logical operation is indeed
happening. Notice that

• Our proof begins with Suppose...

• Every other sentence is a proposition joined with a prepositional phrase governed by by.
• The last sentence begins with therefore and, except for the by part, is the proposition we are
proving.

• The proof is terminated by the symbol 2, a widely used end-of-proof marker. You will some-
times see proofs terminated with QED, an older convention from the Latin quod erat demon-
strandum, which means “which was to be proven.”

Compare that with our arguments in Section 7.3. Our overall strategy in this case is what is
called the element argument for proving facts of set form 1: element argument

To prove A⊆B

say Suppose a ∈ A.

followed by . . . a sequence of propositions that logically follow each other . . .

with second-to-last sentence a ∈ B by . . .

and last sentence Therefore, A ⊆ B by the definition of subset. 2

Proofs at the beginning level are all about analysis and synthesis. Analysis is the taking apart
of something. Break down the assumed or proven propositions by applying definitions, going from
term to meaning. Synthesis is the putting together of something. Assemble the proposition to be
proven by applying the definition in the other direction, going from meaning to term. Here is a
summary of the formal definitions of set operations we have seen before.

X ∪Y = {z | z ∈ X ∧ z ∈ Y } X −Y = {z | z ∈ X ∧ z ∈/ Y}
X ∩Y = {z | z ∈ X ∨ z ∈ Y } X ×Y = {(x, y) | x ∈ X ∧ y ∈ Y }
X = {z | z ∈
/ X}

10.3 An example
Now we consider a more complicated example. The same principles apply, only with more steps
along the way. Let A, B, and C be sets, subsets of U . Prove

A × (B ∪ C) ⊆ (A × B) ∪ (A × C)
Immediately we notice that this fits our form, so we know our proof will begin with

75
10.4. CLOSING REMARKS CHAPTER 10. SUBSET PROOFS

Proof. Suppose x ∈ A × (B ∪ C).

and end with

x ∈ (A × B) ∪ (A × C) by . . . . Therefore, A × (B ∪ C) ⊆ (A × B) ∪ (A × C). 2

Our proof is going to be a journey from x ∈ A × (B ∪ C) to x ∈ (A × B) ∪ (A × C). Most steps

will be the application of a definition. To trace out a route, we need to consider what paths lead
from x ∈ A × (B ∪ C) and which paths lead to x ∈ (A × B) ∪ (A × C), and to find a way to connect
them in the middle.
What has the definition of Cartesian product to say about x ∈ A × (B ∪ C)? x must be an
ordered pair, having the form, say, (a, d). Having picked the symbols a and d, we know that a ∈ A
and d ∈ B ∪ C. What we know about d can be broken down further, observing that d ∈ B or d ∈ C.
This is the analysis of our supposition, broken down to individual facts.
What about x ∈ (A × B) ∪ (A × C)? What would this mean, if it were true? x would have to be
an element of A × B or of A × C. In the first case, it would have to have the form (a, d) where a ∈ A
and d ∈ B; in the other case, it would still have to have the form (a, d) where a ∈ A, but instead
where d ∈ C. This is a hypothetical analysis of our destination. Notice our use of the subjunctive
throughout, since we do not know that these parts are true, only that they would be true if the
destination were true. However, if we proved all these parts by some other means, then retracing
our steps would lead to a synthesis of our destination. In this instance, the glue that connects these
with the pieces of the previous analysis is the argument form we have learned called division into
cases.

Proof. Suppose x ∈ A × (B ∪ C). By the definition of Cartesian product, x = (a, d) for

some a ∈ A and some d ∈ B ∪ C. Then d ∈ B or d ∈ C, by definition of union.
Case 1: Suppose d ∈ B. Then, by definition of Cartesian product, (a, d) ∈ A × B.
Moreover, x ∈ A × B by substitution. Finally, by the definition of union, x ∈ (A × B) ∪
(A × C).
Case 2: Suppose d ∈ C. Then, by definition of Cartesian product, (a, d) ∈ A × C.
Moreover, x ∈ A × C by substitution. Finally, by the definition of union, x ∈ (A × B) ∪
(A × C).
So x ∈ (A×B)∪(A×C) by division into cases. Therefore, A×(B∪C) ⊆ (A×B)∪(A×C).
2

Notice several things. First, instead of writing each sentence on its own line, we combined several
sentences into paragraphs. This manifests the general divisions in our proof technique. The first
paragraph did the analysis. The next two each dealt with one case, also beginning the synthesis.
The last paragraph completed the synthesis.
Second, division into cases was turned into prose and paragraph form by highlighting each case,
with each case coming from a clause of a disjunction (“d ∈ B or d ∈ C”), and each case requires
another supposition. We will see this structure again later, and take more careful note of it then.
Third, we have peppered this proof with little words like “then,” “moreover,” “finally,” and “so.”
These do not add meaning, but they make the proof more readable.
Finally, we have made use of one extra but very important mathematical tool, that of substitution.
If two expressions are assumed or shown to be equal, we may substitute one for the other in another
expression. In this case, we assumed x and (a, d) were equal, and so we substituted x for (a, d) in
(a, d) ∈ A × B. (Or, we replaced (a, d) with x. Do not say that we substituted (a, d) with x. See
Fowler’s Modern English Usage on the matter [7].)

10.4 Closing remarks

Writing proofs is writing. Your English instructors have long forewarned that good writing skills are
essential no matter what field you study. The day of reckoning has arrived. When you write proofs,

76
CHAPTER 10. SUBSET PROOFS 10.4. CLOSING REMARKS

you will write in grammatical sentences and fluent paragraphs. Two-column grids with non-parallel
phrases are for children. Sentences and paragraphs are for adults. However, you are by no means
expected to write good proofs immediately. Proof-writing is a fine skill that takes patience and
practice to acquire. Do not be discouraged when you write pieces of rubbish along the way. Being
able to write proofs is a goal of this course, not a prerequisite.
More specifically, writing proofs is persuasive writing. You have a proposition, and you want your
audience to believe that it is true. Yet mathematical proofs have a character of their own among
other kinds of persuasive writing. We are not about the business of amassing evidence or showing
that something is probable—mathematics stands out even among the other sciences in this regard.
One proof cannot merely be stronger than another; a proof either proves its theorem absolutely
or not at all. A mathematical proof of a theorem may in some ways be less weighty than, say, an
argument for a theological position. On the other hand, a mathematical proof has a level of precision
that no other discourse community can approach. This is the community standard to which you
must live up; when you write proofs, justify everything.
You may imagine the stereotypical drill sergeant telling recruits, “When you speak to me, the
first and last word out of your mouth better be ‘sir.’ ” You will notice an almost militaristic rigidity
in the proof-writing instruction in this book. Every proof must begin with suppose. Every other
sentence must contain by or since or because (with a few exceptions, which will generally contain
another suppose instead). Your last sentence must begin with therefore. Your proofs will conform to
a handful of patterns like the element argument we have seen here. However, this does not represent
the whole of mathematical proofs. If you go on in mathematical study, you will see that the structure
and phrasing of proofs can be quite varied, creative, and even colorful. Steps are skipped, mainly
for brevity. Phrasing can be less precise, as long as it is believable that they could be tightened
up. However, this is not yet the place for creativity, but for fundamental training. We teach you to
march now. Learn it well, and someday you will dance.

77
10.4. CLOSING REMARKS CHAPTER 10. SUBSET PROOFS

Exercises

4. (A ∪ B) ⊆ A ∩ B.
Let A, B, and C be sets, subsets of the universal set. Prove.
5. A ∪ (B ∩ C) ⊆ (A ∪ B) ∩ (A ∪ C).
1. A ⊆ A ∪ B.
2. A − B ⊆ B. 6. (A × B) ∪ (A × C) ⊆ A × (B ∪ C).

3. A ∩ B ⊆ A − B. 7. A × (B − C) ⊆ (A × B) − (A × C).

78
Chapter 11

Set equality and empty proofs

11.1 Set equality

We now concern ourselves with proving facts of Set Form 2, that is, propositions in the form A = B
for some sets A and B. As with subsets, proofs of set equality must be based on the definition, what
it means for two sets to be equal. Informally we understand that two sets are equal if they are made
up of all the same elements, that is, if they completely overlap. In other words, two sets are equal
if they are subsets of each other. Formally:

X = Y if X ⊆ Y ∧ Y ⊆ X
This means you already know how to prove propositions of set equality—it is the same as proving
subsets, only it needs to be done twice (once in each direction of the equality). Observe this proof
of A − B = A ∩ B:
Proof. First, suppose x ∈ A − B. By the definition of set difference, x ∈ A and
x∈/ B. By the definition of complement, x ∈ B. Then, by the definition of intersection,
x ∈ A ∩ B. Hence, by the definition of subset, A − B ⊆ A ∩ B.
Next, suppose x ∈ A ∩ B. . . . Fill in your proof from Chapter 10, Exercise 3. . . Hence,
by the definition of subset, A ∩ B ⊆ A − B.
Therefore, by the definition of set equality, A − B = A ∩ B. 2
Notice that this proof required two parts, highlighted by “first” and “next,” each part with its
own supposition and each part arriving at its own conclusion. We used the word “hence” to mark
the conclusion of a part of the proof and “therefore” to mark the end of the entire proof, but they
mean the same thing. Notice also the general pattern: suppose an element in the left side and show
that it is in the right; suppose an element in the right side and show that it is in the left.
To avoid redoing work (and making proofs unreasonably long), you may use previously proven
propositions as justification in a proof. A theorem that is proven only to be used as justification in
a proof of another theorem is called a lemma. Lemmas and theorems are either identified by name lemma
or by number. For our purposes, we can also refer to theorems by their exercise number. Here is a
re-writing of the proof:
Lemma 11.1 A − B ⊆ A ∩ B.
Proof. Suppose x ∈ A − B. By the definition of set difference, x ∈ A and x ∈ / B. By
the definition of complement, x ∈ B. Then, by the definition of intersection, x ∈ A ∩ B.
Therefore, by the definition of subset, A − B ⊆ A ∩ B. 2
Now the proof of our theorem becomes a one-liner (incomplete sentences may be countenanced
when things get this simple):
Proof. By Lemma 11.1, Exercise 10.3, and the definition of set equality. 2

79
11.2. SET EMPTINESS CHAPTER 11. SET EQUALITY AND EMPTY PROOFS

11.2 Set emptiness

Suppose we are to prove

A ∩ A=∅
To address Set Form 3, we must consider what it means for a set to be empty; though this may
seem obvious, a precise proof cannot be written if it does not have a precise defiinition for which to
aim. A set X is empty if

∼∃x∈U |x∈X
It is frequently useful to prove that a certain sort of thing does not exist. The doubter’s objection
that “just because you have not found one does not mean they do not exist” is quite a high hurdle for
the prover. We do, however, have a weapon for propositions of this sort—the proof by contradiction
syllogism we learned in Chapter 7. We suppose the opposite of what we are trying to prove

∃x∈U |x∈X
show that this leads to a contradiction, and then conclude what we were trying to prove. This is
indeed one of the most profound techniques in mathematics. G.H. Hardy remarked, “It is a far finer
gambit than any chess gambit: a chess player may offer the sacrifice of a pawn or even a piece, but
the mathematician offers the game.”

Proof. Suppose a ∈ A ∩ A. Then, by definition of intersection, a ∈ A and a ∈ A. By

the definition of complement, a ∈/ A. a ∈ A and a ∈ / A is a contradiction. This proves
that the supposition a ∈ A ∩ A is false. Therefore, A ∩ A = ∅. 2

11.3 Remarks on proof by contradiction

The more powerful the tool, the more easily it can be misused. So it is with proof by contradiction.
Since we can prove that a proposition is false by deriving a known false propsition from it, the novice
prover is often tempted to prove that a proposition is true by deriving a known true proposition
from it. This is a logical error of the most magnificent kind, which you must be careful to avoid.
Let this example, to prove A = ∅ (that is, all sets are empty), demonstrate the folly to would-be
trespassers:

Proof. Suppose A = ∅. Then A ∪ U = ∅ ∪ U by substitution. By Exercise 11, A ∪ U = U

and also ∅ ∪ U = U . By substitution, U = U , an obvious fact. Hence A = ∅. 2

In other words, there is no proof by tautology. The truth table below presents another way to see
why not—in the second critical row, the conclusion is false.

p T p→T p
p→T critical row
T T T T ←−−−−−−−
−
∴p critical row
F T T F ←−−−−−−−
−

80
CHAPTER 11. SET EQUALITY AND EMPTY PROOFS 11.3. REMARKS ON PROOF BY CONTRADICTION

Exercises

11. A ∪ U = U .
Let A, B, and C be sets, subsets of the universal set U . Prove. 12. (A ∪ B) = A ∩ B.
You may use exercises from the previous chapter in your proofs.
13. (A ∩ B) = A ∪ B.
1. A ∪ ∅ = A.
14. A ∪ (A ∩ B) = A.
2. A ∪ (A ∩ B) = A.
15. A ∪ B = A ∪ (B − (A ∩ B)).
3. A × (B ∪ C) = (A × B) ∪ (A × C).
16. A ∩ ∅ = ∅.
4. A ∪ (B ∩ C) = (A ∪ B) ∩ (A ∪ C). 17. A − ∅ = A.
5. A ∪ A = U . 18. A ∩ A = ∅.
6. A × (B − C) = (A × B) − (A × C). 19. A × ∅ = ∅.
7. A ∪ B = B ∪ A. 20. A − A = ∅.
8. (A ∪ B) ∪ C = A ∪ (B ∪ C). 21. (A − B) ∩ (A ∩ B) = ∅.
9. (A ∩ B) ∩ C = A ∩ (B ∩ C). 22. (A − B) ∩ B = ∅.
10. A = A. 23. A ∪ (B − (A ∩ B)) = ∅.

81
11.3. REMARKS ON PROOF BY CONTRADICTION CHAPTER 11. SET EQUALITY AND EMPTY PROOFS

82
Chapter 12

Conditional proofs

12.1 Worlds of make believe

In this chapter we concern ourselves with proofs of General Forms 2 and 3, conditional and bicon-
ditional propositions. As an initial example, consider the proposition

If A ⊆ B, then A ∩ B = A.

Proof. Suppose A ⊆ B.

Further suppose that x ∈ A ∩ B. By definition of

Proof of
intersection, x ∈ A. Hence A∩B ⊆ A by definition A ∩ B ⊆ A.
of subset.

Now suppose that x ∈ A. Since A ⊆ B, then Entire proof.

Proof of
x ∈ B as well by definition of subset. Then by Proof of A ∩ B = A.
definition of intersection, x ∈ A ∩ B. Hence A ⊆ A ⊆ A ∩ B.
A ∩ B by definition of subset.

Therefore, by definition of set equality, A∩B = A.

We see from this that the proof of this slightly more sophisticated proposition is really composed
of smaller proofs with which we are already familiar. At the innermost levels, we have subset proofs,
two of them. Together, they constitute a proof of set equality, as we saw in the previous chapter. We
are now merely wrapping that proof in one more layer to get a proof of a conditional proposition.
That “one more layer” is another supposition. Take careful stock in how the word suppose is used
in proof above.
The sub-proof of A ⊆ A ∩ B makes no sense out of context. Certainly a set A is not in general
a subset of its intersection with another set. What makes that true (as we say in the proof) is that
we have supposed a restriction, namely A ⊆ B. The wrapper provides a context that makes the
proposition A ⊆ A ∩ B true.
When we say suppose, we are boarding Mister Rogers’s trolley to the Neighborhood of Make-
Believe. We are creating a fantasy world in which our supposition is true, and then demonstrating
that something else happens to be true in the world we are imagining. The imaginary world must
obey all mathematical laws plus the laws we postulate in our supposition. Sometimes, it is useful to
make another supposition, in which case we enter a fantasy world within the first fantasy world—that
world must obey all the laws of the outer world, plus whatever is now supposed.
Trace our journey through the proof above. We begin in the real world (or at least the world of
mathematical set theory), World 0. We imagine a world in which A ⊆ B, World 1. From that world

83
12.2. INTEGERS CHAPTER 12. CONDITIONAL PROOFS

we imagine yet another world, World 2, in which x ∈ A ∩ B, and show that in World 2, x ∈ A. This
proves that any world like World 2 within World 1 will behave that way, and this proves that within
World 1, A ∩ B ⊆ A. (It happens that this is true in World 0 as well, but we do not prove it.) We
exit World 2. We then imagine a World 3 within World 1 in which x ∈ A, and in a way similar to
what we had before, show that A ⊆ A ∩ B. Together, these things about World 1 also show that
A ∩ B = A in World 1, and our proof is complete. Notice that our return to World 0 is not explicit.
This is because the proposition we are proving does not ask us to prove anything directly about the
real world (as propositions in General Form 1 do). Rather, it asks us to prove something about all
worlds in which A ⊆ B.

12.2 Integers
For variety, let us try out these proof techniques in another realm of mathematics. Here we will
prove various propositions about integers, particularly about what facts depend upon them being
even even or odd. These proofs will rely on formal definitions of even and odd: An integer x is even if

∃ k ∈ Z | x = 2k
odd and an integer x is odd if

∃ k ∈ Z | x = 2k + 1
We will take as axioms that integers are closed under addition and multiplication, and that all
integers are either even or odd and not both.

Axiom 3 If x, y ∈ Z, then x + y ∈ Z.

Axiom 4 If x, y ∈ Z, then x · y ∈ Z.

Axiom 5 If x ∈ Z, then x is even iff x is not odd.

You may also use basic properties of arithmetic and algebra in your proofs. Cite them as “by rules
of arithmetic” or “by rules of algebra,” although if you recall the name of the specific rule or rules
being used, your proof will be better if you cite them. We begin with

If x and y are even integers, then x + y is even.

Proof. Suppose x and y are even integers. By the definition of even, there exist j, k ∈ Z
such that x = 2j and y = 2k. Then

x + y = 2j + 2k by substitution
= 2(j + k) by distribution

Further, j + k ∈ Z because integers are closed under addition. Hence there is an integer,
namely j + k, such that x + y = 2(j + k). Therefore x + y is an even integer by the
definition of even. 2

Notice how we brought in the variables j and k. By saying “. . . there exist j, k. . . ”, we have
made an implicit supposition about what j and k are. The definition of even establishes that this
supposition is legal. Notice also how we structured the steps from x + y to 2(j + k). This is a
convenient shorthand for dealing with long chains of equations. Were we to write this out fully, we
would say x + y = 2j + 2k by substitution, 2j + 2k = 2(j + k) by distribution, and x + y = 2(j + k)
by substitution again (or by the transitivity of equals). This is not so bad when we are juggling only
three expressions, but a larger number would become unreadable without chaining them together.
Finally, the second to last sentence is italicized because it merely rephrases what the previous two
sentences gave us. It is included here for explicitness, but you may omit such sentences.

84
CHAPTER 12. CONDITIONAL PROOFS 12.3. BICONDITIONALS

12.3 Biconditionals
Bicondional propositions (those of General Form 3) stand toward conditional propositions in the
same relationship as proofs of set equality stand toward subset proofs. A biconditional is simply two
conditionals written as one proposition; one merely needs to prove both of them.

A − B = ∅ iff A ⊆ B

Proof. First suppose A − B = ∅. We will prove that A ⊆ B. Suppose x ∈ A. Since

A − B = ∅, it cannot be that x ∈ A − B, by definition of empty set. By definition of set
difference, it is not true that x ∈ A and x ∈
/ B, and hence by DeMorgan’s law, it is true
that either x ∈ / A or x ∈ B. Since x ∈ A, then by elimination, x ∈ B. Hence A ⊆ B.
Conversely, suppose A ⊆ B. We will prove that A − B = ∅. Suppose x ∈ A − B.
By definition of set difference, x ∈ A and x ∈
/ B. By definition of subset, x ∈ B,
contradiction. Hence A − B = ∅. 2

Notice how many suppositions we have scattered all over the proof—and we are still proving fairly
simple propositions. To avoid confusion you should use paragraph structure and transition words
to guide reader around the worlds you are moving in and out of. The word conversely, for example,
indicates that we are now proving the second direction of a biconditional proposition (which is the
converse of the first direction).

12.4 Warnings
We conclude this chapter—and the part of this book explicitly about proofs—with the exposing
of logical errors particularly seductive to those learning to prove. Let not your feet go near their
houses.

Arguing from example.

2 + 6 = 8. Therefore, the sum of two even integers is an even integer. 2

If this reasoning were sound, then this would also prove that the sum of any two even integers is 8.

Reusing variables.

Suppose x and y are even integers. By the definition of even, there exists k ∈ Z such
that x = 2k and y = 2k.

Since x and y are even, each of them is twice some other integer—but those are different integers.
Otherwise, we would be proving that all even integers are equal to each other. What is confusing
about the correct way we wrote this earlier, “there exist j, k ∈ Z such that x = 2j and y = 2k,” is
that we were contracting the longer phrasing, “there exists j ∈ Z such that x = 2j and there exists
k ∈ Z such that y = 2k.” Had we reused the variable k in the longer version, it would be clear
that we were trying to reuse a variable we had already defined. This kind of mistake is extremely
common for beginners.

Begging the question.

Suppose x and y are even. Then x = 2j and y = 2k for some j, k ∈ Z. Suppose x + y is

even. Then x + y = 2m for some m ∈ Z. By substitution, 2j + 2k = 2m, which is even
by definition, and which we were to show. 2

85
12.4. WARNINGS CHAPTER 12. CONDITIONAL PROOFS

This nonsense proof tries to postulate a world in which the proposition to be proven is already true,
followed by irrelevant manipulation of symbols. You cannot make any progress by supposing what
you mean to prove—this is merely a subtle form of the “proof by tautology” we repudiated in the
previous chapter.

Substituting “if ” for “because” or “since.”

First suppose A−B = ∅. Suppose x ∈ A. If A−B = ∅, then it cannot be that x ∈ A−B.

By definition of set difference, it is not true that x ∈ A and x ∈
/ B ...

This is more a matter of style and readability than logic. Remember that each step in a proof should
yield a new known proposition, justified by previously known facts. “If A − B = ∅, then it cannot be
that x ∈ A − B” does no such thing, only informing us that x ∈ A − B, contingent upon A − B = ∅
being true. Instead, this part of the proof should assert that x ∈
/ A − B because A − B = ∅.

86
CHAPTER 12. CONDITIONAL PROOFS 12.4. WARNINGS

Exercises

the subsets of A into two groups: those that contain a and

Let A, B, and C be sets, subsets of the universal set U . For inte- those that do not. You need to show that these two groups
gers x and y to be consecutive means that y = x + 1. Prove. You comprise the entire powerset.
may use exercises from the previous chapter in your proofs.
6. If a ∈ A then P(A−{a})∩{ {a}∪A0 | A0 ∈ P(A−{a})} = ∅.
1. If A ⊆ B and B ⊆ C, then A ⊆ C. (This also is a proof that two sets are disjoint. This together
2. If A ∩ B = A ∩ C and A ∪ B = A ∪ C, then B = C. with the previous exercise show that these two sets make a
partition of P(A).
3. If a ∈
/ A, then A ∩ {a} = ∅. (This is the same as saying that
if a ∈
/ A, then A and {a} are disjoint.) 7. If x and y are odd integers, then x + y is even.
4. B ⊆ A iff (A − B) ∪ B = A. 8. If x and y are consecutive integers, then x + y is odd.
5. If a ∈ A then P(A) = P(A − {a}) ∪ {a ∪ A0 |A0 ∈
9. If n2 is odd, then n is odd. (Hint: try a proof by contradic-
P(A − {a})}. Hint: First review what a powerset is. This is
tion using Axiom 5.)
a complicated expression and consequently a difficult proof,
but do not be intimidated. The idea is that we are splitting 10. x · y is odd iff x and y are both odd.

87
12.4. WARNINGS CHAPTER 12. CONDITIONAL PROOFS

88
Chapter 13

Special Topic: Russell’s paradox

The usefulness of the set concept is its simpleness and its flexibility. For this reason set theory
is a core component of the foundations of mathematics. An example of the concept’s flexibility
is that we can talk sensibly about sets of sets—for example, powersets. Reasoning becomes more
subtle, however, if we speak of sets that even contain themselves. For example, the set X of all sets
mentioned in this book is hereby mentioned in this book, and so X ∈ X. Bertrand Russell called
for caution when playing with such ideas with the following paradox.
Let X be the set of all sets that do not contain themselves, that is, X = {Y |Y ∈ / Y }. Does X
contain itself?
First, suppose it does, that is X ∈ X. However, then the definition of X states that only
those sets that do not contain themselves are elements of X, so X ∈ / X, which is a contradiction.
Hence X ∈ / X. But wait—the same definition of X now tells us that X ∈ X, and we have another
contradiction.
A well-known puzzle, also attributed to Russell, presents the same problem. Suppose a certain
town has only one man who is a barber. That barber shaves only every man in the town who does
not shave himself. Does the barber shave himself? If he does, then he doesn’t; if he doesn’t, then
he does.
We can conclude from this only that the setup of the puzzle is an impossibility. There could not
possibly be a man who shaves only every man who does not shave himself. Likewise, the set of all
sets that do not contain themselves must not exist. This is why rigorous set theory must be built on
axioms. Although we have not presented a complete axiomatic foundation for set theory here, we
have assumed that at least one set exists (namely, the empty set), and we have assumed a notion
of what it means for sets to be equal. We have not assumed that any set that can be described
necessarily exists.
For this reason, when we name sets in a proof (“let X be the set. . . ”), we really are doing more
than assigning a name to some concept; we are jumping to the conclusion that the concept exists.
Therefore if we are defining sets in terms of a property (“let X = {x | x < 3}”), it is more rigorous
to make that set a subset of a known (or postulated) set merely limited by that property (“let
X = {x ∈ Z | x < 3}”).
This clears away the paradox nicely. Since it makes sense only to speak about things in the
universal set, we will assume that X ⊂ U , that is, X = {Y ∈ U | Y ∈ / Y }. Now the question, Does
X contain itself?, becomes, Is it true that X ∈ U and X ∈ / X? First suppose X ∈ U . Then either
X∈ / X or X ∈ X. As we saw before, both of those lead to contradictions. Hence X ∈ / U . In other
words, X does not exist.
Interestingly, this leaves powerset without a foundation, since we cannot define it as a subset of
something else. Since we do not want to abandon the idea altogether, we place it on firm ground
with its own axiom.

Axiom 6 (Powerset.) For any set X, there exists a set P(X) such that Y ∈ P(X) if and only
if Y ⊆ X.

89
CHAPTER 13. SPECIAL TOPIC: RUSSELL’S PARADOX

We have declared it O.K. to speak of powersets. However, we can prove that no set contains its
own powerset.

Theorem 13.1 If A is a set, then P(A) * A.

Proof. Suppose A is a set. Further suppose that P(A) ⊆ A. Let B = {b ∈ A | b ∈ / b}.

Since B ⊆ A, B ∈ P(A) by definition of powerset. By definition of subset, B ∈ A. Now
we have two cases:
Case 1: Suppose B ∈ B. Then B ∈
/ B, contradiction; this case is impossible.
Case 2: Suppose B ∈
/ B. Then B ∈ B, contradiction, this case is impossible.
Either case leads to a contradiction. Hence P(A) * A. 2

Moreover, this shows that the idea of a “set of all sets” is downright impossible in our axiomatic
system.

Corollary 13.1 The set of all sets does not exist.

Proof. Suppose A were the set of all sets. By Axiom 6, P(A) also exists. Since P(A) is
a set of sets and A is the set of all sets, P(A) ⊆ A. However, by the theorem, P(A) * A,
a contradiction. Hence A does not exist. 2

This chapter draws heavily from Epp[5] and Hrbacek and Jech[9].

90
Part IV

Algorithm

91
Chapter 14

Algorithms

14.1 Problem-solving steps

So far we have studied sets, a tool for categorization; logic, rules for deduction; and proof, the
application of these rules. We procede now to another thinking skill, that of ordering instructions
to describe a process of solving a problem. The importance of this to computer science is obvious,
because programming is merely the ordering of instructions for a computer to follow. We will see if
also as a mathematical concept.
An algorithm is a series of steps to solving and problem or discovering a result. An algorithm algorithm
must have some raw material to work on (its input), and there must be a result that it produces
(its output). The fact that it has input implies that the solution is general ; you would write an input
algorithm for finding n!, not one for finding 5!. An algorithm must be expressed in a way that is
output
precise enough to be followed correctly.
Much of the mathematics you learned in the early grades involved mastering algorithms—how to
add multi-digit numbers, how to do long division, how to add fractions, etc. We also find algorithms
in everyday life in the form of recipes, driving directions, and assembly instructions.
Let us take multi-digit addition. The algorithm is informed by how we arrange the materials we
work on. The two numbers are analyzed by “columns,” which represent powers of ten. You write the
two addends, one above the other, aligning them by the one’s column. The answer will be written
below the two numbers, and carry digits will be written above. At every point in the process, there
is a specific column that we are working on (call it the “current column”). Our process is

1. Start with the one’s column.

2. While the current column has a number for either of the addends, repeatedly
(a) Add the two numbers in the current column plus any carry from last time; let’s call this
result the “sum.”
(b) Write the one’s column of the sum in the current column of the answer.
(c) Write the ten’s column of the sum in the carry space of the next column to the left.
(d) Let the next column to the left become our new current column.
3. If the last round of the repetition had a carry, write it in a new column for the answer.

Notice how this algorithm takes two addends as its input, produces and answer for its output,
and uses the notions of sum and current column as temporary scratch space. It is particularly
important to notice that the sum and the current column keep changing.
A grammatical analysis of the algorithm is instructive. All the sentences are in the imperative
mood1 . Contrast this with propositions, which were all indicative. The algorithm does contain
1 It is a convenient aspect of English, however, that the “to” at the end of the previous paragraph makes them

infinitives.

93
14.2. REPETITION AND CHANGE CHAPTER 14. ALGORITHMS

propositions, however: the independent clauses “the current column has a number for either of the
addends” and “the last round of the repetition has a carry” are either true or false. Moreover, those
propositions are used to guard steps 2 and 3; the words “if” and “while” guide decisions about
whether or how many times to execute a certain command. Finally, note that step 2 is actually the
repetition of four smaller steps, bound together as if they were one. This is similar to how we use
simple expressions to build more complex ones.
We already know how to do certain pieces of this in ML: Boolean expressions represent proposi-
tions and if expressions make decisions. What we have not yet seen is how to repeat, how to change
the value of a variable, how to compound steps together to be treated as one, or generally how to do
anything explicitly imperative. ML’s tools for doing these things will seem clunky because it goes
against the grain of the kind of programming for which ML was designed. It is still worth our time
to consider imperative algorithms; bear with the unusual syntax for the next few chapters, and after
that ML will appear elegant again.

14.2 Repetition and change

statement A statement is a complete programming construct that does not produce a value (and so differs
from an expression). If you evaluate a statement in ML, it will respond with
val it = () : unit
As it appears, this literally is an empty tuple, and it is ML’s way of expressing “nothing” or “no
result.” () is the only value of the type unit.
statement list The simplest way to write statements is to use them in a statement list, which is a list of
expressions separated by semicolons and enclosed in parentheses (thus it looks like a tuple except
with semicolons instead of commas). A statement list is evaluated by evaluating each expression in
order and throwing away the result except for the last, which it evalues and returns as the value of
the entire statement list. For example,
- (7 + 3; 15.6 / 34.7; 4 < 17; Dove);

val it = Dove : Bird

The results of 7 + 3, 15.6 / 34.7, and 4 < 17 are ignored, turning them into statements. A
statement list itself, however, has a value and thus is an expression and not a statement. Statements
side effects are used for their side effects, changes they make to the system that will affect expressions later.
We do not yet know anything that has a side effect.
while statement A while statement is a construct which will evaluate an expression repeatedly as long as a given
condition is true. Its form is
while <expression> do <expression>
Since it is a statement, it will always result in () so long as it does indeed finish. If we try
- val x = 4;

val x = 4 : int

- while x < 5 do x - 3;

ML gives no response, because we told it to keep subtracting 3 from x as long as x is less than
5. Since x is 4, it is and always will be less than 5, and so the execution of the statement goes
on forever. (Press control-C to make it stop.) The algorithm for adding two multi-digit numbers
has a concrete stopping point: when you run out of columns in the addends. The problem here is
that since variables do not change during the evaluation of an expression, there is no way that the
boolean expression guarding the loop will change either. Trying

94
CHAPTER 14. ALGORITHMS 14.2. REPETITION AND CHANGE

- while x < 5 do x = x - 3;
will merely compare x with x - 3 forever. The attempt
- while x < 5 do val x = x - 3;

stdIn:11.13-11.21 Error: syntax error: deleting DO VAL ID

is rejected because val x = x - 3 is not an expression.

It turns out that in ML, variables are like the laws of the Medes and the Persions, which, once
instated, cannot be repealed (Daniel 6:15, Esther 1:19). This seems to contradict your experience
because you probably have reused variables in the same session.
- val y = 5;

val y = 5 : int

- val y = 16;

val y = 16 : int

- val y = Owl;

val y = Owl : Bird

What is actually happening here is like when King Xerxes, realizing he could could not revoke an
old decree, instead issued a new decree that counteracted the old one. Technically, when you reuse
an identifier in ML, you are not changing the value of the old variable but making a new variable
of the same name that shadows the old one. This is a technical detail that is transparent to mos
ML programming (you need not understand or remember it), but it makes a difference in writing
imperative algorithms because you cannot redeclare a variable in the middle of an expression or
statement.
To deal with this, ML allows for a different kind of variable, called a reference variable, which reference variable
can change value. Three rules distinguish reference variables from ordinary ones.
• To declare a reference variable, precede the expression assinged with the keyword ref.
- val x = ref 5;

val x = ref 5 : int ref

• To use a reference variable, precede the variable with !.

- !x;

val it = 5 : int

• To set a reference variable, use an assignment statement in the form assignment statement

<identifier> := <expression>
- x := !x + 1;

val it = () : unit

- !x;

val it = 6 : int

95
14.3. PACKAGING AND PARAMETERIZATION CHAPTER 14. ALGORITHMS

Notice how types work here. The type of the variable x is int ref, and the type of the expression !x
is int. The operator ! always takes something of type a’ ref for some type a’ and returns something
of type a’.
As an example, consider computing a factorial. Using product notation, we define
n
Y
n! = i
i=1

For instance,
5
Y
5! = i = 1 × 2 × 3 × 4 × 5 = 125
i=1

Notice the algorithm nature of this definition (or just the product notation). It states to keep a
running product while repeated multiplying by a multiplier, incrementing the multiplier by one at
each step, starting at one, until a limit for the multiplier is reached. With reference variables in our
arsenal, we can do this in ML:

- val i = ref 0;

val i = ref 0 : int ref

- val fact = ref 1;

val fact = ref 1 : int ref

- while !i < 5 do
= (i := !i + 1;
= fact := !fact * !i);

val it = () : unit

- !fact;

val it = 120 : int

The while statement produces only (), even though it computes 5!. It does not report that
answer (because it is not an expression), but instead changes the value of fact and i—a good
example of a side effect. Thus we need to evaluate the expression !fact to discover the answer.

14.3 Packaging and parameterization

We used a statement list to compound the assignments to i and fact. Recall that the value of a
statement list is the value of the last expression in the list. This way, we can compound the two last
steps above into one step. This will give us a better sence of packaging, since the computation and
the reporting of the result are really one idea.

- i := 0;

val it = () : unit

- fact := 1;

96
CHAPTER 14. ALGORITHMS 14.3. PACKAGING AND PARAMETERIZATION

val it = () : unit

- (while !i < 5 do
= (i := !i + 1;
= fact := !fact * !i);
= !fact);

val it = 120 : int

However, the variables i and fact will still exists after the computation finishes, even though
they no longer have a purpose. The duration of a variable’s validity is called its scope. A variable scope
declared in the prompt has scope starting then and contuning until you exit ML. We would like
instead to have variables that are local to the expression, that is, variables that the expression alone
can use and that disappear when the expression finishes executing. We can make local variables by
using a let expression with form

let <var declaration>1; <var declaration>2; . . . <var declaration>n in <expression> end

The value of the last expression is also the value of the entire let expression. Now we rewrite
factorial computation as a single, self-contained expression:

- let
= val i = ref 0;
= val fact = ref 1;
= in
= (while !i < 5 do
= (i := !i + 1;
= fact := !fact * !i);
= !fact)
= end;

val it = 120 : int

The only thing deficient in our program is that it is not reusable. Why would we bother, after
all, with a 9-line algorithm for computing 5! when the one line 1 * 2 * 3 * 4 * 5 would have
produced the same result? The value of an algorithm is its generality. This is the same algorithm
we would use to compute any other factorial n!, only that we would replace 5 in the fifth line with n.
In other words, we would like to parameterize the expression, or wrap it in a package that accepts
n as input and produces the factorial for any n. Just as we parameterized statements with fun to
make predicates, so we can here.

- fun factorial(n) =
= let
= val i = ref 0;
= val fact = ref 1;
= in
= (while !i < n do
= (i := !i + 1;
= fact := !fact * !i);
= !fact)
= end;

val factorial = fn : int -> int

- factorial(5);

97
14.4. EXAMPLE CHAPTER 14. ALGORITHMS

val it = 120 : int

- factorial(8);

val it = 40320 : int

- factorial(10);

val it = 3628800 : int

For a second time, you are informally introduced to the notion of a function. We see it here as
a way to package an algorithm so that it can be used as an expression when given an input value.
Later we will see how to knit functions together to eliminate the need for while loops and reference
variables in almost all cases.

14.4 Example
In Chapter 4 introduced arrays as a type for storing finite, uniform, and random accessible collections
of data. You may recall that when you performed array operations that the interpreter responded
with ():unit , which we now know indicates that these are statements. The following displays and
uses an algorithm for computing long division. The function takes a divisor (as an integer) and a
dividend (as an array of integers, each position representing one column), and it returns a tuple
standing for the quotient and the remainder. Study this carefully.

- fun longdiv(divisor, dividend, len) =

= let
= val i = ref 0;
= val result = ref 0;
= val remainder = ref 0;
= in
= (while !i < len do
= (remainder := !remainder * 10 + sub(dividend, !i);
= let
= val quotient = !remainder div divisor;
= val product = quotient * divisor;
= in
= (result := !result * 10 + quotient;
= remainder := !remainder - product)
= end;
= i := !i + 1);
= (!result,!remainder))
= end;

val longdiv = fn : int * int array * int -> int * int

- longdiv(8,A,4);

val it = (440,1) : int * int

- val A = array(4, 0);

val A = [|0,0,0,0|] : int array

98
CHAPTER 14. ALGORITHMS 14.4. EXAMPLE

- update(A, 0, 3);

val it = () : unit

- update(A, 1, 5);

val it = () : unit

- update(A, 2, 2);

val it = () : unit

- update(A, 3, 1);

val it = () : unit

- A;

val it = [|3,5,2,1|] : int array

99
14.4. EXAMPLE CHAPTER 14. ALGORITHMS

Exercises

0, 1, 1, 2, 3, 5, 8, 13, 21, 34, . . .

1. Suppose ML did not have a multiplication operator defined
for integers. Write a function multiply which takes two Write a function to compute the nth Fibonacci number,
integers and produces their product, without using direct given n.
multiplication (that is, using repeated addition). (Hint: You will need four reference variables local to the
Pn i function: one to serve as a counter, one to hold the cur-
2. A series in the form i=0 ar is called a geometric series. rent value, one to hold the previous value, and one to hold
That is, given a value r and a coefficient a, find the sum of
a temporary value as you shuffle the current value to the
a+ar +ar 2 +ar3 +. . .+ar n . Write a function that computes
previous:
a geometric series, taking n, a, and r as parameters.
3. Rewrite the long division function so that it receives the div- temp := ...
idend as a normal integer rather than an array. (Hint: use previous := !current
div and mod to cut the dividend up into digits) current := !temp

4. The Fibonacci sequence is defined by repeatedly adding the Then think about how you can rewrite this so that the tem-
two previous numbers in the sequence (starting with 0 and porary value can be local to the while statement, and a reg-
1) to obtain the next number, i.e. ular variable instead of a reference variable.)

100
Chapter 15

Induction

15.1 Calculating a powerset

Recall that the powerset of a set X is the set of all its subsets.

P(X) = {Y |Y ⊆ X}

For example, the powerset of {1, 2, 3} is {∅, {1}, {2}, {3}, {1, 2}, {1, 3}, {2, 3}, {1, 2, 3}}. Here is an
algorithm for computing the powerset of a given set (represented by a list), anotated at the side.

- fun powerset(set) =
= let
= val remainingSet = ref set; remainingSet is the part of set we have
not yet processed.
= val powSet = ref ([[]] : int list list); powSet is the powerset as we have cal-
culated it so far.
= in
While there is still some of the set left
= (while not (!remainingSet = nil) do to process. . .
= let powAddition is what we are adding to
the power set this time around.
= val powAddition = ref ([] : int list list); remainingPowSet is the part of powSet we
have not yet processed.
= val remainingPowSet = ref (!powSet); currentElement is the element of set we
are processing this time around.
= val currentElement = hd(!remainingSet); . . . pick the next element. . .
= in
= (while not (!remainingPowSet = nil) do . . . and while there is still part of what
= let we have made so far left to process. . .
currentSubset is the subset we are pro-
= val currentSubSet = hd(!remainingPowSet); cessing this time around.
. . . pick the next subset. . .
= in
= (powAddition := (currentElement :: currentSubSet) . . . add currentElement to that
= :: !powAddition; subset. . .
= remainingPowSet := tl(!remainingPowSet))
= end;
= powSet := !powSet @ !powAddition; . . . and add all those new subsets to
= remainingSet := tl(!remainingSet)) the powerset.
= end;
= !powSet)
= end;

val powerset = fn : int list -> int list list

- powerset([1,2,3]);

val it = [[],[1],[2,1],[2],[3,2],[3,2,1],[3,1],[3]] : int list list

101
15.2. PROOF OF POWERSET SIZE CHAPTER 15. INDUCTION

As we will see in a later chapter, this is far from the best way to compute this in ML. Dissecting
it, however, will exercise your understanding of algorithms from the previous chapter. Of immediate
interest to us is the relationship between the size of a set and the size of its powerset. Recall that
the cardinality of a finite set X, written |X|, is the number of elements in the set. (Defining the
cardinality of infinite sets is a more delicate problem, one we will pick up later.) With our algorithm
handy, we can generate the powerset of sets of various sizes and count the number of elements in
them.

- powerset([]);

val it = [[]] : int list list

- powerset([1]);

val it = [[],[1]] : int list list

- powerset([1,2]);

val it = [[],[1],[2,1],[2]] : int list list

- powerset([1,2,3,4]);

val it = [[],[1],[2,1],[2],[3,2],[3,2,1],[3,1],[3],
[4,3],[4,3,1],[4,3,2,1],[4,3,2], ...] : int list list

|A| |P(A)| The elipses on the last result indicate that there are more items in the list, but the list exceeds
0 1 the length that the ML interpreter normally displays. We summarize our findings in the adjacent
1 2
2 4
table. The pattern we recognize is that |P(A)| = 2|A| (so the last number is actually 16). An
3 8 informal way to verify this hypothesis is to think about what the algorithm is doing. We start out
4 > 12 with the empty set—all sets will at least have that as a subset. Then, for each other element in the
original set, we add it to each element in our powerset so far. Thus, each time we process an element
from the original set, we double the size of the powerset. So if the set has cardinality n, we begin
with a set of cardinality 1 and double its size n times; the resulting set must have cardinality 2 n .
This makes sense, but how do we prove it formally? For this we will use a new proving technique,
proof by mathematical induction.

15.2 Proof of powerset size

Our theorem is

Theorem 15.1 |P(A)| = 2|A|

This is in General Form 1 (straight facts, without explicit condition). However, restating this
will make the proof easier. First, we define the predicate

I(n) = If A is a set and |A| = n then |P(A)| = 2n

invariant The I is for invariant, since this is a condition that remains the same regardless of how we adjust
n. Our theorem essentially says

∀ n ∈ W, I(n)
What this gives us is that now we are predicating our work on whole numbers (“prove this for all
n”) rather than sets (“prove this for all A”).

102
CHAPTER 15. INDUCTION 15.2. PROOF OF POWERSET SIZE

Before we go further, an important side note: The definition of cardinality we presented in

Chapter 3 is informal, and we will not be able to define it more precisely until Chapter 25. For
this reason, and also to reduce the complexity of the present proof, we will be appealing to various
results proven elsewhere in the book.

Result Where found

If A and B are finite sets and A ⊆ B, then |B − A| = |B| − |A|. Chapter 25, Exercise 10.
If a ∈ A then P(A) = P(A − {a}) ∪ {a ∪ A0 |A0 ∈ P(A − {a})}. Chapter 12, Exercise 5.
If a ∈ A then P(A − {a}) ∩ { {a} ∪ A0 | A0 ∈ P(A − {a})} = ∅. Chapter 12, Exercise 6.
If A is a finite set and a ∈ A, then |{ {a} ∪ A0 | A0 ∈ P(A − {a})}| = |P(A − {a})|. Chapter 25, Exercise 11.
If A and B are finite, disjoint sets, then |A ∪ B| = |A| + |B|. Theorem 25.2.

Proof. Suppose |A| = 0. By definition of the empty set, A = ∅. Moreover, by definition

of powerset, P(A) = P(∅) = {∅}, since ∅ is the only subset of ∅. By definition of
cardinality, |{∅}| = 1 = 20 , so |P(A)| = 20 . Hence I(0).

Remember, we want to prove I(n) for all n. We have proven it only for n = 0, leaving infinitely
many possibilities to go. So far, this approach looks frighteningly like a proof by exhaustion with
an infinity of cases. However, this apparently paltry result allows us to say one thing more.

Moreover, ∃ N ≥ 0 such that I(N ).

We know that this is true because it is at least true for n = 0. Possibly it is true for greater values
of n also. But this foot in the door now lets us take all other cases in a giant leap.

Now, suppose |A| = N + 1. Let a ∈ A. (Since N ≥ 0, N + 1 ≥ 1, so we know A must

have at least one element.) Note that {a} ⊆ A by definition of subset and |{a}| = 1 by
definition of cardinality.

|A − {a}| = |A| − |{a}| by Exercise 10 of Chapter 25

= (N + 1) − 1 by substitution
= N by arithmetic.

By Exercises 5 and 6 of Chapter 12, P(A − {a}) and { {a} ∪ A0 | A0 ∈ P(A − {a})} are
a partition of P(A).

Do not let the complicated notation intimidate you. All we are doing is splitting P(A) into the
subsets that contain a and those that do not. Review the relevant exercises of Chapter 12 if necessary.

Since I(N ), |P(A − {a})| = 2N . By Exercise 11 of Chapter 25, |{ {a} ∪ A0 | A0 ∈

P(A − {a})}| = |P(A − {a})|. So,

|P(A)| = |P(A − {a})| + |{ {a} ∪ A0 | A0 ∈ P(A − {a})}| by Theorem 25.2

= 2N + 2N by substitution
= 2N +1 by exponential addition

Hence I(N + 1).

Now we have proven (a) I(N ) for some N , and (b) if I(n) then I(n + 1). Stated differently, we have
shown that it is true for the first case, and that if it is true for one case it is true for the next case.
Thus,

By the principle of math induction, I(n) for all n ∈ W. Therefore, |P(A)| = 2|A| . 2

103
15.3. MATHEMATICAL INDUCTION CHAPTER 15. INDUCTION

15.3 Mathematical induction

Induction, broadly, is a form of reasoning where a general rule is inferred from and verified by a set
of observations consistent with that rule. Suppose you see five blue jays and notice that all of them
are blue. From this you hypothesize that perhaps all blue jays are blue. You then proceed to see
fifty more blue jays, all of them also blue. Now you are satisfied and believe that in fact, all blue
jays are blue.
This differs from deduction, where one knows a general rule beforehand, and from that general
rule infers a specific fact. Suppose you accepted the fact that all blue jays are blue as an axiom. If
a blue jay flew by, you would have no reason even to look at it: you know it must be blue.
In fields like philosophy and theology, induction is viewed somewhat suspiciously, since it does
not prove something beyond any doubt whatsoever. What if blue jay # 56 is red, and you simply
quit watching too early? Inductive arguments are still used from time to time, though, because they
can be rhetorically persuasive and because for some questions, it is simply the best we can do.
In natural science, in fact, we see inductive reasoning used all the time. The scientific method
posits a hypothesis that explains observed phenomena and promotes a hypothesis to a theory only
if many experiments (that is, controlled observations) consistently fit the hypothesis. The law
of universal gravitation is called a law because so many experiments and observations have been
amassed that suggest it to be true. It has never been proven in the mathematical sense, nor could
it be.
In formal logic and mathematics, however, inductive reasoning is never grounds for proof, but
rather is a gradiose but just as fallacious version of proof by example. Fermat’s last theorem had
been verified using computers to at least n = 1000000[14], but it still was not considered proven
until Andrew Wiles’s proof. All this is to point out that math induction is not induction in the
rhetorical sense. It is still deduction. It bears many surface similarities to inductive reasoning since
it considers at least two examples and then generalizes, but that generalization is done by deduction
from the following principle.

Axiom 7 (Mathematical Induction) If I(n) is a predicate with domain W, then if I(0) and if
for all N ≥ 0, I(N ) → I(N + 1), then I(n) for all n ∈ W.

(The principle can also be applied where I(n) has domain N, and we prove I(1) first; or, any
other integer may be used as a starting point. Assuming zero to be our starting point allows us to
state the principle more succinctly.)
Even though we take this as an axiom for our purposes, it can be proven based on other axioms
generally taken in number theory. Intuitively, an inductive proof is like climbing a ladder. The first
step is like getting on the first rung. Every other step is merely a movement from one rung to the
next. Thus you prove two things: you can get on the first rung, and, supposing that you reach the
nth rung at some point, you can move to the n + 1st rung. These two facts taken together prove
that you eventually can reach any rung.
Recall the dialogue between the doubter and the prover in Chapter 9. In this case, the prover is
claiming I(n) for all n. When the doubter challenges this, the prover says, “Very well, you pick an n,
and I will show it works.” When the doubter has picked n, the prover then proves I(0); then, using
I(0) as a premise, proves I(1); uses I(1) to prove I(2); and so forth, until I(n) is proven. However,
all of these steps except the first are identical. A proof using mathematical induction provides a
recipe for generating a proof up to any n, should the doubter be so stubborn.
base case Therefore, a proof by induction has two parts, a base case and an inductive case. For clarity,
inductive case you should label these in your proof. Thus the format of a proof by induction should be

Proof. Reformulate the theorem as a predicate with domain W. By induction on n.

Base case. Suppose some proposition equivalent to n = 0. . . . Hence I(0). Moreover
∃ N ≥ 0 such that I(N ).
Inductive case. Suppose some proposition equivalent to n = N + 1. . . . Hence I(N + 1).
Therefore, by math induction, I(n) for all n ∈ W and the original proposition. 2

104
CHAPTER 15. INDUCTION 15.4. INDUCTION GONE AWRY

You may notice that with our formulation of the axiom, the proposition ∃ N ≥ 0 such that I(N )
is technically unnecessary. We will generally add it for clarity.

15.4 Induction gone awry

When being exposed to mathematical induction for the first time, many students are initially in-
credulous that it really works. This may be because students imagine an incorrect version of math
induction being used to prove manifestly false propositions. Consider the following “proof” that all
cows have the same color1 , and try to determine where the error is. This will test how well you have
understood the principle, and, we hope, convince you that the problem lies somwhere besides the
pinciple itself.

Proof. Let the predicate I(n) be “for any set of n cows, every cow in that set has the
same color.” We will prove I(n) for all n ≥ 1 by induction on n.
Base case.Suppose we have a set of one cow. Since that cow is the only cow in the set,
it obviously has the same color as itself. Thus all the cows in that set have the same
color. Hence I(1). Moreover, ∃N ≥ 1 such that I(N ).
Inductive case. Now, suppose we have a set, C, of N + 1 cows. Pick any cow, c 1 ∈ C.
The set C − {c1 } has N cows, by Exercise 10 of Chapter 12. Moreoever, by I(N ), all
cows in the set C − {c1 } have the same color. Now pick another cow c2 ∈ C, where
c2 6= c1 . We know c2 must exists because |C| = N + 1 ≥ 1 + 1 = 2. By reasoning similar
to that above, all cows in the set C − {c2 } must have the same color.
Now, c1 ∈ C − {c2 }, and so c1 must have the same color as the rest of the set (that is,
besides c2 ). Similarly, c2 ∈ C − {c1 }, so c2 must have the same color of the rest of the
set. Hence c1 and c2 also have the same color as each other, and so all cows of C have
the same color.
Therefore, by math induction, I(n) for all n, and all cows of any sized set have the same
color. 2

The error lurking here is the conclusion that c1 and c2 have the same color just because they
each have the same color as “the rest of C.” Since we had previously proven only I(1), we cannot
assume that C has any more than two elements—so it could be that C = {c1 , c2 }. In that case, {c1 }
and {c2 } each have cows all of the same color, but that does not relate c1 to c2 or prove anything
about the color of the cows of C.
The problem is not induction at all, but a faulty implicit assumption that C has size at least
three and that I(2) is true. If we had proven I(2) (which of course is ridiculous), then we could
indeed prove I(n) for all n. Suppose C = {c1 , c2 , c3 }. Take c1 out, leaving {c2 , c3 }. By I(2), c2
and c3 have the same color. Put c1 back in, take c2 out, leaving {c1 , c3 }. By I(2) again, c1 and c3
have the same color. By “transitivity of color,” c1 and c2 have the same color. Hence I(3), and by
induction I(n) for all n. One false assumption can prove anything.

15.5 Example
In Exercises 12 and 13 of Chapter 11, you proved what are known as DeMorgan’s laws for sets
(compare them with DeMorgan’s laws from Chapter 5). We can generalize those rules for unions
and intersections of more than two sets. First, we define the iterated union and iterated intersection iterated union
of the collection of sets A1 , A2 , . . . An :
iterated intersection
n
[
Ai = A 1 ∪ A 2 ∪ . . . ∪ A n
i=1
1 The original formulation, by George Pólya, was a proof that “any n girls have eyes of the same color”[11]. That

was in 1954. Cows are not as interesting, but they are a more appropriate subject, given modern sensitivities.

105
15.5. EXAMPLE CHAPTER 15. INDUCTION

n
\
Ai = A 1 ∩ A 2 ∩ . . . ∩ A n
i=1

Now we can prove a generalized DeMorgan’s law.

n
[ n
\
Theorem 15.2 Ai = Ai for all n ∈ N.
i=1 i=1

Proof. By induction on n.
Base case. Suppose n = 1, and suppose A1 is a (collection of one) set. Then, by
n
[ n
\
definition of iterated union and iterated intersection, Ai = A 1 = Ai . Hence there
i=1 i=1
N
[ N
\
exists some N ≥ 1 such that Ai = Ai .
i=1 i=1
Inductive case. Now, suppose A1 , A2 , . . . AN +1 is a collection of N + 1 sets. Then

+1
N[
Ai = A1 ∪ A2 ∪ . . . ∪ AN ∪ AN +1 by definition of iterated union
i=1
N
[
= ( Ai ) ∪ AN +1 also by definition of iterated union
i=1
N
[
= Ai ∩ AN +1 by Exercise 12 of Chapter 11
i=1
\N
= ( Ai ) ∩ AN +1 by the inductive hypothesis
i=1
= (A1 ∩ A2 ∩ . . . ∩ AN ) ∩ AN +1 by definition of iterated intersection
+1
N\
= Ai also by definition of iterated intersection.
i=1

Therefore, by math induction, for all n ∈ N and for all collections of sets A1 , A2 , . . . An ,
[n \n
Ai = Ai . 2
i=1 i=1

Notice that in this proof we never gave a name to the predicate (for example, I(n)). This means
N
\
we could not say “since I(N )” to justify that ( Ai )∩AN +1 is equivalent to the previous expressions.
i=1
inductive hypothesis Instead, we used the term inductive hypothesis, which refers to our supposition that the predicate is
true for N . Notice also that we never make this supposition explicitly with the word “suppose.” We
make it implicitly when we say that it is true for some N —we have proven that some N exists (1 at
least), but we are still supposing an arbitrary number, and calling it N . For your first couple of tries
at math induction, you will probably find it easier to get it right if you name and use the predicate
explicitly. Once you get the hang of it, you will probably find the way we wrote the preceeding proof
more concise.

106
CHAPTER 15. INDUCTION 15.5. EXAMPLE

Exercises
n
X
5. A summation is an iterated addition, defined by ai =
Prove using mathematical induction, for all n ∈ N. i=1
a1 + a2 + . . . + an for some formula ai , depending on i. Us-
n
\ n
[ n
i = n(n+1)
X
1. Ai = Ai . ing math induction, prove 2
for all n ∈ N.
i=1 i=1 i=1

n n
[ [ 6. We say that an integer a is divisible by b 6= 0 (or b divides
2. (A ∩ Bi ) = A ∩ ( Bi ). a), written b|a, if there exists an integer c such that a = cb.
i=1 i=1
Prove that for all x ∈ Z and for all n ∈ W, x − 1|xn − 1.
n
[ n
[ Hint: first suppose x ∈ Z. Then use math induction on n.
3. (Ai − B) = ( Ai ) − B. If you do not see how it works at first, pick a specific value
i=1 i=1 for x, say 3, and try it for n = 0, n = 1, n = 2, and n = 3.
n
\ n
\ Notice the pattern, and then use math induction to prove it
4. (Ai − B) = ( Ai ) − B. for all n, but still assuming x = 3. Then rewrite the proof
i=1 i=1 for an arbitrary x.

107
15.5. EXAMPLE CHAPTER 15. INDUCTION

108
Chapter 16

Correctness of algorithms

16.1 Defining correctness

N
X
This program computes an arithmetic series from 1 to a given N , i:
i=1

- fun arithSum(N) =
= let
= val s = ref 0;
= val i = ref 1;
= in
= (while !i <= N do
= (s := !s + !i;
= i := !i + 1);
= !s)
= end;

But how do we know if this is correct? Or what does it even mean for a program to be correct?
Intuitively, a program is correct if, given a certain input, it will produce a certain, desired output.
In this case, if the input N is an integer (something the ML interpreter will enforce), then it should
evaluate to the desired sum. When a programmer is testing software, he or she runs the program on
several inputs chosen to represent the full variety of the input range and compares the result of the
program to the expected result. Obviously this approach to correctness is based on experimentation;
it is inductive reasoning in the non-mathematical sense, and while it can increase confidence in a
program’s correctness, it cannot prove correctness absolutely (except in the unrealistic case that
every possible input is tested—a proof by exhaustion over the set of possible input). This process
also assumes the intended result of the program can be verified conveniently by hand (or by another
program)—and if that was truly convenient, we may not have written the program in the first place.
Let us consider this notion of correctness more formally. The correctness of a given program is
defined by two sets of propositions: pre-conditions, which we expect to be true before the program pre-conditions
is evaluated, and post-conditions which we expect to hold afterwards. If the post-conditions hold
whenever the pre-conditions are met, then we say that the algorithm is correct. This approach is post-conditions
particularly useful in that it scales down to apply to smaller portions of an algorithm; Suppose we
have the pre-condition a is a nonnegative integer for the statement

b := !a + 1

We can think of many plausible post-conditions for this, including b is a positive integer, b > a,
and b−a = 1. Whichever of these makes sense, they are things we can prove mathematically, as long
as we implicitly take the efficacy of the assignment statement as axiomatic; propositions deduced
that way are justified as by assignment.

109
16.2. LOOP INVARIANTS CHAPTER 16. CORRECTNESS OF ALGORITHMS

Moreover, the post-conditions of a single statement are then the pre-conditions of the following
statement; the pre-conditions of the entire program are the pre-conditions of the first statement;
and the post-conditions of the final statement are the post-conditions of the entire program. Thus
by inspecting how each statement in turn affects the propositions in the pre- and post-conditions,
we can prove an algorithm is correct. Consider this program for computing a mod b (assuming we
can still use div).

- fun remainder(a, b) =
= let
= val q = a div b;
= val p = q * b;
= val r = a - p;
= in
= r
= end;

Sometimes we will consider val-declarations merely as setting up initial pre-conditions; since they
do all the work of computation in this example, we will consider them statements to be inspected.
The pre-condition for the entire program is that a, b ∈ Z+ . The result of the program should be
a mod b; equivalently, the post-condition of the declarations is r = a mod b. To analyze what a div b
does, we rely on the following standard result from number theory, a proof for which can be found
in any traditional discrete math text.

Theorem 16.1 (Quotient-Remainder.) If n ∈ Z and d ∈ Z+ , then there exist unique integers q

and r such that n = d · q + r and 0 ≤ r < d.

This theorem is the basis for our definition of division and modulus. q in the above theorem is called
quotient the quotient, and a div b = q. r is called the remainder , and a mod b = r.
remainder As a first pass, we intersperse the algorithm with little proofs.
Suppose a, b ∈ Z.
val q = a div b
By assignment, q = a div b. By the Quotient-
Remainder Theorem and the definition of division,
a = b · q + R for some R ∈ Z, where 0 ≤ R < b.
By algebra, q = a−Rb .
val p = q * b
p = q · b by assignment. p = a − R by substitution
and algebra.
val r = a - p
By assignment, r = a − p. By substitution and
algebra, r = a − (a − R) = R. Therefore, by the
definition of mod, r = a mod b. 2

16.2 Loop invariants

The previous example is simple almost to deception. It ignores all things that make an algorithm
difficult to reason about: variables that change value, branching, and repetition. We will discover
how to reason about the mutation of reference variables and the evaluation of conditionals along the
way. We now consider loops, which are more difficult.
Simplistically, a loop can be analyzed for correctness by unrolling it. If a loop were to be executed,
say, five times, then we can prove correct the program that would result if we pasted the body of
the loop end to end five times. The problem with that approach is that almost always the number
of iterations of the loop itself depends on the input, as we see in the arithmetic series example, the

110
CHAPTER 16. CORRECTNESS OF ALGORITHMS 16.3. BIG EXAMPLE

guard is i <= N. (An iteration is an execution of the body of a loop; the boolean expression which iteration
we test to see if the loop should continue is called the guard ). We want to prove some proposition to
be true for an arbitrary number of iterations. The number of iterations is certainly a whole number. guard
This suggests a proof by induction on the number of iterations.
We need a predicate, then, whose argument is the number N of iterations, and we need to show
it to be true for all N ≥ 0. This is asking for a lot of flexibility from this predicate: since it must be
true for 0 iterations, it is the pre-condition for the entire loop. Since it must be true for N iterations,
it is the post-condition for the entire loop. Since it must be true for every value between 0 and N , it
is the post-condition and pre-condition for every iteration along the way. Of course, if a statement
or code section has identical pre-conditions and post-conditions, that suggests the code does not do
anything. That is not what is in view here—these various pre- and post-conditions are not identical
to each other, but rather are parameterized. We must formulate this predicate in such a way that
the parameterization captures what the loop does. The predicate must state what the loop does not
change, with respect to the number of iterations.
A loop invariant I(n) is a predicate whose argument, n, is the number of iterations of the loop, loop invariant
chosen so that

• I(0) is true (that is, the proposition is true before the loop starts; this must be proven as the
base case in the proof).

• I(n) implies I(n + 1) (that is, if the proposition is true before a given iteration, it will still be
true after that iteration; this must be proven as the inductive case in the proof).

• If the loop terminates (after, say, N iterations), then I(N ) is true (this follows from the two
previous facts and the principle of math induction).

• Also if the the loop terminates, then I(N ) implies the post-condition of the entire loop.

These four points correspond to four steps in the proof that a loop is correct: we prove that the
loop is initialized so as to establish the loop invariant; that a given iteration maintains the loop initialization
invariant; that the loop will eventually terminate, that is, that the guard will be false eventually;
and that the loop invariant implies the post-condition. The first two steps constitute a proof by maintenance
induction, on which we focus. The last two complete the proof of algorithm correctness. termination

16.3 Big example

Let us now apply this intention to the algorithm given at the beginning of this chapter. The post-
condition for the entire algorithm is that the variable s should equal the value of the series. Since
s is a reference variable, it changes value during the running of the program. Specifically, after n
Xn
iterations, s = k. (Why did we replace N and i with n and k?) It will also be helpful to monitor
k=1
what the variable i is doing. Since i is initialized to 1 and is incremented by 1 during each iteration,
then with respect to n, i = n + 1. Thus we have our loop invariant:
n
X
I(n) = after n iterations, i = n + 1 and s = k
k=1

Now we prove

N
X
Theorem 16.2 For all N ∈ W, the program arithSum computes i.
i=1

Proof. Suppose N ∈ W.

111
16.4. SMALL EXAMPLE CHAPTER 16. CORRECTNESS OF ALGORITHMS

0
X
Base case / initialization. After 0 iterations, i = 1 and s = 0 = k by assignment
k=1
and the definition of summation. Hence I(0), and so there exists an n0 ≥ 0 such that
I(n0 ).
Inductive case / maintenance. Suppose I(n0 ). Let iold be the value of i after the
n0 th iteration and before the n0 + 1st iteration, and let inew be the value of i after the
Xn0
0 0 0
n + 1st iteration. Similarly define sold and snew . By I(n ), iold = n + 1 and sold = k.
k=1
n0
X
0
nX +1
By assignment and substitution, snew = sold + iold = k + (n0 + 1) = k. Similarly
k=1 k=1
inew = iold + 1 = (n0 + 1) + 1. Hence I(n0 + 1).
Hence by math induction, I(n) for all n ∈ W, and so I(N ).

Before we continue, be attentive to the subtle matter of our choice of variables. What is with
the N , the n, and the n0 ? N is the input to the program arithSum. n is the argument to (or,
independent variable of) the predicate I(n) used to analyze arithSum. One thing we are trying
to prove is I(N ) (I is true for n = N ) in the imaginary world created by supposing N . n0 is
an arbitrary whole number such that I(n0 ), used inside of our inductive proof that I(n) for all n.
Properly disambiguating variables is essential for all proofs and become particularly hard in proofs
of algorithm correctness when you are juggling similar variables and proving a property of an object
which itself has variables. Failure to do this will lead to equivocal nonsense.

Termination By I(N ), after N iterations, i = N + 1 > N , and so the guard will fail.
N
X N
X
Moreover, by I(N ), after N iterations, s = k. By change of variable, s = i, which
k=1 i=1
is our post-condition for the program. Hence the program is correct. 2

16.4 Small example

Now consider an algorithm that processes an array. This example is “smaller” than the previous
one because we will concern ourselves only with the proof of the loop invariant. This program finds
the smallest element in an array.

- fun findMin(array) =
= let
= val min = ref (sub(array, 0));
= val i = ref 1;
= in
= (while !i < length(array) do
= (if sub(array, !i) < !min
= then min := sub(array, !i)
= else ();
= i := !i + 1);
= !min)
= end;

As you will see by inspecting the code, it finds the minimum by considering each element in the
array in order. The variable min is used to store the “smallest seen so far.” Every time we see an
element smaller than the smallest so far (sub(array, !i) < !min), we make that element the new
smallest so far (then min := sub(array, !i)), and otherwise do nothing (else ()). At the end
of the loop, the “smallest so far” is also the “smallest overall.”

112
CHAPTER 16. CORRECTNESS OF ALGORITHMS 16.4. SMALL EXAMPLE

Thus our loop invariant must somehow capture the notion of “smallest so far.” To make this
more convenient, we introduce the notation A[i..j], which will stand for the set of values that are
elements of the subarray of A in the range from i inclusive to j exclusive. That is, subarray

A[i..j] = {A[k] | i ≤ k < j}

We say that x is the minimum element in a subset A of integers if x ∈ A and ∀ y ∈ A, x ≤ y. Now,
our post-condition for the loop is that min is the minimum of array[0..N ], where N is the size of
array. Our (proposed) loop invariant is

I(n) = after n iterations, i = n + 1 and min is the minimum of array [0..i]

We now prove it to be a loop invariant.

Proof. Suppose array is an array of integers.

Base case / initialization. After 0 iterations, min = array[0] and i = 1 = 0 + 1 by
assignment. By the definition of subarray, array[0] is the only element in array[0..1] =
array[0..i], and so min is the minimum of array[0..i] by definition of minimum. Hence
I(0), and there exists n0 ≥ 0 such that I(n0 ).
Inductive case / maintenance. Suppose I(n0 ). Let min old be the value of min
after the n0 th iteration and before the n0 + 1st iteration, and let min new be the value of
min after the n0 + 1st iteration. Similarly define iold and inew . By I(n0 ), min old is the
minimum of array [0..iold ]. We now have two cases:
Case 1: Suppose array[iold ] < min old . Then, by assignment min new = array[iold ]. Now,
suppose x ∈ array [0..(iold + 1)]. If x = array[iold ], then min new ≤ x trivially. Otherwise,
x ∈ array [0..iold ], and min new = array[iold ] < min old ≤ x, the last step by the definition
of minimum. In either case, min new ≤ x, and so min new is the minimum of array[0..(iold +
1)] by definition of minimum.
Case 2: Suppose array [iold ] ≥ min old . Then min new = min old . Now, suppose x ∈
array[0..(iold + 1)]. If x = array[iold ], then min new ≤ x by substitution. Otherwise,
x ∈ array[0..iold ], and min new = min old ≤ x by definition of minimum. In either
case, min new ≤ x, and so min new is the minimum of array[0..(iold + 1)] by definition of
minimum.
In either case, min new is the minimum of array[0..(iold + 1)]. By assignment, inew =
iold + 1 = (n0 + 1) + 1 by substitution. Also by substitution, min new is the minimum of
array[0..inew ]. Hence I(n0 + 1), and I(n) is an invariant for this loop. 2

Notice how the conditional requires a division into cases. Whichever way the condition branches,
we end up with min being the minimum of the range so far.

113
16.4. SMALL EXAMPLE CHAPTER 16. CORRECTNESS OF ALGORITHMS

Exercises

3. I(n) = m + n is odd.
In Exercises 1–3, prove the predicates to be loop invariants for the
- fun ccc(n) =
loops in the following programs.
= let
1. I(n) = x is even. = val x = ref 0;
- fun aaa(n) = = val y = ref 101;
= let = in
= val x = ref 0; = (while !x < n do
= val i = ref 0; = (x := !x + 4;
= in = y := !y - 2);
= (while !i < n do = !x + !y)
= (x := !x + 2 * i; = end;
= i := !i + 1);
4. Finish the proof of correctness for findMin. That is, show
= !x)
that the loop will terminate and that the loop invariant im-
= end;
plies the post-condition.
2. I(n) = x + y = 100.
5. Write a program which, given an array of integers, computes
- fun bbb(n) = the sum of the elements in the array. Write a complete proof
= let of correctness for your program: determine pre-conditions
= val x = ref 50; and post-conditions, determine a loop invariant, prove it to
= val y = ref 50; be a loop invariant, show that the loop will terminate, and
= val i = ref 0 show that the loop invariant implies the post-conditions.
= in
= (while !i < n do Exercises 2 and 3 are taken from Epp [5].
= (x := !x + 1;
= y := !y - 1);
= !x + !y)
= end; Hi.

114
Chapter 17

From Theorems to Algorithms

17.1 The Division Algorithm

Recall the Quotient-Remainder Theorem from the previous chapter:

Theorem 16.1. If n, d ∈ Z+ , then there exist unique integers q and r such that n = d · q + r and
0 ≤ r < d.

Notice that this result tells us of the existence of something. A proof for such a proposition
must either show that it is impossible for the item not to exist (by a proof by contradiction) or give
an algorithm for calculating the item. The latter is called a constructive proof, and it in fact gives
more information that merely the truth of the proposition. The traditional proof of the QRT is
non-constructive.
However, the theorem itself, if studied carefully, tells us much about how to build an algorithm
for finding the quotient and the remainder, called the Division Algorithm. In the previous chapter, Division Algorithm
we proved results (specifically, correctness) about algorithms. In this chapter, we derive algorithms
from results in a process that is essentially the reverse of that of the previous chapter.
Consider what the QRT says about q and r. The two assertions (the propositions subordinate
to “such that”) can be thought of as restrictions that must be met as we try to find suitable values.
We have:

• n = d · q + r.

• r ≥ 0.

• r < d.

The restrictions n = d · q + r and r ≥ 0 are not difficult to satisfy; simply take q = 0 and
r = n—call this our initial guess. That guess, however, might conflict with the other restriction,
r < d. Our general strategy, then, is to alter our initial guess, making sure that any changes we
make do not violate the earlier restriction, and repeating until we satisfy the other restriction, too.
Note the elements of our algorithm sketch, which you should recognize from studying correctness
proofs from the previous chapter. We have:

• Initial conditions: q = 0 and r = n.

• A loop invariant: n = d · q + r.

• A termination condition: r < d.

In ML,

115
17.2. THE EUCLIDEAN ALGORITHM CHAPTER 17. FROM THEOREMS TO ALGORITHMS

- fun divisionAlg(n, d) =
= let
= val r = ref n;
= val q = ref 0;
= in
= (while !r >= d do
= ();
= (!q, !r))
= end;

Notice that we return the quotient and the remainder as a tuple. The only thing missing is the
body of the loop—that is, how to mutate q and r so as to preserve the invariant and to make progress
towards the termination condition. It may seem backwards to determine the husk of a loop before
its body or to set up a proof of correctness before writing what is to be proven correct. However,
this way of thinking encourages programming that is amenable to correctness proof and enforces a
discipline of programming and proving in parallel.
Since the termination condition is r < d, progress is make by making r smaller. If the termination
condition does not hold, then r ≥ d, and so there is some nonnegative integer, say y, such that
r = d + y. Note that

n = d·q+r
= d·q+d+y
= d · (q + 1) + y
Compare the form of the expression at the bottom of the right column with that at the top. As
we did in the previous chapter, we will reckon the affect of the change to r by distinguishing r old
from rnew , and similarly for q. Then let

rnew = y = rold − d
qnew = qold + 1
In other words, decrease r by d and increase q by one. This preserves the loop invariant (by
substitution, n = d · qnew + rold ) and reduces r.

- fun divisionAlg(n, d) =
= let
= val r = ref n;
= val q = ref 0;
= in
= (while !r >= d do
= (r := !r - d;
= q := !q + 1);
= (!q, !r))
= end;

We have a correctness proof ready-made. Furthermore, the correctness of the algorithm proves
the theorem (the existence part of it, anyway), since a way to compute the numbers proves that
they exist.

17.2 The Euclidean Algorithm

greatest common divisor The greatest common divisor (GCD) of two integers a and b is a positive integer d such that

• d|a and d|b (that is, d is a common divisor).

• for all c such that c|a and c|b, c|d (that is, d is the greatest of all common divisors).

116
CHAPTER 17. FROM THEOREMS TO ALGORITHMS 17.2. THE EUCLIDEAN ALGORITHM

Recall from the definition of divides that the first item means that there exist integers p and q
such that a = p · d and b = q · d. The second items says that any other divisor of a and b must also
be a divisor of d. The GCD of a and b, commonly written gcd(a, b), is most recognized for its use
in simplifying fractions. Clearly a simple and efficient way to compute the GCD would be useful for
any application involving integers or rational numbers. Here we have two lemmas:

Lemma 17.1 If a ∈ Z, then gcd(a, 0) = a.

Lemma 17.2 If a, b ∈ Z, q, r, ∈ Znonneg , and a = b · q + r, then gcd(a, b) = gcd(b, r).

The proof of Lemma 17.1 is left as an exercise. For the proof of Lemma 17.2, consult a traditional
discrete math text. We now follow the same steps as in the previous section
The fact that we have two lemmas of course makes this more complicated. However, we can
simplify this by unifying their conclusions. They both provide equivalent expressions for gcd(a, b),
except that Lemma 17.1 addresses the special case of b = 0. From this we have the intuition that
the loop invariant will probably involve the GCD.
Now consider how the lemmas differ. The equivalent Lemma 17.2 gives is simply a new GCD
problem; Lemma 17.1, on the other hand, gives a final answer. Thus we have our termination
condition: b = 0. Lemma 17.2 then helps us distinguish between what does not change and what
does: The parameters to the GCD change; the answer does not. If we distinguish the changing
variables a and b from their original values a0 and b0 , we have

• Initial conditions: a = a0 and b = b0 .

• Loop invariant: gcd(a, b) = gcd(a0 , b0 ).
• Termination condition: b = 0.

All that remains is how to mutate the variables a and b. a takes on the value of the old b. b
takes on the value of r from Lemma 17.2.

anew = bold
bnew = r = aold mod bold
This is known as the Euclidean Algorithm, though it was probably known before Euclid. Euclidean Algorithm

- fun gcd(a0, b0) =

= let
= val a = ref a0;
= val b = ref b0;
= in
= (while !b <> 0 do
= (a := !b;
= b := !a mod !b);
= !a)
= end;

val gcd = fn : int * int -> int

- gcd(21,36);

val it = 36 : int

36 clearly is not the GCD of 21 and 36. What went wrong? Our proof-amenable approach has
lulled us into a false sense of security, but it also will help us identify the problem speedily. The line
b := !a mod !b is not bnew = aold mod bold but rather bnew = anew mod bold . We must calculate r
before we change a. This can be done with a let expression:

117
17.3. THE EUCLIDEAN ALGORITHM, ANOTHER WAY CHAPTER 17. FROM THEOREMS TO ALGORITHMS

- fun gcd(a0, b0) =

= let
= val a = ref a0;
= val b = ref b0;
= in
= (while !b <> 0 do
= let val r = !a mod !b
= in
= (a := !b;
= b := r)
= end;
= !a)
= end;

17.3 The Euclidean Algorithm, another way

Considering how we unified Lemmas 17.1 and 17.2, we could have written them as one lemma:

Lemma 17.1–17.2. If a, b ∈ Z, then

a if b = 0
gcd(a, b) =
gcd(b, a mod b) otherwise
Not only is this more concise (partly because we employed the mod operation guaranteed by the
QRT), but it also expresses the result in an if-then-else format as we might find in an algorithm.
Could we use this in constructing an ML function? We have seen predicates and functions that are
defined in terms of other predicates and functions, but this is different because the GCD is defined
in terms of itself. This invites us to do some wishful thinking: If only the function gcd were already
defined, then defining it would be so much easier. ML, like most programming languages, allows
this sort of wishing:

- fun gcd(a, b) = if b = 0 then a else gcd(b, a mod b);

Or, using pattern-matching:

- fun gcd(a, 0) = a
= | gcd(a, b) = gcd(b, a mod b);

The loop disappears completely. Instead of effecting repetition by a command to iterate (as with
a while statement), repetition happens implicitly by the repeated calling of the function. Instead of
mutating reference variables, we feed different parameters into gcd each time. Essentially we have

gcd(21, 36)
= gcd(36, 21)
= gcd(21, 15)
= gcd(15, 6)
= gcd(6, 3)
= gcd(3, 0)
= 3

recursion This interaction of a function with itself is called recursion, which means self-reference. A simple
recursive function has a conditional (or patterns) separating into two cases: a base case, a point
at which the recursion stops; and a recursive case, which involves a recursive function call. Notice
how the termination condition of the iterative version corresponds to the base case of the recursive
version. Also notice the similarity of structure between inductive proofs and recursive functions.

118
CHAPTER 17. FROM THEOREMS TO ALGORITHMS 17.3. THE EUCLIDEAN ALGORITHM, ANOTHER WAY

What is most striking is how much more concise the recursive version is. Recursive solutions
do tend to be compact and elegant. Something else, however, is also at work here; ML is intended
for a programming style that makes heavy use of recursion. More generally, this style is called the
functional or applicative style of programming (as opposed to the iterative style, which uses loops), functional programming
where the central concept of building algorithms is the applying of functions, as opposed to the
repeating of loop bodies or the stringing together of statements. This style will be taught for the applicative style
remainder of this course.

119
17.3. THE EUCLIDEAN ALGORITHM, ANOTHER WAY CHAPTER 17. FROM THEOREMS TO ALGORITHMS

Exercises

5. The algorithm you used in Exercise 4 required n multipli-

cation operations. Write a function implementing a faster
1. Write a complete proof of correctness for the function
version of this algorithm by making use of the following ad-
divisionAlg.
ditional lemma.
2. Recall that divisionAlg returns a tuple standing for the
quotient and the remainder together. Write functions quot Lemma 17.5 If a, b, n ∈ Z and n is even, then a · bn =
n
and remain that return only the quotient and remainder, re- a(b2 ) 2 .
spectively (so they are equivalent to ML’s div and mod). Do
not re-type the algorithm for each one; instead, enter it once (If you have done this correctly, your function/algorithm
as we did here and have remain and quot call divisionAlg. should require about log2 n multiplications. Can you tell
why?)
3. Prove Lemma 17.1.
6. In a similar way, use the following three lemmas to write a
Exercises 4–9 are developed from Abelson and Sussman, Struc- function that computes x · y without using multiplication.
ture and Interpretation of Computer Programs, McGraw Hill, 1996, (To make it fast, you will need to use division, but only
pages 46-47. dividing by 2.)
4. The following lemmas give no particularly surprising result. Lemma 17.6 If a, b ∈ Z, then a + b · 0 = a.
0
Lemma 17.3 If a, b, ∈ Z, then a · b = a. Lemma 17.7 If a, b, c ∈ Z, then a + b · c = a + b + b · (c − 1).

Lemma 17.4 If a, b, n ∈ Z, then a · bn = (a · b)bn−1 . Lemma 17.8 If a, b, c ∈ Z and c is even, then a + b · c =

a + (b + b)(c ÷ 2).
However, they can be used to generate an algorithm for ex-
ponentiation using repeated multiplication. Write an ML 7. Write a recursive version of your function from Exercise 4.
function for such an algorithm, to compute xy . (Hint: If 8. Write a recursive version of your function from Exercise 5.
initially a = 1, b = x, and n = y, then a · bn = xy , which
is also your invariant. Your postcondition is that a = xy , 9. Write a recursive version of your function from Exercise 6.
b = x, and n = 0. 10. Write a recursive version of divisionAlg.

120
Chapter 18

Recursive algorithms

In the previous chapter we were introduced to recursive functions. We will explore this further here,
with a focus on processing sets represented in ML as lists, and also on list processing in general.
This focus is chosen not only because these applications are amenable to recursion, but also because
of the centrality of sets to discrete mathematics and of lists to functional programming. We divide
our discussion into two sections: functions that take lists (and possibly other things as well) as
arguments and compute something about those lists; and functions that compute lists themselves.

18.1 Analysis
Suppose we are modeling the courses that make up a typical mathematics or computer science
program, and we wish to analyze the requirements, overlap, etc. Datatypes are good for modeling
a universal set.

- datatype Course = Calculus | Discrete | Programming | LinearAlg

The most basic set operation is ∈, set inclusion. We call it an “operation,” though it can be
just as easily be thought of as a predicate, say isElementOf. (In Part V, we will see that it is also
a relation.) Set inclusion is very difficult to describe formally, in part because sets are unordered.
However, any structure we use to represent sets must impose some incidental order to the elements.
Computing whether a set contains an element obviously requires searching the set to look for the
element, and the incidental ordering must guide our search. If we represented a set using an array,
we might write a loop which iterates over the array and update a bool variable to keep track of
whether we have found the desired item. Compare this with findMin from Chapter 16:

- fun isElementOf(x, array) =

= let
= val len = length(array);
= val i = ref 0;
= val found = ref false;
= in
= (while not (!found) andalso !i < len do
= (if sub(array, !i) = x then found := true else ();
= i := !i + 1);
= !found)
= end;

val isElementOf = fn : ’’a * ’’a array -> bool

121
18.1. ANALYSIS CHAPTER 18. RECURSIVE ALGORITHMS

- csCourses;

val it = [|Discrete,Programming,OperSys,Algorithms,Compilers|] : Course array

- isElementOf(OperSys,csCourses);

val it = true : bool

(This function would give the same result is we had left out not (!found) andalso. Why did
we include that extra condition?)
Arrays are a cumbersome way to represent sets, and as we will see in the next section, completely
useless when we want to derive new sets from old. It is the fixed size and random access of arrays
that make them so inconvenient, and this is why lists have been and continue to be our preferred
representation of sets. We have a pattern for writing an array algorithm which asks, How does this
algorithm on the entire array break down to a step to be taken on each position? The answer to
this question become the body of the loop. We must develop a corresponding strategy for lists.
It is always easiest to start with the trivial. If a set is empty, no item is an element of it. Thus
we can say

- fun isElementOf(x, []) = false

The almost-trivial case is if by luck the element we are looking for is the head of the list.

- fun isElementOf(x, []) = false

= | isElementOf(x, a) = if x = hd(a) then true

- fun isElementOf(x, []) = false

= | isElementOf(x, y::rest) = if x = y then true

What if the element is somewhere else in the list? Or what if it is not in the list, but the list is not
empty? Both of those questions are subsumed by asking, Is the element in rest or not?

- fun isElementOf(x, []) = false

= | isElementOf(x, y::rest) = if x = y then true else isElementOf(x, rest);

Do not fail to marvel at the succinctness of this recursive algorithm on a list, contrasted with
the iterative algorithm on an array. The key insight is that after checking if the list is empty and
whether the first item is what we are looking for, we have reduced the problem (if we have not
solved it immediately) to a new problem, identical in structure to the original, but smaller. This
is the heart of thinking recursively. It works so well on lists because lists themselves are defined
recursively; a list is

• An empty list, or

• an item followed by a list.

Hence our pattern for list operations is

• What do I do with an empty list?

• Given a non-empty list

– What do I do with the first element?

– What do I do with the rest of the list?

122
CHAPTER 18. RECURSIVE ALGORITHMS 18.2. SYNTHESIS

You will learn to recognize that the answer to the last sub-question is also the solution to the entire
problem.
Let us apply this now to a new problem: computing the cardinality of a set. The type of the
function cardinality will be ’a list -> int. If a set is empty, then its cardinality is zero. Otherwise,
if a set A contains at least one element, x, then we can note

A = {x} ∪ (A − {x})
and since {x} and A − {x} are disjoint,

|A| = |{x}| ∪ |A − {x}|

and so

- fun cardinality([]) = 0
= | cardinality(x::rest) = 1 + cardinality(rest);

val cardinality = fn : ’a list -> int

- val csCourses = [Discrete, Programming, OperSys, Algorithms, Compilers];

val csCourses = [Discrete,Programming,OperSys,Algorithms,Compilers]

: Course list

- cardinality(csCourses);

val it = 5 : int

We can describe the recursive case into three parts. First there is the work done before the
recursive call of the function; when working with lists, this work is usually the splitting of the list
into its head and tail, which can be done implicitly with pattern matching, as we have done here.
Second, there is the recursive call itself. Finally, we usually must do some work after the call, in this
case accounting for the first element by adding one to the result of the recursive call.

18.2 Synthesis
Now we consider functions that will construct lists. One drawback of using lists to represent sets is
that lists may have duplicate items. Our set operation functions operate under the assumption that
the list has been constructed so there happen to be no duplicates. We will get undesired responses
if that assumption breaks.

- cardinality([RealAnalysis, OperSys, LinearAlg, ModernAlg, OperSys]);

val it = 5 : int

It would be useful to have a function that will strip duplicates out of a list, say makeNoRepeats.
Applying the same strategy as before, first consider how to handle an empty list. Since an empty
list cannot have any repeats, this is the trivial case, but keep in mind we are not computing an int
or bool based on this list. Instead we are computing another list, in this case, just the empty list.

- fun makeNoRepeats([]) = []

123
18.2. SYNTHESIS CHAPTER 18. RECURSIVE ALGORITHMS

Now, what do we do with the case of x::rest? The subproblem that corresponds to the entire
problem is removing the duplicates from rest. It is a safe guess that our answer will include
makeNoRepeats(rest). The leaves just x to be dealt with. If we wish to include x in our resulting
list, we can use the cons operator to construct that new list, just as we use it to break down a list
in pattern matching: x::makeNoRepeats(rest). However, we should include x in the resulting list
only if it does not appear there already (otherwise it would be a duplicate); it will appear in the
result of the recursive call if and only if it appears in rest (why?). Thus we have

- fun makeNoRepeats([]) = []
= | makeNoRepeats(x::rest) =
= if isElementOf(x,rest)
= then makeNoRepeats(rest)
= else x::makeNoRepeats(rest);

val makeNoRepeats = fn : ’’a list -> ’’a list

- makeNoRepeats([RealAnalysis, OperSys, LinearAlg, ModernAlg, OperSys]);

val it = [RealAnalysis,LinearAlg,ModernAlg,OperSys] : Course list

To test your understanding of how this works, explain why it was the first occurrence of OperSys
that disappeared, rather than the second.
We noted in Chapter 4 that the cat operator is a poor way to perform a union on sets represented
on lists because any elements in the intersection of the two lists will be included twice. Now we can
write a simple union function, making use of makeNoRepeats:

- fun union(a, b) = makeNoRepeats(a @ b);

A final example will reemphasize the importance of types. Recall that one can make a list of any
type, including other list types. Now suppose we wanted to take a list, create one-element lists of
all its elements, and return a list of those lists, for example

- listify([RealAnalysis,Discrete,DiffEq,Compilers]);

val it = [[RealAnalysis],[Discrete],[DiffEq],[Compilers]] : Course list list

An empty list is still listified to an empty list. But now the work that needs to be done to x is
to make a list out of it.

- fun listify([]) = []
= | listify(x::rest) = [x] :: listify(rest);

What is interesting here is an analysis of the types of the subexpressions of the recursive case:

x ] :: listify (rest
[|{z} )
| {z } | {z }
’a list ’a list
| {z } |’a list -> ’a list
{z }
’a list ’a list list
| {z }
’a list list

124
CHAPTER 18. RECURSIVE ALGORITHMS 18.2. SYNTHESIS

Exercises

6. Write a function sum which computes the sum of the ele-

ments in a list of integers.
1. Write a predicate isSubsetOf which takes two sets repre-
sented as lists and determines if the first set is a subset of 7. Write a function findMin like the one in Section 16.4 ex-
the other. Hint: make the first set vary in the pattern be- cept that it finds the minimum element in a list of integers.
tween empty and non-empty, but regard the second a just a Hint: Your recursive call will result in something akin to
generic set, as in the “minimum so far” idea we had when proving the old
findMin correct, and you like likely use that value twice. Do
- fun isSubsetOf([], b) = ... not write a redundant recursive call; instead, save the result
= | isSubsetOf(x::rest, b) = ... to a variable using a let expression.
You may use isElementOf from this chapter. 8. Write a function addToAll which will take an element and a
list of lists and will produce a similar list of lists except with
2. Write a predicate hasNoRepeats which tests if a list has no the other element prepended to all of the sub-lists, as in
duplicate elements.
- addToAll(5, [[1,2],[3],[],[4,5,7]]);
3. Write a function intersection which computes the inter-
section of two sets. val it = [[5,1,2],[5,3],[5],[5,4,5,7]]
4. Write a function difference which computes the difference : int list list
of one set from another.
9. In Chapter 15, we noted that the powerset function was far
5. Write a new union function which does not use from the best way to compute a powerset in ML. Write a
makeNoRepeats but instead uses @, intersection, and better, recursive powerset. Hint: use your addToAll from
difference Exercise 8.

125
18.2. SYNTHESIS CHAPTER 18. RECURSIVE ALGORITHMS

126
Part V

Relation

127
Chapter 19

Relations

19.1 Definition
Relations are not a new concept for you. Most likely, you learned about relations by how they
differed from functions. We, too, will use relations as a building block for studying functions in
Part VI, but we will also consider them for logical and computational interest in their own right.
If we consider curves in the real plane, the equation y = 4 − x2 represents a function because its
graph passes the vertical line test. A circle, like y 2 = 4 − x2 , fails the test and thus is not a function,
but it is still a relation.

y = 4 − x2 y 2 = 4 − x2

Notice that a “curve” in a graph is really a set of points; so is the equation the curve represents,
2
for that matter.
√ √y = 4 √ − x2√can be thought of as defining the set that includes (0, 2), (2, 0), (0, −2),
(−2, 0), ( 2, − 2), (− 2, 2), etc. A point, in turn, is actually an ordered pair of real numbers.
This leads to what you may remember as the high school definition of a relation: a set of ordered
pairs. We should also recognize them as subsets of R × R.
Our notion of relation will be more broad and technical, allowing for subsets of any Cartesian
product. Thus if X and Y are sets, then a relation R from X to Y is a subset of X × Y . If X = Y , relation
we say that R is a relation on X. (Sometimes subsets on higher-ordered Cartesian products are also
considered; in that context, our definition of a relation more specifically is of a binary relation.) If
(x, y) ∈ R, we say that x is related to y. This is sometimes written xRy, especially if R is a special
symbol like |, ∈, or =, which, you should notice, are all relations. We can also consider R to be a

129
19.2. REPRESENTATION CHAPTER 19. RELATIONS

predicate and write R(x, y) if (x, y) ∈ R.

Small relations are visualized using graphs, which we will treat formally in Part VIII. Their
intent should be intuitively clear for the following examples of relations.

14
2
15
3
Relation | (divides) from {(2, 14), (2, 16), (2, 18), (3, 15), 16
{2, 3, 5, 7} to {14, 15, 16, 17, 18} (3, 18), (5, 15), (7, 14)} 5
17
7
18

2
{(2, {2, 3, 5}), (2, {2, 7}), { 2, 3, 5 }
Relation ∈ from {2, 3, 5, 7} to (3, {2, 3, 5}), (3, {3, 5, 7}), 3
{{2, 3, 5}, {3, 5, 7}, {2, 7}}. (5, {2, 3, 5}), (5, {3, 5, 7}), { 3, 5, 7 }
5
(7, {3, 5, 7}), (7, {2, 7})} { 2, 7 }
7

coyote

hawk
Relation eats on the set { hawk, {(coyote, rabbit), (hawk, rabbit), fox
coyote, rabbit, fox, clover } (coyote, fox), (fox, rabbit),
(rabbit, clover)} rabbit

clover

{(∅, ∅), (∅, {1}), (∅, {2}), (∅, {3}) 0

(∅, {1, 2}), (∅, {2, 3}),
(∅, {1, 3}), (∅, {1, 2, 3}),
({1}, {1}), ({1}, {1, 2}),
({1}, {1, 3}), ({1}, {1, 2, 3}), {1} {2} {3}
Relation ⊆ on P({1, 2, 3}). ({2}, {2}), ({2}, {1, 2}),
({2}, {2, 3}), ({2}, {1, 2, 3}),
({3}, {3}), ({3}, {1, 3}),
({3}, {2, 3}), ({1}, {1, 2, 3}), { 2, 3 } { 1, 3 } { 1, 2 }
({1, 2}, {1, 2}), ({1, 2}, {1, 2, 3}),
({2, 3}, {2, 3}), ({2, 3}, {1, 2, 3}),
({1, 3}, {1, 3}), ({1, 3}, {1, 2, 3}),
({1, 2, 3}, {1, 2, 3}) { 1, 2, 3 }

Notice that in some cases, an element may be related to itself. This is represented graphically
self-loop by a self-loop.

19.2 Representation
A principal goal of this course is to train you to think generally about mathematical objects. Origi-
nally, the mathematics you knew was only about numbers. Gradually you learned that mathematics
can be concerned about other things as well, such as points or matrices. Our primary extension of
this list has been sets. You should now be expanding your horizon to include relations as full-fledged
mathematical objects, since a relation is simply a special kind of set. In ML, our concept of value

130
CHAPTER 19. RELATIONS 19.2. REPRESENTATION

corresponds to what we mean when we say mathematical object. We now consider how to represent
relations in ML.
Our running example through this part is the relations among geographic entities of Old Testa-
ment Israel. As raw material, we represent (as datatypes) the sets of tribes and bodies of water.

DAN

ASHER

- datatype Tribe = Asher | Naphtali | Zebulun

LUN
= | Issachar | Dan | Gad

I
HTAL
Chinnereth

ZEBU
an
= | Manasseh | Reuben | Ephraim

NAP
a
= | Benjamin | Judah | Simeon;

err
dit
ISSACHAR

Jordan
M

GAD

Jabbok
EPHRAIM

BENJAMIN

REUBEN

H
A
D
U Dead
J Arnon

SIMEON

Now we consider the relation from Tribe to WaterBody defined so that tribe t is related to water
body w if t is bordered by w. We could represent this using a predicate, and we will later find some
circumstances where a predicate is more convenient to use. Using lists, however, it is much easier
to define a relation.

- val tribeBordersWater =
= [(Asher, Mediterranean), (Manasseh, Mediterranean), (Ephraim, Mediterranean),
= (Naphtali, Chinnereth), (Naphtali, Jordan), (Issachar, Jordan),
= (Manasseh, Jordan), (Gad, Jordan), (Benjamin, Jordan), (Judah, Jordan),
= (Reuben, Jordan), (Judah, Dead), (Reuben, Dead), (Simeon, Dead),
= (Gad, Jabbok), (Reuben, Jabbok), (Reuben, Arnon)];

val tribeBordersWater =
[(Asher,Mediterranean),(Manasseh,Mediterranean),(Ephraim,Mediterranean),
(Naphtali,Chinnereth),(Naphtali,Jordan),(Issachar,Jordan),
(Manasseh,Jordan),(Gad,Jordan),(Benjamin,Jordan),(Judah,Jordan),
(Reuben,Jordan),(Judah,Dead),...] : (Tribe * WaterBody) list

The important thing is the type reported by ML: (Tribe * WaterBody) list. Since (Tribe *
WaterBody) is the Cartesian product and list is how we represent sets, this fits perfectly with our

131
19.3. MANIPULATION CHAPTER 19. RELATIONS

formal definition of a relation. Now we can use our predicate isElementOf from Chapter 18 as a
model for a predicate determining if two items are related.
- fun isRelatedTo(a, b, []) = false
= | isRelatedTo(a, b, (h1, h2)::rest) =
= (a = h1 andalso b = h2) orelse isRelatedTo(a, b, rest);

val isRelatedTo = fn : ’’a * ’’b * (’’a * ’’b) list -> bool

- isRelatedTo(Judah,Jordan, tribeBordersWater);

val it = true : bool

- isRelatedTo(Simeon,Chinnereth, tribeBordersWater);

val it = false : bool

Alternatively, we could use isElementOf directly.

- fun isRelatedTo(a, b, relat) = isElementOf((a,b), relat);

val isRelatedTo = fn : ’’a * ’’b * (’’a * ’’b) list -> bool

19.3 Manipulation
We conclude this introduction to relations by defining a few objects that can be computed from
image relations. First, the image of an element a ∈ X under a relation R from X to Y is the set

IR (a) = {b ∈ Y | (a, b) ∈ R}
The image of 3 under | (assuming the subsets of Z from the earlier example) is {15, 18}. The image
of fox under eats is { rabbit }. The image of Reuben under tribeBordersWater is [Jordan, Dead,
Jabbok, Arnon].
inverse The inverse of a relation R from X to Y is the relation

R−1 = {(b, a) ∈ Y × Y | (a, b) ∈ R}

The inverse is easy to perceive graphically; all one does is reverse the direction of the arrows.
composition The composition of a relation R from X to Y and a relation S from Y to Z is the relation

R ◦ S = {(a, c) ∈ X × Z | ∃ b ∈ Y such that (a, b) ∈ R and (b, c) ∈ S}

Suppose we had the set of cities C = {Chicago, Green Bay, Indianapolis, Minneapolis}, the set
of professional sports P = {baseball, football, hockey, basketball}, and the set of seasons S =
{summer, fall, spring, winter}. Let hasMajorProTeam be the relation from C to T representing
whether a city has a major professional team in a given sport, and let playsSeaon be the relation
from T to S representing whether or not a professional sport plays in a given season. The composi-
tion hasMajorProTeam◦ playsSeason represents whether or not a city has a major professional team
playing in a given season. In the illustration below, pairs in the composition are found by following
two arrows.
Chicago Baseball Summer
Green Bay Football Fall

Indianapolis Hockey Spring

Minneapolis Basketball Winter

132
CHAPTER 19. RELATIONS 19.3. MANIPULATION

Exercises

We define the following relation to represent whether or not a body

1. Complete the on-line relation diagraph drills found at of water is immediately west of another.
www.ship.edu/~deensl/DiscreteMath/flash/ch4/sec4 1 - val waterImmedWestOf =
/arrowdiagramsrelations.html and www.ship.edu/~deensl = [(Mediterranean, Chinnereth),
/DiscreteMath/flash/ch4/sec4 1/onesetarrows.html = (Mediterranean, Jordan), (Mediterranean, Dead),
2. The identity relation on a set X is defined as IX = {(x, x) ∈ = (Chinnereth, Jabbok), (Chinnereth, Arnon),
X ×X}. Prove that for a relation R from A to B, R◦IB = R. = (Jordan, Jabbok), (Jordan, Arnon),
3. Prove that if R is a relation from A to B, then (R −1 )−1 = R. = (Dead, Jabbok), (Dead, Arnon)];
4. If R is a relation from A to B, is R ◦ R−1 = IA ? Prove or Notice that a body of water a is west of another b if a is immedi-
give a counterexample. ately west of b, or if it is immediately west of another body c which
5. Write a function image which takes an element and a relation is in turn immediately west of b. The latter condition is found by
and returns the image of the element over the relation. the composition of waterImmedWestOf with itself.

6. Write a function addImage which takes an element and a set 7. Write a function compose which takes two relations and re-
(represented as a list) and returns a relation under which turns the composition of those two relations. (Hint: use your
the given element’s image is the given set. image and addImage functions.)

133
19.3. MANIPULATION CHAPTER 19. RELATIONS

134
Chapter 20

Properties of relations

20.1 Definitions
Certain relations that are on a single set X (as opposed to being from X to a distinct Y ) have some
of several interesting properties.

• A relation R on a set X is reflexive if ∀ x ∈ X, (x, x) ∈ R. You can tell a reflexive relation reflexive
by its graph because every element will have a self-loop. Examples of reflexive relations are
= on anything, ≡ on logical propositions, ≤ and ≥ on R and its subsets, ⊆ on sets, and “is
acquainted with” on people.

• A relation R on a set X is symmetric if ∀ x, y ∈ X, if (x, y) ∈ R then (y, x) ∈ R. In the graph symmetric

of a symmetric relation, every arrow has a corresponding reverse arrow (except for self-loops,
which are their own reverse arrow). Examples of symmetric relations are = on anything, ≡ on
logical propositions, and “is acquainted with” on people.

• For completeness, we mention that a relation R on a set X is antisymmetric if ∀ x, y ∈ X, if antisymmetric

(x, y) ∈ R and (x, y) ∈ R, then x = y. We will consider antisymmetric relations more carefully
in Chapter 22.

• A relation R on a set X is transitive if ∀ x, y, z ∈ X, if (x, y) ∈ R and (y, z) ∈ R, then transitive

(x, z) ∈ R. If you imagine “traveling” around a graph by hopping from element to element
as the arrows permit, then you recognize a transitive relation because any trip that can be
made in several hops can also be made in one hop. Examples of transitive relations are = on
anything, ≡ on logical propositions, ≤, ≥, <, and > on R and its subsets, and ⊆ on sets.

Furthermore, a relation is an equivalence relation if it is reflexive, symmetric, and transitive. equivalence relation

Reflexivity Symmetry Transitivity

135
20.2. PROOFS CHAPTER 20. PROPERTIES OF RELATIONS

20.2 Proofs
These properties give an exciting opportunity to revisit proving techniques because they require
careful consideration of what burden of proof their definition demands. For example, suppose we
have the theorem

Theorem 20.1 The relation | on Z+ is reflexive.

If we unpack this using the definition of “reflexive,” we see that this is a “for all” proposition,
which itself contains a set-membership proposition which requires the application of the definition
of this particular relation.

Let the doubter pick any one

Proof. Suppose a ∈ Z. element.

By arithmetic, a · 1 = a, and so by the defini- Apply the definition of the relation.

tion of divides, a|a.

Hence, by the definition of reflexive, | is re- Satisfy the demands of reflexivity.

flexive. 2

Symmetry and transitivity require similar reasoning, except that their “for all” propositions
require two and three picks, respectively, and they contain General Form 2 propositions.

Theorem 20.2 The relation | on Z is transitive.

Proof. Suppose a, b, c ∈ Z, and suppose a|b and b|c. By the definition of divides, there
exist d, e ∈ Z such that a·d = b and b·e = c. By substitution and association, a(d·e) = c.
By the definition of divides, a|c. Hence | is transitive. 2

Notice that this proof involved two applications of the definition of the relation, the first to
analyze the fact that a|b and b|c, the second to synthesize the fact that a|c. Why did we restrict
ourselves to Z+ for reflexivity, but consider all of Z for transitivity?

20.3 Equivalence relations

Equivalence relations are important because they group elements together into useful subsets and,
as the name suggests, express some notion of equivalence among certain elements. The graph of an
equivalence relation has all the properties discussed above for the graphs of the three properties,
and also has the feature that elements are grouped into cliques that all have arrows to each other
and no arrows to any elements in other cliques.

136
CHAPTER 20. PROPERTIES OF RELATIONS 20.3. EQUIVALENCE RELATIONS

Examples of equivalence relations:

R on Z where (a, b) ∈ R if a and R on the set of students where
b are both even or both odd. (a, b) ∈ R if a and b have the same
majors.
R on the set of logical formulas R on the set of variable names
of variables p, q, and r where where (a, b) ∈ R if a and b have
(a, b) ∈ R if a and b have iden- the same first eight letters (which
tical truth table columns. some old programming languages
indeed consider equivalent).
For some n ∈ Z+ , R on Z where R on Z × Z where ((a, b), (c, d)) ∈
(a, b) ∈ R if a and b are congruent R if ab = dc .
mod n.
R on the set of points in the real R on the set of real-valued func-
plane where ((x1 , y1 ), (x2 , y2 )) ∈ tions where (f (x), g(x)) ∈ R if
R if (x1 , y1 ) and (x2 , y2 ) are col- f 0 (x) = g 0 (x).
inear.

Proving that a specific relation is an equivalence relation follows a fairly predictable pattern.
The parts of the proof are the proving of the three individual properties.
Theorem 20.3 Let R be a relation on Z defined that (a, b) ∈ R if a + b is even. R is an equivalence
relation.
Proof. Suppose a ∈ Z. Then by arithmetic, a + a = 2a, which is even by definition.
Hence (a, a) ∈ R and R is reflexive.
Now suppose a, b ∈ Z and (a, b) ∈ R. Then, by the definition of even, a + b = 2c for
some c ∈ Z. By the commutativity of addition, b + a = 2c, which is still even, and so
(b, a) ∈ R. Hence R is symmetric.
Finally suppose a, b, c ∈ Z, (a, b) ∈ R and (b, c) ∈ R. By the definition of even, there
exist d, e ∈ Z such that a+b = 2d and b+c = 2e. By algebra, a = 2d−b. By substitution
and algebra

a + c = 2d − b + c = 2d − 2b + b + c
= 2d − 2b + 2e = 2(d − b + e)
which is even by definition (since d − b + e ∈ Z). Hence (a, c) ∈ R, and so R is transitive.
Therefore, R is an equivalence relation by definition. 2
Once one knows that a relation is an equivalence relation, there are many other facts one can
conclude about it.
Theorem 20.4 If R is an equivalence relation, then R = R −1 .
This result might be spiced up with a sophisticated subject, but the way to attack it is to
remember that a relation is a set, and so R = R−1 is in Set Proposition Form 2 and wrapped in a
General Form 2 proposition.
Proof. Suppose R is an equivalence relation.
First suppose (a, b) ∈ R. Since R is an equivalence relation, it is symmetric, so (b, a) ∈ R
by definition of symmetry. Then by the definition of inverse, (a, b) ∈ R −1 , and so
R ⊆ R−1 by definition of subset.
Next suppose (a, b) ∈ R−1 . By definition of inverse, (b, a) ∈ R. Again by symmetry,
(a, b) ∈ R, and so R−1 ⊆ R.
Therefore, by definition of set equality, R = R−1 . 2

137
20.4. COMPUTING TRANSITIVITY CHAPTER 20. PROPERTIES OF RELATIONS

Equivalence relations have a natural connection to partitions, introduced back in Chapter 3. Let
X be a set, and let P = {X1 , X2 , . . . , Xn } be a partition of X. Let R be the relation on X defined
relation induced so that (x, y) ∈ R if there exists Xi ∈ P such that x, y ∈ Xi . We call R the relation induced by the
partition.
Similarly, let R be an equivalence relation on X. Let [x] be the image of a given x ∈ X under
equivalence class R. We call [x] the equivalence class of x (under R). It turns out that any relation induced by a
partition is an equivalence relation, and the collection of all equivalence classes under an equivalence
relation is a partition.

Theorem 20.5 Let A be a set, P = {A1 , A2 , . . . , An } be a partition of A, and R be the relation

induced by P . R is an equivalence relation.

Proof. Suppose a ∈ A. Since A = A1 ∪ A2 ∪ . . . ∪ An by the definition of partition,

a ∈ A1 ∪ A2 ∪ . . . ∪ An . By the definition of union, there exists Ai ∈ P such that a ∈ Ai .
By definition of relation induced, (a, a) ∈ R. Hence R is reflexive.
Now suppose a, b ∈ A and (a, b) ∈ R. By the definition of relation induced, there exists
Ai ∈ P such that a, b ∈ Ai . Again by the definition of relation induced, (b, a) ∈ R. Hence
R is symmetric.
Finally suppose a, b, c ∈ A, (a, b) ∈ R, and (b, c) ∈ R. By the definition of relation
induced, there exist Ai , Aj ∈ P such that a, b ∈ Ai and b, c ∈ Aj . Suppose that Ai 6= Aj .
Then, by definition of intersection, b ∈ Ai ∩ Aj . However, by definition of partition,
Ai and Aj are disjoint, and so Ai ∩ Aj = ∅ by the definition of disjoint. This is a
contradiction, and so Ai = Aj . By substitution, c ∈ Ai , and so (a, c) ∈ R by definition
of relation induced. Hence R is transitive.
Therefore, R is an equivalence relation. 2

Theorem 20.6 Let A be a set and R be an equivalence relation on A. Then {[a] | a ∈ A} is a

partition of A.

Proof. Exercise 12.

20.4 Computing transitivity

We have already thought about representing relations in ML. Now we consider testing such relations
for some of the properties we have talked about. A test for transitivity is delicate because the
definition begins with a double “for all.” We can tame this by writing a modified (but equivalent)
definition, that a relation R is transitive if

∀ (x, y) ∈ R, ∀ (y, z) ∈ R, (x, z) ∈ R

That is, we choose (x, y) and (y, z) one at a time. This allows us to break down the problem. Assume
that (x, y) is already chosen, and we want to know if transitivity holds specifically with respect to
(x, y). In other words, is it the case that for any element to which y is related, x is also related to
that element? There are three cases for which we need to test, based on the makeup of the list:

1. The list is empty. Then true, the list is (vacuously) transitive with respect to (x, y).

2. The list begins with (y, z) for some z. Then the list is transitive with respect to (x, y) if (x, z)
exists in the relation, and if the rest of the list is transitive with respect to (x, y).

3. The list begins with (w, z) for some w 6= y. Then the list is transitive with respect to (x, y) if
the rest of the list is.

Expressing this in ML:

138
CHAPTER 20. PROPERTIES OF RELATIONS 20.4. COMPUTING TRANSITIVITY

- fun testOnePair((a, b), []) = true

= | testOnePair((a, b), (c, d)::rest) =
= ((not (b = c)) orelse isRelatedTo(a, d, relation))
= andalso testOnePair((a,b), rest);
Note two things. First, our symbolic notation is more expressive than ML since we can write
∀(x, y), (y, z) ∈ R, using y twice, but we cannot use a variable more than once in a pattern. That
is, to write
...testOnePair((a, b), (b, c)::rest) = ...
is not allowed in ML. Hence we coalesce cases 2 and 3 as far as pattern matching goes and test b
and c separately for equality. Second, we used the undefined variable relation, something we will
have to fix by putting it in a proper context. The reason for this is that it is crucial that in case 2
we test if (a, c) exists in the original relation, not just the rest of the list currently being examined.
In other words, it would be a serious error (why?) to write
...((not (b=c)) orelse relatedTo(a, d, rest))
The difficult part is complete. Now we must cover the first “for all,” which can be done by
iterating over the list, testing each pair.
- fun test([]) = true
= | test((a,b)::rest) =
= testOnePair((a,b), relation) andalso test(rest);
Note again that we must be able to refer to the original relation, not just the rest of the list. As
we saw in Chapter 14, it is good style to bundle into one package pieces like test and testOnePair
that will not be used independently of our test for transitivity. This will also solve our problem of
making relation a valid variable.
- fun isTransitive(relation) =
= let fun testOnePair((a, b), []) = true
= | testOnePair((a, b), (c, d)::rest) =
= ((not (b = c)) orelse isRelatedTo(a, d, relation))
= andalso testOnePair((a,b), rest);
= fun test([]) = true
= | test((a,b)::rest) =
= testOnePair((a,b), relation) andalso test(rest);
= in
= test(relation)
= end;

val isTransitive = fn : (’’a * ’’a) list -> bool

- val waterWestOf =
= [(Mediterranean, Chinnereth), (Mediterranean, Jordan), (Mediterranean, Dead),
= (Mediterranean, Jabbok), (Mediterranean, Arnon), (Chinnereth, Jabbok),
= (Chinnereth, Arnon), (Jordan, Jabbok), (Jordan, Arnon), (Dead, Jabbok),
= (Dead, Arnon)];

val waterWestOf =
[(Mediterranean,Chinnereth), ...] : (WaterBody * WaterBody) list

- isTransitive(waterWestOf);

val it = true : bool

139
20.4. COMPUTING TRANSITIVITY CHAPTER 20. PROPERTIES OF RELATIONS

Exercises

10. Prove that if R is an equivalence relation and (a, b) ∈ R, then

IR (a) = IR (b). (Hint: Don’t overlook the Set Proposition
1. Give a counterexample proving that | is not symmetric over
Form 2 of the conclusion.)
Z.
11. Prove that if R and S are both equivalence relations, then
For x, y ∈ R, let x ≤ y if there exists z ∈ Rnonneg such that x+z = y. R ◦ S is an equivalence relation.
2. Prove that ≤ is reflexive over R. 12. Prove Theorem 20.6.
3. Give a counterexample proving that ≤ is not symmetric over 13. It is tempting to think that if a relation R is symmetric and
R. transitive, then it is also reflexive. This, however, is false.
4. Prove that ≤ is transitive over R. Give a counterexample. (Hint: Try to prove that R is re-
flexive. Then think about what unfounded assumption you
Let A be the set of intervals on the real number line. Let R be the are making in your proof.)
relation on A such that two intervals are related if they overlap,
14. Based on what you discovered in Exercise 13, fill in the blank
that is, there exists a real number that is in both intervals.
to make this proposition true and prove it. “If a relation R
5. Prove that R is reflexive. on a set A is symmetric and transitive and if , then
6. Prove that R is symmetric. R is reflexive.”

7. Give a counterexample proving that R is not transitive. 15. Write a predicate isSymmetric which tests if a relation is
symmetric, similar to isTransitive.
Let R and S be a relations on A.
16. Write a predicate isAntisymmetric which tests if a relation
8. Suppose that R is reflexive and that for all a, b, c ∈ A, if is antisymmetric.
(a, b) ∈ R and (b, c) ∈ R, then (c, a) ∈ R. Prove that R is 17. Why do we not ask you to write a predicate isReflexive?
an equivalence relation. What would such a predicate require that you do not know
9. Prove that if R is an equivalence relation, then R ◦ R ⊆ R. how to do?

140
Chapter 21

Closures

21.1 Transitive failure

The water bodies of Israel neatly fit into vertical bands. The Mediterranean is to the west; the
Sea of Chinnereth, the Jordan River, and the Dead Sea form a north-to-south line; and the Arnon
and the Jabbok run east to west into the Jordan, parallel to each other. The relation “is vertically
aligned with” is an equivalence relation, which we (naı̈vely) represent in with the following list; we
also illustrate the partition inducing the relation.

- val waterVerticalAlign =
= [(Chinnereth, Jordan), (Chinnereth, Dead),
= (Jordan, Chinnereth), (Jordan, Dead), Chinnereth

= (Dead, Chinnereth), (Dead, Jordan), Jordan

Dead
= (Jabbok, Arnon), (Arnon, Jabbok)];

val waterVerticalAlign = [...] : (WaterBody * WaterBody) list

Jabbok
Mediterranean
- isTransitive(waterVerticalAlign);
Arnon

val it = false : bool

To our surprise, it reports that waterVerticalAlign, a supposed equivalence relation, is not

transitive. We must then turn our attention to debugging—why is our relation not transitive? Since
the test for transitivity fails when pairs (a, b) and (c, d) are found in the list and b = c but (a, d)
is not found in the list, the most useful information we could get would be to ask the predicate
isTransitive, “What pairs did you find that contradict transitivity?”
What if we modified isTransitive so that it returns a list of counterexamples instead of a simple
true or false? This would tell us not only that a relation is intransitive, but why it is intransitive.
This would require the following changes to testOnePair:

• Instead of returning true on an empty list or when a is related to d, return [], that is, an
empty list, indicating no contradicting pairs are found.
• Instead of returning false when a is not related to d, return [(a, d)], that is, a list containing
the pair that was expected but not found.
• Instead of anding the (boolean) value of testOnePair on the rest of the list with what we
found for the current pair, concatenate the result for the current pair to the (list) value of
testOnePair.

Observe the new function.

141
21.1. TRANSITIVE FAILURE CHAPTER 21. CLOSURES

- fun counterTransitive(relation) =
= let fun testOnePair((a, b), []) = []
= | testOnePair((a, b), (c,d)::rest) =
= (if ((not (b=c)) orelse isRelatedTo(a, d, relation))
= then [] else [(a, d)])
= @ testOnePair((a,b), rest);
= fun test([]) = []
= | test((a,b)::rest) = testOnePair((a,b), relation) @ test(rest)
= in
= test(relation)
= end;

val counterTransitive = fn : (’’a * ’’a) list -> (’’a * ’’a) list

Note the type. It is no trouble to write the same things again, and it is a safeguard for you.

- counterTransitive(waterVerticalAlign);

val it =
[(Chinnereth,Chinnereth),(Chinnereth,Chinnereth),(Jordan,Jordan),
(Jordan,Jordan),(Dead,Dead),(Dead,Dead),(Jabbok,Jabbok),(Arnon,Arnon)]
: (WaterBody * WaterBody) list

This reveals the problem: we forgot to add self-loops. Adding them (but eliminating repeats)
makes the predicate transitive.

- val correctedWaterVerticalAlign =
= waterVerticalAlign @ makeNoRepeats(counterTransitive(waterVerticalAlign));

val correctedWaterVerticalAlign =
[(Chinnereth,Jordan),(Chinnereth,Dead),(Jordan,Chinnereth),(Jordan,Dead),
(Dead,Chinnereth),(Dead,Jordan),(Jabbok,Arnon),(Arnon,Jabbok),
(Chinnereth,Chinnereth),(Jordan,Jordan),(Dead,Dead),(Jabbok,Jabbok),...]
: (WaterBody * WaterBody) list

- isTransitive(correctedWaterVerticalAlign);

val it = true : bool

Similarly, we could use this to derive the transitive relation waterWestOf from waterImmedWestOf:

- isTransitive(waterImmedWestOf);

val it = false : bool

- val waterWestOf =
= waterImmedWestOf @ makeNoRepeats(counterTransitive(waterImmedWestOf));

142
CHAPTER 21. CLOSURES 21.2. TRANSITIVE AND OTHER CLOSURES

val waterWestOf =
[...] : (WaterBody * WaterBody) list

- isTransitive(waterWestOf);

val it = true : bool

21.2 Transitive and other closures

waterVerticalAlign and waterWestOf are both examples of cases where adding certain pairs to
a relation produces a new relation which is now transitive. The smallest transitive relation that
is a superset of a relation R is called the transitive closure, R T , of R. The requirement that this transitive closure
be the smallest such relation reflects that fact that there may be many transitive relations that are
supersets of R (for example, the universal relation where everything is related to everything else).
We are interested in adding the fewest possible pairs to make R transitive. Formally, R T is the
transitive closure of R if

1. RT is transitive.

2. R ⊆ RT .

3. If S is a relation such that R ⊆ S and S is transitive, then R T ⊆ S.

Theorem 21.1 The transitive closure of a relation R is unique.

Proof. Suppose S and T are relations fulfilling the requirements for being transitive
closures of R. By items 1 and 2, S is transitive and R ⊆ S, so by item 3, T ⊆ S. By
items 1 and 2, T is transitive and R ⊆ T , so by item 3, S ⊆ T . Therefore S = T by the
definition of set equality. 2

Some examples of transitive closures:

• The transitive closure of our relation “eats” on { hawk, coyote, rabbit, fox, clover } is “gets
nutrients from.” A coyote ultimately gets nutrients from clover.

• Let R be the relation on Z defined that (a, b) ∈ R if a+1 = b. Thus (−15, −14), (1, 2), (23, 24) ∈
R. The transitive closure of R is <.

The reflexive closure and the symmetric closure are defined similarly, though these are of less reflexive closure
importance. The reflexive closure of < is ≤. “Is in love with” in an ideal world is the symmetric
closure of “is in love with” in the real world. symmetric closure

21.3 Computing the transitive closure

Our success using @ and makeNoRepeats might make us optimistically write a function

- fun transitiveClosure(relation) =
= relation @ makeNoRepeats(counterTransitive(relation));

However, this is wrong. Suppose we test it on a relation that relates the tribes descended from
Leah according to who immediately precedes whom in birth (since we are considering tribes as
geographic entities, we have ignored Levi, who received no land).

143
21.3. COMPUTING THE TRANSITIVE CLOSURE CHAPTER 21. CLOSURES

- val immediatelyPrecede =
= [(Reuben, Simeon), (Simeon, Judah), (Judah, Issachar),
= (Issachar, Zebulun)];

val immediatelyPrecede = [...] : (Tribe * Tribe) list

- val birthOrder = transitiveClosure(immediatelyPrecede);

val birthOrder =
[(Reuben,Simeon),(Simeon,Judah),(Judah,Issachar),(Issachar,Zebulun),
(Reuben,Judah),(Simeon,Issachar),(Judah,Zebulun)] : (Tribe * Tribe) list

- isTransitive(birthOrder);

val it = false : bool

- counterTransitive(birthOrder);

val it =
[(Reuben,Issachar),(Simeon,Zebulun),(Reuben,Issachar),(Reuben,Zebulun),
(Simeon,Zebulun)] : (Tribe * Tribe) list

The transitive closure should completely express who is older than whom, yet the answer is
missing, for example, (Reuben, Issachar). By adding the pair (Reuben, Judah), we have also
created a new “missing pair.” To compute the transitive closure correctly, we must add missing
pairs repeatedly until the relation is transitive. In other words, we must add not only the pairs of
R2 = R ◦ R, but also those of R3 , R4 , etc. The following theorem informs how to calculate the
transitive closure.
Theorem 21.2 If R is a relation on a set A, then
∞
[
R∞ = Ri = {(x, y) | ∃ i ∈ N such that (x, y) ∈ Ri }
i=1
is the transitive closure of R.
Proof. Suppose R is a relation on a set A.
Suppose a, b, c ∈ A, (a, b) ∈ R∞ , and (b, c) ∈ R∞ . By the definition of R∞ , there exist
i, j ∈ N such that (a, b) ∈ Ri and (b, c) ∈ Rj . By the definition of relation composition,
(a, c) ∈ Ri ◦ Rj = Ri+j ⊆ R∞ . By the definition of subset, (a, c) ∈ R∞ , and so R∞ is
transitive.
Suppose a, b ∈ A and (a, b) ∈ R. By the definition of R∞ (taking i = 1), (a, b) ∈ R∞ ,
and so R ⊆ R∞ .
Suppose S is a transitive relation on A and R ⊆ S. Further suppose (a, b) ∈ R ∞ . Then,
by definition of R∞ , there exists i ∈ N such that (a, b) ∈ Ri . We will prove that (a, b) ∈ S
by induction on i.
Suppose i = 1. Then (a, b) ∈ R ⊆ S, so (a, b) ∈ S. Hence there exists some I ≥ 1 such
that for all (e, f ) ∈ RI , (e, f ) ∈ S.
Next suppose that i = I + 1. Then by the definition of relation composition there exist
j, k ∈ N, j + k = i and c ∈ A such that (a, c) ∈ Rj and (c, b) ∈ Rk . Since j, k < i,
j, k ≤ I, both by arithmetic. By our induction hypothesis, (a, c), (c, b) ∈ S. Since S is
transitive, (a, b) ∈ S.
Hence, by math induction, (a, b) ∈ S for all i ∈ N.
Hence R∞ ⊆ S by definition of subset.
Therefore, R∞ is the transitive closure of R. 2

144
CHAPTER 21. CLOSURES 21.4. RELATIONS AS PREDICATES

The potential need for making an infinity of compositions is depressing. However, on a finite set,
the number of possible pairs is also finite, so eventually these compositions will not have anything
more to add; we can stop when the relation we are constructing is finally transitive. Interpreting
this iteratively,

- fun transitiveClosure(relation) =
= let val closure = ref relation;
= in
= (while not (isTransitive(!closure)) do
= closure := !closure @ makeNoRepeats(counterTransitive(!closure));
= !closure)
= end;

Iu Manin said, “A good proof is one which makes us wiser” [10]. The same can be said about
good programs. In this case, we can restate our definition of the transitive closure of a relation to
be

• The relation itself, if it is already transitive, or

• The transitive closure of the union of the relation to its immediately missing pairs, otherwise.

Hence in the applicative style, we have

- fun transitiveClosure(relation) =
= if isTransitive(relation)
= then relation
= else transitiveClosure(makeNoRepeats(counterTransitive(relation))
= @ relation);

21.4 Relations as predicates

Recall that our initial problem with the relation waterVerticalAlign was that it was missing
self-loops. Hence the reflexive closure would have solved our problem just as well as the transitive
closure. Unfortunately, there is no way to compute the reflexive closure given our way of representing
relations (think back to Exercise 17 of Chapter 20).
This is particularly frustrating because the reflexive closure is so simple. For example, we could
write a predicate like isRelatedTo except that it tests if two elements are related by the relation’s
reflexive closure.

- fun isRelatedToByRefClos(a, b, relation) =

= a = b orelse relatedTo(a, b, relation);

This suggests that our situation could be remedied if we followed the intuition of the other way
to represent relations, as predicates. For example, eats back in Chapter 9 is a relation.
What we do not want to lose is the ability to treat relations as what we have been calling
“mathematical entities.” What we mean is that we should be able to pass a relation to a function
and have a function return a relation, as transitiveClosure does. A value in a programming
environment that can be passed to and returned from a function is a first-class value. A pillar of first-class value
functional programming is that functions are first-class values.
A relation represented by a predicate will have a type like (’a * ’a) → bool. This means “function
that maps from an ’a × ’a pair to a bool.” Our first task is to write a function that will convert
from list representation to predicate representation. To return a predicate from a function, simply
define the predicate (locally) within the function and return it by naming it without giving any
parameters.

145
21.4. RELATIONS AS PREDICATES CHAPTER 21. CLOSURES

- fun listToPredicate(oldRelation) =
= let fun newRelation(a, b) = isRelatedTo(a, b, oldRelation);
= in
= newRelation
= end;

val listToPredicate = fn : (’’a * ’’b) list -> ’’a * ’’b -> bool

In a similar way, a function can receive a predicate. Observe this function to compute the reflexive
closure:

- fun reflexiveClosure(relation) =
= let fun closure(a, b) = a = b orelse relation(a, b);
= in
= closure
= end;

val reflexiveClosure = fn : (’’a * ’’a -> bool) -> ’’a * ’’a -> bool

Computing the symmetric closure is an exercise. We cannot compute the transitive closure
directly, but if the relation is originally represented as a list, we could compute the transitive closure
before converting to predicate form.
As in Chapter 4, we are faced with the dilemma of choosing among two representations, each of
which has favorable features and drawbacks. The list representation is the more intuitive, at least in
terms of the formal definition of a relation; with the predicate representation, however, we can test
a pair for membership directly, as opposed to relying on a predicate like isRelatedTo. Because we
can iterate through all pairs, the list representation allows testing and computing of transitivity, but
with the predicate representation we can compute reflexive and symmetric closures. Conversion is
one-way, from lists to predicates. Neither representation allows us to test reflexivity. These aspects
are summarized below. Every “no” would become a “yes” if only we had the means to iterate
through all elements of a datatype.
List Predicate
first class value yes yes
membership test indirectly directly
isReflexive no no
isSymmetric yes no
isTransitive yes no
isAntiSymmetric yes no
reflexiveClosure no yes
symmetricClosure yes yes
transitiveClosure yes no
convert to other yes no
Finally, a word of interest only to those who have programmed in an object-oriented language such
as Java. When you read examples like these, you should be thinking about the best way to represent
a concept in other programming languages. In an object-oriented language, objects are first-class
values; hence we would want to write a class to represent relations. The primary difference between
the list and predicate representations presented here is that the former represents the relation as
data, the latter as functionality. The primary characteristic of objects is that they encapsulate data
and functionality together in one package. Since the predicate representation can be built from a list
representation, one would expect that it would be strictly more powerful; however, in the conversion
we lost the ability to test for transitivity, and this is because we have lost access to the list. If instead
we made the list to be an instance variable of a class, the methods could do the functional work

146
CHAPTER 21. CLOSURES 21.4. RELATIONS AS PREDICATES

of reflexiveClosure as well as the iterative work of isTransitive. Moreover, Java 5 has enum
types unavailable in earlier versions of Java, which provide the functionality of ML’s datatypes, as
well as a means of iterating over all elements of a set, something that hinders representing relations
as lists or predicates in ML (ML’s datatype is more powerful than Java’s enum types in other ways,
however). The following shows an implementation of relations using Java 5.
/** public boolean isTransitive() {
* Class Relation to model mathematical relations. for (List current = pairs; current != null;
* The relation is assumed to be over a set modeled current = current.tail)
* by a Java enum. if (! isTransitiveWRTPair(current.first,
* current.second))
* @author ThomasVanDrunen return false;
* Wheaton College return true;
* June 30, 2005 }
*/
public boolean isAntisymmetric() {
public class Relation<E extends Enum<E>> { if (isSymmetric()) return false;
for (List current = pairs; current != null;
private class List { current = current.tail)
public E first; if (current.first != current.second &&
public E second; relatedTo(current.second, current.first))
public List tail; return false;
public List(E first, E second, List tail) { return true;
this.first = first; }
this.second = second;
this.tail = tail; public Relation reflexiveClosure() {
} if (isReflexive()) return this;
else return new Relation<E>() {
/** public boolean isReflexive() { return true; }
* Concatenate a give list to the end of this list. public boolean relatedTo(E a, E b) {
* @param other The list to add. return a == b
* POSTCONDITION: The other list is added to || Relation.this.relatedTo(a, b);
* the end of this list; the other list is not affected. }
*/ };
public void concatenate(List other) { }
if (tail == null) tail = other;
else tail.concatenate(other); public Relation symmetricClosure() {
} if (isSymmetric()) return this;
} else return new Relation<E>() {
public boolean isSymmetric() { return true; }
/** public boolean relatedTo(E a, E b) {
* The set ordered pairs of the relation. return Relation.this.relatedTo(a, b) ||
*/ Relation.this.relatedTo(b, a);
private List pairs; }
};
/** }
* Constructor to create a new relation of n pairs from an
* n by 2 array. private List counterTransitiveWRTPair(E a, E b) {
* @param input The array of pairs; essentially an array of List toReturn = null;
* length-2 arrays of the base enum. for (List current = pairs; current != null;
*/ current = current.tail)
public Relation(E[][] input) { if (b == current.first
for (int i = 0; i < input.length; i++) { && ! relatedTo(a, current.second))
assert input[i].length == 2; toReturn = new List(a, current.second, toReturn);
pairs = new List(input[i][0], input[i][1], pairs); return toReturn;
} }
}
private List counterTransitive() {
public boolean relatedTo(E a, E b) { List toReturn = null;
for (List current = pairs; current != null; for (List current = pairs; current != null;
current = current.tail) current = current.tail) {
if (current.first == a && current.second == b) List currentCounter =
return true; counterTransitiveWRTPair(current.first,
return false; current.second);
} if (currentCounter != null) {
currentCounter.concatenate(toReturn);
public boolean isReflexive() { toReturn = currentCounter;
// if there are no pairs, we can assume this is }
// not reflexive. }
if (pairs == null) return false; return toReturn;
try { }
for (E t : (E[]) pairs.first.getClass()
.getMethod("values").invoke(null)) /**
if (! relatedTo(t, t)) return false; * Default constructor used by transitiveClosure().
} catch (Exception e) { } // won’t happen */
return true; private Relation() {}
}
public Relation transitiveClosure() {
public boolean isSymmetric() { if (isTransitive()) return this;
for (List current = pairs; current != null; Relation toReturn = new Relation();
current = current.tail) toReturn.pairs = counterTransitive();
if (! relatedTo(current.second, current.first)) toReturn.pairs.concatenate(pairs);
return false; return toReturn.transitiveClosure();
return true; }
}
public boolean isEquivalenceRelation() {
private boolean isTransitiveWRTPair(E a, E b) { return isReflexive() && isSymmetric() && isTransitive();
for (List current = pairs; current != null; }
current = current.tail)
if (b == current.first public boolean isPartialOrder() {
&& ! relatedTo(a, current.second)) return isReflexive() && isAntisymmetric()
return false; && isTransitive();
return true; }
} }

147
21.4. RELATIONS AS PREDICATES CHAPTER 21. CLOSURES

Exercises

4. Using counterSymmetric, write a function symmetricClosure.

1. Prove that if R is a relation on A, then R ∪ IA is uniquely 5. Write a function counterAntisymmetic which operates on
the reflexive closure R. the list representation of relations and constructs a list of
counter examples to antisymmetry. This is different from
2. Prove that if R is a relation on A, then R ∪ R−1 is uniquely counterSymmetric and counterTransitive because it will
the symmetric closure of R. create a list of extraneous pairs, not missing pairs.
3. Write a function counterSymmetric which operates on the 6. Write a function that computes the symmetric closure for
list representation of relations. the predicate representation.

148
Chapter 22

Partial orders

22.1 Definition
No one would mistake the ⊆ relation over a powerset for an equivalence relation by
0
looking at the graph. So far from parcelling the set out into autonomous islands, it
connects everything in an intricate, flowing, and (if drawn well) beautiful network.
{1} {2} {3}
However, it does happen to have two of the three attributes of an equivalence
relation: it is reflexive and it is transitive. Symmetry makes all the difference
here. ⊆ in fact is the opposite of symmetric—it is antisymmetric, the property { 2, 3 } { 1, 3 } { 1, 2 }
we mentioned only briefly in Chapter 19, but which you also have seen in a few
exercises. Being antisymmetric, informally, means that no two distinct elements { 1, 2, 3 }
in the set are mutually related—though any single element may be (mutually)
related to itself. Note carefully, though, that the definition of antisymmetric says “if two elements
are mutually related, they must be equal,” not “if two elements are equal, they must be mutually
related.” Relations like this are important because they give a sense of order to the set, in this case
a hierarchy from the least inclusive subset to the most inclusive, with certain sets at more or less
the same level.
A partial order relation is a relation R on a set X that is reflexive, transitive, and antisymmetric. partial order relation
A set X on which a partial order is defined is called a partially ordered set or a poset. The idea of
the ordering being only partial is because not every pair of elements in the set is organized by it. poset
In this case, for example, {1, 2} and {1, 3} are not comparable, which we will define formally in the
next section.

Theorem 22.1 Let A be any set of sets over a universal set U . Then A is a poset with the relation
⊆.

Proof. Suppose A is a set of sets over a universal set U .

Suppose a ∈ A. By the definition of subset, a ⊆ a. Hence ⊆ is reflexive.
Suppose a, b, c ∈ A, a ⊆ b, and b ⊆ c. Suppose further that x ∈ a. By definition of
subset, x ∈ b, and similarly x ∈ c. Again by the definition of subset, a ⊆ c. Hence ⊆ is
transitive.
Finally suppose a, b ∈ A, a ⊆ b, and b ⊆ a. By the definition of set equality, a = b.
Hence, by the definition of antisymmetry, ⊆ is antisymmetric.
Therefore, A is a poset with ⊆. 2

The graphs of equivalence relations and of partial orders become very cluttered. For equivalence
relations, it is more useful visually to illustrate regions representing the equivalence classes, like
on page 141. For partial orders, we use a pared down version of a graph called a Hasse diagram, Hasse diagram
after German mathematician Helmut Hasse. It strips out redundant information. To transform the

149
22.2. COMPARABILITY CHAPTER 22. PARTIAL ORDERS

graph of a partial order relation into a Hasse diagram, first draw it so that all the arrows (except for
self-loops) are pointing up. Antisymmetry makes this possible. Then, since the arrangement on the
page informs us what direction the arrows are going, the arrowheads themselves are redundant and
can be erased. Finally, since we know that the relation is transitive and reflexive, we can remove
self-loops and short-cuts. In the end we have something more readable, with the (visual) symmetry
apparent.

{ 1, 2, 3 } { 1, 2, 3 } { 1, 2, 3 }

{ 2, 3 } { 1, 3 } { 1, 2 } { 2, 3 } { 1, 3 } { 1, 2 } { 2, 3 } { 1, 3 } { 1, 2 }

{1} {2} {3} {1} {2} {3} {1} {2} {3}

0 0 0

Some examples of partial orders:

• | (divides) on Z+ . Arnon Jabbok

• ≤ on R.
Chinnereth Jordan Dead
• waterWestOf on WaterBody.
• Alphabetical ordering over the set of lexemes in a lan-
guage. Mediterranean

Generic partial orders are often denoted by the symbol , with obvious reference to ≤ and ⊆.

22.2 Comparability
We have noted that the partial order relation ⊆ does not put in order, for example, {1, 3} and {2, 3}.
comparable We say that for a partial order relation on a set A, a, b ∈ A are comparable if a b or b a. 12
and 24 are comparable for | (12|24), but 12 and 15 are noncomparable. Mediterranean and Arnon
are comparable for waterWestOf (the Mediterranean is west of the Arnon), but Arnon and Jabbok
are noncomparable. For ⊆, ∅ is comparable to everything; in fact, it is a subset of everything.
Arnon is not comparable to everything, but everything it is comparable with is west of it. These
last observations lead us to say that if is a partial order relation on A, then a ∈ A is
maximal • maximal if ∀ b ∈ A, b a or b and a are not comparable.
minimal • minimal if ∀ b ∈ A, a b or b and a are not comparable.
greatest • greatest if ∀ b ∈ A, b a.
least • least if ∀ b ∈ A, a b
As you can see, a poset may have many maximal or minimal elements, but at most one greatest
or least. An infinite poset, like R with ≤ may have none, but we can prove that a finite poset has at
least one maximal element. First, a trivial result that will be useful: For any poset, we can remove
one element and it will still be a poset.

150
CHAPTER 22. PARTIAL ORDERS 22.3. TOPOLOGICAL SORT

Lemma 22.1 If A is a poset with partial order relation , and a ∈ A, then A − {a} is a poset with
partial order relation −{(b, c) ∈ | b = a or c = a}.

Proof. Suppose A is a poset with partial order relation , and suppose a ∈ A. Let
A0 = A − {a} and 0 = −{(b, c) ∈ | b = a or c = a}

Now suppose b ∈ A0 . By definition of difference, b ∈ A, and since is reflexive, b b.

Also by definition of difference, b 6= a, so b 0 b. Hence 0 is reflexive.

Suppose b, c ∈ A0 , b 0 c, and c 0 b. By definition of difference, b, c ∈ A, b c, and

c b. Since is antisymmetric, b = c. Hence 0 is reflexive.

Finally, suppose b, c, d ∈ A0 , b c, and c 0 d. By definition of difference, b, c, d ∈ A,

b c, and c d. Since is transitive, b d. Also by the definition of difference, b 6= a
and d 6= a, so b 0 d. Hence 0 is transitive.

Therefore A0 is a partial order relation with 0 . 2

Theorem 22.2 A finite, non-empty poset has at least one maximal element.

Proof. Suppose A is a poset with partial order relation . We will prove it has at least
one maximal element by induction on |A|.

Base case. Suppose |A| = 1. Let a be the one element of A. Trivially, suppose b ∈ A.
Since |A| = 1, b = a, and since is reflexive, b a. Hence a is a maximal element, and
so there exists an N ≥ 1 such that for any poset of size N , it has at least one maximal
element.

Inductive case. Suppose |A| = N + 1. Suppose a ∈ A. Let A0 = A − {a} and

0 = −{(b, c) ∈ | b = a or c = a}. By Lemma 22.1, A0 is a partial order with 0 .
By the inductive hypothesis, A0 has at least one maximal element. Let b be a maximal
element of A0 . Now we have three cases.

Case 1: Suppose a b. Then suppose c ∈ A. If c 6= a, then since b is maximal

in A0 , either c 0 b (in which case, by definition of difference, c b) or c and b are
noncomparable (in which case c and b are still incomparable in 0 ). If c = a, then c b
by our supposition. Hence b is a maximal element in A.

Case 2: Suppose b a. See Exercise 12.

Case 3: Suppose b and a in noncomparable with . See Exercise 13.

22.3 Topological sort

If all pairs in a poset are comparable, then the partial order relation is a total order relation. Total total order relation
orders we have seen include ≤ on R and alphabetical ordering on lexemes. As the name suggests,
such a relation puts the set into a complete ordering; a relation like this could be used to sort a
subset of a poset. However, there are many situations where elements of a poset need to be put
into sequence even though the relation is not a total order relation. In that case, we merely must
disambiguate the several possibilities for sequencing noncomparable elements. If X is a poset with
partial order relation , then the relation 0 is a topological sort if ⊆0 and 0 is a total order topological sort
relation. In other words, a topological sort of a partial order relation is simply that partial order
relation with the ambiguities worked out.
Wheaton College’s mathematics program has the following courses with their prerequisites:

151
22.3. TOPOLOGICAL SORT CHAPTER 22. PARTIAL ORDERS

494
231: none 351: 232 and 245
232: 231 363: 232 441 451 463
243: 231 441: 341
245: 231 451: 351 341 351 333 331 363

331: 232 463: 363

245
333: 232 494: 341 and 351 232 243

341: 245
231

A topological sort would be a sequence of these courses taken in an order that does not conflict
with the prerequisite requirement, for example,

231—232—243—333—245—331—351—341—441—363—463—494—451

152
CHAPTER 22. PARTIAL ORDERS 22.3. TOPOLOGICAL SORT

Exercises

and determines if the first element is related to the second

Consider the set of Jedis and Siths, A = { Yoda, Palpatine, Mace, in the partial order.
Anakin, Obi-Wan, Maul, Qui-Gon, Dooku, Luke }. Let R be a
relation on A defined so that (a, b) ∈ R if a has defeated b in a 8. Alphabetical order is generalized by what is called lexico-
light-saber fight (not necessarily fatally). Specifically: graphical order. If A1 , A2 , . . . , An are posets with total or-
der relations 1 , 2 , . . . , n , respectively, then the lexico-
Ep 1. Maul defeats Qui-Gon. graphical order of A1 × A2 × . . . × An is ` defined as
Ep 1. Obi-Wan defeats Maul. (a1 , a2 , . . . , an ) ` (b1 , b2 , . . . , bn ) if a1 1 b1 , a2 2 b2 , . . . ,
Ep 2. Dooku defeats Obi-Wan. an n bn . Prove that a lexicographical order is a total order
Ep 2. Dooku defeats Anakin. relation.
Ep 2. Yoda defeats Dooku.
Ep 3. Anakin defeats Dooku. 9. The previous example was defined for tuples. We can imag-
Ep 3. Mace defeats Palpatine. ine a similar concept for lists, where all the elements must
Ep 3. Palpatine defeats Yoda. come from the same set, using only one total order relation.
Ep 4. Anakin defeats Obi-Wan. To deal with the varying size of lists, we will say that if a
Ep 5. Anakin defeats Luke. longer list has a shorter list as a prefix, then the shorter
Ep 6. Luke defeats Anakin. list comes before the longer. Write a function isRelatedLex
which takes two lists and a total order relation on the el-
1. Draw a graph representing this relation.
ements in those lists and determines if the first list should
2. Let RT be the transitive closure of R. Which of the following come after the second.
are true?
10. If R and S are antisymmetric relations on a set A, is R ∪ S
(a) (Obi-Wan, Qui-Gon) ∈ RT .
T
antisymmetric? Prove or give a counterexample.
(b) (Mace, Maul) ∈ R .
11. If R and S are partial order relations on a set A, is R ◦ S a
(c) (Luke, Yoda) ∈ RT .
T
partial order relation? Prove or give a counterexample.
(d) (Palpatine, Luke) ∈ R .
12. Prove case 2 of Theorem 22.2.
3. Give a counterexample to show that R is not antisymmetric.
4. If you consider only the final meeting of each Jedi/Sith pair 13. Prove case 3 of Theorem 22.2.
(that is, eliminate the pairs (Obi-Wan, Anakin), (Anakin, 14. Suppose A is a poset with a total order , and suppose a ∈ A
Luke), and (Dooku, Anakin)) and take the transitive closure, is a maximal element. Prove that a is the greatest element.
we are left with a partial order. Draw the Hasse diagram.
15. What relation is reflexive, symmetric, transitive, and anti-
5. Suppose you wanted to find a relation S that associated peo- symmetric?
ple with others on the same side in the conflict (that is, S =
“is on the same side”). S should be an equivalence relation 16. Prove that | (divides) on Z+ is antisymmetric.
and it should be that if (a, b) ∈ S then (a, b) ∈
/ R, that is, a 17. C stands out from the other standard number sets in that
person never fights with someone on the same side. Explain ≤ (and similar comparison operators) is no longer defined.
why S does not exist for the given data (you need not know Find a partial order for C based on ≤.
the story; merely find a counterexample in the given data).
If you know the movies, explain why this is the case in terms 18. Find a total order for C. (Hint: Find a topological sort of
of the story. your answer to Exercise 17.)
6. Give a topological sort of the partial order in Exercise 4. 19. The following exercise comes from Cormen et al [4]. Profes-
7. Notice that the Hasse diagram of waterWestOf looks very sor Bumstead is getting dressed. He needs to put on a belt,
much like a graph of waterImmedWestOf, just without the a jacket, pants, a shirt, shoes, socks, a tie, an undershirt,
arrowheads. Using this observation, we could implement in undershorts, and a watch. Draw a Hasse diagram showing
ML a minimal representation of a partial order that only the order in which certain articles must be put on in rela-
contained pairs that would be connected in the Hasse dia- tion to one another. Then give a topological sort that gives a
gram. Write a function isRelatedPO which takes two ele- reasonable order in which Professor Bumstead can put them
ments and a minimal list representation of a partial order on.

153
22.3. TOPOLOGICAL SORT CHAPTER 22. PARTIAL ORDERS

154
Part VI

Function

155
Chapter 23

Functions

23.1 Intuition
Of all the major topics in this course, the function is probably the one with which your prior
familiarity is the greatest. In the modern treatment of mathematics, functions pervade algebra,
analysis, the calculus, and even analytic geometry and trigonometry. It is hard for today’s student
even to comprehend the mathematics of earlier ages which did not have the modern understanding
and notation of functions. Recall the distinction we made at the beginning of this course, that
though you had previously studied the contents of various number sets, you would now study sets
themselves. Similarly, up till now you have used functions to talk about other mathematical objects.
Now we shall study functions as mathematical objects themselves.
Your acquaintance with functions has given you several models or metaphors by which you
conceive functions. One of the first you encountered was that of dependence—two phenomena or
quantities are related to each other in such a way that one was dependent on the other. In high
school chemistry, you may have performed an experiment where, given a certain volume of water,
the change of temperature of the water was affected by how much heat was applied to the water.
The number of joules applied is considered the independent variable, a quantity you can control.
The temperature change, in Kelvins, is a dependent variable, which changes predictably based on
the independent variable. Let f be the temperature change in Kelvins and x be the heat in joules.
With a kilogram of water, you would discover that

f (x) = 4.183x

Next, one might think of a function as a kind of machine, one that has a slot into which you can
feed raw materials and a slot where the finished product comes out, like a Salad Shooter.

x, input,
argument

f(x),
output

A function is sometimes defined as a mapping or association between two collections of things.

Think of how a telephone book associates names with phone numbers. This concept of mapping
crosses disciplines in surprising ways. It is interesting to note that mapping is not only a synonym
for function in mathematics but also for metaphor in cognitive linguistics.

157
23.2. DEFINITION CHAPTER 23. FUNCTIONS

Finally, think back to our introduction to relations. Considered graphically, a (real-valued)

function is a curve that passes the vertical line test—that is, there is no x value for which there
are more than one y value. This last model best informs our formal definition of a function—a
function is a restricted kind of relation. This works not only for functions in the specific world of
the real-number plane, but for functions between any sets.

23.2 Definition
function A function f from a set X to a set Y is a relation from X to Y such that each x ∈ X is related to
domain exactly one y ∈ Y , which we denote f (x). We call X the domain of f and Y the codomain. We
codomain write f : X → Y to mean “f is a function from X to Y .”
Let A = {1, 2, 3} and B = {5, 6, 7}. Let f = {(1, 5), (2, 5), (1, 7)}. f is not a function because 1
is related to two different items, 5 and 7, and also because 3 is not related to any item. (It is not a
problem that more than one item is related to 5, or that nothing is related to 6.) When a supposed
well defined function meets the requirements of the definition, we sometimes will say that it is well defined . Make
sure you remember, however, that being well defined is the same thing as being a function. Do not
say “well-defined function” unless the redundancy is truly warranted for emphasis.
The term codomain might sound like what you remember calling the range. However, we give
range a specific and slightly different definition for that: for a function f : X → Y , the range of f is the
set {y ∈ Y | ∃ x ∈ X such that f (x) = y}. That is, a function may be broadly defined to a set but
may actually contain pairs for only some of the elements of the codomain. An arrow diagram will
illustrate.

a1 f b1 B

a2 b2

a3 b3

a4 b4

b5
a5

Since f is a function, each element of A is at the tail of exactly one arrow. Each element of
B may be at the head of any number (including zero) of arrows. Although the codomain of f is
B = {b1 , b2 , b3 , b4 , b5 }, since nothing maps to b3 , the range of f is {b1 , b2 , b4 , b5 }.
function equality Two functions are equal if they map all domain elements to the same things. That is, for
f : X → Y and g : X → Y , f = g if for all x ∈ X, f (x) = g(x). The following result is not
surprising, but it demonstrates the structure of a proof of function equality.

(2x−4)(x+1)
Theorem 23.1 Let f : R → R as f (x) = x2 − 4 and g : R → R as g(x) = 2 . f = g.

Proof. Suppose x0 ∈ R. Then, by algebra

158
CHAPTER 23. FUNCTIONS 23.3. EXAMPLES

2
f (x0 ) = x0 − 4
2
= x0 − 2x0 + 2x0 − 4
= (x0 − 2)(x0 + 2)
0 0
= 2(x −2)(x
2
+2)
0 0
= (2x −4)(x
2
+2)
= g(x0 )

Hence, by definition of function equality, f = g. 2

Notice that we chose x0 as the variable to work with instead of x. This is to avoid equivocation
by the reuse of variables. We used x in the rules given to define f and g. x0 is the symbol we used
to stand for an arbitrary element of R which we were plugging into the rule.

23.3 Examples

Domain: R
f (x) = 3x + 4 Codomain: R
Range: R

Domain: R
f (x) = 4 − x2 Codomain: R
Range: (−∞, 4]

Domain: (−∞, 1) ∪ (1, ∞)

1
f (x) = x−1 Codomain: R
Range: (−∞, 0) ∪ (0, ∞)

Domain: R
f (x) = bxc Codomain: R or Z
Range: Z

Domain: R or Z
f (x) = 5 Codomain: R or Z
Range: {5}

Domain: R or Z
P (x) = x > −5 ∧ x < 3 Codomain: { true, false }
Range: { true, false }

159
23.4. REPRESENTATION CHAPTER 23. FUNCTIONS

Notice that in some cases, the domain codomain are open to interpretation. Accordingly, in
exercises like the previous examples you will be asked to give a “reasonable” domain and codomain.
constant function A function like f (x) = 5 whose range is a set with a single element is called a constant function. We
can now redefine the term predicate to mean a function whose codomain is { true, false }. Though
not displayed above, notice that the identity relation on a set is a function. Finally, you may recall
seeing some functions with more than one argument. Our simple definition of function still works
in this case if we consider the domain of the function to be a Cartesian product. That is, f (4, 12) is
simply f applied to the tuple (4, 12). This, in fact, is exactly how ML treats functions apparently
with more than one argument.

23.4 Representation
It is almost awkward to introduce functions in ML, since by now they are like a well-known friend
to you. However, you still have much to learn about each other. You have been told to think of an
ML function as a parameterized expression. Nevertheless, we are primarily interested in using ML
to represent the mathematical objects we discuss, and the best way to represent a function is with,
well, a function. Let us take the parabola example and a new, more subtle curve:

0 if x = 1
g(x) = x2 −1
x−1 otherwise

- fun f(x) = 4.0 - x*x;

val f = fn : real -> real

- fun g(x) = if x <= 1.0 andalso x >= 1.0

= then 0.0
= else (x * x - 1.0) / (x - 1.0);

val g = fn : real -> real

The type real -> real corresponds to what we write R → R. More importantly, the fact that a
function has a type emphasizes that it is a value (as we have seen, a first class value no less). An
identifier like f is simply a variable that happens to store a value that is a function. As we saw
apply with the function reflexiveClosure, it can be used in an expression without applying it to any
arguments.

- f;

val it = fn : real -> real

- it(5.0);

val it = ~21.0 : real

Once the value of f has been saved to it, it can be used as a function. In fact, we need not
store a function in a variable; to write an anonymous function, use the form

fn (<identifier>) => <expression>

Thus we have

160
CHAPTER 23. FUNCTIONS 23.4. REPRESENTATION

- fn (x) => (x + 3) mod 5;

val it = fn : int -> int

- it(15);

val it = 3 : int

or even

- (fn (x) => (x + 3) mod 5) (15);

val it = 3 : int

Note that

fun <identifier>1 ( <identifier>2 ) = <expression>;

is (almost—see Chapter 28) equivalent to

val <identifier>1 = fn ( <identifier>2 ) => <expression>;

161
23.4. REPRESENTATION CHAPTER 23. FUNCTIONS

Exercises

be thought of as a function with W, N, or a finite subset of

1. Complete the on-line function arrow diagrams drills found at one of these as the domain. Conventional notation would
www.ship.edu/~deensl/DiscreteMath/flash/ch4/sec4 1 say a0 = 1, a1 = 2, ak = 2k . Since a list can also be in-
/arrowdiagrams.html terpreted as a (finite) sequence, a list therefore can also be
interpreted as a function. Write a function that takes a list
2. To convert from degrees Fahrenheit to degrees Celsius, we and an integer n and returns the nth element of the list. It
need to consider that a degree Fahrenheit is 95 of a degree is unspecified how your function should behave if given an
Celsius and that degrees Fahrenheit are “off by 32,” that is, n outside the size of the list.
32◦ F corresponds to 0◦ C. Write a function fahrToCel to
convert from Fahrenheit to Celsius.
5. Write a function that takes a list and returns a function that
3. Redo your solution to Exercise 2 in val/fn form. takes an integer n and returns the nth item in the list. This
4. A mathematical sequence, for example 1, 2, 4, 8, . . . , can essentially converts the list into a function.

162
Chapter 24

Images

24.1 Definition
We noticed in the previous chapter that a function’s range may be a proper subset of its codomain;
that is, a function may take its domain to a smaller set than the set it is theoretically defined to.
This leads us to consider the idea of a function mapping a set (rather than just a single element).
A function will map a subset of the domain to a subset of the codomain. Suppose f : X → Y and
A ⊆ Y . The image of A under f is image

F (A) = {y ∈ Y | ∃ x ∈ A such that f (x) = y}

The image of a subset of the domain is the set of elements “hit” by elements in the subset. Note
that we capitalize the name of the function when we are using it to map an image. This is not
standard notation (typically you would see the same name of the function itself being used, merely
applied to a set instead of to an element), but it will help you identify more readily when the image
is being spoken of. In the following diagram, let A0 = {a3 , a4 , a5 }. Then F (A0 ) = {b2 , b4 }.

a1 f b1 B

a2 b2

a3 F(A’) b3
A’
a4 b4

b5
a5

The inverse image of a set B ⊆ Y under a function f : X → Y is inverse image

F −1 (Y ) = {x ∈ X | f (x) ∈ Y }
The image of a subset of the codomain is the set of elements that hit the subset. It is vital to

163
24.2. EXAMPLES CHAPTER 24. IMAGES

remember that although the image is defined by a set in the domain, it is a set in the codomain,
and although the inverse image is defined by a set in the codomain, it is a set in the domain. It is
also important to be able to distinguish an inverse image from an inverse (of a) function, which we
will meet in Chapter 25. Let B 0 = {b2 , b3 }. Then F −1 (B 0 ) = {a4 , a5 }. Notice that F −1 ({b3 }) = ∅.

a1 f b1 B

a2 b2

a3 B’ b3

a4 b4
F (B’) b5
a5

24.2 Examples
Proofs of propositions involving images and inverse images are a straightforward matter of applying
definitions. However, students frequently have trouble with them, possibly because they are used
to thinking about functions operating on elements rather than on entire sets of elements. The
important thing to remember is that images and inverse images are sets. Therefore, you must put
the techniques you learned for set proofs to work.

Theorem 24.1 Let f : X → Y and A, B ⊆ X. Then F (A ∪ B) = F (A) ∪ F (B).

Do not be distracted by the new definitions. This is a proof of set equality, Set Proof Form 2.
Moreover, F (A ∪ B) ⊆ Y . Choose your variable names in a way that shows you understand this.

Proof. Suppose y ∈ F (A ∪ B). By the definition of image, there exists an x ∈ A ∪ B

such that f (x) = y. By the definition of union, x ∈ A or x ∈ B. Suppose x ∈ A. Then,
again by the definition of image, y ∈ F (A). Similarly, if x ∈ B, then y ∈ F (B). Either
way, by definition of union, y ∈ F (A) ∪ F (B). Hence F (A ∪ B) ⊆ F (A) ∪ F (B) by
definition of subset.
Next suppose y ∈ F (A) ∪ F (B). By definition of union, either y ∈ F (A) or F (B).
Suppose y ∈ F (A). By definition of image, there exists an x ∈ A such that f (x) = y.
By definition of union, x ∈ A ∪ B. Again by definition of image, y ∈ F (A ∪ B). The
argument is similar if y ∈ F (B). Hence by definition of subset, F (A)∪F (B) ⊆ F (A∪B).
Therefore, by definition of set equality, F (A ∪ B) = F (A) ∪ F (B). 2

The inverse image also may look intimidating, but reasoning about them is still a matter of set
manipulation. Just remember that an inverse image is a subset of the domain.

Theorem 24.2 Let f : X → Y and A, B ⊆ Y . Then F −1 (A − B) = F −1 (A) − F −1 (B).

164
CHAPTER 24. IMAGES 24.3. MAP

Proof. Suppose x ∈ F −1 (A − B). By definition of inverse image, f (x) ∈ A − B. By

definition of set difference, f (x) ∈ A and f (x) ∈
/ B. Again by definition of inverse image,
x ∈ F −1 (A) and x ∈ / F −1 (B). Again by definition of set difference, x ∈ F −1 (A) −
F −1 (B). Hence, by definition of subset, F −1 (A − B) ⊆ F −1 (A) − F −1 (B).
Next suppose x ∈ F −1 (A) − F −1 (B). By definition of set difference, x ∈ F −1 (A) and
x∈/ F −1 (B). By definition if inverse image, f (x) ∈ A and f (x) ∈
/ B. Again by definition
of set difference f (x) ∈ A − B. Again by definition of inverse image, x ∈ F −1 (A − B).
Hence, by definition of subset, F −1 (A) − F −1 (B) ⊆ F −1 (A − B).
Therefore, by definition of set equality, F −1 (A − B) = F −1 (A) − F −1 (B). 2

These are classic examples of the analysis/synthesis process. We take apart expressions by
definition, and by definition we reconstruct other expressions.

24.3 Map
Frequently it is useful to apply an operation over an entire collection of data, and therefore to get a
collection of results. For example, suppose we wanted to square every element in a list of integers.

- fun square([]) = nil

= | square(x::rest) = x*x :: square(rest);

val square = fn : int list -> int list

- square([1,2,3,4,5]);

val it = [1,4,9,16,25] : int list

We can generalize this pattern using the fact that functions are first class values. A program
traditionally called map takes a function and a list and applies that function to the entire list.

- fun map(func, []) = nil

= | map(func, x::rest) = func(x) :: map(func, rest);

val map = fn : (’a -> ’b) * ’a list -> ’b list

- map(fn (x) => x*x, [1,2,3,4,5]);

val it = [1,4,9,16,25] : int list

Keep in mind that we are considering lists in general, not lists as representing sets. However,
map very naturally adapts to our notion of image, using the list representation of sets.

- fun image(f, set) = makeNoRepeats(map(f, set));

165
24.3. MAP CHAPTER 24. IMAGES

Exercises

In Exercises 1–11, assume f : X → Y . Prove, unless you are asked 11. If A ⊆ Y , F (F −1 (A)) ⊆ A. This is false; give a counterex-
to give a counterexample. ample.
1. If A, B ⊆ X, F (A ∩ B) ⊆ F (A) ∩ F (B).
12. Sometimes it is useful to write a program that operates on
2. If A, B ⊆ X, F (A ∩ B) = F (A) ∩ F (B). This is false; give a
a list but also takes some extra arguments. For example
counterexample.
a program scale might take a list of integers and another
3. If A, B ⊆ X, F (A) − F (B) ⊆ F (A − B). integer and return a list of the old integers multiplied by
4. If A, B ⊆ X, F (A − B) ⊆ F (A) − F (B). This is false; give the other integer. Write a program mapPlus that takes
a counterexample. a function, a list, and an extra argument and applies the
5. If A ⊆ B ⊆ Y , then F −1 (A) ⊆ F −1 (B). function to every element in the list and the extra argu-
ment. For example, mapPlus(fn (x, y) => x * y, [1, 2,
6. If A, B ⊆ Y , then F −1 (A ∪ B) = F −1 (A) ∪ F −1 (B). 3, 4], 2) would return [2, 4, 6, 8].
7. If A, B ⊆ Y , then F −1 (A ∩ B) = F −1 (A) ∩ F −1 (B).
13. Write a program mapPlusMaker that takes a function and
8. If A ⊆ X, A ⊆ F −1 (F (A)).
returns a function that will take a list and an extra argu-
9. If A ⊆ X, A = F −1 (F (A)). This is false; give a counterex- ment and apply the given function to all the elements in
ample. the list. For example, mapPlusMaker(fn (x, y) => x * y)
10. If A ⊆ Y , F (F −1 (A)) ⊆ A. would return the scale program described above.

166
Chapter 25

Function properties

25.1 Definitions
Some functions have certain properties that imply that they behave in predictable ways. These
properties will come in particularly handy in the next chapter when we consider the composition of
functions.
We have seen examples of functions where some elements in the codomain are hit more than
once, and functions where some are hit not at all. Two properties give names for (the opposite) of
these situations. A function f : X → Y is onto if for all y ∈ Y , there exists an x ∈ X such that onto
f (x) = y. In other words, an onto function hits every element in the codomain (possibly more than
once). If B ⊆ Y and for all y ∈ B there exists an x ∈ X such that f (x) = y, then we say that f is
onto B. A function is one-to-one if for all x1 , x2 ∈ X, if f (x1 ) = f (x2 ), then x1 = x2 . If any two one-to-one
domain elements hit the same codomain element, they must be equal (compare the structure of this
definition with the definition of antisymmetry); this means that no element of the codomain is hit
more than once. A function that is both one-to-one and onto is called a one-to-one correspondence; one-to-one
in that case, every element of the codomain is hit exactly once. correspondence

A A A

a1 f b1 B a1 f b1 B a1 f b1 B

a2 b2 a2 b2 a2 b2

b3 b3 b3
a3 a3 a3

a4 b4 a4 b4 a4 b4

a5 a5
b5 b5

Onto but not one-to-one One-to-one but not onto One-to-one correspondence

Sometimes onto functions, one-to-one functions, and one-to-one correspondences are called sur-
jections, injections, and bijections, respectively.
Last time you proved that F (A ∩ B) ⊆ F (A) ∩ F (B) but gave a counterexample against F (A) ∩
F (B) ⊆ F (A ∩ B). If you look back at that counterexample, you will see that the problem is that
your f is not one-to-one. Thus
Theorem 25.1 If f : X → Y , A, B ⊆ X, and f is one-to-one, then F (A) ∩ F (B) ⊆ F (A ∩ B).
Proof. Suppose f : X → Y , A, B ⊆ X, and f is one-to-one.
Now suppose y ∈ F (A) ∩ F (B). Then y ∈ F (A) and y ∈ F (B) by definition of inter-
section. By the definition of image, there exist x1 ∈ A such that f (x1 ) = y and x2 ∈ B

167
25.2. CARDINALITY CHAPTER 25. FUNCTION PROPERTIES

such that f (x2 ) = y. By definition of one-to-one, x1 = x2 . By substitution, x1 ∈ B. By

definition of intersection, x1 ∈ A ∩ B. Therefore, by definition of image, y ∈ F (A ∩ B).
2
Note that with y ∈ F (A) ∩ F (B), we had to assume two x’s, one in A and one in B. Only
then could we apply the fact that f is one-to-one and prove the two x’s equal. This is extremely
important. Notice also that by putting this result and Exercise 1 of Chapter 24 together, we conclude
that if f is one-to-one, F (A ∩ B) = F (A) ∩ F (B).

25.2 Cardinality
If a function f : X → Y is onto, then every element in Y has at least one domain element seeking
it, but two elements in X could be rivals for the same codomain element. If it is one-to-one, then
every domain element has a codomain element all to itself, but some codomain elements may be
left out. If it is a one-to-one correspondence, then everyone has a date to the dance. When we are
considering finite sets, we can use this as intuition for comparing the cardinalities of the domain and
codomain. For example, f could be onto only if |X| ≥ |Y | and one-to-one only if |X| ≤ |Y |. If f is
a one-to-one correspondence, then it must be that |X| = |Y |.
How could we prove this, though? In fact, we have never given a formal definition of cardinality.
The one careful proof we did involving cardinality was Theorem 15.1 which relies in part on this
chapter. Rather than proving that the existence of a one-to-one correspondence implies sets of equal
cardinality cardinality, we simply define cardinality so. Two finite sets X and Y have the same cardinality if
there exists a one-to-one correspondence from X to Y . We write |X| = n for some n ∈ N if there
exists a one-to-one correspondence from {1, 2, . . . , n} to X, and define |∅| = 0.
(There is actually one wrinkle in this system: it lacks a formal definition of the term finite. Tech-
nically, we should define a set X to be finite if there exists an n ∈ N and a one-to-one correspondence
from {1, 2, . . . , n} to X. Then, however, we would need to use this to define cardinality. We would
prefer to keep the definition of cardinality separate from the notion of finite subsets of N to make it
easier to extend the concepts to infinite sets later. We are also are assuming that a set’s cardinality
is unique, or that the cardinality operator | | is well-defined as a function. This can be proven, but
it is difficult.)
Now we can use cardinality formally.
Theorem 25.2 If A and B are finite, disjoint sets, then |A ∪ B| = |A| + |B|.
Proof. Suppose A and B are finite, disjoint sets. By the definition of finite, there exist
i, j ∈ N and one-to-one correspondences f : {1, 2, . . . , i} → A and g : {1, 2, . . . , j} → B.
Note that |A| = i and |B| = j. Define a function h : {1, 2, . . . , i + j} → A ∪ B as

f (x) if x ≤ i
h(x) =
g(x − i) otherwise
Now suppose y ∈ A ∪ B. Then either y ∈ A or y ∈ B by definition of union, and it is
not true that y ∈ A and y ∈ B by definition of disjoint. Hence we have two cases:
Case 1: Suppose y ∈ A and y ∈ / B. Then, since f is a onto, there existss a k ∈ {1, 2, . . . , i}
such that f (k) = y. By our definition of h, h(k) = y. Further, suppose ` ∈ {1, 2, . . . , i+j}
and h(`) = y. Suppose ` > i; then y = h(`) = g(` − i) ∈ B, contradiction; hence ` ≤ i.
This implies h(`) = f (`), and since f is one-to-one, ` = k.
Case 2: Suppose y ∈ B and y ∈ / A. Then, since g is onto, there exist a k ∈ {1, 2, . . . , j}
such that g(k) = y. By our definition of h, h(k + i) = g(k) = y. Further, suppose
` ∈ {1, 2, . . . , i+j} and h(`) = y. Suppose ` ≤ i; then y = h(`) = f (`) ∈ A, contradiction;
hence ` > i. This implies h(`) = g(` − i), and since g is one-to-one, ` − i = k or ` = k + i.
In either case, there exists a unique element in m ∈ {1, 2, . . . , i+j} (m = k and m = k+i,
respectively) such that h(m) = y. Hence h is a one-to-one correspondence. Therefore,
|A ∪ B| = i + j = |A| + |B|. 2

168
CHAPTER 25. FUNCTION PROPERTIES 25.3. INVERSE FUNCTIONS

25.3 Inverse functions

Our work with images and inverse images, particularly problems like proving A ⊆ F −1 (F (A)), force
us to think about functions as taking us on a journey from a place in one set to a place in another.
The inverse image addressed the complexities of the return voyage. Suppose f : X → Y were not
onto, specifically that for some x1 , x2 ∈ X and y ∈ Y , both f (x1 ) = y and f (x2 ) = y. f (x1 ) will
take us to y (or F ({x1 }) will take us to {y}), but we have in a sense lost our way: Going backwards
through f might take us to x2 instead of x1 ; F −1 (y) = {x1 , x2 } 6= {x1 }. If f is one-to-one, this is
not a problem because only one element will take us to y; that way, if we start in X and go to Y ,
we know we can always retrace our steps back to where we were in X. However, this does not help
if we start in Y and want to go to X; unless f is also onto, there may not be a way from X to that
place in Y .
Accordingly, if f : X → Y is a one-to-one correspondence, we define the inverse of f , to be the inverse
relation

f −1 = {(y, x) ∈ Y × X | f (x) = y}
It is more convenient to call this the inverse function of f , but the title “function” does not come inverse function
for free.
Theorem 25.3 If f : X → Y is a one-to-one correspondence, then f −1 : Y → X is well-defined.
Proof. Suppose y ∈ Y . Since f is onto, there exists x ∈ X such that f (x) = y. Hence
(y, x) ∈ f −1 or f −1 (y) = x.
Next suppose (y, x1 ), (y, x2 ) ∈ f −1 or f −1 (y) = x1 and f −1 (y) = x2 . Then f (x1 ) = y
and f (x2 ) = y. Since f is one-to-one, x1 = x2 .
Therefore, by definition of function, f −1 is well-defined. 2
Do not confuse the inverse of a function and the inverse image of a function. Remember that the
inverse image is a set, a subset of the domain, applied to a subset of the codomain; the inverse image
always exists. The inverse function exists only if the function itself is a one-to-one correspondence;
it takes an element of the codomain and produces an element of the domain.
As an application of functions and the properties discussed here, we consider an important
concept in information security. A hash function is a function that takes a string (that is, a variable- hash function
length sequence of characters) and returns fixed-length string. Since the output is smaller than
the input, a hash function could not be one-to-one. However, a good hash function (often called a
one-way hash function) should have the following properties:
• It should be very improbable for two arbitrary input strings to produce the same output.
Obviously some strings will map to the some output, but such pairs should be very difficult to
find. In this way, the function should be “as one-to-one as possible” and any collisions should
happen without any predictable pattern.
• It should be impossible to approximate an inverse for it. Since it is not one-to-one, a true
inverse is impossible, but in view here is that given an output string, it should be very difficult
to produce any input string that could be mapped to it.
The idea is that a hash function can be used to produce a fingerprint of a document or file
which proves the document’s existence without revealing its contents. Suppose you wanted to prove
to someone that you have a document that they also have, but you do not want to make that
document public. Instead, you could compute the hash of that document and make that public.
All those who have the document already can verify the hash, but no one who did not have the
document could invert the hash to reproduce it. Another use is time stamping. Suppose you have
a document that contains intellectual property (say, a novel or a blueprint) for which you want to
ensure that you get credit. You could compute the hash of the document and make the hash public
(for example, printing it in the classified ads of a newspaper); some time later, when you make the
document itself public, you will have proof that you knew its contents on or before the date the hash
was published.

169
25.3. INVERSE FUNCTIONS CHAPTER 25. FUNCTION PROPERTIES

Exercises

3. If A ∈ X and f is one-to-one, then F −1 (F (A)) ⊆ A.

1. Which of the following are one-to-one and/or onto? (Assume 4. If A ∈ Y and f is onto, then A ⊆ F (F −1 (A)).
a domain and range of R.)
5. If f is onto, then the range of f is Y .
(a) f (x) = x. 6. If for all non-empty A, B ⊆ X, F (A ∩ B) = F (A) ∩ F (B),
(b) f (x) = x2 . then f is a one-to-one correspondence.
(c) f (x) = ln x. 7. If f is one-to-one, then |X| ≤ |Y |.
(d) f (x) = x3 . 8. Let A be a set. iA is a one-to-one correspondence.
(e) f (x) = 31 x3 − 4x. 9. If f is onto, then |X| ≥ |Y |.
In Exercises 2–9, assume f : X → Y . Prove. 10. If A ⊆ B, then |B − A| = |B| − |A|.
2. If A, B ∈ X and f is one-to-one, then F (A − B) ⊆ F (A) − 11. If A is a finite set and a ∈ A, then |{ {a} ∪ A0 | A0 ∈
F (B). P(A − {a})}| = |P(A − {a})|.

170
Chapter 26

Function composition

26.1 Definition
We have seen that two relations can be composed to form a new relation, say, given relations R from
X to Y and S from Y to Z:

R ◦ S = {(a, c) ∈ X × Z | ∃ b ∈ Y such that (a, b) ∈ R and (b, c) ∈ S}

This easily can be specialized for the composition of functions. Suppose f : X → Y and
g : Y → Z are functions. Then the composition of f and g is function composition

g ◦ f = {(x, z) ∈ X × Z | z = g(f (x))}

This is merely a rewriting of the definition of composition of relations. The interesting part is that
g ◦ f is a function.
Theorem 26.1 If f : A → B and g : B → C are functions, then g ◦ f : A → C is well-defined.
Proof. Suppose a ∈ A. Since f is a function, there exists a b ∈ B such that f (a) = b.
Since g is a function, there exists a c ∈ C such that g(b) = c. By definition of composition,
(a, c) ∈ g ◦ f , or g ◦ f (a) = c.
Next suppose (a, c1 ), (a, c2 ) ∈ g ◦ f , or g ◦ f (a) = c1 and g ◦ f (a) = c2 . By definition of
composition, there exist b1 , b2 such that f (a) = b1 , f (a) = b2 , g(b1 ) = c1 , and g(b2 ) = c2 .
Since f is a function, b1 = b2 . Since g is a function, c1 = c2 .
Therefore, by definition of function, g ◦ f is well-defined. 2

g f
C
A
a1
c1

c2
b1

a3
c3
b2
f

B
b3

171
26.2. FUNCTIONS AS COMPONENTS CHAPTER 26. FUNCTION COMPOSITION

The most intuitive way to think of composition is to use the machine model of functions. We
simply attach two machines together, feeding the output slot of the one into the input slot of the
other. To make this work, the one output slot must fit into the other input slot, and the one machine’s
output material must be appropriate input material for the other machine. Mathematically, we
describe this by considering the domains and codomains. Given f : A → B and g : C → D, g ◦ f is
defined only for B = C, though it could easily be extended for the case where B ⊆ C.
Function composition√happens quite frequently without our noticing it. For example, the real-
√ f (x) = x − 12 can be consiered the composition of the functions g(x) = x − 12
valued function
and h(x) = x.

26.2 Functions as components

In ML, a function application is like any other expression. It can be the argument to another
function application, and in this way we do function composition. The sets used for domain and
codomain are represented by types, and so when doing function composition, the types must match.
The important matter is to consider functions as components. Our approach to building algorthms
from this point on is the defining and applying of functions, like so many interlocking machines.
For our example, suppose we want to write a program that will predict the height of a tree
after a given number of years. The prediction is calculated based on three pieces of information:
the variety of tree (assume that this determines the tree’s growth rate), the number of years, and
the initial height of the tree. Obviously this assumes tree growth is linear, as well as making other
assumptions, with nothing based on any real botany.
Suppose we associate the following growth rates in terms of units per month for the following
varieties of tree.

- datatype treeGenus = Oak | Elm | Maple | Spruce | Fir | Pine | Willow;

datatype treeGenus = Elm | Fir | Maple | Oak | Pine | Spruce | Willow

- fun growthRate(Elm) = 1.3

val growthRate = fn : treeGenus -> real

growthRate is a function mapping tree genera to their rates of growth. We have overlooked
something, though. There is a larger categorization of these trees that will afffect how this growth
rate is applied: Coniferous trees grow all year round, but deciduous trees are inactive during the
winter. The number of months used for growing, therefore, is a function of the taxonomic division.

- datatype treeDivision = Deciduous | Coniferous;

datatype treeDivision = Coniferous | Deciduous

- fun growingMonths(Deciduous) = 6.0

= | growingMonths(Coniferous) = 12.0;

val growingMonths = fn : treeDivision -> real

172
CHAPTER 26. FUNCTION COMPOSITION 26.3. PROOFS

Because of the hierarchy of taxonomy, we can determine a tree’s division based on its genus. To
avoid making the mapping longer than necessary, we pattern-match on coniferous trees and make
deciduous the default case.

- fun division(Pine) = Coniferous

= | division(Spruce) = Coniferous
= | division(Fir) = Coniferous
= | division(x) = Deciduous;

val division = fn : treeGenus -> treeDivision

Since division maps treeGenus to treeDivision and growingMonths maps treeDivision to real,
their composition maps treeGenus to real.
The predicted height of the tree is of course the initial height plus the growth rate times growth
time. The growth rate is calculated from the genus, and the growth time is calculated by multiplying
years times the growing months. Thus we have

- fun predictHeight(genus, initial, years) =

= initial + (years * growingMonths(division(genus))
= * growthRate(genus));

val predictHeight = fn : treeGenus * real * real -> real

We can generalize this composition process by writing a function that takes two functions and
returns a composed function. Just for the sake of being fancy, we can use it to rewrite the expression
growingMonths(division(genus)).

- fun compose(f, g) = fn (x) => g(f(x));

val compose = fn : (’a -> ’b) * (’b -> ’c) -> ’a -> ’c

- fun predictHeight(genus, initial, years) =

= initial + (years * compose(division, growingMonths)(genus)
= * growthRate(genus));

val predictHeight = fn : treeGenus * real * real -> real

26.3 Proofs
Composition finally gives us enough raw material to prove some fun results on functions. Remember
in all of these to apply the definitions carefully and also to follow the outlines for proving propositions
in now-standard forms. For example, it is easy to verify visually that the composition of two one-to-
one functions is also a one-to-one function. If no two f or g arrows collide, then no g ◦ f arrows have
a chance of colliding. Proving this result relies on the definitions of composition and one-to-one.

Theorem 26.2 If f : A → B and g : B → C are one-to-one, then g ◦ f : A → C is one-to-one.

Proof. Suppose f : A → B and g : B → C are one-to-one. Now suppose a1 , a2 ∈ A and

c ∈ C such that g◦f (a1 ) = c and g◦f (a2 ) = c. By definition of composition, g(f (a1 )) = c
and g(f (a2 )) = c. Since g is one-to-one, f (a1 ) = f (a2 ). Since f is one-to-one, a1 = a2 .
Therefore, by definition of one-to-one, g ◦ f is one-to-one. 2

173
26.3. PROOFS CHAPTER 26. FUNCTION COMPOSITION

In the long run, we want to prove that something is one-to-one, so we need to gather the materials
to syntesize the definition. It means we can pick any two elements from the domain of g ◦ f , and if
they map to the same element, they themselves must be the same. Here is how the proof connects
with a visual verification.

g f g f
C C
A A
a1
c

f f

g g

B B
Suppose f : A → B and g : B → C are one-to-one. Now suppose a1 , a2 ∈ A and c ∈ C such that g ◦
f (a1 ) = c and g ◦ f (a2 ) = c.

g f g f
C C
A A
a1 a1
c c

a2 a2
f (a1) f (a1)

f (a2)

f f (a2) f

g g

B B
By definition of composition, g(f (a1 )) = c and Since g is one-to-one, f (a1 ) = f (a2 ).
g(f (a2 )) = c.

g f g f
C C
A A
a1 a1
a2 c a2 c

f (a1)

f (a2)

f f

g g

B B
Since f is one-to-one, a1 = a2 . Therefore, by definition of one-to-one, g ◦ f is one-to-
one. 2

174
CHAPTER 26. FUNCTION COMPOSITION 26.3. PROOFS

Our intuition about inverses from the previous chapter conceived of functions as taking us from
a spot in one set to a spot in another. Only if the function were a one-to-one correspondence could
we assume a “round trip” (using, an inverse function) existed. Since the net affect of a round trip
is getting you nowhere, you may conjecture that if you compose a function with its inverse, you will
get an identify function. The converse is true, too.

Theorem 26.3 Suppose f : A → B is a one-to-one correspondence and g : B → A. Then g = f −1

if and only if g ◦ f = iA .

Proof. Suppose that g = f −1 . (Since f is a one-to-one correspondence, f −1 is well-

definied.) Suppose a ∈ A. Then

g ◦ f (a) = g(f (a)) by definition of composition

= f −1 (f (a)) by substitution
= a by definition of inverse function

Therefore, by function equality, g ◦ f = iA .

For the converse (proving that if you compose a function and get the identity function,
it must be the inverse), see Exercise 9. 2

175
26.3. PROOFS CHAPTER 26. FUNCTION COMPOSITION

Exercises

7. If f : A → B and g : B → C, and g ◦ f is one-to-one, then f

1. Write functions to compute the circumference and area of is one-to-one.
a circle based on radius, the surface area and volume of a 8. If f : A → B and g : B → C, and g ◦ f is onto, then g is
sphere based on radius, and the surface area and volume of a onto.
cylinder based on radius of base and height. Use real values
and reuse functions as much as possible. 9. If f : A → B is a one-to-one correspondence, g : B → A,
2. Make a data type that represents a set of six of your friends. and g ◦ f = iA , then g = f −1 .
Then write a function cpo that maps these friends to their 10. If f : A → B, g : A → B, h : B → C, h is one-to-one, and
CPO (campus post office) address. At the milbox section of h ◦ f = h ◦ g, then f = g.
the CPO, boxes 1–2039 are on the west wall, 2040–2579 are
on the north wall, and 2560–3299 are on the east wall. Make 11. If f : A → B and g : B → C are both one-to-one corre-
a data type for the set of these three walls, and write a func- spondences, then the inverse function (g ◦ f )−1 exists and
tion that maps CPO address to their appropriate wall. Then (g ◦ f )−1 = f −1 ◦ g −1 . (Hint: Notice that this requires you
write a function (using the two functions you just wrote) that to prove two things. However, the first thing can be proven
maps your friends to the wall where one can find their box. quickly from previous exercises in this chapter.)
Prove. 12. Let G ◦ F (X) be the image of a set X under the function
3. If f : A → B, then f ◦ iA = f . g ◦ f , for functions f and g; similarly define inverse images
4. If f : A → B, then iB ◦ f = f . for composed functions. If f : A → B, g : B → C and
5. If f : A → B, g : B → C, and h : C → D, then X ⊆ C, then (G ◦ F )−1 (X) = F −1 (G−1 (X).
h ◦ (g ◦ f ) = (h ◦ g) ◦ f . 13. Let B be a set of subsets of a set A (for example, B could
6. If f : A → B and g : B → C are both onto, then g ◦ f is also be P(A)). Cardinality (that is, the relation R on B such
onto. that (X, Y ) ∈ R if |X| = |Y |) is an equivalence relation.

176
Chapter 27

Special Topic: Countability

Both our informal definition of cardinality in Chapter 3 and the more careful one in Chapter 25 were
restricted to finite sets. This was in deference to an unspoken assumption that the cardinality of a set
ought to be something, that is, a whole number. As has been mentioned already, we cannot merely
say that a set like Z has cardinality infinity. Infinity is not a whole number—or even a number at
all, if one has in mind the number sets N, W, Z, Q, R, and C. The definition of cardinality taken
at face value, however, does not guarantee that the cardinality is something; it merely inspired us
to define what the operator || means by comparing a set to a subset of N. Indeed, the definition
of cardinality merely gives us a way to say that two sets have the same cardinality as each other.
What would happen if we extended this bare notion to infinite sets—that is, drop the term finite,
which we have not defined anyway?
Two sets X and Y have the same cardinality if there exists a one-to-one correspondence from cardinality
X to Y . We know from Exercise 13 of Chapter 26 that this relation partitions sets into equivalence
classes. (Your proof did not depend on the sets being finite, did it?) We say that a set X is finite if finite
it is the empty set or if there exists an n ∈ W such that X has the same cardinality as {1, 2, . . . , n}.
Otherwise, we say the set is infinite. infinite
It is worth remembering that the term infinite is defined negatively; this also makes sense ety-
mologically, since the word is simply finite with a negative prefix. Did you ever wonder what was
so infinite about the grammatical term infinitive? The term makes more sense in Latin, where the
pronoun subject of a verb is given by a suffix: ambulo, ambulas, ambulat is the conjugation “I walk,”
“you walk,” “he/she/it walks,” o, s, and t being the singulars for first, second, and third person,
respectively. An infinitive (in this case, ambulare) has no pronominal suffix. This does not mean
that it goes on forever, just that it, literally, has no proper ending. Since the caboose has fallen
out of use in American railways, can we now say that freight trains are therefore infinite? Perhaps
not, since even though there is no longer a formal ending car, they are at least terminated by a box
called a FRED (which stands for Flashing Rear-End Device; no kidding).
On a more serious note, this raises the question, are all infinities equal? More than merely raising
the question, it gives a rigorous way to phrase it: Are all infinite sets in the same equivalence class?
Our intuition could go either way. On one hand, one might assume that infinity is infinity without
qualification. Thus

|N| = |Z| = |Q| = |R|

On the other hand, every integer is a rational number, and the real numbers are all the rationals
plus a whole many more. Perhaps

|N| < |Z| < |Q| < |R|

Which could it be?
Let us ask a more general question first. Is it possible for a proper subset to have the same
cardinality as the whole set? Take Z and N for example. Both are infinite, but obviously N ⊆ Z and
N 6= Z.

177
CHAPTER 27. SPECIAL TOPIC: COUNTABILITY

Theorem 27.1 N and Z have the same cardinality.

This calls for a proof of existence. The definition of cardinality requires that a one-to-one
correspondence exists between the two sets. We must either prove that it is impossible for such a
function not to exist or propose a candidate function and demonstrate that it meets the requirements.
We use the latter strategy.

Proof. Define a function f : N → Z thus:

n div 2 if n is even
f (n) =
−(n div 2) otherwise

Now suppose n0 ∈ Z. We have three cases: n0 = 0, n0 > 0, and n0 < 0. Suppose n0 = 0;

then f (1) = 1 div 2 = 0 = n0 . Suppose n0 > 0; then f (2n0 ) = 2n0 div 2 = n0 . Suppose
n0 < 0; then f ((−2n0 ) + 1) = −(((−2n0 ) + 1) div 2) = −(−n0 ) = n0 . In each case, there
exists an element of N that maps to n0 and hence f is onto.
Next suppose n1 , n2 ∈ Z and f (n1 ) = f (n2 ). We have three cases: f (n1 ) = 0, f (n1 ) > 0,
and f (n1 ) < 0.
Case 1: Suppose f (n1 ) = 0. Since −0 = 0, then by our formula, n1 div 2 = 0 whether
/ N, n1 = 1. By substitution and
n1 is even or odd. Then n1 = 0 or n1 = 1, but since 0 ∈
similar reasoning, n2 = 1.
Case 2: Suppose f (n1 ) > 0. Then n1 is even. Moreover, by our formula, either n1 =
2 · f (n1 ) or n1 = 2 · f (n1 ) + 1. Since n1 is even, n1 = 2 · f (n1 ). By substitution and
similar reasoning, n2 = 2 · f (n1 ).
Case 3: Suppose f (n1 ) < 0. Then n1 is odd. Moreover, by our formula, either n1 =
2 · (−f (n1 )) or n1 = 2 · (−f (n1 )) + 1. Since n1 is odd, n1 = 2 · (−f (n1 )) + 1. By
substitution and similar reasoning, n2 = 2 · (−f (n1 )) + 1.
In each case, n1 = n2 . Hence f is onto.
Hence f is a one-to-one correspondence, and therefore Z has the same cardinality as N.
2

So it is possible for a proper subset to have as many (infinite) elements as the set that contains
it. Moreover, |N| = |Z|, so the rest of the equality chain seems plausible. But is it true? Before
countably infinite asking again a slightly different question, we say that a set X is countably infinite if X has the same
countable cardinality as N. A set X is countable if it is finite or countably infinite. The idea is that we could
count every element in the set by assigning a number 1, 2, 3, . . . to each one of them, even if it took
uncountable us forever. A set X is uncountable if it is not countable. This gives us a new question: Are all sets
countable?
Let us try Q next. The jump from N to Z was not so shocking since Z has only about “twice”
as many elements as N. Q, however, has an infinite number of elements between 0 and 1 alone, and
1
an infinite number again between 0 and 10 . Nevertheless,

Theorem 27.2 Q has the same cardinality as N.

A formal proof is delicate, but this ML program demonstrates the main idea:

- fun cantorDiag(n) =
= let
= fun gcd(a, 0) = a
= | gcd(a, b) = gcd(b, a mod b);
= fun reduce(a, b) =
= let val comDenom = gcd (a, b);
= in (a div comDenom, b div comDenom)

178
CHAPTER 27. SPECIAL TOPIC: COUNTABILITY

= end;
= fun nextRatio(a, b) =
= if a = 1 andalso b mod 2 = 1 then (1, b + 1)
= else if b = 1 andalso a mod 2 = 0 then (a + 1, 1)
= else if (a + b) mod 2 = 1 then (a + 1, b - 1)
= else (a - 1, b + 1);
= fun contains((a, b), []) = false
= | contains((a, b), (c, d)::rest) =
= if a = c andalso b = d then true else contains((a, b), rest);
= val i = ref 1;
= val currentRatio = ref (1, 1);
= val usedList = ref [];
= in
= (while !i < n do
= (usedList := !currentRatio :: !usedList;
= while contains(reduce(!currentRatio), !usedList) do
= currentRatio := nextRatio(!currentRatio);
= i := !i + 1);
= !currentRatio)
= end;

val cantorDiag = fn : int -> int * int

- map(cantorDiag, [1,2,3,4,5,6,7,8]);

val it = [(1,1),(1,2),(2,1),(3,1),(1,3),(1,4),(2,3),(3,2)] : (int * int) list

What this function computes is a diagonal walk over the set of positive rationals, invented by
Cantor, illustrated here. We lay out all ratios of integers as an infinite 2 × 2 grid and weave our way
around them, assigning a natural number to each in the order that we come to them, but skipping
over ones that are equivalent to something we have seen before.

1 1 1 1 1
1 2 3 4 5

2 2 2 2 2
1 2 3 4 5

3 3 3 3 3
1 2 3 4 5

4 4 4 4 4
1 2 3 4 5

This function hits every positive rational exactly once, so it is a one-to-one correspondence. Thus
at least Q+ is countably infinite, and by a process similar to the proof of Theorem 27.1, we can
bring Q into the fold by showing its cardinal equivalence to Q+ . Thus

|N| = |Z| = |Q|

This strongly suggests that all infinities are equal, but the tally is not finished. We still need to
consider R. To simplify things, we will restrict ourselves just to the set (0, 1), that is, only the real
numbers greater than 0 and less that 1. This might sound like cheating, but it is not.

Theorem 27.3 (0, 1) has the same cardinality as R.

179
CHAPTER 27. SPECIAL TOPIC: COUNTABILITY

For this we will use a geometric argument. Imagine taking the line segment (0, 1) and rolling it
up into a ball with one point missing where 0 and 1 would meet. Then place the ball on the real
number line, so .5 on the ball is tangent with 0 on the line. Now define a function f : (0, 1) → R
so that to find f (x) we draw a line from the 0/1 point on the ball through x on the ball; the value
f (x) is the point where the drawn line hits the real number line. Proving that this function is a
one-to-one correspondence is a matter of using analytic geometry to find a formula for f and then
showing that every R is hit from exactly one value on the ball.

0 or 1
.75 .5

.25

This appears to cut our task down immensely. To prove all infinities (that we know of) equal,
all we need to show is that any of the sets already proven countable can be mapped one-to-one and
onto the simple line segment (0, 1). However,

Theorem 27.4 (0, 1) is uncountable.

Since Countability calls for an existence proof, uncountability requires a non-existence proof, for
which we will need a proof by contradiction.

Proof. Suppose (0, 1) is countable. Then there exists a one-to-one correspondence

f : N → (0, 1). We will use f to give names to the all the digits of all the numbers in
(0, 1), considering each number in its decimal expansion, where each ai,j stands for a
digit.:

f (1) = 0.a1,1 a1,2 a1,3 . . . a1,n . . .

f (2) = 0.a2,1 a2,2 a2,3 . . . a2,n . . .
..
.
f (n) = 0.an,1 an,2 an,3 . . . an,n . . .
..
.

Now construct a number d = 0.d1 d2 d3 . . . dn . . . as follows

6 1
1 if an,n =
dn =
2 if an,n = 1

Since d ∈ (0, 1) and f is onto, there exists an n0 ∈ N such that f (n0 ) = d. Moreover,
f (n0 ) = 0.an0 ,1 an0 ,2 an0 ,3 . . . an0 ,n0 . . ., so d = 0.an0 ,1 an0 ,2 an0 ,3 . . . an0 ,n0 . . . by substitution.
However, by how we have defined d, dn 6= an0 ,n0 and so d 6= 0.an0 ,1 an0 ,2 an0 ,3 . . . an0 ,n0 . . .,
a contradiction.
Therefore (0, 1) is not countable. 2

Corollary 27.1 R is uncountable.

Proof. Theorems 27.3 and 27.4. 2

Anticlimactic? Perhaps. But also profound. There a just as many naturals as integers as
rationals. But there are many, many more reals.
This chapter draws heavily from Epp[5].

180
Part VII

Program

181
Chapter 28

Recursion Revisited

First, a word of introduction for this entire part on functional programming. We have already seen
many of the building blocks of programming in the functional paradigm, and in the next few chapters
we will put them together in various applications and also learn certain advanced techniques. The
chapters do not particularly depend on each other and could be resequenced. The order presented
here has the following rationale: This chapter will ease you into the sequence with a fair amount of
review, and will also look at recursion from its mathematical foundations. Chapter 29 will present
the datatype construct as a much more powerful tool than we have seen before, particularly in how
the idea of recursion can be extended to types. Chapter 30 is the climax of the part, illustrating
the use of functional programming in the fixed-point iteration algorithm strategy; it is also the only
chapter of the book that requires a basic familiarity with differential calculus, so if you have not
taken the first semester of calculus, ask a friend who has to give you the five-minute explanation of
what a derivative is. Chapter 31 is intended as a breather following Chapter 30, applying our skills
to computing combinations and permutations.

28.1 Scope
In Chapter 23, we implied that

fun <identifier>1 ( <identifier>2 ) = <expression>;

is merely shorthand for

val <identifier>1 = fn ( <identifier>2 ) => <expression>;

However, this is not true; there is a subtle but important difference, and it boils down to scope.
Recall that a variable’s scope is the duration of its validity. A variable (including one that holds a
function value) is valid from the point of its declaration on. It is not valid, however, in its declaration
itself; in the val/fn form, <identifier>1 cannot appear in <expression>. The name of a function
defined using the fun form, however, has scope including its own definition. Recall that recursion is
self-reference; a function defined using fun can call itself—or return itself, for that matter.
Functional programming is a style where no variables are modified. We will demonstrate this
distinction and how recursive calls make this possible by transforming our iterative factorial function
from Chapter 14 into the functional style. We modify this slightly by counting from 1 to n instead
of from 0 to n − 1, and accordingly we update fact before i.

183
28.1. SCOPE CHAPTER 28. RECURSION REVISITED

- fun factorial(n) =
= let val i = ref 1;
= val fact = ref 1;
= in
= (while !i <= n do
= (fact := !fact * !i;
= i := !i + 1);
= !fact)
= end;
Our first change is to encapsulate the body of the while loop into a function, which we will call
factBody. The body of the while loop does two things: It updates both fact and i. Our function,
then, must consume the old fact and i and produce new values for them. We can handle the need
to return two values by returning a tuple. Thus we also consolidate fact and i into one value, the
tuple current. This essentially represents the current state of the computation.
- fun factorial(n) =
= let fun factBody(fact, i) = (i * fact, i + 1);
= val current = ref (1, 1);
= in
= (while #2(!current) <= n do
= current := factBody(!current);
= #1(!current))
= end;
Next, we can be more ambitious about how much of the work we subsume into the function
factBody. At this point we also take advantage of the recursive use of function names. That the
while loop is doing one thing: calling factBody repeatedly. Since factBody can call itself, it may as
well eat up the rest of the while loop. Notice that the while is effectively replaced with an if. Both
the old while and the new if are making the same decision—either stop (and do not change the state
of current or (fact, i)) or make the change and repeat.
- fun factorial(n) =
= let fun factBody(fact, i) =
= if i <= n then factBody(i * fact, i + 1)
= else (fact, i);
= val current = ref (1, 1);
= in
= (current := factBody(!current);
= #1(!current))
= end;
Now we notice that the second item in current is no longer used; current can be a single int.
We have come full circle, in a way—current is now equivalent to the old variable (now parameter)
fact. Accordingly, factBody should only return one thing, the new current fact value. The main
call to factBody needs to be given an initial value for the second item (the old variable, now
parameter i).
- fun factorial(n) =
= let fun factBody(fact, i) =
= if i <= n then factBody(i * fact, i + 1)
= else fact;
= val current = ref 1;
= in
= (current := factBody(!current, 1);
= !current)
= end;

184
CHAPTER 28. RECURSION REVISITED 28.1. SCOPE

Next, consider the statement list inside the let expression. It is rather silly to store the result
of the main call to factBody and immediately retrieve it. Why not replace the statement list with
just the call?

- fun factorial(n) =
= let fun factBody(fact, i) =
= if i <= n then factBody(i * fact, i + 1)
= else fact;
= val current = ref 1;
= in
= factBody(!current, 1)
= end;

Now that current is never updated, there is no need for it to be a reference variable—or a
variable at all, for that matter. We replace its one remaining use with its initial value, 1.

- fun factorial(n) =
= let fun factBody(fact, i) =
= if i <= n then factBody(i * fact, i + 1)
= else fact;
= in
= factBody(1, 1)
= end;

Next we make use of the associativity of multiplication. Our current version of factBody performs
its multiplication first and then passes the result to the call, essentially performing (. . . (((1 × 1) ×
2) × 3) . . . × n). This means that fact gets bigger on the way down, and the result is unchanged
as it comes back up from the series of recursive calls. Instead, we can do the multiplication after
the call, so that fact stays the same on the way down, but the result gets bigger as it comes up,
essentially (1 × (1 × (2 × (3 × . . . (n) . . .)))).

- fun factorial(n) =
= let fun factBody(fact, i) =
= if i = 0 then fact
= else factBody(i * fact, i - 1);
= in
= factBody(1, n)
= end;

We can also make use of associativity. Instead of performing multiplication first and then passing
the result to the call (so fact gets bigger on the way down, and the result is unchanged as it
comes back up from the series of recursive calls) we can do the multiplication after the call, so
that fact stays the same on the way down, but the result gets bigger as it comes up, essentially
(n × ((n − 1) × ((n − 2) × (. . . (1) . . .)))).

- fun factorial(n) =
= let fun factBody(fact, i) =
= if i = 0 then fact
= else i * factBody(fact, i - 1);
= in
= factBody(1, n)
= end;

An amazing thing has happened: The variable fact no longer varies with each call. This means
we can eliminate it from the parameter list and replace its use with its only value, 1.

185
28.2. RECURRENCE RELATIONS CHAPTER 28. RECURSION REVISITED

- fun factorial(n) =
= let fun factBody(i) =
= if i = 0 then 1
= else i * factBody(i - 1);
= in
= factBody(n)
= end;

Now all that factorial does is make a local function and apply it to its input without modifi-
cation. We may as well replace its body with the body of factBody—but be careful to substitute
factorial for factBody and n for i.

- fun factorial(n) =
= if n = 0 then 1
= else n * factorial(n - 1);

Finally, we summon the beautifying magic of pattern-matching.

- fun factorial(0) = 1
= | factorial(n) = n * factorial(n - 1);

The mathematical definition of factorial is

1 if n = 0
n! =
n · (n − 1)! otherwise

28.2 Recurrence relations

The medieval mathematician Leonardo of Pisa proposed the following problem. A newborn male/female
pair of an immortal species of rabbit is introduced to an island. Individuals of the species reach
sexual maturity after one month and have a one-month gestation period. Each litter contains exactly
one male and one female. Individuals mate once a month for life with their siblings with a perfect
fertility rate and no miscarriages. How many pairs of rabbits will there be after n months?
Ultimately we would like a function f (n) which compute the number of pairs given a number of
months. Before defining such a function directly, we consider how one function value may be related
to others. The number of pairs of rabbits alive after n months will be

• the number of pairs born in the nth month, plus

• the number of pairs that were alive before the nth month.

The second of these is the same as the number of pairs alive after n − 1 months, that is, f (n − 1).
What about the first point? How many new pairs will be born? Since the fertility rate is 100%,
every pair will give birth to a new pair—except for juvenile pairs, which are not fertile yet. Since
we assume all the pairs older than one month are fertile, this means that every pair alive after n − 2
months is ready, and so f (n − 2) new pairs are born.

f (n) = f (n − 1) + f (n − 2)
Yet this does not fully define the function, since f (n − 1) and f (n − 2) are unknown, unless we define
it for a set of base cases.

f (0) = 1 just the original pair

f (1) = 1 since the original pair was not sexually mature the previous month

If you have not guessed, Leonardo of Pisa was better know by his nickname, Fibonacci.

186
CHAPTER 28. RECURSION REVISITED 28.2. RECURRENCE RELATIONS

- fun fibonacci(0) = 1
= | fibonacci(1) = 1
= | fibonacci(n) = fibonacci(n-1) + fibonacci(n-2);

Recall that a mathematical sequence is a function with domain W or N. For a sequence a, we often
write ak for a(k) and refer to this as the kth term in the sequence. A recurrence relation for a sequence recurrence relation
a is a formula which, for some N relate each term ak (for all k ≥ N ) to a finite number of predecessor
terms ak−N , ak−N +1 , . . . , ak−1 ; moreover, terms a0 , a1 , a1 , . . . , aN −1 are explicitly defined, which
we call the initial conditions of the recurrence relation. (The term “relation” is misleading here, initial conditions
since it is not directly related to relation as a set of ordered pairs.) This demonstrates for us the
basic structure of recursion: A formula has base case by which it is grounded explicitly, and in other
cases it is defined in terms of other values of the same formula.
For another example, consider the well-known Tower of Hanoi puzzle. Imagine you have three
pegs and k disks. Each disk has a different size and a hole in the middle big enough for the pegs to
fit through. Initially all disks are on the first peg from the largest disk on the bottom to the smallest
disk on the top. Your task is to move all the disks to the third peg under the following rules:

• You may move only one disk at a time.

• You may never put a larger disk on top of a smaller disk on the same peg.

The strategy we use is itself recursive. Assume we call the pegs 0, 1, and 2, and generalize the
problem to moving k disks from peg i to peg j. Then

1. Move the top k − 1 disks from peg i to the peg other than i or j.
2. Move the kth disk from peg i to peg j.
3. Move the top k − 1 disks from the other peg to peg j.

How many moves will this take? If mk is the number of moves it takes to solve the puzzle for k
disks, then

mk = mk−1 to move the top k − 1 disks

+ 1 to move the kth disk
+ mk−1 to move the top k − 1 disks again
And

m1 = 1
In ML,

- fun hanoi(1) = 1
= | hanoi(n) = 2 * hanoi(n-1) + 1

187
28.2. RECURRENCE RELATIONS CHAPTER 28. RECURSION REVISITED

Exercises

3. Using a process similar to that used to transform factorial,

1. The definition given for recurrence relations assumes the se- transform your fibonacci function from Exercise 2 into the
quence has domain W. Rewrite it for sequences that have functional style (no reference variables) but still without du-
domain N. plicate work.
2. The ML function fibonacci is awful. The call 4. Prove, using math induction, that the sequence m we de-
fibonacci(n-1) itself will call fibonacci(n-2), redoing the fined to represent the number of moves in our Towers of
work of the other fibonacci(n-1). When you consider how Hanoi algorithm is the same as
this situation spreads through all the recursive calls, the in-
efficiency is obscene. Write an iterative version (using refer-
ence variables and a while loop) of fibonacci that does no mk = 2 k − 1
duplicate work.

188
Chapter 29

Recursive Types

29.1 Datatype constructors

Suppose we are writing an application whose data includes fractions. The obvious way to represent
fractions is a tuple of two integers. Thus, to add two fractions

- fun add((a, b), (c, d)) =

= let
= val numerator = a * d + c * b;
= val denominator = b * d;
= val divisor = gcd(numerator, denominator);
= in
= (numerator div divisor, denominator div divisor)
= end;

val add = fn : (int * int) * (int * int) -> int * int

- add((1,3), (1,2));

val it = (5,6) : int * int

However, this introduces a software engineering danger: What if there are other uses for tuples
of two integers in the system, say, points, pairs in a relation, or simply the result of a function
that needed to return two values? It would be very easy for a programmer to become careless
and introduce bugs that improperly treats an int tuple as a fraction or a fraction as some other
int tuple. In this course we have mostly considered an idealized mathematical world apart from
practical software engineering concerns. However, one purpose of types is to make mistakes like this
harder to make and easier to detect.
What is missing in the situation above is proper abstraction, in this case a way to encapsulate
a fraction value and designate it specifically as a fraction so, for example, the point (5, 3) is not
mistaken for 35 . A better solution is to extend ML’s type system to include a rational number type.
We know one way to create new types, and that is using the datatype construct. So far we have
seen it used to represent only finite sets, not (nearly)1 infinite sets like the rational type we have in
mind. We can expand on the old way of writing datatypes by tacking a little extra data onto the
options we give for a datatype.

- datatype rational = Fraction of int * int;

datatype rational = Fraction of int * int

1 The int type, after all, is a finite set, bound by the memory limitations of the computer.

189
29.1. DATATYPE CONSTRUCTORS CHAPTER 29. RECURSIVE TYPES

- Fraction(3,5);

val it = Fraction (3,5) : rational

- fun add(Fraction(a, b), Fraction(c, d)) =

= let
= val numerator = a * d + c * b;
= val denominator = b * d;
= val divisor = gcd(numerator, denominator);
= in
= Fraction(numerator div divisor, denominator div divisor)
= end;

val add = fn : rational * rational -> rational

- add(Fraction(1,2), Fraction(1,3));

val it = Fraction (5,6) : rational

constructor expression The construct Fraction of int * int is called a constructor expression. The idea is that it
explains one way to construct a value of the type rational. Another way to think of it is that
Fraction is a husk that contains an int tuple as its core. A datatype declaration, then, is a sequence
of constructor expressions each following the form
<identifier> of <type expression>
with the “of . . . ” part optional.
Suppose we were writing an application to manage payroll and other human resources operations
for a company. The company has both hourly and salaried employees. We wish to represent both
kinds by a single type (so that, for example, values of both can be stored in the same list or array),
but different data is associated with them: for hourly employees, their hourly wage and the number
of hours clocked since the last pay period; for salaried employees, their yearly salary. We use this
datatype:
- datatype employee = Hourly of real * int ref | Salaried of real;
The reason the second value of Hourly is a reference type is so that values can be updated as
the employee clocks more hours. Thus
- fun clockHours(Hourly(rate, hours), newHours) = hours := !hours + newHours
= | clockHours(Salaried(salary), newHours) = ();
Notice how pattern matching naturally extends to more complicated datatypes. This function
does nothing when clocking hours for salaried employees. (If this were a realistic example, it might
store those hours for the sake of assessing the employee’s work; we also likely would store information
such as the employee’s name and office location.) Similarly, we use pattern matching to determine
how to compute an employee’s wage (assume a two-week pay period):
- fun computePay(Hourly(rate, hours)) =
= let val hoursThisPeriod = !hours;
= in
= (hours := 0;
= rate * real(hoursThisPeriod))
= end
= | computePay(Salaried(salary)) = salary / 26.0;
Notice how the option for hourly employees both resets the hours to 0 and returns the computed
wage.

190
CHAPTER 29. RECURSIVE TYPES 29.2. PEANO NUMBERS

29.2 Peano numbers

Italian mathematician Giuseppe Peano introduced a way of reasoning about whole numbers that
includes the following axioms:

Axiom 8 There exists a whole number 0.

Axiom 9 Every whole number n has a successor, succ n. successor

Axiom 10 No whole number has 0 as its successor.

Axiom 11 If a, b ∈ W, then a = b iff succ a = succ b.

If we interpret successor to mean “one more than,” these axioms allow us to define whole numbers
recursively (called Peano numbers); a whole number is Peano numbers

• zero, or
• one more than another whole number.

Now we see just how flexible the datatype construct is: The scope of the name of the type being
defined includes the definition itself. This means types can be defined recursively.

- datatype wholeNumber = Zero | OnePlus of wholeNumber;

When we wrote self-referential functions or algorithms, we essentially were thinking of recursive

verbs. What is new here is that we are now speaking of recursive nouns. A noun appearing in its
own definition should not in itself seem strange; take, for example, this definition of coral, adapted
from Merriam-Webster :

coral 1 : the calcareous or horny skeletal deposit produced by anthozoan polyps.

2 : a piece of coral

The second definition is merely shorthand for “a piece of the calcareous or horny skeletal deposit
produced by anthozoan polyps”—that is, the use of the word coral internal to the second definition
refers only to the first definition, not back to the second definition itself. No reasonable person
would interpret this as a rule that would produce, as a replacement for the occurrence of coral in a
text, “a piece of a piece of a piece of a piece of a piece of the calcareous or horny skeletal deposit
produced by anthozoan polyps.”
That is, however, how we interpret our definition of whole numbers. The recursive part establishes
a pattern for generating every possible whole number. For example,

5 is a whole number because it is the successor of

4, which is a whole number because it is the successor of
3, which is a whole number because it is the successor of
2, which is a whole number because it is the successor of
1, which is a whole number because it is the successor of
0, which is a whole number by axiom.

In ML,

- val five = OnePlus(OnePlus(OnePlus(OnePlus(OnePlus(Zero)))));

val five = OnePlus (OnePlus (OnePlus (OnePlus (OnePlus Zero)))) : wholeNumber

- val six = OnePlus(five);

val six = OnePlus (OnePlus (OnePlus (OnePlus (OnePlus (OnePlus Zero))))) : wholeNumber

191
29.2. PEANO NUMBERS CHAPTER 29. RECURSIVE TYPES

Finding the successor of a number is just a matter of tacking “OnePlus” to the front, a process
easily automated.

- fun succ(num) = OnePlus(num);

Conversion from an int to a wholeNumber is a recursive process—the base case, 0, can be returned
immediately; for any other case, we add one to the wholeNumber representation of the int that comes
before the one we are converting. (Negative ints will get us into trouble.)

- fun asWholeNumber(0) = Zero

= | asWholeNumber(n) = OnePlus(asWholeNumber(n-1));

val asWholeNumber = fn : int -> wholeNumber

Notice how subtracting one from the int and adding one to the resulting wholeNumber balance
predecessor each other off. Opposite the successor, we define the predecessor of a natural number n, pred n, to
be the number of which n is the successor. From Axiom 11 we can prove that the predecessor of a
number, if it exists, is unique; Axiom 10 says that 0 has no predecessor. Pattern matching makes
stripping off a “OnePlus” easy:

- fun pred(OnePlus(num)) = num;

Warning: match nonexhaustive

OnePlus num => ...

val pred = fn : wholeNumber -> wholeNumber

You may remember this warning from the first time we saw pattern-matching in Chapter 9—or
from more recent mistakes you have made. In this case it is not a mistake; we truly want to leave
the operation undefined for Zero. Using the function on Zero, rather than failing to define it for
Zero, would be the mistake.

- val three = asWholeNumber(3);

val three = OnePlus (OnePlus (OnePlus Zero)) : wholeNumber

- pred(three);

val it = OnePlus (OnePlus Zero) : wholeNumber

- pred(Zero);

uncaught exception nonexhaustive match failure

Now we can start defining arithmetic recursively. Zero will always be our base case; anything
we add to Zero is just itself. For other numbers, picture an abacus. We have two wires, each with
a certain number of beads pushed up. At the end of the computation, we want one of the wires to
contain our answer. Thus we push down one bead from the other wire, bring up one bead on the
answer wire, and repeat until the other wire has no beads left. In other words, we define addition
similarly to our recursive gcd lemmas from Chapter 17.

0+b = b
a+0 = a
a + b = (a + 1) + (b − 1) if b 6= 0
In ML,

192
CHAPTER 29. RECURSIVE TYPES 29.3. PARAMETERIZED DATATYPE CONSTRUCTORS

- fun plus(Zero, num) = num

= | plus(num, Zero) = num
= | plus(num1, OnePlus(num2)) = plus(OnePlus(num1), num2);

Examine for yourself the similarity of structure for isLessThatOrEqualTo. Keep in mind that
recursively-define predicates have two base cases, one true and one false. Here the first and second
parameter are in a survival contest; they repeatedly shed a OnePlus, and the first one reduced to
Zero loses.

- fun isLessThanOrEq(Zero, num) = true

= | isLessThanOrEq(num, Zero) = false
= | isLessThanOrEq(OnePlus(num1), OnePlus(num2)) = lessThanOrEq(num1, num2);

For subtraction, we observe

a−0 = a
a − b = (a − 1) − (b − 1) if a 6= 0 and b 6= 0
In ML,

- fun minus(num, Zero) = num

= | minus(OnePlus(num1), OnePlus(num2)) = minus(num1, num2);

Warning: match nonexhaustive

(num,Zero) => ...
(OnePlus num1,OnePlus num2) => ...

val minus = fn : wholeNumber * wholeNumber -> wholeNumber

This rightly leaves the pattern minus(Zero, OnePlus(num)) undefined. Finally, conversion back
to int is just a literal interpretation of the identifiers we gave to the constructor expressions.

- fun asInt(Zero) = 0
= | asInt(OnePlus(num)) = 1 + asInt(num);

29.3 Parameterized datatype constructors

A stack is any structuring or collection of data for which the following operations are defined: stack

push Add a new piece of data to the collection.

top Return the most recently-added piece of data that has not yet been removed.
pop Remove the most recently-added piece of data that has not yet been removed.
depth Compute the number of pieces of data in the collection.

The data structure can grow and shrink as you add to and remove from it. You retrieve elements
in the reverse order of which you added them. A favorite real-world example is a PEZ dispenser,
which will help explain the intuition of the operation name depth.
One thing we have left unspecified is what sort of thing (that is, what type) these pieces of
data are. This is intentional; we would like to be able to use stacks of any type, just as we have
for lists and arrays. This, too, can be implemented using a datatype, because datatypes can be
parameterized by type. For example

- datatype (’a) someType = Container of ’a;

193
29.3. PARAMETERIZED DATATYPE CONSTRUCTORS CHAPTER 29. RECURSIVE TYPES

datatype ’a someType = Container of ’a

- Container(5);

val it = Container 5 : int someType

- Container(true);

val it = Container true : bool someType

- Container([(Container(3.4), 2),(Container(2.7), 8)]);

val it = Container [(Container 3.4,2),(Container 2.7,8)]

: (real someType * int) list someType

type parameter Here ’a is a type parameter . Indeed, variables may be used to store types, in which case the
identifier should begin with an apostrophe. (This should explain some unusual typing judgments
ML has been giving you.) The form for declaring a parameterized datatype is

datatype <type parameter> <identifier>

and the form for referring to a parameterized type is

<type expression> <identifier>

where the identifier is the name of the parameterized type. Notice that this second form is itself a
type expression type expression, any construct that expresses a type.
With this in hand, we can define a stack recursively as being either

• empty, or
• a single item on top of another stack

and define a generic stack as

- datatype (’a) Stack = Empty | NonEmpty of ’a * (’a) Stack;

Implementing the operations for the stack come easily by applying the principles from the Peano
numbers example to the definitions of the operations.

- fun top(NonEmpty(x, rest)) = x;

- fun pop(NonEmpty(x, rest)) = rest;

- fun push(stk, x) = NonEmpty(x, stk);

- fun depth(Empty) = 0
= | depth(NonEmpty(x, rest)) = 1 + depth(rest);

This chapter received much inspiration from Felleisen and Friedman[6].

194
CHAPTER 29. RECURSIVE TYPES 29.3. PARAMETERIZED DATATYPE CONSTRUCTORS

Exercises

10. Write a function asInteger, converting a powerOfTwo to an

1. Write a function subtract for the rational type. equivalent int
2. Write a function multiply for the rational type. 11. Write a function logBase2, computing an int base 2 loga-
rithm from a powerOfTwo (that is, the type of this function
3. Write a function divide for the rational type.
should be powerOfTwo -> int).
4. Write a function multiply for the wholeNumber type.
If ML did not come with a list construct, we could define our own
5. Write a function divide, performing integer division, for the
using the datatype
wholeNumber type. (Hint: The Division Algorithm from
Chapter 17 is a good place to start.) - datatype (’a) homemadeList =
6. Write a function modulo, performing the mod operation, for = Null | Cons of ’a * ’a homemadeList;
the wholeNumber type. 12. Write a function head for ’a homemadeList, equivalent to the
7. Write a function gcd, computing the greatest common divi- ML primitive hd.
sor, for the wholeNumber type.
13. Write a function tail for ’a homemadeList, equivalent to the
Natural numbers that are powers of 2 can be defined as ML primitive tl.
• 1, or 14. Write a function cat to concatenate two ’a homemadeLists.
• 2 times a power of 2. equivalent to the ML primitive @.

We can represent this in ML as 15. Write a function isElementOf for ’a homemadeList.

16. Write a function makeNoRepeats for ’a homemadeList.
- datatype powerOfTwo = One | TwoTimes of powerOfTwo;
17. Write a function listify for ’a homemadeList.
8. Write a function multiply, multiplying two powerOfTwos to
get another powerOfTwo. 18. Write a function sum for int homemadeList.
9. Write a function asPowerOfTwo, converting an int to the 19. Write a function map for ’a homemadeList, like the map func-
nearest powerOfTwo less than or equal to it. tion of Chapter 24.

195
29.3. PARAMETERIZED DATATYPE CONSTRUCTORS CHAPTER 29. RECURSIVE TYPES

196
Chapter 30

Fixed-point iteration

30.1 Currying
Before we begin, we introduce a common functional programming technique for reducing the number
of arguments or arity of a function by partially evaluating it. Take for a simple example a function arity
that takes two arguments and multiplies them.

- fun mul(x, y) = x * y;

We could specialize this by making a function that specifically doubles any given input by calling
mul with one argument hardwired to be 2.

- fun double(y) = mul(2, y);

This can be generalized an automated by a function that takes any first argument and returns a
function that requires only the second argument.

- fun makeMultiplier(x) = fn y => mul(x, y);

val makeMultiplier = fn : int -> int -> int

- makeMultiplier(2);

val it = fn : int -> int

- it(3);

val it = 6 : int

This process is called currying, after mathematician Haskell Curry, who studied this process currying
(though he did not invent it). We can generalize this with a function that transforms any two-
argument function into curried form.

- fun curry(func) = fn x => fn y => func(x, y);

val curry = fn : (’a * ’b -> ’c) -> ’a -> ’b -> ’c

- curry(mul)(2)(3);

val it = 6 : int

197
30.2. PROBLEM CHAPTER 30. FIXED-POINT ITERATION

30.2 Problem
In this chapter we apply our functional programming skills to a non-trivial problem: calculating
square roots. One approach is to adapt Newton’s method for finding roots in general, which you
may recall from calculus. Here is how Newton’s method works.
Suppose a curve of a function f crosses the x-axis at x0 Functionally, this means f (x0 ) = 0, and
root we say that x0 is a root of f . Finding x0 may be difficult or even impossible; obviously x0 may not
be rational, and if f is not a polynomial, then x0 may not even be an algebraic number (do you
remember the difference between A and T?). Instead, we use Newton’s method to approximate the
root.

f(x)

g(x)

( xi , f (xi))
A

( xi+1, f (x i+1))
C
D
xi x i+1 B x’

The approximation is done by making an initial guess and then improving the guess until it is
tolerance “close enough” (in technical terms, it is within a desired tolerance of the correct answer). Suppose
xi is a guess in this process. To improve the guess, we draw a tangent to the curve at the point A,
(xi , f (xi )), and then calculate the point at which the tangent strikes the x-axis. The slope of the
tangent can be calculated by evaluating the derivative at that point, f 0 (xi ). Recall from first-year
algebra that if you have a slope m and a point (x0 , y 0 ), a line through that point with that slope
satisfies the equation

y − y0 = m · (x − x0 )
y = m · (x − x0 ) + y 0

Using point-slope form, we define a function corresponding to the tangent,

g(x) = f 0 (xi ) · (x − xi ) + f (xi )

Let xi+1 be the x value where g strikes the x-axis. Thus we want g(xi+1 ) = 0, and solving this
equation for xi+1 lets us find point B, the intersection of the tangent and the axis.

198
CHAPTER 30. FIXED-POINT ITERATION 30.2. PROBLEM

0 = f 0 (xi )(xi+1 − xi ) + f (xi )

xi f 0 (xi ) − f (xi )
xi+1 =
f 0 (xi )
f (xi )
= xi − (30.1)
f 0 (xi )

Drawing a vertical line through B leads us to C, the next point on the curve where we will draw
a tangent. Observe how this process brings us closer to x0 , the actual root, at point D. xi+1 is
thus our next guess. Equation 30.1 tells us how to generate each successive approximation from the
previous one. When the absolute value of f (xi ) is small enough (the function value of our guess is
within the tolerance of zero), then we√declare xi to be our answer.
We can use this method to find c by noting that the square root is simply
the positive root of the function f (x) = x2 − c. In this case we find the derivative
f 0 (x) = 2x and plug this into Equation 30.1 to produce a function for improving f ( x) = x −c
(0 , c )
2

a given guess x:

x2 − c
I(x) = x −
2x
In ML, (− c , 0) ( c , 0)

- fun improve(x) =
= x - (x * x - c) / (2.0 * x);

val improve = fn : real -> real

Obviously this will work only if c has been given a valid definition already. Now that we have
our guess-improver in place, our concern is the repetition necessary to achieve a result. Stated
algorithmically, while our current approximation is not within the tolerance, simply improve the
approximation. By now we should have left iterative solutions behind, so we omit the ML code for
this approach. Instead, we set up a recursive solution based on the current guess, which is the data
that is being tested and updated. There are two cases, depending on whether the current guess is
within the tolerance or not.

• Base case: If the current guess is within the tolerance, return it as the answer.
• Recursive case: Otherwise, improve the guess and reapply the this test, returning the result
as the answer.

In ML,

- fun sqrtBody(x) =
= if inTolerance(x)
= then x
= else sqrtBody(improve(x))

val sqrtBody = fn : real -> real

We called this function sqrtBody instead of sqrt because it is a function of the previous guess,
x, not a function of the radicand, c. Two things remain: a predicate to determine if a guess is within
the tolerance (say, .001; then we are in the tolerance when |x2 − c| < .001), and an initial guess (say,
1). If we package this together, we have

199
30.3. ANALYSIS CHAPTER 30. FIXED-POINT ITERATION

- fun sqrt (c) =

= let fun isInTolerance(x) =
= abs((x * x) - c) < 0.001;
= fun improve(x) =
= x - (x * x - c) / (2.0 * x);
= fun sqrtBody(x) =
= if isInTolerance(x)
= then x
= else sqrtBody(improve(x))
= in
= sqrtBody(1.0)
= end;

val sqrt = fn : real -> real

- sqrt(2.0);

val it = 1.41421568627 : real

- sqrt(16.0);

val it = 4.00000063669 : real

- sqrt(121.0);

val it = 11.0000000016 : real

- sqrt(121.75);

val it = 11.0340382471 : real

30.3 Analysis
Whenever you solve a problem in mathematics or computer science, the next question to ask is
whether the solution can be generalized so that it applies to a wider range of problems and thus can
be reused more readily. To generalize an idea means to reduce the number of assumptions and to
acknowledge more unknowns. In other words, we are replacing constants with variables.
Our square root algorithm was a specialization of Newton’s method. The natural next question
is how to program Newton’s method in general. What assumptions or restrictions did we make on
Newton’s method when we specialized it? Principally, we assumed that the function for which we
were finding a root was in the form x2 − c where c is a variable to the system. Let us examine how
this assumption affects the segment of the solution that tests for tolerance.

- fun isInTolerance(x) =
= abs((x * x) - c) < 0.001;

The assumed function shows itself in the expression (x * x) - c. By taking the absolute value
of that function for a supplied x and comparing with .001, we are checking if the function is within an
epsilon of zero. We know that functions can be passed as parameters to functions; here, as happens
frequently, generalization manifests itself as parameterization.

- fun isInTolerance(function, x) =
= abs(function(x)) < 0.001;

200
CHAPTER 30. FIXED-POINT ITERATION 30.3. ANALYSIS

val isInTolerance = fn : (’a -> real) * ’a -> bool

Take stock of the type. The function isInTolerance takes a function (in turn mapping from a
type ’a to real) and a value of type ’a. The given information does not allow ML to infer what type
function would accept; hence the type variable ’a. How function’s return type is inferred to be
real is more subtle abs is a special kind of function that is defined so that it can accept either reals
or ints, but it must return the same type that it receives. Since we compare its result against 0.001,
its result must be real; thus its parameter must also be real, and finally we conclude that function
must return a real.
isInTolerance is now less easy to use because we must pass in the function whenever we want
to use it, unless function is in scope already and we can eliminate it as a parameter. However,
we know that functions can also return functions. To make this more general, instead of writing a
function to test the tolerance, we write a function that produces a tolerance tester, based on a given
function.

- fun toleranceTester(function) =
= fn x => abs(function(x)) < 0.001;

val toleranceTester = fn : (’a -> real) -> ’a -> bool

Notice that the -> operator is right associative, which means it groups items on the right side
unless parentheses force it to do otherwise. toleranceTester accepts something of type ’a -> real
and returns something of type ’a -> real. Now we need call toleranceTester only once and call the
function it returns whenever we want to test for tolerance. To further generalize, let us no longer
assume = .001, but instead parameterize it.

- fun toleranceTester(function, tolerance) =

= fn x => abs(function(x)) < tolerance;

val toleranceTester = fn : (’a -> int) * int -> ’a -> bool

What happened to the type? Since 0.001 no longer appears, there is nothing to indicate that
we are dealing with reals. Yet ML cannot simply introduce a new type variable (for, say, (’a -> ’b)
* ’b -> ’a -> bool) because abs is not defined for all types, just int and real. Instead, ML has to
guess, and when it comes between real and int, it goes with int. We will force it to chose real.

- fun toleranceTester(function, tolerance) =

= fn x => abs(function(x):real) < tolerance;

val toleranceTester = fn : (’a -> real) * real -> ’a -> bool

Next, consider the function improve. We can generalize this by stepping back and considering
how we formulated it in the first place. It comes from applying Equation 30.1 to a specific function
f . Thus we can generalize it by making the function of the curve a parameter. Since we do not have
a means of differentiating f , we will need f 0 to be supplied as well.

- fun nextGuess(guess, function, derivative) =

= guess - (function(guess) / derivative(guess));

However, just as with tolerance testing, we would prefer to think of our next-guesser as a function
only of the previous guess, not of the curve function and derivative. We can modify nextGuess easily
so that it produces a function like improve:

- fun nextGuesser(function, derivative) =

= fn guess => guess - (function(guess) / derivative(guess));

201
30.4. SYNTHESIS CHAPTER 30. FIXED-POINT ITERATION

Notice that this process amounts to the partial application of a function, and example of currying.
nextGuess has three parameters; nextGuesser allows us to supply values for some of the parameters,
and the result is another function. sqrtBody also demonstrates a widely applicable technique. If we
generalize our function I(x) based on Equation 30.1 we have

f (x)
G(x) = x −
f 0 (x)
If x is an actual root, then f (x) = 0, and so G(x) = x. In other words, a root of f (x) is a solution
to the equation

x = G(x)
fixed point problems Problems in this form are called fixed point problems because they seek a value which does not
change when G(x) is applied to it (and so it is fixed). If the fixed point is a local minimum or
maximum and one starts with a good initial guess, one approach to solving (or approximating) a
fixed point iteration fixed point problem is fixed point iteration, the repeated application of the function G(x), that is

G(G(G(. . . G(x0 ) . . .)))

where x0 is the initial guess, until the result is “good enough.” In parameterizing this process by
function, initial guess, and tester, we have this generalized version of sqrtBody:

- fun fixedPoint(function, guess, tester) =

= if tester(guess) then guess
= else fixedPoint(function, function(guess), tester);

val fixedPoint = fn : (’a -> ’a) * ’a * (’a -> bool) -> ’a

30.4 Synthesis
We have decomposed our implementation of the square root function to uncover the elements in
Newton’s method (and more generally, a fixed point iteration). Now we assemble these to make
useful, applied functions. In the analysis, parameters proliferated; as we synthesize the components
into something more useful, we will reduce the parameters, or “fill in the blanks.” Simply hooking
up fixedPoint, nextGuesser, and toleranceTester, we have

- fun newtonsMethod(function, derivative, guess) =

= fixedPoint(nextGuesser(function, derivative), guess,
= toleranceTester(function, 0.001));

val newtonsMethod = fn : (real -> real) * (real -> real) * real -> real

Given a function, its derivative, and a guess, we can approximate a root. However, one parameter
in particular impedes our use of newtonsMethod. We are required to supply the derivative of
function; in fact, many curves on which we wish to use Newton’s method may not be readily
differentiable. In those cases, we would be better off finding a numerical approximation to the
derivative. The easiest such approximation is the secant method, where we take a point on the curve
near the point at which we want to evaluate the derivative and calculate the slope of the line between
those points (which is a secant to the curve). Thus for small , f 0 (x) ≈ f (x+)−f

(x)
. In ML, taking
= .001,

- fun numDerivative(function) =
= fn x => (function(x + 0.001) - function(x)) / 0.001;

val numDerivative = fn : (real -> real) -> real -> real

202
CHAPTER 30. FIXED-POINT ITERATION 30.4. SYNTHESIS

Now we make an improved version of our earlier function. Since any user of this new function is
concerned only about the results and not about how the results are obtained, our name for it shall
reflect what the function does rather than how it does it.

- fun findRoot(function, guess) =

= newtonsMethod(function, numDerivative(function), guess);

val findRoot = fn : (real -> real) * real -> real

Coming full circle, we can apply these pre-packaged functions to a special case: finding the square
root. We can use newtonsMethod directly and provide an explicit derivative or we can use findRoot
and rely on a numerical derivative, with different levels of precision. Since x2 − c is monotonically
increasing, 1 is a safe guess, which we provide.

- fun sqrt(c) =
= newtonsMethod(fn x => x * x - c, fn x => 2.0 * x, 1.0);

val sqrt = fn : real -> real

- sqrt(2.0);

val it = 1.41421568627 : real

- sqrt(16.0);

val it = 4.00000063669 : real

- sqrt(121.0);

val it = 11.0000000016 : real

- sqrt(121.75);

val it = 11.0340382471 : real

There are several lessons here. First, this has been a demonstration of the interaction between
mathematics and computer science. The example we used comes from an area of study called
numerical analysis which straddles the two fields. Numerical analysis is concerned with the numerical
approximation of calculus and other topics of mathematical analysis. More importantly, this also
demonstrates the interaction between discrete and continuous mathematics. We have throughout
assumed that f (x) is a real-valued function like you are accustomed to seeing in calculus or analysis.
However, the functions themselves are discrete objects. The most important lesson is how functions
can be parameterized to become more general, curried to reduce parameterization, and, as discrete
objects, passed and returned as values.
The running example in this chapter was developed from Abelson and Sussman [1].

203
30.4. SYNTHESIS CHAPTER 30. FIXED-POINT ITERATION

Exercises

4. The implementation of fixedPoint presented in this chap-

ter differs from traditional fixed point iteration because it
1. Write a function similar to the original sqrt of Section 30.2 requires the user to provide a function which determines
but to calculate cubed roots. when the iteration should stop, based on the current guess.
2. Notice that a root of f is either a local minimum or a local That approach is tied to the application of finding roots,
maximum of G. Identify under what circumstances it is a since our criterion for termination how close f (guess) is to
minimum and under what circumstances it is a maximum. zero. Instead, termination should depend on how close suc-
cessive guesses are to each other, that is, if | xi − xi−1 |< .
3. Write a function similar to sqrt of Section 30.4 but to cal- Rewrite fixedPoint so that it uses this criterion and receives
culate cubed roots. a tolerance instead of a tester function.

204
Chapter 31

Combinatorics

31.1 Counting
In Chapter 25, we proved that if A and B are finite, disjoint sets, then |A ∪ B| = |A| + |B|. We will
generalize this idea, but first

Lemma 31.1 If A and B are finite sets and B ⊆ A, then |A − B| = |A| − |B|.

Proof. By Exercise 4 of Chapter 12, (A − B) ∪ B = A. By Exercise 22 of Chapter 11,

A − B and B are disjoint. Thus, by Theorem 25.2, |A − B| + |B| = |A|. By algebra,
|A − B| = |A| − |B|. 2.

Now,

Theorem 31.1 If A and B are finite sets, then |A ∪ B| = |A| + |B| − |A ∩ B|.

Proof. Suppose A and B are finite sets. Note that by Exercise 23 of Chapter 11, A and
B − (A ∩ B) are disjoint; and that by Exercise 10.1 also of Chapter 11, A ∩ B ⊆ B. Then,

|A ∪ B| = |A ∪ (B − (A ∩ B))| by Exercise 15 of Chapter 11 and substitution

= |A| + |B − (A ∩ B)| by Theorem 25.2
= |A| + |B| − |A ∩ B| by Lemma 31.1. 2

This result can be interpreted as directions for counting the elements of A and B. It is not simply
a matter of adding the number of elements in A to the number of elements in B, because A and B
might overlap. For example, if you counted the number of math majors and the number of computer
science majors in this course, you may end up with a number larger than the enrollment because
you counted the double math-computer science majors twice. To avoid counting the overlapping
elements twice, we subtract the cardinality of the intersection.
The area of mathematics that studies counting sets and the ways they combine and are ordered is
combinatorics. (Elementary combinatorics is often simply called “counting,” but that invites derision combinatorics
from those unacquainted with higher mathematics.) It plays a part in many field of mathematics,
probability and statistics especially. Lemma 31.1 is called the difference rule and Theorem 31.1 is difference rule
called the inclusion/exclusion rule. We can generalize Theorem 25.2 to define the addition rule:
inclusion/exclusion rule
n
X
Theorem 31.2 If A is a finite set with partition A1 , A2 , . . . , An , then |A| = |Ai |. addition rule
i=1

and introduce the multiplication rule: multiplication rule

Theorem 31.3 If A1 , A2 , . . . , An are finite sets, then |A1 × A2 × . . . × An | = |A1 | · |A2 | · . . . · |An |.

205
31.2. PERMUTATIONS AND COMBINATIONS CHAPTER 31. COMBINATORICS

The proofs are left as exercises.

Consider some examples of these. ML allows identifiers to be made up of upper and lower case
letters, numbers, underscores, and apostrophes, with the first character being a letter or apostrophe.
Furthermore, an identifier may not be the same as a reserved word. How many possible four-character
identifiers are there in ML? To solve this, consider the various sets of characters. Let A be the set
of capital letters, B the set of lowercase letters, C the set { ’ }, D the set { }, and E the set of
digits. Thus

addition rule
z }| {
( (|A| + |B| + |C|)


 × (|A| + |B| + |C| + |D| + |E|)



multiplication rule × (|A| + |B| + |C| + |D| + |E|)





× (|A| + |B| + |C| + |D| + |E|) )

difference rule − |{ case, else, open, then, type, with }|

The number of identifiers is (53 · 64 · 64 · 64) − 6 = 13893626.

Next consider the number of possible phone numbers in an area code. Phone numbers are seven
characters, chosen from the ten digits with the restrictions that no numbers begin with 0, 1, 555,
411, or 911. Notice that there are

10 · 10 · 10 · 10 = 10000
four-digit suffixes, and

8 · 10 · 10 − 3 = 797
three-digit prefixes that do not start with 0 or 1 and do not include the restricted prefixes. Choosing
a phone number means choosing a prefix and a suffix, hence 797 · 10000 = 7970000 possible phone
numbers.

31.2 Permutations and combinations

A classic set of combinatorial examples comes from a scenario where we have an urn full of balls
and we are pulling balls out of the urn. Suppose first that we are picking each ball one at a time;
permutation thus we are essentially giving an ordering the set of balls. A permutation of a set is a sequence of its
elements. A permutation is like a tuple of the set with dimension the same as the cardinality of the
entire set, except one important difference—once we choose the first element, that element is not in
the set to choose from for the remaining elements, and so on. In ML, our lists are a permutation of
a set. Applying the multiplication rule, we see that the number of permutations of a set of size n is

P (n) = n · (n − 1) · (n − 2) · . . . · 1
= n!
r-permutation An r-permutation of a set is a permutation of a subset of size r—in other words, we do not pull
out all the balls, only the first r we come to. The number of r-permutations of a set of size n is

P (n, r) = n · (n − 1) · (n − 2) · . . . · (n − r + 1)

n!
= (n−r)!

On the other hand, what if we grabbed several balls from the urn at the same time? If you have
four balls in your hand at once, it is not clear what order they are in; instead, you have simply

206
CHAPTER 31. COMBINATORICS 31.3. COMPUTING COMBINATIONS

chosen an unordered subset. An r-combination of a set of n elements is a subset of size r. How r-combination
n
many subsets of size r are there of a set of size n (written ( r ))? First, we know that the number of
n!
orderings of subsets of that size is (n−r)! . Each of those subsets can be ordered in r! ways. Hence

n n!
=
r r!(n − r)!

31.3 Computing combinations

Our main interest here is computing not just the number of permutations and combinations, but
the permutations and combinations themselves. Suppose we want to write a function combos(x,
r) which takes a set x and returns a set of sets, all the subsets of x of size r. We need to find a
recursive strategy for this. First off, we can consider easy cases. If x is an empty set, there are no
combinations of any size. Similarly, there are no combinations of size zero of any set. Thus

- fun combos([], r) = []
= | combos(x, 0) = []

The case where r is 1 is also straight forward: Every element in the set is, by itself, a combination
of size 1. Creating a set of all those little sets is accomplished by our listify function from
Chapter 18.

- fun combos([], r) = []
= | combos(x, 0) = []
= | combos(x, 1) = listify(x)

The strategy taking shape is odd compared to most of the recursive strategies we have seen
before. The variety of base cases seems to anticipate the problem being made smaller in both the
x argument and the r argument. What does this suggest? When you form a combination of size r
from a list, you first must decide whether that combination will contain the first element of the list
or not. Thus all the combinations are

• all the combinations of size r that do not contain the first element, plus
• all the combinations of size r − 1 that do not contain the first element with the first element
added to all of them.

Function addToAll from Exercise 8 of Chapter 18 will work nicely here.

- fun combos([], r) = []
= | combos(x, 0) = []
= | combos(x, 1) = listify(x)
= | combos(head::rest, r) =
= addToAll(head, combos(rest, r-1)) @ combos(rest, r);

207
31.3. COMPUTING COMBINATIONS CHAPTER 31. COMBINATORICS

Exercises

lists, each like y but with x inserted into every position.

1. Prove Theorem 31.2. For example, addEverywhere(1, [2, 3, 4]) should return
[[1,2,3,4],[2,1,3,4],[2,3,1,4],[2,3,4,1]]. Given an
2. Prove Theorem 31.3. empty list, it should return the list containing just the list
3. Write a function P(n, r) which computes the number of containing x.
r-permutations of a set of size n. 6. Write a function fullPermus(y) which takes a list and re-
4. Write a function C(n, r) which computes the number of turns the list of all the permutations of that list.
r-combinations of a set of size n.
7. Write a function fullPermusList(y) which takes a list of
The following exercises walk you through how to write a function lists and returns a united list of all the permutations of all
to compute r-permutations. the lists in y.
5. Write a function addEverywhere(x, y) which takes an item 8. Write a function permus(y, r) which computes all the r-
x and a list of items of that type y and returns a list of permutations of the list y.

208
Chapter 32

Special Topic: Computability

In Chapter 27, we noted the amazing property that natural numbers, integers, and rationals all have
the same cardinality, being countably infinite, but that real numbers are more infinitely numerous.
That discussion only considered comparing sizes of number sets. What about infinite sets of other
things?
Let us take for example computer programs. The finite nature of any given computer necessitates
that the set of computer programs is not even infinite. However, suppose we even allow for a
computer with an arbitrary amount of memory, where more always could be added if need be. Since
computer programs are stored in memory, and memory is a series of bits, we can interpret the bit-
representation of a program as one large natural number in binary. Thus we have a function from
computer programs to natural numbers. This function is not necessarily a one-to-one correspondence
(it certainly is not, if we exclude from the domain any bit representations of invalid programs), but it
is one-to-one, since every natural number has only one binary representation, which means that there
are at least as many natural numbers as programs, perhaps more. The set of computer programs is
therefore countable.
In ML programming, we think of a program as something that represents and computes a func-
tion. How many functions are there? Since every real number can be considered to be a constant
function, it is easily argued that there are uncountably many possible functions. Nevertheless, let us
look at a narrower scope of functions, ones we have a chance of writing a program to compute (we
could not expect, for example, a program to compute an arbitrary real number in a finite amount of
time). For convenience, we choose the seemingly arbitrary set T of functions from natural numbers
to digits,

T = {f : N → {0, 1, 2, . . . , 9} }
Now we will define a function not in the set T but with T as its codomain (a function that
returns functions), h : (0, 1) → T . Suppose we represent a number in (0, 1) as 0.a1 a2 a3 . . . an . . ..
Then define

h(0.a1 a2 a3 . . . an . . .) = the function f defined by f (n) = an

Every number in (0, 1) will produce a unique function, and every function in T defines a unique
real number if one simply feeds the natural numbers into it and interprets the output as a decimal
expansion. Hence h is one-to-one and onto, and so T is uncountable.
If T is uncountable and the set of possible programs is countable, this means there are more
functions than there are computer programs—and further, that there must be some functions for
which there are no possible computer programs. Some functions cannot be computed. This is a very
humbling thought about the limitations of the machines that we build.
The set of functions that map natural numbers to digits is still fairly unrealistic. Are any of
these uncomputable functions something that we would actually want to compute? Suppose we
wanted a program/function that would take as its input another program/function and input for

209
CHAPTER 32. SPECIAL TOPIC: COMPUTABILITY

that program/function and return true if the given program halts (that is, does not loop forever or
have infinite recursion) on that input, false otherwise. To clarify, the following program does half of
the intended work:

- fun halfHalt(f, x) = (f(x); true);

val halfHalt = fn : (’a -> ’b) * ’a -> bool

This program will indeed return true if the given program halts on the given input, but rather
than return true otherwise, it will go one forever itself. We cannot just let the program run for a
while and, if it does not end on its own, break the execution and conclude it loops because we will
never know whether or not we have waited long enough. A program that does decide this would be
useful indeed. One of the most frequent kinds of programming mistakes (especially for beginners) is
unterminated iteration or recursion. Such a program—at least a program that would differentiate all
programs precisely and not have any program for which it could not tell either way—is impossible.
To see that this is true, suppose we had such a program, halt.

- fun d(m) = if halt(m, m)

= then (while true do (); false)
= else true;

val d = fn : (’a -> ’b) -> bool1

Given a function m, this program feeds m into halt as both the function to run and the input to
run it on. If m does not halt on itself, this function returns true. If it does halt, then this program
loops forever (the false is present in the statement list just so that it will type correctly; it will
never be reached because the while loop will not end).
What would be the result of running d(d)? If d halts when it is applied to itself, then halt will
return true, so then d will loop forever. If d loops forever when it is applied to itself, then halt will
return false, so then d halts, returning true. If it will halt, then it will not; if it will not, then it will.
This contradiction means that the function halt cannot exist.
halting problem This, the unsolvability of the halting problem is a fundamental result in the theory of computation
since it defines a boundary of what can and cannot be computed. We can write a program that
accepts this problem, that is, that will answer true if the program halts but does not answer at all if
it does not. What we cannot write is a program that decides this problem. Research has shown that
all problems that can be accepted but not decided are reducible to the halting problem. If we had
a model of computation—something much different from anything we have imagined so far—which
could decide the halting problem, then all problems we can currently accept would then also be
decidable.

1 This actually would not type in ML because the application halt(m, m) requires the equation

’a = (’a -> ’b) to hold. ML cannot handle recursive types unless datatypes are used.

210
Chapter 33

Special topic: Comparison with

object-oriented programming

This chapter is for students who have taken a programming course using an object-oriented language
such as Java. If you plan to take such a course in the future, you are recommended to come back
and read this chapter after you have learned the fundamentals of class design, subtyping, and
polymorphism.
You have no doubt noticed the striking difference in flavor between functional programming
and object-oriented programming. A functional language views a program as a set of interacting
functions, whereas an object-oriented program is a set of interacting objects. A quick sampling of
their similarities will illuminate these differences. Here are two principles which cut across all styles
of programming which are relevant for our purposes:
• A program or system is comprised of data structures and functionality.
• A well-designed language encourages writing code that is modular (it is made up of small, semi-
autonomous parts), reusable (those parts can be plugged into other systems), and extensible
(the system can be modified by adding new parts).
Notice how functional and object-oriented styles address the first point, at least in the way they
are taught to beginners. In an object-oriented language, data structures and the functionality defined
on them are packaged together; a class represents the organization of data (the instance variables)
and operations (the instance methods) in one unit. In a functional language, the data (defined, for
example, by a datatype) is less tightly coupled to the functionality (the functions written for that
datatype). This touches on the second point as well: In a functional language, the primary unit of
modularity is the function, and in an object-oriented language, the primary unit of modularity is
the class.
To see this illustrated, consider this system in ML to model animals and the noises they make.
- datatype Animal = Dog | Cat ;

- fun happyNoise(Dog) = "pant pant"

= | happyNoise(Cat) = "purrrr"

- fun excitedNoise(Dog) = "bark"

= | excitedNoise(Cat) = "meow"

It is easy to extend the functionality (that is, add an operation). We need only write a new function,
without any change to the datatype or other functions.
- fun angryNoise(Dog) = "grrrrr"
= | angryNoise(Cat) = "hisssss"

211
CHAPTER 33. SPECIAL TOPIC: COMPARISON WITH OBJECT-ORIENTED PROGRAMMING

It is difficult, on the other hand, to extend the data. Adding a new kind of animal requires
changing the datatype and every function that operates on it. From the ML interpreter’s perspective,
this is rewriting the whole system from scratch.

- datatype Animal = Dog | Cat | Chicken;

- fun happyNoise(Dog) = "pant pant"

= | happyNoise(Cat) = "purrrr"
= | happyNoise(Chicken) = "cluck cluck";

- fun excitedNoise(Dog) = "bark"

= | excitedNoise(Cat) = "meow"
= | excitedNoise(Chicken) = "cockadoodledoo";

- fun angryNoise(Dog) = "grrrrr"

= | angryNoise(Cat) = "hisssss"
= | angryNoise(Chicken) = "squaaaack";

In an object-oriented setting, we have the opposite situation. A Java system equivalent to our
original ML example would be

interface Animal {
String happyNoise();
String excitedNoise();
}

class Dog implements Animal {

String happyNoise() { return "pant pant"; }
String excitedNoise() { return "bark"; }
}

class Cat implements Animal {

String happyNoise() { return "purrrrr"; }
String excitedNoise() { return "meow"; }
}

Although the interface demands that everything of type Animal will have methods happyNoise
and excitedNoise defined for it, the code for an operation like happyNoise is distributed among
the classes. The result is a system where it is very easy to extend the data; you simply write a new
class, without changing the other classes or the interface.

class Chicken implements Animal {

String happyNoise() { return "cluck cluck"; }
String excitedNoise() { return "cockadoodledoo"; }
}

The price is that we have made extending the functionality difficult. Adding a new operation
now requires a change to the interface and to every class.

interface Animal {
String happyNoise();
String excitedNoise();
String angryNoise();
}

212
CHAPTER 33. SPECIAL TOPIC: COMPARISON WITH OBJECT-ORIENTED PROGRAMMING

class Dog implements Animal {

String happyNoise() { return "pant pant"; }
String excitedNoise() { return "bark"; }
String angryNoise() { return "grrrrr"; }
}

class Cat implements Animal {

String happyNoise() { return "purrrrr"; }
String excitedNoise() { return "meow"; }
String angryNoise() { return "hissss"; }
}

class Chicken implements Animal {

String happyNoise() { return "cluck cluck"; }
String excitedNoise() { return "cockadoodledoo"; }
String angryNoise() { return "squaaaack"; }
}
We can lay out the types and the operations in a table to represent the system in a way inde-
pendent of either paradigm.

Dog Cat Chicken

happyNoise pant pant purrrr cluck cluck
excitedNoise bark meow cockadoodledoo
angryNoise grrrrr hisssss squaaaaack

Relative to this table, functional programming packages things by rows, and adding a row to
the table is convenient. Adding a column is easy is object-oriented programming, since a column is
encapsulated by a class.
It is worth noting that object-oriented programming’s most touted feature, inheritance, does not
touch this problem. Adding a new operation like angryNoise by subclassing may allow us to leave
the old classes untouched, but it does require writing three new classes and a new interface.
interface AnimalWithAngryNoise extends Animal {
String angryNoise();
}

class DogWithAngryNoise extends Dog implements AnimalWithAngryNoise {

String angryNoise() { return "grrrrr"; }
}

class CatWithAngryNoise extends Cat implements AnimalWithAngryNoise {

String angryNoise() { return "hissss"; }
}

class ChickenWithAngryNoise extends Cat implements AnimalWithAngryNoise {

String angryNoise() { return "squaaaack"; }
}
Worse yet, all the code that depends on the old classes and interfaces must be changed to refer to
the new (and oddly named) types.
This exemplifies a fundamental trade-off in designing a system. The Visitor pattern[8] is a way of
importing the advantages (and liabilities) of the functional paradigm to object-oriented languages.
A system that is easily extended by both data and functionality is a perennial riddle in software
design.

213
CHAPTER 33. SPECIAL TOPIC: COMPARISON WITH OBJECT-ORIENTED PROGRAMMING

214
Part VIII

Graph

215
Chapter 34

Graphs

34.1 Introduction
We commonly use the word graph to refer to a wide range of graphics and charts which provide
an illustration or visual representation of information, particularly quantitative information. In the
realm of mathematics, you probably most closely associate graphs with illustrations of functions in
the real-number plane. Graph theory, our topic in this part, is a field of mathematics that studies
a very specific yet abstract notion of a graph. It has many applications throughout mathematics,
computer science, and other fields, particularly for modeling systems and representing knowledge.
Unfortunately, the beginning student of graph theory will likely feel intimidated by the horde
of terminology required; the slight differences of terms among sources and textbooks aggravates the
situation. The student is encouraged to take careful stock of the definitions in these chapters, but
also to enjoy the beauty of graphs and their uses. This chapter will consider the basic vocabulary of
graph theory and a few results and applications. The following chapter will explore various kinds of
paths through graphs. Finally, we will use graph theory as a framework for discussing isomorphism,
a central concept throughout mathematics.
A graph G = (V, E) is a pair of finite sets, a set V of vertices (singular vertex ) and a set E of graph
pairs of vertices called edges. We will typically write V = {v1 , v2 , . . . , vn } and E = {e1 , e2 , . . . , em }
where each ek = (vi , vj ) for some vi , vj ; in that case, vi and vj are called end points of the edge ek . vertex
Graphs are drawn so that vertices are dots and edges are line segments or curves connecting two
edge
dots.
As an example of a mathematical graph and its relation to everyday visual displays, consider the end points
graph where the set of vertices is { Chicago, Gary, Grand Rapids, Indianapolis, Lafayette, Urbana,
Wheaton } and the edges are direct highway connections between these cities. We have the following
graph (with vertices and edges labeled). Notice how this resembles a map, simply more abstract
(for example, it contains no accurate information about distance or direction).

Grand Rapids
Wheaton
Chicago
I90 Gary
I88 I196
I294
I65n Lafayette
I57

I65s

Urbana
Indianapolis
I74

We call the edges pairs of vertices for lack of a better term; a pair is generally considered a

217
34.2. DEFINITIONS CHAPTER 34. GRAPHS

two-tuple (in this case, it would be an element of V × V ); moreover, we write edges with parentheses
and a comma, just as we would with tuples. However, we mean something slightly different. First,
tuples are ordered. In our basic definition of graphs, we assume that the end points of an edge are
unordered: we could write I57 as (Chicago, Urbana) or (Urbana, Chicago). Second, an edge as a
pair of vertices is not unique. In the cities example, we have duplicate entries for (Chicago, Gary):
both I90 and I294. This is why it is necessary to have two ways to represent an edge, a unique name
as well as a descriptive one.
The kinship between graphs and relations should be readily apparent. Graphs, however, are more
flexible. As a second example, this graph represents the relationships (close friendship, siblinghood,
or romantic involvement—labeled on the drawing but not as names for the edges) among the main
characters of Anna Karenina. V = { Karenin, Anna, Vronsky, Oblonsky, Dolly, Kitty, Levin }.
E = { (Karenin, Anna), (Anna, Vronsky), (Vronsky, Kitty), (Anna, Oblonsky), (Oblonsky, Dolly),
(Dolly, Kitty), (Oblonsky, Levin), (Kitty, Levin) }.

Vronsky
ex Kitty
ur
mo
ra

sp
g
pa

ou
lin

se
sib
Anna Dotty
Levin
spouse
se
ou

sib
sp

lin d
g en
fri
Karenin
Oblonski

34.2 Definitions
incident An edge (vi , vj ) is incident on its end points vi and vj ; we also say that it connects them. If vertices
connects vi and vj are connected by an edge, they are adjacent to one another. If a vertex is adjacent to
adjacent itself, that connecting edge is called a self-loop. If two edges connect the same two vertices, then
self-loop those edges are parallel to each other. Below, e1 is incident on v1 and v4 . e10 connects v7 and v6 .
parallel v9 and v6 are adjacent. e8 is a self-loop. e4 and e5 are parallel.
e8
v1 e1 v
5
v4 e2

e e6
3

v3 e7
e4

v v6
2 e5
e 10

v7
e
9

e e 14
12
e 11

v9
e 13
v8

degree The degree deg(v) of a vertex v is the number of edges incident on the vertex, with self-loops
subgraph counted twice. deg(v1 ) = 2, deg(v5 ) = 3, and deg(v2 ) = 4. A subgraph of a graph G = (V, E)
is a graph G0 = (V 0 , E 0 ) where V 0 ⊆ V and E 0 ⊆ E (and, by definition of graph, for any edge
simple (vi , vj ) ∈ E 0 , vi , vj ∈ V 0 ). A graph G = (V, E) is simple if it contains no parallel edges or self-
loops. The graph ({v1 , v2 , v3 , v4 , v5 }, {e1, e2 , e3 , e4 , e6 }) is a simple subgraph of the graph shown.

218
CHAPTER 34. GRAPHS 34.3. PROOFS

A simple graph G = (V, E) is complete if for all vi , vj ∈ V , the edge (vi , vj ) ∈ E. The subgraph complete
({v7 , v8 , v9 }, {e11 , v12 , v13 }) is complete. The complement of a simple graph G = (V, E) is a graph
G = (V, E 0 ) where for vi , vj ∈ V , (vi , vj ) ∈ E 0 if (vi , vj ) ∈
/ E; in other words, the complement has all complement
the same vertices and all (and only) those possible edges that are not in the original graph. The com-
plement of the subgraph ({v3 , v4 , v6 , v7 }, {e6 , e7 , e10 }) is ({v3 , v4 , v6 , v7 }, {(v3 , v7 ), (v7 , v4 ), (v3 , v6 )},
as shown below.
v4 v4

e6
v e7 v
3 3

v6 v6
e 10

v7 v7

Even though we tend to think of a graph fundamentally as a picture, it is important to notice

that a formal description of a graph is independent of the way it is drawn. Important to the essence
of a graph is merely the names of vertices and edges and their connections (and in Chapter 36, we
will see that even the names are not that important). The following two pictures show the same
graph.
v 1
v1
e e1
e 4
1
v5 e5 v4
v2 v
3

e2
e3
e e
2
4
e
3
v3 e v
v4 v2 5 5

A directed graph is a graph where the edges are ordered pairs, that is, edges have directions. directed graph
Pictorially, the direction of the edge is shown with arrows. Notice that a directed graph with no
parallel edges is the same as a set together with a relation on that set. For this reason, we were able
to use directed graphs to visualize relations in Part V. In a directed graph, we must differentiate
between a vertex’s in-degree, the number of edges towards it, and its out-degree, the number of edges in-degree
away from it.
out-degree

34.3 Proofs
By now you should have achieved a skill level for writing proofs at which it is appropriate to ease
up on the formality slightly. The definitions in graph theory do not avail themselves to proofs as
detailed as those we have written for sets, relations, and functions, and graph theory proofs tend to
be longer anyway. Do not be misled, nevertheless, into thinking that this is lowering the standards
for logic and rigor; we will merely be stepping over a few obvious details for the sake of notation,
length, and readability. The proof of the following proposition shows what sort of argumentation is
expected for these chapters.
n
X
Theorem 34.1 (Handshake.) If G = (V, E) is a graph with V = {v1 , v2 , . . . , vn }, then deg(vi ) =
i=1
2 · |E|.

Proof. By induction on the cardinality of E. First, suppose that G has no edges, that

219
34.4. GAME THEORY CHAPTER 34. GRAPHS

n
X
is, |E| = 0. Then for any vertex v ∈ V , deg(v) = 0. Hence deg(vi ) = 0 = 2·0 = 2·|E|.
i=1
n
X
Hence there exists an N ≥ 0 such that for all m ≤ N , if |E| = m then deg(vi ) = 2·|E|.
i=1
Now suppose |E| = N +1, and suppose e ∈ E. Consider the subgraph of G, G0 = (V, E −
{e}). We will write the degree of v ∈ V when it is being considered a vertex in G0 instead
n
X
of G as deg0 (v). |E − {e}| = N , so by our inductive hypothesis deg0 (vi ) = 2 · |E − {e}|.
i=1
Suppose vi , vj are the end points of e. If vi = vj , then deg(vi ) = deg0 (vi ) + 2; otherwise,
deg(vi ) = deg0 (vi ) + 1 and deg(vj ) = deg0 (vj ) + 1; both by the definition of degree. For
any other vertex v ∈ V , where v 6= vi and v 6= vj , we have deg(v) = deg0 (v).
n
X n
X
Hence deg(vi ) = 2 + deg0 (vi ) = 2 + 2 · |E − {e}| = 2 + 2(|E| − 1) = 2 · |E|. 2
i=1 i=1

Here are the things left out of this proof.

• The third sentence makes the unjustified claim that having no edges implies every vertex has
a degree of zero. This follows immediately from the definition of degree, and it is the best we
can do without a formal notion of what “the number of edges” means. Keep in mind that the
only formal mechanism we have developed for reasoning about quantity is cardinality.

• In the fourth sentence, substitution and rules of arithmetic are used without citation.

• The claim |E − {e}| = N depends on Lemma 31.1 and the facts that E and {e} are disjoint
that |{e}| = 1.

• The sixth sentence of the second paragraph together with the last sentence claims that whether
or not e is a self loop, it contributes two to the total sum of degrees. It is difficult to state this
more formally.

• The last sentence uses arithmetic, algebra, and substitution uncited.

34.4 Game theory

Finally, we consider an application of graph theory. Graphs can be used to enumerate the possible
outcomes in the playing of a game or manipulation of a puzzle, a method used in game theory,
decision theory, and artificial intelligence. Consider the following puzzle:

You must transport a cabbage, a goat, and a wolf across a river using a boat. The boat
has only enough room for you and one of the other objects. You cannot leave the goat
and the cabbage together unsupervised, or the goat will eat the cabbage. Similarly, the
wolf will eat the goat if you are not there to prevent it. How can you safely transport all
of them to the other side?

We will solve this puzzle by analyzing the possible “states” of the situation, that is, the possible
places you, the goat, the wolf, and the cabbage can be, relative to the river; and the “moves” that
can be made between the states, that is, your rowing the boat across the river, possibly with one of
the objects. Let f stand for you, g for the goat, w for the wolf, and c for the cabbage. / will show
how the river separates all of these. For example, the initial state is f gwc/, indicating that you and
all the objects are on one side of the river. If you were to row across the river by yourself, this would
move the puzzle into the state gwc/f , which would be a failure. Our goal is to find a series of moves
that will result in the state /f gwc.
First, enumerate all the states.

220
CHAPTER 34. GRAPHS 34.4. GAME THEORY

fcgw/ fcg/w fw/cg c/fgw

cgw/f gw/fc fg/cw g/fcw

fgw/c cw/fg fc/gw w/fcg

fcw/g cg/fw f/cgw /fcgw

Now, mark the starting state with a double circle, the winning state with a triple circle, each
losing state with a square, and every other state with a single circle. These will be the vertices in
our graph.

fcgw/ fcg/w fw/cg c/fgw

cgw/f gw/fc fg/cw g/fcw

fgw/c cw/fg fc/gw w/fcg

fcw/g cg/fw f/cgw /fcgw

Finally, we draw edges between states to show what would happen if you cross the river carrying
one or zero objects.

cgw/f

fgw/c c/fgw fc/gw

gw/fc

fcgw/ fcw/g g/fcw fg/cw /fcgw f/cgw

cw/fg

fcg/w w/fcg fw/cg

cg/fw

The puzzle is solved by finding a route through this graph (in the next chapter we shall see
that the technical term for this is path) from the starting state to the finishing state, never passing
through a losing state. One possible route informs you to transport them all by first taking over
the goat, coming back (alone), transporting the cabbage, coming back with the goat, transporting
the wolf, coming back (alone), and transporting the goat again. (One could argue that f /cgw is an
unreachable state, since you would first need to win in order to lose in that way.)
Theoretically, this strategy could be used to write an unbeatable chess-playing program: let each
vertex represent a legal position (or state) in a chess game, and let the edges represent how making a
move changes the position. Then trace back from the positions representing checkmates against the
computer and mark the edges that lead to them, so the computer will not choose them. However,
the limitations of time and space once again hound us: There are estimated to be between 10 43 and
1050 legal chess positions.

221
34.4. GAME THEORY CHAPTER 34. GRAPHS

Exercises

www.plastelina.net/games/game2.html. While there is

1. Complete the on-line graph drills found at nothing funny about missionaries being attacked, it is an
www.ship.edu/~deensl/DiscreteMath/flash/ch7/sec7 3 interesting puzzle. Play that game, and use a graph to show
/planargraphs.html all the states, including a winning solution.
2. Find the degree of every vertex in the graph for the cabbage, 6. Suppose there is a party attended by eleven people. Is it pos-
goat, and wolf puzzle. sible for each person to shake hands with exactly five other
3. Draw the complement of the cabbage, goat, and wolf graph. people? Explain. (Hint: Notice the name of Theorem 34.1.)
4. Draw a graph that would represent the moves in the puzzle
7. Prove that for any graph the number of vertices with an odd
if you were allowed to transport zero, one, or two objects at
degree is even.
a time. Notice that the vertices are the same.
5. A puzzle related to the cabbage, goat, and wolf puz- 8. Prove that if a graph G = (V, E) is complete, then |E| =
|V | (|V |−1)
zle called Missionaries and Cannibals can be found at 2
.

222
Chapter 35

Paths and cycles

35.1 Walks and paths

In all the following definitions, assume G = (V, E) is a graph.
A walk from vertex v to vertex w, v, w ∈ V , is a sequence alternating between vertices in V walk
and edges in E, written v0 e1 v1 e2 . . . vn−1 en vn where v0 = v and vn = w and for all i, 1 ≤ i < n,
ei = (vi−1 , vi ). v is called the initial vertex and w is called the terminal vertex. A walk is trivial initial
if it contains only one vertex and no edges; otherwise it is nontrivial. The length of a walk is the
number of edges (not necessarily distinct, since an edge may appear more than once). In the graph terminal
below, some examples of non-trivial walks are v1 e1 v2 e4 v6 e9 v8 e11 v7 e10 v6 e8 v9 with length 6, v5 e14 v15
trivial
with length 1, and v11 e21 v12 e17 v9 e18 v13 e22 v12 e17 v9 e18 v1 3e23 v14 with length 7.
length
v1 e1 v2 e2 v3 e3 v4 v5

v6 e5 v7
e4 e6
e 10
e e 13 e 14
e8 9 v8 e 11
e7 e 12

e 15 e 16
v9 v 10
e 18 e 19 e 20
v 11 v12 e 17 v 15
v 14
e 21 e 22 v13 e 23

A graph is connected if for all v, w ∈ V , there exists a walk in G from v to w. This graph is not connected
connected, since no walk exists from v5 or v15 to any of the other vertices. However, the subgraph
excluding v5 , v15 , and e14 is connected.
A path is a walk that does not contain a repeated edge. v1 e1 v2 e4 v6 e9 v8 e11 v7 e10 v6 e8 v9 is a path, path
but v11 e21 v12 e17 v9 e18 v13 e22 v12 e17 v9 e18 v13 . is not. If the walk contains no repeated vertices, except
possibly the initial and terminal, then the walk is simple. v1 e1 v2 e4 v6 e9 v8 e11 v7 e10 v6 e8 v9 is not simple, simple
since v6 occurs twice. Its subpath v8 e11 v7 e10 v6 e8 v9 is simple.
Propositions about walks and paths require careful use of the notation we use to denote a walk.
Observe the process used in this example.

Theorem 35.1 If G = (V, E) is a connected graph, then between any two distinct vertices of G
there exists a simple path in G.

The first thing you must do is understand what this theorem is claiming. A quick read might
mislead someone to think that this is merely stating the definition of connected. Be careful—being
connected only means that any two vertices are connected by a walk, but this theorem claims that
they are connected by a simple path, that is, without repeated edges or vertices. It turns out that
whenever a walk exists, a simple path must also.

223
35.2. CIRCUITS AND CYCLES CHAPTER 35. PATHS AND CYCLES

Lemma 35.1 If G = (V, E) is a graph, v, w ∈ V , and there exists a walk from v to w, then there
exists a simple path from v to w.

We will use a notation where we put subscripts on ellipses. The ellipses stand for subwalks which
we would like to splice in and out of walks we are constructing.

Proof. Suppose G = (V, E) is a graph, v, w ∈ V , and there exists a walk c =

v0 e1 v1 . . . en vn in G with v0 = v and vn = w.
First, suppose that c contains a repeated edge, say ex = ey for 0 6= x < y 6= n, so that
we can write c = v0 e1 v1 . . .1 vx−1 ex vx . . .2 vy−1 ey vy ey+1 . . .3 en vn . Since ex = ey , then
either vx−1 = vy−1 and vx = vy or vx−1 = vy and vx = vy−1 . Then create a new walk c0
in this way:
If vx−1 = vy−1 and vx = vy , then let c0 = v0 e1 v1 . . .1 vx−1 ex vx ey+1 . . .3 en vn . Otherwise
(if vx−1 = vy and vx = vy−1 ), let c0 = v0 e1 v1 . . .1 vx−1 ey+1 . . .3 en vn . Repeat this process
until we obtain a walk c00 with no repeated edges. (If c had no repeated edges in the first
place, then c00 = c.)
Next, suppose that c00 contains a repeated vertex, say, vx = vy for x < y, so that
we can write c = v0 e1 v1 . . .1 ex vx ex+1 . . .2 ey vy ey+1 . . .3 en vn . Then create a new walk
c000 = v0 e1 v1 . . .1 ex vx ey+1 . . .3 en vn . Repeat this process until we obtain a walk civ with
no repeated vertices.
Then civ is a walk from v to w, and since it repeats neither vertex nor edge, it is a simple
path. 2

To meet the burden of this existence proof, we produced a walk that fulfills the requirements.
This is a constructive proof, and it gives an algorithm for deriving a simple path from any walk in
an undirected graph. It also makes a quick proof for the earlier theorem.

Proof (of Theorem 35.1). Suppose G = (V, E) is a connected graph and v, w ∈ V .

By definition of connected, there exists a walk from v to w. By Lemma 35.1, there exists
a simple path from v to w. 2

35.2 Circuits and cycles

closed If v = w (that is, the initial and terminal vertices are the same), then the walk is closed . A circuit
circuit is a closed path. A cycle is a simple circuit. In the earlier example, v6 e9 v8 e11 v7 e12 v10 e16 v8 e15 v9 e8 v6
cycle is a circuit, but not a cycle, since v8 is repeated. v2 e4 v6 e8 v9 e17 v12 e7 v2 is a cycle. If a circuit or cycle
c comprises the entire graph (that is, for all v ∈ V and e ∈ E, v and e appear in c), then we will say
that the graph is a circuit or cycle.
Proofs of propositions about circuits and cycles become monstrous because the things we must
demonstrate proliferate. To show that something is a cycle requires showing that it is closed, that
it is simple, and that it is a path, in addition to the specific requirements of the present proposition.

Theorem 35.2 If G = (V, E) is a connected graph and for all v ∈ V , deg(v) = 2, then G is a
circuit.

This requires us to show a walk that is a circuit and comprises the entire graph.

Proof. Suppose G = (V, E) is a connected graph and for all v ∈ V , deg(v) = 2.

First suppose |V | = 1, that is, there is only one vertex, v. Since deg(v) = 2, this implies
that there is only one edge, e = (v, v). Then the circuit vev comprises the entire graph.

This looks like the beginning of a proof by induction, but actually it is a division into cases.
We are merely getting a special case out of the way. We want to use the fact that there can be no
self-loops, but that is true only if there are more than one vertex.

224
CHAPTER 35. PATHS AND CYCLES 35.3. EULER CIRCUITS AND HAMILTONIAN CYCLES

Next suppose |V | > 1. By Exercise 6, G has no self-loops. Then construct a walk c in

this manner: Pick a vertex v1 ∈ V and an edge e1 = (v1 , v2 ). Since deg(v1 ) = 2, e must
exist, and since G contains no self-loops, v1 6= v2 . Since deg(v2 ) = 2, there exists another
edge, e2 = (v2 , v3 ) ∈ E. Continue this process until we reach a vertex already visited, so
that we can write c = v1 e1 e2 v3 . . . ex−1 vx where vx = vi for some i, 1 ≤ i < x. We will
reach such a vertex eventually because V is finite.

We have constructed a walk. We must show that it meets the requirements we are looking for.

Only one vertex in c is repeated, since reaching a vertex for the second time stops the
building process. Hence c is simple.
Since we never repeat a vertex (until the last), each edge chosen leads to a new vertex,
hence no edge is repeated in c, so c is a path.
We are always choosing the edge other than the one we took into a vertex, so i 6= x − 1.
Suppose i 6= 1. Since no other vertex is repeated, vi−1 , vi+1 , and vx−1 are distinct.
Therefore, distinct edges (vi−1 , vi ), (vi , vi+1 ), and (vx−1 , vi ) all exist, and so deg(vi ) ≥ 3.
Since deg(vi ) = 2, this is a contradiction. Hence i = 1. Moreover, v1 = vx and c is
closed.
As a closed, simple path, c is a circuit.
Suppose that a vertex v ∈ V is not in c, and let v 0 be any vertex in c. Since G is
connected, there must be a walk, c0 from v to v 0 , and let edge e0 be the first edge in c0
(starting from v 0 ) that is not in c, and let v 00 be an endpoint in c0 in c. Since two edges
incident on v 00 occur in c, accounting for e0 means that deg(v 00 ) ≥ 3. Since deg(vi ) = 2,
this is a contradiction. Hence there is no vertex not in c.
Suppose that an edge e ∈ E is not in c, and let v be an endpoint of e. Since v is in
the circuit, there exist distinct edges e1 and e2 in c that are incident on v, implying
deg(v) ≥ 3. Since deg(v) = 2, this is a contradiction. Hence there is no edge not in c.
Therefore, c is a circuit that comprises the entire graph, and G is a circuit. 2

The reasoning becomes very informal, especially the last two point about the circuit comprising
the whole graph. However, make sure you see that the basic logic is still present: These are merely
proofs of set emptiness.

35.3 Euler circuits and Hamiltonian cycles

Leonhard Euler proposed a problem based on the bridges in the town of Königsberg, Prussia. Two
branches in of the Pregel River converge in the town, delineating it into a north part, a south part,
an east part, and an island in the middle, as shown below. In Euler’s time, the east part had one
bridge to each the north part and the south part, the island had two bridges each to the north part
and the south part, and one bridge connected the east part and the island. Supposing your house
was in any of the four parts, is it possible to walk around town (beginning and ending at your house)
and cross every bridge exactly once?

We can turn this into a graph problem by representing the information with a graph whose
vertices stand for the parts of town and whose edges stand for the bridges, as displayed above. Let

225
35.3. EULER CIRCUITS AND HAMILTONIAN CYCLES CHAPTER 35. PATHS AND CYCLES

G = (V, E) be a graph. An Euler circuit of G is a circuit that contains every vertex and every edge. Eule
(Since it is a circuit, this also means that an Euler circuit contains very edge exactly once. Vertices,
however, may be repeated.) The question now is whether or not this graph has an Euler circuit. We
can prove that it does not, and so such a stroll about town is impossible.

Theorem 35.3 If a graph G = (V, E) has an Euler circuit, then every vertex of G has an even
degree.

Proof. Suppose graph G = (V, E) has an Euler circuit c = v0 e1 v1 . . . en vn , where

v0 = vn . For all e ∈ E, e appears in c exactly once, and for all v ∈ V , v appears at least
once, by definition of Euler circuit. Suppose vi ∈ V , vi 6= v1 , and let x be the number
of times vi appears in c. Since each appearance of a vertex corresponds to two edges,
deg(vi ) = 2x, which is even by definition.
For v1 , let y be the number of times it occurs in c besides as an initial or terminal vertex.
These occurrences correspond to 2y incident edges. Moreover, e1 and en are incident on
v1 , and hence deg(vi ) = 2y + 1 + 1 = 2(y + 1), which is even by definition. (If either
e1 and en are self-loops, they have already been accounted for once, and we are rightly
counting them a second time.)
Therefore, all vertices in G have an even degree. 2

The northern, eastern, and southern parts of town each have odd degrees, so by the contrapositive
of this theorem, no Euler circuit around town exists.
Hamiltonian cycle Another interesting case is that of a Hamiltonian cycle, which for a graph G = (V, E) is a cycle
that includes every vertex in V . Since it is a cycle, this means that no vertex or edge is repeated;
however, not all the edges need to be included. We reserve one Hamiltonian cycle proof for the
exercises, but here is a Hamiltonian cycle in a graph similar to the one at the beginning of this
chapter (with the disconnected subgraph removed).
v1 e1 v2 e2 v3 e3 v4

v6 e5 v7
e4 e6
e 10
e e 13
e e8 9 v8 e 11
24
e7 e 12

e 15 e 16
v9 v 10
e 18 e 19 e 20
v v12 e 17
11 v 14
e 21 e 22 v13 e 23

226
CHAPTER 35. PATHS AND CYCLES 35.3. EULER CIRCUITS AND HAMILTONIAN CYCLES

Exercises

12. Find a Hamiltonian cycle in the following graph (copy it and

1. Complete the on-line graph drills found at highlight the walk on your copy).
www.ship.edu/~deensl/DiscreteMath/flash/ch7/sec7 1
/euler.html and www.ship.edu/~deensl/DiscreteMath
/flash/ch7/sec7 7/hamiltongraphs.html
2. Draw a graph that has a simple walk that is not a path.
Prove. Assume G = (V, E) is a graph.
3. If G is connected, then |E| ≥ |V | − 1.
4. If G is connected, |V | ≥ 2, and |V | > |E|, then G has a
vertex of degree 1.
13. If G has a non-trivial Hamiltonian cycle, then G has a sub-
5. If G is not connected, then G is connected. graph G0 = (V 0 , E 0 ) such that
6. If G is connected, for all v ∈ V , deg(v) = 2, and |V | > 1, • G0 contains every vertex of G (V 0 = V )
then G has no self-loops.
• G0 is connected,
7. Every circuit in G contains a subwalk that is a cycle.
• G0 has the same number of edges as vertices (|V 0 | =
8. If for all v ∈ V , deg(v) ≥ 2, then G contains a cycle. |E 0 |), and
9. If v, w ∈ V are part of a circuit c and G0 is a subgraph of G • every vertex of G0 has degree 2.
formed by removing one edge of c, then there exists a path
from v to w in G0 . 14. Reinterpret the Königsburg bridges problem by making a
graph whose vertices represent the bridges and whose edges
10. If G has no cycles, then it has a vertex of degree 0 or 1. represent land connections between bridges. What problem
11. If G has an Euler circuit, then it is connected. are you solving now?

227
35.3. EULER CIRCUITS AND HAMILTONIAN CYCLES CHAPTER 35. PATHS AND CYCLES

228
Chapter 36

Isomorphisms

36.1 Definition
We have already seen that the printed shape of the graph—the placement of the dots, the resulting
angles of the lines, any curvature of the lines—is not of the essence of the graph. The only things that
count are the names of the vertices and edges and the abstract shape, that is, the connections that
the edges define. However, consider the two graph representations below, which illustrate the graphs
G = (V = {v1 , v2 , v3 , v4 }, E = {e1 = (v1 , v2 ), e2 = (v2 , v3 ), e3 = (v3 , v4 ), e4 = (v4 , v1 ), e5 = (v1 , v3 )})
and G0 = (W = {w1 , w2 , w3 , w4 , w5 }, F = {f1 = (w1 , w2 ), f2 = (w2 , w3 ), f3 = (w3 , w4 ), f4 =
(w3 , w1 ), f5 = (w4 , w2 )}).

v1 e1 v2
w f
1 1 w
2

e e f f5
4 5 4
e2

f
2

v e3 w w
v 4 f 3
4 3 3

These graphs have much in common. Both have four vertices and five edges. Both have
two vertices with degree two and two vertices with degree three. Both have a Hamiltonian cy-
cle (v1 e1 v2 e2 v3 e3 v4 e4 v1 and w1 f1 w2 f5 w4 f3 w3 f4 w1 , leaving out e5 and f2 , respectively) and two
other cycles (involving e5 and f2 ). In fact, if you imagine switching the positions of w1 and w2 , with
the edges sticking to the vertices as they move, and then doing a little stretching and squeezing, you
could transform the second graph until it appears identical to the first.
In other words, these two really are the same graph, in a certain sense of sameness. The only
difference is the arbitrary matter of names for the vertices and edges. We can formalize this by
writing renaming functions, g : V → W and h : E → F .

e h(e)
v g(v)
e1 f5
v1 w2
e2 f3
v2 w4
e3 f4
v3 w3
e4 f1
v4 w1
e5 f2
The term for this kind of equivalence is isomorphism, from the Greek roots iso meaning “same” isomorphism

229
36.2. ISOMORPHIC INVARIANTS CHAPTER 36. ISOMORPHISMS

and morphë meaning “shape.” This is a way to recognize identical abstract shapes of graphs, that
two graphs are the same up to renaming. Let G = (V, E) and G0 = (W, F ) be graphs. G is
isomorphic isomorphic to G0 if there exist one-to-one correspondences g : V → W and h : E → F such that
for all v ∈ V and e ∈ E, v is an endpoint of e iff g(v) is an endpoint of h(e). The two functions
g and h, taken together, are usually referred to as the isomorphism itself, that is, there exists an
isomorphism, namely g and h, between G and G0 .
As we shall see, what characterizes isomorphisms is that they preserve properties, that is, there
are many graph properties which, if they are true for one graph, are true for any other graph isomor-
phic to that graph. Graph theory is by no means the only area of mathematics where this concept
occurs. Group theory (a main component of modern algebra) involves isomorphisms as functions
that map from one group to another in such a way that operations are preserved. Isomorphisms
define equivalences between matrices in linear algebra. The main concept of isomorphism is the
independence of structure from data. If an isomorphism exists between two things, it means that
they exist as parallel but equivalent universes, and everything that happens in one universe has an
equivalent event in the other that keeps them in step.

36.2 Isomorphic invariants

isomorphic invariant An isomorphic invariant is a property that is preserved through an isomorphism. In other words,
proposition P is an isomorphic invariant if for any graphs G and G0 , if P (G) and G is isomorphic
to G0 , then P (G0 ). Some isomorphic invariants are simple and therefore quite easy to prove; others
require subtle and complex proofs. To illustrate what is meant, consider this theorem.

Theorem 36.1 For any k ∈ N, the proposition P (G) = “G has a vertex of degree k” is an isomor-
phic invariant.

This is not a difficult result to prove as long as one can identify what burden is required. Being
an isomorphic invariant has significance only when two pieces are already in place: We have two
graphs known to be isomorphic and that the proposition is true for one of those graphs.

Proof. Suppose k ∈ N, G = (V, E) is a graph which has a vertex v ∈ V with degree k,

and G0 is a graph to which G is isomorphic.

Now we can assume and use the definition of isomorphic (those handy one-to-one correspondences
must exist), and we must prove that G0 has a vertex of degree k. The definition of degree also comes
into play, especially in distinguishing between self-loops and other edges.

By definition of degree, there exist edges e1 , e2 , . . . , en ∈ E, non self-loops, and e01 , e02 , . . . e0m ∈
E, self-loops, that are incident on v, such that k = n + 2m.
By the definition of of isomorphism, there exist one-to-one correspondences g and h with
the isomorphic property.

Saying “with the isomorphic property” spares the trouble of writing out “for all v 0 ∈ V ” etc, and
instead more directly we claim

For each ei , 1 ≤ i ≤ n, h(ei ) has g(v) as one endpoint, and for each e0j , 1 ≤ j ≤ m, h(e0j )
has g(v) as both endpoints, and no other edge has g(v) as an endpoint. Each h(ei ) and
h(e0j ) is distinct since h is one-to-one. Hence

deg(g(v)) = n + 2m = k

Therefore G0 has a vertex, namely g(v), with degree k. 2

230
CHAPTER 36. ISOMORPHISMS 36.3. THE ISOMORPHIC RELATION

36.3 The isomorphic relation

The proof in the previous section contained the awkward clause “G0 is a graph to which G is
isomorphic.” This was necessary because our definitions directly allow only discussion of when one
graph is isomorphic to another, not necessarily the same thing as, for example, “G 0 is isomorphic to
G.” However, your intuition should suggest to you that these in fact are the same thing, that if G is
isomorphic to G0 then G0 is also isomorphic to G, and we may as well say “G and G0 are isomorphic
to each other.”
If we think of isomorphism as a relation (the relation “is isomorphic to”), this means that the
relation is symmetric. We can go further, and observe that isomorphism defines an equivalence class
for graphs.

Theorem 36.2 The relation R on graphs defined that (G, G0 ) ∈ R if G is isomorphic to G0 is an

equivalence relation.

Proof. Suppose R is a relation on graphs defined that (G, G0 ) ∈ R if G is isomorphic to

G0 .
Reflexivity. Suppose G = (V, E) is a graph. Let g = iV and h = iE (that is, the identity
functions on V and E, respectively). By Exercise 8 of Chapter 25, g and h are one-to-one
correspondences. Now suppose v ∈ V and e ∈ E. Then g(v) = v and h(e) = e, so if v
is an endpoint of e then g(v) is an endpoint of h(e), and if g(v) is an endpoint of h(e),
then v is an endpoint of e. Hence g and h are an isomorphism between G and itself,
(G, G) ∈ R, and R is reflexive.
Symmetry. Suppose G = (V, E) and G0 = (W, F ) are graphs and (G, G0 ) ∈ R, that is,
G is isomorphic to G0 . Then there exist one-to-one correspondences g : V → W and
h : E → F with the isomorphic property. Since g and h are one-to-one correspondences,
their inverses, g 0 : W → V and h0 : F → E, respectively, exist. Suppose w ∈ W
and f ∈ F . By definition of inverse, g(g 0 (w)) = w and h(h0 (f )) = f . Suppose w is an
endpoint of f . By definition of isomorphism, g 0 (w) is an endpoint of h0 (f ). Then suppose
g 0 (w) is an endpoint of h0 (f ). Also by definition of isomorphism, w is an endpoint of f .
Hence g 0 and h0 are an isomorphism from G0 to G, (G0 , G) ∈ R, and R is symmetric.
Transitivity. Exercise 2.
Therefore, R is an equivalence relation. 2

There we have it—our “certain sense of sameness” is an example of mathematical equivalence.

This is what abstract thinking is about, being able to ignore the detail to see what unifies. This
proof, especially the part about symmetry, has much important review tucked into it. We had to
notice that g and h were one-to-one correspondences before we could assume inverse functions. We
see again the pattern of analysis (taking apart what it means for G to be isomorphic to G 0 ) and
synthesis (constructing what it means for G0 to be isomorphic to G). The isomorphic property is an
“iff” property, so we were required to prove the matter of endpoints going either direction.

36.4 Final bow

We end this chapter, this part, and the entire book with a real workout.

Theorem 36.3 Having a Hamiltonian cycle is an isomorphic invariant.

Proof. Suppose G = (V, E) and G0 = (W, F ) are isomorphic graphs, and suppose that
G has a Hamiltonian cycle, say c = v1 e1 v2 . . . en−1 vn (where v1 = vn ). By the definition
of isomorphism, there exist one-to-one correspondences g and h with the isomorphic
property.

231
36.4. FINAL BOW CHAPTER 36. ISOMORPHISMS

Suppose ei is any edge in c, incident on vi and vi+1 . By the definition of isomor-

phism, h(ei ) is incident on g(vi ) and g(vi+1 ). This allows us to construct the walk
c0 = g(v1 )h(e1 )g(v2 ) . . . h(en−1 )g(vn ).
By definition of Hamiltonian cycle, every v ∈ V occurs in c. Since g is onto, every vertex
w ∈ W occurs in c0 , and since g is one-to-one, no vertex occurs more than once in c0
except g(v1 ) = g(vn ). Also by definition of Hamiltonian cycle, no edge e ∈ E occurs
more than once in c. Since h is one-to-one, no edge f ∈ F occurs more than once in c0 .
Hence c0 is a Hamiltonian cycle. Therefore, having a Hamiltonian cycle is an isomorphic
invariant. 2

232
CHAPTER 36. ISOMORPHISMS 36.4. FINAL BOW

Exercises

4. G has n vertices, for n ∈ N.

1. Complete the on-line graph drills found at 5. G has m edges, for m ∈ N.
www.ship.edu/~deensl/DiscreteMath/flash/ch3/sec7 3
/isomorphism.html 6. G has n vertices of degree k, for k ∈ N.
2. Prove that the relation “is isomorphic to” over graphs is 7. G has a circuit of length k, for k ∈ N.
transitive.
8. G has a cycle of length k, for k ∈ N.
3. Prove that if graphs G and H are isomorphic, then their
complements, G and H are isomorphic. 9. G is connected.
Prove that the following are isomorphic invariants. 10. G has an Euler circuit.

233
36.4. FINAL BOW CHAPTER 36. ISOMORPHISMS

234
Bibliography

[1] Harold Abelson and Gerald Jay Sussman with Julie Sussman. Structure and Interpretation of
Computer Programs. McGraw Hill and the MIT Press, Cambridge, MA, second edition, 1996.
[2] Mary Chase. Harvey. Dramatists Play Service, Inc., New York, 1971. Originally published in
1944.
[3] G.K. Chesteron. Orthodoxy. Image Books, Garden City, NY, 1959.
[4] Thomas H. Cormen, Charles E. Leiserson, Ronald L. Rivest, and Clifford Stein. Introduction
to algorithms. McGraw-Hill and MIT Press, second edition, 2001.
[5] Sussana S. Epp. Discrete Mathematics with Applications. Thomson Brooks/Cole, Belmont,
CA, third edition, 2004.
[6] Matthias Felleisen and Daniel P Friedman. The Little MLer. MIT Press, Cambridge, MA, 1998.
[7] H. W. Fowler. A Dictionary of Modern English Usage. Oxford University Press, Oxford, 1965.
Originally published 1926. Revised by Ernest Gowers.
[8] Erich Gamma, Richard Helm, Ralph Johnson, and John Vlissides. Design Patterns: Elements
of Reusable Object-Oriented Software. Addison-Wesley, 1995.
[9] Karel Hrbacek and Thomas Jech. Introduction to Set Theory. Marcel Dekker, New York, 1978.
Reprinted by University Microfilms International, 1991.
[10] Iu I Manin. A course in mathematical logic. Graduate texts in mathematics. Springer Verlag,
New York, 1977. Translated from the Russian by Neal Koblitz.
[11] George Pólya. Induction and Analogy in Mathematics. Princeton University Press, 1954. Volume
I of Mathematics and Plausible Reasoning.
[12] Michael Stob. Writing proofs. Unpublished, September 1994.
[13] Geerhardus Vos. Biblical Theology. Banner of Truth, Carlisle, PA, 1975. Originally published
by Eerdmans, 1948.
[14] Samuel Wagstaff. Fermat’s last theorem is true for any exponent less than 1000000 (abstract).
AMS Notices, 23(167):A–53, 1976.

235

The Joy of Factoring PDF
No ratings yet
The Joy of Factoring PDF
311 pages
Arnold Poofs Text PDF
No ratings yet
Arnold Poofs Text PDF
165 pages
Vakil - The Rising Sea. Foundations of Algebraic Geometry
No ratings yet
Vakil - The Rising Sea. Foundations of Algebraic Geometry
850 pages
Watson S., Steprans J. - Set Theory and Its Applications (1989)
No ratings yet
Watson S., Steprans J. - Set Theory and Its Applications (1989)
232 pages
Knebusch M Scheiderer C Real Algebra A First Course
No ratings yet
Knebusch M Scheiderer C Real Algebra A First Course
217 pages
A Complete Solution Guide To Complex Analysis 3rd Bak
No ratings yet
A Complete Solution Guide To Complex Analysis 3rd Bak
253 pages
(Pure and Applied Undergraduate Texts 43) Allan Bickle - Fundamentals of Graph Theory (2020, American Mathematical Society) - Libgen - Li
100% (1)
(Pure and Applied Undergraduate Texts 43) Allan Bickle - Fundamentals of Graph Theory (2020, American Mathematical Society) - Libgen - Li
354 pages
(Problem Books in Mathematics) Marek Capiński, Tomasz Zastawniak (Auth.) - Probability Through Problems-Springer-Verlag New York (2001)
No ratings yet
(Problem Books in Mathematics) Marek Capiński, Tomasz Zastawniak (Auth.) - Probability Through Problems-Springer-Verlag New York (2001)
262 pages
Browne - Grassmann Algebra With Mathematica PDF
100% (1)
Browne - Grassmann Algebra With Mathematica PDF
449 pages
Peter Smith - Category Theory, A Gentle Introduction PDF
75% (4)
Peter Smith - Category Theory, A Gentle Introduction PDF
283 pages
Z. A. Kuzicheva (Auth.), A. N. Kolmogorov, A. P. Yushkevich (Eds.) - Mathematics of The 19th Century - Mathematical Logic Algebra Number Theory Probability Theory (1992, Birkhäuser Basel) PDF
100% (3)
Z. A. Kuzicheva (Auth.), A. N. Kolmogorov, A. P. Yushkevich (Eds.) - Mathematics of The 19th Century - Mathematical Logic Algebra Number Theory Probability Theory (1992, Birkhäuser Basel) PDF
319 pages
9780203755419
100% (1)
9780203755419
431 pages
Specialist Units 1_2 - Chapter 2
No ratings yet
Specialist Units 1_2 - Chapter 2
36 pages
Notes in Descrete Math PDF
No ratings yet
Notes in Descrete Math PDF
399 pages
(Mathematics and Its Applications) Malempati M. Rao, Randall J. Swift - Probability Theory With Applications - Springer (2006)
No ratings yet
(Mathematics and Its Applications) Malempati M. Rao, Randall J. Swift - Probability Theory With Applications - Springer (2006)
536 pages
Gatsinzi - Euclidean Geometry11111111111111111
No ratings yet
Gatsinzi - Euclidean Geometry11111111111111111
185 pages
Foundation of Algebra Geometry
No ratings yet
Foundation of Algebra Geometry
826 pages
Vakil, Ravi - Foundations of Algebraic Geometry
No ratings yet
Vakil, Ravi - Foundations of Algebraic Geometry
760 pages
Exam Grade 7
100% (1)
Exam Grade 7
3 pages
Toric Varieties
100% (1)
Toric Varieties
863 pages
(ESI Lectures in Mathematics and Physics) Christian GÃ©rard - Microlocal Analysis of Quantum Fields On Curved Spacetimes-European Mathematical Society (2019)
No ratings yet
(ESI Lectures in Mathematics and Physics) Christian GÃ©rard - Microlocal Analysis of Quantum Fields On Curved Spacetimes-European Mathematical Society (2019)
230 pages
Fields of Logic and Computation Essays Dedicated To Yuri Gurevich On The Occasion of His 70th Birthday PDF
No ratings yet
Fields of Logic and Computation Essays Dedicated To Yuri Gurevich On The Occasion of His 70th Birthday PDF
636 pages
MIT18 785F19 Full Notes PDF
No ratings yet
MIT18 785F19 Full Notes PDF
297 pages
Free Probability and Random Matrices
No ratings yet
Free Probability and Random Matrices
342 pages
Measure Theory For Dummies
0% (1)
Measure Theory For Dummies
7 pages
Szamuely - Galois Theory After Galois
No ratings yet
Szamuely - Galois Theory After Galois
10 pages
Point Set Theory
100% (1)
Point Set Theory
296 pages
Crash Course Coding Companion
No ratings yet
Crash Course Coding Companion
136 pages
Geometry of Projective Algebraic Curves
100% (1)
Geometry of Projective Algebraic Curves
429 pages
Lebesgue Integration and Measure
No ratings yet
Lebesgue Integration and Measure
290 pages
Bruce Driver-Undergraduate Analysis Tools
No ratings yet
Bruce Driver-Undergraduate Analysis Tools
228 pages
Discrete Mathematics: Thomas Goller January 2013
No ratings yet
Discrete Mathematics: Thomas Goller January 2013
87 pages
Judson Abstract Algebra Theory and Applications
100% (2)
Judson Abstract Algebra Theory and Applications
385 pages
A Visual Introduction To Information Theory
No ratings yet
A Visual Introduction To Information Theory
43 pages
Introduction To Probability and Statistics
100% (1)
Introduction To Probability and Statistics
179 pages
Zermelo Fraenkel Set Theory - Wikipedia
No ratings yet
Zermelo Fraenkel Set Theory - Wikipedia
12 pages
Category Theory
No ratings yet
Category Theory
10 pages
Denotational Semantics by David A Schmidt
No ratings yet
Denotational Semantics by David A Schmidt
304 pages
Pro-Slavery Arguments
No ratings yet
Pro-Slavery Arguments
6 pages
MAT-101 Engineering Mathematics 1 Differential Calculus Lecture-1 Differentiation: Basic Concepts To Remember
100% (1)
MAT-101 Engineering Mathematics 1 Differential Calculus Lecture-1 Differentiation: Basic Concepts To Remember
4 pages
Mealy & Moore Machine: Rao Wakeel Ahmad
No ratings yet
Mealy & Moore Machine: Rao Wakeel Ahmad
26 pages
Multilinear Algebra - MIT
No ratings yet
Multilinear Algebra - MIT
141 pages
Duren - Geometric Function Theory
100% (1)
Duren - Geometric Function Theory
40 pages
Algebraic Number Theory - Computational Approach PDF
No ratings yet
Algebraic Number Theory - Computational Approach PDF
215 pages
Kreisel and Lawvere On Category Theory and The Foundations of Mathematics
100% (2)
Kreisel and Lawvere On Category Theory and The Foundations of Mathematics
46 pages
ZKP Session 1
No ratings yet
ZKP Session 1
26 pages
GHF - S of TT - Technical Curriculum
No ratings yet
GHF - S of TT - Technical Curriculum
65 pages
Mathematics 433/533 Class Notes: Richard Koch
No ratings yet
Mathematics 433/533 Class Notes: Richard Koch
188 pages
Mathematical Logic
100% (2)
Mathematical Logic
113 pages
Rings and Ideals A First Course in
No ratings yet
Rings and Ideals A First Course in
208 pages
Lecture 3 - Statistics
No ratings yet
Lecture 3 - Statistics
16 pages
Howie J. Fundamentals of Semigroup Theory (OUP, 1995) (ISBN 0198511949) (K) (T) (361s) - MAtg
100% (1)
Howie J. Fundamentals of Semigroup Theory (OUP, 1995) (ISBN 0198511949) (K) (T) (361s) - MAtg
361 pages
Dimension Functions: Depth, Measuring Singularities: Pieter Belmans February 14, 2014
No ratings yet
Dimension Functions: Depth, Measuring Singularities: Pieter Belmans February 14, 2014
11 pages
Kahn Topology PDF
100% (1)
Kahn Topology PDF
220 pages
School of Economics ECON6004: Mathematical Economics: Lecture 1.1: Numbers, Sets, and Functions
No ratings yet
School of Economics ECON6004: Mathematical Economics: Lecture 1.1: Numbers, Sets, and Functions
12 pages
(Pure and Applied Mathematics) Donald L. Stancl, Mildred L. Stancl-Real Analysis With Point-Set Topology-Dekker (1987)
100% (1)
(Pure and Applied Mathematics) Donald L. Stancl, Mildred L. Stancl-Real Analysis With Point-Set Topology-Dekker (1987)
295 pages
Free Probability and Operator Algebras
100% (1)
Free Probability and Operator Algebras
144 pages
Introduction To Modern Set Theory
No ratings yet
Introduction To Modern Set Theory
129 pages
Lukes-Maly - Measure and Integral PDF
100% (3)
Lukes-Maly - Measure and Integral PDF
232 pages
MATH MISSION -DAY-7 SAMPLE PAPER-6
No ratings yet
MATH MISSION -DAY-7 SAMPLE PAPER-6
5 pages
(H.S. Carslaw) Introduction To The Theory of Fourier Series
No ratings yet
(H.S. Carslaw) Introduction To The Theory of Fourier Series
332 pages
Second Periodical Test 8 2019
100% (1)
Second Periodical Test 8 2019
5 pages
2.2 Equivalence Relations
No ratings yet
2.2 Equivalence Relations
5 pages
DLL-8 (Week 1, Day 4)
No ratings yet
DLL-8 (Week 1, Day 4)
6 pages
Standard Positional Numeral Systems: Base Name Usage
No ratings yet
Standard Positional Numeral Systems: Base Name Usage
10 pages
Unit Plan of 3rd MP
No ratings yet
Unit Plan of 3rd MP
10 pages
Pure Mathematics (M208) Content Listing: Mathematical Language and Proof
No ratings yet
Pure Mathematics (M208) Content Listing: Mathematical Language and Proof
1 page
s.5 Mathematics Seminar Questions
No ratings yet
s.5 Mathematics Seminar Questions
9 pages
Automata and Formal Language Theory
No ratings yet
Automata and Formal Language Theory
18 pages
Cambridge O Level: Mathematics (Syllabus D) 4024/21
No ratings yet
Cambridge O Level: Mathematics (Syllabus D) 4024/21
9 pages
Lecture 1 Introduction To Number Theory, MAT115A
100% (1)
Lecture 1 Introduction To Number Theory, MAT115A
6 pages
Math 140a - HW 1 Solutions
No ratings yet
Math 140a - HW 1 Solutions
5 pages
Cyclotomic Fields and Applications
No ratings yet
Cyclotomic Fields and Applications
18 pages
Engelking, Sieklucki - Topology A Geometric Approach
100% (2)
Engelking, Sieklucki - Topology A Geometric Approach
437 pages
Cyclotomic Polynomials
No ratings yet
Cyclotomic Polynomials
13 pages
Introduction To Theory of The Complexity
100% (2)
Introduction To Theory of The Complexity
290 pages
Maths 6
No ratings yet
Maths 6
2 pages
Schedule Jee Advanced 2023 Batch 1
No ratings yet
Schedule Jee Advanced 2023 Batch 1
2 pages
Example: Convert To An Improper Fraction. 3 2 5
No ratings yet
Example: Convert To An Improper Fraction. 3 2 5
1 page
Utf-8' '2023-24 - Fall1 - Oct18
No ratings yet
Utf-8' '2023-24 - Fall1 - Oct18
1 page
13 Ec 533
No ratings yet
13 Ec 533
1 page
Geometric Sequences
No ratings yet
Geometric Sequences
10 pages
Introduction to Matrices and Linear Transformations: Third Edition
From Everand
Introduction to Matrices and Linear Transformations: Third Edition
Daniel T. Finkbeiner
3/5 (1)
The Theory of Remainders
From Everand
The Theory of Remainders
Andrea Rothbart
No ratings yet
Infinite Matrices and Sequence Spaces
From Everand
Infinite Matrices and Sequence Spaces
Richard G. Cooke
No ratings yet
Applied Functional Analysis
From Everand
Applied Functional Analysis
D.H. Griffel
No ratings yet
Exercises of Complex Numbers
From Everand
Exercises of Complex Numbers
Simone Malacrida
No ratings yet