
Logic and Discrete Mathematics

for
Computer Scientists

James Caldwell
Department of Computer Science
University of Wyoming
Laramie, Wyoming

Draft of
August 26, 2011
© James Caldwell1 2011
ALL RIGHTS RESERVED

1 This material is based upon work partially supported by the National Science Foundation

under Grant No. 9985239. Any opinions, findings, and conclusions or recommendations
expressed in this material are those of the author(s) and do not necessarily reflect the views
of the National Science Foundation.
Contents

1 Syntax and Semantics* 1


1.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1
1.2 Formal Languages . . . . . . . . . . . . . . . . . . . . . . . . . . 2
1.3 Syntax . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2
1.3.1 Concrete vs. Abstract Syntax . . . . . . . . . . . . . . . . 4
1.3.2 Some examples of Syntax . . . . . . . . . . . . . . . . . . 5
1.3.3 Definitions . . . . . . . . . . . . . . . . . . . . . . . . . . 9
1.4 Semantics . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 10
1.4.1 Definition by Recursion . . . . . . . . . . . . . . . . . . . 10
1.5 Possibilities for Implementation . . . . . . . . . . . . . . . . . . . 15

I Logic 17
2 Propositional Logic 21
2.1 Syntax of Propositional Logic . . . . . . . . . . . . . . . . . . . . 21
2.1.1 Formulas . . . . . . . . . . . . . . . . . . . . . . . . . . . 22
2.1.2 Definitions: Extending the Language . . . . . . . . . . . . 24
2.1.3 Substitutions* . . . . . . . . . . . . . . . . . . . . . . . . 24
2.1.4 Exercises . . . . . . . . . . . . . . . . . . . . . . . . . . . 25
2.2 Semantics . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 25
2.2.1 Boolean values and Assignments . . . . . . . . . . . . . . 25
2.2.2 The Valuation Function . . . . . . . . . . . . . . . . . . . 26
2.2.3 Truth Table Semantics . . . . . . . . . . . . . . . . . . . . 28
2.2.4 Exercises . . . . . . . . . . . . . . . . . . . . . . . . . . . 30
2.3 Proof Theory . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 30
2.3.1 Sequents . . . . . . . . . . . . . . . . . . . . . . . . . . . . 31
2.3.2 Semantics of Sequents . . . . . . . . . . . . . . . . . . . . 32
2.3.3 Sequent Schemas and Matching . . . . . . . . . . . . . . . 34
2.3.4 Proof Rules . . . . . . . . . . . . . . . . . . . . . . . . . . 35
2.3.5 Proofs . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 40
2.3.6 Some Useful Tautologies . . . . . . . . . . . . . . . . . . . 42
2.3.7 Exercises . . . . . . . . . . . . . . . . . . . . . . . . . . . 43
2.4 Metamathematical Considerations* . . . . . . . . . . . . . . . . . 43


2.4.1 Soundness and Completeness . . . . . . . . . . . . . . . . 44


2.4.2 Decidability . . . . . . . . . . . . . . . . . . . . . . . . . . 44
2.4.3 Exercises . . . . . . . . . . . . . . . . . . . . . . . . . . . 45

3 Boolean Algebra and Equational Reasoning* 47


3.1 Boolean Algebra . . . . . . . . . . . . . . . . . . . . . . . . . . . 47
3.1.1 Modular Arithmetic . . . . . . . . . . . . . . . . . . . . . 48
3.1.2 Translation from Propositional Logic . . . . . . . . . . . . 49
3.1.3 Falsity . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 49
3.1.4 Variables . . . . . . . . . . . . . . . . . . . . . . . . . . . 49
3.1.5 Conjunction . . . . . . . . . . . . . . . . . . . . . . . . . . 49
3.1.6 Negation . . . . . . . . . . . . . . . . . . . . . . . . . . . 50
3.1.7 Exclusive-Or . . . . . . . . . . . . . . . . . . . . . . . . . 50
3.1.8 Disjunction . . . . . . . . . . . . . . . . . . . . . . . . . . 51
3.1.9 Implication . . . . . . . . . . . . . . . . . . . . . . . . . . 51
3.1.10 The Final Translation . . . . . . . . . . . . . . . . . . . . 52
3.1.11 Notes . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 53
3.2 Equational Reasoning . . . . . . . . . . . . . . . . . . . . . . . . 53
3.2.1 Complete Sets of Connectives . . . . . . . . . . . . . . . . 53

4 Predicate Logic 55
4.1 Predicates . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 55
4.2 The Syntax of Predicate Logic . . . . . . . . . . . . . . . . . . . 57
4.2.1 Variables . . . . . . . . . . . . . . . . . . . . . . . . . . . 57
4.2.2 Terms . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 57
4.2.3 Formulas . . . . . . . . . . . . . . . . . . . . . . . . . . . 58
4.3 Substitution . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 61
4.3.1 Bindings and Variable Occurrences . . . . . . . . . . . . . 61
4.3.2 Free Variables . . . . . . . . . . . . . . . . . . . . . . . . . 62
4.3.3 Capture Avoiding Substitution* . . . . . . . . . . . . . . 64
4.4 Proofs . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 66
4.4.1 Proof Rules for Quantifiers . . . . . . . . . . . . . . . . . 66
4.4.2 Universal Quantifier Rules . . . . . . . . . . . . . . . . . . 66
4.4.3 Existential Quantifier Rules . . . . . . . . . . . . . . . . . 67
4.4.4 Some Proofs . . . . . . . . . . . . . . . . . . . . . . . . . 67
4.4.5 Translating Sequent Proofs into English . . . . . . . . . . 70

II Sets, Relations and Functions 75


5 Set Theory 77
5.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 78
5.1.1 Informal Notation . . . . . . . . . . . . . . . . . . . . . . 78
5.1.2 Membership is primitive . . . . . . . . . . . . . . . . . . . 78
5.2 Equality and Subsets . . . . . . . . . . . . . . . . . . . . . . . . . 79
5.2.1 Extensionality . . . . . . . . . . . . . . . . . . . . . . . . 79

5.2.2 Subsets . . . . . . . . . . . . . . . . . . . . . . . . . . . . 80
5.3 Set Constructors . . . . . . . . . . . . . . . . . . . . . . . . . . . 81
5.3.1 The Empty Set . . . . . . . . . . . . . . . . . . . . . . . . 81
5.3.2 Unordered Pairs and Singletons . . . . . . . . . . . . . . . 83
5.3.3 Ordered Pairs . . . . . . . . . . . . . . . . . . . . . . . . . 85
5.3.4 Set Union . . . . . . . . . . . . . . . . . . . . . . . . . . . 87
5.3.5 Set Intersection . . . . . . . . . . . . . . . . . . . . . . . . 89
5.3.6 Power Set . . . . . . . . . . . . . . . . . . . . . . . . . . . 90
5.3.7 Comprehension . . . . . . . . . . . . . . . . . . . . . . . . 90
5.3.8 Set Difference . . . . . . . . . . . . . . . . . . . . . . . . . 92
5.3.9 Cartesian Products and Tuples . . . . . . . . . . . . . . . 92
5.4 Properties of Operations on Sets . . . . . . . . . . . . . . . . . . 94
5.4.1 Idempotency . . . . . . . . . . . . . . . . . . . . . . . . . 94
5.4.2 Monotonicity . . . . . . . . . . . . . . . . . . . . . . . . . 94
5.4.3 Commutativity . . . . . . . . . . . . . . . . . . . . . . . . 95
5.4.4 Associativity . . . . . . . . . . . . . . . . . . . . . . . . . 95
5.4.5 Distributivity . . . . . . . . . . . . . . . . . . . . . . . . . 95

6 Relations 97
6.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 97
6.2 Relations . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 98
6.2.1 Binary Relations . . . . . . . . . . . . . . . . . . . . . . . 98
6.2.2 n-ary Relations . . . . . . . . . . . . . . . . . . . . . . . . 99
6.2.3 Some Particular Relations . . . . . . . . . . . . . . . . . . 99
6.3 Operations on Relations . . . . . . . . . . . . . . . . . . . . . . . 100
6.3.1 Inverse . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 100
6.3.2 Complement of a Relation . . . . . . . . . . . . . . . . . . 101
6.3.3 Composition of Relations . . . . . . . . . . . . . . . . . . 101
6.4 Properties of Relations . . . . . . . . . . . . . . . . . . . . . . . . 104
6.4.1 Reflexivity . . . . . . . . . . . . . . . . . . . . . . . . . . 105
6.4.2 Irreflexivity . . . . . . . . . . . . . . . . . . . . . . . . . . 105
6.4.3 Symmetry . . . . . . . . . . . . . . . . . . . . . . . . . . . 105
6.4.4 Antisymmetry . . . . . . . . . . . . . . . . . . . . . . . . 106
6.4.5 Asymmetry . . . . . . . . . . . . . . . . . . . . . . . . . . 106
6.4.6 Transitivity . . . . . . . . . . . . . . . . . . . . . . . . . . 106
6.4.7 Connectedness . . . . . . . . . . . . . . . . . . . . . . . . 106
6.5 Closures . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 106
6.6 Properties of Operations on Relations . . . . . . . . . . . . . . . 109

7 Equivalence and Order 111


7.1 Equivalence Relations . . . . . . . . . . . . . . . . . . . . . . . . 111
7.1.1 Equivalence Classes . . . . . . . . . . . . . . . . . . . . . 112
7.1.2 The Quotient Construction* . . . . . . . . . . . . . . . . . 113
7.1.3 Q is a Quotient . . . . . . . . . . . . . . . . . . . . . . . 113
7.1.4 Partitions . . . . . . . . . . . . . . . . . . . . . . . . . . . 114
7.1.5 Congruence Relations* . . . . . . . . . . . . . . . . . . . . 116

7.2 Order Relations . . . . . . . . . . . . . . . . . . . . . . . . . . . . 117


7.2.1 Partial Orders . . . . . . . . . . . . . . . . . . . . . . . . 117
7.2.2 Products and Sums of Orders . . . . . . . . . . . . . . . . 118

8 Functions 119
8.1 Functions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 119
8.2 Extensionality (equivalence for functions) . . . . . . . . . . . . . 120
8.3 Operations on functions . . . . . . . . . . . . . . . . . . . . . . . 121
8.3.1 Restrictions and Extensions . . . . . . . . . . . . . . . . . 121
8.3.2 Composition of Functions . . . . . . . . . . . . . . . . . . 121
8.3.3 Inverse . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 124
8.4 Properties of Functions . . . . . . . . . . . . . . . . . . . . . . . 124
8.4.1 Injections . . . . . . . . . . . . . . . . . . . . . . . . . . . 125
8.4.2 Surjections . . . . . . . . . . . . . . . . . . . . . . . . . . 125
8.4.3 Bijections . . . . . . . . . . . . . . . . . . . . . . . . . . . 126
8.5 Exercises . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 129

9 Cardinality and Counting 131


9.1 Cardinality . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 131
9.2 Infinite Sets . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 133
9.3 Finite Sets . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 134
9.3.1 Permutations . . . . . . . . . . . . . . . . . . . . . . . . . 135
9.4 Cantor’s Theorem . . . . . . . . . . . . . . . . . . . . . . . . . . 135
9.5 Countable and Uncountable Sets . . . . . . . . . . . . . . . . . . 136
9.6 Counting . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 137
9.6.1 The Pigeonhole Principle . . . . . . . . . . . . . . . . . . 138

III Induction and Recursion 139


10 Natural Numbers 143
10.1 Peano Axioms . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 145
10.2 Definition by Recursion . . . . . . . . . . . . . . . . . . . . . . . 146
10.3 Mathematical Induction . . . . . . . . . . . . . . . . . . . . . . . 149
10.3.1 An informal justification for the principle . . . . . . . . . 149
10.3.2 A sequent style proof rule . . . . . . . . . . . . . . . . . . 150
10.3.3 Some First Inductive Proofs . . . . . . . . . . . . . . . . . 151
10.4 Properties of the Arithmetic Operators . . . . . . . . . . . . . . . 153
10.4.1 Order Properties . . . . . . . . . . . . . . . . . . . . . . . 155
10.4.2 Iterated Sums and Products . . . . . . . . . . . . . . . . . 157
10.4.3 Applications . . . . . . . . . . . . . . . . . . . . . . . . . 159
10.5 Complete Induction . . . . . . . . . . . . . . . . . . . . . . . . . 160
10.5.1 Proof of the Principle of Complete Induction* . . . . . . . 161
10.5.2 Applications . . . . . . . . . . . . . . . . . . . . . . . . . 162

11 Lists 167
11.1 Lists . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 167
11.2 Definition by recursion . . . . . . . . . . . . . . . . . . . . . . . . 169
11.3 List Induction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 172
11.3.1 Some proofs by list induction . . . . . . . . . . . . . . . . 172

List of Definitions.

List of Examples.

Preface
Discrete mathematics is a required course in the undergraduate Computer
Science curriculum. In a perhaps unsympathetic view, the standard presentations
(and there are many) treat the material in the course as a discrete
collection of so many techniques that students must master for further stud-
ies in Computer Science. Our philosophy, and the one embodied in this book,
is different. Of course the development of the students' abilities to do logic and
proofs, to know about naive set theory, relations, functions, graphs, inductively
defined structures, definitions by recursion on inductively defined structures
and elementary combinatorics is important. But we believe that rather than so
many assorted topics and techniques to be learned, the course can flow contin-
uously as a single narrative, each topic linked by a formal presentation building
on previous topics. We believe that Discrete Mathematics is perhaps the most
intellectually exciting and potentially one of the most interesting courses in the
computer science curriculum. Rather than simply viewing the course as a
necessary tool for further, and perhaps more interesting, developments to come
later, we believe it is the place in the curriculum where an appreciation of the
deep ideas of computer science can be presented: the relation between syntax
and semantics, how unbounded structures can be defined finitely, and
how to reason about those structures and calculate with them.
Most texts, following perhaps standard mathematical practice, attempt to
minimize the formalism, assuming that a student's intuition will guide them
through to the end, often avoiding proofs in favor of examples.2 Mathematical
intuition is an entirely variable personal attribute, and even individuals with
significant talents can be misguided by intuition. This is shown over and over
in the history of mathematics; the history of the characterization of infinity is a
prime example, but many others exist like the Banach-Tarski paradox [?]. We
do not argue that intuition should be banished from teaching mathematics but
instead that the discrete mathematics course is a place in the curriculum to
cultivate the idea, useful in higher mathematics and in computer science, that
formalism is trustworthy and can be used to verify intuition.
Indeed, we believe, contrary to the common conception, that rather than
making the material more opaque, a formal presentation gives the students a
way to understand the material in a deeper and more satisfying way. The fact
that formal objects can be easily represented in ways that they can be consumed
by computers lends a concreteness to the ideas presented in the course. The
fact that formal proofs can be sometimes be found by a machine and can aways
be checked by a machine give an absolute criteria for what counts as a proof;
in our experience, this unambiguous nature of of formal proofs is a comfort to
students trying to decide if they’ve achieved a proof or not. Once the formal
criterion for proof has been assimilated, it is entirely appropriate to relax the
rigid idea of a proof as a machine-checkable structure and to allow
2 As an example we cite the pigeonhole principle which is not proved in any discrete math-

ematics text we know of but which is motivated by example. The proof is elementary once
the ideas of injection, surjection and one-to-one mappings have been presented.

rigorous but informal proofs to be presented.


The formal approach to the presentation of material has, we believe, a num-
ber of significant advantages, especially for Computer Science students, but also
for more traditional math students who might find their way into the course.
In mathematics departments proofs are typically learned by students through
a process of osmosis. Asked to give a proof, students hand in what they might
believe is a proof and the professor, as oracle, will either accept it or reject it.
If he rejects it he may point out that a particular part of the purported proof
is too vague, or that all cases have not been considered or he might identify
some other flaw. In any case, the criterion for what counts as a proof is a vague
one; students are left in doubt as to what a proof actually is and what might
count as one. We are convinced that this process of learning by example only
works for students who have some innate ability to understand the distinctions
being made by repeated exposure to examples. But these distinctions are rarely
made explicit. Indeed, the successful student must essentially reconstruct for
himself a model of proof that has already been completely developed in explicit
detail by logicians starting with Frege. Most mathematicians would agree that,
in principle, proofs can be formalized – of course this was Hilbert’s attempt
to answer the paradoxes. But mathematicians, unlike logicians, do not teach
proofs in this way because that is not the way they do them in practice.
For computer scientists and software engineers, formalism is their daily
bread. Logic is the mathematical basis of computation as calculus and dif-
ferential equations are the mathematical basis of engineering physical systems.
Programs are formal syntactic objects. Computation, whether based on an
abstract model (like a Turing machine, the lambda calculus, or register transfer
machines) or on a more realistic model (like the Java virtual machine),
is a formal manipulation governed by formal rules. We believe that a formal
presentation of discrete mathematics is the best (and perhaps earliest) point in
the curriculum to make the distinction between syntax and semantics explicit
and to make proofs something that all students can learn to do, not only those
students who have some natural talent for making such arguments. Also, recur-
sion is the computational dual of induction and students unable to learn how
to do inductive proofs are unlikely to be able to consistently and successfully
write recursive procedures or to understand the reasons recursion works.
The text which perhaps most closely embodies the approach taken here may
be Gries and Schneider’s [21]. Gries and Schneider developed an equational
approach to logic based principally on the connectives for bi-equivalence and
exclusive-or. Our approach differs in that we use a standard form of proofs
based on Gentzen’s sequent calculus [18].
Another text that is close in style to this one is The Haskell Road to Logic,
Maths and Programming by Doets and van Eijck [9]. In that textbook, the
authors use the functional programming language Haskell as a computational
basis for much of the text and present proofs in an informal natural deduction
style.
Manna and Waldinger’s text The Logical Basis for Computer Programming
[36] is an excellent presentation of the material presented here and much more.

Unfortunately, that two volume work is now out of print.


As computer scientists we care about the reasons we can make a claim about
a computational artifact; among these artifacts we include: algorithms, data-
structures, programs, systems, and models of systems. Proofs are the means
to this end. Proofs tell us why something is true rather than just telling us
whether it is true or not. The ability to make such arguments is crucial to
the endeavor of the computer scientist and the software engineer. To specify
how a computational artifact is supposed to behave requires logic. To be able
to prove that a computational artifact has some property, for other than the
most trivial properties, requires a proof. As computer scientists and software
engineers we must take responsibility for the things we build, to do so requires
more than that we simply build them, we must be able to make arguments for
their correctness.
Proofs have another advantage: failed proofs of false conjectures usually
reveal something about why the conjecture is not true. Proofs in computer
science are often not deep, but can be extraordinarily detailed. In this course
we learn most of the mathematical structures and proof techniques required to
verify properties about computational artifacts. Students who practice with the
techniques presented here will find applications in most aspects of designing,
building and testing computational artifacts.
Prerequisites for this course typically include a semester of calculus and
at least two semesters of programming. From programming, we assume stu-
dents know at least one programming language and may have been exposed to
two, though we do not assume they are experts. There is no programming in
the course as taught at Wyoming, but exposure to these ideas is important.
Based on their exposure to a programming language, we assume students have
had some experience implementing algorithms, and preferably have had some
exposure to an inductively defined data-type, like lists or trees, although that
experience may not have emphasized the inductive nature of those types, e.g.
they may have been more focused on the mechanics of manipulating pointers if
their language is C++. Mathematically, we assume students are already famil-
iar with notations for set membership (x ∈ A), explicit enumeration of sets (e.g.
finite enumerations of the form {a, b, c, d} and infinite enumerations of the form
{0, 2, 4, · · · }). We also assume that a student has seen notation for functions
(f : A → B) specifying that f is a function from A to B. Of course all the
mathematical prerequisites just mentioned are represented here in some detail
in the appropriate chapters.
These notes do not attempt to systematically build the mathematical struc-
tures and methods studied here from some absolute or minimal foundation. We
certainly attempt to explain things the first time they appear, but often, com-
plex ideas, like the inductive structure of syntax for example, are introduced
before a full and precise account can be given. We believe that repeatedly see-
ing the same methods and constructs a number of times throughout the course
and in a number of different guises is the best path to the students' learning.

Acknowledgments: Thanks go to Eric Berg, Andrew Blair, John Dumkee


and other anonymous students in COSC 2300 at the University of Wyoming for
being careful readers and for providing feedback on the text.
Chapter 1

Syntax and Semantics*

In this chapter we give a brief overview of syntax and semantics. We describe,


without delving too deeply into all the formal details, how to specify abstract
syntax using inductive definitions; we will see a mathematical justification of
these ideas in the chapter presenting inductively defined sets. We also present
simple examples of semantics and recursive functions defined on the abstract
syntax in this chapter. A detailed account of the material presented in this
chapter would draw heavily on material presented later in these lectures; indeed,
we are starting the lectures with an application of discrete mathematics as used
in computer science; the interpretation of inductively defined syntax by semantic
functions.

1.1 Introduction
Syntax has to do with form and semantics has to do with meaning. Syntax is
described by specifying a set of structured terms while semantics associates a
meaning to the structured terms. In and of itself syntax does not have mean-
ing, only structure. Only after a semantic interpretation has been specified for
the syntax do the structured terms acquire meaning. Of course, good syntax
suggests the intended meaning in a way that allows us see though it to the in-
tended meaning but it is an essential aspect of the formal approach, based on
the separation of syntax and semantics, that we do not attach these meanings
until they have been specified.
The syntax/semantics distinction is fundamental in Computer Science and
goes back to the very beginning of the field. Abstractly, computation is the
manipulation of formal (syntactic) representations of objects.1
For example, when compiling a program written in some language (say C++)
the compiler first checks the syntax to verify that the program is in the language.
1 The abstract characterization of computation as the manipulation of syntax was first

given by logicians in the 1930’s who were the first to try to describe what we mean by the
word “algorithm”.


If not, a syntax error is indicated. However, a program that is accepted by


the compiler is not necessarily correct; to tell if the program is correct we must
consider the semantics of the language. One reasonable semantics for a program
is the result of executing it. Just because a program is in the language (i.e. the
compiler produces an executable output) does not guarantee the correctness of
the program (i.e. that the program means the right thing) with regard to the
intended computation.

1.2 Formal Languages


Mathematically, a formal language is a (finite or infinite) set of structured terms
(think of them as trees.) These terms are of finite size and are defined over a
basis of lexical primitives or an alphabet. An alphabet is a finite collection of
symbols and each term itself is a finite structured collection of these symbols,
although there may be an infinite number of terms.
Finite languages can (in principle) be specified by enumerating the terms in
the language; infinite languages are usually characterized by specifying a finite
set of rules for constructing the set of term structures included in the language.
Such a set of rules is called a grammar. In 1959 Noam Chomsky, a linguist at
MIT, first characterized the complexity of formal languages by characterizing
the structure of the grammars used to specify them [5]. Chomsky’s character-
ization of formal languages led to huge progress in the early development of
the theory of programming languages and in their implementations, especially
in parser and compiler development. Essentially, his idea for classifying formal
languages was to classify them according to the complexity of the devices that
can be used to generate the terms in the language.
The study of formal languages has an extensive literature of its own [?,
12, 29, 45, ?]. Similarly, the study of mathematical semantics of programming
languages is a rich research area in its own right [47, 48, 42, 46, 50, 23, 24, 22,
53, 1, 38]

1.3 Syntax
We can finitely describe abstract syntax in a number of ways. A common way
is to describe the terms of the language inductively by giving a formal grammar
describing how terms of the language can be constructed. We give an abstract
description of a grammar over an alphabet and then, in later sections we provide
examples to make the ideas more concrete.

Definition 1.1 (grammar) A grammar over an alphabet (say Σ) is of the


form
class T ::= C1 | C2 | · · · | Cn
where

T : is a set which, if empty, is omitted from the specification, and



class: is the name of the syntactic class being defined, and

Ci : are constructors 1 ≤ i ≤ n, n > 0

The symbol ::= separates the name of the syntactic class being defined
from the collection of rules that define it. Note that the vertical bar “|” is read
as “or” and it separates the rules (or productions) used to construct the terms
of class. The rules separated by the vertical bar are alternatives. The order of
the rules does not matter, but in more complex cases it is conventional to write
the simpler cases first. Sometimes it is convenient to parametrize the class being
defined by some set. We show an example of this below where we simultaneously
define lists over some set T all at once, rather than making separate syntactic
definitions for each kind of list.
Traditionally, the constructors are also sometimes called rules or productions.
They describe the allowable forms of the structures included in the language.
The constructors are either constants from the alphabet, are elements from
some collection of sets, or describe how to construct new complex constructs
consisting of symbols from the alphabet, elements from the parameter sets,
and possibly from previously constructed elements of the syntactic class; the
constructor functions return new elements of the syntactic class. At least one
constructor must not include arguments consisting of previously constructed
elements of the class being defined; this ensures that the syntactic structures in
the language defined by the grammar are finite. These non-recursive alternatives
(the ones that do not have subparts which are of the type of structure being
defined) are sometimes called the base cases.
Two syntactic elements are equal if they are constructed using identical
constructors applied to equal arguments. It is never the case that c1 x = c2 x if
c1 and c2 are different constructors.

[Portrait: Noam Chomsky]

Noam Chomsky (1928– ) is the father of modern linguistics. In 1959 he
characterized formal languages in terms of their generative power and laid
the mathematical foundation for the study of formal languages. He is also
known as a political activist.

1.3.1 Concrete vs. Abstract Syntax


A text is a linear sequence of symbols which, on the printed page, we read from
left to right2 and top to bottom. We can specify syntax concretely so that it
can be read unambiguously as linear sequence of symbols, or abstractly which
simply specifies the structure of terms without telling us how they must appear
to be read as text. We use parentheses to indicate the order of application of
the constructors in a grammar when writing abstract syntax as linear text.
Concrete syntax completely specifies the language in such a way that there
is no ambiguity in reading terms in the language as text, i.e. as a linear sequence
of symbols read from left to right. For example, does the ambiguous arithmetic
statement (a ∗ b + c) mean (a ∗ (b + c)) or ((a ∗ b) + c)? In informal usage we
might stipulate that multiplication “binds tighter than addition” so the common
interpretation would be the second form; however, in specifying a concrete syntax
for arithmetic we would typically completely parenthesize statements (e.g. we
would write ((a ∗ b) + c) or (a ∗ (b + c))) and perhaps specify conventions that
allow us to drop parentheses to make reading the statement easier.
In abstract syntax the productions of the grammar are considered to be
constructors for structured terms having tree-like structure. We do not include
parentheses in the specification of the language; there is no ambiguity because we
are specifying trees which explicitly show the structure of the term without the
use of parentheses. When we write abstract syntax as text, we add parentheses
as needed to indicate the structure of the term, e.g. in the example above we
would write a ∗ (b + c) or (a ∗ b) + c depending on which term we intend.

[Figure: the abstract syntax tree for a ∗ (b + c) on the left and for (a ∗ b) + c on the right]

Abstract syntax can be displayed in tree form. For example, the formula
a ∗ (b + c) is displayed by the abstract syntax tree on the left in Fig. ?? and
the formula (a ∗ b) + c is displayed by the tree on the right of Fig. ??. Notice
that the ambiguity disappears when displayed in tree form since the principle
constructor labels the top of the tree. The immediate subterms are at the next
level and so on. For arithmetic formulas, you can think of the topmost (or
principle) operator as the last one you would evaluate.
2 Of course the fact that we read and write from left to right is only an arbitrary convention,

Hebrew and Egyptian hieroglyphics are read from right to left. But even the notion of left and
right are simply conventions, Herodotus [27] tells us in his book The History (written about
440 B.C.) that the ancient Egyptians wrote moving from right to left but he reports “they
say they are moving [when writing] to the right”, i.e. what we (in agreement with the ancient
Greeks) call left the ancient Egyptians called right and vice versa. I theorize that notions of
right and left may have first been understood only in relation to the linear form introduced
by writing. In that case, if right means “the side of the papyrus you start on when writing a
new line” then the Egyptian interpretation of right and left coincide with the Greeks.

1.3.2 Some examples of Syntax


We give some examples of grammars to describe the syntax of the Booleans, the
natural numbers, a simple language of Boolean expressions which includes the
Booleans and an if-then-else construct, and describe a grammar for constructing
lists where the elements are selected from some specified set.

Syntax of B
The Booleans3 consist of two elements. We denote the elements by the alphabet
consisting of the symbols T and F. Although this is enough, i.e. it is enough
to say that a Boolean is either the symbol T or is the symbol F, we can define
the Booleans (denoted B) by the following grammar:

B ::= T | F
Read the definition as follows:

A Boolean is either the symbol T or the symbol F.

The syntax of these terms is trivial, they have no more structure than the
individual symbols of the alphabet do. The syntax trees are simply individual
nodes labeled either T or F. There are no other abstract syntax trees for the
class B.
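To make the idea concrete, a grammar like this translates almost directly into a
datatype in a functional programming language. Here is a minimal sketch in Haskell
(the type and constructor names are our choices, not fixed by the text):

    -- The syntactic class B: a term is either the symbol T or the symbol F.
    -- Deriving Eq reflects the convention that two syntactic elements are
    -- equal exactly when they are built from the same constructor.
    data B = T | F
      deriving (Show, Eq)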

Syntax of N
The syntax of the natural numbers (denoted by the symbol N) can be defined
as follows:

Definition 1.2.
N ::= 0 | s n
where the alphabet consists of the symbols {0, s} and n is a variable denot-
ing some previously constructed element of the set N. 0 is a constant symbol
denoting an element of N and s is a constructor function mapping N to N.

The definition is read as follows:

A natural number is either: the constant symbol 0 or is of the


form s n where n is a previously constructed natural number.

Implicitly, we also stipulate that nothing else is in N, i.e. the only elements of
N are those terms which can be constructed by the rules of the grammar.
Thus N = {0, s0, ss0, sss0, · · · }. Note that the variable
“n” used in the definition of the rules never occurs in an element of N; it is simply
a place-holder for a term of type N, i.e. it must be replaced by some term from
3 “Boolean” is eponymous for George Boole, the English mathematician who first
formulated logic in symbolic algebraic form.



[Figure: the tree for s0 (an s node over 0) and the tree for sss0 (three s nodes over 0)]

Figure 1.1: Syntax trees for terms s0 and sss0

the set {0, s0, ss0, · · · }. Such place-holders are called meta-variables and are
required if the language has inductive structure, i.e. if we define the elements
of the language using previously constructed elements of the language.
Although the grammar for N contains only two rules, the language it de-
scribes is far more complex than the language of B (which also consists of two
rules). There are an infinite number of syntactically well-formed terms in the
language of N. The grammar achieves this by relying on n being a previously
constructed element of N; thus N is an inductively defined structure.
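As with B, the grammar for N can be rendered as a recursive datatype. A
sketch in Haskell (the constructor names Zero and S are our choices):

    -- The syntactic class N: a term is 0, or s applied to a previously
    -- constructed term. The recursive occurrence of N in the S case plays
    -- the role of the meta-variable n in the grammar.
    data N = Zero | S N
      deriving (Show, Eq)

    -- The terms 0, s0, and sss0 are written Zero, S Zero, and S (S (S Zero)).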

Abstract Syntax Trees for N


[Figure: the two forms of syntax trees for N, a single node labeled 0 and an s node with one subtree]

The trees are of one of two forms shown above. The subtree for a previously
constructed N labeled n is displayed by the following figure:

[Figure: a node labeled n drawn above a triangle]

The triangular shape below the n is intended to suggest that n itself is an


abstract syntax tree whose exact shape is unknown. Of course, any actual ab-
stract syntax tree would not contain any of these triangular forms. For example,
the abstract syntax trees for the terms s0 and sss0 are displayed in Fig. 1.1.

Syntax of a simple computational language


We define a simple language PLB (Programming Language Booleans) of Boolean
expressions with a semantics allowing us to compute with Booleans. The al-
phabet of the language includes the symbols {if , then , else , fi }, i.e. the
alphabet includes a collection of keywords suitable for constructing if-then-else

statements. We use the Booleans as the basis, i.e. the Booleans defined above
serve as the base case for the language.

Definition 1.3 (PLB)

PLB ::= b | if p then p1 else p2 fi

where
b ∈ B: is a Boolean, and
p, p1 , p2 : are previously constructed terms of PLB.
Terms of the language include:

{T, F, if T then T else T fi, if T then T else F fi,


if T then F else T fi, if T then F else F fi,
if F then T else T fi, if F then T else F fi,
if F then F else T fi, if F then F else F fi,
···
if if T then T else T fi then T else T fi,
if if T then T else T fi then T else F fi,
···
if if T then T else T fi then if T then T else T fi else T fi,
···
}

Thus, the language PLB includes the Boolean values {T, F} and allows arbi-
trarily nested if-then-else statements.
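Rendering PLB as a datatype makes its inductive structure explicit: the
condition and both branches of an if-then-else are themselves previously
constructed PLB terms. A sketch in Haskell, reusing the B type sketched
earlier (the constructor names Lit and If are our choices):

    -- The syntactic class PLB: a Boolean constant, or an if-then-else
    -- whose three subterms are previously constructed PLB terms.
    data PLB = Lit B | If PLB PLB PLB
      deriving (Show, Eq)

    -- if T then T else F fi is written: If (Lit T) (Lit T) (Lit F)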

Lists
We can define lists containing elements from some set T by two rules. The
alphabet of lists is {[ ], ::} where “[ ]” is a constant symbol called “nil” which
denotes the empty list and “::” is a symbol denoting the constructor that adds
an element of the set T to a previously constructed list. This constructor is, for
historical reasons, called “cons”. Note that although “[ ]” and “::” both consist
of sequences of two symbols, we consider them to be atomic symbols for the
purposes of this syntax.
This is the first definition in which a parameter (in this case T ) is used.
Definition 1.4 (T List)

List T ::= [ ] | a :: L

where
T : is a set,
[ ]: is a constant symbol denoting the empty list, which is called “nil”,
a: is an element of the set T , and

[Figure: a tree of :: nodes with leaves a, b, a, and [ ]]

Figure 1.2: Syntax tree for the list [a, b, a] constructed as a::(b::(a::[ ]))

L: is a previously constructed List T .

A list of the form a::L is called a cons. The element a from T in a::L is
called the head and the list L in the cons a::L is called the tail.

Example 1.1. As an example, let A = {a, b}, then the set of terms in the class
List A is the following:

{[ ], a::[ ], b::[ ], a::a::[ ], a::b::[ ], b::a::[ ], b::b::[ ], a::a::a::[ ], a::a::b::[ ], · · · }

We call terms in the class List T lists. The set of all lists in class List A is
infinite, but each list is finite because lists must always end with the symbol [ ].
Note that we assume a::b::[ ] means a::(b::[ ]) and not (a::b)::[ ]; to express this
we say cons associates to the right. The second form violates the rule for cons
because a::b is not well-formed: b is an element of A, not a previously
constructed List A . To make reading lists easier we simply separate the consed
elements with commas and enclose them in square brackets “[” and “]”, thus,
we write a::[ ] as [a] and write a::b::[ ] as [a, b]. Using this notation we can rewrite
the set of lists in the class List A more succinctly as follows:

{[ ], [a], [b], [a, a], [a, b], [b, a], [b, b], [a, a, a], [a, a, b], · · · }

Note that the set T need not be finite; for example, the class List N is
perfectly sensible. In this case there are an infinite number of lists containing
only one element, e.g.
{[0], [1], [2], [3], · · · }
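The parametrized class List T corresponds to a parametrized (polymorphic)
datatype, with the type parameter playing the role of the set T. A sketch in
Haskell (names are our choices; Haskell's built-in lists have exactly this shape):

    -- The class List T: nil, or an element of t consed onto a previously
    -- constructed list over t.
    data List t = Nil | Cons t (List t)
      deriving (Show, Eq)

    -- Over A = {a, b} (say, characters), the list [a, b, a] is written:
    -- Cons 'a' (Cons 'b' (Cons 'a' Nil))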

Abstract Syntax Trees for Lists

Note that the pretty linear notation for lists is only intended to make them
more readable; the syntactic structure underlying the list [a, b, a] is displayed
by the abstract syntax tree in Fig. 1.2.

1.3.3 Definitions
A definition is a way to extend a language to possibly include new symbols but
to describe them in terms of the existing language. Adding a definition does
not allow anything new to be said that could not already have been; though
definitions can be extraordinarily useful in making things clear. The key idea
behind defined terms is that they can be completely eliminated by just replacing
them with their definitions.

Definition 1.5 (definitions) A definition is a template or schematic form that


introduces new symbols into an existing language as an abbreviation for another
(possibly more complicated) term. Definitions have the form
A[x1 , · · · , xk ] =def B[x1 , · · · , xk ]

where xi , 1 ≤ i ≤ k are variables standing for terms of the language (defined so-
far). An instance of the defined term is the form A[t1 , · · · , tk ] where the xi ’s are
instantiated by terms ti . This term is an abbreviation (possibly parametrized
if k > 0) for the schematic formula B[t1 , · · · , tk ] i.e. for the term having the
shape of B but where each of the variables xi is replaced by the term ti . A
may introduce new symbols not in the language while B must be a formula
of the language defined up to the point of its introduction, this includes those
formulas given by the syntax as well as formulas that may include previously
defined symbols.
The symbol “=def” separates the left side of the definition, the thing being
defined, from the right side which contains the definition. The left side of the
definition may contain meta-variables which also appear on the right side.
Instances of defined terms can be replaced by their definitions by substituting
the arguments from the left side of the definition into the right side. The process of
“replacement” is fundamental and is called substitution. In following chapters,
we will carefully define substitution (as an algorithm) for propositional and then
predicate logic.
This definition of “definition” is perhaps too abstract to be of much use,
and yet the idea of introducing new definitions is one of the most natural ideas
of mathematics. A few definitions are given below which should make the idea
perfectly transparent.

Example 1.2. In mathematics, we define functions all the time by introducing


their names and arguments and giving the right hand side “definition.” e.g.

f (k) = 2^k + k

exp(x) = Σ_{n=0}^{∞} x^n / n!      (x ∈ R)

The second one is interesting since the variable n is “bound” by the summation
operator (Σ). We call operators that “bind” variables binding operators. In the

definition of exp, we must be careful when we substitute in something for x – if


the term being replaced for x contains the variable n in it, we first must rename
n’s on the right side of the definition and then go ahead with substitution. This
process is called capture avoiding substitution and is covered in some detail in
Chapter ??.

1.4 Semantics
Semantics associates meaning with syntax. Formal semantics (the kind we are
interested in here) is given by defining a mathematical mapping from syntax
(think of syntax as a kind of data-structure) to some other mathematical struc-
ture. This mapping is called the semantic function or interpretation; we will
use these terms interchangeably. When possible, formal languages are given
compositional semantics. The meaning of a syntactic structure depends on the
meanings of its parts.
Before a semantics is given, an element in a syntactic class can only be seen
as a meaningless structured term, or if expressed linearly as text, it is simply a
meaningless sequence of symbols. Since semantics are intended to present the
meanings of the syntax, they are taken from some mathematical domain which
is already assumed to be understood or is, by some measure, simpler. In the
case of a program, the meaning might be the sequence of states an abstract
machine goes through in the evaluation of the program on some input (in this
case, meanings would consist of pairs of input values and sequences of states);
or perhaps the meaning is described simply as the input/output behavior of the
program (in this case the meaning would consist of pairs of input values and
output values.) In either case, the meaning is described in terms of (well under-
stood) mathematical structures. Semantics establish the relationship between
the syntax and its interpretation as a mathematical structure.

1.4.1 Definition by Recursion


In practice, semantic functions are defined by recursion on the structure of
the syntax being interpreted. We expect students have already encountered
recursion. To define an interpretation, an equation must be given for each
possible construct(or) in the grammar; one equation for each alternative. So, it is easy
enough to tell if the definition is complete; it must have as many cases as there
are constructors. The alternatives that do not contain references to the syntactic
class defined by the grammar are base cases and these cases in the definition
of the semantic function are not recursive. The alternatives that do contain
subparts of the same type as the syntactic class being defined are inductive
alternatives. The semantic equations for these cases of the semantic function
typically call the semantic function (recursively) on the inductive parts; this
is where the recursion comes in. It may appear that a recursive definition
is circular, but because we restrict the inductive parts of the structure to be
“previously constructed”, we guarantee that eventually we will have reduced

the complex parts down to one of the base cases. In this way, computation by
recursion is guaranteed to terminate.

Semantics of B
Suppose that we intend the meanings of B to be among the set {0, 1}. Then,
functions assigning the values T and F to elements of {0, 1} count as a seman-
tics. Following the tradition of denotational semantics, if b ∈ B we write [[b]] to
denote the meaning of b. Using this notation one semantics would be:
[[T]] = 0
[[F]] = 1
Thus, the meaning of T is 0 and the meaning of F is 1. This interpretation
might not be the one you expected (i.e. you may think of 1 as T and 0 as
F) but, an essential point of formal semantics is that the meanings of symbols
or terms need not be the one you impose through convention or force of habit.
Things mean whatever the semantics say they do.4 Before the semantics has
been given, it is a mistake to interpret syntax as anything more than a complex
of meaningless symbols.

As another semantics for Booleans we might take the domain of meaning to


be sets of integers.5 We will interpret F to be the set containing the single
element 0 and T to be the set of all non-zero integers.
[[T]] = {k ∈ Z | k ≠ 0}
[[F]] = {0}
This semantics can be used to model the interpretation of integers as Booleans
in the C++ programming language where any non-zero number is interpreted
as T and 0 is interpreted as F as follows. If i is an integer and b is a Boolean
then:
(bool) i = b iff i ∈ [[b]]
This says: the integer i, interpreted as a Boolean6 , is equal to the Boolean b if
and only if i is in the set of meanings of b; e.g. since [[T]] = {k ∈ Z | k ≠ 0} we
know 5 ∈ [[T]] therefore we can conclude that (bool)5 = T.
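As a small illustration, the cast just described can be modeled as a function
from integers to the B type sketched earlier; castBool is a hypothetical name,
and the definition simply checks membership in [[T]] = {k ∈ Z | k ≠ 0}:

    -- An integer i denotes T exactly when i is in [[T]], i.e. when i /= 0,
    -- and denotes F when i = 0.
    castBool :: Integer -> B
    castBool i = if i /= 0 then T else F

    -- castBool 5 = T and castBool 0 = F, matching (bool)5 = T in C++.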

Semantics of N
We will describe the meaning of terms in N by mapping them onto non-negative
integers. This presumes we already have the integers as an understood mathe-
matical domain7 .
4 Perhaps interestingly, in the logic of CMOS circuit technology, this seemingly backward

semantic interpretation is the one used.


5 We denote the set of integers {· · · , −1, 0, 1, 2, · · · } by the symbol Z. This comes from

German Zahlen which means numbers.


6 A cast in C++ is specified by putting the type to cast to in parentheses before the term

to be cast.
7 Because the integers are usually constructed from the natural numbers this may seem to

be putting the cart before the horse, so to speak, but it provides a good example here.

Our idea is to map the term 0 ∈ N to the actual number 0 ∈ Z, and to


map terms having k occurrences of s to the integer k. To do this we define the
semantic equations recursively on the structure of the term. This is the stan-
dard form of definition for semantic equations over a grammar having inductive
structure.

[[0]] = 0
[[sn]] = [[n]] + 1 where n ∈ N

The equations say that the meaning of the term 0 is just 0 and if the term
has the form sn (for some n ∈ N) the meaning is the meaning of n plus one.
Note that there are as many cases in the recursive definition as there are in the
grammar, one case for each possible way of constructing a term in N. This will
always be the case for every recursive definition given on the structure of a term.
Under these semantics we calculate the meaning of a few terms to show how
the equations work.

[[s0]]
= [[0]] + 1
= 0 + 1
= 1

[[sssss0]]
= [[ssss0]] + 1
= ([[sss0]] + 1) + 1
= (([[ss0]] + 1) + 1) + 1
= ((([[s0]] + 1) + 1) + 1) + 1
= (((([[0]] + 1) + 1) + 1) + 1) + 1
= ((((0 + 1) + 1) + 1) + 1) + 1
= (((1 + 1) + 1) + 1) + 1
= ((2 + 1) + 1) + 1
= (3 + 1) + 1
= 4 + 1
= 5

Thus, under these semantics, [[s0]] = 1 and [[sssss0]] = 5.
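The semantic equations for N transcribe directly into a recursive function on
the N datatype sketched earlier, with one defining equation per constructor
(meaningN is our name for [[·]]):

    meaningN :: N -> Integer
    meaningN Zero  = 0                 -- [[0]] = 0
    meaningN (S n) = meaningN n + 1    -- [[s n]] = [[n]] + 1

    -- meaningN (S (S (S (S (S Zero))))) computes to 5, matching [[sssss0]] = 5.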

Semantics of PLB

The semantics for the language PLB is intended to reflect evaluation of Boolean
expressions where if-then-else has the normal interpretation. Thus our semantics
will map expressions of PLB to values in B. Recall the syntax of PLB:

PLB ::= b | if p then p1 else p2 fi

As always, the semantics will include one equation for each production in
the grammar. Informally, if a PLB term is already a Boolean, the semantic
function does nothing. For other, more complex, terms we explicitly specify the
values when the conditional argument is a Boolean, and if it is not we repeatedly
reduce it until it is grounded as a Boolean value. The equation for if-then-else
is given by case analysis (on the conditional argument).

[[b]] = b (1)
[[p
 1 ]] if ([[p]] = T)
[[if p then p1 else p2 fi]] = [[p2 ]] if ([[p]] = F) (2)
[[if q then p1 else p2 fi]] (p 6∈ B, q = [[p]])

We have numbered the semantic equations so we can refer to them in the


example derivations below; we have annotated each step in the derivation with:
the equation used, the bindings of the variables used to match the equation,
and, in the case of justifications based on equation (2), the case of the equation
that was matched. Note that the equations are applied from the top down,
i.e. we apply the case p ∉ B only after considering the possibilities that p = T
and p = F.

Here are some calculations (equational style derivations) that show how the
equations can be used to compute meanings.

[[if T then F else T fi]]
    ⟨⟨Equation: 2, p = T, p1 = F, p2 = T, Case: p = T⟩⟩
= [[F]]
    ⟨⟨Equation: 1, b = F⟩⟩
= F

[[if F then F else T fi]]
    ⟨⟨Equation: 2, p = F, p1 = F, p2 = T, Case: p = F⟩⟩
= [[T]]
    ⟨⟨Equation: 1, b = T⟩⟩
= T

Note that in these calculations, it seems needless to evaluate [[b]]; the fol-
lowing derivation illustrates a case where the first argument is not a Boolean
constant and the evaluation of the condition is needed.

[[if if F then F else T fi then F else T fi]]
    ⟨⟨Equation: 2, p = if F then F else T fi, p1 = F, p2 = T, Case: p ∉ B
      q = [[if F then F else T fi]]
            ⟨⟨Equation: 2, p = F, p1 = F, p2 = T, Case: p = F⟩⟩
        = [[T]]
            ⟨⟨Equation: 1, b = T⟩⟩
        = T
    ⟩⟩
= [[if T then F else T fi]]
    ⟨⟨Equation: 2, p = T, p1 = F, p2 = T, Case: p = T⟩⟩
= [[F]]
    ⟨⟨Equation: 1, b = F⟩⟩
= F

Using terms of PLB, we can define other logical operators.

not p =def if p then F else T fi
(p and q) =def if p then q else F fi
(p or q) =def if p then T else q fi
In Chapter ?? on propositional logic we will prove that these definitions are
indeed correct in that the defined operators behave as we expect them to.
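These semantic equations, and the defined operators, can be transcribed
against the PLB datatype sketched earlier. One mild simplification in the
sketch below: rather than re-wrapping the evaluated condition as in the
p ∉ B case of equation (2), the function evaluates the condition first and
branches on the result, which yields the same values (names are our choices):

    evalPLB :: PLB -> B
    evalPLB (Lit b)      = b                     -- equation (1)
    evalPLB (If p p1 p2) = case evalPLB p of     -- equation (2)
                             T -> evalPLB p1
                             F -> evalPLB p2

    -- The defined operators, as abbreviations expanding to if-then-else:
    notP :: PLB -> PLB
    notP p = If p (Lit F) (Lit T)

    andP, orP :: PLB -> PLB -> PLB
    andP p q = If p q (Lit F)
    orP  p q = If p (Lit T) q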

Semantics of List T
Perhaps oddly, we do not intend to assign semantics to the class List T . The
terms of the class represent themselves, i.e. we are interested in lists as lists.
But still, semantic functions are not the only functions that can be defined by
recursion on the structure of syntax; we can define other interesting functions
on lists by recursion on the syntactic structure of one or more of the arguments.
For example, we can define a function that glues two lists together (given
inputs L and M where L, M ∈ List T , append (L, M ) is a list in List T ). It is
defined by recursion on the (syntactic) structure of the first argument as follows:
Definition 1.6 (Append)

append ([ ], M ) = M
append (a::L, M ) = a::(append (L, M ))

The first equation of the definition says: if the first argument is the list [ ], the
result is just the second argument. The second equation of the definition says,
if the first argument is a cons of the form a::L, then the result is obtained by
consing a on the append of L and M . Thus, there are two equations, one for each
rule that could have been used to construct the first argument of the function.
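The two defining equations transcribe directly into a recursive function on
the List datatype sketched earlier, again with one equation per constructor
of the first argument:

    append :: List t -> List t -> List t
    append Nil        m = m                      -- first equation
    append (Cons a l) m = Cons a (append l m)    -- second equation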

We give some example computations with the definition of append.

append (a::b::[ ], [ ])
= a::(append (b::[ ], [ ]))
= a::b::(append ([ ], [ ]))
= a::b::[ ]

Using the more compact notation for lists, we have shown append ([a, b], [ ]) =
[a, b]. Using this notation for lists we can rewrite the derivation as follows:

append ([a, b], [ ])
= a::(append ([b], [ ]))
= a::b::(append ([ ], [ ]))
= a::b::[ ]
= [a, b]

Remark 1.1 (Infix Notation for Append) The append operation is so com-
monly used that many functional programming languages include special infix
notation. In the Haskell programming language [?] the infix notation is ++, in
the ML family of programming languages append is written @. We will write
m++n for append (m, n).
Using this infix notation, we rewrite the computation above as follows:

[a, b]++[ ]
= a::([b]++[ ])
= a::b::([ ]++[ ])
= a::b::[ ]
= [a, b]
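In a Haskell rendering of the List sketch, the infix notation can be introduced
as an operator definition; we use +++ here only because ++ is already taken by
Haskell's built-in list append:

    infixr 5 +++
    (+++) :: List t -> List t -> List t
    (+++) = append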

We will use the more succinct notation for lists from now on and the infix
notation, but do not forget that this is just a more readable display for the more
cumbersome but precise notation which explicitly uses the cons constructor.
Here is another example.
[ ]++[a, b]
= [a, b]
We will discuss lists and operations on lists as well as ways to prove properties
about lists in some depth in Chapter 11. For example, the rules for append
immediately give append ([ ], M ) = M , but the equation append (M, [ ]) = M
is a theorem as well. For any individual list M we can compute with
the rules for append and show this, but we currently have no way to assert this
in general for all M without proving it by induction.

1.5 Possibilities for Implementation


A number of programming languages provide excellent support for implement-
ing abstract syntax almost as succinctly as it has been presented above. This is

especially true of the ML family of languages [37, 33, 39] and the language
Haskell [?]. Scheme is also useful in this way [16]. All three, ML, Haskell and
Scheme are languages in the family of functional programming languages. Of
course we can define term structures in any modern programming language, but
the functional languages provide particularly good support for this. Similarly,
semantics is typically defined by recursion on the structure of the syntax and
these languages make such definitions quite transparent; implementations ap-
pear syntactically close to the mathematical notions used above. The approach
to implementing syntax and semantics in ML is taken in [?] and a similar ap-
proach using Scheme is followed in [16]. The excellent book [9] presents most of
the material presented here in the context of the Haskell programming language.
Part I

Logic


[Portrait: Kurt Gödel]

Kurt Gödel (1906–1978) was one of the greatest minds of the 20th century.
His famous incompleteness theorem changed, in a deep way, the conception
of mathematics.
Chapter 2

Propositional Logic

One of the people present said: ‘Persuade me that logic is useful.’


– ‘Do you want me to prove it to you?’ he asked. – ‘Yes.’ – ‘So I
must produce a probative argument?’ – He agreed. – ‘Then how will
you know if I produce a sophism?’ – He said nothing. – ‘You see,’
he said, ‘you yourself agree that all this is necessary, since without
it you cannot even learn whether it is necessary or not.’ Epictetus1
Discourses2 II xxv.

Propositional logic is the most basic form of logic. It takes propositions as
primitive.

Definition 2.1 (Proposition) A proposition is a statement that can, in prin-


ciple, be either true or false.

Of course the true nature of propositions is open to some philosophical de-


bate; we leave this debate to the philosophers and note that we are essentially
adopting Wittgenstein’s [55] definition of propositions as truth functional.
In the following sections of this chapter we define the syntax of propositional
formulas, describe the semantics, present a sequent proof system and proofs,
and finally discuss equational style reasoning in propositional logic.

2.1 Syntax of Propositional Logic


The primitive vocabulary of symbols from which more complex terms of the
language are constructed is called the lexical components. Syntax specifies the
acceptable forms or structures of lexical components allowed in the language. We
think of the syntactic forms (formulas) as trees (syntax trees) whose possible
shapes are given by a grammar for propositional formulas. Even though familiar
symbols appear in formulas (whose meaning you may already know), they are
uninterpreted here; they are simply tree-like structures waiting for a semantics
to be applied.

1 Epictetus was an ancient philosopher (50–130 A.D.) of the Stoic school in Rome.
2 Translated by Jonathan Barnes in his book [3].

2.1.1 Formulas
We use propositional variables to stand for arbitrary propositions and we assume
there is an infinite supply of these variables.

V = {p, q, r, p1 , q1 , r1 , p2 , · · · }

Note that the fact that the set V is infinite is unimportant since no individual
formula will ever require more than some fixed finite number of variables;
however, it is important that the number of variables we can select from is
unbounded. There must always be a way to get another one.
We include the constant symbol ⊥ (say “bottom”).
Complex propositions are constructed by combining simpler ones with propositional
connectives. For now we leave the meaning of the connectives unspecified
and simply present them as the symbols ¬, ∧, ∨, ⇒, which we read as
not, and, or and implies respectively.

Definition 2.2 (Propositional Logic syntax) The syntax of propositional
formulas (we denote the set as P) can be described by a grammar as follows:

P ::= ⊥ | x | ¬φ | φ ∧ ψ | φ ∨ ψ | φ ⇒ ψ

where
⊥ is a constant symbol,
x ∈ V is a propositional variable, and
φ, ψ ∈ P are meta-variables denoting previously constructed propositional
formulas.

To write the terms of the language P linearly (i.e. so that they can be
written from left-to-right on a page), we insert parentheses to indicate the order
of the construction of the term as needed, e.g. p ∧ q ∨ r is ambiguous in that we do
not know if it denotes a conjunction of a variable and a disjunction (p ∧ (q ∨ r))
or it denotes the disjunction of a conjunction and a variable ((p ∧ q) ∨ r).
Thus (written linearly) the following are among the terms of P: ⊥, p, q, ¬q,
p ∧ ¬q, ((p ∧ ¬q) ∨ q), and ¬((p ∧ ¬q) ∨ r).
We use the lowercase Greek letters φ and ψ (possibly subscripted) as meta-
variables ranging over propositional formulas, that is, φ and ψ are variables that
denote propositional formulas; note that they are not themselves propositional
formulas and no actual propositional formula contains either of them.

Another view of syntax* Readers familiar with a programming language
that supports high-level datatypes will be familiar with constructor functions.

If we were to implement a datatype representing formulas we would provide a
set of constructor functions, one for each kind of formula.
They can be described by giving their type signatures3 . If P is the type
of propositional formulas and V is the type of variables, the signatures of the
constructors are given as follows:

mk bot : P
mk var : V → P
mk not : P → P
mk and : (P × P) → P
mk or : (P × P) → P
mk implies : (P × P) → P

Thus, mk bot is a constant of type P, i.e. it is a propositional formula. The
constructor mk var maps variables in V to propositional formulas and so is
labeled as having the type V → P. The constructor mk not maps a previously
constructed propositional formula to a new propositional formula (by sticking
a not symbol in front) and so is labeled as having the type P → P. We say
it is a unary connective since it takes one argument. The constructors for and,
or, and implies all take two arguments and so are called binary connectives.
Their arguments are pairs of previously constructed propositional formulas and
so they all have the signature (P × P) → P.
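
As a concrete illustration (a sketch only, not part of the formal development),
the constructors can be rendered as an OCaml datatype; the capitalized names
Mk_bot, Mk_var, and so on are our own (OCaml requires capitalized constructors),
and variables are represented as strings:

type prop =
  | Mk_bot                       (* mk bot : P *)
  | Mk_var of string             (* mk var : V -> P *)
  | Mk_not of prop               (* mk not : P -> P *)
  | Mk_and of prop * prop        (* mk and : (P x P) -> P *)
  | Mk_or of prop * prop         (* mk or : (P x P) -> P *)
  | Mk_implies of prop * prop    (* mk implies : (P x P) -> P *)

For instance, formula vi below, ((p ∧ ⊥) ⇒ (p ∨ ¬q)), is built as
Mk_implies (Mk_and (Mk_var "p", Mk_bot), Mk_or (Mk_var "p", Mk_not (Mk_var "q"))).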
Here are some formulas represented in different ways.

No. Linear Form Constructor Form


i. ⊥ mk bot
ii. p mk var(p)
iii. ¬p mk not(mk var(p))
iv. p ∧ ¬p mk and(mk var(p), mk not(mk var(p)))
v. ¬⊥ ⇒ p mk implies(mk not(mk bot), mk var(p))
vi. ((p ∧ ⊥) ⇒ (p ∨ ¬q)) mk implies(mk and(mk var(p), mk bot),
mk or(mk var(p),
mk not(mk var(q))))

The syntax trees for the last four of these examples are as follows: (iii) is a
¬ node with child p; (iv) is an ∧ node whose children are p and a ¬ node with
child p; (v) is an ⇒ node whose children are a ¬ node with child ⊥ and the leaf
p; and (vi) is an ⇒ node whose children are an ∧ node with children p and ⊥
and an ∨ node with children p and a ¬ node with child q.
3 A signature f : A → B says f is a function from type A to type B. A signature of the form
p : A × B says p is a tuple whose first element is of type A and whose second element is of
type B.


2.1.2 Definitions: Extending the Language


As discussed in Sect. 1.3.3, definitions allow for the introduction of new symbols
into the language by describing them in terms of the existing language.
Adding new symbols can be a significant notational convenience but it does not
extend the expressiveness of the language since definitions are given in terms of
the existing language.
A useful connective we have not included in our base syntax for propositional
logic is the if-and-only-if connective.

If-and-only-if

Definition 2.3 (bi-conditional) The so-called bi-conditional or if-and-only-if
connective is defined as follows:

(φ ⇔ ψ) =def ((φ ⇒ ψ) ∧ (ψ ⇒ φ))

True > The syntax includes a constant ⊥ which, when we do the semantics,
will turn out to denote false; but we do not have a constant corresponding to
true. We define it here.

Definition 2.4 (Top) We define a new constant “>” (say top) as follows:

> =def ¬⊥.

2.1.3 Substitutions*
A substitution is a means to map formulas to other formulas by uniformly
replacing all occurrences of individual variables with a formula. For example,
given the formula (p ∧ q) ⇒ p we could substitute any formula for p or any
formula for q. Say we wanted to substitute (r ∨ q) for p; then we write the
following:

subst(p, ⌜r ∨ q⌝)⌜(p ∧ q) ⇒ p⌝

Recalling that syntax is best viewed as a tree, the substitution is carried out
by recursion on the structure of the formula.
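
Under the same assumptions as the OCaml sketches above, substitution can be
implemented by recursion on the structure of the formula (the function name
subst and its argument order are our own):

(* subst x s f: replace every occurrence of the variable x in f by s *)
let rec subst (x : string) (s : prop) (f : prop) : prop =
  match f with
  | Mk_bot -> Mk_bot
  | Mk_var y -> if y = x then s else Mk_var y
  | Mk_not g -> Mk_not (subst x s g)
  | Mk_and (g, h) -> Mk_and (subst x s g, subst x s h)
  | Mk_or (g, h) -> Mk_or (subst x s g, subst x s h)
  | Mk_implies (g, h) -> Mk_implies (subst x s g, subst x s h)

For the example above, substituting (r ∨ q) for p in (p ∧ q) ⇒ p yields
((r ∨ q) ∧ q) ⇒ (r ∨ q).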



2.1.4 Exercises

2.2 Semantics
Semantics gives meaning to syntax. The style of semantics presented here was
first presented by Alfred Tarski in his paper [49] on truth in formalized languages,
which was first published in Polish in 1933.
If I asked you to tell me whether the expression x + 11 > 42 is true, you’d
probably tell me that you need to know what the value of x is. So, the meaning
of x + 11 > 42 depends on the value assigned to x. Similarly, if I asked you if
a formula (say p ∧ q) was true, you’d tell me you need to know what the values
of p and q are. The meaning of a formula depends on the values assigned to
the variables in the formula.
In the following sections we introduce the set of Boolean values and we formalize
the notion of an assignment. We present the semantics for propositional
logic in the form of a valuation function that, given an assignment and a formula,
returns T or F. The valuation function is then used as the basis to describe the
method of truth tables. Truth tables characterize all the possible meanings of
a formula. This gives us a semantics for propositional logic.

2.2.1 Boolean values and Assignments


Definition 2.5 (Booleans) The two element set

B = {T, F}

is called the Boolean4 set, and its elements (T and F) are called Boolean values.

Note that any two element set would do, as long as we could distinguish the
elements from one another.
When we ask if a formula is true, we are asking whether it evaluates to
T when it is interpreted with respect to some kind of structure. For an arithmetic
expression like the one in the example given above (x + 11 > 42), the
structure would have to (at least) indicate the integer value associated with
the variable x. For propositional logic, the structure binding Boolean values to
variables is called an assignment.

Definition 2.6 (assignment) An assignment is a function that maps propositional
variables to one of the Boolean values T or F. Assignments have the
following type:
V → {T, F}

We will use the symbols α, α0 , α̂, α1 , α2 , · · · to denote assignments.


Remember that the set of propositional variables V is infinite but any indi-
vidual formula contains a finite number of them. We don’t need to know the
4 The Booleans are named for George Boole (1815–1864), the English mathematician
who first presented logic in symbolic or algebraic form.



value of every variable to determine the value of a formula; we simply need
to know the values of the variables occurring in the formula. This will allow
us to specify assignments by enumerating the cases for each variable that does
occur. We “don’t care” what values the assignment gives to variables not in
the formula under consideration; they will have no effect on the outcome of the
valuation.

Counting 2.1 (Assignments) For a formula φ containing k distinct variables
where k > 0, there are 2^k possible assignments (disregarding all the variables in
V that are not among the k variables that do occur).

Example 2.1. For the formula consisting of the single variable p there are
2^1 = 2 possible assignments.

p
α0 F
α1 T
To read the table, the assignment name is in the left column and the variables
are listed across the top. Thus, α0 (p) = F and α1 (p) = T.
For the formula p ∨ ⊥ having one occurrence of the variable p, k = 1 and
there are 2^1 = 2 possible assignments, which were the ones just given.
The formula (p ∨ q) ⇒ p has two distinct variables p and q and so has 2^2 = 4
different assignments.

p q
α0 F F
α1 F T
α2 T F
α3 T T

2.2.2 The Valuation Function


Definition 2.7 (valuation) A valuation is a function that takes an assignment
and a propositional formula as input and returns a Boolean value, depending
on whether the assignment determines the formula’s value to be T or F.

We define the (recursive) valuation function val by induction on the structure
of the formula as follows.

Definition 2.8 (valuation function)

val(α, ⊥) = F
val(α, x) = α(x) whenever x ∈ V
val(α, ¬φ) = not(val(α, φ))
val(α, φ ∧ ψ) = val(α, φ) and val(α, ψ)
val(α, φ ∨ ψ) = val(α, φ) or val(α, ψ)
val(α, φ ⇒ ψ) = not(val(α, φ)) or val(α, ψ)


val(α, φ ⇔ ψ) = val(α, φ ⇒ ψ) and val(α, ψ ⇒ φ)

The definition specifies how to compute the valuation of any propositional
formula (under assignment α) by including one equation for each rule in the
grammar of P.
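
A sketch of this definition over the prop datatype introduced in the previous
section, with assignments represented as functions from variable names to
OCaml booleans (the name eval is ours, to avoid clashing with the mathematical
val):

(* eval alpha f computes val(alpha, f) *)
let rec eval (alpha : string -> bool) (f : prop) : bool =
  match f with
  | Mk_bot -> false
  | Mk_var x -> alpha x
  | Mk_not g -> not (eval alpha g)
  | Mk_and (g, h) -> eval alpha g && eval alpha h
  | Mk_or (g, h) -> eval alpha g || eval alpha h
  | Mk_implies (g, h) -> not (eval alpha g) || eval alpha h

Each clause of the function transcribes the corresponding equation of
Definition 2.8.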
Example 2.2. As an example, suppose we define an assignment α as follows:
α(p) = T
α(q) = F
α(r) = T
Then, the valuation of the formula ((p ⇒ q) ∨ r) is computed as follows.

val(α, ((p ⇒ q) ∨ r))


= val(α, (p ⇒ q)) or val(α, r)
= (not(val(α, p)) or val(α, q)) or α(r)
= (not(α(p)) or α(q)) or T
= (not(T) or F) or T
= (F or F) or T
= F or T
=T
Consider the valuation of another formula under the same assignment.
Example 2.3.
val(α, ((p ⇒ q) ∨ q))
= val(α, (p ⇒ q)) or val(α, q)
= (not(val(α, p)) or val(α, q)) or α(q)
= (not(α(p)) or α(q)) or F
= (not(T) or F) or F
= (F or F) or F
= F or F
=F
Definition 2.9 (satisfies) An assignment α satisfies a formula φ if and only
if val(α, φ) = T. In this case we write α |= φ and say “α models φ”.
Definition 2.10 (falsifies) An assignment α falsifies a formula φ if and only
if val(α, φ) = F. In this case we write α ⊭ φ and say “α does not model φ.”
Definition 2.11 (valid) If a formula φ is satisfied by every assignment (i.e. if
it is true in every row of the truth table) it is valid, and we write |= φ. In this
case we say φ is true in all models.
Example 2.2 shows that α |= ((p ⇒ q) ∨ r) and Example 2.3 shows α ⊭
((p ⇒ q) ∨ q).

2.2.3 Truth Table Semantics


The semantics for propositional logic maps a formula (a syntax tree) to its
meaning. The meaning of a propositional formula depends only on the meaning
of its parts. Based on this, we say the semantics of propositional logic is
compositional. This fact suggests a method for determining the meaning of any
propositional formula; i.e. consider all the possible values its parts may take.
This leads to the idea of truth table semantics: the meaning of each connective
is defined in terms of the meaning of each part, since each part can only take
the values T or F, denoting true and false respectively.
Thus, complete analysis of the possible values of true and false requires us
to consider only a finite number of cases. Truth tables were first formulated by
the philosopher Ludwig Wittgenstein. . .
The formula constant ⊥ is mapped to the constant F as the following one
row truth table indicates.


⊥
F

Negation is a unary connective (i.e. it only has one argument) that toggles
the value of its argument as the following truth table shows.

φ ¬φ
T F
F T

The truth functional interpretations of the binary connectives for conjunction,
disjunction, implication, and if-and-only-if are summarized in the following
truth table.

φ ψ (φ ∧ ψ) (φ ∨ ψ) (φ ⇒ ψ) (φ ⇔ ψ)
T T T T T T
T F F T F F
F T F T T F
F F F F T T

Thus, the truth or falsity of a formula is determined solely by the truth or falsity
of its sub-terms:

φ ∧ ψ: is true if both φ and ψ are true and is false otherwise,
φ ∨ ψ: is true if at least one of φ or ψ is true and is false otherwise,
φ ⇒ ψ: is true if φ is false or if ψ is true and is false otherwise, and
φ ⇔ ψ: is true if both φ and ψ are true or if they are both false
and is false otherwise.

We remark that for any element of P, although the number of cases (rows in
a truth table) is finite, the total number of cases is exponential in the number
of distinct variables. This means that, for each variable we must consider in a
formula, the number of cases we must consider doubles. Complete analysis of
a formula having no variables (i.e. its only base term is ⊥) has 2^0 = 1 row;
a formula having one distinct variable has 2^1 = 2 rows, two variables means
four cases, three variables means eight, and so on. If the formula contains n
distinct variables, there are 2^n possible combinations of true and false that the
n variables may take.
Consider the following example of a truth table for the formula ((p ⇒ q) ∨ r).

p q r (p ⇒ q) ((p ⇒ q) ∨ r)
T T T T T
T T F T T
T F T F T
T F F F F
F T T T T
F T F T T
F F T T T
F F F T T

Since the formula has three distinct variables, there are 2^3 = 8 rows in the
truth table. Notice that the fourth row of the truth table falsifies the formula,
i.e. if p is true, q is false, and r is false, the formula ((p ⇒ q) ∨ r) is false. All the
other rows satisfy the formula i.e. all the other assignments of true and false to
the variables of the formula make it true.
A formula having the same shape (i.e. drawn as a tree it has the same
structure), but having only two distinct variables, is ((p ⇒ q) ∨ p). Although
there are three variable occurrences in the formula (two occurrences of p and
one occurrence of q), the distinct variables are p and q. To completely analyze
the formula we only need 2^2 = 4 rows in the truth table.

p q (p ⇒ q) ((p ⇒ q) ∨ p)
T T T T
T F F T
F T T T
F F T T

Note that this formula is true for every assignment of Boolean values to the
variables p and q.

Definition 2.12 (satisfiable) A propositional formula is satisfiable if the column
under the principal connective is true in at least one row of the truth table.

Definition 2.13 (falsifiable) A propositional formula is falsifiable if the column
under the principal connective is false in at least one row of the truth table.

Definition 2.14 (valid) A propositional formula is valid (or a tautology) if the
column under the principal connective is true in every row of the truth table.

Definition 2.15 (contradiction) A propositional formula is a contradiction
(or unsatisfiable) if the column under the principal connective is false in every
row of the truth table.

A formula having the meaning T

We did not include a constant in the base syntax for the language of propo-
sitional logic whose meaning is T; however, we defined the constant > (see
Definition 2.4). The following truth table shows that this defined formula al-
ways has the meaning T.

⊥ ¬⊥
F T
Note that any tautology could serve as our definition of true, but this is the
simplest such formula in the language P.

2.2.4 Exercises

2.3 Proof Theory


An alternative to using semantics to determine whether a formula is valid is
to build a proof. In this section we present a formal5 system of proofs. It is
fair to think of a proof as an argument that proceeds in steps. At each step
there is information that has been accrued in the process of building the
proof to that point and there are the goals left to be shown before the proof is
complete. Together, the information accrued so far and the outstanding goals
determine the state of the proof. In the proof system presented here, this state
is represented in a structure called a sequent. Although a narrative describing a
proof is necessarily linear, it turns out that proofs are formally represented by
tree structures. Nodes of the tree are sequents recording the state of the proof at
that point, and the number of edges from a node and the form of the children
nodes depend on which proof rule was applied at that node.
A key idea is that formal proofs are a kind of tree-like data-structure and
we can determine whether a tree really is a proof by checking local conditions
on the nodes. In this section, we make this idea concrete.

5 By formal, we mean that we give a detailed mathematical presentation with enough detail

for a software implementation.



2.3.1 Sequents

Gerhard Gentzen (1909–1945) was a German logician who, in his short years,
made astounding contributions to the foundations of mathematics, logic and
proof theory. In the same paper [18] he invented both the natural deduction
proof system and the sequent proof systems we use here.
Sequents are pairs of lists of formulas used to characterize a point in a proof.
One element of the pair lists the assumptions that are in force at the point in
a proof characterized by the sequent and the other lists the goals, one of which
we must prove to complete a proof of the sequent. The sequent formulation
of proofs, presented below, was first given by the German logician Gerhard
Gentzen in 1935 [18].
We will use letters (possibly subscripted) from the upper-case Greek alpha-
bet as meta-variables that range over (possibly empty) lists of formulas. Thus
Γ, Γ1 , Γ2 , ∆, ∆1 , and ∆2 all stand for arbitrary elements of the class List P 6 .

Definition 2.16 (Sequent) A sequent is a pair of lists of formulas ⟨Γ, ∆⟩. The
list Γ is called the antecedent of the sequent and the list ∆ is called the succedent
of the sequent.

Remark 2.1 (On Sequent notation) A standard notational convention is to
write the sequent ⟨Γ, ∆⟩ in the form Γ ` ∆. The symbol “`” is called turnstile.
If Γ = [p, p ⇒ q] and ∆ = [q, r] we will omit the left and right brackets and
simply write p, p ⇒ q ` q, r instead of [p, p ⇒ q] ` [q, r]. In the specifications of
proof rules we will often want to focus on some formula in the antecedent (left
side) or consequent (right side) of a sequent. We do this by introducing some
notation for destructuring lists. Recall the definition of append (Def. 11.2) from
Chapter 1 and the infix notation. We write m++n to denote the list constructed
by appending (concatenating) the lists m and n together e.g. [3, 4]++[1, 2, 3] =
[3, 4, 1, 2, 3]. If Γ1 , Γ2 are formula lists and φ is a formula, we write Γ1 , φ, Γ2
to denote the list Γ1 ++([φ]++Γ2 ). By these conventions, Γ1 , φ, Γ2 is a list of
formulas in which the formulas in Γ1 occur in order and to the left of φ and the
formulas in Γ2 occur in order and to the right of φ.
6 The syntax for the class List T , lists over some set T , was defined in Chapter ??.

2.3.2 Semantics of Sequents


Informally, a sequent Γ ` ∆ is valid if, assuming all the formulas in Γ are true,
then at least one formula in ∆ is. Validity for propositional formulas was defined
in terms of satisfiability under all assignments and likewise for sequents.
Definition 2.17 (Sequent satisfiability) A sequent Γ ` ∆ is satisfiable under
assignment α if, when α makes all the formulas in Γ true, α makes at least one
formula in ∆ true. In this case we write α |= Γ ` ∆.
Given an assignment mapping propositional variables to Boolean values, we
can compute the valuation of a sequent under that assignment by translating
the sequent into a formula of propositional logic and then using the ordinary
valuation function given in Def. 2.8.
The translation of the sequent Γ ` ∆ into a formula is based on the following
ideas:
i.) All the formulas in the antecedent Γ are true if and only if their conjunction
is true as well.
ii.) At least one formula in the consequent ∆ is true if and only if the disjunc-
tion of the formulas in ∆ is true.
iii.) Implication models the notion if-then.
We will formalize the translation once we have defined operations mapping
lists of formulas to their conjunctions and disjunctions.

Conjunctions and Disjunctions of lists of Formulas


Informally, if Γ is the list [φ1 , φ2 , · · · , φn ] then

∧φ∈Γ φ = (φ1 ∧ (φ2 ∧ (· · · (φn ∧ (¬⊥)) · · · )))

Dually, if ∆ is the list [ψ1 , ψ2 , · · · , ψm ] then

∨φ∈∆ φ = (ψ1 ∨ (ψ2 ∨ (· · · (ψm ∨ (⊥)) · · · )))

These operations can be formally defined by recursion on the structure of
their list arguments as follows:

Definition 2.18 (Conjunction over a list) The function which conjoins all
the elements in a list is defined on the structure of the list by the following two
recursive equations.

∧φ∈[ ] φ  =def  ¬⊥

∧φ∈(ψ::Γ) φ  =def  (ψ ∧ (∧φ∈Γ φ))
2.3. PROOF THEORY 33

The first equation defines the conjunction of formulas in the empty list simply
to be the formula ¬⊥ (i.e. the formula having the meaning T). The formula ¬⊥
is the right identity for conjunction i.e. the following is a tautology ((φ∧¬⊥) ⇔
φ).

Exercise 2.1. Verify that ((φ ∧ ¬⊥) ⇔ φ) is a tautology.

You might argue semantically that this is the right choice for the empty list
as follows: the conjunction of the formulas in a list is valid if and only if all the
formulas in the list are valid, but there are no formulas in the empty list, so all
of them (all none of them) are valid.
The second equation in the definition says that the conjunction over a list
constructed by a cons is the conjunction of the individual formula that is the
head of the list with the conjunction over the tail of the list.

Definition 2.19 (Disjunction over a list) The function which creates a disjunction
of all the elements in a list is defined by recursion on the structure of
the list and is given by the following two equations.

∨φ∈[ ] φ  =def  ⊥

∨φ∈(ψ::Γ) φ  =def  (ψ ∨ (∨φ∈Γ φ))

The first equation defines the disjunction of formulas in the empty list simply
to be the formula ⊥ (i.e. the formula whose meaning is F). The formula ⊥ is
the right identity for disjunction i.e. the following is a tautology ((φ ∨ ⊥) ⇔ φ).

Exercise 2.2. Verify that ((φ ∨ ⊥) ⇔ φ) is a tautology.

An informal semantic argument for this choice to represent the disjunction
of the empty list might go as follows: the disjunction of the formulas in a list
is valid if and only if some formula in the list is valid, but there are no formulas
in the empty list, so none of them are valid and the disjunction must be false.
The second equation in the definition says that the disjunction over a list
constructed by a cons is the disjunction of the individual formula that is the
head of the list with the disjunction over the tail of the list.
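
Assuming the prop datatype and the OCaml sketches from earlier, both
definitions transcribe almost directly, one clause per defining equation:

(* conjunction of a list of formulas; the empty conjunction is ¬⊥ *)
let rec conj (gamma : prop list) : prop =
  match gamma with
  | [] -> Mk_not Mk_bot
  | psi :: rest -> Mk_and (psi, conj rest)

(* disjunction of a list of formulas; the empty disjunction is ⊥ *)
let rec disj (delta : prop list) : prop =
  match delta with
  | [] -> Mk_bot
  | psi :: rest -> Mk_or (psi, disj rest)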

Formal Semantics for Sequents

Now that we have operators for constructing conjunctions and disjunctions over
lists of formulas, we give a definition of sequent validity in terms of the validity
of a formula of P.
34 CHAPTER 2. PROPOSITIONAL LOGIC

Definition 2.20 (Formula interpretation of a sequent)

[[Γ ` ∆]]  =def  ((∧φ∈Γ φ) ⇒ (∨ψ∈∆ ψ))

Thus [[Γ ` ∆]] is a translation of the sequent Γ ` ∆ into a formula. Using this
translation, we semantically characterize the validity of a sequent as follows.
Definition 2.21 (Sequent Valuation) Given an assignment α and a sequent
Γ ` ∆ we say α satisfies Γ ` ∆ if the following holds:
α |= [[Γ ` ∆]]
In this case we write α |= Γ ` ∆.
This gives the following definition.
Definition 2.22 (Sequent validity) A sequent Γ ` ∆ is valid if and only if
|= [[Γ ` ∆]]
That is, a sequent is valid if and only if

|= (∧φ∈Γ φ) ⇒ (∨ψ∈∆ ψ)
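
Continuing the sketch, the formula interpretation of a sequent is a one-line
function over the conj and disj operations just defined (the name interp is
ours):

(* [[gamma |- delta]] as a propositional formula *)
let interp (gamma : prop list) (delta : prop list) : prop =
  Mk_implies (conj gamma, disj delta)

A sequent is then valid exactly when eval alpha (interp gamma delta) is true
for every assignment alpha.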

To exercise these definitions we now consider the cases where the antecedent
and/or succedent are empty. If Γ = [ ] then the sequent Γ ` ∆ is valid if and
only if the disjunction of the formulas in ∆ is valid. If ∆ = [ ] then the sequent
Γ ` ∆ is valid if and only if the conjunction of the formulas in Γ is unsatisfiable.
If both the antecedent and the succedent are empty, i.e. Γ = ∆ = [ ], then the
sequent Γ ` ∆ is not valid since

(∧φ∈[ ] φ) ⇒ (∨ψ∈[ ] ψ)  is equivalent to the formula (¬⊥ ⇒ ⊥)

and (¬⊥ ⇒ ⊥) is a contradiction. We verify this claim with the following truth
table.

⊥ ¬⊥ (¬⊥ ⇒ ⊥)
F T F

2.3.3 Sequent Schemas and Matching


So far we have used sequents as schemas that can be filled in with lists of actual
formulas taking the place of the meta-variables Γ and ∆. You can think of
them as templates waiting to be filled in. A way to fill in the template is by
a substitution function that maps the meta-variables (φ, ψ, Γ, ∆) in the syntax
of a schematic sequent to the appropriate kind of things (formulas or formula
lists) depending on the type of meta-variable.

Example 2.4. Consider the following schematic sequent:

S: Γ1 , φ ⇒ ψ, Γ2 ` ∆

The following substitution (call it σ) specifies how to map meta-variables
occurring in S to lists of formulas and formulas.

σ(Γ1 ) = [p]
σ(Γ2 ) = []
σ(∆) = [r]
σ(φ) = p ∨ q
σ(ψ) = r

We apply a substitution to a schema S by decomposing the schema into its
syntactic parts and recursively applying the substitution to those parts.

σ(Γ1 , φ ⇒ ψ, Γ2 ` ∆)
= σ(Γ1 ), σ(φ ⇒ ψ), σ(Γ2 ) ` σ(∆)
= [p], σ(φ) ⇒ σ(ψ), [] ` [r]
= [p], (p ∨ q) ⇒ r, [] ` [r]

By our notational conventions for sequents the resulting sequent is written as
follows:

p, (p ∨ q) ⇒ r ` r

Definition 2.23 (matching) A sequent (call it S) matches a sequent schema
(call it Ŝ) if there is some way of substituting actual formula lists and formulas
for meta-variables in the schema Ŝ (elements of List P for list meta-variables
and elements of P for formula meta-variables) so that the resulting sequent
is identical to S.

2.3.4 Proof Rules


Definition 2.24 (Proof Rule Schemata) Proof rules for the propositional
sequent calculus have one of the following three forms:

(N)
C

H
(N)
C

H1 H2
(N)
C
where C, H, H1 , and H2 are all schematic sequents. N is the name of the rule.
The H patterns are the premises (or hypotheses) of the rule and the pattern C is
the goal (or conclusion) of the rule. Rules having no premises are called axioms.
Rules that operate on formulas in the antecedent (on the left side of `) of a
sequent are called elimination rules and rules that operate on formulas in the
consequent (the right side of `) of a sequent are called introduction rules.

Definition 2.25 (Admissible Rules) A proof rule is admissible if, whenever
the premises of the rule are valid, the conclusion is valid.

Proof rules are schemas (templates) used to specify a single step of inference.
The proof rule schemas are specified by arranging schematic sequents in partic-
ular configurations to indicate which parts of the rule are related to which. For
example, the rule for decomposing an implication on the left side of the turnstile
is given as:

Γ1 , Γ2 ` φ, ∆ Γ1 , ψ, Γ2 ` ∆
Γ1 , (φ ⇒ ψ), Γ2 ` ∆

There are three schematic sequents in this rule.

Γ1 , Γ2 ` φ, ∆
Γ1 , ψ, Γ2 ` ∆
Γ1 , (φ ⇒ ψ), Γ2 ` ∆

Each of these schematic sequents specifies a pattern that an actual (or concrete)
sequent might (or might not) match. By an actual sequent, we mean a
sequent that contains no meta-variables (e.g. it contains no Γs or ∆s, or φs or
ψs, but is composed of formulas in the language of propositional logic).

Structural Rules*
The semantics of sequents (given in Def. 2.20) gives them lots of structure.
There are some non-logical rules that sequents obey. These rules are admissible
based simply on the semantics, regardless of the formula instances occurring
in the antecedent and consequent. They essentially express the ideas that the
order of the formulas does not affect validity and neither does the number of
times a formula occurs in the antecedent or the consequent.
It turns out that in the propositional case, it is never required that a struc-
tural proof rule be used to find a proof. Once the quantifiers have been added
in Chapter 4, some proofs will require the use of these rules.

Weakening Weakening says that if Γ ` ∆ is valid, then adding formulas to
the left or right side does not affect validity.

Proof Rule 2.1 (WL)

Γ`∆
(WL)
φ, Γ ` ∆

Proof Rule 2.2 (WR)

Γ`∆
(WR)
Γ ` φ, ∆

Contraction The contraction rules essentially say that, if a formula occurs
on the left or right side of a sequent, the number of times it occurs does not
matter. You are free to make as many copies of a formula as you wish.

Proof Rule 2.3 (CL)


Γ1 , φ, Γ2 , φ, Γ3 ` ∆
(CL)
Γ1 , φ, Γ2 , Γ3 ` ∆

Proof Rule 2.4 (CR)


Γ ` ∆1 , φ, ∆2 , φ, ∆3
(CR)
Γ ` ∆1 , φ, ∆2 , ∆3

Permutation The permutation rules say that reordering the formulas in the
antecedent and consequent does not affect validity. The reordering is expressed in
terms of locally swapping the order of adjacent formulas in the antecedent and
consequent of the sequent.

Proof Rule 2.5 (PermL)


Γ1 , φ, ψ, Γ2 ` ∆
(PermL)
Γ1 , ψ, φ, Γ2 ` ∆

Proof Rule 2.6 (PermR)


Γ ` ∆1 , φ, ψ, ∆2
(PermR)
Γ ` ∆1 , ψ, φ, ∆2

Axiom Rules
If there is a formula that appears in both the antecedent and the consequent
of a sequent then the sequent is valid. The axiom rule reflects this and has the
following form:

Proof Rule 2.7 (Ax)

(Ax)
Γ1 , φ, Γ2 ` ∆1 , φ, ∆2

Also, since false (⊥) implies anything, if the formula ⊥ appears in the
antecedent of a sequent, that sequent is trivially valid.
Proof Rule 2.8 (⊥Ax)

(⊥Ax)
Γ1 , ⊥, Γ2 ` ∆

Conjunction Rules

On the right

A conjunction (φ ∧ ψ) is true when both φ and ψ are true. Thus,
the proof rule for a conjunction on the right is given as follows:

Proof Rule 2.9 (∧R)

Γ ` ∆1 , φ, ∆2 Γ ` ∆1 , ψ, ∆2
(∧R)
Γ ` ∆1 , (φ ∧ ψ), ∆2

On the left

On the other hand, if we have a hypothesis that is a conjunction of the form
(φ ∧ ψ), then we know both φ and ψ are true.

Proof Rule 2.10 (∧L)

Γ1 , φ, ψ, Γ2 ` ∆
(∧L)
Γ1 , (φ ∧ ψ), Γ2 ` ∆

Disjunction Rules

A disjunction (φ ∨ ψ) is true when either φ or ψ is true. Thus, the
proof rule for proving a goal having disjunctive form is the following.

Proof Rule 2.11 (∨R)

Γ ` ∆1 , φ, ψ, ∆2
(∨R)
Γ ` ∆1 , (φ ∨ ψ), ∆2

On the other hand, if we have a hypothesis that is a disjunction of the form
(φ ∨ ψ), then, since we don’t know which of the disjuncts is true (but since we
are assuming the disjunction is true, one of them must be), we must continue
by cases on φ and ψ, showing that the sequent Γ1 , φ, Γ2 ` ∆ is true and that
the sequent Γ1 , ψ, Γ2 ` ∆ is as well.

Proof Rule 2.12 (∨L)

Γ1 , φ, Γ2 ` ∆ Γ1 , ψ, Γ2 ` ∆
(∨L)
Γ1 , (φ ∨ ψ), Γ2 ` ∆

Implication Rules

An implication (φ ⇒ ψ) is provable when, assuming φ, you can prove ψ. Thus,
the proof rule for proving a goal having implicational form is the following.

Proof Rule 2.13 (⇒R)

Γ, φ ` ∆1 , ψ, ∆2
(⇒R)
Γ ` ∆1 , (φ ⇒ ψ), ∆2

If we have a hypothesis that is an implication of the form (φ ⇒ ψ) and we
wish to prove some formula in the conclusion ∆, working backward, we must
show that adding ψ to the hypotheses proves ∆ and also that the sequent with
φ added to the conclusion (i.e. φ, ∆) is provable. Structurally, this rule is the
most complicated.

Proof Rule 2.14 (⇒L)

Γ1 , Γ2 ` φ, ∆ Γ1 , ψ, Γ2 ` ∆
(⇒L)
Γ1 , (φ ⇒ ψ), Γ2 ` ∆

Note that if φ is in Γ then this is just like Modus Ponens since the left subgoal
becomes an instance of the axiom rule.

Negation Rules

The negation ¬φ can be viewed as an abbreviation for the formula φ ⇒ ⊥; this
claim can be checked by writing out the truth table. Based on this relationship,
the proof rule for negation is related to that of implication (see above).

Proof Rule 2.15 (¬R)

Γ, φ ` ∆1 , ∆2
(¬R)
Γ ` ∆1 , ¬φ, ∆2

If you have a negated formula ¬φ in the antecedent, working backward, you
can swap the formula φ to the other side of the turnstile and try to prove it
directly.

Proof Rule 2.16 (¬L)

Γ1 , Γ2 ` φ, ∆
(¬L)
Γ1 , ¬φ, Γ2 ` ∆

2.3.5 Proofs
We have the proof rules, now we define what a proof is. A formal proof is a
tree structure where the nodes of the tree are sequents, the leaves of the tree
are instances of one of the axiom rules, and there is an edge between sequents
if the sequents form an instance of some proof rule. We can formally describe
an inductive data-structure for representing sequent proofs.

Definition 2.26 (proof tree) A proof tree having root sequent S is defined
inductively as follows:

i.) If the sequent S is an instance of one of the axiom rules whose name is N,
then

(N)
S

is a proof tree whose root is the sequent S.

ii.) If ρ1 is a proof tree whose root is the sequent S1 and, if

S1
(N)
S

is an instance of some proof rule having a single premise, then the tree

ρ1
(N)
S

is a proof tree whose root is the sequent S.

iii.) If ρ1 is a proof tree with root sequent S1 and ρ2 is a proof tree with root
sequent S2 and, if

S1 S2
(N)
S

is an instance of some proof rule having two premises, then the tree

ρ1 ρ2
(N)
S

is a proof tree whose root is the sequent S.
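
A sketch of this inductive data-structure in OCaml, assuming the prop datatype
from earlier; representing sequents as pairs of formula lists and rule names as
strings is our own simplification:

type sequent = prop list * prop list

type proof_tree =
  | Leaf of string * sequent                               (* axiom rule instance *)
  | Unary of string * sequent * proof_tree                 (* rule with one premise *)
  | Binary of string * sequent * proof_tree * proof_tree   (* rule with two premises *)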



Although proof trees were just defined by starting with the leaves and build-
ing them toward the root, the proof rules are typically applied in the reverse
order, i.e. the goal sequent is scanned to see if it is an instance of an axiom
rule, if so we’re done. If the sequent is not an instance of an axiom rule and it
contains some non-atomic formula on the left or right side, then the rule for the
principal connective of that formula is matched against the sequent. The result-
ing substitution is applied to the schematic sequents in the premises of the rule.
The sequents generated by applying the matching substitution to the premises
are placed in the proper positions relative to the goal. This process is repeated
on incomplete leaves of the tree (leaves that are not instances of axioms) until
all leaves are either instances of an axiom rule, or until all the formulas in the
sequents at the leaves of the tree are atomic and are not instances of an axiom
rule. In this last case, there is no proof of the goal sequent.
As characterized in [43], the goal directed process of building proofs, i.e.
working backward from the goal, is a reductive process as opposed to the de-
ductive process which proceeds forward from the axioms.
We present some examples.
Example 2.5. Consider the sequent (p ∨ q) ` (p ∨ q). The following substitution
verifies the match of the sequent against the goal of the axiom rule:

σ1 = { Γ1 := [ ], Γ2 := [ ], ∆1 := [ ], ∆2 := [ ], φ := (p ∨ q) }

Thus, the following proof tree proves the sequent.

(Ax)
(p ∨ q) ` (p ∨ q)

Example 2.6. Consider the sequent (p ∨ q) ` (q ∨ p). It is not an axiom, since
(p ∨ q) is distinct from (q ∨ p). The sequent matches both the ∨L-rule and the
∨R-rule. We match the sequent against the ∨R-rule, which results in the following
substitution:

σ1 = { Γ := [(p ∨ q)], ∆1 := [ ], ∆2 := [ ], φ := q, ψ := p }

The sequent that results from applying this substitution to the schematic se-
quent in the premise of the rule ∨R results in the sequent (p ∨ q) ` q, p.
Thus far we have constructed the following partial proof:
(p ∨ q) ` q, p
(∨R)
(p ∨ q) ` (q ∨ p)

Now we match the sequent on the incomplete branch of the proof against the
∨L-rule. This is the only rule that matches since the sequent is not an axiom
and contains only one non-atomic formula, namely the (p ∨ q) on the left side.
The match generates the following substitution.

σ2 = { Γ1 := [ ], Γ2 := [ ], ∆ := [q, p], φ := p, ψ := q }

Applying this substitution to the premises of the ∨L-rule results in the sequents
p ` q, p and q ` q, p. Placing them in their proper positions results in the
following partial proof tree.

p ` q, p q ` q, p
(∨L)
(p ∨ q) ` q, p
(∨R)
(p ∨ q) ` (q ∨ p)

In this case, both incomplete branches are instances of the axiom rule. The
matches for the left and right branches are, respectively:

σ3 = { Γ1 := [ ], Γ2 := [ ], ∆1 := [q], ∆2 := [ ], φ := p }
σ4 = { Γ1 := [ ], Γ2 := [ ], ∆1 := [ ], ∆2 := [p], φ := q }

These matches verify that the incomplete branches are indeed axioms and
the final proof tree appears as follows:

(Ax) (Ax)
p ` q, p q ` q, p
(∨L)
(p ∨ q) ` q, p
(∨R)
(p ∨ q) ` (q ∨ p)

2.3.6 Some Useful Tautologies


Recall that the symbol > is an abbreviation for the true formula ¬⊥.

Theorem 2.1.
i. ¬¬φ ⇔ φ
ii. ¬φ ⇔ (φ ⇒ ⊥)
iii. (φ ⇒ ψ) ⇔ ¬φ ∨ ψ
iv. ¬(φ ∧ ψ) ⇔ (¬φ ∨ ¬ψ)
v. ¬(φ ∨ ψ) ⇔ (¬φ ∧ ¬ψ)
vi. (φ ∨ ψ) ⇔ (ψ ∨ φ)
vii. (φ ∧ ψ) ⇔ (ψ ∧ φ)
viii. ((φ ∨ ψ) ∨ ϕ) ⇔ (φ ∨ (ψ ∨ ϕ))
ix. ((φ ∧ ψ) ∧ ϕ) ⇔ (φ ∧ (ψ ∧ ϕ))
x. (φ ∨ ⊥) ⇔ φ
xi. (φ ∧ >) ⇔ φ
xii. (φ ∨ >) ⇔ >
xiii. (φ ∧ ⊥) ⇔ ⊥
xiv. (φ ∨ ¬φ) ⇔ >
xv. (φ ∧ ¬φ) ⇔ ⊥
xvi. (φ ∧ (ψ ∨ ϕ)) ⇔ (φ ∧ ψ) ∨ (φ ∧ ϕ)
xvii. (φ ∨ (ψ ∧ ϕ)) ⇔ (φ ∨ ψ) ∧ (φ ∨ ϕ)
xviii. (φ ⇒ ψ) ∨ (ψ ⇒ φ)
Exercise 2.3. Give sequent proofs showing that the tautologies in Thm 2.1
hold.

2.3.7 Exercises

2.4 Metamathematical Considerations*


Metamathematics is the application of mathematical methods to study
mathematics itself. In this chapter, we have presented syntax, semantics and
a sequent proof system for propositional logic. We have mentioned that proofs
provide an alternative path to determining validity. But we have not said how
we know that the proof system presented here coincides with the semantics.
The relationship of the proof system to the semantics is given by soundness and
completeness results. Also, from a computational stance, we’d like to know if
there are algorithms (and what their computational complexity might be) for
deciding whether a formula is valid or not; such an algorithm is called a decision
procedure.

2.4.1 Soundness and Completeness


The properties that relate a proof systems to the corresponding semantic notion
are soundness and completeness. Recall that the semantic notion of validity for
a formula φ is denoted |= φ. It is traditional to assert that a formula φ is
provable by writing ` φ, since we have used ` in our notation for sequents, we
will write k−φ to mean there is a proof of φ.

Soundness Soundness is the property that every provable formula
is semantically valid. An unsound proof system would not be of much use: if
we could prove a formula which was not valid, we could prove all formulas,
because ⊥ ` φ for an arbitrary formula φ.

Theorem 2.2 (Soundness) For every propositional formula φ, if ⊩ φ then
|= φ.

We do not have the tools or methods yet to prove this theorem. We have
informally argued for the admissibility of the individual proof rules and these
individual facts can be combined to show soundness. The proof method used
to prove soundness is based on an induction principle that follows the structure
of a formula. These methods will be introduced in Chapter ??.

Completeness Completeness is the property that all valid formulas are provable.
If a proof system is complete, it captures all of the valid formulas. It turns
out that there are mathematical theories for which there is no complete proof
system; propositional logic is not one of them.

Theorem 2.3 (Completeness) For every propositional formula φ, if |= φ then
⊩ φ.

Again, we do not yet have a proof method that will allow us to prove com-
pleteness; by the end of the book we will.

2.4.2 Decidability
A set is decidable if there is an algorithm to decide if an element is in the set.
To talk about the decidability of a logic, we first have to describe it as a set or
collection of formulas.

Definition 2.27 (Theory) Given a language L and a semantic notion of validity
on L (say |=), the theory of ⟨L, |=⟩ is the collection of all valid formulas
in L. We write this as follows:

Th(L, |=) = {φ | |= φ}

Thus, the theory of propositional logic Th(P, |=) is the collection of all
formulas in the language of propositional logic (P) that are semantically valid.

Definition 2.28 (Decidability) Given a set S, a subset T of S is decidable if
there is an algorithm to determine if an arbitrary element of S is in T . More
formally, if there is an algorithm d : S → B such that d(x) = T if and only if
x ∈ T.

Theorem 2.4 (Propositional Logic is Decidable) The theory Th(P, |=)
is decidable.
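
The truth-table method of Sect. 2.2 is one such decision procedure: collect the
distinct variables of the formula, enumerate all 2^n assignments, and evaluate
the formula under each one. A sketch in OCaml, reusing the eval function from
the semantics section (the helper names vars and valid are ours):

(* the list of variables occurring in a formula, possibly with repeats *)
let rec vars (f : prop) : string list =
  match f with
  | Mk_bot -> []
  | Mk_var x -> [x]
  | Mk_not g -> vars g
  | Mk_and (g, h) | Mk_or (g, h) | Mk_implies (g, h) -> vars g @ vars h

(* valid f is true iff every assignment satisfies f *)
let valid (f : prop) : bool =
  let vs = List.sort_uniq compare (vars f) in
  let rec go alpha = function
    | [] -> eval alpha f
    | v :: rest ->
        go (fun x -> if x = v then true else alpha x) rest
        && go (fun x -> if x = v then false else alpha x) rest
  in
  go (fun _ -> false) vs

The running time is exponential in the number of distinct variables, matching
the size of the truth table.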

2.4.3 Exercises
Chapter 3

Boolean Algebra and


Equational Reasoning*

Propositional logic has an algebraic form first investigated by George Boole in
the 1840’s. Arguably, Boole’s algebraic and symbolic approach to logic was the
first truly significant step forward in the development of logic since Aristotle.

George Boole (1815–1864) was an English mathematician and logician. His
father was a shoemaker and George was largely self-taught. He became a
schoolmaster at age 16 and published his first mathematical paper at age 23.
See [?] in [10] for more on Boole’s life.

3.1 Boolean Algebra


In the previous chapter we presented propositional logic syntax and gave a
semantics (meaning) based on truth tables over the set of truth values
{T, F}. An alternative meaning can be assigned to propositional formulas by
translating them into algebraic form over the natural numbers and then looking
at the congruences modulo 2, i.e. by identifying them with 0 or 1
depending on whether they’re even or odd.
Such an interpretation is correct if it makes all the same formulas true.

3.1.1 Modular Arithmetic


Congruence (of which modular arithmetic is one kind) is an interesting topic
of discrete mathematics in its own right. We will only present enough mate-
rial here to make the association between propositional logic and its algebraic
interpretation.

Remark 3.1. Recall that for every integer a and every natural number m > 0,
there exist integers q and r with 0 ≤ r < m such that the following equation
holds:
a = qm + r
We call q the quotient and r the remainder. If r = 0 (there is no remainder)
then we say m divides a, e.g. a ÷ m = q.

Definition 3.1. Two integers a and b are congruent modulo 2 if and only if they
have the same remainder when divided by 2. In this case we write

a ≡ b (mod 2)

Example 3.1.
0 ≡ 0 (mod 2)   a = 0, q = 0, r = 0
1 ≡ 1 (mod 2)   a = 1, q = 0, r = 1
2 ≡ 0 (mod 2)   a = 2, q = 1, r = 0
3 ≡ 1 (mod 2)   a = 3, q = 1, r = 1
4 ≡ 0 (mod 2)   a = 4, q = 2, r = 0
5 ≡ 1 (mod 2)   a = 5, q = 2, r = 1

Theorem 3.1. The following three properties hold1 .

i.) a ≡ a(mod 2)
ii.) If a ≡ b(mod 2) then b ≡ a(mod 2)
iii.) If a ≡ b(mod 2) and b ≡ c(mod 2) then a ≡ c(mod 2)

Theorem 3.2. If a ∈ Z is even, then a ≡ 0(mod 2) and if a is odd, then


a ≡ 1(mod 2).

Theorem 3.3. If a ≡ c(mod n) and b ≡ d(mod n) then

a + b ≡ c + d(mod n)
a · b ≡ c · d(mod n)

Example 3.2. Since 5 ≡ 3(mod 2) and 10 ≡ 98(mod 2)

5 + 10 ≡ 3 + 98(mod 2) and 5 · 10 ≡ 3 · 98(mod 2)


1 We will see later in Chapter 7 that relations having these properties are called equivalence
relations.

To see this, note the following:

5 + 10 = 15 and 15 = 7 · 2 + 1, so 5 + 10 ≡ 1 (mod 2)
3 + 98 = 101 and 101 = 50 · 2 + 1, so 3 + 98 ≡ 1 (mod 2)

so by properties (ii) and (iii) of Theorem 3.1,

5 + 10 ≡ 3 + 98 (mod 2)

Prove to yourself that 5 · 10 ≡ 3 · 98 (mod 2).

Definition 3.2. We will write n(mod 2) to denote the remainder of n ÷ 2. So,


5(mod 2) = 1 and 28(mod 2) = 0.

Theorem 3.4. The following identities hold.

2p ≡ 0 (mod 2)
p^2 ≡ p (mod 2)

3.1.2 Translation from Propositional Logic


In this section we define a function that maps propositional formulas to algebraic
formulas.
We define the translation (denoted M[[φ]]) which maps propositional formu-
las to algebraic formulas by recursion on the structure of the formula φ.
We start the translation with falsity (⊥) and conjunction. Conjunction is
easily seen to correspond to multiplication. Negation is defined next, and then
using DeMorgan’s laws, translations for disjunction and implication are given.

3.1.3 Falsity
We interpret ⊥ as 0, so the translation function maps ⊥ to 0, no matter what
the assignment is.

M[[⊥]] = 0

3.1.4 Variables
Propositional variables are just mapped to variables in the algebraic language

M[[x]] = x

3.1.5 Conjunction
Consider the following tables for multiplication and the table for conjunction.
a b ab a b a∧b
1 1 1 T T T
1 0 0 T F F
0 1 0 F T F
0 0 0 F F F

This table is identical to the truth table for conjunction (∧) if we replace 1
by T , 0 by F and the symbol for multiplication (·) by the symbol for conjunction
(∧). Thus, we get the following translation.

M[[φ ∧ ψ]] = M[[φ]] · M[[ψ]]

3.1.6 Negation
Notice that addition by 1 modulo 2 toggles values.

1 + 1 = 2 and 2 ≡ 0(mod 2) and 0 + 1 = 1

The following tables show addition by 1 modulo 2 and the truth table for nega-
tion to illustrate that the translating negations to addition by 1 give the correct
results.
a a + 1(mod 2) a ¬a
1 0 T F
0 1 F T
The translation is defined as follows:

M[[¬φ]] = (M[[φ]] + 1)

3.1.7 Exclusive-Or
We might hope that disjunction would be properly modeled by addition ... “If
wishes were horses, beggars would ride.” Consider the table for addition modulo
2 and compare it with the table for disjunction – clearly they do not match.

a b a + b(mod 2) a b a∨b
1 1 0 T T T
1 0 1 T F T
0 1 1 F T T
0 0 0 F F F

The problem is that 1 + 1 ≡ 0(mod 2) while we want that entry to be 1, i.e. if


p and q are both T , p ∨ q should be T as well.
But the addition table does correspond to a useful propositional connective
(one we haven’t introduced so far) – exclusive or – often written as (p ⊕ q) and
which is true if one of p or q is true but not both. Its truth table is given as
follows:
a b a⊕b
T T F
T F T
F T T
F F F

3.1.8 Disjunction
We can derive disjunction using the following identity of propositional logic and
the translation rules we have defined so far.

(p ∨ q) ⇔ ¬(¬p ∧ ¬q)

Exercise 3.1. Verify this identity by using a truth table.

By the translation we have so far

M[[¬(¬p ∧ ¬q)]]
= M[[(¬p ∧ ¬q)]] + 1
= (M[[¬p]] · M[[¬q]]) + 1
= ((M[[p]] + 1) · (M[[q]] + 1)) + 1
= ((p + 1) · (q + 1)) + 1
= pq + p + q + 1 + 1
= pq + p + q + 2
Since 2 ≡ 0(mod 2), we can cancel the 2 and end up with the term pq + p + q.
Here are the tables (you might check for yourself that the entries are correct.)

a b ab + a + b(mod 2) a b a∨b
1 1 1 T T T
1 0 1 T F T
0 1 1 F T T
0 0 0 F F F

So, we define the translation of disjunctions as follows.

M[[φ ∨ ψ]] = pq + p + q where p = M[[φ]] and q = M[[ψ]]

3.1.9 Implication
The following propositional formula holds.

(p ⇒ q) ⇔ (¬p ∨ q)
Thus, implication can be reformulated in terms of negation and disjunction.
Using the translation constructed so far, we get the following

M[[¬p ∨ q]]
= M[[¬p]] · M[[q]] + M[[¬p]] + M[[q]]
= (M[[p]] + 1) · q + (M[[p]] + 1) + q
= (p + 1)q + (p + 1) + q
= pq + q + (p + 1) + q
= pq + 2q + p + 1
Since 2q ≡ 0 (mod 2), we can cancel the 2q term and the final formula for the
translation of implication is pq + p + 1. And we get the following tables.

a b ab + a + 1(mod 2) a b a⇒b
1 1 1 T T T
1 0 0 T F F
0 1 1 F T T
0 0 1 F F T

Thus,

M[[φ ⇒ ψ]] = pq + p + 1 where p = M[[φ]] and q = M[[ψ]]

3.1.10 The Final Translation


The following function recursively translates a propositional formula into an
algebraic formula.

M[[⊥]] = 0
M[[x]] = x
M[[¬φ]] = M[[φ]] + 1
M[[φ ∧ ψ]] = (M[[φ]] · M[[ψ]])
M[[φ ∨ ψ]] = (M[[φ]] · M[[ψ]]) + M[[φ]] + M[[ψ]]
M[[φ ⇒ ψ]] = (M[[φ]] · M[[ψ]]) + M[[φ]] + 1
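
Assuming the prop datatype sketched in the previous chapter, the translation
can be checked numerically by evaluating formulas directly in arithmetic
modulo 2; the function name meval is ours:

(* meval alpha f: the value of M[[f]] mod 2, where alpha assigns 0 or 1
   to each variable *)
let rec meval (alpha : string -> int) (f : prop) : int =
  match f with
  | Mk_bot -> 0
  | Mk_var x -> alpha x
  | Mk_not g -> (meval alpha g + 1) mod 2
  | Mk_and (g, h) -> (meval alpha g * meval alpha h) mod 2
  | Mk_or (g, h) ->
      let a = meval alpha g and b = meval alpha h in (a * b + a + b) mod 2
  | Mk_implies (g, h) ->
      let a = meval alpha g and b = meval alpha h in (a * b + a + 1) mod 2

One can check that meval agrees with the truth-table valuation: a formula
evaluates to 1 exactly when it evaluates to T.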

Example 3.3. Consider the formula (p ∨ q) ⇒ p.

M[[(p ∨ q) ⇒ p]]
= (M[[p ∨ q]] · M[[p]]) + M[[p ∨ q]] + 1
= (((M[[p]] · M[[q]]) + M[[p]] + M[[q]]) · p) + ((M[[p]] · M[[q]]) + M[[p]] + M[[q]]) + 1
= (((p · q) + p + q) · p) + ((p · q) + p + q) + 1
= (((pq) + p + q)p) + ((pq) + p + q) + 1
= (p^2 q + p^2 + pq) + pq + p + q + 1
= pq + p + pq + pq + p + q + 1
= 2(pq) + 2p + pq + q + 1
= pq + q + 1

We can check this for all combinations of values for p and q. Instead, we
notice that the final formula is the same as the translation for implication of
q ⇒ p. To check our work we could check that:

((p ∨ q) ⇒ p) ⇔ (q ⇒ p)

3.1.11 Notes
In modern times, the Boolean algebras have been investigated abstractly [25].

3.2 Equational Reasoning


The connective ⇔ turns out to have the following properties.

Theorem 3.5. If φ, ψ and R are propositional meta-variables, the
following theorems hold.

1.) φ ⇔ φ
2.) (φ ⇔ ψ) ⇔ (ψ ⇔ φ)
3.) ((φ ⇔ ψ) ∧ (ψ ⇔ R)) ⇒ (φ ⇔ R)

Exercise 3.2. Prove Thm. 3.5.

We shall see in Chap 7 that operations like ⇔ that have properties (1), (2)
and (3) behave like an equality. If you interpret “⇔” as “=” and φ, ψ and R
as numbers you will see this. Property (1) shows ⇔ is reflexive, property (2)
shows it is symmetric and property (3) shows it is transitive.

3.2.1 Complete Sets of Connectives


Definition 3.3 (Complete set) A set of connectives, C,

C ⊆ {⊥, ¬, ∨, ∧, ⇒, ⇔}

is complete if those connectives not in the set C can be defined in terms of the
connectives that are in the set C.

Example 3.4. The following definitions show that the set {¬, ∨} is complete.

1.) ⊥ =def ¬(φ ∨ ¬φ)
2.) (φ ∧ ψ) =def ¬(¬φ ∨ ¬ψ)
3.) (φ ⇒ ψ) =def ¬φ ∨ ψ
4.) (φ ⇔ ψ) =def ¬(¬(¬φ ∨ ψ) ∨ ¬(¬ψ ∨ φ))

To verify that these definitions are indeed correct, you could verify that
the columns of the truth table for the defined connective match (row-for-row)
the truth table for the definition. Alternatively, you could replace the symbol
“=def” by “⇔” and use the sequent proof rules to verify the resulting
formulas, e.g. to prove the definition for ⊥ given above is correct, prove the
sequent ` ⊥ ⇔ ¬(φ ∨ ¬φ). Another method of verification would be to do
equational style proofs starting with the left-hand side of the definition and
rewriting to the right-hand side.

Here are example verifications using the equational style of proof. We label
each step in the proof by the equivalence from Theorem 2.1 used to justify it
or, if the step follows from a definition, we say which one.

1.) ⊥ ⟨i⟩⇐⇒ ¬¬⊥ ⟨> def⟩⇐⇒ ¬> ⟨xiv⟩⇐⇒ ¬(φ ∨ ¬φ)
2.) (φ ∧ ψ) ⟨i⟩⇐⇒ ¬¬(φ ∧ ψ) ⟨iv⟩⇐⇒ ¬(¬φ ∨ ¬ψ)
3.) (φ ⇒ ψ) ⟨iii⟩⇐⇒ ¬φ ∨ ψ
4.) (φ ⇔ ψ) ⟨⇔ def.⟩⇐⇒ (φ ⇒ ψ) ∧ (ψ ⇒ φ)
    ⟨iii⟩⇐⇒ (¬φ ∨ ψ) ∧ (ψ ⇒ φ)
    ⟨iii⟩⇐⇒ (¬φ ∨ ψ) ∧ (¬ψ ∨ φ)
    ⟨2⟩⇐⇒ ¬(¬(¬φ ∨ ψ) ∨ ¬(¬ψ ∨ φ))

Exercise 3.3. Prove that the set {¬, ∧} is complete for {⊥, ¬, ∨, ∧, ⇒, ⇔}.
You’ll need to give definitions for ⊥, ∨, ⇒ and ⇔ in terms of ¬ and ∧ and
then prove that your definitions are correct.

Exercise 3.4. Prove that the set {⊥, ⇒} is complete for {⊥, ¬, ∨, ∧, ⇒, ⇔}.
Chapter 4

Predicate Logic

Since by the aid of speech and such communication as you re-


ceive here you must advance to perfection, and purge your will and
correct the faculty which makes use of the appearances of things; and
since it is necessary also for the teaching (delivery) of theorems to be
effected by a certain mode of expression and with a certain variety
and sharpness, some persons captivated by these very things abide in
them, one captivated by the expression, another by syllogisms, an-
other by sophisms, and still another by some other inn (πανδoκoν)
of this kind; and there they stay and waste away as if they were
among the Sirens. Epictetus Discourses [13] II xxiii

In this chapter we extend the propositional logic presented in the previous
chapters to allow for quantification of the form:

for all things x, · · ·


for every x, · · ·
there exists a thing x such that · · ·
for some thing x, · · ·

where “· · · ” is some statement referring to the thing denoted by the variable x
and specifying a property of that thing. The first two forms are called
universal quantification; they are different ways of asserting that everything
satisfies some specified property. The last two forms are called existential
quantification; they assert that something exists having the specified property.
Symbolically, we write “for all things x, · · · ” as (∀x. · · · ) and “there exists
a thing x such that · · · ” as (∃x. · · · ).

4.1 Predicates
To make this extension to our logic we add truth-valued functions called predi-
cates which map elements from a domain of discourse to the values in B.


Definition 4.1 (arity) A function is called n-ary if it takes n arguments, 0 ≤
n. If a function is n-ary, we say it has arity n. A function of arity 0, i.e. a
function that takes no arguments, is called a constant. We say a 0-ary function
is nullary and a 1-ary function is unary. We say a 2-ary function is binary and,
although we could call 3-ary, 4-ary and 5-ary functions ternary, quaternary and
quinary respectively, we do not insist on carrying this increasingly tortured
nomenclature any further.

For example, consider the following functions:

i.) f () = 5
ii.) g(x) = x + 5
iii.) h(x, y) = (x + y) − 1
iv.) f1 (x, y, z) = x ∗ (y + z)
v.) g1 (x, y, z, w) = f1 (x, y, w) − z

The first function is nullary; it takes no arguments. Typically, we will drop the
parentheses and write f instead of f (). The second function takes one argument
and so is a unary function. The third function is binary. The fourth and fifth
are 3-ary and 4-ary functions respectively.

Definition 4.2 (Boolean valued function) A function is Boolean-valued if
its range is the set B.

Definition 4.3 (predicate) A predicate is an n-ary Boolean-valued function
over some domain of input.

Example 4.1. In ordinary arithmetic, the binary predicates include less than
(written <) and equals (written =). Typically these are written in infix notation,
i.e. instead of writing = (x, y) and < (x, y) we write x = y and x < y; do not
let this infix notation confuse you, they are still binary predicates. We can
define other predicates in terms of these two. For example we can define a binary
predicate less-than-or-equals as:

i ≤ j =def ((i = j) ∨ (i < j))

We could define a unary predicate which is true when its argument is equal to
0 and is false otherwise:

=0 (i) =def i = 0

We could define a 3-ary predicate which is true if k is strictly between i and j:

between(i, j, k) =def ((i < k) ∧ (k < j))

Note that predicate constants act just like propositional variables.
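
As a small OCaml sketch, the example predicates above are just Boolean-valued
functions (the names leq, eq0 and between are ours):

let leq i j = (i = j) || (i < j)          (* i ≤ j *)
let eq0 i = (i = 0)                       (* =0 (i) *)
let between i j k = (i < k) && (k < j)    (* k strictly between i and j *)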



Gottlob Frege (1848–1925) was a German mathematician, logician and
philosopher. He made the largest advance in logic since Aristotle by his
discovery of the notion of, and formalization of, quantified variables. Frege was
also a founder of analytic philosophy and the philosophy of language. A
fundamental contribution there is that words obtain meanings in the context of
their usages.

4.2 The Syntax of Predicate Logic


Predicate logic formulas are constructed from two sorts of components: terms,
and formulas which may contain terms.
i.) parts that refer to objects and functions on those objects in the domain of
discourse; these components of the formula are called terms.
ii.) parts of a formula that denote truth values; these include predicates over the
domain of discourse and formulas constructed inductively by connecting
previously constructed formulas.

4.2.1 Variables
The definitions of the syntactic classes of terms and formulas (both defined
below) depend on an unbounded collection of variable symbols; we call this set
V.
V = {x, y, z, w, x1 , y1 , z1 , w1 , x2 , · · · }
Unlike propositional variables, which denoted truth-values, these variables will
range over individual elements in the domain of discourse. Like propositional
variables, we assume the set V is fixed (and so we do not include it among the
parameters of the definitions that use it).

4.2.2 Terms
The syntax of terms (the collection of which we will write as T ) is determined by a set of n-ary function symbols; call this set F. We assume the arity of a function symbol can be determined.
58 CHAPTER 4. PREDICATE LOGIC

Definition 4.4 (Terms) Terms defined over a set of function symbols F are given by the following grammar:

T[F] ::= x | f (t1 , · · · , tn )

where:
F is a set of function symbols,
x ∈ V is a variable,
f ∈ F is a function symbol for a function of arity n, where n ≥ 0, and
ti ∈ T[F] denote previously constructed terms, 1 ≤ i ≤ n.

The syntax tree for a function application f (t1 , · · · , tn ) has the function symbol f at its root with n children, namely the syntax trees for the argument terms t1 through tn .


Note that the definition of terms is parametrized by the set of function
symbols. The set of terms in T[F ] is determined by the set of function symbols
in F and by the arities of those symbols. Also, note that if n = 0, the term f ()
is a constant and we will write it simply as f .

Example 4.2. Let F = {a, b, f, g} where a and b are constants, f is a unary function symbol and g is a binary function symbol. In this case, T[F] includes:

{a, x, f (a), f (x),
 g(a, a), g(a, x), g(a, f (a)), g(a, f (x)),
 g(x, a), g(x, x), g(x, f (a)), g(x, f (x)),
 b, y, f (b), f (y), f (f (a)), f (f (x)), f (g(a, a)), · · · }
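The grammar of terms is naturally captured by an algebraic datatype. The sketch below (our own rendering, not code from the text) names function symbols by strings and does not enforce arities; a fuller implementation would check them against F:

    -- Terms: a variable, or a function symbol applied to a list of terms.
    -- A 0-ary application such as App "a" [] plays the role of a constant.
    data Term = Var String
              | App String [Term]
      deriving (Show, Eq)

    -- A few of the terms of Example 4.2:
    someTerms :: [Term]
    someTerms =
      [ App "a" []                              -- a
      , Var "x"                                 -- x
      , App "f" [App "a" []]                    -- f(a)
      , App "g" [Var "x", App "f" [Var "x"]]    -- g(x, f(x))
      ]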

4.2.3 Formulas
Definition 4.5 (Predicate Logic Formula) Formulas of predicate logic are
defined over a set of function symbols F and a set of predicate symbols P and
are given by the following grammar.

PL[F ,P] ::= ⊥ | P (t1 , · · · , tn ) | ¬φ | φ ∧ ψ | φ ∨ ψ | φ ⇒ ψ | ∀x.φ | ∃x.φ

where:

F is a set of function symbols,
P is a set of predicate symbols,
⊥ is a constant symbol,
P ∈ P is a predicate symbol for a predicate of arity n, where n ≥ 0,
ti ∈ T[F] are terms, 1 ≤ i ≤ n,
φ, ψ ∈ PL[F,P] are previously constructed formulas, and
x ∈ V is a variable.
This definition is parametrized by the set of function symbols (F) and the
set of predicate symbols (P). As remarked above, a predicate symbol denoting
a constant is the equivalent of a propositional variable presented in the previous
chapter. Thus, predicate symbols are a generalization of propositional variables;
when actual values are substituted for their variables, they denote truth values.
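The formula grammar admits the same treatment as the term grammar; continuing the Term type above (again an illustrative sketch):

    -- Formulas of predicate logic, one constructor per production.
    data Formula = Bottom                   -- the constant symbol
                 | Pred String [Term]       -- P(t1, ..., tn)
                 | Not Formula              -- negation
                 | And Formula Formula      -- conjunction
                 | Or Formula Formula       -- disjunction
                 | Imp Formula Formula      -- implication
                 | Forall String Formula    -- universal quantification
                 | Exists String Formula    -- existential quantification
      deriving (Show, Eq)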

Predicate Logic extends Propositional Logic


Given a rich enough set of predicate symbols P, i.e. one that includes one
constant symbol for each propositional variable, the language of predicate logic
extends the language of propositional logic. Specifically, every formula of propo-
sitional logic is a formula of predicate logic. To see this note that: the constant
symbol bottom (⊥) is included in both languages; the propositional variables
are all included in P as predicate symbols of arity 0. Also, every connective
of propositional logic is also a connective of predicate logic. Thus, we conclude
that every formula of propositional logic can be identified with a syntactically
identical formula of predicate logic.
We will see in later sections that not only is the syntax preserved; the semantics and the proof system are preserved as well.

Some Examples
In the following examples we show uses of the quantifiers to formally encode
some theorems of arithmetic.
Example 4.3. The law of trichotomy in the language of arithmetic says:
For all integers i and j, either: i is less than j or i is equal to j or
j is less than i.
We can formalize this statement, making explicit that less-than and equals
are binary predicates by writing them as lt(i, j) and eq(i, j) respectively:
∀i.∀j.(lt(i, j) ∨ (eq(i, j) ∨ lt(j, i)))
We can rewrite the same statement as follows using the ordinary notation
of arithmetic (which perhaps makes the fact that less-than and equals are pred-
icates less obvious.)
∀i.∀j.(i < j ∨ (i = j ∨ j < i))

Example 4.4. As another example, consider the following statement about the natural numbers.

For every natural number i either: i = 0 and there is no number less than i, or there exists a natural number j such that j < i.

We can formalize this statement as

∀i.((i = 0 ∧ ∀j.¬(j < i)) ∨ (∃j.j < i))

Note that if the domain of discourse (the set from which the variables i and j
take their values) is the natural numbers, the statement is a theorem but it is
false if the domain of discourse is the integers or reals.

Example 4.5. A version of the antisymmetry property for ≤ on numbers can be stated as follows.

For all integers n and m, if n ≤ m ≤ n then n = m.

This is formalized as follows:

∀n.∀m.(n ≤ m ∧ m ≤ n) ⇒ n = m

Note that the commonly used notation (i ≤ j ≤ k) means ((i ≤ j) ∧ (j ≤ k)).

Example 4.6. Consider the following statement:


For every natural number n which is greater than 1, either n is a
prime number or there are two integers, both greater than 1 and less
than n, whose product is n.
Let P be a unary predicate that is true if and only if its single argument is
a prime number. Let mul be a binary function symbol denoting multiplication.
Then we formalize this statement as follows:

∀n.n > 1 ⇒ (P (n) ∨ ∃i.∃j.between(1, n, i) ∧ between(1, n, j) ∧ mul(i, j) = n)

We can rewrite this using standard mathematical notation as follows:

∀n.n > 1 ⇒ (P (n) ∨ ∃i.∃j.(1 < i ∧ i < n) ∧ (1 < j ∧ j < n) ∧ i · j = n)

Remark 4.1. The fact that these statements are true when the symbols are
interpreted in the ordinary way we think of numbers is a fact that is external
to logic. The predicates less-than and equals are particular predicates that
have particular values when interpreted in ordinary arithmetic. If we swapped
the interpretations of the symbols (i.e. if we interpreted i < j to be true
whenever i and j are equal numbers and interpreted i < j to be false otherwise;
and similarly interpreted i = j to be true whenever i was less than j and
false otherwise) we would still have well formed formulas in the language of

arithmetic, but would interpret the meanings of the predicates differently. So


the interpreted meaning of the predicate symbols and function symbols may
have a bearing on the truth or falsity of a formula. We discuss this later in the
section on semantics. Note that there are also formulas of predicate logic which do not depend on the meanings of the predicate and function symbols; we will give examples of such formulas in a later section.

4.3 Substitution
Substitution is the process of replacing a variable by some more complex piece of syntax such as a term or a formula. Readers are already familiar with this process, though there is some added complexity that results from notations that bind variables, e.g. summation (Σ_{i=j}^{k} f(i)), product (Π_{i=j}^{k} f(i)), integral (∫_a^b f(x) dx), and the quantifiers of predicate logic (∀x.φ(x) and ∃x.φ(x)).
As an example of a simple substitution (without considerations related to bindings) consider the following example.

Example 4.7. If we consider the polynomial 2x² − 3x − 1 and say x = y² + 2y + 1 then, by substitution, we know

2x² − 3x − 1 = 2(y² + 2y + 1)² − 3(y² + 2y + 1) − 1

Here, we simply replaced all the occurrences of x in the polynomial 2x² − 3x − 1 by the polynomial y² + 2y + 1. Of course, the rules of algebra would allow us to simplify the resulting expression further; but the process of substitution is one of replacing all the x’s by y² + 2y + 1.

Summation, in contrast, binds its index variable; substitution in the presence of such binding operators is the subject of the sections that follow.

4.3.1 Bindings and Variable Occurrences


Variable Occurrences
Definition 4.6 (occurs (in a term)) A variable x occurs in term t if and
only if x ∈ occurs(t) where occurs is defined by recursion on the structure of
the term as follows:
occurs(z) = {z}
occurs(f (t1 , · · · , tn )) = ⋃_{i=1}^{n} occurs(ti )

Thus, the variable z occurs in a term which is simply a variable of the form z. Otherwise, a variable occurs in a term of the form f (t1 , · · · , tn ) if and only if it occurs in one of the terms ti , 1 ≤ i ≤ n. To collect them, we simply union all the sets of variables occurring in each ti .¹

¹ If n = 0 (i.e. if the arity of the function symbol is 0) then ⋃_{i=1}^{0} occurs(ti ) = {}.

Definition 4.7 (occurs (in a formula)) A variable x occurs in formula φ if


and only if x ∈ occurs(φ) where occurs is defined by recursion on the structure
of φ as follows.
occurs(P (t1 , · · · , tn )) = ⋃_{i=1}^{n} occurs(ti )
occurs(¬φ) = occurs(φ)
occurs(φ ∧ ψ) = occurs(φ) ∪ occurs(ψ)
occurs(φ ∨ ψ) = occurs(φ) ∪ occurs(ψ)
occurs(φ ⇒ ψ) = occurs(φ) ∪ occurs(ψ)
occurs(∀z.φ) = occurs(φ) ∪ {z}
occurs(∃z.φ) = occurs(φ) ∪ {z}
Thus, a variable x occurs in a formula of the form P (t1 , · · · , tn ) if and only
if x occurs in one of the terms ti , 1 ≤ i ≤ n. The variable x occurs in ¬φ iff it
occurs in φ. Similarly, it occurs in φ ∧ ψ, φ ∨ ψ, and φ ⇒ ψ iff it occurs in φ or
it occurs in ψ. The variable x occurs in ∀x.φ and ∃x.φ regardless of whether it
occurs in φ.
Definition 4.8 (binding operator) In formulas of the form ∀x.φ and ∃x.φ:
the quantifier symbols “∀” and “∃” are binding operators, the occurrence of the
variable x just after the quantifier is called the binding occurrence. The partial
syntax tree of the formula φ, where sub-trees corresponding to sub-formulas of
the form ∀x.ψ and ∃x.ψ have been removed, is called the scope of the binding. In the linear notation of formulas, we mark the missing sub-formulas by replacing them with the symbol “”.
The idea of variable scope is a familiar one to programmers. In programming
languages, the scope of a variable declaration specifies what part of the program
text refers to which variable declaration. Different languages have different
scoping rules, but the modern standard of lexical scoping (or local scoping) essentially follows the rules given for logic. These are very close to the rules used in C++ for example [11].
Example 4.8. The scope of the leftmost binding occurrence of the variable x in the formula ∀x.(P (x) ∧ ∃x.Q(x, y)) is (P (x) ∧ ), where  blocks out the part of the formula not in the scope of the leftmost binding occurrence of x. The scope of the rightmost binding occurrence of x in the same formula is Q(x, y).

4.3.2 Free Variables


Definition 4.9 (free occurrence (in a term)) A variable x occurs free in
term t if and only if x ∈ F V (t) where FV is defined as follows:
FV(z) = {z}
FV(f (t1 , · · · , tn )) = ⋃_{i=1}^{n} FV(ti )

Thus, a variable x occurs free in a term which is simply a variable of the form z if and only if x = z. Otherwise, x occurs free in a term of the form f (t1 , · · · , tn ) if and only if x occurs free in one of the terms ti , 1 ≤ i ≤ n.

Definition 4.10 (free occurrence (in a formula)) A variable x occurs free


in formula φ if and only if x ∈ FV(φ) where:
FV(P (t1 , · · · , tn )) = ⋃_{i=1}^{n} FV(ti )
FV(¬φ) = FV(φ)
FV(φ ∧ ψ) = FV(φ) ∪ FV(ψ)
FV(φ ∨ ψ) = FV(φ) ∪ FV(ψ)
FV(φ ⇒ ψ) = FV(φ) ∪ FV(ψ)
FV(∀z.φ) = FV(φ) − {z}
FV(∃z.φ) = FV(φ) − {z}

Thus, a variable x occurs free in a formula iff it occurs in the formula and it is
not in the scope of any binding of the variable x.

Bound Variables
Bound variables can only occur in formulas; this is because there are no binding
operators in the language of terms.

Definition 4.11 (bound occurrence ) A variable x occurs bound in formula


φ if and only if x ∈ BV(φ) where BV is defined as follows:

BV(P (t1 , · · · , tn )) = {}
BV(¬φ) = BV(φ)
BV(φ ∧ ψ) = BV(φ) ∪ BV(ψ)
BV(φ ∨ ψ) = BV(φ) ∪ BV(ψ)
BV(φ ⇒ ψ) = BV(φ) ∪ BV(ψ)
BV(∀z.φ) = BV(φ) ∪ {z}
BV(∃z.φ) = BV(φ) ∪ {z}

Thus, a variable x occurs bound in a formula ψ iff it contains a sub-formula of


the form ∀x.φ or ∃x.φ.
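The functions FV and BV transcribe directly into code. A sketch, continuing the Term and Formula datatypes above and using Data.Set for the sets of variable names (the ⊥ cases, which the definitions leave implicit, contribute no variables):

    import qualified Data.Set as Set
    import Data.Set (Set)

    fvTerm :: Term -> Set String
    fvTerm (Var z)    = Set.singleton z
    fvTerm (App _ ts) = Set.unions (map fvTerm ts)

    fv :: Formula -> Set String                -- Definition 4.10
    fv Bottom       = Set.empty
    fv (Pred _ ts)  = Set.unions (map fvTerm ts)
    fv (Not p)      = fv p
    fv (And p q)    = Set.union (fv p) (fv q)
    fv (Or p q)     = Set.union (fv p) (fv q)
    fv (Imp p q)    = Set.union (fv p) (fv q)
    fv (Forall z p) = Set.delete z (fv p)      -- FV minus the bound variable
    fv (Exists z p) = Set.delete z (fv p)

    bv :: Formula -> Set String                -- Definition 4.11
    bv Bottom       = Set.empty
    bv (Pred _ _)   = Set.empty
    bv (Not p)      = bv p
    bv (And p q)    = Set.union (bv p) (bv q)
    bv (Or p q)     = Set.union (bv p) (bv q)
    bv (Imp p q)    = Set.union (bv p) (bv q)
    bv (Forall z p) = Set.insert z (bv p)      -- BV plus the bound variable
    bv (Exists z p) = Set.insert z (bv p)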

Discussion
The algorithms for computing the free variables and bound variables of a formula
are given by recursion on the structure of the formula. By drawing a syntax tree,
it is easy to see which variables are free and which are bound. Choose a variable in the tree. It is bound if it is the left child of a quantifier or if, traversing the tree to its root, a quantifier is encountered having the same variable as a left child. A variable is free if it is not the left child of a quantifier and the path from the variable to the root of the syntax tree does not include a quantifier whose left child matches the variable.

Example 4.9. Consider the formula

(∀x.∃y.Q(x, y) ∨ R(z)) ⇒ ∀z.R(x) ∧ Q(a, z)


The syntax tree appears as follows:

[Syntax tree: the root is ⇒; its left subtree is the tree for ∀x.∃y.(Q(x, y) ∨ R(z)) and its right subtree is the tree for ∀z.(R(x) ∧ Q(a, z)), each quantifier node having its bound variable as its left child.]

We can refer to variables by their left to right position in the formula. The
leftmost x in the formula is bound because, in the syntax tree, it is a left child
of the quantifier ∀. Similarly, the same holds for the leftmost y. The second
occurrence of x in the formula is bound because the path to the root of the syntax tree passes a ∀ quantifier whose left child is also x. The second y in
the formula is bound because the path to the root passes an ∃ quantifier whose
left child is a y. The first occurrence of the variable z in the formula is free
because no quantifier on the path to the root has a z as a left child. The second
z occurring in the formula is bound because it is the left child of an ∃ quantifier.
The third x is free. The constant symbol a is not a variable and so is neither
free nor bound. The last z in the formula is bound by the ∃ quantifier above it
in the syntax tree.
Remark 4.2. Note that there are formulas where a variable may occur both free and bound, e.g. x in the example above. As another example where this happens, consider the formula P (x) ∧ ∀x.P (x). The first occurrence of x is free and the second and third occurrences of x are bound.

4.3.3 Capture Avoiding Substitution*


Substitution is perhaps the most basic operation in mathematics and it is often performed without mention. But actually specifying capture avoiding substitution correctly has a sad history of error. Hilbert got it wrong in the first edition of his logic book with Ackermann [28], Quine got it wrong in the first edition of his logic book [44], and almost every automated theorem prover in existence has experienced bugs in its implementation of substitution at some time. Capture avoiding substitution is hard to get right.

More evidence for the pivotal role substitution plays: the only computation mechanism in Church’s² lambda calculus [6] is substitution, and anything we currently count as algorithmically computable can be computed by a term of the lambda calculus.
For terms, there are no binding operators so capture avoiding substitution
is just ordinary substitution – i.e. we just search for the variable to be replaced
by a term and when one that matches is found, it is replaced.

Definition 4.12 (substitution (for terms)) Substitution is defined as follows:

x[x := t] = t
z[x := t] = z   if x ≠ z
f (t1 , · · · , tn )[x := t] = f (t1 [x := t], · · · , tn [x := t])

The first clause of the definition says that if you are trying to substitute the
term t for free occurrences of the variable x in the term that consists of the
single variable x, then go ahead and do it – i.e. replace x by t and that is the
result of the substitution.
The second clause of the definition says that if you’re looking to substitute t
for x, but you’re looking at a variable z where z is different from x, do nothing
– the result of the substitution is just the variable z.
The third clause of the definition follows a standard pattern of recursion.
The result of substituting t for free occurrences of x in the term f (t1 , · · · , tn ),
is the term obtained by substituting t for x in each of the n arguments ti , 1 ≤
i ≤ n, and then returning the term assembled from these parts by placing the
substituted argument terms in the appropriate places.
Note that substitution of term t for free occurrences of the variable x can
never affect a function symbol (f ) since function symbols are not variables.
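Since terms contain no binders, Definition 4.12 is a plain structural recursion; a sketch, continuing the Term type above:

    -- Substitute the term t for the variable x throughout a term.
    substTerm :: String -> Term -> Term -> Term
    substTerm x t (Var z)
      | x == z    = t        -- first clause: replace x by t
      | otherwise = Var z    -- second clause: a different variable is untouched
    substTerm x t (App f ts) = App f (map (substTerm x t) ts)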

² Alonzo Church was an American mathematician and logician who taught at Princeton University. Among other things, he is known for his development of the λ-calculus, a notation for functions that serves as a theoretical basis for modern programming languages.

Definition 4.13 (Capture Avoiding Substitution) Capture avoiding substitution for formulas is defined as follows:

⊥[x := t] = ⊥
P (t1 , · · · , tn )[x := t] = P (t1 [x := t], · · · , tn [x := t])
(¬φ)[x := t] = ¬(φ[x := t])
(φ ∧ ψ)[x := t] = (φ[x := t] ∧ ψ[x := t])
(φ ∨ ψ)[x := t] = (φ[x := t] ∨ ψ[x := t])
(φ ⇒ ψ)[x := t] = (φ[x := t] ⇒ ψ[x := t])
(∀x.φ)[x := t] = (∀x.φ)
(∀y.φ)[x := t] = (∀y.φ[x := t])   if x ≠ y and y ∉ FV(t)
(∀y.φ)[x := t] = (∀z.φ[y := z][x := t])   if x ≠ y, y ∈ FV(t), and z ∉ (FV(t) ∪ FV(φ) ∪ {x})
(∃x.φ)[x := t] = (∃x.φ)
(∃y.φ)[x := t] = (∃y.φ[x := t])   if x ≠ y and y ∉ FV(t)
(∃y.φ)[x := t] = (∃z.φ[y := z][x := t])   if x ≠ y, y ∈ FV(t), and z ∉ (FV(t) ∪ FV(φ) ∪ {x})
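Definition 4.13 likewise transcribes clause by clause. The sketch below continues the earlier definitions (fv, fvTerm, substTerm); the helper fresh, whose naming scheme is our own choice, produces the variable z demanded by the renaming clauses:

    -- Pick a variant of y avoiding the given set of names.
    fresh :: Set String -> String -> String
    fresh used y = head [ v | v <- y : [ y ++ show n | n <- [1 :: Int ..] ]
                            , not (Set.member v used) ]

    subst :: String -> Term -> Formula -> Formula
    subst x t = go
      where
        go Bottom       = Bottom
        go (Pred p ts)  = Pred p (map (substTerm x t) ts)
        go (Not p)      = Not (go p)
        go (And p q)    = And (go p) (go q)
        go (Or p q)     = Or (go p) (go q)
        go (Imp p q)    = Imp (go p) (go q)
        go (Forall y p) = quant Forall y p
        go (Exists y p) = quant Exists y p

        quant build y p
          | y == x                        = build y p       -- x is shadowed; stop
          | not (y `Set.member` fvTerm t) = build y (go p)  -- no capture possible
          | otherwise =                                     -- rename y to a fresh z first
              let z = fresh (Set.unions [fvTerm t, fv p, Set.singleton x]) y
              in build z (go (subst y (Var z) p))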

4.4 Proofs
4.4.1 Proof Rules for Quantifiers

4.4.2 Universal Quantifier Rules


On the right

If we have a formula with the principal connective ∀ (say ∀x.φ) on the right of a sequent then it is enough to prove the sequent where ∀x.φ has been replaced by the formula φ[x := y], where y is a new variable not occurring free in any formula of the sequent. Choosing a new variable, not occurring free anywhere in the sequent, to replace the bound variable x has the effect of selecting an arbitrary element from the domain of discourse; by choosing a completely new variable, we know nothing about it except that it stands for some element of the domain of discourse.
Γ ⊢ ∆1 , φ[x := y], ∆2
---------------------------------- (∀R)
Γ ⊢ ∆1 , ∀x.φ, ∆2
where variable y is not free in any formula of (Γ ∪ ∆1 ∪ {∀x.φ} ∪ ∆2 ).

Since y is not free in any formula of the sequent, y represents an arbitrary


element of the domain of discourse.

On the left
The rule for a ∀ on the left says: to prove a sequent with a ∀ occurring as the principal connective of a formula on the left side (say ∀x.φ), it is enough to prove the sequent obtained by replacing ∀x.φ by the formula φ[x := t] where t is any term.³
Γ1 , φ[x := t], Γ2 ⊢ ∆
---------------------------------- (∀L)   where t ∈ T.
Γ1 , ∀x.φ, Γ2 ⊢ ∆
We justify this by noting that if we assume ∀x.φ (this is what it means to
be on the left) then it must be the case that φ[x := t] is true for any term t
whatsoever.

4.4.3 Existential Quantifier Rules


On the right
To prove a formula of the form ∃x.φ it is enough to find a term t such that
φ[x := t] can be proved.

Γ ⊢ ∆1 , φ[x := t], ∆2
---------------------------------- (∃R)   where t ∈ T.
Γ ⊢ ∆1 , ∃x.φ, ∆2
Note that the choice of t may require some creative thought.

Definition 4.14 (existential witness) The term t substituted for the bound variable in an ∃R-rule is called the existential witness.

On the left
The rule for a ∃ on the left says: to prove a sequent with a ∃ occurring as the principal connective of a formula on the left side, it is enough to prove the sequent obtained by replacing the bound variable of the existential by a new variable y, where y is not free in any formula of the sequent.
Γ1 , φ[x := y], Γ2 ⊢ ∆
---------------------------------- (∃L)
Γ1 , ∃x.φ, Γ2 ⊢ ∆
where variable y is not free in any formula of (Γ1 ∪ Γ2 ∪ {∃x.φ} ∪ ∆).
Since we know ∃x.φ, we know something (call it y) exists which satisfies φ[x := y], but we cannot assume anything about y other than that it is some element of the domain of discourse satisfying φ.

4.4.4 Some Proofs


The mechanism for checking whether a labeled tree of sequents is a proof is the same here as presented in Chap. 2 on propositional logic. But in the presence of

³ If you further constrain that the only variables you use to construct t are among the free variables occurring in the sequent, then your proof is valid in every domain of discourse, including the empty one. Logics allowing the empty domain of discourse are called Free Logics [?].

quantifiers, finding proofs is no longer a strictly mechanical process. Creativity


may be required.

Example 4.10. Consider the sequent ⊢ (∀x.P (x)) ⇒ (∃y.P (y)). Surely if everything satisfies property P , then something satisfies property P .
Initially, the only rule that applies is the propositional ⇒R-rule. It matches this sequent by the following substitution:

σ1 = { Γ := [ ], ∆1 := [ ], ∆2 := [ ], φ := (∀x.P (x)), ψ := (∃y.P (y)) }

The result of applying this substitution to the premise of the ⇒R-rule results in the partial proof tree of the following form:

∀x.P (x) ⊢ ∃y.P (y)
---------------------------------- (⇒R)
⊢ ∀x.P (x) ⇒ ∃y.P (y)

Now, to continue developing the incomplete branch, we examine the ∀L-rule and the ∃R-rule. Both require the prover to select a term to substitute into the scope of the bound variable. In this case, any term will do, as long as we use the same one in both applications. All variables are terms, so just use z; we arbitrarily choose to apply the ∀L-rule first.
The match of the sequent against the goal of the rule is given by the substitution:

σ2 = { Γ1 := [ ], Γ2 := [ ], ∆ := [∃y.P (y)], φ := P (x), x := x, t := z }

The term t we have chosen is the variable z. Applying the substitution to the premise of the rule results in the sequent P (x)[x := z] ⊢ ∃y.P (y). Note that P (x)[x := z] = P (z); thus the resulting partial proof is:

P (z) ⊢ ∃y.P (y)
---------------------------------- (∀L)
∀x.P (x) ⊢ ∃y.P (y)
---------------------------------- (⇒R)
⊢ ∀x.P (x) ⇒ ∃y.P (y)

Now, the only rule that applies is the ∃R-rule. We choose t to be z and match by the following substitution.

σ3 = { Γ := [P (z)], ∆1 := [ ], ∆2 := [ ], φ := P (y), x := y, t := z }

The partial proof generated by applying this rule with this substitution is as follows:

P (z) ⊢ P (z)
---------------------------------- (∃R)
P (z) ⊢ ∃y.P (y)
---------------------------------- (∀L)
∀x.P (x) ⊢ ∃y.P (y)
---------------------------------- (⇒R)
⊢ ∀x.P (x) ⇒ ∃y.P (y)

Now, the incomplete branch of the proof is an instance of an axiom, where the substitution verifying the match is given as follows:

σ4 = { Γ1 := [ ], Γ2 := [ ], ∆1 := [ ], ∆2 := [ ], φ := P (z) }

Finally, we have the following complete proof tree.

P (z) ⊢ P (z)   (Ax)
---------------------------------- (∃R)
P (z) ⊢ ∃y.P (y)
---------------------------------- (∀L)
∀x.P (x) ⊢ ∃y.P (y)
---------------------------------- (⇒R)
⊢ ∀x.P (x) ⇒ ∃y.P (y)

Example 4.11. In the case of propositional logic we did not need to apply any of the structural rules; however, they may be required in the case of the quantifier rules. Consider the following theorem.

∃x.(P (x) ⇒ ∀x.P (x))

Here is a sequent proof whose first step is to copy the formula using the rule for contraction on the right.

P (a), P (y) ⊢ P (y), ∀x.P (x)   (Ax)
------------------------------------------------------ (⇒R)
P (a) ⊢ P (y), P (y) ⇒ ∀x.P (x)
------------------------------------------------------ (∃R)
P (a) ⊢ P (y), ∃x.(P (x) ⇒ ∀x.P (x))
------------------------------------------------------ (∀R)
P (a) ⊢ ∀x.P (x), ∃x.(P (x) ⇒ ∀x.P (x))
------------------------------------------------------ (⇒R)
⊢ P (a) ⇒ ∀x.P (x), ∃x.(P (x) ⇒ ∀x.P (x))
------------------------------------------------------ (∃R)
⊢ ∃x.(P (x) ⇒ ∀x.P (x)), ∃x.(P (x) ⇒ ∀x.P (x))
------------------------------------------------------ (CR)
⊢ ∃x.(P (x) ⇒ ∀x.P (x))

4.4.5 Translating Sequent Proofs into English


Gentzen⁴ devised the sequent proof system to reflect how proofs are done in ordinary mathematics. The formal sequent proof is a tree structure, and we could easily write an algorithm that would recursively translate sequent proofs into English. The rules for such a transformation are given in the following sections.

Axiom Rule
The rule is:

---------------------------------- (Ax)
Γ1 , φ, Γ2 ⊢ ∆1 , φ, ∆2

We say: “But we know φ is true since we have assumed it.” or “φ holds


since we assumed φ to be true and so we are done.”
The other axiom is formally given as follows:

---------------------------------- (⊥Ax)
Γ1 , ⊥, Γ2 ⊢ ∆

We say: “But now we have assumed false and the theorem is true.” or “But
now, we have derived a contradiction and the theorem is true.”

Conjunction Rules
The rule on the right is:

Γ ⊢ ∆1 , φ, ∆2    Γ ⊢ ∆1 , ψ, ∆2
---------------------------------- (∧R)
Γ ⊢ ∆1 , (φ ∧ ψ), ∆2

We say: “To show φ ∧ ψ there are two cases, (case 1.) insert translated proof of the left branch here (case 2.) insert translated proof of the right branch here.”
⁴ Gerhard Gentzen (1909 – 1945) was a German logician who invented the sequent calculus.

Or we say: “To show φ ∧ ψ we must show φ and we must show ψ. To see that φ holds: insert translated proof of left branch here. This completes the proof of φ. To see that ψ holds: insert translated proof of right branch here. This completes the proof of φ ∧ ψ.”
The rule on the left says:
Γ1 , φ, ψ, Γ2 ⊢ ∆
---------------------------------- (∧L)
Γ1 , (φ ∧ ψ), Γ2 ⊢ ∆
We say: “Since we have assumed φ ∧ ψ, we assume φ and we assume ψ.
Insert translated proof of the premise here.”

Disjunction
The formal rule for a disjunction on the right is:
Γ ⊢ ∆1 , φ, ψ, ∆2
---------------------------------- (∨R)
Γ ⊢ ∆1 , (φ ∨ ψ), ∆2
We say: “To show φ ∨ ψ we must either show φ or show ψ. Insert translated
proof of the premise here.”
The sequent proof rule for disjunction on the left is:
Γ1 , φ, Γ2 ⊢ ∆    Γ1 , ψ, Γ2 ⊢ ∆
---------------------------------- (∨L)
Γ1 , (φ ∨ ψ), Γ2 ⊢ ∆
We say: “Since we know φ ∨ ψ we proceed by cases: suppose φ is true, then insert translated proof from the left branch here. On the other hand, if ψ holds: insert translated proof from right branch here.”
Or, we say: “Since φ ∨ ψ holds, we consider the two cases: (case 1, φ holds:) insert translated proof from the left branch here. (case 2, ψ holds:) insert translated proof from right branch here.”

Implication Rules
The formal rule for an implication on the right is:
Γ, φ ⊢ ∆1 , ψ, ∆2
---------------------------------- (⇒R)
Γ ⊢ ∆1 , (φ ⇒ ψ), ∆2
We say: “To prove φ ⇒ ψ, assume φ and show ψ: insert translated proof of the subgoal here.”

The formal rule for an implication on the left is:


Γ1 , Γ2 ⊢ φ, ∆    Γ1 , ψ, Γ2 ⊢ ∆
---------------------------------- (⇒L)
Γ1 , (φ ⇒ ψ), Γ2 ⊢ ∆
We say: “Since we have assumed φ ⇒ ψ, we show φ and then may assume ψ. To see that φ holds: insert translated proof of left branch here. Now, we assume ψ. Insert translated proof of right branch here.”

Negation
The formal rule for a negation on the right is:
Γ, φ ⊢ ∆1 , ∆2
---------------------------------- (¬R)
Γ ⊢ ∆1 , ¬φ, ∆2

We say: “Assume φ. Insert translated proof of premise here.” or we say “Since


we must show ¬φ, assume φ. Insert translated proof of premise here.”
The formal rule for a negation on the left is:

Γ1 , Γ2 ⊢ φ, ∆
---------------------------------- (¬L)
Γ1 , ¬φ, Γ2 ⊢ ∆

We say: “Since we have assumed ¬φ, we show φ. Insert translated proof of


premise here.” or, we say: “Since we know ¬φ, we prove φ. Insert translated
proof of the premise here.”

Universal Quantifier
The formal rule for a ∀ on the right is:

Γ ⊢ ∆1 , φ[x := y], ∆2
---------------------------------- (∀R)
Γ ⊢ ∆1 , ∀x.φ, ∆2
where variable y is not free in any formula of (Γ ∪ ∆1 ∪ {∀x.φ} ∪ ∆2 ).

We say: “To prove ∀x.φ, pick an arbitrary y and show φ[x := y].⁵ Insert translated proof of the premise here.” Or, we simply say: “Pick an arbitrary y and show φ[x := y]. Insert translated proof of the premise here.”
The formal rule for ∀ on the left says:

Γ1 , φ[x := t], Γ2 ⊢ ∆
---------------------------------- (∀L)   where t ∈ T.
Γ1 , ∀x.φ, Γ2 ⊢ ∆

We say: “ Since we know that for every x, φ is true, assume φ[x := t]. Insert
translated proof of premise here.” or, we say: “Assume φ[x := t].”

Existential Quantifiers
The rule for ∃ on the right is:

Γ ⊢ ∆1 , φ[x := t], ∆2
---------------------------------- (∃R)   where t ∈ T.
Γ ⊢ ∆1 , ∃x.φ, ∆2

We say: “Let t be the witness for x in ∃x.φ. We must show φ[x := t]. Insert translated proof of the premise here.” Or, we say: “To show ∃x.φ, we choose the witness t and show φ[x := t]. Insert translated proof of the premise here.”

⁵ In this rule, and those that follow, we take φ[x := y] to be the formula that results from the substitution of y for x in φ, i.e. actually do the substitution before writing the formula in your proof.

The rule for ∃ on the left is:

Γ1 , φ[x := y], Γ2 ⊢ ∆
---------------------------------- (∃L)
Γ1 , ∃x.φ, Γ2 ⊢ ∆
where variable y is not free in any formula of (Γ1 ∪ Γ2 ∪ {∃x.φ} ∪ ∆).

We say: “Since we know ∃x.φ, let y name some element of the domain of discourse satisfying it, and assume φ[x := y]. Insert translated proof of the premise here.” Or, we say: “We know φ holds for some x, so assume φ[x := y]. Insert translated proof of the premise here.”
Part II

Sets, Relations and Functions
Chapter 5

Set Theory

Georg Cantor (1845 – 1918) was a German mathematician and logician who created set theory. See [15]. To many, set theory is the universal language of mathematics. Using set theory Cantor was able to characterize much of the mathematics known at the time and proved many fundamental theorems about set theoretic structures.

In this chapter we present elementary set theory. Set theory serves as a foun-
dation for mathematics1 , i.e. in principle, we can describe all of mathematics
using set theory.
Our presentation is based on Mitchell’s [38]. A classic presentation can be
found in Halmos’ [26] Naive Set Theory.

¹ We say it is “a foundation” since alternative approaches to the foundations of mathematics exist, e.g. category theory [32, 41] or type theory [?] can also serve as the foundation of mathematics. Set theory is the foundational theory accepted by most working mathematicians.


5.1 Introduction
Set theory is the mathematical theory of collections. A set is a collection of
abstract objects where the order and multiplicity of the elements is not taken
into account. This is in contrast to other structures like lists or sequences, where
both the order of the elements and the number of times they occur (multiplicity)
are taken into account when determining if two are equal. For equality on sets,
all that matters is membership. Two sets are considered equal if they have the
same elements.

5.1.1 Informal Notation


From given objects we can form sets by collecting together some or all of the
given objects.
We write down small sets by enclosing the elements in curly brackets “{” and “}”. Thus, the following are sets.

{a, 1, 2}
{a}
{1, 2, a}
{a, a, a}
Sometimes, if a pattern is obvious, we use an informal notation to indicate
larger sets without writing down the names of all the elements in the set. For
example, we might write:

{0, 1, 2, ..., 10}


{0, 2, 4, ..., 10}
{2, 3, 5, 7, · · · , 23, 29}
{0, 1, 2, ...}
{..., −1, 0, 1, 2, ...}
These marks denote: the set of natural numbers from zero to ten; the set of even numbers from zero to ten; the set consisting of the first 10 prime numbers; the set of natural numbers; and the set of integers. The sequence of three dots (“...”) is called an ellipsis and indicates that some material has been omitted intentionally. In describing sets, the cases of the form {..., Γ} or {Γ, ...} indicate that some pattern (which should be obvious from Γ) repeats indefinitely. We present more formal ways of concisely writing down these sets later in this chapter.

5.1.2 Membership is primitive


The objects included in a set are called the elements or members of the set. Membership is a primitive notion in set theory; as such it is not formally defined. Rather, it should be thought of as an undefined primitive relation; it is used in set theory to characterize the properties of sets.

We indicate an object x is a member of a set A by writing

x∈A

We sometimes will also say “A contains x” or “x is in A”.
The statement x ∈ A is a true proposition if x actually is in A. We read the symbol “∉” as not in and define it by negating the membership proposition:

x ∉ A  =def  ¬(x ∈ A)

Evidently, the following are all true propositions.

a ∈ {a, 1, 2}
1 ∉ {a}
1 ∈ {1, 2, a}
2 ∉ {a, a, a}
Note that sets may contain other sets as members. Thus,

{1, {1}}
is a set and the following propositions are true.

1 ∈ {1, {1}}
{1} ∈ {1, {1}}
Consider the following true propositions; the last one can be confusing².

1 ∉ {{1}}
{1} ∉ {1}

5.2 Equality and Subsets


Throughout mathematics and computer science, whenever a new notion or struc-
ture is introduced (sets in this case) we must also say when instances of the
objects are equal or when they are subsets.

5.2.1 Extensionality
Sets are determined by their members or elements. This means, the only prop-
erty significant for determining when two sets are equal is the membership re-
lation. Thus, in a set, the order of the elements is insignificant and the number
of times an element occurs in a set (its multiplicity) is also insignificant. This
equality (i.e. the one that ignores multiplicity and order) is called extensionality.
2 Indeed there are some serious philosophers who reject it as senseless [20].

Definition 5.1.

A = B  =def  ∀x. (x ∈ A ⇔ x ∈ B)

We write A ≠ B for ¬(A = B).


Consider the following sets.

{a, 1, 2}
{a}
{1, 2, a}
{a, a, a}
The first and the third are equal as sets, and the second and the fourth are equal. It is not unreasonable to think of these equal sets as different descriptions (or names) of the same mathematical object.
Note that the set {1, 2} is not equal³ to the set {1, {2}}; this is because 2 ∈ {1, 2} but 2 ∉ {1, {2}}.

5.2.2 Subsets
Definition 5.2 (Subset) A set A is a subset of another set B if every element
of A is also an element of B. Formally, we write:

A ⊆ B  =def  ∀x. (x ∈ A ⇒ x ∈ B)

Thus, the set of even numbers (call this set 2N) is a subset of the natural
numbers N, in symbols 2N ⊆ N.

Theorem 5.1. For every set A, A ⊆ A

Proof:

x ∈ A ⊢ x ∈ A   (Ax)
---------------------------------- (⇒R)
⊢ x ∈ A ⇒ x ∈ A
---------------------------------- (∀R)
⊢ ∀x. (x ∈ A ⇒ x ∈ A)
---------------------------------- (⊆ def)
⊢ A ⊆ A
---------------------------------- (∀R)
⊢ ∀A. A ⊆ A

Differences between the membership and subset relations.

The reader should not confuse the notions x ∈ A and x ⊆ A.³

³ It may be interesting to note that some philosophers with nominalist tendencies [19, 20] deny the distinction commonly made between these sets; they say that they “don’t understand” the distinction being made between them.

Equality via subsets


We can prove the following theorem which relates extensionality with subsets.

Theorem 5.2 (subset extensionality) For every set A and every set B,

A = B ⇔ ((A ⊆ B) ∧ (B ⊆ A))

Proof: Since the theorem is an if and only if, we must show two cases; we label them below as (⇒) and (⇐).
(⇒) Assume A = B; we must show that A ⊆ B and that B ⊆ A. By definition, if A = B, then ∀x. (x ∈ A ⇔ x ∈ B). First, to show that A ⊆ B, we must show that ∀x. (x ∈ A ⇒ x ∈ B). Pick an arbitrary thing, call it y; we must show that y ∈ A ⇒ y ∈ B. But we assumed A = B, and this means (by the definition of equality) y ∈ A ⇔ y ∈ B. Since the definition of equality is an iff, we may assume y ∈ A ⇒ y ∈ B and y ∈ B ⇒ y ∈ A. But then we have shown y ∈ A ⇒ y ∈ B, as was desired. The argument for the case B ⊆ A is similar.
(⇐) Assume ((A ⊆ B) ∧ (B ⊆ A)), i.e. that A ⊆ B and B ⊆ A; we must show that A = B. By definition, this is true if and only if ∀x. (x ∈ A ⇔ x ∈ B). Pick an arbitrary thing, call it y, and show y ∈ A ⇔ y ∈ B, i.e. show y ∈ A ⇒ y ∈ B and y ∈ B ⇒ y ∈ A. Using y in the assumption A ⊆ B gives the first case and using y in the assumption that B ⊆ A gives the second.


This theorem gives us another way to prove two sets are equal; instead of proving directly that they have the same elements, show that the two sets are subsets of one another. In the same way we labeled the cases of the proof of an (⇔) by (⇒) and (⇐), we sometimes label the cases of a proof that two sets are equal by (⊆) and (⊇).

5.3 Set Constructors


Although we have been discussing sets as if they exist, we need to provide
various means to construct them.

5.3.1 The Empty Set


There exists a set containing no elements. We can write this fact as follows:

Axiom 5.1 (Empty Set)

∃A.∀x. x ∉ A

The axiom says that there exists a set A which, for any thing whatsoever
(call it x), that thing is not in the set A.

Definition 5.3 (Uniqueness) Consider a property (say P ). If, no matter which two things we pick (say x and y), whenever P (x) and P (y) hold we also have x = y, then we say that the property P is unique.

unique(P )  =def  ∀x, y. (P (x) ∧ P (y)) ⇒ (x = y)

We can prove that the empty set is unique, i.e. we can prove the following theorem, which says that any two sets having the property that they contain no elements are equal. In this case the property P is defined as P (z) = ∀x. x ∉ z.

Theorem 5.3. For every set A and every set B, if ∀x. x ∉ A and ∀x. x ∉ B, then A = B.

Proof: We give a formal sequent proof; it is displayed here as a list of steps, reading from the root (the bottom of the tree) upwards.

⊢ ∀A.∀B.(((∀x.x ∉ A) ∧ (∀x.x ∉ B)) ⇒ A = B)
follows by (∀R), twice, from
⊢ ((∀x.x ∉ A) ∧ (∀x.x ∉ B)) ⇒ A = B
follows by (⇒R) from
(∀x.x ∉ A) ∧ (∀x.x ∉ B) ⊢ A = B
follows by ({=}-def) from
(∀x.x ∉ A) ∧ (∀x.x ∉ B) ⊢ ∀x.(x ∈ A ⇔ x ∈ B)
follows by (∀R) from
(∀x.x ∉ A) ∧ (∀x.x ∉ B) ⊢ x ∈ A ⇔ x ∈ B
follows by ({⇔}-def) from
(∀x.x ∉ A) ∧ (∀x.x ∉ B) ⊢ (x ∈ A ⇒ x ∈ B) ∧ (x ∈ B ⇒ x ∈ A)
follows by (∧L) from
∀x.x ∉ A, ∀x.x ∉ B ⊢ (x ∈ A ⇒ x ∈ B) ∧ (x ∈ B ⇒ x ∈ A)
follows by (∀L), twice, from
x ∉ A, x ∉ B ⊢ (x ∈ A ⇒ x ∈ B) ∧ (x ∈ B ⇒ x ∈ A)
follows by ({∉}-def), twice, from
¬(x ∈ A), ¬(x ∈ B) ⊢ (x ∈ A ⇒ x ∈ B) ∧ (x ∈ B ⇒ x ∈ A)
follows by (¬L), twice, from
⊢ x ∈ B, x ∈ A, (x ∈ A ⇒ x ∈ B) ∧ (x ∈ B ⇒ x ∈ A)
follows by (∧R) from the two branches
⊢ x ∈ B, x ∈ A, x ∈ A ⇒ x ∈ B   and   ⊢ x ∈ B, x ∈ A, x ∈ B ⇒ x ∈ A
which follow by (⇒R) from
x ∈ A ⊢ x ∈ B, x ∈ A, x ∈ B   and   x ∈ B ⊢ x ∈ B, x ∈ A, x ∈ A
each of which is an instance of (Ax).


Since the set is unique we can give it a name, we denote this unique set by
the constant4 symbol ∅ (and sometimes by just writing empty brackets {}).
Using this new notation, we can restate the empty set axiom in a simpler
form as follows.
4 Do not confuse the symbol ∅ with the Greek letter φ.

Corollary 5.1 (Empty Set)

∀x. x ∉ ∅
With this fact, we can easily prove the following theorem:
Theorem 5.4 (Empty Subset) For every set A, ∅ ⊆ A.
Exercise 5.1. Prove Theorem 5.4.

5.3.2 Unordered Pairs and Singletons


Definition 5.4 (unordered pair) A set having exactly two elements is called
an unordered pair.
The following axiom asserts that given any two things, there is a set whose
elements are those two things and only those elements.
Axiom 5.2 (Pairing)
∀x.∀y.∃A.∀z. z ∈ A ⇔ (z = x ∨ z = y)
Note that although we might believe such a set exists (i.e. the set A containing just the elements x and y), without recourse to the pairing axiom we would have no justification for asserting that it does.
Note also that the set constructed by the pairing axiom for particular x and y is unique, i.e. if we fix x and y, and we claim that A and B are sets having the property claimed for the sets whose existence is asserted by the pairing axiom, then A = B. This is made precise by the following lemma.
Lemma 5.1 (Pairing Unique)

∀x.∀y. unique(P )

where P is the property of sets defined as follows:

P (C)  =def  ∀z. z ∈ C ⇔ (z = x ∨ z = y)
Proof: Choose arbitrary x, y, and show that P is unique. Specifically, show
that
∀A.∀B.P (A) ∧ P (B) ⇒ A = B
Choose arbitrary sets A and B and assume P (A) and P (B).
P (A) : ∀z. z ∈ A ⇔ (z = x ∨ z = y)
P (B) : ∀z. z ∈ B ⇔ (z = x ∨ z = y)
To show A = B we show that w ∈ A ⇔ w ∈ B for arbitrary w.
case 1: Assume w ∈ A and show w ∈ B. By P (A) (using w for z), if w ∈ A then we know (w = x ∨ w = y). Now, using w for z in P (B), this gives us the fact that w ∈ B, which is what we were to show.
case 2: Assume w ∈ B and show w ∈ A. Use P (B) (using w for z) as we did in case 1 and this case holds as well.

Now, since any unordered pair composed of elements x and y is unique, we


will write {x, y} (or {y, x}) to denote this set. As a corollary, we restate the
pairing axiom as follows:

Corollary 5.2 (Pairing)

∀x.∀y.∀z. z ∈ {x, y} ⇔ (z = x ∨ z = y)

From now on we will use this form of the pairing axiom instead of the form
having the existential quantifier in its statement.

Singletons
By choosing x and y in the pairing axiom to be the same element we get a singleton, a set having exactly one element.

Lemma 5.2 (Singleton Exists)

∀x.∃A.∀z. z ∈ A ⇔ z = x

Proof: To prove the theorem, choose an arbitrary element (call it w) and


show
(∗) ∃A.∀z.z ∈ A ⇔ z = w
Now, by the pairing axiom, we know

∀x.∀y.∀z. z ∈ {x, y} ⇔ (z = x ∨ z = y)

Let both x and y be w, then we know, ∀z.z ∈ {w, w} ⇔ (z = w ∨ z = w). Since


(P ∨ P ) ⇔ P we can simplify this as ∀z. z ∈ {w, w} ⇔ (z = w). Use {w, w} as
the witness for A in (*) giving ∀z. z ∈ {w, w} ⇔ (z = w) which we have just
shown to be true.


Like pairs, singletons are unique.

Corollary 5.3 (Singleton Unique)

∀x. unique(P )

where P is the property of sets defined as follows:

P (C)  =def  ∀z. z ∈ C ⇔ (z = x)

Proof: Singletons are just pairs where the elements are not distinct. Note that the proof of uniqueness for pairs (Lemma 5.1) does not depend in any way on the distinctness of the elements in the pair, and so singletons are also unique.

Note that by extensionality, {x, x} = {x}, and since singletons are unique, we will write {x} for the singleton containing x. Note that the singleton set {x} is distinguished from its element x, i.e. x ≠ {x}. Because the set that is claimed to exist in Lemma 5.2 is unique, we can restate that lemma more simply as follows.

Corollary 5.4 (Simplified Singleton Exists)

∀x.∀z. z ∈ {x} ⇔ z = x

Corollary 5.5 (Singleton Member)

∀w.w ∈ {w}

Proof: To prove this, choose an arbitrary w and show w ∈ {w}. By Corol-


lary 5.4, we know ∀x.∀z.z ∈ {x} ⇔ z = x. In this formula, choose both x and
z to be w, yielding the fact w ∈ {w} ⇔ w = w. Since w = w is always true, we
have shown w ∈ {w}.


Lemma 5.3 (Singleton Equality)

∀x, y. {x} = {y} ⇔ x = y

Exercise 5.2. Prove the singleton equality lemma.

5.3.3 Ordered Pairs

Kazimierz Kuratowski (1896 – 1980) was a Polish mathematician who was active in the early development of topology and axiomatic set theory.

The pair {a, b} and the pair {b, a} are identical as far as we can tell using set equality. They are indistinguishable if we only consider their members. What if we want to be able to distinguish pairs by the order in which their elements are listed: is it possible using only sets? The following encoding of ordered pairs was first given by Kuratowski.

Definition 5.5 (ordered pair)

⟨a, b⟩  =def  {{a}, {a, b}}

Note that the angled brackets (“⟨” and “⟩”) are used here to denote ordered pairs.
Under this definition ⟨1, 2⟩ = {{1}, {1, 2}} and ⟨2, 1⟩ = {{2}, {1, 2}}. As sets, ⟨1, 2⟩ ≠ ⟨2, 1⟩. Also, note that the pair consisting of two of the same elements is encoded as the set containing the set containing that element.

⟨1, 1⟩ = {{1}, {1, 1}} = {{1}, {1}} = {{1}}
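The encoding can be exercised directly with nested finite sets; a small sketch using Haskell's Data.Set (illustrative only):

    import qualified Data.Set as Set

    -- Kuratowski pairs of Ints as sets of sets.
    pair :: Int -> Int -> Set.Set (Set.Set Int)
    pair a b = Set.fromList [Set.singleton a, Set.fromList [a, b]]

    -- pair 1 2 /= pair 2 1, even though Set.fromList [1,2] == Set.fromList [2,1];
    -- and pair 1 1 collapses to {{1}}, exactly as computed above.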

Theorem 5.5 (characteristic property of ordered pairs) For sets A and B and for every a, a′ ∈ A and b, b′ ∈ B,

⟨a, b⟩ = ⟨a′, b′⟩ ⇔ (a = a′ ∧ b = b′)

Exercise 5.3. Prove theorem 5.5.

Definition 5.6 (projections) We define the projection functions⁵ which map pairs (say p) to their first and second components.

π1 p = x  =def  {x} ∈ p
π2 p = b  =def  {π1 p, b} ∈ p

Thus, to prove that for a pair p, π1 p = x, it is enough to show that {x} ∈ p. Similarly, to show that π2 p = b, show that {π1 p, b} ∈ p.

Lemma 5.4. For every a and b the following identities hold.

i.) π1 ⟨a, b⟩ = a
ii.) π2 ⟨a, b⟩ = b

Proof: Choose arbitrary a and b.
i.) By definition π1 ⟨a, b⟩ = a if and only if {a} ∈ ⟨a, b⟩. By definition, ⟨a, b⟩ = {{a}, {a, b}}, and {a} is in this set so this case holds.
ii.) By definition π2 ⟨a, b⟩ = b if and only if {π1 ⟨a, b⟩, b} ∈ ⟨a, b⟩. We just saw (in the proof of i.) that π1 ⟨a, b⟩ = a, so we must check whether {a, b} ∈ {{a}, {a, b}}, which is true and so this case holds as well.
5 After we introduce relations and functions below, it will be possible to prove that the

projections, which technically are defined here as relations between a pair and its first or
second element, really are functions.


Lemma 5.5.
∀a. π1 ⟨a, a⟩ = π2 ⟨a, a⟩
Proof: Choose arbitrary a. By definition, ⟨a, a⟩ = {{a}, {a, a}}; thus π1 ⟨a, a⟩ = a iff {a} ∈ {{a}, {a, a}}, which is true. Similarly, π2 ⟨a, a⟩ = a iff {a, a} ∈ {{a}, {a, a}}, which is also true, and so the theorem holds.


Norbert Wiener (1894 – 1964) was a U.S. mathematician who taught at MIT and founded the field of cybernetics.

Exercise 5.4. An alternative definition of ordered pairs (this was the first definition) was given by Norbert Wiener in 1914.

⟨x, y⟩  =def  {{{x}, ∅}, {{y}}}

Prove that this definition satisfies the characteristic property of ordered pairs as stated in Thm. 5.5.

5.3.4 Set Union


Since we distinguish sets by their members, we can define operations on sets
that construct new sets by indicating when an element is a member of the
constructed set.
Definition 5.7 (union) Union is the operation of putting two collections to-
gether. If A and B are sets, we write A∪B for the set consisting of the members
of A and of B.
Axiom 5.3 (union membership axiom) We characterize membership in a union as follows:

x ∈ (A ∪ B)  =def  (x ∈ A ∨ x ∈ B)

Thus if A = {1, 2, 3} and B = {2, 3, 4} then A ∪ B = {1, 2, 3, 4}.

The empty set acts as an identity element for the union operation (in the
same way 0 is the identity for addition and 1 is the identity for multiplication.)
This idea is captured by the following theorem.

Theorem 5.6. For every set A, A ∪ ∅ = A.

Proof: We give a formal sequent proof, again reading from the root upwards.

⊢ ∀A. A ∪ ∅ = A
follows by (∀R) from
⊢ A ∪ ∅ = A
follows by (def of =) from
⊢ ∀x. x ∈ (A ∪ ∅) ⇔ x ∈ A
follows by (∀R) from
⊢ x ∈ (A ∪ ∅) ⇔ x ∈ A
follows by (def of ⇔) from
⊢ (x ∈ (A ∪ ∅) ⇒ x ∈ A) ∧ (x ∈ A ⇒ x ∈ (A ∪ ∅))
follows by (∧R) from the two branches below.
Left branch: ⊢ x ∈ (A ∪ ∅) ⇒ x ∈ A follows by (⇒R) from x ∈ (A ∪ ∅) ⊢ x ∈ A, which by (∈ ∪ def) follows from x ∈ A ∨ x ∈ ∅ ⊢ x ∈ A. By (∨L) this splits into x ∈ A ⊢ x ∈ A, an instance of (Ax), and x ∈ ∅ ⊢ x ∈ A, which follows by (Assert), using Corollary 5.1: from ∀x.x ∉ ∅, x ∈ ∅ ⊢ x ∈ A by (∀L), from x ∉ ∅, x ∈ ∅ ⊢ x ∈ A by (def of ∉), from ¬(x ∈ ∅), x ∈ ∅ ⊢ x ∈ A by (¬L), from x ∈ ∅ ⊢ x ∈ ∅, x ∈ A, an instance of (Ax).
Right branch: ⊢ x ∈ A ⇒ x ∈ (A ∪ ∅) follows by (⇒R) from x ∈ A ⊢ x ∈ (A ∪ ∅), which by (∈ ∪ def) follows from x ∈ A ⊢ x ∈ A ∨ x ∈ ∅, by (∨R) from x ∈ A ⊢ x ∈ A, x ∈ ∅, an instance of (Ax).

The following theorem asserts that the order of arguments to a union oper-
ation do not matter.
Theorem 5.7 (union commutes) For sets A and B, A ∪ B = B ∪ A.
Proof: Choose arbitrary sets A and B. By extensionality, A ∪ B = B ∪ A is true if ∀x. x ∈ (A ∪ B) ⇔ x ∈ (B ∪ A). Choose an arbitrary x. By the definition of membership in a union, x ∈ (A ∪ B) iff x ∈ A ∨ x ∈ B. Since (x ∈ A ∨ x ∈ B) ⇔ (x ∈ B ∨ x ∈ A) and, again by the union membership property, (x ∈ B ∨ x ∈ A) ⇔ x ∈ (B ∪ A), the result follows.

By this theorem, A ∪ ∅ = ∅ ∪ A which, together with Thm 5.6 yields the
following corollary.

Corollary 5.6. For every set A, ∅ ∪ A = A.

Theorem 5.8. For all sets A and B, A ⊆ (A ∪ B).



Proof: Choose arbitrary sets A and B. By the definition of subset, A ⊆ A ∪ B


is true if ∀x.x ∈ A ⇒ x ∈ (A ∪ B). Choose an arbitrary x, assume x ∈ A. Now,
x ∈ (A ∪ B) if x ∈ A or x ∈ B. Since we have assumed x ∈ A, the theorem
holds.

By Thm. 5.7, we have the following:

Corollary 5.7. For all sets A and B, A ⊆ (B ∪ A).

5.3.5 Set Intersection


Another way of constructing new sets from existing ones is to take their intersection.

Definition 5.8 (intersection) We define the operation of collecting the elements common to two sets and call it the intersection.

Membership in an intersection is defined as follows:

Axiom 5.4 (intersection membership axiom)

x ∈ (A ∩ B)  =def  (x ∈ A ∧ x ∈ B)

Thus if A = {1, 2, 3} and B = {2, 3, 4} then A ∩ B = {2, 3}.
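The two membership axioms can be checked on small finite sets; a sketch with Haskell's Data.Set (the axioms themselves, of course, concern arbitrary sets):

    import qualified Data.Set as Set

    a, b :: Set.Set Int
    a = Set.fromList [1, 2, 3]
    b = Set.fromList [2, 3, 4]

    -- Set.union a b        == Set.fromList [1,2,3,4]
    -- Set.intersection a b == Set.fromList [2,3]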

Theorem 5.9. For every set A, A ∩ ∅ = ∅.

Proof: By extensionality, A ∩ ∅ = ∅ is true iff ∀x.x ∈ (A ∩ ∅) ⇔ x ∈ ∅. Choose


an arbitrary x. We must show

i.) x ∈ (A ∩ ∅) ⇒ x ∈ ∅
ii.) x ∈ ∅ ⇒ x ∈ (A ∩ ∅)

i.) Assume x ∈ (A ∩ ∅); then, by the membership property of intersections, x ∈ A ∧ x ∈ ∅. And now, by the empty set axiom (Axiom 5.1), the second conjunct is a contradiction, so the implication in the left to right direction holds.
ii.) Assume x ∈ ∅. Again, by the empty set axiom, this is false so the right to left implication holds vacuously.


Theorem 5.10 (intersection commutes) For every pair of sets A and B,

A ∩ B = B ∩ A.

Exercise 5.5. Prove Theorem 5.10.



5.3.6 Power Set


Definition 5.9 (power set) Consider the collection of all subsets of a given
set, this collection is itself a set and is called the power set. We write the power
set of a set S as ρ(S).

Axiom 5.5 (power set) The axiom characterizing membership in the power set says:

x ∈ ρ(S)  =def  x ⊆ S

Consider the following examples:

A0 = {}          ρ(A0) = {{}}
A1 = {1}         ρ(A1) = {{}, {1}}
A2 = {1, 2}      ρ(A2) = {{}, {1}, {2}, {1, 2}}
A3 = {1, 2, 3}   ρ(A3) = {{}, {1}, {2}, {3}, {1, 2}, {1, 3}, {2, 3}, {1, 2, 3}}

Notice that the size of the power set is growing exponentially (as powers of 2: 2⁰ = 1, 2¹ = 2, 2² = 4, 2³ = 8).

Fact 5.1. If a set A has n elements, then the power set ρ(A) has 2ⁿ elements.
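For finite sets this fact is easy to see computationally: each element is independently kept or dropped, doubling the count of subsets. A sketch, with lists of distinct elements standing in for finite sets:

    -- All subsets of a finite set given as a list of distinct elements.
    powerSet :: [a] -> [[a]]
    powerSet []       = [[]]
    powerSet (x : xs) = map (x :) rest ++ rest   -- keep x, or drop it
      where rest = powerSet xs

    -- length (powerSet [1,2,3]) == 8 == 2^3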

5.3.7 Comprehension
If we are given a set and a predicate φ(x) (a property of elements of the set)
we can create the set consisting of those elements that satisfy the property. We
write the set created by the operation by instantiating the following schema.

Axiom 5.6 (Comprehension) If S is a meta-variable denoting an arbitrary


set, x is a variable and φ(x) is a predicate, then the following denotes a set.

{x ∈ S | φ(x)}

This is a powerful mechanism for defining new sets.


Note that we characterize membership in sets defined by comprehension as follows.

y ∈ {x ∈ S | φ(x)}  =def  (y ∈ S ∧ φ(y))

Let’s consider a few examples of how to use comprehension. We assume the natural numbers (N = {0, 1, 2, · · · }) have already been defined.

Example 5.1. The set of natural numbers greater than 5 can be defined using
comprehension as:

{n ∈ N | ∃m. m ∈ N ∧ m + 6 = n}

Example 5.2. We can define the set of even numbers as follows.


First, note that a natural number n is even if and only if there is another
natural number (say m), such that n = 2m. (e.g. if n is 0 (an even natural
number), then if m = 0, 2m = n. If n is 2, then m = 1 gives 2m = n, etc.)
Thus, n is even if and only if ∃m.m ∈ N ∧ 2m = n. Using this predicate of n,
we can define the set of even natural numbers as follows.

{n ∈ N | ∃m.m ∈ N ∧ 2m = n}
Here, the set S from the schema is the set of natural numbers N and the
predicate φ is:
φ(n) = ∃m.m ∈ N ∧ 2 ∗ m = n
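Comprehension has a familiar computational analogue in list comprehensions, where the predicate appears after the bar just as φ(x) does. A sketch, restricted to a finite prefix of N so the result is printable:

    -- The even naturals up to 20, mirroring {n ∈ N | ∃m. m ∈ N ∧ 2m = n}.
    evens :: [Integer]
    evens = [ n | n <- [0 .. 20], any (\m -> 2 * m == n) [0 .. 20] ]

    -- evens == [0,2,4,6,8,10,12,14,16,18,20]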

Substitution into Comprehensions*


Just like the quantifiers ∀ and ∃, the notation for a set defined by comprehension
binds a variable. This makes substitutions into sets defined by comprehension
interesting in the same way substitutions into quantified formulas can be. In
the comprehension {x : S | φ}, free occurrences of x in the formula φ are bound
by the declaration of x on the left side of “|.” The following definition extends
Def. 4.13 from chapter 4 to include sets defined by comprehension.

Definition 5.10 (capture avoiding substitution over a comprehension)

{x : S | φ}[x := t] = {x : S[x := t] | φ}
{y : S | φ}[x := t] = {y : S[x := t] | φ[x := t]}   if x ≠ y and y ∉ FV(t)
{y : S | φ}[x := t] = {z : S[x := t] | φ[y := z][x := t]}   if x ≠ y, y ∈ FV(t), and z ∉ (FV(S) ∪ FV(t) ∪ FV(φ) ∪ {x})

(The bound variable of the comprehension scopes over φ but not over S, so the substitution is always applied to S.)

Bertrand Russell (1872 – 1970) was an English born philosopher and logician.

Remark 5.1 (Russell’s Paradox) In our definition of comprehension we insisted that elements of sets defined by comprehension come from some preexisting set. In Cantor’s original theory, this constraint was not stipulated, and in 1901 Bertrand Russell noticed the following problem.
Consider the set S consisting of those sets that do not contain themselves as elements. Without the constraint on comprehensions, this set is easily defined as follows.

S = {x | x ∉ x}

Now, by the law of excluded middle of propositional logic, we know either S ∈ S or S ∉ S.
(Case 1) S ∈ S iff S ∈ {x | x ∉ x}. By the rule for membership in comprehensions, this is true iff S ∉ S. But then we have shown that S ∈ S ⇔ ¬(S ∈ S), which is a contradiction.
(Case 2) S ∉ S iff S ∉ {x | x ∉ x}, which is true iff ¬(S ∉ S). But the definition of x ∉ y simply is ¬(x ∈ y), so we have ¬¬(S ∈ S). By double negation elimination, this is true iff S ∈ S. Thus, we have shown that S ∉ S ⇔ S ∈ S, which again is a contradiction.

5.3.8 Set Difference


Definition 5.11 (difference) Given a set A and a set B, the difference of A and B (written A − B) is the set of elements in A that are not in the set B. More formally:

A − B = {x : A | x ∉ B}

Example 5.3. If Even is the set of even natural numbers, N − Even = Odd.

Theorem 5.11. For every set A, A − ∅ = A.

Theorem 5.12. For every set A, A − A = ∅.

Theorem 5.13. For all sets A and B, A − B = ∅ ⇔ A ⊆ B.

Definition 5.12 (disjoint sets) Two sets A and B are disjoint if they share
no members in common, i.e. if the following holds:

A∩B =∅

Theorem 5.14. For all sets A and B, A and B are disjoint sets iff A − B = A.

5.3.9 Cartesian Products and Tuples


Definition 5.13 (Cartesian product) The Cartesian product of sets A and B is the set of all ordered pairs having a first element from A and second element from B. We write A × B to denote the Cartesian product.

A × B  =def  {z ∈ ρ(ρ(A ∪ B)) | ∃a : A. ∃b : B. z = ⟨a, b⟩}

Note that, by Def 5.5, z = ⟨a, b⟩ means z is a set of the form {{a}, {a, b}}. Evidently, the Cartesian product of two sets is a set of pairs.

Example 5.4. If A = {a, b} and B = {1, 2, 3}

A × A = {⟨a, a⟩, ⟨a, b⟩, ⟨b, a⟩, ⟨b, b⟩}
A × B = {⟨a, 1⟩, ⟨a, 2⟩, ⟨a, 3⟩, ⟨b, 1⟩, ⟨b, 2⟩, ⟨b, 3⟩}
B × A = {⟨1, a⟩, ⟨2, a⟩, ⟨3, a⟩, ⟨1, b⟩, ⟨2, b⟩, ⟨3, b⟩}
B × B = {⟨1, 1⟩, ⟨1, 2⟩, ⟨1, 3⟩, ⟨2, 1⟩, ⟨2, 2⟩, ⟨2, 3⟩, ⟨3, 1⟩, ⟨3, 2⟩, ⟨3, 3⟩}
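Computationally the Cartesian product of finite sets is again a comprehension; a sketch:

    -- All ordered pairs with first component from as and second from bs.
    cartesian :: [a] -> [b] -> [(a, b)]
    cartesian as bs = [ (x, y) | x <- as, y <- bs ]

    -- cartesian "ab" [1,2,3] yields the 2 * 3 = 6 pairs of A × B in Example 5.4.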

Theorem 5.15. For any set A,

i.) A × ∅ = ∅
ii.) ∅ × A = ∅

Proof: (of i) Choose an arbitrary x.
(⇒): Assume x ∈ (A × ∅); but this is true only if there exists an a, a ∈ A, and a b, b ∈ ∅, such that x = ⟨a, b⟩. By the empty set axiom there is no such b, so this case holds vacuously.
(⇐): We assume x ∈ ∅ which, by the empty set axiom, is a contradiction, so this case holds vacuously as well.

So theorem 5.15 says that ∅ is both a left and right zero (an annihilator) for Cartesian product, much as 0 is for multiplication.

Lemma 5.6 (pair lemma) If A and B are arbitrary sets,

∀x : A × B. ∃y : A. ∃z : B. x = ⟨y, z⟩

Remark 5.2 (Membership in Comprehensions defined over Products)
To be syntactically correct, a set defined by comprehension over a Cartesian product appears as follows:

{y ∈ A × B | P [y]}

In practice, we often need to refer to the parts of the pair y to express the property P . If so, to be formally correct, we should write:

{y : A × B | ∀z : A. ∀w : B. y = ⟨z, w⟩ ⇒ P (z, w)}

So,
x ∈ {y : A × B | ∀z : A. ∀w : B. y = ⟨z, w⟩ ⇒ P (z, w)}
⇔ x ∈ A × B ∧ ∀z : A. ∀w : B. x = ⟨z, w⟩ ⇒ P (z, w)
By lemma 5.6, we know there exist z ∈ A and w ∈ B such that x = ⟨z, w⟩. So, to prove membership of x it is enough to show that x ∈ A × B and then assume there are z ∈ A and w ∈ B such that x = ⟨z, w⟩ and show P [z, w]. A more

readable syntactic form allows the “destructuring” of the pair to occur on the left side in the place of the variable:

{⟨x, y⟩ ∈ A × B | P [x, y]}

Under the rule for proof just described, to show membership

z ∈ {⟨x, y⟩ ∈ A × B | P [x, y]}

show z ∈ A × B and then, assuming z = ⟨x, y⟩ (for new variables x and y), show P [x, y].

5.4 Properties of Operations on Sets


A set operator is a mapping from sets to a set. For example, the union operator maps two sets to the set whose elements are those coming from either set. The number of set arguments (inputs) an operator takes is called the arity of the operator. A unary operator maps a single set to a new set. A binary operator maps two sets to a set. In general, if an operator takes k arguments, we say it is a k-ary operation.

5.4.1 Idempotency
Definition 5.14 (Idempotence) Given a binary operation ?, its idempotent
elements are those elements x for which x ? x = x.

Example 5.5. For the operation of ordinary multiplication, 0 and 1 are (the
only) idempotent elements. For the operation of addition, 0 (but not 1) is an
idempotent element.

The following lemma shows that every set is an idempotent element for
intersections and unions.

Lemma 5.7 (Intersection Idempotent) ∀A. A ∩ A = A.

Lemma 5.8 (Union Idempotent) ∀A. A ∪ A = A.

5.4.2 Monotonicity
Monotonicity is a property of unary operators.

Definition 5.15 (Monotone) A unary set operator (say X) is monotone if


for all sets A and B, A ⊆ B ⇒ X(A) ⊆ X(B).

Example 5.6. For arbitrary sets A and B, the powerset operation is monotone, i.e.

A ⊆ B ⇒ ρ(A) ⊆ ρ(B)

5.4.3 Commutativity
Definition 5.16 (Commutative) A binary set operator (say ◦) is commuta-
tive if for all sets A, B
(A ◦ B) = (B ◦ A)

Lemma 5.9 (Intersection commutative) Set intersection is commutative:

A∩B =B∩A

Lemma 5.10 (Union commutative) Set union is commutative.

A∪B =B∪A

5.4.4 Associativity
Definition 5.17 (Associative) A binary set operator (say ◦) is associative if
for all sets A, B, and C,

A ◦ (B ◦ C) = (A ◦ B) ◦ C

Lemma 5.11 (Intersection associative) Set intersection is associative.

A ∩ (B ∩ C) = (A ∩ B) ∩ C

Lemma 5.12 (Union associative) Set union is associative.

A ∪ (B ∪ C) = (A ∪ B) ∪ C

5.4.5 Distributivity
The distributive property relates pairs of operators.

Definition 5.18 (Distributive) For binary set operators (say ◦ and ⋆), we
say ◦ distributes over ⋆ if for all sets A, B, and C,

A ◦ (B ⋆ C) = (A ◦ B) ⋆ (A ◦ C)

Lemma 5.13 (Union distributes over intersection)

∀A, B, C. A ∪ (B ∩ C) = (A ∪ B) ∩ (A ∪ C)

Lemma 5.14 (Intersection distributes over union)

∀A, B, C. A ∩ (B ∪ C) = (A ∩ B) ∪ (A ∩ C)
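
These identities are easy to sanity-check on small finite sets. Here is a
minimal sketch in Python, whose built-in set type models finite sets directly;
the particular sets A, B and C are arbitrary choices, and a few passing checks
are of course evidence, not proof.

    # Checking the distributive laws on concrete finite sets.
    A, B, C = {1, 2}, {2, 3}, {3, 4}

    # Union distributes over intersection.
    assert A | (B & C) == (A | B) & (A | C)

    # Intersection distributes over union.
    assert A & (B | C) == (A & B) | (A & C)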
Chapter 6

Relations
Alfred Tarski (1902–1983) was born in Poland and came to the US at the
outbreak of WWII. Tarski was the youngest person ever to earn a Ph.D. from
the University of Warsaw, and throughout his career he made many
contributions to logic and mathematics – though he may be best known for
his work in semantics and model theory. Tarski and his students developed
the theory of relations as we know it. See [14] for a complete and rather
personal biography of Tarski's life and work.

6.1 Introduction
Relations establish a correspondence between the elements of sets thereby im-
posing structure on the elements. In keeping with the principle that all of math-
ematics can be described using set theory, relations (and functions) themselves
can be characterized as sets (having a certain kind of structure).
For example, familial relations can be characterized mathematically using
relational structures and/or functions. Thus, if the set P is the set of
all people living and dead, the relationship between a (biological) father and
his offspring could be represented by a set of pairs F of the form hx, yi to be
interpreted as meaning that x is the father of y if hx, yi ∈ F . We will write xF y
to denote the fact that x is the father of y instead of hx, yi ∈ F . Using this
notation, the paternal grandfather relation can be characterized by the set

{hx, yi ∈ P × P|∃z. xF z ∧ zF y}
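
Since a relation is just a set of pairs, it can be represented directly in a
programming language. The following Python sketch illustrates the idea; the
family data below is invented purely for illustration.

    # The father-of relation as a finite set of pairs.
    F = {("Joe", "Tommy"), ("Joe", "Susie"), ("Abe", "Joe")}

    # Paternal grandfather: {<x, y> | exists z. xFz and zFy}
    grandfather = {(x, y) for (x, z) in F for (w, y) in F if z == w}

    print(grandfather)  # {('Abe', 'Tommy'), ('Abe', 'Susie')}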

A function is a relation that satisfies certain global properties; most sig-


nificantly, the functionality property. A relation R is functional if together
hx, yi ∈ R and hx, zi ∈ R imply y = z. This is a mathematical way of specifying
the condition that there can only be one pair in the relation R having x as
its first entry. We discuss this in more detail below. Now, if we consider the
father-of relation given above, it clearly is not a function since one father can
have more than one child e.g. if Joe has two children (say Tommy and Susie)
then there will be two entries in F with Joe as the first entry. However, if we
take the inverse relation (we might call it has father), we get a function since
each individual has only one biological father. Mathematically, we could define
this relation as yF −1 x def= xF y. We discuss functions further in Chapter 8.
Relations and functions play a crucial role in both mathematics and com-
puter science. Within computer science, programs are usefully thought of as
functions. Relations and operations on them form the basis of most modern
database systems, so-called relational databases.

6.2 Relations
6.2.1 Binary Relations
Definition 6.1 (Binary Relation)

A (binary) relation is a subset of a Cartesian product. Given sets
A and B, if R ⊆ A × B we say R is a binary relation on A and B.

Thus, a (binary) relation is a set of pairs. A relation is a set of pairs. Say


it to yourself three times and do not forget it. Every time you get in the shower
for a week, repeat this as your mantra.

Definition 6.2 (domain and codomain)

If R ⊆ A×B then we say A is the domain of R and B is the codomain


of R.

Example 6.1. Let A and B be sets. Any subset R, R ⊆ A × B is a relation.


Thus A × B itself is a relation. This one is not very interesting since every
element of A is related to every element of B.

Example 6.2. The empty set is a relation. Recall that the empty set is a subset
of every set and so it is a subset of every Cartesian product (even the empty
one). Again, this is not a terribly interesting relation but, by the definition, it
clearly is one.

Example 6.3. Less-than (<) is a relation on (Z × Z).

< = {hx, yi ∈ Z × Z | x ≠ y ∧ ∃k : N. y = x + k}

To aid readability, binary relations are often written in infix notation e.g. hx, yi ∈
R will be written xRy. So, for example, an instance of the less-than relation
will be written 3 < 100 instead of the more pedantic h3, 100i ∈ <.

Definition 6.3. If R ⊆ A × A we say R is a relation on A.

Thus, < is a relation on Z.

6.2.2 n-ary Relations


6.2.3 Some Particular Relations
Definition 6.4 (Diagonal Relation (Identity Relation)) The diagonal re-
lation over a set A is the relation

∆A = {hx, yi ∈ A × A|x = y}

Exercise 6.1. Prove that (∆A )−1 = ∆A .

This relation is called the “diagonal” in analogy with the matrix presentation
of a relation R, where the entry for hx, yi is labeled with a 0 if hx, yi ∉ R
and with a 1 if hx, yi ∈ R.

Example 6.4. Suppose A = {0, 1, 2} and R = {h0, 0i, h1, 1i, h1, 2i, h2, 1i} then
the matrix representation appears as follows:

R 0 1 2
0 1 0 0
1 0 1 1
2 0 1 0

Under this matrix representation, the so-called diagonal relation ∆A appears
as follows:
∆A 0 1 2
0 1 0 0
1 0 1 0
2 0 0 1
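
The matrix representation is also easy to compute. Here is a small Python
sketch (an illustration, not part of the formal development) that builds the
Boolean matrix of the relation R from Example 6.4, given a fixed ordering of
the elements of A.

    A = [0, 1, 2]                        # a chosen ordering of the set A
    R = {(0, 0), (1, 1), (1, 2), (2, 1)}

    matrix = [[1 if (x, y) in R else 0 for y in A] for x in A]
    for row in matrix:
        print(row)
    # prints [1, 0, 0], [0, 1, 1], [0, 1, 0] -- the table above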

Notice that it is the matrix consisting of a diagonal of ones.¹ One problem with
the matrix representation is that you must order the elements of the underlying
set to be able to write them down along the top and side of the matrix –
this choice may have to be arbitrary if there is no natural order for the elements.

6.3 Operations on Relations


Note that since relations are sets (of pairs) all the ordinary set operations are
defined on them. In particular, unions, intersections, differences and compre-
hensions are all valid operations to perform on relations. Also, because relations
inherit their notion of equality from the fact that they are sets, relations are
equal when they are equal sets.

Definition 6.5 (subrelation) If R, S ⊆ A × B and R ⊆ S then we say R is a


subrelation of S.

We define a number of operations that are specifically defined on relations –
mainly owing to the fact that they are sets but also have additional structure
in that they are sets of pairs.

6.3.1 Inverse
Definition 6.6 (inverse) If R ⊆ A × B, then the inverse of R is

R−1 = {hy, xi ∈ B × A|xRy}

Thus, to construct the inverse of a relation, just turn every pair around i.e.
swap the elements in the pairs of R by making the first element of each pair
the second and the second element of each pair the first. The following useful
lemma says that swapping the order of the elements from a pair in a relation R
puts the new pair into the relation R−1 .

Remark 6.1. In linear algebra, the inverse, as defined here, is called the trans-
pose.

Lemma 6.1. If R ⊆ A × B, then aRb ⇔ bR−1 a.


Proof: (⇒) Assume aRb, i.e. ha, bi ∈ R. We must show that hb, ai ∈ R−1 . By
the definition of inverse, we must show that hb, ai ∈ {hy, xi ∈ B × A|hx, yi ∈ R}.
Now, since ha, bi ∈ R we know (since R ⊆ A × B) that hb, ai ∈ B × A and we
assumed ha, bi ∈ R so this case is proved.
(⇐) Assume bR−1 a and show aRb. If bR−1 a , then hb, ai ∈ {hy, xi ∈ B ×
A|hx, yi ∈ R}, i.e. b ∈ B, a ∈ A and ha, bi ∈ R which is what we were to show.
¹ Students familiar with linear algebra will know that this is the identity matrix and might
consider the relationship between composition of relations and matrix multiplication – in
particular, if you consider ordinary matrix multiplication where addition becomes ∨ and
multiplication becomes ∧, we can show the isomorphism between composition of relations
and this Boolean matrix multiplication.

Example 6.5. The inverse of the less-than relation (<) is greater-than (>).

6.3.2 Complement of a Relation


Definition 6.7 (complement) If R ⊆ A × B, then the complement of R is
the relation
R̄ = {hx, yi ∈ A × B | hx, yi ∉ R}

Corollary 6.1. For every relation R ⊆ A × B and for every a ∈ A and b ∈ B,

a R̄ b ⇔ ¬(aRb)

Exercise 6.2. Prove that if R ⊆ A × B, then R̄ = (A × B) − R

Example 6.6. The complement of the less-than relation (<) is the greater-
than-or-equal-to relation (≥).

6.3.3 Composition of Relations


Given relations whose codomains and domains match up in the proper way, we
can construct a new relation which is the composition of the two.

Definition 6.8 (composition) If R ⊆ A × B and S ⊆ B × C, then the com-


position of R and S is the relation defined as follows:

S ◦ R def= {hx, yi ∈ A × C | ∃z : B. xRz ∧ zSy}

Remark 6.2. To some, it may seem backward to write S ◦ R instead of R ◦ S.


In fact, both conventions do appear in the mathematical literature – though
the convention adopted here is the most common one – it is not the only one.
The reason for adopting this convention might be more clear when we get to
functions.

Example 6.7. Suppose we had a relation (say R) that paired names with social
security numbers and another relation that paired social security numbers with
the state they were issued in (call this relation S), then (S ◦ R) is the relation
pairing names with the states where their social security numbers were assigned.
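
Composition is directly computable for finite relations. The sketch below
follows Definition 6.8; the helper name compose, and the names and data in
the example, are invented for illustration.

    def compose(S, R):
        # S o R: apply R first, then S (Definition 6.8).
        return {(x, y) for (x, z) in R for (w, y) in S if z == w}

    R = {("Alice", "123-45-6789"), ("Bob", "987-65-4321")}          # name -> SSN
    S = {("123-45-6789", "Wyoming"), ("987-65-4321", "Colorado")}   # SSN -> state

    print(compose(S, R))  # {('Alice', 'Wyoming'), ('Bob', 'Colorado')}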

Theorem 6.1 (Composition is associative) For arbitrary sets A, B, C and


D if R ⊆ A × B, S ⊆ B × C and T ⊆ C × D then

T ◦ (S ◦ R) = (T ◦ S) ◦ R

Proof: First, note by the definition of composition that

(T ◦ (S ◦ R)) ⊆ A × D and ((T ◦ S) ◦ R) ⊆ A × D

For arbitrary hx, yi ∈ A × D we show

hx, yi ∈ (T ◦ (S ◦ R)) ⇔ hx, yi ∈ ((T ◦ S) ◦ R)

Starting on the left, by the definition of composition we know the following.

hx, yi ∈ (T ◦ (S ◦ R)) ⇔ ∃z : C. hx, zi ∈ (S ◦ R) ∧ hz, yi ∈ T

So we assume there is such a z, i.e. so far we know


i.) hx, yi ∈ (T ◦ (S ◦ R))
ii.) hx, zi ∈ (S ◦ R)
iii.) hz, yi ∈ T

Then by (ii.) and the definition of composition we obtain two more facts which
hold for some element w ∈ B.
iv.) hx, wi ∈ R
v.) hw, zi ∈ S

From (v.) and (iii.) and the definition of composition we obtain the following.

vi.) hw, yi ∈ (T ◦ S)

But then (iv.) and (vi.) together mean hx, yi ∈ ((T ◦ S) ◦ R), which establishes
the left-to-right implication; the right-to-left direction is symmetric, and this
completes the proof.


Theorem 6.2 (Composition inverse lemma) For all relations R ⊆ A × B


and S ⊆ B × C, the following identity holds.

(S ◦ R)−1 = R−1 ◦ S −1

Proof: Let A, B and C be arbitrary sets and R ⊆ A × B and S ⊆ B × C


be arbitrary relations. Note that (S ◦ R) ⊆ A × C and so the inverse relation
(S ◦ R)−1 ⊆ C × A. So, by extensionality, we must show for arbitrary a ∈ A
and c ∈ C that c(S ◦ R)−1 a ⇔ c(R−1 ◦ S −1 )a.
We reason equationally, starting on the left side with c(S ◦ R)−1 a and showing
the right side c(R−1 ◦ S −1 )a holds. By Lemma 6.1,

c(S ◦ R)−1 a ⇔ a(S ◦ R)c

Now, by definition of composition, a(S ◦ R)c if and only if there is some b ∈ B


such that both aRb and bSc hold. But then (by two applications of Lemma 6.1)
we know cS −1 b and bR−1 a also hold. By the definition of composition this
means
c(R−1 ◦ S −1 )a, which was to be shown.

Recall the definition of the diagonal relation ∆A (Def. 6.4).

Lemma 6.2. If R is any relation on A, then (R ◦ ∆A ) = R.


Proof: The theorem says ∆A is a right identity for composition. To see that
the relations (sets of pairs) (R ◦ ∆A ) and R are equal, we apply Thm 5.5.2, i.e.
we show (⊆): R ◦ ∆A ⊆ R and (⊇): R ⊆ R ◦ ∆A .
(⊆): Assume hx, yi ∈ (R ◦ ∆A ). Then, by the definition of composition, there
exists a z ∈ A such that hx, zi ∈ ∆A and hz, yi ∈ R. But by the definition of
∆A , z = x and so, replacing z by x we get hx, yi ∈ R which is what we are to
show.
(⊇): Assume hx, yi ∈ R. Then, to see that hx, yi ∈ R ◦ ∆A we must show
there exists a z ∈ A such that hx, zi ∈ ∆A and hz, yi ∈ R. Let z be x. Clearly,
hx, xi ∈ ∆A and also, by our assumption, hx, yi ∈ R.


Exercise 6.3. Prove the following lemma.

Lemma 6.3. If R is any relation on A and B, then (∆B ◦ R) = R.

Note that in rational arithmetic the reciprocal 1/x of x is the multiplicative
inverse: x ∗ (1/x) = 1. So, the multiplication of a number with its inverse gives
the identity element for the operation of multiplication. We have just shown that
the identity for composition of relations is ∆A (where A depends on the domain
and codomain of the relation.) Based on this we might make the following false
conjecture.

Conjecture 6.1 (False) For relations R ⊆ A × B:

R−1 ◦ R = ∆A

Exercise 6.4. For A = {a, b, c} and R = {ha, bi, ha, ci, hb, ai, hc, bi} show that
the conjecture is false.

Exercise 6.5. For A = {a, b} and R = {ha, bi} show that the conjecture is
false.

We will see in chapter 8 that the conjecture holds when we restrict attention
to certain functions (injections), functions being a restricted form of relations.

Definition 6.9 (iterated composition) We define the iterated composition
of a relation R on a set A with itself as follows.
R⁰ = ∆A
Rᵏ⁺¹ = R ◦ Rᵏ

Corollary 6.2. For all relations R on a set A, R¹ = R, since, by the definition
of iterated composition and by Lemma 6.2 we have: R¹ = R ◦ R⁰ = R ◦ ∆A = R.
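
Iterated composition is likewise computable for finite relations. The sketch
below repeats the compose helper from the earlier sketch so the snippet is
self-contained; the example relation is a small cycle chosen for illustration.

    def compose(S, R):
        return {(x, y) for (x, z) in R for (w, y) in S if z == w}

    def power(R, k, A):
        # R^0 = the diagonal on A; R^(k+1) = R o R^k (Definition 6.9).
        result = {(x, x) for x in A}
        for _ in range(k):
            result = compose(R, result)
        return result

    A = {0, 1, 2, 3}
    R = {(0, 1), (1, 2), (2, 3), (3, 0)}   # a 4-cycle
    print(power(R, 2, A))                  # {(0, 2), (1, 3), (2, 0), (3, 1)}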

Typically, we only consider the case where R ⊆ A × A, but the definition is


still sensible if the relation R is a binary relation on A × B, so long as B ⊆ A.

Example 6.8. Suppose R is the relation on the natural numbers associating
each number with its successor, R = {hx, yi ∈ N × N | y = x + 1}. Then Rᵏ is
the relation associating each number with its kth successor; Rᵏ = {hx, yi ∈
N × N | y = x + k}.

Exercise 6.6. Let R be the successor relation as defined in example 6.8. Prove
that
(≤ ◦R) =<

For each k ∈ N, Rᵏ is the relation that takes you directly to the places reach-
able in R by following k steps. The following two definitions collect together
the Rᵏ 's where k ranges over N⁺ (the strictly positive natural numbers²) and N.

Definition 6.10 (reachability in R)

R⁺ = ⋃_{i ∈ N⁺} Rⁱ

Definition 6.11 (connectivity of R)

R* = ⋃_{i ∈ N} Rⁱ

So, R⁺ contains all the pairs hx, yi ∈ A × A such that y is reachable from
x by following one or more edges of R. Similarly, R* contains all the pairs
hx, yi ∈ A × A such that y is reachable from x by following zero or more edges
of R. As we will see below, R⁺ is the so-called transitive closure of R (Thm 6.6)
and R* is the reflexive transitive closure of R (Thm 6.7).
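
When A is finite, R⁺ (and hence R*) can be computed by accumulating powers
of R until nothing new appears; the union must stabilize after at most |A|
rounds. A minimal Python sketch, with invented helper names:

    def compose(S, R):
        return {(x, y) for (x, z) in R for (w, y) in S if z == w}

    def transitive_closure(R):
        # R+ : the union of R^i for i >= 1, computed as a fixed point.
        closure = set(R)
        while True:
            bigger = closure | compose(closure, R)
            if bigger == closure:
                return closure
            closure = bigger

    def refl_trans_closure(R, A):
        # R* : R+ together with the diagonal on A.
        return transitive_closure(R) | {(x, x) for x in A}

    A = {0, 1, 2, 3}
    R = {(x, x + 1) for x in range(3)}     # successor, restricted to A
    print(transitive_closure(R))           # the < relation on A
    print(refl_trans_closure(R, A) ==
          {(x, y) for x in A for y in A if x <= y})   # True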

6.4 Properties of Relations


A relation may satisfy certain structural properties. The properties all say
something about the “shape” of the relation.
A relation R ⊆ A × A is
1.) reflexive ∀a : A. aRa
2.) irreflexive ∀a : A. ¬(aRa)
3.) symmetric ∀a, b : A. aRb ⇒ bRa
4.) antisymmetric ∀a, b : A. (aRb ∧ bRa) ⇒ a = b
5.) asymmetric ∀a, b : A. aRb ⇒ ¬(bRa)
6.) transitive ∀a, b, c : A. (aRb ∧ bRc) ⇒ (aRc)
7.) connected ∀a, b : A. a ≠ b ⇒ (aRb ∨ bRa)
We discuss each of these properties individually below.
² N⁺ def= N − {0}.

6.4.1 Reflexivity
Definition 6.12.
ReflA (R) def= ∀a : A. aRa

If we think of xRy as meaning we can get from x to y by following one edge in


R, then saying R is a reflexive relation means that there is an edge (or really a
loop) in R from every point to itself.

Lemma 6.4. For all relations R ⊆ A × A, if R is reflexive, and ∆A is the


diagonal relation on A, then ∆A ⊆ R.

Lemma 6.5. For all relations R, S ⊆ A × A, if R and S are reflexive, then


R ∩ S is reflexive.

Examples of reflexive relations include equality (=), less-than-or-equal (≤)
and greater-than-or-equal (≥), the subset relation (⊆) and, for propositions,
iff (⇔).

6.4.2 Irreflexivity
Definition 6.13.
IrrefA (R) def= ∀a : A. ¬(aRa)

Irreflexivity means that no element of the set is connected directly to itself.

Remark 6.3. Note that a relation can fail to be both reflexive and irreflexive.
Let A = {0, 1, 2} and R = {h0, 1i, h1, 1i, h1, 2i, h2, 0i}. Then, R is not reflexive
because h0, 0i ∉ R. But it also fails to be irreflexive since h1, 1i ∈ R.

Lemma 6.6. For all relations R, S ⊆ A × A, if R and S are irreflexive, then
R ∩ S and R ∪ S are both irreflexive.

Examples of irreflexive relations include inequality (≠), less-than (<), greater-
than (>) and the proper subset relation (⊂).

6.4.3 Symmetry
Definition 6.14.
Sym(R) def= ∀a, b : A. aRb ⇒ bRa

A relation R is symmetric if every point reachable in one step in R can be


returned to by taking a single step also in R.

Lemma 6.7. The diagonal relation ∆A is symmetric.

Lemma 6.8. If R ⊆ A × A then R = R−1 ⇔ Sym(R).

Lemma 6.9. If R ⊆ A × A then R = R−1 ⇔ R ⊆ R−1 .



6.4.4 Antisymmetry
Definition 6.15.
AntiSym(R) def= ∀a, b : A. (aRb ∧ bRa) ⇒ a = b

Antisymmetry means that if you can get from one point to another and back
in one step, then those points must have been equal.

Lemma 6.10. The diagonal relation ∆A is antisymmetric.

6.4.5 Asymmetry
Definition 6.16.
Asym(R) def= ∀a, b : A. aRb ⇒ ¬(bRa)

Asymmetry means that there is no way to get from any point back to itself in
two steps.

Lemma 6.11. For all relations R ⊆ A × A, if R is asymmetric then R is


irreflexive.

6.4.6 Transitivity
Trans(R) def= ∀a, b, c : A. (aRb ∧ bRc) ⇒ (aRc)
A relation is transitive if every place you can get in two steps, you can get
by taking a single step.

6.4.7 Connectedness
Connected(R) def= ∀a, b : A. a ≠ b ⇒ (aRb ∨ bRa)
A relation is connected if there is an edge between every pair of points (going
one direction or the other.)
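
Each of these properties is directly checkable for a relation on a small finite
set. The following Python sketch (for experimentation only; the function names
are invented) tests a relation R over a set A against the seven definitions
above.

    def reflexive(R, A):   return all((a, a) in R for a in A)
    def irreflexive(R, A): return all((a, a) not in R for a in A)
    def symmetric(R):      return all((b, a) in R for (a, b) in R)
    def antisymmetric(R):  return all(a == b for (a, b) in R if (b, a) in R)
    def asymmetric(R):     return all((b, a) not in R for (a, b) in R)
    def transitive(R):
        return all((a, d) in R for (a, b) in R for (c, d) in R if b == c)
    def connected(R, A):
        return all((a, b) in R or (b, a) in R
                   for a in A for b in A if a != b)

    A = {0, 1, 2}
    R = {(0, 1), (1, 1), (1, 2), (2, 0)}       # the relation from Remark 6.3
    print(reflexive(R, A), irreflexive(R, A))  # False False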

6.5 Closures
The idea of “closing” a relation with respect to a certain property is that of
adding just enough to the relation to make it satisfy the property (if it doesn’t
already) and to get the “smallest” such extension.

Example 6.9. Consider A and R as just presented in remark 6.3. We can


“close” R under the reflexive property by unioning the set E = {h0, 0i, h2, 2i}
with R. This is the minimal extension of R that makes it reflexive. Adding,
for example, h2, 1i does not contribute to the reflexivity of R and so it is not
added. Also note that even though E ≠ ∆A , R ∪ E = R ∪ ∆A since h1, 1i ∈ R.

Thus, the closure is the minimal extension to make a relation satisfy a prop-
erty. For some properties (like irreflexivity) there may be no way to add to the
relation to make it satisfy the property – in which case we say the closure “does
not exist”. To make R satisfy irreflexivity, we would have to remove h1, 1i.

Definition 6.17 (Closure) Given a relation R ⊆ A × B and a property P of
relations, the closure of R with respect to P is the set of relations S such
that P (S) and R ⊆ S and S is the smallest such relation,

S ∈ closure(R, P )
iff (P (S) ∧ R ⊆ S) ∧ ∀T : T ⊆ A × B ⇒ ((P (T ) ∧ R ⊆ T ) ⇒ S ⊆ T )

If we close a relation with respect to a property P that the relation already


enjoys, the result is just the relation R itself. The reader is invited to verify this
fact by proving the following lemma.

Lemma 6.12. Given a relation R ⊆ A × B and a property P of the relation,


if P (R) holds, then closure(R, P ) = R.

We now show that membership in a closure is unique.

Theorem 6.3 (Uniqueness of Closures) If R ⊆ A × A and P is a property


of relations then, the property of being a member in closure(R, P ) is unique.

Proof: Let R and P be arbitrary. The property (call it M ) that we are


showing is unique is membership in a closure i.e.

M (S) = S ∈ closure(R, P )

We recall the definition of uniqueness (Def. 5.5.3) which says


unique(M ) def= ∀R1 , R2 . (M (R1 ) ∧ M (R2 )) ⇒ (R1 = R2 )

To show this, we assume M (R1 ) and M (R2 ) for arbitrary relations R1 , R2 ⊆


A × A and show R1 = R2 . By our assumption, we know:

M (R1 ) : R1 ∈ closure(R, P )
M (R2 ) : R2 ∈ closure(R, P )

Since R1 and R2 are in the closure of R by P , we know

i.) R ⊆ R1
ii.) P (R1 )
iii.) ∀T ⊆ A × A. (R ⊆ T ∧ P (T )) ⇒ R1 ⊆ T
iv.) R ⊆ R2
v.) P (R2 )
vi.) ∀T ⊆ A × A. (R ⊆ T ∧ P (T )) ⇒ R2 ⊆ T

Using R2 for T in (iii.) yields the following.

(R ⊆ R2 ∧ P (R2 )) ⇒ R1 ⊆ R2
By (iv.) and (v.) we get that R1 ⊆ R2 . Similarly, using R1 for T in (vi.) yields
the following.
(R ⊆ R1 ∧ P (R1 )) ⇒ R2 ⊆ R1
By (i.) and (ii.) we get that R2 ⊆ R1 . But then by subset extensionality
(Thm. 5.5.2) R1 = R2 .

Since closures are unique, from now on we will simply write S = closure(R, P )
instead of S ∈ closure(R, P ). Also, since closures are unique, any relation which
has the property of being a closure must be the only one that is the closure i.e.
to prove S = closure(R, P ) we simply need to show that S has the three prop-
erties that make it the closure of R by P .
Theorem 6.4. If R ⊆ A × A then the reflexive closure of R is the relation
R ∪ ∆A
Proof: More formally, the theorem says
closure(R, ReflA ) = R ∪ ∆A
Thus, to show that R ∪ ∆A is the reflexive closure, (by the definition of
closure) we must show three things:
i.) ReflA (R ∪ ∆A )
ii.) R ⊆ (R ∪ ∆A )
iii.) ∀T : T ⊆ A × A ⇒ ((R ⊆ T ∧ ReflA (T )) ⇒ (R ∪ ∆A ) ⊆ T )

i.) More particularly, we must show that ∀x : A. hx, xi ∈ (R ∪ ∆A ). Choose
an arbitrary x ∈ A. Then, by the membership property of unions, we must
show that hx, xi ∈ R or hx, xi ∈ ∆A . But by the definition of membership in a
comprehension, hx, xi ∈ ∆A iff hx, xi ∈ A × A (which is obviously true since x
was arbitrarily chosen from the set A) and x = x, which trivially holds. So, we
conclude that (i) holds.

ii.) We must show that R ⊆ (R ∪ ∆A ). But this is true by Thm 5.8 from
Chapter 5.
iii.) Finally, we must show that R ∪ ∆A is the least such set, i.e. that
∀T : T ⊆ A × A ⇒ ((R ⊆ T ∧ Ref lA (T )) ⇒ (R ∪ ∆A ) ⊆ T )
To see this, choose an arbitrary relation T ⊆ A × A. Assume R ⊆ T and
Ref lA (T ). We must show that (R ∪ ∆A ) ⊆ T . Let x be an arbitrary element
of (R ∪ ∆A ). Then, there are two cases: x ∈ R or x ∈ ∆A . If x ∈ R, since we
have assumed R ⊆ T , we know x ∈ T . In the other case, x ∈ ∆A , that is, x is
of the form hy, yi for some y in A. But since we assumed Ref lA (T ), we know
that ∀z : A.hz, zi ∈ T so, in particular, hy, yi ∈ T , i.e. x ∈ T .

Definition 6.18 (Symmetric) The predicate Sym(R) means R ⊆ A × A is


symmetric.
Sym(R) def= ∀x, y : A. xRy ⇒ yRx

Note that unlike reflexivity, symmetry does not require us to know what the
full set A is, it only requires us to know what pairs are in the relation R.

Example 6.10. For any set A, the empty relation is symmetric, though the
empty relation is reflexive if and only if A = ∅.

Theorem 6.5 (Symmetric Closure) If R ⊆ A × A then the symmetric clo-


sure of R is the relation R ∪ R−1

Example 6.11. Let A = {0, 1, 2, 3} and R = {h0, 1i, h1, 2i, h2, 3i, h3, 0i}. Then

R¹ = R = {h0, 1i, h1, 2i, h2, 3i, h3, 0i}
R² = R ◦ R = {h0, 2i, h1, 3i, h2, 0i, h3, 1i}
R³ = R ◦ R² = {h0, 3i, h1, 0i, h2, 1i, h3, 2i}
R⁴ = R ◦ R³ = {h0, 0i, h1, 1i, h2, 2i, h3, 3i}
R⁵ = R ◦ R⁴ = {h0, 1i, h1, 2i, h2, 3i, h3, 0i}

Note that R⁵ = R. The transitive closure of R is the union

R ∪ R² ∪ R³ ∪ R⁴

Theorem 6.6 (Transitive Closure) If R ⊆ A × A then the transitive closure
of R is the relation
R⁺ = ⋃_{i>0} Rⁱ

Theorem 6.7. The reflexive transitive closure of a relation R ⊆ A × A is the
relation
R* = ⋃_{i ∈ N} Rⁱ

Remark 6.4. The notation R∗ to denote the reflexive transitive closure of a


relation R is borrowed from the theory of strings and regular languages. It
was first introduced by Stephen Cole Kleene (1909-1994) and, in the context of
regular languages, is called the Kleene Closure of R.
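
For finite relations, the reflexive and symmetric closures are one-line
computations, and the transitive closure was sketched (as R⁺) in Section 6.3.
A small Python illustration, with invented helper names:

    def reflexive_closure(R, A):
        # Theorem 6.4: R together with the diagonal on A.
        return R | {(a, a) for a in A}

    def symmetric_closure(R):
        # Theorem 6.5: R together with its inverse.
        return R | {(b, a) for (a, b) in R}

    A = {0, 1, 2}
    R = {(0, 1), (1, 1), (1, 2), (2, 0)}   # Remark 6.3 again
    print(reflexive_closure(R, A) - R)     # {(0, 0), (2, 2)} -- the set E of Example 6.9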

6.6 Properties of Operations on Relations


Just as we had properties that may or may not hold for relations, we can consider
properties of the operations on relations. This idea of properties of operations
is a “higher order” concept.

Definition 6.19 (Involution) A unary operator ′ : A → A is an involution
if it is its own inverse, i.e. if x′′ = x for all x ∈ A.

Lemma 6.13 (complement involutive) For every pair of sets A and B and
every relation R, R ⊆ A × B, the following identity holds.

R̿ = R

Proof: By extensionality. Choose arbitrary a ∈ A and b ∈ B and show a R̿ b ⇔
aRb. We reason equationally.

a R̿ b ⇔ ¬(a R̄ b) ⇔ ¬¬(aRb) ⇔ aRb

Theorem 6.8 (Inverse involutive) For every pair of sets A and B, and for
every R ⊆ A × B
R = (R−1 )−1
Proof: Note that since R ⊆ A × B is a set, we must show (by extensionality)
that for arbitrary z, that z ∈ R ⇔ z ∈ (R−1 )−1 . Since R ⊆ A × B, z is of
the form ha, bi for some a ∈ A and some b ∈ B, thus, we must show aRb ⇔
a(R−1 )−1 b. Two applications of Lemma 6.1 give the following.

aRb ⇔ bR−1 a ⇔ a(R−1 )−1 b


Chapter 7

Equivalence and Order

7.1 Equivalence Relations


Equivalence relations generalize the notion of what it means for two elements
of a set to be equal.

Definition 7.1 (equivalence relation) A relation on a set A that is reflexive
(on A), symmetric and transitive is called an equivalence relation on A. We
will sometimes write EquivA (R) to mean R is an equivalence relation on A.
EquivA (R) def= ReflA (R) ∧ Sym(R) ∧ Trans(R)

Example 7.1. Ordinary equality on numbers is an equivalence relation.

Example 7.2. In propositional logic, the if-and-only-if connective [Def. 2.2.3]


is an equivalence on propositions. To see this we must show three things:

i.) ∀P.P ⇔ P (Reflexive)


ii.) ∀P, Q.(P ⇔ Q) ⇒ (Q ⇔ P ) (Symmetric)
iii.) ∀P, Q, R. (P ⇔ Q) ∧ (Q ⇔ R) ⇒ (P ⇔ R) (Transitive)

But these theorems have all been proved previously as exercises, so ⇔ is an


equivalence on propositions.

Example 7.3. The reflexive closure of the sibling relation is an equivalence
relation. To see this, first note that everyone is related to him or herself by
this relation, because we explicitly took the reflexive closure. If A is the
sibling of B, then B is the sibling of A so the relation is symmetric. And
finally, if A is the sibling of B and B is the sibling of C, then A is the
sibling of C so the relation is transitive.
Note that, under this relation, if an individual has no brothers or sisters,
there is no other person (except herself by virtue of the reflexive closure)
related to her.


Lemma 7.1. For any set A, the diagonal relation ∆A is an equivalence relation
on A.
This is the so-called “finest” equivalence (see Definition 7.3 below) on any
set A and is defined by actual equality on the elements of the set A. To see
this recall that ∆A = {hx, yi ∈ A × A | x = y}.
Lemma 7.2. For any set A, the complete relation A × A is an equivalence
relation on A.
This equivalence is rather uninteresting, it says every element in A is equiv-
alent to every other element in A. It is the “coarsest” equivalence relation on
the set A.

7.1.1 Equivalence Classes


It is often useful to define the set of all the elements from a set A equivalent to
some particular element under an equivalence relation R ⊆ A × A.
Definition 7.2 (equivalence class) If A is a set, R is an equivalence relation
on A and x is an element of A, then the equivalence class of x modulo R (we
write [x]R ) is defined as follows.
[x]R = {y ∈ A | xRy}
Example 7.4. If S is the reflexive closure of the sibling relation, then for any
individual x, [x]S is the set consisting of x and of his or her brothers and sisters.
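
Equivalence classes are easy to compute for a finite equivalence relation. The
Python sketch below uses the “same parity” relation on a small set as its
example; it assumes R really is an equivalence relation on A.

    def eq_class(x, R, A):
        # [x]_R = {y in A | xRy}
        return frozenset(y for y in A if (x, y) in R)

    A = {0, 1, 2, 3}
    R = {(x, y) for x in A for y in A if x % 2 == y % 2}  # same parity

    print(eq_class(0, R, A))  # frozenset({0, 2})
    print(eq_class(1, R, A))  # frozenset({1, 3})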
Theorem 7.1. If A is a set, R is an equivalence relation on A, and x and y are
elements of A, then the following statements are equivalent.
1. xRy
2. [x]R ∩ [y]R ≠ ∅
3. [x]R = [y]R
Proof: To prove these statements are equivalent, we will show (i) ⇒ (ii),
(ii) ⇒ (iii) and finally, (iii) ⇒ (i).
[(i) ⇒ (ii)] Assume xRy. Then, by the definition of [x]R , y ∈ [x]R . Also, by
reflexivity of R (recall it is an equivalence relation) yRy and so y ∈ [y]R . But
then, y is in both [x]R and [y]R , hence the intersection is not empty and we
have shown [x]R ∩ [y]R ≠ ∅.
[(ii) ⇒ (iii)] Assume [x]R ∩ [y]R ≠ ∅. Then, there must be some element (say z)
such that z ∈ [x]R and z ∈ [y]R . We show [x]R = [y]R , i.e. we show that
∀w. w ∈ [x]R ⇔ w ∈ [y]R . Choose an arbitrary w and assume w ∈ [x]R , i.e.
xRw. By the symmetry of R, we know wRx. Now, since z ∈ [x]R , xRz and by
transitivity of R, wRz holds as well. Now, since z ∈ [y]R , yRz and by symmetry
we have zRy and by transitivity we get wRy. Finally, another application of
symmetry allows us to conclude yRw, i.e. w ∈ [y]R . The reverse implication is
symmetric, so w ∈ [x]R ⇔ w ∈ [y]R for arbitrary w; thus [x]R = [y]R if their
intersection is non-empty.

[(iii) ⇒ (i)] Assume [x]R = [y]R . Then, every element of [x]R is in [y]R and
vice-versa. But, because R is reflexive, x ∈ [x]R and since y ∈ [y]R , y ∈ [x]R .
But this is true only if xRy holds.


Definition 7.3 (Fineness of an Equivalence) An equivalence relation ≡1 ⊆


A × A is finer than the equivalence relation ≡2 ⊆ A × A if,

∀x : A. [x]≡1 ⊆ [x]≡2

7.1.2 The Quotient Construction*


In higher mathematics a quotient is a structure induced by an equivalence rela-
tion.

Definition 7.4 (Quotient) If ≡ is an equivalence relation on the set A, then


we define
A/≡ def= {[x]≡ | x ∈ A}
This set of sets is called the quotient of A modulo ≡.

Lemma 7.3. For every set A, A/∆A = {{x} | x ∈ A}, a set in one-to-one
correspondence with A.

Lemma 7.4. For every non-empty set A, A/A² = {A}.

7.1.3 Q is a Quotient
Consider the fractions F defined as follows.

Definition 7.5.
F = Z × Z≠0
where Z≠0 is the set of non-zero integers.

You may recognize F as the fractions, e.g. we can think of the first number as
the numerator and the second as the denominator, so ha, bi is the fraction a/b.
Now, note that the equality on fractions (i.e. the equality on pairs – ha, bi =
hc, di ⇔ a = c ∧ b = d) is not the equality for rational numbers (usually denoted
Q.) Notice that, for example,

h1, 2i ≠ h2, 4i

but of course, for rational numbers

1/2 = 2/4

We define an equivalence relation on pairs of fractions that does reflect equal-
ity on rationals as follows:

Definition 7.6.
≡Q def= {hhx, yi, hz, wii ∈ F × F | xw = yz}

Less pedantically we might write

hx, yi ≡Q hz, wi def= xw = yz

This is the ordinary cross-multiplication rule you learned in grade school for
determining if two rational numbers are equal.
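
The cross-multiplication test is immediate to program. A small Python sketch,
with fractions represented as pairs whose second component is non-zero (the
helper name equiv_q is invented):

    def equiv_q(f1, f2):
        # <x, y> is equivalent to <z, w> iff xw = yz (Definition 7.6).
        (x, y), (z, w) = f1, f2
        return x * w == y * z

    print((1, 2) == (2, 4))         # False: unequal as pairs
    print(equiv_q((1, 2), (2, 4)))  # True:  equal as rational numbers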

Exercise 7.1. Prove that ≡Q is indeed an equivalence relation on fractions i.e.


you must show that it is (i.) reflexive, (ii.) symmetric and (iii.) transitive.
1. ∀ha, bi ∈ F. ha, bi ≡Q ha, bi
2. ∀ha, bi, hc, di ∈ F. ha, bi ≡Q hc, di ⇒ hc, di ≡Q ha, bi
3. ∀ha, bi, hc, di, he, f i ∈ F.
(ha, bi ≡Q hc, di ∧ hc, di ≡Q he, f i) ⇒ ha, bi ≡Q he, f i

Exercise 7.2. Describe the equivalence class [h2, 4i]≡Q

Exercise 7.3. Describe the equivalence class [hx, yi]≡Q

The rational numbers are defined by a quotient construction.

Definition 7.7 (Rational Numbers)


Q def= F/ ≡Q

This conception of the rational numbers is perhaps confusing. It leads to


the following dialog.

Question: “What is a rational number?”

Answer: “A rational number is an equivalence class of fractions.”
Question: “But then what does it mean to add two rational numbers?”
Answer: “Addition is an operation that maps a pair of rational
numbers (equivalence classes of fractions) to a rational
number (another equivalence class of fractions).”

7.1.4 Partitions
Definition 7.8 (Partition) A partition of a set A is a set of non-empty subsets
of A (we refer to these sets as Ai where i ∈ I, I ⊆ N). Each Ai is called a block
(or a component) and a collection of such Ai is a partition if it satisfies the
following three properties:

i.) For all i ∈ I, Ai ≠ ∅


ii.) the sets in the blocks are pairwise disjoint, i.e.

∀i, j : I. i ≠ j ⇒ (Ai ∩ Aj = ∅)

and,
iii.) the union of the sets Ai , i ∈ I is the set A itself:

⋃_{i ∈ I} Ai = A

Example 7.5. If A = {1, 2, 3} then the following are all the partitions of A.

{{1, 2, 3}}
{{1}, {2, 3}}
{{1, 2}, {3}}
{{1, 3}, {2}}
{{1}, {2}, {3}}

Theorem 7.2. For any set A, R ⊆ A × A is an equivalence relation if and only
if the set of its equivalence classes forms a partition, i.e.

EquivA (R) ⇔ Partition(⋃_{x ∈ A} {[x]R })

Exercise 7.4. Prove Thm. 7.2.

Counting Partitions
Definition 7.9 (k-partition) A k-partition of a set A is a partition of A into
k subsets.

So for example, {{1, 2, 3}} is a 1-partition of {1, 2, 3}, {{1}, {2, 3}}, {{1, 2}, {3}},
and {{1, 3}, {2}} are all 2-partitions while {{1}, {2}, {3}} is a 3-partition.

Definition 7.10 (Counting k-partitions) The numbers computed by the fol-


lowing recurrence relation are called Stirling Numbers of the second kind. They
compute the number of k-partitions of a set of size n.
S(n, 1) = 1
S(n, n) = 1
S(n, k) = S(n − 1, k − 1) + k · S(n − 1, k)

Definition 7.11 (Counting Equivalence Relations) There are as many equiv-
alence relations on a set of size n as there are k-partitions for each k ∈ {1 · · · n}:

∑_{k=1}^{n} S(n, k)
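
The recurrence translates directly into a short program. The following Python
sketch computes Stirling numbers of the second kind and the resulting count of
equivalence relations (these totals are known as the Bell numbers); the
function names are invented for illustration.

    from functools import lru_cache

    @lru_cache(maxsize=None)
    def stirling2(n, k):
        # Number of k-partitions of an n-element set (Definition 7.10).
        if k == 1 or k == n:
            return 1
        return stirling2(n - 1, k - 1) + k * stirling2(n - 1, k)

    def count_equivalences(n):
        # Number of equivalence relations on a set of size n.
        return sum(stirling2(n, k) for k in range(1, n + 1))

    print([stirling2(3, k) for k in (1, 2, 3)])  # [1, 3, 1]
    print(count_equivalences(3))                 # 5, matching Example 7.5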

7.1.5 Congruence Relations*


It is all well and good to define equivalence relations on a set, but usually
we consider sets together with operations on them, and for many applications
we expect the equivalence to, in some sense, respect those operations. This is
the idea of the congruence relation – a congruence is an equivalence that
respects, or is compatible with, one or more operators. In ordinary situations
with equality, we expect the substitution of “equals” for “equals” to not change
anything. So, if x = y, then x can be replaced with y in any context e.g. if x
and y are equal, we expect f (x) = f (y).
The idea of congruence is to ensure that substitution of equivalent for equiv-
alent in an operator results in equivalent elements.
Definition 7.12. If f is a k-ary operation on the set A and ≡ is an equivalence
relation on A then ≡ is a congruence for f if
∀x1 , · · · , xk , y1 , · · · , yk : A.(∀i : {1..k}.xi ≡ yi ) ⇒ f (x1 , · · · , xk ) ≡ f (y1 , · · · , yk )
This may look pretty complicated, but is the general form for an arbitrary
k-ary operation. Here’s the restatement for a binary operator.
Definition 7.13. If f is a binary operation on the set A and ≡ is an equiv-
alence relation on A then ≡ is a congruence for f if
∀x1 , x2 , y1 , y2 : A.(x1 ≡ y1 ∧ x2 ≡ y2 ) ⇒ f (x1 , x2 ) ≡ f (y1 , y2 )
We will sometimes write Cong(≡, f ) to indicate that the equivalence relation
≡ is compatible with the operator f .

Operations on Rational Numbers


Consider the following operations on fractions (Def. 7.5).
Definition 7.14 (Multiplication of fractions) We define multiplication of
fractions pointwise as follows:
ha, bi ∗F hc, di def= hac, bdi
where ac and bd denote ordinary multiplication of integers.
Definition 7.15 (Addition of fractions) We define addition of fractions as
follows.
ha, bi +F hc, di def= had + bc, bdi
Exercise 7.5. Prove that the multiplication and addition of fractions both
result in fractions i.e.
i.) ∀ha, bi, hc, di : F. (ha, bi ∗F hc, di) ∈ F
ii.) ∀ha, bi, hc, di : F. (ha, bi +F hc, di) ∈ F
Lemma 7.5. The relation ≡Q is compatible with the operation ∗F i.e. ≡Q is
congruent with respect to ∗F .

Proof: We must show that

∀ha, bi, hc, di, he, f i, hg, hi : F.
(ha, bi ≡Q hc, di ∧ he, f i ≡Q hg, hi)
⇒ ha, bi ∗F he, f i ≡Q hc, di ∗F hg, hi
Assume that ha, bi, hc, di, he, f i, hg, hi ∈ F are arbitrary. Then, since these pairs
are fractions, we know that b ≠ 0, d ≠ 0, f ≠ 0 and h ≠ 0. Also, assume
ha, bi ≡Q hc, di and he, f i ≡Q hg, hi. Then, by the definition of ≡Q we know
ad = bc and eh = f g. We must show
ha, bi ∗F he, f i ≡Q hc, di ∗F hg, hi
By definition of ∗F we know ha, bi ∗F he, f i is the pair hae, bf i and hc, di ∗F
hg, hi is the pair hcg, dhi. We must show that these results are equivalent i.e.
(hae, bf i ≡Q hcg, dhi). To show this, we must show that aedh = bf cg. Now,
since ad = bc and since d ≠ 0 we can divide both sides by d yielding the equality
a = bc/d. Using this fact together with eh = f g we show aedh = bf cg as follows.

aedh = (bc/d)(edh) = bcedh/d = bceh = bcf g = bf cg

The significance of the lemma is that the operation ∗F respects the equiva-
lence ≡Q i.e. even though it is defined as an operation on fractions, substitution
of ≡Q -equivalent elements yields ≡Q -equivalent results.
Lemma 7.6. The relation ≡Q is compatible with the operation +F i.e. ≡Q is
congruent with respect to +F .
Exercise 7.6. Prove Lemma 7.6

7.2 Order Relations


Equivalence relations abstract the notion of equality while order relations ab-
stract the notion of order. We are all familiar with orderings on the integers,
less-than (<) and less-than-or-equal (≤).

7.2.1 Partial Orders


Definition 7.16 (Partial Order)
If R ⊆ A × A is a relation that is reflexive, antisymmetric and
transitive we call it a partial order.
Partial order relations are usually denoted by symbols of the form ≤, ⊆, ⊑
or ⪯ and are written in infix notation.
Example 7.6. The subset relation (Def. 5.2) is a partial order. To see this, we
must show that the subset relation is: (i.) reflexive, (ii.) antisymmetric and
(iii.) transitive.

i.) For every set A, A ⊆ A (see Thm. 5.5.1) so ⊆ is a reflexive relation.


ii.) Antisymmetry holds by Thm. 5.5.2.
iii.) To see that transitivity holds for the ⊆ relation, assume that A ⊆ B and
B ⊆ C and show that A ⊆ C. Clearly, since every element of A is in B and
every element of B is in C, every element of A is in C i.e. A ⊆ C.

Definition 7.17 (Strict Partial Order)

If R ⊆ A × A is a relation that is irreflexive and transitive we call
it a strict partial order.

7.2.2 Products and Sums of Orders


We can construct new partial orders from existing ones by various kinds of
compositions.

Cartesian Product
Definition 7.18 (Ordered product) If hP1 , ⊑1 i and hP2 , ⊑2 i are posets, then
hP1 × P2 , ⊑i is a poset, where
hx, yi ⊑ hz, wi def= x ⊑1 z ∧ y ⊑2 w

This is a pointwise ordering.

Lexicographic Product
Definition 7.19. If hA, ⊑1 i and hB, ⊑2 i are posets, then hA × B, ⊑i is a poset
where
hx, yi ⊑ hz, wi def= x ⊏1 z ∨ (x = z ∧ y ⊑2 w)
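
The difference between the two product orderings is easy to see in code. A
minimal Python sketch, comparing pairs of numbers where both component
orders are the usual ≤ on integers (the function names are invented):

    def pointwise_le(p, q):
        (x, y), (z, w) = p, q
        return x <= z and y <= w                # Definition 7.18

    def lex_le(p, q):
        (x, y), (z, w) = p, q
        return x < z or (x == z and y <= w)     # Definition 7.19

    print(pointwise_le((1, 5), (2, 3)))  # False: incomparable pointwise
    print(lex_le((1, 5), (2, 3)))        # True:  1 < 2 decides it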
Chapter 8

Functions

Some relations have the special property that they are functions. A relation
R ⊆ A × B is a function if each element of the domain A gets mapped to one
element and only one element of the codomain B.

8.1 Functions
Definition 8.1 (function) A function from A to B is a relation (f ⊆ A × B)
satisfying the following properties,

i.) ∀x : A.∃y : B.hx, yi ∈ f


ii.) ∀x : A.∀y, z : B.(hx, yi ∈ f ∧ hx, zi ∈ f ) ⇒ y = z

Relations having the first property are said to be total and relations satisfying
the second property are said to be functional or to satisfy the functionality
property.

Remark 8.1. Since we usually write f (x) = y for functions instead of hx, yi ∈
f , we can restate these properties in the more familiar notation as follows.

i.) ∀x : A.∃y : B.f (x) = y


ii.) ∀x : A, y, z : B.(f (x) = y ∧ f (x) = z) ⇒ y = z

There is some danger in using the notation f (x) = y if we do not know that f
is a function.
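
Both properties are directly checkable when A, B and f are finite. A small
Python sketch, with f represented as a set of pairs (the helper names are
invented):

    def is_total(f, A):
        return all(any(x == a for (x, _) in f) for a in A)

    def is_functional(f):
        return all(y == z for (x, y) in f for (w, z) in f if x == w)

    A, B = {0, 1, 2}, {"a", "b"}
    f = {(0, "a"), (1, "a"), (2, "b")}
    print(is_total(f, A) and is_functional(f))  # True: f is a function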

We denote the set of all functions from A to B as A → B, so if f ⊆ A × B


is a function we write f : A → B or f ∈ A → B.

Definition 8.2 (domain, codomain, range) If f : A → B, we call the set


A the domain of f and the set B the codomain of f . The set
rng(f, A, B) def= {y ∈ B | ∃x : A. f (x) = y}


is called the range of f . We write dom(f ) to denote the set which is the domain
of f , codom(f ) to denote the codomain and simply rng(f ) to denote its range
if A and B are clear from the context.

It is worth considering what it means if the domain and/or codomain of a
function are empty.

Lemma 8.1. [Empty Domain] For every set A, ∀f. f ∈ ∅ → A ⇔ f = ∅

Proof: Choose an arbitrary set A and show both directions:
(⇒) Suppose f is a function in ∅ → A; then f ⊆ ∅ × A = ∅, so f = ∅.
(⇐) Assume f = ∅ and show f ∈ ∅ → A. We must show three things: i.) f ⊆
∅ × A, but ∅ × A = ∅ and f = ∅ so this is trivially true; ii.) f is functional –
which is vacuously true since the domain is empty; and iii.) that f is total,
which is also vacuously true.


Lemma 8.2 (Empty Codomain) For every set A

∀f : A → ∅. A = ∅

Proof: Suppose f is a function in A → ∅, then f ⊆ A × ∅ and since A × ∅ = ∅,
f ⊆ ∅, i.e. f = ∅. But also, f is a function so it is both functional and total.
The empty set is trivially functional. But notice that for f to be total the
following property must hold.
∀x : A.∃y : ∅.hx, yi ∈ f
More specifically, since f = ∅ we must show

∀x : A.∃y : ∅.hx, yi ∈ ∅

This is vacuously true if A = ∅ and is false otherwise, thus it must be the case
that A = ∅.


Corollary 8.1. ∀f ∈ ∅ → ∅. f = ∅

8.2 Extensionality (equivalence for functions)


We define function equality extensionally, as equality on the underlying sets as
follows:

Definition 8.3 (extensionality) For functions f, g : A → B,

f = g def= ∀x : A. f (x) = g(x)

Thus, functions are equal if they are the same set of pairs. Since they are, by
definition functional, this amounts to checking that they agree on every input.

Remarks on Extensional Equivalence


The definition of equality for functions is based on the so-called extensional view
of functions i.e. functions as sets of pairs. Within computer science, we might
be interested in notions of equivalence that take into account other properties
besides simply the input-output behavior of functions. For example, programs
(say P1 and P2 ) that sort lists of numbers are functions from lists to lists.
When we think of P1 and P2 as sets, P1 = P2 must be true if they both actually
implement sorting correctly. So for example h[2; 1; 4], [1; 2; 4]i is a pair in both
P1 and in P2 . However, the two programs may have significantly different run-
time complexities, so this is a way in which they are not equal. Program P1 may
implement the merge sort algorithm which has O(n log n) time complexity while
program P2 may implement insertion sort which has time complexity O(n2 ). So,
if we consider their run-time complexities, the two are clearly not equivalent.
Many other properties are not accounted for by extensional equality; indeed,
the only property that is accounted for is the input-output behavior. Properties
such as run time or length of a program that make distinctions other than the
one made by extensionality are called intensional properties.

8.3 Operations on functions


Since functions are relations, and thus are sets of pairs, all the operations on
sets and relations make sense as operations on functions.

8.3.1 Restrictions and Extensions


Definition 8.4. If f ∈ A → B and A′ ⊆ A, then f ∩ (A′ × B) is called the
restriction of f to A′ and is sometimes written f /A′ or f ↓ A′ .

Exercise 8.1. Prove functions are closed under restrictions, i.e. if f ↓ A′ is a
restriction of f : A → B to A′ where A′ ⊆ A, then f ↓ A′ is a function.

Definition 8.5. If g ∈ A′ → B is the restriction of f ∈ A → B then we say f
is the extension of g.

Lemma 8.3. f ∈ A → B is an extension of g ∈ A′ → B iff g ⊆ f .

8.3.2 Composition of Functions


Recall that functions are simply relations that are both functional and total.
This means the operation of composition for relations (Def. 6.6.8) can be ap-
plied to functions as well. Given functions f : A → B and g : B → C, their
composition is the function g ◦ f which is simply calculated by applying f and
then applying g.
Right away we must ask ourselves whether the relation obtained by compos-
ing functions results in a relation that is also a function.

Lemma 8.4 (function composition) Consider functions f : A → B and g :


B → C, the following picture illustrates the situation.

f g
A −→ B −→ C

By composition of relations, we know there is a relation (g ◦ f ) ⊆ A × C which


we claim is also a function i.e.

(g ◦ f ) ∈ A → C

Proof: To show (g ◦ f ) ∈ A → C, we must show

i.) ∀x : A. ∀y, z : C. hx, yi ∈ (g ◦ f ) ∧ hx, zi ∈ (g ◦ f ) ⇒ y = z


ii.) ∀x : A. ∃y : C. hx, yi ∈ (g ◦ f )

(i.) Assume x ∈ A and y, z ∈ C for arbitrary x, y and z. Also, assume


hx, yi ∈ (g ◦ f ) and hx, zi ∈ (g ◦ f ) and show y = z. By the definition of
composition, we know there is a w ∈ B such that f (x) = w and g(w) = y.
Similarly we know there is a ŵ ∈ B such that f (x) = ŵ and g(ŵ) = z. Now, since
f is a function we know w = ŵ and similarly, since g is a function, g(w) = g(ŵ)
so y = z and we have shown that (g ◦ f ) is functional.
(ii.) Assume x ∈ A and show (∗) ∃y : C. hx, yi ∈ (g ◦ f ). Now, since f is a
function in A → B it is total and so there is a w ∈ B such that f (x) = w.
Similarly, since g is a function in B → C there is a z ∈ C such that g(w) = z.
But then hx, zi ∈ (g ◦ f ) and so use z as the witness for y in (∗).


Remark 8.2. Having proved this theorem we say functions are closed under
composition. That is, we preserve the property of being a function when we
compose two functions.

In general, this question is rather fundamental and can be asked in many


contexts.

When is a mathematical structure closed with respect to some


operation on it?

By “closed with respect to”, we mean that applying the operation preserves
the property of having a particular structure. In the case of functions and the
composition operation, the property we are considering is whether a relation
is a function and the question “When is the composition of functions also a
function?” is answered by the previous lemma, Always.
Thus, function composition is a binary operator on pairs of functions anal-
ogous to the way addition is a binary operation on pairs of integers. The
analogy goes deeper. Addition is associative e.g. if a, b and c are numbers,
a + (b + c) = (a + b) + c. Function composition is associative as well.

Remark 8.3. Note that because the composition of relations is associative (see
Thm. 6.6.1.) the associativity of function composition is obtained for free. We
have included a direct proof of the associativity for function composition here
to illustrate the difference in the proofs. This is a case where the proof for the
special case (function composition) is easier than the more general case (relation
composition.)

Theorem 8.1 (Associativity of function composition) If f : A → B, g :


B → C and h : C → D then

h ◦ (g ◦ f ) = (h ◦ g) ◦ f

Proof: To show two functions are equal, we must apply extensionality. To show

h ◦ (g ◦ f ) = (h ◦ g) ◦ f

we must show that for every x ∈ A,

(h ◦ (g ◦ f ))(x) = ((h ◦ g) ◦ f )(x)

Pick an arbitrary x. The following sequence of equalities (derived by unfolding


the definition of composition and then folding it back up) shows the required
equality holds.
(h ◦ (g ◦ f ))(x)
= h((g ◦ f )(x))
= h(g(f (x)))
= (h ◦ g)(f (x))
= ((h ◦ g) ◦ f )(x)


Zero (0) is both a left and right identity for addition i.e. 0 + a = a and
a + 0 = a. Similarly, the identity function Id(x) = x is a left and right identity
for the operation of function composition.

Lemma 8.5 (Identity function) If A is a set, ∆A is the identity function on


A i.e. ∆A is a function and ∀x : A.∆A (x) = x.

Exercise 8.2. Prove Lemma 8.5.

Remark 8.4. We will sometimes write IdA for ∆A when we are thinking of
the relation as the identity function on A.

Theorem 8.2 (Left Right identity lemma) For any sets A and B, and any
function f : A → B, IdA is a left identity for f and IdB is a right identity for f .

IdB ◦ f = f and f ◦ IdA = f



Proof: To show two functions are equal, we apply extensionality, choosing an


arbitrary x.
(f ◦ IdA )(x) = f (IdA (x)) = f (x)
Thus IdA is a right identity for ◦. Similarly,

(IdB ◦ f )(x) = IdB (f (x)) = f (x)

Thus (f ◦ IdA ) = f and IdB ◦ f = f and the theorem has been shown.


8.3.3 Inverse
Given a relation R ⊆ A × B, recall Definition 6.6.6 of the inverse relation

R−1 = {hy, xi ∈ B × A | xRy}

Now, consider a function f : A → B. Since f is a relation, the relation f −1


exists; but is it a function?
We ask the question, “When are functions closed under the inverse opera-
tion?”, i.e. when is the inverse of a function still a function. We can try to begin
to answer the question by considering cases where the inverse might fail to be
a function.

Example 8.1. Let f : A → B be a function. Suppose that for some x, y ∈ A,


where x 6= y, that for some z, f (x) = z and f (y) = z. But then, hz, xi ∈ f −1
and hz, yi ∈ f −1 so, in this case, f −1 is not a function since it violates the
functionality condition. We conclude that if any two elements of A get mapped
to the same element of B, then f −1 is not a function.

Example 8.2. Let f : A → B be a function. Suppose that for some z ∈ B,


there is no x such that f (x) = z. But then, there is no pair in f −1 whose first
element is z. This violates the totality condition and so f −1 is not a function if
there is some element of B not mapped onto by f .

Functions that rule out the behavior described in Example 8.1 are said to
be one-to-one or are called injections. Functions that rule out the behavior
described in Example 8.2 are the onto functions which are also called surjections.

8.4 Properties of Functions


In the previous section we analyzed what conditions might rule out the
possibility that the inverse of a function is itself a function. In this section,
we formalize those conditions as logical properties. Functions that satisfy these
properties have, in some sense, more structure than functions that don’t.

8.4.1 Injections
Definition 8.6 (injection, one-to-one) A function f : A → B is an injection
(or one-to-one) if every element of B is mapped to by at most one element of
A. Formally, we write:

∀x, y : A. f (x) = f (y) ⇒ x = y

The definition says that if f maps x and y to the same element, then it must
be that x and y are one and the same i.e. that x = y. Injections are also called
one-to-one functions.

Theorem 8.3 (Composition of injections) For sets A, B and C, and func-


tions f : A → B and g : B → C, then
a.) if f and g are injections, then g ◦ f is an injection, and
b.) If g ◦ f is an injection then f is an injection.

Proof:

Proof of part (a): Since f and g are injections we know

i.) ∀x, y : A.f (x) = f (y) ⇒ x = y


ii.) ∀x, y : B.g(x) = g(y) ⇒ x = y

We must show that (g ◦ f ) is an injection, i.e. that:

∀x, y : A. (g ◦ f )(x) = (g ◦ f )(y) ⇒ x = y

We choose arbitrary x and y from the set A and assume (g ◦ f )(x) = (g ◦ f )(y)
to show x = y. Now, by definition of function composition (g ◦ f )(x) = g(f (x))
and (g ◦ f )(y) = g(f (y)). Using f (x) for x and f (y) for y in (ii.) we get that
f (x) = f (y), but then, by the fact that f is an injection (i.) we know x = y.

Proof of part (b): Left as an exercise.




8.4.2 Surjections
Definition 8.7 (surjection, onto) A function f : A → B is a surjection (or
onto) if every element of B is mapped to by some element of A under f .

∀y : B. ∃x : A. f (x) = y

Corollary 8.2 (surjection characterization lemma) A function is a surjec-


tion if and only if codom(f ) = rng(f ).

Exercise 8.3. Prove Corollary 8.2.



Theorem 8.4 (Composition of surjections) For sets A, B and C, and func-


tions f : A → B and g : B → C,

a.) if f and g are surjections, then g ◦ f is a surjection, and

b.) If g ◦ f is a surjection then g is a surjection.

Proof:

Proof of part (a): Since f and g are surjections we know

i.) ∀y : B.∃x : A. f (x) = y


ii.) ∀y : C.∃x : B. g(x) = y

We must show that (g ◦ f ) is a surjection, i.e. that:

∀y : C.∃x : A. (g ◦ f )(x) = y

Choose an arbitrary y ∈ C and show that there exists an x ∈ A such that


(g ◦ f )(x) = y. By (ii.), there is some z ∈ B such that g(z) = y. Using z for y
in (i.), we get that there is an x ∈ A such that f (x) = z. Now, we have that
f (x) = z and g(z) = y so we know that g(f (x)) = y, in particular, we know
that (g ◦ f )(x) = y. Thus, we have shown that, if you choose an arbitrary y ∈ C
there exists an x ∈ A such that (g ◦ f )(x) = y.

Proof of part (b): Left as an exercise.




8.4.3 Bijections
Definition 8.8 (bijection) A function f : A → B is a bijection if it is both
an injection and a surjection. Bijective functions are also said to be one-to-one
and onto.
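
For finite functions, all three properties are easy to test. A Python sketch,
with a function represented as a dictionary from domain elements to codomain
elements (the function names are invented):

    def injective(f):
        return len(set(f.values())) == len(f)

    def surjective(f, B):
        return set(f.values()) == set(B)

    def bijective(f, B):
        return injective(f) and surjective(f, B)

    B = {"x", "y", "z"}
    f = {0: "x", 1: "y", 2: "z"}
    print(bijective(f, B))               # True
    print({v: k for k, v in f.items()})  # its inverse -- also a function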

Now, going back to the question of when the inverse of a function is a func-
tion, we state the following theorem which perfectly characterizes the situation.

Lemma 8.6 (Inverse Characterization) For every function f ∈ A → B, f


is a bijection if and only if the inverse f −1 is a function.

Proof: Let f be an arbitrary function in A → B. There are two cases:



(⇒) Assume f is a bijection and show f −1 is a function.


If f is a bijection, then it is both an injection and a surjection, i.e. we
assume the following about f .

∀x, y : A. f (x) = f (y) ⇒ x = y (injection)


∀y : B. ∃x : A. f (x) = y (surjection)

Now, to show f −1 is a function from B to A, we must show that it is both


total and functional.¹

i.) ∀x : B. ∃y : A. hx, yi ∈ f −1 (total)


ii.) ∀x : B. ∀y, z : A. hx, yi ∈ f −1 ∧ hx, zi ∈ f −1 ⇒ y = z (functional)

(i.) We prove that f −1 is total as follows: Choose an arbitrary element of B,


call it z and show that ∃y : A.hz, yi ∈ f −1 . Now, since f is surjective, we know
that ∃x : A.f (x) = z, so we assume there is an x such that x ∈ A and such
that that f (x) = z. Then, hx, zi ∈ f −1 . Thus, let y in ∃y : A.f −1 (z) = y be x.
Since we have just argued that hz, xi ∈ f −1 and since x ∈ A we have finished
the proof that f −1 is total.
(ii.) To see that f −1 is functional, consider the following argument. Choose an
arbitrary element of B, call it x and let y and z be arbitrary elements of A.
We must show hx, yi ∈ f −1 ∧ hx, zi ∈ f −1 ⇒ y = z so we assume hx, yi ∈ f −1
and hx, zi ∈ f −1 and show y = z. But since hx, yi ∈ f −1 , we know (by the
definition of f −1 ) that f (y) = x and similarly f (z) = x. Now, since we started
by assuming f is an injection, it must be that y = z.
This completes the proof that if f is a bijection then f −1 is a function.
(⇐) Assume f −1 is a function and show f is a bijection.
The proof is left as an exercise.


Remark 8.5. Note that we used the fact that f was surjective to prove that
f −1 was total and we used the fact that f was injective to prove that f −1
was functional. Looking at the formulas for these properties above, we see the
similarity of their forms – so it makes perfect sense that we can use them in this
way.

Exercise 8.4. Prove the (⇐) direction of Lemma 8.6.

Lemma 8.7 (inverse-bijection) For every function f ∈ A → B, if f is a


bijection then f −1 is a bijection as well.

Exercise 8.5. Prove Lemma 8.7.


¹ Since we are trying to prove that f −1 is a function, and we do not know that it is yet,
we use the relational notation to avoid confusion e.g. the notation f −1 (x) = y suggests that
there is a unique y such that hx, yi ∈ f −1 ; however, until we have shown f −1 is a function
we do not know this to be true.

Exercise 8.6. Prove that ∆A is a function and is bijective. We call this the
identity function.


Theorem 8.5 (Schröder-Bernstein) If A and B are sets and f : A → B


and g : B → A are injections, then there exists a function h ∈ A → B that is a
bijection.

See [4] for a proof.


Note that if f and g are bijections then both f and g are surjections and both
are injections. Since composition preserves these properties (see Lemma 8.3and
Lemma 8.4) we have the following.

Corollary 8.3 (Composition of bijections) For sets A, B and C, and func-


tions f : A → B and g : B → C, if f and g are bijections, then so is g ◦ f .

The proof of following theorem shows how to lift bijections between pairs of
sets to their Cartesian product.

Theorem 8.6. For arbitrary sets A, B, A′ and B′ the following holds:

A ∼ A′ ∧ B ∼ B′ ⇒ (A × B) ∼ (A′ × B′ )

Proof: Assume A ∼ A′ and B ∼ B′ are witnessed by the bijections g : A → A′
and h : B → B′ . We must construct a bijection (say f ) from (A × B) → (A′ × B′ ).
We define f as follows:
f (hx, yi) = hg(x), h(y)i

f injective: Now, to see that f is an injection, we must show that for arbitrary
pairs ha, bi, hc, di ∈ A × B that:

f (ha, bi) = f (hc, di) ⇒ ha, bi = hc, di

Assume f (ha, bi) = f (hc, di) and show ha, bi = hc, di. But by the definition of f
we have assumed hg(a), h(b)i = hg(c), h(d)i. Thus, by equality on ordered pairs,
we know g(a) = g(c) and h(b) = h(d). Now, since g and h are both injections
we know a = c and b = d and so we have shown that ha, bi = hc, di.

f surjective: To see that f is a surjection, we must show that

∀hc, di : A′ × B′ . ∃ha, bi : (A × B). f (ha, bi) = hc, di

Choose an arbitrary pair hc, di ∈ A′ × B′ . Then we claim the pair hg −1 (c), h−1 (d)i
is the witness for the existential. To see that it is we must show that f (hg −1 (c), h−1 (d)i) =
hc, di. Here is the argument.

f (hg −1 (c), h−1 (d)i)


= hg(g −1 (c)), h(h−1 (d))i
= h(g ◦ g −1 )(c), (h ◦ h−1 )(d)i
= hIdA′ (c), IdB′ (d)i
= hc, di

8.5 Exercises
1. Write down the formal definitions of injection, surjection and bijection using the notation ⟨x, y⟩ ∈ f instead of the abbreviated form f (x) = y. Note that you will need to include a new variable (say z) to account for f (x) = f (y) in this more primitive notation.
Chapter 9

Cardinality and Counting

9.1 Cardinality
The term cardinality refers to the relative “size” of a set.

Definition 9.1 (equal cardinality) Two sets A and B have the same cardi-
nality iff there exists a bijection f : A → B. In this case we write |A| = |B| or
A ∼ B.

Although the usage is less common, sometimes sets of equal cardinality are
said to be equipollent or equipotent.

Exercise 9.1. Prove that the relation of equal cardinality is an equivalence relation. Specifically, show for arbitrary sets A, B, and C that the following hold:

i.) |A| = |A|

ii.) |A| = |B| ⇒ |B| = |A|

iii.) (|A| = |B| ∧ |B| = |C|) ⇒ |A| = |C|

This is easy if you study the theorems related to bijections, their inverses, and compositions.

Next, we use the theorem to show a (perhaps) rather surprising result: N has the same cardinality as the set Even of even numbers, even though Even contains only half of the natural numbers.

Definition 9.2 (even) Even = {x : N | ∃y : N. x = 2y}.

Theorem 9.1. |N| = |Even|.


Proof: To show these sets have equal cardinality we must find a bijection between them. Let f (n) = 2n; we claim f : N → Even is a bijection. To see this, we must show it is both: (i.) one-to-one and (ii.) onto.
(i.) f is one-to-one, i.e. we must show:

∀x, y : N. f (x) = f (y) ⇒ x = y

Choose arbitrary x, y ∈ N. Assume f (x) = f (y) and we show x = y. But by the definition of f , if f (x) = f (y) then 2x = 2y and so x = y as we were to show.
(ii.) f is onto, i.e. we must show:

∀x : Even. ∃y : N. x = f (y)

Choose an arbitrary x and assume x ∈ Even. Then, x ∈ {x : N | ∃y : N. x = 2y} is true so we know x ∈ N and ∃y : N. x = 2y. Let y be such that x = 2y; since f (y) = 2y, this same y witnesses ∃y : N. x = f (y).
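The witnessing bijection and its inverse are easy to render as code. The following is a minimal sketch of our own (not from the text), using Haskell's Integer to stand in for N; the names f and fInv are our choices.

    -- the bijection f(n) = 2n from N onto Even
    f :: Integer -> Integer
    f n = 2 * n

    -- its inverse; on even inputs, fInv undoes f
    fInv :: Integer -> Integer
    fInv x = x `div` 2

    -- e.g. map f [0,1,2,3] == [0,2,4,6]  and  fInv (f 3) == 3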

This theorem may be rather surprising: it says that the set of natural numbers is the “same size” as the set of even numbers. Intuitively there are only half as many evens as there are naturals, but somehow these sets are the same size.
This is one of the unintuitive aspects of infinite sets. This seeming paradox,
that a proper subset of an infinite set can be the same size, was first noticed
by Galileo [17] and is sometimes called Galileo’s paradox [52] after the Italian
scientist Galileo Galilei (1564 – 1642).

Definition 9.3. Squares = {x : N | ∃y : N. x = y 2 }.

Exercise 9.2. Prove that |N| = |Squares|.

Definition 9.4 (less equal cardinality) The cardinality of a set A is at most the cardinality of B iff there exists an injection f : A → B. In this case we write |A| ≤ |B|.

Definition 9.5 (strictly smaller cardinality) The cardinality of a set A is less than the cardinality of B iff there exists an injection f : A → B and there is no bijection from A to B. In this case we write |A| < |B|. Formally,

|A| < |B| def= |A| ≤ |B| ∧ |A| ≠ |B|

The following theorem is a corollary of Thm. 8.5 (Schröder-Bernstein).

Theorem 9.2 (Schröder-Bernstein) For all sets A and B, if |A| ≤ |B| and |B| ≤ |A| then |A| = |B|.

9.2 Infinite Sets

The following definition of infinite is sometimes called Dedekind infinite after
the mathematician Richard Dedekind (1831-1916) who first formulated it. This
characterization of infinite sets may be somewhat surprising because it does not
mention natural numbers or the notion of finiteness.
Definition 9.6 (Dedekind infinite) A set A is infinite iff there exists a func-
tion f : A → A that is one-to-one but not onto.
Lemma 9.1. N is infinite.
Proof: Consider the function f (n) = n + 1. Clearly, f is one-to-one since if
f (x) = f (y) for arbitrary x and y in N, then x + 1 = y + 1 and so x = y.
However, f is not onto since there is no element of N that is mapped to 0 by f .

Theorem 9.3. If a set A is infinite then, for any set B, if A ∼ B, then B is
infinite.
Proof: If A is infinite, then we know there is a function f : A → A that is one-to-one but not onto. Also, since A ∼ B, there is a bijection g : A → B. To show that B is infinite, we must show that there is a function h : B → B that is one-to-one but not onto. We claim h = g ◦ f ◦ g⁻¹ is such a function.
Now, to show that h is an injection (one-to-one) we recall (Lemma 8.3) that the composition of injections is an injection. We assumed that f and g were both injections, and to see that g⁻¹ is an injection as well, we cite Lemma 8.7, which says that if g is a bijection then g⁻¹ is a bijection as well. Since bijections are also injections, we have shown that h is an injection.
To show that h is not a surjection it is enough to show that

i.) ∃y : B. ∀x : B. h(x) ≠ y

We assumed f is not onto, thus

∃y′ : A. ∀x : A. f (x) ≠ y′

i.e. there is at least one y′ in A such that

ii.) ∀x : A. f (x) ≠ y′

Note that since g is a bijection we know the inverse g⁻¹ is a function of type B → A. Now, to show i.) use g(y′) as the witness. We must show:

∀x : B. h(x) ≠ g(y′)

Choose an arbitrary x ∈ B and show h(x) ≠ g(y′). To show ¬(h(x) = g(y′)) we assume h(x) = g(y′) and derive a contradiction.

h(x) = (g ◦ f ◦ g⁻¹)(x) = g(f (g⁻¹(x))) = g(y′)

Now, since g is an injection, we know f (g⁻¹(x)) = y′. But this is impossible because by ii.) we know f (g⁻¹(x)) ≠ y′.

This last theorem provides us an alternative method of showing a set is
infinite, specifically, show it has the same cardinality as some set already known
to be infinite.

9.3 Finite Sets


A set is finite if we can, in a very explicit sense, count the elements in the set
i.e. we can put the elements of the set in one-to-one correspondence with some
initial prefix of the natural numbers.
Definition 9.7.

{0..n} = {k : N | 0 ≤ k < n}

Remark 9.1. Note that {0..0} = ∅ and so has 0 elements. {0..1} = {0} and so has one element. In general {0..k} = {0, 1, · · · , k − 1} and so has k elements.
Definition 9.8 (finite) A set A is finite iff there exists a natural number k such that |A| = |{0..k}|. In this case we write |A| = k.

finite(A) def= ∃k : N. |A| = |{0..k}|
The definition says that for a set A to be finite, there must be a natural number k and a bijection (call it f ) from A to {0..k}. Since the mapping is a bijection, it has an inverse mapping {0..k} → A which is also a bijection. Then f⁻¹ can be used to enumerate the elements of A.
In particular, note that if f is the bijection witnessing the finiteness of A (showing that it has k elements), the following identities hold:

f A = {0 .. k − 1} and A = f⁻¹ {0 .. k − 1}
Lemma 9.2. For every set A, |A| = 0 ⇔ A = ∅

Proof:
(⇒) Assume |A| = 0; then there is a bijection c : A → {0..0}. But by definition {0..0} = ∅ since there are no natural numbers between 0 and −1. This means c : A → ∅ which, by Thm. 8.2, gives A = ∅.
(⇐) Assume A = ∅ and show |A| = 0. By definition, we must show a bijection from A → {0..0}, i.e. a bijection c : ∅ → ∅. By Lemma 8.1, c = ∅ is a function in ∅ → ∅ and it is vacuously a bijection.


Now, the bijection f witnessing the fact that A is finite assigns to each element of A a number between 0 and k. Also, the inverse f⁻¹ maps numbers between 0 and k to unique elements of A, i.e. since f⁻¹ is itself a bijection, and so is one-to-one, no two distinct i, j ∈ {0..k − 1} map to the same element of A. We can enumerate (list) the elements of the set with the following bit of pseudo-code.

for i ∈ {0..k − 1} do
    Print(f⁻¹(i))
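The same loop is immediate in a functional language. Here is a minimal sketch of our own, assuming the inverse bijection is supplied as a Haskell function; the names enumerate and fInv are our choices.

    -- list the elements f⁻¹(0), f⁻¹(1), ..., f⁻¹(k-1)
    enumerate :: Int -> (Int -> a) -> [a]
    enumerate k fInv = map fInv [0 .. k - 1]

    -- e.g. enumerate 3 (\i -> "element " ++ show i)
    --        == ["element 0","element 1","element 2"]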

To make sure that there is nothing “between” the finite and infinite sets (i.e. that there is no set that is neither finite nor infinite) we would expect the following theorem to hold.

Theorem 9.4. A set A is Dedekind infinite iff A is not finite.

Interestingly, it can be shown that proofs of this theorem require the use of
the axiom of choice – whose use is beyond the scope of these notes.

9.3.1 Permutations
Definition 9.9. A bijection from a finite set A to itself is called a permutation.

If a set A is finite, there exists a k ∈ N such that there is a bijection A → {0..k}. Call this function ♯; then for each x ∈ A, ♯x = i for some i ∈ {0..k}.

9.4 Cantor’s Theorem


Typically, it is harder to prove a set A has strictly smaller cardinality than a set B because it is harder to prove that no function in A → B is a bijection. To prove this we usually assume there is a bijection and derive a contradiction. Cantor’s1 theorem says that the power set of every set is strictly larger than the set itself.

Theorem 9.5 (Cantor’s Theorem) For every set A, |A| < |ρ(A)|.

1 Georg Cantor (1845–1918) was a German mathematician who developed set theory and established the importance of the ideas of injection and bijection for counting.

Proof: Let A be an arbitrary set. To show |A| < |ρ(A)| we must show that (i.) there is an injection A → ρ(A) and (ii.) there is no bijection from A to ρ(A).
(i.) Let f (x) = {x}. We claim that this is an injection from A to ρ(A) (the set of all subsets of A). Clearly, for each x ∈ A, f (x) ∈ ρ(A). To see that f is an injection we must show:

∀x, y : A. f (x) = f (y) ⇒ x = y

Choose arbitrary x and y from A. Assume f (x) = f (y), i.e. that {x} = {y}; we must show x = y. But, by Theorem 5.3, {x} = {y} ⇔ x = y, thus f is an injection as was claimed.
(ii.) To see that there is no bijection, we assume f : A → ρ(A) is an arbitrary function and show that it can not be onto.
Now, if f is onto then every subset of A must be mapped to from some
element of A. Consider the set

B = {y : A | y 6∈ f (y)}

Clearly B ⊆ A, so B ∈ ρ(A). Now, if f is onto, there is some z ∈ A such that


f (z) = B. Also, it must be the case that z ∈ B ∨ z 6∈ B.
(case 1.) Assume z ∈ B. Then, z ∈ {y : A | y 6∈ F (y)}, that is, z ∈ A (as we
assumed) and z 6∈ f (z). Since we assumed f (z) = B, we have z 6∈ B. But we
started by assuming z ∈ B so this is absurd.
(case 2.) Assume z 6∈ B. Then, ¬(z ∈ {y : A | y 6∈ f (y)}. By the definition
of membership in a set defined by comprehension, ¬(z ∈ A ∧ z 6∈ f (z)). By
DeMorgan’s law, (z 6∈ A ∨ ¬(z 6∈ f (z))). Since we know z ∈ A it must be that
¬(z 6∈ f (z)), i.e. that z ∈ f (z). Since f (z) = B we have z ∈ B. But again, this
is absurd because we started this argument by assuming z 6∈ f (z).
Since we have arrived at a contradiction in both cases, we conclude that the
function f can not be a bijection.

Cantor’s theorem gives a way to take any set and use it to construct a set of
strictly larger cardinality. Thus, we can construct a hierarchy of non-equivalent
infinities. Start with the set N (which we proved was infinite) and take the
power set. By Cantor’s theorem, |N| < |ρ(N)|. Similarly, |ρ(N)| < |ρ(ρ(N))| and
so on.

Corollary 9.1 (Infinity of infinities) There is an unbounded hierarchy of infinite sets of strictly increasing cardinality.

9.5 Countable and Uncountable Sets


By Corollary 9.1, there are many different infinities, and we distinguish infinite sets that are, in some sense, small by classifying them as countable sets and the large sets as being uncountable.

Definition 9.10 (Countable) A set A is countable iff A is finite or A ∼ N. Countable sets are also sometimes said to be denumerable.

Trivially, it follows that the set of natural numbers N is countable. We have the following lemma characterizing countable sets.

Lemma 9.3. For all sets A, A is countable if and only if there exists a surjection
f : N → A.

The proof in the (⇒) direction is trivial, following almost directly from the definition of countable. The proof in the (⇐) direction assumes the existence of the surjection and requires us to show it is a bijection, or to use it to construct a mapping from A onto an initial segment of N.
The following theorems may be surprising.

Theorem 9.6 (Q countable) The set Q (of rational numbers) is countable.

Theorem 9.7 (R uncountable) The set R (of real numbers) is uncountable.

The proofs are originally due to Cantor.

9.6 Counting
We saw with the notion of cardinality that it is possible to compare the sizes of sets without actually counting them. By counting, we mean the process of sequentially assigning a number to each element of the set – i.e. creating a bijection between the elements of the set and some set {0..k}. This is precisely the purpose of the bijection that witnesses the finiteness of a set A – it counts the elements of the set A. Thus, counting is “finding a function of a certain kind.”2

Lemma 9.4 (Counting Lemma)

∀j, k : N. {0..j} ∼ {0..k} ⇒ j ≤ k

The following theorem shows that counting is unique, i.e. that it does not matter how you count a finite set, it always has the same cardinality.

Theorem 9.8.

∀A. ∀n, m : N. (|A| = n ∧ |A| = m) ⇒ n = m

A corollary is

Corollary 9.2.
∀i, j : N. {0..i} ∼ {0..j} ⇔ i = j
2 See Stuart Allen’s formalization [2] of discrete math materials; the proofs here are the ones formalized by Allen.



9.6.1 The Pigeonhole Principle


A concept related to the uniqueness of counting is the pigeonhole principle. Informally, it says that if you have k pigeons and m boxes to put them into, and m < k, then at least one of the m boxes must contain at least two pigeons.
There are a few ways to state this theorem. A rather explicit statement is given by the following.

Theorem 9.9 (Pigeonhole Principle)

∀m, n : N. ∀f : {0..m} → {0..n}. n < m ⇒ ∃i, j : {0..m}. i ≠ j ∧ f (i) = f (j)

But this is merely saying that at least two elements from the domain must
get mapped to the same element (pigeon hole) of the codomain. A more abstract
statement is that if i < j then there is no injection from {0..j} → {0..i} – i.e. if
for every function there are always at least a pair of elements from the domain
mapped to some single element in the codomain, then there certainly can be no
injection.

Theorem 9.10 (Pigeonhole Principle 1)

∀i, j : N. i < j ⇒ ¬∃f : {0..j} → {0..i}.Inj(f, {0..j}, {0..i})


Part III

Induction and Recursion


Stephen Cole Kleene (1909–1994) was an American mathematician and logician who made fundamental early developments in recursion theory, mathematical logic and metamathematics. He was a student of Alonzo Church’s at Princeton in the 1930’s and went on to the Department of Mathematics at the University of Wisconsin in Madison, where he stayed until his retirement.

Introduction

The mathematical structures studied in Part II (sets, relations and functions) are formed by building certain kinds of sets and then imposing constraints on those structures, e.g. we have defined relations as subsets of Cartesian products, and we defined functions as a constrained form of relations: they are relations that satisfy the properties of functionality and totality.
An alternative form of definition3 is the inductive definition, meaning that the structures are defined by giving base cases (instances of the structure that do not refer to substructures of the kind being defined) and by combining previously constructed instances of the structure being defined. Examples of mathematical structures having inductive structure are the natural numbers, lists, trees, formulas, and even programs.
Inductive structures are ubiquitous in Computer Science and they go hand in hand with definitions by recursion and inductive proofs. There are many forms of induction (mathematical induction, Noetherian induction, well-founded induction [35] and structural induction [54]). In this part of the text, we concentrate on presenting inductive definitions in a style that allows students to define recursive functions and to generate a structural induction principle directly from the definition of the type. Ordinary mathematical induction is often the focus in discrete mathematics courses but we see it here as simply a special case of structural induction.
In the following sections we present a number of individual inductively defined structures and then follow with a chapter giving the recipe for rolling your own inductive definitions; we show how to define functions by recursion on those definitions and show how to synthesize an induction principle for the new inductive structure.
Readers who may have skipped Chapter 1 (which is labelled as optional) might read Section 1.3 about the form of inductive definitions used here.

3 In keeping with the foundational idea that all mathematical structures are definable as sets, there is, of course, a purely set theoretic form of definition for inductive structures.
Chapter 10

Natural Numbers

This memoir can be understood by any one possessing what is usually called good common sense; no technical philosophic, or mathematical, knowledge is in the least degree required. But I feel conscious
that many a reader will scarcely recognize in the shadowy forms which
I bring before him his numbers which all his life long have accom-
panied him as faithful and familiar friends; he will be frightened by
the long series of simple inferences corresponding to our step-by-
step understanding, by the matter-of-fact dissection of the chains of
reasoning on which the laws of numbers depend, and will become im-
patient at being compelled to follow out proofs for truths which to his
supposed inner consciousness seem at once evident and certain.
Richard Dedekind from the Preface to the First Edition
of The Nature and Meaning of Numbers, translated by Wooster
Woodruff Beman, in Essays on the Theory of Numbers, Open Court
Publishing, Chicago, 1901.


The German mathematician Leopold Kronecker (1823–1891) famously remarked:

God made the natural numbers; all else is the work of man.

Kronecker was saying the natural numbers are absolutely primitive and that the other mathematical structures have been constructed by men. Similarly, the philosopher Immanuel Kant (1724–1804) and mathematician Luitzen Egbertus Jan Brouwer (1881–1966) both believed that understanding of natural numbers is somehow innate; that it arises from intuition about the human experience of time as a sequence of moments.1 In any case, it would be difficult to argue
against the primacy of the natural numbers among mathematical structures.

1 Interestingly, Kant also believed that geometry was similarly primitive and that our intuition of it arises from our experience of three dimensional space. The discovery in the 19th century of non-Euclidean geometries [34] makes this idea seem quaint by modern standards.

10.1 Peano Axioms


The Peano axioms are named for Giuseppe Peano (1858–1932), an Italian
mathematician and philosopher. Peano first presented his axioms [40] of arith-
metic in 1889, though in a later paper Peano credited Dedekind [7] with the
first presentation of the axioms. We still know them as Peano’s axioms.

Definition 10.1 (Peano axioms) Let N be a set where 0 is a constant symbol, let s be a function of type N → N (call it the successor function), and let P be any property of natural numbers; then the following are Peano’s axioms.

i.) 0 ∈ N
ii.) ∀k : N. sk ∈ N
iii.) ∀k : N. 0 ≠ sk
iv.) ∀j, k : N. j = k ⇔ sj = sk
v.) (P [0] ∧ ∀k : N. P [k] ⇒ P [sk]) ⇒ ∀n : N. P [n]

Axioms (i.) and (ii.) say 0 is a natural number and if k is a natural number
then so is sk. We call s the successor function and sk is the successor of k.
Axiom (iii.) says that 0 is not the successor of any natural number, and axiom (iv.) says that the successor function preserves and reflects equality. Axiom (v.) is the induction principle, which is the main topic of discussion of this chapter; see Section 10.3.
So, there are two ways to construct a natural number, either you write down
the constant symbol 0 or, you write down a natural number (say k) and then
you apply the successor function s which has type N → N to the k to get the
number sk.
Thus, N = {0, s0, ss0, sss0, · · · } are the elements of N. Note that the variable “k” used in the rules never occurs in an element of N; it is simply a place-holder for a term of type N, i.e. it must be replaced by some previously constructed term from the set {0, s0, ss0, · · · }.
We typically write natural numbers in decimal notation.

0=0
1 = s0
2 = ss0
3 = sss0
..
.

You should think of 3 as a (better) notation for the natural number s s s 0.

10.2 Definition by Recursion


We have defined functions by recursion earlier in these notes (e.g. the val
function given in Chapter 2 Def. 2.8). The idea of defining functions by recursion
on the structure of one of its arguments presented here is the same. To make
a definition by recursion “on the structure” of the natural numbers, we must
specify the behavior of the function on inputs by considering the possible cases:
the input (or one of them) is 0 or it is of the form sk for some previously
constructed natural number k.
As an example, consider the definition of addition over the natural numbers
by recursion on the structure of the first argument.

Definition 10.2 (Addition)

add(0, k) = k
add(sn, k) = s(add(n, k))

We will use ordinary infix notation for addition with the symbol +; thus
add(m, n) will be written m+n. Using this standard notation we take the chance
that the reader will assume + has all the properties expected for addition. It
turns out that it does, but until we prove a property is true for this definition
we can not use the property. In the infix notation, the definition would appear
as follows:

0+k =k
sn + k = s(n + k)

Example 10.1. To add 2 + 3 using the definition, we compute as follows:

ss0 + sss0 = s(s0 + sss0) = ss(0 + sss0) = ss(sss0) = sssss0
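Definition by recursion on this structure transcribes directly into a functional language. The following is a minimal Haskell sketch of our own (the names Nat, Zero, S and add are our choices, not fixed by the text):

    data Nat = Zero | S Nat deriving Show

    -- Definition 10.2: recursion on the structure of the first argument
    add :: Nat -> Nat -> Nat
    add Zero  k = k            -- 0 + k = k
    add (S n) k = S (add n k)  -- sn + k = s(n + k)

    -- Example 10.1: add (S (S Zero)) (S (S (S Zero)))
    -- evaluates to S (S (S (S (S Zero)))), i.e. 2 + 3 = 5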

Multiplication is defined as iterated addition by recursion on the structure


of the first argument.

Definition 10.3 (Multiplication)

mult(0, k) = 0
mult(sn, k) = mult(n, k) + k

We will use ordinary infix notation for multiplication with the symbol ·; thus
mult(m, n) will be written m · n. We will also sometimes just write mn omitting
the symbol ·. In the infix notation, the definition appears as follows:

0 · k = 0
sn · k = (n · k) + k

Example 10.2. So, to multiply 2 · 3,

ss0 · sss0 = (s0 · sss0) + sss0


= ((0 · sss0) + sss0) + sss0
= (0 + sss0) + sss0
= sss0 + sss0
= ssssss0

We define exponentiation by recursion on the structure of the exponent.

Definition 10.4 (exponentiation)

n^0 = s0
n^(sk) = n^k · n
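Continuing the Haskell sketch of Nat and add given after Example 10.1 (again our own rendering), multiplication and exponentiation follow the same recursive pattern:

    -- Definition 10.3: recursion on the first argument
    mult :: Nat -> Nat -> Nat
    mult Zero  _ = Zero               -- 0 · k = 0
    mult (S n) k = add (mult n k) k   -- sn · k = (n · k) + k

    -- Definition 10.4: recursion on the exponent
    pow :: Nat -> Nat -> Nat
    pow _ Zero  = S Zero              -- n^0 = s0
    pow n (S k) = mult (pow n k) n    -- n^(sk) = n^k · n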

The definitions of addition, multiplication and exponentiation are all instances of a syntactic pattern of definition that is called definition by recursion. It turns out that any definition that follows this pattern is guaranteed to be a function. This might be seen as an early example of a design pattern.

Theorem 10.1 (Definition by Recursion) Given a set A and a function g ∈


(N × A) → A, and an element a ∈ A, definitions having the following form:

f (0) = a
f (sk) = g(k, f (k))

result in well-defined functions, f ∈ N → A.

The proof of the theorem justifying definition by recursion is beyond the


scope of these notes (See [30] or [31, pp.45]); however, the theorem justifies
definitions of the kind just given for addition, multiplication and exponentiation.
The theorem says that if a definition follows a particular syntactic form,
you are justified in claiming you have defined a function; definitions that follow
the pattern are guaranteed to be functional and total. Pretty good stuff, no
need to prove that each new definition is a function in N → A, just follow the
pattern specified by the theorem and you are guaranteed your definition is a
function. And note that the property of being total (i.e. that every input gets
148 CHAPTER 10. NATURAL NUMBERS

mapped to some output) guarantees that definitions which match the pattern
are guaranteed to halt on all inputs. In [31] Mac Lane notes that one can work
the other way around, one can take the definition by recursion as an axiom and
derive the Peano axioms as theorems.
The arithmetic functions we defined above all take two arguments. We state a corollary of Theorem 10.1 for binary functions defined by recursion on the structure of their first arguments.

Corollary 10.1 (Definition by recursion (for binary functions)) Given


sets A and B and a function g ∈ (N × A) → A, and a function h ∈ B → A, and
an element b ∈ B, definitions of the following form:

f (0, b) = h(b)
f (sn, b) = g(n, f (n, b))

result in well-defined functions, f ∈ (N × B) → A.

Using the corollary, we prove that the definitions given above for addition,
multiplication and exponentiation of natural numbers are indeed functions.

Theorem 10.2 (Addition is a function ) Addition as given by Definition 10.2


is a function of type (N × N) → N.
Proof: Recall the definition
0+k =k
sn + k = s(n + k)

To prove that addition is a function of the specified type we show how it fits the pattern given by Corollary 10.1. To do so we must specify the sets A and B, the functions h and g, and the element b ∈ B. In this case, let A = N and B = N; the element b is the second argument k, which is an acceptable choice since B = N and k ∈ N. The function h : N → N is just the identity function, h(k) = k. The function g ∈ (N × N) → N is the function g(j, k) = sk. Thus, the operation of addition is, by Corollary 10.1, a function in (N × N) → N.


Theorem 10.3 (Multiplication is a function) Multiplication as given by


Definition 10.3 is a function of type (N × N) → N.
Proof: Recall the definition:
0·k =0
sn · k = (n · k) + k

Multiplication fits the pattern of Corollary 10.1 as follows: let A = N and B = N. The function h : N → N is just the constant function h(k) = 0. The function g ∈ (N × N) → N is the function that adds k to its second argument, so g(m, n) = n + k. We have just proved that addition is a function and so g ∈ (N × N) → N. Thus, by Corollary 10.1, multiplication is a function in (N × N) → N.

Theorem 10.4 (Exponentiation is a function) Exponentiation as given by


Definition 10.4 is a function of type (N × N) → N.

Exercise 10.1. Prove Theorem 10.4

Exercise 10.2. The Fibonacci function is defined as follows:

F (0) = 0
F (s0) = 1
F (ssk) = F (sk) + F (k)

Can this be defined using definition by recursion? If so how? If not, why not?

You may have convinced yourself that these definitions look like they “do
the right thing” but we will be able to prove that they behave in the ways we
expect them to using mathematical induction.

10.3 Mathematical Induction


The rule for ∀R provides one method of proving a statement of the form ∀n : N. φ
e.g. choose an arbitrary natural number (call it k) and assume k ∈ N and show
φ[n := k]. But it is not always enough to just choose an arbitrary element, the
argument may depend on the structure of the type being quantified over, in this
case the natural numbers. Mathematical induction is a principle of proof that
takes the structure of the natural numbers into account in the proof.
Peano’s axiom (v.) (see Definition 10.1) is known as the principle of mathe-
matical induction.

Definition 10.5 (Principle of Mathematical Induction) For a property P


of natural numbers we have the following axiom.

(P [0] ∧ ∀k : N. P [k] ⇒ P [sk]) ⇒ ∀n : N. P [n]

10.3.1 An informal justification for the principle


Suppose you wished to justify the principle of mathematical induction.

(P [0] ∧ ∀k : N. P [k] ⇒ P [sk]) ⇒ ∀n : N. P [n]

It says, for a property P of natural numbers, to show that P [n] holds for every
natural number n, it is enough to show two things:

i.) P [0] and ii.) ∀k : N.P [k] ⇒ P [sk]



So, suppose you have accepted proofs of (i.) and (ii.) but somehow still believe
that there might be some n ∈ N such that ¬P [n] holds. Since n ∈ N it is
constructed by n applications of the successor function to 0. You can construct
an argument that P [n] must hold in 2n + 1 steps. The argument is constructed
using (i.) and (ii.) as follows2 :

1.     P [0]                       you accepted this as (i.)
2.     P [0] ⇒ P [s0]              instantiate (ii.) with 0
3.     P [s0]                      modus ponens using 1 and 2
4.     P [s0] ⇒ P [ss0]            instantiate (ii.) with s0
5.     P [ss0]                     modus ponens using 3 and 4
...
2n.    P [s^(n−1) 0] ⇒ P [s^n 0]   instantiate (ii.) with s^(n−1) 0
2n+1.  P [s^n 0]                   modus ponens using 2n−1 and 2n

Thus, no matter which n is chosen, we can prove P [n] holds in 2n + 1 steps


using the base case (i.) and the induction step (ii.).

10.3.2 A sequent style proof rule


We can derive a sequent style proof rule for mathematical induction by instantiating Peano’s induction axiom and doing as much proof as is possible in schematic form.

Read bottom-up, the derivation runs as follows (the contexts ∆1 and ∆2 are carried through unchanged). Starting from the goal

Γ ⊢ ∆1, ∀n : N. P [n], ∆2

add the induction axiom as a hypothesis:

Γ, (P [0] ∧ ∀k : N. (P [k] ⇒ P [sk])) ⇒ ∀n : N. P [n] ⊢ ∆1, ∀n : N. P [n], ∆2

Applying (⇒L) produces two branches. The right branch, Π1,

Γ, ∀n : N. P [n] ⊢ ∆1, ∀n : N. P [n], ∆2

is closed immediately by (Ax). The left branch,

Γ ⊢ ∆1, P [0] ∧ ∀k : N. (P [k] ⇒ P [sk]), ∆2

splits by (∧R) into Γ ⊢ ∆1, P [0], ∆2 and Γ ⊢ ∆1, ∀k : N. (P [k] ⇒ P [sk]), ∆2, and applying (∀R) and then (⇒R) to the latter yields

Γ, k : N, P [k] ⊢ ∆1, P [sk], ∆2
The informal justification is as follows: if you are trying to prove a sequent of the form Γ ⊢ ∀m : N. P [m], you can add an instance of the principle of mathematical induction to the left side; this is because it is an axiom of Peano arithmetic. After one application of the ⇒L rule, the left branch will be of the form

Γ ⊢ P [0] ∧ ∀k : N. P [k] ⇒ P [sk]

and the right branch will be an instance of an axiom of the form:

∀m : N. P [m], Γ ⊢ ∀m : N. P [m]
2 Recall that modus ponens is the rule that says that P and P ⇒ Q together yield Q.

We can further refine the left branch by applying the ∧R rule which gives two
subgoals: One to show P [0] and the other to show ∀k : N. P [k] ⇒ P [sk]. This
sequent can further be refined by applying ∀R and then ⇒R.
This yields the following rule3 .

Proof Rule 10.1 (Mathematical Induction)

Γ ⊢ P [0]        Γ, k ∈ N, P [k] ⊢ P [sk]
----------------------------------------- (NInd)   where k is fresh.
            Γ ⊢ ∀m : N. P [m]

10.3.3 Some First Inductive Proofs


We have suggested that for k a natural number, sk (the successor of k) is the
same as k + 1 where 1 is just the decimal representation of s0. We state this as
a theorem and present it as the first proof using mathematical induction.

Lemma 10.1 (Successor is add one.)

∀k : N. sk = k + 1

Proof: By mathematical induction on k. The property of k we are to prove is


defined as follows:
P [k] def= sk = k + 1
In general, for a formula of the form ∀k : N. φ, the property P can be written
as P [k] def= φ.
As in all proofs by mathematical induction there are two things to show, the base case and the induction step. The forms of these two subgoals are given by the proof rule NInd shown above as Proof Rule 10.1.
Base Case: We must show P [0]. We get P [0] by replacing all free occurrences of k with 0 in the body of the definition of P . Thus

P [0] = (sk = k + 1)[k := 0] = (s0 = 0 + 1)

So, at this stage, we must show the following equality holds: s0 = 0 + 1. To


continue the proof, we use the definition of addition and 1 to simplify the right
side as follows: 0 + 1 = 1 = s0. Thus the left and right sides of the equality are
the same and so the base case holds.
Induction Step: To show the induction step holds we assume that for
some arbitrary k that k ∈ N and furthermore assume that P [k] holds. P [k] is
called the induction hypothesis.

sk = k + 1 (Ind.Hyp.)

We must show P [sk]. To get P [sk] carefully replace all free occurrences of k in
the body of P by sk. The result of the substitution is the following equality:

ssk = sk + 1 (A)
3 The contexts ∆1 and ∆2 on the right side have been omitted to aid readability.

To show this, we proceed by computing with the right side using the definition
of addition.
sk + 1 = s(k + 1) (B)
By the induction hypothesis, sk = k + 1 so we replace k + 1 by sk in the right
side of (B), this gives
sk + 1 = s(k + 1) = ssk
But now we have shown (A) and so the induction step is completed.
Thus, we have shown that the base case and the induction step hold and this
completes the proof.

Since we proved in Thm. 10.2 that addition is a function in (N × N) → N
and by the previous lemma that sk = k + 1 we know the successor operation
defines a function in N → N.
Corollary 10.2 (successor is a function)
s∈N→N
This proof justifies restating the principle of mathematical induction in the
following (perhaps) more familiar form.
Definition 10.6 (Principle of Mathematical Induction (modified)) For
a property P of natural numbers we have the following axiom.
(P [0] ∧ ∀k : N. P [k] ⇒ P [k + 1]) ⇒ ∀n : N. P [n]
The following lemma is useful in a number of proofs.
Lemma 10.2 (addition by zero)
∀n, k : N. k = k + n ⇒ n = 0
Proof: Choose an arbitrary n ∈ N and do induction on k.

P [k] def= k = k + n ⇒ n = 0
Base Case: Show P [0], i.e. that 0 = 0 + n ⇒ 0 = n. Assume 0 = 0 + n
and show 0 = n. By the definition of addition 0 + n = n, so the base case holds.
Induction Step: Assume k ∈ N, assume P [k] holds and show P [sk].
P [k] : k =k+n⇒n=0
We must show
sk = sk + n ⇒ n = 0
Assume sk = sk + n and show n = 0. By definition of addition, from the right side of the equality we have sk + n = s(k + n), so we know that sk = s(k + n). Applying Peano axiom (iv.) to this fact we see that k = k + n. This formula is the antecedent of the induction hypothesis so we know that n = 0, which is what we were to show.

Often, we would like to do case analysis on natural numbers, i.e. given an assumption that n ∈ N, we’d like to break the proof into two cases: either n = 0 or n is a successor (∃j : N. n = sj). In some way the induction principle says as much, and in fact to prove this theorem we need to use induction; however, we do not need to utilize the induction hypothesis in the proof! In general, when an induction proof does not use the induction hypothesis, it can be done by case analysis; here we establish this weaker principle using induction.

Lemma 10.3 (Case analysis)

∀n : N. n = 0 ∨ ∃j : N. n = sj

Proof: By mathematical induction on n. The property P [n] is defined as


follows:
P [n] def= n = 0 ∨ ∃j : N. n = sj
Base Case: Show P [0], i.e. that 0 = 0 ∨ ∃j : N. 0 = sj. By reflexivity of equality 0 = 0 so the base case holds.
Induction Step: For arbitrary k ∈ N assume P [k] and show P [sk]. We
do not show the induction hypothesis P [k] because we do not need it. Instead
we show P [sk]:
sk = 0 ∨ ∃j : N. sk = sj
By symmetry of equality and by Peano axiom (iii.) we know sk 6= 0 so we show
∃j : N. sk = sj. Use k as the witness and show sk = sk. To show this we apply Peano axiom (iv.) and show k = k, which is true by the reflexivity of equality.


Using this lemma we can derive the following proof rule.
Proof Rule 10.2 (Case Analysis on N)

Γ1, k ∈ N, k = 0, Γ2 ⊢ ∆        Γ1, k ∈ N, j ∈ N, k = sj, Γ2 ⊢ ∆
---------------------------------------------------------------- (NCases)   where j is fresh.
                    Γ1, k ∈ N, Γ2 ⊢ ∆

In cases where induction is not required, case analysis can be used.

Exercise 10.3. Derive Rule 10.2 using Lemma 10.3.

10.4 Properties of the Arithmetic Operators


In Section 10.2 we defined addition, multiplication and exponentiation by re-
cursion on the structure of one of the arguments. We also proved that they are
functions. Properties of functions defined by recursion on their structure are in-
variably established by proofs using mathematical induction. In this section we
preset a number of proofs to establish that the arithmetic operators do indeed
behave as we expect them to.

The laws for addition and multiplication are given as follows, where m, n and k are arbitrary natural numbers.

0 right identity for +      m + 0 = m
+ commutative               m + n = n + m
+ associative               m + (n + k) = (m + n) + k
0 annihilator for ·         m · 0 = 0
1 right identity for ·      m · 1 = m
· commutative               m · n = n · m
· associative               m · (n · k) = (m · n) · k
distributive law            m · (n + k) = (m · n) + (m · k)
The fact that 0 is a left identity for addition falls out of the definition for
free. That 0 is a right identity requires mathematical induction.
Theorem 10.5 (0 right identity for +)
∀n : N. n + 0 = n
Proof: By mathematical induction on n. The property P of n is given as:
P [n] def= n + 0 = n
Base Case: We must show P [0], i.e. that 0 + 0 = 0 but this follows
immediately from the definition of + so the base case holds.
Induction Step: Assume n ∈ N and that P [n] holds and show P [sn].
P [n] is the induction hypothesis.
P [n] : n + 0 = n
Show that sn + 0 = sn. But by definition of + we know that sn + 0 = s(n + 0). By the induction hypothesis n + 0 = n, so s(n + 0) = sn and the induction step holds.

Theorem 10.6 (+ is commutative)
∀m, n : N. m + n = n + m
Theorem 10.7 (+ is associative)
∀m, n, k : N. m + (n + k) = (m + n) + k
Theorem 10.8 (0 annihilator for ·)
∀n : N. n · 0 = 0
Theorem 10.9 (· is commutative)
∀m, n : N. m · n = n · m
Theorem 10.10 (· is associative)
∀m, n, k : N. m · (n · k) = (m · n) · k

10.4.1 Order Properties


The natural numbers are ordered. Consider the following definition of less than.

Definition 10.7 (Less Than)


m < n def= ∃j : N. m + (j + 1) = n

We’d like to establish that the less than relation (<) as defined here behaves as expected, i.e. that it is a strict partial order. Recall from Chapter 7 that a strict partial order must be irreflexive and transitive.

Theorem 10.11 (< is irreflexive)

∀n : N. ¬(n < n)

Proof: Choose an arbitrary n ∈ N and show ¬(n < n). We assume n < n and
derive a contradiction. If n < n then, by definition of less than the following
holds:
∃j : N. n + (j + 1) = n
Let j ∈ N be such that n + (j + 1) = n. By Lemma 10.2 (addition by zero) we
know j + 1 = 0. Since j + 1 is sj this contradicts Peano axiom (iii.) instantiated
with k = j.


Theorem 10.12 (< is transitive)

∀k, n, m : N. k < n ∧ n < m ⇒ k < m

Exercise 10.4. Prove Thm. 10.12.

Lemma 10.4 (Zero min)

∀n : N.¬(n < 0)

Proof: Choose arbitrary n. To show ¬(n < 0) assume n < 0 and derive a
contradiction. By definition of less-than, n < 0 means

∃j : N. n + (j + 1) = 0

Suppose there is such a j. By commutativity of addition and the definition of


addition itself, we have the following sequence of equalities.

n + (j + 1) = (j + 1) + n = sj + n = s(j + n)

But then we have s(j + n) = 0 which contradicts Peano axiom (iii.).



Theorem 10.13 (Addition Monotone)

∀k, m, n : N. m < n ⇒ m + k < n + k

Proof:
Choose an arbitrary k ∈ N and do induction on m.
P [m] def= ∀n : N. m < n ⇒ m + k < n + k

Base Case: Show P [0], i.e. that

∀n : N. 0 < n ⇒ 0 + k < n + k

Choose arbitrary n ∈ N and assume 0 < n. Note that by the definition of


addition 0 + k = k, so we must show that k < n + k. By definition of less than, we must show:

∃j : N. j ≠ 0 ∧ n + k = k + j

Use the witness n (for j) and show two things: i.) n ≠ 0 and ii.) n + k = k + n. Since we assumed 0 < n, by definition of less than we know that there is some j ∈ N such that j ≠ 0 and n = 0 + j. By the definition of addition we know that n = j and hence that n ≠ 0. By the commutativity of addition (Thm. 10.6) the second condition holds as well.
Induction Step: Assume P [m] for some m ∈ N and show P [m + 1].

P [m] : ∀n : N. m < n ⇒ m + k < n + k

Show

∀n : N. m + 1 < n ⇒ (m + 1) + k < n + k

Choose arbitrary n ∈ N. Assume m + 1 < n and show (m + 1) + k < n + k. In fact we can argue directly from the assumption: by the definition of less than, m + 1 < n gives some j ∈ N such that (m + 1) + (j + 1) = n. Adding k to both sides and rearranging with the associativity and commutativity of addition (Thms. 10.6 and 10.7) gives ((m + 1) + k) + (j + 1) = n + k. Thus j witnesses (m + 1) + k < n + k, as we were to show. (Note that, as with the case analysis lemma, the induction hypothesis is not actually needed in this step.)

Theorem 10.14 (Exp monotone)

∀n : N. 1 < n ⇒ ∀k : N. n^k < n^(k+1)

Proof: Choose arbitrary n ∈ N and assume 1 < n. We must show

∀k : N. n^k < n^(k+1)

We proceed by induction on k.

P [k] def= n^k < n^(k+1)

Base Case: Show P [0], i.e. that n^0 < n^1. By definition n^0 = 1 and

n^1 = n^0 · n = 1 · n = n

Since we assumed 1 < n the base case holds.

Induction Step: Assume P [k] for some k ∈ N and show P [k + 1].

P [k] : n^k < n^(k+1)

We must show n^(k+1) < n^((k+1)+1). Starting on the left side of the inequality we compute as follows:

n^(k+1) = n^k · n

By the induction hypothesis n^k < n^(k+1), so multiplying both sides by n (monotonicity of multiplication) we get the following.

n^(k+1) = n^k · n < n^(k+1) · n = n^((k+1)+1)

This shows the induction step holds and completes the proof.


10.4.2 Iterated Sums and Products


Definition 10.8 (Sum)

∑_{i=k}^{j} f (i) = 0                                    if j < k
∑_{i=k}^{j+1} f (i) = f (j + 1) + ∑_{i=k}^{j} f (i)      if (j + 1) ≥ k

Some properties of Sums and Products


Theorem 10.15 (Gauss’ identity) For every natural number n, the following identity holds:

∑_{i=1}^{n} i = n(n + 1)/2

Proof: Our proof is by mathematical induction on n. The predicate we will prove is

P [n] def= ∑_{i=1}^{n} i = n(n + 1)/2

We must show the base case and the induction step both hold.
Base Case: We must show P [0], i.e.

∑_{i=1}^{0} i = 0(0 + 1)/2

But notice, by the definition of summation,

∑_{i=1}^{0} i = 0

and also

0(0 + 1)/2 = 0/2 = 0

so the base case holds.
Induction Step: Choose an arbitrary natural number, call it k. We assume P [k] (the induction hypothesis)

∑_{i=1}^{k} i = k(k + 1)/2

and we must show P [k + 1] holds:

∑_{i=1}^{k+1} i = (k + 1)((k + 1) + 1)/2

Starting with the left side, by the definition of the summation operator we get the following.

∑_{i=1}^{k+1} i = (k + 1) + ∑_{i=1}^{k} i

By the induction hypothesis, we have

(k + 1) + ∑_{i=1}^{k} i = (k + 1) + k(k + 1)/2

Algebraic reasoning gives us the following sequence of equalities, completing the proof.

(k + 1) + k(k + 1)/2 = (2k + 2)/2 + (k² + k)/2 = (k² + 3k + 2)/2 = (k + 1)(k + 2)/2
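As a quick sanity check (not a proof), the identity is easy to test computationally; the following one-liner is our own sketch:

    -- compare the iterated sum against the closed form
    gaussHolds :: Integer -> Bool
    gaussHolds n = sum [1 .. n] == n * (n + 1) `div` 2

    -- e.g. all gaussHolds [0 .. 100] evaluates to True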

10.4.3 Applications
In Chapter 9 the pigeonhole principle was presented (without proof) as Theorem 9.10. We prove this theorem here using mathematical induction.
Recall, by Definition 9.7, that

{0..n} def= {k : N | 0 ≤ k < n}

So note that {0..0} = {} and if n > 0 then {0..n} = {0, 1, · · · , n − 1}.

Theorem 10.16 (Pigeonhole Principle 1)

∀m, n : N. m > n ⇒ ∀f : {0..m} → {0..n}.¬Inj(f, {0..m}, {0..n})

Proof: By mathematical induction on m. The property we will prove is


P (m) def= ∀n : N. m > n ⇒ ∀f : {0..m} → {0..n}. ¬Inj(f, {0..m}, {0..n})

(Base Case:) We must show P (0) holds, i.e.

∀n : N. 0 > n ⇒ ∀f : {0..0} → {0..n}.¬Inj(f, {0..0}, {0..n})

Choose an arbitrary n ∈ N and assume 0 > n. But this assumption is not possible: there is no natural number less than 0, and so the base case holds by contradiction.
(Induction Step:) For arbitrary m ∈ N we assume P (m) (the induction hy-
pothesis) and show P (m + 1).

ind.hyp : ∀n : N. m > n ⇒ ∀f : {0..m} → {0..n}. ¬Inj(f, {0..m}, {0..n})

We must show.

∀n : N. m + 1 > n ⇒ ∀f : {0..m + 1} → {0..n}.¬Inj(f, {0..m + 1}, {0..n})

Choose an arbitrary n ∈ N and assume m + 1 > n. Then choose an arbitrary


function f ∈ {0..m + 1} → {0..n} and show that it is not an injection. To com-
plete this case we assume Inj(f, {0..m + 1}, {0..n}) and derive a contradiction
e.g. we assume:

(A) ∀i, j : {0..m + 1}. f (i) = f (j) ⇒ i = j

Now, consider the possibilities for n ∈ N, either n = 0 or n > 0.


Case n = 0. In this case, our assumption that f ∈ {0..m + 1} → {0..0} can be used to give a contradiction. The domain {0..m + 1} has at least 0 in it, even if m = 0. This means f (0) ∈ {0..0}. But {0..0} = ∅ and so f (0) ∈ ∅. This contradicts the corollary of the emptyset axiom (Corollary 5.1) from Chapter 5.

Case n > 0. In this case, our assumption that m + 1 > n means that m > n − 1 (subtracting one from n is justified because n > 0). Use n − 1 for n in the induction hypothesis to get the following:

m > n − 1 ⇒ ∀f : {0..m} → {0..n − 1}. ¬Inj(f, {0..m}, {0..n − 1})

Since we know m > n − 1 we assume:

(B) ∀f : {0..m} → {0..n − 1}. ¬Inj(f, {0..m}, {0..n − 1})

Now, consider the injective function f ∈ {0..m + 1} → {0..n} from the hypoth-
esis labeled (A) above.
There are two cases, f (m) = n − 1 or f (m) < n − 1.
Case f (m) = n − 1. In this case, since f is an injection in {0..m + 1} → {0..n}, removing m from the domain also removes n − 1 from the co-domain, and so f ↓ {0..m} is a function of type {0..m} → {0..n − 1}. Use this restricted f as a witness in the assumption labeled (B) and we assume

¬Inj(f, {0..m}, {0..n − 1})

If we can show that Inj(f, {0..m}, {0..n − 1}) we have completed this case. But
we already know that f is an injection on the larger domain {0..m + 1} so it is
an injection on the smaller one.
Case f (m) < n − 1. In this case, since f is an injection with codomain {0..n}, at most one element of the domain {0..m + 1} gets mapped by f to n − 1; if such an element exists, call it k. Using f we will construct a new function (call it g) by having g behave just like f except on input k (the one such that f (k) = n − 1), where we set g(k) = f (m). Since we assumed f (m) < n − 1 we know g(k) ∈ {0..n − 1}, and because f is an injection we know that no other element of the domain was mapped by f to f (m). So, g is defined as follows:

g(i) = if f (i) = n − 1 then f (m) else f (i)

Use g for f in (B) and we have the assumption that

¬Inj(g, {0..m}, {0..n − 1})

To prove this case we show Inj(g, {0..m}, {0..n − 1}). But we constructed g
using the injection f to be an injection as well.


10.5 Complete Induction


In the justification for mathematical induction given above in Sect. 10.3.1 it can be seen that, given a number n, in the process of building a justification that P (n) holds, justifications for each P (k), where k < n, are constructed along the way.

This suggests a stronger induction hypothesis may be possible: not just to assume that the property holds for the immediately preceding natural number, but that it holds for all preceding natural numbers. In fact, the following induction principle can be proved using ordinary mathematical induction.

Theorem 10.17 (Principle of Complete Induction)

(∀n : N. (∀k : N. k < n ⇒ P [k]) ⇒ P [n]) ⇒ ∀n : N. P [n]

A sequent style proof rule4 for complete induction is given as follows:

Proof Rule 10.3 (Complete Induction)

Γ, n ∈ N, ∀k : N. k < n ⇒ P [k] ⊢ P [n]
---------------------------------------- (CompNInd)   where n is fresh.
          Γ ⊢ ∀m : N. P [m]

10.5.1 Proof of the Principle of Complete Induction*


Indeed, we can prove Theorem 10.17. To do so we introduce a lemma which illustrates an interesting advanced proof strategy – strengthening the induction hypothesis. An attempt to directly prove Theorem 10.17 by induction on n fails: the base case is provable but the induction hypothesis is not strong enough to allow a proof of the induction step. The solution is to prove a theorem which gives a stronger induction hypothesis and to use that to prove Theorem 10.17.

Definition 10.9 (Complete Predicate) A predicate of natural numbers is


complete if the following holds:
complete(P ) def= ∀n : N. (∀m : N. m < n ⇒ P [m]) ⇒ P [n]

Lemma 10.5 (Complete Induction)

complete(P ) ⇒ ∀k : N.∀j : N.j < k ⇒ P [j]

Proof: Assume P is complete, i.e.:

C : ∀n : N.(∀m : N. m < n ⇒ P [m]) ⇒ P [n]

We prove
∀k : N.∀j : N.j < k ⇒ P [j]
by induction on k.
base case: Show ∀j : N. j < 0 ⇒ P [j]. This holds for arbitrary j ∈ N because j < 0 is in contradiction with Lemma 10.4, which says ¬(j < 0) for all j.
4 We have omitted the contexts ∆1 and ∆2 on the right sides of the sequents for readability.

induction step: For arbitrary k ∈ N assume the induction hypothesis:

IH : ∀j : N.j < k ⇒ P [j]

We must show
∀j : N.j < k + 1 ⇒ P [j]
Let j ∈ N be arbitrary, assume j < k + 1 and show P [j]. Since j < k + 1, there are two cases: j < k or j = k. In the case j < k the induction hypothesis yields P [j]. In the case j = k, use k for n in assumption C that P is complete to get the following:
(∀m : N. m < k ⇒ P [m]) ⇒ P [k]
The antecedent of this last formula is an instance of the induction hypothesis
IH (to see this rename the bound variable j in IH to m.) This yields P [k] and
since j = k we get P [j], as we were to show.


Theorem 10.17, the principle of complete induction, follows from Lemma 10.5.
Proof: To prove

(∀n : N. (∀k : N. k < n ⇒ P [k]) ⇒ P [n]) ⇒ ∀n : N. P [n]

assume P is complete, choose an arbitrary n ∈ N, and show P [n]. Since P is complete, by Lemma 10.5 we know

∀k : N. ∀j : N. j < k ⇒ P [j]

Use n + 1 for k and n for j, which yields n < n + 1 ⇒ P [n]. The antecedent n < n + 1 is true (take j = 0 in Definition 10.7, since n + (0 + 1) = n + 1) and so P [n] holds.


10.5.2 Applications
Complete induction is especially useful in proving properties of functions defined
by recursion but which do not follow the structure of the natural numbers. The
Fibonacci numbers provide an excellent example of such a function.

Definition 10.10. Fibonacci numbers

F (0) = 0
F (1) = 1
F (k + 2) = F (k + 1) + F (k)

Theorem 10.18 (Fibonacci grows slower than Exp)

∀n : N. F (n) < 2^n

Proof: By complete induction on n. The property is P [n] def= F (n) < 2^n. We assume that n ∈ N and our induction hypothesis becomes

Induction Hypothesis: ∀k : N. k < n ⇒ F (k) < 2^k

We must show that F (n) < 2^n. We assert the following, leaving the proof to the reader.

∀m : N. m = 0 ∨ m = 1 ∨ ∃k : N. m = k + 2

Using n for m in the assertion we have three cases to consider.
case [n = 0]: Assume n = 0 and show F (0) < 2^0. By definition, F (0) = 0 and 2^0 = 1 so this case holds.
case [n = 1]: Assume n = 1 and show F (1) < 2^1. By definition F (1) = 1 and also 2^1 = 2 so this case holds.
case [∃k : N. n = k + 2]: Assume ∃k : N. n = k + 2. Let k ∈ N be such that n = k + 2. We must show F (k + 2) < 2^(k+2). By definition F (k + 2) = F (k + 1) + F (k). Since we have assumed n = k + 2, we know that k + 1 < n and k < n. Using k + 1 and k in the induction hypothesis we get the facts F (k + 1) < 2^(k+1) and F (k) < 2^k. We use two instances of Thm. 10.13 to get the following:

F (k + 1) + F (k) < 2^(k+1) + F (k) < 2^(k+1) + 2^k

Note that by Thm. 10.14 we know 2^k < 2^(k+1). This justifies the following:

2^(k+1) + 2^k < 2^(k+1) + 2^(k+1) = 2 · 2^(k+1) = 2^(k+2)

This string of inequalities and equalities shows that F (k + 2) < 2^(k+2) and so this case is complete.
By these three cases we have shown that for all n ∈ N the theorem holds.


Definition 10.11 (Divisibility) For m, n ∈ N we say m|n (read: “m divides


n”) if there is a k ∈ N such that n = m · k.
m|n def= ∃k : N. n = k · m

Definition 10.12 (Prime Numbers)

Prime(n) def= n > 1 ∧ ∀k : {2 · · · n}. ¬(k|n)

Corollary 10.3 (2 is the least prime)

∀n : N. P rime(n) ⇒ 2 ≤ n

Proof: Choose an arbitrary n ∈ N and assume P rime(n). Then n > 1 and


∀k : {2 · · · n}. ¬(k|n). Since n > 1 we know n = 2 or n > 2. If n = 2 then the
theorem holds because 2 ≤ 2. If n > 2, then 2 < n and so 2 ≤ n.

Thus the set of prime numbers is {2, 3, 5, 7, 11, 13, 17, · · · }.

Lemma 10.6 (Not Prime Composite)

∀n : N. n > 1 ⇒ (¬Prime(n) ⇔ ∃i, j : {2 · · · n}. n = i · j)

In Chapter 11 we completely develop the list data-type and formally describe a function which computes the product of a list of numbers. We present it here; the intuition behind the function is enough to understand Theorem 10.19.
Informally,

Π [k1 , k2 , · · · , kn ] = k1 · k2 · · · kn

If the list is empty ([ ]) we stipulate Π[ ] = 1.
Definition 10.13 (Product of a List) We formally define the product of a list of numbers as follows:

Π [ ] = 1
Π (k :: rest) = k · Π rest
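In Haskell the same definition reads as follows (a sketch of our own; the name prodL is our choice):

    prodL :: [Integer] -> Integer
    prodL []         = 1                -- Π [ ] = 1
    prodL (k : rest) = k * prodL rest   -- Π (k :: rest) = k · Π rest

    -- e.g. prodL [2,2,5] == 20, so [2,2,5] is a factorization of 20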

A factorization is a kind of decomposition of a number.

Definition 10.14 (Factorization) We say a list f is a factorization of n if


the product Πf = n. A list f is a prime factorization of n if f is a factorization
of n and every element of f is a prime number.

Example 10.3. Note that 4 = 2 · 2 so a factorization of 4 is given by the list


[2, 2]. The factorization of a prime is a singleton list containing the prime. For
example, the factorization of 7 is the list [7]. A non-prime factorization of 20
is [2, 10]. The prime factorization of 20 is [2, 2, 5]. When using lists to represent
factorizations the order of the elements in the factorization is not considered.
Thus, [2, 2, 5] is the same factorization (up to the order of elements) as [2, 5, 2]
and as [5, 2, 2]. If we say a factorization is unique we mean modulo the order of
the elements.

Exercise 10.5. Note that sets of natural numbers are not a suitable represen-
tation of factorizations. Why not?

The fundamental theorem of arithmetic says that each natural number n


can be uniquely factorized by primes. It says something rather deep about the
structure of the natural numbers; in a mathematically precise way it says that
the primes are the building-blocks of the natural numbers.

Theorem 10.19 (Fundamental Theorem of Arithmetic)

∀n : N. n > 0 ⇒ ∃f : NList. (∀k : N. k ∈ f ⇒ Prime(k)) ∧ Πf = n



Proof: We prove the theorem by complete induction on n. The property of n to be proved is:

P [n] def= n > 0 ⇒ ∃f : NList. (∀k : N. k ∈ f ⇒ Prime(k)) ∧ Πf = n

Let n ∈ N be arbitrary; the induction hypothesis is of the following form:

∀k : N. k < n ⇒ P [k]

We must show P [n], i.e. that

n > 0 ⇒ ∃f : NList. (∀k : N. k ∈ f ⇒ Prime(k)) ∧ Πf = n

Assume n > 0 and show that there exists a prime factorization f of n. We consider cases: either n = 1 or n > 1. If n = 1 then let f = [ ]. We must show the following.

(∀k : N. k ∈ [ ] ⇒ Prime(k)) ∧ Π[ ] = n

The left conjunct holds vacuously since assuming k ∈ [ ] for arbitrary k is a contradiction, i.e. there are no elements in [ ]. By the definition of the product of a list, Π[ ] = 1 = n. In the case n > 1 we consider whether n is prime itself or not, i.e. by cases, either Prime(n) or ¬Prime(n). If Prime(n) then the list f = [n] is a prime factorization. If ¬Prime(n) then, by Lemma 10.6, we know there exist j, k ∈ {2..n} such that n = j · k. Since both j and k are at least 2, we have j < n and k < n, so the induction hypothesis applies to each, giving prime factorizations f₁ of j and f₂ of k. The append f₁ ++ f₂ (defined in Chapter 11) contains only primes and has product j · k = n, completing the proof.

Chapter 11

Lists

11.1 Lists

John McCarthy (1927– ), an American Computer Scientist and one of the fathers of Artificial Intelligence. McCarthy invented the LISP programming language after reading Alonzo Church’s monograph on the lambda calculus. LISP stands for LISt Processing, and lists are well supported as the fundamental datatype in the language.

Lists may well be the most ubiquitous datatype in computer science. Functional programming languages like LISP, Scheme, ML and Haskell support lists in significant ways that make them a go-to data-structure. They can be used to model many collection classes (multisets or bags come to mind) as well as relations (as lists of pairs) and finite functions.
We define lists here that may contain elements from some set T . These are
the so-called monomorphic lists; they can only contain elements of type T . There
are two constructors to create a list. Nil (written as “[ ]”) is a constant symbol
denoting the empty list and “::” is a symbol denoting the constructor that adds
an element of the set T to a previously constructed list. This constructor is,


for historical reasons, called “cons”. Note that although “[ ]” and “::” are both written as sequences of two characters, we consider them to be atomic symbols for the purposes of the syntax.
This is the first inductive definition where a parameter (in this case T ) has
been used.
Definition 11.1 (T List)
List T ::= [ ] | a :: L
where
T : is a set,
[ ]: is a constant symbol denoting the empty list, which is called “nil”,
a: is an element of the set T , and
L: is a previously constructed List T .
A list of the form a::L is called a cons. The element a from T in a::L is
called the head and the list L in the cons a::L is called the tail.
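The definition transcribes directly into Haskell; the following is a sketch of our own, with the constructor names Nil and Cons our choices (Haskell’s built-in lists provide the same two constructors as [] and :):

    data List t = Nil | Cons t (List t) deriving Show

    -- the list the text writes as a::b::[ ], when T is a set of characters:
    example :: List Char
    example = Cons 'a' (Cons 'b' Nil)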
Example 11.1. As an example, let A = {a, b}, then the set of terms in the
class List A is the following:
{[ ], a::[ ], b::[ ], a::a::[ ], a::b::[ ], b::a::[ ], b::b::[ ], a::a::a::[ ], a::a::b::[ ], · · · }
We call terms in the class List A lists. If A ≠ ∅ then the set of lists in class List A is infinite, but also, like the representation of natural numbers, the representation of each individual list is finite.1 Finiteness follows from the fact that lists are constructed by consing some value from the set A onto a previously constructed List A . Note that we assume a::b::[ ] means a::(b::[ ]) and not (a::b)::[ ]; to express this we say cons associates to the right. The second form violates the rule for cons because a::b is not well-formed: b is an element of A, not a previously constructed List A . To make reading lists easier we simply separate the consed elements with commas and enclose them in square brackets “[” and “]”; thus, we write a::[ ] as [a] and write a::b::[ ] as [a, b]. Using this notation we can rewrite the set of lists in the class List A more succinctly as follows:
Note that the set T need not be finite; for example, the class List N is perfectly sensible. In this case there are infinitely many lists containing only one element, e.g.
{[0], [1], [2], [3] · · · }
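The level-by-level enumeration in this example is easy to compute. The following Haskell sketch (the helper name listsOf is ours) produces all lists of a given length over a finite set; concatenating successive levels reproduces the enumeration above.

    -- All lists of length n with elements drawn from as.
    listsOf :: Int -> [a] -> [[a]]
    listsOf 0 _  = [[]]
    listsOf n as = [ a : l | a <- as, l <- listsOf (n - 1) as ]

    -- concat [ listsOf n "ab" | n <- [0 .. 2] ]
    --   == ["", "a", "b", "aa", "ab", "ba", "bb"]   (strings are lists of Char)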

Abstract Syntax Trees for Lists


Note that the pretty linear notation for lists is only intended to make them more readable; the syntactic structure underlying the list [a, b, a] is displayed by the abstract syntax tree shown in Fig. 11.1.
1 The infinite analogs of lists are called streams. They have an interesting theory of their own, where induction is replaced by a principle of co-induction. The Haskell programming language supports an elegant style of programming with streams by implementing an evaluation mechanism that is lazy: computations are only performed when the result is needed.
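As a small illustration of this point, here is the stream of all natural numbers written in Haskell; because evaluation is lazy, only the prefix actually demanded is ever computed.

    -- An infinite stream; usable only because Haskell evaluates lazily.
    nats :: [Integer]
    nats = 0 : map (+ 1) nats

    -- take 5 nats == [0,1,2,3,4]; the rest of the stream is never built.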

         ::
        /  \
       a    ::
           /  \
          b    ::
              /  \
             a    [ ]

Figure 11.1: Syntax tree for the list [a, b, a] constructed as a::(b::(a::[ ]))

11.2 Definition by recursion


In the same way that in Chapter 10 we defined functions by recursion on the
structure of an argument of type N, we can define functions of type List A → B
by recursion on the structure of list arguments.
For example, we can define the append function that glues two lists together
(given inputs L and M where L, M ∈ List T , L ++ M is a list in List T ). The
append function is defined by recursion on the structure of the first argument
as follows:

Definition 11.2 (List append)

append ([ ], M )      def=  M
append ((a::L), M )   def=  a::(append (L, M ))

The first equation of the definition says: if the first argument is the empty list
[ ], the result is just the second argument. The second equation of the definition
says, if the first argument is a cons of the form a::L, then cons a on the append
of L and M . Thus, there are two equations, one for each rule that could have
been used to construct the first argument of the function. Note that since there
are only two ways to construct a list, this definition covers all possible ways the
first argument could be constructed.
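Transcribed into Haskell over the built-in list type, Definition 11.2 reads as follows; this is essentially how Haskell's own append operator is defined.

    -- Definition 11.2 in Haskell.
    append :: [a] -> [a] -> [a]
    append []      m = m
    append (a : l) m = a : append l m

    -- append "ab" "c" == "abc"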

Example 11.2. We give some example computations with the definition of


append.
append ((a::b::[ ]), (c::[ ]))
= a::(append ((b::[ ]), (c::[ ])))
= a::b::(append ([ ], (c::[ ])))
= a::b::c::[ ]

Using the more compact notation for lists, we have shown append ([a, b], [c]) = [a, b, c]. In this notation the derivation can be rewritten as follows:
append ([a, b], [c])
= a::(append ([b], [c]))
= a::b::(append ([ ], [c]))
= a::b::[c]
= [a, b, c]

We will use the more succinct notation for lists from now on, but do not forget that it is just a more readable display for the more cumbersome but precise notation which explicitly uses the cons constructor.
Append is such a common operation on lists that the ML and Haskell programming languages provide convenient infix notations for the list append operator. In Haskell the symbol is “++”; in the ML family of languages it is “@”. We will use “ ++ ” here. Using the infix notation the definition appears as follows:

Definition 11.3 (List append (infix))


[ ] ++ M       def=  M
(a::L) ++ M    def=  a::(L ++ M )

Example 11.3. Here is an example of a computation using the infix operator


for append and the more compact notation for lists.

[a, b] ++ [c]
= a::([b] ++ [c])
= a::b::([ ] ++ [c])
= a::b::[c]
= [a, b, c]

We would like to know that append really is a function of type (List A × List A ) → List A . In Chapter 10 we presented a theorem, together with a corollary, justifying recursive definitions of a particular form; they guaranteed that definitions following a certain syntactic pattern are well-defined functions. Similar results hold for lists, and indeed there are similar theorems justifying definition by recursion for any inductively defined type.

Theorem 11.1 (Definition by recursion for list functions) Given sets A


and B and an element b ∈ B and a function g ∈ (A × B) → B, definitions
having the following form:

f ([ ]) = b
f (x :: xs) = g(x, f (xs))

result in well-defined functions, f ∈ List A → B.
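Readers familiar with functional programming will recognize this recursion scheme: a definition of the stated form is exactly a right fold, determined completely by the data b and g. A Haskell sketch (the name recList is ours; it coincides with the standard foldr):

    -- The recursion scheme of Theorem 11.1 as one higher-order function.
    recList :: (a -> b -> b) -> b -> [a] -> b
    recList g b []       = b
    recList g b (x : xs) = g x (recList g b xs)

    -- For instance: length = recList (\_ n -> 1 + n) 0
    --           and l ++ m = recList (:) m l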

The corollary for functions of two arguments is given as:



Corollary 11.1 (Definition by recursion (for binary functions)) Given sets A, B and C, a function h ∈ B → C, and a function g ∈ (A × C) → C, definitions of the following form, where b ranges over B:

f ([ ], b) = h(b)
f ((x::xs), b) = g(x, f (xs, b))

result in well-defined functions, f ∈ (List A × B) → C.

Theorem 11.2 (append ∈ (List A × List A ) → List A ) Recall the definition of ap-
pend
append ([ ], M )      def=  M
append ((a::L), M )   def=  a::(append (L, M ))
Proof: We apply Corollary 11.1. Let A be an arbitrary set and let B = List A and C = List A . Let h be the identity function on List A (so h ∈ B → C) and let g(x, m) = x::m, so that g ∈ (A × List A ) → List A . The definition of append fits the pattern of the corollary, which shows that append is a function of type (List A × List A ) → List A .


In general, there is no need to apply the theorem or corollary: if a definition is presented by cases on a list argument, the [ ] case is not recursive, and each recursive call is made on the tail of the cons, then the operator can be shown to be a function. The idea is that at each recursive call the list argument gets shorter, and eventually it becomes [ ]; at that point the base case is invoked, there is no more recursion, and the computation terminates.
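The proviso that recursive calls are made on the tail matters. The following deliberately broken Haskell definition (our own example) is presented by cases on a list and has a non-recursive [ ] case, yet it never terminates on a cons, because its recursive call does not shrink the argument:

    -- Fits the surface pattern but violates the termination argument.
    bad :: [a] -> Int
    bad []       = 0
    bad (x : xs) = bad (x : xs)   -- the argument never gets shorter; loops forever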
Here are a few more functions defined by recursion on lists.

Definition 11.4 (List length)

length([ ]) = 0
length(x::xs) = 1 + length(xs)

We will write |L| instead of length(L) in the following.

Definition 11.5 (List member)

mem(y, [ ]) = false
mem(y, x::xs) = (y = x) ∨ mem(y, xs)

We will write y ∈ M for mem(y, M ).

Definition 11.6 (List reverse)

rev([ ]) = [ ]
rev(x::xs) = rev(xs) ++ [x]
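In Haskell these three definitions transcribe almost verbatim; mem needs an equality test on the element type, hence the Eq constraint. (The standard library already provides them as length, elem and reverse.)

    len :: [a] -> Int
    len []       = 0
    len (_ : xs) = 1 + len xs

    mem :: Eq a => a -> [a] -> Bool
    mem _ []       = False
    mem y (x : xs) = y == x || mem y xs

    rev :: [a] -> [a]
    rev []       = []
    rev (x : xs) = rev xs ++ [x]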

Exercise 11.1. Show that length, member and reverse are all functions by
applying Thm. 11.1 or Corollary 11.1.

11.3 List Induction


The structural induction principle for lists is given as follows:
Definition 11.7 (List Induction) For a set A and a property P of List A we have the following axiom:

(P [[ ]] ∧ ∀x : A. ∀xs : List A . P [xs] ⇒ P [x::xs]) ⇒ ∀ys : List A . P [ys]

The corresponding proof rule is given as follows:

Proof Rule 11.1 (List Induction)

             Γ ⊢ P [[ ]]      Γ, x ∈ A, xs ∈ List A , P [xs] ⊢ P [x::xs]
(List A Ind) ─────────────────────────────────────────────────────────── x, xs fresh.
             Γ ⊢ ∀ys : List A . P [ys]

11.3.1 Some proofs by list induction


Def. 11.3 of the append function shows directly that [ ] is a left identity for ++ . Is it a right identity as well? The following theorem establishes that it is.
Theorem 11.3 (Nil right identity for ++ )
∀ys : List A . ys ++ [ ] = ys
Proof: By list induction on ys. The property P of ys is given as:
P [ys]  def=  ys ++ [ ] = ys
Base Case: Show P [[ ]], i.e. that [ ] ++ [ ] = [ ]. This follows immediately
from the definition of append.
Induction Step: Assume P [xs] (the induction hypothesis) and show
P [x::xs] for arbitrary x ∈ A and arbitrary xs ∈ List A . The induction hypothesis
is:
xs ++ [ ] = xs
We must show (x::xs) ++ [ ] = (x::xs). Starting with the left side of the equality
we get the following:
(x::xs) ++ [ ]  =  x::(xs ++ [ ])      ⟨⟨def. of ++⟩⟩
                =  x::xs               ⟨⟨ind. hyp.⟩⟩
So the induction step holds and the proof is complete.

Together, Def. 11.3 and Thm. 11.3 establish that [ ] is a left and right identity
with respect to append. Thus, with respect to ++ , [ ] behaves like zero does
for ordinary addition.
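Only the inductive proof establishes the theorem for all lists, but it is easy to gather evidence by testing a few cases in Haskell; this is a sanity check, not a proof.

    -- Spot-checking Theorem 11.3 on a handful of lists.
    checkNilRightId :: Bool
    checkNilRightId = all (\xs -> xs ++ [] == xs)
                          [ [], [1], [1, 2], [3, 1, 2 :: Int] ]
    -- evaluates to True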
The next theorem shows that append is associative.
Theorem 11.4 ( ++ is associative)
∀ys, zs, xs : List A . xs ++ (ys ++ zs) = (xs ++ ys) ++ zs

Proof: Choose arbitrary ys, zs ∈ List A . We continue by list induction on xs.


The property P of xs is given as:

P [xs]  def=  xs ++ (ys ++ zs) = (xs ++ ys) ++ zs

Base Case: Show P [[ ]], i.e. that

[ ] ++ (ys ++ zs) = ([ ] ++ ys) ++ zs

On the left side:


[ ] ++ (ys ++ zs)  =  ys ++ zs        ⟨⟨def. of ++⟩⟩

On the right side:

([ ] ++ ys) ++ zs  =  ys ++ zs        ⟨⟨def. of ++⟩⟩

So the base case holds.


Induction Step: For arbitrary x ∈ A and xs ∈ List A we assume P [xs]
(the induction hypothesis) and show P [x::xs].

xs ++ (ys ++ zs) = (xs ++ ys) ++ zs Ind.Hyp.

We must show

(x::xs) ++ (ys ++ zs) = ((x::xs) ++ ys) ++ zs

Starting on the left side:

(x::xs) ++ (ys ++ zs)  =  x::(xs ++ (ys ++ zs))      ⟨⟨def. of ++⟩⟩
                       =  x::((xs ++ ys) ++ zs)      ⟨⟨Ind. Hyp.⟩⟩

On the right side:

((x::xs) ++ ys) ++ zs  =  (x::(xs ++ ys)) ++ zs      ⟨⟨def. of ++⟩⟩
                       =  x::((xs ++ ys) ++ zs)      ⟨⟨def. of ++⟩⟩

So the left and right sides are equal and this completes the proof.

Here’s an interesting theorem about reverse. We’ve seen this pattern before
in Chapter 6 in Theorem 6.2 where we proved that the inverse of a composition
is the composition of inverses.

Theorem 11.5 (Reverse of append)

∀ys, xs : List A . rev(xs ++ ys) = rev(ys) ++ rev(xs)



Proof: Choose an arbitrary ys ∈ List A and proceed by list induction on xs. The property P of xs is given as:

P [xs]  def=  rev(xs ++ ys) = rev(ys) ++ rev(xs)

Base Case: Show P [[ ]] i.e. that

rev([ ] ++ ys) = rev(ys) ++ rev([ ])

We start with the left side and reason as follows:


rev([ ] ++ ys)  =  rev(ys)            ⟨⟨def. of ++⟩⟩

On the right side,


rev(ys) ++ rev([ ])  =  rev(ys) ++ [ ]      ⟨⟨def. of rev⟩⟩
                     =  rev(ys)             ⟨⟨Thm. 11.3⟩⟩

So the left and right sides are equal and the base case holds.
Induction Step: For arbitrary x ∈ A and arbitrary xs ∈ List A , assume P [xs] (the induction hypothesis) and show P [x::xs].

rev(xs ++ ys) = rev(ys) ++ rev(xs) Ind.Hyp.

We must show:

rev((x::xs) ++ ys) = rev(ys) ++ rev((x::xs))

We start with the left side.


rev((x::xs) ++ ys)
  =  rev(x::(xs ++ ys))               ⟨⟨def. of ++⟩⟩
  =  rev(xs ++ ys) ++ [x]             ⟨⟨def. of rev⟩⟩
  =  (rev(ys) ++ rev(xs)) ++ [x]      ⟨⟨Ind. Hyp.⟩⟩

On the right side:

rev(ys) ++ rev((x::xs))
  =  rev(ys) ++ (rev(xs) ++ [x])      ⟨⟨def. of rev⟩⟩
  =  (rev(ys) ++ rev(xs)) ++ [x]      ⟨⟨Thm. 11.4⟩⟩

So the left and right sides are equal and this completes the proof.
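As with Theorem 11.3, the identity can be sanity-checked in Haskell using the built-in reverse; again this is evidence only, and the induction above is the proof.

    -- Spot-checking Theorem 11.5 on pairs of sample lists.
    checkRevApp :: Bool
    checkRevApp = and [ reverse (xs ++ ys) == reverse ys ++ reverse xs
                      | xs <- samples, ys <- samples ]
      where samples = [ [], [1], [1, 2], [3, 1, 2 :: Int] ]
    -- evaluates to True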


