Discrete Mathematics
for
Computer Scientists
James Caldwell
Department of Computer Science
University of Wyoming
Laramie, Wyoming
Draft of
August 26, 2011
© James Caldwell1 2011
ALL RIGHTS RESERVED
1 This material is based upon work partially supported by the National Science Foundation
under Grant No. 9985239. Any opinions, findings, and conclusions or recommendations
expressed in this material are those of the author(s) and do not necessarily reflect the views
of the National Science Foundation.
Contents
I Logic 17
2 Propositional Logic 21
2.1 Syntax of Propositional Logic . . . . . . . . . . . . . . . . . . . . 21
2.1.1 Formulas . . . . . . . . . . . . . . . . . . . . . . . . . . . 22
2.1.2 Definitions: Extending the Language . . . . . . . . . . . . 24
2.1.3 Substitutions* . . . . . . . . . . . . . . . . . . . . . . . . 24
2.1.4 Exercises . . . . . . . . . . . . . . . . . . . . . . . . . . . 25
2.2 Semantics . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 25
2.2.1 Boolean values and Assignments . . . . . . . . . . . . . . 25
2.2.2 The Valuation Function . . . . . . . . . . . . . . . . . . . 26
2.2.3 Truth Table Semantics . . . . . . . . . . . . . . . . . . . . 28
2.2.4 Exercises . . . . . . . . . . . . . . . . . . . . . . . . . . . 30
2.3 Proof Theory . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 30
2.3.1 Sequents . . . . . . . . . . . . . . . . . . . . . . . . . . . . 31
2.3.2 Semantics of Sequents . . . . . . . . . . . . . . . . . . . . 32
2.3.3 Sequent Schemas and Matching . . . . . . . . . . . . . . . 34
2.3.4 Proof Rules . . . . . . . . . . . . . . . . . . . . . . . . . . 35
2.3.5 Proofs . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 40
2.3.6 Some Useful Tautologies . . . . . . . . . . . . . . . . . . . 42
2.3.7 Exercises . . . . . . . . . . . . . . . . . . . . . . . . . . . 43
2.4 Metamathematical Considerations* . . . . . . . . . . . . . . . . . 43
4 Predicate Logic 55
4.1 Predicates . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 55
4.2 The Syntax of Predicate Logic . . . . . . . . . . . . . . . . . . . 57
4.2.1 Variables . . . . . . . . . . . . . . . . . . . . . . . . . . . 57
4.2.2 Terms . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 57
4.2.3 Formulas . . . . . . . . . . . . . . . . . . . . . . . . . . . 58
4.3 Substitution . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 61
4.3.1 Bindings and Variable Occurrences . . . . . . . . . . . . . 61
4.3.2 Free Variables . . . . . . . . . . . . . . . . . . . . . . . . . 62
4.3.3 Capture Avoiding Substitution* . . . . . . . . . . . . . . 64
4.4 Proofs . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 66
4.4.1 Proof Rules for Quantifiers . . . . . . . . . . . . . . . . . 66
4.4.2 Universal Quantifier Rules . . . . . . . . . . . . . . . . . . 66
4.4.3 Existential Quantifier Rules . . . . . . . . . . . . . . . . . 67
4.4.4 Some Proofs . . . . . . . . . . . . . . . . . . . . . . . . . 67
4.4.5 Translating Sequent Proofs into English . . . . . . . . . . 70
5.2.2 Subsets . . . . . . . . . . . . . . . . . . . . . . . . . . . . 80
5.3 Set Constructors . . . . . . . . . . . . . . . . . . . . . . . . . . . 81
5.3.1 The Empty Set . . . . . . . . . . . . . . . . . . . . . . . . 81
5.3.2 Unordered Pairs and Singletons . . . . . . . . . . . . . . . 83
5.3.3 Ordered Pairs . . . . . . . . . . . . . . . . . . . . . . . . . 85
5.3.4 Set Union . . . . . . . . . . . . . . . . . . . . . . . . . . . 87
5.3.5 Set Intersection . . . . . . . . . . . . . . . . . . . . . . . . 89
5.3.6 Power Set . . . . . . . . . . . . . . . . . . . . . . . . . . . 90
5.3.7 Comprehension . . . . . . . . . . . . . . . . . . . . . . . . 90
5.3.8 Set Difference . . . . . . . . . . . . . . . . . . . . . . . . . 92
5.3.9 Cartesian Products and Tuples . . . . . . . . . . . . . . . 92
5.4 Properties of Operations on Sets . . . . . . . . . . . . . . . . . . 94
5.4.1 Idempotency . . . . . . . . . . . . . . . . . . . . . . . . . 94
5.4.2 Monotonicity . . . . . . . . . . . . . . . . . . . . . . . . . 94
5.4.3 Commutativity . . . . . . . . . . . . . . . . . . . . . . . . 95
5.4.4 Associativity . . . . . . . . . . . . . . . . . . . . . . . . . 95
5.4.5 Distributivity . . . . . . . . . . . . . . . . . . . . . . . . . 95
6 Relations 97
6.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 97
6.2 Relations . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 98
6.2.1 Binary Relations . . . . . . . . . . . . . . . . . . . . . . . 98
6.2.2 n-ary Relations . . . . . . . . . . . . . . . . . . . . . . . . 99
6.2.3 Some Particular Relations . . . . . . . . . . . . . . . . . . 99
6.3 Operations on Relations . . . . . . . . . . . . . . . . . . . . . . . 100
6.3.1 Inverse . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 100
6.3.2 Complement of a Relation . . . . . . . . . . . . . . . . . . 101
6.3.3 Composition of Relations . . . . . . . . . . . . . . . . . . 101
6.4 Properties of Relations . . . . . . . . . . . . . . . . . . . . . . . . 104
6.4.1 Reflexivity . . . . . . . . . . . . . . . . . . . . . . . . . . 105
6.4.2 Irreflexivity . . . . . . . . . . . . . . . . . . . . . . . . . . 105
6.4.3 Symmetry . . . . . . . . . . . . . . . . . . . . . . . . . . . 105
6.4.4 Antisymmetry . . . . . . . . . . . . . . . . . . . . . . . . 106
6.4.5 Asymmetry . . . . . . . . . . . . . . . . . . . . . . . . . . 106
6.4.6 Transitivity . . . . . . . . . . . . . . . . . . . . . . . . . . 106
6.4.7 Connectedness . . . . . . . . . . . . . . . . . . . . . . . . 106
6.5 Closures . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 106
6.6 Properties of Operations on Relations . . . . . . . . . . . . . . . 109
8 Functions 119
8.1 Functions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 119
8.2 Extensionality (equivalence for functions) . . . . . . . . . . . . . 120
8.3 Operations on functions . . . . . . . . . . . . . . . . . . . . . . . 121
8.3.1 Restrictions and Extensions . . . . . . . . . . . . . . . . . 121
8.3.2 Composition of Functions . . . . . . . . . . . . . . . . . . 121
8.3.3 Inverse . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 124
8.4 Properties of Functions . . . . . . . . . . . . . . . . . . . . . . . 124
8.4.1 Injections . . . . . . . . . . . . . . . . . . . . . . . . . . . 125
8.4.2 Surjections . . . . . . . . . . . . . . . . . . . . . . . . . . 125
8.4.3 Bijections . . . . . . . . . . . . . . . . . . . . . . . . . . . 126
8.5 Exercises . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 129
11 Lists 167
11.1 Lists . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 167
11.2 Definition by recursion . . . . . . . . . . . . . . . . . . . . . . . . 169
11.3 List Induction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 172
11.3.1 Some proofs by list induction . . . . . . . . . . . . . . . . 172
List of Definitions.
List of Examples.
Preface
Discrete mathematics is a required course in the undergraduate Computer Science curriculum. In a perhaps unsympathetic view of the standard presentations (and there are many), the material in the course is treated as a discrete collection of so many techniques that students must master for further studies in Computer Science. Our philosophy, and the one embodied in this book, is different. Of course the development of students' abilities to do logic and proofs, to know about naive set theory, relations, functions, graphs, inductively defined structures, definitions by recursion on inductively defined structures and elementary combinatorics is important. But we believe that rather than so many assorted topics and techniques to be learned, the course can flow continuously as a single narrative, each topic linked by a formal presentation building on previous topics. We believe that Discrete Mathematics is perhaps the most intellectually exciting and potentially one of the most interesting courses in the computer science curriculum. Rather than simply viewing the course as a necessary tool for further, and perhaps more interesting, developments to come later, we believe it is the place in the curriculum where an appreciation of the deep ideas of computer science can be presented: the relation between syntax and semantics, how it is that unbounded structures can be defined finitely, and how to reason about those structures and how to calculate with them.
Most texts, following perhaps standard mathematical practice, attempt to minimize the formalism, assuming that a student's intuition will guide them through to the end, often avoiding proofs in favor of examples.2 Mathematical intuition is an entirely variable personal attribute, and even individuals with significant talents can be misguided by intuition. This is shown over and over in the history of mathematics; the history of the characterization of infinity is a prime example, but many others exist, like the Banach-Tarski paradox [?]. We do not argue that intuition should be banished from teaching mathematics but instead that the discrete mathematics course is a place in the curriculum to cultivate the idea, useful in higher mathematics and in computer science, that formalism is trustworthy and can be used to verify intuition.
Indeed, we believe, contrary to the common conception, that rather than making the material more opaque, a formal presentation gives students a way to understand the material in a deeper and more satisfying way. The fact that formal objects can easily be represented in ways that they can be consumed by computers lends a concreteness to the ideas presented in the course. The fact that formal proofs can sometimes be found by a machine and can always be checked by a machine gives an absolute criterion for what counts as a proof; in our experience, this unambiguous nature of formal proofs is a comfort to students trying to decide if they've achieved a proof or not. Once the formal criterion for proof has been assimilated, it is entirely appropriate to relax the rigid idea of a proof as a machine checkable structure and to allow more simply
2 As an example we cite the pigeonhole principle, which is not proved in any discrete mathematics text we know of but which is motivated by example. The proof is elementary once the ideas of injection, surjection and one-to-one mappings have been presented.
1.1 Introduction
Syntax has to do with form and semantics has to do with meaning. Syntax is described by specifying a set of structured terms while semantics associates a meaning to the structured terms. In and of itself syntax does not have meaning, only structure. Only after a semantic interpretation has been specified for the syntax do the structured terms acquire meaning. Of course, good syntax suggests the intended meaning in a way that allows us to see through it to the intended meaning, but it is an essential aspect of the formal approach, based on the separation of syntax and semantics, that we do not attach these meanings until they have been specified.
The syntax/semantics distinction is fundamental in Computer Science and goes back to the very beginning of the field. Abstractly, computation is the manipulation of formal (syntactic) representations of objects.1
For example, when compiling a program written in some language (say C++) the compiler first checks the syntax to verify that the program is in the language.
1 The abstract characterization of computation as the manipulation of syntax was first given by logicians in the 1930's, who were the first to try to describe what we mean by the word "algorithm".
1.3 Syntax
We can finitely describe abstract syntax in a number of ways. A common way
is to describe the terms of the language inductively by giving a formal grammar
describing how terms of the language can be constructed. We give an abstract
description of a grammar over an alphabet and then, in later sections we provide
examples to make the ideas more concrete.
The symbol ::= separates the name of the syntactic class being defined from the collection of rules that define it. Note that the vertical bar "|" is read as "or" and it separates the rules (or productions) used to construct the terms of the class. The rules separated by the vertical bar are alternatives. The order of the rules does not matter, but in more complex cases it is conventional to write the simpler cases first. Sometimes it is convenient to parametrize the class being defined by some set. We show an example of this below where we simultaneously define lists over some set T all at once, rather than making separate syntactic definitions for each kind of list.
Traditionally, the constructors are also sometimes called rules or productions. They describe the allowable forms of the structures included in the language. The constructors are either constants from the alphabet, elements from some collection of sets, or functions describing how to construct new complex constructs consisting of symbols from the alphabet, elements from the parameter sets, and possibly from previously constructed elements of the syntactic class; the constructor functions return new elements of the syntactic class. At least one constructor must not include arguments consisting of previously constructed elements of the class being defined; this ensures that the syntactic structures in the language defined by the grammar are finite. These non-recursive alternatives (the ones that do not have subparts which are of the type of structure being defined) are sometimes called the base cases.
Two syntactic elements are equal if they are constructed using identical
constructors applied to equal arguments. It is never the case that c1 x = c2 x if
c1 and c2 are different constructors.
[Portrait: Noam Chomsky]
[Figure: abstract syntax trees for a ∗ (b + c) (left) and (a ∗ b) + c (right).]
Abstract syntax can be displayed in tree form. For example, the formula a ∗ (b + c) is displayed by the abstract syntax tree on the left in Fig. ?? and the formula (a ∗ b) + c is displayed by the tree on the right of Fig. ??. Notice that the ambiguity disappears when displayed in tree form since the principal constructor labels the top of the tree. The immediate subterms are at the next level and so on. For arithmetic formulas, you can think of the topmost (or principal) operator as the last one you would evaluate.
2 Of course the fact that we read and write from left to right is only an arbitrary convention; Hebrew and Egyptian hieroglyphics are read from right to left. But even the notions of left and right are simply conventions. Herodotus [27] tells us in his book The History (written about 440 B.C.) that the ancient Egyptians wrote moving from right to left but he reports "they say they are moving [when writing] to right", i.e. what we (in agreement with the ancient Greeks) call left the ancient Egyptians called right and vice versa. I theorize that notions of right and left may have first been understood only in relation to the linear form introduced by writing. In that case, if right means "the side of the papyrus you start on when writing a new line" then the Egyptian interpretation of right and left coincides with the Greeks'.
Syntax of B
The Booleans3 consist of two elements. We denote the elements by the alphabet consisting of the symbols T and F. Although this is enough, i.e. it is enough to say that a Boolean is either the symbol T or the symbol F, we can define the Booleans (denoted B) by the following grammar:
B ::= T | F
Read the definition as follows:
The syntax of these terms is trivial, they have no more structure than the
individual symbols of the alphabet do. The syntax trees are simply individual
nodes labeled either T or F. There are no other abstract syntax trees for the
class B.
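Grammars like this one transcribe directly into datatype declarations in a functional programming language (see the discussion of implementation at the end of this chapter). Here is a minimal sketch in Haskell; the type and constructor names are our own choices:

-- Syntax of B: a term is either the symbol T or the symbol F,
-- and nothing else.
data B = T | F
  deriving (Show, Eq)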
Syntax of N
The syntax of the natural numbers (denoted by the symbol N) can be defined
as follows:
Definition 1.2.
N ::= 0 | s n
where the alphabet consists of the symbols {0, s} and n is a variable denot-
ing some previously constructed element of the set N. 0 is a constant symbol
denoting an element of N and s is a constructor function mapping N to N.
Implicitly, we also stipulate that nothing else is in N, i.e. the only elements of
N are those terms which can be constructed by the rules of the grammar.
Thus 0, s0, ss0, sss0, · · · are all elements of N. Note that the variable "n" used in the definition of the rules never occurs in an element of N; it is simply a place-holder for a term of type N, i.e. it must be replaced by some term from
3 "Boolean" is eponymous for George Boole, the English mathematician who first formulated logic algebraically.
[Figure: the two forms of syntax trees for N — a single node labeled 0, and a node labeled s with one subtree.]
the set {0, s0, ss0, · · · }. Such place-holders are called meta-variables and are
required if the language has inductive structure, i.e. if we define the elements
of the language using previously constructed elements of the language.
Although the grammar for N contains only two rules, the language it describes is far more complex than the language of B (which also consists of two rules). There are an infinite number of syntactically well-formed terms in the language of N. The grammar achieves this by relying on n being a previously constructed element of N; thus N is an inductively defined structure.
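This inductive structure is precisely what a recursive datatype captures. A minimal sketch in Haskell, where our constructors Zero and Succ play the roles of the symbols 0 and s:

-- Syntax of N: a term is 0, or s applied to a previously
-- constructed term of N.
data Nat = Zero | Succ Nat
  deriving (Show, Eq)

-- The term sss0 is built by three applications of Succ.
threeN :: Nat
threeN = Succ (Succ (Succ Zero))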
The trees are of one of the two forms shown in the figure above; the subtree for a previously constructed element of N is labeled n.
statements. We use the Booleans as the basis, i.e. the Booleans defined above
serve as the base case for the language.
where
b ∈ B: is a Boolean, and
p, p1 , p2 : are previously constructed terms of P LB.
Terms of the language include:
Thus, the language P LB includes the Boolean values {T, F} and allows arbi-
trarily nested if-then-else statements.
Lists
We can define lists containing elements from some set T by two rules. The
alphabet of lists is {[ ], ::} where “[ ]” is a constant symbol called “nil” which
denotes the empty list and “::” is a symbol denoting the constructor that adds
an element of the set T to a previously constructed list. This constructor is, for
historical reasons, called “cons”. Note that although “[ ]” and “::” both consist
of sequences of two symbols, we consider them to be atomic symbols for the
purposes of this syntax.
This is the first definition that makes use of a parameter (in this case the set T ).
Definition 1.4 (T List)
List T ::= [ ] | a :: L
where
T : is a set,
[ ]: is a constant symbol denoting the empty list, which is called “nil”,
a: is an element of the set T , and
L: is a previously constructed element of the class List T .
Figure 1.2: Syntax tree for the list [a, b, a] constructed as a::(b::(a::[ ]))
A list of the form a::L is called a cons. The element a from T in a::L is
called the head and the list L in the cons a::L is called the tail.
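Parametrized syntactic classes like List T correspond to polymorphic datatypes. A minimal sketch in Haskell, where Nil and Cons are our names for the symbols "[ ]" and "::":

-- Syntax of List T: nil, or an element of T consed onto a
-- previously constructed list. The type parameter t plays the role of T.
data List t = Nil | Cons t (List t)
  deriving (Show, Eq)

-- The list [a, b, a], i.e. a::(b::(a::[ ])), over the set {a, b}.
abaList :: List Char
abaList = Cons 'a' (Cons 'b' (Cons 'a' Nil))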
Example 1.1. As an example, let A = {a, b}, then the set of terms in the class
List A is the following:
We call terms in the class List T lists. The set of all lists in the class List A is infinite, but each list is finite because lists must always end with the symbol [ ]. Note that we assume a::b::[ ] means a::(b::[ ]) and not (a::b)::[ ]; to express this we say cons associates to the right. The second form violates the rule for cons because a::b is not well-formed: b is an element of A, not a previously constructed element of List A . To make reading lists easier we simply separate the consed elements with commas and enclose them in square brackets "[" and "]"; thus, we write a::[ ] as [a] and write a::b::[ ] as [a, b]. Using this notation we can rewrite the set of lists in the class List A more succinctly as follows:
{[ ], [a], [b], [a, a], [a, b], [b, a], [b, b], [a, a, a], [a, a, b], · · · }
Note that the set T need not be finite; for example, the class List N is perfectly sensible. In this case, there are an infinite number of lists containing only one element, e.g.

{[0], [1], [2], [3], · · · }
Note that the pretty linear notation for lists is only intended to make them more readable; the syntactic structure underlying the list [a, b, a] is displayed by the abstract syntax tree in Fig. 1.2.
1.3.3 Definitions
A definition is a way to extend a language, possibly to include new symbols, but to describe them in terms of the existing language. Adding a definition does not allow anything new to be said that could not already have been said, though definitions can be extraordinarily useful in making things clear. The key idea behind defined terms is that they can be completely eliminated by simply replacing them by their definitions.
A definition has the general form

A[x1 , · · · , xk ] =def B[x1 , · · · , xk ]

where xi , 1 ≤ i ≤ k are variables standing for terms of the language (defined so far). An instance of the defined term is of the form A[t1 , · · · , tk ] where the xi 's are instantiated by terms ti . This term is an abbreviation (possibly parametrized if k > 0) for the schematic formula B[t1 , · · · , tk ], i.e. for the term having the shape of B but where each of the variables xi is replaced by the term ti . A may introduce new symbols not in the language while B must be a formula of the language defined up to the point of its introduction; this includes those formulas given by the syntax as well as formulas that may include previously defined symbols.
The symbol "=def" separates the left side of the definition, the thing being defined, from the right side which contains the definition. The left side of the definition may contain meta-variables which also appear on the right side.
Instances of defined terms can be replaced by their definitions by substituting the arguments on the left side of the definition into the right side. The process of "replacement" is fundamental and is called substitution. In following chapters, we will carefully define substitution (as an algorithm) for propositional and then predicate logic.
This definition of “definition” is perhaps too abstract to be of much use,
and yet the idea of introducing new definitions is one of the most natural ideas
of mathematics. A few definitions are given below which should make the idea
perfectly transparent.
f (k) = 2k + k

exp(x) = Σ_{n=0}^{∞} xⁿ/n!    (x ∈ ℝ)
1.4 Semantics
Semantics associates meaning with syntax. Formal semantics (the kind we are interested in here) is given by defining a mathematical mapping from syntax (think of syntax as a kind of data-structure) to some other mathematical structure. This mapping is called the semantic function or interpretation; we will use these terms interchangeably. When possible, formal languages are given compositional semantics: the meaning of a syntactic structure depends on the meanings of its parts.
Before a semantics is given, an element in a syntactic class can only be seen
as a meaningless structured term, or if expressed linearly as text, it is simply a
meaningless sequence of symbols. Since semantics are intended to present the
meanings of the syntax, they are taken from some mathematical domain which
is already assumed to be understood or is, by some measure, simpler. In the
case of a program, the meaning might be the sequence of states an abstract
machine goes through in the evaluation of the program on some input (in this
case, meanings would consist of pairs of input values and sequences of states);
or perhaps the meaning is described simply as the input/output behavior of the
program (in this case the meaning would consist of pairs of input values and
output values.) In either case, the meaning is described in terms of (well understood) mathematical structures. Semantics establish the relationship between the syntax and its interpretation as a mathematical structure.
the complex parts down to one of the base cases. In this way, computation by
recursion is guaranteed to terminate.
Semantics of B
Suppose that we intend the meanings of B to be among the set {0, 1}. Then,
functions assigning the values T and F to elements of {0, 1} count as a seman-
tics. Following the tradition of denotational semantics, if b ∈ B we write [[b]] to
denote the meaning of b. Using this notation one semantics would be:
[[T]] = 0
[[F]] = 1
Thus, the meaning of T is 0 and the meaning of F is 1. This interpretation might not be the one you expected (i.e. you may think of 1 as T and 0 as F), but an essential point of formal semantics is that the meanings of symbols or terms need not be the ones you impose through convention or force of habit. Things mean whatever the semantics say they do.4 Before the semantics has been given, it is a mistake to interpret syntax as anything more than a complex of meaningless symbols.
Semantics of N
We will describe the meaning of terms in N by mapping them onto non-negative
integers. This presumes we already have the integers as an understood mathe-
matical domain7 .
4 Perhaps interestingly, in the logic of CMOS circuit technology, this seemingly backward
to be cast.
7 Because the integers are usually constructed from the natural numbers this may seem to
be putting the cart before the horse, so to speak, but it provides a good example here.
[[0]] = 0
[[sn]] = [[n]] + 1 where n ∈ N
The equations say that the meaning of the term 0 is just 0 and, if the term has the form sn (for some n ∈ N), the meaning is the meaning of n plus one. Note that there are as many cases in the recursive definition as there are in the grammar, one case for each possible way of constructing a term in N. This will always be the case for every recursive definition given on the structure of a term.
Under these semantics we calculate the meaning of a few terms to show how
the equations work.
[[s0]]
= [[0]] + 1
= 0 + 1
= 1

[[sssss0]]
= [[ssss0]] + 1
= ([[sss0]] + 1) + 1
= (([[ss0]] + 1) + 1) + 1
= ((([[s0]] + 1) + 1) + 1) + 1
= (((([[0]] + 1) + 1) + 1) + 1) + 1
= ((((0 + 1) + 1) + 1) + 1) + 1
= (((1 + 1) + 1) + 1) + 1
= ((2 + 1) + 1) + 1
= (3 + 1) + 1
= 4 + 1
= 5
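The semantic equations likewise transcribe into a recursive function with one clause per grammar rule. A minimal sketch in Haskell, reusing the Nat datatype from the earlier sketch:

data Nat = Zero | Succ Nat

-- One clause per grammar rule: [[0]] = 0 and [[s n]] = [[n]] + 1.
meaning :: Nat -> Integer
meaning Zero     = 0
meaning (Succ n) = meaning n + 1

-- meaning (Succ (Succ (Succ (Succ (Succ Zero))))) evaluates to 5,
-- mirroring the calculation of [[sssss0]] above.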
Semantics of P LB
As always, the semantics will include one equation for each production in
the grammar. Informally, if a P LB term is already a Boolean, the semantic
function does nothing. For other, more complex, terms we explicitly specify the
values when the conditional argument is a Boolean, and if it is not we repeatedly
reduce it until it is grounded as a Boolean value. The equation for if-then-else
is given by case analysis (on the conditional argument).
[[b]] = b    (1)

[[if p then p1 else p2 fi]] =
    [[p1 ]]                          if [[p]] = T
    [[p2 ]]                          if [[p]] = F    (2)
    [[if q then p1 else p2 fi]]      if p ∉ B, where q = [[p]]
Here are some calculations (equational style derivations) that show how the
equations can be used to compute meanings.
Note that in these calculations it seems needless to evaluate [[b]]; the following derivation illustrates a case where the first argument is not a Boolean constant and the evaluation of the condition is needed.
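The grammar of P LB does not survive in this extract, but the semantic equations determine its shape: a term is either a Boolean or an if-then-else over three previously constructed terms. A sketch in Haskell under that assumption (Const and If are our constructor names):

-- Assumed syntax: a PLB term is a Boolean constant or an
-- if-then-else over three previously constructed terms.
data B   = T | F                    deriving (Show, Eq)
data PLB = Const B | If PLB PLB PLB deriving (Show, Eq)

-- Equation (1): a Boolean means itself. Equation (2): an
-- if-then-else is decided by the meaning of its condition; the
-- recursive call on p plays the role of the third case, which
-- reduces a non-Boolean condition to a Boolean one.
meaning :: PLB -> B
meaning (Const b)    = b
meaning (If p p1 p2) = case meaning p of
                         T -> meaning p1
                         F -> meaning p2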
Semantics of List T
Perhaps oddly, we do not intend to assign semantics to the class List T . The
terms of the class represent themselves, i.e. we are interested in lists as lists.
But still, semantic functions are not the only functions that can be defined by
recursion on the structure of syntax, we can define other interesting functions
on lists by recursion on the syntactic structure of one or more of the arguments.
For example, we can define a function that glues two lists together (given
inputs L and M where L, M ∈ List T , append (L, M ) is a list in List T ). It is
defined by recursion on the (syntactic) structure of the first argument as follows:
Definition 1.6 (Append)
append ([ ], M ) = M
append (a::L, M ) = a::(append (L, M ))
The first equation of the definition says: if the first argument is the list [ ], the
result is just the second argument. The second equation of the definition says,
if the first argument is a cons of the form a::L, then the result is obtained by
consing a on the append of L and M . Thus, there are two equations, one for each
rule that could have been used to construct the first argument of the function.
append (a::b::[ ], [ ])
= a::(append (b::[ ], [ ]))
= a::b::(append ([ ], [ ]))
= a::b::[ ]
Using the more compact notation for lists, we have shown append ([a, b], [ ]) = [a, b]. Using this notation for lists we can rewrite the derivation as follows:
Remark 1.1 (Infix Notation for Append) The append operation is so com-
monly used that many functional programming languages include special infix
notation. In the Haskell programming language [?] the infix notation is ++, in
the ML family of programming languages append is written @. We will write
m++n for append (m, n).
Using this infix notation, we rewrite the computation above as follows:
[a, b]++[ ]
= a::([b]++[ ])
= a::b::([ ]++[ ])
= a::b::[ ]
= [a, b]
We will use the more succinct notation for lists from now on and the infix
notation, but do not forget that this is just a more readable display for the more
cumbersome but precise notation which explicitly uses the cons constructor.
Here is another example.
[ ]++[a, b]
= [a, b]
We will discuss lists and operations on lists, as well as ways to prove properties about lists, in some depth in Chapter 11. For example, the rules for append immediately give append ([ ], M ) = M , but the following equation is a theorem as well: append (M, [ ]) = M . For any individual list M we can compute with the rules for append and show this, but we currently have no way to assert this in general for all M without proving it by induction.
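Definition 1.6 is, in effect, already a program. A minimal sketch in Haskell, with the List datatype from the earlier sketch:

data List t = Nil | Cons t (List t)
  deriving (Show, Eq)

-- One equation per rule of the list grammar, recursing on the
-- structure of the first argument, exactly as in Definition 1.6.
append :: List t -> List t -> List t
append Nil        m = m
append (Cons a l) m = Cons a (append l m)

-- append (Cons 'a' (Cons 'b' Nil)) Nil
--   evaluates to Cons 'a' (Cons 'b' Nil), i.e. [a, b]++[ ] = [a, b].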
especially true of the ML family of languages [37, 33, 39] and the language
Haskell [?]. Scheme is also useful in this way [16]. All three, ML, Haskell and
Scheme are languages in the family of functional programming languages. Of
course we can define term structures in any modern programming language, but
the functional languages provide particularly good support for this. Similarly,
semantics is typically defined by recursion on the structure of the syntax and these languages make such definitions quite transparent; implementations appear syntactically close to the mathematical notions used above. The approach to implementing syntax and semantics in ML is taken in [?] and a similar approach using Scheme is followed in [16]. The excellent book [9] presents most of the material presented here in the context of the Haskell programming language.
Part I
Logic
[Portrait: Kurt Gödel]
Chapter 2
Propositional Logic
uninterpreted here; they are simply tree-like structures waiting for a semantics
to be applied.
2.1.1 Formulas
We use propositional variables to stand for arbitrary propositions and we assume
there is an infinite supply of these variables.
V = {p, q, r, p1 , q1 , r1 , p2 , · · · }
Note that the fact that the set V is infinite is unimportant, since no individual formula will ever require more than some fixed finite number of variables; however, it is important that the number of variables we can select from is unbounded. There must always be a way to get another one.
We include the constant symbol ⊥ (say “bottom”).
Complex propositions are constructed by combining simpler ones with propo-
sitional connectives. For now we leave the meaning of the connectives unspec-
ified and simply present them as one of the symbols ∧, ∨, ⇒ which we read as
and, or and implies respectively.
P ::= ⊥ | x | ¬φ | φ ∧ ψ | φ ∨ ψ | φ ⇒ ψ
where
⊥ is a constant symbol,
x ∈ V is a propositional variable, and
φ, ψ ∈ P are meta-variables denoting previously constructed propositional
formulas.
To write the terms of the language P linearly (i.e. so that they can be
written from left-to-right on a page), we insert parentheses to indicate the order
of the construction of the term as needed e.g. p∧q ∨r is ambiguous in that we do
not know if it denotes a conjunction of a variable and a disjunction (p ∧ (q ∨ r))
or it denotes the disjunction of a conjunction and a variable ((p ∧ q) ∨ r).
Thus (written linearly) the following are among the terms of P: ⊥, p, q, ¬q,
p ∧ ¬q, ((p ∧ ¬q) ∨ q), and ¬((p ∧ ¬q) ∨ r).
We use the lowercase Greek letters φ and ψ (possibly subscripted) as meta-
variables ranging over propositional formulas, that is, φ and ψ are variables that
denote propositional formulas; note that they are not themselves propositional
formulas and no actual propositional formula contains either of them.
mk bot : P
mk var : V → P
mk not : P → P
mk and : (P × P) → P
mk or : (P × P) → P
mk implies : (P × P) → P
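Read as the types of the constructors of a datatype, these signatures translate directly into a declaration. A minimal sketch in Haskell, with String standing in for the set V of propositional variables:

-- One constructor per production of the grammar for P.
data Prop = Bot                  -- ⊥
          | Var String           -- a propositional variable
          | Not Prop             -- ¬φ
          | And Prop Prop        -- φ ∧ ψ
          | Or Prop Prop         -- φ ∨ ψ
          | Implies Prop Prop    -- φ ⇒ ψ
  deriving (Show, Eq)

-- The formula ((p ∧ ¬q) ∨ q) as a term:
sample :: Prop
sample = Or (And (Var "p") (Not (Var "q"))) (Var "q")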
The syntax trees for the last four of these examples are drawn as follows:
[Figure: syntax trees iii.) and iv.)]
p : A × B says p is a tuple whose first element is of type A and whose second element is type
B.
[Figure: syntax trees v.) and vi.)]
If-and-only-if

True
The syntax includes a constant ⊥ which, when we do the semantics, will turn out to denote false; but we do not have a constant corresponding to true. We define it here.

Definition 2.4 (Top) We define a new constant "⊤" (say top) as follows:

⊤ =def ¬⊥.
2.1.3 Substitutions*
A substitution is a means to map formulas to other formulas by uniformly re-
placing all occurrences individual variables with a formula. For example, given
the formula (p ∧ q) ⇒ p we could substitute any formula for p or any formula
for q. Say we wanted to substitute (r ∨ q) for p then we write the following:
subst(p, dr ∨ qe)d(p ∧ q) ⇒ pe
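Substitution is just another function defined by recursion on syntax. A sketch in Haskell, reusing the Prop datatype from the sketch above (here subst replaces a single named variable):

data Prop = Bot | Var String | Not Prop | And Prop Prop
          | Or Prop Prop | Implies Prop Prop  deriving (Show, Eq)

-- subst x e phi uniformly replaces the variable x by the formula e.
subst :: String -> Prop -> Prop -> Prop
subst _ _ Bot           = Bot
subst x e (Var y)       = if x == y then e else Var y
subst x e (Not p)       = Not (subst x e p)
subst x e (And p q)     = And (subst x e p) (subst x e q)
subst x e (Or p q)      = Or  (subst x e p) (subst x e q)
subst x e (Implies p q) = Implies (subst x e p) (subst x e q)

-- subst "p" (Or (Var "r") (Var "q"))
--           (Implies (And (Var "p") (Var "q")) (Var "p"))
-- yields the term for ((r ∨ q) ∧ q) ⇒ (r ∨ q).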
2.1.4 Exercises
2.2 Semantics
Semantics gives meaning to syntax. The style of semantics presented here was first given by Alfred Tarski in his paper [49] on truth in formalized languages, which was first published in Polish in 1933.
If I asked you to tell me whether the expression x + 11 > 42 is true, you'd probably tell me that you need to know what the value of x is. So, the meaning of x + 11 > 42 depends on the value assigned to x. Similarly, if I asked you if a formula (say p ∧ q) was true, you'd tell me you need to know what the values of p and q are. The meaning of a formula depends on the values assigned to the variables in the formula.
In the following sections we introduce the set of Boolean values and we formalize the notion of an assignment. We present the semantics for propositional logic in the form of a valuation function that, given an assignment and a formula, returns T or F. The valuation function is then used as the basis to describe the method of truth tables. Truth tables characterize all the possible meanings of a formula. This gives us a semantics for propositional logic.
B = {T, F}
is called the Boolean4 set, and its elements (T and F) are called Boolean values.
Note that any two element set would do, as long as we could distinguish the
elements from one another.
When we ask if a formula is true, we are asking whether it evaluates to T when it is interpreted with respect to some kind of structure. For an arithmetic expression like the one in the example given above (x + 11 > 42), the structure would have to (at least) indicate the integer value associated with the variable x. For propositional logic, the structure binding Boolean values to variables is called an assignment.
Example 2.1. For the formula consisting of the single variable p there are 2¹ = 2 possible assignments.
p
α0 F
α1 T
To read the table, the assignment name is in the left column and the variables
are listed across the top. Thus, α0 (p) = F and α1 (p) = T.
For the formula p ∨ ⊥, having one occurrence of the variable p, k = 1 and there are 2¹ = 2 possible assignments, which are the ones just given.
The formula (p ∨ q) ⇒ p has two distinct variables p and q and so has 2² = 4 different assignments.
p q
α0 F F
α1 F T
α2 T F
α3 T T
val(α, ⊥) = F
val(α, x) = α(x) whenever x ∈ V
val(α, ¬φ) = not(val(α, φ))
val(α, φ ∧ ψ) = val(α, φ) and val(α, ψ)
val(α, φ ∨ ψ) = val(α, φ) or val(α, ψ)
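The valuation function is, once again, one clause per production. A minimal sketch in Haskell, representing an assignment as a function from variable names to Haskell's Bool (standing in for B); the equation for ⇒ is cut off by the page break here, so the last clause below is our assumption of the standard definition:

data Prop = Bot | Var String | Not Prop | And Prop Prop
          | Or Prop Prop | Implies Prop Prop

-- An assignment maps variable names to Boolean values.
type Assignment = String -> Bool

val :: Assignment -> Prop -> Bool
val _ Bot           = False
val a (Var x)       = a x
val a (Not p)       = not (val a p)
val a (And p q)     = val a p && val a q
val a (Or p q)      = val a p || val a q
val a (Implies p q) = not (val a p) || val a q   -- assumed clause for ⇒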
⊥
F
Negation is a unary connective (i.e. it only has one argument) that toggles the value of its argument, as the following truth table shows.
φ ¬φ
T F
F T
φ ψ (φ ∧ ψ) (φ ∨ ψ) (φ ⇒ ψ) (φ ⇔ ψ)
T T T T T T
T F F T F F
F T F T T F
F F F F T T
Thus, the truth or falsity of a formula is determined solely by the truth or falsity
of its sub-terms:
We remark that for any element of P, although the number of cases (rows in a truth table) is finite, the total number of cases is exponential in the number of distinct variables. This means that, for each variable we must consider in a formula, the number of cases we must consider doubles. Complete analysis of a formula having no variables (i.e. its only base term is ⊥) has 2⁰ = 1 row; a formula having one distinct variable has 2¹ = 2 rows, two variables means four cases, three variables means eight, and so on. If the formula contains n distinct variables, there are 2ⁿ possible combinations of true and false that the n variables may take.
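The counting is easy to see concretely if we enumerate assignments mechanically. A minimal sketch in Haskell (assignments is our own name):

-- All assignments over a list of distinct variables, as lists of
-- (variable, value) pairs; n variables yield 2^n assignments.
assignments :: [String] -> [[(String, Bool)]]
assignments []       = [[]]
assignments (v : vs) = [ (v, b) : rest | b <- [False, True]
                                       , rest <- assignments vs ]

-- length (assignments ["p", "q", "r"]) evaluates to 8 = 2^3,
-- one per row of the truth table below.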
Consider the following example of a truth table for the formula ((p ⇒ q) ∨ r).
p q r (p ⇒ q) ((p ⇒ q) ∨ r)
T T T T T
T T F T T
T F T F T
T F F F F
F T T T T
F T F T T
F F T T T
F F F T T
Since the formula has three distinct variables, there are 2³ = 8 rows in the
truth table. Notice that the fourth row of the truth table falsifies the formula,
i.e. if p is true, q is false, and r is false, the formula ((p ⇒ q) ∨ r) is false. All the
other rows satisfy the formula i.e. all the other assignments of true and false to
the variables of the formula make it true.
A formula having the same shape (i.e. drawn as a tree it has the same structure), but having only two distinct variables, is ((p ⇒ q) ∨ p). Although there are three variable occurrences in the formula (two occurrences of p and one occurrence of q), the distinct variables are p and q. To completely analyze the formula we only need 2² = 4 rows in the truth table.
p q (p ⇒ q) ((p ⇒ q) ∨ p)
T T T T
T F F T
F T T T
F F T T
Note that this formula is true for every assignment of Boolean values to the
variables p and q.
We did not include a constant in the base syntax for the language of propo-
sitional logic whose meaning is T; however, we defined the constant > (see
Definition 2.4). The following truth table shows that this defined formula al-
ways has the meaning T.
⊥ ¬⊥
F T
Note that any tautology could serve as our definition of true, but this is the
simplest such formula in the language P.
2.2.4 Exercises
5 By formal, we mean that we give a detailed mathematical presentation with enough detail
2.3.1 Sequents
[Portrait: Gerhard Gentzen]
Sequents are pairs of lists of formulas used to characterize a point in a proof.
One element of the pair lists the assumptions that are in force at the point in
a proof characterized by the sequent and the other lists the goals, one of which
we must prove to complete a proof of the sequent. The sequent formulation
of proofs, presented below, was first given by the German logician Gerhard
Gentzen in 1935 [18].
We will use letters (possibly subscripted) from the upper-case Greek alpha-
bet as meta-variables that range over (possibly empty) lists of formulas. Thus
Γ, Γ1 , Γ2 , ∆, ∆1 , and ∆2 all stand for arbitrary elements of the class List P 6 .
Definition 2.16 (Sequent) A sequent is a pair of lists of formulas hΓ, ∆i. The
list Γ is called the antecedent of the sequent and the list ∆ is called the succedent
of the sequent.
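In a program, a sequent is literally a pair of lists. A minimal sketch in Haskell (Prop as in the earlier sketches; Sequent is our own name):

data Prop = Bot | Var String | Not Prop | And Prop Prop
          | Or Prop Prop | Implies Prop Prop

-- A sequent pairs an antecedent with a succedent.
type Sequent = ([Prop], [Prop])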
⋀_{φ ∈ [ ]} φ  =def  ¬⊥
⋀_{φ ∈ (ψ::Γ)} φ  =def  (ψ ∧ (⋀_{φ ∈ Γ} φ))

The first equation defines the conjunction of formulas in the empty list simply to be the formula ¬⊥ (i.e. the formula having the meaning T). The formula ¬⊥ is the right identity for conjunction, i.e. the following is a tautology: ((φ ∧ ¬⊥) ⇔ φ).
You might argue semantically that this is the right choice for the empty list
as follows: the conjunction of the formulas in a list is valid if and only if all the
formulas in the list are valid, but there are no formulas in the empty list, so all
of them (all none of them) are valid.
The second equation in the definition says that the conjunction over a list
constructed by a cons is the conjunction of the individual formula that is the
head of the list with the conjunction over the tail of the list.
Definition 2.19 (Disjunction over a list) The function which creates a dis-
junction of all the elements in a list is defined by recursion on the structure of
the list and is given by the following two equations.
⋁_{φ ∈ [ ]} φ  =def  ⊥
⋁_{φ ∈ (ψ::Γ)} φ  =def  (ψ ∨ (⋁_{φ ∈ Γ} φ))
The first equation defines the disjunction of formulas in the empty list simply
to be the formula ⊥ (i.e. the formula whose meaning is F). The formula ⊥ is
the right identity for disjunction i.e. the following is a tautology ((φ ∨ ⊥) ⇔ φ).
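Both definitions are transcriptions of recursion on lists and can be written down directly. A minimal sketch in Haskell (conjList and disjList are our own names; Prop as before):

data Prop = Bot | Var String | Not Prop | And Prop Prop
          | Or Prop Prop | Implies Prop Prop

-- Conjunction over a list: the empty conjunction is ¬⊥.
conjList :: [Prop] -> Prop
conjList []          = Not Bot
conjList (psi : gam) = And psi (conjList gam)

-- Disjunction over a list: the empty disjunction is ⊥.
disjList :: [Prop] -> Prop
disjList []          = Bot
disjList (psi : gam) = Or psi (disjList gam)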
Thus [[Γ ` ∆]] is a translation of the sequent Γ ` ∆ into a formula. Using this
translation, we semantically characterize the validity of a sequent as follows.
Definition 2.21 (Sequent Valuation) Given an assignment α and a sequent
Γ ` ∆ we say α satisfies Γ ` ∆ if the following holds:
α |= [[Γ ` ∆]]
In this case we write α |= Γ ` ∆.
This gives the following definition.
Definition 2.22 (Sequent validity) A sequent Γ ` ∆ is valid if and only if
|= [[Γ ` ∆]]
That is, a sequent is valid if and only if

|= (⋀_{φ ∈ Γ} φ) ⇒ (⋁_{ψ ∈ ∆} ψ)
To exercise these definitions we now consider the cases where the antecedent and/or succedent are empty. If Γ = [ ] then the sequent Γ ` ∆ is valid if and only if the disjunction of the formulas in ∆ is valid. If ∆ = [ ] then the sequent Γ ` ∆ is valid if and only if the conjunction of the formulas in Γ is unsatisfiable, i.e. no assignment makes all the formulas in Γ true. If both the antecedent and the succedent are empty, i.e. Γ = ∆ = [ ], then the sequent Γ ` ∆ is not valid since

(⋀_{φ ∈ [ ]} φ) ⇒ (⋁_{ψ ∈ [ ]} ψ)    is the formula    (¬⊥ ⇒ ⊥)

and (¬⊥ ⇒ ⊥) is a contradiction. We verify this claim with the following truth table.

⊥    ¬⊥    (¬⊥ ⇒ ⊥)
F    T     F
S: Γ1 , φ ⇒ ψ, Γ2 ` ∆
σ(Γ1 ) = [p]
σ(Γ2 ) = []
σ(∆) = [r]
σ(φ) = p ∨ q
σ(ψ) = r
σ(Γ1 , φ ⇒ ψ, Γ2 ` ∆)
= σ(Γ1 ), σ(φ ⇒ ψ), σ(Γ2 ) ` σ(∆)
= [p], σ(φ) ⇒ σ(ψ), [] ` [r]
= [p], (p ∨ q) ⇒ r, [] ` [r]
(N)
C

H
(N)
C

H1 H2
(N)
C
where C, H, H1 , and H2 are all schematic sequents. N is the name of the rule.
The H patterns are the premises (or hypotheses) of the rule and the pattern C is
the goal (or conclusion) of the rule. Rules having no premises are called axioms.
Rules that operate on formulas in the antecedent (on the left side of `) of a
sequent are called elimination rules and rules that operate on formulas in the
consequent (the right side of `) of a sequent are called introduction rules.
Proof rules are schemas (templates) used to specify a single step of inference.
The proof rule schemas are specified by arranging schematic sequents in partic-
ular configurations to indicate which parts of the rule are related to which. For
example, the rule for decomposing an implication on the left side of the turnstile
is given as:
Γ1 , Γ2 ` φ, ∆ Γ1 , ψ, Γ2 ` ∆
Γ1 , (φ ⇒ ψ), Γ2 ` ∆
Γ1 , Γ2 ` φ, ∆
Γ1 , ψ, Γ2 ` ∆
Γ1 , (φ ⇒ ψ), Γ2 ` ∆
Each of these schematic sequents specifies a pattern that an actual (or concrete) sequent might (or might not) match. By an actual sequent, we mean a sequent that contains no meta-variables (e.g. it contains no Γs or ∆s, or φs or ψs) but is composed of formulas in the language of propositional logic.
Structural Rules*
The semantics of sequents (given in Def. 2.20) gives them lots of structure.
There are some non-logical rules that sequents obey. These rules are admissible
based simply on the semantics, regardless of the formula instances occurring
in the antecedent and consequent. They essentially express the ideas that the
order of the formulas does not affect validity and neither does the number of
times a formula occurs in the antecedent or the consequent.
It turns out that in the propositional case, it is never required that a struc-
tural proof rule be used to find a proof. Once the quantifiers have been added
in Chapter 4, some proofs will require the use of these rules.
Γ`∆
(WL)
φ, Γ ` ∆
Γ`∆
(WR)
Γ ` φ, ∆
Axiom Rules
If there is a formula that appears in both the antecedent and the consequent
of a sequent then the sequent is valid. The axiom rule reflects this and has the
following form:
(Ax)
Γ1 , φ, Γ2 ` ∆1 , φ, ∆2
Also, since false (⊥) implies anything, if the formula ⊥ appears in the an-
tecedent of a sequent that sequent is trivially valid.
Proof Rule 2.8 (⊥Ax)
(⊥Ax)
Γ1 , ⊥, Γ2 ` ∆
Conjunction Rules
On the right
Γ ` ∆1 , φ, ∆2 Γ ` ∆1 , ψ, ∆2
(∧R)
Γ ` ∆1 , (φ ∧ ψ), ∆2
On the left
Γ1 , φ, ψ, Γ2 ` ∆
(∧L)
Γ1 , (φ ∧ ψ), Γ2 ` ∆
Disjunction Rules
Γ ` ∆1 , φ, ψ, ∆2
(∨R)
Γ ` ∆1 , (φ ∨ ψ), ∆2
Γ1 , φ, Γ2 ` ∆ Γ1 , ψ, Γ2 ` ∆
(∨L)
Γ1 , (φ ∨ ψ), Γ2 ` ∆
Implication Rules
Γ, φ ` ∆1 , ψ, ∆2
(⇒R)
Γ ` ∆1 , (φ ⇒ ψ), ∆2
Γ1 , Γ2 ` φ, ∆ Γ1 , ψ, Γ2 ` ∆
(⇒L)
Γ1 , (φ ⇒ ψ), Γ2 ` ∆
Note that if φ is in Γ then this is just like Modus Ponens since the left subgoal
becomes an instance of the axiom rule.
Negation Rules
Γ, φ ` ∆1 , ∆2
(¬R)
Γ ` ∆1 , ¬φ, ∆2
Γ1 , Γ2 ` φ, ∆
(¬L)
Γ1 , ¬φ, Γ2 ` ∆
2.3.5 Proofs
We have the proof rules, now we define what a proof is. A formal proof is a
tree structure where the nodes of the tree are sequents, the leaves of the tree
are instances of one of the axiom rules, and there is an edge between sequents
if the sequents form an instance of some proof rule. We can formally describe
an inductive data-structure for representing sequent proofs.
Definition 2.26 (proof tree) A proof tree having root sequent S is defined
inductively as follows:
i.) If the sequent S is an instance of one of the axiom rules whose name is N, then

(N)
S

is a proof tree.

ii.) If ρ1 is a proof tree with root sequent S1 and

S1
(N)
S

is an instance of some proof rule having a single premise, then the tree

ρ1
(N)
S

is a proof tree.
iii.) If ρ1 is a proof tree with root sequent S1 and ρ2 is a proof tree with root
sequent S2 and, if
S1 S2
(N)
S
is an instance of some proof rule having two premises, then the tree

ρ1 ρ2
(N)
S

is a proof tree.
Although proof trees were just defined by starting with the leaves and build-
ing them toward the root, the proof rules are typically applied in the reverse
order, i.e. the goal sequent is scanned to see if it is an instance of an axiom
rule, if so we’re done. If the sequent is not an instance of an axiom rule and it
contains some non-atomic formula on the left or right side, then the rule for the
principal connective of that formula is matched against the sequent. The result-
ing substitution is applied to the schematic sequents in the premises of the rule.
The sequents generated by applying the matching substitution to the premises
are placed in the proper positions relative to the goal. This process is repeated
on incomplete leaves of the tree (leaves that are not instances of axioms) until
all leaves are either instances of an axiom rule, or until all the formulas in the
sequents at the leaves of the tree are atomic and are not instances of an axiom
rule. In this last case, there is no proof of the goal sequent.
As characterized in [43], the goal directed process of building proofs, i.e.
working backward from the goal, is a reductive process as opposed to the de-
ductive process which proceeds forward from the axioms.
We present some examples.
Example 2.5. Consider the sequent (p ∨ q) ` (p ∨ q). The following substitution verifies the match of the sequent against the goal of the axiom rule:
σ1 :  Γ1 := [ ],  Γ2 := [ ],  ∆1 := [ ],  ∆2 := [ ],  φ := (p ∨ q)
(Ax)
(p ∨ q) ` (p ∨ q)
Now consider the sequent (p ∨ q) ` (q ∨ p). Applying the matching substitution to the schematic sequent in the premise of the rule ∨R results in the sequent (p ∨ q) ` q, p.
Thus far we have constructed the following partial proof:
(p ∨ q) ` q, p
(∨R)
(p ∨ q) ` (q ∨ p)
Now we match the sequent on the incomplete branch of the proof against the ∨L-rule. This is the only rule that matches since the sequent is not an axiom and contains only one non-atomic formula, namely the (p ∨ q) on the left side.
The match generates the following substitution.
σ2 :  Γ1 := [ ],  Γ2 := [ ],  ∆ := [q, p],  φ := p,  ψ := q
Applying this substitution to the premises of the ∨L-rule results in the sequents
p ` q, p and q ` q, p. Placing them in their proper positions results in the
following partial proof tree.
p ` q, p q ` q, p
(∨L)
(p ∨ q) ` q, p
(∨R)
(p ∨ q) ` (q ∨ p)
In this case, both incomplete branches are instances of the axiom rule. The
matches for the left and right branches are, respectively:
σ3 :  Γ1 := [ ],  Γ2 := [ ],  ∆1 := [q],  ∆2 := [ ],  φ := p
σ4 :  Γ1 := [ ],  Γ2 := [ ],  ∆1 := [ ],  ∆2 := [p],  φ := q
These matches verify that the incomplete branches are indeed axioms and
the final proof tree appears as follows:
(Ax) (Ax)
p ` q, p q ` q, p
(∨L)
(p ∨ q) ` q, p
(∨R)
(p ∨ q) ` (q ∨ p)
Theorem 2.1.
i. ¬¬φ ⇔ φ
ii. ¬φ ⇔ (φ ⇒ ⊥)
iii. (φ ⇒ ψ) ⇔ ¬φ ∨ ψ
iv. ¬(φ ∧ ψ) ⇔ (¬φ ∨ ¬ψ)
v. ¬(φ ∨ ψ) ⇔ (¬φ ∧ ¬ψ)
vi. (φ ∨ ψ) ⇔ (ψ ∨ φ)
vii. (φ ∧ ψ) ⇔ (ψ ∧ φ)
viii. ((φ ∨ ψ) ∨ ϕ) ⇔ (φ ∨ (ψ ∨ ϕ))
ix. ((φ ∧ ψ) ∧ ϕ) ⇔ (φ ∧ (ψ ∧ ϕ))
x. (φ ∨ ⊥) ⇔ φ
xi. (φ ∧ ⊤) ⇔ φ
xii. (φ ∨ ⊤) ⇔ ⊤
xiii. (φ ∧ ⊥) ⇔ ⊥
xiv. (φ ∨ ¬φ) ⇔ ⊤
xv. (φ ∧ ¬φ) ⇔ ⊥
xvi. (φ ∧ (ψ ∨ ϕ)) ⇔ (φ ∧ ψ) ∨ (φ ∧ ϕ)
xvii. (φ ∨ (ψ ∧ ϕ)) ⇔ (φ ∨ ψ) ∧ (φ ∨ ϕ)
xviii. (p ⇒ q) ∨ (q ⇒ p)
Exercise 2.3. Give sequent proofs showing that the tautologies in Thm 2.1
hold.
2.3.7 Exercises
Soundness Soundness is the property that every provable formula is semantically valid. An unsound proof system would not be of much use: if we could prove a theorem which was not valid, we could prove all theorems, since ⊥ ` φ is provable for an arbitrary formula φ.
We do not have the tools or methods yet to prove this theorem. We have informally argued for the admissibility of the individual proof rules, and these individual facts can be combined to show soundness. The proof method used to prove soundness is based on an induction principle that follows the structure of a formula. These methods will be introduced in Chapter ??.
Completeness Completeness is the property that all valid formulas are prov-
able. If a proof system is complete, it captures all of the valid formulas. It turns
out that there are mathematical theories for which there is no complete proof
system, propositional logic is not one of them.
Again, we do not yet have a proof method that will allow us to prove com-
pleteness; by the end of the book we will.
2.4.2 Decidability
A set is decidable if there is an algorithm to decide if an element is in the set.
To talk about the decidability of a logic, we first have to describe it as a set or
collection of formulas.
Th⟨L, |=⟩ = {φ | |= φ}

Thus, the theory of propositional logic Th⟨P, |=⟩ is the collection of all formulas in the language of propositional logic (P) that are semantically valid.
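For propositional logic the truth-table method itself is such an algorithm: a formula is in the theory exactly when it evaluates to true under every assignment. A sketch in Haskell combining the earlier val and assignments sketches (vars and valid are our own names):

import Data.List (nub)

data Prop = Bot | Var String | Not Prop | And Prop Prop
          | Or Prop Prop | Implies Prop Prop

-- The distinct variables occurring in a formula.
vars :: Prop -> [String]
vars Bot           = []
vars (Var x)       = [x]
vars (Not p)       = vars p
vars (And p q)     = nub (vars p ++ vars q)
vars (Or p q)      = nub (vars p ++ vars q)
vars (Implies p q) = nub (vars p ++ vars q)

-- Valuation under an assignment given as (variable, value) pairs.
val :: [(String, Bool)] -> Prop -> Bool
val _ Bot           = False
val a (Var x)       = maybe False id (lookup x a)
val a (Not p)       = not (val a p)
val a (And p q)     = val a p && val a q
val a (Or p q)      = val a p || val a q
val a (Implies p q) = not (val a p) || val a q

assignments :: [String] -> [[(String, Bool)]]
assignments []       = [[]]
assignments (v : vs) = [ (v, b) : r | b <- [False, True], r <- assignments vs ]

-- valid phi decides membership in Th⟨P, |=⟩ by checking all 2^n rows.
valid :: Prop -> Bool
valid phi = all (\a -> val a phi) (assignments (vars phi))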
2.4.3 Exercises
Chapter 3

Boolean Algebra and Equational Reasoning*

[Portrait: George Boole]
Remark 3.1. Recall that for every integer a and every natural number m > 0, there exist integers q and r with 0 ≤ r < m such that the following equation holds:

a = qm + r

We call q the quotient and r the remainder. If r = 0 (there is no remainder) then we say m divides a, e.g. a ÷ m = q.
Definition 3.1. Two integers are congruent modulo 2, if and only if they have
the same remainder when divided by 2. In this case we write
a ≡ b(mod 2)
Example 3.1.
0 ≡ 0 (mod 2)    a = 0, q = 0, r = 0
1 ≡ 1 (mod 2)    a = 1, q = 0, r = 1
2 ≡ 0 (mod 2)    a = 2, q = 1, r = 0
3 ≡ 1 (mod 2)    a = 3, q = 1, r = 1
4 ≡ 0 (mod 2)    a = 4, q = 2, r = 0
5 ≡ 1 (mod 2)    a = 5, q = 2, r = 1
i.) a ≡ a(mod 2)
ii.) If a ≡ b(mod 2) then b ≡ a(mod 2)
iii.) If a ≡ b(mod 2) and b ≡ c(mod 2) then a ≡ c(mod 2)
Also, if a ≡ c (mod n) and b ≡ d (mod n), then:

a + b ≡ c + d (mod n)
a · b ≡ c · d (mod n)
3.1.3 Falsity
We interpret ⊥ as 0, so the translation function maps ⊥ to 0, no matter what
the assignment is.
M[[⊥]] = 0
3.1.4 Variables
Propositional variables are just mapped to variables in the algebraic language
M[[x]] = x
3.1.5 Conjunction
Consider the following tables for multiplication and the table for conjunction.
a b ab a b a∧b
1 1 1 T T T
1 0 0 T F F
0 1 0 F T F
0 0 0 F F F
This table is identical to the truth table for conjunction (∧) if we replace 1 by T, 0 by F and the symbol for multiplication (·) by the symbol for conjunction (∧). Thus, we get the following translation:

M[[φ ∧ ψ]] = M[[φ]] · M[[ψ]]
3.1.6 Negation
Notice that addition by 1 modulo 2 toggles values.
The following tables show addition by 1 modulo 2 and the truth table for negation, to illustrate that translating negation to addition by 1 gives the correct results.
a a + 1(mod 2) a ¬a
1 0 T F
0 1 F T
The translation is defined as follows:
M[[¬φ]] = (M[[φ]] + 1)
3.1.7 Exclusive-Or
We might hope that disjunction would be properly modeled by addition ... “If
wishes were horses, beggars would ride.” Consider the table for addition modulo
2 and compare it with the table for disjunction – clearly they do not match.
a b a + b(mod 2) a b a∨b
1 1 0 T T T
1 0 1 T F T
0 1 1 F T T
0 0 0 F F F
3.1.8 Disjunction
We can derive disjunction using the following identity of propositional logic and
the translation rules we have defined so far.
(p ∨ q) ⇔ ¬(¬p ∧ ¬q)
M[[¬(¬p ∧ ¬q)]]
= M[[(¬p ∧ ¬q)]] + 1
= (M[[¬p]] · M[[¬q]]) + 1
= ((M[[p]] + 1) · (M[[q]] + 1)) + 1
= ((p + 1) · (q + 1)) + 1
= pq + p + q + 1 + 1
= pq + p + q + 2
Since 2 ≡ 0(mod 2), we can cancel the 2 and end up with the term pq + p + q.
Here are the tables (you might check for yourself that the entries are correct.)
a b ab + a + b(mod 2) a b a∨b
1 1 1 T T T
1 0 1 T F T
0 1 1 F T T
0 0 0 F F F
3.1.9 Implication
The following propositional formula holds.
(p ⇒ q) ⇔ (¬p ∨ q)
Thus, implication can be reformulated in terms of negation and disjunction.
Using the translation constructed so far, we get the following
M[[¬p ∨ q]]
= M[[¬p]] · M[[q]] + M[[¬p]] + M[[q]]
= (M[[p]] + 1) · q + (M[[p]] + 1) + q
= (p + 1)q + (p + 1) + q
= pq + q + (p + 1) + q
= pq + 2q + (p + 1)
Since 2q ≡ 0(mod 2), we can cancel the 2q term and the final formula for the
translation of implication is pq + p + 1. And we get the following tables.
a b ab + a + 1(mod 2) a b a⇒b
1 1 1 T T T
1 0 0 T F F
0 1 1 F T T
0 0 1 F F T
Thus,
M[[⊥]] = 0
M[[x]] = x
M[[¬φ]] = M[[φ]] + 1
M[[φ ∧ ψ]] = (M[[φ]] · M[[ψ]])
M[[φ ∨ ψ]] = (M[[φ]] · M[[ψ]]) + M[[φ]] + M[[ψ]]
M[[φ ⇒ ψ]] = (M[[φ]] · M[[ψ]]) + M[[φ]] + 1
M[[(p ∨ q) ⇒ p]]
= (M[[p ∨ q]] · M[[p]]) + M[[p ∨ q]] + 1
= (((M[[p]] · M[[q]]) + M[[p]] + M[[q]]) · p) + ((M[[p]] · M[[q]]) + M[[p]] + M[[q]]) + 1
= (((p · q) + p + q) · p) + ((p · q) + p + q) + 1
= (((pq) + p + q)p) + ((pq) + p + q) + 1
= (p²q + p² + pq) + pq + p + q + 1
= pq + p + pq + pq + p + q + 1      (since p² ≡ p (mod 2))
= 2(pq) + 2p + pq + q + 1
= pq + q + 1
We can check this for all combinations of values for p and q. Instead, we
notice that the final formula is the same as the translation for implication of
q ⇒ p. To check our work we could check that:
((p ∨ q) ⇒ p) ⇔ (q ⇒ p)
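The translation M can also be run directly: given 0/1 values for the variables, compute the polynomial modulo 2. A sketch in Haskell (m and check are our own names; Prop as in the earlier sketches); check tests the claimed equivalence of M[[(p ∨ q) ⇒ p]] with pq + q + 1 on all four assignments:

data Prop = Bot | Var String | Not Prop | And Prop Prop
          | Or Prop Prop | Implies Prop Prop

-- Evaluate M[[phi]] modulo 2, given 0/1 values for the variables.
m :: [(String, Integer)] -> Prop -> Integer
m _ Bot           = 0
m a (Var x)       = maybe 0 id (lookup x a) `mod` 2
m a (Not p)       = (m a p + 1) `mod` 2
m a (And p q)     = (m a p * m a q) `mod` 2
m a (Or p q)      = (m a p * m a q + m a p + m a q) `mod` 2
m a (Implies p q) = (m a p * m a q + m a p + 1) `mod` 2

-- Compare M[[(p ∨ q) ⇒ p]] against pq + q + 1 on all assignments.
check :: Bool
check = and [ m [("p", p), ("q", q)] phi == (p * q + q + 1) `mod` 2
            | p <- [0, 1], q <- [0, 1] ]
  where phi = Implies (Or (Var "p") (Var "q")) (Var "p")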
We have presented the syntax of propositional logic and have given a semantics (meaning) based on truth tables over the set of truth values {T, F}. An alternative meaning can be assigned to propositional formulas by translating them into algebraic form over the natural numbers and then looking at the congruences modulo 2, i.e. by classifying them as congruent to 0 or 1 depending on whether they're even or odd.
Such an interpretation is correct if it makes all the same formulas true.
3.1.11 Notes
In modern times, the Boolean algebras have been investigated abstractly [25].
1.) φ ⇔ φ
2.) (φ ⇔ ψ) ⇔ (ψ ⇔ φ)
3.) ((φ ⇔ ψ) ∧ (ψ ⇔ R)) ⇒ (φ ⇔ R)
We shall see in Chap 7 that operations like ⇔ that have properties (1), (2)
and (3) behave like an equality. If you interpret “⇔” as “=” and φ, ψ and R
as numbers you will see this. Property (1) shows ⇔ is reflexive, property (2)
shows it is symmetric and property (3) shows it is transitive.
A set of connectives

C ⊆ {⊥, ¬, ∨, ∧, ⇒, ⇔}

is complete if those connectives not in the set C can be defined in terms of the connectives that are in the set C.
Example 3.4. The following definitions show that the set {¬, ∨} is complete.

1.) ⊥ =def ¬(φ ∨ ¬φ)
2.) (φ ∧ ψ) =def ¬(¬φ ∨ ¬ψ)
3.) (φ ⇒ ψ) =def ¬φ ∨ ψ
4.) (φ ⇔ ψ) =def ¬(¬(¬φ ∨ ψ) ∨ ¬(¬ψ ∨ φ))
To verify that these definitions are indeed correct, you could verify that
the columns of the truth table for the defined connective match (row-for-row)
the truth table for the definition. Alternatively, you could replace the symbol "=def" by "⇔" and use the sequent proof rules to verify the resulting
formulas, e.g. to prove the definition for ⊥ given above is correct, prove the
sequent ` ⊥ ⇔ ¬(φ ∨ ¬φ). Another method of verification would be to do
equational style proofs starting with the left-hand side of the definition and
rewriting to the right hand side.
Here are example verifications using the equational style of proof. We label
each step in the proof by the equivalence used to justify it or, if the step follows
from a definition we say which one.
1.) ⊥ ⟨i⟩⇐⇒ ¬¬⊥ ⟨⊤ def⟩⇐⇒ ¬⊤ ⟨xiv⟩⇐⇒ ¬(φ ∨ ¬φ)
2.) (φ ∧ ψ) ⟨i⟩⇐⇒ ¬¬(φ ∧ ψ) ⟨iv⟩⇐⇒ ¬(¬φ ∨ ¬ψ)
3.) (φ ⇒ ψ) ⟨iii⟩⇐⇒ ¬φ ∨ ψ
4.) (φ ⇔ ψ) ⟨⇔ def.⟩⇐⇒ (φ ⇒ ψ) ∧ (ψ ⇒ φ)
⟨iii⟩⇐⇒ (¬φ ∨ ψ) ∧ (ψ ⇒ φ)
⟨iii⟩⇐⇒ (¬φ ∨ ψ) ∧ (¬ψ ∨ φ)
⟨2⟩⇐⇒ ¬(¬(¬φ ∨ ψ) ∨ ¬(¬ψ ∨ φ))
Exercise 3.3. Prove that the set {¬, ∧} is complete for {⊥, ¬, ∨, ∧, ⇒, ⇔}.
You’ll need to give definitions for ⊥, ∨, ⇒ and ⇔ in terms of ¬ and ∧ and
then prove that your definitions are correct.
Exercise 3.4. Prove that the set {⊥, ⇒} is complete for {⊥, ¬, ∨, ∧, ⇒, ⇔}.
Chapter 4
Predicate Logic
4.1 Predicates
To make this extension to our logic we add truth-valued functions, called
predicates, which map elements from a domain of discourse to the values in B.
i.) f () = 5
ii.) g(x) = x + 5
iii.) h(x, y) = (x + y) − 1
iv.) f1 (x, y, z) = x ∗ (y + z)
v.) g1 (x, y, z, w) = f1 (x, y, w) − z
The first function is nullary, it takes no arguments. Typically, we will drop the
parentheses and write f instead of f (). The second function takes one argument
and so is a unary function. The third function is binary. The fourth and fifth
are 3-ary and 4-ary functions respectively.
Example 4.1. In ordinary arithmetic, the binary predicates include less than
(written <) and equals (written =). Typically these are written in infix notation
i.e. instead of writing = (x, y) and < (x, y) we write x = y and x < y; do not
let this infix confuse you, they are still binary predicates. We can define other
predicates in terms of these two. For example we can define a binary predicate
less-than-or-equals as:
i ≤ j =def ((i = j) ∨ (i < j))
We could define a unary predicate which is true when its argument is equal to
0 and is false otherwise:
=0 (i) =def i = 0
Similarly, we can define a ternary predicate which holds when its third argument
lies strictly between the first two:
between(i, j, k) =def ((i < k) ∧ (k < j))
4.2.1 Variables
The definitions of the syntactic classes of terms and formulas (both defined
below) depend on an unbounded collection of variable symbols; we call this set
V.
V = {x, y, z, w, x1 , y1 , z1 , w1 , x2 , · · · }
Unlike propositional variables, which denoted truth-values, these variables will
range over individual elements in the domain of discourse. Like propositional
variables, we assume the set V is fixed (and so we do not include it among the
parameters of the definitions that use it.)
4.2.2 Terms
The syntax of terms (the collection of which we will write as T ) is determined
by a set of n-ary function symbols, call this set F. We assume the arity of a
function symbol can be determined.
Definition 4.4 (Terms) The terms T[F], defined over a set of function symbols
F, are given by the following grammar:
t ::= x | f (t1 , · · · , tn )
where:
F is a set of function symbols,
x ∈ V is a variable,
f ∈ F is a function symbol for a function of arity n, where n ≥ 0, and
ti ∈ T[F] denote previously constructed terms, 1 ≤ i ≤ n.
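The grammar translates directly into an algebraic datatype. The following Haskell sketch (Term, Var, App and vars are our names, not the text's) represents terms and collects the variables occurring in a term:

  -- A sketch of the term language as a datatype.
  import Data.List (nub)

  type VarName = String
  type FunName = String

  data Term = Var VarName            -- a variable x ∈ V
            | App FunName [Term]     -- f(t1, ..., tn), n ≥ 0
    deriving Show

  -- The variables occurring in a term:
  vars :: Term -> [VarName]
  vars (Var x)    = [x]
  vars (App _ ts) = nub (concatMap vars ts)   -- union over the arguments

  -- Example: vars (App "f" [Var "x", App "g" [Var "y", Var "x"]]) == ["x","y"]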
4.2.3 Formulas
Definition 4.5 (Predicate Logic Formula) Formulas of predicate logic are
defined over a set of function symbols F and a set of predicate symbols P and
are given by the following grammar:
φ ::= ⊥ | P (t1 , · · · , tn ) | ¬φ | (φ ∧ φ) | (φ ∨ φ) | (φ ⇒ φ) | ∀x.φ | ∃x.φ
where P ∈ P is a predicate symbol of arity n, the ti ∈ T[F] are terms, and
x ∈ V is a variable.
Some Examples
In the following examples we show uses of the quantifiers to formally encode
some theorems of arithmetic.
Example 4.3. The law of trichotomy in the language of arithmetic says:
For all integers i and j, either: i is less than j or i is equal to j or
j is less than i.
We can formalize this statement, making explicit that less-than and equals
are binary predicates by writing them as lt(i, j) and eq(i, j) respectively:
∀i.∀j.(lt(i, j) ∨ (eq(i, j) ∨ lt(j, i)))
We can rewrite the same statement as follows using the ordinary notation
of arithmetic (which perhaps makes the fact that less-than and equals are pred-
icates less obvious.)
∀i.∀j.(i < j ∨ (i = j ∨ j < i))
Note that the truth of such a statement depends on the domain of discourse
(the set from which the variables i and j take their values); a formula may be
a theorem over the natural numbers and yet fail over the integers or the reals,
or vice versa.
As another example, the antisymmetry of ≤ can be written:
∀n.∀m.(n ≤ m ∧ m ≤ n) ⇒ n = m
Remark 4.1. The fact that these statements are true when the symbols are
interpreted in the ordinary way we think of numbers is a fact that is external
to logic. The predicates less-than and equals are particular predicates that
have particular values when interpreted in ordinary arithmetic. If we swapped
the interpretations of the symbols (i.e. if we interpreted i < j to be true
whenever i and j are equal numbers and interpreted i < j to be false otherwise;
and similarly interpreted i = j to be true whenever i was less than j and
false otherwise) we would still have well-formed formulas in the language of
predicate logic.
4.3 Substitution
Substitution is the process of replacing a variable by some more complex piece
of syntax such as a term or a formula. Readers are already familiar with this
process, though there is some added complexity that results from notations
that bind variables, e.g. summation (Σ_{i=j}^{k} f(i)), product (Π_{i=j}^{k} f(i)),
integral (∫_a^b f(x) dx), and the quantifiers of predicate logic (∀x.φ(x) and ∃x.φ(x)).
As an example of a simple substitution (without considerations related to
bindings) consider the following example.
Example 4.7. If we consider the polynomial 2x² − 3x − 1 and say x = y² + 2y + 1
then, by substitution, we know the polynomial is equal to
2(y² + 2y + 1)² − 3(y² + 2y + 1) − 1
Summation is an example of a notation that binds a variable: in Σ_{i=j}^{k} f(i),
the variable i is bound and is not subject to substitution.
Thus, the variable z occurs in a term which is simply a variable of the form
z. Otherwise, a variable occurs in a term of the form f (t1 , · · · , tn ) if and only
if it occurs in one of the terms ti , 1 ≤ i ≤ n. To collect them, we simply union
all the sets of variables occurring in each ti.¹
¹ If n = 0 (i.e. if the arity of the function symbol is 0) then ⋃_{i=1}^{0} occurs(ti) = {}.
Thus, a variable x occurs free in a term which is simply a variable of the form
z if and only if x = z. Otherwise, x occurs in a term of the form f (t1 , · · · , tn )
if and only if x occurs in one of the terms ti , 1 ≤ i ≤ n.
Thus, a variable x occurs free in a formula iff it occurs in the formula and it is
not in the scope of any binding of the variable x.
Bound Variables
Bound variables can only occur in formulas; this is because there are no binding
operators in the language of terms.
BV(P (t1 , · · · , tn )) = {}
BV(¬φ) = BV(φ)
BV(φ ∧ ψ) = BV(φ) ∪ BV(ψ)
BV(φ ∨ ψ) = BV(φ) ∪ BV(ψ)
BV(φ ⇒ ψ) = BV(φ) ∪ BV(ψ)
BV(∀z.φ) = BV(φ) ∪ {z}
BV(∃z.φ) = BV(φ) ∪ {z}
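These recursive equations transcribe directly into code. The sketch below (our datatype and names) computes BV and, for contrast, FV by recursion on the structure of a formula:

  -- A sketch of BV and FV by structural recursion.
  import Data.List (nub, (\\))

  data Term = TVar String | App String [Term]
  data Formula = Bot | Pred String [Term] | Not Formula
               | And Formula Formula | Or Formula Formula | Imp Formula Formula
               | Forall String Formula | Exists String Formula

  termVars :: Term -> [String]
  termVars (TVar x)   = [x]
  termVars (App _ ts) = nub (concatMap termVars ts)

  bv :: Formula -> [String]
  bv Bot          = []
  bv (Pred _ _)   = []
  bv (Not p)      = bv p
  bv (And p q)    = nub (bv p ++ bv q)
  bv (Or p q)     = nub (bv p ++ bv q)
  bv (Imp p q)    = nub (bv p ++ bv q)
  bv (Forall z p) = nub (z : bv p)   -- the binder contributes z
  bv (Exists z p) = nub (z : bv p)

  fv :: Formula -> [String]
  fv Bot          = []
  fv (Pred _ ts)  = nub (concatMap termVars ts)
  fv (Not p)      = fv p
  fv (And p q)    = nub (fv p ++ fv q)
  fv (Or p q)     = nub (fv p ++ fv q)
  fv (Imp p q)    = nub (fv p ++ fv q)
  fv (Forall z p) = fv p \\ [z]      -- the binder removes z
  fv (Exists z p) = fv p \\ [z]

  -- In P(x) ∧ ∀x.P(x), x is both free and bound:
  -- fv (And (Pred "P" [TVar "x"]) (Forall "x" (Pred "P" [TVar "x"]))) == ["x"]
  -- bv (And (Pred "P" [TVar "x"]) (Forall "x" (Pred "P" [TVar "x"]))) == ["x"]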
Discussion
The algorithms for computing the free variables and bound variables of a formula
are given by recursion on the structure of the formula. By drawing a syntax tree,
it is easy to see which variables are free and which are bound. Choose a variable
in the tree. It is bound if it is the left child of a quantifier or if, traversing the
tree to its root a quantifier is encountered having the same variable as a left
child. A variable is free if it is not the left child of a quantifier or if the path
from the variable to the root of the syntax tree does not include a quantifier
whose left child matches the variable.
[Syntax trees omitted in this rendering: two trees rooted at ∀ and ∃, with the
bound variables as left children of the quantifier nodes and the predicate symbols
Q and R, the variables x, y, z, and the constant a at the leaves.]
We can refer to variables by their left to right position in the formula. The
leftmost x in the formula is bound because, in the syntax tree, it is a left child
of the quantifier ∀. Similarly, the same holds for the leftmost y. The second
occurrence of x in the formula is bound because on the path to the root of the
syntax tree passes a ∀ quantifier whose left child is also x. The second y in
the formula is bound because the path to the root passes an ∃ quantifier whose
left child is a y. The first occurrence of the variable z in the formula is free
because no quantifier on the path to the root has a z as a left child. The second
z occurring in the formula is bound because it is the left child of an ∃ quantifier.
The third x is free. The constant symbol a is not a variable and so is neither
free nor bound. The last z in the formula is bound by the ∃ quantifier above it
in the syntax tree.
Remark 4.2. Note that there are formulas where a variable may occur both
free and occur bound, e.g. x in the example above. As another example where
this happens, consider the formula P (x) ∧ ∀x.P (x). The first occurrence of x is
free and the second and third occurrences of x are bound.
More evidence for the pivotal role substitution plays: the only computation
mechanism in Church’s2 lambda calculus [6] is substitution, and anything we
currently count as algorithmically computable can be computed by a term of
the lambda calculus.
For terms, there are no binding operators so capture avoiding substitution
is just ordinary substitution – i.e. we just search for the variable to be replaced
by a term and when one that matches is found, it is replaced.
x[x := t] = t
z[x := t] = z if (x 6= z)
f (t1 , · · · , tn )[x := t] = f (t1 [x := t], · · · tn [x := t])
The first clause of definition says that if you are trying to substitute the
term t for free occurrences of the variable x in the term that consists of the
single variable x, then go ahead and do it – i.e. replace x by t and that is the
result of the substitution.
The second clause of the definition says that if you’re looking to substitute t
for x, but you’re looking at a variable z where z is different from x, do nothing
– the result of the substitution is just the variable z.
The third clause of the definition follows a standard pattern of recursion.
The result of substituting t for free occurrences of x in the term f (t1 , · · · , tn ),
is the term obtained by substituting t for x in each of the n arguments ti , 1 ≤
i ≤ n, and then returning the term assembled from these parts by placing the
substituted argument terms in the appropriate places.
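The three clauses transcribe directly into a recursive function. A minimal sketch for terms (our names):

  -- A direct transcription of the three clauses above.
  data Term = Var String | App String [Term]

  subst :: Term -> (String, Term) -> Term
  subst (Var z)    (x, t) | z == x    = t            -- first clause: replace x by t
                          | otherwise = Var z        -- second clause: leave z alone
  subst (App f ts) (x, t) = App f [ subst ti (x, t) | ti <- ts ]  -- third clause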
Note that substitution of term t for free occurrences of the variable x can
never affect a function symbol (f ) since function symbols are not variables.
2 Alonzo Church was an American mathematician and logician who taught at Princeton
University. Among other things, he is known for his development of λ-calculus, a notation for
functions that serves as a theoretical basis for modern programming languages.
⊥[x := t] = ⊥
P (t1 , · · · , tn )[x := t] = P (t1 [x := t], · · · , tn [x := t])
(¬φ)[x := t] = ¬(φ[x := t])
(φ ∧ ψ)[x := t] = (φ[x := t] ∧ ψ[x := t])
(φ ∨ ψ)[x := t] = (φ[x := t] ∨ ψ[x := t])
(φ ⇒ ψ)[x := t] = (φ[x := t] ⇒ ψ[x := t])
(∀x.φ)[x := t] = (∀x.φ)
(∀y.φ)[x := t] = (∀y.φ[x := t])
if (x 6= y, y 6∈ F V (t))
(∀y.φ)[x := t] = (∀z.φ[y := z][x := t])
if (x 6= y, y ∈ F V (t), z 6∈ (F V (t) ∪ F V (φ) ∪ {x}))
(∃x.φ)[x := t] = (∃x.φ)
(∃y.φ)[x := t] = (∃y.φ[x := t])
if (x 6= y, y 6∈ F V (t))
(∃y.φ)[x := t] = (∃z.φ[y := z][x := t])
if (x 6= y, y ∈ F V (t), z 6∈ (F V (t) ∪ F V (φ) ∪ {x}))
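The full capture-avoiding substitution for formulas can be sketched the same way. In the code below (all names ours) the helper fresh plays the role of choosing the new variable z, and quant implements the three quantifier cases of the definition:

  -- A sketch of capture-avoiding substitution φ[x := t].
  import Data.List (nub, (\\))

  data Term = TVar String | App String [Term]
  data Formula = Bot | Pred String [Term] | Not Formula
               | And Formula Formula | Or Formula Formula | Imp Formula Formula
               | Forall String Formula | Exists String Formula

  fvT :: Term -> [String]
  fvT (TVar x)   = [x]
  fvT (App _ ts) = nub (concatMap fvT ts)

  fv :: Formula -> [String]
  fv Bot          = []
  fv (Pred _ ts)  = nub (concatMap fvT ts)
  fv (Not p)      = fv p
  fv (And p q)    = nub (fv p ++ fv q)
  fv (Or p q)     = nub (fv p ++ fv q)
  fv (Imp p q)    = nub (fv p ++ fv q)
  fv (Forall y p) = fv p \\ [y]
  fv (Exists y p) = fv p \\ [y]

  substT :: Term -> (String, Term) -> Term
  substT (TVar z) (x, t) | z == x    = t
                         | otherwise = TVar z
  substT (App f ts) s    = App f (map (`substT` s) ts)

  -- Pick a variable not occurring in the avoid-list.
  fresh :: [String] -> String
  fresh avoid = head [ v | v <- names, v `notElem` avoid ]
    where names = [ "z" ++ show i | i <- [(1 :: Int) ..] ]

  subst :: Formula -> (String, Term) -> Formula
  subst Bot _          = Bot
  subst (Pred p ts) s  = Pred p (map (`substT` s) ts)
  subst (Not f) s      = Not (subst f s)
  subst (And f g) s    = And (subst f s) (subst g s)
  subst (Or f g) s     = Or (subst f s) (subst g s)
  subst (Imp f g) s    = Imp (subst f s) (subst g s)
  subst (Forall y f) s = quant Forall y f s
  subst (Exists y f) s = quant Exists y f s

  -- The three quantifier cases: x shadowed, no capture, or rename.
  quant :: (String -> Formula -> Formula)
        -> String -> Formula -> (String, Term) -> Formula
  quant q y f (x, t)
    | y == x            = q y f                               -- x is shadowed
    | y `notElem` fvT t = q y (subst f (x, t))                -- no capture
    | otherwise         = q z (subst (subst f (y, TVar z)) (x, t))
    where z = fresh (fvT t ++ fv f ++ [x])                    -- rename y to z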
4.4 Proofs
4.4.1 Proof Rules for Quantifiers
If we have a formula with the principal constructor ∀ (say ∀x.φ) on the right of
a sequent then it is enough to prove the sequent where ∀x.φ has been replaced
by the formula φ[x := y], where y is a new variable not occurring free in any
formula of the sequent. Choosing a new variable not occurring free anywhere in
the sequent, to replace the bound variable x has the effect of selecting an arbi-
trary element from the domain of discourse i.e. by choosing a completely new
variable, we know nothing about it — except that it stands for some element of
the domain of discourse.
Γ ` ∆1 , φ[x := y], ∆2 where variable y is not free in any
(∀R)
Γ ` ∆1 , ∀x.φ, ∆2 formula of (Γ ∪ ∆1 ∪ {∀x.φ} ∪ ∆2 ).
On the left
The rule for a ∀ on the left says, to prove a sequent with a ∀ occurring as the
principal connective of a formula on the left side (say ∀x.φ) it is enough to prove
the sequent obtained by replacing ∀x.φ by the formula φ[x := t] where t is any
term³.
Γ1 , φ[x := t], Γ2 ` ∆
(∀L) where t ∈ T .
Γ1 , ∀x.φ, Γ2 ` ∆
We justify this by noting that if we assume ∀x.φ (this is what it means to
be on the left) then it must be the case that φ[x := t] is true for any term t
whatsoever.
Γ ` ∆1 , φ[x := t], ∆2
(∃R) where t ∈ T .
Γ ` ∆1 , ∃x.φ, ∆2
Note that the choice of t may require some creative thought.
Definition 4.14 (existential witness) The term t substituted for the bound
variable in an ∃R-rule is called the existential witness.
On the left
The rule for a ∃ on the left says, to prove a sequent with a ∃ occurring as
the principal connective of a formula on the left side, it is enough to prove the
sequent obtained by replacing the bound variable of the existential by an arbitrary
variable y where y is not free in any formula of the sequent.
Γ1 , φ[x := y], Γ2 ` ∆ where variable y is not free in any
(∃L)
Γ1 , ∃x.φ, Γ2 ` ∆ formula of (Γ1 ∪ Γ2 ∪ {∃x.φ} ∪ ∆).
Since we know ∃x.φ, we know something (call it y) exists which satisfies
φ[x := y], but we can not assume anything about y other than that it has been
arbitrarily chosen from the domain of discourse.
free variables occurring in the sequent, then your proof is valid in every domain of discourse,
including the empty one. Logics allowing the empty domain of discourse are called Free Logics.
Example 4.10. Consider the sequent ` (∀x.P (x)) ⇒ (∃y.P (y)). Surely if
everything satisfies property P , then something satisfies property P .
Initially, the only rule that applies is the propositional ⇒R-rule. It matches
this sequent by the following substitution:
Γ := [ ]
∆1 := [ ]
σ1 = ∆2 := [ ]
φ := (∀x.P (x))
ψ := (∃y.P (y))
The result of applying this substitution to the premise of the ⇒R-rule results
in a partial proof tree with the new goal ∀x.P (x) ` ∃y.P (y).
Now, the only rule that applies is the ∃R-rule. We choose t to be z and
match by the following substitution.
Γ := [P (z)]
∆1 := [ ]
∆2 := [ ]
σ3 =
φ := P (y)
x := y
t := z
The partial proof generated by applying this rule with this substitution is
as follows:
P (z) ` P (z)
(∃R)
P (z) ` ∃y.P (y)
(∀L)
∀x.P (x) ` ∃y.P (y)
(⇒R)
` ∀x.P (x) ⇒ ∃y.P (y)
Γ1 := [ ]
Γ2 := [ ]
σ4 = ∆1 := [ ]
∆2 := [ ]
φ := P (z)
(Ax)
P (z) ` P (z)
(∃R)
P (z) ` ∃y.P (y)
(∀L)
∀x.P (x) ` ∃y.P (y)
(⇒R)
` ∀x.P (x) ⇒ ∃y.P (y)
Example 4.11. In the case of propositional logic we did not need to apply
any of the structural rules; however, they may be required in the case of the
quantifier rules. Consider the following theorem (sometimes called the drinker
paradox):
∃x.(P (x) ⇒ ∀x.P (x))
Here is a sequent proof whose first step is to copy the formula using the rule
for contraction on the right.
(Ax)
P (a), P (y) ` P (y), ∀x.P (x)
(⇒R)
P (a) ` P (y), P (y) ⇒ ∀x.P (x)
(∃R)
P (a) ` P (y), ∃x.(P (x) ⇒ ∀x.P (x))
(∀R)
P (a) ` ∀x.P (x), ∃x.(P (x) ⇒ ∀x.P (x))
(⇒R)
` P (a) ⇒ ∀x.P (x), ∃x.(P (x) ⇒ ∀x.P (x))
(∃R)
` ∃x.(P (x) ⇒ ∀x.P (x)), ∃x.(P (x) ⇒ ∀x.P (x))
(CR)
` ∃x.(P (x) ⇒ ∀x.P (x))
Axiom Rule
The rule is:
(Ax)
Γ1 , φ, Γ2 ` ∆1 , φ, ∆2
(⊥Ax)
Γ1 , ⊥, Γ2 ` ∆
We say: “But now we have assumed false and the theorem is true.” or “But
now, we have derived a contradiction and the theorem is true.”
Conjunction Rules
The rule on the right is:
Γ ` ∆1 , φ, ∆2 Γ ` ∆1 , ψ, ∆2
(∧R)
Γ ` ∆1 , (φ ∧ ψ), ∆2
We say: “To show φ ∧ ψ there are two cases: (case 1.) insert translated proof
of the left branch here; (case 2.) insert translated proof of the right branch here.”
Disjunction
The formal rule for a disjunction on the right is:
Γ ` ∆1 , φ, ψ, ∆2
(∨R)
Γ ` ∆1 , (φ ∨ ψ), ∆2
We say: “To show φ ∨ ψ we must either show φ or show ψ. Insert translated
proof of the premise here.”
The sequent proof rule for disjunction on the left is:
Γ1 , φ, Γ2 ` ∆ Γ1 , ψ, Γ2 ` ∆
(∨L)
Γ1 , (φ ∨ ψ), Γ2 ` ∆
We say: “Since we know φ ∨ ψ we proceed by cases: suppose φ is true, then
insert translated proof from the left branch here. On the other hand, if ψ holds:
insert translated proof from the right branch here.”
Or, we say: “Since φ ∨ ψ holds, we consider the two cases: (case 1, φ holds:)
insert translated proof from the left branch here. (case 2, ψ holds:) insert
translated proof from the right branch here.”
Implication Rules
The formal rule for an implication on the right is:
Γ, φ ` ∆1 , ψ, ∆2
(⇒R)
Γ ` ∆1 , (φ ⇒ ψ), ∆2
We say: “To prove φ ⇒ ψ, assume φ and show ψ, insert translated proof of
the subgoal here..”
Negation
The formal rule for a negation on the right is:
Γ, φ ` ∆1 , ∆2
(¬R)
Γ ` ∆1 , ¬φ, ∆2
The formal rule for a negation on the left is:
Γ1 , Γ2 ` φ, ∆
(¬L)
Γ1 , ¬φ, Γ2 ` ∆
Universal Quantifier
The formal rule for a ∀ on the right is:
Γ ` ∆1 , φ[x := y], ∆2
(∀R) where variable y is not free in any formula of the conclusion.
Γ ` ∆1 , ∀x.φ, ∆2
We say: “To prove ∀x.φ, pick an arbitrary y and show φ[x := y]⁵. Insert
translated proof of the premise here.” or, we simply say: “Pick an arbitrary y
and show φ[x := y]. Insert translated proof of the premise here.”
The formal rule for ∀ on the left says:
Γ1 , φ[x := t], Γ2 ` ∆
(∀L) where t ∈ T .
Γ1 , ∀x.φ, Γ2 ` ∆
We say: “ Since we know that for every x, φ is true, assume φ[x := t]. Insert
translated proof of premise here.” or, we say: “Assume φ[x := t].”
Existential Quantifiers
The rule for ∃ on the right is:
Γ ` ∆1 , φ[x := t], ∆2
(∃R) where t ∈ T .
Γ ` ∆1 , ∃x.φ, ∆2
We say: “Let t be the witness for x in ∃x.φ. We must show φ[x := t]. Insert
translated proof of the premise here.” Or, we say: “To show ∃x.φ, we choose the
witness t and show φ[x := t]. Insert translated proof of the premise here.”
⁵ In this rule, and those that follow, we take φ[x := y] to be the formula that results from
the substitution of y for x in φ, i.e. actually do the substitution before writing the formula
in your proof.
Chapter 5
Set Theory
In this chapter we present elementary set theory. Set theory serves as a foun-
dation for mathematics1 , i.e. in principle, we can describe all of mathematics
using set theory.
Our presentation is based on Mitchell’s [38]. A classic presentation can be
found in Halmos’ [26] Naive Set Theory.
¹ Other foundational theories exist, e.g. category theory [32, 41] or type theory, which can
also serve as the foundations of mathematics. Set theory is the foundational theory accepted
by most working mathematicians.
5.1 Introduction
Set theory is the mathematical theory of collections. A set is a collection of
abstract objects where the order and multiplicity of the elements is not taken
into account. This is in contrast to other structures like lists or sequences, where
both the order of the elements and the number of times they occur (multiplicity)
are taken into account when determining if two are equal. For equality on sets,
all that matters is membership. Two sets are considered equal if they have the
same elements.
{a, 1, 2}
{a}
{1, 2, a}
{a, a, a}
Sometimes, if a pattern is obvious, we use an informal notation to indicate
larger sets without writing down the names of all the elements in the set. For
example, we might write {1, 2, 3, · · · , 100} for the set of the first one hundred
positive integers.
We write x ∈ A to assert that x is a member of the set A, and x ∉ A for the
negation. Thus, the following propositions are true:
a ∈ {a, 1, 2}
1 ∉ {a}
1 ∈ {1, 2, a}
2 ∉ {a, a, a}
Note that sets may contain other sets as members. Thus,
{1, {1}}
is a set and the following propositions are true.
1 ∈ {1, {1}}
{1} ∈ {1, {1}}
Consider the following true propositions.
1 6∈ {{1}}
{1} 6∈ {1}
5.2.1 Extensionality
Sets are determined by their members or elements. This means, the only prop-
erty significant for determining when two sets are equal is the membership re-
lation. Thus, in a set, the order of the elements is insignificant and the number
of times an element occurs in a set (its multiplicity) is also insignificant. This
equality (i.e. the one that ignores multiplicity and order) is called extensionality.
2 Indeed there are some serious philosophers who reject it as senseless [20].
Definition 5.1.
A = B =def ∀x. (x ∈ A ⇔ x ∈ B)
{a, 1, 2}
{a}
{1, 2, a}
{a, a, a}
The first and the third are equal as sets and the second and the fourth are
equal. It is not unreasonable to think of these equal sets as different descriptions
(or names) of the same mathematical object.
Note that the set {1, 2} is not equal³ to the set {1, {2}}; this is because
2 ∈ {1, 2} but 2 ∉ {1, {2}}.
5.2.2 Subsets
Definition 5.2 (Subset) A set A is a subset of another set B if every element
of A is also an element of B. Formally, we write:
A ⊆ B =def ∀x. (x ∈ A ⇒ x ∈ B)
Thus, the set of even numbers (call this set 2N) is a subset of the natural
numbers N, in symbols 2N ⊆ N.
The following theorem says that every set is a subset of itself: ∀A. A ⊆ A.
Proof:
(Ax)
x∈A`x∈A
⇒R
`x∈A⇒x∈A
∀R
` ∀x. (x ∈ A ⇒ x ∈ A)
(⊆def)
`A⊆A
(∀R)
` ∀A. A ⊆ A
³ Some philosophers deny the distinction commonly made between these sets; they say that
they “don’t understand” the distinction being made between these sets.
Theorem 5.2 (subset extensionality) For every set A and every set B,
A = B ⇔ ((A ⊆ B) ∧ (B ⊆ A))
Proof: Since the theorem is an if and only if, we must show two cases; we label
them below as (⇒) and (⇐).
(⇒) Assume A = B; we must show that A ⊆ B and that B ⊆ A. By
definition, if A = B, then ∀x. (x ∈ A ⇔ x ∈ B). First, to show that A ⊆ B, we
must show that ∀x. (x ∈ A ⇒ x ∈ B). Pick an arbitrary thing, call it y; we
must show that y ∈ A ⇒ y ∈ B. But we assumed A = B, and this means (by the
definition of equality) y ∈ A ⇔ y ∈ B. Since the definition of equality is an iff,
we may assume y ∈ A ⇒ y ∈ B and y ∈ B ⇒ y ∈ A. But then we have shown
y ∈ A ⇒ y ∈ B, as was desired. The argument for the case B ⊆ A is
similar.
This theorem gives us another way to prove sets are equal; instead of proving
directly that every element is in both, show that the two sets are subsets of one another.
In the same way we labeled the cases of the proof of an (⇔) by (⇒) and (⇐)
we sometimes label the cases of a proof that two sets are equal by (⊆) and (⊇).
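For finite sets represented as lists, the subset test and equality-by-mutual-inclusion look like this (a sketch; the representation and names are ours):

  -- Subset and extensional equality for finite sets as lists.
  subset :: Eq a => [a] -> [a] -> Bool
  subset a b = all (`elem` b) a          -- ∀x. x ∈ A ⇒ x ∈ B

  setEq :: Eq a => [a] -> [a] -> Bool
  setEq a b = subset a b && subset b a   -- A ⊆ B ∧ B ⊆ A

  -- Order and multiplicity are ignored: setEq [1,2,2] [2,1] == True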
Axiom (empty set): ∃A. ∀x. x ∉ A.
The axiom says that there exists a set A which, for any thing whatsoever
(call it x), that thing is not in the set A.
We can prove that the empty set is unique i.e. we can prove the following
theorem which says that any two sets having the property that they contain no
elements are equal. In this case the property P is defined as P (z) = ∀x.x 6∈ z.
Theorem 5.3. For every set A and every set B, if ∀x.x 6∈ A and ∀x.x 6∈ B,
then A = B.
(Ax) (Ax)
x ∈ A ` x ∈ B, x ∈ A, x ∈ B x ∈ B ` x ∈ B, x ∈ A, x ∈ A
(⇒R) (⇒R)
` x ∈ B, x ∈ A, (x ∈ A ⇒ x ∈ B) ` x ∈ B, x ∈ A, (x ∈ B ⇒ x ∈ A)
(∧R)
` x ∈ B, x ∈ A, (x ∈ A ⇒ x ∈ B) ∧ (x ∈ B ⇒ x ∈ A)
(¬L)
¬(x ∈ B) ` x ∈ A, (x ∈ A ⇒ x ∈ B) ∧ (x ∈ B ⇒ x ∈ A)
(¬L)
¬(x ∈ A), ¬(x ∈ B) ` (x ∈ A ⇒ x ∈ B) ∧ (x ∈ B ⇒ x ∈ A)
({6∈}-def)
¬(x ∈ A), x 6∈ B ` (x ∈ A ⇒ x ∈ B) ∧ (x ∈ B ⇒ x ∈ A)
({6∈}-def)
x 6∈ A, x 6∈ B ` (x ∈ A ⇒ x ∈ B) ∧ (x ∈ B ⇒ x ∈ A)
(∀L)
x 6∈ A, ∀x.x 6∈ B ` (x ∈ A ⇒ x ∈ B) ∧ (x ∈ B ⇒ x ∈ A)
(∀L)
∀x.x 6∈ A, ∀x.x 6∈ B ` (x ∈ A ⇒ x ∈ B) ∧ (x ∈ B ⇒ x ∈ A)
(∧L)
(∀x.x 6∈ A) ∧ (∀x.x 6∈ B) ` (x ∈ A ⇒ x ∈ B) ∧ (x ∈ B ⇒ x ∈ A)
({⇔}-def)
(∀x.x 6∈ A) ∧ (∀x.x 6∈ B) ` x ∈ A ⇔ x ∈ B
(∀R)
(∀x.x 6∈ A) ∧ (∀x.x 6∈ B) ` ∀x.(x ∈ A ⇔ x ∈ B)
({=}-def)
(∀x.x 6∈ A) ∧ (∀x.x 6∈ B) ` A = B
(⇒R)
` ((∀x.x 6∈ A) ∧ (∀x.x 6∈ B)) ⇒ A = B
(∀R)
` ∀B.(((∀x.x 6∈ A) ∧ (∀x.x 6∈ B)) ⇒ A = B)
(∀R)
` ∀A.∀B.(((∀x.x 6∈ A) ∧ (∀x.x 6∈ B)) ⇒ A = B)
Since the set is unique we can give it a name, we denote this unique set by
the constant4 symbol ∅ (and sometimes by just writing empty brackets {}).
Using this new notation, we can restate the empty set axiom in a simpler
form as follows.
4 Do not confuse the symbol ∅ with the Greek letter φ.
Restated using the pair notation, the pairing axiom is:
∀x.∀y.∀z. z ∈ {x, y} ⇔ (z = x ∨ z = y)
From now on we will use this form of the pairing axiom instead of the form
having the existential quantifier in its statement.
Singletons
By choosing x and y in the paring axiom to be the same element we get a
singleton, a set having exactly one element.
∀x.∃A.∀z. z ∈ A ⇔ z = x
This follows from the pairing axiom ∀x.∀y.∀z. z ∈ {x, y} ⇔ (z = x ∨ z = y)
by choosing y to be x; moreover, for each x the set A is unique.
Proof: Singletons are just pairs where the elements are not distinct. Note
that the proof of uniqueness for pairs (Lemma 5.1) does not depend in any way
on distinctness of the elements in the pair and so singletons are also unique.
Note that by extensionality, {x, x} = {x}, and since singletons are unique,
we will write {x} for the singleton containing x. Note that the singleton set
{x} is distinguished from its element x, i.e. x 6= {x}. Because the set that
is claimed to exist in Lemma 5.2 is unique, we can restate that lemma more
simply as follows.
∀x.∀z. z ∈ {x} ⇔ z = x
Note also that every element is a member of its own singleton: ∀w. w ∈ {w}.
Kazimierz Kuratowski (1896–1980) was a Polish mathematician who was active
in the early development of topology and axiomatic set theory.
The pair {a, b} and the pair {b, a} are identical as far as we can tell using set
equality. They are indistinguishable if we only consider their members. What
if we want to be able to distinguish pairs by the order in which their elements
are listed, is it possible using only sets? The following encoding of ordered pairs
was first given by Kuratowski.
Note that the angled brackets (“h” and “i”) are used here to denote ordered
pairs.
Under this definition h1, 2i = {{1}, {1, 2}} and h2, 1i = {{2}, {1, 2}}. As
sets, h1, 2i ≠ h2, 1i. Also, note that the pair consisting of two of the same
element is encoded as the set containing the set containing that element:
ha, ai = {{a}}.
ha, bi = ha′ , b′ i ⇔ (a = a′ ∧ b = b′ )
It must be shown that the projections, which technically are defined here as
relations between a pair and its first or second element, really are functions.
Lemma 5.5.
∀a.π1 ha, ai = π2 ha, ai
Proof: Choose arbitrary a. By definition, ha, ai = {{a}, {a, a}}, thus π1 ha, ai =
a iff {a} ∈ {{a}, {a, a}} which is true. Similarly, π2 ha, ai = a iff {a, a} ∈
{{a}, {a, a}} which is also true and so the theorem holds.
Exercise 5.4. An alternative definition of ordered pairs (this was the first def-
inition) was given by Norbert Wiener in 1914.
hx, yi =def {{{x}, ∅}, {{y}}}
Prove that this definition satisfies the characteristic property of ordered pairs
as stated in Thm. 5.5.
The empty set acts as an identity element for the union operation (in the
same way 0 is the identity for addition and 1 is the identity for multiplication.)
This idea is captured by the following theorem: ∀A. A ∪ ∅ = A.
(Ax)
x ∈ ∅ ` x ∈ ∅, x ∈ A
(¬-L)
¬(x ∈ ∅), x ∈ ∅ ` x ∈ A
(def of 6∈)
x 6∈ ∅, x ∈ ∅ ` x ∈ A
(∀-L)
∀x.x 6∈ ∅, x ∈ ∅ ` x ∈ A (Ax)
(Ax) (Assert)
x∈A`x∈A x∈∅`x∈A x ∈ A ` x ∈ A, x ∈ ∅
(∨-L) (∨-R)
x∈A∨x∈∅`x∈A x ∈ A ` x ∈ A ∨ x ∈ ∅
(∈ ∪ def) (∈ ∪ def)
x ∈ (A ∪ ∅) ` x ∈ A x ∈ A ` x ∈ (A ∪ ∅)
(⇒-R) (⇒-R)
` x ∈ (A ∪ ∅) ⇒ x ∈ A ` x ∈ A ⇒ x ∈ (A ∪ ∅)
(∧-R)
` (x ∈ (A ∪ ∅) ⇒ x ∈ A) ∧ (x ∈ A ⇒ x ∈ (A ∪ ∅))
(def of ⇔)
` x ∈ (A ∪ ∅) ⇔ x ∈ A
(∀R)
` ∀x.x ∈ (A ∪ ∅) ⇔ x ∈ A
(def of =)
`A∪∅=A
(∀R)
` ∀A. A ∪ ∅ = A
The following theorem asserts that the order of arguments to a union operation
does not matter.
Theorem 5.7 (union commutes) For sets A and B, A ∪ B = B ∪ A.
Proof: Choose arbitrary sets A and B. By extensionality, A∪B = B ∪A is true
if ∀x.x ∈ (A ∪ B) ⇔ x ∈ (B ∪ A). Choose an arbitrary x, assume x ∈ (A ∪ B).
Now, by the definition of membership in a union, x ∈ (A ∪ B) iff x ∈ A ∨ x ∈ B.
By the commutativity of disjunction, (x ∈ A ∨ x ∈ B) ⇔ (x ∈ B ∨ x ∈ A) and,
again by the union membership property, (x ∈ B ∨ x ∈ A) ⇔ x ∈ (B ∪ A).
By this theorem, A ∪ ∅ = ∅ ∪ A which, together with Thm 5.6 yields the
following corollary.
Membership in an intersection is characterized by:
x ∈ (A ∩ B) =def (x ∈ A ∧ x ∈ B)
To show A ∩ ∅ = ∅, by extensionality it is enough to show both:
i.) x ∈ (A ∩ ∅) ⇒ x ∈ ∅
ii.) x ∈ ∅ ⇒ x ∈ (A ∩ ∅)
Axiom 5.5 (power set) The axiom characterizing membership in the power-
set says:
x ∈ ρ(S) =def x ⊆ S
A0 = {}         ρA0 = {{}}
A1 = {1}        ρA1 = {{}, {1}}
A2 = {1, 2}     ρA2 = {{}, {1}, {2}, {1, 2}}
A3 = {1, 2, 3}  ρA3 = {{}, {1}, {2}, {3}, {1, 2}, {1, 3}, {2, 3}, {1, 2, 3}}
Notice that the size of the power set is growing exponentially (as powers of
2: 2⁰ = 1, 2¹ = 2, 2² = 4, 2³ = 8).
Fact 5.1. If a set A has n elements, then the power set ρA has 2ⁿ elements.
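For finite sets represented as lists, the power set can be computed recursively; a sketch (our names, not the text's ρ):

  -- The power set of a finite set as a list of lists.
  power :: [a] -> [[a]]
  power []     = [[]]
  power (x:xs) = let p = power xs in p ++ map (x:) p

  -- power [1,2] == [[],[2],[1],[1,2]];  length (power [1,2,3]) == 8

Each element either joins a subset or it does not, which is exactly why the count doubles with each new element.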
5.3.7 Comprehension
If we are given a set and a predicate φ(x) (a property of elements of the set)
we can create the set consisting of those elements that satisfy the property. We
write the set created by the operation by instantiating the following schema.
{x ∈ S | φ(x)}
y ∈ {x ∈ S | φ(x)} =def (y ∈ S ∧ φ(y))
Example 5.1. The set of natural numbers greater than 5 can be defined using
comprehension as:
{n ∈ N | ∃m. m ∈ N ∧ m + 6 = n}
Similarly, the set of even natural numbers can be defined as:
{n ∈ N | ∃m. m ∈ N ∧ 2m = n}
Here, the set S from the schema is the set of natural numbers N and the
predicate φ is:
φ(n) = ∃m. m ∈ N ∧ 2m = n
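Comprehension has a direct computational analogue in list comprehensions; a sketch for finite sets as lists (the representation and names are ours):

  -- {x ∈ S | φ(x)} corresponds directly to filtering.
  comprehension :: [a] -> (a -> Bool) -> [a]
  comprehension s phi = [ x | x <- s, phi x ]

  -- The even numbers in {0, ..., 19}:
  evens :: [Int]
  evens = comprehension [0 .. 19] (\n -> n `mod` 2 == 0)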
Substitution extends to comprehensions in the same capture-avoiding way as
for quantifiers (the comprehension binds its variable):
{x : S | φ}[x := t] = {x : S | φ}
{y : S | φ}[x := t] = {y : S[x := t] | φ[x := t]}
if (x ≠ y, y ∉ F V (t))
{y : S | φ}[x := t] = {z : S[y := z][x := t] | φ[y := z][x := t]}
if (x ≠ y, y ∈ F V (t), z ∉ (F V (S) ∪ F V (t) ∪ F V (φ) ∪ {x}))
Example 5.3. If Even is the set of even natural numbers, N − Even = Odd.
Definition 5.12 (disjoint sets) Two sets A and B are disjoint if they share
no members in common, i.e. if the following holds:
A∩B =∅
Theorem 5.14. For all sets A and B, A and B are disjoint sets iff A − B = A.
Note that, by Def 5.5. z = ha, bi means z is a set of the form {{a}, {a, b}}.
Evidently, the Cartesian product of two sets is a set of pairs.
The following identities hold:
i.) A × ∅ = ∅
ii.) ∅ × A = ∅
Every element of a Cartesian product is a pair:
∀x : A × B. ∃y : A. ∃z : B. x = hy, zi
{y ∈ A × B|P [y]}
In practice, we often need to refer to the parts of the pair y to express the
property P . If so, to be formally correct, we should write:
{y : A × B | ∀z : A. ∀w : B. y = hz, wi ⇒ P (z, w)}
So,
x ∈ {y : A × B | ∀z : A. ∀w : B. y = hz, wi ⇒ P (z, w)}
⇔ x ∈ A × B ∧ ∀z : A. ∀w : B. x = hz, wi ⇒ P (z, w)
By lemma 5.6, we know there exist z ∈ A and w ∈ B such that x = hz, wi. So, to
prove membership of x it is enough to show that x ∈ A × B and then assume
there are z ∈ A and w ∈ B such that x = hz, wi and show P [z, w]. A more
readable syntactic form allows the “destructuring” of the pair to occur on the
left side in the place of the variable.
Show z ∈ A × B and then, assume z = hx, yi (for new variables x and y) and
show P [x, y].
5.4.1 Idempotency
Definition 5.14 (Idempotence) Given a binary operation ?, its idempotent
elements are those elements x for which x ? x = x.
Example 5.5. For the operation of ordinary multiplication, 0 and 1 are (the
only) idempotent elements. For the operation of addition, 0 (but not 1) is an
idempotent element.
The following lemma shows that every set is an idempotent element for
intersections and unions.
5.4.2 Monotonicity
Monotonicity is a property of unary operators.
Example 5.6. For arbitrary sets A and B, the powerset operation is monotone,
i.e.
A ⊆ B ⇒ ρ(A) ⊆ ρ(B)
5.4.3 Commutativity
Definition 5.16 (Commutative) A binary set operator (say ◦) is commuta-
tive if for all sets A, B
(A ◦ B) = (B ◦ A)
For example, intersection and union are both commutative:
A∩B =B∩A
A∪B =B∪A
5.4.4 Associativity
Definition 5.17 (Associative) A binary set operator (say ◦) is associative if
for all sets A, B, and C,
A ◦ (B ◦ C) = (A ◦ B) ◦ C
A ∩ (B ∩ C) = (A ∩ B) ∩ C
A ∪ (B ∪ C) = (A ∪ B) ∪ C
5.4.5 Distributivity
The distributive property relates pairs of operators.
Definition 5.18 (Distributive) For binary set operators (say ◦ and ⋆), we
say ◦ distributes over ⋆ if for all sets A, B, and C, A ◦ (B ⋆ C) = (A ◦ B) ⋆ (A ◦ C).
Intersection distributes over union, and union distributes over intersection:
∀A, B, C. A ∩ (B ∪ C) = (A ∩ B) ∪ (A ∩ C)
∀A, B, C. A ∪ (B ∩ C) = (A ∪ B) ∩ (A ∪ C)
Chapter 6
Relations
Alfred Tarski (1902–1983) was born in Poland and came to the US at the
outbreak of WWII. Tarski was the youngest person ever to earn a Ph.D. from
the University of Warsaw, and throughout his career he made many contributions
to logic and mathematics, though he may be best known for his work in
semantics and model theory. Tarski and his students developed the theory of
relations as we know it. See [14] for a complete and rather personal biography
of Tarski's life and work.
6.1 Introduction
Relations establish a correspondence between the elements of sets thereby im-
posing structure on the elements. In keeping with the principle that all of math-
ematics can be described using set theory, relations (and functions) themselves
can be characterized as sets (having a certain kind of structure).
For example, familial relations can be characterized mathematically using
the relational structures and/or functions. Thus, if the set P is the set of
all people living and dead, the relationship between a (biological) father and
his offspring could be represented by a set of pairs F of the form hx, yi to be
interpreted as meaning that x is the father of y if hx, yi ∈ F . We will write xF y
to denote the fact that x is the father of y instead of hx, yi ∈ F . Using this
notation, the paternal grandfather relation can be characterized by the set
{hx, yi ∈ P × P|∃z. xF z ∧ zF y}
6.2 Relations
6.2.1 Binary Relations
Definition 6.1 (Binary Relation) A binary relation between sets A and B
is a subset of their Cartesian product, i.e. R is a binary relation from A to B
iff R ⊆ A × B.
Example 6.2. The empty set is a relation. Recall that the empty set is a subset
of every set and so it is a subset of every Cartesian product (even the empty
one). Again, this is not a terribly interesting relation but, by the definition, it
clearly is one.
For example, the less-than relation on the integers can itself be defined as a
set of pairs:
< = {hx, yi ∈ Z × Z | x ≠ y ∧ ∃k : N. y = x + k}
To aid readability, binary relations are often written in infix notation e.g. hx, yi ∈
R will be written xRy. So, for example, an instance of the less-than relation
will be written 3 < 100 instead of the more pedantic h3, 100i ∈ <.
∆A = {hx, yi ∈ A × A|x = y}
This relation is called the “diagonal” in analogy with the matrix presentation
of a relation R, where the entry for hx, yi is labeled with a 0 if hx, yi ∉ R and
with a 1 if hx, yi ∈ R.
Example 6.4. Suppose A = {0, 1, 2} and R = {h0, 0i, h1, 1i, h1, 2i, h2, 1i} then
the matrix representation appears as follows:
R 0 1 2
0 1 0 0
1 0 1 1
2 0 1 0
Notice that the matrix for the diagonal relation ∆A consists of ones on the
diagonal and zeros elsewhere¹. One problem with the matrix representation is
that you must order the elements of the underlying set to be able to write them
down across the top and along the side of the matrix; this choice may have to
be arbitrary if there is no natural order for the elements.
6.3.1 Inverse
Definition 6.6 (inverse) If R ⊆ A × B, then the inverse of R is
R−1 = {hy, xi ∈ B × A | hx, yi ∈ R}
Thus, to construct the inverse of a relation, just turn every pair around i.e.
swap the elements in the pairs of R by making the first element of each pair
the second and the second element of each pair the first. The following useful
lemma says that swapping the order of the elements from a pair in a relation R
puts the new pair into the relation R−1 .
Remark 6.1. In linear algebra, the inverse, as defined here, is called the trans-
pose.
Example 6.5. The inverse of the less-than relation (<) is greater-than (>).
The complement of a relation R ⊆ A × B (written R̄) contains exactly the
pairs not in R:
aR̄b ⇔ ¬(aRb)
Example 6.6. The complement of the less-than relation (<) is the greater-
than-or-equal-to relation (≥).
Definition (composition) For relations R ⊆ A × B and S ⊆ B × C, the
composition S ◦ R is defined:
S ◦ R =def {hx, yi ∈ A × C | ∃z : B. xRz ∧ zSy}
Example 6.7. Suppose we had a relation (say R) that paired names with social
security numbers and another relation that paired social security numbers with
the state they were issued in (call this relation S), then (S ◦ R) is the relation
pairing names with the states where their social security numbers were assigned.
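For finite relations represented as lists of pairs, composition is a direct transcription of the definition; a sketch (the representation and names are ours):

  -- S ◦ R: pairs (x, y) such that some z links x to y.
  compose :: Eq b => [(b, c)] -> [(a, b)] -> [(a, c)]
  compose s r = [ (x, y) | (x, z) <- r, (z', y) <- s, z == z' ]

  -- With R pairing names to SSNs and S pairing SSNs to states:
  -- compose [(123, "WY")] [("alice", 123)] == [("alice", "WY")]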
Composition is associative:
T ◦ (S ◦ R) = (T ◦ S) ◦ R
Then, by the definition of composition and (ii.), we obtain two more facts which
hold for some arbitrary element w ∈ B.
iv.) hx, wi ∈ R
v.) hw, zi ∈ S
From (v.) and (iii.) and the definition of composition we obtain the following.
vi.) hw, yi ∈ (T ◦ S)
But then (iv.) and (vi.) together mean hx, yi ∈ ((T ◦ S) ◦ R) which completes
the proof.
The inverse of a composition is the composition of the inverses in reverse order:
(S ◦ R)−1 = R−1 ◦ S −1
One might conjecture that composing a relation with its inverse yields the
diagonal relation:
R−1 ◦ R = ∆A
The following exercises show that this conjecture is false in general.
Exercise 6.4. For A = {a, b, c} and R = {ha, bi, ha, ci, hb, ai, hc, bi} show that
the conjecture is false.
Exercise 6.5. For A = {a, b} and R = {ha, bi} show that the conjecture is
false.
We will see in chapter 8 that the conjecture is true when we consider func-
tions which are a restricted form of relations.
Exercise 6.6. Let R be the successor relation as defined in example 6.8. Prove
that
(≤ ◦R) =<
For each k ∈ N, Rk is the relation that takes you directly to the places reach-
able in R by following k steps. The following two definitions collect together
the Rk ’s where k ranges over N+ (the strictly positive natural numbers²) and N.
So, R+ contains all the pairs hx, yi ∈ A × A such that y is reachable from
x by following one or more edges of R. Similarly, R∗ contains all the pairs
hx, yi ∈ A × A such that y is reachable from x by following zero or more edges
of R. As we will see below; R+ is the so-called transitive closure of R (Thm 6.6)
and R∗ is the reflexive transitive closure of R (Thm 6.7).
6.4.1 Reflexivity
Definition 6.12.
ReflA (R) =def ∀a : A. aRa
6.4.2 Irreflexivity
Definition 6.13.
IrrefA (R) =def ∀a : A. ¬(aRa)
Remark 6.3. Note that a relation can fail to be both reflexive and irreflexive.
Let A = {0, 1, 2} and R = {h0, 1i, h1, 1i, h1, 2i, h2, 0i} Then, R is not reflexive
because h0, 0i 6∈ R. But it also fails to be irreflexive since h1, 1i ∈ R.
6.4.3 Symmetry
Definition 6.14.
Sym(R) =def ∀a, b : A. aRb ⇒ bRa
6.4.4 Antisymmetry
Definition 6.15.
AntiSym(R) =def ∀a, b : A. (aRb ∧ bRa) ⇒ a = b
Antisymmetry means that if you can get from one point to another and back
in one step, then those points must have been equal.
6.4.5 Asymmetry
Definition 6.16.
Asym(R) =def ∀a, b : A. aRb ⇒ ¬(bRa)
Asymmetry means that there is no way to get from any point back to itself
in two steps.
6.4.6 Transitivity
Trans(R) =def ∀a, b, c : A. (aRb ∧ bRc) ⇒ (aRc)
A relation is transitive if every place you can get in two steps, you can get
by taking a single step.
6.4.7 Connectedness
Connected(R) =def ∀a, b : A. a ≠ b ⇒ (aRb ∨ bRa)
A relation is connected if there is an edge between every pair of distinct points
(going one direction or the other).
6.5 Closures
The idea of “closing” a relation with respect to a certain property is the idea of
adding just enough to the relation to make it satisfy the property (if it doesn’t
already) and to get the “smallest” such extension.
Thus, the closure is the minimal extension to make a relation satisfy a prop-
erty. For some properties (like irreflexivity) there may be no way to add to the
relation to make it satisfy the property, in which case we say the closure “does
not exist”. For example, to make the relation R of Remark 6.3 satisfy
irreflexivity we would have to remove h1, 1i, and removing pairs is not allowed
in forming a closure.
Definition (closure) For R ⊆ A × B and a property P of relations, S is a
closure of R with respect to P , written S ∈ closure(R, P ), iff
(P (S) ∧ R ⊆ S) ∧ ∀T. T ⊆ A × B ⇒ ((P (T ) ∧ R ⊆ T ) ⇒ S ⊆ T )
Closures, when they exist, are unique. To see this, suppose R1 and R2 are
both closures of R with respect to P , i.e. R1 ∈ closure(R, P ) and
R2 ∈ closure(R, P ). Unfolding the definition gives the following six facts:
i.) R ⊆ R1
ii.) P (R1 )
iii.) ∀T ⊆ A × A. (R ⊆ T ∧ P (T )) ⇒ R1 ⊆ T
iv.) R ⊆ R2
v.) P (R2 )
vi.) ∀T ⊆ A × A. (R ⊆ T ∧ P (T )) ⇒ R2 ⊆ T
Instantiating (iii.) with T := R2 (justified by (iv.) and (v.)) gives R1 ⊆ R2 ,
and instantiating (vi.) with T := R1 (justified by (i.) and (ii.)) gives R2 ⊆ R1 ;
by subset extensionality, R1 = R2 .
For example, the reflexive closure of a relation R ⊆ A × A is R ∪ ∆A .
ii.) We must show that R ⊆ (R ∪ ∆A ). But this is true by Thm 5.8 from
Chapter 5.
iii.) Finally, we must show that R ∪ ∆A is the least such set, i.e. that
∀T : T ⊆ A × A ⇒ ((R ⊆ T ∧ Ref lA (T )) ⇒ (R ∪ ∆A ) ⊆ T )
To see this, choose an arbitrary relation T ⊆ A × A. Assume R ⊆ T and
Ref lA (T ). We must show that (R ∪ ∆A ) ⊆ T . Let x be an arbitrary element
of (R ∪ ∆A ). Then, there are two cases: x ∈ R or x ∈ ∆A . If x ∈ R, since we
have assumed R ⊆ T , we know x ∈ T . In the other case, x ∈ ∆A , that is, x is
of the form hy, yi for some y in A. But since we assumed Ref lA (T ), we know
that ∀z : A.hz, zi ∈ T so, in particular, hy, yi ∈ T , i.e. x ∈ T .
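For finite relations, both the reflexive and the transitive closure can be computed; a sketch (our names and representation, assuming the input list contains no duplicate pairs):

  -- Closures for finite relations on a finite set, as pair-lists.
  import Data.List (nub, union)

  reflexiveClosure :: Eq a => [a] -> [(a, a)] -> [(a, a)]
  reflexiveClosure dom r = r `union` [ (x, x) | x <- dom ]   -- R ∪ ∆A

  -- Repeatedly add one-step extensions until nothing new appears;
  -- on a finite relation this computes R+.
  transitiveClosure :: Eq a => [(a, a)] -> [(a, a)]
  transitiveClosure r
    | r' == r   = r
    | otherwise = transitiveClosure r'
    where r' = nub (r ++ [ (x, z) | (x, y) <- r, (y', z) <- r, y == y' ])

  -- transitiveClosure [(0,1),(1,2)] contains (0,2).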
Note that unlike reflexivity, symmetry does not require us to know what the
full set A is, it only requires us to know what pairs are in the relation R.
Example 6.10. For any set A, the empty relation is symmetric, though the
empty relation is reflexive if and only if A = ∅.
Example 6.11. Let A = {0, 1, 2, 3} and R = {h0, 1i, h1, 2i, h2, 3i, h3, 0i}. Then
the transitive closure of R is
R ∪ R2 ∪ R3 ∪ R4
Definition 6.19 (Involution) A unary operator ′ : A → A is an involution
if it is its own inverse, i.e. if x′′ = x for all x ∈ A.
Lemma 6.13 (complement involutive) For every pair of sets A and B and
every relation R ⊆ A × B, the complement of the complement of R is R itself:
R̄̄ = R
Proof: By extensionality. Choose arbitrary a ∈ A and b ∈ B and show
aR̄̄b ⇔ aRb. We reason equationally.
Theorem 6.8 (Inverse involutive) For every pair of sets A and B, and for
every R ⊆ A × B
R = (R−1 )−1
Proof: Note that since R ⊆ A × B is a set, we must show (by extensionality)
that for arbitrary z, that z ∈ R ⇔ z ∈ (R−1 )−1 . Since R ⊆ A × B, z is of
the form ha, bi for some a ∈ A and some b ∈ B, thus, we must show aRb ⇔
a(R−1 )−1 b. Two applications of Lemma 6.1 give the following.
Chapter 7
Lemma 7.1. For any set A, the diagonal relation ∆A is an equivalence relation
on A.
This is the so-called “finest” equivalence (see Definition 7.3 below) on any
set A and is defined by real equality on the elements of the set A. To see this
recall that ∆A = {hx, yi | x = y}
Lemma 7.2. For any set A, the complete relation A × A is an equivalence
relation on A.
This equivalence is rather uninteresting, it says every element in A is equiv-
alent to every other element in A. It is the “coarsest” equivalence relation on
the set A.
[(iii) ⇒ (i)] Assume [x]R = [y]R . Then, every element of [x]R is in [y]R and
vice-versa. But, because R is reflexive, x ∈ [x]R and since y ∈ [y]R , y ∈ [x]R .
But this is true only if xRy holds.
We say ≡1 is finer than ≡2 (and ≡2 is coarser than ≡1 ) when every
≡1 -equivalence class is contained in the corresponding ≡2 -class:
∀x : A. [x]≡1 ⊆ [x]≡2
7.1.3 Q is a Quotient
Consider the fractions F defined as follows.
Definition 7.5.
F = Z × Z≠0
where Z≠0 is the set of non-zero integers.
You may recognize F as the fractions, e.g. we can think of the first number
as the numerator and the second as the denominator, so ha, bi is the fraction a/b.
Now, note that the equality on fractions (i.e. the equality on pairs – ha, bi =
hc, di ⇔ a = c ∧ b = d) is not the equality for rational numbers (usually denoted
Q.) Notice that, for example,
h1, 2i ≠ h2, 4i
even though, as rational numbers, 1/2 = 2/4.
Definition 7.6.
≡Q =def {hhx, yi, hz, wii ∈ F × F | xw = yz}
This is the ordinary cross-multiplication rule you learned in grade school for
determining if two rational numbers are equal.
7.1.4 Partitions
Definition 7.8 (Partition) A partition of a set A is a set of non-empty subsets
of A (we refer to these sets as Ai where i ∈ I, I ⊆ N). Each Ai is called a block
(or a component) and a collection of such Ai is a partition if it satisfies the
following two properties:
i.) distinct blocks are disjoint:
∀i, j : I. i ≠ j ⇒ (Ai ∩ Aj = ∅)
and,
ii.) the union of the sets Ai , i ∈ I is the set A itself:
⋃i∈I Ai = A
Example 7.5. If A = {1, 2, 3} then the following are all the partitions of A.
{{1, 2, 3}}
{{1}, {2, 3}}
{{1, 2}, {3}}
{{1, 3}, {2}}
{{1}, {2}, {3}}
Counting Partitions
Definition 7.9 (k-partition) A k-partition of a set A is a partition of A into
k subsets.
So for example, {{1, 2, 3}} is a 1-partition of {1, 2, 3}, {{1}, {2, 3}}, {{1, 2}, {3}},
and {{1, 3}, {2}} are all 2-partitions while {{1}, {2}, {3}} is a 3-partition.
Since ad = bc we have a = bc/d, and since eh = f g, we can compute:
aedh = (bc/d)edh = bceh = bcf g = bf cg
The significance of the lemma is that the operation ∗F respects the equivalence
≡Q , i.e. even though it is defined as an operation on fractions, substitution
of ≡Q -equivalent arguments yields ≡Q -equivalent results.
Lemma 7.6. The relation ≡Q is compatible with the operation +F i.e. ≡Q is
congruent with respect to +F .
Exercise 7.6. Prove Lemma 7.6
Cartesian Product
Definition 7.18 (Ordered product) If hP1 , ⊑1 i and hP2 , ⊑2 i are posets, then
hP1 × P2 , ⊑i is a poset, where
hx, yi ⊑ hz, wi =def x ⊑1 z ∧ y ⊑2 w
Lexicographic Product
Definition 7.19. If hA, ⊑1 i and hB, ⊑2 i are posets, then hA × B, ⊑i is a poset
where
hx, yi ⊑ hz, wi =def x ⊏1 z ∨ (x = z ∧ y ⊑2 w)
Chapter 8
Functions
Some relations have the special property that they are functions. A relation
R ⊆ A × B is a function if each element of the domain A gets mapped to one
element and only one element of the codomain B.
8.1 Functions
Definition 8.1 (function) A function from A to B is a relation (f ⊆ A × B)
satisfying the following two properties:
i.) totality: ∀x : A. ∃y : B. hx, yi ∈ f
ii.) functionality: ∀x : A. ∀y, z : B. (hx, yi ∈ f ∧ hx, zi ∈ f ) ⇒ y = z
Relations having the first property are said to be total and relations satisfying
the second property are said to be functional or to satisfy the functionality
property.
Remark 8.1. Since we usually write f (x) = y for functions instead of hx, yi ∈
f , we can restate these properties in the more familiar notation as follows.
There is some danger in using the notation f (x) = y if we do not know that f
is a function.
The set rng(f ) = {y : B | ∃x : A. f (x) = y} is called the range of f . We write
dom(f ) to denote the set which is the domain of f , codom(f ) to denote the
codomain, and simply rng(f ) to denote its range if A and B are clear from the
context.
∀f : A → ∅. A = ∅
Proof: If f : A → ∅ then totality requires
∀x : A. ∃y : ∅. hx, yi ∈ f
This is vacuously true if A = ∅ and is false otherwise, thus it must be the case
that A = ∅.
Corollary 8.1. ∀f : ∅ → ∅. f = ∅
f = g =def ∀x : A. f (x) = g(x)
Thus, functions are equal if they are the same set of pairs. Since they are, by
definition functional, this amounts to checking that they agree on every input.
If f : A → B and g : B → C, pictured
A --f--> B --g--> C
then the composition is a function from A to C:
(g ◦ f ) ∈ A → C
Remark 8.2. Having proved this theorem we say, functions are closed under
composition . That is, we preserve the property of being a function when we
compose two functions.
By “closed with respect to”, we mean that applying the operation preserves
the property of having a particular structure. In the case of functions and the
composition operation, the property we are considering is whether a relation
is a function and the question “When is the composition of functions also a
function?” is answered by the previous lemma, Always.
Thus, function composition is a binary operator on pairs of functions anal-
ogous to the way addition is a binary operation on pairs of integers. The
analogy goes deeper. Addition is associative e.g. if a, b and c are numbers,
a + (b + c) = (a + b) + c. Function composition is associative as well.
Remark 8.3. Note that because the composition of relations is associative (see
Thm. 6.6.1.) the associativity of function composition is obtained for free. We
have included a direct proof of the associativity for function composition here
to illustrate the difference in the proofs. This is a case where the proof for the
special case (function composition) is easier than the more general case (relation
composition.)
h ◦ (g ◦ f ) = (h ◦ g) ◦ f
Proof: To show two functions are equal, we must apply extensionality. To show
h ◦ (g ◦ f ) = (h ◦ g) ◦ f
Zero (0) is both a left and right identity for addition i.e. 0 + a = a and
a + 0 = a. Similarly, the identity function Id(x) = x is a left and right identity
for the operation of function composition.
Remark 8.4. We will sometimes write IdA for ∆A when we are thinking of
the relation as the identity function on A.
Theorem 8.2 (Left Right identity lemma) For any sets A and B, and any
function f : A → B, IdA is a right identity and IdB is a left identity for
composition with f .
Thus (f ◦ IdA ) = f and (IdB ◦ f ) = f and the theorem has been shown.
8.3.3 Inverse
Given a relation R ⊆ A × A, recall Definition 6.6.6 of the inverse relation
Functions that rule out the behavior described in Example 8.1 are said to
be one-to-one or are called injections. Functions that rule out the behavior
described in Example 8.2 are the onto functions which are also called surjections.
8.4.1 Injections
Definition 8.6 (injection, one-to-one) A function f : A → B is an injection
(or one-to-one) if every element of B is mapped to by at most one element of
A. Formally, we write:
∀x, y : A. f (x) = f (y) ⇒ x = y
The definition says that if f maps x and y to the same element, then it must
be that x and y are one and the same, i.e. that x = y. Injections are also called
one-to-one functions.
Proof:
We choose arbitrary x and y from the set A and assume (g ◦ f )(x) = (g ◦ f )(y)
to show x = y. Now, by definition of function composition (g ◦ f )(x) = g(f (x))
and (g ◦ f )(y) = g(f (y)). Using f (x) for x and f (y) for y in (ii.) we get that
f (x) = f (y), but then, by the fact that f is an injection (i.) we know x = y.
8.4.2 Surjections
Definition 8.7 (surjection, onto) A function f : A → B is a surjection (or
onto) if every element of B is mapped to by some element of A under f .
∀y : B. ∃x : A. f (x) = y
Proof:
∀y : C.∃x : A. (g ◦ f )(x) = y
8.4.3 Bijections
Definition 8.8 (bijection) A function f : A → B is a bijection if it is both
an injection and a surjection. Bijective functions are also said to be one-to-one
and onto.
Now, going back to the question of when the inverse of a function is a func-
tion, we state the following theorem which perfectly characterizes the situation.
Remark 8.5. Note that we used the fact that f was surjective to prove that
f −1 was total and we used the fact that f was injective to prove that f −1
was functional. Looking at the formulas for these properties above, we see the
similarity of their forms – so it makes perfect sense that we can use them in this
way.
we use the relational notation to avoid confusion e.g. the notation f −1 (x) = y suggests that
there is a unique y such that hx, yi ∈ f −1 ; however, until we have shown f −1 is a function we
do not know this to be true.
Exercise 8.6. Prove that ∆A is a function and is bijective. We call this the
identity function.
The proof of following theorem shows how to lift bijections between pairs of
sets to their Cartesian product.
A ∼ A′ ∧ B ∼ B ′ ⇒ (A × B) ∼ (A′ × B ′ )
f injective: Now, to see that f is an injection, we must show that for arbitrary
pairs ha, bi, hc, di ∈ A × B:
f (ha, bi) = f (hc, di) ⇒ ha, bi = hc, di
Assume f (ha, bi) = f (hc, di) and show ha, bi = hc, di. But by the definition of f
we have assumed hg(a), h(b)i = hg(c), h(d)i. Thus, by equality on ordered pairs,
we know g(a) = g(c) and h(b) = h(d). Now, since g and h are both injections
we know a = c and b = d and so we have shown that ha, bi = hc, di.
f surjective: Choose an arbitrary pair hc, di ∈ A′ × B ′ . Then we claim the pair
hg −1 (c), h−1 (d)i is the witness for the existential. To see that it is, we must
show that f (hg −1 (c), h−1 (d)i) = hc, di. Here is the argument.
8.5 Exercises
1. Write down the formal definitions of injection, surjection and bijection
using the notation hx, yi ∈ f instead of the abbreviated form f (x) = y.
Note that you will need to include a new variable (say z) to account for
f (x) = f (y) in this more primitive notation.
Chapter 9
9.1 Cardinality
The term cardinality refers to the relative “size” of a set.
Definition 9.1 (equal cardinality) Two sets A and B have the same cardi-
nality iff there exists a bijection f : A → B. In this case we write |A| = |B| or
A ∼ B.
Although the usage is less common, sometimes sets of equal cardinality are
said to be equipollent or equipotent.
This is easy if you study the theorems relating bijections, their inverses, and
compositions.
Next, we use the theorem to show a (perhaps) rather surprising result, that
N has the same cardinality as the set of Even numbers, even though half the
numbers are not there in Even.
Proof: To show these sets have equal cardinality we must find a bijection
between them. Let f (n) = 2n, we claim f : N → Even is a bijection. To see
this, we must show it is both: (i.) one-to-one and (ii.) onto.
(i.) f is one-to-one: assume f (x) = f (y) for arbitrary x, y ∈ N; then 2x = 2y,
and so x = y.
(ii.) f is onto, i.e. we must show:
∀x : Even. ∃y : N. x = f (y)
|A| < |B| =def |A| ≤ |B| ∧ |A| ≠ |B|
Theorem 9.2 (Schröder–Bernstein) For all sets A and B, if |A| ≤ |B| and
|B| ≤ |A| then |A| = |B|.
Richard Dedekind (1831–1916) was a German mathematician who made
numerous contributions in establishing the foundations of arithmetic and
number theory.
The following definition of infinite is sometimes called Dedekind infinite after
the mathematician Richard Dedekind (1831-1916) who first formulated it. This
characterization of infinite sets may be somewhat surprising because it does not
mention natural numbers or the notion of finiteness.
Definition 9.6 (Dedekind infinite) A set A is infinite iff there exists a func-
tion f : A → A that is one-to-one but not onto.
Lemma 9.1. N is infinite.
Proof: Consider the function f (n) = n + 1. Clearly, f is one-to-one since if
f (x) = f (y) for arbitrary x and y in N, then x + 1 = y + 1 and so x = y.
However, f is not onto since there is no element of N that is mapped to 0 by f .
Theorem 9.3. If a set A is infinite then, for any set B, if A ∼ B, then B is
infinite.
Proof: If A is infinite, then we know there is a function f : A → A that is
one-to-one but not onto. Also, since A ∼ B, there is a bijection g : A → B. To
show that B is infinite, we must show that there is a function h : B → B that
is one-to-one but not onto. We claim h = g ◦ f ◦ g −1 is such a function.
Now, to show that h is an injection (one-to-one) we recall (Theorem 8.8.3) that
the composition of injections is an injection. We assumed that f and g were
both injections and to see that g −1 is an injection as well, we cite Lemma 8.??
that says that if g is a bijection then g −1 is a bijection as well. Since bijections
are also injections, we have shown that h is an injection.
To show that h is not a surjection it is enough to show that
i.) ∃y : B. ∀x : B. h(x) ≠ y
Proof:
(⇒) Assume |A| = 0; then there is a bijection c : A → {0..0}. But by definition
{0..0} = ∅ since there are no natural numbers between 0 and −1. This means
c : A → ∅ which, by Thm. 8.8.2, gives A = ∅.
(⇐) Assume A = ∅ and show |A| = 0. By definition, we must show a bijection
from A → {0..0} i.e. a bijection c : ∅ → ∅. By Lemma 8.8.1 c = ∅ is a function
in ∅ → ∅ and it is vacuously a bijection.
Now, the bijection f witnessing the fact that A is finite assigns to each
element of A a number between 0 and k. Also, the inverse f −1 maps numbers
between 0 and k to unique elements of A i.e. since f −1 is itself a bijection, and
so is one-to-one, we know that no two distinct i, j ∈ {0..k − 1} map to the
same element of A. We could enumerate (list) the elements of the set by the
following bit of pseudo-code.
for i ∈ {0..k − 1} do
Print (f −1 (i))
To make sure that there is nothing “between” the finite and infinite sets
(i.e. that there is no set that is neither finite nor infinite) we would expect the
following theorem to hold: every set is either finite or infinite.
Interestingly, it can be shown that proofs of this theorem require the use of
the axiom of choice – whose use is beyond the scope of these notes.
9.3.1 Permutations
Definition 9.9. A bijection from a finite set A to itself is called a permutation.
Theorem 9.5 (Cantor’s Theorem) For every set A, |A| < |ρ(A)|.
1 Georg Cantor (1845–1918) was a German mathematician who developed set theory and
established the importance of the ideas of injection, and bijection for counting.
Proof: Let A be an arbitrary set. To show |A| < |ρ(A)| we must show that (i.)
there is an injection from A to ρ(A) and (ii.) there is no bijection from A to ρ(A).
(i.) Let f (x) = {x}. We claim that this is an injection from A to ρ(A) (the set
of all subsets of A). Clearly, for each x ∈ A, f (x) ∈ ρ(A). To see that f is an
injection we must show:
Choose arbitrary x and y from A. Assume f (x) = f (y), i.e. that {x} = {y},
we must show x = y. But, by Theorem 5.5.3, {x} = {y} ⇔ x = y, thus f is an
injection as was claimed.
(ii.) To see that there is no bijection, we assume f : A → ρ(A) is an arbitrary
function and show that it can not be onto.
Now, if f is onto then every subset of A must be mapped to by some
element of A. Consider the set
B = {y : A | y 6∈ f (y)}
Lemma 9.3. For all sets A, A is countable if and only if there exists a surjection
f : N → A.
The proof in the (⇒) direction is trivial, following almost directly from the
definition of countable. The proof in the (⇐) direction assumes the existence of
the surjection and requires us to show it is a bijection, or to use it to construct
a mapping from A onto an initial segment of N.
The following theorems may be surprising.
9.6 Counting
We saw with the notion of cardinality that it is possible to compare the sizes
of sets without actually counting them. By counting, we mean the process
of sequentially assigning a number to each element of the set, i.e. creating a
bijection between the elements of the set and some set {0..k}. This is precisely the
purpose of the bijection that witnesses the finiteness of a set A – it counts the
elements of the set A. Thus, counting is “finding a function of a certain kind.”2
The following lemma shows that counting is unique, i.e. that it does not
matter how you count a finite set, it always has the same cardinality.
Theorem 9.8.
A corollary is
Corollary 9.2.
∀i, j : N. {0..i} ∼ {0..j} ⇔ i = j
2 See Stuart Allen’s formalization [2] of discrete math materials, the proofs here are the
But this is merely saying that at least two elements from the domain must
get mapped to the same element (pigeon hole) of the codomain. A more abstract
statement is that if i < j then there is no injection from {0..j} → {0..i} – i.e. if
for every function there are always at least a pair of elements from the domain
mapped to some single element in the codomain, then there certainly can be no
injection.
Introduction
sets, there is, of course, a purely set theoretic form of definition for inductive structures.
142
Natural Numbers
143
144 CHAPTER 10. NATURAL NUMBERS
Leopold Kronecker
The German mathematician Leopold Kronecker famously remarked:
God made the natural numbers; all else is the work of man.
Kronecker was saying the natural numbers are absolutely primitive and that the other mathematical structures have been constructed by men. Similarly, the philosopher Immanuel Kant (1724–1804) and the mathematician Luitzen Egbertus Jan Brouwer (1881–1966) both believed that understanding of natural numbers is somehow innate; that it arises from intuition about the human experience of time as a sequence of moments.1 In any case, it would be difficult to argue
against the primacy of the natural numbers among mathematical structures.
1 Interestingly, Kant also believed that geometry was similarly primitive and our intuition
of it arises from our experience of three dimensional space. The discovery in the 19th century
of non-Euclidean geometries [34] makes this idea seem quaint by modern standards.
10.1 Peano Axioms
Giuseppe Peano
The Peano axioms are named for Giuseppe Peano (1858–1932), an Italian
mathematician and philosopher. Peano first presented his axioms [40] of arith-
metic in 1889, though in a later paper Peano credited Dedekind [7] with the
first presentation of the axioms. We still know them as Peano’s axioms.
i.) 0 ∈ N
ii.) ∀k : N. sk ∈ N
iii.) ∀k : N. 0 ≠ sk
iv.) ∀j, k : N. j = k ⇔ sj = sk
v.) (P[0] ∧ ∀k : N. P[k] ⇒ P[sk]) ⇒ ∀n : N. P[n]
Axioms (i.) and (ii.) say 0 is a natural number and if k is a natural number
then so is sk. We call s the successor function and sk is the successor of k.
Axiom (iii.) says that 0 is not the successor of any natural number, and axiom (iv.) is a kind of monotonicity property for successor: it says the successor function preserves equality. Axiom (v.) is the induction principle, which is the main topic of discussion of this chapter; see Section 10.3.
So, there are two ways to construct a natural number, either you write down
the constant symbol 0 or, you write down a natural number (say k) and then
you apply the successor function s which has type N → N to the k to get the
number sk.
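In a functional language the two ways of constructing a natural number become the two constructors of a datatype. The following Haskell sketch is our own illustration (the names Nat, Zero and S are ours, mirroring 0 and s):

    data Nat = Zero | S Nat
      deriving Show

    -- sss0, i.e. the numeral 3:
    three :: Nat
    three = S (S (S Zero))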
Thus, N = {0, s0, ss0, sss0, · · · } are the elements of N. Note that the variable “n” used in the definition of the rules never occurs in an element of N; it is a metavariable standing for an arbitrary natural number.
0 = 0
1 = s0
2 = ss0
3 = sss0
...
add(0, k) = k
add(sn, k) = s(add(n, k))
We will use ordinary infix notation for addition with the symbol +; thus add(m, n) will be written m + n. Using this standard notation we take the chance that the reader will assume + has all the properties expected of addition. It turns out that it does, but until we prove a property is true for this definition we cannot use the property. In the infix notation, the definition would appear as follows:
0 + k = k
sn + k = s(n + k)
mult(0, k) = 0
mult(sn, k) = mult(n, k) + k
We will use ordinary infix notation for multiplication with the symbol ·; thus mult(m, n) will be written m · n. We will also sometimes just write mn, omitting the symbol ·. In the infix notation, the definition appears as follows:

0 · k = 0
sn · k = (n · k) + k
Exponentiation is defined by recursion on the exponent:

n^0 = s0
n^(sk) = n^k · n
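All three arithmetic definitions transcribe directly into Haskell on the Nat type sketched above; the function names add, mul and pow are ours:

    add, mul, pow :: Nat -> Nat -> Nat
    add Zero  k = k                  -- 0 + k = k
    add (S n) k = S (add n k)        -- sn + k = s(n + k)

    mul Zero  _ = Zero               -- 0 · k = 0
    mul (S n) k = add (mul n k) k    -- sn · k = (n · k) + k

    pow _ Zero  = S Zero             -- n^0 = s0
    pow n (S k) = mul (pow n k) n    -- n^(sk) = n^k · n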
f (0) = a
f (sk) = g(k, f (k))
mapped to some output) guarantees that definitions which match the pattern halt on all inputs. In [31] Mac Lane notes that one can work the other way around: one can take definition by recursion as an axiom and derive the Peano axioms as theorems.
The arithmetic functions we defined above all take two arguments. We state a corollary of Theorem 10.1 for binary functions defined by recursion on the structure of their first arguments.
f (0, b) = h(b)
f (sn, b) = g(n, f (n, b))
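As a sketch of how the corollary packages recursion, here is the binary scheme written as a Haskell combinator (rec2 is our name for it), together with addition expressed in its terms:

    rec2 :: (b -> c) -> (Nat -> c -> c) -> Nat -> b -> c
    rec2 h _ Zero  b = h b                 -- f(0, b)  = h(b)
    rec2 h g (S n) b = g n (rec2 h g n b)  -- f(sn, b) = g(n, f(n, b))

    -- Addition fits the pattern: h is the identity and g(j, k) = sk.
    addR :: Nat -> Nat -> Nat
    addR = rec2 id (\_ k -> S k)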
Using the corollary, we prove that the definitions given above for addition,
multiplication and exponentiation of natural numbers are indeed functions.
To prove that addition is a function of the specified type we show how it fits
the pattern given by Corollary 10.1. To do so we must specify the sets A and
B and the functions h and g and the element b ∈ B. In this case, let A = N
and B = N. Let b = k. Since B = N and k ∈ N it is an acceptable choice for
b. The function h : N → N is just the identity function, h(k) = k. The function
g ∈ (N × N) → N is the function g(j, k) = sk. Thus, the operation of addition is, by Corollary 10.1, a function in (N × N) → N.
F (0) = 0
F (s0) = 1
F (ssk) = F (sk) + F (k)
Can this be defined using definition by recursion? If so how? If not, why not?
You may have convinced yourself that these definitions look like they “do
the right thing” but we will be able to prove that they behave in the ways we
expect them to using mathematical induction.
It says, for a property P of natural numbers, to show that P[n] holds for every natural number n, it is enough to show two things:

(i.) P[0] holds; and
(ii.) for every k ∈ N, if P[k] holds then P[sk] holds.
So, suppose you have accepted proofs of (i.) and (ii.) but somehow still believe
that there might be some n ∈ N such that ¬P [n] holds. Since n ∈ N it is
constructed by n applications of the successor function to 0. You can construct an argument that P[n] must hold in 2n + 1 steps. The argument is constructed using (i.) and (ii.) as follows2: P[0] holds by (i.); instantiating (ii.) with 0 gives P[0] ⇒ P[s0], so P[s0] holds by modus ponens; instantiating (ii.) with s0 gives P[s0] ⇒ P[ss0], so P[ss0] holds; and so on, until after n instantiations and n applications of modus ponens we reach P[n].
In sequent form, the argument is captured by the following derivation, where Π1 abbreviates the axiom leaf

Π1:   Γ, ∀n : N. P[n] ⊢ ∆1, ∀n : N. P[n], ∆2   (Ax)

and the main derivation is:

                          Γ, k : N, P[k] ⊢ ∆1, P[sk], ∆2
                        ──────────────────────────────────── (⇒R)
                         Γ, k : N ⊢ ∆1, P[k] ⇒ P[sk], ∆2
                        ──────────────────────────────────── (∀R)
 Γ ⊢ ∆1, P[0], ∆2        Γ ⊢ ∆1, ∀k : N. (P[k] ⇒ P[sk]), ∆2
 ─────────────────────────────────────────────────────────── (∧R)
      Γ ⊢ ∆1, P[0] ∧ ∀k : N. (P[k] ⇒ P[sk]), ∆2                   Π1
 ──────────────────────────────────────────────────────────────────── (⇒L)
 Γ, (P[0] ∧ ∀k : N. (P[k] ⇒ P[sk])) ⇒ ∀n : N. P[n] ⊢ ∆1, ∀n : N. P[n], ∆2
 ──────────────────────────────────────────────────────────────────── (induction axiom)
                      Γ ⊢ ∆1, ∀n : N. P[n], ∆2
The informal justification is as follows: if you are trying to prove a sequent of the form Γ ⊢ ∀m : N. P[m], you can add an instance of the principle of mathematical induction to the left side; this is because it is an axiom of Peano arithmetic. After one application of the ⇒L rule you will be required to show two things. The left branch will be of the form

Γ ⊢ P[0] ∧ ∀k : N. (P[k] ⇒ P[sk])

and the right branch will be an instance of an axiom of the form:

∀m : N. P[m], Γ ⊢ ∀m : N. P[m]
2 Recall that modus ponens is the rule that says that P and P ⇒ Q together yield Q.
We can further refine the left branch by applying the ∧R rule, which gives two subgoals: one to show P[0] and the other to show ∀k : N. P[k] ⇒ P[sk]. The latter sequent can be further refined by applying ∀R and then ⇒R. This yields the following rule3.

Γ ⊢ P[0]    Γ, k : N, P[k] ⊢ P[sk]
────────────────────────────────── (Ind)
Γ ⊢ ∀n : N. P[n]

As a first application, we prove the following lemma.

Lemma. ∀k : N. sk = k + 1

Proof: By induction on k. The base case, s0 = 0 + 1, holds because 0 + 1 = 1 = s0 by the definition of addition. For the induction step, assume k : N and the induction hypothesis

sk = k + 1 (Ind. Hyp.)
We must show P [sk]. To get P [sk] carefully replace all free occurrences of k in
the body of P by sk. The result of the substitution is the following equality:
ssk = sk + 1 (A)
3 The contexts ∆1 and ∆2 on the right side have been omitted to aid readability
To show this, we proceed by computing with the right side using the definition
of addition.
sk + 1 = s(k + 1) (B)
By the induction hypothesis, sk = k + 1, so we replace k + 1 by sk in the right side of (B); this gives

sk + 1 = s(k + 1) = ssk

But now we have shown (A) and so the induction step is completed.
Thus, we have shown that the base case and the induction step hold and this
completes the proof.
Since we proved in Thm. 10.2 that addition is a function in (N × N) → N
and by the previous lemma that sk = k + 1 we know the successor operation
defines a function in N → N.
Corollary 10.2 (successor is a function)
s∈N→N
This proof justifies restating the principle of mathematical induction in the
following (perhaps) more familiar form.
Definition 10.6 (Principle of Mathematical Induction (modified)) For
a property P of natural numbers we have the following axiom.
(P [0] ∧ ∀k : N. P [k] ⇒ P [k + 1]) ⇒ ∀n : N. P [n]
The following lemma is useful in a number of proofs.
Lemma 10.2 (addition by zero)
∀n, k : N. k = k + n ⇒ n = 0
Proof: Choose an arbitrary n ∈ N and do induction on k. The property is

P[k] =def (k = k + n ⇒ n = 0)
Base Case: Show P [0], i.e. that 0 = 0 + n ⇒ 0 = n. Assume 0 = 0 + n
and show 0 = n. By the definition of addition 0 + n = n, so the base case holds.
Induction Step: Assume k ∈ N, assume P [k] holds and show P [sk].
P[k] : k = k + n ⇒ n = 0
We must show
sk = sk + n ⇒ n = 0
Assume sk = sk + n and show n = 0. By the definition of addition, from the right side of the equality we have sk + n = s(k + n), so we know that sk = s(k + n). Applying Peano axiom (iv.) to this fact we see that k = k + n. This formula is the antecedent of the induction hypothesis, so we know that n = 0, which is what we were to show.
10.4 Properties of the Arithmetic Operators
Every natural number is either zero or a successor:

∀n : N. n = 0 ∨ ∃j : N. n = sj

Using this lemma we can derive the following proof rule.
Proof Rule 10.2 (Case Analysis on N)
Γ1, k ∈ N, k = 0, Γ2 ⊢ ∆    Γ1, k ∈ N, j ∈ N, k = sj, Γ2 ⊢ ∆
──────────────────────────────────────────────────────────── (NCases) j fresh
Γ1, k ∈ N, Γ2 ⊢ ∆
The laws for addition and multiplication are given as follows, where m, n and k are arbitrary natural numbers.

0 right identity for +     m + 0 = m
+ commutative              m + n = n + m
+ associative              m + (n + k) = (m + n) + k
0 annihilator for ·        m · 0 = 0
1 right identity for ·     m · 1 = m
· commutative              m · n = n · m
· associative              m · (n · k) = (m · n) · k
distributive law           m · (n + k) = (m · n) + (m · k)
The fact that 0 is a left identity for addition falls out of the definition for
free. That 0 is a right identity requires mathematical induction.
Theorem 10.5 (0 right identity for +)
∀n : N. n + 0 = n
Proof: By mathematical induction on n. The property P of n is given as:

P[n] =def (n + 0 = n)
Base Case: We must show P [0], i.e. that 0 + 0 = 0 but this follows
immediately from the definition of + so the base case holds.
Induction Step: Assume n ∈ N and that P [n] holds and show P [sn].
P [n] is the induction hypothesis.
P [n] : n + 0 = n
Show that sn + 0 = sn. But by the definition of + we know that sn + 0 = s(n + 0). By the induction hypothesis n + 0 = n, so s(n + 0) = sn and the induction step holds.
Theorem 10.6 (+ is commutative)
∀m, n : N. m + n = n + m
Theorem 10.7 (+ is associative)
∀m, n, k : N. m + (n + k) = (m + n) + k
Theorem 10.8 (0 annihilator for ·)
∀n : N. n · 0 = 0
Theorem 10.9 (· is commutative)
∀m, n : N. m · n = n · m
Theorem 10.10 (· is associative)
∀m, n, k : N. m · (n · k) = (m · n) · k
We’d like to establish that the less-than relation (<) as defined here behaves as expected, i.e. that it is a strict partial order. Recall the definition from Chap. 7, Def. ??: the relation must be irreflexive (Def. 6.6.13) and transitive (Definition 6.??).
∀n : N. ¬(n < n)
Proof: Choose an arbitrary n ∈ N and show ¬(n < n). We assume n < n and derive a contradiction. If n < n then, by the definition of less-than, the following holds:
∃j : N. n + (j + 1) = n
Let j ∈ N be such that n + (j + 1) = n. By Lemma 10.2 (addition by zero) we
know j + 1 = 0. Since j + 1 is sj this contradicts Peano axiom (iii.) instantiated
with k = j.
∀n : N. ¬(n < 0)

Proof: Choose arbitrary n. To show ¬(n < 0), assume n < 0 and derive a contradiction. By the definition of less-than, n < 0 means

∃j : N. n + (j + 1) = 0

Let j ∈ N be such that n + (j + 1) = 0. By commutativity of addition and the definition of addition,

n + (j + 1) = (j + 1) + n = sj + n = s(j + n)

so 0 = s(j + n), contradicting Peano axiom (iii.).
Proof: Choose an arbitrary k ∈ N and do induction on m. The property is

P[m] =def ∀n : N. m < n ⇒ m + k < n + k

Base Case: Show P[0], i.e.

∀n : N. 0 < n ⇒ 0 + k < n + k

Induction Step: Assume m ∈ N and the induction hypothesis P[m]. Show

∀n : N. m + 1 < n ⇒ (m + 1) + k < n + k

Choose arbitrary n ∈ N. Assume m + 1 < n and show (m + 1) + k < n + k. By the induction hypothesis (using n for n) we get

m < n ⇒ m + k < n + k

Since we assumed m + 1 < n we know m < n (if j is a witness for m + 1 < n then j + 1 is a witness for m < n). Now, since m < n, the induction hypothesis gives m + k < n + k. To show (m + 1) + k < n + k we must show the following.

∃i : N. i ≠ 0 ∧ n + k = ((m + 1) + k) + i
∀k : N. n^k < n^(k+1)

We proceed by induction on k. The property is

P[k] =def n^k < n^(k+1)

For the base case, note that by the definition of exponentiation

n^1 = n^0 · n = 1 · n = n

For the induction step, we must show n^(k+1) < n^((k+1)+1). Starting on the left side of the inequality we compute as follows:

n^(k+1) = n^k · n

By the induction hypothesis, n^k < n^(k+1), so, using Theorem ??, we get the following.

n^(k+1) = n^k · n < n^(k+1) · n = n^((k+1)+1)
This shows the induction step holds and completes the proof.
Σ_{i=k}^{j} f(i) = 0    if j < k

Σ_{i=k}^{j+1} f(i) = f(j + 1) + Σ_{i=k}^{j} f(i)    if (j + 1) ≥ k
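The definition can be sketched directly in Haskell over Int (sumF is our name; it recurses on the upper limit exactly as the two equations do):

    sumF :: (Int -> Int) -> Int -> Int -> Int
    sumF f k j
      | j < k     = 0                       -- empty range
      | otherwise = f j + sumF f k (j - 1)  -- f(j) plus the sum from k to j-1

For example, sumF id 1 10 evaluates to 55.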
By the definition of the summation operator, Σ_{i=1}^{0} i = 0, and also

0(0 + 1)/2 = 0/2 = 0

so the base case holds.
Induction Step: Choose an arbitrary natural number, call it k. We assume
P [k] (the induction hypothesis)
Σ_{i=1}^{k} i = k(k + 1)/2

Starting with the left side, by the definition of the summation operator we get the following.

Σ_{i=1}^{k+1} i = (k + 1) + Σ_{i=1}^{k} i

By the induction hypothesis, we have

(k + 1) + Σ_{i=1}^{k} i = (k + 1) + k(k + 1)/2 = (2(k + 1) + k(k + 1))/2 = (k + 1)(k + 2)/2

which is the required instance of the property, completing the induction step.
10.4.3 Applications
In Chapter 9 the pigeonhole principle was presented (without proof) as Theorem ??. We prove this theorem here using mathematical induction.
Recall, by Definition 9.9.7, that

{0..n} =def {k : N | 0 ≤ k < n}
We must show:

Case n > 0. In this case, our assumption that m + 1 > n means that m > n − 1 (subtracting one from n is justified because n > 0). Use n − 1 for n in the induction hypothesis to get the following:
Now, consider the injective function f ∈ {0..m + 1} → {0..n} from the hypoth-
esis labeled (A) above.
There are two cases, f (m) = n − 1 or f (m) < n − 1.
Case f (m) = n − 1. In this case, since f is an injection in {0..m + 1} → {0..n}, removing m from the domain also removes n − 1 from the co-domain, and so f ↓ {0..m} is a function of type {0..m} → {0..n − 1}. Use this restricted f as a witness to the assumption labeled B.
If we can show that Inj(f, {0..m}, {0..n − 1}) we have completed this case. But
we already know that f is an injection on the larger domain {0..m + 1} so it is
an injection on the smaller one.
Case f (m) < n − 1. In this case, since f is an injection with codomain {0..n}, at most one element of the domain {0..m + 1} gets mapped by f to n − 1; if it exists, call it k. Using f we will construct a new function (call it g) by having g behave just like f except on input k (the one such that f (k) = n − 1), where we set g(k) = f (m). Since we assumed f (m) < n − 1 we know g(k) ∈ {0..n − 1}, and because f is an injection we know that no other element of the domain was mapped by f to f (m). So, g is defined as follows:

g(x) = f (m)   if f (x) = n − 1
g(x) = f (x)   otherwise
To prove this case we show Inj(g, {0..m}, {0..n − 1}). But we constructed g from the injection f precisely so that it is an injection as well.
This suggests a stronger induction hypothesis may be possible: not just to assume that the property holds for the preceding natural number, but that it holds for all preceding natural numbers. In fact, the following induction principle
can be proved using ordinary mathematical induction.
We prove

∀k : N. ∀j : N. j < k ⇒ P[j]

by induction on k.

Base Case: Show ∀j : N. j < 0 ⇒ P[j]. This holds for arbitrary j ∈ N because j < 0 is in contradiction with Lemma ??, which says ¬(j < 0) for all j.
4 We have omitted the contexts ∆1 and ∆2 on the right sides of the sequents for readability.
We must show
∀j : N. j < k + 1 ⇒ P[j]
Let j ∈ N be arbitrary, assume j < k + 1 and show P [j]. By Lemma ?? there
are two cases, j < k or j = k. In the case j < k the induction hypothesis yields
P [j]. In the case j = k, use k for n in assumption C that P is complete to get
the following:
(∀m : N. m < k ⇒ P [m]) ⇒ P [k]
The antecedent of this last formula is an instance of the induction hypothesis
IH (to see this rename the bound variable j in IH to m.) This yields P [k] and
since j = k we get P [j], as we were to show.
Theorem 10.17, the principle of complete induction, follows from Lemma 10.5.
Proof: To prove
10.5.2 Applications
Complete induction is especially useful in proving properties of functions defined
by recursion but which do not follow the structure of the natural numbers. The
Fibonacci numbers provide an excellent example of such a function.
F (0) = 0
F (1) = 1
F (k + 2) = F (k + 1) + F (k)
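Transcribed into Haskell (fib is our name), the definition recurses on the two preceding values rather than on the immediate predecessor, which is why complete induction fits it so naturally:

    fib :: Integer -> Integer
    fib 0 = 0
    fib 1 = 1
    fib k = fib (k - 1) + fib (k - 2)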
We prove the following bound on the growth of F.

∀n : N. F(n) < 2^n
Proof: By complete induction on n. The property is P[n] =def F(n) < 2^n. We assume that n ∈ N and our induction hypothesis becomes

∀m : N. m < n ⇒ F(m) < 2^m

We must show that F(n) < 2^n. We assert the following, leaving the proof to the reader.
∀m : N. m = 0 ∨ m = 1 ∨ ∃k : N. m = k + 2
Using n for m in the assertion we have three cases to consider.
case[n = 0]: Assume n = 0 and show F(0) < 2^0. By definition, F(0) = 0 and 2^0 = 1, so this case holds.
case[n = 1]: Assume n = 1 and show F(1) < 2^1. By definition F(1) = 1 and also 2^1 = 2, so this case holds.
case[∃k : N. n = k + 2]: Assume ∃k : N. n = k + 2. Let k ∈ N be such that n = k + 2. We must show F(k + 2) < 2^(k+2). By definition F(k + 2) = F(k + 1) + F(k). Since we have assumed n = k + 2, we know that k + 1 < n and k < n. Using k + 1 and k in the induction hypothesis we get the facts F(k + 1) < 2^(k+1) and F(k) < 2^k. We use two instances of Thm. 10.13 to get the following:

F(k + 2) = F(k + 1) + F(k) < 2^(k+1) + 2^k

Note that by Thm. 10.14 we know 2^k < 2^(k+1), so, by the definition of exponentiation, 2^k < 2 · 2^k. This justifies the following:

2^(k+1) + 2^k < 2^(k+1) + 2^(k+1) = 2 · 2^(k+1) = 2^(k+2)

This string of inequalities and equalities shows that F(k + 2) < 2^(k+2) and so this case is complete.
By these three cases we have shown that for all n ∈ N the theorem holds.
∀n : N. Prime(n) ⇒ 2 ≤ n
Exercise 10.5. Note that sets of natural numbers are not a suitable represen-
tation of factorizations. Why not?
Lists
11.1 Lists
John McCarthy
Lists may well be the most ubiquitous datatype in computer science. Func-
tional programming languages like LISP, Scheme, ML and Haskell support lists
in significant ways that make them a go-to data-structure. They can be used to model many collection classes (multisets or bags come to mind) as well as relations (as lists of pairs) and finite functions.
We define lists here that may contain elements from some set T . These are
the so-called monomorphic lists; they can only contain elements of type T . There
are two constructors to create a list. Nil (written as “[ ]”) is a constant symbol
denoting the empty list and “::” is a symbol denoting the constructor that adds
an element of the set T to a previously constructed list. This constructor is,
for historical reasons, called “cons”. Note that although “[ ]” and “::” are both
written by sequences of two symbols, we consider them to be atomic symbols
for the purposes of the syntax.
This is the first inductive definition where a parameter (in this case T ) has
been used.
Definition 11.1 (T List)
List T ::= [ ] | a :: L
where
T : is a set,
[ ]: is a constant symbol denoting the empty list, which is called “nil”,
a: is an element of the set T , and
L: is a previously constructed List T .
A list of the form a::L is called a cons. The element a from T in a::L is
called the head and the list L in the cons a::L is called the tail.
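The definition corresponds directly to a parameterized datatype in a functional language. The Haskell sketch below is our own illustration (the names List, Nil and Cons are ours; Haskell’s built-in [ ] and (:) embody the same idea):

    data List t = Nil | Cons t (List t)
      deriving Show

    -- a::b::[ ] with T = Char:
    ab :: List Char
    ab = Cons 'a' (Cons 'b' Nil)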
Example 11.1. As an example, let A = {a, b}, then the set of terms in the
class List A is the following:
{[ ], a::[ ], b::[ ], a::a::[ ], a::b::[ ], b::a::[ ], b::b::[ ], a::a::a::[ ], a::a::b::[ ], · · · }
We call terms in the class List A lists. If A ≠ ∅ then the set of lists in class List A is infinite, but also, like the representation of natural numbers, the representation of each individual list is finite.1 Finiteness follows from the fact that
lists are constructed by consing some value from the set A onto a previously constructed List A. Note that we assume a::b::[ ] means a::(b::[ ]) and not (a::b)::[ ]; to express this we say cons associates to the right. The second form violates the rule for cons because a::b is not well-formed: b is an element of A, not a previously constructed List A. To make reading lists easier we simply separate
a previously constructed List A . To make reading lists easier we simply separate
the consed elements with commas and enclose them in square brackets “[” and
“]”, thus, we write a::[ ] as [a] and write a::b::[ ] as [a, b]. Using this notation we
can rewrite the set of lists in the class List A more succinctly as follows:
{[ ], [a], [b], [a, a], [a, b], [b, a], [b, b], [a, a, a], [a, a, b], · · · }
Note that the set T need not be finite, for example, the class of List N is
perfectly sensible, in this case, there are an infinite number of lists containing
only one element e.g.
{[0], [1], [2], [3] · · · }
1 Dropping the finiteness requirement gives streams, which have a theory of their own where induction is replaced by a principle of co-induction. The Haskell programming language supports an elegant style of programming with streams by implementing an evaluation mechanism that is lazy; computations are only performed when the result is needed.
        ::
       /  \
      a    ::
          /  \
         b    ::
             /  \
            a   [ ]

Figure 11.1: Syntax tree for the list [a, b, a] constructed as a::(b::(a::[ ]))
append([ ], M) =def M
append((a::L), M) =def a::(append(L, M))
The first equation of the definition says: if the first argument is the empty list
[ ], the result is just the second argument. The second equation of the definition
says, if the first argument is a cons of the form a::L, then cons a on the append
of L and M . Thus, there are two equations, one for each rule that could have
been used to construct the first argument of the function. Note that since there
are only two ways to construct a list, this definition covers all possible ways the
first argument could be constructed.
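Transcribed into Haskell over the List type sketched earlier, the two equations become:

    append :: List t -> List t -> List t
    append Nil        m = m                    -- append([ ], M) = M
    append (Cons a l) m = Cons a (append l m)  -- append(a::L, M) = a::(append(L, M))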
Using the definition we can show append([a, b], [c]) = [a, b, c]. Using the pretty notation for lists we can rewrite the derivation as
follows:
append ([a, b], [c])
= a::(append ([b], [c]))
= a::b::(append ([ ], [c]))
= a::b::[c]
= [a, b, c]
We will use the more succinct notation for lists from now on, but do not
forget that this is just a more readable display for the more cumbersome but
precise notation which explicitly uses the cons constructor.
Append is such a common operation on lists that the ML and Haskell pro-
gramming languages provide convenient infix notations for the list append op-
erator. In Haskell, the symbol is “++”, in the ML family of languages it is
“@”. We will use “ ++ ” here. Using the infix notation the definition appears
as follows:
[a, b] ++ [c]
= a::([b] ++ [c])
= a::b::([ ] ++ [c])
= a::b::[c]
= [a, b, c]
f ([ ]) = b
f (x :: xs) = g(x, f (xs))
f ([ ], b) = h(b)
f ((x::xs), b) = g(x, f (xs, b))
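The first scheme is exactly the foldr combinator of Haskell’s standard library: a function satisfying f([ ]) = b and f(x::xs) = g(x, f(xs)) is foldr g b. For instance, on Haskell’s built-in lists:

    -- length fits the pattern with b = 0 and g(x, r) = 1 + r:
    lengthViaFold :: [a] -> Int
    lengthViaFold = foldr (\_ r -> 1 + r) 0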
Theorem 11.2 (append ∈ (List A × List A) → List A) Recall the definition of append
def
append ([ ], M ) = M
def
append ((a::L), M ) = a::(append (L, M ))
Proof: We apply Corollary 11.1. Let A be an arbitrary set and let B = List A and C = List A. Let h ∈ List A → List A be the identity function, and let g(x, m) = x::m. Then g ∈ (A × List A) → List A. This fits the pattern and shows that append is a function of type (List A × List A) → List A.
length([ ]) = 0
length(x::xs) = 1 + length(xs)
mem(y, [ ]) = false
mem(y, x::xs) = (y = x) ∨ mem(y, xs)
rev([ ]) = [ ]
rev(x::xs) = rev(xs) ++ [x]
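On Haskell’s built-in lists the three definitions read as follows (len, mem and rev are our names; mem corresponds to the standard elem):

    len :: [a] -> Int
    len []     = 0
    len (_:xs) = 1 + len xs

    mem :: Eq a => a -> [a] -> Bool
    mem _ []     = False
    mem y (x:xs) = y == x || mem y xs

    rev :: [a] -> [a]
    rev []     = []
    rev (x:xs) = rev xs ++ [x]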
Exercise 11.1. Show that length, member and reverse are all functions by
applying Thm. 11.1 or Corollary 11.1.
P[xs] =def xs ++ (ys ++ zs) = (xs ++ ys) ++ zs

Base Case: We must show [ ] ++ (ys ++ zs) = ([ ] ++ ys) ++ zs. Computing with the definition of ++ on each side gives

[ ] ++ (ys ++ zs) = ys ++ zs    ⟨⟨def. of ++⟩⟩
([ ] ++ ys) ++ zs = ys ++ zs    ⟨⟨def. of ++⟩⟩

So the left and right sides are equal and the base case holds.
Here’s an interesting theorem about reverse; we’ve seen this pattern before in Chapter 6, in Theorem 6.2, where we proved that the inverse of a composition is the composition of inverses:

rev(xs ++ ys) = rev(ys) ++ rev(xs)
So the left and right sides are equal and the base case holds.
Induction Step: For arbitrary xs ∈ List A assume P[xs] and show P[x::xs] for an arbitrary x ∈ A. We must show:

rev((x::xs) ++ ys) = rev(ys) ++ rev(x::xs)

Starting with the right side:
rev(ys) ++ rev(x::xs)
  = rev(ys) ++ (rev(xs) ++ [x])    ⟨⟨def. of rev⟩⟩
  = (rev(ys) ++ rev(xs)) ++ [x]    ⟨⟨Thm. 11.4⟩⟩
Bibliography
[1] Martín Abadi and Luca Cardelli. A Theory of Objects. Springer, 1996.
[2] Stuart Allen. Discrete math lessons. http://www.cs.cornell.edu/Info/People/sfa/Nuprl/eduprl/Xcounting_intro.html.
[3] Jonathan Barnes. Logic and the Imperial Stoa, volume LXXV of Philosophia Antiqua. Brill, Leiden · New York · Köln, 1997.
[4] Garrett Birkhoff and Saunders Mac Lane. A Survey of Modern Algebra.
Macmillan, New York, 2nd edition, 1965.
[5] Noam Chomsky. Syntactic Structures. Number 4 in Janua Linguarum,
Minor. Mouton, The Hague, 1957.
[6] Alonzo Church. The Calculi of Lambda-Conversion, volume 6 of Annals of
Mathematical Studies. Princeton University Press, Princeton, 1951.
[7] Richard Dedekind. Was sind und was sollen die Zahlen? 1888. English translation in [8].
[8] Richard Dedekind. Essays on the Theory of Numbers. Dover, 1963.
[9] Kees Doets and Jan van Eijck. The Haskell Road to Logic, Maths and Pro-
gramming, volume 4 of Texts in Computing. Kings College Press, London,
2004.
[10] Paul Edwards, editor. The Encyclopedia of Philosophy, Eight volumes pub-
lished in four, unabridged, New York · London, 1972. Collier Macmillan &
The Free Press.
[11] M. A. Ellis and B. Stroustrup. The Annotated C++ Reference Manual.
Addison-Wesley, Reading, MA, 1990.
[12] E. Engeler. Formal Languages: Automata and Structures. Markham,
Chicago, 1968.
[13] Discourses of Epictetus. D. Appleton and Co., New York, 1904. Translated by George Long.
[14] Anita Burdman Feferman and Solomon Feferman. Alfred Tarski: Life and
Logic. Cambridge University Press, 2004.
[15] Abraham A. Frankel. Georg Cantor. In Edwards [10], pages 20–22.
[16] D. P. Friedman, M. Wand, and C. T. Haynes. Essentials of Programming
Languages. MIT Press, 1992.
[17] Galileo Galilei. Two New Sciences. University of Wisconsin Press, 1974.
Translated by Stillman Drake.
[18] Gerhard Gentzen. Investigations into logical deduction. In M. E. Szabo, edi-
tor, The collected papers of Gerhard Gentzen, pages 68–131. North-Holland,
1969.
[19] Nelson Goodman. The Structure of Appearance, Third ed., volume 107 of
Synthese Library. D. Reidel, Dordrecht, 1977.
[20] Nelson Goodman and W. V. Quine. Steps toward a constructive nominalism. Journal of Symbolic Logic, 12:105–122, 1947.
[21] David Gries and Fred B. Schneider. A Logical Approach to Discrete Math.
Springer-Verlag, New York, 1993.
[22] C. A. Gunter. Semantics of Programming Languages: Structures and Tech-
niques. Foundations of Computing Series. MIT Press, 1992.
[23] C. A. Gunter and J. C. Mitchell, editors. Theoretical Aspects of Object-
Oriented Programming, Types, Semantics and Language Design. Types,
Semantics, and Language Design. MIT Press, Cambridge, MA, 1994.
[24] C. A. Gunter and D. S. Scott. Semantic domains. In J. van Leeuwen, editor,
Handbook of Theoretical Computer Science, pages 633–674. North-Holland,
1990.
[25] Paul R. Halmos. Boolean Algebra. Van Nostrand Reinhold, New York, 1968.
[26] Paul R. Halmos. Naive Set Theory. Springer Verlag, New York · Heidelberg · Berlin, 1974.
[27] Herodotus. The History. University of Chicago, 1987. Translated by David Grene.
[28] D. Hilbert and W. Ackermann. Mathematical Logic. Chelsea Publishing,
New York, 1950.
[29] John E. Hopcroft and Jeffrey D. Ullman. Formal Languages and Their
Relation to Automata. Addison-Wesley, Reading, Massachusetts, 1969.
[30] Stephen C. Kleene. Introduction to Metamathematics. van Nostrand,
Princeton, 1952.
[31] Saunders Mac Lane. Mathematics: Form and Function. Springer Verlag,
New York, Berlin, Heidelberg, Tokyo, 1986.
[33] Xavier Leroy. The Objective Caml system release 1.07. INRIA, France,
May 1997.
[36] Zohar Manna and Richard Waldinger. The Logical Basis for Computer
Programming: Deductive Reasoning. Addison-Wesley, 1985. Published in
2 Volumes.
[37] R. Milner, M. Tofte, and R. Harper. The Definition of Standard ML. The
MIT Press, 1991.
[40] Giuseppe Peano. Arithmetices principia, nova methodo exposita, 1889. English translation in [51], pp. 83–97.
[41] Benjamin Pierce. Basic Category Theory for Computer Scientists. MIT Press, 1991.
[43] David J. Pym and Eike Ritter. Reductive Logic and Proof-search: Proof Theory, Semantics and Control, volume 45 of Oxford Logic Guides. Clarendon Press, Oxford, 2004.
[44] W.V.O. Quine. Methods of Logic. Holt, Rinehart and Winston, 1950.
[51] Jan van Heijenoort, editor. From Frege to Gödel: A sourcebook in mathe-
matical logic, 1879 – 1931. Harvard University Press, 1967.
Index
factorization, 164
  prime, 164
falsifiable, 29
falsifies, 27
Fibonacci
  growth, 162
Fibonacci numbers, 162
finite, 134
formal language, 2
formula, 57, 58
  predicate logic, 58
foundations of mathematics, 77
fractions, 113
  addition of, 116
  multiplication of, 116
free occurrence, 62, 63
Frege, Gottlob, 57
function, 119
  addition, 148
  Boolean valued, 56
  composition of, 122
  equivalence of, 120
  exponentiation, 149
  extension of, 121
  extensionality, 120
  inverse, 124, 126–128
  multiplication, 148
  restriction of, 121
  symbols for terms of predicate logic, 58
functionality, 119
Fundamental theorem of arithmetic, 164
Hilbert, David, 64
idempotent, 94
identity
  for composition, 103
  for composition of relations, 103
  function, 123
  left, 103
  matrix, 100
  relation, 99
  right, 103
  ∆A, 123
if and only if, see bi-conditional
iff, see bi-conditional
implication, 22
  proof rule for, 39
  semantics, 28
  syntax, 22
induction, 149, 152, 172
  on N, 149, 152, 172
inductively defined sets, 1
infinite, 133
infinity of infinities, 136
injection, 125
intensional properties, 121
interpretation, 10
intersection
  commutativity of, 89
  of sets, 89
  zero for, 89
inverse
  bijection
    lemma, 128
proposition, 21
propositional
  assignment, 25
  connectives, 22
  constants, 22
  formula, 22
    contradiction, 30
    falsifiable, 29
    satisfiable, 29
    valid, 30
  meta variable, 22
  semantics, 28
  sequent, 31
  tautology, 30
  valuation, 26
  variables, 22
propositional logic
  constructors, 23
  semantics, 28
  syntax, 22
  syntax trees, 23
propositional logic: completeness, 44
propositional logic: soundness, 44
quantification, 55
quantifier
  existential, 55
  universal, 55
Quine, W. V. O., 64
quotient, 48, 113
range, 119
Q, 114
rational numbers
  countable, 137
rationals, 114
reachability, 104
recursion, 147, 148, 170, 171
  base case, 10
  definition by, 10
  definition of addition by, 146
  definition of exponentiation by, 147
  definition of multiplication by, 147
  definition of summation by, 157
  on List A, 170, 171
  on N, 147
  recursive call, 10
  termination, 11
recursion theorem, 147
reflexive, 53, 104, 105
  closure, 108
reflexive transitive closure, 109
relation, 98
  antisymmetric, 104, 106
  asymmetric, 104, 106
  binary, 98
    infix notation for, 99
  closure
    uniqueness of, 107
  closure of, 106, 107
    reflexive, 108
    symmetric, 109
  codomain, 98
  complement of, 101
  complete, 112
  composition
    inverse of, 102
    iterated, 103
  composition of, 101
    associative, 101
  congruence, 116
  connected, 104
  connectivity of, 104
  diagonal, 99, 112
  domain, 98
  empty set, 99
  equality, 100
  equivalence, 111, 112
    fineness of, 113
  functional, 119
  inverse, 100
  inverse of, 100
  irreflexive, 104, 105
  less than, 99
  partial order, 117
  partial order, strict, 118
  reachability, 104
  reflexive, 104, 105
  reflexive transitive closure, 109
  representation
    matrix, 99
sum, 157
Σ, 157
surjection, 125
Sym, 105, 109
symmetric, 53, 104, 105
  closure, 109
syntactic class, 2
syntax, 1
  abstract, 2, 4, 15
  concrete, 4
syntax tree, 58
  for propositional logic, 23
Tarski, Alfred, 97
tautology, 30
term, 4, 57, 58
  as tree, 2
  predicate logic, 58
  syntax of, 58
  variables, 57
T[F], 58
terms
  structured, 2
text, 4
theory, 44
top, 24
  definition of, 24
  identity for conjunction, 33
  semantics, 30
⊤, 24
total, 119
transitive, 53, 104
transitive closure, 109
transpose, 100
trichotomy, 59
truth table, 28
⊢, 31
turnstile (vdash), 31
type theory, 77
unary, 56
uncountable
  reals are, 137
union, 87
  axiom, 87
  commutativity of, 88
  identity for, 88
uniqueness, 82
  of counting, 137
  of empty set, 82
  of relational closures, 107
  of singletons, 84
  of unordered pairs, 83
universal quantifier, 55
  proof rule for, 66, 67
unordered pair, 83
val, 26
valid, 27, 30, 34
valuation, 26
  computation of, 27
variable, 57
  binding occurrence, 62
  bound, 63, 64
    in a term, 63
  free, 62–64
    in a formula, 63
    in a term, 62
  occurrence, 61, 62
    in a formula, 62
    in a term, 61
vicious circle principle, see circular reasoning
Wiener, Norbert, 87
witness, 67
Wittgenstein, Ludwig, 21, 28