Boxes and Diamonds
The Open Logic Project
Instigator
Richard Zach, University of Calgary
Editorial Board
Aldo Antonelli,† University of California, Davis
Andrew Arana, Université Paris I Panthéon-Sorbonne
Jeremy Avigad, Carnegie Mellon University
Walter Dean, University of Warwick
Gillian Russell, University of North Carolina
Nicole Wyatt, University of Calgary
Audrey Yap, University of Victoria
Contributors
Samara Burns, University of Calgary
Dana Hägg, University of Calgary
Zesen Qian, Carnegie Mellon University
Boxes and Diamonds
An Open Introduction to Modal Logic
Remixed by Richard Zach
Winter 2018 bis
The Open Logic Project would like to acknowledge the generous support of the Faculty of Arts and the Taylor Institute of Teaching and Learning of the University of Calgary, and the Alberta Open Educational Resources (ABOER) Initiative, which is made possible through an investment from the Alberta government.
Cover illustrations by Matthew Leadbeater, used under a Creative Commons Attribution-NonCommercial 4.0 International License.
Typeset in Baskervald X and Universalis ADF Standard by LaTeX.
This version of boxes-and-diamonds is revision d6913fc (2019-02-28), with content generated from OpenLogicProject revision 89307f1 (2019-02-27).
Boxes and Diamonds by Richard Zach is licensed under a Creative Commons Attribution 4.0 International License. It is based on The Open Logic Text by the Open Logic Project, used under a Creative Commons Attribution 4.0 International License.
Contents

I Normal Modal Logics

1 Syntax and Semantics of Normal Modal Logics
  1.1 Introduction
  1.2 The Language of Basic Modal Logic
  1.3 Simultaneous Substitution
  1.4 Relational Models
  1.5 Truth at a World
  1.6 Truth in a Model
  1.7 Validity
  1.8 Tautological Instances
  1.9 Schemas and Validity
  1.10 Entailment
  Problems

2 Frame Definability
  2.1 Introduction
  2.2 Properties of Accessibility Relations
  2.3 Frames
  2.4 Frame Definability
  2.5 First-order Definability
  2.6 Equivalence Relations and S5
  2.7 Second-order Definability
  Problems

3 Axiomatic Derivations
  3.1 Introduction
  3.2 Normal Modal Logics
  3.3 Derivations and Modal Systems
  3.4 Proofs in K
  3.5 Derived Rules
  3.6 More Proofs in K
  3.7 Dual Formulas
  3.8 Proofs in Modal Systems
  3.9 Soundness
  3.10 Showing Systems are Distinct
  3.11 Derivability from a Set of Formulas
  3.12 Properties of Derivability
  3.13 Consistency
  Problems

4 Completeness and Canonical Models
  4.1 Introduction
  4.2 Complete Σ-Consistent Sets
  4.3 Lindenbaum's Lemma
  4.4 Modalities and Complete Consistent Sets
  4.5 Canonical Models
  4.6 The Truth Lemma
  4.7 Determination and Completeness for K
  4.8 Frame Completeness
  Problems

5 Filtrations and Decidability
  5.1 Introduction
  5.2 Preliminaries
  5.3 Filtrations
  5.4 Examples of Filtrations
  5.5 Filtrations are Finite
  5.6 K and S5 have the Finite Model Property
  5.7 S5 is Decidable
  5.8 Filtrations and Properties of Accessibility
  5.9 Filtrations of Euclidean Models
  Problems

6 Modal Tableaux
  6.1 Introduction
  6.2 Rules for K
  6.3 Tableaux for K
  6.4 Soundness
  6.5 Rules for Other Accessibility Relations
  6.6 Tableaux for Other Logics
  6.7 Soundness for Additional Rules
  6.8 Simple Tableaux for S5
  6.9 Completeness for K
  6.10 Countermodels from Tableaux
  Problems

II Intuitionistic Logic

7 Introduction
  7.1 Constructive Reasoning
  7.2 Syntax of Intuitionistic Logic
  7.3 The Brouwer-Heyting-Kolmogorov Interpretation
  7.4 Natural Deduction
  7.5 Axiomatic Derivations
  Problems

8 Semantics
  8.1 Introduction
  8.2 Relational models
  8.3 Semantic Notions
  8.4 Topological Semantics
  Problems

9 Soundness and Completeness
  9.1 Soundness of Axiomatic Derivations
  9.2 Soundness of Natural Deduction
  9.3 Lindenbaum's Lemma
  9.4 The Canonical Model
  9.5 The Truth Lemma
  9.6 The Completeness Theorem
  Problems

III Counterfactuals

10 Introduction
  10.1 The Material Conditional
  10.2 Paradoxes of the Material Conditional
  10.3 The Strict Conditional
  10.4 Counterfactuals
  Problems

11 Minimal Change Semantics
  11.1 Introduction
  11.2 Sphere Models
  11.3 Truth and Falsity of Counterfactuals
  11.4 Antecedent Strengthening
  11.5 Transitivity
  11.6 Contraposition
  Problems

IV Appendices

A Sets
  A.1 Basics
  A.2 Some Important Sets
  A.3 Subsets
  A.4 Unions and Intersections
  A.5 Pairs, Tuples, Cartesian Products
  A.6 Russell's Paradox
  Problems

B Relations
  B.1 Relations as Sets
  B.2 Special Properties of Relations
  B.3 Orders
  B.4 Graphs
  B.5 Operations on Relations
  Problems

C Syntax and Semantics
  C.1 Introduction
  C.2 Propositional Formulas
  C.3 Preliminaries
  C.4 Valuations and Satisfaction
  C.5 Semantic Notions
  Problems

D Axiomatic Derivations
  D.1 Introduction
  D.2 Axiomatic Derivations
  D.3 Rules and Derivations
  D.4 Axioms and Rules for the Propositional Connectives
  D.5 Examples of Derivations
  D.6 Proof-Theoretic Notions
  D.7 The Deduction Theorem
  D.8 Derivability and Consistency
  D.9 Derivability and the Propositional Connectives
  D.10 Soundness
  Problems

E Tableaux
  E.1 Tableaux
  E.2 Rules and Tableaux
  E.3 Propositional Rules
  E.4 Tableaux
  E.5 Examples of Tableaux
  E.6 Proof-Theoretic Notions
  E.7 Derivability and Consistency
  E.8 Derivability and the Propositional Connectives
  E.9 Soundness
  Problems

F The Completeness Theorem
  F.1 Introduction
  F.2 Outline of the Proof
  F.3 Complete Consistent Sets of Sentences
  F.4 Lindenbaum's Lemma
  F.5 Construction of a Model
  F.6 The Completeness Theorem
  Problems

G Proofs
  G.1 Introduction
  G.2 Starting a Proof
  G.3 Using Definitions
  G.4 Inference Patterns
  G.5 An Example
  G.6 Another Example
  G.7 Proof by Contradiction
  G.8 Reading Proofs
  G.9 I Can't Do It!
  G.10 Other Resources
  Problems

H Induction
  H.1 Introduction
  H.2 Induction on ℕ
  H.3 Strong Induction
  H.4 Inductive Definitions
  H.5 Structural Induction
  H.6 Relations and Functions
  Problems

Photo Credits

Bibliography
PART I
Normal Modal Logics

CHAPTER 1
Syntax and Semantics of Normal Modal Logics
1.1 Introduction
Modal logic deals with modal propositions and the entailment relations among them. Examples of modal propositions are the following:
1. It is necessary that 2 + 2 = 4.
2. It is necessarily possible that it will rain tomorrow.
3. If it is necessarily possible that A, then it is possible that A.

Possibility and necessity are not the only modalities: other unary connectives are also classified as modalities, for instance, "it ought to be the case that A," "it will be the case that A," "Dana knows that A," or "Dana believes that A."
Modal logic makes its first appearance in Aristotle's De Interpretatione: he was the first to notice that necessity implies possibility, but not vice versa; that possibility and necessity are interdefinable; that if A ∧ B is possibly true then A is possibly true and B is possibly true, but not conversely; and that if A → B is necessary, then if A is necessary, so is B.
The first modern approach to modal logic was the work of C. I. Lewis, culminating with Lewis and Langford, Symbolic Logic (1932). Lewis & Langford were unhappy with the representation of implication by means of the material conditional: A → B is a poor substitute for "A implies B." Instead, they proposed to characterize implication as "Necessarily, if A then B," symbolized as A ⥽ B. In trying to sort out the different properties, Lewis identified five different modal systems, S1, . . . , S4, S5, the last two of which are still in use.
The approach of Lewis and Langford was purely syntactical: they identified reasonable axioms and rules and investigated what was provable with those means. A semantic approach remained elusive for a long time, until a first attempt was made by Rudolf Carnap in Meaning and Necessity (1947) using the notion of a state description, i.e., a collection of atomic sentences (those that are "true" in that state description). After lifting the truth definition to arbitrary sentences A, Carnap defines A to be necessarily true if it is true in all state descriptions. Carnap's approach could not handle iterated modalities, in that sentences of the form "Possibly necessarily . . . possibly A" always reduce to the innermost modality.
The major breakthrough in modal semantics came with Saul Kripke's article "A Completeness Theorem in Modal Logic" (JSL 1959). Kripke based his work on Leibniz's idea that a statement is necessarily true if it is true "at all possible worlds." This idea, though, suffers from the same drawbacks as Carnap's, in that the truth of a statement at a world w (or a state description s) does not depend on w at all. So Kripke assumed that worlds are related by an accessibility relation R, and that a statement of the form "Necessarily A" is true at a world w if and only if A is true at all worlds w′ accessible from w. Semantics that provide some version of this approach are called Kripke semantics and made possible the tumultuous development of modal logics (in the plural).
When interpreted by the Kripke semantics, modal logic shows us what relational structures look like "from the inside." A relational structure is just a set equipped with a binary relation (for instance, the set of students in the class ordered by their social security number is a relational structure). But in fact relational structures come in all sorts of domains: besides relative possibility of states of the world, we can have epistemic states of some agent related by epistemic possibility, or states of a dynamical system with their state transitions, etc. Modal logic can be used to model all of these: the first gives us ordinary, alethic, modal logic; the others give us epistemic logic, dynamic logic, etc.
We focus on one particular angle, known to modal logicians as "correspondence theory." One of the most significant early discoveries of Kripke's is that many properties of the accessibility relation R (whether it is transitive, symmetric, etc.) can be characterized in the modal language itself by means of appropriate "modal schemas." Modal logicians say, for instance, that the reflexivity of R "corresponds" to the schema "If necessarily A, then A". We explore mainly the correspondence theory of a number of classical systems of modal logic (e.g., S4 and S5) obtained by a combination of the schemas D, T, B, 4, and 5.
1.2 The Language of Basic Modal Logic
Definition 1.1. The basic language of modal logic contains

1. The propositional constant for falsity ⊥.

2. A countably infinite set of propositional variables: p₀, p₁, p₂, . . .

3. The propositional connectives: ¬ (negation), ∧ (conjunction), ∨ (disjunction), → (conditional).

4. The modal operator □.

5. The modal operator ♦.
Definition 1.2. Formulas of the basic modal language are inductively defined as follows:

1. ⊥ is an atomic formula.

2. Every propositional variable pᵢ is an (atomic) formula.

3. If A and B are formulas, then (A ∧ B) is a formula.

4. If A and B are formulas, then (A ∨ B) is a formula.

5. If A and B are formulas, then (A → B) is a formula.

6. If A is a formula, so is □A.

7. If A is a formula, then ♦A is a formula.

8. Nothing else is a formula.

If a formula A does not contain □ or ♦, we say it is modal-free.
1.3 Simultaneous Substitution
An instance of a formula A is the result of replacing all occurrences of a propositional variable in A by some other formula. We will refer to instances of formulas often, both when discussing validity and when discussing derivability. It is therefore useful to define the notion precisely.
Definition 1.3. Where A is a modal formula all of whose propositional variables are among p₁, . . . , pₙ, and D₁, . . . , Dₙ are also modal formulas, we define A[D₁/p₁, . . . , Dₙ/pₙ] as the result of simultaneously substituting each Dᵢ for pᵢ in A. Formally, this is a definition by induction on A:

1. A ≡ ⊥: A[D₁/p₁, . . . , Dₙ/pₙ] is ⊥.

2. A ≡ q: A[D₁/p₁, . . . , Dₙ/pₙ] is q, provided q ≢ pᵢ for i = 1, . . . , n.

3. A ≡ pᵢ: A[D₁/p₁, . . . , Dₙ/pₙ] is Dᵢ.

4. A ≡ ¬B: A[D₁/p₁, . . . , Dₙ/pₙ] is ¬B[D₁/p₁, . . . , Dₙ/pₙ].

5. A ≡ (B ∧ C): A[D₁/p₁, . . . , Dₙ/pₙ] is (B[D₁/p₁, . . . , Dₙ/pₙ] ∧ C[D₁/p₁, . . . , Dₙ/pₙ]).

6. A ≡ (B ∨ C): A[D₁/p₁, . . . , Dₙ/pₙ] is (B[D₁/p₁, . . . , Dₙ/pₙ] ∨ C[D₁/p₁, . . . , Dₙ/pₙ]).

7. A ≡ (B → C): A[D₁/p₁, . . . , Dₙ/pₙ] is (B[D₁/p₁, . . . , Dₙ/pₙ] → C[D₁/p₁, . . . , Dₙ/pₙ]).

8. A ≡ (B ↔ C): A[D₁/p₁, . . . , Dₙ/pₙ] is (B[D₁/p₁, . . . , Dₙ/pₙ] ↔ C[D₁/p₁, . . . , Dₙ/pₙ]).

9. A ≡ □B: A[D₁/p₁, . . . , Dₙ/pₙ] is □B[D₁/p₁, . . . , Dₙ/pₙ].

10. A ≡ ♦B: A[D₁/p₁, . . . , Dₙ/pₙ] is ♦B[D₁/p₁, . . . , Dₙ/pₙ].

The formula A[D₁/p₁, . . . , Dₙ/pₙ] is called a substitution instance of A.
Example 1.4. Suppose A is p₁ → (p₁ ∧ p₂), D₁ is ♦(p₂ → p₃) and D₂ is ¬p₁. Then A[D₁/p₁, D₂/p₂] is

♦(p₂ → p₃) → (♦(p₂ → p₃) ∧ ¬p₁)

while A[D₂/p₁, D₁/p₂] is

¬p₁ → (¬p₁ ∧ ♦(p₂ → p₃))

Note that simultaneous substitution is in general not the same as iterated substitution, e.g., compare A[D₁/p₁, D₂/p₂] above with (A[D₁/p₁])[D₂/p₂], which is:

♦(p₂ → p₃) → (♦(p₂ → p₃) ∧ p₂)[¬p₁/p₂], i.e.,
♦(¬p₁ → p₃) → (♦(¬p₁ → p₃) ∧ ¬p₁)

and with (A[D₂/p₂])[D₁/p₁]:

p₁ → (p₁ ∧ ¬p₁)[♦(p₂ → p₃)/p₁], i.e.,
♦(p₂ → p₃) → (♦(p₂ → p₃) ∧ ¬♦(p₂ → p₃)).
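Definition 1.3 translates directly into a short recursive program. The following Python sketch is not part of the text; it assumes a hypothetical encoding of formulas as nested tuples, with variables tagged 'var' and operators 'not', 'and', 'or', '->', 'box', 'dia'.

```python
# Hypothetical tuple encoding of formulas (not from the text), e.g.,
# ('->', ('var', 'p1'), ('var', 'p2')) stands for p1 -> p2.
def subst(A, sigma):
    """A[D1/p1, ..., Dn/pn] for sigma = {'p1': D1, ..., 'pn': Dn}."""
    op = A[0]
    if op == 'bot':
        return A                                    # clause 1
    if op == 'var':
        return sigma.get(A[1], A)                   # clauses 2 and 3
    if op in ('not', 'box', 'dia'):
        return (op, subst(A[1], sigma))             # clauses 4, 9, 10
    return (op, subst(A[1], sigma), subst(A[2], sigma))  # clauses 5-8

# Example 1.4, checked mechanically:
p1, p2 = ('var', 'p1'), ('var', 'p2')
A  = ('->', p1, ('and', p1, p2))                    # p1 -> (p1 & p2)
D1 = ('dia', ('->', p2, ('var', 'p3')))             # dia(p2 -> p3)
D2 = ('not', p1)                                    # ~p1
print(subst(A, {'p1': D1, 'p2': D2}))
```

Because `subst` recurses only into the original formula and never into the substituted formulas, it computes simultaneous substitution, in contrast with the iterated substitutions compared above.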
1.4 Relational Models
The basic concept of semantics for normal modal logics is that of
a relational model. It consists of a set of worlds, which are related
by a binary “accessibility relation,” together with an assignment
which determines which propositional variables count as “true”
at which worlds.
Definition 1.5. A model for the basic modal language is a triple M = ⟨W, R, V⟩, where

1. W is a nonempty set of "worlds,"

2. R is a binary accessibility relation on W, and

3. V is a function assigning to each propositional variable p a set V(p) of possible worlds.
[Figure 1.1: A simple model. World w₁ satisfies p and ¬q, w₂ satisfies p and q, and w₃ satisfies ¬p and ¬q; arrows run from w₁ to w₂ and from w₁ to w₃.]
When Rww′ holds, we say that w′ is accessible from w. When w ∈ V(p) we say p is true at w.

The great advantage of relational semantics is that models can be represented by means of simple diagrams, such as the one in Figure 1.1. Worlds are represented by nodes, and world w′ is accessible from w precisely when there is an arrow from w to w′. Moreover, we label a node (world) by p when w ∈ V(p), and otherwise by ¬p. Figure 1.1 represents the model with W = {w₁, w₂, w₃}, R = {⟨w₁, w₂⟩, ⟨w₁, w₃⟩}, V(p) = {w₁, w₂}, and V(q) = {w₂}.
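For concreteness, here is the same model as plain data in the hypothetical Python encoding introduced in the sketch in section 1.3 (a triple of worlds, accessibility pairs, and valuation; not part of the text):

```python
# The model of Figure 1.1: W, R as a set of pairs, V as a dict of truth sets.
W = {'w1', 'w2', 'w3'}
R = {('w1', 'w2'), ('w1', 'w3')}
V = {'p': {'w1', 'w2'}, 'q': {'w2'}}
M = (W, R, V)
```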
1.5 Truth at a World
Every modal model determines which modal formulas count as true at which worlds in it. The relation "model M makes formula A true at world w" is the basic notion of relational semantics. The relation is defined inductively and coincides with the usual characterization using truth tables for the non-modal operators.
Definition 1.6. Truth of a formula A at w in a model M, in symbols: M, w ⊩ A, is defined inductively as follows:

1. A ≡ ⊥: Never M, w ⊩ ⊥.

2. M, w ⊩ p iff w ∈ V(p).

3. A ≡ ¬B: M, w ⊩ A iff M, w ⊮ B.

4. A ≡ (B ∧ C): M, w ⊩ A iff M, w ⊩ B and M, w ⊩ C.

5. A ≡ (B ∨ C): M, w ⊩ A iff M, w ⊩ B or M, w ⊩ C (or both).

6. A ≡ (B → C): M, w ⊩ A iff M, w ⊮ B or M, w ⊩ C.

7. A ≡ □B: M, w ⊩ A iff M, w′ ⊩ B for all w′ ∈ W with Rww′.

8. A ≡ ♦B: M, w ⊩ A iff M, w′ ⊩ B for at least one w′ ∈ W with Rww′.

Note that by clause (7), a formula □B is true at w whenever there are no w′ with Rww′. In such a case □B is vacuously true at w. Also, □B may be satisfied at w even if B is not. The truth of B at w does not guarantee the truth of ♦B at w. This holds, however, if Rww, e.g., if R is reflexive. If there is no w′ such that Rww′, then M, w ⊮ ♦A, for any A.
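Definition 1.6 is itself an algorithm for finite models: evaluate subformulas recursively, quantifying over accessible worlds for □ and ♦. A sketch (hypothetical, not part of the text), continuing the tuple encoding used earlier:

```python
def holds(M, w, A):
    """Return True iff M, w ⊩ A (Definition 1.6); M = (W, R, V)."""
    W, R, V = M
    op = A[0]
    if op == 'bot':
        return False                                          # clause 1
    if op == 'var':
        return w in V.get(A[1], set())                        # clause 2
    if op == 'not':
        return not holds(M, w, A[1])                          # clause 3
    if op == 'and':
        return holds(M, w, A[1]) and holds(M, w, A[2])        # clause 4
    if op == 'or':
        return holds(M, w, A[1]) or holds(M, w, A[2])         # clause 5
    if op == '->':
        return not holds(M, w, A[1]) or holds(M, w, A[2])     # clause 6
    if op == 'box':                                           # clause 7
        return all(holds(M, v, A[1]) for v in W if (w, v) in R)
    if op == 'dia':                                           # clause 8
        return any(holds(M, v, A[1]) for v in W if (w, v) in R)
    raise ValueError(f'unknown connective {op!r}')
```

Note that `all(...)` over an empty collection of accessible worlds returns True and `any(...)` returns False, which is exactly the vacuous-truth behavior of clauses (7) and (8) discussed above.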
Proposition 1.7. 1. M, w ⊩ □A iff M, w ⊩ ¬♦¬A.

2. M, w ⊩ ♦A iff M, w ⊩ ¬□¬A.

Proof. 1. M, w ⊩ ¬♦¬A iff M, w ⊮ ♦¬A by definition of M, w ⊩. M, w ⊩ ♦¬A iff for some w′ with Rww′, M, w′ ⊩ ¬A. Hence, M, w ⊮ ♦¬A iff for all w′ with Rww′, M, w′ ⊮ ¬A. We also have M, w′ ⊮ ¬A iff M, w′ ⊩ A. Together we have M, w ⊩ ¬♦¬A iff for all w′ with Rww′, M, w′ ⊩ A. Again by definition of M, w ⊩, that is the case iff M, w ⊩ □A.

2. Exercise.
1.6 Truth in a Model
Sometimes we are interested in which formulas are true at every world in a given model. Let's introduce a notation for this.
Definition 1.8. A formula A is true in a model M = ⟨W, R, V⟩, written M ⊩ A, if and only if M, w ⊩ A for every w ∈ W.
Proposition 1.9. 1. If M ⊩ A then M ⊮ ¬A, but not vice versa.

2. If M ⊩ A → B then M ⊩ A only if M ⊩ B, but not vice versa.

Proof. 1. If M ⊩ A then A is true at all worlds in W, and since W ≠ ∅, it can't be that M ⊩ ¬A, or else A would have to be both true and false at some world.
On the other hand, if M ⊮ ¬A then A is true at some world w ∈ W. It does not follow that M, w ⊩ A for every w ∈ W. For instance, in the model of Figure 1.1, M ⊮ ¬p, and also M ⊮ p.

2. Assume M ⊩ A → B and M ⊩ A; to show M ⊩ B let w ∈ W be an arbitrary world. Then M, w ⊩ A → B and M, w ⊩ A, so M, w ⊩ B, and since w was arbitrary, M ⊩ B.
To show that the converse fails, we need to find a model M such that M ⊩ A only if M ⊩ B, but M ⊮ A → B. Consider again the model of Figure 1.1: M ⊮ p and hence (vacuously) M ⊩ p only if M ⊩ q. However, M ⊮ p → q, as p is true but q false at w₁.
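Definition 1.8 then amounts to quantifying the evaluator over all worlds; a two-line sketch, assuming the `holds` function from the sketch in section 1.5:

```python
def true_in_model(M, A):
    """M ⊩ A iff M, w ⊩ A for every world w (Definition 1.8)."""
    W, _, _ = M
    return all(holds(M, w, A) for w in W)
```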
1.7 Validity
Formulas that are true in all models, i.e., true at every world in every model, are particularly interesting. They represent those modal propositions which are true regardless of how □ and ♦ are interpreted, as long as the interpretation is "normal" in the sense that it is generated by some accessibility relation on possible worlds. We call such formulas valid. For instance, □(p ∧ q) → □p is valid. Some formulas one might expect to be valid on the basis of the alethic interpretation of □, such as □p → p, are not valid, however. Part of the interest of relational models is that different interpretations of □ and ♦ can be captured by different kinds of accessibility relations. This suggests that we should define validity not just relative to all models, but relative to all models of a certain kind. It will turn out, e.g., that □p → p is true in all models where every world is accessible from itself, i.e., R is reflexive. Defining validity relative to classes of models enables us to formulate this succinctly: □p → p is valid in the class of reflexive models.
Definition 1.10. A formula A is valid in a class C of models if it is true in every model in C (i.e., true at every world in every model in C). If A is valid in C, we write C ⊨ A, and we write ⊨ A if A is valid in the class of all models.
Proposition 1.11. If A is valid in C it is also valid in each class C′ ⊆ C.
Proposition 1.12 (RN). If A is valid, then so is □A.

Proof. Assume ⊨ A. To show ⊨ □A let M = ⟨W, R, V⟩ be a model and w ∈ W. If Rww′ then M, w′ ⊩ A, since A is valid, and so also M, w ⊩ □A. Since M and w were arbitrary, ⊨ □A.
1.8 Tautological Instances
A modal-free formula is a tautology if it is true under every truth-value assignment. Clearly, every tautology is true at every world in every model. But for formulas involving □ and ♦, the notion of tautology is not defined. Is it the case, e.g., that □p ∨ ¬□p—an instance of the principle of excluded middle—is valid? The notion of a tautological instance helps: a formula that is a substitution instance of a (non-modal) tautology. It is not surprising, but still requires proof, that every tautological instance is valid.
Definition 1.13. A modal formula B is a tautological instance if and only if there is a modal-free tautology A with propositional variables p₁, . . . , pₙ and formulas D₁, . . . , Dₙ such that B ≡ A[D₁/p₁, . . . , Dₙ/pₙ].
Lemma 1.14. Suppose A is a modal-free formula whose propositional variables are p₁, . . . , pₙ, and let D₁, . . . , Dₙ be modal formulas. Then for any assignment v, any model M = ⟨W, R, V⟩, and any w ∈ W such that v(pᵢ) = T if and only if M, w ⊩ Dᵢ, we have that v ⊨ A if and only if M, w ⊩ A[D₁/p₁, . . . , Dₙ/pₙ].

Proof. By induction on A.

1. A ≡ ⊥: Both v ⊭ ⊥ and M, w ⊮ ⊥.

2. A ≡ pᵢ:
v ⊨ pᵢ ⇔ v(pᵢ) = T, by definition of v ⊨ pᵢ;
⇔ M, w ⊩ Dᵢ, by assumption;
⇔ M, w ⊩ pᵢ[D₁/p₁, . . . , Dₙ/pₙ], since pᵢ[D₁/p₁, . . . , Dₙ/pₙ] ≡ Dᵢ.

3. A ≡ ¬B:
v ⊨ ¬B ⇔ v ⊭ B, by definition of v ⊨;
⇔ M, w ⊮ B[D₁/p₁, . . . , Dₙ/pₙ], by induction hypothesis;
⇔ M, w ⊩ ¬B[D₁/p₁, . . . , Dₙ/pₙ], by definition of M, w ⊩.

4. A ≡ (B ∧ C):
v ⊨ B ∧ C ⇔ v ⊨ B and v ⊨ C, by definition of v ⊨;
⇔ M, w ⊩ B[D₁/p₁, . . . , Dₙ/pₙ] and M, w ⊩ C[D₁/p₁, . . . , Dₙ/pₙ], by induction hypothesis;
⇔ M, w ⊩ (B ∧ C)[D₁/p₁, . . . , Dₙ/pₙ], by definition of M, w ⊩.

5. A ≡ (B ∨ C):
v ⊨ B ∨ C ⇔ v ⊨ B or v ⊨ C, by definition of v ⊨;
⇔ M, w ⊩ B[D₁/p₁, . . . , Dₙ/pₙ] or M, w ⊩ C[D₁/p₁, . . . , Dₙ/pₙ], by induction hypothesis;
⇔ M, w ⊩ (B ∨ C)[D₁/p₁, . . . , Dₙ/pₙ], by definition of M, w ⊩.

6. A ≡ (B → C):
v ⊨ B → C ⇔ v ⊭ B or v ⊨ C, by definition of v ⊨;
⇔ M, w ⊮ B[D₁/p₁, . . . , Dₙ/pₙ] or M, w ⊩ C[D₁/p₁, . . . , Dₙ/pₙ], by induction hypothesis;
⇔ M, w ⊩ (B → C)[D₁/p₁, . . . , Dₙ/pₙ], by definition of M, w ⊩.
Proposition 1.15. All tautological instances are valid.

Proof. Contrapositively, suppose A is such that M, w ⊮ A[D₁/p₁, . . . , Dₙ/pₙ] for some model M and world w. Define an assignment v such that v(pᵢ) = T if and only if M, w ⊩ Dᵢ (and v assigns arbitrary values to q ∉ {p₁, . . . , pₙ}). Then by Lemma 1.14, v ⊭ A, so A is not a tautology.
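Whether a modal-free formula is a tautology can be decided by brute force over all truth-value assignments. A sketch (hypothetical, not part of the text), reusing the tuple encoding of formulas:

```python
from itertools import product

def is_tautology(A, variables):
    """True iff the modal-free formula A is true under every assignment."""
    def val(B, v):
        op = B[0]
        if op == 'bot': return False
        if op == 'var': return v[B[1]]
        if op == 'not': return not val(B[1], v)
        if op == 'and': return val(B[1], v) and val(B[2], v)
        if op == 'or':  return val(B[1], v) or val(B[2], v)
        if op == '->':  return not val(B[1], v) or val(B[2], v)
    return all(val(A, dict(zip(variables, bits)))
               for bits in product([True, False], repeat=len(variables)))

p, q = ('var', 'p'), ('var', 'q')
print(is_tautology(('or', p, ('not', p)), ['p']))  # excluded middle: True
```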
1.9 Schemas and Validity
Definition 1.16. A schema is a set of formulas comprising all and only the substitution instances of some modal formula C, i.e.,

{B : ∃D₁, . . . , ∃Dₙ B = C[D₁/p₁, . . . , Dₙ/pₙ]}.

The formula C is called the characteristic formula of the schema, and it is unique up to a renaming of the propositional variables. A formula A is an instance of a schema if it is a member of the set.
It is convenient to denote a schema by the meta-linguistic expression obtained by substituting 'A', 'B', . . . , for the atomic components of C. So, for instance, the following denote schemas: 'A', 'A → A', 'A → (B → A)'. They correspond to the characteristic formulas p, p → p, p → (q → p). The schema 'A' denotes the set of all formulas.
Definition 1.17. A schema is true in a model if and only if all of
its instances are; and a schema is valid if and only if it is true in
every model.
Proposition 1.18. The following schema K is valid

□(A → B) → (□A → □B). (K)

Proof. We need to show that all instances of the schema are true at every world in every model. So let M = ⟨W, R, V⟩ and w ∈ W be arbitrary. To show that a conditional is true at a world we assume the antecedent is true and show that the consequent is true as well. In this case, let M, w ⊩ □(A → B) and M, w ⊩ □A. We need to show M, w ⊩ □B. So let w′ be arbitrary such that Rww′. Then by the first assumption M, w′ ⊩ A → B and by the second assumption M, w′ ⊩ A. It follows that M, w′ ⊩ B. Since w′ was arbitrary, M, w ⊩ □B.
Proposition 1.19. The following schema dual is valid

♦A ↔ ¬□¬A. (dual)

Proof. Exercise.
Proposition 1.20. If A and A → B are true at a world in a model
then so is B. Hence, the valid formulas are closed under modus ponens.
Proposition 1.21. A formula A is valid iff all its substitution instances are. In other words, a schema is valid iff its characteristic formula is.

Proof. The "if" direction is obvious, since A is a substitution instance of itself.
To prove the "only if" direction, we show the following: Suppose M = ⟨W, R, V⟩ is a modal model, and B ≡ A[D₁/p₁, . . . , Dₙ/pₙ] is a substitution instance of A. Define M′ = ⟨W, R, V′⟩ by V′(pᵢ) = {w : M, w ⊩ Dᵢ}. Then M, w ⊩ B iff M′, w ⊩ A, for any w ∈ W. (We leave the proof as an exercise.) Now suppose that A was valid, but some substitution instance B of A was not valid. Then for some M = ⟨W, R, V⟩ and some w ∈ W, M, w ⊮ B. But then M′, w ⊮ A by the claim, and A is not valid, a contradiction.
Note, however, that it is not true that a schema is true in a model iff its characteristic formula is. Of course, the "only if" direction holds: if every instance of A is true in M, A itself is true in M. But it may happen that A is true in M but some instance of A is false at some world in M. For a very simple counterexample consider p in a model with only one world w and V(p) = {w}, so that p is true at w. But ⊥ is an instance of p, and not true at w.
Valid Schemas:
□(A → B) → (♦A → ♦B)
♦(A → B) → (□A → ♦B)
□(A ∧ B) ↔ (□A ∧ □B)
□A → □(B → A)
¬♦A → □(A → B)
♦(A ∨ B) ↔ (♦A ∨ ♦B)

Invalid Schemas:
□(A ∨ B) → (□A ∨ □B)
(♦A ∧ ♦B) → ♦(A ∧ B)
A → □A
♦A → □B
□A → □□A
□♦A → ♦□A

Table 1.1: Valid and (or?) invalid schemas.
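Invalidity of a schema is witnessed by a single falsifying world in a single model, and small models usually suffice. The following brute-force sketch (hypothetical, not part of the text) reuses the `holds` evaluator from the sketch in section 1.5 and searches all models on two worlds:

```python
from itertools import product

def find_countermodel(goal, worlds=('w1', 'w2')):
    """Return (M, w) with M, w ⊮ goal, or None if no such two-world model."""
    pairs = [(u, v) for u in worlds for v in worlds]
    for R_bits in product([False, True], repeat=len(pairs)):
        R = {pr for pr, b in zip(pairs, R_bits) if b}
        for V_bits in product([False, True], repeat=len(worlds)):
            V = {'p': {w for w, b in zip(worlds, V_bits) if b}}
            M = (set(worlds), R, V)
            for w in worlds:
                if not holds(M, w, goal):
                    return M, w
    return None

p = ('var', 'p')
print(find_countermodel(('->', p, ('box', p))))  # refutes A → □A (with A = p)
```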
1.10 Entailment
With the definition of truth at a world, we can define an entailment relation between formulas. A formula B entails A iff, whenever B is true, A is true as well. Here, "whenever" means both "whichever model we consider" as well as "whichever world in that model we consider."
Definition 1.22. If Γ is a set of formulas and A a formula, then Γ entails A, in symbols: Γ ⊨ A, if and only if for every model M = ⟨W, R, V⟩ and world w ∈ W, if M, w ⊩ B for every B ∈ Γ, then M, w ⊩ A. If Γ contains a single formula B, then we write B ⊨ A.
Example 1.23. To show that a formula entails another, we have to reason about all models, using the definition of M, w ⊩. For instance, to show p → ♦p ⊨ □¬p → ¬p, we might argue as follows: Consider a model M = ⟨W, R, V⟩ and w ∈ W, and suppose M, w ⊩ p → ♦p. We have to show that M, w ⊩ □¬p → ¬p. Suppose not. Then M, w ⊩ □¬p and M, w ⊮ ¬p. Since M, w ⊮ ¬p, M, w ⊩ p. By assumption, M, w ⊩ p → ♦p, hence M, w ⊩ ♦p. By definition of M, w ⊩ ♦p, there is some w′ with Rww′ such that M, w′ ⊩ p. Since also M, w ⊩ □¬p, M, w′ ⊩ ¬p, a contradiction.

To show that a formula B does not entail another A, we have to give a counterexample, i.e., a model M = ⟨W, R, V⟩ where we show that at some world w ∈ W, M, w ⊩ B but M, w ⊮ A.
[Figure 1.2: Counterexample to p → ♦p ⊨ □p → p. The worlds w₂ and w₃ satisfy p; w₁ does not (the arrows are not recoverable in this version of the text).]
Let's show that p → ♦p ⊭ □p → p. Consider the model in Figure 1.2. We have M, w₁ ⊩ ♦p and hence M, w₁ ⊩ p → ♦p. However, since M, w₁ ⊩ □p but M, w₁ ⊮ p, we have M, w₁ ⊮ □p → p.

Often very simple counterexamples suffice. The model M′ = ⟨W′, R′, V′⟩ with W′ = {w}, R′ = ∅, and V′(p) = ∅ is also a counterexample: Since M′, w ⊮ p, M′, w ⊩ p → ♦p. As no worlds are accessible from w, we have M′, w ⊩ □p, and so M′, w ⊮ □p → p.
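Such countermodel claims can be checked mechanically with the `holds` evaluator sketched in section 1.5. Since the arrows of Figure 1.2 did not survive in this version of the text, the model below is a reconstruction on which the same argument goes through:

```python
# A reconstruction of Figure 1.2: w1 is a ¬p-world all of whose
# successors are p-worlds, so □p holds at w1 while p fails there.
M = ({'w1', 'w2', 'w3'},
     {('w1', 'w2'), ('w1', 'w3')},
     {'p': {'w2', 'w3'}})
p = ('var', 'p')
print(holds(M, 'w1', ('->', p, ('dia', p))))  # p → ♦p: True at w1
print(holds(M, 'w1', ('->', ('box', p), p)))  # □p → p: False at w1
```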
Problems
Problem 1.1. Consider the model of Figure 1.1. Which of the following hold?

1. M, w₁ ⊩ q;

2. M, w₃ ⊩ ¬q;

3. M, w₁ ⊩ p ∨ q;

4. M, w₁ ⊩ □(p ∨ q);

5. M, w₃ ⊩ □q;

6. M, w₃ ⊩ □⊥;

7. M, w₁ ⊩ ♦q;

8. M, w₁ ⊩ □q;

9. M, w₁ ⊩ ¬□¬□q.
Problem 1.2. Complete the proof of Proposition 1.7.
Problem 1.3. Let M = ⟨W, R, V⟩ be a model, and suppose w₁, w₂ ∈ W are such that:

1. w₁ ∈ V(p) if and only if w₂ ∈ V(p); and

2. for all w ∈ W: Rw₁w if and only if Rw₂w.

Using induction on formulas, show that for all formulas A: M, w₁ ⊩ A if and only if M, w₂ ⊩ A.
Problem 1.4. Let M = ⟨W, R, V⟩ be a model. Show that M, w ⊩ ¬♦A if and only if M, w ⊩ □¬A.
Problem 1.5. Consider the following model M for the language comprising p₁, p₂, and p₃ as the only propositional variables:

[Diagram: three worlds, with w₁ satisfying p₁, ¬p₂, ¬p₃; w₂ satisfying p₁, p₂, ¬p₃; and w₃ satisfying p₁, p₂, p₃; the arrows are not recoverable in this version of the text.]

Are the following formulas and schemas true in the model M, i.e., true at every world in M? Explain.

1. p → ♦p (for p atomic);

2. A → ♦A (for A arbitrary);

3. □p → p (for p atomic);

4. ¬p → ♦p (for p atomic);

5. ♦A (for A arbitrary);

6. ♦p (for p atomic).
Problem 1.6. Show that the following are valid:

1. ⊨ p → (q → p);

2. ⊨ □¬⊥;

3. ⊨ □p → □(q → p).
Problem 1.7. Show that A → □A is valid in the class C of models M = ⟨W, R, V⟩ where W = {w}. Similarly, show that B → □A and ♦A → B are valid in the class of models M = ⟨W, R, V⟩ where R = ∅.
Problem 1.8. Prove Proposition 1.19.
Problem 1.9. Prove the claim in the “only if” part of the proof
of Proposition 1.21. (Hint: use induction on A.)
Problem 1.10. Show that none of the following formulas are valid:

D: □p → ♦p;

T: □p → p;

B: p → □♦p;

4: □p → □□p;

5: ♦p → □♦p.
Problem 1.11. Prove that the schemas in the first column of table 1.1 are valid and those in the second column are not valid.
Problem 1.12. Decide whether the following schemas are valid or invalid:

1. (♦A → □B) → □(A → B);

2. ♦(A → B) ∨ □(B → A).
Problem 1.13. For each of the following schemas find a model
M such that every instance of the formula is true in M:
1. p → ♦♦p;
2. ♦p → p.
Problem 1.14. Show that □(A ∧ B) ⊨ □A.
Problem 1.15. Show that □(p → q) ⊭ p → □q and p → □q ⊭ □(p → q).
CHAPTER 2
Frame Definability
2.1 Introduction
One question that interests modal logicians is the relationship between the accessibility relation and the truth of certain formulas in models with that accessibility relation. For instance, suppose the accessibility relation is reflexive, i.e., for every w ∈ W, Rww. In other words, every world is accessible from itself. That means that when □A is true at a world w, w itself is among the accessible worlds at which A must therefore be true. So, if the accessibility relation R of M is reflexive, then whatever world w and formula A we take, □A → A will be true there (in other words, the schema □p → p and all its substitution instances are true in M).

The converse, however, is false. It's not the case, e.g., that if □p → p is true in M, then R is reflexive. For we can easily find a non-reflexive model M where □p → p is true at all worlds: take the model with a single world w, not accessible from itself, but with w ∈ V(p). By picking the truth value of p suitably, we can make □A → A true in a model that is not reflexive.

The solution is to remove the variable assignment V from the equation. If we require that □p → p is true at all worlds in M, regardless of which worlds are in V(p), then it is necessary that R is reflexive. For in any non-reflexive model, there will be at least one world w such that not Rww. If we set V(p) = W \ {w}, then p will be true at all worlds other than w, and so at all worlds accessible from w (since w is guaranteed not to be accessible from w, and w is the only world where p is false). On the other hand, p is false at w, so □p → p is false at w.
This suggests that we should introduce a notation for model structures without a valuation: we call these frames. A frame F is simply a pair ⟨W, R⟩ consisting of a set of worlds with an accessibility relation. Every model ⟨W, R, V⟩ is then, as we say, based on the frame ⟨W, R⟩. Conversely, a frame determines the class of models based on it; and a class of frames determines the class of models which are based on any frame in the class. And we can define F ⊩ A, the notion of a formula being valid in a frame as: M ⊩ A for all M based on F.

With this notation, we can establish correspondence relations between formulas and classes of frames: e.g., F ⊩ □p → p if, and only if, F is reflexive.
2.2 Properties of Accessibility Relations
Many modal formulas turn out to be characteristic of simple, and even familiar, properties of the accessibility relation. In one direction, that means that any model that has a given property makes a corresponding formula (and all its substitution instances) true. We begin with five classical examples of kinds of accessibility relations and the formulas the truth of which they guarantee.
Theorem 2.1. Let M = ⟨W, R, V⟩ be a model. If R has the property on the left side of table 2.1, every instance of the formula on the right side is true in M.
If R is . . .                                then . . . is true in M:
serial (∀u∃v Ruv):                           □p → ♦p   (D)
reflexive (∀w Rww):                          □p → p    (T)
symmetric (∀u∀v(Ruv → Rvu)):                 p → □♦p   (B)
transitive (∀u∀v∀w((Ruv ∧ Rvw) → Ruw)):      □p → □□p  (4)
euclidean (∀w∀u∀v((Rwu ∧ Rwv) → Ruv)):       ♦p → □♦p  (5)
Table 2.1: Five correspondence facts.

[Figure 2.1: The argument from symmetry. Arrows run from w to w′ and from w′ back to w; A and □♦A hold at w, ♦A at w′.]

Proof. Here is the case for B: to show that the schema is true in a model we need to show that all of its instances are true at all worlds in the model. So let A → □♦A be a given instance of B, and let w ∈ W be an arbitrary world. Suppose the antecedent A is true at w, in order to show that □♦A is true at w. So we need to show that ♦A is true at all w′ accessible from w. Now, for any w′ such that Rww′ we have, using the hypothesis of symmetry, that also Rw′w (see Figure 2.1). Since M, w ⊩ A, we have M, w′ ⊩ ♦A. Since w′ was an arbitrary world such that Rww′, we have M, w ⊩ □♦A.

We leave the other cases as exercises.
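For a finite frame, each left-hand property of table 2.1 is directly checkable by spelling out its first-order condition. A sketch of such checks (hypothetical, not part of the text), with R a set of pairs as in the earlier sketches:

```python
def serial(W, R):
    return all(any((u, v) in R for v in W) for u in W)

def reflexive(W, R):
    return all((w, w) in R for w in W)

def symmetric(W, R):
    return all((v, u) in R for (u, v) in R)

def transitive(W, R):
    # whenever Ruv and Rvw, require Ruw
    return all((u, w) in R for (u, v) in R for (x, w) in R if x == v)

def euclidean(W, R):
    # whenever Rwu and Rwv, require Ruv
    return all((u, v) in R for (w, u) in R for (x, v) in R if x == w)
```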
Notice that the converse implications of Theorem 2.1 do not hold: it's not true that if a model verifies a schema, then the accessibility relation of that model has the corresponding property. In the case of T and reflexive models, it is easy to give an example of a model in which T itself fails: let W = {w}, R = ∅, and V(p) = ∅. Then R is not reflexive, but M, w ⊩ □p and M, w ⊮ p. But here we have just a single instance of T that fails in M; other instances, e.g., □¬p → ¬p, are true. It is harder to give examples where every substitution instance of T is true in M and M is not reflexive. But there are such models, too:
Proposition 2.2. Let M = ⟨W, R, V⟩ be a model such that W = {u, v}, where worlds u and v are related by R: i.e., both Ruv and Rvu. Suppose that for all p: u ∈ V(p) ⇔ v ∈ V(p). Then:

1. For all A: M, u ⊩ A if and only if M, v ⊩ A (use induction on A).

2. Every instance of T is true in M.

Since M is not reflexive (it is, in fact, irreflexive), the converse of Theorem 2.1 fails in the case of T (similar arguments can be given for some—though not all—of the other schemas mentioned in Theorem 2.1).
Although we will focus on the five classical formulas D, T, B, 4, and 5, we record in table 2.2 a few more properties of accessibility relations. The accessibility relation R is partially functional, if from every world at most one world is accessible. If it is the case that from every world exactly one world is accessible, we call it functional. (Thus the functional relations are precisely those that are both serial and partially functional.) They are called "functional" because the accessibility relation operates like a (partial) function. A relation is weakly dense if whenever Ruv, there is a w "between" u and v. So weakly dense relations are in a sense the opposite of transitive relations: in a transitive relation, whenever you can reach v from u by a detour via w, you can reach v from u directly; in a weakly dense relation, whenever you can reach v from u directly, you can also reach it by a detour via some w. A relation is weakly directed if whenever you can reach worlds u and v from some world w, you can reach a single world t from both u and v—this is sometimes called the "diamond property" or "confluence."
If R is . . .                                            then . . . is true in M:
partially functional (∀w∀u∀v((Rwu ∧ Rwv) → u = v)):      ♦p → □p
functional (∀w∃v∀u(Rwu ↔ u = v)):                        ♦p ↔ □p
weakly dense (∀u∀v(Ruv → ∃w(Ruw ∧ Rwv))):                □□p → □p
weakly connected (∀w∀u∀v((Rwu ∧ Rwv) →
  (Ruv ∨ u = v ∨ Rvu))):                                 □((p ∧ □p) → q) ∨ □((q ∧ □q) → p)  (L)
weakly directed (∀w∀u∀v((Rwu ∧ Rwv) → ∃t(Rut ∧ Rvt))):   ♦□p → □♦p  (G)
Table 2.2: Five more correspondence facts.
2.3 Frames
Definition 2.3. A frame is a pair F = ⟨W, R⟩ where W is a non-empty set of worlds and R a binary relation on W. A model M is based on a frame F = ⟨W, R⟩ if and only if M = ⟨W, R, V⟩.

Definition 2.4. If F is a frame, we say that A is valid in F, F ⊩ A, if M ⊩ A for every model M based on F.

If ℱ is a class of frames, we say A is valid in ℱ, ℱ ⊩ A, iff F ⊩ A for every frame F ∈ ℱ.
The reason frames are interesting is that correspondence between schemas and properties of the accessibility relation R is at the level of frames, not of models. For instance, although T is true in all reflexive models, not every model in which T is true is reflexive. However, it is true that not only is T valid on all reflexive frames, but also every frame in which T is valid is reflexive.
Remark 1. Validity in a class of frames is a special case of the notion of validity in a class of models: ℱ ⊩ A iff C ⊨ A where C is the class of all models based on a frame in ℱ.

Obviously, if a formula or a schema is valid, i.e., valid with respect to the class of all models, it is also valid with respect to any class ℱ of frames.
2.4 Frame Definability
Even though the converse implications of Theorem 2.1 fail, they hold if we replace "model" by "frame": for the properties considered in Theorem 2.1, it is true that if a formula is valid in a frame then the accessibility relation of that frame has the corresponding property. So, the formulas considered define the classes of frames that have the corresponding property.
Definition 2.5. If C is a class of frames, we say A defines C iff F ⊩ A for all and only frames F ∈ C.
We now proceed to establish the full definability results for
frames.
Theorem 2.6. If the formula on the right side of table 2.1 is valid in
a frame F, then F has the property on the left side.
Proof. 1. Suppose D is valid in F = ⟨W, R⟩, i.e., F ⊩ □p → ♦p. Let M = ⟨W, R, V⟩ be a model based on F, and w ∈ W. We have to show that there is a v such that Rwv. Suppose not: then both M, w ⊩ □A and M, w ⊮ ♦A for any A, including p. But then M, w ⊮ □p → ♦p, contradicting the assumption that F ⊩ □p → ♦p.

2. Suppose T is valid in F, i.e., F ⊩ □p → p. Let w ∈ W be an arbitrary world; we need to show Rww. Let u ∈ V(p) if and only if Rwu (when q is other than p, V(q) is arbitrary, say V(q) = ∅). Let M = ⟨W, R, V⟩. By construction, for all u such that Rwu: M, u ⊩ p, and hence M, w ⊩ □p. But by hypothesis □p → p is true at w, so that M, w ⊩ p, but by definition of V this is possible only if Rww.

3. We prove the contrapositive: Suppose F is not symmetric, we show that B, i.e., p → □♦p, is not valid in F = ⟨W, R⟩. If F is not symmetric, there are u, v ∈ W such that Ruv but not Rvu. Define V such that w ∈ V(p) if and only if not Rvw (and V is arbitrary otherwise). Let M = ⟨W, R, V⟩. Now, by definition of V, M, w ⊩ p for all w such that not Rvw, in particular, M, u ⊩ p since not Rvu. Also, since Rvw iff w ∉ V(p), there is no w such that Rvw and M, w ⊩ p, and hence M, v ⊮ ♦p. Since Ruv, also M, u ⊮ □♦p. It follows that M, u ⊮ p → □♦p, and so B is not valid in F.

4. Suppose 4 is valid in F = ⟨W, R⟩, i.e., F ⊩ □p → □□p, and let u, v, w ∈ W be arbitrary worlds such that Ruv and Rvw; we need to show that Ruw. Define V such that z ∈ V(p) if and only if Ruz (and V is arbitrary otherwise). Let M = ⟨W, R, V⟩. By definition of V, M, z ⊩ p for all z such that Ruz, and hence M, u ⊩ □p. But by hypothesis 4, □p → □□p, is true at u, so that M, u ⊩ □□p. Since Ruv and Rvw, we have M, w ⊩ p, but by definition of V this is possible only if Ruw, as desired.

5. We proceed contrapositively, assuming that the frame F = ⟨W, R⟩ is not euclidean, and show that it falsifies 5, i.e., F ⊮ ♦p → □♦p. Suppose there are worlds u, v, w ∈ W such that Rwu and Rwv but not Ruv. Define V such that for all worlds z, z ∈ V(p) if and only if it is not the case that Ruz. Let M = ⟨W, R, V⟩. Then by hypothesis M, v ⊩ p and since Rwv also M, w ⊩ ♦p. However, there is no world y such that Ruy and M, y ⊩ p, so M, u ⊮ ♦p. Since Rwu, it follows that M, w ⊮ □♦p, so that 5, ♦p → □♦p, fails at w.
You'll notice a difference between the proof for D and the other cases: no mention was made of the valuation V. In effect, we proved that if M ⊩ D then M is serial. So D defines the class of serial models, not just frames.
Corollary 2.7. Any model where D is true is serial.
Corollary 2.8. Each formula on the right side of table 2.1 defines the class of frames which have the property on the left side.

Proof. In Theorem 2.1, we proved that if a model has the property on the left, the formula on the right is true in it. Thus, if a frame F has the property on the left, the formula on the right is valid in F. In Theorem 2.6, we proved the converse implications: if a formula on the right is valid in F, F has the property on the left.
Theorem 2.6 also shows that the properties can be combined: for instance if both B and 4 are valid in F then the frame is both symmetric and transitive, etc. Many important modal logics are characterized as the set of formulas valid in all frames that combine some frame properties, and so we can characterize them as the set of formulas valid in all frames in which the corresponding defining formulas are valid. For instance, the classical system S4 is the set of all formulas valid in all reflexive and transitive frames, i.e., in all those where both T and 4 are valid. S5 is the set of all formulas valid in all reflexive, symmetric, and euclidean frames, i.e., all those where all of T, B, and 5 are valid.
Logical relationships between properties of R in general correspond to relationships between the corresponding defining formulas. For instance, every reflexive relation is serial; hence, whenever T is valid in a frame, so is D. (Note that this relationship is not that of entailment. It is not the case that whenever M, w ⊩ T then M, w ⊩ D.) We record some such relationships.
Proposition 2.9. Let R be a binary relation on a set W ; then:
1. If R is reflexive, then it is serial.
2. If R is symmetric, then it is transitive if and only if it is euclidean.
3. If R is symmetric or euclidean then it is weakly directed (it has
the “diamond property”).
4. If R is euclidean then it is weakly connected.
5. If R is functional then it is serial.
2.5 First-order Definability
We've seen that a number of properties of accessibility relations of frames can be defined by modal formulas. For instance, symmetry of frames can be defined by the formula B, p → □♦p. The conditions we've encountered so far can all be expressed by first-order formulas in a language involving a single two-place predicate symbol. For instance, symmetry is defined by ∀x∀y(Q(x, y) → Q(y, x)) in the sense that a first-order structure M with |M| = W and Qᴹ = R satisfies the preceding formula iff R is symmetric. This suggests the following definition:
Definition 2.10. A class C of frames is first-order definable if there is a sentence A in the first-order language with a single two-place predicate symbol Q such that F = ⟨W, R⟩ ∈ C iff M ⊨ A in the first-order structure M with |M| = W and Qᴹ = R.
It turns out that the properties, and the modal formulas that define them, considered so far are exceptional. Not every modal formula defines a first-order definable class of frames, and not every first-order definable class of frames is definable by a modal formula.
A counterexample to the first is given by the Löb formula:

□(□p → p) → □p. (W)
W defines the class of transitive and converse well-founded frames. A relation is well-founded if there is no infinite sequence w₁, w₂, . . . such that Rw₂w₁, Rw₃w₂, . . . . For instance, the relation < on ℕ is well-founded, whereas the relation < on ℤ is not. A relation is converse well-founded iff its converse is well-founded. So converse well-founded relations are those where there is no infinite sequence w₁, w₂, . . . such that Rw₁w₂, Rw₂w₃, . . . .

There is, however, no first-order formula defining transitive converse well-founded relations. For suppose M ⊨ F iff R = Qᴹ is transitive and converse well-founded. Let Aₙ be the formula

(Q(a₁, a₂) ∧ · · · ∧ Q(aₙ₋₁, aₙ))

Now consider the set of formulas

Γ = {F, A₁, A₂, . . .}.

Every finite subset of Γ is satisfiable: Let k be the largest index such that Aₖ is in the subset, and let Mₖ be the structure with |Mₖ| = {1, . . . , k}, which interprets each aᵢ as i and Q as <. Since < on {1, . . . , k} is transitive and converse well-founded, Mₖ ⊨ F. Mₖ ⊨ Aᵢ by construction, for all i ≤ k. By the Compactness Theorem for first-order logic, Γ is satisfiable in some structure M. By hypothesis, since M ⊨ F, the relation Qᴹ is converse well-founded. But clearly, the interpretations of a₁, a₂, . . . in M would form an infinite sequence of the kind ruled out by converse well-foundedness.
A counterexample to the second claim is given by the property of universality: for every u and v, Ruv. Universal frames are first-order definable by the formula ∀x∀y Q(x, y). However, no modal formula is valid in all and only the universal frames. This is a consequence of a result that is independently interesting: the formulas valid in universal frames are exactly the same as those valid in reflexive, symmetric, and transitive frames. There are reflexive, symmetric, and transitive frames that are not universal, hence every formula valid in all universal frames is also valid in some non-universal frames.
2.6 Equivalence Relations and S5
The modal logic S5 is characterized as the set of formulas valid on all universal frames, i.e., frames where every world is accessible from every world, including itself. In such a scenario, □ corresponds to necessity and ♦ to possibility: □A is true if A is true at every world, and ♦A is true if A is true at some world. It turns out that S5 can also be characterized as the formulas valid on all reflexive, symmetric, and transitive frames, i.e., on all equivalence relations.
Definition 2.11. A binary relation R on W is an equivalence re-
lation if and only if it is reflexive, symmetric and transitive. A
relation R on W is universal if and only if Ruv for all u, v ∈ W .
Since T, B, and 4 characterize the reflexive, symmetric, and transitive frames, the frames where the accessibility relation is an equivalence relation are exactly those in which all three formulas are valid. It turns out that the equivalence relations can also be characterized by other combinations of formulas, since the conditions with which we've defined equivalence relations are equivalent to combinations of other familiar conditions on R.
Proposition 2.12. The following are equivalent:
1. R is an equivalence relation;
2. R is reflexive and euclidean;
3. R is serial, symmetric, and euclidean;
4. R is serial, symmetric, and transitive.
Proof. Exercise.
Proposition 2.12 is the semantic counterpart to Proposition 3.29, in that it gives equivalent characterizations of the modal logic of frames over which R is an equivalence relation (the logic traditionally referred to as S5).
What is the relationship between universal and equivalence relations? Although every universal relation is an equivalence relation, clearly not every equivalence relation is universal. However, the formulas valid on all universal relations are exactly the same as those valid on all equivalence relations.
Proposition 2.13. Let R be an equivalence relation, and for each w ∈ W define the equivalence class of w as the set [w] = {w′ ∈ W : Rww′}. Then:

1. w ∈ [w];

2. R is universal on each equivalence class [w];

3. The collection of equivalence classes partitions W into mutually exclusive and jointly exhaustive subsets.
Proposition 2.14. A formula A is valid in all frames F = ⟨W, R⟩ where R is an equivalence relation, if and only if it is valid in all frames F = ⟨W, R⟩ where R is universal. Hence, the logic of universal frames is just S5.
Proof. It's immediate to verify that a universal relation R on W is an equivalence. Hence, if A is valid in all frames where R is an equivalence it is valid in all universal frames. For the other direction, we argue contrapositively: suppose B is a formula that fails at a world w in a model M = ⟨W, R, V⟩ based on a frame ⟨W, R⟩, where R is an equivalence on W. So M, w ⊮ B. Define a model M′ = ⟨W′, R′, V′⟩ as follows:

1. W′ = [w];

2. R′ is universal on W′;

3. V′(p) = V(p) ∩ W′.

[Figure 2.2: A partition of W into equivalence classes [w], [u], [v], and [z]; the class [w] is shaded.]

(So the set W′ of worlds in M′ is represented by the shaded area in Figure 2.2.) It is easy to see that R and R′ agree on W′. Then one can show by induction on formulas that for all w′ ∈ W′: M′, w′ ⊩ A if and only if M, w′ ⊩ A for each A (this makes sense since W′ ⊆ W). In particular, M′, w ⊮ B, and B fails in a model based on a universal frame.
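For finite models, the construction of M′ in this proof can be carried out directly. A sketch (hypothetical, not part of the text), in the (W, R, V) encoding used in the chapter 1 sketches:

```python
def restrict_to_class(M, w):
    """The model M′ of Proposition 2.14: domain [w], universal R′, V restricted."""
    W, R, V = M
    Wc = {v for v in W if (w, v) in R}                # [w]; contains w, as R is reflexive
    Rc = {(u, v) for u in Wc for v in Wc}             # universal on [w]
    Vc = {p: worlds & Wc for p, worlds in V.items()}  # V′(p) = V(p) ∩ W′
    return (Wc, Rc, Vc)
```

By Proposition 2.13, R already relates every pair of worlds in [w], so R and Rc agree on Wc, as the proof requires.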
2.7 Second-order Definability
Not every frame property definable by modal formulas is first-order definable. However, if we allow quantification over one-place predicates (i.e., monadic second-order quantification), we can define all modally definable frame properties. The trick is to exploit a systematic way in which the conditions under which a modal formula is true at a world are related to first-order formulas. This is the so-called standard translation of modal formulas into first-order formulas in a language containing not just a two-place predicate symbol Q for the accessibility relation, but also a one-place predicate symbol Pᵢ for each propositional variable pᵢ occurring in A.
Definition 2.15. The standard translation STx(A) is inductively defined as follows:

1. A ≡ ⊥: STx(A) = ⊥.

2. A ≡ pᵢ: STx(A) = Pᵢ(x).

3. A ≡ ¬B: STx(A) = ¬STx(B).

4. A ≡ (B ∧ C): STx(A) = (STx(B) ∧ STx(C)).

5. A ≡ (B ∨ C): STx(A) = (STx(B) ∨ STx(C)).

6. A ≡ (B → C): STx(A) = (STx(B) → STx(C)).

7. A ≡ □B: STx(A) = ∀y(Q(x, y) → STy(B)).

8. A ≡ ♦B: STx(A) = ∃y(Q(x, y) ∧ STy(B)).
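The standard translation is again a simple recursion, generating a fresh bound variable each time a modal operator is translated. A sketch (hypothetical, not part of the text) that emits STx(A) as a string, in the tuple encoding of the earlier sketches:

```python
def ST(A, x, n=0):
    """Standard translation STx(A) of Definition 2.15, as a string."""
    op = A[0]
    if op == 'bot':
        return '⊥'
    if op == 'var':
        return f'P_{A[1]}({x})'
    if op == 'not':
        return f'¬{ST(A[1], x, n)}'
    if op in ('and', 'or', '->'):
        sym = {'and': '∧', 'or': '∨', '->': '→'}[op]
        return f'({ST(A[1], x, n)} {sym} {ST(A[2], x, n)})'
    y = f'y{n}'                       # fresh variable for the new quantifier
    if op == 'box':
        return f'∀{y}(Q({x},{y}) → {ST(A[1], y, n + 1)})'
    return f'∃{y}(Q({x},{y}) ∧ {ST(A[1], y, n + 1)})'

p = ('var', 'p')
print(ST(('->', ('box', p), p), 'x'))  # (∀y0(Q(x,y0) → P_p(y0)) → P_p(x))
```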
For instance, STx(□p → p) is ∀y(Q(x, y) → P(y)) → P(x). Any structure for the language of STx(A) requires a domain, a two-place relation assigned to Q, and subsets of the domain assigned to the one-place predicate symbols Pᵢ. In other words, the components of such a structure are exactly those of a model for A: the domain is the set of worlds, the two-place relation assigned to Q is the accessibility relation, and the subsets assigned to Pᵢ are just the assignments V(pᵢ). It won't surprise that satisfaction of A in a modal model and of STx(A) in the corresponding structure agree:
Proposition 2.16. Let M = ⟨W, R, V⟩, let M′ be the first-order structure with |M′| = W which interprets Q as R and each Pᵢ as V(pᵢ), and let s(x) = w. Then

M, w ⊩ A iff M′, s ⊨ STx(A).

Proof. By induction on A.
Proposition 2.17. Suppose A is a modal formula and F = ⟨W, R⟩ is a frame. Let F′ be the first-order structure with |F′| = W and Q^F′ = R, and let A′ be the second-order formula
∀X1 . . . ∀Xn ∀x STx(A)[X1/P1, . . . , Xn/Pn],
where P1, . . . , Pn are all the one-place predicate symbols in STx(A). Then
F ⊩ A iff F′ ⊨ A′.
Proof. F′ ⊨ A′ iff for every structure M′ where Pi^M′ ⊆ W for i = 1, . . . , n, and for every s with s(x) ∈ W, M′, s ⊨ STx(A). By Proposition 2.16, that is the case iff for all models M based on F and every world w ∈ W, M, w ⊩ A, i.e., F ⊩ A.
Definition 2.18. A class C of frames is second-order definable if there is a sentence A in the second-order language with a single two-place predicate symbol P and quantifiers only over monadic set variables, such that F = ⟨W, R⟩ ∈ C iff M ⊨ A in the structure M with |M| = W and P^M = R.
Corollary 2.19. If a class of frames is definable by a formula A, the
corresponding class of accessibility relations is definable by a monadic
second-order sentence.
Proof. The monadic second-order sentence A′ of the preceding proof has the required property.
As an example, consider again the formula □p → p. It defines reflexivity. Reflexivity is of course first-order definable by the sentence ∀x Q(x, x). But it is also definable by the monadic second-order sentence
∀X ∀x (∀y (Q(x, y) → X(y)) → X(x)).
This means, of course, that the two sentences are equivalent. Here’s how you might convince yourself of this directly: First suppose the second-order sentence is true in a structure M. Since x and X are universally quantified, the remainder must hold for any x ∈ W and any set X ⊆ W, e.g., the set {z : Rxz}, where R = Q^M. So, for any s with s(x) ∈ W and s(X) = {z : Rxz} we have M, s ⊨ ∀y (Q(x, y) → X(y)) → X(x). But by the way we’ve picked s(X), that means M, s ⊨ ∀y (Q(x, y) → Q(x, y)) → Q(x, x), which is equivalent to Q(x, x) since the antecedent is valid. Since s(x) is arbitrary, we have M ⊨ ∀x Q(x, x).
Now suppose that M ⊨ ∀x Q(x, x); we show that M ⊨ ∀X ∀x (∀y (Q(x, y) → X(y)) → X(x)). Pick any assignment s, and assume M, s ⊨ ∀y (Q(x, y) → X(y)). Let s′ be the y-variant of s with s′(y) = x; we have M, s′ ⊨ Q(x, y) → X(y), i.e., M, s ⊨ Q(x, x) → X(x). Since M ⊨ ∀x Q(x, x), the antecedent is true, and we have M, s ⊨ X(x), which is what we needed to show.
Since some modally definable classes of frames are not first-order definable, not every monadic second-order sentence of the form A′ is equivalent to a first-order sentence. There is no effective method to decide which ones are.
Problems
Problem 2.1. Complete the proof of Theorem 2.1.
Problem 2.2. Prove the claims in Proposition 2.2.
Problem 2.3. Let M = ⟨W, R, V⟩ be a model. Show that if R satisfies the left-hand properties of table 2.2, every instance of the corresponding right-hand formula is true in M.
Problem 2.4. Show that if the formula on the right side of ta-
ble 2.2 is valid in a frame F, then F has the property on the left
side. To do this, consider a frame that does not satisfy the prop-
erty on the left, and define a suitable V such that the formula on
the right is false at some world.
Problem 2.5. Prove Proposition 2.9.
Problem 2.6. Prove Proposition 2.12 by showing:
1. If R is symmetric and transitive, it is euclidean.
2. If R is reflexive, it is serial.
3. If R is reflexive and euclidean, it is symmetric.
4. If R is symmetric and euclidean, it is transitive.
5. If R is serial, symmetric, and transitive, it is reflexive.
Explain why this suffices for the proof that the conditions are
equivalent.
CHAPTER 3
Axiomatic Derivations
3.1 Introduction
We have a semantics for the basic modal language in terms of
modal models, and a notion of a formula being valid—true at
all worlds in all models—or valid with respect to some class of
models or frames—true at all worlds in all models in the class, or
based on the frame. Logic usually connects such semantic charac-
terizations of validity with a proof-theoretic notion of derivability.
The aim is to define a notion of derivability in some system such
that a formula is derivable iff it is valid.
The simplest and historically oldest derivation systems are so-called Hilbert-type or axiomatic derivation systems. Hilbert-type derivation systems for many modal logics are relatively easy to construct: they are simple as objects of metatheoretical study (e.g., to prove soundness and completeness). However, they are much harder to use when proving formulas than, say, natural deduction systems.
In Hilbert-type derivation systems, a derivation of a formula is
a sequence of formulas leading from certain axioms, via a handful
of inference rules, to the formula in question. Since we want the
derivation system to match the semantics, we have to guarantee that derivable formulas are true in all models (or true in all models in which all axioms are true). We’ll first isolate some properties of modal logics that are necessary for this to work: the “normal” modal logics. For normal modal logics, there are only two inference rules that need to be assumed: modus ponens and necessitation. As axioms we take all substitution instances of tautologies, and, depending on the modal logic we deal with, a number of modal axioms. Even if we are just interested in the class of all models, we must also count all substitution instances of K and dual as axioms. This alone generates the minimal normal modal logic K.
Definition 3.1. The rule of modus ponens is the inference schema

A    A → B
―――――――――― mp
B

We say a formula B follows from formulas A, C by modus ponens iff C ≡ A → B.
Definition 3.2. The rule of necessitation is the inference schema

A
――― nec
□A

We say the formula B follows from the formula A by necessitation iff B ≡ □A.
Definition 3.3. A derivation from a set of axioms Σ is a sequence of formulas B1, B2, . . . , Bn, where each Bi is either
1. a substitution instance of a tautology, or
2. a substitution instance of a formula in Σ, or
3. follows from two formulas Bj, Bk with j, k < i by modus ponens, or
4. follows from a formula Bj with j < i by necessitation.
If there is such a derivation with Bn ≡ A, we say that A is derivable from Σ, in symbols Σ ⊢ A.
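Note that clause (1) is effectively checkable: whether a formula is a substitution instance of a tautology can be decided by treating its □’ed and ♦’ed subformulas as unanalyzed atoms and running an ordinary truth-table test. A small sketch in Python (the tuple representation and names are our own illustration):

```python
from itertools import product

def modal_atoms(f, acc=None):
    """Collect the subformulas that propositional logic treats as atoms:
    propositional variables and formulas starting with box or diamond."""
    if acc is None:
        acc = set()
    op = f[0]
    if op in ('var', 'box', 'dia'):
        acc.add(f)
    elif op == 'not':
        modal_atoms(f[1], acc)
    elif op in ('and', 'or', 'imp'):
        modal_atoms(f[1], acc)
        modal_atoms(f[2], acc)
    return acc  # ('bot',) contributes no atoms

def eval_prop(f, val):
    """Evaluate f propositionally, where val assigns truth values to atoms."""
    if f in val:
        return val[f]
    op = f[0]
    if op == 'bot': return False
    if op == 'not': return not eval_prop(f[1], val)
    if op == 'and': return eval_prop(f[1], val) and eval_prop(f[2], val)
    if op == 'or':  return eval_prop(f[1], val) or eval_prop(f[2], val)
    return (not eval_prop(f[1], val)) or eval_prop(f[2], val)  # 'imp'

def is_tautological_instance(f):
    atoms = sorted(modal_atoms(f), key=repr)
    return all(eval_prop(f, dict(zip(atoms, vals)))
               for vals in product([False, True], repeat=len(atoms)))

bp = ('box', ('var', 'p'))
assert is_tautological_instance(('imp', bp, bp))   # □p → □p
assert not is_tautological_instance(('box', ('imp', ('var', 'p'), ('var', 'p'))))
```

The last line illustrates that □(p → p), though derivable (by nec), is not itself a tautological instance: propositional logic sees it as a single atom.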
With this definition, it will turn out that the set of derivable
formulas forms a normal modal logic, and that any derivable for-
mula is true in every model in which every axiom is true. This
property of derivations is called soundness. The converse, com-
pleteness, is harder to prove.
3.2 Normal Modal Logics
Not every set of modal formulas can easily be characterized as those formulas derivable from a set of axioms. We want modal logics to be well-behaved. First of all, everything we can derive in classical propositional logic should still be derivable, of course taking into account that the formulas may now also contain □ and ♦. To this end, we require that a modal logic contain all tautological instances and be closed under modus ponens.
Definition 3.4. A modal logic is a set Σ of modal formulas which
1. contains all tautologies,
2. is closed under substitution, i.e., if A ∈ Σ and D1, . . . , Dn are formulas, then
A[D1/p1, . . . , Dn/pn] ∈ Σ, and
3. is closed under modus ponens, i.e., if A ∈ Σ and A → B ∈ Σ, then B ∈ Σ.
In order to use the relational semantics for modal logics, we also have to require that all formulas valid in all modal models are included. It turns out that this requirement is met as soon as all instances of K and dual are derivable, and whenever a formula A is derivable, so is □A. A modal logic that satisfies these conditions is called normal. (Of course, there are also non-normal modal logics, but the usual relational models are not adequate for them.)
Definition 3.5. A modal logic Σ is normal if it contains
□(p → q) → (□p → □q), (K)
♦p ↔ ¬□¬p, (dual)
and is closed under necessitation, i.e., if A ∈ Σ, then □A ∈ Σ.
Observe that while tautological implication is “fine-grained”
enough to preserve truth at a world, the rule nec only preserves
truth in a model (and hence also validity in a frame or in a class
of frames).
Proposition 3.6. Every normal modal logic is closed under rule rk:

A1 → (A2 → · · · (An−1 → An) · · · )
――――――――――――――――――――――――――――――――― rk
□A1 → (□A2 → · · · (□An−1 → □An) · · · )
Proof. By induction on n: If n = 1, then the rule is just nec, and every normal modal logic is closed under nec.
Now suppose the result holds for n − 1; we show it holds for n. Assume
A1 → (A2 → · · · (An−1 → An) · · · ) ∈ Σ.
By the induction hypothesis (treating An−1 → An as a single formula), we have
□A1 → (□A2 → · · · □(An−1 → An) · · · ) ∈ Σ.
Since Σ is a normal modal logic, it contains all instances of K, in particular
□(An−1 → An) → (□An−1 → □An) ∈ Σ.
Using modus ponens and suitable tautological instances we get
□A1 → (□A2 → · · · (□An−1 → □An) · · · ) ∈ Σ.
Proposition 3.7. Every normal modal logic Σ contains ¬♦⊥.
Proposition 3.8. Let A1, . . . , An be formulas. Then there is a smallest normal modal logic Σ containing all instances of A1, . . . , An.
Proof. Given A1, . . . , An, define Σ as the intersection of all normal modal logics containing all instances of A1, . . . , An. The intersection is non-empty, as Frm(L), the set of all formulas, is such a normal modal logic; and an intersection of normal modal logics is easily seen to be a normal modal logic itself.
Definition 3.9. The smallest normal modal logic containing A1 ,
. . . , An is called a modal system and denoted by KA1 . . . An . The
smallest normal modal logic is denoted by K.
3.3 Derivations and Modal Systems
We first define what a derivation is for normal modal logics.
Roughly, a derivation is a sequence of formulas in which every
element is either (a substitution instance of) one of a number of
axioms, or follows from previous elements by one of a few infer-
ence rules. For normal modal logics, all instances of tautologies,
K, and dual count as axioms. This results in the modal sys-
tem K, the smallest normal modal logic. We may wish to add
additional axioms to obtain other systems, however. The rules
are always modus ponens mp and necessitation nec.
Definition 3.10. Given a modal system KA1 . . . An and a formula B, we say that B is derivable in KA1 . . . An, written KA1 . . . An ⊢ B, if and only if there are formulas C1, . . . , Ck such that Ck = B and each Ci is either a tautological instance, or an instance of one of K, dual, A1, . . . , An, or it follows from previous formulas by means of the rules mp or nec.

The following proposition allows us to show that B ∈ Σ by exhibiting a Σ-proof of B.

Proposition 3.11. KA1 . . . An = {B : KA1 . . . An ⊢ B}.
Proof. We use induction on the length of derivations to show that {B : KA1 . . . An ⊢ B} ⊆ KA1 . . . An.
If the derivation of B has length 1, it contains a single formula. That formula cannot follow from previous formulas by mp or nec, so it must be a tautological instance, an instance of K, dual, or an instance of one of A1, . . . , An. But KA1 . . . An contains these as well, so B ∈ KA1 . . . An.
If the derivation of B has length k > 1, then B may in addition be obtained by mp or nec from formulas occurring earlier in the derivation. If B follows from C and C → B (by mp), then C and C → B ∈ KA1 . . . An by induction hypothesis, since they have derivations of length < k. But every modal logic is closed under modus ponens, so B ∈ KA1 . . . An. If B ≡ □C follows from C by nec, then C ∈ KA1 . . . An by induction hypothesis. But every normal modal logic is closed under nec, so B ∈ KA1 . . . An.
The converse inclusion follows by showing that Σ = {B : KA1 . . . An ⊢ B} is a normal modal logic containing all the instances of A1, . . . , An, together with the observation that KA1 . . . An is, by definition, the smallest such logic.
1. Every tautology B is a tautological instance, so KA1 . . . An ⊢ B, so Σ contains all tautologies.
2. If KA1 . . . An ⊢ C and KA1 . . . An ⊢ C → B, then KA1 . . . An ⊢ B: Combine the derivation of C with that of C → B, and add the line B. The last line is justified by mp. So Σ is closed under modus ponens.
3. If B has a derivation, then every substitution instance of B also has a derivation: apply the substitution to every formula in the derivation. (Exercise: prove by induction on the length of derivations that the result is also a correct derivation.) So Σ is closed under uniform substitution. (We have now established that Σ satisfies all conditions of a modal logic.)
4. We have KA1 . . . An ⊢ K, so K ∈ Σ.
5. We have KA1 . . . An ⊢ dual, so dual ∈ Σ.
6. If KA1 . . . An ⊢ C, the derivation of C extended by the additional line □C is justified by nec. Consequently, Σ is closed under nec. Thus, Σ is normal.
3.4 Proofs in K
In order to practice proofs in the smallest modal system, we show that the valid formulas on the left-hand side of table 1.1 can all be given K-proofs.
Proposition 3.12. K ⊢ □A → □(B → A)

Proof.
1. A → (B → A)    taut
2. □(A → (B → A))    nec, 1
3. □(A → (B → A)) → (□A → □(B → A))    K
4. □A → □(B → A)    mp, 2, 3
Proposition 3.13. K ⊢ □(A ∧ B) → (□A ∧ □B)

Proof.
1. (A ∧ B) → A    taut
2. □((A ∧ B) → A)    nec, 1
3. □((A ∧ B) → A) → (□(A ∧ B) → □A)    K
4. □(A ∧ B) → □A    mp, 2, 3
5. (A ∧ B) → B    taut
6. □((A ∧ B) → B)    nec, 5
7. □((A ∧ B) → B) → (□(A ∧ B) → □B)    K
8. □(A ∧ B) → □B    mp, 6, 7
9. (□(A ∧ B) → □A) → ((□(A ∧ B) → □B) → (□(A ∧ B) → (□A ∧ □B)))    taut
10. (□(A ∧ B) → □B) → (□(A ∧ B) → (□A ∧ □B))    mp, 4, 9
11. □(A ∧ B) → (□A ∧ □B)    mp, 8, 10.

Note that the formula on line 9 is an instance of the tautology
(p → q) → ((p → r) → (p → (q ∧ r))).
Proposition 3.14. K ⊢ (□A ∧ □B) → □(A ∧ B)

Proof.
1. A → (B → (A ∧ B))    taut
2. □(A → (B → (A ∧ B)))    nec, 1
3. □(A → (B → (A ∧ B))) → (□A → □(B → (A ∧ B)))    K
4. □A → □(B → (A ∧ B))    mp, 2, 3
5. □(B → (A ∧ B)) → (□B → □(A ∧ B))    K
6. (□A → □(B → (A ∧ B))) → ((□(B → (A ∧ B)) → (□B → □(A ∧ B))) → (□A → (□B → □(A ∧ B))))    taut
7. (□(B → (A ∧ B)) → (□B → □(A ∧ B))) → (□A → (□B → □(A ∧ B)))    mp, 4, 6
8. □A → (□B → □(A ∧ B))    mp, 5, 7
9. (□A → (□B → □(A ∧ B))) → ((□A ∧ □B) → □(A ∧ B))    taut
10. (□A ∧ □B) → □(A ∧ B)    mp, 8, 9

The formulas on lines 6 and 9 are instances of the tautologies
(p → q) → ((q → r) → (p → r)),
(p → (q → r)) → ((p ∧ q) → r).
Proposition 3.15. K ⊢ ¬□p → ♦¬p

Proof.
1. ♦¬p ↔ ¬□¬¬p    dual
2. (♦¬p ↔ ¬□¬¬p) → (¬□¬¬p → ♦¬p)    taut
3. ¬□¬¬p → ♦¬p    mp, 1, 2
4. ¬¬p → p    taut
5. □(¬¬p → p)    nec, 4
6. □(¬¬p → p) → (□¬¬p → □p)    K
7. □¬¬p → □p    mp, 5, 6
8. (□¬¬p → □p) → (¬□p → ¬□¬¬p)    taut
9. ¬□p → ¬□¬¬p    mp, 7, 8
10. (¬□p → ¬□¬¬p) → ((¬□¬¬p → ♦¬p) → (¬□p → ♦¬p))    taut
11. (¬□¬¬p → ♦¬p) → (¬□p → ♦¬p)    mp, 9, 10
12. ¬□p → ♦¬p    mp, 3, 11

The formulas on lines 8 and 10 are instances of the tautologies
(p → q) → (¬q → ¬p),
(p → q) → ((q → r) → (p → r)).
3.5 Derived Rules
Finding and writing derivations is obviously difficult, cumbersome, and repetitive. For instance, very often we want to pass from A → B to □A → □B, i.e., apply rule rk. That requires an application of nec, then recording the proper instance of K, then applying mp. Passing from A → B and B → C to A → C requires recording the (long) tautological instance
(A → B) → ((B → C) → (A → C))
and applying mp twice. Often we want to replace a sub-formula by a formula we know to be equivalent, e.g., ♦A by ¬□¬A, or ¬¬A by A. So rather than write out the actual derivation, it is more convenient to simply record why the intermediate steps are derivable. For this purpose, let us collect some facts about derivability.
Proposition 3.16. If K ⊢ A1, . . . , K ⊢ An, and B follows from A1, . . . , An by propositional logic, then K ⊢ B.

Proof. If B follows from A1, . . . , An by propositional logic, then
A1 → (A2 → · · · (An → B) . . . )
is a tautological instance. Applying mp n times gives a derivation of B.
We will indicate use of this proposition by pl.
Proposition 3.17. If K ⊢ A1 → (A2 → · · · (An−1 → An) . . . ) then K ⊢ □A1 → (□A2 → · · · (□An−1 → □An) . . . ).

Proof. By induction on n, just as in the proof of Proposition 3.6.
We will indicate use of this proposition by rk. Let’s illustrate how these results help in establishing derivability results more easily.
Proposition 3.18. K ⊢ (□A ∧ □B) → □(A ∧ B)

Proof.
1. K ⊢ A → (B → (A ∧ B))    taut
2. K ⊢ □A → (□B → □(A ∧ B))    rk, 1
3. K ⊢ (□A ∧ □B) → □(A ∧ B)    pl, 2
Proposition 3.19. If K ⊢ A ↔ B and K ⊢ C[A/p], then K ⊢ C[B/p].

Proof. Exercise.

This proposition comes in handy especially when we want to convert ♦ into □ (or vice versa), or remove double negations inside a formula. For instance:

Proposition 3.20. K ⊢ ¬□p → ♦¬p

Proof.
1. K ⊢ ♦¬p ↔ ¬□¬¬p    dual
2. K ⊢ ¬□¬¬p → ♦¬p    pl, 1
3. K ⊢ ¬□p → ♦¬p    re-write p for ¬¬p
The following proposition justifies that we can establish derivability results schematically. E.g., the previous proposition does not just establish that K ⊢ ¬□p → ♦¬p, but K ⊢ ¬□A → ♦¬A for arbitrary A.
Proposition 3.21. If A is a substitution instance of B and K ⊢ B, then K ⊢ A.
Proof. It is tedious but routine to verify (by induction on the
length of the derivation of B) that applying a substitution to
an entire derivation also results in a correct derivation. Specif-
ically, substitution instances of tautological instances are them-
selves tautological instances, substitution instances of instances
of dual and K are themselves instances of dual and K, and appli-
cations of mp and nec remain correct when substituting formulas
for propositional variables in both premise(s) and conclusion.
3.6 More Proofs in K
Let’s see some more examples of derivability in K, now using the
simplified method introduced in section 3.5.
Proposition 3.22. K ⊢ □(A → B) → (♦A → ♦B)

Proof.
1. K ⊢ (A → B) → (¬B → ¬A)    pl
2. K ⊢ □(A → B) → (□¬B → □¬A)    rk, 1
3. K ⊢ (□¬B → □¬A) → (¬□¬A → ¬□¬B)    taut
4. K ⊢ □(A → B) → (¬□¬A → ¬□¬B)    pl, 2, 3
5. K ⊢ □(A → B) → (♦A → ♦B)    re-writing ♦ for ¬□¬.
Proposition 3.23. K ⊢ □A → (♦(A → B) → ♦B)

Proof.
1. K ⊢ A → (¬B → ¬(A → B))    taut
2. K ⊢ □A → (□¬B → □¬(A → B))    rk, 1
3. K ⊢ □A → (¬□¬(A → B) → ¬□¬B)    pl, 2
4. K ⊢ □A → (♦(A → B) → ♦B)    re-writing ♦ for ¬□¬.
Proposition 3.24. K ⊢ (♦A ∨ ♦B) → ♦(A ∨ B)

Proof.
1. K ⊢ ¬(A ∨ B) → ¬A    taut
2. K ⊢ □¬(A ∨ B) → □¬A    rk, 1
3. K ⊢ ¬□¬A → ¬□¬(A ∨ B)    pl, 2
4. K ⊢ ♦A → ♦(A ∨ B)    re-writing
5. K ⊢ ♦B → ♦(A ∨ B)    similarly
6. K ⊢ (♦A ∨ ♦B) → ♦(A ∨ B)    pl, 4, 5.
Proposition 3.25. K ⊢ ♦(A ∨ B) → (♦A ∨ ♦B)

Proof.
1. K ⊢ ¬A → (¬B → ¬(A ∨ B))    taut
2. K ⊢ □¬A → (□¬B → □¬(A ∨ B))    rk, 1
3. K ⊢ □¬A → (¬□¬(A ∨ B) → ¬□¬B)    pl, 2
4. K ⊢ ¬□¬(A ∨ B) → (□¬A → ¬□¬B)    pl, 3
5. K ⊢ ¬□¬(A ∨ B) → (¬¬□¬B → ¬□¬A)    pl, 4
6. K ⊢ ♦(A ∨ B) → (¬♦B → ♦A)    re-writing ♦ for ¬□¬
7. K ⊢ ♦(A ∨ B) → (♦B ∨ ♦A)    pl, 6.
3.7 Dual Formulas

Definition 3.26. Each of the formulas T, B, 4, and 5 has a dual, denoted by a subscripted diamond, as follows:
p → ♦p    (T♦)
♦□p → p    (B♦)
♦♦p → ♦p    (4♦)
♦□p → □p    (5♦)

Each of the above dual formulas is obtained from the corresponding formula by substituting ¬p for p, contraposing, replacing ¬□¬ by ♦, and replacing ¬♦¬ by □. D, i.e., □p → ♦p, is its own dual in that sense.
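The recipe is mechanical, so it can be carried out by a short program: substitute ¬p for p, contrapose, and push the resulting negations through □ and ♦ (turning ¬□¬ into ♦ and ¬♦¬ into □). A sketch in Python, again with our own tuple representation; it handles the connectives occurring in T, B, 4, 5, and D:

```python
def subst_neg(f):
    """Replace every propositional variable p by ¬p."""
    op = f[0]
    if op == 'var':
        return ('not', f)
    if op in ('box', 'dia', 'not'):
        return (op, subst_neg(f[1]))
    if op in ('imp', 'and', 'or'):
        return (op, subst_neg(f[1]), subst_neg(f[2]))
    return f  # 'bot'

def neg(f):
    """Negate f, pushing ¬ through □, ♦, and double negations."""
    op = f[0]
    if op == 'not':
        return f[1]                # ¬¬A becomes A
    if op == 'box':
        return ('dia', neg(f[1]))  # ¬□A becomes ♦¬A
    if op == 'dia':
        return ('box', neg(f[1]))  # ¬♦A becomes □¬A
    return ('not', f)

def dual(f):
    """Dual of an implication A → B: substitute ¬p for p, contrapose,
    and simplify the negations."""
    _, a, b = f
    return ('imp', neg(subst_neg(b)), neg(subst_neg(a)))

p = ('var', 'p')
T = ('imp', ('box', p), p)                     # □p → p
assert dual(T) == ('imp', p, ('dia', p))       # p → ♦p, i.e., T♦
four = ('imp', ('box', p), ('box', ('box', p)))
assert dual(four) == ('imp', ('dia', ('dia', p)), ('dia', p))  # 4♦
```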
3.8 Proofs in Modal Systems

We now come to proofs in systems of modal logic other than K.

Proposition 3.27. The following provability results obtain:
1. KT5 ⊢ B;
2. KT5 ⊢ 4;
3. KDB4 ⊢ T;
4. KB4 ⊢ 5;
5. KB5 ⊢ 4;
6. KT ⊢ D.
Proof. We exhibit proofs for each.
1. KT5 ⊢ B:
1. KT5 ⊢ ♦A → □♦A    5
2. KT5 ⊢ A → ♦A    T♦
3. KT5 ⊢ A → □♦A    pl, 1, 2.

2. KT5 ⊢ 4:
1. KT5 ⊢ ♦□A → □♦□A    5 with □A for p
2. KT5 ⊢ □A → ♦□A    T♦ with □A for p
3. KT5 ⊢ □A → □♦□A    pl, 1, 2
4. KT5 ⊢ ♦□A → □A    5♦
5. KT5 ⊢ □♦□A → □□A    rk, 4
6. KT5 ⊢ □A → □□A    pl, 3, 5.

3. KDB4 ⊢ T:
1. KDB4 ⊢ ♦□A → A    B♦
2. KDB4 ⊢ □□A → ♦□A    D with □A for p
3. KDB4 ⊢ □□A → A    pl, 1, 2
4. KDB4 ⊢ □A → □□A    4
5. KDB4 ⊢ □A → A    pl, 3, 4.

4. KB4 ⊢ 5:
1. KB4 ⊢ ♦A → □♦♦A    B with ♦A for p
2. KB4 ⊢ ♦♦A → ♦A    4♦
3. KB4 ⊢ □♦♦A → □♦A    rk, 2
4. KB4 ⊢ ♦A → □♦A    pl, 1, 3.

5. KB5 ⊢ 4:
1. KB5 ⊢ □A → □♦□A    B with □A for p
2. KB5 ⊢ ♦□A → □A    5♦
3. KB5 ⊢ □♦□A → □□A    rk, 2
4. KB5 ⊢ □A → □□A    pl, 1, 3.

6. KT ⊢ D:
1. KT ⊢ □A → A    T
2. KT ⊢ A → ♦A    T♦
3. KT ⊢ □A → ♦A    pl, 1, 2.
Definition 3.28. Following tradition, we define S4 to be the system KT4, and S5 the system KTB4.

The following proposition shows that the classical system S5 has several equivalent axiomatizations. This should not be surprising, as the various combinations of axioms all characterize equivalence relations (see Proposition 2.12).
Proposition 3.29. KTB4 = KT5 = KDB4 = KDB5.
Proof. Exercise.
3.9 Soundness
A derivation system is called sound if everything that can be de-
rived is valid. When considering modal systems, i.e., derivations
where in addition to K we can use instances of some formulas A1 ,
. . . , An , we want every derivable formula to be true in any model
in which A1 , . . . , An are true.
Theorem 3.30 (Soundness Theorem). If every instance of A1, . . . , An is valid in the classes of models C1, . . . , Cn, respectively, then KA1 . . . An ⊢ B implies that B is valid in the class of models C1 ∩ · · · ∩ Cn.

Proof. By induction on the length of proofs. For brevity, put C = C1 ∩ · · · ∩ Cn.
1. Induction Basis: If B has a proof of length 1, then it is either a tautological instance, an instance of K, or of dual, or an instance of one of A1, . . . , An. In the first case, B is valid in C, since tautological instances are valid in any class of models, by Proposition 1.15. Similarly in the second case, by Proposition 1.18 and Proposition 1.19. Finally in the third case, since B is valid in Ci and C ⊆ Ci, we have that B is valid in C as well.
2. Inductive step: Suppose B has a proof of length k > 1. If B is a tautological instance or an instance of one of K, dual, A1, . . . , An, we proceed as in the previous step. So suppose B is obtained by mp from previous formulas C → B and C. Then C → B and C have proofs of length < k, and by inductive hypothesis they are valid in C. By Proposition 1.20, B is valid in C as well. Finally suppose B is obtained by nec from C (so that B ≡ □C). By inductive hypothesis, C is valid in C, and by Proposition 1.12 so is B.
3.10 Showing Systems are Distinct

In section 3.8 we saw how to prove that two systems of modal logic are in fact the same system. Theorem 3.30 allows us to show that two modal systems Σ and Σ′ are distinct, by finding a formula A such that Σ′ ⊢ A that fails in a model of Σ.
Proposition 3.31. KD ⊊ KT

Proof. This is the syntactic counterpart to the semantic fact that all reflexive relations are serial. To show KD ⊆ KT we need to see that KD ⊢ B implies KT ⊢ B, which follows from KT ⊢ D, as shown in Proposition 3.27(6). To show that the inclusion is proper, by Soundness (Theorem 3.30), it suffices to exhibit a model of KD where T, i.e., □p → p, fails (an easy task left as an exercise), for then by Soundness KD ⊬ □p → p.
Proposition 3.32. KB ≠ K4.

Proof. We construct a symmetric model where some instance of 4 fails; since the instance is obviously derivable in K4 but not in KB, it will follow that K4 ⊈ KB. Consider the symmetric model M of Figure 3.1. Since the model is symmetric, K and B are true in M (by Proposition 1.18 and Theorem 2.1, respectively). However, M, w1 ⊮ □p → □□p.
[Figure 3.1: A symmetric model falsifying an instance of 4: worlds w1 (where ¬p holds) and w2 (where p holds), each accessible from the other; at w1, □p holds but □□p fails, while at w2, □p fails.]
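Claims like “M, w1 ⊮ □p → □□p” can be verified mechanically by computing truth at a world directly from the semantic clauses. A minimal sketch in Python (the representation is our own; worlds are strings, R a set of pairs, V a map from variables to sets of worlds):

```python
def holds(M, w, f):
    """Truth at a world in a relational model M = (W, R, V); formulas are
    nested tuples: ('bot',), ('var', i), ('not', B), ('and', B, C),
    ('or', B, C), ('imp', B, C), ('box', B), ('dia', B)."""
    W, R, V = M
    op = f[0]
    if op == 'bot': return False
    if op == 'var': return w in V.get(f[1], set())
    if op == 'not': return not holds(M, w, f[1])
    if op == 'and': return holds(M, w, f[1]) and holds(M, w, f[2])
    if op == 'or':  return holds(M, w, f[1]) or holds(M, w, f[2])
    if op == 'imp': return (not holds(M, w, f[1])) or holds(M, w, f[2])
    if op == 'box': return all(holds(M, v, f[1]) for v in W if (w, v) in R)
    return any(holds(M, v, f[1]) for v in W if (w, v) in R)  # 'dia'

# The model of Figure 3.1: w1 and w2 see each other (R is symmetric),
# and p is true only at w2.
M = ({'w1', 'w2'}, {('w1', 'w2'), ('w2', 'w1')}, {'p': {'w2'}})
p = ('var', 'p')
assert holds(M, 'w1', ('box', p))               # M, w1 ⊩ □p
assert not holds(M, 'w1', ('box', ('box', p)))  # M, w1 ⊮ □□p
```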
Theorem 3.33. KTB ⊬ 4 and KTB ⊬ 5.

Proof. By Theorem 2.1 we know that all instances of T and B are true in every reflexive model and every symmetric model, respectively. So by soundness, it suffices to find a reflexive symmetric model containing a world at which some instance of 4 fails, and similarly for 5. We use the same model for both claims. Consider the symmetric, reflexive model in Figure 3.2. Then M, w1 ⊮ □p → □□p, so 4 fails at w1. Similarly, M, w2 ⊮ ♦¬p → □♦¬p, so the instance of 5 with A = ¬p fails at w2.
[Figure 3.2: The model for Theorem 3.33: a reflexive, symmetric model with worlds w1 and w2 (where p holds) and w3 (where ¬p holds), where additionally w1 and w2, as well as w2 and w3, are accessible from each other; at w1, □p holds but □□p fails, and at w2, ♦¬p holds but □♦¬p fails.]
Theorem 3.34. KD5 ≠ KT4 = S4.

Proof. By Theorem 2.1 we know that all instances of D and 5 are true in all serial euclidean models. So it suffices to find a serial euclidean model containing a world at which some instance of 4 fails. Consider the model of Figure 3.3, and notice that M, w1 ⊮ □p → □□p.
3.11 Derivability from a Set of Formulas

In section 3.3 we defined a notion of provability of a formula in a system Σ. We now extend this notion to provability in Σ from formulas in a set Γ.

Definition 3.35. A formula A is derivable in a system Σ from a set of formulas Γ, written Γ ⊢Σ A, if and only if there are B1, . . . , Bn ∈ Γ such that Σ ⊢ B1 → (B2 → · · · (Bn → A) · · · ).
[Figure 3.3: The model for Theorem 3.34: a serial euclidean model with worlds w1 and w4 (where ¬p holds) and w2 and w3 (where p holds); at w1, □p holds but □□p fails.]

3.12 Properties of Derivability
Proposition 3.36. Let Σ be a modal system and Γ a set of modal formulas. The following properties hold:
1. Monotony: If Γ ⊢Σ A and Γ ⊆ ∆, then ∆ ⊢Σ A;
2. Reflexivity: If A ∈ Γ, then Γ ⊢Σ A;
3. Cut: If Γ ⊢Σ A and ∆ ∪ {A} ⊢Σ B, then Γ ∪ ∆ ⊢Σ B;
4. Deduction theorem: Γ ∪ {B} ⊢Σ A if and only if Γ ⊢Σ B → A;
5. Rule T: If Γ ⊢Σ A1 and . . . and Γ ⊢Σ An and A1 → (A2 → · · · (An → B) · · · ) is a tautological instance, then Γ ⊢Σ B.
The proof is an easy exercise. Part (5) of Proposition 3.36 gives us that, for instance, if Γ ⊢Σ A ∨ B and Γ ⊢Σ ¬A, then Γ ⊢Σ B. Also, in what follows, we write Γ, A ⊢Σ B instead of Γ ∪ {A} ⊢Σ B.
Definition 3.37. A set Γ is deductively closed relative to a system Σ if and only if Γ ⊢Σ A implies A ∈ Γ.
3.13 Consistency
Consistency is an important property of sets of formulas. A set
of formulas is inconsistent if a contradiction, such as ⊥, is deriv-
able from it; and otherwise consistent. If a set is inconsistent, its
formulas cannot all be true in a model at a world. For the com-
pleteness theorem we prove the converse: every consistent set is
true at a world in a model, namely in the “canonical model.”
Definition 3.38. A set Γ is consistent relative to a system Σ or, as we will say, Σ-consistent, if and only if Γ ⊬Σ ⊥.

So for instance, the set {□(p → q), □p, ¬□q} is consistent relative to propositional logic, but not K-consistent. Similarly, the set {♦p, □♦p → q, ¬q} is not K5-consistent.
Proposition 3.39. Let Γ be a set of formulas. Then:
1. Γ is Σ-consistent if and only if there is some formula A such that Γ ⊬Σ A.
2. Γ ⊢Σ A if and only if Γ ∪ {¬A} is not Σ-consistent.
3. If Γ is Σ-consistent, then for any formula A, either Γ ∪ {A} is Σ-consistent or Γ ∪ {¬A} is Σ-consistent.
Proof. These facts follow easily using classical propositional logic. We give the argument for (3). Proceed contrapositively and suppose neither Γ ∪ {A} nor Γ ∪ {¬A} is Σ-consistent. Then by (2), both Γ, A ⊢Σ ⊥ and Γ, ¬A ⊢Σ ⊥. By the deduction theorem, Γ ⊢Σ A → ⊥ and Γ ⊢Σ ¬A → ⊥. But (A → ⊥) → ((¬A → ⊥) → ⊥) is a tautological instance, hence by Proposition 3.36(5), Γ ⊢Σ ⊥.
Problems
Problem 3.1. Prove Proposition 3.7.
Problem 3.2. Find derivations in K for the following formulas:
1. □¬p → □(p → q)
2. (□p ∨ □q) → □(p ∨ q)
3. ♦p → ♦(p ∨ q)

Problem 3.3. Prove Proposition 3.19 by proving, by induction on the complexity of C, that if K ⊢ A ↔ B, then K ⊢ C[A/p] ↔ C[B/p].

Problem 3.4. Show that the following derivability claims hold:
1. K ⊢ ♦¬⊥ → (□A → ♦A);
2. K ⊢ □(A ∨ B) → (♦A ∨ □B);
3. K ⊢ (♦A → □B) → □(A → B).
Problem 3.5. Show that for each formula A in Definition 3.26: K ⊢ A ↔ A♦.

Problem 3.6. Prove Proposition 3.29.

Problem 3.7. Give an alternative proof of Theorem 3.34 using a model with 3 worlds.

Problem 3.8. Provide a single reflexive transitive model showing that both KT4 ⊬ B and KT4 ⊬ 5.
CHAPTER 4
Completeness and Canonical Models
4.1 Introduction

If Σ is a modal system, then the soundness theorem establishes that if Σ ⊢ A, then A is valid in any class C of models in which all instances of all formulas in Σ are valid. In particular, that means that if K ⊢ A then A is true in all models; if KT ⊢ A then A is true in all reflexive models; if KD ⊢ A then A is true in all serial models, etc.

Completeness is the converse of soundness: that K is complete means, for instance, that if a formula A is valid, then K ⊢ A. Proving completeness is a lot harder to do than proving soundness. It is useful, first, to consider the contrapositive: K is complete iff whenever ⊬ A, there is a countermodel, i.e., a model M such that M ⊮ A. Equivalently (negating A), we could prove that whenever ⊬ ¬A, there is a model of A. In the construction of such a model, we can use information contained in A. When we find models for specific formulas we often do the same: E.g., if we want to
find a countermodel to □p → □q, we know that it has to contain a world where □p is true and □q is false. And a world where □q is false means there has to be a world accessible from it where q is false. And that’s all we need to know: which worlds make the propositional variables true, and which worlds are accessible from which worlds.
In the case of proving completeness, however, we don’t have a specific formula A for which we are constructing a model. We want to establish that a model exists for every A such that ⊬Σ ¬A. This is a minimal requirement, since if ⊢Σ ¬A, by soundness, there is no model for A (in which Σ is true). Now note that ⊬Σ ¬A iff A is Σ-consistent. (Recall that ⊬Σ ¬A and A ⊬Σ ⊥ are equivalent.) So our task is to construct a model for every Σ-consistent formula.
The trick we’ll use is to find a Σ-consistent set of formulas that contains A, but also other formulas which tell us what the world that makes A true has to look like. Such sets are complete Σ-consistent sets. It’s not enough to construct a model with a single world to make A true; it will have to contain multiple worlds and an accessibility relation. The complete Σ-consistent set containing A will also contain other formulas of the form □B and ♦C. In all accessible worlds, B has to be true; in at least one, C has to be true. In order to accomplish this, we’ll simply take all possible complete Σ-consistent sets as the basis for the set of worlds. A tricky part will be to figure out when a complete Σ-consistent set should count as being accessible from another in our model.

We’ll show that in the model so defined, A is true at a world—which is also a complete Σ-consistent set—iff A is an element of that set. If A is Σ-consistent, it will be an element of at least one complete Σ-consistent set (a fact we’ll prove), and so there will be a world where A is true. So we will have a single model where every Σ-consistent formula A is true at some world. This single model is the canonical model for Σ.
4.2 Complete Σ-Consistent Sets

Suppose Σ is a set of modal formulas—think of them as the axioms or defining principles of a normal modal logic. A set Γ is Σ-consistent iff Γ ⊬Σ ⊥, i.e., if there is no derivation of A1 → (A2 → · · · (An → ⊥) . . . ) from Σ, where each Ai ∈ Γ. We will construct a “canonical” model in which each world is taken to be a special kind of Σ-consistent set: one which is not just Σ-consistent, but maximally so, in the sense that it settles the truth value of every modal formula: for every A, either A ∈ Γ or ¬A ∈ Γ.

Definition 4.1. A set Γ is complete Σ-consistent if and only if it is Σ-consistent and for every A, either A ∈ Γ or ¬A ∈ Γ.
Complete Σ-consistent sets Γ have a number of useful properties. For one, they are deductively closed, i.e., if Γ ⊢Σ A then A ∈ Γ. This means in particular that every instance of a formula A ∈ Σ is also in Γ. Moreover, membership in Γ mirrors the truth conditions for the propositional connectives. This will be important when we define the “canonical model.”

Proposition 4.2. Suppose Γ is complete Σ-consistent. Then:
1. Γ is deductively closed in Σ.
2. Σ ⊆ Γ.
3. ⊥ ∉ Γ.
4. ¬A ∈ Γ if and only if A ∉ Γ.
5. A ∧ B ∈ Γ iff A ∈ Γ and B ∈ Γ.
6. A ∨ B ∈ Γ iff A ∈ Γ or B ∈ Γ.
7. A → B ∈ Γ iff A ∉ Γ or B ∈ Γ.
Proof. 1. Suppose Γ ⊢Σ A but A ∉ Γ. Then since Γ is complete Σ-consistent, ¬A ∈ Γ. This would make Γ inconsistent, since A, ¬A ⊢Σ ⊥.
2. If A ∈ Σ then Γ ⊢Σ A, and A ∈ Γ by deductive closure, i.e., case (1).
3. If ⊥ ∈ Γ, then Γ ⊢Σ ⊥, so Γ would be Σ-inconsistent.
4. If ¬A ∈ Γ, then by consistency A ∉ Γ; and if A ∉ Γ, then ¬A ∈ Γ, since Γ is complete Σ-consistent.
5. Exercise.
6. Suppose A ∨ B ∈ Γ, and A ∉ Γ and B ∉ Γ. Since Γ is complete Σ-consistent, ¬A ∈ Γ and ¬B ∈ Γ. Then ¬(A ∨ B) ∈ Γ, since ¬A → (¬B → ¬(A ∨ B)) is a tautological instance. This would mean that Γ is Σ-inconsistent, a contradiction.
7. Exercise.
4.3 Lindenbaum’s Lemma
Lindenbaum’s Lemma establishes that every Σ -consistent set of
formulas is contained in at least one complete Σ -consistent set.
Our construction of the canonical model will show that for each
complete Σ -consistent set ∆, there is a world in the canonical
model where all and only the formulas in ∆ are true. So Linden-
baum’s Lemma guarantees that every Σ -consistent set is true at
some world in the canonical model.
Theorem 4.3 (Lindenbaum’s Lemma). If Γ is Σ-consistent, then there is a complete Σ-consistent set ∆ extending Γ.

Proof. Let A0, A1, . . . be an exhaustive listing of all formulas of the language (repetitions are allowed). For instance, start by listing p0, and at each stage n ≥ 1 list the finitely many formulas of length n using only variables among p0, . . . , pn. We define sets of formulas ∆n by induction on n. We first put ∆0 = Γ. Supposing that ∆n has been defined, we define ∆n+1 by:

∆n+1 = ∆n ∪ {An}, if ∆n ∪ {An} is Σ-consistent;
∆n+1 = ∆n ∪ {¬An}, otherwise.
We now let ∆ = ⋃n ∆n.
We have to show that this definition actually yields a set ∆ with the required properties, i.e., Γ ⊆ ∆ and ∆ is complete Σ-consistent.
It’s obvious that Γ ⊆ ∆, since ∆0 ⊆ ∆ by construction, and ∆0 = Γ. In fact, ∆n ⊆ ∆ for all n, since ∆ is the union of all ∆n. (Since in each step of the construction, we add a formula to the set already constructed, ∆n ⊆ ∆n+1; and since ⊆ is transitive, ∆n ⊆ ∆m whenever n ≤ m.) At each stage of the construction, we either add An or ¬An, and every formula appears (at least once) in the list of all An. So, for every A either A ∈ ∆ or ¬A ∈ ∆, so ∆ is complete by definition.
Finally, we have to show that ∆ is Σ-consistent. To do this, we show that (a) if ∆ were Σ-inconsistent, then some ∆n would be Σ-inconsistent, and (b) all ∆n are Σ-consistent.
So suppose ∆ were Σ-inconsistent. Then ∆ ⊢Σ ⊥, i.e., there are A1, . . . , Ak ∈ ∆ such that Σ ⊢ A1 → (A2 → · · · (Ak → ⊥) . . . ). Since ∆ = ⋃n ∆n, each Ai ∈ ∆ni for some ni. Let n be the largest of these. Since ni ≤ n, ∆ni ⊆ ∆n. So, all Ai are in ∆n. This would mean ∆n ⊢Σ ⊥, i.e., ∆n is Σ-inconsistent.
To show that each ∆n is Σ-consistent, we use a simple induction on n. ∆0 = Γ, and we assumed Γ was Σ-consistent. So the claim holds for n = 0. Now suppose it holds for n, i.e., ∆n is Σ-consistent. ∆n+1 is ∆n ∪ {An} if that is Σ-consistent, and otherwise it is ∆n ∪ {¬An}. In the first case, ∆n+1 is clearly Σ-consistent. In the second case, by Proposition 3.39(3), either ∆n ∪ {An} or ∆n ∪ {¬An} is Σ-consistent; since the first isn’t, ∆n+1 = ∆n ∪ {¬An} is Σ-consistent as well.
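The construction in this proof is a simple greedy loop, effective relative to a test for Σ-consistency. The following Python sketch runs it over a finite initial segment of the enumeration; the consistency test is left as an oracle parameter, since Σ-consistency is in general not decidable (all names here are our own illustration, not part of the text):

```python
def lindenbaum(gamma, formulas, consistent):
    """Extend gamma through the given (finite) list of formulas, following
    the proof of Theorem 4.3: at each stage, add the next formula if the
    result is consistent, and otherwise add its negation.

    `consistent` is an oracle mapping a set of formulas to True/False;
    it stands in for Sigma-consistency and must be supplied from outside."""
    delta = set(gamma)
    for a in formulas:
        if consistent(delta | {a}):
            delta |= {a}
        else:
            delta |= {('not', a)}
    return delta
```

Proposition 3.39(3) guarantees that whenever the first branch fails, the second yields a consistent set, so the invariant that delta is Σ-consistent is maintained at every stage; this is exactly part (b) of the proof.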
Corollary 4.4. Γ ⊢Σ A if and only if A ∈ ∆ for each complete Σ-consistent set ∆ extending Γ (including when Γ = ∅, in which case we get another characterization of the modal system Σ).

Proof. Suppose Γ ⊢Σ A, and let ∆ be any complete Σ-consistent set extending Γ. If A ∉ ∆, then by maximality ¬A ∈ ∆, and so ∆ ⊢Σ A (by monotony) and ∆ ⊢Σ ¬A (by reflexivity), and so ∆ is inconsistent. Conversely, if Γ ⊬Σ A, then Γ ∪ {¬A} is Σ-consistent, and by Lindenbaum’s Lemma there is a complete Σ-consistent set ∆ extending Γ ∪ {¬A}. By consistency, A ∉ ∆.
4.4 Modalities and Complete Consistent Sets

When we construct a model MΣ whose set of worlds is given by the complete Σ-consistent sets ∆ in some normal modal logic Σ, we will also need to define an accessibility relation RΣ between such “worlds.” We want it to be the case that the accessibility relation (and the assignment VΣ) are defined in such a way that MΣ, ∆ ⊩ A iff A ∈ ∆. How should we do this?

Once the accessibility relation is defined, the definition of truth at a world ensures that MΣ, ∆ ⊩ □A iff MΣ, ∆′ ⊩ A for all ∆′ such that RΣ∆∆′. The proof that MΣ, ∆ ⊩ A iff A ∈ ∆ requires that this holds in particular for formulas starting with a modal operator, i.e., MΣ, ∆ ⊩ □A iff □A ∈ ∆. Combining this requirement with the definition of truth at a world for □A yields:
□A ∈ ∆ iff A ∈ ∆′ for all ∆′ with RΣ∆∆′.
Consider the left-to-right direction: it says that if □A ∈ ∆, then A ∈ ∆′ for any A and any ∆′ with RΣ∆∆′. If we stipulate that RΣ∆∆′ iff A ∈ ∆′ for all □A ∈ ∆, then this holds. We can write the condition on the right of the “iff” more compactly as: {A : □A ∈ ∆} ⊆ ∆′.

So the question is: does this definition of RΣ in fact guarantee that □A ∈ ∆ iff MΣ, ∆ ⊩ □A? Does it also guarantee that ♦A ∈ ∆ iff MΣ, ∆ ⊩ ♦A? The next few results will establish this.
Definition 4.5. If Γ is a set of formulas, let
□Γ = {□B : B ∈ Γ},
♦Γ = {♦B : B ∈ Γ},
and
□⁻¹Γ = {B : □B ∈ Γ},
♦⁻¹Γ = {B : ♦B ∈ Γ}.

In other words, □Γ is Γ with □ in front of every formula in Γ; □⁻¹Γ is all the □’ed formulas of Γ with the initial □’s removed. This definition is not terribly important on its own, but will simplify the notation considerably.

Note that □□⁻¹Γ ⊆ Γ:
□□⁻¹Γ = {□B : □B ∈ Γ},
i.e., it’s just the set of all those formulas of Γ that start with □.
Lemma 4.6. If Γ ⊢Σ A then □Γ ⊢Σ □A.

Proof. If Γ ⊢Σ A, then there are B1, . . . , Bk ∈ Γ such that Σ ⊢ B1 → (B2 → · · · (Bk → A) · · · ). Since Σ is normal, by rule rk, Σ ⊢ □B1 → (□B2 → · · · (□Bk → □A) · · · ), where obviously □B1, . . . , □Bk ∈ □Γ. Hence, by definition, □Γ ⊢Σ □A.
Lemma 4.7. If □⁻¹Γ ⊢Σ A then Γ ⊢Σ □A.

Proof. Suppose □⁻¹Γ ⊢Σ A; then by Lemma 4.6, □□⁻¹Γ ⊢Σ □A. But since □□⁻¹Γ ⊆ Γ, also Γ ⊢Σ □A by Monotony.
Proposition 4.8. If Γ is complete Σ-consistent, then □A ∈ Γ if and only if, for every complete Σ-consistent ∆ such that □⁻¹Γ ⊆ ∆, it holds that A ∈ ∆.

Proof. Suppose Γ is complete Σ-consistent. The “only if” direction is easy: Suppose □A ∈ Γ and that □⁻¹Γ ⊆ ∆. Since □A ∈ Γ, A ∈ □⁻¹Γ ⊆ ∆, so A ∈ ∆.
For the “if” direction, we prove the contrapositive: Suppose □A ∉ Γ. Since Γ is complete Σ-consistent, it is deductively closed, and hence Γ ⊬Σ □A. By Lemma 4.7, □⁻¹Γ ⊬Σ A. By Proposition 3.39(2), □⁻¹Γ ∪ {¬A} is Σ-consistent. By Lindenbaum’s Lemma, there is a complete Σ-consistent set ∆ such that □⁻¹Γ ∪ {¬A} ⊆ ∆. By consistency, A ∉ ∆.
Lemma 4.9. Suppose Γ and ∆ are complete Σ-consistent. Then: □⁻¹Γ ⊆ ∆ if and only if ♦∆ ⊆ Γ.

Proof. “Only if” direction: Assume □⁻¹Γ ⊆ ∆ and suppose ♦A ∈ ♦∆ (i.e., A ∈ ∆). In order to show ♦A ∈ Γ it suffices to show □¬A ∉ Γ, for then by maximality ¬□¬A ∈ Γ. Now, if □¬A ∈ Γ then by hypothesis ¬A ∈ ∆, against the consistency of ∆ (since A ∈ ∆). Hence □¬A ∉ Γ, as required.
“If” direction: Assume ♦∆ ⊆ Γ. We argue contrapositively: suppose A ∉ ∆ in order to show □A ∉ Γ. If A ∉ ∆, then by maximality ¬A ∈ ∆, and so by hypothesis ♦¬A ∈ Γ. But in a normal modal logic ♦¬A is equivalent to ¬□A, and if the latter is in Γ, by consistency □A ∉ Γ, as required.
Proposition 4.10. If Γ is complete Σ-consistent, then ♦A ∈ Γ if and only if, for some complete Σ-consistent ∆ such that ♦∆ ⊆ Γ, it holds that A ∈ ∆.

Proof. Suppose Γ is complete Σ-consistent. ♦A ∈ Γ iff ¬□¬A ∈ Γ by dual and closure. ¬□¬A ∈ Γ iff □¬A ∉ Γ by Proposition 4.2(4), since Γ is complete Σ-consistent. By Proposition 4.8, □¬A ∉ Γ iff, for some complete Σ-consistent ∆ with □⁻¹Γ ⊆ ∆, ¬A ∉ ∆. Now consider any such ∆. By Lemma 4.9, □⁻¹Γ ⊆ ∆ iff ♦∆ ⊆ Γ. Also, ¬A ∉ ∆ iff A ∈ ∆ by Proposition 4.2(4). So ♦A ∈ Γ iff, for some complete Σ-consistent ∆ with ♦∆ ⊆ Γ, A ∈ ∆.
4.5 Canonical Models

The canonical model for a modal system Σ is a specific model MΣ in which the worlds are all complete Σ-consistent sets. Its accessibility relation RΣ and valuation VΣ are defined so as to guarantee that the formulas true at a world ∆ are exactly the formulas making up ∆.

Definition 4.11. Let Σ be a normal modal logic. The canonical model for Σ is MΣ = ⟨WΣ, RΣ, VΣ⟩, where:
1. WΣ = {∆ : ∆ is complete Σ-consistent}.
2. RΣ∆∆′ holds if and only if □⁻¹∆ ⊆ ∆′.
3. VΣ(p) = {∆ : p ∈ ∆}.
4.6 The Truth Lemma

The canonical model MΣ is defined in such a way that MΣ, ∆ ⊩ A iff A ∈ ∆. For propositional variables, the definition of VΣ yields this directly. We have to verify that the equivalence holds for all formulas, however. We do this by induction. The inductive step involves proving the equivalence for formulas involving propositional operators (where we have to use Proposition 4.2) and the modal operators (where we invoke the results of section 4.4).
Proposition 4.12 (Truth Lemma). For every formula A, MΣ, ∆ ⊩ A if and only if A ∈ ∆.

Proof. By induction on A.
1. A ≡ ⊥: MΣ, ∆ ⊮ ⊥ by Definition 1.6, and ⊥ ∉ ∆ by Proposition 4.2(3).
2. A ≡ p: MΣ, ∆ ⊩ p iff ∆ ∈ VΣ(p) by Definition 1.6. Also, ∆ ∈ VΣ(p) iff p ∈ ∆ by definition of VΣ.
3. A ≡ ¬B: MΣ, ∆ ⊩ ¬B iff MΣ, ∆ ⊮ B (Definition 1.6) iff B ∉ ∆ (by inductive hypothesis) iff ¬B ∈ ∆ (by Proposition 4.2(4)).
4. A ≡ B ∧ C: Exercise.
5. A ≡ B ∨ C: MΣ, ∆ ⊩ B ∨ C iff MΣ, ∆ ⊩ B or MΣ, ∆ ⊩ C (by Definition 1.6) iff B ∈ ∆ or C ∈ ∆ (by inductive hypothesis) iff B ∨ C ∈ ∆ (by Proposition 4.2(6)).
6. A ≡ B → C: Exercise.
7. A ≡ □B: First suppose that MΣ, ∆ ⊩ □B. By Definition 1.6, for every ∆′ such that RΣ∆∆′, MΣ, ∆′ ⊩ B. By inductive hypothesis, for every ∆′ such that RΣ∆∆′, B ∈ ∆′. By definition of RΣ, for every ∆′ such that □⁻¹∆ ⊆ ∆′, B ∈ ∆′. By Proposition 4.8, □B ∈ ∆.
Now assume □B ∈ ∆. Let ∆′ ∈ WΣ be such that RΣ∆∆′, i.e., □⁻¹∆ ⊆ ∆′. Since □B ∈ ∆, B ∈ □⁻¹∆. Consequently, B ∈ ∆′. By inductive hypothesis, MΣ, ∆′ ⊩ B. Since ∆′ was arbitrary with RΣ∆∆′, for all ∆′ ∈ WΣ such that RΣ∆∆′, MΣ, ∆′ ⊩ B. By Definition 1.6, MΣ, ∆ ⊩ □B.
8. A ≡ ♦B: Exercise.
4.7 Determination and Completeness for K

We are now prepared to use the canonical model to establish completeness. Completeness follows from the fact that the formulas true in the canonical model for Σ are exactly the Σ-derivable ones. Models with this property are said to determine Σ.

Definition 4.13. A model M determines a normal modal logic Σ precisely when M ⊩ A if and only if Σ ⊢ A, for all formulas A.

Theorem 4.14 (Determination). MΣ ⊩ A if and only if Σ ⊢ A.

Proof. If MΣ ⊩ A, then for every complete Σ-consistent ∆, we have MΣ, ∆ ⊩ A. Hence, by the Truth Lemma, A ∈ ∆ for every complete Σ-consistent ∆, whence by Corollary 4.4 (with Γ = ∅), Σ ⊢ A.
Conversely, if Σ ⊢ A, then by Proposition 4.2(1), every complete Σ-consistent ∆ contains A, and hence by the Truth Lemma, MΣ, ∆ ⊩ A for every ∆ ∈ WΣ, i.e., MΣ ⊩ A.
Since the canonical model for K determines K, we immedi-
ately have completeness of K as a corollary:
Corollary 4.15. The basic modal logic K is complete with respect to the class of all models, i.e., if ⊨ A, then K ⊢ A.

Proof. Contrapositively, if K ⊬ A, then by Determination MK ⊮ A, and hence A is not valid.
For the general case of completeness of a system Σ with respect to a class of models, e.g., of KTB4 with respect to the class of reflexive, symmetric, transitive models, determination alone is not enough. We must also show that the canonical model for the system Σ is a member of the class, which does not follow obviously from the canonical model construction—nor is it always true!
4.8 Frame Completeness
The completeness theorem for K can be extended to other modal
systems, once we show that the canonical model for a given logic
has the corresponding frame property.
Theorem 4.16. If a normal modal logic Σ contains one of the formulas on the left-hand side of table 4.1, then the canonical model for Σ has the corresponding property on the right-hand side.

If Σ contains . . .    . . . the canonical model for Σ is:
D: □A → ♦A    serial;
T: □A → A    reflexive;
B: A → □♦A    symmetric;
4: □A → □□A    transitive;
5: ♦A → □♦A    euclidean.

Table 4.1: Basic correspondence facts.
Proof. We take each of these up in turn.
Suppose Σ contains D, and let ∆ ∈ WΣ; we need to show that there is a ∆′ such that RΣ∆∆′. It suffices to show that □⁻¹∆ is Σ-consistent, for then by Lindenbaum’s Lemma, there is a complete Σ-consistent set ∆′ ⊇ □⁻¹∆, and by definition of RΣ we have RΣ∆∆′. So, suppose for contradiction that □⁻¹∆ is not Σ-consistent, i.e., □⁻¹∆ ⊢Σ ⊥. By Lemma 4.7, ∆ ⊢Σ □⊥, and since Σ contains D, also ∆ ⊢Σ ♦⊥. But Σ is normal, so Σ ⊢ ¬♦⊥ (Proposition 3.7), whence also ∆ ⊢Σ ¬♦⊥, against the consistency of ∆.
Now suppose Σ contains T, and let ∆ ∈ WΣ. We want to show RΣ∆∆, i.e., □⁻¹∆ ⊆ ∆. But if □A ∈ ∆, then by T also A ∈ ∆, as desired.
Now suppose Σ contains B, and suppose RΣ∆∆′ for ∆, ∆′ ∈ WΣ. We need to show that RΣ∆′∆, i.e., □⁻¹∆′ ⊆ ∆. By Lemma 4.9, this is equivalent to ♦∆ ⊆ ∆′. So suppose A ∈ ∆. By B, also □♦A ∈ ∆. By the hypothesis that RΣ∆∆′, we have that □⁻¹∆ ⊆ ∆′, and hence ♦A ∈ ∆′, as required.
Now suppose Σ contains 4, and suppose RΣ∆1∆2 and RΣ∆2∆3. We need to show RΣ∆1∆3. From the hypothesis we have both □⁻¹∆1 ⊆ ∆2 and □⁻¹∆2 ⊆ ∆3. In order to show RΣ∆1∆3 it suffices to show □⁻¹∆1 ⊆ ∆3. So let B ∈ □⁻¹∆1, i.e., □B ∈ ∆1. By 4, also □□B ∈ ∆1, and by hypothesis we get, first, that □B ∈ ∆2 and, second, that B ∈ ∆3, as desired.
Now suppose Σ contains 5, and suppose RΣ∆1∆2 and RΣ∆1∆3. We need to show RΣ∆2∆3. The first hypothesis gives □⁻¹∆1 ⊆ ∆2, and the second hypothesis is equivalent to ♦∆3 ⊆ ∆1, by Lemma 4.9. To show RΣ∆2∆3, by Lemma 4.9, it suffices to show ♦∆3 ⊆ ∆2. So let ♦A ∈ ♦∆3, i.e., A ∈ ∆3. By the second hypothesis ♦A ∈ ∆1, and by 5, □♦A ∈ ∆1 as well. But now the first hypothesis gives ♦A ∈ ∆2, as desired.
As a corollary we obtain completeness results for a number
of systems. For instance, we know that S5 = KT5 = KTB4
is complete with respect to the class of all reflexive euclidean
models, which is the same as the class of all reflexive, symmetric
and transitive models.
Theorem 4.17. Let CD, CT, CB, C4, and C5 be the classes of all serial, reflexive, symmetric, transitive, and euclidean models (respectively). Then for any schemas A1, . . . , An among D, T, B, 4, and 5, the system KA1 . . . An is determined by the class of models C = CA1 ∩ · · · ∩ CAn.
Proposition 4.18. Let Σ be a normal modal logic; then:
1. If Σ contains the schema ♦A → □A, then the canonical model for Σ is partially functional.
2. If Σ contains the schema ♦A ↔ □A, then the canonical model for Σ is functional.
3. If Σ contains the schema □□A → □A, then the canonical model for Σ is weakly dense.
(See table 2.2 for definitions of these frame properties.)
Proof. 1. Suppose that Σ contains the schema ♦A → □A. To show that RΣ is partially functional we need to prove that for any ∆1, ∆2, ∆3 ∈ WΣ, if RΣ∆1∆2 and RΣ∆1∆3, then ∆2 = ∆3. Since RΣ∆1∆2 we have □⁻¹∆1 ⊆ ∆2, and since RΣ∆1∆3 also □⁻¹∆1 ⊆ ∆3. The identity ∆2 = ∆3 will follow if we can establish the two inclusions ∆2 ⊆ ∆3 and ∆3 ⊆ ∆2. For the first inclusion, let A ∈ ∆2; then ♦A ∈ ∆1 (by Lemma 4.9), and by the schema and deductive closure of ∆1 also □A ∈ ∆1, whence by the hypothesis that RΣ∆1∆3, A ∈ ∆3. The second inclusion is similar.
2. This follows immediately from part (1) and the seriality proof in Theorem 4.16.
3. Suppose Σ contains the schema □□A → □A. To show that RΣ is weakly dense, let RΣ∆1∆2. We need to show that there is a complete Σ-consistent set ∆3 such that RΣ∆1∆3 and RΣ∆3∆2. Let:
Γ = □⁻¹∆1 ∪ ♦∆2.
It suffices to show that Γ is Σ-consistent, for then by Lindenbaum’s Lemma it can be extended to a complete Σ-consistent set ∆3 such that □⁻¹∆1 ⊆ ∆3 and ♦∆2 ⊆ ∆3, i.e., RΣ∆1∆3 and RΣ∆3∆2 (by Lemma 4.9).
Suppose for contradiction that Γ is not Σ-consistent. Then there are formulas A1, . . . , An ∈ □⁻¹∆1 and B1, . . . , Bm ∈ ∆2 such that
A1, . . . , An, ♦B1, . . . , ♦Bm ⊢Σ ⊥.
Since ♦(B1 ∧ · · · ∧ Bm) → (♦B1 ∧ · · · ∧ ♦Bm) is derivable in every normal modal logic, we argue as follows, contradicting the consistency of ∆2 (which contains B1, . . . , Bm):
A1, . . . , An, ♦B1, . . . , ♦Bm ⊢Σ ⊥
⇒ A1, . . . , An ⊢Σ (♦B1 ∧ · · · ∧ ♦Bm) → ⊥,    deduction theorem;
⇒ A1, . . . , An ⊢Σ ♦(B1 ∧ · · · ∧ Bm) → ⊥,    Σ is normal;
⇒ A1, . . . , An ⊢Σ □¬(B1 ∧ · · · ∧ Bm),    pl;
⇒ □A1, . . . , □An ⊢Σ □□¬(B1 ∧ · · · ∧ Bm),    Lemma 4.6;
⇒ □A1, . . . , □An ⊢Σ □¬(B1 ∧ · · · ∧ Bm),    by the schema;
⇒ ∆1 ⊢Σ □¬(B1 ∧ · · · ∧ Bm),    Monotony;
⇒ □¬(B1 ∧ · · · ∧ Bm) ∈ ∆1,    deductive closure;
⇒ ¬(B1 ∧ · · · ∧ Bm) ∈ ∆2,    since RΣ∆1∆2.
On the strength of these examples, one might think that every
system Σ of modal logic is complete, in the sense that it proves ev-
ery formula which is valid in every frame in which every theorem
of Σ is valid. Unfortunately, there are many systems that are not
complete in this sense.
Problems
Problem 4.1. Complete the proof of Proposition 4.2.
Problem 4.2. Show that if Γ is complete Σ-consistent, then ♦A ∈ Γ if and only if there is a complete Σ-consistent ∆ such that □⁻¹Γ ⊆ ∆ and A ∈ ∆. Do this without using Lemma 4.9.
Problem 4.3. Complete the proof of Proposition 4.12.
CHAPTER 5
Filtrations and Decidability
5.1 Introduction

One important question about a logic is always whether it is decidable, i.e., whether there is an effective procedure which will answer the question “is this formula valid?” Propositional logic is decidable: we can effectively test if a formula is a tautology by constructing a truth table, and for a given formula, the truth table is finite. But we can’t obviously test if a modal formula is true in all models, for there are infinitely many of them. We can, however, list all the finite models relevant to a given formula, since only the assignment of subsets of worlds to the propositional variables which actually occur in the formula is relevant. If the accessibility relation is fixed, the possible different assignments V(p) are just all the subsets of W, and if |W| = n there are 2^n of those. If our formula A contains m propositional variables, there are then 2^(nm) different models with n worlds. For each one, we can test if A is true at all worlds, simply by computing the truth value of A in each. Of course, we also have to check all possible accessibility relations, but there are only finitely many relations on n worlds as well (specifically, 2^(n^2), the number of subsets of W × W).
If we are not interested in the logic K, but in a logic defined by some class of models (e.g., the reflexive transitive models), we also have to be able to test if the accessibility relation is of the right kind. We can do that whenever the frames we are interested in are definable by modal formulas (e.g., by testing if T and 4 are valid in the frame). So, the idea would be to run through all the finite frames, test each one to see if it is a frame in the class we’re interested in, then list all the possible models on that frame and test if A is true in each. If not, stop: A is not valid in the class of models of interest.
There is a problem with this idea: we don’t know when, if ever, we can stop looking. If the formula has a finite countermodel, our procedure will find it. But if it has no finite countermodel, we won’t get an answer. The formula may be valid (no countermodels at all), or it may have only infinite countermodels, which we’ll never look at. This problem can be overcome if we can show that every formula that has a countermodel has a finite countermodel. If this is the case we say the logic has the finite model property.
But how would we show that a logic has the finite model property? One way of doing this would be to find a way to turn an infinite (counter)model of A into a finite one. If that can be done, then whenever there is a model in which A is not true, the resulting finite model also makes A not true. That finite model will show up on our list of all finite models, and we will eventually determine, for every formula that is not valid, that it isn’t. Our procedure won’t terminate if the formula is valid. If we can show in addition that there is some maximum size that the finite model our procedure provides can have, and that this maximum size depends only on the formula A, we will have a size up to which we have to test finite models in our search for countermodels. If we haven’t found a countermodel by then, there are none. Then our procedure will, in fact, decide the question “is A valid?” for any formula A.
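As a sketch of the brute-force search just described, here is a Python fragment that hunts for countermodels of a formula among all models with at most a given number of worlds (all names are our own; `holds` is the truth-at-a-world evaluator from the earlier sketch, repeated so the fragment is self-contained):

```python
from itertools import product

def holds(M, w, f):
    """Truth at a world in a relational model M = (W, R, V)."""
    W, R, V = M
    op = f[0]
    if op == 'bot': return False
    if op == 'var': return w in V.get(f[1], set())
    if op == 'not': return not holds(M, w, f[1])
    if op == 'and': return holds(M, w, f[1]) and holds(M, w, f[2])
    if op == 'or':  return holds(M, w, f[1]) or holds(M, w, f[2])
    if op == 'imp': return (not holds(M, w, f[1])) or holds(M, w, f[2])
    if op == 'box': return all(holds(M, v, f[1]) for v in W if (w, v) in R)
    return any(holds(M, v, f[1]) for v in W if (w, v) in R)  # 'dia'

def variables(f):
    """The propositional variables occurring in f."""
    if f[0] == 'var':
        return {f[1]}
    return set().union(set(), *(variables(g) for g in f[1:] if isinstance(g, tuple)))

def countermodel(f, max_worlds):
    """Search all models with at most max_worlds worlds for a world where f
    fails. A None result is inconclusive unless the logic is known to have
    the finite model property with max_worlds as a bound."""
    ps = sorted(variables(f))
    for n in range(1, max_worlds + 1):
        W = set(range(n))
        pairs = [(u, v) for u in W for v in W]
        for rel in product([False, True], repeat=len(pairs)):    # all R ⊆ W × W
            R = {uv for uv, b in zip(pairs, rel) if b}
            for bits in product(range(2 ** n), repeat=len(ps)):  # all V(p) ⊆ W
                V = {p: {w for w in W if (b >> w) & 1} for p, b in zip(ps, bits)}
                M = (W, R, V)
                for w in W:
                    if not holds(M, w, f):
                        return M, w
    return None

p = ('var', 'p')
print(countermodel(('imp', ('box', p), p), 2))   # finds a countermodel to T
```

If the logic has the finite model property with a computable bound, calling `countermodel` with that bound decides validity; without such a bound, a `None` result tells us nothing.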
A strategy that often works for turning infinite structures into finite structures is that of “identifying” elements of the structure which behave the same way in relevant respects. If there are infinitely many worlds in M that behave the same in relevant respects, then we might hope that there are only finitely many “classes” of such worlds. In other words, we partition the set of worlds in the right way. Each partition may contain infinitely many worlds, but there are only finitely many partitions. Then we define a new model M∗ where the worlds are the partitions. Finitely many partitions in the old model give us finitely many worlds in the new model, i.e., a finite model. Let’s call the partition a world w is in [w]. We’ll want it to be the case that M, w ⊩ A iff M∗, [w] ⊩ A, since we want the new model to be a countermodel to A if the old one was. This requires that we define the partition, as well as the accessibility relation of M∗, in the right way.
To see how this would go, first imagine we have no accessibility relation. Then M, w ⊩ ♦B iff for some v ∈ W, M, v ⊩ B, and the same for M∗, except with [w] and [v]. As a first idea, let’s say that two worlds u and v are equivalent (belong to the same partition) if they agree on all propositional variables in M, i.e., M, u ⊩ p iff M, v ⊩ p. Let V∗(p) = {[w] : M, w ⊩ p}. Our aim is to show that M, w ⊩ A iff M∗, [w] ⊩ A. Obviously, we’d prove this by induction: The base case would be A ≡ p. First suppose M, w ⊩ p. Then [w] ∈ V∗(p) by definition, so M∗, [w] ⊩ p. Now suppose that M∗, [w] ⊩ p. That means that [w] ∈ V∗(p), i.e., for some v equivalent to w, M, v ⊩ p. But “w equivalent to v” means “w and v make all the same propositional variables true,” so M, w ⊩ p. Now for the inductive step, e.g., A ≡ ¬B. Then M, w ⊩ ¬B iff M, w ⊮ B iff M∗, [w] ⊮ B (by inductive hypothesis) iff M∗, [w] ⊩ ¬B. Similarly for the other non-modal operators. It also works for □: suppose M∗, [w] ⊩ □B. That means that for every [u], M∗, [u] ⊩ B. By inductive hypothesis, for every u, M, u ⊩ B. Consequently, M, w ⊩ □B.
In the general case, where we also have to define the accessibility relation for M∗, things are more complicated. We’ll call a model M∗ a filtration if its accessibility relation R∗ satisfies the conditions required to make the inductive proof above go through. Then any filtration M∗ will make A true at [w] iff M makes A true at w. However, now we also have to show that there are filtrations, i.e., that we can define R∗ so that it satisfies the required conditions. In order for this to work, however, we have to require that worlds u, v count as equivalent not just when they agree on all propositional variables, but when they agree on all sub-formulas of A. Since A has only finitely many sub-formulas, this will still guarantee that the filtration is finite. There is not just one way to define a filtration, and in order to make sure that the accessibility relation of the filtration satisfies the required properties (e.g., reflexive, transitive, etc.) we have to be inventive with the definition of R∗.
5.2 Preliminaries

Filtrations allow us to establish the decidability of our systems of modal logic by showing that they have the finite model property, i.e., that any formula that is true (false) in a model is also true (false) in a finite model. Filtrations are defined relative to sets of formulas which are closed under subformulas.

Definition 5.1. A set Γ of formulas is closed under subformulas if it contains every subformula of a formula in Γ. Further, Γ is modally closed if it is closed under subformulas and moreover A ∈ Γ implies □A, ♦A ∈ Γ.
For instance, given a formula A, the set of all its sub-formulas
is closed under sub-formulas. When we’re defining a filtration
of a model through the set of sub-formulas of A, it will have the
property we’re after: it makes A true (false) iff the original model
does.
The set of worlds of a filtration of M through Γ is defined
as the set of all equivalence classes of the following equivalence
relation.
Definition 5.2. Let M = ⟨W, R, V⟩ and suppose Γ is closed under sub-formulas. Define a relation ≡ on W to hold of any two worlds that make the same formulas from Γ true, i.e.:
u ≡ v if and only if ∀A ∈ Γ: M, u ⊩ A ⇔ M, v ⊩ A.
The equivalence class [w]≡ of a world w, or [w] for short, is the set of all worlds ≡-equivalent to w:
[w] = {v : v ≡ w}.
Proposition 5.3. Given M and Γ, ≡ as defined above is an equiva-
lence relation, i.e., it is reflexive, symmetric, and transitive.
Proof. The relation ≡ is reflexive, since w makes exactly the same
formulas from Γ true as itself. It is symmetric since if u makes
the same formulas from Γ true as v , the same holds for v and u.
It is also transitive, since if u makes the same formulas from Γ
true as v , and v as w, then u makes the same formulas from Γ
true as w.
The relation ≡, like any equivalence relation, divides W into
partitions, i.e., subsets of W which are pairwise disjoint, and to-
gether cover all of W . Every w ∈ W is an element of one of the
partitions, namely of [w], since w ≡ w. So the partitions [w]
cover all of W . They are pairwise disjoint, for if u ∈ [w] and
u ∈ [v ], then u ≡ w and u ≡ v , and by symmetry and transitivity,
w ≡ v , and so [w] = [v ].
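In computational terms, the classes [w] can be read off directly
from the truth assignments. Here is a minimal sketch (ours, not
the text's) in Python, assuming a finite set of worlds and a helper
forces(w, A) that decides M, w ⊩ A:

    def equivalence_classes(worlds, gamma, forces):
        """Partition `worlds` by agreement on all formulas in `gamma`."""
        classes = {}
        for w in worlds:
            # The "signature" of w: which formulas of gamma it makes true.
            sig = frozenset(A for A in gamma if forces(w, A))
            classes.setdefault(sig, set()).add(w)
        return list(classes.values())

Each class is the set of worlds sharing a "signature," the subset
of Γ they make true; this is exactly the observation exploited in
Proposition 5.12 below.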
5.3 Filtrations
Rather than define “the” filtration of M through Γ, we define
when a model M ∗ counts as a filtration of M. All filtrations have
the same set of worlds W ∗ and the same valuation V ∗ . But dif-
ferent filtrations may have different accessibility relations R ∗ . To
count as a filtration, R ∗ has to satisfy a number of conditions,
however. These conditions are exactly what we’ll require to prove
the main result, namely that M, w A iff M ∗, [w] A, provided
A ∈ Γ.
Definition 5.4. Let Γ be closed under subformulas and M =
⟨W, R, V⟩. A filtration of M through Γ is any model M∗ = ⟨W∗, R∗, V∗⟩,
where:

1. W∗ = {[w] : w ∈ W};

2. For any u, v ∈ W:

a) If Ruv then R∗[u][v];

b) If R∗[u][v] then for any □A ∈ Γ, if M, u ⊩ □A then
M, v ⊩ A;

c) If R∗[u][v] then for any ♦A ∈ Γ, if M, v ⊩ A then
M, u ⊩ ♦A.

3. V∗(p) = {[u] : u ∈ V(p)}.
It’s worthwhile thinking about what V ∗ (p) is: the set consist-
ing of the equivalence classes [w] of all worlds w where p is true
in M. On the one hand, if w ∈ V (p), then [w] ∈ V ∗ (p) by that def-
inition. However, it is not necessarily the case that if [w] ∈ V ∗ (p),
then w ∈ V (p). If [w] ∈ V ∗ (p) we are only guaranteed that
[w] = [u] for some u ∈ V (p). Of course, [w] = [u] means that
w ≡ u. So, when [w] ∈ V ∗ (p) we can (only) conclude that w ≡ u
for some u ∈ V (p).
Theorem 5.5. If M∗ is a filtration of M through Γ, then for every
A ∈ Γ and w ∈ W, we have M, w ⊩ A if and only if M∗, [w] ⊩ A.
Proof. By induction on A, using the fact that Γ is closed under
subformulas. Since A ∈ Γ and Γ is closed under sub-formulas,
all sub-formulas of A are also ∈ Γ. Hence in each inductive step,
the induction hypothesis applies to the sub-formulas of A.
1. A ≡ ⊥: Neither M, w ⊩ A nor M∗, [w] ⊩ A.

2. A ≡ p: The left-to-right direction is immediate, as M, w ⊩
A only if w ∈ V(p), which implies [w] ∈ V∗(p), i.e., M∗, [w] ⊩
A. Conversely, suppose M∗, [w] ⊩ A, i.e., [w] ∈ V∗(p).
Then for some v ∈ V(p), w ≡ v. Of course then also
M, v ⊩ p. Since w ≡ v, w and v make the same formulas
from Γ true. Since by assumption p ∈ Γ and M, v ⊩ p,
M, w ⊩ A.

3. A ≡ ¬B: M, w ⊩ A iff M, w ⊮ B. By induction hypothesis,
M, w ⊮ B iff M∗, [w] ⊮ B. Finally, M∗, [w] ⊮ B iff M∗, [w] ⊩
A.

4. Exercise.

5. A ≡ (B ∨ C): M, w ⊩ A iff M, w ⊩ B or M, w ⊩ C.
By induction hypothesis, M, w ⊩ B iff M∗, [w] ⊩ B, and
M, w ⊩ C iff M∗, [w] ⊩ C. And M∗, [w] ⊩ A iff M∗, [w] ⊩ B
or M∗, [w] ⊩ C.

6. Exercise.

7. A ≡ □B: Suppose M, w ⊩ A; to show that M∗, [w] ⊩ A, let
[v] be such that R∗[w][v]. From Definition 5.4(2b), we have
that M, v ⊩ B, and by inductive hypothesis M∗, [v] ⊩ B.
Since [v] was arbitrary, M∗, [w] ⊩ A follows.

Conversely, suppose M∗, [w] ⊩ A and let v be arbitrary
such that Rwv. From Definition 5.4(2a), we have R∗[w][v],
so that M∗, [v] ⊩ B; by inductive hypothesis M, v ⊩ B, and
since v was arbitrary, M, w ⊩ A.

8. Exercise.
8. Exercise.
What holds for truth at worlds in a model also holds for truth
in a model and validity in a class of models.
Corollary 5.6. Let Γ be closed under subformulas. Then:

1. If M∗ is a filtration of M through Γ then for any A ∈ Γ: M ⊩ A
if and only if M∗ ⊩ A.

2. If C is a class of models and Γ(C) is the class of Γ-filtrations of
models in C, then any formula A ∈ Γ is valid in C if and only
if it is valid in Γ(C).
5.4 Examples of Filtrations
We have not yet shown that there are any filtrations. But indeed,
for any model M, there are many filtrations of M through Γ.
We identify two, in particular: the finest and coarsest filtrations.
Filtrations of the same models will differ in their accessibility
relation (as Definition 5.4 stipulates directly what W ∗ and V ∗
should be). The finest filtration will have as few related worlds as
possible, whereas the coarsest will have as many as possible.
Definition 5.7. Where Γ is closed under subformulas, the finest
filtration M∗ of a model M is defined by putting:

R∗[u][v] if and only if ∃u′ ∈ [u] ∃v′ ∈ [v]: Ru′v′.
Proposition 5.8. The finest filtration M ∗ is indeed a filtration.
Proof. We need to check that R∗, so defined, satisfies Definition 5.4(2).
We check the three conditions in turn.
If Ruv, then since u ∈ [u] and v ∈ [v], also R∗[u][v], so (2a) is
satisfied.
For (2b), suppose □A ∈ Γ, R∗[u][v], and M, u ⊩ □A. By
definition of R∗, there are u′ ≡ u and v′ ≡ v such that Ru′v′.
Since u and u′ agree on Γ, also M, u′ ⊩ □A, so that M, v′ ⊩ A.
By closure of Γ under sub-formulas, v and v′ agree on A, so
M, v ⊩ A, as desired.
[Figure 5.1: An infinite model and its filtrations. The model has
worlds 1, 2, 3, 4, . . . , each world accessing its successor, with p
false at the odd and true at the even worlds; the filtrations have
the two worlds [1] (where p is false) and [2] (where p is true).]
We leave the verification of (2c) as an exercise.
Definition 5.9. Where Γ is closed under subformulas, the coarsest
filtration M∗ of a model M is defined by putting R∗[u][v] if
and only if both of the following conditions are met:

1. If □A ∈ Γ and M, u ⊩ □A then M, v ⊩ A;

2. If ♦A ∈ Γ and M, v ⊩ A then M, u ⊩ ♦A.
Proposition 5.10. The coarsest filtration M ∗ is indeed a filtration.
Proof. Given the definition of R∗, the only condition that is left to
verify is the implication from Ruv to R∗[u][v]. So assume Ruv.
Suppose □A ∈ Γ and M, u ⊩ □A; then obviously M, v ⊩ A, and
(1) is satisfied. Suppose ♦A ∈ Γ and M, v ⊩ A. Then M, u ⊩ ♦A
since Ruv, and (2) is satisfied.
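To make the two definitions concrete, here is a sketch (again ours,
not the text's) computing the finest and coarsest accessibility
relations. Here cls maps each world to (a hashable representation
of) its class [w], R is a set of pairs of worlds, forces(w, A)
decides truth at w, and boxed and diamonded list the formulas
□A and ♦A in Γ as pairs (□A, A) and (♦A, A):

    def finest_R(R, cls):
        # R*[u][v] iff some u' in [u] and some v' in [v] with Ru'v'.
        return {(cls[u], cls[v]) for (u, v) in R}

    def coarsest_R(worlds, cls, forces, boxed, diamonded):
        # R*[u][v] iff conditions (1) and (2) of Definition 5.9 hold.
        rel = set()
        for u in worlds:
            for v in worlds:
                ok1 = all(forces(v, A)
                          for (boxA, A) in boxed if forces(u, boxA))
                ok2 = all(forces(u, diaA)
                          for (diaA, A) in diamonded if forces(v, A))
                if ok1 and ok2:
                    rel.add((cls[u], cls[v]))
        return rel

Running both on a finite fragment of the model in the next
example reproduces the two accessibility relations found there.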
Example 5.11. Let W = Z+, Rnm iff m = n + 1, and V(p) =
{2n : n ∈ Z+}. The model M = ⟨W, R, V⟩ is depicted in Figure 5.1.
The worlds are 1, 2, etc.; each world can access exactly
one other world—its successor—and p is true at all and only the
even numbers.
Now let Γ be the set of sub-formulas of □p → p, i.e., {p, □p, □p →
p}. p is true at all and only the even numbers, □p is true at all and
only the odd numbers, so □p → p is true at all and only the even
numbers. In other words, every odd number makes □p true and p
and □p → p false; every even number makes p and □p → p true,
but □p false. So W∗ = {[1], [2]}, where [1] = {1, 3, 5, . . . } and
[2] = {2, 4, 6, . . . }. Since 2 ∈ V(p), [2] ∈ V∗(p); since 1 ∉ V(p),
[1] ∉ V∗(p). So V∗(p) = {[2]}.
Any filtration based on W∗ must have an accessibility relation
that includes ⟨[1], [2]⟩ and ⟨[2], [1]⟩: since R12, we must have R∗[1][2]
by Definition 5.4(2a), and since R23 we must have R∗[2][3], and
[3] = [1]. It cannot include ⟨[1], [1]⟩: if it did, we'd have R∗[1][1],
M, 1 ⊩ □p but M, 1 ⊮ p, contradicting (2b). Nothing requires
or rules out that R∗[2][2]. So, there are two possible filtrations
of M, corresponding to the two accessibility relations

{⟨[1], [2]⟩, ⟨[2], [1]⟩} and {⟨[1], [2]⟩, ⟨[2], [1]⟩, ⟨[2], [2]⟩}.

In either case, p and □p → p are false and □p is true at [1]; p
and □p → p are true and □p is false at [2].
5.5 Filtrations are Finite
We’ve defined filtrations for any set Γ that is closed under sub-
formulas. Nothing in the definition itself guarantees that filtra-
tions are finite. In fact, when Γ is infinite (e.g., is the set of all
formulas), it may well be infinite. However, if Γ is finite (e.g.,
when it is the set of sub-formulas of a given formula A), so is any
filtration through Γ.
Proposition 5.12. If Γ is finite then any filtration M ∗ of a model M
through Γ is also finite.
Proof. The size of W∗ is the number of different classes [w] under
the equivalence relation ≡. Any two worlds u, v in such a class—
that is, any u and v such that u ≡ v—agree on all formulas
in Γ: for each A ∈ Γ, either A is true at both u and v, or at neither. So
each class [w] corresponds to a subset of Γ, namely the set of all
A ∈ Γ such that A is true at the worlds in [w]. No two different
classes [u] and [v] correspond to the same subset of Γ. For if the
set of formulas true at u and that of formulas true at v are the
same, then u and v agree on all formulas in Γ, i.e., u ≡ v. But
then [u] = [v]. So, there is an injective function from W∗ to ℘(Γ),
and hence |W∗| ≤ |℘(Γ)|. Hence if Γ contains n sentences, the
cardinality of W∗ is no greater than 2^n.
5.6 K and S5 have the Finite Model Property
Definition 5.13. A system Σ of modal logic is said to have the
finite model property if whenever a formula A is true at a world in
a model of Σ then A is true at a world in a finite model of Σ .
Proposition 5.14. K has the finite model property.
Proof. K is the set of valid formulas, i.e., any model is a model
of K. By Theorem 5.5, if A is true at w in M, then A is true at
[w] in any filtration of M through the set Γ of sub-formulas of A.
Any formula only has finitely many sub-formulas, so Γ is finite.
By Proposition 5.12, |W∗| ≤ 2^n, where n is the number of formulas
in Γ. And since K imposes no restriction on models, M∗ is a K-model.
To show that a logic L has the finite model property via filtrations
it is essential that the filtration of an L-model is itself
an L-model. Often this requires a fair bit of work, and not every
filtration yields an L-model. However, for universal models, this
does hold.
Proposition 5.15. Let U be the class of universal models (see Propo-
sition 2.14) and UFin the class of all finite universal models. Then any
formula A is valid in U if and only if it is valid in UFin .
Proof. Finite universal models are universal models, so the left-to-right
direction is trivial. For the right-to-left direction, suppose
that A is false at some world w in a universal model M. Let Γ
contain A as well as all of its subformulas; clearly Γ is finite. Take
a filtration M∗ of M through Γ; then M∗ is finite by Proposition 5.12,
and by Theorem 5.5, A is false at [w] in M∗. It remains to observe
that M∗ is also universal: given u and v, by hypothesis Ruv, and
by Definition 5.4(2a), also R∗[u][v].
Corollary 5.16. S5 has the finite model property.
Proof. By Proposition 2.14, if A is true at a world in some reflex-
ive and euclidean model then it is true at a world in a universal
model. By Proposition 5.15, it is true at a world in a finite uni-
versal model (namely the filtration of the model through the set
of sub-formulas of A). Every universal model is also reflexive and
euclidean; so A is true at a world in a finite reflexive euclidean
model.
5.7 S5 is Decidable
The finite model property gives us an easy way to show that systems
of modal logic given by schemas are decidable (i.e., that there
is a computable procedure to determine whether a formula is
derivable in the system or not).
Theorem 5.17. S5 is decidable.
Proof. Let A be given, and suppose the propositional variables
occurring in A are among p1, . . . , pk. Since for each n there are
only finitely many models with n worlds assigning a value to p1,
. . . , pk, we can enumerate, in parallel, all the theorems of S5 (by
generating proofs in some systematic way) and all the models containing
1, 2, . . . worlds, checking whether A fails at a world in
some such model. Eventually one of the two parallel processes
will give an answer, as by Theorem 4.17 and Corollary 5.16, either
A is derivable or it fails in a finite universal model.
The above proof works for S5 because filtrations of universal
models are automatically universal. The same holds for reflexiv-
ity and seriality, but more work is needed for other properties.
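For S5 the finite model property can even be turned into a naive
executable decision procedure. The following is a minimal sketch
(ours, not the text's) in Python: since in a universal model only
the set of valuations matters, A is S5-valid iff it holds at every
world of every nonempty set of valuations over the variables of A.
Formulas are nested tuples such as ('box', A); the representation
and the function names are our own choices:

    from itertools import chain, combinations

    def truth(A, w, W):
        """Truth of A at valuation w (a frozenset of true variables) in
        the universal model whose worlds are the valuations in W."""
        op = A[0]
        if op == 'var': return A[1] in w
        if op == 'not': return not truth(A[1], w, W)
        if op == '->':  return (not truth(A[1], w, W)) or truth(A[2], w, W)
        if op == 'box': return all(truth(A[1], v, W) for v in W)
        if op == 'dia': return any(truth(A[1], v, W) for v in W)
        raise ValueError(op)

    def s5_valid(A, variables):
        vals = [frozenset(s) for s in chain.from_iterable(
            combinations(variables, r) for r in range(len(variables) + 1))]
        # Check A at every world of every nonempty set of valuations.
        for r in range(1, len(vals) + 1):
            for W in combinations(vals, r):
                if any(not truth(A, w, W) for w in W):
                    return False
        return True

For instance, s5_valid(('->', ('dia', ('var', 'p')), ('box', ('dia',
('var', 'p')))), ['p']) confirms that schema 5 is S5-valid. The
search is doubly exponential in the number of variables, so this is
a proof of concept of decidability, not a practical prover.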
5.8 Filtrations and Properties of Accessibility
As noted, filtrations of universal, serial, and reflexive models are
always also universal, serial, or reflexive. But not every filtration
of a symmetric or transitive model is symmetric or transitive,
respectively. In some cases, however, it is possible to define fil-
trations so that this does hold. In order to do so, we proceed as in
the definition of the coarsest filtration, but add additional condi-
tions to the definition of R ∗ . Let Γ be closed under sub-formulas.
Consider the relations C i (u, v ) in table 5.1 between worlds u, v
in a model M = hW, R,V i. We can define R ∗ [u][v ] on the basis
of combinations of these conditions. For instance, if we stipulate
that R ∗ [u][v ] iff the condition C 1 (u, v ) holds, we get exactly the
coarsest filtration. If we stipulate R ∗ [u][v ] iff both C 1 (u, v ) and
C 2 (u, v ) hold, we get a different filtration. It is “finer” than the
coarsest since fewer pairs of worlds satisfy C 1 (u, v ) and C 2 (u, v )
than C 1 (u, v ) alone.
C1(u, v): if □A ∈ Γ and M, u ⊩ □A then M, v ⊩ A; and
          if ♦A ∈ Γ and M, v ⊩ A then M, u ⊩ ♦A.
C2(u, v): if □A ∈ Γ and M, v ⊩ □A then M, u ⊩ A; and
          if ♦A ∈ Γ and M, u ⊩ A then M, v ⊩ ♦A.
C3(u, v): if □A ∈ Γ and M, u ⊩ □A then M, v ⊩ □A; and
          if ♦A ∈ Γ and M, v ⊩ ♦A then M, u ⊩ ♦A.
C4(u, v): if □A ∈ Γ and M, v ⊩ □A then M, u ⊩ □A; and
          if ♦A ∈ Γ and M, u ⊩ ♦A then M, v ⊩ ♦A.

Table 5.1: Conditions on possible worlds for defining filtrations.
Theorem 5.18. Let M = ⟨W, R, V⟩ be a model, Γ closed under sub-formulas.
Let W∗ and V∗ be defined as in Definition 5.4. Then:

1. Suppose R∗[u][v] if and only if C1(u, v) ∧ C2(u, v). Then R∗
is symmetric, and M∗ = ⟨W∗, R∗, V∗⟩ is a filtration if M is
symmetric.

2. Suppose R∗[u][v] if and only if C1(u, v) ∧ C3(u, v). Then R∗
is transitive, and M∗ = ⟨W∗, R∗, V∗⟩ is a filtration if M is
transitive.

3. Suppose R∗[u][v] if and only if C1(u, v) ∧ C2(u, v) ∧ C3(u, v) ∧
C4(u, v). Then R∗ is symmetric and transitive, and M∗ = ⟨W∗, R∗, V∗⟩
is a filtration if M is symmetric and transitive.

4. Suppose R∗[u][v] if and only if C1(u, v) ∧ C3(u, v) ∧ C4(u, v).
Then R∗ is transitive and euclidean, and M∗ = ⟨W∗, R∗, V∗⟩ is
a filtration if M is transitive and euclidean.
Proof. 1. It's immediate that R∗ is symmetric, since C1(u, v) ⇔
C2(v, u) and C2(u, v) ⇔ C1(v, u). So it's left to show that if M
is symmetric then M∗ is a filtration through Γ. Condition
C1(u, v) guarantees that (2b) and (2c) of Definition 5.4 are
satisfied. So we just have to verify Definition 5.4(2a), i.e.,
that Ruv implies R∗[u][v].

So suppose Ruv. To show R∗[u][v] we need to establish
that C1(u, v) and C2(u, v). For C1: if □A ∈ Γ and M, u ⊩ □A
then also M, v ⊩ A (since Ruv). Similarly, if ♦A ∈ Γ and
M, v ⊩ A then M, u ⊩ ♦A since Ruv. For C2: if □A ∈ Γ
and M, v ⊩ □A then Ruv implies Rvu by symmetry, so
that M, u ⊩ A. Similarly, if ♦A ∈ Γ and M, u ⊩ A then
M, v ⊩ ♦A (since Rvu by symmetry).

2. Exercise.

3. Exercise.

4. Exercise.
5.9 Filtrations of Euclidean Models
The approach of section 5.8 does not work in the case of models
that are euclidean or serial and euclidean. Consider the model
at the top of Figure 5.2, which is both euclidean and serial. Let
Γ = {p, □p}. When taking a filtration through Γ, then [w1] =
[w3], since w1 and w3 are the only worlds that agree on Γ. Any
filtration will also have the arrows inherited from M, as depicted
in Figure 5.3. That model isn't euclidean. Moreover, we cannot
add arrows to that model in order to make it euclidean. We would
have to add double arrows between [w2] and [w4], and then also
between [w2] and [w5]. But □p is supposed to be true at [w2], while
p is false at [w5].
In particular, to obtain a euclidean filtration it is not enough
to consider filtrations through arbitrary Γ's closed under sub-formulas.
Instead we need to consider sets Γ that are modally
closed (see Definition 5.1). Such sets of sentences are infinite, and
therefore do not immediately yield a finite model property or the
decidability of the corresponding system.
[Figure 5.2: A serial and euclidean model with worlds w1 (¬p),
w2 (p), w3 (¬p), w4 (p), and w5 (¬p); □p is true at w1, w2, and
w3, and false at w4 and w5.]

[Figure 5.3: The filtration of the model in Figure 5.2; its worlds
are [w1] = [w3], [w2], [w4], and [w5].]
Theorem 5.19. Let Γ be modally closed, M = ⟨W, R, V⟩, and M∗ =
⟨W∗, R∗, V∗⟩ be a coarsest filtration of M.

1. If M is symmetric, so is M∗.

2. If M is transitive, so is M∗.

3. If M is euclidean, so is M∗.
Proof. If M∗ is a coarsest filtration, then by definition R∗[u][v]
holds if and only if C1(u, v).

1. Exercise. Use the fact that B□ and B♦ are valid in all symmetric
models.

2. For transitivity, suppose C1(u, v) and C1(v, w); we have to
show C1(u, w). Suppose M, u ⊩ □A; then M, u ⊩ □□A, since
4 is valid in all transitive models; since □□A ∈ Γ by modal
closure, by C1(u, v), M, v ⊩ □A, and by C1(v, w), also M, w ⊩ A.
Now suppose M, w ⊩ A; then M, v ⊩ ♦A by C1(v, w), since ♦A ∈ Γ
by modal closure. By C1(u, v), we get M, u ⊩ ♦♦A, since
♦♦A ∈ Γ by modal closure. Since 4♦ is valid in all transitive
models, M, u ⊩ ♦A.

3. Exercise. Use the fact that both 5□ and 5♦ are valid in all
euclidean models.
Problems
Problem 5.1. Complete the proof of Theorem 5.5.
Problem 5.2. Complete the proof of Proposition 5.8.
Problem 5.3. Consider the following model M = ⟨W, R, V⟩, where
W = {0σ : σ ∈ B∗}, the set of sequences of 0s and 1s starting
with 0, with Rσσ′ iff σ′ = σ0 or σ′ = σ1, and V(p) = {σ0 : σ ∈
B∗} and V(q) = {σ1 : σ ∈ B∗ \ {1}}. Here's a picture:
[Figure: the model M of Problem 5.3, a binary tree. World 0
(p, ¬q) accesses 00 (p, ¬q) and 01 (¬p, q); 00 accesses 000 (p, ¬q)
and 001 (¬p, q); 01 accesses 010 (p, ¬q) and 011 (¬p, q); and so
on.]
We have M, w ⊮ □(p ∨ q) → (□p ∨ □q) for every w.
Let Γ be the set of sub-formulas of □(p ∨ q) → (□p ∨ □q).
What are W∗ and V∗? What is the accessibility relation of the
finest filtration of M? Of the coarsest?
Problem 5.4. Show that any filtration of a serial or reflexive
model is also serial or reflexive (respectively).

Problem 5.5. Find a non-symmetric (non-transitive, non-euclidean)
filtration of a symmetric (transitive, euclidean) model.

Problem 5.6. Complete the proof of Theorem 5.18.

Problem 5.7. Complete the proof of Theorem 5.19.
CHAPTER 6
Modal Tableaux
6.1 Introduction
Tableaux are certain (downward-branching) trees of signed formulas,
i.e., pairs consisting of a truth value sign (T or F) and a
sentence:

T A or F A.

A tableau begins with a number of assumptions. Each further
signed formula is generated by applying one of the inference
rules. Some inference rules add one or more signed formulas
to a tip of the tree; others add two new tips, resulting in two
branches. Rules result in signed formulas where the formula is
less complex than that of the signed formula to which it was applied.
When a branch contains both T A and F A, we say the
branch is closed. If every branch in a tableau is closed, the entire
tableau is closed. A closed tableau constitutes a derivation that
shows that the set of signed formulas which were used to begin
the tableau is unsatisfiable. This can be used to define a ⊢ relation:
Γ ⊢ A iff there is some finite set Γ0 = {B1, . . . , Bn} ⊆ Γ such
that there is a closed tableau for the assumptions

{F A, T B1, . . . , T Bn}.
For modal logics, we have to both extend the notion of signed
formula and add rules that cover □ and ♦. In addition to a sign (T
or F), formulas in modal tableaux also have prefixes σ. The
prefixes are non-empty sequences of positive integers, i.e., σ ∈
(Z+)∗ \ {Λ}. We write such prefixes without the surrounding
⟨ ⟩, and separate the individual elements by .'s instead of ,'s.
If σ is a prefix, then σ.n is σ⌢⟨n⟩; e.g., if σ = 1.2.1, then σ.3
is 1.2.1.3. So for instance,

1.2 T □A → A

is a prefixed signed formula (or just a prefixed formula for short).
Intuitively, the prefix names a world in a model that might
satisfy the formulas on a branch of a tableau, and if σ names
some world, then σ.n names a world accessible from (the world
named by) σ.
6.2 Rules for K
The rules for the regular propositional connectives are the same
as for regular propositional signed tableaux, just with prefixes
added. In each case, the rule applied to a signed formula σ S A
produces new formulas that are also prefixed by σ. This should
be intuitively clear: e.g., if A ∧ B is true at (a world named by) σ,
then A and B are true at σ (and not at any other world). We
collect the propositional rules in table 6.1.
The closure condition is the same as for ordinary tableaux,
although we require that not just the formulas but also the prefixes
must match. So a branch is closed if it contains both

σ T A and σ F A

for some prefix σ and formula A.
The rules for setting up assumptions are also as for ordinary
tableaux, except that for assumptions we always use the prefix 1.
(It does not matter which prefix we use, as long as it's the same
¬T: from σ T ¬A, infer σ F A.
¬F: from σ F ¬A, infer σ T A.
∧T: from σ T A ∧ B, infer σ T A and σ T B.
∧F: from σ F A ∧ B, branch into σ F A | σ F B.
∨T: from σ T A ∨ B, branch into σ T A | σ T B.
∨F: from σ F A ∨ B, infer σ F A and σ F B.
→T: from σ T A → B, branch into σ F A | σ T B.
→F: from σ F A → B, infer σ T A and σ F B.

Table 6.1: Prefixed tableau rules for the propositional connectives.
for all assumptions.) So, e.g., we say that

B1, . . . , Bn ⊢ A

iff there is a closed tableau for the assumptions

1 T B1, . . . , 1 T Bn, 1 F A.
For the modal operators □ and ♦, the prefix of the conclusion
of the rule applied to a formula with prefix σ is σ.n. However,
which n is allowed depends on whether the sign is T or F.
The □T rule extends a branch containing σ T □A by σ.n T A.
Similarly, the ♦F rule extends a branch containing σ F ♦A by
σ.n F A. They can only be applied for a prefix σ.n which already
occurs on the branch in which it is applied. Let's call such a prefix
"used" (on the branch).
The □F rule extends a branch containing σ F □A by σ.n F A.
Similarly, the ♦T rule extends a branch containing σ T ♦A by
σ.n T A. These rules, however, can only be applied for a prefix
σ.n which does not already occur on the branch in which it is
applied. We call such prefixes "new" (to the branch).
The rules are given in table 6.2.

□T: from σ T □A, infer σ.n T A, where σ.n is used.
□F: from σ F □A, infer σ.n F A, where σ.n is new.
♦T: from σ T ♦A, infer σ.n T A, where σ.n is new.
♦F: from σ F ♦A, infer σ.n F A, where σ.n is used.

Table 6.2: The modal rules for K.
The restriction that the prefix for □T must be used is necessary,
as otherwise we would count the following as a closed
tableau:

1. 1 T □A        Assumption
2. 1 F ♦A        Assumption
3. 1.1 T A       □T 1
4. 1.1 F A       ♦F 2
   ⊗

But □A ⊭ ♦A, so our proof system would be unsound. Likewise,
♦A ⊭ □A, but without the restriction that the prefix for
□F must be new, this would be a closed tableau:
1. 1 T ♦A        Assumption
2. 1 F □A        Assumption
3. 1.1 T A       ♦T 1
4. 1.1 F A       □F 2
   ⊗
6.3 Tableaux for K
Example 6.1. We give a closed tableau that shows ⊢ (□A ∧ □B) →
□(A ∧ B).

1. 1 F (□A ∧ □B) → □(A ∧ B)   Assumption
2. 1 T □A ∧ □B                →F 1
3. 1 F □(A ∧ B)               →F 1
4. 1 T □A                     ∧T 2
5. 1 T □B                     ∧T 2
6. 1.1 F A ∧ B                □F 3
7. 1.1 F A | 1.1 F B          ∧F 6
8. 1.1 T A | 1.1 T B          □T 4; □T 5
   ⊗          ⊗
Example 6.2. We give a closed tableau that shows ⊢ ♦(A ∨ B) →
(♦A ∨ ♦B):

1. 1 F ♦(A ∨ B) → (♦A ∨ ♦B)   Assumption
2. 1 T ♦(A ∨ B)               →F 1
3. 1 F ♦A ∨ ♦B                →F 1
4. 1 F ♦A                     ∨F 3
5. 1 F ♦B                     ∨F 3
6. 1.1 T A ∨ B                ♦T 2
7. 1.1 T A | 1.1 T B          ∨T 6
8. 1.1 F A | 1.1 F B          ♦F 4; ♦F 5
   ⊗          ⊗
6.4 Soundness
In order to show that prefixed tableaux are sound, we have to show
that if

1 T B1, . . . , 1 T Bn, 1 F A

has a closed tableau then B1, . . . , Bn ⊨ A. It is easier to prove
the contrapositive: if for some M and world w, M, w ⊩ Bi for
all i = 1, . . . , n but M, w ⊮ A, then no tableau can close. Such
a countermodel shows that the initial assumptions of the tableau
are satisfiable. The strategy of the proof is to show that whenever
all the prefixed formulas on a tableau branch are satisfiable, any
application of a rule results in at least one extended branch that
is also satisfiable. Since closed branches are unsatisfiable, any
tableau for a satisfiable set of prefixed formulas must have at least
one open branch.
In order to apply this strategy in the modal case, we have to
extend our definition of "satisfiable" to modal models and prefixes.
With that in hand, however, the proof is straightforward.
Definition 6.3. Let P be some set of prefixes, i.e., P ⊆ (Z+)∗ \ {Λ},
and let M be a model. A function f : P → W is an interpretation
of P in M if, whenever σ and σ.n are both in P, then
Rf(σ)f(σ.n).
Relative to an interpretation f of prefixes P we can define:

1. M satisfies σ T A iff M, f(σ) ⊩ A.

2. M satisfies σ F A iff M, f(σ) ⊮ A.
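In code, the defining condition on an interpretation is a one-liner.
A sketch (ours), with prefixes as tuples of positive integers, f a
dictionary, and R a set of pairs of worlds:

    def is_interpretation(f, P, R):
        # Whenever sigma and sigma.n are both in P, we must have
        # R f(sigma) f(sigma.n).
        return all((f[s[:-1]], f[s]) in R
                   for s in P if len(s) > 1 and s[:-1] in P)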
Definition 6.4. Let Γ be a set of prefixed formulas, and let P(Γ)
be the set of prefixes that occur in it. If f is an interpretation
of P(Γ) in M, we say that M satisfies Γ with respect to f, M, f ⊩
Γ, if M satisfies every prefixed formula in Γ with respect to f. Γ
is satisfiable iff there is a model M and interpretation f of P(Γ)
such that M, f ⊩ Γ.
Proposition 6.5. If Γ contains both σ T A and σ F A, for some formula
A and prefix σ, then Γ is unsatisfiable.

Proof. There cannot be a model M and interpretation f of P(Γ)
such that both M, f(σ) ⊩ A and M, f(σ) ⊮ A.
Theorem 6.6 (Soundness). If Γ has a closed tableau, Γ is unsatis-
fiable.
Proof. We call a branch of a tableau satisfiable iff the set of signed
formulas on it is satisfiable, and let’s call a tableau satisfiable if it
contains at least one satisfiable branch.
We show the following: Extending a satisfiable tableau by one
of the rules of inference always results in a satisfiable tableau.
This will prove the theorem: any closed tableau results by apply-
ing rules of inference to the tableau consisting only of assump-
tions from Γ. So if Γ were satisfiable, any tableau for it would
be satisfiable. A closed tableau, however, is clearly not satisfi-
able, since all its branches are closed and closed branches are
unsatisfiable.
Suppose we have a satisfiable tableau, i.e., a tableau with at
least one satisfiable branch. Applying a rule of inference either
adds signed formulas to a branch, or splits a branch in two. If
the tableau has a satisfiable branch which is not extended by the
rule application in question, it remains a satisfiable branch in
the extended tableau, so the extended tableau is satisfiable. So
we only have to consider the case where a rule is applied to a
satisfiable branch.
Let Γ be the set of signed formulas on that branch, and let
σ S A ∈ Γ be the signed formula to which the rule is applied. If
the rule does not result in a split branch, we have to show that the
extended branch, i.e., Γ together with the conclusions of the rule,
is still satisfiable. If the rule results in split branch, we have to
show that at least one of the two resulting branches is satisfiable.
First, we consider the possible inferences with only one premise.
1. The branch is expanded by applying ¬T to σ T ¬B ∈ Γ.
Then the extended branch contains the signed formulas
Γ ∪ {σ F B}. Suppose M, f ⊩ Γ. In particular, M, f(σ) ⊩
¬B. Thus, M, f(σ) ⊮ B, i.e., M satisfies σ F B with respect
to f.

2. The branch is expanded by applying ¬F to σ F ¬B ∈ Γ:
Exercise.

3. The branch is expanded by applying ∧T to σ T B ∧ C ∈ Γ,
which results in two new signed formulas on the branch:
σ T B and σ T C. Suppose M, f ⊩ Γ, in particular M, f(σ) ⊩
B ∧ C. Then M, f(σ) ⊩ B and M, f(σ) ⊩ C. This means
that M satisfies both σ T B and σ T C with respect to f.

4. The branch is expanded by applying ∨F to σ F B ∨ C ∈ Γ:
Exercise.
5. The branch is expanded by applying →F to σ F B → C ∈ Γ:
This results in two new signed formulas on the branch:
σ T B and σ F C. Suppose M, f ⊩ Γ, in particular M, f(σ) ⊮
B → C. Then M, f(σ) ⊩ B and M, f(σ) ⊮ C. This means
that M, f satisfies both σ T B and σ F C.

6. The branch is expanded by applying □T to σ T □B ∈ Γ:
This results in a new signed formula σ.n T B on the branch,
for some σ.n ∈ P(Γ) (since σ.n must be used). Suppose
M, f ⊩ Γ, in particular, M, f(σ) ⊩ □B. Since f is an interpretation
of prefixes and both σ, σ.n ∈ P(Γ), we know that
Rf(σ)f(σ.n). Hence, M, f(σ.n) ⊩ B, i.e., M, f satisfies
σ.n T B.

7. The branch is expanded by applying □F to σ F □B ∈ Γ:
This results in a new signed formula σ.n F B, where σ.n is
a new prefix on the branch, i.e., σ.n ∉ P(Γ). Since Γ is
satisfiable, there is a M and interpretation f of P(Γ) such
that M, f ⊩ Γ, in particular M, f(σ) ⊮ □B. We have to
show that Γ ∪ {σ.n F B} is satisfiable. To do this, we define
an interpretation of P(Γ) ∪ {σ.n} as follows:
Since M, f(σ) ⊮ □B, there is a w ∈ W such that Rf(σ)w
and M, w ⊮ B. Let f′ be like f, except that f′(σ.n) = w.
Since f′(σ) = f(σ) and Rf(σ)w, we have Rf′(σ)f′(σ.n),
so f′ is an interpretation of P(Γ) ∪ {σ.n}. Obviously M, f′(σ.n) ⊮
B. Since f(σ′) = f′(σ′) for all prefixes σ′ ∈ P(Γ), M, f′ ⊩
Γ. So, M, f′ satisfies Γ ∪ {σ.n F B}.
Now let's consider the possible inferences with two premises.

1. The branch is expanded by applying ∧F to σ F B ∧ C ∈ Γ,
which results in two branches, a left one continuing through
σ F B and a right one through σ F C. Suppose M, f ⊩ Γ,
in particular M, f(σ) ⊮ B ∧ C. Then M, f(σ) ⊮ B or
M, f(σ) ⊮ C. In the former case, M, f satisfies σ F B, i.e.,
the left branch is satisfiable. In the latter, M, f satisfies
σ F C, i.e., the right branch is satisfiable.

2. The branch is expanded by applying ∨T to σ T B ∨ C ∈ Γ:
Exercise.

3. The branch is expanded by applying →T to σ T B → C ∈ Γ:
Exercise.
Corollary 6.7. If Γ ⊢ A then Γ ⊨ A.

Proof. If Γ ⊢ A then for some B1, . . . , Bn ∈ Γ, ∆ = {1 F A, 1 T B1, . . . , 1 T Bn}
has a closed tableau. We want to show that Γ ⊨ A. Suppose not,
so for some M and w, M, w ⊩ Bi for i = 1, . . . , n, but M, w ⊮ A.
Let f(1) = w; then f is an interpretation of P(∆) into M, and M
satisfies ∆ with respect to f. But by Theorem 6.6, ∆ is unsatisfiable
since it has a closed tableau, a contradiction. So we must
have Γ ⊨ A after all.
Corollary 6.8. If ⊢ A then A is true in all models.
6.5 Rules for Other Accessibility Relations
In order to deal with logics determined by special accessibility
relations, we consider the additional rules in table 6.3.
Adding these rules results in systems that are sound and complete
for the logics given in table 6.4.
6.6 Tableaux for Other Logics
Example 6.9. We give a closed tableau that shows S5 ⊢ □A →
□♦A.
T□: from σ T □A, infer σ T A.
T♦: from σ F ♦A, infer σ F A.
D□: from σ T □A, infer σ T ♦A.
D♦: from σ F ♦A, infer σ F □A.
B□: from σ.n T □A, infer σ T A.
B♦: from σ.n F ♦A, infer σ F A.
4□: from σ T □A, infer σ.n T □A, where σ.n is used.
4♦: from σ F ♦A, infer σ.n F ♦A, where σ.n is used.
4r□: from σ.n T □A, infer σ T □A.
4r♦: from σ.n F ♦A, infer σ F ♦A.

Table 6.3: More modal rules.
1. 1 F □A → □♦A   Assumption
2. 1 T □A         →F 1
3. 1 F □♦A        →F 1
4. 1.1 F ♦A       □F 3
5. 1 F ♦A         4r♦ 4
6. 1.1 F A        ♦F 5
7. 1.1 T A        □T 2
   ⊗
Logic       R is . . .                Rules
T = KT      reflexive                 T□, T♦
D = KD      serial                    D□, D♦
K4          transitive                4□, 4♦
B = KTB     reflexive, symmetric      T□, T♦, B□, B♦
S4 = KT4    reflexive, transitive     T□, T♦, 4□, 4♦
S5 = KT4B   reflexive, transitive,    T□, T♦, 4□, 4♦, 4r□, 4r♦
            euclidean

Table 6.4: Tableau rules for various modal logics.
6.7 Soundness for Additional Rules
We say a rule is sound for a class of models if, whenever a branch
in a tableau is satisfiable in a model from that class, the branch
resulting from applying the rule is also satisfiable in a model from
that class.
Proposition 6.10. T□ and T♦ are sound for reflexive models.
Proof. 1. The branch is expanded by applying T□ to σ T □B ∈
Γ: This results in a new signed formula σ T B on the branch.
Suppose M, f ⊩ Γ, in particular, M, f(σ) ⊩ □B. Since R is
reflexive, we know that Rf(σ)f(σ). Hence, M, f(σ) ⊩ B,
i.e., M, f satisfies σ T B.

2. The branch is expanded by applying T♦ to σ F ♦B ∈ Γ:
Exercise.
Proposition 6.11. D□ and D♦ are sound for serial models.

Proof. 1. The branch is expanded by applying D□ to σ T □B ∈
Γ: This results in a new signed formula σ T ♦B on the
branch. Suppose M, f ⊩ Γ, in particular, M, f(σ) ⊩ □B.
Since R is serial, there is a w ∈ W such that Rf(σ)w. Then
M, w ⊩ B, and hence M, f(σ) ⊩ ♦B. So, M, f satisfies
σ T ♦B.

2. The branch is expanded by applying D♦ to σ F ♦B ∈ Γ:
Exercise.
Proposition 6.12. B□ and B♦ are sound for symmetric models.

Proof. 1. The branch is expanded by applying B□ to σ.n T □B ∈
Γ: This results in a new signed formula σ T B on the branch.
Suppose M, f ⊩ Γ, in particular, M, f(σ.n) ⊩ □B. Since
f is an interpretation of prefixes on the branch into M, we
know that Rf(σ)f(σ.n). Since R is symmetric, Rf(σ.n)f(σ).
Since M, f(σ.n) ⊩ □B, M, f(σ) ⊩ B. Hence, M, f satisfies
σ T B.

2. The branch is expanded by applying B♦ to σ.n F ♦B ∈ Γ:
Exercise.
Proposition 6.13. 4□ and 4♦ are sound for transitive models.

Proof. 1. The branch is expanded by applying 4□ to σ T □B ∈
Γ: This results in a new signed formula σ.n T □B on the
branch. Suppose M, f ⊩ Γ, in particular, M, f(σ) ⊩ □B.
Since f is an interpretation of prefixes on the branch into M
and σ.n must be used, we know that Rf(σ)f(σ.n). Now
let w be any world such that Rf(σ.n)w. Since R is transitive,
Rf(σ)w. Since M, f(σ) ⊩ □B, M, w ⊩ B. Hence,
M, f(σ.n) ⊩ □B, and M, f satisfies σ.n T □B.

2. The branch is expanded by applying 4♦ to σ F ♦B ∈ Γ:
Exercise.
Proposition 6.14. 4r□ and 4r♦ are sound for euclidean models.

Proof. 1. The branch is expanded by applying 4r□ to σ.n T □B ∈
Γ: This results in a new signed formula σ T □B on the
branch. Suppose M, f ⊩ Γ, in particular, M, f(σ.n) ⊩
□B. Since f is an interpretation of prefixes on the branch
into M, we know that Rf(σ)f(σ.n). Now let w be any
world such that Rf(σ)w. Since R is euclidean, Rf(σ.n)w.
Since M, f(σ.n) ⊩ □B, M, w ⊩ B. Hence, M, f(σ) ⊩ □B,
and M, f satisfies σ T □B.

2. The branch is expanded by applying 4r♦ to σ.n F ♦B ∈ Γ:
Exercise.
Corollary 6.15. The tableau systems given in section 6.5 are sound
for the respective classes of models.
6.8 Simple Tableaux for S5
S5 is sound and complete with respect to the class of universal
models, i.e., models where every world is accessible from every
world. In universal models the accessibility relation doesn't matter:
"there is a world w where M, w ⊩ A" is true if and only if
there is such a w that's accessible from u. So in S5, we can define
models as simply a set of worlds and a valuation V. This suggests
that we should be able to simplify the tableau rules as well. In the
general case, we take as prefixes sequences of positive integers, so
that we can keep track of which such prefixes name worlds which
are accessible from others: σ.n names a world accessible from σ.
But in S5 any world is accessible from any world, so there is no
need to keep track. Instead, we can use positive integers as
prefixes. The simplified rules are given in table 6.5.
□T: from n T □A, infer m T A, where m is used.
□F: from n F □A, infer m F A, where m is new.
♦T: from n T ♦A, infer m T A, where m is new.
♦F: from n F ♦A, infer m F A, where m is used.

Table 6.5: Simplified rules for S5.
Example 6.16. We give a simplified closed tableau that shows
S5 ⊢ 5, i.e., ♦A → □♦A.

1. 1 F ♦A → □♦A   Assumption
2. 1 T ♦A         →F 1
3. 1 F □♦A        →F 1
4. 2 F ♦A         □F 3
5. 3 T A          ♦T 2
6. 3 F A          ♦F 4
   ⊗
⊗
6.9 Completeness for K
To show that the method of tableaux is complete, we have to show
that whenever there is no closed tableau to show Γ ` A, then Γ 2
A, i.e., there is a countermodel. But “there is no closed tableau”
means that every way we could try to construct one has to fail
to close. The trick is to see that if every such way fails to close,
6.9. COMPLETENESS FOR K 109
then a specific, systematic and exhaustive way also fails to close.
And this systematic and exhaustive way would close if a closed
tableau exists. The single tableau will contain, among its open
branches, all the information required to define a countermodel.
The countermodel given by an open branch in this tableau will
contain the all the prefixes used on that branch as the worlds,
and a propositional variable p is true at σ iff σ T p occurs on the
branch.
Definition 6.17. A branch in a tableau is called complete if,
whenever it contains a prefixed formula σ S A to which a rule
can be applied, it also contains
1. the prefixed formulas that are the corresponding conclu-
sions of the rule, in the case of propositional stacking rules;
2. one of the corresponding conclusion formulas in the case
of propositional branching rules;
3. at least one possible conclusion in the case of modal rules
that require a new prefix;
4. the corresponding conclusion for every prefix occurring on
the branch in the case of modal rules that require a used
prefix.
For instance, a complete branch contains σ T B and σ T C
whenever it contains σ T B ∧ C. If it contains σ T B ∨ C it contains at
least one of σ T B and σ T C. If it contains σ F □B it also contains
σ.n F B for at least one n. And whenever it contains σ T □B it also
contains σ.n T B for every n such that σ.n is used on the branch.
Proposition 6.18. Every finite Γ has a tableau in which every branch
is complete.
Proof. Consider an open branch in a tableau for Γ. There are
finitely many prefixed formulas on the branch to which a rule
could be applied. In some fixed order (say, top to bottom), for
each of these prefixed formulas for which the conditions (1)–(4)
do not already hold, apply the rules that can be applied to it to
extend the branch. In some cases this will result in branching;
apply the rule at the tip of each resulting branch for all remaining
prefixed formulas. Since the number of prefixed formulas is
finite, and the number of used prefixes on the branch is finite,
this procedure eventually results in (possibly many) branches extending
the original branch. Apply the procedure to each, and
repeat. By construction, every branch of the resulting tableau is
complete.
Theorem 6.19 (Completeness). If Γ has no closed tableau, Γ is
satisfiable.

Proof. By the proposition, Γ has a tableau in which every branch
is complete. Since it has no closed tableau, it thus has a tableau
in which at least one branch is open and complete. Let ∆ be
the set of prefixed formulas on the branch, and P(∆) the set of
prefixes occurring in it.
We define a model M(∆) = ⟨P(∆), R, V⟩ where the worlds are
the prefixes occurring in ∆, the accessibility relation is given by:

Rσσ′ iff σ′ = σ.n for some n

and

V(p) = {σ : σ T p ∈ ∆}.

We show by induction on A that if σ T A ∈ ∆ then M(∆), σ ⊩ A,
and if σ F A ∈ ∆ then M(∆), σ ⊮ A.
1. A ≡ p: If σ T A ∈ ∆ then σ ∈ V(p) (by definition of V) and
so M(∆), σ ⊩ A.
If σ F A ∈ ∆ then σ T A ∉ ∆, since the branch would otherwise
be closed. So σ ∉ V(p) and thus M(∆), σ ⊮ A.

2. A ≡ ¬B: If σ T A ∈ ∆, then σ F B ∈ ∆ since the branch is
complete. By induction hypothesis, M(∆), σ ⊮ B and thus
M(∆), σ ⊩ A.
If σ F A ∈ ∆, then σ T B ∈ ∆ since the branch is complete.
By induction hypothesis, M(∆), σ ⊩ B and thus M(∆), σ ⊮
A.

3. A ≡ B ∧ C: Exercise.

4. A ≡ B ∨ C: If σ T A ∈ ∆, then either σ T B ∈ ∆ or σ T C ∈
∆ since the branch is complete. By induction hypothesis,
either M(∆), σ ⊩ B or M(∆), σ ⊩ C. Thus M(∆), σ ⊩ A.
If σ F A ∈ ∆, then both σ F B ∈ ∆ and σ F C ∈ ∆ since the
branch is complete. By induction hypothesis, both M(∆), σ ⊮
B and M(∆), σ ⊮ C. Thus M(∆), σ ⊮ A.

5. A ≡ B → C: Exercise.

6. A ≡ □B: If σ T A ∈ ∆, then, since the branch is complete,
σ.n T B ∈ ∆ for every σ.n used on the branch, i.e., for
every σ′ ∈ P(∆) such that Rσσ′. By induction hypothesis,
M(∆), σ′ ⊩ B for every σ′ such that Rσσ′. Therefore,
M(∆), σ ⊩ A.
If σ F A ∈ ∆, then for some σ.n, σ.n F B ∈ ∆ since the
branch is complete. By induction hypothesis, M(∆), σ.n ⊮
B. Since Rσ(σ.n), there is a σ′ such that Rσσ′ and M(∆), σ′ ⊮ B.
Thus M(∆), σ ⊮ A.

7. A ≡ ♦B: Exercise.

Since Γ ⊆ ∆, M(∆) ⊩ Γ.
Corollary 6.20. If Γ ⊨ A then Γ ⊢ A.

Corollary 6.21. If A is true in all models, then ⊢ A.
6.10 Countermodels from Tableaux
The proof of the completeness theorem doesn't just show that if
⊨ A then ⊢ A, it also gives us a method for constructing countermodels
to A if ⊬ A. In the case of K, this method constitutes
a decision procedure. For suppose ⊬ A. Then the proof of Proposition
6.18 gives a method for constructing a complete tableau.
The method in fact always terminates. The propositional rules
for K only add prefixed formulas of lower complexity, i.e., each
propositional rule need only be applied once on a branch for any
signed formula σ S A. New prefixes are only generated by the □F
and ♦T rules, and these also only have to be applied once (and produce
a single new prefix). □T and ♦F have to be applied potentially
multiple times, but only once per prefix, and only finitely many
new prefixes are generated. So the construction either results in
a closed branch or a complete branch after finitely many stages.
Once a tableau with an open complete branch is constructed,
the proof of Theorem 6.19 gives us an explicit model that satisfies
the original set of prefixed formulas. So not only is it the case that
if Γ ⊨ A, then a closed tableau exists and Γ ⊢ A; if we look for
the closed tableau in the right way and end up with a "complete"
tableau, we'll not only know that Γ ⊬ A but actually be able to
construct a countermodel.
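The read-off step is mechanical. Here is a minimal sketch (ours,
not the text's): prefixes are tuples of positive integers, the open
complete branch is a list of triples (prefix, sign, formula), and
formulas are tuples like ('var', 'p'):

    def countermodel(branch):
        """Build the model M(Delta) of Theorem 6.19 from an open,
        complete branch."""
        worlds = {sigma for (sigma, _, _) in branch}
        # R sigma sigma' iff sigma' = sigma.n for some n.
        R = {(s, t) for s in worlds for t in worlds
             if len(t) == len(s) + 1 and t[:len(s)] == s}
        V = {}
        for (sigma, sign, A) in branch:
            if sign == 'T' and A[0] == 'var':
                V.setdefault(A[1], set()).add(sigma)
        return worlds, R, V

Applied to the open branch constructed in the example that follows,
this returns exactly the model of Figure 6.1.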
Example 6.22. We know that ⊬ □(p ∨ q) → (□p ∨ □q). The
construction of a tableau begins with:
1. 1 F □(p ∨ q) → (□p ∨ □q) ✓   Assumption
2. 1 T □(p ∨ q)                 →F 1
3. 1 F □p ∨ □q ✓                →F 1
4. 1 F □p ✓                     ∨F 3
5. 1 F □q ✓                     ∨F 3
6. 1.1 F p ✓                    □F 4
7. 1.2 F q ✓                    □F 5
The tableau is of course not finished yet. In the next step, we
consider the only line without a checkmark: the prefixed formula
1 T □(p ∨ q) on line 2. The construction of the complete tableau
requires us to apply the □T rule for every prefix used on the branch,
i.e., for both 1.1 and 1.2:
1. 1 F □(p ∨ q) → (□p ∨ □q) ✓   Assumption
2. 1 T □(p ∨ q)                 →F 1
3. 1 F □p ∨ □q ✓                →F 1
4. 1 F □p ✓                     ∨F 3
5. 1 F □q ✓                     ∨F 3
6. 1.1 F p ✓                    □F 4
7. 1.2 F q ✓                    □F 5
8. 1.1 T p ∨ q                  □T 2
9. 1.2 T p ∨ q                  □T 2
Now lines 2, 8, and 9 don't have checkmarks. But no new prefix
has been added (so line 2 requires no further applications of □T),
and we apply ∨T to lines 8 and 9, on all resulting branches (as
long as they don't close):
[Figure 6.1: A countermodel to □(p ∨ q) → (□p ∨ □q): world 1
(¬p, ¬q) accesses worlds 1.1 (¬p, q) and 1.2 (p, ¬q).]
1.  1 F □(p ∨ q) → (□p ∨ □q) ✓   Assumption
2.  1 T □(p ∨ q)                 →F 1
3.  1 F □p ∨ □q ✓                →F 1
4.  1 F □p ✓                     ∨F 3
5.  1 F □q ✓                     ∨F 3
6.  1.1 F p ✓                    □F 4
7.  1.2 F q ✓                    □F 5
8.  1.1 T p ∨ q ✓                □T 2
9.  1.2 T p ∨ q ✓                □T 2
10. 1.1 T p ✓ | 1.1 T q ✓        ∨T 8
    ⊗
11.             1.2 T p ✓ | 1.2 T q ✓   ∨T 9
                             ⊗
There is one remaining open branch, and it is complete. From
it we define the model with worlds W = {1, 1.1, 1.2} (the only
prefixes appearing on the open branch), the accessibility relation
R = {⟨1, 1.1⟩, ⟨1, 1.2⟩}, and the assignment V(p) = {1.2} (because
line 11 contains 1.2 T p) and V(q) = {1.1} (because line 10 contains
1.1 T q). The model is pictured in Figure 6.1, and you can
verify that it is a countermodel to □(p ∨ q) → (□p ∨ □q).
Problems
Problem 6.1. Find closed tableaux in K for the following formulas:

1. □¬p → □(p → q)

2. (□p ∨ □q) → □(p ∨ q)

3. ♦p → ♦(p ∨ q)

Problem 6.2. Complete the proof of Theorem 6.6.

Problem 6.3. Give closed tableaux that show the following:

1. KT5 ⊢ B;

2. KT5 ⊢ 4;

3. KDB4 ⊢ T;

4. KB4 ⊢ 5;

5. KB5 ⊢ 4;

6. KT ⊢ D.
Problem 6.4. Complete the proof of Proposition 6.10.

Problem 6.5. Complete the proof of Proposition 6.11.

Problem 6.6. Complete the proof of Proposition 6.12.

Problem 6.7. Complete the proof of Proposition 6.13.

Problem 6.8. Complete the proof of Proposition 6.14.
Problem 6.9. Complete the proof of Theorem 6.19.
PART II
Intuitionistic Logic

CHAPTER 7
Introduction
7.1 Constructive Reasoning
In contrast to extensions of classical logic by modal operators
or second-order quantifiers, intuitionistic logic is "non-classical"
in that it restricts classical logic. Classical logic is non-constructive
in various ways. Intuitionistic logic is intended to capture a more
"constructive" kind of reasoning characteristic of a kind of constructive
mathematics. The following examples may serve to illustrate
some of the underlying motivations.
Suppose someone claimed that they had determined a natu-
ral number n with the property that if n is even, the Riemann
hypothesis is true, and if n is odd, the Riemann hypothesis is
false. Great news! Whether the Riemann hypothesis is true or
not is one of the big open questions of mathematics, and they
seem to have reduced the problem to one of calculation, that is,
to the determination of whether a specific number is prime or
not.
What is the magic value of n? They describe it as follows: n is
the natural number that is equal to 2 if the Riemann hypothesis
is true, and 3 otherwise.
Angrily, you demand your money back. From a classical point
of view, the description above does in fact determine a unique
value of n; but what you really want is a value of n that is given
explicitly.
To take another, perhaps less contrived example, consider
the following question. We know that it is possible to raise an
irrational number to a rational power, and get a rational result.
For example, (√2)² = 2. What is less clear is whether or not it is
possible to raise an irrational number to an irrational power, and
get a rational result. The following theorem answers this in the
affirmative:
Theorem 7.1. There are irrational numbers a and b such that a^b is
rational.

Proof. Consider √2^√2. If this is rational, we are done: we can let
a = b = √2. Otherwise, it is irrational. Then we have

(√2^√2)^√2 = √2^(√2·√2) = √2^2 = 2,

which is rational. So, in this case, let a be √2^√2, and let b be √2.
Does this constitute a valid proof? Most mathematicians feel
that it does. But again, there is something a little bit unsatisfying
here: we have proved the existence of a pair of real numbers
with a certain property, without being able to say which pair of
numbers it is. It is possible to prove the same result, but in such
a way that the pair a, b is given in the proof: take a = √3 and
b = log₃ 4. Then

a^b = √3^(log₃ 4) = 3^((1/2)·log₃ 4) = (3^(log₃ 4))^(1/2) = 4^(1/2) = 2,

since 3^(log₃ x) = x.
Intuitionistic logic is designed to capture a kind of reasoning
where moves like the one in the first proof are disallowed. Proving
the existence of an x satisfying A(x) means that you have to give a
specific x, and a proof that it satisfies A, like in the second proof.
Proving that A or B holds requires that you can prove one or the
other.
Formally speaking, intuitionistic logic is what you get if you
restrict a proof system for classical logic in a certain way. From
the mathematical point of view, these are just formal deductive
systems, but, as already noted, they are intended to capture a
kind of mathematical reasoning. One can take this to be the kind
of reasoning that is justified on a certain philosophical view of
mathematics (such as Brouwer’s intuitionism); one can take it to
be a kind of mathematical reasoning which is more “concrete”
and satisfying (along the lines of Bishop’s constructivism); and
one can argue about whether or not the formal description cap-
tures the informal motivation. But whatever philosophical posi-
tions we may hold, we can study intuitionistic logic as a formally
presented logic; and for whatever reasons, many mathematical
logicians find it interesting to do so.
7.2 Syntax of Intuitionistic Logic
The syntax of intuitionistic logic is the same as that for proposi-
tional logic. In classical propositional logic it is possible to define
connectives by others, e.g., one can define A → B by ¬A ∨ B, or
A ∨ B by ¬(¬A ∧ ¬B). Thus, presentations of classical logic often
introduce some connectives as abbreviations for these definitions.
This is not so in intuitionistic logic, with two exceptions: ¬A can
be—and often is—defined as an abbreviation for A →⊥. Then, of
course, ⊥ must not itself be defined! Also, A ↔ B can be defined,
as in classical logic, as (A → B) ∧ (B → A).
Formulas of propositional intuitionistic logic are built up from
propositional variables and the propositional constant ⊥ using log-
ical connectives. We have:
1. A countably infinite set At0 of propositional variables p0 ,
p1 , . . .
2. The propositional constant for falsity ⊥.
3. The logical connectives: ∧ (conjunction), ∨ (disjunction),
→ (conditional)
4. Punctuation marks: (, ), and the comma.
Definition 7.2 (Formula). The set Frm(L0 ) of formulas of propo-
sitional intuitionistic logic is defined inductively as follows:
1. ⊥ is an atomic formula.
2. Every propositional variable pi is an atomic formula.
3. If A and B are formulas, then (A ∧ B) is a formula.
4. If A and B are formulas, then (A ∨ B) is a formula.
5. If A and B are formulas, then (A → B) is a formula.
6. Nothing else is a formula.
In addition to the primitive connectives introduced above, we
also use the following defined symbols: ¬ (negation) and ↔ (bi-
conditional). Formulas constructed using the defined operators
are to be understood as follows:
1. ¬A abbreviates A → ⊥.
2. A ↔ B abbreviates (A → B) ∧ (B → A).
Although ¬ is officially treated as an abbreviation, we will
sometimes give explicit rules and clauses in definitions for ¬ as
if it were primitive. This is mostly so we can state practice prob-
lems.
7.3 The Brouwer-Heyting-Kolmogorov Interpretation
There is an informal constructive interpretation of the intuitionist
connectives, usually known as the Brouwer-Heyting-Kolmogorov
interpretation. It uses the notion of a “construction,” which you
may think of as a constructive proof. (We don’t use “proof” in
the BHK interpretation so as not to get confused with the notion
of a derivation in a formal proof system.) Based on this intuitive
notion, the BHK interpretation explains the meanings of the in-
tuitionistic connectives.
1. We assume that we know what constitutes a construction
of an atomic statement.
2. A construction of A1 ∧ A2 is a pair ⟨M1, M2⟩ where M1 is a
construction of A1 and M2 is a construction of A2.

3. A construction of A1 ∨ A2 is a pair ⟨s, M⟩ where s is 1 and
M is a construction of A1, or s is 2 and M is a construction
of A2.
4. A construction of A → B is a function that converts a con-
struction of A into a construction of B.
5. There is no construction for ⊥ (absurdity).
6. ¬A is defined as a synonym for A → ⊥. That is, a construction
of ¬A is a function converting a construction of A into a
construction of ⊥.
Example 7.3. Take ¬⊥ for example. A construction of it is a
function which, given any construction of ⊥ as input, provides a
construction of ⊥ as output. Obviously, the identity function Id
is such a construction: given a construction M of ⊥, Id(M ) = M
yields a construction of ⊥.
Generally speaking, ¬A means "a construction of A is impossible".
Example 7.4. Let us prove A→¬¬A for any proposition A, which
is A → ((A → ⊥) → ⊥). The construction should be a function f
that, given a construction M of A, returns a construction f (M )
of (A → ⊥) → ⊥. Here is how f constructs the construction of
(A → ⊥) → ⊥: We have to define a function g which, when given a
construction h of A→⊥ as input, outputs a construction of ⊥. We
can define g as follows: apply the input h to the construction M
of A (that we received earlier). Since the output h(M ) of h is a
construction of ⊥, f (M )(h) = h(M ) is a construction of ⊥ if M is
a construction of A.
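Reading BHK constructions as programs, the construction f of
Example 7.4 can be written out directly; here is a sketch in Python:

    def f(M):
        # Given a construction M of A, f(M) is a construction of
        # (A → ⊥) → ⊥: it takes a construction h of A → ⊥ and
        # yields the construction h(M) of ⊥.
        return lambda h: h(M)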
Example 7.5. Let us give a construction for ¬(A ∧ ¬A), i.e.,
(A ∧ (A → ⊥)) → ⊥. This is a function f which, given as input
a construction M of A ∧ (A → ⊥), yields a construction of ⊥. A
construction of a conjunction B1 ∧ B2 is a pair ⟨N1, N2⟩ where N1
is a construction of B1 and N2 is a construction of B2. We can
define functions p1 and p2 which recover from a construction of
B1 ∧ B2 the constructions of B1 and B2, respectively:

p1(⟨N1, N2⟩) = N1
p2(⟨N1, N2⟩) = N2

Here is what f does: First it applies p1 to its input M. That yields
a construction of A. Then it applies p2 to M, yielding a construction
of A → ⊥. Such a construction, in turn, is a function p2(M)
which, if given as input a construction of A, yields a construction
of ⊥. In other words, if we apply p2(M) to p1(M), we get a
construction of ⊥. Thus, we can define f(M) = p2(M)(p1(M)).
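Again as a program (a sketch, with pairs of constructions represented
as Python tuples):

    def f(M):
        # M = (N1, N2): N1 constructs A, N2 constructs A → ⊥,
        # so p1(M) = M[0] and p2(M) = M[1].
        N1, N2 = M
        return N2(N1)   # p2(M) applied to p1(M): a construction of ⊥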
Example 7.6. Let us give a construction of ((A ∧ B) → C) →
(A → (B → C)), i.e., a function f which turns a construction g of
(A ∧ B) → C into a construction of (A → (B → C)). The construction
g is itself a function (from constructions of A ∧ B to constructions
of C). And the output f(g) is a function h_g from constructions
of A to functions from constructions of B to constructions of C.
Ok, this is confusing. We have to construct a certain function
h_g, which will be the output of f for input g. The input of h_g is
a construction M of A. The output h_g(M) should be a function
k_g,M from constructions N of B to constructions of C. Let
k_g,M(N) = g(⟨M, N⟩). Remember that ⟨M, N⟩ is a construction
of A ∧ B. So k_g,M is a construction of B → C: it maps constructions
N of B to constructions of C. Now let h_g(M) = k_g,M. That's
a function that maps constructions M of A to constructions k_g,M
of B → C. Now let f(g) = h_g. That's a function that maps constructions
g of (A ∧ B) → C to constructions of A → (B → C).
Whew!
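Seen as code, the whole construction is just the familiar currying
operation. A sketch in Python:

    def f(g):
        # g maps constructions (M, N) of A ∧ B to constructions of C.
        def h_g(M):               # M is a construction of A
            def k_gM(N):          # N is a construction of B
                return g((M, N))  # (M, N) is a construction of A ∧ B
            return k_gM           # a construction of B → C
        return h_g                # a construction of A → (B → C)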
The statement A ∨ ¬A is called the Law of Excluded Mid-
dle. We can prove it for some specific A (e.g., ⊥ ∨ ¬⊥), but not
in general. This is because the intuitionistic disjunction requires
a construction of one of the disjuncts, but there are statements
which currently can neither be proved nor refuted (say, Gold-
bach’s conjecture). However, you can’t refute the law of excluded
middle either: that is, ¬¬(A ∨ ¬A) holds.
Example 7.7. To prove ¬¬(A ∨ ¬A), we need a function f that
transforms a construction of ¬(A ∨ ¬A), i.e., of (A ∨ (A → ⊥)) → ⊥,
into a construction of ⊥. In other words, we need a function f
such that f (g ) is a construction of ⊥ if g is a construction of
¬(A ∨ ¬A).
Suppose g is a construction of ¬(A ∨ ¬A), i.e., a function that
transforms a construction of A ∨ ¬A into a construction of ⊥. A
construction of A ∨ ¬A is a pair ⟨s, M⟩ where either s = 1 and
M is a construction of A, or s = 2 and M is a construction of
¬A. Let h1 be the function mapping a construction M1 of A to a
construction of A ∨ ¬A: it maps M1 to ⟨1, M1⟩. And let h2 be the
function mapping a construction M2 of ¬A to a construction of
A ∨ ¬A: it maps M2 to ⟨2, M2⟩.
Let k be g ∘ h1: it is a function which, if given a construction
of A, returns a construction of ⊥, i.e., it is a construction of A →
⊥, i.e., of ¬A. Now let l be g ∘ h2. It is a function which, given a
construction of ¬A, provides a construction of ⊥. Since k is a
construction of ¬A, l(k) is a construction of ⊥.
Together, what we’ve done is describe how we can turn a con-
struction g of ¬(A∨¬A) into a construction of ⊥, i.e., the function
f mapping a construction g of ¬(A ∨ ¬A) to the construction l (k )
of ⊥ is a construction of ¬¬(A ∨ ¬A).
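As a program (a sketch, with disjunction constructions as tagged
tuples (s, M)):

    def f(g):
        # g maps constructions of A ∨ ¬A to constructions of ⊥.
        k = lambda M1: g((1, M1))  # k = g ∘ h1, a construction of ¬A
        return g((2, k))           # l(k), where l = g ∘ h2: a construction of ⊥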
As you can see, using the BHK interpretation to show the
intuitionistic validity of formulas quickly becomes cumbersome
and confusing. Luckily, there are better derivation systems for
intuitionistic logic, and more precise semantic interpretations.
7.4 Natural Deduction
Natural deduction without the ⊥C rules is a standard derivation
system for intuitionistic logic. We repeat the rules here and indi-
cate the motivation using the BHK interpretation. In each case,
we can think of a rule which allows us to conclude that if the
premises have constructions, so does the conclusion.
Since natural deduction derivations have undischarged as-
sumptions, we should consider such a derivation, say, of A from
undischarged assumptions Γ, as a function that turns construc-
tions of all B ∈ Γ into a construction of A. If there is a derivation
of A from no undischarged assumptions, then there is a construc-
tion of A in the sense of the BHK interpretation. For the purpose
of the discussion, however, we’ll suppress the Γ when not needed.
An assumption A by itself is a derivation of A from the undis-
charged assumption A. This agrees with the BHK-interpretation:
the identity function on constructions turns any construction of A
into a construction of A.
Conjunction
      A1    A2                        A1 ∧ A2
      --------- ∧Intro                -------- ∧Elim_i   (i ∈ {1, 2})
       A1 ∧ A2                           Ai
Suppose we have constructions N1, N2 of A1 and A2, respectively. Then we also have a construction of A1 ∧ A2, namely the pair ⟨N1, N2⟩.
A construction of A1 ∧ A2 on the BHK interpretation is a pair ⟨N1, N2⟩. So assume we have such a pair. Then we also have a
construction of each conjunct: N 1 is a construction of A1 and N 2
is a construction of A2 .
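In the Haskell reading used earlier (our illustration), ∧Intro is pairing and the two ∧Elim rules are the projections:

    -- ∧Intro: from constructions of A1 and A2 to one of A1 ∧ A2.
    conjIntro :: a1 -> a2 -> (a1, a2)
    conjIntro n1 n2 = (n1, n2)

    -- ∧Elim_1 and ∧Elim_2: recover each conjunct from the pair.
    conjElim1 :: (a1, a2) -> a1
    conjElim1 = fst

    conjElim2 :: (a1, a2) -> a2
    conjElim2 = snd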
Conditional
      [A]^u
        ⋮
        B                             A → B     A
      ------- →Intro, u               ------------- →Elim
      A → B                                 B
If we have a derivation of B from undischarged assumption A,
then there is a function f that turns constructions of A into con-
structions of B. That same function is a construction of A → B.
So, if the premise of →Intro has a construction conditional on a
construction of A, the conclusion A → B has a construction.
On the other hand, suppose there are constructions N of A
and f of A → B. A construction of A → B is a function that turns
constructions of A into constructions of B. So, f (N ) is a con-
struction of B, i.e., the conclusion of →Elim has a construction.
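In the functional reading (ours), →Intro corresponds to lambda abstraction and →Elim to function application:

    -- →Elim: apply the construction f of A → B to the construction n
    -- of A to get a construction of B; →Intro is lambda abstraction.
    condElim :: (a -> b) -> a -> b
    condElim f n = f n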
Disjunction
                                               [A1]^u   [A2]^u
         Ai                                       ⋮        ⋮
      --------- ∨Intro_i  (i ∈ {1, 2})   A1 ∨ A2  C        C
       A1 ∨ A2                           ----------------------- ∨Elim, u
                                                    C
If we have a construction Ni of Ai, we can turn it into a construction ⟨i, Ni⟩ of A1 ∨ A2. On the other hand, suppose we have a construction of A1 ∨ A2, i.e., a pair ⟨i, Ni⟩ where Ni is a construction of Ai, and also functions f1, f2, which turn constructions of A1, A2, respectively, into constructions of C. Then fi(Ni) is a construction of C, the conclusion of ∨Elim.
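In the functional reading (ours), ∨Intro is injection into a sum type and ∨Elim is case distinction:

    -- ∨Intro_1 and ∨Intro_2: tag a construction with its disjunct.
    disjIntro1 :: a1 -> Either a1 a2
    disjIntro1 = Left

    disjIntro2 :: a2 -> Either a1 a2
    disjIntro2 = Right

    -- ∨Elim: given f1 and f2 as in the text, do a case distinction.
    disjElim :: Either a1 a2 -> (a1 -> c) -> (a2 -> c) -> c
    disjElim d f1 f2 = either f1 f2 d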
Absurdity
        ⊥
      ----- ⊥_I
        A
If we have a derivation of ⊥ from undischarged assumptions B 1 ,
. . . , Bn , then there is a function f (M1, . . . , Mn ) that turns con-
structions of B 1 , . . . , Bn into a construction of ⊥. Since ⊥ has no
construction, there cannot be any constructions of all of B 1 , . . . ,
Bn either. Hence, f also has the property that if M1 , . . . , Mn are
constructions of B 1 , . . . , Bn , respectively, then f (M1, . . . , Mn ) is a
construction of A.
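In the functional reading (ours), ⊥ is the empty type, and the rule is the vacuous function out of it:

    import Data.Void (Void, absurd)

    -- ⊥_I: there are no constructions of ⊥, so a function from Void
    -- to any type exists vacuously.
    falsumI :: Void -> a
    falsumI = absurd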
Rules for ¬
Since ¬A is defined as A → ⊥, we strictly speaking do not need
rules for ¬. But if we did, this is what they’d look like:
      [A]^n
        ⋮
        ⊥                             ¬A     A
      ------ ¬Intro, n                --------- ¬Elim
        ¬A                                ⊥
Examples of Derivations
1. ⊢ A → (¬A → ⊥), i.e., ⊢ A → ((A → ⊥) → ⊥)

      [A]^2   [A → ⊥]^1
      ------------------ →Elim
              ⊥
      ------------------ →Intro, 1
         (A → ⊥) → ⊥
      ------------------ →Intro, 2
      A → ((A → ⊥) → ⊥)
2. ⊢ ((A ∧ B) → C) → (A → (B → C))
                           [A]^2   [B]^1
                           -------------- ∧Intro
      [(A ∧ B) → C]^3          A ∧ B
      ----------------------------------- →Elim
                      C
      ----------------------------------- →Intro, 1
                    B → C
      ----------------------------------- →Intro, 2
                 A → (B → C)
      ----------------------------------- →Intro, 3
       ((A ∧ B) → C) → (A → (B → C))
3. ⊢ ¬(A ∧ ¬A), i.e., ⊢ (A ∧ (A → ⊥)) → ⊥

      [A ∧ (A → ⊥)]^1         [A ∧ (A → ⊥)]^1
      ---------------- ∧Elim  ---------------- ∧Elim
           A → ⊥                     A
      ------------------------------------------ →Elim
                          ⊥
      ------------------------------------------ →Intro, 1
                 (A ∧ (A → ⊥)) → ⊥
4. ⊢ ¬¬(A ∨ ¬A), i.e., ⊢ ((A ∨ (A → ⊥)) → ⊥) → ⊥

                                                [A]^1
                                            ------------- ∨Intro
                   [(A ∨ (A → ⊥)) → ⊥]^2    A ∨ (A → ⊥)
                   --------------------------------------- →Elim
                                     ⊥
                                 ---------- →Intro, 1
                                   A → ⊥
                               ------------- ∨Intro
      [(A ∨ (A → ⊥)) → ⊥]^2    A ∨ (A → ⊥)
      -------------------------------------- →Elim
                        ⊥
      -------------------------------------- →Intro, 2
             ((A ∨ (A → ⊥)) → ⊥) → ⊥
Proposition 7.8. If Γ ⊢ A in intuitionistic logic, Γ ⊢ A in classical logic. In particular, if A is an intuitionistic theorem, it is also a classical theorem.
Proof. Every natural deduction rule is also a rule in classical nat-
ural deduction, so every derivation in intuitionistic logic is also
a derivation in classical logic.
7.5 Axiomatic Derivations
Axiomatic derivations for intuitionistic propositional logic are
the conceptually simplest, and historically first, derivation sys-
tems. They work just as in classical propositional logic.
Definition 7.9 (Derivability). If Γ is a set of formulas of L then
a derivation from Γ is a finite sequence A1 , . . . , An of formulas
where for each i ≤ n one of the following holds:
1. Ai ∈ Γ; or
2. Ai is an axiom; or
3. Ai follows from some A j and Ak with j < i and k < i by
modus ponens, i.e., Ak ≡ A j → Ai .
Definition 7.10 (Axioms). The set Ax0 of axioms for intuitionistic propositional logic consists of all formulas of the following forms:
(A ∧ B) → A (7.1)
(A ∧ B) → B (7.2)
A → (B → (A ∧ B)) (7.3)
A → (A ∨ B) (7.4)
A → (B ∨ A) (7.5)
(A → C ) → ((B → C ) → ((A ∨ B) → C )) (7.6)
A → (B → A) (7.7)
(A → (B → C )) → ((A → B) → (A → C )) (7.8)
⊥→A (7.9)
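Each of these axioms has a construction in the sense of the BHK interpretation of section 7.3. In the Haskell gloss used there (our illustration), the constructions for (7.7) and (7.8), for instance, are the familiar K and S combinators:

    -- (7.7): A → (B → A)
    axK :: a -> (b -> a)
    axK x = \_ -> x

    -- (7.8): (A → (B → C)) → ((A → B) → (A → C))
    axS :: (a -> b -> c) -> (a -> b) -> (a -> c)
    axS f g = \x -> f x (g x)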
Definition 7.11 (Derivability). A formula A is derivable from Γ, written Γ ⊢ A, if there is a derivation from Γ ending in A.
Definition 7.12 (Theorems). A formula A is a theorem if there is a derivation of A from the empty set. We write ⊢ A if A is a theorem and ⊬ A if it is not.
Proposition 7.13. If Γ ⊢ A in intuitionistic logic, Γ ⊢ A in classical logic. In particular, if A is an intuitionistic theorem, it is also a classical theorem.
Proof. Every intuitionistic axiom is also a classical axiom, so ev-
ery derivation in intuitionistic logic is also a derivation in classi-
cal logic.
Problems
CHAPTER 8
Semantics
8.1 Introduction
No logic is satisfactorily described without a semantics, and in-
tuitionistic logic is no exception. Whereas for classical logic, the
semantics based on valuations is canonical, there are several com-
peting semantics for intuitionistic logic. None of them is completely satisfactory, in the sense that none gives an intuitionistically acceptable account of the meanings of the connectives.
The semantics based on relational models, similar to the se-
mantics for modal logics, is perhaps the most popular one. In
this semantics, propositional variables are assigned to worlds,
and these worlds are related by an accessibility relation. That re-
lation is always a partial order, i.e., it is reflexive, antisymmetric,
and transitive.
Intuitively, you might think of these worlds as states of knowl-
edge or "evidentiary situations." A state w′ is accessible from w iff, for all we know, w′ is a possible (future) state of knowledge, i.e., one that is compatible with what's known at w. Once a proposition is known, it can't become un-known, i.e., whenever A is known at w and Rww′, A is known at w′ as well. So "knowledge"
is monotonic with respect to the accessibility relation.
If we define “A is known” as in epistemic logic as “true in all
epistemic alternatives,” then A∧B is known at w if in all epistemic
alternatives, both A and B are known. But since knowledge is
monotonic and R is reflexive, that means that A ∧ B is known
at w iff A and B are known at w. For the same reason, A ∨ B
is known at w iff at least one of them is known. So for ∧ and
∨, the truth conditions of the connectives coincide with those in
classical logic.
The truth conditions for the conditional, however, differ from
classical logic. A → B is known at w iff at no w′ with Rww′, A is known without B also being known. This is not the same as the condition that A is unknown or B is known at w. For if we know neither A nor B at w, there might be a future epistemic state w′ with Rww′ such that at w′, A is known without B also being known.
We know ¬A only if there is no possible future epistemic state
in which we know A. Here the idea is that if A were knowable,
then in some possible future epistemic state A becomes known.
Since we can’t know ⊥, in that future epistemic state, we would
know A but not know ⊥.
On this interpretation the principle of excluded middle fails.
For there are some A which we don’t yet know, but which we might
come to know. For such an A, both A and ¬A are unknown, so
A ∨ ¬A is not known. But we do know, e.g., that ¬(A ∧ ¬A). For
no future state in which we know both A and ¬A is possible, and
we know this independently of whether or not we know A or ¬A.
Relational models are not the only available semantics for
intuitionistic logic. The topological semantics is another: here
propositions are interpreted as open sets in a topological space,
and the connectives are interpreted as operations on these sets
(e.g., ∧ corresponds to intersection).
8.2 Relational models
In order to give a precise semantics for intuitionistic proposi-
tional logic, we have to give a definition of what counts as a model
relative to which we can evaluate formulas. On the basis of such
a definition it is then also possible to define semantic notions
such as validity and entailment. One such semantics is given by
relational models.
Definition 8.1. A relational model for intuitionistic propositional
logic is a triple M = hW, R,V i, where
1. W is a non-empty set,
2. R is a reflexive and transitive binary relation on W, and
3. V is a function assigning to each propositional variable p a subset of W, such that
4. V is monotone with respect to R, i.e., if w ∈ V(p) and Rww′, then w′ ∈ V(p).
Definition 8.2. We define the notion of A being true at w in M, M, w ⊩ A, inductively as follows:
1. A ≡ p: M, w ⊩ A iff w ∈ V(p).
2. A ≡ ⊥: not M, w ⊩ A.
3. A ≡ ¬B: M, w ⊩ A iff for no w′ such that Rww′, M, w′ ⊩ B.
4. A ≡ B ∧ C: M, w ⊩ A iff M, w ⊩ B and M, w ⊩ C.
5. A ≡ B ∨ C: M, w ⊩ A iff M, w ⊩ B or M, w ⊩ C (or both).
6. A ≡ B → C: M, w ⊩ A iff for every w′ such that Rww′, not M, w′ ⊩ B or M, w′ ⊩ C (or both).
We write M, w ⊮ A if not M, w ⊩ A. If Γ is a set of formulas, M, w ⊩ Γ means M, w ⊩ B for all B ∈ Γ.
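For finite models, Definition 8.2 is directly executable. Here is a sketch in Haskell (the datatype and names are ours; worlds, the accessibility relation, and the valuation are given as lists and functions, and monotonicity of V is an assumption the code does not enforce):

    -- A sketch of Definition 8.2 for finite relational models.
    data Form = P String | Falsum | Neg Form | And Form Form
              | Or Form Form | Cond Form Form

    data Model w = Model { worlds :: [w]
                         , rel    :: w -> w -> Bool
                         , val    :: String -> [w] }

    -- holds m w a implements M, w ⊩ A.
    holds :: Eq w => Model w -> w -> Form -> Bool
    holds m w (P p)      = w `elem` val m p
    holds _ _ Falsum     = False
    holds m w (Neg b)    = and [ not (holds m v b)
                               | v <- worlds m, rel m w v ]
    holds m w (And b c)  = holds m w b && holds m w c
    holds m w (Or b c)   = holds m w b || holds m w c
    holds m w (Cond b c) = and [ not (holds m v b) || holds m v c
                               | v <- worlds m, rel m w v ]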
Proposition 8.3. Truth at worlds is monotonic with respect to R, i.e., if M, w ⊩ A and Rww′, then M, w′ ⊩ A.
Proof. Exercise.
8.3 Semantic Notions
Definition 8.4. We say A is true in the model M = ⟨W, R, V⟩, M ⊩ A, iff M, w ⊩ A for all w ∈ W. A is valid, ⊨ A, iff it is true in all models. We say a set of formulas Γ entails A, Γ ⊨ A, iff for every model M and every w such that M, w ⊩ Γ, M, w ⊩ A.
Proposition 8.5. 1. If M, w ⊩ Γ and Γ ⊨ A, then M, w ⊩ A.
2. If M ⊩ Γ and Γ ⊨ A, then M ⊩ A.
Proof. 1. Immediate from the definition of entailment.
2. Suppose M ⊩ Γ, i.e., M, u ⊩ Γ for every u ∈ W. By (1), M, u ⊩ A for every u ∈ W, i.e., M ⊩ A.
8.4 Topological Semantics
Another way to provide a semantics for intuitionistic logic is us-
ing the mathematical concept of a topology.
Definition 8.6. Let X be a set. A topology on X is a set O ⊆ ℘(X) that satisfies the properties below. The elements of O are called the open sets of the topology. The set X together with O is called a topological space.
1. The empty set and the entire space are open: ∅, X ∈ O.
2. Open sets are closed under finite intersections: if U, V ∈ O then U ∩ V ∈ O.
3. Open sets are closed under arbitrary unions: if Ui ∈ O for all i ∈ I, then ⋃{Ui : i ∈ I} ∈ O.
We may write X for a topological space if the collection O of open sets can be inferred from the context; note that, still, only after X is endowed with open sets can it be called a topological space.
Definition 8.7. A topological model of intuitionistic propositional
logic is a triple X = hX, O,V i where O is a topology on X and
V is a function assigning an open set in O to each propositional
variable.
Given a topological model X, we can define ⟦A⟧_X inductively as follows:
1. ⟦⊥⟧_X = ∅
2. ⟦p⟧_X = V(p)
3. ⟦A ∧ B⟧_X = ⟦A⟧_X ∩ ⟦B⟧_X
4. ⟦A ∨ B⟧_X = ⟦A⟧_X ∪ ⟦B⟧_X
5. ⟦A → B⟧_X = Int((X \ ⟦A⟧_X) ∪ ⟦B⟧_X)
Here, Int(V) is the function that maps a set V ⊆ X to its interior, that is, the union of all open sets it contains. In other words,

    Int(V) = ⋃{U : U ⊆ V and U ∈ O}.

Note that the interior of any set is always open, since it is a union of open sets. Thus, ⟦A⟧_X is always an open set.
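For a finite space, Int and the clause for → can be computed directly. A Haskell sketch (representation and names ours, with sets as lists):

    import Data.List (nub)

    -- interior opens v: the union of all open sets contained in v.
    interior :: Eq a => [[a]] -> [a] -> [a]
    interior opens v = nub (concat [ u | u <- opens, all (`elem` v) u ])

    -- ⟦A → B⟧ = Int((X \ ⟦A⟧) ∪ ⟦B⟧), where space is the set X.
    condSem :: Eq a => [a] -> [[a]] -> [a] -> [a] -> [a]
    condSem space opens a b =
      interior opens (nub ([ x | x <- space, x `notElem` a ] ++ b))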
Although topological semantics is highly abstract, there are
ways to think about it that might motivate it. Suppose that the
elements, or “points,” of X are points at which statements can be
evaluated. The set of all points where A is true is the proposition
expressed by A. Not every set of points is a potential proposition; only the elements of O are. A ⊨ B iff B is true at every point at which A is true, i.e., ⟦A⟧_X ⊆ ⟦B⟧_X, for all X. The absurd statement ⊥ is never true, so ⟦⊥⟧_X = ∅. How must the propositions expressed by B ∧ C, B ∨ C, and B → C be related to those expressed by B and C for the intuitionistically valid laws to hold, i.e., so that A ⊢ B iff ⟦A⟧_X ⊆ ⟦B⟧_X? ⊥ ⊢ A for any A, and only ∅ is a subset of every U. Since B ∧ C ⊢ B, ⟦B ∧ C⟧_X ⊆ ⟦B⟧_X, and similarly ⟦B ∧ C⟧_X ⊆ ⟦C⟧_X. The largest set satisfying W ⊆ U and W ⊆ V is U ∩ V. Conversely, B ⊢ B ∨ C and C ⊢ B ∨ C, and so ⟦B⟧_X ⊆ ⟦B ∨ C⟧_X and ⟦C⟧_X ⊆ ⟦B ∨ C⟧_X. The smallest set W such that U ⊆ W and V ⊆ W is U ∪ V. The definition for → is tricky: A → B expresses the weakest proposition that, combined with A, entails B. That A → B combined with A entails B is clear from (A → B) ∧ A ⊢ B. So ⟦A → B⟧_X should be the greatest open set such that ⟦A → B⟧_X ∩ ⟦A⟧_X ⊆ ⟦B⟧_X, leading to our definition.
Problems
Problem 8.1. Show that according to Definition 8.2, M, w ⊩ ¬A iff M, w ⊩ A → ⊥.
Problem 8.2. Prove Proposition 8.3.
CHAPTER 9
Soundness and
Completeness
9.1 Soundness of Axiomatic Derivations
Theorem 9.1 (Soundness). If Γ ⊢ A, then Γ ⊨ A.
Proof. We prove that if Γ ⊢ A, then Γ ⊨ A. The proof is by induction on the number n of formulas in the derivation of A from Γ. We show that if A1, . . . , An = A is a derivation from Γ, then Γ ⊨ An. Note that if A1, . . . , An is a derivation, so is A1, . . . , Ak for any k < n.
There are no derivations of length 0, so for n = 0 the claim holds vacuously. So suppose the claim holds for all derivations of length < n. We distinguish cases according to the justification of An.
1. An is an axiom. All axioms are valid, so Γ ⊨ An for any Γ.
2. An ∈ Γ. Then for any M and w, if M, w ⊩ Γ, obviously M, w ⊩ An, i.e., Γ ⊨ An.
3. An follows by mp from Ai and Aj ≡ Ai → An. A1, . . . , Ai and A1, . . . , Aj are derivations from Γ, so by inductive hypothesis, Γ ⊨ Ai and Γ ⊨ Ai → An.
Suppose M, w ⊩ Γ. Since M, w ⊩ Γ and Γ ⊨ Ai → An, M, w ⊩ Ai → An. By definition, this means that for all w′ such that Rww′, if M, w′ ⊩ Ai then M, w′ ⊩ An. Since R is reflexive, w is among the w′ such that Rww′, i.e., we have that if M, w ⊩ Ai then M, w ⊩ An. Since Γ ⊨ Ai, M, w ⊩ Ai. So, M, w ⊩ An, as we wanted to show.
9.2 Soundness of Natural Deduction
Theorem 9.2 (Soundness). If Γ ⊢ A, then Γ ⊨ A.
Proof. We prove that if Γ ⊢ A, then Γ ⊨ A. The proof is by induction on the derivation of A from Γ.
1. If the derivation consists of just the assumption A, we have A ⊢ A, and want to show that A ⊨ A. Consider any model M and world w such that M, w ⊩ A. Then trivially M, w ⊩ A.
2. The derivation ends in ∧Intro: Exercise.
3. The derivation ends in ∧Elim: Exercise.
4. The derivation ends in ∨Intro: Suppose the premise is B, and the undischarged assumptions of the derivation ending in B are Γ. Then we have Γ ⊢ B and by inductive hypothesis, Γ ⊨ B. We have to show that Γ ⊨ B ∨ C. Suppose M ⊩ Γ. Since Γ ⊨ B, M ⊩ B. But then also M ⊩ B ∨ C. Similarly, if the premise is C, we have that Γ ⊨ C.
5. The derivation ends in ∨Elim: The derivations ending in the premises are of B ∨ C from undischarged assumptions Γ, of D from undischarged assumptions ∆1 ∪ {B}, and of D from undischarged assumptions ∆2 ∪ {C}. So we have Γ ⊢ B ∨ C, ∆1 ∪ {B} ⊢ D, and ∆2 ∪ {C} ⊢ D. By induction
hypothesis, Γ ⊨ B ∨ C, ∆1 ∪ {B} ⊨ D, and ∆2 ∪ {C} ⊨ D. We have to prove that Γ ∪ ∆1 ∪ ∆2 ⊨ D.
Suppose M ⊩ Γ ∪ ∆1 ∪ ∆2. Then M ⊩ Γ and since Γ ⊨ B ∨ C, M ⊩ B ∨ C. By definition of M ⊩ B ∨ C, either M ⊩ B or M ⊩ C. So we distinguish cases: (a) M ⊩ B. Then M ⊩ ∆1 ∪ {B}. Since ∆1 ∪ {B} ⊨ D, we have M ⊩ D. (b) M ⊩ C. Then M ⊩ ∆2 ∪ {C}. Since ∆2 ∪ {C} ⊨ D, we have M ⊩ D. So in either case, M ⊩ D, as we wanted to show.
6. The derivation ends with →Intro concluding B → C. Then the premise is C, and the derivation ending in the premise has undischarged assumptions Γ ∪ {B}. So we have that Γ ∪ {B} ⊢ C, and by induction hypothesis that Γ ∪ {B} ⊨ C. We have to show that Γ ⊨ B → C.
Suppose M, w ⊩ Γ. We want to show that for all w′ such that Rww′, if M, w′ ⊩ B, then M, w′ ⊩ C. So assume that Rww′ and M, w′ ⊩ B. By Proposition 8.3, M, w′ ⊩ Γ. Since Γ ∪ {B} ⊨ C, M, w′ ⊩ C, which is what we wanted to show.
7. The derivation ends in →Elim and conclusion C. The premises are B → C and B, with derivations from undischarged assumptions Γ, ∆. So we have Γ ⊢ B → C and ∆ ⊢ B. By inductive hypothesis, Γ ⊨ B → C and ∆ ⊨ B. We have to show that Γ ∪ ∆ ⊨ C.
Suppose M, w ⊩ Γ ∪ ∆. Since M, w ⊩ Γ and Γ ⊨ B → C, M, w ⊩ B → C. By definition, this means that for all w′ such that Rww′, if M, w′ ⊩ B then M, w′ ⊩ C. Since R is reflexive, w is among the w′ such that Rww′, i.e., we have that if M, w ⊩ B then M, w ⊩ C. Since M, w ⊩ ∆ and ∆ ⊨ B, M, w ⊩ B. So, M, w ⊩ C, as we wanted to show.
8. The derivation ends in ⊥I, concluding A. The premise is ⊥ and the undischarged assumptions of the derivation of the premise are Γ. Then Γ ⊢ ⊥. By inductive hypothesis, Γ ⊨ ⊥. We have to show Γ ⊨ A.
We proceed indirectly. If Γ ⊭ A there is a model M and world w such that M, w ⊩ Γ and M, w ⊮ A. Since Γ ⊨ ⊥, M, w ⊩ ⊥. But that's impossible, since by definition, M, w ⊮ ⊥. So Γ ⊨ A.
9. The derivation ends in ¬Intro: Exercise.
10. The derivation ends in ¬Elim: Exercise.
9.3 Lindenbaum’s Lemma
Definition 9.3. A set of formulas Γ is prime iff
1. Γ is consistent,
2. if Γ ⊢ A then A ∈ Γ, and
3. if A ∨ B ∈ Γ then A ∈ Γ or B ∈ Γ.
Lemma 9.4 (Lindenbaum's Lemma). If Γ ⊬ A, there is a Γ* ⊇ Γ such that Γ* is prime and Γ* ⊬ A.
Proof. Let B1 ∨ C1, B2 ∨ C2, . . . , be an enumeration of all formulas of the form B ∨ C. We'll define an increasing sequence of sets of formulas Γn, where each Γn+1 is defined as Γn together with one new formula. Γ* will be the union of all Γn. The new formulas are selected so as to ensure that Γ* is prime and still Γ* ⊬ A. This means that at each step we should find the first disjunction Bi ∨ Ci such that:
1. Γn ⊢ Bi ∨ Ci
2. Bi ∉ Γn and Ci ∉ Γn
We add to Γn either Bi if Γn ∪ {Bi} ⊬ A, or Ci otherwise. We'll have to show that this works. For now, let's define i(n) as the least i such that (1) and (2) hold.
Define Γ0 = Γ and

    Γn+1 = Γn ∪ {Bi(n)}   if Γn ∪ {Bi(n)} ⊬ A
    Γn+1 = Γn ∪ {Ci(n)}   otherwise.

If i(n) is undefined, i.e., whenever Γn ⊢ B ∨ C, either B ∈ Γn or C ∈ Γn, we let Γn+1 = Γn. Now let Γ* = ⋃_{n=0}^∞ Γn.
First we show that for all n, Γn ⊬ A. We proceed by induction on n. For n = 0 the claim holds by the hypothesis of the theorem, i.e., Γ ⊬ A. If n > 0, we have to show that if Γn ⊬ A then Γn+1 ⊬ A. If i(n) is undefined, Γn+1 = Γn and there is nothing to prove. So suppose i(n) is defined. For simplicity, let i = i(n).
We'll prove the contrapositive of the claim. Suppose Γn+1 ⊢ A. By construction, Γn+1 = Γn ∪ {Bi} if Γn ∪ {Bi} ⊬ A, or else Γn+1 = Γn ∪ {Ci}. It clearly can't be the first, since then Γn+1 ⊬ A. Hence, Γn ∪ {Bi} ⊢ A and Γn+1 = Γn ∪ {Ci}. By definition of i(n), we have that Γn ⊢ Bi ∨ Ci. We have Γn ∪ {Bi} ⊢ A. We also have Γn+1 = Γn ∪ {Ci} ⊢ A. Hence, Γn ⊢ A, which is what we wanted to show.
If Γ* ⊢ A, there would be some finite subset Γ′ ⊆ Γ* such that Γ′ ⊢ A. Each D ∈ Γ′ must be in Γi for some i. Let n be the largest of these. Since Γi ⊆ Γn if i ≤ n, Γ′ ⊆ Γn. But then Γn ⊢ A, contrary to our proof above that Γn ⊬ A.
Lastly, we show that Γ* is prime, i.e., satisfies conditions (1), (2), and (3) of Definition 9.3.
First, Γ* ⊬ A, so Γ* is consistent, so (1) holds.
We now show that if Γ* ⊢ B ∨ C, then either B ∈ Γ* or C ∈ Γ*. This proves (3), since if B ∈ Γ* then also Γ* ⊢ B, and similarly for C. So assume Γ* ⊢ B ∨ C but B ∉ Γ* and C ∉ Γ*. Since Γ* ⊢ B ∨ C, Γn ⊢ B ∨ C for some n. B ∨ C appears on the enumeration of all disjunctions, say as Bj ∨ Cj. Bj ∨ Cj satisfies the properties in the definition of i(n), namely we have Γn ⊢ Bj ∨ Cj, while Bj ∉ Γn and Cj ∉ Γn. At each stage, at least one fewer
disjunction Bi ∨ Ci satisfies the conditions (since at each stage we add either Bi or Ci), so at some stage m we will have j = i(m). But then either B ∈ Γm+1 or C ∈ Γm+1, contrary to the assumption that B ∉ Γ* and C ∉ Γ*.
Now suppose Γ* ⊢ A. Then Γ* ⊢ A ∨ A. But we've just proved that if Γ* ⊢ A ∨ A then A ∈ Γ*. Hence, Γ* satisfies condition (2) of Definition 9.3.
9.4 The Canonical Model
The worlds in our model will be finite sequences σ of natural numbers, i.e., σ ∈ N∗. Note that N∗ is inductively defined by:
1. Λ ∈ N∗.
2. If σ ∈ N∗ and n ∈ N, then σ.n ∈ N∗ (where σ.n is σ⌢⟨n⟩).
3. Nothing else is in N∗.
So we can use N∗ to give inductive definitions.
Let ⟨B1, C1⟩, ⟨B2, C2⟩, . . . , be an enumeration of all pairs of formulas. Given a set of formulas ∆, define ∆(σ) by induction as follows:
1. ∆(Λ) = ∆
2. ∆(σ.n) = (∆(σ) ∪ {Bn})* if ∆(σ) ∪ {Bn} ⊬ Cn, and ∆(σ.n) = ∆(σ) otherwise.
Here by (∆(σ) ∪ {Bn})* we mean the prime set of formulas which exists by Lemma 9.4 applied to the set ∆(σ) ∪ {Bn}. Note that by this definition, if ∆(σ) ∪ {Bn} ⊬ Cn, then ∆(σ.n) ⊢ Bn and ∆(σ.n) ⊬ Cn. Note also that ∆(σ) ⊆ ∆(σ.n) for any n. If ∆ is prime, then ∆(σ) is prime for all σ.
Definition 9.5. Suppose ∆ is prime. Then the canonical model for ∆ is defined by:
1. W = N∗, the set of finite sequences of natural numbers.
2. R is the partial order according to which Rσσ′ iff σ is an initial segment of σ′ (i.e., σ′ = σ⌢σ″ for some sequence σ″).
3. V(p) = {σ : p ∈ ∆(σ)}.
It is easy to verify that R is indeed a partial order. Also, the monotonicity condition on V is satisfied. Since ∆(σ) ⊆ ∆(σ.n) we get ∆(σ) ⊆ ∆(σ′) whenever Rσσ′ by induction on σ.
9.5 The Truth Lemma
Lemma 9.6. If ∆ is prime, then M(∆), σ ⊩ A iff ∆(σ) ⊢ A.
Proof. By induction on A.
1. A ≡ ⊥: Since ∆(σ) is prime, it is consistent, so ∆(σ) ⊬ A. By definition, M(∆), σ ⊮ A.
2. A ≡ p: By definition of ⊩, M(∆), σ ⊩ A iff σ ∈ V(p), i.e., ∆(σ) ⊢ A.
3. A ≡ ¬B: exercise.
4. A ≡ B ∧ C: M(∆), σ ⊩ A iff M(∆), σ ⊩ B and M(∆), σ ⊩ C. By induction hypothesis, M(∆), σ ⊩ B iff ∆(σ) ⊢ B, and similarly for C. But ∆(σ) ⊢ B and ∆(σ) ⊢ C iff ∆(σ) ⊢ A.
5. A ≡ B ∨ C: M(∆), σ ⊩ A iff M(∆), σ ⊩ B or M(∆), σ ⊩ C. By induction hypothesis, this holds iff ∆(σ) ⊢ B or ∆(σ) ⊢ C. We have to show that this in turn holds iff ∆(σ) ⊢ A. The left-to-right direction is clear. The right-to-left direction follows since ∆(σ) is prime.
6. A ≡ B → C: First the contrapositive of the left-to-right direction: Assume ∆(σ) ⊬ B → C. Then also ∆(σ) ∪ {B} ⊬ C. Since ⟨B, C⟩ is ⟨Bn, Cn⟩ for some n, we have ∆(σ.n) = (∆(σ) ∪ {B})*, and ∆(σ.n) ⊢ B but ∆(σ.n) ⊬ C. By inductive hypothesis, M(∆), σ.n ⊩ B and M(∆), σ.n ⊮ C. Since Rσ(σ.n), this means that M(∆), σ ⊮ A.
Now assume ∆(σ) ⊢ B → C, and let Rσσ′. Since ∆(σ) ⊆ ∆(σ′), we have: if ∆(σ′) ⊢ B, then ∆(σ′) ⊢ C. In other words, for every σ′ such that Rσσ′, either ∆(σ′) ⊬ B or ∆(σ′) ⊢ C. By induction hypothesis, this means that whenever Rσσ′, either M(∆), σ′ ⊮ B or M(∆), σ′ ⊩ C, i.e., M(∆), σ ⊩ A.
9.6 The Completeness Theorem
Theorem 9.7. If Γ ⊨ A then Γ ⊢ A.
Proof. We prove the contrapositive: Suppose Γ ⊬ A. Then by Lemma 9.4, there is a prime set Γ* ⊇ Γ such that Γ* ⊬ A. Consider the canonical model M(Γ*) for Γ* as defined in Definition 9.5. For any B ∈ Γ, Γ* ⊢ B. Note that Γ*(Λ) = Γ*. By the Truth Lemma (Lemma 9.6), we have M(Γ*), Λ ⊩ B for all B ∈ Γ and M(Γ*), Λ ⊮ A. This shows that Γ ⊭ A.
Problems
Problem 9.1. Complete the proof of Theorem 9.2. For the cases for ¬Intro and ¬Elim, use the definition of M, w ⊩ ¬A in Definition 8.2, i.e., don't treat ¬A as defined by A → ⊥.
PART III
Counterfactuals
CHAPTER 10
Introduction
10.1 The Material Conditional
In its simplest form in English, a conditional is a sentence of the
form “If . . . then . . . ,” where the . . . are themselves sentences,
such as “If the butler did it, then the gardener is innocent.” In
introductory logic courses, we learn to symbolize conditionals us-
ing the → connective: symbolize the parts indicated by . . . , e.g.,
by formulas A and B, and the entire conditional is symbolized by
A → B.
The connective → is truth-functional, i.e., the truth value—T
or F—of A → B is determined by the truth values of A and B:
A → B is true iff A is false or B is true, and false otherwise.
Relative to a truth value assignment v, we define v ⊨ A → B iff v ⊭ A or v ⊨ B. The connective → with this semantics is called
the material conditional.
This definition results in a number of elementary logical facts.
First of all, the deduction theorem holds for the material condi-
tional:
If Γ, A ⊨ B then Γ ⊨ A → B (10.1)
It is truth-functional: A → B and ¬A ∨ B are equivalent:
A → B ⊨ ¬A ∨ B (10.2)
¬A ∨ B ⊨ A → B (10.3)
A material conditional is entailed by its consequent and by the negation of its antecedent:
B ⊨ A → B (10.4)
¬A ⊨ A → B (10.5)
A false material conditional is equivalent to the conjunction of its antecedent and the negation of its consequent: if A → B is false, A ∧ ¬B is true, and vice versa:
¬(A → B) ⊨ A ∧ ¬B (10.6)
A ∧ ¬B ⊨ ¬(A → B) (10.7)
The material conditional supports modus ponens:
A, A → B ⊨ B (10.8)
The material conditional agglomerates:
A → B, A → C ⊨ A → (B ∧ C) (10.9)
We can always strengthen the antecedent, i.e., the conditional is monotonic:
A → B ⊨ (A ∧ C) → B (10.10)
The material conditional is transitive, i.e., the chain rule is valid:
A → B, B → C ⊨ A → C (10.11)
The material conditional is equivalent to its contrapositive:
A → B ⊨ ¬B → ¬A (10.12)
¬B → ¬A ⊨ A → B (10.13)
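Since the material conditional is truth-functional, all of these facts can be verified by brute force over the finitely many relevant truth value assignments. A quick Haskell check of (10.2)/(10.3) and (10.10), for instance (our own illustration):

    -- The material conditional on truth values.
    impl :: Bool -> Bool -> Bool
    impl a b = not a || b

    -- (10.2)/(10.3): A → B and ¬A ∨ B agree on every valuation.
    checkEquiv :: Bool
    checkEquiv = and [ impl a b == (not a || b)
                     | a <- [False, True], b <- [False, True] ]

    -- (10.10): whenever A → B is true, so is (A ∧ C) → B.
    checkMonotonic :: Bool
    checkMonotonic = and [ not (impl a b) || impl (a && c) b
                         | a <- [False, True], b <- [False, True]
                         , c <- [False, True] ]

Both checks evaluate to True.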
These are all useful and unproblematic inferences in mathe-
matical reasoning. However, the philosophical and linguistic liter-
ature is replete with purported counterexamples to the corresponding inferences in non-mathematical contexts. These suggest that the
material conditional → is not—or at least not always—the ap-
propriate connective to use when symbolizing English “if . . . then
. . . ” statements.
10.2 Paradoxes of the Material Conditional
One of the first to criticize the use of A →B as a way to symbolize
“if . . . then . . . ” statements of English was C. I. Lewis. Lewis was
criticizing the use of the material conditional in Whitehead and Russell's Principia Mathematica, where → is pronounced "implies."
Lewis rightly complained that if → meant “implies,” then any
false proposition p implies that p implies q , since p → (p → q ) is
true if p is false, and that any true proposition q implies that p
implies q , since q → (p → q ) is true if q is true.
Logicians of course know that implication, i.e., logical entail-
ment, is not a connective but a relation between formulas or state-
ments. So we should just not read → as “implies” to avoid confu-
sion.1 As long as we don’t, the particular worry that Lewis had
simply does not arise: p does not “imply” q even if we think of
p as standing for a false English sentence. To determine if p ⊨ q we must consider all valuations, and p ⊭ q even when we use p to symbolize a sentence which happens to be false.
But there is still something odd about “if . . . then. . . ” state-
ments such as Lewis’s
If the moon is made of green cheese, then 2 + 2 = 4.
and about the inferences
1 Reading “→” as “implies” is still widely practised by mathematicians and
computer scientists, although philosophers try to avoid the confusions Lewis
highlighted by pronouncing it as “only if.”
The moon is not made of green cheese. Therefore, if
the moon is made of green cheese, then 2 + 2 = 4.
2 + 2 = 4. Therefore, if the moon is made of green
cheese, then 2 + 2 = 4.
Yet, if “if . . . then . . . ” were just →, the sentence would be un-
problematically true, and the inferences unproblematically valid.
Another example concerns the tautology (A → B) ∨ (B → A).
This would suggest that if you take two indicative sentences S and
T from the newspaper at random, the sentence “If S then T , or
if T then S ” should be true.
10.3 The Strict Conditional
Lewis introduced the strict conditional J and argued that it, not the material conditional, corresponds to implication. In alethic modal logic, A J B can be defined as □(A → B). A strict conditional is thus true (at a world) iff the corresponding material conditional is necessary.
How does the strict conditional fare vis-à-vis the paradoxes of the material conditional? A strict conditional with a false antecedent, or one with a true consequent, may be true, or it may be false. Moreover, (A J B) ∨ (B J A) is not valid. The strict conditional A J B is also not equivalent to ¬A ∨ B, so it is not truth-functional.
We have:
A J B ⊨ ¬A ∨ B but: (10.14)
¬A ∨ B ⊭ A J B (10.15)
B ⊭ A J B (10.16)
¬A ⊭ A J B (10.17)
¬(A J B) ⊭ A ∧ ¬B but: (10.18)
A ∧ ¬B ⊨ ¬(A J B) (10.19)
However, the strict conditional still supports modus ponens:
A, A J B ⊨ B (10.20)
The strict conditional agglomerates:
A J B, A J C ⊨ A J (B ∧ C) (10.21)
Antecedent strengthening holds for the strict conditional:
A J B ⊨ (A ∧ C) J B (10.22)
The strict conditional is also transitive:
A J B, B J C ⊨ A J C (10.23)
Finally, the strict conditional is equivalent to its contrapositive:
A J B ⊨ ¬B J ¬A (10.24)
¬B J ¬A ⊨ A J B (10.25)
However, the strict conditional still has its own "paradoxes." Just as a material conditional with a false antecedent or a true consequent is true, a strict conditional with a necessarily false antecedent or a necessarily true consequent is true. Moreover, any true strict conditional is necessarily true, and any false strict conditional is necessarily false. In other words, we have
□¬A ⊨ A J B (10.26)
□B ⊨ A J B (10.27)
A J B ⊨ □(A J B) (10.28)
¬(A J B) ⊨ □¬(A J B) (10.29)
These are not problems if you think of J as “implies.” Logical
entailment relationships are, after all, mathematical facts and so
can’t be contingent. But they do raise issues if you want to use
J as a logical connective that is supposed to capture “if . . . then
. . . ,” especially the last two. For surely there are “if . . . then . . . ”
statements that are contingently true or contingently false—in
fact, they generally are neither necessary nor impossible.
10.4 Counterfactuals
A very common and important form of “if . . . then . . . ” construc-
tions in English are built using the past subjunctive form of to
be: “if it were the case that . . . then it would be the case that . . . ”
Because usually the antecedent of such a conditional is false, i.e.,
counter to fact, they are called counterfactual conditionals (and, because they use the subjunctive form of to be, also subjunctive conditionals). They are distinguished from indicative conditionals
which take the form of “if it is the case that . . . then it is the
case that . . . ” Counterfactual and indicative conditionals differ
in truth conditions. Consider Adams’s famous example:
If Oswald didn’t kill Kennedy, someone else did.
If Oswald hadn’t killed Kennedy, someone else would
have.
The first is indicative, the second counterfactual. The first is
clearly true: we know JFK was killed by someone, and if that
someone wasn’t (contrary to the Warren Report) Lee Harvey Os-
wald, then someone else killed JFK. The second one says some-
thing different. It claims that if Oswald hadn’t killed Kennedy,
i.e., if the Dallas shooting had been avoided or had been unsuc-
cessful, history would have subsequently unfolded in such a way
that another assassination would have been successful. In order
for it to be true, it would have to be the case that powerful forces
had conspired to ensure JFK’s death (as many JFK conspiracy
theorists believe).
It is a live debate whether the indicative conditional is cor-
rectly captured by the material conditional, in particular, whether
the paradoxes of the material conditional can be “explained” in
a way that is compatible with it giving the truth conditions for
English indicative conditionals. By contrast, it is uncontroversial that counterfactual conditionals cannot be symbolized correctly by the material conditional. That is clear because, even though generally the antecedents of counterfactuals are false, not all counterfactuals with false antecedents are true—for instance, if you believe the Warren Report, and there was no conspiracy to assassinate JFK, then Adams's counterfactual conditional is an example of a false counterfactual with a false antecedent.
Counterfactual conditionals play an important role in causal
reasoning: a prime example of the use of counterfactuals is to ex-
press causal relationships. E.g., striking a match causes it to light,
and you can express this by saying “if this match were struck,
it would light.” Material, and generally indicative conditionals,
cannot be used to express this: “the match is struck → the match
lights” is true if the match is never struck, regardless of what
would happen if it were. Even worse, “the match is struck → the
match turns into a bouquet of flowers” is also true if it is never
struck, but the match would certainly not turn into a bouquet of
flowers if it were struck.
It is still debated what exactly the correct logic of counterfactuals is. An influential analysis of counterfactuals was given
by Stalnaker and Lewis. According to them, a counterfactual “if
it were the case that S then it would be the case that T ” is true iff
T is true in the counterfactual situation (“possible world”) that
is closest to the way the actual world is and where S is true. This
is called an “ontic” analysis, since it makes reference to an ontol-
ogy of possible worlds. Other analyses make use of conditional
probabilities or theories of belief revision. There is a proliferation
of different proposed logics of counterfactuals. There isn’t even
a single Lewis-Stalnaker logic of counterfactuals: even though
Stalnaker and Lewis proposed accounts along similar lines with
reference to closest possible worlds, the assumptions they made
result in different valid inferences.
Problems
Problem 10.1. Give S5-counterexamples to the entailment relations which do not hold for the strict conditional, i.e., for:
1. ¬p ⊭ □(p → q)
2. q ⊭ □(p → q)
3. ¬□(p → q) ⊭ p ∧ ¬q
4. ⊭ □(p → q) ∨ □(q → p)
Problem 10.2. Show that the valid entailment relations hold for the strict conditional by giving S5-proofs of:
1. □(A → B) ⊨ ¬A ∨ B
2. A ∧ ¬B ⊨ ¬□(A → B)
3. A, □(A → B) ⊨ B
4. □(A → B), □(A → C) ⊨ □(A → (B ∧ C))
5. □(A → B) ⊨ □((A ∧ C) → B)
6. □(A → B), □(B → C) ⊨ □(A → C)
7. □(A → B) ⊨ □(¬B → ¬A)
8. □(¬B → ¬A) ⊨ □(A → B)
Problem 10.3. Give proofs in S5 of:
1. □B ⊨ A J B
2. A J B ⊨ □(A J B)
3. ¬(A J B) ⊨ □¬(A J B)
Use the definition of J to do so.
CHAPTER 11
Minimal
Change
Semantics
11.1 Introduction
Stalnaker and Lewis proposed accounts of counterfactual condi-
tionals such as “If the match were struck, it would light.” Their
accounts were proposals for how to properly understand the truth
conditions for such sentences. The idea behind both proposals is
this: to evaluate whether a counterfactual conditional is true, we
have to consider those possible worlds which are minimally dif-
ferent from the way the world actually is to make the antecedent
true. If the consequent is true in these possible worlds, then the
counterfactual is true. For instance, suppose I hold a match and
a matchbook in my hand. In the actual world I only look at them
and ponder what would happen if I were to strike the match. The
minimal change from the actual world where I strike the match
is that where I decide to act and strike the match. It is minimal
in that nothing else changes: I don’t also jump in the air, striking
the match doesn’t also light my hair on fire, I don’t suddenly lose
all strength in my fingers, I am not simultaneously doused with
water in a SuperSoaker ambush, etc. In that alternative possibil-
ity, the match lights. Hence, it’s true that if I were to strike the
match, it would light.
This intuitive account can be paired with formal semantics for logics of counterfactuals. Lewis introduced the symbol "□→" for the counterfactual while Stalnaker used the symbol ">". We'll use □→, and add it as a binary connective to propositional logic. So, we have, in addition to formulas of the form A → B, also formulas of the form A □→ B. The formal semantics, like the relational semantics for modal logic, is based on models in which formulas are evaluated at worlds, and the satisfaction condition defining M, w ⊩ A □→ B is given in terms of M, w′ ⊩ A and M, w′ ⊩ B for some (other) worlds w′. Which w′? Intuitively, the one(s) closest to w for which it holds that M, w′ ⊩ A. This requires that a relation of "closeness" has to be included in the model as well.
Lewis introduced an instructive way of representing counter-
factual situations graphically. Each possible world is at the center
of a set of nested spheres containing other worlds—we draw these
spheres as concentric circles. The worlds between two spheres are
equally close to the world at the center as each other, those con-
tained in a nested sphere are closer, and those in a surrounding
sphere further away.
[Figure: nested spheres around a center world w; the A-worlds in the smallest A-admitting sphere are shaded gray.]
The closest A-worlds are those worlds w′ where A is satisfied which lie in the smallest sphere around the center world w (the gray area). Intuitively, A □→ B is satisfied at w if B is true at all closest A-worlds.
11.2 Sphere Models
One way of providing a formal semantics for counterfactuals is
to turn Lewis’s informal account into a mathematical structure.
The spheres around a world w then are sets of worlds. Since the
spheres are nested, the sets of worlds around w have to be linearly
ordered by the subset relation.
Definition 11.1. A sphere model is a triple M = hW,O,V i where
W is a non-empty set of worlds, V : At0 → ℘(W ) is a valua-
tion, and O : W → ℘(℘(W )) assigns to each world w a system of
spheres O w . For each w, O w is a set of sets of worlds, and must
satisfy:
1. O w is centered on w: {w } ∈ O w .
2. O w is nested: whenever S 1 , S 2 ∈ O w , S 1 ⊆ S 2 or S 2 ⊆ S 1 , i.e.,
O w is linearly ordered by ⊆.
3. O w is closed under non-empty unions.
4. O w is closed under non-empty intersections.
The intuition behind O_w is that the worlds "around" w are stratified according to how far away they are from w. The innermost sphere is just w by itself, i.e., the set {w}: w is closer to w than the worlds in any other sphere. If S ⊊ S′, then the worlds in S′ \ S are further away from w than the worlds in S: S′ \ S is the "layer" between S and the worlds outside of S′. In particular, we have to think of the spheres as containing all the worlds within their outer surface; they are not just the individual layers.
The diagram in Figure 11.1 corresponds to the sphere model
with W = {w, w 1, . . . , w 7 }, V (p) = {w 5, w 6, w 7 }. The innermost
sphere S 1 = {w }. The closest worlds to w are w 1, w 2, w 3 , so the
Figure 11.1: Diagram of a sphere model
next larger sphere is S 2 = {w, w 1, w 2, w 3 }. The worlds further out
are w 4 , w 5 , w 6 , so the outermost sphere is S 3 = {w, w 1, . . . , w 6 }.
The system of spheres around w is O w = {S 1, S 2, S 3 }. The world w 7
is not in any sphere around w. The closest worlds in which p is
true are w 5 and w 6 , and so the smallest p-admitting sphere is S 3 .
To define satisfaction of a formula A at world w in a sphere model M, M, w ⊩ A, we expand the definition for modal formulas to include a clause for B □→ C:
Definition 11.2. M, w ⊩ B □→ C iff either
1. for all u ∈ ⋃O_w, M, u ⊮ B, or
2. for some S ∈ O_w,
a) M, u ⊩ B for some u ∈ S, and
b) for all v ∈ S, either M, v ⊮ B or M, v ⊩ C.
Figure 11.2: Non-vacuously true counterfactual

According to this definition, M, w ⊩ B □→ C iff either the antecedent B is false everywhere in the spheres around w, or there is a sphere S where B is true, and the material conditional B → C is true at all worlds in that "B-admitting" sphere. Note that we didn't require in the definition that S is the innermost B-admitting sphere, contrary to what one might expect from the
intuitive explanation. But if the condition in (2) is satisfied for some sphere S, then it is also satisfied for all B-admitting spheres S contains, and hence in particular for the innermost B-admitting sphere (if there is one).
Note also that the definition of sphere models does not require that there is an innermost B-admitting sphere: we may have an infinite sequence S1 ⊋ S2 ⊋ ⋯ ⊋ {w} of B-admitting spheres, and hence no innermost B-admitting sphere. In that case, M, w ⊩ B □→ C iff B → C holds throughout the spheres Si, Si+1, . . . , for some i.
11.3 Truth and Falsity of Counterfactuals
A counterfactual A □→ B is (non-vacuously) true if the closest A-worlds are all B-worlds, as depicted in Figure 11.2. A counterfactual is also true at w if the system of spheres around w has no A-admitting spheres at all. In that case it is vacuously true (see Figure 11.3).
It can be false in two ways. One way is if the closest A-worlds are not all B-worlds, but some of them are. In this case, A □→ ¬B is also false (see Figure 11.4). If the closest A-worlds do not overlap with the B-worlds at all, then A □→ B is false. But in this case all the closest A-worlds are ¬B-worlds, and so A □→ ¬B is true (see Figure 11.5).
Figure 11.3: Vacuously true counterfactual
Figure 11.4: False counterfactual, false opposite
Figure 11.5: False counterfactual, true opposite
Figure 11.6: Contingent counterfactual
In contrast to the strict conditional, counterfactuals may be contingent. Consider the sphere model in Figure 11.6. The A-worlds closest to u are all B-worlds, so M, u ⊩ A □→ B. But there are A-worlds closest to v which are not B-worlds, so M, v ⊮ A □→ B.
11.4 Antecedent Strengthening
"Strengthening the antecedent" refers to the inference A → C ⊨ (A ∧ B) → C. It is valid for the material conditional, but invalid for counterfactuals. Suppose it is true that if I were to strike this match, it would light. (That means, there is nothing wrong with the match or the matchbook surface, I will not break the match, etc.) But it is not true that if I were to strike this match in outer space, it would light. So the following inference is invalid:
If the match were struck, it would light.
Therefore, if the match were struck in outer space, it would light.
The Lewis-Stalnaker account of conditionals explains this: the closest world where I strike the match and I do so in outer space is much further removed from the actual world than the closest world where I strike the match is. So although it's true that the match lights in the latter, it is not in the former. And that is as it should be.

Figure 11.7: Counterexample to antecedent strengthening
Example 11.3. The sphere semantics invalidates the inference, i.e., we have p □→ r ⊭ (p ∧ q) □→ r. Consider the model M = ⟨W, O, V⟩ where W = {w, w1, w2}, O_w = {{w}, {w, w1}, {w, w1, w2}}, V(p) = {w1, w2}, V(q) = {w2}, and V(r) = {w1}. There is a p-admitting sphere S = {w, w1} and p → r is true at all worlds in it, so M, w ⊩ p □→ r. There is also a (p ∧ q)-admitting sphere S′ = {w, w1, w2} but M, w2 ⊮ (p ∧ q) → r, so M, w ⊮ (p ∧ q) □→ r (see Figure 11.7).
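Definition 11.2 and this countermodel can also be checked mechanically. A Haskell sketch (representation and names ours), with worlds 0, 1, 2 standing for w, w1, w2 and formulas given as predicates on worlds:

    -- The system of spheres around w from Example 11.3.
    spheres :: [[Int]]
    spheres = [[0], [0,1], [0,1,2]]

    -- boxArrow os b c implements M, w ⊩ B □→ C per Definition 11.2.
    boxArrow :: [[Int]] -> (Int -> Bool) -> (Int -> Bool) -> Bool
    boxArrow os b c =
         all (not . b) (concat os)   -- clause (1): vacuously true
      || any (\s -> any b s && all (\v -> not (b v) || c v) s) os

    p, q, r :: Int -> Bool
    p = (`elem` [1,2]); q = (== 2); r = (== 1)

    -- boxArrow spheres p r                   evaluates to True
    -- boxArrow spheres (\x -> p x && q x) r  evaluates to False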
11.5 Transitivity
For the material conditional, the chain rule holds: A → B, B → C ⊨ A → C. In other words, the material conditional is transitive. Is
the same true for counterfactuals? Consider the following exam-
ple due to Stalnaker.
If J. Edgar Hoover had been born a Russian, he would have been a Communist.
If J. Edgar Hoover were a Communist, he would have been a traitor.
Therefore, if J. Edgar Hoover had been born a Russian, he would have been a traitor.
If Hoover had been born (at the same time he actually did), not
in the United States, but in Russia, he would have grown up in
the Soviet Union and become a Communist (let’s assume). So
the first premise is true. Likewise, the second premise, consid-
ered in isolation is true. The conclusion, however, is false: in all
likelihood, Hoover would have been a fervent Communist if he
had been born in the USSR, and not been a traitor (to his coun-
try). The intuitive assignment of truth values is borne out by the
Stalnaker-Lewis account. The closest possible world to ours with
the only change being Hoover’s place of birth is the one where
Hoover grows up to be a good citizen of the USSR. This is the
closest possible world where the antecedent of the first premise
and of the conclusion is true, and in that world Hoover is a loyal
member of the Communist party, and so not a traitor. To eval-
uate the second premise, we have to look at a different world,
however: the closest world where Hoover is a Communist, which is one where he was born in the United States, turned Communist, and thus became a traitor.1
Example 11.4. The sphere semantics invalidates the inference, i.e., we have p □→ q, q □→ r ⊭ p □→ r. Consider the model M = ⟨W, O, V⟩ where W = {w, w1, w2}, O_w = {{w}, {w, w1}, {w, w1, w2}}, V(p) = {w2}, V(q) = {w1, w2}, and V(r) = {w1}. There is a p-admitting sphere S = {w, w1, w2} and p → q is true at all worlds in it, so M, w ⊩ p □→ q. There is also a q-admitting sphere
1 Of course, to appreciate the force of the example we have to take on board some metaphysical and political assumptions, e.g., that it is possible that Hoover could have been born to Russian parents, or that Communists in the US of the 1950s were traitors to their country.
S′ = {w, w1} and q → r is true at all worlds in it, so M, w ⊩ q □→ r. However, the p-admitting sphere {w, w1, w2} contains a world, namely w2, where M, w2 ⊮ p → r, so M, w ⊮ p □→ r.
11.6 Contraposition
Material and strict conditionals are equivalent to their contra-
positives. Counterfactuals are not. Here is an example due to
Kratzer:
If Goethe hadn’t died in 1832, he would (still) be dead
now.
If Goethe weren’t dead now, he would have died in
1832.
The first sentence is true: humans don’t live hundreds of years.
The second is clearly false: if Goethe weren’t dead now, he would
be still alive, and so couldn’t have died in 1832.
Example 11.5. The sphere semantics invalidates contraposition, i.e., we have p □→ q ⊭ ¬q □→ ¬p. Think of p as "Goethe didn't die in 1832" and q as "Goethe is dead now." We can capture this in a model M = ⟨W, O, V⟩ with W = {w, w1, w2}, O_w = {{w}, {w, w1}, {w, w1, w2}}, V(p) = {w1, w2} and V(q) = {w, w1}. So w is the actual world where Goethe died in 1832 and is still dead; w1 is the (close) world where Goethe died in, say, 1833, and is still dead; and w2 is a (remote) world where Goethe is still alive. There is a p-admitting sphere S = {w, w1} and p → q is true at all worlds in it, so M, w ⊩ p □→ q. However, the ¬q-admitting sphere {w, w1, w2} contains a world, namely w2, where q is false and p is true, so M, w2 ⊮ ¬q → ¬p, and hence M, w ⊮ ¬q □→ ¬p (see Figure 11.8).
Problems
Problem 11.1. Find a convincing, intuitive example for the fail-
ure of transitivity of counterfactuals.
Figure 11.8: Counterexample to contraposition
Problem 11.2. Draw the sphere diagram corresponding to the
counterexample in Example 11.4.
Problem 11.3. In Example 11.4, world w 2 is where Hoover is
born in Russia, is a communist, and not a traitor, and w 1 is the
world where Hoover is born in the US, is a communist, and a
traitor. In this model, w 1 is closer to w than w 2 is. Is this neces-
sary? Can you give a counterexample that does not assume that
Hoover’s being born in Russia is a more remote possibility than
him being a Communist?
PART IV
Appendices
APPENDIX A
Sets
A.1 Basics
Sets are the most fundamental building blocks of mathematical
objects. In fact, almost every mathematical object can be seen as
a set of some kind. In logic, as in other parts of mathematics,
sets and set-theoretical talk is ubiquitous. So it will be important
to discuss what sets are, and introduce the notations necessary
to talk about sets and operations on sets in a standard way.
Definition A.1 (Set). A set is a collection of objects, considered
independently of the way it is specified, of the order of the objects
in the set, or of their multiplicity. The objects making up the set
are called elements or members of the set. If a is an element of a
set X, we write a ∈ X (otherwise, a ∉ X). The set which has no
elements is called the empty set and denoted by the symbol ∅.
Example A.2. Whenever you have a bunch of objects, you can
collect them together in a set. The set of Richard’s siblings, for
instance, is a set that contains one person, and we could write
it as S = {Ruth}. In general, when we have some objects a1 ,
. . . , an , then the set consisting of exactly those objects is written
{a1, . . . , an }. Frequently we’ll specify a set by some property that
its elements share—as we just did, for instance, by specifying S
as the set of Richard’s siblings. We’ll use the following shorthand
notation for that: {x : . . . x . . .}, where the . . . x . . . stands for the
property that x has to have in order to be counted among the
elements of the set. In our example, we could have specified S
also as
S = {x : x is a sibling of Richard}.
When we say that sets are independent of the way they are
specified, we mean that the elements of a set are all that matters.
For instance, it so happens that
{Nicole, Jacob},
{x : x is a niece or nephew of Richard}, and
{x : x is a child of Ruth}
are three ways of specifying one and the same set.
Saying that sets are considered independently of the order of
their elements and their multiplicity is a fancy way of saying that
{Nicole, Jacob} and
{Jacob, Nicole}
are two ways of specifying the same set; and that
{Nicole, Jacob} and
{Jacob, Nicole, Nicole}
are also two ways of specifying the same set. In other words, all
that matters is which elements a set has. The elements of a set
are not ordered and each element occurs only once. When we
specify or describe a set, elements may occur multiple times and in
different orders, but any descriptions that only differ in the order of elements or in how many times elements are listed describe the same set.
Definition A.3 (Extensionality). If X and Y are sets, then X and
Y are identical, X = Y , iff every element of X is also an element
of Y , and vice versa.
Extensionality gives us a way for showing that sets are iden-
tical: to show that X = Y , show that whenever x ∈ X then also
x ∈ Y , and whenever y ∈ Y then also y ∈ X .
A.2 Some Important Sets
Example A.4. Mostly we’ll be dealing with sets that have math-
ematical objects as members. You will remember the various sets
of numbers: N is the set of natural numbers {0, 1, 2, 3, . . . }; Z the
set of integers,
{. . . , −3, −2, −1, 0, 1, 2, 3, . . . };
Q the set of rational numbers (Q = {z/n : z ∈ Z, n ∈ N, n ≠ 0});
and R the set of real numbers. These are all infinite sets, that
is, they each have infinitely many elements. As it turns out, N,
Z, Q have the same number of elements, while R has a whole
bunch more—N, Z, Q are "countably infinite" whereas R is "uncountable".
We’ll sometimes also use the set of positive integers Z+ =
{1, 2, 3, . . . } and the set containing just the first two natural num-
bers B = {0, 1}.
Example A.5 (Strings). Another interesting example is the set
A∗ of finite strings over an alphabet A: any finite sequence of
elements of A is a string over A. We include the empty string Λ
among the strings over A, for every alphabet A. For instance,
B∗ = {Λ, 0, 1, 00, 01, 10, 11,
000, 001, 010, 011, 100, 101, 110, 111, 0000, . . .}.
If x = x1 . . . xn ∈ A∗ is a string consisting of n "letters" from A, then we say the length of the string is n and write len(x) = n.
Example A.6 (Infinite sequences). For any set A we may also
consider the set Aω of infinite sequences of elements of A. An
infinite sequence a1 a2 a 3 a4 . . . consists of a one-way infinite list of
objects, each one of which is an element of A.
A.3 Subsets
Sets are made up of their elements, and every element of a set is a
part of that set. But there is also a sense that some of the elements
of a set taken together are a “part of” that set. For instance, the
number 2 is part of the set of integers, but the set of even numbers
is also a part of the set of integers. It’s important to keep those
two senses of being part of a set separate.
Definition A.7 (Subset). If every element of a set X is also an
element of Y , then we say that X is a subset of Y , and write
X ⊆Y.
Example A.8. First of all, every set is a subset of itself, and ∅ is
a subset of every set. The set of even numbers is a subset of the
set of natural numbers. Also, {a, b } ⊆ {a, b, c }.
But {a, b, e } is not a subset of {a, b, c }.
Note that a set may contain other sets, not just as subsets but
as elements! In particular, a set may happen to both be an el-
ement and a subset of another, e.g., {0} ∈ {0, {0}} and also
{0} ⊆ {0, {0}}.
Extensionality gives a criterion of identity for sets: X = Y iff
every element of X is also an element of Y and vice versa. The
definition of “subset” defines X ⊆ Y precisely as the first half of
this criterion: every element of X is also an element of Y . Of
course the definition also applies if we switch X and Y : Y ⊆ X
iff every element of Y is also an element of X . And that, in turn,
is exactly the “vice versa” part of extensionality. In other words,
extensionality amounts to: X = Y iff X ⊆ Y and Y ⊆ X .
Definition A.9 (Power Set). The set consisting of all subsets of
a set X is called the power set of X , written ℘(X ).
℘(X ) = {Y : Y ⊆ X }
Example A.10. What are all the possible subsets of {a, b, c }?
They are: ∅, {a}, {b }, {c }, {a, b }, {a, c }, {b, c }, {a, b, c }. The set
of all these subsets is ℘({a, b, c }):
℘({a, b, c }) = {∅, {a}, {b }, {c }, {a, b }, {b, c }, {a, c }, {a, b, c }}
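For finite sets represented as lists, the power set is computed by Haskell's standard subsequences function (our illustration):

    import Data.List (subsequences)

    -- powerSet "abc" yields all eight subsets of {a, b, c}
    -- (as lists, with "" playing the role of ∅).
    powerSet :: [a] -> [[a]]
    powerSet = subsequences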
A.4 Unions and Intersections
We can define new sets by abstraction, and the property used to
define the new set can mention sets we’ve already defined. So for
instance, if X and Y are sets, the set {x : x ∈ X ∨ x ∈ Y } defines
a set which consists of all those objects which are elements of
either X or Y , i.e., it’s the set that combines the elements of X
and Y . This operation on sets—combining them—is very useful
and common, and so we give it a name and a symbol.
Definition A.11 (Union). The union of two sets X and Y , writ-
ten X ∪Y , is the set of all things which are elements of X , Y , or
both.
X ∪ Y = {x : x ∈ X ∨ x ∈ Y }
Example A.12. Since the multiplicity of elements doesn’t mat-
ter, the union of two sets which have an element in common con-
tains that element only once, e.g., {a, b, c }∪{a, 0, 1} = {a, b, c, 0, 1}.
The union of a set and one of its subsets is just the bigger set:
{a, b, c } ∪ {a} = {a, b, c }.
The union of a set with the empty set is identical to the set:
{a, b, c } ∪ ∅ = {a, b, c }.
The operation that forms the set of all elements that X and
Y have in common is called their intersection.
Figure A.1: The union X ∪ Y of two sets is the set of elements of X together with those of Y.
Figure A.2: The intersection X ∩Y of two sets is the set of elements they have
in common.
Definition A.13 (Intersection). The intersection of two sets X
and Y , written X ∩ Y , is the set of all things which are elements
of both X and Y .
X ∩ Y = {x : x ∈ X ∧ x ∈ Y }
Two sets are called disjoint if their intersection is empty. This
means they have no elements in common.
Example A.14. If two sets have no elements in common, their
intersection is empty: {a, b, c } ∩ {0, 1} = ∅.
If two sets do have elements in common, their intersection is
the set of all those: {a, b, c } ∩ {a, b, d } = {a, b }.
The intersection of a set with one of its subsets is just the
smaller set: {a, b, c } ∩ {a, b } = {a, b }.
The intersection of any set with the empty set is empty: {a, b, c }∩
∅ = ∅.
We can also form the union or intersection of more than two
sets. An elegant way of dealing with this in general is the follow-
ing: suppose you collect all the sets you want to form the union
(or intersection) of into a single set. Then we can define the union
of all our original sets as the set of all objects which belong to at
least one element of the set, and the intersection as the set of all
objects which belong to every element of the set.
Definition A.15. If Z is a set of sets, then Z is the set of
Ð
elements of elements of Z :
Ø
Z = {x : x belongs to an element of Z }, i.e.,
Ø
Z = {x : there is a Y ∈ Z so that x ∈ Y }
Definition A.16. If Z is a set of sets, then ⋂Z is the set of
objects which all elements of Z have in common:
⋂Z = {x : x belongs to every element of Z }, i.e.,
⋂Z = {x : for all Y ∈ Z, x ∈ Y }
Example A.17. Suppose Z = {{a, b}, {a, d, e}, {a, d}}. Then
⋃Z = {a, b, d, e} and ⋂Z = {a}.
We could also do the same for a sequence of sets X1, X2, . . .
⋃i Xi = {x : x belongs to one of the Xi }
⋂i Xi = {x : x belongs to every Xi }.
Figure A.3: The difference X \ Y of two sets is the set of those elements of X
which are not also elements of Y .
Definition A.18 (Difference). The difference X \ Y is the set of
all elements of X which are not also elements of Y , i.e.,
X \ Y = {x : x ∈ X and x ∉ Y }.
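Example A.17 and Definition A.18 can likewise be checked in Python; in this sketch, set().union(*Z) and intersection(*...) compute ⋃Z and ⋂Z for a finite collection Z:

    Z = [{"a", "b"}, {"a", "d", "e"}, {"a", "d"}]

    print(set().union(*Z))                  # the union of all sets in Z
    print(set(Z[0]).intersection(*Z[1:]))   # the intersection: {'a'}
    print({"a", "b", "c"} - {"a", "b"})     # set difference X \ Y: {'c'}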
A.5 Pairs, Tuples, Cartesian Products
Sets have no order to their elements. We just think of them as an
unordered collection. So if we want to represent order, we use
ordered pairs ⟨x, y⟩. In an unordered pair {x, y}, the order does
not matter: {x, y} = {y, x}. In an ordered pair, it does: if x ≠ y,
then ⟨x, y⟩ ≠ ⟨y, x⟩.
Sometimes we also want ordered sequences of more than
two objects, e.g., triples ⟨x, y, z⟩, quadruples ⟨x, y, z, u⟩, and so on.
In fact, we can think of triples as special ordered pairs, where
the first element is itself an ordered pair: ⟨x, y, z⟩ is short for
⟨⟨x, y⟩, z⟩. The same is true for quadruples: ⟨x, y, z, u⟩ is short for
⟨⟨⟨x, y⟩, z⟩, u⟩, and so on. In general, we talk of ordered n-tuples
⟨x1, . . . , xn⟩.
Definition A.19 (Cartesian product). Given sets X and Y , their
Cartesian product X × Y is {⟨x, y⟩ : x ∈ X and y ∈ Y }.
Example A.20. If X = {0, 1}, and Y = {1, a, b }, then their prod-
uct is
X × Y = {⟨0, 1⟩, ⟨0, a⟩, ⟨0, b⟩, ⟨1, 1⟩, ⟨1, a⟩, ⟨1, b⟩}.
Example A.21. If X is a set, the product of X with itself, X × X,
is also written X². It is the set of all pairs ⟨x, y⟩ with x, y ∈ X.
The set of all triples ⟨x, y, z⟩ is X³, and so on. We can give an
inductive definition:
X¹ = X
Xᵏ⁺¹ = Xᵏ × X
Proposition A.22. If X has n elements and Y has m elements, then
X × Y has n · m elements.
Proof. For every element x in X, there are m elements of the form
⟨x, y⟩ ∈ X × Y . Let Yx = {⟨x, y⟩ : y ∈ Y }. Since whenever x1 ≠ x2,
⟨x1, y⟩ ≠ ⟨x2, y⟩, Yx1 ∩ Yx2 = ∅. But if X = {x1, . . . , xn}, then
X × Y = Yx1 ∪ · · · ∪ Yxn, and so has n · m elements.
To visualize this, arrange the elements of X × Y in a grid:
Yx1 = {⟨x1, y1⟩ ⟨x1, y2⟩ . . . ⟨x1, ym⟩}
Yx2 = {⟨x2, y1⟩ ⟨x2, y2⟩ . . . ⟨x2, ym⟩}
. . .
Yxn = {⟨xn, y1⟩ ⟨xn, y2⟩ . . . ⟨xn, ym⟩}
Since the xi are all different, and the yj are all different, no two of
the pairs in this grid are the same, and there are n · m of them.
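The counting argument can be checked directly with Python's itertools.product, which enumerates exactly the pairs in the grid; a quick sketch:

    from itertools import product

    X = {0, 1}
    Y = {1, "a", "b"}

    pairs = list(product(X, Y))          # all pairs (x, y) with x in X, y in Y
    print(len(pairs), len(X) * len(Y))   # 6 6: n·m elements, as the proof shows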
Example A.23. If X is a set, a word over X is any sequence
of elements of X . A sequence can be thought of as an n-tuple
of elements of X . For instance, if X = {a, b, c }, then the se-
quence “bac” can be thought of as the triple ⟨b, a, c⟩. Words,
i.e., sequences of symbols, are of crucial importance in computer
science, of course. By convention, we count elements of X as
sequences of length 1, and ∅ as the sequence of length 0. The set
of all words over X then is
X* = {∅} ∪ X ∪ X² ∪ X³ ∪ . . .
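Here is a sketch that enumerates the words over X up to a given length, representing words as Python strings and the empty word as "":

    from itertools import product

    def words_up_to(X, n):
        """All words over the alphabet X of length at most n."""
        words = [""]  # the empty word, the one sequence of length 0
        for k in range(1, n + 1):
            words.extend("".join(w) for w in product(sorted(X), repeat=k))
        return words

    print(words_up_to({"a", "b", "c"}, 2))  # '', 'a', 'b', 'c', 'aa', ..., 'cc'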
A.6 Russell’s Paradox
We said that one can define a set by specifying a property that its
elements share, e.g., defining the set of Richard’s siblings as
S = {x : x is a sibling of Richard}.
In the very general context of mathematics one must be careful,
however: not every property lends itself to comprehension. Some
properties do not define sets. If they did, we would run into
outright contradictions. One example of such a case is Russell’s
Paradox.
Sets may be elements of other sets—for instance, the power
set of a set X is made up of sets. And so it makes sense, of course,
to ask or investigate whether a set is an element of another set.
Can a set be a member of itself? Nothing about the idea of a
set seems to rule this out. For instance, surely all sets form a
collection of objects, so we should be able to collect them into
a single set—the set of all sets. And it, being a set, would be
an element of the set of all sets.
Russell’s Paradox arises when we consider the property of not
having itself as an element. The set of all sets does not have this
property, but all sets we have encountered so far have it. N is not
an element of N, since it is a set, not a natural number. ℘(X ) is
generally not an element of ℘(X ); e.g., ℘(R) ∉ ℘(R) since it is a
set of sets of real numbers, not a set of real numbers. What if we
suppose that there is a set of all sets that do not have themselves
as an element? Does
R = {x : x ∉ x}
exist?
If R exists, it makes sense to ask if R ∈ R or not—it must be
either R ∈ R or R ∉ R. Suppose the former is true, i.e., R ∈ R. R was
defined as the set of all sets that are not elements of themselves,
and so if R ∈ R, then R does not have this defining property of R.
But only sets that have this property are in R, hence, R cannot
be an element of R, i.e., R ∉ R. But R can’t both be and not be
an element of R, so we have a contradiction.
Since the assumption that R ∈ R leads to a contradiction, we
have R ∉ R. But this also leads to a contradiction! For if R ∉ R, it
does have the defining property of R, and so would be an element
of R just like all the other non-self-containing sets. And again, it
can’t both not be and be an element of R.
Problems
Problem A.1. Show that there is only one empty set, i.e., show
that if X and Y are sets without members, then X = Y .
Problem A.2. List all subsets of {a, b, c, d }.
Problem A.3. Show that if X has n elements, then ℘(X ) has 2ⁿ
elements.
Problem A.4. Prove rigorously that if X ⊆ Y , then X ∪ Y = Y .
Problem A.5. Prove rigorously that if X ⊆ Y , then X ∩ Y = X .
Problem A.6. List all elements of {1, 2, 3}³.
Problem A.7. Show, by induction on k, that for all k ≥ 1, if X
has n elements, then Xᵏ has nᵏ elements.
APPENDIX B
Relations
B.1 Relations as Sets
You will no doubt remember some interesting relations between
objects of some of the sets we’ve mentioned. For instance, num-
bers come with an order relation < and from the theory of whole
numbers the relation of divisibility without remainder (usually writ-
ten n | m) may be familiar. There is also the relation is identical
with that every object bears to itself and to no other thing. But
there are many more interesting relations that we’ll encounter,
and even more possible relations. Before we review them, we’ll
just point out that we can look at relations as a special sort of set.
For this, first recall what a pair is: if a and b are two objects, we
can combine them into the ordered pair ⟨a, b⟩. Note that for or-
dered pairs the order does matter, e.g., ⟨a, b⟩ ≠ ⟨b, a⟩, in contrast
to unordered pairs, i.e., 2-element sets, where {a, b } = {b, a}.
If X and Y are sets, then the Cartesian product X × Y of X and
Y is the set of all pairs ⟨a, b⟩ with a ∈ X and b ∈ Y . In particular,
X² = X × X is the set of all pairs from X .
Now consider a relation on a set, e.g., the <-relation on the
set N of natural numbers, and consider the set of all pairs of
numbers ⟨n, m⟩ where n < m, i.e.,
R = {⟨n, m⟩ : n, m ∈ N and n < m}.
Then there is a close connection between the number n being
less than a number m and the corresponding pair ⟨n, m⟩ being a
member of R, namely, n < m if and only if ⟨n, m⟩ ∈ R. In a sense
we can consider the set R to be the <-relation on the set N. In the
same way we can construct a subset of N² for any relation between
numbers. Conversely, given any set of pairs of numbers S ⊆ N²,
there is a corresponding relation between numbers, namely, the
relationship n bears to m if and only if ⟨n, m⟩ ∈ S . This justifies
the following definition:
Definition B.1 (Binary relation). A binary relation on a set X is
a subset of X². If R ⊆ X² is a binary relation on X and x, y ∈ X ,
we write Rxy (or xRy) for ⟨x, y⟩ ∈ R.
Example B.2. The set N² of pairs of natural numbers can be
listed in a 2-dimensional matrix like this:
⟨0, 0⟩ ⟨0, 1⟩ ⟨0, 2⟩ ⟨0, 3⟩ . . .
⟨1, 0⟩ ⟨1, 1⟩ ⟨1, 2⟩ ⟨1, 3⟩ . . .
⟨2, 0⟩ ⟨2, 1⟩ ⟨2, 2⟩ ⟨2, 3⟩ . . .
⟨3, 0⟩ ⟨3, 1⟩ ⟨3, 2⟩ ⟨3, 3⟩ . . .
. . .
The subset consisting of the pairs lying on the diagonal, i.e.,
{⟨0, 0⟩, ⟨1, 1⟩, ⟨2, 2⟩, . . . },
is the identity relation on N. (Since the identity relation is popular,
let’s define IdX = {⟨x, x⟩ : x ∈ X } for any set X .) The subset of
all pairs lying above the diagonal, i.e.,
L = {⟨0, 1⟩, ⟨0, 2⟩, . . . , ⟨1, 2⟩, ⟨1, 3⟩, . . . , ⟨2, 3⟩, ⟨2, 4⟩, . . .},
is the less than relation, i.e., Lnm iff n < m. The subset of pairs
below the diagonal, i.e.,
G = {⟨1, 0⟩, ⟨2, 0⟩, ⟨2, 1⟩, ⟨3, 0⟩, ⟨3, 1⟩, ⟨3, 2⟩, . . . },
is the greater than relation, i.e., G nm iff n > m. The union of L
with I , K = L ∪ I , is the less than or equal to relation: K nm iff
n ≤ m. Similarly, H = G ∪ I is the greater than or equal to relation.
L, G , K , and H are special kinds of relations called orders. L and
G have the property that no number bears L or G to itself (i.e.,
for all n, neither Lnn nor G nn). Relations with this property are
called irreflexive, and, if they also happen to be orders, they are
called strict orders.
Although orders and identity are important and natural rela-
tions, it should be emphasized that according to our definition
any subset of X² is a relation on X , regardless of how unnatural
or contrived it seems. In particular, ∅ is a relation on any set
(the empty relation, which no pair of elements bears), and X² it-
self is a relation on X as well (one which every pair bears), called
the universal relation. But also something like E = {⟨n, m⟩ : n >
5 or m × n ≥ 34} counts as a relation.
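To make the set-of-pairs view concrete, here is a small Python sketch; it restricts N to a finite initial segment, since a computer can only enumerate finitely many pairs:

    N = range(6)  # a finite stand-in for the natural numbers

    L  = {(n, m) for n in N for m in N if n < m}    # the less than relation
    Id = {(n, n) for n in N}                        # the identity relation
    K  = L | Id                                     # less than or equal to

    print((2, 5) in L, (3, 3) in L)                          # True False
    print(K == {(n, m) for n in N for m in N if n <= m})     # True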
B.2 Special Properties of Relations
Some kinds of relations turn out to be so common that they have
been given special names. For instance, ≤ and ⊆ both relate their
respective domains (say, N in the case of ≤ and ℘(X ) in the case
of ⊆) in similar ways. To get at exactly how these relations are
similar, and how they differ, we categorize them according to
some special properties that relations can have. It turns out that
(combinations of) some of these special properties are especially
important: orders and equivalence relations.
Definition B.3 (Reflexivity). A relation R ⊆ X² is reflexive iff,
for every x ∈ X , Rxx.
Definition B.4 (Transitivity). A relation R ⊆ X² is transitive iff,
whenever Rxy and Ryz , then also Rxz .
Definition B.5 (Symmetry). A relation R ⊆ X² is symmetric iff,
whenever Rxy, then also Ryx.
Definition B.6 (Anti-symmetry). A relation R ⊆ X² is anti-
symmetric iff, whenever both Rxy and Ryx, then x = y (or, in
other words: if x ≠ y then either ¬Rxy or ¬Ryx).
In a symmetric relation, Rxy and Ryx always hold together,
or neither holds. In an anti-symmetric relation, the only way for
Rxy and Ryx to hold together is if x = y. Note that this does not
require that Rxy and Ryx hold when x = y, only that it isn’t ruled
out. So an anti-symmetric relation can be reflexive, but it is not
the case that every anti-symmetric relation is reflexive. Also note
that being anti-symmetric and merely not being symmetric are
different conditions. In fact, a relation can be both symmetric
and anti-symmetric at the same time (e.g., the identity relation
is).
Definition B.7 (Connectivity). A relation R ⊆ X² is connected if
for all x, y ∈ X , if x ≠ y, then either Rxy or Ryx.
Definition B.8 (Partial order). A relation R ⊆ X² that is reflex-
ive, transitive, and anti-symmetric is called a partial order.
Definition B.9 (Linear order). A partial order that is also con-
nected is called a linear order.
Definition B.10 (Equivalence relation). A relation R ⊆ X² that
is reflexive, symmetric, and transitive is called an equivalence re-
lation.
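All of these properties are directly checkable for relations on finite sets. A minimal sketch, with hypothetical helper names:

    def is_reflexive(R, X):
        return all((x, x) in R for x in X)

    def is_symmetric(R):
        return all((y, x) in R for (x, y) in R)

    def is_transitive(R):
        return all((x, w) in R for (x, y) in R for (z, w) in R if y == z)

    def is_antisymmetric(R):
        return all(x == y for (x, y) in R if (y, x) in R)

    X = {1, 2, 3}
    R = {(1, 1), (2, 2), (3, 3), (1, 2), (2, 3), (1, 3)}  # the relation ≤ on X
    print(is_reflexive(R, X), is_antisymmetric(R), is_transitive(R))
    # True True True: R is a partial order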
B.3 Orders
Very often we are interested in comparisons between objects,
where one object may be less or equal or greater than another
in a certain respect. Size is the most obvious example of such a
comparative relation, or order. But not all such relations are alike
in all their properties. For instance, some comparative relations
require any two objects to be comparable, others don’t. (If they
do, we call them linear or total.) Some include identity (like ≤)
and some exclude it (like <). Let’s get some order into all this.
Definition B.11 (Preorder). A relation which is both reflexive
and transitive is called a preorder.
Definition B.12 (Partial order). A preorder which is also anti-
symmetric is called a partial order.
Definition B.13 (Linear order). A partial order which is also
connected is called a total order or linear order.
Example B.14. Every linear order is also a partial order, and ev-
ery partial order is also a preorder, but the converses don’t hold.
The universal relation on X is a preorder, since it is reflexive and
transitive. But, if X has more than one element, the universal
relation is not anti-symmetric, and so not a partial order. For a
somewhat less silly example, consider the no longer than relation
≼ on B*: x ≼ y iff len(x) ≤ len(y). This is a preorder (reflexive
and transitive), and even connected, but not a partial order, since
it is not anti-symmetric. For instance, 01 ≼ 10 and 10 ≼ 01, but
01 ≠ 10.
The relation of divisibility without remainder gives us an ex-
ample of a partial order which isn’t a linear order: for integers
n, m, we say n (evenly) divides m, in symbols: n | m, if there is
some k so that m = kn. On N, this is a partial order, but not a
linear order: for instance, 2 ∤ 3 and also 3 ∤ 2. Considered as a
relation on Z, divisibility is only a preorder since anti-symmetry
fails: 1 | −1 and −1 | 1 but 1 ≠ −1. Another important partial
order is the relation ⊆ on a set of sets.
Notice that the examples L and G from Example B.2, al-
though we said there that they were called “strict orders,” are
not linear orders even though they are connected (they are not
reflexive). But there is a close connection, as we will see momen-
tarily.
Definition B.15 (Irreflexivity). A relation R on X is called ir-
reflexive if, for all x ∈ X , ¬Rxx.
Definition B.16 (Asymmetry). A relation R on X is called asym-
metric if for no pair x, y ∈ X we have both Rxy and Ryx.
Definition B.17 (Strict order). A strict order is a relation which
is irreflexive, asymmetric, and transitive.
Definition B.18 (Strict linear order). A strict order which is also
connected is called a strict linear order.
A strict order on X can be turned into a partial order by
adding the diagonal IdX , i.e., adding all the pairs ⟨x, x⟩. (This
is called the reflexive closure of R.) Conversely, starting from a
partial order, one can get a strict order by removing IdX .
Proposition B.19. 1. If R is a strict (linear) order on X , then
R⁺ = R ∪ IdX is a partial order (linear order).
2. If R is a partial order (linear order) on X , then R⁻ = R \ IdX
is a strict (linear) order.
Proof. 1. Suppose R is a strict order, i.e., R ⊆ X² and R is
irreflexive, asymmetric, and transitive. Let R⁺ = R ∪ IdX .
We have to show that R⁺ is reflexive, anti-symmetric, and
transitive.
R⁺ is clearly reflexive, since for all x ∈ X , ⟨x, x⟩ ∈ IdX ⊆ R⁺.
To show R⁺ is anti-symmetric, suppose R⁺xy and R⁺yx, i.e.,
⟨x, y⟩ and ⟨y, x⟩ ∈ R⁺, and x ≠ y. Since ⟨x, y⟩ ∈ R ∪ IdX , but
⟨x, y⟩ ∉ IdX , we must have ⟨x, y⟩ ∈ R, i.e., Rxy. Similarly
we get that Ryx. But this contradicts the assumption that
R is asymmetric.
Now suppose that R⁺xy and R⁺yz . If both ⟨x, y⟩ ∈ R and
⟨y, z⟩ ∈ R, it follows that ⟨x, z⟩ ∈ R since R is transitive.
Otherwise, either ⟨x, y⟩ ∈ IdX , i.e., x = y, or ⟨y, z⟩ ∈ IdX ,
i.e., y = z . In the first case, we have that R⁺yz by assump-
tion, x = y, hence R⁺xz . Similarly in the second case. In
either case, R⁺xz , thus, R⁺ is also transitive.
If R is connected, then for all x ≠ y, either Rxy or Ryx, i.e.,
either ⟨x, y⟩ ∈ R or ⟨y, x⟩ ∈ R. Since R ⊆ R⁺, this remains
true of R⁺, so R⁺ is connected as well.
2. Exercise.
Example B.20. ≤ is the linear order corresponding to the strict
linear order <. ⊆ is the partial order corresponding to the strict
order ⊊.
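For finite relations, the two conversions of Proposition B.19 are one-liners; a sketch:

    def reflexive_closure(R, X):
        """R plus the diagonal: turns a strict order into a partial order."""
        return set(R) | {(x, x) for x in X}

    def strict_part(R, X):
        """R minus the diagonal: turns a partial order into a strict order."""
        return set(R) - {(x, x) for x in X}

    X = {1, 2, 3}
    less = {(1, 2), (1, 3), (2, 3)}        # the strict linear order < on X
    leq = reflexive_closure(less, X)       # the corresponding linear order
    print(strict_part(leq, X) == less)     # True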
B.4 Graphs
A graph is a diagram in which points—called “nodes” or “ver-
tices” (plural of “vertex”)—are connected by edges. Graphs are
a ubiquitous tool in discrete mathematics and in computer sci-
ence. They are incredibly useful for representing, and visualizing,
relationships and structures, from concrete things like networks
of various kinds to abstract structures such as the possible out-
comes of decisions. There are many different kinds of graphs in
the literature which differ, e.g., according to whether the edges
are directed or not, have labels or not, whether there can be edges
from a node to the same node, multiple edges between the same
nodes, etc. Directed graphs have a special connection to relations.
Definition B.21 (Directed graph). A directed graph G = ⟨V, E⟩ is
a set of vertices V and a set of edges E ⊆ V².
According to our definition, a graph just is a set together with
a relation on that set. Of course, when talking about graphs, it’s
only natural to expect that they are graphically represented: we
can draw a graph by connecting two vertices v1 and v2 by an
arrow iff ⟨v1, v2⟩ ∈ E. The only difference between a relation by
itself and a graph is that a graph specifies the set of vertices, i.e., a
graph may have isolated vertices. The important point, however,
is that every relation R on a set X can be seen as a directed graph
⟨X, R⟩, and conversely, a directed graph ⟨V, E⟩ can be seen as a
relation E ⊆ V² with the set V explicitly specified.
Example B.22. The graph ⟨V, E⟩ with V = {1, 2, 3, 4} and E =
{⟨1, 1⟩, ⟨1, 2⟩, ⟨1, 3⟩, ⟨2, 3⟩} looks like this:
[Diagram: vertices 1, 2, 3, and an isolated vertex 4, with a loop at 1 and
arrows from 1 to 2, from 1 to 3, and from 2 to 3.]
This is a different graph than ⟨V′, E⟩ with V′ = {1, 2, 3}, which
looks like this:
[Diagram: the same vertices and arrows, but without the isolated vertex 4.]
B.5 Operations on Relations
It is often useful to modify or combine relations. We’ve already
used the union of relations above (which is just the union of two
relations considered as sets of pairs). Here are some other ways:
Definition B.23. Let R, S ⊆ X² be relations and Y a set.
1. The inverse R⁻¹ of R is R⁻¹ = {⟨y, x⟩ : ⟨x, y⟩ ∈ R}.
2. The relative product R | S of R and S is
(R | S ) = {⟨x, z⟩ : for some y, Rxy and S yz }
3. The restriction R↾Y of R to Y is R ∩ Y².
4. The application R[Y ] of R to Y is
R[Y ] = {y : for some x ∈ Y, Rxy }
Example B.24. Let S ⊆ Z² be the successor relation on Z, i.e.,
the set of pairs ⟨x, y⟩ where x + 1 = y, for x, y ∈ Z. S xy holds iff y
is the successor of x.
1. The inverse S⁻¹ of S is the predecessor relation, i.e., S⁻¹xy
iff x − 1 = y.
2. The relative product S | S is the relation x bears to y if
x + 2 = y.
3. The restriction of S to N is the successor relation on N.
4. The application of S to a set, e.g., S [{1, 2, 3}] is {2, 3, 4}.
Definition B.25 (Transitive closure). The transitive closure R⁺ of
a relation R ⊆ X² is R⁺ = R¹ ∪ R² ∪ R³ ∪ . . . , where R¹ = R and
Rⁱ⁺¹ = Rⁱ | R.
The reflexive transitive closure of R is R* = R⁺ ∪ IdX .
Example B.26. Take the successor relation S ⊆ Z². S²xy iff
x + 2 = y, S³xy iff x + 3 = y, etc. So S⁺xy iff for some i ≥ 1,
x + i = y. In other words, S⁺xy iff x < y (and S*xy iff x ≤ y).
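All of these operations are computable for finite relations. A sketch; for a finite relation the union R¹ ∪ R² ∪ . . . stabilizes, so we can simply iterate until nothing new appears:

    def inverse(R):
        return {(y, x) for (x, y) in R}

    def relative_product(R, S):
        return {(x, z) for (x, y) in R for (w, z) in S if y == w}

    def transitive_closure(R):
        closure = set(R)
        while True:
            bigger = closure | relative_product(closure, closure)
            if bigger == closure:
                return closure
            closure = bigger

    S = {(x, x + 1) for x in range(5)}  # successor, restricted to {0, ..., 5}
    print(transitive_closure(S) ==
          {(x, y) for x in range(6) for y in range(6) if x < y})  # True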
Problems
Problem B.1. List the elements of the relation ⊆ on the set
℘({a, b, c }).
Problem B.2. Give examples of relations that are (a) reflexive
and symmetric but not transitive, (b) reflexive and anti-symmetric,
(c) anti-symmetric, transitive, but not reflexive, and (d) reflexive,
symmetric, and transitive. Do not use relations on numbers or
sets.
Problem B.3. Complete the proof of Proposition B.19, i.e., prove
that if R is a partial order on X , then R⁻ = R \ IdX is a strict
order.
Problem B.4. Consider the less-than-or-equal-to relation ≤ on
the set {1, 2, 3, 4} as a graph and draw the corresponding diagram.
Problem B.5. Show that the transitive closure of R is in fact
transitive.
APPENDIX C
Syntax and
Semantics
C.1 Introduction
Propositional logic deals with formulas that are built from propo-
sitional variables using the propositional connectives ¬, ∧, ∨, →,
and ↔. Intuitively, a propositional variable p stands for a sen-
tence or proposition that is true or false. Whenever the “truth
values” of the propositional variables in a formula are determined,
so is the truth value of any formula formed from them using
propositional connectives. We say that propositional logic is truth
functional, because its semantics is given by functions of truth val-
ues. In particular, in propositional logic we leave out of consider-
ation any further determination of truth and falsity, e.g., whether
something is necessarily true rather than just contingently true,
or whether something is known to be true, or whether something
is true now rather than was true or will be true. We only consider
two truth values true (T) and false (F), and so exclude from dis-
cussion the possibility that a statement may be neither true nor
false, or only half true. We also concentrate only on connectives
where the truth value of a formula built from them is completely
determined by the truth values of its parts (and not, say, on its
meaning). In particular, whether conditionals in English are
truth functional in this sense is contentious. The material con-
ditional → is; other logics deal with conditionals that are not
truth functional.
In order to develop the theory and metatheory of truth-functional
propositional logic, we must first define the syntax and semantics
of its expressions. We will describe one way of constructing for-
mulas from propositional variables using the connectives. Alter-
native definitions are possible. Other systems will choose different
symbols, will select different sets of connectives as primitive, will
use parentheses differently (or even not at all, as in the case of
so-called Polish notation). What all approaches have in common,
though, is that the formation rules define the set of formulas in-
ductively. If done properly, every expression can result in essen-
tially only one way according to the formation rules. Because the
inductive definition makes expressions uniquely readable, we can
give meanings to these expressions using the same method—
inductive definition.
Giving the meaning of expressions is the domain of seman-
tics. The central concept in semantics for propositional logic is
that of satisfaction in a valuation. A valuation v assigns truth
values T, F to the propositional variables. Any valuation deter-
mines a truth value v(A) for any formula A. A formula is satisfied
in a valuation v iff v(A) = T—we write this as v ⊨ A. This rela-
tion can also be defined by induction on the structure of A, using
the truth functions for the logical connectives to define, say, sat-
isfaction of A ∧ B in terms of satisfaction (or not) of A and B.
On the basis of the satisfaction relation v ⊨ A for sentences
we can then define the basic semantic notions of tautology, en-
tailment, and satisfiability. A formula is a tautology, ⊨ A, if every
valuation satisfies it, i.e., v(A) = T for any v. It is entailed by
a set of formulas, Γ ⊨ A, if every valuation that satisfies all the
formulas in Γ also satisfies A. And a set of formulas is satisfi-
able if some valuation satisfies all formulas in it at the same time.
Because formulas are inductively defined, and satisfaction is in
turn defined by induction on the structure of formulas, we can
use induction to prove properties of our semantics and to relate
the semantic notions defined.
C.2 Propositional Formulas
Formulas of propositional logic are built up from propositional
variables and the propositional constant ⊥ using logical connectives.
The basic symbols of the language are:
1. A countably infinite set At0 of propositional variables p0 ,
p1 , . . .
2. The propositional constant for falsity ⊥.
3. The logical connectives: ¬ (negation), ∧ (conjunction), ∨
(disjunction), → (conditional)
4. Punctuation marks: (, ), and the comma.
In addition to the primitive connectives introduced above, we
also use the following defined symbols: ↔ (biconditional) and ⊤ (truth).
A defined symbol is not officially part of the language, but
is introduced as an informal abbreviation: it allows us to abbre-
viate formulas which would, if we only used primitive symbols,
get quite long. This is obviously an advantage. The bigger ad-
vantage, however, is that proofs become shorter. If a symbol is
primitive, it has to be treated separately in proofs. The more
primitive symbols, therefore, the longer our proofs.
You may be familiar with different terminology and symbols
than the ones we use above. Logic texts (and teachers) commonly
use either ∼, ¬, and ! for “negation”, ∧, ·, and & for “conjunction”.
Commonly used symbols for the “conditional” or “implication”
are →, ⇒, and ⊃. Symbols for “biconditional,” “bi-implication,”
or “(material) equivalence” are ↔, ⇔, and ≡. The ⊥ symbol
is variously called “falsity,” “falsum,” “absurdity,” or “bottom.”
The ⊤ symbol is variously called “truth,” “verum,” or “top.”
Definition C.1 (Formula). The set Frm(L0 ) of formulas of propo-
sitional logic is defined inductively as follows:
1. ⊥ is an atomic formula.
2. Every propositional variable pi is an atomic formula.
3. If A is a formula, then ¬A is a formula.
4. If A and B are formulas, then (A ∧ B) is a formula.
5. If A and B are formulas, then (A ∨ B) is a formula.
6. If A and B are formulas, then (A → B) is a formula.
7. Nothing else is a formula.
The definition of the set of formulas is an inductive defini-
tion. Essentially, we construct the set of formulas
in infinitely many stages. In the initial stage, we pronounce all
atomic formulas to be formulas; this corresponds to the first few
cases of the definition, i.e., the cases for ⊥, pi . “Atomic formula”
thus means any formula of this form.
The other cases of the definition give rules for constructing
new formulas out of formulas already constructed. At the second
stage, we can use them to construct formulas out of atomic for-
mulas. At the third stage, we construct new formulas from the
atomic formulas and those obtained in the second stage, and so
on. A formula is anything that is eventually constructed at such
a stage, and nothing else.
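To see how the stage-by-stage construction plays out concretely, here is a small Python sketch. The tuple encoding of formulas is a hypothetical illustration, not part of the official definition: "bot" stands for ⊥, ("var", n) for pn, and tagged tuples for the connectives.

    def var(n): return ("var", n)
    def neg(a): return ("not", a)
    def conj(a, b): return ("and", a, b)
    def disj(a, b): return ("or", a, b)
    def imp(a, b): return ("imp", a, b)

    def show(A):
        """Render a formula with the parentheses the official definition requires."""
        if A == "bot":
            return "⊥"
        if A[0] == "var":
            return "p" + str(A[1])
        if A[0] == "not":
            return "¬" + show(A[1])
        sym = {"and": " ∧ ", "or": " ∨ ", "imp": " → "}[A[0]]
        return "(" + show(A[1]) + sym + show(A[2]) + ")"

    # Built in stages: atomic formulas first, then ¬p1, then the conjunction, etc.
    A = imp(conj(var(0), neg(var(1))), var(1))
    print(show(A))  # ((p0 ∧ ¬p1) → p1)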
Definition C.2. Formulas constructed using the defined opera-
tors are to be understood as follows:
1. ⊤ abbreviates ¬⊥.
2. A ↔ B abbreviates (A → B) ∧ (B → A).
Definition C.3 (Syntactic identity). The symbol ≡ expresses
syntactic identity between strings of symbols, i.e., A ≡ B iff A
and B are strings of symbols of the same length and which con-
tain the same symbol in each place.
The ≡ symbol may be flanked by strings obtained by con-
catenation, e.g., A ≡ (B ∨ C ) means: the string of symbols A is
the same string as the one obtained by concatenating an opening
parenthesis, the string B, the ∨ symbol, the string C , and a clos-
ing parenthesis, in this order. If this is the case, then we know
that the first symbol of A is an opening parenthesis, A contains
B as a substring (starting at the second symbol), that substring
is followed by ∨, etc.
C.3 Preliminaries
Theorem C.4 (Principle of induction on formulas). If some prop-
erty P holds for all the atomic formulas and is such that
1. it holds for ¬A whenever it holds for A;
2. it holds for (A ∧ B) whenever it holds for A and B;
3. it holds for (A ∨ B) whenever it holds for A and B;
4. it holds for (A → B) whenever it holds for A and B;
then P holds for all formulas.
Proof. Let S be the collection of all formulas with property P .
Clearly S ⊆ Frm(L0 ). S satisfies all the conditions of Defini-
tion C.1: it contains all atomic formulas and is closed under
the logical operators. Frm(L0 ) is the smallest such class, so
Frm(L0 ) ⊆ S . So Frm(L0 ) = S , and every formula has prop-
erty P .
Proposition C.5. Any formula in Frm(L0 ) is balanced, in that it
has as many left parentheses as right ones.
Proposition C.6. No proper initial segment of a formula is a formula.
Proposition C.7 (Unique Readability). Any formula A in Frm(L0 )
has exactly one parsing as one of the following
1. ⊥.
2. pn for some pn ∈ At0 .
3. ¬B for some formula B.
4. (B ∧ C ) for some formulas B and C .
5. (B ∨ C ) for some formulas B and C .
6. (B → C ) for some formulas B and C .
Moreover, this parsing is unique.
Proof. By induction on A. For instance, suppose that A has two
distinct readings as (B → C ) and (B′ → C′). Then B and B′ must
be the same (or else one would be a proper initial segment of the
other); so if the two readings of A are distinct it must be because
C and C′ are distinct readings of the same sequence of symbols,
which is impossible by the inductive hypothesis.
Definition C.8 (Uniform Substitution). If A and B are formulas,
and pi is a propositional variable, then A[B/pi ] denotes the result
of replacing each occurrence of pi by an occurrence of B in A;
similarly, the simultaneous substitution of p 1 , . . . , p n by formulas
B 1 , . . . , Bn is denoted by A[B 1 /p 1, . . . , Bn /p n ].
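Using the tuple encoding from the sketch above, substitution is a straightforward recursion on the structure of the formula; again, only an illustrative sketch:

    def substitute(A, B, n):
        """Compute A[B/pn]: replace every occurrence of the variable pn in A by B."""
        if A == "bot":
            return A
        if A[0] == "var":
            return B if A[1] == n else A
        if A[0] == "not":
            return ("not", substitute(A[1], B, n))
        # binary connectives: substitute in both immediate subformulas
        return (A[0], substitute(A[1], B, n), substitute(A[2], B, n))

    # (p0 → p1)[¬p2/p1] is p0 → ¬p2:
    print(substitute(("imp", ("var", 0), ("var", 1)), ("not", ("var", 2)), 1))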
C.4 Valuations and Satisfaction
Definition C.9 (Valuations). Let {T, F} be the set of the two
truth values, “true” and “false.” A valuation for L0 is a func-
tion v assigning either T or F to the propositional variables of
the language, i.e., v : At0 → {T, F}.
Definition C.10. Given a valuation v, define the evaluation func-
tion v : Frm(L0 ) → {T, F} inductively by:
v(⊥) = F;
v(pn ) = v(pn ), i.e., the truth value the valuation assigns to pn;
v(¬A) = T if v(A) = F, and F otherwise;
v(A ∧ B) = T if v(A) = T and v(B) = T, and F if v(A) = F or v(B) = F;
v(A ∨ B) = T if v(A) = T or v(B) = T, and F if v(A) = F and v(B) = F;
v(A → B) = T if v(A) = F or v(B) = T, and F if v(A) = T and v(B) = F.
The valuation clauses correspond to the following truth table:
A B | A ∧ B | A ∨ B | A → B | A ↔ B
T T |   T   |   T   |   T   |   T
T F |   F   |   T   |   F   |   F
F T |   F   |   T   |   T   |   F
F F |   F   |   F   |   T   |   T
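The inductive clauses translate directly into a recursive evaluator. In this sketch, which again uses the tuple encoding of formulas from above, a valuation is a dict mapping variable indices to Python's True/False, standing in for T and F:

    def value(v, A):
        """Compute the truth value of the formula A under the valuation v."""
        if A == "bot":
            return False                 # v(⊥) = F
        if A[0] == "var":
            return v[A[1]]               # the value assigned to the variable
        if A[0] == "not":
            return not value(v, A[1])
        a, b = value(v, A[1]), value(v, A[2])
        return {"and": a and b, "or": a or b, "imp": (not a) or b}[A[0]]

    v = {0: True, 1: False}
    print(value(v, ("imp", ("var", 0), ("var", 1))))  # False: T → F is F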
Theorem C.11 (Local Determination). Suppose that v1 and v2
are valuations that agree on the propositional letters occurring in A,
i.e., v1 (pn ) = v2 (pn ) whenever pn occurs in A. Then they also agree
on A, i.e., v1 (A) = v2 (A).
Proof. By induction on A.
Definition C.12 (Satisfaction). Using the evaluation function,
we can define the notion of satisfaction of a formula A by a valua-
tion v, v ⊨ A, inductively as follows. (We write v ⊭ A to mean
“not v ⊨ A.”)
1. A ≡ ⊥: v ⊭ A.
2. A ≡ pi : v ⊨ A iff v(pi ) = T.
3. A ≡ ¬B: v ⊨ A iff v ⊭ B.
4. A ≡ (B ∧ C ): v ⊨ A iff v ⊨ B and v ⊨ C .
5. A ≡ (B ∨ C ): v ⊨ A iff v ⊨ B or v ⊨ C (or both).
6. A ≡ (B → C ): v ⊨ A iff v ⊭ B or v ⊨ C (or both).
If Γ is a set of formulas, v ⊨ Γ iff v ⊨ A for every A ∈ Γ.
Proposition C.13. v ⊨ A iff v(A) = T.
Proof. By induction on A.
C.5 Semantic Notions
We define the following semantic notions:
Definition C.14. 1. A formula A is satisfiable if for some v,
v ⊨ A; it is unsatisfiable if for no v, v ⊨ A;
2. A formula A is a tautology if v ⊨ A for all valuations v;
3. A formula A is contingent if it is satisfiable but not a tautol-
ogy;
4. If Γ is a set of formulas, Γ ⊨ A (“Γ entails A”) if and only
if v ⊨ A for every valuation v for which v ⊨ Γ.
5. If Γ is a set of formulas, Γ is satisfiable if there is a valua-
tion v for which v ⊨ Γ, and Γ is unsatisfiable otherwise.
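Since a formula contains only finitely many variables, Theorem C.11 means these notions can be checked by brute force over all relevant valuations. A sketch, building on the value function from the previous sketch:

    from itertools import product

    def variables(A):
        """The set of variable indices occurring in A."""
        if A == "bot":
            return set()
        if A[0] == "var":
            return {A[1]}
        return set().union(*(variables(part) for part in A[1:]))

    def is_tautology(A):
        vs = sorted(variables(A))
        return all(value(dict(zip(vs, bits)), A)
                   for bits in product([True, False], repeat=len(vs)))

    print(is_tautology(("or", ("var", 0), ("not", ("var", 0)))))  # True: p0 ∨ ¬p0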
Proposition C.15. 1. A is a tautology if and only if ∅ ⊨ A;
2. If Γ ⊨ A and Γ ⊨ A → B then Γ ⊨ B;
3. If Γ is satisfiable then every finite subset of Γ is also satisfiable;
4. Monotony: if Γ ⊆ ∆ and Γ ⊨ A then also ∆ ⊨ A;
5. Transitivity: if Γ ⊨ A and ∆ ∪ {A} ⊨ B then Γ ∪ ∆ ⊨ B.
Proof. Exercise.
Proposition C.16. Γ ⊨ A if and only if Γ ∪ {¬A} is unsatisfiable.
Proof. Exercise.
Theorem C.17 (Semantic Deduction Theorem). Γ ⊨ A → B if
and only if Γ ∪ {A} ⊨ B.
Proof. Exercise.
Problems
Problem C.1. Prove Proposition C.5.
Problem C.2. Prove Proposition C.6.
Problem C.3. Give a mathematically rigorous definition of A[B/p]
by induction.
Problem C.4. Prove Proposition C.13.
Problem C.5. Prove Proposition C.15.
Problem C.6. Prove Proposition C.16.
Problem C.7. Prove Theorem C.17.
APPENDIX D
Axiomatic
Derivations
D.1 Introduction
Logics commonly have both a semantics and a derivation system.
The semantics concerns concepts such as truth, satisfiability, va-
lidity, and entailment. The purpose of derivation systems is to
provide a purely syntactic method of establishing entailment and
validity. They are purely syntactic in the sense that a derivation
in such a system is a finite syntactic object, usually a sequence
(or other finite arrangement) of sentences or formulas. Good
derivation systems have the property that any given sequence or
arrangement of sentences or formulas can be verified mechani-
cally to be “correct.”
The simplest (and historically first) derivation systems for
first-order logic were axiomatic. A sequence of formulas counts
as a derivation in such a system if each individual formula in it
is either among a fixed set of “axioms” or follows from formulas
coming before it in the sequence by one of a fixed number of “in-
ference rules”—and it can be mechanically verified whether a for-
mula is an axiom and whether it follows correctly from other formu-
las by one of the inference rules. Axiomatic proof systems are
easy to describe—and also easy to handle meta-theoretically—
but derivations in them are hard to read and understand, and
are also hard to produce.
Other derivation systems have been developed with the aim
of making it easier to construct derivations or easier to under-
stand derivations once they are complete. Examples are natural
deduction, truth trees, also known as tableaux proofs, and the
sequent calculus. Some derivation systems are designed espe-
cially with mechanization in mind, e.g., the resolution method is
easy to implement in software (but its derivations are essentially
impossible to understand). Most of these other proof systems
represent derivations as trees of formulas rather than sequences.
This makes it easier to see which parts of a derivation depend on
which other parts.
So for a given logic, such as first-order logic, the different
derivation systems will give different explications of what it is for
a sentence to be a theorem and what it means for a sentence to be
derivable from some others. However that is done (via axiomatic
derivations, natural deductions, sequent derivations, truth trees,
resolution refutations), we want these relations to match the se-
mantic notions of validity and entailment. Let’s write ⊢ A for “A is
a theorem” and “Γ ⊢ A” for “A is derivable from Γ.” However
⊢ is defined, we want it to match up with ⊨, that is:
1. ⊢ A if and only if ⊨ A
2. Γ ⊢ A if and only if Γ ⊨ A
The “only if” direction of the above is called soundness. A deriva-
tion system is sound if derivability guarantees entailment (or va-
lidity). Every decent derivation system has to be sound; unsound
derivation systems are not useful at all. After all, the entire pur-
pose of a derivation is to provide a syntactic guarantee of validity
or entailment. We’ll prove soundness for the derivation systems
we present.
The converse “if” direction is also important: it is called com-
pleteness. A complete derivation system is strong enough to show
that A is a theorem whenever A is valid, and that Γ ⊢ A
whenever Γ ⊨ A. Completeness is harder to establish, and some
logics have no complete derivation systems. First-order logic
does. Kurt Gödel was the first one to prove completeness for
a derivation system of first-order logic in his 1929 dissertation.
Another concept that is connected to derivation systems is
that of consistency. A set of sentences is called inconsistent if any-
thing whatsoever can be derived from it, and consistent other-
wise. Inconsistency is the syntactic counterpart to unsatisfiablity:
like unsatisfiable sets, inconsistent sets of sentences do not make
good theories, they are defective in a fundamental way. Consis-
tent sets of sentences may not be true or useful, but at least they
pass that minimal threshold of logical usefulness. For different
derivation systems the specific definition of consistency of sets of
sentences might differ, but like ⊢, we want consistency to coincide
with its semantic counterpart, satisfiability. We want it to always
be the case that Γ is consistent if and only if it is satisfiable. Here,
the “if” direction amounts to completeness (consistency guaran-
tees satisfiability), and the “only if” direction amounts to sound-
ness (satisfiability guarantees consistency). In fact, for classical
first-order logic, the two versions of soundness and completeness
are equivalent.
D.2 Axiomatic Derivations
Axiomatic derivations are the oldest and simplest logical deriva-
tion systems. Their derivations are simply sequences of sentences.
A sequence of sentences counts as a correct derivation if every
sentence A in it satisfies one of the following conditions:
1. A is an axiom, or
2. A is an element of a given set Γ of sentences, or
3. A is justified by a rule of inference.
To be an axiom, A has to have the form of one of a number of fixed
sentence schemas. There are many sets of axiom schemas that
provide a satisfactory (sound and complete) derivation system for
first-order logic. Some are organized according to the connectives
they govern, e.g., the schemas
A → (B → A) B → (B ∨ C ) (B ∧ C ) → B
are common axioms that govern →, ∨ and ∧. Some axiom sys-
tems aim at a minimal number of axioms. Depending on the
connectives that are taken as primitives, it is even possible to
find axiom systems that consist of a single axiom.
A rule of inference is a conditional statement that gives a
sufficient condition for a sentence in a derivation to be justified.
Modus ponens is one very common such rule: it says that if A
and A → B are already justified, then B is justified. This means
that a line in a derivation containing the sentence B is justified,
provided that both A and A → B (for some sentence A) appear
in the derivation before B.
The ⊢ relation based on axiomatic derivations is defined as
follows: Γ ⊢ A iff there is a derivation with the sentence A as
its last formula (and Γ is taken as the set of sentences in that
derivation which are justified by (2) above). A is a theorem if A
has a derivation where Γ is empty, i.e., every sentence in the
derivation is justified either by (1) or (3). For instance, here is
a derivation that shows that ⊢ A → (B → (B ∨ A)):
1. B → (B ∨ A)
2. (B → (B ∨ A)) → (A → (B → (B ∨ A)))
3. A → (B → (B ∨ A))
The sentence on line 1 is of the form of the axiom A → (A ∨ B)
(with the roles of A and B reversed). The sentence on line 2 is of
the form of the axiom A → (B →A). Thus, both lines are justified.
Line 3 is justified by modus ponens: if we abbreviate it as D, then
line 2 has the form C → D, where C is B → (B ∨ A), i.e., line 1.
A set Γ is inconsistent if Γ ⊢ ⊥. A complete axiom system
will also prove ⊥ → A for any A, and so if Γ is inconsistent,
then Γ ⊢ A for any A.
Systems of axiomatic derivations for logic were first given by
Gottlob Frege in his 1879 Begriffsschrift, which for this reason is
often considered the first work of modern logic. They were per-
fected in Alfred North Whitehead and Bertrand Russell’s Prin-
cipia Mathematica and by David Hilbert and his students in the
1920s. They are thus often called “Frege systems” or “Hilbert
systems.” They are very versatile in that it is often easy to find
an axiomatic system for a logic. Because derivations have a very
simple structure and only one or two inference rules, it is also rel-
atively easy to prove things about them. However, they are very
hard to use in practice, i.e., it is difficult to find and write proofs.
D.3 Rules and Derivations
Axiomatic derivations are perhaps the simplest proof system for
logic. A derivation is just a sequence of formulas. To count as
a derivation, every formula in the sequence must either be an
instance of an axiom, or must follow from one or more formulas
that precede it in the sequence by a rule of inference. A derivation
derives its last formula.
Definition D.1 (Derivability). If Γ is a set of formulas of L then
a derivation from Γ is a finite sequence A1 , . . . , An of formulas
where for each i ≤ n one of the following holds:
1. Ai ∈ Γ; or
2. Ai is an axiom; or
3. Ai follows from some A j (and Ak ) with j < i (and k < i )
by a rule of inference.
What counts as a correct derivation depends on which infer-
ence rules we allow (and of course what we take to be axioms).
And an inference rule is an if-then statement that tells us that,
under certain conditions, a step Ai in a derivation is a correct inference step.
Definition D.2 (Rule of inference). A rule of inference gives a
sufficient condition for what counts as a correct inference step in
a derivation from Γ.
For instance, since any one-element sequence A with A ∈ Γ
trivially counts as a derivation, the following might be a very
simple rule of inference:
If A ∈ Γ, then A is always a correct inference step in
any derivation from Γ.
Similarly, if A is one of the axioms, then A by itself is a derivation,
and so this is also a rule of inference:
If A is an axiom, then A is a correct inference step.
It gets more interesting if the rule of inference appeals to formulas
that appear before the step considered. The following rule is
called modus ponens:
If B → A and B occur higher up in the derivation,
then A is a correct inference step.
If this is the only rule of inference, then our definition of deriva-
tion above amounts to this: A1 , . . . , An is a derivation iff for each
i ≤ n one of the following holds:
1. Ai ∈ Γ; or
2. Ai is an axiom; or
3. for some j < i , A j is B → Ai , and for some k < i , Ak is B.
The last clause says that Ai follows from A j (B → Ai ) and Ak (B)
by modus ponens. If we can go from 1 to n, and each time we
find a formula Ai that is either in Γ, an axiom, or a correct
inference step according to a rule of inference, then the entire
sequence counts as a correct derivation.
Definition D.3 (Derivability). A formula A is derivable from Γ,
written Γ ⊢ A, if there is a derivation from Γ ending in A.
Definition D.4 (Theorems). A formula A is a theorem if there
is a derivation of A from the empty set. We write ⊢ A if A is a
theorem and ⊬ A if it is not.
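Definition D.1 is itself mechanical, and it is easy to make this concrete. The following sketch (hypothetical helper, reusing the tuple encoding of formulas from appendix C) checks a sequence with modus ponens as the only rule; axiom schemas are approximated by passing a set of explicit axiom instances:

    def check_derivation(lines, hyps, axioms):
        """Check each formula is in hyps (Γ), an axiom, or follows from
        earlier lines B and ("imp", B, A) by modus ponens."""
        for i, A in enumerate(lines):
            by_mp = any(("imp", B, A) in lines[:i] for B in lines[:i])
            if not (A in hyps or A in axioms or by_mp):
                return False
        return True

    # A one-step modus ponens derivation: B follows from A and A → B.
    A, B = ("var", 0), ("var", 1)
    print(check_derivation([A, ("imp", A, B), B],
                           hyps={A, ("imp", A, B)}, axioms=set()))  # True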
D.4 Axioms and Rules for the Propositional
Connectives
Definition D.5 (Axioms). The set Ax0 of axioms for the propo-
sitional connectives comprises all formulas of the following forms:
(A ∧ B) → A (D.1)
(A ∧ B) → B (D.2)
A → (B → (A ∧ B)) (D.3)
A → (A ∨ B) (D.4)
A → (B ∨ A) (D.5)
(A → C ) → ((B → C ) → ((A ∨ B) → C )) (D.6)
A → (B → A) (D.7)
(A → (B → C )) → ((A → B) → (A → C )) (D.8)
(A → B) → ((A → ¬B) → ¬A) (D.9)
¬A → (A → B) (D.10)
⊤ (D.11)
⊥→A (D.12)
(A → ⊥) → ¬A (D.13)
¬¬A → A (D.14)
Definition D.6 (Modus ponens). If B and B → A already occur
in a derivation, then A is a correct inference step.
We’ll abbreviate the rule modus ponens as “mp.”
D.5 Examples of Derivations
Example D.7. Suppose we want to prove (¬D ∨ E) → (D → E).
Clearly, this is not an instance of any of our axioms, so we have
to use the mp rule to derive it; mp, given A and A → B, allows
us to justify B. One strategy would be to use
eq. (D.6) with A being ¬D, B being E, and C being D → E, i.e.,
the instance
(¬D → (D → E)) → ((E → (D → E)) → ((¬D ∨ E) → (D → E))).
Why? Two applications of mp yield the last part, which is what
we want. And we easily see that ¬D → (D → E) is an instance of
eq. (D.10), and E → (D → E) is an instance of eq. (D.7). So our
derivation is:
1. ¬D → (D → E) eq. (D.10)
2. (¬D → (D → E)) →
((E → (D → E)) → ((¬D ∨ E) → (D → E))) eq. (D.6)
3. (E → (D → E)) → ((¬D ∨ E) → (D → E)) 1, 2, mp
4. E → (D → E) eq. (D.7)
5. (¬D ∨ E) → (D → E) 3, 4, mp
Example D.8. Let’s try to find a derivation of D →D. It is not an
instance of an axiom, so we have to use mp to derive it. eq. (D.7)
is an axiom of the form A → B to which we could apply mp. To
be useful, of course, the B which mp would justify as a correct
step in this case would have to be D → D, since this is what we
want to derive. That means A would also have to be D, i.e., we
might look at this instance of eq. (D.7):
D → (D → D)
In order to apply mp, we would also need to justify the corre-
sponding second premise, namely A. But in our case, that would
be D, and we won’t be able to derive D by itself. So we need a
different strategy.
The other axiom involving just → is eq. (D.8), i.e.,
(A → (B → C )) → ((A → B) → (A → C ))
We could get to the last nested conditional by applying mp twice.
Again, that would mean that we want an instance of eq. (D.8)
where A → C is D → D, the formula we are aiming for. Then
of course, A and C are both D. How should we pick B so that
both A → (B → C ) and A → B, i.e., in our case D → (B → D) and
D → B, are also derivable? Well, the first of these is already an
instance of eq. (D.7), whatever we decide B to be. And D → B
would be another instance of eq. (D.7) if B were (D → D). So,
our derivation is:
1. D → ((D → D) → D) eq. (D.7)
2. (D → ((D → D) → D)) →
((D → (D → D)) → (D → D)) eq. (D.8)
3. (D → (D → D)) → (D → D) 1, 2, mp
4. D → (D → D) eq. (D.7)
5. D →D 3, 4, mp
Example D.9. Sometimes we want to show that there is a deriva-
tion of some formula from some other formulas Γ. For instance,
let’s show that we can derive A → C from Γ = {A → B, B → C }.
1. A → B Hyp
2. B → C Hyp
3. (B → C ) → (A → (B → C )) eq. (D.7)
4. A → (B → C ) 2, 3, mp
5. (A → (B → C )) →
((A → B) → (A → C )) eq. (D.8)
6. (A → B) → (A → C ) 4, 5, mp
7. A → C 1, 6, mp
The lines labelled “Hyp” (for “hypothesis”) indicate that the for-
mula on that line is an element of Γ.
Proposition D.10. If Γ ⊢ A → B and Γ ⊢ B → C , then Γ ⊢ A → C .
Proof. Suppose Γ ⊢ A → B and Γ ⊢ B → C . Then there is a deriva-
tion of A → B from Γ, and a derivation of B → C from Γ as well.
Combine these into a single derivation by concatenating them.
Now add lines 3–7 of the derivation in the preceding example.
This is a derivation of A → C —which is the last line of the new
derivation—from Γ. Note that the justifications of lines 4 and 7
remain valid if the reference to line number 2 is replaced by ref-
erence to the last line of the derivation of B → C , and the reference
to line number 1 by reference to the last line of the derivation
of A → B.
D.6 Proof-Theoretic Notions
Just as we’ve defined a number of important semantic notions
(tautology, entailment, satisfiability), we now define correspond-
ing proof-theoretic notions. These are not defined by appeal to satis-
faction of sentences in structures, but by appeal to the derivability
or non-derivability of certain formulas. It was an important dis-
covery that these notions coincide. That they do is the content
of the soundness and completeness theorems.
Definition D.11 (Derivability). A formula A is derivable from Γ,
written Γ ⊢ A, if there is a derivation from Γ ending in A.
Definition D.12 (Theorems). A formula A is a theorem if there
is a derivation of A from the empty set. We write ⊢ A if A is a
theorem and ⊬ A if it is not.
Definition D.13 (Consistency). A set Γ of formulas is consistent
if and only if Γ ⊬ ⊥; it is inconsistent otherwise.
Proposition D.14 (Reflexivity). If A ∈ Γ, then Γ ⊢ A.
Proof. The formula A by itself is a derivation of A from Γ.
Proposition D.15 (Monotony). If Γ ⊆ ∆ and Γ ⊢ A, then ∆ ⊢ A.
Proof. Any derivation of A from Γ is also a derivation of A from ∆.
Proposition D.16 (Transitivity). If Γ ⊢ A and {A} ∪ ∆ ⊢ B, then
Γ ∪ ∆ ⊢ B.
Proof. Suppose {A} ∪ ∆ ⊢ B. Then there is a derivation B1, . . . ,
Bl = B from {A} ∪ ∆. Some of the steps in that derivation will be
correct because of a rule which refers to a prior line Bi = A. By
hypothesis, there is a derivation of A from Γ, i.e., a derivation A1,
. . . , Ak = A where every Ai is an axiom, an element of Γ, or
correct by a rule of inference. Now consider the sequence
A1, . . . , Ak = A, B1, . . . , Bl = B.
This is a correct derivation of B from Γ ∪ ∆, since every Bi = A
is now justified by the same rule which justifies Ak = A.
Note that this means that in particular if Γ ⊢ A and A ⊢ B,
then Γ ⊢ B. It follows also that if A1, . . . , An ⊢ B and Γ ⊢ Ai for
each i , then Γ ⊢ B.
Proposition D.17. Γ is inconsistent iff Γ ⊢ A for every A.
Proof. Exercise.
Proposition D.18 (Compactness). 1. If Γ ⊢ A then there is a
finite subset Γ0 ⊆ Γ such that Γ0 ⊢ A.
2. If every finite subset of Γ is consistent, then Γ is consistent.
Proof. 1. If Γ ⊢ A, then there is a finite sequence of formulas
A1 , . . . , An so that A ≡ An and each Ai is either a logical
axiom, an element of Γ, or follows from previous formulas
by modus ponens. Take Γ0 to be those Ai which are in Γ.
Then the derivation is likewise a derivation from Γ0 , and
so Γ0 ⊢ A.
2. This is the contrapositive of (1) for the special case A ≡ ⊥.
D.7 The Deduction Theorem
As we’ve seen, giving derivations in an axiomatic system is cum-
bersome, and derivations may be hard to find. Rather than actu-
ally write out long lists of formulas, it is generally easier to argue
that such derivations exist, by making use of a few simple results.
We’ve already established three such results: Proposition D.14
says we can always assert that Γ ⊢ A when we know that A ∈ Γ.
Proposition D.15 says that if Γ ⊢ A then also Γ ∪ {B } ⊢ A. And
Proposition D.16 implies that if Γ ⊢ A and A ⊢ B, then Γ ⊢ B.
Here’s another simple result, a “meta”-version of modus ponens:
Proposition D.19. If Γ ⊢ A and Γ ⊢ A → B, then Γ ⊢ B.
Proof. We have that {A, A → B } ⊢ B:
1. A Hyp.
2. A → B Hyp.
3. B 1, 2, mp
By Proposition D.16, Γ ⊢ B.
The most important result we’ll use in this context is the de-
duction theorem:
Theorem D.20 (Deduction Theorem). Γ ∪ {A} ⊢ B if and only
if Γ ⊢ A → B.
Proof. The “if” direction is immediate. If Γ ⊢ A → B then also
Γ ∪ {A} ⊢ A → B by Proposition D.15. Also, Γ ∪ {A} ⊢ A by
Proposition D.14. So, by Proposition D.19, Γ ∪ {A} ⊢ B.
For the “only if” direction, we proceed by induction on the
length of the derivation of B from Γ ∪ {A}.
For the induction basis, we prove the claim for every deriva-
tion of length 1. A derivation of B from Γ ∪ {A} of length 1
consists of B by itself; and if it is correct, B is either an element
of Γ ∪ {A} or an axiom. If B ∈ Γ or B is an axiom, then Γ ⊢ B. We also
have that Γ ⊢ B → (A → B) by eq. (D.7), and Proposition D.19
gives Γ ⊢ A → B. If B ∈ {A}, then Γ ⊢ A → B because then the last
sentence A → B is the same as A → A, and we have derived that
in Example D.8.
For the inductive step, suppose a derivation of B from Γ ∪{A}
ends with a step B which is justified by modus ponens. (If it
is not justified by modus ponens, B ∈ Γ, B ≡ A, or B is an
axiom, and the same reasoning as in the induction basis applies.)
Then some previous steps in the derivation are C → B and C , for
some formula C , i.e., Γ ∪ {A} ⊢ C → B and Γ ∪ {A} ⊢ C , and
the respective derivations are shorter, so the inductive hypothesis
applies to them. We thus have both:
Γ ⊢ A → (C → B);
Γ ⊢ A → C.
But also
Γ ⊢ (A → (C → B)) → ((A → C ) → (A → B)),
by eq. (D.8), and two applications of Proposition D.19 give Γ ⊢
A → B, as required.
Notice how eq. (D.7) and eq. (D.8) were chosen precisely so
that the Deduction Theorem would hold.
The following are some useful facts about derivability, which
we leave as exercises.
Proposition D.21. 1. ⊢ (A → B) → ((B → C ) → (A → C ));
2. If Γ ∪ {¬A} ⊢ ¬B then Γ ∪ {B } ⊢ A (Contraposition);
3. {A, ¬A} ⊢ B (Ex Falso Quodlibet, Explosion);
4. {¬¬A} ⊢ A (Double Negation Elimination);
5. If Γ ⊢ ¬¬A then Γ ⊢ A.
D.8 Derivability and Consistency
We will now establish a number of properties of the derivability
relation. They are independently interesting, but each will play
a role in the proof of the completeness theorem.
Proposition D.22. If Γ ⊢ A and Γ ∪ {A} is inconsistent, then Γ is
inconsistent.
Proof. If Γ ∪ {A} is inconsistent, then Γ ∪ {A} ⊢ ⊥. By Proposi-
tion D.14, Γ ⊢ B for every B ∈ Γ. Since also Γ ⊢ A by hypothesis,
Γ ⊢ B for every B ∈ Γ ∪ {A}. By Proposition D.16, Γ ⊢ ⊥, i.e., Γ
is inconsistent.
Proposition D.23. Γ ⊢ A iff Γ ∪ {¬A} is inconsistent.
Proof. First suppose Γ ⊢ A. Then Γ ∪ {¬A} ⊢ A by Proposi-
tion D.15. Γ ∪ {¬A} ⊢ ¬A by Proposition D.14. We also have
⊢ ¬A → (A → ⊥) by eq. (D.10). So by two applications of Propo-
sition D.19, we have Γ ∪ {¬A} ⊢ ⊥.
Now assume Γ ∪ {¬A} is inconsistent, i.e., Γ ∪ {¬A} ⊢ ⊥. By
the deduction theorem, Γ ⊢ ¬A → ⊥. Γ ⊢ (¬A → ⊥) → ¬¬A by
eq. (D.13), so Γ ⊢ ¬¬A by Proposition D.19. Since Γ ⊢ ¬¬A → A
(eq. (D.14)), we have Γ ⊢ A by Proposition D.19 again.
Proposition D.24. If Γ ⊢ A and ¬A ∈ Γ, then Γ is inconsistent.
Proof. Γ ⊢ ¬A → (A → ⊥) by eq. (D.10). Since ¬A ∈ Γ, we have
Γ ⊢ ¬A by Proposition D.14, and Γ ⊢ A by hypothesis. So Γ ⊢ ⊥
by two applications of Proposition D.19.
Proposition D.25. If Γ ∪ {A} and Γ ∪ {¬A} are both inconsistent,
then Γ is inconsistent.
Proof. Exercise.
D.9 Derivability and the Propositional
Connectives
Proposition D.26. 1. Both A ∧ B ⊢ A and A ∧ B ⊢ B.
2. A, B ⊢ A ∧ B.
Proof. 1. From eq. (D.1) and eq. (D.2) by modus ponens.
2. From eq. (D.3) by two applications of modus ponens.
Proposition D.27. 1. {A ∨ B, ¬A, ¬B } is inconsistent.
2. Both A ⊢ A ∨ B and B ⊢ A ∨ B.
Proof. 1. From eq. (D.10) we get ⊢ ¬A → (A → ⊥) and ⊢ ¬B →
(B → ⊥). So by the deduction theorem, we have {¬A} ⊢
A → ⊥ and {¬B } ⊢ B → ⊥. From eq. (D.6) we get {¬A, ¬B } ⊢
(A∨B) → ⊥. By the deduction theorem, {A∨B, ¬A, ¬B } ⊢ ⊥.
2. From eq. (D.4) and eq. (D.5) by modus ponens.
Proposition D.28. 1. A, A → B ⊢ B.
2. Both ¬A ⊢ A → B and B ⊢ A → B.
Proof. 1. We can derive:
1. A Hyp
2. A → B Hyp
3. B 1, 2, mp
2. By eq. (D.10) and eq. (D.7) and the deduction theorem,
respectively.
D.10 Soundness
A derivation system, such as axiomatic deduction, is sound if
it cannot derive things that do not actually hold. Soundness is
thus a kind of guaranteed safety property for derivation systems.
Depending on which proof theoretic property is in question, we
would like to know for instance, that
1. every derivable A is valid;
2. if A is derivable from some others Γ, it is also a conse-
quence of them;
3. if a set of formulas Γ is inconsistent, it is unsatisfiable.
These are important properties of a derivation system. If any of
them do not hold, the derivation system is deficient—it would
derive too much. Consequently, establishing the soundness of
a derivation system is of the utmost importance.
Proposition D.29. If A is an axiom, then v ⊨ A for each valuation v.
Proof. Do truth tables for each axiom to verify that they are tau-
tologies.
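One can spot-check axiom schemas this way by brute-force truth tables; the following self-contained sketch treats each schema as a Boolean function (an illustration only):

    from itertools import product

    def tautology(f, nvars):
        """True iff the Boolean function f is true under all 2^nvars valuations."""
        return all(f(*bits) for bits in product([True, False], repeat=nvars))

    IMP = lambda a, b: (not a) or b  # the truth function of the material conditional

    # eq. (D.7): A → (B → A)
    print(tautology(lambda a, b: IMP(a, IMP(b, a)), 2))                    # True
    # eq. (D.8): (A → (B → C)) → ((A → B) → (A → C))
    print(tautology(lambda a, b, c: IMP(IMP(a, IMP(b, c)),
                                        IMP(IMP(a, b), IMP(a, c))), 3))    # True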
Theorem D.30 (Soundness). If Γ ⊢ A then Γ ⊨ A.
Proof. By induction on the length of the derivation of A from Γ.
If there are no steps justified by inferences, then all formulas in
the derivation are either instances of axioms or are in Γ. By the
previous proposition, all the axioms are tautologies, and hence if
A is an axiom then Γ ⊨ A. If A ∈ Γ, then trivially Γ ⊨ A.
If the last step of the derivation of A is justified by modus
ponens, then there are formulas B and B → A in the derivation,
and the induction hypothesis applies to the parts of the derivation
ending in those formulas (since they contain at least one fewer
step justified by an inference). So, by induction hypothesis, Γ ⊨
B and Γ ⊨ B → A. Then Γ ⊨ A by Proposition C.15(2).
Corollary D.31. If ⊢ A, then A is a tautology.
Corollary D.32. If Γ is satisfiable, then it is consistent.
Proof. We prove the contrapositive. Suppose that Γ is not con-
sistent. Then Γ ⊢ ⊥, i.e., there is a derivation of ⊥ from Γ. By
Theorem D.30, any valuation v that satisfies Γ must satisfy ⊥.
Since v ⊭ ⊥ for every valuation v, no v can satisfy Γ, i.e., Γ is
not satisfiable.
Problems
Problem D.1. Show that the following hold by exhibiting deriva-
tions from the axioms:
1. (A ∧ B) → (B ∧ A)
2. ((A ∧ B) → C ) → (A → (B → C ))
3. ¬(A ∨ B) → ¬A
Problem D.2. Prove Proposition D.17.
Problem D.3. Prove Proposition D.21.
Problem D.4. Prove that Γ ⊢ ¬A iff Γ ∪ {A} is inconsistent.
Problem D.5. Prove Proposition D.25.
APPENDIX E
Tableaux
E.1 Tableaux
While many derivation systems operate with arrangements of sen-
tences, tableaux operate with signed formulas. A signed formula is a pair consisting of a truth value sign (T or F) and a sentence, i.e., T A or F A.
A tableau consists of signed formulas arranged in a downward-
branching tree. It begins with a number of assumptions and con-
tinues with signed formulas which result from one of the signed
formulas above it by applying one of the rules of inference. Each
rule allows us to add one or more signed formulas to the end
of a branch, or two signed formulas side by side—in this case a
branch splits into two, with the two added signed formulas form-
ing the ends of the two branches.
A rule applied to a complex signed formula results in the
addition of signed formulas which are immediate sub-formulas.
They come in pairs, one rule for each of the two signs. For in-
stance, the ∧T rule applies to T A ∧ B, and allows the addition
of the two signed formulas T A and T B to the end of any branch containing T A ∧ B, and the ∧F rule allows a branch to be split by adding F A and F B side-by-side. A tableau is closed
if every one of its branches contains a matching pair of signed
formulas T A and F A.
The ⊢ relation based on tableaux is defined as follows: Γ ⊢ A iff there is some finite set Γ₀ = {B₁, . . . , Bₙ} ⊆ Γ such that there is a closed tableau for the assumptions
{F A, T B₁, . . . , T Bₙ}.
For instance, here is a closed tableau that shows that ⊢ (A ∧ B) → A:
1. F (A ∧ B) → A      Assumption
2. T A ∧ B      →F 1
3. F A      →F 1
4. T A      ∧T 2
5. T B      ∧T 2
⊗
A set Γ is inconsistent in the tableau calculus if there is a closed tableau for assumptions
{T B₁, . . . , T Bₙ}
for some B₁, . . . , Bₙ ∈ Γ.
The tableau method was invented in the 1950s independently by Evert Beth and Jaakko Hintikka, and simplified and popularized by Raymond Smullyan. It is very easy to use, since constructing a tableau is a very systematic procedure. Because of the systematic nature of tableaux, they also lend themselves to implementation by computer. However, tableaux are often hard to read and their connection to proofs is sometimes not easy to see. The approach is also quite general, and many different logics
have tableau systems. Tableaux also help us to find structures that
satisfy given (sets of) sentences: if the set is satisfiable, it won’t
have a closed tableau, i.e., any tableau will have an open branch.
The satisfying structure can be “read off” an open branch, pro-
vided all rules it is possible to apply have been applied on that
branch. There is also a very close connection to the sequent cal-
culus: essentially, a closed tableau is a condensed derivation in
the sequent calculus, written upside-down.
E.2 Rules and Tableaux
A tableau is a systematic survey of the possible ways a sentence
can be true or false in a structure. The building blocks of a tableau
are signed formulas: sentences plus a truth value “sign,” either
T or F. These signed formulas are arranged in a (downward
growing) tree.
Definition E.1. A signed formula is a pair consisting of a truth
value and a sentence, i.e., either:
T A or F A.
Intuitively, we might read T A as “A might be true” and F A
as “A might be false” (in some structure).
Each signed formula in the tree is either an assumption (these are listed at the very top of the tree), or it is obtained from
a signed formula above it by one of a number of rules of in-
ference. There are two rules for each possible main operator of
the preceding formula, one for the case when the sign is T, and
one for the case where the sign is F. Some rules allow the tree to
branch, and some only add signed formulas to the branch. A rule
may be (and often must be) applied not to the immediately pre-
ceding signed formula, but to any signed formula in the branch
from the root to the place the rule is applied.
A branch is closed when it contains both T A and F A. A closed
tableau is one where every branch is closed. Under the intuitive
interpretation, any branch describes a joint possibility, but T A
and F A are not jointly possible. In other words, if a branch is
closed, the possibility it describes has been ruled out. In partic-
ular, that means that a closed tableau rules out all possibilities
of simultaneously making every assumption of the form T A true
and every assumption of the form F A false.
A closed tableau for A is a closed tableau with root F A. If
such a closed tableau exists, all possibilities for A being false have
been ruled out; i.e., A must be true in every structure.
E.3 Propositional Rules
Rules for ¬
T ¬A F ¬A
¬T ¬F
FA TA
Rules for ∧
TA ∧B
∧T FA∧B
TA ∧F
FA | FB
TB
Rules for ∨
FA∨B
TA ∨B ∨F
∨T FA
TA | TB
FB
Rules for →
FA→B
TA →B →F
→T TA
FA | TB
FB
The Cut Rule
Cut
TA | FA
The Cut rule is not applied “to” a previous signed formula;
rather, it allows every branch in a tableau to be split in two, one
branch containing T A, the other F A. It is not necessary—any
set of signed formulas with a closed tableau has one not using
Cut—but it allows us to combine tableaux in a convenient way.
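Since constructing a tableau is so systematic, the rules are easy to implement. Here is a minimal sketch in Python of a closure test based on the rules above (the tuple encoding of formulas and all function names are our own, not part of the text); Cut is omitted, since, as just noted, it is never needed:

```python
# Signed formulas are pairs ("T", f) or ("F", f); formulas are nested
# tuples such as ("imp", ("and", ("A",), ("B",)), ("A",)).

def expand(signed):
    """The rule applying to a signed formula: ("stack", [...]) extends
    a branch, ("split", [...]) splits it; None for atomic formulas."""
    sign, f = signed
    op, T = f[0], sign == "T"
    if op == "not":   # ¬T: T ¬A yields F A; ¬F: F ¬A yields T A
        return ("stack", [("F" if T else "T", f[1])])
    if op == "and":   # ∧T stacks T A, T B; ∧F splits into F A | F B
        return ("stack", [("T", f[1]), ("T", f[2])]) if T else \
               ("split", [("F", f[1]), ("F", f[2])])
    if op == "or":    # ∨T splits into T A | T B; ∨F stacks F A, F B
        return ("split", [("T", f[1]), ("T", f[2])]) if T else \
               ("stack", [("F", f[1]), ("F", f[2])])
    if op == "imp":   # →T splits into F A | T B; →F stacks T A, F B
        return ("split", [("F", f[1]), ("T", f[2])]) if T else \
               ("stack", [("T", f[1]), ("F", f[2])])
    return None

def closes(branch):
    """True iff the signed formulas have a closed tableau. Each complex
    formula is expanded once; a fully expanded branch is closed iff it
    contains some matching pair T A and F A."""
    for i, signed in enumerate(branch):
        rule = expand(signed)
        if rule is None:
            continue
        rest = branch[:i] + branch[i + 1:]
        kind, results = rule
        if kind == "stack":
            return closes(rest + results)
        return all(closes(rest + [r]) for r in results)
    return any(("T", f) in branch and ("F", f) in branch
               for _, f in branch)

# The closed tableau for ⊢ (A ∧ B) → A from above:
A, B = ("A",), ("B",)
print(closes([("F", ("imp", ("and", A, B), A))]))  # True
```

On this encoding, Γ ⊢ A amounts to calling closes on F A together with T B for finitely many B ∈ Γ.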
E.4 Tableaux
We’ve said what an assumption is, and we’ve given the rules of
inference. Tableaux are inductively generated from these: each
tableau either is a single branch consisting of one or more as-
sumptions, or it results from a tableau by applying one of the
rules of inference on a branch.
Definition E.2 (Tableau). A tableau for assumptions S₁A₁, . . . , SₙAₙ (where each Sᵢ is either T or F) is a tree of signed formulas satisfying the following conditions:
1. The n topmost signed formulas of the tree are SᵢAᵢ, one below the other.
2. Every signed formula in the tree that is not one of the as-
sumptions results from a correct application of an inference
rule to a signed formula in the branch above it.
A branch of a tableau is closed iff it contains both T A and F A,
and open otherwise. A tableau in which every branch is closed
is a closed tableau (for its set of assumptions). If a tableau is not
closed, i.e., if it contains at least one open branch, it is open.
Example E.3. Every set of assumptions on its own is a tableau,
but it will generally not be closed. (Obviously, it is closed only
if the assumptions already contain a pair of signed formulas T A
and F A.)
From a tableau (open or closed) we can obtain a new, larger
one by applying one of the rules of inference to a signed formula A
in it. The rule will append one or more signed formulas to the
end of any branch containing the occurrence of A to which we
apply the rule.
For instance, consider the assumption T A ∧ ¬A. Here is the
(open) tableau consisting of just that assumption:
1. T A ∧ ¬A Assumption
We obtain a new tableau from it by applying the ∧T rule to the
assumption. That rule allows us to add two new lines to the
tableau, T A and T ¬A:
1. T A ∧ ¬A      Assumption
2. T A      ∧T 1
3. T ¬A      ∧T 1
When we write down tableaux, we record the rules we've applied on the right (e.g., ∧T 1 means that the signed formula on that line is the result of applying the ∧T rule to the signed formula on line 1). This new tableau now contains additional signed formulas, but only to one of them (T ¬A) can we apply a rule (in this case, the ¬T rule). This results in the closed tableau
1. T A ∧ ¬A      Assumption
2. T A      ∧T 1
3. T ¬A      ∧T 1
4. F A      ¬T 3
⊗
E.5 Examples of Tableaux
Example E.4. Let’s find a closed tableau for the sentence (A ∧
B) → A.
We begin by writing the corresponding assumption at the top
of the tableau.
1. F (A ∧ B) → A Assumption
There is only one assumption, so only one signed formula to
which we can apply a rule. (For every signed formula, there is
always at most one rule that can be applied: it’s the rule for the
corresponding sign and main operator of the sentence.) In this case, this means we must apply →F.
1. F (A ∧ B) → A ✓      Assumption
2. T A ∧ B      →F 1
3. F A      →F 1
To keep track of which signed formulas we have applied their cor-
responding rules to, we write a checkmark next to the sentence. However, we only write a checkmark if the rule has been applied to
all open branches. Once a signed formula has had the corre-
sponding rule applied in every open branch, we will not have to
return to it and apply the rule again. In this case, there is only
one branch, so the rule only has to be applied once. (Note that
checkmarks are only a convenience for constructing tableaux and
are not officially part of the syntax of tableaux.)
There is one new signed formula to which we can apply a rule: the T A ∧ B on line 2. Applying the ∧T rule results in:
1. F (A ∧ B) → A ✓      Assumption
2. T A ∧ B ✓      →F 1
3. F A      →F 1
4. T A      ∧T 2
5. T B      ∧T 2
⊗
Since the branch now contains both T A (on line 4) and F A (on
line 3), the branch is closed. Since it is the only branch, the
tableau is closed. We have found a closed tableau for (A ∧B)→A.
Example E.5. Now let’s find a closed tableau for (¬A ∨ B) →
(A → B).
We begin with the corresponding assumption:
1. F (¬A ∨ B) → (A → B) Assumption
The one signed formula in this tableau has main operator → and
sign F, so we apply the →F rule to it to obtain:
1. F (¬A ∨ B) → (A → B) ✓      Assumption
2. T ¬A ∨ B      →F 1
3. F (A → B)      →F 1
We now have a choice as to whether to apply ∨T to line 2 or
→F to line 3. It actually doesn’t matter which order we pick, as
long as each signed formula has its corresponding rule applied
in every branch. So let’s pick the first one. The ∨T rule allows
the tableau to branch, and the two conclusions of the rule will be
the new signed formulas added to the two new branches. This
results in:
1. F (¬A ∨ B) → (A → B) ✓      Assumption
2. T ¬A ∨ B ✓      →F 1
3. F (A → B)      →F 1
4. T ¬A   T B      ∨T 2
We have not applied the →F rule to line 3 yet: let’s do that now.
To save time, we apply it to both branches. Recall that we write
a checkmark next to a signed formula only if we have applied the
corresponding rule in every open branch. So it’s a good idea to
apply a rule at the end of every branch that contains the signed
formula the rule applies to. That way we won’t have to return to
that signed formula lower down in the various branches.
1. F (¬A ∨ B) → (A → B) ✓      Assumption
2. T ¬A ∨ B ✓      →F 1
3. F (A → B) ✓      →F 1
4. T ¬A   T B      ∨T 2
5. T A   T A      →F 3
6. F B   F B      →F 3
      ⊗
The right branch is now closed. On the left branch, we can still
apply the ¬T rule to line 4. This results in F A and closes the left
branch:
1. F (¬A ∨ B) → (A → B) ✓      Assumption
2. T ¬A ∨ B ✓      →F 1
3. F (A → B) ✓      →F 1
4. T ¬A   T B      ∨T 2
5. T A   T A      →F 3
6. F B   F B      →F 3
7. F A   ⊗      ¬T 4
⊗
Example E.6. We can give tableaux for any number of signed
formulas as assumptions. Often it is also necessary to apply more
than one rule that allows branching; and in general a tableau can
have any number of branches. For instance, consider a tableau
for {T A ∨ (B ∧ C ), F (A ∨ B) ∧ (A ∨ C )}. We start by applying the ∨T rule to the first assumption:
1. T A ∨ (B ∧ C ) ✓      Assumption
2. F (A ∨ B) ∧ (A ∨ C )      Assumption
3. T A   T B ∧ C      ∨T 1
Now we can apply the ∧F rule to line 2. We do this on both
branches simultaneously, and can therefore check off line 2:
1. T A ∨ (B ∧ C ) ✓      Assumption
2. F (A ∨ B) ∧ (A ∨ C ) ✓      Assumption
3. T A   T B ∧ C      ∨T 1
4. F A ∨ B   F A ∨ C   F A ∨ B   F A ∨ C      ∧F 2
Now we can apply ∨F in all the branches containing F A ∨ B:
1. T A ∨ (B ∧ C ) ✓      Assumption
2. F (A ∨ B) ∧ (A ∨ C ) ✓      Assumption
3. T A   T B ∧ C      ∨T 1
4. F A ∨ B ✓   F A ∨ C   F A ∨ B ✓   F A ∨ C      ∧F 2
5. F A   F A      ∨F 4
6. F B   F B      ∨F 4
⊗
The leftmost branch is now closed. Let's now apply ∨F to F A ∨ C :
1. T A ∨ (B ∧ C ) ✓      Assumption
2. F (A ∨ B) ∧ (A ∨ C ) ✓      Assumption
3. T A   T B ∧ C      ∨T 1
4. F A ∨ B ✓   F A ∨ C ✓   F A ∨ B ✓   F A ∨ C ✓      ∧F 2
5. F A   F A      ∨F 4
6. F B   F B      ∨F 4
7. ⊗   F A   F A      ∨F 4
8. F C   F C      ∨F 4
⊗
Note that we moved the result of applying ∨F a second time below
for clarity. In this instance it would not have been needed, since
the justifications would have been the same.
Two branches remain open, and T B ∧ C on line 3 remains
unchecked. We apply ∧T to it to obtain a closed tableau:
1. T A ∨ (B ∧ C ) ✓      Assumption
2. F (A ∨ B) ∧ (A ∨ C ) ✓      Assumption
3. T A   T B ∧ C ✓      ∨T 1
4. F A ∨ B ✓   F A ∨ C ✓   F A ∨ B ✓   F A ∨ C ✓      ∧F 2
5. F A   F A   F A   F A      ∨F 4
6. F B   F C   F B   F C      ∨F 4
7. ⊗   ⊗   T B   T B      ∧T 3
8. T C   T C      ∧T 3
⊗   ⊗
For comparison, here’s a closed tableau for the same set of
assumptions in which the rules are applied in a different order:
1. T A ∨ (B ∧ C ) ✓      Assumption
2. F (A ∨ B) ∧ (A ∨ C ) ✓      Assumption
3. F A ∨ B ✓   F A ∨ C ✓      ∧F 2
4. F A   F A      ∨F 3
5. F B   F C      ∨F 3
6. T A   T B ∧ C ✓   T A   T B ∧ C ✓      ∨T 1
7. ⊗   T B   ⊗   T B      ∧T 6
8. T C   T C      ∧T 6
⊗   ⊗
E.6 Proof-Theoretic Notions
Just as we’ve defined a number of important semantic notions
(validity, entailment, satisfiability), we now define corresponding
proof-theoretic notions. These are not defined by appeal to satisfac-
tion of sentences in structures, but by appeal to the existence of
certain closed tableaux. It was an important discovery that these
notions coincide. That they do is the content of the soundness and
completeness theorems.
Definition E.7 (Theorems). A sentence A is a theorem if there is
a closed tableau for F A. We write ⊢ A if A is a theorem and ⊬ A if it is not.
Definition E.8 (Derivability). A sentence A is derivable from a set of sentences Γ, Γ ⊢ A, iff there is a finite set {B₁, . . . , Bₙ} ⊆ Γ and a closed tableau for the set
{F A, T B₁, . . . , T Bₙ}.
If A is not derivable from Γ we write Γ ⊬ A.
Definition E.9 (Consistency). A set of sentences Γ is inconsistent iff there is a finite set {B₁, . . . , Bₙ} ⊆ Γ and a closed tableau for the set
{T B₁, . . . , T Bₙ}.
If Γ is not inconsistent, we say it is consistent.
Proposition E.10 (Reflexivity). If A ∈ Γ, then Γ ⊢ A.
Proof. If A ∈ Γ, {A} is a finite subset of Γ and the tableau
1. F A      Assumption
2. T A      Assumption
⊗
is closed.
Proposition E.11 (Monotony). If Γ ⊆ ∆ and Γ ⊢ A, then ∆ ⊢ A.
Proof. Any finite subset of Γ is also a finite subset of ∆.
Proposition E.12 (Transitivity). If Γ ⊢ A and {A} ∪ ∆ ⊢ B, then Γ ∪ ∆ ⊢ B.
Proof. If {A} ∪ ∆ ⊢ B, then there is a finite subset ∆₀ = {C₁, . . . , Cₙ} ⊆ ∆ such that
{F B, T A, T C₁, . . . , T Cₙ}
has a closed tableau. If Γ ⊢ A then there are D₁, . . . , Dₘ ∈ Γ such that
{F A, T D₁, . . . , T Dₘ}
has a closed tableau.
Now consider the tableau with assumptions
F B, T C₁, . . . , T Cₙ, T D₁, . . . , T Dₘ.
Apply the Cut rule on A. This generates two branches, one has T A in it, the other F A. Thus, on the one branch, all of
{F B, T A, T C₁, . . . , T Cₙ}
are available. Since there is a closed tableau for these assumptions, we can attach it to that branch; every branch through T A closes. On the other branch, all of
{F A, T D₁, . . . , T Dₘ}
are available, so we can also complete the other side to obtain a closed tableau. This shows Γ ∪ ∆ ⊢ B.
Note that this means that in particular if Γ ⊢ A and A ⊢ B, then Γ ⊢ B. It follows also that if A₁, . . . , Aₙ ⊢ B and Γ ⊢ Aᵢ for each i, then Γ ⊢ B.
Proposition E.13. Γ is inconsistent iff Γ ⊢ A for every sentence A.
Proof. Exercise.
Proposition E.14 (Compactness). 1. If Γ ⊢ A then there is a finite subset Γ₀ ⊆ Γ such that Γ₀ ⊢ A.
2. If every finite subset of Γ is consistent, then Γ is consistent.
Proof. 1. If Γ ⊢ A, then there is a finite subset Γ₀ = {B₁, . . . , Bₙ} ⊆ Γ and a closed tableau for
{F A, T B₁, . . . , T Bₙ}.
This tableau also shows Γ₀ ⊢ A.
2. If Γ is inconsistent, then for some finite subset Γ₀ = {B₁, . . . , Bₙ} ⊆ Γ there is a closed tableau for
{T B₁, . . . , T Bₙ}.
This closed tableau shows that Γ₀ is inconsistent.
E.7 Derivability and Consistency
We will now establish a number of properties of the derivability
relation. They are independently interesting, but each will play
a role in the proof of the completeness theorem.
Proposition E.15. If Γ ⊢ A and Γ ∪ {A} is inconsistent, then Γ is inconsistent.
Proof. There are finite Γ₀ = {B₁, . . . , Bₙ} ⊆ Γ and Γ₁ = {C₁, . . . , Cₘ} ⊆ Γ such that
{F A, T B₁, . . . , T Bₙ}
{T A, T C₁, . . . , T Cₘ}
have closed tableaux. Using the Cut rule on A we can combine these into a single closed tableau that shows Γ₀ ∪ Γ₁ is inconsistent. Since Γ₀ ⊆ Γ and Γ₁ ⊆ Γ, Γ₀ ∪ Γ₁ ⊆ Γ, hence Γ is inconsistent.
Proposition E.16. Γ ⊢ A iff Γ ∪ {¬A} is inconsistent.
Proof. First suppose Γ ⊢ A, i.e., there is a closed tableau for
{F A, T B₁, . . . , T Bₙ}.
Using the ¬T rule, this can be turned into a closed tableau for
{T ¬A, T B₁, . . . , T Bₙ}.
On the other hand, if there is a closed tableau for the latter, we can turn it into a closed tableau of the former by removing every formula that results from ¬T applied to the first assumption T ¬A as well as that assumption, and adding the assumption F A. For if a branch was closed before because it contained the conclusion of ¬T applied to T ¬A, i.e., F A, the corresponding branch in the new tableau is also closed. If a branch in the old tableau was closed because it contained the assumption T ¬A as well as F ¬A, we can turn it into a closed branch by applying ¬F to F ¬A to obtain T A. This closes the branch since we added F A as an assumption.
Proposition E.17. If Γ ⊢ A and ¬A ∈ Γ, then Γ is inconsistent.
Proof. Suppose Γ ⊢ A and ¬A ∈ Γ. Then there are B₁, . . . , Bₙ ∈ Γ such that
{F A, T B₁, . . . , T Bₙ}
has a closed tableau. Replace the assumption F A by T ¬A, and insert F A, the conclusion of ¬T applied to T ¬A, after the assumptions. Any sentence in the tableau justified by appeal to line 1 in the old tableau is now justified by appeal to line n + 2. So if the old tableau was closed, the new one is. It shows that Γ is inconsistent, since all assumptions are in Γ.
Proposition E.18. If Γ ∪ {A} and Γ ∪ {¬A} are both inconsistent, then Γ is inconsistent.
Proof. If there are B₁, . . . , Bₙ ∈ Γ and C₁, . . . , Cₘ ∈ Γ such that
{T A, T B₁, . . . , T Bₙ}
{T ¬A, T C₁, . . . , T Cₘ}
both have closed tableaux, we can construct a tableau that shows that Γ is inconsistent by using as assumptions T B₁, . . . , T Bₙ together with T C₁, . . . , T Cₘ, followed by an application of the Cut rule, yielding two branches, one starting with T A, the other with F A. Add on the part below the assumptions of the first tableau on the left side. Here, every rule application is still correct, and every branch closes. On the right side, add the part below the assumptions of the second tableau, with the results of any applications of ¬T to T ¬A removed.
For if a branch was closed before because it contained the conclusion of ¬T applied to T ¬A, i.e., F A, as well as T A, the corresponding branch in the new tableau is also closed: it still contains T A, and F A is now supplied by the Cut. If a branch in the old tableau was closed because it contained the assumption T ¬A as well as F ¬A, we can turn it into a closed branch by applying ¬F to F ¬A to obtain T A.
E.8 Derivability and the Propositional Connectives
Proposition E.19. 1. Both A ∧ B ⊢ A and A ∧ B ⊢ B.
2. A, B ⊢ A ∧ B.
Proof. 1. Both {F A, T A ∧ B} and {F B, T A ∧ B} have closed tableaux:
1. F A      Assumption
2. T A ∧ B      Assumption
3. T A      ∧T 2
4. T B      ∧T 2
⊗

1. F B      Assumption
2. T A ∧ B      Assumption
3. T A      ∧T 2
4. T B      ∧T 2
⊗
2. Here is a closed tableau for {T A, T B, F A ∧ B}:
1. F A ∧ B      Assumption
2. T A      Assumption
3. T B      Assumption
4. F A   F B      ∧F 1
⊗   ⊗
Proposition E.20. 1. A ∨ B, ¬A, ¬B is inconsistent.
2. Both A ⊢ A ∨ B and B ⊢ A ∨ B.
Proof. 1. We give a closed tableau of {T A ∨ B, T ¬A, T ¬B}:
1. T A ∨ B      Assumption
2. T ¬A      Assumption
3. T ¬B      Assumption
4. F A      ¬T 2
5. F B      ¬T 3
6. T A   T B      ∨T 1
⊗   ⊗
2. Both {F A ∨ B, T A} and {F A ∨ B, T B} have closed tableaux:
1. F A ∨ B      Assumption
2. T A      Assumption
3. F A      ∨F 1
4. F B      ∨F 1
⊗

1. F A ∨ B      Assumption
2. T B      Assumption
3. F A      ∨F 1
4. F B      ∨F 1
⊗
Proposition E.21. 1. A, A → B ⊢ B.
2. Both ¬A ⊢ A → B and B ⊢ A → B.
Proof. 1. {F B, T A → B, T A} has a closed tableau:
1. F B      Assumption
2. T A → B      Assumption
3. T A      Assumption
4. F A   T B      →T 2
⊗   ⊗
2. Both {F A → B, T ¬A} and {F A → B, T B} have closed tableaux:
1. F A → B      Assumption
2. T ¬A      Assumption
3. T A      →F 1
4. F B      →F 1
5. F A      ¬T 2
⊗

1. F A → B      Assumption
2. T B      Assumption
3. T A      →F 1
4. F B      →F 1
⊗
E.9 Soundness
A derivation system, such as tableaux, is sound if it cannot derive
things that do not actually hold. Soundness is thus a kind of
guaranteed safety property for derivation systems. Depending
on which proof-theoretic property is in question, we would like to know, for instance, that
1. every derivable A is a tautology;
2. if a sentence is derivable from some others, it is also a
consequence of them;
3. if a set of sentences is inconsistent, it is unsatisfiable.
These are important properties of a derivation system. If any of
them do not hold, the derivation system is deficient—it would
derive too much. Consequently, establishing the soundness of a
derivation system is of the utmost importance.
Because all these proof-theoretic properties are defined via
closed tableaux of some kind or other, proving (1)–(3) above re-
quires proving something about the semantic properties of closed
tableaux. We will first define what it means for a signed formula
to be satisfied in a structure, and then show that if a tableau
is closed, no structure satisfies all its assumptions. (1)–(3) then
follow as corollaries from this result.
Definition E.22. A valuation v satisfies a signed formula T A iff v ⊨ A, and it satisfies F A iff v ⊭ A. v satisfies a set of signed formulas Γ iff it satisfies every S A ∈ Γ. Γ is satisfiable if there is a valuation that satisfies it, and unsatisfiable otherwise.
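In code, Definition E.22 is a thin layer over the usual truth-value computation. A minimal sketch, assuming the same hypothetical tuple encoding used in the earlier sketches:

```python
def value(v, f):
    """Truth value of formula f under valuation v (a dict of booleans)."""
    op = f[0]
    if op == "not":
        return not value(v, f[1])
    if op == "and":
        return value(v, f[1]) and value(v, f[2])
    if op == "or":
        return value(v, f[1]) or value(v, f[2])
    if op == "imp":
        return not value(v, f[1]) or value(v, f[2])
    return v[op]  # propositional variable

def satisfies(v, signed):
    """v satisfies T A iff v makes A true, and F A iff v makes A false."""
    sign, f = signed
    return value(v, f) if sign == "T" else not value(v, f)

# v satisfies a set of signed formulas iff it satisfies every member:
v = {"A": True, "B": False}
print(all(satisfies(v, s) for s in [("T", ("A",)), ("F", ("B",))]))  # True
```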
Theorem E.23 (Soundness). If Γ has a closed tableau, Γ is unsat-
isfiable.
Proof. Let’s call a branch of a tableau satisfiable iff the set of
signed formulas on it is satisfiable, and let’s call a tableau satisfi-
able if it contains at least one satisfiable branch.
We show the following: Extending a satisfiable tableau by one
of the rules of inference always results in a satisfiable tableau.
This will prove the theorem: any closed tableau results by apply-
ing rules of inference to the tableau consisting only of assump-
tions from Γ. So if Γ were satisfiable, any tableau for it would be
satisfiable. A closed tableau, however, is clearly not satisfiable:
every branch contains both T A and F A, and no valuation can both satisfy and not satisfy A.
Suppose we have a satisfiable tableau, i.e., a tableau with at
least one satisfiable branch. Applying a rule of inference either
adds signed formulas to a branch, or splits a branch in two. If
the tableau has a satisfiable branch which is not extended by the
rule application in question, it remains a satisfiable branch in
the extended tableau, so the extended tableau is satisfiable. So
we only have to consider the case where a rule is applied to a
satisfiable branch.
Let Γ be the set of signed formulas on that branch, and let
S A ∈ Γ be the signed formula to which the rule is applied. If the
rule does not result in a split branch, we have to show that the
extended branch, i.e., Γ together with the conclusions of the rule,
is still satisfiable. If the rule results in a split branch, we have to
show that at least one of the two resulting branches is satisfiable.
First, we consider the possible inferences that do not split the branch.
1. The branch is expanded by applying ¬T to T ¬B ∈ Γ. Then the extended branch contains the signed formulas Γ ∪ {F B}. Suppose v ⊨ Γ. In particular, v ⊨ ¬B. Thus, v ⊭ B, i.e., v satisfies F B.
2. The branch is expanded by applying ¬F to F ¬B ∈ Γ: Ex-
ercise.
3. The branch is expanded by applying ∧T to T B ∧ C ∈ Γ, which results in two new signed formulas on the branch: T B and T C. Suppose v ⊨ Γ, in particular v ⊨ B ∧ C. Then v ⊨ B and v ⊨ C. This means that v satisfies both T B and T C.
4. The branch is expanded by applying ∨F to F B ∨ C ∈ Γ: Exercise.
5. The branch is expanded by applying →F to F B → C ∈ Γ: This results in two new signed formulas on the branch: T B and F C. Suppose v ⊨ Γ, in particular v ⊭ B → C. Then v ⊨ B and v ⊭ C. This means that v satisfies both T B and F C.
Now let’s consider the possible inferences with two premises.
1. The branch is expanded by applying ∧F to F B ∧ C ∈ Γ, which results in two branches, a left one continuing through F B and a right one through F C. Suppose v ⊨ Γ, in particular v ⊭ B ∧ C. Then v ⊭ B or v ⊭ C. In the former case, v satisfies F B, i.e., v satisfies the formulas on the left branch. In the latter, v satisfies F C, i.e., v satisfies the formulas on the right branch.
2. The branch is expanded by applying ∨T to T B ∨ C ∈ Γ:
Exercise.
3. The branch is expanded by applying →T to T B → C ∈ Γ:
Exercise.
4. The branch is expanded by Cut: This results in two branches, one containing T B, the other containing F B. Since v ⊨ Γ and either v ⊨ B or v ⊭ B, v satisfies either the left or the right branch.
Corollary E.24. If ⊢ A then A is a tautology.
Corollary E.25. If Γ ⊢ A then Γ ⊨ A.
Proof. If Γ ⊢ A then for some B₁, . . . , Bₙ ∈ Γ, {F A, T B₁, . . . , T Bₙ} has a closed tableau. By Theorem E.23, every valuation v either makes some Bᵢ false or makes A true. Hence, if v ⊨ Γ then also v ⊨ A.
Corollary E.26. If Γ is satisfiable, then it is consistent.
Proof. We prove the contrapositive. Suppose that Γ is not consistent. Then there are B₁, . . . , Bₙ ∈ Γ and a closed tableau for {T B₁, . . . , T Bₙ}. By Theorem E.23, there is no v such that v ⊨ Bᵢ for all i = 1, . . . , n. But then Γ is not satisfiable.
Problems
Problem E.1. Give closed tableaux of the following:
1. F ¬(A → B) → (A ∧ ¬B)
2. F (A → C ) ∨ (B → C ), T (A ∧ B) → C
Problem E.2. Prove Proposition E.13.
Problem E.3. Prove that Γ ⊢ ¬A iff Γ ∪ {A} is inconsistent.
Problem E.4. Complete the proof of Theorem E.23.
APPENDIX F
The Completeness Theorem
F.1 Introduction
The completeness theorem is one of the most fundamental re-
sults about logic. It comes in two formulations, the equivalence
of which we’ll prove. In its first formulation it says something fun-
damental about the relationship between semantic consequence
and our proof system: if a sentence A follows from some sen-
tences Γ, then there is also a derivation that establishes Γ ⊢ A.
Thus, the proof system is as strong as it can possibly be without
proving things that don’t actually follow.
In its second formulation, it can be stated as a model exis-
tence result: every consistent set of sentences is satisfiable. Con-
sistency is a proof-theoretic notion: it says that our proof system
is unable to produce certain derivations. But who's to say that just because there are no derivations of a certain sort from Γ, it's guaranteed that there is a valuation v with v ⊨ Γ? Before the
completeness theorem was first proved—in fact before we had
the proof systems we now do—the great German mathematician
David Hilbert held the view that consistency of mathematical the-
ories guarantees the existence of the objects they are about. He
put it as follows in a letter to Gottlob Frege:
If the arbitrarily given axioms do not contradict one
another with all their consequences, then they are
true and the things defined by the axioms exist. This
is for me the criterion of truth and existence.
Frege vehemently disagreed. The second formulation of the com-
pleteness theorem shows that Hilbert was right in at least the
sense that if the axioms are consistent, then some valuation exists
that makes them all true.
These aren’t the only reasons the completeness theorem—or
rather, its proof—is important. It has a number of important con-
sequences, some of which we’ll discuss separately. For instance,
since any derivation that shows Γ ⊢ A is finite and so can only
use finitely many of the sentences in Γ, it follows by the com-
pleteness theorem that if A is a consequence of Γ, it is already
a consequence of a finite subset of Γ. This is called compactness.
Equivalently, if every finite subset of Γ is consistent, then Γ itself
must be consistent.
Although the compactness theorem follows from the com-
pleteness theorem via the detour through derivations, it is also
possible to use the proof of the completeness theorem to establish it directly. For what the proof does is take a set of sentences with a certain property—consistency—and construct a structure
out of this set that has certain properties (in this case, that it sat-
isfies the set). Almost the very same construction can be used to
directly establish compactness, by starting from “finitely satisfi-
able” sets of sentences instead of consistent ones.
F.2 Outline of the Proof
The proof of the completeness theorem is a bit complex, and
upon first reading it, it is easy to get lost. So let us outline the
proof. The first step is a shift of perspective that allows us to see a route to a proof. When completeness is thought of as "whenever Γ ⊨ A then Γ ⊢ A," it may be hard to even come up with an idea: for to show that Γ ⊢ A we have to find a derivation, and it does not look like the hypothesis that Γ ⊨ A helps us for this in any
way. For some proof systems it is possible to directly construct a
derivation, but we will take a slightly different tack. The shift in
perspective required is this: completeness can also be formulated
as: “if Γ is consistent, it has a model.” Perhaps we can use the
information in Γ together with the hypothesis that it is consistent
to construct a model. After all, we know what kind of model we
are looking for: one that is as Γ describes it!
If Γ contains only propositional variables, it is easy to con-
struct a model for it. All we have to do is come up with a valuation v such that v ⊨ p for all p ∈ Γ. Well, let v(p) = T iff p ∈ Γ.
Now suppose Γ contains some formula ¬B, with B atomic.
We might worry that the construction of v interferes with the
possibility of making ¬B true. But here's where the consistency of Γ comes in: if ¬B ∈ Γ, then B ∉ Γ, or else Γ would be inconsistent. And if B ∉ Γ, then according to our construction of v, v ⊭ B, so v ⊨ ¬B. So far so good.
What if Γ contains complex, non-atomic formulas? Say it
contains A ∧ B. To make that true, we should proceed as if both
A and B were in Γ. And if A ∨ B ∈ Γ, then we will have to make
at least one of them true, i.e., proceed as if one of them was in Γ.
This suggests the following idea: we add additional formulas to Γ so as to (a) keep the resulting set consistent, (b) make sure that for every possible sentence A, either A is in the resulting set, or ¬A is, and (c) such that, whenever A ∧ B is in the set, so are both A and B, if A ∨ B is in the set, at least one of A or B is also, etc. We keep doing this (potentially forever). Call
the set of all formulas so added Γ*. Then our construction above would provide us with a valuation v for which we could prove, by induction, that all sentences in Γ* are true in it, and hence also all sentences in Γ since Γ ⊆ Γ*. It turns out that guaranteeing (a) and (b) is enough. A set of sentences for which (b) holds is called complete. So our task will be to extend the consistent set Γ to a consistent and complete set Γ*.
So here’s what we’ll do. First we investigate the properties of
complete consistent sets, in particular we prove that a complete
consistent set contains A ∧ B iff it contains both A and B, A ∨ B
iff it contains at least one of them, etc. (Proposition F.2). We’ll
then take the consistent set Γ and show that it can be extended
to a consistent and complete set Γ ∗ (Lemma F.3). This set Γ ∗
is what we’ll use to define our valuation v(Γ ∗ ). The valuation is
determined by the propositional variables in Γ ∗ (Definition F.4).
We’ll use the properties of complete consistent sets to show that
indeed v(Γ ∗ ) A iff A ∈ Γ ∗ (Lemma F.5), and thus in particular,
v(Γ ∗ ) Γ.
F.3 Complete Consistent Sets of Sentences
Definition F.1 (Complete set). A set Γ of sentences is complete
iff for any sentence A, either A ∈ Γ or ¬A ∈ Γ.
Complete sets of sentences leave no questions unanswered.
For any sentence A, Γ "says" whether A is true or false. The impor-
tance of complete sets extends beyond the proof of the complete-
ness theorem. A theory which is complete and axiomatizable, for
instance, is always decidable.
Complete consistent sets are important in the completeness proof since we can guarantee that every consistent set of sentences Γ is contained in a complete consistent set Γ*. A complete consistent set contains, for each sentence A, either A or its negation ¬A, but not both. This is true in particular for atomic sentences, so from a complete consistent set in a language suitably expanded by constant symbols, we can construct a structure where the interpretation of predicate symbols is defined according to which atomic sentences are in Γ*. This structure can then be shown to make all sentences in Γ* (and hence also all those in Γ) true. The proof of this latter fact requires that ¬A ∈ Γ* iff A ∉ Γ*, (A ∨ B) ∈ Γ* iff A ∈ Γ* or B ∈ Γ*, etc.
In what follows, we will often tacitly use the properties of
reflexivity, monotonicity, and transitivity of ⊢ (see appendices D.6
and E.6).
Proposition F.2. Suppose Γ is complete and consistent. Then:
1. If Γ ⊢ A, then A ∈ Γ.
2. A ∧ B ∈ Γ iff both A ∈ Γ and B ∈ Γ.
3. A ∨ B ∈ Γ iff either A ∈ Γ or B ∈ Γ.
4. A → B ∈ Γ iff either A ∉ Γ or B ∈ Γ.
Proof. Let us suppose for all of the following that Γ is complete
and consistent.
1. If Γ ⊢ A, then A ∈ Γ.
Suppose that Γ ⊢ A. Suppose to the contrary that A ∉ Γ. Since Γ is complete, ¬A ∈ Γ. By Propositions E.17 and D.24, Γ is inconsistent. This contradicts the assumption that Γ is consistent. Hence, it cannot be the case that A ∉ Γ, so A ∈ Γ.
2. Exercise.
3. First we show that if A ∨ B ∈ Γ, then either A ∈ Γ or B ∈ Γ. Suppose A ∨ B ∈ Γ but A ∉ Γ and B ∉ Γ. Since Γ is complete, ¬A ∈ Γ and ¬B ∈ Γ. By Propositions E.20 and D.27, item (1), Γ is inconsistent, a contradiction. Hence, either A ∈ Γ or B ∈ Γ.
For the reverse direction, suppose that A ∈ Γ or B ∈ Γ. By Propositions E.20 and D.27, item (2), Γ ⊢ A ∨ B. By (1), A ∨ B ∈ Γ, as required.
4. Exercise.
F.4 Lindenbaum’s Lemma
We now prove a lemma that shows that any consistent set of sen-
tences is contained in some set of sentences which is not just
consistent, but also complete. The proof works by adding one
sentence at a time, guaranteeing at each step that the set remains
consistent. We do this so that for every A, either A or ¬A gets
added at some stage. The union of all stages in that construction
then contains either A or its negation ¬A and is thus complete.
It is also consistent, since we made sure at each stage not to in-
troduce an inconsistency.
Lemma F.3 (Lindenbaum's Lemma). Every consistent set Γ in a language L can be extended to a complete and consistent set Γ*.
Proof. Let Γ be consistent. Let A₀, A₁, . . . be an enumeration of all the sentences of L. Define Γ₀ = Γ, and

Γₙ₊₁ = Γₙ ∪ {Aₙ}   if Γₙ ∪ {Aₙ} is consistent;
Γₙ₊₁ = Γₙ ∪ {¬Aₙ}   otherwise.

Let Γ* = ⋃ₙ Γₙ, the union of all the Γₙ.
Each Γₙ is consistent: Γ₀ is consistent by definition. If Γₙ₊₁ = Γₙ ∪ {Aₙ}, this is because the latter is consistent. If it isn't, Γₙ₊₁ = Γₙ ∪ {¬Aₙ}. We have to verify that Γₙ ∪ {¬Aₙ} is consistent. Suppose it's not. Then both Γₙ ∪ {Aₙ} and Γₙ ∪ {¬Aₙ} are inconsistent. This means that Γₙ would be inconsistent by Propositions E.18 and D.25, contrary to the induction hypothesis.
For every n and every i < n, Γᵢ ⊆ Γₙ. This follows by a simple induction on n. For n = 0, there are no i < 0, so the claim holds automatically. For the inductive step, suppose it is true for n. We have Γₙ₊₁ = Γₙ ∪ {Aₙ} or = Γₙ ∪ {¬Aₙ} by construction. So Γₙ ⊆ Γₙ₊₁. If i < n, then Γᵢ ⊆ Γₙ by inductive hypothesis, and so ⊆ Γₙ₊₁ by transitivity of ⊆.
From this it follows that every finite subset of Γ* is a subset of Γₙ for some n, since each B ∈ Γ* not already in Γ₀ is added at some stage i. If n is the last one of these, then all B in the finite subset are in Γₙ. So, every finite subset of Γ* is consistent. By Propositions E.14 and D.18, Γ* is consistent.
Every sentence of Frm(L) appears on the list used to define Γ*. If Aₙ ∉ Γ*, then that is because Γₙ ∪ {Aₙ} was inconsistent. But then ¬Aₙ ∈ Γ*, so Γ* is complete.
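For a finite enumeration of sentences, the construction in this proof can be carried out mechanically, given some consistency test. Here is a minimal sketch in Python, with the consistency test and negation passed in as black boxes (all names and encodings are our own; for propositional logic one could plug in a tableau-based test like the one sketched in appendix E):

```python
def lindenbaum(gamma, sentences, consistent, negate):
    """Extend the consistent set `gamma` stage by stage: at stage n, add
    sentence A_n if the result stays consistent, otherwise add ¬A_n
    (cf. the definition of Γn+1 in the proof above)."""
    stage = set(gamma)
    for a in sentences:  # the enumeration A0, A1, ... of sentences
        if consistent(stage | {a}):
            stage |= {a}
        else:
            # If both stage ∪ {A} and stage ∪ {¬A} were inconsistent,
            # stage itself would be inconsistent (Propositions E.18
            # and D.25), so this branch also preserves consistency.
            stage |= {negate(a)}
    return stage  # contains A or ¬A for every enumerated sentence

# Toy usage: sentences as strings, "~" as negation, and a made-up
# consistency test that merely forbids direct contradictions.
print(lindenbaum({"p"}, ["p", "q"],
                 consistent=lambda s: not any("~" + x in s for x in s),
                 negate=lambda a: "~" + a))  # e.g. {'p', 'q'}
```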
F.5 Construction of a Model
We are now ready to define a valuation that makes all A ∈ Γ true. To do this, we first apply Lindenbaum's Lemma: we get a complete consistent Γ* ⊇ Γ. We let the propositional variables in Γ* determine v(Γ*).
Definition F.4. Suppose Γ* is a complete consistent set of formulas. Then we let

v(Γ*)(p) = T   if p ∈ Γ*;
v(Γ*)(p) = F   if p ∉ Γ*.
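In code this is a one-liner; a minimal sketch (propositional variables as strings, Γ* as a Python set—our encoding, not the text's):

```python
def valuation_of(gamma_star):
    """v(Γ*): a propositional variable is true exactly if it is in Γ*."""
    return lambda p: p in gamma_star

v = valuation_of({"p", "q"})
print(v("p"), v("r"))  # True False
```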
Lemma F.5 (Truth Lemma). v(Γ*) ⊨ A iff A ∈ Γ*.
Proof. We prove both directions simultaneously, and by induction
on A.
1. A ≡ ⊥: v(Γ*) ⊭ ⊥ by definition of satisfaction. On the other hand, ⊥ ∉ Γ* since Γ* is consistent.
2. A ≡ p: v(Γ*) ⊨ p iff v(Γ*)(p) = T (by the definition of satisfaction) iff p ∈ Γ* (by the construction of v(Γ*)).
3. A ≡ ¬B: v(Γ*) ⊨ A iff v(Γ*) ⊭ B (by definition of satisfaction). By induction hypothesis, v(Γ*) ⊭ B iff B ∉ Γ*. Since Γ* is consistent and complete, B ∉ Γ* iff ¬B ∈ Γ*.
4. A ≡ B ∧ C : exercise.
5. A ≡ B ∨ C : v(Γ*) ⊨ A iff v(Γ*) ⊨ B or v(Γ*) ⊨ C (by definition of satisfaction) iff B ∈ Γ* or C ∈ Γ* (by induction hypothesis). This is the case iff (B ∨ C ) ∈ Γ* (by Proposition F.2(3)).
6. A ≡ B → C : exercise.
F.6 The Completeness Theorem
Let’s combine our results: we arrive at the completeness theo-
rem.
Theorem F.6 (Completeness Theorem). Let Γ be a set of sentences.
If Γ is consistent, it is satisfiable.
Proof. Suppose Γ is consistent. By Lemma F.3, there is a Γ* ⊇ Γ which is consistent and complete. By Lemma F.5, v(Γ*) ⊨ A iff A ∈ Γ*. From this it follows in particular that for all A ∈ Γ, v(Γ*) ⊨ A, so Γ is satisfiable.
Corollary F.7 (Completeness Theorem, Second Version). For all Γ and sentences A: if Γ ⊨ A then Γ ⊢ A.
Proof. Note that the Γ’s in Corollary F.7 and Theorem F.6 are
universally quantified. To make sure we do not confuse ourselves,
let us restate Theorem F.6 using a different variable: for any set of
sentences ∆, if ∆ is consistent, it is satisfiable. By contraposition,
if ∆ is not satisfiable, then ∆ is inconsistent. We will use this to
prove the corollary.
Suppose that Γ ⊨ A. Then Γ ∪ {¬A} is unsatisfiable by Proposition C.16. Taking Γ ∪ {¬A} as our ∆, the previous version of Theorem F.6 gives us that Γ ∪ {¬A} is inconsistent. By Propositions E.16 and D.23, Γ ⊢ A.
Problems
Problem F.1. Complete the proof of Proposition F.2.
Problem F.2. Complete the proof of Lemma F.5.
Problem F.3. Use Corollary F.7 to prove Theorem F.6, thus show-
ing that the two formulations of the completeness theorem are
equivalent.
Problem F.4. In order for a derivation system to be complete,
its rules must be strong enough to prove every unsatisfiable set
inconsistent. Which of the rules of derivation were necessary to
prove completeness? Are any of these rules not used anywhere
in the proof? In order to answer these questions, make a list or
diagram that shows which of the rules of derivation were used in
which results that lead up to the proof of Theorem F.6. Be sure
to note any tacit uses of rules in these proofs.
APPENDIX G
Proofs
G.1 Introduction
Based on your experiences in introductory logic, you might be
comfortable with a proof system—probably a natural deduction
or Fitch style proof system, or perhaps a proof-tree system. You
probably remember doing proofs in these systems, either proving a formula or showing that a given argument is valid. In order to do
this, you applied the rules of the system until you got the desired
end result. In reasoning about logic, we also prove things, but
in most cases we are not using a proof system. In fact, most of
the proofs we consider are done in English (perhaps, with some
symbolic language thrown in) rather than entirely in the language
of first-order logic. When constructing such proofs, you might at
first be at a loss—how do I prove something without a proof
system? How do I start? How do I know if my proof is correct?
Before attempting a proof, it’s important to know what a proof
is and how to construct one. As implied by the name, a proof is
meant to show that something is true. You might think of this in
terms of a dialogue—someone asks you if something is true, say,
if every prime other than two is an odd number. To answer “yes”
is not enough; they might want to know why. In this case, you’d
give them a proof.
In everyday discourse, it might be enough to gesture at an
answer, or give an incomplete answer. In logic and mathematics,
however, we want rigorous proof—we want to show that some-
thing is true beyond any doubt. This means that every step in our
proof must be justified, and the justification must be cogent (i.e.,
the assumption you’re using is actually assumed in the statement
of the theorem you’re proving, the definitions you apply must be
correctly applied, the justifications appealed to must be correct
inferences, etc.).
Usually, we’re proving some statement. We call the statements
we’re proving by various names: propositions, theorems, lemmas,
or corollaries. A proposition is a basic proof-worthy statement:
important enough to record, but perhaps not particularly deep
nor applied often. A theorem is a significant, important proposi-
tion. Its proof often is broken into several steps, and sometimes
it is named after the person who first proved it (e.g., Cantor’s
Theorem, the Löwenheim-Skolem theorem) or after the fact it
concerns (e.g., the completeness theorem). A lemma is a propo-
sition or theorem that is used in the proof of a more impor-
tant result. Confusingly, sometimes lemmas are important results
in themselves, and also named after the person who introduced
them (e.g., Zorn’s Lemma). A corollary is a result that easily
follows from another one.
A statement to be proved often contains some assumption
that clarifies which kinds of things we're proving something about. It might begin with "Let A be a formula of the form B → C " or "Suppose Γ ⊢ A" or something of the sort. These are hypothe-
ses of the proposition, theorem, or lemma, and you may assume
these to be true in your proof. They restrict what we’re proving
about, and also introduce some names for the objects we’re talk-
ing about. For instance, if your proposition begins with “Let A be
a formula of the form B → C ,” you’re proving something about
all formulas of a certain sort only (namely, conditionals), and it’s
understood that B →C is an arbitrary conditional that your proof
will talk about.
G.2 Starting a Proof
But where do you even start?
You’ve been given something to prove, so this should be the
last thing that is mentioned in the proof (you can, obviously, an-
nounce that you’re going to prove it at the beginning, but you don’t
want to use it as an assumption). Write what you are trying to
prove at the bottom of a fresh sheet of paper—this way you don’t
lose sight of your goal.
Next, you may have some assumptions that you are able to use
(this will be made clearer when we talk about the type of proof you
are doing in the next section). Write these at the top of the page
and make sure to flag that they are assumptions (i.e., if you are
assuming x, write “assume that x,” or “suppose that x”). Finally,
there might be some definitions in the question that you need
to know. You might be told to use a specific definition, or there
might be various definitions in the assumptions or conclusion
that you are working towards. Write these down and ensure that you
understand what they mean.
How you set up your proof will also be dependent upon the
form of the question. The next section provides details on how
to set up your proof based on the type of sentence.
G.3 Using Definitions
We mentioned that you must be familiar with all definitions that
may be used in the proof, and that you can properly apply them.
This is a really important point, and it is worth looking at in
a bit more detail. Definitions are used to abbreviate properties
and relations so we can talk about them more succinctly. The
introduced abbreviation is called the definiendum, and what it ab-
breviates is the definiens. In proofs, we often have to go back to
how the definiendum was introduced, because we have to exploit
the logical structure of the definiens (the long version of which
the defined term is the abbreviation) to get through our proof. By
unpacking definitions, you’re ensuring that you’re getting to the
heart of where the logical action is.
We’ll start with an example. Suppose you want to prove the
following:
Proposition G.1. For any sets X and Y , X ∪ Y = Y ∪ X .
In order to even start the proof, we need to know what it
means for two sets to be identical; i.e., we need to know what
the “=” in that equation means for sets. Sets are defined to be
identical whenever they have the same elements. So the definition
we have to unpack is:
Definition G.2. Sets X and Y are identical, X = Y , iff every
element of X is an element of Y , and vice versa.
This definition uses X and Y as placeholders for arbitrary
sets. What it defines—the definiendum—is the expression “X =
Y ” by giving the condition under which X = Y is true. This
condition—“every element of X is an element of Y , and vice
versa”—is the definiens.1 The definition specifies that X = Y
is true if, and only if (we abbreviate this to “iff”) the condition
holds.
When you apply the definition, you have to match the X and
Y in the definition to the case you’re dealing with. In our case, it
means that in order for X ∪Y = Y ∪X to be true, each z ∈ X ∪Y
must also be in Y ∪X , and vice versa. The expression X ∪Y in the
proposition plays the role of X in the definition, and Y ∪ X that
of Y . Since X and Y are used both in the definition and in the
statement of the proposition we’re proving, but in different uses,
you have to be careful to make sure you don’t mix up the two. For
instance, it would be a mistake to think that you could prove the proposition by showing that every element of X is an element of Y , and vice versa—that would show that X = Y , not that X ∪ Y = Y ∪ X . (Also, since X and Y may be any two sets, you won't get very far, because if nothing is assumed about X and Y they may well be different sets.)

1 In this particular case—and very confusingly!—when X = Y , the sets X and Y are just one and the same set, even though we use different letters for it on the left and the right side. But the ways in which that set is picked out may be different, and that makes the definition non-trivial.
Within the proof we are dealing with set-theoretic notions
such as union, and so we must also know the meanings of the
symbol ∪ in order to understand how the proof should proceed.
And sometimes, unpacking the definition gives rise to further
definitions to unpack. For instance, X ∪ Y is defined as {z :
z ∈ X or z ∈ Y }. So if you want to prove that x ∈ X ∪ Y ,
unpacking the definition of ∪ tells you that you have to prove
x ∈ {z : z ∈ X or z ∈ Y }. Now you also have to remember that
x ∈ {z : . . . z . . .} iff . . . x . . . . So, further unpacking the definition
of the {z : . . . z . . .} notation, what you have to show is: x ∈ X or
x ∈ Y . So, “every element of X ∪Y is also an element of Y ∪ X ”
really means: “for every x, if x ∈ X or x ∈ Y , then x ∈ Y or
x ∈ X .” If we fully unpack the definitions in the proposition, we
see that what we have to show is this:
Proposition G.3. For any sets X and Y : (a) for every x, if x ∈ X or
x ∈ Y , then x ∈ Y or x ∈ X , and (b) for every x, if x ∈ Y or x ∈ X ,
then x ∈ X or x ∈ Y .
What’s important is that unpacking definitions is a necessary
part of constructing a proof. Properly doing it is sometimes diffi-
cult: you must be careful to distinguish and match the variables
in the definition and the terms in the claim you’re proving. In
order to be successful, you must know what the question is ask-
ing and what all the terms used in the question mean—you will
often need to unpack more than one definition. In simple proofs
such as the ones below, the solution follows almost immediately
from the definitions themselves. Of course, it won’t always be this
simple.
G.4 Inference Patterns
Proofs are composed of individual inferences. When we make an
inference, we typically indicate that by using a word like “so,”
“thus,” or “therefore.” The inference often relies on one or two
facts we already have available in our proof—it may be something
we have assumed, or something that we’ve concluded by an in-
ference already. To be clear, we may label these things, and in
the inference we indicate what other statements we’re using in the
inference. An inference will often also contain an explanation of
why our new conclusion follows from the things that come before
it. There are some common patterns of inference that are used
very often in proofs; we’ll go through some below. Some patterns
of inference, like proofs by induction, are more involved (and will
be discussed later).
We’ve already discussed one pattern of inference: unpack-
ing, or applying, a definition. When we unpack a definition, we
just restate something that involves the definiendum by using the
definiens. For instance, suppose that we have already established
in the course of a proof that U = V (a). Then we may apply the
definition of = for sets and infer: “Thus, by definition from (a),
every element of U is an element of V and vice versa.”
Somewhat confusingly, we often do not write the justification
of an inference when we actually make it, but before. Suppose
we haven’t already proved that U = V , but we want to. If U = V
is the conclusion we aim for, then we can restate this aim also
by applying the definition: to prove U = V we have to prove
that every element of U is an element of V and vice versa. So
our proof will have the form: (a) prove that every element of U
is an element of V ; (b) every element of V is an element of U ;
(c) therefore, from (a) and (b) by definition of =, U = V . But
we would usually not write it this way. Instead we might write
something like,
We want to show U = V . By definition of =, this
amounts to showing that every element of U is an el-
ement of V and vice versa.
(a) . . . (a proof that every element of U is an element
of V ) . . .
(b) . . . (a proof that every element of V is an element
of U ) . . .
Using a Conjunction
Perhaps the simplest inference pattern is that of drawing as con-
clusion one of the conjuncts of a conjunction. In other words:
if we have assumed or already proved that p and q , then we’re
entitled to infer that p (and also that q ). This is such a basic
inference that it is often not mentioned. For instance, once we’ve
unpacked the definition of U = V we’ve established that every
element of U is an element of V and vice versa. From this we
can conclude that every element of V is an element of U (that’s
the “vice versa” part).
Proving a Conjunction
Sometimes what you’ll be asked to prove will have the form of a
conjunction; you will be asked to “prove p and q .” In this case,
you simply have to do two things: prove p, and then prove q . You
could divide your proof into two sections, and for clarity, label
them. When you’re making your first notes, you might write “(1)
Prove p” at the top of the page, and “(2) Prove q ” in the middle of
the page. (Of course, you might not be explicitly asked to prove
a conjunction but find that your proof requires that you prove a
conjunction. For instance, if you’re asked to prove that U = V
you will find that, after unpacking the definition of =, you have to
prove: every element of U is an element of V and every element
of V is an element of U ).
Proving a Disjunction
When what you are proving takes the form of a disjunction (i.e., it
is a statement of the form "p or q "), it is enough to show that one
of the disjuncts is true. However, it basically never happens that
either disjunct just follows from the assumptions of your theorem.
More often, the assumptions of your theorem are themselves dis-
junctive, or you’re showing that all things of a certain kind have
one of two properties, but some of the things have the one and
others have the other property. This is where proof by cases is
useful (see below).
Conditional Proof
Many theorems you will encounter are in conditional form (i.e.,
show that if p holds, then q is also true). These cases are nice and
easy to set up—simply assume the antecedent of the conditional
(in this case, p) and prove the conclusion q from it. So if your
theorem reads, “If p then q ,” you start your proof with “assume
p” and at the end you should have proved q .
Conditionals may be stated in different ways. So instead of “If
p then q ,” a theorem may state that “p only if q ,” “q if p,” or “q ,
provided p.” These all mean the same and require assuming p
and proving q from that assumption. Recall that a biconditional
(“p if and only if (iff) q ”) is really two conditionals put together:
if p then q , and if q then p. All you have to do, then, is two
instances of conditional proof: one for the first conditional and
another one for the second. Sometimes, however, it is possible
to prove an “iff” statement by chaining together a bunch of other
“iff” statements so that you start with “p” an end with “q ”—but
in that case you have to make sure that each step really is an “iff.”
Universal Claims
Using a universal claim is simple: if something is true for any-
thing, it’s true for each particular thing. So if, say, the hypothesis
of your proof is X ⊆ Y , that means (unpacking the definition
of ⊆), that, for every x ∈ X , x ∈ Y . Thus, if you already know
that z ∈ X , you can conclude z ∈ Y .
Proving a universal claim may seem a little bit tricky. Usually
these statements take the following form: “If x has P , then it
has Q ” or “All P s are Q s.” Of course, it might not fit this form
perfectly, and it takes a bit of practice to figure out what you’re
asked to prove exactly. But: we often have to prove that all objects
with some property have a certain other property.
The way to prove a universal claim is to introduce names
or variables, for the things that have the one property and then
show that they also have the other property. We might put this
by saying that to prove something for all P s you have to prove
it for an arbitrary P . And the name introduced is a name for an
arbitrary P . We typically use single letters as these names for
arbitrary things, and the letters usually follow conventions: e.g.,
we use n for natural numbers, A for formulas, X for sets, f for
functions, etc.
The trick is to maintain generality throughout the proof. You
start by assuming that an arbitrary object (“x”) has the prop-
erty P , and show (based only on definitions or what you are al-
lowed to assume) that x has the property Q . Because you have
not stipulated what x is specifically, other than that it has the property P , you can assert that every P has the property Q . In
short, x is a stand-in for all things with property P .
Proposition G.4. For all sets X and Y , X ⊆ X ∪ Y .
Proof. Let X and Y be arbitrary sets. We want to show that
X ⊆ X ∪ Y . By definition of ⊆, this amounts to: for every x, if
x ∈ X then x ∈ X ∪ Y . So let x ∈ X be an arbitrary element
of X . We have to show that x ∈ X ∪ Y . Since x ∈ X , x ∈ X or
x ∈ Y . Thus, x ∈ {x : x ∈ X ∨ x ∈ Y }. But that, by definition of
∪, means x ∈ X ∪ Y .
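Before (or after) writing such a proof, it can be instructive to sanity-check the claim on small finite sets. This is no substitute for a proof, but it is a quick way to catch a misstated claim; here is a minimal sketch in Python:

```python
from itertools import combinations

# Enumerate all subsets of a small universe and check X ⊆ X ∪ Y for
# every pair of subsets. Passing proves nothing about arbitrary sets.
universe = {1, 2, 3}
subsets = [set(c) for r in range(len(universe) + 1)
           for c in combinations(universe, r)]
assert all(X <= X | Y for X in subsets for Y in subsets)
print("X ⊆ X ∪ Y holds for all subsets of", universe)
```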
Proof by Cases
Suppose you have a disjunction as an assumption or as an already
established conclusion—you have assumed or proved that p or q
is true. You want to prove r . You do this in two steps: first you
assume that p is true, and prove r , then you assume that q is true
and prove r again. This works because we assume or know that
one of the two alternatives holds. The two steps establish that
either one is sufficient for the truth of r . (If both are true, we
have not one but two reasons for why r is true. It is not neces-
sary to separately prove that r is true assuming both p and q .)
To indicate what we’re doing, we announce that we “distinguish
cases.” For instance, suppose we know that x ∈ Y ∪ Z . Y ∪ Z is
defined as {x : x ∈ Y or x ∈ Z }. In other words, by definition,
x ∈ Y or x ∈ Z . We would prove that x ∈ X from this by first
assuming that x ∈ Y , and proving x ∈ X from this assumption,
and then assuming x ∈ Z , and again proving x ∈ X from this. You
would write “We distinguish cases” under the assumption, then
“Case (1): x ∈ Y ” underneath, and “Case (2): x ∈ Z halfway
down the page. Then you’d proceed to fill in the top half and the
bottom half of the page.
Proof by cases is especially useful if what you’re proving is
itself disjunctive. Here’s a simple example:
Proposition G.5. Suppose Y ⊆ U and Z ⊆ V . Then Y ∪ Z ⊆ U ∪ V .
Proof. Assume (a) that Y ⊆ U and (b) Z ⊆ V . By definition, any
x ∈ Y is also ∈ U (c) and any x ∈ Z is also ∈ V (d). To show that
Y ∪ Z ⊆ U ∪V , we have to show that if x ∈ Y ∪ Z then x ∈ U ∪V
(by definition of ⊆). x ∈ Y ∪ Z iff x ∈ Y or x ∈ Z (by definition
of ∪). Similarly, x ∈ U ∪ V iff x ∈ U or x ∈ V . So, we have to
show: for any x, if x ∈ Y or x ∈ Z , then x ∈ U or x ∈ V .
So far we’ve only unpacked definitions! We’ve refor-
mulated our proposition without ⊆ and ∪ and are left
with trying to prove a universal conditional claim. By
what we’ve discussed above, this is done by assuming
that x is something about which we assume the “if”
part is true, and we’ll go on to show that the “then”
part is true as well. In other words, we’ll assume that
x ∈ Y or x ∈ Z and show that x ∈ U or x ∈ V .2
Suppose that x ∈ Y or x ∈ Z . We have to show that x ∈ U or
x ∈ V . We distinguish cases.
Case 1: x ∈ Y . By (c), x ∈ U . Thus, x ∈ U or x ∈ V . (Here
we’ve made the inference discussed in the preceding subsection!)
Case 2: x ∈ Z . By (d), x ∈ V . Thus, x ∈ U or x ∈ V .
2 This paragraph just explains what we’re doing—it’s not part of the proof,
and you don’t have to go into all this detail when you write down your own
proofs.
Proving an Existence Claim
When asked to prove an existence claim, the question will usually
be of the form “prove that there is an x such that . . . x . . . ”, i.e.,
that there is some object with the property described by “. . . x . . . ”. In
this case you’ll have to identify a suitable object and show that it has
the required property. This sounds straightforward, but a proof
of this kind can be tricky. Typically it involves constructing or
defining an object and proving that the object so defined has the
required property. Finding the right object may be hard, proving
that it has the required property may be hard, and sometimes it’s
even tricky to show that you’ve succeeded in defining an object
at all!
Generally, you’d write this out by specifying the object, e.g.,
“let x be . . . ” (where . . . specifies which object you have in mind),
possibly proving that . . . in fact describes an object that exists,
and then go on to show that x has the property Q . Here’s a simple
example.
Proposition G.6. Suppose that x ∈ Y . Then there is an X such that
X ⊆ Y and X ≠ ∅.
Proof. Assume x ∈ Y . Let X = {x }.
Here we’ve defined the set X by enumerating its ele-
ments. Since we assume that x is an object, and we
can always form a set by enumerating its elements, we
don’t have to show that we’ve succeeded in defining
a set X here. However, we still have to show that X
has the properties required by the proposition. The
proof isn’t complete without that!
Since x ∈ X , X ≠ ∅.
This relies on the definition of X as {x } and the ob-
vious facts that x ∈ {x } and x ∉ ∅.
Since x is the only element of {x }, and x ∈ Y , every element of X
is also an element of Y . By definition of ⊆, X ⊆ Y .
Using Existence Claims
Suppose you know that some existence claim is true (you’ve proved
it, or it’s a hypothesis you can use), say, “for some x, x ∈ X ” or
“there is an x ∈ X .” If you want to use it in your proof, you can
just pretend that you have a name for one of the things which
your hypothesis says exist. Since X contains at least one thing,
there are things to which that name might refer. You might of
course not be able to pick one out or describe it further (other
than that it is ∈ X ). But for the purpose of the proof, you can
pretend that you have picked it out and give a name to it. It’s
important to pick a name that you haven’t already used (or that
appears in your hypotheses), otherwise things can go wrong. In
your proof, you indicate this by going from “for some x, x ∈ X ”
to “Let a ∈ X .” Now you can reason about a, use some other
hypotheses, etc., until you come to a conclusion, p. If p no longer
mentions a, p is independent of the assumption that a ∈ X , and
you’ve shown that it follows just from the assumption “for some
x, x ∈ X .”
Proposition G.7. If X ≠ ∅, then X ∪ Y ≠ ∅.
Proof. Suppose X ≠ ∅. So for some x, x ∈ X .
Here we first just restated the hypothesis of the propo-
sition. This hypothesis, i.e., X ≠ ∅, hides an existen-
tial claim, which you get to only by unpacking a few
definitions. The definition of = tells us that X = ∅
iff every x ∈ X is also ∈ ∅ and every x ∈ ∅ is also
∈ X . Negating both sides, we get: X ≠ ∅ iff either
some x ∈ X is ∉ ∅ or some x ∈ ∅ is ∉ X . Since noth-
ing is ∈ ∅, the second disjunct can never be true, and
“x ∈ X and x ∉ ∅” reduces to just x ∈ X . So X ≠ ∅ iff
for some x, x ∈ X . That’s an existence claim. Now
we use that existence claim by introducing a name
for one of the elements of X :
Let a ∈ X .
Now we’ve introduced a name for one of the things ∈
X . We’ll continue to argue about a, but we’ll be care-
ful to only assume that a ∈ X and nothing else:
Since a ∈ X , a ∈ X ∪ Y , by definition of ∪. So for some x,
x ∈ X ∪ Y , i.e., X ∪ Y ≠ ∅.
In that last step, we went from “a ∈ X ∪ Y ” to “for
some x, x ∈ X ∪Y .” That doesn’t mention a anymore,
so we know that “for some x, x ∈ X ∪ Y ” follows
from “for some x, x ∈ X ” alone. But that means that
X ∪ Y ≠ ∅.
It’s maybe good practice to keep bound variables like “x” sep-
arate from hypothetical names like a, like we did. In practice,
however, we often don’t and just use x, like so:
Suppose X ≠ ∅, i.e., there is an x ∈ X . By definition
of ∪, x ∈ X ∪ Y . So X ∪ Y ≠ ∅.
However, when you do this, you have to be extra careful that
you use different x’s and y’s for different existential claims. For
instance, the following is not a correct proof of “If X ≠ ∅ and
Y ≠ ∅ then X ∩ Y ≠ ∅” (which is not true).
Suppose X ≠ ∅ and Y ≠ ∅. So for some x, x ∈ X
and also for some x, x ∈ Y . Since x ∈ X and x ∈ Y ,
x ∈ X ∩ Y , by definition of ∩. So X ∩ Y ≠ ∅.
Can you spot where the incorrect step occurs and explain why
the result does not hold?
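If you get stuck, a concrete counterexample (our own choice, not from the text) at least shows that the conclusion can fail:

```python
# Two nonempty sets whose intersection is empty: the "conclusion" of
# the flawed proof does not hold.
X, Y = {1}, {2}
assert X != set() and Y != set()  # X ≠ ∅ and Y ≠ ∅
assert X & Y == set()             # yet X ∩ Y = ∅
```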
G.5 An Example
Our first example is the following simple fact about unions and in-
tersections of sets. It will illustrate unpacking definitions, proofs
of conjunctions, of universal claims, and proof by cases.
Proposition G.8. For any sets X , Y , and Z , X ∪ (Y ∩ Z ) = (X ∪
Y ) ∩ (X ∪ Z ).
Let’s prove it!
Proof. We want to show that for any sets X , Y , and Z , X ∪ (Y ∩
Z ) = (X ∪ Y ) ∩ (X ∪ Z ).
First we unpack the definition of “=” in the statement
of the proposition. Recall that proving sets identical
means showing that the sets have the same elements.
That is, all elements of X ∪ (Y ∩ Z ) are also elements
of (X ∪Y ) ∩ (X ∪Z ), and vice versa. The “vice versa”
means that also every element of (X ∪ Y ) ∩ (X ∪ Z )
must be an element of X ∪ (Y ∩ Z ). So in unpacking
the definition, we see that we have to prove a conjunc-
tion. Let’s record this:
By definition, X ∪ (Y ∩ Z ) = (X ∪Y ) ∩ (X ∪ Z ) iff every element
of X ∪ (Y ∩ Z ) is also an element of (X ∪Y ) ∩ (X ∪ Z ), and every
element of (X ∪ Y ) ∩ (X ∪ Z ) is an element of X ∪ (Y ∩ Z ).
Since this is a conjunction, we must prove each con-
junct separately. Let’s start with the first: let’s prove
that every element of X ∪ (Y ∩ Z ) is also an element
of (X ∪ Y ) ∩ (X ∪ Z ).
This is a universal claim, and so we consider an ar-
bitrary element of X ∪ (Y ∩ Z ) and show that it must
also be an element of (X ∪Y ) ∩ (X ∪ Z ). We’ll pick a
variable to call this arbitrary element by, say, z . Our
proof continues:
First, we prove that every element of X ∪(Y ∩Z ) is also an element
of (X ∪Y ) ∩ (X ∪ Z ). Let z ∈ X ∪ (Y ∩ Z ). We have to show that
z ∈ (X ∪ Y ) ∩ (X ∪ Z ).
Now it is time to unpack the definition of ∪ and ∩.
For instance, the definition of ∪ is: X ∪ Y = {z :
z ∈ X or z ∈ Y }. When we apply the definition to
“X ∪ (Y ∩ Z ),” the role of the “Y ” in the definition
is now played by “Y ∩ Z ,” so X ∪ (Y ∩ Z ) = {z :
z ∈ X or z ∈ Y ∩ Z }. So our assumption that z ∈
X ∪(Y ∩Z ) amounts to: z ∈ {z : z ∈ X or z ∈ Y ∩Z }.
And z ∈ {z : . . . z . . .} iff . . . z . . . , i.e., in this case,
z ∈ X or z ∈ Y ∩ Z .
By the definition of ∪, either z ∈ X or z ∈ Y ∩ Z .
Since this is a disjunction, it will be useful to apply
proof by cases. We take the two cases, and show that
in each one, the conclusion we’re aiming for (namely,
“z ∈ (X ∪ Y ) ∩ (X ∪ Z )”) obtains.
Case 1: Suppose that z ∈ X .
There’s not much more to work from based on our
assumptions. So let’s look at what we have to work
with in the conclusion. We want to show that z ∈
(X ∪ Y ) ∩ (X ∪ Z ). Based on the definition of ∩, if
we want to show that z ∈ (X ∪ Y ) ∩ (X ∪ Z ), we
have to show that it’s in both (X ∪ Y ) and (X ∪ Z ).
But z ∈ X ∪ Y iff z ∈ X or z ∈ Y , and we already
have (as the assumption of case 1) that z ∈ X . By
the same reasoning—switching Z for Y —z ∈ X ∪ Z .
This argument went in the reverse direction, so let’s
record our reasoning in the direction needed in our
proof.
Since z ∈ X , z ∈ X or z ∈ Y , and hence, by definition of ∪,
z ∈ X ∪ Y . Similarly, z ∈ X ∪ Z . But this means that z ∈
(X ∪ Y ) ∩ (X ∪ Z ), by definition of ∩.
This completes the first case of the proof by cases.
Now we want to derive the conclusion in the second
case, where z ∈ Y ∩ Z .
Case 2: Suppose that z ∈ Y ∩ Z .
Again, we are working with the intersection of two
sets. Let’s apply the definition of ∩:
Since z ∈ Y ∩ Z , z must be an element of both Y and Z , by
definition of ∩.
It’s time to look at our conclusion again. We have to
show that z is in both (X ∪ Y ) and (X ∪ Z ). And
again, the solution is immediate.
Since z ∈ Y , z ∈ (X ∪ Y ). Since z ∈ Z , also z ∈ (X ∪ Z ). So,
z ∈ (X ∪ Y ) ∩ (X ∪ Z ).
Here we applied the definitions of ∪ and ∩ again,
but since we’ve already recalled those definitions, and
already showed that if z is in one of two sets it is in
their union, we don’t have to be as explicit in what
we’ve done.
We’ve completed the second case of the proof by cases,
so now we can assert our first conclusion.
So, if z ∈ X ∪ (Y ∩ Z ) then z ∈ (X ∪ Y ) ∩ (X ∪ Z ).
Now we just want to show the other direction, that
every element of (X ∪ Y ) ∩ (X ∪ Z ) is an element of
X ∪ (Y ∩ Z ). As before, we prove this universal claim
by assuming we have an arbitrary element of the first
set and show it must be in the second set. Let’s state
what we’re about to do.
Now, assume that z ∈ (X ∪ Y ) ∩ (X ∪ Z ). We want to show that
z ∈ X ∪ (Y ∩ Z ).
We are now working from the hypothesis that z ∈
(X ∪ Y ) ∩ (X ∪ Z ). It hopefully isn’t too confusing
that we’re using the same z here as in the first part
of the proof. When we finished that part, all the as-
sumptions we’ve made there are no longer in effect,
so now we can make new assumptions about what z
is. If that is confusing to you, just replace z with a
different variable in what follows.
We know that z is in both X ∪ Y and X ∪ Z , by
definition of ∩. And by the definition of ∪, we can
further unpack this to: either z ∈ X or z ∈ Y , and
also either z ∈ X or z ∈ Z . This looks like a proof
by cases again—except the “and” makes it confusing.
You might think that this amounts to there being three
possibilities: z is either in X , Y or Z . But that would
be a mistake. We have to be careful, so let’s consider
each disjunction in turn.
By definition of ∩, z ∈ X ∪ Y and z ∈ X ∪ Z . By definition of ∪,
z ∈ X or z ∈ Y . We distinguish cases.
Since we’re focusing on the first disjunction, we haven’t
gotten our second disjunction (from unpacking X ∪
Z ) yet. In fact, we don’t need it yet. The first case
is z ∈ X , and an element of a set is also an element
of the union of that set with any other. So case 1 is
easy:
Case 1: Suppose that z ∈ X . It follows that z ∈ X ∪ (Y ∩ Z ).
Now for the second case, z ∈ Y . Here we’ll unpack
the second ∪ and do another proof-by-cases:
Case 2: Suppose that z ∈ Y . Since z ∈ X ∪ Z , either z ∈ X or
z ∈ Z . We distinguish cases further:
Case 2a: z ∈ X . Then, again, z ∈ X ∪ (Y ∩ Z ).
Ok, this was a bit weird. We didn’t actually need the
assumption that z ∈ Y for this case, but that’s ok.
Case 2b: z ∈ Z . Then z ∈ Y and z ∈ Z , so z ∈ Y ∩ Z , and
consequently, z ∈ X ∪ (Y ∩ Z ).
This concludes both proofs-by-cases and so we’re done
with the second half.
So, if z ∈ (X ∪ Y ) ∩ (X ∪ Z ) then z ∈ X ∪ (Y ∩ Z ).
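Before moving on, note that identities like this can also be spot-checked on concrete sets. A minimal sketch (illustration only, with arbitrarily chosen sets; no substitute for the proof above):

```python
# Spot-check of Proposition G.8: union distributes over intersection.
X, Y, Z = {1, 2}, {2, 3}, {3, 4}
assert X | (Y & Z) == (X | Y) & (X | Z)
```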
G.6 Another Example
Proposition G.9. If X ⊆ Z , then X ∪ (Z \ X ) = Z .
Proof. Suppose that X ⊆ Z . We want to show that X ∪(Z \X ) = Z .
We begin by observing that this is a conditional state-
ment. It is tacitly universally quantified: the proposi-
tion holds for all sets X and Z . So X and Z are vari-
ables for arbitrary sets. To prove such a statement,
we assume the antecedent and prove the consequent.
We continue by using the assumption that X ⊆ Z .
Let’s unpack the definition of ⊆: the assumption means
that all elements of X are also elements of Z . Let’s
write this down—it’s an important fact that we’ll use
throughout the proof.
By the definition of ⊆, since X ⊆ Z , for all z , if z ∈ X , then
z ∈ Z.
We’ve unpacked all the definitions that are given to
us in the assumption. Now we can move onto the
conclusion. We want to show that X ∪ (Z \ X ) = Z ,
and so we set up a proof similarly to the last example:
we show that every element of X ∪ (Z \ X ) is also
an element of Z and, conversely, every element of Z
is an element of X ∪ (Z \ X ). We can shorten this to:
X ∪ (Z \ X ) ⊆ Z and Z ⊆ X ∪ (Z \ X ). (Here we’re
doing the opposite of unpacking a definition, but it
makes the proof a bit easier to read.) Since this is a
conjunction, we have to prove both parts. To show the
first part, i.e., that every element of X ∪(Z \X ) is also
an element of Z , we assume that z ∈ X ∪ (Z \ X ) for
an arbitrary z and show that z ∈ Z . By the definition
of ∪, we can conclude that z ∈ X or z ∈ Z \ X from
z ∈ X ∪ (Z \ X ). You should now be getting the hang
of this.
X ∪ (Z \ X ) = Z iff X ∪ (Z \ X ) ⊆ Z and Z ⊆ X ∪ (Z \ X ). First
we prove that X ∪ (Z \ X ) ⊆ Z . Let z ∈ X ∪ (Z \ X ). So, either
z ∈ X or z ∈ (Z \ X ).
We’ve arrived at a disjunction, and from it we want
to prove that z ∈ Z . We do this using proof by cases.
Case 1: z ∈ X . Since for all z , if z ∈ X , z ∈ Z , we have that
z ∈ Z.
Here we’ve used the fact recorded earlier which fol-
lowed from the hypothesis of the proposition that
X ⊆ Z . The first case is complete, and we turn to
the second case, z ∈ (Z \ X ). Recall that Z \ X de-
notes the difference of the two sets, i.e., the set of all
elements of Z which are not elements of X . But any
element of Z not in X is in particular an element
of Z .
Case 2: z ∈ (Z \ X ). This means that z ∈ Z and z ∉ X . So, in
particular, z ∈ Z .
Great, we’ve proved the first direction. Now for the
second direction. Here we prove that Z ⊆ X ∪(Z \X ).
So we assume that z ∈ Z and prove that z ∈ X ∪ (Z \
X ).
Now let z ∈ Z . We want to show that z ∈ X or z ∈ Z \ X .
Since all elements of X are also elements of Z , and
Z \ X is the set of all things that are elements of Z
but not X , it follows that z is either in X or in Z \ X .
This may be a bit unclear if you don’t already know
why the result is true. It would be better to prove it
step-by-step. It will help to use a simple fact which
we can state without proof: z ∈ X or z ∉ X . This
is called the “principle of excluded middle:” for any
statement p, either p is true or its negation is true.
(Here, p is the statement that z ∈ X .) Since this is a
disjunction, we can again use proof-by-cases.
Either z ∈ X or z ∉ X . In the former case, z ∈ X ∪ (Z \ X ).
In the latter case, z ∈ Z and z ∉ X , so z ∈ Z \ X . But then
z ∈ X ∪ (Z \ X ).
Our proof is complete: we have shown that X ∪ (Z \
X ) = Z.
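Again, a quick sanity check on concrete sets (our own example, not from the text) can accompany the proof:

```python
# Spot-check of Proposition G.9: if X ⊆ Z, then X ∪ (Z \ X) = Z.
X, Z = {1, 2}, {1, 2, 3}
assert X <= Z              # the hypothesis X ⊆ Z
assert X | (Z - X) == Z    # the conclusion
```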
G.7 Proof by Contradiction
In the first instance, proof by contradiction is an inference pat-
tern that is used to prove negative claims. Suppose you want to
show that some claim p is false, i.e., you want to show ¬p. The
most promising strategy is to (a) suppose that p is true, and (b)
show that this assumption leads to something you know to be
false. “Something known to be false” may be a result that con-
flicts with—contradicts—p itself, or some other hypothesis of the
overall claim you are considering. For instance, a proof of “if q
then ¬p” involves assuming that q is true and proving ¬p from
it. If you prove ¬p by contradiction, that means assuming p in
addition to q . If you can prove ¬q from p, you have shown that
the assumption p leads to something that contradicts your other
assumption q , since q and ¬q cannot both be true. Of course,
you have to use other inference patterns in your proof of the con-
tradiction, as well as unpacking definitions. Let’s consider an
example.
Proposition G.10. If X ⊆ Y and Y = ∅, then X has no elements.
Proof. Suppose X ⊆ Y and Y = ∅. We want to show that X has
no elements.
Since this is a conditional claim, we assume the an-
tecedent and want to prove the consequent. The con-
sequent is: X has no elements. We can make that a bit
more explicit: it’s not the case that there is an x ∈ X .
X has no elements iff it’s not the case that there is an x such that
x ∈ X.
So we’ve determined that what we want to prove is
really a negative claim ¬p, namely: it’s not the case
that there is an x ∈ X . To use proof by contradic-
tion, we have to assume the corresponding positive
claim p, i.e., there is an x ∈ X , and prove a contra-
diction from it. We indicate that we’re doing a proof
by contradiction by writing “by way of contradiction,
assume” or even just “suppose not,” and then state
the assumption p.
Suppose not: there is an x ∈ X .
This is now the new assumption we’ll use to obtain a
contradiction. We have two more assumptions: that
X ⊆ Y and that Y = ∅. The first gives us that x ∈ Y :
Since X ⊆ Y , x ∈ Y .
But since Y = ∅, every element of Y (e.g., x) must
also be an element of ∅.
Since Y = ∅, x ∈ ∅. This is a contradiction, since by definition ∅
has no elements.
This already completes the proof: we’ve arrived at
what we need (a contradiction) from the assumptions
we’ve set up, and this means that the assumptions
can’t all be true. Since the first two assumptions (X ⊆
Y and Y = ∅) are not contested, it must be the last
assumption introduced (there is an x ∈ X ) that is
false. But if we want to be thorough, we can spell
this out.
Thus, our assumption that there is an x ∈ X must be false, and
hence, by proof by contradiction, X has no elements.
Every positive claim is trivially equivalent to a negative claim:
p iff ¬¬p. So proofs by contradiction can also be used to establish
positive claims “indirectly,” as follows: To prove p, read it as the
negative claim ¬¬p. If we can prove a contradiction from ¬p,
we’ve established ¬¬p by proof by contradiction, and hence p.
In the last example, we aimed to prove a negative claim,
namely that X has no elements, and so the assumption we made
for the purpose of proof by contradiction (i.e., that there is an
x ∈ X ) was a positive claim. It gave us something to work with,
namely the hypothetical x ∈ X about which we continued to rea-
son until we got to x ∈ ∅.
When proving a positive claim indirectly, the assumption you’d
make for the purpose of proof by contradiction would be nega-
tive. But very often you can easily reformulate a positive claim
as a negative claim, and a negative claim as a positive claim.
Our previous proof would have been essentially the same had
we proved “X = ∅” instead of the negative consequent “X has
no elements.” (By definition of =, “X = ∅” is a general claim,
since it unpacks to “every element of X is an element of ∅ and
vice versa”.) But it is easily seen to be equivalent to the negative
claim “not: there is an x ∈ X .”
So it is sometimes easier to work with ¬p as an assumption
than it is to prove p directly. Even when a direct proof is just
as simple or even simpler (as in the next example), some people
prefer to proceed indirectly. If the double negation confuses you,
think of a proof by contradiction of some claim as a proof of a
contradiction from the opposite claim. So, a proof by contradic-
tion of ¬p is a proof of a contradiction from the assumption p; and
proof by contradiction of p is a proof of a contradiction from ¬p.
Proposition G.11. X ⊆ X ∪ Y .
Proof. We want to show that X ⊆ X ∪ Y .
On the face of it, this is a positive claim: every x ∈ X
is also in X ∪ Y . The negation of that is: some x ∈ X
is ∉ X ∪ Y . So we can prove the claim indirectly
by assuming this negated claim, and showing that it
leads to a contradiction.
Suppose not, i.e., X ⊈ X ∪ Y .
We have a definition of X ⊆ X ∪ Y : every x ∈ X is
also ∈ X ∪ Y . To understand what X ⊈ X ∪ Y means,
we have to use some elementary logical manipulation
on the unpacked definition: it’s false that every x ∈ X
is also ∈ X ∪ Y iff there is some x ∈ X that is ∉ X ∪ Y .
(This is a place where you want to be very careful:
many students’ attempted proofs by contradiction fail
because they analyze the negation of a claim like “all
As are Bs” incorrectly.) In other words, X ⊈ X ∪ Y
iff there is an x such that x ∈ X and x ∉ X ∪ Y . From
then on, it’s easy.
So, there is an x ∈ X such that x ∉ X ∪ Y . By definition of ∪,
x ∈ X ∪ Y iff x ∈ X or x ∈ Y . Since x ∈ X , we have x ∈ X ∪ Y .
This contradicts the assumption that x ∉ X ∪ Y .
Proposition G.12. If X ⊆ Y and Y ⊆ Z then X ⊆ Z .
Proof. Suppose X ⊆ Y and Y ⊆ Z . We want to show X ⊆ Z .
Let’s proceed indirectly: we assume the negation of
what we want to establish.
Suppose not, i.e., X ⊈ Z .
As before, we reason that X ⊈ Z iff not every x ∈ X
is also ∈ Z , i.e., some x ∈ X is ∉ Z . Don’t worry,
with practice you won’t have to think hard anymore
to unpack negations like this.
In other words, there is an x such that x ∈ X and x ∉ Z .
Now we can use this to get to our contradiction. Of
course, we’ll have to use the other two assumptions
to do it.
Since X ⊆ Y , x ∈ Y . Since Y ⊆ Z , x ∈ Z . But this contradicts
x ∉ Z .
Proposition G.13. If X ∪ Y = X ∩ Y then X = Y .
Proof. Suppose X ∪ Y = X ∩ Y . We want to show that X = Y .
The beginning is now routine:
Assume, by way of contradiction, that X ≠ Y .
Our assumption for the proof by contradiction is that
X ≠ Y . Since X = Y iff X ⊆ Y and Y ⊆ X , we get that
X ≠ Y iff X ⊈ Y or Y ⊈ X . (Note how important it is
to be careful when manipulating negations!) To prove
a contradiction from this disjunction, we use a proof
by cases and show that in each case, a contradiction
follows.
X ≠ Y iff X ⊈ Y or Y ⊈ X . We distinguish cases.
In the first case, we assume X ⊈ Y , i.e., for some x,
x ∈ X but x ∉ Y . X ∩ Y is defined as those elements
that X and Y have in common, so if something isn’t
in one of them, it’s not in the intersection. X ∪ Y is
X together with Y , so anything in either is also in the
union. This tells us that x ∈ X ∪ Y but x ∉ X ∩ Y ,
and hence that X ∩ Y ≠ X ∪ Y .
Case 1: X ⊈ Y . Then for some x, x ∈ X but x ∉ Y . Since
x ∉ Y , x ∉ X ∩ Y . Since x ∈ X , x ∈ X ∪ Y . So, X ∩ Y ≠ X ∪ Y ,
contradicting the assumption that X ∩ Y = X ∪ Y .
Case 2: Y ⊈ X . Then for some y, y ∈ Y but y ∉ X . As
before, we have y ∈ X ∪ Y but y ∉ X ∩ Y , and so X ∩ Y ≠ X ∪ Y ,
again contradicting X ∩ Y = X ∪ Y .
G.8 Reading Proofs
Proofs you find in textbooks and articles very seldom give all the
details we have so far included in our examples. Authors often
do not draw attention to when they distinguish cases or when they
give an indirect proof, and often don’t mention the definitions they use.
So when you read a proof in a textbook, you will often have to
fill in those details for yourself in order to understand the proof.
Doing this is also good practice to get the hang of the various
moves you have to make in a proof. Let’s look at an example.
Proposition G.14 (Absorption). For all sets X , Y ,
X ∩ (X ∪ Y ) = X
Proof. If z ∈ X ∩ (X ∪ Y ), then z ∈ X , so X ∩ (X ∪ Y ) ⊆ X .
Now suppose z ∈ X . Then also z ∈ X ∪ Y , and therefore also
z ∈ X ∩ (X ∪ Y ).
The preceding proof of the absorption law is very condensed.
There is no mention of any definitions used, no “we have to prove
that” before we prove it, etc. Let’s unpack it. The proposition
proved is a general claim about any sets X and Y , and when the
proof mentions X or Y , these are variables for arbitrary sets. The
general claim the proof establishes is what’s required to prove
identity of sets, i.e., that every element of the left side of the
identity is an element of the right and vice versa.
“If z ∈ X ∩ (X ∪Y ), then z ∈ X , so X ∩ (X ∪Y ) ⊆ X .”
This is the first half of the proof of the identity: it establishes
that if an arbitrary z is an element of the left side, it is also
an element of the right, i.e., X ∩ (X ∪ Y ) ⊆ X . Assume that
z ∈ X ∩ (X ∪ Y ). Since z is an element of the intersection of
two sets iff it is an element of both sets, we can conclude that
z ∈ X and also z ∈ X ∪ Y . In particular, z ∈ X , which is what
we wanted to show. Since that’s all that has to be done for the
first half, we know that the rest of the proof must be a proof of
the second half, i.e., a proof that X ⊆ X ∩ (X ∪ Y ).
“Now suppose z ∈ X . Then also z ∈ X ∪ Y , and
therefore also z ∈ X ∩ (X ∪ Y ).”
We start by assuming that z ∈ X , since we are showing that,
for any z , if z ∈ X then z ∈ X ∩ (X ∪ Y ). To show that z ∈
X ∩ (X ∪Y ), we have to show (by definition of “∩”) that (i) z ∈ X
and also (ii) z ∈ X ∪ Y . Here (i) is just our assumption, so
there is nothing further to prove, and that’s why the proof does
not mention it again. For (ii), recall that z is an element of a
union of sets iff it is an element of at least one of those sets.
Since z ∈ X , and X ∪ Y is the union of X and Y , this is the
case here. So z ∈ X ∪ Y . We’ve shown both (i) z ∈ X and (ii)
z ∈ X ∪ Y , hence, by definition of “∩,” z ∈ X ∩ (X ∪ Y ). The
proof doesn’t mention those definitions; it’s assumed the reader
has already internalized them. If you haven’t, you’ll have to go
back and remind yourself what they are. Then you’ll also have to
recognize why it follows from z ∈ X that z ∈ X ∪ Y , and from
z ∈ X and z ∈ X ∪ Y that z ∈ X ∩ (X ∪ Y ).
Here’s another version of the proof above, with everything
made explicit:
Proof. [By definition of = for sets, to show X ∩ (X ∪ Y ) = X we
have to show (a) X ∩ (X ∪ Y ) ⊆ X and (b) X ⊆ X ∩ (X ∪ Y ). (a): By
definition of ⊆, we have to show that if z ∈ X ∩ (X ∪ Y ), then
z ∈ X .] If z ∈ X ∩ (X ∪ Y ), then z ∈ X [since by definition of ∩,
z ∈ X ∩ (X ∪ Y ) iff z ∈ X and z ∈ X ∪ Y ], so X ∩ (X ∪ Y ) ⊆ X .
[(b): By definition of ⊆, we have to show that if z ∈ X , then
z ∈ X ∩ (X ∪ Y ).] Now suppose [(1)] z ∈ X . Then also [(2)]
z ∈ X ∪ Y [since by (1) z ∈ X or z ∈ Y , which by definition of ∪
means z ∈ X ∪ Y ], and therefore also z ∈ X ∩ (X ∪ Y ) [since the
definition of ∩ requires that z ∈ X , i.e., (1), and z ∈ X ∪ Y , i.e.,
(2)].
G.9 I Can’t Do It!
We all get to a point where we feel like giving up. But you can do
it. Your instructor and teaching assistant, as well as your fellow
students, can help. Ask them for help! Here are a few tips to help
you avoid a crisis, and what to do if you feel like giving up.
To make sure you can solve problems successfully, do the fol-
lowing:
1. Start as far in advance as possible. We get busy throughout
the semester and many of us struggle with procrastination,
so one of the best things you can do is to start your homework
assignments early. That way, if you’re stuck, you have time
to look for a solution (that isn’t crying).
2. Talk to your classmates. You are not alone. Others in the
class may also struggle—but they may struggle with differ-
ent things. Talking it out with your peers can give you
a different perspective on the problem that might lead to
a breakthrough. Of course, don’t just copy their solution:
ask them for a hint, or explain where you get stuck and ask
them for the next step. And when you do get it, recipro-
cate. Helping someone else along, and explaining things
will help you understand better, too.
3. Ask for help. You have many resources available to you—
your instructor and teaching assistant are there for you
and want you to succeed. They should be able to help
you work out a problem and identify where in the process
you’re struggling.
4. Take a break. If you’re stuck, it might be because you’ve been
staring at the problem for too long. Take a short break,
have a cup of tea, or work on a different problem for a
while, then return to the problem with a fresh mind. Sleep
on it.
Notice how these strategies require that you’ve started to work
on the proof well in advance? If you’ve started the proof at 2am
the day before it’s due, these might not be so helpful.
This might sound like doom and gloom, but finding a proof
is a challenge that pays off in the end. Some people do this as
a career—so there must be something to enjoy about it. Like
basically everything, solving problems and doing proofs is some-
thing that requires practice. You might see classmates who find
this easy: they’ve probably just had lots of practice already. Try
not to give in too easily.
If you do run out of time (or patience) on a particular prob-
lem: that’s ok. It doesn’t mean you’re stupid or that you will never
get it. Find out (from your instructor or another student) how it
is done, and identify where you went wrong or got stuck, so you
can avoid doing that the next time you encounter a similar issue.
Then try to do it without looking at the solution. And next time,
start (and ask for help) earlier.
G.10 Other Resources
There are many books on how to do proofs in mathematics which
may be useful. Check out How to Read and do Proofs: An Introduc-
tion to Mathematical Thought Processes by Daniel Solow and How
to Prove It: A Structured Approach by Daniel Velleman in particu-
lar. The Book of Proof by Richard Hammack and Mathematical
Reasoning by Ted Sundstrom are books on proof that are freely
available. Philosophers might find More Precisely: The Math you
need to do Philosophy by Eric Steinhart to be a good primer on
mathematical reasoning.
There are also various shorter guides to proofs available on
the internet; e.g., “Introduction to Mathematical Arguments” by
Michael Hutchings and “How to write proofs” by Eugenia Cheng.
Motivational Videos
Feel like you have no motivation to do your homework? Feeling
down? These videos might help!
• https://www.youtube.com/watch?v=ZXsQAXx_ao0
• https://www.youtube.com/watch?v=BQ4yd2W50No
• https://www.youtube.com/watch?v=StTqXEQ2l-Y
Problems
Problem G.1. Suppose you are asked to prove that X ∩ Y ≠ ∅.
Unpack all the definitions occurring here, i.e., restate this in a way
that does not mention “∩”, “=”, or “∅”.
Problem G.2. Prove indirectly that X ∩ Y ⊆ X .
Problem G.3. Expand the following proof of X ∪ (X ∩ Y ) =
X , where you mention all the inference patterns used, why each
step follows from assumptions or claims established before it, and
where we have to appeal to which definitions.
Proof. If z ∈ X ∪ (X ∩Y ) then z ∈ X or z ∈ X ∩Y . If z ∈ X ∩Y ,
z ∈ X . Any z ∈ X is also ∈ X ∪ (X ∩ Y ).
APPENDIX H
Induction
H.1 Introduction
Induction is an important proof technique which is used, in dif-
ferent forms, in almost all areas of logic, theoretical computer
science, and mathematics. It is needed to prove many of the re-
sults in logic.
Induction is often contrasted with deduction, and character-
ized as the inference from the particular to the general. For in-
stance, if we observe many green emeralds, and nothing that we
would call an emerald that’s not green, we might conclude that
all emeralds are green. This is an inductive inference, in that it
proceeds from many particular cases (this emerald is green, that
emerald is green, etc.) to a general claim (all emeralds are green).
Mathematical induction is also an inference that concludes a gen-
eral claim, but it is of a very different kind than this “simple in-
duction.”
Very roughly, an inductive proof in mathematics concludes
that all mathematical objects of a certain sort have a certain
property. In the simplest case, the mathematical objects an in-
ductive proof is concerned with are natural numbers. In that
case an inductive proof is used to establish that all natural num-
bers have some property, and it does this by showing that (1) 0
has the property, and (2) whenever a number n has the property,
so does n + 1. Induction on natural numbers can then also of-
ten be used to prove general claims about mathematical objects that can
be assigned numbers. For instance, finite sets each have a finite
number n of elements, and if we can use induction to show that
every number n has the property “all finite sets of size n are . . . ”
then we will have shown something about all finite sets.
Induction can also be generalized to mathematical objects
that are inductively defined. For instance, expressions of a formal
language such as those of first-order logic are defined induc-
tively. Structural induction is a way to prove results about all such
expressions. Structural induction, in particular, is very useful—
and widely used—in logic.
H.2 Induction on N
In its simplest form, induction is a technique used to prove results
for all natural numbers. It uses the fact that by starting from 0 and
repeatedly adding 1 we eventually reach every natural number.
So to prove that something is true for every number, we can (1)
establish that it is true for 0 and (2) show that whenever it is true
for a number n, it is also true for the next number n + 1. If we
abbreviate “number n has property P ” by P (n), then a proof by
induction that P (n) for all n ∈ N consists of:
1. a proof of P (0), and
2. a proof that, for any n, if P (n) then P (n + 1).
To make this crystal clear, suppose we have both (1) and (2).
Then (1) tells us that P (0) is true. If we also have (2), we know
in particular that if P (0) then P (0 + 1), i.e., P (1). (This follows
from the general statement “for any n, if P (n) then P (n + 1)” by
putting 0 for n.) So by modus ponens, we have that P (1). From
(2) again, now taking 1 for n, we have: if P (1) then P (2). Since
we’ve just established P (1), by modus ponens, we have P (2). And
so on. For any number k , after doing this k times, we eventually
arrive at P (k ). So (1) and (2) together establish P (k ) for any
k ∈ N.
Let’s look at an example. Suppose we want to find out how
many different sums we can throw with n dice. Although it might
seem silly, let’s start with 0 dice. If you have no dice there’s only
one possible sum you can “throw”: no dots at all, which sums
to 0. So the number of different possible throws is 1. If you have
only one die, i.e., n = 1, there are six possible values, 1 through 6.
With two dice, we can throw any sum from 2 through 12, that’s 11
possibilities. With three dice, we can throw any number from 3 to
18, i.e., 16 different possibilities. 1, 6, 11, 16: looks like a pattern:
maybe the answer is 5n + 1? Of course, 5n + 1 is the maximum
possible, because there are only 5n + 1 numbers between n, the
lowest value you can throw with n dice (all 1’s) and 6n, the highest
you can throw (all 6’s).
Theorem H.1. With n dice one can throw all 5n + 1 possible values
between n and 6n.
Proof. Let P (n) be the claim: “It is possible to throw any number
between n and 6n using n dice.” To use induction, we prove:
1. The induction basis P (1), i.e., with just one die, you can
throw any number between 1 and 6.
2. The induction step, for all k , if P (k ) then P (k + 1).
(1) is proved by inspecting a 6-sided die. It has 6 sides,
and every number between 1 and 6 shows up on one of the sides.
So it is possible to throw any number between 1 and 6 using a
single die.
To prove (2), we assume the antecedent of the conditional,
i.e., P (k ). This assumption is called the inductive hypothesis. We
use it to prove P (k + 1). The hard part is to find a way of thinking
about the possible values of a throw of k + 1 dice in terms of the
possible values of throws of k dice plus throws of the extra
(k + 1)-st die—this is what we have to do, though, if we want to use
the inductive hypothesis.
The inductive hypothesis says we can get any number between
k and 6k using k dice. If we throw a 1 with our (k + 1)-st die, this
adds 1 to the total. So we can throw any value between k + 1 and
6k + 1 by throwing k dice and then rolling a 1 with the (k + 1)-st
die. What’s left? The values 6k + 2 through 6k + 6. We can get
these by rolling k 6s and then a number between 2 and 6 with
our (k + 1)-st die. Together, this means that with k + 1 dice we
can throw any of the numbers between k + 1 and 6(k + 1), i.e.,
we’ve proved P (k + 1) using the assumption P (k ), the inductive
hypothesis.
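For small n, the theorem can also be verified by brute force. A sketch (illustration only; it checks the claim on a few cases, it does not replace the induction):

```python
# Enumerate all throws of n dice and collect the sums; by Theorem H.1
# these should be exactly the 5n + 1 values n, ..., 6n.
from itertools import product

for n in range(1, 5):
    sums = {sum(roll) for roll in product(range(1, 7), repeat=n)}
    assert sums == set(range(n, 6 * n + 1))
    assert len(sums) == 5 * n + 1
```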
Very often we use induction when we want to prove something
about a series of objects (numbers, sets, etc.) that is itself defined
“inductively,” i.e., by defining the (n+1)-st object in terms of the n-
th. For instance, we can define the sum sn of the natural numbers
up to n by
s0 = 0
sn+1 = sn + (n + 1)
This definition gives:
s0 = 0,
s1 = s0 + 1 = 1,
s2 = s1 + 2 = 1 + 2 = 3,
s3 = s2 + 3 = 1 + 2 + 3 = 6, etc.
Now we can prove, by induction, that sn = n(n + 1)/2.
Proposition H.2. sn = n(n + 1)/2.
Proof. We have to prove (1) that s0 = 0 · (0 + 1)/2 and (2) if
sn = n(n + 1)/2 then sn+1 = (n + 1)(n + 2)/2. (1) is obvious. To
prove (2), we assume the inductive hypothesis: sn = n(n + 1)/2.
Using it, we have to show that sn+1 = (n + 1)(n + 2)/2.
What is sn+1 ? By the definition, sn+1 = sn + (n + 1). By in-
ductive hypothesis, sn = n(n + 1)/2. We can substitute this into
the previous equation, and then just need a bit of arithmetic of
fractions:
sn+1 = n(n + 1)/2 + (n + 1)
= n(n + 1)/2 + 2(n + 1)/2
= (n(n + 1) + 2(n + 1))/2
= (n + 2)(n + 1)/2.
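The inductive definition of sn translates directly into a recursive program, which we can use to check the closed form numerically. A minimal sketch (a check on finitely many cases, not a proof):

```python
# s_0 = 0 and s_{n+1} = s_n + (n + 1), exactly as defined above.
def s(n):
    return 0 if n == 0 else s(n - 1) + n

# Compare with the closed form n(n + 1)/2 for the first 100 values.
assert all(s(n) == n * (n + 1) // 2 for n in range(100))
```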
The important lesson here is that if you’re proving something
about some inductively defined sequence an , induction is the ob-
vious way to go. And even if it isn’t (as in the case of the possibil-
ities of dice throws), you can use induction if you can somehow
relate the case for n + 1 to the case for n.
H.3 Strong Induction
In the principle of induction discussed above, we prove P (0) and
also if P (n), then P (n+1). In the second part, we assume that P (n)
is true and use this assumption to prove P (n + 1). Equivalently,
of course, we could assume P (n − 1) and use it to prove P (n)—
the important part is that we be able to carry out the inference
from any number to its successor; that we can prove the claim
in question for any number under the assumption it holds for its
predecessor.
There is a variant of the principle of induction in which we
don’t just assume that the claim holds for the predecessor n − 1
of n, but for all numbers smaller than n, and use this assumption
to establish the claim for n. This also gives us the claim P (k ) for
all k ∈ N. For once we have established P (0), we have thereby
established that P holds for all numbers less than 1. And if we
know that if P (l ) for all l < n then P (n), we know this in particular
for n = 1. So we can conclude P (2). With this we have proved
P (0), P (1), P (2), i.e., P (l ) for all l < 3, and since we have also the
conditional, if P (l ) for all l < 3, then P (3), we can conclude P (3),
and so on.
In fact, if we can establish the general conditional “for all n,
if P (l ) for all l < n, then P (n),” we do not have to establish P (0)
anymore, since it follows from it. For remember that a general
claim like “for all l < n, P (l )” is true if there are no l < n. This
is a case of vacuous quantification: “all As are Bs” is true if there
are no As, ∀x (A(x) → B(x)) is true if no x satisfies A(x). In this
case, the formalized version would be “∀l (l < n → P (l ))”—and
that is true if there are no l < n. And if n = 0 that’s exactly the
case: no l < 0, hence “for all l < 0, P (l )” is true, whatever P is.
A proof of “if P (l ) for all l < n, then P (n)” thus automatically
establishes P (0).
This variant is useful if establishing the claim for n can’t be
made to just rely on the claim for n − 1 but may require the
assumption that it is true for one or more l < n.
H.4 Inductive Definitions
In logic we very often define kinds of objects inductively, i.e., by
specifying rules for what counts as an object of the kind to be
defined which explain how to get new objects of that kind from
old objects of that kind. For instance, we often define special
kinds of sequences of symbols, such as the terms and formulas of
a language, by induction. For a simple example, consider strings
consisting of letters a, b, c, d, the symbol ◦, and brackets [ and
], such as “[[c ◦ d][”, “[a[]◦]”, “a” or “[[a ◦ b] ◦ d]”. You probably
feel that there’s something “wrong” with the first two strings: the
brackets don’t “balance” at all in the first, and you might feel that
the “◦” should “connect” expressions that themselves make sense.
The third and fourth string look better: for every “[” there’s a
closing “]” (if there are any at all), and for any ◦ we can find “nice”
expressions on either side, surrounded by a pair of brackets.
We would like to precisely specify what counts as a “nice
term.” First of all, every letter by itself is nice. Anything that’s
not just a letter by itself should be of the form “[t ◦ s ]” where s
and t are themselves nice. Conversely, if t and s are nice, then we
can form a new nice term by putting a ◦ between them and sur-
rounding them with a pair of brackets. We might use these operations
to define the set of nice terms. This is an inductive definition.
Definition H.3 (Nice terms). The set of nice terms is inductively
defined as follows:
1. Any letter a, b, c, d is a nice term.
2. If s and s′ are nice terms, then so is [s ◦ s′].
3. Nothing else is a nice term.
This definition tells us that something counts as a nice term iff
it can be constructed according to the two conditions (1) and (2)
in some finite number of steps. In the first step, we construct all
nice terms just consisting of letters by themselves, i.e.,
a, b, c, d
In the second step, we apply (2) to the terms we’ve constructed.
We’ll get
[a ◦ a], [a ◦ b], [b ◦ a], . . . , [d ◦ d]
for all combinations of two letters. In the third step, we apply
(2) again, to any two nice terms we’ve constructed so far. We get
new nice terms such as [a ◦ [a ◦ a]]—where s is a from step 1 and s′
is [a ◦ a] from step 2—and [[b ◦ c] ◦ [d ◦ b]] constructed out of the
two terms [b ◦ c] and [d ◦ b] from step 2. And so on. Clause (3)
rules out that anything not constructed in this way sneaks into
the set of nice terms.
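The staged construction just described can be mirrored in a short program. A sketch (the string representation with “◦” and the function names are our own choices, not part of the text):

```python
# Build nice terms in stages: clause (1) gives the letters, and each
# application of next_stage adds the terms clause (2) yields.
letters = {"a", "b", "c", "d"}

def next_stage(terms):
    return terms | {f"[{s} ◦ {t}]" for s in terms for t in terms}

stage = set(letters)        # step 1: a, b, c, d
stage = next_stage(stage)   # step 2: [a ◦ a], [a ◦ b], ..., [d ◦ d]
stage = next_stage(stage)   # step 3: e.g. [a ◦ [a ◦ a]]
assert "[[b ◦ c] ◦ [d ◦ b]]" in stage
```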
Note that we have not yet proved that every sequence of sym-
bols that “feels” nice is nice according to this definition. However,
it should be clear that everything we can construct does in fact
“feel nice:” brackets are balanced, and ◦ connects parts that are
themselves nice.
The key feature of inductive definitions is that if you want
to prove something about all nice terms, the definition tells you
which cases you must consider. For instance, if you are told that
t is a nice term, the inductive definition tells you what t can look
like: t can be a letter, or it can be [r ◦ s ] for some other pair
of nice terms r and s . Because of clause (3), those are the only
possibilities.
When proving claims about all of an inductively defined set,
the strong form of induction becomes particularly important. For
instance, suppose we want to prove that for every nice term of
length n, the number of [ in it is < n/2. This can be seen as a
claim about all n: for every n, the number of [ in any nice term
of length n is < n/2.
Proposition H.4. For any n, the number of [ in a nice term of length n
is < n/2.
Proof. To prove this result by (strong) induction, we have to show
that the following conditional claim is true:
If for every k < n, any nice term of length k has
< k /2 [’s, then any nice term of length n has < n/2
[’s.
To show this conditional, assume that its antecedent is true, i.e.,
assume that for any k < n, nice terms of length k contain
< k /2 [’s. We call this assumption the inductive hypothesis. We
want to show the same is true for nice terms of length n.
So suppose t is a nice term of length n. Because nice terms
are inductively defined, we have two cases: (1) t is a
letter by itself, or (2) t is [s ◦ s′] for some nice terms s and s′.
1. t is a letter. Then n = 1, and the number of [ in t is 0.
Since 0 < 1/2, the claim holds.
2. t is [s ◦ s′] for some nice terms s and s′. Let’s let k be the
length of s and k′ be the length of s′. Then the length n of
t is k + k′ + 3 (the lengths of s and s′ plus three symbols [, ◦,
]). Since k + k′ + 3 is always greater than k , k < n. Similarly,
k′ < n. That means that the induction hypothesis applies
to the terms s and s′: the number m of [ in s is < k /2, and
the number m′ of [ in s′ is < k′/2.
The number of [ in t is the number of [ in s , plus the number
of [ in s′, plus 1, i.e., it is m + m′ + 1. Since m < k /2 and
m′ < k′/2 we have:
m + m′ + 1 < k /2 + k′/2 + 1 = (k + k′ + 2)/2 < (k + k′ + 3)/2 = n/2.
In each case, we’ve shown that the number of [ in t is < n/2 (on
the basis of the inductive hypothesis). By strong induction, the
proposition follows.
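A quick empirical check of the proposition on a few nice terms (illustration only; the spaces in our string notation are stripped first, since length counts symbols):

```python
for t in ["a", "[a ◦ b]", "[[b ◦ c] ◦ [d ◦ b]]"]:
    s = t.replace(" ", "")       # length = number of symbols
    assert s.count("[") < len(s) / 2
```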
H.5 Structural Induction
So far we have used induction to establish results about all natural
numbers. But a corresponding principle can be used directly to
prove results about all elements of an inductively defined set.
This is often called structural induction, because it depends on the
structure of the inductively defined objects.
Generally, an inductive definition is given by (a) a list of “ini-
tial” elements of the set and (b) a list of operations which produce
new elements of the set from old ones. In the case of nice terms,
for instance, the initial objects are the letters. We only have one
operation:
o(s, s′) = [s ◦ s′]
You can even think of the natural numbers N themselves as being
given by an inductive definition: the initial object is 0, and the
operation is the successor function x + 1.
In order to prove something about all elements of an induc-
tively defined set, i.e., that every element of the set has a prop-
erty P , we must:
1. Prove that the initial objects have P .
2. Prove that for each operation o, if the arguments have P ,
so does the result.
For instance, in order to prove something about all nice terms,
we would prove that it is true about all letters, and that it is true
about [s ◦ s′] provided it is true of s and s′ individually.
Proposition H.5. The number of [ equals the number of ] in any nice
term t .
Proof. We use structural induction. Nice terms are inductively
defined, with letters as initial objects and the operations o for
constructing new nice terms out of old ones.
1. The claim is true for every letter, since the number of [ in
a letter by itself is 0 and the number of ] in it is also 0.
2. Suppose the number of [ in s equals the number of ], and
the same is true for s′. The number of [ in o(s, s′), i.e., in
[s ◦ s′], is the sum of the number of [ in s and s′. The
number of ] in o(s, s′) is the sum of the number of ] in s
and s′. Thus, the number of [ in o(s, s′) equals the number
of ] in o(s, s′).
Let’s give another proof by structural induction: a proper
initial segment of a string of symbols t is any string t′ that agrees
with t symbol by symbol, read from the left, but t′ is shorter. So,
e.g., [a ◦ is a proper initial segment of [a ◦ b], but neither are
[b ◦ (they disagree at the second symbol) nor [a ◦ b] (they are
the same length).
Proposition H.6. Every proper initial segment of a nice term t has
more [’s than ]’s.
Proof. By induction on t :
1. t is a letter by itself: Then t has no proper initial segments.
2. t = [s ◦ s′] for some nice terms s and s′. If r is a proper
initial segment of t , there are a number of possibilities:
a) r is just [: Then r has one more [ than it does ].
b) r is [r′ where r′ is a proper initial segment of s : Since
s is a nice term, by induction hypothesis, r′ has more
[ than ] and the same is true for [r′.
c) r is [s or [s ◦ : By the previous result, the number of [
and ] in s is equal; so the number of [ in [s or [s ◦ is
one more than the number of ].
d) r is [s ◦ r′ where r′ is a proper initial segment of s′: By
induction hypothesis, r′ contains more [ than ]. By
the previous result, the number of [ and of ] in s is
equal. So the number of [ in [s ◦ r′ is greater than the
number of ].
e) r is [s ◦ s′: By the previous result, the number of [ and
] in s is equal, and the same for s′. So there is one
more [ in [s ◦ s′ than there are ].
H.6 Relations and Functions
When we have defined a set of objects (such as the natural num-
bers or the nice terms) inductively, we can also define relations on
these objects by induction. For instance, consider the following
idea: a nice term t is a subterm of a nice term t′ if it occurs as
a part of it. Let’s use a symbol for it: t ⊑ t′. Every nice term is
a subterm of itself, of course: t ⊑ t . We can give an inductive
definition of this relation as follows:
Definition H.7. The relation of a nice term t being a subterm
of t′, t ⊑ t′, is defined by induction on t′ as follows:
1. If t′ is a letter, then t ⊑ t′ iff t = t′.
2. If t′ is [s ◦ s′], then t ⊑ t′ iff t = t′, t ⊑ s , or t ⊑ s′.
This definition, for instance, will tell us that a ⊑ [b ◦ a]. For
(2) says that a ⊑ [b ◦ a] iff a = [b ◦ a], or a ⊑ b, or a ⊑ a. The
first two are false: a clearly isn’t identical to [b ◦ a], and by (1),
a ⊑ b iff a = b, which is also false. However, also by (1), a ⊑ a iff
a = a, which is true.
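Inductive definitions of relations translate directly into recursive programs. A hedged Python sketch (the string representation and the helper are our own; splitting a term presupposes the unique readability property discussed below):

```python
def split(t):
    """Split a non-letter nice term "[s ◦ s′]" at the top-level ◦.
    This is well defined only thanks to unique readability."""
    inner, depth = t[1:-1], 0
    for i, c in enumerate(inner):
        if c == "[":
            depth += 1
        elif c == "]":
            depth -= 1
        elif c == "◦" and depth == 0:
            return inner[:i].strip(), inner[i + 1:].strip()

def subterm(t, u):
    """t ⊑ u, following the two clauses of Definition H.7."""
    if len(u) == 1:                  # clause (1): u is a letter
        return t == u
    s, s2 = split(u)                 # clause (2): u = [s ◦ s′]
    return t == u or subterm(t, s) or subterm(t, s2)

assert subterm("a", "[b ◦ a]")       # as computed above
assert not subterm("[b ◦ a]", "a")
```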
It’s important to note that the success of this definition de-
pends on a fact that we haven’t proved yet: every nice term t
is either a letter by itself, or there are uniquely determined nice
terms s and s′ such that t = [s ◦ s′]. “Uniquely determined” here
means that if t = [s ◦ s′] it isn’t also = [r ◦ r′] with s ≠ r or s′ ≠ r′.
If this were the case, then clause (2) may come in conflict with
itself: reading t′ as [s ◦ s′] we might get t ⊑ t′, but if we read t′
as [r ◦ r′] we might get not t ⊑ t′. Before we prove that this can’t
happen, let’s look at an example where it can happen.
Definition H.8. Define bracketless terms inductively by
1. Every letter is a bracketless term.
2. If s and s′ are bracketless terms, then s ◦ s′ is a bracketless
term.
3. Nothing else is a bracketless term.
Bracketless terms are, e.g., a, b◦d, b◦a◦b. Now if we defined
“subterm” for bracketless terms the way we did above, the second
clause would read
If t′ = s ◦ s′, then t ⊑ t′ iff t = t′, t ⊑ s , or t ⊑ s′.
Now b ◦ a ◦ b is of the form s ◦ s′ with s = b and s′ = a ◦ b.
It is also of the form r ◦ r′ with r = b ◦ a and r′ = b. Now is
a ◦ b a subterm of b ◦ a ◦ b? The answer is yes if we go by the first
reading, and no if we go by the second.
The property that the way a nice term is built up from other
nice terms is unique is called unique readability. Since inductive
definitions of relations for such inductively defined objects are
important, we have to prove that it holds.
Proposition H.9. Suppose t is a nice term. Then either t is a letter
by itself, or there are uniquely determined nice terms s , s′ such that t =
[s ◦ s′].
Proof. If t is a letter by itself, the condition is satisfied. So assume
t isn’t a letter by itself. We can tell from the inductive definition
that then t must be of the form [s ◦ s′] for some nice terms s
and s′. It remains to show that these are uniquely determined,
i.e., if t = [r ◦ r′], then s = r and s′ = r′.
So suppose t = [s ◦ s′] and t = [r ◦ r′] for nice terms s , s′, r ,
r′. We have to show that s = r and s′ = r′. First, s and r must
be identical, for otherwise one is a proper initial segment of the
other. But by Proposition H.6, that is impossible if s and r are
both nice terms. But if s = r , then clearly also s′ = r′.
We can also define functions inductively: e.g., we can define
the function f that maps any nice term to the maximum depth
of nested [. . . ] in it as follows:
Definition H.10. The depth of a nice term, f (t ), is defined in-
ductively as follows:
f (s ) = 0 if s is a letter
f ([s ◦ s′]) = max(f (s ), f (s′)) + 1
For instance
f ([a ◦ b]) = max(f (a), f (b)) + 1 = max(0, 0) + 1 = 1, and
f ([[a ◦ b] ◦ c]) = max(f ([a ◦ b]), f (c)) + 1 = max(1, 0) + 1 = 2.
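The definition of f is again directly executable. A sketch, reusing the split helper from the subterm example above (so it, too, relies on unique readability):

```python
def depth(t):
    """f(t) as in Definition H.10."""
    if len(t) == 1:              # a letter by itself
        return 0
    s, s2 = split(t)             # t = [s ◦ s′]
    return max(depth(s), depth(s2)) + 1

assert depth("[a ◦ b]") == 1
assert depth("[[a ◦ b] ◦ c]") == 2
```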
In Definition H.10, of course, we assume that s and s′ are nice terms, and
make use of the fact that every nice term is either a letter or of
the form [s ◦ s′]. It is again important that it can be of this form
in only one way. To see why, consider again the bracketless terms
we defined earlier. The corresponding “definition” would be:
g (s ) = 0 if s is a letter
g (s ◦ s′) = max(g (s ), g (s′)) + 1
Now consider the bracketless term a ◦ b ◦ c ◦ d. It can be read in
more than one way, e.g., as s ◦ s′ with s = a and s′ = b ◦ c ◦ d, or
as r ◦ r′ with r = a ◦ b and r′ = c ◦ d. Calculating g according to
the first way of reading it would give
g (s ◦ s′) = max(g (a), g (b ◦ c ◦ d)) + 1 = max(0, 2) + 1 = 3
while according to the other reading we get
g (r ◦ r′) = max(g (a ◦ b), g (c ◦ d)) + 1 = max(1, 1) + 1 = 2
But a function must always yield a unique value; so our “defini-
tion” of g doesn’t define a function at all.
Problems
Problem H.1. Define the set of supernice terms by
1. Any letter a, b, c, d is a supernice term.
2. If s is a supernice term, then so is [s ].
3. If t and s are supernice terms, then so is [t ◦ s ].
4. Nothing else is a supernice term.
Show that the number of [ in a supernice term s of length n is
≤ n/2 + 1.
Problem H.2. Prove by structural induction that no nice term
starts with ].
Problem H.3. Give an inductive definition of the function l ,
where l (t ) is the number of symbols in the nice term t .
Problem H.4. Prove by induction on nice terms t that f (t ) < l (t )
(where l (t ) is the number of symbols in t and f (t ) is the depth
of t as defined in Definition H.10).
About the Open
Logic Project
The Open Logic Text is an open-source, collaborative textbook of
formal meta-logic and formal methods, starting at an intermedi-
ate level (i.e., after an introductory formal logic course). Though
aimed at a non-mathematical audience (in particular, students of
philosophy and computer science), it is rigorous.
The Open Logic Text is a collaborative project and is under
active development. Coverage of some topics currently included
may not yet be complete, and many sections still require substan-
tial revision. We plan to expand the text to cover more topics in
the future. We also plan to add features to the text, such as a
glossary, a list of further reading, historical notes, pictures, bet-
ter explanations, sections explaining the relevance of results to
philosophy, computer science, and mathematics, and more prob-
lems and examples. If you find an error, or have a suggestion,
please let the project team know.
The project operates in the spirit of open source. Not only
is the text freely available, we provide the LaTeX source under
the Creative Commons Attribution license, which gives anyone
the right to download, use, modify, re-arrange, convert, and re-
distribute our work, as long as they give appropriate credit.
Please see the Open Logic Project website at openlogicproject.org
for additional information.