A Course in Symbolic Logic
Haim Gaifman
Philosophy Department, Columbia University
Copyright © 1992 by Haim Gaifman
Revised: February 1999. Further corrections: February 2002.
Contents
Introduction
1 Declarative Sentences
1.0
1.1 Truth-Values
1.1.0
1.1.1 Context Dependency
1.1.2 Types and Tokens
1.1.3 Vagueness and Open Texture
1.1.4 Other Causes of Truth-Value Gaps
1.2 Some Other Uses of Declarative Sentences
2 Sentential Logic
2.0
2.1 Sentences, Connectives, Truth-Tables
2.1.0
2.1.1 Negation
2.1.2 Conjunction
2.1.3 Truth-Tables
2.1.4 Atomic Sentences in Sentential Logic
2.2 Logical Equivalence, Tautologies, Contradictions
2.2.0
2.2.1 Some Basic Laws Concerning Equivalence
2.2.2 Disjunction
2.2.3 Logical Truth and Falsity, Tautologies and Contradictions
2.3 Syntactic Structure
2.3.0
2.3.1 Sentences as Trees
2.3.2 Polish Notation
2.4 Syntax and Semantics
2.5 Sentential Logic as an Algebra
2.5.0
2.5.1 Using the Equivalence Laws
2.5.2 Additional Equivalence Laws
2.5.3 Duality
2.6 Conditional and Biconditional
3 Sentential Logic in Natural Language
3.0
3.1 Classical Sentential Connectives in English
3.1.1 Negation
3.1.2 Conjunction
3.1.3 Disjunction
3.1.4 Conditional and Biconditional
4 Logical Implications and Proofs
4.0
4.1 Logical Implication
4.2 Implications with Many Premises
4.2.0
4.2.1 Some Basic Implication Laws and Top-Down Derivations
4.2.2 Additional Implication Laws and Derivations as Trees
4.2.3 Logically Inconsistent Premises
4.3 Fool-Proof Method
4.3.1 Validity and Counterexamples
4.3.2 The Basic Laws
4.3.3 The Fool-Proof Method
4.4 Proofs by Contradiction
4.4.0
4.4.1 The Fool-Proof Method for Proofs by Contradiction
4.5 Implications of Sentential Logic in Natural Language
4.5.0
4.5.1 Meaning Postulates and Background Assumptions
4.5.2 Implicature
5 Mathematical Interlude
5.0
5.1 Basic Concepts of Set Theory
5.1.1 Sets, Membership and Extensionality
5.1.2 Subsets, Intersections, and Unions
5.1.3 Sequences and Ordered Pairs
5.1.4 Relations and Cartesian Products
5.2 Inductive Definitions and Proofs, Formal Languages
5.2.1 Inductive Definitions
5.2.2 Proofs by Induction
5.2.3 Formal Languages as Sets of Strings
5.2.4 Simultaneous Induction
6 The Sentential Calculus
6.0
6.1 The Language and Its Semantics
6.1.0
6.1.1 Sentences as Strings
6.1.2 Semantics of the Sentential Calculus
6.1.3 Normal Forms, Truth-Functions and Complete Sets of Connectives
6.2 Deductive Systems of Sentential Calculi
6.2.1 On Formal Deductive Systems
6.2.2 Hilbert-Type Deductive Systems
6.2.3 A Hilbert-Type Deductive System for Sentential Logic
6.2.4 Soundness and Completeness
6.2.5 Gentzen-Type Deductive Systems
7 Predicate Logic Without Quantifiers
7.0
7.1 PC₀, The Formal Language and Its Semantics
7.1.0
7.1.1 The Semantics of PC₀
7.2 PC₀ with Equality
7.2.0
7.2.1 Top-Down Fool-Proof Methods For PC₀ with Equality
7.3 Structures of Predicate Logic in Natural Language
7.3.1 Variables and Predicates
7.3.2 Predicates and Grammatical Categories of Natural Language
7.3.3 Meaning Postulates and Logical Truth Revisited
7.4 PC₀, Predicate Logic with Individual Variables
7.4.0
7.4.1 Substitutions
7.4.2 Variables and Structural Representation
8 First-Order Logic
8.1 First View
8.2 Wffs and Sentences of FOL
8.2.0
8.2.1 Bound and Free Variables
8.2.2 More on the Semantics
8.2.3 Substitutions of Free and Bound Variables
8.2.4 First-Order Languages with Function Symbols
8.3 First-Order Quantification in Natural Language
8.3.1 Natural Language and the Use of Variables
8.3.2 Some Basic Forms of Quantification
8.3.3 Universal Quantification
8.3.4 Existential Quantification
8.3.5 More on First-Order Quantification in English
8.3.6 Formalization Techniques
9 FOL: Models, Truth and Logical Implication
9.1 Models, Satisfaction and Truth
9.1.0
9.1.1 The Truth Definition
9.1.2 Defining Sets and Relations by Wffs
9.2 Logical Implications in FOL
9.2.0
9.2.1 Proving Non-Implications by Counterexamples
9.2.2 Proving Implications by Direct Semantic Arguments
9.2.3 Equivalence Laws and Simplifications in FOL
9.3 The Top-Down Derivation Method for FOL Implications
9.3.0
9.3.1 The Implication Laws for FOL
9.3.2 Examples of Top-Down Derivations
9.3.3 The Adequacy of the Method: Completeness
Introduction
Logic is concerned with the fundamental patterns of conceptual thought. It uncovers structures that underlie our thinking in everyday life and in domains that have very little in common, as diverse, for example, as mathematics, history, or jurisprudence. A rather rough idea of the scope of logic can be obtained by noting certain keywords: object, property, concept, relation, true, false, negation, names, common names, deduction, implication, necessity, possibility, and others.
Symbolic logic aims at setting up formal systems that bring to the fore basic aspects of reasoning. These systems can be regarded as artificial languages into which we try to translate statements of natural language (e.g., English). While many aspects of the original statement are lost in such a translation, others are made explicit. It is these latter aspects that are the focus of the logical investigation.
Historically, logic was conceived as the science of valid reasoning, one that derives solely from the meaning of words such as 'not', 'and', 'or', 'all', 'every', 'there is', and others, or syntactical constructs like 'if ... then ...'. These words and constructs are sometimes called logical particles. A logical particle plays the same role in domains that have nothing else in common.
Here is a traditional, very simple example. From the two premises:
All animals are mortal.
All humans are animals.
we infer, by logic alone:
All humans are mortal.
The inference is purely logical; it does not depend on the meanings of 'animal' and 'human', but solely on the meaning of the construct
'all ... are ...'
The same pattern underlies the following inference, in which all non-logical concepts are different. From:
All uncharged elementary particles are unaffected by electromagnetic fields.
All photons are uncharged elementary particles.
we can infer:
All photons are unaffected by electromagnetic fields.
The two cases are regarded in Aristotelian logic as instances of the same syllogism, a certain elementary type of inference. The particular syllogism under which the two examples fall is the following scheme, where the premises are written above the line and the conclusion under it:
(1)
All Ps are Qs
All Rs are Ps
--------------
All Rs are Qs
Our first example is obtained if we substitute:
'animal' for P, 'mortal being' for Q, 'human' for R.
The second is obtained if we substitute 'uncharged elementary particle' for P, 'thing unaffected by electromagnetic fields' for Q, 'photon' for R.
Had we substituted, in the first case, 'immortal being' for Q (instead of 'mortal being'), we would have gotten the inference:
All animals are immortal.
All humans are animals.
--------------------------
All humans are immortal.
Here the first premise is false, and so is the conclusion. But the inference is correct. Its correctness does not require that the premises be true, but that they stand in a certain logical relation to the conclusion: they should logically imply it.
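To see how little the scheme depends on subject matter, substitution itself can be treated mechanically. Here is a minimal sketch (in Python; the helper name and the crude pluralization by 's' are ours, for illustration only):

    def instantiate_scheme_1(p, q, r):
        # Substitute general names for the schematic letters P, Q, R
        # of syllogism scheme (1).
        premises = ["All {}s are {}s".format(p, q),
                    "All {}s are {}s".format(r, p)]
        conclusion = "All {}s are {}s".format(r, q)
        return premises, conclusion

    # The first example: P = animal, Q = mortal being, R = human.
    premises, conclusion = instantiate_scheme_1("animal", "mortal being", "human")
    for sentence in premises:
        print(sentence)
    print("-" * 26)
    print(conclusion)

Substituting 'uncharged elementary particle', 'thing unaffected by electromagnetic fields' and 'photon' yields the second example; the scheme itself is indifferent to what P, Q and R mean.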
The use of schematic symbols is a first step in setting up a formalism. Yet there is a long way from here to a fully fledged formal language. As will become clear during the course, there is much more to a formalism than the employment of formal symbols.
Here is a different type of logical inference. From the three premises:
Either Jill went to the movie, or she went to the party.
If Jill went to the party, she met Jack there.
Jill did not go to the movie.
we can infer:
Jill met Jack at the party.
The logical particles on which this last inference is based are:
'either ... or', 'if ... then', 'not'
The same scheme is exemplified by the following inference. (Again, its validity does not mean that the premises are true, but only that they imply the conclusion.) From:
Either Ms. Hill invented her story, or Mr. Thomas harassed her.
If Mr. Thomas harassed Ms. Hill, then he lied to the Senate.
Ms. Hill did not invent her story.
we can infer:
Mr. Thomas lied to the Senate.
The scheme that covers both of these inferences can be written in the following self-explanatory notation, where the premises (here there are three) are above the line and the conclusion is below it:
(2)
A or B
If B then C
not A
-----------
C
Note that the schematic symbols of (2) are of a different kind than those of (1). Whereas in (1) they stand for general names: 'human', 'mortal being', 'photon', etc., they stand in (2) for complete sentences, such as 'Jill went to the party' or 'Mr. Thomas harassed Ms. Hill'. The part of logic that takes complete sentences as basic units, and investigates the combining of sentences into other sentences, by means of expressions such as 'not', 'and', 'or', 'if ... then', is called sentential logic. The sentence-combining operations are called sentential operations, and the resulting sentences (e.g., those in (2)) are sentential combinations, or sentential compounds. Sentential logic is a basic part that is usually included in other, richer systems of logic.
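Because the units of (2) are whole sentences, its validity can be checked by brute force: run through all eight assignments of truth-values to A, B, C and see whether the premises can all be true while the conclusion is false. A minimal sketch (in Python; rendering 'If B then C' as the material conditional anticipates the truth-tables of chapter 2):

    from itertools import product

    # Scheme (2): from "A or B", "If B then C", "not A", infer C.
    valid = True
    for A, B, C in product([True, False], repeat=3):
        premises_hold = (A or B) and ((not B) or C) and (not A)
        if premises_hold and not C:
            print("Counterexample:", A, B, C)
            valid = False

    print("Scheme (2) is valid:", valid)  # prints: Scheme (2) is valid: True

No counterexample turns up: whenever all three premises are true, so is C.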
The logic that treats, in addition to sentential operations, the attribution of properties to objects ('Jill is not tall, but pretty'), or relations between objects ('Jack likes Jill'), is predicate logic. If we add to predicate logic means for expressing general statements, like those formed in English by using 'all', 'every', 'some' ('All humans are mortal', 'Jill likes some tall man', 'Everyone likes somebody'), then we get first-order quantification logic, known also as first-order predicate logic, or, for short, first-order logic.
The examples schematized by (1) and by (2) are rather simple. In general, the recasting of a sentence as an instance of some scheme is far from obvious. It amounts to an analysis: a way of getting the sentence from components, which displays a certain logical structure. The choice of components and how, in general, we combine them is crucial for the development of logic, just as the choice of primitive concepts and basic assumptions is crucial for any science. Logic was bogged down for more than twenty centuries, because it was tied to a particular way of analyzing sentences, one that is based on syllogisms. It could not accommodate the wealth of logical reasoning that can be expressed in natural language and is apparent in the deductive sciences. To be sure, valuable insights and sophisticated analyses were achieved during that period. But only at the turn of the century, when the basic Aristotelian setup was abandoned in favour of an essentially different approach, did symbolic logic come into its own.
Examples such as (1) and (2) can serve as entry points, but they do not show what symbolic logic is about. We are not concerned merely with schematizing this or that way of reasoning, but with the construction and the study of formal languages. The study is a tool in the investigation of conceptual thought. Its aims, and its rich implications, extend beyond the linguistic dimension.
Logic is also not restricted to the first-order case. Other logical formalisms have been designed in order to handle a variety of notions, including those expressed by the terms 'possible', 'necessary', 'can', 'must', and many others. There are also numerous systems that treat a great variety of reasoning methods, which are quite different from the type exemplified in first-order logic.
A Bit of History
The beginning of logic goes back to Aristotle (384-322 B.C.). Aristotle's works contain, besides non-formal or semi-formal investigations of logical topics, a schematization and a systematic theory of syllogisms. This logic was developed and enlarged in the middle ages but remained very limited, owing to its underlying approach, which bases the logic on the grammatical subject-predicate relation. Other parts of logic, namely fragments of sentential logic, were investigated by the Stoic philosophers (330-100 B.C.). They made use of schematic symbols, which did not however amount to a formal system. That stage had to come much later.
The decisive steps in logic took place in the second half of the nineteenth century and the beginning of the twentieth, in the works of George Boole (1815-1864), Charles Peirce (1839-1914), Giuseppe Peano (1858-1932), and, first and foremost, Gottlob Frege (1848-1925), whose Begriffsschrift (1879) presents the first logical system rich enough for expressing mathematical reasoning. The other most significant event was the publication, in three volumes, of the Principia Mathematica by Russell and Whitehead (1910-1913), an extremely ambitious work from which the current formalisms derive directly. (Frege's work was little noticed at the time, though Russell and other logicians knew it.) The 'big bang' in logic was directly related to new developments in mathematics, mostly in the works of Dedekind and Cantor. It owed much to the creation, by the latter, of set theory (1874). After the work of Russell and Whitehead, the subject was taken much further by mathematicians and philosophers, among whom we find Hilbert, Ackermann, Ramsey, Löwenheim, Skolem, and, somewhat later, Gentzen, Herbrand, Church, Tarski and many others. It owes its most important results to Kurt Gödel: landmark results of deep philosophical significance.
'Critical Thinking', or What Symbolic Logic Is Not
In the middle ages logic was described as 'the art of reasoning'. It has been, and often still is, viewed as the discipline concerned with correct ways of deriving conclusions from given premises. Much of medieval logic was concerned with the analysis and the classification of arguments to be found in various kinds of discourse, from everyday life to politics, philosophy, and theology. In this way logic was related to rhetoric: the art of convincing people through talk. The medieval logicians devoted considerable effort to the uncovering of invalid arguments, known as fallacies, of which they classified a few dozen.
Perhaps a vestige of this tradition, or a renewal of it, is a logic course given in some curricula under 'Critical Thinking'. In certain cases, I am told, the name is a euphemism for teaching text comprehension, filling thereby a high-school lacuna. But, to judge by the textbooks, the course commonly comprises an assortment of topics that have to do with inference-drawing: deductive, inductive, statistics and probability, some elements of symbolic logic, and a discussion of various fallacies. There is no doubt that there is value to such an overview, or to the analysis of common fallacies. But the pretense of the title should be discounted. Critical thinking, without the scare quotes, is not something that can be taught directly as an academic subject. (Imagine a course called 'Thinking'.)
Thinking, like walking, is learned by practice. And good thinking (clear, rigorous, critical) is what one acquires in life, or through work in particular disciplines. Observations and corrective tips are useful, but they will not get you far, unless incorporated in long continuous experience. Clear thinking is a lifelong project.
A course in symbolic logic is not a course in critical thinking. What you will learn may, hopefully will, affect your thinking. You will study a certain formalism, using which you will learn, among other things, to analyze certain types of English sentences, to reconstruct them and to trace their logical relationships. These should make you more sensitive to some aspects of meaning and of reasoning. But any improvement of your thinking ability will be a consequence of the mental effort and the practice that the course requires. Good thinking does not come from learning a set of rules, or from following this or that prescription.
The Research Program of Symbolic Logic
Symbolic logic is an enterprise that takes its raw material from the existing activity of reasoning, as displayed by human beings. It focuses on certain fundamental patterns and tries to represent them in an artificial language (also called a calculus). The formal language is supposed to capture basic aspects of conceptual thought.
The enterprise comprises discovery as well as construction. It would not be accurate to say that the investigator merely uncovers existing patterns. The structures are revealed and constructed at the same time. Having constructed a formal system, we can go back and see how much of our reasoning is actually captured by it. The formalism is philosophically interesting, or fruitful, in as much as it gives us a handle on essential aspects of thought. It can also be of technical interest, for it can provide tools for language processing and computer simulation of reasoning processes. In either respect, there is no a priori guarantee of success.
We should keep in mind that, even when the formal system represents something basic or important, its significance may be tied to some particular segment of our cognitive activity. There should be no pretense of reducing the enormous wealth of our thinking to an artificial symbolic system. At least there should be no a priori conviction that it can be done. To what extent human reasoning can be captured by a formal system is an intriguing and difficult question. It has been much discussed in the context of artificial intelligence (not always with the best results).
Let us compare investigations in logic to investigations of human body-postures. A medical researcher can use x-rays and other scans, or slow-motion pictures, in order to find out how human bodies function in a range of activities. He will accumulate a vast amount of data. In order to organize his data into a meaningful account, he may classify it according to human types, establish certain regularities and formulate some general rules. Here he is already deviating from the raw material; he is introducing an abstract system, in as much as his types are idealized constructs, which actual humans only approximate. (Not to mention the fact that in the very acquiring of data he is already making use of some theoretical system.) Our investigator may also arrive at conclusions concerning the correct ways in which humans should walk, or sit, in order to preserve healthy tissue, minimize unnecessary tension, etc. His research will establish certain norms; not only will it reveal how humans use their bodies, but also how they ought to. He may even conclude that most people do not maintain their bodies as they should. In this way the descriptive merges into the normative. And there is a feedback, for the normative may provide further concepts and further guidelines for the descriptive. Our investigator's conclusions may be subject to debates, objections, or revisions. Here, as well, the descriptive and the normative are interlaced. Finally, it is possible that certain recommendations become highly influential, to the extent of being incorporated in the educational system. They would thus become part of the culture, determining, to an extent, the actual behaviour of humans, say, the way they hammer, or the kind of chairs they prefer.
All of these aspects exist when we are concerned with human thinking. Here, as well, the descriptive merges into the normative. Having started by investigating actual thinking, we might end by concluding how thinking ought to be done. Furthermore, the enterprise might influence actual thinking habits, projecting back on the very culture within which it was carried out.
First-Order Logic
The development of symbolic logic has had by now far-reaching consequences, which have deeply affected our philosophical outlook. Coupled with certain technological developments, it has also affected our culture in general. The basic system that the project has yielded is first-order logic. The name refers to a type of language, characterized, as stated in the first section, by a certain logical vocabulary. First-order logic serves also as the core of many modifications, restrictions and, most important, enlargements.
Although first-order logic is rather simple, all mathematical reasoning (derivations used in proving mathematical results) can be reproduced within it. Since it is completely defined by precise formal rules, first-order logic can itself be treated and investigated as a mathematical system. Mathematical logicians have done this, and they have been able to prove highly interesting theorems about it; for example, theorems that assert that certain statements are unprovable from such and such axioms. These and other theorems about the system are known as metatheorems; for they are not about numbers, equations, geometrical spaces, or algebras, but about the language in which theorems about numbers, equations, geometrical spaces, or algebras are proven. They enlighten us about its possibilities and limitations.
In philosophy, the development of symbolic logic has had far-reaching effects. The role of logic within the general philosophical inquiry has been a subject of debate. There is a wide spectrum of opinions, from those who accord it a central place, to those who restrict it to a specialized area. The subject's importance varies with one's interests and the flavour of one's philosophy. In any case, logic is considered a basic subject, knowledge of which is required in most graduate and undergraduate philosophy programs.
The Wider Scope of Artificial Languages
Historically, the idea of a comprehensive formal language, defined by precise rules of mathematical nature, goes back to Leibniz (1646-1716). Leibniz thought of an arithmetical representation of concepts, and dreamed of a universal formal language within which all truth could be expressed. A similar vision was also before the eyes of Frege and Russell. The actual languages produced by logicians fall short of any kind of Leibnizian dream. This is no accident, for by now we have highly convincing reasons for discounting the possibility of such a universal language. The reasons have been provided by logic itself, in the form of certain metatheorems (Gödel's incompleteness results).
As noted, there is by now a wide variety of logical systems, which express many aspects of reasoning and of conceptual organization. In the last forty years the enterprise of artificial languages has undergone a radical change due to the emergence of computer science. Computer scientists have developed scores of languages of types different from the types constructed by logicians. Their goal has not been the investigation of thought, but the description and the manipulation of computational activity. Computer languages serve to define the functioning of computers and to communicate with them, to tell a computer what to do. A major consideration that enters into the setting up of programming languages is that of efficiency: programs should be implementable in reasonable run time, on practical hardware. Usually, there is a trade-off between a program's simplicity and conceptual clarity, on one hand, and its efficiency on the other.
At the same time we have witnessed a marked convergence of some of the projects of programming languages and those of logic. For example, the programming language LISP (and its many variants) is closely connected with the logical system known as the λ-calculus, developed in the thirties by Church. The λ-calculus and its variants have been the focus of a great amount of research by logicians and computer scientists. There has also been an important direct effect of symbolic logic on computer science. The clarity and simplicity of first-order logic suggested its use as a basis for a programming language. Ways were found to implement portions of first-order logic in an efficient way, which led to the development of what is known as logic programming. This, by now, is a vast area with hundreds, if not thousands, of researchers. Logic enters also, in an essential way, into other areas of computer science, in particular, artificial intelligence and automated theorem proving.
The Goals and the Structure of the Course
The main purpose of the course is to teach FOL (first-order logic), to relate it to natural language (English) and to point out various philosophical problems that arise thereby.
The level is elementary, in as much as the course does not include proofs of the deeper results, such as Gödel's completeness and incompleteness theorems. Nonetheless, the course aims at providing a good grasp of FOL. This includes an understanding of formal syntactic structures, an understanding of the semantics (that is, of the notion of an interpretation of the language and how it determines truth and falsity), the mastering of certain deductive and related techniques, and an ability to use the formal system in the analysis of English sentences and informal reasoning.
The first chapter, which is more of a general nature, is intended to clarify the conceptual presuppositions that underlie the project of classical logic: the category of declarative sentences and the classification of all declarative sentences into true and false. Various problems that arise when trying to apply this framework to natural language are discussed, among which are indexicality, ambiguity, vagueness and open texture. This introduction is also intended to put symbolic logic into a wider and more concrete perspective, removing from it any false aura of a given truth.
We get down to the actual course material in chapter 2, which provides a first view of sentential logic, based on a semantic-oriented approach. Here the connectives are defined, together with a variety of syntactic concepts (components, main connective, unique readability and others), truth-tables and the concept of logical equivalence. The chapter also contains various simplification techniques and an algebraic perspective on the system.
Having gotten a sufficient grasp of the formalism, we proceed in chapter 3 to match the formal setup with English. The chapter discusses, with the aid of many examples, ways of expressing in English the classical connectives, the extent to which English sentences can be rendered in sentential logic and what such an analysis reveals.
Chapter 4 treats logical implications and proofs. After defining (semantically) the concept of a logical implication, the chapter presents a very convenient method of deciding whether a purported implication, from a set of premises to a conclusion, is valid. The method combines the ideas of Gentzen's calculus with a top-down derivation technique. If the implication is valid, it yields a formal proof (which can be rewritten bottom-up); if not, it produces a counterexample, thereby establishing the non-validity. In the last section we return to natural language and consider possible recastings, in a formal mode, of various inferences carried out in English. Here we also discuss some concepts from the philosophy of language, such as implicature.
Chapter 5 provides some basic mathematical tools that are needed for the rigorous treatment of logical calculi, in particular, for defining interpretations (models), giving a truth-definition, and for setting up deductive systems. These tools consist of elementary notions of set theory, and the basic techniques of inductive definitions and proofs.
In chapter 6 the formal language of the sentential calculus is defined with full mathematical rigor, together with the concept of a deductive system. Here the crucial distinction between syntax and semantics is clarified and the relation between the two is established in terms of soundness and completeness.
Chapter 7 presents predicate logic (without quantifiers), based on a vocabulary of predicates and individual constants. The equality predicate is introduced, and the top-down method for deciding logical implications is extended so as to include atomic sentences that are equalities. This chapter also treats predication in English. In the second half of that chapter the system is extended by the introduction of variables, and steps are taken towards the introduction of quantifiers.
In chapter 8 the fully fledged language of first-order logic is defined, as well as the basic syntactic concepts that go with it: quantifier scope, free and bound variables, and legitimate substitutions of terms. Emphasis is placed on an intuitive understanding of what first-order formulas express and on translations from and into English.
The general concept of a model for a first-order language is presented in chapter 9, as well as the definitions of satisfaction and truth. Based on these we get the concepts of logical equivalence and logical implication. The top-down derivation technique of sentential logic is extended to first-order logic. As before, the method is guaranteed to yield a proof of any valid implication. A proof of this claim (which is not included in this chapter) yields immediately the completeness theorem.
Chapter 1
Declarative Sentences
1.0
Symbolic logic is concerned first and foremost with declarative sentences. These are sentences that purport to make factual statements. They are true if what they state is the case, and they are false if it is not.
'Grass is green', 'Every prime number is odd', 'Not every prime number is odd', 'The moon is larger than the earth', 'John Kennedy was not the first president of the USA to be assassinated', 'Jack loves Jill, but wouldn't admit it'
are declarative sentences. The first, the third and the fifth are true. The second and the fourth are false. The last is true just in this case: (i) Jack loves Jill and (ii) Jack does not admit that he loves Jill.
You can see what distinguishes declarative sentences by comparing them with other types. Interrogative sentences, for example, are used to express questions:
'Who deduced existence from thinking?'  'Did Homer write the Odyssey?'
Such sentences call for answers, which, depending on the kind of question, come in several forms; e.g., the first of the above questions calls for the name of a person, the second for a 'yes' or a 'no'.
Commands are expressed by means of imperative sentences, such as:
'Love thy neighbour as thou lovest thyself',  'Do not walk on the grass'.
Given in the appropriate circumstance, by someone with authority, they call for compliance.
None of these, or of the other kinds of sentence, is true or false in the same sense that a declarative sentence is. We can say of a question that it is to the point, important, interesting, and so on, or that it is irrelevant, misleading or ill-posed. A command can be justified, appropriate, or illegitimate, or out of place. But truth and falsity, in the basic, elementary sense of these terms, pertain to declarative sentences only. Sentences are used in many ways to achieve diverse purposes in human interaction. To question and to command are only two of a great variety of linguistic acts. We have requests, greetings, condolences, promises, oaths, and many others. What, then, within this picture of human interaction, is the principal role of declarative sentences? It is, first and foremost, to convey information, to tell someone that such and such is the case, that a certain state of affairs obtains.
But over and above their use in human communication, declarative sentences constitute descriptions (or purported descriptions) of some reality: a reality perceived by humans, but perceived as existing in itself, independently of its being described. A logical investigation of declarative sentences can serve as a tool that clarifies the nature of that reality. By uncovering certain basic features of our thinking it may also uncover basic features of the world that the thinking organizes. One can appreciate already, at this stage, the potential that the logic of declarative sentences has for epistemology (the inquiry into the nature of knowledge) and for ontology (the inquiry into the nature of reality).
For this reason, when sentences are the target of a philosophical inquiry, the declarative ones
play the most important role. Formal methods are not restricted to declarative sentences;
formal systems have been designed for handling other types, such as questions and commands.
But symbolic logic is mostly about declarative sentences, and it is with these that we shall
be concerned here.
Henceforth, I shall use 'sentence' to refer to declarative sentences, unless indicated otherwise.
1.1 Truth-Values
1.1.0
A declarative sentence is true or false, according to whether what it states is, or is not, the case. It is very convenient to introduce two abstract objects, TRUE and FALSE, and to mark the sentence's being true by assigning to it the value TRUE, and its being false by assigning to it the value FALSE. We refer to these objects as truth-values.
Truth-values are merely a technical device. They make it possible to use concise and clear formulations. One should not be mystified by these objects and one should not look for hidden meanings. To say that a sentence has the value TRUE is just another way of saying that it is true, and to say that it has the value FALSE is no more than saying that it is false. Any two objects can be chosen as TRUE and FALSE. For the only thing that matters about truth-values is their use as markers of truth and falsity.
Notation: We use T and F as abbreviations for TRUE and FALSE.
While the introduction of truth-values is a technical convenience, the very possibility of classifying sentences into true and false is a substantial philosophical issue. Does every sentence fall under one of these categories? A little reflection will show that in our everyday discourse such a classification is, to a large extent, problematic. The problem is not one of knowing a sentence's truth-value; we may not know whether Oswald was Kennedy's only assassin, or whether 2^32 + 1 is a prime number, but we find no difficulty in appreciating the fact that, independently of our knowledge, 'Oswald was Kennedy's only assassin' is either true or false, and so is '2^32 + 1 is prime'. The problem is that in many cases it is not clear what the conditions for truth and falsity are and whether the classification applies at all. Perhaps certain sentences should on various occasions be considered as neither true nor false; which means, in our terminology, that neither T nor F is their value.
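As an aside, the truth-value of the second example can in fact be settled by a short computation (Euler found the factor in 1732); the first cannot. A minimal sketch (in Python, using nothing beyond trial division):

    # Settle the truth-value of "2^32 + 1 is prime" by trial division.
    n = 2**32 + 1  # 4294967297

    def smallest_factor(m):
        # Return the smallest non-trivial divisor of m, or None if m is prime.
        d = 2
        while d * d <= m:
            if m % d == 0:
                return d
            d += 1
        return None

    print(smallest_factor(n))  # 641, so '2^32 + 1 is prime' has the value F

The point stands: the sentence had its truth-value all along, whether or not anyone had carried out such a computation.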
The logic we are going to study, which is classical two-valued logic, assumes bivalence: the principle that every sentence has one of the two values T or F. This principle makes for systems that are relatively simple and highly fruitful at the same time. Logicians have, of course, been aware of the problems surrounding the assignment of truth-values. But in order to get off the ground, an inquiry must start by focusing on some aspects, while others are ignored. Later it may be broadened so as to handle additional features and other situations. The art is to know what to focus on and what, initially, to ignore. Classical two-valued logic has been extremely successful in contexts where bivalence prevails. And it serves also as a point of reference for further investigations, where problems of missing truth-values can be addressed. In short, we are doing what every scientist does when he starts with a deliberately idealized picture.
In the coming sections of this chapter I shall highlight the main situations where the assignment of definite truth-values is called into question. This will also be an occasion for discussing briefly some major topics regarding language: context-dependency, tokens and types, indexicals, ambiguity and vagueness.
1.1.1 Context Dependency
The same sentence may have different truth-values on different occasions of its use. Consider, for example:
Jack: 'I am tall',
Jill: 'I am tall'.
If Jack is not tall, but Jill is, then, in Jack's mouth, the sentence is false, but in Jill's mouth it is true. This shows that we are dealing here with two kinds of things: the entity referred to as 'sentence', which is the same in the mouth of Jack and the mouth of Jill, and its different utterances. The distinction is fundamental; it, and some of the phenomena that hinge on it, will now be discussed.
1.1.2 Types and Tokens
Linguistic intercourse is based on the production of certain physical items: stretches of sounds, marks on paper, and their like, which are interpreted as words and sentences. Such items are called tokens. When you started to read this section you encountered a token of 'linguistic', which was part of a token of the opening sentence. And what you have just encountered is another token of 'linguistic', this time enclosed in inverted commas.
Of course, 'token' is meaningful only in as much as it is a token of something: a word, a letter, a sentence, or, in general, some other, more abstract entity. This other entity is called a type. By a sentence-token we mean a token of a sentence, that is, a token of a sentence-type.
Note that our terms 'letter', 'word', and 'sentence' are ambiguous. Sometimes they refer to types and sometimes to tokens. This shows clearly in situations that involve counting. How many words are there on this page? The answer depends on whether you count repetitions of the same word. If you do, then you interpret 'word' as word-token; if you don't, you interpret it as word-type. Usually the number of word-tokens exceeds the number of word-types; for we do, as a rule, repeat.
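The token/type distinction is exactly the distinction a program draws when counting words. A small illustration (in Python; the sample sentence is ours, chosen for its repetitions):

    # Count word-tokens vs. word-types in a sample sentence.
    text = "the cat saw the dog and the dog saw the cat"
    tokens = text.split()   # every occurrence counts separately
    types_ = set(tokens)    # repetitions collapse into a single type

    print(len(tokens))  # 11 word-tokens
    print(len(types_))  # 5 word-types: 'the', 'cat', 'saw', 'dog', 'and'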
Our ability to use language is preconditioned by our ability to recognize different tokens as being tokens of the same type. This sameness relation is often indicated by the physical similarity of tokens. Thus, the two tokens of 'ability' in the first sentence of this paragraph have exactly the same shape. But on the whole, what counts as being tokens of the same type is a matter of convention; similarity is not necessary. Think of the different fonts one can use for the same letters, and of the enormous variety of handwritings. (Reading someone's written words is often impossible without knowing the language, even when the alphabet is known.) And to clinch the point, note that the same words are represented by tokens in different physical media: the acoustic and the visual.
Things would have been considerably simpler if we could disregard the difference between tokens of the same type. But this is not so; for, as the last example shows, different tokens of the same type may have different truth-values.
Indexicals and Demonstratives
An indexical is a word whose reference depends, in a systematic way, on certain surroundings of its token, e.g., the token's origin, its time, or its place. Such is the pronoun 'I', which refers to its utterer, and such are the words 'now' and 'here', which refer to the utterance's time and place. The shift of reference may result in a truth-value change. Indexicals are, indeed, the most common cause for assigning different truth-values to different tokens of the same sentence. In the last example the difference in truth-value is caused by the indexical 'I', which denotes Jack in the mouth of Jack, and Jill in the mouth of Jill. Quite often the indexicals are implicit. In
(1) It is raining,
the present tense indicates that the time is the time of the utterance. And, in the absence of an explicit place indication, the place is the place of the utterance. When (1) is uttered in New York, on May 17 1992 at 9:00 AM, it is equivalent to:
(1') It is raining in New York, on May 17 1992 at 9:00 AM.
It is not difficult to spot indexicals, once you are aware of their possible existence. Besides 'now' and 'here', we have also the indexicals 'yesterday', 'tomorrow', 'last week', 'next room' and many others.
Demonstratives, like indexicals, have systematically determined token-dependent references. They usually require an accompanying demonstration: some non-linguistic act of pointing. Such are the words 'that' and 'this'. The use of 'you' involves a demonstrative element (the act of addressing somebody), as do sometimes 'he' and 'she'. (It is not always easy to describe what exactly the demonstration is, but this is another matter.) Sometimes a distinction is made between pure indexicals, which, like 'I', require no demonstration, and non-pure ones. And sometimes 'indexical' is used for both indexicals and demonstratives.
Some Kinds of Ambiguity
Many, perhaps most, proper names denote different objects on different occasions. 'Roosevelt' can mean either the first or the second USA president of this name, 'Dewey' can refer either to the philosopher or to the Republican politician, 'Tolstoy' can refer to any of several Russian writers. First and second names, or initials, can help in avoiding confusion (thus, we distinguish between Teddy Roosevelt, the man who was fond of speaking softly while carrying a big stick, and Franklin Roosevelt, the Second World War leader in the wheelchair). Additional names reduce the ambiguity, but need not eliminate it. A glance in the telephone directory under 'Smith', or, in New York, under 'Cohen', will show this. Other distinguishing marks can be used: Dewey the philosopher versus Dewey the politician, Johann Strauss the father versus Johann Strauss the son.
Above all, a name's denotation is determined by the context in which the name is used. (If I ask my daughter 'Has Bill telephoned?', it is unlikely that she will take me to have referred to Bill Clinton.) But there are no clear-cut linguistic rules that regulate this. Various factors enter: what has been stated before, the topic of the discussion, and what is known of the interlocutors' knowledge and intentions. Proper names behave quite differently from indexicals; the latter are subject to systematic rules ('you' refers to the person addressed, 'now' refers to the time of the utterance, etc.), the former are not.
Besides indexicals and proper names, linguistic expressions in general may have different denotations, or meanings, on different occasions. The same word might mean different things, e.g., 'tank', a large container for storage, and 'tank', an armored vehicle on caterpillar treads. But here we should be careful, for the very difference of meaning is often taken to constitute a difference of words (i.e., of types). Homonyms are different words written and pronounced in the same way; their difference rests solely on difference in meaning. When 'tank' is split into homonyms, it is no longer a single ambiguous word. Accordingly,
(2) John jumped into the tank,
is, strictly speaking, not an ambiguous sentence (which has different truth-values on different occasions) but an ambiguous expression that can be read as more than one sentence: a sentence containing the tank-as-container homonym, and a sentence containing the tank-as-armored-vehicle homonym. The context in which (2) occurs (e.g., sentences that come before and after it) may help us to decide the intended reading.
By contrast, different tokens of 'It is now raining here' are tokens of the same sentence. For 'now' and 'here' do not constitute different words when used at different times, or at different places. A child learning to speak does not coin a new English word when he uses 'I' for the first time. We can, however, say that the English language gained a new word when 'tank' (already in use as a name of certain containers) was introduced as a name of certain armored cars.[1] Many cases of ambiguity, where the meanings are linked, do not deserve to be treated as homonyms. 'Word' can mean word-type or word-token, but this does not constitute sufficient ground for distinguishing two homonyms. We would do better, one feels, to regard it as a single ambiguous word.
[1] By the same reasoning, no new word is coined when a new baby is given a current name, like 'Henry'. But we did get a new homonym when the ninth planet was named 'Pluto'.
Ambiguous terms are not the only source of sentential ambiguity; often the sentential structure itself can be construed in more than one way.
(3) Taking the money out of his wallet, he put it on the table.
Was it the money or the wallet he put on the table? That depends on the syntactic structure of (3); it is the first if 'it' goes proxy for 'the money', the second if it goes proxy for 'his wallet'. Syntactic ambiguity takes place when the same sequence of words lends itself to different structural interpretations. The truth-value can depend on the way we structure the sentence, or, in more technical terminology, on the way we parse it. Here, again, the context can decide the intended parsing.
We can have a concept of sentence according to which different parsings determine different sentences; if so, (3) is to be regarded in the same light as (2): an expression representing more than one sentence. But on the usual, everyday concept of sentence, (3) is a single syntactically ambiguous sentence.
In symbolic logic the artificial language is set up in a way that bars any ambiguity. Every sentence has a unique syntactic structure and all referring terms have unique, context-independent references. Therefore a translation from natural language into symbolic logic involves an interpretation whereby, in cases of ambiguity, a particular reading is chosen. As a preparatory step, we can try to paraphrase the sentences of natural language, so as to eliminate various context dependencies. This is the subject of the next subsection.
Eliminating Simple Context Dependencies
Dependencies on context, which are caused by indexicals or by ambiguity, can be eliminated by replacing indexicals and ambiguous terms by terms that have unique and fixed denotations throughout the discussion.
For example, each occurrence of 'word' can be replaced by 'word-type' or by 'word-token', depending on whether the first or the second is meant; and when either will do, we can make this explicit by writing 'word-type or word-token'. Sometimes we resort to new nicknames: 'The first John' for our old school mate, 'The second John' for the new department chief. And, to be clear and succinct, we can introduce 'John₁' and 'John₂'. The same policy can be used to eliminate homonyms. To be sure, 'John₁' is not an English name, but a newly coined word. Our aim, however, is not to preserve the original phrasings, but to recast them into forms more suitable for logical analysis.
Indexicals can be eliminated by using names or descriptive phrases with fixed denotations. Thus (1), when uttered in New York at 9:00 AM, May 17 1992, is rephrased as (1'). And 'I am tall', when uttered by Jill on May 10 1992, is recast as:
(4) Jill is tall on May 10 1992.
Here 'is' is to be interpreted in a timeless mode, something like is/was/will-be. Note the different degrees of precision in the specifications of time. The weather may change from hour to hour (hence we have 9:00 AM in (1')), but presumably Jill's being tall is not subject to hourly changes.
In this way sentence-tokens that involve context-dependency are translated into what Quine named eternal sentences, that is: sentences whose truth-values do not depend on time, location, or other contextual elements.
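Since pure indexicals are governed by systematic rules, this kind of paraphrase can even be mimicked by a program, given the contextual coordinates of the token. A toy sketch (in Python; the context fields and the crude string substitutions are ours, for illustration only):

    # 'Eternalize' a token: replace pure indexicals by their
    # context-determined references.
    context = {"speaker": "Jill",
               "time": "May 17 1992, 9:00 AM",
               "place": "New York"}

    def eternalize(sentence, ctx):
        # A crude rule-based elimination of 'I', 'here' and 'now'.
        return (sentence
                .replace("I am", ctx["speaker"] + " is")
                .replace("here", "in " + ctx["place"])
                .replace("now", "on " + ctx["time"]))

    print(eternalize("It is raining here now", context))
    # It is raining in New York on May 17 1992, 9:00 AM

No such trick works for proper names and homonyms; there, as the coming paragraphs explain, the resolution is pragmatic, not rule-governed.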
Note: We are not concerned here with a conceptual elimination of indexicals. The time scale used in (1') and (4) is defined by referring to the planet earth, and earth is defined by a demonstrative: this planet, or the planet we are now on. We aim only to eliminate context dependency that can cause trouble in logical analysis. And this is achieved by paraphrases of the kind just given.
Note also that, for local purposes, i.e., if we are concerned only with a particular discourse, we have only to replace the terms whose denotations vary within that discourse. If 'today' refers to the same day in all sentence-tokens that are relevant to our purpose, we need not replace it.
The situation is altogether different when it comes to ambiguities in general. If my daughter tells me 'Bill telephoned an hour ago', I shall probably guess correctly which of the various Bills it was. But all I can appeal to is an assortment of considerations: the Bill I was expecting a call from, the Bill likely to call at that time, the Bill that has recently figured in our social life, etc. Considerations of this kind are classified in the philosophy of language under pragmatics. The resort to pragmatics, rather than to clear-cut rules, is of great interest for linguistic theory and the philosophy of language; but it is of no concern for logic, at least not the logic that is our present subject. For our purposes, it is enough that there is a paraphrase that eliminates context-dependency. Logic takes it up from there. How we get there is another concern.
The cases considered thus far are the tip of the iceberg. The real game of ambiguity and context-dependency starts when adjectives, descriptive phrases, and verbs are brought into the picture. This subject, a wide area of linguistic and philosophical investigation, is not part of this course. A few observations may however give us some idea of the extent of the problems. Consider attributes such as
'small', 'big', 'heavy', 'dark', 'high', 'fast', 'slow', 'rich',
and their like. You don't need much reflection to realize that they are relative and highly context-dependent.
(5) Lionel is big.
(6) Kitty is small.
You may deduce from (5) and (6) that Lionel is bigger than Kitty. Not so if it is known that Lionel is a cat and Kitty is a lioness. In that case the 'big' in (5) should read: a big cat, or big as cats go, and the 'small' in (6) as: a small lioness. If we apply the strategy suggested above for ambiguous names, we shall split 'big' and 'small' into many adjectives, say 'big_x' and 'small_x', where x indicates some kind of objects; the 'big' in (5) is thus read as 'big_c': big on the scale of cats, and the 'small' in (6) as 'small_l': small on the scale of lions. Another, better strategy is to provide for a systematic treatment of compounds such as 'big as a ...', 'rich as a ...', where '...' describes some (natural) class.
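A sketch of how such a relativized reading might look, stated as a comparison against a class (in Python; the classes, the length measure, the cutoff factor and the figures are all invented here for illustration):

    # 'big_x': big on the scale of kind x, read as exceeding the
    # typical length of the class by some margin.
    typical_length_cm = {"cat": 46, "lion": 250}  # invented figures

    def big_relative_to(length_cm, kind):
        return length_cm > 1.2 * typical_length_cm[kind]

    print(big_relative_to(70, "cat"))    # True: a big cat
    print(big_relative_to(180, "lion"))  # False: not a big lioness

The same object can satisfy 'big_c' while falling far short of 'big_l', which is exactly why (5) and (6) do not yield 'Lionel is bigger than Kitty'.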
Systematic treatments do not apply, however, when the adjective must be interpreted by referring to a particular occasion. 'The trunk is heavy' can mean that the trunk is heavy when I do the lifting, or when you do the lifting, or when both of us do the lifting. And occasionally there is nothing precise or explicit that we can fall back on.
(7) Jack Havenhearst lives in a high building on the outskirts of Toronto.
How high is 'high'? A high building in Jerusalem is not so high in Manhattan. The context may decide it, or it may not. Perhaps the speaker has derived his statement from some vague recollection. In cases like this, when ambiguity is tied up with vagueness, the very possession of a definite truth-value is put into question.
Before proceeding, note that the problems just mentioned concern attributes of the neutral
kind. We have not touched on evaluative terms such as
beautiful, ugly, tasteful, repulsive, nice, sexy, attractive,
and their like, which involve additional subjective dimensions, nor on:
important, significant, marginal, central,
not to mention the ubiquitous good and bad.
1.1.3 Vagueness and Open Texture
Some people are definitely bald, some are definitely not. But some are borderline cases, for whom the question: Is he bald? does not seem to have a yes-or-no answer. The same applies to every type of statement you might think of in the context of everyday discourse. For example, is it raining now? Sometimes the answer is yes, sometimes no, and sometimes neither appears satisfactory (does the present very light drizzle qualify as rain?).
Are we now in the USA? That type of question has almost always a well-defined answer, even when we don't know it; for international borders, things of extreme significance, are very carefully drawn. But what if somebody happens to straddle the border-line? There is first the problem of pinpointing one's location, and second the problem of pinpointing the border; and in both the pinpointing has limited precision. Even the question: Is the date now May 17, 1992? may, on some occasion, lack a yes-or-no answer; for the time-point defined by the utterance of 'now' is determined with no more than a certain precision, surely not up to a millisecond, say.
In everyday discourse we often handle borderline cases by employing a more refined classification. For example, we can use 'quite bald' and 'hairy' for the clear cases, and 'baldish' for those in between. This provides for more accurate descriptions. But it leaves us in the same situation when it comes to drawing the line between bald (in the old sense) and non-bald. And if we were to ban our original adjective, allowing only the refined ones, there would still be borderline cases for each of the new attributes.
Cases in which neither T nor F is to be assigned are characterized as truth-value gaps, or for short, as gaps. The cases considered before, those of indexicals and ambiguous terms, are not genuine gaps, in as much as they can be resolved by removing the ambiguity. In cases of vagueness the gaps are for real (or so many philosophers think).
Vagueness inheres in our very conceptual fabric. It does not arise because we are missing some facts. Knowing all there is to know about the hairs on Mr. Hairfew's head: their number, distribution and length, may not determine whether he is bald or not. There is no point in insisting on a yes-or-no answer. The concept is simply not intended for cases like his. If you think of it you will see that the phenomenon is all around. Only mathematics is exempt, and some theoretical parts of exact science. It appears whenever an empirical element is present.²
Vagueness has often been regarded as a flaw, something to get rid of, if possible. But it has a vital role in reducing the amount of processed information. In principle, we could, instead of using 'young', 'rich', 'bald', and their like, use descriptions that tell us one's exact age, one's financial assets to the last penny, or one's precise amount of cranial hair. All of which would involve a colossal waste of valuable resources. For in most cases a two-fold classification into young and not-young, rich and not-rich, bald and not-bald, will do. And additional information can be obtained if and when needed. The efficiency thereby achieved is worth the price of borderline cases with truth-value gaps.
A deeper reason for vagueness is that every conceptual framework gives us only a limited purchase on reality, or the facts. There is always a place for surprise, for something turning up that resists classification, something that defies our neatly arranged scheme.
The examples considered so far are relatively simple borderline cases. In these situations a certain classification does not apply, yet an alternative exhaustive description is available.
² It is not a priori impossible that some experiment will turn up a particle for which the question: is it an electron? has no clear-cut answer. The theory rules this out; but the theory may change, for its authority is established by empirical criteria. Here however we are confronted with open texture rather than with simple vagueness.
There is no mystery about the financial status of Ms. Richfield. In principle a full list of her assets, calculated to the last cent, can be drawn. The difficulty in deciding whether she is rich stems solely from the vagueness of 'rich'. But there are situations where no alternative description is available, situations that involve more than occasional borderline cases.
(8) Jeremy, the chimpanzee, knows that Jill will feed him soon.
Can we say that a monkey knows that something is going to happen in the near future? Granting the way we apply 'know' to people (itself a knotty issue and a subject of a vast philosophical literature), can we apply it, in some instances, to animals? All we can do is speculate on the monkey's mode of consciousness, dispositions, state of mind or state of brain. And it is not even clear what are the factors that are relevant for deciding the status of (8). Surely there will be conflicting opinions. Cases of this kind display the undecidedness of our conceptual apparatus, the fact that it is open-ended and may evolve in more than one way. They are known as open texture. As the example above shows, open texture involves quite common concepts. Think of generosity, freedom, or sanity.
Vagueness of Generality
General statements convey quantitative information regarding some class (or multitude) of objects. They are usually expressed by words such as 'all', 'every', 'some', 'most', and their kin. For example:

All human beings have kidneys and lungs.

In classical logic generality is expressed by quantifiers, which have precise unambiguous interpretations. But in natural language the intended extent of generality is often ambiguous, as well as vague. 'Everyone' can cover many ranges, from one's set of acquaintances to every human on earth. Consider, for example:
(9) Everyone knows that Reagan used to consult astrologers.
(10) Everyone wants to be rich and famous.
(11) Everyone will sometime die.
Only in (11) can we interpret 'everyone' as meaning every human being, the way it is construed in symbolic logic. In (9) and in (10) the intended interpretation is obviously different. In (9) 'everyone' refers to a very small minority: people who are knowledgeable about Reagan. (9) is just another way of saying that the item in question had some publicity. (10) covers a wider range than (9), but falls short of the generality of (11). Even when the range covered by 'everyone' or 'everything' is explicit, the strength of the assertion can vary. For example, when the teacher asserts

(12) Everyone in class passed the test,

she will be taken literally; her assertion would be misleading even if a single student had failed. But a casual remark:

(13) Everyone in college is looking forward to the holiday season,

means only that a large majority does; it would not be considered false on the ground of a few exceptions. How large is the required majority? This is vague.
Such phenomena are even more pronounced when the general statement, usually expressed by means of the indefinite plural, is intended to express a law, or a rule. For rules may have exceptions. (And the exceptions to this last rule are in mathematics, or some of the exact sciences, or in statements like (11).) The amount of tolerated exceptions is vague. Consider, for example:

(14) Women live longer than men,

(15) When squirrels grow heavy furs in the autumn, the winters are colder,

(16) Birds fly.
Statistical data (e.g., average life span) can be cited in support of (14); but under what conditions is the sentence true? This is vague. (15) sums up a general impression of past experience; presumably, statistics can be invoked here as well. (16), on the other hand, is better viewed as a rule that determines the normal case: If something is known to be a bird, then, in the absence of other relevant information, presume that it flies.
The general principles concerning ambiguity and vagueness apply also here. We may have to
give up precise systematic prescriptions and settle for pragmatic guidelines. And we should
accept the possibility of borderline cases, where the assignment of any truth-value is rather
arbitrary.
Cases of the types just given can, of course, be handled by using mathematical-like systems. (14) and (15) call for statistical analysis, with all the criteria that go with it. (16), on the other hand, indicates reasoning based on normalcy assumptions, where one's conclusions are retracted if additional information shows the case to be atypical (in the relevant way). Little reflection is needed to see that almost all our decision making involves reasoning of that kind. With no information to the contrary, the usual order of things is presupposed. To do otherwise would freeze all deliberate action. In recent years a great deal of research, by computer scientists, logicians and philosophers, has been devoted to systems within which reasoning that involves retractions can be expressed. They come under the general term of non-monotonic logic.
1.1.4 Other Causes of Truth-Value Gaps
Non-Denoting Terms
Declarative sentences may contain descriptive expressions that function as names but lack denotations. The standard, by now worn-out, example is from Russell:

(17) The present king of France is bald.

(It is assumed that (17) is uttered at a time when there is no king of France. If needed, the time-indexical can be eliminated by introducing a suitable date.) Proper names, as well, may lack denotation: 'Pegasus', or 'Vulcan', either the name of the Roman god, or of the non-existent planet.³
Frege held that declarative sentences containing non-denoting terms have no truth-value. This view was later adopted, for different reasons, by Strawson. Russell, on the other hand, proposed a rephrasing by which these sentences get truth-values; (17), for example, is reconstructed as:

(17′) There is a unique person who is a King of France, and whoever is a King of France is bald.

Therefore the sentence is false. Also false, by Russell's reconstruction, is:

(18) The present king of France is not bald.

But

(19) It is not the case that the king of France is bald.

is true. (The difference between (18) and (19) is accounted for by giving the negation different scopes with respect to the descriptive term 'the king of France', a point that we shall not discuss here.)
³ 'Neptune', 'Pluto', and 'Vulcan' were introduced as names of planets whose existence was deduced on theoretical grounds from the observed movements of other planets. Neptune and Pluto were later observed directly. Vulcan did not make it. The effects attributed to Vulcan were later explained by relativity theory.
As far as logic is concerned the question is more or less settled, not by a verdict in favour of one of the views, but by having the issue sufficiently clarified, so as to reduce it to a choice between well understood alternatives. It boils down to what one considers as fitting better our linguistic usage. Intuitions may vary. Nonetheless, the different resulting systems are variants within the general framework of classical logic.
Category Mistakes
In the usual order of things, almost every attribute and every relation is associated with a certain type of objects. When the objects do not fit the attribute, we get strange, though grammatically correct, sentences; for example,
(20) The number 3 is thirsty.
This is a category mistake; numbers are not the kind of things that are thirsty, or non-thirsty. Some may want to treat (20) as neither true nor false. Alternatively, (20) and its kin can be regarded as false. This policy can be extended so as to handle negations and other compounds. As in the case of non-denoting terms, the ways of dealing with such examples are well understood and can be accommodated, as variants, within the general framework of classical logic. Non-denoting terms and category mistakes, interesting as they are when it comes to working out the details, do not pose a foundational challenge to the framework of classical logic. But vagueness and open texture do.
1.2 Some Other Uses of Declarative Sentences
Declarative sentences have other uses, besides that of conveying information, or describing the world. I do not mean their misuse, through lying, or by misleading. Such misuses are direct derivatives of their ordinary use. I mean uses that are altogether different. They have been extensively studied by philosophers and linguists, and are worth noting for the sake of completeness and in order to give us a wider perspective.
Fictional Contexts
In a play, or in a movie, the players utter declarative sentences, much as people in real life do; but what goes on is obviously different. Compare, for example, an exclamation of 'Fire!' that is part of the play, with a similar exclamation, by the same actor in the same episode, when he observes a real fire breaking out in the hall. We say that the utterances in the play are not true assertions, or are not performed in an assertive mode. They are pretended assertions within a make-believe game.
Yet, within the game, they are subject to the same logic that applies to ordinary statements. Furthermore, truth-values can be meaningfully assigned to certain statements about fictional characters. 'Hamlet killed Polonius, and was not sorry about it' will be regarded as true, while 'Hamlet intended to kill Polonius' will be regarded as false. This merely reflects what is found in the play. The pretense reaches its limits easily: 'Hamlet had blood type A' is neither true nor false, or, if we adopt Russell's method, false by some legislation in logic. Consider, for contrast, 'Shakespeare had blood type A', which has a determinate truth-value; even though we do not, and probably never will, know what this value is. No logical legislation can settle this.
The declarative sentences that appear in novels, poetry, or jokes achieve a variety of effects: they can amuse, entertain, evoke an aesthetic experience, a feel or a vision. Some can enlighten us, but not in the way that 'The earth turns around the sun' does.
Metaphors, Similes, and Aphorisms
Consider the following.
Skepticism is the chastity of the intellect. (Santayana)

To deny, to believe, and to doubt well, are to a man as a race is to a horse. (Pascal)

Those who can, do; those who cannot, teach. (Shaw)
Taken literally, the first is trivially false, or a category mistake (skepticism is not the chastity of something and chastity does not apply to intellects). The second is trivially true or trivially false, depending on whether the claimed likeness is indefinitely wide (any two things are alike in some respect) or narrow and precise. The third, as a plain general statement, is false on any of the usual criteria. Evidently, the points of the sayings have little to do with their literally determined truth-values.
Many have found in metaphors (of which the first is an example) hidden meanings, which can be approximated, though not captured, by non-metaphorical rephrasings. Others have argued that the value of metaphor, what is transmitted, or evoked, is outside the scope of linguistic meaning. And yet a metaphor can be misleading in a way that a joke, or a poem, cannot. The same can be said of similes, which achieve their effect through a somewhat different mechanism. Finally there are sayings like the third, which are not to be evaluated literally, but are neither metaphors nor similes. Their point is to underline some noteworthy feature, to focus our attention on a certain pattern.
) as standing for: The conjunction of 'Jack went to the theater' and the negation of 'Jill went to the theater'. (Just so A ∧ ¬B can be read as the conjunction of A and the negation of B.) Strictly speaking, connective operations are defined for the formal setup. In English there is more than one construction that can represent a connective, e.g., there are several ways of expressing logical conjunction. But (1′)

(Jack will be home this evening) ∨ (Jack's wife will be home this evening).
Note that 'either', which marks the beginning of the left disjunct, serves (together with the comma) to show the intended grouping. Compare for example:
Jack will go to the theater, and either Jill will go to the movie or Jill will
spend the evening with Jack,
Either Jack will go to the theater and Jill will go to the movie, or Jill will
spend the evening with Jack.
They come out, respectively, as:

Jack will go to the theater ∧ (Jill will go to the movie ∨ Jill will spend the evening with Jack),

(Jack will go to the theater ∧ Jill will go to the movie) ∨ Jill will spend the evening with Jack.
'Or' (and 'either ... or') is often used to combine noun phrases, verb phrases, adjectivals, or adverbials. In this it resembles 'and'. But the distributivity problem does not arise for 'or' as it arises for 'and'; usually, we can distribute:

Jack or his wife will go to the party,

Jill will either clean her room or practice the violin,

are, respectively, equivalent to

Jack will go to the party or Jack's wife will go to the party,

Jill will clean her room or Jill will practice the violin.
Note that the following are equivalent as well:
Jill ran fast or silently,
Jill ran fast or Jill ran silently.
The problem of the adverbs applying to the same run does not arise here, as it arises in (19) for 'and', because of disjunction's truth-table: That sometime in the past Jill ran fast or silently is true just if sometime in the past Jill ran fast, or sometime in the past she ran silently. (The problem arises if 'or' is interpreted not as ∨, but exclusively; for, then, in the first sentence the exclusion affects one particular run, but in the second it affects all of Jill's past running.)
Other failures of distributivity involve combinations of 'or' with 'can', or with verbs indicating possible choices. They will be discussed later (3.1.3, page 95).
Inclusive and Exclusive Or
We have already discussed the inclusive and exclusive readings of 'or' (cf. 2.2.2, pages 34, 35). Recall that under the inclusive reading an or-sentence is analysed as a disjunction of sentential logic: it is true if one of the disjuncts is true, or if both are. Under the exclusive reading, one, and only one, should be true.
Exclusive disjunctions are truth-functional as well, but they should be recast differently, namely in the form

(A ∨ B) ∧ ¬(A ∧ B), or in the equivalent form (A ∧ ¬B) ∨ (¬A ∧ B).

Cf. 2.2.2.
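That the two recastings agree can be checked mechanically. The following Python fragment (an added illustration, not part of the book's apparatus; the names form1 and form2 are ours) tabulates both over the four truth-value combinations:

```python
from itertools import product

# The two recastings of exclusive disjunction given above.
form1 = lambda a, b: (a or b) and not (a and b)      # (A or B) & not(A & B)
form2 = lambda a, b: (a and not b) or (not a and b)  # (A & ~B) or (~A & B)

for a, b in product([True, False], repeat=2):
    assert form1(a, b) == form2(a, b)   # they agree on every row
```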
In many cases 'or' seems exclusive where an inclusive interpretation will do as well; the question of the right reading may not have a clear-cut answer.
(21) Jack is now either in New York or in Toronto.
Since Jack cannot be at the same time both in New York and in Toronto, many jump to the conclusion that the 'or' is exclusive. But this does not follow. From (21) one will infer that (i) Jack is in one of the two cities and (ii) he is not in both. But (ii) follows from the general impossibility of being in different places at the same time. It need not be part of what (21) explicitly states. The 'or' can be inclusive. Just so, if somebody says that the sun is now rising, we will naturally infer that it is rising in the east, though it is not part of the statement.
Whether the speaker who asserts (21) intends the exclusion to be part of the meaning of 'or', or only an obvious non-logical corollary, can be a question that no one, including the speaker, may settle.
Examples of the last kind, where 'or' can be read inclusively but where exclusion is nonetheless implied, come easily to mind. The strength of the implied exclusion varies, however. In
Either Edith or Eunice will marry John
the exclusion of their both marrying him is implied by the prohibition on bigamy, which is considerably weaker than the impossibility of being in two places at the same time.
And in
(22) This evening I shall either do my homework or go to the movies,
the exclusion is only suggested (by the inconvenience of doing both), but is by no means necessary. The speaker can in fact add 'or do both':

(23) This evening I shall either do my homework, or go to the movies, or do both.

By choosing to add 'or do both', or words to this effect, the speaker allows for the possibility of an exclusive interpretation. (22) and (23) are equivalent under the inclusive interpretation; under that interpretation, the additional 'or do both' is redundant. But it is not redundant if 'or' is interpreted exclusively; its addition neutralizes the effect of the exclusive interpretation. The possibility of an exclusive interpretation of 'or' is not evidenced by (21), but by cases, like (23), where one finds it appropriate to add 'or both'.
There are examples where the exclusive interpretation of 'or' is called for, e.g.,

Either you pay the fine or you go to prison.

And there are others where 'or' is obviously inclusive:

If you are either a good athlete or a first-rate student, you will be admitted,

which should be recast as: (A ∨ B) → C.
To sum up, while there is an exclusive sense of 'or', the inclusive sense, formalized by the disjunction of sentential logic, is appropriate in more cases than appears at first sight. Recall, however, that the main reasons for having ∨ rather than exclusive disjunction (our ∨ₓ) as a connective are its algebraic and formal properties. It is also directly related (as we shall see in chapter 5) to the set-theoretic operation of union.
Or with Can
(24) You can have either coffee or tea.

If we distribute the 'or' we get:

(25) (You can have coffee) ∨ (you can have tea).
Now (25) is not the equivalent of (24). Suppose you can have coffee, but you cannot have tea. Then (25) is true, because the first disjunct is, but (24) is false.

The problem with (24) is that it involves a hidden conditional. Spelled out in full, it comes out as something like:

(24′) If you ask for coffee or if you ask for tea, you'll get what you ask for.
And the best way of expressing this is as a conjunction:

(24″) If you ask for coffee (and no tea) you'll get coffee, and if you ask for tea (and no coffee) you'll get tea.

The parentheses make explicit the assumption that one does not get both coffee and tea by asking for both. This corresponds to an exclusive reading of 'or' in (24′). If this assumption is not correct, the parentheses should be omitted and the 'or' in (24′) should be read inclusively.
The equivalence of (24′) and (24″) is an instance of one of the following two equivalences. It is an instance of the first, or of the second, depending on whether the 'or' in (24′) is read exclusively or inclusively.

(A ∨ₓ B) → C  ≡  (A ∧ ¬B → C) ∧ (¬A ∧ B → C)

(A ∨ B) → C  ≡  (A → C) ∧ (B → C)
Sometimes the following sentence is used as an equivalent of (24):

(24.1) You can have coffee or you can have tea.

In this case the 'or' in (24.1) does not stand for disjunction, either exclusive or inclusive.
The phenomenon just exemplified is general; it takes place when 'can' is used to express choice or possibility:

You can choose either to marry her or not to see her again,

'Bank' can mean either a river-bank or a financial institution,

and many similar cases.
Homework
In the following homework A, B, C and D represent, respectively, the sentences:

'Ann is married', 'Barbara is married', 'Claire is married', 'Dorothy is married'.

Sentences whose truth-values depend only on which of the women are married and which are not can be naturally formalized, using ¬, ∧ and ∨. For example,
Exactly one of Barbara and Claire is married

comes out as:

(B ∧ ¬C) ∨ (¬B ∧ C)
3.3 Formalize the following as sentential compounds of A, B, C, D, using ¬, ∧ and ∨. Try to get short sentences. ('The four women' refers to Ann, Barbara, Claire and Dorothy.)
1. At least one of the four women is married.
2. Exactly one of the four women is married.
3. At least two of Ann, Barbara and Claire are unmarried.
4. If one of the four women is married, all of them are.
5. If one of the four women is unmarried, all of them are.
6. At least one of Ann and Dorothy is married and at least one of Barbara and Claire
is unmarried.
7. Ann and Claire have the same marital status.
8. Either one or three among Barbara, Claire and Dorothy are married.
3.4 Let A, B, C be as above. Using the simplified form of the sentences of Homework 2.17 (in 2.5.2), translate them into English. Use good stylistic phrasings. You do not have to reflect logical form, but the English sentences should always have the same truth-value as their formal counterparts. 'Or' is to be understood inclusively. You can use 'the three women' to refer to Ann, Barbara and Claire.
3.1.4 Conditional and Biconditional
The main English constructions that correspond to the conditional of sentential logic are:

If ..., then - - -,     If ..., - - -,     and     - - -, if ...,

where '...' corresponds to the antecedent and '- - -' to the consequent. Consider for example:

(26) If John works hard he will pass the logic examination,

or the equivalent

(26′) John will pass the logic examination if he works hard.
If John works hard but does not pass the logic examination, (26) is false; and if he works hard and passes the examination, (26) is true. So far the intuition is clear. Somewhat less clear is the case of a false antecedent, i.e., if John doesn't work hard. Note that if we are to assign a truth-value, it must be T (as in the truth-table of →). The value cannot be F, for the point of making an if...then statement is to avoid commitment to the truth of the antecedent. (26) cannot be understood to imply that John will work hard.
Some may want to say that when the antecedent is false no truth-value should be assigned. It is as if no statement is made. On this view, the very status of having a truth-value depends on the antecedent's truth. This interpretation complicates matters considerably and puts (26) beyond the scope of classical logic. Whatever its merits, it is by no means compelling. Quite reasonably we can regard (26), when the antecedent is false, as true. Just as we can judge a father, who had told his daughter:

(27) If it doesn't rain tomorrow, we shall go to the zoo,

to have fulfilled his obligation if it rained and they didn't go to the zoo. He has fulfilled it in an easy, disappointing way, but fulfilled it he has. By the same token, if the antecedent of (26) is false, the sentence is true; true by default, but true nonetheless.
'Material conditional' (or, in older terminologies, 'material implication') is sometimes used to denote our present conditional; 'material' indicates that the truth-value depends only on the truth-values of the antecedent and the consequent, not on any internal or causal connection.

The construal of if-statements as material conditionals can lead to oddities that conflict with ordinary intuition. The following statements turn out to be true, the first because the antecedent is false, the second because the consequent is true.

(28.i) If pigs have wings, the moon is a chunk of cheese,

(28.ii) If two and two make four, then pigs don't have wings.
There is more than one reason for the oddity of statements like (28.i) and (28.ii). First, we expect the speaker to be as informative as he or she can. There is no point in asserting a conditional if either the antecedent is known to be false, or the consequent is known to be true. In the first case one is expected to assert the negation of the antecedent; in the second case one is expected to assert the consequent.

Furthermore, we expect some connection between what the antecedent and consequent say, and this is totally missing in (28.i) and (28.ii). The connection need not be causal, e.g.,

(29) If five times four is twenty, then five times eight is forty.

(29) makes sense in as much as the consequent can be naturally deduced from the antecedent.
Often 'if ..., then - - -' is employed in a sense that indicates a causal relation. For example, (27) says that if they will not be hindered by rain, the persons in question (referred to by 'we') will go to the zoo. But if we construe it as a material conditional and apply the familiar equivalence

A → B  ≡  ¬B → ¬A,

we can convert it into the logically equivalent:

(27′) If we don't go to the zoo tomorrow, it will rain.

Without additional explanation (27′) looks bizarre; for it suggests that not going to the zoo has an effect on the weather.
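The familiar equivalence invoked here is easy to confirm by enumeration; the short Python check below is an added illustration, not part of the text:

```python
from itertools import product

implies = lambda p, q: (not p) or q   # truth-table of ->

# A -> B and its contrapositive ~B -> ~A agree in every row.
for a, b in product([True, False], repeat=2):
    assert implies(a, b) == implies(not b, not a)
```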
Cases like those discussed above have sometimes been called paradoxes of material implication [or material conditional]. Actually, there is nothing paradoxical here. Material conditional is not intended to reflect aspects of 'if' that pertain to causal links, temporal order, or any connections of meanings, over and above the truth-value dependence.

Non-material conditionals can be expressed in richer systems of logic, which are designed to handle phenomena that are not truth-functional.
If and Only If, Sufficient versus Necessary Conditions
In if-statements that express conditionals the antecedent is marked by 'if'. Recast as conditionals,

If ..., then - - -,     If ..., - - -,     and     - - -, if ...

come out as:

(30) (...) → (- - -).

Note that in the third expression above, the antecedent comes after the consequent. The antecedent is not marked by its place but by the prefix 'if'.

Just as 'if' marks the antecedent, 'only if' marks the consequent. As far as truth-values are concerned, to say

..., only if - - -

is to say that '...' is not true without '- - -' being true; which simply means that if '...' is true, so is '- - -'. Thus, it comes out as (30).
A condition X (fact, state of affairs, what a sentence states; we won't belabor this here) is said to be sufficient for Y just when the obtaining of X guarantees the obtaining of Y. That X is a sufficient condition for Y is often described by asserting:

If X then Y,     or     Y, if X,

or, to put it more accurately:

If ..., then - - -     or     - - -, if ...,

where '...' describes X and '- - -' describes Y.
On the other hand, X is a necessary condition for Y if Y cannot take place without X. And this is often expressed by

Y, only if X.

A sufficient condition need not be necessary; e.g., dropping the glass is sufficient for breaking it (expressed by: If the glass is dropped it will break), but it is not necessary; the glass can be broken in other ways. Vice versa, a necessary condition need not be sufficient; e.g., good health is necessary for Jill's happiness (expressed by: Jill is happy only if she is in good health), but it is not sufficient; other things are needed as well.
The confusing of necessary and sufficient conditions is quite common and results in fallacious thinking; for example, the affirmation-of-the-consequent fallacy whereby, assuming the truth of 'If ..., then - - -', one fallaciously infers the truth of '...' from that of '- - -'.
Although 'If ..., then - - -' and '..., only if - - -' come to the same when construed as conditionals, the move from the first to the second can result in statements that are rather odd. If we rewrite (27) (If it doesn't rain tomorrow, we shall go to the zoo) in the only-if form we get:

It won't rain tomorrow only if we go to the zoo,

which makes the going to the zoo a necessary condition for not raining, suggesting, even more than (27′), some mysterious influence on the weather. Underlying this are, again, the causal implications that an only-if statement can have, which disappear in the formalization. By now the matter has been sufficiently clarified.
If-Statements that Express Generality
When an if-phrase is constructed with an indefinite article and a common noun, the result is not a conditional but a generalization of one:
(31) If a woman wants to have an abortion, the state cannot prevent her.
To assert (31) is to assert something about all women. It cannot be formalized in sentential logic. We shall see that in first-order logic it is expressed by using variables and prefixing a universal quantifier before the conditional. It comes out like:

(31′) For all x, if x is a woman who wants to have an abortion, then the state cannot prevent x from having it.
By contrast, the following is expressible as a simple conditional, though it has the same
grammatical form.
(32) If Jill wants to have an abortion, the state cannot prevent her.
The expression of generality through conditionals is very common in technical contexts, when general rules are stated using variables, or schematic symbols. For example, the transitivity of > is stated in the form:

If x > y and y > z, then x > z,

meaning: for all numbers x, y, z, if x > y etc. The quantifying expression 'For all numbers x, y, z' has been omitted, but the reader has no difficulty in supplying it.
Other Ways of Expressing Conditionals
Besides 'if', there are other English expressions that mark the antecedent of a conditional. Here are some:

provided that,     assuming that,     in case that,     and sometimes when.

For example, (27) can be rephrased as

... it does not rain tomorrow, we shall go to the zoo,

where '...' can stand for each of: 'Provided that', 'Assuming that', 'In case that'. We can also change the places of antecedent and consequent: 'We shall go to the zoo tomorrow, provided that it does not rain.' As in the case of 'if', we can get an expression marking the consequent by adding 'only': '..., only in case that - - -'.
As a rule, 'when' can be used to form conditionals involving generality. For example, you can replace in (31) 'If' by 'When'; or consider:

(33) When there is a will there is a way.

(I.e., in every case: if there is a will there is a way.)

(34) A conjunction is true when both conjuncts are.

(I.e., for every conjunction, if both conjuncts are true so is the conjunction.)
This use of 'when' is possible when the temporal aspect is non-existent, as in (34), or when it is not explicit, as in (33). (In the latter, the temporal aspect was neutralized by using 'case' in the paraphrase.) But when time is explicit, we cannot recast the sentence as a generalized conditional, at least not in a straightforward way (via common names). For example:

(35) Jack and Jill will marry when one of them gets a secure job.

The temporal proximity of the two events (getting a secure job and the marriage), which is explicitly stated in (35), disappears if (35) is formalized as a conditional.

(The formalization of (35) in first-order logic requires the introduction of time points. Alternatively, it can be carried out in temporal logic, designed expressly for handling such statements. In this logic truth-values are time-dependent and there are connectives for expressing constructions based on 'when', 'after', 'until', etc.)
Unless: The harmless 'unless', which causes no problem in actual usage, can cause confusion when it comes to formalizing. '- - -, unless ...' means that if the condition expressed by '...' is not satisfied, then '- - -' is (or will be) true. In formal recasting it is:

¬(...) → (- - -).

It is a conditional whose antecedent is the negation of the clause that follows 'unless'. That clause can also come first: 'unless ..., - - -'.

Since

¬A → B  ≡  A ∨ B,

an unless-statement can be formalized as a simple disjunction:

(...) ∨ (- - -).
Reading 'unless' as 'or' is somewhat surprising. Analyzing the situation, one can see that 'unless' connotes (perhaps even more than 'if') some sort of causal connection. When we read it as 'or', this side of it disappears; hence, the surprise. We are also not used to regarding 'unless' as a disjunction. Whatever the reason, this is how 'unless' is interpreted as a truth-functional connective. A few examples will show that it is indeed the right way:

We shall go to the zoo tomorrow, unless it rains.
We shall go to the zoo tomorrow if it does not rain.
Either it will rain tomorrow, or we shall go to the zoo.

Unless you pass the exam, you will not qualify.
If you don't pass the exam, you will not qualify.
You'll pass the exam, or you will not qualify.
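That the conditional and disjunctive recastings of 'unless' agree can again be confirmed by enumerating the four rows; the Python fragment below is an added illustration:

```python
from itertools import product

implies = lambda p, q: (not p) or q   # truth-table of ->

# "---, unless ..." read as a conditional and as a disjunction:
# ~A -> B has the same truth-table as A or B.
for a, b in product([True, False], repeat=2):
    assert implies(not a, b) == (a or b)
```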
Biconditional
In natural language biconditionals are expressed by using 'if and only if':

Tomorrow will be the longest day of the year if, and only if, today is June 20.

In this form, biconditionals can serve to assert that a certain condition is both necessary and sufficient for some other condition. Note that the necessary-part is expressed by 'only if', and it corresponds to the left-to-right direction of ↔; the sufficient-part is expressed by 'if', and it corresponds to the right-to-left direction of ↔.
Since a biconditional amounts to a conjunction of two conditionals, the construal of various if-and-only-if statements as biconditionals inherits some of the problems of the conditional (material implication).

Other expressions that can be used to form biconditionals are:

just if,     just in case,     just when.

But biconditionals that are stated with 'just when' are usually general claims whose formalizations require quantifiers.
Homework 3.5 Express the following excerpts as sentential compounds. Get your basic components as small as possible. Note basic components that are equivalent to generalized conditionals and rewrite them so as to display the conditional part. For example:

The wise in heart will receive commandments, but a prating fool shall fall.

Answer: A ∧ B, where

A: The wise in heart will receive commandments,
Generalized conditional: If x is wise in heart, then x will receive commandments.

B: A prating fool shall fall,
Generalized conditional: If x is a prating fool, then x will fall.

For the sake of the exercise, you can treat an address to the reader as an address to a particular person, using 'you' as a proper name.
1. A leader is a dealer in hope.
2. Ignore what a man desires and you ignore the very source of his power.
3. Laws are like spiders' webs which, if anything small falls into them they ensnare it, but large things break through and escape.

4. If you command wisely, you'll be obeyed cheerfully.

5. You'll get well in the world if you are neither more nor less wise, neither better nor worse than your neighbors.
6. God created man and, finding him not sufficiently alone, gave him a companion to make him feel his solitude more keenly.

7. Women are as old as they feel, and men are old when they lose their feelings.
8. If there are obstacles, the shortest line between two points may be the crooked line.
9. There is time enough for everything in the course of the day if you do but one thing at
once; but there is not time enough in the course of the day if you will do two things at
a time.
10. No one can be good for long if goodness is not in demand.
11. Literary works cannot be taken over like factories, or literary forms of expression like
industrial methods.
12. If the mind, which rules the body, ever forgets itself so far as to trample upon its slave,
the slave is never generous enough to forgive the injury; but will rise and smite its
oppressor.
Chapter 4
Logical Implications and Proofs
4.0
In this chapter we introduce implication (from many premises) and we present a method of proving valid implications of sentential logic (i.e., tautological implications). It is an adaptation of Gentzen's calculus, which is easy to work with and which is guaranteed to produce either a proof, or a counterexample that shows that the given implication does not obtain in general, hence is not provable. As we shall see in chapter 9, the system extends naturally to first-order logic, where it is guaranteed to produce a proof of any given logical implication.

Returning, in the last section, to natural language, we try to represent implications that arise in English discourse as formal implications of our system. In this connection we discuss some well-known concepts in the philosophy of language, such as meaning postulates and implicature.
4.1 Logical Implication
As noted in the introduction, logic was historically considered the science of correct reasoning, which uncovers and systematizes valid forms of inference. Generally, an inference starts with certain assumptions called premises and ends with a conclusion. It is not required that the premises be true, but that they imply the conclusion; i.e., it should be impossible that the premises be true and the conclusion false.

In general, implications are not grounded in pure logic. That Jack was at a certain hour in New York implies that he was not, shortly afterwards, in Toronto. This is not a logical implication. It rests on the practical impossibility of covering the New York-Toronto distance in too short a time. If the time is sufficiently short, the impossibility may be traced to a physical law. And in the extreme case, it becomes the impossibility of being at the same time in two different places. But even this is not something that rests on pure logic.
We shall not address at this point what comes under pure logic. As in the cases of logical equivalence and logical truth (cf. chapter 2), we can say that a sentence logically implies another if it is impossible that the first be true and the second false, by virtue of the logical elements of the two sentences. In sentential logic the only logical elements are the connectives. A logical implication that rests only on the meaning of the connectives is tautological. Here is the definition.
A tautologically implies B if there is no assignment of truth-values to the
atomic sentences under which A gets T and B gets F.
As in the case of tautological equivalence (cf. 2.2.0), there is no need to go to the level of the atomic sentences. That a sentence tautologically implies another can be seen by displaying their relevant sentential structure. The definition entails the following.

A tautologically implies B if and only if they can be written as sentential expressions (with each sentential variable standing for the same sentence in both) such that there is no assignment to the sentential variables under which the expression for A gets T, and the expression for B gets F.
This means that, in a truth-table containing columns for both, there is no row in which A's value is T and B's value is F.

If A is a contradiction, then there is no truth-value assignment (to the atomic sentences) under which it gets T. Hence, for all B, there is no assignment in which A gets T and B gets F. Consequently, a contradiction tautologically implies all other sentences. By a similar argument, every sentence tautologically implies every tautology. We shall return to this later.
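The definition is mechanical, and can be sketched as a short program. In the Python fragment below (an added illustration; the function name and the encoding of sentences as functions of an assignment are ours), an implication holds just if no assignment is a counterexample row:

```python
from itertools import product

def tautologically_implies(A, B, atoms):
    """A |= B: no assignment to the atomic sentences gives A the value T
    and B the value F."""
    for values in product([True, False], repeat=len(atoms)):
        v = dict(zip(atoms, values))
        if A(v) and not B(v):
            return False          # this row is a counterexample
    return True

# P and Q tautologically implies P or Q, but not vice versa.
print(tautologically_implies(lambda v: v['P'] and v['Q'],
                             lambda v: v['P'] or v['Q'], ['P', 'Q']))   # True
print(tautologically_implies(lambda v: v['P'] or v['Q'],
                             lambda v: v['P'] and v['Q'], ['P', 'Q']))  # False
```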
Note that tautological implication is a certain type of logical implication. In sentential logic the two are the same. But in more expressive systems, in particular in first-order logic, there are logical implications that are not tautological.
Terminology and Notation:
'Logical', in the context of sentential logic, means tautological. In the present chapter the terms are interchangeable. For the sake of brevity we often use 'implication' for logical implication.

|= denotes logical implication, that is:

A |= B

means that A logically implies B. If A |= B, we say that B is a logical consequence of A.
'|=' is a shorthand for 'logically implies'. Like '≡', it belongs to our English discourse, not to the formal system. To say that A |= B is to claim that A logically implies B.
Note: Terms such as 'implication' and 'equivalence' are used mostly with respect to sentences, or sentential expressions, of our formal system. But we use them also with respect to our own statements. E.g., we can say that A ≡ A′ implies that ¬A ≡ ¬A′, and we speak about the implication

A |= A′  ⟹  ¬A′ |= ¬A.

And here 'implication', which is denoted by '⟹', refers to our own statements. Similarly, we may speak of the equivalence

A ≡ B  ⟺  B ≡ A.

A similar ambiguity surrounds 'consequence'. The intended meaning of these, and other two-level terms, should be clear from the context.
If two sentences are equivalent, then under all truth-value assignments (to the atomic components) they get the same truth-value. Hence they imply each other. Vice versa, if they imply each other, then there is no assignment under which one gets T and the other gets F; therefore they are equivalent. Hence, sentences are equivalent just when they imply each other:

(1) A ≡ B  ⟺  A |= B and B |= A.
(1) shows how logical equivalence can be defined in terms of logical implication. On the other hand, using conjunction, we can express implication in terms of equivalence:

(2) A |= B  ⟺  A ≡ A ∧ B.

The argument for (2) is easy:

Assume that A |= B; then (i) if A gets T, so does B and so does A ∧ B; and (ii) if A gets F, then A ∧ B gets F. Therefore A and A ∧ B always get the same value.

Vice versa, if A ≡ A ∧ B, it is impossible that A gets T and B gets F; for then A and A ∧ B would get different values.
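For a concrete instance of (2), one can pick simple compounds for A and B and check both sides by enumeration. The following Python fragment is an added illustration with sample choices A = P ∧ Q and B = P ∨ Q:

```python
from itertools import product

rows = list(product([True, False], repeat=2))
A = lambda p, q: p and q          # a sample A
B = lambda p, q: p or q           # a sample B

# A |= B: no row with A true and B false.
implication = all(not (A(p, q) and not B(p, q)) for p, q in rows)
# A is equivalent to A & B: same value in every row.
equivalence = all(A(p, q) == (A(p, q) and B(p, q)) for p, q in rows)
print(implication, equivalence)   # True True, as (2) predicts
```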
One can, nonetheless, argue that implication is the more basic notion. It corresponds directly to inferring. Moreover, logical equivalence is reducible to it without employing connectives, but not vice versa.¹ As we shall presently see, the most basic notion is that of implication with one or more premises.

¹ The reason for treating, in this book, equivalence before implication is didactic. Its analogy with equality makes equivalence more accessible and enables one to use algebraic techniques.
Using conditional, we can express logical implication in terms of logical truth.

(3) A |= B  ⟺  A → B is logically true.
Homework 4.1 Prove (3), via the same type of argument used in proving (2).
The following properties of implication are easily established.

Reflexivity: A |= A.

Transitivity: If A |= B and B |= C, then A |= C.
Transitivity is of course intuitively implied by the very notion of implication. The detailed argument is trivial:²

Assuming that A |= B and B |= C, one has to show that there is no assignment under which A gets T and C gets F. So assume that A gets T. Then B must get T, because A |= B. But then C must get T, because B |= C.
Logical implications are preserved when we substitute the sentences by logically equivalent ones:

(4) If A ≡ A′ and B ≡ B′, then A |= B iff A′ |= B′.

One can derive (4) immediately from the definitions, by noting that logical implication is defined in terms of possible truth-values and that logically equivalent sentences always have the same value.
((4) is also derivable from (1) via the transitivity of implication: If A ≡ A′ then, by (1), A′ |= A. Similarly, if B ≡ B′, then B |= B′. If also A |= B, we get:

A′ |= A,   A |= B,   B |= B′.

Applying transitivity twice we get A′ |= B′. In the same way we derive A |= B from A′ |= B′.)
(4) implies that, in checking for logical implications, we are completely free to substitute sentences by logically equivalent ones. We can therefore use all the previous simplification techniques in order to reduce the problem to one that involves simpler sentences.
² It is trivial if we define implication by appealing to assignments to atomic sentences. If we want to bypass atomic sentences we have to show that the two implications A |= B and B |= C can be founded on sentential expressions in which the three sentences are generated from the same stock of basic components (see also footnote 4 in chapter 2, page 32). This can be done by using unique readability.
Every case of logical equivalence is, by (1), a case of two logical implications, from left to right and from right to left. But generally implications are one-way. Here are a few easy examples in which the reverse implication does not hold in general.

(i) A ∧ B |= A
(ii) A |= A ∨ B
(iii) B |= A → B
(iv) ¬A |= A → B
(v) A ∧ B |= A ↔ B
(vi) A ↔ B |= A → B

That the reverse implications do not hold in general can be seen by assigning the sentential variables truth-values under which the left-hand side gets T and the right-hand side gets F. We can interpret them as standing for atomic sentences that can have these values. For example, in the case of (i), let A get T and let B get F.

Note that we can also force this assignment by interpreting the variables as standing for tautologies or contradictions. For example let A = C ∨ ¬C, B = C ∧ ¬C.
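The search for such falsifying assignments can itself be mechanized. The sketch below (an added Python illustration, using the same encoding of sentences as functions of an assignment) returns a counterexample assignment when an implication fails:

```python
from itertools import product

def counterexample(A, B, atoms):
    """Return an assignment giving A the value T and B the value F,
    or None if A |= B."""
    for values in product([True, False], repeat=len(atoms)):
        v = dict(zip(atoms, values))
        if A(v) and not B(v):
            return v
    return None

# Reverse of (i): does A |= A & B hold in general?
print(counterexample(lambda v: v['A'],
                     lambda v: v['A'] and v['B'], ['A', 'B']))
# -> {'A': True, 'B': False}, the assignment mentioned in the text
```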
In particular cases, the right-to-left implication holds as well. For example:

If B is logically true (e.g., B = C → C), then A |= A ∧ B.

If B is logically false, then A → B |= ¬A.

Here, as an illustration, is the argument for the second statement.

Assume that B is logically false. If A → B gets T, then, since B gets F (being logically false), A must get F. Hence, ¬A gets T. There is, therefore, no assignment (to the atomic sentences) in which A → B gets T and ¬A doesn't.
Homework 4.2 Find, for each of (i)-(vi) above, whether the reverse implication holds for all B, in each of the following cases:

(1) A is logically true.   (2) A is logically false.

Altogether you have to check 12 cases. Prove every positive answer by an argument of the type given above. Prove every negative answer by a suitable counterexample.
Note: We can define a notion of logical implication that applies to sentential expressions. This is completely analogous to the case of logical equivalence and logical truth (cf. 2.2). We shall discuss it in 4.3.1, in the more general context of implication from many premises.
4.2 Implications with Many Premises
4.2.0
Implications with several premises are a natural generalization of the one-premise case. The sentences A₁, A₂, ..., Aₙ logically imply the sentence B, and B is a logical consequence of A₁, ..., Aₙ, if it is impossible, by virtue of the logical elements of the sentences, that all the sentences Aᵢ are true and B is false. The notation is generalized accordingly:

A₁, ..., Aₙ |= B

The sentences A₁, A₂, ..., Aₙ are referred to as premises and B as the conclusion.
The precise definition, in the case of sentential logic, is a straightforward generalization of the one-premise case. Here it is:

A₁, ..., Aₙ |= B, if there is no truth-value assignment to the atomic sentences under which all the Aᵢ's get T and B gets F.
Again, this entails a characterization in terms of sentential expressions, without appealing to atomic sentences:

A₁, ..., Aₙ |= B, iff all the premises and the conclusion can be written as sentential expressions, such that in a truth-table containing columns for all, there is no row in which all the Aᵢ's get T and B gets F.
Notation: We refer to sequences such as A₁, ..., Aₙ as lists of sentences, and we use

Γ, Δ, Γ′, Δ′, ... etc.,

for denoting such lists. Thus, if Γ = A₁, A₂, ..., Aₙ then

Γ |= B means that A₁, ..., Aₙ |= B.

Furthermore, we use notations such as Γ, A and Γ, Δ for lists obtained by adding a sentence and by combining two lists:

If Γ = A₁, ..., Aₙ and Δ = B₁, ..., Bₖ,

then

Γ, A = A₁, ..., Aₙ, A   and   Γ, Δ = A₁, ..., Aₙ, B₁, ..., Bₖ.
It is obvious that, as far as logical consequences are concerned, the ordering of the premises makes no difference. Also repeating a premise, or deleting a repeated occurrence, makes no difference. For example,

A, B, A, C, D, D;   B, A, C, D;   and   A, B, C, D

have the same logical consequences. Such rewriting of lists will, henceforth, be allowed as a matter of course.
It should be evident by now that, in dealing with logical implications, we can apply the substitution-of-equivalents principle: any sentences among the premises and the conclusion can be substituted by logically equivalent ones. Spelled out in detail, it means this:

If A₁ ≡ A′₁, and ... and Aₙ ≡ A′ₙ, and B ≡ B′,

then

A₁, ..., Aₙ |= B  ⟺  A′₁, ..., A′ₙ |= B′.
The Empty Premise List
We include among the lists the empty list, one that has no members. To be a logical consequence of the empty list means simply to be a logical truth. (If Γ is empty, then, vacuously, all its members are true. Hence, to say that it is impossible that all members of Γ are true and B is false is simply to say that it is impossible that B is false.)

Logical implication by the empty list is expressed by writing nothing to the left of '|='. Therefore

|= B

means that B is a logical truth. In the case of sentential logic, it means that B is a tautology.
By using conjunction, we can reduce an implication from A₁, ..., Aₙ to the single-premise case:

(5) A₁, ..., Aₙ |= B  ⟺  A₁ ∧ ... ∧ Aₙ |= B.

(5) is proved by observing that all the Aᵢ's get T just when their conjunction, A₁ ∧ ... ∧ Aₙ, gets T.
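The many-premise definition, and its agreement with the conjunctive reduction (5), can be sketched as follows (an added Python illustration; the encoding of sentences as functions of an assignment is ours):

```python
from itertools import product

def implies_from(premises, B, atoms):
    """A1, ..., An |= B: no assignment makes every premise T and B F."""
    for values in product([True, False], repeat=len(atoms)):
        v = dict(zip(atoms, values))
        if all(A(v) for A in premises) and not B(v):
            return False
    return True

A1 = lambda v: v['P']
A2 = lambda v: (not v['P']) or v['Q']       # P -> Q
B  = lambda v: v['Q']

# The premise list and its single conjunctive premise agree, as (5) says:
print(implies_from([A1, A2], B, ['P', 'Q']))                      # True
print(implies_from([lambda v: A1(v) and A2(v)], B, ['P', 'Q']))   # True
```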
Implications from premise lists constitute, nonetheless, an important advance on single-premise implications. First, they do not necessitate the use of conjunctions. Second, it is easier to grasp an implication stated in terms of several premises, instead of using a single long conjunction. Third, the premise list can be infinite. The definition of implication works equally well for that case, but the reduction to single premises, via (5), breaks down (unless we admit infinite conjunctions, which is a radical extension of the system). In this book we shall restrict ourselves to finite premise lists. Yet the infinite case has its uses. Fourth, by including the possibility of an empty list, we incorporate logical truth within the framework of logical implication. As we shall see, the rules for implications lead to useful methods for establishing logical truth.

Finally and most important, there are nice sets of rules for establishing implications, which depend essentially on the possibility of having many premises.
4.2.1 Some Basic Implication Laws and Top-Down Derivations
Our previous (3) (which characterizes implication from a single premise in terms of logical truth) can now be stated as:

(6) A |= B  ⟺  |= A → B

And this can be generalized to the following important law:

(7) Γ, A |= B  ⟺  Γ |= A → B
(6) is a particular case of (7), obtained when Γ is empty. Here is the proof of (7):

To prove the left-to-right direction, assume that Γ, A |= B and show: Γ |= A → B, i.e., that it is impossible (by virtue of the logical elements) that all members of Γ are true and A → B is not. Assume a case where all members of Γ get T. If A gets F, then A → B gets T (by the truth-table of →). And if A gets T, then all members of Γ, A get T; since we assumed that Γ, A |= B, B gets T. Again, by the truth-table of →, A → B gets T.

To prove the right-to-left direction, assume that Γ |= A → B, and show: Γ, A |= B, i.e., that it is impossible (by virtue of the logical elements) that all members of Γ, A get T and B gets F. So assume that all members of Γ, A get T. Then (i) all members of Γ get T and (ii) A gets T. Having assumed that Γ |= A → B, it follows that A → B gets T. Since also A gets T, it follows, by the truth-table of →, that B gets T.
Note that the argument relies on the logical elements of the sentences in Γ, A, B and on the truth-table of →, which is itself a logical element.

(7) provides a very useful way of establishing implications in which the conclusion is a conditional. We can refer to it by the (rather unwieldy) name 'conclusion-conditional law'. We also mark it by the following self-explanatory notation:

(|=, →).
Here is an illustration of how (|=, →) can work. Suppose that we want to show that:

|= A → (B → A)

Using (|=, →) (where Γ is empty and B is substituted by B → A) this reduces to showing that:

A |= B → A

Again, using (|=, →) (where Γ consists of A, A is substituted by B, and B by A), this reduces to:

A, B |= A

But this last implication is obvious, because the conclusion occurs as one of the premises. Thus, we have established the logical truth of our original sentence.
The argument just given is an example of a top-down proof, or top-down derivation: We start with the claim to be proved and, working our way backward, we keep reducing it to other sufficient claims (i.e., claims that imply it), until we reduce it to obviously true claims. We can then turn the argument into a bottom-up proof: the familiar kind that starts with obviously true claims and moves forward, in a sequence of steps, until the desired claim is established. The bottom-up proof is obtained by inverting the top-down derivation. In our last example the resulting bottom-up proof is:

A, B |= A            obvious,
A |= B → A           by (|=, →),
|= A → (B → A)       by (|=, →).
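The two reduction steps used above are mechanical enough to be sketched as a tiny program. The following Python fragment (an added illustration covering only this fragment of the calculus: the law (|=, →) plus the 'obvious' case) replays the derivation of |= A → (B → A):

```python
def prove(premises, goal):
    """Top-down reduction using only (|=, ->) and the obvious case in
    which the conclusion occurs among the premises.  A sentence is an
    atom (a string) or a pair ('->', antecedent, consequent)."""
    if goal in premises:                     # obvious: goal is a premise
        return True
    if isinstance(goal, tuple) and goal[0] == '->':
        _, a, b = goal
        return prove(premises + [a], b)      # the law (|=, ->)
    return False

# |= A -> (B -> A), the example worked out above:
print(prove([], ('->', 'A', ('->', 'B', 'A'))))   # True
```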
The implications occurring in top-down derivations are referred to as goals. The derivation
starts with the initial goal (the implication we want to prove) and proceeds stepwise by
reducing goals to other goals, until all the required goals are self-evident. The top-down
method figures prominently in the sequel. Besides (|=, →), we shall avail ourselves of other
laws. It goes without saying that all the laws are general schemes holding for all possible
values of the sentential variables. Here are some.
(8)   If Γ |= A and every sentence that occurs in Γ occurs in Γ′, then Γ′ |= A.
(8) says that the addition of premises can only increase the class of implied consequences. This
property is the monotonicity of logical implication, or of the logical consequence relation.
Given our definition of implication, (8) is trivial (if it is impossible that all the sentences in
Γ get T and A gets F, then, a fortiori, it is impossible that all sentences in Γ′ get T and A
gets F).
Note: Monotonicity obtains for many types of implication, not necessarily logical, and
it seems quite obvious: by adding more premises one can only get more consequences, not
fewer. Yet we often employ modes of reasoning that are not monotone. Conclusions established on the
basis of some information may be withdrawn when additional information is obtained.
A well-known example: being told that Twitty is a bird, one will conclude that Twitty
can fly; but one will withdraw this conclusion if told, in addition, that Twitty is a penguin.
Inferences of that nature have been, in the last twenty years, the subject of considerable
research by logicians, computer scientists and philosophers working in the area of belief change
and artificial intelligence. Various formal systems have been proposed. They come under the
title of non-monotonic logic.
Not as obvious as (8), but still quite easy, is the following law of consequence addition:

(9)   If Γ |= A, then, for every sentence B:
      Γ |= B  ⇔  Γ, A |= B

(9) means that any consequence of the given premises can be added as an additional premise,
without changing the class of consequences. Here is the proof.
The left-to-right direction of the ⇔ follows from monotonicity. For the
right-to-left direction, assume that Γ, A |= B. If all sentences in Γ get T,
then, by the initial assumption (that Γ |= A), A must get T; hence, all
sentences in Γ, A get T; therefore B gets T. Thus, it is impossible that all
sentences in Γ get T and B gets F.
The following generalization of (9) allows us to add as premises many consequences of the
original list.

(9′)  If Γ |= B′ for every sentence B′ in Δ, then, for every sentence B:
      Γ |= B  ⇔  Γ, Δ |= B

(9′) can be proved by the same reasoning that proves (9). It can also be deduced by re-
peated applications of (9). (Assume that Δ = B1, . . ., Bm; then every Bi is a consequence
of Γ. By (9), Γ and Γ, B1 have the same consequences. Since B2 is a consequence of Γ, it is,
by monotonicity, a consequence of Γ, B1; again, by (9), Γ, B1 and Γ, B1, B2 have the same
consequences. Therefore Γ and Γ, B1, B2 have the same consequences, etc.)
Among the laws established in this section is the disjoining law, by which a disjunction
occurring as a premise may be replaced by one of its disjuncts, when the negation of the other
disjunct is also a premise:

(12)  Γ, A ∨ B, ¬A |= C  ⇔  Γ, B, ¬A |= C

The following generalizes it:

(12′) If ¬A′ |= ¬A, then:
      Γ, ¬A′, A ∨ B |= C  ⇔  Γ, ¬A′, B |= C

To show (12′), assume that ¬A′ |= ¬A. Then the addition of ¬A to any list containing ¬A′ yields
an equivalent list. Hence, Γ, ¬A′, A ∨ B is equivalent to Γ, ¬A′, ¬A, A ∨ B, which, by (12), is
equivalent to Γ, ¬A′, ¬A, B. And this last list is equivalent to Γ, ¬A′, B, since it is obtained from
it by adding ¬A.
Here is an example of a top-down derivation that uses some of the listed laws. We want to
show that

|= [¬A ∨ (B ∨ C)] → [¬B → (A → C)]

Starting with this as our initial goal, we keep reducing each goal to another sufficient goal
and we write the goals on separate, numbered lines. Indicated in the margin is the law (or
laws) by which the preceding implication is reduced to the current one. The sign √ marks
obvious implications that need no further reductions.
1. |= [¬A ∨ (B ∨ C)] → [¬B → (A → C)]        initial goal,
2. ¬A ∨ (B ∨ C) |= ¬B → (A → C)              by (|=, →),
3. ¬A ∨ (B ∨ C), ¬B |= A → C                 by (|=, →),
4. ¬A ∨ (B ∨ C), ¬B, A |= C                  by (|=, →),
5. B ∨ C, ¬B, A |= C                         by disjoining,
6. C, ¬B, A |= C   √                         by disjoining.

Note that the reduction from 4. to 5. uses an instance of disjoining, whereby ¬A ∨ (B ∨ C), A
is replaced by B ∨ C, A.
For the sake of brevity, we can write the three steps from 1. to 2., from 2. to 3., and from 3.
to 4. as a single step:

|= [¬A ∨ (B ∨ C)] → [¬B → (A → C)]           initial goal,
¬A ∨ (B ∨ C), ¬B, A |= C                     by three applications of (|=, →).
In a similar way, we can write the steps from 4. to 5. and from 5. to 6. as a single step in
which disjoining is applied twice.

The bottom-up proof of the initial goal is obtained by reversing the list: start from 6. and
end at 1., justifying each step by the indicated rule; 5. is obtained from 6. by disjoining, 4.
from 5. by disjoining, 3. from 4. by (|=, →), etc.

From now on we will omit initial goal in the margin of the first line.
Here is another example, where substitution-of-equivalents is used as well. The equivalences
we use are:

B ∨ C ≡ ¬B → C     and     A ∨ B ≡ B ∨ A

The first equivalence is used in getting 2. from 1., the second in getting 4. from 3.; in the
first case substitution is applied to the conclusion, in the second case to one of the premises.
1. A ∨ B, ¬A ∨ C |= B ∨ C
2. A ∨ B, ¬A ∨ C |= ¬B → C                   substitution of equivalents,
3. ¬B, A ∨ B, ¬A ∨ C |= C                    by (|=, →),
4. ¬B, B ∨ A, ¬A ∨ C |= C                    substitution of equivalents,
5. ¬B, A, ¬A ∨ C |= C                        by disjoining,
6. ¬B, A, C |= C   √                         by disjoining.
Homework 4.3 Using the laws introduced so far, prove, via top-down derivations, the
following implications. The goal should be reduced in the end to an obvious implication
in which the conclusion is one of the premises. You can use substitution-of-equivalents based
on simple equivalences of the kind given in the last example.

In the derivations of 4. and 5. you can use laws (12) and (12′).
1. |= (A → B) ∨ A

2. |= A ∧ B → (A ↔ B)
3. A → B, ¬A → C |= B ∨ C

4. (A → B) → A |= A

5. |= [C → (A ∨ B)] → [(C → A) ∨ (C → B)]

6. (A ∨ B) → A ∧ B |= A ↔ B

7. [A → (A ∧ B)] ∧ [B → (A ∧ B)] |= A ↔ B

8. |= [A → (B → C)] ↔ (A ∧ B → C)
4.2.3 Logically Inconsistent Premises
A premise-list is said to be logically inconsistent, or inconsistent for short, if it is impossible,
by virtue of pure logic, for all of its members to be true. This is equivalent to saying that the
conjunction of all the premises is logically false. In the case of sentential logic, where the
connectives are the only logical elements, we say that the premise-list is contradictory.
Given a logically inconsistent premise-list and any sentence B, it is impossible that all the
premises are true and B is false, because it is impossible that all the premises are true. Hence,
by the definition, an inconsistent premise-list implies any sentence:

If Γ is inconsistent, then Γ |= B, for all B.
This is sometimes expressed by saying that a contradiction implies everything, and is often a
source of misunderstandings. Some accept it as a strange, or deep, truth of logic. And some
may find it a defect of the system. Actually, there is no mystery and no ground for objection.
Logical implication is a technical concept defined for particular purposes. It captures some
of our intuitions concerning implication; but it does not, and is not intended to, capture
all. Using imply in the somewhat vague, everyday sense, we will never say that the two
contradictory premises,

(i) John Kennedy was assassinated by Lee Oswald,

(ii) John Kennedy was not assassinated by Lee Oswald,

imply that once there was life on Mars. For we require some internal link between the premises
and the conclusion. But there is nothing wrong in introducing a more technical variant of
implication, well-defined in terms of possible truth-values, by which a contradiction does imply
every sentence. (Similar points relating to logical equivalence have been discussed already.)
The law that every sentence is implied by contradictory premises is very useful when it comes
to deriving and checking logical implications. We shall regard the trivial instances of this law,
where the premise-list contains a sentence and its negation, as self-evident implications that
need no further proof. This means that every implication of the form

Γ, A, ¬A |= B

is a possible termination point in a top-down derivation. We therefore allow two kinds of
successful leaves (that can be marked by √): those in which the conclusion appears among
the premises, and those in which the premises contain a sentence and its negation. Here is an
example showing the use of the second kind.
1. (C → ¬A) ∨ (C → ¬B) |= C → (¬A ∨ ¬B)
2. (C → ¬A) ∨ (C → ¬B), C |= ¬A ∨ ¬B        by (|=, →),
3. (C → ¬A) ∨ (C → ¬B), C, A |= ¬B          by (|=, ∨),
4.1 C → ¬A, C, A |= ¬B                      by (∨, |=),
4.2 C → ¬B, C, A |= ¬B                      by (∨, |=),
5.1 C, ¬A, A |= ¬B   √                      by disjoining,
5.2 C, ¬B, A |= ¬B   √                      by disjoining.

5.1 is marked as successful, because the premises contain a sentence and its negation; 5.2 is
marked because the conclusion is among the premises.
Note: We can now derive disjoining from the other laws as follows. By (∨, |=): Γ, A ∨
B, ¬A |= C iff Γ, A, ¬A |= C and Γ, B, ¬A |= C. But the first implication is obvious. Hence,
Γ, A ∨ B, ¬A |= C iff Γ, B, ¬A |= C.
Homework 4.7 Show, via top-down derivations, that the following implications obtain.
The final goals should be of the two allowed kinds of self-evident implications.
1. |= (A → ¬A) → ¬A

2. |= (A → B) ∨ (B → A)

3. A ∧ B → C |= (A → C) ∨ (B → C)

4. |= (A → B) ∧ (¬A → B) → B

5. ¬A → B ∧ C, (B → D) ∧ (C → ¬D) |= A ∨ C

6. A → B, B → ¬B |= ¬A

7. A → B ∧ C, ¬B ∨ ¬C |= ¬A

8. A ∧ B → C, A ∧ ¬B → C |= A → C
4.3 A Fool-Proof Method for Finding Proofs and Counterexamples
4.3.1 Validity and Counterexamples
All the preceding implication laws, and the equivalence laws of chapter 3, can be regarded as
general schemes. They hold no matter what sentences we substitute for the sentential vari-
ables. Therefore the derivable implications are schemes as well; having proved an implication
we have also proved all its instantiations.
To be precise, we should consider here (as we have done before) the sentential expressions. We
can consider lists of sentential expressions. Call them premise expressions when they occur
on the left-hand side of an implication. A list of premise expressions tautologically implies a
sentential expression if:
There is no truth-value assignment to the sentential variables under which
all the premise expressions get T and the conclusion expression gets F.
An implication (one that consists of expressions) is tautologically valid, or valid for short, if
the list of premise expressions tautologically implies the conclusion expression.
An implication is therefore non-valid when there is a truth-value assignment to the sentential
variables under which all the premise expressions get T and the conclusion expression gets
F. Such an assignment is called a counterexample. Hence an implication is valid just when it
has no counterexamples.
Obviously, a non-valid implication between sentential expressions fails as an implication be-
tween sentences, if we substitute the sentential variables by distinct atomic sentences. Alter-
natively, we can get a failed implication between sentences as follows:
Substitute every sentential variable that gets T in the counterexample by a tautology, and
every sentential variable that gets F by a contradiction.
Example:

A → B, ¬A ∧ C |= B

is not valid, because it has a counterexample:

A    B    C
F    F    T

If we assume that A = B = D ∧ ¬D and C = D ∨ ¬D, the implication fails as an implication
between sentences:

(D ∧ ¬D) → (D ∧ ¬D), ¬(D ∧ ¬D) ∧ (D ∨ ¬D) ⊭ D ∧ ¬D

But A, B and C can be other sentences for which the implication holds, e.g., if C = A ∧ B
(check it for yourself).
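Counterexample search can be mechanized directly from the definition, by running through all
truth-value assignments, exactly as a truth-table does. Here is a brute-force Python sketch;
the tuple encoding of sentential expressions (with tags 'not', 'and', 'or', 'imp', 'iff') and all
names are illustrative choices only.

    from itertools import product

    def variables(f):
        # The set of sentential variables occurring in f.
        if isinstance(f, str):
            return {f}
        return set().union(*[variables(g) for g in f[1:]])

    def value(f, v):
        # Truth-value of f under the assignment v (dict: variable -> bool).
        if isinstance(f, str):
            return v[f]
        if f[0] == 'not':
            return not value(f[1], v)
        a, b = value(f[1], v), value(f[2], v)
        return {'and': a and b, 'or': a or b,
                'imp': (not a) or b, 'iff': a == b}[f[0]]

    def counterexamples(premises, conclusion):
        # All assignments making every premise T and the conclusion F.
        fs = list(premises) + [conclusion]
        vs = sorted(set().union(*[variables(f) for f in fs]))
        result = []
        for bits in product([True, False], repeat=len(vs)):
            v = dict(zip(vs, bits))
            if all(value(p, v) for p in premises) and not value(conclusion, v):
                result.append(v)
        return result

    # The example above: A -> B, ¬A ∧ C |= B
    print(counterexamples([('imp', 'A', 'B'),
                           ('and', ('not', 'A'), 'C')], 'B'))
    # -> [{'A': False, 'B': False, 'C': True}]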
Implication has therefore a double usage: when the premises and the conclusion are unspecified
sentences, and when they are sentential expressions. There is no danger of confusion, because
the context will make the reading clear. If we say that an implication (e.g., the one in the
last example) may or may not hold, then we are obviously referring to unspecified sentences.
But if we say that it is not valid we are saying that the scheme does not hold in general, that
is, as an implication between sentential expressions, it has a counterexample. By the same
token we say that x·y > x + y may or may not hold, depending on the numbers x and y, but
that x² + 1 > x is a valid numerical inequality.
Equivalence for Counterexamples

We have seen that in each of our laws the left-hand side implication holds iff all the right-hand
side implications hold. Each of our laws also satisfies the following.

Counterexample Equivalence: A truth-value assignment to the sentential variables
is a counterexample to the left-hand side iff it is a counterexample to at least
one of the implications on the right-hand side.

Counterexample equivalence can be inferred from the following two facts: (I) In each law,
the equivalence of the two sides is preserved under all substitutions. (II) An implication is
non-valid iff it has an instantiation that fails as an implication between sentences.

Alternatively, the arguments that prove the equivalence of the two sides can be used to show
their counterexample equivalence. As an illustration we show this for (|=, →) and for (∨, |=).
(|=, →)   Γ |= A → B  ⇔  Γ, A |= B

A truth-value assignment (to the sentential variables) is a counterexample
to the left-hand side iff all members of Γ get T and A → B gets F. But this
is equivalent to saying that all members of Γ get T, A gets T, and B gets F;
which is exactly the condition for a counterexample to the right-hand side.

(∨, |=)   Γ, A ∨ B |= C  ⇔  Γ, A |= C and Γ, B |= C

A truth-value assignment is a counterexample to the left-hand side iff all
sentences in Γ get T, A ∨ B gets T, and C gets F. But A ∨ B gets T iff either
A gets T or B gets T (or both). If A gets T, then this is a counterexample
to Γ, A |= C, and if B gets T, it is a counterexample to Γ, B |= C. Vice
versa, a counterexample to one of the right-hand side implications assigns
T to all members of Γ and to A ∨ B, and assigns F to C.
Using top-down derivations, we will define a method that decides, for any implication, whether
or not it is valid. Given an implication, the method is guaranteed to produce either a proof
of it or a counterexample.
4.3.2 The Basic Laws
The method uses a nite number of basic laws. Some were mentioned before and some are
easily obtained from previously mentioned laws. Not all laws considered above are taken as
basic. The basic laws are naturally classied as follows.
First, there is a law that enables trivial rearrangements of premise lists:

If every sentence occurring in Γ occurs in Γ′ and every sentence occurring in
Γ′ occurs in Γ, then for every A,

Γ |= A  ⇔  Γ′ |= A

Using this law we can reorder the premises, delete repetitions, or list any premise more than
once. Henceforth, such reorganizing will be carried out without explicit mention.
Next, we designate two types of implications as self-evident:

Self-Evident Implications

Γ, A |= A          Γ, A, ¬A |= B

Implications belonging to these two types play a role similar to that of axioms. In bottom-up
proofs they serve as starting points. In top-down ones they are the final successful goals.
The rest, referred to as reduction laws, are the basis of the method. They enable us to replace
a goal by simpler goals. The first group consists of laws that handle sentential compounds
A ∗ B. For each binary connective ∗, we have a premise law (∗, |=) and a conclusion law (|=, ∗),
which handle, respectively, ∗-compounds that appear as a premise, or as the conclusion. The
second group, which deals with negated compounds, is presented later.
Laws for Conjunction

(∧, |=)   Γ, A ∧ B |= C  ⇔  Γ, A, B |= C
(|=, ∧)   Γ |= A ∧ B  ⇔  Γ |= A and Γ |= B

Laws for Disjunction

(∨, |=)   Γ, A ∨ B |= C  ⇔  Γ, A |= C and Γ, B |= C
(|=, ∨)   Γ |= A ∨ B  ⇔  Γ, ¬A |= B

Laws for Conditional

(→, |=)   Γ, A → B |= C  ⇔  Γ, ¬A |= C and Γ, B |= C
(|=, →)   Γ |= A → B  ⇔  Γ, A |= B

Laws for Biconditional

(↔, |=)   Γ, A ↔ B |= C  ⇔  Γ, A, B |= C and Γ, ¬A, ¬B |= C
(|=, ↔)   Γ |= A ↔ B  ⇔  Γ, A |= B and Γ, B |= A
The goal-reduction process is as described in 4.2.1. To recap: a step consists of replacing the
left-hand side of a reduction law (the goal that is being reduced) by the right-hand side.
The ⇐-direction guarantees that proving the new goals (or goal) is sufficient for proving the
old goal; the ⇒-direction means that they are also implied by it. All counterexamples to a
new goal are also counterexamples to the old one, and any counterexample to the old one is
obtained in this way.
A goal's children are the goals that replace it. (If there is one goal on the right, there is
only one child.) It follows from the above that if the children are valid so is the parent.
Consequently, if all the leaf goals are valid, so are their parents, and the parents of their
parents, and so on, up to the initial goal. On the other hand, any counterexample to one of
the leaf goals is also a counterexample to one (or more) of their parents, hence also to the
parent's parent, and so on, up to the initial goal. And any counterexample to the original
goal is a counterexample to a goal in some leaf.
We still need laws for reducing negated compounds, sentences of the form ¬¬A or ¬(A ∗ B).
The laws for double negation allow us to drop it, either in a premise or in the conclusion.
Compounds of the form ¬(A ∗ B) can be treated in the way described in 4.2.2,
i.e., by pushing the negation inside. This means that we use laws such as:

Γ, ¬(A ∧ B) |= C  ⇔  Γ, ¬A ∨ ¬B |= C

And similar laws for pushing negation inside in ¬(A ∨ B), in ¬(A → B), and in ¬(A ↔ B).

There is a more elegant way: combine in a single law the pushing of negation and the law
that applies to the resulting compound. For ¬(A ∧ B) this yields:

Γ, ¬(A ∧ B) |= C  ⇔  Γ, ¬A |= C and Γ, ¬B |= C .
Doing so for all connectives, we get reduction laws for negated compounds, of the same
type as our previous ones. It is easy to see that counterexample equivalence is also true
for the second group. Because these laws are obtained from the rst group by substituting
sentential expressions by logically equivalent ones; such substitutions do not change the sets
of counterexamples.
Laws for Negated Negations

(¬¬, |=)   Γ, ¬¬A |= B  ⇔  Γ, A |= B
(|=, ¬¬)   Γ |= ¬¬B  ⇔  Γ |= B

Laws for Negated Conjunctions

(¬∧, |=)   Γ, ¬(A ∧ B) |= C  ⇔  Γ, ¬A |= C and Γ, ¬B |= C
(|=, ¬∧)   Γ |= ¬(A ∧ B)  ⇔  Γ, A |= ¬B

Laws for Negated Disjunctions

(¬∨, |=)   Γ, ¬(A ∨ B) |= C  ⇔  Γ, ¬A, ¬B |= C
(|=, ¬∨)   Γ |= ¬(A ∨ B)  ⇔  Γ |= ¬A and Γ |= ¬B

Laws for Negated Conditionals

(¬→, |=)   Γ, ¬(A → B) |= C  ⇔  Γ, A, ¬B |= C
(|=, ¬→)   Γ |= ¬(A → B)  ⇔  Γ |= A and Γ |= ¬B

Laws for Negated Biconditionals

(¬↔, |=)   Γ, ¬(A ↔ B) |= C  ⇔  Γ, A, ¬B |= C and Γ, ¬A, B |= C
(|=, ¬↔)   Γ |= ¬(A ↔ B)  ⇔  Γ, A |= ¬B and Γ, ¬B |= A
This completes the set of reduction laws.
Branching Laws: A law whose right-hand side has more than one implication is called
a branching law. Each application of a branching law causes branching in the tree. The
branching laws are, for non-negated compounds: (|=, ∧), (∨, |=), (→, |=), (↔, |=) and
(|=, ↔). For negated compounds they are: (¬∧, |=), (|=, ¬∨), (|=, ¬→), (¬↔, |=) and
(|=, ¬↔). The other laws are referred to as non-branching.
Memorization: You do not have to memorize all the laws. A useful strategy is to
memorize only four: the two laws for conjunction, the premise-disjunction law, (∨, |=), and
the conclusion-conditional law, (|=, →). The rest you can get by obvious substitutions of
equivalents: the conclusion-disjunction law, by rewriting A ∨ B as ¬A → B; the premise-
conditional law, by rewriting A → B as ¬A ∨ B; the premise-biconditional law, by rewriting
the biconditional as (A ∧ B) ∨ (¬A ∧ ¬B); and the conclusion-biconditional law, by rewriting
it as (A → B) ∧ (B → A). Besides the double negation laws, the other laws for negated
compounds are obtained by pushing negation in, as indicated earlier.
Elementary Implications
Elementary implications are those that cannot be simplified through reduction laws. An
implication is elementary if every sentential expression figuring in it is either a sentential
variable or a negation of one. This is equivalent to saying that it contains neither binary nor
negated compounds. Here are some examples.

A, ¬B, C, ¬D |= ¬B        ¬A, C, B, ¬C |= D        ¬A, B, ¬C, D |= ¬E
Claim: (I) If an elementary implication is valid, then it is self-evident, i.e., either the
conclusion occurs as a premise or the premises contain a sentential expression and its negation.
(II) If an elementary implication is not self-evident then there is a unique assignment to its
sentential variables that constitutes a counterexample. This assignment is determined by the
following conditions:

(i) Every sentential variable that occurs unnegated in the premises gets T, and
every sentential variable that occurs negated in the premises gets F.

(ii) The sentential variable of the conclusion gets F if it occurs unnegated, T if
it occurs negated.
Proof: Assume that an elementary implication is not self-evident and show that the condi-
tions in (II) determine an assignment, and that the assignment is the unique counterexample.
For any given assignment the following is obvious: all the premises get T iff the assignment
satisfies (i); the conclusion gets F iff the assignment satisfies (ii). There is at most one
assignment, to the sentential variables occurring in the implication, that satisfies (i) and (ii),
because (i) and (ii) prescribe truth-values to all these sentential variables. Hence there is a
counterexample iff there is an assignment satisfying (i) and (ii); the counterexample is then
unique.
The only way in which (i) and (ii) can fail to determine an assignment is by prescribing
more than one truth-value for the same sentential variable. This does not happen unless the
implication is self-evident. For if it is not, no sentential variable occurs in the premises both
negated and unnegated; hence (i) assigns to each sentential variable occurring in the premises
exactly one value. Next, if the variable of the conclusion does not occur in the premises, then
only (ii) gives it a value. If it occurs in the premises, it must be either negated in the premises
and unnegated in the conclusion, or unnegated in the premises and negated in the conclusion.
Otherwise the conclusion is among the premises and the implication is self-evident. Hence
(ii) and (i) assign to it the same value.
QED
In order to check the validity of an elementary implication, we therefore check whether it is
self-evident. If it is not, then (i) and (ii) in (II) tell us what the counterexample is.

Of the three elementary implications given above, the first two are self-evident. The coun-
terexample to the third is:

A    B    C    D    E
F    T    F    T    T
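Conditions (i) and (ii) translate directly into code. Continuing the tuple encoding of the
earlier sketch (the helper names are illustrative), the following returns None for a self-evident
elementary implication, and otherwise the unique counterexample:

    def literal(f):
        # Split a literal into (variable, unnegated?).
        return (f, True) if isinstance(f, str) else (f[1], False)

    def elementary_counterexample(premises, conclusion):
        if conclusion in premises:              # conclusion occurs as a premise
            return None
        v = {}
        for p in premises:                      # condition (i)
            var, sign = literal(p)
            if var in v and v[var] != sign:     # premises contain X and ¬X
                return None
            v[var] = sign
        var, sign = literal(conclusion)         # condition (ii)
        v[var] = not sign
        return v

    # The third example above: ¬A, B, ¬C, D |= ¬E
    print(elementary_counterexample(
        [('not', 'A'), 'B', ('not', 'C'), 'D'], ('not', 'E')))
    # -> {'A': False, 'B': True, 'C': False, 'D': True, 'E': True}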
We can now assemble all the pieces and sum up the method.
4.3.3 The Fool-Proof Method
To check any given implication

Γ |= A ,

we take it as the initial goal and proceed top-down, by applying the premise and conclusion
laws for binary connectives and for negated compounds. As long as there is a goal containing
a binary compound, or a negation of one, or a double negation, we can continue.
Such a process cannot go on indefinitely, because the goals become smaller. Intuitively this is
clear. The mathematical proof of this will not be given here. (The proof is not trivial because,
as the goals become smaller, their number can increase; for precise inductive arguments see
6.2.4 and 6.2.5.)
When the process terminates, we get a top-down derivation tree in which all the goals in
the leaves are elementary. If all are self-evident we get a top-down derivation of the initial
implication, which can be turned upside down into a bottom-up proof. Otherwise, there are
terminal goals that are not self-evident. Each of these yields a unique counterexample to the
initial goal. All the counterexamples to the initial goal are obtained in this way.
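The whole method fits in a short recursive program. The sketch below implements one
reduction law per step, in the spirit of the tables above; it reuses elementary_counterexample
from the previous sketch, and the encoding and names remain illustrative choices. Each leaf
that is not self-evident contributes its counterexample (which fixes only the variables surviving
in that leaf, and the same assignment may be reported by more than one leaf):

    NOT = lambda f: ('not', f)

    def is_literal(f):
        return isinstance(f, str) or (f[0] == 'not' and isinstance(f[1], str))

    def reduce_conclusion(G, c):
        # Conclusion laws; returns the child goals, or None if c is a literal.
        if is_literal(c):
            return None
        if c[0] == 'not':
            g = c[1]
            if g[0] == 'not':                                  # (|=, ¬¬)
                return [(G, g[1])]
            op, a, b = g
            return {'and': [(G + [a], NOT(b))],                # (|=, ¬∧)
                    'or':  [(G, NOT(a)), (G, NOT(b))],         # (|=, ¬∨)
                    'imp': [(G, a), (G, NOT(b))],              # (|=, ¬→)
                    'iff': [(G + [a], NOT(b)),
                            (G + [NOT(b)], a)]}[op]            # (|=, ¬↔)
        op, a, b = c
        return {'and': [(G, a), (G, b)],                       # (|=, ∧)
                'or':  [(G + [NOT(a)], b)],                    # (|=, ∨)
                'imp': [(G + [a], b)],                         # (|=, →)
                'iff': [(G + [a], b), (G + [b], a)]}[op]       # (|=, ↔)

    def reduce_premise(G, i, c):
        # Premise laws applied to premise G[i]; None if it is a literal.
        p, rest = G[i], G[:i] + G[i+1:]
        if is_literal(p):
            return None
        if p[0] == 'not':
            g = p[1]
            if g[0] == 'not':                                  # (¬¬, |=)
                return [(rest + [g[1]], c)]
            op, a, b = g
            return {'and': [(rest + [NOT(a)], c),
                            (rest + [NOT(b)], c)],             # (¬∧, |=)
                    'or':  [(rest + [NOT(a), NOT(b)], c)],     # (¬∨, |=)
                    'imp': [(rest + [a, NOT(b)], c)],          # (¬→, |=)
                    'iff': [(rest + [a, NOT(b)], c),
                            (rest + [NOT(a), b], c)]}[op]      # (¬↔, |=)
        op, a, b = p
        return {'and': [(rest + [a, b], c)],                   # (∧, |=)
                'or':  [(rest + [a], c), (rest + [b], c)],     # (∨, |=)
                'imp': [(rest + [NOT(a)], c),
                        (rest + [b], c)],                      # (→, |=)
                'iff': [(rest + [a, b], c),
                        (rest + [NOT(a), NOT(b)], c)]}[op]     # (↔, |=)

    def decide(premises, conclusion):
        # [] if the implication is valid; otherwise the counterexamples.
        kids = reduce_conclusion(list(premises), conclusion)
        if kids is None:
            for i in range(len(premises)):
                kids = reduce_premise(list(premises), i, conclusion)
                if kids is not None:
                    break
        if kids is None:                        # the goal is elementary
            ce = elementary_counterexample(list(premises), conclusion)
            return [] if ce is None else [ce]
        return [ce for g, c in kids for ce in decide(g, c)]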
Note: Our listed laws are sufficient for deriving all valid implications. For if an implication
is not derivable, the resulting tree gives us a counterexample. This shows that non-derivable
implications are not valid.
Consequently, there is no need to use any other laws or to rely on substitution of equivalent
components. In practice, however, you can legitimately apply other established laws, such as
disjoining ((12) of 4.2.1), and you may use substitutions of equivalents (where the equivalences
have been proven already), in order to shorten the proof.
Note: In actual applications, you need not go all the way. The process can stop at the
stage where all the goals are self-evident, even if they are not elementary. In the first example
of 4.2.2 the final goals are elementary; in the second and the third they are not. Also, once
you get an elementary implication that is not self-evident, you have your counterexample and
you can stop. But if you want to get all counterexamples to the initial goal, you should get
all non-valid elementary implications of the tree.
Here are two examples. In the first we get a proof, in the second, a counterexample. The law
that applies at each step is not indicated, but you can figure it out. The sentence to which a
law is applied is underlined.
1. ¬A ∨ ¬B, C → (A ∨ ¬B) |= _C → ¬B_
2. _¬A ∨ ¬B_, C → (A ∨ ¬B), C |= ¬B
3.1 ¬A, _C → (A ∨ ¬B)_, C |= ¬B
3.2 ¬B, C → (A ∨ ¬B), C |= ¬B
4.11 ¬A, ¬C, C |= ¬B
4.12 ¬A, _A ∨ ¬B_, C |= ¬B
5.121 ¬A, A, C |= ¬B
5.122 ¬A, ¬B, C |= ¬B

Note that had we employed law (12) of 4.2.1, we could have replaced

C → (A ∨ ¬B), C    by    C, A ∨ ¬B ,

which would have eliminated 4.11, making 4.12 the sole child of 3.1.
1. A → (B ∧ C), B → (A ∧ C) |= _C ∨ A_
2. _A → (B ∧ C)_, B → (A ∧ C), ¬C |= A
3.1 ¬A, _B → (A ∧ C)_, ¬C |= A
3.2 B ∧ C, B → (A ∧ C), ¬C |= A
4.11 ¬A, ¬B, ¬C |= A
4.12 ¬A, A ∧ C, ¬C |= A

Here the derivation stopped after yielding the non-valid elementary 4.11. The corresponding
counterexample is:

A    B    C
F    F    F

You can easily see that this is also a counterexample to 1. If you continue to reduce the
remaining goals, 3.2 and 4.12, you will see that both of them are valid, hence this is the only
counterexample to our initial goal.
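A quick check with the sketch above (same illustrative encoding) reproduces this result
mechanically:

    premises = [('imp', 'A', ('and', 'B', 'C')),
                ('imp', 'B', ('and', 'A', 'C'))]
    print(decide(premises, ('or', 'C', 'A')))
    # -> [{'C': False, 'A': False, 'B': False}]: the single counterexample
    # A = B = C = F, just as the derivation found.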
Homework 4.8 Write the last two top-down derivations in tree form, using line numbers
to label the nodes. Write to the side of each node the law by which it has been derived from
its parent.
Some Noteworthy Properties of the Method

• Given any occurrence of a sentential expression in a goal, there is at most one law that
can be applied to it, and the result of the application is unique.

• One cannot go wrong by applying the reduction laws in any order, until all goals are
elementary.

(But the choice, at each stage, of where to apply a reduction law can have considerable
effect on the derivation's length. If, in the first of the last two examples, we had started
by treating the leftmost premise, ¬A ∨ ¬B, and had followed this by treating, on each of
the two branches, C → (A ∨ ¬B), we would have had four branches right at the beginning
and the number of lines would have been 13, instead of 8.)

• The laws that deal with a binary connective do not introduce any other connective
except, possibly, negation. Therefore, no other connectives, besides negation and those
appearing in the initial goal, appear in the derivation.

Consequently, given any set of connectives that contains negation, the laws for the
connectives of the set are sufficient for deriving all tautological implications of the sub-
system that is based on these connectives. For example, if we restrict our system to
sentences whose connectives are ¬ and ↔ only, the double-negation laws and the
premise and conclusion laws for biconditionals and negated biconditionals are sufficient.
Note: The validity of a given implication can also be settled through truth-tables. We make
a table for all the occurring sentential expressions; then we check, row after row, whether all
the premises get T and the conclusion gets F. If we find such a row, we get a counterexample.
If not, the implication is valid. But the execution of this brute-force checking is tedious,
prone to mistakes and, often, more time-consuming.
A most important feature of the method is that it generalizes to richer systems where truth-
tables are not available. As we shall mention later (in 9.3.3), there is no method for first-
order logic that is guaranteed to produce, in a finite number of steps, either a proof or a
counterexample. But there is one that is guaranteed to produce a proof, if the implication is
a logical implication. One of the proofs of this result is obtained by extending the present
method and by using the same type of arguments that show its adequacy for the sentential
case.
Homework 4.9 Give, for each of the following implication claims, a top-down derivation or
a counterexample. To cut short the construction, you can use at your convenience additional
laws, besides the basic ones, as well as simple substitutions of equivalents.
1. |= [A → (B → C)] → [(A → B) → (A → C)]

2. |= A ∨ (A ∧ C) ↔ (A ∧ C)

3. A → B, B → C, C → A ∧ B |= C ↔ (A ∧ B)

4. A ∨ B, B → (C ∧ D), ¬D ∨ C |= C

5. A → B, (B ∨ C) → D |= A → D

6. A ∨ B, B → C, A → C |= A ∨ (C ∧ B)

7. |= ((A → B) → C) → (B → C)

8. A → B ∧ C, (B ∨ C) → D |= (D → A) ∨ (B ∧ C)

9. A → (B → C), B ∧ ¬C |= ¬A

10. B ∧ C → A, (A ∨ B) ∧ C |= B → A

11. A → (B ∧ C), B → (A ∧ C) |= (A ∨ B) → C
4.4 Proofs by Contradiction
4.4.0
The following is easily established:

(15)  Γ |= A  iff  Γ, ¬A is logically inconsistent.
The left-hand side holds just when it is impossible that all the sentences in Γ be true and A
be false; this is the same as saying that it is impossible that all sentences in Γ, ¬A be true.

Furthermore, we have:

(16)  If C is any contradiction, then:
      Γ is logically inconsistent  iff  Γ |= C .

Again, this is obvious: a logically inconsistent premise-list implies all sentences, in particular
all contradictions. Vice versa, if Γ implies a contradiction, then it is impossible that all
premises in Γ be true, for then the contradiction would have to be true as well.

From (15) and (16) we get:

(17)  If C is any contradiction, then:
      Γ |= A  ⇔  Γ, ¬A |= C .
(17) gives us a way of proving that Γ |= A: add ¬A to Γ and show that the resulting list
implies a contradiction. Such a proof is called a proof by contradiction.

We can choose any contradiction as C. The most common one is a sentence of the form B ∧
¬B. But instead of using particular contradictions, it is convenient to introduce a special
contradiction symbol that denotes a sentence which, by definition, gets only the value F. The
symbol to be used is:

⊥

You can think of ⊥ as denoting, ambiguously, any contradiction. But we shall employ it in a
restricted way: it cannot occur among the premises but only as the right-hand side of |= :

Γ |= ⊥ .

This is simply a way of saying that Γ is logically (or, in the special case of sentential logic,
tautologically) inconsistent. ⊥ can be replaced, if one wishes, by any particular contradiction.
With this notation (17) becomes:

(18)  Γ |= A  ⇔  Γ, ¬A |= ⊥

It is easily seen that the two sides of (18) are counterexample equivalent (i.e., have the same
counterexamples).
All our previous premise laws apply to implications of the form Γ |= ⊥ (because ⊥ can be
replaced by any contradictory sentence). E.g.,

Γ, A ∨ B |= ⊥  ⇔  Γ, A |= ⊥ and Γ, B |= ⊥
Our previous notions of a self-evident implication and of an elementary implication carry over,
in an obvious way, to implications of the form

Γ |= ⊥ .

The implication is self-evident just when Γ contains a sentence and its negation. It is elemen-
tary if all the premise expressions are unnegated or negated sentential variables. There is now
only one kind of self-evident implication, because cases in which the conclusion is among the
premises are excluded by the restriction on ⊥. The claim that an elementary implication is
valid iff it is self-evident now has a simpler proof:

Assume that the elementary implication is not self-evident. Assign T to
every sentential variable appearing unnegated in the premise-list, and assign F
to those that appear negated. Since no variable appears both unnegated and
negated, each is assigned a single value. This assignment makes all premises
true, thereby constituting a counterexample.
4.4.1 The Fool-Proof Method for Proofs by Contradiction
Our previous top-down method can be adapted to proofs by contradiction. Given an initial
goal:

Γ |= A

we start by replacing it with the equivalent goal:

Γ, ¬A |= ⊥

Then we proceed to reduce this goal to simpler goals by applying our premise laws to binary
compounds, to their negations, and to double negations. (If the initial goal is Γ |= ⊥ we
start the reductions right away.) All the resulting goals have ⊥ on the right-hand side.

We can continue until all the goals are elementary. If all are self-evident we get a proof
of the initial goal. Otherwise, every elementary implication that is not self-evident yields a
counterexample. Here are two illustrations. In the first the method yields a proof:
1. ¬(¬A ∨ B), C ∨ B |= _¬(¬A ∨ ¬C)_
2. ¬(¬A ∨ B), C ∨ B, _¬¬(¬A ∨ ¬C)_ |= ⊥
3. _¬(¬A ∨ B)_, C ∨ B, ¬A ∨ ¬C |= ⊥
4. _¬¬A_, ¬B, C ∨ B, ¬A ∨ ¬C |= ⊥
5. A, ¬B, _C ∨ B_, ¬A ∨ ¬C |= ⊥
6.1 A, ¬B, C, _¬A ∨ ¬C_ |= ⊥
6.2 A, ¬B, B, ¬A ∨ ¬C |= ⊥
6.11 A, ¬B, C, ¬A |= ⊥
6.12 A, ¬B, C, ¬C |= ⊥
Note that we could have shortened the derivation had we used (12) (which is not among our
basic rules). An application of (12) to 5. yields the equivalent goal:

A, ¬B, C ∨ B, ¬C |= ⊥

which by another application of (12) becomes self-evident:

A, ¬B, B, ¬C |= ⊥
In the second example, the method yields a counterexample:
1. A → B, ¬(B → C) |= _A_
2. A → B, _¬(B → C)_, ¬A |= ⊥
3. _A → B_, B, ¬C, ¬A |= ⊥
4.1 ¬A, B, ¬C, ¬A |= ⊥
4.2 B, B, ¬C, ¬A |= ⊥
(The repeated occurrences of premises in the last two goals could have been deleted.) Note
that 4.1 and 4.2 are conjointly equivalent to the initial goal. Each has a counterexample.
But their counterexamples are the same, namely:

A    B    C
F    T    F

Therefore, this is the only counterexample to the original implication. You can check that
it is indeed a counterexample to the initial goal by constructing a truth-table (for the three
sentential expressions) and noting that the row corresponding to that assignment is one in
which all the premises get T and the conclusion gets F. You can moreover check that this is
the only row with that property.
The proof-by-contradiction variant uses fewer basic laws than our previous method. All the
reduction laws are premise laws. On the other hand, it may occasionally require more steps.
The basic laws for top-down proofs by contradiction are given below. Except for
the law for trivial rearrangements of the premises, no other laws are needed.
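This variant is, if anything, easier to program: a goal is now just a premise list, and only
premise laws are applied. The sketch below reuses reduce_premise, is_literal and literal
from the earlier sketches (names and encoding remain illustrative); the conclusion slot is a
mere placeholder, since it stays ⊥ throughout.

    def refute(premises):
        # [] if the premises are tautologically inconsistent (premises |= ⊥);
        # otherwise the counterexamples: assignments making every premise T.
        for i, p in enumerate(premises):
            if is_literal(p):
                continue
            kids = reduce_premise(list(premises), i, None)
            return [ce for g, _ in kids for ce in refute(g)]
        v = {}                                  # all premises are literals
        for p in premises:
            var, sign = literal(p)
            if v.get(var, sign) != sign:
                return []                       # contains A and ¬A: self-evident
            v[var] = sign
        return [v]

    # The second illustration: to check  A → B, ¬(B → C) |= A,  refute Γ, ¬A:
    goal = [('imp', 'A', 'B'), ('not', ('imp', 'B', 'C')), ('not', 'A')]
    print(refute(goal))
    # -> the counterexample A=F, B=T, C=F, reported once for each of the
    # two leaves (4.1 and 4.2 above).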
Homework 4.10 Using the proof-by-contradiction method, check which of the following
implications are valid. Give in each case either a top-down derivation or a counterexample. You
can use at your convenience additional premise laws, such as (12), or simple substitutions-by-
equivalents.
1. A → B, B → C |= ¬C → ¬A

2. A → A ∧ B, B → C |= A → C

3. (A ∨ B) → C, ¬C |= ¬A ∧ ¬B

4. A → (B ∨ C), ¬B |= A → C

5. |= (A → B) ∧ (A → C) → (A → B ∧ C)

6. A ↔ ¬B, B ↔ ¬C, C ↔ ¬A |= ⊥

7. A → B |= (A ∧ B) ∨ ¬A

8. A → (B ∨ C), ¬(B → A), ¬(C → A) |= ⊥

9. (A ∧ B) → C |= ((A ∧ B) ↔ C) ∨ C
The Laws for Proofs by Contradiction

First we have the law that fixes the self-evident implications.

Self-Evident Implication

Γ, A, ¬A |= ⊥

Then we have the reduction laws. The first is the law that introduces ⊥ as the conclusion.
The others are premise laws for binary connectives and negated compounds. In the following
list the laws for A ∗ B and for ¬(A ∗ B) are grouped together.
Contradictory-Conclusion Law

Γ |= A  ⇔  Γ, ¬A |= ⊥

Law for Negated Negations

(¬¬, |=)   Γ, ¬¬A |= ⊥  ⇔  Γ, A |= ⊥

Laws for Conjunctions and Negated Conjunctions

(∧, |=)    Γ, A ∧ B |= ⊥  ⇔  Γ, A, B |= ⊥
(¬∧, |=)   Γ, ¬(A ∧ B) |= ⊥  ⇔  Γ, ¬A |= ⊥ and Γ, ¬B |= ⊥

Laws for Disjunctions and Negated Disjunctions

(∨, |=)    Γ, A ∨ B |= ⊥  ⇔  Γ, A |= ⊥ and Γ, B |= ⊥
(¬∨, |=)   Γ, ¬(A ∨ B) |= ⊥  ⇔  Γ, ¬A, ¬B |= ⊥

Laws for Conditionals and Negated Conditionals

(→, |=)    Γ, A → B |= ⊥  ⇔  Γ, ¬A |= ⊥ and Γ, B |= ⊥
(¬→, |=)   Γ, ¬(A → B) |= ⊥  ⇔  Γ, A, ¬B |= ⊥

Laws for Biconditionals and Negated Biconditionals

(↔, |=)    Γ, A ↔ B |= ⊥  ⇔  Γ, A, B |= ⊥ and Γ, ¬A, ¬B |= ⊥
(¬↔, |=)   Γ, ¬(A ↔ B) |= ⊥  ⇔  Γ, A, ¬B |= ⊥ and Γ, ¬A, B |= ⊥
4.5 Implications of Sentential Logic in Natural Language
4.5.0
In order to establish logical implications between English sentences, we recast them as sen-
tences of symbolic logic. We can then check whether logical implication holds for the recast
sentences. Consider the premises:
(1) Jill will not marry Jack, unless he leaves New York,
(2) If Jack leaves New York, he must give up his current job,
and the inferred conclusion:
(3) Either Jack will give up his current job, or he won't marry Jill.
Let A, B and C be, respectively, the formal counterparts of:
Jill will marry Jack, Jack will leave New York, Jack will give up his current job.
Then the sentences are translated as:
(1′)  ¬B → ¬A

(2′)  B → C

(3′)  C ∨ ¬A

And indeed:

¬B → ¬A, B → C |= C ∨ ¬A
Had the implication not been valid, we would have had a counterexample, using which we
could have pointed out a possible scenario in which (1) and (2) are true and (3) is false. For
example, had we replaced (1) by:

(1″) Jack will not marry Jill if he leaves New York,

the formal implication would have been:

B → ¬A, B → C |= C ∨ ¬A

And here we get a counterexample:

A    B    C
T    F    F

That is, Jill marries Jack; he does not leave New York, and he does not give up his job.
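With the brute-force sketch of 4.3.1 (same illustrative encoding), this counterexample is
recovered mechanically:

    # B → ¬A, B → C |= C ∨ ¬A : the variant obtained from (1″).
    P = [('imp', 'B', ('not', 'A')), ('imp', 'B', 'C')]
    print(counterexamples(P, ('or', 'C', ('not', 'A'))))
    # -> [{'A': True, 'B': False, 'C': False}]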
Noteworthy Points: (I) We have construed (2) as a conditional and we have read he
must give up his job as he will give up his job. Whatever the connotations of must in this
context, they are ignored as irrelevant for the formalization.

(II) We used the same A to represent, in (1′) and in (3′), Jill will marry Jack and Jack will
marry Jill, taking these to express the same statement.

(8)  ∀x[Bachelor(x) → Unmarried(x)]
Carnap held that meaning postulates are unrevisable laws of language, without empirical
content, a view that has by now been abandoned by most philosophers. Nonetheless, even
if the distinction is not, as Carnap held, absolute, it is a good methodological policy to
distinguish sentences like (8) from sentences like Carol is unmarried, which convey non-
linguistic factual information.

We shall henceforth use meaning postulates, without however committing ourselves to the
original significance associated with the term. Thus, in formalizing the inference from (5) to
(6), we add B → C as a premise representing a meaning postulate (or a consequence of one).
Background Assumptions

Almost every reasoning involves background assumptions that are not spelled out explicitly.
Given the premises:

(9) Arthur's mother won't be content, unless he lives in Boston,

(10) Arthur's wife will be content only if he lives in New York,

we would naturally conclude:

(11) Either Arthur's mother or his wife won't be content.
Let us formalize:

A1: Arthur will live in Boston,
A2: Arthur will live in New York,
B1: Arthur's mother will be content,
B2: Arthur's wife will be content.

The required implication is:

(12)  ¬A1 → ¬B1, B2 → A2 |= ¬B1 ∨ ¬B2 .

But it is easy to see that (12) is not a valid implication: if all of A1, A2, B1, B2 get T, the
premises are true and the conclusion is false. It turns out that in deriving (11) we have been
assuming that Arthur will not live in New York and in Boston at the same time. (At the
same time, because in (9), (10) and (11) the future tense is, obviously, intended to indicate
the same time.)
This background assumption becomes, upon formalization:

¬(A1 ∧ A2)

Having added it, we get the desired logical implication:

(12′)  ¬A1 → ¬B1, B2 → A2, ¬(A1 ∧ A2) |= ¬B1 ∨ ¬B2
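The effect of the background assumption can be checked mechanically with the brute-force
sketch of 4.3.1 (the variable names A1, B1, etc. are plain strings in that illustrative encoding):

    P = [('imp', ('not', 'A1'), ('not', 'B1')), ('imp', 'B2', 'A2')]
    C = ('or', ('not', 'B1'), ('not', 'B2'))
    print(counterexamples(P, C))      # -> the assignment making all four T
    print(counterexamples(P + [('not', ('and', 'A1', 'A2'))], C))   # -> []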
A background assumption is by no means a necessary truth. It is, for example, conceivable
that Arthur will live in Boston and in New York at the same time: say, he maintains
two households and commutes daily. But given the inconvenience and the expense, such an
arrangement is very unlikely; implicitly, we have ruled it out.
Or consider the inference from:
If Jack leaves New York, he will have to resign his current position,
AND
Jack decided to leave New York,
TO:
Jack will resign his current position.
Here there is an implicit assumption that Jack will carry out his decision. The assumption
may be objectionable in contexts in which decisions are not always implemented.
Implicit background assumptions are thus statements of fact that are assumed to be known,
or which can reasonably be taken for granted. Their certainty can vary considerably, from
that of a well-established law to that of a mere plausibility. Even an obvious commonplace,
e.g., that one cannot be in different places exactly at the same time, can be classified as a
background assumption.
There is a philosophical tradition, initiated by Kant, according to which certain truths, such
as the impossibility of being in two places at the same time, derive from basic (non-linguistic)
epistemic principles and are immune to revision. The truth just mentioned derives, presum-
ably, from the very meaning of physical body and space. But today the force of that tradition
has been considerably weakened. Many cast doubt on the unrevisability of such a priori con-
ceptual truths. Let us therefore classify under meaning postulates cases that are more of a
lexicographic nature, such as (8), rather than those that follow from foundational epistemic
considerations. The latter will be classified as background assumptions, albeit ones we can
hardly conceive of giving up.
Background assumptions thus cover an extremely wide spectrum, from the most entrenched
general laws, to probable suppositions, to particular facts implied by context. In a finer
analysis we should distinguish between them. For the sake of simplicity we ignore these
distinctions.

Meaning postulates is reserved for cases like (8), which derive from the conventions of lan-
guage. The boundary separating meaning postulates from background assumptions is, to be
sure, blurred. This is true of many useful distinctions. The difference between (8) and some
factual assumption (e.g., that Jack will carry out his decision) is sufficiently clear to warrant
their classification under different headings.
4.5.2 Implicature
In linguistic exchange we often infer more than what is explicitly stated. On being told that
(13) Jack and Jill met and, so far, they have not quarreled,

one will naturally infer that a quarrel between Jack and Jill was likely. The sentence however
does not say it. (13) is true just in case (i) Jack and Jill met and (ii) Jack and Jill have
not quarreled, then and later.
Grice, who pointed out and investigated these phenomena, proposed the term implicature,
based on the verb implicate, for inferences of this kind. Thus, we can say that (13) implicates
that there was some reason for expecting a quarrel between Jack and Jill. The sentence does
not, strictly speaking, imply it. If we add to (13) the negation of the implicated sentence, we
may get something odd, but not a contradiction:

(13′) Jack and Jill met and so far they have not quarreled; there was no reason
to expect a quarrel.
On Grice's analysis, the implicature from (13) derives from certain pragmatic rules that govern
conversations. The rules that bear on (13), and on other cases that we shall consider, have
to do with the relevance, the informativeness and the economy of the speaker's utterances.
The relevance requirement is that the statements made by the speaker be relevant to the
topic under discussion. The informativeness requirement is that the speaker supply the right
amount of information (known to her) that is required in that exchange. And economy
means that she is required to avoid unnecessary length. In our example the rules produce the
implicature in the following way.

If there was no reason why Jack and Jill should quarrel, then to say that they have not
quarreled is to supply a piece of useless information. Since we expect the speaker to go by the
rules and to supply information that has some significance, we infer from (13) (assuming the
speaker to be knowledgeable and sincere) that there is some reason for expecting a quarrel.
The rules of conversation also require that a speaker should not assert a conditional,

If ..., then - - -,

if he knows that the antecedent is false, or if he knows that the consequent is true. For one
can be more informative and more brief by asserting, in the first case, the negation of the
antecedent; in the second case, the consequent. This point was already discussed in 3.1.4; the
oddity of (28.i) and (28.ii) of that section is partly explained by noting this implicature.
Also the inferring of a causal connection, which sometimes goes with the use of and, can
perhaps be traced to conversational implicature. Being told:

(14) Jill recommended the play, and Jack went to see it,
we infer that Jill's recommendation was the cause of Jack's going. Else there would be no point
in mentioning the two together. Actually, this is not so much the requirement of relevance
as the requirement that there be a sufficiently focused topic of discussion. The same
requirement of sufficient focus can be seen to underlie the assumption of temporal proximity:
unless stated otherwise, we interpret conjunctively combined clauses, in past or future tense,
as referring roughly to the same time.

As you can see, implicatures make it possible to mislead without making assertions that are
formally false. Many resort to this device. Politicians, advertisers and lawyers excel in it.
Implicature versus Ambiguity
In (13) the addition of the negated implicature does not yield a contradiction. We may take
this as corroborative evidence for its being an implicature, not an implication. This kind of
test is however not conclusive; on many occasions it misleads. Consider:

(15) Jack and Jill were married last week.

Usually (15) is taken to imply that Jack and Jill married each other. If, however, we add the
negation of that conclusion, we get:

(15′) Jack and Jill were married last week, but they did not marry each other,
which is not at all contradictory. Shall we then say that our first inference from (15) is by
implicature only? No. Actually (15) is ambiguous. We have seen (cf. 3.1.2) that the use of
and to combine names can result in two possible interpretations: the distributive, in which
the sentence can be expressed as a conjunction, and the collective, where the combination of
names functions as a name of a single item. The dominant reading of (15) is the collective,
implying that Jack and Jill married each other. The addition of they did not marry each
other makes this interpretation untenable (for it leads to a trivial contradiction). Hence we
switch to the other reading of Jack and Jill were married. In the same vein, we may interpret
John jumped into the tank
as stating that John jumped into some armored vehicle. But with a suitable addition, e.g.,
John jumped into the tank and dived to the bottom,
we read it as stating that John jumped into a large container.
A nice illustration of conversational implicature is provided by comparing (15) and

(15″) Jack and Jill were married last week, on the same day.

(15), under its first reading (i.e., that they married each other), implies (15″). But then the
addition of on the same day would be altogether redundant. Assuming the speaker to go by
the rule of economy, we reinterpret (15) and infer that they were not married to each other;
else there would be no point to the additional information.
The Principle of Adjusting: We mentioned already the so-called charity principle (3.1.2,
after example (16)). According to it, we interpret our interlocutor, in cases of ambiguity, so as
to make him sound sensible. Our last examples illustrate this point. But the name charity
can mislead. For the principle derives from a wider principle, according to which we interpret
our experience so as to make it cohere with the general scheme of expected regularities. This
applies to linguistic as well as to non-linguistic phenomena, to interaction with people as well
as to interaction with nature.

The subject is too broad to go into here. Suffice it to observe that in the case of language
we expect utterances, linguistic texts and linguistic interaction to accord with certain rules:
syntactic, semantic and pragmatic. We expect utterances to make a certain sense. And when
the danger of nonsense looms, we use the available possibilities to adjust our reading so as to
avoid it.
Homework 4.11 Use sentential formalization in order to analyze the logic of the following
exchanges and to answer the questions. If there is no stated question, find whether the
conclusion follows from the premises.

Formalize only in as much as is necessary for the purpose of your analysis.

Discuss briefly any points relating to meaning postulates, background assumptions, ambiguity
and implicature which you find relevant. Assume that the speakers are reasonable.
(1) Jill: Jack's mother won't be content, unless he lives in Boston.
Jack: But his wife will be content only if they live in New York.
Jill: So either his wife or his mother won't be content.

Take they in Jack's statement to refer to Jack and his wife.
(2) Arthur, David and Mary share an apartment.
Jack: Arthur and David are crazy about Mary, so if she is at home both of them are.
Jill: In any case, if one of the boys is at home, the other is too, for none trusts himself
alone with the neighbor's dog.
Jack: But they said that one of them will go over to Joe's place to help him with his
studies.
Jill: Which goes to show that Mary is not at home.

Does it? Does it make a difference for the implication if they and them, in Jack's last
statement, refer to the two boys or to the two boys and to Mary as well?
(3) Jack: Do you think that Mary is still unmarried?
Jill: I don't know, but if Mary is not unmarried, neither is Myra.
Jack: And if Myra is not married neither is Mary.
Jill: All this is rather confusing. Doesn't it imply that Myra is married only if Mary
is?

Does it?
(4) Jill: Both Arthur and Jeremiah said that they won't be happy, unless they marry Frieda.
Jack: By now she should have married one of them.
Jill: But she wasn't going to marry anyone without a secure job.
Jack: So, by now, one of them has a secure job and one of them is not happy.
(5) Jack: If one of Arthur and Jeremiah goes to the movie either Olga or Amelia will go
with him.
Jill: And the two girls won't go there together unless accompanied by a boy.
Jack: Which goes to show that the two boys will go to the movie only if the two girls
go there too.
(6) Jack consults a fortuneteller about whether he should become a musician or study for the law.
Jack: I won't be happy unless I practice music.
Fortuneteller: But only by becoming a lawyer can you be rich enough to buy the things
you like.
Jack: It seems that my happiness depends on my giving up things I like.

Does it?
(7) Jill: If you go to the movie so will I.
Jack: If what you have said is true, then I will go to the movie.
Jill: Why this roundabout way of putting things? You could have simply said that
you will go to the movie.
Jack: Not at all. I only said that I will go to the movie if what you had said is true.

Who is right and why?
(8) Jack: If I enroll in the logic course I shall work very hard.
Jill: I don't know that I believe you... Well, at least I believe that you won't enroll in
the logic course unless what you say is true.
Jack: But doesn't this show that you don't know your true beliefs?

Is Jack right?
(9) Jack: Arthur won't move to a new apartment unless he accepts the new offer.
But this won't be true if he marries Olga.
Jill: But if he marries Olga and moves to a new apartment, he will accept the offer.
He won't be able to do both on the salary he is getting now.
Jack: So, unless one of us is wrong, he won't marry Olga.
(10) Jill: Unless you take a plane you won't meet your father.
Jack: Taking a plane is rather costly.
Jill: But your father told me that if you meet him he'll cover the expenses required for
your trip.
Jack: So if I take a plane, in the end it won't cost me.
(11) Jack, Jill and Arthur, who took a midterm test, discuss the possible outcomes.
Jack: Someone who had a look at the list told me that two students got an A.
Jill: I am expecting an A; I'll be in a bad mood if I didn't get it.
Jack: So will I.
Arthur: I don't care. This test doesn't matter so much.
Jill: So either Arthur didn't get an A, or one of us will be in a bad mood.
Chapter 5
Mathematical Interlude
5.0
Every description of a language (or setup, or system) must be phrased in some language. The
language we use in talking about the language we discuss is referred to as the metalanguage,
and the language under discussion as the object language.
When we describe French in English, the metalanguage is English and the object language is
French. The metalanguage can be the same as the object language: we can describe English
in English. The language used in this course for discussing formal systems is English, or
rather, English supplemented with some technical vocabulary.
We have been relying in our descriptions, arguments and proofs on certain basic, intuitively
grasped concepts; for example, the concept of a finite number, and that of a finite sequence.
We may say that a sentential expression is a finite sequence of symbols, and that a sentential
compound is a sentence obtained in a finite number of steps by applying connectives, and so
forth.

Initially, the use of such notions poses no problems. But as the arguments and the construc-
tions become more involved, there is an increasing need for a precise framework within which
we can define certain abstract notions and carry out proofs. The framework can help us guard
against error.[1] At the same time it should have resources for carrying out constructions and
proofs beyond the immediate grasp of our intuitions.
The need for a rigorous conceptual foundation was addressed by mathematicians and philoso-
phers in the second half of the nineteenth century. The desired foundations were laid in the
works of Dedekind, Frege and, in particular, Cantor, who created between the years 1874
and 1884 what is known as set theory. This theory, developed later by other mathematicians
(Zermelo, von Neumann, Hausdorff, Fraenkel, to name a few), provides a rigorous apparatus,
sufficiently powerful for carrying out the constructions and proofs in all formal reasoning.

[1] There are well-known examples, in the history of thought, of arguments considered clear
and self-evident, which have later turned out to be confused or fallacious.
All known formal systems can, in principle, be described within set theory; and all known valid
reasoning about them can be derived in it. Nowadays, this theory provides the most basic kit
of tools for any reasoning of a mathematical or formal nature. In its more advanced versions,
set theory, itself a sophisticated branch of mathematics, is of interest to the specialists only.
But its elementary core is employed whenever formal precision is required.
In the first section of this chapter we shall introduce some very elementary set-theoretical
notions. Our aim is not to study set theory per se, but to provide a more rigorous treatment
of formal languages and their semantics. We shall take for granted a variety of mathematical
concepts, such as natural number, finite set, and finite sequence. In set theory these and
all other mathematical concepts are defined in terms of a single primitive: the membership
relation. But such reductions do not concern us here.

The second section is devoted to a certain technique that is widely employed in defining formal
languages and in establishing their properties. This is the technique of inductive definitions
and inductive proofs.
5.1 Basic Concepts of Set Theory
5.1.1 Sets, Membership and Extensionality
A set is a collection of any objects, considered as a single abstract object.
There is a set consisting of the earth, the sun, and the moon; another, consisting of the earth,
the sun, the moon, and the planet Jupiter; and still another, consisting of the earth, the moon,
number 8, and Bill Clinton. Thus, we can put into the same set any objects whatsoever. Usually,
we consider sets whose members are of the same kind: sets of people, sets of numbers, sets of
sentences, etc. But this is not a restriction imposed by the concept of set; the theory allows
us to form sets arbitrarily.
The objects that go into a set are said to be its members. And the basic relation on which
set theory is founded is the membership relation; it holds between two objects just when the
second object is a set and the first is a member of it.
The symbol for membership is ∈. It is employed as follows:

x ∈ X

means that x is a member of the set X, and

x ∉ X

means that x is not a member of X.

Hence, if X is the set whose members are the earth, the moon, the number 8, and Bill Clinton,
then:

Earth ∈ X, 8 ∈ X, Moon ∈ X, Clinton ∈ X, Nixon ∉ X, 6 ∉ X, Jupiter ∉ X, etc.

We also have: Clinton ∉ Nixon, because Nixon is not a set.
Terminology: The membership symbol is occasionally used inside English, e.g., there
is x ∈ X is read as: there is a member, x, of X. Similar self-explanatory phrases will be
used throughout.

Sometimes (but not always!) contains, or is contained, means contains as a member, or is
contained as a member. In these cases X contains x means that x ∈ X. We also say that x
belongs to X, or that x is an element of X.

We use x, y ∈ X as shorthand for: x ∈ X and y ∈ X, and similarly for more than two
members: x, y, z ∈ X.
Extensionality
A set is completely determined by its members. This means that sets that have the same
members are the same set. Stated in full detail, the Extensionality Axiom says:
If X and Y are sets then X = Y iff every member of X is a member of Y and every member
of Y is a member of X.

Note that the "only if" direction is trivial: X = Y means that X and Y are identical, hence
they must have the same members. (This is actually a truth of first-order logic.) The real
content of the axiom consists in the "if" direction: having the same members is sufficient for
being the same set.
To see the implications of extensionality, consider the following two concepts: that of a human
being and that of a featherless two-footed animal. In an obvious sense, the concepts differ.
But humans are featherless two-footed animals, and it so happens that there are no other
such creatures besides humans. Hence, the set of humans is identical to the set of featherless
two-footed animals. When forming sets, differences between concepts that cannot be cashed
out in terms of members are ignored.
The extensionality axiom provides the standard way of proving that sets are equal. If X and
Y are sets, then to prove:
X = Y
it suffices to show that, for every x,

x ∈ X iff x ∈ Y.
Ways of Denoting Sets
The simplest way of representing sets is by listing their members. The set is denoted by
putting curly brackets, { }, around the list. The three examples given at the beginning of
the section are denoted as:
{Earth, Sun, Moon}, {Earth, Sun, Moon, Jupiter}, {Earth, Moon, 8, Clinton}
The ordering of the list and repetitions in it do not matter:
{Earth, Moon, 8, Clinton} = {Clinton, Clinton, 8, Earth, Clinton, Earth, 8, Moon}
because every member of the left-hand side is a member of the right-hand side set, and every
member of the right-hand side is a member of the left-hand side.
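The behavior required by extensionality can be seen concretely in a programming language
whose sets are extensional. The following minimal sketch (an illustration of ours, using
Python's built-in set type) checks the displayed equality:

    # Order and repetition in the listing do not matter:
    a = {"Earth", "Moon", 8, "Clinton"}
    b = {"Clinton", "Clinton", 8, "Earth", "Clinton", "Earth", 8, "Moon"}
    print(a == b)   # True: the two listings denote the same set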
The method of listing the members is not practical when the list is too long, and not feasible
if the set is infinite. Sometimes suggestive notations can be used for infinite sets, for example:
{0, 1, 2, . . .} or {0, 2, 4, . . .}
The first is the set of all natural numbers (i.e., non-negative integers), the second, of all even
natural numbers. But this method, which is based on guessing the intended rule, is very
limited.
The most natural, and, in principle, perhaps the only, way of representing a set is by means
of a defining condition: one that determines what objects belong to it. In English, the
definition has the form:
the set of all ...
where ... expresses the condition in question. Thus, we have:
The set of all positive integers divisible by 7 or 9, the set of all planets, the set of all
stars, the set of all atoms, the set of all USA citizens born in August 1991, the set
of all British kings who died before 1940, and so on.
Note that finite listing can be seen as a special case of this kind of definition:

{Earth, Moon, 8, Clinton} = the set of all objects that are either the earth,
or the moon, or the number 8, or Clinton.
In mathematics the following is used:
{x : . . . x. . .}
It reads: "the set of all x such that ...x...". Here, "...x..." states the condition about x. Instead
of "x" any other letter can be used. We shall refer to it as the standard curly-bracket notation.
The examples given above can therefore be written as follows:
{x : x is a positive integer divisible by 7 or 9},  {x : x is a planet},
{v : v is a star},  {y : y is an atom},  {z : z is a USA citizen born in August 1991},
{x : x is a British king who died before 1940},  and so on.
This is not to say that every set can be denoted by an expression of the last given form,
or, for that matter, by some other expression. In mathematics we allow for the possibility
of sets not denoted by any expression in our language; just as there may be atoms that no
description can pick out.
Variants of the Notation: Usually, set members are chosen from some fixed given domain
(itself a set). If U is the domain in question, then the set of all members, x, of U that satisfy
...x... is, of course:

{x : x ∈ U and . . . x . . .}

An alternative notation is:

{x ∈ U : . . . x . . .}

which reads: the set of all x in U such that ...x... . Thus, if N is the set of all natural
numbers, then:

{x ∈ N : x + 1 is divisible by 3} = {x : x ∈ N and x + 1 is divisible by 3}
Occasionally, the domain in question is to be understood from the context. It is also customary
to employ variables that range over fixed domains. If in the last example it is understood
that x ranges over the natural numbers, then we can omit the reference to N and write
simply
{x : x + 1 is divisible by 3}
Other variants of the notation involve the use of functions. For example,

{2x : x ∈ N}   and   {x^2 : x ∈ N}

are, respectively, the set of all numbers of the form 2x and the set of all numbers of the form
x^2, where x ranges over N (i.e., the set of all even natural numbers and the set of all squares).
We can use for these sets the standard notation; but this would result in longer expressions.
For example:

{x^2 : x ∈ N} = {z : there is x ∈ N, such that z = x^2}
Once you get used to them you will find these and other notations self-explanatory. The
following exercises will help you get accustomed to set-theoretic notations and phrasings.
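For readers who like to experiment: the curly-bracket notation maps directly onto Python's
set comprehensions, provided the infinite domain N is cut down to a finite initial segment.
A sketch (the bound 30 is an arbitrary choice of ours):

    N30 = range(30)                              # a finite stand-in for N
    s = {x for x in N30 if (x + 1) % 3 == 0}     # {x in N : x+1 divisible by 3}
    evens = {2 * x for x in N30}                 # {2x : x in N}
    squares = {x ** 2 for x in N30}              # {x^2 : x in N}
    print(sorted(s))                             # [2, 5, 8, ..., 29]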
Homework 5.1 Translate the following into the standard curly-bracket notation.

(1) The set of all people who like themselves.

(2) The set of all integers that are smaller than their squares. (Recall, the square of x is x^2.)

(3) The set of all people married to 1992 Columbia students.

Rewrite the following in the curly-bracket functional notation. You can use N and Z to
denote, respectively, the set of natural numbers and the set of integers. For (6) use father(x)
to denote the father of x.

(4) The set of all positive multiples of 4.

(5) The set of all successors of integers divisible by 5.

(6) The set of all fathers of identical twins.

Describe in English the following sets; use short, neat descriptions. ("Livings", "Humans",
and "Planets" have the obvious meanings.)

(7) {x ∈ Livings : x has two legs}

(8) {x ∈ Humans : x has more than one child}

(9) {x ∈ Planets : x is larger than the earth}

Rewrite the following in the standard curly-bracket notation.

(10) {3x : x ∈ Primes}

(11) {x · y : x ∈ Primes, y ∈ Primes}

(12) {2x + y^2 : x ∈ N, y ∈ Primes}
Note: The concept of a set is primitive. It cannot be defined by reduction to more basic
concepts. Explanations and examples (like the ones just given) may serve to get the concept
across, but they do not amount to definitions. In an indirect way, the concept is determined
by what we take to be the basic properties of sets. The same takes place in Euclidean
geometry, where the undefined concepts of point, line and plane are indirectly determined
by the geometrical axioms. Like geometry, set theory is a system based on axioms. Some
are obvious. Others, belonging to more sophisticated parts of the theory, require deep
understanding. Except for extensionality, the axioms are not discussed here.
Singletons The set {x} has a single member, namely, x. Such a set is called a singleton,
or a unit set; {x} is the singleton of x, or the unit set of x.
One may be tempted to identify the singleton of x with x itself. The temptation should be
resisted. The singleton {Clinton} is a set containing Clinton as its sole member. Clinton
himself is a man, not a set. Just so, one distinguishes between John the man and the one-
member committee having John as its only member. If all the committee members except
John perish in a crash, the committee becomes a one-member committee; but you do not want
to say that it becomes a man. The standard version of set theory has an axiom, called the
regularity axiom, which implies that nothing can be a member of itself. It therefore implies
that, for all x, x ≠ {x} (because x ∈ {x}, but x ∉ x).

The singleton of x is {x}, the singleton of {x} is {{x}}, the singleton of {{x}} is {{{x}}},
and so on: {. . . {{x}} . . .}. It can be shown (assuming the regularity axiom) that all of these
are different from each other.
The Empty Set Among sets we include the so-called empty set: one that has no members.
At first glance one may find this strange, as one might find strange, at first, the idea of the
number zero. In fact, the concept is simple, highly useful and easily handled.
We speak about the empty set, because there is only one. This follows from extensionality: If
X and X′ are sets that have no members, then X = X′, because they have the same members
(for every x: x ∈ X iff x ∈ X′). The empty set is denoted as:

∅

Note that every object that is not a set (e.g., every physical object) has no members. Ex-
tensionality does not make these objects equal to ∅, because extensionality applies only to
sets.
5.1.2 Subsets, Intersections, and Unions
If X and Y are sets, then we say that X is a subset of Y if every member of X is a member
of Y. We also say in that case that Y is a superset of X. The notation is:

X ⊆ Y,   or, equivalently,   Y ⊇ X.
Occasionally, we use the term inclusion: we say that X is included in Y, meaning that X is
a subset of Y.

As is usual in mathematics, crossing out indicates negation:

X ⊈ Y

means that X is not a subset of Y.
Obviously, X ⊆ X, for every set X.

Proper Subsets: If X ⊆ Y and X ≠ Y, then X is said to be a proper subset of Y, or
properly included in Y, and Y is said to be a proper superset of X.

If X ⊆ Y and Y ⊆ X, then X and Y have the same members and, by extensionality, are the
same. Therefore

X = Y iff X ⊆ Y and Y ⊆ X.

It is convenient to chain inclusions thus: X ⊆ Y ⊆ Z; it means: X ⊆ Y and Y ⊆ Z. Set
inclusion is transitive:

If X ⊆ Y ⊆ Z then X ⊆ Z.

(The proof is trivial: Assume the left-hand side. If x ∈ X then x ∈ Y, because X ⊆ Y; hence
also x ∈ Z, because Y ⊆ Z; therefore every member of X is a member of Z.)

Every set, X, contains as members all members of the empty set (because the empty set has
no members). Hence,

∅ ⊆ X, for every set X.
Note: The subset relation, ⊆, should be sharply distinguished from the membership
relation, ∈. Every set is a subset of itself, but not a member of itself. On the other hand, a
member of a set need not be a subset of it; the earth is a member of {Earth, Moon}, but it
is not a subset of it, because the earth is not a set. Or consider the following:

∅ ⊆ {{∅}} (and the inclusion is proper), but ∅ ∉ {{∅}}; because the only
member of {{∅}} is {∅}, and ∅ ≠ {∅}.

{∅} ∈ {{∅}} but {∅} ⊈ {{∅}}; because {∅} contains ∅ as a member, whereas
{{∅}} does not contain ∅ as a member.
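The ⊆ versus ∈ contrast can also be tested mechanically. The sketch below (ours) uses
Python's frozenset, which is hashable, so that sets can be members of sets; it reproduces
the two displayed examples:

    empty = frozenset()                  # plays the role of the empty set
    s1 = frozenset({empty})              # {empty}
    s2 = frozenset({s1})                 # {{empty}}
    print(empty <= s2, empty in s2)      # True False: subset of s2, not a member
    print(s1 in s2, s1 <= s2)            # True False: member of s2, not a subset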
Intersections
The intersection of two sets X and Y, denoted X ∩ Y, is the set whose members are all the
objects that are members both of X and of Y:

For every x, x ∈ X ∩ Y iff x ∈ X and x ∈ Y,

or, equivalently:

X ∩ Y = {x : x ∈ X and x ∈ Y}
Examples: The intersection of the set of all natural numbers divisible by 2 and the set of
all natural numbers divisible by 3 is the set of all natural numbers divisible both by 2 and by
3. (This is the set of natural numbers divisible by 6.)
The intersection of the set of all even natural numbers and the set of all prime numbers is
the set of all numbers that are both even and prime; since the only number that is both even
and prime is 2, this is the singleton {2}.
The intersection of the set of all USA citizens and the set of all redheaded people is the set
of all redheaded USA citizens.
The intersection of the set of all women and the set of all pre-1992 USA presidents is the set
of all women that have been, at some time before 1992, USA presidents. This happens to be
the empty set.
Disjoint Sets: Two sets, X, Y, are said to be disjoint if they have no common members;
i.e., if X ∩ Y = ∅.
Unions
The union of the sets X and Y, denoted X ∪ Y, is the set whose members are all objects that
are either members of X or members of Y (or members of both). That is:

For every x, x ∈ X ∪ Y iff x ∈ X or x ∈ Y,

or, equivalently:

X ∪ Y = {x : x ∈ X or x ∈ Y}
Examples: The union of the set of all natural numbers that are divisible by 6 and the set
of all natural numbers that are divisible by 4 is the set of all numbers divisible either by 6 or
by 4 (or by both, e.g., 12).
The union of the set of all mammals and the set of all humans is the set of all creatures that
are either mammals or humans; since every human is a mammal, this union is the set of all
mammals.
The union of the set of all people that were, at some time up to t, senators, and the set of
all people who were, at some time up to t, congressmen, is the set of people who were at one
time or another, up to time t, members of at least one of the legislative houses.
The basic properties of intersections and unions are the following:

(X ∩ Y) ∩ Z = X ∩ (Y ∩ Z)        (X ∪ Y) ∪ Z = X ∪ (Y ∪ Z)
X ∩ Y = Y ∩ X                    X ∪ Y = Y ∪ X
X ∩ X = X                        X ∪ X = X
X ∩ ∅ = ∅                        X ∪ ∅ = X

The equalities of the first row mean that the operations of intersection and union are asso-
ciative, those of the second row mean that they are commutative, and those of the third row
that they are idempotent. These properties follow directly from the meanings of "and" and
"or". They are so obvious that one would hardly consider proving them formally. Formal, but
tedious, proofs can be given. When this is done, one sees that the associativity of intersection
reflects the associativity of "and" (i.e., the fact that (A ∧ B) ∧ C and A ∧ (B ∧ C) are logically
equivalent) and the associativity of union reflects that of "or".
Repeated Intersections and Repeated Unions: Intersections can be applied repeatedly
to more than two sets, and the same holds for unions. Since these operations are associative,
we can ignore grouping and use expressions such as:

X_1 ∩ X_2 ∩ . . . ∩ X_n        X_1 ∪ X_2 ∪ . . . ∪ X_n

And since the operations are commutative, the order of the sets can be changed without
affecting the result.
It is easily seen that X_1 ∩ X_2 ∩ . . . ∩ X_n is the set of all objects that are members of all the
sets X_1, . . . , X_n. Similarly, X_1 ∪ X_2 ∪ . . . ∪ X_n is the set of all objects that are members of at
least one of X_1, . . . , X_n.
Distributive Laws: These two equalities hold in general:

X ∩ (Y ∪ Z) = (X ∩ Y) ∪ (X ∩ Z)        X ∪ (Y ∩ Z) = (X ∪ Y) ∩ (X ∪ Z)

The first is the distributive law of intersection over union, the second of union over inter-
section. These laws are direct outcomes of the following two tautologies:

x ∈ X and (x ∈ Y or x ∈ Z) iff (x ∈ X and x ∈ Y) or (x ∈ X and x ∈ Z).

x ∈ X or (x ∈ Y and x ∈ Z) iff (x ∈ X or x ∈ Y) and (x ∈ X or x ∈ Z).
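All of these laws are easy to spot-check on sample sets; a small sketch (the three sample
sets are arbitrary choices of ours):

    X, Y, Z = {1, 2, 3}, {2, 3, 4}, {3, 4, 5}
    assert (X & Y) & Z == X & (Y & Z)          # associativity of intersection
    assert X | Y == Y | X                      # commutativity of union
    assert X & (Y | Z) == (X & Y) | (X & Z)    # intersection over union
    assert X | (Y & Z) == (X | Y) & (X | Z)    # union over intersection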
Obviously, each of X and Y includes (as a subset) their intersection X ∩ Y, and is included
in their union X ∪ Y. This can be stated thus:

X ∩ Y ⊆ X, Y ⊆ X ∪ Y

As is easily seen, the subset relation can be characterized in terms either of unions or of
intersections:

X ⊆ Y iff X ∩ Y = X        X ⊆ Y iff X ∪ Y = Y
We also have:

If X ⊆ X′ and Y ⊆ Y′, then X ∩ Y ⊆ X′ ∩ Y′ and X ∪ Y ⊆ X′ ∪ Y′.

Every set which is included both in X and in Y is included in their intersection. This follows
easily from the definitions. (It is also derivable from the above-given properties: If Z ⊆ X
and Z ⊆ Y, then Z = Z ∩ Z ⊆ X ∩ Y.)

Therefore, the intersection of two sets X and Y is

(i) included both in X and in Y, and

(ii) includes every set that is included in X and in Y.

We can express this by saying that X ∩ Y is the largest set that is included both in X and in
Y.

Similarly, the union of X and Y can be characterized as the smallest set that includes both X
and Y.
Homework

5.2 Let N = {0, 1, 2, . . . , n, . . .} and let x, y, z range over N. Let

X_1 = {0, 1, 5, 7, 10, 13, 18, 19, 20}

X_2 = {3, 4, 5, 17, 21, 8, 9, 6, 1}

X_3 = {21, 31, 20, 40, 1, 0, 3, 20}

X_4 = {2x : x > 3}

X_5 = {x : x is divisible by 2 or by 3}

X_6 = {x : x is prime}
Write down in the curly-bracket notation (using ∅ for the empty set) the following sets:

1. X_1 ∩ X_2

2. X_1 ∪ X_3

3. X_3 ∩ X_4

4. X_3 ∪ X_4

5. X_1 ∩ X_2 ∩ X_3

6. (X_1 ∩ X_6) ∪ (X_5 ∩ X_2)

7. (X_5 ∪ X_6) ∩ X_1

8. (X_6 ∩ X_5) ∪ (X_1 ∩ X_3)

9. (X_4 ∩ X_6) ∪ X_5

10. X_4 ∩ (X_6 ∪ X_5)
5.3 For any two sets, X, Y, define X − Y by:

X − Y = {x ∈ X : x ∉ Y}

With the X_i's as in 5.2, write down in the curly-bracket notation (using ∅ for the empty
set) the following sets:

1. X_1 − X_2

2. X_2 − X_1

3. X_6 − X_5

4. X_4 − X_5

5. (X_3 − X_1) − (X_2 − X_4)

6. (X_1 − X_3) − X_2

7. X_1 − (X_3 − X_2)

8. N − X_4

9. X_4 − N

10. X_5 − (X_6 − X_4)
5.1.3 Sequences and Ordered Pairs
Sequential orderings underlie almost everything. Impressions, actions, events, come arranged
in time. Quite early in our life we become acquainted with finite sequences. We learn,
moreover, that different sequences can be made by arranging the same objects in different
ways. We also learn that elements can be repeated; the same color, shape, or whatnot, can
occur in different places. We learn to identify certain sequences of letters as words, and certain
sequences of words as sentences. Particular sequences of tones and rests make up tunes,
and some sequences of moves constitute games. Sequences are all around.

We shall not define here the notion of a sequence in set-theoretical terms. Relying on our
intuitive understanding we shall take it for granted. Sequences can be formed from any given
objects. And the sequences are objects themselves.
We may use "a_1, a_2, . . . , a_n" to denote the sequence of length n in which a_1 occurs in the first
place, a_2 in the second, ..., and a_n in the n-th. But this notation is often inconvenient; for
we also use "a_1, a_2, . . . , a_n" to refer to a plurality (we say: the numbers 3, 7, 11, 19 are prime),
whereas a sequence is a single object. Therefore we have notations that display more clearly
the sequence as an object. The most common are:

(a_1, a_2, . . . , a_n)   and   ⟨a_1, a_2, . . . , a_n⟩

Finite sequences are called tuples; sequences of length n, n-tuples. The expression "i-th co-
ordinate" is used, ambiguously, for the i-th place, as well as for the object occurring in that
place.
The sequences we encounter are finite. But the notion can be extended to infinite cases. We
can speak of the infinite sequence of natural numbers:

(0, 1, 2, . . . , n, . . .)

or of the sequence of even natural numbers:

(0, 2, 4, . . . , 2n, . . .)

In this course we shall be concerned only with finite sequences; though we may mention
infinite sequences of numbers or of symbols.
It is convenient to refer to the objects occurring in the sequence as its members. The object
occurring in the i-th place is the i-th member of the sequence. Do not confuse this with the
membership relation of set theory! As a rule, the context indicates the intended meaning of
"member".
Equality of Sequences: A sequence is determined by its length (the number of places, or
of occurrences) and by the order in which objects occur: its first member, its second member,
etc. Sequences are equal when they are exactly the same: they have the same length and,
in each place, the same object occurs. Formally:

(a_1, . . . , a_m) = (b_1, . . . , b_n)   iff   m = n and a_i = b_i, for all i = 1, . . . , m.

They are thus quite different from sets. A set is completely determined by its members.
Set-theoretic notations may list the members in some sequential order, but neither the order
nor repeated listings make a difference.
{0, 1, 1, 1} = {1, 0, 1, 1} = {0, 1} = {1, 0}
But the sequences
(0, 1, 1, 1), (1, 0, 1, 1), (0, 1), (1, 0)
are all different.
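In programming terms: Python's tuples are sequences and its sets are sets, so the contrast
can be checked directly; a minimal sketch:

    print({0, 1, 1, 1} == {1, 0})          # True:  as sets, equal
    print((0, 1, 1, 1) == (1, 0, 1, 1))    # False: as sequences, different
    print((0, 1) == (1, 0))                # False: order matters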
Ordered Pairs, Triples, Quadruples, etc. Ordered pairs, or pairs for short, are 2-tuples;
triples are 3-tuples; quadruples are 4-tuples, and so on.
Ordered pairs are of particular importance. The identity condition for sequences becomes, in
the case of ordered pairs, the well-known condition:

(a, b) = (a′, b′)   iff   a = a′ and b = b′.
5.1.4 Relations and Cartesian Products
We have seen that any property of objects (belonging to some given domain) determines a
set: the set of all objects (in the given domain) that have the property. We can therefore use
sets as substitutes for properties. (By doing so we disregard the difference between any two
properties that determine the same set.)
There are creatures that, like properties, are true of objects, but which involve more than one
object: they relate objects to each other. For example, the parent-child relation holds for any
pair of objects, x and y, such that x is a parent of y. Set theory provides a very simple and
elegant way of representing these creatures:
5.1. BASIC CONCEPTS OF SET THEORY 167
Regard the relation as a property of ordered pairs and represent it, accordingly, as a set of
ordered pairs.
Thus, the parent-child relation is the set of all ordered pairs (x, y), such that x is a parent of
y. If, for the sake of illustration, we restrict our universe to a domain consisting of:
Olga, Mary, Ruth, Jack, John, Abe, Bert, Nancy, Frieda,
and if the parent-child relation among these people is given by:
Abe is the father of Ruth and Jack,
Olga is the mother of Mary, Abe and Nancy,
Jack is the father of Bert,
John is the father of Nancy,
and there are no other parent-child relationships, then, over this domain, the parent-child
relation is simply the set:

{(Abe, Ruth), (Abe, Jack), (Olga, Mary), (Olga, Abe), (Olga, Nancy), (Jack, Bert), (John, Nancy)}
Note that the child-parent relation is obtained by switching the two coordinates. It contains
as members: (Ruth, Abe), (Jack, Abe), (Mary, Olga), etc.
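Representing relations as sets of pairs is exactly how one would code them. A sketch (ours)
of the sample parent-child relation as a Python set of 2-tuples, with the child-parent relation
obtained by switching the coordinates:

    parent_child = {
        ("Abe", "Ruth"), ("Abe", "Jack"), ("Olga", "Mary"),
        ("Olga", "Abe"), ("Olga", "Nancy"), ("Jack", "Bert"), ("John", "Nancy"),
    }
    child_parent = {(y, x) for (x, y) in parent_child}   # switch the coordinates
    print(("Ruth", "Abe") in child_parent)               # True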
Relations that involve three members are construed, accordingly, as sets of 3-tuples. For
example, the betweenness relation, which holds between any three points x, y, z on a line
such that y is between x and z, is the set of all triples (x, y, z) such that y is between x and
z. Here, to sum up, are some basic notions and terms:

A binary relation is a set of ordered pairs.

An n-ary relation (also called an n-place relation) is a set of n-tuples.

Unqualified "relation" often means a binary relation.

If R is an n-ary relation, then n is referred to as the arity of R, or the number
of places of R.
{(x_1, x_2, . . . , x_n) : . . . x_1 . . . x_2 . . . x_n . . .} is the set of all tuples (x_1, x_2, . . . , x_n)
satisfying the condition stated by . . . x_1 . . . x_2 . . . x_n . . . .
The betweenness relation above can be written as:

{(x, y, z) : y is between x and z}

where x, y and z range over geometrical points. Here are some other examples:

{(x, y) : x is a parent of y},  {(x, y) : y is a parent of x},  {(x, y, z) : x introduced y to z}

{(x, y) : x and y are real numbers and y = 2x + 1}

{(x, y) : x and y are natural numbers and x ≤ y}

Note: The variables in relational notation are used as place holders, that is, to correlate
coordinates with places in the defining expression. Different variables, or the same variables
in different roles, can achieve the same effect:

{(x, y) : x is a parent of y} = {(y, x) : y is a parent of x} = {(u, x) : u is a parent of x}

But

{(x, y) : x is a parent of y} ≠ {(x, y) : y is a parent of x}
The first relation consists of pairs in which the parent occupies the first coordinate, the child
the second; in the other relation the child is in the first place, the parent in the second.

Self-explanatory variants of our notation involve repetitions of variables, e.g.,

{(x, x) : x ∈ D}

is the set of all pairs (x, x), where x ranges over D. It is equal to {(x, y) : x, y ∈ D and x = y}.
Note: The arity of the relation is the length of the tuples; it may be greater than the number
of different variables that appear in the definition, because, as we have just seen, the same
variable can occupy different places in the tuple.
Relations Over a Given Domain: Often, we consider relations that relate objects of
particular kinds: numbers, people, animals, words, etc. We say that a relation is over D if it
consists of tuples whose members belong to D.
Usually, the variables range over well-defined domains. In "x is an uncle of y", x and y
range, obviously, over people. Relations can, however, relate objects of different kinds; e.g.,
the ownership relation that holds between x and y, just when x is a person, and y is an object
owned by x.
Homework
5.4 Consider binary relations consisting of the pairs (x, y), determined respectively by the
following conditions. (When the signs ≤, <, =, ≠ are used, the variables range over the
natural numbers.)

(1) x is a brother of y. (2) y is a sibling of x. (3) x ≤ y. (4) y < x. (5) x ⊆ y,
where x and y are sets of natural numbers. (6) x ≠ y. (7) y = x. (8) x is an
ancestor of y. (9) y is a child of x. (10) x = y = 3. (11) x and y are natural
numbers.
Find out which of the following inclusions are true (the numbers refer to the corresponding
relations). Justify your answers.

(1) ⊆ (2),  (2) ⊆ (1),  (3) ⊆ (4),  (4) ⊆ (3),  (7) ⊆ (3),  (7) ⊆ (4),  (7) ⊆ (10),
(10) ⊆ (7),  (8) ⊆ (9),  (9) ⊆ (8)
5.5 A (binary) relation, R, is:

symmetric, if whenever (x, y) ∈ R, also (y, x) ∈ R;

transitive, if whenever (x, y), (y, z) ∈ R, also (x, z) ∈ R;

reflexive over the domain D, if whenever x ∈ D, (x, x) ∈ R.

Note that reflexivity depends also on the domain. A relation which is reflexive over a domain
can cease to be so with respect to a larger domain. Usually, when we speak of a reflexive
relation, the domain is presupposed.

Find out which of the relations of 5.4 are symmetric, which are transitive and which are
reflexive (over the naturally associated domain). Justify your conclusions. If the property in
question does not hold, show this by a counterexample.
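For finite relations, these three properties can be checked by brute force. The following
sketch (our own helper functions, illustrated on the smaller-than relation restricted to
{0, ..., 4}) may help with experimenting:

    def is_symmetric(R):
        return all((y, x) in R for (x, y) in R)

    def is_transitive(R):
        return all((x, w) in R for (x, y) in R for (z, w) in R if y == z)

    def is_reflexive(R, D):
        return all((x, x) in R for x in D)

    lt = {(x, y) for x in range(5) for y in range(5) if x < y}
    print(is_symmetric(lt), is_transitive(lt), is_reflexive(lt, range(5)))
    # False True False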
Cartesian Products

Let X_1, X_2, . . . , X_n be sets. The Cartesian product of X_1, X_2, . . . , X_n, denoted as:

X_1 × X_2 × . . . × X_n,

is the set of all n-tuples in which the first coordinate is a member of X_1, the second coordinate
is a member of X_2, and so on ..., the n-th coordinate is a member of X_n. Formally, for all x:

x ∈ X_1 × . . . × X_n   iff   there are x_1, x_2, . . . , x_n such that: x = (x_1, x_2, . . . , x_n)
and x_i ∈ X_i, for i = 1, 2, . . . , n.

We can also express this using the notation for sets of tuples:

X_1 × X_2 × . . . × X_n = {(x_1, x_2, . . . , x_n) : x_i ∈ X_i, for i = 1, 2, . . . , n}
Note: In (x_1, x_2, . . . , x_n), the index is the place-number in the sequence. But there is no
general rule that ties indices to place-numbers. The first member of (x_3, x_1, x_1) is x_3, the
second is x_1 and the third is x_1.
Examples:

{1, 2} × {0, 1, 2} = {(1, 0), (1, 1), (1, 2), (2, 0), (2, 1), (2, 2)}

{1, 2} × {1, 2} × {2, 3} = {(1, 1, 2), (1, 1, 3), (1, 2, 2), (1, 2, 3), (2, 1, 2), (2, 1, 3), (2, 2, 2), (2, 2, 3)}
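Python's standard library computes Cartesian products of finite sets directly; a sketch
reproducing the first example:

    from itertools import product

    print(set(product({1, 2}, {0, 1, 2})))    # the product of {1, 2} and {0, 1, 2}
    print(set(product({1, 2}, repeat=3)))     # the third Cartesian power of {1, 2}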
Cartesian Powers: If X_i = X, for i = 1, . . . , n, then X_1 × . . . × X_n is said to be the n-th
Cartesian power of X and is denoted as:

X^n

Obviously, X^n is the set consisting of all n-tuples of members of X. If n = 2, it is the set of
all ordered pairs of members of X.
Historically, the concept of Cartesian product was derived from geometry. A coordinate
system for the plane consists of two perpendicular directed lines, which are referred to as
axes. Each point in the plane can be projected on the two axes, determining thereby an
ordered pair of real numbers (x, y), where x represents the projection on the first axis, y the
projection on the second. Vice versa, every pair of numbers determines a unique corresponding
point. In this way it is possible to identify the plane with the Cartesian product R × R, or R^2,
where R is the set of all real numbers. Similarly, the three-dimensional space can be identified
with R^3. This representation, by now a commonplace, amounts to a major breakthrough in
the history of science. It was discovered around 1637 by Descartes. ("Cartesian" derives from
his Latin name, Cartesius.)

All of geometry is reducible, in principle, to a system that deals with pairs, or triples, of
numbers. Moreover, the concept of a Cartesian product makes possible the definition and the
study of higher-dimensional geometrical spaces, structures that resist visualization. A geo-
metrical space of 4 dimensions is simply R^4, that of 5 dimensions is R^5, and an n-dimensional
space is R^n.
Homework

5.6 Let X_1 = {0, 2, 4}, X_2 = {0, 5}. Write down the following sets in the curly-bracket
notation:

X_1 × X_2,  X_2 × X_1,  X_1^2,  X_2^3,  X_1 × {2} × X_1,  X_1 × {∅} × X_1,  X_1 × ∅ × X_1.
5.7 Prove the following:

(i) X × (Y ∪ Z) = (X × Y) ∪ (X × Z)

(ii) (X ∪ Y) × Z = (X × Z) ∪ (Y × Z)

(iii) X_1 × . . . × X_n = ∅ iff one of X_1, X_2, . . . , X_n is empty.
5.8 Prove that, if no X_i is empty, then X_1 × X_2 × . . . × X_n = Y_1 × Y_2 × . . . × Y_n iff X_i = Y_i,
for i = 1, 2, . . . , n.

What can you deduce from this concerning the equality X × Y = Y × X?
Functions

Historically, functions have been conceived as laws by which a magnitude is determined by
another; for example, the distance traveled by a falling body is said to be a function of the
time of fall. Functions have also been considered as rules that correlate objects with objects.
Thus, there is a function that correlates with every number, x, the number x^2; and one that
correlates with x the number 2x − 1. Functions can be defined for any kind of objects; e.g.,
there is a function that correlates, with each person, the person's mother, and there is one that
correlates, with each star, the galaxy it belongs to.

Commonly, "f(x)" denotes the object that the function f correlates with x (assuming, of
course, that f is defined for x). We say that f(x) is the value of f for the argument x, and
also that it is the value of x under f.
The intuitive concept of a rule, or law, by which the correlation is determined is too
vague for mathematical purposes. Historically, the list of entities admitted as functions kept
growing, until mathematicians came to realize that they need an abstract concept of function,
which does not rely on the notion of a defining rule. Set theory provides a perfect definition
of such a concept. Consider the set of all pairs (x, y) such that y is the value correlated with
x. One can define the function as being simply this set. Given such a set, the function assigns
a value to each x for which there is a y such that (x, y) is in the set; that y is the value of x
under the function.

Not every set of ordered pairs will do as a function. It should satisfy the condition that, for
every x, there is at most one y such that (x, y) is in the set (the function should assign to
any object no more than one value). Any set satisfying this condition is a function. Stated
in full, the definition is this:
A function, f, is a relation (set of ordered pairs) such that: for all x, y, y′, if
(x, y) ∈ f and (x, y′) ∈ f, then y = y′.

If there is a y such that (x, y) ∈ f, then we say that f is defined for x. The domain of the
function, which we denote as dom(f), is the set of all objects for which the function is defined.
If x ∈ dom(f), then the value of f for x is the unique y such that (x, y) ∈ f. The value is
denoted by f(x).

Example: If f = {(x, 3x^2 + 1) : x ∈ N}, where N is the set of natural numbers, then f is
a function, dom(f) = N, and f(x) = 3x^2 + 1 for all x ∈ N.
Functions differ just when they differ as sets of ordered pairs. It is easy to see that two
functions, f and g, are equal just when they are defined for the same objects and assign to
every object the same value. Formally:

f = g   iff   dom(f) = dom(g) and f(x) = g(x), for all x ∈ dom(f).
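For finite domains, a function-as-a-set-of-pairs is essentially a Python dictionary: each key
carries at most one value, which is exactly the defining condition. A sketch (ours), restricting
the example above to x < 10:

    f = {x: 3 * x ** 2 + 1 for x in range(10)}   # f(x) = 3x^2 + 1, for x < 10
    print(set(f))                                # dom(f)
    print(f[2])                                  # the value for the argument 2: 13
    g = dict(f)
    print(f == g)                                # True: same domain, same values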
Functions come under many names, which suggest diverse aspects and uses of the concept.
We have correlation, assignment, and correspondence, which suggest a pairing of objects with
objects. And we have also operator and operation, which suggest the transforming of objects
into objects. We might say that squaring is an operation by which any number x is transformed
into x^2. Transformation is itself a term adopted in mathematics for functions of a certain
type. We have also the term mapping, which suggests both a matching of items and a copying
of one thing to another.
One-to-One Functions and Equinumerous Sets: A function f is said to be one-to-one if
it correlates with different objects in its domain different values; that is, for all x, y ∈ dom(f):

x ≠ y  ⇒  f(x) ≠ f(y)

Or, equivalently expressed:

f(x) = f(y)  ⇒  x = y
The concept of a one-to-one function is extremely important in mathematics. It serves, among
other things, to define the concept of equinumerous sets; these are sets that have the same
number of members:

The set X is equinumerous to the set Y if there exists a one-to-one function
f, such that X = dom(f) and Y = {f(x) : x ∈ X}. (In words: Y is the set
of all objects correlated via f with members of X.)

A little reflection will show that this definition captures all there is to "having the same
number". There are exactly as many forks in the drawer as there are spoons, just when one can
pair with every fork a spoon, so that different forks are paired with different spoons and every
spoon is paired with some fork. The function that correlates with each fork its paired spoon is
a function that satisfies the conditions of the last definition. Vice versa, any such function
determines a pairing of spoons with forks.

A crucial feature of the definition is that it applies to all sets, finite as well as infinite. We
can therefore define when two infinite sets have the same number of elements. The definition
was introduced by Cantor, who derived from it the general concept of a cardinal number; that
is, a (possibly infinite) number which can serve as an answer to the question: How many
elements are there in a set?
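For finite sets, the definition can be tested directly: a dict is one-to-one just when no value
repeats. A sketch with made-up forks and spoons:

    def is_one_to_one(f):
        return len(set(f.values())) == len(f)    # no value is taken twice

    forks, spoons = {"f1", "f2", "f3"}, {"s1", "s2", "s3"}
    pairing = dict(zip(sorted(forks), sorted(spoons)))
    print(is_one_to_one(pairing))                                  # True
    print(set(pairing) == forks, set(pairing.values()) == spoons)  # True True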
Functions of Several Arguments: So far we have discussed functions of one argument,
i.e., functions that correlate objects with objects. Often, however, we correlate objects with
more than one object. We may correlate with every two numbers, x and y, their sum: x + y;
and we may correlate with x, y and z the number x^y + z. Such cases are construed as
functions of many arguments.

It is possible to subsume functions of n arguments under functions of one argument, by
regarding them as one-argument functions defined for n-tuples. Under this construal, the
function that assigns to x and y the difference x − y is a one-argument function, whose
domain is R × R; it assigns to any ordered pair the number obtained by subtracting the
second coordinate from the first.
Alternatively, and this is sometimes more convenient, we can define a function of n argu-
ments as an (n+1)-ary relation, say f, which satisfies the condition:

If (x_1, . . . , x_n, y) ∈ f and (x_1, . . . , x_n, y′) ∈ f, then y = y′.

An n-place function, f, is defined for x_1, . . . , x_n, if there exists y such that (x_1, . . . , x_n, y) ∈ f.
Such a y is, of course, unique and we denote it by:

f(x_1, . . . , x_n)
5.2 Inductive Definitions and Proofs, Formal Languages

5.2.1 Inductive definitions
John McGregor (a Scottish squire from the 17th century) had four children:
Mary, James, Robert and Lucy.

James died childless. Mary had two children. Robert and Lucy had three
children each.

William, Mary's first child, had one child.

...

and so the story goes on.
We may not know the descendants of John McGregor or their number, but we have no trouble
in understanding what a descendant of John McGregor is. It is either a child of McGregor,
or a child of a child, or a child of a child of a child, ... and so on.

Using the concept of a finite sequence it is not difficult to give an explicit definition of the set
of a's descendants, where a is a person:

x is a descendant of a iff there is a finite sequence (a_1, . . . , a_n), such that a_1
is a child of a, a_{i+1} is a child of a_i, for all i = 1, . . . , n−1, and a_n = x.
The sequence just described shows the chain connecting x to a. The condition concerning the
sequence can be relaxed:

A person x is a descendant of a iff there is a finite sequence in which x is
the last member, and every member of it is either a child of a or a child of
some previous member.

There is another way of defining the set of descendants, which does not employ finite se-
quences. Consider the following two properties of a set X:

(I) If x is a child of a, then x ∈ X (i.e., every child of a is a member of X).

(II) If x ∈ X and y is a child of x, then y ∈ X (i.e., every child of a member of
X is a member of X).
It is obvious that the set of all descendants of a satisfies (I) and (II); i.e., if X = the set of all
descendants of a, then (I) and (II) are true.

There are other sets that satisfy (I) and (II); for example, the set of all persons (because
every child of a is a person and every child of a person is a person); or the set of all people
that are descendants either of a or of b. But every set that satisfies (I) and (II) includes as a
subset the set of descendants of a: First, by (I), it contains as members all the children of a;
second, by (II), it contains also all the children's children; hence, by (II) again, it contains all
the children's children's children, and so on. Therefore we have:

The set of all descendants of a is the smallest set that satisfies (I) and (II).
Here by "smallest" we mean that it is included as a subset in every set that satisfies (I) and
(II). Note that if there is a smallest set it must be unique: if Y_1 and Y_2 are both smallest
sets, then Y_1 ⊆ Y_2 and Y_2 ⊆ Y_1.
Therefore we can also say:

x is a descendant of a iff it belongs to every set satisfying (I) and (II).

Frege was the first to give definitions of this type.

Note: Instead of (I) and (II) we can use a single condition: their conjunction. This condition
can be stated as follows:

(III) If x is either a child of a or a child of a member of X, then x ∈ X.

The existence of a smallest set that satisfies a given condition is a property of the condition.
Not every condition has this property. Consider, for example, the condition of being non-
empty. There is no smallest non-empty set. Because if b and c are any two different objects,
both {b} and {c} are non-empty; but there is no non-empty set that is a subset of both
({b} ∩ {c} = ∅). Each of {b} and {c} is a minimal non-empty set: it has no proper subset
which is not empty; but it is not the smallest non-empty set. Or consider the following
condition on X:

(IV) At least three children of McGregor are members of X.

Given that Mary, James, Robert and Lucy are McGregor's children, each of the following sets
satisfies (IV):

{Mary, James, Robert}        {James, Robert, Lucy}

But no set included in both satisfies (IV), because their intersection is {James, Robert}.

If Y is the smallest set satisfying the condition P, then it is (i) a member of the family of all
sets satisfying P, and (ii) a subset of every set in this family. [By a "family of sets" we mean
a set whose members are sets.] Hence, Y is the intersection of all sets satisfying P.
Note: In 5.1.2 we defined intersections of a finite number of sets. The definition generalizes
easily to any non-empty family, F, of sets: The intersection of the members of F is the set
consisting of those objects that are members of every set in F. An analogous generalization
applies to unions: The union of all the sets in the family F is the set whose members are all
objects that belong to some member of F.
Operations on Sets, Monotonicity and Fixed Points

Our first definition of descendants tells us how to get each descendant by some finite, bottom-
up construction of a sequence. The second definition represents a top-down approach, in
which we form the intersection of all the sets that satisfy certain conditions. There is a
connection between the two definitions. It is brought out by regarding (I) and (II) not only
as conditions, but as rules that determine operations on sets. If X is the set that is operated
on, then the rules are as follows:

(I′) If x is a child of a, add x to X.

(II′) Add to X the children of its members.

Here, applying (II′) to X means to add to it all the children of its members. (If no member
of X has children, or if all the children of the members of X are already in X, no new
members are added.)
Henceforth we use (I′) and (II′) to refer to these operations. Start with the empty set,
X_0 = ∅. Applying (I′) to X_0 we get the set X_1, consisting of all the children of a. By
applying (II′) to X_1 we get a set, X_2, consisting of all the children of a and all their
children. Again, by applying (II′) to X_2 we get X_3, which consists of the children of a, the
children's children, and the children's children's children. And so on. We get in this way a
sequence

X_0, X_1, . . . , X_n, . . .

which is non-decreasing: X_0 ⊆ X_1 ⊆ . . . ⊆ X_n ⊆ . . . .
All sets in this sequence contain only descendants of a. It is also easily seen that every
descendant of a is a member of some set in the sequence. Hence the union of all the sets in
the sequence is exactly the set of all descendants of a. It is the smallest fixed point of (I′)
and (II′).
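When the child relation is finite, the bottom-up construction is directly executable. The
sketch below (ours) computes the smallest fixed point by iterating (I′) and (II′) until nothing
new is added; the children table is a made-up sample:

    children = {"a": {"Mary", "James"}, "Mary": {"William"}, "William": {"Eve"}}

    def descendants(a):
        X = set()
        while True:
            new = set(children.get(a, set()))        # rule (I')
            for x in X:                              # rule (II')
                new |= children.get(x, set())
            if new <= X:                             # fixed point reached
                return X
            X |= new

    print(descendants("a"))    # {'Mary', 'James', 'William', 'Eve'}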
Note: The sets of such a sequence can either go on increasing all the way, or reach a plateau,
remaining the same from some point on. In our example, the first is the case if time goes on
indefinitely and there are always new descendants of a; the second is the case if, from some
time on, no new descendants are added.
We can combine (I′) and (II′) into a single operation, corresponding to the combined condi-
tion (III).

Our case exemplifies the general features of inductive definitions:

(a) We are given certain conditions [in our example: (I) and (II)]. There is a
smallest set that satisfies them and this is the set we define.

(b) We recast the conditions as rules that determine non-decreasing, monotone
operations on sets [in our example: (I′) and (II′)]. A set satisfies the conditions
just when applying the operations to it adds no new members; we may thus say
that such a set is closed under the rules, or that it is a fixed point of (I′) and (II′).
It is customary to use the same symbol in the role of the set-variable that is used in stating
the conditions (in our example, X), as well as a name for the inductively defined set. If D_a
is to denote the set of all descendants of a, then its inductive definition will have the form:

(1) If x is a child of a, then x ∈ D_a.

(2) If x ∈ D_a and y is a child of x, then y ∈ D_a.

We then say that D_a is defined inductively by (1) and (2), meaning that it is the smallest set
satisfying these conditions. And we also say that D_a is the smallest fixed point of (1) and
(2). Here are some other examples of inductively defined sets. We denote them as S1, S2,
etc.
The set S1:

(1) 2 ∈ S1.

(2) 3 ∈ S1.

(3) If x ∈ S1, then 2x ∈ S1.

(4) If x ∈ S1, then 3x ∈ S1.
Here (1) and (2) are the base rules and (3) and (4) are the recursive rules. Obviously, (1) and
(2) can be replaced by the single base rule:

(1′) 2, 3 ∈ S1.

And the other two rules can be combined into a single recursive rule:

(2′) If x ∈ S1, then 2x ∈ S1 and 3x ∈ S1.

After the first step we get the set {2, 3} and then, with each iteration of the recursive rules,
we add to our set all the products of set members with 2 and with 3. The first four sets in
the sequence are:

∅,  {2, 3},  {2, 3, 4, 6, 9},  {2, 3, 4, 6, 9, 8, 12, 18, 27}

It is not difficult to see that S1 consists of all natural numbers that can be expressed as
products > 1 of 2's and 3's, that is: all numbers of the form 2^m · 3^n, where m, n ≥ 0 and at
least one of m, n is non-zero. (Recall that x^0 = 1 and x^1 = x.)
The set S2:

(1) 2, 3 ∈ S2.

(2) If x, y ∈ S2, then xy ∈ S2.

Clause (2) means that S2 is closed under products; i.e., it contains, with every two members,
also their product.

It is not difficult to see that S1 = S2. The argument, which is easy, shows how the property
of being the smallest set satisfying the conditions is used:

S2 contains 2 and 3 and is closed under products. Hence it contains all
products of 2's and 3's. Therefore S2 satisfies the conditions that define S1.
Since S1 is the smallest set satisfying these conditions, we have: S1 ⊆ S2.

Vice versa, the set of all products > 1 of 2's and 3's contains 2 and 3 and
is closed under products. Hence it satisfies the conditions that define S2.
Since S2 is the smallest set satisfying these conditions, we have: S2 ⊆ S1.

Putting the two together we get: S1 = S2.
This case is easy. But, in general, the question whether two given inductive definitions define
the same set can be very difficult.
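S1 can be generated bottom-up by the same fixed-point iteration, if we cut the process off
at a bound so that it terminates; a sketch (the bound 30 is our choice):

    def S1_up_to(bound):
        S = set()
        while True:
            new = {2, 3} | {k * x for x in S for k in (2, 3) if k * x <= bound}
            if new <= S:            # nothing new: smallest fixed point (bounded)
                return S
            S |= new

    print(sorted(S1_up_to(30)))     # [2, 3, 4, 6, 8, 9, 12, 16, 18, 24, 27]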
The set S3:

(1) 1 ∈ S3. (2) If x ∈ S3, then 2x ∈ S3.

It is not difficult to see that S3 is just the set consisting of all powers of 2:

{2^0, 2^1, 2^2, 2^3, . . . , 2^n, . . .}
The set S4:

(1) 3, 5 ∈ S4.

(2) If x ∈ S4, then x+3 ∈ S4.

(3) If x ∈ S4, then x+5 ∈ S4.

S4 is the analogue of S1 (with 2 and 3 replaced by 3 and 5) in which products have been
replaced by sums. It is not difficult to see that S4 consists of all numbers > 0 that can be
written as 3m + 5n, where m, n are natural numbers. Just as S4 is the analogue of S1, so the
following set is the analogue of S2.

The set S5:

(1) 3, 5 ∈ S5.

(2) If x, y ∈ S5, then x+y ∈ S5.

As in the case for products, one can show that S4 = S5. It can also be shown that this is the
same as the following S6.

The set S6:

(1) 3, 5, 6, 8 ∈ S6.

(2) If x ∈ S6 and x ≥ 8, then x+1 ∈ S6.

S6 is simply the set consisting of 3, 5, 6, 8, and all numbers greater than 8.

[To see that S4 ⊆ S6, note that 3, 5 ∈ S6 and that, of all numbers ≤ 8, only 3, 5, 6, 8 are sums
of 3's and 5's; consequently, S6 is closed under (2) and (3) in the definition of S4. To see that
S6 ⊆ S4, note that every number among 3, 5, 6, 8 is a sum of 3's and 5's and each number
from 9 on is obtainable by adding to some number from 3, 5, 6, 8 a sum of 3's and 5's.]
In the preceding examples, the recursive rules add to the set numbers of growing size. Conse-
quently, the set keeps growing and the fixed point is infinite. As the following example shows,
this need not hold in general.

The set S7:

(1) 7 ∈ S7.

(2) If n ∈ S7 and n is odd, then 2n ∈ S7.

(3) If n ∈ S7 and n > 4, then n−2 ∈ S7.

By iterating these rules we put into our set the following numbers: 7, 14, 5, 3, 12, 10, 8, 6, 4. Ad-
ditional applications of the rules do not yield new numbers. Hence,

S7 = {7, 14, 5, 3, 12, 10, 8, 6, 4}
Homework 5.9 Let k be a fixed natural number. Let X_k be the set defined, inductively,
by the following clauses:

(1) k ∈ X_k.

(2) If x ∈ X_k and x is even, then x/2 ∈ X_k.

(3) If x ∈ X_k and x is odd, then (3x+1)/2 ∈ X_k.

Write down (in the curly-bracket notation) the sets X_k for the cases:

k = 0, 1, 2, 3, 5, 6, 15, 17.

Does there exist a number k for which X_k is infinite? This is an open and apparently very
difficult problem in number theory.
Many examples of inductive definitions that apply to objects that are not numbers are given
in 5.2.3. We have already had one example: the set of descendants. Here is one of the same
kind.

The Set of Maternal Descendants: Let "maternal descendant" mean a descendant via
the mother-child relation. Note that the connecting chain must consist of females, except,
possibly, the last descendant. Using MD_a for the set of maternal descendants of a, the
clauses of the definition are:

(MD1) If a is female and x is a child of a, then x ∈ MD_a.

(MD2) If x ∈ MD_a and x is female and y is a child of x, then y ∈ MD_a.

Note: If a is not female, MD_a is empty. Formally, one shows that ∅ satisfies the two
conditions for MD_a: Since a is not female, the antecedent of the first condition is false and
the condition holds vacuously. ∅ satisfies also the second condition, since no x is in ∅.
Inductive Definitions of Relations

The machinery of inductive definitions can be applied to define relations, where these, recall,
are sets of pairs, or of n-tuples. The conditions determine rules for adding certain pairs, or
n-tuples, to the set that is being constructed.

Here, for example, is the definition of the descendant relation, Des, which is the set of all pairs
(x, z) in which x is a descendant of z. This definition is obtained from that of a's descendants
by replacing the fixed parameter a by a variable, say z, and by suitable replacements of x
by (x, z):

(1) If x is a child of z, then (x, z) ∈ Des.

(2) If (x, z) ∈ Des and y is a child of x, then (y, z) ∈ Des.
Notation: Let s be the successor function, defined for natural numbers: s(x) = x + 1.
Many relations over natural numbers can be defined inductively, in terms of the successor
function. Here is one:

(1) (x, s(x)) ∈ R (i.e., this holds for all natural numbers x).

(2) If (x, y) ∈ R, then (x, s(y)) ∈ R.

(1) puts in R all pairs of the form (x, s(x)). Then, an application of (2) adds all the pairs
(x, s(s(x))), another application adds the pairs (x, s(s(s(x)))), and so on. It is not difficult
to see that R consists exactly of all pairs (x, y) in which x < y. Hence, (1) and (2) define
inductively the smaller-than relation, <, solely in terms of the successor function. If, instead
of "(x, y) ∈ R" we write "x < y", we get the usual form of this definition:

(1) x < s(x)

(2) If x < y, then x < s(y).
Inductive techniques can be used to define various functions. (Recall that functions are
construed in set theory as relations of a particular kind.) Take, for example, the addition
function and let it be the relation Sum. Since addition is a binary function, Sum is a ternary
relation:

Sum = {(x, y, z) : z = x + y}

It is not difficult to see that the following inductive definition defines it in terms of the
successor function:

(1) (x, 0, x) ∈ Sum

(2) If (x, y, z) ∈ Sum, then (x, s(y), s(z)) ∈ Sum
Rewriting statements of the form "(x, y, z) ∈ Sum" in the form "x + y = z", we get:

(1′) x + 0 = x

(2′) If x + y = z, then x + s(y) = s(z)

Rewriting (2′) in the equivalent form: x + s(y) = s(x + y), yields the following customary
form of the definition:

(1′) x + 0 = x

(2″) x + s(y) = s(x + y)

This definition shows directly the iterated process. Given that s(0) = 1, s(1) = 2, s(2) = 3,
etc., we can get the value of m + n for every particular m and n. For example:

5 + 0 = 5
5 + 1 = 5 + s(0) = s(5 + 0) = s(5) = 6
5 + 2 = 5 + s(1) = s(5 + 1) = s(6) = 7
5 + 3 = 5 + s(2) = s(5 + 2) = s(7) = 8

etc.
Multiplication is definable inductively in terms of the successor function and addition:

(1) x · 0 = 0

(2) x · s(y) = (x · y) + x
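Read as programs, these definitions compute: each recursive clause strips one successor off
the second argument. A sketch in Python (using y - 1 to undo s):

    def s(x):
        return x + 1

    def add(x, y):
        return x if y == 0 else s(add(x, y - 1))        # x + s(y) = s(x + y)

    def mul(x, y):
        return 0 if y == 0 else add(mul(x, y - 1), x)   # x * s(y) = (x * y) + x

    print(add(5, 3), mul(4, 3))   # 8 12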
Homework 5.10 Give inductive definitions of the following relations over the natural
numbers, solely in terms of the successor function.

1. The less-than-or-equal relation, ≤.

5.2.2 Inductive Proofs

To prove, by ordinary induction, that every natural number has a property P, one establishes:

(I) 0 has the property P.

(II) For any n, if n has the property P, then n+1 has it.

In strong induction, (II) is replaced by:

(II′) For any n, if all numbers ≤ n have the property P, then n+1 has it.

The task of proving (II′) is easier than the task of proving (II); because, in order to show
that n+1 has the property, we can assume not only that n has it, but that all numbers ≤ n
have it.
Strong induction is implied by the fact that the set of natural numbers is the smallest set X
such that:

(1) 0 ∈ X.

(2) If all natural numbers ≤ n are in X, so is n+1.
Alternatively, we can derive strong induction from ordinary induction by the following trick.
Given the property P, define another property P′ by:

n has the property P′ iff all numbers ≤ n have the property P.

Then

(i) every natural number has the property P

iff

(ii) every natural number has the property P′.

Therefore, in order to prove (i) it suffices to prove (ii). But it is not difficult to see that
proving (ii) by ordinary induction is the same as proving (i) by strong induction.
Note that (II′) can be restated as:

(II″) For any n > 0, if all numbers smaller than n have the property P, then n has it.

We simply let n play the role of our previous n+1. Combining (II″) with (I) we get a single
condition:

(III) For any n, if all numbers smaller than n have the property P, then n has it.

(For n = 0 there are no smaller numbers, so the antecedent holds vacuously and (III) implies
that 0 has the property.) The proof of (III) may, of course, proceed by cases, with n = 0
treated as a separate case.
Various variants of strong induction are obvious. For example, in order to show that all
natural numbers belonging to some given set, X, have a property P, it suffices to prove the
relativized version of (III):

(III_X) For any n in X, if all numbers in X that are smaller than n have the
property P, then n has it.
Here is an example of strong induction in use. A natural number is called prime if it is greater
than 0 and is not a product of two smaller numbers. We shall now show, by strong induction,
that every number > 1 is either a prime or a product of a finite number of primes; i.e., of
the form p_1 · p_2 · . . . · p_k, where the p_i's are prime (we presuppose here the concept of finite
sequences and some elementary properties of products of this kind).

Assume that n > 1 and that the claim holds for all numbers > 1 that are smaller than n. If
n is not a product of two smaller numbers, then it is a prime and the claim holds. Otherwise,
n is a product of two smaller numbers, say n = k · m, where k, m < n. Both k and m
must be > 1 (if one of them is 1 the other cannot be smaller than n). Hence each is either
a prime or a product of primes: k = p_1 · . . . · p_i, m = p′_1 · . . . · p′_j. Combining the two we
get: n = p_1 · . . . · p_i · p′_1 · . . . · p′_j.
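The strong-induction argument is, in effect, a recursive algorithm: given n, search for a
factorization into smaller numbers and recurse on the factors. A sketch (ours):

    def prime_factors(n):
        assert n > 1
        for k in range(2, n):
            if n % k == 0:                    # n = k * (n // k), both smaller
                return prime_factors(k) + prime_factors(n // k)
        return [n]                            # no smaller factorization: n is prime

    print(prime_factors(60))   # [2, 2, 3, 5]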
Since strong induction is a more convenient tool, it is employed whenever an inductive ar-
gument relating to natural numbers is needed. The term "induction" often means strong
induction.
5.2.3 Formal Languages as Sets of Strings

Written linguistic constructs are usually finite sequences of signs. There is a theory that treats
languages simply as sets of finite sequences. The elements of the sequences are taken from
some fixed domain, which is called, in this context, the alphabet. (If we were to represent
English in this way, then the alphabet would consist of all English words and punctuation
marks, and the set of sequences would consist of all grammatical English sentences.)

Let Σ be some fixed non-empty set of objects. We refer to Σ as the alphabet and we assume
that no member of Σ is a sequence of members of Σ.
Strings Over Σ: By a (non-empty) string over Σ we mean either a member of Σ, or a
sequence of length > 1 of members of Σ. The length of the string is 1 if it is a member of Σ;
otherwise, it is the length of the sequence.

Strings over Σ are like finite sequences of members of Σ, with the sole difference that the
strings of length 1 are the members themselves. This is done for the sake of convenience; if
a ∈ Σ, the distinction between a and the sequence ⟨a⟩ plays no role in the theory.

The assumption that no member of Σ is itself a sequence of members of Σ is necessary in
order to avoid ambiguity in determining the length of a string and the string's members. A
string of length 1 has one member: the string itself. (Note that "member" does not denote
here set-theoretic membership!)
Each string over Σ has a unique length, a uniquely determined first member, say a_1, a uniquely
determined second member, say a_2, and so on. If the length of the string is n, and a_i is its
i-th member, i = 1, . . . , n, then the string is written as:

a_1 a_2 . . . a_n

If the string is of length 1 we simply have a_1 (an element of Σ). We say that a occurs in the
string if, for some i, a = a_i.
It is very useful to include among our strings a so-called empty string or null string, whose
length is 0 and which has no members. It plays a role somewhat analogous to that of the
empty set. The null string is denoted as:

Λ

The set of all strings over Σ, the null string included, is denoted as: Σ*.

Concatenation: Given any two strings x and y over Σ, say

x = a_1 . . . a_m   and   y = b_1 . . . b_n,

the concatenation of x and y is the string

a_1 . . . a_m b_1 . . . b_n

It is denoted as:

xy

Concatenation is also defined if one of the strings, or both, is Λ:

Λx = xΛ = x
Obviously, concatenation is associative: (xy)z = x(yz). Hence, in repeated concatenation
we can omit parentheses. If x_1, x_2, . . . , x_n are strings then

x_1 x_2 . . . x_n

is the string obtained by concatenating them in the given order. Note that the string a_1 . . . a_n,
where the a_i's are members of Σ, is the concatenation of the a_i's, where these are considered
as strings.

By a language over Σ, we mean any subset of Σ*.
Inductive Definitions of Sets of Strings

Strings form a domain where inductive definitions are particularly useful. First, note that,
if Σ_1 ⊆ Σ, then Σ_1* (the set of all strings over Σ_1) can be characterized inductively as the
smallest set satisfying:

(1) Λ ∈ Σ_1*.

(2) If x ∈ Σ_1* and p ∈ Σ_1, then xp ∈ Σ_1*.
In other words: Σ_1* is the smallest set containing Λ and closed under concatenation (to the
right) with members of Σ_1. If, for example, a_1, . . . , a_n are any members of Σ_1, then, by (1),
Λ ∈ Σ_1*. Hence, by (2), Λa_1 ∈ Σ_1*; but this is exactly a_1. Consequently:

a_1 ∈ Σ_1*

An additional application of (2) yields:

a_1 a_2 ∈ Σ_1*

and so on; n applications of (2) give us:

a_1 a_2 . . . a_n ∈ Σ_1*
Σ_1* is also the smallest set containing Λ and satisfying the following two conditions:

(2′) If x ∈ Σ_1, then x ∈ Σ_1*.

(2″) If x, y ∈ Σ_1*, then xy ∈ Σ_1*.
Prefixes of Strings: A prefix of a string x (also called an initial segment of x) is any string
y such that, for some string z:

yz = x

It is easily seen that a prefix of a_1 . . . a_n is any string of the form a_1 . . . a_m, where m ≤ n.
The case m = 0 is taken to yield the empty string. Every string x is a prefix of itself, since
x = xΛ.

A proper prefix of x is one which is different from x.

The prefix relation can be defined inductively, using only concatenation (to the right) with
members of Σ:

(1) For every string x, x is a prefix of x.

(2) If x is a prefix of y, and p ∈ Σ, then x is a prefix of yp.
Homework 5.11 A suffix of x is any string y such that, for some string z:

zy = x

A segment of x is any string y such that, for some strings u and v:

uyv = x

Give inductive definitions of these concepts, based on concatenation (to the left, or to the
right) with members of Σ.
Powers of strings are defined as iterated concatenations: If n is a natural number, then:

x^n = x x . . . x

where the number of x's on the right-hand side is n. If n = 0, this is defined to be Λ. The
following is an inductive definition of that function. Note that the induction is on natural
numbers, but the values of the function are strings.

(1) x^0 = Λ.

(2) x^{n+1} = x^n x.

Obviously, x^1 = x.

Note: If a ∈ Σ, then a^n is simply the string of length n consisting of n a's.
Examples: The following are examples of languages, i.e., sets of strings, defined by induction. We assume that a, b, and c are some fixed members of Σ, and that x, y, z are variables ranging over Σ*.
The set L1:
(1) b ∈ L1.
(2) If x ∈ L1, then axc ∈ L1.
Starting with (1) (the base rule), we put b in L1. Then, rule (2) enables us, given any member of L1, to get from it a new member of L1 by concatenating it with a on the left and with c on the right. Hence, applying (1) and following it by repeated applications of (2) we get
    b, abc, aabcc, aaabccc, . . . , a^n b c^n, . . .
It is not difficult to see that L1 consists of all strings of the form a^n b c^n, where n = 0, 1, . . ..
The set L2:
(1) abcc ∈ L2.
(2) If x ∈ L2, then axcc ∈ L2.
By applying (1), put abcc into L2. Then, each application of (2) adds one a at the beginning and two c's at the end. Hence, we end by getting strings of the form:
    a^n b c^(2n),   n = 1, 2, . . .
L2 is the language consisting exactly of all these strings.
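Definitions by base rules and generation rules translate directly into programs. The following sketch (ours, not part of the text; Python is used only for illustration) generates initial members of L1 and L2 by iterating rule (2) on the base string given by rule (1):

    # Generating members of L1 and L2 by iterating their inductive rules.
    def generate_L1(steps):
        members = ["b"]                                  # rule (1): b is in L1
        for _ in range(steps):
            members.append("a" + members[-1] + "c")      # rule (2): x to axc
        return members

    def generate_L2(steps):
        members = ["abcc"]                               # rule (1): abcc is in L2
        for _ in range(steps):
            members.append("a" + members[-1] + "cc")     # rule (2): x to axcc
        return members

    print(generate_L1(3))   # ['b', 'abc', 'aabcc', 'aaabccc']
    print(generate_L2(2))   # ['abcc', 'aabcccc', 'aaabcccccc']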
In these two examples the inductively defined sets of strings also have explicit definitions, which enumerate, according to an obvious rule, the members of the set. But, in general, an alternative description is not easily found. Sometimes the inductive definition is all that we have. Consider, for instance, the following very simple set of rules:
(1) ab ∈ L3.
(2) If x ∈ L3, then axb ∈ L3.
(3) If x, y ∈ L3, then xy ∈ L3.
Let a and b be, respectively, the left and right parentheses:
    a = (     b = )
Then L3 consists of all parentheses-strings in which all parentheses are matched. E.g.,
    (())   ()()()   (()())()   ()()()(())(()())
are in L3, while
    ())( (()((()) ()())(()
are excluded from it. But the concept of matching parentheses is itself in need of clarification. A very good way of doing this is by the inductive definition just given.
If x is a string then x^(-1) is defined to be the string obtained by reversing x, i.e., by reading it right-to-left (and writing the result left-to-right). Here is an inductive definition of this function:
(1) Λ^(-1) = Λ.
(2) If p ∈ Σ, then (xp)^(-1) = p(x)^(-1) (i.e., the reversal of xp is p followed by the reversal of x).
(The parentheses in (2) are used to indicate the string to which the function is applied; they are not string members.)
For example, to find (abb)^(-1), we note that abb = (ab)b, hence:
    (abb)^(-1) = b(ab)^(-1) = bb(a)^(-1) = bba(Λ)^(-1) = bbaΛ = bba
A similar and shorter argument shows that, if p ∈ Σ, then p^(-1) = p.
Homework
5.12 Let L4 be the set of strings defined inductively by:
(1) Λ ∈ L4.
(2) If x ∈ L4, then axb ∈ L4.
(3) If x ∈ L4, then bxa ∈ L4.
(4) If x, y ∈ L4, then xy ∈ L4.
Find all the strings in L4 whose length is 4. Show how to reach each of them by a sequence of rule-applications. There is a very simple description of L4; can you guess it?
5.13 Define inductively each of the following languages. Use the indicated names in the definition.
    L5: The set of all a^n b^n, where n = 0, 1, . . .
    L6: The set of all a^m b a^n, where 0 ≤ n < m.
    L7: The set of all a^n b^n c^n, where n = 1, 2, . . ..
    PAL: The set of all palindromes over the alphabet {a, b}. A palindrome is a string x such that x^(-1) = x.
Proofs by Induction on Strings
Since sets of strings are often characterized inductively, the technique of inductive proofs comes in handy. The proofs fall under the general scheme of IN1 and IN2, given in 5.2.2. Here is an example that underlies string-processing techniques. L3 is the matching-parentheses set defined above (with a and b in the role of left and right parentheses).
Claim: For every x in L3 the following is true: In every prefix of x the number of a's is greater than or equal to the number of b's.
Proof: First consider the strings that are put in L3 by the base rule (1) of the definition. The rule puts in L3 the single string ab. The prefixes of this string are:
    Λ, a, ab
and the claim in this case is obviously true.
Next we have two inductive rules, (2) and (3). Accordingly, we have to show:
(C2) If, in every prefix of x, the number of a's is not smaller than the number of b's, then this is also true for every prefix of axb.
(C3) If, in every prefix of x and in every prefix of y, the number of a's is not smaller than the number of b's, then this is also true for every prefix of xy.
In showing this we shall use certain obvious properties of prefixes (themselves provable from the definition of prefix):
(a) Every prefix of axb is either Λ, or au, where u is some prefix of x, or axb.
(b) Every prefix of xy is either a prefix of x, or xv, where v is a prefix of y.
(C2) now follows easily from (a), and (C3) from (b). Consider, for example, (C2). If u is a prefix of x, and if m and n are, respectively, the numbers of a's and b's in it, then:
(i) In au, the numbers of a's and b's are m+1 and n.
(ii) In axb, the numbers of a's and b's are m+1 and n+1.
By our assumption (the inductive hypothesis) m ≥ n. Therefore m+1 > n, and m+1 ≥ n+1. The proof of (C3) is similar.
QED
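The property just proved is easy to test mechanically. The sketch below (ours) checks the prefix condition by a running count; together with the requirement that the total numbers of a's and b's be equal (cf. Homework 5.14.3 below), this condition can in fact be shown to characterize membership in L3, though the text proves only the one direction:

    def prefix_property(s):
        count = 0
        for symbol in s:
            count += 1 if symbol == "a" else -1   # a counts +1, b counts -1
            if count < 0:                         # a prefix with more b's than a's
                return False
        return True

    def in_L3(s):
        return s != "" and prefix_property(s) and s.count("a") == s.count("b")

    assert in_L3("ab") and in_L3("aabb") and in_L3("abab")
    assert not in_L3("ba") and not in_L3("aab")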
Homework 5.14 Prove the following by induction:
1. Every string in the set L1 (defined above) is of the form a^n b c^n, where n = 0, 1, . . ..
2. In every string of L2, the number of c's is twice the number of a's.
3. In every string of L3, the number of a's is the same as the number of b's.
5.2.4 Simultaneous Induction
The technique of inductive definitions can be applied to define several sets and/or relations at one go, i.e., by one set of conditions. Consider for example the following rules, which define, simultaneously, two sets of natural numbers E and O.
(1) 0 ∈ E.
(2) If x ∈ E, then x+1 ∈ O.
(3) If x ∈ O, then x+1 ∈ E.
It is not difficult to see that E and O are the sets of even and odd numbers.
There are other pairs of sets that satisfy the clauses just given, for example the pair in which both sets consist of all the natural numbers. But the pair (E, O) is smallest, in the following sense: If (E′, O′) is any pair of sets satisfying the three clauses (with E and O replaced by E′ and O′), then E ⊆ E′ and O ⊆ O′.
If several sets (and relations) are defined inductively by a single set of rules, we say that they are defined simultaneously. These definitions are an extremely powerful tool.
In the following example simultaneous induction is used to define a toy language, which is a fragment of English. We let Σ be the set consisting of eight English words:
    Jack, Jill, the, person, who, liked, saved, hated
Since each of these is itself a string of more basic elements (the letters), we leave spaces between the concatenated English words in order to ensure a unique and easy reading. We now define by simultaneous induction three subsets of Σ*, denoted as:
    NP, VP, S
As it will turn out, NP is a set of noun-phrases, VP is a set of verb-phrases, and S is a set of sentences.
(1) Jack, Jill ∈ NP.
(2) If x ∈ NP, then each one of the following strings is in VP:
    liked x,  saved x,  hated x
(3) If x ∈ VP, then the following string is in NP:
    the person who x
(4) If x ∈ NP and y ∈ VP, then xy ∈ S.
Applying these rules, we find, for example, that
    the person who liked Jill saved the person who hated Jack
is in S. Here is the proof:
(i) Jack and Jill are in NP, by (1).
(ii) liked Jill and hated Jack are in VP, by (2).
(iii) the person who liked Jill and the person who hated Jack are both in NP, by (3).
(iv) saved the person who hated Jack is in VP, by (2).
(v) the person who liked Jill saved the person who hated Jack is in S, by (4).
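A simultaneous induction can be run as a program in which all the sets grow together, each round applying every rule to everything found so far. Here is a sketch (ours; bounded to a fixed number of rounds, since the defined sets are infinite):

    def generate(rounds):
        NP, VP, S = {"Jack", "Jill"}, set(), set()       # rule (1)
        for _ in range(rounds):
            VP |= {v + " " + x for v in ("liked", "saved", "hated")
                               for x in NP}              # rule (2)
            NP |= {"the person who " + x for x in VP}    # rule (3)
            S  |= {x + " " + y for x in NP for y in VP}  # rule (4)
        return NP, VP, S

    NP, VP, S = generate(3)
    print("the person who liked Jill saved the person who hated Jack" in S)  # True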
Chapter 6
The Sentential Calculus
6.0
Returning in this chapter to sentential logic, we set up a rigorously defined formal language, based on an infinite sequence of atomic sentences and on the sentential connectives. We shall define the concept of an interpretation of this language, based on which we shall define, for that language, logical truth, logical falsity, and logical implication. We shall also investigate additional topics: disjunctive and conjunctive normal forms, truth-functions, and the expressiveness of sets of connectives.
In the second part of this chapter the fundamental concept of a formal deductive system is defined, Hilbert-type and Gentzen-type systems for sentential logic are given, and their basic features are established. The two fundamental criteria that relate the syntax to the semantics, soundness and completeness, are defined, and the systems presented are shown to satisfy them.
6.1 The Language and Its Semantics
6.1.0
So far, we have assumed that our language is provided with certain sentential operations: negation, conjunction, and other connectives; that its sentences are generated from certain atomic sentences; and that certain general conditions hold. We shall now show how to define, with full formal rigour, a language that satisfies these assumptions.
The language is built bottom-up, from a given set of atomic sentences; that is, all other sentences are generated from them by repeated applications of the sentential connectives. The latter are defined in a way that ensures unique readability; i.e., there is exactly one decomposition of a non-atomic sentence into components and no atomic sentence is decomposable (the detailed requirements were listed in 2.3).
6.1.1 Sentences as Strings
There are many ways of setting up the language so as to satisfy the required properties. The choice of a particular definition is a question of convenience. For example, one can define sentences to be certain labeled trees (cf. 2.4). The most common way, however, is to define them, and linguistic constructs in general, as strings over some given set of symbols; that is, they are either members or finite sequences of members of that set (cf. 5.2.3). This still leaves us with a large degree of freedom. Here is one way of defining the language.
The set of symbols, referred to as the alphabet, consists of the following distinct members:
    A_1, A_2, . . . , A_n, . . . , ¬, ∧, ∨, →, ↔
The A_i's are the atomic sentences. There is an infinite number of them. The other five symbols are the connective letters: the negation letter ¬, the conjunction letter ∧, and so on. As usual, we assume that none of the alphabet members is a finite sequence of other members (cf. 5.2.3).
The difference between the connective letters ¬, ∧, ∨, →, ↔ and the metalinguistic symbols we have been using, ¬, ∧, ∨, →, ↔, is that the former are simply symbols occurring in the strings that constitute the sentences, whereas the latter are names of certain syntactic operations. (In print, a different font is used to make this clear.) The two are of course related: the operations are defined by concatenation of the strings with the corresponding connective letters.
The set of all sentences is the smallest set of strings, S, satisfying:
(i) A_i ∈ S, for all i = 1, 2, . . .
(ii) If x, y ∈ S, then:
    (ii.1) ¬x ∈ S
    (ii.2) ∧xy ∈ S
    (ii.3) ∨xy ∈ S
    (ii.4) →xy ∈ S
    (ii.5) ↔xy ∈ S
Here, and in this section only, x, y, z, x′, etc., are variables ranging over strings.
Note: The sentences of the language are constructed along the lines of the Polish notation. Had we used an infix convention, we should have included two additional symbols of left and right parenthesis. The choice of Polish notation helps to distinguish the given formal language from our metalanguage, where the infix notation is used.
The sentential operations are now defined as follows. (The sign =_df should be read: 'equal by definition'.)
    ¬x (the negation of x) =_df ¬x
    x ∧ y (the conjunction of x and y) =_df ∧xy
    x ∨ y (the disjunction of x and y) =_df ∨xy
    x → y (the conditional of x and y) =_df →xy
    x ↔ y (the biconditional of x and y) =_df ↔xy
These definitions imply, for example, the equalities:
    ¬A_6 ∧ A_3 = ∧¬A_6A_3
    (A_1 ∨ A_4) → A_2 = →∨A_1A_4A_2
    A_1 ∧ A_4 ∧ A_2 = ∧∧A_1A_4A_2
In the last equality, the grouping of the left-hand side is obtained via our grouping conventions.
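In code, the point that the operations are mere concatenations becomes almost trivial. In the sketch below (ours), ~, &, v, > serve as ASCII stand-ins for the connective letters, and the atoms are written as the two-character strings A1, A2, . . . (a convenience that ignores the one-symbol-per-atom assumption):

    def neg(x):      return "~" + x        # ~ stands in for the negation letter
    def conj(x, y):  return "&" + x + y    # & for the conjunction letter
    def disj(x, y):  return "v" + x + y    # v for the disjunction letter
    def cond(x, y):  return ">" + x + y    # > for the conditional letter

    A1, A2, A4 = "A1", "A2", "A4"
    print(cond(disj(A1, A4), A2))          # >vA1A4A2, i.e. (A1 v A4) -> A2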
Given the previous definition of sentences, it is obvious that the set of sentences is closed under applications of the connectives: the negation of a sentence is a sentence, the conjunction of two sentences is a sentence, etc. It remains to show that unique readability obtains. This comes to the following claims.
(I) No atomic sentence, A_i, is of the form ¬x, or cxy, where x and y are sentences and c is a binary connective letter.
(II) No sentence ¬x is of the form cyz, where c is a binary connective letter and y and z are sentences.
(III) If ¬x and ¬x′ are sentences and ¬x = ¬x′, then x = x′.
(IV) For all sentences x, y, x′, y′, if c and c′ are binary connective letters then:
    cxy = c′x′y′   only if   c = c′, x = x′, and y = y′
(I) follows trivially from the assumption that no member of the alphabet is equal to a sequence of other members. (II) and (III) are trivial as well: Since ¬x and cyz are strings that start with different symbols, namely ¬ and c, they cannot be equal. And if ¬x = ¬x′, then the strings obtained from them by deleting the first member are equal. Also trivial is the first part of (IV): if cxy = c′x′y′, then their first members, c and c′, must be equal. So far, we have used only general properties of strings.
The claim that is far from obvious is that if cxy = cx′y′, where x, x′, y, y′ are sentences, then x = x′ and y = y′. Now, if cxy = cx′y′, then xy = x′y′ (because xy and x′y′ are obtained from cxy and cx′y′ by deletion of the first member). Hence, we have to prove:
    For all sentences x, x′, y, y′:   xy = x′y′  implies  x = x′ and y = y′
The proof is based on a method that enables one to determine, by easy counting, the scopes of the connectives.
The proof is based on a method that enables one to determine, by easy counting, the scopes
of the connectives.
The Symbol-Counting Method Let us associate, with every symbol, a, of our alphabet
an integer, (a), as follows:
For all A
i
, (A
i
) = 1, ( ) = 0, and (c) = 1, for every binary connective letter c.
Now let x be any non-empty string of length n, say x = a
1
a
2
. . . a
n
. With every occurrence of
a symbol in x, let its count number (relative to x) be the sum of all integers associated with
it and with the preceding occurrences in x. For the i
th
occurrence the count number is:
(a
1
) +(a
2
) +. . . +(a
i
)
Here is an illustration, where the string is the sentence
(A
2
A
1
) (A
4
((A
7
A
1
))), that is:
A
2
A
1
A
4
A
7
A
1
This sentence is written below, in spaced form, on the rst line, the numbers associated with
the symbols are written below it, and below themthe count numbers.
A
2
A
1
A
4
A
7
A
1
1 1 1 0 1 1 1 0 1 1 1
1 2 1 1 0 1 0 0 1 0 1
Main Claim: For every sentence a_1 . . . a_n, the count number of the last occurrence is 1 and all other count numbers are < 1.
Proof: By induction on the strings. We show that (i) the claim holds for atomic sentences, (ii) if the claim holds for a sentence x, it holds also for ¬x, and (iii) if it holds for the sentences x and y, then it holds also for cxy, where c is any binary connective letter.
(i) is obvious. (ii) is easy: since γ(¬) = 0, the count number of an occurrence in x, relative to ¬x, is the same as the count number of that occurrence relative to x. To show (iii), note that, since γ(c) = −1, the count numbers of occurrences in x, relative to cxy, are smaller by 1 than their numbers relative to x. Assuming that the claim holds for x, it follows that, relative to cxy, the last occurrence in x has count number 0 and all preceding occurrences have count numbers < 0. Hence, adding cx as a prefix does not affect the count numbers of occurrences in y: the number for each occurrence in y, relative to cxy, is the same as its number relative to y. Assuming the claim for y, it follows that, relative to cxy, the count number of the last occurrence is 1 and all other occurrences in y have numbers < 1. QED
This claim implies that no sentence is a proper prefix of another sentence:
    If y = xz and x and y are sentences, then y = x (i.e., z is the empty string).
Proof: The last occurrence in x has count number 1 relative to x. Since y = xz, it has the same count number relative to y; since the last occurrence in y is the only one that has (relative to y) count number 1, the last occurrence in x is also the last occurrence in y, which implies x = y.
To complete the proof of unique readability, assume that xy = x′y′, and let m and m′ be, respectively, the lengths of x and x′. If m < m′, then x is a proper prefix of x′, which is impossible, since both are sentences. For the same reason we cannot have m′ < m. Hence m = m′, implying x = x′. Since y and y′ are then obtained by deleting the first m members from the same string, we have y = y′. This concludes the proof.
Note: Count numbers, as we have defined them, give us a way of finding, for any given sentence, its immediate components. Suppose that x is a sentence whose leftmost symbol, c, is a binary connective letter. Then x = cyz, where y and z are uniquely determined sentences. To find them, compute, by summing from left to right, all the count numbers in x. The first occurrence that has count number 0 (or, equivalently, count number 1 relative to the string obtained by deleting the leftmost c) is the last occurrence in y. The remainder of the string (which comes after cy) is z. This method also leads to a procedure for determining whether a given string is a sentence, and it provides a parsing if it is.
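Here is a sketch (ours) of that procedure. Strings are given as lists of tokens; gamma assigns the integers of the counting method. The sentencehood test uses the converse of the Main Claim (which also holds, though it is not proved in the text): a non-empty string is a sentence iff its last count number is 1 and all the others are < 1.

    def gamma(tok):
        if tok.startswith("A"): return 1     # an atom
        if tok == "~":          return 0     # the negation letter
        return -1                            # a binary connective letter

    def is_sentence(tokens):
        count = 0
        for i, tok in enumerate(tokens):
            count += gamma(tok)
            if count >= 1 and i < len(tokens) - 1:
                return False                 # only the last count may reach 1
        return len(tokens) > 0 and count == 1

    def immediate_components(tokens):
        """For a sentence cyz with binary leftmost letter c, split off y and z:
        the first count number 0 marks the last token of y."""
        count = 0
        for i, tok in enumerate(tokens):
            count += gamma(tok)
            if i > 0 and count == 0:
                return tokens[1:i + 1], tokens[i + 1:]

    s = [">", "&", "A2", "~", "A1", "v", "A4", "~", "&", "A7", "A1"]
    assert is_sentence(s)
    y, z = immediate_components(s)   # y = &A2~A1 ,  z = vA4~&A7A1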
Homework 6.1 Consider another way of constructing the language. Here ( and ) are additional alphabet members, functioning as left and right parentheses. The clauses for non-atomic sentences are:
    If x and y are sentences, then
    (¬x) is a sentence;
    (x)c(y) is a sentence, for each binary connective letter c.
Unique readability is proved via the following counting method. Associate with the left parenthesis the number 1, with the right parenthesis the number −1, and with all the other alphabet symbols the number 0. Count numbers are defined as above, by summing from left to right the associated numbers.
Prove, by induction, that in every sentence the last occurrence has count number 0 and all the other count numbers are ≥ 0. Deduce from this that (i) if x is a sentence, then in (¬x) all occurrences except the last have count numbers > 0, and (ii) if x and y are sentences, then, in (x)c(y), there are exactly two occurrences of parentheses with count number 0: the last and the one right after (x. Deduce from this the unique readability property.
Note: We have presupposed an alphabet with an infinite number of symbols, which function as basic units. These can be constructed from a finite number of other units. For example, they can be strings of 0's and 1's:
    ¬ = 10,  ∧ = 110,  ∨ = 1110,  → = 11110,  ↔ = 111110
    A_1 = 1111110,  A_2 = 11111110,  A_3 = 111111110, . . . , A_n = 1^(5+n)0, . . .
In each of these strings 0 serves to mark the end. If x_1 . . . x_m = y_1 . . . y_n, where all the x_i's and y_j's are strings of the form 1^k 0, k > 0, then m = n and x_i = y_i, for all i = 1, . . . , n. Hence, concatenations of such strings of 0's and 1's are uniquely decomposable as concatenations of our alphabet symbols. For example, the string
    111011111110101101111111101111110
is the sentence:
    ∨A_2¬∧A_3A_1
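Decoding such a 0-1 string back into alphabet symbols is mechanical, since each symbol's code is a block of 1's ended by a 0. A sketch (ours, with ASCII stand-ins for the connective letters):

    CODES = {1: "~", 2: "&", 3: "v", 4: ">", 5: "="}   # the five connective letters

    def decode(bits):
        symbols, ones = [], 0
        for b in bits:
            if b == "1":
                ones += 1
            else:                            # a 0 ends the current block
                symbols.append(CODES.get(ones, "A%d" % (ones - 5)))
                ones = 0
        return symbols

    print(decode("111011111110101101111111101111110"))
    # ['v', 'A2', '~', '&', 'A3', 'A1']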
From now on we shall ignore the specific nature of the sentences. We require only that sentences be generated from the atoms by applying connectives and that unique readability hold. We do not have any further use for the connective letters of the language. We employ, as before, the symbols ¬, ∧, ∨, →, ↔ for the sentential operations. We also employ, as we did before,
    A, B, C, A′, B′, C′, A_1, B_1, C_1, . . . etc.
as variables ranging over sentences. The only new pieces in our stock are the atomic sentences A_1, A_2, A_3, etc. Do not confuse them with sentential variables!
6.1.2 Semantics of the Sentential Calculus
Let SC be the language of the sentential calculus, as just defined.
So far, the definitions have been purely syntactic. The language is defined as a system of uninterpreted constructs. An interpretation, which reads the language as being about something else, is, as explained in 2.5, the concern of the semantics. Since our treatment of formal languages is general, we shall not be concerned with one particular interpretation, but with the class of possible interpretations.
Usually, an interpretation determines how extralinguistic entities (objects, relations or properties) are correlated with linguistic items. When it comes to SC, the only linguistic items are sentences. The truth-values of the atomic sentences determine the values of all the other sentences. We therefore take assignments of truth-values to atomic sentences as our possible interpretations:
    An interpretation of SC is a function, σ, defined for all atoms, such that, for each A_i, σ(A_i) is a truth-value.
When we come to first-order languages, we shall encounter richer and more familiar types of interpretations.
We shall refer to an interpretation of SC as a truth-value assignment, or assignment for short, and we shall use
    σ, τ, σ′, τ′, etc.
as variables ranging over assignments.
An interpretation, σ, determines a unique assignment of truth-values to all the sentences of SC: The value of each atom is the value assigned to it by σ, and the values of the other sentences are determined by the usual truth-table rules. Spelled out in detail, this amounts to an inductive definition:
For Atoms: If A is an atom, A gets σ(A).
For Negations:
    If A gets T, ¬A gets F.
    If A gets F, ¬A gets T.
For Conjunctions:
    If A gets T and B gets T, A ∧ B gets T.
    If A gets F, A ∧ B gets F.
    If B gets F, A ∧ B gets F.
And so on for each of the connectives.
Obviously, the set of all sentences that get a truth-value by virtue of these rules contains all the atoms and is closed under connective-applications. Hence every sentence gets a truth-value. Moreover, every sentence gets no more than one truth-value. This follows, again by induction, by showing that the set of all sentences that get unique values contains all atoms and is closed under connective-applications. Here we have to use the unique readability:
    An atom cannot get a value by virtue of any rule except the one for atoms, because an atom is not a sentential compound.
    Next, assume that A gets a unique value. Then, since ¬A is not of the form B ∗ C, where ∗ is a binary connective, and since ¬A = ¬A′ only if A = A′, it follows that ¬A can get a value only through the rule for negations, and that this value is uniquely determined by the value of A.
    The same kind of argument applies to every other connective.
We can therefore speak of the value of a sentence A under the assignment σ. Let us denote this as:
    val_σ(A)
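The inductive definition of val_σ becomes a short recursive function once sentences are represented in parsed form. A sketch (ours): sentences are nested tuples, and an assignment is a mapping of atoms to True (for T) and False (for F).

    def val(sigma, A):
        if isinstance(A, str):               # an atom: the value sigma assigns to it
            return sigma[A]
        if A[0] == "not":
            return not val(sigma, A[1])
        op, B, C = A
        if op == "and": return val(sigma, B) and val(sigma, C)
        if op == "or":  return val(sigma, B) or val(sigma, C)
        if op == "->":  return (not val(sigma, B)) or val(sigma, C)
        if op == "<->": return val(sigma, B) == val(sigma, C)

    sigma = {"A1": True, "A2": False}
    print(val(sigma, ("->", ("not", "A1"), "A2")))   # True, since F -> F gets T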
Note: The atoms are treated as being completely independent: the truth-value of one is not constrained by the values of the others. Dependencies between atoms can be introduced by restricting the class of possible interpretations. The restrictions can be expressed by stipulating that certain sentences must get the value T. You can think of them as extra logical axioms. For example, the restriction that it is impossible for both A_1 and A_2 to be true amounts to stipulating that ¬(A_1 ∧ A_2) gets T.
(Some restrictions cannot be expressed in this form; for example, the restriction that only a finite number of atomic sentences get T. But any restriction that involves a finite number of atoms can be thus expressed.)
Our previous semantic notions can now be characterized in these terms:
    A ≡ B, just when, for all σ, val_σ(A) = val_σ(B).
    A is a tautology, just when, for all σ, val_σ(A) = T.
    A is a contradiction, just when, for all σ, val_σ(A) = F.
    Γ ⊨ A, just when there is no σ such that val_σ(B) = T for all B in Γ, while val_σ(A) = F.
Our previous methods for establishing logical equivalence and logical implications relied only on the general features of the language and the connectives. Therefore they apply as before:
    All the general equivalences, simplification methods, and proof techniques of the previous chapters apply, without change, when the sentential variables range over the sentences of SC.
On the other hand, with the sentences completely specified, we can now prove that particular sentences are not tautologies, or not contradictions, or are not logically implied by other sentences. For example, if A, B, and C are different atoms, then
    A → B, A → C ⊭ B → C
For let σ be the assignment such that σ(A) = F, σ(B) = T, σ(C) = F; then
    val_σ(A → B) = val_σ(A → C) = T, but val_σ(B → C) = F.
With A, B, and C unspecified we can only claim that the implication need not hold in general. The counterexamples constructed in chapter 4 can be turned into counterexamples concerning specific sentences, by assuming that each sentential variable has a distinct atomic sentence as its value.
Note: The value of A under the interpretation σ depends only on the values of the atoms that are components of A: If σ and τ assign the same values to all atomic components of A, then
    val_σ(A) = val_τ(A).
Hence, as far as a particular sentence is concerned, we have to consider only assignments defined for its atomic components. And if we are concerned with a finite number of sentences, we have to consider only a finite number of atoms.
Truth tables can serve to show how a sentence fares under different assignments. A truth table for a given sentence should have a column for each of its atoms. The rows represent the different assignments; the value of the sentence is given in its column. When several sentences are compared by means of truth tables, their tables should be incorporated into a single table that has a column for each atom occurring in any of the sentences.
Logical equivalence may hold between sentences with different atoms. For example:
    A_3 ∧ [A_1 ∨ (A_5 ∧ ¬A_5)] ≡ [A_3 ∧ (A_4 ∨ ¬A_4)] ∧ A_1
Note: The notion of duality (cf. 2.5.3) can now be defined for specific sentences. Consider sentences built from atoms using only negation, conjunction and disjunction. Apply the definition given in 2.5.3, assuming that the sentential variables denote distinct atoms. Alternatively, it can be defined inductively as follows, where A^d denotes the dual of A.
(i) If A is an atom, then A^d = ¬A
(ii) (¬A)^d = ¬(A^d)
(iii) (A ∧ B)^d = A^d ∨ B^d
(iv) (A ∨ B)^d = A^d ∧ B^d
6.1.3 Normal Forms, Truth-Functions and Complete Sets of Connectives
A literal is a sentence which is either an atom or a negation of an atom.
Definition: A sentence is in disjunctive normal form, abbreviated DNF, if it is a disjunction of conjunctions of literals. For example, the following is in DNF:
    (A_3 ∧ A_4 ∧ A_5) ∨ A_2 ∨ (A_3 ∧ A_6)
A sentence is in conjunctive normal form, abbreviated CNF, if it is a conjunction of disjunctions of literals. For example:
    (A_5 ∨ A_1) ∧ (A_5 ∨ A_6 ∨ A_7) ∧ (A_2 ∨ A_3 ∨ A_4) ∧ A_3
Note: Every literal is both a conjunction of literals (namely, a conjunction with one conjunct) and a disjunction of literals (namely, a disjunction with one disjunct). A disjunction of literals, say
    A_1 ∨ A_2 ∨ A_3 ∨ A_4
is in DNF, because it is a disjunction of conjunctions of literals (where every conjunction consists of one literal). It is also in CNF, because it is a conjunction of disjunctions of literals (namely, a conjunction with one conjunct). In a similar way,
    A_1 ∧ A_2 ∧ A_3 ∧ A_4
is both in CNF and in DNF.
An equivalent characterization of DNF and CNF is:
A sentence A is in DNF iff:
(i) A is constructed from atoms using no connectives other than ¬, ∧, ∨.
(ii) The scope of every negation is an atom.
(iii) The scope of every conjunction does not contain any disjunction.
A sentence is in CNF iff it satisfies (i) and (ii), and
(iii′) The scope of every disjunction does not contain any conjunction.
Theorem: For every sentence A, there is a logically equivalent sentence in DNF, and there is a logically equivalent sentence in CNF.
Here is a way to convert any given sentence to an equivalent sentence in DNF.
(I) Eliminate → and ↔, by expressing them in terms of ¬, ∧, ∨. Get in this way a logically equivalent sentence that involves only ¬, ∧ and ∨.
(II) Push negation all the way in, cancelling double negations, until negation applies only to atomic sentences.
(III) Push conjunction all the way in, by distributing conjunction over disjunction, until no disjunction is within the scope of a conjunction.
To get an equivalent CNF, apply steps (I) and (II), but instead of (III) use:
(III′) Push disjunction all the way in, by distributing disjunction over conjunction, until no conjunction is within the scope of a disjunction.
As you carry out these steps you can, of course, simplify as the occasion arises, dropping redundant conjuncts or disjuncts, or using established equivalences (e.g., replacing A ∨ ¬A∧B by the equivalent A ∨ B).
Example: Assuming A, B, C to be atoms, the following are the stages of a possible conversion of
    [(A∨C) ∨ ¬(B↔C)] ∧ [(A∨C) ∧ B]
into an equivalent DNF:
1. {(A∨C) ∨ [(B∧¬C) ∨ (¬B∧C)]} ∧ [(A∨C) ∧ B]
2. [A ∨ C ∨ (B∧¬C) ∨ (¬B∧C)] ∧ [(A∨C) ∧ B]
3. [A ∨ C ∨ (B∧¬C)] ∧ [(A∨C) ∧ B]   (¬B∧C is a redundant disjunct, because we have the disjunct C)
4. [A ∨ C ∨ (B∧¬C)] ∧ [(A∧B) ∨ (C∧B)]
5. {[A ∨ C ∨ (B∧¬C)] ∧ A ∧ B} ∨ {[A ∨ C ∨ (B∧¬C)] ∧ C ∧ B}
6. (A∧B) ∨ (C∧A∧B) ∨ (B∧¬C∧A) ∨ (A∧C∧B) ∨ (C∧B) ∨ (B∧¬C∧C)
7. (A∧B) ∨ (C∧B)
In getting the CNF, steps 1-3 are the same; from 3 on we can proceed:
3. [A ∨ C ∨ (B∧¬C)] ∧ [(A∨C) ∧ B]
4′. [A ∨ [(C∨B) ∧ (C∨¬C)]] ∧ (A∨C) ∧ B
5′. (A∨C∨B) ∧ (A∨C) ∧ B   (A∨C∨¬C, being a tautology, has been dropped)
6′. (A∨C) ∧ B   (A∨C∨B is redundant in the presence of the conjunct B)
Note that in this particular case we could have gotten the CNF from the DNF by pulling out B, or the DNF from the CNF by simple distribution of conjunction. But in general the two forms are not so simply related.
A sentence in DNF is true just when some conjunction in it is true. Hence, this form shows clearly the interpretations under which the sentence is true. Consider, for example,
    (A_1 ∧ A_2) ∨ (¬A_1 ∧ A_3) ∨ (A_2 ∧ A_3) ∨ (¬A_1 ∧ ¬A_2)
This sentence is true just when:
    (A_1 and A_2 are true) or (A_1 is false and A_3 is true) or (A_2 and A_3 are true) or (A_1 and A_2 are false).
Note that not all possibilities here are exclusive; if A_1 and A_2 and A_3 are true, both the first and third alternatives hold.
A sentence in DNF is false just when all its disjuncts are false. For example, our last sentence is false just when:
    (A_1 is false or A_2 is false) and (A_1 is true or A_3 is false) and (A_2 is false or A_3 is false) and (A_1 is true or A_2 is true).
A CNF indicates the cases of truth and falsity in a dual way. Thus,
    (A_1 ∨ A_2) ∧ (¬A_1 ∨ A_3) ∧ (A_2 ∨ A_3) ∧ (¬A_1 ∨ ¬A_2)
is true just when:
    (A_1 is true or A_2 is true) and (A_1 is false or A_3 is true) and (A_2 is true or A_3 is true) and (A_1 is false or A_2 is false).
And the sentence is false just when:
    (A_1 and A_2 are false) or (A_1 is true and A_3 is false) or (A_2 and A_3 are false) or (A_1 and A_2 are true).
Note: A sentence can have many equivalent DNFs (or CNFs). For example, A_1 ∨ (¬A_1 ∧ A_2) and A_2 ∨ (A_1 ∧ ¬A_2) are equivalent sentences in DNF. They are equivalent to A_1 ∨ A_2. If you replace them by their duals, you will get an analogous situation for CNFs.
Homework 6.2 Find, for each of the following sentences, equivalent sentences in DNF and CNF, as short as you can. Assume that A, B, C, D are atoms.
1. (A → B) ∨ (B → A)
2. (A∧B ∨ C∧D) → (C ∨ D)
3. ((A → B) → C) → C
4. (A ↔ B) ↔ (C ↔ D)
5. (A → B) ∧ (C → D)
6. ¬[A ∨ (B∧C) ∨ (C∧D)]
7. (A∧B ∨ C∧D) ↔ (C ∨ D)
8. ((A ∨ C) ∧ (C ∨ D)) → (B ∨ A)
9. ((A → B) ∧ (B → A)) → C
10. (A∧B) ∨ (A∧¬B) ∨ (¬A∧B) ∨ (¬A∧¬B)
Expressing Truth-Functions by Sentences
Definition: An n-ary truth-function is a function defined for all n-tuples of T's and F's, which assigns to every n-tuple a truth-value (either T or F).
Here, for example, is a ternary truth-function f:
f(T, T, T) = F
f(T, T, F) = T
f(T, F, T) = F
f(T, F, F) = F
f(F, T, T) = T
f(F, T, F) = F
f(F, F, T) = T
f(F, F, F) = T
The n-tuples of truth-values correspond exactly to the rows in a truth-table based on n atomic sentences, provided that we choose a matching of the atoms with the coordinates. The tuple (x_1, x_2, . . . , x_n) corresponds to the row in which x_1 is assigned to the first atom (the atom matched with the first coordinate), x_2 is assigned to the second atom, and so on.
Now assume that A_1, A_2, . . . , A_n are n distinct atoms and that we agree that A_i, for i = 1, . . . , n, is matched with the i-th coordinate. For each n-tuple of truth-values (x_1, x_2, . . . , x_n), let the assignment represented by (x_1, x_2, . . . , x_n) be the assignment that assigns x_1 to A_1, x_2 to A_2, . . . , x_n to A_n. Then each sentence A, whose atomic components are among A_1, . . . , A_n, defines an n-ary truth-function, f_A:
    f_A(x_1, . . . , x_n) = the value of A under the assignment represented by (x_1, . . . , x_n).
The values of the function f_A are given in A's column, in the truth-table based on the atoms A_1, . . . , A_n.
Example: It is not difficult to see that the ternary truth-function given above is the function defined by the sentence
    A_1 ↔ (A_2 ∧ ¬A_3)
Note: If A_i does not occur in A, then the i-th argument has no effect on the value of f_A. For example, if n = 2 and A = ¬A_2, then, under our definition, f_A is a two-place function whose value for (x_1, x_2) is obtained by toggling x_2. If A contains k atoms, then for every n ≥ k and every matching of the k atoms with coordinates from 1, 2, . . . , n, there is an n-ary truth-function defined by A.
Theorem: Every truth-function is defined by some sentence.
This is sometimes expressed by saying that every truth-table is a truth-table of some sentence. The proof will show how to construct the sentence, given the truth-table.
Proof: Let f be an n-ary function. Fix n distinct atomic sentences A_1, . . . , A_n, with A_i corresponding to the i-th coordinate, i = 1, . . . , n.
If there is no n-tuple for which the value of f is T, then, obviously, A_1 ∧ ¬A_1 defines f. Else, for each i = 1, . . . , n define:
    A_i^T =_df A_i        A_i^F =_df ¬A_i
For every n-tuple of truth-values (x_1, . . . , x_n), let
    C_(x_1,...,x_n) = A_1^(x_1) ∧ A_2^(x_2) ∧ . . . ∧ A_n^(x_n)
Consider all the tuples (x_1, . . . , x_n) for which f(x_1, . . . , x_n) = T (i.e., the rows in the truth-table for which the required sentence should have T). Let A be the disjunction of all the C_(x_1,...,x_n)'s, where (x_1, . . . , x_n) ranges over these tuples.
A gets T iff one of these disjuncts gets T. But C_(x_1,...,x_n) gets T iff all the conjuncts A_1^(x_1), A_2^(x_2), . . . , A_n^(x_n) get T, that is, iff A_1 gets x_1, A_2 gets x_2, . . . , A_n gets x_n. Therefore A gets T iff the assignment is given by one of the tuples for which the value of f is T. The truth-function defined by A coincides with f.
QED
Example: Consider the following truth-table:
    A_1  A_2  A_3     A
     T    T    T      F
     T    T    F      T
     T    F    T      F
     T    F    F      F
     F    T    T      T
     F    T    F      F
     F    F    T      T
     F    F    F      T
There are four rows for which the required sentence, A, should be T. Accordingly, A can be taken as the disjunction of four conjunctions:
    (A_1 ∧ A_2 ∧ ¬A_3) ∨ (¬A_1 ∧ A_2 ∧ A_3) ∨ (¬A_1 ∧ ¬A_2 ∧ A_3) ∨ (¬A_1 ∧ ¬A_2 ∧ ¬A_3)
Note that the proof of the theorem yields the required sentence in DNF. It is also a new proof
that every sentence is equivalent to a DNF sentence.
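The construction in the proof is completely mechanical, as the following sketch (ours) shows; it returns the full DNF, as a string, of any n-ary truth-function given as a Python function:

    from itertools import product

    def dnf_from_function(f, n):
        atoms = ["A%d" % (i + 1) for i in range(n)]
        disjuncts = []
        for tup in product([True, False], repeat=n):   # all n-tuples of T, F
            if f(*tup):                                # a row where f gives T
                lits = [a if x else "~" + a for a, x in zip(atoms, tup)]
                disjuncts.append("(" + " & ".join(lits) + ")")
        return " v ".join(disjuncts) if disjuncts else "A1 & ~A1"

    # The ternary truth-function of the running example, A1 <-> (A2 & ~A3):
    f = lambda x1, x2, x3: x1 == (x2 and not x3)
    print(dnf_from_function(f, 3))
    # (A1 & A2 & ~A3) v (~A1 & A2 & A3) v (~A1 & ~A2 & A3) v (~A1 & ~A2 & ~A3)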
A dual construction yields the required sentence in CNF. It is obtained by toggling everywhere T and F, and ∧ and ∨:
Consider all tuples (x_1, . . . , x_n) for which the value of the function is F. If there are none, then the function is defined by A_1 ∨ ¬A_1. Else, define:
    B_i^T =_df ¬A_i        B_i^F =_df A_i
Let
    D_(x_1,...,x_n) = B_1^(x_1) ∨ B_2^(x_2) ∨ . . . ∨ B_n^(x_n)
Then the required CNF is the conjunction of all the D_(x_1,...,x_n)'s such that f assigns to (x_1, . . . , x_n) the value F.
Example: The CNF obtained for the above-given truth-table is:
    (¬A_1 ∨ ¬A_2 ∨ ¬A_3) ∧ (¬A_1 ∨ A_2 ∨ ¬A_3) ∧ (¬A_1 ∨ A_2 ∨ A_3) ∧ (A_1 ∨ ¬A_2 ∨ A_3)
Each of the disjunctions corresponds to a row in which the sentence gets F.
Full DNFs and CNFs
Terminology: Given a conjunction, C, of literals, say that an atom occurs positively in C if it is one of the conjuncts, and that it occurs negatively in C if its negation is one of the conjuncts. We speak, accordingly, of positive and negative occurrences of atoms.
Similarly, an atom occurs positively in a disjunction of literals if it is one of the disjuncts, and it occurs negatively if its negation is one of the disjuncts.
Henceforth, we assume that when a sentence is written in DNF no atom occurs both positively and negatively in the same conjunction. For such conjunctions are contradictory and can be dropped. The only exception is when the sentence is contradictory, in which case it reduces to A_1 ∧ ¬A_1.
Similarly, we assume that in a CNF no atom occurs both positively and negatively in the same disjunction, unless the CNF is a tautology, in which case it reduces to A_1 ∨ ¬A_1.
We assume, moreover, that there are no repetitions of the same literal in any conjunction (of the DNF), or in any disjunction (of the CNF), and no repeated disjuncts (in the DNF), or repeated conjuncts (in the CNF).
When comparing DNFs (or CNFs) we disregard differences in the order of literals of a conjunction (of a disjunction), and differences in the order of the disjuncts (of the conjuncts).
Definition: A full DNF is one in which every occurring atom occurs in every conjunction. A full CNF is one in which every atom that occurs in it occurs in every disjunction.
Examples: Assuming that A_2, A_3 and A_4 are distinct atoms, the following is a sentence in full DNF:
    (A_2 ∧ A_3 ∧ A_4) ∨ (¬A_2 ∧ A_3 ∧ A_4) ∨ (¬A_2 ∧ ¬A_3 ∧ A_4)
By pulling A_3 ∧ A_4 out of the first two conjunctions and dropping the resulting redundant conjunct A_2 ∨ ¬A_2, we see that this sentence is logically equivalent to:
    (A_3 ∧ A_4) ∨ (¬A_2 ∧ ¬A_3 ∧ A_4)
which is in DNF but not in full DNF, because A_2 occurs in the second conjunction, but not in the first. The sentence is also equivalent to:
    (A_2 ∧ A_3 ∧ A_4) ∨ (¬A_2 ∧ A_4)
(can you see how to get it?), which is again in DNF, but not in full DNF.
An example of a full CNF (where the A_i's are assumed to be atoms) is:
    (A_1 ∨ A_2 ∨ A_5) ∧ (A_1 ∨ A_2 ∨ ¬A_5) ∧ (¬A_1 ∨ A_2 ∨ A_5) ∧ (¬A_1 ∨ ¬A_2 ∨ A_5)
which is equivalent to:
    (A_2 ∨ A_5) ∧ (A_1 ∨ A_2 ∨ ¬A_5) ∧ (¬A_1 ∨ ¬A_2 ∨ A_5)
(can you see how?), as well as to:
    (A_1 ∨ A_2 ∨ A_5) ∧ (¬A_1 ∨ A_5) ∧ (A_1 ∨ A_2 ∨ ¬A_5)
And this last can be further compressed into:
    (A_1 ∨ A_2) ∧ (¬A_1 ∨ A_5)
All of these are in CNF but not in full CNF.
A sentence in DNF can be expanded into full DNF by supplying the missing atoms. Say that A_i is an atom occurring in some conjunction, but not in the conjunction C. We can replace C by the equivalent:
    C ∧ (A_i ∨ ¬A_i)
which, via distributivity, becomes:
    (C ∧ A_i) ∨ (C ∧ ¬A_i)
Thus, any disjunct not containing A_i is replaceable by two: one with an additional A_i and one with an additional ¬A_i. Proceeding in this way, we eventually get the full DNF. Obviously, this involves a blowing up of the sentence.
A similar process works for the full CNF: We replace every disjunction D in which A_i does not occur by:
    (D ∨ A_i) ∧ (D ∨ ¬A_i)
A full DNF shows us explicitly all the truth-table rows in which the sentence gets T. Each conjunction contributes the row in which every atom occurring positively is assigned T, and every atom occurring negatively is assigned F.
A full CNF shows us, in a dual way, all the rows in which the sentence gets F. Each disjunction contributes the row in which every atom occurring positively gets F, and every atom occurring negatively gets T.
Note: The DNF and CNF constructed in the proof of the last theorem are full. They are obtained by following the prescription just given for correlating conjunctions (in the DNF), or disjunctions (in the CNF), with truth-table rows.
Homework
6.3 Write down sentences B_1, B_2, B_3 and B_4 that have the following truth-tables. Write each of B_1 and B_2 in DNF and in CNF.
    A_1  A_2  A_3     B_1  B_2  B_3  B_4
     T    T    T       T    T    F    T
     T    T    F       F    T    T    F
     T    F    T       F    F    T    F
     T    F    F       T    F    T    F
     F    T    T       F    T    F    F
     F    T    F       T    T    T    T
     F    F    T       T    T    T    F
     F    F    F       F    F    T    F
Having written the sentences, see if you can simplify them by pulling out common conjuncts,
or common disjuncts, as shown above.
6.4 Write B_1 of 6.3 using only ¬ and ∧; B_2 using only ¬ and ∨; and each of B_2 and B_3 using only ¬ and →.
Dummy Atoms: A DNF (or CNF) can contain dummy atoms, i.e., atoms that have no effect on the truth-value. For example, assuming that the A_i's are atoms, A_1 is dummy in:
    (A_1 ∧ A_2 ∧ A_3) ∨ (¬A_1 ∧ A_2 ∧ A_3)
That sentence is in fact equivalent to
    A_2 ∧ A_3
Note that both sentences are in full DNF.
It can be shown that, in a full non-contradictory DNF, an atom A_i is dummy iff the following holds:
    For every conjunction in the DNF there is another conjunction in it that differs from the first only in that A_i occurs positively in one, negatively in the other.
The condition for full non-tautological CNFs is the exact dual of that.
Dummy atoms can be eliminated from a full DNF by pulling out, i.e., by replacing each
    (A′_1 ∧ . . . ∧ A′_(i−1) ∧ A_i ∧ A′_(i+1) ∧ . . . ∧ A′_n) ∨ (A′_1 ∧ . . . ∧ A′_(i−1) ∧ ¬A_i ∧ A′_(i+1) ∧ . . . ∧ A′_n)
by the single conjunction
    A′_1 ∧ . . . ∧ A′_(i−1) ∧ A′_(i+1) ∧ . . . ∧ A′_n
Applying this process we eventually get an equivalent full DNF without dummy atoms. Concerning such DNFs the following is provable:
    Two full non-contradictory DNFs without dummy atoms are logically equivalent iff they are the same (except for rearrangements of literals and disjuncts, and dropping repeated occurrences).
The case of full non-tautological CNFs is the exact dual and we shall not repeat it.
Note: The claims just made are true only under the assumption that the DNFs (or the
CNFs) are full. The situation for non-full DNFs (or CNFs) is much more complex and will
not be discussed here.
General Connectives and Complete Connective Sets
The essential feature of a connective, which determines all its semantic properties, is its truth table. Hence, a binary connective ∗ is characterized by the two-argument truth-function defined by
    A_1 ∗ A_2
where A_1 and A_2 represent, respectively, the first and second coordinates. And a unary connective ∗ is characterized by the truth-function defined by ∗A_1.
For any given n-ary truth-function, we can introduce a corresponding n-place connective, one that determines the given function.
There are 16 possible binary truth-functions. This can be seen by noting that the domain of a binary truth-function consists of 4 pairs:
    (T, T), (T, F), (F, T), (F, F)
For each pair there are two possible values: T and F. Hence, there are altogether 2·2·2·2 = 16 possibilities. (In other words, there are 16 possible columns in a truth-table with two atoms.)
Accordingly, when considering binary connectives we have to consider 16 possibilities. Four of these are used, as primitives, in SC. But it is possible to set up languages based on any set of connectives. We can also consider connectives of higher arity, that is, connectives which combine, at one go, more than two sentences.
A set of connectives is called complete if all truth-functions are definable by sentences built by using only connectives from the set.
The theorem proved in the previous subsection shows that ¬, ∧, and ∨ constitute a complete connective set. Since ∨ is expressible in terms of ¬ and ∧ (cf. (5) in 2.2.2), ¬ and ∧ constitute by themselves a complete set. For similar reasons, ¬, together with each of ∨ and →, constitutes a complete set. Hence, the following sets of connectives are complete:
    {¬, ∧}   {¬, ∨}   {¬, →}
Obviously, every set that includes one of them as a subset is complete as well. It can be shown that none of the following is complete:
    {∧, ∨, →, ↔},   {¬, ↔}
Everything concerning the expressive power of sets of unary and binary connectives is well known. There are exactly two binary connectives that form, each by itself, a complete set. One, known as Sheffer's stroke and usually denoted as |, is given by the equivalence:
    A|B ≡ ¬(A ∧ B)
Sheffer's stroke is sometimes called nand ('not (. . . and . . .)'). It is also called alternative denial, because
    A|B ≡ ¬A ∨ ¬B.
To show that Sheffer's stroke is complete, it suffices to show that negation and conjunction are expressible by it. Negation is expressible, since
    ¬A ≡ A|A
We have also:
    ¬(A|B) ≡ A ∧ B
Expressing the left-hand-side negation in terms of Sheffer's stroke, we see that conjunction is expressible as well:
    (A|B)|(A|B) ≡ A ∧ B
The second complete binary connective, whose sign is often ↓, is sometimes called nor, or joint denial. Its truth-table is given by:
    A ↓ B ≡ ¬(A ∨ B)
which is equivalent to:
    ¬A ∧ ¬B
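Claims of this kind can be verified by brute force, since only finitely many truth-value combinations are involved. A sketch (ours) checking the two equivalences used for Sheffer's stroke:

    nand = lambda a, b: not (a and b)

    for a in (True, False):
        assert nand(a, a) == (not a)          # A|A has the truth-table of ~A
        for b in (True, False):
            assert nand(nand(a, b), nand(a, b)) == (a and b)   # (A|B)|(A|B), of A & B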
Note: Among the sixteen possible binary connectives, six are degenerate, i.e., have one or more dummy arguments. These are:
The tautology connective, whose truth-value function assigns to all pairs the value T; and the contradiction connective, whose truth-value function assigns to all pairs the value F.
The first-coordinate connective, whose truth-value function assigns to every pair (x_1, x_2) the first coordinate x_1; and the negated first-coordinate connective, whose truth-value function assigns to every pair the toggled first coordinate.
The second-coordinate connective, and the negated second-coordinate connective.
Homework
6.5 Prove that ↓ is, by itself, complete.
6.6 Recall that the dual of a connective is the connective whose truth-table is obtained by toggling everywhere T and F (cf. 2.5.3). Let →^D and ↔^D be the duals of → and ↔. Show that:
(1) A →^D B is expressible in terms of ¬ and →.
(2) A ↔^D B is expressible in terms of ¬ and ↔.
6.7 Show that, by using →^D as a single connective, a contradictory sentence can be constructed, and that the same is true for ↔^D. Use this to show that negation is expressible in terms of → and →^D, as well as in terms of ↔ and ↔^D. What can you infer from this concerning the completeness of {→, →^D}?
6.8 Show that disjunction is expressible in terms of conditional (i.e., that a sentence logically equivalent to A ∨ B can be constructed from A and B, using → only).
6.9 Let A_1 and A_2 be atoms. Show that if each of A and B is equivalent to one of:
    A_1 → A_1 (a tautology),  A_1,  A_2,  A_1 → A_2,  A_2 → A_1,  A_1 ∨ A_2,
then A → B is also equivalent to one of these sentences. Use this in an inductive argument to prove a certain restriction on the expressive power of the language that has → as the only connective.
6.2 Deductive Systems of Sentential Calculi
6.2.1 On Formal Deductive Systems
Proving claims, reasoning and drawing conclusions are fundamental in all cognitive domains. Logic, as was pointed out, is concerned with certain basic aspects of these activities. The outcome of the proving activity is a proof: a sequence of propositions that are supposed to establish the desired conclusion. As a rule, proofs presuppose an understanding of the concepts under consideration.
Proofs come in many grades of precision and rigour. Proofs in mathematics, for example, are a far cry from proofs in philosophy, which rest on partially understood concepts and on unspecified assumptions, and which are often controversial and subject to unending debates. But mathematics, as well, presupposes a great deal of intuitive understanding. The history of the subject shows that even mathematical proofs have not been immune to error and confusion.
The drive for clarity and rigour has resulted in setups in which proofs are subject to strict requirements. In classical form, a proof should start from certain propositions, chosen from a set fixed in advance, and every step should conform to certain rules. The paradigm of such a system has been Euclidean geometry (dating back to the fourth century B.C.), whose importance in the history of science and ideas can hardly be exaggerated.
The propositions that serve as the starting points of proofs are known as axioms. The rules that determine which steps are allowed are known as inference rules. When a given domain is organized as an axiomatic system, we can think of the axioms as self-evident truths. And we can view the inference rules as obvious truth-preserving rules, i.e., they never lead from true premises to a false conclusion. Proofs constructed in this way are therefore guaranteed to produce true propositions.
The usefulness of proofs lies in the fact that, though each axiom is obvious and each step is simple, the conclusion can be a highly informative, far from obvious statement.
The axiomatic method effected a grand systematization of geometry and gave it a particular shape. It served not only as a fool-proof guard against error, but as a guide for discovering new geometrical truths. At the same time it provided a framework for communicating problems and results. It became a basic paradigm, an example to be followed by scientists and philosophers through centuries.
Euclidean geometry relied, nonetheless, on many intuitions, quite a few of which were left implicit. Later geometricians, who noted these lacunae, made various assumptions explicit in the form of additional axioms. The more precise the system became, the less it relied on unanalysed geometrical intuitions. The big breakthrough came at the turn of the century in the works of Hilbert. He showed that geometry can be completely reduced to a formal system, characterized by a certain set of axioms and a certain way of constructing proofs, which do not require any geometrical intuition. He thereby indicated the possibility of setting up a purely formal deductive system, one that is based on an uninterpreted language and does not presuppose an understanding of the symbols' meaning. In such a system, the construction of proofs amounts to symbol manipulation and belongs to the level of pure syntax.
These developments formed part of the general evolution of modern logic. At about the same time Peano, drawing on Dedekind's work, proposed a formal deductive system for the theory of natural numbers. Frege's systems are essentially fully fledged deductive systems. To a lesser degree this is also true of Russell and Whitehead's Principia Mathematica.
A (formal) deductive system consists of:
(I) A formal language.
(II) Rules that define the system's proofs.
Most often (II) is given by:
(II.1) A set of axioms.
(II.2) A set of inference rules.
Roughly speaking, a proof is a construct formed by repeated application of inference rules; the axioms serve as starting points.
Note: 'Proof' is used here in more than one sense. The proofs that belong to deductive systems are formal structures, represented by arrangements of symbols in sequences, or in trees. But we speak also of our own arguments and reasonings as proofs; for example, the (forthcoming) proofs that every sentence in SC is equivalent to a sentence in DNF, and that a given set of connectives is complete. Do not confuse these two notions of proof! The context always indicates which notion is meant.
A similar ambiguity surrounds the term 'theorem'. We use it to refer to what is proved in a deductive system, as well as to claims we ourselves make. Again, the context indicates the intended meaning.
The importance of deductive systems does not derive from the practicality of their proofs (though some have found computerized applications), but from the light they throw on our reasoning activity. The very possibility of capturing a sizable chunk of our reasoning by means of a completely formal system, one which is itself amenable to mathematical analysis, is extremely significant. We can thus reason about reasoning, and we can prove that some things are provable and some are not.
When restricted to classical sentential logic, deductive systems do not play a crucial role, because truth-table checking can decide whether given premises tautologically imply a given conclusion. Yet they are extremely important. First, because sentential deductive systems constitute the core of richer systems in richer languages, such as first-order logic, where nothing like truth-table checking is available. Second, they serve as a basis and as a point of comparison for various enriched sentential logics, which are beyond the scope of truth tables. Finally, they are the simplest example that beginners can study.
6.2.2 Hilbert-Type Deductive Systems
The simplest type of deductive system is often referred to as the Hilbert type. In this type the axioms are certain sentences and each inference rule consists of: (i) a list of sentence-schemes referred to as the premises, (ii) a sentence-scheme referred to as the conclusion. It is customary to write an inference rule in the form:
    A_1, A_2, . . . , A_m
    ---------------------
             B
where the A_i's are the premises and B is the conclusion. The rule allows us to infer B from A_1, . . . , A_m.
The most common rule is modus ponens:
    A,  A → B
    ---------
        B
which allows us to infer B from the two sentences A → B and A. Here A and B can be any sentences. The rule is a scheme that covers an infinite number of cases. We shall return to it in the next subsection.
In principle, the number of premises (which can vary according to the rule) can be any finite number; but it is usually one or two.
Proofs and Theorems: A proof in a Hilbert-type system is a finite sequence of sentences
    B_1, B_2, . . . , B_n
in which every sentence is either an axiom or is inferred from previous sentences by an inference rule. Stated formally: for every k = 1, . . . , n, either (i) B_k is an axiom, or (ii) there are j_1, . . . , j_m < k, such that B_k is inferred from B_(j_1), B_(j_2), . . . , B_(j_m) by an inference rule.
Terminology: A proof B_1, B_2, . . . , B_n is said to be a proof of B_n; we also say that B_n is the sentence proved by this proof. A sentence is said to be provable if there is a proof of it. A provable sentence is also called a theorem (of the given system).
Note: If B_1, . . . , B_n is a proof, then, trivially, every initial segment of it, B_1, . . . , B_j, where j ≤ n, is a proof. Hence, all sentences occurring in a proof are provable.
Note: We can subsume the concept of axiom under the concept of inference rule, by allowing rules with an empty list of premises. A proof can then be described as a sequence of sentences in which every sentence is inferred from previous ones by some inference rule; axioms are included because they are inferred from the empty set.
It is not difficult to see that the set of theorems of a deductive system can be defined inductively as the smallest set satisfying:
(I) Every axiom is a theorem.
(II) If B_1, . . . , B_m are theorems and A is inferred from B_1, . . . , B_m by an inference rule, then A is a theorem.
From this viewpoint, proofs are constructs that show explicitly that a given sentence is obtainable by applying (I) and (II).
Proofs as Trees:
The identification of proofs with sequences is the simplest, but by no means the only possible, way of defining this concept. Proofs can also be defined as trees. The leaves of the proof-tree are labeled by axioms and every non-leaf is inferred from its children by an inference rule. The sentence that labels the root is the one proved by the tree. Proof-trees take more space, but give a fuller picture that shows explicitly the premises from which each sentence is inferred.
Notation: If D is a deductive system, then
    ⊢_D A
means that A is provable in D. The subscript 'D' is omitted if the intended system is obvious.
6.2.3 A Hilbert-Type Deductive System for Sentential Logic
The following system is one of the simplest deductive systems that are adequate for the purposes of sentential logic. We shall denote it by HS1 ('HS' for 'Hilbert-type Sentential logic').
The language of HS1 is based on our infinite list of atomic sentences and on two connectives:
    ¬ and →
This means that the sentences of HS1 are built from atoms using ¬ and → only. Other connectives are to be expressed, if needed, in terms of ¬ and → (cf. page 215). The axioms of HS1 are all the sentences which fall under one of the following three schemes:
    (A1) A → (B → A)
    (A2) (A → (B → C)) → ((A → B) → (A → C))
    (A3) (¬A → ¬B) → (B → A)
It has a single inference rule, modus ponens:
    A → B,  A
    ---------
        B
Each of (A1), (A2), (A3) covers an infinite number of sentences. For example, the following are axioms, since they fall under (A1):
    (A_2 → A_4) → (A_5 → (A_2 → A_4))
    (A_2 → (A_3 → A_2)) → (A_1 → (A_2 → (A_3 → A_2)))
    A_1 → ((A_1 → A_1) → A_1)
Modus ponens is schematic as well. We can, for example, infer:
    from A_6 → A_2 and A_6: the sentence A_2,
    from (A_1 → A_2) → ¬(A_1 → A_2) and (A_1 → A_2): the sentence ¬(A_1 → A_2),
    from (A_1 → A_3) → [A_3 → A_5] and A_1 → A_3: the sentence A_3 → A_5,
and so on.
Since we employ sentential variables throughout, our claims are of a schematic nature. When we say that
⊢_HS1 A → A
we mean that every sentence of the form A → A is a theorem of HS1. The HS1-proofs we construct are, in fact, proof-schemes. Here, for example, is a proof (scheme) of A → A. For the sake of clarity the sentences are written on separate numbered lines, with marginal indications of the axiom scheme under which each sentence falls or the previous sentences from which it is inferred. (The line-numbers and the marginal indications are not part of the formal proof.)
1. A → ((A → A) → A)                                          ((A1))
2. (A → ((A → A) → A)) → ((A → (A → A)) → (A → A))            ((A2))
3. (A → (A → A)) → (A → A)                                    (from 1. and 2.)
4. A → (A → A)                                                ((A1))
5. A → A                                                      (from 3. and 4.)
1. is an instance of (A1) in which B has been substituted by A → A; the same substitution yields 2. as an instance of (A2); 4. is another instance of (A1), obtained by substituting B by A.
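The definition of a proof is mechanical enough to be checked by a program. Here is a minimal Python sketch of an HS1 proof-checker; the tuple encoding of sentences (('not', X) for ¬X, ('imp', X, Y) for X → Y, strings for atoms) and all function names are my own choices for illustration, not anything from the text.

    A1 = ('imp', 'A', ('imp', 'B', 'A'))
    A2 = ('imp', ('imp', 'A', ('imp', 'B', 'C')),
                 ('imp', ('imp', 'A', 'B'), ('imp', 'A', 'C')))
    A3 = ('imp', ('imp', ('not', 'A'), ('not', 'B')), ('imp', 'B', 'A'))

    def match(scheme, sentence, env):
        # Can 'sentence' be obtained from 'scheme' by substituting A, B, C?
        if scheme in ('A', 'B', 'C'):
            if scheme in env:
                return env[scheme] == sentence
            env[scheme] = sentence
            return True
        return (isinstance(sentence, tuple) and len(scheme) == len(sentence)
                and scheme[0] == sentence[0]
                and all(match(s, t, env) for s, t in zip(scheme[1:], sentence[1:])))

    def is_axiom(s):
        return any(match(ax, s, {}) for ax in (A1, A2, A3))

    def check_proof(lines):
        # Every line must be an axiom or follow from two earlier lines by MP.
        for k, s in enumerate(lines):
            if not (is_axiom(s) or any(lines[i] == ('imp', lines[j], s)
                                       for i in range(k) for j in range(k))):
                return False
        return True

    # The five-line proof of A -> A, with the atom A1 in the role of A:
    a = 'A1'
    aa = ('imp', a, a)
    proof = [('imp', a, ('imp', aa, a)),                             # (A1)
             ('imp', ('imp', a, ('imp', aa, a)),
                     ('imp', ('imp', a, aa), aa)),                   # (A2)
             ('imp', ('imp', a, aa), aa),                            # from 1., 2.
             ('imp', a, aa),                                         # (A1)
             aa]                                                     # from 3., 4.
    print(check_proof(proof))                                        # True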
When proofs are defined as trees, the proof just given becomes a tree: the root, labeled A → A, is inferred from its children 3. and 4.; the node labeled by 3. is in turn inferred from its children 1. and 2.; and the leaves 1., 2. and 4. are labeled by axioms.
Note: When constructing a proof in sequential form, a rule of inference can be applied to any previously constructed sentences. If, in the above-given sequence, we move 4. to the beginning (and leave the rest of the order unchanged) we get a proof of the same sentence, in which the fifth sentence is obtained by applying modus ponens to the first and the fourth. The corresponding proof-tree is unaffected by that modification.
Here is another example: a proof of ¬B → (B → A). The indications in the margin are left as an exercise.
1. (¬A → ¬B) → (B → A)
2. [(¬A → ¬B) → (B → A)] → [¬B → ((¬A → ¬B) → (B → A))]
3. ¬B → (¬A → ¬B)
4. ¬B → ((¬A → ¬B) → (B → A))
5. [¬B → ((¬A → ¬B) → (B → A))] → [(¬B → (¬A → ¬B)) → (¬B → (B → A))]
6. (¬B → (¬A → ¬B)) → (¬B → (B → A))
7. ¬B → (B → A)
Homework 6.10 Find for each sentence in the last proof the axiom scheme under which it falls, or the two previous sentences from which it is inferred (by modus ponens). In the case of an axiom, write the substitution that yields the desired instance.
Proving that Something Is Provable
Finding proofs in HS1 by mere trial and error is far from easy. But there are techniques for showing that proofs exist, and for producing them if necessary, without having to construct them
explicitly. For one thing, we can use sentences that have been proved already, or shown to have proofs, as axioms. Having shown that ⊢_HS1 A → A, we can use it in the following sequence in order to show that B → (A → A) is provable as well:

(A → A) → (B → (A → A)),   A → A,   B → (A → A)
The first sentence is an instance of (A1) and the third is derived from the previous two by modus ponens. The sequence is not a proof, because A → A is neither an axiom nor derivable by modus ponens from previous sentences. But we can replace A → A by the sequence that constitutes its proof, and the enlarged sequence is (it is easy to see) a proof of B → (A → A). We have thus shown that the sentence is provable, without constructing a proof. If called upon, we can provide one. Applying the same principle again, we can use, from now on, B → (A → A).
Derived Inference Rules: Just as we can use theorems, we can use additional inference
rules, provided that we show that everything provable with the help of these rules is also
provable in the original system. Such rules are known as derived inference rules.
Usually, the proof that a certain rule is derived will also show how every application of it can be reduced to applications of the original axioms and rules. For example, the following is a derived inference rule of HS1: from A, infer ¬A → B.

To prove this we have to show that from A we can get, by applying the axioms and rules of HS1, ¬A → B. Here is how to do it. We shall use ¬A → (A → B), whose provability has been established (cf. Homework 6.10, with A and B switched).
1. A
2. ¬A → (A → B)
3. [¬A → (A → B)] → [(¬A → A) → (¬A → B)]
4. (¬A → A) → (¬A → B)
5. A → (¬A → A)
6. ¬A → A
7. ¬A → B
Here 3. is an instance of (A2) and 5. is an instance of (A1); 4. is inferred from 2. and 3., 6. from 1. and 5., and 7. from 4. and 6.
Proofs From Premises
In the rest of this subsection, ⊢ stands for ⊢_HS1.
The concept of provability is naturally extendible to provability from premises. As in 4.2, we use Γ, Δ, Γ′, Δ′, etc., for premise lists. The other notations introduced there will serve here as well.
Definition: A proof (in HS1) from a premise list Γ is a sequence of sentences B_1, . . . , B_n such that every B_i is either (i) an axiom, or (ii) a member of Γ, or (iii) inferred from two previous members by modus ponens. We say that B_1, . . . , B_n proves B_n from Γ. A sentence is provable from Γ if it has a proof from Γ; we denote this by:
Γ ⊢ B
Our previous concept of proof is the particular case in which Γ is empty. The notation in that case conforms to our previous usage:
⊢ B
Note: A premise list is not a list of additional axiom schemes. Sentential variables that are used in writing premise-lists are meant to denote some particular unspecified sentences. But any general proof that Γ ⊢ B remains valid upon substitution of the sentential variables by any sentences.
Example: The following shows that A → (A → B) ⊢ A → B:

(A → (A → B)) → ((A → A) → (A → B)),   A → (A → B),   (A → A) → (A → B),   A → A,   A → B

The first sentence is an axiom (an instance of (A2)), the second is the premise, the third follows by modus ponens. The fourth is a previously established theorem and the last is obtained by modus ponens. The full formal proof from the premise A → (A → B) is obtained if we replace A → A by its proof.
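The checker sketched earlier extends naturally to proofs from premises: clause (ii) of the definition just adds a membership test. A sketch, reusing is_axiom and the proof list from that earlier code (again, the encoding is mine):

    def check_proof_from(gamma, lines):
        for k, s in enumerate(lines):
            if not (is_axiom(s) or s in gamma
                    or any(lines[i] == ('imp', lines[j], s)
                           for i in range(k) for j in range(k))):
                return False
        return True

    # The example above, with the five-line proof of A -> A spliced in:
    a, b = 'A1', 'A2'
    premise = ('imp', a, ('imp', a, b))
    seq = ([('imp', premise, ('imp', ('imp', a, a), ('imp', a, b))),  # (A2)
            premise,                                                  # the premise
            ('imp', ('imp', a, a), ('imp', a, b))]                    # modus ponens
           + proof                                                    # proof of A -> A
           + [('imp', a, b)])                                         # modus ponens
    print(check_proof_from([premise], seq))                           # True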
The set of sentences provable from Γ can be defined inductively as the smallest set containing all axioms and all members of Γ, and closed under modus ponens.
Note: The concepts expressed by '⊨' and '⊢' (both of which are symbols of our metalanguage) are altogether different. The first is semantic, defined in terms of interpretations and truth-values; the second is purely syntactic, defined in terms of formal inference rules. We shall see, however, that there are very close ties between the two. The establishing of these ties is one of the highlights of modern logic.
The following is obvious.
(1) If all the sentences in Γ occur in Γ′, then every sentence provable from Γ is provable from Γ′.

We also have:

(2) If Γ ⊢ A and Γ, A ⊢ B, then Γ ⊢ B.
Intuitively, we may use A in proving B from Γ, because we can prove A itself from Γ. In a more formal manner: let A_1, . . . , A_{k−1}, A be a proof of A from Γ and let B_1, . . . , B_{m−1}, B be a proof of B from Γ, A; then
A_1, . . . , A_{k−1}, A, B_1, . . . , B_{m−1}, B
is a proof of B from Γ. (The occurrences of A in B_1, . . . , B_{m−1}, B can be inferred from previous sentences in A_1, . . . , A_{k−1}.) When proofs are trees, the proof of B from Γ is obtained by taking a proof of B from Γ, A and expanding every leaf labeled by A into a proof of A from Γ.
All the claims that we establish here for ⊢ hold if we replace ⊢ by ⊨ (which, we shall see, is not accidental). For example, (2) is the exact analogue of (9) of 4.2. But the arguments that establish the properties of ⊢ are very different from those that establish their analogues for ⊨.
The Deduction Theorem   The following is the syntactic analogue of (⊨, →) (cf. 4.2.1 (7)).

(3) Γ, A ⊢ B   iff   Γ ⊢ A → B.

The easy direction is from right to left: consider a proof of A → B from Γ. If we add A to the premise-list, we can get, via modus ponens, B.

The difficult direction is known as the Deduction Theorem:

If Γ, A ⊢ B then Γ ⊢ A → B.
Here is its proof. Consider a proof of B from Γ, A:
B_1, B_2, . . . , B_n
We show how to change it into a proof of A → B from Γ. First, construct the sequence of the corresponding conditionals:
A → B_1, A → B_2, . . . , A → B_i, . . . , A → B_n
This, as a rule, is not a proof from Γ. But we can insert before each A → B_i sentences so that the resulting sequence is a proof of A → B_n from Γ. Each B_i in the proof from Γ, A is either (i) an axiom or a member of Γ, or (ii) the sentence A, or (iii) inferred from two previous members by modus ponens.

If (i) is the case, insert before A → B_i the sentences
B_i → (A → B_i),   B_i
The first is an axiom, the second is (by our assumption) either an axiom or a member of Γ. From these two, A → B_i is now inferred by modus ponens.
If (ii) is the case, then A → B_i = A → A, which, as we have seen, is provable in HS1; we can therefore insert before A → A sentences that, together with it, constitute its proof in HS1.

The remaining case is (iii). Say the two previous B_j's which yield B_i via modus ponens are B_k → B_i and B_k. The original proof of B is something of the form:
. . . , B_k → B_i, . . . , B_k, . . . , B_i, . . .
(The relative order of B_k and B_k → B_i doesn't matter.) This is converted into:
. . . , A → (B_k → B_i), . . . , A → B_k, . . . , A → B_i, . . .
Now insert before A → B_i the sequence:
[A → (B_k → B_i)] → [(A → B_k) → (A → B_i)],   (A → B_k) → (A → B_i)
The first is an axiom (an instance of (A2)); the second is inferred from A → (B_k → B_i) and the first by modus ponens. And now A → B_i is inferred by modus ponens from (A → B_k) → (A → B_i) and A → B_k.

After carrying out all the insertions, every sentence in the resulting sequence is either an axiom or a member of Γ, or inferred from two previous members by modus ponens. Hence we get a proof of A → B from Γ.
QED
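The proof just given is constructive, and the construction can be written out as a short program. The following Python sketch (reusing is_axiom and the encoding from the earlier checker; the helper names are mine) converts a proof of B from Γ, A into a proof of A → B from Γ, handling the three cases exactly as above. It assumes its input really is a legal proof from gamma + [a].

    def imp(x, y):
        return ('imp', x, y)

    def proof_of_self_imp(a):
        # The five-line proof of a -> a, as a function of a.
        aa = imp(a, a)
        return [imp(a, imp(aa, a)),
                imp(imp(a, imp(aa, a)), imp(imp(a, aa), aa)),
                imp(imp(a, aa), aa),
                imp(a, aa),
                aa]

    def deduction(gamma, a, proof):
        out = []
        for i, b in enumerate(proof):
            if b == a:                                   # case (ii)
                out.extend(proof_of_self_imp(a))
            elif is_axiom(b) or b in gamma:              # case (i)
                out.extend([imp(b, imp(a, b)), b, imp(a, b)])
            else:                                        # case (iii): find the MP pair
                k, j = next((k, j) for k in range(i) for j in range(i)
                            if proof[k] == imp(proof[j], b))
                bk = proof[j]
                out.extend([imp(imp(a, imp(bk, b)),
                                imp(imp(a, bk), imp(a, b))),          # (A2)
                            imp(imp(a, bk), imp(a, b)),
                            imp(a, b)])
        return out

    # E.g. the one-line proof [A] of A from the premise A is converted into
    # the five-line proof of A -> A:
    print(deduction([], 'A1', ['A1']) == proof_of_self_imp('A1'))     # True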
The deduction theorem is a powerful tool for showing the provability of various sentences. We can now employ the technique of transferring the antecedent to the left-hand side, which was used in the context of logical implication. Here, for example, is an argument showing that

⊢ (A → B) → [(B → C) → (A → C)]

Using the deduction theorem, it suffices to show that:

(a) A → B ⊢ (B → C) → (A → C)
Again, by the deduction theorem, the following is sufficient for establishing (a):

(b) A → B, B → C ⊢ A → C

Using the deduction theorem for the third time, (b) reduces to:

(c) A → B, B → C, A ⊢ C

But (c) is obvious: from A → B and A we can infer B, and from B and B → C we can infer C.
Note: The concept of proof from premises is definable for general deductive systems. Some systems have inference rules whose applicability to arbitrary sentences is subject to certain restrictions. In such systems the definition of proofs from premises is modified accordingly. However, (2), (3), and all other properties that are analogues of the implication laws, hold throughout.
Homework 6.11 Prove the following:

1. ¬A, A ⊢ B
Hint: use ¬A → (¬B → ¬A), axiom (A3) and twice modus ponens.

2. ¬¬A ⊢ A
Hint: use 1. with A and B replaced by their negations, transfer ¬A via the deduction theorem, choose for B any axiom (or theorem) of HS1.

3. A ⊢ ¬¬A
Hint: get from 2. ⊢ ¬¬A → A, replace A by ¬A, then use an instance of (A3) and modus ponens.

4. If Γ, A ⊢ B then Γ, ¬B ⊢ ¬A.
Hint: show that the assumption implies that Γ ⊢ ¬¬A → ¬¬B, then use (A3) and modus ponens.

5. If Γ, A ⊢ ¬B then Γ, B ⊢ ¬A.
Hint: use 2. and 3. to show that the assumption implies that Γ, ¬¬A ⊢ ¬B; then use 4.

6. A, ¬B ⊢ ¬(A → B)
Hint: apply 5. to: A, A → B ⊢ B.

7. If Γ, ¬A ⊢ C and Γ, B ⊢ C, then Γ, A → B ⊢ C.
Hint: get from the first assumption, via 5. and 2., Γ, ¬C ⊢ A; get from the second Γ, ¬C ⊢ ¬B; applying 6., get Γ, ¬C ⊢ ¬(A → B), then apply 4.

8. ¬(A → B) ⊢ A
Hint: get from 1. ¬A ⊢ A → B; then apply 5. (with an empty Γ) and 2.

9. ¬(A → B) ⊢ ¬B
Hint: get from (A1) B ⊢ A → B; then apply 5.
6.2.4 Soundness and Completeness
A formal deductive system is defined without recourse to semantic notions. Its significance, however, derives from its relation to some semantics. At least, this is the case in systems that are based on classical logic or some variant of it. The semantics is given by a class of possible interpretations, in each of which every sentence gets a truth-value.
Soundness
A deductive system, D, is said to be sound for a given semantics, if everything provable in D is true in all interpretations. This has a generalized form that applies to proofs from premises:

For all Γ and all A, if Γ ⊢_D A, then there is no interpretation in which all members of Γ are true and A is false.

Roughly speaking, it means that the proofs of the system can never lead us from true premises to false conclusions.[1]
The soundness of D is proved by establishing the following two claims:
(S1) Every axiom of D is true in all interpretations.
(S2) The inference rules preserve truth-in-all-interpretations, that is: if the premises
of an inference rule are true in all interpretations, so is the conclusion.
For the generalized form, (S2) is replaced by:

(S2′) For every interpretation, if all premises of an inference rule are true in the interpretation, its conclusion is true in it as well.
[1] The generalized form can be deduced directly from the first, provided that the underlying language has (or can express) → and the deduction theorem holds. In other cases the term 'strong soundness' is sometimes used for the generalized form.
(S1) and (S2) imply that all sentences constructed in the course of a proof of D are true in all interpretations. We never get outside the set of all true-in-all-interpretations sentences, because the axioms are in that set and all applications of inference rules leave us in it. Similarly, (S1) and (S2′) imply that, for any given interpretation, if the premises are true in the interpretation, then every proof from these premises leaves us within the set of sentences true in that interpretation. It is an inductive argument: the set of provable (or provable from Γ) sentences is the smallest set containing the axioms (and the members of Γ) and closed under the inference rules. To show that all the sentences in this set have some property, we show that all axioms (and all members of Γ) have the property, and that the set of sentences having this property is closed under the inference rules.
In the case of the sentential calculus, the interpretations consist of all truth-value assignments to the atoms. Truth under all interpretations means tautological truth. Presupposing this semantics, the soundness of HS1 is the requirement that every provable sentence is a tautology; or, in symbols, that for every A:
⊢ A  ⟹  ⊨ A
Similarly, generalized soundness means that for every Γ and A:
Γ ⊢ A  ⟹  Γ ⊨ A
To prove that HS1 is sound we show (S1) and (S2) (for the generalized form, (S2′)). That proof is easy: (S1) means that all axioms are tautologies, which can be verified directly by considering truth-tables. Since modus ponens is our only inference rule, (S2′) amounts to the claim that whenever A → B and A are true in some interpretation, so is B. This, by the truth-table of →, is trivial.
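Both checks are easily mechanized. The following sketch (reusing the scheme tuples A1, A2, A3 from the earlier checker; the function names are mine) verifies (S1) by brute force: since each scheme is built from the letters A, B, C, it suffices to check that the schemes themselves are tautological in these letters, because substitution instances of tautologies are tautologies.

    from itertools import product

    def value(s, v):
        # Truth-value of sentence s under the assignment v (a dict on atoms).
        if isinstance(s, str):
            return v[s]
        if s[0] == 'not':
            return not value(s[1], v)
        return (not value(s[1], v)) or value(s[2], v)      # 'imp'

    def tautology(s, atoms=('A', 'B', 'C')):
        return all(value(s, dict(zip(atoms, vals)))
                   for vals in product([True, False], repeat=len(atoms)))

    print(all(tautology(ax) for ax in (A1, A2, A3)))       # True: (S1) holds

(S2′) corresponds to the observation that if value(('imp', a, b), v) and value(a, v) are both True, then value(b, v) must be True, which is immediate from the last line of value.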
Completeness
A deductive system D is complete with respect to the given semantics, if every sentence that is true under all interpretations is provable in D.

If, again, we consider also proofs from premises, we get the generalized form of completeness:

If, in every interpretation in which all members of Γ are true, A is true, then A is provable from Γ.

The non-generalized form is the particular case where Γ is empty.[2]
Completeness means that the deductive system is powerful enough for proving all sentences that are always true (or, in the generalized form, for establishing all logical implications).
[2] The generalized form can be deduced directly from the first provided that: (i) the underlying language has (or can express) →, (ii) modus ponens is a primitive or derived inference rule, and (iii) only finite premise-lists are considered. In other cases the term 'strong completeness' is sometimes used for the generalized notion.
Completeness is of course a highly desirable property. Deductive systems that are not complete fail to express the full content of the semantics. But completeness is not as essential as soundness. It is also much more difficult to prove. Unlike soundness, there is no general inductive argument for establishing it.
Once the language and its semantics (set of possible interpretations) are chosen, the existence of a formal deductive system that is both sound and complete is extremely significant. It means that by using a purely syntactic system we can characterize basic semantic notions. For many interpreted languages completeness is unachievable.[3] The most notable case in which we have a deductive system that is both sound and complete is first-order logic, the subject of this book.
In the case of HS1 completeness means that, for every sentence A:
⊨ A  ⟹  ⊢ A
Or, in the generalized form, for every Γ and A:
Γ ⊨ A  ⟹  Γ ⊢ A
If both soundness and completeness hold, then we have:
Γ ⊨ A  ⟺  Γ ⊢ A
Soundness is the ⟸-direction, completeness the ⟹-direction.
The Completeness of HS1: The fool-proof top-down method of chapter 4 (cf. 4.3) can be used in order to show that HS1 is complete. Among other things, we have noted in 4.3 that the method applies to any sublanguage of SC that has negation among its connectives. Any true implication claim can therefore be derived by applying (repeatedly) the laws correlated with these connectives to self-evident implications of the types:

(I.1) Γ, A ⊨ A      (I.2) Γ, A, ¬A ⊨ B

In the case of HS1, the only connectives are ¬ and →. Consequently, there are six laws altogether. Three cover the cases of double negations, conditionals and negated conditionals in the premises, and three cover the same cases in the conclusion. The first group is:

(Pr1) Γ, A ⊨ C  ⟹  Γ, ¬¬A ⊨ C
(Pr2) Γ, ¬A ⊨ C and Γ, B ⊨ C  ⟹  Γ, A → B ⊨ C
(Pr3) Γ, A, ¬B ⊨ C  ⟹  Γ, ¬(A → B) ⊨ C
[3] The most important is the language of arithmetic, which describes the natural-number system with addition and multiplication. Gödel's incompleteness theorem shows that completeness is out of the question, for this and any richer language.
The second group is:

(Cn1) Γ ⊨ A  ⟹  Γ ⊨ ¬¬A
(Cn2) Γ, A ⊨ B  ⟹  Γ ⊨ A → B
(Cn3) Γ ⊨ A and Γ ⊨ ¬B  ⟹  Γ ⊨ ¬(A → B)

In a bottom-up proof we start with implications of the types (I.1) and (I.2) and apply the laws repeatedly in the ⟹ direction. In order to establish completeness we shall prove analogous claims for the ⟹ directions, where ⊨ is replaced by ⊢. That is, we show the following:

(I.1′) Γ, A ⊢ A      (I.2′) Γ, A, ¬A ⊢ B

as well as:

(Pr1′) Γ, A ⊢ C  ⟹  Γ, ¬¬A ⊢ C
(Pr2′) Γ, ¬A ⊢ C and Γ, B ⊢ C  ⟹  Γ, A → B ⊢ C
(Pr3′) Γ, A, ¬B ⊢ C  ⟹  Γ, ¬(A → B) ⊢ C
(Cn1′) Γ ⊢ A  ⟹  Γ ⊢ ¬¬A
(Cn2′) Γ, A ⊢ B  ⟹  Γ ⊢ A → B
(Cn3′) Γ ⊢ A and Γ ⊢ ¬B  ⟹  Γ ⊢ ¬(A → B)
Assume for the moment that we have shown this.
In 4.3 we claimed that, starting with an initial goal, the reduction is bound to terminate in a tree in which all end goals (in the leaves) are elementary. We appealed to the fact that the reductions always reduce the goal's complexity. Here we shall make this reasoning precise by turning it into an inductive argument.
Let the weight of a sequence of sentences, Δ, be the sum of all numbers contributed by connective occurrences in Δ, where each occurrence of ¬ contributes 1 and each occurrence of → contributes 2. It is easily seen that, in each of the six claims given above, the sequence of sentences involved in the conclusion on the right-hand side (of ⟹) has greater weight than the sequence of sentences involved in each of the left-hand-side premises.
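For the tuple-encoded sentences used in the earlier sketches, the weight is a three-line function; the concrete numbers below illustrate the claim for (Pr3′), where the premise side A, ¬B weighs 2 less than the conclusion side ¬(A → B). (The encoding and the names are mine.)

    def weight(s):
        if isinstance(s, str):
            return 0
        if s[0] == 'not':
            return 1 + weight(s[1])
        return 2 + weight(s[1]) + weight(s[2])             # 'imp'

    def weight_of_list(sentences):
        return sum(weight(s) for s in sentences)

    print(weight_of_list(['A', ('not', 'B')]))             # 1
    print(weight_of_list([('not', ('imp', 'A', 'B'))]))    # 3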
We now show, by induction on the weight of Γ, A, that if
Γ ⊨ A
then
Γ ⊢ A
If the weight is 0, all the sentences in Γ, A are atoms. In this, and in the more general case where all the sentences are literals, we have: if Γ ⊨ A, then either A occurs in Γ or some atom and its negation occur in Γ. Otherwise, we can define (as in 4.3.3) a truth-value assignment under which all the premises come out true and A comes out false. Hence, by (I.1′) and (I.2′), Γ ⊢ A. If not all sentences are literals, then either some premise or the conclusion is a conditional, or a negated conditional, or a doubly negated sentence. Each of these possibilities is taken care of by the corresponding claim from (Pr i′), (Cn i′), i = 1, 2, 3. For example, consider the case where Γ = Γ′, B → C, i.e., where we have:
Γ′, B → C ⊨ A
By the ⟸-direction of (Pr2) we get:
Γ′, ¬B ⊨ A   and   Γ′, C ⊨ A
Each of Γ′, ¬B, A and Γ′, C, A has smaller weight than the weight of Γ′, B → C, A. Hence, by the induction hypothesis:
Γ′, ¬B ⊢ A   and   Γ′, C ⊢ A
Applying (Pr2′) we get:
Γ′, B → C ⊢ A
It remains to prove (I.1′), (I.2′), (Pr i′), (Cn i′), i = 1, 2, 3. Of these, (I.1′) is trivial. (Cn2′) is the deduction theorem. The rest follow from the claims of Homework 6.11: (I.2′) follows from 1., (Pr1′) follows from 2., (Pr2′) is 7., (Pr3′) follows from 8. and 9., (Cn1′) follows from 3., and (Cn3′) follows from 6. This completes the proof of the completeness of HS1.
6.2.5 Gentzen-Type Deductive Systems

Gentzen-type systems operate not on sentences but on sequents: constructs of the form Γ ⇒ A, where Γ is a finite (possibly empty) list of sentences and A is a sentence. A sequent is valid if there is no truth-value assignment that makes all sentences of Γ true and A false. The system GS1 formalizes the top-down method of 4.3.3. Its axioms are all sequents that fall under the schemes:

(GA1) Γ, A ⇒ A      (GA2) Γ, A, ¬A ⇒ B

One of its inference rules is reordering: from Γ ⇒ A infer Γ′ ⇒ A, where Γ′ is obtained by reordering Γ.
The other inference rules, which constitute the heart of the system, correspond to the laws of 4.3.2. We have antecedent rules and succedent rules for double negation, for each binary connective and for each negated binary connective.

In the following list the rules are arranged in two groups. The first consists of all antecedent rules, the second of all succedent rules. Each row contains the rules for some connective and its negation. (This is different from the arrangement of the laws in 4.3.2. But you can easily see the correspondence, which is also indicated by the rules' names.)

Note that the laws in 4.3.2 are iff statements. The ⟹-direction of the law is the premises-to-conclusion direction of the corresponding rule. Thus, the sequents of the premises are always simpler than the conclusion sequent.
ANTECEDENT RULES

(¬¬ ⇒)   from  Γ, A ⇒ C  infer  Γ, ¬¬A ⇒ C

(∧ ⇒)    from  Γ, A, B ⇒ C  infer  Γ, A ∧ B ⇒ C
(¬∧ ⇒)   from  Γ, ¬A ⇒ C  and  Γ, ¬B ⇒ C  infer  Γ, ¬(A ∧ B) ⇒ C

(∨ ⇒)    from  Γ, A ⇒ C  and  Γ, B ⇒ C  infer  Γ, A ∨ B ⇒ C
(¬∨ ⇒)   from  Γ, ¬A, ¬B ⇒ C  infer  Γ, ¬(A ∨ B) ⇒ C

(→ ⇒)    from  Γ, ¬A ⇒ C  and  Γ, B ⇒ C  infer  Γ, A → B ⇒ C
(¬→ ⇒)   from  Γ, A, ¬B ⇒ C  infer  Γ, ¬(A → B) ⇒ C

(↔ ⇒)    from  Γ, A, B ⇒ C  and  Γ, ¬A, ¬B ⇒ C  infer  Γ, A ↔ B ⇒ C
(¬↔ ⇒)   from  Γ, A, ¬B ⇒ C  and  Γ, ¬A, B ⇒ C  infer  Γ, ¬(A ↔ B) ⇒ C
SUCCEDENT RULES

(⇒ ¬¬)   from  Γ ⇒ B  infer  Γ ⇒ ¬¬B

(⇒ ∧)    from  Γ ⇒ A  and  Γ ⇒ B  infer  Γ ⇒ A ∧ B
(⇒ ¬∧)   from  Γ, A ⇒ ¬B  infer  Γ ⇒ ¬(A ∧ B)

(⇒ ∨)    from  Γ, ¬A ⇒ B  infer  Γ ⇒ A ∨ B
(⇒ ¬∨)   from  Γ ⇒ ¬A  and  Γ ⇒ ¬B  infer  Γ ⇒ ¬(A ∨ B)

(⇒ →)    from  Γ, A ⇒ B  infer  Γ ⇒ A → B
(⇒ ¬→)   from  Γ ⇒ A  and  Γ ⇒ ¬B  infer  Γ ⇒ ¬(A → B)

(⇒ ↔)    from  Γ, A ⇒ B  and  Γ, B ⇒ A  infer  Γ ⇒ A ↔ B
(⇒ ¬↔)   from  Γ, A ⇒ ¬B  and  Γ, ¬A ⇒ B  infer  Γ ⇒ ¬(A ↔ B)
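Because the rules are reversible (see below), they yield not just a proof system but a decision procedure: reduce the goal sequent backwards until only literals remain, then check the axiom condition. Here is a Python sketch for the ¬, → fragment (the language of HS1); the encoding is the one used in the earlier sketches, and the whole function is my own illustration of the method, not anything from the text.

    def is_literal(s):
        return isinstance(s, str) or (s[0] == 'not' and isinstance(s[1], str))

    def provable(gamma, c):
        # Is the sequent gamma => c provable in GS1 (not/imp fragment)?
        if all(is_literal(s) for s in gamma) and is_literal(c):
            # Axiom check, (GA1)/(GA2):
            return c in gamma or any(('not', s) in gamma for s in gamma)
        for i, s in enumerate(gamma):                  # antecedent rules
            if is_literal(s):
                continue
            rest = gamma[:i] + gamma[i+1:]
            if s[0] == 'imp':                                         # (-> =>)
                return (provable(rest + [('not', s[1])], c)
                        and provable(rest + [s[2]], c))
            if s[1][0] == 'not':                                      # (~~ =>)
                return provable(rest + [s[1][1]], c)
            return provable(rest + [s[1][1], ('not', s[1][2])], c)    # (~-> =>)
        if c[0] == 'imp':                              # succedent rules: (=> ->)
            return provable(gamma + [c[1]], c[2])
        if c[1][0] == 'not':                                          # (=> ~~)
            return provable(gamma, c[1][1])
        return (provable(gamma, c[1][1])                              # (=> ~->)
                and provable(gamma, ('not', c[1][2])))

    print(provable([], ('imp', ('not', ('not', 'A')), 'A')))          # True
    print(provable([], ('imp', 'A', 'B')))                            # False

Each recursive call strictly decreases the weight of the sequent, so the search terminates; reversibility is what licenses returning directly on the first applicable rule.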
Soundness and Completeness of GS1: The soundness of GS1 follows easily by observing that every axiom is valid and, for every inference rule, if all its premises are logically valid, so is the conclusion. This is exactly the ⟹-direction of the corresponding law in 4.3.2.
The completeness of GS1 follows from the fact that every true implication can be established using the method of 4.3.3: start with self-evident implications and apply the ⟹-directions of the laws. These applications correspond exactly to the steps of a proof in GS1.
The rigorous inductive argument follows the same lines as the argument used to prove the completeness of HS1. With each sequent we associate its weight, defined as follows: sum all the numbers contributed by connective-occurrences, where every occurrence of ¬ contributes 1, every occurrence of ∧, ∨ and → contributes 2, and every occurrence of ↔ contributes 3. It is easily seen that, in all the inference rules, each of the premises has smaller weight than the conclusion. (The weight is some rough measure that reflects the fact that, in every inference rule, each of the premises is 'simpler' than the conclusion. Any 'simplicity measure' that has this property would do for our purposes.)
We now prove, by induction on the weight of Γ ⇒ A, that if Γ ⇒ A is valid then it is provable in GS1.
If all the sentences in the sequent are literals, then the sequent is valid iff either (i) the succedent is one of the antecedent sentences or (ii) the antecedent contains a sentence and its negation. (Otherwise we can define, as in 4.3.3, a truth-value assignment that makes all antecedent literals true and the succedent false.) In either case the sequent is an axiom.

If not all sentences in Γ ⇒ A are literals, then one of them is either of the form ¬¬C, or of the form C ∗ D, or of the form ¬(C ∗ D) (where ∗ is a binary connective). In each case our sequent can be inferred, from one or two premises, by applying the corresponding rule (an antecedent rule if the sentence is in the antecedent, a succedent rule if it is the succedent).
We now invoke an important feature of our rules:
Reversibility: In each rule, if the conclusion is valid, so are all the premises.
This is the ⟸-direction of the laws of 4.3.2. Since Γ ⇒ A is assumed to be valid, the premises of the rule that yields it are valid as well. Since each premise has smaller weight, the induction hypothesis implies that it is provable in GS1. Therefore Γ ⇒ A is provable.

QED
The Gentzen-Type Deductive System GS2
GS2 formalizes the proof-by-contradiction method of chapter 4 (cf. 4.4) exactly in the way that GS1 formalizes the method of 4.3.3. In addition to the usual sequents, GS2 has sequents of the form:
A_1, . . . , A_n ⇒ ⊥
where ⊥ is a special symbol (signifying contradiction). ⊥ can appear only as the right-hand side of a sequent.
By stipulation, the truth-value of ⊥ is F, under any assignment of truth-values to the atomic sentences. The valid sequents are defined as before. This means that Γ ⇒ ⊥ is valid iff there is no truth-value assignment that makes all the sentences in Γ true.

Instead of the two axiom schemes (GA1) and (GA2), GS2 has one axiom scheme:

(GA3) Γ, A, ¬A ⇒ ⊥
The reordering rule is now supposed to cover also sequents of the new form.
GS2 also has the following inference rule:

(Contradiction)   from  Γ, ¬A ⇒ ⊥  infer  Γ ⇒ A

The other rules of GS2 are the antecedent rules of GS1, with the difference that the succedent is always ⊥. (That is, in the list above, replace C by ⊥.)

After applying the Contradiction Rule no other rule can be applied. Hence, the only way to prove Γ ⇒ A is to prove Γ, ¬A ⇒ ⊥ and then apply the Contradiction Rule.
The soundness and completeness of GS2 are proved in the same way as they are proved for
GS1.
Chapter 7
Predicate Logic Without Quantiers
7.0
Taking a further step in the analysis of sentences, we set up a language in which the atomic
sentences are made of smaller units: individual constants and predicates (or relation symbols).
Individual Constants
Individual constants are basic units that function like singular names of natural language,
that is, names that denote particular objects. For example:
The Moon, Ann, Everest, Chicago, Bill, The USA.
An interpretation of the formal language associates with every individual constant a denoted object, referred to as the constant's denotation. The object can be arbitrary: a person, a material body, a spatio-temporal region, an organization, the number 1, whatever.
In natural language a name can be ambiguous, e.g., 'Everest' the mountain and 'Everest' the officer. Usually, the intended denotation can be determined from context. A name may also lack denotation, e.g., 'Pegasus'. (We have discussed this at some length in chapter 1.) But in predicate logic each individual constant has, in any given interpretation of the language, exactly one denotation. Different individual constants may have the same denotation, just as in natural language an object can have several names.
The denotations of the individual constants depend on the interpretation. The syntax leaves
them undetermined. On the purely syntactic level the individual constants are mere symbols,
which function as building blocks of sentences.
We shall assume that
a, b, c, d, etc.
are individual constants. It is also convenient to use 'a', 'b', 'c', etc. as variables ranging over individual constants. Thus, we may say: 'For every individual constant c', or 'There is an individual constant b'. Occasionally we shall help ourselves to names borrowed from English:
Jill, Jack, Bill, 5, etc.
Predicates
Predicates, known also as relation symbols, combine with individual constants to make sentences. Grammatically, they fulfill a role analogous to English verb-phrases. But they express properties or relations. For example, in order to translate
Ann is happy
into predicate logic, we need an individual constant, say a, denoting Ann, and a predicate, say H, that does the job of '... is happy'. The translated sentence is:
H(a)
H expresses, under the intended interpretation, the property of being happy. As we shall see later, it is interpreted simply as the set of all happy people.
Similarly, the translation of
Ann likes Bill
is obtained by using individual constants, say a and b, for denoting Ann and Bill, and a predicate, say L, to play the role of '... likes ---'. The translation is then:
L(a,b)
Here L is supposed to express the relation of liking. Actually it is interpreted as a set-theoretical relation (cf. 5.1.4 page 167): the set of all pairs (x, y) such that x likes y.
In the first case H is a one-place predicate. It comes with one empty place. When we fill the empty place with an individual constant we get a sentence. In the second case L is a two-place predicate. It comes with two empty places. We get a sentence when both places are filled with constants. The same individual constant can be used twice, e.g.,
L(a,a)
which, under the interpretation just given, reads as:
Ann likes Ann,  or  Ann likes herself.
The number of places that a predicate has is known as its arity. In our example H is unary
(or monadic) and L is binary. The arity can be shown by indicating the empty places:
H( ),  L( , ).
A predicate can have any finite arity. For example, if a*, b*, and c* are points on a line, we can translate:
a* is between b* and c*
into:
Bet(a,b,c)
Here Bet( , , ) is a ternary predicate (interpreted as the three-place betweenness relation) and a, b, and c are interpreted as denoting a*, b*, and c*.
In general, an n-ary predicate is interpreted as some n-ary relation, where this is (as in set theory) a set of n-tuples. For the moment it suffices to note that predicates are analogous to constructs such as '... is happy', '... is red', '... likes ---', '... is greater than ---', '... is between --- and ---', etc.
We assume that
P, R, S, P′, R′, etc.
are predicates of the formal language. It is also convenient to use 'P', 'R', etc. as variables ranging over the predicates. Thus, we might say: 'For some monadic predicate P', or 'Whenever P is a ternary predicate', etc. We may also help ourselves to suggestive expressions such as:
Happy( ), FatherOf( , ), Larger( , ), etc.
As far as the formal uninterpreted language is concerned, predicates are mere symbols, each with an associated positive integer (called its arity), which can be combined in a well-defined way with individual constants to form sentences. The interpretation of these symbols varies with the interpretation of the language. H( ) can be interpreted as the set of happy people in one interpretation, as the set of humans in another, as the set of all animals in a third, and as the empty set in a fourth. The same goes for individual constants.
Note: 'Predicate' has several meanings. It can refer to properties that can be affirmed of objects ('rationality is a predicate of man'). It is also a verb denoting the attributing of a property to an individual ('wisdom is predicated of Socrates'). Don't confuse these meanings with the present technical sense!
7.1 PC₀, The Formal Language and Its Semantics
7.1.0
By PC₀ we shall refer to the portion of first-order logic that involves, beside the connectives, only individual constants and predicates. The language of PC₀ is therefore based on:

(i) A set of individual constants.
(ii) A set of predicates, each having an associated positive integer called its number of places, or arity.
(iii) Sentential connectives, which (in our case) are: ¬, ∧, ∨, → and ↔.
Actually, we are describing here not a single language, but a family of languages. For the languages may differ in their sets of individual constants and predicates.
Given the individual constants and the predicates, the atomic sentences are the constructs obtained by applying the rule:

If P is an n-place predicate and c_1, . . . , c_n are individual constants, then P(c_1, . . . , c_n) is an atomic sentence.
You can think of P(c_1, . . . , c_n) as the result of applying a certain operation to the predicate and the individual constants. (Just as A ∧ B is the result of applying a particular operation to A and B.)
We want the atomic sentences to satisfy unique readability: the predicate and the sequence of individual constants should be readable from the sentence. This amounts to the requirement:

If P(c_1, . . . , c_n) = P′(c′_1, . . . , c′_m) then P = P′, n = m, and c_i = c′_i, for i = 1, . . . , n.
The exact nature of the atomic sentences does not matter, as long as unique readability for atomic sentences and for those constructed from them by applying connectives is satisfied. Except for the difference in the atomic sentences, the set of all sentences is defined exactly as in the sentential calculus. It consists of all constructs that can be obtained from the following rules:

Every atomic sentence is a sentence.
If A and B are sentences, then: ¬A, A ∧ B, A ∨ B, A → B, A ↔ B are sentences.
Note: Sometimes an infix notation is adopted for certain binary predicates: instead of R(a,b) we write a R b. This is purely a notational convention.
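For illustration, here is one way the syntax could be represented in Python, extending the tuple encoding of the earlier sketches (the encoding is my own choice): an atomic sentence carries its predicate and the tuple of individual constants, so unique readability holds trivially.

    def atom(p, *consts):
        # P(c_1, ..., c_n) as the tuple ('atom', P, (c_1, ..., c_n)).
        return ('atom', p, consts)

    # 'Ann likes Bill', and 'if Ann is happy then Ann does not like herself':
    ann_likes_bill = atom('L', 'a', 'b')
    example = ('imp', atom('H', 'a'), ('not', atom('L', 'a', 'a')))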
Example: If H( ) is a unary predicate playing the role of '... is happy', and a, b and c are individual constants playing the roles of 'Ann', 'Bert' and 'Claire', then the sentences

Ann, Bert and Claire are happy      Ann or Bert or Claire is happy

are rendered in PC₀, respectively, by:

H(a) ∧ H(b) ∧ H(c)      H(a) ∨ H(b) ∨ H(c)

(Assuming that the English 'or' is meant inclusively.) The second sentence is also a translation of 'One of Ann, Claire and Bert is happy', when 'one of' means 'at least one of'. But if we want to formalize

Exactly one of Ann, Bert and Claire is happy,

we have to use a sentence that says that at least one of these people is happy, but no two of them are:

[H(a) ∨ H(b) ∨ H(c)] ∧ ¬[H(a)∧H(b) ∨ H(a)∧H(c) ∨ H(b)∧H(c)]
which is equivalent to:

[H(a)∧¬H(b)∧¬H(c)] ∨ [H(b)∧¬H(a)∧¬H(c)] ∨ [H(c)∧¬H(a)∧¬H(b)]
WATCH OUT: You cannot use constructions that parallel English groupings of words, such as:

H(a,b,c),  or  H(a ∧ b ∧ c),  or  H(a ∨ b ∨ c).

Such expressions do not represent anything in the formal language. They are, as far as PC₀ is concerned, gibberish. H is a monadic predicate; it cannot be combined with three individual constants. Conjunction and disjunction combine only sentences, not individual constants. Similarly, if R( ) is a predicate that formalizes '... is relaxed', then you cannot render 'Ann is happy and relaxed' by writing:

(H ∧ R)(a)

This, again, is gibberish. We do not have in PC₀ a conjunction that combines predicates. Conjunction is by definition an operation on sentences. Therefore 'Ann is happy and relaxed' is rendered as:

H(a) ∧ R(a)
7.1.1 The Semantics of PC₀
An interpretation of a PC₀-type language consists of:

(I) An assignment that assigns to each individual constant an object.
(II) An assignment that assigns to each n-ary predicate an n-ary relation, i.e., a set of n-tuples. (If n = 1 the interpretation assigns to the predicate some set.)
If c* is assigned to c and P* is assigned to P, then c* and P* are described as the interpretations of c and P. (Starred letters stand for the objects and relations, plain letters for the symbols.) We also say that c denotes, or refers to, c*, and that c* is the denotation, or the reference, of c. This terminology is also used, though less commonly, with respect to predicates.
An interpretation determines the truth-value of each atomic sentence as follows. If an n-ary predicate P is interpreted as P* and each individual constant c_i is interpreted as c*_i, for i = 1, . . . , n, then:

P(c_1, . . . , c_n) has the value T if (c*_1, . . . , c*_n) ∈ P*
P(c_1, . . . , c_n) has the value F if (c*_1, . . . , c*_n) ∉ P*

(For n = 1, we simply take the member itself: P(c_1) gets T if c*_1 ∈ P*, and it gets F if c*_1 ∉ P*.)
The assignment of truth-values to the atomic sentences determines the truth-values of all
other sentences exactly in the same way as in the sentential calculus.
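A sketch of this semantics in code (names and encoding mine, and only ¬ and → among the connectives, to keep it short): an interpretation is a pair of dictionaries, and truth-values propagate just as in the sentential calculus. For uniformity, unary predicates are represented by sets of 1-tuples rather than by sets of members.

    def holds(s, const_interp, pred_interp):
        if s[0] == 'atom':
            _, p, consts = s
            return tuple(const_interp[c] for c in consts) in pred_interp[p]
        if s[0] == 'not':
            return not holds(s[1], const_interp, pred_interp)
        return ((not holds(s[1], const_interp, pred_interp))
                or holds(s[2], const_interp, pred_interp))   # 'imp'

    consts = {'a': 'Ann', 'b': 'Bill'}
    preds = {'H': {('Ann',)}, 'L': {('Ann', 'Bill'), ('Bill', 'Bill')}}
    print(holds(('imp', atom('L', 'a', 'b'), atom('H', 'a')),
                consts, preds))                              # True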
By defining the possible interpretations of the language, we have also determined the concepts of logical truth and falsity: a sentence of PC₀ is logically true if it gets the value T in all interpretations. It is logically false if it gets the value F under all interpretations.

We have also determined the relation of logical implication between a premise-list and a sentence: Γ logically implies A just when there is no interpretation under which all members of Γ are true and A is false.
Note: The implications we are considering now are no longer schemes of the kind we handled in 4.3. Our sentences have been specified to be particular entities, e.g., P(a,b) or P(a,b) → R(b,b). The implication is logical if, for every interpretation, if all premises are true so is the conclusion. The interpretations vary, but the sentences remain the same. This is true of all systems we shall consider henceforth.
So far, no restriction has been placed on the interpretation of the predicates and the individual constants. Consequently, any assignment of truth-values to the atomic sentences can be realized by interpreting the predicates and the individual constants in a suitable way. Given a truth-value assignment, σ, we can define an interpretation as follows:

Interpret different individual constants as denoting different objects.
Interpret any n-ary predicate P as the n-ary relation P* such that, for all n-tuples:
(c*_1, . . . , c*_n) ∈ P* iff, for some individual constants c_1, . . . , c_n, c_i denotes c*_i, i = 1, . . . , n, and σ assigns to P(c_1, . . . , c_n) the value T.

Under this interpretation, the atoms get the truth-values that are assigned to them by σ.
A tautology of PC₀ is defined as a sentence whose truth-value, obtained via truth-tables, is T for all assignments of truth-values to its atomic components. Since every assignment of truth-values to the atomic sentences is obtainable in some interpretation, the logical truths of PC₀ coincide with its tautologies. Similarly, logical implication and tautological implication are, in the case of PC₀, the same.

The story will change when we add the equality-predicate, ≈, to the language, because the interpretation of this predicate is severely restricted. For example, the atomic sentence a ≈ a always gets the value T. It is a logical truth but not a tautology.
The sentential apparatus that has been developed for sentential logic applies in exactly the same way to PC₀. We can therefore use freely the techniques of distributing, pushing negations, De Morgan's laws, substitutions of equivalents, the top-down proof methods of 4.3.3 and 4.4.1, DNF, CNF, and all the rest.
We can also set up a deductive system, like one of those described in 6.2, except that the atoms are not the A_i's of SC, but the atomic sentences of our PC₀ language.
Homework

7.1 Translate the following sentences, using predicates, individual constants and connectives. The translation is to be based on the assumption that the sentences are interpreted in a universe consisting of three men and three women:

Jack, David, Harry, Claire, Ann, Edith

where gender is according to name, and different names denote different persons.

The language has a name for each person. You can use the English names, or their (lower case) first letters: j, d, h, c, a, e. For '... is happy', you can use H(...).

When you find a sentence ambiguous, give its possible translations and prove their non-equivalence by showing the existence of interpretations under which they get different truth-values.
1. Everyone is happy.
2. Every man is happy and every woman is happy.
3. Every man is happy or every woman is happy.
4. Someone is happy.
5. Some man is happy and some woman is happy.
6. Some man is happy or some woman is happy.
7. Some woman is happy and some is not.
8. Some women are happy, while some men are not.
9. If Jack is not happy, none of the women is.
10. If Harry is happy, then one of the women is and one is not.
11. No man is happy, unless another person is.
12. All women are not happy, but all men are.
13. Women are happy, men are not.
14. Not everyone who is happy is a woman.
15. If men are happy so are women.
7.2 Find all the logical implications that hold between the translations of the first six sentences in 7.1. (Note that there are only four sentences to consider, because of two obvious equivalences.) Prove every equivalence (by any of the methods of chapter 4, or by truth-value considerations). Prove every non-equivalence by showing the existence of an interpretation that makes one sentence true and the other false.
7.3 Translate the following sentences, under the same assumptions and using the same notations as in 7.1. Use L(..., ---) for '... likes ---'. Follow the same instructions in cases that you find ambiguous.
1. Every woman is liked by some man.
2. Some man likes every woman.
3. Some woman likes herself, and some man does not.
4. Nobody is happy who does not like himself.
5. Some women, who do not like themselves, like David.
6. Ann does not like a man, unless he likes her.
7. Claire likes a man who likes Edith.
8. Claire and Edith like some man.
9. Unless liked by a woman, no man is happy.
10. Most men like Edith.
7.4 Which, if any, of the translations of the first two sentences of 7.3 logically implies the other? Prove your implication, as well as non-implication, claims.
Substitutions of Individual Constants

We can substitute in any atomic sentence an individual constant by another constant; this gives another sentence. When the individual constant has more than one occurrence we can substitute any particular occurrence. We can also carry out simultaneous substitutions, i.e., several substitutions at one go. A few examples suffice to make this clear. Let a, b, c be different individual constants, and let
P(a, b, a)
be an atomic sentence.
Substituting the first occurrence of a by b we get: P(b, b, a).
Substituting (all occurrences of) a by c we get: P(c, b, c).
Substituting the first occurrence of a by c, its second occurrence by b, and b by a, we get: P(c, a, b).

Substitution in non-atomic sentences is effected by substitution in the atomic components. For example, if the given sentence is
P(a, c, b) ∧ R(b, c)
then:
Substituting all occurrences of a by c and all occurrences of c by a we get:
P(c, a, b) ∧ R(b, a)
Substituting the first occurrence of b by c and the second by a we get:
P(a, c, c) ∧ R(a, c)
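Simultaneous substitution of all occurrences is a short recursion over the tuple encoding used in the earlier sketches (per-occurrence substitution would just thread an occurrence counter through the same recursion; again, this is an illustrative sketch with my own names):

    def subst(s, mapping):
        if s[0] == 'atom':
            _, p, consts = s
            return ('atom', p, tuple(mapping.get(c, c) for c in consts))
        if s[0] == 'not':
            return ('not', subst(s[1], mapping))
        return (s[0], subst(s[1], mapping), subst(s[2], mapping))    # binary

    # Swapping a and c in P(a,c,b) ∧ R(b,c), as in the example above:
    s = ('and', atom('P', 'a', 'c', 'b'), atom('R', 'b', 'c'))
    print(subst(s, {'a': 'c', 'c': 'a'}))
    # ('and', ('atom', 'P', ('c', 'a', 'b')), ('atom', 'R', ('b', 'a')))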
7.2 PC₀ with Equality

7.2.0
The equality predicate, or equality for short, is a binary predicate used to express statements of equality. In English such statements are expressed by:
'... is equal to ---',  '... is identical to ---',  '... is the same as ---'.
Since we use the symbol '=' in our own discourse, we shall adopt a different symbol as the equality predicate of PC₀: ≈. Thus,
c ≈ c′
is an atomic sentence of PC₀, which says that the denotations of c and c′ are identical. On the other hand, 'c = c′' is a sentence in our discourse which says that c and c′ are the same individual constant.
We refer to atomic sentences of the form a ≈ b as equalities. (The infix notation is a mere convention; we could have used ≈(a, b).)
We use a ≉ b as a shorthand for ¬(a ≈ b). Sentences of this form are referred to as inequalities.

We stipulate as part of the semantics of PC₀ that if the language contains equality, then:

a ≈ b gets T  iff  denotation of a = denotation of b.
As we shall see, an interpretation of a first-order language involves the fixing of a certain set of objects as the universe of discourse. All individual constants denote objects in that universe, and all predicates are interpreted as subsets of the universe, or as relations over it. Once the universe is chosen, the interpretation of ≈ is, by definition, the identity relation over it: the set of all pairs (x, x), where x belongs to the universe.

For this reason ≈ is considered a logical predicate. All other predicates are non-logical.
The following sentences are logical truths:

(1) ⊨ a ≈ a
(2) ⊨ a ≈ b → b ≈ a
(3) ⊨ (a ≈ b ∧ b ≈ c) → a ≈ c

But they are not tautologies. Their truth is not established on the basis of their sentential structure, but because ≈ is interpreted as the identity relation. The following principle obviously holds.
(EP) For all individual constants c, c′: If A′ is obtained from A by simultaneously substituting in some places c′ for c and/or c for c′, then:
c ≈ c′ → (A ↔ A′)
is a logical truth.
Here A can be any sentence; it can itself be an equality. Let
A = a ≈ b
Let A′ be obtained by substituting simultaneously a for b and b for a. Then
A′ = b ≈ a
By (EP), the following is a logical truth:
a ≈ b → [a ≈ b ↔ b ≈ a]
This is easily seen to imply (2). In a similar way, we can derive (3) from (EP). (Can you see how?)
(EP) is implied by the stipulation that ≈ is interpreted as the identity relation and by the following general principle that underlies the semantics of PC₀:

Extensionality Principle:
The truth-value of a sentence, in a given interpretation, does not change if we substitute a term by another term that has the same denotation.

By 'term' we mean here an individual constant; but the principle applies if 'term' covers predicates, whose denotations are, as we said, sets or relations (sets of tuples).
In many contexts of natural language the extensionality principle does not hold. The following standard example is due to Frege. 'Hesperus' and 'Phosphorus' are two names that stand, respectively, for the evening star and the morning star. Both, it turns out, denote the planet Venus. The sentences

(4) Jill believes that Hesperus is identical to Phosphorus.
(5) Jill believes that Phosphorus is identical to Phosphorus.

need not have the same truth-value, although (5) results from (4) by substituting a name by a co-denoting name (i.e., one with the same denotation). Jill does not doubt that (5) is true, but she may be unaware that Hesperus and Phosphorus are the same planet.
(4) and (5) are about Jill's beliefs. We can formalize statements of this type if we introduce something like a monadic connective, say Bel, which operates on sentences. For every sentence A, there is a sentence Bel(A), which says that Jill (or some anonymous agent) believes that A. Syntactically, Bel acts like negation. But it is not truth-functional; hence it is not a sentential connective of classical logic. Individual constants that occur in the scope of Bel cannot, in general, be substituted by co-denoting constants without change of the truth-value.
The same is true of a wide class of sentences involving that-phrases ('thinks that ...', 'knows that ...' and others), as well as expressions of necessity or possibility ('it is necessary that ...', 'it is possible that ...'). In formal languages, such non-classical connectives are known as modal. In this course we shall not encounter them. Contexts, and languages, in which the extensionality principle holds are called extensional. Non-extensional contexts, such as (4), are known as intensional. Classical logic is extensional throughout.
The truth-table method for detecting tautologies does not detect the new kind of logical truth, which derives from the meaning of ≈. It can be modified so as to take care of this special predicate. Instead of considering all possible assignments of truth-values to the atomic sentences, we have to rule out certain assignments as unacceptable. This amounts to striking out certain rows in the truth-table; for example, a row that assigns c ≈ c the value F, or a row that assigns to a ≈ b and to b ≈ a different truth-values, is unacceptable. We should also strike out any row that assigns the value T to a ≈ b and to a ≈ c but assigns different values to P(a, b) and P(c, c). It is possible to state general, necessary and sufficient conditions for the acceptability of a row; but we shall not do it here. Once the acceptable rows have been determined, we can check for logical truth:
A sentence of PC₀ is logically true iff it has the value T in all acceptable rows of its truth-table.

Note the stronger condition for tautologies: a sentence is a tautology iff it gets the value T in all rows, including the non-acceptable ones.
Instead of modifying the truth-table method, we shall adjust the top-down derivation methods of 4.3.3 and 4.4.1, by adding certain laws that take care of ≈. This leads to a much simpler, easier-to-apply prescription for checking logical implications (and, in particular, logical truth) of PC₀ sentences.
We remarked that the top-down derivation method applies to PC₀ without change. The only difference is that instead of the atoms A_i of the sentential calculus (cf. 6.1), or instead of the sentential variables that are used in 4.2.1, we have the atomic sentences of PC₀. (We can also continue to use sentential variables, regarding them as some unspecified sentences.)

If ≈ is not present, we get at the end either a proof of the initial goal, or a truth-value assignment to the atoms that makes the premises true and the conclusion false. In the latter case we can (as shown in 7.1.1) find an interpretation of the individual constants and the predicates that yields this assignment; hence we get our counterexample.

The same procedure is adequate for sentences containing ≈, provided that certain laws are added to our previous lists.
7.2.1 Top-Down Fool-Proof Methods For PC₀ with Equality
We shall concentrate mostly on the proof-by-contradiction variant (cf. 4.4.0, 4.4.1), which is based on fewer laws and which necessitates fewer additions. Only two additional laws are required.

The first law adds another type of self-evident implications. They, as well, can now serve as axioms in a bottom-up proof, or as successful final goals (marked as successful) in a top-down derivation. The second is a reduction law that can be used in reducing goals. In the following, c and c′ are variables ranging over individual constants. 'ES' stands for 'Equality Substitution'.
Equality Laws For Proofs-by-Contradiction

(EQ) Γ, ¬(c ≈ c) ⊨ ⊥

(ES) Γ, c ≈ c′ ⊨ ⊥  ⟺  Γ′, c ≈ c′ ⊨ ⊥
where c and c′ are different constants, c occurs in Γ,
and Γ′ results from Γ by substituting everywhere c′ for c.
(For simplicity, c ≈ c′ has been written as the rightmost premise. This is immaterial, since we can reorder the premises.) Recall that Γ ⊨ ⊥ means that there is no interpretation that makes all sentences of Γ true; a counterexample is an interpretation that does this. The two sides of (ES) are counterexample-equivalent: an interpretation is a counterexample to the implication of one of the sides iff it is a counterexample to the other. This is implied by (EP): if c ≈ c′ gets T, then all the sentences of Γ get the same truth-values as their counterparts in Γ′.
To apply (ES) in a top-down derivation, choose some equality c ≈ c′ among the premises and replace every occurrence of c by c′ in every other sentence. After the application, c does not occur in any of the sentences except in c ≈ c′. We shall refer to such an application as a c-reduction.

Note: The ⟸-direction of (ES) is the one applied in bottom-up proofs. This direction allows us (when c ≈ c′ is a premise and c does not occur in any other sentence) to replace in the other sentences some (one or more) occurrences of c′ by c.
We could have formulated (ES) in a more general form, which allows substituting some, but not necessarily all, of the occurrences of c. But the top-down process is simpler and more efficacious when the law is applied in its present form.

Note: The restriction that c and c′ be different and that c occur in Γ rules out cases where the substitution would leave Γ unchanged.
Consider a c-reduction, where the equality is c ≈ c′. After the reduction, c appears only in the equality c ≈ c′. Call an individual constant that appears in the premises dangling if it has one occurrence only, and it is the left-hand side of an equality. Then every c-reduction reduces the number of non-dangling constants by one.
Here are three very simple examples of top-down derivations:

1. a ≈ b ⊨ b ≈ a
2. a ≈ b, ¬(b ≈ a) ⊨ ⊥
3. a ≈ b, ¬(b ≈ b) ⊨ ⊥

The first step is the usual move (via the Contradictory-Conclusion Law) in a proof by contradiction. The passage from 2. to 3. is via (ES), with a ≈ b in the role of c ≈ c′. 3. is a self-evident implication that falls under (EQ). The bottom-up proof is obtained by reversing the sequence. The passage from 3. to 2. is via the ⟸-direction of (ES).
1. a ≈ b, b ≈ c ⊨ a ≈ c
2. a ≈ b, b ≈ c, ¬(a ≈ c) ⊨ ⊥
3. a ≈ c, b ≈ c, ¬(a ≈ c) ⊨ ⊥

Here the step from 2. to 3. is via (ES), with the second equality (i.e., b ≈ c) in the role of c ≈ c′; the application results in substituting b in the first equality by c.
1. a ≈ b, c ≈ b, P(a, c, a) ⊨ P(c, b, a)
2. a ≈ b, c ≈ b, P(a, c, a), ¬P(c, b, a) ⊨ ⊥
3. a ≈ b, c ≈ b, P(b, c, b), ¬P(c, b, b) ⊨ ⊥
4. a ≈ b, c ≈ b, P(b, b, b), ¬P(b, b, b) ⊨ ⊥

Here 3. is obtained from 2. by using (ES), the equality in question being a ≈ b. Then 4. is obtained from 3. by another application of (ES); this time the equality is c ≈ b.
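A c-reduction is mechanical: pick the governing equality and apply the substitution function of the previous section to every other premise. A sketch, with equalities encoded as ordinary atoms whose predicate is 'eq' (standing in for ≈; the encoding and names are mine):

    def c_reduce(premises, i):
        # Apply (ES) using the equality premises[i]; substitute c' for c
        # in every other premise.
        _, _, (c, c2) = premises[i]
        return [s if j == i else subst(s, {c: c2})
                for j, s in enumerate(premises)]

    # The transitivity example above: from a~b, b~c, not(a~c)
    # to a~c, b~c, not(a~c):
    goal = [atom('eq', 'a', 'b'), atom('eq', 'b', 'c'),
            ('not', atom('eq', 'a', 'c'))]
    print(c_reduce(goal, 1))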
The Adequacy of the Method
The top-down method for PC₀ with equality is based on our previous reduction steps and on reductions via (ES). Given an initial goal:
Γ ⊨ A   or   Γ ⊨ ⊥
we apply repeated reductions. This must end with a bunch of goals that cannot be further reduced. The argument follows the same lines as in the case of sentential logic. Applications of the connective laws yield simpler goals. (In 6.2.4 and 6.2.5 we have seen how to associate weights with goals, so that the applications are weight-reducing.) Applications of (ES) consist in substitutions that preserve all the sentential structure. The goal is simplified in that the number of non-dangling constants is reduced. Following these considerations, it is not difficult to see that the process must terminate. (We can define a new weight by adding, to the weight defined in 6.2.5, the number of all non-dangling constants. The new weight goes down with each reduction step. We therefore have an inductive argument exactly like those of 6.2.4 and 6.2.5.)
The end goals of the process (those in the leaves) are elementary implications, i.e., they contain only literals: atoms or negated atoms. And they cannot be further reduced via (ES). Consider an implication of this type, where all equalities c ≈ c′ in which the two sides are different are written first:

c_1 ≈ c′_1, c_2 ≈ c′_2, . . . , c_n ≈ c′_n, Δ ⊨ ⊥

Δ consists of all premises that are not equalities, or that are trivial equalities: a ≈ a. None of the c_i's occurs in any other place; otherwise we could have applied a c_i-reduction. They are exactly all the dangling constants.
Assume that the implication is not self-evident, i.e., that the premises contain neither a sentence and its negation, nor an inequality of the form ¬(c ≈ c). Then the following interpretation makes all premises true. Hence it is a counterexample.

(I) Let a_1, . . . , a_m be all the different non-dangling individual constants. Interpret them as names of different objects, a*_1, . . . , a*_m.
(II) Interpret the predicates in a way that makes every atom occurring positively (i.e., unnegated) in Δ true and every atom occurring negatively false. (This can be done because different constants occurring in Δ have been assigned different denotations.)
(III) Interpret each c_i as denoting the same object (among the a*_j's) that is denoted by c′_i, i = 1, . . . , n. (This can be done because each c_i occurs only in the equality c_i ≈ c′_i.)
If all the elementary implications are self-evident, the top-down derivation tree shows that the initial goal is a logical implication. By inverting the tree we get a bottom-up proof. If, on the other hand, one of the elementary implications is not self-evident, then, as just shown, we can construct a counterexample. This is also a counterexample to the initial goal.

QED
Here is a simple example. The initial goal is:

L(a, b), [L(a, c) ∧ L(b, a)] → H(a), b ≈ c ⊨ H(a)

An attempt to construct a top-down derivation results in:

1. L(a, b), [L(a, c) ∧ L(b, a)] → H(a), b ≈ c ⊨ H(a)
2. b ≈ c, L(a, b), [L(a, c) ∧ L(b, a)] → H(a), ¬H(a) ⊨ ⊥
3. b ≈ c, L(a, c), [L(a, c) ∧ L(c, a)] → H(a), ¬H(a) ⊨ ⊥
4.1 b ≈ c, L(a, c), ¬[L(a, c) ∧ L(c, a)], ¬H(a) ⊨ ⊥
4.2 b ≈ c, L(a, c), H(a), ¬H(a) ⊨ ⊥
5.11 b ≈ c, L(a, c), ¬L(a, c), ¬H(a) ⊨ ⊥
5.12 b ≈ c, L(a, c), ¬L(c, a), ¬H(a) ⊨ ⊥
In the first step we have also rearranged the premises by moving b ≈ c to the beginning. The step from 2. to 3. is a b-reduction. The other steps are of the old kind. 5.12 is elementary but not self-evident. It has a counterexample:
Let a denote a*, and let c denote c*, where a* ≠ c*.
Interpret L as any relation L* such that (a*, c*) ∈ L* and (c*, a*) ∉ L*.
Interpret H as any set, H*, such that a* ∉ H*.
Interpret b as denoting c*.
In general, in the presence of ≈, it is advisable to apply c-reductions as soon as possible. This will reduce the number of constants in the other premises.

Note: Obviously, Γ, c ≈ c ⊨ C ⟺ Γ ⊨ C. Hence trivial equalities can be dropped. Yet we do not have to include this among our laws. The proof above shows that the method is adequate without the law for dropping trivial equalities. Sometimes dropping c ≈ c results in the disappearance of c from the premises. In this case any counterexample to the reduced goal becomes a counterexample to the original goal if we assign to c an arbitrary denotation.
The Top-Down Method of 4.3.3 for PC₀ with Equality

The adjustment of the method of 4.3.3 is obtained along similar lines. Here is a brief sketch. Recall that the method does not employ ⊥. It treats implications of the form
Γ ⊨ C
To handle ≈, we split each of our old equality laws into two laws, one for the premises and one for the conclusion. Altogether we have four laws that treat equalities: the self-evident implications (EQ1) and (EQ2) and the reduction laws (ES1) and (ES2).
(EQ1) Γ, ¬(c ≈ c) |= C

(EQ2) Γ |= c ≈ c

In the following laws c and c′ are different constants, c has at least one other occurrence, and Γ′, C′ result from Γ, C by substituting c′ everywhere for c.

(ES1) Γ, c ≈ c′ |= C  ⇐⇒  Γ′, c ≈ c′ |= C′

(ES2) Γ |= ¬(c ≈ c′)  ⇐⇒  Γ′ |= ¬(c ≈ c′)

To see why (ES2) holds, consider a counterexample to one of the sides. Since ¬(c ≈ c′) gets F, c ≈ c′ gets T. Hence c and c′ have the same denotation. Therefore the sentences in Γ and in Γ′ get the same truth-values.
The top-down reductions, for a given initial goal, proceed much as before. Again, it is advisable to carry out c-reductions, via (ES1) and (ES2), as early as possible. At the end the goal is reduced to a bunch of elementary implications that cannot be further reduced. Corresponding to the four self-evident implication laws, there are four types of self-evident implications:

   Γ, A, ¬A |= C,   Γ, A |= A,   Γ, ¬(c ≈ c) |= C,   Γ |= c ≈ c

The method's adequacy is proven by showing that if an elementary implication is not of any of these forms and if, moreover, it cannot be further reduced via (ES1) or (ES2), then there is an interpretation that makes all premises true and the conclusion false.
Example:

1. ¬P(a), P(b) → P(c) |= a ≈ b → ¬(b ≈ c)
2. ¬P(a), P(b) → P(c), a ≈ b |= ¬(b ≈ c)   (|=, →)
3. ¬P(b), P(b) → P(c), a ≈ b |= ¬(b ≈ c)   a-reduction, (ES1)
4. ¬P(c), P(c) → P(c), a ≈ c |= ¬(b ≈ c)   b-reduction, (ES2)
5.11 ¬P(c), ¬P(c), a ≈ c |= ¬(b ≈ c)   (→, |=)
5.12 ¬P(c), P(c), a ≈ c |= ¬(b ≈ c)   (→, |=)
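A c-reduction of the kind used in steps 3 and 4 is a purely mechanical substitution. The following sketch performs one such step; the tuple encodings and names are assumptions of the sketch, not the text's notation.

    # ("P", "a") is the atom P(a); ("eq", "a", "b") is a ≈ b; ("not", A),
    # ("and", A, B), ("imp", A, B) are the sentential compounds.
    CONNECTIVES = ("not", "and", "or", "imp", "iff")

    def subst_const(s, c, d):
        """Replace the constant c by d everywhere in the sentence s."""
        if s[0] in CONNECTIVES:
            return (s[0],) + tuple(subst_const(p, c, d) for p in s[1:])
        return (s[0],) + tuple(d if t == c else t for t in s[1:])

    def c_reduce(premises, c, d):
        """Use the premise c ≈ d to substitute d for c in all other premises."""
        assert ("eq", c, d) in premises
        return [p if p == ("eq", c, d) else subst_const(p, c, d)
                for p in premises]

    # Step 2 -> 3 of the first example above: use b ≈ c to replace b by c.
    line2 = [("eq", "b", "c"), ("L", "a", "b"),
             ("imp", ("and", ("L", "a", "c"), ("L", "b", "a")), ("H", "a")),
             ("not", ("H", "a"))]
    print(c_reduce(line2, "b", "c"))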
Gentzen-Type Systems for PC₀ with Equality

The Gentzen-type systems GS1 and GS2, considered in 6.2.5, can be extended to PC₀ with equality. By now the extension should be obvious: just as GS1 and GS2 are obtained by formalizing the laws of 4.3.4 and 4.4.1 into axioms and inference rules, the required extensions are obtained by formalizing the additional laws. The required extension of GS2 is obtained by adding the following axiom and inference rule:

   Axiom:   Γ, ¬(c ≈ c)

   Rule:    Γ′, c ≈ c′
            ----------
            Γ, c ≈ c′

   where c and c′ are different constants, c occurs in Γ, and Γ′ results from Γ by substituting c′ everywhere for c.

Note that the inference rule corresponds to the ⇐-direction of (ES).

The completeness of the extended system follows now from the adequacy of the top-down proof-by-contradiction method for PC₀. The extension of GS1 is obtained in a similar way and is left to the reader.
A Hilbert-Type System for PC₀ with Equality

The Hilbert-type system HS1, given in 6.2.3, can be extended to a system that is sound and complete for PC₀ with equality. As in HS1, we assume that the language of PC₀ has only ¬ and → as primitive sentential connectives. The same kind of extension applies to every sound and complete system that has modus ponens as an inference rule (either primitive or derived). In particular it applies to the systems obtained from HS1 by adding connectives with the associated axioms, as described in 6.2.4 (cf. Homework 6.12, 6.13 and 6.14). It turns out that the addition of the following two equality axiom-schemes is all that is needed:

   EA1   c ≈ c, where c is any individual constant.

   EA2   c ≈ c′ → (A → A′), where c and c′ are any individual constants, A is any sentence of PC₀ and A′ is obtained from A by substituting one occurrence of c by c′.

Actually, we can restrict EA2 to the cases where A is an atom; this, together with EA1, is already sufficient, but we shall not go into this here.
If our original system is complete (i.e., is sufficient for proving all tautological implications), then the addition of the two axiom schemes takes care of all implications that are due to the connectives and to equality. For example, the logical truth

   a ≈ b → b ≈ a

is derivable from EA1 and EA2, because

   a ≈ b → (a ≈ a → b ≈ a)

is an instance of EA2 (where A is a ≈ a, c and c′ are, respectively, a and b, and A′ is obtained by replacing the first occurrence of a by b). This and a ≈ a tautologically imply a ≈ b → b ≈ a.

Note that, while EA2 allows us to replace one occurrence of c by an occurrence of c′, repeated applications enable us to replace any number of c's by c′'s.
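Since the final step in the derivation above is purely tautological, it can be checked by brute force, treating the three equalities as sentential atoms. A small sketch; the encoding and names are mine, not the text's:

    # Verify: the EA2 instance a≈b → (a≈a → b≈a), together with the EA1
    # instance a≈a, tautologically implies a≈b → b≈a.
    from itertools import product

    def implies(p, q):
        return (not p) or q

    for ab, aa, ba in product([True, False], repeat=3):
        ea2 = implies(ab, implies(aa, ba))
        ea1 = aa
        goal = implies(ab, ba)
        if ea2 and ea1:
            assert goal   # no valuation makes the premises true, the goal false
    print("tautological implication confirmed")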
The completeness of the resulting system is proved in the same way used to prove the completeness of HS1. We extend the previous proof by showing that the provability relation, ⊢, of that system has the properties required to insure the adequacy of the top-down derivation method. We have to establish the following properties, which correspond to (EQ1), (EQ2) and the ⇐-directions of (ES1) and (ES2).

(i) Γ, ¬(c ≈ c) ⊢ C

(ii) Γ ⊢ c ≈ c

In the following, Γ′, C′ result from Γ, C by substituting c′ everywhere for c.

(iii) Γ′, c ≈ c′ ⊢ C′  ⟹  Γ, c ≈ c′ ⊢ C

(iv) Γ′ ⊢ ¬(c ≈ c′)  ⟹  Γ ⊢ ¬(c ≈ c′)
Homework

7.5 Find which of the following is a logical implication. Justify your answers (positive: by derivations or truth-value considerations, or both; negative: by counterexamples).

1. L(a, b) → L(b, a), L(a, b) ∧ L(b, c) → L(a, c) |= c ≈ a → (L(a, b) → L(a, a)) ?

2. (L(a, b) ∧ L(b, a) ∧ (a ≉ b)) → H(a) |= L(a, a) ∧ L(a, b) → H(a) ?

3. L(a, b) → L(b, c), L(b, c) → L(c, a), L(c, a) → L(a, b) |= a ≈ b → [L(a, a) → L(c, a)] ?
7.6 Consider interpretations of a language based on a two-place predicate L( , ), and individual constants a, b, c, such that:

(i) Each individual constant denotes somebody among Nancy, Edith, Jeff, and Bert, where these are four distinct people.

(ii) L(… , ––) reads as: … likes ––, where all the pairs (x, y) such that x likes y are:

   (Nancy, Edith), (Nancy, Jeff), (Edith, Edith), (Edith, Bert), (Jeff, Jeff), (Jeff, Bert), (Bert, Edith).

Consider the sentences

(1) L(a, b) → L(b, a) ∧ L(b, b)

(2) L(a, a) → L(b, a) ∨ L(c, a)

Find all the ways of interpreting a and b so that (1) comes out true; and all the ways of interpreting a, b, c so that (2) comes out true. Indicate in each case your reasons.
7.7 Show that the following sentences are implied tautologically by instances of EA1 and EA2, where EA2 is used for atomic sentences only. (Outline the argument, giving the instances of the required equality axioms.)

1. a ≈ b → [b ≈ c → a ≈ c]

2. a ≈ b → [L(a, a) → L(b, b)]

3. a ≈ b → [(c ≉ d) ∨ (S(a, c) → S(b, d))]
7.8 Prove (iii) in the above-mentioned list of properties required of ⊢. (Hint: show that every sentence of Γ′ is provable from c ≈ c′ and the corresponding sentence of Γ, by repeated uses of (EA2). Using the fact that c ≈ c′ ⊢ c′ ≈ c, show that C is provable from c ≈ c′ and C′.)
7.3 Structures of Predicate Logic in Natural Language

7.3.1 Variables and Predicates

Variables as Place Markers

When predicates of PC₀ of arity > 1 are meant to formalize English predicate-expressions, the correspondence between the argument places should be clear and unambiguous. For example, if we introduce the two-place L( , ) as the formal counterpart of "likes" we can say:
   L(… , ––) is to be read as: … likes ––.

It is obvious here that the first and second places in L( , ) match, respectively, the left and right sides of "likes". But this cumbersome notation is impractical when the arities are larger, or when the English expressions are longer. At this point variables come in handy. We can say that

   L(x, y) is to be read as: x likes y.

You are probably acquainted with the use of variables that range over numbers from high school algebra. Variables that range over sentences have been used in previous chapters, as well as variables ranging over strings. In chapter 5 we used variables ranging over arbitrary objects. Later we shall extend PC₀ by incorporating in it variables as part of the language.

At present we are going to use x, y, z, u, v, etc. merely as place markers: to mark certain places within syntactic constructs.

The identity of the variables is not important. For example, the following three stipulations come to the same:

   L(x, y) is to be read as: x likes y.
   L(u, v) is to be read as: u likes v.
   L(y, x) is to be read as: y likes x.

Each means that L is a binary predicate to be interpreted as:

   {(p, q) : p likes q}

But if we were to say that

   L(x, y) is to be read as: y likes x,

then we would be assigning to L a different interpretation, namely:

   {(p, q) : q likes p}

If a and b denote, respectively, the objects a and b, then under the first stipulation L(a, b) is true iff a likes b, but under the second it is true iff b likes a.
By substituting, in English sentences, variables for noun-phrases we can indicate predicate-expressions. We can then say, for example:

   Let H be the predicate: x is happy,

which says that the monadic predicate H( ) is to be interpreted as the set of all happy beings. When the arity is > 1 it should be clear which coordinates are represented by which variables. If we say

   Let L be the predicate: x likes y,

then we should indicate the correspondence between variables and places of L. In the absence of other indications, we shall take the alphabetic order of the variables as our guide: x before y, y before z.

Deriving Predicates from English Sentences

Generally, an English sentence can give rise to more than one predicate, because we can mark (using variables) different places as empty. To take an example from Frege,

(1) Brutus killed Caesar

gives rise to the expression: x killed y, as well as to: x killed Caesar, and: Brutus killed y.

Let K, K₁, and K₂ be, respectively, predicates corresponding to these expressions. Then each of the following three is a formalization of (1):

(1′) K(Brutus, Caesar)

(1″) K₁(Brutus)

(1‴) K₂(Caesar)

K denotes the binary relation of killing: the set of all pairs (p, q) in which p killed q. K₁ corresponds to the property of being a killer of Caesar; it denotes the set of all beings that killed Caesar. K₂ corresponds to the property of being killed by Brutus; it denotes the set of all beings that were killed by Brutus.

We can also derive from the binary predicate x killed y the monadic predicate x killed x, which denotes the set of all beings that killed themselves.

The derived predicates can be quite arbitrary. For example, we can formalize

(2) Jack frequents the movies and Jill prefers to stay home,
as:

(2′) B(Jack, Jill)

where B(x, y) is to be read as: x frequents the movies and y prefers to stay home. B is therefore interpreted as

   {(p, q) : p frequents the movies and q prefers to stay home}

(2′) is not a good formalization since it hides the structure of (2). A better one, which shows (2) as a conjunction, is:

(2″) FrMv(Jack) ∧ PrHm(Jill)

where FrMv and PrHm are, respectively, the monadic predicates:

   x frequents the movies   and   y prefers to stay home

In the same vein, we can say that (1′) is a better recasting of (1) than either (1″) or (1‴). While grammar can be deceptive when it comes to logical analysis, grammatical aspects can guide us to more natural predicates that reveal more of the sentence's logical structure.
7.3.2 Predicates and Grammatical Categories of Natural Language

The following are the basic grammatical categories that give rise to predicates. This is true of English, as well as of other languages.

   Adjectives, as in: x is triangular.
   Common names, as in: x is a woman.
   Verbs, as in: x enjoys life.

Common names are also known as general names, or as common nouns. Adjectives and verbs give rise to predicates of arity greater than one. For example, from adjectives we get:

   x is taller than y,
   x is between y and z.

And from verbs:
   x introduced y to z.

In English, adjectives and common names require the word "is", known in this context as the copula. It connects the adjective, or the common name, with the noun phrase (or phrases). In the predicate-expression the noun phrase is replaced by a variable.

Common names are characterized by the presence of the indefinite article: a woman, an animal, a city, etc.

As you can see, a variety of English constructs are put in the same bag: all become predicates upon formalization in first-order logic. Differences of English syntax and certain differences in meaning are ignored. A finer-grained picture requires additional structural elements and may involve considerable increase in the formalism's complexity.

Two Usages of "is": The role of "is" as a copula in predicate expressions is to be clearly distinguished from its role as a two-place predicate denoting identity. Compare, for example,

(3) Ann is beautiful

with

(4) Ann's father is Bert.

In (3) "is" functions as a copula. In (4) it functions as the equality predicate. (4) can be written as

   Ann's father = Bert

"Is" must function as the equality predicate when it is flanked by singular noun-phrases (i.e., noun-phrases denoting particular objects).

Singular Terms

Singular terms are constructs that function as names of particular objects, e.g.,

   Bill Clinton, New York City, 132, The smallest prime number, The capital of the USA, etc.

There is a difference between the first two, which are, so to speak, atomic, and the other two, which pick their objects by means of a description. The first are called proper names, the last, definite descriptions. Usually, a definite description is marked by the definite article:
   the capital of the USA, the satellite of the earth, the second world war, the man who killed Liberty Valence, etc.

But this rule has exceptions. "The USA" should be construed as a proper name, while "132" is really a disguised description (spelled out, it becomes 1·10² + 3·10¹ + 2·10⁰).

A definite description denotes the unique object satisfying the stated condition, e.g., "the earth's satellite" denotes that unique object of which "x is a satellite of the earth" is true. The definite description fails to denote if either no object, or more than one object, satisfies the condition. There are various strategies for dealing with non-denoting descriptions. In Russell's theory of descriptions, sentences containing definite descriptions are recast into sentences that have truth-values even when the description of the original sentence fails to denote. On other theories, a failure of denotation can cause a truth-value gap, that is: the sentence has no truth-value.
These and other questions that relate to differences between proper names and definite descriptions have been the focus of considerable attention in the philosophy of language. Some have been the subject of a still ongoing debate.

Note: Sometimes the definite article is used merely for emphasis, or focusing:

(5) Jill is the daughter of Eileen

need not imply that Jill is the only daughter of Eileen. It can be read as

(5′) Jill is a daughter of Eileen,

which can be formalized as:

(5″) Daughter(Jill, Eileen)

Here "is" functions as a copula. Contrast this with

(6) The daughter of Eileen is Jill.

Here "is" cannot be read as a copula, because "Jill" cannot be a general name. (6) must be read as an identity statement.

The predicates Fem and Male, which formalize "female" and "male", exclude each other; with first-order quantifiers (to be introduced in the next chapter) this is expressed by:

(7∀) ∀v (Fem(v) → ¬Male(v))

For the moment let us state axioms within PC₀: schematically, with c standing for any individual constant,

(7) Fem(c) → ¬Male(c)
Following Carnap, we have introduced in 4.5.1 the term meaning postulate to characterize non-logical axioms that reflect the meaning of linguistic terms; these terms, it turns out, are mostly predicates. We can thus say that (7), or the generalization (7∀), derives from the meaning of "female" and "male". As noted in 4.5.1, the absolute distinction that Carnap advocated between meaning postulates and empirical truths is now rejected by many. But it still makes good sense to distinguish postulates of this kind from ordinary empirical assumptions. Consider, for example, "The earth is bigger than the moon" and "The moon is smaller than the earth", formalized, with B and S as the formal counterparts of "bigger than" and "smaller than", as:

(8′) B(earth, moon)

(9′) S(moon, earth)

The implication between the sentences rests in this case on a meaning postulate that can be stated schematically, with a and b standing for any individual constants:

(10) B(a, b) → S(b, a)

But (10) is not a logical truth; neither is the implication from (8′) to (9′) a logical implication.

Another way of construing the situation is to regard "x is bigger than y" simply as another way of writing "y is smaller than x"; just as in mathematics "x > y" is another way of writing "y < x". And in this case (8) implies (9) tautologically, because they are construed as the same sentence. The question what the right translations of (8) and (9) are may not have an answer.

We might try a third alternative: The predicates representing "smaller" and "bigger" are different, but certain sentences, such as (10), count as logical axioms. This only transforms our original question into the question: Which meaning postulates count as logical axioms? Suppose you regard (10) as a logical truth; would you adopt the same policy with respect to other pairs:

   "hotter" and "colder",  "prettier" and "uglier",  "to the left of" and "to the right of"?

And what about "hot" and "cold", "beautiful" and "ugly"? Or sentences such as:

(11) ¬(Red(a) ∧ Blue(a)) ?

All this should not undermine the concept of logical implication. It only indicates a looseness of fit between the formal structure and our actual language, a looseness that is inevitable whenever a theoretical scheme is matched against concrete phenomena.
7.4 PC₀*, Predicate Logic with Individual Variables

7.4.0

We now take the crucial step of incorporating individual variables into the formal language. Let

   v₁, v₂, . . . , vₙ, . . .

be a fixed infinite list of distinct objects called individual variables, or variables for short, which are different from all previous syntactic items of PC₀. The vᵢ's are different, but they play the same role in the formal language. It is convenient to use

   u, v, w, x, y, z, u′, v′, etc.

as standing for unspecified vᵢ's, i.e., as variables ranging over v₁, . . . , vₙ, . . . . (We may say, "For every individual variable v ...", or "For some individual variable w ....")

We shall also use "x", "y" and "z" in another role: to range over various domains that come up in the discussion. For example in {(x, y) : x < y}, x and y range over numbers. Whether "x", "y" and "z" stand for variables of the formal language, or are used in a different role, will be clear from the context.
Henceforth PC₀* denotes the system obtained from PC₀ by incorporating the individual variables.

Terms: An individual term, or, for short, a term, is either an individual constant or a variable.

Well Formed Formulas, or Wffs: The basic construct of PC₀* is that of a well formed formula. The name is abbreviated as "wff".

Wffs are constructed like the sentences of PC₀, except that variables, besides individual constants, can fill the predicates' empty places. We shall use lower case Greek letters:

   φ₁, φ₂, . . . , ψ₁, ψ₂, . . . , φ′, . . . etc.,

to range over well formed formulas. Since sentences turn out to be special cases of well formed formulas, this involves also a notational change with regard to sentences: from upper case Latin to lower case Greek. Spelt out in detail, the definition is:

Atomic Wffs: If P is an n-place predicate and t₁, . . . , tₙ are terms then P(t₁, . . . , tₙ) is an atomic wff. It goes without saying that unique readability is assumed with respect to atomic wffs: the predicate and the sequence of terms are uniquely determined by the formula. This is also assumed with respect to all compounds.
The sentential connectives are now construed as operations defined for wffs. The set of all wffs is defined inductively by:

(I) Every atomic wff is a wff.

(II) If φ and ψ are wffs then:

   ¬φ,  φ ∧ ψ,  φ ∨ ψ,  φ → ψ,  φ ↔ ψ

are wffs.

All the syntactic concepts, such as main connective, immediate components, and components, are defined in the same way as before.

The occurrences of a term in a wff are determined in the obvious way: (i) A term, t, occurs, in the i-th predicate-place, in the atomic wff P(t₁, . . . , tₙ), iff t = tᵢ (note that it can have several occurrences), and (ii) the occurrences of t in a wff φ are its occurrences in the atomic components of φ.

The Sentences of PC₀*: The sentences of PC₀* are, by definition, the sentences of PC₀. It is easy to see that this means the following:

   A wff of PC₀* is a sentence iff no variables occur in it.
(For atomic wffs this is obvious. For the others, it follows from the fact that the wffs of PC₀* and the sentences of PC₀ are generated from atoms by the same sentential connectives.)
Examples: The following are wffs, where u, v, w, v′ are any variables.

   P(b, a)   P(u, b)   ¬P(a, a)   R(c) ∧ P(v, u)   ¬R(c) → P(v, v) ∧ (R(w) ∨ ¬R(v′))

The first is an atomic sentence, the second is an atomic wff that is not a sentence, the third is a non-atomic sentence, the fourth and the fifth are non-atomic wffs that are not sentences.
So far the variables play the same syntactic role as individual constants. There is however a semantic distinction: The interpretation of the language assigns denotations to individual constants, but not to the variables. Consequently, an interpretation determines the truth-values of sentences, but not of wffs. To determine the truth-value of a wff φ we need, besides the interpretation of the language, an assignment of objects to the variables of φ.

For example, the truth-value of P(a, b) is determined by the denotations of a and b and the interpretation of P; but, in order to get the truth-value of P(v, b), we need, in addition, to assign some object as the value of v. This will be elaborated and clarified within the general setting of first-order logic.

Note: It may happen that the truth-value of a wff, which is not a sentence, is the same for all assignments of objects to its variables. For example, P(u, v) ∨ ¬P(u, v) gets, for every assignment, the value T. And (P(u, v) ↔ H(a)) ↔ P(u, v) gets the same value as H(a). A wff may therefore be logically equivalent to a sentence. This does not make it a sentence. The distinction between sentences and wffs that are not sentences is syntactic, not semantic.
7.4.1 Substitutions

In 7.1.1 we discussed substitutions, in sentences of PC₀, of individual constants by individual constants. We can now extend this to substitution of terms by terms in wffs. We denote by:

   S^{t}_{t′} φ

the wff resulting by substituting t′ for t in φ. By this we mean that t′ is substituted for every occurrence of t.

Examples:

   S^{u}_{c} [(R(a, u) → R(x, b)) ∧ P(u)] = (R(a, c) → R(x, b)) ∧ P(c),
   S^{u}_{x} [(R(a, u) → R(x, b)) ∧ P(u)] = (R(a, x) → R(x, b)) ∧ P(x),
   S^{a}_{c} [(R(a, u) → R(x, b)) ∧ P(u)] = (R(c, u) → R(x, b)) ∧ P(u),
   S^{b}_{x} [(R(a, u) → R(x, b)) ∧ P(u)] = (R(a, u) → R(x, x)) ∧ P(u).
We can also substitute several terms at one go: s₁ by t₁, s₂ by t₂, ..., sₙ by tₙ. These, as we saw in 7.1.1 page 249, are called simultaneous substitutions. The result of such a simultaneous substitution in φ is denoted:

   S^{s₁, s₂, ..., sₙ}_{t₁, t₂, ..., tₙ} φ.

Examples:

   S^{u,x}_{a,c} [(R(a, u) → R(x, b)) ∧ P(u)] = (R(a, a) → R(c, b)) ∧ P(a),
   S^{u,x}_{x,u} [(R(a, u) → R(x, b)) ∧ P(u)] = (R(a, x) → R(u, b)) ∧ P(x),
   S^{a,b}_{b,x} [(R(a, u) → R(x, b)) ∧ P(u)] = (R(b, u) → R(x, x)) ∧ P(u).
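Since wffs are generated inductively, substitution is naturally programmed as a recursion over that generation. Here is a minimal sketch; the tuple encoding and all names are assumptions of the sketch, not the text's notation.

    # Wffs as nested tuples: ("R", "a", "u") is the atom R(a, u);
    # ("not", w), ("and", w1, w2), ("imp", w1, w2), ... are compounds.
    CONNECTIVES = {"not", "and", "or", "imp", "iff"}

    def subst(wff, pairs):
        """Simultaneous substitution: replace each term s by pairs[s]."""
        if wff[0] in CONNECTIVES:
            return (wff[0],) + tuple(subst(w, pairs) for w in wff[1:])
        # an atomic wff: substitute in the argument places
        return (wff[0],) + tuple(pairs.get(t, t) for t in wff[1:])

    # S^{u,x}_{x,u} applied to (R(a, u) -> R(x, b)) and P(u):
    w = ("and", ("imp", ("R", "a", "u"), ("R", "x", "b")), ("P", "u"))
    print(subst(w, {"u": "x", "x": "u"}))
    # ('and', ('imp', ('R', 'a', 'x'), ('R', 'u', 'b')), ('P', 'x'))

Because each term of the original wff is looked up only once, the swap {u: x, x: u} comes out right; performing the two substitutions one after the other would not.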
Variable Displaying Notation

Notations such as:

   φ(v)   φ(x, y)   φ(u, v, w)

are used for wffs in order to call attention to the displayed variables. The point of the notation is that if we use φ(x), then we understand by φ(a) the wff obtained from φ(x) by substituting a for x. Hence we have:

   S^{x}_{a} φ(x) = φ(a),   S^{x,y}_{a,b} φ(x, y) = φ(a, b),   S^{x,y}_{b,b} φ(x, y) = φ(b, b),   S^{u,x,y}_{a,b,c} φ(u, x, y) = φ(a, b, c)

We extend this to cover substitutions of variables by variables:

   S^{x}_{y} φ(x) = φ(y),   S^{x,y}_{y,x} φ(x, y) = φ(y, x),   S^{x,y}_{b,x} φ(x, y) = φ(b, x),   etc.

If we think of φ(x) as a predicate expressing a certain property, with x marking the empty place of the predicate, then we can think of φ(a) as saying that the property is true of the object denoted by a.

Incautious use of this convention may lead to notational inconsistency. A wff containing both x and y as free variables should not be written both as φ(x) and as φ(y). For then φ(a) can be read either as the wff obtained by substituting a for x, or as the wff obtained by substituting a for y; and these are different. Also, in that case we might read φ(y) as the result of substituting y for x in φ(x). Such inconsistencies are avoided if we display all the free variables that we consider subject to substitutions. In the case just mentioned we write the wff as φ(x, y). Then,

   S^{x}_{a} φ(x, y) = φ(a, y),   S^{y}_{a} φ(x, y) = φ(x, a),   S^{x}_{y} φ(x, y) = φ(y, y).
Note that it would do no harm to display variables that do not occur in the wff; if x does not occur in φ(x), then substituting any term for it does not have any effect: φ(x) = φ(c). But since this may confuse, it is best to avoid it. Usually the use of φ(x) is taken to indicate that x occurs in φ.

When we want to focus on a particular variable, while indicating that there are possibly others, we can use notations such as: φ(. . . x . . .). Or we can state explicitly that φ(x) may have other variables.
7.4.2 Variables and Structural Representation

A wff φ(x), having no variables besides x, can serve as a scheme for getting sentences of the form φ(c), where c is any individual constant. Similarly, a wff φ(u, v) can serve as a scheme for sentences of the form φ(a, b). Such schemes can give us a handle on long sentences. Suppose we want to formalize:

(1) Everyone, among Jack, David and Harry, who is liked by Ann is liked by Claire.

Let us use L(x, y) for "x likes y", and the first letters for the names. The desired sentence is a conjunction, saying, of each of the men, that if he is liked by Ann he is liked by Claire:

(1′) (L(a, j) → L(c, j)) ∧ (L(a, d) → L(c, d)) ∧ (L(a, h) → L(c, h))
We can, instead, describe the sentence as follows:

(1″) the conjunction of the three sentences L(a, x) → L(c, x), where x is replaced by j, d, h.

If we want to say of x and y that they are different, we have to use the wff x ≉ y.
8 First-Order Logic

First-order logic, FOL for short, is obtained by enriching the language with first-order quantifiers. These are new syntactic operations that produce new types of wffs. The application of quantifiers is called quantification.

The version we shall study here is based on two first-order quantifiers: the universal and the existential. The choice is a matter of convenience, similar to the choice of sentential connectives. We shall see that, semantically, each quantifier is expressible in terms of the other and negation. We use

   ∀ and ∃

for the universal and the existential quantifier.

A quantifier takes two objects as arguments: an individual variable and a wff. It yields as outcome a wff. The outcomes of applying ∀, or ∃, to v and φ are written, respectively, as:

   ∀v φ and ∃v φ

These wffs are called, respectively, universal and existential generalizations. The following are commonly used terminologies. We speak of the universal, or existential, quantification of φ with respect to v; and also of quantifying (universally, or existentially) the variable v in φ, or of quantifying over v in φ. One speaks of quantified wffs, and also of quantified variables (in a given wff). We might say, for example, that in ∀v φ, the variable v is quantified (universally). All of which should not cause any difficulty.
Before going, in the next section, into the syntax of FOL, let us get some idea of the relation of FOL to English and of the way of interpreting a first-order language. It will help us to appreciate better the syntactic details.

If φ is to be read, in English, as ..., then

   ∀v φ can be read as: for all v, ... .
   ∃v φ can be read as: for some v, ... .

For example, let

   Mn   Mr   Wm   Hp

be the formal counterparts of the predicates:

   x is a man   x is mortal   x is a woman   x is happy.

Then

   All men are mortal.

can be formalized as:

(1) ∀v₁(Mn(v₁) → Mr(v₁))

which can be read as:

   For every object v₁, if v₁ is a man then v₁ is mortal.

(Of course, any vᵢ could have been used instead of v₁.)

Similarly, "Some woman is happy" can be formalized as:

(2) ∃v₃(Wm(v₃) ∧ Hp(v₃))

which can be read as:

   For some object v₃, v₃ is a woman and v₃ is happy.
Here are some less straightforward examples. Let L(x, y) and K(x, y) correspond to: "x likes y" and "x knows y". Then

(3) ∀v₂(K(v₂, Jack) → L(v₂, Jack))

says that everyone who knows Jack likes him; literally: for every object v₂, if v₂ knows Jack then v₂ likes Jack. And the following says that every man is liked by some woman.

(4) ∀v₂[Mn(v₂) → ∃v₁(Wm(v₁) ∧ L(v₁, v₂))]
Here is a miniature sketch of the semantics. The full definitions are given in chapter 9. An interpretation of an FOL is given by giving the following items:

   A non-empty set, playing the role of the universe (or domain) of the interpretation. The individual variables of the language are assumed to range over that universe.

   An assignment that correlates, with each individual constant of the language, a member of the universe, which is called its denotation.

   An assignment that correlates with every n-ary predicate an n-ary relation over the universe, which is said to be the interpretation (or denotation) of the predicate.

   If the language contains ≈, its interpretation is the identity relation over the universe.

From this and from the examples above you may get a rough idea how truth and falsity are determined by the interpretation.
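For a finite universe this sketch can be turned directly into a computation. The following minimal Python rendering is an illustration only; the tuple encoding of wffs and names such as holds are assumptions of the sketch, not part of the text. A quantified variable is handled by trying every member of the universe as its value.

    # An interpretation is (universe, constants, predicates): constants maps
    # each individual constant to a member of the universe; predicates maps
    # each predicate to a set of argument tuples. Wffs are nested tuples;
    # ("all", v, w) and ("some", v, w) encode the two generalizations.

    def holds(wff, interp, env=None):
        universe, constants, predicates = interp
        env = env or {}                      # values for the variables
        op = wff[0]
        if op == "not":
            return not holds(wff[1], interp, env)
        if op == "and":
            return holds(wff[1], interp, env) and holds(wff[2], interp, env)
        if op == "or":
            return holds(wff[1], interp, env) or holds(wff[2], interp, env)
        if op == "imp":
            return (not holds(wff[1], interp, env)) or holds(wff[2], interp, env)
        if op == "all":
            return all(holds(wff[2], interp, {**env, wff[1]: m}) for m in universe)
        if op == "some":
            return any(holds(wff[2], interp, {**env, wff[1]: m}) for m in universe)
        args = tuple(env.get(t, constants.get(t)) for t in wff[1:])
        if op == "eq":                       # ≈ is interpreted as identity
            return args[0] == args[1]
        return args in predicates[op]        # an atom P(t1, ..., tn)

    # Sentence (2), "Some woman is happy", in a two-element universe:
    interp = ({1, 2}, {}, {"Wm": {(1,)}, "Hp": {(1,), (2,)}})
    print(holds(("some", "v3", ("and", ("Wm", "v3"), ("Hp", "v3"))), interp))  # True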
8.2 Wffs and Sentences of FOL

8.2.0

First-order logic refers to a family of languages that have comparable logical resources. Those we consider here share a logical apparatus that consists of:

   sentential connectives, individual variables, first-order universal and existential quantification.

The non-logical vocabulary may differ from language to language; it consists of:

   individual constants, predicates (of any arity).

If the language has the equality symbol, ≈, then it belongs to the logical vocabulary.

Every FOL language must contain at least one predicate. But it need not contain individual constants. (The above-given (1), (2) and (4) are examples of sentences without individual constants.) As before, we use

   u, v, w, x, y, z, u′, v′, etc.

to stand for unspecified vᵢ's.

Note: We assume that, when different symbols from this list occur in the same wff-expression, they stand for different vᵢ's, unless stated otherwise.
Thus, what is expressed by (4) above can be expressed by using any two different vᵢ's, which we can write as:

(4′) ∀u[Mn(u) → ∃v (Wm(v) ∧ L(v, u))]

We shall also use "x", "y" and "z" as variables of our own language, ranging over various domains according to the discussion.
First-Order Wffs (Well-Formed Formulas)

Terms and atomic wffs are defined exactly as in PC₀*. Wffs are then defined inductively:

(I) Every atomic wff is a wff.

(II) If φ and ψ are wffs then: ¬φ, φ ∧ ψ, φ ∨ ψ, φ → ψ, φ ↔ ψ are wffs.

(III) If φ is a wff and v is any individual variable, then ∀v φ and ∃v φ are wffs.

Wffs of the forms ¬φ, or φ ∘ ψ, where ∘ is a binary connective, are referred to as sentential compounds.

Wffs of the forms ∀v φ and ∃v φ are referred to as generalizations: the first universal, the second existential.
Every wff is, therefore, either atomic, or a sentential compound, or a generalization. Here are some additional examples of wffs:

   S(x, y, b) → R(c, a)   ∀x (P(x) → R(x, x))   ¬(∃y R(a, y))   ∀x S(b, x, y)
   ∃x ∀y ∀z (S(x, y, z) ∧ ∀u P(u))   ∀y P(a) → [(∃x R(y, z)) ∧ ∀x P(x)]

The first, the third and the last are sentential compounds. The rest are generalizations: the second and fourth universal, the fifth existential.
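This trichotomy can be read off the main symbol of the wff. A minimal sketch, under an assumed tuple encoding (the encoding and names are mine, not the text's): ("all", v, w) and ("some", v, w) stand for ∀v ψ and ∃v ψ, connective-tagged tuples for compounds, predicate-tagged tuples for atoms.

    def kind(wff):
        if wff[0] in ("all", "some"):
            return "generalization"
        if wff[0] in ("not", "and", "or", "imp", "iff"):
            return "sentential compound"
        return "atomic"

    print(kind(("imp", ("S", "x", "y", "b"), ("R", "c", "a"))))      # sentential compound
    print(kind(("all", "x", ("imp", ("P", "x"), ("R", "x", "x")))))  # generalization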
Terminology: An occurrence of a term t in an atomic wff of the form R(. . . , t, . . .) is said to be under the predicate R.

Unique Readability: Unique readability comprises the previous conditions concerning atomic wffs and sentential compounds, as well as conditions on quantification:

   Every generalization is neither an atomic wff, nor a sentential compound.

   The quantifier, the variable, and the quantified wff are uniquely determined by the generalization: If

      Qv φ = Q′v′ φ′

   where Q and Q′ are quantifiers, then: Q = Q′, v = v′ (i.e., they are the same variable) and φ = φ′.
Operants, Scopes and Subformulas: A quantifier-operant is a pair of the form ∀v, or ∃v, which consists of a quantifier and a variable.

By an operant we shall mean either a sentential connective, or a quantifier-operant. Negation and quantifier-operants are monadic: they act on single wffs. The others are binary.

The notion of main connective generalizes, in the obvious way, to the notion of main operant:

   The main operant of ¬φ is ¬ and its scope is φ. The main operant of φ ∘ ψ (where ∘ is a binary connective) is ∘ and its left and right scopes are φ and ψ.

   The main operant of Qv φ (where Q is a quantifier) is Qv and its scope is φ.

Sometimes we omit to mention the variable of the quantifier-operant and speak of the scope of a quantifier (or, rather, of its occurrence), or of the quantifier itself as being the main operant.

The immediate components of a wff are defined by adding to the previous definition of chapter 2 (cf. 2.3) the clause for quantifiers:

   The immediate component of Qv φ (where Q is a quantifier) is φ.

A component is now defined as before: it is either the wff itself, or an immediate component of it, or an immediate component of an immediate component, ... etc. A component of φ is proper if it is different from φ. The components of φ are also referred to as the subformulas of φ, and the proper components as the proper subformulas.
The component-structure of a given wff is that of a tree. As in the sentential case, we can write wffs as trees, or even identify them with trees. The following is a wff whose subformulas correspond to the sub-trees that issue from the nodes. The nodes are numbered according to our old numbering rule (cf. 4.2.2, page 122). The main operants of the non-atomic subformulas are encircled. On the right-hand side is the graphic representation, with nodes labeled either by operants or by atomic wffs.

1. ∀x [(∀y ∃z S(x, y, z)) → ∃u P(u)]
2. (∀y ∃z S(x, y, z)) → ∃u P(u)
3.1 ∀y ∃z S(x, y, z)
3.2 ∃u P(u)
4.1 ∃z S(x, y, z)
5.1 S(x, y, z)   atomic wff
4.2 P(u)   atomic wff

As in the case of sentential logic (cf. 2.3.0), we often omit the word "occurrence". For example "the second ∧" means "the second occurrence of ∧", "the first ∀v" means "the first occurrence of ∀v", etc. The same systematic ambiguity applies to other particles and constructs: variables ("the first v"), individual constants ("the second a"), wffs ("the first P(a)"), etc.
Nested Quantifiers: Quantifiers are said to be nested if one is within the scope of the other. In the last example, ∀x and ∀y are nested. A sequence of nested quantifiers is a sequence in which the second is in the scope of the first, the third within the scope of the second, and so on. In the last example the following are sequences of nested quantifiers:

   ∀x, ∀y, ∃z   and   ∀x, ∃u

On the other hand, ∀y and ∃u are not nested.

(To be precise we should speak of quantifier-occurrences, because the same quantifier (with the same variable) can occur more than once.)

Grouping Conventions for Quantifier-Operants: The grouping convention for negation is extended to all monadic operants: Every monadic operant-name binds more strongly than every binary operant-name. This means, for example, that

   ∀v φ ∧ ψ is to be read as (∀v φ) ∧ ψ

To include ψ within the scope of ∀v, write:

   ∀v (φ ∧ ψ)
8.2.1 Bound and Free Variables

An occurrence of a variable v in a wff φ is bound if it is (i) within the scope of a quantifier-operant Qv, or (ii) the occurrence of v in the pair Qv. An occurrence of a variable is free if it is not bound.

A variable that has a free occurrence in φ is said to be free in φ, or a free variable of φ. A variable that has a bound occurrence in φ is said to be bound in φ, or a bound variable of φ.

Examples:

   S(x, y, b) → R(y, x): All variable occurrences are free.

   ∀x S(x, y, b) → ∃x R(y, x): All occurrences of x are bound, all occurrences of y are free.

   ∀x [∃y S(x, y, b) → R(y, x)]: All occurrences of x are bound, and so are the occurrences of y in ∃y and under S. The last occurrence of y is free (it is not within the scope of ∃y).

As the last example shows, a variable can have several occurrences, of which one or more are bound and one or more are free. Such a variable is both free and bound in φ.

Whether an occurrence is free depends on the wff. The same variable-occurrence which is free in one wff can be bound in a larger wff that contains the first as a component. For example, the occurrence of y in R(x, y) is free, but it is bound in the larger ∃y R(x, y). The x in ∃y R(x, y) is free in that wff, but is bound in ∀x ∃y R(x, y).
The Binding Quantifier: An occurrence of a quantifier-operant Qv is said to bind, and also to capture, all the free occurrences of v in its scope; these latter are said to be bound, or captured, by the Qv.

As is usual, we apply the terminology to the quantifier itself: we speak of an occurrence of a quantifier as binding, or capturing, the variables that occur free in its scope. Among the occurrences that are bound by (an occurrence of) Q, we also include the occurrence of v in the pair Qv.

It is not difficult to see that, for any wff φ, every bound occurrence of v is bound by a unique occurrence of some quantifier. If the v occurs in Qv, then it is bound by that occurrence of Q. Otherwise, it is in some subformula, Qv ψ, such that it is free in ψ; it is then bound by that Q. (The uniqueness is guaranteed by unique readability.)

An occurrence of v can belong to the scopes of several Qv's (e.g., the last occurrence of v in ∀v[P(v) → ∃v R(a, v)]). Among these, the Qv with the smallest scope is the one that binds it.

In the following illustrations (rendered in the original with connecting lines) each binding quantifier is paired with the occurrences it binds:

   ∀x [S(x, y) → ∃y S(x, y)]: the initial ∀x binds all the occurrences of x; the first occurrence of y is free, the last is bound by ∃y.

   ∀x [P(x) → S(x, y)] ∧ ∃y ∀x S(x, y): the first ∀x binds the x's of P(x) and of the first S(x, y); ∃y and the second ∀x bind the variables of the last S(x, y); the y of the first S(x, y) is free.
The significance of free and bound occurrences is explained in the next subsection.

Individual Constants: It is convenient to extend the classification of free and bound occurrences to individual constants. All occurrences of individual constants are defined to be free. Thus, an occurrence of a term is free iff it is either an occurrence of an individual constant, or a free occurrence of an individual variable.
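The clause "Qv binds the free occurrences of v in its scope" translates into a one-line recursion for computing the free variables of a wff. A sketch under the same assumed tuple encoding as in the earlier sketches; the names are illustrative only:

    VARIABLES = {"u", "v", "w", "x", "y", "z"}   # an assumed stock of variables

    def free_vars(wff):
        op = wff[0]
        if op in ("all", "some"):
            return free_vars(wff[2]) - {wff[1]}  # Qv binds the free v's of its scope
        if op == "not":
            return free_vars(wff[1])
        if op in ("and", "or", "imp", "iff"):
            return free_vars(wff[1]) | free_vars(wff[2])
        return {t for t in wff[1:] if t in VARIABLES}   # an atomic wff

    # Third example above: in ∀x[∃y S(x, y, b) → R(y, x)] only the last y is free.
    w = ("all", "x", ("imp", ("some", "y", ("S", "x", "y", "b")), ("R", "y", "x")))
    print(free_vars(w))    # {'y'}

A wff is then a sentence, in the sense defined next, exactly when this set is empty.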
Sentences

A wff is defined to be a sentence just when it has no free variables. (In some terminologies the term "open sentence" is used for wffs with one or more free variables, and "closed sentence" for sentences.)

The wffs (1)-(4) at the beginning of the chapter are sentences.

The definition of sentences in PC₀* (given in 7.4) is a particular case of the present definition: All variable-occurrences in wffs of PC₀* are free, since PC₀* has no quantifiers. Hence a wff of PC₀* is a sentence, according to the present definition, just when it contains no variables.
8.2.2 More on the Semantics

The syntactic distinction between free and bound occurrences has a clear and crucial semantic significance. Consider the wff

   P(v)

The interpretation of the language does not determine the formula's truth-value, because the interpretation does not correlate particular objects with variables, as it does with the individual constants. In order to get a truth-value we need, in addition to the interpretation, to assign some object to v. We therefore introduce assignments of values to variables. The truth of P(v) is relative to an assignment that assigns a value to v. The wff P(v) gets T iff the assigned value is in the set that interprets P( ). If P( ) is interpreted as the set of all people, then P(v) is true, under the assignment that assigns Juno to v, iff Juno is a person.

On the other hand the interpretation by itself determines the truth-values of

   ∀v P(v) and ∃v P(v)

The first is true iff all the objects in the universe of the interpretation are in the set denoted by P (which means that the set is the whole universe). The second is true iff some object is in this set (which means that the set is not empty). You can, if you wish, assign a value to v. But this value will have no effect on the truth-values of the last two wffs.

Changing the free variable in P(v) results in a non-equivalent formula:

   P(v) is not logically equivalent to P(u)

because, if P is interpreted as a set that is neither the whole universe nor empty, there is an assignment under which the first wff gets T and the second gets F: assign to v a value in the set and to u a value outside it. On the other hand, changing the bound variable results in a different, but logically equivalent, formula:

   ∀v P(v) ≡ ∀u P(u)   ∃v P(v) ≡ ∃u P(u)

Roughly speaking, the wff P(v) says something about the interpretation of P and the value of v; but the wffs ∀v P(v) and ∃v P(v) are not about the value of v; they are only about the interpretation of P.

What we have just observed holds in general. If a wff, φ, has free variables, then its truth-value in a given interpretation depends, in general, on the values assigned to these variables. But if all the variables are bound, that is, if the wff is a sentence, the truth-value is completely determined by the interpretation of the language.

Note: There are wffs, with free variables, which get the same truth-value under all assignments. For example, the truth-value of

   P(v) ∨ ¬P(v)

is T, for any value of v; nonetheless v is a free variable and the wff is not a sentence. You can think of this wff as defining a function, whose value for each value of v is T; this is different from a sentence, which simply determines a truth-value.
Variable Displaying Notation: We extend the variable displaying notation of 7.4.1 (page 272) to wffs of first-order logic. We shall use

   φ(u)   φ(u, v)   φ′(x, y)   φ(x, y, z)   etc.

to denote wffs in which the displayed variables (and possibly others that are not displayed) are free. One of the main points of the notation has to do with substitutions of free variables, to be considered in the next subsection.

If v is the only free variable in φ(v), you can think of φ(v) as saying something about the value of v. It defines the set consisting of all objects whose assignment to v makes φ(v) true (under the presupposed interpretation of the language). Similarly, a wff with two free variables defines a binary relation, one with three free variables defines a ternary relation, and so on. (Recall that in cases of arity > 1 we have to stipulate which variable represents which coordinate; cf. 7.3.)

For example, if we have in our language the predicates Male( ) and Parent( , ), we can formalize "x is a grandfather of y" as:

   Male(x) ∧ ∃v (Parent(x, v) ∧ Parent(v, y))

Wffs with free variables resemble predicates, but unlike predicates they are not atomic units.
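A wff with free variables can be tested against particular values by supplying an assignment, in the manner of the previous subsection. A small sketch, assuming the holds function from the earlier semantics sketch is in scope; the sample family and all names are invented for illustration:

    # x is a grandfather of y, as above:
    grandfather = ("and", ("Male", "x"),
                   ("some", "v", ("and", ("Parent", "x", "v"),
                                          ("Parent", "v", "y"))))

    universe = {"Abe", "Homer", "Bart"}
    interp = (universe, {}, {
        "Male":   {("Abe",), ("Homer",), ("Bart",)},
        "Parent": {("Abe", "Homer"), ("Homer", "Bart")},
    })
    print(holds(grandfather, interp, {"x": "Abe", "y": "Bart"}))    # True
    print(holds(grandfather, interp, {"x": "Homer", "y": "Bart"}))  # False

The wff thus defines, relative to the interpretation, the binary relation of those pairs of values for x and y that make it true.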
Homework 8.1 Assume an interpreted first-order language, containing ≈ and the predicates: M( ), F( ) and C( , , ), interpreted so that M(x), F(x), and C(x, y, z) read as:

   x is a human male,   x is a human female,   x is a child of y and z

Write down wffs that formalize the following. Use the same free variables that are used here with the English. You may introduce shorthand notations; e.g., you can define φ₁(x, y) to stand for some wff, and then use it as a unit. But write in full unfolded form at least two of the formalizations.

1. x is the mother of y.
2. x is a sister of y.
3. x is an uncle of y.
4. x and y are first cousins.
5. x is y's nephew.
6. x is y's maternal grandmother.
7. x is a half-brother of y.
8. x has no sisters.
9. Everyone has a father and a mother. (Use H as a predicate for humans.)
10. No one has more than one father.
Repeated Use of Bound Variables: The same variable can be paired in the same wff with several occurrences of quantifiers:

(5) ∀x P(x) → ∀x P′(x)

(5) says that if everything is in the set denoted by P, then everything is in the set denoted by P′. Such quantifiers can even be nested:

(6) ∀v [P(v) → ∀v P(v)]

We can try to read (6) as:

(6′) For every v, the following holds: If v is P, then for every v, v is P.

It may look, or sound, confusing, until you realize that there is no connection between the first "v" and the second "v". To bring the point out, rephrase (6′) as:

(6″) For every v: if v is P, then everything is P.

(And this, it is not difficult to see, says that if something is P then everything is P.)

While (6) is a legitimate sentence, one may wish to avoid the repeated use of the same bound variable. This can be easily done by using another variable in the role of the second v. The following is logically equivalent to (6):

(6*) ∀v [P(v) → ∀u P(u)]
8.2.3 Substitutions of Free and Bound Variables

We denote by S^{t}_{t′} φ the wff obtained from φ by substituting every free occurrence of t by t′. (Recall that all occurrences of individual constants are considered free.) We describe the operation as the substitution of free t by t′, or the substitution of t′ for free t. We shall refer to it, in general, as free-term substitution, or, for short, free substitution. The concept is extended to cover also simultaneous substitutions of several terms:

   S^{t₁, t₂, ..., tₙ}_{t′₁, t′₂, ..., t′ₙ} φ

There are also substitutions of bound variables: a quantified variable, together with all the occurrences that its quantifier binds, is replaced by another variable (e.g., ∀v P(v) is changed to ∀u P(u)). These substitutions can serve to eliminate repeated use of the same bound variable. Using them we can transform any wff into a logically equivalent one, in which different occurrences of quantifiers are always paired with different variables. Bound-variable substitutions can be also used to get an equivalent wff in which no variable is both free and bound. All in all we get wffs that are easier to grasp.

Furthermore, bound-variable substitutions can enable free substitutions which would be otherwise illegitimate. If, for some reason (and such occasions arise), we want to substitute the free v by u in

   ∀u L(u, v)

we cannot do so, because the u will be captured by the quantifier. But after substituting the bound u by w, we get the logically equivalent

   ∀w L(w, v)

And here the substitution of free v by u is legitimate and we get: ∀w L(w, u).
Legitimate Substitutions of Bound Variables: As in the case of free-term substitutions, substitutions of bound variables can have unintended effects of capturing free occurrences. Consider, for example, our previous φ(v):

   ∀u (K(v, u) → L(v, u))

If we substitute in it the bound u by v we get:

   ∀v (K(v, v) → L(v, v))

which is not what we intended. Bound occurrences of u have been replaced here by bound occurrences of v, but, in addition, free occurrences of v became bound. A substitution may also transform some bound occurrence to an occurrence that is bound by a different quantifier. For example, if in:

(7) ∀u [P(u) → ∀w R(u, w)]

we substitute the bound u by w we get the non-equivalent:

(8) ∀w [P(w) → ∀w R(w, w)]

To see clearly the difference, you can read (7) and (8), respectively, as:

(7′) For every u: if u is P, then u is R-related to everything.

(8′) For every w: if w is P, then everything is R-related to itself.

The trouble here is that the occurrence of u under R, which in (7) is bound by the first ∀, has been transformed into an occurrence of w, which is bound by the second ∀.

All of this motivates the following definition.

   A substitution of bound variables is legitimate if every free occurrence remains, after the substitution, free, and every bound occurrence is changed to, or remains, bound by the same quantifier.
We can combine the substitution conditions for free and for bound variables into one general condition, which covers all substitutions, including mixed cases where free and bound substitutions are carried out simultaneously.

   Legitimate Substitutions in General: A substitution of variables is legitimate if the following holds: (i) Every free occurrence remains, or is replaced by, a free occurrence; (ii) free occurrences of the same variable remain, or are replaced by, occurrences of the same variable; (iii) every bound occurrence remains, or is replaced by, a bound occurrence that is bound by the same quantifier-occurrence.

Note: We use the notation S^{t}_{t′} only for free substitutions. We will not need a special notation for bound ones.
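The capture condition can be enforced mechanically. The sketch below uses the same assumed encoding as the earlier sketches; the function name and error handling are mine. It substitutes t2 for the free occurrences of t and refuses whenever a newly introduced occurrence would be captured by a quantifier:

    def subst_free(wff, t, t2):
        """S^t_{t2}: replace free occurrences of t by t2, checking legitimacy."""
        op = wff[0]
        if op in ("all", "some"):
            if wff[1] == t:
                return wff                 # every t below is bound: nothing to do
            body = subst_free(wff[2], t, t2)
            if wff[1] == t2 and body != wff[2]:   # a replacement occurred in scope
                raise ValueError("illegitimate: %s would be captured by %s %s"
                                 % (t2, op, wff[1]))
            return (op, wff[1], body)
        if op == "not":
            return ("not", subst_free(wff[1], t, t2))
        if op in ("and", "or", "imp", "iff"):
            return (op, subst_free(wff[1], t, t2), subst_free(wff[2], t, t2))
        return (op,) + tuple(t2 if s == t else s for s in wff[1:])

    # ∀u L(u, v): substituting u for the free v is refused; after renaming
    # the bound u to w it goes through, exactly as in the example above.
    subst_free(("all", "w", ("L", "w", "v")), "v", "u")  # ('all', 'w', ('L', 'w', 'u'))
    subst_free(("all", "u", ("L", "u", "v")), "v", "u")  # raises ValueError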
Homework 8.2 Let φ be the wff:

   ∀u R(u, v) → ∃v [S(v, u, c) ∧ ∀w R(v, w)]

(i) List, or mark, the free occurrences of each of the terms in φ. List, or mark, all bound occurrences.

(ii) Substitute (legitimately) bound occurrences so as to get a logically equivalent wff in which no variable is both free and bound.

(iii) Construct the following wffs (carry out also the illegitimate substitutions):

   S^{u}_{c} φ   S^{v}_{w} φ   S^{u,v}_{v,u} φ   S^{u,w}_{c,v} φ
(5′) ∃x [Girl(x) ∧ LiveIn(x, NY) ∧ φ(x)]

where φ(x) defines those who grow beautiful tulips. Additional predicates will be needed to construct a plausible φ(x).

Sometimes a fine-grained analysis is not needed. To show that (5) implies that someone grows beautiful tulips, we do not need to know the structure of φ(x). We can let φ(x) be the atomic wff GBT(x), which stands for "x grows beautiful tulips". But sometimes a deeper analysis is unavoidable; we need to go into φ(x)'s details in order to show that (5) implies that someone grows tulips.
Many-Sorted Languages

Some first-order languages have individual variables of several sorts. An interpretation associates with each sort a domain; all the variables of this sort range over that domain. Quantification is interpreted accordingly. Domains that correspond to sorts need no relativization; we simply use the appropriate variables. For instance, if ξ is a variable that, according to the interpretation, ranges over humans, then

   ∀x (Human(x) → Mortal(x)) is rewritten as ∀ξ Mortal(ξ).

Single-sorted languages and many-sorted ones have the same expressive power. What the latter achieve by use of different sorts the former achieve by relativizing to the corresponding predicates. Many-sorted languages are more convenient in certain contexts of application; single-sorted ones are simpler when it comes to defining the syntax and the semantics.
Note: Sometimes relativization is not necessary, because the restriction to a specific domain is already implied by other predicates that occur in the formula. Suppose, for example, that K(x, y) reads as "x knows y", and that, by definition, only humans can know.

(8) Some person other than Jack knows Jill,

and

(9) Every person who knows Jill likes her,

can be rendered as:

(8′) ∃x (x ≉ Jack ∧ K(x, Jill))

and

(9′) ∀x (K(x, Jill) → L(x, Jill))

Explicit relativization to the human domain can be dispensed with, because the needed restriction is imposed already by the predicate K. But this does not always work out. You should note how the predicate occurs. For example,

(10) Someone doesn't know Jill

should be formalized as

(10′) ∃x [H(x) ∧ ¬K(x, Jill)],

where H marks off the domain of humans. Without the conjunct H(x), the formalization would come out true whenever the universe includes non-human objects (can you see why?).
8.3.3 Universal Quantification

The universal-quantifier terms in English are the indefinite pronouns:

   every   all   any   each.

They differ, however, in significant aspects, grammatical as well as semantic. For example, "all" requires a complement in the plural, the others in the singular; this, we shall see, is related to semantic differences.

"All" can function as a quantifier-term in ways that the others cannot, ways that do not fall strictly under (S1). It can precede a relative pronoun:

(1) All who went never returned.

To use the other terms in such a construction, one would have to transform "who went" into a noun phrase, e.g., "one who went".

Sometimes the terms can be used interchangeably:

(2) In this class, every student can pass the test.

(3) In this class, all students can pass the test.

(4) In this class, any student can pass the test.
(5) In this class, each student can pass the test.

But quite often they cannot:

(6) Not every rabbit likes lettuce.

(7) Not each rabbit likes lettuce.

(8) Not any rabbit likes lettuce.

The second is odd, if not ungrammatical; the third, if accepted, means something different from the first. Or compare:

(9) Each voter cast his ballot.

(10) Any voter cast his ballot.

You can easily come up with many other examples.

We shall not go into the intricate differences of various first-order quantifications in English. This is a job for the linguist. In what follows, some basic aspects of the two most important universal-quantifier terms, "every" and "all", are discussed. The other two terms, which have peculiarities of their own, are left to the reader.

"Every" expresses best the universal quantification of FOL.

(11) Every tiger is striped

states no more and no less than the conjunction of all the sentences "... is striped", where "..." denotes a tiger (assuming that every tiger can be referred to by some expression). But

(12) All tigers are striped

implies some kind of law. This nomological (law-like) dimension is lost when (12) is formalized in FOL. The universal generalizations of FOL are, one might say, material. They state that, as a matter of fact, every object in some class is such and such; whether this is some law, or a mere accident, does not matter.

This does not mean that "every" cannot convey a lawful regularity. When the domain of objects that fall under the generalization is sufficiently large, or sufficiently structured by a rich theory, an "every"-statement is naturally taken as an expression of lawfulness. In particular, in mathematics, "every" is used throughout to state the most lawful imaginable regularities.
(13) Every positive integer is a sum of four squares.

We can also apply "all" to accidental, or occasional, groups of objects, without nomological implications.

(14) All the students passed the test

and

(15) Every student passed the test

say the same thing. Note however that in (14) the definite article is needed to pick out a particular group. Without it, the statement has a different flavour:

(14′) All students passed the test.

Distributive versus Collective All

(16) After the meeting all the teachers went to the reception.

(17) After the meeting every teacher went to the reception.

In (16) "all" can be read collectively: all the teachers went together. But "every" must be read distributively, in (17) and in all other cases.

Only the distributive reading of (16) can be formalized as a first-order generalization. The collective reading cannot be expressed in FOL, unless we provide ways of treating pluralities as objects. The fact that "all" takes a plural complement conforms to its functioning as something that relates to a plurality as a whole.

Sometimes "all" must be read collectively:

(18) All the pieces in this box fit together.

(18) neither implies, nor is implied by, the statement that every two pieces fit. The latter can be expressed as a first-order generalization, by using a two-place predicate: "x fits together with y". But (18) must be interpreted as an assertion about a single totality. (You can have a collection of pieces, every two of which can be combined, but which cannot be combined as a whole. Vice versa, a collection can be combined as a whole, though not every two pieces in it fit together.)
But in general the two readings are possible and the choice is made according to context and plausibility. Quite often, the statement under the collective reading of "all" implies the one under the distributive reading. (If all the teachers went together, then also every teacher went.) But sometimes the statements are exclusive:

(19) The problem was solved by all the engineers in this department,

(20) The problem was solved by every engineer in this department.

(One's contribution to a joint solution does not count as solving the problem.)

Note: "All" is no less collective, perhaps even more so, when it precedes a relative pronoun. Coming with "that", it is employed in the singular form and points to a single totality. In

(21) All that John did was for the best,

"all that John did" refers to the collection of John's doings. Another example is (22) below.
All with Negation

(22) All that glitters is not gold.

The proverb does not make the false assertion that every object that glitters is not made of gold. (22) is the negation of

(23) All that glitters is gold,

in which "all that glitters" is read as referring to all glittering objects taken together. (23) states that this totality is made of gold; its negation, (22), says that it is not, i.e. that some of it is not made of gold. (22) can be the negation of (23) only if we treat "all that glitters" as a name of a single entity. By contrast,

(24) Every object that glitters is not made of gold

is not the negation of

(25) Every object that glitters is made of gold.

(24) makes a much stronger statement: No glittering object is made of gold. By the same token, under collective "all",

(26) All good memories will not be forgotten, but some will,

is not logically false. But the analogous statement with "every" is logically false:
(27) Every good memory will not be forgotten, but some will.

Solid Compounds

"Every" and "any" combine with "one" and "body" to form solid compounds, which can be used to quantify over people:

   everyone, everybody, anyone, anybody.

The formalized versions require relativization to the human domain, via an appropriate predicate, unless the presupposed universe consists of humans only. Thus,

(28) Everyone is happy at some time.

comes out as a generalization relativized, via such a predicate, to humans. Temporal restrictions can also be built into the interpretation of the predicates themselves, as in

   ∀x [FutureCow(x) → Blue(x)]

The predicate PastHuman denotes the set of all humans that existed before now. FutureCow denotes the set of all cows that will exist after now. All the other predicates have time-independent interpretations. E.g., RedHead expresses the property of being redheaded, irrespective of time; it is therefore interpreted as the set of all redheaded creatures, past, present and future. More of this in the next subsection.
The plural forms, "There are", "There exist", imply the existence of more than one object. The discussion above, concerning "some" in the plural, applies in large measure also here.
8.3.5 More on First-Order Quantification in English
Generality of Time and Location
Quite a few quantifier terms are used to generalize, either universally or existentially, with
respect to time and place. Some of these are compounds based on terms discussed earlier.
Here is a list.
Universal Generalization
For Time: whenever, always, anytime.
For Place: wherever, everywhere, anywhere.
Existential Generalization
For Time: sometime, sometimes.
For Place: somewhere.
Temporal generality can be expressed in FOL by including times, or time-points, among the
objects over which the variables range. For example, the formalization of
(1) Jill is always content
comes out as:
(1′) ∀x (Time(x) → Content(Jill, x))
a conjunct asserting the implied existence (e.g., of a girl who saw the puppy): (6′) would be
vacuously true and completely uninteresting if no girl saw the puppy. The argument can be
carried further by considering:
(7) Everyone who was near the explosion is by now dead.
The speaker may assert (7) in complete ignorance as to whether someone was near the
explosion. The point is that if (7) is granted, and if we find later that someone was near the
explosion, then we can deduce that the person is dead. If it turns out that no one was near
the explosion, (7) would still be considered true.
Something about Someone
Someone means at least one person. Occasionally, an assertion of some is taken to indicate
not all, or even exactly one. But such cases can be explained on the grounds of implicature.
If the teacher asserts
(8) Someone got an A on the last test,
the students will probably infer that only one of them got an A. For they assume that the
teacher does not withhold relevant information. If several students got an A, he would have
used the plural: some of you, and if all did, he would have used all of you.
But if (8) is asserted by someone who stole a hasty glance at the grade sheet, and the students
know this, they will infer only that there was at least one who got an A. Note also that the
teacher himself can announce:
(9) Someone got an A on the last exam, in fact all of you did,
without contradicting himself.
Note: Some can also mean a relatively small quantity, as in Some grains of salt got into the
coffee. Read in this way, it is not expressible in FOL.
Generality through Indefinite Articles
An indefinite article, by itself, sometimes implies universal generalization. Usually, such
statements are intended to express some law-like regularity, an aspect that will be lost in
FOL formalization.
(10) A bachelor is an unmarried man
means:
(10′) All bachelors are unmarried men.
(11) Whales are mammals
means:
(11′) All whales are mammals.
But often the last form expresses something weaker than strict universal generalization:
(12) Birds fly
means something like: In most cases, or in most cases you are likely to
encounter, birds fly.
And this kind of statement is outside the scope of FOL. Considerable efforts have been devoted
to setting up formalisms in which generalizations of this kind are expressible.
A very common variant of (10) and (12) employs the conditional.
(13) If a triangle has two equal angles, it has two equal sides
means:
(13′) Every triangle that has two equal angles has two equal sides.
(14) A man is not held in esteem, if he is easily provoked
means:
(14′) Every man who is easily provoked is not held in esteem.
Generality through Negation
(15) No person is indispensable
amounts to the negation of Some person is indispensable:
(15′) Every person is not indispensable.
In this category we have the very commonly used compounds: nothing, no one, nobody, as
well as nowhere.
Generalization through Some
These cases belong together with (13) and (14) above. Some plays here the role of an
indefinite article.
(16) If someone beats a world record, many people admire him
means:
(16′) If a person beats a world record, many people admire him,
which comes to:
(16″) Everyone who beats a world record is admired by many people.
And in a similar vein:
(17) Someone who is generous is liked
really means:
(17′) Everyone who is generous is liked.
We may even get ambiguous cases where some can signify either a universal or an existential
quantifier:
(18) In this class, someone lazy will fail the test.
You can conceivably interpret (18) as stating that, in this class, all the lazy ones will fail the
test. You can also read it as a prediction about some unspecied student that the speaker
has in mind.
General Advice
From the foregoing, you can see some of the tangle of first-order quantification in natural
language. Remember that there are no simple clear-cut rules that will enable you to derive,
in a mechanical way, correct formalized versions. Conceivably, some algorithm might do this;
but it is bound to be a very complex affair. A good way to check whether you have got the
formalization right is to consider truth-conditions:
Assuming that vagueness, non-denoting terms and other complicating factors have been
cleared, do the sentence and its formal translation have the same truth-value in every possible
circumstance? This is not the only criterion, but it is a crucial one.
In any case, do not follow blindly the grammatical form. You must understand what the
sentence says before you formalize it!
8.3.6 Formalization Techniques
When translating from English into FOL, it is often useful (especially for beginners) to pro-
ceed stepwise, using intermediary semi-formal rewrites, possibly with variables. When the
semi-formal sentence is sufficiently detailed, it translates easily into FOL. Here are some
illustrations. The predicates and constants in the final wffs are self-explanatory (H is inter-
preted as the set of humans). In (1), (2), (5) and (6) more than one logically equivalent
wff is given as a possible answer. In some you can trace the equivalence to the famil-
iar: (φ → ψ) ≡ (¬φ ∨ ψ). In all cases the equivalences follow from FOL equivalence
rules (to be given in chapter 9). But you may try to see for yourself that the different versions
say the same thing.
(1) No man is happy unless he likes himself.
(1′) There is a man whom Claire likes and who does not like Claire.
(4′) There are two women whom Harry likes and there is a woman whom Harry does not
like.
(5′) Every woman does not like a man, if the man doesn't like her.
(6′), or (1′), is abbreviated as:
∃!x φ(x)
which reads: there is a unique x such that φ(x).
Homework 8.4 Rephrase the following sentences, using variables. Then formalize them
in FOL.
(1) Claire and Edith like the same men.
(2) The women who like Jack do not like Harry.
(3) Only Ann is liked by Harry and David.
(4) David is not happy unless two women like him.
(5) Edith is liked by some man who does not like any other woman.
(6) Harry likes a woman who likes all happy men.
(7) Unless liked by a woman no man is happy.
(8) Some man likes all women who like themselves.
(9) Every happy man is liked by some happy woman.
(10) Ann is liked by every man who likes some woman.
Divide and Conquer
It is often useful to formalize separately components of the sentence, which can then be fitted
into the global structure. Such components are wffs that can contain free variables. They
are obtained from a semi-formal version of the original English sentence. Here are three
examples of this divide-and-conquer method. Note how short English sentences can display,
upon analysis, an intricate logical structure.
(i) Whoever found John found also somebody else.
For all x, if x is a person and x found John, then φ(x), where φ(x) is a wff saying:
x found someone other than John.
All in all, the sentence can be written as:
(i′) ∀x [H(x) ∧ Found(x, John) → φ(x)]
We now turn our attention to φ(x). It can be written as:
∃y (H(y) ∧ y ≠ John ∧ Found(x, y))
Substituting in (i′) we get our final answer:
(i″) ∀x [H(x) ∧ Found(x, John) → ∃y (H(y) ∧ y ≠ John ∧ Found(x, y))]
(ii) Jill owns a dog which is smaller than any other dog.
(ii′) ∃x [Dog(x) ∧ Owns(Jill, x) ∧ φ(x)], where φ(x) says:
x is smaller than any other dog. We can write it as:
∀y (Dog(y) ∧ y ≠ x → Smaller(x, y))
Substituting we get:
(ii″) ∃x [Dog(x) ∧ Owns(Jill, x) ∧ ∀y (Dog(y) ∧ y ≠ x → Smaller(x, y))]
(iii) Somebody loves a person who is loved by nobody else.
(iii′) ∃x {H(x) ∧ φ(x)}, where φ(x) says:
x loves a person who is not loved by anyone, except x.
It can be written as:
∃y [H(y) ∧ Loves(x, y) ∧ ψ(x, y)], where ψ(x, y) says:
Every person other than x does not love y.
It can be written as:
∀z (H(z) ∧ z ≠ x → ¬Loves(z, y))
Therefore, φ(x) becomes:
∃y [H(y) ∧ Loves(x, y) ∧ ∀z (H(z) ∧ z ≠ x → ¬Loves(z, y))]
Substituting in (iii′), we get:
(iii″) ∃x {H(x) ∧ ∃y [H(y) ∧ Loves(x, y) ∧ ∀z (H(z) ∧ z ≠ x → ¬Loves(z, y))]}
Homework
8.5 Formalize the following sentences in FOL. You can use one-letter notations for predicates
and individual names; specify what they stand for. Indicate cases of ambiguity and formalize
the various readings.
1. One woman in the room knows all the men there.
2. No one in the room knows every person there.
3. Someone in the room does not know any other person.
4. Someone can be admitted to the club only if two club members vouch for him.
5. Bonnie knows a man who hates all club members except her.
6. Bonnie will not attend the party unless some friend of hers does.
7. Bonnie met two persons only one of whom she knew.
8. Abe met two men, one of whom he knew, and the other who knew him.
9. Some women who like Abe do not like any other man.
10. Abe owns a house which no one likes.
11. Abe was bitten by a dog owned by a woman who hates all men.
12. Bonnie knows a man who likes her and no one else.
13. Whoever visited Bonnie knew her and was known to some other club member, except
Abe.
14. With the possible exception of Bonnie, no club member is liked by all the rest.
15. With the exception of Bonnie, no club member is liked by all the rest.
8.6 Formalize the following sentences in FOL. Introduce predicates as you find necessary,
specifying the interpretations clearly. (For example: GM(x, y) stands for x and y are
people and x is good enough to be y's master.) Try to get a fine-grained formalization.
Interpret some as at least one.
1. No one means all he says.
2. Someone says all he means.
3. Each State can have for enemies only other States, and not men.
4. He who is by nature not his own but anothers man, is by nature a slave.
5. No man is good enough to be another man's master.
6. Those who deny freedom to others deserve it not for themselves.
7. He who cannot give anything away cannot feel anything either.
8. You can fool some of the people all the time, or all the people some of the time, but
you cannot fool all people all the time.
Chapter 9
Models for FOL, Satisfaction, Truth
and Logical Implication
9.1 Models, Satisfaction and Truth
9.1.0
The interpretation of a first-order language is given as a model. By this we mean a structure
of the form:
(U, ρ, δ)
in which:
(I) U is a non-empty set, called the model's universe, or domain. The members
of U are also referred to as members of the model.
(II) ρ is a function that correlates with every predicate, P, of the language a
relation, ρ(P), over U of the same arity as P (if P's arity is 1, then ρ(P) is a
subset of U). We say that ρ(P) is the interpretation of P.
(III) δ is a function that correlates with every individual constant, c, of the lan-
guage a member, δ(c), of U. We say that δ(c) is the denotation of c in the
given model. We also speak of it as the interpretation of c.
In set-theoretic terms we can express this by:
ρ(P) ⊆ U^n, where n = arity of P;  δ(c) ∈ U.
There is only one restriction on the interpretation of predicates: If the lan-
guage has equality, then
ρ(≈) = {(x, x) : x ∈ U}.
In words: the equality sign is interpreted as the identity relation over U.
Note: In the case of a language with function symbols (cf. 8.2.4), the mapping ρ is
also defined for the function symbols; it correlates with every n-place function symbol an
n-place function from U^n into U. Henceforth we deal with languages based on predicates and
individual constants. The extension to function symbols is more or less straightforward.
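As an illustrative aside (a sketch of ours, not part of the course's formal apparatus), the definition transcribes directly into Python for finite models; the names Model, universe, rho and delta below are our own, chosen to mirror the notation (U, ρ, δ):

    # A minimal computational rendering of a model (U, rho, delta),
    # restricted to finite universes.
    from dataclasses import dataclass

    @dataclass
    class Model:
        universe: set   # U: a non-empty set
        rho: dict       # maps each predicate name to a set of tuples over U
        delta: dict     # maps each individual constant to a member of U

    # Example: a model for a language with one binary predicate R
    # and one individual constant c.
    M = Model(
        universe={0, 1},
        rho={"R": {(0, 1), (1, 1)}},
        delta={"c": 0},
    )
    assert M.universe, "a model's universe must be non-empty"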
Notation and Terminology
We shall use M, M′, . . ., M₁, . . . for models.
|M|, P^M, c^M denote, respectively, the universe of M, the interpretation of P
in M, and the interpretation of c in M.
Hence, if M = (U, ρ, δ) then:
|M| = U,  P^M = ρ(P),  c^M = δ(c)
If we assume fixed orderings of the predicates and of the individual constants: P₁, P₂,
. . ., c₁, c₂, . . ., the model is written by displaying the interpretations in the same order:
(U, P₁, P₂, . . . , c₁, c₂, . . .)
where U = |M|, Pᵢ = Pᵢ^M, cⱼ = cⱼ^M. (If there are no individual constants, the last
sequence is simply omitted.) A structure of this form is known also as a relational
structure.
The size of a model M is, by definition, the number of elements in its universe. The
model is finite if its universe is a finite set.
As observed in 8.2.2, the truth-value of any wff φ is determined by: (i) a model M and (ii)
an assignment of members in M to φ's free variables. Accordingly, we have to define
the truth-value of a wff φ, in a model M, under an assignment g of values
to φ's free variables.
We shall denote this truth-value as:
val_M φ [g]
If φ is a sentence, its truth-value depends only on the model and we can drop [g].
Note: The assignment g is neither a part of the language, nor of the model.
The following notations are used for assignments.
(I) If x₁, . . . , xₙ are distinct variables, we use:
[x₁/a₁, x₂/a₂, . . . , xₙ/aₙ]
to denote the assignment defined over {x₁, . . . , xₙ}, which assigns each xᵢ the value aᵢ. Ac-
cordingly,
val_M φ [x₁/a₁, x₂/a₂, . . . , xₙ/aₙ]
is the truth-value of φ in M under that assignment.
(II) If g is any assignment of values to some variables, then
g[x/a]
is, by definition, the assignment that assigns to x the value a, and to every other variable the
value that is assigned to it by g.
Note 1: g[x/a] is defined for the following variables: (i) all the variables for which g is defined,
(ii) the variable x. To variables different from x, g[x/a] and g assign the same values. Whether
g is defined for x, or not, does not matter; for in either case g[x/a] assigns to x the value a.
Note 2: In order that val_M φ [g] be defined, g should be defined for all free variables of φ. It
can be also defined for other variables; but as we shall see, the values given to variables not
free in φ play no role in determining φ's truth-value.
Note 3: We use assignment for a function that correlates members of the universe with
variables. Do not confuse this with the truth-value assignment (to be presently defined) which
correlates, with each wff φ, each model M and each suitable assignment g of members of
|M| to variables, the truth-value val_M φ [g].
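In computational terms (an aside of ours), an assignment is naturally a dictionary from variable names to members of the universe, and g[x/a] is a copy of g with the entry for x overridden; g itself is left unchanged, and, as Note 1 observes, g need not have been defined for x at all:

    # g[x/a]: copy g and override (or add) the entry for x.
    g = {"x": 1, "y": 2}
    g_xa = {**g, "x": 7}                  # g[x/7]
    assert g_xa == {"x": 7, "y": 2}       # other variables keep their values
    assert g == {"x": 1, "y": 2}          # g itself is unchanged
    g2 = {"y": 2}                         # g need not be defined for x:
    assert {**g2, "x": 7} == {"x": 7, "y": 2}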
9.1.1 The Truth Definition
val_M φ [g] is defined inductively, starting with atomic wffs and proceeding to more complex
ones.
It is convenient to start by assigning values to terms, i.e., to individual variables and constants.
The value of a term, t, under g, is denoted as: val_M t [g]. It is not a truth-value but a member
of the model, determined as follows:
If t is the individual constant c, then val_M t [g] = c^M.
If t is the variable v, then val_M t [g] = g(v). This value is defined iff g is
defined for v.
Atomic Wffs
val_M P(t₁, . . . , tₙ) [g] = T if (val_M t₁ [g], . . . , val_M tₙ [g]) ∈ P^M,
val_M P(t₁, . . . , tₙ) [g] = F if (val_M t₁ [g], . . . , val_M tₙ [g]) ∉ P^M.
(For atomic sentences, this coincides with the definition given in 7.1.1.)
Note that, since by assumption g is defined for all free variables of P(t₁, . . . , tₙ),
all the values val_M tᵢ [g] are defined.
Sentential Compounds
val_M ¬φ [g] = T if val_M φ [g] = F,
val_M ¬φ [g] = F if val_M φ [g] = T.
If ∗ is a binary connective, then val_M (φ ∗ ψ) [g] is obtained from val_M φ [g] and val_M ψ [g] by
the truth-table of ∗.
(In the last clause g is defined for all free variables of φ ∗ ψ, hence it is defined
for the free variables of φ and for the free variables of ψ.)
Universal Quantifier
val_M ∀xφ [g] = T if, for every a ∈ |M|, val_M φ [g[x/a]] = T,
val_M ∀xφ [g] = F otherwise.
Existential Quantifier
val_M ∃xφ [g] = T if, for some a ∈ |M|, val_M φ [g[x/a]] = T,
val_M ∃xφ [g] = F otherwise.
If the language contains function symbols, then the definition is exactly the same, except that
we have to include in the definition of val_M t [g] inductive clauses for terms containing function
symbols:
val_M f(t₁, . . . , tₙ) [g] = f^M(val_M t₁ [g], . . . , val_M tₙ [g]),
where f^M is the function that interprets the function-symbol f in M.
A Special Kind of Induction: The truth-value of ∀xφ (or of ∃xφ) is determined by
the truth-values of the simpler wff φ, under all possible assignments to the variable x. It is
therefore based on simpler cases whose number is, possibly, infinite. This is a special kind of
high-powered induction that we have not encountered before.
Note: In the clauses for quantifiers, the variable x need not be free in φ. If it is not, then it
can be shown that, for each assignment g, the truth-values of φ, ∀xφ, and ∃xφ are the same.
Note also that φ may contain non-displayed free variables, besides x. Since their values under
g and under any g[x/a] are the same, these values are fixed parameters in the clause.
Understanding the Logical Particles: The clauses for quantifiers employ the expressions
for every and for some. We must understand what these expressions mean in order to grasp
the definition. Just so, we should understand and, either ... or, and if ..., then, in
order to understand what the truth-tables mean. We say, for example: The value of any
sentence is either T or F, or If the value of A is T and the value of B is F, then the value
of A → B is F.
Satisfaction: If val_M φ [g] = T we say that φ is satisfied in M by the assignment g, or
that M and g satisfy φ. We denote this by:
M ⊨ φ[g]
If φ has a single free variable, we say that φ is satisfied by a (where a ∈ |M|), if it is satisfied
by the assignment that assigns a to its free variable. Similarly, if it has two free variables,
we say that it is satisfied by the pair (a, b), if it is satisfied by the assignment that assigns
a to the first free variable and b to the second. Here, of course, we presuppose some agreed
ordering of the variables.
If φ is not satisfied in M by g (i.e., if val_M φ [g] = F) we denote this by:
M ⊭ φ[g]
Ambiguity of ⊨: ⊨ is also used to denote logical implication (and logical truth).
There is no danger of confusion. The symbol denotes satisfaction if the expression to its left
denotes a model; otherwise it denotes logical implication. These uses of ⊨ are traditional
in logic.
Dependence on Free Variables
It can be proved (by induction on the wff) that val_M φ [g] depends only on the values assigned
by g to the free variables of φ. If g and g′ are assignments that assign the same values to all
free variables of φ, but which may differ otherwise, then
val_M φ [g] = val_M φ [g′]
The proof is not difficult but rather tedious; we shall not go into it here. If there are no free
variables, i.e., if φ is a sentence, the truth-value depends only on the model. We can therefore
omit any reference to an assignment, saying (in case of satisfaction) that the sentence is
satisfied, or is true, in M, and denoting this as:
M ⊨ φ
Similarly, val_M φ is the truth-value of the sentence φ in the model M.
Example
Consider a first-order language based on (i) the binary predicate L, (ii) the monadic predicates
H, W and M, (iii) the individual constants c₁ and c₂. Let their ordering be:
L, H, W, M, c₁, c₂
and let M be the model
(U, L, H, W, M, c, d)
where:
U = {c, d, e, f, g, h}
L consists of the pairs:
(c,c), (c,d), (c,f), (d,g), (d,h), (e,e), (e,f), (e,h), (f,c), (f,f), (f,h), (g,c),
(g,d), (g,e), (g,g), (h,e), (h,f)
H = {c, e, f, g}
W = {c, d, e}
M = {f, g, h}
To make this more familiar let the six objects c, d, e, f, g, h be people, three women and three
men:
c = Claire, d = Doris, e = Edith, f = Frank, g = George, h = Harry.
Then, W consists of the women and M of the men. Assume moreover that L is the liking-
relation over U and that H is the subset of happy people, that is, for all x, y ∈ U:
(x, y) ∈ L iff x likes y,
x ∈ H iff x is happy.
Note that Claire and Doris have names in our language: c₁ and c₂, but the other people do
not.
Now let φ be the sentence:
∀u [W(u) → ∃v (M(v) ∧ H(v) ∧ L(u, v))]
It is not difficult to see that, given that the interpretation is M, φ says:
(1) Every woman (in U) likes some man (in U) who is happy.
Applying the truth-definition to φ, we shall now see that the truth of φ in the given model is
exactly what (1) expresses. In other words:
φ is true in M IFF (1)
Obviously φ = ∀u ψ(u), where
ψ(u) = W(u) → ∃v (M(v) ∧ H(v) ∧ L(u, v))
Hence M ⊨ φ iff for every a ∈ U, M ⊨ ψ(u)[u/a]. Now ψ is a conditional:
ψ = W(u) → ∃v θ, where θ = θ(u, v) = M(v) ∧ H(v) ∧ L(u, v)
If a ∉ W then M ⊭ W(u)[u/a] and the antecedent of the conditional gets F; which makes the
conditional true. Hence ψ is satisfied by every assignment [u/a] for which a ∉ W. Therefore,
ψ is true in M iff ψ is also satisfied by all the other assignments, i.e., by all assignments [u/a] in
which a ∈ W. For each of these assignments the antecedent gets T; hence the conditional
gets true iff
M ⊨ ∃v θ [u/a]
And this last wff is satisfied iff there exists b ∈ U such that M ⊨ θ(u, v)[u/a, v/b]; that is, iff there
is b ∈ U such that:
M ⊨ M(v) ∧ H(v) ∧ L(u, v) [u/a, v/b]
The last condition simply means that each of the conjuncts is satisfied by [u/a, v/b], which means
that:
b ∈ M and b ∈ H and (a, b) ∈ L
Summing all this up, we have: φ is satisfied in M iff for every a ∈ U, if a ∈ W, there exists
b ∈ U, such that b ∈ M and b ∈ H and (a, b) ∈ L. Which can be restated as:
(2) For every a in U, if a is a woman, then there exists b in U, such that b is a
man and b is happy and a likes b.
Obviously, (2) is nothing but a detailed rephrasing of (1).
The truth-value of φ can be found by checking, for every woman in W, whether there is in U
a happy man whom she likes. This indeed is the case: c likes f, d likes g, e likes f. Hence,
val_M φ = T.
The same reasoning can be applied to the sentence φ′:
∃v [M(v) ∧ H(v) ∧ ∀u (W(u) → L(u, v))]
φ′, it turns out, asserts that there is a happy man who is loved by all women. It gets the value
F, because the happy men (in U) are f and g; but f is not liked by d and g is not liked by c.
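As an informal cross-check (ours, not the text's), both computations can be done by brute force in Python; the comprehensions all(...) and any(...) mirror for every and for some in the truth-definition. Men stands for the interpretation of the predicate M, renamed to avoid clashing with the name of the model:

    U = {"c", "d", "e", "f", "g", "h"}
    L = {("c","c"), ("c","d"), ("c","f"), ("d","g"), ("d","h"), ("e","e"),
         ("e","f"), ("e","h"), ("f","c"), ("f","f"), ("f","h"), ("g","c"),
         ("g","d"), ("g","e"), ("g","g"), ("h","e"), ("h","f")}
    H = {"c", "e", "f", "g"}
    W = {"c", "d", "e"}
    Men = {"f", "g", "h"}

    # phi: every woman likes some happy man.
    phi = all(any(b in Men and b in H and (a, b) in L for b in U)
              for a in U if a in W)

    # phi': some happy man is liked by all women.
    phi_prime = any(b in Men and b in H and all((a, b) in L for a in W)
                    for b in U)

    print(phi, phi_prime)    # True False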
Homework 9.1 (I) Find the truth-value in the model M of the last example of each of the
following sentences. Justify briefly your answers in regard to 6–10. Do not go into detailed
proofs; justifications for the above φ and φ′ can run as follows:
φ gets T, because for each x in W, there is a y in M, which is also in H,
such that L(x, y): for x = c choose y = f, for x = d choose y = g, and for
x = e choose y = f.
(Here we used L(x, y) as a shorthand for (x, y) ∈ L.)
φ′ gets F, because there is no x that is in M and in H, such that for all
y ∈ W, L(y, x). M ∩ H has two members, f and g; but for x = f, y = d
provides a counterexample, and for x = g, y = c provides it.
1. L(c₁, c₂) ∧ L(c₂, c₁)
2. L(c₁, c₂) ∨ L(c₂, c₁)
3. ∃x (L(c₁, x) ∧ M(x) ∧ L(x, c₁))
4. ∃x (L(c₂, x) ∧ M(x) ∧ L(x, c₂))
5. ∃x (L(x, c₂) ∧ M(x) ∧ L(c₂, x))
6. ∀x [W(x) → ∃y (M(y) ∧ L(x, y) ∧ L(y, x))]
7. ∃x∃y (W(x) ∧ W(y) ∧ x ≠ y ∧ L(x, y))
8. ∃x [W(x) ∧ ∀y (W(y) → L(x, y))]
9. ∃x [W(x) ∧ ∀y (W(y) → L(y, x))]
10. ∀x [H(x) → L(x, x)]
(II) Translate the sentences into correct stylistic English. (This relates to the subject matter
of the previous chapter. Do it after answering (I).)
9.1.2 Defining Sets and Relations by Wffs
The sets and relations defined by wffs in a given interpretation (cf. 8.1.1) can now be described
formally using the concept of satisfaction:
In a given model M, a wff with one free variable defines the set of all members of |M|
that satisfy it. A wff with two free variables defines the relation that consists of all pairs
(a, b) ∈ |M|^2 that satisfy it. And a wff with n free variables defines, in a similar way, an
n-ary relation over |M|. (For arity n > 1, we have to presuppose a matching of the free
variables in the wff with the relation's coordinates.)
Note: A wff with m free variables can be used to define relations of higher arity in which
the additional coordinates are dummy: Say, the free variables of φ occur among v₁, . . . , vₙ
and consider the relation consisting of all tuples (a₁, . . . , aₙ) such that
M ⊨ φ[v₁/a₁, v₂/a₂, . . . , vₙ/aₙ]
The vᵢ's that are not free in φ make for dummy coordinates that have no effect on the tuples
belonging to the relation.
Examples
Consider a first-order language, based on a two-place predicate R, the equality predicate ≈,
and two individual constants: c₁ and c₂.
Let |M| = {0, 1, 2, 3, 4} and let:
R^M = {(0, 1), (0, 2), (0, 3), (1, 2), (1, 4), (2, 2), (4, 1), (4, 3), (4, 4)}
c₁^M = 0,  c₂^M = 3
This model is illustrated below, where an arrow from i to j means that (i, j) is in the relation.
The following obtains.
1. M ⊨ R(c₁, x)[x/a] for a = 1, 2, 3, and M ⊭ R(c₁, x)[x/a] for all other a's.
Hence, R(c₁, x) defines the set {1, 2, 3}.
2. M ⊨ R(y, c₂)[y/a] for a = 0, 4, and M ⊭ R(y, c₂)[y/a] for all other a's.
Hence, R(y, c₂) defines the set {0, 4}.
3. R(x, c₁) defines ∅ (the empty set).
4. M ⊨ ∃xR(c₁, x), because there is a ∈ |M| (e.g., 1) such that M ⊨ R(c₁, x)[x/a].
5. M ⊭ ∀xR(c₁, x), because not all a ∈ |M| are such that M ⊨ R(c₁, x)[x/a]; e.g., a = 0.
6. Hence, M ⊨ ¬∀xR(c₁, x).
7. M ⊨ ∃yR(y, c₂) ∧ ¬∀yR(y, c₂).
8. M ⊨ ∀x (R(c₁, x) ∨ R(x, c₂)), because, as you can verify by direct checking, for all
a ∈ |M|: M ⊨ (R(c₁, x) ∨ R(x, c₂))[x/a].
9. M ⊨ ∃z (z ≠ c₁ ∧ z ≠ c₂ ∧ R(z, z)).
10. M ⊨ ∃u∃v (u ≠ v ∧ R(u, u) ∧ R(v, v)).
11. M ⊨ ∀x [R(x, x) → ∃y (y ≠ x ∧ R(y, x))].
Because, for all a ∈ |M|, M ⊨ [R(x, x) → ∃y (y ≠ x ∧ R(y, x))][x/a]; namely, if a = 0, 1, 3
the antecedent gets the value F under the assignment, hence the conditional gets T;
if a = 2, M ⊨ ∃y (y ≠ x ∧ R(y, x))[x/2], because M ⊨ (y ≠ x ∧ R(y, x))[x/2, y/1], and a
similar argument works for a = 4.
12. ∀x (R(x, y) → R(y, x)) defines the set {0, 4}.
If we read R(u, v) as: u points to v, then the last wff can be read as: Every member
that points to y is pointed to by y. 0 satisfies it vacuously, because no member points
to 0.
13. The wff x ≈ c₂ defines the set {3}.
14. The wff ∃y (y ≠ x ∧ R(y, y) ∧ R(y, x)) ∧ ∃zR(x, z) defines the set {1}.
Had we not included y ≠ x in the last wff, both 4 and 2 would have satisfied it (can
you see why?). Had we not included the conjunct ∃zR(x, z), 3 would have satisfied it.
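Again by way of illustration (assuming the readings of wffs 13 and 14 given above), the defined sets can be computed as Python comprehensions over the universe:

    U = {0, 1, 2, 3, 4}
    R = {(0,1), (0,2), (0,3), (1,2), (1,4), (2,2), (4,1), (4,3), (4,4)}
    c2 = 3

    set13 = {a for a in U if a == c2}   # x = c2
    # Ey (y != x & R(y,y) & R(y,x)) & Ez R(x,z):
    set14 = {a for a in U
             if any(y != a and (y, y) in R and (y, a) in R for y in U)
             and any((a, z) in R for z in U)}
    print(set13, set14)                 # {3} {1}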
Repeated Quantifier Notation: We use
∀x₁, x₂, . . . , xₙ φ,  ∃x₁, x₂, . . . , xₙ φ
as abbreviations for:
∀x₁∀x₂ . . . ∀xₙ φ,  ∃x₁∃x₂ . . . ∃xₙ φ
Homework
9.2 Consider a language with equality whose non-logical vocabulary consists of: a binary
predicate, S, a monadic predicate, P, and one individual constant a.
Let Mᵢ, where i = 1, 2, 3, be models for this language with the same universe U:
Mᵢ = (U, Sᵢ, Pᵢ, aᵢ)
Assume that U = {1, 2, 3, 4} and that the relations Sᵢ and Pᵢ and the object aᵢ (which
interpret, respectively, S, P, and a) are as follows:
S₁ = {(1, 2), (1, 3), (1, 4), (2, 3), (2, 4), (3, 4)},  P₁ = {1, 4},  a₁ = 4,
S₂ = S₁ ∪ {(b, b) : b ∈ U},  P₂ = ∅,  a₂ = 1,
S₃ = {(1, 2), (1, 3), (1, 4), (2, 3), (3, 4), (4, 2)},  P₃ = {2, 3, 4},  a₃ = 1.
Find, for each of the following sentences, which of the three models satisfy it. Justify briefly
your answers (cf. Homework 9.1). You might find little drawings representing the model, or
parts of it, useful; especially in the case of S₃.
1. ∀v∃uS(u, v)
2. ∃v∀u (u ≠ v → S(v, u))
3. ∀u, v [S(v, u) → P(v)]
4. ∀u (S(a, u) → P(u))
5. ∀u, v [S(a, u) ∧ S(u, v) → S(a, v)]
6. ∀u, v, w [S(u, v) ∧ S(v, w) → S(u, w)]
7. ∀u [P(u) → ∃vS(u, v)]
8. ∀u, v, w [P(u) ∧ S(u, v) ∧ S(u, w) → v ≈ w]
9.3 Write down (in the list notation) the sets defined by each of the following wffs in each
of the models Mᵢ, i = 1, 2, 3 of 9.2.
1. S(a, v)
2. S(v, a)
3. P(v) ∧ v ≈ a
4. P(v) ∨ v ≈ a
5. ∃uS(u, v)
6. ∃uS(v, u)
7. ∀u (P(u) → S(v, u))
8. ∃u₁, u₂ [u₁ ≠ u₂ ∧ S(u₁, v) ∧ S(v, u₂)]
9. ∃u (P(u) ∧ u ≠ v ∧ S(u, v))
9.2 Logical Implications in FOL
9.2.0
The scheme that defines logical implication for sentential logic defines it also for FOL:
A set of sentences, given as a list Γ, logically implies the sentence φ, if there
is no possible interpretation in which all the members of Γ are true and φ is
false.
What characterizes implication in each case is the concept of a possible interpretation and the
way interpretations determine truth-values. Rephrasing the definition in our present terms,
we can say that, for a given first-order language, Γ logically implies φ if there is no FOL
model that satisfies all members of Γ but does not satisfy φ. Furthermore, a sentence is
logically true if it is satisfied in all models, logically false if it is satisfied in none.
The concepts extend naturally to the case of wffs; we have to throw in assignments of objects
to variables, since the truth-values depend also on such assignments:
A premise-list Γ of wffs logically implies a wff φ, if there is no model M (for the language
in question) and no assignment g (of values to the variables occurring freely in the wffs
of Γ and φ) which satisfy all members of Γ, but do not satisfy φ.
A wff is logically true if it is satisfied in all models under all assignments of values to
its free variables. (Or, equivalently, if it is logically implied by the empty premise-list.)
A wff is logically false if it is not satisfied in any model under any assignment of values
to its free variables.
The definitions for sentences are particular cases of the definitions for wffs.
Satisfiable Sets of Wffs
A set of wffs is satisfiable in the model M if there is an assignment of values (to the free
variables of its wffs) which satisfies all wffs in the set. If the wffs are sentences, this simply
means that all the sentences are true in M.
A set of wffs is satisfiable if it is satisfiable in some model.
A set of wffs which is not satisfiable is described as logically inconsistent. Note that this
accords with the previous usage of that term in sentential logic (cf. 3.4.3).
Obviously, a wff is satisfiable just when it is not logically false.
As before, we use:
Γ ⊨ φ
to say that the premise-list Γ logically implies the wff φ. (Recall the double usage of ⊨!)
If the premise-list is empty, this means that φ is a logical truth:
⊨ φ
We have used ⊥ to denote some unspecified contradiction (cf. 4.4.0). We adopt this notation
also for FOL.
Following the reasoning used for sentential logic (cf. 4.4.0), we see that
Γ ⊨ ⊥
means that Γ is not satisfiable. And the same reasoning also implies:
Γ ⊨ φ  iff  Γ, ¬φ ⊨ ⊥
Logical Equivalence
From logical implication we can get logical equivalence. Using, as before, ≡, we can define it
by:
φ ≡ ψ  iff  φ ⊨ ψ and ψ ⊨ φ
Obviously, φ ≡ ψ, iff φ and ψ have the same truth-value in every model under every
assignment of values to the variables that are free in φ and in ψ.
From now on, unless indicated otherwise, implication and equivalence mean logical impli-
cation and logical equivalence.
9.2.1 Proving Non-Implications by Counterexamples
We use Γ ⊭ φ to say that Γ does not imply φ. It means that there is some model and some
value-assignment to the variables, such that all members of Γ are satisfied but φ is not. Such
a model and assignment constitute a counterexample to the implication claim. Here are some
non-implication claims that are proved by counterexamples.
(1) ∃xP(x) ⊭ ∀xP(x)
Proof: Consider a model, M, whose universe contains at least two members,
such that P^M is neither empty nor the whole universe. Since P^M ≠ ∅, M ⊨
∃xP(x). Since P^M ≠ |M|, M ⊭ ∀xP(x).
QED
(2) ⊭ ∀y∃xR(x, y) → ∃x∀yR(x, y)
Proof: Consider the following model.
M = (U, R), where U = {0, 1}, R = {(0, 0), (1, 1)}
Since M ⊨ R(x, y)[x/0, y/0], we have: M ⊨ ∃xR(x, y)[y/0]. An analogous
argument shows that M ⊨ ∃xR(x, y)[y/1]. Since 0 and 1 are the only mem-
bers, M ⊨ ∀y∃xR(x, y).
On the other hand, M ⊭ R(x, y)[x/0, y/1]. Hence M ⊭ ∀yR(x, y)[x/0]. By a
similar argument M ⊭ ∀yR(x, y)[x/1]. Therefore M ⊭ ∃x∀yR(x, y). Since
the antecedent (of the sentence in (2)) is true in M, but the consequent is
false, the sentence is false.
QED
Let us modify the sentence of (2) a bit:
∀x∃y (x ≠ y ∧ R(x, y)) → ∃y∀x (x ≠ y → R(x, y))
It is not difficult to see that that sentence is satisfied in the last counterexample, since the
antecedent is false. Still, that sentence is not a logical truth. To prove this consider the
model M = (U, R) where:
U = {0, 1, 2}, R = {(0, 1), (1, 2), (2, 0)}
It is not difficult to see that the antecedent is true in this model: for every member a there
is a different member b such that (a, b) ∈ R. But there is no member b such that (a, b) ∈ R
for all a ≠ b. Hence the consequent is false.
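Counterexamples of this kind can also be found mechanically: on a small universe one can enumerate every binary relation and test antecedent and consequent. The following Python sketch (ours, purely illustrative) recovers the counterexample used for (2):

    from itertools import chain, combinations

    def subsets(s):
        s = list(s)
        return chain.from_iterable(combinations(s, k) for k in range(len(s) + 1))

    U = {0, 1}
    pairs = [(a, b) for a in U for b in U]
    for R in map(set, subsets(pairs)):
        antecedent = all(any((a, b) in R for a in U) for b in U)  # Ay Ex R(x,y)
        consequent = any(all((a, b) in R for b in U) for a in U)  # Ex Ay R(x,y)
        if antecedent and not consequent:
            print("counterexample:", R)   # finds {(0, 0), (1, 1)}
            break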
Homework 9.4 Prove the following negative claims by constructing small-size models (as
small as you can) such that the premise is satisfied, but the conclusion is not.
You also have to assign values to the free variables occurring in the implication, if there are
any. (Note that the same variable can have free and bound occurrences.)
1. ∀x∃yS(x, y) ⊭ ∃xS(x, x)
2. ∃x, y (P(x) ∧ R(y)) ⊭ ∃x (P(x) ∧ R(x))
3. ∀x (P(x) ∨ R(x)) ⊭ ∀xP(x) ∨ ∀xR(x)
4. (∃x (P(x) → R(x))) ∧ P(x) ⊭ R(x)
5. ∃x∀yS(x, y) ⊭ ∃yS(x, y)
6. ∀xS(x, u) ∧ ∀yS(v, y) ⊭ S(u, v)
7. ∀u, v {u ≠ v → [(S(u, u) ∧ S(v, v)) ∨ (S(u, v) ∧ S(v, u))]} ⊭ ∃uS(u, u)
8. ∀u∃v (v ≠ u ∧ S(u, v)) ⊭ ∀u, v (u ≠ v → (S(u, v) ∨ S(v, u)))
9.2.2 Proving Implications by Direct Semantic Arguments
Sometimes it is easy to show that a logical implication holds by a direct semantic argument.
Here are some examples.
(3) ∀xφ ⊨ ∃xφ
Proof: Assume that M ⊨ ∀xφ [g].
By the truth-definition of universally quantified wffs this means that:
M ⊨ φ[g[x/a]], for all a ∈ |M|.
By definition, a model's universe is never empty. Take any a ∈ |M|. Since
M ⊨ φ[g[x/a]], the truth-definition for existential generalization implies:
M ⊨ ∃xφ [g]
QED
(3) is intuitively obvious. The next is less so, but not very difficult.
(4) ⊨ ∃x∀yφ → ∀y∃xφ
Proof: (4) is equivalent to:
∃x∀yφ ⊨ ∀y∃xφ
To simplify the notation, we can ignore φ's free variables other than x and y;
whatever they are, their values remain fixed throughout. We can therefore
leave g out. (If needed, the notation can be filled by tacking g on, as in
the proof of (3).)
Assume that
M ⊨ ∃x∀yφ
Then some member of |M| satisfies ∀yφ. Let a be such a member:
M ⊨ ∀yφ[x/a]. Then,
for every b ∈ |M|, M ⊨ φ[x/a, y/b]
But for each b, M ⊨ φ[x/a, y/b] implies M ⊨ ∃xφ[y/b]. Therefore:
for every b ∈ |M|, M ⊨ ∃xφ[y/b]
But this implies that:
M ⊨ ∀y∃xφ
QED
The negative claims (1) and (2) show that the implication in (3) and the conditional in (4)
cannot, in general, be reversed.
(5) ∀x∀yφ ≡ ∀y∀xφ
Proof: Consider any model M. Again we ignore any free variables different
from x and y.
Assuming that M ⊨ ∀x∀yφ, we show that M ⊨ ∀y∀xφ. Our assumption
means that:
for all a ∈ |M|: M ⊨ ∀yφ [x/a],
which implies that:
for all a ∈ |M|: for all b ∈ |M|: M ⊨ φ [x/a, y/b].
Therefore, for any b ∈ |M|:
M ⊨ φ [x/a, y/b], for all a ∈ |M|.
Hence,
M ⊨ ∀xφ [y/b], for all b ∈ |M|.
Which implies:
M ⊨ ∀y∀xφ
This shows that the left-hand side in (5) implies the right-hand side. Since
the situation is symmetric, the reverse implication holds as well.
QED
Here is an example involving a free variable. Unlike the previous examples, it is not of general
significance; its interest lies in the fact that the logical truth of the wff is far from clear at
first glance.
(6) ⊨ ∃y [R(x, y) → ∃u∀vR(u, v)]
Proof: We have to show that, given any model M and any member a of |M|,
we have:
M ⊨ ∃y [R(x, y) → ∃u∀vR(u, v)] [x/a]
We have therefore to show the existence of b ∈ |M| such that:
(6′) M ⊨ (R(x, y) → ∃u∀vR(u, v)) [x/a, y/b]
Now, if for some b ∈ |M|, M ⊭ R(x, y)[x/a, y/b], then for this b the conditional
in (6′) is true under the given assignment, because the antecedent is false.
Remains the case in which there is no such b, that is:
for every b ∈ |M|, M ⊨ R(x, y)[x/a, y/b]
Since M ⊨ R(x, y)[x/a, y/b] iff M ⊨ R(u, v)[u/a, v/b], this can be rephrased as:
for every b ∈ |M|, M ⊨ R(u, v)[u/a, v/b]
Hence,
M ⊨ ∀vR(u, v) [u/a]
implying:
M ⊨ ∃u∀vR(u, v)
In this case (6′) holds for all b ∈ |M|, because, independently of b, the
consequent in the conditional gets T. QED
In a direct semantic argument we appeal directly to the truth-definition. In doing so we use
the concepts every and there exists, as we understand them in our English discourse. As
remarked before, the truth-definition makes explicit and precise concepts that are already
understood. It does not create them out of nothing.
As the last illustration, we prove what is known as De Morgan's law for universal quantification.
(7) ¬∀xφ ≡ ∃x¬φ
Proof: ¬∀xφ is satisfied in a model M just when ∀xφ is not satisfied there,
i.e., when it is not the case that, for all values of x, φ is satisfied. This is
equivalent to saying that, for some value of x, φ is not satisfied, i.e., ¬φ is
satisfied. Thus, ¬∀xφ is satisfied iff there exists a value of x for which ¬φ
is satisfied, which is equivalent to the satisfaction of ∃x¬φ.
The technique that we used for sentential logic can be carried over, without change, to FOL:
All the general sentential-logic laws for establishing logical equivalences and
implications are valid for FOL.
This means that you can use without restrictions the laws given in chapters
2 and 3, with sentential variables replaced everywhere by any FOL wffs.
The reason for this is that the sentential laws concern only sentential compounds and derive
solely from the truth-tables of the connectives. They remain in force, no matter what other
units we have.
The sentential laws are adequate for establishing tautological equivalences and tautological
implications, but not equivalences and implications that depend on the meaning of the quan-
tifiers. We therefore supplement them with appropriate quantifier laws. The following table,
in 9.2.3, lists the basic equivalence laws that involve quantifiers.
9.2.3 Equivalence Laws and Simplifications in FOL
Commutativity of Quantifiers of the Same Kind
∀x∀yφ ≡ ∀y∀xφ        ∃x∃yφ ≡ ∃y∃xφ
Distributivity of Quantifier Over Appropriate Connective
∀x(φ ∧ ψ) ≡ ∀xφ ∧ ∀xψ        ∃x(φ ∨ ψ) ≡ ∃xφ ∨ ∃xψ
De Morgan's Laws for Quantifiers
¬∀xφ ≡ ∃x¬φ        ¬∃xφ ≡ ∀x¬φ
If x is not free in φ:
∀xφ ≡ φ        ∃xφ ≡ φ
∀x(φ ∨ ψ) ≡ φ ∨ ∀xψ        ∃x(φ ∧ ψ) ≡ φ ∧ ∃xψ
Changing Bound Variables
φ ≡ φ′   if φ′ results from φ by legitimate bound-variables substitution
Two of these equivalences have been proved in the last subsection: (5) is the commutativity
of universal quantifiers, and (7) is De Morgan's law for universal quantifiers. The others are
provable by similar semantic arguments. Here for example is the argument showing that if x
is not free in φ then:
∀x (φ ∨ ψ) ≡ φ ∨ ∀xψ
We have to show that, for every model M and for every assignment g of
values to the free variables, the two sides have the same truth-values. Since
x is not free in φ, the two sides have the same free variables and x is not
among them. The argument does not depend on the values assigned by g to
variables other than x; they appear as fixed parameters. Only the possible
values of x play a role, when we apply the truth definition to wffs of the
form ∀x(. . . x . . .).
First assume that φ is true (in M, under g). Then the right-hand side is
true. Since x is not free in φ, the truth-value of φ does not depend on the
value assigned to x. Hence, for all a ∈ |M|, val_M φ [g[x/a]] = T. Therefore, for
all a ∈ |M|, val_M (φ ∨ ψ) [g[x/a]] = T. This implies that the left-hand side is
true.
Remains the case that φ is false. Then, the truth-value of the right-hand
side is the value of the second disjunct: ∀xψ. Suppose it is T. Then, for all
a ∈ |M|: val_M ψ [g[x/a]] = T, hence also val_M (φ ∨ ψ) [g[x/a]] = T. Therefore the
left-hand side is true as well.
If, on the other hand, ∀xψ gets F, then for some a ∈ |M|, val_M ψ [g[x/a]] = F.
Since x is not free in φ, the assignment of a to x has no effect on φ's truth-
value (which, by assumption, is F). Hence under this assignment φ ∨ ψ gets
F. Therefore the value of ∀x (φ ∨ ψ) is F. QED
In sentential logic we have a substitution law for logically equivalent sentences (cf. 2.2.1 and
3.1). This law is now generalized to FOL:
If ψ ≡ ψ′, ψ is a subformula of φ, and φ′ is obtained from φ by substituting
one or more occurrences of ψ by ψ′, then φ ≡ φ′.
Note that the notion of subformula is much wider than that of a sentential component. For
example, φ is a subformula of ∀v (φ → ψ), but not a sentential component of it. Applying
the substitution law we get, for example:
If φ ≡ φ′ then ∀v (φ → ψ) ≡ ∀v (φ′ → ψ)
Intuitively, the substitutivity of equivalents is clear: If ψ ≡ ψ′, then in every model and under
any assignment of values to the variables, ψ and ψ′ have the same truth-value. Therefore,
substituting ψ by ψ′ in any wff φ cannot affect φ's truth-value. A rigorous, but rather tedious,
proof (which proceeds by induction on φ) can be given; we leave it at that.
Example: Applying FOL equivalence laws, substitutivity of equivalents and the tools of
sentential logic, we show that the following is logically true:
(a) ∀x(φ → ψ) → (∀xφ → ∀xψ)
The wff has the form φ₁ → (φ₂ → φ₃). By sentential logic it is equivalent to:
(b) [∀x(φ → ψ) ∧ ∀xφ] → ∀xψ
Using the distributivity of ∀ over ∧ (in the right-to-left direction), we can replace the subfor-
mula:
∀x(φ → ψ) ∧ ∀xφ
by the logically equivalent
∀x [(φ → ψ) ∧ φ]
(b) is thereby transformed into the logically equivalent:
(c) [∀x ((φ → ψ) ∧ φ)] → ∀xψ
By sentential logic:
(φ → ψ) ∧ φ ≡ φ ∧ ψ
Hence, we can substitute in (c) and get the logically equivalent:
(d) ∀x(φ ∧ ψ) → ∀xψ
Applying again distributivity of ∀ over ∧ (in the left-to-right direction) we can substitute
∀x(φ ∧ ψ) by the equivalent ∀xφ ∧ ∀xψ. All in all, the following is logically equivalent to (d):
(e) (∀xφ ∧ ∀xψ) → ∀xψ
But (e) is obviously a tautology. Hence, (a), which is logically equivalent to it, is logically
true.
Note that (a) is not a tautology; it is logically equivalent to (e), but the equivalence is not
tautological.
Pushing Negation Inside: By applying De Morgan's quantifier laws (in the left-to-right
directions) we can push negation inside quantifiers. As we do so, we have to toggle ∀ and
∃. Combining this with the pushing-in technique of sentential logic (cf. 3.1.1) we can push
negation inside all the way. In the end negation applies to atomic wffs only. Here is an
example. Each wff is equivalent to the preceding one. Find for yourself how each step is
achieved.
¬∀x [P(x, c) → ∃yR(x, y)]
∃x ¬[P(x, c) → ∃yR(x, y)]
∃x {¬[¬P(x, c) ∨ ∃yR(x, y)]}
∃x {¬¬P(x, c) ∧ ¬∃yR(x, y)}
∃x [P(x, c) ∧ ¬∃yR(x, y)]
∃x [P(x, c) ∧ ∀y¬R(x, y)]
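The rewriting is entirely mechanical, as the following Python sketch makes explicit (our illustration, using the tuple encoding of wffs from the sketch in 9.1.1; the name push_neg is ours). Besides the De Morgan quantifier laws it applies the sentential rules ¬¬φ ≡ φ, ¬(φ ∧ ψ) ≡ ¬φ ∨ ¬ψ, ¬(φ ∨ ψ) ≡ ¬φ ∧ ¬ψ and ¬(φ → ψ) ≡ φ ∧ ¬ψ; an un-negated conditional is rewritten as ¬φ ∨ ψ along the way.

    # push_neg(phi): drive negation inward until it applies to atoms only.
    def push_neg(phi):
        op = phi[0]
        if op == "atom":
            return phi
        if op in ("and", "or"):
            return (op, push_neg(phi[1]), push_neg(phi[2]))
        if op == "imp":                       # p -> q  ==  ~p v q
            return ("or", push_neg(("not", phi[1])), push_neg(phi[2]))
        if op in ("all", "ex"):
            return (op, phi[1], push_neg(phi[2]))
        psi = phi[1]                          # op == "not"
        if psi[0] == "atom":
            return phi                        # negation now sits on an atom
        if psi[0] == "not":                   # ~~p == p
            return push_neg(psi[1])
        if psi[0] == "and":                   # ~(p & q) == ~p v ~q
            return ("or", push_neg(("not", psi[1])), push_neg(("not", psi[2])))
        if psi[0] == "or":                    # ~(p v q) == ~p & ~q
            return ("and", push_neg(("not", psi[1])), push_neg(("not", psi[2])))
        if psi[0] == "imp":                   # ~(p -> q) == p & ~q
            return ("and", push_neg(psi[1]), push_neg(("not", psi[2])))
        if psi[0] == "all":                   # ~Ax p == Ex ~p
            return ("ex", psi[1], push_neg(("not", psi[2])))
        if psi[0] == "ex":                    # ~Ex p == Ax ~p
            return ("all", psi[1], push_neg(("not", psi[2])))

    # The chain above: ~Ax [P(x,c) -> Ey R(x,y)]
    phi = ("not", ("all", "x", ("imp",
           ("atom", "P", ("var", "x"), ("const", "c")),
           ("ex", "y", ("atom", "R", ("var", "x"), ("var", "y"))))))
    print(push_neg(phi))
    # ('ex', 'x', ('and', P-atom, ('all', 'y', ('not', R-atom))))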
Note: It is not difficult to see that universal and existential quantifiers play roles parallel to
those of conjunction and disjunction. If the model is finite and all its universe members have
names in the language, we can express universal quantification as a conjunction, existential
quantification as a disjunction: Say, c₁, . . . , cₙ are constants that denote all the universe
members; then, in that particular interpretation, ∀xφ(x) has the same truth-value as:
φ(c₁) ∧ . . . ∧ φ(cₙ)
and ∃xφ(x) the same truth-value as:
φ(c₁) ∨ . . . ∨ φ(cₙ)
This observation can provide some insight into the logic of the quantifiers.
Expressing the Quantifiers in Terms of Each Other: ∀ and ∃ are said to be duals
of each other. The following two equivalences, which are easily derivable from De Morgan's
laws, show how each quantifier is expressible in terms of its dual.
∀xφ ≡ ¬∃x¬φ        ∃xφ ≡ ¬∀x¬φ
Having negation, we can base FOL on either the universal or the existential quantifier. The
choice of including both is motivated by considerations of convenience and structural symme-
try.
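For a single monadic predicate the first duality, ∀xφ ≡ ¬∃x¬φ, can even be confirmed exhaustively on all small models, since such a model is just a universe together with a subset interpreting P. A small Python check (ours, purely illustrative):

    from itertools import chain, combinations

    def subsets(s):
        s = list(s)
        return chain.from_iterable(combinations(s, k) for k in range(len(s) + 1))

    for n in (1, 2, 3):
        U = set(range(n))
        for P in map(set, subsets(U)):
            lhs = all(a in P for a in U)             # Ax P(x)
            rhs = not any(a not in P for a in U)     # ~Ex ~P(x)
            assert lhs == rhs
    print("duality confirmed on all models of size <= 3")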
Homework 9.5 Derive the following equivalences, according to the indications. You can
employ, in addition, the full apparatus of sentential logic and substitutions of equivalent
subformulas.
1. De Morgan's law for ∃ from De Morgan's law for ∀.
2. The laws for expressing one quantifier in terms of another from De Morgan's laws for
quantifiers.
3. The distributivity law of ∀ over ∧, by a direct semantic argument. (You may ignore,
for the sake of simplicity, free variables other than x.)
4. The distributivity law of ∃ over ∨ from the dual law for ∀ and ∧ and the law for
expressing ∃ in terms of ∀.
5. The law for ∃x [φ ∧ ψ], where x is not free in φ, by a direct semantic argument.
6. The law for ∀x [φ ∨ ψ], where x is not free in φ, using the dual law for ∃ and ∧ and
expressing ∀ in terms of ∃.
7. ∀x(φ → ψ) ≡ (∃xφ) → ψ, where x is not free in ψ, using the laws in the framed box.
9.3 The Top-Down Derivation Method for FOL Implications
9.3.0
The apparatus developed so far may carry us a good way, but is not sufficient for establishing
all logical implications in FOL. We can add to our stock further equivalences and logical
truths. For example:
(1) ⊨ ∀xφ(x) → φ(t), where φ(t) is obtained from φ(x) by a legitimate
substitution of the term t for the free x.
We shall pursue a different strategy. We extend to FOL the top-down derivation method of
sentential logic, given in chapter 4 (cf. 4.3 and 4.4). The result is an adequate system for
establishing all first-order logical implications.
9.3.1 The Implication Laws for FOL
Notation: Consider a premise-list Γ. If Γ = φ₁, . . . , φₙ, define:
S^v_c Γ = S^v_c φ₁, . . . , S^v_c φₙ
In words: S^v_c Γ is obtained from Γ by substituting, in every wff of Γ, every free occurrence of
v by the individual constant c. (Note that these substitutions are always legitimate, since c
is not a variable.)
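The operation S^v_c is plain textual substitution that respects bound occurrences. On the tuple encoding of wffs used in our earlier sketches it takes a few lines of Python (again an illustration of ours, not part of the formal system):

    # subst(phi, v, c): replace every free occurrence of variable v by
    # the constant c; occurrences bound by a quantifier on v stay put.
    def subst(phi, v, c):
        op = phi[0]
        if op == "atom":
            new_args = tuple(("const", c) if t == ("var", v) else t
                             for t in phi[2:])
            return (op, phi[1]) + new_args
        if op in ("all", "ex"):
            if phi[1] == v:            # v is bound here: nothing free below
                return phi
            return (op, phi[1], subst(phi[2], v, c))
        if op == "not":
            return (op, subst(phi[1], v, c))
        return (op, subst(phi[1], v, c), subst(phi[2], v, c))  # binary connectives

    # S^x_c applied to Ey R(x, y) yields Ey R(c, y):
    phi = ("ex", "y", ("atom", "R", ("var", "x"), ("var", "y")))
    print(subst(phi, "x", "c"))
    # ('ex', 'y', ('atom', 'R', ('const', 'c'), ('var', 'y')))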
New Constants: An individual constant c is said to be new for the wff φ, if it does not
occur in φ. It is new for Γ, where Γ is a list of wffs, if it is new for all the wffs in Γ.
We shall use the proofs-by-contradiction variant of the top-down derivation (given, for sen-
tential logic, in 4.4.0 and 4.4.1), which leads to the most economical system.
First, we include all the laws for sentential logic, where the sentences are replaced by arbitrary
wffs. To these we add the laws listed in the following table.
Substitution of Free Variables by New Constants
Γ ⊨ ⊥  iff  S^v_c Γ ⊨ ⊥
where c is any individual constant new for Γ
Universal and Existential Quantification
(∀, ⊨)  Γ, ∀xφ ⊨ ⊥  iff  Γ, ∀xφ, S^x_c φ ⊨ ⊥
where c is any individual constant
(∃, ⊨)  Γ, ∃xφ ⊨ ⊥  iff  Γ, S^x_c φ ⊨ ⊥
where c is any individual constant new for Γ and ∃xφ
Negated Quantifications
(¬∀, ⊨)  Γ, ¬∀xφ ⊨ ⊥  iff  Γ, S^x_c ¬φ ⊨ ⊥
where c is any individual constant new for Γ and ¬∀xφ
(¬∃, ⊨)  Γ, ¬∃xφ ⊨ ⊥  iff  Γ, ¬∃xφ, S^x_c ¬φ ⊨ ⊥
where c is any individual constant
For Languages with Equality: If the language contains equality, add two equality laws,
(EQ) and (ES) of 7.2.1. The wffs are, of course, any wffs of FOL.
For Languages with Function Symbols: The laws are the same, except that the laws
that cover all individual constants are extended, so as to cover all constant terms (terms
containing no variables). This means that in (∀, ⊨) and (¬∃, ⊨), we replace c by t, where
t is any constant term. The laws by which new individual constants are introduced remain
the same.
If the language contains both equality and function symbols, then the equality laws are
similarly extended. In (EQ) c is replaced by t, and in (ES) c and c′ are replaced by t and
t′, respectively, where t and t′ are any constant terms.
The top-down method of proof for FOL is a direct extension of the sentential case. An
implication of the form Γ ⊨ φ is reduced to the equivalent implication Γ, ¬φ ⊨ ⊥. Then,
applying the implication laws in the right-to-left direction, one keeps reducing goals to other
goals until all goals are reduced to a bunch of self-evident implications: those whose premises
contain a wff and its negation. The method is adequate: Every logical implication of FOL
can be established by it.
The Validity of the FOL Laws
Here validity is not the technical term used for implication schemes in sentential logic. When
we say that a law is valid we mean that it is true as a general law.
The validity of (∀, ⊨) is the easiest. It follows from the fact that the two premise lists:
Γ, ∀xφ    and    Γ, ∀xφ, S^x_c φ
are equivalent. The second premise-list is obtained from the first by adding S^x_c φ, which is a
logical consequence of it:
(3) ∀xφ ⊨ S^x_c φ
((3) is a special case of (1) in 9.3.0, where t is the term c.) A formal proof of (3) is obtainable
from:
Lemma 1: Let M be a model such that c^M = a. Then, for any wff φ:
M ⊨ φ[g[x/a]]  iff  M ⊨ S^x_c φ [g]
Intuitively, lemma 1 is obvious: φ says about the value of x (under g[x/a]) what S^x_c φ says
about the denotation of c; if the value and that denotation are the same, replacing the free
occurrences of x by c should make no difference.
Formally, the proof proceeds by induction on φ, starting with atomic wffs and working up
to more complex ones.¹ We shall not go into it here. Lemma 1 implies (3): Assume that
M ⊨ ∀xφ; then, for all a ∈ |M|, M ⊨ φ[x/a] (here we ignore other free variables of φ;
their values play no role in the argument). Choosing a as c^M and applying lemma 1, we get:
M ⊨ S^x_c φ.
The laws for negated quantifications are easily obtainable from the quantification laws by
pushing negation inside. The remaining ones are the substitution-by-new-constant law and (∃,
⊨). Their proofs rely on:
Lemma 2: Consider a first-order language, which is interpreted in M. Assume that φ is a
wff in that language and let c be an individual constant new for φ. Let a ∈ |M| and let M′
¹ All the steps are straightforward. In the passage from a wff, ψ, to its generalization ∀vψ, or ∃vψ, we can
assume that the quantified variable is different from x; else, x is not free in the resulting wff, hence
∀vψ = S^x_c(∀vψ)