Introduction To Logic (Worrall, J)
BY JOHN WORRALL
A: TRUTH-FUNCTIONAL LOGIC

A1: INTRODUCTION: LOGIC IS ABOUT REASONING OR ARGUING
If I knew about the Watergate Caper, what am I doing in the White House?
The cartoonist in this thankfully dated cartoon is implicitly landing Richard Nixon
with an argument (or train of reasoning), one that condemns him from his own
mouth. More explicitly (and therefore draining it of any semblance of humour), the
argument is that, since there are only two cases (that Nixon knew about the Watergate
affair and that he didn't), and since in either case there would be grounds (different in
the two cases, of course) for inferring that Nixon was unworthy of his presidential office,
it follows that Nixon was indeed unworthy of his office. An argument consists of citing
certain premises and showing (or claiming) that a certain conclusion follows from
them. The premises here are that Nixon was unworthy of his office in either the case
that he knew about the break-in or the case that he didn't. The conclusion is that he is
indeed unworthy of his office.
We argue (in the intellectual rather than the falling out sense) or reason or infer or
make deductions all the time. This is true both in intellectual disciplines and, if often
rather more loosely, in everyday life. For example, a scientist tests a particular theory
by reasoning that if that theory is true then some other claim, one that can be checked
observationally or experimentally, must also be true; that is, that some
observationally checkable claim follows from the theory. For instance, Newton tested
his theory of universal gravitation by inferring what followed from that theory about
the motions of the planets: in particular, that they describe (roughly) elliptical orbits
around the sun. Einstein's general theory of relativity was tested by showing that you
could infer from it that the stars would appear to be different distances apart during
the daytime than they were during the night (because of the effect of the sun on the
trajectory of the rays of light from the stars). This prediction could only be tested in the
special circumstances in which the stars are visible during the daytime: during a total
solar eclipse. When Eddington carried out the test, Einstein's prediction turned out to
be correct. This testing process is an essential part of science in general. And of social
science too: the Treasury tests its (theoretical) model of the economy by working out
what it implies (what follows from it) about (observable) changes in the real UK
economy.
Logic also plays a crucial role in mathematics. Mathematics is centrally concerned with
proofs, which are in fact inferences or deductions or arguments. In formal
mathematics, certain axioms are laid down (for example Euclid's axioms of geometry:
basic assumptions that are accepted as givens, such as the parallel postulate, usually
stated as: given a line AB and a point C outside the line, there is one and only one line
that goes through C and is parallel to AB) and proofs consist of showing that certain
other assertions (theorems: for example, the theorem that the internal angles of any
triangle sum to 180°) follow from, or can be inferred from, those axioms.
Coming closer to more practical concerns, defence lawyers argue for the innocence of
their clients, politicians argue for their policies, and, more mundanely, we reason, or
make inferences, all the time, though we don't always think of it in that way. Suppose
you wake up after an especially heavy night on the town and find yourself unable to
remember what day it is. You might eventually reason: 'Well, yesterday was Saturday,
so today must be Sunday.' This is a very basic inference, but it does fit the standard
pattern; you eventually dredge up from your alcohol- (or other recreational drug-)
soaked memory a premise (that yesterday was Saturday) and you make a very
straightforward inference to the conclusion that today is Sunday. (Of course you are
also implicitly assuming other premises, like that you were not so drunk that you
slept for more than 24 hours!)
Finally, we reason in this way (that is, take certain information as given, and work out
what follows from that information) whenever we do IQ tests or try to solve
complicated 'brainteasers' or logic puzzles. Suppose (to take a real old chestnut) you
are told that a certain man is standing in front of a portrait of another person and he
says:
'Brothers and Sisters have I none, but that man's father is my father's son.'
You are asked whose portrait it is. What you must do is work out what assertion about
his direct relationship to the person in the picture can be inferred from the information
given (what the guy actually says). So whose picture is it?
To take an example from my favourite genre, suppose you are told that Alf has washed
up on the Island of Knights and Knaves, a strange island inhabited exclusively by two
separate, but intermingled, tribes, Knights and Knaves. Knights always tell the truth,
and Knaves always lie. In exploring the island, Alf comes to a fork in the road; one, but
only one, of the forks leads to the Island's capital, which is where Alf wants to go.
Luckily an inhabitant is standing at the fork and helpfully (if rather improbably)
informs Alf:
Either the correct fork is the left one, or I am a Knave (or both).
The puzzle is: which road should Alf take? Try to work out the solution for yourself
before reading further.
The solution is essentially the following argument or inference: Alf's informant is either
a Knight or a Knave. If he were a Knave, then the second part of what he said ('I am a
Knave') would be true, and hence his whole either/or statement would be true (because
'or' statements are automatically true if one part is true); so he would be telling the
truth. But this is impossible, because Knaves always lie. So Alf's informant must be a
Knight. If so, what he says must be true, because Knights always speak the truth. But
since the second part of his either/or statement is false, the first part must be true to
make the whole either/or sentence true; hence the correct fork is indeed the left one.
This is a correct inference or, as we shall say, a valid inference.
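If you happen to know a little Python, you can also check this reasoning by brute force. The sketch below (just an illustration; the variable names are invented for the purpose) tries all four combinations of 'the informant is a Knight or a Knave' and 'the left fork is or is not the correct one', and keeps only the combinations consistent with what he said and with the rule that Knights speak truly and Knaves lie:

from itertools import product

# Try every combination: is the informant a Knight (truth-teller) or a Knave (liar),
# and is the left fork the correct one?
consistent = []
for informant_is_knight, left_is_correct in product([True, False], repeat=2):
    # What he said: 'Either the correct fork is the left one, or I am a Knave (or both).'
    statement = left_is_correct or (not informant_is_knight)
    # A Knight's statement must be true, a Knave's must be false.
    if statement == informant_is_knight:
        consistent.append((informant_is_knight, left_is_correct))

print(consistent)   # only (True, True) survives: a Knight, and the left fork is correct

The only surviving combination is the one the argument above picks out: the informant is a Knight and the left fork is the correct one.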
The Island of Knights and Knaves was invented by the intrepid logician Raymond
Smullyan. You can find out more about it and try out more puzzles here.
All of these pieces of reasoning, from scientific tests, mathematical proofs and
philosophical arguments to logic puzzles and mundane bits of everyday reasoning,
share the same basic structure (though they may differ greatly in complexity). Certain
"information" is taken as a starting point (we are 'given' Einstein's theory, or Euclid's
axioms, or that Alf is on this particular island and the inhabitant utters a certain
sentence, or that yesterday was Saturday) and we are asked to work out what follows
from or what we can infer from the given information. We shall refer to the information
or assumptions from which a particular piece of reasoning starts as premises and the
further claim that is inferred from those premises as the conclusion. So the reasoning
or the inference or the argument itself is the process that takes us from a set of
premises to a conclusion. All inferences, then, ultimately have the form:
PREMISES
Therefore,
CONCLUSION
The fact that an inference is being made is invariably signalled by some such word or
phrase as 'therefore', 'and so', 'it follows that', 'from which we may infer', 'ergo' and so
on.
A2: VALIDITY AND SOUNDNESS
Before getting down to work, let's pause to clarify right from the outset an issue that
often confuses people. Put rather enigmatically, we might say that while logic is
centrally concerned with truth-transmission, it is not at all concerned with truth.
Consider the following two inferences:
Inference 1:
Inference 2:
1. All members of the Ku Klux Klan are intelligent.
2. All intelligent people are law-abiding.
Therefore, all members of the Ku Klux Klan are law-abiding.
On the other hand, both the premises in Inference 2 are false: there are plenty of
intelligent criminals (making the second premise false) and, although I don't know any
personally, certainly members of the Klan quoted in the media often do not seem too
bright (so the first premise seems to be false). The conclusion is also false: members of
the Klan have, historically, committed any number of criminal acts. Nonetheless, this
second inference is valid, as I hope your intuitions will agree. The premises may not
be true, and the conclusion might not be true, but nonetheless the conclusion clearly
follows from the premises. (This situation is to be compared with inference 1, where
the conclusion is true alright but it doesn't follow from the premise.) How does it
follow despite being false? Well, we'd be inclined to say that IF it were true that all the
Klan members were intelligent and IF it were true that all intelligent people were law-
abiding then it would also have to be true, it would FOLLOW, that all the Klan
members were law-abiding.
Logic is about what else has to be true, supposing that certain starting points are true;
that is why it is about truth-transmission rather than about truth. Of course a scientist
is not interested in drawing conclusions from any old theory; she must have reason to
think that it at least may be true. But logic is indifferent: it will tell you what follows
from any theory, no matter how ridiculous (that is, it will tell you what else would have
to be true if that theory were true). That seems, when you think about it, intuitively
right: you can work out what follows from the (incorrect) theory that some electrons
have positive charge just as you can from the (correct) theory that they all have
negative charge. It's just that the conclusions you validly draw from the latter will all be
true (i.e. borne out in experiments), while some at least of the conclusions you draw
from the false theory that some electrons have positive charge will themselves be false,
that is, run counter to what is actually observed.
Similarly, in the brainteaser case, you aren't interested in whether there really is an
Island of Knights and Knaves and whether Alf really ever did raise his question. These
are just assumptions: supposing (for the sake of argument) that they were true,
what else could you infer (what else would have to be true) about which road it is that
leads to the capital?
Or consider again the Nixon cartoon we first looked at. There is one question logic can
answer and one it cannot. The question it can answer is 'Suppose it were true that if
Nixon knew about the cover up then he is unworthy of his office and also true that if
Nixon did not know about the cover up then he is again unworthy of office. Would it
also then have to be true that he is indeed unworthy of office?' (The answer is, of
course, 'yes'.) The question which logic cannot answer is whether or not these
suppositions are true: deciding whether or not it is true, for example, that if Nixon
knew of the cover-up then he is unworthy of office involves a complex of empirical and
ethical issues.
Logic, then, is about which inferences are VALID (which ones have justified
'therefore's or 'and so's) and this is independent of whether or not the premises of the
inferences are true. Inferences which are not only valid, but which also have true
premises are called SOUND. As ordinary reasoners or as scientists, soundness is, of
course, a major concern: we would like the premises that we start from to be true (or
at least arguably true). But logic, to repeat, is indifferent to soundness and involves
only the issue of validity. The premises are always just initial assumptions; logic will
tell you what follows from those assumptions just as well if they are false as if they are
true (or indeed if they are, as in the brainteaser case, merely assumptions about
which the question of truth simply doesn't arise).
The important connection between validity and soundness is that if the inference is
indeed valid and if moreover it is sound (that is, if its premises are true) then its
conclusion must be true as well. Exactly this same point can be read the other way
round and is equally (perhaps even more) important when expressed in this negative
way: if an inference is valid and its conclusion is false, then it cannot be sound; that is,
not all of the premises can be true, at least one must be false. (People often learn in this
way: they begin by believing a certain set of assertions; and then realise (or are shown)
that a certain conclusion (validly) follows from that set of assertions; and they
acknowledge that that conclusion is false; hence logic dictates that not all the
premises, that is, not all of the set of assertions they began by believing, can be true, at
least one must be false and so must be rejected.)
A3: TRUTH-FUNCTIONAL LOGIC: AN INTRODUCTION
So let's start investigating some inferences. Try not to be put off by the fact that all our
early examples will be extremely simple; we have to learn to walk before we can run.
1. Either Uri Geller bends spoons by genuine psychokinetic powers or he does it by magicians' tricks.
2. He does not have genuine psychokinetic powers.
Therefore, he does it by magicians' tricks.
This simple inference is VALID. It is, moreover, in my view, sound: its premises are
true (and hence, because the inference is valid, so is its conclusion). But, as we just saw,
it doesn't matter at all from the point of view of validity if the premises are true or not.
The validity stems as always from the fact that IF the premises were true, then SO
ALSO would have to be the conclusion. The first premise asserts that one of two
possibilities has to hold true. The second premise asserts that it isn't the first
possibility that holds. It obviously follows that the second possibility has to hold.
Independently of the actual facts about Geller, it's just NOT POSSIBLE for the only
possibilities to be A and B (genuine powers or magic tricks), for A not to be true and for
B not to be true as well. To deny the conclusion of this inference while accepting both
the premises would just be to CONTRADICT oneself. Or more pointedly: suppose you
denied the conclusion while you accepted the second premise (that he doesn't have
real psychokinetic powers); then you would be contradicting yourself if you continued
to hold the first premise: that the only two possibilities were real powers and
magicians' tricks.
Or let's take a slightly more elaborate inference of similar form. Suppose that someone
is trying to remember which London station the train to Edinburgh leaves from. She
remembers going north to the station and this, together with her knowledge of London
stations, gives her as a first premise: 'Either the Edinburgh train leaves from Euston or it
leaves from King's Cross'. She then remembers taking the Manchester train from
Euston, and feels sure that the Edinburgh train leaves from a different station than the
Manchester train. This in effect yields two further premises: 'If the Edinburgh train left
from Euston then it would leave from the same station as the Manchester train' and
'The Edinburgh train doesn't leave from the same station as the Manchester train'.
Taking all these premises together it follows of course that the Edinburgh train leaves
from King's Cross. Although we can hardly imagine anyone spelling the argument or
inference out in such gory detail, we can imagine that someone would infer where to go
to catch the Edinburgh train essentially in this way (supposing she is not internet-
connected and so has no alternative but to rely on memory). Spelling out the inference fully we
have:
1. Either the Edinburgh train leaves from Euston or it leaves from King's Cross.
2. If it leaves from Euston, then it leaves from the same station as the Manchester
train.
3. It does not leave from the same station as the Manchester train.
Therefore, the Edinburgh train leaves from King's Cross.
Again the inference here is valid; I hope that this will be intuitively clear to you. If not,
consider that it is just a slight elaboration on the Geller inference: the first premise
states that one of two possibilities holds, while the second and third premises together
rule out the first possibility (i.e. they rule out Euston). This leaves only the second
possibility and the conclusion is simply the claim that it is this second possibility which
holds. We will soon use the ideas elicited by these two examples to produce a general
characterisation of validity of inference. However, this general characterisation will be
easier to grasp if we look first at a couple of inferences that are intuitively clearly
INVALID.
Before the discovery of Australia, European ornithologists believed that all swans are
white. Their evidence was of course a whole lot of observations of white swans. They
were clearly making an inference of something like the following form:
a1 is a white swan.
a2 is a white swan.
...
an is a white swan.
Therefore, all swans are white.
This is an invalid inference. Even had it turned out that all Australian swans, like all
other swans in the world, are white (in other words, even if the conclusion here turned
out as a matter of fact to be true), the ornithologists' grounds for holding it to be true
were clearly not adequate. This is easily seen by reflecting that it is POSSIBLE for
individuals a1, ..., an all to be white swans (that is, all premises to be true), while some
other swan is black (and so the conclusion that 'All swans are white' is false). It turned
out that this possibility is actualised: in Australia there are black swans. But the
inference would still not be deductively valid even if all swans were in fact white
(though it might nonetheless be persuasive in some other sense; it is often referred to
as an inductive, rather than a deductive, argument). You don't contradict yourself if
you accept that all the swans you have observed so far are white, but assert that some other
swan is not white (and hence that it is false that all swans are white). Contrast this with
the Geller case in which, as we noted, you would contradict yourself if you rejected the
conclusion and continued to assert both premises.
Or, suppose someone is reading an Agatha Christie-style novel and, not being an expert
in these matters (there's always a last-page surprise), has come to the next-to-last page
with the firm belief that the Butler did it. His evidence is that the Butler had both the
motive to kill the vicar (who was really a blackmailer who knew of the Butler's affair
with the Lady of the house, who was really...) and the means (the murder was
committed with the Butler's machete, which he kept for 'deadheading' his roses and so
to which he had access). Our non-expert reader again has made an inference.
Something like the following one:
1. Anyone who murdered the vicar had the means and the motive.
2. The Butler had both the means and the motive.
Therefore, the Butler murdered the vicar.
Again this inference is invalid; that is, again the conclusion is not guaranteed to be true
simply because the stated premises are known to be true: it is possible for both of the
premises to be true while the conclusion is false. This is of course because it is possible
for more than one person to have had the motive and the means. Indeed, we can
suppose that our non-expert reader gets the customary shock on the last page when it
turns out that little Miss Goody Two-shoes (in fact the "vicar's" former lover and
accomplice) did it. But it wouldn't matter if this turned out to be a very boring
whodunnit and the conclusion this reader had drawn was correct: the Butler really
did do it. The inference would still have been invalid, as an inference, because it was
still possible (even if, so it turned out, not-actual) for someone else to have had both the
motive and the means and for that someone else, rather than the Butler, to have been
the guilty party.
It's not possible for the only two explanations for Geller's spoonbending antics to be
trickery and genuine psychic powers, for him not to have genuine psychic powers and,
at the same time, for him not to be doing it by trickery. On the other hand, it is possible,
whether or not it's true, for all observed swans to be white and yet not all swans to be
white (because some so-far unobserved swans are some other colour). But surely we
can't rest what I've argued is a crucially important and fundamental notion (of validity
of inference) on the opaque notion of possibility; after all, pigs might possibly fly.
But let's, for the moment, give ourselves the notion of possibility (we will soon replace
it with a much less mysterious notion) and summarise the important points that have
been made so far:
Validity:
An inference is valid if it's not possible for the conclusion to be false and
(all) the premises true at the same time.
Another way of thinking about this impossibility is that in a valid inference, you would
contradict yourself if you held that the conclusion was false and all the premises true.
In the case of an invalid inference, on the other hand, you might be wrong if you
asserted that the conclusion was false while accepting the premises as true but you
would not contradict yourself.
To make this clear, think about an analogous inference to the swans one: so far as I
know, all ravens (at any rate all normal ravens, there are some albino ones) are as a
matter of fact black. The inference from any number of observed black ravens to the
assertion that all ravens (observed or so far unobserved) are black would nonetheless
be invalid just the same as the swans one. If someone accepted that all observed ravens
have been black, but denied that all ravens are, they would be (factually) wrong, but
they would clearly not be contradicting themselves. Just as someone about to celebrate
their 18th birthday could accept that they were under 18 yesterday and under 18 on the
day before that, and on the day before that, etc., while accepting that they will not be
under 18 tomorrow!
Hence a good way to think about what makes an inference INVALID is that it is invalid
if it is POSSIBLE for the conclusion to be false even though the premises are all true.
To drive home this important lesson, consider finally the following train of reasoning:
Someone who knows little about Opera is trying to recall which composer wrote Tosca.
She remembers that the composer was Italian, so that it's a fair bet that it was either
Puccini or Verdi. Something tells her that it wasn't Puccini, so she infers or concludes
that it was Verdi. She has made the following inference:
1. Tosca is either by Puccini or by Verdi.
2. It is not by Puccini.
Therefore, it is by Verdi.
Here the conclusion is in fact false. Nonetheless there is a clear sense in which the
reasoning is correct. Had both the premises been true then the conclusion would have
had to be true as well. The assumption that the premises are true and the conclusion
false is, again, self-contradictory. It's just that as a matter of fact the conclusion is false,
thus showing that at least one of the premises must be false too (Tosca is by Puccini).
We must, then, as we agreed earlier, always sharply separate the two questions:
1. Are the premises or assumptions from which some piece of reasoning starts
true?
2. Is the reasoning valid? That is, does the conclusion follow from the premises?
(whether or not the conclusion is true)
In the Tosca case the conclusion does follow from the premises, but it is false (because
one of the premises is false).
A3(a): LOGICAL FORM AND TRUTH-FUNCTIONAL VALIDITY
The Tosca inference and the Uri Geller inference are valid inferences for exactly the
same reason. In fact, in logical terms they are the same inference. Although one talks
about Opera composers and the other about spoonbending (so that their contents are
radically different), both inferences have the same form. The first premise of both
inferences states that one of two possibilities holds. The second premise states that one
particular possibility does not hold. The conclusion is that the other possibility holds. If
we disregard the content of the two inferences by replacing single assertions by letters
(different letters for different assertions), we can express both by the scheme:
Premises: Either p or q
Not-p
Conclusion: Therefore, q
There is nothing magical about the symbols: the p's and q's are simply place-holders
for particular assertions. The above scheme is the logical form of the inference about
Tosca and about Geller. Let's call it the inference-scheme of both these inferences.
Given any such inference-scheme we can of course turn it back into a particular
inference by replacing p and q by ordinary sentences. Substituting 'Tosca is by Puccini'
for p, and 'Tosca is by Verdi' for q, we arrive back from the scheme to the Tosca
inference. If we substitute 'The Genesis account of the creation of the universe is wrong'
for p, and 'The Darwinian theory of evolution is wrong' for q, we obtain the quite
different inference:
1. Either the Genesis account of the creation of the universe or the Darwinian
theory of evolution is wrong.
2. The Darwinian theory is not wrong.
Therefore, the Genesis account of the creation of the universe is wrong.
Since there are infinitely many sentences in English, there are in fact infinitely many
possible substitutions for p and q in our simple scheme. However, any such
substitution must produce an inference that falls under just one of the following four
headings:
(1) Both premises true, conclusion true.
For example, we might substitute the sentence 'The sum of two and two is four' for p
and the sentence 'pigs can fly' for q, thus producing the inference:
1. Either the sum of two and two is four or pigs can fly.
2. Pigs can't fly.
Therefore, the sum of two and two is four.
(2) At least one premise false, conclusion false.
For example, substitute 'Mozart wrote Fidelio' for p and 'Beethoven wrote Don Giovanni'
for q. This produces an inference whose first premise is false (since both sides of the
either/or are false) and whose conclusion (Beethoven wrote Don Giovanni) is false as
well.
(3) At least one premise false, conclusion true.
For example, substitute 'Newton was a great scientist' for p and 'Einstein was a great
scientist' for q. Here the second premise (not-q) is false; but the conclusion (Einstein
was a great scientist) is true.
(4) Both premises true, conclusion false.
??
It is no accident that I cannot cite any examples under heading (4). For this particular
inference-scheme, there are no such examples. Do your best to find substitutions for
p and q that might make both premises true and conclusion false - even your best will
not be good enough!
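If you like, you can let a machine do the searching. The following short Python sketch (purely an illustration) runs through all four combinations of truth values for p and q and looks for a case falling under heading (4), that is, both premises of the scheme true and the conclusion false:

from itertools import product

# Scheme: premise 1 'Either p or q', premise 2 'Not-p', conclusion 'q'.
found = False
for p, q in product([True, False], repeat=2):
    premise1 = p or q
    premise2 = not p
    conclusion = q
    if premise1 and premise2 and not conclusion:
        found = True
        print("Both premises true and conclusion false when p =", p, "and q =", q)
if not found:
    print("No combination of truth values does the job.")

It reports that no combination does the job, which is just the point being made here.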
This is in fact the key to replacing the vague talk of possibility with a clear notion and
hence to producing our first precise notion of validity. We arrived intuitively at the idea
that an inference is valid if the truth of the premises would be enough to guarantee the
truth of the conclusion (even if the premises are as a matter of fact false) or
equivalently if the conclusion could not possibly be false if the premises were true. We
can now eliminate this rather tricky subjunctive notion ('would be's are sometimes
called subjunctives) and say that:
An inference is VALID if and only if no substitution of sentences for the letters in its
form yields an inference with all premises true and the conclusion false.
The form of the inference, remember, is its symbolic representation found by replacing
single assertions in the inference by letters p, q, r, etc. using a different letter for
each different assertion. If this characterisation of validity is correct then an invalid
inference must, of course, fail to meet it. That is, for an invalid inference, case (4), (true
premises and false conclusion) should be possible. And it is.
Consider the inference that can be taken to underlie the reasoning of our earlier duped
Agatha Christie reader:
1. If the Butler 'did it', then he had both the motive and the means.
2. He had the motive and the means.
Going through replacing single assertions by letters, as before, we obtain the form of
this inference:
1. If p, then q
2. q
Therefore, p
(Here 'p' stands for 'The Butler did it' and 'q' for the sentence 'The Butler had the
motive and the means'.) We can assume that in the original inference we are unsure
about the truth or falsity of p (we did take it that we knew q to be true). But whether or
not p is true, the premises are not sufficient to establish its truth because of the
possibility that the premises are true while the conclusion (p) is false. We can again
now eliminate this rather vague talk of 'possibility': the inference is invalid if (and only
if) we can find at least one substitution for p and q in the inference-scheme which makes
both premises true and the conclusion false.
This is in fact easily done. Take, for example, p as 'Joe diMaggio was president of the US'
and q as 'Joe diMaggio was born in the US'. This substitution into the inference scheme
produces the inference:
1. If Joe diMaggio was president of the US, then he was born in the US.
2. Joe diMaggio was born in the US.
Therefore, Joe diMaggio was president of the US.
Here the premises are true (the second just is true and the first is true in view of the
fact that anyone who stands for President must have been born in the US), but the
conclusion is of course false, though he did have the not-inconsiderable consolation of
not only being a great baseball player but also of being married for a time to the
wonderful Marilyn Monroe. This, then, is why the inference about the Butler is invalid.
There is an inference of the same form as that inference which has true premises and
false conclusion (the Joe diMaggio one).
(Exercise: Try to think for yourself of other substitutions for p and q which do the same job,
that is, make the premises true but the conclusion false.)
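By contrast, for the Butler scheme a mechanical search does turn up what we want. Anticipating the truth-table treatment of 'if ... then' given below (on which 'if p then q' is false only when p is true and q is false), the following Python sketch (again, only an illustration) finds the combination with both premises true and the conclusion false; it is exactly the Joe diMaggio combination, p false and q true:

from itertools import product

def if_then(p, q):
    # 'if p then q' is false only when p is true and q is false
    return (not p) or q

# Scheme: premise 1 'If p then q', premise 2 'q', conclusion 'p'.
for p, q in product([True, False], repeat=2):
    if if_then(p, q) and q and not p:
        print("Premises true, conclusion false when p =", p, "and q =", q)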
Let's record our results so far in the form of a couple of important definitions:
Definition: Counterexample:
Let I be any inference. An inference of the same logical form as I that has true premises
and a false conclusion is called a COUNTEREXAMPLE to I.
Definition: Validity:
An inference is VALID if and only if it has no counterexample, that is, if and only if no
inference of the same logical form as it has true premises and a false conclusion.
This eliminates the vagueness involved in the notion of possibility but only at the cost
of introducing the so far rather unspecified notion of the form of an inference. In the
next section this notion is made explicit (at any rate for a restricted range of
inferences).
A3(b): TRUTH-FUNCTIONALLY COMPOUND SENTENCES
We need first to reflect on a couple of simple facts about language. First, some
sentences might be called atomic declarative sentences: declarative because they
make an assertion which is either true or false, and atomic because they have no parts
that are themselves sentences. So 'Logic is easy' is an atomic declarative sentence, and
so is 'Donald Trump is crazy'. On the other hand, 'Shut the door!' and 'Is the door shut?'
are not declarative (they aren't true-or-false assertions) and so automatically not
atomic declarative. Meanwhile, 'If Trump wins, I will want to leave the planet' and
'Trump is crazy and Clinton is untrustworthy' are declarative alright but not atomic,
since both contain parts ('Trump wins' and 'I will want to leave the planet' in the first
case, and 'Trump is crazy' and 'Clinton is untrustworthy' in the second) that are
themselves sentences.
Second, given a stock of such atomic declarative sentences, there are many ways in
which we can use them to build new more complicated sentences. For example, we can
form a single sentence by taking any two of them and sticking an 'and' between them,
and another one by sticking an 'or' between them (usually with an 'either' in front).
Indeed, the sentence 'Trump is crazy and Clinton is untrustworthy' is formed exactly by
sticking an 'and' between the two separate atomic sentences 'Trump is crazy' and
'Clinton is untrustworthy'. Out of the sentences 'Tony Blair lied' and 'I am a bad judge
of character' we can form the sentence 'Either Tony Blair lied or I am a bad judge of
character'.
Also, given any single sentence such as 'I am a bad judge of character' we can form
another by placing 'It's not the case that' in front of it to form: 'It's not the case that I am
a bad judge of character'. This would more usually be expressed as: 'I am not a bad
judge of character'. (As will become clear as we go along, very often an idiomatic
English sentence does not display its logical form directly but employs various
abbreviatory devices: so instead of saying 'Blair lied about weapons of mass
destruction in Iraq and Blair misled the British people' we would say 'Blair lied about
weapons of mass destruction in Iraq and misled the British people'.)
Another way of making compound sentences out of single (atomic) sentences is by
the 'if ... then' construction. For example, out of the two atomic sentences 'Logic is
interesting' and 'I'm a Dutchman', we can form the single compound sentence: 'If logic
is interesting, then I'm a Dutchman'. Or out of the sentences 'Einstein's theory is true'
and 'Light rays are bent by gravitating bodies' we can form the single compound
sentence 'If Einstein's theory is true, then light rays are bent by gravitating bodies'.
The 'or' construction is called DISJUNCTION. The compound sentence 'Either logicians
are mad or the moon is made of green cheese' is the disjunction of the two atomic
sentences 'Logicians are mad' and 'The moon is made of green cheese' (each individual
sentence is a disjunct).
The 'it's not the case that' construction is called NEGATION. 'There is no life after
death' (which is an abbreviated form of 'It's not the case that there is life after death')
is the negation of 'There is life after death'.
The 'if ... then' construction forms the CONDITIONAL. In 'If the Conservatives win the
next election then I shall emigrate', the antecedent is the sentence 'The Conservatives
win the next election' and the consequent is 'I shall emigrate'.
One assumption that will be made throughout this course is that every atomic
declarative sentence is indeed either true or false (not, of course, both). Talking in a
way that will prove useful later on, we can say that every atomic declarative sentence
has one of the two TRUTH VALUES: 'true' or 'false'. There are atomic sentences ('God
exists', 'Man and the apes share a common ancestor', etc.) whose truth-values have been
a matter of heated debate. But the fact that we may not be able to agree on the truth
value of a sentence does not mean that it doesn't have one. 'God exists' is, presumably,
either true or false even though there is no universally agreed way of deciding which.
Other sentences (a favourite example is 'Colourless green ideas sleep furiously'),
although grammatically correct, and clearly of a declarative form (not, for example, an
injunction like 'Shut the door!'), arguably have no truth value. Some philosophers have
claimed that moral assertions like 'Lying is wrong' or 'You ought not to commit
adultery' also do not have truth values: they are neither true nor false, since there are
no moral facts, and statements like this really amount to implicit injunctions ('Don't lie!',
'Don't commit adultery!') or maybe statements expressing the feelings of the speaker ('I
don't approve of people who lie/commit adultery'). While still other philosophers have
suggested that vague statements like 'This set of pebbles forms a heap' (100 pebbles
form a heap, one or two don't, but how about 8?) may have a third truth value
(something like 'indeterminate'). But we will ignore these complications throughout
this course and assume that all declarative sentences are either true or false.
So atomic sentences are either true or false and we can make various compound
sentences using the constructions outlined above. The important point about all the
compound sentences just considered is that they depend for their overall truth value
on the truth values of their atomic components: what truth value the compound
has is determined in a definite way by the truth values of the atoms.
CONJUNCTION:
The sentence 'Humphrey Bogart starred in Casablanca and Fred MacMurray starred in
Double Indemnity' is true because both of its components (both conjuncts) are true.
The sentence 'Ingrid Bergman starred in Casablanca and Veronica Lake in Double
Indemnity' is false, because the second conjunct is false, even though the first conjunct is
true. The sentence 'Karl Marx was a great composer and Beethoven a great
philosopher' is false because both conjuncts are false.
Nothing depends in the slightest on what the individual sentences are about (film stars,
composers or whatever). We know the truth-value of the compound sentence once we
know the truth-values of the components. Any conjunction is true if and only if both
conjuncts are true, and is false otherwise (that is, the conjunction is false if either
conjunct is false, or both are). If p and q are the individual atomic sentences, then the
sentence 'p and q' is true if and only if the truth values of p and q are
both true. We can re-express this simple rule using a graphic device known as a truth
table (this graphic device is due to the famous philosopher Ludwig Wittgenstein). In
order to save space, we will from now on use the symbol '&' to mean 'and'.
p q p&q
T T T
T F F
F T F
F F F
There are four lines in this truth table corresponding to each of the different possible
combinations of truth values of the conjuncts. The final column gives the overall truth
value of the compound for the corresponding truth values of the components.
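If you want to produce such a table for yourself rather than copy it out, a few lines of Python will do the job (only an illustration; the helper tf is invented here and simply prints True as 'T' and False as 'F'). Python's built-in 'and', applied to the truth values True and False, behaves exactly like '&':

from itertools import product

def tf(b):
    # render a truth value in the T/F style of the tables
    return "T" if b else "F"

print("p", "q", "p&q")
for p, q in product([True, False], repeat=2):
    print(tf(p), tf(q), tf(p and q))

The four rows are generated in the same order as in the table above.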
NEGATION:
The case of negation is just as straightforward. The sentence 'It's not the case that the
moon is made of green cheese' is true because the atomic sentence 'The moon is made
of green cheese' is false. The sentence 'It's not the case that Pavarotti was a great tenor'
is false because the atomic sentence 'Pavarotti was a great tenor' is true. (Let's, in order
to avoid heated debate, understand this sentence in a timeless sense, so that you are a
great tenor if you ever have been one, so that an opera buff could readily consent to this
sentence even while believing that the estimable Luciano was over the hill for several
years before he died.) Again, nothing depends on the particular sentence involved: the
negation of any true statement is false, and the negation of any false statement is true.
For any sentence p, not-p (we shall use the symbol ¬p) is true if and only if p is false.
Again we can re-express this in the form of a truth table:
p ¬p
T F
F T
DISJUNCTION:
We can also form a compound sentence using the either/or construction. Out of the
sentences 'I shall go to visit my grandma in hospital today' and 'I shall go to visit my
grandma in hospital tomorrow' we can form the disjunction: 'Either I shall go to visit
my grandma in hospital today or I shall go and visit my grandma in hospital tomorrow'
(more idiomatically, of course, 'I shall go and visit my grandma in hospital either today
or tomorrow'). Here, however, we come across an ambiguity in ordinary language.
Sometimes (perhaps more often) we use 'either/or' in the exclusive sense, meaning
one or the other but not both. Suppose you were the unfortunate victim of a (slightly
old-fashioned) mugger who threatened 'Your money or your life' (more explicitly,
'Either you give me your money or I will take your life'). You would feel very aggrieved
if, having given him your money, he proceeded to shoot you anyway, insisting that he
intended the either/or in the inclusive sense! (Though, assuming he was a good shot, at
least you wouldn't feel aggrieved for too long.) So we would normally take the 'or' in
'Your money or your life' as clearly to be understood in the exclusive sense.
Sometimes it is unclear whether 'or' is meant in the inclusive or the exclusive sense. Is
the sentence 'Either Lennon or McCartney wrote A Day in the Life' true or false? (They
both did.) Would the earlier case of 'Either I shall go to visit my grandma in hospital
today or I shall go and visit my grandma in hospital tomorrow' be true or false if you
were extra nice and went to visit her on both days?
Logic cannot tolerate ambiguity and clearly it matters for precise logical purposes
which sense we take 'or' in. If both p and q are true, then in the inclusive sense 'p or q' is
true but in the exclusive sense 'p or q' is false. Logicians happen (for reasons that don't
matter here) to have elected to take the inclusive sense as primary. (As we shall see,
and this does concern us, we won't lose anything by making this conventional
decision.) The shorthand symbol for 'or' in this inclusive sense is 'v'. So we have the
following:
p q pvq
T T T
T F T
F T T
F F F
(The reason why we don't lose anything by making the conventional decision to go for
the inclusive sense of either/or as primary is that when we definitely mean an
either/or sentence in the exclusive sense, we can express it formally using our
symbolic apparatus by a simple further compounding using inclusive-or. Suppose I say
(regretfully) 'Either Manchester United or Manchester City will win the Premiership
this season'. This clearly means either/or in the exclusive sense (ties are not allowed).
So spelling it out more fully I assert: 'Either Manchester United will win the
Premiership this season or Manchester City will win the Premiership this season,
though not of course both'. Taking p to be 'Manchester United will win' and q to be
'Manchester City will win', then 'p [exclusive] or q' is equivalent to (p v q) & ¬(p & q),
where p v q as always involves v in the inclusive sense.)
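If you want to check this equivalence mechanically, the short Python sketch below (only an illustration) confirms that (p v q) & ¬(p & q) agrees with exclusive-or on every one of the four combinations of truth values:

from itertools import product

for p, q in product([True, False], repeat=2):
    formula = (p or q) and not (p and q)   # (p v q) & not-(p & q)
    exclusive_or = (p != q)                # true when exactly one of p, q is true
    assert formula == exclusive_or
print("The formula agrees with exclusive-or on all four lines.")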
CONDITIONALS:
Suppose a suspect, questioned about his movements, says: 'If the 23rd was a
Wednesday, then I was at the greyhound races.' (He always, let us suppose, goes to the
dog races on Wednesday evenings.) This is called a conditional sentence and, just to
have some handy terminology, the sentence after the 'if' (here the sentence 'The 23rd
was a Wednesday') is called the antecedent of the conditional and the sentence after
the 'then' (here the sentence 'I was at the greyhound races') is called the consequent of
the conditional. The truth of this conditional sentence is dependent on the truth values
of its components (i.e. the truth values of its antecedent and consequent), just as
conjunctions and disjunctions are. However, the form of the dependency in the case of
the conditional is subtler.
Let's carefully consider the truth or falsity of our particular conditional assertion under
all possible different suppositions about the truth or falsity of its components. First,
suppose that the 23rd was indeed a Wednesday (antecedent true) and that the suspect
did indeed go to the greyhound track that night (consequent also true). In that case we
would surely regard the suspect as having spoken truly when he said 'If the 23rd was a
Wednesday, then I was at the greyhound races', that is, we would regard his 'if/then'
statement as being true.
Now suppose that the 23rd was a Wednesday (antecedent true), but the suspect was not
at the dog track (consequent false). In that case (true antecedent, false consequent) the
suspect surely spoke falsely: his 'if/then' statement was definitely false.
These are the two obvious cases (the two cases in which the antecedent is true) and
they dictate two out of the four lines in the truth table for the conditional. But what if
the antecedent was false? What if the 23rd was, say, a Friday rather than a Wednesday?
Here intuitions are not altogether clear. But it is surely clear that if the 23rd was a
Friday, then the suspect did not lie when he said that 'If the 23rd was a Wednesday then
I went to the greyhound races', and this is so either in the case that he went to the
greyhound races on Friday the 23rd or in the case that he did not. That is, his conditional
assertion is at least not outright false if the antecedent is false, whatever the truth value
of the consequent.
If therefore we are to stick by our decision that, for our purposes in this course, all
(grammatically legitimate) sentences are to be regarded as either true or false, then it
would seem that we are forced to the conclusion that the conditional 'If the 23rd was a
Wednesday, then I was at the greyhound races' is true both in the case that the
antecedent is false and the consequent true (the 23rd was not a Wednesday, but he was
at the dogs) and in the case that the antecedent is false and the consequent is false (the
23rd was not a Wednesday and he was, say, in fact shooting Dangerous Dan McGrew
somewhere far away from the greyhound stadium).
The reason why this case is not clear-cut is that it is not clear that we are intuitively
happy to say that the conditional is indeed true in these two cases. It perhaps seems
more natural, certainly in this instance, to regard the conditional as 'Not applicable'
when the antecedent is false: the suspect's assertion only comes into play if the
antecedent is true (if the 23rd was indeed a Wednesday) and is then clearly true if the
consequent is true (he had gone to the dogs) and false if the consequent is false (he was
somewhere else).
On the other hand, suppose (going back to the drunken stupor case we thought about
earlier) I say: 'If today is Sunday, then yesterday was Saturday.' Suppose, moreover,
that I am wrong about today being Sunday: in fact I had so many drinks on Saturday
that I slept through the whole of Sunday and today is in fact Monday. In that case, the
antecedent of my conditional assertion is false (it's not true that today is Sunday), and so
also as a matter of fact is the consequent (yesterday was Sunday, not Saturday);
nonetheless most of us would still want to say that my conditional assertion was true.
So there are at least some cases in which if/then sentences with false antecedents are
intuitively regarded as true.
Moreover, consider a sentence like: 'If Tony Blair really believed that there were WMDs
in Iraq, then I am a Dutchman' (or 'then I am Marilyn Monroe' or 'then pigs can fly'). We
actually use constructions like this (there are other ones used in other cultures and age
groups) as an emphatic way of asserting a negation. What you would actually mean to
imply if you asserted such a sentence is that (of course, in your opinion; let's not get
into making any assertion about the actual facts here in case lawyers might become
involved) Tony Blair definitely did not really believe that there were WMDs in Iraq.
And you imply that by using an obviously false sentence (like 'I'm a Dutchman' or 'I'm
Marilyn Monroe' or 'The Pope is Jewish') as consequent of a conditional that you assert
as true. So in conditional sentences like these everyone knows the consequent to be
false; any conditional, we all agreed, is unambiguously false if the antecedent is true but
the consequent false; hence, in this Blair case, your conditional would be false, not true,
if the antecedent were true; that's why, by asserting the overall conditional (asserting
it to be true), you in effect assert its antecedent to be false. So again this is a case in
which a conditional is intuitively true (rather than not applicable) when it has a false
antecedent and a false consequent.
So, what are we to do? Clearly we cannot hope to capture ordinary usage directly since
ordinary usage is not unambiguous and logic does not tolerate ambiguity. Again, as in
the case of disjunction, we make a decision that captures some of the intuitions and
hope that the others can be met by more sophisticated means (it's in fact a lot less
clear-cut than in the case of disjunction if they can: the status of conditionals remains
an issue of hot debate in analytic philosophy, but these debates will not concern us in
this course). The decision taken in logic is to stick with the idea that all sentences are
either true or false (that is, to avoid the 'not applicable' possibility). This means, since
we agreed that our criminal suspect's sentence was certainly not false if the antecedent
was false (if the 23rd was not a Wednesday), that we must take the conditional as true
whenever its antecedent is false. So, symbolising any sentence of the form 'if p then q'
as p → q, we have the following:
p q p→q
T T T
T F F
F T T
F F T
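Notice that the whole table is generated by the single rule that a conditional is false just when its antecedent is true and its consequent is false. Written out in Python (purely as an illustration, with an invented helper name), the rule and the four lines of the table look like this:

def cond(p, q):
    # 'if p then q' is false only when p is true and q is false
    return not (p and not q)

# the four lines of the truth table for the conditional
assert cond(True, True) is True
assert cond(True, False) is False
assert cond(False, True) is True
assert cond(False, False) is True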
BI-CONDITIONALS:
The final way of compounding atomic sentences that we shall consider is found more
often in scientific and mathematical contexts than in ordinary discourse, but maybe
because of this is another straightforward case like conjunction. This involves
connecting sentences using the phrase 'if and only if'. For example, someone might say
'Corbyn will survive as Labour leader if and only if Labour wins the next election' or an
economist might predict 'The economy will recover if and only if the interest rate is
increased by a whole point'. (Synonymous phrases to 'if and only if' are 'exactly when'
or 'just in case'.) The 'if and only if' connective (often abbreviated to 'iff' and
symbolized as '↔') is again clearly truth-functional: that is, the truth value of the
compound p ↔ q is dependent on the truth values of p and q. In fact, p ↔ q is true
whenever p and q have the same truth value and false whenever they have different
truth values.
So, for example, the statement about Corbyn will turn out to be true if one of two cases
turns out to hold: (a) Corbyn survives and Labour wins the next election (p and q both
true) or (b) Corbyn does not survive and Labour loses (p and q both false). If, on the
other hand, (c) Labour loses and yet Corbyn survives as leader (p true, q false) or (d)
Labour wins but Corbyn is ousted (p false, q true), that is, in either of the two separate
cases in which p and q have different truth values, then the assertion (prediction) that
'Corbyn will survive as Labour leader if and only if Labour wins the next election'
clearly has turned out to be false.
p q p↔q
T T T
T F F
F T F
F F T
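In programming terms (again only as an illustration), 'true whenever p and q have the same truth value' is simply the test that p and q are equal:

def iff(p, q):
    # the biconditional: true exactly when p and q have the same truth value
    return p == q

assert iff(True, True) and iff(False, False)           # same truth value: true
assert not iff(True, False) and not iff(False, True)   # different truth values: false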
FURTHER COMPOUNDING:
So, for example, we can form a conditional whose consequent is itself a conjunction: e.g.
'If the Tories win the next election, then income tax rates will decrease and social
inequality will increase'. Or we can form a conditional both the antecedent and the
consequent of which are themselves compounds: e.g. 'If either Manchester United or
Manchester City wins the premiership, then I shall be very unhappy and so will every
other Liverpool fan'. Things can get as complicated as you like: for example, having
created the conjunctions 'The US remains the only Western global superpower and the
situation in the Middle East will get worse' and 'A Federal Europe will be created as a
new Western global superpower and the situation in the Middle East will improve', we
can go on to form the disjunction of the two conjunctions: 'Either the US remains the
only Western global superpower and the situation in the Middle East will get worse or a
Federal Europe will be created as a new Western global superpower and the situation
in the Middle East will slowly improve'.
Or you might be told at some airport: 'If you are either a British citizen or a citizen of an
EU country then you need not fill in a landing card and you should go through channel
A'. This compound sentence is a conditional whose antecedent is a disjunction and
whose consequent is a conjunction. The combinations, the ways of compounding, are
(literally) endless. The way to appreciate this is through looking at examples, of which
you will be given plenty.
Let's think again about a couple of examples we just had: starting with 'If the Tories
win the next election, then income tax rates will decrease and social inequality will
increase'. Taking p as 'The Tories will win the next election', q as 'Income tax rates will
decrease' and r as 'Social inequality will increase', this looks like it formalises as:
p → q & r
But this is ambiguous as it stands. It could also mean the quite different and
admittedly rather strange assertion:
'If the Tories win the next election then income tax rates will decrease and [in any
event] social inequality will increase',
that is, (p → q) & r. This indicates the need for brackets (brackets are very important in
logic!) to disambiguate. What we really meant was:
p → (q & r)
Or take 'If either Manchester United or Manchester City wins the premiership then I
will be very unhappy and so will every other Liverpool fan'. Taking p as 'Manchester
United wins the premiership', q as 'Manchester City wins the premiership', r as 'I will
be very unhappy' and s as 'Every other Liverpool fan will be very unhappy', it looks like
this formalises as:
p v q → r & s
But again, without brackets, this is ambiguous, multiply so in this case. It could mean
what we wanted it to mean, namely:
(p v q) → (r & s)
but it could equally (if oddly) be read in other, aberrant ways.
Exercise: How would we correctly formalise alternatives 1 and 2? That is, how would
we use brackets to express those two (aberrant) assertions?
Finally, let's think about our political example: 'Either the US remains the only global
superpower and the situation in the Middle East will get worse or a Federal Europe will
be created as a new global superpower and the situation in the Middle East will slowly
improve'. This is clearly a disjunction each of whose disjuncts is a conjunction. So,
letting p be the sentence about the US, q the sentence that the situation in the Middle
East will get worse, r the sentence about a Federal Europe and s the sentence that the
situation in the Middle East will slowly improve, the correct formalisation is:
(p & q) v (r & s)
You will soon get used to this by practising formalising ordinary language sentences.
There are one or two wrinkles that turn up. For example, strictly speaking, we should, if
we wish to consider the negation just of the atomic sentence p, write (¬p) to
indicate that the negation is only of p. But, in order to save typescript, we avoid
brackets in that case and just write ¬p. So the consequent of the intended formalisation
of our airport sentence, (¬r & s), is to be read as 'not r and [but!] s' (i.e. you need not fill
in a landing card and you should go through Channel A). The formula ¬(r & s), on the
other hand, says that it's not true that you both need to fill in a landing card and that
you should go through Channel A.
Exercise:
1. Take a more straightforward case: say p is the sentence 'Blair was a liar' and q is
the sentence 'Bush was a liar': what do each of ¬p & q, ¬p & ¬q and ¬(p & q) mean
in ordinary language?
2. ¬(p & q) is actually equivalent to (i.e. says the same thing as) a certain
disjunction; can you work out which one?
There are other cases where brackets become redundant, but it is best to learn of
these through exercises, rather than through trying to understand a general
explanation. Certainly the rule is: if in doubt, leave the brackets in.
By using the basic truth tables step by step we can build up a truth table for these more
complicated sentences too. This will again tell us the overall truth value of the sentence
for every possible combination of truth values of the atomic components. Let's take our
airport case again, which formalised, remember, as (p v q) → (¬r & s).
There are in this case four atomic sentences: p, q, r and s, and hence a total of 16 possible
combinations of truth values and hence 16 lines in the truth table. (It is easy to prove
mathematically that if there are n atoms, there are 2^n possible combinations of truth
values and 2^4 of course equals 16.) The relevant truth table is then:
p q r s (p v q) → (¬r & s)
T T T T t F f f
T T T F t F f f
T T F T T
T T F F F
T F T T F
T F T F F
T F F T T
T F F F F
F T T T F
F T T F F
F T F T T
F T F F F
F F T T T
F F T F T
F F F T T
F F F F T
Here, I have indicated the overall truth value of the compound in each row,
employing, as usual, capital Ts and Fs and placing these overall truth values under the
main connective, in this case the conditional (the sentence, as we already agreed and
as we made the bracketing indicate, is a conditional with a disjunctive antecedent and a
conjunctive consequent). I have also indicated, in the case of the first two rows and
using lower case letters to indicate the intermediate steps, how the overall truth value for
that row is to be worked out.
So, in the first row, all the atoms take the truth value true, hence (p v q) is true by the
basic truth table for a disjunction (hence the t under that bit of the sentence), but ¬r is
false (by the basic truth table for negation, given that r is true) and hence, although s is
true, by the basic truth table for conjunction, (¬r & s) is false (hence the double bit of
working out under (¬r & s), ending up with an f). So, finally, we have now worked out that
for this particular assignment of truth values (all atoms true) we have, for the overall
conditional, a true antecedent and a false consequent: and hence, by the basic truth
table for the conditional, we have the overall truth value F (indicated by the capital F
under the main connective, the →).
Similarly, for the second row, the antecedent is again true, the consequent is false (both
¬r and s are false in that second row), and so again we have true → false, which again
gives overall F by the basic truth table for the conditional.
Exercise: Go carefully through each of the remaining 14 rows and check that the overall
truth value assigned to that row is correct. Although you should force yourself to go
through all the working (just this once, on the assumption you find this stuff easy), you
may notice that various short cuts are possible: for example, you know the last 4 lines
must all take the overall value T once you have seen that the antecedent is false in all
those lines, and independently of what happens with the consequent ¬r & s: this is
because, by its basic truth table, a conditional takes the overall value T if the
antecedent is F, regardless of whether the consequent takes T or F.
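If you would rather let a machine grind out the sixteen rows, here is one way of doing it in Python (only an illustration, using the truth-table rules already given; the helper names are invented). It prints the overall truth value of (p v q) → (¬r & s) for each of the 2^4 combinations, in the same order as the table above:

from itertools import product

def cond(antecedent, consequent):
    # a conditional is false only when its antecedent is true and its consequent false
    return not (antecedent and not consequent)

def tf(b):
    return "T" if b else "F"

print("p q r s   (p v q) -> (not-r & s)")
for p, q, r, s in product([True, False], repeat=4):
    overall = cond(p or q, (not r) and s)
    print(tf(p), tf(q), tf(r), tf(s), " ", tf(overall))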
You should check that this method gives the right answers intuitively for each line. So
remember our sentence read: 'If you are either a British citizen or a citizen of another
EU country, then you need not fill in a landing card and you should go through channel
A.' Of course we normally suppose that such notices are only put up in airports if the
sentence they contain is true, but we are supposing for the sake of this argument that
the sentence may be true or false (perhaps some joker has been putting up notices
some of which are correct, some incorrect, just for his perverted enjoyment of the
subsequent confusion). So let's consider line 1: if line 1 holds (that is, if the possible
assignment of truth values to atoms that it contains is the one which actually holds in
the real world), then the facts of the matter are: you are a British citizen, you are also a
citizen of another EU country, you do in fact need to fill in a landing card, and you
should go through channel A.
The rule on the notice is then indeed false: you have been misinformed and, having
supposed that you do not need to fill in a landing card, you will presumably be stopped
at customs and required to do so.
Similarly, if line 2 reflects the real world (the real truth values) then you are again a
dual British and EU country national, you again need to fill in a landing card, but now
you also should not go through channel A (s is false). So again our truth table agrees
with our intuitions that in this case the sentence on the notice was false: we shall be in
trouble twice over with customs by following what it says, since we both needed to fill in
the landing card and went down the wrong channel.
Notice finally the systematic method for ensuring that you have considered all
possible combinations of truth values, as exemplified in this case but to be used in all
cases: give the first atom the value T in the first half of the rows and F in the second
half; within each half, give the second atom T in its first half and F in its second; and so
on, so that the final atom simply alternates T, F down its column.
Now consider the unbracketed string of symbols
p v q → ¬r & s
(where p, q, r and s all mean the same as in the real airport example). It could have
been bracketed in ways other than the one we chose, for example:
Alternative 1: p v (q → (¬r & s))
Alternative 2: p v ((q → ¬r) & s)
Construct truth tables for each of these alternatives and show both that they differ from
one another (that is, there are at least some lines in which they take different overall
truth values) and that they each differ from the truth table that we constructed for the
real airport sentence that we have been considering.
A3(C): Tautologies, Contradictions and Contingent Sentences
We will return to our principal concern, validity of inference, soon. But first we will
take what will seem like a digression but which will turn out to be useful as concerns
validity.
There is a tri-partite distinction regarding single sentences (so we are leaving inferences
aside for the moment). If I tell you 'The Genesis account of the creation of the universe is
true' then I have made a claim about the world; some of you may think it
preposterous, but nonetheless it at least purports to carry some information about the
world. Similarly, if I tell you 'September 11th 2001 was a Tuesday', I tell you something
that may be true or false and, if true (which it is), gives us some real information about
the world (which it does).
If, on the other hand, I tell you just that 'Either the Genesis account of the creation of the
universe is true or that account is not true', or 'Either September 11th 2001 was a
Tuesday or it was not', then what I tell you is true alright; indeed it is guaranteed to be
true, but it is trivially or vacuously true. These sentences could not possibly be false no
matter what the state of the world may be. On the other hand, whether or not the initial
statement that the Genesis account is true or that 9/11 was a Tuesday is true does
depend on facts about the world.
Each of these trivially true sentences formalises (with a different atomic sentence p in
each case) as: p v ¬p. Its truth table is:
p    p v ¬p
T       T
F       T
In fact, the only overall truth value in any row of this table is T. This reflects the fact
that this sentence is true independently of the way the world is; it is true in all possible
worlds. Such sentences are called (truth-functional) tautologies (tautologies, as we
shall see later, are a special case of a more general species of logical truths).
So,
Definition: Tautology
A truth functional compound is a tautology if and only if it takes the truth value true in
all lines of its truth table.
A slightly less obvious tautology than the Genesis or 9/11 ones is:
Either if Farage is a liar then so is Duncan Smith, or Farage is a liar but [and] Duncan
Smith isn't.
(Exercise: Formalise this sentence (you will need, as always, to be careful about
brackets) and then show that its truth table does indeed have the truth value true in all
rows.)
On the other hand, statements like 'The Genesis account of creation is true', or 'Either the
Tories will win the next election or I'm a bad judge of British politics', are, if true at all,
contingently true: their truth depends on, or is contingent on, the way the world is (or
was or will be). This is reflected in the fact that their truth tables have at least one line
'true' and at least one line 'false'. The first statement formalises, of course, just as p. It
has the trivial truth table:
p p
T T
F F
The other can be formalised as p v ¬q (where p stands for 'The Tories will win the next
election' and q for the assertion 'I am a good judge of British politics') and this has the
truth table:
p q    p v ¬q
T T       T
T F       T
F T       F
F F       T
Both of these last two tables, then, have at least one T and at least one F. Some people
find it useful to think of the assignments of truth values to the atomic sentences as
defining possible worlds (e.g. the fourth line specifies the 'possible world' in which the
Tories do not win the next election and I am a bad judge of British politics). Contingent
sentences are ones whose truth value depends on the actual world which line in the
truth table, which possible world, corresponds to the real one. Tautologies, on the
other hand, are sometimes said to be 'true in all possible worlds'.
At the other end of the spectrum from tautologies are statements like 'The Tories will
win the next election and they will not', or 'The Tories will win the next election if and
only if they do not', which are false but which don't just happen to be false: they are
necessarily false. (It is logically impossible that they could be true.) These sentences are
called contradictions. When formalised, their truth tables take 'false' in all lines. Take
for example the second sentence. This has the truth table:
p    p ↔ ¬p
T      F    (¬p is f)
F      F    (¬p is t)
This sentence is logically, or trivially, false "false in all possible worlds". So:
Definition: Contradiction
A truth functional compound is a contradiction if and only if it takes the truth value
false in all lines of its truth table.
We now return to the central concern of logic: the question of when an inference is
valid. You will remember that in giving an intuitive characterisation of the notion of
validity, we invoked ideas of the possible truth/falsity of premises and conclusion. We
are now in a position to give these a rigorous formulation and, as we will see, the
notion of a tautology is involved in the most straightforward characterisation of
validity of inference.
A3(D): Truth-Functional Validity (Again)
Look back over our earlier discussion of the notion of validity. We arrived at the idea
that an inference is valid iff (remember this stands for: if and only if) there is no
counterexample to it, where a counterexample is defined as an inference of the same
logical form that has true premises and a false conclusion. We can now clarify these
definitions further.
First of all, the form of an inference is specified, for the range of inferences we are
currently considering, by its truth-functional formalisation. Two different inferences
have the same form if they produce the same symbolic schema when formalised. So
consider again the two inferences discussed earlier:
A:
1. Either the Edinburgh train leaves from Euston or it leaves from Kings Cross.
2. It does not leave from Euston.
So, it leaves from Kings Cross.
And:
B: the earlier inference about who composed Tosca.
When formalised, both produce the same symbolic schema:
1. p v q
2. ¬p
So, q
Hence inferences A and B, while of course different inferences (one about rail stations
and the other about composers), nonetheless have the same form.
If we construct a joint truth table for the three symbolic sentences concerned, we
obtain the following:
p q    p v q    ¬p    q
T T      T       F    T
T F      T       F    F
F T      T       T    T
F F      F       T    F
where I have, in order to make the point, repeated the column for q, since this is the
conclusion of the inference.
Inspect this joint truth table closely: you will see that there are lines in it in which the
conclusion q is false (lines 2 and 4), and lines in which one or other of the premises is
false. However, there is no line in the truth table in which all the premises are true and
the conclusion is false. That is, there is no counterexample, and this in turn means that
both the train inference and the Tosca inference are valid.
Now consider, by way of contrast, two further inferences, the first a slightly simplified
version of our earlier Agatha Christie example and the second the one about Joe
DiMaggio:
C:
1. If the Butler did it then he had a motive.
2. The Butler had a motive.
So, the Butler did it.
D:
1. If Joe DiMaggio was President of the US then he was a US citizen.
2. Joe DiMaggio was a US citizen.
So, Joe DiMaggio was President of the US.
These are inferences of the same form, as is shown by the fact that they produce the
same symbolic representation:
1. p → q
2. q
So, p
But unlike A and B, these are both invalid inferences. In fact, D is a case in which the
premises are true (in the 'real world') and the conclusion false (Joe DiMaggio was a US
citizen but not president, so both premises are true (in the real world) and the
conclusion false). So D supplies a counterexample both to C and (trivially) to itself. More
formally, if we again draw up a joint truth table for the two premises and the conclusion
of the symbolic representation of either C or D we obtain:
p q    p → q    q    p
T T      T      T    T
T F      F      F    T
F T      T      T    F
F F      T      F    F
where I have again, redundantly, repeated q and p to the right since they appear as
second premise and conclusion of the inference.
We see that in this case, unlike the case of the symbolic representation of either
inference A or B, there is a line in this truth table (namely the third) in which both
premises are true and yet the conclusion is false. Hence there is a counterexample
to either of the inferences C and D. It may be true in the real world (in this case the
real world of the fictional novel!) that the Butler did it, but that he did does not follow
from the fact that if he did it then he had a motive, and he had a motive. There is a
possible world in which both premises are true and the conclusion false, or, putting it
in a more down to earth way, there is an assignment of truth values to the atomic
components of the inference, given by the third row, i.e. p:F, q:T, in which both
premises are true and the conclusion false.
Put precisely, we have arrived at the following definition of (truth functional) validity:
Definition: Validity
An inference is (truth-functionally) valid if and only if there is no assignment of truth
values to its atomic components (no line in the joint truth table of its premises and
conclusion) in which all the premises are true and the conclusion false.
As an old, and very boring, school teacher of mine used to say endlessly, this definition
should be 'read, learned and inwardly digested'.
A3(E): Decision Procedures for Truth-Functional Validity
The above considerations not only tell us what it means for an inference to be (truth-
functionally) valid, they also indicate one method of deciding whether a given inference
is valid or invalid. In fact, in the case of inferences in the language of truth-functional
logic, we are in the happy position of being able to specify several different algorithms
or mechanical decision procedures for ascertaining validity or invalidity. (They are
called algorithms because they are guaranteed always to deliver the correct answer
when applied to any inference.)
The first method is essentially the one that we just used: write out the joint truth table
for the premises and the conclusion of the inference, and inspect it row by row for a
line in which all the premises are true and the conclusion false; if there is such a line
the inference is invalid, and if there is not, it is valid.
Clearly this is indeed an algorithm; that is, it is bound to give an answer in all cases.
The premises and conclusion of any inference, no matter how complicated, involve
only finitely many atomic sentences and hence the joint truth table will have only
finitely many rows (in fact, as we already noted, it will have 2ⁿ rows where n is the
number of atomic components). We then just need to look through all the rows and will
either find a row in which all the premises are true and the conclusion false (in which
case the inference is invalid) or we will get to the end of the 2ⁿ rows without finding one
with this property (in which case the inference is valid).
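This brute-force search is easy to mechanise. Here is a small Python sketch (my own illustration, not part of the notes); representing sentences as Python functions of the atoms is just one convenient choice:

from itertools import product

def valid(premises, conclusion, n_atoms):
    # Search every row of the joint truth table for a counterexample:
    # a row making all the premises true and the conclusion false.
    for row in product([True, False], repeat=n_atoms):
        if all(premise(*row) for premise in premises) and not conclusion(*row):
            print("counterexample found:", row)
            return False
    return True

# Inference A/B:  p v q,  ~p,  therefore q  -- no counterexample, so valid.
print(valid([lambda p, q: p or q, lambda p, q: not p],
            lambda p, q: q,
            2))                                    # prints True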
The method can be given a rather more elegant form by considering something called
the associated conditional of the inference:
1. The associated conditional of an inference is the single sentence, conditional in
form, which has the conjunction of the premises as antecedent and the conclusion
as consequent. For example, the associated conditional of inference A (or B) is
((p v q) & ¬p) → q, and that of C (or D) is ((p → q) & q) → p.
2. Construct the truth table for the associated conditional.
3. Note whether the associated conditional is a TAUTOLOGY (i.e. all lines in the table
yield T) or NOT (at least one line is F).
4. If it is a tautology, the inference is VALID; if it is not, the inference is INVALID.
It is easy to see that this just is another way of checking whether or not there is a line in
the joint truth table which makes all premises true and conclusion false, by thinking a
bit about step 4: any conditional P → Q (we here, and from now on, use capital Ps and
Qs etc. whenever these may themselves be compound sentences, reserving lower case
ps and qs etc. for atomic sentences) is false just in case P is true and Q is false, and since
a conjunction is true just in case all its conjuncts are, the associated conditional for an
inference will have at least one F in its truth table (i.e. will not be a tautology) if and
only if there is at least one assignment of truth values to the atomic components (i.e. at
least one line in the truth table) which makes all the premises of the inference true and
its conclusion false, that is, if and only if the inference is invalid.
So if you have done the latest exercise correctly you will have found that the associated
conditional for either inference A or B, viz. ((p v q) & ¬p) → q, is a tautology (reflecting
the fact that either inference is valid: there is no assignment of truth values to atomic
components that makes all the premises true and the conclusion false). Meanwhile, the
associated conditional for either inference C or D, viz. ((p → q) & q) → p, is not a
tautology (note carefully that the lines in that conditional's truth table which show that
it is not a tautology, i.e. in which it takes the value F, make both premises of the original
inference true and its conclusion false). So, C and D are invalid.
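The tautology test for associated conditionals can be run mechanically in the same style (again a sketch of my own, not part of the notes):

from itertools import product

def tautology(sentence, n_atoms):
    # A sentence is a tautology iff it is true on every assignment to its atoms.
    return all(sentence(*row) for row in product([True, False], repeat=n_atoms))

imp = lambda a, b: (not a) or b    # the conditional

# ((p v q) & ~p) -> q  -- associated conditional of A/B
print(tautology(lambda p, q: imp((p or q) and not p, q), 2))    # True: valid
# ((p -> q) & q) -> p  -- associated conditional of C/D
print(tautology(lambda p, q: imp(imp(p, q) and q, p), 2))       # False: invalid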
Since the truth table for any compound proposition with n atomic components has 2ⁿ
rows in it, the truth table method soon becomes fairly unwieldy with inferences of any
complexity. Fortunately, it is possible to take a short cut by using the time-honoured
method of 'indirect proof' or 'reductio ad absurdum': to prove that P is the case, assume ¬P
and derive a contradiction; this must mean that P is true, since if ¬P entails a
contradiction it cannot be true. This sounds complicated, but in fact turns out to be
much quicker and more direct than the truth table method (especially, as noted, when
many atomic sentences are involved). Let's first see the method at work and then
describe it in general.
1. p v q
2. ¬p
So, q
This is a valid inference form. The 'no counterexample method' of showing this
proceeds as follows:
1. Assume that the inference is INVALID; i.e. that there is an assignment of truth
values to the atomic components for which all the premises are true and the
conclusion is false.
2. Since q is the conclusion, we're assuming that q is false.
3. Since ¬p is a premise, and we are assuming all premises are true, ¬p must be
true and therefore p false.
4. But assumptions (2) and (3) together mean that the first premise (p v q) must
be false (by the truth table for disjunction) and this CONTRADICTS our
assumption that all premises were true.
5. Hence, the assumption that the inference is invalid has proved untenable (it
turns out to impose contradictory, and therefore unfulfillable requirements),
and so we conclude that the inference is VALID.
1. p → q
2. q
So, p
1. Assume the inference is INVALID; i.e. that there is a case in which all premises
are true and conclusion false.
2. p is the conclusion so p is false.
3. q is a premise so q is true.
4. This means that p is false and q is true in the first premise (p → q), but that's OK,
since by the truth table for the conditional F → T is true.
5. So, in this case, our assumption that the inference is invalid has not led to a
contradiction; therefore the inference is invalid, and the method has in fact led
us to an actual COUNTEREXAMPLE: viz. an assignment of truth values to the
atoms (in fact p:F, q:T) which indeed makes all premises true and the conclusion
false.
In general, then, the method runs as follows:
1. Assume that the inference is INVALID, i.e. that there is an assignment of truth
values to the atomic components which makes all the premises true and the
conclusion false.
2. Work out what this assumption requires by way of assignments of truth values
to the atomic components.
3. You will either be led to a contradiction, that is, you won't be able consistently
to assign truth values to atomic components given the assumption of invalidity,
or you won't be led to a contradiction.
If you are led to a contradiction, then the inference cannot be invalid and therefore
is valid.
If you are not led to a contradiction, then the inference is invalid (and you will in
the process have constructed a counterexample).
Obviously with only two atoms, the no counterexample method is not greatly more
efficient than the truth table method. But it does come into its own with more
complicated inferences. Consider, for example, the moderately more complicated
inference below (which has 5 atoms and therefore a 32-line truth table):
1. (p & q) → (r → s)
2. s v t
Or consider the following inference:
1. ((p → q) v (r → s))
2. t → s
So, q & t
The third algorithm for truth functional validity is really just a systematic and more
elegant version of the 'no counterexample' method.
The semantic tree method systematically explores whether it is possible for a given set
of sentences to be true together (that is, whether there is at least one assignment of
truth values to atomic components that makes all the sentences in that set true). To
apply the semantic tree method to the question of whether a given inference is valid,
therefore, we apply it to the set consisting of the premises of the inference together
with the negation of its conclusion.
The basic idea of the tree method is that whenever there is more than one way for a
sentence to be true, this is signified by a branching of possibilities the tree branches
at that point. For example, for a sentence of the form 'P v Q' to be true either P can be
true or Q (or of course both, given that 'v' is 'inclusive or'). This is indicated by the basic
schema for disjunction (remember we are using capital letters to indicate that the
sentences concerned may be compound and hence have further truth-functional
structure):
(i)    P v Q
      /     \
     P       Q
On the other hand, for 'P & Q' to be true, both P and Q must be true. There is no
branching of possibilities, and hence the basic schema for a conjunction is:
(ii) P&Q
P
Q
What about sentences involving the other connectives that we have introduced? Well,
the sentence 'P → Q' is, it turns out, equivalent to '¬P v Q'. (Exercise: As we shall note in
more detail later, logical equivalence means, in the case of truth functional logic, having
exactly the same truth table. Show that 'P → Q' and '¬P v Q' do indeed have the same
truth table.) Given this equivalence and the basic splitting idea, the schema for the
conditional must be:
(iii)   P → Q
       /     \
     ¬P       Q
As for biconditionals, a sentence of the form P ↔ Q basically says that the two
sentences P and Q have the same truth value, i.e. there are two possibilities: both true
and both false; and in order for a sentence to be false, its negation must of course be true.
So, we have the schema:
(iv)   P ↔ Q
      /      \
     P        ¬P
     Q        ¬Q
Next we need a set of rules for negated sentences. Again these can easily be
constructed by thinking about what the negated sentences mean and applying the basic
idea of how many different ways such a sentence might be true.
So, for example, in order for a sentence of the form ¬(P v Q) to be true, P v Q itself must
of course be false, and there is only one way for that to come about: namely, both P and
Q must be false (think about the truth table for P v Q, if this isn't already obvious).
Hence we have the rule:
(i)   ¬(P v Q)
          |
         ¬P
         ¬Q
Similarly, for ¬(P & Q) to be true, P & Q must be false, but here there are two ways in
which that can happen: either P is false or Q is false (or of course both, but we don't need
to take that into account; the fork in the tree method in effect represents inclusive or).
So we have the rule:
(ii)   ¬(P & Q)
       /      \
     ¬P        ¬Q
As for ¬(P → Q), the only way for a conditional to be false, and hence for ¬(P → Q) to be
true, is for the antecedent to be true and the consequent to be false. So we have the
rule:
(iii)   ¬(P → Q)
            |
            P
           ¬Q
As for a negated biconditional, there are two ways in which a biconditional can be false
(and hence its negation true), namely the two ways in which P and Q can have
different truth values; so we have:
(iv)   ¬(P ↔ Q)
       /      \
      P        ¬P
     ¬Q        Q
Then finally we have the obvious rule for double negation: the only way for a sentence
of the form ¬¬P to be true is for ¬P to be false, i.e. for P to be true.
(v)   ¬¬P
        |
        P
The idea behind the tree method is to keep on applying the above rules until we are left
with simple sentences either atomic sentences or the negations of atomic sentences.
The method will, as we shall see, systematically lead us to a counterexample to any
inference, if such a counterexample exists.
Example 1:
1. p → (q v r)
2. ¬p
Therefore, ¬(q v r)
As I said, a counterexample to this inference would be a set of truth value assignments
to the atoms which makes the premises true and the conclusion false, that is, one that
makes the premises and the negation of the conclusion true. The tree method will tell
you whether it is possible for any set of sentences to be true together. So in this case we
apply it to the following list:
1. p → (q v r)
2. ¬p
3. ¬¬(q v r)
[Tree: sentence 3 gives q v r by the double negation rule (v); sentence 1 splits into ¬p
and q v r by schema (iii); and each q v r splits into q and r by schema (i). Every branch
remains open, and one open branch, marked A, contains ¬p, q and r.]
Notice that all the information has now been processed and every branch of the tree
remains open.
If we now look back up any of the branches in this tree from the bottom to the top, take
for example the one marked A, we can read off an assignment of truth values that
provides a counterexample to our inference, by applying the obvious rule:
Wherever an atom appears unnegated on the branch, assign it the truth value true; and
wherever an atom appears negated, assign it the truth value false.
So, looking along branch A we have ¬p, r and q. Hence the truth value assignment at
issue is p:F, q:T, r:T. (Exercise: check that this does indeed supply a counterexample to
our inference).
Example 2:
1. p → (q v r)
2. ¬(q v r)
So, ¬p
To apply our semantic tree method to it, to try to find a counterexample we list the set
of sentences that would all have to be true to provide such a counterexample, i.e.
1. p → (q v r)
2. ¬(q v r)
3. ¬¬p
[Tree: sentence 3 gives p by the double negation rule (v); sentence 2 gives ¬q and ¬r by
rule (i); sentence 1 splits into ¬p (branch A) and q v r, which in turn splits into q
(branch B) and r (branch C).]
Once again we have decomposed all the sentences we are interested in, and no further
decomposition is possible.
Now suppose that we try to do with this inference what we did with inference 1; that is,
try to read a counterexample off any of the branches. We would get into trouble every
time. Start with the branch marked A: we have ¬p on it, so we should write p:F, but
reading up towards the top we also have p, which would require the inconsistent
assignment p:T. On branch B we have q, so we should assign T to q, but then nearer the
top we have ¬q, so we should assign F to q; finally, on branch C we have r and so r:T, but
also ¬r and so r:F!
Clearly if an atomic sentence and its negation both appear on any branch then we
cannot read a consistent counterexample to the inference off that branch. Such a branch
is called CLOSED (and we indicate the point at which it closes by drawing a double line
under it). A branch that does not close, even though all the information has been
processed, is, not surprisingly, called an open branch. We can only read
counterexamples to an inference off an open branch.
Hence we have the fundamental result that if all the branches of a tree for an
inference close, then no branch can supply a counterexample, so there is no
counterexample and so the inference is VALID.
(I am here presupposing a result about the semantic tree method, namely that it is
bound to find a counterexample if there is one. This seems intuitively obvious (and is
indeed true) but it does need a proof, which I shan't pause to provide at this stage.)
A tree in which all branches close is, again unsurprisingly, called a closed tree, so our
fundamental result can be re-expressed as:
The inference from some set of premises to a conclusion C is valid iff the semantic
tree for the set consisting of the premises and ¬C is closed.
There are some extra wrinkles about the tree method, particularly with respect to
economy measures: you don't want to let your tree get unwieldy (too bushy!), and the
aim should always be to keep the branching to a minimum. So the general message is to
use the information where the rules lead to no splitting (as in schema (ii) or the rule (i)
for negated disjunctions) first. But this is best learnt through practising with particular
trees rather than attempting to lay down particular guidelines.
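For the curious, the branching procedure itself can be programmed. The sketch below is entirely my own illustration (in Python; representing formulas as nested tuples is just one possible choice). It applies the schemas above until only atoms and negated atoms remain, and reports an open branch if it finds one:

def expand(branch):
    # Apply one tree rule to the first non-literal formula on the branch.
    # Return a list of child branches, or None if only literals remain.
    for f in branch:
        if isinstance(f, str):                        # atom: a literal
            continue
        if f[0] == 'not' and isinstance(f[1], str):   # negated atom: a literal
            continue
        rest = [g for g in branch if g is not f]
        op = f[0]
        if op == 'and':
            return [rest + [f[1], f[2]]]                            # no branching
        if op == 'or':
            return [rest + [f[1]], rest + [f[2]]]                   # schema (i)
        if op == 'imp':
            return [rest + [('not', f[1])], rest + [f[2]]]          # schema (iii)
        if op == 'iff':
            return [rest + [f[1], f[2]],
                    rest + [('not', f[1]), ('not', f[2])]]          # schema (iv)
        # otherwise op == 'not' with a compound inside: negated-sentence rules
        g = f[1]
        if g[0] == 'not':
            return [rest + [g[1]]]                                  # rule (v)
        if g[0] == 'and':
            return [rest + [('not', g[1])], rest + [('not', g[2])]] # rule (ii)
        if g[0] == 'or':
            return [rest + [('not', g[1]), ('not', g[2])]]          # rule (i)
        if g[0] == 'imp':
            return [rest + [g[1], ('not', g[2])]]                   # rule (iii)
        if g[0] == 'iff':
            return [rest + [g[1], ('not', g[2])],
                    rest + [('not', g[1]), g[2]]]                   # rule (iv)
    return None

def closed(branch):
    # A branch closes when some atom appears both negated and unnegated.
    atoms = {f for f in branch if isinstance(f, str)}
    negated = {f[1] for f in branch
               if isinstance(f, tuple) and f[0] == 'not' and isinstance(f[1], str)}
    return bool(atoms & negated)

def open_branch(branch):
    # Return a fully decomposed open branch if one exists, else None (tree closes).
    if closed(branch):
        return None
    children = expand(branch)
    if children is None:
        return branch
    for child in children:
        found = open_branch(child)
        if found is not None:
            return found
    return None

# Example 1 above: p -> (q v r), ~p, together with the negated conclusion ~~(q v r).
tree = [('imp', 'p', ('or', 'q', 'r')), ('not', 'p'),
        ('not', ('not', ('or', 'q', 'r')))]
print(open_branch(tree))
# prints an open branch such as [('not', 'p'), ('not', 'p'), 'q'], from which we
# read off the counterexample p:F, q:T (r may take either value).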
A3(F): Consistency
Although the central problem of logic has been taken to be that of characterising
validity of inference, an almost equally important matter with which logic can deal is
that of the consistency of a set of statements. The question often arises in mathematics,
the sciences and the social sciences of whether it is consistent to make the assumption
A' given that assumptions A1, ..., An have already been made. Moreover, in ordinary
debate it is not unusual to charge someone with holding views that might separately be
tenable but which taken together are inconsistent. Logic can supply a precise
characterisation of the notions of consistency and inconsistency. (It turns out, as we
shall see, that these notions have very close connections with the notions of validity
and invalidity of inference indeed in a sense they are identical notions.)
The basic idea again involves the notion of 'possible truth': a set of sentences is
consistent if it is possible for them all to be true together (though they may all as a
matter of fact be false). Restricting ourselves to truth functional logic, this translates into
the following:
Definition: Consistency
A set of sentences (in the language of truth-functional logic) is consistent if and only if
there is at least one assignment of truth values to the atomic components which makes
all the sentences in the set true; otherwise the set is inconsistent.
Example 1:
{If the Butler is guilty then so is the Chauffeur; the Chauffeur is guilty and the Parlour
Maid is not guilty; either the Butler is not guilty or the Parlour Maid is guilty.}
The set symbolises as {p → q, q & ¬r, ¬p v r}, where p stands for the assertion that the
Butler is guilty, q for the assertion that the Chauffeur is guilty and r for the assertion
that the Parlour Maid is guilty. In fact, the following assignment of truth values makes
these three sentences all true: p:F; q:T; r:F. (Exercise: check this using truth tables.)
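Such a check is again mechanical. A brute-force Python sketch (my own illustration) for this first example:

from itertools import product

imp = lambda a, b: (not a) or b
sentences = [lambda p, q, r: imp(p, q),          # p -> q
             lambda p, q, r: q and not r,        # q & ~r
             lambda p, q, r: (not p) or r]       # ~p v r

for p, q, r in product([True, False], repeat=3):
    if all(s(p, q, r) for s in sentences):
        print("consistent, e.g. p:%s q:%s r:%s" % (p, q, r))
        break
else:
    print("inconsistent")
# prints: consistent, e.g. p:False q:True r:False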
Example 2:
{If the Butler is guilty then so is the Chauffeur; either the Chauffeur is not guilty or the
Parlour Maid is guilty; the Parlour Maid is not guilty, but the Butler is guilty.}
This set symbolises as {p → q, ¬q v r, ¬r & p} and, unlike the first set, it is inconsistent:
no assignment of truth values to p, q and r makes all three sentences true together.
(Exercise: show that this follows from the basic definition of consistency in terms of
assignments of truth values to atomic sentences.)
Example 3:
If I already assume that 'The price of good G in economy E rises ifeither the demand for
G rises or there is inflation in E' and that 'If there is tight monetary control in E then
there is no inflation in E', and I know that in fact 'The price of G in E has risen', would it
be CONSISTENT also to hold that 'The demand for G has not risen and there has been
tight monetary control'?
The first three assumptions symbolise as the set
A = {p → (q v r), s → ¬r, p}
and the further claim as the single sentence
a = ¬q & s.
(Exercise: make sure you check exactly which sentences p, q, r and s symbolise here.)
The question of whether it is consistent to add a to the set A reduces to the question of
whether the set
A ∪ {a} = {p → (q v r), s → ¬r, p, ¬q & s}
is consistent.
(Exercise: the answer is that it is not. Again, try a few truth value assignments; you
won't succeed in making all the sentences true.)
A3(G): Demonstrating Consistency and Inconsistency
So now we know what it means for a set of sentences to be consistent and what it
means for such a set to be inconsistent. How can we actually go about deciding which a
particular set of sentences is? One method, essentially the equivalent of the truth table
method of deciding validity or invalidity of inference, would be trial and error: try out
all the various possible truth value assignments (write out the whole truth table) and
see if any works. If at least one works (if there is at least one line in the joint truth table
in which all sentences take the truth value true) then the set is consistent; if none
works (no such line in the truth table) then the set is inconsistent.
A neat and systematic method, however, is to use semantic trees. In this case we want
to know if a given set of sentences might possibly all be true together. So we first list
the various sentences in the set whose consistency is in question (N.B. we do not
negate anything, as we did in deciding the in/validity of inferences, where we negated
the conclusion), and secondly, we construct a tree.
(a) IF ANY BRANCH REMAINS OPEN THE SET IS CONSISTENT AND AN ASSIGNMENT
DEMONSTRATING ITS CONSISTENCY CAN BE 'READ OFF' THE OPEN BRANCH; while
(b) IF THE TREE CLOSES (NO OPEN BRANCHES) THEN THE SET IS INCONSISTENT.
Example 1:
1. p → q
2. q & ¬r
3. ¬p v r
      |
      q
     ¬r                 [from 2 by schema (ii)]
    /     \
  ¬p       q            [from 1 by schema (iii)]
  /  \    /  \
¬p    r  ¬p   r         [from 3 by schema (i)]
      =       =
All the information has now been exhausted and branches remain open. Hence the set
of sentences is consistent and by following either of the two open branches we can
read off an assignment that shows this (in this particular case we get the same
assignment viz. p:F, q:T, r:F from both of the open branches, indicating that this is
the only assignment which demonstrates consistency). (Exercise: Check this by
substituting these truth values in the sentences 1, 2 and 3, and seeing that they all turn
out true.)
Example 2:
1. p → q
2. ¬q v r
3. ¬r & p
      |
     ¬r
      p                 [from 3 by schema (ii)]
    /     \
  ¬p       q            [from 1 by schema (iii)]
   =      /  \
        ¬q    r         [from 2 by schema (i)]
         =    =
Here all the branches close and hence the set of sentences is inconsistent.
One important wrinkle: You will sometimes find, in using the tree method either to
establish in/consistency or in/validity, that the tree you construct has open branches
on which one of the atomic sentences fails to appear altogether, that is, it appears
neither negated nor unnegated. Here is a simple example:
1. p → (q v r)
2. ¬q
    /       \
  ¬p        q v r        [from 1 by schema (iii)]
   A         /  \
            q    r        [by schema (i)]
            =    B
Here we have, even when all the information has been taken into account, two open
branches (which I have labelled A and B respectively). So the set of sentences is
certainly consistent. Which truth value assignments show this?
On branch A we have ¬p and ¬q, so p and q must both be false. But how about r? It does
not appear at all. The answer is that if it doesn't appear then it doesn't matter:
either truth value for r will do, so long as p and q have the truth values specified.
That is, branch A in fact supplies two assignments which demonstrate consistency: p:F
q:F r:T and p:F q:F r:F. So long as p and q are both false, the sentences in the set are
going to be true whatever truth value r has. (Exercise: check directly by substituting
into the set of sentences that both truth value assignments make all (both) the
sentences true.) Note that it is of course important that this be a really open branch,
that is, that all the information has been put on in the (failed) attempt to close the
branch. The above result does not, of course, hold for a branch that remains open in an
incomplete tree.
Similarly for branch B we have r:T and q:F. But p does not appear. Again, this means
that p can be either T or F (given that r:T and q:F). So again we have two assignments
which demonstrate consistency: p:T q:F r:T and p:F q:F r:T.
This result is quite general and applies equally well when using semantic trees to
decide validity or invalidity of inference. Strictly speaking, we need a proof of the
result, but again we shall not pause to supply it here.
A3(H): The Connection Between (In)Consistency and (In)Validity
By reflecting on the two different uses we have made of semantic trees, it is easy to see
the connection between the notion of consistency and that of validity of inference.
From this new point of view (i.e. from the point of view of deciding whether a given set
of sentences is consistent) what we were doing earlier was deciding whether the set of
sentences formed by the premises together with the negation of the conclusion is
consistent. (Remember that in applying semantic trees in deciding validity of inference
we constructed a tree for the premises and the negation of the conclusion.) If that set is
indeed consistent we concluded that the inference is invalid, while if the set is
inconsistent we concluded that the inference is valid.
Hence we have the following important connection between the two notions: If P is a
set of sentences and C is a single sentence (all in the language of truth functional logic)
then the set of sentences P ∪ {¬C} (that is, simply the set of sentences formed by adding
¬C to the original set P) is inconsistent if, and only if, the inference from P as a set of
premises to C as conclusion is valid.
That is, an inference is valid iff it would be inconsistent to assume that the conclusion is
false while assuming that the premises are true.
(Important Exercise: We arrived at this connection by thinking about the two uses of
the semantic trees method; but show that it must indeed hold in virtue of the basic
definitions of validity and consistency in terms of truth value assignments to the
atoms.)
A3(I): Independence
In scientific and other intellectual disciplines, the question often arises of whether or
not some particular assumption is independent of another set of assumptions. This is
also a question that comes up in more ordinary, argumentative, situations such as in
politics or the law. Someone might, for example, be criticised for holding a certain view
and defend herself by saying That is not my view and it is quite independent of the
position I did assert'. One of the most celebrated questions in the history of
mathematics was that of whether Euclid's 5th (or Parallel) Axiom is or is not
independent of the rest of his axioms. The 5th axiom states that for any line and any
point outside the line there is one and only one line parallel to the given line through
the given point. The question was whether this is an independent assumption
(independent that is of the other axioms that Euclid laid down) or whether, on the
contrary, the truth or falsity of the parallel axiom had already in effect been decided by
the other axioms that is, the assumptions about points and lines that Euclid had
already made. (The general view was that the parallel axiom was so obviously correct
that it must follow from the other axioms and therefore not be independent.) The
question troubled mathematicians for over 2,000 years.
In ordinary cases at least, the question is usually that of whether or not the truth of the
assumption in question is in fact already implied by other assumptions, but we would
also surely say that if those other assumptions entailed the falsity of the statement,
then that statement was not independent of that set of statements. Hence we get the
following precise characterisation:
Definition: Independence
A single sentence a is independent of a set of sentences A if and only if neither the
inference from A as premises to a as conclusion nor the inference from A as premises to
¬a as conclusion is truth-functionally valid.
A3(J): Demonstrating Independence
We saw earlier that an inference is invalid iff there is a counterexample to it. So, for
neither the inference from A to a nor the inference from A to ¬a to be valid, there have
to be TWO counterexamples (that is, two truth value assignments to the atomic
propositions):
(a) one in which all of the sentences in A are true and a is true, and
(b) one in which all of the sentences in A are true and a is false.
Since case (a) is one in which all the sentences in A are true and so is a, while case (b) is
one in which all sentences in A are true but ¬a is true, i.e. a is false, we can see that:
The single sentence a is independent of a set of sentences A iff A ∪ {a} and A ∪ {¬a}
are both consistent sets of sentences.
Since we know how to decide the consistency of a set of sentences (by producing the
relevant semantic tree) we see that the independence of a from A can be decided as
follows:
1. Construct the semantic tree for the set of sentences A ∪ {a}; if it closes then a is
NOT independent of A.
2. If that tree remains open, then construct another tree for A ∪ {¬a}. If that second
tree closes then again a is NOT independent of A, but if it too remains open then
a IS independent of A.
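The same two-step test is easy to run by brute force rather than by trees; a Python sketch (my own illustration, with made-up example sets):

from itertools import product

def consistent(sentences, n_atoms):
    return any(all(s(*row) for s in sentences)
               for row in product([True, False], repeat=n_atoms))

def independent(A, a, n_atoms):
    # a is independent of A iff both A + {a} and A + {~a} are consistent.
    not_a = lambda *row: not a(*row)
    return consistent(A + [a], n_atoms) and consistent(A + [not_a], n_atoms)

imp = lambda x, y: (not x) or y

print(independent([lambda p, q: p or q], lambda p, q: q, 2))
# True: q is independent of {p v q}
print(independent([lambda p, q: imp(p, q), lambda p, q: p], lambda p, q: q, 2))
# False: q follows from {p -> q, p}, so it is not independent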
Example 1:
Is p independent of the set {p → (q & (r v s)), ¬r, s ↔ q}?
To decide whether p is consistent with this set, take the sentences 1-4 below and
proceed as follows:
1. p → (q & (r v s))
2. ¬r
3. s ↔ q
4. p
[Tree: sentence 1 splits into ¬p (which closes against 4) and q & (r v s); on the open
branch q and r v s are added by schema (ii), r v s splits into r (which closes against 2)
and s, and sentence 3 is decomposed by schema (iv), with the ¬s, ¬q branch closing
against q.]
One branch remains open (the branch marked A) and so the set is consistent, as shown
by the truth value assignment that we can read off A (p:T, q:T, r:F, s:T).
So next, to decide whether ¬p is consistent with our set of sentences, take the different
set of sentences 5 to 8 below and proceed as follows:
5. p → (q & (r v s))
6. ¬r
7. s ↔ q
8. ¬p
[Tree: sentence 5 splits into ¬p and q & (r v s); on the right-hand branch q and r v s are
added by schema (ii), and r v s splits into r (which closes against 6) and s; sentence 7 is
then decomposed by schema (iv) on every open branch, closing the branch that
acquires ¬q alongside q. Three branches, marked A, B and C, remain open.]
(Remember: all the information must go on all open branches. Sentence 7 had not yet
been decomposed on the right-hand branch created at the first split, and hence it must
be decomposed there too: we must give the tree the best chance to close.)
We have three open branches (marked A, B and C) which between them supply two
different assignments showing the consistency of sentences 5-8 (p:F, q:T, r:F, s:T and
p:F,q:F, r:F, s:F).
These two trees together hence establish that p is independent of {p → (q & (r v s)),
¬r, s ↔ q}.
Example 2:
Is p independent of {p ↔ q, ¬q}?
1. p ↔ q
2. ¬q
3. p
    /     \
   p       ¬p
   q       ¬q           [from 1 by schema (iv)]
   =        =
Both branches close, so the set {p ↔ q, ¬q, p} is inconsistent; there is therefore no need
to construct a second tree: p is NOT independent of {p ↔ q, ¬q}.
A3(K): Using Semantic Trees to Decide the Status of Sentences
We earlier noted that single sentences in truth-functional logic fall into one of three
categories: tautology (all lines in its truth table are T), contradiction (all lines in its
truth table are F) and contingent sentence (at least one T in its truth table and at least
one F). As this characterisation indicates, the straightforward way to decide which of
the three categories a particular sentence falls into is by using truth tables. But for
sentences of any complexity this can be a long process (remember that if there are n
atoms in the sentence, then there are 2ⁿ lines in its truth table). As in the case of
validity of inference, the method of semantic trees can be applied and makes the
decision simpler than writing out the whole truth table.
Here is how the method works (the exercises will give you practice in applying it):
A single sentence S is a tautology iff the semantic tree for ¬S closes (all branches close).
A single sentence S is a contradiction iff the semantic tree for S itself closes (again all
branches close).
A single sentence S is contingent iff neither the tree for S nor the tree for ¬S closes (i.e.
both trees have at least one open branch).
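Both the truth-table version and the tree version of this classification are mechanical; here is a brute-force Python sketch (my own illustration, using truth tables rather than trees):

from itertools import product

def status(sentence, n_atoms):
    values = [sentence(*row) for row in product([True, False], repeat=n_atoms)]
    if all(values):
        return "tautology"
    if not any(values):
        return "contradiction"
    return "contingent"

print(status(lambda p: p or not p, 1))          # tautology     (p v ~p)
print(status(lambda p: p == (not p), 1))        # contradiction (p <-> ~p)
print(status(lambda p, q: p or not q, 2))       # contingent    (p v ~q)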
A3(L): Truth-Functional Equivalence and Interdefinability
Truth functionality:
We now know how to do everything in truth functional logic that we need to: decide on
validity/invalidity of an inference expressed in that logic; decide whether a particular
compound sentence of the logic is a tautology, a contradiction or neither; decide
whether a set of truth functional sentences is consistent or inconsistent; and finally
decide whether a particular sentence a expressed in truth functional logic is or is not
independent of a given set of such sentences A.
In the next couple of sections, we will investigate the system that we have produced
'from the outside', so to speak. That is, we will do a little meta-logic: rather than using
the logic to decide validity or whatever, we will produce some results about that logic.
First, why exactly is the branch of logic we are presently studying called truth-
functional logic? The answer is that any compound sentence, no matter how
complicated, has the following property: its truth value (its truth value in the real
world or in any 'possible world') is a function of the truth values of its atomic
components.
The term function is used here in the mathematician's sense. A function takes objects
one by one from some set of objects and "associates each with" or "maps each onto"
another in some definite way. For example, in elementary arithmetic there is the
doubling function f(x) = 2x which takes any number and maps it onto its double (2 onto
4, 3 onto 6, etc). A function is, if you like, a rule of association one that always yields
an outcome when applied to a particular input.
In the case of logic, any truth functional compound of any degree of complexity is
characterised by a rule associating one of the two truth values (the overall truth value
of the compound) with any given combination of truth values to the atoms. So, e.g., the
truth functional compound (p & q) v r defines the following truth function f:
f(T, T, T) = T
f(T, T, F) = T
f(T, F, T) = T
f(T, F, F) = F
f(F, T, T) = T
f(F, T, F) = F
f(F, F, T) = T
f(F, F, F) = F
The objects which the function applies to (in the jargon: 'takes as arguments') are
ordered triples of truth values (ordered meaning that (T, F, F), for example, is a
different triple from (F, F, T)). The function associates each ordered triple with one
(and of course only one) truth value: it maps every possible triple of truth values onto a
single truth value.
(Exercise: Make sure that you understand that the above truth function is the one
associated with (p & q) v r; and also that you understand that it does no more than
give another way of expressing the information contained in the relevant truth-table.)
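In Python (my own illustration) the truth function determined by (p & q) v r can be written down directly, and tabulating it simply reproduces the list of values above:

from itertools import product

def f(p, q, r):
    # the truth function determined by (p & q) v r
    return (p and q) or r

for triple in product([True, False], repeat=3):
    args = ", ".join("T" if x else "F" for x in triple)
    print("f(" + args + ") = " + ("T" if f(*triple) else "F"))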
More generally we can say that any truth-functional compound involving any number n
of atoms characterises some truth-function f which maps in some definite way each of
the different n-tuples of truth values onto a single truth value.
Truth-functional Equivalence:
The sentence Pavarotti was a great tenor (p) and Bartoli is a great mezzo-soprano (q)
clearly carries the same information, or "says the same thing", as Bartoli is a great
mezzo-soprano (q) and Pavarotti was a great tenor (p). It is also true (though a little
less obvious) that either sentence carries the same information as It's not the case
either that Pavarotti was not a great tenor or that Bartoli is not a great mezzo-soprano.
Although each of these three sentences is linguistically different from the others, so
that they are not the same sentence, they are nonetheless EQUIVALENT sentences:
they carry the same information. (This is sometimes paraphrased as: the sentences
express the same proposition.) If we formalise each of them using the same atoms
throughout, we have:
(p & q)
(q & p)
and
¬(¬p v ¬q)
p  q    p & q    q & p    ¬ (¬p v ¬q)
T  T      T        T      T  f  f  f
T  F      F        F      F  f  t  t
F  T      F        F      F  t  t  f
F  F      F        F      F  t  t  t
The final truth value is the same in all three cases. The three sentences have the same
truth table. (Notice that the first equivalence, that between p & q and q & p, though
simple and obvious, is not trivial: p → q and q → p, for example, have different truth
tables.) This motivates the following definition:
Two truth functional compounds P and Q are truth functionally equivalent (which is
written: P ≡ Q) if and only if they have the same truth table, that is, for any given
assignment of truth values to the atomic propositions, P and Q have the same overall
truth value.
(Important Exercise: (a) Not surprisingly, two compounds P and Q are equivalent iff they
are interderivable, that is, iff either can be validly inferred from the other as premise.
Using the basic definitions of valid inference (given earlier) and of equivalence (given
just now), show carefully that this is true.
(b) Two compounds P and Q are equivalent iff the single sentence 'P ↔ Q' is a tautology.
Again use the definitions to show carefully that this is true. Show also that it follows
from part (a). Would it be enough to say that P ≡ Q iff the single sentence 'P ↔ Q' is true
(rather than tautologically true)?)
We could also have expressed truth functional equivalence in terms of truth functions:
two compounds are equivalent iff they determine the same truth function. So, for
example, p & q and ¬(¬p v ¬q) determine the same truth function f (viz. f(T,T) = T,
f(T,F) = f(F,T) = f(F,F) = F).
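Checking such an equivalence mechanically is just a matter of comparing the two compounds on every assignment; a Python sketch (my own illustration):

from itertools import product

def equivalent(s1, s2, n_atoms):
    return all(s1(*row) == s2(*row)
               for row in product([True, False], repeat=n_atoms))

print(equivalent(lambda p, q: p and q,
                 lambda p, q: not (not p or not q), 2))   # True:  p & q  and  ~(~p v ~q)
print(equivalent(lambda p, q: (not p) or q,
                 lambda p, q: (not q) or p, 2))           # False: p -> q  vs  q -> p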
Truth-Functional Interdefinability:
If two sentences are equivalent, they both 'say the same thing' or 'carry the same
content'. For any sentence of the form P & Q (remember we use capital letters to
indicate that the sentences may themselves be compound, so P might be (p → q) and Q
might be (r & (s → t))) there is an equivalent sentence in which the connective '&' is
eliminated in favour of the connectives '¬' and 'v', viz. ¬(¬P v ¬Q). (So, e.g., ¬(¬(p → q) v
¬(r & (s → t))) is equivalent to (p → q) & (r & (s → t)).)
Similarly, for any sentence of the form P → Q, there is an equivalent sentence in which
the '→' is eliminated in favour of '¬' and 'v', namely ¬P v Q.
(Exercise: Explain carefully why this is true, and notice that this equivalence justifies the
semantic tree rule (iii) given earlier for P → Q.)
Finally, for any sentence of the form P ↔ Q there is also an equivalent sentence in
which the '↔' is eliminated in favour of '¬' and 'v'. First, P ↔ Q is equivalent to (P → Q)
& (Q → P). So, eliminating '→' as indicated earlier, we have P ↔ Q ≡ (¬P v Q) & (¬Q v P).
And then eliminating '&' in favour of '¬' and 'v', we have P ↔ Q ≡ ¬(¬(¬P v Q) v ¬(¬Q v P)),
bracketing again being crucial.
This all means that by applying these equivalences sequentially we can eventually
eliminate all occurrences of all the other connectives (or at least all the connectives
that we know about) in favour of just the two connectives ¬ and v. In other words, for
any sentence whatsoever in the language of truth functional logic that we know about
so far, there is an equivalent sentence using just the connectives ¬ and v.
Example 1:
1. (p & q) → (r → s) ≡ ¬(¬p v ¬q) → (¬r v s)
2. ¬(¬p v ¬q) → (¬r v s) ≡ ¬¬(¬p v ¬q) v (¬r v s)
3. ¬¬(¬p v ¬q) v (¬r v s) ≡ (¬p v ¬q) v (¬r v s)
So, (p & q) → (r → s) ≡ (¬p v ¬q) v (¬r v s)
Example 2:
1. p ↔ (q & r) ≡ p ↔ ¬(¬q v ¬r)
2. p ↔ ¬(¬q v ¬r) ≡ ¬(¬(¬p v ¬(¬q v ¬r)) v ¬(¬¬(¬q v ¬r) v p))
The same story as I have just told for the set of connectives {¬, v} could also be told for
the set {¬, &} and indeed for the set {¬, →}. That is, we could equally well find
equivalents for any sentence using connectives from among {¬, v, &, →, ↔} which only
used {¬, &} or which only used {¬, →}.
Exercise:
(a) Produce equivalents for P v Q, P → Q, P ↔ Q, using only the connectives ¬ and &.
(b) Produce equivalents for P v Q, P & Q, P ↔ Q, using only the connectives ¬ and →.
Single connectives: 'alternative denial' and 'joint denial':
A technical question that arises naturally at this point is: can this process be taken one
stage further? Is there a single connective in terms of which all our other connectives
can be defined?
The answer to this question is yes. Although there is no such connective available in
ordinary English (at least not a direct connective), we can in fact produce two separate
single connectives, characterised by their truth tables, either of which will do the job.
These are 'joint denial' (symbolised '↓') and 'alternative denial' (symbolised '|').
These connectives are defined (as truth functional connectives must be) by their truth
tables:
P Q    P ↓ Q
T T F
T F F
F T F
F F T
(So joint denial is the correct name: P ↓ Q says in effect that P and Q are both false, and
is therefore itself true exactly when they are both false, that is, only in the last line of
the truth table. For this reason, joint denial is sometimes also known as nor, as in
'Neither P nor Q'.)
P Q P|Q
T T F
T F T
F T T
F F T
(So again alternative denial is the right name: P|Q says in effect that at least one of P
and Q is false and hence is itself false only when they are both true.)
Since we know from our recent considerations that the connectives we started with,
viz. {¬, v, &, →, ↔}, can be defined in terms of {¬, v}, all we need to do to show that all
those connectives can be defined in terms of ↓ is to show that there are equivalents for
any sentence of the form ¬P, and for any sentence of the form P v Q, that contain only the
connective ↓. In fact:
1. ¬P ≡ P ↓ P and,
2. P v Q ≡ (P ↓ Q) ↓ (P ↓ Q)
Important Exercises:
(a) Show using truth tables that the equivalences 1 and 2 hold.
(b) Since we also showed that all the connectives can be defined in terms of just {¬,
&}, we could also have proved this result about ↓ by showing that there are
equivalents to both ¬P and P & Q that involve only the connective ↓. We already
know the one for ¬P; find the equivalent for P & Q.
As for alternative denial, '|' (also known in the literature as the Sheffer stroke, after
the logician who discovered it), the following equivalences show the same result for it:
1. ¬P ≡ P | P and,
2. P v Q ≡ (P | P) | (Q | Q)
Important Exercises:
(a) Show using truth tables that the equivalences 1 and 2 hold.
(b) Find an equivalence using only '|' for (P & Q).
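The defining equivalences for both connectives can also be verified by brute force; a Python sketch (my own illustration, where 'nor' and 'nand' simply name the two truth tables given above):

from itertools import product

nor  = lambda a, b: not (a or b)     # joint denial
nand = lambda a, b: not (a and b)    # alternative denial, the Sheffer stroke

for p, q in product([True, False], repeat=2):
    assert (not p) == nor(p, p) == nand(p, p)         # ~P, via each single connective
    assert (p or q) == nor(nor(p, q), nor(p, q))      # P v Q via joint denial
    assert (p or q) == nand(nand(p, p), nand(q, q))   # P v Q via alternative denial
print("all four defining equivalences hold on every assignment")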
Definition: Adequacy:
A set of connectives S is said to be ADEQUATE for truth functional logic iff any truth
function can be characterised by a truth-functional compound where the only allowable
connectives are those in S.
It is important to realise that no set of connectives has been proved adequate for truth
functional logic above. All that was established there is a series of conditional results:
that IF {¬, v, &, →, ↔} is an adequate set of connectives THEN so are {¬, v} and the set
{|} consisting of the single connective '|', etc.
That is, all we know so far is that if, for any truth function, there is a compound which
characterises that function and which uses only connectives from {¬, v, &, →, ↔}, then
for any truth function there is a compound which characterises it which uses only
connectives from {¬, v}, and indeed only from {|}.
But isn't the antecedent in this last conditional sentence just obviously true? After all,
{¬, v, &, →, ↔} are the only connectives we know about, so isn't that set adequate by
definition? Well, these certainly are the only connectives we talked about at the
beginning of this course. But truth functional logic is about ANY WAY of compounding
atomic sentences that is truth functional, that is, in which the truth value of the
compound depends systematically on the truth values of the atoms.
In the exercises you will come across ordinary English connectives which are (or which
can be construed as) truth-functional but are not in the set (not, and, or, if ... then, if and
only if). Three examples are 'but', 'unless' and 'only if'. It happens that in those three
cases there are straightforward equivalents for 'P but Q', 'P unless Q' and 'P only if Q'
which use connectives from the usual list. But how do we know that there aren't other
connectives which are truth functional but which we haven't taken into account and
which have no such equivalents?
Moreover, even if there are no such further connectives in ordinary English, this surely
would just be a peculiarity of that language. For example, it just happens that all the
ordinary connectives (aside from negation) are binary (connecting two propositions,
each of which may itself contain connectives). But one can easily envisage a language in
which one can say straight off that, for example, at least two of three propositions p, q
and r are false. That is, a language in which there is a three-place connective, say 'plink',
where 'Plink John, Jane and Joyce are invited to the party' means in ordinary boring
English that at most one of the three is invited. Again, it turns out that we could easily
produce an equivalent using the connectives we know about. The simplest is:
¬(p & q) & ¬(p & r) & ¬(q & r).
But what guarantee do we have that any such imaginary connective (only imaginary in
English, of course, perhaps real in other natural languages) has equivalents using only
connectives from our list?
The answer is that so far we have no such guarantee. But we can in fact produce one via
an important theorem about truth functional logic called the disjunctive normal form
theorem. (We are now, remember, considering results about logic, rather than for
example using it to decide in/validity of inferencesin ordinary language, so we are
briefly in the domain of meta-logic.)
A3(M): The Disjunctive Normal Form Theorem
The easiest way to understand this result is by starting with a simple particular case of
a truth functional compound, let's (arbitrarily) say ¬(p → (q & r)). Next, construct the
truth table for this:
p q r    ¬(p → (q & r))
T T T F
T T F T
T F T T
T F F T
F T T F
F T F F
F F T F
F F F F
We can use this table, or indeed any such table, to construct an equivalent sentence
involving just the connectives ¬, v and & in a completely mechanical way, as follows:
take each line of the table in which the sentence has the overall value T; for that line
form the conjunction of all the atoms, writing the atom itself if it takes the value T in
that line and its negation if it takes the value F; finally, form the disjunction of all these
conjunctions.
Following this rule for the above truth-functional compound we get the following three-
fold disjunction (lines 2, 3 and 4 are the lines that take the value T):
(p & q & ¬r) v (p & ¬q & r) v (p & ¬q & ¬r).
This is called the disjunctive normal form of our original sentence ¬(p → (q & r)). If
you think about it, this disjunction is bound by construction to be equivalent to the
original. This is because it is bound to have the same truth table: each of its constituent
conjuncts (e.g. (p & q & ¬r)) is true precisely in the line from which it was constructed
(and only in that line); hence this new sentence takes the value T in just the same three
lines as the original sentence (since T v F v F, for instance, is T), and it takes F in all
other lines (since the disjunction is F v F v F (= F) in all those other lines).
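The recipe is mechanical enough to program; a Python sketch (my own illustration) that reads the truth table of a sentence and builds one conjunct per true line:

from itertools import product

def dnf(sentence, atoms):
    disjuncts = []
    for row in product([True, False], repeat=len(atoms)):
        if sentence(*row):
            literals = [a if v else "~" + a for a, v in zip(atoms, row)]
            disjuncts.append("(" + " & ".join(literals) + ")")
    return " v ".join(disjuncts) if disjuncts else "<no true lines: a contradiction>"

imp = lambda a, b: (not a) or b
print(dnf(lambda p, q, r: not imp(p, q and r), ["p", "q", "r"]))
# prints: (p & q & ~r) v (p & ~q & r) v (p & ~q & ~r)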
The disjunctive normal form THEOREM basically consists in noting the fact that the
above construction can be applied quite mechanically to any truth function (with the
one exception of contradictions which we can deal with separately as we shall see). The
theorem can be proved as follows:
Consider any such truth function with an arbitrary number, n, of argument places. And
notice that weare talking about any truth function whatsoever whether or not it
corresponds to some compound produced by using the connectives available to us
in ordinary English.By definition such a truth function corresponds to a truth table
with 2n lines. So consider any such truth table:
p1  p2  ...  pn      f(v(p1), ..., v(pn))
T   T   ...  T              α1
...                         ...
F   F   ...  F              α(2ⁿ)
where v(pi) is the truth value of the i-th atom pi in a given row and αi (which is either T
or F) is the overall truth value of the truth function f in the i-th row. Now take any row i
in which αi = T and define, for j = 1, ..., n:
pj' = pj    if v(pj) is T in that row
pj' = ¬pj   if v(pj) is F in that row
Then form the conjunction p1' & p2' & ... & pn'.
Repeat this for all rows that have overall truth value T. Finally, form the disjunction of
all such conjunctions.
(1) For any row in which f takes the overall truth value T, so does its dnf. This is because
the conjunction constructed from that row takes the truth value T in that row; all the
other conjunctions take F, but a disjunction with one true disjunct is true; and
(2) For any row in which f takes the overall truth value F, so does its dnf. This is because
in such a row all the corresponding conjunctions are F, and a disjunction all
of whose disjuncts are F is itself F.
It follows that we finally do have an assurance that the set of connectives that we
introduced, namely {¬, v, &, →, ↔}, is indeed an adequate set of connectives for
truth-functional logic. This is because the disjunctive normal form theorem, together
with the above remark about contradictions, shows that {¬, v, &} is an adequate set.
(Hence our earlier interdefinability results kick in to show that all of {¬, v}, {¬, &}, {|}
and {↓} are unconditionally adequate sets of connectives.)
B: Informal Reasoning
At the beginning of the course, I said that we would be studying the way that we reason
both in systematic disciplines like science, social science or mathematics and in
ordinary argument and reasoning. Yet the examples we have studied in connection
with truth functional logic might seem very far away from ordinary reasoning. Of
course this is in part because in logic we are formalising processes that we normally
just use intuitively. But there is another reason as we will see in this section, we often
do not spell out arguments in full gory detail. This section aims to convince you that
truth-functional logic does have important things to say about real arguments (and not
just artificial examples about whether Peter, Quentin and Rita etc. do or don't go to the
party!).
Consider first a simple version of the 'problem of evil' argument against the existence
of God, which formalises (with p for 'There is an all-powerful, all-loving God' and q for
'There is evil in the world') as:
1. p → ¬q
2. q
Therefore, ¬p
This is of course valid (Exercise: Check). The question then becomes whether or not it is
sound. It seems difficult to challenge the second premise; but religious believers have of
course very much questioned the first. One line is that this premise is false because much
of the evil in the world results from human actions, which (allegedly) involve human free
will: an all-loving God could have given humans free will even while foreseeing that evil
might result because the overall benefits of our (allegedly) having free will outweigh the
resulting evil. This does not of course tackle natural evils (such as tsunamis or
earthquakes). But here the line from those who challenge premise 1 is often that such evils
allow humans to exhibit some of their finest characteristics: bravery, generosity, fortitude,
etc. So, contrary to initial appearances, an all-loving God might have allowed them. (Of
course these responses have been challenged in turn.)
A more systematic version of the same argument might go as follows. Suppose there
were a God as characterised within the Judeo-Christian tradition: omnipotent (all-
powerful), omniscient (all-knowing) and omnibenevolent (all-loving or all-merciful). If
God were all-knowing then he would foresee any evil before it occurred. If God were
all-powerful then he could prevent that evil occurring, if he wished to. If he were all-
loving then he would so wish. But if he foresaw the evil, wished to prevent it, and could
prevent it, then there would be no evil; but there is; so, there is no such God.
Let p be 'God is all-powerful', q: 'God is all-knowing', r: 'God is all-loving', s: 'God would
foresee any evil', t: 'God would prevent any evil', u: 'God wishes to prevent evil', v:
'There is evil':
1. p & q & r
2. q → s
3. p → (u → t)
4. r → u
5. (s & u & t) → ¬v
6. v
Note that we could simply infer ¬(p & q & r) as line 7 in the first inference. (Exercise:
Check). So splitting it into two arguments is unnecessary: but doing so reflects the
reductio ad absurdum character of the original and so rhetorically gives it perhaps a
bit more force. (It is important to note that many real life arguments break down
naturally into a sequence of inferences rather than just a straight one step inference
from the initial premise(s) to some conclusion. Indeed, this is true in more formal
argumentation too: a proof in mathematics is (almost invariably) a sequence of
inferences, each one of which is valid ending in the proofs last line with the
conclusion i.e. the theorem to be proved.)
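For those who like to check such things by machine, the following Python sketch (purely illustrative) runs through all 128 assignments of truth values to p, q, r, s, t, u, v and confirms that premises 1-6, as formalised above, cannot all be true together, which is just the reductio in mechanical form:

from itertools import product

def implies(a, b):            # truth-functional 'if a then b'
    return (not a) or b

# Letters in the order p, q, r, s, t, u, v (see the key above)
premises = [
    lambda p, q, r, s, t, u, v: p and q and r,                 # 1. p & q & r
    lambda p, q, r, s, t, u, v: implies(q, s),                 # 2. q -> s
    lambda p, q, r, s, t, u, v: implies(p, implies(u, t)),     # 3. p -> (u -> t)
    lambda p, q, r, s, t, u, v: implies(r, u),                 # 4. r -> u
    lambda p, q, r, s, t, u, v: implies(s and u and t, not v), # 5. (s & u & t) -> ~v
    lambda p, q, r, s, t, u, v: v,                             # 6. v
]

consistent = any(all(prem(*vals) for prem in premises)
                 for vals in product([True, False], repeat=7))
print(consistent)   # False: no assignment makes 1-6 all true together,
                    # so premises 2-6 rule out p & q & r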
A second famous philosophical example is Hume's argument (from his essay 'Of
Miracles') against ever believing testimony that a miracle has occurred. The main
premise of the argument is that the probability that the evidence for the
alleged miracle is faulty in some way (no matter how much apparent evidence there
may be from however many sources) is always going to be higher than the probability
of there actually being an exception to the laws of nature.
When anyone tells me, that he saw a dead man restored to life, I
immediately consider with myself, whether it be more probable, that this
person should either deceive or be deceived, or that the fact, which he
relates, should have really happened. I weigh the one miracle against
the other; and according to the superiority, which I discover, I pronounce
my decision, and always reject the greater miracle. If the falsehood of
the testimony would be more miraculous, than the event which he relates;
then, and not till then, can he pretend to command my belief or opinion.
He goes on to suggest that the probability of the testimony being false is indeed always
higher than the probability that the allegedly miraculous event really occurred.
IF there is evidence for some alleged miracle, THEN EITHER that evidence is true
AND the miracle did occur OR the evidence is false AND the miracle did not occur.
IF the probability that the evidence is false is higher than the probability that the
miracle occurred, THEN a rational person believes that the evidence is false AND that
the miracle did not occur.
Moreover, the probability that the evidence is false is (always) higher than the
probability that the miracle occurred. Therefore, the rational person believes the
evidence is false and that the miracle did not occur.
1. p → ((q & r) v (¬q & ¬r))
2. s → (t & v)
3. s
So, t & v
Where:
p: There is evidence for some alleged miracle.
q: The evidence is true.
r: The miracle occurred.
s: The probability that the evidence is false is higher than the probability that the
miracle occurred.
t: The rational person believes that the evidence is false.
v: The rational person believes that the miracle did not occur.
This is a valid argument though you may notice that the validity results simply from 2
and 3. Premise 1 is in fact a near tautology, just reminding you of the possibilities;
nonetheless, although strictly redundant, it seems to add to the cogency of the argument.
Again then attention turns to the question of whether or not the argument is sound: not
surprisingly, since it is philosophy, opinions differ.
It is perhaps not surprising that arguments like the above ones from philosophy are fairly
'logical': although presented in ordinary language, it seems pretty obvious how to
formalise them, and once formalised they turn out to be valid as they stand. But they do not
represent the norm so far as informal arguments go. Consider the following examples:
1. How can anyone claim that the Tories economic policies are working?
Economic growth is near zero and inflation is on the increase.
2. Old advert on London Underground for a computer software company: If your
machine doesn't run BOS software, it's a fridge. (The ad actually said ...it's
probably a fridge, but that would complicate things unnecessarily for present
purposes.)
They are both clearly intended to be arguments, but they look nothing like the sort of
arguments you have practised on in studying truth functional logic. For one thing, in
both of these arguments, the conclusion is not made explicit. However, you are clearly
meant to infer something in each case: that the Tories' economic policies are not
working in example 1 and that your machine will run BOS software in the case of
example 2. So the first task in spelling out ordinary arguments is sometimes to make
the conclusion explicit (if it is left implicit in the original).
If we make the initially implicit conclusions explicit, then the arguments become:
1. Economic growth is near zero and inflation is on the increase.
So, the Tories' economic policies are not working.
And:
2. If your machine doesn't run BOS software, it's a fridge.
So, your machine will run BOS software.
Formalised in truth-functional logic, these become:
1. p & q
So, ¬r
(where p is 'Economic growth is near zero', q: 'Inflation is on the increase' and r: 'The
Tories' economic policy is working'.)
And:
2. ¬p → q
So, p
(where p is 'your machine runs BOS software' and q is 'your machine is a fridge'.)
These arguments, as they stand, are of course invalid (Exercise: show this). But clearly
the person making the remark in example 1 and the advertisers in example 2 intended
the inferences to be valid.
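A machine check bears this out. The following Python sketch (illustrative only) searches for assignments that make the premises true and the conclusion false, and finds one for each argument:

from itertools import product

def counterexamples(premises, conclusion, n):
    # Return all truth-value assignments that make every premise true
    # and the conclusion false (i.e. the rows showing invalidity).
    return [vals for vals in product([True, False], repeat=n)
            if all(prem(*vals) for prem in premises) and not conclusion(*vals)]

# Argument 1:  p & q, therefore ~r
print(counterexamples([lambda p, q, r: p and q],
                      lambda p, q, r: not r, 3))
# [(True, True, True)] : premises true, yet 'the policy is not working' comes out false

# Argument 2:  ~p -> q  (i.e. p v q), therefore p
print(counterexamples([lambda p, q: p or q],
                      lambda p, q: p, 2))
# [(False, True)] : the machine doesn't run BOS software and is a fridge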
So what is going on here? The fact is that, in ordinary arguments, we rarely spell out all the
premises instead we assume that the people responding to our arguments will share with
us some background knowledge, the elements of which we expect to be taken for granted
and so are not in need of spelling out, even though they are necessary for the validity of the
argument. So some premises are left hidden. Clearly the person formulating argument 1
was expecting us to agree to the further, hidden premise, that 'If an economy suffers near
zero growth and an increasing inflation rate, then the policy governing it is not
working'. In other words, the hidden premise is (p & q) → ¬r.
And if we add this initially hidden premise to the argument then it becomes:
1. p & q
2. (p & q) → ¬r
So, ¬r
And this is, of course, valid. (Exercise: check) Here the explicit premise was observably
true at the time this argument was being presented, but an obvious reaction from a
defender of Tory policy would be to reject the hidden premise and argue that, far from
being part of background knowledge that we all accept, that premise is a highly
debatable claim: sometimes low growth and inflation in the short term are necessary
for longer term economic well-being.
As for the BOS software: in that case the machine you are meant to be thinking about
is a computer. So there is an obvious bit of background information or hidden premise:
that your machine is not a fridge! (It is the obviousness of this hidden premise that is
intended to give the advert its impact.) So the argument becomes:
1. ¬p → q [explicit premise]
2. ¬q [hidden premise]
So, p
There is a technical term for arguments in which one or more premise is left unstated
they are called enthymemes.
A further example:
At the time of the dodgy dossier, Tony Blair said The BBC reports [that the dossier
had deliberately exaggerated the threat posed by Saddam Hussein and that Blair had
insisted on the exaggerations being made, knowing them to be exaggerations] are
absurd. If they were true, then I would have to resign.
Analysis:
Explicit premise: If the BBC reports were true (p), then I (Tony Blair) would have to
resign (q).
Conclusion: The BBC reports are absurd (r).
1. p → q
Therefore, r.
And this is obviously invalid as it stands. And yet clearly the saint-like Tony was
expecting us to take the argument to be convincing, i.e. to be valid or at least not
obviously invalid. So what is going on?
One possibility is that Blair had vaguely in mind a valid inference involving premises
that he himself at least believed to be so obvious as not to need explicit articulation. His
vague thought might have been that if he had done what the BBC reports said then he
would have done something morally reprehensible and surely no one could think that
he, of all people, was capable of that?!
So what was he expecting us to accept as hidden premises? Well, one obvious, and
uncontentious, hidden premise is that 'If the reports were true then they were not
absurd' (i.e. p → ¬r). However, adding this uncontentious hidden premise is not enough
to make the inference valid.
(Exercise: add that premise to the original and show that the resulting inference
remains invalid.)
What Blair was expecting his hearers to take for granted, if this construal is correct,
was something like that it was absurd to think that he would do anything so morally
reprehensible that it would make resignation a moral requirement. Stripped of the
rhetoric, this extra claim amounts to 'If I would have to resign (q) then I would have
done something morally reprehensible (s), but I could never do anything morally
reprehensible (¬s)'. So he is expecting us to accept (q → s) & ¬s as a piece of
background knowledge. Moreover, he is expecting us to accept that if the assumption
that a report were true entailed that he had done something morally reprehensible
then it was not only false but absurd. That is, (p → s) → r. If we were gullible enough to
accept these two additional hidden premises then the inference would indeed become
valid. The inference would be
1. p → q [explicit premise]
2. p → ¬r [uncontentious hidden premise]
3. (q → s) & ¬s [hidden premise]
4. (p → s) → r [hidden premise]
So, r
This is indeed valid (Exercise: check), but some of us would regard the initially hidden
premises 3 and 4 as far from being uncontentious parts of background knowledge.
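If you want to check these claims mechanically, the following Python sketch (illustrative only) confirms both that the explicit premise together with the uncontentious hidden premise does not yield r, and that the full set of premises 1-4 does:

from itertools import product

def implies(a, b):
    return (not a) or b

def valid(premises, conclusion, n):
    return not any(all(f(*v) for f in premises) and not conclusion(*v)
                   for v in product([True, False], repeat=n))

# Letters in the order p, q, r, s
P1 = lambda p, q, r, s: implies(p, q)                        # p -> q
P2 = lambda p, q, r, s: implies(p, not r)                    # p -> ~r
P3 = lambda p, q, r, s: implies(q, s) and (not s)            # (q -> s) & ~s
P4 = lambda p, q, r, s: implies(implies(p, s), r)            # (p -> s) -> r
C  = lambda p, q, r, s: r

print(valid([P1, P2], C, 4))          # False: premises 1-2 alone do not yield r
print(valid([P1, P2, P3, P4], C, 4))  # True: with the hidden premises it is valid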
Let's now consider a somewhat more detailed and intellectually more important
argument, one that might have been given in the 19th century in favour of the claim that
there must be an invisible, intangible medium that fills the whole of space, called the
luminiferous ether, vibrations in which constitute light:
Analysis: The conclusion is that there is indeed a 'Luminiferous aether pervading the
whole universe'. Without a full account (which would be very lengthy in this case), we
can see the main outline of the argument and its relationship to the truth functional
logic we have studied. The truth of 'the mechanical world view' is taken as a 'hidden
premise'. We are explicitly told that this implies that light is either matter or energy. If
it is matter it would be either (a) a continuous stream or (b) a succession of particles. If
(a) then something would happen (viz. that the streams would affect one another when
they cross) which does not in fact happen (another explicit premise). So it can't be (a).
If it were (b) then light would travel in straight lines, but it doesn't (explicit premise
again). So neither possibility holds, so the assumption that light is matter can't hold
either. (This ought to remind you of the more elaborate version of the Edinburgh train
example that we talked about very early on it is a sort of extended disjunctive
syllogism Exercise: articulate this part of the argument explicitly in terms of
propositional logic and show that it is valid.)
So we now have an intermediate conclusion, viz. that light consists of energy (we
started with only two possibilities and have now eliminated the other). The rest of the
argument is essentially as follows:
1. There is a finite time interval, t, when the energy is neither in the light source
nor in the light receptor (This in turn is a consequence of the explicit premise
that light has a finite velocity.)
2. Energy cannot be 'free' but must be stored in some matter at all times (implicit
premise but really a consequence of the initial premise i.e. the mechanical
world view which entails that disembodied motion is a nonsense).
Therefore, there must be some medium in between the sun and the earth that
stores the light energy in the interval t.
Sometimes tracking down hidden premises can be tricky and lead to major
intellectual breakthroughs:
If an argument is intuitively convincing and yet its conclusion is clearly false, then some
hidden premise must be at fault, though it may be very hard to decide exactly what that
hidden premise is and why it is false. The breakthrough is made by discovering the
hidden premise at issue and seeing why it is indeed false.
A good example is one of Zeno's famous paradoxes - the one about Achilles and the
Tortoise. The two agree to have a race. To be fair, Achilles gives the Tortoise a start. Let
Achilles start at A and the Tortoise at B:
A      B      C    D   E      Finish
His argument went as follows: By the time Achilles gets from A to B (the Tortoise's
starting point), the Tortoise, no matter how slowly he moves, but given of course that
he is moving, has gone some distance (let's say to C). So far, Achilles has not overtaken
the Tortoise. Achilles eventually arrives at C, of course, but by that time the Tortoise
has moved on (only a little bit, but he has moved on) say to D. Achilles eventually
arrives at D, but by then the Tortoise has moved on to E, etc. Hence, Achilles never
overtakes the Tortoise.
Clearly something is wrong with this argument since, if the race is long enough, the
conclusion is empirically false Achilles will overtake the Tortoise and win. But what
exactly is it that's wrong? The answer turned out to depend on quite subtle features of
a physical continuum. There is an implicit assumption that we can coherently talk of
Achilles and the Tortoise being AT particular POINTS at particular times. But this
assumption, however intuitively appealing, is not true of the sort of points involved in
the real number theory that underlies continuous processes in physics: to speak
intuitively, no one is ever at such a point but always passing through it. Working out
the correct theory of the continuum was of course an intellectual breakthrough of the
highest order.
Of course some arguments are just plain bad arguments. From the point of view we
have developed, we can see that there are basically two possible reasons for this:
(a) An obviously false explicit premise (even if the argument is valid, that is, the
conclusion has to be true if all the premises are, it will of course cut no ice if one
or more of the premises is or are obviously false) we called such arguments,
remember, UNSOUND.
(b) The argument is invalid and any 'hidden premise' which might be invoked
would itself be obviously false.
In the 1970s a cult was built up around the Indian mystic Maharaj Ji, his (surprisingly
many) followers believing that he was divine. One of these followers was an American
professional tennis player of the time called Tim Galloway. Galloway wrote a book
setting out his convictions about Maharaj Ji called Inner Tennis. The New York Times
sent along a reporter to interview Galloway and published the following account:
I asked Galloway how he had come to believe Maharaj Ji was God. [He
replied:]
"When I first heard him my only approach was to say to myself, 'He's either the
real thing or a con artist'. Well the first time I saw him he just did too bad a job
as a con-artist. A good con-artist wouldn't wear a gold wristwatch or give such
stupid answers. When I was staying with him in India I once asked him how
much time I should spend on work and how much on meditation and he just
said get up an hour earlier and go to bed an hour later hardly a profound
answer. I decided that if he was doing such a bad job of being a con-man he
simply had to be genuine.
"Then how could he have six million followers?" the tennis pro replied.
Here we have essentially two arguments. The first, which Mr. Galloway gave
unprompted, has the following form:
1. p v q
2. r → ¬(s v t)
3. s & t
Therefore, p.
Here p is 'Maharaj Ji is genuine (divine)', q: 'Maharaj Ji is a con artist', r: 'Maharaj Ji is
a good con artist', s: 'Maharaj Ji wears a gold wristwatch' and t: 'Maharaj Ji
gives stupid answers'. As it stands the argument is invalid: it needs the hidden premise
that if Maharaj Ji were a con artist at all, he would be a good one (q → r). (As usual
things are not quite as simple as this and there is in the
interview in addition a sort of sub-argument for t.)
Moreover, an explicit premise here is patently false viz. premise 1. There is no reason
at all why the two possibilities mentioned should be the only two possibilities. Maharaj
Ji might for example be perfectly sincere but deluded (though no doubt one would
suspect in that case that some of the men behind his organisation were con-artists).
This is a frequent ploy in bad arguments: claim that the only two possibilities are p and
q, quickly move on to an elaborate argument which (allegedly) refutes q, and infer p.
The elaborate argument against q distracts attention from the otherwise obvious fact that p
and q are not the only possibilities.
The second argument in this passage is generated as a result of the journalist's pointing
out that the 'hidden premise' could obviously be challenged ["Did it ever occur to you
that he might be a bad conman?"]
Galloway's response is to produce an argument for the implicit premise. The argument
can be formalised as follows:
Using 'u' for 'Maharaj Ji is a bad con artist' and 'w' for 'Maharaj Ji has six million
followers' (and, remember, q means MJ is a con artist and r that MJ is a good con
artist):
1. q → (r v u)
2. u → ¬w
3. w
Therefore, q → r. (That is, if he were a con artist at all, he must be a good con artist.)
(As you will see from this, the distinction between 'hidden or implicit' and 'explicit'
premises is not always totally clear-cut. Some 'hidden' premises are so little hidden as
to verge on the explicit. For example, although Galloway never actually asserts w (i.e.
that Maharaj Ji has six million followers) this is so clearly implied as to be almost
explicit. Just as in formalising single sentences in truth-functional logic, so here in
analysing formally ordinary informal arguments, we have the 'problem' that the logical
system is totally precise, while ordinary discourse is often imprecise. It is therefore
often a matter of judgement whether one's precise formal account 'captures' the
informal one. Of course this is only a problem in that it requires work from the
logician seeking to apply his logical tools to ordinary argument; the total precision of
logic is clearly a virtue.)
In this new argument, premise 1 seems reasonable enough, but the (more or less)
explicit premise 2 is false. It is quite possible, given everything we know about people's
psychological needs, that so many people could be taken in even by a bad con artist
(indeed it seems overwhelmingly likely that Mr Galloway was one of them!). The full
argument given by Galloway (obtained by replacing premise 2 in the initial argument
by the argument for that premise that we have just analysed) is this:
1. p v q
2. q → (r v u)
3. u → ¬w
4. w
5. r → ¬(s v t)
6. s & t
Therefore, p
Although valid, this argument can hardly be taken to establish its conclusion (that
Maharaj Ji is God) since it suffers from the embarrassment of including two premises,
the first and the third, that, although we did need to articulate them in order for the
reasoning to go through, are quite patently false.
C: F IRST O RDER P REDICATE L OGIC
Consider the time-honoured example:
1. All men are mortal.
2. Socrates is a man.
Therefore, Socrates is mortal.
Try to think how this could be formalised in truth-functional logic. Neither any of the
premises nor the conclusion breaks down into atomic sentences connected by truth
functional connectives. There is something vaguely conditional about the first premise
we might think of it as saying something like 'If man then mortal'. But neither 'man'
nor 'mortal ' is a sentence, so this is not a truth functional conditional. In fact, since
none of the sentences involved is a truth functional compound of simpler sentences,
the best we can do by way of formalisation in truth functional logic is simply:
1. p
2. q
Therefore, r
This inference scheme is obviously invalid since we can just assign p:T, q:T, r:F and this
is a counterexample.
This simple example alone shows that the set of truth-functionally valid inferences is
only a proper subset of the set of all valid inferences. In other words, the picture is:
[Diagram: the truth-functionally valid inferences shown as a proper subset of the set of
all valid inferences.]
Let's think a little more about the Socrates example since it can supply hints about how
to move towards a more adequate, more extensive system of logic. The problem is that
in formalising All men are mortal as p, Socrates is a man as q, and Socrates is mortal
as r, we lose the intuitive connection that exists between the various sentences as
expressed in English. Once formalised in this way the sentences are regarded as totally
independent from one another any combination of truth values (in particular p:T, q:T,
r:F) being possible. But clearly if it is true that 'All men are mortal' and true that
'Socrates is a man' then it can't be false that 'Socrates is mortal'. We have left out some
important connections between the sentences in formalising them in truth functional
logic in this way but we can't produce any more elaborate formalisation within truth
functional logic. Moral: we need a more refined language than that of truth functional
logic in order to capture the intuitive connections between sentences such as these.
The method of syllogisms, developed some two thousand years ago by Aristotle, can of
course easily deal with the Socrates example (which it was indeed constructed to deal
with). But if we simply added Aristotle's syllogisms to truth functional logic we should
soon find ourselves back in a similar situation to the one we are now in. There are lots
of inferences that are intuitively valid but are neither truth functionally valid nor an
instance of one of Aristotle's syllogisms. One simple example (suggested by a famous
19th Century mathematician and logician called Augustus de Morgan) is the following
inference: from the premise 'All horses are animals' to the conclusion 'All horses' heads
are animals' heads'.
But there are many important inferences in science, social science and mathematics
which fall in the same boat: clearly valid and yet there is no valid formalisation of them
within truth-functional or Aristotelian syllogistic logic.
So we shall ignore Aristotelian syllogisms and jump instead straight to a much more
powerful language and associated logic developed in the 19th and 20th centuries that
subsumes Aristotelian syllogisms as special cases and delivers much more besides.
This is called first-order predicate logic. (For the purposes of this course you can
forget about the 'first-order' and just call it 'predicate logic'.)
The Socrates example shows that we must get within the structure of sentences that
are truth-functionally simple. Remember that our original characterisation of a valid
inference was as one that is valid because of its logical form: a given inference is valid
iff there is no inference of the same logical form with true premises and a false
conclusion. In truth functional logic we take only the connectives as characterising the
form of a sentence. The basic idea in predicate logic is to consider the 'all' and the 'are'
in sentences like 'All men are mortal' as also constituting part of the logical form.
Consider the inference: 'All Greeks are men; all men are mortal; so, all Greeks are
mortal'. If we represent its form as:
1. All A's are B's.
2. All B's are C's.
So, all A's are C's.
then we can give an explanation of the validity of this inference similar to that of the
validity of truth functional inferences.
Here, initially A stands for Greeks, B for men and C for mortal. As it stands the
premises are true and so is the conclusion. But this in itself, as we now know, is no
guarantee that the inference is valid. (The inference from All electrons are negatively
charged and All protons are positively charged to the conclusion that All neutrons
are electrically neutral has true premises and a true conclusion, but obviously the
conclusion, although true, does not follow from the premises. On the other hand, the
inference from All politicians always tell the truth and All those who always tell the
truth have blue eyes to All politicians have blue eyes is valid, even though both its
premises and its conclusion are of course false.) The question, as before, is: would the
conclusion have to be true if the premises were true (whether or not they actually are).
The answer in the Greeks case is that it would, and that this is revealed by the fact that
no matter what we substitute for A, B and C in the above schema (keeping the all
and the are fixed, just as we kept and, or, etc fixed in the case of truth functional
logic), we NEVER get true premises and a false conclusion.
(Exercise: Try a few substitutions. Notice that we are now substituting common nouns
(men, mortals, dogs or whatever), rather than whole sentences as we did in truth
functional logic.) You should be able to find substitutions which (i) make the premises
true and the conclusion true, (ii) make the premises false and the conclusion true and
(iii) make the premises false and the conclusion false. But you won't find any that make
the premises true and the conclusion false.)
Now consider the inference: 'Some of the pieces Mozart wrote are operas; all operas
are beautiful; so, some of the pieces Mozart wrote are beautiful'. If we again regard any
parts of the verb 'to be' as part of the form of a sentence, and also regard 'some' as part
of the form, then the form of this inference is:
1. Some A's are B's.
2. All B's are C's.
So, some A's are C's.
Here A means 'is a piece that Mozart wrote', B: 'is an opera' and C: 'is beautiful'. But
what they initially mean is irrelevant to the issue of validity: the inference about
Mozart is valid because no matter what other meaning we substitute for the stuff about
Mozart, operas, etc. that is, no matter what we take A, B and C in this formalised
inference scheme to mean we never get true premises and a false conclusion.
C2: M ONADIC P REDICATE L OGIC
I shall build up the system of predicate logic that we will study in two stages. First I
shall take a restricted but very simple sub-system so called monadic logic (don't
worry about this term, it will become clear later). This sub-system has the advantage of
allowing us to introduce all the main ideas in especially simple forms. I shall only then
move on to the full and more complicated system.
The basic idea is going to be that an inference is valid if and only if there is no inference
of the same logical form that has true premises and a false conclusion. How then
should we express the logical form of sentences like Socrates is a man, Boris Johnson
is a liar, 'All students are hardworking', 'Some logic lectures are boring'? Grammarians
treat such sentences as subject-predicate assertions: each assertion attributes a certain
'property' or 'predicate' to a 'subject'. The first cases are especially straightforward:
they consist in (truly or falsely) ascribing a property (e.g. that of being a liar) to a
particular individual (namely Boris Johnson).
Using lower case letters from the beginning of the alphabet (a, b, c or, if we need lots,
indexed constants a1, a2, a3 ...) as names of individuals and upper case letters (P, Q, R ...
or P1, P2, P3 ...) as names of properties or predicates, it is natural to regard the
form of this first sentence as just:
a is P
or, more briefly, Pa. And, simply to make the formalism as neat as possible, we will in fact
formalise a sentence like 'Socrates is a man and Boris Johnson is a liar' as:
Pa & Qb.
(One thing you should note at this point is that predicate logic is indeed going to be an
extension of truth functional logic we will still use all the connectives we learned
earlier and treat truth functional conjunctions, for example, as such; but it is just that
what were, in truth functional logic, the un-analysed atoms that were conjoined are
now given further structure Pa, Qb, etc. rather than just p, q, etc.)
How about sentences like David Cameron was Prime Minister or 'Humphrey Bogart
starred in Casablanca'? Although in the past tense, these are just as much subject-
predicate assertions as our earlier examples: there is nothing special here about the
present tense of the verb 'to be'. We can in fact regard these sentences too as having
the same form. The first can be taken as David Cameron has [timelessly] the property of
having been Prime Minister. Similarly, 'Humphrey Bogart has the property of having
starred in Casablanca. (Some philosophers who ought to have better things to do
have found difficulties with so-called posthumous predication that is, with sentences
that ascribe properties to no longer living people. But we'll just take the naïve and
surely correct view that all individuals, alive, dead, abstract (such as the number 4)
and fictional (such as Hamlet or Santa Claus) can just be named and straightforwardly
have properties attributed to them.)
C2( B ): U NIVERSAL S TATEMENTS : T HE U NIVERSAL Q UANTIFIER
Grammarians treat general or universal statements like 'All students are hard-working'
or 'All logic lectures are interesting' (as before, the sentences we are considering do not
need to be true!) as simply particular kinds of subject-predicate assertion. It's just that
in these cases the 'subject', instead of being an individual (named by a 'proper noun'),
is a whole class of individuals (named by a so-called 'common noun'). Since, however,
there is no such entity as 'all students' (although there are of course lots of individual
students) it is more accurate to regard the statement 'All students are hardworking' as
a sort of indefinite conjunction: stating that any individual who happens to have one
property (that of being a student) also has another (that of being a hardworker), or,
more specifically, if more laboriously: For any object whatsoever, if it's a student then
it's a hardworker. Similarly, 'All logic lectures are interesting' says: For any object
whatsoever, if it's a logic lecture then it's interesting.
The pronoun 'it' in the phrase, 'For any object whatsoever, if it's a P then it is also a Q', does
not pick out a particular individual (unlike an individual name such as David Cameron);
instead it VARIES OVER individuals (not all of which may have individual names: for
example we can say (truly) that all electrons are negatively charged, though electrons
generally do not have names). This means we must introduce the idea of VARIABLES
and we denote these by lower case letters from the end of the alphabet (x,y,z,
occasionally u, v, w, or again if we need lots of variables we use indices: x 1, x2, x3 ...).
Using a variable, 'All logic lectures are interesting' becomes: For any object x, if x has
the property of being a logic lecture then x has the property of
being interesting.
Using our predicate symbols and remembering the truth functional connective →, this
becomes: For any object x, (Px → Qx).
Finally, we introduce as shorthand for the phrase 'For any' the symbol '∀' (upside
down A, as in 'for All' or 'for Any' - called the UNIVERSAL QUANTIFIER) and write:
∀x(Px → Qx)
The sentences 'All students are hardworking', 'All jabberwocks are dangerous', 'All Sun
journalists are morally good' all share this same logical form; although again if
we wanted to formalise several at once we should, of course, use different predicate-
letters for the different properties. (So the sentence 'All jabberwocks are dangerous
and all bandersnatches are frumious' could be formalised as ∀x(Px → Qx) & ∀x(Rx →
Sx).)
C2( C ): S OME S TATEMENTS : T HE E XISTENTIAL Q UANTIFIER
How about statements like 'Some lectures are interesting' or 'Some politicians are
liars'? These are again grammatically of subject-predicate form. The 'subjects' again do
not name particular individuals. But nor do they refer to a whole class of individuals.
They simply assert that, for example, amongst the class of all politicians there are
some who have the property of being liars. So, using our notions of variables and
predicates, we shall read this statement as:
There are some objects x, such that x has both the property of being a politician and the
property of being a liar.
That is: there is at least one object x such that (Px & Qx).
(Exercise: 'There are some objects x such that if Px then Qx' would be the WRONG way
to read 'some politicians are liars'. Can you explain why?)
How many count as 'some'? Would three lying politicians be enough to make 'Some
politicians are liars' true? How about two? Or even one? The answer is that, just as we
found for example with the connective 'or', ordinary usage is vague and ambiguous. We
need to be precise in our logic, however, and logicians have found it best to take the
minimal understanding of the phrase 'some' and take it as meaning merely 'at least
one'. So a 'some' statement is false if and only if there are no objects at all of the
kind described (no interesting logic lectures, no lying politicians, or whatever). We shall
see later that nothing is lost by making this decision. So we have the following
'formalisation' of 'Some Ps are Qs': there is at least one object x such that (Px & Qx).
Introducing the abbreviation '∃' (backwards E, for 'there Exists') for the cumbersome
phrase 'There exists at least one object x such that', we have finally:
∃x(Px & Qx)
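A tiny illustration (in Python, with invented names) of why the conditional reading mentioned in the exercise goes wrong: take a two-element domain containing no politicians at all, and compare the two readings.

# A tiny finite 'domain' with no politicians and no liars at all.
domain = ["alice", "bob"]
is_politician = lambda x: False        # Px
is_liar       = lambda x: False        # Qx

# 'Some politicians are liars' -- the correct reading: Ex(Px & Qx)
print(any(is_politician(x) and is_liar(x) for x in domain))        # False

# The tempting but WRONG reading: Ex(Px -> Qx)
print(any((not is_politician(x)) or is_liar(x) for x in domain))   # True!
# It comes out true merely because, e.g., alice is not a politician,
# so this cannot be what 'Some politicians are liars' means.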
C2( D ): T HE E XPRESSIVE P OWER OF M ONADIC P REDICATE L OGIC
The whole expressive power of truth functional logic carries over into predicate logic.
If a sentence is a truth functional compound then this is maintained in predicate logic
but we ADD the ability to capture the form of what were in truth functional logic the
simple, un-analysable, atomic sentences. So, for example, the sentence 'All politicians
are liars and some voters have been fooled' has truth functional form 'p & q' and so its
predicate logic formalisation is: ∀x(Px → Qx) & ∃x(Rx & Sx) (with Px: 'x is a politician',
Qx: 'x is a liar', Rx: 'x is a voter' and Sx: 'x has been fooled').
Similarly, 'If all politicians are liars then some voters have been fooled' has truth
functional form p q, where p is the sentence All politicians are liars and q the
sentence some voters have been fooled, hence its formalisation in predicate logic is:
∀x(Px → Qx) → ∃x(Rx & Sx)
(It is important to understand that the p, q, r ... etc. of truth functional logic are full
sentences, making an outright assertion and hence having a truth value. The Px, Qx,
etc. that we use in predicate logic are not sentences but conditions or predicates: is a
politician is not a sentence which is either true or false, rather it produces a true-or-
false sentence when some particular object is substituted for the variable.)
A sentence like 'All lying politicians are either wicked or stupid' is NOT a truth-functional
compound. (It is not, in particular, equivalent to 'Either all lying politicians are wicked or
all lying politicians are stupid' which is truth functionally compound. Exercise: Explain
carefully why. You may find a diagram helps.) The sentence does however have more
structure than merely 'All Ps are Qs'. In fact, it says: For all x, if x is BOTH a politician AND a
liar, then x is EITHER wicked OR stupid.
So using our connectives, variables and the predicates P, Q, R and S (for is a politician,
is a liar, is wicked and is stupid, respectively) we get:
∀x((Px & Qx) → (Rx v Sx))
So it is important to note then that once we have introduced the idea of predicates, we
can use our truth functional connectives in obvious ways, not only to link fully fledged
sentences but also predicates. That is, not only can we conjoin the full sentences All
politicians are liars and All philosophers are honest as ∀x(Px → Qx) & ∀x(Rx → Sx),
we can also conjoin (or disjoin or whatever) predicates to form more complex
conditions so the complex predicate Px & Rx with the same understanding of Px and
Rx is the predicate that holds of those and only those objects that are both politicians
and philosophers. (There are a few, usually bad ones both bad politicians and bad
philosophers, that is!) The complex predicate Px v Rx is the condition that holds of all
those objects that are either politicians or philosophers (or both since, remember, we
have taken v as the inclusive sense of either/or).
C3: V ALIDITY OF I NFERENCE : T HE I DEA OF AN I NTERPRETATION
We shall extend our system beyond monadic predicates shortly. But all the important
logical notions validity and invalidity of inference in particular are relatively
straightforward if we restrict ourselves to 'monadic predicates' (you will understand
precisely what these are only later when we discuss non-monadic predicates).
As we already indicated, the basic idea of validity of inference is, as before, that an
inference is valid if it is impossible for the premises to be true and the
conclusion false. In the case of truth-functional logic, we cashed out 'impossible' in
terms of truth value assignments to the atomic propositions: 'impossible' meant no
assignment of truth values to atomic components which makes the premises true and
the conclusion false. What is the corresponding notion in the case of predicate logic?
Well, let's think about the two ancient inferences I already mentioned.
As before, to decide validity we first formalise the inference, only now we do it in the
language of predicate logic. Using 'Px' for 'x is a man', 'Qx' for 'x is mortal', ' Rx' for 'x is
a Greek' and the individual constant 'a' as the name of the individual Socrates, we have:
A:
1. ∀x(Px → Qx)
2. Pa
So, Qa
And:
B:
1. ∀x(Rx → Px)
2. ∀x(Px → Qx)
So, ∀x(Rx → Qx)
A' and B' are, then, the schematic forms (in predicate logic) of the original intuitive
inferences A and B. The basic idea is that, again as before, the ordinary language
inferences A and B are both valid because no inference with either the form A' or the
form B' has true premises and a false conclusion. That is, no matter what we substitute
for 'men', 'mortals' and Socrates' in A, or for 'Greeks', 'men' and 'mortals' in B, we
never get true premises and a false conclusion. This means, more formally, that
whatever predicates we substitute for the predicate letters P, Q, R in either A' or B' and
whatever object we substitute for the individual constant a in inference A' we never get
both premises true and the conclusion false.
So, for example, just considering A', we might substitute 'x is an aardvark' for 'Px'; 'x is
a quadruped' for 'Qx'; and Alf (a particular aardvark in London zoo) for 'a'. We would
thus get:
A":
(True premises, true conclusion I'm assuming that 'quadruped' means 'having 4 legs
in the natural, complete state', so that we needn't worry, for example, about whether
Alf might be an unfortunate aardvark-amputee).
Or we might substitute 'x is an LSE student' for 'Px', 'x is hardworking' for 'Qx' and Bert
(an especially slothful sloth in London zoo) for a. We would thus get:
1. All LSE students are hardworking.
2. Bert is an LSE student.
So, Bert is hardworking.
Here the first premise is (sadly) false and so is the second and the conclusion is also
false.
If you try some further substitutions, you will also find cases in which the premises are
false (remember this means: not all premises are true) and the conclusion true (try
Px: x is a football hooligan; Qx: x is a devout Christian; a: Justin Welby). But you can
play this game all day long and what you will NEVER find is a substitution that gives
TRUE premises and a FALSE conclusion.
This ('no substitution which gives true premises and false conclusion') is going to form
our characterisation of a valid inference in predicate logic. Clearly, if it is the correct
characterisation, then intuitively invalid inferences ought to fail to satisfy it. This
means that, in the case of intuitively invalid inferences, there should be substitutions
for the predicates that do make the premises true but the conclusion false. Well, let's
think about the following example:
C:
1. Some football players are very skillful.
2. Suarez is a football player.
So, Suarez is very skillful.
Here the premises are true and so is the conclusion. But the inference is invalid not
because the conclusion isn't true but because the conclusion is not guaranteed to be
true just because the premises are. It MIGHT have been true that some football players
are very skillful and Suarez is a football player but that he just happens to be one of
those who are not very skillful.
So, checking more formally that inference C fails to satisfy our criterion for validity, we
first formalise the inference. Using Px and Qx for the predicates 'x is a footballer' and 'x
is very skillful', and a for the individual Luis Suarez, we get:
C':
1. ∃x(Px & Qx)
2. Pa
So, Qa
The original inference C is invalid on our proposed criterion if there are substitutions
for Px, Qx and a in the inference scheme C' which yield true premises and a false
conclusion. But such substitutions are easy to supply. For example, let Px be 'x is a
natural number' (the natural numbers are the counting numbers 1, 2, 3 ...), Qx be 'x is
even' and a be the number 5. Then, under this substitution, the scheme C' yields:
1. Some natural numbers are even.
2. 5 is a natural number.
So, 5 is even.
D:
1. Some judges are wealthy.
2. Some wealthy people are out of touch with 'ordinary life'.
So, some judges are out of touch with 'ordinary life'.
Here, as I think every unbiased observer (this excludes judges) would agree, we have
true premises and a true conclusion. The inference however is invalid the truth of the
conclusion is not guaranteed by the truth of the premises. It MIGHT have been true that
there are wealthy judges and out-of-touch wealthy people, while it just happened that
none of the out-of-touch wealthy people were also judges. The conclusion may actually
be true, but it could have been false even when the premises were true. Let's again
check that our characterisation of validity captures this idea by pronouncing inference
D invalid.
We can formalise the inference using Px, Qx and Rx for the respective predicates 'is a
judge', 'is wealthy' and 'is out-of-touch with 'ordinary life'', and obtain:
D':
1. ∃x(Px & Qx)
2. ∃x(Qx & Rx)
So, ∃x(Px & Rx)
The original inference D is invalid iff there is a substitution for Px, Qx and Rx in D'
which makes both premises true but the conclusion false. Obviously to make the
conclusion false we shall need Px and Rx to be incompatible properties: say, 'is male'
and 'is female' in humans, or 'is less than 10 and 'is greater than 20' in the natural
numbers. We then need Qx to be a property compatible with either say, 'is Russian' or
'is even'. The two sets of properties would give the following inferences when
substituted into D':
1. Some males are Russian. 2. Some Russians are female. So, some males are female.
And:
1. Some numbers less than 10 are even. 2. Some even numbers are greater than 20.
So, some numbers less than 10 are greater than 20.
The first has true premises and a false conclusion at any rate if we idealise away
certain difficult arguable borderline cases (e.g. of hermaphrodites). One advantage of
using properties of natural numbers is that everything is clear cut: there are no
borderline cases. So in the second substitution it is completely clear that we have true
premises and a false conclusion. Hence this second inference forms a clear-cut
COUNTEREXAMPLE to our original inference D. (The first would count as clear-
enough-cut for me, so don't worry if you are happier dealing with ordinary properties
than with numerical ones.)
So, our account of validity in predicate logic does indeed characterise intuitively invalid
inferences as invalid. In order to put the account into precise form, we need first to give
a precise account of the idea of substituting one set of properties and individuals for
another and for this we in fact use the central notion of an INTERPRETATION.
C3( B ): T HE IDEA OF AN I NTERPRETATION OF A S ET OF PREDICATE LOGIC
SENTENCES
For technical reasons (to be discussed later) it is not sufficient to interpret 'for all x' as
meaning 'for anything at all'. Instead we need to specify some set (the set of all (past,
present and future) humans, the set of natural numbers, or the set of all physical bodies
in the universe, or whatever) as the so-called domain of the interpretation. Hence 'for
all x' will mean 'for all objects in the domain' (whatever the domain set might be in the
particular interpretation concerned) and 'some x' will mean 'for at least one object in
the domain'.
How about the constants? There are two types of constants in our system: individual
constants naming individuals, and predicate constants standing for properties.
Obviously we must pick out some particular element of the domain to associate with
each individual constant (perhaps Socrates or Marilyn Monroe if the domain is humans,
the number 4 or the number 77 if the domain is natural numbers). And we must
associate with each predicate some particular property that makes sense when applied
to the domain (so it might be 'is male' if the domain is humans, or 'is even' if the
domain is natural numbers). In sum, an INTERPRETATION of a set of sentences of
predicate logic consists of:
(1) a specification of a non-empty set of objects as the DOMAIN of the interpretation;
(2) an assignment to each individual constant of some particular element of the domain;
and
(3) an assignment to each predicate letter of some property that makes sense when
applied to objects in the domain.
Example 1:
{∀x(Px → Qx), Pa, ∃x(Px & Qx)}
Interpretation A:
Domain: {natural numbers}
Px: x is even
Qx: x is divisible by 2
a: 4
Under this interpretation the sentences read:
{Every natural number which is even is divisible by 2; 4 is even; there are natural
numbers that are both even and divisible by 2.}
Of course, we are only allowed to specify one meaning for a constant in any given
interpretation. P can't mean one thing in the first sentence in Example 1 and something
else in the second.
There is nothing in the notion of interpretation that requires the interpreted sentence
to be true the interpretation just turns the symbolic formulae back into ordinary true
or false assertions. E.g., the following would also be an interpretation of the set of
sentences in Example 1:
Interpretation B:
Domain: {natural numbers}
Px: x is even
Qx: x is odd
a: 5
Under this interpretation the first two sentences in the set are false. (Exercise: Write
out the interpreted sentences.)
Interpretations which make all the sentences in a given set true are especially
important: they are called MODELS of the set. So models are special among
interpretations they are those interpretations that happen to make the interpreted
set of sentences all true.
Example 2:
{∀x((Px v Qx) → ¬Rx), ∃x(¬Px & Rx), ¬Pa & ¬Ra, Qb & ¬Rb}
Interpretation A:
Domain: {UK citizens of voting age}
Px: x is a criminal
Qx: x is a member of the Royal Family
Rx: x has a vote
a: Prince Charles
b: The Queen
{Any UK citizen of voting age who is either a criminal or a member of the Royal Family
has no vote; Some UK citizens of voting age who are non-criminals have a vote; Prince
Charles is not a criminal and does not have a vote; The Queen is a member of the Royal
Family and does not have a vote}
This interpretation happens to be a model (or so I believe legal experts may want to
correct me).
Interpretation B:
Domain: {natural numbers}
Px: x is even
Qx: x is prime
Rx: x is odd
a: 2
b: 4
Under this interpretation the sentences read:
{Any natural number which is either even or prime is not odd; There are some natural
numbers which are both not even and odd; 2 is not even and not odd; 4 is prime and is
not odd}
This interpretation is not a model of the set of sentences the second sentence is true
in this interpretation, but to be a model of the set S the interpretation must make all
the formulas in S true.
Definition: Validity:
Let S be a set of formulas of first order predicate logic and s a single such sentence. The
inference from S to s is VALID iff there is NO interpretation of the set of sentences S
U {s} which makes all the sentences in S true but s false. (Or, more briefly, such an
inference is valid iff every model of S is also a model of s, i.e. iff there is no model of S
U {¬s}.)
(Exercise: This is a central definition, take time to think it through carefully and check
that you understand why the various alternative formulations of the notion are
equivalent.)
C3( C ): D EMONSTRATING V ALIDITY AN D I NVALIDITY
Although it is true that there are no counterexamples to our Socrates inference, it is not
at all clear how we could actually demonstrate this. We would, it seems, have to check
in turn EVERY POSSIBLE interpretation of the formal sentences in the inference
scheme A' above to see that no interpretation is a model of the premises but at the
same time makes the conclusion false. However, since there are infinitely many
possible interpretations, this is clearly an impossible task.
(You might wonder why this problem does not arise in truth functional logic. There too
the reason for the validity of the inference from, say, the set of premises {p q, p} to
the conclusion q is that no matter which ordinary sentences are substituted for p and
for q, we never have true premises and a false conclusion. But again it would be
impossible to check all possible substitutions (p could be today is Wednesday, The
Moon is made of green cheese, all electrons have negative charge and so on
indefinitely and the same for q.) In the truth functional case, however, we can readily
bring the problem down to manageable proportions: all that we need to ask of given
sentences which are taken as interpretations of p and q is whether they are true or
false, hence we can partition the infinite set of all possible interpretations of, in this
case, p and q, into finitely many (in this case four) sub-sets: (1) those in which p and q
(whatever particular sentences they happen to be, whether about the day of the week,
the moon, or electrons or whatever) are both true; (2) those in which p is true and q
false; (3) those in which p is false but q true; and (4) those in which both are false. This
is precisely what we do in employing the truth table method, or one of its derivatives,
to decide truth functional validity. However, there is no obvious way of reducing the
problem for predicate logic to finite proportions in a similar fashion. In fact, as we will
see, not only is no such way apparent, no such way exists! So we will have to approach
the issue of demonstrating validity in predicate logic in a different way.)
So the basic definition, although it tells us precisely what it means for a predicate logic
inference to be valid, gives us no hint of how we might demonstrate validity. On the
other hand, we can specify conditions under which we would have demonstrated
invalidity: all that it takes to show that an inference is invalid is to produce a single
interpretation in which the premises are true and the conclusion false, or as we shall
say, a single counterexample.
Example 1:
Take again inference D (the one about the judges) above. This was formalised,
remember, as:
1. ∃x(Px & Qx)
2. ∃x(Qx & Rx)
So, ∃x(Px & Rx)
Now consider the following interpretation:
Domain: {natural numbers}
Px: x is even
Qx: x > 10
Rx: x is odd
This is indeed obviously a counterexample: the premises are true and the conclusion
is false. Hence the original inference D (which was, remember, about judges being out of
touch and stuff) is invalid because it has an invalid form and this is shown by the fact
that there is a content-wise completely different inference which nonetheless has the
same form and which has true premises and a false conclusion.
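In fact we can make the counterexample completely mechanical by taking a small finite domain, say the numbers 1 to 30, as the domain of the interpretation. The following Python sketch (illustrative only) evaluates the premises and the conclusion over that domain:

domain = range(1, 31)                 # a finite domain: the numbers 1..30
P = lambda x: x % 2 == 0              # x is even
Q = lambda x: x > 10                  # x is greater than 10
R = lambda x: x % 2 == 1              # x is odd

premise1   = any(P(x) and Q(x) for x in domain)   # Ex(Px & Qx)
premise2   = any(Q(x) and R(x) for x in domain)   # Ex(Qx & Rx)
conclusion = any(P(x) and R(x) for x in domain)   # Ex(Px & Rx)

print(premise1, premise2, conclusion)  # True True False: true premises, false conclusion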
Example 2:
1. All elementary particles are either positively charged, negatively charged or
electrically neutral.
2. All neutrinos are elementary particles and are not positively charged.
So, all neutrinos are electrically neutral.
Here both the premises and the conclusion happen to be true, but the inference is
(rather obviously) invalid. To show that it is invalid we first formalise it (with Tx: 'x is a
neutrino', Px: 'x is an elementary particle', Cx: 'x is positively charged', Nx: 'x is
negatively charged' and Ex: 'x is electrically neutral'):
1. ∀x(Px → (Cx v Nx v Ex))
2. ∀x(Tx → (Px & ¬Cx))
So, ∀x(Tx → Ex)
And then produce an interpretation in which the premises remain true but the
conclusion is false. For example:
Tx: x is an electron
In other words, we just substitute 'electrons' (in our (re-)interpretation) for 'neutrinos'
(in the original inference). The premises are still true (electrons are elementary
particles and are not positively charged). But the conclusion is false since, as a matter
of fact, electrons are negatively charged, not electrically neutral.
The outcome of this section, then, is that IF we can actually produce a counterexample
to an inference, then it must be invalid. The next question is whether there is some
SYSTEMATIC way of producing counterexamples to inferences that are in fact invalid?
That is, is there something analogous in predicate logic to the truth-table, or semantic
tree decision procedures we developed for truth functional logic? These methods for
truth-functional logic were, remember, completely algorithmic you could apply either
method completely automatically to any inference in the language of truth functional
logic and the method would give you an answer valid or invalid in a finite number of
steps. The answer to the question for predicate logic generally - is NO: there is no
general algorithmic method for producing counterexamples to invalid inferences.
Instead you have to exercise your ingenuity to some extent.
Looking at our two examples, however, it is clear that we are not thrown back simply
on undirected trial and error. In Example 1, we can reason as we did earlier: The
premises require there to be some Ps that are Qs and some Qs that are Rs but leave
open the possibility that none of the Ps that are Qs are also Rs. It is just, then, a
question of actualising this possibility by finding incompatible predicates Px and Rx,
each of which is however separately compatible with Qx. This is what the
interpretation we found does (go back and check).
In Example 2 it is still clearer why the inference is invalid. The premises taken together
leave open two possibilities for the neutrinos the conclusion asserts that one of these
holds, but it is clearly possible that it might have been the other (namely negative
charge). Again it is a question of finding an interpretation that actualises this
possibility.
If, however, you have no immediate intuitions about how to produce a counterexample,
or indeed about whether or not a given inference is invalid, then you just have to try a
few interpretations and hope that eventually light will dawn. (With practice, it will.)
C4: C ONSISTENCY AND I NDEPEND ENCE
As I already indicated, the definition of validity of inference tells us what it means for an
inference to be valid, but it doesn't tell us how to demonstrate the validity of any
inference and it doesn't even tell us when we could justifiably assert that an inference
is valid. (Clearly simply trying to find a counterexample and failing is not sufficient.
There are infinitely many possible counterexamples and we may simply not yet have
looked hard or long enough.) The natural next step would be to remedy this defect. But
this requires the introduction of some new ideas, and there are some other important
logical notions which, like that of invalidity of inference, can be dealt with directly
using the ideas of interpretations and models that we have already introduced. So let's
pause to introduce them and turn to the new ideas needed to demonstrate validity
later.
The further important notions at issue are those of the CONSISTENCY of a set of
sentences and of the INDEPENDENCE of a single sentence from a given set of
sentences.
(1) A set of sentences S in the language of (first order) predicate logic is consistent iff
there is a model of S i.e. a single interpretation under which all the sentences in S are
true.
(2) Let s be a single sentence and S be a set of sentences (all in the language of predicate
logic). s is independent of S iff neither s nor ¬s is validly inferrable from S as premises
(this requires there to be two models: one of S and ¬s (showing the invalidity of the
inference from S to s) and one of S and ¬¬s (i.e. s) (showing the invalidity of the
inference from S to ¬s)).
(Notice that, just as in truth functional logic, the notions of consistency and
independence are closely related: s is independent of S iff both the set S U {s} and the
set S U {¬s} are consistent.)
(Important Exercises:
(1) Explain carefully why it is true that, in predicate logic, s is independent of S iff both
S U {s} and S U {¬s} are consistent.
(2) ANY sentence in predicate logic is validly inferrable from an inconsistent set of
such sentences. True or false?
Example 1:
Consider the set of sentences {All philosophers are either rationalists or empiricists;
Some rationalists are obscure; No philosophers are obscure}. Not all of these sentences
are true in the real world (notably the last!). The set is, however, consistent: all the
sentences could be true together (even though as a matter of fact they aren't). (All it
would take, as some of you might see intuitively, is for some rationalists not to be
philosophers and for those non-philosopher rationalists to be the obscure ones.)
Consistency is demonstrated by first formalising the sentences: {∀x(Px → (Qx v Rx)),
∃x(Qx & Sx), ¬∃x(Px & Sx)}.
Here, if you think about it, we clearly need Px and Sx to be incompatible predicates
within whatever domain we select, while Qx and Sx must be compatible, and all Ps
must be either Qs or Rs. The following interpretation will in fact work:
Domain: {animals}
Px: x is a dog
Qx: x is male
Rx: x is female
Sx: x is human
Under this interpretation, our symbolic sentences yield the following set:
(All dogs are either male or female; There are some male humans; There are no dogs
that are human)
All these sentences (including the last no matter what some dog owners may think)
are of course true. Hence the original set of sentences that was about philosophers
is consistent.
Example 2:
Consider the set of sentences {All students, except those studying logic, are lazy;
Anyone who is lazy will do badly in the exams; All students who frequent the Three
Tuns will do badly in the exams; All students either study logic or frequent the Three
Tuns}.
Again not all sentences in this set are true! Are they, nonetheless, jointly consistent?
First we formalise and obtain: {∀x((Px & ¬Qx) → Rx), ∀x(Rx → Sx), ∀x((Px & Tx) →
Sx), ∀x(Px → (Qx v Tx))}
(Here Px stands for 'x is a student', Qx for 'x is studying logic', Rx: 'x is lazy', Sx: 'x will
do badly in the exams', Tx: 'x frequents the Three Tuns' note the construction for
except in the first sentence.) The following interpretation is a model of this set of
symbolic sentences and hence demonstrates the consistency of the original set:
Domain: {whole numbers greater than zero}
Px: x is a whole number greater than zero
Qx: x is odd
Rx: x is divisible by 2
Sx: x is the sum of two odd numbers
Tx: x is even
Under this interpretation the set reads:
(All whole numbers greater than zero, except the odd ones, are divisible by 2; Any
whole number divisible by 2 is the sum of two odd numbers; Any even whole number
greater than zero is the sum of two odd whole numbers; Any whole number greater
than zero is either odd or even.)
These are all true (the 2nd and 3rd sentences being two forms of an elementary
theorem of arithmetic).
Example 3:
Is s = 'Some judges are out of touch with ordinary life' independent of the set S =
{Some judges are wealthy; Some wealthy people are out of touch with 'ordinary life'}?
This, if true, would mean we can infer neither s nor ¬s from S. First, formalise: S
becomes {∃x(Px & Qx), ∃x(Qx & Rx)} and s becomes ∃x(Px & Rx) (with Px: 'x is a judge',
Qx: 'x is wealthy' and Rx: 'x is out of touch with ordinary life').
We already know that s is not validly inferrable from S, via the interpretation given
earlier when we were establishing that certain inferences were invalid:
Domain: {humans}
Px: x is male
Qx: x is Russian
Rx: x is female
If the inference from S to ¬s were valid, then there would be no counterexample, i.e. no
interpretation in which the sentences in S are true and ¬s is false; but if ¬s is false, then
s is true, so this would require there to be no interpretation in which the sentences in S
U {s} are all true. So if we can show that there is such an interpretation, then ¬s is not
validly inferrable from S. And in fact the following interpretation fits the bill:
Domain: {natural numbers}
Px: x is even
Qx: x is divisible by 4
Rx: x is divisible by 8
Under this interpretation, S reads {Some even numbers are divisible by 4; Some
numbers divisible by 4 are divisible by 8} and s is Some numbers are divisible by 8. All
these sentences are of course true.
Example 4:
Is the set {All logic lectures are interesting; Some logic lectures are notinteresting}
consistent?
It formalises, for the obvious meanings for Px and Qx, as: {x(Px Qx), x(Px & Qx)}.
The answer from intuition is obviously 'no'. But so far we don't know how to show this.
If we tried to produce an interpretation that was a model, we would of course fail. But
failure (so far) to produce a model doesn't entail that there isn't one.
Example 5:
Is the sentence s = All Greeks are mortal independent of the set S: {All Greeks are men;
All men are mortals}? The answer is again obviously 'no' since, as we already
remarked when discussing validity of inference, s here is validly inferrable from S. But
again as already remarked, we can't show this to be true on the basis of consideration
of interpretations.
C5: F INITE I NTERPRETATIONS /M ODELS
If you are asked to show that a set of sentences in predicate logic is consistent then you
have to produce a model of that set. As in the case of finding counterexamples to
inferences, there is no general algorithmic method. However, some consistent sets of
sentences and all the ones that you will be asked about admit of a finite model.
With finite models, it is possible to work in an almost completely systematic way. A
finite interpretation is simply one in which the domain set (i.e. the set of all the
objects over which the variables range) instead of being infinite (like the set of all
natural numbers) or indefinite (like the set of all humans) is finite (perhaps the set: {1,
2, 3}). And a finite model of a set of sentences is, of course, just a finite interpretation
under which all the sentences turn out true.
The second step toward the finite model technique concerns the predicates.
Philosophers like to say that predicates can be specified either intensionally or
extensionally. The INTENSION of a predicate is its meaning: so 'x is red' means x
(whatever it is) is red and 'x is an even number' means x is an even number. The
EXTENSION of a predicate, on the other hand, is the set of all individuals that possess
the property concerned so the extension of 'is red' will be a whole big set of things
including various Ferraris, Rita Hayworth's hair, various Liverpool football shirts and
so on. The extension of the predicate 'x is an even natural number' is the set {2, 4, 6, 8,
... }.
Now we said that when supplying an interpretation, we first specify a basic set as the
domain of the interpretation. Within such an interpretation, each predicate will have as
its extension some subset of the domain set. So if the domain is the set of all past and
present cars then 'x is red' determines a subset of that set, viz. the subset which
contains all and only the red cars. If the domain is the set of all natural numbers,
then the extension of 'x is even' is again the set of all even numbers, which of course is a
subset of the set of all natural numbers. More importantly for present purposes, if the
domain set is the set of all natural numbers up to and including 10, i.e.
{1,2,3,4,5,6,7,8,9,10} then the predicate 'x is even' determines the subset {2, 4, 6, 8, 10}
of that set.
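As a purely illustrative aside (nothing in the logic depends on this), the extension of a predicate over a finite domain can be written down completely mechanically. The following short Python sketch, with arbitrarily chosen names, computes the extension of 'x is even' within the domain {1, 2, ..., 10} and confirms that it is a subset of the domain:

    # Illustrative sketch only: a finite domain and the extension of 'x is even'.
    domain = set(range(1, 11))                   # the domain {1, 2, ..., 10}
    even = {x for x in domain if x % 2 == 0}     # extension of 'x is even'
    print(sorted(even))                          # [2, 4, 6, 8, 10]
    print(even <= domain)                        # True: the extension is a subset of the domain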
The basic ideas of the finite model technique are these: choose a small finite set as the domain; specify the extension of each predicate as a subset of that domain (and interpret each individual constant as naming some member of the domain); and then check each sentence directly, by running through the finitely many members of the domain.
So, if we are considering two predicates, say Px and Qx, an individual constant a, and a
finite domain, say {1,2,3}, then our finite interpretation might make the extension of
the predicate Px the subset {1,2}, the extension of the predicate Qx the subset {2,3}, and
might make the individual constant a stand for 3. This means that any individual x in
the domain has the property P just in case it is a member of the subset {1,2}, that is, just
in case x=1 or x=2, and has the property Q just in case it is a member of the subset
{2,3}, that is, just in case x=2 or x=3. Given that the individual constant a has been
interpreted as naming 3, the sentence Pa would be false in this interpretation, since
3 is not an element of the extension of Px (i.e. 3 ∉ {1,2}); while the sentence Qa would
be true, since 3 ∈ {2,3}. (Here, in line with general set theory, we use the symbol ∈, a
stylised Greek epsilon, to stand for 'is a member of', and ∉ for 'is not a member of'.)
Consider, for example, the sentence ∀x(Px → Qx). This means that everything that has
the property P also has the property Q. In terms of extensions this means, of course,
that everything in the extension of P is also in the extension of Q. (So All men are
mortal, if true, means that the set of all men is a subset of the set of all mortals.) This is
easy to check in the case of a finite domain.
Given the way that we have interpreted P and Q in the finite interpretation just
specified (namely with P having the extension {1,2} and Q the extension {2,3}), ∀x(Px
→ Qx) is in fact FALSE, since 1 is in the extension of P but not in the extension of Q (i.e. it
is not true that {1,2} is a subset of {2,3}).
The sentence ∃x(Px & Qx), for example, means that at least one thing has both the
property P and the property Q. That is, it says that the extensions of P and Q have at
least one element in common. Again this is easy to check in the case of finite
interpretations.
In the interpretation just given ∃x(Px & Qx) is in fact true since the element 2 is of
course an element both of the extension of Px, viz. {1,2}, and of the extension of Qx, viz.
{2,3}.
Other quantified sentences are just as straightforward. For example, ∀xPx means that
the extension of P coincides with the whole domain set. The finite interpretation we
have been considering as a simple example has domain {1,2,3} while Px was given the
extension {1,2}; this of course means that the sentence ∀xPx is false in this
interpretation, since 3 is in the domain set but not in {1,2}.
∀x(Px v Qx) means that everything in the domain is either in the extension of P or in the
extension of Q (or of course in both). This sentence is in fact true in our interpretation:
because when we put together the elements in the sets {1,2} and {2,3} (when we form
the union of these two sets, as set-theoreticians say) we get the whole domain set
{1,2,3}.
How about a sentence involving negation, such as ∀x(Px → ¬Qx)? This means that
everything in the extension of P fails to be in the extension of Q (in set-theoretic
terminology, the two sets have an empty intersection). The sentence is therefore
false under the interpretation we are considering, since 2 is in the extension of P but
also in that of Q, i.e. 2 does not fail to be in the extension of Q. If this seems a bit opaque,
think of it this way: the sentence ∀x(Px → ¬Qx) amounts, in the domain which has only
the 3 members 1, 2, 3, to the finite conjunction (P1 → ¬Q1) & (P2 → ¬Q2) & (P3 → ¬Q3);
in order for this to be true all the conjuncts have to be true individually, but in fact
(P2 → ¬Q2) is false since it has a true antecedent (2 ∈ {1,2}) but a false consequent: it is
Q2 that holds, not ¬Q2, since 2 ∈ {2,3}.
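To make the finite-conjunction idea completely concrete, here is a short Python sketch (purely illustrative, with arbitrarily chosen names) that checks each of the sentences just discussed against the running interpretation, with domain {1, 2, 3}, P given the extension {1, 2}, Q the extension {2, 3} and a naming 3. The universal quantifier becomes all(...) over the domain and the existential quantifier becomes any(...):

    # Illustrative sketch: evaluating sentences in the finite interpretation above.
    domain = {1, 2, 3}
    P = {1, 2}        # extension of Px
    Q = {2, 3}        # extension of Qx
    a = 3             # the individual named by the constant a

    print(a in P)                                              # Pa: False
    print(a in Q)                                              # Qa: True
    print(all((x not in P) or (x in Q) for x in domain))       # ∀x(Px → Qx): False (1 is in P but not in Q)
    print(any((x in P) and (x in Q) for x in domain))          # ∃x(Px & Qx): True (2 is in both)
    print(all(x in P for x in domain))                         # ∀xPx: False (3 is not in P)
    print(all((x in P) or (x in Q) for x in domain))           # ∀x(Px v Qx): True (P and Q together cover the domain)
    print(all((x not in P) or (x not in Q) for x in domain))   # ∀x(Px → ¬Qx): False (2 is in both)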
Example 1:
Is the set S = {∀x(Px → Qx), ∃x(Qx & ¬Px), ∃x(Qx & Px), Pa} consistent?
Let, arbitrarily, the domain be the first three natural numbers {1, 2, 3} and use P and Q
(bold face) as names of the sets (extensions) associated with the predicates Px and Qx.
Let, arbitrarily, the interpretation of 'a' be '1'. Then in order to make the final sentence
in the set true 1 must be an element of P.
To make the third sentence true there must be at least one thing that is in P that is also
in Q. We may as well let that be '1' also (for the time being we can always come back
and change it if things don't work out). So we put 1 in Q.
For the second sentence to be true there has to be at least one thing in Q which is not
also in P - let that be '2'.
P: {1}
Q: {1, 2}
a: 1
In fact, the first sentence is also true in this interpretation since everything that is in P
(just '1') is also in Q. So this interpretation is indeed a model of this set of sentences S
and this demonstrates S is consistent.
(How many elements you need in the domain will depend on the particular sentences
involved. There's nothing magic about having three elements indeed in this case two
would clearly have been sufficient. A good approach would be to start with 2 members
in the domain set and add others only if that becomes necessary.)
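Since everything here is finite, the search for a model can itself be made mechanical. The following Python sketch (again purely illustrative; the set S and the two-element domain follow the example and the advice just given) simply tries every possible choice of extensions for P and Q and every interpretation of a, and keeps those that make all four sentences true:

    # Illustrative sketch: brute-force search for a model of
    # S = {∀x(Px → Qx), ∃x(Qx & ¬Px), ∃x(Qx & Px), Pa} over the domain {1, 2}.
    from itertools import combinations

    domain = [1, 2]

    def subsets(s):
        # every subset of s, as a set
        return [set(c) for r in range(len(s) + 1) for c in combinations(s, r)]

    def is_model(P, Q, a):
        s1 = all((x not in P) or (x in Q) for x in domain)   # ∀x(Px → Qx)
        s2 = any((x in Q) and (x not in P) for x in domain)  # ∃x(Qx & ¬Px)
        s3 = any((x in Q) and (x in P) for x in domain)      # ∃x(Qx & Px)
        s4 = a in P                                          # Pa
        return s1 and s2 and s3 and s4

    models = [(P, Q, a) for P in subsets(domain)
                        for Q in subsets(domain)
                        for a in domain
                        if is_model(P, Q, a)]
    print(models)   # e.g. [({1}, {1, 2}, 1), ...]: essentially the model found above

Nothing like this is needed for the exercises, of course; the point is just that, once the domain is finite, "finding a model" is a finite search.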
Example 2:
Is the set {∀xPx, ∀x(Px → (Qx v Rx)), ∃x(Px & ¬Qx), ∀x(Qx ↔ ¬Rx)} consistent? Take the domain to be {1, 2, 3}.
Clearly the first sentence requires that everything in the domain be in P, so 1, 2 & 3
are all in P. The second sentence, given the first, requires everything to be either in Q or
in R; let's say (this is just a decision) that 2 and 3 go in Q and 1 in R. This also makes
the 3rd sentence ∃x(Px & ¬Qx) true since the sentence says that there is some element
of the domain that is in P that is not also in Q and, as things stand, this is true of the
element 1. Finally, the fourth sentence requires that any element from the domain is in
Q just in case it is not in R that is, in effect, that Q and R exhaust the domain between
them and with 1 in R and 2 and 3 in Q that is exactly the case. (Again if this is opaque, it
may be useful to think in terms of finite conjunctions: given that there are only three
elements in the domain, what it takes for the sentence ∀x(Qx ↔ ¬Rx) to be true is for
(Q1 ↔ ¬R1) & (Q2 ↔ ¬R2) & (Q3 ↔ ¬R3) to be true, and if you work through the
conjuncts (on the basis of the interpretation) you will find that they are all true (e.g. 2 is in
Q but not in R, so it is Q2 (True) ↔ ¬R2 (True), and True ↔ True is of course
true. Exercise: work through the other conjuncts.)
Domain: {1,2,3}
P: {1,2,3}
Q: {2,3}
R: {1}
Example 3:
Is the set {∀x(Px → Qx), ∀x(Px → ¬Qx)} consistent?
(It might be the formalisation of {All ravens are black; No ravens are black (all ravens
are not black)}.)
This is certainly a funny pair of sentences but it is NOT an inconsistent one. The two
sentences are consistent so long as there are no P's (no ravens in the example). One
permissible interpretation for any predicate is as the EMPTY SET, the set with no
members, written ∅ (Greek lower-case phi). So, e.g., the following is a finite model:
Domain: {1, 2}
P: ∅
Q: {1, 2}
Again thinking in terms of finite conjunctions should remove any lingering feeling of
unclarity. ∀x(Px → Qx) is equivalent in our domain to (P1 → Q1) & (P2 → Q2) and both
of these conditionals are true by the truth table for conditionals: in both cases the
antecedent is false (nothing is a P, and so 1 isn't, i.e. P1 is false, and 2 isn't, so P2 is false);
both the consequents are true (Q1 and Q2 are both true), but 'false then true' is true. As
for ∀x(Px → ¬Qx), this amounts to (P1 → ¬Q1) & (P2 → ¬Q2) and again this is true:
now both conjuncts have false antecedents and false consequents (since both 1 and 2
are Q, ¬Q1 and ¬Q2 are false), but 'false then false' is also true.
Notice, by the way, that the set of sentences {Pa, ∀x(Px → Qx), ∀x(Px → ¬Qx)} cannot be
given a model in this way. (Exercise: Explain carefully why.)
Finite interpretations can also be used to show that a given single sentence s in the
language of predicate logic is independent of a set S of such sentences.
Example 4:
Is s = '∀x((Px & Qx) → (Sx v ¬Rx))' independent of S = {∃x(Px & Qx & Rx), ∃x(Px & Qx &
¬Rx), ∃x(Sx & Px), Pa & Ra, Qb & Sb}?
Interpretation (a):
Domain: {1,2,3}
P: {1,2,3}
Q: {2,3}
R: {1,2}
S: {2}
a: 1
b: 2
∃x(Px & Qx & Rx) is true because P2 & Q2 & R2 is true; '∃x(Px & Qx & ¬Rx)' is true
because P3 & Q3 & ¬R3 is true; S2 & P2 is true, which means that ∃x(Sx & Px) is true; and
Pa & Ra and Qb & Sb are both true given the interpretation of a and b; finally s is true,
since all those things which are in both P and Q (viz. 2 and 3) are either in S (in the case
of 2) or not in R (in the case of 3).*
*Rather than saying 3 is not in R, we could define R̄ as the complement of R, that is,
the set of all objects in the domain which are not in R; in this case R̄ = {3} and of course
3 is in R̄ since 3 is in {3}.
Interpretation (b):
Domain: {1,2,3}
P: {1,2,3}
Q: {2,3}
R: {1,3}
S: {2}
a: 1
b: 2
We now need to make s false: this means there must be at least one thing which is both
in P and Q but is not either in S or in R̄ (i.e. the complement of R, the set of all things in
the domain but not in R). I have made this true of 3: 3 is in P and Q but it is neither in S
nor in R̄ (since it is in R). The sentences in S, however, are all true again. ∃x(Px & Qx &
Rx) holds because P3 & Q3 & R3 is true; ∃x(Px & Qx & ¬Rx) holds because P2 & Q2 &
¬R2 is true; ∃x(Sx & Px) holds because S2 & P2 is true; Pa & Ra holds because P1 & R1
are true; and Qb & Sb holds because Q2 & S2 are true.
The finite model technique allows us to be much more systematic in the search for
models of various sets of sentences. However, for the full predicate calculus (to be
introduced later) the technique is incomplete. That is, it can be shown that not every
set of sentences which has a model has a finite model. The method works only one
way: IF we can find a finite model, THEN the set of sentences is consistent; but the set
may be consistent without there being a finite model of it. This will be the case iff there
are consistent sets of sentences which only have infinite models and there are. But
all of the cases that you will be asked to deal with in the exercises can be worked using
finite interpretations/models.
C6: D EMONSTRATING V ALIDITY (M ONADIC P REDICATE C ALCULUS )
So we now know how to use interpretations both infinite and finite to establish
invalidity of inference, consistency of a set of sentences and independence of a single
sentence from a set of sentences. We can't use them, however, to show VALIDITY of
inference, inconsistency and dependence (that is, lack of independence). (Make sure
that you fully understand why not.) We therefore resort to a different idea to tackle
these problems. The idea is that of a FORMAL PROOF. The validity of an inference, for
example, will be established by deriving its conclusion from its premises using certain
permitted RULES OF PROOF.
Lets plunge straight in by giving a formal derivation, even though it wont make much
sense initially, and then analyse it so that it does make sense. And lets stick to our
time-worn examples.
Example 1:
1. ∀x(Px → Qx)
2. Pa
Therefore, Qa
The following is a formal proof of its conclusion from its premises (and therefore a
demonstration of the validity of the inference).
1. ∀x(Px → Qx)      Premise
2. Pa               Premise
3. Pa → Qa          US, 1
4. Qa               TI, 2, 3
Example 2:
1. ∀x(Px → Qx)
2. ∀x(Qx → Rx)
Therefore, ∀x(Px → Rx)
Proof:
1. ∀x(Px → Qx)      Premise
2. ∀x(Qx → Rx)      Premise
3. Px → Qx          US, 1
4. Qx → Rx          US, 2
5. Px → Rx          TI, 3, 4
6. ∀x(Px → Rx)      UG, 5
As before (and as always in a formal proof), the justification for each step is given on
the right; so lines 1 and 2 are justified by the fact that they are given to us as premises,
steps 3 and 4 are both applications of the rule 'US' (to lines 1 and 2 respectively), step 5
uses TI (tautological implication) on lines 3 and 4, and finally the rule applied at line 6
and based on line 5 is called the 'rule of universal generalisation' (UG).
This second example indicates the form of very many proofs in predicate logic: we use
rules (like US) to drop quantifiers (lines 3 and 4), manipulate the unquantified
formulas essentially truth-functionally via the single but (as we will see multi-faceted)
rule of tautological implication; and then apply generalisation rules, as at step 6 in
this case 'universal generalisation' (UG) to restore the quantifier(s).
Of course the required rules are not just any old rules. We clearly want them to have
the property that whenever we properly apply them to derive a
particular conclusion from some premises then the inference from those premises
to that conclusion is valid (in the sense that we have specified: no interpretation in
which the premises are true and the conclusion false). Any set of rules of proof which
has this property is called a SOUND set of rules. We would also like the rules of proof
to have the further property (the converse of the one just stated) that whenever
an inference from some premises to a conclusion is in fact valid then there is a
formal proof of the conclusion from the premises using the specified rules of proof.
Any set of rules which has this property is called a COMPLETE set of rules. The rules of
proof that I shall introduce are indeed both sound and complete though for the
purposes of this course, we shall take this for granted rather than proving it (the proof
is quite complex).
C6( A ): T HE R ULE OF T AUTOLOGICAL I MPLICATION (TI)
Let's start with the RULE OF TAUTOLOGICAL IMPLICATION (TI). One formula, G, is
tautologically implied by some number of other formulas F1, ..., Fn iff the single
formula (F1 & ... & Fn) → G is a tautology (in the sense specified in truth-functional
logic). So, you pretend that the formulas involved (again I'll say more precisely what a
formula is soon) are truth-functional atoms and then apply the above test.
So, consider step 4 in Example 1 above: there we use TI to justify the step from 'Pa' and
'Pa → Qa' to 'Qa', and this is a correct application of the rule since the sentence '(Pa &
(Pa → Qa)) → Qa' is indeed a tautology (that is, if we regard Pa and Qa as just truth-
functional atoms, say p and q, the resulting sentence '(p & (p → q)) → q' is a truth-
functional tautology). This means that Qa is indeed tautologically implied by Pa and Pa →
Qa. Hence, since the proof already has the steps Pa and Pa → Qa, we are allowed to
derive Qa from lines 2 and 3 of that proof by TI.
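Whether a given sentence of this kind really is a tautology can always be settled by a truth table, and (as a purely illustrative aside, with arbitrarily chosen names) that check is easy to mechanise. The Python sketch below runs through every assignment of truth values to the atoms and confirms the tautology behind the TI step just described, together with the Modus Tollens pattern that appears below:

    # Illustrative sketch: a brute-force truth-table test for tautology-hood.
    from itertools import product

    def implies(p, q):
        # the truth table for the conditional: false only when p is true and q is false
        return (not p) or q

    def is_tautology(formula, n_atoms):
        # 'formula' maps a tuple of truth values (one per atom) to True or False
        return all(formula(v) for v in product([True, False], repeat=n_atoms))

    # (p & (p → q)) → q : the tautology licensing the step from Pa and Pa → Qa to Qa
    print(is_tautology(lambda v: implies(v[0] and implies(v[0], v[1]), v[1]), 2))            # True
    # ((p → q) & ¬q) → ¬p : the Modus Tollens pattern
    print(is_tautology(lambda v: implies(implies(v[0], v[1]) and (not v[1]), not v[0]), 2))  # True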
Although the rule of tautological implication is all we need for these truth-functional
manipulations (it's a sort of single grand rule covering all cases), certain special cases
of it are used more frequently than others and correspond to certain classical logical
rules with established names. Some students prefer to remember the special cases as
well as the general rule. Here are some of them (the capital letters F, G, etc. indicate
any formulas of predicate logic):
Rule of Modus Ponens
The formula G can be inferred from the formulas F and F → G: so, for example, Qa follows
from Pa and Pa → Qa; (Pa & (Pa → Qa)) → Qa and (Px & (Px → Qx)) → Qx are
tautologies, since (p & (p → q)) → q is a tautology.
Rule of Modus Tollens
The formula ¬F can be inferred from the formulas F → G and ¬G. So, e.g., ¬Pa can be
inferred from Pa → Qa and ¬Qa (again, ((Pa → Qa) & ¬Qa) → ¬Pa is a tautology).
Rule of Hypothetical Syllogism
The formula F → H can be inferred from the formulas F → G and G → H. So, e.g., Px → Rx
is derivable from Px → Qx and Qx → Rx.
Rule of Simplification
Both the formula F and the formula G can be inferred from the formula F & G. So, e.g., Pa
can be inferred from Pa & Qa, and so can Qa; (Px → Qx) could be inferred from (Px →
Qx) & Rx, and so could Rx.
Rule of Disjunctive Syllogism
The formula F can be inferred from the formulas F v G and ¬G. So, for example, Pa
follows from (Pa v Qa) and ¬Qa.
(Important Exercise: Show that each of these rules of proof is a special case of the rule
of tautological implication.)
One further point to notice is that the rule of tautological implication applies equally
well to formulas that are fully-fledged quantified sentences. Consider, for example, the
inference:
1. If all LSE students are hardworking then some pigs can fly.
2. No pigs can fly.
Therefore, not all LSE students are hardworking.
Truth-functionally, this formalises as:
1. p → q
2. ¬q
So, ¬p
However, predicate logic being an extension of, or an elaboration on, truth-functional
logic, we could certainly also express the inference in predicate logic and, with our new
idea of a proof, also establish its validity there. In predicate logic the inference would
be formalised as:
1. ∀x(Px → Qx) → ∃x(Rx & Sx)
2. ¬∃x(Rx & Sx)
Therefore, ¬∀x(Px → Qx)
(The form of TI being here, of course, Modus Tollens: ((∀x(Px → Qx) → ∃x(Rx & Sx)) &
¬∃x(Rx & Sx)) → ¬∀x(Px → Qx) is a tautology, because if we replace the constituent sentences
∀x(Px → Qx) and ∃x(Rx & Sx) in this formula by the atoms p and q (while still retaining the
truth-functional structure) then we get ((p → q) & ¬q) → ¬p, which is a tautology (Check!).)
So the basic idea for most proofs in predicate logic is going to be: start with premises,
drop the quantifiers using the appropriate rules, manipulate the resulting quantifier-
free formulas using the rule of tautological implication (or equivalently the
appropriate one of its special forms, like Modus Ponens), then replace the quantifiers (if
necessary that is, assuming that theconclusion is a quantified sentence) using the
appropriate generalisation rules. There are basically four extra rules, then, that we
need now to introduce: one for dropping universal quantifiers, one for putting them
back, one for dropping existential quantifiers and one for putting them back.
C6( B ): T HE R ULE OF U NIVERSAL S PECIFICATION (US)
The rule for dropping universal quantifiers is called the rule of UNIVERSAL
SPECIFICATION (US). The simple basic underlying idea is that if some property holds
of every individual then it must hold of any individual in particular.
So if it is true for anything at all that if it's a man then it's mortal, then it follows that if
Socrates is a man then he's mortal, i.e. assuming a is our individual constant for
Socrates, we can infer Pa → Qa from ∀x(Px → Qx) by universal specification. We could
equally well infer Pb → Qb, Pc → Qc, etc. for any individual named by an individual
constant.
That is, one type of application of the rule of US is just the step from ∀xF to F[aj|x],
where F[aj|x] means the formula obtained from F by replacing all occurrences of x by
the individual constant aj. (So if, e.g., F is Px → (Qx v Rx), F[a|x] is Pa → (Qa v Ra); F[c|x]
is Pc → (Qc v Rc), etc.)
So, thinking about it purely formally or syntactically (and this is the BEST WAY to think
about these rules until you have fully got the hang of them), the rule of US says:
You can take any formula with a universal quantifier on some variable, say x, at the front
(and governing the rest of the formula that is, the quantification on x extends over the
whole formula) and infer from it the formula obtained by dropping the quantifier and, if
you like, replacing the occurrences of the originally quantified variable x by any variable
or by any individual constant.
So, for example, the formulas Pa → Qa, Pb → Qb, Px → Qx, Py → Qy can all be obtained
from ∀x(Px → Qx) by US.
C6( C ): T HE R ULE OF U NIVERSAL G ENERALISATION (UG)
How about putting universal quantifiers back? This is the role of the rule of
UNIVERSAL GENERALISATION. Clearly it would be quite wrong to infer from the fact
that 'Socrates is a man' that 'Everything is a man', or from the fact that Boris Johnson is
a liar that everyone is a liar. Nor, e.g., should we infer from 'If Boris Johnson is a liar,
then he is unelectable' that 'Everyone who is a liar is unelectable' (it might just apply
to Boris and some others but not to everyone). So we do NOT allow the inference from
Pa to ∀xPx or from (Pa → Qa) to ∀x(Px → Qx).
If you have a formula of the form F(x) involving some unquantified variable x (we shall
later call these free variables) then you may infer the formula ∀xF(x), in which that
initially unquantified variable is universally quantified over.
Because the x is genuinely arbitrary (in a geometry proof, for example, you suppose
simply that the object you are considering is a triangle, and you do not suppose that it
has any particular other properties, say that of being equilateral), it is intuitively OK to
generalise.
C6( D ): S OME S IMPLE D ERIVATIONS :
Using these three rules (TI, US, UG) we can demonstrate the validity of a wide range of
valid inferences. We already have seen a couple of these (1 and 2 above, you should go
over them again now); and here are a couple more.
Example 3:
Here, Px: x is a Philosopher; Qx: x is an empiricist; Rx: x is a rationalist and Sx: x knows
science. Notice that the conclusion might also have been formalised as ∀x(Px → (Sx →
Qx)). This should seem intuitively clear and the intuition is underwritten by the fact
that this second formalisation of the conclusion is logically equivalent to the first.
(Basically because (p & q) → r is tautologically equivalent to: p → (q → r). Exercise:
check this.)
4. Rx → ¬Sx          US, 2
6. (Px & Sx) → Qx    TI, 5*
*Think carefully about the quite complicated tautologies involved in these twosteps.
And exercise check that the tautologies involved are indeed tautologies.
(Further important exercise: I could also have collapsed steps 5 and 6 into one step
using of course a still more complicated application of TI. What tautology would be
involved in that single step? This is a general feature of predicate logic proofs any
series of successive applications of TI could be replaced by a single application of that
rule: how far you break down applications of TI into intermediate steps is always a
question of taste and of which tautologies you intuitively see as tautologies.)
Example 4:
That is:
Proof:
4. Px → Qx           US, 2
5. Rx → Sx           US, 3
*It would be wrong to go straight from 1 to, say, (Px → Qx) & (Rx → Sx). US only allows
you to go from a formula with a quantifier IN FRONT which governs the whole rest of the
sentence to that formula with the quantifier dropped so you need to detach the two
conjuncts in 1 first at steps 2 and 3 (by TI) and only then apply US to them
separately.
C6( E ): T HE R ULE OF E XISTENTIAL S PECIFICATION (ES)
So far we have dealt only with universally quantified formulas. How about the
existentially quantified ones? If we are going to drop quantifiers in this case too (we
are!) then we need to be careful. Clearly it would be wrong to infer from the statement
Some natural numbers are prime that some particular number is prime (that number
might actually be prime but the inference would still clearly be invalid). So we can't
infer, for example, from ∃xPx to Pa. ('Some numbers are prime. So 4 is a number that is
prime' is a counterexample.) Similarly, it would be clearly mistaken to infer from 'Some
triangles are isosceles' or from 'Some lectures are interesting' to 'An arbitrary triangle
is isosceles' or 'An arbitrary lecture is interesting'. That is, the inference from ∃xPx to
Px is also NOT sanctioned.
What we ARE entitled to infer from 'Some triangles are isosceles', for example, is, given
the minimal interpretation of 'some' that we have adopted, simply that there is at least
one isosceles triangle. Generally, from xPx we are entitled to infer only that there is at
least one (possibly unknown) individual with property P we cannot say that any
particular object has P, only that some particular object does (even though we may not
be able to name it). The idea, then, behind the rule of EXISTENTIAL SPECIFICATION
(ES) is to introduce a new sort of name a so-called ambiguous name. As 'ambiguous
names' we use letters from the beginning of the Greek alphabet: α, β, γ, ...
If, for instance, we know ∃xPx, then we know that at least one individual has property P
and we can "ambiguously christen" that individual 'α'. All we know about α is that it's a
particular individual that has property P.
The formula F[α|x] can be inferred from the formula ∃xF (where F[α|x] is the formula
you obtain from F by substituting α for x, so you can read it as 'F with α for x').
So, for example, we can infer Pα & Qα from ∃x(Px & Qx), and Rα v Sα from ∃x(Rx v Sx).
C6( F ): T HE R ULE OF E XISTENTIAL G ENERALISATION (EG)
The rule for putting existential quantifiers back the rule of EXISTENTIAL
GENERALISATION (EG) is intuitively more straightforward. Although, as we just
noted, we can't infer '5 is a prime number' from the fact that 'There are some prime
numbers' (even though it is true), it certainly does work the other way round: that is, it
does follow from the fact that '5 is a prime number' that 'There are prime numbers' (i.e.
on our understanding that 'there is at least one prime number'); it also clearly follows
from the fact that some ambiguously named entity α has property P that there is at
least one P; finally it follows from the fact that any arbitrary object x has property P that
there are some Ps. That is, Pα entails ∃xPx, Pa entails ∃xPx and Px entails ∃xPx.
(More formally):
Rule of EG: if G is a formula that results from a formula F by at most replacing either an
ambiguous name or an individual constant by a variable x, then ∃xG can be inferred from
F.
So, for example, ∃x(Px & Qx) can be derived from (Pα & Qα) or from (Px & Qx); ∃x((Px v
Qx) & Sx) can be derived from (Pa v Qa) & Sa; and so on.
C6( G ): S OME M ORE D ERIVATIONS
The only way to get straight about these rules is to get used to employing them to make
valid derivations. So here are a few more examples.
Example 5:
Everything written by Mozart is beautiful; some operas were written by Mozart; so some
operas are beautiful.
The premises formalise as 1 and 2 below (with Px meaning 'x was written by Mozart',
Qx meaning 'x is beautiful' and Rx meaning 'x is an opera') and then the proof takes us
to ∃x(Rx & Qx) as the required conclusion:
1. ∀x(Px → Qx)       Premise
2. ∃x(Px & Rx)       Premise
3. Pα & Rα           ES, 2
4. Pα → Qα           US, 1
5. Pα                TI, 3
6. Qα                TI, 4, 5
7. Rα                TI, 3
8. Rα & Qα           TI, 6, 7**
9. ∃x(Rx & Qx)       EG, 8
**The tautological implication used in line 8 (sometimes called the rule of conjunction)
is p → (q → (p & q)). As before, I have broken down the TI's into "simple" steps, but we
could equally well have gone straight from lines 3 and 4 to Rα & Qα, the tautology
involved then being ((p & r) & (p → q)) → (r & q).
Example 6:
1. Every war is terrible, but some wars are justified and anything that is justified is
morally acceptable.
Formalisation:
Proof of validity:
1. ∀x(Px → Qx) & ∃x(Px & Rx) & ∀x(Rx → Sx)    Premise
2. ∀x(Px → Qx)       TI, 1
3. ∃x(Px & Rx)       TI, 1
4. ∀x(Rx → Sx)       TI, 1
5. Pα & Rα           ES, 3*
6. Pα                TI, 5
7. Pα → Qα           US, 2
8. Qα                TI, 6, 7
9. Rα                TI, 5
10. Rα → Sα          US, 4
*Notice again that it would be wrong to start dropping quantifiers directly from line 1 (i.e.
from the single premise in this inference). The specification rules only tell you that you can
drop a quantifier in a certain way IF it is a quantifier governing the whole of the rest of the
formula, i.e. there is a bracket directly after that quantifier and then the corresponding
closing bracket comes at the very end of the whole formula. This is not true of any
quantifier in 1 (in particular it is not true of the first one, ∀x); but it is true of the
quantifiers in lines 2, 3 and 4.
C6( H ): T HE NEED FOR RESTRICT IONS ON THE RULES SO FAR FORMULATED
Most of the time, intuition will guide you to employ these rules of proof in such a way
that they sanction only genuinely valid inferences. However, they are faulty as they
stand and need a few restrictions so that they do the job properly (and so that intuition
can be eliminated entirely and a genuine rock solid foundation can be found for our
logic).
Here, for example, is a very clear cut example of an INVALID inference which indicates
the need for an obvious restriction on the rule ES:
Some numbers are odd. Some numbers are even. Therefore, some numbers are both odd and even.
This inference has true premises and a false conclusion and so is a counterexample to
itself. It formalises as:
1. ∃x(Px & Qx)
2. ∃x(Px & Rx)
Therefore, ∃x(Px & Qx & Rx)
Here Px means 'x is a number', Qx means 'x is odd' and Rx means 'x is even'. Since the
inference is invalid, it should, of course, have no proof. But what, then, is wrong with
the following?
Proof:
1. ∃x(Px & Qx)       Premise
2. ∃x(Px & Rx)       Premise
3. Pα & Qα           ES, 1
4. Pα & Rα           ES, 2
5. Pα & Qα & Rα      TI, 3, 4
6. ∃x(Px & Qx & Rx)  EG, 5
Steps 5 and 6 (as well of course as 1 and 2) are perfectly OK. The problem occurs at
step 4 but ONLY given that step 3 has already been made. Certainly premise 1
guarantees the existence of at least one thing that has both the property P and the
property Q: we can call it α. Similarly premise 2 guarantees the existence of at least one
thing which has both property P and property R; we can, however, by no means
guarantee that the thing which has properties P and R is the same thing as that which
has properties P and Q. We should not therefore use the ambiguous name α again.
So the correct step at line 4 would be Pβ & Rβ (or Pγ & Rγ, or whatever; that is,
any ambiguous name aside from α), and hence the invalid inference would be blocked.
(Once this restriction is in place, we could not then legitimately use EG to get from Pα &
Qα and Pβ & Rβ to ∃x(Px & Qx & Rx), since the EG rule requires that the introduced
variable replaces the same ambiguous name throughout.)
All the rules we have introduced need further tightening to be fully logically acceptable.
However, it is best first to get a good idea of what is involved by using them most of
the time the need for restrictions will not arise. We have seen how to use them to
establish validity of some inferences. What other purposes can they serve?
C7: L OGICAL T RUTH AND L OGICAL F ALSEHOOD
One important notion in truth-functional logic for which we have not yet produced an
analogue in the case of predicate logic is that of a tautology. A truth-functional
tautology, remember, was a compound sentence which turned out true no matter
whattruth values were assigned to its atomic components. This meant that, like 'Either it
isFriday or it is not' it was triviallytrue true because uninformative. Consider a
sentence like Everything is either physical or not-physical. This is not a tautology (it is
truth functionally simple and so would formalise just as p) but, unlike Everything is
physical, which, if true, is informative (it rules out, for example, the existence of
immaterial souls), it is bound to be true because uninformative. It formalises in
predicate logic as x(Px v Px). (The truth functionally disjunctive xPx v xPx is a
quite different sentence from the one we are considering. Exercise: What does that
disjunction say? Is it also trivial?)
The reason why the original sentence Everything is either physical or not-physical is
bound tobe true is easily seen if we consider its formalisation. No matter what we take
thevariable to range over (implicitly material objects in the original sentence, but it
could be numbers, people, animals or whatever) and no matter which property we take
for the predicate Px (it could be 'x is even', 'x is male', 'x is feline' or whatever) we
would always get a true sentence. ('All numbers are either even or not-even', 'All
animals are either feline or not-feline, etc.)
We shall use the term logical truth for the generalisation to first order logic of the
truth functional notion of tautology. And the above consideration gives us our
characterisation of the notion:
A single sentence s in the language of first order predicate logic is a logical truth iff it is
true in all interpretations (i.e. it has only models).
C7( B ): D EMONSTRATING L OGICAL T RUTH
Just as in the case of our definition of validity, we can readily demonstrate that a
sentence is NOT a logical truth. This simply requires us to produce a single
interpretation in which the sentence is false.
For example, the sentence All swans are either white or black, although true in the
actual world, is not logically true. This is why it gives (limited but) genuine information
about the world. The sentence can be formalised as:
∀x(Px → (Qx v Rx))
The following interpretation is not a model, and hence shows that the sentence is not a
logical truth:
Domain: {Animals}
Px: x is human
Qx: x is English
Rx: x is Austrian
Under this interpretation the sentence reads All humans are either English or
Austrian, which is (thankfully) not true.
So: to demonstrate that a single sentence s in the language of predicate logic is NOT a
logical truth, we simply need to produce one single interpretation under which it is
false.
On the other hand, the sentence All swans are either white or not white is logically true.
But clearly we can't show this directly from our definition this would require checking all
possible interpretations and the set of all possible interpretations is infinite.
So just as with invalidity and validity, we have an asymmetry: to show that an inference
is invalid or that a single sentence is not a logical truth, we need only produce one
interpretation a counterexample to show the invalidity of an inference and an
interpretation which is not a model of the sentence at issue to show that that sentence
is not a logical truth. But trying to use interpretations to show either validity or logical
truth would be an unachievable, because infinite, task.
This is more than an analogy as we will immediately see: the same strategy that solved the
validity problem namely the notion of a formal proof also solves the logical truth
problem.
First notice that, just as in truth functional logic, whenever we demonstrate the validity
of an inference in predicate logic we thereby establish that a particular sentence is
logically true. This sentence is the "associated conditional".
If the inference from premises P1, ..., Pn to conclusion C is valid, then there is no counter-
example to it, i.e. no interpretation in which all of the premises are true and C is false.
But this means, because of the truth-table definition of '→', that the single sentence:
(P1 & ... & Pn) → C
must be a logical truth, on our definition. (Exercise: think this through carefully.) So, for
(the old boring) example, since 'Socrates is mortal' follows from 'All men are mortal'
and 'Socrates is a man', then single sentence: 'If all men are mortal and Socrates is a
man then Socrates is mortal', which formalises as ((∀x(Px → Qx) & Pa) → Qa), must be a
logical truth.
Moreover, if a conclusion C can be derived from a single premise P that is itself a truth-
functional tautology (and remember we do have a decision procedure for tautologous-
ness) then C itself must be logically true. Why? Well, first, if C can be validly inferred
from P then the sentence 'P → C' must be logically true (via the considerations in the
preceding paragraph). Next, if C were not itself logically true there would be at least
one interpretation, say I, in which it is false; but since P is a tautology it must be true in
all interpretations and so in particular true in I; but then 'P → C' must be false in I (true
antecedent, false consequent), contrary to our assumption that 'P → C' is a logical truth.
This means that C must be a logical truth if it can be proved (using the rules of proof)
from a tautological premise. But that would be effectively from no premises at all
because the rule of tautological implication in any case allows us to write down any
truth functional tautology at any stage (any tautology is tautologically implied by any
sentence). This gives us the following result which allows us to demonstrate logical
truth:
Result: A sentence of first order logic is a logical truth iff it is provable from
no premises.
But that sounds weird. How could a conclusion possibly be provable from no
premises? The answer is pretty easily. Such proofs invariably involve a very useful
additional rule of proof that we have not yet introduced. (I won't pull this stunt again;
this is the last rule we will need). This further rule is:
If the formula G can be derived from a set of premises {Pi} together with the formula F,
then the formula F → G can be derived just from the premises {Pi}.
We shall later need to make our system more sophisticated in order to accommodate
fully this rule, but for the moment we will operate with it in an intuitive way. The rule
may sound a little unintuitive initially, but in fact, once you think about it, it is just
commonsense (as it needs to be given that it is a rule of logic). Suppose, for example,
we are considering the Treasury Model of the UK Economy basically a collection of
economic theories and facts. The Bank of England is considering increasing the base
rate of lending by 1% this has not been decided, so it stands as a possible assumption.
We would then be interested in what the Treasury Model predicts would happen if the
Bank took that decision. So, we add the assumption that the Bank will increase the rate
by 1% to the premises already embodied in the model. Suppose we can deduce, for
example, that Inflation will increase by 2%. So, we have deduced that inflation will
rise by 2% from the Treasury Model plus the extra assumption that the Bank raises the
lending rate by 1%. But this is obviously equivalent to (is just another way of saying)
that the Treasury Model itself (that is without any further assumption) entails the
conditional If the Bank raises the interest rate by 1%, then inflation will rise by 2%.
The rule of conditional proof is just a formal statement of that obvious equivalence.
We can recognize this by seeing the rule of conditional proof (CP) in action. Say we
want to show that All intellectuals are a drain on society follows from All intellectuals
are economically unproductive, and All economically unproductive people are a drain
on society. (Margaret Thatcher came close to believing that all three statements were
true!)
1. ∀x(Px → Qx)
2. ∀x(Qx → Rx)
Therefore, ∀x(Px → Rx)
Proof:
1. ∀x(Px → Qx)       Premise
2. ∀x(Qx → Rx)       Premise
3. Px → Qx           US, 1
4. Qx → Rx           US, 2
5. Px                A
6. Qx                TI, 3, 5
7. Rx                TI, 4, 6
8. Px → Rx           CP, 5-7
9. ∀x(Px → Rx)       UG, 8
What we have shown in this proof by line 7 is that the formula 'Rx' follows from the
original premises plus the extra formula Px (introduced "by assumption" at line 5). The
rule of conditional proof then tells us that we could have inferred the conditional 'Px →
Rx' just from the original premises and this is how we applied the rule at line 8. This
should seem like an intuitively sound way of proceeding.
Of course, we already know that we can establish the validity of this inference without
invoking CP (we could just go straight from lines 3 and 4 to line 8 by TI in fact by the
rule of 'hypothetical syllogism': the relevant tautology being ((p → q) & (q → r)) →
(p → r). (Usual exercise.) The advantage is that CP has allowed us to break down the
truth-functional steps in the proof to very simple ones namely two applications of
modus ponens. In other cases, the rule of CP is essential.
It is of particular value in establishing logical truth, as the following examples show:
Example 1:
Show that the sentence: If Newton was a genius then there are some geniuses is a
logical truth. The sentence formalises as Pa → ∃xPx.
Proof:
1. Pa                A
2. ∃xPx              EG, 1
3. Pa → ∃xPx         CP, 1-2
So Pa → ∃xPx has indeed been proved from no premises. Make sure you fully
understand that there are no premises in the above proof. There was at line 1 an
assumption, but that assumption was discharged at line 3 when we applied CP. Notice,
then, the crucial difference between a premise and an assumption. A premise is a
statement that is given for the purposes of deducing some conclusion from it (together
with whatever other premises are involved, that is, are also given). The premises
remain given throughout. An assumption is a claim that you temporarily give
yourself for the purposes of producing a proof; and it disappears, so to speak, when
discharged.
Example 2:
Show that the sentence: If Newton was a genius then not everyone is not a genius is a
logical truth. The sentence formalizes as: Pa → ¬∀x¬Px.
Proof:
1. ∀x¬Px             A
2. ¬Pa               US, 1
3. ∀x¬Px → ¬Pa       CP, 1-2
4. Pa → ¬∀x¬Px       TI, 3
Notice, then, that we have here proved a sentence of the form 'F → G' not by assuming
F, proving G and then using CP to establish F → G, but instead by assuming ¬G and
proving ¬F. This gives ¬G → ¬F by CP, but ¬G → ¬F tautologically implies F → G (since
(¬q → ¬p) → (p → q) is a tautology). (Exercise: show that this is true.)
Example 3:
If some lectures are interesting, then not all lectures are uninteresting.
This sentence is logically true as is again shown by the fact that we can prove it
absolutely, i.e. without invoking any premises. It formalises (with Px: 'x is a lecture' and
Qx: 'x is interesting') as: ∃x(Px & Qx) → ¬∀x(Px → ¬Qx).
Proof:
1. ∃x(Px & Qx)                      A
2. ∀x(Px → ¬Qx)                     A
3. Pα & Qα                          ES, 1
4. Pα → ¬Qα                         US, 2
5. Qα & ¬Qα                         TI, 3, 4*
6. ∀x(Px → ¬Qx) → (Qα & ¬Qα)        CP, 2-5
7. ¬∀x(Px → ¬Qx)                    TI, 6
8. ∃x(Px & Qx) → ¬∀x(Px → ¬Qx)      CP, 1-7
*The tautology invoked here is ((p & q) & (p → ¬q)) → (q & ¬q), but you may feel
happier breaking that single step down into a couple of steps: first getting Qα from 3,
then ¬Qα from 3 plus 4 by Modus Ponens (via Pα), then putting the two together. Once again,
there is no right or wrong about this it is just a question of which instances of
tautological implication you are confident in identifying as genuine, i.e. backed up by
genuine tautologies.
This proof illustrates a couple of very important points. First, that we can iterate
applications of CP i.e. make more than one assumption (so long as we remember to
"discharge" all the assumptions through CP before completing the proof indeed, the
proof is not complete until all assumptions have been discharged). Secondly, we
can, by using CP, imitate the time-honoured informal proof technique of reductio ad
absurdum. Our two assumptions (lines 1 and 2) amount to the assumption that the
conditional sentence we are out to prove is in fact false. But the assumption that it is
false entails a (truth-functional) contradiction (line 5) hence it is false that it is false,
that is, it must be true. The last part of this reasoning is formally captured in lines 6-8.
Notice in particular the very important application of TI at line 7; this heuristic is used
time and time again. The particular rule is that:
¬F follows from the single sentence F → (G & ¬G) for any formulas F and G.
(Further Exercise: Although it is easiest to see how CP can be used to prove that
conditionals are logically true, its use is not confined to conditionals (or rather to
explicit conditionals: we know from truth-functional logic that any truth-functionally
compound sentence is equivalent to one that just uses ¬ and →). For example, CP can
be used to prove the disjunction ∀xPx v ¬∀xPx is a logical truth (as it obviously is). Can
you work out how?)
C7( C ): T HE I DEA OF L OGICAL F ALSITY
In truth functional logic, we had the notion of a Contradiction. Today is Friday and it is
not the case that today is Friday is one such. It is "uninformatively (because
necessarily) false" false "in all possible worlds" (that is, no matter what truth values
are assigned to its atomic components).
Similarly, a sentence like Some numbers are both even and not-even though not a
truth-functional contradiction (notice that it is NOT a truth-functional conjunction but
is truth-functionally simple or atomic) is necessarily or logically false. This sentence
formalises in first order predicate logic as ∃x(Px & ¬Px) (taking Px to mean 'x is even' and
'numbers' as implicit; remember that if all of a set of sentences talk about the same
things, then this is covered, in any interpretation, by the Domain). This sentence is
indeed a logical falsehood because every interpretation makes it false. This is, then,
our characterisation of a logical falsehood.
A single sentence s in the language of first order predicate logic is logically false iff s is
false in all interpretations (equivalently of course iff ¬s is true in all interpretations, i.e. iff
¬s is a logical truth).
C7( D ): D EMONSTRATING L OGICAL F ALSEHOOD
As in the case of logical truth, we can show that a sentence is NOT logically false by
producing a single interpretation; in this case, a single interpretation in which it is not
false but instead true. So, for example, the sentence ∃xPx & ∃x¬Px is NOT logically
false. This is because it is not false in all interpretations, since it is for instance true in
the following interpretation (under which it just says Some natural numbers are even
and some natural numbers are not even):
Domain: {natural numbers}
Px: x is even
But again, since we cannot inspect all possible interpretations, we cannot establish that
a sentence is a logical falsehood directly on the basis of this definition.
However, as should be clear from the above definition, and especially the bracketed
addition:
Fact 1:
A sentence s of first order predicate logic is logically false iff its negation, ¬s, is logically true.
Hence we can show a sentence s is logically false by showing that its negation is
logically true and we already know from the last subsection how to prove that a
sentence is logically true: we prove it absolutely that is, without invoking any
premises. Formally speaking, we have:
Fact 2:
A sentence s of first order predicate logic is logically false iff its negation can be proved
from the empty set of premises.
Example 1:
Show that the sentence ∃x(Px & ¬Px) (which might formalise 'Some numbers are both even and not-even') is logically false.
Proof:
1. ∃x(Px & ¬Px)                   A
2. Pα & ¬Pα                       ES, 1
3. ∃x(Px & ¬Px) → (Pα & ¬Pα)      CP, 1-2
4. ¬∃x(Px & ¬Px)                  TI, 3
Study this proof carefully. We are in effect again using the method of reductio ad
absurdum. We have proved that ∃x(Px & ¬Px) is logically false by proving ¬∃x(Px &
¬Px); but in order to prove that we have assumed its negation (which of course takes
us back to ∃x(Px & ¬Px)) and then derived a contradiction from that assumption.
Example 2:
Proof:
C8: D EMONSTRATING THE I NCON SISTENCY OF S ETS OF P REDICATE L OGIC
S ENTENCES
You will remember that we were able to show that a set of sentences of predicate logic
is consistent by producing a joint model, i.e. a single interpretation in which all the
sentences in the set are true. But how can we demonstrate that such a set is
inconsistent?
A finite set is inconsistent just in case no interpretation makes all of its members true,
that is, just in case the conjunction of its members is false in every interpretation, i.e. is
logically false. Hence for finite sets of sentences we can demonstrate inconsistency by deriving the
negation of the conjunction of the sentences from no premises.
Fact:
A finite set of sentences of predicate logic S = {s1, ..., sn} is inconsistent iff the single
sentence s1 & ... & sn is logically false, i.e. its negation is derivable from the empty set
of premises.
Example 1:
The set {∀xPx, ∃x¬Px} is unsurprisingly inconsistent since ∀xPx & ∃x¬Px is logically
false.
Proof:
1. ∀xPx & ∃x¬Px                   A
2. ∀xPx                           TI, 1
3. ∃x¬Px                          TI, 1
4. ¬Pα                            ES, 3
5. Pα                             US, 2
6. Pα & ¬Pα                       TI, 4, 5
7. (∀xPx & ∃x¬Px) → (Pα & ¬Pα)    CP, 1-6
8. ¬(∀xPx & ∃x¬Px)                TI, 7
Another perhaps more intuitive characterization can be given of what it means for a
set of predicate logic sentences to be inconsistent. Suppose we could derive a truth
functional contradiction from a set of sentences as premises, that is, a formula of the
form F & ¬F for some formula F. Since this would mean that the contradiction was
validly inferrable from the premises, then there could be no counterexample to that
inference. But this can only mean that no interpretation makes all the premises true.
(Exercise: Explain carefully why.) Thus we also have the following:
Result:
A set of sentences {s1, ..., sn} in the language of predicate logic is INCONSISTENT if,
taking s1, ..., sn as premises we can derive a truth-functional contradiction.
This means that, using the same example as before, the proof of inconsistency is
slightly rejigged and can stop sooner.
Example 1 (again):
1. ∀xPx              Premise*
2. ∃x¬Px             Premise*
3. ¬Pα               ES, 2
4. Pα                US, 1
5. Pα & ¬Pα          TI, 3, 4
*Notice that these are now premises, in this way of showing inconsistency, not
assumptions.
Hence we have derived a truth functional contradiction from the set of sentences as
premises hence that set is inconsistent.
Example 2:
It formalises as:
S = {x(Px (Qx v Rx); x(Px Sx); x(Sx Px); x (Px & Qx & Sx)}
(Exercise: check that you are happy with the formalisation of the third sentence.)
Hence we have derived a truth-functional contradiction (a formula of the form F & ¬F;
here F = Rα) from the sentences in our original set S taken as premises, and so we have
proved that S is inconsistent.
C9: FULL FIRST-ORDER PREDICATE LOGIC: INTRODUCING RELATIONS
So far we have restricted attention to simple, so-called monadic predicates 'is a man',
'is a liar', 'was a film star, or whatever. These are monadic or unitary because they
apply (or fail to apply) to single individuals: Socrates, Boris Johnson, Marilyn Monroe
or whomever. (Of course we can then go on to quantify but always over single
individuals having 'monadic' properties). But how about statements like 'Cain was a
son of Adam' or Mrs. Thatcher was politically to the right of Attila the Hun, or
Liverpool is north of Watford? These seem intuitively not to be assertions that a single
individual has a certain property, but instead assertions that a certain RELATION holds
between TWO individuals, or, as we shall usually express it, that two individuals stand
in a certain relation. The two individuals Cain and Adam stand in the relation that the
first is the son of the second; the two individuals Mrs Thatcher and Attila the Hun stand
in the relation that the first is politically to the right of the second, and so on.
Of course, we could just treat each of these sentences (and others like them) as
straightforward subject-predicate assertions (and so formalise them in Monadic
Predicate Logic) via a suitable choice of predicates. For example, we could introduce
the monadic predicate Px, 'x is a son of Adam, and using a as the individual constant
naming Cain, formalise the first sentence: Pa.
Similarly introducing the monadic predicate Qx, meaning 'x is politically to the right of
Atilla the Hun' and using b as a name for Mrs Thatcher, the second sentence would
formalise as Qb.
Finally, using Rx to mean 'x is north of Watford', Rc would do for 'Liverpool is north of
Watford' using c as the name of the city of Liverpool. This possibility explains why
logicians for 2000 years saw no need to go beyond the simple subject-predicate form
and so no need, in effect, to go beyond Monadic Predicate Logic.
However, it is clear that we lose something and lose something important in
formalising these ordinary English assertions this way. For example, consider our sentence
about Liverpool and Watford, formalised as Rc; next consider the sentence: Manchester is
north of Watford. This would formalize, using d for Manchester as Rd reflecting the fact
that this second sentence 'says the same thing" about d (i.e. Manchester) as the first does
about c (i.e. Liverpool). But now consider the sentence Liverpool is north of Luton. If we
only had monadic predicates, we would have to introduce a new monadic predicate, say Sx,
for 'x is north of Luton', and this last sentence about Liverpool would formalise as Sc. But
then the fact that the two original ordinary language sentences about Liverpool said the
"same sort of thing" about Liverpool would be completely lost in our formalisation Rc and
Sc are completely different assertions about the entity c, and our rules for interpreting
predicates in the search forcounterexamples, consistency proofs, etc, would allow us to
interpret the predicates Rx and Sx as we liked.
Relatedly, various intuitively valid inferences could not be shown to be valid (at least
not at the same time) if we had only monadic predicates. For example, from 'Liverpool
is north of Watford', we could validly infer 'Something is north of Watford' (as you may
know, some Londoners regard this conclusion as contentious). The inference simply
being from Rc to xRx an inference that is demonstrated to be valid by one
application of the rule EG. However, we could not validly infer from that same premise
that Liverpool is north of something (equivalently: There is something to the north of
which Liverpool lies). But why should one of these inferences be demonstrable as valid
and not the other?
We are, in fact, in an analogous situation here to the one we were in with 'Socrates is a
man', and 'All men are mortal' vis-à-vis truth-functional logic. There the best we could
do was to formalise the two sentences as p and q, respectively, hence losing the
intuitive connection between them, and hence we were unable to capture the validity
of certain intuitively valid inferences (see the very beginning of section C of these notes
this was the reason that we introduce predicate logic in the first place rather than
remaining content with truth-functional logic).
The remedy for our current problem is the same as it was then: to increase the
expressive power of our language. In this case, the suggestion is to acknowledge the
intuitively relational character of sentences like the ones that we have been
considering, by allowing two-place or binary or dyadic predicates (relations) like
Rx,y: x is to the north of y; Sx,y: x is politically to the right of y, etc. (We shall usually
use the term two-place relation but you will see the others used elsewhere.)
Hence using Rx,y as our relation for x is to the north of y, a for Liverpool, b for
Manchester, c for Watford and d for Luton, our geographical assertions become:
Ra,c ('Liverpool is north of Watford'), Rb,c ('Manchester is north of Watford'),
Ra,d ('Liverpool is north of Luton') and Rb,d ('Manchester is north of Luton').
The similarity of form between the four original sentences is thus completely captured
by our formalisations. Similarly introducing Sx,y to mean 'x is the son of y' and taking a
to mean Cain and b to mean Adam, 'Cain is a son of Adam' is formalised as Sa,b, 'Cain is
a son of Eve' could be Sa,c (where c of course is the individual constant naming Eve),
and so on. (Whereas, remember, if we had only monadic predicates we would have to
have separate predicates Px and Qx, say, for ' x is a son of Adam' and ' x is a son of Eve').
Introducing relations not only allows us to reflect more faithfully the logic of ordinary
language, it also permits an immensely better representation of reasoning in the formal
disciplines like the various sciences and, above all, mathematics. You should carefully
remember these advantages when wrestling with the complications that the
introduction of relations undeniably brings in its wake.
Notice that with relations in general the order in which we take the individuals involved
is of vital importance. 'Liverpool is north of Watford' is true while 'Watford is north
of Liverpool' is false: hence Ra,c (sticking to our original individual constants) is an
importantly different assertion from Rc,a.
There is no reason in principle why we should stick at two-place relations. Admittedly,
in ordinary language, three (and higher) place relations are thin on the ground; "lies
in between" is an exception: 'Watford lies in between Liverpool and London' could be
formalised using the three-place predicate Rx,y,z which holds just in case x lies in
between y and z, so that, using the same constants as before and e for London, we would
have Rc,a,e. Similarly, 'is an immediate descendant of' is a relation that holds between 3
people: for example, Prince Charles, the Queen and Prince Philip.
In mathematics, three (and higher) place relations are much easier to find for
example we could characterise a three-place predicate, say Sx,y,z, which held between
three numbers x, y and z just when x was the sum of y and z. This would mean, for
example, that if the individual constant a stood for the number 1, b for the number 2,
and c for the number 3, Sc,a,b and Sc,b,a were both true sentences, while Sa,b,c, for
example, is false (it asserts that 1=2+3.) Although we shall be primarily concerned with
two-place predicates, all the results we will arrive at apply generally to predicates with
any (finite!) number of places.
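As a purely illustrative aside (none of this is part of the formal machinery), a relation over a finite domain can be represented simply as a set of ordered tuples, and the truth of particular relational sentences then becomes a membership check. The Python sketch below, with arbitrarily chosen names, does this for the three-place 'sum' predicate just described, over the small domain {1, 2, 3}:

    # Illustrative sketch: the three-place relation Sx,y,z ('x is the sum of y and z')
    # over the domain {1, 2, 3}, represented as a set of ordered triples.
    domain = {1, 2, 3}
    S = {(x, y, z) for x in domain for y in domain for z in domain if x == y + z}
    a, b, c = 1, 2, 3      # the constants a, b, c name 1, 2, 3 respectively

    print((c, a, b) in S)  # Sc,a,b: 3 = 1 + 2, so True
    print((c, b, a) in S)  # Sc,b,a: 3 = 2 + 1, so True
    print((a, b, c) in S)  # Sa,b,c: would require 1 = 2 + 3, so False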
C9( B ): Q UANTIFIED S ENTENCES I NVOLVING R ELATIONS
Once we have relations in our formal language, we can build up quite complicated
sentences by quantification. For example:
Example 1:
Everybody has a father ( For anyone at all, there is someone who is that first
person's father) formalises as:
xyRy,x
where Rx,y is the relationship which holds between x and y just when x is the father of
y hence Ry,x says y is the father of x.
What, then, would xyRx,y mean with R understood in the same way? It would say
that Everyone has fathered someone a very different proposition from the one we
wanted to formalise. Evidently the order of the variables is again vitally important.
If we wanted to include in our formalisation the fact that our original sentence
(Everybody has a father) is evidently meant to be about people we could give it the
fuller formalisation:
where Px means x is a person and Rx,y as before stands for x is the father of y.
Example 2:
How about the sentence Some people have no daughters ( There is at least one
person such that no person is the first person's daughter)? This formalises as:
xyRy,x
(Exercises:
175
1. Make sure you understand why this formalisation is correct and not,e.g.; x(Px
&y(Py & Ry,x))
2. What does x(Px &y(Py Rx,y)) mean in ordinary English with the same
understanding of Px and Rx,y?)
Example 3:
How about: 'For every natural number there is one greater but it is not true that there
is a natural number greater than all natural numbers'?
Let Rx,y x is greater than y (usually written x > y), and Nx be 'x is a natural number',
then our sentence formalises as:
Example 4:
The second conjunct here could equally well be formalised as yxRy,x (variables are
just 'placeholders' with no intrinsic significance) in which case we would have the
sentence:
This sentence, far from being inconsistent, is true (remember, it says that everyone has
a father but no one has fathered everyone). So, this can only mean that xyRy,x
andyxRy,x are very different sentences.
In fact, under the above interpretation the first says 'everyone has a father', the second
'someone has fathered everyone'. Similarly if we interpret Rx,y as meaning 'x is greater
than y' with the natural numbers as domain then xyRy,x says 'for every number
there's one greater' (true) while yxRy,x. says 'there's a number greater than any
number' (obviously false).
A sentence involving mixed quantification is of course one in which there is at least one
universal, and at least one existential quantifier. It can be a bit tricky to read these
correctly but with practice you will get there. Let's consider the four sentences:
(i) xyRx,y
(ii) yxRx,y
(iii) xyRy,x
(iv) yxRy,x
What do each of these mean with Rx,y interpreted as 'x (strictly) greater than y' in the
natural numbers? (iii) and (iv) we just dealt with towards the end of example (4)
check themover again.
(i) says: 'For every number x there's a number y such that x is greater than y' that is,
since x is greater than y iff y is less than x, (i) says 'for every natural number there is
one less than it'. (This is false. Exercise: Explain carefully why.)
177
(ii) says: 'There's at least one number y such that every number x is (strictly) greater
than it', that is: 'There's a y which is less than every number x' (again in view of the fact
about numbers that x is greater than y if y if y is less than x). This is false too though
less obviously. I shall explain why later.
Example 6:
Once you've got the knack, you can really get fancy. Let's try 'Everybody loves a lover'
and Lincoln's famous 'You can fool all of the people some of the time, and some of the
people all of the time, but you can't fool all of the people all of the time'.
The first says 'Everyone loves anyone who loves someone' (because having someone
that you love is what it means to be a lover) and so, taking 'x is a person' as implicit
and Lx,y, for mnemonic purposes, as 'x loves y', we have:
x(yLx,y zLz,x)
That is, for anyone x at all, if there is someone y whom x loves (i.e. if x is a lover) then
anyone z at all loves x.
(Exercise: How would 'Some people are not loved by anyone','No one loves everyone',
'Some people love no one who loves them', and No lover is loved by everyone
formalise?)
Lincoln's famous remark requires monadic predicates Px for x is a person and Qx for
x is a moment in time, and a two place relation Rx,y which holds just when x can be
fooled at y. The remark then formalises as:
x(Px y(Qy & Rx,y)) &x(Px &y(Qy Rx,y)) & xy((Px & Qy) Rx,y)
(Make sure you understand how exactly to read each of these conjuncts. Do NOT try to
bring all quantifiers to the front of an expression, this may (and usually does) lead to
disaster just put them where they occur naturally in the formalisation. For instance
xy(Px (Qy & Rx,y)), with the above meaning for predicates P, Q and R, means
something very different from You can fool all of the people some of the time. Exercise:
what does it mean?)
178
So now we have for the second time in this course made a move towards a much more
expressive formal language. We started with truth-functional logic and decided that,
although quite powerful, it needed to be extended to monadic predicate logic, if it was
to capture a wider range of intuitively valid inferences (such as the hoary one about
Socrates) as valid. Monadic predicate logic involved introducing new ideas about
interpretations and rules of proof. Now we have relations in our language as well as
monadic predicates and so the next items on the agenda are to show that we can again
capture a greater range of valid inferences once we have this greater expressive power;
and showing that involves in turn revisiting our notions of (a) an interpretation (the
key, remember, to invalidity, consistency and independence), and (b) a formal proof
(the key to validity, logical truth, logical falsity, and inconsistency) in order to extend
them to take account of our enriched language, including relations as well as monadic
predicates. You will be comforted to know that all the basic ideas remain the same
though some of the details get a little trickier.
179
C9( C ): F ULL F IRST O RDER P REDICATE L OGIC : I NTERPRETATIONS
I NVALIDITY AND C ONSI STENCY
(Exercise: take this opportunity to revise all of the above notions as they came up in
monadic predicate logic.)
Example 1:
This inference has a true premise and a true conclusion (though somewhat
surprisingly, you might think, some believing Christians would deny that the
conclusion is true if everyone is restricted to every human being!) Nonetheless,
180
intuitively, (as I hope you will agree) the truth of the conclusion doesnt follow from the
truth of the premise (N.B. from that premise alone: we could readily produce an
augmented inference involving some further premises about the biology of
reproduction that would have the conclusion as a valid consequence though we might
have to get fancy about biological parent vs some other kind, worry about test tubes
and egg donations etc.)
To show that the inference is indeed invalid, first we formalise (taking reference to
'persons' as implicit):
1. x(yRy,x zSz,x)
Domain: Humans
Rx,y: x is a son of y
The premise is true but the conclusion (in view of the very fortunate existence of
daughters) is false. Hence the original inference is invalid.
Example 2:
1. Everyone who likes Sartre likes Camus, but not everyone who likes Camus likes
Sartre.
2. Some people who like Camus like Flaubert.
Here:
a: Sartre
b: Camus
c: Flaubert.
Lx,y: x > y
a: 5
b: 4
c: 6
1. Any number strictly bigger than 5 is strictly bigger than 4 but not every number
strictly bigger than 4 is strictly bigger than 5.
2. Some numbers are strictly bigger than 4 and strictly bigger than 6.
Therefore, some numbers are strictly bigger than 6 and not bigger than 4.
The premises are true (the second conjunct is true since 5 is strictly bigger than 4 but
not strictly bigger than itself); the conclusion is false.
Hence there is a counterexample to the original inference about the French writers and
that inference is therefore invalid the conclusion may perhaps be true but its truth
would not be guaranteed by assuming the truth of the premises.
182
Example 3:
{For every natural number there is one greater than or equal to it; there is a natural
number less than or equal to every natural number; for every natural number there's
one less than or equal to it}
This is amodel of the set since all these sentences are true even the last. (Exercise:
Think this through carefully even for the least natural number (0 or 1 depending on
whether or not 0 is included as a natural number) there is one less than or equal to it
viz. itself. (Hence the interpretation with Domain: {natural numbers}and Rx,y: x < y (x
strictly less than y) would not be a model because this last sentence would then be
false.)
Example 4:
The set of sentences S = (Everybody is his own father; If one person is the father of a
second and the second the father of a third then the first person is the father of the
third; Everybody has fathered someone} is clearly false since every sentence in the
set is false. But is the set (taken together) necessarilyfalse i.e. inconsistent?
To decide this, formalise S using the two place relation Rx,y: x is the father of y,
obtaining:
This set is consistent as is shown by the fact that the following interpretation I is
amodel:
(Exercise: Write out the set of sentences S as interpreted by I and check that they are
all true.)
Example 5:
Let S be as in example 4.
I:
Rx,y: x y.
Under this second interpretation the set of sentences S reads: Any integer is less than
or equal to itself (true). For any three integers, if the first is less than or equal to the
second and the second is less than or equal to the third, then the first is less than or
equal to the third (true). For any integer there is an integer greater than or equal to it
(true).
While s reads:
There is an integer less than or equal to every integer i.e. there is a least integer and
this is false, since the positive and negative integers stretch out "infinitely far"
backwards as well as forwards.
184
C9( D ): F INITE I NTERPRETATIONS
How are the interpretations that do the various jobs in the above exercises 1-5 arrived
at? So far as you are concerned at least they are simply pulled out of the blue and
written down for your inspection. Is there some systematic way of arriving at
interpretations which do the jobs we want them to do (show invalidity, consistency,
etc.)? The answer, as indicated earlier, is that there is no fully systematic way
ofproceeding here. Although you will, through practice, become adept at producing
suitable interpretations, there is no algorithmic procedure, analogous to the truth table
method or its derivatives, for producing models and hence deciding issues like
invalidity or consistency in full First Order Predicate logic. (This is actually provable as
a consequence of some really deep theorems about logic that we shall touch on again
later.) However, as in the case of monadic predicate logic, we can produce a sort of
quasi-systematic method by exploiting the idea, introduced earlier for the restricted
case of monadic logic, of finite interpretations/models.
How can we extend that idea to include relational predicates? Well, a two-place
relation, such as that of father/direct descendant or of being a greater natural
number than holds, or fails to hold, not of single individuals but of pairs ofindividuals
considered in a particular order or, as we shall say for brevity, of an ordered pair
of individuals.
So, for example, the relation Rx,y: 'x is a son of y' holds of, amongst many others of
course, the ordered pair (Prince Charles, Prince Philip), just as the predicate Px: 'x is an
even number' holds of, again amongst many others, the individual number 2.
185
pairs that we can form out of elements of the domain in which the first member of the
pair is indeed less than the second.
So, as before, the idea with finite interpretations is to forget intension (i.e. meaning)
altogether and just consider the extensionsof the predicates so, in particular, any set
of ordered pairs formed from the domain is a legitimate interpretation of any two-
placerelation: we neednt concern ourselves with what natural relation if any has that
particular extension in that particular domain. This means that we can be much more
systematic in searching for models as the following examples will illustrate.
Example 1:
We are looking to see if we can construct a model of S, i.e. an interpretation in which all
the sentences in S are true, and we are proposing to do this using just a finite number
of elements in some domain.
Well lets give ourselves the domain {1,2,3} as before theres nothing special about
three elements, it just works reasonably well in most cases that well think about
(basically because well only deal with fairly simple cases).
In order to make the last sentence in S true (and one of the tricks of the trade here is to
start with the existentially, rather than universally, quantified, sentences), there has to
be something in the extension of P (which as before well denote by P). So lets
arbitrarily put the first element of our domain, 1, in P leave it at that for the time
being and see how things go with the other sentences in S. (Another rule of thumb in
dealing with finite models is always do the minimum that it takes to make a sentence
true.)
So now lets turn to the first sentence in S. It says that everything thats a P has
something thats R-related to it. So, given that we just decided that 1 has the property P,
there must at least be an ordered pair (1, blank) in R the extension of the relation
R. So again lets arbitrarily make 2 the at least one thing thats R related to 1, i.e. lets
put (1,2) in R. Again we do the minimum as a first step we may have to come back
and revise these assignments if we run into trouble with other sentences.
186
So, finally, we have the second sentence in S. This says that everything thats a P has
something to which its R-related. (Note carefully the difference with how to read the
first sentence in S) Again given that we made 1 have the property P, this means that
there must be another ordered pair in R of the form (blank, 1). So lets, since we havent
yet used 3, make 3 the necessary blank, i.e. lets put (3,1) in R. This doesnt mess
anything up that we dealt with earlier obviously not with the third sentence (1 is still
a P so xPx is true) and also not with the first sentence. (Think through why not.) So we
have a finite interpretation I which is a model of S and hence demonstrates that S is
consistent:
I:
Domain: {1,2,3}
P: {1}
R: {(1,2), (3,1)}
Intuitively, there is something with property P, viz. 1. For everything thats P (i.e. just
1), theres something, viz. 2, that its R-related to, and also something, viz. 3, to which it
is R-related. Hence all three sentences in S are true in I.
Example 2:
Show that s = xyRx,y is independent of the set S = {xyRx,y, xyx((Rx,y & Ry,z)
Rx,z)}
For independence, remember, we require two models, one of S U{s} and one ofS U{s}:
Let the domain again (for no good initial reason) be {1, 2, 3}.
For xyRx,y to be true requires that every element of the domain be R-related to
something (not necessarily the same thing); that is, in terms of ordered pairs, that
every element of the domain occur as 1 st coordinate in at least one ordered pair in the
extension R of the relation R. So let's try (1,2), (2,3) and (3,1) and then see how we go.
(This step is again partially arbitrary: so long as we have (1, blank), (2, blank), (3,
187
blank) in R then the blanks can be filled in as we like xyRx,y will still be rendered
true).
For xyz((Rx,y & Ry,z) Rx,z) to be true requires that whenever there are two
ordered pairs in R such that the 2nd coordinate of one is the same as the 1 st coordinate
of the other then the ordered pair whose 1st coordinate is the 1st coordinate of one and
whose 2nd coordinate is the 2nd coordinate of the other must also be in R. (You may
need to read this several times, but it does make sense!) Hence, since we already have
put (1,2) and (2,3) into R we must also have (1,3) in as well, since to repeat,
xyz((Rx,y & Ry,z) Rx,z) requires that if R1,2 holds and so does R2,3 then so
must R1,3. Similarly since (1,3) and (3,1) are now in R, so must be (1,1) that is, 1 must
be R-related to itself. Since (2,3) and (3,1) are in R so must be (2,1), which, given that
(1,2) is already there, means that so must be (2,2). And finally since (3,1) and (1,3) are
in there so, to satisfy this transitivity requirement must be (3,3).
So:
(This set in fact contains allpossible ordered pairs made up of the 3 elements 1,2,3 so
in this interpretation everything is R-related to everything.)
Interpreting R is this way makes both sentences in S true. What about s? For s to be
true there must be a single x which is R-related to all y, including itself.
(Exercise: make sure you understand clearly that this is what, formally speaking, s
says.)
This means, in terms of ordered pairs, that there must be pairs (blank, 1), (blank, 2),
(blank, 3) where the blank is filled in by the same member of the domain in all cases. In
fact, since we already have in R, e.g., (1,1), (1,2) and (1,3), s is automatically made true.
That is, there is at least one element of the domain, 1, which is R-related to every
element in the domain including itself (i.e. to each of 1, 2 and 3). (In fact, as already
noted, this is true of all three elements.)
188
In order to make s false (i.e. s true) no element can be R-related to every element. This
means that our original choice of ordered pairs to satisfy xyRx,y (the first sentence
in the set S) will need to be modified for, given that choice and given the
requirements of the second sentence in S, we were forced to satisfy s (as we just saw).
So let's try (1,1), (2,1) and (3,1) to satisfy xyRx,y this set of ordered pairs does
satisfy this sentence since every element of the domain occurs as 1st coordinate of
some ordered pair; we have just in this second interpretation made it the same thing
viz. 1 that is R-related to each of 1, 2 and 3.
Now, the second sentence in S is already satisfied by this assignment: we have (2,1)
and (1,1) but this requires only an ordered pair with the same first coordinate as the
first and same second coordinate as the second, but this is the ordered pair (2,1) which
we already have, and similarly with (3,1) and (1,1). But if we stick with just these three
ordered pairs in R then s is false, i.e. s is true: since there is no single element of the
domain which is R-related to every element: 1 is only R-related to itself and not to 2 or
3, 2 is only R-related to 1 and not to either itself or 3, and 3 is similarly only R-related
to 1 and not to either itself or 2.
Domain: {1, 2, 3}
(Exercise: The interpretation I with the same domain, and with R: {(1,2), (2,1), (3,1),
(1,1), (3,2)} is also a model of S U {s}. Show carefully that this is true.)
Example 3:
1. xyRx,y
So, xyRy,x
is an invalid inference.
189
We could easily do this via an "ordinary" infinite model [Domain: {natural numbers},
Rx,y: x<y would do it, since the premise would then read 'For every natural number
there is one greater'(true), while the conclusion would read 'For every natural number
there's one less.' (false since there is no natural number less than 0)]. But we can
equally well, and more systematically, construct a finite model.
Let Domain = (1, 2, 3). We need to make the premise true, so everything in D has to
occur as first coordinate of some member of R lets say (1,2), (2,3) and (3,3).
D: {1,2,3}
(Exercise:
1. yxRx,y
So,xyRx,y
I said, when first introducing finite models (in connection with monadic predicate
logic), that it is not true that every set of sentences that has a model at all has a finite
model. The technique only works one way round: if we can find a finite model the set
must be consistent, but it is not true that if a set of sentences is consistent then it has a
finite model (obviously when you think about it this must mean that such a set has
models but they are all infinite that is the domains must be infinite sets). Now that we
190
have two-place relations in our language we can in fact give an example of a consistent
sets of sentences which has no finite model. One such is the set:
S is clearly consistent as is shown by the fact that the following (infinite) interpretation
I is a model:
D: {Natural numbers}
However, S has no finite model. (Exercise: Although a proof of the fact that S has no
finite model is beyond the scope of this course, you will in fact convince yourself that it
is a fact (and discover the basic idea underlying the proof) if you try to construct a
finite model and take notice of the reason why you are continually frustrated.)
191
C9( E ): F URTHER C LARIFICATION OF THE MEANING OF M ULTIPLY
Q UANTIFIED S ENTENCES
When we read sentences like xy(Rx,y) back into something more intuitive, we say
things like Any two things are R-related. Does this include the case in which the two
things are one and the same i.e. does it require that every object is R-related to itself?
Similarly if we say xyz((Rx,y & Ry,z) Rx,z) which we start to read as: For any
three things x,y,z, if x is R-related to y, and y is R-related to z, then x must be R-related
to z, does this include the case in which the three variables x,y,z take on the same value
or in which two out of the three of them do?
The answer is that it does. The phrase xyz is to be read as For any three not
necessarily distinct things. This means that, for example, a sentence like xyRx,y
when interpreted in the domain D = {1,2,3} would be made true by the interpretation R
= {(1,1), (2,2), (3,3)}. Under this interpretation for any individual in the domainthere is
another individual in the domain namely itself that is R-related to it. It also means
that when we employ the finite model technique and we are, say, attempting to make
the sentence xyz((Rx,y & Ry,z) Rx,z) true, then if we have the ordered pairs
(1,2) and (2,1) already in our interpretation R of R, then we must also have the ordered
pair (1,1) in R even though in that case the variables x and z take on the same value: 1
is R-related to 2 and 2 is R-related to 1; hence this sentence to be read remember as
for any three not necessarily distinct individuals x, y, z requires that 1 be R-
related to itself.
As before when we have taken decisions of this kind, nothing is lost by taking it. If we
really want to say, e.g. for any 3 definitely distinct objects x,y,z then we do so explicitly
by introducing the identity relation (x = y) and explicitly requiring distinctness.
So for any three definitely distinct objects x,y,z, (Rxy & Ry,z) Rx,z) is formalised in
full as:
xyz(((x=y) & (y=z) & (x=z) & Rx,y & R,y,z) Rx,z).
192
Similarly, if we want to say that for any object there is a different further object that is
R-related to it, we must say not just xy(Rx,y), since this is compatible with the y that
is R-related to any x being the same as x, but must instead formalize it as:
193
C9( F ): P ROVING V ALIDITY F ULL F IRST O RDER P REDICATE L OGIC
As already noted, we cannot show that an inference in first order predicate logic is
valid using the definition of validity this is because it would in principle involve
looking at infinitely many possible interpretations to check that none provided a
counterexample. Instead, as we already saw in the case of monadic predicate logic, we
demonstrate the validity of an inference by producing a proof of its conclusion from its
premises. The introduction of relations into our language requires no new rule of
proof. The list (US, ES, UG, EG, TI and CP) remains the same as before. However, as we
shall discover, introducing relations does force us to give modified, tighter versions of
some of those rules. Before introducing those qualifications, however, it is best to get
used to the ideas by looking at a couple of examples of proofs in full predicate logic.
Example 1:
Therefore, there are people who live North of Watford whom no sensible person
likes.
Where:
Wx: x lives North of Watford (no advantage using a relational term here)
Lx,y: x likes y
Proof:
194
1. xy((Sx & By) Lx,y) Premise
2. x(Bx & Wx) Premise
3. B& W ES, 2
Example 2:
1. Any two numbers are either equal or one is less than the other.
2. If one number is less than a second, the second is not less than the first.
3. 5 and 3 are not equal
So, either 5 is less than 3 and 3 not less than 5, or 3 is less than 5 and 5 not less than
3.
Where:
E(x,y): x=y
L(x,y) : x < y
a: 5
b: 3
195
Proof:
196
C9( G ): T HE NEED FOR FURTHER QUALIFICATIONS OF TH E RULES
Normally especially if you can see what is going on in a proof using the rules in the
loose form in which we have expressed them so far will lead to satisfactory proofs.
However, what we are after is a foolproof set of rules that allows only proofs of
conclusions that really do follow validly from the set of premises concerned and which
could, for example, be programmed into a computer in such a way that the computer
always gave the right answer.
The rules as we presently have them are, however, far from foolproof. As they stand
they can lead to trouble when relations are involved by sanctioning inferences that are
clearly invalid. Here is one simple example.
This is obviously invalid since as it stands the premise is true (more or less, there is a
bit of vagueness if we go back far enough in evolutionary time if our implicit domain is
members of homo sapiens, and even more of a problem, if this is our domain, for a
Christian who believes that Jesus really was the son of God) and the conclusion
definitely false. But here is a simple proof of its conclusion from its premise (Rxy
means x is the father of y and reference to persons is taken to be implicit):
Proof:
1. yx Rx,y Premise
2. x Rx,x US, 1
In the form in which we have so far expressed the rule this application of US is entirely
legitimate. Line 1 says something (viz. xRx,y) holds for all individuals, so it must hold
(mustn't it?) for the 'arbitrary individual' x. Well, clearly not since under the intended
interpretation Rx,y: x is the father of y, the premise is true and the conclusion false. (If
we wanted to eliminate the slight vagueness in the premise under this intended
interpretation we could as always turn to the crisp unambiguous natural numbers. If
Rx,y is taken to mean x>y in the natural numbers, then the premise truly states that for
197
any natural number y theres another natural number x strictly bigger than it, whereas
the conclusion falsely asserts that there is a natural number x which is strictly bigger
than itself.)
In order to fend off this and some other fallacies, we need to express our rules of proof
much more precisely than we have so far done, and this in turn requires a more
detailed specification of the language of first order logic.
198
C9( H ): T HE L ANGUAGE OF F IRST O RDER L OGIC
*We will add one further linguistic item later, but this is plenty to be going on with.
Formulas:
In order to turn predicates into formulas, we need to apply the predicates to the
appropriate number of entities (where this includes individual variables). So, e.g., we
could apply the predicate 'is even' (let's say our symbol for this is P) to the individual
constant a (it might be the number 3) or to the individual variable x or to the
ambiguous name to get the formula 'a is even' (expressed as Pa) or the formula 'x is
even' (expressed as Px) or the unknown but specific entity has the property P
(expressed as P).
199
individual a is bigger than particular individual b; or one variable and one constant, say
Rx,a - meaning 'the variable individual x is bigger than the particular individual a'; or
two ambiguous names to get R, saying that the two possibly different specific but
unknown entities and stand in the relation R.
Terms:
Atomic Formulas:
If P is an n-place predicate and ti, ..., tn are terms, then Pt1 tn is an atomic formula and
The rest of the formulas (i.e. the rest of the meaningful expressions) can all be built up
in a step-by-step ("recursive") way from these atomic formulas. For example we can
apply our truth functional connectives to atomic formulas: to create the formula Pa,
for example, from the atomic formula Pa; or the formula Pa & Qb from the atomic
formulas Pa and Qb; or Pa Rx,a from the atomic formulas Pa, and Rx,a etc. Moreover
we can iterate these procedures any (finite) number of times to create formulas like Pa
(Rx,a v Qb), or (Pa & Qb) (Rx,a v Sa,x), etc. Brackets are used in obvious ways to
indicate the method of construction.
We can also apply our quantifiers to create new formulas for any formula F and any
individual variable x we can create the formulas:xF or xF. So, for example, from the
atomic formula Py we can create the formulas yPy and yPy; or from the formula Pz
Rx, z we can create the formula z (Pz Rx, z). Notice that this means that weird
200
expressions like xPy or zRx,y count as formulas it's just that they are not very
interesting formulas. (In fact the first is equivalent to Py and the second to Rx,y.)
Again the process can be iterated to create formulas like x(Px yRx,y) or
xixj((Pxi& Rxi,xj) xk(PxkRxi,xk)).(Remember we can use indexed variables like
xiwhenever we need lots of variables.) Again the brackets are used to indicate how
exactly the overall formula has been built out of its ultimately atomic constituents.
(Brackets are dropped whenever no confusion could result, so, e.g. it is usual to write
xPx rather than x(Px); but it is necessary to write x(Px v Qx) if we mean to say that
every individual is either P or Q, in order to distinguish this from the quite different
formula xPx v Qx (really x(Px) v Qx which means either everything is P or x is Q
(though quite what this second disjunct means we will only understand fully a little
later).
Definition: Formula
Certain formulas make assertions. For example Pa, or x(Px Qx) or xyRx,y
these state respectively that some particular individual a has property P; that
everything which is P is Q; and everything is R-related to something.
In a given interpretation these statements will all be true or false. We call such
formulas sentences (or closed formulas).
201
But how about formulas like Px or Qy zPz or yRx,y? These are so-called FREE
VARIABLE (or OPEN) FORMULAS. We shall need to take particular carewith free
variable formulas when amending our rules of proof. So, first let's characterise them
more generally and then say how they should be dealt with.
Free variable formulas can be recognised purely syntactically purely, that is, in terms
of the way that the symbols are put together. First, the SCOPE of a quantifier is that
part of a formula which is governed by that quantifier usually to be recognised by
looking for the right bracket corresponding to the left bracket immediately after the
quantifier. So, e.g., the scope of the quantifier x in x(Px v Qx) is the formula Px v
Qx.The scope of the quantifier z in Px zRx,z is just 'Rx,z' (obvious brackets
having been dropped).The scope of the quantifier y in yx(Px Rx,y) is the
formula 'x(Px Rx,y)' (again a pair of brackets having been omitted here, for the full
original formula would read: y(x(Px Rx,y))).
So in x(Px & Qx) all variables are bound. Similarly, in x(Px yRx,y). In Px, the only
variable is free. In Px yRx,y both occurrences of x are free, while both occurrences
of y are bound. In xyRx,y Py both occurrences of x are bound, while the first two
occurrences of y are bound and the final occurrence of y is free if we intended that all
occurrences of y be bound we should have used brackets to write xy(Rx,y Py).
(1) A formula in which all variables, if any, are bound is called a CLOSED
FORMULA or a SENTENCE.
202
(2) A formula in which there is at least one free occurrence of at least one
variable is called a FREE VARIABLE (or OPEN) FORMULA.
Examples:
So Pa, P, andxPx are sentences, while Px and Py are free variableformulas. xyRx,y
is a sentence, yRx,y is a free variable formula, and so on.
Sentences, in a given interpretation, make an assertion and are therefore either true or
false. So, Pa interpreted in the natural numbers with a as 5 and Px as 'x is even' is
thefalse assertion '5 is even', while x(Px Qx) in the same interpretation with Px as
x is even and Qx as 'x is divisible by 2 (without remainder)' makes the true assertion
that all even numbers are divisible by 2.
But how about free variable formulas like Px or xRx,y? What do these say?
Similarly, while the sentence xy(x y) makes the false assertion that for any two
numbers the first is bigger than or equal to the second, the free variable formula x(x
y) is neither true nor false, but instead is satisfied by some substitutions for the free
variable y (i.e. it becomes true, or better yields a true sentence, when some individuals
from the domain in this case just the number '0' are substituted for its free variable)
and it is not satisfied by other substitutions (that is, it yields a false sentence when
other substitutions are made for the free variable); in this case any substitution apart
from 0, since x(x 1), x(x 2), x(x 3) , etc., are all false).
Finally, the free variable formula x y in the same interpretation is again neither true
nor false, but instead is satisfied by or holds of ordered PAIRS of individuals from
the domain pairs considered in a particular order. It is satisfied by (1,1) (2,1) (3,1)
etc., but not by (1,2) (2,3) etc. That is, when you substitute any of the first set of values
203
(in order) for its two free variables it yields a true sentence (11, 21 etc.), while if you
substitute any of the second set of values (again in order thats why these pairs are
ordered pairs) you get a false assertion 12, 23, etc.
IN GENERAL, then, free variable formulas are neither true nor false they lay down
conditions that are sometimes satisfied, sometimes not. But how about the formula
Px Qx, interpreted in the natural numbers with Px meaning 'x is even' and Qx
meaning 'x is divisible (without remainder) by 2', or the same formula interpreted in
humans with Px meaning x is male and Qx meaning x has the Y chromosome'.
In both these cases, the free variable formula would be true for all substitutions for its
free variable. Concentrating on just the numerical case, P0 Q0, P1 Q1, P2 Q2,
are all true. In such a case, the free variable formula can, as I suggested earlier, be
interpreted as claiming that any arbitrary individual has a certain (complex)
property. And in such a case (and only in such a case) we say that the free variable
formula is itself true. (Notice then that a formula F(x) with one freevariable x is
true iff its universal quantification xFx is true).
The free variable formula yRx,y is true in the interpretation D = {natural numbers},
Rx,y: y >x it says, if you like, that an arbitrary natural number x is such that there is
one bigger than it. (Again, the truth of the free variable formula is reflected in the fact
that this formulas universal quantification xyRx,y is also true.)
These considerations about free variable formulas may seem a bit finicky but in fact
clarification of the status of free variable formulas is, as we shall see, a necessary pre-
requisite for producing watertight versions of the rules of proof.
204
C9( I ): I MPROVED VERSIONS OF THE R ULES OF P ROOF
This rule, remember, basically said that you can drop a universal quantifier from the
front of a formula and substitute anything you like an individual constant, individual
variable or ambiguous name for the variable that previously had been quantified. The
intuitive justification of this rule is that if everything in a certain domain has a certain
property then any individual element of the domain must have that property.
We already know, however, that we need to qualify this rule. This is because, as it
stands, and as we saw earlier, the rule allows us to infer xRx,x from yxRx,y. But this
inference is invalid as the interpretation I (Domain = {humans}, Rx,y: x is the father of
y) or the interpretation I (Domain = {natural numbers}, R x , y: x > y) showed.
Consider, then, yxRx,y, and consider first the free variable formula that results from
simply dropping the universal quantifier y viz. xRx,y. In, say, the arithmetical
interpretation (Rx,y: x > y) this says that there is something which is bigger than y. If
we substitute more or less anything for y in that free variable formula xRx,y we get a
formula that "says the same thing" about the substituted entity as the original does
about y. So, e.g., substituting the individual constant a or the ambiguous name for y,
though it turns the free variable formula back into a sentence produces a sentence
which says the same thing about a or about as the free variable formula did about y:
viz. that there is something bigger than it.
The exception is if we substitute x for y because this produces the formula (actually a
sentence since it contains no free variables) xRx,x which says something different, viz.
that there is something bigger than itself.
205
First remember that the terms are those linguistic items that stand (possibly variably
or ambiguously) for individuals, as opposed to the predicates which stand for
properties of individuals. So the terms (at least as far as we are presently concerned)
are:
Definition:
A term t is free for the variable xi in a formula F iff no free occurrence of xi in F lies
within the scope of any quantifier on a variable xj, where xj is a variable in t.
(This is the general notion needed, as we shall see, when we slightly extend our notion
of term. However, as we currently understand them the only way for a variable x j to
be in a term t is for t to in fact be the variable xj. For this case, the definition reduces
to:
A term t is NOT free for the variable xi in a formula F iff t is the variable xj and a free
occurrence of xi in F lies within the scope of some quantifier on xj.
Examples:
The term x is not free for y in xRx,y since the free occurrence of y in this formula does
lie within the scope of a quantifier on x.
The term z is not free for y in the formula Py z(Qy Ry,z) since two free
occurrences of y lie within the scope of a quantifier on z.
On the other hand, z is free for y in Py y(Qy Ry,z). Moreover, the individual
constant a is free for y in Py y(Qy Ry,z) or indeed for any variable in any formula
(Exercise: explain why.)
206
The question, then, only arises when we are asking if one variable, say x, is free for
another variable, say y, in some formula. This question can be answered in a purely
mechanical way as follows:
Look to see whether the variable you are intending to substitute for another
would be captured by some quantifier already in that formula if it would then
that first variable is not free for the second in that formula.
All that we need to do to amend the rule US is to require that the term substituted for
the free variable created by dropping the universal quantifier is free for that variable
in the formula thus created. (Read it slowly it does make sense!)
So, e.g, y is not free for x in yRx,y (substituting y for x would mean that that
occurrence of x was captured by the existing quantifier), hence it is not permitted to
infer yRy,y from xyRx,y.
Similarly, since z is notfree for x in (Px z(Sz v Rx,y,z)) (because of the second free
occurrence of x), we cannot use US to infer from x(Px z(Sz v Rx,y,z)) to Pz z(Sz
v Rz,y,z); although we could in this second case perfectly well infer Py z(Sz v Ry,y,z)
since the variable y is free for x in Py z(Sz v Rx,y,z).
Remember that if F is any formula, then F[t|xi] is the formula obtained from F by
substituting the term t for any occurrence of the variable xi. So, e.g., if F is the formula
Px Qx, then F[a|x] is Pa Qa; if F is yRx,y, F[zlx] is yRz,y.
Given this notation and the restriction just indicated, we have the following form of the
US rule, which is in fact the final form no further restrictions being needed:
For any formula F and any term t, F[t|xi] may be inferred from xiF, PROVIDED t is free
for xi in F.
207
You will remember (refresh your memory if not!) that the rule of conditional proof
permits the introduction of an "extra assumption" which, however, is subsequently
"discharged". Consider the following inference:
1. All of those financing the Conservative Party (Px) are from big business (Qx).
2. No one from big business gives a damn about the Environment (Rx).
So, none of those financing the Conservative Party gives a damn about the
Environment.
This proof is perfectly OK. But what about the following proof?
6. xPx UG, 5
7. Px xPx CP, 5-6
8. x(Px xPx) UG, 7
There is nothing wrong with the step from 7 to 8 here. And nothing wrong at all with
this proof so far as our rules of proof stand at the moment.
"For anything at all, if it's a financer of the Conservative Party then everyone is a
financer of the Conservative Party.
208
This rather strange assertion nonetheless makes perfect sense when you think about it
carefully it is in fact equivalent to the (admittedly odd) assertion that if theres at
least one financer of the Conservative Party then everyone finances the Conservative
Party.This is obviously false and clearly does not follow from the premises of the
argument.
The invalid step here occurs in fact at line 6 of the variant proof. The general message
is that we mess things up if we allow ourselves to generalise on variables that are
involved in assumptions introduced for purposes of Conditional Proof (or rather we
mess ourselves up if we so generalise before the point at which we have used
Conditional Proof to discharge the relevant assumption). In order, then, to prevent
problems like this one, we introduce a notational convention and a corresponding
restriction on the rule of Universal Generalisation.
Notational convention:
Any variable that is free in any assumption introduced into a proof must be FLAGGED;
this means that we record the variable on the RHS of the proofalong with the
justification for that line. The variable is then flagged in any line that depends for its
justification on a line in which it is already flagged, and the flagging ends only when all
the assumptions in which it was introduced as a free variable have been discharged by
applying the rule of Conditional Proof.
You cant universally generalise on flagged variables: that is, you can infer the formula
xiF from F, if, but only if, xi is not flagged in F. (Well need to add a further restriction
in a moment.)
Lets then amend our two most recent attempted proofs in accordance with our new
notational convention.
Proof 1:
Notice that this proof is not only intuitively valid, it is also perfectly kosher so far as our
latest restriction is concerned. In particular, applying UG at line 9 breaks no rules since
x is no longer flagged in line 8 (the flagging having stopped with the discharging of the
assumption at line 7).
Proof 2:
This, remember, is the aberrant proof with the conclusion (If the Conservative Party
has a single financer then everyone finances the Conservative Party) which clearly
doesnt in fact follow from the premises. Once we introduce flagging, we see that the
aberrant proof does indeed involve at line 6 an application of UG to a flagged
variable. Hence this proof is ruled out by our restriction on UG no universal
generalising is allowed on flagged variables.
210
A somewhat similar restriction on the Rule of Existential Generalisation (EG) is also
required. To see why here is how to prove that Everyone ishappy from the premise
Some people are happy. Needless to say this proof is not valid the conclusion does
not follow from the premise. (Lets use Hx to mean x is happy)
Proof A:
1. xHx Premise
2. H ES, 1
3. Hx Ax
4. H& Hx TI, 2,3 x
5. x(Hx & Hx) EG, 4 x
6. H& H ES, 5 x
7. Hx (H& H) CP 3-6
8. Hx TI, 7
9. xHx UG, 8
(Here, the move from line 7 to line 8 is our old friend the formal equivalent of a
reductio ad absurdum: any formula of the form P (Q& Q) tautologically implies P,
since any formula of the form (P (Q& Q)) P is a tautology. ALSO remember that
there was one restriction on ES that was so obvious that we introduced it right away:
namely that we must always use a new ambiguous name whenever we use ES more
than once. Exercise: remind yourself why. We have followed that restriction in this
proof by using at line 6 rather than the already used .)
Here, although you wouldnt think up this proof unless you had a sick mind, every step
is legitimate so far as our rules as presently formulated are concerned (you should
check this carefully, paying attention to what the rules allow you to do formally and
forgetting, for the moment, about what each line means intuitively). In particular, we
have obeyed the notational convention on flagged variables, and have not transgressed
the new condition on UG this is because the only application of UG occurs at line 9,
after the flagging has correctly stopped (the assumption in which x was introduced free
has been discharged at line 7).
211
There is however something fishy about line 5 (which is in fact the only line in this
proof that is faulty). Intuitively, making the step at line 5 forces us to identify the
arbitrary object x with the particular, if ambiguously named, entity . An obvious
restriction that would ban line 5 is the counterpart of our restriction on UG namely to
ban existential generalisation on any flagged variable, as well as universal
generalization.
This, however, turns out to be overly restrictive: it would leave our system of rules of
proof incapable of demonstrating the validity of certain inferences that are in fact valid.
Here is one important example:
The inference from xPx to xPx is clearly valid (indeed intuitively they say the
same thing since they are two equivalent ways of saying that nothing is a P; and so
they should in fact be inter-derivable (as they indeed are).) Heres how to show that
the inference from xPx to xPx is valid:
Proof B:
1. Px Ax
2. xPx EG, 1 x
3. Px xPx CP 1-2
4. xPx A
5. Px TI, 3,4
6. xPx UG, 5
7. xPx xPx CP 4-6
Notice that x is not flagged at step 4 since it is not free in the assumption made there.
Everything else is done in accordance with our rules. The only questionable step is at
line 2. If we were to ban existentially generalising on flagged variables, line 2 would be
rendered illegitimate and as it turns out there is no other way to prove in our
system that this valid inference is indeed valid.
Hence to deal with the problem highlighted by Proof A, we must introduce a less
demanding restriction than a blanket ban on existentially generalizing on flagged
variables. It turns out that it can be proved that with only this restriction the rule
212
sanctions all and only all valid inferences (but we cannot give the proof of this in this
course). For the present you should just learn the restriction and apply it correctly:
Let F be the formula obtained from the formula F by substituting the variable xi for all
occurrences of a name (that is, either an individual constant or an ambiguous name) IF
ANY. Then the step from F toxiF is legitimated by EG provided that, IF goingfrom F to
F actually involves dropping an ambiguous name, THEN xi is not flagged.
Pending a general proof that the rule as thus restricted is sound (i.e. permits only valid
inferences), this complicated restriction is bound to look ad hoc. However, notice that
at least it does the job so far as our two most recent alleged derivations ProofA and
Proof B are concerned.
The step in (genuine) Proof B line 2 that was under suspicion is in fact exonerated,
that is it does not run afoul of this restriction on EG: since no ambiguous name is
dropped in the process. But step 5 in proof A is disallowed by the restriction since in
that proof was dropped in favour of x... and x was flagged. Since the inference in
Proof B is valid and the one in Proof A is invalid, the restriction definitely gets it right
in these two cases. (The (meta-)proof that it gets it right in all cases is more complex
and will not be given in this course.)
So just underlining what the restriction means a little more sharply: If some step in a
proof is from R,y toxRx,y then x must not be flagged if the step is to be legitimated
by EG; but if the step is from, say, Rx,y to xRx,y then this is legitimated by EG (as
restricted) evenif x is flagged, since no ambiguous name is dropped in making the step.
1. xyRx,y Premise
2. yR,y ES, 1
3. yyRy,y EG, 2
4. yRy,y ES, 3
213
This is invalid reasoning. Take the interpretation:
The premise is true, since the number 1 is such (1*y = y). But the conclusion that y
(y*y = y), which says that every number is equal to its own square, is false.
Step 4 looks suspicious but is in fact OK an existential quantifier has been dropped
and an ambiguous name introduced for the free variable thus created (as required by
ES) its just that in this case (because of the double quantification on y in line 3), no
free variable has been created by dropping the first quantifier.
The fallacy in fact occurs at line 3: occurs in the formula yR,y within the scope of a
quantifier on y; and this means that substituting the variable y for in (apparently)
applying the rule EG yields a bound occurrence of that variable. We must restrict EG
exactly by banning such substitutions:
Modified EG 1:
xiF may be inferred from F, where F is the same formula as F except that all the
occurrences of some name (ambiguous or constant) have been replaced by the
variable xi, if but only if the name does not occur in F within the scope of a quantifier
on the variable xi.
1. xyRx,y Premise
2. yRx,y US, 1
3. Rx, ES, 2
4. xRx,x EG, 3
Yet if we interpret Rx,y as x<y in the natural numbers, the premise here is a true
statement about numbers and the conclusion 4 is a false statement (Exercise: make
sure you understand why), so obviously something is wrong. To see precisely what,
consider what is going on intuitively. The move from 1 to 2 is clearly correct: if for
every number there is one greater than it, then this applies to any arbitrary number
214
which is what 2 says. It is also true (step 3) that, given an arbitrary number x, we can
pick a number greater than it; but the crucial point is that whichs make this true will
depend on which numberx we have picked (think about the two Hungarians in the joke
I usually give in the lecture). We ought to signalthis dependency of on the prior
choice of x by writing x as a subscript to (i.e. as x.)
Any ambiguous nameintroduced by ES has as subscripts all the free variables occurring in
the formula to which ES is applied.
Thus, e.g., if we apply ES to the formula xRx,y,z we must write Rx,y,y,z. And given this
Modified EG 2:
convention read 'x <x' and hence step 4 is blocked by this restriction on EG.
1. xyRx,y Premise
2. yRx,y US, 1
3. Rx,x ES, 2
4. xRx,x UG, 3
5. yxRx,y EG, 4
But in the same interpretation of Rx,y (viz. x < y) again the premise is true about
numbers and the conclusion false (Exercise: make sure you understand why). The
incorrect step here is 4 and this points to the required modification on UG: it is NOT
permissible to universally generalise using a variable thatoccurs as a subscript in
the formula.
215
Final statement of the Rules of Proof:
So, here are the correct versions of the rules in all their glory:
F[tIxi] may be inferred from xiF, where t is any term free for xi in F.
subscript in F.
F[j|xi] may be inferred from xiF, where j is any NEW 'ambiguous name' (i.e.
one which does not occur already in some earlier line of the proof).*
(*Remember we already introduced and justified this restriction earlier since
it is so obvious.)
Let F' be the same formula as F except POSSIBLY that some name (that is either
some ambiguous name or some individual constant) in F is replaced throughout
by some variable xi in F', then xiF' may be inferred from F, PROVIDED that the
replaced name (if any) does not occur in Fwithin the scope of a quantifier on xi;
If the formula G can be inferred from some set of premises Splus the extra
assumption formula F then the formula F G can be inferred from S alone.
216
C10: L OGICAL T RUTH , L OGICAL F ALSEHOOD , AND I NCON SISTENCY F ULL
P REDICATE L OGIC
The notion of a logical truth remains, of course, the same as it was in the restricted case
of monadic predicate logic:
The method of demonstrating logical truth also remains the same: we show that s is a
logical truth by proving it absolutely that is, without invoking any premises. As
example 1 will show, even in the case of monadic predicate logic, we need, when
producing proofs, to pay attention to the restrictions on the rules of proof that we just
introduced; it is just that, as we will see as we go along, attention to the restrictions is
much more often necessary when relations are involved.
Example 1:
xPx xPx
is a logical truth. (Its a sort of monadic predicate logic equivalent of the truth-
functional law of double negation: Everything has property P says the same thing as
Nothing fails to have property P.)
Proof:
1. xPx A
2. xPx A
3. P ES, 2
4. P US, 1
5. P& P TI, 3,4
6. xPx P& P CP, 2-5
7. xPx TI, 6
217
8. xPx xPx CP, 1-7
At this stage, we have established one direction of the biconditional. Now for the
other part:
9. Px Ax
10. xPx EG x
11. Px xPx CP, 9-10
12. xPx A
13. Px TI, 11,12
14. xPx UG, 13
15. xPx xPx CP, 12-14
Thats the second direction dealt with, so we put them together via TI:
Comments:
Example 2:
The sentence:
xyRx,y xyRx,y
is a logical truth.
219
Proof:
1. xyRx,y A
2. xyRx,y A
Comment 1: So, just as in the easy half of Example 1, we have now assumed, contrary
to what we will end up proving, that the LHS of the biconditional at issue is true, while
the RHS is false.
3. yR,y ES, 1
4. yR,y US, 2
5. R ES, 4
6. R US, 3
7. RR TI, 5,6
8. xyRx,y RR CP, 2-7
9. xyRx,y TI, 8
10. xyRx,y xyRx,y CP, 1-9
Comment 2: So, this ends the first half of the proof: just as in Example 1, we are out to
prove a biconditional, so its natural to split the proof into two halves: first FG, then
G F.
11. yRx,y Ax
Comment 3: We are now setting out to prove the second half of the result (ie
RHSLHS) by assuming not that the RHS is true and then deriving the LHS, but instead
by assuming that the LHS is false and showing that in that case the RHS is also false
(and then turning it back round which is what we will do at line 26, below). However
just assuming xyRx,y straight off (as we will do at line 14) would leave us stymied
(for essentially the same reason as explained in Comment 4 for Example 1) so we again
take the analogous steps as in Example 1. We are going to assume xyRx,y, so make
the assumption that leaves off the negation and the existential quantifier, i.e. yRx,y
(noting that x is free and so must be flagged);we can still apply EG (line 12) because
although x is flagged, no ambiguous name is dropped. Discharging by CP gives us again
220
a logical truth at line 12 and now the assumption we were going to make all along does
again talk via the application of TI at 15. Here we go:
Comment 4: We know via the intuitive result that y = y that this line is in fact
equivalent to yRx,y, which we will eventually get as line 23. It may be better to jump,
the first time you try to grasp the proof, from here directly to line 23. However, this
intuitive equivalence needs to be proved properly which is what lines 15-23 achieve.
Study that sub-proof carefully; it again involves that little riff where you assume some
sentence with a free variable and existentially generalize on it (lines 16-18)
221
C10( B ): L OGICAL F ALSITY
Again the definition of a logical falsehood is, of course, the same for the full predicate
logic as it was for the restricted case of monadic predicate logic. Namely:
We show that s is logically false, by showing that its negation s is logically true i.e,
by deriving s from no premises.
Example:
The sentence:
is logically false.
Proof:
222
Notice, then, that we have shown that x(PxyRx,y) &x(Px &yRx,y) is logically
false by showing that its negation is logically true (the further twist being that we have
proved the latter by assuming its negation i.e. the original sentence and deriving a
contradiction).
223
C10( C ): T HE I NCONSISTENCY OF S ETS OF S ENTENCES F ULL P REDICATE
L OGIC
Finally, exactly the same considerations apply to showing that a particular (finite) set
of sentences S is inconsistent in full predicate logic as they did in the simple monadic
case i.e. that S is an inconsistent set if the sentences it includes are never all true
together. So, as before, S is inconsistent iff it has no models, i.e. there is not a single
interpretation in which all the sentences in S are true. Moreover, the methods for
demonstrating that S is inconsistent remain the same. There were, remember, two such
methods that, though formally different, are clearly intuitively equivalent.
Method 1:
(s1, s2, …, sn) is inconsistent iff s1 & s2 & … & sn is a logical falsehood.
Hence, using this method we would show that ¬(s1 & s2 & … & sn) can be proved from no premises.
Method 2:
(s1, s2, …, sn) is inconsistent iff a truth functional contradiction can be derived from (s1, s2, …, sn) as premises.
Hence, using this method, we would take (s1, s2, …, sn) as premises and set out to deduce a truth functional contradiction.
Example:
6. P US, 5
7. yzRy,z TI, 3,6
8. yzRy,z TI, 4
9. yzRy,z & yzRy,z TI, 7,8
(Exercise: prove that this same set is inconsistent by the first method – this will simply involve collapsing the first two steps into one (and relabelling) and adding one further final step.)
C11: FIRST ORDER PREDICATE LOGIC WITH IDENTITY
If, for example, we were allowed to reinterpret what 'all' means – if we could consider interpretations in which it meant 'some', for example – then there would be no interesting valid inferences. Even:
would not meet the test of validity if 'all' were not fixed from interpretation to interpretation. This is because, if we were allowed to reinterpret 'all' as 'some', the following would be a counterexample (i.e. an inference of the same form (under this proposed laxer notion of form) with true premises and a false conclusion)
But why exactly should 'all' be regarded as specifying part of the form of an inference,
while 'is Greek', for example, is part of the content and therefore reinterpretable? This
raises difficult and deep issues in the foundations of logic, that we shall not be able to
go into in this course. However, we can investigate one particular issue that arises in
this connection: the issue of which side of the line the equality or identity relation x = y falls on. Is identity logical or descriptive?
In first-order logic as we have studied it so far, we have not given any special role to the
relation of identity or equality which has therefore implicitly been understood as a
descriptive term, as an ordinary two-place relation: 'Every object is identical to itself', for example, just formalises as ∀xRx,x. This means, as we know, that when we start to
reinterpret some set of sentences, for example in search of a counterexample to some
inference, then we are perfectly at liberty to reinterpret what had started out as the
identity relation in any way that we like: let's say we have formalised it as Rx,y; then we can of course set up interpretations I in which Rx,y means, say, x is the father of y, or x loves y, or whatever. And this would mean that 'Everything is identical to itself' – which sounds like it should be a logical truth – would be no such thing: 'Everyone is the father of him/herself' would be a sentence of the same logical form as 'Everything is identical to itself', and hence that identity statement cannot be a logical truth.
What would happen if, on the contrary, we added identity to our list of logical terms so that
= was required to mean identity or equality whatever the interpretation of the rest of the
terms? It should be clear from our basic characterisation of validity that the effect of any
addition to the list of logical terms will be to extend the class of valid inferences by
restricting the class of possible counterexamples. Consider, e.g., the following
intuitively valid inference in mathematics:
1. a = b
2. b = c
So, a = c
If we just treat identity as a two-place relation like any other, then the inference in first
order logic is just:
1. Ra,b
2. Rb,c
So, Ra,c
As thus formalised, the inference is obviously invalid. Take, e.g., the interpretation I:
Domain: {natural numbers}
Rx,y: y = x + 1
a: 1
b: 2
c: 3
If, on the other hand, we treat identity as a logical term and hence require that its
meaning remain the same in any interpretation, then the inference will obviously be
valid. In that case, the only variable items are the domain and the assignments of particular elements of the domain to the individual constants a, b and c; but whichever elements these are, a = c clearly cannot be false when a = b and b = c are both
true. (That is, if we have assigned the same element of the domain to both the constant
a and the constant b, and we have also assigned the same element to both the constant
b and the constant c, then we have ipso facto assigned the same element of the domain
to the two constants a and c.)
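If you like to check such things mechanically, here is a minimal Python sketch of the two points just made. The choice of 'y = x + 1' as the reinterpretation of R (and the small search domain) is simply an illustrative assumption, not the only one that would do.

from itertools import product

def R(x, y):          # one illustrative reinterpretation of Rx,y: y = x + 1
    return y == x + 1

a, b, c = 1, 2, 3     # assignments to the individual constants
print(R(a, b), R(b, c), R(a, c))   # True True False: true premises, false conclusion

# By contrast, once '=' is fixed to mean identity it cannot be reinterpreted:
# no assignment of objects to a, b, c makes a = b and b = c true and a = c false.
domain = range(5)
counterexamples = [(x, y, z) for x, y, z in product(domain, repeat=3)
                   if x == y and y == z and not x == z]
print(counterexamples)             # [] -- there is none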
Of course everyone would regard the above inference as intuitively valid. But this is not
a knock-down argument for regarding identity as a logical term. After all, the inference:
is also intuitively valid. Yet it formalises using, say, Nx,y for x is to the north of y as:
1. Na,b
2. Nb,c
So, Na,c
The intuitive validity of this latter inference is clearly best explained, however, NOT by saying 'is north of' is a logical term (this seems obviously wrong), but by reflecting that we all carry round with us certain background information that we intuitively import as extra premises in ordinary arguments like this geographical one: so-called implicit or hidden premises.
In the geographical case, we all know that whenever one place is north of a second and
the second north of a third, then the first is north of the third. (To put it in logician-
speak, we all know that is to the north of is a transitive relation.) When we
"articulate" this hidden premise and add it as an explicit premise the inference
becomes:
1. Na,b
2. Nb,c
3. ∀x∀y∀z((Nx,y & Ny,z) → Nx,z) Initially implicit premise
So, Na,c
Similarly (in fact, logically speaking, identically), in the earlier inference we all know that whenever one thing equals a second and the second a third then the first equals the third. Adding ∀x∀y∀z((Rx,y & Ry,z) → Rx,z) as an explicit premise turns the 'a = b, b = c, So, a = c' inference into a formally valid one, without needing to regard identity as a logical term (indeed formally it's the same inference as the geographical one).
There are, however, at least two arguments for regarding identity, as distinct from is
to the north of, as a logical notion. The first argument is that it does seem to be a
genuinely general notion: the laws of identity are always the same whether one is
talking physics, arithmetic, real analysis, economics or whatever.
The second argument is that it seems arbitrary that the notion 'There is at least one...'
should be a logical notion (as it is of course in Predicate Logic where it is represented
by ∃x), while 'There are at least two', 'There are at least three'... etc. are not. Such
notions all become purely logical if we treat identity as a logical term. E.g. 'There
are at least 2 things with property P' is:
∃x∃y(Px & Py & ¬(x = y))     (*)
Notice carefully that the last conjunct is necessary: just ∃x∃y(Px & Py) is, as we stressed earlier, consistent with there being just one thing that has property P. Just involving two variables does NOT mean that there have to be 2 different things that are P. (In fact ∃x∃y(Px & Py) is logically equivalent to just ∃xPx.)
(Exercise: (perhaps a surprisingly tricky one) show that this logical equivalence holds.)
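Though it of course proves nothing, you can get a feel for the difference between the two formulas by evaluating them by brute force in small finite interpretations. The little Python sketch below (the function names are mine) does just that, and also checks on small domains the equivalence the exercise asks you to prove.

from itertools import product

def at_least_two_no_identity(dom, P):
    # ExEy(Px & Py): x and y are allowed to pick out the same object
    return any(P(x) and P(y) for x, y in product(dom, repeat=2))

def at_least_two(dom, P):
    # ExEy(Px & Py & not(x = y)), i.e. the formula (*)
    return any(P(x) and P(y) and x != y for x, y in product(dom, repeat=2))

dom = [1, 2]
P = lambda x: x == 1                      # exactly one thing has P
print(at_least_two_no_identity(dom, P))   # True  -- satisfied by taking x = y = 1
print(at_least_two(dom, P))               # False -- (*) really requires two P-things

# and ExEy(Px & Py) agrees with ExPx in every interpretation over domains of size 1-3:
for n in (1, 2, 3):
    d = list(range(n))
    for ext in product([False, True], repeat=n):
        Q = lambda x, ext=ext: ext[x]
        assert at_least_two_no_identity(d, Q) == any(Q(x) for x in d)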
Notice just as carefully that everything in this expression (*), aside from the predicate P, is logical, if identity = is regarded as such. 'At least two', despite being a slightly more complicated expression, would stand on a logical par with 'there is at least one'.
Moreover, it seems difficult to understand why, e.g., 'there is at least one thing (such that P, or whatever)' should count as a logical notion while 'There is exactly one thing (such that P or whatever)' does not. And again 'There is exactly one thing such that...' also becomes a purely logical notion if identity is treated as logical. This is because it is expressed by ∃x(Px & ∀y(Py → y = x)) (or, equivalently, ∃x∀y(Py ↔ y = x)).
(Difficult) Exercise: show that these two formulations are indeed equivalent.
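Again purely as a sanity check, and not a substitute for the proof the exercise asks for, the following sketch (names mine) confirms that the two formulations agree in every interpretation over domains of up to four objects.

from itertools import product

def exactly_one_v1(dom, P):
    # Ex(Px & Ay(Py -> y = x))
    return any(P(x) and all((not P(y)) or y == x for y in dom) for x in dom)

def exactly_one_v2(dom, P):
    # ExAy(Py <-> y = x)
    return any(all(P(y) == (y == x) for y in dom) for x in dom)

for n in range(1, 5):
    dom = list(range(n))
    for ext in product([False, True], repeat=n):
        P = lambda x, ext=ext: ext[x]
        assert exactly_one_v1(dom, P) == exactly_one_v2(dom, P)
print("the two formulations agree on every interpretation tried")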
Again, everything in this expression aside from the predicate P is logical if = is. If we do regard = as logical, then, similarly, 'There are exactly two things with property P' is also a logical notion (aside from the descriptive P), since it formalises as ∃x∃y(Px & Py & ¬(x = y) & ∀z(Pz → (z = x v z = y))). And so on, with 'exactly n' for any n.
C11(B): FIRST ORDER LOGIC WITH IDENTITY: LANGUAGE AND INTERPRETATION
Suppose, then, that we do decide that identity should be regarded as a logical notion,
and hence should be incorporated into logic (thus creating a system called First Order
Predicate Logic with Identity).
This extension of our logic is easily achieved. On the semantic side (remember,
semantics concern interpretations and their use to establish invalidity, consistency,
etc.), we just regard '=' as a logical and therefore non-reinterpretable term, as we
already indicated.
However, this is also the place to make one further addition to the expressive power of our language (we could have made this addition earlier but it would then have complicated matters without any real pay-off).
First order logic with identity is very suitable as a basis for mathematical theories and
for scientific theories based on mathematics. In mathematics much use is made of
functions. These are 'mappings' or 'rules of association' which take one individual and map it onto, or associate it with, another.
One example is the doubling function, usually written f(x) = 2x. It takes any natural
number, say, and maps it onto its double (1 onto 2, 2 onto 4, 3 onto 6, etc.). Similarly,
the squaring function, g(x) = x^2, takes any number and maps it onto its square (1 onto
1, 2 onto 4, 3 onto 9, etc.)
We can have functions of any (finite) number of arguments. For example, we can
characterise a two-place summing function h(x,y) = z that maps any pair of numbers
onto another number, viz. their sum. So h(1,2) = 3, h(7,9) = 16 etc.
Notice however that there is, for example, no brother function because (a) unlike the father case, not everyone has a brother, so it is not defined for all elements of the intended domain (humans), and (b), more importantly, some people have more than one brother; so, for example, Brother(Charles) does not pick out one definite entity – both Andrew and Edward are his brothers – and so is not representable as a function.
The introduction of functions greatly enlarges the class of terms (see above).
Individual constants, individual variables, and "ambiguous names" are, remember, all
terms. But now we add all functions applied to the correct number of terms: that is, we
stipulate that:
If f is any n-place function symbol and t1, …, tn are all terms, then so also is f(t1, …, tn) a term.
So, for example, if f(x,y) = z is the two-place summing function and a and b are
individual numbers then f(a,b)[=a+b] also names a number and hence is a term; so is
f(x,y) for variables x and y. Notice also that f(f(a,b),c) is well-defined, being the sum of
the sum of a and b, and c. The definition of terms allows for iteration (indeed any
(finite) number of iterations).
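As a toy illustration (the function names below are mine, and Python functions are of course only a stand-in for function symbols of the formal language), terms built from function symbols can be evaluated and nested to any finite depth:

def double(x):        # the doubling function f(x) = 2x
    return 2 * x

def square(x):        # the squaring function g(x) = x**2
    return x * x

def plus(x, y):       # the two-place summing function h(x,y) = x + y
    return x + y

a, b, c = 2, 3, 4
print(plus(a, b))            # 5  -- the term h(a,b) names the number 5
print(plus(plus(a, b), c))   # 9  -- h(h(a,b),c): term-building can be iterated
print(square(double(a)))     # 16 -- g(f(a)): likewise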
It was with the eventual introduction of functions in mind that I gave the general
definition of a term being free for a variable earlier. (Remember this notion was
necessary for an accurate description of the rule of Universal Specification.) Not only is
y not free for x in ∃yRx,y, nor is any other term like f(y) or g(x,y) or h(y,z,w) that involves y. Any such term when substituted for the free variable x in ∃yRx,y gets captured by a pre-existing quantifier.
Since f(x), g(x,y) etc. are terms, the rule of Universal Specification (US) allows us to infer from, say, ∀xPx, not just Px or Pa or Pα but also Pf(x) or Pg(x,y), etc. The rule about the substituted term being free for the variable still applies, though, and so, since f(y) is not free for x in ∃yRx,y, we could not infer from ∀x∃yRx,y to, for example, ∃yR(f(y),y) by US. Similarly, the rule of Existential Specification allows us to go from ∃xPf(x), for example, to Pf(α).
Naturally all the conventions about flagging variables and subscripting ambiguous
names apply equally well when we introduce variables or ambiguous names within the
context of some function. You should go back to the statement of the rules of proof (full
form) and think about how they apply when we allow Specification either Universal
or Existential using appropriate terms in the new extended sense: extended to take
functions into account.
C11(C): RULES OF IDENTITY
So far as proofs go, as just indicated, the rules we already have carry over to the new
system and in exactly the same form (they just become more powerful because applied
to an extended language involving a wider notion of terms). But two further rules,
specifically about identity, must be added to the already existing stock to extend first
order logic to first order logic with identity. These are:
I1: The formula t = t, for any term t, may be derived from the empty set of premises (i.e. may be written down at any stage in a proof).
I2: Let t1, …, tn and s1, …, sn all be terms. Then if, for all i ∈ {1, …, n}, si is free for ti in formula F, the formula F′ obtained from F by substituting some or all of the si for some or all occurrences of the corresponding ti may be inferred from F and the formulas s1 = t1, …, sn = tn.
The first rule (I1) is obvious, and Rule I2 is actually less fearsome than it looks, as will become clear from a few examples. (It is sometimes called the principle of the indiscernibility of identicals because it basically says that if two objects are identical (this basically means we have two different names for the same object) then they have all the same properties – 'A rose by any other name would smell as sweet!')
1. Pb may be inferred from Pa and a = b (so, 'Cassius Clay was a great boxer' can be inferred from 'Muhammad Ali was a great boxer' and 'Muhammad Ali = Cassius Clay').
2. ∀yQx,y may be inferred from ∀yQz,y and z = x.
3. (Note that ∀yQy,y cannot be inferred from ∀yQz,y and z = y, since y is not free for z in ∀yQz,y.)
4. R(f(x1), …, f(xn)) may be inferred from R(x1, …, xn) and x1 = f(x1), …, xn = f(xn).
C11(D): DERIVATIONS IN FIRST ORDER LOGIC WITH IDENTITY
Example 1:
Show that ¬(a = b) may be derived from the premises Pa, ∀x(Px → Qx), ∀x(Qx → Rx) and ¬Rb.
Proof:
1. Pa Premise
2. ∀x(Px → Qx) Premise
3. ∀x(Qx → Rx) Premise
4. ¬Rb Premise
5. Qb → Rb US, 3
6. ¬Qb TI, 4,5
7. Pb → Qb US, 2
8. ¬Pb TI, 6,7
9. a = b A
10. ¬Pa I2, 8,9
11. Pa & ¬Pa TI, 1,10
12. (a = b) → (Pa & ¬Pa) CP, 9-11
13. ¬(a = b) TI, 12
Example 2:
Show that ∀x(Px ↔ ∃y(x = y & Py)) is a logical truth of First Order Logic with Identity.
Proof:
1. Px Ax
2. ∀y¬(x = y & Py) Ax
3. ¬(x = x & Px) US, 2 x
4. ¬(x = x) v ¬Px TI, 3 x
5. x = x I1 x
6. ¬Px TI, 4,5 x
7. Px & ¬Px TI, 1,6 x
8. ∀y¬(x = y & Py) → (Px & ¬Px) CP, 2-7 x
Comment 1: Notice that x remains flagged here since although we have discharged one
assumption, it was not the assumption (line 1) in which x was introduced free.
Comment 2: x still remains flagged since although we have discharged one assumption
(line 10) in which x was introduced free, the assumption at line 1 remains
undischarged.
Comment 3: Here we universally generalize, but on the unflagged variable y (we could
not legitimately generalise on x since it remains flagged).
16. ∀y¬(x = y & Py) & ¬∀y¬(x = y & Py) TI, 9,15 x
17. ¬∃y(x = y & Py) → (∀y¬(x = y & Py) & ¬∀y¬(x = y & Py)) CP, 10-16 x
18. ∃y(x = y & Py) TI, 17 x
19. Px → ∃y(x = y & Py) CP, 1-18
Comment 4: Here at last the flagging on x stops (though we start to flag it again in the
next line, which begins the other half of the proof.)
Example 3:
Proof:
Example 4:
Show that the sentence ∃x∃y(x = y & ¬(y = x)) is logically false.
Proof:
1. ∃x∃y(x = y & ¬(y = x)) A
2. ∃y(α = y & ¬(y = α)) ES, 1
3. α = β & ¬(β = α) ES, 2
4. α = β TI, 3
5. ¬(β = α) TI, 3
6. β = β I1
7. β = α I2, 4,6
8. (β = α) & ¬(β = α) TI, 5,7
9. ∃x∃y(x = y & ¬(y = x)) → ((β = α) & ¬(β = α)) CP, 1-8
10. ¬∃x∃y(x = y & ¬(y = x)) TI, 9
D: INFORMAL REASONING: PREDICATE LOGIC
As suggested earlier, in Section B, the fact that deductive logic is the logic behind
ordinary informal reasoning is obscured by the fact that we seldom spell out our
arguments in the detail required for a full demonstration of deductive validity. Instead,
we leave various premises implicit because we assume they will be presupposed by
those who hear our arguments. We saw some examples of this and of the value of
spelling out the initially hidden premises in section B with informal arguments whose
validity can be captured in truth functional logic. In this brief section we look at
arguments where predicate logic can usefully be involved.
Example 1:
An Israeli officer, during one of the wars with Egypt in the 60s or 70s, was reported as
arguing: The man who parachuted out of the Egyptian plane had blond hair. So he
must have been Russian. This clearly is an argument. The so signals the conclusion:
viz. that the man was Russian. The only explicit premise is The man who parachuted
out of the Egyptian plane had blond hair. The man involved was obviously a particular individual – let's therefore introduce the individual constant a for him. Then, introducing Px: 'x parachuted out of the Egyptian plane', Qx: 'x has blond hair', and Rx: 'x is Russian', we get the following formalisation for the inference as it stands:
1. Pa & Qa
Therefore, Ra
This is of course invalid. (Exercise: Supply a counterexample.) Yet we can imagine that
in the circumstances the argument was quite convincing. How can this be if the logic of
ordinary argument is the deductive logic we have investigated?
Well, the answer, as in the cases in Section B, is that the Israeli officer was taking for
granted certain assumptions that he did not bother to articulate. These are so-called
implicit or hidden assumptions.
There can, of course, be no rules for identifying these but only more or less plausible
conjectures. It does seem fairly clear that in this case the officer was making two
assumptions: (a) that only two types of people were involved on the side of his enemy
viz. Egyptians (openly) and Russians (covertly) and so anyone who parachuted out of
the Egyptian plane was either Russian or Egyptian; and (b) that no Egyptian has blond
hair. Introducing the predicate Sx for 'x is Egyptian', (a) then formalises as ∀x(Px → (Rx v Sx)) and (b) as ∀x(Sx → ¬Qx).
If we now add these initially hidden premises as explicit premises, then we get the
following inference:
1. Pa & Qa
2. ∀x(Px → (Rx v Sx))
3. ∀x(Sx → ¬Qx)
Therefore, Ra
This is, of course, valid in first order logic (Exercise: supply the easy proof).
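If you want an informal cross-check before doing the proof, the following Python sketch searches exhaustively through all interpretations over domains of one, two or three objects. Finding no counterexample there is of course only evidence, not a proof (the derivation is what settles it), but it makes the role of the hidden premises vivid: drop premise 2 or 3 from the test and counterexamples appear at once.

from itertools import product

def counterexamples(max_size=3):
    found = []
    for n in range(1, max_size + 1):
        dom = range(n)
        # every assignment of extensions to P, Q, R, S and of an object to a
        for P, Q, R, S in product(product([False, True], repeat=n), repeat=4):
            for a in dom:
                prem1 = P[a] and Q[a]                                  # Pa & Qa
                prem2 = all((not P[x]) or R[x] or S[x] for x in dom)   # Ax(Px -> (Rx v Sx))
                prem3 = all((not S[x]) or (not Q[x]) for x in dom)     # Ax(Sx -> ~Qx)
                if prem1 and prem2 and prem3 and not R[a]:             # ...yet not Ra
                    found.append((n, P, Q, R, S, a))
    return found

print(counterexamples())   # [] -- no counterexample in any of these interpretations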
Example 2:
As the Boston Celtics were winning the 1976 NBA championship by beating the
Phoenix Suns in Arizona, a television pundit said that the Celtics are proving that they
are a great basketball team, because you can't claim to be a great team if you can't win on the road.
Let Px: x is a basketball team, Rx,y: x wins at y, Qx: x is the home team, Sx: x is great, a is
Boston Celtics, b Phoenix Suns, THEN the pundit's argument is:
So, Sa
To make it valid you would have to add the extra assumption that if a team can win on the road then it is great; or, equivalently, to turn that final '→' in the first premise into a biconditional.
(Exercise: make sure you understand the formalisation and all these comments.)
Example 3:
Again some arguing is going on here, but again we need to think a bit to bring the exact
structure out. The main conclusion is that Jesus did possess original sin; a sort of
subsidiary conclusion is that fundamentalists are wrong in thinking that he did not
possess original sin. The main argument is:
Taking Rx,y to mean x is a descendant of y, Px: x has original sin; and a, m, and j to
be the obvious choices for individual constants for Adam, Mary and Jesus we have:
1. Rj,m
2. Pa & ∀x∀y((Rx,y & Py) → Px)
3. Rm,a
So, Pj
Example 4:
Raymond Smullyan's entertaining book called What is the name of this book?
contains a lot of interesting anthropological detail about two islands: the island of
Knights and Knaves (which I introduced you to briefly in Section A1) and the island of
Knights, Knaves and Normals. (Knights always tell the truth; Knaves always lie and
Normals sometimes lie and sometimes tell the truth.) There was a famous court case on
the second (tri-partite) island. Three inhabitants (A, B and C) were charged with
murder. Inspector Knacker of the Yard (flown in from England and therefore an
honorary knight!) discovered for sure (that is, you are to take these as premises) that
only one was guilty, that the guilty one was a Knight and that the guilty one was the
only Knight among the three. The court case did not last long (to the great chagrin, of
course, of the lawyers involved), since all that happened was that the three defendants
made one statement each:
A: I am innocent.
B: That is true.
C: B is not normal.
Fortunately, all the jurors had previously attended this course and were quickly able,
via an informal argument, to identify the guilty party. (Try to work it out yourself before reading on.)
The reasoning is this: A can't be the Knight, since if he were he would be guilty (we know that whoever is the Knight is guilty), but then he would have lied in his statement, and that's impossible for a Knight. A also can't be a Knave, since if he were he would be innocent (the guilty party is known to be a Knight by premise 1) and hence what he said would be true, which is impossible for a Knave. So A is a Normal (and innocent – it is given, remember, that the only Knight among the three is the guilty one). Since A's
statement is therefore true, so is B's. Hence B cannot be a Knave. He is either a Knight
or a Normal. If he were a Normal then C's statement would be false and so C would
have to be either a Knave or a Normal. This would mean there was no Knight among
the three, but we know from the premises that there is just one Knight. Therefore, B is
not a Normal. Since we already decided he is not a Knave, he must be a Knight and so
the guilty one.
This reasoning consists of a whole series of valid inferences. (This is typical of much
reasoning which is sequential like a proof.) Some steps are reductio ad absurdums.
You could have some fun capturing some of these inferences in, say, first-order logic.
(In that logic we could give the following characterisation of a Knight: Knight(x) = ∀y((Sy & Ax,y) → Ty). Here Sx: 'x is a statement'; Tx: 'x is true'; and Ax,y: 'x asserts y'.)
(Exercise: Do the same for Knave(x); and prove (informally) that no inhabitant of the
island of Knights and Knaves ever said 'I am a Knave'.)
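For amusement, here is a brute-force Python check of the jurors' reasoning: it tries every assignment of types to A, B and C and every choice of culprit, keeps only the hypotheses consistent with the premises and the three statements, and prints what survives.

from itertools import product

TYPES = ("Knight", "Knave", "Normal")
people = ("A", "B", "C")

def consistent(types, guilty):
    # Premises: the guilty one is a Knight, and is the ONLY Knight of the three.
    if types[people.index(guilty)] != "Knight" or types.count("Knight") != 1:
        return False
    # Truth values of the three statements on this hypothesis:
    sA = (guilty != "A")          # A: "I am innocent."
    sB = sA                       # B: "That is true."
    sC = (types[1] != "Normal")   # C: "B is not Normal."
    # Knights only say true things, Knaves only false things, Normals either.
    for t, s in zip(types, (sA, sB, sC)):
        if (t == "Knight" and not s) or (t == "Knave" and s):
            return False
    return True

solutions = [(types, guilty)
             for types in product(TYPES, repeat=3)
             for guilty in people
             if consistent(types, guilty)]
print(solutions)   # only one hypothesis survives: A and C Normal, B the Knight and guilty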
E: SOME PROBLEMS IN THE FOUNDATION OF LOGIC
An inference is deductively valid, we have found, if all models of its premises are also
models of its conclusion (no counterexample). Our basic idea of a model of a set of
sentences appears trouble-free. But it involves two notions: that of a set (the domain of
the interpretation must be a set) and that of truth (the sentences must all be true for
the interpretation to be a model) each of which has interesting difficulties associated
with it. In the rest of the course, I shall indicate these difficulties, as examples of the
interesting problems that occur in the foundations of logic (and which are studied in
other courses in the Philosophy Department).
E1: TRUTH
"To say of what is that it is, or of what of is not that it is not is true; while
to say of what is not that it is or of what is that it is not is false."
X is true iff p
Both are, of course, true (the first bi-conditional being true on both sides, the second
false on both sides). Such bi-conditionals are not logically true – to be logically true
they must say the same thing (in different words) about the same entities; but these bi-
conditionals have an assertion about a sentence on one side and about things in the
world on the other. But those bi-conditionals do seem trivial nonetheless: they express
the obvious connection between a true sentence and the state of affairs (in the simple
case states of affairs in the world) the sentence asserts to hold.
However, interestingly things turn out not to be quite so straightforward. To see this,
first note that there is no special problem in applying the schema to sentences which
themselves happen to be about some particular sentence instead of about snow or
Libya or whatever. Just the same condition surely applies. Consider the sentence:
"The first sentence in today's Times is true" is true iff the first sentence
in today's Times is true.
Or the sentence:
"All the sentences in Genesis are false" is true iff all the sentences in
Genesis are false.
(Notice again that these bi-conditionals are not logically true: the left hand side of the first one is a sentence about a sentence (it makes an assertion about a sentence), whereas the right hand side contains a name – a definite description – of a sentence; it might, for example, name the sentence 'There was a major disagreement at yesterday's meeting of the Cabinet'.)
Now take the only sentence in red in these notes, namely: "The only sentence in red in these notes is false". Apply the schema to it. We get: "The only sentence in red in these notes is false" is true iff the only sentence in red in these notes is false. (Call this bi-conditional *.)
But, in view of the fact that the only sentence in red in these notes is "The only sentence
in red in these notes is false", the sentence * is logically contradictory. For assume that
its LHS is TRUE, then it is indeed true that the only sentence in red in these notes is
false, i.e. "The only sentence in red in these notes is false" is false. And this contradicts
the LHS.
On the other hand, if the LHS is FALSE, then it is false that "The only sentence in red in
these notes is false". So, assuming the sentence makes sense, which it certainly seems
to, that sentence must be true but that sentence is "The only sentence...". So we have
now derived that the sentence must be true from the assumption that it is false plus the
truth schema. Thus the truth-schema implies that this sentence is true iff it is false.
The reasoning we just went through is a rather precise version of the so-called
Paradox of the Liar. This originates with Epimenides the Cretan who allegedly said
"All Cretans are liars" (a fact which St. Paul reported without apparently noticing
anything funny about it). Hence the paradox is also sometimes called the Epimenides.
(However, the Epimenides statement is not actually contradictory (Exercise: why not?).
We need something more direct like "I am now lying" or "This present sentence is
false" or the more precise version given above. Suppose we take the direct Liar
version: I am now lying, i.e. what I am now saying is false. This is true if what it states
to be the case is the case; but what it states to be the case is that it is false!)
The term 'paradox' may suggest that it is an engaging puzzle rather than a deep
problem. But this is not correct. As we have seen it refutes the straightforward version
of the apparently obviously correct account of what it means for a sentence to be true.
E1(B): ARE SELF-REFERENTIAL STATEMENTS MEANINGLESS?
Many logicians/philosophers argued that what the Liar Paradox shows us is simply
that self-referring sentences are really meaningless (although they appear
grammatically correct). A sentence is, of course, self-referring if it ascribes some
property to itself. So "This sentence is false" is clearly self-referring and so was: "The
only sentence in red in these notes is false" that sentence too said of itself that it is
false. The suggestion is that such sentences don't really mean anything no wonder
then that they appear to lead to inconsistency; but if we restrict our theory of truth to
meaningful sentences (as we clearly should) then no problem arises.
But is this suggestion tenable? There are all sorts of statements that are self-referring
and yet which seem to make perfect sense and indeed seem to be true. A favourite
example is the car-sticker on the rear window that reads "If you can read this you are
too close". What is "this" here? Well of course, its the sentence "If you can read this you
are too close". In appropriate situations, this sentence, far from being meaningless,
seems to be true.
Moreover, the suggestion would not, even if accepted, solve the problem. For we can
easily construct sentences that singly do not refer to themselves but which together
give a contradiction similar to the liar.
The story goes that someone pushed a visiting card under Bertrand Russell's door, the two sides of which read as follows:
Side 1: 'The sentence on the other side of this card is true.'
Side 2: 'The sentence on the other side of this card is false.'
Neither sentence taken alone refers to itself, so both would be counted meaningful
even if we barred all self-referential statements as meaningless. And indeed if one of them, say the second, were replaced by pretty well any other sentence – say 'Liverpool FC will soon become once again the best football team in the country' – then the visiting card would just amount to an elaborate way of saying that Liverpool FC will soon become once again the best football team in the country, and the remaining first sentence ('The sentence on the other side of this card is true') would present no conceivable difficulty. Similarly, for the second sentence together with any normal sentence: this would just be an elaborate way of saying that the normal sentence is false.
So the sentence The sentence on the other side of this card is true, and the sentence
The sentence on the other side of this card is false taken separately do not self-refer
and taken separately present no problem. But together of course they become
'paradoxical'. Assume that the sentence on the first side is true then it truly states that
the sentence on the other side is true, so the sentence on the other side is true; but then it is true that 'The sentence on the other side [i.e. the first side, which we started from] of this card is false', i.e. the sentence on the first side of the card is false, contrary to our
assumption. So we must assume that the sentence on the first side is false. If so, it
falsely states that the sentence on the other side is true, which can only mean (the
sentence surely makes sense) that the sentence on the other side is false. But then it is
false that the sentence on the other side is false, again contrary to assumption. We have
an inconsistency. This is usually called The Visiting-Card Paradox.
E1(C): TARSKI'S LANGUAGE-HIERARCHY
The rehabilitation of the correspondence theory of truth in the face of the above 'paradoxes' is due to the Polish-American logician Alfred Tarski, who died in 1983. He pointed out that there is a natural distinction between an 'object-language' statement like 'Mary had a little lamb' and a 'meta-language' statement like "'Mary' has four letters" or "'Mary had a little lamb' is the first line of a famous nursery rhyme".
Statements of the first kind are about physical objects (albeit in this case pretend
physical objects), while statements of the second kind are about linguistic objects like
names of individuals (rather than individuals themselves) or like sentences.
'Snow is white' is an object language assertion about the physical entity snow. But the statement "'Snow is white' is true" is a meta-language statement about an object language entity, viz. a sentence. In the meta-language, we can refer both to sentences
and to objects. So, for example, the instance of our truth schema:
'Snow is white' is true iff snow is white
is a meta-language assertion which talks about both a sentence (on its left hand side) and an object – namely, snow (on its right hand side).
Since we can discuss meta-linguistic assertions in turn this points to the existence of
meta-meta-languages in which we can for example make the assertion that a particular
meta-language sentence is, say, false. And then there are meta-meta-meta-languages
and so on. Although in natural languages, like English, we switch without noticing it between levels – English grammar is taught in English – Tarski's suggestion was that
formal correctness requires us to differentiate linguistic levels and in particular to
recognise that whenever we assert that a particular sentence S is true (or false) we do
so in the meta-language of whatever language S happens to be in. More formally, the
predicate "is true" is to be regarded as incomplete: it must always be taken as meaning
"is true in (or "is a true sentence of") particular language L". This predicate cannot be a
predicate of that same language L but must instead be a predicate of L's meta-language
(or some language higher up the hierarchy).
There can be within language L no predicate equivalent to L's truth predicate on pain
of the Liar Paradox being derivable. But, so long as this condition is met, the Liar
Paradox is not derivable.
When I said 'The only sentence in red in these notes is false' this was, if we accept
Tarski's analysis, an incomplete assertion. To complete it we must specify which
language that sentence is in. Let us try to derive the paradox again this time
specifying the language L in which it is expressed. We have:
This is no longer contradictory. Its status depends on how we decide to treat the
separate problem of ascriptions of truth values to statements about non-existent
entities. Is, for example, the statement "The present King of France is bald" true, false
or neither? Following Bertrand Russell's 'Theory of Descriptions' it is usual to analyse
such statements as asserting that there is at least one thing which is the present
king of France and there is no more than one such thing and that thing is bald. This
makes such a statement unambiguously false (the first of the three conjuncts is false).
This would mean that the above sentence in the meta-language of L is not paradoxical
but simply false.
For more information about the Liar Paradox, Tarskis language-hierarchy solution,
and other solutions which have been suggested, consult the Stanford Encyclopedia of Philosophy's 'Liar Paradox' article.
E2: THE PARADOXES OF SET THEORY
Aside from the notion of truth, the other notion involved in the key idea of an
interpretation/model is that of set. It may have occurred to some of you to wonder why
it is required that a particular set be specified as the 'domain of the interpretation'.
Why not take for the domain of any interpretation the universe – the set of all things, concrete and abstract, physical and mathematical? So that when, in Predicate Logic, we say 'for all' we really mean for all of the whole universe of things.
The answer, to put it dramatically if rather obscurely, is that the universe probably
does not exist. (Bertrand Russell once said that he was proud that one could actually
prove that there are fewer things in heaven and earth than are dreamt of in his
philosophy.)
The inventor of set theory, the great mathematician Georg Cantor, took it as obvious
that for any well-defined property there is a corresponding set the set of entities that
have that property. So corresponding to the property 'is red' there is a set, viz. the set of
all red things; corresponding to the property 'is a natural number' there is a set, viz. the
set of all natural numbers. This is indeed the basis of the idea that we can characterize
predicates either intensionally (in terms of their meanings) or extensionally (as
characterised by the set of all entities that satisfy the predicate). We used this idea
in the finite interpretation or finite model technique. Given any predicate Px, it is usual to write its extension as {x | Px}, read as: 'the set of all x such that Px'. Cantor's assumption is nowadays called the naive Principle of Comprehension (or of Abstraction):
Every property determines a set – or, more formally, for any predicate Px, there is a set y such that ∀x(x ∈ y ↔ Px). (Here x ∈ y means, as before, x is an element (or member) of the set y.)
(As we saw when dealing with finite models, the principle is extendable to predicates
with more than one free variable e.g. corresponding to the predicate Rx,y is a set of
ordered pairs (x,y) such that Rx,y holds. So, corresponding to the predicate x>y in the
natural numbers is the set of all ordered pairs of natural numbers such that the first
element of the ordered pair is (strictly) greater than the second element of the pair.)
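In programming terms (a small illustration only – the names are mine), the extension of a one-place predicate over a finite domain is just a set built by comprehension over that domain, and the extension of a two-place predicate is a set of ordered pairs:

domain = range(10)

is_even = {x for x in domain if x % 2 == 0}                    # {x | x is even}
greater = {(x, y) for x in domain for y in domain if x > y}    # {(x,y) | x > y}

print(is_even)             # {0, 2, 4, 6, 8}
print((7, 3) in greater)   # True:  the pair (7,3) satisfies x > y
print((3, 7) in greater)   # False: the pair (3,7) does not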
This principle seems no more than common sense. There are some funny properties, like being a natural number less than 0 or being a round square, which are satisfied by nothing; but these do not challenge Cantor's principle, because one (important) set is the empty set, ∅, which contains no members. Hence these two properties and others like them do have a set as their extension in accordance with Cantor's principle – their extension being the empty set ∅. But, despite the fact that it seems so obviously true as to be trivial, Cantor's principle is not called 'naive' for nothing. It turns out to be wrong – indeed to be logically false – despite its intuitive appeal.
The fact that contradictions can be derived from the naive principle of comprehension
is of wider significance than might at first be thought. Bertrand Russell, and slightly
earlier, the German logician Gottlob Frege, both believed that the whole of mathematics
could be reduced to logic that mathematics consists in the end of nothing but logical
truths. They included Cantor's set theory as part of logic (after all, first-order logic is
the general study of predicates, and following the principle of abstraction, predicates
and sets go hand in hand). It was an enormous blow to this logicist programme in the
foundations of mathematics to discover that set theory, as it stood, was as far from
being logically true as could possibly be it was logically contradictory. The
discovery put the logicist programme into a turmoil from which, according to most
thinkers, it has never fully recovered. Why exactly is the naive principle of
comprehension logically inconsistent?
E2(B): RUSSELL'S PARADOX
Given that every property determines a set, then any property that applies to sets also
determines a set one whose members themselves happen to be sets. This is certainly
not problematic in itself. For example, the property of being a set with an even number
of members determines a set viz. the set of all sets which have an even number of
members. The property of being a set of natural numbers determines the set of all sets
of natural numbers.
This means that we can ask whether one set X is, or is not, a member of another set Y.
And we can sensibly ask whether a set is a member of itself. Most sets are in fact not
members of themselves in order to be members of themselves they would have to
satisfy their own defining characteristic. And most sets don't. E.g., the set of all England
cricketers is not itself an England cricketer (it's a set, after all, not a person!) and so is
not a member of itself. The set of all physical objects in our galaxy is not itself a
physical object in our galaxy (its an abstract mathematical entity) and so not a
member of itself.
Because most sets, indeed all of those we might normally think of, are not members of themselves – because they don't satisfy the predicate that has them as its extension – such sets are called NORMAL sets. A few sets are abnormal, though it takes a bit of
ingenuity to think of examples. The easiest way is to think of 'negative properties' like
the property of not being an England cricketer. The set of all things which are not
England cricketers is not itself an England cricketer and so is a member of itself. The set
of all non-marijuana smokers does not itself smoke marijuana and so is a member of
itself. There are also a few non-negative examples of abnormal sets, like the set of all
abstract entities (itself an abstract entity and therefore a member of itself), and, more
importantly, the set of all sets itself a set and therefore a member of itself.
Bertrand Russell showed that we can derive a contradiction from the naive principle of
comprehension by considering the property 'is a normal set'. This is a reasonable
property as we just saw, some sets (intuitively the vast majority of sets) satisfy the
property, while a few sets, and of course all individuals (non-sets) do not satisfy it.
According to the principle, this property determines a set: the set of all normal sets, call
it N. We can now ask whether N is itself normal normality is a property of sets and N
is a set, so this is a question we must be able to ask.
Well N either is normal or it isn't. Assume first that N is normal, then N is a member of
the set of all normal sets, but that is N and so N is a member of itself. This is the
defining characteristic of abnormality. So if we assume N is normal we can derive that it
isn't.
We must conclude that N is not normal; but sets that are not normal are by definition
members of themselves, so N is a member of the set of all normal sets, which means of
course that it is normal. So if we assume N is not normal we can derive that it is.
(Exercise: In view of the close connection between predicates and sets (intension and
extension) it is not surprising that a paradox closely related to Russell's can be derived
for properties. Define a monadic property as 'heterological' if it fails to apply to itself.
'Long', for example, is not long and so is heterological; 'in German' is in English, not German, and so is heterological. On the other hand, 'short' is itself short, and 'in English'
is itself in English, and so both of these predicates are homological (sometimes
'autological', but in any event not-heterological). You should be able to derive a
contradiction or 'paradox' by asking "Is the property of being 'heterological' itself heterological?" Do it carefully!) For more on this paradox, which is sometimes called the Grelling Paradox, or the Grelling-Nelson Paradox after its original authors, click here. There are a great many similarly-structured paradoxes and pseudo-paradoxes which you may also enjoy – Wikipedia has a good list.
2. ∃y∀x(x ∈ y ↔ ¬(x ∈ x))
3. ∀x(x ∈ α ↔ ¬(x ∈ x)) ES, 2
4. α ∈ α ↔ ¬(α ∈ α) US, 3
(4) is, of course, a truth-functional contradiction.
This derivation, by the way, is in first order logic from line 2 onwards. We cannot fully
express line 1 in first-order logic; to do so we would need to quantify not just over
individuals (this includes sets considered as individuals) but also over predicates. This
is not possible in first-order logic. This is why that logic is called 'first-order'. In
second-order logic we do quantify over predicates as well as individuals, and the
step from 1 to 2 becomes a simple instance of the rule of universal specification in that
wider (and, as it turns out, interestingly problematic) system. Although nice and neat
this formal derivation does not really capture the paradoxicality of the paradox!
E2(C): CANTOR'S PARADOX
It turned out that Cantor himself was already aware (before Russell's demonstration)
that his set theory is strictly inconsistent. Various other 'paradoxes' are in fact
derivable within the theory – foremost amongst these is the one named after, and discovered by, Cantor himself. This involves the property 'x is a set' – this seems like an entirely OK property: it is satisfied by all sets and not satisfied by non-sets (individuals). According to the naive principle of abstraction, that property determines a set – viz. the set of all sets: the 'universal' set U. (So the whole universe would consist of U together with all individuals.)
The assertion that U exists, however, can be shown to be contradictory – though, unlike the straightforward Russell case, here a little work in set theory is required in order to exhibit the inconsistency.
Subsets:
X ⊆ Y iff ∀x(x ∈ X → x ∈ Y)
So, e.g., {1,2} ⊆ {1,2,3} (but ¬(1 ⊆ {1,2,3}) – 1 is a member, not a subset); and {{1}} ⊆ {{1}, {2}, {3}} (but ¬({1} ⊆ {{1}, {2}, {3}}) – again, {1} is a member, not a subset).
X is a proper subset of Y, written X ⊂ Y, iff X ⊆ Y and ∃x(x ∈ Y & ¬(x ∈ X)) – that is, every element of X is in Y, but at least one element of Y is left out of X. In other words, X is a proper subset of Y if and only if Y contains everything in X, and something more.
One slightly odd fact is that the empty set, written ∅, the set with no members, is a subset of every set, since of course it is true for any set X that ∀x(x ∈ ∅ → x ∈ X). (Exercise: explain carefully why.)
Another slightly odd fact is that every set is a subset of itself (but of course not a proper subset). (Same exercise.)
The power set of a set X, written P(X), is the set of all subsets of X. So, if X is the set {1,2,3} then P(X) is:
{∅, {1}, {2}, {3}, {1,2}, {1,3}, {2,3}, {1,2,3}}
The use of the term 'power set' stems from the fact that if the initial set X has n members then P(X) has 2^n members. So in this case X has 3 members and P(X) has 2^3 = 8. If X is {{1,2},3} then P(X) is {{{1,2},3}, {{1,2}}, {3}, ∅}.
(i) {{1,2}}
(ii) {{1}, {2}, {3}}
(iii) {{{1,2}}, 3}
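A quick computational check of these definitions for small finite sets (the helper function below is mine):

from itertools import chain, combinations

def powerset(xs):
    """All subsets of xs, as frozensets (the empty set included)."""
    xs = list(xs)
    return [frozenset(c) for c in
            chain.from_iterable(combinations(xs, r) for r in range(len(xs) + 1))]

X = {1, 2, 3}
PX = powerset(X)
print(len(PX))               # 8 = 2**3 subsets
print(frozenset() in PX)     # True: the empty set is one of them
print({1, 2} <= X, X <= X)   # True True: {1,2} is a subset of X; X is a subset of itself
print(1 in X, {1} <= X)      # 1 is a member of X; {1} is a subset of X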
We need just a couple more ideas from set theory, the first of which is that of a one-to-one correspondence. There are apparently societies that do not have the natural number system (e.g. according to the anthropologist Benjamin Lee Whorf this was true of the Hopi Indians, who had only the idea of one, two, and many). A member of such a society could nonetheless decide whether the number of, say, chairs in a given room was the same as the number of people in that room. Without counting either set, he or she could attempt to effect a one-to-one correspondence between the two sets – that is, try to associate each chair with one and only one person. If this attempt succeeded, he could infer that there are as many chairs as people in the room – however many that happens to be. If there are on the contrary always some chairs left over after any attempted pairing, then there are more chairs than people, and if there are always people left standing without a chair, then the set of people is bigger than the set of chairs.
All this applies to any two sets. This suggests that the notions same number as and
bigger (or smaller) number than are logically prior to the notion of number itself.
More formally, two sets X and Y are said to have the same number (or to have the same
cardinality or to be equinumerous) if there is a one-to-one correspondence f
between X and Y. The cardinality of set X may be written |X| and so X and Y have the
same cardinality, or same size, written |X| = |Y| iff there is a 1-1 correspondence f
between X and Y.
If there is a 1-1 correspondence between X and some subset Y′ of Y, then |X| ≤ |Y|; and |X| < |Y| just in case |X| ≤ |Y| and ¬(|X| = |Y|). (This last definition might seem to be
unnecessarily complicated. If there is a one-to-one correspondence between the whole
of X and some proper subset of Y (this would correspond to the situation in which we
attempted a one-to-one correspondence between the set of seats in some lecture room
and the set of students attending a lecture and there were seats left over) then surely
there are strictly more members in Y than in X (so in the case of seats left over, strictly
more seats than students). This is indeed true in the case of finite sets, but,
fascinatingly, not so in the case of infinite sets. Indeed, it turns out to be an invariable
trait of infinite sets that they always contain proper subsets which have the same
cardinality as the whole set! Hence the slightly complicated looking definition of |X| <
|Y|.)
The straightforward, and indeed seemingly obvious, idea that two sets have the same
number of elements just in case there is a one-to-one correspondence between them
has some surprising consequences in the case of infinite sets (for finite sets it yields
only completely unsurprising consequences that, e.g. |X| = |Y| iff they have the same
number n of elements).
One example is 'Galileo's Paradox' – that there are as many even natural numbers as there are natural numbers. This is because f(x) = 2x is a one-one correspondence between the whole set of natural numbers N and the set of even natural numbers E. This is only a 'paradox' in the sense that it is rather odd (paradox means outside or beyond orthodoxy); but no formal inconsistency is involved: the result that Galileo proved is indeed the correct result. We just, in general, need to get used to the idea that X may be a proper subset of Y, i.e. ∀x(x ∈ X → x ∈ Y) and ∃x(x ∈ Y & ¬(x ∈ X)), and yet X and Y have exactly the same number of elements, i.e. |X| = |Y|. (In fact, as I already noted, it turns out to be true of every infinite set that it has proper subsets of the same cardinality as itself.)
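Running code cannot, of course, verify a claim about an infinite set, but the following small sketch at least exhibits the correspondence f(x) = 2x doing its work on an initial segment of the natural numbers: it is one-to-one, and it pairs that segment exactly with the matching even numbers.

N = range(1000)                       # an initial segment of the natural numbers
f = {n: 2 * n for n in N}             # the correspondence f(x) = 2x

assert len(set(f.values())) == len(N)            # one-to-one: no two n get the same value
assert set(f.values()) == {2 * n for n in N}     # and onto the matching segment of evens
print("f(x) = 2x pairs 0..999 exactly with the even numbers 0..1998")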
Cantor could also straightforwardly prove that the cardinality of the set of rational
numbers (natural numbers plus ratios of natural numbers) is the same as that of the
set of natural numbers, despite the fact that there are infinitely many rationals between
any two natural numbers (the rational numbers are dense). You can find a simple
interactive demonstration of the result here.
The suggestion arises that all infinite sets may just have the same cardinality. This would make set theory relatively boring. Cantor in fact proved that this was not true when he proved that the set of all real numbers is of a higher infinity than the infinity of the (counting) natural numbers. (The real numbers, which can be presented in terms of their decimal expansions, are all the points on the real line, and include natural numbers and rational numbers and lots more besides: √2, for example, although certainly a real number – it's a point on the real line – is not a rational number.) Cantor
showed this by showing that there is no one-to-one correspondence between the set
of the reals and the set of natural numbers (and so since the set of natural numbers
forms a proper subset of the reals, there must be strictly more reals than there are
naturals). But both sets are of course infinite, so the result shows that there are orders
of infinity some infinities are greater than others.
In fact, Cantor proved the stronger result that there is no one-to-one correspondence
between {Naturals} and {reals between 0 and 1}! The proof, like many deep results in
mathematics, is by reductio ad absurdum. We assume that there is a one-to-one
correspondence between those two sets, deduce a contradiction, and so infer that there
can be no such correspondence. Here is how it goes, in outline:
Suppose that there was a one-to-one correspondence f between {Naturals} and {reals
between 0 and 1}. Assuming such a correspondence between any set X and {Naturals}
amounts to the assumption that X can be enumerated or counted that is, written as an
infinite list, without any member of X being left out: the first element of X in the list is
the element associated by f with the number 1, the second is the element associated by
f with the number 2, and so on.
So, if the set {real numbers between 0 and 1} can be placed in one-to-one
correspondence with {Naturals} then that set of reals can be written as a list. So think
of all the reals between 0 and 1 (specified by their decimal expansion 0.13579865 or
whatever) written as an infinite list in any order that you like. Suppose our list is:
f(1) = 0. 0 1 4 5 3 2 1 3
f(2) = 0. 1 3 4 5 1 1 2 3
f(3) = 0. 9 6 5 3 4 2 9 9
f(4) = 0. 0 0 0 0 0 0 8 7
f(5) = 0. 0 0 1 2 3 5 6 1
f(6) = 0. 7 7 7 8 7 5 4 3
f(7) = 0. 1 8 6 3 6 8 4 1
You can then form the 'diagonal' element out of that list: that is, form the number 0.a11a22a33…ann…, where a11 is the first digit in the decimal expansion of the real number, whatever it may be, that is first in the list (of course this will be a digit between 0 and 9 inclusive), a22 is the second digit in the decimal expansion of the second number in the list, and so on. So, in our list, the diagonal digits are the ones picked out in brackets:
f(1) = 0. [0] 1 4 5 3 2 1 3
f(2) = 0. 1 [3] 4 5 1 1 2 3
f(3) = 0. 9 6 [5] 3 4 2 9 9
f(4) = 0. 0 0 0 [0] 0 0 8 7
f(5) = 0. 0 0 1 2 [3] 5 6 1
f(6) = 0. 7 7 7 8 7 [5] 4 3
f(7) = 0. 1 8 6 3 6 8 [4] 1
Now form the 'anti-diagonal' element by going down the diagonal and changing every digit – replacing each ann by some different digit (say by 5 whenever ann is not 5, and by 6 when it is). Call the anti-diagonal element d. d is a real number between 0 and 1 but it cannot be on the list. (Try to work out why before reading on.)
If it were on the list, then it would have to appear at some finite point on it. (It is a deep
fact about the list of natural numbers that although the list is infinite, every element on
it appears at some finite place all the infinitely many natural numbers are finite!) So,
there must be some natural number m such that d appears at the mth place. But that
can't be true since d, by construction, differs from whatever number it is that appears at the mth place in the mth place of their decimal expansions. So, the assumption that we can produce a one-to-one correspondence between {Naturals} and {reals between 0 and 1} leads to contradiction. Hence, there is no such one-to-one correspondence and
so the infinity of the reals, even the reals between 0 and 1, is a higher infinity than the
infinity of the natural numbers.
This is the easiest case of Cantor's diagonal method. There is a slight wrinkle – ensuring that the anti-diagonal d does not end in an infinite tail of 9s (a real number can then have two different decimal expansions) – that some of you at least might like to think through. (The problem is easily overcome.)
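The construction is easy to mechanise. The sketch below (an illustration only, using the digit strings from the list above) builds the anti-diagonal from a finite initial piece of a purported list and checks that it differs from the n-th listed number at the n-th place:

listed = ["01453213", "13451123", "96534299", "00000087",
          "00123561", "77787543", "18636841"]      # digits after '0.' for f(1), f(2), ...

def anti_diagonal(rows):
    # change every diagonal digit; using only 5s and 6s also avoids any tail of 9s
    return "".join("5" if row[i] != "5" else "6" for i, row in enumerate(rows))

d = anti_diagonal(listed)
print("d = 0." + d)
for n, row in enumerate(listed):
    assert d[n] != row[n]      # d differs from the n-th listed number at its n-th place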
What has all this, fascinating as it may be, to do with the set of all sets leading to a
paradox? Well Cantor generalised his diagonal result as follows:
The cardinality of the power set of a set X is strictly greater than the cardinality of X
itself. (So for example the cardinality of the set of all sets of natural numbers, i.e. P(N),
is greater than that of the set of natural numbers N itself. So, put dramatically, the
infinity of the set of all sets of natural numbers is even more infinite than the set of
natural numbers itself.)
Proof:
To prove that |X| ≤ |P(X)| we need only show that there is a one-to-one correspondence between the whole of the set X and some subset of P(X). The function f(a) = {a}, i.e. the function that maps any element a of X onto the set whose only member is that element, clearly is such a correspondence: the set of all singletons (all sets which contain just one member of X) is clearly a subset, indeed a proper subset, of P(X), the set of all subsets of X.
Now suppose, for reductio, that there were a one-to-one correspondence f between X and the whole of P(X). Consider the set X* = {a ∈ X | ¬(a ∈ f(a))}, i.e. the set of those elements of X that are not members of the subset that f associates with them. X* is a subset of X, so X* ∈ P(X); and since f is, by assumption, a correspondence between X and the whole of P(X), there must be some a ∈ X with f(a) = X*.
But now, trivially, either a ∈ X* or ¬(a ∈ X*). Assume a ∈ X*, i.e. a ∈ {a ∈ X | ¬(a ∈ f(a))}, and so, by the definition of X*, ¬(a ∈ f(a)). But f(a) = X*, so ¬(a ∈ X*). Hence the assumption that a ∈ X* proves untenable. Therefore ¬(a ∈ X*). But, since X* = f(a), this means that ¬(a ∈ f(a)); hence a satisfies the defining characteristic of the set X* and so a ∈ X*. This is a contradiction.
The assumption that there is a one-one correspondence between X and the whole of P(X) entails a contradiction and so must be false. This means that ¬(|P(X)| = |X|); and since the first part of the proof easily yielded that |X| ≤ |P(X)|, we finally have Cantor's theorem that |P(X)| > |X|.
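For finite sets the theorem can be checked exhaustively, and doing so makes the diagonal trick concrete: the sketch below runs through every function f from a three-element X to its power set and confirms that the 'diagonal' set {a ∈ X | a ∉ f(a)} is never among the values of f, so no such f is onto.

from itertools import product, chain, combinations

def powerset(xs):
    xs = list(xs)
    return [frozenset(c) for c in
            chain.from_iterable(combinations(xs, r) for r in range(len(xs) + 1))]

X = [0, 1, 2]
PX = powerset(X)

for images in product(PX, repeat=len(X)):        # every function f : X -> P(X)
    f = dict(zip(X, images))
    diagonal = frozenset(a for a in X if a not in f[a])
    assert diagonal not in f.values()            # f always misses the diagonal set

print("no function from X to P(X) is onto, so |P(X)| > |X|")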
Cantor's theorem thus yields an unending hierarchy of ever greater infinities: |N| (where N as usual is the set of all natural numbers), |P(N)|, |P(P(N))|, |P(P(P(N)))| and so on ad infinitum.
(It is easy to show that the 'continuum' – the set of all real numbers, all points on the real line – can be put in one-to-one correspondence with the set of all subsets of the naturals, i.e. P(N).)
So far we have a theorem, not a 'paradox'. The 'paradox' arises by considering the set of all sets. The naive principle of comprehension entails that this set exists, since it is the extension of the property 'x is a set'. Call this the 'universal' set U.
By Cantor's Theorem, |P(X)| > |X| for any X, and so in particular |P(U)| > |U|. P(U) is of course the set of all subsets of the set of all sets. This means it is certainly a set of sets and so must itself be a subset of U. (∀x(x ∈ P(U) → x ∈ U) is true since every element of P(U) is a set and all sets are in U.) But it is easy to see that for any two sets X and Y, if X ⊆ Y then |X| ≤ |Y|. This is because |X| ≤ |Y| requires only that there be a one-to-one correspondence between X and a subset of Y, and if X itself is a subset of Y then the identity mapping (which associates any element with itself) is such a one-to-one correspondence.
Hence, since P(U) ⊆ U, we have |P(U)| ≤ |U| – and this contradicts the consequence of Cantor's theorem that |P(U)| > |U|.
E2(D): SOLUTIONS OF THE PARADOXES
Once the paradoxes had been spotted it proved possible to revise set theory in various
ways so as to avoid them. Two axiomatic systems in particular were produced:
Zermelo-Fraenkel Set Theory and von Neumann-Bernays-Gödel set theory. Neither system of course contains the full (naive) principle of comprehension since, if it did, it would be inconsistent. In Z-F, e.g., it is replaced by the 'Axiom of Subsets', which
states that, given a set, any property determines a subset of it. Although it cannot be
proved that either system is consistent (and so absolutely free from any 'paradoxical'
derivation), it can be shown that the usual paradoxical reasoning (e.g. in the Russell
and Cantor cases) is definitely blocked in either system.
These axiomatic systems are satisfactory from the mathematical point of view. Set
theory was, however, intended to play an additional, foundational role. In particular,
as I mentioned, the logicists Frege and Russell who set out to show that mathematics
'reduces' to logic, regarded set theory as a legitimate part of logic itself. The problem
with the axiomatic set theories from this point of view is that the restrictions they
impose on set existence seem rather ad hoc – aimed simply at avoiding the paradoxes and not themselves 'self-evident' as one would hope (at any rate on reflection) any truly logical principle would be. Russell himself adopted a different approach: Type Theory. Russell suggested that the universe of sets should be regarded as stratified or hierarchical in structure. Every element is of a definite type. At type level 0 are individuals (non-sets). (These, it turns out, can be eliminated, but we need not worry about this.) At type level 1 are sets of individuals; at type level 2, sets of sets of individuals; and so on.
Each object in the universe of sets has a type indicated by a subscript, variables vary only over objects of a given type and so they too have type subscripts. The fundamental rule of type theory is that any formula of the form xi ∈ yj (where i and j are the subscripts indicating the type level) is well formed (meaningful) only if j = i + 1. In other words, it can only be sensibly asserted that one set is, or is not, an element of a set of the next higher type. Any other membership assertion is meaningless. In particular, the assertion xi ∈ xi is not well formed – that is, one cannot meaningfully assert in type theory that a set is a member of itself (and nor, therefore, that a set is not a member of itself). Hence the reasoning that led to the Russell paradox cannot even get started. It can also be shown that, while in type theory there is a set of all sets of type level i (that set itself being of level i+1) for any i, there is no truly universal set – i.e. set of all sets of whatever type level. This blocks Cantor's Paradox.
The problem again from the foundational point of view is to say why this type-level
stratification is 'natural' or 'obvious', once we have cleaned our logical spectacles.
Otherwise the theory appears like another ad hoc manoeuvre simply designed to avoid
the paradoxes. Russell tried to justify the stratification using his 'vicious circle
principle'. The exact import and effect of this principle is still a matter of some dispute.
Historically, however, Russell's justification was not accepted and it was generally felt
that the logicist programme came to grief over the paradoxes. Whatever the reason for
the truth of mathematics, it was not that mathematics consists simply of logical truths.
Some further reading:
Several articles from the Stanford Encyclopedia of Philosophy provide more detail, and
the further reading and references in each contain more information than one could
conceivably need: