Introduction To Modal Logic
JOEL MCCANCE
Abstract. Modal logic extends classical logic with the ability to express not
only "P is true," but also statements like "P is known" or "P is necessarily true."
We will define several varieties of modal logic, providing both their semantics
and their axiomatic proof systems, and prove their standard soundness and
completeness theorems.
Introduction
Consider the statement "It is autumn," thinking in particular of the ways in
which we might intend its truth or falsity. Is it necessarily autumn? Is it known
that it is autumn? Is it believed that it is autumn? Is it autumn now, or will it
be autumn in the future? If I fly to Bombay, will it still be autumn? All these
modifications of our initial assertion are called by logicians modalities, indicating
the mode in which the statement is said to be true. These are not easily handled by the truth tables and propositional variables of the propositional calculus
(PC) taught in introductory logic and proof courses, so logicians have developed
an augmented form called modal logic. This provides mathematicians, computer
scientists, and philosophers with the symbols and semantics needed to allow for
rigorous proofs involving modalities, something that until recently was thought to
be at best pointless, and at worst impossible.
This paper will provide an introductory discussion of the subject that should be
clear to those with a basic grasp of PC. We will begin by discussing the history
and motivations of modal logic, including a few examples of such logics. We will
then introduce the definitions and concepts needed for a rigorous discussion of the
subject. The third section will show that these systems behave themselves; that
is, the set of provable statements and the set of true statements are identical. We
will conclude with a discussion of some of the technical complexity of introducing
quantification into modal logic.
1. History and Motivations
As with its classical cousin, the modern interest in modal logic begins with
Aristotle.1 In addition to his syllogisms dealing with categorical statements, the
Greek thinker wished to formalize the logical relationships between what is, what is
necessary, and what is possible. Unfortunately his treatment of modality suffered
from a number of flaws and confusions, and while his categorical syllogisms became
a staple of classical education, modal logic was dismissed as a failure. Kant and
Frege both argued that modalities added no pertinent information to an argument,
merely hinting at why we might believe a given statement to be true. They believed
that no more or less could be derived from the modal form of a statement P than from
P itself.
The author would like to thank Professor Jen Brown for her helpful comments and Professor
Bob Milnikel for his advice and support.
1 The historical information in this section is drawn from Fitting [1].
This claim has come to be seen as false. After all, if two statements are equivalent, they ought to imply each other. It seems reasonable to say that if P is the case
then P must be a possible state of affairs, since what is true cannot be impossible.
However, it is quite a bit less obvious to say that because P is possible, P is the
case. It is possible, for example, that my bearded friend Charles is actually a very
hirsute woman, yet there is no reason to believe that this is actually true. So while
actuality implies possibility, possibility does not imply actuality. It seems, then,
that there is more to modality than Kant and Frege suspected; modal statements
are not quite equivalent to their nonmodal counterparts.
With the growing acceptance of such arguments in the past fifty years or so,
there has been a revival of interest in modal logic, the product of which has been
a number of interesting new modalities. Temporal logic, for instance, considers
whether P is true now, will be true at some point in the future, or has been true
in the past. This is actually a multimodal logic since it uses a number of modes
of truth within a single language. Epistemic logic, the logic of knowledge and
knowing, can also be multimodal. The modality in this case is that of whether
agent A knows P or, alternatively, whether P could be true given what A knows.
Related is the interpretation of $\Box$ which reads $\Box P$ as "P is provable," a reading that
has great use in the field of mathematical logic.
A final, interesting twist on the theme is action-based logics. As the name would
imply, what we are interested in here is how actions alter the truth of a statement.
The example in the introduction of flying to Bombay would be an action modality.
Say "It is autumn where I am" is true and "I am in Ohio" is true. Then after flying
to Bombay, both these statements will become false, as it cannot be autumn on
both sides of the world simultaneously and, of course, I will no longer be in Ohio.
While trivial in this example, a multimodal action-based logic provides a means of
teasing out the finer implications of complex sequences of actions.
2. Propositional Modal Logic
Any complete system of logic needs at least three components: a rigorous language for writing out the statements in question, a means of interpreting the statements and determining their truth value, and a means of writing proofs.2
2.1. The Language of Modal Logic. Any language, be it English, Chinese, or
the language of logic, must have symbols and rules for combining them. In a
spoken language the symbols are words and the rules are grammar. Our case is
analogous, although rather than inventing a new system from whole cloth, modal
logic begins with the familiar language of PC and just adds two new operators to
handle modality. The result is the following set of symbols:
A countably infinite set of letters $A$, $X$, $P_1$, $P_2$, &c., called propositional
variables;
The unary operators $\neg$, $\Box$, and $\Diamond$;
The binary operators $\Rightarrow$, $\land$, and $\lor$; and
Brackets, ( and ).
2The definitions in this section are adapted from Hughes [3] and Fitting [1].
$P$    $P \lor \neg P$
T      T
F      T
Figure 1. Truth table for $P \lor \neg P$.
Note the recursion of this definition, first defining a base case (called the atomic
formulae) and going on to define more complex formulae in terms of those wffs
already known to us. This will give us a powerful tool for future proofs where
we first prove something about isolated propositional variables and go on to show
that it holds for the second and third formation rules listed above. This is called
induction on the complexity of a formula, and will play a crucial role in our proofs
for soundness and completeness below.
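The recursive shape of the definition can be mirrored directly in code. The sketch below assumes the formation rules take their standard form (a propositional variable is a wff; a wff prefixed by $\neg$, $\Box$, or $\Diamond$ is a wff; two wffs joined by $\Rightarrow$, $\land$, or $\lor$ form a wff). The names `Var`, `Unary`, `Binary`, `complexity`, and `show` are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass
from typing import Union

# Assumed operator names; the symbols are only used for printing.
SYMBOLS = {"not": "¬", "box": "□", "dia": "◇",
           "implies": "⇒", "and": "∧", "or": "∨"}


@dataclass(frozen=True)
class Var:
    """Base case: an atomic formula (propositional variable)."""
    name: str


@dataclass(frozen=True)
class Unary:
    """A wff built by prefixing one of 'not', 'box', 'dia' to a wff."""
    op: str
    arg: "Formula"


@dataclass(frozen=True)
class Binary:
    """A wff built by joining two wffs with 'implies', 'and', or 'or'."""
    op: str
    left: "Formula"
    right: "Formula"


Formula = Union[Var, Unary, Binary]


def complexity(f):
    """Count operators, recursing once per formation rule."""
    if isinstance(f, Var):
        return 0
    if isinstance(f, Unary):
        return 1 + complexity(f.arg)
    return 1 + complexity(f.left) + complexity(f.right)


def show(f):
    """Render a formula, fully bracketing binary connectives."""
    if isinstance(f, Var):
        return f.name
    if isinstance(f, Unary):
        return SYMBOLS[f.op] + show(f.arg)
    return "(" + show(f.left) + " " + SYMBOLS[f.op] + " " + show(f.right) + ")"


excluded_middle = Binary("or", Var("P"), Unary("not", Var("P")))
```

Both functions have exactly one branch per formation rule, which is the same pattern a proof by induction on the complexity of a formula follows: one case for the atoms, one for the unary operators, one for the binary operators.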
2.2. Determination of Truth. In PC we are primarily interested in the tautologies. These are the formulae that are true no matter what the truth of the
propositions involved, things like $(P \land Q) \Rightarrow P$ and $P \lor \neg P$. These are called
the valid formulae of PC and are determined by simply checking the truth table
for the formula in question. For example, we can tell that $P \lor \neg P$ is valid because
every line of the truth table in Figure 1 comes out to T.
The situation for modal logic is somewhat more complex. After all, the whole
point of our new symbols is to indicate a judgement that is independent of the truth
of the formula in question. Nonetheless, we want an extension of the same intuitive
notion: that the valid formulae are those that are "true no matter what" or "true
in all situations." The difference is that now it is possible that some propositions,
while not tautological, will nonetheless always evaluate true. For example, it has
been a popular move in theology to claim that it is necessary that there exist a
greatest conceivable being. Such theologians are not generally claiming that God's
existence is a tautology, but rather that in every conceivable world the proposition
"God exists" is true. Therefore, they argue, it is a necessary truth that God exists.
3The parentheses in this and the preceding rule, while crucial in some instances for rigor, will
often be omitted when the meaning is clear from context and convention. This way we will not
have to write $((\neg P) \land Q)$ when $\neg P \land Q$ is perfectly understandable.
It is this reading of "necessarily true" as "true in all possible worlds" that leads
to the most popular interpretation of modal logic: Kripke's many-world semantics.
Under this interpretation, the truth of a statement is relative to the world in question. For propositional formulae, this is determined simply by examining the state
of affairs in that world. So if P and Q are both true in the current world, $P \land Q$
will be true in this world. The more interesting case comes with our new operators,
$\Box$ and $\Diamond$. $\Box P$ is defined to be true in a world whenever P is true in all accessible
worlds. How we define accessibility depends on the modality, but "conceivable" is a
common one for the necessary/possible modality. So if P is true in all conceivable
worlds, $\Box P$ is true; that is, P is necessarily true. $\Diamond P$ is similar, although in this
case the modality is that of possibility. If P is true in at least one accessible world,
$\Diamond P$ will be true as well: since it is true somewhere, it must not be impossible.
To consider another example, say we were using $\Box P$ to mean "X believes P,"
where X is some person or ideology. The possible worlds here are not really worlds
at all, but people, ideologies, institutions: anything that can be said to believe
a proposition. Accessibility in this case is interpreted as "trusts." So if Platonists
trust physics, then physics is accessible to Platonism. More elaborately, say that
Russell is a node in this network of people and ideologies. For this example, say
Russell trusts physics, Richard Rorty, and atheism exclusively. Then $\Box P$ is only
true for Russell if P is true for physics, Richard Rorty, and atheism. If P is
false in any one of these, then Russell does not believe P. Similarly, if only Richard
Rorty believes P, then $\Diamond P$ is true for Russell. He can see how one would believe
P, but is not fully convinced. Finally, note that the truth or falsehood of P in
Scientology will have no effect on what Russell believes because he does not trust
it; Scientology is not accessible to Russell by the relation "trusts."
With these examples in mind, we can now rigorously define the semantics of our
symbols.
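Definition 2.2. Let W be a non-empty set of what we will call possible worlds.
Let R be a binary relation from W to W, which we will call an accessibility relation.
Together, $\langle W, R \rangle$ form a frame.
Definition 2.3. Let $\langle W, R \rangle$ be a frame and let $\Vdash$ (read "forces") be a binary relation
between W and the set of all well-formed formulae. Let $\Gamma \in W$. We will assume
that $\Vdash$ obeys the following rules:
For all propositional variables P, either $\Gamma \Vdash P$ or $\Gamma \nVdash P$.
If $\varphi$ is a wff, then $\Gamma \Vdash \neg\varphi$ if and only if $\Gamma \nVdash \varphi$.
If $\varphi$ and $\psi$ are wffs, then $\Gamma \Vdash (\varphi \lor \psi)$ if and only if $\Gamma \Vdash \varphi$ or $\Gamma \Vdash \psi$.
If $\varphi$ and $\psi$ are wffs, then $\Gamma \Vdash (\varphi \land \psi)$ if and only if $\Gamma \Vdash \varphi$ and $\Gamma \Vdash \psi$.
If $\varphi$ and $\psi$ are wffs, then $\Gamma \Vdash (\varphi \Rightarrow \psi)$ if and only if $\Gamma \nVdash \varphi$ or $\Gamma \Vdash \psi$.
If $\varphi$ is a wff, then $\Gamma \Vdash \Box\varphi$ if and only if for every $\Delta \in W$, $\Gamma R \Delta$ implies that $\Delta \Vdash \varphi$.
If $\varphi$ is a wff, then $\Gamma \Vdash \Diamond\varphi$ if and only if there exists a $\Delta \in W$ such that $\Gamma R \Delta$ and $\Delta \Vdash \varphi$.
Together, $\langle W, R, \Vdash \rangle$ is a propositional modal model, which we will generally shorten
to model.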
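The forcing rules translate almost line for line into a recursive evaluator. The following is a minimal sketch, not a definitive implementation: it assumes formulas are encoded as nested tuples such as `("box", ("var", "P"))`, that a model is a triple of a set of worlds, a set of accessibility pairs, and a map from each world to the variables true there, and the name `forces` is hypothetical. The two sample models encode the Russell example from the text.

```python
def forces(model, world, formula):
    """Return True when `world` forces `formula` in `model`.

    model = (worlds, access, valuation), where `access` is a set of
    (w, v) pairs meaning "v is accessible from w" and `valuation`
    maps each world to the set of variable names true there.
    """
    worlds, access, valuation = model
    op = formula[0]
    if op == "var":
        return formula[1] in valuation[world]
    if op == "not":
        return not forces(model, world, formula[1])
    if op == "or":
        return forces(model, world, formula[1]) or forces(model, world, formula[2])
    if op == "and":
        return forces(model, world, formula[1]) and forces(model, world, formula[2])
    if op == "implies":
        return (not forces(model, world, formula[1])) or forces(model, world, formula[2])
    accessible = [v for v in worlds if (world, v) in access]
    if op == "box":   # true in every accessible world
        return all(forces(model, v, formula[1]) for v in accessible)
    if op == "dia":   # true in at least one accessible world
        return any(forces(model, v, formula[1]) for v in accessible)
    raise ValueError("unknown operator: " + repr(op))


# The belief example: accessibility is "trusts".
worlds = {"Russell", "physics", "Rorty", "atheism", "Scientology"}
trusts = {("Russell", "physics"), ("Russell", "Rorty"), ("Russell", "atheism")}
P = ("var", "P")

# P holds for everything Russell trusts.
all_agree = (worlds, trusts, {"Russell": set(), "physics": {"P"},
                              "Rorty": {"P"}, "atheism": {"P"},
                              "Scientology": set()})
# P holds only for Rorty (and for Scientology, which Russell does not trust).
only_rorty = (worlds, trusts, {"Russell": set(), "physics": set(),
                               "Rorty": {"P"}, "atheism": set(),
                               "Scientology": {"P"}})
```

In `all_agree`, Russell forces `("box", P)`; in `only_rorty` he forces `("dia", P)` but not `("box", P)`, and changing the value of P at Scientology alters neither result.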
There are a few items of note in this definition. First, observe that all but the
last two cases are identical to the truth-table semantics of PC. Only the final cases
add anything new. It is also worth observing that just as we can rewrite $\land$ and $\lor$
using just negation and implication, we can also rewrite $\Diamond$ in terms of negation and
$\Box$. Think of what it would mean for $\Diamond P$ to be false in a world. This would mean
that in every accessible world P is false, so the negation of P will be true in all
accessible worlds; that is, $\Box\neg P$ is true. So $\Diamond P$ is equivalent to $\neg\Box\neg P$. A helpful
analogy here is to think of why $(\exists x)(P(x))$ is equivalent to $\neg(\forall x(\neg P(x)))$. To say
that there is an x such that P(x) is true is to say that it is false that for every x,
P(x) is false. Similarly, $\Diamond X$ simply asserts that it is false that X is false in every
accessible world.
Now that we have a clear interpretation of our symbols, we can finally define
what it means for a formula to be valid in a given system.
Definition 2.4 (L-valid). Let $\langle W, R, \Vdash \rangle$ be a model. We say that this model
is based on the frame $\langle W, R \rangle$. Let $\varphi$ be a well-formed formula. $\varphi$ is valid in
$\langle W, R, \Vdash \rangle$ if $\Gamma \Vdash \varphi$ for every $\Gamma \in W$. $\varphi$ is valid in the frame $\langle W, R \rangle$ if it is valid in
every model based on $\langle W, R \rangle$. Finally, if $\varphi$ is valid in every frame in a collection of frames L, then
we say it is L-valid.
Since frames are just a set with a relation, it is natural to want to choose collections
of frames based on properties of R. This is exactly what logicians have done,
beginning with the system K, which places no restrictions on the frame whatsoever.
Other common systems and frame conditions are listed in Figure 2.
System    Frame conditions
K         none
D         serial (footnote 4)
T         reflexive
K4        transitive
B         reflexive, symmetric
S4        reflexive, transitive
S5        reflexive, symmetric, transitive
Figure 2. Common systems and their frame conditions.
4 For every world $\Gamma \in W$ there exists at least one $\Delta \in W$ such that $\Delta$ is accessible to $\Gamma$.
Note the interconnections implied by this table. For example, any formula that is
K-valid ought to be valid in all these systems, since its truth relies on no particular
frame structure. Similarly, what is true in B will be true in S5, since the former is
identical to the latter with the exception of not necessarily being transitive. The
exact relationships are illustrated in Figure 3.
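Because each system is picked out by properties of R alone, membership of a finite frame in each collection can be checked mechanically. The sketch below assumes the standard frame conditions (K: none; D: serial; T: reflexive; K4: transitive; B: reflexive and symmetric; S4: reflexive and transitive; S5: reflexive, symmetric, and transitive). The function names and the sample frames are hypothetical.

```python
def is_reflexive(worlds, access):
    return all((w, w) in access for w in worlds)


def is_symmetric(worlds, access):
    return all((v, w) in access for (w, v) in access)


def is_transitive(worlds, access):
    return all((w, u) in access
               for (w, v) in access
               for (v2, u) in access if v == v2)


def is_serial(worlds, access):
    # Every world can access at least one world.
    return all(any((w, v) in access for v in worlds) for w in worlds)


# Assumed standard correspondence between systems and frame conditions.
CONDITIONS = {
    "K": [],
    "D": [is_serial],
    "T": [is_reflexive],
    "K4": [is_transitive],
    "B": [is_reflexive, is_symmetric],
    "S4": [is_reflexive, is_transitive],
    "S5": [is_reflexive, is_symmetric, is_transitive],
}


def systems_of(worlds, access):
    """List the systems whose collection of frames contains this frame."""
    return [name for name, checks in CONDITIONS.items()
            if all(check(worlds, access) for check in checks)]


preorder = ({1, 2}, {(1, 1), (2, 2), (1, 2)})   # reflexive, transitive, not symmetric
total = ({1, 2}, {(1, 1), (2, 2), (1, 2), (2, 1)})
```

Every frame is a K-frame, and every S5-frame is also a B-frame and an S4-frame, so the sets of valid formulae nest in the opposite direction: the more frames a collection contains, the fewer formulae are valid in all of them.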
2.3. An Axiomatic Treatment of Proofs. We now have a means of writing
statements and of talking about which are true in which circumstances. However,
the fact that the reader has probably taken an introductory proof course should
suggest that more is required from a system of logic. We do not want to have
to compute truth tables for every possible world whenever we want to establish
the veracity of a statement. Instead we want a system that formalizes the usual
reasoning process of mathematics: beginning from what we know and applying
rules and axioms to generate new theorems.
In the discussion that follows we will consider the system K. Since K makes no
assumptions about the structure of the frames, we will be able to augment K with
more axioms to transform it into any of the systems discussed so far.
First, the definition of a proof:
Definition 2.5. An axiomatic proof is a finite sequence of formulae, each of which
is either an axiom or else follows from the earlier terms of the sequence by one of
the rules of inference. An axiomatic theorem is the last line of a proof.
The axioms of our system are as follows:
Definition 2.6. There are two classes of axioms for K:
Classical Tautologies: All valid formulae of PC (that is, all the tautologies of
traditional propositional logic) will be taken as axioms.
Schema K: For any wffs $\varphi$ and $\psi$, we will assume that
$\Box(\varphi \Rightarrow \psi) \Rightarrow (\Box\varphi \Rightarrow \Box\psi)$.
Our proof system will also have two rules of inference.
Definition 2.7. A rule of inference is an ordered pair $(\Sigma, \varphi)$, where $\Sigma$ is a set of
wffs and $\varphi$ is a single wff. If the propositions of $\Sigma$ are theorems of the system, so is
$\varphi$.
K has two rules of inference.
Modus ponens: $(\{\varphi, \varphi \Rightarrow \psi\}, \psi)$.
Necessitation: $(\{\varphi\}, \Box\varphi)$.
Since we are introducing axioms, it is worth taking a moment to ensure that
they are reasonable. While the classical tautologies and modus ponens are the
same as in PC, Schema K and Necessitation may take some argument. We will
start with Schema K: $\Box(X \Rightarrow Y) \Rightarrow (\Box X \Rightarrow \Box Y)$. Say that we have a proof
for $\Box(X \Rightarrow Y)$ and that we are interpreting $\Box$ in terms of necessity. Then what
we are saying is that we have proven it necessarily true that $X \Rightarrow Y$. If we can
prove that X is necessarily true then we will have shown that X is true in every
possible world. Since $X \Rightarrow Y$ must be true in every possible world, it is reasonable
to say that Y must be as well and that Y is therefore necessarily true. Schema K
is also reasonable under epistemic logic. If we know that $X \Rightarrow Y$ and we know X,
we also know Y.
Now for Necessitation. Note first that since we are in a proof system, X does not mean "X is the case."
We are rather asserting that X is provable. Consider the two unimodal logics we
have discussed: the logic of necessity and epistemic logic. For the necessary/possible
modality we are claiming that if we have a proof for X from our other axioms and
rules of inference, X must be necessarily true. This is exactly the sort of result we
want from a proof system. Similarly, in epistemic logic we are claiming that if we
have a proof for X, we know X. This also makes sense. If we have proven X, we
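Name    Scheme
D       $\Box P \Rightarrow \Diamond P$
T       $\Box P \Rightarrow P$
4       $\Box P \Rightarrow \Box\Box P$
B       $P \Rightarrow \Box\Diamond P$
Figure 4. Axiom schemes added to K.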
Proof.
Step    Justification
1       Tautology
2       Regularity on 1
3       Tautology
4       Regularity on 3
5       Tautology
6       Modus Ponens, 2, 5
7       Modus Ponens, 4, 6
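The seven justifications listed above fit, for instance, a derivation of $\Box(X \land Y) \Rightarrow (\Box X \land \Box Y)$ in K. The steps below are a reconstruction under two assumptions, not necessarily the derivation originally intended: that this is the theorem being proved, and that "Regularity" names the derived rule allowing $\Box\varphi \Rightarrow \Box\psi$ to be inferred from $\varphi \Rightarrow \psi$ (obtainable from Necessitation, Schema K, and modus ponens).

```latex
\begin{align*}
1.\quad & (X \land Y) \Rightarrow X && \text{Tautology} \\
2.\quad & \Box(X \land Y) \Rightarrow \Box X && \text{Regularity on 1} \\
3.\quad & (X \land Y) \Rightarrow Y && \text{Tautology} \\
4.\quad & \Box(X \land Y) \Rightarrow \Box Y && \text{Regularity on 3} \\
5.\quad & (\Box(X \land Y) \Rightarrow \Box X) \Rightarrow
          \bigl((\Box(X \land Y) \Rightarrow \Box Y) \Rightarrow
          (\Box(X \land Y) \Rightarrow (\Box X \land \Box Y))\bigr) && \text{Tautology} \\
6.\quad & (\Box(X \land Y) \Rightarrow \Box Y) \Rightarrow
          (\Box(X \land Y) \Rightarrow (\Box X \land \Box Y)) && \text{Modus Ponens, 2, 5} \\
7.\quad & \Box(X \land Y) \Rightarrow (\Box X \land \Box Y) && \text{Modus Ponens, 4, 6}
\end{align*}
```

Line 5 is an instance of the classical tautology $(A \Rightarrow B) \Rightarrow ((A \Rightarrow C) \Rightarrow (A \Rightarrow (B \land C)))$ with $A = \Box(X \land Y)$, $B = \Box X$, and $C = \Box Y$.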
As stated before, the other systems we have introduced are identical to K, with
only a few added restrictions on the frames. Axiomatically, these systems simply
add more axiom schemes. The added axioms are shown in Figure 4. The common
systems of modal logic are formed by adding combinations of the schemes to the
axiom system K, as shown in Figure 5.
3. Soundness and Completeness
It is an interesting fact that our system of proof never introduced a formal
link to our method for determining validity. How do we know, then, whether the
theorems that we prove are valid in our collection of frames? How do we know if
our proof system is strong enough to derive all the valid statements in our collection
Logic    Added Axioms
D        D
T        T
K4       4
B        T, B
S4       T, 4
S5       T, 4, B
Figure 5. Axiom schemes added to K to obtain each system.
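$$S_0 = S, \qquad S_{n+1} = \begin{cases} S_n \cup \{X_{n+1}\} & \text{if } S_n \cup \{X_{n+1}\} \text{ is consistent} \\ S_n & \text{otherwise.} \end{cases}$$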
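The step-by-step extension can be illustrated on a small scale. The sketch below is a propositional stand-in under loud assumptions: it works with a finite enumeration rather than an enumeration of all formulae, and it replaces consistency in the proof system with truth-table satisfiability, which agrees with it for PC but is only an analogy for the modal case. Formulas are nested tuples, and the names `holds`, `variables`, `consistent`, and `extend` are hypothetical.

```python
from itertools import product


def holds(formula, assignment):
    """Evaluate a propositional formula under a truth assignment."""
    op = formula[0]
    if op == "var":
        return assignment[formula[1]]
    if op == "not":
        return not holds(formula[1], assignment)
    if op == "and":
        return holds(formula[1], assignment) and holds(formula[2], assignment)
    if op == "or":
        return holds(formula[1], assignment) or holds(formula[2], assignment)
    if op == "implies":
        return (not holds(formula[1], assignment)) or holds(formula[2], assignment)
    raise ValueError("unknown operator: " + repr(op))


def variables(formula):
    """Collect the variable names occurring in a formula."""
    if formula[0] == "var":
        return {formula[1]}
    return set().union(*(variables(sub) for sub in formula[1:]))


def consistent(formulas):
    """Stand-in for consistency: some assignment satisfies every formula."""
    names = sorted(set().union(*(variables(f) for f in formulas)))
    return any(all(holds(f, dict(zip(names, values))) for f in formulas)
               for values in product([False, True], repeat=len(names)))


def extend(start, enumeration):
    """Add each enumerated formula in turn whenever consistency is preserved."""
    current = list(start)
    for candidate in enumeration:
        if consistent(current + [candidate]):
            current.append(candidate)
    return current


P, Q = ("var", "P"), ("var", "Q")
S = [("not", P)]
enumeration = [P, Q, ("not", Q), ("or", P, Q)]
result = extend(S, enumeration)
```

Starting from $\{\neg P\}$, the candidates $P$ and $\neg Q$ are rejected because adding them would make the set unsatisfiable, while $Q$ and $P \lor Q$ are accepted.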
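was arbitrary, this holds for all the worlds to which $\Gamma$ is related. So $\Box\varphi$ is true
in $\Gamma$.
Now say that $\Box\varphi \notin \Gamma$. Since $\Gamma$ is maximally consistent, this means that $\neg\Box\varphi \in \Gamma$.
Consider the set of all statements in $\Gamma$ beginning with $\Box$, $\{\Box X_1, \Box X_2, ...\}$. We know
from the previous theorem that if $\{\neg\Box\varphi, \Box X_1, \Box X_2, ...\}$ is consistent then so is
$\{\neg\varphi, X_1, X_2, ...\}$. We can therefore extend this to form a maximally consistent set
$\Delta$. Now for every statement of the form $\Box P \in \Gamma$, P is in $\Delta$. So by our inductive
hypothesis, $\Delta \Vdash P$. Since for every $\Box P \in \Gamma$, $\Delta \Vdash P$, we know that $\Gamma R \Delta$ by
definition. But since $\neg\varphi$ is true in $\Delta$, it must be that $\Box\varphi$ is false in $\Gamma$. So by
contraposition, if $\Box\varphi$ is true in $\Gamma$, then $\Box\varphi \in \Gamma$.
Implications. Say that our formula is $\varphi \Rightarrow \psi$ and that $\varphi$ and $\psi$ are each true if and
only if they are elements of $\Gamma$. Say that $\Gamma \Vdash \varphi \Rightarrow \psi$. Then either $\varphi$ is false
in $\Gamma$ or $\psi$ is true. Say that $\varphi$ is true, so $\psi$ must be true as well. Then $\varphi$ and $\psi$ are
both in $\Gamma$. Since they are both elements of the maximally consistent
set $\Gamma$, $\varphi \Rightarrow \psi$ must be in that set as well. If instead $\varphi$ is false in $\Gamma$, then $\neg\varphi \in \Gamma$,
and $\varphi \Rightarrow \psi$ must again be in $\Gamma$, since it follows from $\neg\varphi$.
Now assume that $\varphi \Rightarrow \psi \in \Gamma$. Then either $\neg\varphi$ or $\psi$ is in $\Gamma$ as well, since $\Gamma$ is
maximally consistent. Then either $\Gamma \nVdash \varphi$ or $\Gamma \Vdash \psi$, so $\varphi \Rightarrow \psi$ is true in $\Gamma$.
Conjunctions. Say that our formula is $\varphi \land \psi$ and that $\varphi$ and $\psi$ are true if and
only if they are elements of $\Gamma$. Say that $\Gamma \Vdash \varphi \land \psi$. Then both $\varphi$ and $\psi$ must be
true, so $\varphi, \psi \in \Gamma$.
Now say that $\varphi, \psi \in \Gamma$. Then both $\Gamma \Vdash \varphi$ and $\Gamma \Vdash \psi$. So we know by definition
that $\Gamma \Vdash \varphi \land \psi$.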
With this machinery in place, we can finally show that K is complete.
Theorem 3.9. Say that X is K-valid. Then there is a proof of X in the axiom
system K.
Proof. With the machinery above this proof becomes relatively straightforward.
We will proceed by the contrapositive. Assume that X has no proof in K. Since X
has no proof, the set $\{\neg X\}$ is consistent. (Since we cannot derive X, neither can
we derive $X \land \neg X$.)
Extend this set to a maximally consistent set $\Gamma_X$ and let $\langle W, R, \Vdash \rangle$ be the canonical model of K. Clearly $\Gamma_X$, being maximally consistent, is in W. Since $\neg X \in \Gamma_X$,
$\neg X$ is true in $\Gamma_X$. Since $\Gamma_X$ is maximally consistent, $X \notin \Gamma_X$. Therefore X is not
true in $\Gamma_X$, meaning that X is not valid in the canonical model.
Now clearly the canonical model of K is based on a frame in the collection of frames K, since K
places no restrictions on its frames. We have therefore shown that there exists a
model based on a frame in K in which X is not valid. Therefore X is not K-valid.
To show that the remaining systems are complete, we merely need to show that the
canonical model of a given system L meets the requirements of that system. For
example, the proof for T consists in showing that the canonical model of T is based
on a reflexive frame. However, it is helpful to begin with a quick lemma.
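Lemma 3.10. Let $\langle W, R \rangle$ be the frame of the canonical model for a system L and let $\Gamma \in W$.
If $\Gamma$ contains no statements beginning with $\Box$, then $\Gamma R \Delta$ for all $\Delta \in W$.
Proof. Fix $\Delta \in W$ and let $\Gamma_{\Box}$ be the set of all statements beginning with $\Box$ in $\Gamma$.
Define $\Gamma' = \{X \mid \Box X \in \Gamma_{\Box}\}$. We know from the definition of the canonical model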
An atomic formula in this language will be any expression of the form $R(x_1, x_2, ..., x_n)$,
where R is an n-ary relation symbol, and we build the rest of the formulas as you would
expect.
How might we construct a model for such a language? Since we are interested
in adding quantification, it would probably be useful to have some non-empty set
over which to quantify. We call this set the domain. We then determine the truth
of a formula like $(\forall x)(P(x))$ in a world $\Gamma$ by checking to see if P(x) is true in $\Gamma$ for
every x in the domain. The evaluation for modalities also follows a similar route.
$\Box(\forall x)(P(x))$ can be determined true or false by checking to see if P(x) is true for
every value of x in every accessible world. Similarly, $(\exists x)(P(x))$ says that there is
some element in the domain for which P(x) is true.
There is a danger on the horizon, however. Consider the veracity of $(\exists x)(\Box(x = x))$
in a world $\Gamma$: there exists an x such that in every accessible world $\Delta$, x is
identical to itself. On the surface this seems quite readily true; how could it be that
something is not identical to itself? But there is a catch: how do we know whether
or not x actually exists in $\Delta$? Does it even make sense to talk of objects that do
not exist? Yet we do not want to assert $x \neq x$, since that is a contradiction.
One possible solution is to simply require that if x is in one world's domain, it
must be in them all. This is called a constant domain model, the frame of which
is a collection of worlds W, an accessibility relation R, and a single, non-empty
set D. However, it is not inherently unreasonable to want the domain to vary from
world to world. In temporal logic, for example, it would be ridiculous to say that
every object that exists now also existed 1,000 years ago. Temporal logic therefore
is better suited to a varying domain model. The frame of a varying domain model
looks identical to that of a constant domain model: a triple $\langle W, R, D \rangle$. In this
case, however, D is a function assigning a domain to each world in W. The actual
domain for a world $\Gamma$ is not D, but $D(\Gamma)$.
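The difference between the two kinds of frame can be made concrete by treating D as a function in both cases. The sketch below is illustrative only: the two worlds, the objects "oak" and "telephone", the predicate extension, and the function names are all hypothetical, and the quantifier is assumed to range over the domain of the world at which it is evaluated.

```python
def constant_domain(domain):
    """Constant domain: one non-empty set shared by every world."""
    return lambda world: domain


def varying_domain(table):
    """Varying domain: D is a function from worlds to domains."""
    return lambda world: table[world]


def box_forall(worlds, access, domain_of, extension, world):
    """Evaluate box-forall-x P(x) at `world`.

    P must hold of every x in D(v) for every accessible world v;
    extension[v] is the set of objects of which P is true at v.
    """
    return all(x in extension[v]
               for v in worlds if (world, v) in access
               for x in domain_of(v))


worlds = {"now", "1000 years ago"}
access = {("now", "1000 years ago")}
# Hypothetical predicate P: "x is an oak".
extension = {"now": {"oak"}, "1000 years ago": {"oak"}}

varying = varying_domain({"now": {"oak", "telephone"},
                          "1000 years ago": {"oak"}})
constant = constant_domain({"oak", "telephone"})
```

With the varying domain, the telephone simply is not in the domain of the earlier world, so `box_forall(worlds, access, varying, extension, "now")` holds; with the constant domain the telephone must be accounted for there too, and the same formula fails.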
Unfortunately, this reintroduces the problem of deciding whether or not x = x
is true in $\Gamma$ if $x \notin D(\Gamma)$. One somewhat unsatisfying option is to simply allow
that perhaps some statements are neither true nor false. This approach results in