Reasoning-Defeasible A4
Defeasible Reasoning
http://plato.stanford.edu/archives/spr2014/entries/reasoning-defeasible/ Defeasible Reasoning
From the Spring 2014 Edition of the Stanford Encyclopedia of Philosophy
First published Fri Jan 21, 2005; substantive revision Tue Jan 29, 2013
Defeasible Reasoning Robert Koons
(have the sensory experience as of being in the presence of something red), then, Chisholm argued, I may presume that I really am in the presence of something red. This presumption can, of course, be defeated, if, for example, I learn that my environment is relevantly abnormal (for instance, all the ambient light is red).

John L. Pollock developed Chisholm's idea into a theory of prima facie reasons and defeaters of those reasons (Pollock 1967; Pollock 1979; Pollock 1974). Pollock distinguished between two kinds of defeaters of a defeasible inference: rebutting defeaters (which give one a prima facie reason for believing the denial of the original conclusion) and undercutting defeaters (which give one a reason for doubting that the usual relationship between the premises and the conclusion holds in the given case). According to Pollock, a conclusion is warranted, given all of one's evidence, if it is supported by an ultimately undefeated argument whose premises are drawn from that evidence.

1.2 Artificial Intelligence

As the subdiscipline of artificial intelligence took shape in the 1960's, pioneers like John M. McCarthy and Patrick J. Hayes soon discovered the need to represent and implement the sort of defeasible reasoning that had been identified by Aristotle and Chisholm. McCarthy and Hayes (McCarthy and Hayes 1969) developed a formal language they called the “situation calculus,” for use by expert systems attempting to model changes and interactions among a domain of objects and actors. McCarthy and Hayes encountered what they called the frame problem: the problem of deciding which conditions will not change in the wake of an event. They required a defeasible principle of inertia: the presumption that any given condition will not change, unless required to do so by actual events and dynamic laws. In addition, they encountered the qualification problem: the need for a presumption that an action can be successfully performed, once a short list of essential prerequisites has been met. McCarthy (McCarthy 1977, 1038–1044) suggested that the solution lay in a logical principle of circumscription: the presumption that the actual situation is as unencumbered with abnormalities and oddities (including unexplained changes and unexpected interferences) as is consistent with our knowledge of it (McCarthy 1982; McCarthy 1986). In effect, McCarthy suggests that it is warranted to believe whatever is true in all the minimal (or otherwise preferred) models of one's initial information set.

In the early 1980's, several systems of defeasible reasoning were proposed by others in the field of artificial intelligence: Ray Reiter's default logic (Reiter 1980; Etherington and Reiter 1983, 104–108), McDermott and Doyle's Non-Monotonic Logic I (McDermott and Doyle 1982), Robert C. Moore's Autoepistemic Logic (Moore 1985), and Hector Levesque's formalization of the “all I know” operator (Levesque 1990). These early proposals involved the search for a kind of fixed point or cognitive equilibrium. Special rules (called default rules by Reiter) permit drawing certain conclusions so long as these conclusions are consistent with what one knows, including all that one knows on the basis of these very default rules. In some cases, no such fixed point exists, and, in others, there are multiple, mutually inconsistent fixed points. In addition, these systems were procedural or computational in nature, in contrast to the semantic characterization of warranted conclusions (in terms of preferred models) in McCarthy's circumscription system. Later work in artificial intelligence has tended to follow McCarthy's lead in this respect.

2. Applications and Motivation

Philosophers and theorists of artificial intelligence have found a wide variety of applications for defeasible reasoning. In some cases, the defeasibility seems to be grounded in some aspect of the subject or the context of communication, and in other cases in facts about the objective
world. The first includes defeasible rules as communicative or representational conventions and autoepistemic reasoning (reasoning about one's own knowledge and lack of knowledge). The latter, the objective sources of defeasibility, include defeasible obligations, defeasible laws of nature, induction, abduction, and Ockham's razor (the presumption that the world is as uncomplicated as possible).

2.1 Defeasibility as a Convention of Communication

Much of John McCarthy's early work in artificial intelligence concerned the interpretation of stories and puzzles (McCarthy and Hayes 1969; McCarthy 1977). McCarthy found that we often make assumptions based on what is not said. So, for example, in a puzzle about safely crossing a river by canoe, we assume that there are no bridges or other means of conveyance available. Similarly, when using a database to store and convey information, the information that, for example, no flight is scheduled at a certain time is represented simply by not listing such a flight. Inferences based on these conventions are defeasible, however, because the conventions can themselves be explicitly abrogated or suspended.

Nicholas Asher and his collaborators (Lascarides and Asher 1993, Asher and Lascarides 2003, Vieu, Bras, Asher, and Aurnague 2005, Txurruka and Asher 2008) have argued that defeasible reasoning is useful in unpacking the pragmatics of conversational implicature.

2.2 Autoepistemic Reasoning

Robert C. Moore (Moore 1985) pointed out that we sometimes infer things about the world based on our not knowing certain things. So, for instance, I might infer that I do not have a sister, since, if I did, I would certainly know it, and I do not in fact know that I have a sister. Such an inference is, of course, defeasible, since if I subsequently learn that I have a sister after all, the basis for the original inference is nullified.

2.3 Semantics for Generics and the Progressive

Generic terms (like “birds” in “Birds fly”) are expressed in English by means of bare common noun phrases (without determiner). Adverbs like “normally” and “typically” are also indicators of generic predication. As Asher and Pelletier (Asher and Pelletier 1997) have argued, the semantics for such sentences seems to involve intensionality: a generic sentence can be true even if the majority of the kind, or even all of the kind, fail to conform to the generalization. It can be true that birds fly even if, as a result of a freakish accident, all surviving birds are abnormally flightless. A promising semantic theory for the generic is to represent generic predication by means of a defeasible rule or conditional.

The progressive verb involves a similar kind of intensionality (Asher 1992). If Jones is crossing the street, then it would normally be the case that Jones will succeed in crossing the street. However, this inference is clearly defeasible: Jones might be hit by a truck midway across and never complete the crossing.

2.4 Defeasible Obligations

Philosophers have, for quite some time, been interested in defeasible obligations, which give rise to defeasible inferences about what we are, all things considered, obliged to do. David Ross, in 1930, discussed the phenomenon of prima facie obligations (Ross 1930). The existence of a prima facie obligation gives one good, but defeasible, grounds for believing that one ought to fulfill that obligation. When formal deontic logic was developed by Chisholm and others in the 1960s (Chisholm 1963), the use of classical logic gave rise to certain paradoxes, such as
Chisholm's paradox of contrary-to-duty imperatives. These paradoxes can be resolved by recognizing that the inference from imperative to actual duty is a defeasible one (Asher and Bonevac 1996; Nute 1997).

2.5 Defeasible Laws of Nature and Scientific Programs

Philosophers David M. Armstrong and Nancy Cartwright have argued that the actual laws of nature are oaken rather than iron, to use Armstrong's terms (Armstrong 1983; Armstrong 1997, 230–231; Cartwright 1983). Oaken laws admit of exceptions: they have tacit ceteris paribus (other things being equal) or ceteris absentibus (other things being absent) conditions. As Cartwright points out, an inference based on such a law of nature is always defeasible, since we may discover that additional phenomenological factors must be added to the law in question in special cases.

There are several reasons to think that deductive logic is not an adequate tool for dealing with this phenomenon. In order to apply deduction to the laws and the initial conditions, the laws must be represented in a form that admits of no exceptions. This would require explicitly stating each potentially relevant condition in the antecedent of each law-stating conditional. This is impractical, not only because it makes the statement of each and every law extremely cumbersome, but also because we know that there are many exceptional cases that we have not yet encountered and may not be able to imagine. Defeasible laws enable us to express what we really know to be the case, rather than forcing us to pretend that we can make an exhaustive list of all the possible exceptions.

More recently, Tohmé, Delrieux, and Bueno (2011) have argued that defeasible reasoning is crucial to the understanding of scientific research programs.

2.6 Defeasible Principles in Metaphysics and Epistemology

Many classical philosophical arguments, especially those in the perennial philosophy that endured from Plato and Aristotle to the end of scholasticism, can be fruitfully reconstructed by means of defeasible logic. Metaphysical principles, like the laws of nature, may hold in normal cases, while admitting of occasional exceptions. The principle of causality, for example, that plays a central role in the cosmological argument for God's existence, can plausibly be construed as a defeasible generalization (Koons 2001).

As discussed above (in section 1.1), prima facie reasons and defeaters of those reasons play a central role in contemporary epistemology, not only in relation to perceptual knowledge, but also in relation to every other source of knowledge: memory, imagination (as an indicator of possibility) and testimony, at the very least. In each case, an impression or appearance provides good but defeasible evidence of a corresponding reality.

2.7 Occam's Razor and the Assumption of a “Closed World”

Prediction always involves an element of defeasibility. If one predicts what will, or what would, under some hypothesis, happen, one must presume that there are no unknown factors that might interfere with those factors and conditions that are known. Any prediction can be upset by such unanticipated interventions. Prediction thus proceeds from the assumption that the situation as modeled constitutes a closed world: that nothing outside that situation could intrude in time to upset one's predictions. In addition, we seem to presume that any factor that is not known to be causally relevant is in fact causally irrelevant, since we are constantly encountering new factors and novel combinations of factors, and it is impossible to verify their causal irrelevance in advance. This closed-world assumption is one of the principal motivations for McCarthy's logic of circumscription (McCarthy 1982; McCarthy 1986).
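The closed-world presumption behind circumscription can be illustrated with a small propositional sketch in Python. It is only a toy under stated assumptions: the atoms (`switch_on`, `ab_wiring`, `lamp_lit`), the helper names, and the lamp scenario are all hypothetical, and the code enumerates truth assignments by brute force rather than implementing McCarthy's second-order formalism. A conclusion counts as warranted when it holds in every model whose set of true abnormality atoms is subset-minimal.

```python
from itertools import product

def models(atoms, constraints):
    """All truth assignments (as dicts) over `atoms` that satisfy every constraint."""
    found = []
    for values in product([False, True], repeat=len(atoms)):
        world = dict(zip(atoms, values))
        if all(check(world) for check in constraints):
            found.append(world)
    return found

def circumscribe(atoms, constraints, abnormal):
    """Keep only the models whose set of true abnormality atoms is subset-minimal."""
    candidates = models(atoms, constraints)

    def ab_set(world):
        return {a for a in abnormal if world[a]}

    return [w for w in candidates
            if not any(ab_set(v) < ab_set(w) for v in candidates)]

def follows(atoms, constraints, abnormal, query):
    """True iff `query` holds in every minimal model (a defeasible prediction)."""
    return all(query(w) for w in circumscribe(atoms, constraints, abnormal))

# Hypothetical scenario: a switch lights a lamp unless the wiring is abnormal.
ATOMS = ["switch_on", "ab_wiring", "lamp_lit"]
theory = [
    lambda w: w["switch_on"],
    lambda w: not (w["switch_on"] and not w["ab_wiring"]) or w["lamp_lit"],
]
lamp_lit = lambda w: w["lamp_lit"]

predicted = follows(ATOMS, theory, ["ab_wiring"], lamp_lit)
# Learning of an interfering factor defeats the earlier prediction.
after_defeater = follows(ATOMS, theory + [lambda w: w["ab_wiring"]],
                         ["ab_wiring"], lamp_lit)
print(predicted, after_defeater)
```

With the original theory the only minimal model has no abnormality, so the lamp is predicted to light; once the abnormality is asserted as a fact, the prediction is withdrawn even though no earlier premise was retracted.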
3. Varieties of Approaches

We can treat the study of defeasible reasoning either as a branch of epistemology (the theory of knowledge), or as a branch of logic. In the epistemological approach, defeasible reasoning is studied as a form of inference, that is, as a process by which we add to our stock of knowledge. The epistemological approach is concerned with the transmission of warrant, with the question of when an inference, starting with justified or warranted beliefs, produces a new belief that is also warranted. This approach focuses explicitly on the norms of belief change.

In contrast, a logical approach to defeasible reasoning fastens on a relationship between propositions or possible bodies of information. Just as deductive logic consists of the study of a certain consequence relation between propositions or sets of propositions (the relation of valid implication), so defeasible (or nonmonotonic) logic consists of the study of a different kind of consequence relation. Deductive consequence is monotonic: if a set of premises logically entails a conclusion, then any superset (any set of premises that includes all of the first set) will also entail that same conclusion. In contrast, defeasible consequence is nonmonotonic. A conclusion follows defeasibly or nonmonotonically from a set of premises just in case it is true in nearly all of the models that verify the premises, or in the most normal models that do.

The two approaches are related. In particular, a logical theory of defeasible consequence will have epistemological consequences. It is presumably true that an ideally rational thinker will have a set of beliefs that are closed under defeasible, as well as deductive, consequence. However, a logical theory of defeasible consequence would have a wider scope of application than a merely epistemological theory of inference. Defeasible logic would provide a mechanism for engaging in hypothetical reasoning, not just reasoning from actual beliefs.

Conversely, as David Makinson and Peter Gärdenfors have pointed out (Makinson and Gärdenfors 1991, 185–205; Makinson 2005), an epistemological theory of belief change can be used to define a set of nonmonotonic consequence relations (one relation for each initial belief state). We can define the consequence relation α |~ β, for a given set of beliefs T, as holding just in case the result of adding belief α to T would include belief in β. However, on this approach, there would be many distinct nonmonotonic consequence relations, instead of a single perspective-independent one.

4. Epistemological Approaches

There have been three versions of the epistemological approach, each of which attempts to define how a cognitively ideal agent arrives at warranted conclusions, given an initial input. The first two of these, John L. Pollock's theory of defeasible reasoning and the theory of semantic inheritance networks, are explicitly computational in nature. They take as input a complex, structured state, representing the data available to the agent, and they define a procedure by which new conclusions can be warranted. The third approach, based on the theory of belief change (the AGM model) developed by Alchourrón, Gärdenfors, and Makinson (Alchourrón, Gärdenfors, and Makinson 1982), instead lays down a set of conditions that an ideal process of belief change ought to satisfy. The AGM model can be used to define a nonmonotonic consequence relation that is temporary and local. This can represent reasoning that is hypothetically or counterfactually defeasible, in the sense that what “follows” from a conjunctive proposition (p & q) need not be a superset of what “follows” from p alone.

4.1 Formal Epistemology

John Pollock's approach to defeasible reasoning consists of enumerating a
set of rules that are constructive and effectively computable, and that aim at describing how an ideal cognitive agent builds up a rich set of beliefs, beginning with a relatively sparse data set (consisting of beliefs about immediate sensory appearances, apparent memories, and such things). The inferences involved are not, for the most part, deductive. Instead, Pollock defines, first, what it is for one belief to be a prima facie reason for believing another proposition. In addition, Pollock defines what it is for one belief, say in p, to be a defeater for q as a prima facie reason for r. In fact, Pollock distinguishes two kinds of defeaters: rebutting defeaters, which are themselves prima facie reasons for believing the negation of the conclusion, and undercutting defeaters, which provide a reason for doubting that q provides any support, in the actual circumstances, for r (Pollock 1987, 484). A belief is ultimately warranted in relation to a data set (or epistemic basis) just in case it is supported by some ultimately undefeated argument proceeding from that epistemic basis.

In his most recent work (Pollock 1995), Pollock uses a directed graph to represent the structure of an ideal cognitive state. Each directed link in the network represents the first node's being a prima facie reason for the second. The new theory includes an account of hypothetical, as well as categorical reasoning, since each node of the graph includes a (possibly empty) set of hypotheses. Somewhat surprisingly, Pollock assumes a principle of monotonicity with respect to hypotheses: a belief that is warranted relative to a set of hypotheses is also warranted with respect to any superset of hypotheses. Pollock also permits conditionalization and reasoning by cases.

An argument is self-defeating if it supports a defeater for one of its own defeasible steps. Here is an interesting example: (1) Robert says that the elephant beside him looks pink. (2) Robert's color vision becomes unreliable in the presence of pink elephants. Ordinarily, belief 1 would support the conclusion that the elephant is pink, but this conclusion undercuts the argument, thanks to belief 2. Thus, the argument that the elephant is pink is self-defeating. Pollock argues that all self-defeating arguments should be rejected, and that they should not be allowed to defeat other arguments. In addition, a set of nodes can experience mutual destruction or collective defeat if each member of the set is defeated by some other member, and no member of the set is defeated by an undefeated node that is outside the set.

In formalizing undercutting defeat, Pollock introduces a new connective, ⊗, where p ⊗ q means that it is not the case that p wouldn't be true unless q were true. Pollock uses rules, rather than conditional propositions, to express the prima facie relation. If he had, instead, introduced a special connective ⇒, with p ⇒ q meaning that p would be a prima facie reason for q, then undercutting defeaters could be represented by means of negating this conditional. To express the fact that r is an undercutting defeater of p as a prima facie reason for q, we could state both that (p ⇒ q) and ¬((p & r) ⇒ q).

In the case of conflicting prima facie reasons, Pollock rejects the principle of specificity, a widely accepted principle according to which the defeasible rule with the more specific antecedent takes priority over conflicting rules with less specific antecedents. Pollock does, however, accept a special case of specificity in the area of statistical syllogisms with projectible properties (Pollock 1995, 64–66). So, if I know that most As are Bs, and that most ACs are not Bs, then I should, upon learning that individual b is both A and C, give priority to the AC generalization over the A generalization (concluding that b is not a B).

Pollock's theory of warrant is intended to provide normative rules for belief, of the form: if you have warranted beliefs that are prima facie reasons for some further belief, and you have no ultimately undefeated defeaters for those reasons, then that further belief is warranted and should
be believed. For more details of Pollock's theory, see the following supplementary document:

John Pollock's System

Wolfgang Spohn (Spohn 2002) has argued that Pollock's system is normatively defective because, in the end, Pollock has no normative standard to appeal to, other than ad hoc intuitions about how a reasonable person would respond to this or that cognitive situation. Spohn suggests that, with respect to the state of development of the study of defeasible reasoning, Pollock's theory corresponds to C. I. Lewis's early investigations into modal logic. Lewis suggested a number of possible axiom systems, but lacked an adequate semantic theory that could provide an independent check on the correctness or completeness of any given list (of the kind that was later provided by Kripke and Kanger). Analogously, Spohn argues that Pollock's system is in need of a unifying normative standard. This very same criticism can be lodged, with equal justice, against a number of other theories of defeasible reasoning, including semantic inheritance networks and default logic.

4.2 Semantic Inheritance Networks

The system of semantic inheritance networks, developed by Horty, Thomason, and Touretzky (1990), is similar to Pollock's system. Both represent cognitive states by means of directed graphs, with links representing defeasible inferences. The semantic inheritance network theory has an intentionally narrower scope: the initial nodes of the network represent particular individuals, and all non-initial nodes represent kinds, categories or properties. A link from an initial (individual) node to a category node represents simply predication: that Felix (initial node) is a cat (category node), for example. Links between category nodes represent defeasible or generic inclusion: that birds (normally or usually) are flying things. To be more precise, there are both positive (“is a”) and negative (“is not a”) links. The negative links are usually represented by means of a slash through the body of the arrow.

Semantic inheritance networks differ from Pollock's system in two important ways. First, they cannot represent one fact's constituting an undercutting defeater of an inference, although they can represent rebutting defeaters. For example, they do not allow an inference from the apparent color of an elephant to its actual color to be undercut by the information that my color vision is unreliable, unless I have information about the actual color of the elephant that contradicts its apparent color. Secondly, they do incorporate the principle of specificity (the principle that rules with more specific antecedents take priority in case of conflict) into the very definition of a warranted conclusion. In fact, in contrast to Pollock, the semantic inheritance approach gives priority to rules whose antecedents are weakly or defeasibly more specific. That is, if the antecedent of one rule is defeasibly linked to the antecedent of a second rule, the first rule gains priority. For example, if Quakers are typically pacifists, then, when reasoning about a Quaker pacifist, rules pertaining to Quakers would override rules pertaining to pacifists. For the details of semantic inheritance theory, see the following supplementary document:

Semantic Inheritance Networks.

David Makinson (Makinson 1994) has pointed out that semantic network theory is very sensitive to the form in which defeasible information is represented. There is a great difference between having a direct link between two nodes and having a path between the two nodes being supported by the graph as a whole. The notion of preemption gives special powers to explicitly given premises over conclusions. Direct links always take priority over longer paths. Consequently, inheritance networks lack two desirable metalogical properties: cut and cautious monotony (which
will be covered in more detail in the section on Logical Approaches).

Cut: If G is a subgraph of G′, and every link in G′ corresponds to a path supported by G, then every path supported by G is also supported by G′.

Cautious Monotony: If G is a subgraph of G′, and every link in G′ corresponds to a path supported by G, then every path supported by G′ is also supported by G.

Cumulativity (Cut plus Cautious Monotony) corresponds to reasoning by lemmas or subconclusions. The Horty-Thomason-Touretzky system does satisfy special cases of Cut and Cautious Monotony: if A is an atomic statement (a link from an individual to a category), then if graph G supports A, then for any statement B, G ∪ {A} supports B if and only if G supports B.

Another form of inference that is not supported by semantic inheritance networks is that of reasoning by cases or by dilemma. In addition, semantic networks do not license modus-tollens-like inferences: from the fact that birds normally fly and Tweety does not fly, we are not licensed to infer that Tweety is not a bird. (This feature is also lacking in Pollock's system.)

4.3 Belief Revision Theory

Alchourrón, Gärdenfors, and Makinson (1982) developed a formal theory of belief revision and contraction, drawing largely on Willard van Orman Quine's model of the web of belief (Quine and Ullian 1970). The cognitive agent is modelled as believing a set of propositions that are ordered by their degree of entrenchment. This model provides the basis for a set of normative constraints on belief contraction (subtracting a belief) and belief revision (adding a new belief that is inconsistent with the original set). When a belief is added that is logically consistent with the original belief set, the agent is supposed to believe the logical closure of the original set plus the new belief. When a belief is added that is inconsistent with the original set, the agent retreats to the most entrenched of the maximal subsets of the set that are consistent with the new belief, adding the new proposition to that set and closing under logical consequence. For the axioms of the AGM model, see the following supplementary document:

AGM Postulates

AGM belief revision theory can be used as the basis for a system of defeasible reasoning or nonmonotonic logic, as Gärdenfors and Makinson have recognized (Makinson and Gärdenfors 1991). If K is an epistemic state, then a nonmonotonic consequence relation can be defined as follows: A |~ B iff B ∈ K*A. Unlike Pollock's system or semantic inheritance networks, this defeasible consequence relation depends upon a background epistemic state. Thus, the belief revision approach gives rise, not to a single nonmonotonic consequence relation, but to a family of relations. Each background state K gives rise to its own characteristic consequence relation.

One significant limitation of the belief-revision approach is that there is no representation in the object-language of a defeasible or default rule or conditional (that is, of a conditional of the form “If p, then normally q” or “That p would be a prima facie reason for accepting that q”). In fact, Gärdenfors (Gärdenfors 1978; Gärdenfors 1986) proved that no conditional satisfying the Ramsey test can be added to the AGM system without trivializing the revision relation.[1] (A conditional ⇒ satisfies the Ramsey test just in case, for every epistemic state K, K includes (A ⇒ B) iff K*A includes B.)

Since the AGM system cannot include conditional beliefs, it cannot elucidate the question of what logical relationships hold between
all the members of Γ implies (in some sense) the truth of at least one member of Δ. To this point, studies of nonmonotonic logic have defined nonmonotonic consequence relations in the style of Hilbert or Tarski, rather than Scott.

A (Tarski) consequence relation is monotonic just in case it satisfies the following condition, for all formulas p and all sets Γ and Δ:

Monotonicity: If Γ ⊨ p, then Γ ∪ Δ ⊨ p.

Any consequence relation that fails this condition is nonmonotonic. A relation of defeasible consequence clearly must be nonmonotonic, since a defeasible inference can be defeated by adding additional information that constitutes a rebutting or undercutting defeater.

5.2 Metalogical Desiderata

Once monotonicity is given up, the question arises: why call the relation of defeasible consequence a logical consequence relation at all? What properties do defeasible consequence and classical logical consequence have in common, that would justify treating them as sub-classes of the same category? What justifies calling nonmonotonic consequence logical?

There are certain minimal properties that a relation must satisfy to count as logical. First, the relation ought to permit reasoning by lemmas or subconclusions. That is, if a proposition p already follows from a set Γ, then it should make no difference to add p to Γ as an additional premise. Relations that satisfy this condition are called cumulative. Cumulative relations satisfy the following two conditions (where “C(Γ)” represents the set of defeasible consequences of Γ):

Cut: If Γ ⊆ Δ ⊆ C(Γ), then C(Δ) ⊆ C(Γ).

Cautious Monotony: If Γ ⊆ Δ ⊆ C(Γ), then C(Γ) ⊆ C(Δ).

In addition, a defeasible consequence relation ought to be supraclassical: if p follows from q in classical logic, then it ought to be included in the defeasible consequences of q as well. A formula q ought to count as an (at least) defeasible consequence of itself, and anything included in the content of q (any formula p that follows from q in classical logic) ought to count as a defeasible consequence of q as well. Moreover, the defeasible consequences of a set Γ ought to depend only on the content of the formulas in Γ, not on how that content is represented. Consequently, the defeasible consequence relation ought to treat Γ and the classical logical closure of Γ (which I'll represent as “Cn(Γ)”) in exactly the same way. A consequence relation that satisfies these two conditions is said to satisfy full absorption (see Makinson 1994, 47).

Full Absorption: Cn(C(Γ)) = C(Γ) = C(Cn(Γ))

Finally, a genuinely logical consequence relation ought to enable us to reason by cases. So, it should satisfy a principle called distribution: if a formula p follows defeasibly from both q and r, then it ought to follow from their disjunction. (To require the converse principle would be to reinstate monotonicity.) The relevant principle is this:

Distribution: C(Γ) ∩ C(Δ) ⊆ C(Cn(Γ) ∩ Cn(Δ)).

Consequence relations that are cumulative, fully absorptive, and distributive satisfy a number of other desirable properties, including conditionalization: if a formula p is a defeasible consequence of Γ ∪ {q}, then the material conditional (q → p) is a defeasible consequence of Γ alone. In addition, such logics satisfy the property of loop: if p1 |~ p2 |~ … |~ pn-1 |~ pn (where “|~” represents the defeasible consequence relation), then the defeasible consequences of pi and pj are exactly the same, for any i or j.[2]
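Cut and Cautious Monotony, as stated in this section, can be checked by brute force on a small example. The Python sketch below is illustrative only: it assumes a preferential reading on which a conclusion follows defeasibly when it holds in all the most normal models of the premises, and the atoms and the `rank` function are hypothetical choices, not drawn from any of the systems discussed. Premise sets are identified with their sets of models, so the condition Γ ⊆ Δ ⊆ C(Γ) becomes: the models of Δ lie between the most normal models of Γ and all models of Γ.

```python
from itertools import combinations, product

ATOMS = ("bird", "penguin", "flies")
# A world is the frozenset of atoms that are true in it.
WORLDS = [frozenset(a for a, v in zip(ATOMS, vals) if v)
          for vals in product([False, True], repeat=len(ATOMS))]

def rank(world):
    """Hypothetical normality ranking (lower is more normal); an assumption."""
    score = 0
    if "penguin" in world:
        score += 1                      # penguins are unusual
        if "flies" in world:
            score += 1                  # flying penguins are more unusual still
    elif "bird" in world and "flies" not in world:
        score += 1                      # flightless non-penguin birds are unusual
    return score

def most_normal(worlds):
    """The worlds of minimal rank within a set of worlds."""
    if not worlds:
        return frozenset()
    best = min(rank(w) for w in worlds)
    return frozenset(w for w in worlds if rank(w) == best)

def models_of(*atoms):
    """All worlds in which every listed atom is true."""
    return frozenset(w for w in WORLDS if set(atoms) <= w)

def defeasibly_entails(premise_models, conclusion_models):
    """Premise |~ conclusion iff the conclusion holds in all most-normal premise models."""
    return most_normal(premise_models) <= conclusion_models

def is_cumulative():
    """Check Cut and Cautious Monotony together: adding defeasible
    consequences as premises must leave the most normal models unchanged."""
    for size in range(1, len(WORLDS) + 1):
        for chosen in combinations(WORLDS, size):
            gamma = frozenset(chosen)
            core = most_normal(gamma)
            rest = gamma - core
            for k in range(len(rest) + 1):
                for extra in combinations(rest, k):
                    delta = core | frozenset(extra)
                    if most_normal(delta) != core:
                        return False
    return True

bird, flies = models_of("bird"), models_of("flies")
bird_and_penguin = models_of("bird", "penguin")
print(defeasibly_entails(bird, flies))              # birds normally fly
print(defeasibly_entails(bird_and_penguin, flies))  # defeated by more information
print(is_cumulative())
```

On this toy ranking the relation is nonmonotonic (adding the premise that the bird is a penguin withdraws the conclusion that it flies) yet still cumulative, which is the combination of properties that the desiderata above ask for.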
There are three further conditions that have been much discussed in the literature, but whose status remains controversial: disjunctive rationality, rational monotony, and consistency preservation.

Disjunctive Rationality: If Γ ∪ {p} |~ r, and Γ ∪ {q} |~ r, then Γ ∪ {(p ∨ q)} |~ r.

Rational Monotony: If Γ |~ A, then either Γ ∪ {B} |~ A or Γ |~ ¬B.

Consistency Preservation: If Γ is classically consistent, then so is C(Γ) (the set of defeasible consequences of Γ).

All three properties seem desirable, but they set a very high standard for the defeasible reasoner.

5.3 Default Logic

Ray Reiter's default logic (Reiter 1980; Etherington and Reiter 1983) was part of the first generation of defeasible systems developed in the field of artificial intelligence. The relative ease of computing default extensions has made it one of the more popular systems.

Reiter's system is based on the use of default rules. A default rule consists of three formulas: the prerequisite, the justification, and the consequent. If one accepts the prerequisite of a default rule, and the justification is consistent with all one knows (including what one knows on the basis of the default rules themselves), then one is entitled to accept the consequent. The most popular use of default logic relies solely on normal defaults, in which the justification and the consequent are identical. Thus, a normal default of the form (p; q ∴ q) allows one to infer q from p, so long as q is consistent with one's endpoint (the extension of the default theory).

A default theory consists of a set of formulas (the facts), together with a set of default rules. An extension of a default theory is a fixed point of a particular inferential process: an extension E must be a consistent theory (a consistent set closed under classical consequence) that contains all of the facts of the default theory T, and, in addition, for each normal default (p ⇒ q), if p belongs to E, and q is consistent with E, then q must belong to E also.

Since the consequence relation is defined by a fixed-point condition, there are default theories that have no extension at all, and other theories that have multiple, mutually inconsistent extensions. For example, the theory consisting of the fact p and the pair of defaults (p ; (q & r) ∴ q) and (q ; ¬r ∴ ¬r) has no extension. If the first default is applied, then the second must be, and if the second default is not applied, the first must be. However, the conclusion of the second default contradicts the justification of the first, so the first cannot be applied if the second is. There are many default theories that have multiple extensions. Consider the theory consisting of the facts q and r and the pair of defaults (q ; p ∴ p) and (r ; ¬p ∴ ¬p). One or the other, but not both, defaults must be applied.

Furthermore, there is no guarantee that if E and E′ are both extensions of theory T, then the intersection of E and E′ is also an extension (the intersection of two fixed points need not be itself a fixed point). Default logic is usually interpreted as a credulous system: as a system of logic that allows the reasoner to select any extension of the theory and believe all of the members of that extension, even though many of the resulting beliefs will involve propositions that are missing from other extensions (and may even be contradicted in some of those extensions).

Default logic fails many of the tests for a logical relation that were introduced in the previous section. It satisfies Cut and Full Absorption, but it fails Cautious Monotony (and thus fails to be cumulative). In
addition, it fails Distribution, a serious limitation that rules out reasoning by cases. For example, if one knows that Smith is either Amish or Quaker, and both Quakers and Amish are normally pacifists, one cannot infer that Smith is a pacifist. Default logic also fails to represent Pollock's undercutting defeaters. Finally, default logic does not incorporate any form of the principle of Specificity, the principle that defaults with more specific prerequisites ought, in cases of conflict, to take priority over defaults with less specific prerequisites. Recently, John Horty (Horty 2007) has examined the implications of adding priorities among defaults (in the form of a partial ordering), which would permit the recognition of specificity and other grounds for preferring one default to another.

5.4 Nonmonotonic Logic I and Autoepistemic Logic

In both McDermott-Doyle's Nonmonotonic Logic I and Moore's Autoepistemic Logic (McDermott and Doyle, 1982; Moore, 1985; Konolige 1994), a modal operator M (representing a kind of epistemic possibility) is used. Default rules take the following form: ((p & Mq) → q), that is, if p is true and q is “possible” (in the relevant sense), then q is also true. In both cases, the extension of a theory is defined, as in Reiter's default logic, by means of a fixed-point operation. Mp represents the fact that ¬p does not belong to the extension. For example, in Moore's case, a set Δ is a stable expansion of a theory Γ just in case Δ is the set of classical consequences of the set Γ ∪ {¬Mp: ¬p ∈ Δ} ∪ {Mp: ¬p ∉ Δ}. As in the case of Reiter's default logic, some theories will lack a stable expansion, or have more than one. In addition, these systems fail to incorporate Specificity.

5.5 Circumscription

In circumscription (McCarthy 1982; McCarthy 1986; Lifschitz 1988), one or more predicates of the language are selected for minimization (there is, in addition, a further technical question of which predicates to treat as fixed and which to treat as variable). The nonmonotonic consequences of a theory T then consist of all the formulas that are true in every model of T that minimizes the extensions of the selected predicates. One model M of T is preferred to another, M′, if and only if, for each designated predicate F, the extension of F in M is a subset of the extension of F in M′, and, for some such predicate, the extension in M is a proper subset of the extension in M′.

The relation of circumscriptive consequence has all the desirable metalogical properties. It is cumulative (satisfies Cut and Cautious Monotony), strongly absorptive, and distributive. In addition, it satisfies Consistency Preservation, although not Rational Monotony.

The most critical problem in applying circumscription is that of deciding on what predicates to minimize (there is, in addition, a further technical question about which predicates to treat as fixed and which as variable in extension). Most often what is done is to introduce a family of abnormality predicates ab1, ab2, etc. A default rule then can be written in the form: ∀x((F(x) & ¬abi(x)) → G(x)), where “→” is the ordinary material conditional of classical logic. To derive the consequences of a theory, all of the abnormality predicates are simultaneously minimized. This simple approach fails to satisfy the principle of Specificity, since each default is given its own, independent abnormality predicate, and each is therefore treated with the same priority. It is possible to add special rules for the prioritizing of circumscription, but these are, of necessity, ad hoc and exogenous, rather than a natural result of the definition of the consequence relation.

Circumscription does have the capacity of representing the existence of undercutting defeaters. Suppose that satisfying predicate F provides a prima facie reason for supposing something to be a G, and suppose that we
use the abnormality predicate ab1 in representing this default rule. We can state that the predicate H provides an undercutting defeater to this inference by simply adding the rule: ∀x (H(x) → ab1(x)), stating that all Hs are abnormal in respect number 1.

5.6 Preferential Logics

Circumscription is a special case of a wider class of defeasible logics, the preferential logics (Shoham 1987). In preferential logics, Γ |~ p iff p is true in all of the most preferred models of Γ. In the case of circumscription, the most preferred models are those that minimize the extension of certain predicates, but many other kinds of preference relations can be used instead, so long as the preference relations are transitive and irreflexive (a strict partial order). A structure consisting of a set of models of a propositional or first-order language, together with a preference order on those models, is called a preferential structure. The symbol ≺ shall represent the preference relation. M ≺ M′ means that M is strictly preferred to M′. A most preferred model is one that is minimal in the ordering.

In order to give rise to a cumulative logic (one that satisfies Cut and Cautious Monotony), we must add an additional condition to the preferential structures, a Limit Assumption (also known as the condition of stopperedness or smoothness):

Limit Assumption: Given a theory T, and M, a non-minimal model of T, there exists a model M′ which is preferred to M and which is a minimal model of T.

The Limit Assumption is satisfied if the preferential structure does not contain any infinite descending chains of more and more preferred models, with no minimal member. This is a difficult condition to motivate as natural, but without it, we can find preferential structures that give rise to nonmonotonic consequence relations that fail to be cumulative.

Once we have added the Limit Assumption, it is easy to show that any consequence relation based upon a preferential model is not only cumulative but also supraclassical, strongly absorptive, and distributive. Let's call such logics preferential. In fact, Kraus, Lehmann, and Magidor (Kraus, Lehmann, and Magidor 1990; Makinson 1994, 77; Makinson 2005, PAGE) proved the following representation theorem for preferential logics:

Representation Theorem for Preferential Logics: if |~ is a cumulative, supraclassical, strongly absorptive, and distributive consequence relation (i.e., a preferential relation) then there is a preferential structure M satisfying the Limit Assumption such that for all finite theories T, the set of |~-consequences of T is exactly the set of formulas true in every preferred model of T in M.[3]

There are preferential logics that fail to satisfy consistency preservation, as well as disjunctive rationality and rational monotony:

Disjunctive Rationality: If Γ ∪ {p} |≁ r, and Γ ∪ {q} |≁ r, then Γ ∪ {(p ∨ q)} |≁ r.

Rational Monotony: If Γ |~ p, then either Γ ∪ {q} |~ p or Γ |~ ¬q.

A very natural condition has been found by Kraus, Lehmann, and Magidor that corresponds to Rational Monotony: that of ranked models. (No condition on preference structures has been found that ensures disjunctive rationality without also ensuring rational monotony.) A preferential structure M satisfies the Ranked Models condition just in case there is a function r that assigns an ordinal number to each model in such a way that M ≺ M′ iff r(M) < r(M′). Let's say that a preferential consequence relation
is a rational relation just in case it satisfies Rational Monotony, and that a preferential structure is a rational structure just in case it satisfies the ranked models condition. Kraus, Lehmann, and Magidor (Kraus, Lehmann, and Magidor 1990; Makinson 1994, 71–81) also proved the following representation theorem:

Representation Theorem for Rational Logics: if |~ is a rational consequence relation (i.e., a preferential relation that satisfies Rational Monotony) then there is a preferential structure M satisfying the Limit Assumption and the Ranked Models Assumption such that for all finite theories T, the set of |~-consequences of T is exactly the set of formulas true in every preferred model of T in M.

Freund proved an analogous representation result for preferential logics that satisfy disjunctive rationality, replacing the ranking condition with a weaker condition of filtered models: a filtered model is one such that, for every formula, if two worlds non-minimally satisfy the formula, then there is a world less than both of them that also satisfies the formula (Freund 1993).

5.7 Logics of Extreme Probabilities

Lehmann and Magidor (Lehmann and Magidor 1992) noticed an interesting coincidence: the metalogical conditions for preferential consequence relations correspond exactly to the axioms for a logic of conditionals developed by Ernest W. Adams (Adams 1975).[4] Adams's logic was based on a conditional, ⇒, intended to represent a relation of very high conditional probability: (p ⇒ q) means that the conditional probability Pr(q/p) is extremely close to 1. Adams used the standard delta-epsilon definition of the calculus to make this idea precise. Let us suppose that a theory T consists of a set of conditional-free formulas (the facts) and a set of probabilistic conditionals. A conclusion p follows defeasibly from T if and only if every probability function satisfies the following condition:

For every δ, there is an ε such that, if every fact in T is assigned a probability at least as high as 1 – ε, and every conditional in T is assigned a conditional probability at least as high as 1 – ε, then the probability of the conclusion p is at least 1 – δ.

The resulting defeasible consequence relation is a preferential relation. (It need not, however, be consistency-preserving.) This consequence relation also corresponds to a relation, 0-entailment, defined by Judea Pearl (Pearl 1990), as the common core to all defeasible consequence relations.

Lehmann and Magidor (1992) proposed a variation on Adams's idea. Instead of using the delta-epsilon construction, they made use of nonstandard measure theory, that is, a theory of probability functions that can take values that are infinitesimals (infinitely small numbers). In addition, instead of defining the consequence relation by quantifying over all probability functions, Lehmann and Magidor assume that we can select a single probability function (representing something like the ideally rational, or objective probability). On their construction, a conclusion p follows from T just in case the probability of p is infinitely close to 1, on the assumption that the probabilities assigned to members of T are infinitely close to 1. Lehmann and Magidor proved that the resulting consequence relation is always not only preferential but also rational. The logic defined by Lehmann and Magidor also corresponds exactly to the theory of Popper functions, another extension of probability theory designed to handle cases of conditioning on propositions with infinitesimal probability (see Harper 1976; van Fraassen 1995; Hawthorne 1998). For a brief discussion of Popper functions, see the following
supplementary document:

Popper Functions

Arló Costa and Parikh, using van Fraassen's account (van Fraassen, 1995) of primitive conditional probabilities (a variant of Popper functions), proved a representation result for both finite and infinite languages (Arló Costa and Parikh, 2005). For infinite languages, they assumed an axiom of countable additivity for probabilities.

Kraus, Lehmann, and Magidor proved that, for every preferential consequence relation |~ that is probabilistically admissible,[5] there is a unique rational consequence relation |~* that minimally extends it (that is, the intersection of all the rational consequence relations extending |~ is also a rational consequence relation). This relation, |~*, is called the rational closure of |~. To find the rational closure of a preferential relation, one can perform the following operation on a preferential structure that supports that relation: assign to each model in the structure the smallest number possible, respecting the preference relation. Judea Pearl also proposed the very same idea under the name 1-entailment or System Z (Pearl 1990).

A critical advantage to the Lehmann-Magidor-Pearl 1-entailment system over Adams's epsilon-entailment lies in the way in which 1-entailment handles irrelevant information. Suppose, for example, that we know that birds fly (B ⇒ F), Tweety is a bird (B), and Nemo is a whale (W). These premises do not epsilon-entail F (that Tweety flies), since there is no guarantee that a probability function will assign a high probability to F, given the conjunction of B and W. In contrast, 1-entailment does give us the conclusion F.

Moreover, 1-entailment satisfies a condition of weak independence of defaults: conditionals with logically unrelated antecedents can “fire” independently of each other: one can warrant a conclusion even though we are given an explicit exception to the other. Consider, for example, the following case: birds fly (B ⇒ F), Tweety is a bird that doesn't fly (B & ¬F), whales are large (W ⇒ L), and Nemo is a whale (W). These premises 1-entail that Nemo is large (L). In addition, 1-entailment automatically satisfies the principle of Specificity: conditionals with more specific antecedents are always given priority over those with less specific antecedents.

There is another form of independence, strong independence, that even 1-entailment fails to satisfy. If we are given one exception to a rule involving a given antecedent, then we are unable to use any conditional with the same antecedent to derive any conclusion whatsoever. Suppose, for example, that we know that birds fly (B ⇒ F), Tweety is a bird that doesn't fly (B & ¬F), and birds lay eggs (B ⇒ E). Even under 1-entailment, the conclusion that Tweety lays eggs (E) fails to follow. This failure to satisfy Strong Independence is also known as the Drowning Problem (since all conditionals with the same antecedent are “drowned” by a single exception).

A consensus is growing that the Drowning Problem should not be “solved” (see Pelletier and Elio 1994; Wobcke 1995, 85; Bonevac, 2003, 461–462). Consider the following variant on the problem: birds fly, Tweety is a bird that doesn't fly, and birds have strong forelimb muscles. Here it seems we should refrain from concluding that Tweety has strong forelimb muscles, since there is reason to doubt that the strength of wing muscles is causally (and hence, probabilistically) independent of capacity for flight. Once we know that Tweety is an exceptional bird, we should refrain from applying other conditionals with Tweety is a bird as their antecedents, unless we know that these conditionals are independent of flight, that is, unless we know that the conditional with the stronger antecedent, Tweety is a non-flying bird, is also true.
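As a rough illustration of how 1-entailment behaves, the following Python sketch computes a System Z style ranking by brute force over truth assignments. It is a sketch under stated assumptions rather than a definitive implementation: the language is propositional, the facts are treated as a single conjunction, the function names (z_ranks, kappa, one_entails) are hypothetical, and the penguin atom P is introduced here purely to illustrate Specificity.

```python
from itertools import product

# Atoms: bird, flies, whale, lays eggs, penguin (P is an illustrative addition).
ATOMS = ("B", "F", "W", "E", "P")

def worlds():
    """Yield every truth assignment to ATOMS as a dict."""
    for values in product((False, True), repeat=len(ATOMS)):
        yield dict(zip(ATOMS, values))

def z_ranks(defaults):
    """Rank defaults by tolerance: a default (a, c) is tolerated by a set of
    defaults if some world verifies it (a and c true) and violates none of them."""
    ranks, remaining, level = {}, list(range(len(defaults))), 0
    while remaining:
        tolerated = [
            i for i in remaining
            if any(
                defaults[i][0](w) and defaults[i][1](w)
                and all(not defaults[j][0](w) or defaults[j][1](w) for j in remaining)
                for w in worlds()
            )
        ]
        if not tolerated:
            raise ValueError("the default set is inconsistent")
        for i in tolerated:
            ranks[i] = level
        remaining = [i for i in remaining if i not in tolerated]
        level += 1
    return ranks

def kappa(world, defaults, ranks):
    """Abnormality of a world: 0 if no default is violated,
    otherwise 1 + the highest rank among the violated defaults."""
    violated = [ranks[i] for i, (a, c) in enumerate(defaults) if a(world) and not c(world)]
    return 1 + max(violated) if violated else 0

def one_entails(defaults, facts, conclusion):
    """True iff every most normal world satisfying `facts` satisfies `conclusion`."""
    ranks = z_ranks(defaults)

    def best(test):
        return min((kappa(w, defaults, ranks) for w in worlds() if test(w)),
                   default=float("inf"))

    return best(lambda w: facts(w) and conclusion(w)) < \
        best(lambda w: facts(w) and not conclusion(w))

birds_fly = (lambda w: w["B"], lambda w: w["F"])
birds_lay_eggs = (lambda w: w["B"], lambda w: w["E"])
penguins_are_birds = (lambda w: w["P"], lambda w: w["B"])
penguins_do_not_fly = (lambda w: w["P"], lambda w: not w["F"])

# Irrelevant information: from B => F and the facts B & W, conclude F.
print(one_entails([birds_fly], lambda w: w["B"] and w["W"], lambda w: w["F"]))  # True

# Drowning Problem: from B => F, B => E and the facts B & not-F, E does not follow.
print(one_entails([birds_fly, birds_lay_eggs],
                  lambda w: w["B"] and not w["F"], lambda w: w["E"]))  # False

# Specificity: the more specific penguin default overrides B => F.
print(one_entails([birds_fly, penguins_are_birds, penguins_do_not_fly],
                  lambda w: w["P"], lambda w: not w["F"]))  # True
```

On this encoding, a world that violates one default of the lowest rank counts as no more normal than a world that violates several of them, which is why the egg-laying conclusion is lost once Tweety is known to be a non-flying bird.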
Nonetheless, several proposals have been made for securing strong independence and solving the Drowning Problem. Geffner and Pearl (Geffner and Pearl 1992) proposed a system of conditional entailment, a variant of circumscription, in which the preference relation on models is defined in terms of the sets of defaults that are satisfied. This enables Geffner and Pearl to satisfy both the Specificity principle and Strong Independence. Another proposal is the maximum entropy approach (Pearl 1988, 490–496; Goldszmidt, Morris and Pearl, 1993; Pearl 1990). A theory T, consisting of defaults Δ and facts F, entails p just in case the probability of p, conditional on F, approaches 1 as the probabilities associated with Δ approach 1, using the entropy-maximizing[6] probability function that respects the defaults in Δ. The maximum-entropy approach satisfies both Specificity and Strong Independence.

Every attempt to solve the drowning problem (including conditional entailment and the maximum-entropy approach) comes at the cost of sacrificing cumulativity. Securing strong independence makes the systems very sensitive to the exact form in which the default information is stored. Consider, for example, the following case: Swedes are (normally) fair, Swedes are (normally) tall, Jon is a short Swede. Conditional entailment and maximum-entropy entailment would permit the conclusion that Jon is fair in this case. However, if we replace the first two default conditionals by the single default, Swedes are normally both tall and fair, then the conclusion no longer follows, despite the fact that the new conditional is logically equivalent to the conjunction of the two original conditionals.

Applying the logic of extreme probabilities to real-world defeasible reasoning generates an obvious problem, however. We know perfectly well that, in the case of the default rules we actually use, the conditional probability of the conclusion on the premises is nowhere near 1. For example, the probability that an arbitrary bird can fly is certainly not infinitely close to 1. This problem resembles that of using idealizations in science, such as frictionless planes and ideal gases. It seems reasonable to think that, in deploying the machinery of defeasible logic, we indulge in the degree of make-believe necessary to make the formal models applicable. Nonetheless, this is clearly a problem warranting further attention.

5.8 Fully Expressive Languages: Conditional Logics and Higher-Order Probabilities

With relatively few exceptions, the logical approaches to defeasible reasoning developed so far put severe restrictions on the logical form of propositions included in a set of premises. In particular, they require the default conditional operator, ⇒, to have wide scope in every formula in which it appears. Default conditionals are not allowed to be nested within other default conditionals, or within the scope of the usual Boolean operators of propositional logic (negation, conjunction, disjunction, material conditional). This is a very severe restriction and one that is quite difficult to defend. For example, in representing undercutting defeaters, it would be very natural to use a negated default conditional of the form ¬((p & q) ⇒ r) to signify that q defeats p as a prima facie reason for r. In addition, it seems plausible that one might come to gain disjunctive default information: for example, that either customers are gullible or salesmen are wily.

Asher and Pelletier (Asher and Pelletier 1997) have argued that, when translating generic sentences in natural language, it is essential that we be allowed to nest default conditionals. For example, consider the following English sentences:

Close friends are (normally) people who (normally) trust one another.

People who (normally) rise early (normally) go to bed early.
In the first case, a conditional is nested within the consequent of another conditional:

∀x ∀y(Friend(x,y) ⇒ ∀z (Time(z) ⇒ Trust(x,y,z)))

In the second case, we seem to have conditionals nested within both the antecedent and the consequent of a third conditional, something like:

∀x (Person(x) → (∀y(Day(y) ⇒ Rise-early(x,y)) ⇒ ∀z(Day(z) ⇒ Bed-early(x,z))))

This nesting of conditionals can be made possible by borrowing and modifying the semantics of the subjunctive or counterfactual conditional, developed by Robert Stalnaker and David K. Lewis (Lewis 1973). For an axiomatization of Lewis's conditional logic, see the following supplementary document:

David Lewis's Conditional Logic

The only modification that is essential is to drop the condition of Centering (both strong and weak), a condition that makes modus ponens (affirming the antecedent) logically valid. If the conditional ⇒ is to represent a default conditional, we do not want modus ponens to be valid: we do not want (p ⇒ q) and p to entail q classically (i.e., monotonically). If Centering is dropped, the resulting logic can be made to correspond exactly to either a preferential or a rational defeasible entailment relation. For example, the condition of Rational Monotony is the exact counterpart of the CV axiom of Lewis's logic:

CV: (p ⇒ q) → [((p & r) ⇒ q) ∨ (p ⇒ ¬r)]

Something like this was proposed first by James Delgrande (Delgrande 1987), and the idea has been most thoroughly developed by Nicholas Asher and his collaborators (Asher and Morreau 1991; Asher 1995; Asher and Bonevac 1996; Asher and Mao 2001) under the name Commonsense Entailment.[7] Commonsense Entailment is a preferential (although not a rational) consequence relation, and it automatically satisfies the Specificity principle. It permits the arbitrary nesting of default conditionals within other logical operators, and it can be used to represent undercutting defeaters, through the use of negated defaults (Asher and Mao 2001).

The models of Commonsense Entailment differ significantly from those of preferential logic and the logic of extreme probabilities. Instead of having structures that contain sets of models of a standard, default-free language, a model of the language of Commonsense Entailment includes a set of possible worlds, together with a function that assigns a standard interpretation (a model of the default-free language) to each world. In addition, there is a function * that assigns, to each pair consisting of a world w and a set of worlds (proposition) A, a set of worlds *(w,A). The set *(w,A) is the set of most normal A-worlds, from the perspective of w. A default conditional (p ⇒ q) is true in a world w (in such a model) just in case all of the most normal p worlds (from w's perspective) are worlds in which q is also true. Since we can assign truth-conditions to each such conditional, we can define the truth of nested conditionals, whether the conditionals are nested within Boolean operators or within other conditionals. Moreover, we can define both a classical, monotonic consequence relation for this class of models and a defeasible, nonmonotonic relation (in fact, the nonmonotonic consequence relation can be defined in a variety of ways). We can then distinguish between a default conditional's following with logical necessity from a default theory and its following defeasibly from that same theory. Contraposition, for example (inferring (¬q ⇒ ¬p) from (p ⇒ q)), is not logically valid for default conditionals, but it might be a defeasibly correct inference.[8]

The one critical drawback to Commonsense Entailment, when compared
to the logic of extreme probabilities, is that it lacks a single, clear standard of normativity. The truth-conditions of the default conditional and the definition of nonmonotonic consequence can be fine-tuned to match many of our intuitions, but at the end of the day, the theory of Commonsense Entailment offers no simple answer to the question of what its conditional or its consequence relation is supposed (ideally) to represent.

Logics of extreme probability (beginning with the work of Ernest Adams) did not permit the nesting of default conditionals for this reason: the conditionals were supposed to represent something like subjective conditional probabilities of the agent, to which the agent was supposed to have perfect introspective access. Consequently, it made no sense to nest these conditionals within disjunctions (as though the agent couldn't tell which disjunct represented his actual probability assignment) or within other conditionals (since the subjective probability of a subjective probability is always trivial: either exactly 1 or exactly 0). However, there is no reason why the logic of extreme probabilities couldn't be given a different interpretation, with (p ⇒ q) representing something like the claim that the objective probability of q, conditional on p, is infinitely close to 1. In this case, it makes perfect sense to nest such statements of objective conditional probability within Boolean operators (either the probability of q on p is close to 1, or the probability of r on s is close to 1), or within operators of objective probability (the objective probability that the objective probability of p is close to 1 is itself close to 1). What is required in the latter case is a theory of higher-order probabilities.

Fortunately, such a theory of higher-order probabilities is available (see Skyrms 1980; Gaifman 1988). The central principle of this theory is Miller's principle. For a description of the models of the logic of extreme, higher-order probability, see the following supplementary document:

Models of Higher-Order Probability

The following proposition is logically valid in this logic, representing the presence of a defeasible modus ponens rule:

((p & (p ⇒ q)) ⇒ q)

This system can be the basis for a family of rational nonmonotonic consequence relations that include the Adams ε-entailment system as a proper part (see Koons 2000, 298–319).

5.9 Objections to Nonmonotonic Logic

Confusing Logic and Epistemology?

In an early paper (Israel 1980), David Israel raised a number of objections to the very idea of nonmonotonic logic. First, he pointed out that the nonmonotonic consequences of a finite theory are typically not semi-decidable (recursively enumerable). This remains true of most current systems, but it is also true of second-order logic, infinitary logic, and a number of other systems that are now accepted as logical in nature.

Secondly, and more to the point, Israel argued that the concept of nonmonotonic logic evinces a confusion between the rules of logic and rules of inference. In other words, Israel accused defenders of nonmonotonic logic of confusing a theory of defeasible inference (a branch of epistemology) with a theory of genuine consequence relations (a branch of logic). Inference is nonmonotonic, but logic (according to Israel) is essentially monotonic.

The best response to Israel is to point out that, like deductive logic, a theory of nonmonotonic or defeasible consequence has a number of applications besides that of guiding actual inference. Defeasible logic can be used as part of a theory of scientific explanation, and it can be used in hypothetical reasoning, as in planning. It can be used to interpret implicit
features of stories, even fantastic ones, so long as it is clear which actual default rules to suspend. Thus, defeasible logic extends far beyond the boundaries of the theory of epistemic justification. Moreover, as we have seen, nonmonotonic consequence relations (especially the preferential ones) share a number of very significant formal properties with classical consequence, warranting the inclusion of them all in a larger family of logics. From this perspective, classical deductive logic is simply a special case: the study of indefeasible consequence.

It would be reasonable, however, to demand that a system of nonmonotonic logic satisfy the following special deduction theorem:

{p} |~ q iff ∅ |~ (p ⇒ q)

This is certainly possible. The special deduction theorem holds trivially if we define {p} |~ q as ∅ ⊨ (p ⇒ q); that is, {p} defeasibly entails q if and only if (by definition) (p ⇒ q) is a theorem of the classical conditional logic.[9]
We need to make one final assumption: that shooting the victim with a loaded gun results in death (not being alive):

((Alive(s1) & Loaded(s1)) → ¬Alive(s2))

Intuitively, we should be able to derive the defeasible conclusion that the victim is still alive after waiting, but dead after waiting and shooting: Alive(s1) & ¬Alive(s2). However, none of the nonmonotonic logics described above give us this result, since each of the three instances of the law of inertia can be violated: by the victim's inexplicably dying while we are waiting, by the gun's miraculously becoming unloaded while we are waiting, or by the victim's dying as a result of the shooting. Nothing introduced into nonmonotonic logic up to this point provides us with a basis for preferring the second exception to the law of inertia to the first or third. What's missing is a recognition of the importance of causal structure to defeasible consequence.[10]

There are several even simpler examples that illustrate the need to include explicitly causal information in the input to defeasible reasoning. Consider, for instance, this problem of Judea Pearl's (Pearl 1988): if the sprinkler is on, then normally the sidewalk is wet, and, if the sidewalk is wet, then normally it is raining. However, we should not infer that it is raining from the fact that the sprinkler is on. (See Lifschitz 1990 and Lin and Reiter 1994 for additional examples of this kind.) Similarly, if we also know that if the sidewalk is wet, then it is slippery, we should be able to infer that the sidewalk is slippery if the sprinkler is on and it is not raining.

6.2 Causally Grounded Independence Relations

Hans Reichenbach, in his analysis of the interaction of causality and probability (Reichenbach 1956), observed that the immediate causes of an event probabilistically screen off from that event any other event that is not causally posterior to it. This means that, given the immediate causal antecedents of an event, the occurrence of that event is rendered probabilistically independent of any information about non-posterior events. When this insight is applied to the nonmonotonic logic of extreme probabilities, we can use causal information to identify which defaults function independently of others: that is, we can decide when the fact that one default conditional has an exception is irrelevant to the question of whether a second conditional is also violated (see Koons 2000, 320–323). In effect, we have a selective version of Independence of Defaults that is grounded in causal information, enabling us to dissolve the Drowning Problem.

For example, in the case of Pearl's sprinkler, since rain is causally prior to the sidewalk's being wet, the causal structure of the situation does not ensure that the rain is probabilistically independent of whether the sprinkler is on, given the fact that the sidewalk is wet. That is, we have no grounds for thinking that the probability of rain, conditional on the sidewalk's being wet, is identical to the probability of rain, conditional on the sidewalk's being wet and the sprinkler's being on (presumably, the former is higher than the latter). This failure of independence prevents us from using the (Wet ⇒ Rain) default, in the presence of the additional fact that the sprinkler is on.

In the case of the Yale shooting problem, the state of the gun's being loaded in the aftermath of waiting, Loaded(s1), has as its only causal antecedent the fact that the gun is loaded in s0. The fact of Loaded(s0) screens off the fact that the victim is alive in s0 from the conclusion Loaded(s1). Similarly, the fact that the victim is alive in s0 screens off the fact that the gun is loaded in s0 from the conclusion that the victim is still alive in s1. In contrast, the fact that the victim is alive at s1 does not screen off the fact that the gun is loaded at s1 from the conclusion that the victim is still alive at s2. Thus, we can assign higher priority to the law of inertia with respect to both Loaded and Alive at s0, and we can conclude that the
victim is alive and the gun is loaded at s1. The causal law for shooting for all non-exempt atomic facts, while in effect circumscribing the
then gives us the desired conclusion, namely, that the victim is dead at s2. extension of the causally explained. This approach has been extended and
applied by Giunchiglia, Lee, Lifschitz, McCain and Turner [2004],
6.3 Causal Circumscription Ferraris [2007], and Ferraris, Lee, Lierler, Lifschitz and Yang [2012].
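The failure of independence in Pearl's sprinkler example (Section 6.2) can be illustrated numerically. The following is only a sketch under stated assumptions: the prior probabilities, the conditional table for the sidewalk's being wet, and all names (P_RAIN, P_SPRINKLER, P_WET, joint, prob) are hypothetical choices made for illustration and do not come from Pearl or from this entry. Rain and the sprinkler are modeled as independent causes of a wet sidewalk.

```python
from itertools import product

# Assumed, purely illustrative numbers (not taken from the entry).
P_RAIN = 0.2
P_SPRINKLER = 0.3
# Assumed probability that the sidewalk is wet, given (rain, sprinkler).
P_WET = {
    (True, True): 0.99,
    (True, False): 0.9,
    (False, True): 0.9,
    (False, False): 0.01,
}


def joint(rain, sprinkler, wet):
    """Joint probability of one complete assignment to the three variables."""
    p = (P_RAIN if rain else 1 - P_RAIN)
    p *= (P_SPRINKLER if sprinkler else 1 - P_SPRINKLER)
    p_wet = P_WET[(rain, sprinkler)]
    return p * (p_wet if wet else 1 - p_wet)


def prob(event, given=lambda r, s, w: True):
    """Conditional probability of `event` given `given`, by enumeration."""
    worlds = list(product([True, False], repeat=3))
    den = sum(joint(*w) for w in worlds if given(*w))
    num = sum(joint(*w) for w in worlds if given(*w) and event(*w))
    return num / den


# Probability of rain given a wet sidewalk, without and with the sprinkler fact.
p_rain_given_wet = prob(lambda r, s, w: r, lambda r, s, w: w)
p_rain_given_wet_sprinkler = prob(lambda r, s, w: r, lambda r, s, w: w and s)
# Rain and sprinkler are independent before we condition on the wet sidewalk.
p_rain_given_sprinkler = prob(lambda r, s, w: r, lambda r, s, w: s)

print(p_rain_given_wet, p_rain_given_wet_sprinkler, p_rain_given_sprinkler)
```

With these assumed numbers, the probability of rain given a wet sidewalk is higher than the probability of rain given a wet sidewalk and a running sprinkler, which is the asymmetry that blocks the (Wet ⇒ Rain) default once the sprinkler is known to be on.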
John Mylopoulos and Ray Reiter (eds.), San Mateo, Calif.: Morgan Kaufmann.
Asher, N., and F.J. Pelletier, 1997, “Generics and Defaults”, in Handbook of Logic and Language, J. van Benthem and A. ter Meulen (eds.), Amsterdam: Elsevier.
Baker, A. B., 1988, “A simple solution to the Yale shooting problem”, in Proceedings of the First International Conference on Knowledge Representation and Reasoning, Ronald J. Brachman, Hector Levesque and Ray Reiter (eds.), San Mateo, Calif.: Morgan Kaufmann.
Bamber, Donald, 2000, “Entailment with Near Surety of Scaled Assertions of High Conditional Probability”, Journal of Philosophical Logic, 29: 1–74.
Bochman, Alexander, 2001, A Logical Theory of Nonmonotonic Inference and Belief Change, Berlin: Springer.
Bodanza, Gustavo A. and F. Tohmé, 2005, “Local Logics, Non-Monotonicity and Defeasible Argumentation”, Journal of Logic, Language and Information, 14: 1–12.
Bonevac, Daniel, 2003, Deduction: Introductory Symbolic Logic, Malden, Mass.: Blackwell, 2nd edition.
Carnap, Rudolf, 1962, Logical Foundations of Probability, Chicago: University of Chicago Press.
Carnap, Rudolf and Richard C. Jeffrey, 1980, Studies in inductive logic and probability, Berkeley: University of California Press.
Cartwright, Nancy, 1983, How the laws of physics lie, Oxford: Clarendon Press.
Chisholm, Roderick, 1957, Perceiving, Princeton: Princeton University Press.
–––, 1963, “Contrary-to-Duty Imperatives and Deontic Logic”, Analysis, 24: 33–36.
–––, 1966, Theory of Knowledge, Englewood Cliffs: Prentice-Hall.
Delgrande, J. P., 1987, “A first-order conditional logic for prototypical properties”, Artificial Intelligence, 33: 105–130.
Etherington, D. W. and R. Reiter, 1983, “On Inheritance Hierarchies and Exceptions”, in Proceedings of the National Conference on Artificial Intelligence, Los Altos, Calif.: Morgan Kaufmann.
Ferraris, Paolo, 2007, “A Logic Programming Characterization of Causal Theories”, Proceedings of the Twentieth International Joint Conference on Artificial Intelligence, San Francisco, Calif.: Morgan Kaufmann.
Ferraris, Paolo, with J. Lee, Y. Lierler, V. Lifschitz, and F. Yang, 2012, “Representing first-order causal theories by logic programs”, Theory and Practice of Logic Programming, 12(3): 383–412.
Freund, M., with D. Lehmann, and D. Makinson, 1990, “Canonical extensions to the infinite case of finitary nonmonotonic inference relations”, in Proceedings of the Workshop on Nonmonotonic Reasoning, G. Brewka and H. Freitag (eds.), Sankt Augustin: Gesellschaft für Mathematik und Datenverarbeitung mbH.
Freund, M., 1993, “Injective models and disjunctive relations”, Journal of Logic and Computation, 3: 231–347.
Gabbay, D. M., 1985, “Theoretical foundations for non-monotonic reasoning in expert systems”, in Logics and Models of Concurrent Systems, K. R. Apt (ed.), Berlin: Springer-Verlag.
Gaifman, Haim, 1988, “A theory of higher-order probabilities”, in Causation, Chance and Credence, Brian Skyrms and William Harper (eds.), London, Ontario: University of Western Ontario Press.
Gärdenfors, P., 1978, “Conditionals and Changes of Belief”, Acta Fennica, 30: 381–404.
–––, 1986, “Belief revisions and the Ramsey test for conditionals”, Philosophical Review, 95: 81–93.
Geffner, H. A., and J. Pearl, 1992, “Conditional entailment: bridging two approaches to default reasoning”, Artificial Intelligence, 53: 209–244.
Gelfond, Michael and Lifschitz, Vladimir, 1988, “The stable model semantics for logic programming”, Logic Programming: Proceedings of the Fifth International Conference and Symposium, Robert A. Kowalski and Kenneth A. Bowen (eds.), Cambridge, Mass.: The MIT Press, pp. 1070–1080.
Gilio, Angelo, 2005, “Probabilistic Logic under Coherence, Conditional Interpretations, and Default Reasoning”, Synthese, 146: 139–152.
Ginsberg, M. L., 1987, Readings in Nonmonotonic Reasoning, San Mateo, Calif.: Morgan Kaufmann.
Giunchiglia, E., with J. Lee, V. Lifschitz, N. McCain, and H. Turner, 2004, “Nonmonotonic Causal Theories”, Artificial Intelligence, 153: 49–104.
Goldszmidt, M. and J. Pearl, 1992, “Rank-Based Systems: A Simple Approach to Belief Revision, Belief Update, and Reasoning about Evidence and Action”, in Proceedings of the Third International Conference on Principles of Knowledge Representation and Reasoning, San Mateo, Calif.: Morgan Kaufmann.
Goldszmidt, M., with P. Morris, and J. Pearl, 1993, “A maximum entropy approach to nonmonotonic reasoning”, IEEE Transactions on Pattern Analysis and Machine Intelligence, 15: 220–232.
Grove, A., 1988, “Two modellings for theory change”, Journal of Philosophical Logic, 17: 157–170.
Hanks, Steve and Drew McDermott, 1987, “Nonmonotonic Logic and Temporal Projection”, Artificial Intelligence, 33: 379–412.
Harper, W. L., 1976, “Rational Belief Change, Popper Functions and Counterfactuals”, in Foundations of Probability Theory, Statistical Inference, and Statistical Theories of Science, Volume I, Dordrecht: Reidel.
Hawthorne, James, 1998, “On the Logic of Nonmonotonic Conditionals and Conditional Probabilities: Predicate Logic”, Journal of Philosophical Logic, 27: 1–34.
Horty, J. F., with R.H. Thomason, and D.S. Touretzky, 1990, “A sceptical theory of inheritance in nonmonotonic semantic networks”, Artificial Intelligence, 42: 311–348.
Horty, John, 2007, “Defaults with Priorities”, Journal of Philosophical Logic, 36: 367–413.
Israel, David, 1986, “What's Wrong with Non-monotonic Logic”, in Proceedings of the First National Conference on Artificial Intelligence, Palo Alto, Calif.: AAAI.
Konolige, Kurt, 1994, “Autoepistemic Logic”, in Handbook of Logic in Artificial Intelligence and Logic Programming, Volume III: Nonmonotonic Reasoning and Uncertain Reasoning, D. M. Gabbay, C. J. Hogger, and J. A. Robinson (eds.), Oxford: Clarendon Press.
Koons, Robert C., 2000, Realism Regained: An Exact Theory of Causation, Teleology and the Mind, New York: Oxford University Press.
–––, 2001, “Defeasible Reasoning, Special Pleading and the Cosmological Argument: Reply to Oppy”, Faith and Philosophy, 18: 192–203.
Kraus, S., with D. Lehmann, and M. Magidor, 1990, “Nonmonotonic Reasoning, Preferential Models and Cumulative Logics”, Artificial Intelligence, 44: 167–207.
Kyburg, Henry E., 1983, Epistemology and Inference, Minneapolis: University of Minnesota Press.
–––, 1990, Knowledge Representation and Defeasible Reasoning, Dordrecht: Kluwer.
Lascarides, Alex and Nicholas Asher, 1993, “Temporal Interpretation, Discourse Relations and Commonsense Entailment”, Linguistics and Philosophy, 16: 437–493.
Lehmann, D., and M. Magidor, 1992, “What does a conditional knowledge base entail?”, Artificial Intelligence, 55: 1–60.
Levesque, H., 1990, “A study in autoepistemic logic”, Artificial Intelligence, 42: 263–309.
Lewis, David K., 1973, Counterfactuals, Cambridge, Mass.: Harvard
Kaufmann.
–––, 1990, “System Z: A Natural Ordering of Defaults with Tractable Applications to Default Reasoning”, Proceedings of the Third Conference on Theoretical Aspects of Reasoning about Knowledge, Rohit Parikh (ed.), San Mateo, Calif.: Morgan Kaufmann.
Pelletier, F. J. and R. Elio, “On Relevance in Nonmonotonic Reasoning: Some Empirical Studies”, in R. Greiner & D. Subramanian (eds), Relevance: AAAI 1994 Fall Symposium Series, Palo Alto: AAAI Press.
Pollock, John L., 1967, “Criteria and our knowledge of the material world”, Philosophical Review, 76: 28–62.
–––, 1970, “The structure of epistemic justification”, American Philosophical Quarterly (Monograph Series), 4: 62–78.
–––, 1974, Knowledge and Justification, Princeton: Princeton University Press.
–––, 1987, “Defeasible Reasoning”, Cognitive Science, 11: 481–518.
–––, 1995, Cognitive Carpentry, Cambridge, Mass.: MIT Press.
Quine, Willard van Orman, and J.S. Ullian, 1982, The Web of Belief, New York: Random House.
Reiter, Ray, 1980, “A logic for default reasoning”, Artificial Intelligence, 13: 81–137.
Ross, David, 1930, The Right and the Good, Oxford: Oxford University Press.
Rott, Hans, 1989, “Conditionals and Theory Change: Revisions, Expansions and Additions”, Synthese, 81: 91–113.
Schlechta, Karl, 1997, Nonmonotonic Logics: Basic Concepts, Results and Techniques, Berlin: Springer-Verlag.
Shoham, Yoav, 1987, “A Semantical Approach to Nonmonotonic Logic”, in Proceedings of the Tenth International Conference on Artificial Intelligence, John McDermott (ed.), Los Altos, Calif.: Morgan Kaufmann.
Skyrms, Brian, 1980, “Higher order degrees of belief”, in Prospects for Pragmatism, Hugh Mellor (ed.), Cambridge: Cambridge University Press.
Spohn, Wolfgang, 1988, “Ordinal Conditional Functions”, in Causation, Decision, Belief Change and Statistics, Volume III, W. L. Harper and B. Skyrms (eds.), Dordrecht: Kluwer.
–––, 2002, “A Brief Comparison of Pollock's Defeasible Reasoning and Ranking Functions”, Synthese, 13: 39–56.
Tohmé, Fernando, with Claudio Delrieux and Otávio Bueno, 2011, “Defeasible Reasoning + Partial Models: A Formal Framework for the Methodology of Research Programs”, Foundations of Science, 16: 47–65.
Txurruka, I. and N. Asher, 2008, “A discourse-based approach to Natural Language Disjunction (revisited)”, in M. Aurnague, K. Korta and J. Lazzarabal (eds.), Language, Representation and Reasoning, University of the Basque Country Press.
van Fraassen, Bas, 1995, “Fine-grained opinion, probability, and the logic of folk belief”, Journal of Philosophical Logic, 24: 349–377.
Vieu, L., with M. Bras, N. Asher, and M. Aurnague, 2005, “Locating adverbials in discourse”, Journal of French Language Studies, 15(2): 173–193.
Wobcke, Wayne, 1995, “Belief Revision, Conditional Logic and Nonmonotonic Reasoning”, Notre Dame Journal of Formal Logic, 36: 55–103.
Horty, Thomason, and Touretzky define the relation of support between graphs (cognitive states) and paths (assertions) by mathematical induction on the degree of the path. Direct links (paths of length one) are always supported by the graph.

1. If σ is a positive path, x → σ¹ → u → y, then G supports σ iff:
   1. G supports path x → σ¹ → u.
   2. u → y is a direct link in G.
   3. The negative link x →⁄ y does not belong to G.
   4. For all v, τ such that G supports x → τ → v, with the negative link v →⁄ y in G, there exist z, τ¹, τ² such that z → y is in G, and either z = x, or G supports the path x → τ¹ → z → τ² → v.
2. If σ is a negative path, x → σ¹ → u →⁄ y, then G supports σ iff:
   1. G supports path x → σ¹ → u.
   2. u →⁄ y is a direct negative link in G.
   3. The positive link x → y does not belong to G.
   4. For all v, τ such that G supports x → τ → v, with the positive link v → y in G, there exist z, τ¹, τ² such that z →⁄ y is in G, and either z = x, or G supports the path x → τ¹ → z → τ² → v.

The definition ensures that each potentially conflicting path be preempted by a path with a specificity-based priority.

Supplement to Defeasible Reasoning

AGM Postulates

Where K is a belief state, K*A represents the set of beliefs resulting from revising K with new belief A.

(K*1) K*A is closed under logical consequence.
(K*2) A belongs to K*A.
(K*3) K*A is a subset of the logical closure of K ∪ {A}.
(K*4) If ¬A does not belong to K, then the closure of K ∪ {A} is a subset of K*A.
(K*5) If K*A is logically inconsistent, then either K is inconsistent, or A is.
(K*6) If A and B are logically equivalent, then K*A = K*B.
(K*7) K*(A & B) is a subset of the logical closure of K*A ∪ {B}.
(K*8) If ¬B does not belong to K*A, then the logical closure of K*A ∪ {B} is a subset of K*(A & B).

Supplement to Defeasible Reasoning

David Lewis's Conditional Logic

This is Donald Nute's axiom system (Nute 1984, 396–399) for David K. Lewis's preferred logic for the counterfactual conditional, VC (Lewis 1973, 132):

The last two axioms, MP and CS, correspond to weak and strong centering, respectively (in effect, the stipulation that the actual world is one of the most, or uniquely the most, normal of all worlds). For nonmonotonic logic, these conditions, and these two axioms, must be dropped. The fifth axiom, CV, is the object-language correlate of Rational Monotony.

[(p ⇒ ¬(q ⇒ r)) & ¬((p & q) ⇒ ⊥)] ↔ ¬((p & q) ⇒ r)

In both cases, “p” must be a Boolean combination of ⇒-conditionals. The variables “q” and “r” may be replaced by any two formulas. The formula ¬((p & q) ⇒ ⊥) expresses the joint possibility of p and q (in the sense that they don't defeasibly imply a logical absurdity, ⊥).

Supplement to Defeasible Reasoning

Popper Functions

A Popper function is a function from pairs of propositions to real numbers that satisfies the following conditions:

1. For some D, E, P[D|E] ≠ 1.
2. P[A|A] = 1.
3. P[A|(C & B)] = P[A|(B & C)].
4. P[(B & A)|C] = P[(A & B)|C].
5. P[A|B] + P[¬A|B] = 1, or P[C|B] = 1.
6. P[(A & B)|C] = P[A|(B & C)] × P[B|C].
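The six conditions on Popper functions listed in this supplement can be checked mechanically on a small finite model. The sketch below is illustrative only and rests on several assumptions that are not in the entry: propositions are modeled as sets of three hypothetical worlds, the weights in WEIGHT are arbitrary, P[A|B] is set to 1 whenever B has weight zero, and the free variable C in condition 5 is read as universally quantified. The names WORLDS, WEIGHT, PROPS, measure, P, and satisfies_popper_conditions are all invented for the example.

```python
from fractions import Fraction
from itertools import combinations, product

# Hypothetical toy model: three possible worlds with assumed weights.
WORLDS = frozenset({0, 1, 2})
WEIGHT = {0: Fraction(1, 2), 1: Fraction(1, 3), 2: Fraction(1, 6)}

# Propositions are sets of worlds; this enumerates all eight of them.
PROPS = [frozenset(c)
         for n in range(len(WORLDS) + 1)
         for c in combinations(sorted(WORLDS), n)]


def measure(a):
    """Total weight of the worlds in proposition a (exact arithmetic)."""
    return sum((WEIGHT[w] for w in a), Fraction(0))


def P(a, b):
    """Two-place probability P[a|b]; assumed to be 1 when b has weight zero."""
    mb = measure(b)
    return Fraction(1) if mb == 0 else measure(a & b) / mb


def satisfies_popper_conditions():
    """Check conditions 1-6 by brute force over all propositions."""
    pairs = list(product(PROPS, repeat=2))
    triples = list(product(PROPS, repeat=3))
    c1 = any(P(d, e) != 1 for d, e in pairs)
    c2 = all(P(a, a) == 1 for a in PROPS)
    # Conditions 3 and 4 hold trivially here because conjunction is
    # modeled as set intersection, which is commutative.
    c3 = all(P(a, c & b) == P(a, b & c) for a, b, c in triples)
    c4 = all(P(b & a, c) == P(a & b, c) for a, b, c in triples)
    # Condition 5: additivity, unless everything has probability 1 given b.
    c5 = all(P(a, b) + P(WORLDS - a, b) == 1
             or all(P(c, b) == 1 for c in PROPS)
             for a, b in pairs)
    c6 = all(P(a & b, c) == P(a, b & c) * P(b, c) for a, b, c in triples)
    return all([c1, c2, c3, c4, c5, c6])


print(satisfies_popper_conditions())
```

On this construction the only zero-weight proposition is the empty (contradictory) one, so the second disjunct of condition 5 is what allows the function to remain defined when conditioning on it.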
3. Consistency: if K and A are both consistent, then so is K*A.
4. Ramsey test: (A ⇒ B) ∈ K iff B ∈ K*A.
5. Preservation: if ¬A ∉ K, then K ⊆ K*A.

entropy of a function P is the sum, over all the models M of the product of the probability of M and the negative logarithm (base 2) of that probability.
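The entropy just described (the sum, over all models M, of the probability of M times the negative base-2 logarithm of that probability) can be sketched in a few lines. This is a minimal illustration, not part of the entry: the function name entropy, the dictionary representation of a probability function over models, and the example distributions are assumptions, as is the usual convention that a model of probability zero contributes nothing to the sum.

```python
import math


def entropy(model_probs):
    """Sum over models M of P(M) * -log2(P(M)).

    model_probs maps each (hypothetical) model name to its probability.
    Models with probability 0 are skipped, following the common
    convention that 0 * log(0) counts as 0.
    """
    return sum(p * -math.log2(p) for p in model_probs.values() if p > 0)


# Assumed example distributions over four models.
uniform = {"M1": 0.25, "M2": 0.25, "M3": 0.25, "M4": 0.25}
certain = {"M1": 1.0, "M2": 0.0, "M3": 0.0, "M4": 0.0}

print(entropy(uniform), entropy(certain))
```

The uniform distribution has the largest entropy available over four models, while a distribution that puts all its weight on one model has entropy zero; the maximum entropy approach prefers the former kind of distribution among those compatible with the given constraints.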
2. There is another desirable property that is independent of the ones discussed here: infinite conditionalization: C(Γ ∪ Δ) ⊆ Cn(Γ ∪ C(Δ)). The study of nonmonotonic Tarski relations (in which the set of premises can be infinite) is still relatively undeveloped (although see Makinson 1994 and Freund, Lehmann, and Makinson 1990).

3. Karl Schlechta (Schlechta 1997) has proved that the restriction of this theorem to finite models is a necessary one. In order to cover the case of infinite theories, it is necessary to add yet another condition to the preferential models: a condition Schlechta calls “definability preservation” (Schlechta 1997, 76).

5. Admissibility is also known as “p-consistency” (Adams 1975) and “ε-consistency” (Pearl 1988). A preferential consequence relation is admissible if and only if, for every ε > 0, there exists a probability function P such that, for all propositions p and q, p |∼ q iff P(q/p) ≥ 1 − ε.

7. Wayne Wobcke's (Wobcke 1995) system is an alternative to Commonsense Entailment that allows a limited amount of nesting. Both Delgrande and Asher use conditional logics that are significantly weaker than either Ernest Adams's or David Lewis's VC (minus Centering).

8. Under an extension of 1-entailment or the maximum entropy approach, we could defeasibly infer that ¬q is not exceptional, that is, that ¬(⊤ ⇒ q), unless the exceptional character of ¬q follows monotonically from our theory. (The symbol ⊤ represents any logical tautology.) From (p ⇒ q) and ¬(⊤ ⇒ q), the contraposed conditional (¬q ⇒ ¬p) follows monotonically (in VC minus centering), as the following derivation shows:

(⊤ ⇒ (¬p ∨ q)) [6, Left equivalence]
((⊤ & ¬q) ⇒ (¬p ∨ q)) [2, 7, CV]
(¬q ⇒ (¬p ∨ q)) [8, Left equivalence]
weakening]
10. Andrew Baker (Baker 1988) offers a solution to the Yale shooting
problem with the resources of circumscription, by altering which
predicates are allowed to vary and which are held fixed. Baker's solution is
not an exception to my claim, however, since his solution involves making
a critical distinction between the role of causal and non-causal
information.