Laurence Jonathan Cohen - An Introduction To The Philosophy of Induction and Probability (1989, Oxford University Press) PDF
§ 1. AN OUTLINE OF THE ISSUES
cure septicaemia?
Such hypotheses always generate predictions that extrapolate
beyond the existing data. That is why they may be useful as a
basis for building another rocket, curing new cases of septi-
caemia, producing more food, etc. So induction about them is
conveniently labelled 'ampliative'. It involves reasoning to
conclusions that cover not only the evidential instances but
others also. It amplifies our knowledge. And it contrasts with the
§ 2. THE BACONIAN TRADITION IN THE PHILOSOPHY OF INDUCTION
been well served by his modern interpreters. All too often they
have echoed the misrepresentation that was promoted by R.L.
Ellis in the latter half of the nineteenth century. They have held
that 'absolute certainty is . . . one of the distinguishing
characters of the Baconian induction'. 3 Yet what Bacon wrote in
2 e.g. The Posthumous Works of Robert Hooke (London: R. Waller, 1705), 6, and R.
Boyle, Works (London: A. Millar, 1744), i. 199. Francis Bacon (1561-1626) was by
training a lawyer and was Lord Chancellor for a few years under James I. His philosophy
of science was concerned with the social as well as the intellectual conditions for the
advancement of knowledge and with the social as well as the intellectual benefits of
properly directed research. In jurisprudence he sought a systematic statement of the
principles of Common Law from the study of reported cases.
3 R.L. Ellis, 'General Preface to Bacon's Philosophical Works', in The Works of
Francis Bacon, ed. J. Spedding, R. Ellis, and D. N. Heath (London: Longmans, 1859), i.
23.
4 For detailed references to Bacon's writings on this subject see L. Jonathan Cohen,
'Some Historical Remarks on the Baconian Conception of Probability', Journal of the
History of Ideas, 61 (1980), 219-31.
6 The Origins of the Problem§3
more intelligible nature, like a true natural kind', and this more
fundamental nature may itself have a form that investigators can
seek to discover. For example, on Bacon's view the nature of
which heat is a limitation is motion: 'heat is a motion expansive,
restrained, and acting in its strife upon the smaller particles of
bodies'. And motion itself must be supposed to be a limitation of
a more fundamental property. So there is a hierarchy of
explanatory laws to be discovered. The investigator should
expect to make a gradual ascent to more and more compre-
hensive laws, acquiring greater and greater certainty as he
moves up the pyramid, and each law that is reached should lead
him to new experiments, that is, to experiments over and above
those that led to the discovery of the law. But in regard to the
apex of the pyramid, at which discovery of the 'summary law of
Nature' would generate absolute certainty, Bacon says that we
do not know whether this is within human reach. 5 (Contrast
Ellis's misinterpretation.)
Bacon held that at each stage of the investigation greater
significance should be attached to some kinds of observable
instances than to others. Thus there are twenty-seven types of
'prerogative' instances that deserve preferential consideration,
and some of these are more valuable than others in terms of their
ability to exclude alternative explanations and thus contribute to
greater certainty. But Bacon repudiated as 'childish' the method
of induction by simple enumeration, whereby a generalization
that is as yet unfalsified is supposed to acquire support that
varies in strength with the number of known instances that
verify it. On his view it is not the mere number of known
instances that counts but the variety of circumstances that have
been canvassed in the attempt to exclude alternative
explanations. And this is because there is only a limited number
of ultimate forms and so falsificatory evidence, by conclusively
excluding incorrect hypotheses, permits firmer progress than
verificatory evidence does towards identifying the correct
6
9 See B.J. Shapiro, Probability and Certainty in Seventeenth-Century England: A Study of the
Relationships between Natural Science, Religion, History, Law and Literature (Princeton:
Princeton University Press, 1983). Cf. S. Shaffer's review of this book, 'Making
Certain', Social Studies of Science, 14 (1984), 137-52, which is more critical in tone than in
substance. Bacon even allowed the possibility that his own methodology was capable of
improvement: Novum Organum, bk. I, aphorism cxxx (Works, i. 223).
10 Shapiro, Probability and Certainty, p. 33. On this question of the lower limit of proof
reflections led him to add that the most important kind of step in
this progress to a higher level of generalization was what he
called 'consilience'. Consilience occurs when inductions from
altogether different kinds of facts are, quite unexpectedly, seen
to share a common explanation. Thus Newton's theory of uni-
versal gravitation, 'which had been inferred from the Perturba-
tions of the moon and planets by the sun and by each other, also
accounted for the fact, apparently altogether dissimilar and
remote, of the Precession of the equinoxes'. 15 In such a case, wrote
§ 3. THE RISE OF PASCALIAN PROBABILITY
separate without finishing their game and how many tosses are
needed, in throwing two dice, to have at least an even chance of
26 Blaise Pascal (1623-62) was well known as a strong supporter of Jansen's theology
as well as being an important mathematician.
27 I. Hacking, The Emergence of Probability: A Philosophical Study of Early Ideas about
Probability, Induction and Statistical Inference (Cambridge: Cambridge University Press,
1975), 51-3.
In this seminal period it was not just the mathematics that was
developing, but also its range of applications. The crucially
important step that had clearly been taken by 1662 was to apply
the calculus of chance to other domains than the outcomes of
aleatory games. Thus the agnostic, in Pascal's famous wager, 28
the fact that a certain type of event has happened more often
than not when the circumstances are of such-and-such a kind is
28 B. Pascal, Pensées, trans. H.F. Stewart (London: Routledge and Kegan Paul,
1950), 120-1.
29 A. Arnauld, The Art of Thinking: Port-Royal Logic, trans. J. Dickoff and P. Jones
(Indianapolis: Bobbs-Merrill, 1964), 350-1. Antoine Arnauld (1612-94) supported
Jansen's theology against that of the Jesuits. He is generally credited with the authorship
of the book La Logique ou l'Art de Penser, which was published by Jansenists from Port-
Royal in 1662.
modern terms
p(A) + p(not-A) = 1.
But it would be dangerously misleading to regard this thesis as a
truism.
Compare the Pascalian law for the probability of a
conjunction of outcomes, which is easily established from an
aleatory model. When the outcomes are independent of one
another, as when each ball drawn at random from an urn is
immediately replaced, we must just multiply together the
probabilities of the separate outcomes. For example, if there are
two white and two black balls in the urn, the chance of drawing a
white ball on any one drawing is 2 out of 4, and, if there are two
drawings, then, since each of the four possible outcomes of the
first drawing may be followed by any of the four possible
outcomes of the second drawing, the chance of drawing a white
ball twice is 1/4. Of course, the ball drawn first may not be
replaced. If so, the second outcome will not be independent of
31 See G. Shafer, 'Non-Additive Probabilities in the Work of Bernoulli and Lambert',
Archive for the History of Exact Sciences, 19 (1978), 309-70.
32 A. de Moivre, The Doctrine of Chances, or a Method of Calculating the Probability of Events
in Play (London: de Moivre, 1718), 1. Abraham de Moivre (1667-1754), a French
Huguenot who emigrated to England, was one of the commissioners appointed by the
Royal Society to arbitrate on the dispute between Newton and Leibniz about which of
them invented the differential calculus.
the first, and we shall be concerned in its case with only the three
outcomes left possible by the outcome of the first drawing. The
chance of drawing a white ball twice will then be 1/6.
Symbolically, let A be the event of drawing a white ball first, and
B the event of drawing a white ball second; and let the dyadic
function p(B|A) evaluate the probability of B given A. Then both
cases will be covered by the same multiplication principle for
conjunction,
p(A&B) = p(A) × p(B|A),
which says that the chance of drawing two white balls is the
product of the chance of drawing a white ball first and the chance
of drawing a white ball second, given that a white ball was also
drawn first. According to that principle, where the ball is
replaced after being drawn p(A&B) equals 1/4, because the two
outcomes are independent and p(B|A) equals p(B) and thus
equals 1/2; and, where the ball is not replaced after the first
drawing so that the two outcomes are not independent, p(B|A)
equals 1/3 and so p(A&B) equals 1/6. (We say that p(A) is a
'monadic' function of A, because its value depends on just one
issue: the issue of whether A occurs or not. Similarly p(A&B) is
a monadic function of A&B. But p(B|A) is a 'dyadic' function of
A and B because its value depends on the two issues that it
relates.)
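The two urn cases can be checked by simply counting equally likely ordered pairs of draws. The following is a minimal sketch in Python, not part of the original text; the ball labels and the helper name `chance_two_whites` are illustrative assumptions.

```python
from fractions import Fraction
from itertools import permutations, product

# Two white and two black balls, as in the text (labels are illustrative).
URN = ["w1", "w2", "b1", "b2"]


def is_white(ball):
    return ball.startswith("w")


def chance_two_whites(ordered_pairs):
    """Chance that both draws are white, counting equally likely ordered pairs."""
    pairs = list(ordered_pairs)
    favourable = sum(1 for first, second in pairs if is_white(first) and is_white(second))
    return Fraction(favourable, len(pairs))


# With replacement: each of 4 first outcomes may be followed by any of 4 second outcomes.
with_replacement = chance_two_whites(product(URN, repeat=2))

# Without replacement: only 3 outcomes remain possible for the second drawing.
without_replacement = chance_two_whites(permutations(URN, 2))

p_a = Fraction(2, 4)  # p(A): white on the first drawing
print(with_replacement, p_a * Fraction(2, 4))     # p(A) x p(B|A), outcomes independent
print(without_replacement, p_a * Fraction(1, 3))  # p(A) x p(B|A), outcomes dependent
```

In both cases the counted chance agrees with the product p(A) × p(B|A), which is all the multiplication principle claims for an aleatory model.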
All this is straightforward and incontrovertible if our model is
an aleatory one. But suppose an art historian declares two
pictures to be genuine Vermeers. He seems to have given you a
warranty—the warranty of his expertise—to believe that the
first is a Vermeer, a warranty to believe that the second is, and a
warranty of just the same nature to believe that they both are.
The warranty for the conjunction seems no less reputable, and
no less thoroughly researched, than for either of the conjuncts.
But this conflicts with Pascalian principles because the above-
mentioned multiplicative law for the probability of a
conjunction ensures that, except in limiting cases, the
conjunction is less probable than either conjunct. If p(A) > 0
and p(B|A) < 1, p(A&B) must always be less than p(A), since in
accordance with the multiplication principle for conjunction
p(A&B), as we have just seen, is equal to a proper fraction of
p(A), namely, p(A) × p(B|A). Of course, the chance of both
pictures' being genuine may well be a lot less than the chance of
just one's being genuine. But is the credibility of their genuine-
ness to be judged in terms of such chances (which are
presumably to be viewed metaphorically as chances in the
lottery of life) or in terms of the reputation of the author of the
warranties that have been given you? The dominant theory has
treated such warranted credibilities as sharing the same mathe-
matical structure as aleatory chances. But this assignment of
mathematical structure is by no means as incontrovertible in the
case of warranted credibilities as in that of aleatory chances.
So both in dealing with conjunction and in dealing with
negation more than one structure is conceivable. And if there is
more than one way of dealing with conjunction there must also
be more than one way of relating dyadic to monadic probabilities.
For it is a matter of elementary algebra, where p(A) > 0,
that in accordance with whether p(A&B) is, or is not, equal to
p(A) × p(B|A), correspondingly p(B|A) is, or is not, equal to
p(A&B) / p(A)
33 See L. Jonathan Cohen, The Probable and the Provable (Oxford: Clarendon Press,
1977), 30 n. 21.
34 The contraposability of conditional statements has been called into question by
E.W. Adams, The Logic of Conditionals (Dordrecht: Reidel, 1975). But Adams's argument
rests on the counter-intuitive, or at least controversial, assumption that the probability of
a conditional 'If A, then B' has to be equated with the conditional probability p(B|A),
and that conditionals cannot be assigned dichotomous truth-values in a way that will
always and necessarily equate the probability of a conditional with the probability of its
being true. Indeed, we should be quite unnecessarily impoverishing our conceptual
resources if we excluded ourselves from ever employing the expression 'if . . ., then
value for the true probability does not follow inversely, with high
credibility, from the size of the relative frequency in a large
sample, at least the existence of that value for the true
probability is the simplest possible explanation of the size of the
relative frequency. A different manoeuvre is adopted by the
36
And in that form the law is very easily derived within the
calculus of Pascalian chance. It results, by elementary algebra,
from putting equal to each other the two basic ways of spelling
out (in accordance with the multiplication principle for
conjunction) the probability of the conjunction of two outcomes,
A and B, namely, p(A|B) × p(B) and p(B|A) × p(A). But
Bayes's law obviously does not solve all the problems here.
When you know the value of p(A|B), Bayes's law can be used to
derive p(B|A) only if you also know the values of p(A) and p(B)
or accept some conventional procedure for determining them.
So whether you think of Bayes's law as a supplement to
Bernoulli's theorem that is useful in relation to the inverse
inference, or wish to exploit it in some other way, you must be
able to evaluate these monadic probabilities (see further § 9 and
§ 19).
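The derivation described in this paragraph can be set out in two lines. The block below is a sketch in LaTeX that uses only the symbols already in the text: the first line equates the two ways of spelling out the probability of the conjunction, and the second divides through by p(A).

```latex
\begin{align*}
p(A \,\&\, B) &= p(A \mid B)\, p(B) \;=\; p(B \mid A)\, p(A), \\
p(B \mid A)   &= \frac{p(A \mid B)\, p(B)}{p(A)}, \qquad p(A) > 0 .
\end{align*}
```

The second line makes the dependence on the monadic probabilities explicit: p(A|B) alone fixes p(B|A) only up to the ratio p(B)/p(A).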
In the present century the mathematics of Pascalian
probability have been studied systematically by means of
axiomatization, with varying thoroughness of formalization.
Thus A. Kolmogorov in 1933 produced six axioms for the
39
common save one, that one occurring only in the former; the
circumstance in which alone the two instances differ is the effect, or the
cause, or an indispensable part of the cause, of the phenomenon. 56
accounted for by its gravitation towards the sun and planets, the
residual feature must be explained by the resistance of the
medium through which it moves. But in such a case Mill
recognized that in practice we could not be certain that A is the
only antecedent to which the residual phenomenon a may be
referred. So any induction by the Method of Residues needs to
be confirmed by obtaining A artificially and trying it separately,
or by deriving its operation from otherwise known laws. 65
their variety. 80
carried out? Only when these issues have been resolved shall we
be in a position to investigate whether ampliative induction is
best graded in terms of Pascalian probability. And the topic is in
any case an important one since, whether or not Pascalian
probability has a central role to play in the gradation of
induction, it certainly has many other valuable uses in
contemporary culture.
II
The Controversy about the Nature of
Pascalian Probability
§ 5. SOME GENERAL CONSIDERATIONS
§ 7. FREQUENCY THEORIES
11 S.D. Poisson, Recherches sur la probabilité des jugements en matière criminelle et en matière
civile, précédées des règles générales du calcul des probabilités (Paris: Bachelier, 1837). Poisson
(1781-1840) was a friend of Laplace. His most important work was in the application of
mathematics to physics. See also H. Reichenbach, The Theory of Probability, trans. E. H.
Hutten and M. Reichenbach, 2nd edn. (Berkeley: University of California Press, 1971),
68-9.
12 B. Russell, Human Knowledge, pp. 384-5. A. Pap, Introduction to the Philosophy of
Science, pp. 180-1, contradicted Russell, but his argument is not cogent.
§ 6 the Nature of Pascalian Probability 51
showed how by yet another arrangement of the integers the limit
in question would be 1. So it is clear that, if a probability is
defined in terms of the limit to a sequence of relative frequencies,
it might differ under certain rearrangements of the underlying
set of outcomes. But the probability that a B is an A should be
unique. So Poisson's procedure makes the probability p(A|B)
depend on there being only one relevant ordering for the class of
Bs, which is the reference-class for the probability.
Now the trouble is that the members of some reference-classes
do not all obviously belong to some set that is well-ordered by
temporal succession or by some other uniquely appropriate
relation. Thus, if the class of student enrolments is an infinite
one, some of its members may be simultaneous with one
another, so that it is certainly not well-ordered by temporal
succession. Moreover, even when the reference-class is well-
ordered by some supposedly appropriate relation, the sequence
of cumulative relative frequencies may not in fact converge to a
limit. Or—and this is the commonest situation—there may be
infinitely many other equally appropriate well-orderings of the
class, each of which determines a different limiting value for the
sequence of relative frequencies in question.
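The dependence of a limiting relative frequency on the ordering can be made concrete with a small computation. The sketch below is an illustration only and is not drawn from the text: it assumes the outcome of interest is 'being an even number' and compares the natural ordering of the positive integers with a rearrangement that lists two odd numbers before each even one. Both orderings enumerate every positive integer exactly once.

```python
from fractions import Fraction
from itertools import count, islice


def natural_order():
    """1, 2, 3, 4, ...: odd and even numbers alternate."""
    return count(1)


def rearranged_order():
    """1, 3, 2, 5, 7, 4, ...: two odd numbers, then one even number."""
    odds, evens = count(1, 2), count(2, 2)
    while True:
        yield next(odds)
        yield next(odds)
        yield next(evens)


def relative_frequency_of_evens(sequence, n):
    """Relative frequency of even numbers among the first n members."""
    first_n = list(islice(sequence, n))
    return Fraction(sum(1 for k in first_n if k % 2 == 0), n)


n = 30_000  # any multiple of 6 gives exact values for both orderings
freq_natural = relative_frequency_of_evens(natural_order(), n)
freq_rearranged = relative_frequency_of_evens(rearranged_order(), n)
print(freq_natural, freq_rearranged)
```

The same set of outcomes thus yields cumulative frequencies tending to 1/2 under one ordering and to 1/3 under the other, which is why a uniquely relevant ordering would be needed for the limit to define the probability.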
A possible way of dealing with those difficulties would be to
identify the probability with the limit to which the relative
frequencies converge within an infinite-membered sub-class of
the reference-class that satisfies appropriate conditions for
randomness and convergence, as proposed by von Mises. 13 But
this proposal still raises the question why any ordering should be
relevant to the value of the probability, within a set of outcomes
that are independent of one another. If the outcome of a toss is
supposed to be unaffected by its position in a sequence of tosses,
because repeated tosses are assumed to have no effect on the
state of the coin or of the tossing mechanism, why should the
ordering of the sequence be relevant to the probability of heads
as against tails? Moreover, a statement evaluating the limit of a
sequence of relative frequencies hypothetically continued to
infinity implies nothing whatever that is empirically testable
about any initial segment of the sequence, however large,
13 See R. von Mises, Probability, Statistics and Truth, 2nd edn. (New York: Dover,
1957), 24-5.
52 The Controversy about
because such a segment can be replaced by any arbitrary
sequence of the same length without affecting the limit of the
sequence as an infinite whole.
It might be thought that these problems about infinite
reference-classes at least leave the frequency analysis safely
available in the case of finite reference-classes. And, after all,
very many of the probabilities with which we are concerned in
the natural or social sciences are probabilities in supposedly
finite classes, although such classes are often indeterminately
large. But a third weakness of the frequency analysis seems to be
clearly displayed in certain cases where the reference-class is
certainly finite in size. One may well believe that a nicely
produced coin, fresh from the mint, has a probability of falling
heads that is correctly evaluated at 1/2. But suppose it is tossed
three times, landing heads twice and tails once, and is then
melted down for scrap. A crude relative frequency theory would
apparently evaluate the probability of its falling heads at 2/3,
which cannot be correct. Perhaps a frequency theorist has
therefore to say instead in such a case that the probability of
heads is not to be identified with the actual relative frequency of
heads, but with what that relative frequency would have been in
the long run. But then either that means 'in an infinite set of
tosses', in which case the difficulties already mentioned apply;
or it means 'in a very large, but finite, set of actual or potential
tosses', in which case the probability would still be altered a bit if
the coin were tossed just one more time, since the relative
frequency of heads would inevitably increase or decrease,
depending on the actual outcome of that additional toss. So
where the reference-class is finite the frequency analysis seems to
make the exact value of a probability depend on some quite
accidental fact about the precise number of instances that are
supposed to belong to the reference-class, just as where it is
infinite the probability is made to depend on some quite
accidental fact of ordering.
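The arithmetic behind this objection is easy to reproduce. The following Python sketch is illustrative only; the function name `relative_frequency` and the run of 1,000 tosses are assumptions introduced for the example, not figures from the text.

```python
from fractions import Fraction


def relative_frequency(heads, tosses):
    """Actual relative frequency of heads in a finite reference-class of tosses."""
    return Fraction(heads, tosses)


# The coin that is tossed three times and then melted down.
short_run = relative_frequency(2, 3)

# A much larger finite class (hypothetical numbers): exactly half the tosses are heads.
long_run = relative_frequency(500, 1000)

# One additional toss inevitably shifts the value, whichever way it lands.
one_more_head = relative_frequency(501, 1001)
one_more_tail = relative_frequency(500, 1001)

print(short_run, long_run, one_more_head, one_more_tail)
```

However large the finite class, the exact value moves with a single further instance, so the value assigned depends on how many instances happen to be counted.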
Of course, where the reference-class is finite we can escape
introducing such a dependence on irrelevant contingencies if we
are willing to take the probability-statement in question to be
saying that the actual or potential relative frequency falls within
a specified interval (e.g. 0.5 plus or minus 0.16). But the
difficulties about individual events and about infinite reference-
classes cannot be so easily dodged. 14
§ 8. PROPENSITY THEORIES
§ 9. PERSONALIST THEORIES
24 L.J. Savage, The Foundations of Statistics (New York: Dover, 1972), 44.
25 This is the analysis adopted by de Finetti in his Theory of Probability, trans. A. Machi
state of mind nor the relative frequency with which rain occurs
in April, though willingness to bet at the corresponding odds
would again be a test of sincerity. And such a view too has a
counterpart in the analysis of moral discourse, when it is urged,
as by Hare, that words of moral evaluation, like 'good' and
29
p(B)
where p(B) > 0. The investigator next updates his value for the
monadic probability p(H) by putting it equal to p(H|E1), and he
takes the next piece of evidence available into account (perhaps
E2 reports a rising barometer) by evaluating p(E2|H) and p(E2).
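The updating procedure described here can be sketched as repeated application of Bayes's law. The Python below is a minimal illustration under stated assumptions: the function name `update` and every numerical value are hypothetical, chosen only so that the arithmetic is exact, and the probabilities fed into the second step are taken to be the investigator's values after the first update.

```python
from fractions import Fraction


def update(prior_h, likelihood_e_given_h, prob_e):
    """Bayes's law: p(H|E) = p(E|H) x p(H) / p(E), where p(E) > 0."""
    if prob_e <= 0:
        raise ValueError("p(E) must be greater than 0")
    return likelihood_e_given_h * prior_h / prob_e


# Hypothetical starting value for p(H), e.g. H = 'it will rain soon'.
p_h = Fraction(3, 10)

# First piece of evidence E1 (hypothetical values for p(E1|H) and p(E1)).
p_h_after_e1 = update(p_h, Fraction(4, 5), Fraction(2, 5))

# The updated value now serves as the monadic p(H) for the next step.
# Second piece of evidence E2, e.g. a rising barometer (hypothetical values).
p_h_after_e2 = update(p_h_after_e1, Fraction(1, 2), Fraction(3, 5))

print(p_h, p_h_after_e1, p_h_after_e2)
```

Each step needs a value for the monadic probability of the evidence as well as for the hypothesis, which is where the choice of prior values enters.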
§ 10. MULTI-VALUED LOGIC THEORIES
then we are left with the problem of how to measure the truth-
value of 'This is an A' where we know the relative frequency of
As only within a reference-class that is smaller than the relevant
universe as a whole.
Secondly, though a propensity analysis can interpret dyadic
judgements of probability as statements of conditional
tendencies (and a personalist analysis can relativize strength of
belief to awareness of particular items of evidence), a multi-
valued logic account sits easily only with monadic judgements,
because each proposition is naturally assumed to have just one
truth-value, whether or not we know it. The proposition that it
will rain soon may be probable on the evidence of dark clouds,
and improbable on the evidence of increasing atmospheric
pressure. But this proposition presumably has only one truth-
value, which is determined by the actual event, when it happens,
and is not relative to some other, quite different events. No
doubt we could restrict ourselves to a monadic judgement here
by identifying the conditional probability with the probability of
the corresponding conditional; that is, we could identify p(A|B)
with p(If A, then B). But there would be a price to pay for this. It
would reduce the variety of different facts that our judgements of
probability are capable of expressing, as we have already seen
(pp. 20-1). The better course would be just to define p(A|B) as
p(A&B) / p(B),
where p(B) > 0.
§ 11. LOGICAL RELATION THEORIES
(1883-1946) is perhaps better known for his work in economics than for his philosophy of
probability.
numerically measurable. But Carnap showed, in 1950, that it
42
42 Ibid. 34.
43 R. Carnap, 'Inductive Logic and Rational Decisions', in R. Carnap and R.C.
Jeffrey (eds.), Studies in Inductive Logic and Probability, i (Berkeley: University of California
Press, 1971), 5-31.
44 R. Carnap, Logical Foundations of Probability (Chicago: Chicago University Press,
1950), 299.
45 F. Waismann, 'Logische Analyse des Wahrscheinlichkeitsbegriffs', Erkenntnis, 1
(1930-1), 228-48, repr. in English as 'A Logical Analysis of the Concept of Probability',
in F. Waismann, Philosophical Papers (Dordrecht: Reidel, 1977), 4-21.
46 L. Wittgenstein, Tractatus Logico-Philosophicus (London: Kegan Paul, Trench,
there are also some anticipations of Carnap's approach to the
problem in the writings of Leibniz. 47
§ 12. SOME LOGICAL DISTINCTIONS EXPLOITED BY DIFFERING ANALYSES OF PASCALIAN PROBABILITY
§ 13. THE APPROPRIATENESS OF DIFFERENT CONCEPTIONS OF PASCALIAN PROBABILITY TO DIFFERENT PURPOSES
The Pascalian calculus permits a wide range of variation in the
logical properties of probability-judgements, which are then
useful for correspondingly different purposes. For example, to
judge a purely aleatory probability we require a conception of
probability, like the indifference-theory's one, that makes
correct judgements of probability necessary, non-extensional, implicitly
general, and non-counterfactualizable. But where outcomes
cannot be assumed to be determined by pure chance, we
need probability-judgements that are contingent, extensional,
implicitly general, and counterfactualizable or ones that are
contingent, non-extensional, implicitly general, and counter-
factualizable. There are good reasons for supposing that a
variety of different conceptions of probability are operative here,
rather than just a variety of different ways of measuring the same
underlying parameter.
§15the Analysis of Probability 91
Different conceptions of Pascalian probability are suitable for
different purposes, depending partly on the epistemology and
partly on the logic of the judgements employing those con-
ceptions. Indeed, by taking into account all the possible com-
binations of logical micro-features, we might conceivably obtain
a greater potential diversity of probability-judgements than the
catalogue (§ § 6-11) of six standard Pascalian analyses suggests.
Different analyses of probability, it has already been shown
(§ 12), have different implications about whether judgements
of probability are necessary or contingent, sentence-related or
predicate-related, extensional or non-extensional, implicitly
general or implicitly singular, counterfactualizable or non-
counterfactualizable. Yet the Pascalian mathematics of prob-
ability is quite neutral in regard to preferring a particular
alternative on any of these five logical issues. The classical
calculus admits of interpretation in any of the numerous dif-
ferent senses that can be generated by combining the various
alternatives in the different ways that are possible. Clearly, if a
probability-judgement had to be contingent, extensional, and
predicate-related, for example, some kind of relative frequency
interpretation would have to be accepted, while if it had to be
necessary, non-extensional, and sentence-related, some kind of
interpretation in terms of logical relations would be required.
But there is no mathematical reason—no reason inherent in the
uninterpreted calculus—to argue on such grounds in favour of
any particular interpretation or semantical analysis.
Nor could one argue thus on the basis of ordinary linguistic
usage or of current scientific practice. Even if in ordinary usage,
or current practice, only one particular interpretation were
assigned to the Pascalian calculus, it would still be possible to
query whether that interpretation was the only useful one or
even whether it was as useful as this or that available alternative.
Moreover, what is useful for one purpose may be useless for
another. It follows that the appropriate question is not 'What is
the correct semantical analysis of probability-judgements?'
Rather we should ask 'Which semantical analysis of probability-
judgements is best suited to which kind of context?'
For example, in a context in which the probabilities being
judged relate to a game that is assumed a priori to be one of pure
chance, an objective idealist (p. 44) version of the indifference
92 The Foundations of Pluralism in§15
theory is appropriate. In specifying the nature of the game we
have to postulate a domain of statutorily basic outcome-types,
such as the two sides on which a presumably perfect coin can fall
on any one randomly executed toss or the six sides on which a
presumably perfect die can fall on any one randomly executed
throw. And from that postulate the principle of indifference
allows us to derive, as a necessary truth, the probability of any
particular basic outcome-type or the probability of a particular
compound outcome-type (such as the probability of two tosses'
both landing heads). Correspondingly we can say that the
establishment of these probabilities is an a priori, not an empir-
ical task. In this context any claim to be measuring propensities
of the gaming-device would appear to be misrepresenting an a
priori issue as an empirical one.
So, since the probability, in this sense, of a particular
outcome-type is determined by the rules that legitimate the
range of basic outcome-types, it risks being affected if in our own
judgements we choose to name the outcome-type in some other
way than the rules do. If the rules mention a side named '5', but
not a side named 'my favourite side', then the probability of
throwing my favourite side is not derivable a priori from the
rules and certainly cannot be equated on their authority with the
probability of '5'. In short, because true judgements of pro-
bability in the relevant indifferentist sense are necessarily true,
they are also non-extensional—i.e. resistant to the inter-
substitution of accidentally co-extensive terms. Moreover, so
long as the set-up considered continues to be one of pure chance,
any unconditional evaluation for the probability of, say, a
throw's being '5' will be implicitly general and apply also to the
probability of a throw's being '6', and so on. But the probability
of an outcome-type's being '5', given that it is nameable by an
odd number, would not be the same if a more-than-six-sided
die were in play. So judgements of aleatory probability are
non-counterfactualizable. In sum, a purely aleatory context
requires a conception of probability that satisfies an indif-
ferentist analysis and makes correct judgements of probability
necessary, non-extensional, implicitly general, and
non-counterfactualizable.
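The indifferentist calculation in this passage amounts to taking the ratio of favourable to possible basic outcome-types once the rules of the game have fixed the domain. The Python sketch below illustrates this; the helper name `indifference_probability` and the choice of an eight-sided die as the 'more-than-six-sided' case are assumptions made for the example.

```python
from fractions import Fraction


def indifference_probability(sides, event, given=lambda side: True):
    """Ratio of favourable to possible basic outcome-types, all weighted equally."""
    possible = [s for s in sides if given(s)]
    favourable = [s for s in possible if event(s)]
    return Fraction(len(favourable), len(possible))


def is_odd(side):
    return side % 2 == 1


six_sided = range(1, 7)
eight_sided = range(1, 9)  # hypothetical die with more than six sides

# Implicit generality: the same unconditional value holds for '5' and for '6'.
p_five = indifference_probability(six_sided, lambda s: s == 5)
p_six = indifference_probability(six_sided, lambda s: s == 6)

# The conditional value depends on which domain of outcome-types the rules postulate.
p_five_given_odd_six = indifference_probability(six_sided, lambda s: s == 5, given=is_odd)
p_five_given_odd_eight = indifference_probability(eight_sided, lambda s: s == 5, given=is_odd)

print(p_five, p_six, p_five_given_odd_six, p_five_given_odd_eight)
```

Nothing here is measured: every value follows from the postulated list of sides, and changing that list changes the conditional value, in line with the claim that such judgements do not carry over to a counterfactual set-up.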
Suppose, however, the game is not assumed a priori to be one
of pure chance, but the question whether the game is one of pure
chance is treated as an open, empirical issue, perhaps because
there are reasons to think the coin or dice might be biased
(although the bias may be a stable one). Clearly that issue may
be put in the form: how closely does the empirically estimated
probability of a particular outcome approximate to the pro-
bability of that outcome which is calculated a priori in indif-
ferentist terms? So in order to take the issue seriously we need to
operate here with another, non-indifferentist conception. We
want to know how closely the value of the one function
approaches the value of the other in regard to the same
outcomes. More specifically, alongside the indifferentist con-
ception we need here also a conception of probability that allows
judgements of probability to be contingent and based on
observed trials or samples, as in the calculation of life-
expectancies for an annuity. These judgements must still be
implicitly general, but, as they now quantify over a domain of
outcomes, not of outcome-types, they would now be counter-
factualizable (so far as any bias or absence of bias is stable). Such
a conception fits either a relative frequency or a propensity
account, depending on whether or not there is a requirement for
extensionality and finitude.
Perhaps someone will be tempted to object that what is at
stake here, in the issue whether the game is one of pure chance, is
not the closeness with which the value of one probability-
function approaches the value of another but rather the close-
ness with which results from one method of measuring the value
of a particular probability-function approximate results from
another method of measuring the value of the same function.
Indeed, that would presumably be the gloss that a personalist,
belief-theoretical account would wish to put on the situation.
But what is this single parameter, then, that is allegedly being
measured in two different ways? Clearly it is not the actual
strength of a person's belief that is being measured in such cases:
no actual betting behaviour is being observed (p. 64). At best a
personalist might claim that what is being measured—in two
different ways—is the strength which his belief ought to have.
And then a realist could still rejoin that the propriety of a par-
ticular belief-strength must depend here, as elsewhere, on its
correspondence with some objective correlate like a ratio of
chances or a relative frequency, so that, if two different measure-
ments are both correct, two different parameters are being
measured.
Moreover, in yet another situation we might be dealing
neither with an assumed game of pure chance nor with the acci-
dental vagaries of a man-made gambling device, but with the
physical properties of a particular example of this or that natural
kind. For example, we might need to judge the probability that a
neutron will decay in six minutes, where the assertion of a con-
nection between physical property and natural kind imposes a
requirement of non-extensionality. Since such a judgement will
also be contingent, grammatically singular, implicitly general,
and counterfactualizable, it operates with a concept of pro-
bability that satisfies a propensity analysis. Indeed, the physical
tendencies of natural kinds are paradigm cases for that analysis.
If you want to warn a non-smoker against the dangers of
smoking, you need to be able to cite counterfactualizable pro-
babilities. And that brings out another reason why the apparent
plurality of probability-concepts cannot be reduced to a
plurality of modes of measurement that are used in connection
with a single probability-concept. We have to think of
probability-judgements not only as conclusions that may be
reached by different methods of calculation or measurement,
but also as premisses from which different kinds of conclusion
may be drawn in accordance with differences in their logical
structure. Issues about extensionality, implicit generality,
counterfactualizability, and so on are crucial here. For example,
the conception of probability on which we can base a warning
about smoking or about working in an asbestos factory is not the
same as that which relates only to accidental issues like the pro-
bability that a member of a certain conference is staying in a
specified hotel (pp. 88-9).
§ 14. THE NEED TO SUPPLEMENT PASCALIAN JUDGEMENTS BY
NON-PASCALIAN ONES
14 Laplace, A Philosophical Essay on Probabilities, p. 19. The rule is discussed by, among
others, J. Venn, The Logic of Chance, 2nd edn. (London: Macmillan, 1876), 171-83, and
J. M. Keynes, A Treatise on Probability (London: Macmillan, 1921), 372-83.
that the next B will be A. But, since this probability derives from
the absence of any evidence, and holds good whatever A and B
are, it can hardly constitute an empirically learned fact about the
actual world. Indeed there is analogous trouble even when
evidence is not entirely missing. An inherent fault in the rule of
succession is that it gives implausibly precise results on the basis
of very small samples.
The rule of succession is therefore at best applicable only
where we are reasoning about a supposedly chance mechanism.
And, even where the rule of succession is applied to such a sup-
posedly chance mechanism, there must not be more than two
possible outcomes for any one trial of the mechanism. Otherwise
contradictions may result. For example, suppose that an urn
contains only white, black, and yellow balls, and that we draw
first a white ball from it, next a black one, and next a yellow
one—replacing each ball after it has been drawn. The
probability that the fourth ball to be drawn will be white is,
according to the rule of succession,
$$\frac{1+1}{3+2} = \frac{2}{5}$$
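The arithmetic of the urn example can be sketched as follows. The sketch assumes the rule of succession in its usual form, (m + 1)/(n + 2) for m occurrences in n trials, applied to each colour in turn; the function names are illustrative only.

```python
from fractions import Fraction

def rule_of_succession(occurrences, trials):
    """Laplace's rule: (m + 1) / (n + 2) for m occurrences in n trials."""
    return Fraction(occurrences + 1, trials + 2)

def next_draw_probabilities(draws):
    """Apply the rule separately to every colour seen so far."""
    n = len(draws)
    return {colour: rule_of_succession(draws.count(colour), n)
            for colour in sorted(set(draws))}

draws = ["white", "black", "yellow"]   # the three draws described in the text
probs = next_draw_probabilities(draws)
total = sum(probs.values())
print(probs["white"], total)  # 2/5 6/5
```

On these assumptions each colour receives 2/5, so three mutually exclusive outcomes together receive 6/5, which is presumably the kind of contradiction that arises once more than two outcomes are possible.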
§ 15. HOW ARE DIFFERENT CONCEPTIONS OF PROBABILITY
POSSIBLE?
regard to probabilities.
26 Relevant references are given in n. 32 to ch. II.
IV
The Pascalian Gradation of Ampliative Induction
§ 16. INDUCTIVE PROBABILITY UNDER A REALIST CONSTRUAL
1 See also A.J. Ayer, Probability and Evidence (London: Macmillan, 1972), 30-3.
2 See e.g. H. Reichenbach, Experience and Prediction (Chicago: Chicago University
Press, 1961), 397 ff.
and experimental enquiries into plumage-colour, for example,
we cannot expect many generalizations on that topic to have
been even proposed, let alone falsified. But, if there has been a
lot of funding over a long period, we may expect not only that
many such generalizations have been proposed, but also that
enough research has been done in the area for many relevant
generalizations, whether explicitly considered or not, to have
been falsified. In short, other things being equal, the probability
assignable in this way to a particular hypothesis, on the evidence
available at the time, will tend to vary inversely with the size of
past funding for research in that field. So this method of assign-
ing probabilities to particular hypotheses introduces a quite
paradoxical type of dependence on the direction of previous
research. One might well have supposed instead that the more
research goes into a subject, the more reliable become the hypo-
theses that eventually issue: they are supported, as it were, by
piles of eliminated conjectures. But the method of assessment
under consideration measures the reliability of a hypothesis by a
probability that is equal to the relative frequency of unfalsified
hypotheses in the field of enquiry.
It may therefore seem plausible to suggest instead that
enumerative induction involves establishing some very high
propensity-type probability (see § 8) on the basis of a very large
sample. Thus, if the generalization to be inductively supported
is that all Bs are As, this generalization would be treated as being
equivalent to the estimate $p(A \mid B) = 1 - \epsilon$, where $\epsilon$ is very small,
and the evidence would be constituted by a large sample of Bs, in
which the observed relative frequency of As was at least $1 - \epsilon$.
The relation between the evidential sample and the estimated
propensity would then be evaluated by some conventional statis-
tical procedure which relies ultimately for its validity on
Bernoulli's limit theorem (see pp. 21-2 above). For example,
we might aim at having an estimate that could be said to be
correct within 95 per cent confidence limits. Roughly, the 3
§ 17. INDUCTIVE PROBABILITY UNDER A
RANGE-THEORETICAL CONSTRUAL
' m*(e) •
[ J
§ 18. PASCALIAN GRADATION FOR VARIATIVE INDUCTION
that A belongs to all the groups to which B belongs and thus that
everything which has B has A. So Keynes's theory is at any rate
not hit by the difficulty that Carnap encountered, whereby any
generalization over an infinite domain had zero prior
probability.
Keynes next points out that each evidential instance of B that
differs in some respect from any previously noted instance of B
can help to eliminate the possibility of B's belonging to any
group to which A does not belong. Hence, as each such possib-
ility is eliminated we may treat the probability that everything
which has B has A as being correspondingly increased, since
each step reduces the number of distinct groups about which it is
not known whether B and A are both members.
Of course, Keynes recognized that by this method a numer-
ically definite probability could be obtained for an induction
only if we are able to make definite assumptions about the
number of independent equiprobable causal mechanisms at
work. But he pointed out that common-sense judgements of
inductive probability were normally qualitative or comparative
rather than numerical. People remark that certain inductive
arguments are stronger than others, or that some are very
strong. So Keynes was content that within his system inductive
evaluations would normally just bear relations of greater or
lesser to numerical probabilities according to the approximate
limits within which our assumptions place the number of
independent generative mechanisms. That is, his assessments of
inductive support were, as we should now say, interval-valued.
Another notable feature of Keynes's system was that
enumerative induction plays no part in it whatever. An additi-
onal evidential instance is of no inductive value unless, by
exhibiting some hitherto unremarked combination of charac-
teristics, it furthers the eliminative process. Nor does a parti-
cular location in space or time count as an inductively relevant
characteristic in the investigation of natural processes. So, if an
additional evidential instance is to have any value there, it must
differ from previously noted instances in respect of some charac-
teristic other than spatial or temporal location. Indeed in this
connection Keynes invokes what he calls 'the Uniformity of
Nature', according to which (as most investigators of Nature are
prepared to assume) mere position in time and space cannot
possibly affect, as a determining cause, any other characteristic.
It is therefore no objection to his system that at any one time of
investigation all the evidential instances are past or present,
none future. Pastness, presentness, and futurity must also be
inductively irrelevant if spatio-temporal location is. (A similar
result was achieved in Carnap's system, by his preference for
symmetrical c-functions—pp. 77-8).
Keynes acknowledged that the principle of limited indepen-
dent variety was crucial to his system. But he argued that this
principle did not have to be proved true, in relation to any parti-
cular domain for inductive investigation. It was sufficient, in his
view, for the principle to have a finite probability a priori, which
our experience has considerably increased. Indeed, he says, 'it is
because there has been so much repetition and uniformity in our
experience that we place great confidence in it'. But there are at
least four points that can be made against Keynes here, and the
approval that one accords to his theory must vary inversely with
the strength that one attributes to these points.
First, it is by no means obvious that actual human experience
has tended to increase the probability that the principle of
limited independent variety is valid for Nature. As natural
science progresses to new levels of understanding new properties
of matter tend to be discovered and perhaps this is because
Nature's independent variety is unlimited.
Secondly, even if the principle's probability has undergone
some increase, this is quite compatible with its still being more
probably false than true. But, if the principle on which our
inductive reasonings rely is more probably false than true, then
on the balance of probability our inductive reasonings are with-
out any rational foundation. So Keynes has given us insufficient
reason to suppose that our inductive reasonings do have a
rational foundation rather than that they do not.
Thirdly, as well as these difficulties about the epistemology of
the principle, there are problems about its implications. Keynes
did not explain how exactly his method of inductive appraisal
assigns antecedent probabilities to hypotheses about dyadic or
relational characteristics as distinct from monadic ones, or to
hypotheses about quantitative characteristics as distinct from
qualitative ones. For example, if there is a non-denumerable
infinity of points on a scale of physical magnitude—a scale of
heat, say, or velocity—must there not be a corresponding
infinity of generative mechanisms? And if you prefer to identify
a single generative mechanism, such as molecular motion, as the
cause of heat, you will be grouping parameters (of temperature,
velocity, etc.), rather than single characteristics, into your
supposedly finite number of groups, and this will interfere with
the assignment of antecedent probabilities to hypotheses about
monadic, qualitative characteristics.
Fourthly, Keynes assumes that within any domain of induct-
ive investigation all mutually independent generative mechan-
isms are of equal inductive importance in relation to any one
hypothesis, just as J. S. Mill (see pp. 36-7 above) assumed that
in analogical reasoning every feature of similarity or difference
between the two entities concerned is equally important. Yet (as
already remarked in connection with Mill's account) the fact
that two patients resemble one another in literary interests, for
example, would normally be much less important for medical
prognosis than that they have the same viral infection in their
bloodstreams. 17
17 Keynes's views on induction are criticized from a different point of view in
J. Nicod, Geometry and Induction, trans. M. Woods (London: Routledge and Kegan Paul,
1969), 173-242, where the importance of enumerative induction is defended.
§ 19. INDUCTIVE PROBABILITY UNDER A PERSONALIST
CONSTRUAL
21 See the remarks by Florian Cajori in the appendix to his edition of Newton's
Principia Mathematica (Berkeley: University of California Press, 1962), 648 ff. Other
examples are given by R.G. Swinburne, 'Falsifiability of Scientific Theories', Mind, 73
(1964), 434 ff., and I. Lakatos, 'Falsificationism and the Methodology of Scientific
Research Programmes', in I. Lakatos and A. E. Musgrave (eds.), Criticism and the Growth
of Knowledge (Cambridge: Cambridge University Press, 1970), 138 ff.
a substantial level of inductive support on the basis of evidence
that includes minor anomalies, just so long as the favourable
evidence is sufficiently strong. Schematically, therefore, we can
say that the existence of an inconsistency between E and H
should be compatible with E's giving a higher level of inductive
support to H than to some comparable theory with which E is
also inconsistent.
It is evident, however, that any such compatibility is ruled out
not only if E's degree of inductive support for H is to be a simple
probability but also if it is to be any function of Pascalian
probabilities involving E and/or H. Because Pascalian probab-
ility functions are functions of the elements of a Boolean algebra,
it can easily be shown —as a consequence of Bayes's theorem
22
189-90.
connection between support for a generalization and support for
particular predictions that derive from it, some cannot represent
the existence of inductive support for generalizations over
indeterminately or infinitely large domains, some rely on calcula-
tions from the structure of language where calculations from the
structure of reality would be more appropriate, some stand or
fall with the truth of some questionable metaphysical principle,
some resist objections by sacrificing informativeness, and some
can only be sustained if one is prepared to turn a blind eye to the
paradoxes that they entail. It is obviously worth considering,
therefore, whether any adequate analysis can be achieved in
non-Pascalian terms. The next chapter will be concerned to
explore the possibility of such an analysis and to evaluate what
gains and losses result from adopting it. 23
23 For personalist analyses of induction see also J. Dorling, 'A Personalist's Analysis
of Statistical Hypotheses and Some Other Rejoinders to Giere's Anti-Positivist
Metaphysics', in L.J. Cohen and M. Hesse (eds.), Applications of Inductive Logic (Oxford:
Clarendon Press, 1980), 271-81; P. Horwich, Probability and Evidence (Cambridge:
Cambridge University Press, 1982); and R. D. Rosenkrantz, Inference, Method and
Decision: Towards a Bayesian Philosophy of Science (Dordrecht: Reidel, 1977). For a critique
of the personalist approach to induction see C. Glymour, Theory and Evidence (Princeton:
Princeton University Press, 1980), 63-93. Further arguments against any Pascalian
analysis of induction are to be found in R. Harre, The Principles of Scientific Thinking
(London: Macmillan, 1970), 157-77 and K. R. Popper, The Logic of Scientific Discovery
(London: Hutchinson, 1959), 251-81.
V
The Baconian Gradation of Ampliative
Induction
§ 20. INDUCTIVE SUPPORT BY THE METHOD OF RELEVANT
VARIABLES
able result for test t together with instances of A that are also B
u
2 Details are given in Cohen, The Probable and the Provable, pp. 144-57.
3 See I.B. Cohen (ed.), Isaac Newton's Papers and Letters on Natural Philosophy
(Cambridge: Cambridge University Press, 1958), 47-52.
One point of special interest arises in regard to the application
of the Method of Relevant Variables to inductive reasoning
about wide-ranging explanatory theories. If the explanatory
range of one such theory not only embraces, but also extends
beyond, the range of another, it is normally considered induct-
ively superior to the latter. But in assenting to this intuitively
cogent principle we must exclude those cases in which superior
explanatory power is achieved trivially, as happens when two
mutually independent explanatory theories are tacked into one
by conjunction. So, as a sign that no such trivial extension lies at
the root of a particular theory's superior explanatory power, we
normally welcome the ability of a theory to lead us to some new
kind of knowledge. That is, we look to the possibility of deriving
novel kinds of prediction that are then experimentally
confirmed, as Bacon, Leibniz, Lakatos, and many others have
emphasized. Alternatively, it may be that two surprisingly
4
10 Localization is a strategy that has been pursued by quite a number of writers in the
philosophy of induction: see the papers and bibliography in R.J. Bogdan (ed.), Local
Induction (Dordrecht: Reidel, 1976).
11 Some discussions of the Method of Relevant Variables (by I. Levi, A. W. Burks, S.
Blackburn, R. Hilpinen, A. Margalit, J. L. Mackie, and M. Hesse, with replies by L.J.
Cohen) are to be found in L.J. Cohen and M. Hesse (eds.), Applications of Inductive Logic
(Oxford: Clarendon Press, 1980), 26-7, 64-7, 172-201, 207-14, 245-50. See also I.
Levi, 'Support and Surprise: L.J. Cohen's view of Inductive Probability', British Journal
for the Philosophy of Science, 30 (1979), 279-92.
§ 21. THE LOGICAL SYNTAX OF THE METHOD OF RELEVANT
VARIABLES
greater than, or equal to, i. Then E must state evidence that the
conjunctive hypothesis H&H' passes at least test $t_i$, since that
hypothesis must resist falsification under all possible combinations
of variants of relevant variables manipulated in test $t_i$. In
13 The proof is given in Cohen, The Probable and the Provable, pp. 190-9.
demonstrating that his analysis of inductive support-gradings
has a Pascalian structure, what the demonstration achieves is not
merely a clarification of that analysis but also—to some
extent—a confirmation of it. Inductive support, so analysed,
has been shown to instantiate a pattern of logical syntax that has
several familiar and established manifestations in other kinds of
gradation of certainty. A somewhat controversial thesis has been
strengthened, it may be hoped, by the acquisition of respectable
associations, much as Pascal and the Port-Royal Logic originally
(§ 3) brought intuitive beliefs about certain non-aleatory issues
within the scope of the familiar mathematics of chance, in order
that they should be consolidated thereby. So, if the logical
syntax of inductive support-gradings generated by the Method
of Relevant Variables were wholly idiosyncratic, an analysis in
terms of that method would be at a corresponding disadvantage
in comparison with any Pascalian analysis.
As it turns out, however, though Baconian evaluations of
inductive support do not conform to Pascalian principles, they
do conform to the principles of a generalized modal logic, and
this modal-logical connection serves in turn to corroborate the
legitimacy of such an analysis. The key idea here is that a
generalization which holds good (or would hold good) under all
manipulations of relevant variables must state a law of nature or
a consequence of one, while generalizations that hold good (or
would hold good) only under less thorough tests have a status
analogous to, but falling suitably short from being, that of such a
statement. So a generalization of the latter kind may be said to
have a certain degree of legisimilitude, or similarity to law,
depending on the thoroughness of the test that it is thought
capable of passing. If the evidence is, in effect, that a generaliza-
tion has been instantiated in all the various combinations of
circumstances that constitute test $t_n$, then background assump-
to n are not to imply truth, and thus can be used to imply levels of
legisimilitude at which there is still a risk that a more thorough
test may reveal falsity. But a higher-numbered modal operator
will always imply a lower one: that is, 'If $\mathrm{nec}_j A$, then $\mathrm{nec}_i A$, where
$j > i$' is a principle. So '$\mathrm{nec}_i A$' expresses the fact that $A$ has at
least $i$th degree legisimilitude, while 'not-($\mathrm{nec}_{i+1} A$)' expresses the
generalized, so that we have, for example, 'If $\mathrm{nec}_i$ (if $A$, then $B$),
then, if $\mathrm{nec}_i A$, then $\mathrm{nec}_i B$' as another principle.
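The two principles just stated (a higher-numbered operator implies a lower one, and graded necessity distributes over the conditional) can be checked mechanically in a toy model. The model is an assumption introduced here for illustration and is not the formalization given in the book: test level i is represented by a set of circumstances `sphere(i)`, the sets are nested so that a more thorough test covers everything a less thorough one does, and a graded necessity statement holds when the proposition is true throughout the corresponding set.

```python
from itertools import combinations

N_LEVELS = 4                      # hypothetical number of test levels
WORLDS = tuple(range(N_LEVELS))   # hypothetical circumstances 0..3

def sphere(i):
    """Circumstances covered by test level i (1-based); nested by construction."""
    return set(WORLDS[:i])

def nec(i, prop):
    """Graded necessity: `prop` (a set of circumstances) holds throughout sphere(i)."""
    return sphere(i) <= prop

def all_props():
    """Every proposition expressible over WORLDS, as a frozenset."""
    for size in range(len(WORLDS) + 1):
        for combo in combinations(WORLDS, size):
            yield frozenset(combo)

def implies(a, b):
    """Material conditional 'if A then B' as a set of circumstances."""
    return (frozenset(WORLDS) - a) | b

def higher_implies_lower():
    """If nec_j A then nec_i A, whenever j > i."""
    return all((not nec(j, a)) or nec(i, a)
               for a in all_props()
               for i in range(1, N_LEVELS + 1)
               for j in range(i + 1, N_LEVELS + 1))

def distribution_holds():
    """If nec_i (if A then B), then, if nec_i A, then nec_i B."""
    return all((not (nec(i, implies(a, b)) and nec(i, a))) or nec(i, b)
               for a in all_props()
               for b in all_props()
               for i in range(1, N_LEVELS + 1))

print(higher_implies_lower(), distribution_holds())  # True True
```

In this model a proposition can pass level 2 and fail level 3, so a lower-numbered operator leaves open the risk that a more thorough test reveals falsity.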
This kind of generalized modal logic can be rigorously forma-
lized and a large number of theorems derived. Some of the 16
19 That is why the Method of Relevant Variables escapes the anti-inductivist threat of
historically oriented philosophers of science like Hanson, Toulmin, etc.: see L. Jonathan
Cohen, The Dialogue of Reason (Oxford: Clarendon Press, 1986), 118-28. A more extensive
overview of the situation is afforded by J. E. Adler, 'Criteria for Good Inductive
Logic', in L.J. Cohen and M. Hesse (eds.), Applications of Inductive Logic, pp. 379-405.
§ 22. SOME NON-STANDARD INTERPRETATIONS OF BACONIAN
LOGICAL SYNTAX
24 For the claim that idealized generalizations cannot generate explanations in accordance
with the covering law model see N. Cartwright, How the Laws of Physics Lie
(Oxford: Clarendon Press, 1983), 44-53.
particular physical variable that are predicted to hold in certain
circumstances under ideal conditions, and the values that are
observed to hold in those circumstances under conditions occur-
ring in the actual world, has to result from features of the actual
world that are excluded by the idealization: if it were not for the
idealization, the value predicted would be the one that is actually
found. Correspondingly, any idealization implying unexplain-
able differences from values in the actual world has to be rejected
or treated with reserve. Of course, the processes of approxima-
tion, by which observable evidence from the actual world is
matched to generalizations applying only to an ideally simplified
world, are in practice often delicate and difficult, allowing room
for considerations of theoretical elegance to enter into the
precisification of the ultimate solution. But in principle idealiza-
tions are to be graded inductively, by the Method of Relevant
Variables, in terms of those criteria of legisimilitude that are
appropriate to their category of subject-matter.
So far, in the present section, we have discussed various non-
standard ways in which the Method of Relevant Variables
impinges on the gradation of legisimilitude. The situations
envisaged were those where, in place of assuming a fixed
formulation for a hypothesis about some factual issue and
examining the conditions under which this or that grade of legi-
similitude would be assignable to it, we instead assume a desired
degree of legisimilitude for the hypothesis and examine the
conditions under which this or that reformulation would be
appropriate to it. Indeed, if we consider only the appropriate
types of modification in the antecedents of our hypothesis, the
underlying modal calculus can actually be interpreted as a
logical syntax for gradings of evidentially permissible simplifica-
tion. That is, if H is an unmodified hypothesis in a particular
category, then, for every support assessment stating that on the
evidence of E, H has ith grade support, there is an equivalent
statement of evidentially permissible simplification stating that,
on the evidence of E, the simplest fully supported version,
or versions, of H has, or have, ith grade simplicity, according
to appropriate criteria of simplicity. But in all this the
25
29 I. Lakatos, 'Proofs and Refutations', British Journal for the Philosophy of Science, 14
(1963-4), 1-25, 120-39, 221-45, 296-342.
30 See Cohen, The Implications of Induction, 172-82 and, for inductive reasoning in
philosophy, L. Jonathan Cohen, The Dialogue of Reason (Oxford: Clarendon Press, 1986),
63-148.
31 G. Harman, The Nature of Morality (New York: Oxford University Press, 1977),
1-9.
need not be tied up exclusively with factual explanation in terms
of causes, probabilities, and so on. More specifically, if the
explanation sought is not of why you have the moral feelings that
you do but rather of why the moral feelings that you have are the
right ones to have (that is, if the explanation sought is not an
explanation of your attitude's de facto psychological properties
but of its de jure normative status), then the relevant moral
principle may provide a perfectly good explanation. You are right to
be outraged by what the children are doing to the cat, people
might say, because it is an act of wanton cruelty. And the
relevant datum in such a case is the wrongness of the act, which
normal people feel, rather than the fact that they feel this
wrongness.
We need to note here, however, that the general theories,
principles, and so on that are cited in our explanations serve this
explanatory purpose best if they also have inductive support that
is independent of the items currently requiring explanation.
Moreover, it is by no means the case that every inductively
supported generalization provides the best explanation of its
own evidential support. The blackness of the birds that we now
see on the summer fields is perhaps best explained by the opera-
tion of their genes, or by the operation of certain environmental
factors in the evolution of their species, but not at all well
explained by the fact that they are crows and the generalization
that all crows are black. Yet they constitute inductive evidence
for that generalization. So it is certainly not safe to suppose that
induction is always an inference to the best explanation. 32
§ 23. THE CLASSICAL PROBLEM OF INDUCTION
1 See J. Locke, Essay Concerning Human Understanding (London: T. Basset, 1690), bk.
IV, ch. 3, § 14 ad fin., and ch. 6 § 16; and J. R. Milton, 'Induction before Hume', British
Journal for the Philosophy of Science, 38 (1987), 49-74.
2 A Treatise of Human Nature, (1739), ed. L. A. Selby-Bigge (Oxford: Clarendon Press,
1888) bk. I, pt 3, Sect. 14.
3 Neither Locke's Essay nor Hume's Treatise pays any attention, whether critical or
commendatory, to Francis Bacon's inductive methodology, though Bacon's writings
were well known and widely read in the century after his death.
§ 23 §23Four Paradoxes about Induction 178
premiss would itself be at least as impossible to substantiate
by formal-logically sanctioned deduction from the ultimate
premisses of experience.
Of course, where induction is not about factual matters, but
about a system of ethical principles, say, or of common law
maxims (see p. 172), no such paradox need arise, since a general
rule authorizing inductive inference may be supposed to be
implicitly stipulated with the imposition of the system. Justice
requires that like cases be treated alike, and legislators can
decide to impose justice on a human community. But we
humans cannot decide to impose uniformity on the processes of
Nature.
One tempting line of approach to Hume's problem is via
Bernoulli's limit theorem (see pp. 118-19). If under appro-
priate assumptions the examination of a sufficiently large
sample can let us learn something, with a corresponding degree
of confidence, about the probability with which a certain charac-
teristic exists in a specified population, then it looks as though a
confidence-function can operate here as a measure of predict-
ability by enumerative induction. But the value of such a func-
tion for this purpose is restricted by our having to assume that
the sample is not biased by some lack of homogeneity in the
population. If we want to calculate the survival prospects of a
twenty-year-old student, we should be wary of drawing our
sample solely from the members of our university's rock-
climbing club. Of course, as the weight of the evidence (see
§ 16) goes up, it would be normal to discount this kind of worry.
But assessments of weight rely on the assumption that existing
judgements of evidential relevance will remain true for the
future. So there is no escape here from the jaws of Hume's
problem: we are merely shifting the point at which the bite is felt.
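The worry about unrepresentative samples can be illustrated numerically. The sketch below uses a standard normal-approximation interval for a proportion, and every figure in it (the population survival rate, the sample size, the number of survivors) is invented purely for illustration; none comes from the text.

```python
import math

def confidence_interval(successes, n, z=1.96):
    """Normal-approximation 95% interval for a proportion.

    Assumes a large sample drawn at random from the population of interest.
    """
    p_hat = successes / n
    half_width = z * math.sqrt(p_hat * (1 - p_hat) / n)
    return p_hat - half_width, p_hat + half_width

# Hypothetical figures, chosen only to make the point visible.
population_survival_rate = 0.98   # all twenty-year-old students
club_sample_size = 400            # sample drawn solely from the climbing club
club_survivors = 360              # survivors observed in that sample

low, high = confidence_interval(club_survivors, club_sample_size)
covers_population = low <= population_survival_rate <= high
print(round(low, 3), round(high, 3), covers_population)
```

With these made-up numbers the interval is narrow and misses the population value, because the calculation measures sampling error only and is silent about whether the sample was homogeneous with the population.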
Some philosophers have tried to resolve Hume's problem by
proposing a different view of the typical premisses from which
reasoning about causal connections proceeds. Thus Kant
argued that the objects of our perceptual judgements must them-
selves be structured by the category of cause and effect, if we are
to have the kind of experience that we do have. And every parti-
cular cause must presuppose a uniformity of causal connection.
So the uniformity of nature pervades the whole of the per-
ceivable world. But the details of Kant's intricate argument are
notoriously open to criticism. Nor is it at all clear from Kant's
4
4 I. Kant, Critique of Pure Reason (1781), trans. N. Kemp Smith (London: Macmillan,
1929), 218-27. For an example of the criticism see S. Korner, Kant (Harmondsworth:
Pelican Books, 1955), 87.
5 Indeed, Kant had at one time held that natural variety is unlimited: W. R. Shea,
'Filled with Wonder: Kant's Cosmological Essay, The Universal Natural History and Theory
of the Heavens', in R.E. Butts (ed.), Kant's Philosophy of Science (Dordrecht: Reidel, 1986),
110-14.
6 R. Harre and E.H. Madden, Causal Powers: A Theory of Natural Necessity (Oxford:
Blackwell, 1975).
So the absence of any rational justification for the belief is a
matter of concern only to philosophers, and the latter can dispel
their worries at any moment by leaving their studies and in-
dulging their natural inclinations.
Such a resolution of the paradox has at least three weaknesses.
In increasing order of importance they are as follows.
First, it is in any case an inaccurate psychological observation
that when people believe in a causal connection they have
previously perceived a constant conjunction, since the causal
connections that we discover in everyday life are not to be identi-
fied with the underlying uniformities that scientists aim at dis-
covering. We all believe that a fire emits sensible heat although
we also know that heat insulation often prevents this emission.
The causal connections that we talk about in everyday life are
sequences of events that hold good only in normal circumstances,
not universally. But Hume could easily have corrected
himself on this point, without sacrificing anything integral to his
sceptical thesis.
Secondly, and more importantly, Hume seems to be arguing
with his readers and trying to persuade them of the existence of a
psychological law. This law is to the effect that people develop
the natural inclination which Hume describes whenever they
have observed the corresponding conjunction of events to be
uniformly present in their experience. But belief in the existence
of such a psychological law can itself have no rational founda-
tion, if Hume is right, since no causal law then has a rational
foundation. So, like many other sceptics, Hume can be hoisted
on his own petard. If what he says is true, we should already
believe it before reading what he says, and if somehow we do not
already believe it he can certainly give us no reason for doing so.
Either way his remarks are pointless.
Thirdly, just as Hume's resolution of the paradox makes no
allowance for his own apparent ability to argue about psycho-
logical laws and the strength of evidence for their existence, so
too it makes no allowance for the many ways in which scientists
reason with one another about what the laws of nature are and
what the strength of evidence is for their existence. Since these
scientists apparently think that they are appealing to rationally
cogent considerations, some account of those considerations is
needed along with a brief explanation of their plausibility.
Indeed Hume himself does give an account of them in another
section of his Treatise where he speaks of 'rules by which to judge
of causes and effects', of 'the LOGIC that I think proper to employ
in my reasoning' on the subject, and of a constant conjunction
that 'proves' a causal connection.⁷ But this way of speaking is
11 P. Tichý, 'On Popper's Definition of Verisimilitude', British Journal for the Philo-
sophy of Science, 25 (1974), 155-60, and D. Miller, 'Popper's Qualitative Theory of Veri-
similitude', ibid. 166-77.
12 See e.g. G. Oddie, Likeness to Truth (Dordrecht: Reidel, 1986); I. Niiniluoto, Truth-
likeness (Dordrecht: Reidel, 1987); and the collection of papers by various authors:
T. A. F. Kuipers (ed.), What is Closer-to-the-Truth? (Amsterdam: Rodopi, 1987).
13 Popper, Conjectures and Refutations, p. 235.
14 On resemblances between Popper's philosophy of science and Bacon's see P.
Urbach, 'Francis Bacon as a Precursor to Popper', British Journal for the Philosophy of
Science, 33 (1982), 113-32. On resemblances between Popper's philosophy of science and
the philosophies of Whewell and of Peirce see I. Niiniluoto, 'Notes on Popper as
Follower of Whewell and Peirce', Ajatus, 37 (1978), 272-327.
Nor should this be surprising, since only legisimilitude, not veri-
similitude, can guarantee the applicability to counterfactual
instances that is often needed when knowledge of general laws or
causal connections is exploited (see pp. 131-2).
Indeed, it seems a general weakness in Popper's position that
when attempts are made to buttress it against obvious implaus-
ibilities it tends to topple over into some kind of unacknow-
ledged inductivism. Lakatos, for example, wanted to emend it
in a different direction. Instead of considering the rationality
issue in relation to single hypotheses, Lakatos thought rather
about research programmes, in which a series of hypotheses
replaced one another as the programme either succeeded or
petered out. But (though his objections to the qualification of
15
18 For an attempt to develop a form of Popperianism that is not open to this criticism
see J. Watkins, Science and Scepticism (London: Hutchinson, 1984), 337-48.
19 For this see P. F. Strawson, Introduction to Logical Theory (London: Methuen, 1952),
248-50. See also A. J. Ayer, The Problem of Knowledge (Harmondsworth: Penguin Books,
1956), 74-5, and P. Edwards, 'Bertrand Russell's Doubts about Induction', in A. Flew
(ed.), Logic and Language (Oxford: Blackwell, 1951), 68-70. Further recent analyses of the
problem are to be found in G. H. von Wright, The Logical Problem of Induction (Oxford:
Basil Blackwell, 1957); N. Rescher, Induction (Oxford: Basil Blackwell, 1980); D. C.
Stove, The Rationality of Induction (Oxford: Clarendon Press, 1986); and in a useful collec-
tion of papers by various authors, R. Swinburne (ed.), The Justification of Induction
(Oxford: Oxford University Press, 1974).
issue that has to be settled may be put in these terms: what
entitles us to use the same evaluative term 'rational', or
'reasonable', in both contexts? Hume pointed out a pervasive
lack of analogy between, on the one hand, reasoning from
premisses to logically implied conclusions, and, on the other,
reasoning—vulgarly so called—from premisses about the
already observed to conclusions that embrace the as yet
unobserved. The former is a product of thought, he argued, the
latter of custom; the former justifies certainty, the latter not; the
former cannot be rejected without self-contradiction, the latter
can; and so on. It does not matter here whether we speak of
'reasoning' or of 'rationality' or of 'validity'. The question that
arises is the same: how can we establish the existence of a suffi-
ciently strong analogy between deduction and induction so as to
undermine Hume's scepticism?
The clue to an answer for this important question lies in the
fact that inductive reasoning is a matter of degree, not of all or
nothing. So deducibility must come to be seen as a limiting case
to which inductively based inferability may be seen to exhibit
gradable degrees of approximation, whether we analyse such
gradations in terms of Pascalian probability, Baconian legi-
similitude, or some other kind of parameter. We are not then left
with two different kinds of inferential rationality, a deductive
kind and an inductive kind, but rather with one or more scales of
rationality or reasonableness, on each of which deductive
inferability figures as a limiting case. Indeed, we have already
seen (§ 15) that in order to understand why there is such a
variety of different conceptions of Pascalian probability we can
usefully regard probability as a gradation of inferability. The
diversity of available interpretations for the Pascalian calculus
can then be seen to correspond with the familiar diversity of
systems for deductive inferability. Similarly, the Method of
Relevant Variables, in each of its varied applications, was seen
(§ 21) to conform to the rules of a generalized modal logic, with
degrees of legisimilitude mounting towards necessity. There is,
therefore, no case now for supposing, as Hume supposed, that
deductive and inductive thinking are so different from one
another that the former possesses some honorific status to which
the latter does not even approximate. Rather there is an obvious
analogy between judgements of the form 'The inference to A
from B is logically certain' and judgements of the form 'The
inference to A from B has this-or-that level of probability', since
both kinds of judgement conform to the same Pascalian axioms.
And there is also an obvious analogy between judgements of the
form 'The proposition that if A, then B, is necessary' and 'The
proposition that if A, then B, has this-or-that level of legisimili-
tude', since both kinds of judgement conform to the same modal-
logical axioms.
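
The claim that deducibility is a limiting case on a scale of inferability can be illustrated with a small sketch. It assumes, purely for illustration, a finite set of truth-assignments weighted equally (an indifference-style measure, which is only one of the interpretations of the Pascalian calculus mentioned above); the function names `worlds` and `inferability` are hypothetical and not drawn from the text.

```python
from fractions import Fraction
from itertools import product


def worlds(atoms):
    """Enumerate every truth-assignment to the given atomic sentences."""
    return [dict(zip(atoms, values))
            for values in product([True, False], repeat=len(atoms))]


def inferability(conclusion, premise, atoms):
    """Grade the inference from premise to conclusion as p(conclusion | premise).

    Assumption: all truth-assignments are equally weighted. Other
    interpretations of probability would assign different weights.
    """
    premise_worlds = [w for w in worlds(atoms) if premise(w)]
    if not premise_worlds:
        raise ValueError("premise is unsatisfiable")
    hits = sum(1 for w in premise_worlds if conclusion(w))
    return Fraction(hits, len(premise_worlds))


ATOMS = ["p", "q"]

# Deductive case: 'p and q' entails 'p', so the grade reaches the limit 1.
entailed = inferability(lambda w: w["p"], lambda w: w["p"] and w["q"], ATOMS)

# Non-deductive case: 'p or q' lends 'p' only partial support.
partial = inferability(lambda w: w["p"], lambda w: w["p"] or w["q"], ATOMS)

# Opposite limit: 'p' excludes 'not p'.
excluded = inferability(lambda w: not w["p"], lambda w: w["p"], ATOMS)
```

On this toy measure the entailed conclusion sits at one end of the scale, the excluded conclusion at the other, and the merely supported conclusion in between, so that a single scale accommodates both the deductive and the non-deductive cases.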
Of course, there is another aspect of Hume's problem which
these logico-syntactic rapprochements between deduction and
induction can do nothing to resolve. Hume insisted that no
conviction about the future, or the unseen, could ever be justi-
fied on the basis of the evidence that we at present have. And, if
we read 'justified' here as meaning 'conclusively justified', we
must surely agree with him. The validity of the inductive
inferences that we make from available evidence must depend
on the reliability of the assumptions that we adopt about such
inferences in each recognized category of factual enquiry. How-
ever various the evidence that we at present have, there must
always remain the possibility, as we saw (§ 20), that our
assumed list of relevant variables, or our assumed categoriza-
tion of hypotheses, will at some time be held to be faulty. Our
evaluation of the evidence may then have been incorrect either
because the results of some past experiment were affected by the
undetected operation of an unlisted variable or because a new
experiment can be designed in which such a variable is deliber-
ately manipulated. In other words any inductive judgement is
itself empirically corrigible. Conclusive certainty is never
inductively justifiable.
But that is a deprivation that we can live with. To keep our
minds open to the possibility of new evidence is all that it
requires of us, and while we do this we are still entitled to claim
an appropriate degree of justification for the theories that have
survived the most thorough tests which we at present believe
ourselves capable of devising. Human rationality requires us at
any one time to do the best we then can, not to do the best that
could ever be possible.
§ 24. T H E P A R A D O X O F T H E R A V E N S
Hempel's paradox of the ravens arises from the fact that three
propositions about inductive confirmation that are each
independently plausible are not co-tenable. Scheffler sought,
unsatisfactorily, to resolve the paradox by defining confirmation
in such a way that logically equivalent hypotheses may not be
equally confirmed by given evidence. Hempel's kind of solution
is more satisfactory. It requires a Pascalian measure for
confirmation and relies on our already knowing that black ravens
are much rarer than non-black non-ravens. Alternatively, if we
grade confirmation by the Method of Relevant Variables, our
resolution of the paradox must point out that known relevant
variables operate causally on plumage-colour rather than on
species-membership.
Hume's argument against the rationality of induction rested on
his assumption of a deductivist paradigm. He looked at induc-
tion from the outside, as it were, and could see in it only a
varying strength of mental process, not a gradable pattern of
evidential justification. But, even when we look at induction
from the standpoint of its own apparent rules and principles and
acknowledge the analogies that rebut Hume's scepticism, we
may still find ourselves confronted with antinomy and paradox.
One such antinomy is commonly called 'the paradox of the
ravens', and is best known from Hempel's publication of it in
1945.²⁰ Here we may formulate the core of the problem in terms
of three propositions that are not co-tenable despite the fact that
each has considerable intuitive appeal:
(1) Any object that is both an A and a B confirms the hypo-
thesis that everything which is an A is a B.
(2) Any object that confirms a hypothesis confirms also any
proposition that is logically equivalent to that hypothesis.
(3) A white handkerchief does not confirm the hypothesis
that all ravens are black.
20 C. G. Hempel, 'Studies in the Logic of Confirmation', Mind, 54 (1945), 1-26,
97-121, repr. with some changes, in C. G. Hempel, Aspects of Scientific Explanation and
Other Essays in the Philosophy of Science (New York: Free Press, 1965), 1-51. The germ of
the problem originally appeared in C. G. Hempel, 'Le Problème de la vérité', Theoria, 3
(1937), 206-46.
These three propositions can easily be seen to generate an
antinomy. A white handkerchief is a non-black thing that is a
non-raven. So, according to (1), it confirms the hypothesis that
everything which is non-black is a non-raven. And therefore,
according to (2), it confirms the logically equivalent hypothesis
that all ravens are black. But that is inconsistent with (3) which
asserts, plausibly enough, that a white handkerchief does not
confirm the hypothesis that all ravens are black. It follows that, if
a coherent account of inductive reasoning is to be presented, one
or more of the premisses on which this antinomy rests has to be
rejected as false or inapplicable.
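
The derivation of the antinomy can be traced mechanically. The following is a minimal sketch, assuming that objects are represented simply by two properties; the class `Thing` and the function names are hypothetical illustrations, not part of Hempel's or any other author's formulation.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Thing:
    name: str
    is_raven: bool
    is_black: bool


def confirms_by_1(obj, antecedent, consequent):
    """Proposition (1): an object that is both an A and a B confirms 'All As are Bs'."""
    return antecedent(obj) and consequent(obj)


def is_raven(obj):
    return obj.is_raven


def is_black(obj):
    return obj.is_black


def is_non_raven(obj):
    return not obj.is_raven


def is_non_black(obj):
    return not obj.is_black


def confirms_all_ravens_are_black(obj):
    """Combine (1) with (2), the equivalence condition."""
    # (1) applied directly to 'All ravens are black'.
    directly = confirms_by_1(obj, is_raven, is_black)
    # (1) applied to the contrapositive 'All non-black things are non-ravens';
    # by (2) that confirmation carries over to the equivalent hypothesis.
    via_contrapositive = confirms_by_1(obj, is_non_black, is_non_raven)
    return directly or via_contrapositive


white_handkerchief = Thing("white handkerchief", is_raven=False, is_black=False)
black_raven = Thing("black raven", is_raven=True, is_black=True)
black_shoe = Thing("black shoe", is_raven=False, is_black=True)

# What (1) and (2) jointly deliver for the handkerchief ...
handkerchief_verdict = confirms_all_ravens_are_black(white_handkerchief)
# ... against what proposition (3) asserts about it.
verdict_required_by_3 = False
```

Under (1) and (2) the handkerchief is counted as confirming, which is exactly what (3) denies; the black shoe, by contrast, satisfies neither route and is left neutral.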
Thus some philosophers have held, in effect, that (2) is the
source of the trouble. For example, Scheffler²¹ has argued for the
22 M. Fisch, 'The Paradoxes of Confirmation and their Solutions' (M.A. Disserta-
tion at the University of Tel Aviv: May, 1981).
no point in asserting that all As are Bs). So the assertion of 'All
non-As are non-Bs' has just the same existential commitment as
has the assertion of 'All As are Bs'. And in any case there are
many uses in science for generalizations that definitely lack any
existential implications, because they relate to ideal or extreme
conditions which are either not known to exist or even known
not to exist. These generalizations are certainly a target for
inductive reasoning (as we saw in § 22), and the paradox will
still arise about them even if it fails to arise in more homely fields
of enquiry, like ornithology.
Accordingly Hempel was ready to accept the applicability of
(2) and, with it, the argument that any white object confirms the
hypothesis that all ravens are black. He thus treated (1) as
stating a sufficient, but not a necessary, condition for confirma-
tion. But he gave two reasons for rejecting proposition (3).
The first reason was that any generalization of the form
'Everything which is an A is a B' or 'All As are Bs' asserts some-
thing about, and imposes a corresponding restriction on, all
entities of the appropriate category (for example, on all physical
objects). It is not just about all As, and so, specifically, 'All
ravens are black' is not just about all ravens. Hence there is no
entity of the appropriate category that is not caught up in the
generalization, and any such entity, even a non-raven, is there-
fore in a position to constitute confirmatory or disconfirmatory
evidence.
But Hempel's argument here is not very persuasive. If there is
a concept of confirmatory evidence such that any entity within
the domain of a generalization constitutes either confirmatory or
disconfirmatory evidence for that generalization, then that
unlocalized (see p. 156) concept of confirmatory evidence is not
a very interesting or important one. It excludes any such entities
from being regarded as irrelevant to the confirmation or dis-
confirmation of the generalization. If the generalization that all
ravens are black is supposed to be 'about' physical objects, then
the colours of shoes and ships and sealing-wax are all to be
regarded as relevant to its truth. Thus the field of relevant
investigation is made unacceptably large, and recognition of this
provides a motive for taking the scope or domain of the
generalization to be co-extensive with the category of entities
denoted by its antecedent term. But, if we therefore fall back on
taking the domain of 'All ravens are black' to be just ravens, we
have done nothing to eliminate the paradox.
Hempel's other reason for rejecting proposition (3) was that
when we judge the extent to which some object confirms a stated
hypothesis we tend tacitly to introduce a comparison of the
hypothesis with a body of evidence which includes not only that
particular object but also other items of information with which
we happen to be acquainted. For example, suppose that in
support of the hypothesis 'All sodium salts burn yellow' some-
body was to adduce an experiment in which a piece of pure ice
was held into a colourless flame and did not turn the flame
yellow. That result might be held to support the hypothesis in
question because it confirms the generalization 'Whatever does
not burn yellow is no sodium salt'. And yet at the same time this
appears paradoxical, because we happen to know anyhow that
ice contains no sodium salt and thus our experiment seems
irrelevant to the hypothesis that it was designed to test. Dis-
regard the background knowledge, argues Hempel, and the
paradox disappears.
But this argument is not very persuasive either. The sodium
salt example introduces a special feature that is not present in the
way in which the original paradox of the ravens may be
presented. It involves a piece of chemical knowledge that had at
one time to be discovered by empirical scientific enquiry,
namely the knowledge that ice contains no sodium salt. But the
knowledge that handkerchiefs are not ravens is part of the a
priori linguistic competence with which we approach any
intellectual problem. It cannot so easily be disregarded in order
to resolve a particular paradox.
The spirit of Hempel's solution (though not of his arguments
for it) is better maintained if confirmation is treated quantita-
tively rather than qualitatively and degrees of confirmation are
measured in Pascalian terms. This approach to the problem was
originally taken by Hosiasson-Lindenbaum²³ and was followed
Indeed one can easily see from this law that, other things being
equal, $p(H \mid E \,\&\, K)$ will get larger as $p(E \mid K)$ gets smaller, and vice
versa. So, because non-black non-ravens are common
objects, with a relatively high probability of occurrence, they
can do little to raise the probability of the hypothesis 'All ravens
are black', whereas black ravens are relatively rare objects and
will raise that probability much more (see pp. 136-7).
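
The quantitative point can be made concrete with a toy calculation. The sketch assumes that the law in question is Bayes's theorem, with $p(E \mid K)$ expanded by the rule of total probability; it further assumes a particular sampling procedure (a raven is drawn and found black, or a non-black thing is drawn and found to be a non-raven). All the numerical values and names are hypothetical, chosen only to show the direction of the effect.

```python
# Hypothetical background knowledge K (illustrative figures only).
PRIOR = 0.5                       # p(H | K), H = 'All ravens are black'
RAVEN_FRACTION = 0.001            # share of all objects that are ravens
NONBLACK_NONRAVEN_FRACTION = 0.9  # share that are non-black non-ravens
BLACK_SHARE_IF_FALSE = 0.5        # share of ravens that are black if H is false


def bayes(prior, p_e_given_h, p_e_given_not_h):
    """Return (p(H | E & K), p(E | K)) by Bayes's theorem."""
    p_e = prior * p_e_given_h + (1.0 - prior) * p_e_given_not_h
    return p_e_given_h * prior / p_e, p_e


# Evidence 1: a raven is drawn and found to be black.
# If H is true this is certain; if H is false it is only as likely as the
# share of black ravens.
raven_posterior, raven_p_e = bayes(PRIOR, 1.0, BLACK_SHARE_IF_FALSE)

# Evidence 2: a non-black thing is drawn and found to be a non-raven.
# Even if H is false, non-black ravens are a minute share of non-black things.
nonblack_ravens_if_false = RAVEN_FRACTION * (1.0 - BLACK_SHARE_IF_FALSE)
p_nonraven_if_false = NONBLACK_NONRAVEN_FRACTION / (
    NONBLACK_NONRAVEN_FRACTION + nonblack_ravens_if_false
)
handkerchief_posterior, handkerchief_p_e = bayes(PRIOR, 1.0, p_nonraven_if_false)
```

On these assumed figures the black raven is the less expected piece of evidence and raises the probability of the hypothesis appreciably, while the non-black non-raven is almost certain to turn up in any case and raises it only marginally, though not by nothing.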
This way of resolving Hempel's paradox will not satisfy any-
one who is intuitively convinced that any observations of non-
black non-ravens are totally irrelevant to any judgement of the
evidential support or confirmation that exists for the hypothesis
'All ravens are black'. All that such a resolution of the paradox
can do to explain away those intuitions is to suggest that they
arise from mistaking a very, very low degree of relevance for no
relevance at all.
Philosophers who find this explanation implausible, and also
wish to retain (2), have no alternative but to reject (1), and the
Method of Relevant Variables provides a rationale for so doing.
According to that method there are just two ways in which a
hypothesis may be shown to possess inductive support. Either an
appropriately structured test on the hypothesis turns out to have
satisfactory results, or the hypothesis is shown to be the logical
consequence of one or more propositions that have been satis-
factorily so tested. It follows that the evidence which supports a
hypothesis, according to the Method of Relevant Variables, is
constituted by the results of canonical tests on that hypothesis or
on logically entailing ones, and not by objects that just happen to
satisfy the antecedent and consequent of the hypothesized
generalization. Observation of a black raven, tout court, does not
provide any inductive support for the hypothesis that all ravens
are black, nor does a non-black non-raven provide any inductive
support for the hypothesis that all non-black things are non-
ravens. Thus (1) is rejected and the paradox cannot be
generated in Hempel's terms.
But that is not enough to dispose of the underlying problem, if
the paradox can be restated in terms of a concept of confirmation
that conforms to the Method of Relevant Variables. And indeed
it might seem that the paradox reappears if (1) is replaced by
(1') Any set of objects that are both As and Bs in each of an
appropriately selected variety of circumstances, con-
firms the hypothesis that everything which is an A is a B
and (3) is replaced by
(3') A set of white handkerchiefs in each of the circumstances
referred to in (1') does not confirm the hypothesis that all
ravens are black.
For in the light of (1') and (2) it might seem that a set of white
objects that have been observed in an appropriately wide variety
of circumstances confirms the hypothesis that all ravens are
black, whereas (3') denies that such a set of objects does confirm
the hypothesis.
However, when we use the Method of Relevant Variables we
are always in the position of having to exploit background know-
ledge or belief or assumptions about what variables are indeed
relevant to the hypothesis under examination. If we have no
such knowledge or belief or assumptions, we are not in a position
to test the hypothesis appropriately. Now, we may well know of
circumstances that are relevant to hypotheses of the category to
which 'All ravens are black' belongs. There are circumstances
like season, climate, diet, and so on, that may cause a bird to
change its plumage colour. But how could we have the requisite
kind of knowledge or belief about the hypothesis 'All non-black
things are non-ravens'? In order to have it we should need a list
of variables relevant to hypotheses of this category—that is, a list
of types of circumstances that have sometimes falsified such
hypotheses by causing objects or events that satisfy their subject-
terms to fail to satisfy their predicate-terms. But in fact we have
no information about circumstances that can thus turn a non-
raven into a raven or a non-swan into a swan. Such metamorph-
oses, or changes of species-membership, do not occur within our
experience and are contrary to what is thought genetically
possible. Consequently, even if a set of white objects were
observed in a wide variety of circumstances, it would give no
support thereby to the hypothesis that all non-black things are
non-ravens, if we are to judge support by the Method of
Relevant Variables. And unless there is some support for that
hypothesis the paradox cannot get off the ground. Or, in other
words, though 'All non-black things are non-ravens' must have
the same grade of inductive support, on given evidence, as 'All
ravens are black', such evidence must be the outcome of tests on
the latter proposition, not the former, if we adopt the Method of
Relevant Variables.
In sum, so far from its being the case that the paradox arises
because we take background knowledge into account inappro-
priately, as Hempel held, we actually have to take background
knowledge into account, quite properly, in order to dissolve the
paradox. While a Pascalian analysis of inductive reasoning
requires us to invoke contingent beliefs about the numbers of the
objects concerned so as to reject (3), a Baconian analysis in terms
of the Method of Relevant Variables requires us to reject (1) and
then to invoke contingent beliefs about the direction in which the
relevant variables operate in canonical tests.²⁵
certain time t are green. At time t, then, all our relevant observa-
tions confirm the hypothesis that all emeralds are green. But
consider the predicate 'grue' which applies to all things
examined before t just in case they are green and to other things
just in case they are blue. Obviously at time t, for each statement
of evidence asserting that a given emerald is green, we have a
parallel evidence-statement asserting that that emerald is grue.
And each evidence-statement that a given emerald is grue will
confirm the general hypothesis that all emeralds are grue. Hence
a hypothesis implying that all emeralds subsequently examined
will be green and a hypothesis implying that they will all be blue
are both confirmed at t by evidence statements describing the
same observations. Two mutually conflicting hypotheses are
equally well confirmed by the same evidence. Moreover, since t
may be whenever you please, the evidence collected prior to t
may contain very, very many green emeralds but the incom-
patible predictions will still be equally well confirmed at t. And
by choosing an appropriate predicate instead of 'grue' we can
clearly obtain equal confirmation for any prediction whatever
about other emeralds, or indeed for any prediction whatever
about any other kind of thing. For example, suppose 'grue'
applies to all things examined before t if and only if they are
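
The definition of 'grue' given above can be written out as a small sketch, which shows why the same observations fit both hypotheses while the predictions diverge. The representation of an observation as a pair (colour, examined before t) and the sample data are hypothetical.

```python
def is_grue(colour, examined_before_t):
    """'Grue' as defined above: green if examined before t, otherwise blue."""
    if examined_before_t:
        return colour == "green"
    return colour == "blue"


# Hypothetical evidence at time t: every emerald examined so far is green.
observations = [("green", True)] * 5

all_green_so_far = all(colour == "green" for colour, _ in observations)
all_grue_so_far = all(is_grue(colour, before) for colour, before in observations)


def predicted_colour(hypothesis):
    """Colour each hypothesis predicts for an emerald not examined before t."""
    predictions = {
        "all emeralds are green": "green",
        "all emeralds are grue": "blue",  # grue and not examined before t
    }
    return predictions[hypothesis]


green_prediction = predicted_colour("all emeralds are green")
grue_prediction = predicted_colour("all emeralds are grue")
```

Every evidence-statement in the sample satisfies both generalizations, yet the two hypotheses disagree about every emerald examined later.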
§ 26. T H E L O T T E R Y P A R A D O X
Press, 1971), 29-30. See also Y. Bar-Hillel's comment in H. E. Kyburg and E. Nagel
(eds.), Induction: Some Current Issues (Middletown: Wesleyan University Press, 1963), 46.
29 H. Kyburg, jun., 'Probability, rationality and a rule of detachment', in Y. Bar-
Hillel (ed.), Proceedings of the 1964 Congress for Logic, Methodology and the Philosophy of Science
(Amsterdam: North-Holland, 1965), 301-10. The first statement of the paradox is to be
found in H. E. Kyburg, jun., Probability and the Logic of Rational Belief (Middletown:
Wesleyan University Press, 1961), 197. A similar paradox is discussed by C. G.
Hempel, 'Deductive-Nomological vs. Statistical Explanation', in H. Feigl and G.
Maxwell (eds.), Minnesota Studies in the Philosophy of Science (1962), iii. 144-7.
(1) If E states all the available evidence, and the Pascalian
probability of H on E is within some suitably small
interval from 1, it is rational to believe H (i.e. justifiable
to accept H).
(2) If it is rational to believe $H_1$, rational to believe $H_2$, . . .
fiable to accept both the conjunction $H_1 \,\&\, H_2 \,\&\, \ldots \,\&\, H_n$ and also
30 For further discussion of the difference between belief and acceptance see
L. Jonathan Cohen, The Dialogue of Reason (Oxford: Clarendon Press, 1986), 92-7.
31 I. Levi, Gambling with Truth: An Essay on Induction and the Aims of Science (New York:
Knopf, 1967), 38-42. See also I. Levi, 'Deductive Cogency in Inductive Inference',
Journal of Philosophy, 57 (1965), 68-77, repr. in I. Levi, Decisions and Revisions: Philosophical
Essays on Knowledge and Value (Cambridge: Cambridge University Press, 1984), 42-50.
and at most, one such hypothesis is true. (Compare Scheffler's
approach to the paradox of the ravens in § 24.) For example,
acceptance of the hypothesis that ticket no. 1 will not win the
lottery is relative to a partition into the two possibilities 'Ticket
no. 1 will win' and 'Ticket no. 1 will not win'. And, analog-
ously, acceptance of the hypothesis that ticket no. 2 will not win
is relative to a different ultimate partition, according to which
the hypothesis that ticket no. 1 will not win is not even a candid-
ate for consideration. So, if it is therefore required that the cons-
traints of deductive cogency apply only in a way that is relative
to the same ultimate partition, we must put this requirement in
place of (2). We thus have no warrant, according to Levi, for
inferring that all the tickets in the imaginary lottery will fail to
win. And without such a warrant the paradox cannot get off the
ground. Of course, in order to resolve the paradox in this way
Levi has not only to reformulate the constraints of deductive
cogency but also to modify (1). On his view a high probability
for H, on E, does not provide sufficient grounds for accepting H
outright, even if E is all the available evidence. It may provide
grounds only for accepting H as against other hypotheses in the
ultimate partition to which H belongs. Nor is a high probability
necessary for H, if there are to be grounds for so accepting that
hypothesis. If each rival hypothesis has a lower probability than
H, H might still deserve victory over them in the struggle for
acceptance, on Levi's view, even if the probability of H itself
were in fact quite low.
Levi's thesis that a high probability is not a necessary
condition for the acceptance of H, as against its rivals, has some
plausibility in the context of inductive reasoning. If H and its
rivals are generalizations that make conflicting predictions over
an indeterminately large domain, we might expect that even the
winning hypothesis has a low probability on the evidence
because of the very low prior probability that is to be attached to
any generalization of this kind. So, if any such hypothesis is to be
accepted at all in science when its rivals have been elimin-
ated—and many are in fact so accepted—it is no use insisting
that a high probability is necessary, as in (1). But in order to
resolve the lottery paradox this point does not need to be
pressed. All Levi needs for the resolution of that paradox is his
insistence on making acceptance relative to an ultimate parti-
tion. Or, at any rate, that is the situation if the criterion for
inductive detachment is to be stated in Pascalian terms.
But there are some further points to clarify here. On Levi's
view there can be no authority for a clear-cut, unconditional,
non-relative acceptance or rejection that is parallel to belief or
disbelief. Yet the paradox originates because of the felt need for
a criterion to control detachments that are indeed of just that
kind. Levi's resolution of the paradox has the disadvantage of
presupposing that no such criterion is ever possible.
In practice the situation varies according to the nature of the
issue and the evidence. Where there is apparently nothing to
choose between several different ultimate partitions, there is no
reason to regard the hypothesis that is more acceptable than any
other within one particular partition as somehow superior to any
hypothesis that is more acceptable than the others within any
other partition. So, where more than one ultimate partition
seems equally respectable, no single hypothesis can be the over-
all winner and therefore declared to be the one that it is justi-
fiable to accept tout court. For example, the hypothesis that in the
above-mentioned lottery ticket no. 1 will not win is no more
acceptable, in an absolute sense, than the hypothesis that ticket
no. 2 will not win or than any other such hypothesis. But the
situation is often quite different, especially in the natural
sciences. Often just one set of rival hypotheses may be
considered to contain all the serious candidates for acceptance,
because of what we know already about the subject. And the best
of these hypotheses, if the evidence is strong enough, may then
be regarded as acceptable in an absolute rather than just a
comparative sense. It is better than any rival within the only ulti-
mate partition that is itself acceptable on the available evidence.
Thus Galileo, in his famous Dialogue Concerning the Two Chief
World Systems, discussed only whether the Ptolemaic or the
Copernican theory was preferable, and did not entertain also
some quite different partition of the fundamental possibilities.
Nevertheless, the lottery paradox cannot be reproduced in this
kind of situation, so it does not constitute an objection to Levi's
resolution of that paradox. Levi can still maintain his claim that
acceptance is always relative to an ultimate partition, even if in
many cases only one ultimate partition needs to be considered.
Indeed, the lottery situation is unrepresentative of acceptance
issues in more than one way. Not only does it allow the equal
legitimacy of a large number of different ultimate partitions,
with the consequences that have just been under examination. It
is also insulated from any practical question about whether all
the relevant facts are included in the available evidence. Since
the lottery is explicitly assumed to be administered fairly we can
treat it as a perfect game of chance and calculate the probabilities
of the various outcomes according to an indifference analysis
(§ 6). But where we are not dealing with aleatory probabilities
questions about the extent or spread of relevant evidence may be
highly pertinent. If the issue is, for example, whether it will rain
tomorrow or not, then what Keynes called the weight (§ 14) of
the available evidence is just as important as the probability on
that evidence. It is justifiable to accept that it will rain only if the
weight and the probability are both sufficiently high. So in this
respect any Pascalian criterion for detachment needs some
appropriate supplementation if it is to cover an important
feature of many non-aleatory situations (as Levi recognizes: see
p. 174).
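The double requirement just stated can be sketched as a two-threshold rule. This is only an illustration: treating Keynesian weight as a number between 0 and 1 is an assumption made for the sketch, and the function name `acceptable` and both threshold values are hypothetical, not drawn from Levi or Keynes.

```python
def acceptable(probability, weight, min_probability=0.9, min_weight=0.7):
    """Return True only if both the probability and the weight are high enough.

    Assumption: weight is represented here as a number in [0, 1], and the
    two default thresholds are arbitrary illustrative values.
    """
    return probability >= min_probability and weight >= min_weight


# High probability on ample evidence: acceptance is justified.
print(acceptable(0.95, 0.8))
# The same probability on scanty evidence: acceptance is not justified.
print(acceptable(0.95, 0.2))
```

On this sketch a purely Pascalian criterion corresponds to setting `min_weight` to zero, which is what the supplementation is meant to rule out in non-aleatory cases.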
Finally, it is worth noting that the lottery paradox arises only
if the criterion for inductive detachment is stated in Pascalian
terms. Suppose that we grade the evidence's support by the
Method of Relevant Variables or in accordance with some
derived scale of Baconian probability (pp. 167-8), and that,
instead of adopting (1), we select some suitably high Baconian
grade as the appropriate basis for acceptance. Then all we can at
best say in these terms about ticket no. 1, or about any of the
other tickets, is that the evidence is sufficient for us to accept that
its chances of not winning are 999,999 in 1,000,000. That is to
say, each ultimate hypothesis under consideration, and open to
acceptance, is a proposition about the aleatory probability of
winning, or of not winning, not a hypothesis about winning, or
about not winning. And in this context (2) is quite innocuous:
the hypotheses that we are justified in accepting have no para-
doxical consequences, even when considered collectively.
Moreover, this solution fits well with the fact that in normal
circumstances it would seem queer for someone to buy a ticket to
a lottery which he feels justified in accepting that he will not win.
Since many people do buy lottery tickets, many people presum-
ably do not feel justified in accepting that they will not win and
would find the relevant implications of (1) counter-intuitive.
Correspondingly the Method of Relevant Variables resolves the
prima-facie conflict between the need for inductive detachment
and the idea of deductive cogency—the conflict that underlies
the lottery paradox—by retaining (2) but reformulating (1) in
Baconian terms. It could well be argued, however, that, like
Levi's analysis, the Method of Relevant Variables still makes
inductively based acceptance relative to an ultimate partition of
hypotheses, since any eliminativist analysis of induction
presupposes such partitions (see §§ 2, 18, and 20).
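The arithmetic behind the paradox can be set out in a short sketch. It assumes that (1) is a Pascalian rule of the form 'accept a hypothesis if its probability reaches some threshold' and that the lottery is fair with exactly one winning ticket; the threshold of 99/100 and the function name `lottery_summary` are hypothetical choices made for illustration.

```python
from fractions import Fraction


def lottery_summary(n_tickets, threshold):
    """Summarize the lottery case for a fair lottery with exactly one winner.

    Assumption: the acceptance rule is a simple probability threshold.
    """
    # Aleatory probability that any one given ticket does not win.
    p_ticket_loses = Fraction(n_tickets - 1, n_tickets)
    # Under the threshold rule, "ticket i will not win" is accepted for every i.
    accepted_individually = p_ticket_loses >= threshold
    # The equiprobable outcomes are indexed by the winning ticket. In each
    # outcome that ticket does not lose, so "every ticket loses" holds in
    # none of the n_tickets outcomes.
    p_all_lose = Fraction(0, n_tickets)
    return p_ticket_loses, accepted_individually, p_all_lose


p_loses, accepted, p_all = lottery_summary(1_000_000, Fraction(99, 100))
print(p_loses)   # chance that a given ticket does not win
print(accepted)  # whether each such hypothesis passes the threshold
print(p_all)     # chance that no ticket at all wins
```

Each hypothesis of the form 'ticket i will not win' passes the threshold, yet their conjunction has probability zero, which is the clash between (1) and (2). On the Baconian reading what is accepted is only the statement of chances itself (here the value of `p_loses`), and collecting those statements for every ticket yields no such contradiction.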
So, just as (see §§ 23-5) the classical problem of induction,
the paradox of the ravens, and the 'grue' paradox all admit of
resolution both in terms of a Pascalian gradation for ampliative
induction and in terms of a Baconian one, so too the lottery
paradox can be dissolved both in Pascalian terms and in
Baconian ones. Indeed, even on the general issue of Pascalian-
32