Oxford Classic Texts
IN THE PHYSICAL SCIENCES

Theory of Probability
THIRD EDITION

Sir Harold Jeffreys

THEORY OF
PROBABILITY
BY

HAROLD JEFFREYS
Formerly Plumian Professor of Astronomy
University of Cambridge

THIRD EDITION

CLARENDON PRESS OXFORD


This book has been printed digitally and produced in a standard specification
in order to ensure its continuing availability

OXFORD
UNIVERSITY PRESS
Great Clarendon Street, Oxford OX2 6DP
Oxford University Press is a department of the University of Oxford.
It furthers the University's objective of excellence in research, scholarship,
and education by publishing worldwide in
Oxford New York
Auckland Bangkok Buenos Aires Cape Town Chennai
Dar es Salaam Delhi Hong Kong Istanbul Karachi Kolkata
Kuala Lumpur Madrid Melbourne Mexico City Mumbai Nairobi
São Paulo Shanghai Taipei Tokyo Toronto
Oxford is a registered trade mark of Oxford University Press
in the UK and in certain other countries
Published in the United States
© Oxford University Press 1961
The moral rights of the author have been asserted
Database right Oxford University Press (maker)
Reprinted 2003
All rights reserved. No part of this publication may be reproduced,
stored in a retrieval system, or transmitted, in any form or by any means,
without the prior permission in writing of Oxford University Press,
or as expressly permitted by law, or under terms agreed with the appropriate
reprographics rights organization. Enquiries concerning reproduction
outside the scope of the above should be sent to the Rights Department,
Oxford University Press, at the address above
You must not circulate this book in any other binding or cover
and you must impose this same condition on any acquirer
ISBN 0-19-850368-7
PREFACE TO THE CORRECTED IMPRESSION
SOME corrections and amplifications have been made in the present
version and an appendix on harmonic analysis and autocorrelation has
been added.
I am indebted for helpful discussions to Professor D. V. Lindley,
Professor H. E. Daniels, and Dr. A. M. Walker.
H. J.
Cambridge, 1966
PREFACE
IN the present edition I have developed more fully the proof of the
consistency of the postulates of the theory, including the product rule
and therefore the principle of inverse probability. The convergence
principle, otherwise the simplicity postulate for initial probabilities, is
shown to satisfy the conditions. The argument amounts to a proof that
axioms can be stated that will permit the attachment of a high probability
to any precisely stated law given suitable observational data. There
is still room for choice in the precise formulation of the convergence
principle, but I regard that as an invitation to others to try. I do not
claim to have done all that is needed, but I do claim to have done a great
deal more than any of my critics have noticed. Where the theory is
incomplete the outstanding questions are mostly either of a sort that
can make little difference to practical applications, or arise from diffi-
culties in stating the likelihood that would affect any theory.
Some mathematical proofs are given more fully than in previous
editions. A proof of the Pitman-Koopman theorem concerning the
existence of sufficient statistics is given in an extended form. The related
invariance theory of Huzurbazar for initial probabilities is described.
The revision of prior probabilities is brought into relation with the
theory of types.
Some points in later chapters have been transferred to the first, in the
hope that fewer critics will be misled into inferring what is not in the
book from not finding it in the first chapter. For instance, the difficulty
mentioned on p. 3 has been repeated as inescapable, whereas the greater
part of the book is devoted to showing how it can be met in a constructive
way; that on p. 119 continues to be stated though it was answered
thirty years ago, and arguments based on the assumption of equal
probabilities over an infinite class of laws are still given without mention
of the convergence principle.
Several writers, even recent ones, have described me as a follower of
the late Lord Keynes. Without wishing to disparage Keynes, I must
point out that the first two papers by Wrinch and me in the Philosophical
Magazine of 1919 and 1921 preceded the publication of Keynes's
book. What resemblance there is between the present theory and that
of Keynes is due to the fact that Broad, Keynes, and my collaborator
had all attended the lectures of W. E. Johnson. Keynes's distinctive
contribution was the assumption that probabilities are only partially
ordered; this contradicts my Axiom 1. I gave reasons for not accepting
it in a review of Keynes's book and in the first edition of Scientific
Inference. Mistakenly thinking that this was no longer necessary, I
omitted them from the second. Keynes himself withdrew his assumption
in his biographical essay on F. P. Ramsey. My own primary inspiration
came from Pearson's Grammar of Science, a work that is
apparently unknown to many present philosophers of science.
On the other hand, the main conclusion, that scientific method depends
on considering at the outset the hypothesis that variation of the
data is completely random, and modifying it step by step as the data
are found to support alternatives, is a complete reversal of the nature
of induction as understood by philosophers. Yet so far as I know no
philosopher has noticed it.
Adherents of frequency definitions of probability have naturally
objected to the whole system. But they carefully avoid mentioning my
criticisms of frequency definitions, which any competent mathematician
can see to be unanswerable. In this way they contrive to present me as
an intruder into a field where everything was already satisfactory. I
speak from experience in saying that students have no difficulty in
following my system if they have not already spent several years in
trying to convince themselves that they understand frequency theories.
Several authors have recently tried to construct theories that can be
regarded as compromises between the epistemological one and one that
admits intrinsic probabilities only; it seems to me that these are only
elaborate ways of shirking the problems. The present formulation is the
easiest that can be constructed.
However, there is a decided improvement in the willingness of
physicists to estimate the uncertainties of their results properly, and
I suppose that I can claim some of the credit for this. There is, however,
room for further improvement.
H. J.
Cambridge, 1961
PREFACE TO THE FIRST EDITION
THE chief object of this work is to provide a method of drawing infer-
ences from observational data that will be self-consistent and can also
be used in practice. Scientific method has grown up without much
attention to logical foundations, and at present there is little relation
between three main groups of workers. Philosophers, mainly interested
in logical principles but not much concerned with specific applications,
have mostly followed in the tradition of Bayes and Laplace, but with
the brilliant exception of Professor C. D. Broad have not paid much
attention to the consequences of adhering to the tradition in detail.
Modern statisticians have developed extensive mathematical techniques,
but for the most part have rejected the notion of the probability of a
hypothesis, and thereby deprived themselves of any way of saying
precisely what they mean when they decide between hypotheses.
Physicists have been described, by an experimental physicist who has
devoted much attention to the matter, as not only indifferent to funda-
mental analysis but actively hostile to it, and with few exceptions their
statistical technique has hardly advanced beyond that of Laplace. In
opposition to the statistical school, they and some other scientists are
liable to say that a hypothesis is definitely proved by observation,
which is certainly a logical fallacy; most statisticians appear to regard
observations as a basis for possibly rejecting hypotheses, but in no case
for supporting them. The latter attitude, if adopted consistently,
would reduce all inductive inference to guesswork; the former, if
adopted consistently, would make it impossible ever to alter the hypotheses,
however badly they agreed with new evidence. The present
attitudes of most physicists and statisticians are diametrically opposed,
but lack of a common meeting-ground has, to a very large extent, prevented
the opposition from being noticed. Nevertheless, both schools
have made great scientific advances, in spite of the fact that their
fundamental notions, for one reason or the other, would make such
advances impossible if they were consistently maintained.
In the present book I reject the attempt to reduce induction to
deduction, which is characteristic of both schools, and maintain that
the ordinary common-sense notion of probability is capable of precise
and consistent treatment when once an adequate language is provided
for it. It leads to the result that a precisely stated hypothesis may
attain either a high or a negligible probability as a result of observational
data, and therefore to an attitude intermediate between those
current in physics and statistics, but in accordance with ordinary
thought. Fundamentally the attitude is that of Bayes and Laplace,
though it is found necessary to modify their hypotheses before some
types of cases not considered by them can be treated, and some steps
in the argument have been filled in. For instance, the rule for assessing
probabilities given in the first few lines of Laplace's book is Theorem 7,
and the principle of inverse probability is Theorem 10. There is, on the
whole, a very good agreement with the recommendations made in
statistical practice; my objection to current statistical theory is not so
much to the way it is used as to the fact that it limits its scope at the
outset in such a way that it cannot state the questions asked, or the
answers to them, within the language that it provides for itself, and
must either appeal to a feature of ordinary language that it has declared
to be meaningless, or else produce arguments within its own language
that will not bear inspection.
The most beneficial result that I can hope for as a consequence of
this work is that more attention will be paid to the precise statement
of the alternatives involved in the questions asked. It is sometimes
considered a paradox that the answer depends not only on the observations
but on the question; it should be a platitude.
The theory is applied to most of the main problems of statistics, and
a number of specific applications are given. It is a necessary condition
for their inclusion that they shall have interested me. As my object is
to produce a general method I have taken examples from a number of
subjects, though naturally there are more from physics than from
biology and more from geophysics than from atomic physics. It was,
as a matter of fact, mostly with a view to geophysical applications that
the theory was developed. It is not easy, however, to produce a
statistical method that has application to only one subject, though
intraclass correlation, for instance, which is a matter of valuable positive
discovery in biology, is usually an unmitigated nuisance in physics.
It may be felt that many of the applications suggest further questions.
That is inevitable. It is usually only when one group of questions has
been answered that a further group can be stated in an answerable
form at all.
I must offer my warmest thanks to Professor R. A. Fisher and Dr. J.
for their kindness in answering numerous questions from a not
very docile pupil, and to Mr. R. B. Braithwaite, who looked over the
manuscript and suggested a number of improvements; also to the
Clarendon Press for their extreme courtesy at all stages.
St John's College, Cambridge
CONTENTS
I. FUNDAMENTAL NOTIONS

II. DIRECT PROBABILITIES

III. ESTIMATION PROBLEMS

IV. APPROXIMATE METHODS AND SIMPLIFICATIONS

V. SIGNIFICANCE TESTS: ONE NEW PARAMETER

VI. SIGNIFICANCE TESTS: VARIOUS COMPLICATIONS

VII. FREQUENCY DEFINITIONS AND DIRECT METHODS

VIII. GENERAL QUESTIONS

APPENDIX A. MATHEMATICAL THEOREMS

APPENDIX B. TABLES OF K

APPENDIX C. HARMONIC ANALYSIS AND AUTOCORRELATION

INDEX
I
FUNDAMENTAL NOTIONS
They say that Understanding ought to work by the rules of right
reason. These rules are, or ought to be, contained in Logic; but the
actual science of logic is conversant at present only with things either
certain, impossible, or entirely doubtful, none of which (fortunately)
we have to reason on. Therefore the true logic for this world is the
calculus of Probabilities, which takes account of the magnitude of
the probability which is, or ought to be, in a reasonable man's mind.
J. CLERK MAXWELL
1.0. THE fundamental problem of scientific progress, and a fundamental
one of everyday life, is that of learning from experience. Knowledge
obtained in this way is partly merely description of what we have already
observed, but part consists of making inferences from past experience
to predict future experience. This part may be called generalization
or induction. It is the most important part; events that are merely
described and have no apparent relation to others may as well be for-
gotten, and in fact usually are. The theory of learning in general is
the branch of logic known as epistemology. A few illustrations will
indicate the scope of induction. A botanist is confident that the plant
that grows from a mustard seed will have yellow flowers with four long
and two short stamens, and four petals and sepals, and this is inferred
from previous instances. The Nautical Almanac's predictions of the
positions of the planets, an engineer's estimate of the output of a new
dynamo, and an agricultural statistician's advice to a farmer about the
utility of a fertilizer are all inferences from past experience. When a
musical composer scores a bar he is expecting a definite series of sounds
when an orchestra carries out his instructions. In every case the inference
rests on past experience that certain relations have been found to
hold, and those relations are then applied to new cases that were not
part of the original data. The same applies to my expectations about
the flavour of my next meal. The process is so habitual that we hardly
notice it, and we can hardly exist for a minute without carrying it out.
On the rare occasions when anybody mentions it, it is called common
sense and left at that.
Now such inference is not covered by logic, as the word is ordinarily
understood. Traditional or deductive logic admits only three attitudes
to any proposition: definite proof, disproof, or blank ignorance. But no
number of previous instances of a rule will provide a deductive proof
that the rule will hold in a new instance. There is always the formal
possibility of an exception.
Deductive logic and its close associate, pure mathematics, have been
developed to an enormous extent, and in a thoroughly systematic way
—indeed several ways. Scientific method, on the other hand, has grown
up more or less haphazard, techniques being developed to deal with
problems as they arose, without much attempt to unify them, except
so far as most of the theoretical side involved the use of pure mathe-
matics, the teaching of which required attention to the nature of some
sort of proof. Unfortunately the mathematical proof is deductive, and
induction in the scientific sense is simply unintelligible to the pure
mathematician—as such; in his unofficial capacity he may be able to
do it very well. Consequently little attention has been paid to the
nature of induction, and apart from actual mathematical technique
the relation between science and mathematics has done little to develop
a connected account of the characteristic scientific mode of reasoning.
Many works exist claiming to give such an account, and there are some
highly useful ones dealing with methods of treating observations that
have been found useful in the past and may be found useful again. But
when they try to deal with the underlying general theory they suffer
from all the faults that modern pure mathematics has been trying to get
rid of: self-contradictions, circular arguments, postulates used without
being stated, and postulates stated without being used. Running through
the whole is the tendency to claim that scientific method can be reduced
in some way to deductive logic, which is the most fundamental fallacy
of all: it can be done only by rejecting its chief feature, induction.
The principal field of application of deductive logic is pure mathe-
matics, which pure mathematicians recognize quite frankly as dealing
with the working out of the consequences of stated rules with no
reference to whether there is anything in the world that satisfies those
rules. Its propositions are of the form 'If p is true, then q is true',
irrespective of whether we can find any actual instance where p is true.
The mathematical proposition is the whole proposition, 'If p is true,
then q is true', which may be true even if p is in fact always false. In
applied mathematics, as usually taught, general rules are asserted as
applicable to the external world, and the consequences are developed
logically by the technique of pure mathematics. If we inquire what
reason there is to suppose the general rules true, the usual answer is
simply that they are known from experience. However, this use of the
word 'experience' covers a confusion. The rules are inferred from past
experience, and then applied to future experience, which is not the same
thing. There is no guarantee whatever in deductive logic that a rule
that has held in all previous instances will not break down in the next
instance or in all future instances. Indeed there are an infinite number
of rules that have held in all previous cases and cannot possibly all
hold in future ones. For instance, consider a body falling freely under
gravity. It would be asserted that the distance at time $t$ below a fixed
level is given by a formula of the type
$$s = a + ut + \tfrac{1}{2}gt^2. \tag{1}$$
This might be asserted from observations of $s$ at a series of instants
$t_1, t_2, \dots, t_n$. That is, our previous experience asserts the proposition
that $a$, $u$, and $g$ exist such that
$$s_r = a + ut_r + \tfrac{1}{2}gt_r^2 \tag{2}$$
for all values of $r$ from 1 to $n$. But the law (1) is asserted for all values
of $t$. But consider the law
$$s = a + ut + \tfrac{1}{2}gt^2 + f(t)(t-t_1)(t-t_2)\cdots(t-t_n), \tag{3}$$
where $f(t)$ may be any function whatever that is not infinite at any of
$t_1, t_2, \dots, t_n$, and $a$, $u$, and $g$ have the same values as in (1). There are an
infinite number of such functions. Every form of (3) will satisfy the
set of relations (2), and therefore every one has held in all previous
cases. But if we consider any other instant $t_{n+1}$ (which might be either
within or outside the range of time between the first and last of the
original observations) it will be possible to choose $f(t_{n+1})$ in such a way
as to give $s$ as found from (3) any value whatever at time $t_{n+1}$. Further,
there will be an infinite number of forms of $f(t)$ that would give the same
value of $f(t_{n+1})$, and there are an infinite number that would give different
values. If we observe $s$ at time $t_{n+1}$, we can choose $f(t_{n+1})$ to give
agreement with it, but an infinite number of forms of $f(t)$ consistent
with this value would be consistent with any arbitrary value of $s$ at a
further moment $t_{n+2}$. That is, even if all the observed values agree with
(1) exactly, deductive logic can say nothing whatever about the value
of $s$ at any other time. An infinite number of laws agree with previous
experience, and an infinite number that have agreed with previous
experience will inevitably be wrong in the next instance. What the applied
mathematician does, in fact, is to select one form out of this infinity;
and his reason for doing so has nothing whatever to do with traditional
logic. He chooses the simplest. This is actually an understatement of
the case; because in general the observations will not agree with (1)
exactly, a polynomial of n terms can still be found that will agree exactly
with the observed values at times $t_1, \dots, t_n$, and yet the form (1) may
still be asserted. Similar considerations apply to any quantitative law.
The further significance of this matter must be reserved till we come to
significance tests. We need notice at the moment only that the choice
of the simplest law that fits the facts is an essential part of procedure
in applied mathematics, and cannot be justified by the methods of
deductive logic. It is, however, rarely stated, and when it is stated it
is usually in a manner suggesting that it is something to be ashamed of.
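
The argument around laws (1) to (3) can be checked numerically. The following is only an illustrative sketch: the function names (`simple_law`, `rival_law`, `constant_f_for_target`), the observation times, and the numerical constants are assumptions introduced here, not anything given in the text. It takes $f(t)$ to be a constant, the simplest admissible choice, and shows that the rival law agrees with the simple law at every observed instant while giving an arbitrarily chosen value at a new instant.

```python
from math import prod


def simple_law(t, a, u, g):
    """Law (1): s = a + u*t + (1/2)*g*t**2."""
    return a + u * t + 0.5 * g * t**2


def rival_law(t, a, u, g, times, f):
    """Law (3): law (1) plus f(t) times the product of (t - t_r).

    The product vanishes at every observed instant in `times`, so the
    rival law reproduces law (1) there whatever f is.
    """
    return simple_law(t, a, u, g) + f(t) * prod(t - t_r for t_r in times)


def constant_f_for_target(t_new, target, a, u, g, times):
    """Constant value of f that makes law (3) equal `target` at t_new.

    Assumes t_new is not one of the observed instants, so the product
    (t_new - t_1)...(t_new - t_n) is non-zero.
    """
    gap = prod(t_new - t_r for t_r in times)
    return (target - simple_law(t_new, a, u, g)) / gap


if __name__ == "__main__":
    # Hypothetical observation times and constants, chosen for illustration.
    times = [0.0, 1.0, 2.0, 3.0]
    a, u, g = 0.0, 0.0, 9.8

    # Pick any value at all for the unobserved instant t = 4.
    c = constant_f_for_target(4.0, -100.0, a, u, g, times)

    def f(t):
        return c

    for t in times + [4.0]:
        print(t, simple_law(t, a, u, g), rival_law(t, a, u, g, times, f))
```

Changing the target value changes the rival law without disturbing its agreement with the past observations, which is the sense in which the observations alone cannot single out law (1).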
We may recall the words of Brutus:
But 'tis a common proof
That lowliness is young ambition's ladder,
Whereto the climber upwards turns his face;
But when he once attains the upmost round,
He then unto the ladder turns his back,
Looks in the clouds, scorning the base degrees
By which he did ascend.

It is asserted, for instance, that the choice of the simplest law is purely
a matter of economy of description or thought, and has nothing to do
with any reason for believing the law. No reason in deductive logic,
certainly; but the question is, Does deductive logic contain the whole
of reason? It does give economy of description of past experience, but
is it unreasonable to be interested in future experience? Do we make
predictions merely because those predictions are the easiest to make?
Does the Nautical Almanac Office laboriously work out the positions
of the planets by means of a complicated set of tables based on the
law of gravitation and previous observations, merely for convenience,
when it might much more easily guess them? Do sailors trust the
safety of their ships to the accuracy of these predictions for the same
reason? Does a town install a new tramway system, with expensive
plant and much preliminary consultation with engineers, with no more
reason to suppose that the trams will move than that the laws of
electromagnetic induction are a saving of trouble? I do not believe
for a moment that anybody will answer any of these questions in the
affirmative, but an affirmative answer is implied by the assertion that
is still frequently made, that the choice of the simplest law is merely a
matter of convention. I say, on the contrary, that the simplest law is
chosen because it is the most likely to give correct predictions, that the
choice is based on a reasonable degree of belief; and that the fact that
deductive logic provides no explanation of the choice of the simplest
law is an absolute proof that deductive logic is grossly inadequate to
cover scientific and practical requirements. It is sometimes said, again,
that the trust in the simple law is a peculiarity of human psychology;
a different type of being might behave differently. Well, I see no point
whatever in discussing at length whether the human mind is any use;
it is not a perfect reasoning instrument, but it is the only one we have.
Deductive logic itself could never be known without the human mind.
If anybody rejects the human mind and then holds that he is constructing
valid arguments, he is contradicting himself; if he holds that human
minds other than his own are useless, and then hopes to convince them
by argument, he is again contradicting himself. A critic is himself
using inductive inference when he expects his words to convey the same
meaning to his audience as they do to himself, since the meanings of
words are learned first by noting the correspondence between things
and the sounds uttered by other people, and then applied in new
instances. On the face of it, it would appear that a general state-
ment that something accepted by the bulk of mankind is intrinsically
nonsense requires much more to support it than a mere declaration.
Many attempts have been made, while accepting induction, to claim
that it can be reduced in some way to deduction. Bertrand Russell
has remarked that induction is either disguised deduction or a mere
method of making plausible guesses.† In the former sense we must look
for some general principle, which states a set of possible alternatives;
then observations are used to show that all but one of these are wrong,
and the survivor is held to be deductively demonstrated. Such an attitude
has been widely advocated. On it I quote Professor C. D. Broad:‡
The usual view of the logic books seems to be that inductive arguments are
really syllogisms with propositions summing up the relevant observations as
minors, and a common major consisting of some universal proposition about
nature. If this were true it ought to be easy enough to find the missing major,
and the singular obscurity in which it is enshrouded would be quite inexplicable.
It is reverently referred to by inductive logicians as the Uniformity of Nature,
but, as it is either never stated at all or stated in such terms that it could not

† Principles of Mathematics, p. 360. He said, at the Aristotelian Society summer
meeting in 1039, that this remark has been too much quoted. I therefore offer apologies
for quoting it again. He has also remarked that the inductive philosophers of central
Africa formerly held the view that all men were black. My comment would be that
the deductive ones, if there were any, did not hold that there were any men, black,
white, or yellow.
‡ Mind, 29, 1920, 11.

possibly do what is required of it, it appears to be the inductive equivalent of
Mrs. Gamp's mysterious friend, and might be more appropriately named Major
Harris.
It is in fact easy to prove that this whole way of looking at inductive arguments
is mistaken. On this view they are all syllogisms with a common major.
Now their minors are propositions summing up the relevant observations. If the
observations have been carefully made the minors are practically certain. Hence,
if this theory were true, the conclusions of all inductive arguments in which the
observations were equally carefully made would be equally probable. For what
could vary the probabilities? Not the major, which is common to all of them.
Not the minors, which by hypothesis are equally certain. Not the mode of
reasoning, which is syllogistic in each case. But the result is preposterous, and
is enough to refute the theory which leads to it.
Attempts have been made to supply the missing major by several
modern physicists, notably Sir Arthur Eddington and Professor
E. A. Milne. But their general principles and their results differ even
within the very limited field of knowledge where they have been
applied. How is a person with less penetration to know which is right,
if any? Only by comparing the results with observation; and then his
reason for believing the survivor to be likely to give the right results
in future is inductive. I am not denying that one of them may have
got the right results. But I reject the statement that any of them can
be said to be certainly right as a matter of pure logic, independently of
experience; and I gravely doubt whether any of them could have been
thought of at all had the authors been unaware of the vast amount of
previous work that had led to the establishment by inductive methods
of the laws that they set out to explain. These attempts, though they
appear to avoid Broad's objection, do so only within a limited range,
and it is doubtful whether such an attempt is worth making if it can
at best achieve a partial success, when induction can cover the whole
field without supposing that special rules hold in certain subjects.
I should maintain (with N. R. Campbell, who says† that a physicist
would be more likely to interchange the two terms in Russell's statement)
that a great deal of what passes for deduction is really disguised
induction, and that even some of the postulates of Principia Mathematica
are adopted on inductive grounds (which, incidentally, are false).
Karl Pearson‡ writes as follows:
Now this is the peculiarity of scientific method, that when once it has become
a habit of mind, that mind converts all facts whatsoever into science The field
of science is unlimited; its material is endless, every group of natural phenomena,

† Physics, The Elements, 1920, p. 9.
‡ The Grammar of Science, 1892. Page 16 of Everyman edition, 1938.


every phase of social life, every stage of past or present development is material
for science. The unity of all science consists alone in its method, not in its material.
The man who classifies facts of any kind whatever, who sees their mutual relation
and describes their sequences, is applying the scientific method and is a man of
science. The facts may belong to the past history of mankind, to the social
statistics of our great cities, to the atmosphere of the most distant stars, to the
digestive organs of a worm, or to the life of a scarcely visible bacillus. It is not
the facts themselves which form science, but the methods by which they are
dealt with.

Here, in a few sentences, Pearson sets our problem. The italics are his.
He makes a clear distinction between method and material. No matter
what the subject-matter, the fundamental principles of the method
must be the same. There must be a uniform standard of validity for
all hypotheses, irrespective of the subject. Different laws may hold in
different subjects, but they must be tested by the same criteria; other-
wise we have no guarantee that our decisions will be those warranted
by the data and not merely the result of inadequate analysis or of
believing what we want to believe. An adequate theory of induction
must satisfy two conditions. First, it must provide a general method;
secondly, the principles of the method must not of themselves say
anything about the world. If the rules are not general, we shall have
different standards of validity in different subjects, or different stan-
dards for one's own hypotheses and somebody else's. If the rules of
themselves say anything about the world, they will make empirical
statements independently of observational evidence, and thereby limit
the scope of what we can find out by observation. If there are such
limits, they must be inferred from observation, we must not assert them
in advance.
We must notice at the outset that induction is more general than
deduction. The answers given by the latter are limited to a simple
'yes', 'no', or 'it doesn't follow'. Inductive logic must split up the last
alternative, which is of no interest to deductive logic, into a number
of others, and say which of them it is most reasonable to believe on
the evidence available. Complete proof and disproof are merely the
extreme cases. Any inductive inference involves in its very nature the
possibility that the alternative chosen as the most likely may in fact
be wrong. Exceptions are always possible, and if a theory does not
provide for them it will be claiming to be deductive when it cannot be.
On account of this extra generality, induction must involve postulates
not included in deduction. Our problem is to state these postulates.
It is important to notice that they cannot be proved by deductive
logic. If they could, induction would be reduced to deduction, which
is impossible. Equally they are not empirical generalizations; for in-
duction would be needed to make them and the argument would be
circular. We must in fact distinguish the general rules of the theory
from the empirical content. The general rules are a priori propositions,
accepted independently of experience, and making by themselves no
statement about experience. Induction is the application of the rules
to observational data.
Our object, in short, is not to prove induction; it is to tidy it up.
Even among professional statisticians there are considerable differences
about the best way of treating the same problem, and, I think, all
statisticians would reject some methods habitual in some branches of
physics. The question is whether we can construct a general method,
the acceptance of which would avoid these differences or at least reduce
them.

1.1. The test of the general rules, then, is not any sort of proof. This
is no objection because the primitive propositions of deductive logic
cannot be proved either. All that can be done is to state a set of
hypotheses, as plausible as possible, and see where they lead us. The
fullest development of deductive logic and of the foundations of mathe-
matics is that of Principia Mathematica, which starts with a number of
primitive propositions taken as axioms; if the conclusions are accepted,
that is because we are willing to accept the axioms, not because the
latter are proved. The same applies, or used to apply, to Euclid. We
must not hope to prove our primitive propositions when this is the
position in pure mathematics itself. But we have rules to guide us in
stating them, largely suggested by the procedure of logicians and pure
mathematicians.
1. All hypotheses used must be explicitly stated, and the conclusions
must follow from the hypotheses.
2. The theory must be self-consistent, that is, it must not be possible
to derive contradictory conclusions from the postulates and any given
set of observational data.
3. Any rule given must be applicable in practice. A definition is
useless unless the thing defined can be recognized in terms of the
definition when it occurs. The existence of a thing or the estimate of
a quantity must not involve an impossible experiment.
4. The theory must provide explicitly for the possibility that infer-
ences made by it may turn out to be wrong. A law may contain
adjustable parameters, which may be wrongly estimated, or the law
itself may be afterwards found to need modification. It is a fact that
revision of scientific laws has often been found necessary in order to
take account of new information—the relativity and quantum theories
providing conspicuous instances—and there is no conclusive reason to
suppose that any of our present laws are final. But we do accept
inductive inference in some sense; we have a certain amount of confi-
dence that it will be right in any particular case, though this confidence
does not amount to logical certainty.
5. The theory must not deny any empirical proposition a priori;
any precisely stated empirical proposition must be formally capable of
being accepted, in the sense of the last rule, given a moderate amount
of relevant evidence.
These five rules are essential. The first two impose on inductive logic
criteria already required in pure mathematics. The third and fifth
enforce the distinction between a priori and empirical propositions;
if an existence depends on an inapplicable definition we must either
find an applicable one, treat the existence as an empirical proposition
requiring test, or abandon it. The fourth states the distinction between
induction and deduction. The fifth makes Pearson's distinction be-
tween material and method explicit, and involves the definite rejection
of attempts to derive empirically verifiable propositions from general
principles adopted independently of experience.
The following rules also serve as useful guides.
6. The number of postulates should be reduced to a minimum. This
is done for deductive logic in Principia, though many theorems proved
there appear to be as obvious intuitively as the postulates. The motive
for not accepting other obvious propositions as postulates is partly
aesthetic. But in the theory of scientific method it is still more impor-
tant, because if we choose the postulates so as to cover the subject with
the minimum number of postulates we thereby minimize the number
of acts of apparently arbitrary choice. Most works on the subject state
more principles than I do, use far more than they state, and fail to
touch many important problems. So far as their assumptions are valid
they are consequences of mine; the present theory aims at removing
irrelevancies.
7. While we do not regard the human mind as a perfect reasoner,
we must accept it as a useful one and the only one available. The
theory need not represent actual thought-processes in detail, but should
agree with them in outline. We are not limited to considering only the
thought-processes that people describe to us. It often happens that
their behaviour is a better criterion of their inductive processes than
their arguments. If a result is alleged to be obtained by arguments
that are certainly wrong, it does not follow that the result is wrong,
since it may have been obtained by a rough inductive process that
the author thinks it undesirable or unnecessary to state on account
of the traditional insistence on deduction as the only valid reasoning.
I disagree utterly with many arguments produced by the chief current
schools of statistics, but I rarely differ seriously from the conclusions;
their practice is far better than their precept. I should say that this
is the result of common sense emerging in spite of the deficiencies of
mathematical teaching. The theory must provide criteria for testing
the chief types of scientific law that have actually been suggested or
asserted. Any such law must be taken seriously in the sense that it can
be asserted with confidence on a moderate amount of evidence. The
fact that simple laws are often asserted will, on this criterion, require
us to say that in any particular instance some simple law is quite likely
to be true.
8. In view of the greater complexity of induction, we cannot hope
to develop it more thoroughly than deduction. We shall therefore take
it as a rule that an objection carries no weight if an analogous objection
would invalidate part of generally accepted pure mathematics. I do
not wish to insist on any particular justification of pure mathematics,
since authorities on its foundations are far from being agreed among
themselves. In Principia much of higher mathematics, including the
whole theory of the continuous variable, rests on the axioms of infinity
and reducibility, which are rejected by Hilbert. F. P. Ramsey rejects
the axiom of reducibility, while declaring that the multiplicative axiom,
properly stated, is the most evident tautology, though Whitehead and
Russell express much doubt about it and carefully separate propositions
that depend on it from those that can be proved without it. I should
go further and say that the proof of the existence of numbers, according
to the Principia definition of number, depends on the postulate that
all individuals are permanent, which is an empirical proposition, and
a false one, and should not be made part of a deductive logic. But we
do not need such a proof for our purposes. It is enough that pure
mathematics should be consistent. If the postulate could hold in some
world, even if it was not the actual world, that would be enough to
establish consistency. Then the derivation of ordinary mathematics
from the postulates of Principia can be regarded as a proof of its
consistency. But the justification of all the justifications seems to be
that they lead to ordinary pure mathematics in the end; I shall assume
that the latter has validity irrespective of any particular justification.
The above principles will strike many readers as platitudes, and if
they do I shall not object. But they require the rejection of several
principles accepted as fundamental in other theories. They rule out,
in the first place, any definition of probability that attempts to define
probability in terms of infinite sets of possible observations, for we
cannot in practice make an infinite number of observations. The Venn
limit, the hypothetical infinite population of Fisher, and the ensemble
of Willard Gibbs are useless to us by rule 3. Though many accepted
results appear to be based on these definitions, a closer analysis shows
that further hypotheses are required before any results are obtained,
and these hypotheses are not stated. In fact, no 'objective' definition
of probability in terms of actual or possible observations, or possible
properties of the world, is admissible. For, if we made anything in our
fundamental principles depend on observations or on the structure of
the world, we should have to say either (1) that the observations we
can make, and the structure of the world, are initially unknown; then
we cannot know our fundamental principles, and we have no possible
starting-point; or (2) that we know a priori something about observa-
tions or the structure of the world, and this is illegitimate by rule 5.
Attempts to use the latter principle will superpose our preconceived
notions of what is objective on the entire system, whereas, if objectivity
has any meaning at all, our aim must be to find out what is objective
by means of observations. To try to give objective definitions at the
start will at best produce a circular argument, may lead to contradic-
tions, and in any case will make the whole scheme subjective beyond hope
of recovery. We must not rule out any empirical proposition a priori;
we must provide a system that will enable us to test it when occasion
arises, and this requires a completely comprehensive formal scheme.
We must also reject what is variously called the principle of causality,
determinism, or the uniformity of nature, in any such form as 'Precisely
similar antecedents lead to precisely similar consequences'. No two
sets of antecedents are ever identical, they must differ at least in time
and position. But even if we decide to regard time and position as
irrelevant (which may be true, but has no justification in pure logic)
the antecedents are never identical. In fact, determinists usually recog-
nize this verbally and try to save the principle by restating it in some
such form as 'In precisely the same circumstances very similar things
can be observed, or very similar things can usually be observed'.† If
'precisely the same' is intended to be a matter of absolute truth, we
cannot achieve it. Astronomy is usually considered a science, but the
planets have never even approximately repeated their positions since
astronomy began. The principle gives us no means of inferring the
accelerations at a single instant, and is utterly useless. Further, if it
was to be any use we should have to know at any application that the
entire condition of the world was the same as in some previous instance.
This is never satisfied in the most carefully controlled experimental
conditions. The most that can be done is to make those conditions the
same that we believe to be relevant; 'the same' can never in practice
mean more than 'the same as far as we know', and usually means a
great deal less. The question then arises, How do we know that the
neglected variables are irrelevant? Only by actually allowing them to
vary and verifying that there is no associated variation in the result;
but this requires the use of significance tests, a theory of which must
therefore be given before there is any application of the principle, and
when it is given it is found that the principle is no longer needed and
can be omitted by rule 6. It may conceivably be true in some sense,
though nobody has succeeded in stating clearly what this sense is. But
what is quite certain is that it is useless.
Causality, as used in applied mathematics, has a more general form,
such as 'Physical laws are expressible by mathematical equations,
possibly connecting continuous variables, such that in any case, given
a finite number of parameters, some variable or set of variables that
appears in the equations is uniquely determined in terms of the others.'
This does not require that the values of the relevant parameters should
be actually repeated; it is possible for an electrical engineer to predict
the performance of a dynamo without there having already been some
exactly similar dynamo. The equations, which we call laws, are inferred
from previous instances and then applied to instances where the relevant
quantities are different. This form permits astronomical prediction.
But it still leaves the questions 'How do we know that no other para-
meters than those stated are needed?', 'How do we know that we need
consider no variables as relevant other than those mentioned explicitly
in the laws?', and 'Why do we believe the laws themselves?' It is
only after these questions have been answered that we can make any
actual application of the principle, and the principle is useless until we
have attended to the epistemological problems. Further, the principle
happens to be false for quantitative observations. It is not true that
observed results agree exactly with the predictions made by the laws
actually used. The most that the laws do is to predict a variation that
accounts for the greater part of the observed variation; it never accounts
for the whole. The balance is called 'error' and usually quickly for-
gotten or altogether disregarded in physical writings, but its existence
compels us to say that the laws of applied mathematics do not express
the whole of the variation. Their justification cannot be exact mathe-
matical agreement, but only a partial one depending on what fraction
of the observed variation in one quantity is accounted for by the
variations of the others. The phenomenon of error is often dealt
with by a suggestion of various minor variations that might alter the
measurements, but this is no answer. An exact quantitative prediction
could never be made, even if such a suggestion was true, unless we
knew in each individual case the actual amounts of the minor varia-
tions, and we never do. If we did we should allow for them and obtain
a still closer agreement, but the fact remains that in practice, however
fully we take small variations into account, we never get exact agree-
ment. A physical law, for practical use, cannot be merely a statement
of exact predictions; if it was it would invariably be wrong and would
be rejected at the next trial. Quantitative prediction must always be
prediction within a margin of uncertainty; the amount of this margin
will be different in different cases, but for a law to be of any use it
must state the margin explicitly. The outstanding variation, for prac-
tical application, is as essential a part of the law as the predicted
variation is, and a valid statement of the law must express it. But in
any individual case this outstanding variation is not known. We know
only something about its possible range of values, not what the actual
value will be. Hence a physical law is not an exact prediction, but a state-
ment of the relative probabilities of variations of different amounts. It is
only in this form that we can avoid rejecting causality altogether as false,
or as inapplicable under rule 3; but a statement of ignorance of the individual
errors has become an essential part of it, and we must recognize that the
physical law itself, if it is to be of any use, must have an epistemological
content.

† W. H. George, The Scientist in Action, 1936, p. 48.

The impossibility of exact prediction has been forced on the attention
of physicists by Heisenberg's Uncertainty Principle. It is remarkable,
considering that the phenomenon of errors of observation was discussed
by Laplace and Gauss, that there should still have been any physicists
that thought that actual observations were exactly predictable, yet
attempts to evade the principle showed that many existed. The prin-
ciple was actually no new uncertainty. What Heisenberg did was to
consider the most refined types of observation that modern physics
suggested might be possible, and to obtain a lower limit to the uncer-
tainty; but it was much smaller than the old uncertainty, which was
never neglected except by misplaced optimism. The existence of errors
of observation seems to have escaped the attention of many philo-
sophers that have discussed the uncertainty principle; this is perhaps
because they tend to get their notions of physics from popular writings,
and not from works on the combination of observations. Their criti-
cisms of popular physics, mostly valid as far as they go, would gain
enormously in force if they attended to what we knew about errors
before Heisenberg.†
The word error is liable to be interpreted in some ethical sense, but
its scientific meaning is closely connected with the original one. Latin
errare, in its original sense, means to wander, not to sin or to make
a mistake. The meaning occurs in 'knight-errant'. The error means
simply the outstanding variation after we have done our best to inter-
pret the whole variation.
The criterion of universal assent, stated by Dr. N. R. Campbell and
by Professor H. Dingle in his Science and Human Experience (but
abandoned in his Through Science to Philosophy), must also be rejected
by rule 3. This criterion requires general acceptance of a principle
before it can be adopted. But it is impossible to ask everybody's con-
sent before one believes anything, and if 'everybody' is replaced by
'everybody qualified to judge', we cannot apply the criterion until we
know who is qualified, and even then it is liable to happen that only
a small fraction of the people capable of expressing an opinion on a
scientific paper read it at all, and few even of those do express any.
Campbell lays much stress on a physicist's characteristic intuition,‡
which apparently enables him always to guess right. But if there is
any such intuition there is no need for the criterion of general agree-
ment or for any other. The need for some general criterion is that even
among those apparently qualified to judge there are often serious
differences of opinion about the proper interpretation of the same
facts; what we need is an impersonal criterion that will enable an
individual to see whether, in any particular instance, he is following
the rules that other people follow and that he himself follows in other
instances.

† Professor L. S. Stebbing (Philosophy and the Physicists, 1935, p. 198) remarks:
'There can be no doubt at all that precise predictions concerning the behaviour of
macroscopic bodies are made and are mostly verified within the limits of experimental
error.' Without the saving phrase at the end the statement is intelligible, and false.
With it, it is meaningless. The severe criticism of much in modern physics contained
in this book is, in my opinion, thoroughly justified, but the later parts lose much of
their point through inattention to the problem of errors of observation. Some philo-
sophers, however, have seen the point quite clearly. For instance, Professor J. H.
Muirhead (The Elements of Ethics, 1910, pp. 37-38) states: 'The truth is that what is
called a natural law is itself not so much a statement of fact as of a standard or type
to which facts have been found more or less to approximate. This is true even in
inorganic nature.' I am indebted to Mr. John Bradley for the reference.
‡ Arist. Soc. Suppl. vol. 17, 1938, 122.

1.2. The chief constructive rule is 4. It declares that there is a valid
primitive idea expressing the degree of confidence that we may reason-
ably have in a proposition, even though we may not be able to give
either a deductive proof or a disproof of it. In extreme cases it may
be a mere statement of ignorance. We need to express its rules. One
obvious one (though it is very commonly overlooked) is that it depends
both on the proposition considered and on the data in relation to which
it is considered. Suppose that I know that Smith is an Englishman,
but otherwise know nothing particular about him. He is very likely,
on that evidence, to have a blue right eye. But suppose that I am
informed that his left eye is brown; the probability is changed com-
pletely. This is a trivial case, but the principle in it constitutes most
of our subject-matter. It is a fact that our degrees of confidence in
a proposition habitually change when we make new observations or
new evidence is communicated to us by somebody else, and this change
constitutes the essential feature of all learning from experience. We
must therefore be able to express it. Our fundamental idea will not be
simply the probability of a proposition p, but the probability of p on
data q. Omission to recognize that a probability is a function of two
arguments, both propositions, is responsible for a large number of
serious mistakes; in some hands it has led to correct results, but at the
cost of omitting to state essential hypotheses and giving a delusive
appearance of simplicity to what are really very difficult arguments.
It is no more valid to speak of the probability of a proposition without
stating the data than it would be to speak of the value of x+y for given x,
irrespective of the value of y.
We can now proceed on rule 7. It is generally believed that proba-
bilities are orderable; that is, that if p, q, r are three propositions,
the statement 'on data p, q is more probable than r' has a meaning.
In actual cases people may disagree about which is the more probable,
and it is sometimes said that this implies that the statement has no
meaning. But the differences may have other explanations: (1) The
commonest is that the probabilities are on different data, one person
having relevant information not available to the other, and we have
made it an essential point that the probability depends on the data.
The conclusion to draw in such a case is that, if people argue without
telling each other what relevant information they have, they are wasting
their time. (2) The estimates may be wrong. It is perfectly possible to
get a wrong answer in pure mathematics, so that by rule 8 this is no
objection. In this case, where the probability is often a mere guess,
we cannot expect the answer to be right, though it may be and often
is a fair approximation. (3) The wish may be father to the thought.
But perhaps this also has an analogue in pure mathematics, if we con-
sider the number of fallacious methods of squaring the circle and
proving Fermat's last theorem that have been given, merely because
people wanted π to be an algebraic or rational number or the theorem
to be true. In any case alternative hypotheses are open to the same
objection, on the one hand, that they depend on a wish to have a wholly
deductive system and to avoid the explicit statement of the fact that
scientific inferences are not certain; or, on the other, that the statement
that there is a most probable alternative on given data may curtail their
freedom to believe another when they find it more pleasant. I think
that these reasons account for all the apparent differences, but they
are not fundamental. Even if people disagree about which is the more
probable alternative, they agree that the comparison has a meaning.
We shall assume that this is right. The meaning, however, is not a
statement about the external world, it is a relation of inductive logic.
Our primitive notion, then, is that of the relation 'given p, q is more
probable than r', where p, q, and r are three propositions. If this is
satisfied in a particular instance, we say that r is less probable than q,
given p; this is the definition of less probable. If given p, q is neither
more nor less probable than r, q and r are equally probable, given p.
Then our first axiom is
AXIOM 1. Given p, q is either more, equally, or less probable than r,
and no two of these alternatives can be true.
This axiom may be called that of the comparability of probabilities.
In the first edition of Scientific Inference I took it in a more general
form, assuming that the probabilities of propositions on different data can
be compared. But this appears to be unnecessary, because it is found that
the comparability of probabilities on different data, whenever it arises in
practice, is proved in the course of the work and needs no special axiom.
The fundamental relation is transitive; we express this as follows.
AXIOM 2. If p, q, r, s are four propositions, and, given p, q is more
probable than r and r is more probable than s, then, given p, q is more
probable than s.
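Axioms 1 and 2 can be checked mechanically for any proposed ordering on a finite set of propositions. The sketch below is an illustration only: the function names, the labels q, r, s, and both example relations are assumptions introduced here and are not part of the theory. Since 'less probable' and 'equally probable' are defined from 'more probable', Axiom 1 is taken here to require that 'more probable than' never holds in both directions for the same pair.

```python
from itertools import product

def satisfies_axiom_1(items, more):
    """'More' and 'less' must never both hold for the same pair on the same data."""
    return all(not (more(a, b) and more(b, a)) for a, b in product(items, repeat=2))

def satisfies_axiom_2(items, more):
    """Transitivity: a over b and b over c must imply a over c."""
    return all(
        more(a, c)
        for a, b, c in product(items, repeat=3)
        if more(a, b) and more(b, c)
    )

items = ["q", "r", "s"]

# Hypothetical ordering induced by a ranking (assumed for illustration).
rank = {"q": 3, "r": 2, "s": 1}

def by_rank(a, b):
    return rank[a] > rank[b]

# Hypothetical cyclic relation: q over r, r over s, s over q.
cycle = {("q", "r"), ("r", "s"), ("s", "q")}

def by_cycle(a, b):
    return (a, b) in cycle

ranked_ok = satisfies_axiom_1(items, by_rank) and satisfies_axiom_2(items, by_rank)
cycle_asymmetric = satisfies_axiom_1(items, by_cycle)
cycle_transitive = satisfies_axiom_2(items, by_cycle)
print(ranked_ok, cycle_asymmetric, cycle_transitive)
```

The ranked relation passes both checks. The cyclic relation passes the first and fails the second, which is the kind of ordering that Axiom 2 excludes.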
The extreme degrees of probability are certainty and impossibility.
These lead to
AXIOM 3. All propositions deducible from a proposition p have the same
probability on data p; and all propositions inconsistent with p have the
same probability on data p.
We need this axiom to ensure consistency with deductive logic in
cases that can be treated by both methods. We are trying to construct
an extended logic, of which deductive logic will be a part, not to intro-
duce an ambiguity in cases where deductive logic already gives definite
answers. I shall often speak of 'certainty on data p' and 'impossibility
on data p'. These do not refer to the mental certainty of any particular
individual, but to the relations of deductive logic expressed by 'q is
deducible from p' and 'not-q is deducible from p'. In G. E. Moore's
terminology, we may read the former as 'p entails q'. In consequence
of our rule 5, we shall never have 'p entails q' if p is merely the general
rules of the theory and q is an empirical proposition.
Actually I shall take 'entails' in a slightly extended sense; in some
usages it would be held that p is not deducible from p, or from p and q
together. Some shortening of the writing is achieved if we agree to
define 'p entails q' as meaning either 'q is deducible from p' or 'q is
identical with p' or 'q is identical with some proposition asserted in p'.
This avoids the need for special attention to trivial cases.
We also need the following axiom,
AXIOM 4. If, given p, q and q' cannot both be true, and if, given p,
r and r' cannot both be true, and if, given p, q and r are equally probable
and q' and r' are equally probable, then, given p, 'q or q'' and 'r or r'' are
equally probable.
At this stage it is desirable for clearness to introduce the following
notations and terminologies, mainly from Principia Mathematica.
~p means 'not-p', that is, p is false.
p.q means 'p and q'; that is, p and q are both true.
p v q means 'p or q', that is, at least one of p and q is true.
These notations may be combined, dots being used as brackets. Thus
~:p.q means 'p and q is not true', that is, at least one of p and q
is false, which is equivalent to ~p v ~q. But
~p.q means 'p is false and q is true', which is not the same
proposition. The rule is that a set of dots represents a bracket, the
completion of the bracket being either the next equal set of dots or the
end of the expression. Dots may be omitted in joint assertions where
no ambiguity can arise.
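The effect of the bracketing can be verified by enumerating truth values. The sketch below is an illustration with variable names chosen here. It compares the negation of the joint assertion 'p and q' with the disjunction 'not-p or not-q', and with the different proposition 'not-p, and q'.

```python
from itertools import product

# All four truth assignments for the pair (p, q).
rows = list(product([False, True], repeat=2))

# Negation applied to the whole joint assertion: not (p and q).
neg_of_conjunction = [not (p and q) for p, q in rows]

# Disjunction of the two negations: (not p) or (not q).
disj_of_negations = [(not p) or (not q) for p, q in rows]

# Negation applied to p alone: (not p) and q.
neg_p_and_q = [(not p) and q for p, q in rows]

same_as_disjunction = neg_of_conjunction == disj_of_negations
same_as_neg_p_and_q = neg_of_conjunction == neg_p_and_q
print(same_as_disjunction, same_as_neg_p_and_q)
```

The first comparison holds on every row and the second does not, so where the bracket ends changes the proposition asserted.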
The joint assertion or conjunction of p and q is the proposition p.q,
and the joint assertion of p, q, r, s,.. is the proposition p.q.r.s...; that
is, that p, q, r, s,... are all true. The joint assertion is also called the
logical product.
The disjunction of p and q is the proposition p v q; the disjunction
of p, q, r, s is the proposition p v q v r v s; that is, at least one of p, q, r, s
is true. The disjunction is also called the logical sum.
A set of propositions q_i (i = 1 to n) are said to be exclusive on data p
if not more than one of them can be true on data p; that is, if p entails
all the disjunctions ~q_i v ~q_k when i ≠ k.
A set of propositions q_i are said to be exhaustive on data p if at least
one of them must be true on data p; that is, if p entails the disjunction
q_1 v q_2 v ... v q_n.
It is possible for a set of alternatives to be both exclusive and
exhaustive. For instance, a finite class must have some number n;
then the propositions n = 0, 1, 2, 3,... must include one true proposi-
tion, but cannot contain more than one.
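The two definitions can be tested on a small finite model. In the sketch below the data p are represented, as an assumption made only for illustration, by the list of cases that p leaves open: a class whose number n is at most 3. Each proposition is a function from a case to a truth value. The names `exclusive`, `exhaustive`, and the three example sets are hypothetical.

```python
def exclusive(props, cases):
    """Not more than one proposition is true in any case allowed by the data."""
    return all(sum(1 for q in props if q(c)) <= 1 for c in cases)

def exhaustive(props, cases):
    """At least one proposition is true in every case allowed by the data."""
    return all(any(q(c) for q in props) for c in cases)

# Assumed data p: the class has some number n with 0 <= n <= 3.
cases = [0, 1, 2, 3]

# The propositions n = 0, n = 1, n = 2, n = 3.
exact_counts = [lambda n, k=k: n == k for k in range(4)]

# Overlapping alternatives: 'n <= 1' and 'n >= 1' are both true when n = 1.
overlapping = [lambda n: n <= 1, lambda n: n >= 1]

# Incomplete alternatives: 'n = 0' and 'n = 1' leave n = 2 and n = 3 uncovered.
incomplete = [lambda n: n == 0, lambda n: n == 1]

both = exclusive(exact_counts, cases) and exhaustive(exact_counts, cases)
print(both)
print(exclusive(overlapping, cases), exhaustive(overlapping, cases))
print(exclusive(incomplete, cases), exhaustive(incomplete, cases))
```

The exact counts are both exclusive and exhaustive, the overlapping pair is exhaustive only, and the incomplete pair is exclusive only.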
Then Axiom 4 will read:
If q and q' are exclusive, and r and r' are exclusive, on data p, and if,
given p, q and r are equally probable and q' and r' are equally probable,
then, given p, q v q' and r v r' are equally probable.
An immediate extension, obtained by successive applications of this
axiom, is:
THEOREM 1. If q_1, q_2,..., q_n are exclusive, and r_1, r_2,..., r_n are exclusive,
on data p, and if, given p, the propositions q_1 and r_1, q_2 and r_2,..., q_n and r_n
are equally probable in pairs, then, given p, q_1 v q_2 ... v q_n and r_1 v r_2 ... v r_n
are equally probable.
It will be noticed that we have not yet assumed that probabilities
can be expressed by numbers. I do not think that the introduction of
numbers is strictly necessary to the further development; but it has the
enormous advantage that it permits us to use mathematical technique.
Without it, while we might obtain a set of propositions that would have
the same meanings, their expression would be much more cumbrous.
The actual introduction of numbers is done by conventions, the nature
of which is essentially linguistic.
CONVENTION 1. We assign the larger number on given data to the more
probable proposition (and therefore equal numbers to equally probable
propositions).
CONVENTION 2. If, given p, q and q' are exclusive, then the number
assigned on data p to 'q or q'' is the sum of those assigned to q and to q'.
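A small numerical sketch may help to show in what sense these rules are conventions. The numbers below are hypothetical, and the alternative rule (relabel each number x as 2**x and multiply instead of adding) is an assumption chosen here for illustration. The sketch checks that both labellings arrange every disjunction of three exclusive alternatives in the same order.

```python
from itertools import combinations

# Hypothetical numbers, in arbitrary units, assigned on data p to three
# exclusive alternatives under Conventions 1 and 2.
additive = {"q1": 2, "q2": 4, "q3": 1}

# Assumed alternative labelling: y = 2**x, with the number for a disjunction
# of exclusive alternatives formed by multiplication instead of addition.
multiplicative = {name: 2 ** x for name, x in additive.items()}

def disjunctions(names):
    """Every non-empty disjunction of the exclusive alternatives."""
    return [c for k in range(1, len(names) + 1) for c in combinations(names, k)]

def add_rule(parts):
    """Convention 2: add the numbers of the exclusive alternatives."""
    return sum(additive[n] for n in parts)

def mult_rule(parts):
    """Alternative rule: multiply the relabelled numbers."""
    result = 1
    for n in parts:
        result *= multiplicative[n]
    return result

events = disjunctions(sorted(additive))
order_by_sum = sorted(events, key=add_rule)
order_by_product = sorted(events, key=mult_rule)
print(order_by_sum == order_by_product)
```

Because 2**x increases with x and turns sums into products, the two rules agree on which disjunction is the more probable. Only the labels differ, and the additive rule is the more convenient one to calculate with.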
It is important to notice the meaning of a convention. It is neither
an axiom nor a theorem. It is merely a rule introduced for convenience,
and it has the property that other rules would give the same results.
W. E. Johnson remarks that a convention is properly expressed in the
imperative mood. An instance is the use of rectangular or polar coordi-
nates in Euclidean geometry. The distance between two points is the
fundamental idea, and all propositions can be stated as relations between
distances. Any proposition in rectangular coordinates can be
translated into polar coordinates, or vice versa, and both expressions
would give the same results if translated into propositions about
distances. It is purely a matter of convenience which we choose in a
particular case. The choice of a unit is always a convention. But care
is needed in introducing conventions; some postulate of consistency
about the fundamental ideas is liable to be hidden. It is quite easy
to define an equilateral right-angled plane triangle, but that does not
make such a triangle possible. In this case Convention 1 specifies what
order the numbers are to be arranged in. Numbers can be arranged in
an order, and so can probabilities, by Axioms 1 and 2. The relation
'greater than' between numbers is transitive, and so is the relation
'more probable than' between propositions on the same data. There-
fore it is possible to assign numbers by Convention 1, so that the order
of increasing degrees of belief will be the order of increasing number.
So far we need no new axiom; but we shall need the axiom that there
are enough numbers for our purpose.
AXIOM 5. The set of possible probabilities on given data, ordered in
terms of the relation 'more probable than', can be put into one-one
correspondence with a set of real numbers in increasing order.
The need for such an axiom was pointed out by an American reviewer
of Scientific Inference. He remarked that if we take a series of number
pairs u_n = (a_n, b_n) and make it a rule that u_m is to be placed after
u_n if a_m > a_n, but that if a_m = a_n, u_m is to be placed after u_n if
b_m > b_n, then the axiom that the u_n can be placed in an order will
hold; but if a_n and b_n can each take a continuous series of values it
will be impossible to establish a one-one correspondence between the
pairs and a single continuous series without deranging the order.
Convention 2 and Axiom 4 will imply that, if we have two pairs of
exclusive propositions with the same probabilities on the same data,
the numbers chosen to correspond to their disjunctions will be the
same. The extension to disjunctions of several propositions is justified
by Theorem 1. We shall always, on given data, associate the
same numbers with propositions entailed or contradicted by the data;
this is justified by Axiom 3. The assessment of numbers in the way
suggested is therefore consistent with our axioms. We can now introduce
the formal notation
    P(q | p)
for the number associated with the probability of the proposition q on
data p. It may be read 'the probability of q given p' provided that we
remember that the number is not in fact the probability, but merely
a representation of it in terms of a pair of conventions. The probability,
strictly, is the reasonable degree of confidence and is not identical with
the number used to express it. The relation is that between Mr. Smith
and his name 'Mr. Smith'. A sentence containing the words 'Mr. Smith'
may correspond to, and identify, a fact about Mr. Smith. But Mr.
Smith himself does not occur in the sentence.† In this notation, the
properties of numbers will now replace Axiom 1; Axiom 2 is restated
'if P(q | p) > P(r | p) and P(r | p) > P(s | p), then P(q | p) > P(s | p)',
which is a mere mathematical implication, since all the expressions are
numbers. Axiom 3 will require us to decide what numbers to associate
with certainty and impossibility. We have
THEOREM 2. If p is consistent with the general rules, and p entails ~q,
then P(q | p) = 0.
For let q and r be any two propositions, both impossible on data p.
Then (Ax. 3) if a is the number associated with impossibility on data p,
    P(q | p) = P(r | p) = P(q v r | p) = a,
since q, r, and q v r are all impossible propositions on data p and must
be associated with the same number. But qr is impossible on data p;
hence, by definition, q and r are exclusive on data p, and (Conv. 2)
    P(q v r | p) = P(q | p) + P(r | p) = 2a,
whence a = 0. Therefore all probability numbers are ≥ 0, by
Convention 1.
† Cf. R. Carnap, The Logical Syntax of Language.
As we have not assumed the comparability of probabilities on different
data, attention is needed to the possible forms that can be
substituted for q and r, given p. If p is a purely a priori proposition,
it can never entail an empirical one. Hence, if p stands for our general
rules, the admissible values for q and r must be false a priori
propositions, such as 2 = 1 and 3 = 2. Since such propositions can be stated
the theorem follows. If p is empirical, then ~p is an admissible value
for both q and r. Or, since we are maintaining the same general
principles throughout, we may remember that in practice if p is empirical
and we denote our general principles by h, then any set of data that
actually occurs and includes an empirical proposition will be of the
form ph. Then for q and r we may still substitute false a priori
propositions, which will be impossible on data ph. Hence it is always
possible to assign q and r so as to satisfy the conditions stated in the
proof.
CONVENTION 3. If p entails q, then P(q | p) = 1.

This is the rule generally adopted, but there are cases where we wish
to express ignorance over an infinite range of values of a quantity, and
it may be convenient to express certainty that the quantity lies in that
range by ∞, in order to keep ratios for finite ranges determinate. None
of our axioms so far has stated that we must always express certainty
by the same number on different data, merely that we must on the
same data; but with this possible exception it is convenient to do so.
The converse of Theorem 2 would be: 'If P(q | p) = 0, then p entails
~q.' This is false if we use Convention 3. For instance, a continuous
variable may be equally likely to have any value between 0 and 1.
Then the probability that it is exactly 1/2 is 0, but 1/2 is not an
impossible value. There would be no point in making certainty correspond to
infinity in such a case, for it would make the probability infinite for
any finite range. It turns out that we have no occasion to use the
converse of Theorem 2.
AXIOM 6. If pq entails r, then P(qr | p) = P(q | p).
In other words, given p throughout, we may consider whether q is
false or true. If q is false, then qr is false. If q is true, then, since pq
entails r, r is also true and therefore qr is true. Similarly, if qr is true
it entails q, and if qr is false q must be false on data p, since if it was
true qr would be true. Thus it is impossible, given p, that either q or
qr should be true without the other. This is an extension of Axiom 3
and is necessary to enable us to take over a further set of rules
suggested by deductive logic, and to say that all equivalent propositions
have the same probability on given data.
THEOREM 3. If q and r are equivalent in the sense that each entails the
other, then each entails qr, and the probabilities of q and r on any data must
be equal. Similarly, if pq entails r, and pr entails q, P(q | p) = P(r | p),
since both are equal to P(qr | p).
An immediate corollary is
THEOREM 4. P(q | p) = P(qr | p) + P(q.~r | p).
For qr and q.~r are exclusive, and the sum of their probabilities
on any data is the probability of qr: v: q.~r (Conv. 2). But q entails
this proposition, and also, if either q and r are both true or q is true
and r false, q is true in any case. Hence the propositions q and
qr: v: q.~r are equivalent, and the theorem follows by Theorem 3.
It follows further that P(q | p) ≥ P(qr | p), since P(q.~r | p) cannot
be negative. Also, if we write q v r for q, we have
    P(q v r | p) = P(q v r: r | p) + P(q v r: ~r | p)   (Th. 4)
and q v r: r is equivalent to r, and q v r: ~r to q.~r. Hence
    P(q v r | p) ≥ P(r | p).
THEOREM 5. If q and r are two propositions, not necessarily exclusive
on data p,
    P(q | p) + P(r | p) = P(q v r | p) + P(qr | p).
For the propositions qr, q.~r, ~q.r are exclusive; and
q is equivalent to the disjunction of qr and q.~r, and r to the
disjunction of qr and ~q.r. Hence the left side of the equation is equal to
    2P(qr | p) + P(q.~r | p) + P(~q.r | p)   (Th. 4).
Also q v r is equivalent to the disjunction of qr, q.~r, and ~q.r.
Hence
    P(q v r | p) = P(qr | p) + P(q.~r | p) + P(~q.r | p)   (Th. 4),
whence the theorem follows.
It follows that, whether q and r are exclusive or not,
    P(q v r | p) ≤ P(q | p) + P(r | p),
since P(qr | p) cannot be negative. Theorems 4 and 5 together express
upper and lower bounds to the possible values of P(q v r | p) irrespective
of exclusiveness. It cannot be less than either P(q | p) or P(r | p); it
cannot be more than P(q | p) + P(r | p).
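Theorems 4 and 5 and the resulting bounds can be verified on a small finite model. This is a sketch only: the four weights are arbitrary illustrative values (not from the text) attached to the exclusive cases qr, q.~r, ~q.r, ~q.~r, and `prob` is a hypothetical helper that applies Convention 2 by summation.

```python
from fractions import Fraction

# Illustrative weights for the four exclusive cases, keyed by the truth
# values of (q, r). Certainty is expressed by 1 (Convention 3).
weights = {
    (True, True): Fraction(1, 10),    # qr
    (True, False): Fraction(3, 10),   # q.~r
    (False, True): Fraction(2, 10),   # ~q.r
    (False, False): Fraction(4, 10),  # ~q.~r
}


def prob(event):
    """Add the weights of the exclusive cases in which event(q, r) is true."""
    return sum(w for (q, r), w in weights.items() if event(q, r))


p_q = prob(lambda q, r: q)
p_r = prob(lambda q, r: r)
p_qr = prob(lambda q, r: q and r)
p_q_not_r = prob(lambda q, r: q and not r)
p_q_or_r = prob(lambda q, r: q or r)

theorem4_holds = p_q == p_qr + p_q_not_r
theorem5_holds = p_q + p_r == p_q_or_r + p_qr
bounds_hold = max(p_q, p_r) <= p_q_or_r <= p_q + p_r
print(theorem4_holds, theorem5_holds, bounds_hold)
```

Exact fractions are used so that the equalities are tested without rounding error.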
THEOREM 6. If q_1, q_2,... are a set of equally probable and exclusive
alternatives on data p, and if Q and R are disjunctions of two subsets of
these alternatives, of numbers m and n, then P(Q | p)/P(R | p) = m/n.
For if a is any one of the equal numbers P(q_1 | p), P(q_2 | p),... we
have, by Convention 2,
    P(Q | p) = ma;  P(R | p) = na,
whence the theorem follows.
THEOREM 7. In the conditions of Theorem 6, if q_1, q_2,..., q_n are
exhaustive on data p, and R denotes their disjunction, then R is entailed
by p and P(R | p) = 1 (Conv. 3).
It follows that P(Q | p) = m/n.
This is virtually Laplace's rule, stated at the opening of the Théorie
Analytique. R entails itself and therefore is a possible value of p; hence
    P(Q | R) = m/n.
This may be read: given that a set of alternatives are equally probable,
exclusive, and exhaustive, the probability that some one of any subset is
true is the ratio of the number in that subset to the whole number of possible
cases. This form depends on Convention 3, and must be used only in
cases where the convention is adopted. Theorem 6, however, is inde-
pendent of Convention 3. If we chose to express certainty on data p
by 2 instead of 1, the only change would be that all numbers associated
with probabilities on data p would be multiplied by 2, and Theorem 6
would still hold. Theorem 6 is also consistent with the possibility that
the number of alternatives is infinite, since it requires only that Q and
R shall be finite subsets.
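A minimal sketch of Theorems 6 and 7, and of the remark that expressing certainty by 2 instead of 1 leaves the ratio of Theorem 6 unchanged. The die example and the function name `subset_probability` are assumptions made for illustration only.

```python
from fractions import Fraction


def subset_probability(m, n_total, certainty=Fraction(1)):
    """Number assigned to a disjunction of m out of n_total equally probable,
    exclusive and exhaustive alternatives, when the number `certainty`
    is used to express certainty on the data."""
    each = certainty / n_total   # common value of each single alternative
    return m * each              # Convention 2: add over the m exclusive members


# Illustrative case: a die, Q = "an even face" (3 faces), R = "a face below 3" (2 faces).
p_Q = subset_probability(3, 6)
p_R = subset_probability(2, 6)
ratio_unit = p_Q / p_R

# Certainty expressed by 2: every number doubles, the ratio does not change.
p_Q_two = subset_probability(3, 6, Fraction(2))
p_R_two = subset_probability(2, 6, Fraction(2))
ratio_two = p_Q_two / p_R_two

print(p_Q, ratio_unit, p_Q_two, ratio_two)
```

With certainty expressed by 1 the value for Q is the ratio of favourable to possible cases, as in Theorem 7; the ratio 3/2 is the statement of Theorem 6 and does not depend on that choice.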
Theorems 6 and 7 tell us how to assess the ratios of probabilities,
and, subject to Convention 3, the actual values, provided that the
propositions considered can be expressed as finite subsets of equally
probable, exclusive, and, for Theorem 7, exhaustive alternatives on the
data. Such assessments will always be rational fractions, and may be
called R-probabilities. Now a statement that m and n cannot exceed
some given value would be an empirical proposition asserted a priori,
and would be inadmissible on rule 5. Hence the R-probabilities possible
within the formal scheme form a set of the ordinal type of the rational
fractions.
If all probabilities were R-probabilities there would be no need
for Axiom 5, and the converse of Theorem 2 could hold. But many
propositions that we shall have to consider are of the form that a
magnitude, capable of a continuous range of values, lies within a
specified part of that range, and we may be unable to express them in the
required form. Thus there is no need for all probabilities to be
R-probabilities. However, if a proposition is not expressible in the
required form, it will still be associated with a reasonable degree of
belief by Axiom 1, and this, by Axiom 2, will separate the degrees for
R-probabilities into two segments, according to the relations 'more
probable than' and 'less probable than'. The corresponding numbers,
the R-probabilities themselves, will be separated by a unique real
number, by Axiom 5 and an application of Dedekind's section. We
take the numerical assessment of the probability of a proposition not
expressible in the form required by Theorems 6 and 7 to be this number.
Hence we have
THEOREM 8. Any probability can be expressed by a real number.
If x is a variable capable of a continuous set of values, we may
consider the probability on data p that x is less than x_0, say
    P(x < x_0 | p) = f(x_0).
If f(x_0) is differentiable we shall then be able to write
    P(x_0 < x < x_0 + dx_0 | p) = f'(x_0) dx_0 + o(dx_0).
We shall usually write this briefly P(dx | p) = f'(x) dx, dx on the left
meaning the proposition that x lies in a particular range dx. f'(x) is
called the probability density.
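The relation between f and the probability density can be seen numerically. The sketch below assumes, purely for illustration, that P(x < x_0 | p) = x_0^2 for x between 0 and 1; this choice of f is not from the text.

```python
def f(x0):
    """Illustrative P(x < x0 | p) for 0 <= x0 <= 1 (an assumed form)."""
    return x0 ** 2


def density(x0):
    """The derivative f'(x0) of the illustrative f above."""
    return 2 * x0


x0, dx0 = 0.5, 1e-3
exact = f(x0 + dx0) - f(x0)     # P(x0 < x < x0 + dx0 | p)
approx = density(x0) * dx0      # f'(x0) dx0
remainder = exact - approx      # the o(dx0) term; for this f it equals dx0**2

print(exact, approx, remainder)
```

The remainder shrinks faster than dx_0 itself, which is what the term o(dx_0) asserts.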
THEOREM 9. If Q is the disjunction of a set of exclusive alternatives
on data p, and if R and S are subsets of Q (possibly overlapping) and if
the alternatives in Q are all equally probable on data p and also on data
Rp, then
    P(RS | p) = P(R | p)P(S | Rp)/P(R | Rp).
For suppose that the propositions contained in Q are of number n,
that the subset R contains m of them, and that the part common to
R and S contains l of them. Put
    P(Q | p) = a.
Then, by Theorem 6,
    P(R | p) = ma/n;  P(RS | p) = la/n.
P(S | Rp) is the probability that the true proposition is in the S subset
given that it is in the R subset and p, and therefore is equal to
(l/m)P(R | Rp). Also RSp entails R, hence
    P(S | Rp) = P(SR | Rp)   (Ax. 6)
and
    P(RS | p) = (l/m)(ma/n) = P(R | p)P(S | Rp)/P(R | Rp).
This is the first proposition that we have had that involves probabilities
on different data, two of the factors being on data p and two on data Rp.
Q itself does not appear in it and is therefore irrelevant. It is introduced
into the theorem merely to avoid the use of Convention 3. It might be
identical with any finite set that includes both R and S.
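The counting argument behind Theorem 9 can be reproduced directly. In this sketch the sets, the value a = P(Q | p) and the number c used for certainty on data Rp are all arbitrary illustrative choices; c is deliberately not 1, to show that Convention 3 is not needed.

```python
from fractions import Fraction

n = 10                          # number of alternatives in Q (illustrative)
R = {0, 1, 2, 3, 4, 5}          # m = 6 of them
S = {4, 5, 6, 7}                # overlaps R in l = 2 of them
a = Fraction(7, 10)             # assumed value of P(Q | p)
c = Fraction(3)                 # assumed number for certainty on data Rp


def p_on_p(subset):
    """P(subset | p): equal shares of a over the n alternatives (Theorem 6)."""
    return a * Fraction(len(subset), n)


def p_on_Rp(subset):
    """P(subset | Rp): equal shares of c over the members of R."""
    return c * Fraction(len(subset & R), len(R))


lhs = p_on_p(R & S)                            # P(RS | p) = la/n
rhs = p_on_p(R) * p_on_Rp(S) / p_on_Rp(R)      # P(R | p) P(S | Rp) / P(R | Rp)
print(lhs, rhs)
```

Both sides come to la/n, whatever values are chosen for a and c.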
The proof has assumed that the alternatives considered are equally
probable both on data p and also on data Rp. It has not been found
possible to prove the theorem without using this condition. But it is
necessary to further developments of the theory that we shall have
some way of relating probabilities on different data, and Theorem 9
suggests the simplest general rule that they can follow if there is one at
all. We therefore take the more general form as an axiom, as follows.
AXIOM 7. For any propositions p, q, r,
    P(qr | p) = P(q | p)P(r | qp)/P(q | qp).
If we use Convention 3 on data qp (not necessarily on data p),
P(q | qp) = 1, and we have W. E. Johnson's form of the product rule,
which can be read: the probability of the joint assertion of two propositions
on any data p is the product of the probability of one of them on data p
and that of the other on the first and p.
We notice that the probability of the logical sum follows the addition
rule (with a caveat), that of the logical product the product rule. This
parallel between the Principia and probability language is lost when the
joint assertion is called the sum, as has occurred in some writings.
In a sense a probability can be regarded as a logical quotient, since in the
conditions of Theorem 7 the probability of Q given R is the probability
of Q given p divided by that of R given p. This has been recognized
in the history of the notation, which Keynes† traces to H. McColl.
McColl wrote the probability of a, relative to the a priori premiss h,
as a/ε, and relative to bh as a/b. This was modified by W. E. Johnson
to a/h and a/bh, and he was followed by Keynes, Broad, and Ramsey.
Wrinch and I found that this notation was inconvenient when the
solidus may have to be used in its usual mathematical sense in the
same equation, and introduced P(p : q), which I modified further to
P(p | q) in Scientific Inference because the colon was beginning to be
needed in the Principia sense of a bracket.
† Treatise on Probability, 1921, p. 155. This book is full of interesting historical data
and contains many important critical remarks. It is not very successful on the
constructive side, since an unwillingness to generalize the axioms has prevented Keynes
from obtaining many important results.
The sum of two classes α and β, in Principia, is the class γ such that
every member of α or of β is in γ, and conversely. The product class of
α and β is the class δ of members common to α and β. Thus Theorem 5
has a simple analogy with the numbers of members of the classes α and
β, γ and δ. The multiplicative class of α and β is the class of all pairs, one
from α and one from β; it is this class, not the product class, that gives
an interpretation to the product of the numbers of members of α and β.
The extension of the product rule from Theorem 9 to Axiom 7 has
been taken as axiomatic. This is an application of a principle repeatedly
adopted in Principia Mathematica. If there is a choice between possible
axioms, we take the one that enables most consequences to be drawn.
Such a generalization is not inductive. What we are doing is to seek for
a set of axioms that will permit the construction of a theory of induction,
the axioms themselves being primitive postulates. The choice is limited
by rule 6; the axioms must be reduced to the minimum number, and
the check on whether we make them too general will be provided by
rule 2, which will reject a theory if it is found to lead to contradictory
consequences. Consider then whether the rule
    P(qr | p) = P(q | p)P(r | qp)
can hold in general. Suppose first that p entails ~(qr); then either p
entails ~q, or p and q together entail ~r. In either case both sides of
the equation vanish and the rule holds. Secondly, suppose that p entails
qr; then p entails q and pq entails r. Thus both sides of the equation
are 1. Similarly, we have consistency in the converse cases where p
entails ~q, or pq entails ~r, or p entails q and pq entails r. This
covers the extreme cases.
If there are any cases where the rule is untrue, we shall have to say
that in such cases P(qr | p) depends on something besides P(q | p) and
P(r | qp), and a new hypothesis would be needed to deal with such cases.
By rule 6, we must not introduce any such hypothesis unless need for
it is definitely shown. The product rule may therefore be taken as
general unless it can be shown to lead to contradictions. We shall see
(p. 35) that consistency can be proved in a wide class of cases.
1.21. The product rule is often misread as follows: the joint proba-
bility of two propositions is the product of their probabilities separately.
This is meaningless as it stands because the data relative to which the
probabilities are considered are not mentioned. In actual application,
the rule so stated is liable to become: the joint probability of two pro-
positions on given data is the product of their separate probabilities
on those data. This is false. We may see this by considering extreme
cases. The correct statement of the rule may be written (using
Convention 3 on data pr)
    P(pq | r) = P(p | r)P(q | pr)                    (1)
and the other one as
    P(pq | r) = P(p | r)P(q | r).                    (2)
If p cannot be true given r, then p and q cannot both be true, and both
(1) and (2) reduce to 0 = 0. If p is certain given r, both reduce to
    P(q | r) = P(q | r),                             (3)
since in (1) the inclusion of p in the data tells us nothing about q that
is not already told us by r. If q is impossible given r, both reduce to
0 = 0. If q is certain given r, both reduce to
    P(p | r) = P(p | r).                             (4)
So far everything is satisfactory. But suppose that q is impossible
given pr. Then it is impossible for pq to be true given r, and (1) reduces
correctly to 0 = 0. But (2) reduces to
    0 = P(p | r)P(q | r),
which is false; it is perfectly possible for both p and q to be consistent
with r and pq to be inconsistent with r. Consider the following. Let
r consist of the following information: in a given population all the
members have eyes of the same colour; half of them have blue eyes and
half brown; one member is to be chosen, and any member is equally
likely to be selected. p is the proposition that his left eye is blue, q the
proposition that his right eye is brown. What is the probability, on
data r, that his left eye is blue and his right brown? P(p | r) = 1/2,
P(q | r) = 1/2, and according to (2) P(pq | r) = 1/4. But according
to (1) the probability that his right eye is brown must be assessed
subject both to the information that his eyes are of the same colour
and that his left eye is blue, and this probability is 0. Thus (1) gives
P(pq | r) = 0. Clearly the latter result is right; further applications of
the former, considering also ~p (left eye brown) and ~q (right eye
blue), lead to the astonishing result that on data including the
proposition that all members have two eyes of the same colour, it is as
likely as not that any member will have eyes of different colours.
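The eye-colour example can be worked by direct enumeration. The sketch below represents the population described by r as two equally likely kinds of member, each listed as (left eye, right eye); the helper `prob` and the representation are illustrative assumptions.

```python
from fractions import Fraction

# Data r: each member has both eyes of one colour; half blue, half brown;
# every member is equally likely to be chosen.
population = [("blue", "blue"), ("brown", "brown")]


def prob(event, given=lambda member: True):
    """Proportion of members satisfying `event` among those satisfying `given`."""
    pool = [member for member in population if given(member)]
    return Fraction(sum(1 for member in pool if event(member)), len(pool))


def p(member):
    """Left eye is blue."""
    return member[0] == "blue"


def q(member):
    """Right eye is brown."""
    return member[1] == "brown"


rule1 = prob(p) * prob(q, given=p)              # P(p | r) P(q | pr)
rule2 = prob(p) * prob(q)                       # P(p | r) P(q | r)
direct = prob(lambda member: p(member) and q(member))

print(rule1, rule2, direct)
```

Rule (1) agrees with the direct count, which is 0; rule (2) gives 1/4 because it ignores what p tells us about q.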
This trivial instance is enough to dispose of (2); but (2) has been
widely applied in cases where it gives wrong results, and sometimes
seriously wrong ones. The Boltzmann H-theorem of the kinetic theory
of gases rests on a fallacious application of it, since it considers an
assembly of molecules, possibly with differences of density from place
to place, and gives the joint probability that two molecules will be in
adjoining regions as the product of the separate probabilities that they
will be. If there are differences of density, and one molecule is in a
region chosen at random, that is some evidence that the region is one of
high density; then the probability that a second is in the region, given
that the first is, is somewhat higher than it would be in the absence of
information about the first. Similar considerations apply to
Boltzmann's treatment of the velocities. In this case the mistake has not
prevented the right result from being obtained, though it does not
follow from the hypotheses.
Nevertheless there are many cases where (2) is true. If
    P(q | pr) = P(q | r)
we say that p is irrelevant to q, given r.
1.22. THEOREM 10. If q_1, q_2,..., q_n are a set of alternatives, H the
information already available, and p some additional information, then
the ratio
    P(q_r | pH)P(q_r | q_r H) / {P(q_r | H)P(p | q_r H)}
is the same for all the q_r.
By Axiom 7
    P(p q_r | H) = P(p | H)P(q_r | pH)/P(p | pH)             (1)
                 = P(q_r | H)P(p | q_r H)/P(q_r | q_r H),     (2)
whence
    P(q_r | pH)P(q_r | q_r H) / {P(q_r | H)P(p | q_r H)}
                 = P(p | pH)/P(p | H),                        (3)
which is independent of q_r.
If we use unity to denote certainty on data q_r H for all the q_r,
(3) becomes
    P(q_r | pH) ∝ P(q_r | H)P(p | q_r H)                      (4)
for variations of q_r. This is the principle of inverse probability, first
given by Bayes in 1763. It is the chief rule involved in the process of
learning from experience. It may also be stated, by means of the product
rule, as follows:
    P(q_r | pH) ∝ P(p q_r | H).                               (5)
This is the form used by Laplace, by way of the statement that the
posterior probabilities of causes are proportional to the probabilities
a priori of obtaining the data by way of those causes. In the form
(4), if p is a description of a set of observations and the q_r a set of
hypotheses, the factor P(q_r | H) may be called the prior probability,
P(q_r | pH) the posterior probability, and P(p | q_r H) the likelihood, a
convenient term introduced by Professor R. A. Fisher, though in his
usage it is sometimes multiplied by a constant factor. It is the
probability of the observations given the original information and the
hypothesis under discussion. The term a priori probability is sometimes
used for the prior probability, but this term has been used in so many
senses that the only solution is to abandon it. To Laplace the a priori
probability meant P(p q_r | H), and sometimes the term has even been
used for the likelihood. A priori has a definite meaning in logic, in
relation to propositions independent of experience, and we frequently
have need to use it in this sense. We may then state the principle of
inverse probability in the form: The posterior probabilities of the hypo-
theses are proportional to the products of the prior probabilities and the
likelihoods. The constant factor will usually be fixed by the condition
that one of the propositions q1 to q,, must be true, and the posterior
probabilities must therefore add up to 1. (If 1 is not suitable to denote
certainty on data pH, no finite set of alternatives will contain a finite
fraction of the probability. The rule covers all cases when there is
anything to say.)
The use of the principle is easily seen in general terms. If there is
originally no ground to believe one of a set of alternatives rather than
another, the prior probabilities are equal. The most probable, when
evidence is available, will then be the one that was most likely to lead
to that evidence. We shall be most ready to accept the hypothesis that
requires the fact that the observations have occurred to be the least
remarkable coincidence. On the other hand, if the data were equally
likely to occur on any of the hypotheses, they tell us nothing new with
respect to their credibility, and we shall retain our previous opinion,
whatever it was. The principle will deal with more complicated
circumstances also; the immediate point is that it does provide us with what we
want, a formal rule in general accordance with common sense, that will
guide us in our use of experience to decide between hypotheses.
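The principle of inverse probability, with the constant factor fixed by requiring the posterior probabilities to add up to 1, can be sketched as follows. The hypothesis labels, priors and likelihoods are invented for illustration, and `posterior` is a hypothetical helper name.

```python
from fractions import Fraction


def posterior(prior, likelihood):
    """Posterior proportional to prior times likelihood, normalized to sum to 1."""
    products = {h: prior[h] * likelihood[h] for h in prior}
    total = sum(products.values())
    return {h: value / total for h, value in products.items()}


# Equal priors: the most probable hypothesis afterwards is the one that
# made the observations most likely.
equal_prior = {"q1": Fraction(1, 3), "q2": Fraction(1, 3), "q3": Fraction(1, 3)}
likelihood = {"q1": Fraction(1, 4), "q2": Fraction(1, 2), "q3": Fraction(3, 4)}
post = posterior(equal_prior, likelihood)

# Equal likelihoods: the data tell us nothing new and the prior is retained.
unequal_prior = {"q1": Fraction(1, 2), "q2": Fraction(1, 3), "q3": Fraction(1, 6)}
flat_likelihood = {"q1": Fraction(1, 5), "q2": Fraction(1, 5), "q3": Fraction(1, 5)}
unchanged = posterior(unequal_prior, flat_likelihood)

print(post)
print(unchanged == unequal_prior)
```

The two cases correspond to the two general remarks in the text: with equal priors the likelihoods decide, and with equal likelihoods the previous opinion stands.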
1.23. We have not yet shown that Convention 2 is a convention and
not a postulate. This must be done by considering other possible
conventions and seeing what results they lead to. Any other convention must
not contradict Axiom 4. For instance, if the number associated with a
probability by our rules is x, we might agree instead to use the number
e^x. Then if x and x' are the present estimates for the propositions q
and q', and for r and r', those for q v q' and r v r' will both be
e^(x+x'), and the consistency rule of Axiom 4 will be satisfied. But instead
of the addition rule for the number to be associated with a disjunction we
shall have a product rule. Every proposition stated in either notation
can be translated into the other; if our present system leads to the result
that a hypothesis is as likely to be true as it is that we should pick a
white ball at random out of a bag containing 99 white ones and 1 black
one, that result will also be obtained on the suggested alternative system.
The fundamental notion is that of the comparison of reasonable degrees
of belief, and so long as all methods place them in the same order the
differences between the methods are conventional. This will be satisfied
if instead of the number x we choose any function of it, f(x), such that
x and f(x) are increasing functions of each other, so that for any value
of one the other is determinate. This is necessary by Convention 1
and Axiom 1, but every form of f(x) will lead to a different rule for
the probability-number of a disjunction if it is to be consistent with
Axiom 4. Hence the addition rule is a convention. It is, of course,
much the easiest convention to use. To abandon Convention 1,
consistently with Axiom 1, would merely arrange all numerical assessments
in the opposite order, and again the same results would be obtained in
translation. The assessment by numbers is simply a choice of the most
convenient language for our purposes.
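A small sketch of the alternative convention, assuming the translation from the present number x to e^x. Under it the number for a disjunction of exclusive propositions is the product of the separate numbers, the order of the numbers is preserved, and translating back with the logarithm recovers the additive value. The numerical values are illustrative only.

```python
import math


def translate(x):
    """Alternative convention (assumed here): use e**x in place of x."""
    return math.exp(x)


def translate_back(y):
    """Inverse translation, recovering the number used by the present rules."""
    return math.log(y)


x_q, x_q_prime = 0.2, 0.3           # present numbers for exclusive q and q'
additive = x_q + x_q_prime          # Convention 2: number for q v q'

# Product rule for the disjunction in the alternative notation.
alt_disjunction = translate(x_q) * translate(x_q_prime)

same_statement = math.isclose(alt_disjunction, translate(additive))
order_kept = (x_q < x_q_prime) == (translate(x_q) < translate(x_q_prime))
recovered = translate_back(alt_disjunction)

print(same_statement, order_kept, recovered)
```

Any increasing function could be used in place of the exponential; each would bring its own rule for disjunctions, which is why the addition rule is a convention rather than a postulate.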

1.3. The original development of the theory, by Bayes,† proceeds
differently. The foregoing account is entirely in terms of rules for the
comparison of reasonable degrees of belief. Bayes, however, takes as
his fundamental idea that of expectation of benefit. This is partly a
matter of what we want, which is a separate problem from that of what
it is reasonable to believe; I have therefore thought it best to proceed
as far as possible in terms of the latter alone. Nevertheless, we have in
practice often to make decisions that involve not only belief but the
desirability of the possible effect of different courses of action. If we
have to give advice to a practical man, either we or he must take these
into account. In deciding on his course of action he must allow both
for the probability that the action chosen will lead to a certain result
and for the value to him of that result if it happens. The fullest
development on these lines is that of F. P. Ramsey.‡ I shall not
attempt to reproduce it, but shall try to indicate some of the principal
points as they occur in his work or in Bayes's. The fundamental idea is
† Phil. Trans. 53, 1763, 376-98. It has only recently been found, by G. A. Barnard,
A. Fletcher, and R. L. Plackett, that Bayes was born in 1702, was Presbyterian Minister
at Tunbridge Wells from before 1731 till 1752, and died in 1761. Previous searches had
yielded hardly any personal information about him. See Biometrika 45, 1958, 293-315.
‡ The Foundations of Mathematics, 1931, pp. 157-211. This essay, like that of Bayes,
was published after the author's death, and suffers from a number of imperfections in
the verbal statement that he might have corrected.
that the values of expectations of benefit can be arranged in an order;
it is legitimate to compare a small probability of a large gain with
a large probability of a small gain. The idea is necessarily more com-
plicated than my Axiom 1; on the other hand, the comparison is one
that a business man often has to make, whether he wants to or not, or
whether it is legitimate or not. The rule simply says that in given
circumstances there is always a best way to act. The comparison of
probabilities follows at once; if the benefits are the same, whichever
of two events happens, then if the values to us of the expectations of
benefit differ it is because the events are not equally likely to happen,
and the larger value is associated with the larger probability. Now we
have to consider the combination of expectations. Here Bayes, I think,
overlooks the distinction between what Laplace calls 'mathematical'
and 'moral' expectation. Bayes speaks in terms of monetary stakes,
and would say that a 1/100 chance of receiving £100 is as valuable as
a certainty of receiving £1. A gambler might say that it is more valuable;
most people would perhaps say that it is less so. Indeed Bayes's
definition of a probability of 1/100 would be that it is the probability
such that the value of the chance of receiving £100 is the same as
the value of a certain £1. Since different values may be compared, the
uniqueness of a probability so defined requires a postulate that the
value of the expectation, the proposition and the data remaining
the same, is proportional to the value to be received if the proposition
is true. This is taken for granted by Bayes, and Ramsey makes an
equivalent statement (foot of p. 179). The difficulty is that the value of
£1 to us depends on how much money we have already. This point was
brought out by Daniel Bernoulli in relation to what was called the
Petersburg Problem. Two players play according to the following rules.
A coin is to be thrown until a head is thrown. If it gives a head on
the first throw, A is to pay B £1; if the first head is on the second throw,
£2; on the third, £4, and so on. What is the fair sum for B to pay A
for his chances? The mathematical expectation in pounds is
    (1/2)·1 + (1/4)·2 + (1/8)·4 + ... = 1/2 + 1/2 + 1/2 + ... = ∞.
Thus on this analysis B should pay A an infinite sum. If we merely
consider a large finite sum, such as £2^20, he will lose if there is a head in
any of the first 20 throws; he will gain considerably if the first head
is on the 21st or a later throw. The question was, is it really worth
anybody's while to risk such a sum, most of which he is practically
certain to lose, for an almost inappreciable chance of an enormous
gain? Even eighteenth-century gamblers seem to have had doubts
about it. Daniel Bernoulli's solution was that the value of £2^20 is very
different according to the amount we have to start with. The value of
a loss of that sum to anybody that has just that amount is not equal
and opposite to the value of a gain of the same sum. He suggested a
law relating the value of a gain to the amount already possessed, which
need not detain us;† but the important point is that he recognized that
expectations of benefit are not necessarily additive. What Laplace calls
'moral expectation' is the value or pleasure to us of an event, its rela-
tion to the monetary value in terms of mathematical expectation may
be rather remote. Bayes wrote after Bernoulli, but before Laplace,
hut he does not mention Bernoulli. Nevertheless, the distinction does
not dispose of the interest of the treatment in terms of expectation of
benefit. Though we cannot regard the benefits of gains of the same
kind as mutually irrelevant, on account of this psychological
phenomenon of satiety, there do seem to be many cases where benefits are
mutually irrelevant. For instance, the pleasures to me of two dinners
on consecutive nights seem to be nearly independent, though those of
two dinners on the same night are definitely not. The pleasure of the
unexpected return of a loan, having a paper accepted for publication,
a swim in the afternoon, and a theatre in the evening do seem
independent. If there are a sufficient number of such benefits (or if there
could be in some possible world, since all we need is consistency), a
scale of the values of benefits can be constructed, which will satisfy the
commutative rule of addition, and then, by Bayes's principles, one of
probability in terms of them. The addition rule will then be a theorem.
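
The Petersburg calculation can be checked numerically. What follows is a minimal Python sketch and not part of the original argument. `truncated_expectation` sums the first terms of the series for the mathematical expectation, and `certainty_equivalent` values the same game for a player whose valuation of money is logarithmic (a gain dx being valued in proportion to dx/x). The function names, the wealth figures and the cut-off of 200 throws are illustrative assumptions.

```python
from fractions import Fraction
import math


def truncated_expectation(max_throws: int) -> Fraction:
    """Mathematical expectation in pounds if play stops after max_throws throws."""
    # A first head on throw n has probability 1/2**n and pays 2**(n-1) pounds,
    # so every term of the series contributes exactly half a pound.
    return sum(Fraction(1, 2**n) * 2 ** (n - 1) for n in range(1, max_throws + 1))


def certainty_equivalent(wealth: float, max_throws: int = 200) -> float:
    """Sure gain with the same expected log-value as the game (assumed log rule)."""
    expected_log = sum(
        0.5**n * math.log(wealth + 2.0 ** (n - 1)) for n in range(1, max_throws + 1)
    )
    return math.exp(expected_log) - wealth


if __name__ == "__main__":
    for throws in (10, 20, 40):
        print(throws, truncated_expectation(throws))
    for wealth in (100.0, 1000.0):
        print(wealth, round(certainty_equivalent(wealth), 2))
```

On these assumptions the truncated expectation grows without limit as more throws are allowed, whereas the logarithmic valuation stays finite and rises only slowly with the player's wealth.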
† It is that the value of a gain dx, when we have x already, is proportional to dx/x;
this is the rule associated in certain biological applications with the names of Weber
and Fechner.

The product rule is treated by Bayes in the following way. We can
write E(a, p | q) for the value of the expectation of receiving a if p is
true, given q, and by definition of P(p | q),
$$E(a, p \mid q) = a\,P(p \mid q).$$
The proportionality of E(a, p | q) to a, given p and q, is a postulate, as
we have already stated. Consider the value of the expectation of
getting a if p and q are both true, given r. This is aP(pq | r). But we
may test p first and then q. If p turns out to be true, our expectation
will be aP(q | pr), since p is now among our data; if untrue, we know
that we shall receive nothing. Now return to the first stage. If p is
true we shall receive an expectation, whose value is aP(q | pr); otherwise
nothing. Hence our initial expectation is aP(q | pr)P(p | r); whence
$$P(pq \mid r) = P(p \mid r)\,P(q \mid pr).$$
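
The two-stage argument can be illustrated with a small numerical case. The sketch below is only an illustration under stated assumptions: the four joint probabilities of p and q on data r, the sum a = 100, and the helper names `probability` and `restrict_to_p` are all hypothetical. It values the expectation directly and then in two stages, and the two values agree.

```python
from fractions import Fraction

# Hypothetical probabilities, on data r, of the four combinations of (p, q).
joint_on_r = {
    (True, True): Fraction(3, 10),
    (True, False): Fraction(2, 10),
    (False, True): Fraction(1, 10),
    (False, False): Fraction(4, 10),
}


def probability(weights, event):
    """Total weight of the outcomes (p, q) for which event(p, q) holds."""
    return sum((w for (p, q), w in weights.items() if event(p, q)), Fraction(0))


def restrict_to_p(weights):
    """Probabilities once p is among the data: keep the p outcomes, renormalise."""
    total = probability(weights, lambda p, q: p)
    return {outcome: w / total for outcome, w in weights.items() if outcome[0]}


a = Fraction(100)  # the sum to be received if p and q are both true

# Direct valuation: a * P(pq | r).
direct = a * probability(joint_on_r, lambda p, q: p and q)

# Staged valuation: test p first; if p holds, we then hold an expectation
# worth a * P(q | pr); if p fails, we receive nothing.
p_given_r = probability(joint_on_r, lambda p, q: p)
q_given_pr = probability(restrict_to_p(joint_on_r), lambda p, q: q)
staged = (a * q_given_pr) * p_given_r

print(direct, staged)
```

Exact fractions are used so that the equality of the two valuations is not blurred by rounding.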
Ramsey's presentation is much more elaborate, but depends on the
same main ideas. The proof of the principle of inverse probability is
simple. The difficulty about the separation of propositions into
disjunctions of equally possible and exclusive alternatives is avoided by
this treatment, but is replaced by difficulties concerning additive
expectations. These are hardly practical ones in either case; no practical man
will refuse to decide on a course of action merely because we are not
quite sure which is the best way to lay the foundations of the theory.
He assumes that the course of action that he actually chooses is the best;
Bayes and Ramsey merely make the less drastic assumption that there
is some course of action that is the best. In my method expectation
would be defined in terms of value and probability; in theirs probability
is defined in terms of values and expectations. The actual propositions
are of course identical.

1.4. At any stage of knowledge it is legitimate to ask about a given
hypothesis that is accepted, 'How do you know?' The answer will
usually rest on some observational data. If we ask further, 'What did
you think of the hypothesis before you had these data?' we may be
told of some less convincing data, but if we go far enough back we shall
always reach a stage where the answer must be: 'I thought the matter
worth considering, but had no opinion about whether it was true.'
What was the probability at this stage? We have the answer already.
If there is no reason to believe one hypothesis rather than another, the
probabilities are equal. In terms of our fundamental notions of the
nature of inductive inference, to say that the probabilities are equal is
a precise way of saying that we have no ground for choosing between the
alternatives. All hypotheses that are sufficiently definitely stated to
give any difference between the probabilities of their consequences
will be compared with the data by the principle of inverse probability,
but if we do not take the prior probabilities equal we are expressing
confidence in one rather than another before the data are available, and
this must be done only from definite reason. To take the prior
probabilities different in the absence of observational reason for doing so would be
an expression of sheer prejudice. The rule that we should then take them
equal is not a statement of any belief about the actual composition
of the world, nor is it an inference from previous experience; it is merely
the formal way of expressing ignorance. It is sometimes referred to as
the Principle of Insufficient Reason (Laplace) or the equal distribu-
tion of ignorance. Bayes, in his great memoir, repeatedly says that
the principle is to be used only in cases where we have no ground
whatever for choosing between the alternatives. It is not a new rule
in the present theory because it is an immediate application of Conven-
tion 1. Much confusion has arisen about it through misunderstanding
and attempts to reinterpret it in terms of frequency definitions. My
contention is that the frequency definitions themselves lead to no
results of the kind that we need until the notion of reasonable degree
of belief is reintroduced, and that since the whole purpose of these
definitions is to avoid this notion they necessarily fail in their object.
When reasonable degree of belief is taken as the fundamental notion
the rule is immediate. We begin by making no assumption that one
alternative is more likely than another and use our data to compare them.
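
As a hedged illustration of this rule, the following Python sketch starts from equal prior probabilities and lets the data compare the alternatives. The three hypotheses (chances of a head equal to 1/4, 1/2 and 3/4), the data (7 heads in 10 throws) and the function name `posterior` are assumptions made only for the example.

```python
from fractions import Fraction
from math import comb


def posterior(priors, likelihoods):
    """Principle of inverse probability: posterior proportional to prior times likelihood."""
    joint = [p * l for p, l in zip(priors, likelihoods)]
    total = sum(joint)
    return [j / total for j in joint]


# Hypothetical alternatives: the chance of a head is 1/4, 1/2 or 3/4.
biases = [Fraction(1, 4), Fraction(1, 2), Fraction(3, 4)]
heads, throws = 7, 10  # hypothetical observations

# Probability of the observed data on each hypothesis (binomial law).
likelihoods = [
    comb(throws, heads) * b**heads * (1 - b) ** (throws - heads) for b in biases
]

# No ground for choosing between the alternatives: equal prior probabilities.
equal_priors = [Fraction(1, len(biases))] * len(biases)
post = posterior(equal_priors, likelihoods)

for b, p in zip(biases, post):
    print(b, p, float(p))
```

With equal priors the posterior probabilities are simply the likelihoods rescaled to add up to 1, so any preference among the alternatives comes from the data alone.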
Suppose that one hypothesis is suggested by one person A, and
another by a dozen B, C,...; does that make any difference? No; but
it means that we have to attend to two questions instead of one. First,
is p or q true? Secondly, is the difference between the suggestions due
to some psychological difference between A and the rest? The mere
voting is not evidence because it is quite possible for a large number
of people to make the same mistake. The second question cannot be
answered until we have answered the first, and the first must be con-
sidered on its merits apart from the second.

1.5. We are now in a position to consider whether we have fulfilled the
conditions that we required in 1.1. I think (1) is satisfied, though
the history of both probability and deductive logic is a warning against
over-confidence that an unstated axiom has not slipped in.
2. Axiom 1 assumes consistency, but this assumption by itself does
not guarantee that a given system is consistent. It makes it possible to
derive theorems by equating probabilities found in different ways, and if
in spite of all efforts probabilities found in different ways were different,
the axiom would make it impossible to accept the situation as
satisfactory. We must not expect too much in the nature of a general proof of
consistency.
In a certain sense many logical systems have been proved consistent.
The proofs depend on a theorem that goes back, I believe, to Aristotle,
that if a system contains two contradictory propositions, then any
proposition whatever in the system can be proved in the system. We
have
$$q \text{ entails } p \lor q,$$
$${\sim}q \,.\, q \text{ entails } {\sim}q \,.\, (p \lor q),$$
$${\sim}q \,.\, (p \lor q) \text{ entails } p.$$
Hence q . ~q entails p. Similarly it entails ~p. Then conversely, if we
can find a proposition in the system that cannot be proved or disproved
in the system, the system is consistent. The discovery of such a proposi-
tion, however, is always difficult and the proof that it cannot be proved
is still more so. Gödel proved that if any logical system that includes
arithmetic contains a proof of its own consistency, another proposition
can be proved that can be seen (looking at the system from outside) to
mean that the system is inconsistent; consequently if arithmetic is
consistent the consistency can neither be proved nor disproved.† Quine
has found an even simpler system, not even containing the notion of
membership of a class, that contains an undecidable proposition.
In fact if we do not make a further proviso a contradiction can be
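derived immediately. From the above argument p . ~p entails q, and
equally entails ~q. In probability language
$$P(q \mid p \,.\, {\sim}p) = 1; \qquad P({\sim}q \mid p \,.\, {\sim}p) = 1.$$
But since q and ~q are exclusive it follows from Axiom 4 that
$$P(q \lor {\sim}q \mid p \,.\, {\sim}p) = 2,$$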
contradicting Convention 3. Hence a necessary condition for consistency
is that probabilities must not be assessed on contradictory data.
We can also find a sufficient condition. We assume that there is a
general datum H common to all propositions that we shall consider, it
may be merely the general principles of the theory. We assume that
Axioms 1, 2, 3, 4, 5, 6 hold for probabilities on data H. We use Con-
ventions 1 and 2 on H and assume that Convention 3 is applicable on H.
Then Theorems 1 to 8 follow if the datum is H.
Now if p is any additional datum such that P(p | H) > 0, and q_i are
a set of propositions, exhaustive on H, whose disjunction is Q, we take
$$P(q_i \mid pH) = \frac{P(pq_i \mid H)}{P(p \mid H)}. \tag{1}$$
This provides a means of calculating probabilities on data other than H.
They are, of course, unique. Then Convention 1 becomes a rule for the
ordering of probabilities in terms of their numerical assessments instead
of conversely.
† An outline of the proof is in Scientific Inference, 1957, pp. 18–20.

Since P(p | H) is independent of q_i, it follows that the P(q_i | pH) satisfy
Axiom 1. Similarly they satisfy Axioms 2, 4, Convention 2 (since if q_i, q_j
are exclusive on H they are also exclusive on pH, and if q_i, q_j are
exclusive on pH, pq_i, pq_j are exclusive on H), and Axiom 5.
For Axiom 6 we have, if pq_i entails r_j,
$$P(q_i r_j \mid pH) = \frac{P(pq_i r_j \mid H)}{P(p \mid H)} = \frac{P(pq_i \mid H)}{P(p \mid H)} = P(q_i \mid pH), \tag{2}$$
using Axiom 6 on data H; hence Axiom 6 holds on data pH.
Next, if pH entails q_i, P(pq_i | H) = P(p | H) by Axiom 6, and therefore
P(q_i | pH) = 1; Convention 3 becomes a theorem, and the first part
of Axiom 3 follows. If pH entails ~q_i, pq_i is impossible given H
and therefore P(pq_i | H) = 0, P(q_i | pH) = 0; hence we have the second
part of Axiom 3.
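For Axiom 7, consider two sets of propositions each exhaustive on H,
say q_i, r_j; then Axiom 7 will become
$$P(q_i r_j \mid pH) = \frac{P(q_i \mid pH)\,P(r_j \mid q_i pH)}{P(q_i \mid q_i pH)}.$$
By (1) this is equivalent to
$$\frac{P(pq_i r_j \mid H)}{P(p \mid H)} = \frac{P(pq_i \mid H)}{P(p \mid H)} \cdot \frac{P(pq_i r_j \mid H)}{P(pq_i \mid H)} \bigg/ \frac{P(pq_i q_i \mid H)}{P(pq_i \mid H)}, \tag{3}$$
which is an identity. Hence Axiom 7 follows.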
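
The construction can be tried on a small finite case. The Python sketch below is an illustration only: the six-outcome sample space, its unequal weights on H, and the names `prob_on_H` and `prob_given` are assumptions introduced for the example. Probabilities on data pH are computed from those on H by dividing P(pq | H) by P(p | H), and a datum of zero probability on H is refused.

```python
from fractions import Fraction

# Hypothetical probabilities on the general datum H for six exclusive outcomes.
weights = {
    1: Fraction(1, 12), 2: Fraction(1, 12), 3: Fraction(1, 6),
    4: Fraction(1, 6), 5: Fraction(1, 4), 6: Fraction(1, 4),
}


def prob_on_H(prop: frozenset) -> Fraction:
    """P(prop | H), a proposition being the set of outcomes that make it true."""
    return sum((weights[w] for w in prop), Fraction(0))


def prob_given(q: frozenset, p: frozenset) -> Fraction:
    """P(q | pH) = P(pq | H) / P(p | H), defined only when P(p | H) > 0."""
    denominator = prob_on_H(p)
    if denominator == 0:
        raise ValueError("datum has zero probability on H")
    return prob_on_H(q & p) / denominator


p = frozenset({2, 4, 6})                                              # additional datum
q_sets = [frozenset({1, 2}), frozenset({3, 4}), frozenset({5, 6})]   # exhaustive on H
q = frozenset({1, 2, 4})
r = frozenset({4, 5, 6})

# Exclusive and exhaustive alternatives still have total probability 1 on pH.
total_on_pH = sum(prob_given(qi, p) for qi in q_sets)

# Product rule on data pH: P(qr | pH) = P(q | pH) P(r | q pH).
lhs = prob_given(q & r, p)
rhs = prob_given(q, p) * prob_given(r, q & p)

print(total_on_pH, lhs, rhs)
```

The check that the datum has positive probability corresponds to the restriction that the construction places on propositions used as data.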
In this presentation we assume that the axioms and conventions on
data H are consistent and provide a means of calculating probabilities
on other data, which is certainly possible, and derive a proof of the axioms
on data including H; it follows that there are assessments of probabilities
that satisfy all the axioms and are consistent, and there can, in particular,
be no inconsistency in the use of the principle of inverse probability.
We have, however, the restriction that all propositions used as data must
have positive probabilities on H; this is less severe than the necessary
condition that they must not be self-contradictory, but is similar in
nature. On the face of it this condition may be hard to satisfy; we have
had the instance of a measure whose probability is uniformly distributed
from 0 to 1 when the question is whether it takes one precise value. But we shall
see that the condition is also strongly suggested by other considerations,
and the difficulty can be avoided.‡

‡ Prof. K. R. Popper, Logic of Scientific Discovery (Appendix viii), maintains that it
cannot be avoided. I cannot see, however, that he has adequately considered the principle
of convergence discussed in § 1.62.

3. For any assessment of the prior probability the principle of inverse
probability will give a unique posterior probability. This can be used
as the prior probability in taking account of a further set of data, and
the theory can therefore always take account of new information. The
choice of the prior probability at the outset, that is, before taking into
account any observational information at all, requires further con-
sideration. We shall see that further principles are available as a guide.
These principles sometimes indicate a unique choice, but in many
problems some latitude is permissible, so far as we know at present.
In such cases, and in a different world, the matter would be one for
decision by the International Research Council. Meanwhile we need
only remark that the choice in practice, within the range permitted,
makes very little difference to the results.
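
The use of a posterior probability as the prior for further data can be sketched as follows. This is a minimal Python illustration under assumed values: the three hypotheses about the chance of a head, the record of throws, and the names `update` and `likelihood_of` are hypothetical. Taking the throws one at a time gives the same posterior probabilities as taking them all at once.

```python
from fractions import Fraction

# Hypothetical alternatives for the chance of a head.
BIASES = [Fraction(1, 4), Fraction(1, 2), Fraction(3, 4)]


def update(prior, likelihood):
    """One application of the principle of inverse probability."""
    joint = [p * l for p, l in zip(prior, likelihood)]
    total = sum(joint)
    return [j / total for j in joint]


def likelihood_of(throw):
    """Probability of a single throw ('H' or 'T') on each hypothesis."""
    return [b if throw == "H" else 1 - b for b in BIASES]


prior = [Fraction(1, 3)] * 3
throws = "HHTHHHTH"  # hypothetical record of throws

# Sequential use: each posterior becomes the prior for the next throw.
sequential = prior
for throw in throws:
    sequential = update(sequential, likelihood_of(throw))

# Batch use: multiply the likelihoods of all throws, then update once.
batch_likelihood = [Fraction(1)] * 3
for throw in throws:
    batch_likelihood = [
        acc * l for acc, l in zip(batch_likelihood, likelihood_of(throw))
    ]
batch = update(prior, batch_likelihood)

print(sequential == batch)
```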
4. This is satisfied by definition.
5. We have avoided contradicting rule 5 so far, but further
applications of it will appear later.
6. Our main postulates are the existence of unique reasonable degrees
of belief, which can be put in a definite order; Axiom 4 for the consistency
of probabilities of disjunctions; and either the axiomatic extension of the
product rule or the theory of expectation. It does not appear that these
can be reduced in number, without making the theory incapable of
covering the ground required.
7. The simple cases mentioned on pp. 29—30 show how the principle of
inverse probability does correspond to ordinary processes of learning,
though we shall go into much more detail as we proceed. Differences
between individual assessments that do not agree with the results of
the theory will be part of the subject-matter of psychology. Their
existence can be admitted without reducing the importance of a unique
standard of reference. It has been said that the theory of probability
could be accepted only if there was experimental evidence to support
it; that psychology should invent methods of measuring actual degrees
of belief and compare them with the theory. I should reply that without
an impersonal method of analysing observations and drawing inferences
from them we should not be in a position to interpret these observations
either. The same considerations would apply to arithmetic. To quote
P. E. B. Jourdain:†
I sometimes feel inclined to apply the historical method to the multiplication
table. I should make a statistical inquiry among school children, before their
pristine wisdom had been biased by teachers. I should put down their answers
as to what 6 times 9 amounts to, I should work out the average of their answers
† The Philosophy of Mr. B*rtr*nd R*ss*ll, 1918, p. 88.
Hold up there, Camerado!
Beauty is all very good as far as it goes, and Art the perpetuator of
Beauty is all very good as far as it goes, but you can tell your
folks,
Your folks in London, or in Dublin, or in Rome, or where the Arno
flows, or where Seine flows,
Your folks in the picture-galleries, admiring the Raphaels, the
Tintorettos, the Rubenses, Vandykes, Correggios, Murillos,
Angelicos of the world,
(I know them all, they have effused to me, I have wrung them out, I
have abandoned them, I have got beyond them,)—

Narcissus (aside, with tenderness).

Ah, Burne-Jones!

Paumanokides.

Tell them that I am considerably more than Beauty!
I, representing the bone and muscle and cartilage and adipose
tissue and pluck of the Sierras, of California, of the double
Carolinas, of the Granite State, and the Narragansett Bay
State, and the Wooden Nutmeg State!
I, screaming with the scream of the bald-headed bird the eagle in
the primitive woods of America my country, in the hundred
and sixth year of these States!

Dear son, I have learned the secret of the Universe,
I learned it from my original bonne, the white-capped ocean,
I learned it from the Ninth-month Equinoctial, from the redwood
tree, and the Civil War, and the hermit-thrush, and the
telephone, and the Corliss engine,
The secret of the Universe is not Beauty, dear son, nor is it Art the
perpetuator of Beauty,
The secret of the Universe is to admire one’s self.
Camerado, you hear me!
Narcissus.

Ah, I too loitering on an eve of June
Where one wan Narciss leaned above a pool,
While overhead Queen Dian rose too soon,
And through the Tyrian clematis the cool
Night airs came wandering wearily, I too,
Beholding that pale flower, beheld Life’s key at last, and knew

That love of one’s fair self were but indeed
Just worship of pure Beauty; and I gave
One sweet, sad sigh, then bade my fond eyes feed
Upon the mirrored treasure of the wave,
Like that lithe beauteous boy in Tempe’s vale,
Whom hapless Echo loved—thou know’st the Heliconian tale!

And while heaven’s harmony in lake and gold
Changed to a faint nocturne of silvern-gray,
Like rising sea-mists from my spirit rolled
The grievous vapors of this Age of Clay,
Beholding Beauty’s re-arisen shrine,
And the white glory of this precious loveliness of mine!

Paumanokides.

I catch on, my Comrade!
—You allow that your aim is similar to mine, after all is said and
done.
Well, there is not much similarity of style, and I recommend my style
to you.
Go gaze upon the native rock-piles of Mannahatta, my city,
Formless, reckless,
Marked with the emerald miracle of moss, tufted with the
unutterable wonder of the exquisite green grass,
Giving pasture to the spry and fearless-footed quadruped the goat,
Also patched by the heaven-ambitious citizens with the yellow
handbill, the advertisement of patent soaps, the glaring and
vari-colored circus poster:
Mine, too, for reasons, such arrays;
Such my unfettered verse, scorning the delicatesse of dilettantes.
Try it, I’ll stake you my ultimate dollar you’ll like it.

Narcissus (gracefully waiving the point).

Haply in the far, the orient future, in the dawn we herald like the
birds,
Men shall read the legend of our meeting, linger o’er the music of
our words;

Haply coming poets shall compare me then to Milton in his lovely
youth,
Sitting in the cell of Galileo, learning at his elder’s lips the truth.

Haply they shall liken these dear moments, safely held in History’s
amber clear,
Unto Dante’s converse bland with Virgil, on the margin of that
gloomy mere!

Paumanokides.

Do not be deceived, dear son;
Amid the choruses of the morn of progress, roaring, hilarious, those
names will be heard no longer.
Galileo was admirable once, Milton was admirable,
Dante the I-talian was a cute man in his way,
But he was not the maker of poems, the Answerer!
I Paumanokides am the maker of poems, the Answerer,
And I calculate to chant as long as the earth revolves,
To an interminable audience of haughty, effusive, copious, gritty, and
chipper Americanos!

Narcissus.
What more is left to say or do?
Our minds have met; our hands must part.
I go to plant in pastures new
The love of Beauty and of Art.
I’ll shortly start.
One town is rather small for two
Like me and you!

Paumanokides.

So long!
THE SONG OF SIR PALAMEDE.
“Came Palamede, upon a secret quest,
To high Tintagel, and abode as guest
In likeness of a minstrel with the king.
Nor was there man could sound so sweet a string.
...
To that strange minstrel strongly swore King Mark,
By all that makes a knight’s faith firm and strong,
That he, as guerdon of his harp and song,
Might crave and have his liking.
...’O King, I crave
No gift of man that king may give to slave,
But this thy crowned queen only, this thy wife.’”

Swinburne. Tristram of Lyonesse.

With flow exhaustless of alliterate words,
And rhymes that mate in music glad as birds
That feel the spring’s sweet life among light leaves
That ardent breath of amorous May upheaves
And kindles fluctuant to an emerald fire
Bright as the imperious seas that all men’s souls desire:
With long strong swell of alexandrine lines,
And with passion of anapæsts, like winds in pines
That moan and mutter in great gusts suddenly,
With whirl of wild wet wings of storms set free:
In mirth of might and very joy to sing,
Uplifting voice untired, I sound one sole sweet string.

Love, that is ever bitter as salt blown spray,
Yet sweet, yea sweet as wrath or wine alway,
As red warm mouths of Mænads subtly sweet;
Love, that is fleeter than the wind’s fleet feet
Soft-shod with snowflakes; love, that hath the name
And fury and force of swift bright shuddering flame:
Fate, that is foe to love and lovely life,
Yea foe implacable, and hath death to wife;
Fate, that is bitterer than the salt spray blown
And colder than soft snow yet hard as stone;
Fate, that makes daily fare of heart’s desire,
Being found thereunto a devouring fire:
Death, that is friend to fate and fair love’s foe;
Death, that makes waste the wolds of life with snow;
Death, harsh as spray of seas that wild winds blow:
Life, that is strangely one of all these three,
Being bitter as is the sharp salt spray of sea,
And thereto colder than the blown white rose
And soft brief blossom of unmothered snows,
And fiercer than the forceful feathered fire,
Fed as a flame with hope of heart and high desire:
All these I sing, and sound the same sweet string.

And as fresh-gathered leaves of bay I bring
Green praises to all dear dead lute-players,
Whom Pluto’s passionate queen holds fast as hers,
Yea all sad souls that have smiled and sinned and sung,
With whose gold-colored hairs and hoar this harp is strung.
And blame of the high great gods that do amiss,
Being cruel and crowned and bathed complete in bliss,
And careless if this world be out of tune,
And deaf to dithyrambs of bards that bay the moon:
And all perfections of all those I love,
Each bettering still the best and still above
The last this violent voice proclaimed the best,
And blown by stormy breath still starward o’er the rest;
And all large loathsomeness of all I hate,
Whose poisonous presence doth Caïna wait,
And better it were that they had ne’er been born,
I being dowered with hate of hate and scorn of scorn,
And shrinking not to name them newts and snakes,
Lepers and toads and frogs and hooting owls and crakes:
All these with ease of measureless might I sing,
And sound, though sheer stark mad, the same sweet string.
And many a theme I choose in wayfaring,
As one who passing plucks the sunflower
And ponders on her looks for love of her.
Yea, her flower-named whose fate was like a flower,
Being bright and brief and broken in an hour
And whirled of winds: and her whose lawless hand
Held flickering flame to fawn against the brand,
Till Meleager splendid as the sun
Shrank to a star and set, and all her day was done:
And her who lent her slight white virgin light
For death to dim, that Athens’ mastering might
Above all seas should shine, supernal sphere of night:
And her who kept the high knight amorous
Pent in her hollow hill-house marvelous,
And flame of flowers brake beauteous where she trod,
Her who hath wine and honey and a rod,
And crowneth man a king, and maketh man a slave
Her who rose rose-red from the rose-white wave:
And her who ruled with sword-blue blade-bright eyes
The helpless hearts of men in queenly wise,
And all were bowed and broken as on a wheel,
Yet no soft love-cloud long could sheath that stainless steel,
Her tiger-hearted and false and glorious,
With flower-sweet throat and float of warm hair odorous;
These sing I, and whatso else that burns and glows,
And is as fire and foam-flowers and the rose
And sun and stars and wan warm moon and snows.
Who hath said that I have not made my song to shine
With such bright words as seal a song to be divine?
Who hath said that I have not sweetness thereon spread
As gold of peerless honey is poured on bread?
Who hath said that I make not all men’s brains to ring,
And swim with imminent madness while I sing,
And fall as feeble dykes before strong tides of spring?
And now as guerdon of my great song I claim
The swan-white pearl of singers, yea Queen Fame,
Who shall be wed no more to languid lips and tame,
But clasp me and kiss and call me by my name,
And be all my days about me as a flame,
Though sane vain lame tame cranes sans shame make game and
blame!
A MERRY JEST OF A MODERN MAID.
Miss Pallas Eudora Van Blurky,
She didn’t know chicken from turkey;
High-Spanish and Greek she could fluently speak,
But her knowledge of poultry was murky!

She could tell the great-uncle of Moses,
And the dates of the Wars of the Roses,
And the reasons of things—why the Indians wore rings
In their red aboriginal noses;

Why Shakspere was wrong in his grammar,
And the meaning of Emerson’s Brahma,
And she went chipping rocks with a little black box
And a small geological hammer.

She had views upon co-education,
And the principal needs of the nation,
And her glasses were blue, and the number she knew
Of the stars in each bright constellation.

And she wrote with a handwriting clerky,
And she talked with an emphasis jerky,
And she painted on tiles in the sweetest of styles,
But she didn’t know chicken from turkey!
THE RHYME OF THE HERCULES CLUB.
BEING A BALLAD OF TO-DAY, DESIGNED TO ILLUSTRATE THE
PRINCIPLE OF REACTION, AND TO SET FORTH HOW THERE MAY BE
TOO MUCH OF AN EXCELLENT THING.

There was once a young man of the medium size,
Who, by keeping a ledger, himself kept likewise.
In the matter of lunch he’d a leaning to pies,
And his chronic dyspepsia will hence not surprise;
And his friends often told him, with tears in their eyes,
Which they did not disguise, that a person who tries
To live without exercise generally dies,
And declared, for the sake of his family ties,
He should enter the Hercules Club.

Tom Box and Dick Dumbell would suasively say,
If they met him by chance in the roar of Broadway,
“It’s bad for a fellow, all work and no play;
Come, let us propose you! You’ll find it will pay
To belong to the Hercules Club!”

And he yielded at last, and they put up his name,
Which was found without blame; and they put down the same
In a roll-book tremendous; and straight he became
A Samson, regarding his tame past with shame;
Called for “Beef, lean and rare!” and cut off all his hair,
Had his shoulders constructed abnormally square,
And walked out with an air that made people declare,
“He belongs to the Hercules Club!”

And he often remarked, in original way:
“It’s bad for a fellow, all work and no play;
Without recreation, sir, life doesn’t pay!
And I for my part am most happy to say
I belong to the Hercules Club.”

And frequently during a very hot “spell,”
In thick woolen garments clad closely and well,
“Reducing,”—for he was resolved to excel,—
He rowed in the sun at full speed, in a shell
That belonged to the Hercules Club.

And for weeks, while the dew on the racing-track lay,
He ran before breakfast a half mile a day,
Improving his style and increasing his “stay”;
And was first at the finish, and fainted away,
At the games of the Hercules Club.

Six nights in succession he sat up to pore
“The Laws of Athletics” devotedly o’er
(Which number ten thousand and seventy-four),
With a view to proposing a very few more
In a speech to the Hercules Club.

And his coat upon festal occasions was gay
With medals on medals, marked “H. A. A. A.,”[1]
With a motto in Greek (which, my lore to display,
Means “Pleasure is business”), a splendid array
Of the spoils of the Hercules Club.

But acquaintances not of the muscular kind
Began to observe that his brow was deep-lined,
Too brilliant his eye, and to wander inclined;
He appeared, in a word (early English), “fore-pined”;
And one morning his ledger and desk he resigned,
Explaining, “I can’t have my health undermined
By this ‘demnition grind’; and I’m getting behind
In my duties as Captain” (an office defined,
Page hundred and two, in the by-laws that bind
With red tape the great Hercules Club).
And he further remarked, in most serious way:
“Give it up, did you say? ’Twill be frigid, that day![2]
Why, without relaxation, sir, life wouldn’t pay!
And I, for my part, will remain till I’m gray
On the roll of the Hercules Club!”

You perceive, gentle reader, the rub.
Is it nobler to suffer those arrows and slings
Lack of exercise brings—or take clubs, and let things
Unconnected with matters athletic take wings;
Till all interests beside, like the Arabs, shall glide
From the landscape of life, once a plain free and wide,
But now fenced for the “Games” which we lightly began,
Grown our serious aims and the chief end of Man?
There’s an aureate mean these two courses between,
But I humbly submit that it seldom is seen,
With all proper respect for that organization,
Of benevolent purpose and high reputation,
The excellent Hercules Club!

FOOTNOTES

[1] “H. A. A. A.”: Hercules Amateur Athletic Association.


[2] Frigid day, or day of low temperature: A singular idiom of
the American language, expressing grave improbability.
THE BALLAD OF CASSANDRA BROWN.
Though I met her in the summer, when one’s heart lies round at
ease,
As it were in tennis costume, and a man’s not hard to please,
Yet I think at any season to have met her was to love,
While her tones, unspoiled, unstudied, had the softness of the dove.

At request she read us poems in a nook among the pines,
And her artless voice lent music to the least melodious lines;
Though she lowered her shadowing lashes, in an earnest reader’s
wise,
Yet we caught blue gracious glimpses of the heavens that were her
eyes.

As in paradise I listened. Ah, I did not understand
That a little cloud, no larger than the average human hand,
Might, as stated oft in fiction, spread into a sable pall,
When she said that she should study Elocution in the fall!

I admit her earliest efforts were not in the Ercles vein;
She began with, “Lit-tle Maaybel, with her faayce against the
paayne,
And the beacon-light a-trrremble,”—which although it made me
wince,
Is a thing of cheerful nature to the things she’s rendered since.

Having learned the Soulful Quiver, she acquired the Melting Mo-o-an,
And the way she gave “Young Grayhead,” would have liquefied a
stone.
Then the Sanguinary Tragic did her energies employ,
And she tore my taste to tatters when she slew “The Polish Boy.”

It’s not pleasant for a fellow when the jewel of his soul
Wades through slaughter on the carpet, while her orbs in frenzy roll;
What was I that I should murmur? Yet it gave me grievous pain
That she rose in social gatherings and Searched among the Slain.

I was forced to look upon her, in my desperation dumb,
Knowing well that when her awful opportunity was come
She would give us battle, murder, sudden death at very least,
As a skeleton of warning, and a blight upon the feast.

Once, ah! once I fell a-dreaming; some one played a polonaise
I associated strongly with those happier August days;
And I mused, “I’ll speak this evening,” recent pangs forgotten quite.
Sudden shrilled a scream of anguish: “Curfew shall not ring to-
night!”

Ah, that sound was as a curfew, quenching rosy warm romance:
Were it safe to wed a woman one so oft would wish in France?
Oh, as she “cull-imbed” that ladder, swift my mounting hope came
down.
I am still a single cynic; she is still Cassandra Brown!

—Coroebus Green.
THE SWEET O’ THE YEAR.
This trifle may derive interest from the music, by Mr. E. C. Phelps,
in Scribner’s Monthly for August, 1880.

ACT I.

SCENE.—A Lowly Cot.

Tenant (Tenor).
Tenant’s Wife (Soprano).
Tenant’s Mother-in-Law (Contralto).
Landlord (Basso).

TENOR SOLO.

How happy is our lot,
Beneath our vines and fig-trees,
In this suburban spot,
Among so many big trees!
Our landlord’s very kind,
His speech is mild and gentle,
He never was inclined
To go and raise the rental.

TRIO.

How happy is our lot
Beneath our vines and fig-trees,
In this suburban spot,
Among so many big trees;
How happy is our lot!
How happy is our lot!
Enter Landlord. BASSO.

How do you do?
Aside. I’ll try a few devices;
I’ve paid a five-cent fare,
To see if my premises
Were wanting much repair.

TENOR.

Sir, the whole house neat and nice is,
And requires no extra care.

BASSO.

Aside. Got him there!
Direct. This is indeed a lovely spot.

TENOR.

Beyond compare.

BASSO.

Aside. Got him there!
Direct. I think you never find it hot?

TENOR.

Fine cool air.

BASSO.

Aside. Got him there!
Direct. Handy to the cars and boats?

TENOR.
Pretty fair.

BASSO.

Aside. Got him there!
Direct. Far removed from geese and goats?

TENOR.

So we air.

BASSO.

Aside. Got him there!
Think I’ve got him everywhere.
Direct. Bless you! after so much praise
I shall really have to raise.

Mother-in-law. CONTRALTO.

To Tenor. Oh, oh, oh!
No, no, no!
Have you the feelings of a man
To stand such wicked imposition?
An old house built on such a plan,
And in the very worst condition.

SOPRANO.

The paper’s hanging on the wall.

CONTRALTO.

The plaster’s tumbling from the ceiling.

SOPRANO.

The front piazza is liable to fall.

CONTRALTO.

Oh, are you a man of any feeling?

TENOR.

I won’t pay!

BASSO.

First of May.
Intermission—Agent heard without tacking up bill.

ACT II.

Enter Left—Chorus of Feminine House-Seekers and Chorus of
Masculine House-Seekers, waving permits.

FULL CHORUS.

I want to see⸺

TENOR.

Oh, certainly!
Be kind enough to follow me.

FEMALE CHORUS.

This parlor’s rather nice;
This parlor’s rather small;
Are you troubled with rats and mice?
Will the landlord paint the wall?

MALE CHORUS.

Does the roof leak when it’s clear?

FEMALE CHORUS.

Are the bedrooms tinted blue?
How long have you lived here?
Will the range cook oyster stew?

Exeunt, R.

FULL CHORUS (re-entering, R.)

It wouldn’t do!

FEMALE CHORUS.

It’s warm!

MALE CHORUS.

It’s cold!

FEMALE CHORUS.

It’s quite too new!

MALE CHORUS.

It’s quite too old!

FULL CHORUS.

I wanted gas!
I wanted grass!
We all expected fine plate-glass!
And shelves for cheese!
And orange trees!
And beds for raising strawberries!

I dwell in a marble hall,
And I couldn’t make it do;
And I don’t see how you live at all;
And I’m much obliged to you.
THE TENDER HEART.
She gazed upon the burnished brace
Of plump ruffed grouse he showed with pride
Angelic grief was in her face:
“How could you do it, dear?” she sighed.
“The poor, pathetic, moveless wings!
The songs all hushed—oh, cruel shame!”
Said he, “The partridge never sings.”
Said she, “The sin is quite the same.

“You men are savage through and through.
A boy is always bringing in
Some string of bird’s eggs, white and blue,
Or butterfly upon a pin.
The angle-worm in anguish dies,
Impaled, the pretty trout to tease⸺”
“My own, I fish for trout with flies⸺”
“Don’t wander from the question, please!”

She quoted Burns’s “Wounded Hare,”
And certain burning lines of Blake’s,
And Ruskin on the fowls of air,
And Coleridge on the water-snakes.
At Emerson’s “Forbearance” he
Began to feel his will benumbed;
At Browning’s “Donald” utterly
His soul surrendered and succumbed.

“Oh, gentlest of all gentle girls,”
He thought, “beneath the blessed sun!”
He saw her lashes hung with pearls,
And swore to give away his gun.
She smiled to find her point was gained,
And went, with happy parting words
(He subsequently ascertained),
To trim her hat with humming-birds.
—So good night unto you all.
Give me your hands, if we be friends,
And Robin shall restore amends.

A Midsummer Night’s Dream.


*** END OF THE PROJECT GUTENBERG EBOOK OBERON AND PUCK ***
