The Logicians.
Part I.
From Richard Whately to William Stanley Jevons
Stanley N Burris
Preface
As an undergraduate I read portions of Boole’s 1854 classic, The Laws of Thought, and came to
the conclusion that Boole was rather inept at doing Boolean algebra. Much later, when studying
the history of logic in the 19th century, a conversation with Alasdair Urquhart led me to the 1976
book of Theodore Hailperin that shows why Boole’s methods work. This was when I finally realized
that trying to superimpose Boolean algebra, or Boolean rings, on Boole’s work was a real mistake.
Boole uses the ordinary number system, and not a two-element algebra.
With the exception of Hailperin’s book I do not know of a single history of logic or mathematics
that properly explains Boole’s work. Furthermore, even though the books and papers devoted to
the development of mathematical logic in the mid 1800s are readily available today, the essential
facts have not been written up in a concise yet comprehensive form for a modern audience. These
notes intend to round out the picture with a brief but substantial account of the transition from
Aristotelian logic to mathematical logic, starting with Whately’s revival of Aristotelian logic The
Elements of Logic in 1826, continuing through the work of De Morgan, Boole and Jevons, and
concluding with modern versions of equational proof systems for Boolean algebra, Boolean rings,
and a discussion of Hailperin’s exposé of the work of Boole.
Many of the sections are devoted to discussing a single book. Within such sections plain page
number references (without a source cited) apply to the book being discussed in the section.
Stanley Burris
Waterloo, Ontario
March, 2001
Prelude: The Creation of an Algebra of Logic
Boole's responses to his critics were, for the most part, unconvincing. In essence his defence was just that he was using the ‘symbolic method’, and the requests for clarification were mainly met with a number of new examples showing how
to use his system. Boole discovered that ordinary algebra leads to a powerful algebraic system
for analyzing logical arguments, but he really did not know why it worked. Indeed, it seems that
the fact that it did work in many examples was quite enough to convince Boole that it would
work in general—a masterful application of inductive reasoning to provide an equational approach
to classical deductive reasoning. The fact that his methods provide an equational logic that can
indeed correctly handle arguments about classes was not established until Theodore Hailperin’s
book appeared in 1976.3 Hailperin uses signed multisets to extend the collection of classes, after
identifying classes with idempotent multisets. In this setting Boole’s operations become meaningful,
namely class + class may not be a class, but it is always a signed multiset.
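Hailperin's signed-multiset semantics can be illustrated with a small sketch (our own encoding, not Hailperin's notation): a signed multiset over a finite universe assigns an integer multiplicity, possibly negative, to each element, and the classes are exactly the idempotent, 0/1-valued multisets.

```python
# A sketch (not Hailperin's formalism) of signed multisets over a finite
# universe: each element carries an integer multiplicity, possibly negative.
# A class is identified with a multiset whose multiplicities are all 0 or 1,
# i.e. exactly the idempotent ones.

def add(m1, m2):
    """Boole's x + y: pointwise sum of multiplicities."""
    return {e: m1.get(e, 0) + m2.get(e, 0) for e in set(m1) | set(m2)}

def mult(m1, m2):
    """Boole's xy (intersection on classes): pointwise product."""
    return {e: m1.get(e, 0) * m2.get(e, 0) for e in set(m1) | set(m2)}

def is_class(m):
    """Idempotent multisets (mm = m) are exactly the 0/1-valued ones."""
    return all(v in (0, 1) for v in m.values())

x = {'a': 1, 'b': 1}          # the class {a, b}
y = {'b': 1, 'c': 1}          # the class {b, c}
s = add(x, y)                 # b now has multiplicity 2: not a class,
assert is_class(x) and is_class(y) and not is_class(s)
assert mult(s, s) != s        # and s is not idempotent,
assert is_class(mult(x, y))   # but products of classes remain classes.
```

So class + class may leave the classes, but it always stays inside the signed multisets, where every operation is total.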
The justification of Boole’s system likely requires more sophistication in modern algebra than
was available in the 19th century. This would lead Jevons in the 1860s to create an alternative
algebra of logic, essentially what is now called Boolean algebra.4 Boole’s work is a clever interplay
between traditional logic and the algebra of numbers. With Jevons, logic abandons the ties to
numbers and joins its new companion, ‘Boolean’ algebra, for a mature and enduring relationship.
To have a good idea of where Boole and De Morgan were starting from we begin these notes
with an overview of the influential 1826 logic text of Richard Whately. The notion of a class was
well established in the traditional literature on Aristotelian logic, as well as the use of a single letter
to denote a class, for example, using M to denote the class of men. However, using symbols to
describe combinations of classes, for example, using A + B for the union of A and B, was not part
of the traditional literature of either logic or mathematics.
3. An expanded 2nd edition appeared in 1986.
4. De Morgan was quick to praise Boole’s work, but he apparently put little serious effort into trying to understand it. Jevons, however, was determined to make sense of Boole’s algebraic approach, and he modified it in such a way that it could be readily justified. It is essentially the interpretations and equational axioms of Jevons that most of my colleagues take to be the contribution of Boole, namely the modern Boolean algebra. But it most certainly is not what Boole did, nor does it seem that Boole had much sympathy for this ‘Boolean’ algebra.
5. Jevons’ 1880 book [10] has a chapter titled Elements of Equational Logic, meaning his reworking of Boole’s algebra of logic.
name symbols are to be treated as universally quantified. For example, one can have the law
xy = yx
which says that for any two classes the intersection does not depend on the order in which one
takes the classes. On the other hand one can have a syllogistic argument
All x is y
All y is z
All x is z
where it is understood that the x in the first premiss and the conclusion refer to the same class,
etc. When this is translated into equations in the algebra of logic, say by
xy = x
yz = y
xz = x,
again the x in the first equation and the last equation refer to the same class, etc. Thus for each
of these equations the simple name symbols are not regarded as universally quantified.
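The validity of the translated argument can be checked directly under the set-theoretic reading, where xy = x says X ⊆ Y. Equationally the conclusion follows as xz = (xy)z = x(yz) = xy = x; the brute-force sketch below (illustrative only, over a three-element universe) confirms the semantic version.

```python
from itertools import combinations

def powerset(u):
    """All subsets of a finite universe, as frozensets."""
    u = sorted(u)
    return [frozenset(c) for r in range(len(u) + 1)
            for c in combinations(u, r)]

universe = {0, 1, 2}
for X in powerset(universe):
    for Y in powerset(universe):
        for Z in powerset(universe):
            # premises: xy = x and yz = y, i.e. X ⊆ Y and Y ⊆ Z
            if X & Y == X and Y & Z == Y:
                # conclusion: xz = x, i.e. X ⊆ Z
                assert X & Z == X
```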
The words variable and constant were not used by De Morgan, Boole or Jevons, so to prevent
any modern nuances from slipping into the reading we will also avoid these words when commenting
on quoted portions of their texts. Instead, if we do not use the same nomenclature as the quoted
text then we will simply use the word symbol, in the indicated font.
De Morgan considers ‘All X is Y’ when X is empty, and he declines to assign either true or false to it. He says that in case both X
and Y are empty he will
. . . not attempt to settle what nonexisting things agree or disagree.
In 1847 Boole translates ‘All X is Y’ as the equation xy = x, and when X is empty this becomes
the equation ‘0y = 0’, which is true in his system. However Boole does not explicitly discuss the
meaning of ‘All X is Y’ when the subject or predicate are empty, and consequently he does not
discuss the truth or falsity of such a statement.6 The closest Boole comes to discussing the role of
the empty class in his semantics is on page 65 of [1]:
It may happen that the simultaneous satisfaction of equations thus deduced, may require that
one or more of the elective symbols should vanish. This would only imply the nonexistence
of a class: it may even happen that it may lead to a final result of the form
1 = 0,
which would indicate the nonexistence of the logical Universe. Such cases will only arise
when we attempt to unite contradictory Propositions in a single equation.
Thus a conclusion 1 = 0 would mean that the premises were contradictory.
The second obvious way to resolve the problem with conversion per accidens is to use what we
call restricted semantics, namely one is allowed to interpret a symbol as any subclass of the
universe except the empty class and the universe. Kneale and Kneale ([13], page 408) note that
Aristotle did not permit terms to denote either the empty class or the universe, but it is doubtful
that Aristotle would have been so generous as to accept all the interpretations allowed by our
restricted semantics. For example, traditional Aristotelian logic does not allow the contrary of a
term, e.g. ‘not-man’, to be the subject of a universal statement.
Although we are not clear as to the precise semantics of Boole or De Morgan, the restricted
semantics seems to fit remarkably well. After declining to speculate as to whether ‘All X is Y’ is true
or false when X does not exist, De Morgan makes the condition of existence (i.e., nonemptiness) of
the terms a precondition for using a categorical proposition in a syllogism.7
De Morgan had stated ([4], page 55) that no term of a proposition was to name the universe.
This was not because it created problems with the meaning of propositions, but rather because
it led to trivial simplifications of the premises that would eliminate the reference to such terms.
However once he excludes the empty class from being an interpretation of the symbols, it seems
clear that he must also exclude the universe—to be able to use contraries of terms on an equal
footing with terms.
Putting the above pieces together it seems that the restricted semantics is in excellent harmony
with the writings of Boole and De Morgan. All their simple inferences, including the conversions,
are valid under the restricted semantics. There is one problem with Boole’s classification of the
valid syllogisms, given that he, like De Morgan, accepts contrary terms such as not-X on an equal
footing with the term X. From the 2nd Figure AA premises we have, under the restricted semantics,
6. The clear decision to admit the empty class as an acceptable interpretation of the subject of a categorical proposition seems to have originated with C.S. Peirce in his 1880 paper [16] on the algebra of logic. Peirce looks at the examples ‘[All/Some] lines [are/are not] vertical’ and pictures four different scenarios, two of which have no vertical lines. He assigns, in modern fashion, truth values to each of the four Aristotelian categorical statements in each of these four cases. Schröder adopts Peirce’s conventions in the second volume (1891) of his influential Algebra der Logik, and goes on to clarify the situation when the predicate does not apply to anything, saying that ‘All X is Y’ will mean the same as X ⊆ Y, which is precisely the modern interpretation. Schröder says that although dealing with assertions about an empty subject is not an issue in everyday life, in the scientific community one has to constantly deal with the possibility that there are no entities fitting a given description.
7. This leads to some discussion about whether or not the conclusion of a syllogism should include the phrase ‘provided the middle term exists’.
Terms
Both De Morgan and Boole invent symbolic notation to give names to certain combinations of
classes. For example, if A and B denote two classes then they both use the name AB to denote the
class of elements common to the two classes. De Morgan calls such names compound names, and
Boole calls them functions. The modern terminology for such names, including the simple name
symbols, is terms. For De Morgan, compound names for classes are a sidelight to his investigations
of the syllogism. He is much more interested in names for combinations of propositions, as well as
of binary relations. But for Boole, his functions are central to his algebra of logic.
In modern logic a term that does not include any variables is called a ground term, and an
equation that does not include any variables is called a ground equation. Using this terminology,
the translations by Boole and Jevons of an argument into equations yield arguments about ground
equations.
Boole, motivated by ordinary algebra, introduced simple name symbols for terms such
as ϕ, f, t, V , etc. These would be useful in formulating his general Expansion,9 Reduction and
Elimination Theorems. For example his Expansion Theorem in one symbol is f (x) = f (1)x +
f (0)(1 − x). In our commentary we like to use p, q, r, . . . as simple names for terms. It seems a bit
curious that De Morgan and Jevons did not introduce simple names for terms.
De Morgan and Jevons both work with the operations of union, intersection and complement
in the modern sense. Boole also works with intersection and complement in the modern sense, but
he introduces the operations + and − as partial operations on classes. If x denotes the class X and
y the class Y then x + y in Boole’s system represents the union of X and Y provided X and Y are
disjoint classes; and x − y represents the difference of X and Y provided Y is contained in X.
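Viewed from today, Boole's + and − behave like partial operations on classes; a minimal sketch (hypothetical helper names, not Boole's notation) makes the definedness conditions explicit.

```python
def boole_add(X, Y):
    """Boole's x + y: defined only when the classes X and Y are disjoint."""
    if X & Y:
        raise ValueError("x + y is uninterpretable: the classes overlap")
    return X | Y

def boole_sub(X, Y):
    """Boole's x - y: defined only when Y is contained in X."""
    if not Y <= X:
        raise ValueError("x - y is uninterpretable: Y is not contained in X")
    return X - Y

men = {'socrates', 'plato'}
stones = {'rock'}
assert boole_add(men, stones) == {'socrates', 'plato', 'rock'}
assert boole_sub(men, {'plato'}) == {'socrates'}
```

For overlapping classes `boole_add` raises an error rather than returning a class, which is exactly the situation Hailperin's signed multisets repair by making the operation total.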
8. It is described in his notation as A A I . See also pages 105 and 116 of [4].
9. Boole uses the word ‘Development’ instead of ‘Expansion’.
Kneale and Kneale ([13], page 410) claim that Boole does not even allow one to write down
x + y unless x and y represent disjoint classes. But this claim is simply incorrect. Although many
terms such as x + y may be uninterpretable, that does not mean they cannot be written down
and used in Boole’s system. In his 1854 book Boole strenuously argues for the admittance of
uninterpretable terms in the intermediate steps of a symbolic argument—he says that to restrict all
steps of equational inference to interpretable terms would destroy much of the value of his system.
One sees uninterpretable terms used freely in his Chapter VIII On the Reduction of Systems of
Propositions, especially in his favorite procedure to reduce a system of equations V₁ = 0, V₂ = 0,
. . . to a single equation V₁² + V₂² + · · · = 0.
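This reduction rests on a fact about ordinary arithmetic: a sum of squares vanishes exactly when every summand does. A toy check (illustrative only):

```python
def reduce_system(values):
    """Combine V1 = 0, V2 = 0, ... into the single value V1^2 + V2^2 + ...;
    over the ordinary numbers this is 0 exactly when every Vi is 0."""
    return sum(v * v for v in values)

assert reduce_system([0, 0, 0]) == 0
assert reduce_system([0, -2, 0]) != 0   # one nonzero Vi spoils it
# the squaring matters: with plain sums, terms can cancel spuriously
assert 3 + (-3) == 0
```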
x   y   x + y   y + x
1   1     2       2
1   0     1       1
0   1     1       1
0   0     0       0
Fig. 1 Justifying the Commutative Law
Unfortunately Boole did not give such simple detailed applications of this rule. He does state
that it justifies his Expansion Theorem. In the one symbol case this is, as mentioned earlier,
f (x) = f (1)x + f (0)(1 − x) .
A detailed truth table style presentation applying the Rule of 0 and 1 is given in Fig. 1 above.
The Rule of 0 and 1 can also be used to check the correctness of arguments. However Boole
only did this in one simple case, namely if a is a numerical coefficient that is not 0 then from at = 0,
where t is a special kind of term called a constituent, one can conclude t = 0. Had Boole put more
emphasis on applications of this Rule of 0 and 1 perhaps truth tables would have been established
much earlier in the logic literature.
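Such checking is easy to mechanize. The sketch below (our code, not a procedure Boole gives) applies the Rule of 0 and 1, that is, evaluating both sides with ordinary arithmetic while each symbol ranges over {0, 1}, to the commutative law of Fig. 1 and to the one-symbol Expansion Theorem.

```python
from itertools import product

def rule_of_0_and_1(lhs, rhs, nvars):
    """Check a proposed law lhs = rhs by evaluating both sides with
    ordinary arithmetic, each symbol restricted to the values 0 and 1."""
    return all(lhs(*v) == rhs(*v) for v in product((0, 1), repeat=nvars))

# Commutative law x + y = y + x (Fig. 1): holds, even though x + y
# takes the uninterpretable value 2 at x = y = 1.
assert rule_of_0_and_1(lambda x, y: x + y, lambda x, y: y + x, 2)

# Expansion Theorem in one symbol: f(x) = f(1)x + f(0)(1 - x).
f = lambda x: 3 * x * x - x + 2                  # an arbitrary polynomial
expansion = lambda x: f(1) * x + f(0) * (1 - x)
assert rule_of_0_and_1(f, expansion, 1)
assert f(2) != expansion(2)   # the identity is special to {0, 1}
```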
Semantics of Laws
The semantics of laws must be handled with care in the work of Boole. When he states the
laws
xy = yx
x2 = x
one would like to think that they stand on an equal footing. When one has a law like
xy = yx
this normally means that one can substitute terms s, t for the symbols x, y and conclude that
st = ts
holds. After all, if this law means that any two objects commute then surely any two objects named
by s and t commute.
However Boole’s system is deviant in this regard. He has the law x2 = x, but you cannot
derive (x + x)2 = x + x. The reason is that Boole’s idempotent law applies only to classes, and
unfortunately a term like x + x need not refer to a class. On the other hand Boole’s rule of 0 and
1 shows that for his law xy = yx one can indeed substitute any terms s, t for x, y.
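The contrast is easy to exhibit with ordinary arithmetic:

```python
# Boole's law x^2 = x holds with x restricted to 0 and 1:
for x in (0, 1):
    assert x ** 2 == x
# but its substitution instance (x + x)^2 = x + x fails at x = 1:
assert (1 + 1) ** 2 != 1 + 1        # 4 on the left, 2 on the right
# while xy = yx survives substitution, e.g. with s = x + x, t = 1 - y:
for x in (0, 1):
    for y in (0, 1):
        assert (x + x) * (1 - y) == (1 - y) * (x + x)
```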
Substitution
The use of the word substitution in the 1800s was different from the modern usage in equational
logic. In 1869 Jevons regards his rule of substitution as the sole central principle of reasoning,
replacing the traditional dictum de omni et nullo of Aristotle. His usage of substitution is that if
A = B then we can replace B by A in any assertion about B and obtain an equivalent assertion.
This mainly corresponds with what is called replacement in modern equational logic, namely if
A = B then A + C = B + C, C + A = C + B, etc. One can view replacement as saying that doing the
same thing to both sides of an equation, for example, adding C to the right side of each term, gives
an equation that follows from the original equation. This was essentially the sole rule of inference
proposed by Boole in 1847, namely
equivalent operations performed upon equivalent subjects provide equivalent results.
Boole did not give a special name to this rule—he merely said that it was the only axiom that was
needed. Thus, as far as equations go, we claim that the main rule of inference proposed by
Boole10 in 1847, and by Jevons11 in 1869, was the modern rule of replacement.
In modern equational logic we use the word substitution to mean the uniform substitution of
terms for variables, for example, if we substitute x + y for x in x2 = x we obtain (x + y)2 = x + y. In
modern propositional logic we also use the word substitution in such a uniform sense when we speak
of a substitution instance of the tautology P → (Q → P). Nineteenth-century writers apparently
had no word for our modern substitution.
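To make the contrast concrete, [modern] substitution can be sketched on terms encoded as nested tuples (an illustrative encoding, not any historical notation): substitution replaces every occurrence of a variable by a term, uniformly.

```python
def substitute(term, var, repl):
    """Uniform [modern] substitution of the term repl for every
    occurrence of the variable var."""
    if isinstance(term, str):                     # a variable
        return repl if term == var else term
    op, *args = term                              # a compound term
    return (op,) + tuple(substitute(a, var, repl) for a in args)

# x^2 = x, encoded as ('=', ('*', 'x', 'x'), 'x')
idem = ('=', ('*', 'x', 'x'), 'x')
# substituting x + y for x yields (x + y)^2 = x + y
inst = substitute(idem, 'x', ('+', 'x', 'y'))
assert inst == ('=', ('*', ('+', 'x', 'y'), ('+', 'x', 'y')), ('+', 'x', 'y'))
```

[Jevons] substitution, by contrast, licenses replacing one side of a known equation by the other at a single occurrence, which is the modern rule of replacement.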
We want to look at Jevons’ use of substitution, and also to discuss modern substitution. To
sort out this substitution tangle we will adopt the following convention: the word substitution will
at times be prefixed with a bracketed word to clarify the version of substitution that we are talking
about, namely [Jevons] substitution or [modern] substitution. This will allow us to make
10. In his 1854 book Boole formulated this as: adding or subtracting equals from equals gives equals. But in 1854 this was no longer the guiding principle of inference for Boole; instead it was the Rule of 0 and 1.
11. Jevons’ substitution is stronger than replacement, for example, one can derive the transitive rule for equality from this. Jevons’ substitution for equations, plus the reflexive law for equality, is equivalent to replacement plus the reflexive, symmetric and transitive laws for equality.
exact quotes from Jevons’ work, treating the bracketed items as editorial comments, and at the
same time allow us to use the word substitution in commentary with the modern sense.
By the beginning of the nineteenth century the importance of traditional deductive logic had
already gone through a major decline lasting nearly two centuries, the victim of exaggerated claims.
It had been promoted as the ultimate and unique means of finding truth and accumulating knowl-
edge, the perfect tool for scientific discovery. The reaction to these claims had been a major switch
to inductive logic, championed by Bacon, Descartes, Locke, and Playfair. The traditional logic was
viewed as rigid and sterile, whereas the inductive method, of accumulating facts to discover general
principles, was considered the hallmark of science. In England, only Oxford University had kept
classical logic as part of its university exams. And in the early part of the nineteenth century even
Oxford was considering dropping the subject.
The person who perhaps did the most to reverse this trend was Richard Whately of Oxford.
He grew up in a family of nine children in London, his father being a church minister. In 1805 he
entered Oxford University, took his B.A. from Oxford in 1808 in classics and mathematics, and in
1810 his M.A. In 1811 he was made a fellow of Oriel College, Oxford, and remained at Oxford till
1831.1 In 1826 he published Elements of Logic, a presentation of classical logic essentially based
on the Artis Logicae Compendium (1691) of Henry Aldrich.
There was really nothing new as far as the development of logic goes in Whately’s book, but
what was given was presented in a simple and clear manner. His approach to logic was strictly
traditional. He praised Aristotelian logic for having realized that the syllogism was the ultimate
form of argument, and that all correct syllogistic reasoning could be reduced to the principle of
omni et nullo, i.e., that what was true in the general situation held in the particular, and what
was false in the general situation was false in the particular. He tried to reconcile the conflict with
the supporters of inductive logic by pointing out that inductive logic really consisted of two parts,
the first concerned with collecting information to determine plausible premises, and then the use
of deductive logic to find further information based on those premises. Thus he saw harmony, not
conflict, between inductive logic and deductive logic.
In 1831 he became Archbishop of Dublin and left Oxford for good. His Elements became
the standard logic textbook at Oxford, and was popular throughout England for the rest of the
nineteenth century. By 1840 Whately could say that his book had been adopted by all the colleges
in America. It is still easily available today.
This excellent presentation of logic was an important anchor point for many future logicians. In
particular Augustus De Morgan uses it as a standard reference, George Boole uses Whately’s book
as his main source when writing his first book on logic, and Charles S. Peirce said that Whately’s
book had been his introduction to the subject of logic. Perhaps it was the elegant simplicity and
clarity of purpose of Whately’s work that invited others to try to improve on it.
The Oxford logic examination had, by this account, declined to the point of being meaningless. Whately commends the faculty for having rejected the proposal to remove
logic altogether, but now he urges them to consider requiring the exam only of those students who
wished to graduate with distinction. This way the standards would be raised for this elite group
of students, and they would receive a solid and meaningful course. In this effort at changing the
curriculum he failed.
But the book succeeded. Perhaps the most important goal of Whately was to rescue deductive
logic from the excessive burden it carried because of the extravagant promises that had been made
regarding its purpose. In the introduction he emphasized that its only goal was to deal with the
correct forms of argument, not with ascertaining the validity of the premises. And in this respect
he deemed it to be a science as much as an art, and worthy of the respect and study given to any
science. Also he brought out a new fact, the intimate connection of logic and language, how
it would be impossible to reason about a subject without language.2
Although the study of logic might not improve one’s reasoning skills, still it would provide a
tool to defend oneself against fallacious arguments of others. He particularly recommended this to
his fellow Christians since, as he said, it was clear that the shrewd opposition was indeed making
itself well acquainted with logic.
After clarifying the goal of logic in the introduction Whately has an interesting strategy for
presenting the subject.
(1) First he gives a sketchy but engaging vocabulary building overview in the chapter
Analytical Outline by working from the general nature of logical argument via examples
down to the finer structure of symbolic logic. En route he takes the opportunity to
introduce the reader to the following notions:
premiss
conclusion
syllogism
the dictum de omni et nullo of Aristotle
distributed term
quantity (universal, particular)
quality (affirmative, negative)
(2) Then in the chapter Synthetical Compendium he returns to the basic definitions and
proceeds to build step by step the structure of Aristotelian logic, in a manner that is
reminiscent of Euclid’s development of geometry.
Now let us look at some details of his presentation. After all, this text was a, if not the, launching
pad for the work of De Morgan and Boole.
In the chapter Analytical Outline Whately starts by saying that logic is the Art and Science of
reasoning, and goes on to say what logic is not. Then he points out that in all the diversity of
intellectual activities requiring reasoning, the processes of reasoning are really the same, and thus
(pages 22–23):
2. The importance of language for reasoning is stated as follows by Boole in 1847 (page 5):
The theory of Logic is thus intimately connected with that of language.
But in a postscript to the book he adds (page 81):
Language is an instrument of Logic, but not an indispensable instrument.
It would be interesting to know how Whately reconciled the importance of language with the claim by Chrysippus that a good hunting dog has basic skills in reasoning:
When running after a rabbit, the dog found that the path suddenly split in three directions. The dog sniffed the first path and found no scent; then it sniffed the second path and found no scent; then, without bothering to sniff the third path, it ran down that path.
. . . it could not but appear desirable to lay down some general rules of reasoning, applicable
to all cases, by which a person might be enabled the more readily and clearly to state the
grounds of his own conviction, or of his objection to the arguments of an opponent; instead
of arguing at random, without any fixed and acknowledged principles to guide his procedure.
If one looks at an argument in detail he says (page 23):
. . . it will be found that every conclusion is deduced, in reality, from two other propositions;
...
and (page 24):
An argument thus stated regularly and at full length, is called a Syllogism; which therefore
is evidently not a peculiar kind of argument, but only a peculiar form of expression, in which
every argument may be stated.
When one of the premises is suppressed, (which for brevity’s sake it usually is) the
argument is called an Enthymeme.
After presenting some fallacious examples of syllogisms, e.g., on page 27:
. . . every rational agent is accountable; brutes are not rational agents; therefore they are
not accountable . . .
he gives a similar argument to demonstrate the fallacy:
. . . every horse is an animal; sheep are not horses; therefore they are not animals . . .
and he says (pages 27–28):
This mode of exposing a fallacy, by bringing forward a similar one whose conclusion is
obviously absurd, is often, and very advantageously, resorted to in addressing those who are
ignorant of Logical rules; . . .
Then he gives two examples of correct syllogisms, one using (page 28):
. . . that whatever is said of the whole of a class, may be said of any thing comprehended
in that class . . .
and also (page 29)
. . . whatever is denied universally of any class may be denied of anything that is compre-
hended in that class.
These two statements comprise the dictum de omni et nullo of Aristotle, and it is asserted that
this provides the basis for all correct reasoning (pages 29–30):
On further examination it will be found, that all valid arguments whatever may be easily
reduced to such a form as that of the foregoing syllogisms; and that consequently the principle
on which they are constructed is the UNIVERSAL PRINCIPLE of Reasoning. . . . it will be
found that all the steps even of the longest and most complex train of reasoning, may be
reduced into the above form.
But it is a mistake, he says, to think that Aristotle and other logicians intended that one should
actually decompose everyday arguments into a detailed series of syllogisms.
Although some writers had ridiculed the principle of omni et nullo as being obvious, Whately
praises Aristotle’s dictum (page 32):
. . . it is the greatest triumph of philosophy to refer many, and seemingly very various,
phenomena to one, or a very few, simple principles; . . .
Then he explains the advantage of using symbols for the terms in a proposition, e.g., as in “All
A is B”, when analyzing an argument (page 35):
. . . to trace more distinctly the different steps of the abstracting process, by which any
particular argument may be brought into the most general form.
1. Richard Whately (1787–1863)
And he goes on to say that the use of symbols clarifies the connection between the premises and
the conclusion.
He turns to the importance of the distributed term, meaning a term in a proposition that
is referred to in its entirety. For example in “All A is B” the term B is not distributed as we do
not know if A refers to all or just part of B. But in the proposition “No A is B” it is clear that
one is speaking of the entirety of B. He explains that an important principle of Aristotelian logic
says that the middle term (the term that appears in both premises but not in the conclusion of a
syllogism) must be distributed in at least one of the premises if a syllogism is valid. The failure
to have a middle term that is distributed in at least one of the premises is a common feature of
fallacious arguments as Whately demonstrates by examples.
Next he talks about the quantity (universal, particular) and the quality (affirmative,
negative) of a proposition.
After this pleasant and meandering survey of some highlights of deductive logic he is ready to
turn to the detailed presentation.
In the chapter Synthetical Compendium, Part I.—Of the Operations of the Mind and of Terms,
Whately starts at the supposed beginning, the workings of the mind when reasoning, and presents
the three operations of the mind that we summarize in the following table:
A separable accident is one that need not always apply to an individual in a species, e.g., being hungry is a separable
accident of the species men. An inseparable accident always or never applies to an individual of
the species, e.g., being Chinese is an inseparable accident of the species men.
Whately goes on to point out that the classification of a term is relative to the context, and gives
the following example that we have put in table form (page 67):
red in relation to is classified as
pink genus
rose difference
blood property
house accident
He goes on to talk about the common term as an inadequate notion of the mind, and not an
object in reality (as some authors claimed).
Then he discusses the division of a term. We would today prefer the terminology subdivision
or partition. He gives the following rules for division (page 69):
1st. each of the Parts, or any of them short of all, must contain less . . . than the thing
divided. 2d. All the Parts together must be exactly equal to the thing divided; . . . 3d. The
Parts or Members must be opposed; i.e., must not be contained in another: . . .
Then there are the two different kinds of definition which we diagram as follows:
He says that in mathematics the two kinds of definitions coincide. Another partitioning of
definition is given by:
kind of definition    enumerates
accident              attributes
Whately says logic deals with nominal definitions, and on page 74 he adds:
It is scarcely credible how much confusion has arisen from the ignorance of these distinctions
which has prevailed among logical writers.
The rules for definitions are that they must be:
1. adequate
2. plainer than the thing defined
3. a convenient number of appropriate words
In Part II.—Of Propositions, Whately says (pages 75–76):
The second part of Logic treats of the proposition; which is, “Judgement expressed in words.”
A Proposition is defined logically “a sentence indicative,” i.e., affirming or denying; . . .
“Sentence” being the genus, and “Indicative” the difference, this definition expresses the
whole essence; and it relates entirely to the words of a proposition. With regard to the
matter, its property is, to be true or false.
Then he classifies sentences into categorical and hypothetical, depending on whether or not they assert without a condition.
The quality of the expression of a proposition is affirmative if the copula is affirmative, and
it is negative if the copula is negative. The quality of the matter of a proposition is either true
or false. The quality of the expression of a proposition is essential, that of the matter is accidental.
The quantity of a proposition is universal if the predicate applies to the whole of the subject,
otherwise it is particular. In universal propositions the subject is said to be distributed, and in
particular propositions the subject is not distributed.
This leads to the four kinds of (pure) categorical propositions (page 78) that we put into a
chart with examples:
symbol expresses example
A universal affirmative All A is B
E universal negative No A is B
I particular affirmative Some A is B
O particular negative Some A is not B
Fig. 4 Pure Categorical Propositions
He notes that the distribution or nondistribution of the subject depends on the quantity of
the proposition, whereas for the predicate the key is the quality of the proposition. Thus we have
(using S for subject, P for predicate)
              affirmative        negative
universal     S distributes      S and P distribute
particular    (neither)          P distributes
This is presented as two “practical” rules by Whately (page 80):
1st. All universal propositions (and no particular) distribute the subject.
2nd. All negative (and no affirmative) the predicate.
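Whately's two practical rules are mechanical enough to state as a short function (an illustrative sketch; the names are ours):

```python
def distribution(quantity, quality):
    """Whately's rules: universals (and only they) distribute the subject;
    negatives (and only they) distribute the predicate."""
    return {'subject': quantity == 'universal',
            'predicate': quality == 'negative'}

# the four categorical forms, as (quantity, quality) pairs
forms = {'A': ('universal', 'affirmative'),
         'E': ('universal', 'negative'),
         'I': ('particular', 'affirmative'),
         'O': ('particular', 'negative')}

assert distribution(*forms['E']) == {'subject': True, 'predicate': True}
assert distribution(*forms['I']) == {'subject': False, 'predicate': False}
```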
Any two different categorical propositions with the same subject and predicate are said to
be opposed. Thus, given a subject and predicate, each pair of the A, E, I, O propositions is
opposed, and although there are six pairs, it has been traditional to classify them into four kinds
of opposition as summarized in the next table:
the pair are
A,E contraries
I,O subcontraries
A,I subalterns
E,O subalterns
A,O contradictories
E,I contradictories
The word matter is not defined by Whately other than to say that it is the substance, but
seems to refer to the meaning given to the proposition by a concrete choice of subject and predicate.
Once the matter is fixed then the truth or falsity of a categorical proposition is determined.
Although the phrases necessary matter, contingent matter, and impossible matter are
also not defined, a perusal of the examples suggests the following, given a concrete subject S and
predicate P:
necessary matter: The universal affirmative (A) is true.
contingent matter: The particulars (I,O) are true.
impossible matter: The universal negative (E) is true.
This leads to the famous square of opposition, a diagram that summarizes the truth or falsity of
the matter of the various forms A,E,I,O, depending on which of the above mentioned three cases
hold:
[Square of opposition: A and E at the upper corners, joined as contraries; I and O at the lower
corners, joined as subcontraries; A above I and E above O joined as subalterns; the diagonals A,O
and E,I as contradictories. Each corner is marked with the truth value (t or f) of that form in
necessary, contingent, and impossible matter.]
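The content of the square can be stated as a truth table and the traditional relations of opposition checked mechanically. A sketch in Python (the encoding follows the three kinds of matter described above; the names are mine):

```python
# The square of opposition as a truth table: for each kind of matter,
# the truth value of the four categorical forms A, E, I, O.
MATTER = {
    "necessary":  {"A": True,  "E": False, "I": True,  "O": False},
    "contingent": {"A": False, "E": False, "I": True,  "O": True},
    "impossible": {"A": False, "E": True,  "I": False, "O": True},
}

def check_square():
    for v in MATTER.values():
        assert not (v["A"] and v["E"])                 # contraries: never both true
        assert v["I"] or v["O"]                        # subcontraries: never both false
        assert v["A"] != v["O"] and v["E"] != v["I"]   # contradictories
        assert (not v["A"]) or v["I"]                  # subalternation A -> I
        assert (not v["E"]) or v["O"]                  # subalternation E -> O
    return True

print(check_square())  # -> True
```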
Conversions that switch from universal to particular are called conversion by limitation, or Conversion
per accidens. Another form of conversion that he discusses is conversion by negation,
also called conversion by contraposition. It was not included in his source, Aldrich, but was
frequently used in Whately’s time (as it is today). He summarizes the possibilities as:
Thus, in one of these three ways, every proposition may be illatively converted: viz. E, I,
simply; A, O, by negation; A, E, limitation.
The following table shows the details of these conversions:
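The three kinds of conversion named in the quotation can be rendered mechanically; the following Python sketch uses the standard forms of the converses (the encoding and function name are mine):

```python
# The three kinds of illative conversion, applied to the traditional forms.
def convert(form, subj, pred):
    """Return the illative converses of a categorical proposition,
    as (form, subject, predicate) triples."""
    out = []
    if form in ("E", "I"):                       # simple conversion
        out.append((form, pred, subj))
    if form in ("A", "E"):                       # conversion by limitation
        out.append(("I" if form == "A" else "O", pred, subj))
    if form == "A":                              # conversion by negation
        out.append(("E", "not-" + pred, subj))   # All S is P -> No not-P is S
    if form == "O":
        out.append(("I", "not-" + pred, subj))   # Some S is not P -> Some not-P is S
    return out

print(convert("A", "X", "Y"))   # -> [('I', 'Y', 'X'), ('E', 'not-Y', 'X')]
print(convert("E", "X", "Y"))   # -> [('E', 'Y', 'X'), ('O', 'Y', 'X')]
```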
According to the Elements, these four rules are all one needs to determine if a (well-formed)
syllogism is valid. Actually Whately gives six rules, but two of them are just to check that one is
indeed dealing with a syllogism.
Armed with this he turns to the complete classification of the valid syllogisms. Given a syllo-
gism, the triple of letters describing the kind of major premiss, minor premiss, and conclusion, is
called the mood of the syllogism. For example AAA is the mood of the syllogism
All Y is X major premiss
All Z is Y minor premiss
All Z is X conclusion
There are clearly 64 (= 4 × 4 × 4) moods.
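The counting is easily checked. A short Python sketch:

```python
from itertools import product

# each of major premiss, minor premiss, conclusion is one of A, E, I, O
moods = ["".join(m) for m in product("AEIO", repeat=3)]
assert len(moods) == 64                  # 4 x 4 x 4
print(moods[:3], len(moods) * 4)         # -> ['AAA', 'AAE', 'AAI'] 256
# with the four figures, 64 x 4 = 256 candidate syllogistic forms
```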
The mood alone does not determine a syllogism, for one needs to know the location of the
middle term in the premises. Following Whately, if we have Z for the subject and X for the
predicate of the conclusion, and Y as the middle term, then one sees that there are four possible
arrangements of the order of the terms, and each of these arrangements is called a Figure:

Figure 1: Y X, Z Y    Figure 2: X Y, Z Y    Figure 3: Y X, Y Z    Figure 4: X Y, Y Z

(each premiss written with its subject first, the major premiss before the minor).
destructive conditional:
    If A is B then C is D
    C is not D
    Therefore, A is not B

disjunctive syllogism:
    A is B or C is D
    A is not B
    Therefore, C is D
He noted that a disjunctive proposition could be converted into a conditional proposition,
e.g., in this example one could change “A is B or C is D” to “If A is not B then C is D”.
constructive dilemma:
    If A is B then C is D; if X is Y then C is D
    A is B or X is Y
    Therefore, C is D

destructive dilemma:
    If A is B then C is D; if X is Y then E is F
    C is not D or E is not F
    Therefore, A is not B or X is not Y
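All of these forms can be verified by truth tables. A Python sketch of such a check (the names are mine):

```python
from itertools import product

def valid(premises, conclusion, n):
    """A schema is valid if the conclusion holds in every truth assignment
    that makes all the premises true."""
    return all(conclusion(*v)
               for v in product([False, True], repeat=n)
               if all(p(*v) for p in premises))

implies = lambda p, q: (not p) or q

# destructive conditional: If P then Q; not Q; therefore not P
assert valid([lambda p, q: implies(p, q), lambda p, q: not q],
             lambda p, q: not p, 2)

# disjunctive syllogism: P or Q; not P; therefore Q
assert valid([lambda p, q: p or q, lambda p, q: not p],
             lambda p, q: q, 2)

# constructive dilemma: if P then Q; if R then Q; P or R; therefore Q
assert valid([lambda p, q, r: implies(p, q), lambda p, q, r: implies(r, q),
              lambda p, q, r: p or r],
             lambda p, q, r: q, 3)

# by contrast, affirming the consequent is NOT valid
assert not valid([lambda p, q: implies(p, q), lambda p, q: q],
                 lambda p, q: p, 2)
```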
The final task of Whately, in this chapter that develops the classical logic, is to show that con-
ditional syllogisms can be converted into categorical syllogisms (perhaps more than one is needed),
and thus he is able to claim that the dictum de omni et nullo reigns supreme in logic.
He gives a concrete example, but we will simply take an abstract version of it, following Boole’s
idea of abbreviating the hypothetical “If A is B then C is D” into “If P then Q”. The ‘trick’ is
to take each categorical proposition P that is involved in the syllogism and replace it by a name
of the form “The situation that P holds”, turning a proposition into a class of circumstances in
which it holds. If we let P̂ denote this class, etc., then the conditional proposition “If P then Q”
becomes the categorical proposition “All P̂ is Q̂”. This same idea would be used by Boole to apply
his algebra of logic to the hypothetical propositions.
CHAPTER 2
George Peacock (1791–1858)
Peacock was an algebraist and did not work in logic. However, it is widely believed that his
attempt to give a proper foundation for algebra (in the period from 1830 to 1845) influenced Boole’s
view as to the soundness of using algebra even when one does not have an interpretation, or at
least not one where the operations are always defined.
As a student at Cambridge George Peacock was a founding member of the Analytical Society,
along with Babbage and Herschel. One of their main purposes was to change Cambridge math-
ematics from Newton’s notation to the Leibniz notation used on the continent. For more than a
hundred years England had been tied to the clumsy Newtonian notation of fluxions and fluents.
Although the Analytical Society endured for only a couple of years, from 1812 to 1814, it set out a
program that changed British mathematics. French mathematicians, especially Lacroix, Lagrange
and Laplace, had written beautiful books that captivated the young English mathematicians. One
of the first contributions of the Society was an abridged translation in 1816 of S.F. Lacroix's 3-volume
work on calculus, Traité du calcul, written in 1797–1800. (Lacroix wrote an expanded version
in the years 1810–1819). In 1817 Peacock was on the mathematics examining committee at Cam-
bridge, and he introduced the continental notation, over considerable objections. Within two years
the Newtonian notation had completely disappeared.
Peacock was a well read mathematician, and he recognized that there were serious gaps in the
development of algebra. He undertook to give the subject a careful development, in the spirit of
Euclid’s Elements. In 1830 he published Treatise on Algebra, with the main innovation being his
Principle of the Permanence of Equivalent Forms. In 1833 he presented a survey on the state
of algebra to the Royal Irish Academy in which one sees that he is quite comfortable with the
interpretation of algebra in what we now call the complex plane, although he regards it as only an
interpretation. He is also familiar with Abel’s work on the quintic, but he does not understand it.
2. For example, in late 1847 we find Cayley corresponding with Boole saying:
I wonder we should never have stumbled in our previous correspondence on the subject of my utter disbelief
of the received “English” theory of the geometrical interpretation of √−1. I would much more easily admit
witchcraft or the philosopher's stone.

Boole simply regarded √−1 as uninterpretable.
The binomial expansion

(1 + x)^n = 1 + nx + (n(n − 1)/2!) x^2 + · · · ,

where n need not be a positive integer, is obtained by his Principle from the corresponding
result of Arithmetic, where n is a positive integer.
Note that Peacock’s Principle is supposed to justify the transition from a finite sum to an infinite
series, without any need to discuss convergence. He chastises Euler for not having adopted this
Principle since he finds that it is the only way to justify the Binomial Theorem (Vol. 2, p. 452):
Euler had drawn the same conclusion, nearly in the same manner, in his celebrated proof
of the series for (1 + x)m [a footnote gives the reference: Acta Petropol., 1774], though
he at the same time denied the universal application of a principle equivalent to that of the
permanence of equivalent forms, which alone could make it valid . . .
The reader is faced with the question “What precisely does Peacock mean?” in many places.
Here are some examples of such questions:
• Vol. 1 is supposed to be the algebra for the numbers of common arithmetic. At first such
numbers are described as those which can be expressed using the digits 0, . . . ,9 (Vol. 1, p. 1):
1. Arithmetical Algebra is the science which results from the use of symbols and signs
to denote the numbers and the operations to which they may be subjected; these
numbers, or their representatives, and the operations upon them, being used with the
same sense and with the same limitations as in common arithmetic.
. . . Those numbers which are actually assigned and given, are expressed by means of
the nine digits and zero, by the aid of the artifices of ordinary arithmetical notation
...
Certainly this includes the positive numbers (it is clear that he does not include 0 or any
negative numbers), but soon fractions are declared to be numbers. There is strong evidence
that the positive rationals are what he means by the numbers of Arithmetic (Vol. 1, p. 273):
416. In the solution of the preceding problems we have generally used the word
number in its largest sense, as signifying fractional as well as whole numbers . . .
But what is one to make of his treatment of roots of numbers? (Vol. 1, p. 130):
214. The square root of a number is that number, whether expressed by a finite series
of digits or not, which multiplied by itself will produce the primitive number . . . the
square root of 10 is interminable . . .
Vol. 1 also treats infinite decimals and continued fractions, and notes that every geometric
magnitude can be expressed by a decimal (Vol. 1, p. 92):
169. It thus appears that decimals, either definite or indefinite, are competent to
express the values, not merely of commensurable magnitudes, which are multiples of
some assignable subordinate unit, but also of such as are incommensurable . . .
He says that numbers are not continuous, whereas geometric magnitudes are continuous, and
again seems to emphasize that numbers are rational (Vol. 1, p. 161):
278. Numbers, whether rational or surd, are essentially discontinuous, and in strictness
of language, are incapable of expressing as symbols the properties of continuous
magnitude . . . consequently no number can become the absolute representative of an
incommensurable magnitude.
In Vol. 2 we encounter e, π, and transcendental functions. What precisely does Peacock mean
by the numbers of Arithmetic? Is √x or e^x such an arithmetical number when x is such a
number? etc.
• What are the operations permitted in his Arithmetic? In Vol. 2 he shows, by invoking his
Principle, that in Symbolical Algebra a^(4/5) is the fifth root of the fourth power of a. Does this
make a^(4/5) an operation of Arithmetic?
• What kinds of expressions are allowed in making equations? Some of the functions he uses,
like √x and log x, are multivalued in his Symbolical Algebra. Although √(x^2) = x is true in
the positive numbers, one cannot apply his Principle here and conclude that this holds in
Symbolical Algebra (for then one would have the favorite paradox of 1 = −1).
• Since the operation of subtraction is partial in his Arithmetic (of positive numbers), what
does Peacock mean when he says an equation holds in Arithmetic? Does it have to hold at
least for all positive integers?
• Even if an equation holds, Peacock may not permit the application of his Principle to it on
the grounds that it has not been derived by proper means. Euler’s example of a series that
takes the value x when x is a positive integer, but not otherwise, is rejected by Peacock on
these grounds (Vol. 2, p. 452):
. . . he [Euler] produced, as a striking exception to [the Principle’s] truth, the very
remarkable series
(1 − a^m)/(1 − a) + (1 − a^m)(1 − a^(m−1))/(1 − a^2) + (1 − a^m)(1 − a^(m−1))(1 − a^(m−2))/(1 − a^3) + · · ·
whose sum is m, when m is a whole number, but not so for other values.
A little consideration, however, will be sufficient to shew that the principle of the
permanence of equivalent forms is not applicable to such a case: for if m be a whole
number, as in Arithmetical Algebra, the connection between m and its equivalent
series in the identical equation
m = (1 − a^m)/(1 − a) + (1 − a^m)(1 − a^(m−1))/(1 − a^2) + (1 − a^m)(1 − a^(m−1))(1 − a^(m−2))/(1 − a^3) + · · ·
is not given, or, in other words, there is no statement or definition of the operation, by
which we pass from m, on one side of the sign =, to a series under the specified form
on the other, and there is consequently no basis for the extension of the conclusion
to all values of the symbols, either by the principle of the permanence of equivalent
forms or by any other: it is only when the results, which are general in form, but
specific in value, are derived by processes which are definable and recognized, that
they become proper subjects for the application of this principle.
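Euler's series can be examined numerically. The following Python sketch (function name mine) sums the series as written above; for whole m a numerator factor eventually vanishes and the partial sums reach exactly m, while for other values of m they do not:

```python
def euler_series(m, a, terms=30):
    """Partial sum of Euler's series: the k-th term is
    (1-a^m)(1-a^(m-1))...(1-a^(m-k+1)) / (1-a^k)."""
    total, num = 0.0, 1.0
    for k in range(1, terms + 1):
        num *= 1 - a ** (m - (k - 1))   # numerator picks up one factor per term
        total += num / (1 - a ** k)
    return total

print(euler_series(3, 0.5))    # -> 3.0  (whole m: the series terminates at m)
print(euler_series(2.5, 0.5))  # far from 2.5: the identity fails, as Euler noted
```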
Except for possibly giving Boole a green light to proceed with his algebra of logic, Peacock’s
work seems to have had little lasting influence on mathematics. His fundamental principle was
too vague to offer a foundation—any reasonable attempt to make it precise leads to results
that contradict our basic structure of complex numbers, and functions of complex numbers.
CHAPTER 3
Augustus De Morgan (1806–1871)
Augustus De Morgan was born in India, the fifth child of a Colonel in the British Army. His
father was the third generation of a family of army officers working with the East India company,
but Augustus had a defective right eye and could not follow in this family tradition. He was
brought to England in his infancy, attended some ordinary private schools, and acquired a fine
classical education. At 16 he entered Trinity College, Cambridge, and soon took up mathematics
and philosophy as his main interests. As a student he was involved with the movement to replace
the Newtonian notation in British analysis with the Leibnizian notation used on the Continent. In
1827 he obtained his B.A., but was not permitted to hold a fellowship or pursue the M.A. as he
was not willing to subscribe to the required religious affirmations. He always considered himself a
‘Christian unattached’.
Fortunately the newly founded University of London was nondenominational, and De Morgan,
at the age of 22, competed successfully (against 33 other candidates) to win the mathematical
professorship. He would spend most of the rest of his life there.1 He was regarded as a brilliant
teacher, and a great believer in the importance of a strong foundation in logic when pursuing
mathematics. His students included Sylvester, Lady Lovelace, and Jevons.
De Morgan’s work in logic2 can be summarized as a single-minded quest to improve the syllogism
as the main instrument of reasoning, keeping in mind that the truths embodied in the accepted
inferences of Aristotelian logic should be preserved. At the same time that he is anchoring his
logical investigations on the syllogism he will occasionally question the adequacy of the underlying
propositions to faithfully capture the assertions of the common language. And on one occasion he
even questions the adequacy of the syllogism itself to capture inference.
But De Morgan does not question the fact that logic is concerned with propositions ϕ(X, Y)
that relate X and Y. Nor does he challenge the form of the categorical proposition
except to allow certain conjunctions of these propositions (called complex propositions). Nor does
he depart from the form of the syllogism
premise(X,Y)
premise(Y,Z)
conclusion(X,Z).
The main thrust of his work is to extend the possibilities for the components of the categor-
ical proposition, combining this with developing compact symbolic notation for propositions and
syllogisms. The following diagram shows De Morgan’s main ideas for modifying the categorical
proposition:
1. In 1831 he resigned because of the way the governing body was treating the faculty, and lived ‘in the wilderness’
for five years. But after five years his successor at the University of London died in an accident and De Morgan was
unanimously recalled.
2. Including his book he published about 800 pages on logic. This work took place during the years 1839 to 1863,
but mainly between 1846 and 1850.
reader to decide if Hamilton’s charge was justified. Here are some samples of Sir Hamilton’s prose
taken from De Morgan’s Appendix (pages 355, 363):
In reply to your letter in the last number of the Athenaeum:—you were not wrong to abandon your
promise “of trying the strength of my position;” for never was there a weaker pretension than that,
by you, so suicidally maintained.
..
.
I disregard your misrepresentation that “I avenge myself for the retraction of my aspersion on your
integrity by my copious and slashing criticisms on your intellect.” When your (excusable) irritation
has subsided, you will see that I could only secure you from a verdict of plagiarism by bringing you
in as suffering under an illusion.
Boole, a school teacher correspondent of De Morgan, became so intrigued in the spring of 1847
by this controversy that he was led to recall some of his own ideas on logic from his teenage years.
And soon he too was writing a book on the subject. No doubt it was because of the spat with
Hamilton that De Morgan wrote to Boole in the summer of 1847 asking him not to send De Morgan
a draft of the logic book that Boole was working on until De Morgan had finished his own. In the
introduction to Boole’s book, also published in November of 1847, we find the following quote
attributed to Hamilton:
[The pursuits of the mathematician] have not only not trained him to that acute scent, to that
delicate, almost instinctive, tact which, in the twilight of probability, the search and discrimination
of its finer facts demand; they have gone to cloud his vision, to indurate his touch, to all but the
blazing light, the iron chain of demonstration, and left him out of the narrow confines of his science,
to a passive credulity in any premises, or to an absolute incredulity in all.
Augustus De Morgan was a transitional logician, educated in the traditional logic that was
solidly based on the Aristotelian syllogism, active in the reform of logic, and supportive of the
new developments (of Boole) in logic. He is not particularly noted for his mathematical achieve-
ments, but he was a prolific writer, his literary and mathematical output (books, papers, reviews,
pamphlets, etc.) being possibly the greatest among the mathematicians of his time.
He married Sophia Elizabeth Frend, the daughter of a fellow mathematician, and they had
seven children. De Morgan described himself as the most ‘ungregarious animal living’, and refused
to seek election to the Royal Society, declined the offer of an honorary doctorate from Edinburgh,
saying ‘he did not feel like an LL.D.’, refused his former students’ request to have his portrait
painted for presentation to the University College, and prided himself on the fact that he never
voted, reflecting his general attitude toward politics. He resigned from the University for the second
and last time in 1866.5
1.1. Objects, Ideas, Names. De Morgan has three categories of things that one can refer to
in a proposition, namely objects, ideas, and names. Objects are in the external real world. Ideas
are representations, in the mind, of objects or other things. And names are labels we provide for
ideas. In a proposition he says that the terms should both refer to objects, or both refer to ideas,
or both refer to names,7 and that the general theory developed (that is, the immediate inferences
and syllogisms) does not depend on which of these instances one has in mind. We will simplify this
part of his system by simply saying that the terms in a proposition refer to classes of things.
1.2. Names for Contraries. His first change to Aristotelian logic is to extend the classes
for which there are names to include complements of named classes. On page 38 he introduces
his upper case/lower case notation for contraries, the contrary of A being a. 8 As this is the one
6. Unfortunately the invention of a notation for the denial of a proposition disappears after this brief chapter. In
subsequent work he uses ‘. . . denies F’. Thus a promising beginning to Boolean combinations of propositions is not
pursued.
7. In 1850 he challenges this requirement ([5], pages 59–60).
8. This convention of using upper case roman font and lower case italic font will be adopted by Jevons, with credit
to De Morgan. But, in reality, after this introduction De Morgan uses the roman font for both upper case and lower
case symbols throughout the rest of his book. And in his several papers on the syllogism he uses the italic font for
both upper and lower case symbols.
alteration of De Morgan's with which the writings of Aristotle explicitly disagree, he gives reasons for
this decision (pages 40–41):
. . . make it desirable to include in a formal treatise the most complete consideration of all proposi-
tions, with reference not only to their terms, but also to the contraries of those terms.
..
.
It may be objected that the introduction of terms which are merely negations of the positive ideas
contained in other terms is a species of fiction. I answer, that, first, the fiction, if it be a fiction, exists
in language, and produces its effects: nor will it easily be proved more fictitious than the invention of
sounds to stand for things. But, secondly, there is a much more effective answer, which will require
a little development.
When writers on logic, up to the present time, use such contraries as man and not-man, they
mean by the alternative, man and everything else. There can be little effective meaning, and no use,
in a classification which, because they are not-men, includes in one word, not-man, a planet and a
pin, a rock and a featherbed, bodies and ideas, wishes and things wished for. But if we remember
that in many, perhaps most, propositions, the range of thought is much less extensive than the whole
universe, commonly so called, we begin to find that the whole extent of the subject of discussion is,
for the purpose of discussion, what I have called a universe, that is to say, a range of ideas which is
either expressed or understood as containing the whole matter under consideration. In such universes,
contraries are very common: that is, terms each of which excludes every case of the other, while both
together contain the whole.
After pointing out in such cases that a term and its contrary can both be of interest he says:
Accordingly, of two contraries, neither must be considered as only the negation of the other: except
when the universe in question is so wide, and the positive term so limited, that the things contained
under the contrary name have nothing but the negative quality in common.
And later De Morgan says that Aristotle’s opposition to names for contraries was likely because
he had not considered a limited universe (page 128):
Aristotle will have no contrary terms: not-man, he says, is not the name of anything. He afterwards
calls it an indefinite or aorist name, because, as he asserts, it is both the name of existing and non-
existing things . . . I think, however, that the exclusion was probably dictated by the want of a definite
notion of the extent of the field of argument, which I have called the universe of the propositions.
Adopt such a definite notion, and, as sufficiently shown, there is no more reason to attach the mere
idea of negation to the contrary, than to the direct term.
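De Morgan's picture of contraries within a limited universe is just relative complementation. A Python sketch with an illustrative toy universe (the example classes are mine):

```python
# A contrary only makes sense relative to a universe of discourse.
universe = {"spaniel", "terrier", "tabby", "siamese"}
D = {"spaniel", "terrier"}    # a term D
d = universe - D              # De Morgan's contrary d

assert D | d == universe      # "both together contain the whole"
assert D & d == set()         # "each ... excludes every case of the other"
print(sorted(d))              # -> ['siamese', 'tabby']
```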
1.3. Generalizing the Copula. After introducing names for contraries De Morgan turns to
analyze the copula. He differs from the usual literature by having a term refer to instances rather
than the whole class. ‘X is Y’ does not assert a relation between two classes X and Y, but rather
the letters X and Y refer to instances (i.e., elements) of the respective classes. Thus ‘X is Y’ really
means ‘member-of-X is member-of-Y’. (In the traditional logic the relation ‘is’ is that of identity.)
With this one sees that ‘All X is Y’ means ‘All members of X are members of Y’. Hence we see
De Morgan writing ‘All Xs are Ys’.
He distinguishes his interpretation of ‘X is Y’ from the usual in the following passage (page 48):
[logicians] would rather draw their language from the idea of two areas, one of which is larger than
the other, than from two collections of indivisible units, one of which is in number more than the
other.
We tend to think of one area being contained in the other precisely when every point of the first is
a point of the second. Perhaps a better example would have been to take two simple closed curves
in the plane with one inside the other. And his statement about ‘indivisible units, one of which is
in number more than the other’ has to be understood as ‘subset of’. His emphasis on the ‘number’
of indivisible units is surely misleading. He goes on to say:
I shall take particular care to use numerical language, as distinguished from magnitudal, throughout
this work, introducing of course, the plurals Xs, Ys, Zs, &c.
He means that he will emphasize the role of the elements of the classes. We will assume that the
following table expresses De Morgan’s understanding of categorical propositions in a more modern
syntax:
proposition modern version
All X is Y (∀α ∈ X)(∃β ∈ Y) (α is β)
No X is Y (∀α ∈ X)(∀β ∈ Y) (α is not β)
Some X is Y (∃α ∈ X)(∃β ∈ Y) (α is β)
Some X is not Y (∃α ∈ X)(∀β ∈ Y) (α is not β)
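These four readings can be written out with the copula left as a parameter, which is exactly the freedom De Morgan exploits below. A Python sketch (function names mine):

```python
# De Morgan's reading of the four categorical forms, with the copula 'is'
# left as a parameter R, a relation between individuals.
def all_is(X, Y, R):      return all(any(R(a, b) for b in Y) for a in X)
def no_is(X, Y, R):       return all(not R(a, b) for a in X for b in Y)
def some_is(X, Y, R):     return any(R(a, b) for a in X for b in Y)
def some_is_not(X, Y, R): return any(all(not R(a, b) for b in Y) for a in X)

identity = lambda a, b: a == b    # the traditional copula

X, Y = {1, 2}, {1, 2, 3}
assert all_is(X, Y, identity)         # All X is Y
assert some_is(Y, X, identity)        # Some Y is X
assert some_is_not(Y, X, identity)    # Some Y is not X
assert not no_is(X, Y, identity)
```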
De Morgan turns to the question of which properties of ‘is’ are actually needed for inference,
and says there are just three (page 50):
(1) ‘X is Y’ implies ‘Y is X’
(2) ‘X is Y’ and ‘X is Z’ imply ‘Y is Z’
(3) ‘X is not Y’ is the contradictory of ‘X is Y’
These three properties hold precisely for the symmetric and transitive binary relations R, where
the third property just means we take ‘X is not Y’ to be not (XRY). De Morgan will use (2) as
though it were the transitive law. No doubt the wording of (2) was inspired by an axiom from
Euclid, namely that ‘things equal to the same are equal to one another’.
The necessity of the first property follows from preserving the Aristotelian inference called
conversion of a particular—‘Some X is Y’ implies ‘Some Y is X’—by applying this inference when
X and Y denote singletons (De Morgan calls such propositions doubly singular). Likewise from
the 3rd Figure AAI one has property (2). (Using the 1st Figure AAA gives the transitive property.)
Thus any copula that preserves the Aristotelian inferences must satisfy (1)–(3).
He says any relation satisfying these three properties will serve equally well as a copula (page
51):
. . . we have power to invent new meanings for all the forms of inference, in every way in which we
have power to make meanings of is and is not which satisfy the above conditions.
Indeed his three properties are also sufficient to preserve the inferences of Aristotelian logic, but he
leaves the justification of this fact entirely to the reader. As an example of new meanings he gives
the following (page 51):
For instance, let X, Y, Z, each be the symbol attached to every instance of a class of material objects,
let is placed between two, as in “X is Y” mean that the two are tied together, say by a cord, and let
X be considered as tied to Z when it is tied to Y which is tied to Z, &c. There is no syllogism but
what remains true under these meanings.
Then he goes on to say that for some syllogisms one does not need all three properties of ‘is’
mentioned above:
Thus in the most common case of all, “Every A is B, every B is C, therefore every A is C,” of all the
three conditions only the second is wanted to secure the validity of this case.
Here De Morgan is careless as the second property does not express transitivity. He needs the first
property (symmetry) along with the second property to obtain the transitive property of ‘is’. He
will make a similar error later on the same page with an example that is transitive but does not
satisfy the second condition (even though he claims it does).
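De Morgan's slip is easy to confirm by brute force: on a three-element universe one can search all 512 binary relations for one satisfying property (2) but not transitivity, and check at the same time that (1) and (2) together do force transitivity. A Python sketch:

```python
from itertools import product

U = [0, 1, 2]
PAIRS = [(x, y) for x in U for y in U]
relations = [{p for p, b in zip(PAIRS, bits) if b}
             for bits in product([0, 1], repeat=len(PAIRS))]

def symmetric(R):   # property (1): 'X is Y' implies 'Y is X'
    return all((y, x) in R for (x, y) in R)

def prop2(R):       # property (2): 'X is Y' and 'X is Z' imply 'Y is Z'
    return all((y, z) in R for (x, y) in R for (w, z) in R if x == w)

def transitive(R):
    return all((x, z) in R for (x, y) in R for (w, z) in R if y == w)

# property (2) alone does not give transitivity ...
witness = next(R for R in relations if prop2(R) and not transitive(R))
print(sorted(witness))

# ... but together with symmetry it does
assert all(transitive(R) for R in relations if symmetric(R) and prop2(R))
```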
His claim that the three properties are sufficient for all the (valid) forms of inference is correct if
one only considers Aristotelian inferences, but he will soon develop his logic that permits names for
contraries in categorical propositions (see §1.4). Although De Morgan does not say what constitutes
an acceptable copula for his new system, we assume that it must yield the valid inferences obtained
when using the copula ‘is identical to’. Presumably in Formal Logic De Morgan believes that the
properties (1)–(3) of the copula will continue to be sufficient in his new logic. 9
In his 1850 paper On the Syllogism II, De Morgan drops the third property from his requirements
of the copula, and changes the second property to the transitive property ([5], page 51). Further-
more he realizes that a symmetric and transitive copula will not yield all the valid inferences of his
new system, so he adds (without any explanation) the new requirement:10
‘X is Y’ or ‘X is y’ should hold for any X.
We take this to mean
(∀α ∈ X)[(∃β ∈ Y)(α is β) or (∃β ∈ y)(α is β)].
This can be replaced by the simpler condition
(∀α ∈ U)(∃β ∈ U)(α is β),
U being the universe.
De Morgan never realizes that adding his new condition to the symmetric and transitive prop-
erties is equivalent to adding the reflexive property11 (∀α ∈ U)(α is α), or, in De Morgan’s mode
of expression, ‘All X is X’. Thus he is essentially claiming by 1850 that equivalence relations are
the appropriate abstract copulas for his extension of categorical logic.
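The claim that, for symmetric and transitive relations, the new condition amounts to reflexivity can be checked exhaustively on a small universe. A Python sketch:

```python
from itertools import product

U = [0, 1, 2]
PAIRS = [(x, y) for x in U for y in U]

def symmetric(R):  return all((y, x) in R for (x, y) in R)
def transitive(R): return all((x, z) in R for (x, y) in R for (w, z) in R if y == w)
def serial(R):     return all(any((x, y) in R for y in U) for x in U)  # the 1850 condition
def reflexive(R):  return all((x, x) in R for x in U)

def new_condition_equals_reflexivity():
    """For every symmetric, transitive relation on U, De Morgan's new
    condition holds exactly when reflexivity does."""
    for bits in product([0, 1], repeat=len(PAIRS)):
        R = {p for p, b in zip(PAIRS, bits) if b}
        if symmetric(R) and transitive(R) and serial(R) != reflexive(R):
            return False
    return True

print(new_condition_equals_reflexivity())  # -> True
```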
Unfortunately De Morgan fails to see that the only copula that gives the valid inferences of his
new system is the original relation of identity. Indeed, if one just adds to the Aristotelian system
the fact that each of the two propositions
‘No X is Y’ and ‘All X is y’
can be inferred from the other then the copula is forced to be the identity relation. 12 We show this
as follows.
The symmetric and transitive properties follow from preserving the Aristotelian inferences. We
derive the reflexive property from the above equivalence by first assuming α ∈ U is not related by
the copula to any β ∈ U. Let X = y = {α}. Then ‘No X is Y’ holds, so we can infer ‘All X is y’
holds, and this yields ‘α is α’, contradicting our assumption. Thus (∀α ∈ U)(∃β ∈ U)(α is β) must hold. Now given α choose β
such that ‘α is β’ holds. By symmetry ‘β is α’ holds, and then by transitivity ‘α is α’ holds. Thus
the copula must be reflexive.
Next we show that the copula must be the identity relation. Suppose α ≠ β, and let X = {α}
and Y = {β}. Then ‘All X is y’ holds (by the reflexive property), so ‘No X is Y’ holds. Thus ‘α is
not β’ holds. This finishes our proof.
In the book Formal Logic De Morgan does not introduce a symbol for the underlying binary
relation ‘is’. Even in the most popular case of equality (=) he does not use a symbol until 1860. 13
9. He introduces the idea of names for contraries on page 37, before discussing the abstract copula; but the valid
inferences, using names for contraries, are presented after this discussion. And there is no further discussion in Formal
Logic of the nature of the abstract copula.
10. His phrasing of the condition is as follows, where ‘—’ is the copula ([5], page 52):
When contraries are introduced, the copula condition further required is that either X—Y or X—y
should hold for any X.
11. This awkwardness in dealing with the reflexive property will manifest itself later in the work of Jevons.
12. One easily sees that his example with objects tied together could satisfy only one of these two simple
propositions: if no object in X is tied to any object in Y then one cannot conclude that every object in X is
tied to some object in the complement of Y.
13. He does use the usual equality symbol (=) in the sense of ‘is defined as’ or, with propositions, as ‘equivalent
to’. But unfortunately he often uses it in the sense of ‘implies’, for example, in his version (on page 88) of the AAA
syllogism ‘X)Y + Y)Z = X)Z’.
In his 1858 paper On the Syllogism III, page 87, he says that the notation A+B=C for ‘A and B imply C’ is
32 3. AUGUSTUS DE MORGAN (1806–1871)
1.4. Simple Propositions. By admitting contrary terms De Morgan quadruples the number
of categorical propositions. This gives his simple propositions, namely one has (for two symbols
X,Y) the following 32 possibilities:
[All | Some] [X | x] [is | is not] [Y | y]
[All | Some] [Y | y] [is | is not] [X | x]
This notation means that one can choose either of the possibilities in each of the bracketed items.
Here the lower case/upper case letters are contraries, that is, x means not-X in De Morgan’s
notation.
One of the chief goals of De Morgan’s book is to determine the valid simple syllogisms, and to
give a complete set of rules for this purpose. With 32 ways to make a simple proposition this gives
a total of 32³ = 32,768 simple syllogisms to consider. The first step towards this classification
is to determine which of the simple propositions are semantically the same, and to select just one
representative from each equivalence class.14
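This classification can be checked mechanically. The sketch below is our own illustration, not De Morgan's: it encodes the 32 simple propositions as form/contrary/transposition tuples and groups them by their meaning over "region patterns", records of which of the four regions X∩Y, X∩y, x∩Y, x∩y are nonempty. For monadic properties such patterns capture equivalence over all universes, and the restricted semantics (every term and its contrary nonempty) is imposed on the patterns. The grouping should yield eight classes of four propositions each.

```python
from itertools import product

# The 32 simple propositions based on X, Y: a form (A/E/I/O), a contrary
# flag on the subject and on the predicate, and a transposition flag
# (0: subject letter X, predicate letter Y; 1: the reverse order).
PROPS = [(f, s, p, t) for f in "AEIO" for s in (0, 1)
         for p in (0, 1) for t in (0, 1)]

# An atom records membership in (X, Y); a pattern lists the atoms that
# are nonempty.  Restricted semantics: X, x, Y, y all nonempty.
ATOMS = [(1, 1), (1, 0), (0, 1), (0, 0)]
PATS = []
for bits in product((0, 1), repeat=4):
    atoms = [a for a, b in zip(ATOMS, bits) if b]
    if all(any(a[k] == v for a in atoms) for k in (0, 1) for v in (0, 1)):
        PATS.append(tuple(atoms))

def holds(q, atoms):
    f, s, p, t = q
    subj = lambda a: (a[1] if t else a[0]) ^ s  # membership in subject term
    pred = lambda a: (a[0] if t else a[1]) ^ p  # membership in predicate term
    if f == "A":
        return all(pred(a) for a in atoms if subj(a))
    if f == "E":
        return not any(subj(a) and pred(a) for a in atoms)
    if f == "I":
        return any(subj(a) and pred(a) for a in atoms)
    return any(subj(a) and not pred(a) for a in atoms)  # form O

# Group the 32 propositions by meaning: each group is an equivalence class.
by_meaning = {}
for q in PROPS:
    key = frozenset(p for p in PATS if holds(q, p))
    by_meaning.setdefault(key, []).append(q)
classes = list(by_meaning.values())
```
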
Let us say that two propositions are equivalent if each can be inferred from the other, under
the restricted semantics—De Morgan says ‘is the same as’ instead of ‘is equivalent to’. A proposition
FXY is convertible if FXY is equivalent to FYX. Likewise for FXy, etc. Otherwise a proposition
is inconvertible (see page 59).
There are eight equivalence classes among the 32 propositions (with reference to the symbols
XY)—they are given in the rows below, with the representative of each class listed first:
All X is Y:       All y is x,       No X is y,        No y is X
All x is y:       All Y is X,       No x is Y,        No Y is x
No X is Y:        No Y is X,        All X is y,       All Y is x
No x is y:        No y is x,        All x is Y,       All y is X
Some X is Y:      Some Y is X,      Some X is not y,  Some Y is not x
Some x is y:      Some y is x,      Some x is not Y,  Some y is not X
Some X is not Y:  Some y is not x,  Some X is y,      Some y is X
Some x is not y:  Some Y is not X,  Some x is Y,      Some Y is x
Note that the representatives are distinguished by the fact that either
both X and Y occur, in that order; or their contraries x and y occur, in that order. De Morgan
introduces eight forms for his simple propositions, namely
A₁ A′ E₁ E′ I₁ I′ O₁ O′
He says A₁ is to be read ‘sub-A’ and A′ as ‘super-A’, etc. He refers to the ‘sub’ and ‘super’ as the
prepositions of the form.
Here is the list of the simple propositions used by De Morgan (with reference to the symbols
XY) and his symbolic abbreviations:
Proposition   Abbrev.   Expresses          Modern Symbolic Rendering
A₁ XY         X)Y       Every X is Y       X ⊆ Y
A′ XY         x)y       Every x is y       X′ ⊆ Y′
O₁ XY         X:Y       Some X is not Y    X ∩ Y′ ≠ Ø
O′ XY         x:y       Some x is not y    X′ ∩ Y ≠ Ø
E₁ XY         X.Y       No X is Y          X ∩ Y = Ø
E′ XY         x.y       No x is y          X′ ∩ Y′ = Ø
I₁ XY         XY        Some X is Y        X ∩ Y ≠ Ø
I′ XY         xy        Some x is y        X′ ∩ Y′ ≠ Ø
By using representative forms the number of simple syllogisms to consider is reduced from
32,768 to 8³ = 512, and the number of those that are valid is only 48. This is a very manageable
number.
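Both counts can be reproduced by brute force. In the sketch below (our own illustration, not De Morgan's method) a model is abstracted to the set of nonempty regions that X, Y, Z cut out of the universe, which is exact for monadic propositions. The restricted semantics requires each term and its contrary to be nonempty; dropping that requirement gives the modern semantics, under which, per a later footnote, only the 24 fundamental syllogisms survive.

```python
from itertools import product

# An atom records membership in (X, Y, Z); a pattern is the set of atoms
# that are nonempty in some universe.
ATOMS = list(product((0, 1), repeat=3))

def patterns(restricted):
    result = []
    for bits in product((0, 1), repeat=8):
        atoms = [a for a, b in zip(ATOMS, bits) if b]
        if not atoms:
            continue
        if restricted and not all(
                any(a[k] for a in atoms) and any(not a[k] for a in atoms)
                for k in range(3)):
            continue
        result.append(atoms)
    return result

def forms(i, j):
    # De Morgan's eight simple forms on the term pair (i, j):
    # A1, A', O1, O', E1, E', I1, I'.
    return [
        lambda p: all(a[j] for a in p if a[i]),                # every X is Y
        lambda p: all(not a[j] for a in p if not a[i]),        # every x is y
        lambda p: any(a[i] and not a[j] for a in p),           # some X is not Y
        lambda p: any(not a[i] and a[j] for a in p),           # some x is not y
        lambda p: not any(a[i] and a[j] for a in p),           # no X is Y
        lambda p: not any(not a[i] and not a[j] for a in p),   # no x is y
        lambda p: any(a[i] and a[j] for a in p),               # some X is Y
        lambda p: any(not a[i] and not a[j] for a in p),       # some x is y
    ]

def count_valid(restricted):
    pats = patterns(restricted)
    count = 0
    for F, G, H in product(forms(0, 1), forms(1, 2), forms(0, 2)):
        if all(H(p) for p in pats if F(p) and G(p)):
            count += 1
    return count
```
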
Note that in the expressions in the third column for the propositions F₁ XY the symbols X and
Y appear; and for the propositions F′ XY the contrary symbols x and y appear. De Morgan prefers
to write just A₁ instead of our A₁ XY, etc. This makes for compact tables, but the reader has to
keep track of the symbols being used with the form. Our more detailed version should make it
clear that A₁ yZ means ‘All y is Z’, that A′ yZ means ‘All Y is z’, etc.
The four simple propositions with the lower strokes are the usual categorical propositions of
Aristotelian logic (with reference to XY). Actually De Morgan presents his eight simple propositions
before giving the equivalences among the 32 original forms. After noting that ‘No X is Y’ and ‘Some
X is Y’ are convertible De Morgan omits one member from each equivalence class when he presents
the equivalences, using his abbreviations above, in the following table (page 61), and says the reader
should make a careful study of it. Here he uses the symbol ‘=’ to express ‘has the same meaning
as’, so it is just semantic equivalence:
To aid the reader in understanding these relationships De Morgan makes use of graphic aids
like the following for A (page 61):
A U U U U U U U U U U U U
X X X X X x x x x x x x
Y Y Y Y Y Y Y Y y y y y
This says that there are 12 things in the universe U (the number of columns labelled U), and, of
those, 5 are in X and 8 are in Y. Here we see that X)Y is true (every occurrence of an X corresponds
to an occurrence of a Y), and furthermore so are X.y and y)x.
On page 63 De Morgan summarizes the relations between the eight forms. This seems to be
his version of the Square of Opposition—perhaps we should call it the Table of Opposition for the
simple propositions. Only the forms are given, the symbols being assumed the same for all:
Thus, looking at the first line, we see that A₁ XY contradicts any one of O₁ XY, E₁ XY, and E′ XY;
it implies both I₁ XY and I′ XY; and it neither implies nor contradicts each of A′ XY and O′ XY.
Also one sees the symmetry that the introduction of contrary names gives: by shifting the strokes
(upper to lower, and vice-versa) one transposes the first two lines of the table as well as the last
two lines; and by permuting O with I and A with E one transposes the first and third as well as
the second and fourth lines of the table.
1.5. Transformations of Simple Propositions. To give more insight into equivalent propo-
sitions De Morgan notes (pages 63–64) that there are certain natural ways to change a proposition
into another proposition:15
Transformation Means
S change the subject to its contrary
P change the predicate to its contrary
T transpose the symbols
F switch positive and negative
L do nothing
These transformations, and their compositions, are permutations of the 32 propositions (based
on XY). They generate a group of 16 permutations. Essentially De Morgan says the compositions
F, L, P, S, T, FT, PF, PFT, PT, SF, SFT, SP, SPF, SPT, SPFT, ST
15 De Morgan actually uses the letters F, L, P, S, T. This overlaps with his use of capital roman letters for terms,
and with our use F, . . . , L. So we have taken the liberty of changing these transformations to bold type.
1. FORMAL LOGIC (1847) 35
give the 16 permutations in this group. After observing that the generators are of order two he introduces
the 16 compositions above in the following equations (page 64):
P = F, SP = SF, PF = L, SPF = S
ST = FT, SPT = FPT, SFT = T, SPFT = PT
This is not a presentation of the group, but rather an assertion that the equated pairs of permu-
tations yield equivalent results when applied to a proposition. Thus, for example, from the second
equation one has (A₁ XY)SP is equivalent to (A₁ XY)SF, that is, A₁ xy is equivalent to E₁ xY.
These results are combined with others (described below) into two tables (pages 64–65):
L   T    SP   SPT   L        P   PT     S     ST   P
PF  SFT  SF   PFT   PF       F   SPFT   SPF   FT   F
One recognizes the original equations as the columns of these tables (with two columns repeated;
and he has replaced FPT by PFT). The second table can be obtained from the first by multiplying
through by P. Any two (row) adjacent compositions separated by a double line yield equivalent
propositions when applied to a convertible proposition; and those separated by single lines yield
equivalent propositions when applied to an inconvertible proposition. Thus, for example, in the
second table, SPF and FT are adjacent and separated by a double line, so applying them to the
convertible proposition E₁ XY gives equivalent results, namely A₁ xy and A₁ YX.
After some discussion about these tables he has the following rather puzzling passage (page
65):
It appears, then, that any change which can be made on a proposition, amounts in effect to L, P,
S, or PS. This is another verification of the preceding table: for all our forms may be derived from
applying those which relate to XY in the cases of Xy, xY, and xy.
Unfortunately he does not follow through and apply these permutations to the study of syllo-
gisms. They give (1) a simple way to determine if a particular syllogism is valid, and (2) a routine
and fast method of generating a complete catalog of valid syllogisms. Furthermore they are easily
extended to apply to the complex propositions that De Morgan introduces. (See Appendix 0 for
details.)
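The transformations and the eight equations lend themselves to direct verification. The sketch below is our own encoding, not De Morgan's notation: propositions are form/contrary/transposition tuples, a word such as SP is applied left to right as in De Morgan's postfix usage, and two propositions are compared semantically over patterns of nonempty regions. It generates the permutation group from S, P, T, F and checks each of the eight equations on all 32 propositions.

```python
from itertools import product

PROPS = [(f, s, p, t) for f in "AEIO" for s in (0, 1)
         for p in (0, 1) for t in (0, 1)]

def S(q):  # change the subject to its contrary
    f, s, p, t = q
    return (f, 1 - s, p, t)

def P(q):  # change the predicate to its contrary
    f, s, p, t = q
    return (f, s, 1 - p, t)

def T(q):  # transpose the symbols
    f, s, p, t = q
    return (f, p, s, 1 - t)

def F(q):  # switch positive and negative forms
    f, s, p, t = q
    return ({"A": "E", "E": "A", "I": "O", "O": "I"}[f], s, p, t)

OPS = {"S": S, "P": P, "T": T, "F": F, "L": lambda q: q}

def apply_word(q, word):  # (q)SP means: apply S first, then P
    for ch in word:
        q = OPS[ch](q)
    return q

# All permutations of the 32 propositions induced by short words in S,P,T,F.
words = ["".join(w) for n in range(5) for w in product("SPTF", repeat=n)]
perms = {tuple(apply_word(q, w) for q in PROPS) for w in words}

# Semantic comparison over patterns of nonempty regions X∩Y, X∩y, x∩Y, x∩y.
ATOMS = [(1, 1), (1, 0), (0, 1), (0, 0)]
PATS = [tuple(a for a, b in zip(ATOMS, bits) if b)
        for bits in product((0, 1), repeat=4) if any(bits)]

def holds(q, atoms):
    f, s, p, t = q
    subj = lambda a: (a[1] if t else a[0]) ^ s
    pred = lambda a: (a[0] if t else a[1]) ^ p
    if f == "A":
        return all(pred(a) for a in atoms if subj(a))
    if f == "E":
        return not any(subj(a) and pred(a) for a in atoms)
    if f == "I":
        return any(subj(a) and pred(a) for a in atoms)
    return any(subj(a) and not pred(a) for a in atoms)

def meaning(q):
    return frozenset(i for i, pat in enumerate(PATS) if holds(q, pat))

# De Morgan's eight equations: the equated words give equivalent results.
EQNS = [("P", "F"), ("SP", "SF"), ("PF", "L"), ("SPF", "S"),
        ("ST", "FT"), ("SPT", "FPT"), ("SFT", "T"), ("SPFT", "PT")]
```
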
1.6. Complex Propositions. Before pursuing simple syllogisms De Morgan introduces seven
complex forms:
C  C₁  C′  D  D₁  D′  P.
The complex propositions (with respect to XY) are as follows (he uses + for coexists with,
which means ‘and’):
Proposition   Definition                       Modern Symbolic Rendering
C XY          E₁ XY + E′ XY                    X = Y′
C₁ XY         E₁ XY + I′ XY                    X ⊂ Y′
C′ XY         E′ XY + I₁ XY                    X′ ⊂ Y
D XY          A₁ XY + A′ XY                    X = Y
D₁ XY         A₁ XY + O′ XY                    X ⊂ Y
D′ XY         A′ XY + O₁ XY                    X′ ⊂ Y′
P XY          I₁ XY + I′ XY + O₁ XY + O′ XY    X ∩ Y ≠ Ø ∧ X ∩ Y′ ≠ Ø ∧ X′ ∩ Y ≠ Ø ∧ X′ ∩ Y′ ≠ Ø
De Morgan simply writes C = E₁ + E′, etc. Any two of these complex propositions contradict each
other, so their ‘table of opposition’ is trivial. For FXY a complex proposition and GXY a simple
proposition either GXY follows from FXY, or GXY contradicts FXY.16 And these seven propositions
are, up to equivalence, the only conjunctions of simple propositions with this property.17
De Morgan’s reasons for introducing complex propositions are: (1) complex propositions are
what one uses in everyday speech, for example, when one says ‘Some of the responsibility is mine’
one means some, but not all; (2) valid complex syllogisms are easier to classify than the valid simple
syllogisms; and (3) one can use the classification of the valid complex syllogisms to determine the
valid simple syllogisms.
De Morgan notes that every complex proposition is a conjunction of simple propositions (as
in the table above), and every simple proposition is a disjunction of complex propositions. So
he says simple and complex propositions are on an equal footing. De Morgan does not give a
complete listing of the disjunctive expressions for the simple propositions—we provide one here
(using De Morgan’s abbreviated notation; he does not introduce a symbol for ‘or’):
A₁ = D or D₁        I₁ = D or D₁ or D′ or C′ or P
A′ = D or D′        I′ = D or D₁ or D′ or C₁ or P
E₁ = C or C₁        O₁ = C or C₁ or C′ or D′ or P
E′ = C or C′        O′ = C or C₁ or C′ or D₁ or P
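The claims of this section can be verified by brute force: the seven complex propositions are pairwise exclusive and jointly exhaustive over the restricted models, each simple proposition is the stated disjunction of complex ones, and so each complex proposition decides every simple one. A sketch (our own encoding; the names A1, A', etc. stand for the sub and super forms):

```python
from itertools import product

# Patterns of nonempty regions among X∩Y, X∩y, x∩Y, x∩y, under De Morgan's
# restriction that X, x, Y, y are all nonempty.
ATOMS = [(1, 1), (1, 0), (0, 1), (0, 0)]
PATS = []
for bits in product((0, 1), repeat=4):
    atoms = [a for a, b in zip(ATOMS, bits) if b]
    if all(any(a[k] == v for a in atoms) for k in (0, 1) for v in (0, 1)):
        PATS.append(atoms)

# the eight simple forms
SIMPLE = {
    "A1": lambda p: all(a[1] for a in p if a[0]),               # every X is Y
    "A'": lambda p: all(not a[1] for a in p if not a[0]),       # every x is y
    "O1": lambda p: any(a[0] and not a[1] for a in p),          # some X is not Y
    "O'": lambda p: any(not a[0] and a[1] for a in p),          # some x is not y
    "E1": lambda p: not any(a[0] and a[1] for a in p),          # no X is Y
    "E'": lambda p: not any(not a[0] and not a[1] for a in p),  # no x is y
    "I1": lambda p: any(a[0] and a[1] for a in p),              # some X is Y
    "I'": lambda p: any(not a[0] and not a[1] for a in p),      # some x is y
}

# the seven complex forms as conjunctions of simple forms
COMPLEX = {
    "C":  ("E1", "E'"),           # X = Y'
    "C1": ("E1", "I'"),           # X properly contained in Y'
    "C'": ("E'", "I1"),           # X' properly contained in Y
    "D":  ("A1", "A'"),           # X = Y
    "D1": ("A1", "O'"),           # X properly contained in Y
    "D'": ("A'", "O1"),           # Y properly contained in X
    "P":  ("I1", "I'", "O1", "O'"),
}

def c_holds(name, p):
    return all(SIMPLE[s](p) for s in COMPLEX[name])

# the disjunctive expressions for the simple propositions
DISJ = {
    "A1": ["D", "D1"],  "I1": ["D", "D1", "D'", "C'", "P"],
    "A'": ["D", "D'"],  "I'": ["D", "D1", "D'", "C1", "P"],
    "E1": ["C", "C1"],  "O1": ["C", "C1", "C'", "D'", "P"],
    "E'": ["C", "C'"],  "O'": ["C", "C1", "C'", "D1", "P"],
}
```
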
De Morgan extends his classification of equivalent propositions to include the complex ones.
On page 70 he gives all but the last line of the following master table of equivalent propositions:
XY YX xY Yx Xy yX xy yx
A O D A O D E I C E I C E I C E I C A O D A O D
A O D A O D E I C E I C E I C E I C A O D A O D
E I C E I C A O D A O D A O D A O D E I C E I C
E I C E I C A O D A O D A O D A O D E I C E I C
DCP DCP CDP CDP CDP CDP DCP DCP
The difference between the first and second rows, as well as the third and fourth rows, is just
a shift of the strokes. An equivalence class of a proposition can be found by choosing a row and
then selecting the ith entry from each of the eight columns, where i can be one of 1, 2, or 3. Thus
looking at the second row, with i = 3, we have the equivalence class
Finally De Morgan gives a table of immediate inferences and denials between simple and complex
propositions.
16 De Morgan actually defines complex propositions as follows (page 65):
A complex proposition is one which involves within itself the assertion or denial of each and all of the
eight simple propositions.
17 In modern terminology, the seven complex propositions KXY are the atoms of the Boolean algebra generated
by the eight simple propositions SXY. Since the eight simple propositions SXY are (up to equivalence) closed under
negation, the atoms can be expressed as meets of the generators.
1.7. Syllogisms. Let us say that a (simple or complex) proposition ϕ, using a form F, is based
on the symbols XY if ϕ is any of the eight propositions FXY, FYX, . . . , Fyx. A syllogism18 is
an argument in one of the two forms
ϕ₁ , ϕ₂  [imply | deny]  ϕ₃
The three propositions ϕi must be based on three symbols, no two of the ϕi being based on the
same pair. When no symbols are mentioned then the default pairs are XY, YZ, XZ. De Morgan
abbreviates the affirmatory case to ϕ₁ϕ₂ϕ₃, but unfortunately he no longer uses a notation for
the negation of a proposition, and introduces no notation for a negatory syllogism.19 Sometimes
we will write ϕ₁, ϕ₂, ϕ₃ in the affirmatory case for clarity.
The syllogism is simple if all three ϕi are simple propositions, and it is complex if all three ϕi
are complex propositions. Otherwise the syllogism is said to be mixed.20 As simple propositions
are closed (up to equivalence) under negation, De Morgan only considers the affirmatory simple
syllogisms. And for complex syllogisms, if ϕ1 ϕ2 ϕ3 is valid then the premises ϕ1 ϕ2 deny ψ for ψ
any complex proposition based on XZ that is not equivalent to ϕ3 . In this case De Morgan only
writes down the affirmatory case ϕ1 ϕ2 ϕ3 , and not the accompanying negatory cases in his listing
of the valid complex syllogisms.
Note that complex propositions have the property that each one based on XY is equivalent to a
unique complex FXY. Again we take the propositions FXY to be representative of their equivalence
classes.
To find the representative GXY of a given proposition, say Fθ, look in the first column of the
master table of equivalent propositions to find F, look at the headers of the columns to find θ, and
then find the entry G in the table corresponding to the row of F and the ith entry of the column
of θ, where F is the ith entry in the first column. Thus the representative of D Yx is C XY as C
is the 3rd entry of the 2nd row and 4th column.
By using the master table of equivalent propositions any syllogism can be readily transformed
into a standard form syllogism FXY,GYZ,HXZ. De Morgan shortens this to FGH(XYZ), and
even to just FGH. The transformation replaces each of the three propositions by its representative,
and thus the transformed syllogism is valid iff the original syllogism is valid.
For example, to find the standard form of the syllogism O yX, C Zy, I zx simply find the repre-
sentatives. The representative of O yX is, from the table, I XY. Of course we need one more round
of translation for the second and third propositions since they are based on YZ and XZ instead of
XY. For the second proposition, translate C Zy into C Yx, find its representative D XY, and then
the representative of C Zy is D YZ. Doing the same for the third proposition gives the standard
form I D I .
By using standard forms the total number of syllogisms to consider is 15³ = 3,375. De Morgan’s
plan is to first determine the valid complex syllogisms, then the valid simple syllogisms, and finally
the valid mixed syllogisms. To analyze complex syllogisms De Morgan uses a graphical aid described
in §1.8. Then to determine the other syllogisms he makes a clever use of the interplay between
simple and complex syllogisms.
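The same region-pattern encoding extends to all 15 propositional forms. The sketch below (our own illustration) enumerates the 15³ = 3,375 standard form syllogisms under the restricted semantics and counts the valid affirmatory ones; Appendix 0 is later said to contain a total of 424 such syllogisms.

```python
from itertools import product

# Restricted patterns: which of the 8 regions cut out of U by X, Y, Z are
# nonempty, with every term and its contrary nonempty.
ATOMS = list(product((0, 1), repeat=3))
PATS = []
for bits in product((0, 1), repeat=8):
    atoms = [a for a, b in zip(ATOMS, bits) if b]
    if all(any(a[k] for a in atoms) and any(not a[k] for a in atoms)
           for k in range(3)):
        PATS.append(atoms)

def simple(i, j):
    # the eight simple forms on the term pair (i, j)
    return {
        "A1": lambda p: all(a[j] for a in p if a[i]),
        "A'": lambda p: all(not a[j] for a in p if not a[i]),
        "O1": lambda p: any(a[i] and not a[j] for a in p),
        "O'": lambda p: any(not a[i] and a[j] for a in p),
        "E1": lambda p: not any(a[i] and a[j] for a in p),
        "E'": lambda p: not any(not a[i] and not a[j] for a in p),
        "I1": lambda p: any(a[i] and a[j] for a in p),
        "I'": lambda p: any(not a[i] and not a[j] for a in p),
    }

def forms15(i, j):
    # the 15 forms: eight simple plus seven complex (conjunctions of simple)
    s = simple(i, j)
    conj = lambda *ns: (lambda p: all(s[n](p) for n in ns))
    f = dict(s)
    f.update({"C": conj("E1", "E'"), "C1": conj("E1", "I'"),
              "C'": conj("E'", "I1"), "D": conj("A1", "A'"),
              "D1": conj("A1", "O'"), "D'": conj("A'", "O1"),
              "P": conj("I1", "I'", "O1", "O'")})
    return f

# Precompute truth-sets to make the 3375-fold check fast.
def truthsets(i, j):
    return {n: frozenset(k for k, p in enumerate(PATS) if f(p))
            for n, f in forms15(i, j).items()}

XY, YZ, XZ = truthsets(0, 1), truthsets(1, 2), truthsets(0, 2)

syllogisms = [(F, G, H) for F in XY for G in YZ for H in XZ]
valid = [(F, G, H) for F, G, H in syllogisms if XY[F] & YZ[G] <= XZ[H]]
```
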
Before going into the technical details we want to mention that in the chapter On the Syllogism,
page 114, De Morgan challenges the adequacy of syllogistic reasoning. He says from ‘man is an
animal’ one can infer ‘the head of a man is the head of an animal’. He does not prove this cannot
18 When De Morgan says ‘syllogism’ he means ‘valid syllogism’—we are not following this convention.
19 Although he uses (F) in the first chapter for negation, in this chapter parentheses are just used as delimiters,
so (F) means the same as F.
20 Actually he requires one of the premises of a mixed syllogism to be complex, the other simple. He does not
discuss the possibility that just the conclusion be simple, as in D D I .
be achieved by syllogistic reasoning, and simply offers further challenges to anyone who thinks they
can. The solution he offers is to extend Aristotle’s dictum de omni et nullo to the following:
For every term used universally less may be substituted, and for every term used particularly, more.
The species may take the place of the genus, when all the genus is spoken of: the genus may take the
place of the species when some of the species is mentioned, or the genus, used particularly, may take
the place of the species used universally. Not only in syllogisms, but in all the ramifications of the
description of a complex term. Thus for “men who are not Europeans” may be substituted “animals
who are not English.”
1.8. Complex Syllogisms. So let us start with a complete list of the 48 valid complex syllo-
gisms21 (in standard form) in a table with the first premise in the left column, the second premise
along the top row, and the conclusion being the corresponding entry in the table. A ‘•’ means no
conclusion is possible:
C D P C C D D
C D C P D D C C
D C D P C C D D
P P P • • • • •
C D C • :C D :D C
C D C • D :C C :D
D C D • C :D D :C
D C D • :D C :C D
We use De Morgan’s notation of : F to indicate that one has two negatory conclusions that
include the ‘no stroke’ version of F. For example D D :C means that the two negatory syllogisms
‘D and D deny C ’ and ‘D and D deny C’ both hold.
De Morgan focuses on deriving the results of the lower right quadrant, using diagrams like the
following:
[Diagram: two of De Morgan’s line-segment diagrams for the terms X, Y, Z.]
Fig. 22 De Morgan’s Second Visual Aid
Each such diagram gives two syllogisms. The dark line segments indicate the portion of the universe
that belongs to each term on the left, and the white segments apply to the terms on the right. The
syllogism on the left says:
X is properly disjoint from Y
Y is a proper superset of Z
X is properly disjoint from Z
He gives four such diagrams, to cover the complex syllogisms that do not mention C, D or P. The
syllogisms involving C or D, but not P, are briefly described as limiting cases of these. A separate
discussion takes care of the cases that involve P.
21 Negatory syllogisms that accompany an affirmatory syllogism will not be mentioned, following De Morgan’s
convention.
1.9. Simple Syllogisms. Using the valid complex syllogisms De Morgan now proceeds to
analyze the valid simple syllogisms. First he claims that if one has a particular premise22 then the
conclusion must be particular, and furthermore two particular premises give no conclusion (page
86):
The following theorems will be necessary;—I. A particular premise cannot be followed by a universal
conclusion.
This result holds for mixed syllogisms as well. His proof is by a single example to show the method.
The idea is as follows. Suppose FGH is a valid syllogism with G particular and H universal. (A
similar proof works when F is particular and G universal.) Let KXY be a complex proposition that
strengthens FXY, where K is not C, D, or P. Then the syllogism KPH is valid as P strengthens G.
As H is universal the proposition HXZ is equivalent to a disjunction of two complex propositions,
say ‘H₁XZ or H₂XZ’, where H₁ is either C or D. But then H₁XZ contradicts the premises KXY, PYZ,
so it follows that KPH₂ is valid. But then De Morgan refers to a previous claim (page 85) that
says for K one of C₁ , C′ , D₁ , D′ the premises KXY, PYZ are consistent with three different complex
propositions MXZ. This gives a contradiction.
2. From two particular premises no conclusion can follow.
De Morgan’s proof is again by a single example to illustrate the method. A general version
would be as follows. Suppose FGH is a valid syllogism with F and G particular statements. Then
PPH is also valid. Now any simple H will contradict some complex conclusion, but the premises
PP are consistent with all complex conclusions. Again a contradiction.
Now he is ready to determine the valid simple syllogisms. Given a valid simple syllogism FGH
one can strengthen the premises to complex propositions using only the four forms C₁ , C′ , D₁ , and
D′ . Suppose we have done this, yielding the valid syllogism KLH. If there is a complex form M
such that KLM is a valid syllogism then M is not C, D, or P; and H must be a consequence of M.
There are exactly eight valid affirmatory complex syllogisms not involving C, D, or P, namely:
C₁C′D₁  C₁D′C₁  C′C₁D′  C′D₁C′  D₁C₁C₁  D₁D₁D₁  D′C′C′  D′D′D′
De Morgan observes a fascinating pattern, namely that if one takes any of the above eight triples
of forms, say KLM, then by putting the eight syllogisms
KLM(XYZ) KLM(XYz) ... KLM(xyz)
into standard form one obtains the previous list of eight valid complex syllogisms. For our purposes
it suffices to check this fact for D₁ D₁ D₁ .
De Morgan determines 3 simple syllogisms from D₁ D₁ D₁ , and then uses the last observation
to find a total of 24 syllogisms. He claims the other syllogisms can be obtained by weakening
the conclusions or strengthening the premises of the 24 syllogisms. This claim is correct, but the
proof is really left to the reader. We will give an alternate approach to De Morgan’s syllogisms in
Appendix 0 that makes it fairly easy to verify this claim.
Now the four simple propositions FXY that follow from D₁ XY are:
A₁ XY   I₁ XY   I′ XY   O′ XY
De Morgan only uses the first and last of the four—they appear in his definition D₁ = A₁ + O′ —and
he gives the three simple syllogisms obtained23 from D₁ D₁ D₁ (using the two simple propositions
just mentioned):
A₁ A₁ A₁   A₁ O′ O′   O′ A₁ O′
22 Recall that particular and universal propositions are simple propositions.
23 Had he used all four propositions he would have found seven simple syllogisms, namely
A₁A₁A₁  A₁A₁I₁  A₁A₁I′  A₁I′I′  A₁O′O′  I₁A₁I₁  O′A₁O′ .
Now applying these three triples of simple forms to the eight triples XYZ, . . . , xyz of symbols, and
putting them in standard form, he obtains the 24 syllogisms in bold type in the following table of
the 48 valid simple syllogisms (in standard form):24
Universal Syllogism A A A A A A E E A E E A
Weakened Conclusion A A I A A I E E I E E I
Weakened Conclusion A A I A A I E E I E E I
Universal Syllogism A E E A E E E A E E A E
Weakened Conclusion A E O A E O E A O E A O
Weakened Conclusion A E O A E O E A O E A O
Strengthened Premise A A I A A I
A I I A I I I A I I A I
Strengthened Premise A E O A E O E A O E A O
A O O A O O O A O O A O
E I O E I O I E O I E O
Strengthened Premise E E I E E I
E O I E O I O E I O E I
Classifications of the valid simple syllogisms by De Morgan are given to the left of the table.
Each such classification applies to all the syllogisms in that row. Of the possible 64 premises for
a standard form simple syllogism the above table shows that 32 are involved in valid syllogisms.
The universal syllogisms are those that have all three propositions universal. There are eight
such syllogisms. (The other syllogisms are called particular syllogisms.) One can weaken the
conclusion of each of the eight valid universal syllogisms to a particular proposition in two ways.
This gives the rows marked weakened conclusion. These are omitted by De Morgan, just as
syllogisms with weakened conclusions are omitted in the traditional Aristotelian logic. De Morgan
says that if the premises are stronger than needed for a conclusion then such a syllogism should also
be omitted. This gives the rows marked strengthened premise. However he tends to include
the strengthened syllogisms, for example when finding the total number (32) of pairs of premises
involved in valid simple syllogisms.
The 24 syllogisms in bold type are the fundamental syllogisms. These are the ones that
are neither weakened nor strengthened. The horizontal lines are included in the table to group the
weakened or strengthened syllogisms with fundamental syllogisms from which they can be derived. 25
1.10. Mixed Syllogisms. In a couple of paragraphs De Morgan gives a list of rules that show
how to relate a mixed syllogism to one that is not mixed to decide if the mixed syllogism is valid.
For the valid affirmatory syllogisms we refer the reader to Appendix 0, where one finds a simple
algorithm to determine if any affirmatory syllogism is valid, and a table of all the standard form valid
affirmatory syllogisms (with strongest possible conclusions) using De Morgan’s 15 propositional
forms.
1.11. Further Comments on Syllogisms. De Morgan shows throughout his work on logic
a tremendous fascination with presentation and symbolic notation. The traditional structure of
the syllogism seems quite unnatural so, for example, he changes the 1st Figure AAA syllogism
24 De Morgan lists those in bold type—see the 3rd column of page 89. Only a few of the others are explicitly
mentioned.
25 It is interesting to note that the fundamental syllogisms are precisely the simple syllogisms that are valid under
modern semantics.
One must note that his ‘complement’ means, in modern terminology, ‘any superset of the comple-
ment’.
1.12. Quantifiers. The quantifier ‘all’ is perfectly clear as to its meaning, but De Morgan
sees numerous possibilities for ‘some’ (page 58):
The relation of the universal quantity to the whole quantity of instances in existence is definite, being
that whole quantity itself. But the particular quantity is wholly indefinite: “Some Xs are Ys” gives
no clue to the fraction of all the Xs spoken of, nor to the fraction which they make of all the Ys.
Common language makes a certain conventional approach to definiteness, which has been thrown
away in works of logic. “Some,” usually means a rather large small fraction of the whole; a larger
fraction would be expressed by “a good many”; and somewhat more than half expressed by “most”;
while a still larger proportion would be expressed by “a great majority” or “nearly all”. A perfectly
definite particular, as to quantity, would express how many Xs are in existence, how many Ys, and
how many of the Xs are or are not Ys: as in “70 out of the 100 Xs are among the 200 Ys.”
In 1846 De Morgan gives two examples of syllogisms using such a quantifier ([5], page 9):
Most of the Ys are Xs          Most of the Ys are Xs
Most of the Ys are Zs          Most of the Ys are not Zs
Some of the Xs are Zs          Some of the Xs are not Zs
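Both ‘most’ syllogisms rest on a pigeonhole argument: if more than half of the Ys are Xs and more than half of the Ys are Zs (respectively, are not Zs), the two majorities must overlap inside Y. A short exhaustive check over small universes, as our own illustration:

```python
from itertools import combinations, product

def subsets(u):
    return [set(c) for r in range(len(u) + 1) for c in combinations(u, r)]

def most(a, b):
    # 'most of the Bs are As': strictly more than half of B lies in A
    return 2 * len(a & b) > len(b)

ok = True
for n in range(1, 5):                      # universes of size 1..4
    U = set(range(n))
    for X, Y, Z in product(subsets(U), repeat=3):
        if most(X, Y) and most(Z, Y):      # most Ys are Xs, most Ys are Zs
            ok = ok and len(X & Z) > 0     # then some Xs are Zs
        if most(X, Y) and most(U - Z, Y):  # most Ys are Xs, most Ys are not Zs
            ok = ok and len(X - Z) > 0     # then some Xs are not Zs
```
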
In his efforts to make the syllogism definite De Morgan was particularly proud of his idea of
the numerically definite proposition, an example of which is given in the passage quoted above.
He devotes a modest chapter to this form of proposition and the corresponding syllogisms, but this
idea was not destined for success among logicians.
1.13. Compound Names. De Morgan introduces compound names on page 115 to de-
scribe classes naturally composed of others, namely he uses PQ for what we call the intersection of
P and Q, and P,Q for their union. He allows any iteration of these binary connectives, giving all
disjunctive and conjunctive combinations of his symbols. His purpose for introducing them is only
to reduce other syllogisms to categorical form (see §1.14).
(P ∪ Q) ∩ (R ∪ S) = (P ∩ R) ∪ (P ∩ S) ∪ (Q ∩ R) ∪ (Q ∩ S).
The associative laws for union and for intersection are implicit in his omission of parentheses.
And presumably the commutative and idempotent laws were too obvious to state. Nor are they
used. He does not introduce a symbol to express the fact that two (compound) names are the same
until he uses ‘∥’ in his Syllabus of 1860. And then he makes very little use of this symbol aside from
a few exercises in paragraph 134 (page 182) on contraries, for example X∥(A,B)C gives x∥(ab,c).
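The compound-name laws in play here, the distributivity behind the omitted parentheses and the contraries exercise, are ordinary set identities and are easy to check mechanically. A sketch of our own, reading ‘,’ as union and juxtaposition as intersection:

```python
from itertools import combinations, product

U = set(range(3))                     # a small universe suffices to illustrate
SUBS = [set(c) for r in range(4) for c in combinations(sorted(U), r)]

def contrary(s):
    return U - s

# the distributive law behind De Morgan's omission of parentheses
dist_ok = all(
    (P | Q) & (R | S) == (P & R) | (P & S) | (Q & R) | (Q & S)
    for P, Q, R, S in product(SUBS, repeat=4))

# the contraries exercise: if X is (A,B)C, i.e. (A ∪ B) ∩ C, then
# its contrary x is ab,c, i.e. (a ∩ b) ∪ c
contr_ok = all(
    contrary((A | B) & C) == (contrary(A) & contrary(B)) | contrary(C)
    for A, B, C in product(SUBS, repeat=3))
```
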
After this introduction to his notation he considers another expansion of his categorical forms
by allowing the simplest compound names to appear in them, for example, XY)P,Q means that
‘Everything that is both X and Y is either P or Q’. But his work on this is only a few pages, just
enough to open up the subject. (His requirement that the subject and predicate be nonempty and
not the universe would complicate an analysis.)
De Morgan is surprisingly apologetic for the notation for compound names (page 116):
With respect to this and other cases of notation, repulsive as they may appear, the reader who refuses
them is in one of two circumstances. Either he wants to give his assent or dissent to what is said of the
form by means of the matter, which is easing the difficulty by avoiding it, and stepping out of logic;
or else he desires to have it in a shape in which he may get that most futile of all acquisitions, called
a general idea, which is truly, to use the contrary adjective term as colloquially, nothing particular, a
whole without parts.
Regarding the nature of compound names he says (page 117):
Whatever has the right to the name P, and also to the name Q, has right to the compound name PQ.
This is an absolute identity, for by the name PQ we signify nothing but what has right to both names.
Accordingly X)P + X)Q = X)PQ is not a syllogism, nor even an inference, but only the assertion of
our right to use at our pleasure either one of two ways of saying the same thing instead of the other.
Of course we view (X ⊆ P)&(X ⊆ Q) ↔ (X ⊆ P ∩ Q) as a very simple theorem about sets, where
we base the set theory as usual on membership (∈). However he appears to be saying that his
expression is a definition of PQ. Such an implicit definition would not be acceptable today without
an explanation as to why a unique something actually satisfies the definition.26
26 In 1880 Peirce uses similar implicit definitions for his foundations of the algebra of logic. He starts with a
poset and defines the operations + and × on a pair of elements as the least upper bound and greatest lower bound.
However he does not explain why these bounds should exist. Schröder adopts Peirce’s approach in 1890, again without
explaining why the bounds exist, as the starting point for his Algebra of Logic.
2. THE AFFIRMATORY SYLLOGISMS OF FORMAL LOGIC 43
1.14. Other Syllogisms. Let us look at an example of how De Morgan uses compound names
to reduce the constructive dilemma to a categorical syllogism (page 123):
Example 2. “If A be B, E is F; and if C be D, E is F; but either A is B or C is D; therefore E is F.”
This can be reduced to
P)R + Q)R + S)P,Q = S)R
which is immediately made a common syllogism by changing P)R + Q)R into P,Q)R.
De Morgan is letting P be the proposition ‘A is B’, etc. S denotes a true proposition. But to use
his previous setup we really need classes, not propositions. The simplest solution is to refer to
Whately’s treatment where he lets P denote the class of instances where ‘A is B’ is true, etc. Then
the above becomes
P,Q)R + U)P,Q = U)R
where U is the universe. However this does seem to conflict with his condition that a term in a
syllogism cannot be the empty class or the universe.
After presenting his system De Morgan comments on the rigidity of the followers of Aristotle
(page 127):
From the time of Aristotle until now, the formal inference has been a matter of study. In the writings
of the great philosopher, and in a somewhat scattered manner, are found the materials out of which
was constructed the system of syllogism now and always prevalent: and two distinct principles of
exclusion appear to be acted on. Perhaps it would be more correct to say that the followers collected
two distinct principles of exclusion from the writings of the master, by help of the assumption that
everything not used by the teacher was forbidden to the learner. I cannot find that Aristotle either
limits his reader in this manner, or that he anywhere implies that he has exhausted all possible modes
of syllogizing. But whether these exclusions are to be attributed to the followers alone, or whether
those who have more knowledge of his writings than myself can fix them upon the leader, this much
is certain, that they were adopted, and have in all time dictated the limits of the syllogism. Of all
men, Aristotle is the one of whom his followers have worshipped his defects as well as his excellencies:
which is what he himself never did to any man living or dead; indeed, he has been accused of the
contrary fault.
Note that (1) gives the standard form of FGH(Xyz), (2) the standard form of FGH(XYz), (3) the
standard form of FGH(ZYX), and (4) the standard form of FGH(xyz).
The reduced form of a syllogism FGH is the result of successively carrying out the following
sequence of four steps:
• If F is not among the A,D,I forms then apply transformation (1).
• If G is not among the A,D,I forms then apply transformation (2).
• If G ≺ F, where
D ≺ P ≺ D₁ , D′ ≺ A₁ , A′ ≺ I₁ , I′ ,
then apply transformation (3).
• If F has an upperstroke apply transformation (4).
The 13 primary syllogisms are:
DGG for G not a C,E, or O form
PA I PA I DAD AAA AII
The primary syllogisms are valid.
Let us say that a syllogism KLM is a specialization of FGH if the premises of KLM are at
least as strong as those of FGH, and the conclusion of KLM is weaker or equal to that of FGH.
Thus if KLM is a specialization of a valid syllogism then KLM is valid.
To quickly check that one proposition is stronger than another one can use the following diagram,
where F below G means that FXY is stronger than GXY, i.e., GXY follows from FXY:
I I O O
A A E E
D C D P D C C
Theorem A syllogism FGH (in standard form) is valid iff its reduced form is a specialization of
one of the primary syllogisms.
Proof The first two steps of the reduction eliminate all occurrences of C,E, and O forms from the
premises. The next step ensures not (G ≺ F). The last step gives an F that does not have an
upperstroke. Each of the steps preserves the effects of the previous steps. This leaves the following
27 possibilities for the two premises of a reduced syllogism, where in each row one is to choose the
second premise to be any of the bracketed items:
D[DPD D A A I I ]
P[PD D A A I I ]
D [D D A A I I ]
A [A A I I ]
I [I I ]
3. ON THE SYLLOGISM 45
It is trivial to find the best conclusion for the 8 cases in the first row, namely DGG has the strongest
conclusion that one can draw from the premises DG. This gives 8 primary syllogisms. The last
row yields no valid syllogisms as the premises are both particular. A detailed analysis of the 17
pairs of premises in the second, third and fourth rows will yield the remaining 5 primary syllogisms.
Using this it is not difficult to fill in (by hand!) the following table of the 184 standard form valid
affirmatory syllogisms that have strongest possible conclusions, using the 15 propositional forms
of De Morgan. And from this one can readily determine that there are a total of 424 standard
form valid affirmatory syllogisms in De Morgan’s system.27 As usual, the first premise is in the left
column, the second premise is in the top row, and the conclusion is the corresponding entry in the
table. Several of the syllogisms involving P as a premise have two strongest possible conclusions,
and they are listed together. Thus both D P I and D P O are in the collection of valid syllogisms
with strongest possible conclusions. The bold type gives the primary syllogisms:
D D D C C C P A A E E I I O O
D D D D C C C P A A E E I I O O
D D D I C C O I O D I C O • I • O
D D I D C O C I O I D O C I • O •
C C C C D D D P E E A A O O I I
C C O C D I D I O O C I D O • I •
C C C O D D I I O C O D I • O • I
P P I O I O P I O I O • I O I O I O I O • • • •
A A D I E C O I O A I E O • I • O
A A I D E O C I O I A O E I • O •
E E O C A I D I O O E I A O • I •
E E C O A D I I O E O A I • O • I
I I I • O O • • • I • O • • • •
I I • I O • O • I • O • • • • •
O O • O I • I • O • I • • • • •
O O O • I I • • • O • I • • • •
3. On the Syllogism
Starting in 1846, and continuing until 1863, De Morgan wrote a series of papers called On the
Syllogism: I-VI; and an outline paper called Syllabus of a Proposed System of Logic in 1860 that had
244 numbered paragraphs describing his proposed system of logic. These, and an abridged version
of an 1860 article Logic for the English Cyclopedia, were collected together in 1966 in a book titled
On the Syllogism. (The introduction of this book has an excellent biography of De Morgan.) The
page numbers we quote for De Morgan’s work from these articles will refer to the page numbering
in this book, and not to the original articles. Our references to ‘paper I’, etc., will refer to his
articles ‘On the Syllogism, I’, etc.
One can summarize the series of six papers as a continuation of his book; actually the
book was an expansion of the first paper, with many of the details we have discussed in the section
on Formal Logic originally appearing in condensed form in that paper—the major exceptions
27
Each of the 48 complex conclusions can be weakened in 4 ways; each of the 24 universal conclusions can be
weakened in 2 ways; and the 112 particular conclusions cannot be weakened.
46 3. AUGUSTUS DE MORGAN (1806–1871)
being the abstract copula and compound names. Also it should be noted that the notation changes
considerably from the 1846 paper to the 1847 book.
In paper II (1850), On the Symbols of Logic, the Theory of the Syllogism, and in particular of the
Copula, De Morgan improves his symbolism for simple propositions, and introduces general binary
relations. Thus in the years 1846 to 1850 we see all of his main ideas for improving logic. First let
us discuss his cuneiform-like notation for simple propositions and syllogisms.
3.1. New Notation for Simple Syllogisms. In the 1850 paper De Morgan develops a simple
calculus of inference for simple syllogisms, based on a new parenthesis-and-dot notation whose
eight forms are displayed in the table below.
To find the propositions equivalent to a given proposition there are two rules:
• Transposing the subject and predicate, and reversing the parentheses, gives an equivalent
proposition. For example, X).)y is equivalent to y(.(X.
• Changing a symbol to its contrary, reversing the neighboring parenthesis, and adding a dot if
there is no dot, otherwise deleting a dot, gives an equivalent proposition. For example X))Y
is equivalent to x(.)Y , and this in turn is equivalent to x((y.
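These two rules are mechanical enough to check by machine. The following Python sketch applies each rule to De Morgan's own examples; the string encoding of a proposition (subject letter, parenthesis-and-dot body, predicate letter, as in "X).)y") is my own hypothetical convention, not De Morgan's.

```python
FLIP = {')': '(', '(': ')'}

def split(prop):
    """Split a proposition like "X).)y" into (subject, body, predicate)."""
    return prop[0], prop[1:-1], prop[-1]

def transpose(prop):
    """Rule 1: transpose subject and predicate and reverse the parentheses."""
    s, body, p = split(prop)
    reversed_body = ''.join(FLIP.get(c, c) for c in reversed(body))
    return p + reversed_body + s

def take_contrary(prop, end):
    """Rule 2: change one name-symbol to its contrary, reverse the
    neighbouring parenthesis, and toggle the dot (add one if there is
    none, otherwise delete it)."""
    s, body, p = split(prop)
    left, right = body[0], body[-1]
    if end == 'subject':
        s, left = s.swapcase(), FLIP[left]
    else:
        p, right = p.swapcase(), FLIP[right]
    dot = '' if '.' in body else '.'
    return s + left + dot + right + p

# De Morgan's own examples:
print(transpose("X).)y"))                   # y(.(X
print(take_contrary("X))Y", 'subject'))     # x(.)Y
print(take_contrary("x(.)Y", 'predicate'))  # x((y
```

Applying `transpose` twice returns the original proposition, as one would expect of an equivalence transformation.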
De Morgan also develops a compact notation for syllogisms in standard form. A single example
should suffice to explain this. The premises ‘No X is Y ’ and ‘No y is z’, which in standard form are
E XY and E′Y Z, are written in the new notation as X).(Y and Y (.)Z. Amalgamate the pair of
premises into X).(Y (.)Z, and then remove the symbols X, Y, Z to obtain just ).((.) . The strongest
possible conclusion is X))Z, so he abbreviates the syllogism to ).((.) = )) . Note that the conclusion
can be obtained by deleting the inner two parentheses and the two dots. This turns out to be part
of a general rule. He gives the following two canons to determine if a pair of premises expressed in
compact form in the new notation actually has a conclusion, and if so, a deletion algorithm to find
the strongest possible conclusion (page 40):
• (De Morgan’s Canon of Validity) If both premises are universal, or if one is universal
and the middle parentheses turn the same way, then there is a conclusion.
• (De Morgan’s Canon of Inference) If the premises have a conclusion then the strongest
conclusion can be found in the following manner: delete the dots if there are two dots, and
delete the inner two parentheses.
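The two canons amount to a short algorithm. As a sketch (assuming, as read off from the table below, that the universal forms are exactly the four A and E forms, and encoding each premise by its parenthesis-and-dot body), one might write:

```python
# Universal forms: A = )), A' = ((, E = ).(, E' = (.)
# (an assumption read off from the table of eight simple propositions).
UNIVERSAL = {"))", "((", ").(", "(.)"}

def strongest_conclusion(body1, body2):
    """Apply De Morgan's two canons to premises given by their
    parenthesis-and-dot bodies, e.g. ").(" and "(.)" for X).(Y, Y(.)Z.
    Returns the strongest conclusion's body, or None if there is none."""
    u1, u2 = body1 in UNIVERSAL, body2 in UNIVERSAL
    # The middle parentheses turn the same way when the inner paren of
    # the first premise matches the inner paren of the second.
    middle_same_way = body1[-1] == body2[0]
    # Canon of Validity:
    if not ((u1 and u2) or ((u1 or u2) and middle_same_way)):
        return None
    # Canon of Inference: delete the inner two parentheses, and delete
    # the dots if there are two dots.
    merged = body1[:-1] + body2[1:]
    if merged.count('.') == 2:
        merged = merged.replace('.', '')
    return merged

print(strongest_conclusion(").(", "(.)"))  # )) -- the example in the text
print(strongest_conclusion("()", "()"))    # None -- two particular premises
```

Running this over all 64 pairs of bodies reproduces the table of valid simple syllogisms given below.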
To see that these canons are correct one only needs to examine the following table that shows all
the standard form valid simple syllogisms (with strongest possible conclusions) in this new notation.
The first premise is along the left column, the second premise in the top row, and the conclusion,
if there is one, in the corresponding entry of the table. A ‘•’ means no conclusion is possible.
3. ON THE SYLLOGISM 47
        A     A′    E     E′    I     I′    O     O′
        ))    ((    ).(   (.)   ()    )(    (.(   ).)
A  ))   ))    )(    ).(   ).)   •     )(    •     ).)
A′ ((   ()    ((    (.(   (.)   ()    •     (.(   •
E  ).(  ).)   ).(   )(    ))    ).)   •     )(    •
E′ (.)  (.)   (.(   ((    ()    •     (.(   •     ()
I  ()   ()    •     (.(   •     •     •     •     •
I′ )(   •     )(    •     ).)   •     •     •     •
O  (.(  •     (.(   •     ()    •     •     •     •
O′ ).)  ).)   •     )(    •     •     •     •     •
α not in X and not in Y . To check that α has this property by comparing elements one must
examine every element of both X and Y . Thus X and Y both enter universally in this proposition.
This is the sense of quantification that De Morgan says his notation captures:
Let the subject and predicate, when specified, be written before and after the symbols of quantity.
Let the inclosing parenthesis, as in X) or (X, denote that the name-symbol X, which would be
inclosed if the oval were completed, enters universally. Let an excluding parenthesis as in )X or X(,
signify that the name-symbol enters particularly. Let an even number of dots, or none at all, inserted
between the parentheses, denote affirmation or agreement; let an odd number, usually one, denote
negation or non-agreement. Thus X))Y means that all Xs are Ys; X(.(Y means that some Xs are
not Ys; . . .
It would appear from De Morgan’s two examples that one only needs to read X) or (X as ‘All X’,
and X( or )X as ‘Some X’. But this reading works reasonably well in only six of the eight simple
propositions, the ones where Hamilton’s propositions are also simple propositions. In the other two
cases this reading fails, so one needs to exercise caution: X)(Y is the proposition ‘Some x is y’,
and not Hamilton’s ‘All X is all Y ’; and X(.)Y is the proposition ‘No X is Y ’, and not Hamilton’s
dubious ‘Some X is not some Y ’. It is probably better to avoid a direct quantifier reading of the
parentheses and rely instead on the table of definitions.
3.2. Developments with Binary Relations. In paper II (1850) De Morgan takes the rather
large step of generalizing the copula to an arbitrary binary relation, symbolized by either a solid
line segment or a dashed line segment, and permitting the copulas in the two premises to be
distinct. Almost all of his work using a general binary relation for a copula is for doubly singular
propositions, meaning that the subject and predicate are singletons. In this case the quantifiers
are not needed since ‘All’ and ‘Some’ would have the same meaning. Following De Morgan (page
235) let us say that a syllogism built from doubly singular propositions is a unit syllogism. Note
that in the study of unit syllogisms one does not want to introduce contraries of names (unless the
universe has only two members). So for the study of unit syllogisms all symbols will be capital
letters X, Y , etc.
In order to have the most fundamental unit syllogism ‘X is Y , Y is Z, therefore X is Z’ hold
he needs to introduce the composition of the two binary relations as the copula of the conclusion.
(He briefly calls this a bicopular syllogism.)
He does little with this idea in 1850 besides introducing the composition and contraries of binary
relations. It will be ten years before he is prepared to discuss syllogisms at this level of generality.
But in 1850 he does argue for the naturalness of using different copulas in a syllogism (page 59):
The admission of relation in general, and of the composition of relation, tends to throw light upon the
difference between the invented syllogism of the logicians and the natural syllogism of the external
world. The logician, tied to a verb of identity, from which if he wander it is never quite out of sight,
is bound to subject and predicate of the same class; objective both, or subjective both. He cannot
say the rose is red, for his is would require the inference that some red is the rose. He has nothing
but a method of reducing his predicate to an object: the rose is a red thing; some red thing is a rose.
The common man uses a copula which ties the object up in relation to a more subjective predicate;
not reading inversely by intension, not dwelling on redness as an attribute of a rose, but directly by
extension, thinking of the family rose as his external object, and the sensation red as one condition
under which it appears to his senses. Again, an ordinary person says that the rose is red, and red is
pretty, so that the rose is pretty. . . .
And, following the last statement, he says that the ordinary person is really dealing with different
copulas in such a syllogism.
In 1860, in paper IV On the Logic of Relations, he returns to the use of a binary relation, now
symbolized by a letter instead of a line segment, for the copula. For L a binary relation he writes
X..LY to signify that ‘X is in the relation L to Y ’, and X.LY means ‘X is not in the relation
3. ON THE SYLLOGISM 49
L to Y ’. If L is any transitive binary relation then De Morgan has the valid unit syllogism:
X..LY
Y..LZ
X..LZ
By using the compound relation28 LM , where X..LM Z means there is a Y such that X..LY
and Y..M Z, he has the valid unit syllogism:
X..LY
Y..M Z
X..LM Z
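In modern terms a binary relation on a universe can be modelled as a set of ordered pairs, with De Morgan's compound relation LM as relational composition. A minimal sketch (the concrete pairs are illustrative only):

```python
def compose(L, M):
    """The compound relation LM: x stands in LM to z exactly when
    there is a y with x L y and y M z."""
    return {(x, z) for (x, y) in L for (w, z) in M if y == w}

def converse(L):
    """The converse relation L^-1: y L^-1 x iff x L y."""
    return {(y, x) for (x, y) in L}

L = {("X", "Y")}  # X..LY
M = {("Y", "Z")}  # Y..MZ
print(("X", "Z") in compose(L, M))  # True -- the basic unit syllogism X..LMZ
print(converse(L))                   # {('Y', 'X')}
```

For a transitive L one has compose(L, L) contained in L, which is the content of De Morgan's first valid unit syllogism above.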
To explain some of the notation that De Morgan uses in his theory of syllogisms we give the
following definitions (expressed in modern notation):
notation definition
LY = {α : {α}LY }
XL = {β : XL{β}}
De Morgan uses LY , but not, it seems, XL. Now let us introduce some of his key definitions (pages
220–222), using modern notation when more convenient:
name             notation    definition
converse of L    L−1         X..L−1 Y iff Y..LX
                 LM′         X..LM′ Y iff ∅ ≠ M Y ⊆ XL
                 L′M         X..L′M Y iff ∅ ≠ XL ⊆ M Y
contrary of L    l           XlY iff X.LY
From these his other identities and inclusions can be easily deduced.
In this general setting De Morgan gives a table for the 16 unit syllogisms that have the first
premise one of ‘X is [is not] Y ’ or ‘Y is [is not] X’, and the second premise one of ‘Y is [is not] Z’
or ‘Z is [is not] Y ’. One can easily put the premises into a standard affirmative form using: Y..LX
is equivalent to X..L−1 Y , and X.LY is equivalent to X..lY . Then after putting the premises
into such a form one has only to apply the basic unit syllogism: from X..LY and Y..M Z follows
X..LM Z. Let us examine a sample of the 16 cases, expressed in De Morgan’s notation (page 232):
X..LY
Z.M Y
X..Lm−1 Z
X.l′m−1 Z
X.L′M −1 Z
L−1 N ||M −1
the time that we arrive at the consideration of relation in general we are clear of all necessity for
quantification. And for this reason: quantification itself only expresses a relation.
With this dismissal of quantifiers De Morgan misses an essential ingredient of modern logic. He
continues:
Thus if we say that some Xs are connected with Y s, the relation of the class X to the class Y is
that of a partial connexion: that some at least, all it may be, are connected, is itself a connexion
between the classes.
This is the first time that De Morgan clearly indicates that he is aware that his propositions X))Y ,
etc., actually define relations between classes. Previously he just said that these were abbreviations.
And he continues:
Nevertheless, it may be useful to exhibit the modifying quantification as a component, not as
inseparably thought of in the compound; though in this we must confine ourselves to what may be called
the Aristotelian branch of the extended subject.
Two decades later Frege and Peirce would revolutionize logic by properly isolating the ‘modifying
quantification’. The reason for De Morgan’s restriction to the ‘Aristotelian branch’ is that
introducing names for contraries creates problems:
If we would enter completely upon quantified forms, we must examine not only the relation and its
contrary, but the relation of a term in connexion with the relation of the contrary term. And here we
find that all universal connexion ceases. The repugnance [i.e., disjointness] of X and not-X or x,
which, joined with alternance [i.e., the union is U ], is the notion the symbols X and x were invented
to express, cannot be predicated of LX and Lx: for Y..LX and Y..Lx may coexist. The complete
investigation would require subordinate notions of form, effecting the subdivisions of matter.
This quote seems directed at preserving the equivalence of propositions like ‘No X is Y ’ and
‘All X is y’. Using arguments similar to those in §1.3 the equivalence of ‘No X..LY ’ and ‘All X..Ly’
would hold iff L is a one-one mapping from U to U . Even if one weakens the condition to requiring
that ‘No X..LY ’ and ‘All X..L∗ y’ are equivalent for some choice of copula L∗ there are still severe
restrictions on L (and L∗ ).
Thus De Morgan does not even initiate a program to examine the expression of his simple
syllogisms in the context of bicopular syllogisms. And with the Aristotelian syllogisms he is rather
sketchy—he says that the sixteen unit syllogisms he has presented lead to valid quantified syllogisms.
He gives a single example, leaving it to the reader to work out the extent to which the Aristotelian
inferences are preserved in this general setting. There are severe complications. For example with
the 1st Figure EAE syllogism one wants to find a binary relation □ that makes the following valid:
All X..LY
All Y.M Z
All X.□Z
Of course there is the trivial solution, letting □ be the empty relation. But there does not seem to
be any interesting solution for □.
After this discussion of quantified bicopular syllogisms De Morgan returns briefly to unit syl-
logisms, this time with the copula of each of the premises being either ‘.L’ or ‘..L’, where L is a
transitive relation, and in which the conclusion is in one of the four forms ‘X .. [.] L [L−1 ] Y’.
There is a curious discussion of convertible relations (i.e., symmetric relations) (page 225):
And, L being any relation whatever, LL−1 is convertible: . . . So far as I can see, every convertible
relation can be reduced to the form LL−1 .
This is clearly false, for consider the convertible relation ≠. De Morgan’s condition M = LL−1
implies that XM X for any X such that for some Y we have XM Y . This is essentially noted by
De Morgan (page 226):
Among the subjects of a convertible relation must usually come the predicate itself, unless it be forced
out by express convention. If all convertible relation can be expressed by LL−1 this is obviously
necessary: for LL−1 X includes X. Is a man his own brother? It is commonly not so held: but we
cannot make a definition which shall by its own power exclude him, unless under a clause expressly
framed for the purpose. . . . I shall hold, for logical purposes, that the predicate is included among
its own convertible relatives.
He seems to be saying that the only symmetric relations that he wants to consider are also
reflexive.29
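One can check the point mechanically: for any relation L, the composite LL−1 is symmetric and relates every element of its field to itself, which is exactly why the irreflexive convertible relation ≠ cannot have the form LL−1. A small sketch, with an arbitrary illustrative L:

```python
def compose(L, M):
    # x (LM) z iff there is a y with x L y and y M z
    return {(x, z) for (x, y) in L for (w, z) in M if y == w}

def converse(L):
    return {(y, x) for (x, y) in L}

L = {(1, 'a'), (2, 'a'), (3, 'b')}  # an arbitrary relation
M = compose(L, converse(L))         # M = LL^-1
# M is symmetric ...
print(M == converse(M))                  # True
# ... and every x that M relates to anything is related to itself,
# so an irreflexive relation like != can never be of the form LL^-1.
print(all((x, x) in M for (x, _) in M))  # True
```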
General binary relations were not an accepted part of logic during his lifetime. In 1860 he says
in paper IV ([5], page 208):
Much has been written on relation in all its psychological aspects except the logical one, that is,
the analysis of necessary laws of thought connected with the notion of relation. The logician has
hitherto carefully excluded from his science the study of relation in general: he places it among those
heterogeneous categories which turn the porch of his temple into a magazine of raw material mixed
with refuse.
The pursuit of the syllogism turned out to be a dead-end of investigation, but the idea of
studying general binary relations, and having compound names for relations, would captivate Peirce,
who had received a complimentary copy of On the Logic of Relations from De Morgan, and would
lead him to develop the powerful calculus of relations (or relatives).
4. Concluding Remarks
De Morgan’s description of his system of simple and complex propositions in Formal Logic is
given in most appealing terms (pages ix,x):
A simple notation, which includes the common one, gives the means of representing every syllogism
by three letters, each accented above or below. By inspection of one of these symbols it is seen
immediately, 1. What syllogism is represented, 2. Whether it be valid or invalid, 3. How it is at once
to be written down, 4. What axiom the inference contains, or what is the act of the mind when it
makes that inference . . .
But his successors would not be so generous in their evaluation. De Morgan’s work contributed to
the ‘air of change’ that existed in the 1840s, but his system of logic was regarded as notationally
too complex, and, in comparison to Boole’s work, hopelessly rooted in the past in focusing on the
syllogism, albeit in a more general setting. A number of new ideas were introduced by him, but
the interesting ones for the future of logic, like compound names and binary relations, were really
not developed in significant detail by De Morgan. His strengths were his ability to ask probing
questions, to introduce interesting definitions, to create compact notation, and to compress data
into tables.
Jevons summarized the shortcomings of De Morgan’s work in 1880 ([10], pages xii-xiii):
After a careful renewed study of the writings of these eminent logicians I felt compelled in the first
place to discard the diverse and complicated notative methods of De Morgan . . . to import his
‘mysterious spiculae’ into this book was to add a needless stumbling-block . . . There was in fact an
unfortunate want of power of generalization in De Morgan; his mind could dissect logical questions
into their very atoms, but he could not put the atoms of thought together again into a real system.
Perhaps this is a good place to clarify the limitations of De Morgan’s work.
29
De Morgan’s conjecture about the possibility of decomposing a symmetric and reflexive binary relation M into
the form LL−1 is correct, provided the universe U is infinite: assign to each edge E of M a distinct
element f (E) of U , and let L be the relation of all pairs (α, f (E)) where α is a vertex of E. However
on a five-element universe one can find symmetric and reflexive relations M that have no such decomposition. A
necessary and sufficient condition for such a decomposition is (thinking of M as a reflexive graph) that one can find
no more than |U | cliques in M such that every edge of M belongs to one of these cliques.
• He generalizes the simple propositions by adding certain conjunctions of them called complex
propositions to his list of propositions. He says one can express the simple propositions as
disjunctions of complex ones, and he briefly has a notation for the denial of a proposition. But
he does not take the step of allowing arbitrary Boolean combinations of his propositions to be
propositions. This would have yielded a 128 element Boolean algebra of Boolean combinations
ϕ(X, Y ) of the eight simple propositions, with the complex propositions being the atoms.
• He introduces contraries, union, and intersection, and then constructs compound names. And
because he wants to make sure that he has contraries for the compound names the De Morgan
laws are mentioned, but there is no deliberate investigation of the laws of compound names.
Lacking a symbol for equality between compound names through most of his work, and lacking
a symbol for complements, he misses the opportunity to develop one of the vital directions
of logic, an equational calculus of compound names.
De Morgan has Boolean combinations of classes, but not the Boolean algebra of classes.
Very little is done towards studying valid syllogisms that use compound names. And he does
not notice the parallels between the operations contrary-union-intersection on classes and
denial-disjunction-conjunction on propositions.
It is his student Jevons who, in 1864, starts to fully investigate the laws of (and rules
of inference for) compound names. And it is Jevons who shows that this Boolean algebra
provides a viable alternative to Boole’s algebra of logic.
• He introduces the general binary relation into logic, but he does not see that one can develop a
logic for relations that parallels that of the logic for classes, and does not realize that relations
will allow one to develop a logic for complicated arguments like those used in mathematics.
Rather than a general study of how binary relations can interact, working towards modern
logic, De Morgan is content to squeeze them into the traditional syllogistic arguments. The
idea of developing a logic for relations rather than for classes would be initiated by C.S. Peirce
starting in 1870. (In 1903 Peirce says his 1870 paper was the most important development in
logic since the work of Boole.)
Peirce’s work on relations would in turn be enthusiastically embraced by E. Schröder, and
would be the basis for Vol. III of Schröder’s Algebra der Logik. Both Löwenheim and Skolem
were well versed in Schröder’s volumes. Löwenheim would follow this work on relations with
his famous theorem on countable models in first-order languages, and Skolem would give this
theorem its definitive proof. Skolem worked on decidability questions for first-order logic
stemming from Schröder’s work, and also recommended the use of first-order language (from
Schröder’s Vol. III) for the axiomatization of Zermelo’s set theory. So one can actually trace
a direct link from De Morgan’s work to some of the highlights of modern logic.
De Morgan only sees the relevance of his ideas to the syllogism. Although he creates some of
the most basic concepts of modern logic, by applying them only to syllogistic arguments he ends
up with little credit for the development of modern logic.
CHAPTER 4
George Boole (1815–1864)
George Boole was a school teacher in Lincoln, England, when he began publishing, in 1840,
respected papers in analysis in the Cambridge Mathematical Journal. His early (and primary)
interests1 were linear transformations, invariants, differential equations, and the calculus of varia-
tions. Starting to publish at the age of 25 is not particularly striking, until one realizes that Boole
had never been a student in an institution of higher learning. He had simply taught himself all
the higher mathematics, and most of the foreign languages, that he knew. For his 1844 paper on
operational methods in differential equations he received a Royal Medal from the Royal Society.
One of the major influences on Boole early in his research career was the young Cambridge
mathematician Duncan Gregory. Gregory stated three laws in 1839,
xy = yx      x(u + v) = xu + xv      xᵐ · xⁿ = xᵐ⁺ⁿ,
and said that these were all Euler needed to derive the binomial theorem (with fractional expo-
nents). Gregory then used the three laws to justify applying the binomial theorem in his work with
differential operators. In Boole’s 1841 paper on linear DEs he says, without proof, that the three
laws of Gregory also suffice to obtain the partial fraction decomposition of a rational function, and
he applies this to the inverse of a linear differential operator with constant coefficients. Next, in
Boole’s prize winning paper of 1844, on differential and difference equations, he states the three
laws as a basis for his work, inquiring if the third law might not be merely a necessity of notation.
The impact of Boole’s early work with differential operators on his subsequent work with logic
is rather clear. Replacing Gregory’s third law with xⁿ = x gives Boole’s three laws of logic in 1847.
However, Boole goes further by adding a rule of inference that we now call the replacement rule
in equational logic, and by claiming that, in view of the first two of these laws, “all the processes
of common algebra are applicable to the present system”. What Boole actually uses is (with rare
exceptions) just the equational part of common algebra. The laws of 1847 are clearly inadequate
for what Boole does. But the point is that Boole (incorrectly) believes that the two laws give him
the right to use the common (equational) algebra, and he immediately invokes that right.
Boole initiated a correspondence with De Morgan in 1842 that was to have a profound impact
on Boole’s life. De Morgan, nine years older than Boole, was highly regarded for his wide ranging
intellectual writings, his lecturing expertise, and his popular texts in mathematics. Their early
correspondence dealt with analysis, especially differential equations.
As mentioned in the section on De Morgan, by the Spring of 1847 each was engaged in writing
a book on logic. Jevons and Venn would later claim that the equational approach to logic was a
consequence of the discovery of the quantification of the predicate by De Morgan and Hamilton.
All Boole says on this matter in his 1847 work is:
In the spring of the present year my attention was directed to the question then moved
between Sir W. Hamilton and Professor De Morgan; and I was induced by the interest which
it inspired, to resume the almost-forgotten thread of former inquiries. It appeared to me
1
Most of my colleagues think Boole was primarily a logician. But he was an analyst who suddenly acquired an
interest in logic in 1847, worked on the subject for the next seven years, and then returned to his interests in analysis.
56 4. GEORGE BOOLE (1815–1864)
that, although Logic might be viewed with reference to the idea of quantity, it had also
another and a deeper system of relations.
Propositions (Premises) → Equations → [Algebra] → Equations → Propositions (Conclusions)
Boole’s algebra of logic is equational logic applied to two situations: in the categorical logic
it provides what is essentially a calculus of classes, and in the hypothetical logic it provides
a propositional calculus. From the perspective of De Morgan’s work, Boole has developed an
equational calculus of compound names.
Boole’s initial goal is simply to express all of the statements of traditional logic (categorical and
hypothetical) as equations, and apply suitable algebraic transformations to the equations to derive
the known valid arguments (the conversions, syllogisms, and hypothetical syllogisms) of logic.
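For instance, Boole expresses ‘All Xs are Ys’ by the equation xy = x; with that translation the syllogism Barbara reduces to a short computation in common algebra. The following derivation is a modern reconstruction of the idea, not Boole’s own layout:

```latex
\[
xy = x, \qquad yz = y
\quad\Longrightarrow\quad
xz = (xy)z = x(yz) = xy = x,
\]
```

and the resulting equation xz = x translates back into ‘All Xs are Zs’.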
Toward the end of writing this book Boole realizes that his algebra of logic is applicable to any
finite collection of premises with any number of symbols. The traditional syllogisms would be just
a small corner of this system, and would no longer merit the memory work on the classification and
rules that had been staples for so many centuries. What was important to remember was how to
translate between propositions and equations, and the rules of the algebra of logic. This would lead
to the end of the dominance of Aristotelian logic—brought down by the writings of a self-taught
English school teacher.
Boole’s work has serious shortcomings. First he fails to provide clear definitions of most of the
basic operations he uses to combine classes. Secondly, he claims, on rather meager evidence, that
the laws of thought are indeed governed by the rules and processes of the algebra of numbers, with
the only modification being the law x² = x. As mentioned in the introduction, the symbol x in this
law behaves like a constant.2
Boole clearly defines only the first of the four operations product, sum, difference, and quotient
that he applies to classes. The notation for these operations is borrowed from ordinary algebra, and
he explains that the product xy refers to what we now call intersection. Addition is only defined in
passing, when he discusses the distributive law x(u + v) = xu + xv. In this context the u and v will
represent disjoint classes, and u + v refers to their union. Yet + will later appear in expressions
where there is no reason to think that the terms represent disjoint classes. Minus is only explained
2
We cannot substitute x + x for x to derive (x + x)² = x + x; this would lead to x + x = 0, which Boole certainly
does not accept.
1. THE MATHEMATICAL ANALYSIS OF LOGIC (1847) 57
for the expression 1 − x, which represents the complement3 of the class represented by x. In the last
chapter division is used in an essentially formal manner, with its interpretation depending entirely
on applying the Expansion Theorem.
It is easy to imagine that Boole writes out the various equational laws now called the axioms, or
defining laws, of Boolean algebra and uses them to analyze equations. This is not the case. Indeed
he only has the following three laws in 1847 (page 17), clearly borrowed from his earlier work with
differential operators:
x(u + v) = xu + xv
xy = yx
xⁿ = x
To these laws he adds the axiom that equivalent operations performed upon equivalent subjects
produce equivalent results. This axiom is a version of what is called the Replacement Rule in modern equational logic. This
is not the only rule of inference that Boole needs for his logic—he also needs the reflexive, sym-
metric, and transitive properties of equality. Also he will use an additively nonnilpotent
property, namely that at = 0 leads to t = 0 if a is a nonzero number.
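A small illustration of how far the single law x² = x carries when combined with common algebra: the principle that nothing belongs both to a class and to its complement falls out in one line (a modern rendering, not a quotation from Boole):

```latex
\[
x(1 - x) \;=\; x - x^2 \;=\; x - x \;=\; 0.
\]
```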
Boole regards recognition of the above mentioned axiom as important for the foundations of
logic, and he has a lengthy footnote chastising traditional logic for omitting this from its
fundamental rules of reasoning regarding the principles of reduction that it uses (page 18):
It is generally asserted by writers on Logic, that all reasoning ultimately depends on an
application of the dictum of Aristotle, de omni et nullo. “Whatever is predicated universally
of any class of things, may be predicated in like manner of any thing comprehended in that
class.” But it is agreed that this dictum is not immediately applicable in all cases, and that in
the majority of instances, a certain previous process of reduction is necessary. What are the
elements involved in that process of reduction? Clearly they are as much a part of general
reasoning as the dictum itself.
Another mode of considering the subject resolves all reasoning into an application of one
or other of the following canons, viz.
1. If two terms agree with one and the same third, they agree with each other.
2. If one term agrees, and another disagrees, with one and the same third, these two
disagree with each other.
But the application of these canons depends on mental acts equivalent to those which
are involved in the before-named process of reduction. We have to select individuals from
classes, to convert propositions, &c., before we can avail ourselves of their guidance. Any
account of the process of reasoning is insufficient, which does not represent, as well the laws
of the operation which the mind performs in that process, as the primary truths which it
recognizes and applies.
It is presumed that the laws in question are adequately represented by the fundamental
equations of the present Calculus. The proof of this will be found in its capability of expressing
propositions, and of exhibiting in the results of its processes, every result that may be arrived
at by ordinary reasoning.
With this preamble he proceeds to introduce the minus symbol to have a term for the class of
‘not-X’, namely 1 − x; later it appears (without explanation) in expressions such as
zy − y. On page 20 we have:
The class X and the class not-X together make the Universe. But the Universe is 1, and the
class X is determined by the symbol x, therefore the class not-X will be determined by the
symbol 1 − x.
This does not read as if Boole is making a definition of 1 − x, but rather as if Boole is proving that
the symbol for not-X has to be 1 − x.
To develop his system he first shows how to express the categorical Aristotelian propositions
as equations:
A   All Xs are Ys         xy = x, or x(1 − y) = 0
E   No Xs are Ys          xy = 0
I   Some Xs are Ys        v = xy
O   Some Xs are not Ys    v = x(1 − y)
Boole’s work seems to use the restricted semantics of symbols that we discussed in the Introduction,
namely that the symbols may not be interpreted as the empty class or the universe. Under the
restricted semantics the equation v = xy guarantees that X and Y do have elements in common,
namely those of V.
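The restricted class semantics can be modelled directly. The following sketch (mine, with an arbitrarily chosen three-element universe) interprets the symbols as subsets and checks that the A and E translations mean containment and disjointness, and that v = xy, with v nonempty under the restricted semantics, forces X and Y to share an element.

```python
from itertools import combinations

# Model classes as subsets of a small universe and check Boole's
# translations of the A, E, and I forms against their intended meaning.
U = {1, 2, 3}

def subsets(s):
    return [set(c) for r in range(len(s) + 1) for c in combinations(sorted(s), r)]

for X in subsets(U):
    for Y in subsets(U):
        # A: All Xs are Ys  <->  xy = x  <->  x(1 - y) = 0
        assert (X & Y == X) == ((X - Y) == set()) == X.issubset(Y)
        # E: No Xs are Ys  <->  xy = 0
        assert (X & Y == set()) == X.isdisjoint(Y)
        for V in subsets(U):
            # I: Some Xs are Ys  <->  v = xy, with v nonempty under
            # the restricted semantics
            if V and V == X & Y:
                assert X & Y != set()

print("the A, E, and I translations behave as intended on a 3-element universe")
```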
Regarding the strength of this equational foundation Boole says (page 22):
The above equations [for the AEIO forms] involve the complete theory of categorical Propositions, and so far as respects the employment of analysis for the deduction of logical inferences, nothing more can be desired.
Then he gives some equational consequences in two cases (I and O) that he will need to make his
analysis work properly (page 22):
But it may be satisfactory to notice some particular forms deducible from the third and
fourth equations [for the I and O forms] . . .
These are certain additional derived forms, given with the original equations in brackets.
Boole does not substantiate his claim that the above rules are complete. He only provides a few
lines of examples. However it is not difficult to check that he is indeed correct. We can write out
these rules completely, where we use □ and ■ to denote a symbol and its complement, and likewise
for ◇ and ◆. They can appear in any of the following four combinations, for any two symbols V1
and V2 for classes:
     □         ■         ◇         ◆
1.   V1        not-V1    V2        not-V2
2.   V1        not-V1    not-V2    V2
3.   not-V1    V1        V2        not-V2
4.   not-V1    V1        not-V2    V2
Boole’s transformations then become the following, where an arrow indicates the direction of the
transformation, and the rule used is written above the arrow:
All □s are ◇s   ←(1st)→   No □s are ◆s
Some □ is ◇     ←(1st)→   Some □ is not ◆
All □s are ◇s    —(2nd)→   Some □ is ◇
No □s are ◇s     —(2nd)→   Some □ is not ◇
Some □ is ◇     ←(3rd)→   Some ◇ is □
No □s are ◇s    ←(3rd)→   No ◇s are □s
By iterating these transformations one obtains the following diagram, and one can readily check
that it is closed under the transformations:
[Diagram: the twelve propositions in the symbols □, ■, ◇, ◆ reachable from ‘All □s are ◇s’, joined by arrows labelled 1st, 2nd, and 3rd according to the transformation used.]
All four forms (AEIO) of simple propositions appear in this diagram. One can see that 12
propositions (including the given proposition) can be derived from either an A or an E statement,
and 4 propositions from either an I or an O statement.
The rules of simple inference given by Boole yield valid inferences under the restricted semantics.
One can check that indeed these rules will, by iteration, yield all simple inferences for simple
propositions.
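The closure claim is easy to machine-check. Below is a sketch (my encoding, not Boole's notation) that represents a proposition as a (form, subject, predicate) triple, with terms carrying a flag for complementation, and iterates the three rules to a fixed point; it reproduces the counts of 12 and 4 given in the text.

```python
def neg(t):
    """Complement of a term: X <-> not-X."""
    return (t[0], not t[1])

def step(p):
    """Apply every applicable transformation rule once."""
    form, s, pr = p
    out = []
    if form == 'A':
        out += [('E', s, neg(pr)),   # 1st: All s are pr <-> No s are not-pr
                ('I', s, pr)]        # 2nd: All s are pr -> Some s is pr
    elif form == 'E':
        out += [('A', s, neg(pr)),   # 1st (reverse direction)
                ('O', s, pr),        # 2nd: No s are pr -> Some s is not pr
                ('E', pr, s)]        # 3rd: conversion
    elif form == 'I':
        out += [('O', s, neg(pr)),   # 1st
                ('I', pr, s)]        # 3rd: conversion
    elif form == 'O':
        out += [('I', s, neg(pr))]   # 1st (reverse direction)
    return out

def closure(p):
    """All propositions reachable from p by iterating the rules."""
    seen, todo = {p}, [p]
    while todo:
        for r in step(todo.pop()):
            if r not in seen:
                seen.add(r)
                todo.append(r)
    return seen

X, Y = ('X', True), ('Y', True)
print(len(closure(('A', X, Y))))   # 12 propositions from an A premise
print(len(closure(('I', X, Y))))   # 4 propositions from an I premise
```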
The chapter Of Syllogisms opens with the general strategy statement (pages 31–32):
The equation by which we express any Proposition concerning the classes X and Y, is an
equation between the symbols x and y, and the equation by which we express any Proposition
concerning the classes Y and Z, is an equation between the symbols y and z. If from two
such equations we eliminate y, the result, if it do not vanish, will be an equation between x
and z, and will be interpretable into a Proposition concerning the classes X and Z. And it
will then constitute the third member, or Conclusion, of a Syllogism, of which the two given
Propositions are the premises.
The method of analyzing the syllogisms is to translate the two premises into two equations, say
in the symbols x, y, z, with y representing the middle term, and then to write them as linear⁵
equations ay + b = 0 and cy + d = 0, where y does not appear in the coefficients. Now Boole uses
the fact that in ordinary algebra the resultant ad − bc = 0 is the most general equation that follows
from eliminating y. Thus the crucial equational derivation
ay + b = 0
cy + d = 0
ad − bc = 0
is claimed to contain the essence of simple syllogistic reasoning. For example, for the premises of
the AAA 1st Figure one has
(1 − z)y = 0
xy = x
(1 − z)x = 0
yielding ‘All Xs are Zs’, as desired. But the method as stated does not always work, for consider
the premises of the AAI 3rd Figure syllogism:
All Ys are Zs
All Ys are Xs
This leads to
(1 − z)y = 0
(1 − x)y = 0
0 = 0
and thus not to the desired conclusion. So Boole says that for the system ay = by = 0 one should
expand one of the equations. In the last example this would mean something like
(1 − z)y = 0
y = vx
(1 − z)vx = 0
⁵All elective terms ϕ(x, y, . . . ) can be reduced to polynomials with all the elective symbols to the first power, as the elective symbols satisfy x² = x.
62 4. GEORGE BOOLE (1815–1864)
From this we can deduce vx = vxz, and as vx means ‘some-X’, it follows that we have ‘Some X is
Z’.
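Both eliminations can be reproduced mechanically. The sketch below (using sympy as a modern convenience, nothing Boole had) computes the classical resultant ad − bc for the two pairs of premise equations, recovering the informative conclusion in the first case and the uninformative 0 = 0 in the second.

```python
from sympy import symbols, expand, factor

x, y, z = symbols('x y z')

def resultant(a, b, c, d):
    """Classical resultant ad - bc of the system ay + b = 0, cy + d = 0."""
    return expand(a * d - b * c)

# AAA, 1st Figure: (1 - z)y = 0 and xy - x = 0, so a = 1 - z, b = 0,
# c = x, d = -x.
r1 = resultant(1 - z, 0, x, -x)
print(factor(r1))        # x*(z - 1), i.e. (1 - z)x = 0: 'All Xs are Zs'

# AAI, 3rd Figure: (1 - z)y = 0 and (1 - x)y = 0, so b = d = 0.
r2 = resultant(1 - z, 0, 1 - x, 0)
print(r2)                # 0: the resultant collapses to 0 = 0
```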
With this setup Boole claims that the resultant always provides the most general conclusion,
and proceeds to list all of the forms and figures of the valid syllogisms, subdivided into four cases,
depending on whether or not an auxiliary symbol v is used, and how it is used. Although he only
illustrates the various cases with examples, indeed the method of Boole seems to work to derive
most of the valid syllogisms. As mentioned in the Introduction he mishandles the 2nd Figure when
the premises are AA. We will comment on this later.
There are two flaws in Boole’s analysis of simple syllogisms. In ordinary algebra the condition
ad − bc = 0 is indeed a necessary and sufficient condition for the linear system
ay + b = 0
cy + d = 0
to have a nonzero solution for y. Boole evidently assumes this will carry over to the algebra of
logic, where a nonzero y would correspond to a nonempty class.
But ad − bc = 0 is not the correct resultant in the algebra of logic. By 1854 Boole will have
discovered his Elimination Theorem that shows the resultant should be the equation⁶

(1)    (b² + d²)[(a + b)² + (c + d)²] = 0.
This, it turns out, is exactly what one needs for there to be a solution using modern semantics,
that is, including the possibility that y = 0 or y = 1. Additional conditions are needed to guarantee
that there is a solution that is not 0, or not 1.
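That equivalence can be sanity-checked numerically (my check, not an argument in the text): for integer coefficients, the left side of (1) is a product of sums of squares, and it vanishes exactly when the assignment y = 0 (which requires b = d = 0) or y = 1 (which requires a + b = c + d = 0) solves both equations.

```python
from itertools import product

# For small integer coefficients, Boole's 1854 resultant
# (b^2 + d^2)((a + b)^2 + (c + d)^2) = 0 holds exactly when y = 0 or
# y = 1 solves both ay + b = 0 and cy + d = 0.
for a, b, c, d in product(range(-2, 3), repeat=4):
    resultant_zero = (b**2 + d**2) * ((a + b)**2 + (c + d)**2) == 0
    solvable = any(a * y + b == 0 and c * y + d == 0 for y in (0, 1))
    assert resultant_zero == solvable

print("resultant (1) vanishes exactly when y = 0 or y = 1 is a solution")
```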
In the particular linear equations that Boole was considering, obtained from his translation of
the premises of a simple syllogism into equations, it turns out that the classical resultant ad−bc = 0
is actually a consequence of (1), but is not equivalent to it. However it was Boole’s good luck that,
concerning consequences that could be expressed as simple propositions, the two are equivalent.
But they are not strong enough to yield all the correct syllogisms, namely when the strongest
conclusion from universal premises is a particular proposition. Boole is able to carry out his
analysis of simple syllogisms only by declaring that the cases where the resultant does not work
require special attention.
The second flaw with Boole’s analysis of syllogisms is an aesthetic flaw rather than one of
mathematical substance. In the last example he derives the conclusion ‘Some X is Z’ from the
equation (1 − z)vx = 0 by recalling the side condition that the combination vx means ‘some X’,
that is, there is a side condition vx ≠ 0. Bringing the side condition into play seems to detract
from the purity of the equational algebraic treatment. Boole could have remedied this deficiency,
using the restricted semantics, by writing out translation rules such as
y = vx
(1 − z)vx = 0
yield ‘Some X is Z’
To keep the generality that he claims for his algebra of logic he would need to find an appropriate
collection of such translation rules.
We will not pursue this direction further as there is a simpler translation that Boole overlooked,
namely
‘Some X is Y’ corresponds to v = vxy.
⁶It is interesting to note that Boole does not inform the reader of his 1854 book of this 1847 error. When he analyzes the simple syllogisms in the 1854 book he first uses a more specific pair of equations, more closely tied to the form of the premises, and then he jumps over the elimination step that produces the resultant and immediately presents the reader with the solution (of the resultant equation) for the variable x.
Note that v = vxy is a consequence of Boole’s translation v = xy, and it still ensures that ‘Some X
is Y’ holds under the restricted semantics. When translating an argument into equational form one
needs to use a new symbol v for each premiss that is a particular proposition, and such a symbol
is not to be used again in any other translation of the premises. However when translating from
an equation back to ordinary language, the symbol v can be any symbol x or its contrary 1 − x.
These are simply conditions on translation, and not on the algebraic manipulations.
For example from x = xy one can derive x = xxy, and this corresponds to the simple inference:
‘All X is Y’ therefore ‘Some X is Y’. For the 2nd Figure AA premises ‘All X is Y’ and ‘All Z is Y’
we have the equations xy = x and zy = z. From these we can derive (1 − y)(1 − x)(1 − z) = 1 − y,
yielding ‘Some not-X is not-Y’. This is a valid syllogism, under the restricted semantics, that Boole
missed in 1847. We will say more about this variation on Boole’s method of handling particular
propositions in the next section (on Boole’s 1854 book) and in Appendix 4.
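The derived equation can be checked on 0/1 values, in the spirit of Boole's later Rule of 0 and 1 (the check is mine, not in the text):

```python
from itertools import product

# Whenever xy = x and zy = z hold for 0/1 values (the premises
# 'All X is Y' and 'All Z is Y'), the derived equation
# (1 - y)(1 - x)(1 - z) = 1 - y also holds.
for x, y, z in product([0, 1], repeat=3):
    if x * y == x and z * y == z:
        assert (1 - y) * (1 - x) * (1 - z) == 1 - y

print("the 2nd Figure AA derivation checks on all 0/1 values")
```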
After finishing the simple syllogisms Boole turns to the discussion of hypothetical syllogisms in
the chapter Of Hypotheticals and points out that they are of a totally different character than that
of the categorical syllogisms. First he states the constructive hypothetical
If A is B, then C is D
But A is B, therefore C is D.
and then the destructive hypothetical
If A is B, then C is D
But C is not D, therefore A is not B.
Thus what we have to consider is not objects and classes of objects, but the truths of
Propositions, namely, of those elementary Propositions which are embodied in the terms of
our hypothetical premises.
Having pointed out that hypothetical syllogisms indeed deal with a different species of proposition
he then proceeds to set up his previous equational system for the hypotheticals by using an appropriate interpretation of the symbols, an interpretation that we can find in Whately’s discussion of
how to transform a hypothetical into a categorical.
To the symbols X, Y, Z, representative of Propositions, we may appropriate the elective symbols x, y, z, in the following sense.
The hypothetical Universe, 1, shall comprehend all conceivable cases and conjunctures
of circumstances.
The elective symbol x attached to any subject expressive of such cases shall select those
cases in which the Proposition X is true, and similarly for Y and Z.
After explaining that 1 − x will refer to the situations where X is false, he gives a table to show
how two symbols, and their complements, can divide up the universe (page 50):
Thus if we associate the Propositions X and Y, the total number of conceivable cases will
be found as exhibited in the following scheme.
Cases.                          Elective expressions.
1st    X true, Y true    ...    xy
2nd    X true, Y false   ...    x(1 − y)
3rd    X false, Y true   ...    (1 − x)y
4th    X false, Y false  ...    (1 − x)(1 − y)
The cases that Boole is referring to correspond to the terms obtained by multiplying together
symbols and their complements such that each symbol in the situation being discussed is mentioned
exactly once in the term; e.g., x(1 − y)(1 − z) is such a term for the symbols x, y, z. Such terms
are later called constituents by Boole. xy refers to the collection of circumstances in which both X
and Y are true, etc. Then he goes on to explain the fact that these cases divide up the universe
corresponds to the sum of the expressions for the cases being 1.
Now Boole starts his propositional logic. The equation x = 1 will say that the proposition X is
(in all cases) true, and x = 0 that it is (in all cases) false. One does the same more generally (page 51):
And in every case, having determined the elective expression appropriate to a given Proposition, we assert the truth of that Proposition by equating the elective expression to unity, and its falsehood by equating the same expression to 0.
Then he gives examples:
X and Y are true xy = 1
X and Y are false (1 − x)(1 − y) = 1, or x + y − xy = 0
One or the other is true (1 − x)(1 − y) = 0, or x + y − xy = 1
Then, for the first time in the book, he breaks into the full generality that his system is capable
of (page 52):
Rule. Consider what are those distinct and mutually exclusive cases of which it is implied in
the statement of the given Proposition, that some one of them is true, and equate the sum
of their elective expressions to unity. This will give the equation of the given Proposition.
This is clearly a recipe for writing out what we call the disjunctive normal form.⁷
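The Rule amounts to summing the elective expressions of the cases in which the proposition holds. A small sketch (my example, with sympy used only to expand the sum) applies it to ‘One or the other is true’:

```python
from sympy import symbols, expand

x, y = symbols('x y')

# The mutually exclusive cases in which 'X is true or Y is true' holds,
# written as elective expressions (constituents).
cases = [x * y, x * (1 - y), (1 - x) * y]

lhs = expand(sum(cases))
print(lhs)   # expands to x + y - x*y, giving the equation x + y - xy = 1
```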
Here is another sample from this chapter (page 55). Consider the case that X and Y are
exclusive (i.e., disjoint), and we want the elective expression for the proposition that ‘One or the
other is true’. Then, as xy = 0 by assumption, the expression will be
x(1 − y) + (1 − x)y = 1
x² − 2xy + y² = 1
x − y = ±1
⁷In applications Boole does not include in the sum any of the cases that he knows from the premises to be 0.
and this represents the actual case; for, as when X is true or false, Y is respectively false or
true, we have
x = 1 or 0
y = 0 or 1
∴ x − y = 1 or −1
There will be no difficulty in the analysis of other cases.
Equipped with this equational propositional calculus he methodically works through the traditional
hypothetical syllogisms. At the end of the chapter on the hypothetical he summarizes the situation
with the two logics that he has been working with (page 59):
The distinction [between the two logics] is real and important. Every Proposition which
language can express may be represented by elective symbols, and the laws of combination
of those symbols are in all cases the same; but in one class of instances the symbols have
reference to collections of objects, in the other, to the truths of constituent Propositions.
This distinction seems to be overstated, since in the propositional logic he is letting an elective
symbol x refer to the class of situations in which the associated proposition X is true.
After finishing the chapter on the hypothetical, where Boole presents his first result for arbitrary
expressions, he turns to the general study of logical expressions in the chapter titled Properties
of Elective Functions. Elective functions are now called terms by logicians (see Appendix 1, for
example). In this book Boole uses lower case Greek letters ϕ, ψ, . . . for his elective functions, but
in his 1854 book he switches to Latin letters, with a preference for the letters f and V , with the
letter t used for constituents. In our commentary, and in the appendices, we will use lower case
Latin letters p, q, r, s, t for terms.
He opens with a remarkable leap into the use of power series:
Since elective symbols combine according to the laws of quantity, we may, by Maclaurin’s
theorem, expand a given function ϕ(x), in ascending powers of x, known cases of failure
excepted. Thus we have
ϕ(x) = ϕ(0) + ϕ′(0)x + (ϕ″(0)/1·2)x² + &c.,    (44)
Now x² = x, x³ = x, &c., whence
ϕ(x) = ϕ(0) + x{ϕ′(0) + ϕ″(0)/1·2 + &c.},    (45)
Now if in (44) we make x = 1, we have
ϕ(1) = ϕ(0) + ϕ′(0) + ϕ″(0)/1·2 + &c.,
whence
ϕ′(0) + ϕ″(0)/1·2 + ϕ‴(0)/1·2·3 + &c. = ϕ(1) − ϕ(0).
Substitute this value for the coefficient of x in the second member in (45), and we have
ϕ(x) = ϕ(0) + {ϕ(1) − ϕ(0)}x,    (46)
He gives the following footnote to this last equation:
Although this and the following theorems have only been proved for those forms of functions
which are expansible by Maclaurin’s theorem, they may be regarded as true for all forms
whatever; this will appear from the applications. The reason seems to be that, as it is
only through the one form of expansion that elective functions become interpretable, no
conflicting interpretation is possible.
Then he says, of the last equation above for ϕ(x) (page 61):
. . . which we shall also employ under the form
ϕ(x) = ϕ(1)x + ϕ(0)(1 − x), (47).
This form is indeed, without exception, correct, and would be an important step toward Boole’s Elimination Theorem of 1854.
From this start Boole introduces his general tool for analyzing arguments, the Expansion Theorem, which says: given any term p(x, y, . . . ) one can, in a simple fashion, express it as a sum of terms in which desired symbols occur only in the special form of constituents. For example:
p(x, y) = p(1, y)x + p(0, y)(1 − x), and
p(x, y) = p(1, 1)xy + p(1, 0)x(1 − y) + p(0, 1)(1 − x)y + p(0, 0)(1 − x)(1 − y).
Let us call the first expansion a partial expansion, and the second a complete expansion. Boole
calls the coefficients p(a, b) in the complete expansion the moduli of the term p. On page 61 we
have
Prop. 1. Any two functions ϕ(x), ψ(x), are equivalent, whose corresponding moduli are
equal.
Then he says on page 62 that this generalizes to the fact that any two terms are equivalent iff they
have the same moduli.
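The moduli criterion is easy to verify symbolically. The sketch below (my verification, with sympy and with the particular terms chosen arbitrarily) builds the complete expansion from the moduli and checks that two terms with equal moduli have identical expansions:

```python
from sympy import symbols, expand

x, y = symbols('x y')

def complete_expansion(p):
    """Boole's complete expansion of a term p(x, y) by its moduli."""
    return (p.subs({x: 1, y: 1}) * x * y
            + p.subs({x: 1, y: 0}) * x * (1 - y)
            + p.subs({x: 0, y: 1}) * (1 - x) * y
            + p.subs({x: 0, y: 0}) * (1 - x) * (1 - y))

p = x + y - 2*x*y          # moduli 0, 1, 1, 0
q = (x - y)**2             # the same moduli, so an equivalent elective term
assert expand(complete_expansion(p) - complete_expansion(q)) == 0
print("p and q have identical complete expansions")
```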
Boole notes that the constituents t1, . . . , tn for a given finite set of elective symbols satisfy the
following:
ti² = ti for all i,
ti tj = 0 for i ≠ j, and
1 = t1 + · · · + tn.
After the discussion of moduli and constituents Boole turns to his method of using complete
expansions to interpret elective equations (page 64):
We are now prepared to enter upon the question of the general interpretation of elective
equations. For this purpose we shall find the following Propositions of the greatest service.
Prop. 2. If the first member [the left side] of the general equation ϕ(xy . . . ) = 0, be
expanded in a series of terms, each of which is of the form at [t denotes a constituent], a
being a modulus of the given function, then for every numerical modulus a which does not
vanish, we shall have the equation
at = 0,
and the combined interpretations of these several equations will express the full significance
of the original equation.
After explaining why this is true he continues with:
. . . whence if a1 is a numerical constant which does not vanish,
t1 = 0
and similarly for all the moduli which do not vanish. And inasmuch as from these constituent
equations we can form the given equation, their interpretations will together express its entire
significance.
Since Boole can express the categorical propositions as equations of the form ϕ = 0 he can use
his Expansion Theorem to claim on page 77:
Thus all categorical Propositions are resolvable into a denial of the existence of certain
compound classes, no member of one such class being a member of another.
The compound classes referred to correspond to the constituents. Thus every categorical proposition
in the symbols x, y, . . . is equivalent to a collection of assertions that the classes corresponding to
certain constituents are empty.
The last topic of the chapter is Boole’s method of using Lagrange multipliers to handle a system
of equations.
Upon publication, in November of 1847, Boole’s claim that all the processes of common algebra
follow from the first two laws was immediately challenged in a letter from his friend Cayley. This
must have caught Boole off-guard. After all, for nearly a decade these laws had been used in
publications as a basis for work in the calculus of differential operators. Cayley pointed out that
xy = xz does not lead to y = z. This would lead to considerable soul searching by Boole, and a
new foundation for his second book on logic.
Two years after Boole published this treatise he secured a professorship at the new Queen’s
College, Cork, Ireland, thanks in part to the support of De Morgan. He lived in Cork for the next
(and last) 15 years of his life. After moving there he married Mary Everest, a niece of Colonel
Everest of the Indian Survey, after whom the famous mountain is named, and they had a family of
five daughters.
subject to laws founded upon that interpretation alone. But at the same time they exhibit
those laws as identical in form with the laws of the general symbols of algebra, with this
single addition, viz., that the symbols of Logic are further subject to a special law (Chap.
II.), to which the symbols of quantity, as such, are not subject.
The main achievement of the new book is the discovery of a procedure to eliminate symbols
from a set of premises and arrive at the most general conclusion. It is perhaps not surprising that
Boole lists this as the first requirement of a general method in logic (page 8):
1st. As the conclusion must express a relation among the whole or among a part of the
elements involved in the premises, it is requisite that we should possess the means of elimi-
nating those elements which we desire not to appear in the conclusion, and of determining
the whole amount of relation implied by the premises among the elements which we wish to
retain.
His system will be able to do this. Regarding the elimination problem he also says:
It proposes not merely the elimination of one middle term from two propositions, but the
elimination generally of middle terms from propositions, without regard to the number of
either of them . . .
In yet stronger form he says on page 10:
Given a set of premises expressing relations among certain elements, whether things or
propositions: required explicitly the whole relation consequent among any of those elements
under any proposed conditions, and in any proposed form. That this problem, under all its
aspects, is resolvable, will hereafter appear.
En route he takes a swat at Aristotle’s logic (page 10):
. . . I would remark:—1st. That syllogism, conversion, &c., are not the ultimate processes
of Logic.
Boole formulates the overview of his logic very simply (page 27):
Proposition I.
All the operations of Language, as an instrument of reasoning, may be conducted by a system
of signs composed of the following elements, viz.:
1st. Literal symbols, as x, y, &c., representing things as subjects of our conceptions.
2nd. Signs of operation, as +, −, ×, standing for those operations of the mind by which
the conceptions of things are combined or resolved so as to form new conceptions involving
the same elements.
3rd. The sign of identity, =.
And these symbols of Logic are in their use subject to definite laws, partly agreeing and
partly differing from the laws of the corresponding symbols in the science of Algebra.
Then he goes on to say that the logical symbols will now denote classes, and not selection
processes as in the 1847 book (page 28):
Let us then agree to represent the class of individuals to which a particular name or description
is applicable, by a single letter, as x.
He will at times refer to them as symbols, or literal symbols, or when distinguishing their intended
interpretation, as logical symbols, or numerical symbols. Then he explains the meaning of xy:
Let it further be agreed, that by the combination xy shall be represented that class of things
to which the names or descriptions represented by x and y are simultaneously applicable.
From this the laws xy = yx and x2 = x easily follow. Boole’s treatment of multiplication as
representing intersection is quite satisfactory. The same cannot be said for his handling of addition.
The next topic is + (which was barely defined in the 1847 text) on page 32:
. . . this apparent failure of correspondency between process and interpretation does not
manifest itself in the ordinary applications of human reason.
He continues to elaborate on these objections:
There are perhaps many who would be disposed to extend the same principle to the general
use of symbolical language as an instrument of reasoning. It might be argued, that as the
laws or axioms which govern the use of symbols are established upon an investigation of those
cases only in which interpretation is possible, we have no right to extend their application
to other cases in which interpretation is impossible or doubtful, even though (as should be
admitted) such application is employed in the intermediate steps of the demonstration only.
Finally he is ready to present his position on this issue, to conclude the till now cautious preparation
of the defense of the correctness of his methods. In a single paragraph his defense suddenly turns
into wishful thinking (pages 67–68):
But the objection itself is fallacious. Whatever our à priori anticipations might be, it is an
unquestionable fact that the validity of a conclusion arrived at by any symbolical process
of reasoning, does not depend upon our ability to interpret the formal results which have
presented themselves in different stages of the investigation.
This defense of Boole is no doubt influenced by the success of symbolic adventures in the development of calculus during the preceding two centuries. Continuing, Boole introduces the following
hopeless principle for evaluating symbolic methods (page 69):
A single example of reasoning, in which symbols are employed in obedience to laws founded
upon their interpretation, but without any sustained reference to that interpretation, the
chain of demonstration conducting us through intermediate steps which are not interpretable,
to a final result which is interpretable, seems not only to establish the validity of the particular
application, but to make known to us the general law manifested therein. No accumulation
of instances can properly add weight to such evidence.
His final trump card in his rebuttal of the objections comes from the reliable standby, complex
numbers (page 69):
The employment of the uninterpretable symbol √−1, in the intermediate processes of trigonometry, furnishes an illustration of what has been said.
This is the end of his justification of using meaningless expressions.
Then he turns to the alternative interpretation of his expressions, one which always makes
sense, namely one can let the literal symbols range over the numbers 0 and 1. But we have jumped
over some important topics. Returning to the discussion of the fundamental operations that Boole
has introduced, he continues with the discussion of the laws that govern them, giving (page 36):
But instead of dwelling upon particular cases, we may at once affirm the general axioms:—
1st. If equal things are added to equal things, the wholes are equal.
2nd. If equal things are taken from equal things, the remainders are equal.
And it hence appears that we may add or subtract equations, and employ the rule of
transposition above given just as in common algebra.
Except for the reference to the role of transposition⁸ this is just another version of his ‘single axiom’
of 1847, a form of the modern Replacement Rule.
For the basic laws he has expanded the three from 1847 to seven:
⁸Transposition means moving a term from one side of an equation to the other, accompanied by a change in sign of the term.
xy = yx
x2 = x
x+y = y+x
z(x + y) = zx + zy
x(1 − x) = 0
x−y = −y + x
z(x − y) = zx − zy
He returns to the idempotent law x2 = x, and notes that among the numbers there are only
two that satisfy it, namely 0 and 1. On page 37 he says:
Hence, instead of determining the measure of formal agreement of the symbols of Logic
with those of Number generally, it is more immediately suggested to us to compare them
with symbols of quantity admitting only of the values 0 and 1. Let us conceive, then, of an
Algebra in which the symbols x, y, z, &c. admit indifferently of the values 0 and 1, and of
these values alone.
One is tempted to think Boole is speaking of a two-element algebra, perhaps a Boolean algebra
or a Boolean ring. This is by no means the case. He is limiting the symbols to take on the values
0 and 1, but he means for the operations to be calculated as usual, in the ordinary number system.
This now leads to one of the most powerful principles in Boole’s logic, and one for which no
justification whatsoever is given. It was not mentioned in the 1847 book. We will call this Boole’s
Rule of 0 and 1 (pages 37-38):
The laws, the axioms, and the processes, of such an Algebra will be identical in their whole
extent with the laws, the axioms, and the processes of an Algebra of Logic. Difference of
interpretation will alone divide them. Upon this principle the method of the following work
is established.
This is a very powerful principle as stated, and was never used with full force by Boole. For
it implies that to check the validity of an argument of logic that has been translated into the
equational form
p1(x, y, . . . ) = 0, . . . , pk(x, y, . . . ) = 0 ∴ p(x, y, . . . ) = 0
it is necessary and sufficient to verify that the argument is correct in the ordinary numbers for each
assignment of 0s and 1s to the xi . This is very similar to the modern use of truth tables.
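The Rule of 0 and 1 can be rendered as a validity test in a few lines (my rendering; Boole of course never mechanized it): an equational argument is valid exactly when every 0/1 assignment that makes all the premises 0, computing in ordinary arithmetic, also makes the conclusion 0.

```python
from itertools import product

def valid(premises, conclusion, nvars):
    """Check an argument p1 = 0, ..., pk = 0, therefore p = 0 by
    evaluating in ordinary arithmetic over all 0/1 assignments."""
    for vals in product([0, 1], repeat=nvars):
        if all(p(*vals) == 0 for p in premises):
            if conclusion(*vals) != 0:
                return False
    return True

# Barbara: All X are Y, All Y are Z, therefore All X are Z.
premises   = [lambda x, y, z: x * (1 - y), lambda x, y, z: y * (1 - z)]
conclusion = lambda x, y, z: x * (1 - z)
print(valid(premises, conclusion, 3))   # True

# 'All Z are X' does not follow from the same premises.
bad = lambda x, y, z: z * (1 - x)
print(valid(premises, bad, 3))          # False
```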
Boole uses this principle for two facts in his 1854 book:
• The Expansion Theorem follows because one can easily check that the 0,1 assignments make
the two sides equal (in the ordinary numbers). This Theorem had been proved in the 1847
book by using a Maclaurin expansion. In this book the Maclaurin expansion proof survives in
a footnote, with the explanation that it applied only to those terms that admitted a Maclaurin
expansion.
• If at = 0, with a being a nonzero number and with t being a constituent, then t = 0 fol-
lows. This is used by Boole to show that any equation can be interpreted as a collection of
constituents set equal to zero.
Once Boole has adopted this elegant principle the possibility of using Boolean algebra or Boolean
rings is excluded because the Rule of 0 and 1 gives: x + x = 0 implies x = 0; and x + x = x implies
x = 0. Boole seems to have been blinded by the success of this rule, a rule that tied logic to ordinary
2. THE LAWS OF THOUGHT (1854) 73
arithmetic, and thus failed to come up with a simpler algebra of logic that would be developed by
his successors.
Boole says that, when given equations obtained from logic, one can view them as numerical
equations for which all the usual algebraic steps make sense, and at the end of the derivation one
can switch back to the logical interpretation (page 70):
We may in fact lay aside the logical interpretation of the symbols in the given equation;
convert them into quantitative symbols, susceptible only of the values 0 and 1; perform
upon them as such all the requisite processes of solution; and finally restore them to their
logical interpretation. And this is the mode of procedure which will actually be adopted,
... .
With the operations established, Boole turns on page 47 to the use of 1 for the Universe and 0
for Nothing, now treating Nothing as a class. He had already decided that his universe of discourse
would be the “actual universe” (page 44). We have a curious remark in a footnote on page 50,
where Boole is talking about the fact that the equation x^3 = x does not have any interpretation in
logic because it factors as x(1 − x)(1 + x) = 0; and the term 1 + x
. . . is not interpretable, because we cannot conceive of the addition of any class x to the
universe 1 . . .
This suggests that one reason Boole did not discover a natural interpretation of x + y for all classes
x and y was that he fell victim to his choice of words for the operation, namely addition, which is
in harmony with his notion of aggregating. These words suggest that whenever + has meaning it
should allow some increase in the class, some augmentation of either class being added. Since 1
cannot be increased, Boole was at a loss for an interpretation.
In Chapter IV, Divisions of Propositions, Boole explains how to translate the categorical propo-
sitions into equations. He chooses equations from his 1847 work, but not the original ones.
A    All Xs are Ys          x = vy           xy = x
E    No Xs are Ys           x = v(1 − y)     xy = 0
I    Some Xs are Ys         vx = vy          v = xy
O    Some Xs are not Ys     vx = v(1 − y)    v = x(1 − y)
The symbol v is to carry the meaning of ‘some’. The v in the universal propositions is treated like
any other symbol, and one can derive the usual x(1 − y) = 0 from the form x = vy by eliminating
v, so x = vy yields at least as much information as in the original 1847 version. One does not
derive anything that is incorrect from this translation of the universal propositions, using either
the modern or restricted semantics.
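The claim that x = vy yields the 1847 equation x(1 − y) = 0 by eliminating v can be verified mechanically. A small Python sketch (our encoding, not Boole's) applies the elimination rule he presents in Chapter VII, substituting 0 and 1 for v and multiplying, and compares the result with x(1 − y) on 0/1 values:

```python
from itertools import product

# Boole's 1854 form of 'All Xs are Ys': x = vy, i.e. f = x - v*y = 0.
f = lambda x, y, v: x - v * y

# Eliminate v by substituting 0 and 1 for it and multiplying.
eliminated = lambda x, y: f(x, y, 0) * f(x, y, 1)

# On 0/1 values the result agrees with x(1 - y), the 1847 equation.
for x, y in product([0, 1], repeat=2):
    assert eliminated(x, y) == x * (1 - y)
```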
In the case of the particular propositions the symbol v is:
. . . the symbol of a class indefinite in all respects but this, that it contains some individuals
of the class to whose expression it is prefixed . . .
As mentioned in the Introduction, we will refer to this as the side condition on v. In Appendix 4
we will show that Boole could have replaced ‘vx = vy + side condition on v’ with the much simpler
v = vxy. We need a bit of caution here when using the modern semantics.
74 4. GEORGE BOOLE (1815–1864)
Peirce (1880 [16], page XX) and later Schröder (1891, [18], page XX) will blast Boole for his
handling of the particular categorical statements. Both will claim that it cannot be done with
equations, and Schröder gives a proof of this fact. As Schröder notes in 1891, Vol. II of [18], page
XX, the side condition can be simply formulated as vx ≠ 0, and in the particular affirmative case
Schröder replaces the combination ‘vx = vy and vx ≠ 0’ with the simpler xy ≠ 0, eliminating any
need for a parameter.
In Appendix 4 we show that Boole was, with minor changes, correct. For either of the semantics
we have discussed, modern or restricted, we can find simple equational techniques, using parameters,
that capture the essence of particular propositions. The introduction of parameters for the universal
propositions is wholly unnecessary.
In Chapter V, Principles of Symbolical Reasoning, Boole applies the Rule of 0 and 1 to justify the
expansion (or the development) of a term. For example if p(x) is a term in a single literal symbol
we must have
p(x) = p(1)x + p(0)(1 − x)
because the two sides agree numerically when x takes on the values 0 or 1. The same argument
works for a term in several literal symbols. As before the constituents are the terms that are of the
form of a product x1^b1 · · · xn^bn , where each xi^bi is either xi or 1 − xi . Thus the Expansion Theorem
allows one to express any term p(x1 , . . . , xn ) as Σ ai ti , a sum of integer coefficients times constituents.
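The n-variable Expansion Theorem is easy to realize in code: by the Rule of 0 and 1, the coefficient of each constituent is just the value of the term at the corresponding 0/1 tuple. A Python sketch (our encoding), applied to Boole's example x − y:

```python
from itertools import product

def expansion(p, n):
    """Coefficients of Boole's Expansion Theorem: the coefficient of
    the constituent determined by a 0/1 tuple (b1, ..., bn) is just
    p(b1, ..., bn), by the Rule of 0 and 1."""
    return {bits: p(*bits) for bits in product([0, 1], repeat=n)}

def constituent(bits, xs):
    # The product whose i-th factor is xi if bi = 1 and 1 - xi if bi = 0.
    t = 1
    for b, x in zip(bits, xs):
        t *= x if b else 1 - x
    return t

# Boole's example: x - y expands as x(1 - y) - y(1 - x).
p = lambda x, y: x - y
coeffs = expansion(p, 2)
print(coeffs)  # {(0, 0): 0, (0, 1): -1, (1, 0): 1, (1, 1): 0}

# The expansion agrees with p on every 0/1 assignment.
for xs in product([0, 1], repeat=2):
    assert p(*xs) == sum(a * constituent(bits, xs) for bits, a in coeffs.items())
```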
One of the expansions he gives is
x − y = x(1 − y) − y(1 − x)
and he remarks that this is generally uninterpretable in logic as (page 77):
We cannot take, in thought, from the class of things which are x’s and not y’s, the class of
things which are y’s and not x’s, because the latter class is not contained in the former.
Again Boole is a victim of his description of the operation—rather than try to extend the operations
so that the laws are preserved he is concerned about an extension preserving the particular meaning
he has attached to the symbol.
Nonetheless he says that if one derives an equation x − y = 0 then it has a perfectly legitimate
interpretation, namely that the constituents in the expansion must be set to 0, i.e., one has x(1−y) =
0 and (1 − x)y = 0. He says in summary (page 78):
. . . though functions do not necessarily become interpretable upon development, yet equa-
tions are always reducible by this process to interpretable forms.
He states the general theorem on the interpretation of an equation V = 0 in Chapter VI, Of
Interpretation, on page 83:
Rule.—Develop the function V , and equate to 0 every constituent whose coefficient does
not vanish. The interpretation of these results collectively will constitute the interpretation
of the given equation.
The main new result that Boole introduces in 1854 is the Elimination Theorem in Chapter
VII, On Elimination, which shows how to find the most general equation (in some of the symbols)
that can be obtained from a given equation. Later Schröder would call this Boole’s main theorem.
As an example suppose one is given an equation
p(x, y, z) = 0
in three symbols. Then the most general conclusion involving only the symbols x, z would be
p(x, 0, z)p(x, 1, z) = 0
The rule is simply to put 0’s and 1’s in place of the symbols you want to eliminate, in all possible
ways, and then multiply these together and set the result equal to 0. If we have, for example,
p(x, y, z) = x(1 − y) + y(1 − z),
then this method would give
(x(1 − 0) + 0(1 − z))(x(1 − 1) + 1(1 − z)) = 0,
which simplifies to
x(1 − z) = 0,
and which we would interpret as ‘All X is Z’.
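The elimination of a single symbol can be sketched in Python (our encoding). Here the rule is applied to the syllogism premises ‘All X is Y’ and ‘All Y is Z’ combined into the single equation x(1 − y) + y(1 − z) = 0:

```python
from itertools import product

def eliminate_y(p):
    """Boole's Elimination Theorem for one symbol: the most general
    consequence of p(x, y, z) = 0 that omits y is p(x, 0, z)p(x, 1, z) = 0."""
    return lambda x, z: p(x, 0, z) * p(x, 1, z)

# 'All X is Y' and 'All Y is Z' combined: x(1 - y) + y(1 - z) = 0.
p = lambda x, y, z: x * (1 - y) + y * (1 - z)
q = eliminate_y(p)

# The result agrees with x(1 - z) on 0/1 values: 'All X is Z'.
for x, z in product([0, 1], repeat=2):
    assert q(x, z) == x * (1 - z)
```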
Logical arguments usually have several premises, leading to several equations V1 = 0, . . . ,
Vm = 0. Chapter VIII, On the Reduction of Systems of Propositions, is devoted to methods for
replacing such a system of equations by a single equation. For then one can apply the Elimination
Theorem. Boole’s favorite method is to use the single equation V1^2 + · · · + Vm^2 = 0. This leads to a
number of examples of terms that are not interpretable in Boole’s system.
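The reason squares are needed: an individual Vi may take a negative value at a 0/1 assignment, so the plain sum V1 + · · · + Vm could vanish through cancellation, whereas a sum of squares of integers vanishes only when every term does. A quick numeric check in Python (the particular terms here are our own choice):

```python
from itertools import product

V1 = lambda x, y, z: x - x * y    # x(1 - y), 'All X is Y'
V2 = lambda x, y, z: y - z        # takes the value -1 at y = 0, z = 1

combined = lambda x, y, z: V1(x, y, z) ** 2 + V2(x, y, z) ** 2

# V1 = 0 and V2 = 0 together hold exactly when V1^2 + V2^2 = 0.
for xs in product([0, 1], repeat=3):
    assert (V1(*xs) == 0 and V2(*xs) == 0) == (combined(*xs) == 0)
```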
Boole’s work with division is essentially as in 1847, noting that in the expansion of the right
hand side of an equation like
w = p(x1 , . . . , xn )/q(x1 , . . . , xn )
as a linear combination of constituents, Σ ai ti , the ai can be one of four kinds: 0, 1, 0/0, and all
others. As before a coefficient of 0 means the constituent is omitted, a coefficient of 1 means the
constituent is to be retained, a coefficient of 0/0 is to be replaced by a new symbol, and for the ‘all
others’ case he deletes ai ti from the expansion and sets ti = 0 as a side constraint.
Boole’s innovative approach to logic was not immediately appreciated. Whereas De Morgan’s
book gave thorough and clear explanations of material that was close to the traditional logic, Boole
gave a brief and dubious justification for his approach, and later authors, especially Jevons, would
describe Boole’s treatment as obscure. The one exception was De Morgan, who showed remarkable
foresight in his Budget of Paradoxes:
That the symbolic processes of algebra, invented as tools of numerical calculation, should
be competent to express every act of thought, and to furnish the grammar and dictionary of
an all-containing system of logic, would not have been believed until it was proved.
. . .
The unity of the forms of thought in all applications of reason, however remotely separated,
will one day be a matter of notoriety and common wonder; and Boole’s name will be re-
membered in connection with one of the most important steps towards the attainment of
this knowledge.
It is common to associate Boole’s name with the laws of Boolean algebra, but it seems
far more accurate to assign to him the near discovery of the laws of Boolean rings. Boole did
not discover the laws of Boolean algebra because he was not working with Boolean algebra. His
successors, especially Jevons, were responsible for this development. They could not make sense
of what Boole was doing so they approached it from another direction, using the union operation
described by De Morgan.
The book of 1854 emphasizes general results from the start, using the Rule of 0 and 1 to
establish the Expansion Theorem, and then making heavy use of constituents and of the mysterious
division. If one puts aside the steps involving division then all the ordinary algebraic manipulations
make perfectly good sense when working with the interpretation of + as the symmetric difference,
i.e., when working with the laws of Boolean rings.9 However the methods of reduction must be
considerably modified—see Appendix 1.
The natural extension of Boole’s + chosen by Jevons was not symmetric difference, but union.
Almost all the algebraic manipulations of Boole’s book are correct 10 when one uses union for +.
The problem with union is that some of the steps involving minus are not clear. If you want 1 − x
to denote the complement of x then what would −x mean? For example how would one interpret
the inference (on page 115) of −xy − 2x(1 − y) = 0 from xy − 2x = 0? If one takes −x as the
complement of x and x − y as denoting the elements of x not in y then x + (−y) is not in general
equal to x − y when working with union.
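The reason so many of Boole's manipulations survive either reading (spelled out in footnote 10 below) can be seen in two lines of Python set notation: union and symmetric difference coincide on disjoint classes, and the two theories share intersection as their multiplication.

```python
a, b = {1, 2}, {3, 4}        # disjoint classes
assert a | b == a ^ b        # union and symmetric difference agree

a, c = {1, 2}, {2, 3}        # overlapping classes
assert a | c != a ^ c        # ... and disagree once the classes overlap
assert a & c == {2}          # multiplication (intersection) is common to both
```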
The Expansion and Elimination Theorems remain true as stated for Boolean rings, and a
modified version of the Reduction theorem holds. The Rule of 0 and 1 holds with the integers Z
replaced by the two-element Boolean ring Z2 . (See Appendix 1 on Boolean rings.)
The situation with Boolean algebra is different as the axioms are different. With an appropriate
change in the definition of a constituent the Expansion Theorem holds, essentially as stated. The
Reduction Theorem is simpler, and the Elimination Theorem is essentially the same. The Rule of
0 and 1 holds with the integers Z replaced by the two-element Boolean algebra B2. (See Appendix
2 on Boolean algebra.)
Furthermore there is a third interpretation (due to Hailperin, 1976) under which Boole’s results
hold, and it captures the true spirit of the algebra of Boole. We discuss this in Appendix 3.
The dislike of Boole’s successors (with the notable exception of John Venn) for uninterpretable
terms led them to use union for +. This led to a thorough overhaul of the algebraic treatment, in
many ways simplifying the presentation, and gave us the subject of Boolean algebra. A detailed
version of the algebra of logic using union for + would be presented by Schröder in 1890, in the
first volume of his Algebra der Logik,11 up to and including the Elimination Theorem, with one
exception. The main shortcoming of Schröder, and indeed everyone after Boole, was the complete
omission of Boole’s powerful Rule of 0 and 1.
After his 1854 book Boole returned to his original interests, and wrote successful textbooks on
differential equations and difference equations. Repeatedly during the 15 years in Cork he would
ask De Morgan if there were any possibility of obtaining an academic position in England, but
nothing seemed to turn up. His new approach to logic was only beginning to be absorbed by others
when unexpectedly, in the fall of 1864, he died—from a case of pneumonia that developed after
being caught in an autumn shower while walking to his lectures, and then lecturing in wet clothes.
9
Lewis’s 1918 A Survey of Symbolic Logic is rather confusing in his discussion of Boole’s +. After explaining that
Boole required x and y to represent disjoint classes in x + y he goes on to say (page 53):
x + y, then, symbolizes the class of things which are either members of x or members of y, but not of both.
If this means he thinks that Boole is working with the symmetric difference then Lewis has made a serious error.
10
The main reason that so much of the algebra of classes in Boole’s books is correct when viewed either as being
carried out in the theory of Boolean algebras (with + being union) or in the theory of Boolean rings (with + being
symmetric difference) is that he usually uses + between distinct constituents. The multiplications of Boolean algebras
and Boolean rings are the same (intersection), and union agrees with symmetric difference on disjoint classes.
11
Following ideas of Peirce from 1880, he takes ‘⊆’ as primitive, and ‘=’ as a defined relation. He carefully works
out the equational theory entirely as derived theorems, including proofs that equality is reflexive, symmetric and
transitive, and proving the replacement rule. The weak spot of Schröder’s work is that he does not have a formal
system to work with the primitive ‘⊆’.
CHAPTER 5

William Stanley Jevons (1835–1882)
Jevons had a passion for finding order in the world around him. Perhaps this was rooted in the
losses sustained in his family during his childhood in Liverpool, England. Only six of the eleven
children delivered by his mother survived. His mother died in 1845, and within a couple of years
his oldest brother, Roscoe, fell victim to a devastating mental illness,1 at the age of 18. His father’s
fortunes in the iron business deteriorated into bankruptcy in 1848.
At the age of 16 Jevons entered the University of London and studied for two years, with a
particular interest in science. And during this time he would take extended walks through London
to try to grasp the social structure of the city.
After the two years at the University of London he was offered a job in Sydney, Australia, as
assayer to the new Royal Mint. With the encouragement of his father he accepted and set sail,
first-class, on the three month voyage from Liverpool in the summer of 1854. Jevons had been in
Australia only one year when his father died suddenly during a visit to Pisa, Italy. During Jevons’
five year stay in Australia he took a deep interest in a wide range of scientific activities, keeping
detailed records of the weather, studying the economy, the geology, the geography, the flora and
the social structure.
He returned to England in 1859, taking a leisurely return through North America, where he
visited an older brother who was trying log cabin pioneer life in Minnesota (it didn’t last). Back
in the University of London he continued the work on his academic degree (that he had left off
five years before), concentrating on strengthening his scientific background, in particular the math-
ematics which he found difficult. He was convinced that mathematical skills were essential to a
better understanding of the economy. During his years at the University of London he was greatly
influenced by the vivacious and popular instructor Augustus De Morgan. In 1880 Jevons would
write in the preface to Studies in Deductive Logic:
There was never a greater teacher of mathematics than De Morgan; but from his earliest
essay on the Study of Mathematics to his very latest writings, he always insisted upon the
need of logic as well as purely mathematical training.
. . .
My general indebtedness, both to those writings [on logic] and to his own unrivalled teaching,
cannot be sufficiently acknowledged.
Jevons finished his M.A. in 1862. After an unsuccessful stint as a freelance writer in London
he took a post as a tutor in Manchester in 1863. One of his early notable writing successes at
Manchester was an article on the possibility that the British were squandering their economic
future by their high consumption of cheap coal.
In 1866 he was appointed Professor, and within five years he would be publishing the reader-
friendly books that would be used to educate nearly all British Empire students of elementary logic
and economics for half a century.
1
This was kept a family secret until 1955. Roscoe had to be cared for until his death in 1869. And a few
months after this W.S.’s affectionate younger sister, Henrietta, fell victim to delusions which overwhelmed her for
the remaining 40 years of her life.
78 5. WILLIAM STANLEY JEVONS (1835–1882)
In 1864 Jevons starts publishing works on logic that set out to modify the system of Boole so
that all definitions, laws, and inferences are transparent as regards meaning. His first contribution
is to replace Boole’s partially defined + with union. This leads to the Law of Unity A + A = A.
Without explanation he keeps only the constant 0; perhaps writing 1 + 1 = 1 would have been too
controversial. He discards Boole’s subtraction and division, the former because it is in conflict with
the Law of Unity. To handle complements he adopts De Morgan’s uppercase/lower case notation.
He emphasizes that his system has no connections with the algebra of number, that the in-
terpretations of his expressions are all simple, natural items in logic. In 1869 he changes the +
to ·|· to emphasize the non-numerical nature of the operation. His primary method of inference,
the method of “indirect inference”, is to make a list of all constituents involving the symbols of
the premises, eliminate those that contradict any premise, and use the remaining constituents to
derive conclusions. This is a tedious process, and between 1864 and 1869 Jevons had several ideas
to expedite the work, culminating in a “machine capable of reasoning”.
As a further effort to eliminate any mystery he attempts to give a complete list of the axioms
and rules of inference that are needed. If one combines his secondary remarks with his primary list
of these items, and if one ignores his requirements that the symbols A, etc., and their complements
a, etc., be non-zero (see item 7 of §116 of Pure Logic), then, with the exception of the associative
laws, he succeeds. This would make Jevons the first person to formulate an (essentially complete)
equational logic, and it would be for Boolean algebra.
Jevons’ main contribution to the equational approach to logic would be his interpretation of
+ and the explicit presentation of the axioms and rules. His method of indirect inference would
be considered primitive and tedious by his successors. And they would find fault with the lack
of symbols for the universe and the operation of complementation, and the failure to carry over
some theorems of Boole that were true in his system, in particular the expansion and elimination
theorems. As mentioned at the end of the last chapter, all of Boole’s successors, including Jevons,
avoided mentioning the Rule of 0 and 1.
Jevons found teaching quite stressful, and by 1876 he was able to move to the University of
London with a post requiring little lecturing, and in 1881 he even resigned from this to devote
himself to writing, especially on his master treatise on economics. But the next summer, on a
holiday outing, he drowned at the age of 46. There was some speculation that his poor health had
contributed to this accident.
Jevons wants to divorce the algebra of logic from the marriage with the ordinary number system
that had been forged by Boole. Hence the ‘Pure’ in the title Pure Logic. And, to reinforce the point
that logic can be treated without reference to quantity, Jevons differs with Boole on how to interpret
the symbols. He maintains that they should be taken as intensive rather than extensive, running
counter to British tradition. This choice of intensive rather than extensive mode explains the second
part of his title,3 for in the opening paragraph of the book he says:
It is the purpose of this work to show that Logic assumes a new degree of simplicity, precision,
generality, and power, when comparison in quality is treated apart from any reference to
quantity.
He says that the intensive mode is more in touch with the common language. However his system
is just as easily understood in the extensive mode, and we will view it in that manner. Admitting
quantity in the context of classes in no way compromises Jevons’ program to avoid tying logic to
the concept of quantity as represented by numbers. And even Jevons will eventually slip into using
the extensive mode.
With this understanding we first summarize the definitions, axioms, and rules of inference
Jevons uses, followed by a discussion of these items. In each case we will state a name if Jevons
provides one. First we give the definitions:
Expression Meaning Name Page
AB intersection combination 15
A+B union plural term 24
not-A complement of A contrary of A 30
a complement of A contrary of A 30
0 empty class excluded from thought 31
Fig. 42 Definitions of Jevons
The members of a plural term are called alternatives. Jevons uses the operator “not-” quite spar-
ingly, preferring the De Morgan notation. The clumsiest feature of his definitions is his introduction
of 0, where he seems to be unsure of what is going on—we will discuss this below.
Next we look at the equations that Jevons says hold in general. We will call them his axioms:
The five axioms that have names are the ones that he puts in his summary as the primary ones,
the others presumed to be derivable.
Finally we want to list the five rules of inference that Jevons uses in this work. We will label
them as R1–R5. A modern name for each rule is given in parentheses. Jevons’ names for R2, R4,
and R5, are given in bold type. (He does not have names for the others.)
R4. (Second version of: Replacement for Intersection) [Law of Same Parts and Wholes] (page
17)
45. Same terms being combined with both members of a premise, the combinations may be
stated as same in a new proposition which will be true with the premise.
. . .
Thus, from A = B we may infer AC = BC by combining C with each of A and B.
The first two rules that have bold-faced names are the ones Jevons considers primary, and the
others derivable.
The modern version of the Replacement Rule says that if one is given an equation p = q and
p occurs as a subterm of s, something we can describe by writing s[p], then one can replace the
specified occurrence of p by q and obtain the equation s[p] = s[q].
The modern meaning of the Substitution Rule in equational logic is that if one is given an
equation r(x1 , . . . , xn ) = s(x1 , . . . , xn ) and terms t1 , . . . , tn , then one can infer r(t1 , . . . , tn ) =
1. PURE LOGIC (1864) 81
s(t1 , . . . , tn ), the equation that results by the simultaneous and uniform substitution of the ti for
the xi .
Jevons summarizes the above:
109. The following are the chief laws or conditions of logic:—
Condition or postulate. The meaning of a term must be same throughout any piece of
reasoning; so that A = A, B = B, and so on.
Law of Sameness.
A = B = C; hence A = C
Law of Simplicity.
AA = A, BBB = B, and so on.
Law of Contradiction.
Aa = 0, ABb = 0, and so on.
Law of Duality.
A = A(B + b) = AB + Ab
A = A(B + b)(C + c)
= ABC + ABc + AbC + Abc, and so on.
It seems likely that these are the primary and sufficient laws of thought, and others only corol-
laries of them.
. . .
The Laws of Simplicity, Unity, Contradiction, and Duality furnish the universal premises
of reasoning. The Law of Sameness is of altogether a higher order, involving inference, or
the Judgement of Judgements.
His treatment of the equivalence properties of equality, i.e., the reflexive, symmetric, and
transitive properties, seems curious from a modern viewpoint. In §24 he refers to A = A as a
“useless Identical proposition.” In §10 he states that A = B is the same as B = A, but he does not
seem to realize that this is a rule of inference as it is missing from his summary given above. And
he attaches an extraordinary importance to the transitive property. In the context of discussing
the equational form of propositions versus Aristotle’s forms he says:
138. . . . that reasoning from same to same things may be detected as the fundamental
principle of all the sciences, we need have no hesitation in treating the equation as the true
proposition, and Aristotle’s form as an imperfect proposition.
It is then the Law of Sameness, not the dictum of Aristotle, which governs reason.
Conspicuously missing from the above summary are the commutative laws, distributive
laws, associative laws, any version of De Morgan’s laws, as well as the absorption4 laws
A + AB = A and A(A + B) = A. Also his summary fails to fully include the replacement rule
that he had formulated quite clearly:
For any term, or part-term, in one premise, may be substituted its expression in other terms.
We see part of this in his Law of Same Parts and Wholes, but he is missing the important rules:
A = B implies A + C = B + C, and A = B implies not-A = not-B. By 1869 he will realize the central
4
In 1880 Peirce attributes both absorption laws to Grassmann and Schröder, without any reference to the earlier
work of Jevons.
role played by replacement, as expressed in his substitution rule, and will claim that this rule is
the source of all correct reasoning.
Jevons’ treatment of 0 also leaves much to be desired (page 9):
The meaning of 0, whatever it exactly be, may also be expressed in words.
. . .
92. Let us denote by the term or mark 0, combined with any term, that this is contradictory,
and thus excluded from thought. Then Aa = Aa.0, Bb = Bb.0, and so on. For brevity we
may write Aa = 0, Bb = 0. Such propositions are tacit premises of all reasoning.
. . .
94. The term 0, meaning excluded from thought, obeys the laws of terms.
0.0 = 0    0 + 0 = 0,
otherwise expressed:—What is excluded and excluded is excluded—What is excluded or
excluded is excluded.
Missing from Jevons’ system is a formulation of the laws
A.0 = 0    A + 0 = A
although one does see them used in practice. Regarding the latter there is a curious argument:
96. In a plural term of which not all the alternatives are contradictory, the contradictory
alternative or alternatives must be excluded from notice.
If for instance A = 0 + B, we may infer A = B, because A if it be 0 is excluded; and if
it be such as we can desire knowledge of, it must be the other alternative B.
Let us look at the indirect inference method of Jevons to determine the consequences of a
given set of equations. Suppose we are given k equations, say p1 = q1 , . . . , pk = qk , in the symbols
A,B,C. The first step is to write out the complete list of constituents5 in the symbols A,B,C:
ABC aBC
ABc aBc
AbC abC
Abc abc
Then one examines all equations tpj = tqj resulting from multiplying the premises pj = qj by the
constituents t, i.e., with our example of 3 symbols one is to examine the list of 8k equations
ABCp1 = ABCq1
ABCp2 = ABCq2
. . .
abcpk = abcqk
In each of these equations tpj = tqj one has each of the terms tpj and tqj simplifying to one of two
possibilities, namely 0 or t. Thus each of these equations simplifies to one of four forms, labeled as
follows:
t = t included subject of the jth equation
0 = 0 excluded subject of the jth equation
t = 0 contradiction to the jth equation
0 = t contradiction to the jth equation
5
Jevons did not use the name constituents, or any other special name, for these combinations.
Each t that appears in a contradiction is to be struck off the list of constituents. Once this is finished
for all equations and all constituents then we say the following for each constituent t remaining on
the list:
t is an included subject if t is an included subject for some pj = qj
t is an excluded subject if t is an excluded subject for all pj = qj
Let us suppose that the above procedure has been carried out and that the remaining con-
stituents are t1 , . . . , tn . We will refer to this as the shortened list of constituents. Then to find out
what the premises say about a given term s, assumed to be a symbol or a product of symbols, let
ti1 , . . . , tim be the constituents in the shortened list that include all the symbols of s. Then
s = ti1 + · · · + tim
follows from the premises.
Jevons also has rules for simplifying the right hand side of this equation, the first being to use
the Law of Duality to combine terms, and the second being to note that if some part si of a ti on the
right occurs in only the one term ti in the shortened list, then one can replace ti by si .
There are a number of examples that Jevons looks at, the first being to consider the conse-
quences of the single premise A = BC. In the following table we give the details of striking off the
constituents:
Multiply A = BC by a constituent and simplify:
ABC    ABC·A = ABC·BC    ABC = ABC
ABc    ABc·A = ABc·BC    ABc = 0
AbC    AbC·A = AbC·BC    AbC = 0
Abc    Abc·A = Abc·BC    Abc = 0
aBC    aBC·A = aBC·BC    0 = aBC
aBc    aBc·A = aBc·BC    0 = 0
abC    abC·A = abC·BC    0 = 0
abc    abc·A = abc·BC    0 = 0
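The strike-off procedure is easy to mechanize. A Python sketch (our encoding: constituents as 0/1 tuples, so that multiplying a premise by a constituent amounts to evaluating each side at the corresponding point) recovers the surviving constituents for the premise A = BC:

```python
from itertools import product

def indirect_inference(premises, symbols):
    """Jevons' indirect method: list all constituents in the given
    symbols and strike off each one that contradicts some premise.
    A constituent is encoded as a 0/1 tuple; multiplying a premise
    p = q by it reduces each side to 0 or the constituent itself."""
    kept = []
    for bits in product([0, 1], repeat=len(symbols)):
        env = dict(zip(symbols, bits))
        if all(p(env) == q(env) for p, q in premises):
            kept.append(bits)
    return kept

def name(bits):
    # aBc-style names: uppercase for the symbol, lowercase for its contrary.
    return ''.join(s if b else s.lower() for s, b in zip('ABC', bits))

# The premise A = BC.
premises = [(lambda e: e['A'], lambda e: e['B'] * e['C'])]
kept = indirect_inference(premises, ['A', 'B', 'C'])
print([name(t) for t in kept])  # ['abc', 'abC', 'aBc', 'ABC']
```

The four survivors are exactly the constituents not struck off above, and since ABC is the only survivor containing A, the premises give A = ABC.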
173. Compared with Professor Boole’s system, in its mathematical dress, this system shows
the following advantages:—
1. Every process is of self-evident nature and force, and governed by laws as simple and
primary as those of Euclid’s axioms.
2. The process is infallible, and gives no uninterpretable or anomalous results.
3. The inferences may be drawn with far less labour than in Professor Boole’s system,
which generally requires a separate computation and development for each inference.
174. So long as Professor Boole’s system of mathematical logic was capable of giving results
beyond the power of any other system, it had in this fact an impregnable stronghold. Those
who were not prepared to draw the same inferences in some other manner could not quarrel
with the manner of Professor Boole. But if it be true that the system of the foregoing
chapters is of equal power with Professor Boole’s system, the case is altered. There are
now two systems of notation, giving the same formal results, one of which gives them with
self-evident force and meaning, the other by dark and symbolic processes. The burden of
proof is shifted, and it must be for the author or supporters of the dark system to show that
it is in some way superior to the evident system.
175. It is not to be denied that Boole’s system is consistent and perfect within itself. It is,
perhaps, one of the most marvellous and admirable pieces of reasoning ever put together.
Indeed, if . . . the chief excellence of a system is in being reasoned and consistent within
itself, then Professor Boole’s is nearly or quite the most perfect system ever struck out by a
single writer.
176. . . . Professor Boole’s system is Pure Logic fettered with a condition which converts it
from a purely logical into a numerical system.
After this comparison of his system with Boole’s he goes on to list four objections to Boole’s
system:
First Objection
177. Boole’s symbols are essentially different from the names or symbols of common
discourse—his logic is not the logic of common thought.
Jevons is mainly objecting to Boole’s use of + to express the connective “or” for disjoint classes,
whereas the common use of “or” would allow overlapping classes. Jevons strongly feels that the
inclusive definition of “or” is the correct interpretation of + in logic.
Second Objection
184. There are no such operations as addition and subtraction in Pure Logic.
This seems to be primarily an objection to Boole’s use of subtraction, e.g., he says that from
A + B + C = A + D + E in logic one cannot subtract A and conclude B + C = D + E. The problem
is simply that Jevons assumes his interpretation of + is the one and only correct one. One can only
wonder how he would have reacted to being shown the symmetric difference. But the symmetric
difference did not come to the attention of logicians until the 1900s.
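Had the symmetric difference been available, the complaint loses its force: under that reading of +, subtraction-style cancellation is perfectly valid. A small illustration in Python, with sets standing for classes and `^` for symmetric difference (the particular classes are arbitrary):

```python
# Read + as symmetric difference (Python's ^ on sets).
A, B, C = {1, 2}, {3, 4}, {2, 5}
D, E = {3, 5}, {2, 4}

# Here A + B + C = A + D + E holds ...
assert A ^ B ^ C == A ^ D ^ E
# ... and "subtracting" A on both sides is legitimate, since
# X == A ^ (A ^ X) for any X:
assert B ^ C == D ^ E

# The price: Jevons' Law of Unity A + A = A fails, since
assert A ^ A == set()
```

The last line also anticipates the third objection below: the symmetric-difference reading validates cancellation precisely because it gives up A + A = A.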
Third Objection
193. My third objection to Professor Boole’s system is that it is inconsistent with the
self-evident law of thought, the Law of Unity (A + A = A).
. . . It is surely self-evident, however, that x+x is equivalent to x alone . . . it is apparent
that the process of subtraction in logic is inconsistent with the self-evident Law of Unity.
Again we see that Jevons is guilty of only being able to imagine one interpretation of his symbol
+, this interpretation being a consequence of his choice of the usual definition of “or”. Again, had
someone discovered the symmetric difference at that time, much of this polemic would never have
appeared.
2. THE SUBSTITUTION OF SIMILARS (1869) 85
Fourth Objection
197. The last objection that I shall at present urge against Professor Boole’s system is,
that the symbols 1/1, 0/0, 1/0, 0/1, establish for themselves no logical meaning, and only bear a
meaning derived from some method of reasoning not contained in the symbolic system. The
meanings, in short, are those reached in the self-evident indirect method of the present work.
. . .
199. . . . Professor Boole’s system, then, as regards the symbol 0/0, is not the system bestowing
certain knowledge; it is, at most, a system pointing out truths which, by another intuitive
system of reasoning, we may know to be certainly true.
. . .
202. The correspondence of these obscure forms with the self-evident inferences of the
present system is so close and obvious, as to suggest irresistibly that Professor Boole’s
operations with his abstract calculus of 1 and 0, are a mere counterpart of self-evident
operations with the intelligible symbols of pure logic.
. . .
Boole’s system is like the shadow, the ghost, the reflected image of logic, seen among the
derivatives of logic.
. . .
205. . . . these errors scarcely detract from the beauty and originality of the views he laid
open. Logic, after his work, is to logic before his work, as mathematics with equations of any
degree are to mathematics with equations of one or two degrees. He generalized logic so that
it became possible to obtain any true inference from premises of any degree of complexity,
and the work I have attempted has been little more than to translate his forms into processes
of self-evident meaning and force.
So, after all his complaints about Boole’s work, we find that he still has the greatest admiration
for it. It is remarkable in looking through this text to see how infrequently 0 appears in Jevons’
equations. He much prefers that his equations be of the form “term = term”, with both left and
right sides being different from 0. This is quite different from Boole, who preferred to put his
equations in the form “term = 0”.
. . .
10. During the last two or three years the thought has constantly forced itself upon my
mind, that the modern logicians have altered the form of Aristotle’s proposition without
making any corresponding alteration in the dictum or self-evident principle which formed the
fundamental postulate of his system.
. . .
11. But recent reformers of logic have profoundly altered our view of the proposition. They
teach us to regard it as an equation of two terms . . . Does not the dictum, in short, apply
in both directions, now that the two terms are indifferently subject and predicate?
. . .
14. I am thus led to take the equation as the fundamental form of reasoning, and to modify
Aristotle’s dictum in accordance therewith. It may be formulated somewhat as follows—
Whatever is known of a term may be stated of its equal or equivalent.
or in other words,
Whatever is true of a thing is true of its like.
. . .
But the value of the formula must be judged by its results; and I do not hesitate to assert
that it not only brings into harmony all the branches of logical doctrine, but that it unites
them in close analogy to the corresponding parts of mathematical method. All aspects of
mathematical reasoning may, I believe, be considered but as applications of a corresponding
axiom of quantity; . . .
He has come to the conclusion that this fundamental form is the Substitution of Equals as
explained via the following picture, where he is discussing the algebra of numbers:
17. . . . the widest possible expression of a process of mathematical inference is shown in
the form—

a = b      b § c      hence      a § c
where the symbol § means “any conceivable kind of relation between one quantity and another”.
. . . In this all-powerful form we actually seem to have brought together the whole of the
processes by which equations are solved, viz. equal addition or subtraction, multiplication
or division, involution or evolution, performed upon both sides of the equation at the same
time. That most familiar process in mathematical reasoning, of substituting one member of
an equation for the other, appears to be the type of all reasoning, and we may fitly name
this all-important process the substitution of equals.
To express this rule for Jevons’ algebra of logic in a slightly more modern form we could use:

A = B,  ϕ(B, C)
---------------
    ϕ(A, C)

where ϕ(B, C) is any assertion about B and C. As there is no need to refer to C this can be simplified
to:

A = B,  ϕ(B)
------------
    ϕ(A)
The only properties ϕ that Jevons considers are equations. This is consistent with his view that
equations can express any proposition. Furthermore he will always regard A = B and B = A as
interchangeable, and not as a separate rule for equality. Thus it would be appropriate to regard
his principle of substitution, that we will call [Jevons] substitution, as the following four rules
of inference:
J1: from A = B and p[B] = q infer p[A] = q
J2: from A = B and p = q[B] infer p = q[A]
J3: from A = B and p[A] = q infer p[B] = q
J4: from A = B and p = q[A] infer p = q[B]

Fig. 45 [Jevons] Substitution
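The four rules are purely mechanical and can be prototyped directly. A minimal sketch of J1, with terms encoded as nested tuples (the representation, and the simplification of replacing every occurrence rather than a chosen one, are my own):

```python
# Terms are atoms (strings) or tuples (operator, operand, operand).
def substitute(term, old, new):
    """Replace every occurrence of the subterm `old` by `new`."""
    if term == old:
        return new
    if isinstance(term, tuple):
        return tuple(substitute(t, old, new) for t in term)
    return term

def j1(identity, premise):
    """J1: from A = B and p[B] = q infer p[A] = q."""
    A, B = identity          # the identity A = B
    p, q = premise           # the premise p[B] = q
    return (substitute(p, B, A), q)

# From A = B and (B · C) = D infer (A · C) = D:
conclusion = j1(("A", "B"), (("·", "B", "C"), "D"))
assert conclusion == (("·", "A", "C"), "D")
```

J2–J4 are the same operation applied to the other side of the equation or in the other direction.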
The first of these laws, which I have already referred to in an earlier part of this tract (p.
103), is the Law of Identity, that whatever is, is, or a thing is identical with itself; or,
in symbols,
A = A.
The second law, The Law of Non-contradiction, is that a thing cannot both be
and not be, or that nothing can combine contradictory attributes; or, in symbols,
Aa = 0,
—that is to say, what is both A and not A does not exist, and cannot be conceived.
The third law, that of excluded middle, or, as I prefer to call it, the Law of Duality,
asserts the self-evident truth that a thing either exists or does not exist, or that everything
either possesses a given attribute or does not possess it.
Symbolically the law of duality is shown by
A = AB·|·Ab,
in which the sign ·|· means alternation, and is equivalent to the true meaning of the disjunctive
conjunction or. Hence the symbols may be interpreted as, A is either B or not B.
These laws may seem truisms, and they were ridiculed as such by Locke; but, since
they describe the very nature of identity in its three aspects, they must be assumed as true,
consciously or unconsciously, and if we can build a system of inference upon them, their
self-evidence is surely in our favour.
Note that he has replaced the + of Boole by ·|·, no doubt to distance his system from ordinary
algebra.
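Read extensionally, the three laws can be verified over all subclasses of a small universe. A quick sketch (the three-element universe is arbitrary; lower-case letters denote complements, as in Jevons, and ·|· is read as union, which is safe here since the two parts are disjoint):

```python
from itertools import combinations

U = {0, 1, 2}

def subsets(s):
    return [set(c) for r in range(len(s) + 1) for c in combinations(s, r)]

for A in subsets(U):
    a = U - A                       # the complement of A
    assert A == A                   # Law of Identity:          A = A
    assert A & a == set()           # Law of Non-contradiction: Aa = 0
    for B in subsets(U):
        b = U - B
        # Law of Duality: A = AB ·|· Ab (AB and Ab are disjoint)
        assert A == (A & B) | (A & b)
```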
In the very first example following this presentation of the three laws he is using a distributive
law and commutative law. This system is very clear, but unfortunately very incomplete. There is
no explanation as to what happened to the other laws that were formulated in his Pure Logic
of 1864. The gain in insight regarding the power of [Jevons] substitution is offset by his lack of
thoroughness in collecting the needed laws together. This incompleteness may be partly explained
by his classification of some laws as “principles of logical symbols”:
25. It is desirable at this point to draw attention to the fact that the order in which nouns
adjective are stated is a matter of indifference. . . . Hence, if A and B represent any two
names or terms, their junction as in AB will be taken to indicate anything which unites the
qualities of both A and B, and then it follows that
AB = BA
This principle of logical symbols has been fully explained by Dr. Boole in his Laws of
Thought (pp. 29,30), and also in my Pure Logic (p. 15); and its truth will be assumed here
without further proof.
He has quietly switched to the use of extension for the interpretation of his symbols in this
paper, e.g., one reads:
26. We may now proceed to consider the ordinary proposition of the form
A = AB,
which asserts the identity of the class A with a particular part of the class B, . . .
Mechanical Aids
As an introduction to his work on mechanical methods of carrying out his indirect inference he
says:
47. Objections might be raised against this process of indirect inference, that it is a long and
tedious one; and so it is, when thus performed. Tedium indeed is no argument against truth;
and if, as I confidently assert, this method gives us the means of solving an infinite number
of problems, and arriving at an infinite number of conclusions, which are often demonstrable
in no simpler way, and in fact in no other way whatever, no such objections would be of
any weight. The fact however is, that almost all the tediousness and liability to mistake
may be removed from the process by the use of mechanical aids, which are of several kinds
and degrees. While practising myself in the use of the process, I was at once led to the
use of the logical slate, which consists of a common writing slate, with several series of the
combinations of letters engraved upon it . . .
The description that follows is that one has the constituents in 2 through 6 symbols engraved on
the slate. Then he continues:
48. It soon became apparent, however, that if these combinations, instead of being written
in fixed order on a slate, were printed upon light movable slips of wood, it would become
easy by suitable mechanical arrangements to pick out the combinations in convenient classes,
so as immensely to abbreviate the labour of comparison with the premises. This idea was
carried out in the logical abacus, which I constructed several years ago, and have found useful
and successful in the lecture-room for exhibiting the complete solution of logical arguments.
Then he gives an explanation of how this works, and finally turns to his latest mechanical device:
53. In the last paragraph I alluded to a further mechanical contrivance, in which the
combination-slips of the abacus should not require to be moved by hand, but could be placed
in proper order by the successive pressure of a series of keys or handles. I have since made
a successful working model of this contrivance, which may be considered a machine capable
of reasoning, or of replacing almost entirely the action of the mind in drawing inferences.
Jevons’ machine will be discussed in the next section.
12. . . . But it is hardly too much to say that Aristotle committed the greatest and most
lamentable of all mistakes in the history of science when he took this kind of proposition as
the true type of all propositions, and founded thereon his system. It was by a mere fallacy
of accident that he was misled; but the fallacy once committed by a master-mind became
so rooted in the minds of all succeeding logicians, by the influence of authority, that twenty
centuries have thereby been rendered a blank in the history of logic.
13. . . . His syllogism was therefore an edifice in which the corner-stone itself was omitted,
and the true system is to be created by supplying this omission, and re-erecting the edifice
from the very foundation.
. . .
26. [Regarding A = AB] It may seem when stated in this way to be a truism; but it is
not, because it really states in the form of an identity the inclusion of A in a wider class B.
Aristotle happened to treat it in the latter aspect only, and the extreme incompleteness of
his syllogistic system is due to this circumstance . . .
At the end of this paper, on his wonderful discovery of the central role of [Jevons] substitution,
we have a case of the writer’s blues:
69. I write this tract under the discouraging feeling that the public is little inclined to
favour or to inquire into the value of anything of an abstract nature. There are numberless
scientific journals and many learned societies, and they readily welcome the minutest details
concerning a rare mineral, or an undescribed species, the newest scientific toy, or the latest
observations concerning a change in the weather.
. . .
But Logic is under the ban of metaphysics. It is falsely supposed to lead to no useful works—
to be mere speculation; and, accordingly, there is no journal, and no society whatever,
devoted to its study. Hardly can a paper on a logical subject be edged into the Proceedings
of any learned society except under false pretences.
[Figure: Jevons’ logical machine. The display panel shows the 16 constituents of A, B, C, D:

A A A A A A A A a a a a a a a a
B B B B b b b b B B B B b b b b
C C c c C C c c C C c c C C c c
D d D d D d D d D d D d D d D d

Below the display panel sits the keyboard.]
4. THE PRINCIPLES OF SCIENCE (1874) 91
To key in the equation Ab·|·Cd = acD one presses the following sequence of keys (reading down
the left column, then the center column, and finally the right column):

A       C       a
b       d       c
·|·     =       D

leaving in the display the seven constituents consistent with this premise:

A A A a a a a
B B B B B b b
C c c C c C c
D D d D d D d
Now if one enters another premise further constituents may be forced to drop out. If one enters
just an intersection on the left side, say Abc, then just those constituents that include this will be
left in the display; pressing Full Stop after this, without entering an equation, will return all seven
constituents to the display.
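The machine’s behaviour on this example is easy to simulate: enumerate the 16 constituents of A, B, C, D and keep those on which the two sides of the premise agree (the function names below are my own):

```python
from itertools import product

letters = "ABCD"
constituents = list(product([True, False], repeat=4))

def name(c):
    return "".join(L if v else L.lower() for L, v in zip(letters, c))

def lhs(c):      # membership in Ab ·|· Cd
    A, B, C, D = c
    return (A and not B) or (C and not D)

def rhs(c):      # membership in acD
    A, B, C, D = c
    return (not A) and (not C) and D

# Jevons' indirect method: a constituent contradicted by the premise
# (in one side but not the other) is dropped; the rest remain.
kept = [name(c) for c in constituents if lhs(c) == rhs(c)]
print(len(kept), kept)
# prints: 7 ['ABCD', 'ABcD', 'ABcd', 'aBCD', 'aBcd', 'abCD', 'abcd']
```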
and the special laws which govern the combination of logical terms.
Collecting together the various laws that he has dispersed over forty pages one has the following
table:
These laws, along with his single rule of inference, [Jevons] substitution, give his deductive logic
system for the Principles of Science (p. 49):
By deduction we investigate and unfold the information contained in the premises; and this
we can do by one single rule—For any term occurring in any proposition [Jevons] substitute
the term which is asserted in any premise to be identical with it.
Earlier we showed how to use [Jevons] substitution to derive the symmetry of equality, i.e.,
the rule “if A = B then B = A”. However this is not what Jevons did, and instead he says the
following (pages 46-47):
A mathematician would not think it worth while to mention that if x = y then also y = x.
He would not consider these to be two equations at all, but one equation accidentally written
in two different manners.
. . .
. . . so that I shall consider the two forms
A = B and B = A
9. In 1880 Peirce attributes the associative laws to Boole and Jevons when, in fact, neither ever mentioned them.
5. STUDIES IN DEDUCTIVE LOGIC (1880) 93
This appendix gives a modern presentation of an equational proof system BR for Boolean
rings.10 The study of Boolean rings did not appear until the mid 1930s, in the work of Marshall
Stone, although it is much closer to the work of Boole than the subject of Boolean algebras.
• We have a set X of variables, whose members we will usually refer to using small latin
letters x, y, z, . . . , at the end of the alphabet.
• The terms are defined inductively by:
(i) 0 and 1 are terms;
(ii) any variable x is a term;
(iii) if p is a term then so is (−p);
(iv) if p and q are terms then so are (p + q) and (p · q).
The notation p(x1 , . . . , xn ) means that the variables that appear in p are among the
variables in the list x1 , . . . , xn .
• An equation is an expression of the form p = q where p and q are terms.
• The Axioms are the usual equational axioms for commutative rings with unit, together
with the idempotent law:

x + 0 = x                          x · 1 = x
x + (−x) = 0                       x · y = y · x
x + y = y + x                      x · (y · z) = (x · y) · z
x + (y + z) = (x + y) + z          x · (y + z) = (x · y) + (x · z)
x · x = x
10. No proofs will be given in this appendix as the results follow easily from standard references such as the detailed introduction to the equational theory of Boolean rings in Burris and Sankappanavar [3], Chapter IV.
96 APPENDIX 1: THE PROOF SYSTEM BR
• The Rules of Inference are as follows,11 where the letters p, q, r, si refer to arbitrary but
fixed terms:
reflexive rule:               p = p

symmetric rule:               from p = q infer q = p

transitive rule:              from p = q and q = r infer p = r

[modern] substitution rule:   from p(x1 , . . . , xn ) = q(x1 , . . . , xn )
                              infer p(s1 , . . . , sn ) = q(s1 , . . . , sn )

replacement rule:             from p = q infer s[p] = s[q]
Given a set S of equations we say that an equation p = q has a derivation from S in this
proof system if there is a sequence of equations
p1 = q 1 , · · · , p k = q k
such that
• each equation in the sequence is either
(i) an axiom, or
(ii) it is in S, or
(iii) it is the result of applying a rule of inference to previous
equations in the sequence,
• and the equation p = q is pk = qk .
If one can derive p = q from S in BR we write S `BR p = q, or simply say that the argument
S ∴ p = q has a derivation in this system.
Two equations p = q and r = s are BR-equivalent if
p = q `BR r = s and r = s `BR p = q.
We say two collections of equations S1 and S2 are BR-equivalent if
S1 `BR p = q for p = q ∈ S2 , and
S2 `BR p = q for p = q ∈ S1 .
Every equation p = q is BR-equivalent to an equation in the form s = 0, namely p = q is BR-
equivalent to p − q = 0. Thus every equational argument can be put in the form
p1 = 0, . . . , pk = 0  ∴  p = 0
as far as `BR is concerned.
Abbreviating Terms
11. Instead of the version of the replacement rule used here one can use the following three simpler rules: from p = q infer −p = −q; from p = q infer p + r = q + r; and from p = q infer p · r = q · r.
Now that we have carefully defined the terms of BR it is important to inform the reader that
we like to use abbreviated and often ambiguous notation for terms. Here are the basic conventions:
1. Omit outer parentheses, e.g., x + (y + z) instead of (x + (y + z)).
2. Write xy instead of x · y.
3. Specify that − has greater binding power than ·, which in turn has greater binding power
than +. Thus x + y · z means x + (y · z) and not (x + y) · z.
4. Omit parentheses when several terms are added (or multiplied) together, e.g., x + y + z + w
instead of x + ((y + z) + w), and xyzw instead of (x(yz))w. These expressions are ambiguous
as one has several ways to insert parentheses to create a term. However, any two terms p and
q with the same abbreviated form are BR-equivalent, i.e., `BR p = q.
5. ∑_{i=1}^k pi is shorthand for p1 + · · · + pk , and ∏_{i=1}^k pi is shorthand for p1 · · · pk .
6. For n a positive integer np is shorthand for a sum of n p’s, and pn is shorthand for a product
of n p’s.
Using these conventions usually helps with the communication of results.
hold we also have p(~A ) = q(~A ) holding.
Here are the main theorems of Boole, adapted to this system, where Z2 is the 2-element Boolean
ring given by:

+ | 0 1        · | 0 1        − |
--+----        --+----        --+--
0 | 0 1        0 | 0 0        0 | 0
1 | 1 0        1 | 0 1        1 | 1
The Rule of 0 and 1 for BR
A ground equational argument
p1 (~a ) = q1 (~a ), . . . , pk (~a ) = qk (~a ) ∴ p(~a ) = q(~a)
has a derivation in the equational proof system BR[C] iff every 0–1 assignment of C that
makes the premises true in Z2 also makes the conclusion true in Z2 .
In the Introduction we applied Boole’s Rule of 0 and 1 to the commutative law. Now let us
apply the Rule of 0 and 1 for Boolean rings:
a b a+b b+a
1 1 0 0
1 0 1 1
0 1 1 1
0 0 0 0
Fig. 49 Applying the Rule of 0 and 1
Due to the fact that Z2 is the only (nontrivial) subdirectly irreducible Boolean ring one actually
has an Extended Rule of 0 and 1 that applies to all equational arguments, not just ground
equational arguments. For example one can use it to justify the quantified version x + y = y + x
of the commutative law as well as the argument ax = bx ∴ a = b.12 Of course a similar argument
using only ground equations, ac = bc ∴ a = b, is not valid.
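The Rule of 0 and 1 is easy to mechanize. A minimal Python sketch (the function name `valid` and the encoding of equations as predicates are my own choices):

```python
from itertools import product

def valid(premises, conclusion, n):
    """Brute-force the Rule of 0 and 1: every 0-1 assignment making
    all premises true in Z2 must make the conclusion true in Z2
    (arithmetic taken mod 2)."""
    return all(conclusion(*v)
               for v in product([0, 1], repeat=n)
               if all(p(*v) for p in premises))

# The commutative law a + b = b + a needs no premises:
assert valid([], lambda a, b: (a + b) % 2 == (b + a) % 2, 2)

# The ground argument ac = bc therefore a = b is invalid (take c = 0):
assert not valid([lambda a, b, c: (a * c) % 2 == (b * c) % 2],
                 lambda a, b, c: a == b, 3)
```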
Given a list a1 , . . . , an of constants from C we define an ~a-constituent to be a term of the
form
â1 · · · ân

where each âi is either ai or 1 − ai .
Then if t1 , . . . , t2^n are the 2^n distinct ~a-constituents we have

`BR ti · ti = ti for all i,
`BR ti · tj = 0 for i ≠ j, and
`BR 1 = t1 + · · · + t2^n .
Expansion Theorem
Any term p(~a, ~b ) has an expansion on the symbols ~a given by
p(~a, ~b ) = p(1, . . . , 1, ~b )a1 · · · an + · · · + p(0, . . . , 0, ~b )(1 − a1 ) · · · (1 − an ).
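The Expansion Theorem can be spot-checked numerically: for idempotent (0/1) values of the a’s the right-hand side interpolates p exactly. A small sketch (the sample polynomial is an arbitrary choice of mine):

```python
from itertools import product

def p(a1, a2, b):
    # an arbitrary integer polynomial in a1, a2, b
    return 3 * a1 * b - a2 + a1 * a2 + 7

def expansion(a1, a2, b):
    total = 0
    for e1, e2 in product([0, 1], repeat=2):
        modulus = p(e1, e2, b)     # the coefficient p(e1, e2, b)
        constituent = (a1 if e1 else 1 - a1) * (a2 if e2 else 1 - a2)
        total += modulus * constituent
    return total

# At idempotent values of a1, a2 the expansion agrees with p:
for a1, a2 in product([0, 1], repeat=2):
    for b in range(-3, 4):
        assert p(a1, a2, b) == expansion(a1, a2, b)
```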
Reduction Theorem
12. Such arguments, using a mixture of quantified variables and unquantified constants, were not used by Boole or Jevons.
A system of equations

p1 = 0
. . .
pk = 0

is BR-equivalent to the single equation obtained by adding all the elementary symmetric functions
of p1 , . . . , pk together and setting the sum equal to 0, i.e.,

(p1 + · · · + pk ) + ( ∑_{i<j} pi pj ) + · · · + (p1 · · · pk ) = 0.
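Since the Rule of 0 and 1 reduces BR-equivalence to checking values in Z2, the Reduction Theorem can be verified exhaustively for small k; a sketch (the function name is mine):

```python
from itertools import combinations, product

def elementary_sum(ps):
    """The sum e1 + e2 + ... + ek of the elementary symmetric
    functions of ps, computed mod 2."""
    total = 0
    for j in range(1, len(ps) + 1):
        for combo in combinations(ps, j):
            prod = 1
            for x in combo:
                prod *= x
            total += prod
    return total % 2

# In Z2 the reduced equation vanishes exactly when every pi does:
for ps in product([0, 1], repeat=3):
    assert (elementary_sum(ps) == 0) == all(x == 0 for x in ps)
```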
Elimination Theorem
The most general equation that one can deduce from
p(~a, ~b ) = 0
that involves only the symbols ~b from C is
q(~b ) = 0,
where
q(~b ) = p(1, . . . , 1, ~b ) · · · p(0, . . . , 0, ~b ).
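Over Z2 the content of the Elimination Theorem is that q(~b) = 0 holds exactly when p(a, ~b) = 0 is solvable for the eliminated symbol. A sketch for one eliminated symbol (the sample term p is my own choice):

```python
def p(a, b):
    # a sample BR term, with values taken mod 2
    return (a * b + a + b) % 2

def q(b):
    # eliminate a: q(b) = p(1, b) * p(0, b)
    return (p(1, b) * p(0, b)) % 2

# q(b) = 0 iff p(a, b) = 0 has a solution in a:
for b in [0, 1]:
    assert (q(b) == 0) == any(p(a, b) == 0 for a in [0, 1])
```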
Appendix 2: The Proof System BA
This appendix gives a modern presentation of an equational proof system BA for Boolean
algebra.13
• We have a set X of variables, whose members we will usually refer to using small latin
letters x, y, z, . . . , at the end of the alphabet.
• The terms are defined inductively by:
(i) 0 and 1 are terms;
(ii) any variable x is a term;
(iii) if p is a term then so is (p0 );
(iv) if p and q are terms then so are (p ∨ q) and (p ∧ q).
The notation p(x1 , . . . , xn ) means that the variables that appear in p are among the
variables in the list x1 , . . . , xn .
• An equation is an expression of the form p = q where p and q are terms.
• The Axioms are the usual equational axioms for Boolean algebras:

x ∨ y = y ∨ x                              x ∧ y = y ∧ x
x ∨ (y ∨ z) = (x ∨ y) ∨ z                  x ∧ (y ∧ z) = (x ∧ y) ∧ z
x ∨ (y ∧ z) = (x ∨ y) ∧ (x ∨ z)            x ∧ (y ∨ z) = (x ∧ y) ∨ (x ∧ z)
x ∨ 0 = x                                  x ∧ 1 = x
x ∨ x′ = 1                                 x ∧ x′ = 0
13. No proofs will be given in this appendix as the results follow easily from standard references such as the detailed introduction to the equational theory of Boolean algebras in Burris and Sankappanavar [3], Chapter IV.
• The Rules of Inference are as follows,14 where the letters p, q, r, si refer to arbitrary but
fixed terms:
reflexive rule:               p = p

symmetric rule:               from p = q infer q = p

transitive rule:              from p = q and q = r infer p = r

[modern] substitution rule:   from p(x1 , . . . , xn ) = q(x1 , . . . , xn )
                              infer p(s1 , . . . , sn ) = q(s1 , . . . , sn )

replacement rule:             from p = q infer s[p] = s[q]
Given a set S of equations we say that an equation p = q has a derivation from S in this proof
system if there is a sequence of equations
p1 = q 1 , · · · , p k = q k
such that
• each equation in the sequence is either
(i) an axiom, or
(ii) it is in S, or
(iii) it is the result of applying a rule of inference to previous
equations in the sequence,
• and the equation p = q is pk = qk .
If one can derive p = q from S in BA we write S `BA p = q, or simply say that the argument
S ∴ p = q has a derivation in this system.
Two equations p = q and r = s are BA-equivalent if
p = q `BA r = s and r = s `BA p = q.
We say two collections of equations S1 and S2 are BA-equivalent if
S1 `BA p = q for p = q ∈ S2 and
S2 `BA p = q for p = q ∈ S1 .
Every equation p = q is BA-equivalent to an equation in the form s = 0, namely p = q is BA-
equivalent to (p ∧ q 0 ) ∨ (p0 ∧ q) = 0. Thus every equational argument can be put in the form
p1 = 0, . . . , pk = 0  ∴  p = 0
as far as `BA is concerned.
We will adopt abbreviation conventions much like those described in Appendix 1, except
that we do not assign a binding priority to the two binary operations ∨ and ∧.
14. Instead of the version of the replacement rule used here one can use the following three simpler rules: from p = q infer p′ = q′; from p = q infer p ∨ r = q ∨ r; and from p = q infer p ∧ r = q ∧ r.
where B2 is the 2-element Boolean algebra given by:

∨ | 0 1        ∧ | 0 1        ′ |
--+----        --+----        --+--
0 | 0 1        0 | 0 0        0 | 1
1 | 1 1        1 | 0 1        1 | 0
The Rule of 0 and 1 for BA
A ground equational argument
p1 (~a ) = q1 (~a ), . . . , pk (~a ) = qk (~a ) ∴ p(~a ) = q(~a)
has a derivation in the proof system BA[C] iff every 0–1 assignment of C that makes the
premises true in B2 also makes the conclusion true in B2 .
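As with BR, the rule is easy to mechanize. Here is a sketch checking the syllogism Barbara in Jevons’ equational form, A = AB, B = BC ∴ A = AC, over B2 (the encoding as predicates is my own):

```python
from itertools import product

def valid(premises, conclusion):
    """Check a ground BA argument over all 0-1 assignments to A, B, C."""
    return all(conclusion(*v)
               for v in product([0, 1], repeat=3)
               if all(p(*v) for p in premises))

premises = [lambda A, B, C: A == (A & B),    # All A are B
            lambda A, B, C: B == (B & C)]    # All B are C
conclusion = lambda A, B, C: A == (A & C)    # All A are C
assert valid(premises, conclusion)

# Dropping the second premise breaks the argument:
assert not valid(premises[:1], conclusion)
```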
15. None of these were stated by Jevons. One can find them, except for the Rule of 0 and 1, in Schröder’s Algebra der Logik in the 1890s. Schröder says that the formulation of the Elimination Theorem (for Boolean algebra) is due to him.
Due to the fact that B2 is the only (nontrivial) subdirectly irreducible Boolean algebra one
actually has an Extended Rule of 0 and 1 that applies to all equational arguments, not just
ground equational arguments.
Given a list a1 , . . . , an of constants from C we define an ~a-constituent to be a term of the
form
â1 ∧ · · · ∧ ân

where each âi is either ai or a′i .
Thus if t1 , . . . , t2^n are the 2^n distinct ~a-constituents we have

`BA ti ∧ ti = ti for all i,
`BA ti ∧ tj = 0 for i ≠ j, and
`BA 1 = t1 ∨ · · · ∨ t2^n .
Expansion Theorem
Any term p(~a, ~b ) has an expansion on the symbols ~a given by
p(~a, ~b ) = [ p(1, . . . , 1, ~b ) ∧ a1 ∧ · · · ∧ an ] ∨ · · · ∨ [ p(0, . . . , 0, ~b ) ∧ a′1 ∧ · · · ∧ a′n ].
Reduction Theorem
A system of equations

p1 = 0
. . .
pk = 0
is BA-equivalent to the single equation
p1 ∨ · · · ∨ pk = 0.
Elimination Theorem
The most general equation that one can deduce from
p(~a, ~b ) = 0
that involves only the symbols ~b from C is
q(~b ) = 0,
where
q(~b ) = p(1, . . . , 1, ~b ) ∧ · · · ∧ p(0, . . . , 0, ~b ).
Appendix 3: The Proof System AB
In Chapter 5 we noted that Jevons praised the perfect analogy between Boole’s algebra of logic
and a restricted algebra of numbers, a fact that Jevons in all likelihood did not know to be true.
In C.I. Lewis’s classic text of 1918, A Survey of Symbolic Logic, one finds (page 55) his assessment
of Boole’s methods:
It also seems unlikely that Lewis knew this, except for some examples. Boole asserts the complete-
ness of his system in his 1847 book, with no justification, and then expands this system in his 1854
book. He no longer claims it is complete, but instead takes a totally new guiding principle, the Rule
of 0 and 1, which asserts a perfect correspondence between two systems. This is indeed correct, but
again there is absolutely no justification given. This gap was not filled until 1976, when Hailperin
showed how to construct secure foundations for Boole’s methods (although we find that Hailperin’s
treatment needs some clarification). We do not know of any such scholarly evaluation of Boole’s
work that was available before Hailperin’s book.
This appendix gives a presentation of the algebra of Boole that is both faithful to the books
written by Boole and is easily understandable to those with experience in modern algebra. Our
presentation is quite close to that of Hailperin in Boole’s Logic and Probability, but differs in
the treatment of the law of substitution and the detail given to the rules of inference.
First (and foremost) we will treat the part of the algebra of Boole that does not involve division
(÷). Hailperin proposes a direct power Z^U of the ring of integers Z as a model of this part of
Boole’s logic. Here U is the universe (but not Boole’s universe of “everything”). Recall that Z^U is
the ring (Z^U , +, ·, −, 1, 0) where Z^U is the collection of all functions f : U → Z with 0 being the
constant function 0(u) = 0, and 1 being the constant function 1(u) = 1. And the operations are
defined coordinatewise:

(f + g)(u) = f (u) + g(u),    (f · g)(u) = f (u) · g(u),    (−f )(u) = −f (u).

The idempotents of Z^U are the characteristic functions XA of subclasses A of U , and thus one
has a natural identification of idempotents with subclasses of U .
[Figure: the characteristic function χA ∈ Z^U of a subclass A of U , taking the value 1 on A and 0 elsewhere.]
Hailperin calls an element h of Z^U a signed multiset, and one can think of it as a generalized
class in the sense that each element u ∈ U is designated to be appearing h(u) times in h. The
genuine classes correspond to those h for which each element u appears 0 or 1 times, namely the
idempotents.
From the following facts about the idempotents in Z^U :
• XØ = 0
• XU = 1
• XA · XB = XA∩B
• XA + XB = XA∪B + XA∩B
• XA − XB = XA\B − XB\A
we can conclude that:
• The 0 element of Z^U corresponds to the empty subclass Ø of U .
• The 1 element of Z^U corresponds to the universe U .
• The multiplication XA · XB of idempotents corresponds to the intersection A ∩ B.
• The sum XA + XB of two idempotents corresponds to a class (i.e., is idempotent) iff A and B
are disjoint classes, and in this case the sum corresponds to the union A ∪ B.
• The difference XA − XB of two idempotents corresponds to a class iff B is a subclass of A,
and in this case the difference corresponds to the class difference A \ B.
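These correspondences are easy to check coordinatewise. A small sketch with U = {0, 1, 2, 3}, representing elements of Z^U as 4-tuples of integers (the helper names are mine):

```python
U = range(4)

def char(A):
    """X_A, the characteristic function of A, as a 0/1 tuple."""
    return tuple(1 if u in A else 0 for u in U)

def add(f, g): return tuple(x + y for x, y in zip(f, g))
def sub(f, g): return tuple(x - y for x, y in zip(f, g))
def mul(f, g): return tuple(x * y for x, y in zip(f, g))

A, B = {0, 1}, {1, 2}
assert mul(char(A), char(B)) == char(A & B)
assert add(char(A), char(B)) == add(char(A | B), char(A & B))
assert sub(char(A), char(B)) == sub(char(A - B), char(B - A))
```

In particular the sum X_A + X_B is itself a characteristic function exactly when A ∩ B = Ø, matching the bullet points above.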
Using the correspondence with the idempotents we can think of the ring Z^U as an extension
of the collection of subclasses of U , with operations +, ·, − that are completely defined, and
that satisfy the basic laws formulated by Boole (as Z^U is a ring). Furthermore, taking the notion
of interpretable to mean idempotent, the ring Z^U exhibits the basic interpretability behavior that
Boole noted.
We want to describe a proof system for the equational theory of Z^U that corresponds to Boole’s
axioms and rules of inference, and prove that it has the desired properties. Hailperin says ([6]
p. 140) that the algebra of signed multisets has the following axioms:16
16. This is described as the theory of commutative rings with unit and with no non-zero nilpotents, either multiplicative or additive.
One important feature that is missing from Hailperin’s description is a formulation of the rules
of inference,17 even though Boole made a clear attempt to describe the principles underlying his
proof system. In the following we give a proof system AB for working with equations that is based
on Hailperin’s signed multisets, but it differs in three respects. First there is no substitution rule
as there are only constants (and no variables) for the idempotents. Secondly the rules of inference
are completely specified. Thirdly, we do not need H10.
As explained in the Introduction, the literal symbols x, y, . . . in Boole’s logic behave, for the
most part, like symbols for constants. To reinforce this point of view we change his symbols x, y, . . .
to the symbols that we like to use for constants, such as a, b, . . . . We will set up an equational
logic without variables.
1. The system AB
We use the function and constant symbols of the language of rings: +, ·, −, 0, 1, plus a set C
of constant symbols as names for classes. We use the boldface 0 and 1 to distinguish the constant
symbols from the integers 0 and 1. The terms for the language are defined inductively by:
• The constants 0 and 1 are terms.
• The constant symbols in C are terms.
• If p and q are terms then so are (−p), (p + q), (p · q).
These terms are clearly ground terms as they have no variables. An equation is an expression of
the form p = q where p and q are terms. Equations are clearly ground equations. The following are
17. There is some suggestion that Hailperin is interested in full first-order logic as he says ([6], p. 140): “Unlike the (elementary) theory of Boolean algebras, the theory of SM algebras . . . is undecidable.” This result certainly applies to the full first-order theory of signed multisets. However if one restricts oneself to the kinds of formulas that express the arguments that occur in Boole’s text, namely a conjunction of equations implies an equation (to express the assertion that the premises imply the conclusion), then such formulas are of course decidable (by the Rule of 0 and 1).
Axioms
• For each constant symbol a in C we have the axiom a² = a.
• The laws of “Common Algebra”, where p, q, r represent any ground terms:
p+0=p p1 = p
p + (−p) = 0 pq = qp
p + (q + r) = (p + q) + r p(qr) = (pq)r
p+q =q+p
p(q + r) = pq + pr
Rules of Inference
Equality is reflexive, symmetric and transitive:

p = p ;    from p = q infer q = p ;    from p = q and q = r infer p = r.

Replacement:

from p = q infer −p = −q , p + r = q + r , and pr = qr.
Additively nonnilpotent:

from np = 0 for some n ∈ {1, 2, . . . } infer p = 0.
Given a sequence of constants ~a = (a1 , . . . , an ) from C the ~a-constituents are the products of
the form
â1 · · · ân ,

where each âi is either ai or 1 − ai .
Proof. Use the replacement rule to replace each ai in p(~a) by its expansion in ~a-constituents
as given in Lemma 2.3. Then simplify.
Lemma 2.5. If ⊢AB p(~a) = Σ mi ti, where the ti are distinct ~a-constituents and the mi are integers, then the mi are Boole's moduli, that is, Σ mi ti is the complete expansion of Boole.
Proof. From ⊢AB p(~a) = Σ mi ti it follows, by multiplying both sides by ti and using the basic results on constituents in Lemma 2.2, that
⊢AB p(~a)ti = mi ti.
Then by Lemma 2.1
|=Z p(~a)ti = mi ti.
Choose α ∈ IZ such that αti = 1. (Of course such an α exists.) Then from the last assertion we have
(Z, α) |= p(~a)ti = mi ti,
and thus αp = mi. This means the mi are the moduli of Boole.
Let Σ mi ti be Boole's complete expansion of p(~a). Then we say that an ~a-constituent is a constituent of p(~a) if it is one of the ti with a nonzero coefficient mi. Let C(p(~a)) be the collection of constituents of p(~a).
Lemma 2.6. S ⊢AB p(~a) = 0 iff S ⊢AB t = 0 for t ∈ C(p(~a)).
Proof. This follows from Lemmas 2.4 and 2.5.
Lemma 2.7. S |=Z p(~a) = 0 iff S |=Z t = 0 for t ∈ C(p(~a)).
2. THE RULE OF 0 AND 1 111
Proof. Let Σ mi ti be the complete expansion of p(~a). As ⊢AB p(~a) = Σ mi ti holds by Lemmas 2.4 and 2.5, |=Z p(~a) = Σ mi ti holds by Lemma 2.1. Then for α ∈ IZ we have
(Z, α) |= p(~a) = Σ mi ti
so
Z |= αp = Σ mi · αti.
Note that at most one αti is nonzero. Then for α ∈ IZ with (Z, α) |= S,
Z |= αp = 0 iff Z |= αt = 0 for t ∈ C(p),
so
(Z, α) |= p = 0 iff (Z, α) |= t = 0 for t ∈ C(p).
Reduction Theorem
A system of equations
p1 = 0, . . . , pk = 0
is equivalent to the single equation
p1² + · · · + pk² = 0.
Let ~a be the list of all the constant symbols of C appearing in the various pi. Then the proof of the theorem follows by taking complete expansions of the pi on the symbols ~a and observing that squaring a complete expansion simply squares all the coefficients of the expansion. Thus each pi² will have only nonnegative coefficients in its complete expansion on ~a, and this guarantees that the ~a-constituents of p1² + · · · + pk² are precisely the ~a-constituents that belong to at least one of the pi.
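The theorem rests on the simple fact that a sum of squares of integers vanishes only when each summand does. A quick check of the reduction on a pair of premises, with terms encoded as integer functions (our encoding, not the text's):

```python
from itertools import product

# p1 = x(1 - y), p2 = y(1 - z): the system p1 = 0, p2 = 0 holds at a
# 0-1 point exactly when the single equation p1^2 + p2^2 = 0 does,
# since a sum of squares of integers vanishes only termwise.
for x, y, z in product((0, 1), repeat=3):
    p1, p2 = x * (1 - y), y * (1 - z)
    assert (p1 == 0 and p2 == 0) == (p1**2 + p2**2 == 0)
```

The squaring matters: over the integers p1 + p2 = 0 would not do, since positive and negative terms could cancel; squares rule cancellation out.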
Elimination Theorem
The result of eliminating the constant symbols a1 , . . . , am from
p(~a, ~b ) = 0
is
q(~b ) = 0,
where
q(~b ) = p(1, . . . , 1, ~b ) · · · p(0, . . . , 0, ~b ).
Proof. The fact that q(~b ) = 0 follows from p(~a, ~b ) = 0 in AB is an easy application of the
Rule of 0 and 1. But we need to explain what Boole means when he says (1854, p. 8) that the
equation q(~b ) = 0 expresses
. . . the whole amount of relation implied by the premises among the elements which we
wish to retain.
3. BOOLE’S MAIN THEOREMS 113
Unfortunately Boole does not clarify this, nor justify that his elimination procedure accomplishes this. The simplest explanation of what he means is that if r(~b ) = 0 is any equation that follows from p(~a, ~b ) = 0 then r(~b ) = 0 follows from q(~b ) = 0, i.e., if p(~a, ~b ) = 0 ⊢AB r(~b ) = 0 then q(~b ) = 0 ⊢AB r(~b ) = 0.
This is indeed true, and thus q(~b ) = 0 is the strongest possible equation that one can deduce about the symbols ~b from the premiss p(~a, ~b ) = 0.
To see that this is true, first observe that since r(~b ) = 0 is AB-equivalent to the system of equations t(~b ) = 0, where t is a ~b-constituent of r(~b ), it suffices to show the above when r(~b ) is a ~b-constituent t(~b ).
So we suppose p(~a, ~b ) = 0 ⊢AB t(~b ) = 0, where t(~b ) is a ~b-constituent. Carry out a partial expansion of p on the constant symbols ~b to obtain
(6)  ⊢AB p(~a, ~b ) = r1(~a)t1(~b ) + · · · + rk(~a)tk(~b ),
where the ti are ~b-constituents. Choose a 0–1 interpretation α of C such that αt = 1. (This is of course possible.) Under this interpretation t = 0 is false, so by the Rule of 0 and 1, since p = 0 ⊢AB t = 0, we must have αp ≠ 0. Now
αp = (αr1)(αt1) + · · · + (αrk)(αtk),
so there must be some i such that ti = t, say tj = t (for otherwise αp = 0). Then
αp = αrj.
Now rj only involves the symbols ~a from C, and since α could map those arbitrarily to 0, 1 and still have the desired property, it follows that, under every 0–1 interpretation β of C, βrj ≠ 0. Thus from (6) the constituent t belongs to every term p(e1, . . . , em, ~b ), where each ei is one of 0 or 1, and consequently it belongs to the product of all such terms, which is q(~b ). Thus from q = 0 we can derive t = 0 in AB, which was to be shown.
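Boole's q can be computed mechanically as the product of the 0–1 instances of p. A sketch, with eliminate as our own illustrative name:

```python
from itertools import product
from math import prod

def eliminate(p, m):
    """Boole's elimination recipe: q(b) is the product of p(e, b)
    over all 0-1 tuples e for the first m arguments of p."""
    return lambda *b: prod(p(*e, *b) for e in product((0, 1), repeat=m))

# Premise a(1 - b) + (1 - a)c = 0 says a is contained in b and c in a.
# Eliminating a gives q(b, c) = p(0, b, c) * p(1, b, c) = c(1 - b),
# so the strongest a-free consequence is c(1 - b) = 0: c lies in b.
p = lambda a, b, c: a * (1 - b) + (1 - a) * c
q = eliminate(p, 1)
for b, c in product((0, 1), repeat=2):
    assert q(b, c) == c * (1 - b)
```

This matches the proof above: a constituent survives in q exactly when it occurs in every 0–1 instance of p.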
This is not the end of the story on the Elimination Theorem. Jevons did not discuss it in his modification of Boole's methods, but in the 1890s Schröder highlighted it in his famous volumes, Algebra der Logik. Schröder proved that the Elimination Theorem held for Boolean algebra. His version was that ax + bx′ = 0 holds iff ab = 0 and, for some u, x = bu′ + a′u.
Then Schröder ventured beyond equations, into negated equations, and considered systems of
the form:
pi(~a, ~b ) = 0   for 1 ≤ i ≤ k
pi(~a, ~b ) ≠ 0   for k + 1 ≤ i ≤ m
and found that he could do very little to formulate a satisfying elimination theorem, so he posed it
as a challenge problem.
In 1919 Skolem was able to cast this as a problem of quantifier-elimination for the first-order
theory of power set Boolean algebras, and gave a beautiful treatment. One can view Boole’s
Elimination Theorem, as formulated in the first-order theory of Boolean algebra, as the following
theorem:
[∃x1 · · · ∃xm (p(~x, ~y ) = 0)] ←→ [q(~y ) = 0],
where the q is as defined in Boole’s Elimination Theorem. Thus we can think of Boole’s work
on elimination as the beginning of one of the most popular methods of the 20th century to prove
decidability results, namely the elimination of quantifiers.
4. Boole’s Method
We summarize Boole’s method of using equations to handle arguments in logic:
Step 1. Given some propositions in logic as premises, translate them into equational form.
Step 2. Combine the equations into a single equation using the Reduction Theorem.
Step 3. Apply the Elimination Theorem to obtain the most general conclusion in the desired
variables.
Step 4. Apply the Expansion Theorem to give the conclusion as a collection of equations of
the form t = 0, t being a constituent.
Step 5. Interpret the equations t = 0 as propositions.
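The five steps can be traced on the syllogism "All g is m, all m is d, therefore all g is d" (the variable names and the integer encoding are ours):

```python
from itertools import product

# Step 1: translate the premises to g(1 - m) = 0 and m(1 - d) = 0.
# Step 2: reduce to the single equation p = (g(1-m))^2 + (m(1-d))^2 = 0.
p = lambda m, g, d: (g * (1 - m))**2 + (m * (1 - d))**2
# Step 3: eliminate m, the term we do not wish to retain:
#         q(g, d) = p(0, g, d) * p(1, g, d).
q = lambda g, d: p(0, g, d) * p(1, g, d)
# Step 4: expand q on g, d and keep the constituents with nonzero modulus.
nonzero = [e for e in product((0, 1), repeat=2) if q(*e) != 0]
print(nonzero)  # [(1, 0)]
# Step 5: the lone constituent g(1 - d) = 0 reads "All g is d".
```

The single surviving constituent corresponds to the point g = 1, d = 0, i.e. the class of g's that are not d's, and setting it to 0 is exactly the expected conclusion.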
Then ≡ gives a congruence on T, and the quotient (T/≡, ν) is in RAB, where the interpretation ν is the natural one given by ν(c) = c/≡. (T/≡, ν) satisfies S but not p = q, so we can conclude that S ⊭AB p = q. This proves that AB is complete.
Now we are ready for a second proof of the Rule of 0 and 1.
Theorem 5.2. S ⊢AB p = q iff S |=Z p = q.
Proof. (⇒) If S ⊢AB p = q then by the soundness of AB we have S |=AB p = q, and thus in particular S |=Z p = q.
(⇐) Now we suppose S ⊬AB p = q. Then, from the proof of the completeness theorem, (T/≡, ν) satisfies S but not p = q. By earlier remarks (T/≡, ν) is a subdirect product of (Z, α)'s. By taking a projection on a suitable coordinate we find that for some α we have (Z, α) |= S but (Z, α) ⊭ p = q. Thus S ⊭Z p = q. This finishes the proof.
Corresponds to a Correct Argument in the Real World
We have proved that the bottom two are equivalent, and now we want to show that these are
equivalent to the top.
We will start by defining the notion of an interpretable term. Roughly it means that one is working with terms where one never adds subterms that correspond to overlapping classes, and one never subtracts one term from another unless its interpretation is contained in the latter's interpretation.
Directly Interpretable
Given a domain, or universe, U we want to describe those terms that are always directly
interpretable (without the help of algebraic manipulations) when the constants from C are assigned
to subclasses of U . The condition that is needed is that all subterms have interpretations. For
example the term (a + b) − ab is not directly interpretable in U as a + b is not always interpretable.
(However it is equivalent to the term a + (b − ab) which is directly interpretable.)
The collection of terms directly interpretable in U will be called D. An interpretation in U
starts with an interpretation λ of the constants from C as subclasses of U , i.e., λc ⊆ U for c ∈ C.
Then we want to extend the domain of λ to include the directly interpretable terms. The collection
of all such λ will be denoted Λ.
Although the notion of directly interpretable terms and interpretations is intuitively clear,
nonetheless writing out a precise definition is somewhat involved. This proceeds by simultaneous
definition of the members of D and the extension of members of Λ. One difficulty that we have to
deal with is that if a term p is directly interpretable then most likely −p is not. To handle this we
7. TYING THE PROOF SYSTEM TO THE REAL WORLD 117
will work with the binary operation of subtraction, e.g., p − q, rather than the unary operation
of minus. Here is the definition of the collection D of directly interpretable terms:
• c is in D for c ∈ C.
• 0 and 1 are in D, and λ0 = Ø and λ1 = U for λ ∈ Λ.
For s, t ∈ D:
• s · t ∈ D, and λ(s · t) = λs ∩ λt for λ ∈ Λ.
• If λs ∩ λt = Ø for all λ ∈ Λ then s + t ∈ D, and λ(s + t) = λs ∪ λt for λ ∈ Λ.
• If λs ⊇ λt for all λ ∈ Λ then s − t ∈ D, and λ(s − t) = λs \ λt for λ ∈ Λ.
Note that terms p of the form Σ ti, with the ti distinct constituents, are in D, and λp = Σ λti (a union of pairwise disjoint classes).
An equation p = q, or an argument as in (7), with all terms mentioned being in D, is said to
be directly interpretable. If an argument (7) is directly interpretable then it is clear that, for
λ ∈ Λ, the interpretation using λ gives an argument about classes, namely
(8) λp1 = λq1 , . . . , λpk = λqk ∴ λp = λq.
This says that if the class λpi equals the class λqi for all i then the class λp equals the class λq.
We will say that a directly interpretable argument (7) is valid in U if (8) is a correct argument
for all interpretations λ ∈ Λ. We write
p1 = q1 , . . . , pk = qk |=U p = q
to assert that (7) is valid in U .
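The disjointness and containment side conditions in the definition of D can be made executable with finite sets standing for classes; the helper names join and minus below are illustrative only, not from the text:

```python
def join(s, t):
    """Interpret s + t: defined only when the classes are disjoint."""
    assert s.isdisjoint(t), "s + t is not directly interpretable here"
    return s | t

def minus(s, t):
    """Interpret s - t: defined only when t is contained in s."""
    assert t <= s, "s - t is not directly interpretable here"
    return s - t

a, b = {1, 2}, {2, 3}
# a + b is not directly interpretable, since a and b overlap:
try:
    join(a, b)
except AssertionError:
    pass
# ... but the equivalent term a + (b - ab) is, and denotes the union:
assert join(a, minus(b, a & b)) == {1, 2, 3}
```

This mirrors the example in the text: (a + b) − ab fails at the subterm a + b, while a + (b − ab) evaluates at every step.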
Interpreting in ZU
We make the connection between Boole’s methods `AB and |=Z and the real world |=U through
Hailperin's model ZU of signed multisets. This step seems unlikely to have been available to 19th century logicians. Using ZU is somewhat analogous to the creation of ideal numbers in 19th century number theory, but there is no indication that logicians in the 19th century even considered the use of ideal classes to justify Boole's work.
Let IZU be the collection of interpretations of the elements of C as idempotent elements of ZU.
Then each ϕ ∈ IZU can be extended inductively to interpret all terms as elements of ZU :
• ϕ0 = 0 and ϕ1 = 1
• ϕ(−s) = −ϕs
• ϕ(s + t) = ϕs + ϕt
• ϕ(s · t) = ϕs · ϕt
For ϕ ∈ IZU we say (ZU , ϕ) satisfies p = q, written (ZU , ϕ) |= p = q, if p = q is true in ZU under
the interpretation ϕ. This means ϕp = ϕq.
An argument (7) is said to be valid in ZU , written
p1 = q1 , . . . , pk = qk |=ZU p = q,
if for every ϕ ∈ IZU such that the equations ϕpi = ϕqi are true in ZU the equation ϕp = ϕq is also
true in ZU .
Making the Connections
In the following let p1 = q1 , . . . , pk = qk ∴ p = q be an argument that is directly interpretable
in U .
Lemma 7.1. The following are equivalent:
p1 = q1, . . . , pk = qk |=U p = q
p1 = q1, . . . , pk = qk |=ZU p = q.
An argument (11) has B-idempotent terms if all the p's and q's are B-idempotent. The Rule of 0 and 1 says that
we can use the integers Z = (Z, +, ·, −, 0, 1) to test if the equational argument (11) has a derivation
in AB, namely by checking if the argument holds in Z under all possible 0–1 interpretations of the
constants in C.
Let Z2 be the two-element Boolean ring, the ring of integers modulo 2. Then for α any 0–1
interpretation of C and for any B-idempotent ground term h the value of α(h) in Z is the same as in
Z2 . But this means that for an argument with B-idempotent terms one can use 0–1 interpretations
in Z2 and carry out the calculations in Z2 , rather than in Z, to apply the Rule of 0 and 1 to
determine if the argument is valid.
As we also have a Rule of 0 and 1 for BR[C], it follows that for arguments (11) with B-idempotent terms we have
p1 = q1, . . . , pk = qk ⊢AB p = q
iff
p1 = q1, . . . , pk = qk ⊢BR[C] p = q.
Thus a natural modernization of Boole’s system is the ground equational theory of Boolean
rings augmented by a set C of constants. But we want to emphasize that Boole never used the
two-element Boolean ring Z2 to test the validity of arguments, only the ordinary number system.
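The passage from Z to Z2 can be checked directly: a term taking only the values 0 and 1 at 0–1 points is unchanged by reducing the computation mod 2. A sketch with the B-idempotent term x + y − 2xy (symmetric difference):

```python
from itertools import product

h = lambda x, y: x + y - 2 * x * y   # B-idempotent: values are 0 or 1
for x, y in product((0, 1), repeat=2):
    v = h(x, y)                      # computed in the ordinary integers Z
    assert v in (0, 1) and v * v == v
    assert v == (x + y) % 2          # the same term computed in Z2,
                                     # where the summand 2xy vanishes
```

Note how much the term simplifies in Z2: x + y − 2xy collapses to x + y, which is why the Boolean-ring presentation is so economical.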
In 1880 Peirce [16] (page 22) takes strong exception to Boole’s use of equations to express
particular statements:
The two kinds of propositions [‘All A is B’ and ‘Some A is B’] are essentially different,
and every attempt to reduce the latter to the former must fail. Boole attempts to express
‘some men are not mortal’, in the form ‘whatever men have an unknown character v are not
mortal.’ But the propositions are not identical, for the latter does not imply that some men
have the character v, and accordingly, from Boole’s proposition we may legitimately infer
that ‘whatever mortals have the unknown character v are not men’; yet we cannot reason
from ‘some men are not mortal’ to ‘some mortals are not men’.
Peirce is evidently starting with Boole’s equational representation vx = v(1 − y) of ‘some men are
not mortal’ and deriving vy = v(1 − x). But the interpretation that ‘some mortals are not men’
appears to be a careless mistake on his part. Boole does not claim that the equation vx = v(1 − y)
by itself has existential import; rather, the import comes from an explicit side condition, namely that vx is not empty. Indeed Peirce's claim that any equational approach to
expressing particular propositions must fail is simply wrong.
Schröder follows Peirce in the attack, but is more careful, saying that the equation vx = vy
and side condition ‘some v is x’ can be simply replaced by the condition xy ≠ 0. And the latter condition, he says, cannot be expressed by any equation p(x, y, z, . . . ) = 0. Schröder is working
with Boolean algebra and the modern semantics, and not Boole’s algebra of logic and restricted
semantics, but his argument carries over. All he does is expand p in terms of x, y-constituents to
obtain
p(x, y, z, . . . ) = p1xy + p2xy′ + p3x′y + p4x′y′
and by interpreting x and y as the empty class or the universe in various ways it follows that all the pi must be 0 in order to capture the meaning of xy ≠ 0. (One can modify this argument to work with the restricted semantics, namely by letting x and y both be p1 one can show p1 must be 0, etc.)
So Schröder is correct in saying that one cannot capture existential import on an ‘equation by
equation’ basis. But he overlooks the possibility that for the purpose of arguments one can use
equations to syntactically capture the existential import. This only requires a small amount of care
in setting up the equations, using a slightly different system with the modern semantics than with
the restricted semantics.
We will develop the method by starting with Schröder's observation that ab ≠ 0 does indeed precisely capture the meaning of ‘Some a is b’, and, following Schröder, work in the general setting
of equations and negated equations. Thus we can take for our premises a set of ground equations
and negated equations:
S = {p1 = 0, . . . , pk = 0} ∪ {q1 ≠ 0, . . . , qm ≠ 0}.
We want to show how we can derive any consequence of the form p = 0 or of the form q ≠ 0 using
only equational logic.
For convenience let CW be an infinite collection of constants contained in C. In the case of
restricted semantics we set CW = C, but in the case of modern semantics we want C \ CW to be
124 APPENDIX 4: EXISTENTIAL IMPORT
infinite as well. For each qj in S choose a distinct cj from CW and let S* be the system of equations
S* = {p1 = 0, . . . , pk = 0} ∪ {c1q1² = c1, . . . , cmqm² = cm}.
When working with Boolean algebra or Boolean rings one can replace qj² by qj.
The first claim is that under either semantics an equation p = 0 is a consequence of S iff one can give an equational proof from the systems ⊢AB*, etc., where the * means that in the restricted case we add the rules of inference
from c = 0 infer 1 = 0, and from c = 1 infer 1 = 0, for any c ∈ C,
while in the modern semantics we add the rule
from c = 0 infer 1 = 0, for any c ∈ CW.
The second claim is that under either semantics an equation q ≠ 0 is a consequence of S iff one can give an equational proof of cq² = c from the systems ⊢AB*, etc., for some c ∈ CW.
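As an illustration of the two claims, consider deriving ‘Some b is c’ from ‘All a is b’ and ‘Some a is c’. The encoding below is our own, and the entailment is checked semantically by the Rule of 0 and 1 rather than by a formal derivation in the starred systems:

```python
from itertools import product

# Premises: a(1 - b) = 0 ("All a is b") and, with a fresh witness
# constant w drawn from CW, w*(a*c)**2 = w ("Some a is c").
# Expected consequence: w*(b*c)**2 = w, the encoding of "Some b is c".
for a, b, c, w in product((0, 1), repeat=4):
    if a * (1 - b) == 0 and w * (a * c)**2 == w:
        assert w * (b * c)**2 == w
```

The fresh constant w carries the existential import: when w is interpreted as 1 the premise forces ac to be nonzero, and the conclusion then forces bc to be nonzero as well.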
Bibliography
[1] George Boole, The Mathematical Analysis of Logic, Being an Essay Towards a Calculus of Deductive Rea-
soning. Originally published in Cambridge by Macmillan, Barclay, & Macmillan, 1847. Reprinted in Oxford by
Basil Blackwell, 1951.
[2] George Boole, An Investigation of The Laws of Thought on Which are Founded the Mathematical Theories of
Logic and Probabilities. Originally published by Macmillan, London, 1854. Reprint by Dover, 1958.
[3] S. Burris and H.P. Sankappanavar, A Course in Universal Algebra. Springer Verlag, 1981.
[4] Augustus De Morgan, Formal Logic: or, the Calculus of Inference, Necessary and Probable. Originally published
in London by Taylor and Walton, 1847. Reprinted in London by The Open Court Company, 1926.
[5] Augustus De Morgan, On the Syllogism, and Other Logical Writings, edited by Peter Heath, Yale University
Press, 1966. (A posthumous collection of De Morgan’s papers on logic.)
[6] Theodore Hailperin, Boole's Logic and Probability. Studies in Logic and the Foundations of Mathematics, 85, Elsevier, North-Holland, Amsterdam, New York, Oxford, 1976.
[7] W. Stanley Jevons, The Substitution of Similars, the True Principle of Reasoning, Derived from a Modification
of Aristotle’s Dictum. Macmillan and Co., London, 1869.
[8] W. Stanley Jevons, Elementary Lessons in Logic, Deductive and Inductive. Originally published by
Macmillan & Co., London, 1870. Reprinted 1957.
[9] W. Stanley Jevons, The Principles of Science, A Treatise on Logic and the Scientific Method. Originally
published by Macmillan and Co., London and New York, 1874. Reprinted 1892.
[10] W. Stanley Jevons, Studies in Deductive Logic, A Manual for Students. Macmillan and Co. London and New
York, 1880.
[11] W. Stanley Jevons, The Elements of Logic. Sheldon & Co. New York and Chicago, 1883.
[12] W. Stanley Jevons, Pure Logic and Other Minor Works, edited by Robert Adamson and Harriet A. Jevons. Lennox Hill Pub. & Dist. Co., New York, 1890. Reprinted 1971.
[13] M. Kneale and W. Kneale, The Development of Logic. Oxford University Press, 1962.
[14] C.I. Lewis, A Survey of Symbolic Logic. Originally published by the University of California Press, 1918. Republished by Dover, 1960.
[15] C.S. Peirce, Description of a notation for the logic of relatives, resulting from an amplification of the
conceptions of Boole’s calculus of logic. Memoirs of the Amer. Acad. 9 (1870), 317–378. Reprinted in Vol. III of
Collected Papers, 27–98.
[16] C.S. Peirce, On the algebra of logic. Chapter I: Syllogistic. Chapter II: The logic of non-relative terms.
Chapter III: The logic of relatives. Amer. Journal of Math. 3 (1880), 15–57. Reprinted in Vol. III of Collected
Papers, 104–157.
[17] R.S. Pierce, Modules over Commutative Regular Rings. Memoirs AMS No. 70, 1967.
[18] Ernst Schröder, Algebra der Logik, Vols. I–III. 1890–1910, Chelsea reprint 1966.
[19] Richard Whately, Elements of Logic. 2nd edition, published in 1826 by J. Mawman, London. Reprinted, with
a new introduction, Scholars Facsimiles & Reprints, Inc. 1975.