Equational Logic 2 (Tourlakis) - Paper
George Tourlakis
Introduction.
This note builds further on [To], where the logical “calculus” of Equational
(Predicate) Logic outlined in [GS1] was formalized and shown to be sound and
complete.
We propose here a simpler formalization than the one in [To], basing the
proof-apparatus solely on propositional rules of inference—one of which, of
course, is a version of “Leibniz”. This entails an unconstrained Deduction The-
orem (contrast with [To]), which in turn further simplifies the steps of our
reasoning.
While our “foundations” include “just” a propositional version of Leibniz,
we show that there are derived rules valid in the logic, which allow the use of
Leibniz-style substitution within the scope of a quantifier.
We also address one “weakness”—to which David Gries has already called
attention in [Gr]—of the current literature ([DSc, GS1]) on equational or cal-
culational reasoning. That is, while it is customary to mix =-steps (that is,
an application of a conjunctional ≡) and ⇒-steps (that is, an application of a
conjunctional ⇒) in a calculational proof, and while we have well-documented
rules to handle the former, the latter type of step normally seems to rely on
a compendium of ad hoc rules. We hope to have contributed towards remedying
this state of affairs, as we present a unifying, yet simple and rigorous way
to understand, ascertain validity, and therefore annotate and utilize ⇒-steps,
using the rules monotonicity and antimonotonicity.
We conclude with a section on soundness and completeness of the proposed
Logic.
The term “basic” in the title is meant to convey that we include no more
than what is necessary to lay the foundations. In particular, examples that
illustrate the power of calculational reasoning were left out.
The layout of the paper is as follows: Section 1 introduces the formal language.
Section 2 introduces the axioms and the rules of inference, and sets the rules of the game
(definitions of the two main types of substitution,† of theorem, and of proof).
Section 3 introduces a few metatheorems, including the Deduction Theorem.
The “main lemma” in section 6 is Lemma 6.15, which shows the eliminability of
propositional variables. An Appendix argues that all the axioms in [GS1] are
here theorems, but the exposition in the Appendix is no more than a “link” to
[To].
1. Syntax
Equational (first order) logic, like all (first order) logic, is “spoken” in a first
order language, L. L is a triple (V, Term, Wff), where V is the alphabet, i.e.,
the set of basic syntactic objects (symbols) that we use to build “terms” and
“formulas”. We start with a description of V , and then we describe the set of
terms (Term) and the set of formulas (Wff).
Alphabet
These variables will only be used to give a user-friendly notation for the
various versions of the rule Leibniz.
3. Equality (between “objects”—see 1.3 below for its syntactic role), “≈”.‡
4. Brackets, ( and ).
was just one notation—using the symbol [∗ := ∗∗], where ∗ is a variable and ∗∗ a formula or a
term. This notation would be annotated by surrounding text, which would indicate if capture
of free variables was, or was not, allowed. Here, instead, we ask the notation to fend for its
meaning, using different notations for the capture and no capture versions.
‡ Following the practice in Enderton [En], we use ≈ for formal equality and = for informal
(metamathematical) equality. This will enable us to write in the metatheory, for example,
A = B where A and B are formulas, meaning that the “strings” A and B are identical. A
conflict will arise though, since we will be also using = quasi-formally (in equational style
proofs) as a conjunctional alias of ≡.
8. A set of symbols (possibly empty) for constants. We normally use the meta-
symbols a, b, c, d, e, with or without subscripts, to stand for constants unless
we have in mind some alternative “standard” notation in selected areas of
application of the 1st order logic (e.g., ∅, 0, ω, etc.).
9. A set of symbols for predicates or relations (possibly empty) for each possible
“arity” n > 0. We normally use P, Q, R with or without primes to stand for
predicate symbols.
10. Finally, a set of symbols for functions (possibly empty) for each possible
“arity” n > 0. We normally use f, g, h with or without primes to stand for
function symbols.
1.1 Remark. Any two symbols mentioned in items 1–10 are distinct. More-
over (if they are built from simpler “sub-symbols”, e.g., x1 , x2 , x3 , . . . might
really be x|x, x||x, x|||x, . . . ), none is a substring (or subexpression) of any other.
1.2 Definition. (Terms) The set of terms, Term, is the ⊆-smallest set of
strings or “expressions” over the alphabet 1–10 with the following two proper-
ties:
Any of the items in 1 or 8 (a, b, c, x, y, z, etc.) are in Term.
If f is a function† of arity n and t1 , t2 , . . . , tn are in Term, then so is the
string f t1 t2 . . . tn .
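Since terms are written in prefix (Polish) notation, membership in Term is decidable by a single recursive left-to-right scan. The following is a minimal sketch, not from the paper; the token-list representation and the names `parse_term`/`is_term` are our own.

```python
# A sketch (our own) of deciding termhood for the paper's prefix notation
# "f t1 ... tn".  'arity' maps function symbols to their arities; any
# other token is taken to be a variable or constant symbol (arity 0).

def parse_term(tokens, arity, i=0):
    """Parse one term starting at tokens[i]; return the index just past it."""
    if i >= len(tokens):
        raise ValueError("ran out of tokens")
    sym = tokens[i]
    i += 1
    for _ in range(arity.get(sym, 0)):   # variables/constants consume nothing
        i = parse_term(tokens, arity, i)
    return i

def is_term(tokens, arity):
    """True iff the whole token list is exactly one term."""
    try:
        return parse_term(tokens, arity) == len(tokens)
    except ValueError:
        return False
```

For example, `is_term(['f', 'x', 'g', 'y'], {'f': 2, 'g': 1})` holds (the term f x (g y)), while `['f', 'x']` fails because f still expects a second argument.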
1.3 Definition. (Atomic Formulas) The set of atomic formulas, Af, contains
precisely:
1) The symbols true, false, and every Boolean variable (that is, p, q, . . . ).
† We will omit the qualification “symbol” from terminology such as “function symbol”.
a) Af ⊆ Wff.
d) If A is in Wff and x is any object variable (which may or may not occur (as
a substring) in the formula A), then the string ((∀x)A) is also in Wff.
We say that A is the scope of (∀x).
1.6 Definition. (Free and bound variables) An object variable x occurs free in
a term t or atomic formula A iff it occurs in t or A as a substring.
x occurs free in (¬A) iff it occurs free in A.
x occurs free in (A ◦ B)—where ◦ ∈ {≡, ∨, ∧, ⇒}—iff it occurs free in at
least one of A or B.
x occurs free in ((∀y)A) iff x occurs free in A, and y ≠ x.†
The y in ((∀y)A) is, of course, not free—even if it is so in A—as we have
just concluded in this inductive definition. We say that it is bound in ((∀y)A).
Trivially, terms and atomic formulas have no bound variables.
she needs as tools here, all over again. We will need a precise definition of
tautologies in our first order language L.
That is, a prime formula has no “explicit” propositional connectives (in the case
labeled Pri2 any propositional connectives are hidden inside the scope of (∀x)).
Clearly, A ∈ Wff iff A is a Propositional Calculus formula over P (i.e.,
propositional variables will be all the strings in P − {true, false}).
2.2 Definition. (Tautologies in Wff) A formula A ∈ Wff is a tautology iff it
is so when viewed as a Propositional Calculus formula over P.
We call the set of all tautologies, as defined here, Taut. The symbol |=Taut A
says A ∈ Taut.
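Since Definition 2.2 treats prime formulas as opaque propositional variables, checking |=Taut A reduces to a finite truth-table search. The sketch below is ours, not the paper's: a formula is represented abstractly as a Boolean function of its primes.

```python
# A minimal sketch (our own) of the tautology check behind Definition 2.2:
# a formula is a Boolean function of its prime subformulas, and we test
# every assignment of truth values to the primes.
from itertools import product

def is_tautology(f, n_primes):
    """True iff f evaluates to True under every assignment to its primes."""
    return all(f(*vals) for vals in product((True, False), repeat=n_primes))

# p ⇒ (q ⇒ p) is a tautology no matter which prime formulas p and q
# abbreviate (e.g., p could stand for ((∀x)x ≈ y)).
taut = lambda p, q: (not p) or ((not q) or p)
# p ⇒ q is not a tautology.
non_taut = lambda p, q: (not p) or q
```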
While a definition for an infinite set of premises is possible, we will not need it
here.
Before presenting the axioms, we introduce some notational conventions re-
garding substitution.
2.5 Remark. (1) An inductive definition (by induction on terms and formulas)
of the string A[x := t] is instructive and is given below:
First off, let us define s[x := t], where s is also a term:
If s = x†, then s[x := t] = t. If s = a (a constant), then s[x := t] = a. If
s = y and y ≠ x (i.e., they are different strings!), then s[x := t] = y.
If s = f r1 . . . rn —where f has arity n and r1 , . . . , rn are terms—then s[x :=
t] = f r1 [x := t]r2 [x := t] . . . rn [x := t].
We turn now to formulas.
If A is true, false or p (a boolean variable), then A[x := t] = A. If A = s ≈ r,
where s and r are terms, then A[x := t] = s[x := t] ≈ r[x := t]. If A = P r1 . . . rn
(P has arity n), then A[x := t] = P r1 [x := t]r2 [x := t] . . . rn [x := t].
If A = (B ◦ C), where ◦ ∈ {≡, ∨, ∧, ⇒}, then A[x := t] = (B[x := t] ◦ C[x :=
t]).
If A = (¬B), then A[x := t] = (¬B[x := t]).
In both cases above, the left hand side is defined just in case the right hand
side is.
Finally (the “interesting case”): say A = ((∀y)B). If y = x, then (since x is
not free in A) A[x := t] = A.
If y ≠ x and B[x := t] is defined, then A[x := t] is defined provided y is not
a substring of t. In that case, A[x := t] = ((∀y)B[x := t]).
(2) Similarly, we define A[p := W ] inductively below (◦ ∈ {≡, ∨, ∧, ⇒} as
before):

A[p := W ] =
W, if A = p;
A, if A is atomic, but not p;
(¬B[p := W ]), if A = (¬B) and B[p := W ] is defined;
(B[p := W ] ◦ C[p := W ]), if A = (B ◦ C) and B[p := W ] and C[p := W ] are defined;
((∀y)B[p := W ]), if A = ((∀y)B), provided B[p := W ] is defined and y is not free in W.
The cases for A[x \ t] and A[p \ W ] have exactly the same inductive definitions,
except that we drop the hedging “if defined” throughout, and we also drop the
restriction that y (of (∀y)) not be free in W or t.
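The distinction between the capture-checking A[x := t] and the unconditional A[x \ t] can be prototyped directly from the clauses above. The sketch below is ours, not the paper's: the tuple AST and function names are assumptions, and `None` plays the role of “undefined”.

```python
# A sketch (our own) of Remark 2.5's two substitutions on a hypothetical
# tuple AST: ('var','x'), ('const','a'), ('fun','f', t1, ...),
# ('approx', s, r), ('not', A), ('imp', A, B), ('forall','y', A).
# subst(e, x, t) computes e[x := t], returning None when the substitution
# is undefined (a free variable of t would be captured); check=False
# drops the hedging and gives the capture-permitting e[x \ t].

def free_vars(e):
    """Object variables occurring free in a term or formula e."""
    tag = e[0]
    if tag == 'var':
        return {e[1]}
    if tag == 'forall':
        return free_vars(e[2]) - {e[1]}
    return set().union(*(free_vars(a) for a in e[1:] if isinstance(a, tuple)))

def subst(e, x, t, check=True):
    """e[x := t] (None if undefined); the unconditional version when check=False."""
    tag = e[0]
    if tag == 'var':
        return t if e[1] == x else e
    if tag == 'forall':
        y, body = e[1], e[2]
        if y == x:                       # x is not free in e: nothing to do
            return e
        if check and y in free_vars(t):  # capture: e[x := t] is undefined
            return None
        b = subst(body, x, t, check)
        return None if b is None else ('forall', y, b)
    out = [tag]
    for a in e[1:]:
        if isinstance(a, tuple):
            a = subst(a, x, t, check)
            if a is None:
                return None
        out.append(a)
    return tuple(out)
```

For A = ((∀y)x ≈ y), `subst(A, 'x', ('var','y'))` is `None`—exactly the capture scenario of Remark 2.7(3)—while `check=False` happily (and unsoundly) yields ((∀y)y ≈ y).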
2.6 Definition. (Axioms and Axiom schemata) The axioms (schemata) are all
the possible “partial” generalizations‡ of the following (exactly as in [En]§ ):
† For one last time, recall that = is metalogical, and here it denotes equal strings!
‡B is a partial generalization of A iff it is the expression consisting of A, prefixed with zero
or more strings (∀x)—x may or may not occur free in A. Repetitions of the same prefix-string
(∀x) are allowed. The well known “universal closure of A”, that is (∀x1 )(∀x2 ) . . . (∀xn )A—
where x1 , x2 , . . . , xn is the full list of free variables in A—is a special case.
§ Actually, [En] only allows atomic formulas in Ax6 below, and derives the general case of Ax6 as a theorem.
(∀x)A[x] ⇒ A[t]
or even
(∀x)A ⇒ A[t]
where the presence of A[x] (or (∀x)A, or (∃x)A) and A[t] in the same
context means that t replaces contextually all x occurrences in A.
A ⇒ (∀x)A.
x ≈ t ⇒ A ≡ A[x := t].
x ≈ t ⇒ A[x] ≡ A[t]
or even
x ≈ t ⇒ A ≡ A[t]
2.7 Remark. (1) In any formal setting that introduces many-sorts explicitly
in the syntax, one will need as many versions of Ax2–Ax6 as there are sorts.
(2) Axioms Ax5–Ax6 characterize equality between “objects”. Adding
these two axioms makes the logical system (explicitly) applicable to mathe-
matical theories such as number theory and set theory. These axioms will be
used to prove the “one point rule” of [GS1] (in the Appendix).
(3) In Ax2 and Ax6 we imposed the condition that t must be “substitutable”
in x by utilizing contextual substitution [x := t].
Here is why:
Take A to stand for (∃y)¬x ≈ y. Then (∀x)A ⇒ A[x \ y] is
(∀x)(∃y)¬x ≈ y ⇒ (∃y)¬y ≈ y
and x ≈ y ⇒ A ≡ A[x \ y] is
x ≈ y ⇒ (∃y)¬x ≈ y ≡ (∃y)¬y ≈ y
neither of which, obviously, is universally valid.
The meta-remedy is to move the quantified variable(s) out of harm’s way,
i.e., rename them so that no quantified variable in A has the same name as any
(free, of course) variable in t.
This renaming is formally correct (i.e., it does not change the meaning of the
formula) as we will see in the “variant” (meta)theorem (3.10). Of course, it is
always possible to effect this renaming since we have countably many variables,
and only finitely many appear free in t and A.
This trivial remedy allows us to render the conditions in Ax2 and Ax6
harmless. Essentially, a t is always “substitutable” (so that we can use [x \ t]
instead of the restrictive [x := t]) after renaming.
2.8 Definition. (Rules of Inference) The following three are the rules of in-
ference. These rules are relations on the set Wff and are written traditionally
as “fractions”. We call the “numerator” the premise(s) and the “denominator”
the conclusion.
We say that a rule of inference is applied to the formula(s) in the numerator,
and that it yields (or results in) the formula in the denominator. We emphasize
that the domain of the rules we describe below is the set Wff. That is why we
call the rules “strong” (a “weak” rule applies on a proper subset of Wff only.
That subset is not yet defined(!). No wonder then that we prefer “strong” over
“weak” rules).
Any set S ⊆ Wff is closed under some rule of inference iff whenever the rule
is applied to formulas in S, it also yields formulas in S.
Inf1. (Propositional (Strong) Leibniz, PSL) For any formulas A, B, C and any
propositional variable p (which may or may not occur in C)
A ≡ B
(PSL)
C[p := A] ≡ C[p := B]
Inf2. (Equanimity, EQN) For any formulas A and B
A, A ≡ B
(EQN)
B
Inf3. (Transitivity, TR) For any formulas A, B, C
A ≡ B, B ≡ C
(TR)
A ≡ C
2.9 Remark. (1) PSL is the primary rule in the propositional calculus frag-
ment of Equational Logic as it is presented in [GS1]. An additional predicate
calculus version is also given there (twin rule 8.12; the second of the two needs
a correction—see the Appendix). The Leibniz rule (or its variants) is at the
heart of equational or calculational reasoning. In standard approaches to logic
it is not a primary rule, rather it appears as the well known “derived rule”
(metatheorem) that if Γ ` A ≡ B † and if we replace one or more occurrences
of the subformula A of a formula D (here D is C[p := A]) by B, to obtain D′
(that is, C[p := B]), then Γ ` D ≡ D′. No restriction on p is necessary (as we
prove in section 4). I.e., we show that the above quoted “Leibniz” is a derived
rule in our system.
Shoenfield [Sh] calls this derived rule “the equivalence theorem”.‡
(2) [GS1] use “=” for “≡” in contexts where they want the symbol to act con-
junctionally, rather than associatively, e.g., in successive steps of an equational-
style proof. We will follow this practice as well.
This may create a few confusing moments, as we use = in the metalanguage
as well!
We next define Γ-theorems, that is, formulas we can prove from the set of
formulas Γ (this Γ may be empty).
2.11 Remark. Now we can spell out what a “weak” rule of inference is: It is
a rule whose domain is restricted to be Thm∅ .
None of Inf1–Inf3 is weak.
2.12 Definition. (Γ-proofs) A finite sequence A1 , . . . , An of members of Wff
is a Γ-proof iff every Ai , for i = 1, . . . , n is one of
Pr1. A logical axiom (as in Th1 above).
Pr2. A member of Γ.
Pr3. The result of a rule Inf1–Inf3 applied to (an) appropriate formula(s) Aj
with j < i.
Proof.
A
= ⟨A ≡ A ≡ true is a tautology, hence a logical axiom⟩
A ≡ true
From 2.10 follows that any of the primary rules can be written “linearly”, that
is, premises first, followed by ` —instead of the “fraction-line”—followed by the
conclusion.
We (almost) always use this linear format for derived rules.
Proof. We have
A ⟨nonlogical assumption⟩ (1)
and
A ⇒ B ⟨nonlogical assumption⟩ (2)
Thus,
A ⇒ B
= ⟨redundant true via (1); plus PSL⟩
true ⇒ B
= ⟨(true ⇒ B) ≡ B is a tautology⟩
B
Let us call our logic, that is, a language L along with the adopted axioms,
rules of inference and the definition (2.10) of Γ-theorems (an) E-logic (“E” for
Equational).
Let us call En-logic what we obtain by keeping all else the same, but adopting
modus ponens as the only primary rule of inference. This is, essentially, the logic
in [En] (except that [En] allows neither propositional variables nor propositional
constants).
† We take symmetry of ≡ for granted, and leave it unmentioned, due to axiom group Ax1.
Proof. We already have shown that modus ponens is a derived rule of E-logic.
Thus, if Γ `En A, then Γ `E A.
Conversely, since every rule Inf1–Inf3 of E-logic is derived in En-logic, when-
ever Γ `E A we also get Γ `En A.
Now, why are Inf1–Inf3 derived in En-logic?
The reason is that each has the form of a tautological implication,
A1 , . . . , An |=Taut B
for n = 1 (PSL) or n = 2 (EQN, TR).†
⟨3.5⟩
(∀x1)A[x1, x2]
⟨Ax2 and modus ponens; x1 := z⟩
A[z, x2]
⟨3.5—see also the remark following 3.5⟩
(∀x2)A[z, x2]
⟨Ax2 and modus ponens; x2 := w⟩
A[z, w]
⟨Now z := t1, w := t2, in any order, is the same as “simultaneous substitution”⟩
⟨3.5—see also the remark following 3.5⟩
(∀z)A[z, w]
⟨Ax2 and modus ponens; z := t1⟩
A[t1, w]
⟨3.5—see also the remark following 3.5⟩
(∀w)A[t1, w]
⟨Ax2 and modus ponens; w := t2⟩
A[t1, t2]
3.8 Metatheorem. (The Deduction Theorem) For any formulas A and B and
set of formulas Γ, if Γ, A ` B, then Γ ` A ⇒ B.
3.9 Remark. (1) We now see why our E-logic (equivalently, En-logic) does
not support strong generalization A ` (∀x)A. If it did, then, by the Deduction
Theorem that we have just proved,
` A ⇒ (∀x)A (i)
Even though we have not discussed semantics yet (we do so in section 6), still we
can see intuitively that no self-respecting logic should have the above formula
as an absolute theorem, since it is not an “absolute truth”. For example, over
the natural numbers, N, we have an obviously invalid “special case” of the
schema (i):
x ≈ 0 ⇒ (∀x)x ≈ 0
NB. We often write this (under the stated conditions) as ` (∀x)A[x] ≡ (∀z)A[z].
Proof. Since z is substitutable in x under the stated conditions, A[x := z] is, of
course, defined. Thus, by Ax2 and modus ponens
(∀x)A ` A[x := z]
(∀x)A ` (∀z)A[x := z]
` (∀x)A ⇒ (∀z)A[x := z]
We conclude this section with a couple of useful metatheorems.
Proof. This trivial fact (Ax2 and Ax3 and tautological implication) is only
stated here to make it “quotable”.
(∀x)(A ⇒ B)
` ⟨Ax4 and modus ponens⟩
(∀x)A ⇒ (∀x)B
= ⟨PSL and 3.11⟩
A ⇒ (∀x)B
By EQN, Γ ` A ⇒ (∀x)B.
By 3.8, for any two formulas A and B, ` and ⇒ are “interchangeable” (strictly
speaking, ` A ⇒ B iff A ` B).
For this reason, assuming that ⇒ is conjunctional when and only
when it is used at the left margin of an annotated proof,† the above
proof could be re-written using ⇒ (the latter notation seems to be preferred in
[GS1, Gr]). The hints have to change though!
(∀x)(A ⇒ B)
⇒ ⟨Ax4⟩
(∀x)A ⇒ (∀x)B
= ⟨PSL and 3.11⟩
A ⇒ (∀x)B (1)
A1
◦ ⟨Hints⟩
A2
◦ ⟨Hints⟩
A3
◦ ⟨Hints⟩
⋮
◦ ⟨Hints⟩
An
† The “standard” ⇒ is, of course, not conjunctional: E.g., p ⇒ q ⇒ r does not say (p ⇒
q) ∧ (q ⇒ r).
4. Derived Leibniz rules
Proof. By 3.5, Γ ` (∀x)(A ⇒ B). The result follows by Ax4 and modus
ponens.
Proof. A trivial exercise, using the definition of the “text” ((∃x)A), namely,
(¬(∀x)(¬A)).
A ≡ B
(SLUS)
C[p \ A] ≡ C[p \ B]
That SLUS is “invalid” in our Logic follows from 3.4 in [To], “strong general-
ization”, which is a derived rule if SLUS is available. But we have seen that
E-logic does not support strong generalization.
A ≡ B
(SLCS)
C[p := A] ≡ C[p := B]
5. Monotonicity
5.1 Definition. We define a set of strings, the I-Forms and D-Forms, by in-
duction. It is the smallest set of strings over the alphabet V ∪ {∗}, where ∗ is
a new symbol added to the alphabet V , satisfying the following:
We will just say U is a Form, if we do not wish to spell out its type (I or D). We
will use calligraphic capital letters U, V, W, X , Y to denote Forms.
5.2 Definition. For any Form U and any formula A or form W, the symbols
U[A] and U[W] mean, respectively, the result of the uniform substitutions U[∗ \
A] and U[∗ \ W].
Our I-Forms and D-Forms—“I” for increasing, and “D” for decreasing—are mo-
tivated by, but are different from,† the Positive and Negative Forms of Schütte
[Schü].
The expected behaviour of the Forms is that they are “monotonic functions”
of the ∗-“variable” in the following sense: We expect that ` A ⇒ B will imply
` U[A] ⇒ U[B] if U is an I-Form, and ` U[A] ⇐ U[B] if it is a D-Form.
Now, ⇒ is “like” ≤ in Boolean algebras, the latter defined by “a ≤ b means
a ∨ b = b” (compare with [GS1], Axiom 3.57 for ⇒). This observation justifies
the terminology “monotonic functions”.
We now pursue in detail the intentions stated in the above remark.
Proof. Induction on Forms. The basis is obvious, and clearly the property
propagates with the formation rules.
† For example, (∗ ∧ A) is an I-Form but not a Positive Form in the sense of [Schü], since
Proof. Induction on Forms (really using the least principle and proof by contra-
diction). Let U have the least complexity among forms that have both types.
This is not the “basic” Form ∗ as that is declared to have just type I.
Can it be a form (V ∨ A)? No, for it has both types I and D, so that also
V must have both types I and D, contradicting the assumption that U was the
least complex schizophrenic Form. We obtain similar contradictions in the case
of all the other formation rules.
5.6 Lemma. For any Forms U and V, we have the following composition prop-
erties:
Case 1. U = (W ∨ A), for some A ∈ Wff. U[V] = (W[V] ∨ A). U and W have
the same type.
By I.H. and the definition of Forms, the claim follows.
We omit a few similar cases . . .
We call MON the rule “if ` A ⇒ B and U is an I-Form, then ` U[A] ⇒ U[B]”.
We call AMON the rule “if ` A ⇒ B and U is a D-Form, then ` U[A] ⇐ U[B]”.
Proof. Induction on U.
Basis. U = ∗, hence we want to prove ` A ⇒ B, which is the same as the
hypothesis.
The induction steps:
Case 1. U = (W ∨ C), for some C ∈ Wff. If W is an I-Form, then (I.H.)
` W[A] ⇒ W[B], hence ` (W[A] ∨ C) ⇒ (W[B] ∨ C) by tautological
implication.
If W is a D-Form, then (I.H.) ` W[A] ⇐ W[B], hence ` (W[A]∨C) ⇐
(W[B] ∨ C) by tautological implication.
Case 2. U = (W ⇒ C), for some C ∈ Wff. If W is an I-Form, then (I.H.) `
W[A] ⇒ W[B], hence ` (W[A] ⇒ C) ⇐ (W[B] ⇒ C) by tautological
implication.
If W is a D-Form, then (I.H.) ` W[A] ⇐ W[B], hence ` (W[A] ⇒
C) ⇒ (W[B] ⇒ C) by tautological implication.
We omit a few similar cases based on tautological implication . . .
Case 3. U = ((∀x)W). If W is an I-Form, then (I.H.) ` W[A] ⇒ W[B].
By 3.15, ` ((∀x)W[A]) ⇒ ((∀x)W[B]).
If W is a D-Form, then (I.H.) ` W[A] ⇐ W[B].
By 3.15, ` ((∀x)W[A]) ⇐ ((∀x)W[B]).
Case 4. U = ((∃x)W). As above, but relying on 3.16 instead.
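The I-Form/D-Form type of a Form can be computed mechanically from the case analysis above: the antecedent position of ⇒ flips the type, while ∨, ∧, ∀ and ∃ preserve it. The sketch below is ours, not the paper's; treating ¬ as type-flipping is our assumption for one of the omitted cases, and the tuple representation of Forms is illustrative only.

```python
# A sketch (our own): computing whether a Form is an I-Form ('I') or a
# D-Form ('D').  Forms are tuples over {'or','and','imp','not','forall',
# 'exists'} with exactly one hole '*'; other leaves are opaque formulas
# (strings).

def form_type(U):
    """'I' or 'D' for the hole in U; None if U contains no hole."""
    if U == '*':
        return 'I'
    if isinstance(U, str):               # an opaque formula: no hole here
        return None
    tag = U[0]
    flip = {'I': 'D', 'D': 'I'}
    if tag == 'not':                     # (¬W) flips W's type (our assumption)
        t = form_type(U[1])
        return flip[t] if t else None
    if tag in ('forall', 'exists'):      # ((∀x)W), ((∃x)W) keep W's type
        return form_type(U[2])
    if tag == 'imp':                     # (W ⇒ C): a hole on the left flips
        left = form_type(U[1])
        return flip[left] if left else form_type(U[2])
    # (W ∨ C), (W ∧ C): same type as whichever side holds the hole
    return form_type(U[1]) or form_type(U[2])
```

For instance, `form_type(('and', '*', 'A'))` is `'I'`—the I-Form (∗ ∧ A) of the footnote above—while `form_type(('imp', '*', ('forall', 'x', 'B')))` is `'D'`, the D-Form ∗ ⇒ (∀x)B that justifies the AMON step in Example 5.9.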
MON and AMON are applied after we have eliminated the presence of ≡ from
formulas.
5.9 Example. We illustrate the use of the rules MON or AMON by revisiting
the calculational proof fragment of 3.12.
(∀x)(A ⇒ B)
⇒ ⟨Ax4⟩
(∀x)A ⇒ (∀x)B
⇒ ⟨AMON—on ∗ ⇒ (∀x)B—and Ax3⟩
A ⇒ (∀x)B
Proof. The proof is as in 5.8, except that the induction steps under cases 3
and 4 are modified as follows:
Case 3. U = ((∀x)W). If W is an I-Form, then (I.H.)
A ⇒ B ` W[∗ := A] ⇒ W[∗ := B] (i)
We want to argue that
A ⇒ B ` (∀x)W[∗ := A] ⇒ (∀x)W[∗ := B] (ii)
where we have already incorporated ((∀x)W)[∗ := A] = (∀x)(W[∗ :=
A]), etc., and then dropped the unnecessary brackets.
Now, if the substitutions in (ii) are not defined, then there is nothing
to state (let alone prove).
Assuming that they are defined, then A ⇒ B has no free occurrence
of x. By 3.15, (ii) follows from (i).
If W is a D-Form, then we argue as above on the I.H.
A ⇒ B ` W[∗ := A] ⇐ W[∗ := B].
(∀x)A
The above ⇒ (on the left margin) is, of course, ` (see the `-passage following 3.12). Thus we have just “proved” A ` (∀x)A.
6. Soundness and Completeness of E-logic
Item (1) makes the difference between the definition of semantics here and in
[To].
6.2 Definition. Given L and a structure M = (M, I) appropriate for L. L(M)
denotes the language obtained from L by adding in V a unique name î for each
object i ∈ M . This amends both sets Term, Wff into Term(M), Wff(M).
Members of the latter sets are called M-terms and M-formulas respectively.
We extend I to the new constants: îI = i for all i ∈ M (where the meta-
mathematical “=” is that on M ).
6.4 Definition. For any formula A in Wff(M) we define the symbol AI in-
ductively. In all cases, AI ∈ {t, f}.
(1) If A is any of p or true or false, then AI has already been defined.
(2) If A is the string t ≈ s, where t and s are M-terms, then AI = t iff tI = sI
(again, the last two occurrences of = refer to equality on {t, f} and M
respectively).
(3) If A is the string P t1 . . . tn , where P is an n-ary predicate and the ti are
M-terms, then AI = t iff (t1I , . . . , tnI ) ∈ P I .
(4) If A is any of ¬B, B ∧ C, B ∨ C, B ⇒ C, B ≡ C, then AI is determined by
the usual truth tables using the values B I and C I .
(5) If A is (∀x)B, then AI = t iff (B[x := î])I = t for all i ∈ M .
Of course, the above also gives meaning to tI and AI for any terms and
formulas over the original language, since Term(M) ⊇ Term and Wff(M) ⊇
Wff.
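Over a finite structure, Definition 6.4 is directly executable. The sketch below is ours, not the paper's: it replaces the device of adding a name î for each i ∈ M by an environment mapping variables to elements of M, a standard equivalent, and the tuple AST is an assumption.

```python
# A sketch (our own) of Definition 6.4 over a finite structure M = (M, I).
# 'env' maps free variables to elements of M, standing in for the names î.

def eval_term(t, I, env):
    tag = t[0]
    if tag == 'var':
        return env[t[1]]
    if tag == 'const':
        return I[t[1]]
    return I[t[1]](*(eval_term(a, I, env) for a in t[2:]))  # function symbol

def holds(A, M, I, env=None):
    """Truth of A in (M, I) under env, following the clauses of 6.4."""
    env = env or {}
    tag = A[0]
    if tag == 'approx':                  # t ≈ s: equality on M
        return eval_term(A[1], I, env) == eval_term(A[2], I, env)
    if tag == 'pred':                    # P t1 ... tn
        return tuple(eval_term(a, I, env) for a in A[2:]) in I[A[1]]
    if tag == 'not':
        return not holds(A[1], M, I, env)
    if tag == 'imp':
        return (not holds(A[1], M, I, env)) or holds(A[2], M, I, env)
    if tag == 'forall':                  # check every i ∈ M, as in 6.4(5)
        x, B = A[1], A[2]
        return all(holds(B, M, I, {**env, x: i}) for i in M)
    raise ValueError(f"unknown connective {tag!r}")
```

Over M = {0, 1} with a constant interpreted as 0, the schema of Remark 3.9, x ≈ 0 ⇒ (∀x)x ≈ 0, fails under the environment x ↦ 0, confirming that it is not valid.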
Clearly, in the case of A |= B the above says that, having fixed the domain M ,
every “state” I that makes A true makes B true. Thus, A |= B exactly when
|= A ⇒ B, as earlier promised.
6.7 Definition. (First order theories) A (first order) theory is a collection of
the following objects:
We often name the theory by the name of its nonlogical axioms (as in “let Γ be
a theory . . . ”, in which case we write Γ ` A to indicate that A is a Γ-theorem),
but we may also name the theory by other characteristics, e.g., the choice of
language. For example, we may have two theories under discussion, that only
differ in the choice of the language (L vs., say, L′). We may call one the theory
T and the other the theory T′, in which case we indicate where deductions
take place by writing Γ `T A or Γ `T′ A as the case may be. Similarly, all
other things might be the same, except that the choice of rules of inference (or
logical axioms) is different. Again, we choose names to reflect these different
choices. We have already used such notation and terminology: E-logic and En-
logic. We now are saying that we may also use the terminology “E-theory” and
“En-theory”.
A pure theory is one with Γ = ∅.
6.8 Remark. Remarks embedded in the above definition justify the use of the
indefinite article in “A pure theory . . . ”.
Towards the soundness result below we carefully look at two nastily tedious
(but easy) lemmata.
(Induction on A):
Basis.
If A = p or A = true or A = false, then A[x := t][y := a] = A = A[y := a][x := t].
If A = P r1 . . . rn , then
A[x := t][y := a] = P r1[x := t][y := a] . . . rn[x := t][y := a]
= P r1[y := a][x := t] . . . rn[y := a][x := t]
= A[y := a][x := t]
If A = r ≈ s, then
A[x := t][y := a] = r[x := t][y := a] ≈ s[x := t][y := a]
= r[y := a][x := t] ≈ s[y := a][x := t]
= A[y := a][x := t]
6.11 Lemma. Given a structure M = (M, I), a term s, and formula A over
L(M).
Let t be another term over L(M), such that tI = i ∈ M .
Then, (s[x := t])I = (s[x := î])I and (A[x := t])I = (A[x := î])I , in the
latter case on the assumption that A[x := t] is defined.
This almost says the intuitively expected (but formally incorrect): (A[t])I =
AI [tI ].
Proof. (Induction on s):
Basis. s[x := t] = s if s ∈ {y, a, ĵ} (y ≠ x). Hence (s[x := t])I = sI =
(s[x := î])I in this case. If s = x, then x[x := t] = t and x[x := î] = î, and the
claim follows once more.
For the induction step let s = f r1 . . . rn , where f has arity n. Then
(s[x := t])I = f I (r1[x := t])I , . . . , (rn[x := t])I
= f I (r1[x := î])I , . . . , (rn[x := î])I    by I.H.
= (s[x := î])I
(Induction on A):
Basis. If A = p or A = true or A = false, then A[x := t] = A = A[x := î], and
the claim is trivial.
Similarly if A = r ≈ s.
The property we are proving, clearly, propagates with boolean connectives.
Let us do the induction step just in the case where A = (∀w)B. If w = x the
result is trivial. Otherwise, we note that—since we assume that t is substitutable
Proof. The pure E-theory over a fixed alphabet L is equivalent to the En-theory
over the same alphabet (3.4). Thus the proof proceeds for an En-theory.
Let ` A. Pick an arbitrary structure M = (M, I) appropriate for L and do
induction on ∅-theorems to show that |=M A.
Basis. A is a logical axiom (see 2.6).
Now, axioms in group Ax1 are tautologies over prime formulas. That is,
regardless of the values P I of prime formulas P in B, if B is a tautology, then
B I = t (see 2.2 and 6.4(4)). By 6.4(5), any (partial) generalization, A, of B
will also come out t under I. Thus, |=M A in this case.
We next show that if A is a partial generalization of (∀x)B ⇒ B[x := t],
then AI = t, from which follows that |=M A. We ask the reader to verify the
satisfiability—in the arbitrary M—of all the remaining axioms.
By 6.4(5), it suffices to prove that
((∀x)B ⇒ B[x := t])I = t (1)
but
(B[x := t])I = f (3)
By 6.4(5) and (2), (B[x := î])I = t for all i ∈ M .
B1 , . . . , Bn (1)
where Bn = A.
Basis. n = 1. Suppose that A is a logical axiom. Then A[p := W ] is as well
(by 2.6), thus Γ ` A[p := W ].
Suppose that A is a nonlogical axiom. Then A[p := W ] = A by the condition
on the proof, thus Γ ` A[p := W ].
We may assume that we are working in En-logic. On the induction hypothesis
that the claim is fine for proof-lengths < n, let us address the case of n:
If A (i.e., Bn ) is logical or nonlogical, then we have nothing to add.
† A theory T′ over the language L′ is a conservative extension of a theory T over the
language L if, first of all, every theorem of T is a theorem of T′, and (the conservative part)
moreover, any theorem of T′ that is over L—the language of T—is also a theorem of T. That
is, T′ proves no new theorems in the old language.
So let Bj = (Bi ⇒ A) in (1) above, where i and j are each less than n (i.e.,
the last step of the proof was an application of modus ponens).
By I.H., Γ ` Bi [p := W ] and Γ ` Bi [p := W ] ⇒ A[p := W ]. Thus, by modus
ponens, Γ ` A[p := W ].
6.15 Main Lemma. ([To]) Let A be a formula over the language L of sec-
tion 1, and let p be a propositional variable that occurs in A.
Extend the language L by adding P , a new 1-ary predicate symbol.
Then, |= A iff |= A[p := (∀x)P x] and ` A iff ` A[p := (∀x)P x].
Proof. Fix attention to the pure E-theory over a fixed language L. Let A be a
formula in the language, and let |= A.
Denote by A′ the formula obtained from A by replacing each occurrence of
true (respectively false) by p ∨ ¬p (respectively p ∧ ¬p), where p is a propositional
variable not occurring in A. Let A′′ be obtained from A′ by replacing each
propositional variable p, q, . . . in it by (∀x)P x, (∀x)Qx, . . . respectively, where
P, Q, . . . are new predicate symbols (so we expand L by these additions).
Clearly, by 6.15, |= A′′. This formula is in the language of [En] (which is the
same as L of section 1, but has no propositional variables or constants). Thus,
by completeness of En-logic/E-logic over such a “restricted” language (proved
in [En]), ` A′′, the proof being carried out in the restricted language.
But, trivially, this proof is valid over the language L (same axioms, same
rules), hence also ` A′, by 6.15.
Finally, by SLCS—since ` p ∨ ¬p ≡ true and ` p ∧ ¬p ≡ false—and EQN,
we get ` A.
7. Appendix
The reader is referred to [To] where all the axioms in [GS1], chapters 8 and 9,
were shown to be derived in the logic of [To].
Practically identical proofs are available within our E-logic, and they will
not be repeated here.
The justification of uses of generalization will have to be more careful in E-logic
(in [To] we could just go ahead with strong generalization). The reader should
be able to provide the right wording in each case, “translating” the proofs in
[To] to the present setting.
We recall that axiom schemata Ax5 and Ax6 are used in the proof of the
“one-point rule” (see [To]).
We will only revisit here axiom (9.5) of [GS1]—which is not an axiom of
our E-logic—and the Leibniz rules (8.12) of [GS1]. Nomenclature and numbers
given in brackets are those in [GS1].
A.1 “Distributivity of ∨ over ∀ (9.5)”. This says that
(∀x)(A ∨ B)
= ⟨WLUS and ` A ∨ B ≡ ¬A ⇒ B⟩
(∀x)(¬A ⇒ B)
= ⟨3.14⟩
¬A ⇒ (∀x)B
= ⟨` ¬A ⇒ (∀x)B ≡ A ∨ (∀x)B⟩
A ∨ (∀x)B
A.2 The twin rules “Leibniz (8.12)” ([GS1], p.148), are stated below. The
ones immediately below are the “no-capture” versions, using contextual
substitution.
A ≡ B
(∀x)(C[p := A] ⇒ D) ≡ (∀x)(C[p := B] ⇒ D)
and
D ⇒ (A ≡ B)
(1)
(∀x)(D ⇒ C[p := A]) ≡ (∀x)(D ⇒ C[p := B])
and
|= D ⇒ (A ≡ B)
that is
|= x ≈ 0 ⇒ (x ≈ 0 ≡ true)
but
⊭ (∀x)(D ⇒ C[p \ A]) ≡ (∀x)(D ⇒ C[p \ B])
that is
` D ⇒ (A ≡ B)
To this end, assume D. This yields A ≡ B † (by modus ponens and the
premise of (1)). By SLCS, C[p := A] ≡ C[p := B] follows, and hence so
does (4) (Deduction Theorem).
Now, the formula in (4) yields
and
` (D ⇒ C[p := A]) ⇐ (D ⇒ C[p := B]) (7)
Since both (6) and (7) are absolute theorems, MON—on the I-Form (∀x)∗ —
and the Tautology Theorem conclude the argument.
Note that we cannot do any better: If (1) is taken literally (“strongly”),
then it yields the invalid in E-logic strong generalization A ` (∀x)A (take
D = B = true, C = p in (1)).
8. Bibliography
[Ba] Barwise, J. “An introduction to first-order logic”, in Handbook of Math-
ematical Logic (J. Barwise, Ed.), 5–46, Amsterdam: North-Holland Pub-
lishing Company, 1978.