Theory of Computation
Frank Stephan
March 26, 2018
Introduction.
Example. “All men are mortal. Socrates is a man. Hence Socrates is mortal.” Logic
studies sentences and their consequences, in particular ways to deduce new knowledge
from previously known correct sentences.
Example. “All birds fly. Penguins are birds. Why do they not fly?” The assumption
“All birds fly.” is just wrong. From wrong assumptions, anything can be deduced.
In logic, “All X do Y” should mean that every object in the set X does what the
sentence Y says, while in everyday language, exceptions are often accepted.
2. If a sentence X follows logically from a sentence Y, how does one prove this and
is such a proof always possible?
3. Can everything that is correct about natural numbers also be proven? Or are
there sentences X about natural numbers which are true but which cannot be
proven? This will only be partially treated in this lecture and there is a more
specialised lecture in Year 5 which gives a comprehensive answer to this question.
First Model of Logic: Sentential Logic. This model is quite general and considers
basic sentences A1, A2, A3, ... and then it studies what one can deduce about the truth
of these sentences, when certain axioms are given, say “A0 or A1 is true” and “If A2
is true, so are A0 and A1.”
Logic and First Order Logic are studied. Example of First-Order Sentences over the
natural numbers:
∀x ∃y [x < y];
∀x ∀y [x + y = y + x];
∃x ∀y [x ≤ y];
∀x ∀y [x ≤ y ↔ ∃z [x + z = y]].
The first formula says that for every number x there is a number y which is strictly
larger than x. The second formula says that the sum of two numbers does not depend
on the order of these numbers. The third formula says that there is a natural number
which is less than or equal to all other natural numbers; this number is 0. The fourth
formula says that for all natural numbers x, y, the number x is less than or equal to y
iff y is the sum of x and some further natural number z.
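The fourth formula can be tested mechanically on an initial segment of the natural numbers. The following JavaScript sketch (JavaScript being the language also used for the parser later in these notes) does this for a bound N = 20, which is an arbitrary choice; of course, a finite check is an illustration and not a proof of the universal statement.

```javascript
// Brute-force check of ∀x ∀y [x ≤ y ↔ ∃z [x + z = y]] over the
// finite range 0..N (illustrative only; N is an arbitrary bound).
const N = 20;

function existsZ(x, y) {
  // ∃z [x + z = y] with z ranging over 0..N
  for (let z = 0; z <= N; z++) { if (x + z === y) { return true; } }
  return false;
}

let holds = true;
for (let x = 0; x <= N; x++) {
  for (let y = 0; y <= N; y++) {
    // x ≤ y must hold iff some z with x + z = y exists
    if ((x <= y) !== existsZ(x, y)) { holds = false; }
  }
}
console.log(holds); // true
```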
Set theory codes up everything using sets, including natural numbers and real numbers;
the latter are often identified with the subsets of the natural numbers or the
functions from natural numbers to natural numbers. The coding will not be studied
in this lecture.
Two sets X and Y are the same iff they have the same elements. The following
operations on sets are quite common:
The subset-relation is defined as follows: X ⊆ Y iff every element of X is also in Y ;
X ⊂ Y iff X ⊆ Y and Y − X contains at least one element.
Special sets: ∅ is the empty set; N is the set of natural numbers, it is {0, 1, 2, 3, . . .}.
Z is the set of all integers, it is {. . . , −3, −2, −1, 0, 1, 2, 3, . . .}.
Selection of subsets: {t ∈ X : t ∉ Y} defines X − Y; one can also define informally the
union as {t : t ∈ X or t ∈ Y}; however, here the t is not bound to be in some set and
then this definition sometimes leads to things which are not sets. The intersection is
{t ∈ X : t ∈ Y} = {t : t ∈ X and t ∈ Y}. General operators for sets:
⋃X = {t : ∃Y ∈ X [t ∈ Y]}
and
⋂X = {t : ∀Y ∈ X [t ∈ Y]}
and the latter is only a set when X is not empty.
Example. Consider X = {{0, 1, 2}, {1, 2, 3}, {2, 3, 4}}. Now ⋃X = {0, 1, 2, 3, 4} and
⋂X = {2}.
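The two operators can be illustrated in code. The following JavaScript sketch codes each member of the family X as an array of numbers (an ad-hoc representation, not genuine sets).

```javascript
// The example family X from above, each member coded as a JS array.
const X = [[0, 1, 2], [1, 2, 3], [2, 3, 4]];

// ⋃X: every t that lies in at least one member of X
const bigUnion = [...new Set(X.flat())].sort((a, b) => a - b);

// ⋂X: every t (taken from the first member) that lies in all members
const bigIntersection = X[0].filter(t => X.every(Y => Y.includes(t)));

console.log(bigUnion);        // [0, 1, 2, 3, 4]
console.log(bigIntersection); // [2]
```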
Pairs and Sequences. An ordered pair ⟨X, Y⟩ can be formed by any definition
which satisfies that ⟨X, Y⟩ = ⟨V, W⟩ iff X = V and Y = W. Furthermore,
⟨X, Y, Z⟩ is a short-hand for ⟨⟨X, Y⟩, Z⟩. An example is to represent the pair ⟨X, Y⟩
as {{X, Y}, {Y}}; however, any other definition which achieves this is also okay. This
definition generalises to ⟨x1, x2, ..., xn, xn+1⟩ = ⟨⟨x1, x2, ..., xn⟩, xn+1⟩. For example,
⟨1, 2, 4, 8, 16⟩ stands for ⟨⟨⟨⟨1, 2⟩, 4⟩, 8⟩, 16⟩.
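The nesting can be made concrete in code. The following JavaScript sketch models a pair as a two-element array (any pairing with the characteristic property would do) and builds sequences by iterating the pair constructor.

```javascript
// Sequences as iterated pairs: ⟨x1,…,xn,xn+1⟩ = ⟨⟨x1,…,xn⟩,xn+1⟩.
// A pair is modelled as a two-element JS array (an ad-hoc choice).
const pair = (x, y) => [x, y];

// Build ⟨x1,…,xn⟩ by folding the pair constructor from the left;
// a one-element sequence ⟨x1⟩ is just x1 itself.
function seq(...xs) {
  return xs.reduce((acc, x) => pair(acc, x));
}

// ⟨1,2,4,8,16⟩ is literally ⟨⟨⟨⟨1,2⟩,4⟩,8⟩,16⟩
console.log(JSON.stringify(seq(1, 2, 4, 8, 16)));
// [[[[1,2],4],8],16]
```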
Lemma 0A. For all natural numbers n, m and all sequences ⟨x1, ..., xn⟩ and
⟨y1, ..., yn+m⟩, if ⟨x1, ..., xn⟩ = ⟨y1, ..., yn+m⟩ then x1 = ⟨y1, ..., ym+1⟩.
Proof. One proves this by induction over n. For n = 1, the equality follows from
the way one writes the formulas, using that ⟨x1⟩ = x1. If Lemma 0A is true for
a given n, then it also holds for n + 1, as ⟨x1, ..., xn+1⟩ = ⟨⟨x1, ..., xn⟩, xn+1⟩ and
⟨y1, ..., yn+m+1⟩ = ⟨⟨y1, ..., yn+m⟩, yn+m+1⟩; from the equality of the two sequences
follows the equality of the first components of these pairs, and then the statement
follows from the induction hypothesis. Thus the inductive step goes through for all n
and so the statement is true for all n by induction.
Cartesian Products and Powers. The set X × Y is the set of all pairs ⟨a, b⟩ with
a ∈ X and b ∈ Y. Furthermore, the set Y^n is the set of sequences of length n of
elements x1, ..., xn ∈ Y and the formal definition is the following one:
Y^n = {⟨x1, x2, ..., xn⟩ : x1, x2, ..., xn ∈ Y}.
Relations. A relation is just a set of n-tuples where the m-th component has to
come from the m-th base set. In particular one is interested in relations between
two sets X and Y. For such an R ⊆ X × Y, one denotes with dom(R) the domain
{x : ∃y ∈ Y : ⟨x, y⟩ ∈ R} and with ran(R) the range {y : ∃x ∈ X : ⟨x, y⟩ ∈ R}.
Zorn’s Lemma. Assume that X ⊆ P(Y). A subset Z of X is called a chain iff the
subset-relation ⊂ satisfies trichotomy on Z. Now Zorn’s Lemma says the following:
If for every chain Z ⊆ X it holds that (⋃Z) ∈ X, then there is an element V ∈ X
which is maximal, that is, there is no set W ∈ X with V ⊂ W.
One says that X has at most as many elements as Y iff there is a one-one function
f : X → Y , that is, f maps the elements of X to elements of Y without repetition.
Two sets X, Y have the same size iff X has at most as many elements as Y and Y
has at most as many elements as X. The Schröder-Bernstein Theorem says that two
sets have the same size iff there is a function f : X → Y which is one-one and onto.
Cardinal numbers. A cardinal is a set chosen among all sets of the same size which
represents the sets of this size; cardinals can therefore be viewed as both numbers
and sets. Usually, ℵ0 represents the countable sets. For n ∈ N, the number n represents
the cardinality of the set {m ∈ N : m < n} or, more generally, n is the cardinal of all
sets with n elements. When one assumes the Axiom of Choice, then also all uncountable
sets have a cardinal. One says that two cardinals κ and λ satisfy κ ≤ λ iff there is a
one-one mapping from a set with cardinal κ to a set with cardinal λ. In general, one
uses the following notions:
• Card(X) ≤ Card(Y ) iff there is a one-one function with domain X and the
range contained in Y ;
• Card(X) = Card(Y ) iff there is a one-one function with domain X and range
Y , that is, the function is a bijection from X onto Y ;
• For those of you who know a bit of set theory and ordinals (which are generalisations
of the natural numbers), if α is an ordinal then the cardinal ℵα is the size
of the smallest set X which satisfies Card(X) > ℵβ for all β < α and Card(X)
> n for all natural numbers n. Thus one has a hierarchy 0 < 1 < 2 < ... <
ℵ0 < ℵ1 < ℵ2 < ... < ℵω < ℵω+1 < ℵω+2 < ... < ℵω·2 < ℵω·2+1 < ℵω·2+2 < ... and this
hierarchy goes beyond the size of any given set.
Continuum Hypothesis. One denotes the cardinal of the real numbers with 2^ℵ0 and
one cannot determine its exact value. One only knows that whenever 2^ℵ0 = ℵω·n+m
then m > 0; furthermore, for all n ∈ N and m ∈ {1, 2, 3, ...}, the equation
2^ℵ0 = ℵω·n+m is consistent with the axioms ZFC of set theory. Cantor conjectured
that 2^ℵ0 = ℵ1, but was not able to prove it and today one knows that one can neither
prove nor disprove this conjecture.
The conjecture is known as Continuum Hypothesis.
Adding and Multiplying Cardinals. If X and Y are sets, then the sets X×{0} and
Y × {1} are disjoint. Now one defines Card(X) + Card(Y ) as Card((X × {0}) ∪ (Y ×
{1})). Furthermore, Card(X) · Card(Y) is Card(X × Y). Note that these operations
turn out to be independent of the representatives, thus they define arithmetic on
cardinals. The next result shows that the arithmetic on the infinite cardinals is quite
easy.
Proof for X = N. For each sequence x = ⟨x1, x2, ..., xn⟩, one considers the set
{x1, x1 + x2 + 1, ..., x1 + x2 + ... + xn + n − 1} and one can see that there is a
bijection between sequences of length n and n-element sets. The sequence of length 0
corresponds to ∅. So each sequence is identified with a finite set. Furthermore,
one can now map a finite set D to the number x = Σ_{y∈D} 2^y; this is again a bijection.
The concatenation of these two bijections gives a bijection from all finite sequences of
natural numbers to the natural numbers.
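The two bijections from the proof can be composed in a short JavaScript sketch (for small inputs only; the helper names seqToSet and setToNumber are ad-hoc choices, not from the text).

```javascript
// First bijection: a sequence ⟨x1,…,xn⟩ is mapped to the n-element
// set {x1, x1+x2+1, …, x1+…+xn+n-1}; second bijection: a finite
// set D is mapped to Σ_{y∈D} 2^y.
function seqToSet(xs) {
  const set = []; let sum = -1;
  for (const x of xs) { sum += x + 1; set.push(sum); }
  return set; // strictly increasing, so it has xs.length elements
}

function setToNumber(D) {
  return D.reduce((n, y) => n + 2 ** y, 0);
}

// The concatenation of the two bijections
const code = xs => setToNumber(seqToSet(xs));

console.log(code([]));     // ∅ ↦ 0
console.log(code([0]));    // {0} ↦ 1
console.log(code([1, 0])); // {1,2} ↦ 6
```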
Chapter 1.0
Example. One can formalise statements. For example, in case of chemical sub-
stances, one could list out for each chemical element whether it was detected in a
substance. However, one might concentrate on important elements for the findings.
Here some examples of summarised investigation results on substances:
2. C ∧ O ∧ H: The substance contains carbon (C), oxygen (O) and hydrogen (H);
3. Cl ∧ ¬K: The substance contains chlorine but does not contain potassium;
4. (Mg ∨ Ca) ∧ F: The substance contains at least one of magnesium (Mg) and
calcium (Ca); it furthermore also contains fluorine (F);
The statements are given by atoms (which each say that the substance investigated
contained a certain chemical element). These statements can be connected by con-
nectives “and” (∧), “or” (∨), “not” (¬) and “if-then” (→). In some cases, there are
several ways to state the results formally. Both of the following sentences express the
same fact:
1. ¬(Cl ∨ K): It did not happen that the substance contained chlorine or contained
potassium;
2. ¬Cl ∧ ¬K: The substance contained neither chlorine nor potassium.
Another example of this is the following statement which says that the substance
contains at least two of the elements gold (Au), silver (Ag) and copper (Cu):
One can use truth-tables to check whether two formulas are equivalent. For example,
for the formulas ¬(Cl ∨ K) and ¬Cl ∧ ¬K, one computes the values of these formulas
for each combination of values “false” and “true” for Cl and K. Then one will see
that both formulas evaluate to “true” exactly when both atoms Cl and K are set to
“false”. Thus the formulas are equivalent.
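Such a truth-table check is easy to mechanise. In the following JavaScript sketch the two formulas are hard-coded as functions of the values of Cl and K (an illustrative shortcut; a general equivalence checker would parse the formulas first).

```javascript
// Truth-table check that ¬(Cl ∨ K) and ¬Cl ∧ ¬K are equivalent:
// both formulas are evaluated for all four assignments.
const f1 = (cl, k) => 1 - Math.max(cl, k); // ¬(Cl ∨ K)
const f2 = (cl, k) => (1 - cl) * (1 - k);  // (¬Cl) ∧ (¬K)

let equivalent = true;
for (const cl of [0, 1]) {
  for (const k of [0, 1]) {
    if (f1(cl, k) !== f2(cl, k)) { equivalent = false; }
  }
}
console.log(equivalent); // true: the two formulas are equivalent
```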
Examples. The following are wffs: (A1 → (A5 → (¬A6 ))); ((¬A2 ) → (¬A7 )).
The following are not wffs, but can easily be made into some by introducing
opening and closing brackets accordingly: A1 → ¬A2 ; (A3 → A4 ) → ¬A5 . Informally,
such formulas are used whenever it is clear for what wff they stand. Here the first
of these examples would stand for (A1 → (¬A2 )) and the second for ((A3 → A4 ) →
(¬A5 )).
The following are definitely not wffs and such formulas should also not be used:
)A1 → A2(; ¬(A1¬ → A2). The first of these has the opening and closing brackets
mixed up; for the second it is even more difficult to see what is meant. Similarly,
formulas without balanced brackets are also invalid.
Chapter 1.1 – The Language of Sentential Logic
The letters of the language and their meaning. The logical language uses
atoms A1 , A2 , A3 , . . . and connectives →, ↔, ∧, ∨, ¬ and brackets and truth-constants
0 (false) and 1 (true). Each of the connectives has a meaning which can be summarised
by the following truth-table:
The columns list the two sentence symbols (atoms) A1 and A2, the implication
(implies), the equivalence (equals), the conjunction (and), the disjunction (or), the
negated equivalence (exclusive or) and the negation (not):

A1  A2 | A1 → A2  A1 ↔ A2  A1 ∧ A2  A1 ∨ A2  A1 ⊕ A2  ¬A1
 0   0 |    1        1        0        0        0       1
 0   1 |    1        0        0        1        1       1
 1   0 |    0        0        0        1        1       0
 1   1 |    1        1        1        1        0       0
The letters of the language are also called symbols. Each symbol has the length 1,
including every atom and every bracket. So the formula (¬A3) has the length 4 and the
formula (A3 ∨ A8) has the length 5. Finite sequences of symbols are called expressions
and those which are meaningful are called formulas. If they in addition also adhere to
the syntactic requirements strictly, they are called well-formed formulas (wffs).
Proof. Assume that a formula α together with its construction sequence β1, β2, ..., βn
is given. Every βm is either an atom or obtained by some construction function from
βi, βj with i, j < m. It follows then from the closure of S under all construction
functions and from the fact that S contains all atoms that, whenever all formulas βk
with k < m are in S, so is βm. Thus one obtains from ordinary induction that all βm
are in S and hence α = βn is in S.
Example. Let S be the set of all expressions which have the same amount of opening
and closing brackets. Then S contains all wffs.
Proof. All atoms are in S, as the number of opening and the number of closing
brackets in atoms are both zero. Furthermore, if α, β ∈ S then there are i and j
such that α has i opening and i closing brackets and β has j opening and j closing
brackets. Now the following formulas have i + j + 1 opening and i + j + 1 closing
brackets: (α → β), (α ↔ β), (α ∧ β), (α ∨ β), (α ⊕ β); furthermore, the formula
(¬α) has i + 1 opening and i + 1 closing brackets. Thus the constructor functions
map formulas in S to formulas in S and S is closed under all constructor functions.
It follows that S contains all wffs.
Chapter 1.2 – Truth Assignments
Logic and Values. Within this course, the topic is {0, 1}-valued logic. Other people
also investigate {0, 1, u}-valued logic where “u” stands for unknown. Then whenever
the value of an operation cannot be determined, it is u. For example, 0 ∧ u = 0, as
the and (∧) is already false when one operand is 0, whatever the other operand turns
out to be; however, u ∧ u = u, as one has no idea what the two inputs u of the and
stand for. Logics with more than three values have also been investigated and the
meaning of the third value can differ between different versions of three-valued logics.
From now on, the only truth-values are 0 and 1.
Example. If ν(A1) = 1, ν(A2) = 0 and ν(A3) = 1 then the goal would be to
construct an extension ν̄ of ν to all wffs which is consistent with the definitions
of the connectives, for example, ν̄((A1 ∧ A2) ∨ (¬A3)) = 0.
2. ν̄(¬α) = 1 − ν̄(α);
These arithmetic formulas just say that ν̄ of a connective of two formulas α and
β combines ν̄(α) and ν̄(β) exactly in the way the truth-table for the connective
prescribes.
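The recursive computation of ν̄ can be sketched in JavaScript. Here formulas are represented as nested arrays (an ad-hoc representation not taken from the text) and each connective is computed by the arithmetic of its truth-table.

```javascript
// A sketch of the extension ν̄: each binary connective is given by
// an arithmetic expression that reproduces its truth-table.
const ops = {
  "∧": (a, b) => a * b,
  "∨": (a, b) => a + b - a * b,
  "⊕": (a, b) => a ^ b,
  "→": (a, b) => 1 - a + a * b,
  "↔": (a, b) => 1 - (a ^ b)
};

function nuBar(formula, nu) {
  if (typeof formula === "string") { return nu[formula]; } // atom
  if (formula[0] === "¬") { return 1 - nuBar(formula[1], nu); }
  const [op, alpha, beta] = formula; // binary connective
  return ops[op](nuBar(alpha, nu), nuBar(beta, nu));
}

// ν(A1)=1, ν(A2)=0, ν(A3)=1 gives ν̄((A1∧A2)∨(¬A3)) = 0
const nu = { A1: 1, A2: 0, A3: 1 };
console.log(nuBar(["∨", ["∧", "A1", "A2"], ["¬", "A3"]], nu)); // 0
```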
Example. Consider the formula (((A1 → A2 )∨(A3 ↔ A4 ))∧(¬A5 )) and the following
two truth-assignments µ and ν:
In short words: after knowing µ on all relevant atoms, one can determine µ̄ on all
subformulas in the constructor sequence inductively in order to get µ̄ for the full
formula. Similarly for ν and ν̄. One can also write this in short-hand.
So the values of the truth-assignment stand below the atom or below the connective;
in the latter case they are the value which the connective takes during evaluating the
formula.
Once none of the rules applies, the formula is either 0 or 1 and that value is ν(α).
Example. Consider the formula ((A1 → A2 ) ∨ (A3 ↔ A4 )) and the following truth-
assignment µ:
• µ(A1 ) = 1, µ(A2 ) = 1, µ(A3 ) = 1 and µ(A4 ) = 0.
• Now the algorithm does the following replacements:
((A1 → A2 ) ∨ (A3 ↔ A4 ));
((1 → A2 ) ∨ (A3 ↔ A4 ));
((1 → 1) ∨ (A3 ↔ A4 ));
(1 ∨ (A3 ↔ A4 ));
(1 ∨ (1 ↔ A4 ));
(1 ∨ (1 ↔ 0));
(1 ∨ 0);
1.
Note that the order of the replacements is not unique; for example one could have
replaced all atoms first and then done the connectives.
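The replacement process can be imitated by string rewriting. The following JavaScript sketch uses single letters as atoms to keep the pattern matching simple (an assumption for illustration; the notes use atoms A1, A2, ...) and reduces one innermost subformula per step.

```javascript
// Evaluation by repeated replacement: atoms become their values,
// then innermost patterns such as (1→0) or (¬1) are reduced until
// only 0 or 1 remains.
const reduceOps = {
  "→": (a, b) => (a === 0 || b === 1) ? 1 : 0,
  "↔": (a, b) => (a === b) ? 1 : 0,
  "∧": (a, b) => a * b,
  "∨": (a, b) => (a + b > 0) ? 1 : 0
};

function evaluate(formula, mu) {
  // replace every atom by its truth-value
  let s = formula.replace(/[A-Z]/g, atom => String(mu[atom]));
  let previous;
  do { // repeatedly reduce an innermost subformula to its value
    previous = s;
    s = s.replace(/\(¬([01])\)/, (_, a) => String(1 - Number(a)));
    s = s.replace(/\(([01])([→↔∧∨])([01])\)/,
      (_, a, op, b) => String(reduceOps[op](Number(a), Number(b))));
  } while (s !== previous);
  return Number(s);
}

// ((A→B)∨(C↔D)) with A=1, B=1, C=1, D=0, as in the example above
console.log(evaluate("((A→B)∨(C↔D))", { A: 1, B: 1, C: 1, D: 0 })); // 1
```

As noted in the text, the order of the replacements is not unique; this sketch happens to reduce the leftmost innermost subformula first.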
Theorem 12A. For every truth assignment ν there is a unique function ν̄ which
assigns to all wffs a unique truth-value respecting all the connectives and the values of
ν at the atoms.
The proof will follow in Chapters 1.3 and 1.4. However, the above examples already
indicate that it is true. One uses the following definition for further studies.
3. {A1, A2} |= (A1 ∧ A2): If A1 and A2 are both true, then so is their conjunction.
4. {A1} |= (A1 ∨ A2): If A1 is true, then so is the disjunction of A1 with any formula.
One says that a set of formulas X is satisfiable iff there is a truth-assignment ν such
that ν(α) = 1 for all α ∈ X. One of the goals of Chapter 1 is also to prove the
following compactness theorem.
1. In an ∧ or ∨, if one side of the connective determines the value, the other side
can be ignored and therefore not all different possibilities of the atoms need to
be listed;
The proofs of the first two items are standard by truth-tables. The fourth item is
easy to see. The first formula of the fourth item is clear, the second formula of the
fourth item satisfies the following: the right side can only be false when (A1 ∧ A2 ) is
true and A3 is false; however, then also A1 is true and therefore (A1 → A3 ) is false as
well, so that the second statement is also right in this case.
For the third item, let α = (γ → δ) and β = (δ → A4 ) → (γ → A4 ), where γ and
δ are determined later. Now assume that α is true. If A4 is true, then (δ → A4 ) and
(γ → A4 ) are both true and β is true. If A4 is false and γ is true, then also δ is true
by α and therefore β is the implication (1 → 0) → (1 → 0) which is true. If γ is false,
then there is (0 → 0) on the right side of β and again β is true. Thus {α} |= β.
Now let γ = (A1 → A2 ) → A3 and δ = (A1 → A3 ). By the fourth item, it
holds that {γ} tautologically implies δ. This is equivalent to saying that (γ → δ) is a
tautology. Thus α is a tautology. By the previous argument, it follows that β is also
a tautology. As β is the statement of the third item, the third item is proven.
2. Distributive laws for ∨ versus ∧, ∧ versus ∨ and ∧ versus ⊕. Here examples for
∧ versus ⊕: ((A1 ∧(A2 ⊕A3 )) ↔ ((A1 ∧A2 )⊕(A1 ∧A3 ))) and (((A1 ⊕A2 )∧A3 ) ↔
((A1 ∧ A3 ) ⊕ (A2 ∧ A3 ))).
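The first of these example laws can be confirmed by brute force over all eight assignments. A JavaScript sketch with the two sides hard-coded as functions of the atom values:

```javascript
// Brute-force verification that the distributive law of ∧ over ⊕,
// ((A1∧(A2⊕A3)) ↔ ((A1∧A2)⊕(A1∧A3))), is a tautology.
const left  = (a, b, c) => a & (b ^ c);       // A1∧(A2⊕A3)
const right = (a, b, c) => (a & b) ^ (a & c); // (A1∧A2)⊕(A1∧A3)

let tautology = true;
for (const a of [0, 1]) {
  for (const b of [0, 1]) {
    for (const c of [0, 1]) {
      // the equivalence holds iff both sides agree
      if (left(a, b, c) !== right(a, b, c)) { tautology = false; }
    }
  }
}
console.log(tautology); // true
```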
Chapter 1.3 – A Parsing Algorithm
Note that it was proven in the last example of Chapter 1.1 that a wff has the same
number of opening and closing brackets. This is now stated as a lemma.
Lemma 13A. Every wff has the same number of opening and closing brackets.
The following lemma deals with the order of brackets in wffs.
Lemma 13B. If a wff is split into two non-empty expressions α and β then α has
more opening than closing brackets and β has more closing than opening brackets.
Proof. One shows this by structural induction. Atoms and truth-values are one
symbol and cannot be split into two non-empty parts; thus in the base case the
statement is vacuously true. Assume now that it is proven for γ and δ and that one
looks at the formulas (¬γ) and (γ → δ), where, if needed, a subformula γ or δ of
these formulas is split into non-empty parts α′, β′. One obtains the following possible
splittings of the formulas:
(¬γ) The possible splittings are “(” and “¬γ)”, “(¬” and “γ)”, “(¬α′” and “β′)”,
“(¬γ” and “)”.
In all cases, the left side contains the outermost opening bracket of the formula plus
some symbols plus either γ, where the brackets are balanced, or α′, which contains
more opening than closing brackets. Thus the left side contains more opening
than closing brackets and similarly one shows that the right side contains more
closing than opening brackets.
(γ → δ) The possible splittings are “(” and “γ → δ)”, “(α′” and “β′ → δ)”, “(γ” and
“→ δ)”, “(γ →” and “δ)”, “(γ → α′” and “β′)”, “(γ → δ” and “)”.
As above, one can verify that in all cases the left side has more opening than
closing brackets and the right side has more closing than opening brackets. This
uses that the same is true for the parts α′ and β′ considered, that γ and δ have
balanced brackets and that the outermost opening bracket is on the left side and
the outermost closing bracket is on the right side without being balanced off inside
their sides.
These arguments prove the lemma.
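Lemma 13B can be checked for a concrete wff by inspecting all splits. In the following JavaScript sketch the atoms are single letters (an assumption for illustration), so every split of the string is a split into expressions.

```javascript
// For every split of the wff into two non-empty parts, the left
// part must have more "(" than ")" and the right part more ")"
// than "(" (Lemma 13B).
const count = (s, ch) => s.split("").filter(c => c === ch).length;

function checkSplits(wff) {
  for (let i = 1; i < wff.length; i++) {
    const left = wff.slice(0, i), right = wff.slice(i);
    if (count(left, "(") <= count(left, ")")) { return false; }
    if (count(right, ")") <= count(right, "(")) { return false; }
  }
  return true;
}

console.log(checkSplits("((A∧B)∨(¬C))")); // true
```

The expression ")A(" or "(A)(B)" fails the check, matching the observation that such expressions are not wffs.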
http://www.comp.nus.edu.sg/~fstephan/parsing.html
and the source code of the parser is the following one:
function display() // Prepares a string containing the values of variables
{ var t="";
for (index in atoms)
{ t = t+index+atoms[index]; }
return(t); }
case "+": return(a^b);
case ">": return(1-a+a*b);
case "<": return(1-b+a*b);
case "=": return(1-(a^b));
default: return(3); } }
if (formula[formulapos]=="0") // subcase of constant 0
{ formulapos++; return(0); }
if (formula[formulapos]=="1") // subcase of constant 1
{ formulapos++; return(1); }
if (formula[formulapos] in atoms) //subcase of atom-value
{ index = formula[formulapos]; formulapos++;
return(atoms[index]); }
return(5); }
function parse()
{ var index = formula[0]; var a=4;
if (formula.length == 0) { a = 6; }
else if ((index in atoms)&&(formula[1]=="="))
{ formulapos = 2;
var a = subparse(); }
switch(a) // generating text for error-messages
{ case 0: atoms[index] = 0; ad("",a); break;
case 1: atoms[index] = 1; ad("",a); break;
case 2: ad("Bracket Error at position "+formulapos+
" of "+formula,a); break;
case 3: ad("Unknown Operator at position "+
formulapos+" of "+formula,a); break;
case 4: ad("Formula value not assigned to a variable",a); break;
case 5: ad("Variable or constant does not exist at"+
" position "+formulapos+" of "+formula,a); break; }
if ((a<2)&&(formula.length!=formulapos))
{ alert("Symbols after end of formula ignored"); }
return(a); }
for (indexpos=0;indexpos<letters.length;indexpos++)
{ index = letters.charAt(indexpos); atoms[index] = 0; }
do
{ formula = prompt(display()+" Input Formula"); } // reading formula
while (parse()!=6); // do the parsing and end when formula is empty
Note that this parser uses special symbols for all connectives; these are explained on
the page of the parser. Besides parsing the formula, it also assigns a value, while
taking the stored values of the atoms into account; these values are initially 0 and can
be modified by previously processed formulas. There are 26 atoms, one for each letter.
The main part of the parser is the function “subparse()”, whose parameters are
kept in the global variables “formula” and “formulapos” (current position in formula)
and “atoms” (current values of the atoms) and “letters” (possible names of atoms)
and “index” (variable to access atoms) and “indexpos” (position of index in the list
of all possible indices). In the function subparse, the parser distinguishes the
following cases:
1. The current subformula is a truth-value or atom: Then it returns its value;
2. The current subformula starts with “(” followed by a negation symbol: Then
the parser reads the negation symbol, calls itself for processing the subformula
and receiving its value and then passes on the negated value of it while verifying
that the closing bracket is there and advancing the position beyond it;
3. The current subformula starts with “(” followed by something which is not a
negation symbol: Then the formula is of the form (α + β) – or with another
operator instead of the exclusive or – and after processing the opening bracket,
subparse calls itself to process α and obtain its value, then reads out the symbol
representing the binary operation, then again calls itself to process β and obtain
its value and finally combines the values of α and β with the binary operation
and returns the resulting value.
4. Whenever an error occurs during the processing, the parser aborts and returns
an error code which is at least 2 (to distinguish it from truth-values).
The main program assigns the value obtained by subparse() to the current atom of
the line and then goes on with processing the next user command. All commands are
assignments like
A=((B+C)+1)
where, as said, the plus stands for exclusive or. Also the other Boolean operations
are expressed by single symbols used in common programming languages.
An inspection of the program shows that the parser and its subfunction evaluate
all the subformulas by the conditions on their evaluation in the “Inductive Definition
of the Truth-Assignments” (though it uses Boolean JavaScript operations rather than
putting everything into arithmetic). Thus, if one has set the atoms to the values
of ν, then the parser indeed returns ν̄(α) for a formula α typed into the input.
Instead of evaluating the formula, Enderton’s parser produces a tree representing
the formula; this tree can then be evaluated from the leaves to the root in order to
get the value of the formula. He in particular notes the following:
• Every constructor sequence of a wff α has the property that for two subformulas
β, γ on the tree, whenever β is a subformula of γ, that is, γ is on the branch
leading from the root to β, then β comes in the constructor sequence before γ;
• If one evaluates the formulas on the tree from the leaves towards the root, any
order is fine as long as no subformula is evaluated after its mother formula;
thus, all orders which correspond to constructor sequences produce the same
results.
Polish Notation. Polish notation avoids brackets by putting the connectives first
and then letting the operands follow. So one has again constructor functions which
compute the following constructions:
1. α ↦ ¬α;
2. α, β ↦ ∧αβ;
3. α, β ↦ ∨αβ;
4. α, β ↦ ⊕αβ;
5. α, β ↦ →αβ;
6. α, β ↦ ↔αβ.
One can then define the formula along a construction sequence in Polish notation
instead of the usual notation to get the same formula translated. Here an example:

Formula  Text            Normal Notation                    Polish Notation
α1       Atom            A1                                 A1
α2       Atom            A2                                 A2
α3       Atom            A3                                 A3
α4       α1 and α2       (A1 ∧ A2)                          ∧A1A2
α5       Not α3          (¬A3)                              ¬A3
α6       α4 or α5        ((A1 ∧ A2) ∨ (¬A3))                ∨∧A1A2¬A3
α7       α1 eor α3       (A1 ⊕ A3)                          ⊕A1A3
α8       α6 implies α7   (((A1 ∧ A2) ∨ (¬A3)) → (A1 ⊕ A3))  →∨∧A1A2¬A3⊕A1A3
Formulas in Polish notation are more difficult to read for humans, partly also because
humans are not trained to do so, unless they write logic textbooks. However,
parsers can be written more easily (with less effort) for formulas in Polish notation.
Furthermore, if one makes proofs over formulas in Polish notation, there are fewer case
distinctions on the level of a parser. For that reason, Polish notation gained some
popularity in mathematics and computer science.
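To illustrate how little effort such a parser needs, here is a complete evaluator for Polish notation in JavaScript (a sketch; atoms are single lower-case letters whose values are supplied in an object, an assumption made for illustration).

```javascript
// A Polish-notation evaluator: read the formula left to right and
// let recursion do the rest -- no bracket handling is needed.
function evalPolish(formula, values) {
  let pos = 0;
  function next() {
    const symbol = formula[pos++];
    if (symbol === "¬") { return 1 - next(); }
    if (symbol === "∧") { return next() & next(); }
    if (symbol === "∨") { return next() | next(); }
    if (symbol === "⊕") { return next() ^ next(); }
    if (symbol === "→") { const a = next(), b = next(); return (1 - a) | b; }
    if (symbol === "↔") { return 1 - (next() ^ next()); }
    return values[symbol]; // otherwise the symbol is an atom
  }
  return next();
}

// ∨∧ab¬c, i.e. ((a∧b)∨(¬c)), with a=1, b=0, c=1
console.log(evalPolish("∨∧ab¬c", { a: 1, b: 0, c: 1 })); // 0
```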
Omitting Brackets. For human processing (and for sufficiently advanced computer
programs and compilers), one can try to find an informal notation which allows one to
reduce the number of brackets of normal notation without having to use Polish
notation. The idea is that there are connectives which bind more and connectives which
bind less; thus when one puts together different connectives like in
A1 ∧ A2 ∨ ¬A3
then ¬ and ∧ have priority over ∨ and thus one can put brackets in a way which
respects this priority. The resulting formula is
((A1 ∧ A2) ∨ (¬A3))
and can be evaluated by the parser. The rules are the following ones:
1. The outmost brackets can be omitted;
2. The negation symbol binds to what directly follows it, that is, either an atom or a
truth-value or an expression in brackets; double negations are to be removed
from the formula, as they do not have any effect (so that a negation never
immediately follows a negation);
3. Conjunction binds more than disjunction and both bind more than implication
and these three all bind more than equivalence;
4. Two connectives of the same type between subformulas, like, for example, in
α → β → γ, where everything inside these formulas binds more than the connectives,
are bracketed as α → (β → γ).
Note that the fourth rule is only relevant for →, as the other connectives ∧, ∨, ⊕
and ↔ are associative and therefore there is no need to fix the rules for bracketing
expressions having only one type of connective. It is important to note that although
↔ is associative, the statement
α↔β↔γ
is not equivalent to “either α, β, γ are all three true or α, β, γ are all three false”. This
contradicts the everyday usage in mathematical writing.
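The remark about ↔ can be confirmed by brute force. The following JavaScript sketch compares the chained formula, bracketed by the fourth rule as α ↔ (β ↔ γ), with the statement that all three values are equal.

```javascript
// α↔β↔γ, bracketed as (α↔(β↔γ)), versus "α, β, γ are all equal":
// the two differ on some assignments.
const iff = (a, b) => 1 - (a ^ b);
const chain = (a, b, c) => iff(a, iff(b, c)); // α↔(β↔γ)
const allEqual = (a, b, c) => (a === b && b === c) ? 1 : 0;

let differ = false;
for (const a of [0, 1]) {
  for (const b of [0, 1]) {
    for (const c of [0, 1]) {
      if (chain(a, b, c) !== allEqual(a, b, c)) { differ = true; }
    }
  }
}
console.log(differ); // true; e.g. at a=b=c=0: 0↔(0↔0) evaluates to 0
```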
Chapter 1.4 – Induction and Recursion
Induction is a proof-method which proves an assertion first on the base-cases
and then by inductive steps on objects (numbers, formulas, strings) derived from
the base-cases in a finite number of steps. Recursion deals with a similar process
where one defines a function first on the base-cases and then extends this definition to
objects which are built from the base-cases in finitely many steps using constructor-
operators. For this one needs to prove that this building-up of the function is unique:
Either one proves that a certain uniqueness condition on the recursion is satisfied or
one proves that different ways of extending the base-cases to the target object
through different intermediate results give the same value. Both approaches are valid;
however, the second one is more work, as one has to do this proof each time one makes
a function by recursion, while in the first approach it is sufficient to prove once that
there is only a unique way to build the objects from the base-cases.
Example: Natural Numbers. Both recursion and induction originate from the
study of natural numbers. These are given by the following: A set of numbers N
together with a constant 0 and a constant 1 and addition + and an order < defined
as
x < y ⇔ ∃z [x + z + 1 = y].
Now one can define multiplication by recursion as follows:
Base Case: x · 0 = 0;
Recursion: x · (y + 1) = (x · y) + x.
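This recursion translates directly into code. A JavaScript sketch which uses only addition and the standard recursion x · (y + 1) = (x · y) + x:

```javascript
// Multiplication defined by recursion from addition alone:
// base case x·0 = 0, recursion step x·(y+1) = (x·y)+x.
function mul(x, y) {
  if (y === 0) { return 0; } // base case
  return mul(x, y - 1) + x;  // recursion step for y = (y-1)+1
}

console.log(mul(3, 4)); // 12
console.log(mul(7, 0)); // 0
```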
Furthermore, one can give an inductive proof that, for example, every number is
even or odd. Here x is even iff x is of the form y + y and x is odd iff x is of the form
y + y + 1 for some y.
Inductive Step: There are two subcases, namely the one where x is even and the
one where x is odd.
Even x: Then there is y such that x = y + y. Then x + 1 = y + y + 1. So x + 1 is
odd.
Odd x: Then there is y such that x = y + y + 1. Then x + 1 = y + y + 1 + 1 =
(y + 1) + (y + 1). So x + 1 is even.
Conclusion: When x is even then x + 1 is odd and when x is odd then x + 1 is even;
hence when x is even or odd, so is x + 1. This completes the inductive step.
In short, one first proves the assertion for the base case x = 0 and then, for every x
where the assertion is true, one also proves it for x + 1. The natural numbers can,
more abstractly, be viewed to be built up as follows:
One now proves an assertion by showing that it is true on the base-set and then that
whenever it is true for some x, then it is also true for the image of x under any
constructor function – here there is only one of them, namely x ↦ x + 1, but for
formulas there are several of them. Furthermore, one defines multiplication
x, y ↦ x · y first for the base-case where y = 0 and then by recursion: for each
expression x · y, one extends the definition to x · (y + 1) by mapping it back to
already known constructs like x · y and expressions using +.
Inductive set: A set S ⊆ D is called inductive iff B ⊆ S and for all constructor
functions, they map members of S to members of S.
Constructor Sequences of length n for some y ∈ D: A sequence x1 , x2 , . . . , xn
is called a construction sequence of length n iff all xm satisfy that either xm is a
member of B or there are members xi , xj of the sequence with i, j < m such that
either xm = f (xi , xj ) or xm = g(xi ); such a sequence is called a construction
sequence of length n for y iff it in addition satisfies that y ∈ {x1 , x2 , . . . , xn }.
Lower Limit C∗: The lower limit C∗ is the set of all y ∈ D such that there is an n
and a construction sequence of length n for y; equivalently, one can denote with
Cn all the y for which there is a construction-sequence of length n and then let
C∗ = ⋃n∈N Cn; note that here C1 = B.
Note that in these definitions, C∗ is an inductive set: The reason is that whenever x, y ∈
C∗ then there are construction sequences for x and for y. Now one can concatenate
these construction sequences and append to the result either f(x, y) or g(x) in
order to prove that f(x, y), g(x) are also in C∗. So C∗ is an inductive set and therefore
C^* ⊆ C∗.
Furthermore, one can prove that every member of C∗ is contained in every
inductive set S. So assume that y ∈ C∗ as witnessed by the construction sequence
x1, x2, ..., xn. Now one proves the following statement for all m ≤ n:
All xk with k ≤ m are in S.
This statement is true for m = 0, as there is no x0 . Now assume that the statement
is true for m and one wants to prove it for m + 1. For this, one only has to show that,
under the assumption that all xk with k ≤ m are in S, also xm+1 ∈ S. There
are several cases:
1. If xm+1 ∈ B then xm+1 ∈ S by the fact that every inductive set contains the
base set B;
2. If xm+1 = f (xi , xj ) for i, j ≤ m then xi , xj ∈ S by induction hypothesis and
xm+1 ∈ S follows by the fact that S is closed under f ;
3. If xm+1 = g(xk ) for a k ≤ m then xk ∈ S by induction hypothesis and xm+1 ∈ S
follows by the fact that S is closed under g.
One of these cases always applies by the definition of a construction sequence and
therefore all members of the construction sequence are in S. Thus C∗ ⊆ S for every
inductive set S and so also C∗ ⊆ C ∗ .
Induction Principle. C∗ = C ∗ by the above proof and in the following, C denotes
either of them. Furthermore, if S is an inductive subset of C (that is, a set which
contains B and is closed under all constructor functions) then S = C.
Examples. Here are several examples of sets defined as the closure of a base set B
under certain operations; note that each of them needs a universe D on which the
operations are defined. For the below, the universe is always D = Q the set of rational
numbers.
1. Consider the base set B = {0} and the function g(x) = x + 1. Then the closure
of B under g is the set C = N of all natural numbers.
2. Consider the base set B = {0, 1} and the constructor functions f (x, y) = x + y
and g(x) = −x. Then the closure of B under f and g is the set C = Z of all
integers.
3. Consider the base set B = Z and one constructor function f given as f (x, y) =
(x + y)/2. Then the set C is the set of all dyadic rationals, that is, all rational
numbers which can be represented such that the denominator is a power of 2.
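The stages Cn of the lower limit can be approximated mechanically. The following Python sketch (our illustration with a hypothetical name `closure_stages`, restricted to a single binary constructor and a finite window of the base set) computes the first few stages for Example 3:

```python
from fractions import Fraction

def closure_stages(base, f, rounds):
    """Finite approximation of the lower limit: starting from C1 = B,
    repeatedly apply the binary constructor f to all pairs found so far."""
    c = set(base)
    for _ in range(rounds):
        c = c | {f(x, y) for x in c for y in c}
    return c

# Example 3: base Z (a finite window of it) and f(x, y) = (x + y)/2
base = [Fraction(n) for n in range(-2, 3)]
c = closure_stages(base, lambda x, y: (x + y) / 2, rounds=2)

# every element produced is a dyadic rational: its denominator is a power of two
assert all(q.denominator & (q.denominator - 1) == 0 for q in c)
```

After two rounds, halves and quarters such as 1/2 and 1/4 already appear; iterating further approximates the full set of dyadic rationals in the chosen window.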
Furthermore, if one fixes a set B consisting of all atoms A1 , A2 , . . . and the truth-
values 0 and 1, lets D be the set of all strings (expressions) over the alphabet
consisting of the symbols in B, the brackets and the logical connectives,
and takes F to be the set of constructor functions α ↦ (¬α), α, β ↦ (α ∧ β),
α, β ↦ (α ∨ β), α, β ↦ (α ⊕ β), α, β ↦ (α → β) and α, β ↦ (α ↔ β), then C consists
of all wffs, including those which contain truth-values as constants somewhere in the
formula.
Furthermore, if β is a wff then βγ cannot be a wff. Thus one can conclude that if β, γ
are subformulas of α then either they occur at disjoint places in α or β is a subformula
of γ or γ is a subformula of β.
If α is a wff and neither an atom nor a truth-value then there are subformulas β, γ
of α such that either α = (¬β) and γ = β or α = (β ∧ γ) or α = (β ∨ γ) or α = (β ⊕ γ)
or α = (β → γ) or α = (β ↔ γ). Furthermore the subformulas β, γ are maximal,
that is, there is no subformula δ of α which contains one of β and γ as a subformula.
It can also be seen that the choice of the subformulas β, γ and the connective in the
above is unique.
Proposition. Every construction sequence for a wff α contains besides α all subfor-
mulas of α.
Proof. This is clearly true when α is an atom or a truth-value, as α does not have
subformulas. Assume that this is true for β and γ. Then a construction sequence for
α = (β → γ) must contain both β and γ due to the fact that there is only one way
to obtain α as the output of a constructor function and β, γ are the corresponding
inputs. Furthermore, the construction sequence must also be one for β and γ, thus by
induction hypothesis, all subformulas of β and γ appear in the construction sequence.
From the form of α and the preceding properties of the subformulas, one can see that
there are no further subformulas of α than β, γ and their subformulas. The same
can be shown for the connectives ↔, ∨, ∧, ⊕ in place of → in the above. Similarly, if
α = (¬β) then again every subformula of α is either β or a subformula of β. Due to
the uniqueness of the decomposition, there is only one way to write α, and β
must appear in the construction sequence. Furthermore, by induction hypothesis, all
subformulas of β also occur there.
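The set described by this proposition can be computed directly. The following Python sketch (our illustration; the tuple encoding of wffs is our assumption, not from the notes) collects a wff together with all its subformulas, which is exactly the set that every construction sequence for the wff must contain:

```python
def subformulas(alpha):
    """Return a wff together with all its subformulas.
    Encoding: atoms and truth-values are strings; composite formulas are
    tuples such as ('¬', beta) or ('→', beta, gamma)."""
    subs = {alpha}
    if isinstance(alpha, tuple):
        for part in alpha[1:]:
            subs |= subformulas(part)
    return subs
```

For example, the wff (¬A1) → (A2 ∧ A3) yields six members: the formula itself, (¬A1), A1, (A2 ∧ A3), A2 and A3.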
Furthermore, assume that there is a further set E and that there are functions h :
B → E, f̄ : E × E → E and ḡ : E → E. Then there is a unique function h̄ : C → E such that
the following holds:
1. h̄(x) = h(x) for all x ∈ B;
2. h̄(f (x, y)) = f̄ (h̄(x), h̄(y)) for all x, y ∈ C;
3. h̄(g(x)) = ḡ(h̄(x)) for all x ∈ C.
Proof Idea. One proves by induction that the function h̄ is defined on the whole of
C and that it is unique.
First, for all x ∈ B, h̄(x) = h(x) by the above definition and none of the other
cases applies, so h̄ is defined and unique there.
Second, assume that x = f (y, z) and the statement is already proven for y and z
by induction hypothesis. Then x ∉ B and furthermore, h̄(y), h̄(z) are already uniquely
defined. Now there is no other way to obtain x from other elements using f , g and B
and therefore, h̄(x) = f̄ (h̄(y), h̄(z)) is the only possible definition and this definition
also applies, hence h̄(x) is defined and unique.
Third, assume that x = g(y) and the statement is already proven for y by induction
hypothesis. Then x ∉ B and furthermore, h̄(y) is already uniquely defined. Now
there is no other way to obtain x from other elements using f , g and B and therefore,
h̄(x) = ḡ(h̄(y)) is the only possible definition and this definition also applies, hence
h̄(x) is defined and unique.
Proof. One defines K to be the set of all partial functions h′ from C to E which
satisfy the following:
1. If x ∈ B and h′ (x) is defined, then h′ (x) = h(x);
2. If x = f (y, z) and h′ (x) is defined, then h′ (y) and h′ (z) are also defined and
h′ (x) = f̄ (h′ (y), h′ (z));
3. If x = g(y) and h′ (x) is defined, then h′ (y) is also defined and h′ (x) = ḡ(h′ (y)).
(a) For every x ∈ C there is an h′ ∈ K such that h′ (x) is defined.
(b) For every x ∈ C and all h′ , h″ ∈ K, if h′ (x) and h″ (x) are both defined then they
are equal.
First one shows that for every construction sequence s = (x1 , x2 , . . . , xn ) there exists a
function hs ∈ K which is defined on every member of s. Without loss of generality,
one can assume that s is repetition-free, that is, no element has two indices in s.
As one has only finitely many values,
one can choose these explicitly as follows for m = 1, 2, . . . , n:
1. If xm ∈ B then one chooses hs (xm ) = h(xm );
2. If xm = f (xi , xj ) for i, j < m then one chooses hs (xm ) = f̄ (hs (xi ), hs (xj ));
3. If xm = g(xk ) for k < m then one chooses hs (xm ) = ḡ(hs (xk )).
The inductive construction of hs enforces that hs ∈ K and so there is a member of K
on which every member of the construction sequence is defined. Thus condition (a)
is satisfied.
For condition (b), one considers the following set S: S is the set of all x ∈ C such
that there do not exist h′ , h″ ∈ K which are defined and different on x. One shows that
the set S is inductive.
1. If x ∈ B then x ∈ S, as all members h′ , h″ ∈ K which are defined on x satisfy
that h′ (x) = h(x) and h″ (x) = h(x) and thus h′ (x) and h″ (x) are equal.
2. If x = f (y, z) then there is no alternative way to obtain x from other elements
and x ∉ B. Assume now that h′ , h″ ∈ K satisfy that h′ (x) and h″ (x) are defined.
Then also h′ (y), h″ (y), h′ (z), h″ (z) are all defined due to the membership of h′ , h″ in
K. By induction hypothesis, h′ (y) = h″ (y) and h′ (z) = h″ (z). Now
h′ (x) = f̄ (h′ (y), h′ (z)) = f̄ (h″ (y), h″ (z)) = h″ (x).
3. If x = g(y) then again x ∉ B and there is no alternative way to obtain x from
other elements. Assume that h′ (x) and h″ (x) are defined; then h′ (y), h″ (y) are
also defined and, by induction hypothesis, h′ (y) = h″ (y). Now h′ (x) = ḡ(h′ (y)) =
ḡ(h″ (y)) = h″ (x).
Thus S is inductive and therefore S = C by the Induction Principle, which establishes
condition (b).
Chapter 1.5 – Sentential Connectives
Theorem 15A. Let α and β be wffs whose sentence symbols are among A1 , . . . , An .
Then the following statements are true:
(a) {α} |= β iff Bαn ≤ Bβn ;
(b) {α} |= β and {β} |= α (that is, α and β are equivalent) iff Bαn = Bβn ;
(c) ∅ |= α iff Bαn = B1n , that is, iff Bαn is the n-input function which always outputs 1.
Proof. (a) If {α} |= β then for every ν with ν(α) = 1, it also holds that ν(β) = 1.
This is equivalent to saying that ν(α) ≤ ν(β).
Since for every (x1 , . . . , xn ) ∈ {0, 1}n there is a ν with ν(A1 ) = x1 , . . . , ν(An ) = xn , it
follows that Bα (x1 , . . . , xn ) = ν(α) ≤ ν(β) = Bβ (x1 , . . . , xn ). This gives one direction
of the equivalence.
Furthermore, if for every x1 , . . . , xn it holds that Bα (x1 , . . . , xn ) ≤ Bβ (x1 , . . . , xn )
and ν is any truth-assignment, then let xm = ν(Am ) for m = 1, 2, . . . , n and ν(α) =
Bα (x1 , . . . , xn ) ≤ Bβ (x1 , . . . , xn ) = ν(β). As this holds for all ν, {α} |= β. This gives
the second direction of the equivalence.
(b) Bα and Bβ are equal as functions iff, for all x1 , . . . , xn , the equality Bα (x1 , . . . , xn )
= Bβ (x1 , . . . , xn ) holds. It is easy to see that this equality is the same as saying that both
Bα ≤ Bβ and Bβ ≤ Bα hold. Thus part (b) follows directly from part (a).
(c) This part follows from the fact that a formula is a tautology iff it is equivalent to
the formula 1. Note that the formula 1 is the simplest formula which is true irrespective
of the truth-values of the atoms. If one does not want to use truth-values,
the formula (A1 ∨ (¬A1 )) would also do.
Theorem 15B. For every n-place Boolean function f there is a wff α using atoms
A1 , . . . , An such that f = Bαn .
Proof. In the following, let 0 be the value of an empty disjunction of formulas. That
is, if one makes a disjunction for which no terms qualify then one lets the resulting
formula just be 0.
The idea of the proof is the following: Given an n-ary function f (x1 , . . . , xn ), one
computes for each (x1 , . . . , xn ) ∈ {0, 1}n the value f (x1 , . . . , xn ). Now for each
(x1 , . . . , xn ) ∈ {0, 1}n let αx1 ,...,xn denote the conjunction of all Am with xm = 1
and (¬Am ) with xm = 0; so α0,1,1,0 is (¬A1 ) ∧ A2 ∧ A3 ∧ (¬A4 ) when omitting
brackets for readability. Now one lets α to be the disjunction of all αx1 ,...,xn where
f (x1 , . . . , xn ) = 1. Note that when (x1 , . . . , xn ) and (y1 , . . . , yn ) are two different tu-
ples from {0, 1}n then there is no ν which makes ν(αx1 ,...,xn ) and ν(αy1 ,...,yn ) true at
the same time: The reason is that there is one m with xm ≠ ym , say xm = 1 and
ym = 0. Then ν(αx1 ,...,xn ) = 1 only if ν(Am ) = 1 and ν(αy1 ,...,yn ) = 1 only if ν(Am ) = 0,
and both cannot happen at the same time.
Recall that Bα (x1 , . . . , xn ) is the value ν(α) for that ν which satisfies ν(A1 ) =
x1 , ν(A2 ) = x2 , . . . , ν(An ) = xn . If f (x1 , . . . , xn ) = 1 then αx1 ,...,xn occurs in the
disjunction of formulas put together for α and therefore Bαx1 ,...,xn (x1 , . . . , xn ) = 1
and Bα (x1 , . . . , xn ) = 1. If f (x1 , . . . , xn ) = 0 then αx1 ,...,xn does not occur in the
disjunction of formulas forming α and all the αy1 ,...,yn occurring in the disjunction
satisfy that Bαy1 ,...,yn (x1 , . . . , xn ) = 0 and therefore also Bα (x1 , . . . , xn ) = 0. Thus
Bα (x1 , . . . , xn ) = f (x1 , . . . , xn ) for all (x1 , . . . , xn ) ∈ {0, 1}n .
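The construction of this proof can be carried out mechanically. The following Python sketch (our illustration; the helper name `dnf` and the string encoding of wffs are our assumptions) builds the disjunction of the conjunctions αx1,...,xn for exactly those tuples with f(x1, . . . , xn) = 1:

```python
from itertools import product

def dnf(f, n):
    """Theorem 15B: build a DNF wff for the n-ary Boolean function f,
    with one conjunction alpha_{x1,...,xn} per input tuple where
    f(x1,...,xn) = 1; the empty disjunction is the truth-value 0."""
    terms = []
    for xs in product([0, 1], repeat=n):
        if f(*xs):
            lits = [f"A{m + 1}" if x == 1 else f"(¬A{m + 1})"
                    for m, x in enumerate(xs)]
            terms.append("(" + " ∧ ".join(lits) + ")")
    return " ∨ ".join(terms) if terms else "0"
```

For the majority function of three inputs this yields four conjunctions, one per satisfying row of the truth table; for the constant-0 function it yields the formula 0.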
Example. Consider the ternary Boolean function which evaluates to 1 iff either all three
inputs are 0 or all three inputs are 1. One can represent this function as the disjunction
of the two conjunctions which test that all atoms are 0 or that all atoms are 1:
((¬A1 ) ∧ (¬A2 ) ∧ (¬A3 )) ∨ (A1 ∧ A2 ∧ A3 ).
So this formula realises the function which checks whether all inputs are equal.
A formula is in disjunctive normal form iff it is a disjunction of conjunctions of
literals; dually, a formula is in conjunctive normal form iff it is a
conjunction of disjunctive clauses, the latter also often referred to simply as “clauses”.
The proof of Theorem 15B also gives rise to the following corollary.
Corollary 15C. For every α one can find an equivalent β in disjunctive normal form.
Theorem 15D. The following sets of connectives are complete: {¬, ∧, ∨}, {¬, ∧}
and {¬, ∨}.
Proof. For the case of {¬, ∧, ∨}, one can directly invoke Theorem 15B. For the case
of {¬, ∧}, one uses that β ∨ γ is equivalent to ¬((¬β) ∧ (¬γ)) by De Morgan’s Law.
Similarly, the same laws show that β ∧ γ is equivalent to ¬((¬β) ∨ (¬γ)), which
handles the case of {¬, ∨}. Thus one can eliminate either of ∧ and ∨ from each
formula and systematically replace it by the other one. Here is an example: The formula
(A1 ∧ (A2 ∨ A3 ))
can be written in the following two ways:
(¬((¬A1 ) ∨ (¬(A2 ∨ A3 )))) and (A1 ∧ (¬((¬A2 ) ∧ (¬A3 )))).
The same can be done iteratively with more complicated formulas. That is, when
eliminating ∨ one replaces each subformula of the form (β ∨ γ) by (¬((¬β) ∧ (¬γ)))
until no replacement of that form can be done. When eliminating ∧, one replaces
every subformula of the form (β ∧ γ) by (¬((¬β) ∨ (¬γ))) until no replacement of that
form can be done.
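This iterative replacement is a bottom-up rewriting of the formula tree. The following Python sketch (our illustration; the tuple encoding of formulas is our assumption, not from the notes) eliminates ∨ exactly as described:

```python
def eliminate_or(alpha):
    """Replace every subformula (β ∨ γ) by (¬((¬β) ∧ (¬γ))), bottom-up.
    Encoding: atoms are strings; composite formulas are tuples
    ('not', β), ('and', β, γ) or ('or', β, γ)."""
    if not isinstance(alpha, tuple):
        return alpha
    if alpha[0] == 'not':
        return ('not', eliminate_or(alpha[1]))
    b, c = eliminate_or(alpha[1]), eliminate_or(alpha[2])
    if alpha[0] == 'or':
        return ('not', ('and', ('not', b), ('not', c)))
    return ('and', b, c)
```

Applied to the encoding of (A1 ∧ (A2 ∨ A3)), this yields the encoding of (A1 ∧ (¬((¬A2) ∧ (¬A3)))), matching the second rewriting given in the example above.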
Example. In some cases, one needs the truth-values to get completeness. For
example, {0, 1, →} is complete: One can replace ¬β by (β → 0): If ν(β) = 1
then ν((β → 0)) = 0 and if ν(β) = 0 then ν((β → 0)) = 1. Furthermore,
ν((β ∨ γ)) = ν(((¬β) → γ)), so that both ¬ and ∨ can be expressed using 0, 1, →.
However, without the truth-values, it is impossible to express the negation; therefore
even {→, ↔, ∧, ∨} is incomplete. To see this, one shows the following: If ν(Am ) =
1 for all atoms Am , then ν(α) = 1 for all formulas α which are obtained from the
atoms in finitely many steps by using only connectives from →, ↔, ∧, ∨. The induc-
tion step is that if β, γ are formulas with ν(β) = 1 and ν(γ) = 1, then ν((β → γ)),
ν((β ↔ γ)), ν((β ∧ γ)) and ν((β ∨ γ)) all equal 1.
Nullary and Unary Connectives. The truth-values 0 and 1 are the only nullary con-
nectives. However, if α is a tautology with n atoms, then Bα is the n-ary connective
with constant value 1. Furthermore, B¬α is then the n-ary connective with constant value 0. Thus,
out of the four unary connectives, two are already the two constant functions. The
other two are the identity function BA1 and the negation B¬A1 .
Binary Connectives and Ternary Connectives. There are sixteen binary connectives,
out of which six are essentially unary or nullary. The connectives are given in this table.
0 1 truth-values
A1 A2 copying one input
¬A1 ¬A2 negating one input
A1 ∧ A2 A1 ∨ A2 and, or
¬(A1 ∧ A2 ) ¬(A1 ∨ A2 ) nand, nor
A1 → A2 A1 ← A2 implications
(¬A1 ) ∧ A2 A1 ∧ (¬A2 ) negated implications
A1 ↔ A2 A1 ⊕ A2 equivalence, exclusive or
The “nand” is sometimes written with a vertical stroke, the Sheffer stroke; however, the
vertical stroke is used in some programming languages just to denote the “or”, so it
is a bit ambiguous. The exclusive or is also just the addition in the binary field (odd
plus odd is even), where “and” is the multiplication.
There are 256 ternary connectives, out of which many are essentially nullary, unary
or binary, that is, do not depend on all variables. Furthermore, as there are so many
of them, many of these connectives do not have a proper name. One of these showed
up before, namely the majority-connective.
Examples. The sets {nand} and {nor} are both complete. The reason is given by
the following table of and, or and not:
operation with nand with nor
¬α nand(α, α) nor(α, α)
α∧β nand(nand(α, β), nand(α, β)) nor(nor(α, α), nor(β, β))
α∨β nand(nand(α, α), nand(β, β)) nor(nor(α, β), nor(α, β))
The idea is, depending on what is needed, either to negate the output of nand / nor
in order to recover the and / or from it, or to negate the inputs in order to apply De
Morgan’s Law. The negation is done by supplying the same input twice, so that the
disjunction or conjunction before the negation is filtered out and only the negation
remains. This permits implementing both the negation of the output of a nand /
nor as well as the negation of the inputs for De Morgan’s Law.
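The table can be verified exhaustively over the four input pairs; a small Python check (our illustration, with `&` and `|` standing for the classical and / or):

```python
def nand(a, b): return 1 - (a & b)
def nor(a, b):  return 1 - (a | b)

# check the table row by row over all pairs of truth-values
for a in (0, 1):
    for b in (0, 1):
        assert nand(a, a) == 1 - a and nor(a, a) == 1 - a      # negation
        assert nand(nand(a, b), nand(a, b)) == (a & b)          # and via nand
        assert nor(nor(a, a), nor(b, b)) == (a & b)             # and via nor
        assert nand(nand(a, a), nand(b, b)) == (a | b)          # or via nand
        assert nor(nor(a, b), nor(a, b)) == (a | b)             # or via nor
```

Each line corresponds to one entry of the table: the doubled-input calls implement negation, and the outer call either negates the output or combines the negated inputs via De Morgan's Law.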
One Sample Definition of Fuzzy Logic. Fuzzy logic is a multi-valued logic where
the truth-values are linearly ordered. For this, one lets the truth-values be a nonempty
set Q ⊆ Q of rationals where 0 = min(Q), 1 = max(Q) and Q has the following
closure properties: If p, q ∈ Q then so are 1 − p and min{p + q, 2 − p − q}. Thus Q
contains 0, 1 (as these are the minimum and maximum of the set) and if p + q ≤ 1
then p + q ∈ Q and if p + q ≥ 1 then p + q − 1 = 1 − (2 − p − q) ∈ Q. Clearly Q = {0, 1}
is possible, but so are Q = {0.00, 0.01, . . . , 0.99, 1.00} and Q = {q ∈ Q : 0 ≤ q ≤ 1}.
Now one can define the following connectives on truth-values using min, max, +, −
and constants from Q in the same way as for rational numbers:
p∧q = min{p, q};
p∨q = max{p, q};
¬p = 1 − p;
p→q = min{1 + q − p, 1};
p↔q = min{1 + p − q, 1 + q − p};
p⊕q = min{p + q, 2 − p − q}.
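These connective formulas can be implemented directly. The following Python sketch (our illustration, with hypothetical names `f_and`, `f_imp` and so on) uses `Fraction` so that the values stay exact rationals in Q:

```python
from fractions import Fraction as F

# the six connectives from the table above, on truth-values in Q
def f_and(p, q): return min(p, q)
def f_or(p, q):  return max(p, q)
def f_not(p):    return 1 - p
def f_imp(p, q): return min(1 + q - p, 1)
def f_iff(p, q): return min(1 + p - q, 1 + q - p)
def f_xor(p, q): return min(p + q, 2 - p - q)

# on the classical truth-values {0, 1}, these agree with the usual connectives
assert f_imp(1, 0) == 0 and f_imp(0, 1) == 1
```

A genuinely fuzzy evaluation: an implication from a half-true premise to a quarter-true conclusion is three-quarters true, since f_imp(1/2, 1/4) = min(1 + 1/4 − 1/2, 1) = 3/4.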
Given now a truth-assignment with ν(Ak ) ∈ Q for all k, one can extend ν to ν̄ by
induction:
1. ν̄(Ak ) = ν(Ak ) and ν̄(p) = p for all p ∈ Q;
2. ν̄((¬α)) = 1 − ν̄(α);
3. ν̄((α ∨ β)) = max{ν̄(α), ν̄(β)} and ν̄((α ∧ β)) = min{ν̄(α), ν̄(β)} and similarly for
all other binary connectives following the formulas above.
One can define additional connectives, where some do not exist for all Q:
1. med(p, q, r) is the middle value (median) of the three truth-values p, q, r; it is
given by the same formula as the majority which is med(p, q, r) = (p ∧ q) ∨ (p ∧
r) ∨ (q ∧ r);
2. ave(p, q, r) = (p + q + r)/3 which only exists when Q is closed under the mapping
p, q, r ↦ (p + q + r)/3 and this can only happen when Q is infinite.
To see that the two are different, note that med(0, 1, 1) = 1 and ave(0, 1, 1) = 2/3.
Furthermore, only some but not all tautologies from normal logic become tautologies
in fuzzy logic. For this one has first to formalise tautologies:
1. One says that S |= α iff for every ν there is a β ∈ S ∪ {1} with ν̄(β) ≤ ν̄(α);
3. All formulas made from atoms and connectives ∧, ∨, ¬ are evaluated to 1/2 in
the case that 1/2 ∈ Q and all atoms take the value 1/2 and hence none of them
is a tautology;
Note that the exact formulation of “fuzzy logic” is not fixed: There are various versions
of it around which differ in the choice of Q and also in the choice of
the formulas defining the connectives. Furthermore, besides S |= α, one might also
be interested in S |=p α, meaning that for every ν there is a β ∈ S ∪ {1} with
ν̄(β) + p − 1 ≤ ν̄(α). So ∅ |=0.8 α would only require that ν̄(α) ≥ 0.8 for every
truth-assignment ν.
The idea behind it is that the truth-values might be associated with an amount of cer-
tainty that something is true: 0 is false, 0.2 is unlikely, 0.5 is unknown, 0.8 is likely,
1 is true. Then one might put into S rules like “A4 is likely”, which is expressed as
“0.8 → A4 ” and so on. After having done this, one wants to know what one can say
about α at the end; now S |=0.8 α would mean that for all truth-assignments, S likely
implies α.
Chapter 1.6 – Switching Circuits
For better graphics, please see the textbook. Circuits can be seen as inputs mapped
to outputs using some basic connectives; the main idea of circuits is that, other
than in formulas, one can reuse inputs in several outputs. Furthermore, one might have
gates with more than two inputs; in particular for AND and OR it is common to allow
as many inputs as needed. Here is an example for the majority function; for simpler
drawing, the input A is duplicated.
MAJ(A,B,C)
|
__OR __
/ | \
AND AND AND
/ \ / \ / \
A B C A
The single units in a circuit are called gates and gates might cause delays; for example,
each OR might have a processing time of 1 time unit to adjust the output to the
input and again the AND has a further processing time of another time unit, giving
an overall processing time of two time units. One goal of the layout is to keep such
a delay to a minimum. Sometimes, formulas go over more levels, as in the following
example, which has a circuit computing E = ((A ∧ B) ∧ D) ∨ ((A ∧ B) ∧ ¬C) where
the subpart A ∧ B has two outputs to avoid duplication in the layout.
E
|
OR
/ \
AND AND
/ \ / \
/ AND NOT
/ / \ |
D A B C
The number of layers is called the depth of the circuit. Here there are three layers
of gates, each causing one delay. If one does not have a multi-input OR for the above
majority circuit, one still needs three layers; however, one can bring the number of
gates down from five to four.
MAJ(A,B,C)
|
OR
/ \
AND \
/ \ \
/ OR AND
/ / \ / \
A B C B
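As a quick sanity check (a Python sketch, our illustration, with `&` and `|` standing for AND and OR gates), the four-gate layout above computes the same function as the three-AND majority formula:

```python
def MAJ(a, b, c):
    # the four-gate layout drawn above: OR(AND(A, OR(B, C)), AND(C, B))
    return (a & (b | c)) | (c & b)

# exhaustive comparison with the three-AND majority formula
for a in (0, 1):
    for b in (0, 1):
        for c in (0, 1):
            assert MAJ(a, b, c) == ((a & b) | (a & c) | (b & c))
```

The saving comes from sharing: B and C feed the inner OR, and the pair C, B feeds the second AND, so one AND gate of the five-gate layout becomes redundant.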
Quiz. The textbook gives a circuit for ↔ made of NOR-gates. Make one with NAND-
gates and NOT-gates only. Note that NOT-gates use fewer inputs than NAND-gates,
but can be simulated by the latter with input duplication.
Relay Switching. The idea is to use the inputs as relays: When they are 1, the current
can flow; when they are 0, it cannot flow. Now the circuit is 1 iff there is a flow from
the source to the sink. The costs are the usages of inputs, not the wires. Inputs can be
negated. Here are two examples.
Sink Sink
/ \ / \
A NOT(A) D E
| | | |
| | +---C---+
| | | |
B NOT(B) A B
\ / \ /
Source Source
Incomplete Specifications. One might also ask how to realise a formula
or circuit efficiently in the case that it is underspecified; that is, how to choose the
missing values in the truth-table such that the formula becomes as simple as possible. For
four variables, Enderton gives the formula as a square-diagramme, for example like
this:
NOT(A) NOT(A) A A
NOT(C) 1 . 0 . NOT(D)
NOT(C) 1 1 0 1 D
C . 1 0 1 D
C 0 . 0 0 NOT(D)
NOT(B) B B NOT(B)
Quiz. Now the task is to fill the dots such that one can realise the formula with three
binary AND and OR gates, while using as many NOT gates as one wants. What
solution is proposed?
Chapter 1.7 – Compactness and Effectiveness
The compactness theorem deals with the notion of satisfiability. Here one says that
a set of formulas S is satisfiable iff there is a truth-assignment ν such that, for all
α ∈ S, ν(α) = 1.
Proof. For making the proof notationally easier, one says that a set S is finitely
satisfiable iff every finite subset of S is satisfiable. So one has to prove that S is
satisfiable iff S is finitely satisfiable; the implication “S is satisfiable ⇒ S is
finitely satisfiable” follows directly from the fact that every subset of a satisfiable set is
satisfiable. So assume that S is finitely satisfiable. Furthermore, one can assume that
S is infinite, as otherwise there is nothing to prove. Let α1 , α2 , . . . be an enumeration
of all wffs.
Now one constructs a superset T of S such that T contains for each formula αk a
truth-value qk ∈ {0, 1} such that αk ↔ qk is in T . This is done inductively. So assume
that one has defined all qm for m ≤ n and the set Sn = S ∪ {αm ↔ qm : 1 ≤ m ≤ n}
is finitely satisfiable; note that S0 = S. Now there are for each n ≥ 0 two cases:
Case 1: There are finite sets R0 , R1 ⊆ Sn such that both R0 ∪ {αn+1 ↔ 0} and
R1 ∪ {αn+1 ↔ 1} are unsatisfiable.
Case 2: There is a truth-value qn+1 ∈ {0, 1} such that Sn+1 = Sn ∪ {αn+1 ↔ qn+1 }
is finitely satisfiable.
So in the first case the inductive construction would get stuck and in the second case
it would go through. Thus one has to prove that the first case never happens. So
assume that the sets R0 and R1 exist. Then one considers the set R0 ∪ R1 . This
set is a finite subset of Sn and thus satisfiable by a truth-assignment ν. Now there is
a truth-value q = ν(αn+1 ). Now one determines the truth-value ν(αn+1 ↔ q). This is
equal to the truth-value ν(αn+1 ) ↔ q and thus to q ↔ q. This evaluates to 1. As
R0 ∪ R1 ∪ {αn+1 ↔ q} is satisfied by ν, ν also satisfies the subset Rq ∪ {αn+1 ↔ q}.
This contradicts the assumption, thus Case 1 does not hold.
Thus the inductive construction gives rise to an ascending family Sn of finitely
satisfiable sets and their union T satisfies that every finite subset of T is actually
a finite subset of some Sn ; hence T is also finitely satisfiable. In order to simplify
notation, one defines qα to be qk for that index k which satisfies αk = α.
Now one defines for each atom Am that µ(Am ) = qAm and it remains to show that all
α satisfy µ(α) = qα . This is done by structural induction; for the atoms it follows
by definition. Furthermore, for all truth-values p, it holds that the formula qp ↔ p
is a tautology when qp = p and unsatisfiable when qp ≠ p; thus qp = p for p = 0, 1
and consequently µ(p) = qp . For the inductive step, consider any α, β for which it is
already verified that µ(α) = qα and µ(β) = qβ .
For ¬α, one has that µ(¬α) = ¬µ(α) = ¬qα by induction hypothesis and thus µ(¬α ↔
¬qα ) = 1. Furthermore, as T is finitely satisfiable, the finite set {α ↔ qα , ¬α ↔ q¬α }
is satisfiable and therefore also the set {α ↔ qα , ¬α ↔ ¬qα , ¬α ↔ q¬α } is satisfiable,
as p ↔ q implies ¬p ↔ ¬q for all truth-values. Now a set containing β ↔ p and
β ↔ q for a formula β and expressions p, q in truth-values is only satisfiable when p, q
represent the same truth-value. Thus ¬qα and q¬α are the same truth-value and
consequently q¬α = ¬qα = µ(¬α).
Now note that similarly, for a truth-value p, the set {α ↔ qα , β ↔ qβ , (α ∧ β) ↔ p}
is satisfiable iff p = qα ∧ qβ . Thus one can conclude that qα∧β = qα ∧ qβ and therefore
µ(α ∧ β) = µ(α) ∧ µ(β) = qα ∧ qβ = qα∧β . The same arguments hold for the other
connectives ∨, ⊕, ↔, →. This induction completes the proof.
Corollary 17A. If S |= α then there is a finite subset S′ of S with S′ |= α.
Proof. One uses the basic fact that S |= α iff S ∪ {¬α} is unsatisfiable. If now every
finite S′ ⊆ S satisfies S′ ⊭ α then S′ ∪ {¬α} is satisfiable for every finite S′ ⊆ S and
S ∪ {¬α} is finitely satisfiable. This implies that S ∪ {¬α} is satisfiable and S ⊭ α.
Compactness Theorem and Fuzzy Logic. The above proof of the Compactness
Theorem also works for fuzzy logics where Q is a finite set; one has of course then to
invoke the formulas for ↔ and the other connectives given at the end of Chapter 1.5.
However, the proof does not work if the set Q is that of all rationals between 0 and
1. The reason is that for this infinite set, the result is false.
The counterexample is the following one: Let r be an irrational number with
0 < r < 1, say r = √(1/2). Furthermore, let S contain for every q ∈ Q the following
formula αq : If q < r then αq is q → A1 ; if q > r then αq is A1 → q. Note that r ∉ Q
and therefore the case q = r does not occur. Now S is finitely satisfiable: if R ⊆ S is
finite then there is some r′ ∈ Q such that all αq ∈ R satisfy that if q ≤ r′ then αq =
q → A1 else αq = A1 → q. Now taking ν(A1 ) = r′ makes all αq ∈ R true, as q → r′
evaluates to 1 for all q ≤ r′ and r′ → q evaluates to 1 for all q > r′ . However, S is not
satisfiable, as there is no truth-value r′ ∈ Q which A1 can take so that for all q ∈ Q,
if q < r then q ≤ r′ and if q > r then r′ ≤ q; by the density of Q, such an r′ would
have to equal the irrational r. This proof can be adjusted to any situation where Q is
a dense subset of {r ∈ R : 0 ≤ r ≤ 1} which does not contain all the reals r between
0 and 1. Thus the best one can hope for is to show the compactness theorem
for the cases of Q = {0, 1/k, 2/k, . . . , (k − 1)/k, 1} where k is a natural number with
k ≥ 1 and Q = {r ∈ R : 0 ≤ r ≤ 1}. The first case is already covered by the above
proof of the compactness theorem; for the second case, one needs some preparation
and the resulting theorem also works for the first case, including the classical
case of k = 1.
Theorem. Let Q = {0, 1/k, 2/k, . . . , (k − 1)/k, 1} for some
k ∈ N − {0} or Q = {r ∈ R : 0 ≤ r ≤ 1}. Let the connectives be defined as at the
end of Chapter 1.5. Now a countable set S of formulas is satisfiable iff S is finitely
satisfiable.
• A natural number as output which is returned by a special command at
the end of the loops.
The main data-type for this is the natural number; the program
execution time is not bounded by any function, and the output of a function is
undefined when the program runs forever.
2. One defines the function from some basic functions by recursion in one variable,
where + and − and comparison functions (output 1 or 0 depending on whether
a comparison is true or false) are given; for example, multiplication is defined
recursively by x · 0 = 0 and x · (y + 1) = (x · y) + x. Furthermore, if f is a
recursive function (say, with inputs x, y, z), then one can define a new function
which maps x, y to the least z such that f (x, y, z) = 0; note that the function
can have undefined values when such a z cannot be found.
3. A function f whose graph is a Diophantine set; that is, there is a polynomial
p with coefficients from Z such that f (x) = y iff there are z1 , . . . , zk (for some
constant k) with p(x, y, z1 , . . . , zk ) = 0. While the coefficients of the polynomial
are integers which can be negative, the values for x, y, z1 , . . . , zk are all from N.
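The minimisation scheme of the second definition can be sketched as follows (a Python illustration, not part of the notes; the example `div` is a made-up name):

```python
def mu(f, *args):
    """Least z with f(args..., z) = 0; runs forever when no such z
    exists, which is exactly how minimisation yields partial functions."""
    z = 0
    while f(*args, z) != 0:
        z += 1
    return z

# made-up example: integer division x // y for y > 0, obtained as the
# least z with (z + 1) * y > x, using a 0/1-valued comparison function
def div(x, y):
    return mu(lambda x, y, z: 0 if (z + 1) * y > x else 1, x, y)
```

Note that `div(x, 0)` would search forever, illustrating an undefined value of the resulting partial function.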
The third definition is much easier to understand than the first two, as it relies
neither on a programming language nor on the choice of the recursion scheme. However, the
third definition is much more difficult to handle; it took until 1970, when Matiyasevich
showed the unsolvability of Hilbert’s Tenth Problem, before mathematicians realised
that all partial recursive functions are actually given by Diophantine sets. The no-
tions of algorithms and of definitions by recursion on the natural numbers were, however,
already established before 1940.
1. Decidable sets L (also called recursive sets L): There is a recursive function f
with f (x) = 1 when x ∈ L and f (x) = 0 when x ∉ L. That is, some algorithm can
determine which possible inputs are in the set L and which are not.
2. Recursively enumerable sets L (also called effectively enumerable sets L): There
is an algorithm (that is, a recursive function) which enumerates the members of
L; for avoiding problems, one also defines that the empty set ∅ is recursively
enumerable.
On the domain N, Matiyasevich showed that the notion of a Diophantine set is the
same as that of a recursively enumerable set; here a set A is Diophantine iff there is a polyno-
mial p with coefficients from Z and inputs x, y1 , . . . , yn from N such that x ∈ A if and
only if there are y1 , . . . , yn ∈ N with p(x, y1 , . . . , yn ) = 0. This permits giving
the definition of a recursively enumerable set again without any mention of machines,
allowed programming commands in algorithms, run-time and other such ingredients.
Further, equivalent definitions are that A is the domain of a partial-recursive function
or the range of a partial-recursive function. It is easy to see that every decidable set
is recursively enumerable.
Recursively enumerable sets which are not recursive. Since the early days
of recursion theory, one knows that there are sets which are recursively enumerable
but not recursive. The most famous example of this type is Turing’s halting problem:
It is the set of all pairs of computer programs e and inputs x such that a machine
executing program e on input x will eventually halt and produce some output. A
related undecidable but recursively enumerable set is the diagonal halting problem,
where one feeds into the computer program not an arbitrary input x but its own
code e as input. Turing gave these examples in the year 1936.
Comment. There might be a problem here in that there are infinitely many atom symbols,
so that one cannot represent the above set with a finite alphabet, as would be
needed to process a formula in a computer. A way out is not to write
A1 , A2 , A3 , . . ., but to write A, A′ , A″ , . . . so that the number of primes after the A allows one to
distinguish the atoms. This then allows the use of a finite alphabet as required by most
text-processing algorithms.
Theorem 17C. Given a finite set of formulas S and a formula α, one can decide
whether S |= α.
Proof. Note that each formula β ∈ S as well as α has only finitely many atoms. Thus
there are only finitely many atoms which occur in either β or α. Now one can make a
truth-table involving all these atoms and then check whether all rows which make all
formulas in S true, also make α true. If so then S |= α else one has a counter-example
row which tells how to choose the truth-values of the relevant atoms in order to make
all members of S true while α becomes false.
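The truth-table check of this proof can be sketched as follows (a Python illustration; the encoding of formulas as Boolean functions over the n relevant atoms is our assumption, not from the notes):

```python
from itertools import product

def entails(S, alpha, n):
    """Decide S |= alpha by inspecting every row of the truth table
    over the n relevant atoms; formulas are Boolean functions."""
    for xs in product([0, 1], repeat=n):
        if all(beta(*xs) for beta in S) and not alpha(*xs):
            return False  # this row is a counter-example
    return True
```

For example, {A1} |= (A1 ∨ A2) holds, while {A1 ∨ A2} |= A1 fails with the counter-example row A1 = 0, A2 = 1.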
Theorem 17E. A set A is recursively enumerable iff there is an algorithm which, for
all x ∈ A, outputs “yes”; however, for x ∉ A, the algorithm might either never
output anything or output “no”.
Proof. Assume that A is the range of a recursive function f . Now one can make the
following algorithm:
Input x;
Let y = 0;
While f (y) ≠ x Do Begin y = y + 1 End;
Output “yes”.
This algorithm outputs “yes” iff it finds a y with f (y) = x; note that it can evaluate
f (y) for each y ∈ N in finite time and therefore terminates eventually when there is
such a y. However, if there is no y with f (y) = x, it will run forever. In the case
that A = ∅, one just makes an algorithm which on all inputs outputs “no”.
For the other way round, in the case that A = ∅, A is recursively enumerable
by definition. Otherwise there is a fixed element a ∈ A. Now for each input x,
47
one makes a two-input function f (x, t). This function simulates the algorithm on
input x for t seconds. If the algorithm halts within t seconds with output “yes” then
f (x, t) = x else f (x, t) = a. It is easy to see that only members of A are in the range
of f . Furthermore, if x ∈ A then the algorithm needs some time t to say “yes” and
therefore for all s > t, f (x, s) = x; hence the range of f is exactly the set A.
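Both directions of this proof can be illustrated with a small Python sketch; the enumeration function, the search cap and the step-bounded simulation are illustrative assumptions (the real semi-decider searches forever on non-members, and halts_within stands in for simulating a program for t steps):

```python
def semi_decide(x, f, search_cap=10_000):
    """Semi-decider for A = range(f), following the while-loop above.
    The search cap is only so that this demo terminates; the actual
    algorithm keeps searching forever when x is not in A."""
    for y in range(search_cap):
        if f(y) == x:
            return "yes"
    return None   # stands for "runs forever" in the real algorithm

def enumerator_from_semi_decider(halts_within, a):
    """Converse direction: from a step-bounded simulation halts_within(x, t)
    build the two-input function f(x, t) whose range is A; a is a fixed
    element of A.  (halts_within is a toy stand-in for a simulation.)"""
    def f(x, t):
        return x if halts_within(x, t) else a
    return f

enum = lambda n: 2 * n                              # A = even numbers
print(semi_decide(6, enum))                         # yes
halts_within = lambda x, t: x % 2 == 0 and t >= x   # toy simulation bound
f2 = enumerator_from_semi_decider(halts_within, 0)
print(f2(6, 10), f2(7, 10))                         # 6 0
```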
Theorem 17F (Kleene’s Theorem). A set A is decidable iff both the set A and
its complement are recursively enumerable.
Proof. For the non-trivial direction, assume that A and its complement N − A are
both non-empty and are the ranges of recursive functions f and g, respectively. Then
the following algorithm decides A:
Input x;
Let y = 0;
While (f (y) ≠ x and g(y) ≠ x) Do Begin y = y + 1 End;
If f (y) = x then output “yes” else output “no”.
Since x ∈ range(f ) ∪ range(g), the algorithm will eventually find a y such that the
condition of the while-loop is not satisfied. Once this y is reached, the if-then-else-
statement provides the correct output: if f (y) = x then x is in the range of f and the
answer is “yes”; else g(y) = x, so x is in the range of g and the answer is “no”.
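The decision procedure of Kleene’s Theorem can be sketched as follows; the enumerators f and g for the even and odd numbers are a toy example, not from the text:

```python
def decide(x, f, g):
    """Decision procedure from Kleene's Theorem: f enumerates A, g enumerates
    the complement, so the search below must terminate for every x."""
    y = 0
    while f(y) != x and g(y) != x:
        y = y + 1
    return "yes" if f(y) == x else "no"

f = lambda n: 2 * n        # enumerates A = even numbers
g = lambda n: 2 * n + 1    # enumerates the complement, the odd numbers
print(decide(10, f, g), decide(7, f, g))   # yes no
```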
Generalisations of Corollary 17D. One might ask what happens if one does not
restrict S to finite sets. As there are ℵ0 atoms, one can select 2ℵ0 many sets of atoms
and negated atoms; however, there are only ℵ0 many algorithms, as each algorithm can
be written down as a finite text over a fixed alphabet. Thus most of these sets are
not decidable, as there are fewer algorithms than sets. Thus it is reasonable to require
that one looks only at sets S which are themselves enumerated by an algorithm. But
even then one might have a problem. Consider Turing’s diagonal halting problem K:
A number e is in K iff the e-th computer program with input e halts and produces
some output. Now let S = {Ae : e ∈ K}; the set S is also recursively enumerable.
It is easy to see that S |= Ae iff Ae is enumerated into S iff e ∈ K. Thus the set of all
formulas α such that S |= α is not decidable. However, one has the following result.
Thus, if S |= α then there is by the compactness theorem some m such that the
following statements are true:
Similarly, one can show that under the assumption T |= α there is an m such that
the following statements hold:
Now it remains to show that T is decidable. Indeed, T has the following decision
procedure.
1. On input α, determine the largest n such that An appears in α; n = 0
if there is no atom in α;
2. For m = 0, 1, . . . , n Do Begin check whether g(m) = α End;
3. If an m ≤ n was found with g(m) = α then output “yes” else output
“no”.
The first statement can be done effectively, as one has only to go through the formula
and literally just inspect all the atoms which occur there. The second statement
can be done effectively, as g is a recursive function which is defined on all natural
numbers and computed in finite time by some computer program and comparing
formulas (strings) is a standard task which can be done by computers. Furthermore,
there are only n + 1 of these comparisons, so the loop terminates. The third statement
can be done effectively, as one just has to note down the results of the comparisons
in Step 2 and output “yes” if any of these results said “equal” and output “no” if all
of them said “not equal”. Thus T is a decidable set.
Closure Properties. The following statements are true for recursively enumerable
and decidable subsets of N:
(a) Every infinite recursively enumerable set X has an infinite recursive subset Y ;
Proof of (a). Given X, one can make a recursive function doing the following: f (0)
is the minimum of X; f (n + 1) is the first element y of X found by effective search
which satisfies f (n) < y; note that this search always terminates eventually, as X is
infinite. Thus all values of f can be computed in finite time and now let Y be the
range of f . Clearly Y is recursively enumerable. One can, however, even show that Y
is decidable. For this, note that f (y) ≥ y for all y as one can show that f is strictly
monotonically increasing. Thus one has for all y that y ∈ Y iff y ∈ {f (0), f (1), . . . , f (y)},
and this condition can be checked effectively.
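A Python sketch of this construction, assuming X is given by an enumeration function; here f(0) is taken as the first enumerated element rather than the true minimum of X, which is enough for the decidability argument:

```python
from itertools import count

def make_increasing(enum):
    """Strictly increasing recursive f whose range Y is an infinite decidable
    subset of X = range(enum).  Here f(0) is the first enumerated element
    rather than the true minimum of X; strict monotonicity is all that the
    decidability argument below needs."""
    cache = [enum(0)]
    def f(n):
        while len(cache) <= n:
            last = cache[-1]
            for k in count():      # effective search; terminates as X is infinite
                if enum(k) > last:
                    cache.append(enum(k))
                    break
        return cache[n]
    return f

def member_of_Y(y, f):
    """Decide y in Y: as f is strictly increasing, f(n) >= n, so it suffices
    to compare y with f(0), ..., f(y)."""
    return any(f(n) == y for n in range(y + 1))

f = make_increasing(lambda n: n * n)          # X = the squares (illustrative)
print([f(n) for n in range(4)])               # [0, 1, 4, 9]
print(member_of_Y(4, f), member_of_Y(5, f))   # True False
```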
Proof of (b). Let X and Y be recursively enumerable sets. If one of them is empty,
their union is the other set and their intersection is ∅. If both are non-empty then
they are the ranges of recursive functions fX and fY . Now X ∪ Y is the range of
the function f defined by f (2n) = fX (n) and f (2n + 1) = fY (n). Thus X ∪ Y is
also recursively enumerable. Furthermore, if X ∩ Y = ∅ then X ∩ Y is recursively
enumerable by definition; if X ∩ Y ≠ ∅ then let a be one of the elements in the
intersection and now define
g(n, m) = fX (n) if fX (n) = fY (m);
g(n, m) = a if fX (n) ≠ fY (m).
The range of g is then exactly X ∩ Y , so the intersection is recursively enumerable.
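The two constructions of this proof can be sketched directly; the example enumeration functions are illustrative assumptions:

```python
def union_enum(fX, fY):
    """X ∪ Y as the range of f with f(2n) = fX(n) and f(2n+1) = fY(n)."""
    def f(n):
        return fX(n // 2) if n % 2 == 0 else fY(n // 2)
    return f

def intersection_enum(fX, fY, a):
    """X ∩ Y as the range of g(n, m): fX(n) when fX(n) = fY(m), else the
    fixed element a of the intersection."""
    def g(n, m):
        return fX(n) if fX(n) == fY(m) else a
    return g

fX = lambda n: 2 * n                # X = even numbers (illustrative)
fY = lambda n: 3 * n                # Y = multiples of three (illustrative)
f = union_enum(fX, fY)
g = intersection_enum(fX, fY, 0)    # 0 lies in the intersection
print(f(4), f(5), g(3, 2), g(1, 1))   # 4 6 6 0
```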
Proof of (c). Given decision procedures for X and Y , one can build the following
decision procedures for X ∪ Y , X ∩ Y and N − X: Assume that X(n) = 1 when
n ∈ X and X(n) = 0 when n ∉ X, similarly for Y . That is, one identifies the sets
with their characteristic function. Now one can just define the decision procedures
for intersection, union and complement by the below list using those for X and Y :
1. (X ∩ Y )(n) = min{X(n), Y (n)};
2. (X ∪ Y )(n) = max{X(n), Y (n)};
3. (N − X)(n) = 1 − X(n).
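With sets identified with their characteristic functions, the three operations can be written down directly; the example sets are illustrative:

```python
X = lambda n: 1 if n % 2 == 0 else 0   # characteristic function of the even numbers
Y = lambda n: 1 if n % 3 == 0 else 0   # characteristic function of the multiples of 3

intersection = lambda n: min(X(n), Y(n))
union        = lambda n: max(X(n), Y(n))
complement_X = lambda n: 1 - X(n)

print(intersection(6), union(4), complement_X(5))   # 1 1 1
```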
Chapter 2 – First-Order Logic
While sentential logic only deals with connections between formulas built from
atoms carrying truth-values, first-order and second-order logic replace these atoms
by statements over some mathematical structure. Such statements have in particular
the following basic ingredients:
5. Quantifiers, namely existential quantifiers (∃) and universal quantifiers (∀), which
allow one to make statements about mathematical laws valid on X using the
aforementioned predicates and functions.
Examples of first-order and second-order statements are the following rules, which
define within a structure (X, N, Succ, 0, 1) the set N of natural numbers:
5. ∀x, y ∈ X [(x ∈ N) ∧ (x = y) → (y ∈ N)] (if two objects are equal and one of
them is a natural number, then actually both of them are a natural number);
7. ∀x, y ∈ N [x = y ↔ Succ(x) = Succ(y)] (two natural numbers are the same iff
their successors are the same);
8. ∀x ∈ N [0 ≠ Succ(x)] (zero is not the successor of any natural number; in
particular, −1 is not a natural number);
These are the nine axioms listed as Peano’s axioms for natural numbers on Wikipedia.
Usually, one assumes that any logic satisfies these axioms and that x = y holds iff x and
y denote the same element. All axioms except for the ninth quantify only over
members of the structure; the ninth axiom, however, quantifies over sets. First-order
logic does not allow this; only second-order logic allows one to quantify over sets or functions.
It has therefore been studied in logic whether one can describe the natural numbers
uniquely with axioms from first-order logic, and the result is that this is impossible.
There are many structures which are very similar to the natural numbers and satisfy
all the first-order formulas which are true for the natural numbers, but which are
not isomorphic to the natural numbers with the successor-relation. Quantifiers had
already been used informally; the following examples, however, should make their
meaning more clear.
2. ∃y, z ∈ X [y ≠ z ∧ f (y) = f (z)]. There are y, z in X such that y and z are not
equal, but their images under f are; that is, f is not injective.
3. ∀y, z ∈ X [f (y) = f (z) ↔ y = z]. For all y, z in X, f (y) = f (z) if and only if
y = z; that is, f is injective.
The reference to the base set X in the quantification can be omitted when the
range of the quantifier is clear. The range of the quantifier is only included when
there are several choices; when a base set is fixed, the range of the quantifier is
always the base set unless something else is specified. So one could
write the first two examples also as follows:
When using +, · or other binary operations on numbers, these are in an informal setting
just written as usual in mathematics, together with the rule that · binds more strongly than
+ and −; however, when doing it formally, one has to replace x + y by a function
add(x, y) and x · y by a function mult(x, y) in order to avoid too many complications
when carrying out proofs over formulas. Furthermore, the number of connectives can be
cut down to a generating set of connectives; the most convenient for this is the set {¬, →}
of negation and implication; one could also use {¬, ∧, ∨}. Furthermore, one
replaces each formula of the form ∃x [α] by ¬∀x [¬α] and therefore one does not need
both quantifiers, but only one; however, in an informal setting, one can still use both
quantifiers.
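The replacement of ∃x [α] by ¬∀x [¬α] can be sketched as a transformation on formula trees; the tuple encoding and the tag names are assumptions made for this sketch:

```python
def eliminate_exists(phi):
    """Rewrite every ∃x[α] as ¬∀x[¬α].  Formulas are nested tuples with
    tags 'exists', 'forall', 'not', 'implies'; atoms are plain strings."""
    if isinstance(phi, str):                      # atomic formula
        return phi
    tag = phi[0]
    if tag == "exists":
        _, x, body = phi
        return ("not", ("forall", x, ("not", eliminate_exists(body))))
    if tag == "forall":
        _, x, body = phi
        return ("forall", x, eliminate_exists(body))
    if tag == "not":
        return ("not", eliminate_exists(phi[1]))
    if tag == "implies":
        return ("implies", eliminate_exists(phi[1]), eliminate_exists(phi[2]))
    raise ValueError(f"unknown tag {tag}")

phi = ("forall", "x", ("exists", "y", "x < y"))
print(eliminate_exists(phi))
# ('forall', 'x', ('not', ('forall', 'y', ('not', 'x < y'))))
```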
Chapter 2.1 – First-Order Languages
Symbols in language.
A. Logical Symbols. These denote the logical language and also variables ranging
over the structure.
A.0. Brackets: ( and ); after quantification [ and ] in an informal setting.
A.1. Sentential connective symbols: →, ¬; the others only in an informal setting.
A.2. Variables v0 , v1 , . . . which range over members of the underlying structure.
A.3. Equality symbol = which is used in most but not all cases.
B. Parameters. These denote items specific to the structure used and the quantifiers.
B.0. Quantifier symbol ∀; the symbol ∃ occurs only in informal settings.
B.1. Predicate symbols for handling subsets of the domain and also for handling
relations; each predicate symbol comes with an arity and the inputs are placed
inside brackets and separated by commas.
B.2. Constant symbols (like 0, 1 in the natural numbers or fields).
B.3. Function symbols (in a formal setting, operators like + and · and − are replaced
by functions) and each function symbol comes with an arity, that is, the number
of inputs which are in brackets and separated by commas.
The equality symbol can be present, but one can also consider logics without equality,
though those are only a minor topic in this course. In the following, some examples of
languages are given for which one specifies which symbols they contain.
This formula can then be translated into the usual normal form as follows:
One can also consider languages with more than two predicates; in principle, even
languages with an infinite number of predicates of infinitely many distinct arities are
possible. However, there should be an algorithm which says, for each arity, how many
predicates exist and what they are called.
More formally, one has to eliminate the existential quantifier and to replace
v2 ∉ v1 by ¬(v2 ∈ v1 ), which, after elimination of double negation, gives the
following statement:
(∀v1 [¬(∀v2 [v2 ∈ v1 ])]).
A last step would be to replace the square brackets by normal brackets.
• For any two sets there is a set which contains exactly these two sets as
elements:
∀v1 ∀v2 ∃v3 ∀v4 [v4 ∈ v3 ↔ (v4 = v1 ∨ v4 = v2 )]
which one first transforms into
For atoms, one has the rules that a conjunction A1 ∧ A2 ∧ A3 gives ¬(A1 → (A2 → ¬A3 ))
and furthermore ¬(A4 → (A5 ∨ A6 )) is ¬(A5 ) ∧ ¬(A6 ) ∧ A4 . Combining these
rules, one can transform the above formula into the following quantified sequence
of implications:
Now making all brackets explicit and replacing all ≠ and ∉ gives the formula
One then has to transform these formulas into the required form by using a function
add for the addition in place of the operator +; that is, the official language would be
the usual logical symbols plus the constant 0, the function add and equality. Furthermore,
one uses ¬∀¬ for ∃. For example, the last formula becomes
The other formulas are translated similarly into this normal form.
the following expressions are terms: 0, 1, add(0, 1), add(1, 1), add(1, add(1, 1)) and
add(add(1, 1), 1) which represent the numbers 0, 1, 1, 2, 3 and 3, respectively. Rules
to evaluate and compare the terms will come later. If there is a further function mult,
then also mult(add(1, 1), add(1, 1)) would be a term which would provide the value
4. Here “add” and “mult” and “succ” are supposed to stand for symbols of length 1
denoting the corresponding operations; they are spelled out for better readability.
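Such terms can be evaluated bottom-up; the following Python sketch anticipates the evaluation rules that the text introduces later, under the obvious reading of add and mult:

```python
def evaluate(term, interpretation):
    """Evaluate a term bottom-up.  Terms are constants (strings) or tuples
    (function symbol, argument terms); the encoding is chosen for this sketch."""
    constants, functions = interpretation
    if isinstance(term, str):
        return constants[term]
    symbol, *args = term
    return functions[symbol](*(evaluate(t, interpretation) for t in args))

interp = ({"0": 0, "1": 1},
          {"add": lambda x, y: x + y, "mult": lambda x, y: x * y})
print(evaluate(("add", ("add", "1", "1"), "1"), interp))                  # 3
print(evaluate(("mult", ("add", "1", "1"), ("add", "1", "1")), interp))   # 4
```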
Atomic Formulas. Formulas which are either of the form P (t1 , t2 , . . . , tn ) for an n-
ary predicate and n terms or of the form t1 = t2 or of the form q for a truth-value q are
called atomic formulas (truth-values could be considered as nullary predicates). When
dealing with set theory, there is a predicate denoting the element-relation; however,
one usually writes expressions like t1 ∈ t2 rather than using a predicate like in(t1 , t2 ),
although the latter would be the standard form.
Well-Formed Formulas. These are formed from atomic formulas by putting them
together with logical connectives (formally one uses only ¬ and →, as the other can be
expressed using these) or with quantifiers (formally, one uses only ∀, as a formula of the
form ∃vk [P (vk )] is equivalent to ¬(∀vk [¬P (vk )]) and similarly for more complicated
formulas in place of P (vk )). Formally, well-formed formulas always have outermost
brackets which are kept when assembling larger formulas from parts.
the variables v2 , v3 are bound within the range [. . .] after the corresponding quantifiers.
The free variables inside a formula can be defined using the following recursively
defined function h, where vi stands for a variable, cj for a constant, t1 , t2 , . . . , tn for
terms, f for an n-ary function and P for an n-ary predicate and α, β for wffs.
1. h(vi ) = {vi };
2. h(cj ) = ∅;
3. h(f (t1 , t2 , . . . , tn )) = h(t1 ) ∪ h(t2 ) ∪ . . . ∪ h(tn );
6. h(¬α) = h(α);
Note that the recursive application of the first four items gives that if α is an atomic
formula then h(α) is the set of all variables which occur in α. The last three items
define h inductively on wffs from the definition of h on atomic formulas.
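The function h can be implemented recursively; the clauses omitted in the excerpt above (for equality, predicates, implication and the quantifier) are filled in here according to the standard definition, and the string/tuple encoding of formulas is an assumption of this sketch:

```python
def h(phi):
    """Free variables of a term or wff.  Variables are strings starting with
    'v', constants are other strings; compound terms and formulas are tuples
    whose first entry is a function/predicate symbol or a connective tag."""
    if isinstance(phi, str):
        return {phi} if phi.startswith("v") else set()   # clauses 1 and 2
    tag, *rest = phi
    if tag == "not":                                     # clause h(¬α) = h(α)
        return h(rest[0])
    if tag == "implies":                                 # h(α→β) = h(α) ∪ h(β)
        return h(rest[0]) | h(rest[1])
    if tag == "forall":                                  # h(∀vi[α]) = h(α) − {vi}
        x, body = rest
        return h(body) - {x}
    # function symbol, predicate or equality applied to terms: union of parts
    return set().union(*(h(t) for t in rest))

phi = ("forall", "v2", ("P", "v1", "v2"))
print(h(phi))   # {'v1'}: v2 is bound, v1 is free
```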
Every wff α which satisfies h(α) = ∅ is called a sentence. When permitting all
connectives and quantifiers, the following formulas are sentences:
Among these sentences, the first is a tautology and the last is an antitautology. Fur-
thermore, the second sentence holds in structures with at least two elements and the
third holds in structures with at most two elements.
Additional Conventions. As done in the text before, for general writing of formulas,
in the case of several quantifiers having the same range like ∀v1 [∃v2 [v1 = v2 ]], one
writes only one set of square brackets, namely the inner one, as in ∀v1 ∃v2 [v1 = v2 ].
Furthermore, t1 ≠ t2 refers to ¬(t1 = t2 ) and in formulas, the outermost bracket can be
omitted. One can also write v1 + v2 in place of add(v1 , v2 ) and similarly for further
functions with two inputs representing group and monoid operations. When there
are additive and multiplicative operations, the multiplicative operations bind more,
so v1 + v2 · v3 is v1 + (v2 · v3 ). In case of doubt, brackets are placed to make the
meaning of a formula clear.
Chapter 2.2 – Truth and Models
Structures and Models. A “model” and a “structure” are almost the same. The
main difference is that a structure stands on its own, while a model refers also to a
set of formulas: one says that A is a model of a set S of formulas if A makes all
formulas in S true.
A structure A consists of a nonempty domain A together with an interpretation
which assigns to every constant symbol a value from A, to every n-ary function symbol
a function from An to A, and to every n-ary predicate symbol a function from An to
the set Q of truth-values; for most of this lecture, Q = {0, 1}. For constant-symbols
c, function-symbols f and predicate-symbols P , cA , f A and P A denote the objects
associated with c, f and P in the structure A.
Examples. Consider the structure A of natural numbers with the function
add and constants 0 and 1. Then the domain A of A is N and the function addA
maps pairs of natural numbers x, y to x + y. The constants 0A and 1A take as values
just the smallest and second-smallest natural numbers which usually are called “zero”
and “one”. The term add(1, 1)A has the value 2. The following formula is true in the
natural numbers:
∃x ∀y, z [add(y, z) = x → (y = x ∧ z = x)].
The reason is that 0 can only be the sum of 0 and 0. So one says that this sentence
is true in A.
On the other hand, for the structure B, the function addB is the addition on
integers and there, every number x is the sum of −1 and x + 1 or, equivalently, of −2
and x + 2. Thus the above sentence is false in B. Indeed,
∀x ∃y, z [add(y, z) = x ∧ y ≠ x ∧ z ≠ x]
is true in the structure B.
Example. Consider a graph given by its edge set E, for example as given in the
following diagram.
o - o
| |
o - o - o - o o
The edge relation E is not undirected by default; instead, any binary relation would
be a directed edge relation. So one would have to require that an undirected graph
satisfies the following axiom:
∀x ∀y [E(x, y) → E(y, x)].
The above example has nodes of degree 0, degree 1, degree 2 and degree
3, where the degree of a node is the number of neighbours. So having a node of at
least degree 3 can be expressed as follows:
∃x ∃u ∃v ∃w [u ≠ v ∧ u ≠ w ∧ v ≠ w ∧ E(x, u) ∧ E(x, v) ∧ E(x, w)].
The sentence that every node has at least degree 1 is as follows:
∀x ∃y [E(x, y)].
The graph from the above diagram does not satisfy this sentence, but it does satisfy
the two previous sentences of this example (undirectedness and existence of a node
with three neighbours).
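These three sentences can be checked mechanically on the example graph; the vertex names and the edge list below are read off the diagram (in particular, the vertical edges a-c and b-d are an assumption read off the picture):

```python
# Vertices: a-b on top, c-d-e-f in a row, g isolated.
V = set("abcdefg")
edges = [("a", "b"), ("a", "c"), ("b", "d"), ("c", "d"), ("d", "e"), ("e", "f")]
E = {e for (x, y) in edges for e in [(x, y), (y, x)]}   # symmetric closure

# ∀x ∀y [E(x,y) → E(y,x)]: holds by construction of the symmetric closure
undirected = all((y, x) in E for (x, y) in E)
# degree of each node = number of neighbours
degree = {x: sum(1 for y in V if (x, y) in E) for x in V}
# ∃x with at least three pairwise distinct neighbours
has_degree_3 = any(d >= 3 for d in degree.values())
# ∀x ∃y [E(x,y)]: fails because g has no neighbour
min_degree_1 = all(degree[x] >= 1 for x in V)
print(undirected, has_degree_3, min_degree_1)   # True True False
```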
Here s(x|d) is the modified default-assignment to variables where x gets the default d
and all other variables y keep their default s(y). Note that the other logical connectives
and ∃ are expressed in terms of ∀, ¬, → and therefore it is sufficient to give the above
definitions.
Example. Let A be the structure (N, add, leq) of natural numbers with the function add
to add numbers and the predicate leq to compare two terms with respect to whether the
first term is less than or equal to the second term. Now one considers the following sentence:
This formula would, when written in the usual way, just be
∀x ∀y ∃z [8 + (x + y) ≤ z]
and it is easy to see that this sentence is true. Now let s say that x has the default 1,
y has the default 5 and z has the default 35. So s(add(x, y)) = 6, s(add(8, add(x, y))) = 14,
s(z) = 35 and A, s |= leq(add(8, add(x, y)), z) or, when written the usual way, A, s |=
8 + (x + y) ≤ z, as 14 ≤ 35. Furthermore,
and
A, s(x|a, y|b) |= (¬∀z(¬leq(add(8, add(x, y)), z))).
Now, when considering a quantifier over y one sees that
for all b and therefore
and again, as this holds for all values a ∈ A, one can conclude that
Summarising informally,
A, s |= ∀x ∀y ∃z [8 + x + y ≤ z]
can be made true for all values of x and y by letting, in each case, z take the value
of the term x + y + 9.
If one has a formula with a free variable, say
∀x ∃y [y = x + z]
then one has to observe that for all a it does not hold for all b that
A, s(x|a, y|b) |= (y ≠ x + z)
and this is indeed the case by considering the case where b = s(z) + a. Therefore one
can conclude that
A, s(x|a) ⊭ (∀y (y ≠ x + z))
and
A, s(x|a) |= (¬(∀y (y ≠ x + z)))
and, as this holds independently of the choice of a,
which is equivalent to
Example. Recall the graph given by its edge set E as in the following diagram
where now the vertices are named by letters.
a - b
| |
c - d - e - f g
Now one studies the following formulas with one free variable x:
α = ∀y [¬E(x, y)];
β = ∃y, z [y ≠ z ∧ E(x, y) ∧ E(x, z)].
Now A, s |= α iff s(x) = g and A, s |= β iff s(x) ∈ {a, b, c, d, e}. These examples show
that the influence of s on the truth of α and β depends only on the default of the
free variable x and not on the defaults of variables which either do not occur in the
formula or occur in the formula only in bound form.
Theorem 22A. Let A be a structure, α a formula and s1 , s2 two default assignments
to the variables. If these agree on all variables which occur free in α then
A, s1 |= α if and only if A, s2 |= α.
Proof. One proves this by induction over all well-formed formulas; if the statement is
true for all formulas, it is in particular true for α.
First consider an atomic formula P (t1 , t2 , . . . , tn ). For each term tm , the assignments
s1 and s2 take the same values on the variables occurring in tm . Thus s1 (tm ) = s2 (tm ).
It follows that A, s1 |= P (t1 , t2 , . . . , tn ) if and only if A |= P (s1 (t1 ), s1 (t2 ), . . . , s1 (tn ))
if and only if A |= P (s2 (t1 ), s2 (t2 ), . . . , s2 (tn )) if and only if A, s2 |= P (t1 , t2 , . . . , tn ).
Similarly for the case that the atomic formula is of the form t1 = t2 .
If α and β satisfy the statement of the theorem, so do ¬α and α → β, as their
truth-values are only Boolean combinations of those of α and β and the latter are the
same for s1 and s2 .
If α = ∀x [β], then one considers each variable assignment s1 (x|a) and s2 (x|a). In
the case that s1 and s2 coincide on all free variables in α, then s1 (x|a) and s2 (x|a)
coincide on all free variables in β. Thus, for all a ∈ A (the domain of A), A, s1 (x|a) |=
β iff A, s2 (x|a) |= β by induction hypothesis. Now, the following statements are
equivalent:
1. A, s1 |= α;
2. for all a ∈ A [A, s1 (x|a) |= β];
3. for all a ∈ A [A, s2 (x|a) |= β];
4. A, s2 |= α.
Corollary 22B. If α is a sentence (no free variables) then either A, s |= α for all s
or A, s |= α for no s.
Logical Implication. Let S be a set of wffs and α be a wff. Now one says that
S |= α iff for every structure A and every value-assignment s to the variables the
following holds: If A, s |= β for all β ∈ S then A, s |= α.
Here one says “S logically implies α”. For simplification, one also writes A, s |= S
when one means that A, s |= β for all β ∈ S. One says that two formulas α and
β are logically equivalent iff {α} |= β and {β} |= α. A formula α is valid iff ∅ |= α; so
the valid formulas are those which are made true by all structures and default assignments
and they are the analogue of the tautologies in sentential logic. A formula α is satisfiable
iff there is a structure A and a value-assignment s such that A, s |= α.
In the above, it is always assumed that the structure A gives meaning to all
structure-symbols (predicates, functions, =) occurring in α and that s takes as values
for the variables those which occur in the domain of A.
Example. Consider the following set S of axioms for a structure with equality and
an edge relation E:
2. ∃u ∃v ∃w ∃x ∃y ∀z [z = u ∨ z = v ∨ z = w ∨ z = x ∨ z = y];
3. ∀x [¬E(x, x)];
6. ∃x ∃y [x ≠ y ∧ ¬E(x, y)].
One can now draw three graphs which satisfy all axioms:
o - o o - o
/ \ | |
o - o - o o - o o - o - o
The first axiom says that the graph is undirected. The second axiom says that there
are at most five vertices. The third axiom says that there are no self-loops. The fourth
axiom says that any two vertices are either neighbours or have a common neighbour;
the latter holds also when the two vertices are equal. The fifth axiom says that every
vertex has at most two neighbours. Thus all graphs are either cycles or lines with at
most five vertices. The last axiom says that there are two vertices at distance
two, that is, they are not neighbours, but by the fourth axiom they have a common
neighbour. For this to happen, one needs at least three vertices. Furthermore, as in
the triangle graph any two distinct vertices are neighbours, the triangle graph is
excluded. There is no line graph of four or more vertices, as then the end points would
be neither neighbours nor have a common neighbour. So, up to isomorphism, S has
three models.
Example. Let the logical language just contain = and a unary function symbol f
and assume that the following axioms hold:
1. ∃v ∃w ∃x ∃y ∀z [z = v ∨ z = w ∨ z = x ∨ z = y];
2. ∀x [f (f (f (f (x)))) = x].
Now one might ask how many structures there are, up to isomorphism, satisfying
these axioms. The following enumerated list gives them all.
1. Four models with f (x) = x for all x and with one through four elements;
2. Three models with a two-cycle a, b, where f (a) = b and f (b) = a, plus zero,
one or two further elements (named c, d when they exist) with f (c) = c and
f (d) = d;
3. One model with four elements a, b, c, d such that f (a) = b, f (b) = a, f (c) = d
and f (d) = c;
4. One model with four elements a, b, c, d such that f (a) = b, f (b) = c, f (c) = d
and f (d) = a.
These are nine models altogether. Note that a three-cycle (f (a) = b, f (b) = c, f (c) = a)
cannot exist, as then f (f (f (f (a)))) = b ≠ a. So the above list goes by “only one-
cycles” in the first entry, “one two-cycle and perhaps some one-cycles” in the second
entry, “two two-cycles” in the third entry and “one four-cycle” in the last entry.
Note that by a convention in first-order logic, structures always have at least one
element and therefore the empty set does not give a model here.
Example. One takes a structure as in the previous example, but in addition to the
function f there are constants a, b, c, d and one requires that there are exactly four
elements. So the following axioms hold:
1. a ≠ b ∧ a ≠ c ∧ a ≠ d ∧ b ≠ c ∧ b ≠ d ∧ c ≠ d;
2. ∀x [x = a ∨ x = b ∨ x = c ∨ x = d];
3. ∀x [f (f (f (f (x)))) = x].
One would expect that there are fewer models than before, but that is not true. The
reason is that two models can only be isomorphic when the same constants play the
same role in both models. So if f (a) = b holds in one, the same must hold in the other one.
Therefore one gets the following models:
1. One model with f (x) = x for all x, that is, with four one-cycles;
2. Six models with one two-cycle and two one-cycles; this is just the number of two-
element subsets in a four-element set;
3. Three models with two two-cycles; these are determined by choosing which value
is f (a); note that then f (f (a)) = a and the other two elements also form one
two-cycle;
4. Six models with one four-cycle; again these are determined by choosing f (a) out
of three choices and then f (f (a)) out of the two remaining choices; f (f (f (a))) has
only one remaining choice and f (f (f (f (a)))) = a.
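Both counts can be verified by brute force in Python; since f ◦ f ◦ f ◦ f = id forces f to be a permutation, it suffices to enumerate permutations. Models up to isomorphism in the first example correspond to pairs (domain size, cycle type), while in the second example the constants make every admissible f a model of its own:

```python
from itertools import permutations

def cycle_type(perm, domain):
    """Sorted multiset of cycle lengths of a permutation given as a dict."""
    seen, lengths = set(), []
    for x in domain:
        if x not in seen:
            length, y = 0, x
            while y not in seen:
                seen.add(y)
                y = perm[y]
                length += 1
            lengths.append(length)
    return tuple(sorted(lengths))

def fourth_power_is_identity(perm, domain):
    return all(perm[perm[perm[perm[x]]]] == x for x in domain)

# First example: models up to isomorphism = (domain size, cycle type)
types = set()
for n in range(1, 5):
    domain = list(range(n))
    for images in permutations(domain):
        perm = dict(zip(domain, images))
        if fourth_power_is_identity(perm, domain):
            types.add((n, cycle_type(perm, domain)))
print(len(types))      # 9 models up to isomorphism

# Second example: the constants a, b, c, d pin every element down, so each
# admissible f on the fixed four-element domain is a model of its own
domain = list("abcd")
labeled = [dict(zip(domain, images)) for images in permutations(domain)
           if fourth_power_is_identity(dict(zip(domain, images)), domain)]
print(len(labeled))    # 16 models
```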
a1 , a2 , . . . , an ∈ A, R(a1 , a2 , . . . , an ) is true iff
The requirement that the first variables are used in the definition is not a real one, as
one can rename the variables used in a formula.
Example. The order is definable in (N, +) (with equality =) via v1 ≤ v2 iff
∃v3 [v1 + v3 = v2 ]. Furthermore, every natural number n is definable in this structure;
for example, n is the unique number for which there are exactly n + 1 pairs (x, y) with
x + y = n. This can be formalised in first-order logic.
However, in the structure (Z, +), the order is not definable and 0 is the only
definable element, via the formula v1 = v1 + v1 : indeed 0 = 0 + 0 while, for example,
5 ≠ 5 + 5. This can be proven formally by showing that the mapping a 7→ −a is an
isomorphism of the structure; as it maps each non-zero number a to its negative
counterpart, whenever a formula α is satisfiable with v1 = a then α is also satisfiable
with v1 = −a.
A mapping like a 7→ −a is an isomorphism of the structure (Z, +) onto itself. If an
element or a relation or a function is not preserved by some isomorphism of the structure
onto itself, then the corresponding element or relation or function is not definable. This
will be made more precise later.
o - o - o
Now the middle element is definable by the formula
∃y ∃z [E(x, y) ∧ E(x, z) ∧ y ≠ z]
which is true iff x has at least two neighbours. However, the two elements at the
ends of this line graph cannot be defined, as one can make a graph isomorphism of
the graph onto itself which maps one end to the other end and vice versa. So one has
one definable and two undefinable elements. Similarly, in the graph
o - o o - o
none of the elements is definable while in the graph
o - o - o - o
| | |
o - o - o
every vertex is definable.
∀x ∀y ∀z [x ◦ (y ◦ z) = (x ◦ y) ◦ z]
and so one can define the class of all semigroups with this simple formula plus the
constraint that the underlying language has equality and one binary operation symbol
◦, which could also be represented by a binary function f and the formula
Of course, one might sometimes also use several or even infinitely many formulas to
define a class of structures. A class of structures is definable iff there is a set S of
axioms which defines it; some classes of structures are not definable like the class of
all finite semigroups or all finite groups. These classes are nevertheless of interest for
mathematicians.
belongs to C iff it satisfies this set of sentences. If possible, one tries to get that S is
decidable or at least recursively enumerable; however, for some classes of structures
this is impossible to achieve.
Example. The class of all groups is elementary and the class of all infinite groups
is elementary in a wider sense. The class of all groups can be axiomatised by the
following three sentences (which one can combine by ∧ to get a single sentence):
∀x ∀y ∀z [x ◦ (y ◦ z) = (x ◦ y) ◦ z];
∀x ∀y ∃z [x ◦ z = y];
∀x ∀y ∃z [z ◦ x = y].
The class of all infinite groups is axiomatised by adding formulas which ensure, for every n, that
there are at least n elements, that is, one quantifies existentially over v1 , v2 , . . . , vn
and then says that each vi differs from vj for i, j with 1 ≤ i < j ≤ n. For example,
is the formula which says that there are at least four elements in the structure.
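For a finite structure, the three group axioms above can be checked by brute force over the operation table; the example operations below are illustrative:

```python
def is_group_table(elements, op):
    """Check the three group axioms by brute force on a finite structure."""
    elements = list(elements)
    # ∀x ∀y ∀z [x ◦ (y ◦ z) = (x ◦ y) ◦ z]
    assoc = all(op(x, op(y, z)) == op(op(x, y), z)
                for x in elements for y in elements for z in elements)
    # ∀x ∀y ∃z [x ◦ z = y]
    right_solvable = all(any(op(x, z) == y for z in elements)
                         for x in elements for y in elements)
    # ∀x ∀y ∃z [z ◦ x = y]
    left_solvable = all(any(op(z, x) == y for z in elements)
                        for x in elements for y in elements)
    return assoc and right_solvable and left_solvable

print(is_group_table(range(4), lambda x, y: (x + y) % 4))   # True: Z4 is a group
print(is_group_table(range(4), min))                        # False: min is associative,
                                                            # but min(0, z) = 3 is unsolvable
```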
A homomorphism h which satisfies in the third item for all predicates P and all
a1 , . . . , an ∈ A the more restrictive condition
(c) If h is a strong homomorphism, then for every formula α without equality and
without quantifiers, A, s |= α iff B, s′ |= α.
(d) If h is a strong injective homomorphism then for every formula α without quan-
tifiers, A, s |= α iff B, s′ |= α.
(e) If h is a strong surjective homomorphism then for every formula α without equal-
ity, A, s |= α iff B, s′ |= α.
Proof. Now one proves all the different parts of the Homomorphism Theorem.
(a) Note that s′ maps variables to elements of the domain B of B and that one
can show by induction over the construction of terms that h(s(t)) = s′ (t). For the
base case, for variables this holds by the definition of s′ and for constants,
h(s(c)) = h(cA ) = cB = s′ (c). Now, for the induction step, consider an n-ary function
f and assume that the statement is already verified for the terms t1 , . . . , tn .
Now h(s(f (t1 , . . . , tn ))) = h(f A (s(t1 ), . . . , s(tn ))) = f B (h(s(t1 )), . . . , h(s(tn ))) =
f B (s′ (t1 ), . . . , s′ (tn )) = s′ (f (t1 , . . . , tn )) and thus the statement also holds for
the term f (t1 , . . . , tn ). Thus one can conclude by structural induction that the
statement holds for all terms.
(b) If A, s |= P (t1 , . . . , tn ) then P A (s(t1 ), . . . , s(tn )) is true and P B (s′ (t1 ), . . . , s′ (tn ))
is true and thus B, s′ |= P (t1 , . . . , tn ). Furthermore, if A, s |= t1 = t2 then
s(t1 ) = s(t2 ) and h(s(t1 )) = h(s(t2 )) and s′ (t1 ) = s′ (t2 ) and B, s′ |= t1 = t2 .
(c) For a strong homomorphism, one can again show the claim by induction over
quantifier-free formulas without equality, which are obtained from atomic formulas
consisting of predicates using only connectives. For the base case, consider an n-ary
predicate P with terms t1 , . . . , tn and one sees that the following statements are all
equivalent to each other: A, s |= P (t1 , . . . , tn ); P A (s(t1 ), . . . , s(tn )); P B (s′ (t1 ), . . . , s′ (tn ));
B, s′ |= P (t1 , . . . , tn ). Furthermore, one sees for the inductive step that whenever
A, s |= α ⇔ B, s′ |= α and A, s |= β ⇔ B, s′ |= β then the same holds for α → β and ¬α.
(d) Assume now that h is an injective strong homomorphism. Using that A, s |=
t1 = t2 iff s(t1 ) = s(t2 ) and s′ (t1 ) = h(s(t1 )) and s′ (t2 ) = h(s(t2 )), it follows that
A, s |= t1 = t2 iff s′ (t1 ) = s′ (t2 ) and the latter is equivalent to B, s′ |= t1 = t2 .
Using this as an additional item for the base case, the same inductive proof
as before can be done to show that all quantifier-free formulas α satisfy that
A, s |= α iff B, s′ |= α.
(e) Assume now that h is a surjective strong homomorphism. One now extends the
proof of (c) to cover quantifiers. Here it is important that one does the induction
in parallel for all s and derived s′ and that one uses that every s′ (x|b) is derived
from some s(x|a) where a satisfies h(a) = b. So in the following, assume
that a satisfies h(a) = b and assume that A, s(x|a) |= α iff B, s′ (x|h(a)) |= α.
As for every b there is an a with h(a) = b, one can see that now the following
two statements are equivalent:
• For all a ∈ A, A, s(x|a) |= α and
• For all b ∈ B, B, s′ (x|b) |= α.
The first of these statements is equivalent to A, s |= ∀x [α] and the second is
equivalent to B, s′ |= ∀x [α]. This supplies the inductive step missing in (c) in
order to cover quantifiers; it can only be carried out when h is surjective.
(f) The full induction proof for the case of an isomorphism includes the proofs of (d)
and (e) to be combined for those h which are both, injective and surjective.
Example. The structures (Q, <) and (R, <) of the rational and real numbers are
elementarily equivalent, but they are not isomorphic, as the rational and the real
numbers have different cardinalities. This will be proven later as a corollary to the
Theorem of Löwenheim and Skolem.
Corollary 22E. Let A be any structure with domain A and R be any n-ary relation
first-order definable in A and h be any self-isomorphism of A. Then, for the domain
A of A and any a1 , . . . , an ∈ A, R(a1 , . . . , an ) holds iff R(h(a1 ), . . . , h(an )) holds.
Applications of Corollary 22E. One can prove that certain relations, constants
or functions are not definable by making isomorphisms which do not preserve the
relations or constants or functions. One example is the structure (Z, +): In this
structure, the mapping z 7→ −z is an isomorphism which preserves neither the
relation < nor any constant different from 0; thus these constants and the order are
not definable in the integers with addition.
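This application can be checked mechanically on a finite sample. The following Python sketch (purely illustrative; the helper name h is an assumption of this example, not notation from the text) confirms that z 7→ −z preserves addition but violates the order:

```python
# Illustrative sketch: the self-isomorphism z -> -z of (Z, +).

def h(z: int) -> int:
    return -z

sample = range(-10, 11)

# h is a homomorphism of (Z, +): h(a + b) = h(a) + h(b) on the sample.
assert all(h(a + b) == h(a) + h(b) for a in sample for b in sample)

# h does not preserve <: collect pairs a < b where h(a) < h(b) fails.
violations = [(a, b) for a in sample for b in sample if a < b and not h(a) < h(b)]
print(len(violations) > 0)  # True: the order is not preserved
```

By Corollary 22E, any first-order definable relation would be preserved by every self-isomorphism; each violating pair therefore witnesses that < is not definable in (Z, +).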
On the other hand, the criterion is not an “if and only if” criterion. For example,
every isomorphism of (R, +, ·) maps all integers to themselves, that is, it cannot move
any integer nor any rational. As x ≤ y ⇔ ∃z [x+z·z = y], isomorphisms
have to be order-preserving and there is only one isomorphism from (R, +, ·) to itself.
This implies that every real r is mapped to itself by any isomorphism. However, there are
only ℵ0 many formulas and thus only ℵ0 many real numbers are first-order definable
in (R, +, ·), while there are 2ℵ0 many reals. So many reals are neither moved by an
isomorphism nor definable.
Chapter 2.4 – A Deductive Calculus
It is known that for sentential logic, the set of logical consequences of a recursively
enumerable set of wffs is again recursively enumerable. For this and the next chapter,
the same should be shown for first-order logic. More precisely, in this Chapter 2.4 a
proof-calculus together with an algorithm to enumerate the formulas provable in this
calculus is provided; in the next Chapter 2.5, it is proven that provability in
this calculus coincides with being implied logically.
The Axioms. The set Λ of axioms depends on the logical language used and is here
given for a logical language with equality; if one does not have equality, axioms of
type (5) and (6) are meaningless and can be dropped. The formulas in Λ are valid,
that is, they are true in every structure and for every s over the same logical language
as the axioms. The axioms are tailor-made for the connectives → and
¬; however, if one wants to use further connectives, one has to include corresponding
additional formulas into Λ in order to handle them.
(1) α for every α which is obtained by taking a tautology in sentential logic and
replacing all atoms by well-formed formulas in a consistent way (the same atom
always needs to be replaced by the same formula);
(2) ∀x [α] → (α)xt for all well-formed formulas α, variables x and terms t where the
substitution (α)xt is permitted, that is, no variable name inside t becomes bound
at the places where x is substituted;
(3) ∀x [α → β] → ∀x [α] → ∀x [β];
(4) α → ∀x [α] for all well-formed formulas α and variables x where x does not occur
free in α;
(5) x = x for every variable x;
(6) x = y → α → β for all variables x, y and all atomic formulas α and all β derived
from α by replacing some occurrences of x by occurrences of y or vice versa;
(7) ∀x1 . . . ∀xn [α] for all variables x1 , . . . , xn and every formula α which is an axiom
of one of the types (1) – (6).
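Whether a sentential formula is a tautology, and hence whether a given formula is an instance of Axiom 1, can be decided by trying all truth assignments. The following Python sketch (illustrative only; the helper names are assumptions of this example) verifies some tautologies used in the proofs of this chapter:

```python
# Brute-force tautology test over all assignments to n atoms.
from itertools import product

def is_tautology(f, n):
    """f maps n truth values to a truth value; True iff f is constantly True."""
    return all(f(*v) for v in product([False, True], repeat=n))

implies = lambda a, b: (not a) or b

# A1 -> A2 -> A1 (used in the 'Copy' case of the Deduction Theorem):
assert is_tautology(lambda a1, a2: implies(a1, implies(a2, a1)), 2)

# (A1 -> A2 -> A3) -> (A1 -> A2) -> (A1 -> A3) (used in the Modus Ponens case):
assert is_tautology(
    lambda a1, a2, a3: implies(implies(a1, implies(a2, a3)),
                               implies(implies(a1, a2), implies(a1, a3))), 3)

# (B and not B) -> A: from an inconsistency everything follows.
assert is_tautology(lambda a, b: implies(b and not b, a), 2)
print("all checked formulas are tautologies")
```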
Here a substitution (α)xt replaces all free occurrences of x by the term t. Such a
substitution is permitted only if it cannot create a contradiction. Here an example:
∃y [y ≠ x] is a formula which is true whenever the domain of the structure has
at least two elements; however, when one substitutes x by y, one gets the formula
∃y [y ≠ y] which is not true in any structure. Thus the truth-value is compromised by
transforming a free variable into a bound variable. Before defining when a substitution
is permitted, here the formal definition of a substitution:
1. If α is an atomic formula then (α)xt is obtained from α by replacing every
occurrence of x in the terms of α by t;
2. (¬α)xt is (¬(α)xt );
3. (α → β)xt is ((α)xt → (β)xt );
4. (∀x [α])xt is ∀x [α], that is, bound occurrences of x are not substituted;
5. (∀y [α])xt is ∀y [(α)xt ] for every variable y different from x.
1. If α is an atomic formula then the substitution αtx is always permitted;
2. If α is of the form (¬β) or (β → γ) then αtx is permitted iff βtx and, in the case
that it is applicable, also γtx are permitted;
3. If α is of the form ∀y [β] then the substitution αtx is permitted iff either x does
not occur free in α or y is a different variable than x and y does not occur in t
and the substitution βtx is permitted.
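The recursive definitions of substitution and of being permitted translate directly into code. The following Python sketch (an assumed tuple encoding of formulas; terms are restricted to single variables for brevity) implements both notions and rechecks the capture example from above in the equivalent form ∀y [¬ y = x]:

```python
# Formulas: ('atom', name, v1, ..., vn), ('not', a), ('imp', a, b), ('all', x, a).

def free_vars(a):
    kind = a[0]
    if kind == 'atom':
        return set(a[2:])
    if kind == 'not':
        return free_vars(a[1])
    if kind == 'imp':
        return free_vars(a[1]) | free_vars(a[2])
    if kind == 'all':
        return free_vars(a[2]) - {a[1]}

def subst(a, x, t):
    """(a)^x_t: replace all free occurrences of variable x by the variable t."""
    kind = a[0]
    if kind == 'atom':
        return (a[0], a[1]) + tuple(t if v == x else v for v in a[2:])
    if kind == 'not':
        return ('not', subst(a[1], x, t))
    if kind == 'imp':
        return ('imp', subst(a[1], x, t), subst(a[2], x, t))
    if kind == 'all':
        return a if a[1] == x else ('all', a[1], subst(a[2], x, t))

def permitted(a, x, t):
    """Is the substitution (a)^x_t permitted (no variable of t gets captured)?"""
    kind = a[0]
    if kind == 'atom':
        return True
    if kind == 'not':
        return permitted(a[1], x, t)
    if kind == 'imp':
        return permitted(a[1], x, t) and permitted(a[2], x, t)
    if kind == 'all':
        y = a[1]
        return x not in free_vars(a) or (y != x and y != t and permitted(a[2], x, t))

# The capture example: in 'forall y [not y = x]', substituting x by y is not
# permitted, while substituting x by a fresh variable z is.
alpha = ('all', 'y', ('not', ('atom', '=', 'y', 'x')))
print(permitted(alpha, 'x', 'y'))   # False: y would become bound
print(permitted(alpha, 'x', 'z'))   # True
```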
Examples. In the following, let x, y, z be variables. Assume that the logical language
contains + and numerical constants (for making more interesting terms).
1. The substitution (∀x ∀y [x = y ∨ z ≠ y])xt has no effect, whatever t is, and is
therefore permitted.
Quiz. What are the results of the following substitutions and are they permitted?
1. (∀x [f (x) = y])yx+8 ;
Notation of Proofs. One writes S ` β for saying that one can derive or has derived
the formula β from S using the formulas in S and the axioms Λ; so ∅ ` β means that
β can be derived from the axioms Λ alone. In general, if S ⊆ S 0 then S ` β implies
S 0 ` β, as S ` β only says that all the formulas copied in the proof are taken from S
or Λ and as S ⊆ S 0 , those copied from S can also be copied from S 0 . However, one
normally tries to keep S as small as possible. The next example shows how to derive
a quantifier-free formula using the axioms for equality; the formula to be derived
says that it does not matter in which order one writes the two sides of the
equality sign.
1. ∅ ` x = y → x = x → y = x (Axiom 6);
2. ∅ ` (x = y → x = x → y = x) → (x = x → x = y → y = x) (Axiom 1);
3. ∅ ` x = x → x = y → y = x (Modus Ponens);
4. ∅ ` x = x (Axiom 5);
5. ∅ ` x = y → y = x (Modus Ponens).
Rule of Tautological Implication. If S ` β1 , . . . , S ` βn and β1 , . . . , βn tautologically imply α, then S ` α.
Proof. The statement that “β1 , . . . , βn tautologically imply α” is the same as saying
that (β1 ∧ . . . ∧ βn ) → α is a version of Axiom 1 and this is logically equivalent to
β1 → . . . → βn → α which is in turn also a version of Axiom 1. So one can do the
following derivation for S = {β1 , . . . , βn }
1. S ` β1 → . . . → βn → α (Axiom 1);
2. S ` β1 (Copy)
3. S ` β2 → . . . → βn → α (Modus Ponens);
4. S ` β2 (Copy)
5. S ` β3 → . . . → βn → α (Modus Ponens);
6. S ` β3 (Copy)
7. S ` β4 → . . . → βn → α (Modus Ponens);
8. . . . S ` βn → α (Modus Ponens);
9. S ` βn (Copy);
10. S ` α (Modus Ponens).
Abbreviated Proof Steps. Whenever a derivation already contains a line
1. S ` ∀x [α] (. . . );
one allows a subsequent line of the form
2. S ` αtx (Axiom 2, x → t, Modus Ponens);
in order to avoid writing the lengthy formulas produced by Axiom 2. So, for example,
the lines
1. S ` ∀y ∀x [x + y = y + x] (Copy);
2. S ` ∀x [x + 0 = 0 + x] (Axiom 2, y → 0, Modus Ponens);
stand for the three lines
1. S ` ∀y ∀x [x + y = y + x] (Copy);
2. S ` ∀y ∀x [x + y = y + x] → ∀x [x + 0 = 0 + x] (Axiom 2);
3. S ` ∀x [x + 0 = 0 + x] (Modus Ponens);
and it is easy to see that this will save some writing-time, as this sequence of proof-
steps is quite frequent.
The next example derives 0 + 5 = 5 from the set S which contains the formulas
∀y ∀x [x + y = y + x] and ∀x [x + 0 = x]:
1. S ` ∀y ∀x [x + y = y + x] (Copy);
2. S ` ∀x [x + 0 = 0 + x] (Axiom 2, y → 0, Modus Ponens);
3. S ` 5 + 0 = 0 + 5 (Axiom 2, x → 5, Modus Ponens);
4. S ` ∀v ∀w ∀u [v = w → v = u → w = u] (Axiom 6,7);
5. S ` ∀w ∀u [5 + 0 = w → 5 + 0 = u → w = u] (Axiom 2, v → 5 + 0, Modus
Ponens);
6. S ` ∀u [5 + 0 = 0 + 5 → 5 + 0 = u → 0 + 5 = u] (Axiom 2, w → 0 + 5, Modus
Ponens);
7. S ` 5 + 0 = 0 + 5 → 5 + 0 = 5 → 0 + 5 = 5 (Axiom 2, u → 5, Modus Ponens);
8. S ` 5 + 0 = 5 → 0 + 5 = 5 (Modus Ponens);
9. S ` ∀x [x + 0 = x] (Copy);
10. S ` 5 + 0 = 5 (Axiom 2, x → 5, Modus Ponens);
11. S ` 0 + 5 = 5 (Modus Ponens).
Deduction Theorem. If S ∪ {α} ` β then S ` α → β.
Proof. The idea is that a formal proof of S ∪ {α} ` β consisting of steps γ1 , γ2 , . . . , γn
with β = γn can be translated into a formal proof such that each step
• S ∪ {α} ` γm ;
becomes translated into one or more steps with the last one of the new steps being
• S ` α → γm ;
and this is now done in more detail by a case distinction. In the case that γm ∈ S, the
complete step is as follows:
• S ` γm (Copy);
• S ` γm → α → γm (Axiom 1);
• S ` α → γm (Modus Ponens);
here one has to know that A1 → A2 → A1 is a tautology, as it is equivalent to
(A1 ∧ A2 ) → A1 . In the case that γm equals α, one just puts the following single
derivation step:
• S ` α → α (Axiom 1);
and this is based on the fact that A1 → A1 is always true in propositional logic. In the
case that γm is in Λ, more specifically a version of Axiom k with k ∈ {1, 2, 3, 4, 5, 6, 7},
one puts the following steps:
• S ` γm (Axiom k);
• S ` γm → α → γm (Axiom 1);
• S ` α → γm (Modus Ponens);
and these are essentially equal to copying from S. The last case is that of a Modus
Ponens. Originally one derives γm from previous steps γi and γj = γi → γm ; now one
has instead α → γi and α → γi → γm . So one makes use of the following tautology:
(A1 → A2 → A3 ) → (A1 → A2 ) → (A1 → A3 ). Note that (A1 → A2 ) → (A1 → A3 )
can only be false when (A1 → A3 ) is false and (A1 → A2 ) is true; so this can only
be when A1 is true, A2 is true and A3 is false. But in this case, also A1 → A2 → A3
is false, thus the overall implication is true. For that reason, one can now put the
following steps into the derivation:
• S ` (α → γi → γm ) → (α → γi ) → (α → γm ) (Axiom 1);
• S ` (α → γi ) → (α → γm ) (Modus Ponens);
• S ` α → γm (Modus Ponens);
here the first Modus Ponens step used that α → γi → γm occurred earlier in the
derivation and the second Modus Ponens step used that α → γi occurred earlier in
the derivation. This allows one to translate the usage of Modus Ponens. Thus every
step can be translated and the new formal proof is for S ` α → β.
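The proof just given is effective: it describes an algorithm which transforms a formal proof of S ∪ {α} ` β step by step into one of S ` α → β. The following Python sketch (the dictionary encoding of proof steps is an assumption of this example) carries out the translation of the single cases:

```python
def imp(a, b):
    """The formula a -> b, represented as a tagged pair."""
    return ('imp', a, b)

def translate(step, alpha):
    """Translate one step 'S + {alpha} |- g' into steps ending with 'S |- alpha -> g'."""
    g = step['formula']
    if step['rule'] == 'copy' and g == alpha:
        return [(imp(alpha, alpha), 'Axiom 1')]          # alpha -> alpha
    if step['rule'] in ('copy', 'axiom'):                # g in S or g in Lambda
        return [(g, step['rule']),
                (imp(g, imp(alpha, g)), 'Axiom 1'),
                (imp(alpha, g), 'Modus Ponens')]
    if step['rule'] == 'mp':                             # g from g_i and g_i -> g
        gi = step['from']
        return [(imp(imp(alpha, imp(gi, g)),
                     imp(imp(alpha, gi), imp(alpha, g))), 'Axiom 1'),
                (imp(imp(alpha, gi), imp(alpha, g)), 'Modus Ponens'),
                (imp(alpha, g), 'Modus Ponens')]

# Translating the step 'copy beta from S' of a proof of S + {alpha} |- beta:
for line in translate({'rule': 'copy', 'formula': 'beta'}, 'alpha'):
    print(line)
```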
If S ` α → β then also S ∪ {α} ` β:
1. S ∪ {α} ` α → β (every proof from S is also a proof from S ∪ {α});
2. S ∪ {α} ` α (Copy);
3. S ∪ {α} ` β (Modus Ponens).
Thus one can generalise the Deduction Theorem in order to make it an “if and only
if”: S ∪ {α} ` β iff S ` α → β.
Contraposition. S ∪ {α} ` β iff S ∪ {¬β} ` ¬α. For the first direction:
1. S ∪ {α} ` β (Assumption);
2. S ` α → β (Deduction Theorem);
3. S ` (α → β) → (¬β → ¬α) (Axiom 1);
4. S ` ¬β → ¬α (Modus Ponens);
5. S ∪ {¬β} ` ¬α (converse direction of the Deduction Theorem).
Now one does the opposite direction. Here one uses additional versions of Axiom 1
which allow one to replace ¬¬α by α and ¬¬β by β in the given formula.
1. S ∪ {¬β} ` ¬α (Assumption);
2. S ` ¬β → ¬α (Deduction Theorem);
3. S ` (¬β → ¬α) → (¬¬α → ¬¬β) (Axiom 1);
4. S ` ¬¬α → ¬¬β (Modus Ponens);
5. S ` (¬¬α → ¬¬β) → (α → β) (Axiom 1);
6. S ` α → β (Modus Ponens);
Note that one could also have combined the two usages of Axiom 1 in this direction
into one single, slightly more complicated version of that axiom.
Definition. One says that a set S of formulas is inconsistent, if one can derive an
antitautology. Equivalently, one can also say that S is inconsistent if one can derive,
for some β, both the formulas β and ¬β. Another equivalent definition is that S is
inconsistent iff every formula can be derived from S. This statement is a consequence
of the fact that for every α, the formula β → ¬β → α is a tautology; this is most
easily seen by its equivalent form (β ∧ ¬β) → α. So when one can derive β and ¬β,
then one can also in two further steps derive α.
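That every formula can be derived from β and ¬β can be replayed mechanically in two Modus Ponens steps. In the following Python sketch (formulas encoded as strings and tagged pairs; these encodings are assumptions of this example), the derivation starts from the tautology β → ¬β → α:

```python
# Implications are encoded as ('imp', premise, conclusion); other formulas as
# strings or ('not', ...) pairs.

def modus_ponens(implication, premise):
    """Given the formula premise -> conclusion and the premise, return the conclusion."""
    assert implication[0] == 'imp' and implication[1] == premise
    return implication[2]

beta, alpha = 'beta', 'alpha'
axiom1 = ('imp', beta, ('imp', ('not', beta), alpha))  # beta -> (not beta -> alpha)

step1 = modus_ponens(axiom1, beta)           # yields: not beta -> alpha
step2 = modus_ponens(step1, ('not', beta))   # yields: alpha
print(step2)  # alpha
```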
For example, if S ∪ {α} ` ¬α then S ` ¬α:
1. S ∪ {α} ` ¬α (Assumption);
2. S ` α → ¬α (Deduction Theorem);
3. S ` (α → ¬α) → ¬α (Axiom 1);
4. S ` ¬α (Modus Ponens).
Generalisation Theorem. If S ` α and the variable x does not occur free in S then
S ` ∀x [α]. Again one proves that by translating the proof. Assume that there is a
sequence of formulas β1 , β2 , . . . , βn which is a proof for α from S and α = βn . Now
one replaces every single step
• S ` βm ;
by one or more steps of which the last one is
• S ` ∀x [βm ].
The first case is that βm ∈ Λ; then ∀x [βm ] is also an axiom by Axiom 7, so one
directly puts
• S ` ∀x [βm ] (Axiom 7);
and nothing more needs to be proven. The next case is that βm ∈ S. As x does not
occur free in βm , one can then use Axiom 4:
• S ` βm (Copy);
• S ` βm → ∀x [βm ] (Axiom 4);
• S ` ∀x [βm ] (Modus Ponens);
here the Axiom 4 is needed and the proof of the Generalisation Theorem is more or
less the only place where one needs it; later one will just apply the Generalisation
Theorem instead. The last case is that there are i, j < m with βj = βi → βm and the
application of the Modus Ponens in the original proof. This translates as follows:
• S ` ∀x [βi ] (earlier translated step);
• S ` ∀x [βi → βm ] (earlier translated step);
• S ` ∀x [βi → βm ] → ∀x [βi ] → ∀x [βm ] (Axiom 3);
• S ` ∀x [βi ] → ∀x [βm ] (Modus Ponens);
• S ` ∀x [βm ] (Modus Ponens);
here recall that Axiom 3 states that one can move a quantifier from outside
over an implication. Only the last three steps are the new part for βm , the other two
steps were only cited to explain why the Modus Ponens can be used twice. The so
translated proof is then able to derive ∀x [α] from S.
Generalisation on Constants. Assume that S ` α where the constant symbol c
does not occur in S. Then, for every variable y which does not occur in the given
proof, one obtains a proof of αyc from the finite set T of those formulas of S which are
copied in the proof; for this, every step S ` βm is translated into T ` (βm )cy , where
(βm )cy denotes the result of replacing c by y:
• If βm ∈ S then βm does not contain c and (βm )cy = βm , so βm can still be copied
from S;
• If βm ∈ Λ then one can see that the new formula (βm )cy is again an axiom of the
same type and that no conflict is introduced for versions of Axiom 2, as y does
not occur in bound form in βm by assumption;
• If βm was obtained by Modus Ponens from steps βi and βj = βi → βm then
(βm )cy is obtained by Modus Ponens from (βi )cy and (βj )cy = (βi )cy → (βm )cy .
Furthermore, as T does not contain the variable y free, it follows by the Generalisation
Theorem that there is also a proof for T ` ∀y [αyc ] and this proof does not reintroduce
the constant c, as it does not occur in T . As S ⊇ T , it follows that S ` ∀y [αyc ] by the
same proof.
Corollary 24G. Assume that S ` αcx , where the constant symbol c neither occurs in
S nor in α. Then S ` ∀x [α] and there is a deduction of this formula which does not
use the constant symbol c.
Proof. Assume that the formal proof for S ` αcx is given and y is a variable which
does not occur in any formula of this proof. Now there is also a proof for S ` (αcx )cy
which is, more precisely, a proof for S ` αyx . This proof can be translated into a
proof for S ` ∀y [αyx ]. As x does not occur free in αyx , (αyx )yx is the same as α and
furthermore, y occurs in αyx only in places where x is free (as otherwise the y would
not have gone there at the substitution αyx ). So the substitution is permitted and the
formula ∀y [αyx ] → α is a version of Axiom 2. Thus one can deduce ∀y [αyx ] → α and
similarly ∀x [∀y [αyx ] → α] is also an Axiom in Λ. So one has the following end of the
derivation:
1. S ` ∀y [αyx ] (proven above);
2. S ` ∀x [∀y [αyx ] → α] (Axiom 2 and Axiom 7);
3. S ` ∀y [αyx ] → ∀x [∀y [αyx ]] (Axiom 4);
4. S ` ∀x [∀y [αyx ]] (Modus Ponens);
5. S ` ∀x [∀y [αyx ] → α] → ∀x [∀y [αyx ]] → ∀x [α] (Axiom 3);
6. S ` ∀x [∀y [αyx ]] → ∀x [α] (Modus Ponens);
7. S ` ∀x [α] (Modus Ponens).
Here the third step just used that x is not free in the formula ∀y [αyx ], as all free
occurrences of x in α had been renamed to y.
Corollary 24H. Assume that a constant c does not occur in the formulas S, α and
β. If S ∪ {αcx } ` β then S ∪ {∃x [α]} ` β and there is a formal proof for the latter
which does not use the constant c.
1. S ∪ {αcx } ` β (Assumption);
2. S ∪ {¬β} ` ¬(αcx ), that is, (¬α)cx (Contraposition);
3. S ∪ {¬β} ` ∀x [¬α] (Corollary 24G);
Note that the statement S ∪ {¬β} ` ∀x [¬α] no longer contains c on either side
and therefore one can, by Corollary 24G, obtain a proof for this fact without using
the constant c; this proof can then be extended by the last steps to incorporate the
Contraposition.
Alphabetic Variants. For every formula α and every term t there is a formula β
which is obtained from α by renaming the bound variables consistently such that no
bound variable of β occurs in t and β is provably equal to α, that is, the substitution
βtx is permitted and ∅ ` α → β and ∅ ` β → α.
Proof. One proves this for each fixed term t and variable x by structural induction
over α; for this one defines recursively a function F from wffs to wffs (which depends
on t) such that β = F (α) is the corresponding formula and this formula satisfies the
following invariants:
1. no bound variable of F (α) occurs in t (hence the substitution (F (α))xt is permitted);
2. ∅ ` α → F (α);
3. ∅ ` F (α) → α.
First, for atomic formulas, one defines F (α) = α and there are no bound variables at
all. The substitution (F (α))xt is therefore permitted. Furthermore, as α = F (α), the
equivalence of α and F (α) is provable.
If α = γ → δ then F (α) = F (γ) → F (δ). By induction
hypothesis, all bound variables in F (γ) and F (δ) do not occur in t and the same holds
for F (α). Thus the substitution (F (α))xt is permitted. Furthermore, by assumption,
∅ ` γ → F (γ), ∅ ` F (γ) → γ, ∅ ` δ → F (δ), ∅ ` F (δ) → δ. Now the following two
formulas are tautologies, which say that when γ and F (γ) are equivalent and when δ
and F (δ) are equivalent then (γ → δ) → (F (γ) → F (δ)) and the other way round:
(F (γ) → γ) → (δ → F (δ)) → (γ → δ) → (F (γ) → F (δ));
(γ → F (γ)) → (F (δ) → δ) → (F (γ) → F (δ)) → (γ → δ).
Using these formulas by Axiom 1 and then applying Modus Ponens four times, one
can derive the two formulas α → F (α) and F (α) → α and these are valid.
If α = ¬γ, then one lets F (α) = ¬F (γ). As no bound variable of F (γ) occurs in t,
the same is true for F (α) = ¬F (γ) and therefore the substitution (F (α))xt is permitted. Using
Contraposition, one gets out of γ → F (γ) and F (γ) → γ the formulas ¬F (γ) → ¬γ
and ¬γ → ¬F (γ); as the two first were valid by induction hypothesis, the two last
are valid as well.
If α = ∀y [γ], then let z be any variable which neither occurs in F (γ) nor in t;
as there are infinitely many variables, such a variable z exists. Now one lets F (α) =
∀z [(F (γ))yz ]. Note that therefore no bound variable of F (α) occurs in t, as this holds
for γ by induction hypothesis and the only new quantified variable is z which does
not occur in t. Furthermore, (F (γ))yz is a permitted substitution and note that it
only affects those occurrences of y in F (γ) where y occurs free in F (γ). Note that
y, if it occurs free in γ, also occurs at the corresponding places free in F (γ), as F
does nothing else than changing the names of quantified variables within the range
of the corresponding quantifiers. Furthermore, the substitution (∀z [(F (γ))yz ])xt is also
permitted, as none of the bound variables in this formula occurs in t. It remains to
show that the proven equivalence of the new formulas goes through and this is done
by making the corresponding formal proofs.
For the converse direction, one uses that all occurrences of z in (F (γ))yz had been
places where y occurs free and thus ((F (γ))yz )zy just equals F (γ).
1. {F (α)} ` ∀z [(F (γ))yz ] (Copy);
2. {F (α)} ` ((F (γ))yz )zy (Axiom 2, z → y, Modus Ponens);
3. {F (α)} ` F (γ) (The same written differently);
4. {F (α)} ` F (γ) → γ (by induction hypothesis, provable from ∅ alone);
5. {F (α)} ` γ (Modus Ponens);
6. {F (α)} ` ∀y [γ] (Generalisation Theorem, as y is not free in F (α));
7. ∅ ` F (α) → α (Deduction Theorem).
Example. Alphabetic variants are employed to avoid conflicts with bound variables;
the following example gives some insight.
Let t be the term x ◦ y and consider the following formula:
∀z ∀x ∀y [(x ◦ y) ◦ z = z]
and one wants to substitute z by x ◦ y. This would clash with the bound variables x
and y and thus it cannot be done. However, one can form the alphabetical variant
∀z ∀v ∀w [(v ◦ w) ◦ z = z]
and obtains from this using Axiom 2 and Modus Ponens the formula
∀v ∀w [(v ◦ w) ◦ (x ◦ y) = x ◦ y]
A structure in which the above law holds is any semigroup where the semigroup
operation is defined by the equation x ◦ y = y for all x, y. This operation is associative,
but the structure does not have a neutral element, so it is not a monoid.
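The renaming of bound variables behind alphabetic variants can be sketched as a recursive procedure. The following Python code (an assumed tuple encoding; terms restricted to single variables; the generator of fresh names must yield variables occurring neither in the formula nor in the term t) renames the quantifier prefix of an example like the one above:

```python
import itertools

def subst_var(a, x, z):
    """Replace the free occurrences of variable x by variable z."""
    kind = a[0]
    if kind == 'atom':
        return (a[0], a[1]) + tuple(z if v == x else v for v in a[2:])
    if kind == 'not':
        return ('not', subst_var(a[1], x, z))
    if kind == 'imp':
        return ('imp', subst_var(a[1], x, z), subst_var(a[2], x, z))
    if kind == 'all':
        return a if a[1] == x else ('all', a[1], subst_var(a[2], x, z))

def variant(a, fresh):
    """F(a): rename every bound variable to a fresh one, innermost first."""
    kind = a[0]
    if kind == 'atom':
        return a
    if kind == 'not':
        return ('not', variant(a[1], fresh))
    if kind == 'imp':
        return ('imp', variant(a[1], fresh), variant(a[2], fresh))
    if kind == 'all':
        body = variant(a[2], fresh)
        z = next(fresh)
        return ('all', z, subst_var(body, a[1], z))

# Rename the prefix of 'forall z forall x forall y [P(x, y, z)]' so that the
# bound variables avoid x and y (as needed before substituting the term x o y):
fresh = (f"v{i}" for i in itertools.count())
alpha = ('all', 'z', ('all', 'x', ('all', 'y', ('atom', 'P', 'x', 'y', 'z'))))
print(variant(alpha, fresh))
```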
Examples of Derivations. The following formulas are proven using the axioms,
the Deduction Theorem and the Generalisation Theorem. All the results proven
relate equality with a function symbol. First one proves the valid formula
∀v ∀w [v = w → w = v].
1. ∅ ` v = w → v = v → w = v (Axiom 6);
2. {v = w} ` v = v → w = v (Deduction Theorem);
3. {v = w} ` v = v (Axiom 5);
4. {v = w} ` w = v (Modus Ponens);
5. ∅ ` v = w → w = v (Deduction Theorem);
6. ∅ ` ∀w [v = w → w = v] (Generalisation Theorem);
7. ∅ ` ∀v ∀w [v = w → w = v] (Generalisation Theorem);
The next derivation establishes the valid formula
∀x ∀y [x = y → f (x) = f (y)]
which says that equal arguments give equal function values:
1. ∅ ` x = y → f (x) = f (x) → f (x) = f (y) (Axiom 6);
2. {x = y} ` f (x) = f (x) → f (x) = f (y) (Deduction Theorem);
3. {x = y} ` ∀z [z = z] (Axiom 5);
4. {x = y} ` ∀z [z = z] → f (x) = f (x) (Axiom 2, z → f (x));
5. {x = y} ` f (x) = f (x) (Modus Ponens);
6. {x = y} ` f (x) = f (y) (Modus Ponens);
7. ∅ ` x = y → f (x) = f (y) (Deduction Theorem);
8. ∅ ` ∀y [x = y → f (x) = f (y)] (Generalisation Theorem);
9. ∅ ` ∀x ∀y [x = y → f (x) = f (y)] (Generalisation Theorem).
The next statement says that if f is one-one, so is the concatenation of f with itself:
∀x ∀y [x ≠ y → f (x) ≠ f (y)] ` ∀x ∀y [x ≠ y → f (f (x)) ≠ f (f (y))].
To simplify the notation, let α stand for x ≠ y → f (x) ≠ f (y) and let S be the set
containing only ∀x ∀y [α]. The proof is now the following:
1. S ` ∀x ∀y [α] (Copy);
2. S ` ∀y [α] (Axiom 2, Modus Ponens);
3. S ` x ≠ y → f (x) ≠ f (y) (Axiom 2, Modus Ponens);
4. S ` ∀y [f (x) ≠ y → f (f (x)) ≠ f (y)] (Axiom 2, x → f (x), Modus Ponens);
5. S ` f (x) ≠ f (y) → f (f (x)) ≠ f (f (y)) (Axiom 2, y → f (y), Modus Ponens);
6. S ` (x ≠ y → f (x) ≠ f (y)) → (f (x) ≠ f (y) → f (f (x)) ≠ f (f (y))) →
(x ≠ y → f (f (x)) ≠ f (f (y))) (Axiom 1, transitivity of →);
7. S ` (f (x) ≠ f (y) → f (f (x)) ≠ f (f (y))) → (x ≠ y → f (f (x)) ≠ f (f (y)))
(Modus Ponens);
8. S ` x ≠ y → f (f (x)) ≠ f (f (y)) (Modus Ponens);
9. S ` ∀y [x ≠ y → f (f (x)) ≠ f (f (y))] (Generalisation Theorem);
10. S ` ∀x ∀y [x ≠ y → f (f (x)) ≠ f (f (y))] (Generalisation Theorem).
The transitivity of → is the formula which says that when A1 → A2 and A2 → A3 then
A1 → A3 . That is, one takes the formula (A1 → A2 ) → (A2 → A3 ) → (A1 → A3 ) and
replaces A1 by x ≠ y, A2 by f (x) ≠ f (y) and A3 by f (f (x)) ≠ f (f (y)). The last two
invocations of the Generalisation Theorem can be applied, as x and y do not occur
free in ∀x ∀y [α].
Chapter 2.5 – Soundness and Completeness Theorems
Lemma. If a formula α is valid then so is its generalisation ∀x [α].
Proof. Assume that α is valid, that is, for every structure A with some domain A
and every s mapping variables to elements of A,
A, s |= α.
Now, the same holds if one modifies s at x to some a ∈ A: For all s and all value
a ∈ A,
A, s(x|a) |= α.
It follows by the definition of the universal quantifier that for all structures A and all
s,
A, s |= ∀x [α];
thus ∀x [α] is also a valid formula.
Soundness of Λ. For the soundness of the Proof Calculus from Chapter 2.4, it is
sufficient to show that the axioms of types 1 – 6 are valid, as the validity of Axiom 7
then follows from the preceding lemma.
Axiom 1 considers formulas which are tautological combinations of easier formulas.
That is, given a formula like
α → β → α,
the truth-value of this formula in a structure depends only on what the structure does
with α and β, so let ν(A1 ) = 1 iff A, s |= α and ν(A2 ) = 1 iff A, s |= β. Now one has
that
A, s |= α → β → α
iff ν(A1 → A2 → A1 ) = 1; as this underlying formula is a tautology, this is always
true. Thus the formula
α→β→α
is valid, regardless of what α and β are, as long as they are wffs.
Axiom 2 considers formulas which are of the form ∀x [α] → αtx , where the substitution
is permitted. Now if A, s |= ∀x [α] then A, s(x|a) |= α for every value a ∈ A.
This holds in particular for the value a = s(t); as the substitution is permitted, no
variable of t becomes bound in αtx and therefore A, s |= αtx iff A, s(x|s(t)) |= α.
Thus if A, s |= ∀x [α] then A, s |= αtx . This implication is
then a valid formula: A, s |= ∀x [α] → αtx . As every A can be chosen which uses all
logical symbols appearing in α and every mapping s from variables to members of the
domain of A can be taken, the formula ∀x [α] → αtx is then valid.
Axiom 3 says that ∀x [α → β] → ∀x [α] → ∀x [β]. Now consider any structure A
and value-function s. If A, s |= ∀x [α → β] then for all a ∈ A (where A is the domain
of A), the implication αax → βax is true. Furthermore, if A, s |= ∀x [α] then for all
a ∈ A the statement A, s |= αax is true. Combining both by Modus Ponens, the
statement A, s |= βax is true for all a ∈ A and thus A, s |= ∀x [β]. These verbally
described connections are exactly what is formalised in Axiom 3; hence its versions
are valid.
Axiom 4 says that whenever x does not occur free in α then the formula α → ∀x [α]
is valid. This is indeed the case, because when the truth of α does not depend on x
and A, s |= α, then also A, s(x|a) |= α for all a ∈ A. So it follows that A, s |= ∀x [α].
For that reason, A, s |= α → ∀x [α], independently of whether A, s make α true or
not. So the formulas of Axiom 4 are also all valid.
Axiom 5 says that x = x is valid for all x. This statement is true, indeed in general
even t = t for all terms t, as for every structure A and every function s, s(t) is on
both sides of the equality sign the same value.
Axiom 6 says that if x = y and α, β are atomic, thus quantifier-free, formulas in
which some occurrences of x and y are interchanged, then this has no effect, as in all
terms where x is replaced by y or vice versa, the values s(x) and s(y) are the same
by the assumption that A, s |= x = y. It follows that whenever A, s |= x = y then
A, s |= α iff A, s |= β. Thus A, s |= x = y → α → β and also this formula is valid.
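Over a finite structure, the satisfaction relation A, s |= α is computable, so concrete instances of the axioms can be tested exhaustively. The following Python sketch (the encoding and the structure ({0, 1, 2}, ≤) are assumptions of this example) checks an instance of Axiom 2 and of Axiom 4 for every assignment:

```python
from itertools import product

domain = [0, 1, 2]
preds = {'<=': lambda a, b: a <= b}

def holds(a, s):
    """A, s |= a for the finite structure (domain, <=); s maps variables to values."""
    kind = a[0]
    if kind == 'atom':                     # ('atom', pred, v1, ..., vn)
        return preds[a[1]](*(s[v] for v in a[2:]))
    if kind == 'not':
        return not holds(a[1], s)
    if kind == 'imp':
        return (not holds(a[1], s)) or holds(a[2], s)
    if kind == 'all':                      # ('all', x, body): try every value for x
        return all(holds(a[2], {**s, a[1]: d}) for d in domain)

# Axiom 2 instance: forall x [x <= y] -> y <= y (substituting the term y for x).
axiom2 = ('imp', ('all', 'x', ('atom', '<=', 'x', 'y')), ('atom', '<=', 'y', 'y'))

# Axiom 4 instance: y <= y -> forall x [y <= y] (x does not occur free).
axiom4 = ('imp', ('atom', '<=', 'y', 'y'), ('all', 'x', ('atom', '<=', 'y', 'y')))

for s in ({'y': d} for d in domain):
    assert holds(axiom2, s) and holds(axiom4, s)
print("both axiom instances hold for every assignment")
```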
Soundness Theorem. If S ` α then S |= α.
Proof. One shows by induction that every step βm of a derivation from S satisfies
S |= βm , that is, every structure A and assignment s with A, s |= S also satisfy
A, s |= βm . There are three cases:
1. βm ∈ S: Clearly S |= βm .
2. βm ∈ Λ: As βm is valid, S |= βm .
3. βm is obtained by Modus Ponens from earlier steps βi and βj = βi → βm :
Whenever A, s |= S then, by induction hypothesis, A, s |= βi and A, s |= βi → βm
and thus A, s |= βm .
In particular, a satisfiable set of formulas is consistent. Furthermore, (a) if S ∪ {¬α}
is inconsistent then S ` α:
1. S ∪ {¬α} ` α (an inconsistent set derives every formula);
2. S ` ¬α → α (Deduction Theorem);
3. S ` (¬α → α) → α (Axiom 1);
4. S ` α (Modus Ponens).
Thus (a) holds.
Completeness Theorem (Gödel). If S |= α then S ` α.
Proof. This theorem is proven by showing the following equivalent statement: Every
consistent set of formulas is satisfiable.
Let S be a consistent set of wffs in first-order logic. Then the main goal is to
extend S to a set T such that T satisfies the following items:
(I) S ⊆ T ;
(II) T is consistent and, for every formula α of the enlarged language, either α ∈ T
or (¬α) ∈ T ;
(III) For every variable x and every formula α there is a new constant c = cx,α such
that the formula ∃x [α] → αcx belongs to T .
After that, in Step 4, one forms a structure A in which all members of S not containing
= are satisfied and in Steps 5 and 6 one adjusts the structure such that also = is handled
properly. Now an outline of the proof.
Step 1 One expands the language by adding countably many new constant symbols.
S remains consistent with respect to this enlarged language.
Step 2 One selects for each formula α and each variable x in the new language a constant
cx,α and adds the formula
∃x [α] → αcx , where c = cx,α ,
to S, giving the set S 0 ; here the constant cx,α does not appear in α and one can
process the formulas in a sequence α0 , α1 , . . . such that when cxm ,αn is chosen
then this constant does not appear in any of the formulas αk with k ≤ n and
it also is different from all constants taken for pairs (xm0 , αn0 )
processed before (xm , αn ). One can show that S 0 is again consistent.
Step 3 One extends S 0 to a consistent set T of formulas in the enlarged language such
that for every formula α, either α ∈ T or (¬α) ∈ T : one enumerates all formulas
as γ0 , γ1 , . . . and adds, step by step, the formula γk whenever this keeps the set
consistent and otherwise (¬γk ).
Step 4 Now one defines a structure A as follows: First one introduces a new predicate
E and then considers instead of T the set T 0 obtained by replacing, in all
formulas, the primitive subformulas of the form t1 = t2 by E(t1 , t2 ). Now one lets the
domain A of A be the set of all terms using the enlarged set of constants and
one defines that a predicate P (t1 , . . . , tn ) is true in A iff P (t1 , . . . , tn ) ∈ T 0 (note
that E(t1 , t2 ) ∈ T 0 iff t1 = t2 ∈ T ). For each function symbol f , one defines
that in A, the value of f on inputs t1 , . . . , tn is just the term f (t1 , . . . , tn ). For
each variable x, one defines s(x) to be some constant c such that E(x, c) ∈ T 0 ;
such a constant c always exists. Furthermore, for the variable assignment s and
ground terms t, one has s(t) = t, and one can show that A, s |= α iff α ∈ T 0 .
Step 5 In the next step, one defines a further structure B by forming the equivalence
classes of E on A, that is, for every term t, one lets h(t) = {t0 ∈ A : E(t, t0 ) ∈
T 0 } and s0 (x) = h(s(x)) and B = {h(t) : t ∈ A}. Now one observes that
h(f (t1 , . . . , tn )) = f (h(t1 ), . . . , h(tn )) for all function symbols f and that h is
a homomorphism from A to B. One defines that P B (h(t1 ), . . . , h(tn )) is true
iff P A (t1 , . . . , tn ) is true and that h(t1 ) = h(t2 ) holds in B iff E A (t1 , t2 ) is true. One
can prove by structural induction that B, s0 |= α iff α ∈ T .
Step 6 For getting the final structure C, one just defines C to have the same value as
B on all symbols which were originally in the logical language and one keeps
s0 . Then one sees that C, s0 |= α for all α ∈ S.
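Step 5 is the usual quotient of a set by an equivalence relation. The following Python sketch (toy data; all names are assumptions of this example) maps each element t to its class h(t), exactly as done for the structure B:

```python
def quotient(A, E):
    """Map each element t of A to its E-class h(t), as a frozenset."""
    h = {}
    for t in A:
        h[t] = frozenset(u for u in A if E(t, u))
    return h

# Toy example: three 'terms' where E identifies a and b.
A = ['a', 'b', 'c']
E = lambda t, u: t == u or {t, u} == {'a', 'b'}
h = quotient(A, E)
B = set(h.values())
print(len(B))             # 2 classes: {a, b} and {c}
print(h['a'] == h['b'])   # True: equivalent terms get the same class
```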