Foundations of Theoretical Computer Science
Lecture notes of V. Claus and E.-R. Olderog
Contents

I Basic definitions
  4 Bibliography

II Finite automata and regular languages
  1 Finite automata
  2 Closure properties
  3 Regular expressions
  5 Decidability questions
  6 Automatic verification

III Context-free languages and push-down automata
  1 Context-free grammars
  2 Pumping Lemma
  3 Push-down automata
  4 Closure properties
  7 Questions of decidability

IV The notion of algorithm
  1 Turing machines
  2 Grammars

VI Complexity
Chapter I
Basic definitions
- regular languages
- context-free languages
- Grammars (1959)
We will show: These approaches are equivalent. There exist non-computable functions. Examples of non-computable functions are considered. Finite and push-down automata can be seen as special cases of Turing machines.
Question: How much time and memory does the computation take? This question leads to the modelling of complexity.
The topics of these lecture notes are:
• Automata
• Formal languages and grammars
• Computability
• Complexity
Below we have put together the notations that will be used in these lecture notes.
• We can describe finite sets by listing their elements, for example, {a, b, c, d} or {coffee, tea, sugar}.
The empty set {} is also denoted by ∅. An important example of an infinite set is the set
N of natural numbers: N = {0, 1, 2, 3, ...}.
x ∈ X denotes that x is an element of the set X, and x ∉ X denotes that x is not an
element of the set X.
The principle of extensionality holds, i. e., two sets are equal if they have the same elements. For example,
{a, b, c, d} = {a, a, b, c, d, d}.
Sets can also be described by a defining property, for example the set of even natural numbers:
M = {n ∈ N | n mod 2 = 0}.
For sets X, Y ⊆ Z we use the following operations:
union: X ∪ Y = {z ∈ Z | z ∈ X ∨ z ∈ Y}
intersection: X ∩ Y = {z ∈ Z | z ∈ X ∧ z ∈ Y}
difference: X − Y = X \ Y = {z ∈ Z | z ∈ X ∧ z ∉ Y}
complement: X̄ = Z − X
Furthermore X ∪̇ Y stands for the disjoint union of X and Y, i. e., the union where additionally
X ∩ Y = ∅ holds.
|X| and card(X) denote the cardinality or size of the set X. If X is finite, |X| is the number
of elements in X; e.g., |{coffee, tea, sugar}| = 3.
P(Z) and 2^Z denote the powerset of a set Z, i. e., the set of all subsets of Z: P(Z) =
{X | X ⊆ Z}. In particular, ∅ ∈ P(Z) and Z ∈ P(Z) hold.
• X × Y denotes the (Cartesian) product of two sets X and Y, consisting of all pairs whose
first element is from X and whose second element is from Y: X × Y = {(x, y) | x ∈ X ∧ y ∈ Y}.
In general, X1 × ... × Xn denotes the set of all n-tuples whose i-th component comes from
the set Xi for every i ∈ {1, ..., n}: X1 × ··· × Xn = {(x1, ..., xn) | x1 ∈ X1 ∧ ... ∧ xn ∈ Xn}.
• Relations are special sets. A (2-place or binary) relation R between two sets X and Y is a
subset of the product X × Y, i. e. R ⊆ X × Y. The infix notation xRy is often used to denote
(x, y) ∈ R. The domain and the range of R are defined by
dom(R) = {x ∈ X | ∃ y ∈ Y : (x, y) ∈ R} and ran(R) = {y ∈ Y | ∃ x ∈ X : (x, y) ∈ R}.
The relation R ⊆ X × Y is called
left-unique, if ∀ x1, x2 ∈ X, y ∈ Y : (x1, y) ∈ R ∧ (x2, y) ∈ R ⇒ x1 = x2,
right-unique, if ∀ x ∈ X, y1, y2 ∈ Y : (x, y1) ∈ R ∧ (x, y2) ∈ R ⇒ y1 = y2,
left-total, if X = dom(R),
right-total, if ran(R) = Y.
The identity relation on X is denoted by id_X: id_X = {(x, x) | x ∈ X}. The inverse relation of
R ⊆ X × Y is the relation R^(−1) ⊆ Y × X defined by:
∀ x ∈ X, y ∈ Y : (x, y) ∈ R ⇔ (y, x) ∈ R^(−1).
A relation R ⊆ X × X is called
reflexive, if ∀ x ∈ X : (x, x) ∈ R,
or in terms of relations: id_X ⊆ R,
irreflexive, if ¬∃ x ∈ X : (x, x) ∈ R,
or in terms of relations: R ∩ id_X = ∅,
symmetric, if ∀ x, y ∈ X : (x, y) ∈ R ⇒ (y, x) ∈ R,
or in terms of relations: R = R^(−1),
antisymmetric, if ∀ x, y ∈ X : (x, y) ∈ R ∧ (y, x) ∈ R ⇒ x = y,
or in terms of relations: R ∩ R^(−1) ⊆ id_X,
transitive, if ∀ x, y, z ∈ X : (x, y) ∈ R ∧ (y, z) ∈ R ⇒ (x, z) ∈ R,
or in terms of relations: R ∘ R ⊆ R.
R is an equivalence relation if R is reflexive, symmetric and transitive (note the initial
letters r-s-t of these properties). An equivalence relation R on X divides the set X
into disjoint subsets, where each subset contains pairwise equivalent elements. For an element
x ∈ X, the equivalence class of x is the set
[x]_R = {y ∈ X | (x, y) ∈ R}.
An element of an equivalence class is called a representative of this class because the whole
class can be recovered from the representative and the relation R. Hence an arbitrary
element can be chosen as a representative of its class, because for every y ∈ [x]_R it holds
that [y]_R = [x]_R.
The transitive closure R^+ and the reflexive transitive closure R* of R are defined as follows:
R^+ = ⋃_{n ∈ N\{0}} R^n and R* = ⋃_{n ∈ N} R^n,
where R^n denotes the n-fold composition R ∘ ... ∘ R (with R^0 = id_X).
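As a small illustration of these definitions, the following Python sketch computes the transitive closure of a finite relation by iterating composition to a fixed point; the function name and the encoding of R as a set of pairs are my own.

def transitive_closure(pairs):
    """Compute R+ of a finite relation given as a set of (x, y) pairs
    by iterating composition until a fixed point is reached."""
    closure = set(pairs)
    while True:
        # R o R: one more composition step
        new = {(x, w) for (x, y) in closure for (z, w) in closure if y == z}
        if new <= closure:
            return closure
        closure |= new

# R* additionally contains (x, x) for every element occurring in R.
R = {(1, 2), (2, 3)}
print(transitive_closure(R))  # {(1, 2), (2, 3), (1, 3)}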
In this lecture we will pay special attention to the analysis of formal languages. For this purpose
the following notation will be used. The concatenation of two languages L1 and L2 is
L1 · L2 = {u · v | u ∈ L1 and v ∈ L2}.
§4 Bibliography
• J.E. Hopcroft, R. Motwani & J.D. Ullman: Introduction to Automata Theory, Languages,
and Computation. Addison-Wesley, 2nd edition, 2001.
The following sources were also used during the preparation of the lecture:
• J. Albert & T. Ottmann: Automaten, Sprachen und Maschinen für Anwender. BI, 1983
(introductory book only).
• J.E. Hopcroft & J.D. Ullman: Introduction to Automata Theory, Languages, and
Computation. Addison-Wesley, 1979.
• H.R. Lewis & C.H. Papadimitriou: Elements of the Theory of Computation. Prentice
Hall, Englewood Cliffs, 1981.
In the following chapter we deal with a simple machine model: the finite automaton. We
will see that
• the languages recognized by finite automata have many structural properties, e.g.
representability by regular expressions.
Moreover, finite automata can easily be implemented as tables in programs and as circuits.
Chapter II
Finite automata and regular languages
§1 Finite automata
We want to use finite automata for recognizing languages. We can imagine a finite automaton
as a machine with final states, which reads characters from a tape, can move its head only to
the right and cannot print new symbols on the tape.
Sketch: [a read-only input tape containing the input string (e.g. A U T O M A T O N), a head
that moves only in reading direction to the right, and a finite control with current state q ∈ Q.]
The representation of automata as so-called transition systems is more suitable for the graphical
representation and for defining the acceptance behaviour of automata. A deterministic finite
automaton (in short DFA) is thus given as a 5-tuple
A = (Σ, Q, δ, q0, F) or A = (Σ, Q, →, q0, F),
where the transition function δ and the transition relation → correspond to each other via
δ(q, a) = q′ ⇔ (q, a, q′) ∈ →.
The elements (q, a, q′) ∈ → are called transitions. We mostly write q −a→ q′ instead of (q, a, q′) ∈ →.
For given a ∈ Σ we use −a→ also as a binary relation −a→ ⊆ Q × Q:
∀ q, q′ ∈ Q : q −a→ q′ ⇔ (q, a, q′) ∈ →.
A DFA can be graphically represented by a finite state diagram. It is a directed graph which
contains a vertex labelled q for every state q of the automaton and a directed edge labelled a
from q to q′ for every transition q −a→ q′. The initial state q0 is marked with an incoming
arrow. Final states are marked with an additional circle.
[State diagram of an example DFA over {0, 1}: states q0 (initial) and q1 (final), each with a
self-loop labelled 0, and transitions labelled 1 from q0 to q1 and from q1 back to q0.]
1. We extend the transition relations −a→ from input symbols a ∈ Σ to strings w ∈ Σ* and
define the respective relations −w→ inductively:
• q −ε→ q′ if and only if q = q′,
or in terms of relations: −ε→ = id_Q;
• q −aw→ q′ if and only if ∃ q″ ∈ Q : q −a→ q″ and q″ −w→ q′,
or in terms of relations: −aw→ = −a→ ∘ −w→.
2. Alternatively, the transition function δ can be extended to a function
δ* : Q × Σ* → Q with δ*(q, ε) = q and δ*(q, aw) = δ*(δ(q, a), w).
Both extensions agree, i. e.:
δ*(q, w) = q′ ⇔ q −w→ q′.
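As an illustration, here is a minimal Python sketch of a DFA together with its extended transition function δ*. The dictionary-based encoding and the concrete automaton (it matches the two-state diagram above and accepts the strings over {0, 1} with an odd number of 1s) are my own choices.

# A DFA as (Sigma, Q, delta, q0, F); delta is a dict keyed by (state, symbol).
DFA = {
    "Sigma": {"0", "1"},
    "Q": {"q0", "q1"},
    "delta": {("q0", "0"): "q0", ("q0", "1"): "q1",
              ("q1", "0"): "q1", ("q1", "1"): "q0"},
    "q0": "q0",
    "F": {"q1"},
}

def delta_star(A, q, w):
    """Extended transition function: delta*(q, eps) = q,
    delta*(q, aw) = delta*(delta(q, a), w)."""
    for a in w:
        q = A["delta"][(q, a)]
    return q

def accepts(A, w):
    return delta_star(A, A["q0"], w) in A["F"]

print(accepts(DFA, "0110"))  # False (even number of 1s)
print(accepts(DFA, "010"))   # True  (odd number of 1s)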
Example: For the automaton from the last example the extended transition relations can be
read off its state diagram. In general, for a string a1 · ... · an it holds, in terms of relations:
−a1·...·an→ = −a1→ ∘ ··· ∘ −an→.
A sequence q −a1→ q1 ... q(n−1) −an→ qn is also called a transition sequence.
Example (from the field of compiler construction): A compiler for a programming language
works in several phases, which are sometimes run in parallel:
• Lexical analysis: In this phase the so-called scanner breaks the input text into a sequence
of tokens, such as identifiers, keywords and delimiters.
• Syntax analysis: The generated token sequence is the input to the so-called parser, which
decides whether this sequence forms a syntactically correct program.
• Code generation: The syntactic structure recognized by the parser is used to generate
machine code.
• Optimisation: The execution time of the program is improved, mostly by local changes
to the machine code.
Lexical analysis is a simple task which can be accomplished by finite automata. As an example
let us consider the typical structure of identifiers in a programming language, given by the
following syntax diagrams of MODULA:
[Syntax diagrams: an identifier (Ident) is a letter followed by an arbitrary sequence of letters
and digits; "letter" stands for a ... z and A ... Z, "digit" for 0 ... 9.]
The identifiers constructed this way can be recognized by the following finite automaton:
[State diagram: from the initial state, a letter (a, ..., Z) leads to a final state which loops on
letters and digits (a, ..., Z and 0, ..., 9); a leading digit (0, ..., 9) leads to a non-final trap state.]
For the sake of clarity we have labelled the edges of the state diagram with several symbols.
Example (from the field of operating systems): If several programs try to access shared
resources in a multitasking environment, then these attempts need to be synchronized. For
example, consider two programs P1 and P2 which share one printer. P1 and P2 should be
synchronized in such a way that they do not send data to the printer simultaneously. We
construct a finite automaton that would monitor the printer usage by P1 and P2 .
• P1 reports the beginning and the end of printing to the automaton by the symbols b1 and
e1; similarly P2 by the symbols b2 and e2.
At each point of time the behaviour of P1 and P2 regarding the printer is given by a finite
string
w ∈ {b1, e1, b2, e2}*.
The automaton should accept exactly those strings w where
• every Pi uses the printer correctly, i. e., the symbols bi and ei alternate in w, starting with
bi, i = 1, 2, and
• P1 and P2 do not use the printer simultaneously, i. e., there is neither the substring b1 b2
nor b2 b1 in the string w.
[State diagram of the monitoring automaton: from the initial (and final) state, b1 leads to a
state "P1 prints" (left again via e1) and b2 to a state "P2 prints" (left again via e2); every other
symbol leads to a non-final error state which loops on b1, b2, e1, e2.]
Non-determinism
In a non-deterministic finite automaton we use a transition relation instead of a transition
function. Then it may happen that for some q ∈ Q, a ∈ Σ there are several successor
states q1, ..., qn, so that (in graphical representation) it holds that:
[Diagram: from state q, edges labelled a lead to each of the successor states q1, ..., qn.]
After reading a the automaton moves non-deterministically from q to one of the successor states
q1, ..., qn. A special case is n = 0; then for q ∈ Q and a ∈ Σ there is no successor state q′
with q −a→ q′. In this case the automaton stops and the input is not accepted. These remarks
lead to the following definition.
1.3 Definition: A non-deterministic finite automaton (or acceptor), in short NFA, is a 5-tuple
B = (Σ, Q, →, q0, F),
where Σ, Q, q0 and F are as in DFAs and the transition relation satisfies
→ ⊆ Q × Σ × Q.
Transitions are written q −a→ q′ and extended to strings (q −w→ q′) as in DFAs.
(i) The language accepted by an NFA B is L(B) = {w ∈ Σ* | ∃ q ∈ F : q0 −w→ q}.
(ii) Two NFAs B1 and B2 are called equivalent if L(B1) = L(B2) holds.
An NFA recognizes a string w if, while reading w, it can reach one of the final states. At the
same time there may exist other transition sequences for w which end in non-final states.
Obviously, a DFA is a special case of an NFA. Thus the equivalence between arbitrary finite
automata is defined.
Example: Let v = a1 ... an. The language of all strings with suffix v,
Lv = {wv | w ∈ Σ*},
is recognized by an NFA Bv with states q0, ..., qn. Bv behaves non-deterministically in the
initial state q0: while reading a string, Bv can decide at each occurrence of a1 whether to try to
recognize the suffix v. In order to do so, Bv goes to q1 and then expects a2 ... an as the rest of
the input. Should this not be the case, then Bv gets stuck at some i ∈ {1, ..., n} with a ≠ ai.
1.5 Theorem (Rabin and Scott, 1959): For every NFA there exists an equivalent DFA.
Proof (powerset construction): There is a state in A for each subset S of the set of states Q of
B (we denote such a state by S). We set S −a→_A S′ if and only if S′ is the set of all successor
states that can be reached by a-transitions of the non-deterministic automaton B from states
of S, i. e.
S′ = {q′ ∈ Q | ∃ q ∈ S : q −a→ q′} results in S −a→_A S′.
The initial state of A is {q0}, and the final states are F_A = {S ⊆ Q | S ∩ F ≠ ∅}.
By construction the following two properties hold:
(i) If q −a→ q′ and q ∈ S,
then ∃ S′ ⊆ Q : S −a→_A S′ and q′ ∈ S′.
(ii) If S −a→_A S′ and q′ ∈ S′,
then ∃ q ∈ Q : q −a→ q′ and q ∈ S.
Using (i) and (ii) we can easily show that L(A) = L(B). Let w = a1 ... an ∈ Σ* with ai ∈ Σ for
i = 1, ..., n. Then it holds that:
w ∈ L(B) ⇔ ∃ q ∈ F : q0 −w→ q
⇔ ∃ q1, ..., qn ∈ Q :
q0 −a1→ q1 ... q(n−1) −an→ qn with qn ∈ F
⇔ {"⇒" follows from (i) and "⇐" follows from (ii)}
∃ S1, ..., Sn ⊆ Q :
{q0} −a1→_A S1 ... S(n−1) −an→_A Sn with Sn ∩ F ≠ ∅
⇔ ∃ S ∈ F_A : {q0} −w→_A S
⇔ w ∈ L(A).
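A minimal Python sketch of this powerset construction, building only the subsets reachable from {q0}; the dict-based data layout is my own convention, and the example NFA is B01 for suffix recognition of 01, used in the example below.

from itertools import chain

def powerset_construction(Sigma, delta, q0, F):
    """delta: dict mapping (state, symbol) -> set of successor states (NFA).
    Returns the reachable part of the equivalent DFA; DFA states are frozensets."""
    start = frozenset({q0})
    dfa_delta, final, todo, seen = {}, set(), [start], {start}
    while todo:
        S = todo.pop()
        if S & F:
            final.add(S)
        for a in Sigma:
            # S' = all states reachable from S by an a-transition in the NFA
            S2 = frozenset(chain.from_iterable(delta.get((q, a), ()) for q in S))
            dfa_delta[(S, a)] = S2
            if S2 not in seen:
                seen.add(S2)
                todo.append(S2)
    return start, dfa_delta, final

# NFA for the suffix language L01 = {w01 | w in {0,1,2}*}:
delta = {("q0", "0"): {"q0", "q1"}, ("q0", "1"): {"q0"}, ("q0", "2"): {"q0"},
         ("q1", "1"): {"q2"}}
start, d, fin = powerset_construction({"0", "1", "2"}, delta, "q0", {"q2"})
print(len({S for (S, a) in d} | set(d.values())))  # number of reachable DFA states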
Example: We consider the suffix recognition of 01 in strings over the alphabet Σ = {0, 1, 2}.
According to the previous example the language L01 = {w01 | w ∈ Σ*} is recognized by the
NFA B01 with the following diagram:
[State diagram of B01: state q0 with self-loops labelled 0, 1, 2, a 0-transition from q0 to q1 and
a 1-transition from q1 to the final state q2.]
Now we apply the powerset construction from the Rabin-Scott theorem to B01 . As a result
we get the following DFA A01 :
[State diagram of A01: from the initial state {q0}, the symbols 1, 2 loop and 0 leads to {q0, q1};
from {q0, q1}, 0 loops, 1 leads to the final state {q0, q2} and 2 leads back to {q0}; from {q0, q2},
0 leads to {q0, q1} and 1, 2 lead back to {q0}. The construction additionally yields the states
{q1}, {q2}, {q1, q2} and {q0, q1, q2} (shown in boxes), which are not reachable from {q0}.]
The fragments shown in boxes are unreachable from the initial state {q0} of A01. Therefore, A01
can be simplified to the following equivalent DFA:
[State diagram of the simplified DFA: three states corresponding to {q0}, {q0, q1} and the final
state {q0, q2}, with the transitions described above.]
The number of states of this DFA cannot be reduced any further. There are examples where
the DFA constructed in the proof of the Rabin-Scott theorem cannot be simplified at all, i. e., if
the given NFA has n states, then in the worst case the DFA actually needs 2^n states.
Spontaneous transitions
An ε-NFA is a 5-tuple B = (Σ, Q, →, q0, F),
where Σ, Q, q0 and F are defined as in NFAs or DFAs and for the transition relation → it holds
that:
→ ⊆ Q × (Σ ∪ {ε}) × Q.
A transition q −ε→ q′ is called an ε-transition and is represented in the state diagrams as an
edge labelled ε from q to q′.
In order to define the acceptance behaviour of ε-NFAs, we need an extended transition relation
q =w⇒ q′. For this purpose we make two terminological remarks.
• We define for every α ∈ Σ ∪ {ε} a 2-place relation −α→ over Q, i. e. −α→ ⊆ Q × Q, such that
∀ q, q′ ∈ Q : q −α→ q′ ⇔ (q, α, q′) ∈ →.
Here we use the standard infix notation for 2-place relations, i. e. q −α→ q′ instead of
(q, q′) ∈ −α→. We call −α→ the α-transition relation.
• Therefore, we can use composition for such transition relations. For all α, β ∈ Σ ∪ {ε}
we define −α→ ∘ −β→ as follows: for q, q′ ∈ Q it holds that
q −α→ ∘ −β→ q′ ⇔ ∃ q″ ∈ Q : q −α→ q″ and q″ −β→ q′.
1.7 Definition: Let B = (Σ, Q, →, q0, F) be an ε-NFA. Then a 2-place relation =w⇒ ⊆ Q × Q is
defined inductively for every string w ∈ Σ*:
• q =ε⇒ q′ if and only if q′ is reachable from q by finitely many ε-transitions, i. e. q (−ε→)* q′;
• q =aw⇒ q′ if and only if q =ε⇒ ∘ −a→ ∘ =w⇒ q′,
where q, q′ ∈ Q, a ∈ Σ and w ∈ Σ*.
Remarks:
(i) q =ε⇒ q always holds, but q = q′ does not follow from q =ε⇒ q′.
(ii) For a string a1 ... an with ai ∈ Σ:
q =a1...an⇒ q′ ⇔ q =ε⇒ ∘ −a1→ ∘ =ε⇒ ∘ ··· ∘ =ε⇒ ∘ −an→ ∘ =ε⇒ q′
⇔ q =a1⇒ ∘ ··· ∘ =an⇒ q′
(iii) It is decidable whether the relation q =ε⇒ q′ holds for given states q, q′ ∈ Q.
The language accepted by the ε-NFA B is
L(B) = {w ∈ Σ* | ∃ q ∈ F : q0 =w⇒ q}.
Obviously NFAs are a special case of ε-NFAs. Therefore, the equivalence of NFAs and ε-NFAs
can be defined.
Example: For the input alphabet Σ = {0, 1, 2} we consider the ε-NFA B defined by the
following state diagram:
[State diagram of B: states q0, q1, q2 with self-loops labelled 0, 1 and 2 respectively,
ε-transitions from q0 to q1 and from q1 to q2; q2 is final.]
Then:
L(B) = {w ∈ {0, 1, 2}* | ∃ k, l, m ≥ 0 : w = 0^k 1^l 2^m},
i. e. k times 0, followed by l times 1, followed by m times 2.
For a given ε-NFA B, an equivalent NFA A over the same set of states is obtained by:
• q −a→_A q′ if and only if q =a⇒ q′ in B, i. e. q =ε⇒ ∘ −a→ ∘ =ε⇒ q′ in B,
• q ∈ F_A if and only if ∃ q′ ∈ F : q =ε⇒ q′.
In particular, for the empty string it holds that:
ε ∈ L(A) ⇔ q0 ∈ F_A
⇔ ∃ q ∈ F : q0 =ε⇒ q
⇔ ε ∈ L(B)
Remarks:
(iii) If there are ε-cycles in the ε-NFA B, then the set of states of the NFA A can be reduced
by first contracting the ε-cycles; every ε-cycle can be replaced by a single state.
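The ε-reachability relation q =ε⇒ q′ needed for this construction is plain graph reachability and can be computed as a fixed point. A minimal Python sketch, using "" for ε in the dict-based layout of the earlier sketches:

def eps_closure(delta, states):
    """All states reachable from `states` via epsilon-transitions.
    delta maps (state, symbol) -> set of successors; "" stands for epsilon."""
    closure, todo = set(states), list(states)
    while todo:
        q = todo.pop()
        for q2 in delta.get((q, ""), ()):
            if q2 not in closure:
                closure.add(q2)
                todo.append(q2)
    return closure

# epsilon-NFA from the example: 0-, 1-, 2-loops on q0, q1, q2 and
# epsilon-transitions q0 -> q1 -> q2.
delta = {("q0", "0"): {"q0"}, ("q0", ""): {"q1"},
         ("q1", "1"): {"q1"}, ("q1", ""): {"q2"},
         ("q2", "2"): {"q2"}}
print(eps_closure(delta, {"q0"}))  # {'q0', 'q1', 'q2'}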
Example: We apply the construction introduced above to the ε-NFA B from the previous
example. As the result we obtain the following NFA A:
[State diagram of A: self-loops 0 at q0, 1 at q1 and 2 at q2; transitions labelled 0, 1 from q0 to
q1, labelled 1, 2 from q1 to q2, and labelled 0, 1, 2 from q0 to q2; all three states are final.]
Altogether, the classes of languages accepted by DFAs, NFAs and ε-NFAs coincide; we call it
the class of finitely acceptable languages. If we want to show properties of this class, we can
choose the type of automaton best suited for the purpose.
§2 Closure properties
Now we explore under which operations the class of finitely acceptable languages is closed. For
this purpose we consider the set operations union, intersection, complement and difference, as
well as operations on languages such as concatenation and iteration (Kleene star), which were
introduced in Chapter I.
2.1 Theorem : The class of finitely acceptable languages is closed under the following opera-
tions:
1. union,
2. complement,
3. intersection,
4. di↵erence,
5. concatenation,
6. iteration.
Proof: Let L1, L2 ⊆ Σ* be finitely acceptable. Then there are DFAs Ai = (Σ, Qi, →i, q0i, Fi)
with Li = L(Ai), i = 1, 2, and Q1 ∩ Q2 = ∅. We show that L1 ∪ L2, L̄1, L1 ∩ L2, L1 \ L2, L1 · L2
and L1* are finitely acceptable. For L1 ∪ L2, L1 · L2 and L1* we shall provide accepting
ε-NFAs; this makes the task less complicated.
1. L1 ∪ L2: We construct an ε-NFA with a new initial state q0 which has ε-transitions to the
initial states q01 of A1 and q02 of A2; the final states are those of A1 and A2:
[Diagram: q0 with ε-edges to q01 (automaton A1) and q02 (automaton A2).]
2. L̄1: Consider the DFA A = (Σ, Q1, →1, q01, Q1 \ F1). Then for all w ∈ Σ* it holds that:
w ∈ L(A) ⇔ ∃ q ∈ Q1 \ F1 : q01 −w→1 q
⇔ {→1 deterministic}
¬∃ q ∈ F1 : q01 −w→1 q
⇔ w ∉ L(A1)
⇔ w ∉ L1
[Diagrams for 5. L1 · L2 and 6. L1*: for the concatenation, ε-transitions lead from the final
states of A1 to the initial state of A2; for the iteration, a new initial and final state has an
ε-transition into A1 and the final states of A1 have ε-transitions back to it.]
Remark: There is also an interesting direct construction of accepting DFAs for union and
intersection. Let A1 and A2 be as in the above proof. Then we consider the following transition
relation → ⊆ Q × Σ × Q on the Cartesian product Q = Q1 × Q2 of the sets of states:
(q1, q2) −a→ (q1′, q2′) if and only if q1 −a→1 q1′ and q2 −a→2 q2′.
With the final states F∩ = F1 × F2 the resulting DFA A∩ accepts L1 ∩ L2; with
F∪ = (F1 × Q2) ∪ (Q1 × F2) one obtains a DFA for L1 ∪ L2.
Proof: We show the statement for A∩. For any w ∈ Σ* it holds that:
w ∈ L(A∩) ⇔ ∃ (q1, q2) ∈ F∩ : (q01, q02) −w→ (q1, q2)
⇔ ∃ q1 ∈ F1, q2 ∈ F2 : q01 −w→1 q1 and q02 −w→2 q2
⇔ w ∈ L(A1) and w ∈ L(A2)
⇔ w ∈ L1 ∩ L2.                                                                  □
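A minimal Python sketch of this product construction in the dict-based DFA layout used in the earlier sketches:

def product_dfa(A1, A2, mode="intersection"):
    """Build the product DFA on Q1 x Q2. Final states are F1 x F2 for
    intersection and (F1 x Q2) u (Q1 x F2) for union."""
    Sigma = A1["Sigma"]  # both automata are assumed to share the alphabet
    Q = {(p, q) for p in A1["Q"] for q in A2["Q"]}
    delta = {((p, q), a): (A1["delta"][(p, a)], A2["delta"][(q, a)])
             for (p, q) in Q for a in Sigma}
    if mode == "intersection":
        F = {(p, q) for (p, q) in Q if p in A1["F"] and q in A2["F"]}
    else:
        F = {(p, q) for (p, q) in Q if p in A1["F"] or q in A2["F"]}
    return {"Sigma": Sigma, "Q": Q, "delta": delta,
            "q0": (A1["q0"], A2["q0"]), "F": F}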
§3 Regular expressions
With the help of regular expressions we can describe the finitely acceptable languages
inductively. For this purpose we consider a fixed alphabet Σ. A regular expression re over Σ is
built from the symbols ∅, ε and a ∈ Σ by the operators +, · and *, and it describes a language
L(re) ⊆ Σ*, defined inductively by:
• L(∅) = ∅
• L(ε) = {ε}
• L(a) = {a} for a ∈ Σ
• L((re1 + re2)) = L(re1) ∪ L(re2)
• L((re1 · re2)) = L(re1) · L(re2)
• L(re*) = L(re)*
In order to save space by omitting some brackets we define priorities for the operators: * binds
stronger than ·, and · binds stronger than +. Besides, we omit the outer brackets and use the
associativity of · and +. The concatenation dot · is often omitted.
1. The language of identifiers considered in the lexical analysis is described by the regular
expression
re1 = (a + ... + Z)(a + ... + Z + 0 + ... + 9)*.
2. The language over {b1, e1, b2, e2} used for the synchronization of two programs which share
a printer can also be described by a regular expression (we return to it in §6).
3. The suffix language {w a1 ... an | w ∈ Σ*} over an alphabet Σ = {a1, ..., an, b1, ..., bm} is
described by
re3 = (a1 + ... + an + b1 + ... + bm)* a1 ... an.
"⇒": For a regular expression re over Σ it holds that L = L(re). We show by induction over
the structure of re that L(re) is finitely acceptable.
Basis: L(∅), L(ε) and L(a) are obviously finitely acceptable for a ∈ Σ.
Induction step: Let L(re), L(re1) and L(re2) be finitely acceptable. Then L(re1 + re2), L(re1 · re2)
and L(re*) are also finitely acceptable, because the class of finitely acceptable languages is closed
under union, concatenation and iteration.
"⇐": It holds that L = L(A) for a DFA A with n states. W.l.o.g. A = (Σ, Q, →, 1, F), where
Q = {1, ..., n}. For i, j ∈ {1, ..., n} and k ∈ {0, 1, ..., n} we define
L^k_{i,j} = {w ∈ Σ* | i −w→ j and ∀ u ∈ Σ*, ∀ l ∈ Q :
if (∃ v : v ≠ ε, v ≠ w and uv = w) and i −u→ l, then l ≤ k}.
L^k_{i,j} consists of all strings w such that the automaton A moves from state i to state j while
reading the string w and, in between, passes only through states with number at most k.
Now we show by induction on k that the languages L^k_{i,j} are all regular.
k = 0: For strings from L^0_{i,j} no intermediate state can be used. Thus it holds that:
L^0_{i,j} = {a ∈ Σ | i −a→ j} if i ≠ j,
L^0_{i,j} = {ε} ∪ {a ∈ Σ | i −a→ j} if i = j.
k → k + 1: Let the languages L^k_{i,j} be regular for all i, j ∈ {1, ..., n}. Then for all
i, j ∈ {1, ..., n} it holds that:
L^{k+1}_{i,j} = L^k_{i,j} ∪ L^k_{i,k+1} · (L^k_{k+1,k+1})* · L^k_{k+1,j}.
In order to go from state i to state j, the intermediate state k + 1 is either unnecessary, in
which case L^k_{i,j} suffices for the description; or the state k + 1 is used as an intermediate
state one or several times, in which case L^k_{i,k+1} · (L^k_{k+1,k+1})* · L^k_{k+1,j} describes
the strings. Therefore, by induction on k we get: L^{k+1}_{i,j} is regular.
From the regularity of the languages L^k_{i,j} we conclude that L itself is regular, because it
holds that
L = L(A) = ⋃_{j ∈ F} L^n_{1,j}.
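The recurrence for L^k_{i,j} translates directly into a small program. The following Python sketch computes (unsimplified) regular expressions for the languages L^k_{i,j}; the string encoding with "+" for union, "eps" for ε and "nothing" for ∅ is my own convention, and the resulting expressions grow rapidly.

def dfa_to_regex(n, edges, finals):
    """edges: dict (i, j) -> set of symbols with i -a-> j, states 1..n.
    Returns a regular expression for the accepted language."""
    # Base case k = 0: single symbols, plus eps on the diagonal (index 0 unused).
    R = [["+".join(sorted(edges.get((i, j), set()) | ({"eps"} if i == j else set())))
          or "nothing"
          for j in range(n + 1)] for i in range(n + 1)]
    for k in range(1, n + 1):
        # L^{k}_{i,j} = L^{k-1}_{i,j} + L^{k-1}_{i,k} (L^{k-1}_{k,k})* L^{k-1}_{k,j}
        R = [[f"({R[i][j]})+({R[i][k]})({R[k][k]})*({R[k][j]})"
              for j in range(n + 1)] for i in range(n + 1)]
    return "+".join(f"({R[1][j]})" for j in finals)

# Two-state DFA over {a}: 1 -a-> 2, 2 -a-> 1, final state 1 (even number of a's).
print(dfa_to_regex(2, {(1, 2): {"a"}, (2, 1): {"a"}}, {1}))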
As regular languages are exactly the finitely acceptable languages, we can derive important
characteristics of regular languages from the finiteness of the sets of states of the accepting
automata. First we consider the so-called pumping lemma, which gives us a necessary condition
for a language to be regular. Let Σ be an arbitrary alphabet.
4.1 Theorem (pumping lemma for regular languages): For every regular language
L ⊆ Σ* there exists a number n ∈ N such that for every string z ∈ L with |z| ≥ n there is a
decomposition z = uvw with v ≠ ε and |uv| ≤ n, and for all i ∈ N it holds that u v^i w ∈ L,
i. e. we can "pump" the substring v and the resulting string remains in the regular language L.
In quantifier notation:
∃ n ∈ N ∀ z ∈ L with |z| ≥ n ∃ u, v, w ∈ Σ* :
z = uvw ∧ v ≠ ε ∧ |uv| ≤ n ∧ ∀ i ∈ N : u v^i w ∈ L.
[Diagram: while reading z = uvw with |z| ≥ n, the automaton repeats a state: q0 −u→ qj,
a loop qj −v→ qj, and qj −w→ qr with qr ∈ F; the loop v can be traversed i times.]
We consider a typical application of the pumping lemma, where we prove that a certain language
is not regular.
Example: L = {a^n b^n | n ∈ N} is not regular. Otherwise let n be as in the pumping lemma
and consider z = a^n b^n ∈ L. Every decomposition z = uvw with v ≠ ε and |uv| ≤ n has
v = a^k for some k ≥ 1, and then u v^0 w = a^(n−k) b^n ∉ L. Contradiction.
This example shows that finite automata cannot count unboundedly. We shall encounter further
applications of the pumping lemma in the section devoted to decidability questions.
Nerode relation
If we consider the so-called Nerode relation, we get a characteristic, i. e. necessary and sufficient,
condition for the regularity of languages L ⊆ Σ*.
4.2 Definition: The Nerode relation ≡_L of a language L ⊆ Σ* is defined on Σ* by
u ≡_L v if and only if ∀ w ∈ Σ* : (uw ∈ L ⇔ vw ∈ L).
Therefore, it holds that u ≡_L v if u and v can be extended in the same way to strings from L.
In particular, u ∈ L ⇔ v ∈ L (take w = ε).
Remark: The Nerode relation is a right congruence, i. e. it is an equivalence relation on Σ* and
u ≡_L v implies uw ≡_L vw for all w ∈ Σ*.
Given that ≡_L is an equivalence relation, we can consider the index of ≡_L, i. e. the number
of equivalence classes of ≡_L.
4.3 Theorem (Myhill and Nerode): A language L ⊆ Σ* is regular if and only if the index
of ≡_L is finite.
Proof: "⇒": Let L be regular, i. e. L = L(A) for a DFA A = (Σ, Q, →, q0, F). We introduce
the following 2-place relation ≡_A on Σ*. For u, v ∈ Σ* it holds that:
u ≡_A v if and only if there is a q ∈ Q with q0 −u→ q and q0 −v→ q,
i. e. if the automaton A moves from q0 to the same state q after having read the inputs u and v.
Note that q is uniquely defined, because A is deterministic. The relation ≡_A is an equivalence
relation on Σ* that refines ≡_L: u ≡_A v implies u ≡_L v.
Thus there are at least as many equivalence classes of ≡_A as of ≡_L. Therefore, it holds that
Index(≡_L) ≤ Index(≡_A) = number of states reachable from q0 ≤ |Q|,
which is finite.
"⇐": Let the index of ≡_L be a finite number k, with equivalence classes [u1], ..., [uk] where
u1 = ε, so that
Σ* = [u1] ∪̇ ... ∪̇ [uk].
We define the equivalence-class automaton AL = (Σ, QL, →L, qL, FL) by
QL = {[u1], ..., [uk]}, qL = [ε], FL = {[uj] | uj ∈ L} and [u] −a→L [ua] for u ∈ Σ*, a ∈ Σ.
Since ≡_L is a right congruence, →L is well defined, and [ε] −w→L [w] for all w ∈ Σ*. Thus
w ∈ L(AL) ⇔ ∃ [uj] ∈ FL : [ε] −w→L [uj]
⇔ ∃ uj ∈ L : [uj] = [w]
⇔ w ∈ L.                                                                        □
We use the method of proof of the Myhill-Nerode theorem in order to minimize the number
of states of a DFA. We mainly refer to the deterministic equivalence-class automaton AL from
the proof.
4.4 Corollary: Let L ⊆ Σ* be regular and k = Index(≡_L). Then every DFA that accepts L
has at least k states. The minimal number k is attained by the DFA AL. However, there may
exist NFAs with fewer than k states accepting L.
Proof: In the proof of the Myhill-Nerode theorem we have shown in "⇒" that every DFA
A = (Σ, Q, →, q0, F) accepting L has at least k states, since k = Index(≡_L) ≤ |Q|.
In "⇐" we have constructed the DFA AL with k states accepting L.               □
The equivalence-class automaton AL is the prototype of all DFAs accepting L with the minimal
number k of states. We can show that every other DFA that accepts L and has k states is
isomorphic to AL, i. e. we can obtain it from AL by a bijective renaming of the states.
4.5 Definition: Two DFAs or NFAs Ai = (Σ, Qi, →i, q0i, Fi), i = 1, 2, are called isomorphic if
there is a bijection β : Q1 → Q2 with the following properties:
• β(q01) = q02,
• β(F1) = {β(q) | q ∈ F1} = F2,
• ∀ q, q′ ∈ Q1 ∀ a ∈ Σ : q −a→1 q′ ⇔ β(q) −a→2 β(q′).
Note that isomorphism is an equivalence relation on finite automata. Now we show the
announced statement.
4.6 Theorem: Let L ⊆ Σ* be regular and k = Index(≡_L). Then every DFA A that accepts L
and has k states is isomorphic to AL.
Proof: Let A = (Σ, Q, →, q1, F) with L(A) = L and |Q| = k, and let AL be the equivalence-class
automaton from the Myhill-Nerode theorem with QL = {[u1], ..., [uk]} and u1 = ε. For every
i ∈ {1, ..., k} we define the state qi ∈ Q by the transition q1 −ui→ qi. We show that the mapping
β with
β([ui]) = qi
is an isomorphism from AL to A.
1. β is injective: Let qi = qj. Then it holds that q1 −ui→ qi and q1 −uj→ qi. Therefore, for all
w ∈ Σ* it holds that ui w ∈ L ⇔ uj w ∈ L. Hence ui ≡_L uj and [ui] = [uj].
2. β is surjective: this follows from (1) and the fact that k = |Q|. Thus it holds in particular
that Q = {q1, ..., qk}.
3. β(qL) = β([ε]) = β([u1]) = q1, the initial state of A.
4. β(FL) = F: [uj] ∈ FL ⇔ uj ∈ L ⇔ qj ∈ F.
5. For all i, j ∈ {1, ..., k} and a ∈ Σ:
[ui] −a→L [uj] ⇔ qi −a→ qj.
• Proof of "⇒": Let [ui] −a→L [uj]. Then by definition of →L: [ui a] = [uj]. There is an
l ∈ {1, ..., k} with qi −a→ ql. By definition of qi and ql we have the following picture:
q1 −ui→ qi −a→ ql and q1 −ul→ ql.
Given that the strings ui a and ul lead to the same state ql, it follows that ui a ≡_L ul.
Therefore, it holds that [uj] = [ui a] = [ul]. According to the choice of u1, ..., uk in AL it
follows that uj = ul, even j = l. Thus qi −a→ qj follows, as desired.
• Proof of "⇐": Let qi −a→ qj. Then we have a picture similar to the one above:
q1 −ui→ qi −a→ qj and q1 −uj→ qj.
Hence ui a ≡_L uj, so [ui] −a→L [ui a] = [uj].                                   □
Altogether: for every regular language L ⊆ Σ* with k the index of ≡_L, there is a DFA accepting
L with the minimal number k of states, and this DFA is unique up to isomorphism.
4.7 Definition: The minimal automaton for a regular language L ⊆ Σ* is the DFA which
accepts L and whose number of states equals the index of the Nerode relation ≡_L. This
minimal automaton is unique up to isomorphism.
The minimal automaton for a regular language L ⊆ Σ* can be computed algorithmically from
every DFA A = (Σ, Q, →, q0, F) accepting L by reduction. The reduction includes the following
steps: first all states that are not reachable from q0 are removed; then equivalent states are
merged. Two states q1, q2 ∈ Q are called equivalent, written q1 ~ q2, if for all w ∈ Σ*:
q1 −w→ F ⇔ q2 −w→ F,
i. e. exactly the same strings lead from q1 and from q2 into final states. If q1 and q2 are reached
from q0 by reading u and v, respectively, then
q1 ~ q2 ⇔ u ≡_L v ⇔ [u] = [v].
Comparing with the equivalence-class automaton AL we see that equivalent states must coincide
in the minimal automaton: the states q0, q1 and q2 correspond in AL to the equivalence classes
[ε], [u] and [v] with [ε] −u→L [u] and [ε] −v→L [v], and q1 ~ q2 implies [u] = [v].
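The merging step can be implemented by partition refinement. The following Python sketch uses Moore's classical algorithm, a standard technique and not necessarily the formulation intended in these notes, on the dict-based DFA layout from the earlier sketches; it assumes unreachable states have already been removed.

def minimize(A):
    """Moore-style DFA minimization: split blocks of states until the
    partition is stable; assumes all states of A are reachable."""
    P = [A["F"], A["Q"] - A["F"]]          # initial partition: final / non-final
    while True:
        def block(q):
            return next(i for i, b in enumerate(P) if q in b)
        # two states stay together iff all symbols lead to the same blocks
        sig = lambda q: tuple(block(A["delta"][(q, a)]) for a in sorted(A["Sigma"]))
        newP = []
        for b in P:
            groups = {}
            for q in b:
                groups.setdefault(sig(q), set()).add(q)
            newP.extend(groups.values())
        if len(newP) == len(P):
            return P                       # blocks = states of the minimal DFA
        P = newP

The minimal automaton then has one state per block, with the transitions induced by δ on representatives.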
§5 Decidability questions
First of all, we note that the constructions presented so far (NFA → DFA, ε-NFA → NFA,
regular expression → finite automaton and vice versa) are algorithmically computable.
Thus we can move on to decidability questions for the languages represented by finite automata
or regular expressions. Due to the constructions mentioned above we consider only languages
represented by DFAs. We consider the following problems for regular languages: the acceptance
problem ("w ∈ L(A)?"), the emptiness problem ("L(A) = ∅?"), the finiteness problem
("Is L(A) finite?"), the equivalence problem and the inclusion problem.
Proof: Acceptance problem: We apply A to the given string w and decide whether a final state
of A is reached. Thus A itself gives us the decision procedure.
Emptiness problem: Let n be the number we get by applying the pumping lemma to the regular
language L(A). Then it holds that:
(∗) L(A) = ∅ ⇔ ¬∃ w ∈ L(A) : |w| < n.
The proof of (∗): "⇒" is obvious. "⇐" by contraposition: Let L(A) ≠ ∅. If there is a string
w ∈ L(A) with |w| < n, then we have nothing to show. Otherwise there is a string w ∈ L(A)
with |w| ≥ n. By successive application of the pumping lemma with i = 0 we get a string
w′ ∈ L(A) with |w′| < n. This completes the proof.
By (∗), the emptiness problem is decided by solving the acceptance problem "w ∈ L(A)?" for
every string over the input alphabet of A with |w| < n. If the answer is always "no", then we
get "L(A) = ∅"; otherwise we get "L(A) ≠ ∅".
Finiteness problem: Again let n be the pumping-lemma number for L(A). Then it holds that:
(∗∗) L(A) is finite ⇔ ¬∃ w ∈ L(A) : n ≤ |w| < 2n.
The proof of (∗∗): "⇒": If there were a string w ∈ L(A) with |w| ≥ n, then L(A) would be
infinite by the pumping lemma. "⇐" by contraposition: Let L(A) be infinite. Then there are
strings of arbitrary length in L(A), in particular a string w with |w| ≥ 2n. By applying the
pumping lemma successively with i = 0 we get a string w′ with n ≤ |w′| < 2n, because with
i = 0 the given string is shortened by at most n letters each time.
By (∗∗), the finiteness problem can be decided by solving the acceptance problem a finite
number of times.
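In practice the emptiness test is usually implemented not via the pumping-lemma bound but as a simple reachability check: L(A) = ∅ iff no final state is reachable from q0. A minimal sketch in the dict-based DFA layout used above (the layout itself is my own convention):

def is_empty(A):
    """L(A) is empty iff no final state is reachable from the initial state."""
    seen, todo = {A["q0"]}, [A["q0"]]
    while todo:
        q = todo.pop()
        if q in A["F"]:
            return False
        for a in A["Sigma"]:
            q2 = A["delta"][(q, a)]
            if q2 not in seen:
                seen.add(q2)
                todo.append(q2)
    return True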
Equivalence problem: Using the closure constructions we can build from A1 and A2 a DFA A
with L(A) = (L(A1) \ L(A2)) ∪ (L(A2) \ L(A1)); then L(A1) = L(A2) ⇔ L(A) = ∅.
Hence the equivalence problem for automata is reduced to the emptiness problem for A. The
construction of A described above is tedious to implement. Alternatively, we can use the
following product construction, which is similar to the last remark in section 2. Let
Ai = (Σ, Qi, →i, q0i, Fi), i = 1, 2. Then consider Q = Q1 × Q2 with the transition relation
→ ⊆ Q × Σ × Q where
(q1, q2) −a→ (q1′, q2′) if and only if q1 −a→1 q1′ and q2 −a→2 q2′
holds. Choosing the final states F = F1 × (Q2 \ F2), the product DFA A satisfies
L(A) = L(A1) \ L(A2). Due to
L(A1) ⊆ L(A2) ⇔ L(A) = ∅
the inclusion problem for A1 and A2 can be reduced to the emptiness problem of A, and since
L(A1) = L(A2) holds iff both inclusions hold, so can the equivalence problem. Choosing
F = F1 × F2 instead yields L(A) = L(A1) ∩ L(A2),
so that the intersection problem of A1 and A2 can likewise be reduced to the emptiness problem
of A. For this A we can use the product construction A∩ from the last remark in section 2.   □
§6 Automatic verification
Automatic verification can be carried out for programs that are represented by finite automata.
In general, the verification problem reads: given a program P and a specification S, decide
whether P satisfies S.
Example: Consider again the synchronization of printer usage. The forbidden behaviours, in
which P1 and P2 print simultaneously, are exactly the strings containing b1 b2 or b2 b1 as a
substring; they are described by the regular expression
re = Σ* b1 b2 Σ* + Σ* b2 b1 Σ*.
Allowing the operators complement and Σ in regular expressions makes such specifications
simpler to write; due to the closure properties of regular languages, we certainly do not leave
the class of regular languages. Hence for every given finite automaton A for the synchronization
of printer usage by the two programs we can decide whether L(A) ∩ L(re) = ∅ holds, i. e.
whether A admits no forbidden behaviour.
A more ambitious variant is the synthesis problem:
Given: a specification S.
Searched: a program P that satisfies S.
Chapter III
Context-free languages and push-down automata
In the previous chapter we have seen that regular languages have multiple applications in
computer science (e.g. lexical analysis and substring recognition) and are particularly easy to
handle (representability by finite automata and regular expressions, good closure and
decidability properties). However, regular languages do not suffice for an important task of
computer science, namely the syntax description of programming languages.
The reason is that programming languages admit bracket structures of arbitrary nesting-depth,
for example arbitrarily deeply nested expressions of the form ( ... ( ... ) ... ).
In section 4 we have shown with the help of the pumping lemma that the simplest example of
such a bracket structure, namely the language
L = {a^n b^n | n ∈ N},
is no longer regular. Context-free languages are used for the syntax description of programming
languages.
§1 Context-free grammars
1.1 Definition: A context-free grammar is a 4-tuple G = (N, T, P, S), where
• N is a finite set of non-terminal symbols,
• T is a finite set of terminal symbols with N ∩ T = ∅,
• P ⊆ N × (N ∪ T)* is a finite set of productions, written A → u, and
• S ∈ N is the start symbol.
If A → u1, A → u2, ..., A → uk
are all productions with left-hand side A, we combine them to
A → u1 | u2 | ... | uk
or also
A ::= u1 | u2 | ... | uk. (∗)
Thus | is a "metasymbol" for the disjunction of the alternatives u1, ..., uk; it may not occur in
N ∪ T.
If the productions of a context-free grammar are represented in the form (∗), we speak of the
Backus-Naur form, or BNF notation for short. This notation was introduced in 1960 by John
Backus and Peter Naur to define the programming language ALGOL 60. The extended BNF
notation, also called EBNF, allows further abbreviations. The EBNF notation can be translated
one-to-one into the syntax diagrams introduced in 1970 by Niklaus Wirth to define the
programming language PASCAL.
By ⊢*_G we denote the reflexive transitive closure of ⊢_G. We read x ⊢*_G y as "y can be
derived from x". The language generated by G is
L(G) = {w ∈ T* | S ⊢*_G w}.
Example:
(1) The language L = {a^n b^n | n ∈ N} is generated by the grammar G1 = ({S}, {a, b}, P1, S)
with the production set P1:
S → ε | aSb.
(2) The arithmetic expressions with variables a, b, c and operators + and * are generated by
the grammar G2 = ({S}, {a, b, c, +, *, (, )}, P2, S) with the following P2:
S → a | b | c | S + S | S * S | (S).
Consider a derivation of a string w from a non-terminal A in G:
A = z0 ⊢_G z1 ⊢_G ··· ⊢_G zn = w. (∗∗)
This derivation is called a leftmost derivation if in every derivation step zi ⊢_G zi+1 the leftmost
non-terminal symbol is replaced, i. e. if zi and zi+1 always have the form zi = u B y and
zi+1 = u v y with u ∈ T*, B → v ∈ P and y ∈ (N ∪ T)*.
1.4 Definition: A derivation tree from A to w in G is a tree with the following properties:
(i) Every node is labelled with a symbol from N ∪ T ∪ {ε}. The root is labelled with A, and
every internal node is labelled with a symbol from N.
(ii) If an internal node labelled with B has k children, which are labelled with the symbols
β1, ..., βk from left to right, then B → β1 ... βk ∈ P. A single child labelled ε corresponds
to a production B → ε.
(iii) The string w results from concatenating the symbols at the leaves from left to right.
[Illustration: a derivation tree with root A, an internal node B with children β1, ..., βk, another
internal node with a single child ε, and leaves spelling w.]
n → n + 1: Consider a derivation of length n + 1 from A to w. By the induction hypothesis the
first n steps correspond to a derivation tree from A whose leaves, read from left to right, spell
w1 B w2. If the last step applies the production B → ε, a single child labelled ε is attached to
the node B; if it applies B → β1 ... βk, the children β1, ..., βk are attached. In both cases the
result is a derivation tree from A to w.
Example: Derivation trees for the derivations mentioned in the previous example are:
[Tree 1 (G1): root S with children a, S, b; the inner S again has children a, S, b; the innermost S
has the single child ε, yielding a^2 b^2. Tree 2 (G2): root S with children S, *, S; the left S
derives (S) with S + S inside, deriving a + b; the right S derives c, yielding (a + b) * c.]
Remark: There exists the following relationship between derivations and derivation trees:
(i) every derivation from A to w corresponds to exactly one derivation tree from A to w;
(ii) in general, several derivations from A to w correspond to a given derivation tree from A
to w. However, among them there is exactly one leftmost derivation and one rightmost
derivation.
Proof:
(ii) Derivation trees abstract from the unimportant order of rule application when several
non-terminal symbols occur simultaneously. For example, both derivations
S ⊢_G2 S + S ⊢_G2 a + S ⊢_G2 a + b
and
S ⊢_G2 S + S ⊢_G2 S + b ⊢_G2 a + b
result in the same derivation tree:
[Tree: root S with children S, +, S; the left S derives a, the right S derives b.]
If we fix leftmost or rightmost derivations respectively, then such alternatives are not
possible.                                                                       □
For the use of a programming language PL it is important that every PL-program has an
unambiguous semantics. Therefore, for every PL-program there should exist exactly one
derivation tree.
1.5 Definition: A context-free grammar G is called unambiguous if every string w ∈ L(G) has
exactly one derivation tree (equivalently: exactly one leftmost derivation). Otherwise G is
called ambiguous.
Example: The grammar G2 for arithmetic expressions given above is ambiguous: for the string
a + b * c ∈ L(G2) there exist the following two derivation trees:
[Tree 1: root S with children S, +, S, where the left S derives a and the right S derives b * c.
Tree 2: root S with children S, *, S, where the left S derives a + b and the right S derives c.]
The two trees correspond to the two readings a + (b * c) and (a + b) * c.
For the evaluation of arithmetic expressions one usually requires:
• The operator * binds stronger than +. Thus a + b * c is evaluated as a + (b * c).
• The evaluation takes place from left to right. Thus, for example, a + b + c is evaluated as
(a + b) + c.
If we want to have another evaluation order, we must explicitly put brackets ( and ).
These requirements are met by the grammar G3 with the non-terminals E (expression), T (term)
and F (factor), the start symbol E and the productions
E → T | E + T
T → F | T * F
F → (E) | a | b | c
G3 is unambiguous, and it holds that L(G3) = L(G2). For example, a + b * c has the following
derivation tree in G3:
[Derivation tree in G3: root E with children E, +, T; the left E derives T → F → a; the right T
derives T * F, where T → F → b and F → c.]
§2 Pumping Lemma
There also exists a pumping lemma for context-free languages. It provides a necessary condition
for a given language to be context-free.
2.1 Theorem (pumping lemma for context-free languages): For every context-free
language L there exists a number n ∈ N such that every string z ∈ L with |z| ≥ n has a
decomposition z = uvwxy with:
(i) vx ≠ ε,
(ii) |vwx| ≤ n,
(iii) for all i ∈ N: u v^i w x^i y ∈ L.
Hence we can "pump" the substrings v and x an arbitrary number of times without leaving the
context-free language L.
To prove the pumping lemma we need the following general lemma about trees, which we then
apply to derivation trees. For finite trees t we define:
• t has branching factor k if every node of t has at most k children.
• A path of length m in t is a sequence of m edges from the root to a leaf of t. The trivial
case m = 0 is allowed.
2.2 Lemma: Let t be a finite tree with branching factor k, where every path has length ≤ m.
Then the number of leaves of t is ≤ k^m.
Proof: Induction on m ∈ N: For m = 0 the tree consists of a single leaf. For m + 1, the root
has j ≤ k subtrees t1, ..., tj, in each of which every path has length ≤ m; by the induction
hypothesis each ti has ≤ k^m leaves, so t has ≤ k · k^m = k^(m+1) leaves.                    □
Proof of the pumping lemma: Let G = (N, T, P, S) be a context-free grammar with L(G) = L.
Let:
• k = the length of the longest right-hand side of a production from P, but at least 2,
• m = |N|,
• n = k^(m+1).
Now let z ∈ L with |z| ≥ n. Then there is a derivation tree t from S to z in G with no part
corresponding to a derivation of the form B ⊢*_G B:
[Illustration: a derivation tree from S to z containing a part in which a non-terminal B derives
itself again, B ⊢*_G B.]
Every part like this could actually be removed from t without changing the derived string z.
By the choice of k and since |z| ≥ n, the tree t has branching factor ≤ k and at least k^(m+1)
leaves. Therefore, according to the previous lemma there is a path of length ≥ m + 1 in t.
There are ≥ m + 1 internal nodes on this path, so a non-terminal symbol is repeated among
the labels of these nodes. We need this repetition in a special form.
By a repetition tree in t we denote a subtree of t where the label of the root is repeated in
some other node. Now we choose a minimal repetition tree t0 in t, i. e. one that contains no
other repetition tree (as a proper subtree). In t0 every path has length ≤ m + 1.
[Illustration: in t the minimal repetition tree t0 has root A and contains another node labelled A;
accordingly z decomposes as z = uvwxy, where the outer A derives vwx and the inner A
derives w.]
We show that this decomposition of z satisfies the conditions of the pumping lemma:
(i) vx ≠ ε, because t contains no part corresponding to a derivation of the form B ⊢*_G B;
otherwise the two nodes labelled A would form such a part.
(ii) By the choice of t0 and the preceding lemma it holds that |vwx| ≤ k^(m+1) = n.
(iii) Since S ⊢*_G uAy, A ⊢*_G vAx and A ⊢*_G w (∗), it follows immediately that for all
i ∈ N it holds that u v^i w x^i y ∈ L(G):
[Illustration: replacing the inner A-subtree by a copy of the subtree below the outer A pumps v
and x; iterating yields derivation trees for u v^i w x^i y.]                     □
Just as for regular languages, the above pumping lemma can be used to prove that a certain
language is not context-free.
The pumping lemma allows us to show that context-free grammars are not sufficient for a
complete description of the syntax of advanced programming languages such as Java. Although
the basic structure of syntactically correct programs is described by context-free grammars
(in BNF or EBNF notation), there exist side conditions which are context-sensitive.
Example: The programming language Java, i. e. the language of all syntactically correct Java
programs, is not context-free. We give a proof by contradiction.
Hypothesis: Let Java be context-free. Then there is an n ∈ N with the properties mentioned in
the pumping lemma. Now let us consider the following syntactically correct Java class, where
X1...1 stands for the identifier X followed by n ones:
class C {
int X1...1;
void m() {
X1...1 = X1...1
}
}
In every uvwxy-decomposition of this program the vwx-part touches at most two of the three
occurrences of the variable X1...1, because |vwx| ≤ n. Therefore, while pumping to
u v^i w x^i y, there appear either character strings which do not fulfill the requirements of the
Java syntax diagrams, or character strings of the form
class C {
int X1...1;        with k ones,
void m() {
X1...1 = X1...1    with l and m ones, respectively,
}
}
where k, l, m are not all equal. These character strings violate the following condition for
syntactically correct Java programs:
(∗∗) Every variable used in a program must be declared.
Here either the variable with l ones or the one with m ones (or both) is not declared.
Contradiction.                                                                  □
Conditions such as (∗∗) are context-sensitive and are therefore stated separately, together with
the context-free basic structure of programs, when programming languages are defined. A
compiler checks these context-sensitive conditions using appropriate tables for storing the
declared variables and general identifiers.
§3 Push-down automata
So far we have defined context-free languages by the fact that they can be generated by
grammars. Now we want to consider the recognizability of such languages by automata. Our
goal is to extend the model of the finite automaton in such a way that it can recognize the
context-free languages.
The weak point of finite automata is that they lack memory to store an unlimited amount of
data. Obviously, a finite automaton cannot recognize a language such as
L = {a^n b^n | n ∈ N},
because at the moment the first b is read, it no longer knows how many a's have been read.
The only information saved is the current state, i. e. an element of a finite set of states.
Now we consider the so-called push-down automata. These are nondeterministic finite automata
with ε-transitions (ε-NFAs), extended with a memory which can store arbitrarily long strings
but can be accessed only in a very restricted way. The memory is organized as a stack
(pushdown list) with access only to the topmost symbol. The transitions of a push-down
automaton may depend on the current state, the input symbol read, and the topmost symbol
of the stack; they change the state and the contents of the stack.
Sketch: [input string (e.g. a a a b b b) read left to right; finite control; stack with access to
the topmost symbol.]
A push-down automaton (in short PDA) is a 7-tuple K = (Σ, Q, Γ, →, q0, Z0, F), where Σ is the
input alphabet, Q the finite set of states, Γ the stack alphabet, q0 ∈ Q the initial state, Z0 ∈ Γ
the initial stack symbol, F ⊆ Q the set of final states, and → a finite transition relation with
transitions of the form (q, Z) −α→ (q′, γ), where q, q′ ∈ Q, α ∈ Σ ∪ {ε}, Z ∈ Γ and γ ∈ Γ*.
In order to define the acceptance behaviour of PDAs we must, just as for ε-NFAs, first extend
the transition relation. For this purpose we need the notion of a configuration of a PDA: a pair
(q, γ) consisting of the current state q ∈ Q and the current stack contents γ ∈ Γ*. A transition
(q, Z) −α→ (q′, γ) applied to a configuration (q, Zβ) yields (q′, γβ). The extended relation
(q, γ) =w⇒ (q′, γ′) for strings w ∈ Σ* is defined inductively such that
(i) (q, γ) −α→ (q′, γ′) implies (q, γ) =α⇒ (q′, γ′), and
(ii) (q, γ) =u⇒ (q′, γ′) and (q′, γ′) =v⇒ (q″, γ″) imply (q, γ) =uv⇒ (q″, γ″).
Definition:
(i) K accepts w if ∃ q ∈ F ∃ γ ∈ Γ* : (q0, Z0) =w⇒ (q, γ).
The language accepted (or recognized) by K is
L(K) = {w ∈ Σ* | K accepts w}.
(ii) K accepts w with empty stack if (q0, Z0) =w⇒ (q, ε) for some q ∈ Q; the language accepted
with empty stack is L_ε(K) = {w ∈ Σ* | ∃ q ∈ Q : (q0, Z0) =w⇒ (q, ε)}.
Example: Let L = {a^n b^n | n ∈ N}. L is accepted by the PDA K with the states q0 (initial
and final), q1 and q2, the stack alphabet {Z, a} with initial stack symbol Z, and the following
five transitions:
1: (q0, Z) −a→ (q1, aZ)
2: (q1, a) −a→ (q1, aa)
3: (q1, a) −b→ (q2, ε)
4: (q2, a) −b→ (q2, ε)
5: (q2, Z) −ε→ (q0, ε)
K accepts a^n b^n for n ≥ 1 by using 2n + 1 transitions, labelled with their numbers 1 to 5:
(q0, Z) −a→1 (q1, aZ) −a→2 (q1, aaZ) ... −a→2 (q1, a^n Z)
−b→3 (q2, a^(n−1) Z) −b→4 (q2, a^(n−2) Z) ... −b→4 (q2, Z)
−ε→5 (q0, ε)
Thus the relation (q0, Z) =a^n b^n⇒ (q0, ε) holds for n ≥ 1. For n = 0 it trivially holds that
(q0, Z) =ε⇒ (q0, Z). Because q0 is a final state of K, it follows that L ⊆ L(K).
For the inclusion L(K) ⊆ L we must analyse the transition behaviour of K. For this purpose
we index the transitions as above. We can see that K is deterministic, i. e. while sequentially
reading the characters of a given input string, at most one transition is applicable at each step.
Namely, the following transition sequences are possible, where n ≥ m ≥ 0:
• (q0, Z) −a→1 (−a→2)^n (q1, a^(n+1) Z)
• (q0, Z) −a→1 (−a→2)^n −b→3 (−b→4)^m (q2, a^(n−m) Z)
• (q0, Z) −a→1 (−a→2)^n −b→3 (−b→4)^n −ε→5 (q0, ε)
Therefore, K accepts only strings of the form a^n b^n. Altogether, it holds that
L(K) = L.
Furthermore, note that the automaton K accepts all strings a^n b^n with the empty stack,
except for the case n = 0: L_ε(K) = {a^n b^n | n ≥ 1}.
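The deterministic behaviour of K is easy to simulate. A minimal Python sketch of exactly this automaton (the list-based stack with its top at the end is my own encoding; the transition numbers refer to the table above):

def run_K(w):
    """Deterministic PDA for {a^n b^n}: returns (state, stack) after reading w,
    or None if K gets stuck."""
    state, stack = "q0", ["Z"]          # initial configuration (q0, Z)
    for c in w:
        top = stack[-1] if stack else None
        if state == "q0" and c == "a" and top == "Z":
            state, stack = "q1", stack + ["a"]        # transition 1
        elif state == "q1" and c == "a" and top == "a":
            stack = stack + ["a"]                     # transition 2
        elif state in ("q1", "q2") and c == "b" and top == "a":
            state, stack = "q2", stack[:-1]           # transitions 3 and 4
        else:
            return None
    if state == "q2" and stack == ["Z"]:
        state, stack = "q0", []                       # spontaneous transition 5
    return state, stack

def accepts(w):                          # acceptance by the final state q0
    r = run_K(w)
    return r is not None and r[0] == "q0"

print(accepts("aabb"), accepts("aab"))   # True False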
Now we want to derive general properties of push-down automata. For this purpose we often
need the following Top Lemma. Intuitively, it states that changes at the top of the stack do
not depend on the rest of the stack.
Top Lemma: If (q, Z) =w⇒ (q′, γ), then for all β ∈ Γ* it also holds that
(q, Zβ) =w⇒ (q′, γβ).
Proof: Exercise.                                                                □
3.5 Theorem:
(1) For every PDA A we can construct a PDA B with L(A) = L_ε(B).
(2) For every PDA A we can construct a PDA B with L_ε(A) = L(B).
Proof: (1): B simulates A on top of a new bottom symbol # and empties its stack in a new
state qε as soon as A has reached a final state. Formally, B has the states of A together with
new states qB (initial) and qε, where qB, qε ∉ Q, and the stack alphabet Γ ∪ {#} with # ∉ Γ.
Then for q ∈ F it holds that
(q0, Z0) =w⇒_A (q, γ)
if and only if
(qB, #) −ε→_B (q0, Z0#) =w⇒_A (q, γ#) =ε⇒_B (qε, ε).
(For the "if-then" direction we use the Top Lemma.) Analysing the applicability of the new
transitions in B we get L(A) = L_ε(B).
(2): Idea of the proof: B works like A, but uses an additional symbol # to mark the bottom
of the stack. As soon as A has emptied its stack, B reads the symbol # and moves to a final
state. The exact construction of B is left as an exercise.                      □
Now we want to show that the nondeterministic push-down automata accept exactly the
context-free languages (with empty stack). First, for a given context-free grammar G, we
construct a push-down automaton which represents a nondeterministic "top-down" parser for
the language L(G).
3.6 Theorem: For every context-free grammar G we can construct a nondeterministic
push-down automaton K with L_ε(K) = L(G).
Proof: Let G = (N, T, P, S). We construct K in such a way that it simulates the leftmost
derivations in G:
K = (T, {q}, N ∪ T, →, q, S, ∅),
where → consists of the transitions
(1) (q, A) −ε→ (q, u) for every production A → u ∈ P (expand the topmost non-terminal),
(2) (q, a) −a→ (q, ε) for every a ∈ T (match the topmost terminal against the input).
To show that L(G) = L_ε(K) holds, we consider more carefully the relationship between
leftmost derivations in G and transition sequences in K. For this purpose we use the following
abbreviations for strings w ∈ (N ∪ T)*: w_T denotes the longest prefix of w consisting only of
terminal symbols, and w_R the corresponding rest, so that w = w_T w_R.
Statement 1: For every leftmost derivation
A ⊢_G ... ⊢_G w (n times)
it holds that (q, A) =w_T⇒ (q, w_R).
Proof by induction on n:
n = 0: Then w = A, thus w_T = ε and w_R = A. Trivially (q, A) =ε⇒ (q, A) holds.
n → n + 1: Consider a leftmost derivation
A ⊢_G ... ⊢_G w̃ = w̃_T B v ⊢_G w̃_T u v = w (n + 1 steps)
with B → u ∈ P. By the induction hypothesis (q, A) =w̃_T⇒ (q, B v); with one transition of
type (1) and matching transitions of type (2) we then reach (q, w_R). Because
w_T = (w̃_T u v)_T = w̃_T (u v)_T and w_R = (w̃_T u v)_R = (u v)_R hold, in total we get
(q, A) =w_T⇒ (q, w_R).
Statement 2: Conversely, if (q, A) =u⇒ (q, γ) for u ∈ T* and γ ∈ (N ∪ T)*, then A ⊢*_G u γ
by a leftmost derivation.
Proof by induction on the number m of transitions used, analysing the last transition.
Case: the last transition is of type (2) with a ∈ T. Then it has the form
(q, γm) = (q, a v) −a→ (q, v) = (q, γ(m+1)),
and the claim follows from the induction hypothesis; the case of a type (1) transition is
analogous.
In particular, from statements 1 and 2 it follows for all strings w ∈ T* that:
S ⊢*_G w if and only if (q, S) =w⇒ (q, ε)
if and only if K accepts w with the empty stack.                                □
Example: We consider again the language L = {a^n b^n | n ∈ N}. We have already seen that
this language is generated by the context-free grammar G1 = ({S}, {a, b}, P1, S), where P1
consists of the productions
S → ε | aSb,
i. e. L(G1) = L. The construction used in the proof above gives us the push-down automaton
K1 = ({a, b}, {q}, {S, a, b}, →, q, S, ∅) with the transitions
(q, S) −ε→ (q, ε), (q, S) −ε→ (q, aSb), (q, a) −a→ (q, ε), (q, b) −b→ (q, ε).
It follows from the proof that L_ε(K1) = L(G1). To illustrate this, consider the transition
sequence of K1 while accepting a^2 b^2:
(q, S) −ε→ (q, aSb) −a→ (q, Sb)
−ε→ (q, aSbb) −a→ (q, Sbb)
−ε→ (q, bb) −b→ (q, b) −b→ (q, ε).
Now, for every given push-down automaton, we construct a corresponding context-free grammar.
3.7 Theorem: For every push-down automaton K we can construct a context-free grammar G
with L(G) = L_ε(K).
Proof: Let K = (Σ, Q, Γ, →, q0, Z0, F). As non-terminals of G we use triples [q, Z, q′] with
q, q′ ∈ Q and Z ∈ Γ, together with a new start symbol S. The idea is:
(1) All strings w ∈ Σ* which K can read from the configuration (q, Z), emptying the stack
and ending in the state q′, should be generated in G from [q, Z, q′]: (q, Z) =w⇒ (q′, ε).
(2) Thus a transition (q, Z) −α→ (r0, Z1 ... Zk) of K with k ≥ 1 is simulated in G by the
productions
[q, Z, rk] → α [r0, Z1, r1] [r1, Z2, r2] ... [r(k−1), Zk, rk]
for all r1, ..., rk ∈ Q. The strings which K accepts up to the reduction of the symbol Z1
are generated from [r0, Z1, r1]; the strings accepted by K up to the reduction of the
symbol Z2 are generated from [r1, Z2, r2], and so on. The intermediate states
r1, ..., r(k−1) are those states which K reaches directly after the reduction of the symbols
Z1, ..., Z(k−1).
(3) A transition (q, Z) −α→ (q′, ε) of K yields the production [q, Z, q′] → α.
In addition, G has the productions S → [q0, Z0, q] for every q ∈ Q.
To show that L(G) = L_ε(K) holds, we consider the relationship between derivations in G and
transition sequences in K.
Statement 1: For all w ∈ Σ*: if
[q, Z, q′] ⊢_G ... ⊢_G w (n times),
then (q, Z) =w⇒ (q′, ε).
Proof by induction on n:
n = 1: From [q, Z, q′] ⊢_G w it follows (because w ∈ Σ*) that a production of type (3) in G
was used. Therefore, it holds that w = α ∈ Σ ∪ {ε} and (q, Z) −α→ (q′, ε). Thus
(q, Z) =α⇒ (q′, ε).
n → n + 1: We analyse the first step of a derivation of length n + 1, which must use a
production of type (2), say [q, Z, q′] ⊢_G α [r0, Z1, r1] ... [r(k−1), Zk, rk] with rk = q′. Then
w = α w1 ... wk, where for every i ∈ {1, ..., k}
[r(i−1), Zi, ri] ⊢_G ... ⊢_G wi (at most n times),
so that by the induction hypothesis (r(i−1), Zi) =wi⇒ (ri, ε). Together with the Top Lemma,
the transition (q, Z) −α→ (r0, Z1 ... Zk) and these sequences combine to (q, Z) =w⇒ (q′, ε).
Statement 2: For all α1, ..., αn ∈ Σ ∪ {ε}: if (q, Z) =α1⇒ ··· =αn⇒ (q′, ε), then
[q, Z, q′] ⊢*_G α1 ... αn.
Proof by induction on n:
n = 1: Then (q, Z) −α1→ (q′, ε) holds. By definition of P in G (see production type (3)) it
follows that [q, Z, q′] ⊢_G α1.
n → n + 1: The first transition of the sequence has the form (q, Z) −α1→ (r0, Z1 ... Zk),
where rk = q′. We consider the successive reduction of the stack contents Z1 ... Zk of K:
there exist transition sequences in K
(r0, Z1) −α11→ ··· −α1m1→ (r1, ε)
........................................
(r(k−1), Zk) −αk1→ ··· −αkmk→ (rk, ε) = (q′, ε)
whose labels together form α2 ... α(n+1). By the induction hypothesis and a production of
type (2) we obtain
[q, Z, q′] ⊢_G α1 [r0, Z1, r1] ... [r(k−1), Zk, rk] ⊢*_G α1 α2 ... α(n+1),
as required.
From statements 1 and 2 it follows that in G
S ⊢_G [q0, Z0, q] ⊢*_G w
for some q ∈ Q if and only if in K
(q0, Z0) =w⇒ (q, ε),
i. e. L(G) = L_ε(K).                                                            □
§4 Closure properties
Now we consider under which operations the class of context-free languages is closed. In
contrast to the regular (i. e. finitely acceptable) languages we have the following results.
4.1 Theorem: The class of context-free languages is closed under the operations
(i) union,
(ii) concatenation,
(iii) iteration,
(iv) intersection with regular languages.
However, the class of context-free languages is not closed under the operations
(v) intersection,
(vi) complement.
Proof: (i)–(iii) can be shown by simple constructions on context-free grammars.
(iv) Intersection with regular languages: here we use the representation of context-free and
regular languages by push-down automata and finite automata, respectively. Let L1 = L(K1)
for the (nondeterministic) PDA K1 = (T, Q1, Γ, →1, q01, Z0, F1) and L2 = L(A2) for the DFA
A2 = (T, Q2, →2, q02, F2). We construct from K1 and A2 the (nondeterministic) PDA
K = (T, Q1 × Q2, Γ, →, (q01, q02), Z0, F1 × F2),
where → is defined as follows:
((q1, q2), Z) −α→ ((q1′, q2′), γ′) in K
if and only if
(q1, Z) −α→1 (q1′, γ′) in K1 and q2 −α→2 q2′ in A2.
Note that in the special case α = ε the notation q2 −ε→2 q2′ for the DFA A2 simply means
q2 = q2′ (compare with the definition of the extended transition relation q −w→2 q′ for DFAs
in section 1). Thus the relation → of K models the synchronous parallel progress of the
automata K1 and A2; in the special case α = ε only K1 makes a spontaneous ε-transition,
while the DFA A2 remains in its current state.
We show that for the acceptance by final states it holds that L(K) = L(K1) ∩ L(A2) = L1 ∩ L2.
Let w = a1 ... an ∈ T*, where n ≥ 0 and ai ∈ T for i = 1, ..., n. Then it holds that:
w ∈ L(K) ⇔ ∃ (q1, q2) ∈ F, γ ∈ Γ* : ((q01, q02), Z0) =a1⇒ ··· =an⇒ ((q1, q2), γ)
⇔ ∃ q1 ∈ F1, q2 ∈ F2, γ ∈ Γ* : (q01, Z0) =a1⇒1 ··· =an⇒1 (q1, γ)
and q02 −a1→2 ··· −an→2 q2
⇔ w ∈ L(K1) and w ∈ L(A2)
⇔ w ∈ L(K1) ∩ L(A2).
(v) Intersection in general: the context-free languages are not closed under intersection with
other context-free languages. Consider
L1 = {a^m b^n c^n | m, n ≥ 0}
and
L2 = {a^m b^m c^n | m, n ≥ 0}.
It is easy to see that L1 and L2 are context-free. For example, L1 can be generated by the
context-free grammar G = ({S, A, B}, {a, b, c}, P, S) with the following P:
S → AB,
A → ε | aA,
B → ε | bBc.
However,
L1 ∩ L2 = {a^n b^n c^n | n ≥ 0},
which is not context-free (as we have shown with the help of the pumping lemma).
(vi) Complement: the context-free languages are also not closed under the complement. This
statement follows directly from (i) and (v) by De Morgan's laws: if the context-free languages
were closed under complement, then they would also be closed under intersection, because
L1 ∩ L2 is the complement of (L̄1 ∪ L̄2).
                                                                                □
While considering context-free languages it is beneficial when the rules of the underlying
context-free grammars are as simple as possible. Therefore, in this section we introduce
transformations of given context-free grammars into equivalent grammars whose rules satisfy
additional conditions.
5.1 Definition: A context-free grammar G = (N, T, P, S) is called ε-free if it contains
(i) either no ε-production A → ε at all,
(ii) or only the ε-production S → ε, and then S does not occur on the right-hand side of any
production in G.
5.2 Theorem: Every context-free grammar can be transformed into an equivalent ε-free
grammar.
Proof: Step 1: First we compute all erasable A ∈ N, i. e. all A with A ⊢*_G ε. For this purpose
we inductively compute the sets N1, N2, N3, ... of erasable non-terminal symbols:
N1 = {A ∈ N | A → ε ∈ P}
N(k+1) = Nk ∪ {A ∈ N | A → B1 ... Bn ∈ P with B1, ..., Bn ∈ Nk}
Since N1 ⊆ N2 ⊆ N3 ⊆ ... ⊆ N and N is finite, this computation reaches a fixed point: the set
of all erasable non-terminals.
Step 2: For every production
A → β1 ... βn ∈ P
with βi ∈ N ∪ T we add to P all productions
A → α1 ... αn,
where αi = βi or, in case βi is erasable, αi = ε (but not all αi = ε). Let P0 be the resulting set
of productions without any ε-productions.
Step 3: With a new start symbol S′ we set
P′ = {S′ → ε | S is erasable} ∪ {S′ → u | S → u ∈ P0} ∪ P0.
"⊇": If ε ∈ L(G) holds, then S is erasable and thus S′ → ε ∈ P′. Therefore, ε ∈ L(G′) holds
as well. Now let w ∈ L(G) − {ε}. Consider a derivation tree from S to w in G:
[Illustration: a derivation tree from S to w in which some subtrees, e.g. below an inner node A,
derive only ε.]
Pruning all maximal subtrees that derive ε leaves, at every inner node labelled B, exactly the
children that were not erased; by Step 2 the corresponding productions are in P0. The result
is a derivation tree from S′ to w in G′:
[Illustration: the pruned derivation tree from S′ to w without ε-leaves.]
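The fixed-point computation of Step 1 as a minimal Python sketch; the encoding of productions as (A, right-hand side) pairs is my own, and the example is the grammar for {a^m b^n c^n} from section 4.

def erasable(productions):
    """Fixed point N1 <= N2 <= ... of non-terminals A with A =>* eps.
    productions: list of (A, rhs) with rhs a tuple of symbols; () is eps."""
    E = set()
    changed = True
    while changed:
        changed = False
        for A, rhs in productions:
            if A not in E and all(s in E for s in rhs):
                E.add(A)          # covers A -> eps (empty rhs) as the base case
                changed = True
    return E

# S -> AB, A -> eps | aA, B -> eps | bBc
P = [("S", ("A", "B")), ("A", ()), ("A", ("a", "A")),
     ("B", ()), ("B", ("b", "B", "c"))]
print(erasable(P))  # {'A', 'B', 'S'}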
5.3 Definition: A context-free grammar G = (N, T, P, S) is in Chomsky normal form if it is
ε-free and every production (except possibly S → ε) has one of the forms
A → a or A → BC,
where A, B, C ∈ N and a ∈ T.
5.4 Theorem: Every context-free grammar can be transformed into an equivalent grammar
in Chomsky normal form.
5.5 Definition: A context-free grammar G = (N, T, P, S) is in Greibach normal form if it is
ε-free and every production (except possibly S → ε) has the form
A → a B1 ... Bk,
where k ≥ 0, A, B1, ..., Bk ∈ N and a ∈ T.
5.6 Theorem: Every context-free grammar can be transformed into an equivalent grammar
in Greibach normal form.
In section 3 we have shown that every context-free language can be recognized by a
nondeterministic push-down automaton. The question is whether we can eliminate the
non-determinism, as in the case of finite automata, i. e. whether we can always construct
equivalent deterministic push-down automata. This question is of practical importance for the
construction of parsers for context-free languages. First we define the notion of determinism
for push-down automata and context-free languages.
6.1 Definition:
(i) A PDA K is deterministic if for every state q, input symbol a ∈ Σ and stack symbol Z at
most one transition is applicable, i. e. there is at most one transition of the form
(q, Z) −a→ (q′, γ) or (q, Z) −ε→ (q′, γ).
(ii) A language L is called deterministic context-free if L = L(K) for a deterministic PDA K
(with acceptance by final states).
Example: The language PALc = {w c w^R | w ∈ {a, b}*} of palindromes with c as the symbol
in the middle is deterministic context-free. The notation w^R means that the string w should
be read backwards.
Remark: Theorem 3.5 does not hold for deterministic push-down automata: we cannot replace
L(K) (acceptance by final states) by L_ε(K) (acceptance by empty stack).
Now we show that not all context-free languages are deterministic. For this purpose we use the
following theorem.
6.2 Theorem: The deterministic context-free languages are closed under complement.
Proof sketch: We could try the same approach as for finite automata and consider the
deterministic PDA K̄ = (Σ, Q, Γ, →, q0, Z0, Q − F) for a deterministic PDA
K = (Σ, Q, Γ, →, q0, Z0, F). Unfortunately, in general it holds that L(K̄) ≠ Σ* − L(K). The
reason that not all strings from the complement of L(K) are accepted is the existence of
non-terminating computations. For example, if a transition
(q, A) −ε→ (q, AA)
is applicable once, then it is applicable again and again; then infinitely many symbols are
added to the stack and K does not terminate. If this happens for some input string w ∈ Σ*,
then w ∉ L(K). However, the same holds for K̄, i. e. w ∉ L(K̄).
Thus we must first transform K into an equivalent deterministic PDA which terminates after
finitely many steps on every input string. Such a construction is actually possible for
push-down automata: the set
{(q, A) | ∃ γ ∈ Γ* : (q, A) =ε⇒ (q, Aγ)}
can be effectively constructed for K, and the respective transitions (q, A) −ε→ (q′, γ′) can be
deleted or replaced by appropriate transitions.
6.3 Corollary: There exist context-free languages that are not deterministic.
Proof: If all context-free languages were deterministic, then by Theorem 6.2 the context-free
languages would be closed under complement, contradicting the theorem from section 4.1.   □
Example: Consider the language PAL = {w w^R | w ∈ {a, b}*, w ≠ ε},
the language of all non-empty palindromes of even length; as before, w^R means the string w
read backwards. PAL is obviously context-free: it is generated by the rules
S → aa | bb | aSa | bSb.
Now we show: PAL is not deterministic context-free.
Proof: Hypothesis: PAL is deterministic. Then, according to the two lemmas considered above,
the language
L0 = Min(PAL) ∩ L((ab)^+ (ba)^+ (ab)^+ (ba)^+)
is deterministic context-free; here (ab)^+ stands for the regular expression ab(ab)* and (ba)^+
similarly for ba(ba)*. Because all strings in L0 are palindromes of even length without a proper
prefix in PAL, it holds that
L0 = {(ab)^n (ba)^m (ab)^m (ba)^n | n > m ≥ 1}.
According to the pumping lemma there exists a number n ∈ N for L0 with the properties given
there. Then we can decompose a sufficiently long string of L0, e.g.
z = (ab)^(n+1) (ba)^n (ab)^n (ba)^(n+1),
into z = uvwxy. Because the intermediate part vwx fulfills the conditions |vwx| ≤ n and
vx ≠ ε, one can show that not all strings u v^i w x^i y with i ∈ N lie in L0. Therefore, L0 is
not even context-free, let alone deterministic context-free. Contradiction.      □
Remark: Because L0 is not context-free, the closure properties imply that context-free languages
are not closed under the operator Min.
For the practical syntax analysis of programming languages we use the deterministic
context-free languages. There are two different methods of syntax analysis: in the top-down
method we construct the derivation tree from the start symbol of the grammar downward,
and in the bottom-up method we reconstruct it from the given string upward. In both cases
we want to be able to determine the next step unambiguously by looking k symbols ahead,
for some k ≥ 0.
For the top-down method we can use the so-called LL(k)-grammars. These are context-free
grammars G = (N, T, P, S) where, for every intermediate string w1 A v in a leftmost derivation
S ⊢*_G w1 A v ⊢*_G w1 w2 = w of a string w ∈ T* from S, the part w1 A v and the first k
symbols of the remainder w2 of w determine unambiguously the next leftmost derivation step
w1 A v ⊢_G w1 u v. This LL(k)-condition can be graphically represented as follows:
[Illustration of the LL(k)-condition: in the derivation tree for w = w1 w2, the prefix w1 is
already derived; the production applied next to A is determined by the first k symbols of w2
(top-down).]
For the bottom-up method we can use the so-called LR(k)-grammars. These are context-free
grammars G = (N, T, P, S) with the following property: for every intermediate string v w2 in
the bottom-up reconstruction of a rightmost derivation S ⊢*_G v w2 ⊢*_G w1 w2 = w of a
string w ∈ T* from S, the part v and the first k symbols of the rest w2 determine unambiguously
the previous rightmost derivation step ṽ A w2 ⊢_G ṽ u w2 with ṽ u = v and A → u ∈ P. This
LR(k)-condition can be graphically represented as follows:
[Figure: derivation tree for the LR(k)-condition — reconstructed bottom-up; the already reduced part v = ṽu with the handle u for A is followed by a lookahead of k symbols into w2, where w = w1w2 ∈ T*.]
LL(k)-grammars were introduced in 1968 by P.M. Lewis and R.E. Stearns, and LR(k)-grammars in 1965 by D.E. Knuth. For the languages generated by these grammars the following relations hold:
• ⋃_{k≥0} LL(k) languages ⊊ ⋃_{k≥0} LR(k) languages = deterministic context-free languages
§7 Questions of decidability
The questions of decidability concerning context-free languages can be answered using either the representation by context-free grammars (in normal form) or the representation by push-down automata. We consider the same problems as for regular languages (compare Chapter II, Section 5).
7.1 Theorem : The membership problem, the emptiness problem and the finiteness problem for context-free languages are decidable.
Proof :
Membership problem: Consider a context-free grammar G = (N, T, P, S) in Greibach normal form and a string w ∈ T*. The question is: does w ∈ L(G) hold? The case w = ε can be decided immediately, because G is ε-free apart from a possible production S → ε: ε ∈ L(G) ⇔ (S → ε) ∈ P. Now let w ≠ ε. Then the following holds:
w ∈ L(G) ⇔ ∃n ≥ 1 : S ⊢_G … ⊢_G w (n times)
⇔ { in Greibach normal form every derivation step produces exactly one character of w }
S ⊢_G … ⊢_G w (|w| times)
In order to decide w ∈ L(G) it is thus enough to check all derivation sequences of length |w| in G. This implies the decidability of the membership problem.
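This brute-force check can be sketched as follows (a Python sketch under our own encoding of GNF productions; the function and parameter names are ours):

    def gnf_member(w, productions, start):
        # productions maps a nonterminal A to pairs (a, [B1, ..., Bk])
        # representing A -> a B1 ... Bk; in GNF each step emits one
        # terminal, so exactly |w| derivation steps are examined.
        forms = {(0, (start,))}   # (matched prefix length, pending nonterminals)
        for _ in range(len(w)):
            next_forms = set()
            for pos, stack in forms:
                if not stack or pos >= len(w):
                    continue
                head, rest = stack[0], stack[1:]
                for a, body in productions.get(head, []):
                    if a == w[pos]:
                        next_forms.add((pos + 1, tuple(body) + rest))
            forms = next_forms
        return any(pos == len(w) and not stack for pos, stack in forms)

    # Example grammar in GNF for {a^n b^n | n >= 1}:
    prods = {"S": [("a", ["S", "B"]), ("a", ["B"])], "B": [("b", [])]}
    assert gnf_member("aabb", prods, "S") and not gnf_member("abab", prods, "S")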
Emptiness problem: Consider a context-free grammar G = (N, T, P, S). The question is: does L(G) = ∅ hold? Let n be the number which belongs to the context-free language L(G) according to the pumping lemma. In the same way as for regular languages we show that
L(G) = ∅ ⇔ ¬∃w ∈ L(G) : |w| < n.
Thus we have reduced the emptiness problem to solving the membership problem for all strings w ∈ T* with |w| < n.
Finiteness problem: Consider a context-free grammar G = (N, T, P, S). The question is: is L(G) finite? Let n be as above. Then we show, in the same way as for regular languages:
L(G) is finite ⇔ ¬∃w ∈ L(G) : n ≤ |w| < 2·n
Therefore, the finiteness problem can also be decided by solving the membership problem a finite number of times. □
7.2 Theorem : The intersection problem, the equivalence problem and the inclusion problem for context-free languages are undecidable.
We can prove this theorem only after formalizing the notion of algorithm.
Another undecidability result concerns the ambiguity of context-free grammars. For the practical application of context-free grammars to describing the syntax of programming languages, it would be beneficial to have an algorithmic test for the ambiguity of context-free grammars. However, we will show later that such a test does not exist.
In practice we can easily avoid having to test for ambiguity by restricting ourselves to LR(1)-grammars or their subclasses. The LR(1)-property is algorithmically decidable, and since LR-grammars are always unambiguous (the last rightmost derivation step must always be determined unambiguously, so every string has only one rightmost derivation), the problem of ambiguity does not arise.
Chapter IV
The notion of algorithm
§1 Turing machines
Turing machines were introduced in 1936 by A.M. Turing (1912–1954). We consider an elementary model of calculating with pencil and paper. For this purpose we need a tape, on which we can write and change characters, and a finite program.
[Sketch: a two-way infinite tape of cells with a read/write head, controlled by a finite Turing program with states q ∈ Q. L means: move one square left on the tape; R: similarly to the right; S: no movement.]
1.1 Definition : A Turing machine, shortly TM, is a 6-tuple τ = (Q, Σ, Γ, δ, q0, ⊔) with the following properties:
(i) Q is a finite set of states,
(ii) Σ is the input alphabet,
(iii) Γ is the tape alphabet with Σ ⊆ Γ,
(iv) q0 ∈ Q is the initial state,
(v) ⊔ ∈ Γ − Σ is the blank symbol,
(vi) δ : Q × Γ →(part.) Q × Γ × {R, L, S} is the (partial) transition function.
It can be represented as a Turing table or Turing program:
δ : q1 a1 → q1′ a1′ P1
    …
    qn an → qn′ an′ Pn
with qi, qi′ ∈ Q, ai, ai′ ∈ Γ, Pi ∈ {R, L, S}.
A configuration of the TM describes the current state, the contents of the tape and the currently scanned cell.
• Initial configuration: the input w = w0 w1 … wn stands on an otherwise blank tape; the head scans w0 in state q0.
• Making a step: according to the transition function, the TM replaces the scanned symbol (say, w0 by a), changes its state and possibly moves the head.
Example : the following Turing table erases strokes | and finally prints 1 if their number is even, otherwise 0:
δ : q0 | → q1 ⊔ R
    q0 ⊔ → qe 1 S
    q1 | → q0 ⊔ R
    q1 ⊔ → qe 0 S
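The step relation can be animated directly; the following minimal Python simulator (our own helper, not from the notes, with '_' playing the role of the blank ⊔) runs the table above:

    def run_tm(delta, tape, state="q0", pos=0, blank="_", max_steps=10000):
        # delta maps (state, symbol) to (new_state, new_symbol, move),
        # move in {"L", "R", "S"}; the TM halts when delta is undefined.
        cells = dict(enumerate(tape))          # the finite non-blank part
        for _ in range(max_steps):
            sym = cells.get(pos, blank)
            if (state, sym) not in delta:      # final configuration reached
                break
            state, cells[pos], move = delta[(state, sym)]
            pos += {"L": -1, "R": 1, "S": 0}[move]
        used = [i for i, s in cells.items() if s != blank]
        # result function: strip leading and trailing blanks
        return "".join(cells[i] for i in range(min(used), max(used) + 1)) if used else ""

    delta = {("q0", "|"): ("q1", "_", "R"), ("q0", "_"): ("qe", "1", "S"),
             ("q1", "|"): ("q0", "_", "R"), ("q1", "_"): ("qe", "0", "S")}
    assert run_tm(delta, "||||") == "1"   # even number of strokes -> 1
    assert run_tm(delta, "|||") == "0"    # odd number of strokes  -> 0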
We consider the current state q, the current contents of the tape and the current position of the read/write head. Two kinds of notation are commonly used:
(1) The cells of the tape are numbered …, −1, 0, 1, 2, …, and the head position is given by a cell number.
(2) Only a finite part of the tape differs from ⊔; we abstract from the surrounding blanks and the cell numbers. A configuration with tape contents ⊔ ⊔ u1 … um v0 v1 … vn ⊔ ⊔ and head on v0 in state q can then be unambiguously represented as the string u1 … um q v0 v1 … vn with ui, vj ∈ Γ and m, n ≥ 0.
Note: Q ∩ Γ = ∅.
A configuration uqv thus means that the TM is in state q, the contents of the tape is uv (extended by blanks on both sides), and the first (leftmost) symbol of the string v is scanned.
K ⊢τ K′ (K′ is a successor configuration of K)
K ⊢*τ K′ if ∃K0, …, Kn, n ≥ 0 : K = K0 ⊢τ … ⊢τ Kn = K′
K ⊢+τ K′ if ∃K0, …, Kn, n ≥ 1 : K = K0 ⊢τ … ⊢τ Kn = K′
The result function ω is defined by
ω(uqv) = ũṽ,
where ũ is the shortest string with u = ⊔…⊔ũ and ṽ is the shortest string with v = ṽ⊔…⊔. Thus we eliminate q as well as the blanks at the beginning and at the end of the string; blanks located between other characters are not removed.
The function h_τ : Σ* →(part.) Γ* computed by τ is defined by
h_τ(v) = w, if there is a final configuration K ∈ K_τ with α(v) ⊢*τ K and w = ω(K);
h_τ(v) = undefined, otherwise.
We also write Res_τ for h_τ (the “result function” of τ).
[Diagram: v ∈ Σ* is mapped by α to the initial configuration α(v), which is transformed via ⊢τ … ⊢τ into a final configuration K ∈ K_τ, from which ω yields w = h_τ(v) ∈ Γ*.]
The domain and range of h_τ are
M = {v ∈ Σ* | h_τ(v) is defined} and N = {w ∈ Γ* | ∃v ∈ Σ* : h_τ(v) = w}.
(i) A partially defined function h : A* →(part.) B* is called Turing-computable if there exists a TM τ = (Q, Σ, Γ, δ, q0, ⊔) with A = Σ, B ⊆ Γ and h = h_τ, i.e. h(v) = h_τ(v) for all v ∈ A*.
(ii) T_{A,B} =def {h : A* →(part.) B* | h is Turing-computable}
(iii) Let T be the class of all Turing-computable functions (for arbitrary alphabets A, B).
The characteristic function of a set L ⊆ A* is χ_L : A* → {0, 1}; L will be called decidable if χ_L is Turing-computable.
Remarks on computability
To compute functions of several arguments we separate the arguments on the tape by a marker symbol #. If v1 = ε, then q0 initially observes the first separator. Except for this modification we can use the previous definition of computability.
Natural numbers n ∈ N are represented on the tape in unary notation:
n̂ = |^n = |…| (n times).
A function f : N → N is then regarded as computable if the string function h_f with
h_f(|^n) = |^{f(n)}
is Turing-computable.
First we define useful elementary TMs. Let Γ = {a0, …, an} be the tape alphabet with a0 = ⊔.
• Small right-machine r: makes one step to the right and halts. Turing table:
r : q0 a0 → qe a0 R
    …
    q0 an → qe an R
• Small left-machine l: makes one step to the left and then halts. Turing table:
l : q0 a0 → qe a0 L
    …
    q0 an → qe an L
• Printer-machine a for a ∈ Γ: prints the symbol a and then halts. Turing table:
a : q0 a0 → qe a S
    …
    q0 an → qe a S
Furthermore we assume that all constructed TMs have exactly one final state, i.e. there exists one state qe such that for all final configurations uqv it holds that q = qe. Obviously the TMs r, l, a fulfill this condition. Such TMs can be combined as follows:
τ1 →a τ2 means that first τ1 works. If τ1 terminates on a cell with symbol a, τ2 starts.
τ1 → τ2 means that first τ1 works. If τ1 terminates, τ2 starts.
τ1 τ2 is an abbreviation for τ1 → τ2.
τ1 →≠a τ2 means that first τ1 works. If τ1 terminates on a cell with a symbol different from a, τ2 starts.
On the basis of the given TMs we can construct flowcharts. The nodes of such a flowchart are labelled with the names of TMs; the edges are labelled with arrows of the form →a, →≠a or →. Loops are allowed. One TM τ in the flowchart is marked by an entry arrow → τ as the start-TM.
Illustration:
[Flowchart: a start-TM τ0 with an edge labelled a leading to τ1 and an edge labelled b leading to τ2; loops back to τ0 are allowed.]
A flowchart describes a “large” TM. We can obtain its Turing table as follows:
Step 1: For every occurrence of a TM τi in the flowchart construct the respective Turing table.
Step 2: Rename the states such that the tables of different occurrences become pairwise disjoint.
Step 3: Generate the table of the TM by writing all tables (in any order) one below the other.
Step 4: Combining: for each τ1 →a τ2 in the flowchart add to the table of the TM the line
qe^{τ1} a → q0^{τ2} a S,
where qe^{τ1} is the final state (renamed according to Step 2) of τ1 and q0^{τ2} the start state (renamed according to Step 2) of τ2. Similarly for τ1 →≠a τ2 and τ1 → τ2.
Example :
• Large right-machine R: first makes a step to the right; then R keeps moving to the right until a blank a0 = ⊔ is observed. Flowchart: → r with a loop labelled ≠⊔ back to r. In the combined table the loop contributes the lines
qe a1 → q0 a1 S
…
qe an → q0 an S
for all ai ≠ a0.
• Large left-machine L: analogously → l with a loop labelled ≠⊔ back to l.
Example : We construct a TM which computes the function f : N × N → N with
f(x, y) = x ∸ y = x − y, if x ≥ y, and 0 otherwise
(bounded subtraction). The initial configuration is
⊔ q0 |…| # |…| ⊔
with x strokes before and y strokes after the separator #.
Idea: erase strokes of x one by one, as long as strokes of y are still there; at the end erase the remaining strokes of y and the separator #.
[Flowchart of the subtraction machine: while y > 0 and x > 0, alternately erase one stroke on each side using the machines R, L, r, l and the printer ⊔; if y = 0, erase # and halt with the remaining x strokes as result; if x = 0, erase the remaining strokes of y and # and halt with the empty result.]
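The erasing strategy is easy to mirror in ordinary code (a Python re-implementation of the idea above, not of the flowchart itself):

    def monus(x: int, y: int) -> int:
        # tape |^x # |^y: erase one stroke of y and one stroke of x,
        # as long as both are present; the strokes left of # are the result
        tape = ["|"] * x + ["#"] + ["|"] * y
        while "|" in tape[:tape.index("#")] and tape[-1] == "|":
            tape.remove("|")   # erase the leftmost stroke (from x)
            tape.pop()         # erase the rightmost stroke (from y)
        return tape[:tape.index("#")].count("|")

    assert monus(5, 3) == 2 and monus(2, 4) == 0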
There exist many variations of the TM definition. These variations are often more convenient if we want to prove that a certain function is computable. We can show that these variations do not increase the power of the TM as defined in Definition 1.1. First we consider TMs with several tapes.
1.8 Definition : A k-tape TM τ has k tapes, each with its own read/write head, and a transition function
δ : Q × Γ^k →(part.) Q × Γ^k × {R, L, S}^k.
[Illustration for k = 3: three tapes with one head each; one step transforms state q into q′ while each head reads, writes and moves on its own tape.]
Configuration of a k-tape TM τ :
K = (u1 q v1, …, uk q vk) with u1, …, uk ∈ Γ*, q ∈ Q, v1, …, vk ∈ Γ+.
The initial configuration α_k(v) places the input v on the first tape; the other tapes are blank. The result function is
ω_k(u1 q v1, …, uk q vk) = ω(u1 q v1),
i.e. we take the result of the first tape; we treat the remaining tapes only as auxiliary tapes for the computation.
The computed function of a k-tape TM is h_τ : Σ* →(part.) Γ* with
h_τ(v) = w, if there is a final configuration K ∈ K_τ with α_k(v) ⊢*τ K and w = ω_k(K);
h_τ(v) = undefined, otherwise.
Remark: Several tapes can be used for computing functions with several arguments; in this case we distribute the input strings over the tapes appropriately.
Goal: We want to prove that every function computed by a k-tape TM can be computed by a normal 1-tape TM.
Idea of the proof: Every k-tape TM can be “simulated” by a suitably constructed 1-tape TM. Therefore, we first make the notion of simulation precise.
Notion of simulation
The notion of simulation is essential for proving that different machines have the same power.
1.9 Definition : We consider (1- or k-tape) TMs τ and τ′ over the same input alphabet Σ, with sets of configurations K_τ and K_τ′, transition relations ⊢τ, ⊢τ′, initial configurations α, α′ and result functions ω, ω′. Informally, τ is the “higher” TM and τ′ the “lower” TM. A mapping
sim : K_τ → K_τ′
is called a simulation of τ by τ′ if the following holds:
(1) α′(v) ⊢*τ′ sim(α(v)) for all v ∈ Σ*,
(2) K1 ⊢τ K2 implies sim(K1) ⊢+τ′ sim(K2),
(3) for every final configuration K of τ there is a final configuration K′ of τ′ with sim(K) ⊢*τ′ K′ and ω(K) = ω′(K′).
Illustration:
[Diagram: the computation α(v) ⊢τ … ⊢τ K of τ is mirrored below by α′(v) ⊢*τ′ sim(α(v)) ⊢+τ′ … ⊢+τ′ sim(K) ⊢*τ′ K′ of τ′, so that h_τ = h_τ′.]
1.10 Theorem (Simulation theorem): Let τ and τ′ be (1- or k-tape) TMs over the same input alphabet Σ, and let there be a simulation sim of τ by τ′. Then the functions h_τ and h_τ′ computed by τ and τ′, respectively, coincide, i.e.
∀v ∈ Σ* : h_τ(v) = h_τ′(v)
(equality here means that either both sides are undefined or both have the same value).
Proof : Let v ∈ Σ*.
Case 1: h_τ(v) is undefined. Then there is an infinite sequence of configurations
α(v) = K1 ⊢τ K2 ⊢τ K3 ⊢τ …
of τ. Therefore, by properties (1) and (2) of sim, we have an infinite sequence of configurations
α′(v) ⊢*τ′ sim(K1) ⊢+τ′ sim(K2) ⊢+τ′ sim(K3) ⊢+τ′ …
of τ′, so h_τ′(v) is also undefined.
Case 2: h_τ(v) is defined, say h_τ(v) = w. Then there is a finite sequence of configurations
α(v) = K1 ⊢τ … ⊢τ Kn, n ≥ 1,
of τ, where Kn is a final configuration and w = ω(Kn) holds. By properties (1)–(3) of sim, we have a finite sequence of configurations
α′(v) ⊢*τ′ sim(K1) ⊢+τ′ … ⊢+τ′ sim(Kn) ⊢*τ′ Kn′
of τ′, where Kn′ is a final configuration with ω(Kn) = ω′(Kn′). Thus h_τ′(v) = ω′(Kn′) = ω(Kn) = w = h_τ(v). □
1.11 Theorem : For every k-tape TM τ there is a 1-tape TM τ′ with h_τ = h_τ′.
Proof (sketch): τ′ stores the contents of the k tapes one after another on its single tape, separated by the new symbol #; the symbol currently scanned on each tape is marked by a tilde. Thus we choose Γ′ = Γ ∪ {ã | a ∈ Γ} ∪ {#}. We sketch the choice of Q′ and δ′ according to the following idea of simulation. A step of τ of the form
K1 = (u1 q1 a1 v1, …, uk q1 ak vk) ⊢τ K2 = (u1 q2 b1 v1, …, uk q2 bk vk)
generated by
δ(q1, (a1, …, ak)) = (q2, (b1, …, bk), (S, …, S))
is simulated by τ′ in 2 phases of steps.
• Reading phase:
sim(K1) = u1 q1 ã1 v1 # … # uk ãk vk
⊢τ′ u1 [read, q1] ã1 v1 # … # uk ãk vk
⊢*τ′ u1 ã1 v1 # … # uk [read, q1, a1, …, ak] ãk vk
• Changing phase: now the configuration is changed according to δ(q1, (a1, …, ak)):
⊢τ′ u1 ã1 v1 # … # uk [change, q2, b1, …, bk, S, …, S] ãk vk
⊢*τ′ u1 [change, q2] b̃1 v1 # … # uk b̃k vk
⊢τ′ sim(K2) = u1 q2 b̃1 v1 # … # uk b̃k vk
Thus a step of τ is simulated by at most as many steps of τ′ as twice the total length of the strings written on the k tapes.
At the beginning τ′ transforms its input into the encoded form, so that
α′(v) ⊢*τ′ sim(α(v))
holds. A final configuration K of τ is recognized because δ is undefined for the state and symbols observed in the reading phase. Then we must bring sim(K) to the result form of a 1-tape TM by erasing all symbols starting from the first #. Thus we can program τ′ in such a way that
sim(K) ⊢*τ′ u1 qe′ a1 v1
holds, and K′ = u1 qe′ a1 v1 is the final configuration of τ′ with ω(K) = ω′(K′).
Thus we have:
Q′ = Q ∪ {q0′, qe′}
∪ {[read, q, a1, …, aj] | q ∈ Q, a1, …, aj ∈ Γ, 0 ≤ j ≤ k}
∪ {[change, q, b1, …, bj, P1, …, Pj] | q ∈ Q, 0 ≤ j ≤ k, b1, …, bj ∈ Γ, P1, …, Pj ∈ {R, L, S}}
∪ {… further states …}
Altogether we have now shown that sim is a simulation of τ by τ′. Thus, according to the simulation theorem, it follows that h_τ = h_τ′. □
• Turing machines with several heads on one tape:
[Illustration: one tape containing F R E I B U R G with three heads scanning different cells.]
Definition of the transition function δ: as in Definition 1.8 of the k-tape TM, but the definition of a configuration is different, because k cells on the same tape must be marked.
• Turing machines with a two-dimensional computational space divided into cells (“computing on graph paper”), possibly with several heads:
[Illustration: a two-dimensional grid of cells, some filled and some blank, with one or more heads.]
For accepting languages we extend the TM by a set of final states:
τ = (Q, Σ, Γ, δ, q0, ⊔, F),
where F ⊆ Q is the set of final states and all remaining components are defined as before.
1.12 Definition :
(1) τ accepts the string w ∈ Σ* if τ applied to w halts in a final configuration uqv with q ∈ F.
(2) L(τ) = {w ∈ Σ* | τ accepts w} is the language accepted by τ.
(3) A language L ⊆ Σ* is Turing-acceptable if L = L(τ) for some TM τ, and Turing-decidable if its characteristic function χ_L : Σ* → {0, 1} is Turing-computable.
(4) If we speak about sets of languages, then by T we denote the languages accepted by Turing machines.
1.13 Theorem : A language L ⊆ Σ* is Turing-decidable if and only if both L and its complement L̄ are Turing-acceptable.
Proof : “⇒”: Let L be decidable by τ. By adding final-state transitions for the outputs 1 and 0, respectively, we obtain accepting TMs for L and for L̄.
“⇐”: Let L be accepted by τ1 and L̄ by τ2. Now we construct a 2-tape TM τ which in each step performs one step of τ1 on tape 1 and simultaneously one step of τ2 on tape 2. If τ1 accepts the string, then τ produces the value 1; otherwise τ2 accepts the string, and τ produces the value 0. Thus τ decides the language L. □
Remark : If only L is Turing-acceptable, then it does not follow that L is also Turing-decidable: the accepting TM might not terminate for strings from L̄.
• non-deterministic Turing machines: the transition function is replaced by a function
δ : Q × Γ → P(Q × Γ × {R, L, S}).
A tuple
(q′, b, R) ∈ δ(q, a)
means that the TM, in case the symbol a is read in state q, can do the following: change into state q′, overwrite a by b and move one cell to the right. If δ(q, a) contains further tuples, the TM may just as well execute one of those. The choice between these possible steps is non-deterministic, i.e. the TM can arbitrarily choose one of them. The transition relation ⊢τ of a non-deterministic TM τ is no longer right-unique. Apart from that, the language L(τ) accepted by τ is defined as for a deterministic TM.
Let r be the smallest number with
∀q ∈ Q ∀a ∈ Γ : |δ(q, a)| ≤ r and ∃q ∈ Q ∃a ∈ Γ : |δ(q, a)| = r.
We call r the degree of non-determinism of τ. For each pair (q, a) we number the possible choices from 1 to (at most) r according to δ(q, a). Then we can represent every finite sequence of non-deterministic choices as a string over the alphabet {1, …, r}.
A string v is accepted if there is at least one computation
α(v) ⊢*τ K with K accepting.
The set of all computations of τ starting in α(v) can be represented as a tree whose nodes are labelled with configurations:
[Tree: the root α(v) has successors K1, …, Kr; each Ki has successors Ki·1, …, Ki·r, and so on. Along one path the choices leading to an accepting configuration K are marked.]
Theorem : For every non-deterministic TM τ there is a deterministic TM τ′ with L(τ) = L(τ′).
Proof (sketch): τ′ uses 3 tapes.
• Tape 1 stores the input string v.
• Tape 2 is used for the systematic generation of all strings over {1, …, r}. In order to model a breadth-first search, the strings are generated in order of increasing length:
ε, 1, 2, …, r, 11, 12, …, rr, 111, …
• Tape 3 is used for the simulation of τ by τ′. First the input string v is copied onto tape 3, then a sequence of transitions of τ is simulated according to the string of encoded choices from {1, …, r} written on tape 2.
If τ has an accepting computation on v, then its string of choices is eventually generated by τ′ on tape 2, and then τ′ accepts v. If there is no accepting computation of τ on v, then τ′ runs forever, generating all strings over {1, …, r} on tape 2.
Thus L(τ) = L(τ′) holds. □
Remark : The above simulation of τ by τ′ has exponential complexity in the number of steps: in order to find an accepting computation of τ with n steps, τ′ must generate and traverse a computation tree of branching degree r and depth n, i.e. with up to r^n nodes.
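The breadth-first search over choice strings can be sketched deterministically as follows (a Python sketch with a bounded search depth in place of the potentially infinite run; all names and the toy machine are our own):

    from itertools import product

    def accepts(replay, v, r=2, max_len=12):
        # replay(v, choices) deterministically replays one computation of
        # the non-deterministic machine on input v along the given sequence
        # of choices from {1, ..., r} and reports whether it ends accepting.
        for length in range(max_len + 1):     # strings of increasing length
            for choices in product(range(1, r + 1), repeat=length):
                if replay(v, choices):
                    return True
        return False                          # the real TM would run forever

    # Toy machine of degree 2: it "guesses" the input letter by letter.
    replay = lambda v, ch: "".join("ab"[c - 1] for c in ch) == v
    assert accepts(replay, "abba")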
Summary: the following variations all have the same power as the 1-tape TM of Definition 1.1:
• k-tape TM
• TM with several heads
• 2-dimensional TM
• non-deterministic TM
§2 Grammars
In the previous chapter we became acquainted with context-free grammars. They are a special case of the Chomsky grammars introduced in 1959 by the American linguist Noam Chomsky. There are several types of such grammars (types 0–3). Here we consider the most general type 0.
Productions (u, v) ∈ P are written using arrow notation u → v, as for context-free grammars. Every grammar G has a derivation relation ⊢G on (N ∪ T)*:
x ⊢G y if there are decompositions x = x1 u x2 and y = x1 v x2 with u → v ∈ P.
By ⊢*G we again denote the reflexive and transitive closure of ⊢G. We read x ⊢* y as “y can be derived from x”. It holds that
x ⊢* y if ∃z0, …, zn ∈ (N ∪ T)*, n ≥ 0 : x = z0 ⊢G … ⊢G zn = y
2.2 Definition : The language generated by G is
L(G) = {w ∈ T* | S ⊢*G w},
i.e. we are interested only in strings over the terminal symbols; the non-terminal symbols serve as auxiliary symbols within derivations.
2.3 Theorem : Every Turing-acceptable language is a Chomsky-0-language.
Proof : Let L = L(τ) for a TM τ which, whenever it accepts, halts in a unique final state qe. We can easily see that for every TM we can provide a TM with this property which accepts the same language. Now we construct a Chomsky-0-grammar G = (N, T, P, S) with L(G) = L in 4 steps.
Step 1: A grammar G1 = (N1, T, P1, S) generates, for every w ∈ T*, the string w followed by the encoded initial configuration of τ on w, enclosed in the new marker symbols ¢ and $:
S ⊢*G1 w¢α(w)$
P1 = { S → ¢q0⊔$,
S → aA¢Ca$ for all a ∈ T,
Ca b → bCa for all a, b ∈ T,
Ca $ → Ba$ for all a ∈ T,
bB → Bb for all b ∈ T,
A¢B → ¢q0,
A¢B → aA¢Ca for all a ∈ T }
Operating principle of P1 :
S ⊢G1 ¢q0⊔$ = ε¢α(ε)$
or
S ⊢*G1 aA¢Ca$, and inductively, for an already generated w ∈ T*:
S ⊢*G1 wA¢Bw$
⊢G1 waA¢Ca w$
⊢*G1 waA¢wCa$
⊢G1 waA¢wBa$
⊢*G1 waA¢Bwa$
⊢G1 wa¢q0wa$ (using A¢B → ¢q0)
Step 2: The productions P2 simulate the step relation of τ on the configuration part between ¢ and $ (one production per transition of τ).
Step 3: Choose P3 as follows:
P3 = { ¢qe → D,
Da → D for all a ∈ Γ,
D$ → ε }
Operating principle: once τ has accepted, the configuration is erased:
w¢qe v$ ⊢G3 wDv$ ⊢*G3 wD$ ⊢G3 w.
Step 4: Combine: N = N1 ∪ N2 ∪ N3, P = P1 ∪̇ P2 ∪̇ P3 (disjoint union). □
2.4 Corollary : Let the function h : T* →(part.) T* be computed by a Turing machine. Then the graph of h, i.e. the set
graph(h) = {w#v | w, v ∈ T* and h(w) = v},
is a Chomsky-0-language.
Proof idea: By the previous theorem it suffices to give a TM τ′ accepting graph(h):
(1) τ′ leaves a given input string of the form w#v unchanged on the 1st tape.
(2) τ′ copies the part w onto the initially empty 2nd tape and then simulates τ on this tape.
(3) If τ terminates, then the result h(w) in the final configuration is compared with the part v on the 1st tape. If h(w) = v holds, then τ′ accepts the input w#v on the 1st tape; otherwise τ′ does not accept it. □
For the converse direction of Theorem 2.3 — every Chomsky-0-language L(G) is Turing-acceptable — one uses a non-deterministic 2-tape TM τ which keeps the input w on the 1st tape and works as follows: on the initially empty 2nd tape, τ iteratively generates strings over N ∪ T according to the rules from P, starting with the start symbol S. In every step τ non-deterministically chooses a substring u of the last generated string and a rule u → v from P, and then replaces u by v. If the input string w is obtained at some step, then w is accepted.
For context-sensitive grammars G = (N, T, P, S) there is the following special rule, so that ε ∈ L(G) is possible: an ε-production may only be of the form S → ε, and then S must not occur on the right-hand side of any production. All other derivations then begin with a new non-terminal symbol S′:
S → S′
S′ → …
Now we want to show that the Chomsky-3 (i.e. right-linear) languages are exactly the finitely acceptable (i.e. regular) languages from Chapter II.
From a right-linear grammar G = (N, T, P, S) we construct a non-deterministic automaton with state set N ∪ {qe}, start state S and final state qe, where qe ∉ N holds and the transition relation → for B, C ∈ N and a ∈ T is defined as follows:
• B →a C if and only if B → aC ∈ P
• B →ε qe if and only if B → ε ∈ P
Proof : Exercise. □
For a function f the following statements are then equivalent:
• f is µ-recursive,
• f is Turing-computable.
Remark : Sometimes, when dealing with partially defined recursive functions, one speaks of “partially recursive” functions and reserves “recursive” for total recursive functions. Therefore we should always make sure what exactly “recursive” means.
Church’s thesis cannot be formally proved, because “intuitively computable” is not a mathematically precise notion; it can only be confirmed by observations. However, these observations are so compelling that Church’s thesis is universally accepted. In particular it holds that:
- Recursive functions have properties, for example closure under the µ-operator (while loop), which one also expects of algorithms.
Therefore, we use the following informal but universally accepted notions for functions f and sets L: “f is computable” for “f is Turing-computable”, and “L is decidable” for “the characteristic function χ_L is computable”.
In addition to the formalizations of the notion “computable” already considered, there exist other, equivalent formalizations:
• µ-recursive functions:
Following the work of Gödel, Herbrand, Kleene and Ackermann, the set of all computable functions f : N^n →(part.) N is defined inductively. For this purpose we first introduce very simple basic functions (such as the successor function, projection functions and constant functions). Then we define how new functions are obtained from functions we already have (by the principles of composition, primitive recursion and µ-recursion).
• register machines:
A register machine has finitely many registers, each containing a natural number, which a program of elementary instructions can
– leave unchanged,
– increase by 1,
– reduce by 1, if this does not yield a negative number.
Note that 2 registers already suffice for the computation of all computable functions f : N →(part.) N (up to encoding). 2-register machines are also called 2-counter machines or Minsky machines.
• λ-calculus:
This calculus, introduced by Church, describes computation using higher-order functions, i.e. functions which take other functions as parameters. Today the λ-calculus is used to describe the semantics of functional languages.
Chapter V
Non-computable functions — undecidable problems
Is every function f : N → N computable? The answer is “no”. Below we will show that there exist non-computable functions f : N → N. More precisely:
1.1 Theorem : There exists a function f : N → N which is not Turing-computable.
Idea of the proof : The cardinality of the set of all functions f : N → N is greater (namely uncountable) than the cardinality of the set of Turing-computable functions (which is countable).
In order to prove this statement we need some preparation. Therefore, we collect definitions and results from set theory.
5. M is uncountable if N ≺ M.
6. M is finite if M ≺ N.
7. M is infinite if N ⪯ M.
Remark :
• M ⪯ N ⇔ there exists an injection φ : M → N.
1.5 Corollary : An infinite set M is countable if and only if N ∼ M, i.e. it holds that M = {φ(0), φ(1), φ(2), …} for a bijection φ : N → M.
Examples of infinite sets are
• N,
• {n ∈ N | n is even} and
• P(N).
First we show:
1.6 Lemma : The set T of all Turing-computable functions is countable.
Proof : Let us assume that we have a countable set {q0, q1, q2, …} of states and a countable set {γ0, γ1, γ2, …} of tape symbols, where γ0 = ⊔ and γ1 = |, and let the sets {γ0, γ1, γ2, …} and {q0, q1, q2, …} be disjoint. Every TM uses only finitely many states q0, …, qk and symbols γ0, …, γl, and for each such choice there are only finitely many Turing tables; so every TM can be associated with a pair (k + 1, l + 1) of natural numbers. (Remark: we add 1 to k and l because the states and tape symbols are counted from 0.)
[Diagonal scheme: the pairs (k, l) with k ≥ 0, l ≥ 1 are enumerated along the diagonals with constant k + l.]
Remark : The basic idea behind the countability of T is thus the countability of the set of all pairs {(k, l) | k ≥ 0, l ≥ 1} via the diagonal scheme. This scheme goes back to G. Cantor’s proof of the countability of N × N.
However, it holds:
1.7 Lemma : The set of all functions f : N → N is uncountable.
Proof :
Let F = {f | f : N → N}.
Hypothesis: F is countable, i.e. there is a bijection φ : N → F, so that F = {f0, f1, f2, …} with fn = φ(n) for n ∈ N holds. Now let us define g ∈ F by g(n) = fn(n) + 1 for all n ∈ N. According to the hypothesis there is an index m ∈ N with g = fm. But then it holds that fm(m) = g(m) = fm(m) + 1. Contradiction! □
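The diagonal argument itself is essentially a two-line higher-order program (a Python sketch of the idea, with names of our own choosing):

    def diagonal(f):
        # given an alleged enumeration n |-> f_n of all functions N -> N,
        # return g with g(n) = f_n(n) + 1, which differs from every f_n at n
        return lambda n: f(n)(n) + 1

    f = lambda n: (lambda x: n * x)     # sample countable family f_n(x) = n*x
    g = diagonal(f)
    assert all(g(m) != f(m)(m) for m in range(100))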
Remark :
Here we use Cantor’s diagonal method, i.e. the values fx(y) are considered on the “diagonal” x = y; the contradiction then arises at some point m:
[Table of values fx(y) for x, y = 0, 1, 2, 3, …; the diagonal entries f0(0), f1(1), f2(2), f3(3), … are marked — g differs from every fx exactly on this diagonal.]
The Lemmas 1.6 and 1.7 imply the theorem about the existence of non-computable functions.
Do there exist functions important for computer science which are non-computable?
Below we consider the “binary” alphabet B = {0, 1} and look at sets (languages) L ⊆ B* which are undecidable, i.e. for which the problem
Given : w ∈ B*
Question : Does w ∈ L hold?
is not algorithmically solvable.
First of all we consider the halting problem for Turing machines with the input alphabet B. The relevance of this problem for computer science: is it decidable whether a given program terminates for given input values?
2.1 Definition : Consider a Turing machine τ with states q0, …, qk and tape symbols γ0, …, γl. The standard encoding of such a Turing machine τ is the following string w_τ over N ∪ {]}:
w_τ = k ] l ] i1 ] j1 ] s1 ] t1 ] nr(P1) ] … ] im ] jm ] sm ] tm ] nr(Pm),
where a block i ] j ] s ] t ] nr(P) is written for each transition δ(qi, γj) = (qs, γt, P), with P ∈ {R, L, S} and nr(R) = 0, nr(L) = 1, nr(S) = 2, and the blocks are written down in lexicographical order of the pairs (i, j).
The binary encoding of τ is the string bw_τ ∈ B* generated from w_τ by the substitution
] → ε
n → 0 1^{n+1} for n ∈ N
Example : Let τ be given by the Turing table
δ : q0 0 → q1 0 R
    q0 1 → q1 1 R
    q0 ⊔ → q1 ⊔ R
Then (numbering the symbols as γ0 = 0, γ1 = 1, γ2 = ⊔)
w_τ = 1 ] 2 ] 0 ] 0 ] 1 ] 0 ] 0 ] 0 ] 1 ] 1 ] 1 ] 0 ] 0 ] 2 ] 1 ] 2 ] 0
and
bw_τ = 011 0111 01 01 011 01 01 01 011 011 011 01 01 0111 011 0111 01
(the spaces are only for readability).
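The substitution is mechanical; a small Python helper (our own illustration) reproduces bw_τ from the list of numbers in w_τ:

    def encode(numbers):
        # ] -> empty string, n -> 0 1^(n+1)
        return "".join("0" + "1" * (n + 1) for n in numbers)

    w_tau = [1, 2, 0, 0, 1, 0, 0, 0, 1, 1, 1, 0, 0, 2, 1, 2, 0]
    print(encode(w_tau))   # the string bw_tau given above, without spaces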
Remark :
1. Not every string w ∈ B* is a binary encoding of a Turing machine; for example, w = ε is not.
2.2 Definition (Special halting problem or self-application problem for Turing machines): The special halting problem or self-application problem for Turing machines is the language
K = {bw_τ ∈ B* | τ applied to bw_τ halts}.
2.3 Theorem : K is undecidable.
Proof : Hypothesis: K is decidable. Then the characteristic function
χ_K(w) = 1 if w ∈ K, 0 otherwise
is computable, say by a TM τ. From τ we construct a TM τ′ as follows:
[Flowchart of τ′: first run τ; if τ halts with result 1, enter an infinite loop (repeating the small right-machine r forever); if τ halts with result 0, halt.]
Thus it holds: τ′ gets into an infinite loop if τ halts with 1, and τ′ halts (with 0) if τ halts with 0.
Then it holds:
bw_τ′ ∈ K ⇔ τ′ applied to bw_τ′ halts ⇔ χ_K(bw_τ′) = 0 ⇔ bw_τ′ ∉ K.
Contradiction! □
Now we show a general method which lets us prove that problems are undecidable by using other problems (e.g. K) already known to be undecidable. This method is the so-called reduction.
2.4 Definition (Reduction): Let L1 ⊆ Σ1* and L2 ⊆ Σ2* be languages. Then L1 is reducible to L2, shortly L1 ≤ L2, if there is a total computable function f : Σ1* → Σ2* so that for all w ∈ Σ1* it holds that: w ∈ L1 ⇔ f(w) ∈ L2. We also write: L1 ≤ L2 using f.
2.5 Lemma : (i) If L1 ≤ L2 and L2 is decidable, then L1 is decidable. (ii) If L1 ≤ L2 and L1 is undecidable, then L2 is undecidable.
Proof : Because (ii) is the contraposition of (i), it is enough to show (i). Thus let L1 ≤ L2 using the total computable function f, and let L2 be decidable, i.e. χ_{L2} is computable. Then the composition χ_{L2} ∘ f (first apply f, then χ_{L2}) is computable. It holds that
χ_{L1} = χ_{L2} ∘ f,
so L1 is decidable. □
2.6 Definition (general halting problem for TM): The (general) halting problem for Turing machines is the language
H = {bw_τ 0 0 u ∈ B* | u ∈ B* and τ applied to u halts}.
2.7 Remark : K ≤ H holds using f(bw_τ) = bw_τ 0 0 bw_τ; thus H is undecidable.
2.8 Definition : The blank tape halting problem for Turing machines is the language
H0 = {bw_τ ∈ B* | τ applied to the blank tape halts}.
2.9 Theorem : H0 is undecidable.
Proof : We show H ≤ H0. First we describe, for a given Turing machine τ and a string u ∈ B*, a Turing machine τu which operates as follows:
• Applied to the blank tape, τu first writes u on the tape. Then τu operates like τ (applied to u).
• It is not important how τu operates if the tape is not blank at the beginning.
We can construct τu effectively from τ and u. Thus there exists a computable function f : B* → B* with
f(bw_τ 0 0 u) = bw_{τu}.
Then it holds that
bw_τ 0 0 u ∈ H
⇔ τ applied to u halts
⇔ τu applied to the blank tape halts
⇔ f(bw_τ 0 0 u) ∈ H0. □
We have become acquainted with three variants of halting problems, namely K, H and H0. All three variants are undecidable. For K we have shown this directly using a diagonal method, and for H and H0 by reduction: K ≤ H ≤ H0.
In all three variants we dealt with the question whether there exists one decision method which decides halting for every given Turing machine. Perhaps we can at least construct a decision procedure for one fixed Turing machine τ which decides whether τ halts.
2.10 Definition : Consider a Turing machine τ. The halting problem for τ is the language
H_τ = {w ∈ B* | τ applied to w halts}.
For certain Turing machines H_τ is decidable; for example, for the small right-machine r it holds that H_r = B*, and for the TM τ which repeats the small right-machine r in an infinite loop it holds that H_τ = ∅.
However, there also exist Turing machines τ for which H_τ is undecidable. These are the “programmable” or universal Turing machines.
2.11 Definition : A Turing machine τuni with the input alphabet B is called universal if for the function h_{τuni} computed by τuni the following holds:
h_{τuni}(bw_τ 0 0 u) = h_τ(u)
for all Turing machines τ with input alphabet B and all strings u ∈ B*, i.e. τuni can simulate every other Turing machine τ applied to an input string u ∈ B*.
The Turing machine τuni is an interpreter of Turing machines which is itself written as a Turing machine: given bw_τ and u, it computes h_τ(u). Such a τuni can be constructed to operate as follows:
1. τuni determines whether the input w has the form w = bw_τ 0 0 u for a Turing machine τ (with input alphabet B) and a string u ∈ B*.
2. If so, τuni maintains on its tape an inscription of the form bw_τ ¢ vl q a vr $ representing the current configuration vl q a vr of τ (initially q0 u), and looks up in bw_τ whether δ(q, a) is defined.
3. If δ(q, a) is defined, then τuni returns to the configuration vl q a vr of τ and makes the necessary transition. Usually τuni must run back and forth several times, because it can store only a finite amount of information in its state memory Q. As a result we have a new inscription of the tape of the form
bw_τ ¢ vl′ q′ a′ vr′ $,
and τuni continues as described in (2).
4. If δ(q, a) is undefined, i.e. vl q a vr is a final configuration of τ, then τuni erases the substrings bw_τ ¢ and $ and halts in a final configuration of the form
vl qe a vr.
2.12 Theorem : For every universal Turing machine τuni the halting problem H_{τuni} is undecidable.
Proof : According to the definition of τuni it holds that H ≤ H_{τuni} using f = id_{B*}. For the Turing machine τuni constructed above it even holds that H = H_{τuni}, because this τuni gets into an infinite loop if a given string w ∈ B* is not in H, i.e. does not have the form w = bw_τ 0 0 u. □
§3 Recursive enumerability
In this section we deal with a weakening of the notion of decidability (recursiveness) for languages L over an alphabet A.
3.1 Definition : L ⊆ A* is called recursively enumerable if L = ∅ or there is a total computable function φ : N → A* with L = φ(N) = {φ(0), φ(1), φ(2), …}.
Thus the difference from “countable” is that for mere countability the enumerating function need not be computable.
Lemma :
1. L is semi-decidable ⇔ L is Turing-acceptable.
2. L is decidable ⇔ L and L̄ are semi-decidable.
Proof : (1) follows from the definition of “semi-decidable”. (2) follows from the respective theorem about Turing-acceptability. □
3.2 Theorem : L is recursively enumerable ⇔ L is semi-decidable.
Proof : Without loss of generality let L ≠ ∅.
“⇒”: Let L = φ(N) for a total computable φ. The following algorithm semi-decides L:
• Input: w ∈ A*
• Apply φ to n = 0, 1, 2, … successively.
• If at some point φ(n) = w holds, halt with output 1. (Otherwise the algorithm does not halt.)
“⇐” (dovetailing): Let L be semi-decided by a TM τ, and let w0 be some fixed string from L. We enumerate L as follows:
• Input : n ∈ N
• If n is not the prime number encoding of a pair (w, k) with w ∈ A* and k ∈ N, the output is w0.
• Otherwise, determine whether τ applied to w halts within at most k steps (which means the value 1, i.e. “w ∈ L”, is produced).
If yes, the output is w.
Otherwise, the output is w0.
We show: φ(N) = L for the function φ computed by this algorithm.
“⊆” : The algorithm only ever outputs strings from L.
“⊇” : Let w ∈ L. Then there is a number of steps k ∈ N such that τ applied to w halts within k steps and produces 1 (“w ∈ L”). Let n be the prime number encoding of (w, k). Then w = φ(n) ∈ φ(N) holds. □
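The dovetailing idea — pairing every candidate string w with a step bound k — can be sketched as follows (a Python sketch over the alphabet {a, b}; the bounded semi-decider and the default string w0 are assumptions of this example):

    from itertools import count, product

    def enumerate_L(semi, w0):
        # semi(w, k): does the accepting TM halt on w within k steps?
        # w0 is some fixed string from L.  Every pair (w, k) is inspected.
        for k in count():
            for w in ("".join(p) for p in product("ab", repeat=k)):
                yield w if semi(w, k) else w0

    # Toy L: strings with equally many a's and b's, "verified" in |w| steps.
    semi = lambda w, k: k >= len(w) and w.count("a") == w.count("b")
    gen = enumerate_L(semi, "ab")
    assert "aabb" in {next(gen) for _ in range(100)}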
Summarizing, for L ⊆ A* the following statements are equivalent:
1. L is recursively enumerable.
2. L is semi-decidable.
3. L is Turing-acceptable.
4. L is Chomsky-0.
Now we show that the halting problems for Turing machines are recursively enumerable. For example, H0 is semi-decided by a TM τ0 which operates as follows:
• τ0 checks whether the input w ∈ B* is the binary encoding bw_τ of a Turing machine τ (if not, τ0 runs forever).
• If yes, then τ0 lets the Turing machine τ run on the blank tape. If τ halts, then τ0 produces the value 1. Otherwise, τ0 runs on indefinitely. □
Thus we get the following main result about the halting of Turing machines.
Theorem :
1. The problems K, H, H0 and H_{τuni} are recursively enumerable, but not decidable.
2. The complementary problems K̄, H̄, H̄0 and H̄_{τuni} are countable, but not recursively enumerable.
Proof : (1) was shown above. (2): the complements are countable as subsets of B*; if they were also recursively enumerable, i.e. semi-decidable, then by the equivalences above the problems themselves would be decidable — contradiction. □
§4 Program verification
Can the computer verify programs, i.e. determine whether a program P satisfies a given specification S? We consider the following verification problem in more detail: given (the encoding of) a Turing machine τ, does its computed function h_τ belong to a prescribed set S of functions? Rice’s theorem says that no non-trivial property of this kind is decidable:
4.1 Theorem (Rice’s Theorem): Let S be an arbitrary non-trivial subset of T_{B,B}, i.e. ∅ ⊊ S ⊊ T_{B,B}. Then the language
BW(S) = {bw_τ | h_τ ∈ S} ⊆ B*
of the binary encodings of all Turing machines τ whose computed function h_τ lies in the set S is undecidable.
Proof : In order to show the undecidability of BW(S), we reduce the undecidable problem H0 or its complement H̄0 = B* − H0, respectively, to BW(S).
First we consider an arbitrary function g ∈ T_{B,B} and an arbitrary Turing machine τ. Let τg be a TM computing g. Furthermore, let τ′ = τ′(τ, g) be the following TM depending on τ and g. When applied to a string v ∈ B*, τ′(τ, g) operates as follows: it first simulates τ on the blank tape (ignoring v); if this simulation halts, it applies τg to v.
Let Ω ∈ T_{B,B} be the totally undefined function. Then the function h_{τ′(τ,g)} computed by τ′(τ, g) is given by the following case distinction:
h_{τ′(τ,g)} = g, if τ halts on the blank tape; Ω, otherwise.
For a given g there exists a total computable function fg : B* → B* with
fg(bw_τ) = bw_{τ′(τ,g)},
i.e. fg computes the binary encoding of the Turing machine τ′(τ, g) from a given binary encoding of a Turing machine τ.
Case 1 : Ω ∉ S. Choose some function g ∈ S; this is possible because S ≠ ∅.
We show: H0 ≤ BW(S) using fg. Indeed, bw_τ ∈ H0 ⇔ h_{τ′(τ,g)} = g ∈ S ⇔ fg(bw_τ) ∈ BW(S); if τ does not halt on the blank tape, then h_{τ′(τ,g)} = Ω ∉ S.
Case 2 : Ω ∈ S. Choose any function g ∉ S; this is possible because S ≠ T_{B,B}.
We show: H̄0 ≤ BW(S) using fg. In both cases BW(S) is undecidable. □
Rice’s theorem makes clear that it is hopeless to try to determine algorithmically the semantics of Turing machines or programs, i.e. their input/output behavior, from their syntactic form.
Nevertheless, automatic program verification (also called “model checking”, where “model” stands for “program”) is a very active field of research these days. For example, it is possible to analyse programs whose behavior can be described by finite automata.
5.1 Theorem : The decision problem for Chomsky-0-grammars (given G and w ∈ T*: does w ∈ L(G) hold?) is undecidable.
Idea of the proof : With a suitable binary encoding of grammar G and string w, the halting problem for Turing machines can be reduced to it: H ≤ decision problem. For a direct proof compare also the proofs of Theorem 1.2 and Lemma 1.6.
5.2 Theorem : The derivation problem for Chomsky-0-grammars (given G and u, v ∈ (N ∪ T)*: does u ⊢*G v hold?) is undecidable.
Proof : Obviously the reduction decision problem ≤ derivation problem holds, because for all grammars G = (N, T, P, S) and strings w ∈ T* it holds that:
w ∈ L(G) ⇔ S ⊢*G w. □
Next we consider a sort of puzzle game, the Post correspondence problem. It was introduced by E. Post (1946) and is abbreviated PCP (from “Post Correspondence Problem”). The point is to construct one and the same string in two different ways.
For subsequent encoding purposes let us assume a countably infinite set SYM of symbols.
5.5 Definition : An input/instance of the PCP is a finite sequence Y of pairs of strings over an alphabet X ⊆ SYM, i.e.
Y = ((u1, v1), …, (un, vn)).
A correspondence (or solution) of Y is a sequence of indices
(i1, …, im)
with m ≥ 1 and i1, …, im ∈ {1, …, n} such that
u_{i1} u_{i2} … u_{im} = v_{i1} v_{i2} … v_{im}.
5.6 Example : Consider Y1 = ((10, 00), (1, 101), (011, 11)) over X = {0, 1}, i.e.
u1 = 10 , v1 = 00
u2 = 1 , v2 = 101
u3 = 011 , v3 = 11
Then (2, 3, 1, 3) is a correspondence of Y1, because it holds that:
u2 u3 u1 u3 = 101110011 = v2 v3 v1 v3
Even simple inputs of the PCP can have a high degree of complexity. For example, the shortest solution of
Y = ((001, 0), (01, 011), (01, 101), (10, 001))
already has 66 indices (compare Schöning, 4th edition, p. 132). We therefore consider the question whether the PCP is decidable.
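Solvability can at least be confirmed by brute force (a Python sketch of our own; the length bound max_len is an artificial cut-off, since the unrestricted search is only a semi-decision procedure):

    from itertools import product

    def pcp_search(pairs, max_len=6):
        for m in range(1, max_len + 1):
            for idx in product(range(len(pairs)), repeat=m):
                top = "".join(pairs[i][0] for i in idx)
                bottom = "".join(pairs[i][1] for i in idx)
                if top == bottom:
                    return [i + 1 for i in idx]    # 1-based, as in the text
        return None

    Y1 = [("10", "00"), ("1", "101"), ("011", "11")]
    assert pcp_search(Y1) == [2, 3, 1, 3]   # u2 u3 u1 u3 = v2 v3 v1 v3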
Theorem : Over a one-element alphabet X the PCP is decidable.
Proof : Let X = {|}. Every string |^n over X can be identified with the natural number n. Thus every input Y of the PCP over X can be considered as a sequence of pairs of natural numbers:
Y = ((u1, v1), …, (un, vn))
We show: Y is solvable ⇔ (a) there is a j with uj = vj, or (b) there are k, l with uk < vk and ul > vl.
“⇒”: Proof by contradiction: if neither (a) nor (b) holds, then all pairs (u, v) ∈ Y have the property u < v, or all pairs (u, v) ∈ Y have the property u > v. Then every string composed of the u’s would be strictly shorter, respectively strictly longer, than the corresponding string composed of the v’s. Thus Y is not solvable.
“⇐”: If uj = vj holds, the trivial sequence of indices (j) is a solution of Y. If uk < vk and ul > vl hold, then the sequence
(k, …, k, l, …, l)
with k repeated (ul − vl) times and l repeated (vk − uk) times is a solution, since both concatenations then have the length (ul − vl)·uk + (vk − uk)·ul.
Obviously the properties (a) and (b) are decidable. Thus the PCP is decidable over a one-element alphabet. □
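The decision criterion consisting of (a) and (b) translates directly into code (a Python sketch; each pair of strings |^u, |^v is represented by its pair of lengths):

    def unary_pcp_solvable(pairs):
        return (any(u == v for u, v in pairs)                 # criterion (a)
                or (any(u < v for u, v in pairs)
                    and any(u > v for u, v in pairs)))        # criterion (b)

    assert unary_pcp_solvable([(2, 3), (4, 1)])       # solvable via (b)
    assert not unary_pcp_solvable([(1, 2), (2, 4)])   # all u < v: unsolvable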
Now we consider the general case of the PCP. Our goal is the following theorem:
5.7 Theorem : The PCP over X is undecidable for every alphabet X with |X| ≥ 2.
We show the theorem by reducing the derivation problem for Chomsky-0-grammars to the PCP over {0, 1}, using the following three reductions:
derivation problem ≤ MPCP ≤ PCP ≤ PCP over {0, 1}.
Here MPCP (“modified PCP”) denotes the variant of the PCP which asks for a correspondence that starts with the first pair of strings, i.e. with i1 = 1.
5.8 Lemma : derivation problem ≤ MPCP
Proof : We give an algorithm which constructs an input Y_{G,u,v} for the MPCP for a given Chomsky-0-grammar G = (N, T, P, S) (with N ∪ T ⊆ SYM) and given strings u, v ∈ (N ∪ T)*, such that the following holds:
(1) u ⊢*G v ⇔ Y_{G,u,v} has a correspondence starting with the first pair of strings.
We use a symbol #, which does not occur in N ∪ T. Then Y_{G,u,v} consists of the following pairs of strings:
(#, #u#) — the first pair of strings,
(p, q) for every production p → q ∈ P — production pairs,
(a, a) for every a ∈ N ∪ T ∪ {#} — copy-pairs,
(v##, #) — the final pair.
The exact order of the pairs of strings in Y_{G,u,v} is irrelevant, except for the first pair of strings. Now we show that (1) holds.
“⇒”: Assume u ⊢*G v. Then there is a derivation from u to v of the form
(2) u = u0 p0 v0 ⊢G u0 q0 v0 = u1 p1 v1
⊢G u1 q1 v1 = u2 p2 v2
…
⊢G uk qk vk = v
[Figure: the correspondence of Y_{G,u,v} built from derivation (2) — the upper row of pieces reads # u0 p0 v0 # u1 p1 v1 # … # uk pk vk # v # #, the lower row # u # u0 q0 v0 # u1 q1 v1 # … # uk qk vk # #; corresponding pieces are connected by lines.]
The “angle pieces” with u and v represent the first and the final pair of strings, respectively. The substrings ui, vi, # in round frames are built character by character by copy-pairs (a, a) with a ∈ N ∪ T ∪ {#}. The substrings pi, qi in angled frames are placed by the respective pair of productions (pi, qi). The connecting lines mean that the two strings belong to the same pair of strings.
“⇐” : Now consider a correspondence of Y_{G,u,v} which begins with the first pair of strings. The form of the pairs of strings in Y_{G,u,v} implies that the correspondence must be constructed as shown under “⇒”, with the exception that the substrings ui qi vi can be repeated arbitrarily often by applying only copy-pairs (a, a):
[Figure: a sentential form ui qi vi repeated several times between #’s, produced by copy-pairs only.]
In order to complete the correspondence, we must at some point be able to place the final pair. For this purpose we must insert pairs of productions (pi, qi) in such a way that the correspondence describes a derivation (2) from u to v.
Note that “⇐” holds only because we require the correspondence to begin with the first pair of strings, in the sense of the MPCP. Without this restriction the copy-pair (#, #), for example, would give a trivial correspondence which does not imply u ⊢*G v. □
5.9 Lemma : MPCP ≤ PCP
Proof : We give an algorithm which constructs an input Y_{PCP} of the PCP for a given input Y = ((u1, v1), …, (un, vn)) of the MPCP (with ui ≠ ε for i = 1, …, n), such that the following holds:
(3) Y has a correspondence which starts with the first pair of strings
⇔ Y_{PCP} has a correspondence.
Idea : We construct Y_{PCP} in such a way that every correspondence necessarily starts with (a variant of) the first pair of strings.
For this purpose we use two new symbols # and $ which do not appear in any of the strings of Y. We define two mappings
ρ : SYM* → SYM*,
λ : SYM* → SYM*,
where
• ρ(w) is generated from w by inserting the symbol # to the right of every character of w,
• λ(w) is generated from w by inserting the symbol # to the left of every character of w.
For the mappings ρ and λ the following holds for all u, v ∈ SYM*:
1. ρ(ε) = ε and λ(ε) = ε,
2. ρ(uv) = ρ(u)ρ(v) and λ(uv) = λ(u)λ(v),
3. u = v ⇔ #ρ(u) = λ(v)#.
Statements 1 and 2 mean that ρ and λ are string homomorphisms, i.e. ε and concatenation are preserved.
From Y we construct the following Y_{PCP} with pairs of strings numbered from 0 to n + 1:
Y_{PCP} = ( (#ρ(u1), λ(v1)) , 0th pair of strings
(ρ(u1), λ(v1)) , 1st pair of strings
… , …
(ρ(un), λ(vn)) , n-th pair of strings
($, #$) ) , (n+1)-th pair of strings
Now we show that (3) holds:
“⇒”: Let (1, i2, …, im) be a correspondence of Y, i.e. u1 ui2 … uim = v1 vi2 … vim. By statement 3 it holds that:
#ρ(u1) ρ(ui2) … ρ(uim) $ = λ(v1) λ(vi2) … λ(vim) #$.
Grouping the pieces into pairs of strings of Y_{PCP}, we see that
(0, i2, …, im, n + 1)
is a correspondence of Y_{PCP}.
“⇐”: We show this direction only for the case vi ≠ ε for i = 1, …, n:
Let (i1, …, im) be some correspondence of Y_{PCP}. Then i1 = 0 and im = n + 1 hold, because only the two strings in the 0th pair (#ρ(u1), λ(v1)) start with the same symbol, and only the two strings in the (n+1)-th pair ($, #$) end with the same symbol. Let k ∈ {2, …, m} be the smallest index with ik = n + 1. Then (i1, …, ik) is also a correspondence of Y_{PCP}, because $ appears only as the last symbol of the corresponding string. The form of the pairs of strings implies that:
ij ≠ 0 for j = 2, …, k − 1.
Otherwise there would be two consecutive #’s in the substring ρ(u_{i_{j−1}}) #ρ(u1), which could not be reproduced by placing λ(vi)’s, because vi ≠ ε. Thus
#ρ(u1) ρ(ui2) … ρ(u_{i_{k−1}}) $ = λ(v1) λ(vi2) … λ(v_{i_{k−1}}) #$,
and by statements 1–3
u1 ui2 … u_{i_{k−1}} = v1 vi2 … v_{i_{k−1}}.
Thus (1, i2, …, i_{k−1}) is a correspondence of Y which starts with the first pair of strings. In the case that some vi = ε, the reasoning is more difficult. □
5.10 Lemma : PCP ≤ PCP over {0, 1}
Proof : For the reduction we use a binary encoding of the symbols of SYM, i.e. an injective computable function into {0, 1}*, of the kind we became acquainted with for the binary encoding of Turing machines. □
From the three lemmas above we obtain the undecidability of the following problems:
• MPCP,
• PCP,
• PCP over {0, 1}.
Now we can prove the results on the undecidability of context-free languages announced in Chapter III. In contrast to the regular languages the following holds:
Theorem : The intersection problem, the equivalence problem and the inclusion problem for context-free grammars are undecidable.
Proof :
Intersection problem: Consider two context-free grammars G1 and G2. The question is: does L(G1) ∩ L(G2) = ∅ hold? We show that the Post correspondence problem can be reduced to the intersection problem:
PCP ≤ intersection problem.
For an input Y = ((u1, v1), …, (un, vn)) of the PCP over X we construct G1 and G2 such that
(*) Y is solvable ⇔ L(G1) ∩ L(G2) ≠ ∅.
The idea is that G1 generates all strings which can be produced by putting ui’s one after another, and G2 generates all strings which can be produced by putting vi’s one after another. In order for the necessary relation (*) to hold, G1 and G2 must also record the indices i of the placed ui and vi. For this purpose we use n new symbols a1, …, an ∉ X and choose as the set of terminal symbols of G1 and G2:
T = {a1, …, an} ∪ X
Then put Gi = ({S}, T, Pi, S), i = 1, 2, where P1 is given by the productions
S → a1 u1 | a1 S u1 | … | an un | an S un
and P2 by
S → a1 v1 | a1 S v1 | … | an vn | an S vn.
Then
L(G1) = {a_{im} … a_{i1} u_{i1} … u_{im} | m ≥ 1 and i1, …, im ∈ {1, …, n}}
and
L(G2) = {a_{im} … a_{i1} v_{i1} … v_{im} | m ≥ 1 and i1, …, im ∈ {1, …, n}}.
Therefore (*) holds, and so does the undecidability of the intersection problem. We have even proved a stronger result: the undecidability of the intersection problem for deterministic context-free languages, since, as is easy to check, the languages L(G1) and L(G2) are deterministic. □
Equivalence problem: Consider two context-free grammars G1 and G2. The question is: does L(G1) = L(G2) hold? We show the undecidability of this problem using the reduction
intersection problem for deterministic context-free languages ≤ equivalence problem.
Consider two deterministic push-down automata K1 and K2. We show that we can construct from them two (not necessarily deterministic) context-free grammars G1 and G2 such that the following holds:
L(K1) ∩ L(K2) = ∅ ⇔ L(G1) = L(G2).
We use the fact that for K2 we can construct the complement push-down automaton K̄2 with L(K̄2) = complement of L(K2) (compare Chapter III, Section 6). Then from K1 and K̄2 we can algorithmically construct context-free grammars G1 and G2 with
L(G1) = L(K1) ∪ L(K̄2) and L(G2) = L(K̄2);
then L(G1) = L(G2) holds if and only if L(K1) ⊆ L(K̄2), i.e. if and only if L(K1) ∩ L(K2) = ∅, as required.
Inclusion problem: Consider two context-free grammars G1 and G2. The question is: does L(G1) ⊆ L(G2) hold? Obviously the reduction
equivalence problem ≤ inclusion problem
holds, because L(G1) = L(G2) if and only if both L(G1) ⊆ L(G2) and L(G2) ⊆ L(G1). □
Another undecidability result concerns the ambiguity of context-free grammars. For the practical use of context-free grammars for the syntax description of programming languages it would be beneficial to have an algorithmic ambiguity test. However, we show that such a test does not exist.
Theorem : The ambiguity problem for context-free grammars (given G: is G ambiguous?) is undecidable.
Proof : We show PCP ≤ ambiguity problem. For an input Y of the PCP consider the unambiguous grammars G1 and G2 (with start symbols renamed to S1 and S2) from the intersection problem above, and combine them into G = (N1 ∪ N2 ∪ {S}, T, P, S) with
P = {S → S1, S → S2} ∪ P1 ∪ P2.
Because G1 and G2 are unambiguous, the only possible ambiguity of G, while constructing a derivation tree of G for a string w ∈ T*, lies in the first step, i.e. in the use of the production S → S1 or S → S2. Therefore we have:
G is ambiguous ⇔ L(G1) ∩ L(G2) ≠ ∅ ⇔ Y is solvable. (*)
Thus we have shown the reduction (*), and it implies the undecidability of the ambiguity problem. □
Chapter VI
Complexity
§1 Computational complexity
So far we have considered the computability of problems, i.e. the question whether given problems are algorithmically solvable at all. We have also considered a kind of structural complexity, i.e. determining which type of machine is necessary for solving a problem algorithmically. We have become acquainted with the following hierarchy:
regular ⊊ deterministic context-free ⊊ context-free ⊊ decidable ⊊ Chomsky-0.
Now we consider efficiency or computational complexity, i.e. the question: how much computing time and how much space (memory) do we need in order to solve a problem algorithmically? We study time and space depending on the size of the input. There are two working directions:
a) Find an algorithm for a given problem which is as efficient as possible; its complexity yields an upper bound for the problem.
b) Find a lower bound for a problem, so that every algorithm solving this problem has at least this complexity. The size of the input, n, is a trivial lower bound for the time complexity.
Statements about complexity depend on the machine model. In theory we mostly consider deterministic or non-deterministic Turing machines with several tapes. By using several tapes we get more realistic statements than with a 1-tape TM, because computation time that is spent merely moving back and forth on the single tape can be avoided.
1.1 Definition : Let f : N → N and let τ be a (non-deterministic) TM with input alphabet Σ.
(i) τ has the time complexity f(n) if for every string w ∈ Σ* of length n it holds: τ applied to the input w terminates, for every possible computation, within at most f(n) steps.
(ii) τ has the space complexity f(n) if for every string w ∈ Σ* of length n it holds: τ applied to the input w uses, for every possible computation, at most f(n) cells on each tape.
This definition also applies to deterministic TMs with several tapes; for deterministic TMs there is exactly one computation for every input w.
When dealing with TMs we usually represent problems as languages which are to be accepted. We already used such a representation in the chapter “Non-computable functions — undecidable problems”; for example, the halting problem for TMs was represented as the language H.
Now we group problems, i.e. languages, of the same complexity into so-called complexity classes:
DTIME(f(n)) = {L | L is accepted by a deterministic TM with time complexity f(n)},
NTIME(f(n)) = {L | L is accepted by a non-deterministic TM with time complexity f(n)},
and analogously DSPACE(f(n)) and NSPACE(f(n)) for the space complexity.
Example : Consider the language of palindromes w = wR over {a, b}.
Solution idea:
- For a given string w ∈ {a, b}* first determine the center of w and then compare both halves.
- In order to determine the center we use a 2nd tape, i.e. a deterministic 2-tape TM.
[Figure: phases of the deterministic 2-tape TM — the input w is split as w = uv at its center; v is moved from the 1st to the 2nd tape and erased on the 1st tape; then u and v are compared backwards, and the machine accepts if u = v.]
A non-deterministic 2-tape TM can proceed even more simply:
Part 1: Copy w character by character from tape 1 to tape 2, and stop this process non-deterministically at any time (thereby guessing the center).
Part 2: Then compare, character by character starting from the current positions, that the rest of the first tape has the same contents as the second tape. If this is the case and both heads finally point to a blank cell, then accept.
In complexity theory we compare the asymptotic behavior of time and space complexities, i.e. the behavior for all sufficiently large n. Thus we can ignore constant factors. For this purpose we use the O-notation from number theory:
O(g(n)) = {f : N → N | ∃c > 0 ∃n0 ∈ N ∀n ≥ n0 : f(n) ≤ c · g(n)},
i.e. O(g(n)) is the class of all functions f which are bounded by a constant multiple of g for sufficiently large values of n.
Because a TM can visit at most one new cell on its tapes in each computational step, it follows that:
DTIME(f(n)) ⊆ DSPACE(f(n)) ⊆ NSPACE(f(n))
DTIME(f(n)) ⊆ NTIME(f(n)) ⊆ NSPACE(f(n))
The complexity of algorithms which are applicable in practice should be a polynomial p(n) of some degree k, i.e. of the form
p(n) = ak n^k + … + a1 n + a0.
Therefore one defines:
P = ⋃ {DTIME(p(n)) | p polynomial in n}
NP = ⋃ {NTIME(p(n)) | p polynomial in n}
PSPACE = ⋃ {DSPACE(p(n)) | p polynomial in n}
NPSPACE = ⋃ {NSPACE(p(n)) | p polynomial in n}
By a theorem of Savitch, NSPACE(f(n)) ⊆ DSPACE(f(n)^2) holds for well-behaved f, and thus
NPSPACE = PSPACE.
Furthermore,
P ⊆ NP ⊆ PSPACE, (*)
because, as we have already mentioned, a TM can visit at most one new cell in every computational step.
Open problem of computer science: Are the inclusions in (*) strict, or does equality hold?
The classes P and NP are very important because they mark the transition from computability and decidability questions of practical interest to questions of purely theoretical interest. This transition can be illustrated by the following diagram:
[Diagram: the class NP of problems solvable in non-deterministic polynomial time, containing the class P (deterministic polynomial time) and the subclass NPC of NP-complete problems.]
Practically solvable, i.e. practically computable or decidable, are the problems in the class P, i.e. those solvable by deterministic algorithms in polynomial time. The polynomials bounding the computation time should of course have small degree, such as n, n^2 or n^3.
Practically unsolvable, on the other hand, are all problems for which it can be proved that the computation time grows exponentially with the input size n. Between these two extremes there is a large class of practically important problems for which at the moment only exponential deterministic algorithms are known, but which can be solved by non-deterministic algorithms in polynomial time. This is the class NP, which in comparison to P can be characterized as follows: non-deterministically guess a proposed solution and verify it deterministically in polynomial time.
In full generality these problems are thus still practically unsolvable. In practice one uses so-called heuristics, which strongly restrict the non-deterministic search space of possible solution proposals. Using these heuristics we try to approximate an “ideal solution”.
Example (Hamiltonian path): Given an undirected graph G, does G contain a path which visits every vertex exactly once? Consider, for example, a graph whose vertices e1, e2, e3, e4 are connected in a row; then the sequence of vertices e1-e2-e3-e4 is a Hamiltonian path. It is easy to see that the Hamiltonian path problem lies in NP: first guess a path and then check whether each vertex is visited exactly once. Since there are up to n! candidate paths in a graph with n vertices, this method used deterministically has exponential complexity.
While studying the question (which still remains open) whether P = NP holds, a subclass of NP was found, namely the class NPC of so-called NP-complete problems. It holds that:
If one problem of NPC lies in P, then all problems of NP are already in P, i.e. P = NP holds.
The class NPC was introduced in 1971 by S.A. Cook, who was the first to prove that the SAT problem (considered below) is NP-complete. From 1972 on, R. Karp proved that many other problems are NP-complete as well. Today we know far more than 1000 examples of problems in the class NPC.
Below we define the notion of NP-completeness. For this purpose we need the notion of polynomial-time reduction, introduced by Karp in 1972 as a technique for proving NP-completeness. It tightens the notion of reduction L1 ≤ L2 introduced in Chapter V, Section 2.
2.3 Definition : Let L1 ⊆ Σ1* and L2 ⊆ Σ2* be languages. Then L1 is called polynomial-time reducible to L2, shortly
L1 ≤p L2,
if there is a total function f : Σ1* → Σ2*, computable with polynomial time complexity, such that for all w ∈ Σ1* it holds that:
w ∈ L1 ⇔ f(w) ∈ L2.
Intuitively, L1 ≤p L2 means that L1 is at most as complex as L2. We easily see that ≤p is a reflexive and transitive relation on languages, because for two polynomials p1(n) and p2(n) the composition p1(p2(n)) is also a polynomial.
2.4 Lemma : Let L1 ≤p L2. Then: (i) if L2 ∈ P, then L1 ∈ P; (ii) if L2 ∈ NP, then L1 ∈ NP.
Proof of (i) and (ii): Let L1 ≤p L2 using a function f which is computed by a Turing machine τ1, and let the polynomial p1 bound the computing time of τ1. Because L2 ∈ P (or L2 ∈ NP, respectively), there is a (non-deterministic) Turing machine τ2, bounded by a polynomial p2, which computes the characteristic function χ_{L2}.
As with ordinary reduction, the characteristic function χ_{L1} can be computed for all w ∈ Σ1* as
χ_{L1}(w) = χ_{L2}(f(w)),
by connecting the Turing machines τ1 and τ2 in sequence. Now let |w| = n. Then τ1 computes the string f(w) in at most p1(n) steps. This time bound also limits the length of f(w), i.e. |f(w)| ≤ p1(n). Therefore the computation of χ_{L2}(f(w)) takes at most p2(p1(n)) steps, and altogether χ_{L1}(w) is computed in polynomial time p1(n) + p2(p1(n)). □
2.5 Definition : A language L is called NP-complete if L ∈ NP and L′ ≤p L holds for every L′ ∈ NP.
Example : We show that the Hamiltonian path problem is polynomial-time reducible to the round-trip problem (travelling salesman problem with a given length bound). Consider a graph G with n vertices. As reduction function we consider the following construction f : G ↦ G′ of a new weighted graph G′:
• G′ has the same n vertices as G.
• Every edge of G becomes an edge of length 1 in G′; every non-edge of G becomes an edge of length 2.
Thus G′ is a complete graph, i.e. any two vertices are connected by an edge. As length bound we take k = n + 1.
[Example: n = 5; G is mapped to the complete graph G′ whose edges carry length 1 (edges of G) or 2 (non-edges of G), with the bound k = n + 1 = 6.]
The construction f takes polynomial time (to find the non-edges, consider the adjacency matrix, i.e. O(n^2)). Now we show the reduction property of f, i.e.
G has a Hamiltonian path ⇔ G′ has a round-trip of length ≤ n + 1.
Proof of “⇒”: We use n − 1 edges of length 1 (i.e. edges “from G”) to connect the n different vertices along the Hamiltonian path. In order to close this path into a round-trip we need one more edge, of length at most 2. All in all the constructed round-trip has length at most n + 1.
Proof of “⇐”: The round-trip has at least n edges (to connect n vertices) and at most n + 1 edges (because every edge has length at least 1 and the total length is at most n + 1).
During the round-trip at least one vertex is visited twice (start = end). Therefore we must remove at least one edge in order to obtain a path with pairwise different vertices. We check whether this is enough.
Case 1: After removing one edge, the remaining path consists of n − 1 edges. Then the visited vertices are all different, so we have found a Hamiltonian path. (Example for case 1: see above.)
Case 2: Otherwise, after removing one edge, the remaining path consists of n edges. Then one vertex is still visited twice on a remaining cycle, and we obtain a Hamiltonian path after removing one more edge.
Example for case 2 (n = 3): [Figure: in G′ a round-trip of length 4 through the three vertices visits one vertex twice; removing one more edge yields a Hamiltonian path.]
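The whole reduction f fits in a few lines (a Python sketch with our own data representation: vertices 0, …, n−1 and an edge set):

    def reduce_hp_to_roundtrip(n, edges):
        # G' is complete: edges of G get length 1, non-edges length 2;
        # the bound is k = n + 1.
        weights = {}
        for i in range(n):
            for j in range(i + 1, n):
                connected = (i, j) in edges or (j, i) in edges
                weights[(i, j)] = 1 if connected else 2
        return weights, n + 1

    # Path graph 0-1-2: the round-trip 0-1-2-0 has length 1 + 1 + 2 = 4 = k.
    weights, k = reduce_hp_to_roundtrip(3, {(0, 1), (1, 2)})
    assert weights[(0, 1)] + weights[(1, 2)] + weights[(0, 2)] == k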
(i) A Boolean expression is a term over {0, 1} and variables with the operations not, and, or. In particular: let V = {x1, x2, …} be an infinite set of variables and B = {0, 1} the set of logical values (“false” and “true”). A literal is a variable x or a negated variable ¬x.
(1) If y1, y2, …, yk are literals, then (y1 ∨ y2 ∨ … ∨ yk) is called a clause (with k literals/alternatives).
(2) If c1, c2, …, cr are clauses with at most k literals each, then c1 ∧ c2 ∧ … ∧ cr is called a Boolean expression in conjunctive normal form (“CNF”); if at least one clause contains exactly k literals, the expression is called a CNF of order k. Similarly one defines “disjunctive normal form”.
(ii) For the SAT problem we encode Boolean expressions E as strings over the finite alphabet Σ = {x, 0, 1, ¬, ∧, ∨, (, )}, writing the i-th variable as x followed by the binary representation of i, i.e. x1 ↦ x1, x2 ↦ x10, x3 ↦ x11, x4 ↦ x100 etc. The elements of B can easily be removed from E using the calculation rules, and E can be brought into CNF. Thus we arrive at:
SAT = {w ∈ Σ* | w encodes a satisfiable Boolean expression in CNF}.
Theorem (Cook 1971): SAT is NP-complete.
Proof :
• SAT ∈ NP: a non-deterministic TM guesses an assignment of the variables of the given expression and evaluates the expression in polynomial time.
• The difficult part is to show that L ≤p SAT for every L ∈ NP. Thus consider an arbitrary L ∈ NP. Then there is a non-deterministic Turing machine τ = (Q, X, Γ, δ, q0, ⊔, F) with: Q = set of states; X = input alphabet; Γ = tape alphabet, which contains X; q0 ∈ Q initial state; ⊔ ∈ Γ − X blank symbol; F ⊆ Q set of final states (compare the definition in Chapter V, supplemented with the final states, because τ must halt for all inputs; exactly those strings w ∈ X* belong to L which bring τ into a final state). τ accepts L in polynomial time, i.e. there is a polynomial p such that τ halts, for every w ∈ X* and every computation sequence, after at most p(n) steps (with n = |w|), and w ∈ L if and only if there is a computation sequence of τ which transfers the initial configuration q0 w into a final configuration u1 q u2 with q ∈ F.
We assume without loss of generality that τ has only one tape. Now for every w ∈ X* we construct a Boolean expression g(w) ∈ Σ* (with Σ = {x, 0, 1, ¬, ∧, ∨, (, )}) in CNF such that w ∈ L ⇔ g(w) ∈ SAT. If we have shown that this function g : X* → Σ* is a polynomial reduction, then it follows that SAT is NP-complete, because L ∈ NP was arbitrary.
Construction of g(w) from w (we follow [Mehlhorn, “Data Structures and Algorithms 2”, Section VI.4]): let Q = {q1, …, qs}, q1 = initial state, F = {qr, qr+1, …, qs}, and let Γ = {c1, …, cγ} with c1 = ⊔ = blank symbol.
Without loss of generality let δ : Q × Γ → 2^{Q × Γ × {−1, +1, 0}} with δ(q, c) = ∅ for q ∈ F.
(Here −1 stands for “left”, +1 for “right” and 0 for “stay”.)
We make δ complete by changing every δ(q, c) = ∅ into δ(q, c) = {(q, c, 0)}. Thus the empty set never occurs as a value of δ. Because the end of the computation was signalled by δ(q, c) = ∅, this modification yields a non-terminating Turing machine τ′ which, however, exactly reflects the operation of τ: if τ halts in state q, then τ′ after p(|w|) steps reaches a configuration u1 q u2 with exactly this state q and stays in this configuration forever; and conversely. Thus:
w ∈ L ⇔ τ applied to w is in a configuration u1 q u2 with q ∈ F after p(|w|) steps
⇔ τ′ runs through a sequence K1, K2, …, K_{p(n)} of configurations with K1 = initial configuration on w and K_{p(n)} = u1 q u2 for some q ∈ F.
Because τ makes at most p(n) steps, we may always assume for the cell positions i and the time points t that |i| ≤ p(n) and t ≤ p(n).
The Boolean expression g(w) should describe exactly the above sequence of configurations K1, K2, …, K_{p(n)} (or rather all such sequences of configurations). This requires the following conditions:
(1) Initial configuration: τ′ is in state q1, the tape carries ⊔^{p(n)+1} w ⊔^{p(n)−n} on the cells −p(n), …, p(n), the head is at cell 1, time point t = 1.
(2) Final configuration (provided that w is accepted): τ′ is in a state qj with r ≤ j ≤ s at the time point t = p(n).
(3) Transition condition: at every time point 1 ≤ t ≤ p(n), τ′ is in exactly one state, every cell from −p(n) to +p(n) contains exactly one symbol from Γ, the head is at exactly one of these cells, and exactly one tuple of δ is used for the transition.
(4) Successor configuration: the next configuration arises from the previous configuration by the transition with the tuple of δ described in (3).
For this we use Boolean variables with the intended meanings: z_{t,k} — at time t, τ′ is in state qk; s_{t,i} — at time t the head is at cell i; a_{t,i,j} — at time t cell i contains the symbol cj; b_{t,l} — at time t the l-th tuple of δ is applied. We set g(w) = A1 ∧ A2 ∧ A3 ∧ A4, where Ai expresses condition (i), and define
exactlyone(x1, …, xk) := atleastone(x1, …, xk) ∧ atmostone(x1, …, xk)
with
atleastone(x1, …, xk) := (x1 ∨ x2 ∨ … ∨ xk)
and
atmostone(x1, …, xk) := ¬atleasttwo(x1, …, xk)
= ¬ ⋁_{1≤i<j≤k} (xi ∧ xj)
= ⋀_{1≤i<j≤k} (¬xi ∨ ¬xj) (apply de Morgan’s law!)
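In clause form (with the usual convention that a negative integer denotes a negated variable) the construction looks as follows (a Python sketch of our own):

    from itertools import combinations

    def exactlyone(xs):
        # CNF clauses stating that exactly one of the variables xs is true
        at_least_one = [list(xs)]
        at_most_one = [[-a, -b] for a, b in combinations(xs, 2)]  # de Morgan
        return at_least_one + at_most_one

    # For 3 variables: one long clause plus C(3,2) = 3 binary clauses.
    assert exactlyone([1, 2, 3]) == [[1, 2, 3], [-1, -2], [-1, -3], [-2, -3]]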
For (3): A3 = ⋀_{1≤t≤p(n)} (A3^{state}(t) ∧ A3^{place}(t) ∧ A3^{cell}(t) ∧ A3^{next}(t)) with
A3^{state}(t) = exactlyone(z_{t,1}, …, z_{t,s})
A3^{place}(t) = exactlyone(s_{t,−p(n)}, s_{t,−p(n)+1}, …, s_{t,p(n)})
A3^{cell}(t) = ⋀_{−p(n)≤i≤p(n)} exactlyone(a_{t,i,1}, …, a_{t,i,γ})
A3^{next}(t) = exactlyone(b_{t,1}, …, b_{t,m})
Again, this formula is in CNF. A3 has size of the order p(n) · (s^2 + (2·p(n)+1)^2 + (2·p(n)+1)·γ^2 + m^2), and A3 describes exactly the transition condition (3).
For (4): A4 = ⋀_{1≤t<p(n)} A4(t).
Now we must consider the tuples of δ in more detail. For l = 1, 2, …, m let the l-th tuple of δ say: in state q_{kl}, reading symbol c_{jl}, pass into state q_{k̃l}, write symbol c_{j̃l} and move the head by dl ∈ {−1, +1, 0}. Put:
A4(t) = ⋀_{−p(n)≤i≤p(n)} [ ⋀_{1≤j≤γ} (s_{t,i} ∨ ¬a_{t,i,j} ∨ a_{t+1,i,j})
∧ ⋀_{1≤l≤m} ( (¬s_{t,i} ∨ ¬b_{t,l} ∨ z_{t,kl}) ∧ (¬s_{t,i} ∨ ¬b_{t,l} ∨ a_{t,i,jl})
∧ (¬s_{t,i} ∨ ¬b_{t,l} ∨ z_{t+1,k̃l}) ∧ (¬s_{t,i} ∨ ¬b_{t,l} ∨ a_{t+1,i,j̃l})
∧ (¬s_{t,i} ∨ ¬b_{t,l} ∨ s_{t+1,i+dl}) ) ]
The first group of clauses says that cells away from the head do not change; the second group says that, if the head is at cell i and the l-th tuple is applied, then state, scanned symbol, successor state, written symbol and head movement are as prescribed by that tuple.
Altogether g(w) = A1 ∧ A2 ∧ A3 ∧ A4 is a Boolean expression in CNF with at most p′(n) variables, for a suitable polynomial p′. It is obvious that the generation of g(w) from w takes time proportional to the length of g(w), i.e. g(w) can be computed on a deterministic 1-tape Turing machine in at most const · (p′(n))^2 steps, hence in polynomial time.
Statement 2: g is a reduction, i.e. w ∈ L ⇔ g(w) ∈ SAT.
This follows from the construction, where “⇐” has to be proved in a somewhat more involved way.
Thus g is a polynomial reduction ⇒ SAT is NP-complete. □
Theorem : SAT(3), i.e. SAT restricted to expressions in CNF of order 3, is NP-complete.
Proof : Because SAT ∈ NP, it also holds that SAT(3) ∈ NP. Let us reduce SAT to SAT(3) by replacing every clause (x1 ∨ x2 ∨ … ∨ xr) with r > 3 by
(x1 ∨ x2 ∨ y1) ∧ (¬y1 ∨ x3 ∨ y2) ∧ (¬y2 ∨ x4 ∨ y3) ∧ … ∧ (¬y_{r−3} ∨ x_{r−1} ∨ xr)
with new variables y1, y2, …, y_{r−3}. It is easy to see that (x1 ∨ … ∨ xr) is satisfiable if and only if the longer formula (of order 3) is satisfiable. The transformation increases the size only polynomially ⇒ SAT ≤p SAT(3). □
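The clause splitting can be written as a small recursive transformation (a Python sketch in the same clause representation as before; the fresh-variable supply is an assumption of the example):

    from itertools import count

    def split_clause(clause, fresh):
        # split a clause (list of literals) of length r > 3 into an
        # equisatisfiable chain of clauses with at most 3 literals
        if len(clause) <= 3:
            return [clause]
        y = next(fresh)
        return [[clause[0], clause[1], y]] + split_clause([-y] + clause[2:], fresh)

    fresh = count(100)     # variable numbers 100, 101, ... are unused
    assert split_clause([1, 2, 3, 4, 5], fresh) == [[1, 2, 100], [-100, 3, 101], [-101, 4, 5]]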
For further NP-complete problems see the literature: knapsack problem, shortest round-trip (travelling salesman), Hamiltonian paths in graphs, clique problem, graph coloring problem, integer programming, timetable problem, allocation problem, graph embedding problem, etc.