A Course Material
on
Theory of Computation
By
S.Parvathi
Assistant Professor
Computer Science and Engineering Department
Quality Certificate
Subject Code:CS6503
Year/Sem: III/V
Name: S.Parvathi
This is to certify that the course material prepared by Mrs. S. Parvathi is of adequate quality. She has referred to more than five books, at least one of which is by a foreign author.
Seal: Seal:
OBJECTIVES:
The student should be made to:
Understand various Computing models like Finite State Machine, Pushdown Automata,
and Turing Machine.
Be aware of Decidability and Un-decidability of various problems.
Learn types of grammars.
UNIT II GRAMMARS 9
Grammar Introduction– Types of Grammar – Context Free Grammars and Languages–
Derivations and Languages – Ambiguity- Relationship between derivation and
derivation trees – Simplification of CFG – Elimination of Useless symbols – Unit
productions – Null productions – Greiback Normal form – Chomsky normal form –
Problems related to CNF and GNF.
OUTCOMES:
At the end of the course, the student should be able to:
Design Finite State Machine
Understand the concept of Grammars
Design Pushdown Automata
Design Turing Machine
Explain the Decidability or Undecidability of various problems
TEXT BOOKS:
1. Hopcroft J.E., Motwani R. and Ullman J.D., "Introduction to Automata Theory, Languages and Computations", Second Edition, Pearson Education, 2008. (UNIT 1, 2, 3)
2. John C. Martin, "Introduction to Languages and the Theory of Computation", Third Edition, Tata McGraw Hill Publishing Company, New Delhi, 2007. (UNIT 4, 5)
REFERENCES:
1. Mishra K.L.P. and Chandrasekaran N., "Theory of Computer Science - Automata, Languages and Computation", Third Edition, Prentice Hall of India, 2004.
2. Harry R. Lewis and Christos H. Papadimitriou, "Elements of the Theory of Computation", Second Edition, Prentice Hall of India, Pearson Education, New Delhi, 2003.
3. Peter Linz, "An Introduction to Formal Language and Automata", Third Edition, Narosa Publishers, New Delhi, 2002.
4. Kamala Krithivasan and Rama R., "Introduction to Formal Languages, Automata Theory and Computation", Pearson Education, 2009.
CONTENTS
1 Unit - I
2 Unit - II
3 Unit - III
4 Unit - IV
5 Unit - V
Prerequisite
You must know the basic concepts of design and analysis of algorithms, and have basic knowledge of programming and data structures, together with the time and space complexity of the algorithms used in computation.
UNIT – I
FINITE AUTOMATA
TWO MARKS
A finite automaton is defined as
M = (Q, ∑, δ, q0, F)
where
• M - the finite automaton
• Q - a finite, non-empty set of states
• ∑ - a finite input alphabet
• q0 ∈ Q - the start state
• F ⊆ Q - the set of final states
• δ : Q × ∑ → Q - the transition function
An NFA is likewise defined as M = (Q, ∑, δ, q0, F), where
1. ∑ - an input alphabet
2. q0 ∈ Q - the start state
3. F ⊆ Q - the set of final states
4. δ : Q × ∑ → 2^Q - the transition function (2^Q is the power set of Q)
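The definition above can be exercised with a short simulation. The following sketch is our own illustration (the example machine is an assumption, not from the text): it runs a DFA over an input string.

```python
# Minimal DFA simulator for M = (Q, Sigma, delta, q0, F).
# Example machine (assumed): accepts strings over {0,1} with an even number of 1's.

def dfa_accepts(delta, q0, finals, w):
    """Follow exactly one transition per input symbol; accept iff we end in F."""
    q = q0
    for a in w:
        q = delta[(q, a)]
    return q in finals

delta = {('even', '0'): 'even', ('even', '1'): 'odd',
         ('odd', '0'): 'odd', ('odd', '1'): 'even'}

print(dfa_accepts(delta, 'even', {'even'}, '1101'))  # three 1's -> rejected
```

Because δ is a total function of (state, symbol), the run is unique, which is exactly what distinguishes a DFA from an NFA.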
6. Define NFA with ε-transitions. Are NFA's with ε-transitions more powerful than NFA's without ε-transitions? [CO1-L2]
An NFA with ε-moves is defined by a 5-tuple M = (Q, ∑, δ, q0, F) just as an NFA, except that the transition function also accepts ε: δ : Q × (∑ ∪ {ε}) → 2^Q.
No. NFA's with ε-transitions and NFA's without ε-transitions have the same power.
Solution:
Є – Closure (q0) = {q0, q1, q2}
Є – Closure (q1) = {q1, q2}
In a DFA, δ maps a state and an input symbol to exactly one next state; in an NFA, δ maps them to a set of states (possibly empty). A DFA accepts when its single run ends in a state in F; an NFA accepts when at least one of its runs ends in a state in F. (An NFA can be converted to an equivalent DFA by the subset construction.)
13. Design a DFA to accept the language L = {w | w has both an even number of 0's and an even number of 1's} [CO1-H3]
(Transition diagram: four states q0, q1, q2, q3 with q0 the start and final state; 1-edges join q0 with q1 and q2 with q3, 0-edges join q0 with q2 and q1 with q3, so q0 records "even 0's and even 1's".)
14. Construct the DFA that accepts input strings of 0's and 1's that end with 00. [CO1-H3]
(Transition diagram: start state q0; on 0 go q0 to q1, on 0 go q1 to q2, where q2 is the final state and loops on 0; any 1 returns the machine to q0.)
19. Write a regular expression to denote a language L which accepts all the
strings which begin or end with either 00 or 11.[CO1-L2] [NOV/DEC 2012]
The R.E consists of two parts:
L1 = (00+11)(any number of 0's and 1's) = (00+11)(0+1)*
L2 = (any number of 0's and 1's)(00+11) = (0+1)*(00+11)
Hence Reg.Exp R=L1+L2 = [(00+11)(0+1)*] + [(0+1)* (00+11)]
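The answer can be spot-checked mechanically. The sketch below is our own check using Python's re module, where the '+' of the regular expression is written '|':

```python
import re

# R = (00+11)(0+1)* + (0+1)*(00+11), transcribed into re syntax.
R = re.compile(r'(00|11)(0|1)*|(0|1)*(00|11)')

for w in ['0010', '1011', '0101']:
    # 0010 begins with 00; 1011 ends with 11; 0101 does neither
    print(w, bool(R.fullmatch(w)))
```

fullmatch is used so that the whole string, not just a prefix, must match R.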
20. Construct a R.E for the language which accepts all strings with at least two c's over the set Σ = {c, b}. [CO1-L1]
Ans: (b+c)* c (b+c)* c (b+c)*
we can write w = xyz with |xy| ≤ m and |y| ≥ 1, such that x y^i z ∈ L for all i = 0, 1, 2, ...
25. What are the applications of Regular expressions and Finite automata?[CO1-
L1]
Lexical analyzers and Text editors are two applications.
Lexical analyzers: The tokens of a programming language can be expressed using regular expressions. The lexical analyzer scans the input program and separates the tokens. For example, an identifier can be expressed as the regular expression:
(letter)(letter+digit)*
If anything in the source language matches this regular expression, it is recognized as an identifier. Here letter is {A, B, C, ..., Z, a, b, c, ..., z} and digit is {0, 1, ..., 9}. Thus regular expressions identify the tokens of a language.
Text editors: These are programs used for processing text. For example, UNIX text editors use regular expressions for substituting strings, such as:
s/bbb*/b/
which substitutes a single blank for the first string of two or more blanks in a given line. In UNIX text editors any regular expression is converted to an NFA with ε-transitions, and this NFA can then be simulated directly.
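The lexical-analyzer idea can be sketched directly with regular expressions. The token classes below (ID for identifiers, NUM for numbers) are our own illustrative assumptions, not taken from any particular compiler:

```python
import re

# (letter)(letter+digit)* for identifiers, digit+ for numbers; whitespace is skipped.
TOKEN = re.compile(r'(?P<ID>[A-Za-z][A-Za-z0-9]*)|(?P<NUM>[0-9]+)|(?P<SKIP>\s+)')

def tokens(src):
    out = []
    for m in TOKEN.finditer(src):
        if m.lastgroup != 'SKIP':       # drop whitespace, keep real tokens
            out.append((m.lastgroup, m.group()))
    return out

print(tokens('count1 42'))  # [('ID', 'count1'), ('NUM', '42')]
```

Each named group plays the role of one token's regular expression, just as in the description above.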
1. For the finite state machine M given in the following table, test whether the
strings 101101, 111111 are accepted by M [CO1-L3-Apr/May 2007]
state 0 1
q1 [q3] [q0]
q2 [q0] [q3]
q3 [q1] [q2]
Solution
2. Construct an NFA for the set of strings with {0, 1} ending with 01 and draw
the transition table for the same and check whether the input string 00101 is
accepted by above NFA. [CO1-L3]
Solution:
The transition diagram is,
(Transition diagram: q0 loops on 0 and 1; q0 goes to q1 on 0, and q1 goes to q2 on 1, with q2 the final state.)
Input String = 00101
The transition table is,
        0           1
q0    {q0, q1}    {q0}
q1    Φ           {q2}
*q2   Φ           Φ
δ'(q0, 0) = δ(q0, 0) = {q0, q1}
δ'(q0, 00) = δ(δ'(q0, 0), 0) = δ({q0, q1}, 0) = δ(q0, 0) U δ(q1, 0) = {q0, q1} U Φ = {q0, q1}
δ'(q0, 001) = δ({q0, q1}, 1) = δ(q0, 1) U δ(q1, 1) = {q0} U {q2} = {q0, q2}
δ'(q0, 0010) = δ({q0, q2}, 0) = δ(q0, 0) U δ(q2, 0) = {q0, q1} U Φ = {q0, q1}
δ'(q0, 00101) = δ({q0, q1}, 1) = δ(q0, 1) U δ(q1, 1) = {q0} U {q2} = {q0, q2}
δ'(q0, 00101) ∩ F = {q0, q2} ∩ {q2} = {q2} ≠ Φ, so the input string 00101 is accepted.
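The hand computation above can be replayed mechanically. The dict encoding below is our own, but the NFA is the one from this problem (strings ending with 01):

```python
# NFA accepting strings over {0,1} that end with 01; missing entries are the empty set.
delta = {('q0', '0'): {'q0', 'q1'}, ('q0', '1'): {'q0'},
         ('q1', '1'): {'q2'}}

def run(start, w):
    """Track the set of states reachable after each symbol (the delta' of the text)."""
    S = {start}
    for a in w:
        S = set().union(*(delta.get((q, a), set()) for q in S))
    return S

S = run('q0', '00101')
print(S, bool(S & {'q2'}))  # accepted iff the final set meets F = {q2}
```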
3. Briefly discuss the additional forms of proof and inductive proof. [CO1-L2-Nov/Dec 2012]
FORMAL PROOF
A formal proof is one in which a step-by-step procedure is used to solve the problem. Testing a program is important, but if the program is complex, involving recursion or iteration, the code may still be incorrect. To argue that the iteration or recursion is correct, we set up an inductive hypothesis, which lets us reason that the hypothesis is consistent with each iteration or recursive call. This process of analyzing the working of a correct program is similar to proving theorems by induction.
Automata theory is used to cover the methodologies of formal proof. The proof may be
of,
• Deductive formal proof
  o Consists of a sequence of justified steps.
• Inductive formal proof
  o A recursive proof of a parameterized statement that uses the statement itself with lower values of the parameter.
Methods of formal proof involve:
• Deductive proof
• Reduction to definitions
• Other theorem forms
• Theorems that appear not to be if-then statements
DEDUCTIVE PROOF
A deductive proof consists of a sequence of statements whose truth leads from an initial statement, called the hypothesis or the given statement, to a conclusion statement. Each step in the proof must follow by some accepted logical principle from the given facts or from some of the previous statements in the deductive proof. The hypothesis may consist of independent statements connected by a logical AND. The format is,
“If H then C”
Where,
H is Hypothesis, C is Conclusion
Example:
If x ≥ 4, then 2^x ≥ x^2
REDUCTION TO DEFINITION
If we are not sure how to start a proof, then convert all terms in the hypothesis to their
definitions.
• We can prove the set equality E = F by an if-and-only-if statement: an element x is in E if and only if x is in F.
• The set equality E = F can also be proved by two if statements:
  o If x is in E, then x is in F
  o If x is in F, then x is in E
Contrapositive
The contrapositive of the statement "if H then C" is "if not C then not H". To show that "if H then C" and "if not C then not H" are logically equivalent, there are four cases (the truth-value combinations of H and C) to consider.
Example I:
All primes are odd.
Example II:
Theorem: There is no pair of distinct positive integers a and b such that a mod b = b mod a.
(If a < b, then a mod b = a while b mod a < a; hence a mod b ≠ b mod a.)
INDUCTIVE PROOF
Inductive proof deals with recursively defined objects like trees and expressions of
various sorts, such as regular expression.
• Induction on integers
• Structural Inductions
• Mutual Induction
STRUCTURAL INDUCTIONS
In Automata, there are several recursively defined structures such as trees and
expressions. The structural induction deals with the recursive definition that has a
basis case in which one or more elementary structures are defined. There is an
inductive step, where more complex structures are defined in terms of previously
defined structures.
Example 1: The recursive definition of a tree.
Basis: A single node is a tree, and that node is the root of the tree.
Induction: If T1, T2…. Tk are trees, and then we can form a new tree as follows,
• Begin with a new node N, which is the root of the tree
• Add copies of all the trees T1, T2, … Tk
• Add edges from node N to the roots of each of the trees T1,T2, … Tk
(Figure: a new root N with edges to the roots of T1, T2, ..., Tk.)
For the inductive step, take a structure x that the recursive definition says is formed from y1, y2, ..., yk. Assume the statements S(y1), S(y2), ..., S(yk) and use these to prove S(x).
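As a small illustration of structural induction (our own example, not from the text), consider the statement S(T): "a tree has one more node than edges". It holds for the basis (a single node: 1 node, 0 edges) and is preserved by the inductive step that joins subtrees under a new root N:

```python
# A tree is (root_label, list_of_subtrees), mirroring the recursive definition above.

def nodes(tree):
    root, subtrees = tree
    return 1 + sum(nodes(t) for t in subtrees)

def edges(tree):
    root, subtrees = tree
    # the inductive step adds one edge from N to each subtree's root
    return len(subtrees) + sum(edges(t) for t in subtrees)

leaf = ('n', [])                      # basis: a single node
t = ('N', [leaf, ('M', [leaf])])      # inductive step applied twice
print(nodes(t), edges(t))             # nodes(t) == edges(t) + 1
```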
MUTUAL INDUCTIONS
Sometimes we cannot prove a single statement by induction; we may need to prove a group of statements S1(n), S2(n), ..., Sk(n) together by induction on n. Automata theory often involves proving such a group of statements, one for each state. Proving the group of statements is the same as proving the conjunction of all the statements. For example, the group of statements S1(n), S2(n), ..., Sk(n) can be replaced by the single statement
S1(n) AND S2(n) AND ... AND Sk(n)
When there are several independent statements to prove, we keep the statements separate and prove them all in their own parts of the basis and inductive step. This form of proof is called mutual induction.
δ        a        b
p      {p}      {p, q}
q      {r}      {r}
*r      Φ        Φ
Solution
NFA transition diagram for the above NFA is,
Step 2
(B,a): δN({p,q}, a) = δ(p,a) U δ(q,a) = {p} U {r} = {p, r}, so δD([p,q], a) = [p,r] - a new state C.
(B,b): δN({p,q}, b) = δ(p,b) U δ(q,b) = {p,q} U {r} = {p, q, r}, so δD([p,q], b) = [p,q,r] - a new state D.
Step 3
(C,a): δN({p,r}, a) = δ(p,a) U δ(r,a) = {p} U Φ = {p}, so δD([p,r], a) = [p].
(C,b): δN({p,r}, b) = δ(p,b) U δ(r,b) = {p,q} U Φ = {p, q}, so δD([p,r], b) = [p,q].
Step 4
(D,a): δN({p,q,r}, a) = δ(p,a) U δ(q,a) U δ(r,a) = {p} U {r} U Φ = {p, r}, so δD([p,q,r], a) = [p,r].
(D,b): δN({p,q,r}, b) = {p,q} U {r} U Φ = {p, q, r}, so δD([p,q,r], b) = [p,q,r].
(Every subset containing r is a final state of the DFA.)
δ        0         1
p      {p, q}    {p}
q      {r}       {r}
r      {s}       Φ
*s     {s}       {s}
Step 5
(E,0): δN({p,q,r,s}, 0) = {p,q} U {r} U {s} U {s} = {p, q, r, s}, so δD([p,q,r,s], 0) = [p,q,r,s] = E
(E,1): δN({p,q,r,s}, 1) = {p} U {r} U Φ U {s} = {p, r, s}, so δD([p,q,r,s], 1) = [p,r,s] = G
Step 6
(F,0): δN({p,q,s}, 0) = δ(p,0) U δ(q,0) U δ(s,0) = {p,q} U {r} U {s} = {p, q, r, s}, so δD([p,q,s], 0) = [p,q,r,s] = E
(F,1): δN({p,q,s}, 1) = δ(p,1) U δ(q,1) U δ(s,1) = {p} U {r} U {s} = {p, r, s}, so δD([p,q,s], 1) = [p,r,s] = G
Step 7
(G,0): δN({p,r,s}, 0) = {p,q} U {s} U {s} = {p, q, s}, so δD([p,r,s], 0) = [p,q,s] = F
(G,1): δN({p,r,s}, 1) = {p} U Φ U {s} = {p, s}, so δD([p,r,s], 1) = [p,s] = H
Step 8
(H,0): δN({p,s}, 0) = δ(p,0) U δ(s,0) = {p,q} U {s} = {p, q, s}, so δD([p,s], 0) = [p,q,s] = F
(H,1): δN({p,s}, 1) = δ(p,1) U δ(s,1) = {p} U {s} = {p, s}, so δD([p,s], 1) = [p,s] = H
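Steps like these follow a fixed worklist pattern, sketched below in our own encoding (the NFA used is the first one of this problem, over a and b):

```python
from collections import deque

def subset_construction(delta, start, alphabet):
    """Build the DFA transition table whose states are sets of NFA states."""
    table, todo = {}, deque([frozenset([start])])
    while todo:
        S = todo.popleft()
        if S in table:
            continue
        table[S] = {}
        for a in alphabet:
            T = frozenset().union(*(delta.get((q, a), frozenset()) for q in S))
            table[S][a] = T
            todo.append(T)            # process newly discovered subsets later
    return table

delta = {('p', 'a'): {'p'}, ('p', 'b'): {'p', 'q'},
         ('q', 'a'): {'r'}, ('q', 'b'): {'r'}}
table = subset_construction(delta, 'p', 'ab')
print(table[frozenset({'p', 'q'})])   # the moves computed by hand for state B
```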
The transition table for the resulting DFA is assembled from the moves computed in Steps 1-8 above.

Convert the following NFA with Є-moves into an equivalent NFA without Є-moves:
        a       b       c       Є
p      {p}     Φ       Φ      {q}
q       Φ     {q}      Φ      {r}
*r      Φ      Φ      {r}      Φ
Solution:
The NFA with Є transition diagram
is,
Step 1
Є – Closure (p) = { p, q, r }
Є – Closure (q) = { q, r }
Є – Closure (r) = { r }
Step 2: Processing state p
δ (p, a) = Є – closure (δ (δ‘ ( p, Є ) , a) )
= Є – closure (δ ({ p, q, r }), a) )
= Є – closure (δ (p, a) U δ (q, a) U δ (r, a) )
= Є – closure ({ p })
δ (p, a) = { p, q, r}
δ (p, b) = Є – closure (δ (δ‘ (p, Є), b) )
= Є – closure (δ ({ p, q, r }, b) )
= Є – closure (δ (p, b) U δ (q, b) U δ (r, b) )
= Є – closure (q)
δ (p, b) = { q, r }
δ (p, c) = Є – closure (δ (δ‘ (p, Є) , c) )
= Є – closure (δ ({ p, q, r } , c ) )
= Є – closure (δ (p, c) U δ (q, c) U δ (r, c) )
= Є – closure ( r )
δ (p, c) ={r}
Processing state q
δ (q, a) = Є – closure (δ ( δ‘ (q, Є ), a) )
= Є – closure (δ ({ q, r }, a) )
= Є – closure (δ (q, a) U δ (r, a) )
= Є – closure (Φ)
δ (q, a) =Φ
δ (q, b) = Є – closure ( (δ ( δ‘ (q, Є ), b) )
= Є – closure (δ ({ q, r }, b) )
= Є – closure (δ (q, b) U δ (r, b) )
= Є – closure (q)
δ (q, b) = { q, r }
δ (q, c) = Є – closure ( (δ ( δ‘ (q, Є ), c) )
= Є – closure (δ ({ q, r }, c) )
= Є – closure (δ (q, c) U δ (r, c) )
= Є – closure (r)
δ (q, c) ={r}
Processing state r
δ (r, a) = Є – closure ( (δ ( δ‘ (r, Є ), a) )
= Є – closure (δ ({ r }, a) )
= Є – closure (Φ)
δ (r, a) =Φ
δ (r, b) = Є – closure ( (δ ( δ‘ (r, Є ), b) )
= Є – closure (δ ({ r }, b) )
= Є – closure (Φ)
δ (r, b) =Φ
δ (r, c) = Є – closure ( (δ ( δ‘ (r, Є ), c) )
= Є – closure (δ ({ r }, c ) )
= Є – closure ( r )
δ (r, c) ={r}
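Є-closures like those in Step 1 are a reachability computation over Є-moves; a small fixed-point routine (our own encoding) is:

```python
def eps_closure(eps, states):
    """All states reachable from `states` using only Є-moves."""
    closure, stack = set(states), list(states)
    while stack:
        q = stack.pop()
        for r in eps.get(q, set()):
            if r not in closure:
                closure.add(r)
                stack.append(r)
    return closure

eps = {'p': {'q'}, 'q': {'r'}}        # p -Є-> q -Є-> r, as in the table above
print(eps_closure(eps, {'p'}))        # {'p', 'q', 'r'}
print(eps_closure(eps, {'q'}))        # {'q', 'r'}
```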
The resulting NFA without Є-moves is:
        a            b         c
*p    {p, q, r}   {q, r}    {r}
*q      Φ         {q, r}    {r}
*r      Φ           Φ       {r}
(p and q are also final states because their Є-closures contain the final state r.)
7. Explain the extended transition function for NFA, DFA and NFA – ε[CO1-L2-
Nov/Dec 2007]
Extended Transition Function for DFA
The transition function δ is extended to δ', the extended transition function, which operates on strings and states. The extended transition function describes what happens when we start in any state and follow any sequence of inputs. If δ is the transition function, then the extended transition function constructed from δ is called δ'. The extended transition function δ' takes a state q and a string w and reaches a state s; that is, δ' reaches the state s from q after processing the sequence of symbols in the string w.
Basis:
δ'(q, Є) = q; reading no input leaves the DFA in the same state.
Induction:
Write w = xy, where y is the last symbol of w and x is the remaining prefix; then δ'(q, w) = δ(δ'(q, x), y).
For example, if the string w = 1101, then the last symbol '1' is y and the remainder is x, so x = 110 and y = 1.
Basis:
δ‘ (q, Є) = { q } => If we are in state q and read no input then we are in state q itself.
Without reading input NFA remains in the same state.
Induction:
Write w = xa and suppose δ'(q, x) = {p1, p2, ..., pk}, and
δ(p1, a) U δ(p2, a) U ... U δ(pk, a) = {r1, r2, ..., rm}.
Therefore
δ'(q, w) = δ'(q, xa) = {r1, r2, ..., rm}
The language accepted by an NFA M = (Q, ∑, δ, q0, F) is denoted by L(M) and is defined as
L(M) = { w | δ'(q0, w) ∩ F ≠ Φ }
Solution:
        a        b        Є
q0    {q0}     Φ       {q1}
*q1     Φ     {q1}      Φ
Step 1:
ε – closure(q0) ={q0,q1}
ε – closure(q1)= {q1}
Step 2:
ε – closure(q0) = {q0,q1}
(A,a) δN({q0,q1},a) = ε – closure(δ(q0,a) U δ(q1,a))
= ε – closure(q0) =
{q0,q1}
So δD([q0,q1],a) = [q0,q1]
(A,b) δN({q0,q1},b) = ε – closure(δ(q0,b) U δ(q1,b))
= ε – closure(q1) = {q1}
So δD([q0,q1],b) = [q1]
Step 3:
(B,a) δN({q1}, a) = ε – closure(δ(q1, a)) = ε – closure(Φ) = Φ
So δD([q1], a) = Φ
(B,b) δN({q1}, b) = ε – closure(δ(q1, b)) = ε – closure(q1) = {q1}
So δD([q1], b) = [q1]
The resulting DFA transition table:
             a           b
[q0,q1]   [q0,q1]     [q1]
*[q1]       Φ         [q1]
UNIT –II
GRAMMARS
PART-A
Right Sentential Form: If a string α can be generated from the start symbol by a rightmost derivation, i.e. S ⇒* α by rightmost derivation, then α is a right sentential form.
10. What is an ambiguous grammar?[CO2-L1-Dec 2009]
A grammar is said to be ambiguous if it has more than one derivation trees for a
sentence or in other words if it has more than one leftmost derivation or more than one
rightmost derivation.
11. Show that the grammar with productions P = {S → aS | aSbS | є} is ambiguous by constructing: (a) two parse trees (b) two leftmost derivations (c) two rightmost derivations [CO2-L1-Nov/Dec 2007]
Let w = aab.
(b) Two leftmost derivations:
(i) S ⇒ aS ⇒ aaSbS ⇒ aabS ⇒ aab
(ii) S ⇒ aSbS ⇒ aaSbS ⇒ aabS ⇒ aab
(c) Two rightmost derivations:
(i) S ⇒ aS ⇒ aaSbS ⇒ aaSb ⇒ aab
(ii) S ⇒ aSbS ⇒ aSb ⇒ aaSb ⇒ aab
(a) The parse trees corresponding to (i) and (ii) are distinct, so the grammar is ambiguous.
12. What are the properties of the CFL generated by a CFG? [CO2-L1]
Each variable and each terminal of G appears in the derivation of some word in L. There are no productions of the form A → B, where A and B are variables.
13. Find the grammar for the language L = {a^(2n)bc | n ≥ 1} [CO2-L3]
Let G = ({S, A}, {a, b, c}, P, S) where the productions are:
S → Abc
A → aaA | aa
Find the language generated by:
S → 0S1 | 0A | 0 | 1B | 1
A → 0A | 0
B → 1B | 1
19. Let G = ({S, C}, {a, b}, P, S) where P consists of S → aCa, C → aCa | b. Find L(G). [CO2-L2]
S ⇒ aCa ⇒ aba
S ⇒ aCa ⇒ aaCaa ⇒ aabaa
S ⇒ aCa ⇒ aaCaa ⇒ aaaCaaa ⇒ aaabaaa
Thus L(G) = { a^n b a^n | n ≥ 1 }
20. Find L(G) where G = ({S}, {0, 1}, {S → 0S1, S → є}, S) [CO2-L2-Dec 2010]
S ⇒ є, so є is in L(G)
S ⇒ 0S1 ⇒ 0є1 = 01
S ⇒ 0S1 ⇒ 00S11 ⇒ 0011
Thus L(G) = { 0^n 1^n | n ≥ 0 }
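The pattern of these derivations can be confirmed by brute-force enumeration. The code below (our own check, not part of the syllabus) expands S → 0S1 | є breadth-first and collects every terminal string up to a length bound:

```python
from collections import deque

def language(max_len):
    out, seen, todo = set(), set(), deque(['S'])
    while todo:
        s = todo.popleft()
        # a sentential form is at most 1 symbol longer than its terminal yield
        if s in seen or len(s) > max_len + 1:
            continue
        seen.add(s)
        if 'S' not in s:
            out.add(s)
            continue
        for rhs in ('0S1', ''):                 # the two productions
            todo.append(s.replace('S', rhs, 1))
    return out

print(sorted(language(4), key=len))             # ['', '01', '0011']
```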
24. Define leftmost derivation and rightmost derivation. [CO2-L1]
Leftmost derivation: In this method we replace the leftmost non-terminal by one of its productions at every step. Such a derivation is known as a leftmost derivation, and is denoted by ⇒lm for one step and ⇒*lm for one or more steps.
Rightmost derivation: In this method we replace the rightmost non-terminal by one of its productions at every step. Such a derivation is known as a rightmost derivation, and is denoted by ⇒rm for one step and ⇒*rm for one or more steps.
25. Construct a context free grammar for generating the language L ={anbn/n≥1}
[CO2-L3] (Nov/Dec-2004, 2010, 2013, May-05, 06)
G = (V, T, P, S) with
P = { S → aSb, S → ab }
27. Convert the following grammar into an equivalent one with no unit productions and no useless symbols: S → ABA, A → aAA | aBC | bB, B → A | bB | Cb, C → CC | cC [CO2-L2-Nov/Dec 2011]
We can observe that B → A is a unit production, so we eliminate it. Further, C → CC | cC never leads to a terminating string, so C is a useless symbol. The same is true of the other non-terminals A and B (every one of their production bodies contains a non-generating variable); hence A, B, C, and therefore S, are all useless symbols.
Parse tree (derivation tree): A parse tree clearly shows how the symbols of a terminal string are grouped into substrings, each of which belongs to the language of one of the variables of the grammar.
32.What are the two major normal forms for context-free grammar? [CO2-L1]
The two Normal forms are
i. Chomsky Normal Form (CNF)
ii. Greibach Normal Form (GNF)
To simplify a CFG:
• Eliminate useless symbols, i.e. symbols that do not appear in any derivation of a terminal string from the start symbol.
• Eliminate є-productions, which are of the form A → є for some variable A.
• Eliminate unit productions of the form A → B for variables A, B.
Then convert the grammar to one of the normal forms to get the simplified CFG.
PART – B
1. Derive the string a*(a+b00) using leftmost and rightmost derivation for the following productions. [CO2-H2]
1. E → I
2. E → E+E
3. E → E*E
4. E → (E)
5. I → a
6. I → b
7. I → Ia
8. I → Ib
9. I → I0
10. I → I1
Solution
Leftmost Derivation
E ⇒ E*E
⇒ I*E        (E → I)
⇒ a*E        (I → a)
⇒ a*(E)      (E → (E))
⇒ a*(E+E)    (E → E+E)
⇒ a*(I+E)    (E → I)
⇒ a*(a+E)    (I → a)
⇒ a*(a+I)    (E → I)
⇒ a*(a+I0)   (I → I0)
⇒ a*(a+I00)  (I → I0)
⇒ a*(a+b00)  (I → b)
Rightmost Derivation
E ⇒ E*E
⇒ E*(E)      (E → (E))
⇒ E*(E+E)    (E → E+E)
⇒ E*(E+I)    (E → I)
⇒ E*(E+I0)   (I → I0)
⇒ E*(E+I00)  (I → I0)
⇒ E*(E+b00)  (I → b)
⇒ E*(I+b00)  (E → I)
⇒ E*(a+b00)  (I → a)
⇒ I*(a+b00)  (E → I)
⇒ a*(a+b00)  (I → a)
2. Show that the grammar S → aSbS | bSaS | є is ambiguous, and what is the language generated by this grammar? [CO2-L2-Nov/Dec 2006]
Solution - Input string w = aabbab
Leftmost Derivation
S ⇒ aSbS
⇒ aaSbSbS
⇒ aabSbS    (S → є)
⇒ aabbS     (S → є)
⇒ aabbaSbS
⇒ aabbabS   (S → є)
⇒ aabbab    (S → є)
Rightmost Derivation
S ⇒ aSbS
⇒ aSbaSbS
⇒ aSbaSb    (S → є)
⇒ aSbab     (S → є)
⇒ aaSbSbab
⇒ aaSbbab   (S → є)
⇒ aabbab    (S → є)
S ⇒ SbS
⇒ abS       (S → a)
⇒ abSbS     (S → SbS)
⇒ ababS     (S → a)
⇒ ababSbS   (S → SbS)
⇒ abababS   (S → a)
⇒ abababa   (S → a)
A second, different derivation of the same string:
S ⇒ SbS
⇒ SbSbS
⇒ abSbS
⇒ ababS
⇒ ababSbS
⇒ abababS
⇒ abababa
S ⇒ aSbS
⇒ aaSbS     (S → aS)
⇒ aabS      (S → є)
⇒ aab       (S → є)
(The two parse trees for w = aab: one with root production S → aS, the other with root production S → aSbS; both yield aab, confirming the ambiguity.)
S ⇒ aS
⇒ aaSbS
⇒ aaSbЄ = aaSb
⇒ aaЄb = aab
and
S ⇒ aSbS
⇒ aSbЄ = aSb
⇒ aaSb
⇒ aaЄb = aab
Solution 1
Step 1 - Kill all Є productions
By inspection, the only nullable non-terminal is X. Delete all Є productions and add new
productions, with all possible combinations of the nullable X removed.
The new CFG, without Є productions, is:
S → aX | a | Yb
X → S
Y → bY | b
Step 2 - Kill all unit productions: replacing X → S by S's productions gives:
S → aX | a | Yb
X → aX | a | Yb
Y → bY | b
Step 3 - Replace terminals in mixed strings with new variables A → a and B → b:
S → AX | YB | a
X → AX | YB | a
Y → BY | b
A → a
B → b
The grammar for Solution 2 is:
S → AA
A → B | BB
B → abB | b | bb
Solution 2
Step 1 - Kill all Є productions
There are no Є productions, so none of the non-terminals is nullable.
The CFG remains unchanged.
Step 2 - Kill all unit productions
The only unit production is A → B, where B can be replaced with all of B's non-unit productions (i.e. all of them).
The new CFG, without unit productions, is:
S → AA
A → BB | abB | b | bb
B → abB | b | bb
Step 3 - Replace all mixed strings with solid nonterminals.
Create extra productions that produce one terminal, when doing the replacement.
The new CFG, with each RHS consisting of only solid nonterminals or one terminal, is:
S → AA
A → BB | XYB | b | YY
B → XYB | b | YY
X → a
Y → b
Solution 3
Step 1 - Kill all Є productions
By inspection, the only nullable nonterminal is X. Delete all Є productions and add new
productions, with all possible combinations of the nullable X removed.
The new CFG, without Є productions, is:
S → XaX | aX | Xa | a | bX | b | Y
X → XaX | aX | Xa | a | XbX | bX | Xb | b
Y → ab
Step 2 - Kill all unit productions
The only unit production is S → Y, where Y can be replaced with all of Y's non-unit productions (i.e. ab). Furthermore, the Y-production can be completely removed, since its only purpose is to turn S into ab in one particular production sequence: S ⇒ Y ⇒ ab.
The new CFG, without unit productions, is:
S → XaX | aX | Xa | a | bX | b | ab
X → XaX | aX | Xa | a | XbX | bX | Xb | b
Step 3 - Replace all mixed strings with solid nonterminals.
Create extra productions that produce one terminal, when doing the replacement.
The new CFG, with a RHS consisting of only solid nonterminals or one terminal is:
S → XAX | AX | XA | BX | AB | a | b
X → XAX | AX | XA | XBX | BX | XB | a | b
A → a
B → b
Step 4 - Shorten the strings of nonterminals to length 2.
Create new, intermediate nonterminals to accomplish this.
The new CFG, in CNF, is:
S → XR | AX | XA | BX | AB | a | b
R → AX
X → XR | AX | XA | XQ | BX | XB | a | b
Q → BX
A → a
B → b
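Once a grammar is in CNF, membership can be decided with the CYK algorithm. The sketch below is our own and uses a small assumed CNF grammar generating { a^n b^n | n ≥ 1 } (S → AB | AC, C → SB, A → a, B → b), not the grammar derived above:

```python
from itertools import product

rules = {('A', 'B'): {'S'}, ('A', 'C'): {'S'}, ('S', 'B'): {'C'}}
terms = {'a': {'A'}, 'b': {'B'}}

def cyk(w):
    """T[i][j] = variables deriving w[i..j]; accept iff S derives the whole string."""
    n = len(w)
    if n == 0:
        return False
    T = [[set() for _ in range(n)] for _ in range(n)]
    for i, ch in enumerate(w):
        T[i][i] = set(terms.get(ch, set()))
    for span in range(2, n + 1):
        for i in range(n - span + 1):
            j = i + span - 1
            for k in range(i, j):
                for B, C in product(T[i][k], T[k + 1][j]):
                    T[i][j] |= rules.get((B, C), set())
    return 'S' in T[0][n - 1]

print(cyk('aabb'), cyk('abab'))  # True False
```

CYK runs in O(n^3) time precisely because CNF limits every production body to two variables or one terminal.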
S → aAbB
A → aA | a
B → bB | b
Solution 11
Eliminate useless symbols
• a and b generate themselves, so a and b are generating symbols.
• A → a, so A is a generating symbol.
• B → b, so B is a generating symbol.
• S → aAbB, so S is a generating symbol.
Generating symbols = {S, A, B, a, b}
• S is always reachable.
• S → aAbB, so a, A, b, B are reachable.
Reachable symbols = {S, A, B, a, b}
Since all the variables and terminals are both generating and reachable, the grammar has no useless symbols.
Eliminate є-productions: there are no nullable symbols and no є-productions.
Eliminate unit productions: there are no unit productions; the grammar has only (S,S), (A,A), (B,B) as unit pairs by the basis.
CNF Grammar:
S → C1C3
A → C1A | a
B → C2B | b
C1 → a
C2 → b
C3 → AC4
C4 → C2B
Step 2: The given grammar is in Chomsky normal form: each production is of the form A → BC or A → a. So the given grammar is in CNF.
Step 3: Rename the variables S and A as A1 and A2, with S = A1 and A = A2. After renaming, the grammar becomes:
A1 → A2A2 | 0
A2 → A1A1 | 1
Step 4: Now process each production for each variable.
(i) A1 → A2A2 | 0
Here i = 1, j = 2 and i < j (1 < 2), so leave the production:
A1 → A2A2 | 0 ......(1)
(ii) A2 → A1A1 | 1
Here i = 2, j = 1 and i > j, so apply rule 1 to replace A1 in the production body by A1's productions. The production of A2 becomes:
A2 → A2A2A1 | 0A1 | 1 ......(2)
(iii) Now consider A2 → A2A2A1 | 0A1 | 1.
Here i = j (i = 2, j = 2), so apply rule 2 (remove the left recursion) by introducing a new variable B2:
B2 → A2A1 | A2A1B2
A2 → 0A1 | 1 | 0A1B2 | 1B2 ......(3)
(iv) Now apply the productions of A2 in (3) to (1), so
A1 → A2A2 | 0
becomes
A1 → 0A1A2 | 1A2 | 0A1B2A2 | 1B2A2 | 0 ......(4)
Apply the productions of A2 in (3) to B2 in (3):
B2 → A2A1 | A2A1B2
becomes
B2 → 0A1A1 | 1A1 | 0A1B2A1 | 1B2A1 | 0A1A1B2 | 1A1B2 | 0A1B2A1B2 | 1B2A1B2
After applying all the rules of GNF, the productions of the variables A1, A2, B2 are in the required GNF form, and the resulting grammar is:
GNF Grammar
A1 → 0A1A2 | 1A2 | 0A1B2A2 | 1B2A2 | 0
A2 → 0A1 | 1 | 0A1B2 | 1B2
B2 → 0A1A1 | 1A1 | 0A1B2A1 | 1B2A1 | 0A1A1B2 | 1A1B2 | 0A1B2A1B2 | 1B2A1B2
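A useful sanity check on such a conversion is that the original grammar and the GNF grammar generate exactly the same strings. The enumeration below is our own code (sentential forms are tuples of symbols; neither grammar has erasing productions, so a length cutoff is safe):

```python
from collections import deque

def gen(prods, start, max_len):
    """All terminal strings of length <= max_len derivable from `start`."""
    out, seen, todo = set(), set(), deque([(start,)])
    while todo:
        form = todo.popleft()
        if form in seen or len(form) > max_len:
            continue
        seen.add(form)
        nts = [i for i, sym in enumerate(form) if sym in prods]
        if not nts:
            out.add(''.join(form))
            continue
        i = nts[0]
        for rhs in prods[form[i]]:
            todo.append(form[:i] + rhs + form[i + 1:])
    return out

orig = {'A1': [('A2', 'A2'), ('0',)], 'A2': [('A1', 'A1'), ('1',)]}
gnf = {'A1': [('0','A1','A2'), ('1','A2'), ('0','A1','B2','A2'), ('1','B2','A2'), ('0',)],
       'A2': [('0','A1'), ('1',), ('0','A1','B2'), ('1','B2')],
       'B2': [('0','A1','A1'), ('1','A1'), ('0','A1','B2','A1'), ('1','B2','A1'),
              ('0','A1','A1','B2'), ('1','A1','B2'),
              ('0','A1','B2','A1','B2'), ('1','B2','A1','B2')]}

print(gen(orig, 'A1', 4) == gen(gnf, 'A1', 4))  # the two grammars agree
```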
A3 → A2A3A2 | a
Now again i = 3, j = 2 and i > j (3 > 2), so apply rule 1 to replace A2 in the production body by A2's productions (A2 → A3A1 | b). A3 becomes:
A3 → A3A1A3A2 | bA3A2 | a
Here i = j (i = 3, j = 3), so apply rule 2 to introduce a new variable B3:
B3 → A1A3A2 | A1A3A2B3
A3 → bA3A2 | a | bA3A2B3 | aB3 ......(4)
(iv) Now apply the productions (4) to (2):
A2 → A3A1 | b
becomes
A2 → bA3A2A1 | aA1 | bA3A2B3A1 | aB3A1 | b ......(5)
Now apply the productions (5) to (1):
A1 → A2A3
becomes
A1 → bA3A2A1A3 | aA1A3 | bA3A2B3A1A3 | aB3A1A3 | bA3 ......(6)
Now apply the productions (6) to (3):
B3 → A1A3A2 | A1A3A2B3
becomes
B3 → bA3A2A1A3A3A2 | aA1A3A3A2 | bA3A2B3A1A3A3A2 | aB3A1A3A3A2 | bA3A3A2 | bA3A2A1A3A3A2B3 | aA1A3A3A2B3 | bA3A2B3A1A3A3A2B3 | aB3A1A3A3A2B3 | bA3A3A2B3
UNIT III
PUSHDOWN AUTOMATA
PART-A
3. What are the different types of language acceptances by a PDA and define
them.[CO3-L1]
For a PDA M=(Q, Σ ,Ґ ,δ ,q0 ,Z0 ,F ) we define :
i) Language accepted by final state, L(M):
{ w | (q0, w, Z0) ⊢* (p, Є, γ) for some p in F and γ in Γ* }
ii) Language accepted by empty/null stack, N(M):
{ w | (q0, w, Z0) ⊢* (p, Є, Є) for some p in Q }
4. Is it true that the classes of languages accepted by a PDA by empty stack and by final state are the same? [CO3-L1]
Yes. For every PDA accepting a language by final state there is a PDA accepting the same language by empty stack, and conversely.
15. Is it true that NPDA is more powerful than DPDA? Justify your answer. [CO3-L2]
Yes. An NPDA is strictly more powerful than a DPDA: there are context-free languages, such as the even-length palindromes { ww^R | w ∈ {0,1}* }, that are accepted by an NPDA but by no DPDA. Every language accepted by a DPDA is accepted by an NPDA, but not conversely. (Deterministic PDAs also correspond to unambiguous grammars, while nondeterministic PDAs may accept inherently ambiguous languages.)
16. What additional feature does a PDA have compared with an NFA? Is a PDA superior to an NFA in the sense of language acceptance? Justify your answer. [CO3-L1]
A PDA is superior to an NFA by having the following additional features:
• A stack, which is used to store the necessary symbols, while the state is used to remember the conditions.
• Two modes of language acceptance: one by reaching a final state and another by emptying the stack.
17.What are the components of PDA?[CO3-L1]
The PDA usually consists of four components:
• A control unit.
• A Read Unit.
• An input tape.
• A Memory unit.
PART-B
1. Design PDA to accept the language L={wcwR / w={0,1}*}[CO3-H3]
Solution:
The PDA P is defined as,
P = ( {q0,q1,q2}, {0,1,c}, Γ, δ, q0, z0, {q2})
The transition function is given as,
δ(q0, 0, z0) = (q0,0z0)
δ(q0, 0 ,0) = (q0,00)
δ(q0, 1 ,z0) = (q0,1z0)
δ(q0, 1 ,1) = (q0,11)
δ(q0, 0 ,1) = (q0,01)
δ(q0, 1 ,0) = (q0,10)
δ(q0, c ,z0) = (q1,z0)
δ(q0, c ,0) = (q1,0)
δ(q0, c ,1) = (q1,1)
δ(q1, 0 ,0) = (q1,ε)
δ(q1, 1 ,1) = (q1,ε)
δ(q1, ε ,z0) = (q2,ε)
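Because these moves are deterministic, the PDA can be simulated with an ordinary loop; the list-as-stack encoding below is our own sketch:

```python
def accepts(w):
    """PDA for L = { w c w^R | w in {0,1}* }: push w, switch on c, match w^R."""
    state, stack = 'q0', ['z0']
    for ch in w:
        if state == 'q0' and ch in '01':
            stack.append(ch)               # phase 1: push the symbols of w
        elif state == 'q0' and ch == 'c':
            state = 'q1'                   # centre marker: switch phase
        elif state == 'q1' and stack[-1] == ch:
            stack.pop()                    # phase 2: match w^R against the stack
        else:
            return False
    # the final move delta(q1, eps, z0) = (q2, eps) fires iff only z0 remains
    return state == 'q1' and stack == ['z0']

print(accepts('01c10'), accepts('01c01'))  # True False
```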
(Transition diagram lost in the source: a PDA with states q0-q4 and moves including a, z0/z0; b, a/a; c, a/ε; ε, z0/ε, tested on the input w = aabcc.)
Here, after entering the final state, we can have any symbols left on the stack, since we are designing the PDA to accept by final state. The transition diagram for L = { a^n b^2n | n ≥ 0 } is shown below.
Let us take w = aabbbb:
(q0, aabbbb, z0) ⊢ (q0, abbbb, aaz0)
⊢ (q0, bbbb, aaaaz0)
⊢ (q1, bbb, aaaz0)
⊢ (q1, bb, aaz0)
⊢ (q1, b, az0)
⊢ (q1, ε, z0)
⊢ (q2, ε, ε)
(Transition diagram: δ(q0, a, z0) = (q0, aaz0), δ(q0, a, a) = (q0, aaa), δ(q0, b, a) = (q1, Є), δ(q1, b, a) = (q1, Є), δ(q1, Є, z0) = (q2, Є).)
Then, while reading 'a' with 'a' as the top stack symbol, pop the symbol.
The PDA p is defined as,
6. Construct the PDA accepting the language { (ab)^n | n ≥ 1 } by empty stack. [CO3-H3-Nov/Dec 2012]
The transition function is defined as,
δ (q0, Є, z0) = (q3, Є)
δ (q0, a ,z0) = (q1,z0)
δ (q1, a ,z0) = (q2,az0)
δ (q1, a ,a) = (q2, aa)
δ (q2, b ,a) = (q2,Є)
δ (q2, a, a) = (q1, a)
δ (q2, Є ,z0) = (q3, Є)
w = abab
2. δ(q,1,X) = {(q,XX)}
P7 : [q,X,q] → 1[q,X,q][q,X,q]
P8 : [q,X,q] → 1[q,X,P][P,X,q]
P9 : [q,X,P] → 1[q,X,q][q,X,P]
P10 : [q,X,P] → 1[q,X,P][P,X,P]
3. δ(q,0,X) = {(P,X)}
P11 : [q,X,q] → 0[P,X,q]
P12 : [q,X,P] → 0[P,X,P]
4. δ(q,ε,X) = {(q,ε)}
P13 : [q,X,q] → ε
5. δ(P,1,X) = {(P,ε)}
P14 : [P,X,P] → 1
6. δ(P,0,Z0) = {(q,Z0)}
P15 : [P,Z0,q] → 0[q,Z0,q]
P16 : [P,Z0,P] → 0[q,Z0,P]
P16 : [P, z0,P] 0 [q, z0,P]
Solution:
2. ⊢ (q, aa×a0, Ia×E)
3. ⊢ (q, aa×a0, aa×E)
4. ⊢ (q, a×a0, a×E)
5. ⊢ (q, ×a0, ×E)
6. ⊢ (q, a0, E)
7. ⊢ (q, a0, I0)
8. ⊢ (q, a0, a0)
9. ⊢ (q, 0, 0)
10. ⊢ (q, ε, ε)
w = aa×a0
Thus the CFG generates the string aa×a0, and it is accepted by the PDA by empty stack.
W = (a0+a)
(q, (a0+a), E) ⊢ (q, (a0+a), (E))
⊢ (q, a0+a), E))
⊢ (q, a0+a), E+E))
⊢ (q, a0+a), I0+E))
10. Construct the CFG for L = { 0^n 1 0^n | n ≥ 0 } and use it to construct a PDA. [CO3-H3]
Solution:
A grammar for L is
S → 0S0 | 1
The PDA obtained from this CFG by the standard construction (accepting by empty stack) is:
δ(q, ε, S) = {(q, 0S0), (q, 1)}
δ(q, 0, 0) = {(q, ε)}
δ(q, 1, 1) = {(q, ε)}
For example, w = 00100 is derived as S ⇒ 0S0 ⇒ 00S00 ⇒ 00100, and the string derived by the CFG is accepted by the PDA by empty stack.
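As a check on the construction, we can enumerate everything a grammar for this language generates. The code below is our own and uses S → 0S0 | 1, a grammar for L = { 0^n 1 0^n | n ≥ 0 }:

```python
from collections import deque

def strings(max_len):
    out, seen, todo = set(), set(), deque(['S'])
    while todo:
        s = todo.popleft()
        if s in seen or len(s) > max_len:
            continue
        seen.add(s)
        if 'S' not in s:
            out.add(s)
            continue
        for rhs in ('0S0', '1'):               # the two productions
            todo.append(s.replace('S', rhs, 1))
    return out

print(sorted(strings(5), key=len))  # every string has the shape 0^n 1 0^n
```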
iii. Since both conditions are true, for all i ≥ 0 the string z = u v^i w x^i y must also be in L.
Take u = a^n, vwx = b^n with vx = b^(n-m), and y = c^n. Then
z = u v^i w x^i y = a^n b^n (b^(n-m))^(i-1) c^n
Put i = 0:
z = a^n b^n (b^(n-m))^(-1) c^n = a^n b^m c^n, which is not in L.
Put i = 1:
z = a^n b^n c^n, which is in L.
Put i = 2:
z = a^n b^(2n-m) c^n, which is not in L.
Since for i = 0 and i = 2 the string does not belong to the language, we have a contradiction. So the language L = {a^n b^n c^n | n ≥ 0} is not a context-
free language.
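The contradiction can also be checked numerically. The helper below is our own code: it pumps one concrete decomposition (n = 4, m = 2) and tests membership in L = { a^n b^n c^n }:

```python
def in_L(s):
    """Membership test for L = { a^n b^n c^n }."""
    n = len(s) // 3
    return s == 'a' * n + 'b' * n + 'c' * n

n, m = 4, 2
# one valid split: u = a^n, v = b^(n-m) inside the b-block, y = the rest
u, v, w, x, y = 'a' * n, 'b' * (n - m), '', '', 'b' * m + 'c' * n

for i in (0, 1, 2):
    z = u + v * i + w + x * i + y
    print(i, in_L(z))   # only i = 1 stays in L
```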
Take u = a^n, vwx = b^n with vx = b^(n-m), and y = c^n d^n.
Now check:
z = uvwxy = a^n b^n c^n d^n
So our assumption is correct. The two conditions are:
i. |vwx| ≤ n, since |b^n| ≤ n.
For all i ≥ 0, z = a^n b^n (b^(n-m))^(i-1) c^n d^n must be in L.
Put i = 0:
z = a^n b^m c^n d^n, which is not in L.
Put i = 1:
z = a^n b^n c^n d^n, which is in L.
Put i = 2:
z = a^n b^(2n-m) c^n d^n, which is not in L.
Since for i = 0 and i = 2 the string does not belong to the language, we have a contradiction. So the language L = {a^n b^n c^n d^n | n ≥ 0} is not a context-free language.
y = 0^(p-(q+r))
Now check:
z = uvwxy = 0^q 0^r 0^(p-(q+r)) = 0^p
For all i, z = 0^p (0^s)^(i-1), where s = |vx| = q + r.
Put i = 0:
z = 0^(p-s), which is not in L.
Put i = 1:
z = 0^p, which is in L.
Put i = 2:
z = 0^(p+s)
15. Prove that if L = N(PN) for some PDA PN = (Q, Σ, Γ, δN, q0, z0), then there is a PDA PF such that L = L(PF). [CO3-H3-Nov/Dec 2003, 2004, 2005, 2006, 2007] (Apr/May 2005)
Converting a Language Accepted by Empty Stack to Final State
Generally it is possible to show that the class of languages that are L(P) for some PDA P is the same as the class of languages that are N(P) for some PDA P. Here we start from a PDA PN that accepts a language by empty stack, and from it construct a PDA PF that accepts L by final state.
Theorem:
If L = N(PN) for some PDA PN = (Q, Σ, Γ, δN, q0, z0), then there is a PDA PF such that L = L(PF).
To prove: If there exists a PDA PN that accepts a language by empty stack, then there exists a PDA PF that accepts the same language by reaching a final state.
Proof: To prove this theorem, we use a new symbol x0 which must not be a symbol of Γ (i.e., x0 ∉ Γ).
• We simulate PN inside PF; that is, we construct PF from PN.
• x0 is used as the starting top stack symbol of PF,
• and x0 is the marker at the bottom of the stack for PN.
• PN goes on processing the input as long as it sees any symbol on the stack other than x0.
• If PN sees x0, its own stack is empty, so it has finished processing the string.
• So we construct PF with a new start state p0 and a new final state pf.
• p0, the start state of PF, pushes z0x0 onto the stack, so that x0 is at the bottom of the stack, and then enters q0, the start state of PN.
So from the above construction, w is in L(PF) if and only if w is in N(PN). Combining all the moves, the instantaneous descriptions of PF while simulating PN are:
(p0, w, x0) ⊢ (q0, w, z0x0) ⊢* (q, є, x0) ⊢ (pf, є, є)
(Fig.) PN empties its stack if and only if the PDA PF enters its accepting state.
For the converse construction (final state to empty stack), the new initial state p0 of PN pushes z0, the start stack symbol of PF, onto the stack, so that x0 is at the bottom and z0, the top stack symbol for PF, is on top. After pushing z0, p0 enters q0, the initial state of PF. PF then consumes its input w and enters one of its final states.
For each accepting state of PF, add an є-transition to the new state p, which pops the entire stack.
Here
p0 - initial state
x0 - starting stack symbol
and the transition function δN is defined by:
1. δN(p0, є, x0) = {(q0, z0x0)}: push the start symbol z0 of PF onto the stack and enter q0, the initial state of PF.
2. For all states q in Q, input symbols a in Σ or a = є, and Y in Γ, δN(q, a, Y) contains every pair of δF(q, a, Y), since PN simulates PF.
3. For all accepting states q in F and stack symbols Y in Γ or Y = x0, δN(q, є, Y) contains (p, є).
4. For all stack symbols Y in Γ or Y = x0, δN(p, є, Y) = {(p, є)}.
So from the above construction, we can conclude that w is in N(PN) if and only if w is in L(PF). Combining all the moves, the instantaneous descriptions of PN while simulating PF are:
(p0, w, x0) ⊢ (q0, w, z0x0) ⊢* (q, є, γx0) ⊢* (p, є, є)
UNIT IV
TURING MACHINES
PART A
δ(q, Ai) = (q1, Y, L) means: move left, entering state q1, with output Y on the tape:
A1 A2 … Ai-1 q Ai Ai+1 … An ├ A1 A2 … Ai-2 q1 Ai-1 Y Ai+1 … An
4. Explain the basic Turing machine model and explain one move. What actions take place in a move of a Turing machine? [CO4-L2-Dec-04,May-05]
In one move there are three cases: the tape head is pointing at an intermediate cell, at the first cell, or at the last cell.
5. What are the two classes of problems that are solved by the Turing machine? [CO4-L1]
The two classes of problems that can be solved by Turing machines are as follows:
1. Those problems that have an algorithm.
2. Those that can only be solved by a Turing machine, with no guarantee of halting.
The multitape Turing machine is a type of Turing machine in which there is more than one input tape. Each tape is divided into cells, and each cell can hold any symbol of the finite tape alphabet. The multitape Turing machine is as shown in the figure. This TM accepts the same class of languages as the basic Turing machine, but it can work faster, because the finite control reads more than one input tape and more symbols can be scanned at a time.
23. Design a Turing machine, with as few states as possible, that accepts the language a(a+b)*. Assume ∑ = {a,b}. [CO4-H3-May-05]
Let r.e. = a(a+b)*. The corresponding Turing machine has the transitions
δ(q0, a) = (q1, a, R)
δ(q1, a) = (q1, a, R)
δ(q1, b) = (q1, b, R)
δ(q1, Δ) = (Halt, Δ, R)
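The machine above is easiest to check by running it. Below is a minimal one-tape TM simulator in Python with the transition table for a(a+b)*; the state names q0, q1, halt and the blank symbol "_" are our own labels, not fixed by the notes.

```python
# A sketch of a one-tape TM simulator running the machine for a(a+b)*:
# q0 demands a leading a, q1 then skips any mix of a's and b's, and
# reading the blank "_" in q1 accepts.

BLANK = "_"

# DELTA[(state, symbol)] = (next_state, write_symbol, move)
DELTA = {
    ("q0", "a"): ("q1", "a", "R"),
    ("q1", "a"): ("q1", "a", "R"),
    ("q1", "b"): ("q1", "b", "R"),
    ("q1", BLANK): ("halt", BLANK, "R"),
}

def accepts(w):
    tape = list(w) + [BLANK]
    head, state = 0, "q0"
    while state != "halt":
        key = (state, tape[head])
        if key not in DELTA:          # no transition defined: reject
            return False
        state, tape[head], move = DELTA[key]
        head += 1 if move == "R" else -1
        if head == len(tape):         # extend the tape on demand
            tape.append(BLANK)
    return True
```

Any string not starting with a is rejected immediately, since q0 has no transition for b or the blank.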
24. Design a TM that accepts the language of odd integers written in binary. [CO4-H3]
This is a simple TM whose logic is: a binary string that ends with 1 is always an odd integer. Hence the TM has the transitions
δ(q0, 0) = (q0, 0, R)
δ(q0, 1) = (q1, 1, R)
δ(q1, 0) = (q0, 0, R)
δ(q1, 1) = (q1, 1, R)
δ(q1, Δ) = (Halt, Δ, R)
25. Design a TM for finding the 1's complement of a given binary number. [CO4-H3-Dec-11]
δ(q0, 1) = (q0, 0, R)
δ(q0, 0) = (q0, 1, R)
δ(q0, Δ) = (Halt, Δ, R)
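The 1's-complement machine is a single working state that flips the scanned bit and moves right, halting on the first blank. The direct loop below (a sketch, with our own function name) mirrors that one left-to-right pass.

```python
# Sketch of the 1's-complement TM: flip each bit while sweeping right,
# halt on the blank at the end of the input.

def ones_complement(bits):
    tape = list(bits)
    for i, c in enumerate(tape):          # state q0: flip and move right
        tape[i] = "0" if c == "1" else "1"
    return "".join(tape)                  # blank reached: halt
```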
A Turing machine can model even recursively enumerable languages. Thus the advantage of the Turing machine is that it can model all the computable functions as well as the languages for which an algorithm is possible.
27. What is the comparison of FM, PDA and TM? [CO4-L1]
We have discussed three models: finite automata or Finite Machines (FM), Pushdown Automata (PDA) and Turing Machines (TM). The comparison between these models is as follows:
1. The finite machine is of two types: the deterministic finite state machine and the non-deterministic finite state machine. Both DFA and NFA accept regular languages only. Hence both machines have equal power, i.e. DFA = NFA.
2. The pushdown automaton likewise comes in two variants: the deterministic PDA and the non-deterministic PDA. The advantage of the PDA over the FA is that the PDA has a memory (the stack), and hence the PDA accepts a larger class of languages than the FA; so the PDA has more power.
PART B
….. a b c ∆ ∆ ∆ ∆ ∆ ∆ …..
(Fig: input tape)
3. The finite control and the tape head, which is responsible for reading the current input symbol. The tape head can move to the left or right.
4. A finite set of states through which the machine has to pass.
5. A tape divided into cells:
….. ∆ ∆ a b A A B b a ∆ ∆ ∆ …..
(Fig: Turing machine)
6. A finite set of symbols, called external symbols, which are used in building the logic of the Turing machine.
2. State the techniques for Turing machine construction. Illustrate with a simple language. [CO4-H3-Dec-11] (or)
Write briefly about the programming techniques for TMs. (Dec-12, May-13)
… the move will be in the right direction. After converting the given string to upper case we reach the final state q3, which is a halt state.
2. Multiple tracks
If the input tape is divided into multiple tracks then the input tape will be as follows.
For example:
# 1 1 1 1 1 $
B B B B 1 1 B ……
B 1 1 1 B B B
(Fig: a three-track tape attached to the finite control)
As shown in the figure, the input tape has multiple tracks. The input placed on the first track is surrounded by # and $. The unary number equivalent to 5 is placed on the first track; on the second track, unary 2 is placed. If we construct a TM which subtracts 2 from 5, we get the answer on the third track, and that is 3, in unary form. Thus this TM subtracts two unary numbers with the help of multiple tracks.
3. Checking Off Symbols
Checking off symbols is an effective way of recognizing a language by a TM. The symbols are placed on the input tape, and each symbol that has been read is marked with a special character.
The tape head can move to the right or left. Let us take an example and see how to build a Turing machine by checking off symbols.
Example: Construct a Turing machine M = (Q, ∑, Γ, δ, q0, B, F) which recognizes the language L = { wcw | w ∈ (a+b)+ }.
Solution: In this language the input set is ∑ = {a,b}. The string placed on the input tape has two identical parts separated by the letter c.
In checking off symbols, each symbol is marked with a special character. The simple logic in the construction of this TM is: mark the first letter, then move to the right until we reach c; the first unmarked letter after c is compared with the marked letter. If it is the same as the one we marked, then mark this symbol too; otherwise go to the reject state. It can be shown as below:
* * a c * b a ∆ ∆ …
Now the machine goes to the accept state. Thus in this TM we scan each symbol and try to recognize the string.
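The checking-off procedure can be sketched directly: mark a symbol on the left of c, find and mark its partner on the right, and accept only when everything is checked. The helper below (its name and the mark "*" are our choices) simulates that tape algorithm on a Python list rather than via a full transition table.

```python
# Sketch of checking off symbols for L = { wcw : w in (a+b)+ }:
# "*" plays the role of the check mark.

def in_wcw(s):
    if s.count("c") != 1:
        return False
    left, right = s.split("c")
    if not left or any(ch not in "ab" for ch in left + right):
        return False
    tape = list(s)
    mid = s.index("c")
    for i in range(len(left)):
        j = mid + 1 + i
        if j >= len(tape) or tape[j] != tape[i]:
            return False              # mismatch: reject
        tape[i] = tape[j] = "*"       # check off both copies
    return all(t == "*" for t in tape[mid + 1:])  # nothing left unmatched
```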
4. Subroutines
In high-level languages, the use of subroutines builds modularity into the program development process. The same concept can be introduced in the construction of TMs: we can design a subroutine as a Turing machine and call it from another. Let us see how it works with the help of an example.
∆ 0 1 1 0 1 1 ∆ …
Proof:
As the theorem states, if any language is recognized by a TM with a one-way infinite tape then it is also recognized by a TM with a two-way infinite tape.
Let M1 be a TM with a one-way infinite tape, denoted by
M1 = (Q1, ∑1, Γ1, δ1, q1, B1, F1)
Similarly, let M2 be a TM with a two-way infinite tape:
M2 = (Q2, ∑2, Γ2, δ2, q2, B2, F2)
The input tapes are as shown in the figure.
…. ∆ a5 a4 a3 a2 a1 a0 ∆ …
In the figure, the TM with a one-way infinite tape has an external symbol # placed in the very first cell from the left side. This symbol is used as an indicator of the left-side termination. If we want to read the sequence a0, a1, a2, a3, a4, a5 then on the two-way infinite tape the tape head is fixed at the rightmost symbol.
The TM M2 can be given by the transitions
δ(q0, a0) = (q1, a0, L)
δ(q1, a1) = (q2, a1, L)
δ(q2, a2) = (q3, a2, L)
δ(q3, a3) = (q4, a3, L)
δ(q4, a4) = (q5, a4, L)
δ(q5, a5) = (q6, a5, L)
δ(q6, ∆) = (HALT, ∆, S), which leads to the HALT state.
Even we can make the TM with one way infinite tape as a multitrack tape to
simulate it as a two way infinite tape. Let us now solve some interesting problems
to see the working of two way infinite tape in the turing machine.
In one move the heads may move left, right or remain stationary.
This type of Turing machine is as powerful as the one-tape Turing machine.
The multi-head Turing machine is as shown in the following figure.
(Fig: a finite control with heads 1 … n on a single tape, producing accept/reject)
2. Multi-tape Turing machine
The multitape Turing machine is a type of Turing machine in which there is more than one input tape. Each tape is divided into cells and each cell can hold any symbol of the finite tape alphabet. The multitape Turing machine is as shown in the figure. This TM accepts the same class of languages as the basic Turing machine, but it can work faster, because the finite control reads more than one input tape and more symbols can be scanned at a time.
(Fig: one finite control attached to input tape 1 and input tape 2)
6. Show that there exists a TM for which the halting problem is unsolvable. (May-08, Dec-10) (or)
Prove that the halting problem is undecidable. [CO4-L3-Dec-12]
Halting Problem
To state the halting problem we will consider a given configuration of a Turing machine. The output of the TM can be:
i) Halt: the machine starting at this configuration will halt after a finite number of steps.
ii) No Halt: the machine starting at this configuration never reaches a halt state, no matter how long it runs.
Now the question arises, based on these two observations: given any functional matrix, input data tape and initial configuration, is it possible to determine whether the process will ever halt? This is called the halting problem. That means we are asking for a procedure which enables us to solve the halting problem for every pair (machine, tape). The answer is ―no‖: the halting problem is unsolvable. Now we will prove why it is unsolvable. Suppose there exists a TM M1 which decides whether or not any computation by a TM T will ever halt, when a description dT of T and a tape t of T are given [that means the input to machine M1 is a (machine, tape) pair]. Then for every input (t, dT) to M1: if T halts for input t, M1 halts in its accept halt; similarly, if T does not halt for input t, then M1 halts in its reject halt.
(Fig: M1 takes the pair (t, dT); it reaches its accept halt when T halts for t, and its reject halt when T does not halt for t)
Now we consider another Turing machine M2 which takes an input dT. It first copies dT, duplicating dT on its tape, and then this duplicated tape information is given as input to machine M1. But machine M1 is modified so that whenever M1 is supposed to reach an accept halt, M2 loops forever. Hence the behavior of M2 is as given: it loops if T halts for input t = dT, and halts if T does not halt for t = dT. Here T is an arbitrary Turing machine.
(Fig: M2 copies dT into the pair (dT, dT) and feeds it to the modified M1; M2 loops when T halts for t = dT, and halts when T does not)
As M2 itself is a Turing machine, we will take T = M2; that means we replace T by M2 in the machine given above.
(Fig: M2 run on its own description dM2 — it loops if M2 halts for input dM2, and halts if M2 does not halt for input dM2)
Thus machine M2 halts for input dM2 if and only if M2 does not halt for input dM2. This is a contradiction. That means a machine M1 which can tell whether any other TM will halt on a particular input does not exist. Hence the halting problem is unsolvable.
This is a hierarchy; therefore every language of type 3 is also of type 2, 1 and 0. Similarly, every language of type 2 is also of type 1 and 0, and so on.
(Fig: Chomsky hierarchy — regular languages ⊂ context-free ⊂ context-sensitive ⊂ computable (type 0) languages)
Regular languages are those languages which can be described using regular expressions. These languages can be modeled by an NFA or DFA.
The context-free languages are the languages which can be represented by a context-free grammar (CFG). The production rule is of the form
A → α
where A is any single non-terminal and α is any combination of terminals and non-terminals.
An NFA or DFA cannot recognize all strings of these languages because these automata do not have a ―stack‖ to memorize; instead, the Pushdown Automaton is used to represent these languages.
The context-sensitive grammars are used to represent context-sensitive languages. A context-sensitive grammar follows these rules:
1. The context-sensitive grammar may have more than one symbol on the left-hand side of its production rules.
2. The number of symbols on the left-hand side must not exceed the number of symbols on the right-hand side.
3. The rule of the form A → є is not allowed unless A is the start symbol and A does not occur on the right-hand side of any rule.
The automaton which recognizes context-sensitive languages is called the linear bounded automaton. While deriving using a context-sensitive grammar, the sentential form never shrinks in length when a production rule is applied. Thus the size of a sentential form is bounded by the length of the sentence we are deriving.
(Fig: transition diagram of a TM over {a,b} that marks a's as A and b's as B, with states q0–q3 and Halt; moves include (a,A,R), (b,B,L), (a,a,R), (B,B,L), (B,B,R), (A,A,R) and (Δ,Δ,L) into Halt)
(Fig: transition diagram of a TM over {a,b,c} that marks a→A, b→B and c→C, with states q0–q4 and Halt)
(Fig: the corresponding transition diagram over {0,1}, using the marks A, B and C, with states q0–q4 and Halt)
(Fig: transition diagram of a TM for unary subtraction, checking off 1's with *, with states q0–q4 and Halt)
13.Construct TM for the addition function for the unary number system. [CO4-H3-
May -07]
δ(q0, 1) = (q0, 1, R)
δ(q0, +) = (q1, 1, R) (replace + by 1)
δ(q1, 1) = (q1, 1, R)
δ(q1, Δ) = (q2, Δ, L) (end of input: move back)
δ(q2, 1) = (q3, Δ, R) (erase one surplus 1)
δ(q3, Δ) = (Halt, Δ, R)
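The addition machine rewrites + as 1 and then erases one trailing 1, so m ones and n ones become exactly m+n ones. The sketch below runs the transitions as a dictionary; state names and the blank "_" are our labels.

```python
# Sketch of the unary-addition TM: 1^m + 1^n  ->  1^(m+n).

ADD_DELTA = {
    ("q0", "1"): ("q0", "1", "R"),
    ("q0", "+"): ("q1", "1", "R"),   # replace + by 1
    ("q1", "1"): ("q1", "1", "R"),
    ("q1", "_"): ("q2", "_", "L"),   # end of input: back up
    ("q2", "1"): ("halt", "_", "R"), # erase one surplus 1
}

def add_unary(m, n):
    tape = list("1" * m + "+" + "1" * n) + ["_"]
    head, state = 0, "q0"
    while state != "halt":
        state, tape[head], move = ADD_DELTA[(state, tape[head])]
        head += 1 if move == "R" else -1
    return "".join(tape).strip("_").count("1")
```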
14.Construct a TM for a successor function for a unary number i.e. f(n) = n+1.
[CO4-H3]
δ(q0, 1) = (q0, 1, R)
δ(q0, Δ) = (q1, 1, R) (append one more 1)
δ(q1, Δ) = (Halt, Δ, R)
16. Design a TM to accept the language L = {0^n 1^n | n ≥ 1} and simulate its action on the input 0011. [CO4-H3-May-14]
δ(q0, 0) = (q1, A, R) (mark a 0)
δ(q1, 0) = (q1, 0, R)
δ(q1, B) = (q1, B, R)
δ(q1, 1) = (q2, B, L) (mark the matching 1)
δ(q2, B) = (q2, B, L)
δ(q2, 0) = (q2, 0, L)
δ(q2, A) = (q0, A, R)
δ(q0, B) = (q3, B, R) (all 0's matched)
δ(q3, B) = (q3, B, R)
δ(q3, Δ) = (Halt, Δ, L)
On the input 0011 the machine alternately marks the leftmost 0 as A and the leftmost 1 as B (0011 → A011 → A0B1 → AAB1 → AABB), then sweeps right over the B's and halts in the accepting state.
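The simulation on 0011 can be reproduced with a small interpreter for the table above; "_" is our blank symbol, and rejection happens when no transition applies.

```python
# Sketch: running the 0^n 1^n machine, with the table above as data.

TABLE = {
    ("q0", "0"): ("q1", "A", "R"),
    ("q1", "0"): ("q1", "0", "R"),
    ("q1", "B"): ("q1", "B", "R"),
    ("q1", "1"): ("q2", "B", "L"),
    ("q2", "B"): ("q2", "B", "L"),
    ("q2", "0"): ("q2", "0", "L"),
    ("q2", "A"): ("q0", "A", "R"),
    ("q0", "B"): ("q3", "B", "R"),
    ("q3", "B"): ("q3", "B", "R"),
    ("q3", "_"): ("halt", "_", "L"),
}

def run(w):
    tape, head, state = list(w) + ["_"], 0, "q0"
    while state != "halt":
        key = (state, tape[head])
        if key not in TABLE:   # no move defined: reject
            return False
        state, tape[head], move = TABLE[key]
        head += 1 if move == "R" else -1
    return True
```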
Figure
(Fig: transition diagram with states q0–q6 and halt state ha, over moves such as a/a,R, b/b,R, a/Δ,L, b/Δ,R and Δ/Δ,R)
UNIT V
UNSOLVABLE PROBLEMS AND COMPUTABLE FUNCTIONS
PART A
4. When is a language said to be recursive? Is it true that every regular set is not recursive? [CO5-L2-Dec-05]
A language is said to be recursive if there exists a Turing machine that accepts every string belonging to that language and rejects every string not belonging to it. No, it is not true: every regular set is recursive, since a DFA for it halts on all inputs.
Recursively Enumerable
A language is recursively enumerable if there exists a Turing Machine that
accepts every string of the language and does not accept strings that are not in
the language.
(Fig: a TM takes an input string w and either accepts it or loops for ever)
A TM for an RE language may not halt on every input; it may fall into an infinite loop.
There exist RE languages L whose complement L‘ is not RE. If L and L‘ are both recursively enumerable, then L is definitely a recursive language.
Recursive sets
A language is said to be recursive if there exists a Turing machine that accepts every string belonging to that language and rejects every string not belonging to it; such a machine halts on all inputs. In particular, every regular set is recursive.
P class problems – problems that can be solved in ―polynomial time‖ are called P class problems. For example, the sorting and searching problems.
NP class problems – problems that can be solved in non-deterministic polynomial time. For example, the travelling salesperson problem and the graph coloring problem. NP class problems can further be NP-complete and NP-hard problems.
17. Define (a) Recursively Enumerable languages (b) Recursive Sets? [CO5-L1]
The languages accepted by Turing machines are called recursively enumerable (RE),
and the subset of RE languages that are accepted by a TM that always halts are called
recursive. Enumerable means that the strings in the language can be enumerated by
the TM. The class of recursively enumerable languages includes CFL‘s.
The recursive sets include languages accepted by at least one TM that halts on all
inputs.
Universal TMs are TMs that can be programmed to solve any problem which can be solved by any Turing machine. A specific universal Turing machine U is:
Input to U: the encoding ―M‖ of a TM M and the encoding ―w‖ of a string w.
Behavior: U halts on input ―M‖ ―w‖ if and only if M halts on input w.
The second tape of U is used to hold the simulated tape of M, using the same format as for the code of M. That is, tape symbol Xi of M is represented by 0^i, and tape symbols are separated by single 1's. The third tape of U holds the state of M, with state qi represented by i 0's. A sketch of U is in the figure.
23. What properties of recursive enumerable sets are not decidable? [CO5-L1]
o Emptiness
o Finiteness
o Regularity
o Context-freedom.
o L = Σ*.
o L is recursive.
o L is not recursive.
o L is a singleton.
o L is a regular set.
o L - Lu ≠ Φ.
Notice that the pair (w1, x1) is forced to be at the beginning of the two strings, even though the index 1 is not written at the front of the solution list. Also, unlike PCP, where the solution has to have at least one integer in the solution list, in MPCP the empty list could be a solution if w1 = x1 (but those instances are rather uninteresting and will not figure in our use of MPCP).
38. Let A and B be lists of three strings each, as defined in the following table?
[CO5-L2]
That is, w2w1w1w3 = x2x1x1x3 = 101111110. Note this solution is not unique; for instance, 2, 1, 1, 3, 2, 1, 1, 3 is another solution.
40. What are the properties of recursive and recursively enumerable languages? [CO5-L1]
1. The complement of a recursive language is recursive.
2. The union of two recursive languages is recursive.
3. The union of two RE languages is RE.
4. If a language L and its complement L‘ are both RE, then L is recursive.
In the theory of computation we often come across problems that are answered either yes or no. The class of problems for which a procedure can always produce the answer is called solvable or decidable; otherwise the class of problems is said to be unsolvable or undecidable.
PART B
Initial Functions
The initial functions are the elementary functions whose values are independent of their smaller arguments. The following functions comprise the base of the class of recursive functions:
The zero function: Z(X) = 0
The successor function: S(X) = successor of X (roughly, ―X+1‖)
The identity function: id(X) = X
The zero function returns zero regardless of its argument.
The successor function returns the successor of its argument; successorship is taken as a primitive notion.
The zero and successor functions take only one argument each. But the identity function is designed to take any number of arguments. When it takes one argument (as above) it returns its argument as its value. When it takes more than one argument, it returns one of them (a projection). That means,
id1(X,Y) = X
id2(X,Y) = Y
Building operations: We will build more complex functions from the initial set by using only three methods:
i) Composition
ii) Primitive recursion
iii) Minimization
i) Composition
We start with the successor function,
S(X) = X+1
Then we may replace its argument, X, with a function. If we replace the argument X with the zero function
Z(X)
then the result is the successor of zero:
S(Z(X)) = 1
S(S(Z(X))) = 2, and so on.
In this way, with the help of the initial functions, we can describe the natural numbers. This building operation is called ―composition‖. It should be clear that when composition is applied to computable functions, only computable functions will result.
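The initial functions and composition can be written out directly. In this sketch, proj is our own name for the multi-argument identity (projection); composing succ with zero builds the numerals, as in the text.

```python
# Sketch: the initial functions of the recursive-function hierarchy.

def zero(*args):
    return 0            # Z(X) = 0, regardless of the argument

def succ(x):
    return x + 1        # S(X) = X + 1

def proj(i):
    # id_i: returns the i-th of its arguments (1-indexed)
    return lambda *args: args[i - 1]
```

Composition then gives succ(zero(0)) for S(Z(X)) = 1, succ(succ(zero(0))) for 2, and so on.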
ii) Primitive recursion
The second building operation, called primitive recursion, is a method of defining new functions from old functions. The function h is defined through functions f and g by primitive recursion when
h(x, 0) = f(x)
h(x, S(y)) = g(x, h(x, y))
where f and g are known computable functions. There are two equations. When h's second argument is zero, the first equation applies; when it is not zero, we use the second. The use of the successor function in the second equation enforces the condition that the argument be greater than zero. Hence the first equation applies in the minimal case, and the second applies in every other case.
Thus the function obtained will be computable in nature. For example, we can calculate the factorial function using recursion.
Initially 1! = 1, and to calculate n! we do not multiply n by (n-1), which would not be recursive; instead we multiply n by (n-1)!:
n! = n * (n-1)!
A strict definition of the factorial function f(n), then, consists of these two equations:
f(n) = 1, when n = 1 …(1)
f(n) = n * f(n-1), when n > 1 …(2)
Consider n= 5 then using equation 2 we will get,
f(5) = 5* f(4)
f(4) = 4* f(3)
f(3) = 3* f(2)
f(2) = 2* f(1)
By putting the value of equation 1 for calculating f(2) we will get,
f(2) = 2* 1 = 2
Then f(3) = 3* f(2)
= 3*2 = 6
Then f(4) = 4* f(3)
= 4*6 = 24
Then f(5) = 5* f(4)
= 5*24 = 120
Primitive recursion is like mathematical induction. The first equation defines the basis,
and the second defines the induction step.
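The schema and the factorial equations can be sketched directly. Here primitive_recursion is our name for the combinator; addition is shown because it fits the two-equation schema exactly, while factorial follows equations (1) and (2) above.

```python
# Sketch of the primitive-recursion schema:
#   h(x, 0)    = f(x)
#   h(x, S(y)) = g(x, h(x, y))

def primitive_recursion(f, g):
    def h(x, y):
        return f(x) if y == 0 else g(x, h(x, y - 1))
    return h

# Addition fits the schema: add(x, 0) = x,  add(x, y+1) = S(add(x, y))
add = primitive_recursion(lambda x: x, lambda x, prev: prev + 1)

def fact(n):
    if n == 1:
        return 1              # equation (1): the basis
    return n * fact(n - 1)    # equation (2): the induction step
```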
iii) Minimization
If g is the function that computes the least x such that f(x) = 0, then g is said to be produced from f by minimization, and g is computable. But we can build g by minimization only if f is already known to be computable.
For example: suppose we want to obtain the least x which makes f(x) = 0. Then we try the natural numbers 0, 1, 2, … until we reach the first value that gives f(x) = 0. If such a search for x never terminates, it is called unbounded minimization. While unbounded minimization has the disadvantage of a partial function which may never terminate, bounded minimization has the disadvantage of sometimes failing to minimize.
Hence A(1,1) = 3
To compute A(1,2) put x = 0 and y = 1
A(1,2) = A(0+1,1+1)
= A(0, A(1,1))
= A(0,3) = 4
To compute A(2,1) put x = 1 and y = 0
A(2,1) = A(1+1,0+1)
= A(1, A(2,0))
= A(1, A(1,1))
= A(1,3)
= A(0+1,2+1)
= A(0, A(1,2))
= A(0,4)
=5
To compute A(2,2) put x = 1 and y = 1
A(2,2) = A(1+1,1+1)
= A(1, A(2,1))
= A(1,5)
Now we will compute A(1,5) where in x = 0 and y = 4
A(1,5) = A(0+1,4+1)
= A(0, A(1,4))
= A(0, A(0+1,3+1))
= A(0, A(0,A(1,3)))
= A(0, A(0,A(0+1,2+1)))
= A(0, A(0,A(0,A(1,2))))
= A(0, A(0,A(0,4)))
= A(0, A(0,5))
= A(0,6)
A(1,5) = 7
Hence A(2,2) = A(1,5)
A(2,2) = 7
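The hand computation above can be checked mechanically with the same recurrences it uses: A(0,y) = y+1, A(x+1,0) = A(x,1) and A(x+1,y+1) = A(x, A(x+1,y)).

```python
# Sketch: Ackermann's function, as used in the computation above.

def A(x, y):
    if x == 0:
        return y + 1               # A(0, y) = y + 1
    if y == 0:
        return A(x - 1, 1)         # A(x+1, 0) = A(x, 1)
    return A(x - 1, A(x, y - 1))   # A(x+1, y+1) = A(x, A(x+1, y))
```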
3. Show that if a language L and its complement L‟ are both recursively
enumerable then L is recursive. (Nov/Dec 2003) (Nov/Dec 2004) (Apr/May2005)
(May/June 2006) (Nov/Dec 2006) (May/June 2007) (Nov/Dec 2007) [CO5-L2]
1 Recursive Languages
We call a language L recursive if L = L(M) for some Turing machine M that is guaranteed to halt on every input, whether or not it accepts that input.
Figure (9.2) Relationship between the recursive, RE, and non-RE languages
The existence or nonexistence of an algorithm to solve a problem is often of
more importance than the existence of some TM to solve the problem. As mentioned
above, the Turing machines that are not guaranteed to halt may not give us enough
information ever to conclude that a string is not in the language, so there is a sense in
which they have not "solved the problem."
Thus, dividing problems or languages between the decidable — those that are
solved by an algorithm — and those that are undecidable is often more important than
the division between the recursively enumerable languages (those that have TM's of
some sort) and the non-recursively-enumerable languages (which have no TM at all).
We have positioned the non-RE language Ld properly, and we also show the language Lu, or ―universal language‖, that we shall prove not to be recursive, although it is RE.
We shall show that the recursive languages are closed under complementation. Thus, if a language L is RE but L‘, the complement of L, is not RE, then we know L cannot be recursive. For if L were recursive, then L‘ would also be recursive and thus surely RE.
Proof: Let L = L(M) for some TM M that always halts. We construct a TM M‘ such that L‘ = L(M‘), as follows:
1. The accepting states of M are made non-accepting states of M‘ with no transitions; i.e., in these states M‘ will halt without accepting.
2. M‘ has a new accepting state r; there are no transitions from r.
3. For each combination of a non-accepting state of M and a tape symbol of M such that M has no transition (i.e., M halts without accepting), add a transition to the accepting state r.
For a language L and its complement L‘ in the diagram of Fig. 9.2, only the following four cases are possible:
1. Both L and L‘ are recursive; i.e., both are in the inner ring.
2. Neither L nor L‘ is RE; i.e., both are in the outer ring.
3. L is RE but not recursive, and L‘ is not RE; i.e., one is in the middle ring and the other is in the outer ring.
4. L‘ is RE but not recursive, and L is not RE; i.e., the same as (3), but with L and L‘ swapped.
In proof of the above, Theorem (9.3) eliminates the possibility that one language (L or L‘) is recursive and the other is in either of the other two classes. Theorem (9.4) eliminates the possibility that both are RE but not recursive.
Example: As an example, consider the language Ld, which we know is not RE. Thus Ld cannot be recursive. It is, however, possible that Ld‘ could be either non-RE, or RE but not recursive. Ld‘ is the set of strings wi such that Mi accepts wi. This language is similar to the universal language Lu, consisting of all pairs (M, w) such that M accepts w, and the same argument can be used to show Ld‘ is RE.
5.Prove that the Universal language is recursively enumerable but not recursive.
(Nov/Dec 2009) [CO5-L3]
The Universal Language (Nov/Dec 2003), (Apr/May 2005) (Nov/Dec 2005)
(May/June 2006) (Nov/Dec 2006)
We already discussed how a Turing machine could be used to simulate a computer that
had been loaded with an arbitrary program. That is to say, a single TM can be used as a
"stored program computer," taking its program as well as its data from one or more
tapes on which input is placed.
In this section, we shall repeat the idea with the additional formality that comes with
talking about the Turing machine as our representation of a stored program.
We define Lu, the universal language, to be the set of binary strings that encode a pair (M, w), where M is a TM with the binary input alphabet and w is a string in (0+1)*, such that w is in L(M). That is, Lu is the set of strings representing a TM and an input accepted by that TM. We shall show that there is a TM U, often called the universal Turing machine, such that Lu = L(U). Since the input to U is a binary string, U is in fact some M in the list of binary-input Turing machines.
Examine the input to make sure that the code for M is a legitimate code for some TM. If
not, U halts without accepting. Since invalid codes are assumed to represent the TM
with no moves, and such a TM accepts no inputs, this action is correct.
Initialize the second tape to contain the input w, in its encoded form. That is, for each 0
of w, place 10 on the second tape, and for each 1 of w, place 100 there.
Note that the blanks on the simulated tape of M, which are represented by 1000, will not
actually appear on that tape; all cells beyond those used for w will hold the blank of U.
However, U knows that, should it look for a simulated symbol of M and find its own
blank, it must replace that blank by the sequence 1000 to simulate the blank of M.
1. Place 0, the code for the start state of M, on the third tape, and move the head of U's second tape to the first simulated cell.
2. To simulate a move of M, U searches on its first tape for a transition 0^i10^j10^k10^l10^m, such that 0^i is the state on tape 3, and 0^j is the tape symbol of M that begins at the position on tape 2 scanned by U. This transition is the one M would next make.
(a) Change the contents of tape 3 to 0^k; that is, simulate the state change of M. To do so, U first changes all the 0's on tape 3 to blanks, and then copies 0^k from tape 1 to tape 3.
(b) Replace 0^j on tape 2 by 0^l; that is, change the tape symbol of M. If more or less space is needed (i.e., j ≠ l), use the scratch tape and the shifting-over technique to manage the spacing.
(c) Move the head on tape 2 to the position of the next 1 to the left or right, depending on whether m = 1 (move left) or m = 2 (move right). Thus, U simulates the move of M to the left or to the right.
3. If M has no transition that matches the simulated state and tape symbol, then in step (2) no transition will be found. Thus, M halts in the simulated configuration, and U must do likewise.
4. If M enters its accepting state, then U accepts.
5. In this manner, U simulates M on w. U accepts the coded pair (M, w) if and only if M accepts w.
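The heart of U is a loop that looks up M's current (state, symbol) pair in a table of transitions. The sketch below keeps M's transitions as a Python dictionary rather than the 0-and-1 code, but the control loop is the same; the example machine (binary strings ending in 1, i.e. odd numbers) and all names are ours.

```python
# Sketch of a universal machine's inner loop: one generic interpreter
# that simulates whichever transition table it is handed.

def universal(delta, accept, w, blank="_", start="q0", limit=10_000):
    tape, head, state = list(w) + [blank], 0, start
    for _ in range(limit):                 # the real U may run forever
        if state in accept:
            return True
        key = (state, tape[head])
        if key not in delta:
            return False                   # M halts without accepting
        state, tape[head], move = delta[key]
        head += 1 if move == "R" else -1
        if head == len(tape):
            tape.append(blank)
    return False

# Example "program": accept binary strings that end in 1.
ODD = {
    ("q0", "0"): ("q0", "0", "R"),
    ("q0", "1"): ("q1", "1", "R"),
    ("q1", "0"): ("q0", "0", "R"),
    ("q1", "1"): ("q1", "1", "R"),
    ("q1", "_"): ("acc", "_", "R"),
}
```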
We now have a problem that is RE but not recursive: the language Lu. Knowing that Lu is undecidable (i.e., not a recursive language) is in many ways more valuable than our previous discovery that Ld is not RE. The reason is that the reduction of Lu to another problem P can be used to show there is no algorithm to solve P, regardless of whether or not P is RE. However, reduction of Ld to P is only possible if P is not RE, so Ld cannot be used to show undecidability for those problems that are RE but not recursive. On the other hand, if we want to show a problem not to be RE, then only Ld can be used; Lu is useless, since it is RE.
6.Define the Language Lu. Show that Lu is recursively enumerable but not
recursive. (Nov/Dec 2003) (Apr/May 2004) (Apr/May 2005) (Nov/Dec 2005)
(May/June 2006) (Nov/Dec 2006) (May/June 2009) [CO5-L2]
Theorem 9.6: Lu is RE but not recursive.
Proof: Suppose Lu were recursive. Then Lu‘, the complement of Lu, would also be recursive. However, if we have a TM M to accept Lu‘, then we can construct a TM to accept Ld. Since we already know that Ld is not RE, we have a contradiction of our assumption that Lu is recursive.
Suppose L(M) = Lu‘. As suggested by Fig (9.6), we can modify TM M into a TM M' that accepts Ld as follows.
1. Given string w on its input, M' changes the input to w111w. You may, as an exercise, write a TM program to do this step on a single tape. However, an easy argument that it can be done is to use a second tape to copy w, and then convert the two-tape TM to a one-tape TM.
Computational complexity problems
Example 1:
(Fig: a weighted graph on vertices V1–V6 and its minimum spanning tree; total weight = 21)
Find the minimum spanning tree for the following figure using Kruskal‘s algorithm.
(Fig: a weighted graph on vertices a–i with edge weights 1, 2 and 3)
In Kruskal's algorithm, we repeatedly pick a minimum-weight edge that does not form a cycle, until all the vertices are covered. The edges chosen need not be adjacent.
(Fig: the spanning tree grows edge by edge, each step adding a minimum-weight edge)
Computer Science Engineering Department 148 Theory of Computation
S.K.P. Engineering College, Tiruvannamalai V SEM
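The edge-by-edge procedure can be sketched as Kruskal's algorithm with a union-find structure. The figure's weights are not fully recoverable here, so the test graph below is illustrative only; all names are ours.

```python
# Sketch of Kruskal's algorithm: sort edges by weight, keep an edge
# only if it joins two different components (no cycle).

def kruskal(n, edges):
    # edges: list of (weight, u, v); returns (total_weight, chosen_edges)
    parent = list(range(n))

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]   # path halving
            x = parent[x]
        return x

    total, chosen = 0, []
    for w, u, v in sorted(edges):
        ru, rv = find(u), find(v)
        if ru != rv:                        # no cycle: keep the edge
            parent[ru] = rv
            total += w
            chosen.append((u, v))
    return total, chosen
```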
Travelling salesman problem (TSP): This problem can be stated as: ―Given a set of cities and the cost to travel between each pair of cities, determine whether there is a tour that visits every city exactly once and returns to the first city, such that the total cost travelled is minimum.‖ For the graph below, the tour will be a-b-d-e-c-a and the total cost of the tour will be 16.
The decision version of this problem (is there a tour of cost at most k?) is NP-complete, since a proposed tour can be checked in polynomial time; the optimization version, which asks for the cheapest tour itself, is NP-hard.
For example:
(Fig: a weighted graph on cities a–e with travel costs on the edges)
NP Completeness
As we know, P denotes the class of all deterministic polynomial-time problems and NP denotes the class of all non-deterministic polynomial-time problems. Hence
P ⊆ NP.
The question of whether or not P = NP holds is the most famous outstanding problem in computer science.
Problems which are known to lie in P are often called tractable. Problems which lie outside of P are often termed intractable. Thus, the question of whether P = NP or P ≠ NP is the same as asking whether there exist problems in NP which are intractable.
Every problem in NP reduces to CIRCUIT-SAT, and the standard chain of reductions used in NP-completeness proofs is:
CIRCUIT-SAT → CNF-SAT → 3-SAT → VERTEX COVER → KNAPSACK, TSP
(Fig: reductions in NP-completeness)
2. The 3-SAT problem
A 3-SAT problem is a problem which takes a Boolean formula S in CNF form, with each clause having exactly three literals, and checks whether S is satisfiable or not.
[Note that CNF means the literals in each clause are ORed together, and the clauses are ANDed to form the Boolean formula S.]
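The clause structure just described can be checked by brute force over all truth assignments. In this sketch (all names ours) a literal is a (variable, polarity) pair, each clause ORs its literals, and the clauses are ANDed.

```python
# Sketch: brute-force satisfiability test for a CNF formula with
# literals given as (variable_index, polarity) pairs.

from itertools import product

def sat3(clauses, n_vars):
    for bits in product([False, True], repeat=n_vars):
        # a literal (v, pol) is true when variable v has value pol
        if all(any(bits[v] == pol for v, pol in clause)
               for clause in clauses):
            return True
    return False
```

The exhaustive search takes 2^n assignments in the worst case, which is exactly why 3-SAT is the canonical NP-complete problem.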
(Fig: AND, NOT and OR gates with sample truth values)
The Post's correspondence problem is: Given an instance of PCP, tell whether this
instance has a solution.
Example 9.13: Let Σ = {0,1}, and let the A and B lists be as defined in Fig. 9.12. In this case, PCP has a solution. For instance, let m = 4, i1 = 2, i2 = 1, i3 = 1, and i4 = 3; i.e., the solution is the list 2, 1, 1, 3. We verify that this list is a solution by concatenating the corresponding strings in order for the two lists.
That is, w2w1w1w3 = x2x1x1x3 = 101111110. Note this solution is not unique; for instance, 2, 1, 1, 3, 2, 1, 1, 3 is another solution.
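Checking a proposed PCP solution is just concatenation and comparison. The lists below are the ones from the standard version of this example (the figure's table is not reproduced in these notes), so treat them as an assumption.

```python
# Sketch: verifying a PCP solution by concatenating both lists.
# Assumed instance: A = (1, 10111, 10), B = (111, 10, 0).

A = ["1", "10111", "10"]
B = ["111", "10", "0"]

def is_solution(indices):
    w = "".join(A[i - 1] for i in indices)   # indices are 1-based
    x = "".join(B[i - 1] for i in indices)
    return len(indices) > 0 and w == x
```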
An instance of PCP
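Verifying a proposed PCP solution is straightforward: concatenate the selected strings from each list and compare. Since Fig. 9.12 is not reproduced here, the lists below are assumed values consistent with the solution 2, 1, 1, 3 described in the text.

```python
# Checking a proposed PCP solution: the index sequence is a solution
# exactly when the two concatenations agree. The lists A and B are the
# values assumed here for Fig. 9.12.
A = ["1", "10111", "10"]
B = ["111", "10", "0"]

def check_pcp(A, B, indices):
    """Return True if the (1-based) index sequence is a PCP solution."""
    left = "".join(A[i - 1] for i in indices)
    right = "".join(B[i - 1] for i in indices)
    return len(indices) > 0 and left == right

print(check_pcp(A, B, [2, 1, 1, 3]))              # True
print(check_pcp(A, B, [2, 1, 1, 3, 2, 1, 1, 3]))  # True
print(check_pcp(A, B, [1, 2]))                    # False
```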
Example 9.14: Here is an example where there is no solution. Again we let Σ = {0,1}, but now the instance is the two lists given in Fig. 9.13.
Suppose that the PCP instance of Fig. 9.13 has a solution, say i1, i2, …, im for some m ≥ 1. We claim i1 = 1. For if i1 = 2, then a string beginning with w2 = 011 would have to equal a
string that begins with x2 = 11. But that equality is impossible, since the first symbols of
these two strings are 0 and 1, respectively. Similarly, it is not possible that i1 = 3, since
then a string beginning with w3 = 101 would have to equal a string beginning with x3 =
011.
If i1 = 1, then the two corresponding strings from lists A and B would have to begin:
A=10…
B=101…
Now, let us see what i2 could be.
1. If i2 = 1, then we have a problem, since no string beginning with w1w1 = 1010 can match a string that begins with x1x1 = 101101; they must disagree at the fourth position.
2. If i2 = 2, we again have a problem, because no string that begins with w1w2 = 10011 can match a string that begins with x1x2 = 10111; they must differ at the third position.
3. Only i2 = 3 is possible.
If we choose i2 = 3, then the corresponding strings formed from the list of integers 1, 3 are:
A: 10101…
B: 101011…
There is nothing about these strings that immediately suggests we cannot extend list 1,
3 to a solution. However, we can argue that it is not possible to do so. The reason is
that we are in the same condition we were in after choosing i1 = 1. The string from the B
list is the same as the string from the A list except that in the B list there is an extra 1 at
the end. Thus, we are forced to choose i3 = 3, i4 = 3, and so on, to avoid creating a
mismatch. We can never allow the A string to catch up to the B string, and thus can
never reach a solution.
2 The "Modified" PCP
It is easier to reduce Lu to PCP if we first introduce an intermediate version of PCP,
which we call the Modified Post's Correspondence Problem, or MPCP. In the modified
PCP, there is the additional requirement on a solution that the first pair on the A and B
lists must be the first pair in the solution. More formally, an instance of MPCP is two lists
A=w1, w2…. wk and B=x1, x2…. xk and a solution is a list of 0 or more integers i1 ,i2,…. im
such that
w1wi1wi2 … wim = x1xi1xi2 … xim
Notice that the pair (w1, x1) is forced to be at the beginning of the two strings, even
though the index 1 is not mentioned at the front of the list that is the solution. Also,
unlike PCP, where the solution has to have at least one integer on the solution list, in
MPCP, the empty list could be a solution if w1 = x1 (but those instances are rather
uninteresting and will not figure in our use of MPCP).
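A solution to an MPCP instance is checked like a PCP solution, except that pair 1 is implicitly prepended. The lists below are the values assumed earlier for Fig. 9.12.

```python
# MPCP differs from PCP only in that pair 1 is forced to come first.
# The lists are the values assumed for Fig. 9.12.
A = ["1", "10111", "10"]
B = ["111", "10", "0"]

def check_mpcp(A, B, indices):
    """indices is i1..im (1-based); pair 1 is prepended automatically."""
    seq = [1] + list(indices)
    return "".join(A[i - 1] for i in seq) == "".join(B[i - 1] for i in seq)

# As Example 9.15 argues below, this instance has no MPCP solution: the
# B string stays three times as long as the A string.
print(check_mpcp(A, B, []))         # False
print(check_mpcp(A, B, [1, 1, 1]))  # False
```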
Example 9.15: The lists of Fig. 9.12 may be regarded as an instance of MPCP.
However, as an instance of MPCP it has no solution. In proof, observe that any partial
solution has to begin with index 1, so the two strings of a solution would begin:
A:1…..
B:111….
The next integer could not be 2 or 3, since both w2 and w3 begin with 10 and thus
would produce a mismatch at the third position. Thus, the next index would have to be
1, yielding:
A:11….
B:111111….
We can argue this way indefinitely. Only another 1 in the solution can avoid a mismatch,
but if we can only pick index 1, the B string remains three times as long as the A string,
and the two strings can never become equal.
An important step in showing PCP is undecidable is reducing MPCP to PCP. Later, we
show MPCP is undecidable by reducing Lu to MPCP. At that point, we will have a proof
that PCP is undecidable as well; if it were decidable then we could decide MPCP, and
thus Lu.
Given an instance of MPCP with alphabet Σ, we construct an instance of PCP as
follows. First, we introduce a new symbol * that, in the PCP instance, goes between
every symbol in the strings of the MPCP instance. However, in the strings of the A list,
the *'s follow the symbols of Σ, and in the B list, the *'s precede the symbols of Σ. The
one exception is a new pair that is based on the first pair of the MPCP instance; this pair
has an extra * at the beginning of w1, so it can be used to start the PCP solution.
A final pair (*, *$) is added to the PCP instance. This pair serves as the last in a PCP
solution that mimics a solution to the MPCP instance.
Now, let us formalize the above construction. We are given an instance of MPCP with
lists A=w1, w2…. wk and B=x1, x2…. xk. We assume * and $ are symbols not present in
the alphabet Σ of this MPCP instance. We construct a PCP instance C = y0, y1, …, yk+1 and D = z0, z1, …, zk+1 as follows:
1. For i = 1, 2, ... , k, let yi be wi with a * after each symbol of wi, and let zi be xi with
a * before each symbol of xi.
2. y0 = * y1, and z0 = z1. That is, the 0th pair looks like pair 1, except that there is an
extra * at the beginning of the string from the first list. Note that the 0th pair will
be the only pair in the PCP instance where both strings begin with the same
symbol, so any solution to this PCP instance will have to begin with index 0.
3. yk+1 = $ and zk+1 = *$.
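The construction just described can be sketched directly in Python: a * after every symbol of each A string, a * before every symbol of each B string, an extra pair 0 with a leading *, and a final pair ($, *$). The example lists at the end are the values assumed here for Fig. 9.12.

```python
# Sketch of the MPCP-to-PCP construction described above.
def mpcp_to_pcp(A, B):
    ys = ["".join(c + "*" for c in w) for w in A]   # yi = wi with * after each symbol
    zs = ["".join("*" + c for c in x) for x in B]   # zi = xi with * before each symbol
    y0, z0 = "*" + ys[0], zs[0]                     # pair 0: extra * at the front of y1
    return [y0] + ys + ["$"], [z0] + zs + ["*$"]    # pair k+1: ($, *$)

C, D = mpcp_to_pcp(["1", "10111", "10"], ["111", "10", "0"])
for y, z in zip(C, D):
    print(y, z)
```

Note that only pair 0 has both strings beginning with the same symbol (*), and only pair k + 1 has both strings ending with the same symbol ($), which is exactly what forces a PCP solution to start with index 0 and end with index k + 1.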
Example: Suppose the lists of Fig. 9.12 form an MPCP instance. Then the instance of PCP constructed by the above steps is shown in Fig. 9.14.
PROOF: The construction given above is the heart of the proof. First, suppose that
i1, i2, …, im is a solution to the given MPCP instance with lists A and B.
Then we know w1wi1wi2 … wim = x1xi1xi2 … xim. If we were to replace the w's by y's and
the x's by z's, we would have two strings that were almost the same: y1yi1yi2 … yim and
z1zi1zi2 … zim. The difference is that the first string would be missing a * at the
beginning, and the second would be missing a * at the end. That is,
* y1yi1yi2 … yim = z1zi1zi2 … zim *
However, y0 = * y1, and z0 = z1, so we can fix the initial * by replacing the first index by 0.
We then have:
y0yi1yi2 … yim = z0zi1zi2 … zim *
We can take care of the final * by appending the index k + 1. Since yk+1 = $ and
zk+1 = *$, we have:
y0yi1yi2 … yim yk+1 = z0zi1zi2 … zim zk+1
We have thus shown that 0, i1, i2, …, im, k + 1 is a solution to the instance of PCP.
Now, we must show the converse: that if the constructed instance of PCP has a
solution, then the original MPCP instance has a solution as well. We observe that a
solution to the PCP instance must begin with index 0 and end with index k + 1, since
only the 0th pair has strings y0 and z0 that begin with the same symbol, and only the
(k + 1)st pair has strings that end with the same symbol. Thus, the PCP solution can be
written 0, i1, i2, …, im, k + 1.
We claim that i1, i2, …, im is a solution to the MPCP instance. The reason is that if we
remove the *'s and the final $ from the string y0yi1yi2 … yim yk+1 we get the string
w1wi1wi2 … wim. Also, if we remove the *'s and $ from the string z0zi1zi2 … zim zk+1 we
get x1xi1xi2 … xim. We know that
y0yi1yi2 … yim yk+1 = z0zi1zi2 … zim zk+1
so it follows that
w1wi1wi2 … wim = x1xi1xi2 … xim
Thus, a solution to the PCP instance implies a solution to the MPCP instance.
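The index bookkeeping in this proof can be checked on a small MPCP instance invented here for illustration (it is not from the text's figures). With A = 11, 0 and B = 1, 10, the single index i1 = 2 is an MPCP solution, since w1w2 = 110 = x1x2; the proof then predicts that 0, 2, k + 1 solves the constructed PCP instance.

```python
# Toy MPCP instance (invented for illustration) with MPCP solution i1 = 2,
# since w1 w2 = "110" = x1 x2. We rebuild the PCP instance as in the text
# and check that the PCP solution 0, 2, k+1 works.
A, B = ["11", "0"], ["1", "10"]

def mpcp_to_pcp(A, B):
    ys = ["".join(c + "*" for c in w) for w in A]
    zs = ["".join("*" + c for c in x) for x in B]
    return ["*" + ys[0]] + ys + ["$"], [zs[0]] + zs + ["*$"]

C, D = mpcp_to_pcp(A, B)
k = len(A)
# The MPCP solution i1 = 2 becomes the PCP solution 0, 2, k+1
# (list positions, with pair 0 first and pair k+1 last).
solution = [0, 2, k + 1]
left = "".join(C[i] for i in solution)
right = "".join(D[i] for i in solution)
print(left, right, left == right)  # *1*1*0*$ *1*1*0*$ True
```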
We now see that the construction described prior to this theorem is an algorithm that
converts an instance of MPCP with a solution to an instance of PCP with a solution, and
also converts an instance of MPCP with no solution to an instance of PCP with no
solution. Thus, there is a reduction of MPCP to PCP, which confirms that if PCP were
decidable, MPCP would also be decidable.
To simplify the construction of an MPCP instance, we shall invoke a theorem proved earlier, which says
that we may assume our TM never prints a blank, and never moves left from its initial
head position. In that case, an ID of the Turing machine will always be a string of the
form αqβ, where α and β are strings of nonblank tape symbols, and q is a state.
However, we shall allow β to be empty, if the head is at the blank immediately to the
right of α, rather than placing a blank to the right of the state. Thus, the symbols of α
and β will correspond exactly to the contents of the cells that held the input, plus any
cells to the right that the head has previously visited.
1. There is one pair to get things started:
List A List B
# #q0w#
This pair, which must start any solution according to the rules of MPCP, begins the
simulation of M on input w. Notice that initially, the B list is a complete ID ahead of the A
list.
2. Tape symbols and the separator # can be appended to both lists. The pairs
List A List B
X X for each X in Γ
# #
allow symbols not involving the state to be "copied." In effect, choice of these pairs lets
us extend the A string to match the B string, and at the same time copy parts of the
previous ID to the end of the B string. So doing helps to form the next ID in the
sequence of moves of M, at the end of the B string.
3. To simulate a move of M, we have certain pairs that reflect those moves. For all q in
Q - F (i.e., q is a nonaccepting state), p in Q, and X, Y, and Z in Γ we have:
List A List B
qX Yp if δ(q, X) = (p, Y, R)
ZqX pZY if δ(q, X) = (p, Y, L); Z is any tape symbol
q# Yp# if δ(q, B) = (p, Y, R)
Zq# pZY# if δ(q, B) = (p, Y, L); Z is any tape symbol
Like the pairs of (2), these pairs help extend the B string to add the next ID, by
extending the A string to match the B string. However, these pairs use the state to
determine the change in the current ID that is needed to produce the next ID. These
changes — a new state, tape symbol, and head move — are reflected in the ID being
constructed at the end of the B string.
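Rule (3) can be sketched as a function from the transition table to pairs. The machine encoding below (single-character states and symbols, 'B' for the blank) is an assumption made purely for readability.

```python
# Sketch of rule (3): generate simulation pairs from the transition
# function. delta maps (state, symbol) to (state, symbol, move); 'B'
# stands for the blank, gamma lists the nonblank tape symbols.
def move_pairs(delta, gamma, accepting):
    pairs = []
    for (q, X), (p, Y, move) in delta.items():
        if q in accepting:
            continue  # rule (3) covers only nonaccepting states
        if X != 'B':
            if move == 'R':
                pairs.append((q + X, Y + p))          # qX -> Yp
            else:
                # ZqX -> pZY for every tape symbol Z to the left
                pairs.extend((Z + q + X, p + Z + Y) for Z in gamma)
        else:
            # head sits on the blank just past the written portion
            if move == 'R':
                pairs.append((q + '#', Y + p + '#'))
            else:
                pairs.extend((Z + q + '#', p + Z + Y + '#') for Z in gamma)
    return pairs

# Hypothetical single rule: in state q reading 0, write 1, move right, enter p.
print(move_pairs({('q', '0'): ('p', '1', 'R')}, ['0', '1'], set()))
```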
4. If the ID at the end of the B string has an accepting state, then we need to allow the
partial solution to become a complete solution. We do so by extending with "ID's" that
are not really ID's of M, but represent what would happen if the accepting state were
allowed to consume all the tape symbols to either side of it. Thus, if q is an accepting
state, then for all tape symbols X and Y, there are pairs:
List A List B
XqY q
Xq q
qY q
5. Finally, once the accepting state has consumed all tape symbols, it stands alone as
the last ID on the B string. That is, the remainder of the two strings (the suffix of the B
string that must be appended to the A string to match the B string) is q#. We use the
final pair:
List A List B
q## #
In what follows, we refer to the five kinds of pairs generated above as the pairs from rule
(1), rule (2), and so on.