FAFL
[Fig. 2.7: Transition diagram to accept strings ab(a+b)*, with states q0, q1, q2 and q3]
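So, the DFA which accepts strings of a's and b's starting with the string ab is given by
M = (Q, Σ, δ, q0, A) where
Q = {q0, q1, q2, q3}
Σ = {a, b}
q0 is the start state
A = {q2}
δ is shown in the transition table 2.4. State q3 is a dead state: once the input fails to start with ab, no later symbol can lead to acceptance.
Table 2.4 Transition table
States    a     b
q0        q1    q3
q1        q3    q2
q2        q2    q2    (final state)
q3        q3    q3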
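Table 2.4 can be run directly as a program. The sketch below is only an illustration: the dictionary encoding and the function name accepts are choices made here, not part of the notes.

```python
# Transition table 2.4 as a dictionary: (state, symbol) -> next state.
DELTA = {
    ("q0", "a"): "q1", ("q0", "b"): "q3",
    ("q1", "a"): "q3", ("q1", "b"): "q2",
    ("q2", "a"): "q2", ("q2", "b"): "q2",
    ("q3", "a"): "q3", ("q3", "b"): "q3",   # q3 is the dead state
}
START = "q0"
FINAL = {"q2"}


def accepts(w: str) -> bool:
    """Run the DFA on w and report whether it stops in a final state."""
    state = START
    for symbol in w:
        state = DELTA[(state, symbol)]  # raises KeyError for symbols outside {a, b}
    return state in FINAL


if __name__ == "__main__":
    for w in ["ab", "abba", "ba", "a", ""]:
        print(repr(w), accepts(w))
```

Every string is processed in a single left-to-right pass with one table lookup per symbol, which is what makes a DFA attractive for string matching.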
Non deterministic finite automata (NFA)
Definition: An NFA is a 5-tuple or quintuple M = (Q, Σ, δ, q0, A) where
Q is a non-empty, finite set of states.
Σ is a non-empty, finite set of input symbols (the input alphabet).
δ is the transition function, which is a mapping from Q x (Σ ∪ {ε}) to 2^Q (the set of all subsets of Q). This function shows the change from one state to a set of states based on the input symbol.
q0 ∈ Q is the start state.
A ⊆ Q is the set of final states.
Acceptance of language
Definition: Let M = (Q, Σ, δ, q0, A) be an NFA where Q is a finite set of states, Σ is the set of input symbols (from which a string can be formed), δ is the transition function from Q x (Σ ∪ {ε}) to 2^Q, q0 is the start state and A is the set of final or accepting states. A string w is accepted by the NFA if at least one of the states reachable from q0 on reading w is a final state. The language accepted by the NFA can be defined in formal notation as:
L(M) = { w | w ∈ Σ* and δ*(q0, w) ∩ A ≠ {} }
where δ* is δ extended from single symbols to strings.
Obtain NFA from the regular expression
Theorem: Let R be a regular expression. Then there exists a finite automaton M = (Q, Σ, δ, q0, A) which accepts L(R).
Proof: By definition, φ, ε and a are regular expressions. So, the corresponding machines to recognize these expressions are shown in figure 3.1.a, 3.1.b and 3.1.c respectively.
[Fig. 3.1: (a), (b), (c) machines with start state q0 and final state qf for φ, ε and a]
[Figure: machine M built from M1 (start state q1, final state f1) and M2 (start state q2, final state f2) with a new start state q0 and a new final state qf]
[Fig. 3.4: To accept the language L(R1 . R2)]
It is clear from figure 3.4 that the machine, after accepting L(R1), moves from state q1 to f1. Since there is an ε-transition, there will be a transition from state f1 to state q2 without any input. In state q2, upon accepting L(R2), the machine moves to f2, which is the final state. Thus q1, which is the start state of machine M1, becomes the start state of the combined machine M, and f2, which is the final state of machine M2, becomes the final state of machine M, which accepts the language L(R1.R2).
Case 3: R = (R1)*. We can construct an NFA which accepts L((R1)*) as shown in figure 3.5.a. It can also be represented as shown in figure 3.5.b.
[Fig. 3.5: (a), (b) machine built from M1 (start state q1, final state f1) with a new start state q0 and a new final state qf]
Draw a DFA to accept strings of 0's and 1's ending with the string 011.
[Figure: transition diagram with states q0, q1, q2, q3 over {0, 1}]
Obtain a DFA to accept strings of a's and b's having a substring aa.
[Figure: transition diagram with states q0, q1, q2 over {a, b}]
Obtain a DFA to accept strings of a's and b's except those containing the substring aab.
[Figure: transition diagram with states q0, q1, q2, q3 over {a, b}]
Obtain DFAs to accept strings of a's and b's having exactly one a,
[Figures: transition diagrams over {a, b}; only the state labels q0 to q4 and the edge labels survive]
Regular language
Definition: Let M = (Q, Σ, δ, q0, A) be a DFA. A language L is regular if there exists a machine M such that L = L(M).
* Applications of Finite Automata *
String matching/processing
Compiler Construction
Compilers for languages such as C/C++, Pascal and Fortran are designed using finite automata. DFAs are used extensively in building the various phases of a compiler, such as
Lexical analysis (to identify the tokens and identifiers, to strip off the comments, etc.)
Syntax analysis (to check the syntax of each statement or control statement used in the program)
Code optimization (to remove the unwanted code)
Code generation (to generate the machine code)
Other applications
The concept of finite automata is used in a wide range of applications, far too many to list completely. This section lists some of them:
1. Large natural vocabularies can be described using finite automata. Applications include spelling checkers and advisers, multi-language dictionaries, indenting documents, and calculators that evaluate complex expressions based on the priority of operators, to name a few. Text editors also use finite automata in their implementation.
2. Finite automata are useful in recognizing difficult problems. Sometimes it is essential to attack an undecidable problem; even though no general solution exists for such a problem, the theory of computation helps us find approximate solutions.
3. Finite automata are very useful in hardware design, such as circuit verification, the design of hardware boards (mother board or any other hardware unit), automatic traffic signals, radio controlled toys, elevators, automatic sensors, remote sensing and controllers.
4. Finite automata also play an important role in game theory and in games where control characters are used to fight against a monster, as well as in economics, computer graphics and linguistics.
To convert an NFA MN into an equivalent DFA MD, proceed as follows.
Step1:
The start state of NFA MN is the start state of DFA MD. So, add [q0] (q0 is the start state of the NFA) to QD and find the transitions from this state. The way to obtain the different transitions is shown in Step2.
Step2:
For each state [qi, qj, ..., qk] in QD, the transitions for each input symbol a in Σ can be obtained as shown below:
1. δD([qi, qj, ..., qk], a) = δN(qi, a) ∪ δN(qj, a) ∪ ... ∪ δN(qk, a)
= [ql, qm, ..., qn] say.
2. Add the state [ql, qm, ..., qn] to QD, if it is not already in QD.
3. Add the transition from [qi, qj, ..., qk] to [ql, qm, ..., qn] on the input symbol a, whether or not the state [ql, qm, ..., qn] was newly added in the previous step.
Step3:
The state [qa, qb, ..., qc] ∈ QD is a final state if at least one of the states qa, qb, ..., qc ∈ AN, i.e., at least one of the components of [qa, qb, ..., qc] is a final state of the NFA.
Step4:
If epsilon (ε) is accepted by the NFA, then the start state [q0] of the DFA is made a final state.
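Steps 1 to 3 can be sketched in a few lines of Python. This is a minimal illustration and not the notes' own program. The NFA used is the one from the conversion example in this section (states q0, q1, q2). Taking q2 as its final state is an assumption, since the figure did not survive. The function name nfa_to_dfa is also chosen here.

```python
from collections import deque


def nfa_to_dfa(delta_n, start, finals, alphabet):
    """Subset construction for an NFA without epsilon-moves.

    delta_n maps (state, symbol) -> set of NFA states; a missing key means
    "no move". DFA states are represented as frozensets of NFA states.
    """
    start_d = frozenset([start])
    states = {start_d}                 # Step 1: Q_D starts with [q0]
    delta_d = {}
    queue = deque([start_d])
    while queue:                       # Step 2: process each state of Q_D once
        current = queue.popleft()
        for a in alphabet:
            target = frozenset(
                q for p in current for q in delta_n.get((p, a), set())
            )
            if not target:
                continue               # empty set: no transition is recorded
            delta_d[(current, a)] = target
            if target not in states:
                states.add(target)
                queue.append(target)
    finals_d = {s for s in states if s & finals}   # Step 3
    return states, delta_d, finals_d


# NFA of the worked example; q2 being final is an assumption.
NFA = {
    ("q0", "0"): {"q0", "q1"}, ("q0", "1"): {"q1"},
    ("q1", "0"): {"q2"},       ("q1", "1"): {"q2"},
    ("q2", "1"): {"q2"},
}
states, delta_d, finals_d = nfa_to_dfa(NFA, "q0", {"q2"}, "01")

if __name__ == "__main__":
    for s in sorted(states, key=lambda s: (len(s), sorted(s))):
        print(sorted(s), "final" if s in finals_d else "")
```

The queue plays the role of "consider the state ..." in the hand computation: every state is taken up exactly once, and the loop ends when no new state appears.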
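Convert the following NFA into an equivalent DFA.
[Figure: NFA with states q0, q1, q2 over {0, 1}. The transitions used in the computation below are δN(q0, 0) = {q0, q1}, δN(q0, 1) = {q1}, δN(q1, 0) = δN(q1, 1) = {q2}, δN(q2, 0) = {} and δN(q2, 1) = {q2}]
Step1: q0 is the start state of the NFA, so
QD = {[q0]}   (2.7)
Step2: Find the new states from each state in QD and obtain the corresponding transitions.
Consider the state [q0]:
When a = 0
δD([q0], 0) = δN([q0], 0)
= [q0, q1]   (2.8)
When a = 1
δD([q0], 1) = δN([q0], 1)
= [q1]   (2.9)
Since the states obtained in (2.8) and (2.9) are not in QD (see 2.7), add these two states to QD so that
QD = {[q0], [q0, q1], [q1]}   (2.10)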
The transitions found so far are:
δ               0               1
[q0]            [q0, q1]        [q1]
[q0, q1]
[q1]
Consider the state [q0, q1]:
When a = 0
δD([q0, q1], 0) = δN([q0, q1], 0)
= δN(q0, 0) ∪ δN(q1, 0)
= {q0, q1} ∪ {q2}
= [q0, q1, q2]   (2.11)
When a = 1
δD([q0, q1], 1) = δN([q0, q1], 1)
= δN(q0, 1) ∪ δN(q1, 1)
= {q1} ∪ {q2}
= [q1, q2]   (2.12)
Since the states obtained in (2.11) and (2.12) are not in QD (see 2.10), add these two states to QD so that
QD = {[q0], [q0, q1], [q1], [q0, q1, q2], [q1, q2]}   (2.13)
and add the transitions on a = 0 and a = 1 as shown below:
δ               0               1
[q0]            [q0, q1]        [q1]
[q0, q1]        [q0, q1, q2]    [q1, q2]
[q1]
[q0, q1, q2]
[q1, q2]
Consider the state [q1]:
When a = 0
δD([q1], 0) = δN([q1], 0)
= [q2]   (2.14)
When a = 1
δD([q1], 1) = δN([q1], 1)
= [q2]   (2.15)
Since the states obtained in (2.14) and (2.15) are the same and the state [q2] is not in QD (see 2.13), add the state [q2] to QD so that
QD = {[q0], [q0, q1], [q1], [q0, q1, q2], [q1, q2], [q2]}   (2.16)
δ               0               1
[q0]            [q0, q1]        [q1]
[q0, q1]        [q0, q1, q2]    [q1, q2]
[q1]            [q2]            [q2]
[q0, q1, q2]
[q1, q2]
[q2]
Consider the state [q0, q1, q2]:
When a = 0
δD([q0, q1, q2], 0) = δN([q0, q1, q2], 0)
= δN(q0, 0) ∪ δN(q1, 0) ∪ δN(q2, 0)
= {q0, q1} ∪ {q2} ∪ {}
= [q0, q1, q2]   (2.17)
When a = 1
δD([q0, q1, q2], 1) = δN([q0, q1, q2], 1)
= δN(q0, 1) ∪ δN(q1, 1) ∪ δN(q2, 1)
= {q1} ∪ {q2} ∪ {q2}
= [q1, q2]   (2.18)
Since the states obtained in (2.17) and (2.18) are not new states (they are already in QD, see 2.16), do not add these two states to QD. But the transitions on a = 0 and a = 1 should be added to the transitional table as shown below:
δ               0               1
[q0]            [q0, q1]        [q1]
[q0, q1]        [q0, q1, q2]    [q1, q2]
[q1]            [q2]            [q2]
[q0, q1, q2]    [q0, q1, q2]    [q1, q2]
[q1, q2]
[q2]
Consider the state [q1, q2]:
When a = 0
δD([q1, q2], 0) = δN([q1, q2], 0)
= δN(q1, 0) ∪ δN(q2, 0)
= {q2} ∪ {}
= [q2]   (2.19)
When a = 1
δD([q1, q2], 1) = δN([q1, q2], 1)
= δN(q1, 1) ∪ δN(q2, 1)
= {q2} ∪ {q2}
= [q2]   (2.20)
Since the states obtained in (2.19) and (2.20) are not new states (they are already in QD, see 2.16), do not add these two states to QD. But the transitions on a = 0 and a = 1 should be added to the transitional table as shown below:
δ               0               1
[q0]            [q0, q1]        [q1]
[q0, q1]        [q0, q1, q2]    [q1, q2]
[q1]            [q2]            [q2]
[q0, q1, q2]    [q0, q1, q2]    [q1, q2]
[q1, q2]        [q2]            [q2]
[q2]
Consider the state [q2]:
When a = 0
δD([q2], 0) = δN([q2], 0)
= {}   (2.21)
When a = 1
δD([q2], 1) = δN([q2], 1)
= [q2]   (2.22)
The result in (2.21) is the empty set, so there is no transition from [q2] on 0, and the state obtained in (2.22) is not a new state (it is already in QD, see 2.16). So no state is added to QD, but the transition on a = 1 should be added to the transitional table. The final transitional table is shown in table 2.14 and the final DFA is shown in figure 2.35.
Table 2.14 Transitional table
δ               0               1
[q0]            [q0, q1]        [q1]
[q0, q1]        [q0, q1, q2]    [q1, q2]
[q1]            [q2]            [q2]
[q0, q1, q2]    [q0, q1, q2]    [q1, q2]
[q1, q2]        [q2]            [q2]
[q2]            -               [q2]
[Fig. 2.35: The DFA]
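[Figure: NFA with ε-transitions over {a, b}, with states 0 to 9, start state 0 and final state 9]
Let QD = {0}   (A)
Consider the state [A]:
When input is a:
δ(A, a) = δN(0, a)
= {1}   (B)
When input is b:
δ(A, b) = δN(0, b)
= {}
Consider the state [B]:
When input is a:
δ(B, a) = δN(1, a)
= {}
When input is b:
δ(B, b) = δN(1, b)
= {2}
= {2,3,4,6,9}   (C)
This is because, in state 2, due to ε-transitions (that is, without giving any input) there can be transitions to states 3, 4, 6 and 9 also. So, all these states are reachable from state 2. Therefore,
δ(B, b) = {2,3,4,6,9} = C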
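The step from {2} to {2,3,4,6,9} is an ε-closure computation. The sketch below shows one way to compute it. The figure of the NFA is not available, so the edge lists EPS and MOVE are assumptions chosen so that they reproduce the sets computed in this example. The function names are also illustrative.

```python
def epsilon_closure(states, eps):
    """All states reachable from `states` using epsilon-moves only."""
    closure = set(states)
    stack = list(states)
    while stack:
        p = stack.pop()
        for q in eps.get(p, ()):
            if q not in closure:
                closure.add(q)
                stack.append(q)
    return closure


# Assumed epsilon-moves (state -> set of states), consistent with the text.
EPS = {2: {3, 9}, 3: {4, 6}, 5: {8}, 7: {8}, 8: {3, 9}}
# Assumed labelled moves of the same NFA.
MOVE = {(0, "a"): {1}, (1, "b"): {2}, (4, "a"): {5}, (6, "b"): {7}}


def step(states, symbol):
    """One DFA move: follow the labelled edges, then close under epsilon."""
    moved = set()
    for p in states:
        moved |= MOVE.get((p, symbol), set())
    return epsilon_closure(moved, EPS)


if __name__ == "__main__":
    B = step({0}, "a")
    C = step(B, "b")
    print(sorted(B), sorted(C))
```

With these assumed edges, step(C, "a") and step(C, "b") give the sets that the text names D and E.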
Consider the state [C]:
When input is a:
δ(C, a) = δN({2,3,4,6,9}, a)
= {5}
= {5, 8, 9, 3, 4, 6}
= {3, 4, 5, 6, 8, 9} (ascending order)   (D)
This is because, in state 5, due to ε-transitions, the states reachable are {8, 9, 3, 4, 6}. Therefore,
δ(C, a) = {3, 4, 5, 6, 8, 9} = D
When input is b:
δ(C, b) = δN({2, 3, 4, 6, 9}, b)
= {7}
= {7, 8, 9, 3, 4, 6}
= {3, 4, 6, 7, 8, 9} (ascending order)   (E)
This is because, from state 7, the states that are reachable without any input (i.e., by ε-transitions) are {8, 9, 3, 4, 6}. Therefore,
δ(C, b) = {3, 4, 6, 7, 8, 9} = E
Consider the state [D]:
When input is a:
δ(D, a) = δN({3,4,5,6,8,9}, a)
= {5}
= {5, 8, 9, 3, 4, 6}
= {3, 4, 5, 6, 8, 9} (ascending order)   (D)
When input is b:
δ(D, b) = δN({3,4,5,6,8,9}, b)
= {7}
= {7, 8, 9, 3, 4, 6}
= {3, 4, 6, 7, 8, 9} (ascending order)   (E)
Consider the state [E]:
When input is a:
δ(E, a) = δN({3,4,6,7,8,9}, a)
= {5}
= {5, 8, 9, 3, 4, 6}
= {3, 4, 5, 6, 8, 9} (ascending order)   (D)
When input is b:
δ(E, b) = δN({3,4,6,7,8,9}, b)
= {7}
= {7, 8, 9, 3, 4, 6}
= {3, 4, 6, 7, 8, 9} (ascending order)   (E)
Since there are no new states, we can stop at this point and the transition table for the DFA is shown in table 2.15.
Table 2.15 Transitional table
δ     a     b
A     B     -
B     -     C
C     D     E
D     D     E
E     D     E
The states C, D and E are final states, since 9 (the final state of the NFA) is present in C, D and E. The final transition diagram of the DFA is shown in figure 2.36.
[Fig. 2.36: The DFA]
Regular Languages
Regular expression
Definition: A regular expression is recursively defined as follows.
1. φ is a regular expression denoting an empty language.
2. ε (epsilon) is a regular expression which indicates the language containing an empty string.
3. a is a regular expression which indicates the language containing only {a}.
4. If R is a regular expression denoting the language LR and S is a regular expression denoting the language LS, then
a. R+S is a regular expression corresponding to the language LR ∪ LS.
b. R.S is a regular expression corresponding to the language LR.LS.
c. R* is a regular expression corresponding to the language LR*.
5. The expressions obtained by applying any of the rules from 1-4 are regular expressions.
Table 3.1 shows some examples of regular expressions and the languages corresponding to these regular expressions.
Table 3.1
Regular expression    Meaning
(a+b)*                Set of strings of a's and b's of any length including the NULL string.
(a+b)*abb             Set of strings of a's and b's ending with the string abb.
ab(a+b)*              Set of strings of a's and b's starting with the string ab.
(a+b)*aa(a+b)*        Set of strings of a's and b's having a substring aa.
a*b*c*                Set of strings consisting of any number of a's (may be the empty string also) followed by any number of b's (may include the empty string) followed by any number of c's (may include the empty string).
a^+ b^+ c^+           Set of strings consisting of at least one a followed by at least one b followed by at least one c.
aa*bb*cc*             Same language as a^+ b^+ c^+.
Obtain a regular expression to accept a language consisting of strings of a's and b's of even length.
Strings of a's and b's of even length can be obtained by combining the strings aa, ab, ba and bb. The language also contains the empty string, denoted by ε. So, the regular expression can be of the form
(aa + ab + ba + bb)*
The * closure includes the empty string.
Note: This regular expression can also be represented using set notation as
L(R) = {(aa + ab + ba + bb)^n | n ≥ 0}
Obtain a regular expression to accept a language consisting of strings of a's and b's of odd length.
Strings of a's and b's of odd length can be obtained by combining the strings aa, ab, ba and bb, followed by either a or b. So, the regular expression can be of the form
(aa + ab + ba + bb)* (a+b)
Strings of a's and b's of odd length can also be obtained by combining the strings aa, ab, ba and bb, preceded by either a or b. So, the regular expression can also be represented as
(a+b) (aa + ab + ba + bb)*
Note: Even though these two expressions look different, the language corresponding to them is the same. So, a variety of regular expressions can be obtained for a language, and all of them are equivalent.
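The claim that the two odd-length expressions are equivalent can be checked by brute force on all short strings. The sketch below uses Python's re module, where the union written + in these notes is written |. The bound of 8 symbols and the helper names are arbitrary choices made here. A finite check like this supports the claim but is not a proof.

```python
import re
from itertools import product

# (aa + ab + ba + bb)* (a+b)  and  (a+b) (aa + ab + ba + bb)*
ODD_1 = re.compile(r"(aa|ab|ba|bb)*(a|b)")
ODD_2 = re.compile(r"(a|b)(aa|ab|ba|bb)*")


def strings_up_to(alphabet, max_len):
    """Yield every string over `alphabet` of length 0..max_len."""
    for n in range(max_len + 1):
        for tup in product(alphabet, repeat=n):
            yield "".join(tup)


def agree(max_len=8):
    """True if both expressions accept exactly the odd-length strings."""
    for w in strings_up_to("ab", max_len):
        m1 = ODD_1.fullmatch(w) is not None
        m2 = ODD_2.fullmatch(w) is not None
        if not (m1 == m2 == (len(w) % 2 == 1)):
            return False
    return True


if __name__ == "__main__":
    print(agree())
```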
[Fig. 3.9: Generalized FA with start state q0 and final state q1, where r1 is the loop on q0, r2 leads from q0 to q1, r3 leads from q1 back to q0 and r4 is the loop on q1]
r = r1*r2 (r4 + r3r1*r2)*   (3.1)
Note:
1. Any graph can be reduced to the graph shown in figure 3.9. Then substitute the regular expressions appropriately in the equation 3.1 and obtain the final regular expression.
2. If r3 is not there in figure 3.9, the regular expression can be of the form
r = r1*r2 r4*   (3.2)
3. If q0 and q1 are the final states, then the regular expression can be of the form
r = r1* + r1*r2 r4*   (3.3)
Obtain a regular expression for the FA shown below:
[Figure: FA with states q0, q1, q2, q3 over {0, 1}]
It is clear from this figure that the machine accepts strings of 01's and 10's of any length and the regular expression can be of the form
(01 + 10)*
What is the language accepted by the following FA?
[Figure: FA with states q0, q1, q2 over {0, 1}]
Since state q2 is the dead state, it can be removed and the following FA is obtained.
[Figure: FA with states q0 and q1 only]
The state q0 is a final state, and at this point the machine can accept any number of 0's, which can be represented as
0*
q1 is also a final state. To reach q1 one can input any number of 0's followed by a 1 and then any number of 1's, which can be represented as
0*11*
So, the final regular expression is obtained by adding 0* and 0*11*:
R.E = 0* + 0*11*
= 0* (ε + 11*)
= 0* (ε + 1^+)
= 0* (1*) = 0*1*
It is clear from the regular expression that the language consists of any number of 0's (possibly ε) followed by any number of 1's (possibly ε).
Applications of Regular Expressions
A pattern describes a set of objects with some common properties, and pattern matching checks whether a given object belongs to that set. We can match an identifier or a decimal number, or we can search for a string in a text.
An application of regular expressions in the UNIX editor ed.
In the UNIX operating system, we can use the editor ed to search for a specific pattern in the text. For example, if the command specified is
/acb*c/
then the editor searches for a string which starts with ac, followed by zero or more b's, and followed by the symbol c. Note that the editor ed accepts the regular expression and searches for that particular pattern in the text. As the pattern can vary from one search to the next, it is challenging to write a separate program for each string pattern of this kind.
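The same search can be tried outside ed. The sketch below uses Python's re module instead, on the assumption that the pattern acb*c has the same meaning in both. The sample text and the function name are made up for illustration.

```python
import re

PATTERN = re.compile(r"acb*c")   # the pattern from the ed command /acb*c/


def matching_lines(text: str):
    """Return the lines of `text` that contain a match for the pattern."""
    return [line for line in text.splitlines() if PATTERN.search(line)]


if __name__ == "__main__":
    sample = "xacc\nacbbbc!\nabc\nacb"
    print(matching_lines(sample))
```

The regular expression engine does the work of building and running the matcher, so no pattern-specific program has to be written.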
Pumping Lemma
Used to prove that certain languages, like L = {0^n 1^n | n ≥ 1}, are not regular.
L = {0^n 1^n | n ≥ 1}
There is no regular expression to define L. 00*11* is not a regular expression defining L, since it also matches strings with unequal numbers of 0's and 1's, such as 001.
Let L = {0^2 1^2}
[Figure: DFA for 0^2 1^2; state 6 is the trap state]
State 6 is a trap state, state 3 remembers that two 0's have come, and from there state 5 remembers that two 1's are accepted.
This implies that a DFA has no memory to remember an arbitrary n. In other words, if we have to remember n, which varies from 1 to ∞, we need infinitely many states, which is not possible with a finite state machine, since it has a finite number of states.
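Pumping Lemma (PL): Let L be a regular language. Then there exists a constant n such that every string w in L with |w| ≥ n can be written as w = xyz, where
(i) y ≠ ε and |xy| ≤ n
(ii) xy^kz is in L for all k ≥ 0
Example 1.
To prove that L = {w | w = a^n b^n, where n ≥ 1} is not regular
Proof:
Let L be regular. Let n be the constant (PL Definition). Consider a word w in L.
Let w = a^n b^n, such that |w| = 2n. Since 2n > n and L is regular, it must satisfy PL. So w = xyz with |xy| ≤ n and y ≠ ε, which means that y consists only of a's. Then xy^2z has more a's than b's and is not in L, which contradicts PL. Hence L is not regular.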
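The argument can be checked mechanically for small values of n. The sketch below tries every split w = xyz allowed by the lemma and confirms that none of them survives pumping. It illustrates the proof for specific n and does not replace it. The function names are chosen here.

```python
def in_L(w: str) -> bool:
    """Membership in L = { a^n b^n | n >= 1 }."""
    n = len(w) // 2
    return n >= 1 and w == "a" * n + "b" * n


def some_split_survives(n: int) -> bool:
    """Is there a split w = xyz of w = a^n b^n with |xy| <= n and |y| >= 1
    such that x y^k z stays in L for k = 0, 1 and 2?"""
    w = "a" * n + "b" * n
    for i in range(0, n):                # |x| = i
        for j in range(i + 1, n + 1):    # |xy| = j, so 1 <= |y| and |xy| <= n
            x, y, z = w[:i], w[i:j], w[j:]
            if all(in_L(x + y * k + z) for k in (0, 1, 2)):
                return True
    return False


if __name__ == "__main__":
    for n in range(1, 7):
        print(n, some_split_survives(n))
```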
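To prove that L = {all strings of 1's whose length is prime} is not regular, i.e., L = {1^2, 1^3, 1^5, 1^7, 1^11, ...}
Proof: Let L be regular and let n be the PL constant. Take w = 1^p for a prime p ≥ n + 2 and write w = xyz with |y| = m, where 1 ≤ m ≤ n.
Let k = p-m
|xy^kz| = |xz| + k|y|
= (p-m) + m(p-m)
= (p-m)(1+m) ----- this cannot be prime
if p-m ≥ 2 and 1+m ≥ 2
1. (1+m) ≥ 2 because m ≥ 1
2. (p-m) ≥ 2 because p ≥ n + 2 and m ≤ n
So xy^kz is not in L, which contradicts PL. Hence L is not regular.
Example 4.
To prove that L = {0^(i^2) | i is an integer and i > 0} is not regular, i.e., L = {0^1, 0^4, 0^9, 0^16, 0^25, ...}
Proof: Let L be regular. Let w = 0^(n^2), where |w| = n^2 ≥ n
By PL, xy^kz ∈ L for all k = 0, 1, ...
Select k = 2
|xy^2z| = |xyz| + |y|
Since 1 ≤ |y| ≤ n, we get n^2 < |xy^2z| ≤ n^2 + n < (n+1)^2.
Say n = 5: this implies that the string has length > 25 and < 36, which is not of the form 0^(i^2).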
Exercises for students:
a) Show that the following languages are not regular
(i)
(ii)
(iii) L = {a^n b^m c^m d^n | n, m ≥ 1}
(iv)
(v)
b) Apply the pumping lemma to the following languages and understand why we cannot complete the proof
(i) L = {a^n aba | n ≥ 0}
(ii) L = {a^n b^m | n, m ≥ 0}
L1 = {a, a^3, a^5, ...}
Σ* - L1 = {ε, a^2, a^4, a^6, ...}
RE = (aa)*
Ex2.
Consider a DFA A that accepts all and only the strings of 0's and 1's that end in 01. That is, L(A) = (0+1)*01. The complement of L(A) is therefore all strings of 0's and 1's that do not end in 01.
Theorem: If L is a regular language over alphabet Σ, then the complement Σ* - L is also a regular language.
Proof: Let L = L(A) for some DFA A = (Q, Σ, δ, q0, F). Then the complement of L is L(B), where B is the DFA (Q, Σ, δ, q0, Q-F). That is, B is exactly like A, but the accepting states of A have become non-accepting states of B, and vice versa. Then w is in L(B) if and only if δ^(q0, w) is in Q-F, which occurs if and only if w is not in L(A).
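The construction in the proof amounts to a single set difference. The sketch below applies it to a DFA for the strings ending in 01 from Ex2. The figure is not available, so the three-state table and the state names are assumptions made here.

```python
from itertools import product

# Assumed DFA A for (0+1)*01: q0 = start, q1 = "last symbol was 0",
# q2 = "last two symbols were 01".
Q = {"q0", "q1", "q2"}
DELTA = {
    ("q0", "0"): "q1", ("q0", "1"): "q0",
    ("q1", "0"): "q1", ("q1", "1"): "q2",
    ("q2", "0"): "q1", ("q2", "1"): "q0",
}
START = "q0"
F = {"q2"}
F_COMPLEMENT = Q - F          # accepting states of B


def run(finals, w: str) -> bool:
    """Run the shared transition table on w and test against `finals`."""
    state = START
    for symbol in w:
        state = DELTA[(state, symbol)]
    return state in finals


if __name__ == "__main__":
    for n in range(4):
        for tup in product("01", repeat=n):
            w = "".join(tup)
            print(repr(w), run(F, w), run(F_COMPLEMENT, w))
```

Note that this works only because a DFA has exactly one run on every string. Swapping the final states of an NFA does not in general give the complement.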
This automaton accepts the intersection of the first two languages: those strings that have both a 0 and a 1. State pr represents only the initial condition, in which we have seen neither 0 nor 1. State qr means that we have seen only 0's, while state ps represents the condition that we have seen only 1's. The accepting state qs represents the condition where we have seen both 0's and 1's.
Ex 4 (on intersection)
Write a DFA to accept the intersection of L1 = (a+b)*a and L2 = (a+b)*b, that is, for L1 ∩ L2.
[Figure: DFA for L1 ∩ L2]
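The product construction for Ex 4 can be sketched as follows. The two component DFAs and their state names (p, q for L1 and r, s for L2) are assumptions made here, since the figure is missing. The function name product_dfa is also illustrative.

```python
def product_dfa(d1, s1, f1, d2, s2, f2, alphabet):
    """Build the reachable part of the product of two DFAs.

    A product state (p, q) is final when p is final in the first DFA and
    q is final in the second, so the product accepts the intersection.
    """
    start = (s1, s2)
    delta, seen, todo = {}, {start}, [start]
    while todo:
        p, q = todo.pop()
        for a in alphabet:
            nxt = (d1[(p, a)], d2[(q, a)])
            delta[((p, q), a)] = nxt
            if nxt not in seen:
                seen.add(nxt)
                todo.append(nxt)
    finals = {(p, q) for (p, q) in seen if p in f1 and q in f2}
    return seen, delta, start, finals


# L1 = (a+b)*a: state q means "last symbol was a".
D1 = {("p", "a"): "q", ("p", "b"): "p", ("q", "a"): "q", ("q", "b"): "p"}
# L2 = (a+b)*b: state s means "last symbol was b".
D2 = {("r", "a"): "r", ("r", "b"): "s", ("s", "a"): "r", ("s", "b"): "s"}

states, delta, start, finals = product_dfa(D1, "p", {"q"}, D2, "r", {"s"}, "ab")

if __name__ == "__main__":
    print(sorted(states), finals)
```

In this sketch the pair (q, s) is never reached, so the product has no final state. This is as expected, because no string can end in both a and b.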
Reversal
Theorem: If L is a regular language, so is L^R.
Ex.
L = {001, 10, 111, 01}
L^R = {100, 01, 111, 10}
To prove that regular languages are closed under reversal.
Let L = {001, 10, 111} be a language over Σ = {0, 1}.
L^R is a language consisting of the reversals of the strings of L.
That is, L^R = {100, 01, 111}.
If L is regular, we can show that L^R is also regular.
Proof.
As L is regular, it can be defined by an FA M = (Q, Σ, δ, q0, F) having only one final state. If there is more than one final state, we can use ε-transitions from the final states going to a common final state.
Let the FA M^R = (Q^R, Σ^R, δ^R, q0^R, F^R) define the language L^R,
where Q^R = Q, Σ^R = Σ, q0^R = F, F^R = {q0}, and δ^R(p, a) -> q iff δ(q, a) -> p.
Since M^R is derivable from M, L^R is also regular.
The proof implies the following method
1. Reverse all the transitions.
2. Swap initial and final states.
3. Create a new start state p0 with ε-transitions to all the accepting states of the original DFA.
Example
Let r = (a+b)*ab define a language L. That is,
L = {ab, aab, bab, aaab, ...}. The FA is as given below.
[Figure: FA for L]
The FA for L^R can be derived from the FA for L by swapping initial and final states and changing the direction of each edge. It is shown in the following figure.
[Figure: FA for L^R]
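The reversal method can be sketched on the example r = (a+b)*ab. The three-state NFA below is an assumption, since the figure is missing. Instead of adding a new start state p0 with ε-transitions, the sketch starts the simulation from the whole set of old final states, which has the same effect.

```python
from itertools import product

# Assumed NFA for (a+b)*ab: 0 is the start state, 2 the final state.
NFA = {(0, "a"): {0, 1}, (0, "b"): {0}, (1, "b"): {2}}


def reverse_nfa(delta, start, finals):
    """Reverse every edge and swap the roles of start and final states."""
    rev = {}
    for (p, a), targets in delta.items():
        for q in targets:
            rev.setdefault((q, a), set()).add(p)
    return rev, set(finals), {start}


def accepts(delta, starts, finals, w):
    """Simulate an NFA (without epsilon-moves) from a set of start states."""
    current = set(starts)
    for a in w:
        current = {q for p in current for q in delta.get((p, a), set())}
    return bool(current & finals)


REV, REV_STARTS, REV_FINALS = reverse_nfa(NFA, 0, {2})

if __name__ == "__main__":
    for n in range(4):
        for tup in product("ab", repeat=n):
            w = "".join(tup)
            print(repr(w), accepts(NFA, {0}, {2}, w),
                  accepts(REV, REV_STARTS, REV_FINALS, w))
```

The original machine accepts the strings ending in ab, and the reversed one accepts the strings starting with ba.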
Homomorphism
A string homomorphism is a function on strings that works by substituting a particular string for each symbol.
Theorem: If L is a regular language over alphabet Σ, and h is a homomorphism on Σ, then h(L) is also regular.
Ex.
The function h defined by h(0) = ab, h(1) = c is a homomorphism.
h applied to the string 00110 is ababccab.
L1 = (a+b)* a (a+b)*
h : {a, b} → {0, 1}*
Resulting (for h1(a) = 01, h1(b) = 11; h2(a) = 101, h2(b) = 010; h3(a) = 01, h3(b) = 101):
h1(L) = (01 + 11)* 01 (01 + 11)*
h2(L) = (101 + 010)* 101 (101 + 010)*
h3(L) = (01 + 101)* 01 (01 + 101)*
Inverse Homomorphism
Theorem: If h is a homomorphism from alphabet S to alphabet T, and L is a regular language over T, then h^-1(L) is also a regular language.
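Applying a homomorphism to a string is a symbol-by-symbol substitution, as the short sketch below shows for h(0) = ab, h(1) = c. The helper names are chosen here for illustration.

```python
def apply_h(h, w: str) -> str:
    """Apply the homomorphism h (a dict from symbol to string) to w."""
    return "".join(h[symbol] for symbol in w)


def image(h, language):
    """Image of a finite language (a set of strings) under h."""
    return {apply_h(h, w) for w in language}


H = {"0": "ab", "1": "c"}

if __name__ == "__main__":
    print(apply_h(H, "00110"))
    print(sorted(image(H, {"", "0", "01", "110"})))
```

Because the substitution is applied to each symbol independently, h(xy) = h(x)h(y) for all strings x and y, and h(ε) = ε.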
Fig 2. gives the list of all unordered pairs of states (p,q) with p ≠ q.
The boxes (1,2) and (2,3) are marked in the first pass according to algorithm 1.
In pass 2 no boxes are marked because δ(1,a) → φ and δ(3,a) → 2. That is, (1,3) → (φ,2), where φ and 3 are non final states.
The pairs marked 1 are those of which exactly one element is in F; they are marked on pass 1. The pairs marked 2 are those marked on the second pass. For example, (5,2) is one of these, since (5,2) → (6,4), and the pair (6,4) was marked on pass 1.
From this we can make out that 1, 2 and 4 can be replaced by a single state 124 and states 3, 5 and 7 can be replaced by the single state 357. The resultant minimal FA is shown in Fig. 6.
Example 2. (Method1):
(2,3) → (4,6): this implies that 2 and 3 belong to different groups, hence they are split in level 2. Similarly, it can be easily shown for the pairs (4,5), (1,7) and (2,5), and so on.
56
P → ε
P → 0
P → 1
P → 0P0
P → 1P1
Exp : Id {}
| Exp '+' Exp {}
| Exp '*' Exp {}
| '(' Exp ')' {}
;
Id : 'a' {}
| 'b' {}
| Id 'a' {}
| Id 'b' {}
| Id '0' {}
| Id '1' {}
;
a * (E) ⇒ a * (E + E) ⇒ a * (I + E) ⇒ a * (a + E)
⇒ a * (a + I) ⇒ a * (a + I0) ⇒ a * (a + I00) ⇒ a * (a + b00)
Leftmost Derivation - Tree
There is a rightmost derivation that uses the same replacements for each variable, although it
makes the replacements in a different order. This rightmost derivation is:
E ⇒ E * E ⇒ E * (E) ⇒ E * (E + E)
⇒ E * (E + I) ⇒ E * (E + I0) ⇒ E * (E + I00) ⇒ E * (E + b00)
⇒ E * (I + b00) ⇒ E * (a + b00) ⇒ I * (a + b00) ⇒ a * (a + b00)
This derivation allows us to conclude E ⇒* a * (a + b00)
Consider the grammar for the string (a+b)*c
E → E + T | T
T → T * F | F
F → ( E ) | a | b | c
Leftmost derivation
E ⇒ T ⇒ T*F ⇒ F*F ⇒ (E)*F ⇒ (E+T)*F ⇒ (T+T)*F ⇒ (F+T)*F ⇒ (a+T)*F ⇒ (a+F)*F
⇒ (a+b)*F ⇒ (a+b)*c
Rightmost derivation
E ⇒ T ⇒ T*F ⇒ T*c ⇒ F*c ⇒ (E)*c ⇒ (E+T)*c ⇒ (E+F)*c
⇒ (E+b)*c ⇒ (T+b)*c ⇒ (F+b)*c ⇒ (a+b)*c
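A leftmost derivation can be replayed mechanically: at every step the leftmost variable of the current sentential form is replaced. The sketch below assumes single-character grammar symbols; the function name `leftmost_derivation` and the list `choices` are illustrative only.

```python
def leftmost_derivation(start, variables, choices):
    """Apply each (head, body) production to the leftmost variable in turn.

    Returns the list of sentential forms, beginning with the start symbol.
    """
    form = start
    steps = [form]
    for head, body in choices:
        index = next(i for i, symbol in enumerate(form) if symbol in variables)
        if form[index] != head:
            raise ValueError(f"leftmost variable is {form[index]}, not {head}")
        form = form[:index] + body + form[index + 1:]
        steps.append(form)
    return steps

# Productions used, in order, by the leftmost derivation of (a+b)*c.
choices = [
    ("E", "T"), ("T", "T*F"), ("T", "F"), ("F", "(E)"), ("E", "E+T"),
    ("E", "T"), ("T", "F"), ("F", "a"), ("T", "F"), ("F", "b"), ("F", "c"),
]
steps = leftmost_derivation("E", {"E", "T", "F"}, choices)
print(" => ".join(steps))
```

A rightmost derivation is obtained by picking the last variable of the form instead of the first.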
Example 2:
Consider the grammar for the string (a,a)
S → (L) | a
L → L,S | S
Leftmost derivation
S ⇒ (L) ⇒ (L,S) ⇒ (S,S) ⇒ (a,S) ⇒ (a,a)
Rightmost derivation
S ⇒ (L) ⇒ (L,S) ⇒ (L,a) ⇒ (S,a) ⇒ (a,a)
The Language of a Grammar
If G = (V, T, P, S) is a CFG, the language of G, denoted by L(G), is the set of terminal
strings that have derivations from the start symbol.
L(G) = {w in T* | S ⇒* w}
Sentential Forms
Derivations from the start symbol produce strings that have a special role, called sentential
forms. That is, if G = (V, T, P, S) is a CFG, then any string α in (V ∪ T)* such that S ⇒* α is a
sentential form. If S ⇒* α by a leftmost derivation, then α is a left sentential form, and if
S ⇒* α by a rightmost derivation, then α is a right sentential form. Note that the language L(G)
consists of those sentential forms that are in T*; that is, they consist solely of terminals.
For example, E * (I + E) is a sentential form, since there is a derivation
E ⇒ E * E ⇒ E * (E) ⇒ E * (E + E) ⇒ E * (I + E)
However, this derivation is neither leftmost nor rightmost, since at the last step the middle E
is replaced.
As an example of a left sentential form, consider a * E, with the leftmost derivation
E ⇒ E * E ⇒ I * E ⇒ a * E
Additionally, the derivation
E ⇒ E * E ⇒ E * (E) ⇒ E * (E + E)
shows that E * (E + E) is a right sentential form.
Ambiguity
A context-free grammar G is said to be ambiguous if there exists some w ∈ L(G) which has
at least two distinct derivation trees. Equivalently, some w has two or more leftmost
(or rightmost) derivations.
Ex: Consider the grammar G = (V, T, E, P) with V = {E, I}, T = {a, b, c, +, *, (, )}, and productions
E → I,
E → E + E,
E → E * E,
E → (E),
I → a | b | c
Consider two derivation trees for a + b * c.
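The ambiguity of this grammar can be checked by counting parse trees. The sketch below is specific to the grammar above (E → I | E + E | E * E | (E), I → a | b | c) and is an assumption-laden illustration, not a general parser; the function name `count_parse_trees` is hypothetical.

```python
from functools import lru_cache

@lru_cache(maxsize=None)
def count_parse_trees(s):
    """Number of distinct parse trees deriving s from E in the grammar above."""
    if s in ("a", "b", "c"):
        return 1  # E -> I -> a | b | c
    total = 0
    # E -> (E): the outer parentheses wrap a complete expression.
    if len(s) >= 3 and s[0] == "(" and s[-1] == ")":
        total += count_parse_trees(s[1:-1])
    # E -> E + E and E -> E * E: try every operator as the root of the tree.
    for i, ch in enumerate(s):
        if ch in "+*":
            total += count_parse_trees(s[:i]) * count_parse_trees(s[i + 1:])
    return total

print(count_parse_trees("a+b*c"))    # 2
print(count_parse_trees("(a+b)*c"))  # 1
```

The count 2 for a+b*c corresponds to the two trees with + at the root and with * at the root.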
Parsers
The YACC Parser Generator
Markup Languages
XML and Document type definitions
%{
#include <stdlib.h>
#include "y.tab.h"
%}
%%
[0-9]+ {yylval.ID = atoi(yytext); return id;}
[ \t\n] ;
[+*()] {return yytext[0];}
. {ECHO; yyerror ("unexpected character");}
%%
Example 2:
%{
#include <stdio.h>
int yylex (void);
void yyerror (char *s);
%}
%union {int a_number;}
%start line
%token <a_number> number
%type <a_number> exp term factor
%%
line : exp ';' {printf ("result is %d\n", $1);}
;
exp : term {$$ = $1;}
| exp '+' term {$$ = $1 + $3;}
| exp '-' term {$$ = $1 - $3;}
;
term : factor {$$ = $1;}
| term '*' factor {$$ = $1 * $3;}
| term '/' factor {$$ = $1 / $3;}
;
factor : number {$$ = $1;}
| '(' exp ')' {$$ = $2;}
;
%%
int main (void) {
return yyparse ( );
}
void yyerror (char *s) {
fprintf (stderr, "%s\n", s);
}
The corresponding lex specification:
%{
#include <stdlib.h>
#include "y.tab.h"
void yyerror (char *s);
%}
%%
[0-9]+ {yylval.a_number = atoi(yytext); return number;}
[ \t\n] ;
[-+*/();] {return yytext[0];}
. {ECHO; yyerror ("unexpected character");}
%%
Markup Languages
Functions
Creating links between documents
Describing the format of the document
Example
The Things I hate
1. Moldy bread
2. People who drive too slow
In the fast lane
HTML Source
<P> The things I <EM>hate</EM>:
<OL>
<LI> Moldy bread
<LI>People who drive too slow
In the fast lane
</OL>
HTML Grammar
1. Char → a | A | ...
2. Text → ε | Char Text
3. Doc → ε | Element Doc
4. Element → Text | <EM> Doc </EM> | <P> Doc | <OL> List </OL> | ...
5. List-Item → <LI> Doc
6. List → ε | List-Item List
Start symbol
C → E2
3. A → E1 | E2
A → E1
A → E2
4. A → (E1)*
A → BA
A → ε
B → E1
4. A → (E1)+
A → BA
A → B
B → E1
5. A → (E1)?
A → ε
A → E1
EXERCISE QUESTIONS
1) Design context-free grammars for the following cases
a) L = {0^n 1^n | n ≥ 1}
b) L = {a^i b^j c^k | i ≠ j or j ≠ k}
2) The following grammar generates the language of the RE
0*1(0+1)*
S → A1B
A → 0A | ε
B → 0B | 1B | ε
Give leftmost and rightmost derivations of the following strings
a) 00101
b) 1001
c) 00011
4) Suppose h is the homomorphism from the alphabet {0, 1, 2} to the alphabet {a, b} defined
by h(0) = a, h(1) = ab and h(2) = ba
a) What is h(0120)?
b) What is h(21120)?
c) If L is the language L(01*2), what is h(L)?
d) If L is the language L(0+12), what is h(L)?
e) If L is the language L(a(ba)*), what is h⁻¹(L)?
(q,x,Y) =
To understand the behavior of a PDA more clearly, the transition diagram of the PDA can be used.
The transition diagram of a PDA is a generalization of the transition diagram of an FA.
1. Nodes correspond to states of the PDA
2. An arrow labeled Start indicates the start state
3. Doubly circled states are final states
4. Arcs correspond to transitions of the PDA. If δ(q, a, X) contains (p, α), there is an arc labeled
(a, X/α) from state q to state p. It means that with the input tape head positioned at symbol a and
X on top of the stack, the automaton moves to state p and replaces the stack top with α.
The transition diagram for the above example PDA is given in Figure 2.
[Figure 2: transition diagram with states q0 (start), q1 and q2; arc labels: a, Z0/aZ0; a, a/aa; b, a/ε (twice); ε, Z0/ε (twice)]
Instantaneous Description:
An instantaneous description (ID) or configuration of a PDA describes its execution status at
any time. An ID is represented by a triple (q, w, u), where
1. q is the current state of the automaton,
2. w is the unread part of the input string, w ∈ Σ*,
3. u is the stack contents, written as a string, with the leftmost symbol at the top of the stack,
so u ∈ Γ*.
Moves of a PDA:
Let the symbol "|-" indicate a move of the NPDA. There are two types of moves possible
for a PDA.
1. Move by consuming an input symbol
Suppose that δ(q1, a, x) = {(q2, y), ...}. Then the following move by consuming an input
symbol is possible:
(q1, aW, xZ) |- (q2, W, yZ),
where W indicates the rest of the input string following the a, and Z indicates the rest of the
stack contents underneath the x. This notation says that in moving from state q1 to state q2, an
input symbol a is consumed from the input string aW, and the symbol x at the top (left) of
the stack xZ is replaced with the string y, leaving yZ on the stack.
For the above example PDA, the moves for a few example input strings are given below:
a) Moves for the input string aabb:
(q0, aabb, Z0) |- (q0, abb, aZ0) as per transition rule δ(q0, a, Z0) = {(q0, aZ0)}
|- (q0, bb, aaZ0) as per transition rule δ(q0, a, a) = {(q0, aa)}
|- (q1, b, aZ0) as per transition rule δ(q0, b, a) = {(q1, ε)}
|- (q1, ε, Z0) as per transition rule δ(q1, b, a) = {(q1, ε)}
|- (q2, ε, ε) as per transition rule δ(q1, ε, Z0) = {(q2, ε)}
The PDA reached the configuration (q2, ε, ε). The input tape is empty, the stack is empty and the
PDA has reached a final state. So the string is accepted.
b) Moves for the input string aaabb:
(q0, aaabb, Z0) |- (q0, aabb, aZ0) as per transition rule δ(q0, a, Z0) = {(q0, aZ0)}
|- (q0, abb, aaZ0) as per transition rule δ(q0, a, a) = {(q0, aa)}
|- (q0, bb, aaaZ0) as per transition rule δ(q0, a, a) = {(q0, aa)}
|- (q1, b, aaZ0) as per transition rule δ(q0, b, a) = {(q1, ε)}
|- (q1, ε, aZ0) as per transition rule δ(q1, b, a) = {(q1, ε)}
|- There is no defined move.
So the automaton stops and the string is not accepted.
c) Moves for the input string aabbb:
(q0, aabbb, Z0) |- (q0, abbb, aZ0) as per transition rule δ(q0, a, Z0) = {(q0, aZ0)}
|- (q0, bbb, aaZ0) as per transition rule δ(q0, a, a) = {(q0, aa)}
|- (q1, bb, aZ0) as per transition rule δ(q0, b, a) = {(q1, ε)}
|- (q1, b, Z0) as per transition rule δ(q1, b, a) = {(q1, ε)}
|- There is no defined move that consumes the remaining b.
So the automaton stops and the string is not accepted.
2. ε-move
Suppose that δ(q1, ε, x) = {(q2, y), ...}. Then the following move without consuming an input
symbol is possible:
(q1, aW, xZ) |- (q2, aW, yZ),
This notation says that in moving from state q1 to state q2, the input symbol a is not
consumed from the input string aW, and the symbol x at the top (left) of the stack xZ is
replaced with the string y, leaving yZ on the stack. In this move, the tape head position does
not move forward. This move is usually used to represent non-determinism.
The relation |-* is the reflexive-transitive closure of |-, used to represent zero or more moves
of the PDA. For the above example, (q0, aabb, Z0) |-* (q2, ε, ε).
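Both kinds of moves can be traced with a small breadth-first search over instantaneous descriptions. The sketch below hard-codes the transition rules used in the traces above; writing `Z` for Z0, using the empty string for ε, and the function name `accepts` are assumptions of this sketch. The search terminates for this PDA, but a PDA whose ε-moves can grow the stack without bound would need an extra cut-off.

```python
from collections import deque

EPS = ""  # epsilon, both as input label and as "push nothing"

# Transition rules of the example PDA; "Z" stands for Z0.
DELTA = {
    ("q0", "a", "Z"): {("q0", "aZ")},
    ("q0", "a", "a"): {("q0", "aa")},
    ("q0", "b", "a"): {("q1", EPS)},
    ("q1", "b", "a"): {("q1", EPS)},
    ("q1", EPS, "Z"): {("q2", EPS)},
}

def accepts(word, delta=DELTA, start="q0", bottom="Z", finals=frozenset({"q2"})):
    """Acceptance by final state, exploring IDs (state, unread input, stack)."""
    queue = deque([(start, word, bottom)])
    seen = set()
    while queue:
        config = queue.popleft()
        if config in seen:
            continue
        seen.add(config)
        state, rest, stack = config
        if not rest and state in finals:
            return True
        if not stack:
            continue  # no move is possible on an empty stack
        top, below = stack[0], stack[1:]
        # epsilon-moves: the input is left untouched
        for nxt, push in delta.get((state, EPS, top), ()):
            queue.append((nxt, rest, push + below))
        # moves that consume one input symbol
        if rest:
            for nxt, push in delta.get((state, rest[0], top), ()):
                queue.append((nxt, rest[1:], push + below))
    return False

for w in ("aabb", "aaabb", "aabbb"):
    print(w, accepts(w))
```

The three calls reproduce the outcomes of cases a), b) and c): only aabb is accepted.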
Example 2:
Design a PDA to accept the set of all strings of 0's and 1's such that no prefix has more 1's
than 0's.
Solution: The transition diagram could be given as figure 3.
δ(b, 1, 0) = {(c, ε)}
δ(c, 0, 0) = {(b, 00)}
δ(c, 1, 0) = {(c, ε)}
δ(b, ε, 0) = {(d, 0)}
δ(b, ε, Z) = {(d, Z)}
δ(c, 0, Z) = {(b, 0Z)}
δ(c, ε, 0) = {(d, 0)}
δ(c, ε, Z) = {(d, Z)}
For all other moves, the PDA stops.
Moves for the input string 0010110 are given by:
(a, 0010110, Z) |- (b, 010110, 0Z) |- (b, 10110, 00Z) |- (c, 0110, 0Z) |- (b, 110, 00Z) |- (c, 10, 0Z) |- (c, 0, Z) |- (b, ε, 0Z) |- (d, ε, 0Z).
So the input string is accepted by the PDA.
Moves for 011:
(a, 011, Z) |- (b, 11, 0Z) |- (c, 1, Z) |- no move, so the PDA stops.
Exercises:
Construct PDA:
1. For the language L = {wcw^R | w ∈ {a, b}*, c ∈ Σ}
2. Accepting the set of all strings over {a, b} with an equal number of a's and b's. Show the
moves for abbaba
3. For the language L = {a^n b^2n | a, b ∈ Σ, n ≥ 0}
4. Accepting the language of balanced parentheses, (consider any type of parentheses)
[Figure 4: PF simulates PN. The new start state p0 enters q0 of PN on ε, X0/Z0X0; the states of PN enter the new final state pf on ε, X0/ε.]
The method of conversion is given in figure 4.
We use a new symbol X0, which must not be a symbol of Γ, to denote the stack start symbol for
PF. Also add a new start state p0 and a final state pf for PF. Let PF = (Q ∪ {p0, pf}, Σ,
Γ ∪ {X0}, δF, p0, X0, {pf}), where δF is defined by
δF(p0, ε, X0) = {(q0, Z0X0)} to push X0 to the bottom of the stack
δF(q, a, y) = δN(q, a, y) for all q ∈ Q, a ∈ Σ or a = ε, and y ∈ Γ, same for both PN and PF
δF(q, ε, X0) = {(pf, ε)} for all q ∈ Q, to accept the string by moving to the final state
The moves of PF to accept a string w can be written as:
(p0, w, X0) |- (q0, w, Z0X0) |-* (q, ε, X0) |- (pf, ε, ε)
where all moves are moves of PF.
b. From Final State to Empty Stack:
Theorem: If L = L(PF) for some PDA PF = (Q, Σ, Γ, δF, q0, Z0, F), then there is a PDA PN
such that L = N(PN)
Proof:
[Figure 5: PN simulates PF. The new start state p0 enters q0 of PF on ε, X0/Z0X0.]
The method of conversion is given in figure 5.
To prevent PF from accidentally emptying its stack, initially change the stack start content from Z0 to
Z0X0. Also add a new start state p0 and a new state p for PN. Let PN = (Q ∪ {p0, p}, Σ,
Γ ∪ {X0}, δN, p0, X0), where δN is defined by:
δN(p0, ε, X0) = {(q0, Z0X0)} to change the stack content initially
δN(q, a, y) = δF(q, a, y) for all q ∈ Q, a ∈ Σ or a = ε, and y ∈ Γ, same for both
The aim is to prove that the following three classes of languages are the same:
1. Context-free languages defined by CFGs
2. Languages accepted by PDAs by final state
3. Languages accepted by PDAs by empty stack
It is possible to convert between any two of the three representations. The conversions are shown in figure 1.
[Figure 1: conversions between CFG, PDA by empty stack and PDA by final state]
CFG to PDA conversion is another way of constructing a PDA. First construct a CFG, and then
convert the CFG to a PDA.
Example:
Convert the grammar with the following productions to a PDA that accepts by empty stack:
S → 0S1 | A
A → 1A0 | S | ε
Solution:
P = ({q}, {0, 1}, {0, 1, A, S}, δ, q, S), where δ is given by:
δ(q, ε, S) = {(q, 0S1), (q, A)}
δ(q, ε, A) = {(q, 1A0), (q, S), (q, ε)}
δ(q, 0, 0) = {(q, ε)}
δ(q, 1, 1) = {(q, ε)}
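The construction used in this solution is mechanical: one ε-rule per variable listing its production bodies, and one matching rule per terminal. A small Python sketch, where the function name `cfg_to_pda` and the dictionary layout are assumptions made for illustration and the empty string stands for ε:

```python
def cfg_to_pda(productions, terminals, state="q"):
    """Build the transition function of a one-state PDA accepting by empty stack.

    productions maps a variable to a list of bodies (strings of symbols).
    Returns a dict (state, input symbol or "", stack top) -> set of (state, push).
    """
    delta = {}
    # Expand a variable on top of the stack without reading input.
    for variable, bodies in productions.items():
        delta[(state, "", variable)] = {(state, body) for body in bodies}
    # Match a terminal on top of the stack against the next input symbol.
    for terminal in terminals:
        delta[(state, terminal, terminal)] = {(state, "")}
    return delta

productions = {"S": ["0S1", "A"], "A": ["1A0", "S", ""]}
delta = cfg_to_pda(productions, ["0", "1"])
for key in sorted(delta):
    print(key, sorted(delta[key]))
```

The four entries printed correspond to the four lines of δ in the solution above.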
From PDA to CFG:
Let P = (Q, Σ, Γ, δ, q0, Z0) be a PDA. An equivalent CFG is G = (V, Σ, R, S), where
V consists of S and all symbols [pXq] with p, q ∈ Q and X ∈ Γ, and the productions of R consist of
1. For all states p, G has the productions S → [q0Z0p]
2. Let δ(q, a, X) contain (r, Y1Y2…Yk), where a ∈ Σ or a = ε, and k can be 0 or any number.
Then for all lists of states r1, r2, …, rk, G has the productions
[qXrk] → a[rY1r1][r1Y2r2]…[rk-1Ykrk]
If k = 0 then [qXr] → a
Example:
Construct a PDA to accept the if-else of a C program and convert it to a CFG. (This does not accept
if if else-else statements.)
Let the PDA P = ({q}, {i, e}, {X, Z}, δ, q, Z), where δ is given by:
δ(q, i, Z) = {(q, XZ)}, δ(q, e, X) = {(q, ε)} and δ(q, ε, Z) = {(q, ε)}
Solution:
Deterministic PDA:
An NPDA allows non-determinism in its moves. Deterministic PDAs (DPDAs) are very useful in programming languages. For example,
parsers generated by YACC are DPDAs.
Definition:
A PDA P = (Q, Σ, Γ, δ, q0, Z0, F) is deterministic if and only if,
Example:
Construct a DPDA which accepts the language L = {wcw^R | w ∈ {a, b}*, c ∈ Σ}.
The transition diagram for the DPDA is given in figure 2.
[Figure 2: DPDA with states q0, q1 and q2, drawn over the symbols 0 and 1. Arc labels: 0, Z0/0Z0; 1, Z0/1Z0; 0, 0/00; 1, 1/11; 0, 1/01; 1, 0/10; 0, 0/ε; 1, 1/ε; c, 0/0; c, 1/1; c, Z0/Z0; ε, Z0/Z0]
If L is a regular language, then L = L(P) for some DPDA P. A PDA always includes a stack, but the
DPDA used to simulate a regular language does not use the stack; the stack stays inactive.
If A is the FA accepting the language L, then δP(q, a, Z) = {(p, Z)} for all p, q ∈ Q
such that δA(q, a) = p.
To accept with empty stack:
Not every regular language is N(P) for some DPDA P. A language L = N(P) for some DPDA
P if and only if L has the prefix property. The prefix property of L states that if x, y ∈ L and x ≠ y,
then x is not a prefix of y, and vice versa. The non-regular language L = {wcw^R} can be
accepted by a DPDA with empty stack, because any two distinct strings x, y of this language
satisfy the prefix property. But the language L = 0* can be accepted by a DPDA by final
state, but not by empty stack, because strings of this language do not satisfy the prefix
property. So the class of languages N(P) for DPDAs P is properly included in the class of CFLs.
DPDA and Ambiguous grammar:
DPDAs are very important in the design of programming languages because the languages that
DPDAs accept have unambiguous grammars. But not all languages with unambiguous grammars are
accepted by DPDAs. For example, S → 0S0 | 1S1 | ε is an unambiguous grammar for the
language of even-length palindromes. This language is accepted only by an NPDA. If L = N(P) for a DPDA
P, then L surely has an unambiguous CFG.
If L = L(P) for a DPDA P, then L has an unambiguous CFG. To show this, convert L(P) to N(P) with the
prefix property by adding an end marker $ to the strings of L. Then convert N(P) to a CFG G.
From G we have to construct G′ to accept L by getting rid of $. So treat $ as a variable of G′
and add the new production $ → ε.
Simplification of CFG
The goal is to take an arbitrary Context Free Grammar G = (V, T, P, S) and perform
transformations on the grammar that preserve the language generated by the grammar but
reach a specific format for the productions. A CFG can be simplified by eliminating
1. Useless symbols
Those variables or terminals that do not appear in any derivation of a terminal string
starting from the start variable S.
2. ε-productions
A → ε, where A is a variable
3. Unit productions
A → B, where A and B are variables
Draw the dependency graph for all productions. If there is no path from the start symbol S
to a variable X, then X is non-reachable.
Example:
Eliminate non-reachable symbols from G = ({S, A}, {a}, {S → a, A → a}, S)
Solution:
Draw the dependency graph. A is non-reachable from S. After eliminating A,
G1 = ({S}, {a}, {S → a}, S)
Example:
Eliminate non-reachable symbols from S → aS | A, A → a, B → aa
Draw the dependency graph. B is non-reachable from S. After eliminating B,
we get the grammar with productions S → aS | A, A → a
Example:
Eliminate useless symbols from the grammar with productions S → AB | CA, B → BC | AB,
A → a, C → AB | b
Step 1: Eliminate non-generating symbols
B is non-generating, since every production for B contains B itself. Removing B and the productions that use it gives
V1 = {A, C, S}
P1 = {S → CA, A → a, C → b}
Step 2: Eliminate symbols that are non-reachable
Draw the dependency graph. All variables are reachable. So the final
variables and productions are the same as V1 and P1.
V2 = {A, C, S}
P2 = {S → CA, A → a, C → b}
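The two steps can be sketched in Python as a pair of fixed-point computations. The representation (single-character symbols, a dictionary from variable to list of bodies) and all function names are assumptions of this sketch.

```python
def generating_symbols(productions, terminals):
    """Symbols that derive some terminal string."""
    generating = set(terminals)
    changed = True
    while changed:
        changed = False
        for head, bodies in productions.items():
            if head not in generating and any(
                all(symbol in generating for symbol in body) for body in bodies
            ):
                generating.add(head)
                changed = True
    return generating

def reachable_symbols(productions, start):
    """Symbols reachable from the start symbol in the dependency graph."""
    reachable, stack = {start}, [start]
    while stack:
        head = stack.pop()
        for body in productions.get(head, []):
            for symbol in body:
                if symbol not in reachable:
                    reachable.add(symbol)
                    stack.append(symbol)
    return reachable

def remove_useless(productions, terminals, start):
    # Step 1: drop non-generating symbols and every production using them.
    generating = generating_symbols(productions, terminals)
    step1 = {
        head: [b for b in bodies if all(s in generating for s in b)]
        for head, bodies in productions.items() if head in generating
    }
    # Step 2: drop symbols that are not reachable from the start symbol.
    reachable = reachable_symbols(step1, start)
    return {head: bodies for head, bodies in step1.items() if head in reachable}

grammar = {"S": ["AB", "CA"], "B": ["BC", "AB"], "A": ["a"], "C": ["AB", "b"]}
print(remove_useless(grammar, {"a", "b"}, "S"))
```

The order matters: removing non-generating symbols first can make further symbols unreachable, which the second step then removes.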
Exercises:
Eliminate useless symbols from the grammar
A → aA | a
B → bB | b
Example:
Find the grammar without ε-productions for
G = ({S, A, B, D}, {a}, {S → aS | AB, A → ε, B → ε, D → b}, S)
Solution:
Nullable variables = {S, A, B}
New set of productions:
S → aS | a
S → AB | A | B
D → b
G1 = ({S, A, B, D}, {a}, {S → aS | a | AB | A | B, D → b}, S)
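The nullable set and the new productions can be computed as sketched below. The encoding (single-character symbols, the empty string for ε) and the function names are assumptions made for this illustration.

```python
from itertools import product

def nullable_variables(productions):
    """Variables that can derive the empty string."""
    nullable = set()
    changed = True
    while changed:
        changed = False
        for head, bodies in productions.items():
            if head not in nullable and any(
                all(symbol in nullable for symbol in body) for body in bodies
            ):
                nullable.add(head)
                changed = True
    return nullable

def eliminate_epsilon(productions):
    """Return (nullable set, productions without epsilon bodies)."""
    nullable = nullable_variables(productions)
    result = {}
    for head, bodies in productions.items():
        new_bodies = set()
        for body in bodies:
            # Each nullable symbol may be kept or dropped.
            options = [(s, "") if s in nullable else (s,) for s in body]
            for combo in product(*options):
                candidate = "".join(combo)
                if candidate:  # never add an epsilon body
                    new_bodies.add(candidate)
        result[head] = sorted(new_bodies)
    return nullable, result

grammar = {"S": ["aS", "AB"], "A": [""], "B": [""], "D": ["b"]}
nullable, new_productions = eliminate_epsilon(grammar)
print(sorted(nullable))    # ['A', 'B', 'S']
print(new_productions)
```

In this example A and B are left without productions, so they would be removed in a later pass that eliminates useless symbols.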
Exercise:
Eliminate ε-productions from the grammar
1. S → a | Xb | aYa, X → Y | ε, Y → b | X
2. S → Xa, X → aX | bX | ε
3. S → XY, X → Zb, Y → bW, Z → AB, W → Z, A → aA | bB | ε, B → Ba | Bb | ε
4. S → ASB | ε, A → aAS | a, B → SbS | A | bb
But if you have to get a grammar without ε-productions and useless symbols, follow the
sequence given below:
1. Eliminate ε-productions and obtain G1
2. Eliminate useless symbols from G1 and obtain G2
Example:
Eliminate ε-productions and useless symbols from the grammar
S → a | aA | B | C, A → aB | ε, B → aA, C → aCD, D → ddd
Step 1: Eliminate ε-productions
Nullable = {A}
P1 = {S → a | aA | B | C, A → aB, B → aA | a, C → aCD, D → ddd}
Step 2: Eliminate useless symbols
Definition:
A unit production is of the form A → B, where A and B are variables.
Unit productions could complicate certain proofs, and they also introduce extra steps into
derivations that technically need not be there. The algorithm for eliminating unit productions
from the set of productions P is given below:
1. Add all non-unit productions of P to P1
2. For each unit production A → B, add to P1 all productions A → α, where B → α is a
non-unit production in P.
3. Delete all the unit productions
Example:
Eliminate unit productions from S → ABA | BA | AA | AB | A | B, A → aA | a, B → bB | b
Solution:
The unit productions are S → A and S → B. So A and B are derivable from S.
Now add the productions of the derivable variables to S.
S → ABA | BA | AA | AB | A | B | aA | a | bB | b
A → aA | a
B → bB | b
Remove the unit productions from the above productions to get
S → ABA | BA | AA | AB | aA | a | bB | b
A → aA | a
B → bB | b
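A possible Python sketch of this algorithm follows. It goes slightly beyond the three steps as stated by following chains of unit productions (A → B, B → C), which is an assumption of this sketch; the function name and the encoding (single-character variables, bodies as strings) are illustrative.

```python
def eliminate_unit_productions(productions):
    """Return an equivalent set of productions without unit productions."""
    variables = set(productions)
    result = {}
    for head in productions:
        # Variables derivable from head using unit productions only.
        derivable, stack = {head}, [head]
        while stack:
            variable = stack.pop()
            for body in productions[variable]:
                if body in variables and body not in derivable:
                    derivable.add(body)
                    stack.append(body)
        # Copy the non-unit bodies of every derivable variable to head.
        bodies = []
        for variable in [head] + sorted(derivable - {head}):
            for body in productions[variable]:
                if body not in variables and body not in bodies:
                    bodies.append(body)
        result[head] = bodies
    return result

grammar = {
    "S": ["ABA", "BA", "AA", "AB", "A", "B"],
    "A": ["aA", "a"],
    "B": ["bB", "b"],
}
print(eliminate_unit_productions(grammar))
```

For the grammar of the example, S receives the bodies aA, a, bB and b in place of the unit productions S → A and S → B.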
Example:
Eliminate unit productions from S → Aa | B, A → a | bc | B, B → A | bb
Solution:
Simplified Grammar:
If you have to get a grammar without ε-productions, useless symbols and unit productions,
follow the sequence given below:
1. Eliminate ε-productions from G and obtain G1
2. Eliminate unit productions from G1 and obtain G2
3. Eliminate useless symbols from G2 and obtain G3
Example:
Eliminate useless symbols, ε-productions and unit productions from the grammar with
productions:
S → a | aA | B | C, A → aB | ε, B → aA, C → cCD, D → ddd
Step 1: Eliminate ε-productions
Nullable = {A}
P1 = {S → a | aA | B | C, A → aB, B → aA | a, C → cCD, D → ddd}
Step 2: Eliminate unit productions
The unit productions are S → B and S → C. So the derivable variables are B and C.
P2 = {S → a | aA | cCD, A → aB, B → aA | a, C → cCD, D → ddd}
Step 3: Eliminate useless symbols
After eliminating the non-generating symbol C we get
P3 = {S → a | aA, A → aB, B → aA | a, D → ddd}
After eliminating the symbols that are non-reachable
P4 = {S → a | aA, A → aB, B → aA | a}
So the equivalent grammar is G1 = ({S, A, B}, {a}, {S → a | aA, A → aB, B → aA | a}, S)
Every nonempty CFL without ε has a grammar in CNF, with productions of the form:
1. A → BC, where A, B, C ∈ V
2. A → a, where A ∈ V and a ∈ T
Algorithm to produce a grammar in CNF:
1. Eliminate useless symbols, ε-productions and unit productions from the grammar
2. Eliminate terminals on the RHS of productions
a) Add all productions of the form A → BC or A → a to P1
b) Consider a production A → X1X2…Xn with some terminals on the RHS. If Xi is
a terminal, say ai, then add a new variable Cai to V1 and a new production Cai → ai
to P1. Replace Xi in the A-production of P by Cai
c) Consider A → X1X2…Xn, where n ≥ 3 and all Xi are variables. Introduce
new productions A → X1C1, C1 → X2C2, …, Cn-2 → Xn-1Xn to P1 and
C1, C2, …, Cn-2 to V1
Example:
Convert to CNF: S → aAD, A → aB | bAB, B → b, D → d
Solution:
Step 1: Simplify the grammar
The grammar is already simplified.
Step 2a: Elimination of terminals on RHS
Change S → aAD to S → CaAD, Ca → a
A → aB to A → CaB
A → bAB to A → CbAB, Cb → b
1. S → aSa | bSb | a | b | aa | bb
2. S → bA | aB, A → bAA | aS | a, B → aBB | bS | b
3. S → Aba, A → aab, B → AC
4. S → 0A0 | 1B1 | BB, A → C, B → S | A, C → S | ε
Usage of GNF:
Construction of a PDA from a grammar is more direct when the grammar is in GNF.
Assume that the PDA has two states: start state q0 and accepting state q1.
An S rule of the form S → aA1A2…An generates a transition that processes the terminal a,
pushes the variables A1A2…An on the stack and enters q1.
The remainder of the computation uses the input symbol and the stack top to determine the
appropriate transition.
Extend the tree by duplicating the terminals generated at each level on all lower levels. The
extended parse tree for the string a^4 b^4 is given in figure 2. The number of symbols at each
level is at most twice that of the previous level: 1 symbol at level 0, 2 symbols at level 1,
4 symbols at level 2, …, 2^i symbols at level i. To have 2^n symbols at the bottom level, the
tree must have depth at least n and at least n+1 levels.
Let L be a CFL. Then there exists a constant k ≥ 0 such that if z is any string in L with |z|
≥ k, then we can write z = uvwxy such that
1. |vwx| ≤ k (that is, the middle portion is not too long).
2. vx ≠ ε (since v and x are the pieces to be pumped, at least one of the strings we
pump must not be empty).
3. For all i ≥ 0, uv^i wx^i y is in L.
Proof:
The parse tree for a grammar G in CNF will be a binary tree. Let k = 2^(n+1), where n is the
number of variables of G. Suppose z ∈ L(G) and |z| ≥ k. Any parse tree for z must be of depth
at least n+1. The longest path in the parse tree has length at least n+1, so this path must contain at
least n+1 occurrences of variables. By the pigeonhole principle, some variable occurs more
than once along the path. Reading from bottom to top, consider the first pair of occurrences of the same variable
along the path. Say X has 2 occurrences. Break z into uvwxy such that w is the string of
terminals generated at the lower occurrence of X and vwx is the string generated by the upper
occurrence of X.
Example parse tree:
For the above example S has repeated occurrences, and the parse tree is shown in figure 3. w = ab is the string generated by the lower
occurrence of S and vwx = aabb is the string generated by the upper occurrence of S. So here u = aa, v = a, w = ab, x = b, y = bb.
Let T be the subtree rooted at the upper occurrence of S and t be the subtree rooted at the lower
occurrence of S. These parse trees are shown in figure 4. To get uv^2wx^2y ∈ L, cut out t and
replace it with a copy of T. The parse tree for uv^2wx^2y ∈ L is given in figure 5. Cutting out t
and replacing it with a copy of T repeatedly gives a valid parse tree for uv^i wx^i y for any i ≥ 1.
To get uwy ∈ L, cut T out of the original tree and replace it with t to get a parse tree of
uv^0wx^0y = uwy, as shown in figure 6.
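The decomposition u = aa, v = a, w = ab, x = b, y = bb can be checked numerically. The sketch assumes that the language of the example is {a^n b^n | n ≥ 1}, which is consistent with the string a^4 b^4 but is not stated explicitly here; the helper names `pump` and `in_language` are hypothetical.

```python
def pump(u, v, w, x, y, i):
    """Return the string u v^i w x^i y."""
    return u + v * i + w + x * i + y

def in_language(s):
    """Membership test for {a^n b^n | n >= 1} (assumed language of the example)."""
    n = len(s) // 2
    return n >= 1 and s == "a" * n + "b" * n

u, v, w, x, y = "aa", "a", "ab", "b", "bb"
for i in range(4):
    z = pump(u, v, w, x, y, i)
    print(i, z, in_language(z))
```

For i = 1 the original string a^4 b^4 is obtained; i = 0 and i = 2 give a^3 b^3 and a^5 b^5, matching the tree surgery described above.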
Pumping Lemma game:
1.
2.
3.
4.
5.
6.
Example:
Show that L = {a^i b^i c^i | i ≥ 1} is not a CFL
Solution:
Assume L is a CFL. Choose an appropriate z = a^n b^n c^n = uvwxy. Since |vwx| ≤ n, vwx can
consist of either
t=
Suppose vwx consists of the first block of 0s and the first block of 1s: if vx consists of only 0s
(x = ε), then uwy is not of the form tt. If vx has at least one 1, then |t| is at least 3n/2 and the first t
ends with a 0, not a 1.
Very similar explanations can be given for the cases where vwx consists of the first block of 1s,
and where vwx consists of the first block of 1s and the second block of 0s. In all cases uwy would have to be of
the form tt, but the first t and the second t are not the same string. So uwy is not in L, and L is not
context free.
Example:
Exercises:
Using the pumping lemma for CFLs, prove that the languages below are not context free
1. {0^p | p is a prime}
2. {a^n b^n c^i | i ≤ n}
Substitution:
By the substitution operation, each symbol in the strings of one language is replaced by an entire
CFL.
Example:
s(0) = {a^n b^n | n ≥ 1}, s(1) = {aa, bb} is a substitution on the alphabet Σ = {0, 1}.
Theorem:
If a substitution s assigns a CFL to every symbol in the alphabet of a CFL L, then s(L) is a
CFL.
Proof:
Let G = (V, Σ, P, S) be a grammar for the CFL L. For each terminal a ∈ Σ, let Ga = (Va, Ta, Pa, Sa) be a
grammar for s(a), with the variable sets V and Va pairwise disjoint. Then G′ = (V′, T′, P′, S) is a grammar
for s(L) where
V′ = V ∪ (the union of all Va)
T′ = the union of all Ta
P′ consists of
o All productions in any Pa for a ∈ Σ
o The productions of P, with each terminal a replaced by Sa everywhere a
occurs.
Example:
L = {0^n 1^n | n ≥ 1}, generated by the grammar S → 0S1 | 01; s(0) = {a^n b^m | m ≤ n}, generated
by the grammar S → aSb | A, A → aA | ab; s(1) = {ab, abc}, generated by the grammar S →
abA, A → c | ε. Rename the second and third S to S0 and S1, respectively. Rename the second A to B. The resulting
grammars are:
S → 0S1 | 01
S0 → aS0b | A; A → aA | ab
S1 → abB; B → c | ε
In the first grammar replace 0 by S0 and 1 by S1. The resulting grammar after substitution is:
S → S0SS1 | S0S1
S0 → aS0b | A; A → aA | ab
S1 → abB; B → c | ε
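Application of substitution:
Closure under reversal:
Intersection:
[Figure 1: a finite automaton (FA) and a PDA with its stack run in parallel on the same input; the input is accepted only if both accept (AND), otherwise it is rejected]
Proof: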
Let P = (QP, Σ, Γ, δP, qP, Z0, FP) be a PDA accepting L by final state, and let A = (QA, Σ, δA, qA, FA) be a
DFA accepting the regular language R. To get L ∩ R, we run the finite automaton in
parallel with the pushdown automaton, as shown in figure 1. Construct the PDA P′ = (Q, Σ, Γ, δ,
q0, Z0, F) where
Q = QP × QA
q0 = (qP, qA)
F = FP × FA
δ is defined so that δ((q, p), a, X) contains ((r, s), g) whenever
1. s = δ̂A(p, a), where δ̂A is the extended transition function of A (so s = p when a = ε)
2. (r, g) is in δP(q, a, X)
That is, for each move of PDA P, we make the same move in PDA P′ and also carry along
the state of DFA A in the second component of the state of P′. P′ accepts a string w if and only if both P
and A accept w, i.e. w is in L ∩ R. The moves ((qP, qA), w, Z0) |-* ((q, p), ε, γ) of P′ are possible if
and only if the moves (qP, w, Z0) |-* (q, ε, γ) of P and p = δ̂A(qA, w) are possible.
b. CFL and RL properties:
Theorem: The following are true about CFLs L, L1, and L2, and a regular language R.
1. CFLs are closed under set difference with a regular language, i.e. L − R is a CFL.
Proof:
R is regular and regular languages are closed under complement, so R^C is also regular. We
know that L − R = L ∩ R^C. We have already proved closure under intersection of a CFL
with a regular language. So CFLs are closed under set difference with a regular language.
2. CFLs are not closed under complementation; L^C is not necessarily a CFL.
Proof:
Assume that CFLs were closed under complement, i.e. if L is a CFL then L^C is a CFL.
Since CFLs are closed under union, L1^C ∪ L2^C is a CFL. By our assumption
(L1^C ∪ L2^C)^C is a CFL. But (L1^C ∪ L2^C)^C = L1 ∩ L2, which we just showed isn't necessarily
a CFL. Contradiction! So our assumption is false. CFLs are not closed under
complementation.
Inverse Homomorphism:
Recall that if h is a homomorphism, and L is any language, then h⁻¹(L), called an inverse
homomorphism, is the set of all strings w such that h(w) ∈ L. The CFLs are closed under
inverse homomorphism.
Theorem:
If L is a CFL and h is a homomorphism, then h⁻¹(L) is a CFL
[Figure: the PDA for h⁻¹(L) reads an input symbol a, places h(a) in a buffer, and feeds the buffer to the PDA for L with its stack, which accepts or rejects]
Language Hierarchy and History of Turing Machine
[Figure: hierarchy of machines and languages: DFA (regular languages) inside PDA (context-free languages) inside Turing machines]
Regular language
A language is called a regular language if some finite automaton recognizes it.
Ex: To recognize a string that is a multiple of 4
[Figure: automaton with states S0, S1 and S2 and transitions labeled 0]
Limitation again
(S0, 0) → (S1, X, R)
(S1, 0) → (S1, 0, R)
(S1, 1) → (S2, Y, R)
(S2, 1) → (S2, 1, R)
(S2, 2) → (S3, Z, L)
(S3, 1) → (S3, 1, L)
(S3, Y) → (S3, Y, L)
(S3, 0) → (S3, 0, L)
(S3, X) → (S0, X, R)
(S0, Y) → (SA, Y, R)
SA : Accepting state
[Figure: regular languages are described equivalently by DFAs, NFAs, 2-way DFAs and regular expressions; example: (0 | 1)*]
[Figure: context-free languages (CFLs) are described by PDAs and CFGs; example: 0^n 1^n]
But there are languages that cannot be recognized by PDAs. For example: a^n b^n c^n
Some Historical Notes
At the turn of the 20th century the German mathematician David Hilbert proposed the
Entscheidungsproblem.
Given a set X, a universe of elements U, and a criterion for membership, is it possible to
formulate a general procedure to decide whether a given element of U is a member of X?
Hilbert's grand ideas were watered down by the incompleteness theorem proposed by
the Austrian Kurt Gödel.
His theorem states that in any consistent mathematical system capable of expressing arithmetic,
there exist true assertions that cannot be proved within the system.
In 1936 the British cryptologist Alan Turing addressed Hilbert's
Entscheidungsproblem using a different approach.
He proposed two kinds of mathematical machines called the a-machine and the c-machine
respectively, and showed the existence of some problems where membership cannot be
determined.
But Turing's a-machine became famous for something other than its original
intention.
The a-machine was found to be more expressive than CFGs. It can recognize languages that
cannot be described by CFGs, like a^n b^n c^n.
The a-machines came to be more popularly known as Turing Machines.
The Princeton mathematician Alonzo Church recognized the power of a-machines.
He invited Turing to Princeton to compare Turing Machines with his own λ-calculus.
Church and Turing showed the equivalence of Turing Machines and λ-calculus, and proposed
that they capture algorithmic computation. This is called the Church-Turing thesis.
Turing Machines
Definition:
A Turing Machine (TM) is an abstract, mathematical model that describes what can
and cannot be computed. A Turing Machine consists of a tape of infinite length, on which
input is provided as a finite sequence of symbols. A head reads the input tape. The Turing
Machine starts at start state S0. On reading an input symbol it optionally replaces it with
another symbol, changes its internal state and moves one cell to the right or left.
Notation for the Turing Machine:
TM = <S, T, S0, δ, H> where,
S is a set of TM states
T is a set of tape symbols
S0 is the start state
H ⊆ S is a set of halting states
δ : S × T → S × T × {L, R} is the transition function
{L, R} is the direction in which the head moves (L: Left, R: Right)
[Figure: input symbols 1 0 1 0 1 1 1 1 1 1 0 on an infinite-length tape, with the head reading one cell]
The Turing machine model uses an infinite tape as its unlimited memory. (This is
important because it helps to show that there are tasks that these machines cannot perform,
even though unlimited memory and unlimited time are given.) The input symbols occupy
some of the tape's cells, and the other cells contain blank symbols.
Output :
#1111100000000
Initially the TM is in start state S0. Move right as long as the input symbol is 1. When a 0 is
encountered, replace it with 1 and halt.
Transitions:
(S0, 1) → (S0, 1, R)
(S0, 0) → (h, 1, STOP)
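The two transitions above can be traced with a small simulator. This is only a minimal sketch in Python and is not part of the original notes: the name run_tm, the sample input with four 1s, and the choice to start the head on the cell just after # are assumptions made for illustration.

```python
def run_tm(delta, tape, state="S0", head=1, halting=("h",), max_steps=1000):
    """Run a one-tape TM; delta maps (state, symbol) -> (state, symbol, move)."""
    cells = list(tape)
    for _ in range(max_steps):
        if state in halting:
            break
        new_state, write, move = delta[(state, cells[head])]
        cells[head] = write
        state = new_state
        if move == "R":
            head += 1
        elif move == "L":
            head -= 1
        # any other move (here "STOP") leaves the head where it is
    return state, "".join(cells)


# The two transitions listed in the notes.
INCREMENT = {
    ("S0", "1"): ("S0", "1", "R"),
    ("S0", "0"): ("h", "1", "STOP"),
}

# Assumed input: four 1s followed by 0s; the first 0 becomes a 1.
final_state, output = run_tm(INCREMENT, "#" + "1" * 4 + "0" * 9)
print(final_state, output)
```

With this assumed input the machine halts in state h with five 1s on the tape, which agrees with the output shown above.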
TM Example 2 :
TM: X-Y
Given two unary numbers x and y, compute |x-y| using a TM. For purposes of
simplicity we shall be using multiple tape symbols.
Ex: 5 (11111) - 3 (111) = 2 (11)
Input: #11111b1110000..
Output: #___11b___000..
a) Stamp out the first 1 of x and seek the first 1 of y.
(S0, 1) → (S1, _, R)
(S0, b) → (h, b, STOP)
(S1, 1) → (S1, 1, R)
(S1, b) → (S2, b, R)
b) Once the first 1 of y is reached, stamp it out. If instead the input ends, then y has finished.
But in x, we have stamped out one extra 1, which we should replace. So, go to some state S5
which can handle this.
(S2, 1) → (S3, _, L)
(S2, _) → (S2, _, R)
(S2, 0) → (S5, 0, L)
c) State S3 is when corresponding 1s from both x and y have been stamped out. Now go back
to x to find the next 1 to stamp. While searching for the next 1 from x, if we reach the head of
the tape (#), then stop.
(S3, _) → (S3, _, L)
(S3, b) → (S4, b, L)
(S4, 1) → (S4, 1, L)
(S4, _) → (S0, _, R)
(S4, #) → (h, #, STOP)
d) State S5 is when y ended while we were looking for a 1 to stamp. This means we have
stamped out one extra 1 in x. So, go back to x, replace the blank character with 1 and
stop the process.
(S5, _) → (S5, _, L)
(S5, b) → (S6, b, L)
(S6, 1) → (S6, 1, L)
(S6, _) → (h, 1, STOP)
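Steps a) to d) can be checked by running the transitions on the example 5 - 3. This is a minimal sketch in Python, not part of the original notes: the function name subtract, the three trailing 0s used as the end-of-input marker, and the head starting on the cell just after # are assumptions made for illustration.

```python
BLANK = "_"

# Transition table transcribed from steps a) to d):
# (state, symbol read) -> (next state, symbol written, move)
DELTA = {
    ("S0", "1"): ("S1", BLANK, "R"),
    ("S0", "b"): ("h", "b", "STOP"),
    ("S1", "1"): ("S1", "1", "R"),
    ("S1", "b"): ("S2", "b", "R"),
    ("S2", "1"): ("S3", BLANK, "L"),
    ("S2", BLANK): ("S2", BLANK, "R"),
    ("S2", "0"): ("S5", "0", "L"),
    ("S3", BLANK): ("S3", BLANK, "L"),
    ("S3", "b"): ("S4", "b", "L"),
    ("S4", "1"): ("S4", "1", "L"),
    ("S4", BLANK): ("S0", BLANK, "R"),
    ("S4", "#"): ("h", "#", "STOP"),
    ("S5", BLANK): ("S5", BLANK, "L"),
    ("S5", "b"): ("S6", "b", "L"),
    ("S6", "1"): ("S6", "1", "L"),
    ("S6", BLANK): ("h", "1", "STOP"),
}


def subtract(x, y, max_steps=10_000):
    """Run the machine on x and y in unary; return the final tape and its count of 1s."""
    cells = list("#" + "1" * x + "b" + "1" * y + "000")
    state, head = "S0", 1
    for _ in range(max_steps):
        state, cells[head], move = DELTA[(state, cells[head])]
        if move == "STOP":
            break
        head += 1 if move == "R" else -1
    tape = "".join(cells)
    return tape, tape.count("1")


print(subtract(5, 3))  # ('#___11b___000', 2)
```

The 1s left on the tape, on either side of b, give |x-y| in unary. When x is smaller than y, the surviving 1s are on the y side.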
TM: 0^n 1^n 2^n
Step 1: Stamp the first 0 with X, then the first 1 with Y, then the first 2 with Z.
(S0, 0) → (S1, X, R)
(S1, 0) → (S1, 0, R)
(S1, 1) → (S2, Y, R)
(S2, 1) → (S2, 1, R)
(S2, 2) → (S3, Z, L)
S0 = Start state, seeking 0, stamp it with X
S1 = Seeking 1, stamp it with Y
S2 = Seeking 2, stamp it with Z
Step 2: Move left until an X is reached, then move one step right.
(S3, 1) → (S3, 1, L)
(S3, Y) → (S3, Y, L)
(S3, 0) → (S3, 0, L)
(S3, X) → (S0, X, R)
S3 = Seeking X, to repeat the process.
Step 3: Move right until the end of the input, denoted by blank ( _ ), is reached, passing through
X, Y and Z symbols only; then the accepting state SA is reached.
(S0, Y) → (S4, Y, R)
(S4, Y) → (S4, Y, R)
(S4, Z) → (S4, Z, R)
(S4, _) → (SA, _, STOP)
S4 = Seeking blank
These are the transitions that result in halting states.
(S4, 1) → (h, 1, STOP)
(S4, 2) → (h, 2, STOP)
(S4, _) → (SA, _, STOP)
(S0, 1) → (h, 1, STOP)
(S0, 2) → (h, 2, STOP)
(S1, 2) → (h, 2, STOP)
(S2, _) → (h, _, STOP)
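The three steps can be tried out on a few strings. This is a minimal sketch in Python, not part of the original notes. Note one assumption in particular: the transitions listed above do not say what S1, S2 and S3 do when they pass over symbols that were already stamped, so the sketch adds three transitions (marked "assumed") that skip over Y and Z. The name accepts and the rule that a missing transition means rejection are also assumptions.

```python
DELTA = {
    # Step 1: stamp one 0, one 1 and one 2.
    ("S0", "0"): ("S1", "X", "R"),
    ("S1", "0"): ("S1", "0", "R"),
    ("S1", "1"): ("S2", "Y", "R"),
    ("S2", "1"): ("S2", "1", "R"),
    ("S2", "2"): ("S3", "Z", "L"),
    # Assumed (not in the notes): skip over symbols stamped in earlier rounds.
    ("S1", "Y"): ("S1", "Y", "R"),
    ("S2", "Z"): ("S2", "Z", "R"),
    ("S3", "Z"): ("S3", "Z", "L"),
    # Step 2: walk back to the rightmost X.
    ("S3", "1"): ("S3", "1", "L"),
    ("S3", "Y"): ("S3", "Y", "L"),
    ("S3", "0"): ("S3", "0", "L"),
    ("S3", "X"): ("S0", "X", "R"),
    # Step 3: only Y and Z may remain before the blank.
    ("S0", "Y"): ("S4", "Y", "R"),
    ("S4", "Y"): ("S4", "Y", "R"),
    ("S4", "Z"): ("S4", "Z", "R"),
    ("S4", "_"): ("SA", "_", "STOP"),
    # Halting (rejecting) transitions.
    ("S4", "1"): ("h", "1", "STOP"),
    ("S4", "2"): ("h", "2", "STOP"),
    ("S0", "1"): ("h", "1", "STOP"),
    ("S0", "2"): ("h", "2", "STOP"),
    ("S1", "2"): ("h", "2", "STOP"),
    ("S2", "_"): ("h", "_", "STOP"),
}


def accepts(word, max_steps=10_000):
    """Return True if the machine halts in the accepting state SA."""
    cells = dict(enumerate(word))          # cells not present are blank
    state, head = "S0", 0
    for _ in range(max_steps):
        key = (state, cells.get(head, "_"))
        if key not in DELTA:
            return False                   # assumed: no transition means reject
        state, cells[head], move = DELTA[key]
        if move == "STOP":
            return state == "SA"
        head += 1 if move == "R" else -1
    raise RuntimeError("step limit reached")


print(accepts("001122"), accepts("0112"))  # True False
```

Without the three assumed transitions the machine still accepts 012, but it gets stuck on longer strings such as 001122 as soon as S1 meets the first Y.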
(S0, 0) → (S1, _, R)
(S0, 1) → (S2, _, R)
(S1, _) → (S3, _, L)
(S3, 1) → (h, 1, STOP)
(S2, _) → (S5, _, L)
(S5, 0) → (h, 0, STOP)
Step 2: If the last character is 0/1 accordingly, then move left until a blank is reached to start
the process again.
(S3, 0) → (S4, _, L)
(S4, 1) → (S4, 1, L)
(S4, 0) → (S4, 0, L)
(S4, _) → (S0, _, R)
(S5, 1) → (S6, _, L)
(S6, 1) → (S6, 1, L)
(S6, 0) → (S6, 0, L)
(S6, _) → (S0, _, R)
Step 3: If a blank ( _ ) is reached when seeking the next pair of characters to match or when
seeking a matching character, then the accepting state is reached.
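The heading of this example is missing, but the transitions read like a machine that erases the first character, runs to the end of the input, and checks that the last character matches, which would be a palindrome check over {0,1}. Under that reading, a minimal Python sketch (not part of the original notes) is given below. The transitions marked "assumed" do not appear in the listing above: the moves that carry S1 and S2 to the right over unread characters, and the accepting moves described only in words in Step 3. The names accepts and SA are likewise assumptions.

```python
DELTA = {
    # Erase the first character and remember it in the state.
    ("S0", "0"): ("S1", "_", "R"),
    ("S0", "1"): ("S2", "_", "R"),
    # Assumed (not listed): run right over the remaining input.
    ("S1", "0"): ("S1", "0", "R"),
    ("S1", "1"): ("S1", "1", "R"),
    ("S2", "0"): ("S2", "0", "R"),
    ("S2", "1"): ("S2", "1", "R"),
    # At the right end, step back onto the last character.
    ("S1", "_"): ("S3", "_", "L"),
    ("S2", "_"): ("S5", "_", "L"),
    # Mismatch: halt without accepting.
    ("S3", "1"): ("h", "1", "STOP"),
    ("S5", "0"): ("h", "0", "STOP"),
    # Step 2: match found, erase it and return to the left end.
    ("S3", "0"): ("S4", "_", "L"),
    ("S4", "1"): ("S4", "1", "L"),
    ("S4", "0"): ("S4", "0", "L"),
    ("S4", "_"): ("S0", "_", "R"),
    ("S5", "1"): ("S6", "_", "L"),
    ("S6", "1"): ("S6", "1", "L"),
    ("S6", "0"): ("S6", "0", "L"),
    ("S6", "_"): ("S0", "_", "R"),
    # Step 3, assumed encoding: a blank while seeking means accept.
    ("S0", "_"): ("SA", "_", "STOP"),
    ("S3", "_"): ("SA", "_", "STOP"),
    ("S5", "_"): ("SA", "_", "STOP"),
}


def accepts(word, max_steps=10_000):
    """Return True if the machine halts in the (assumed) accepting state SA."""
    cells = dict(enumerate(word))          # cells not present are blank
    state, head = "S0", 0
    for _ in range(max_steps):
        state, cells[head], move = DELTA[(state, cells.get(head, "_"))]
        if move == "STOP":
            return state == "SA"
        head += 1 if move == "R" else -1
    raise RuntimeError("step limit reached")


print(accepts("0110"), accepts("010"), accepts("01"))  # True True False
```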
[Figure: state diagram for TM: 0^n 1^n 2^n, with states labelled 0-Stamper, 1-Seeker, 1-Stamper, 2-Seeker, 2-Stamper and 0-Seeker]
Universal Turing Machine
A Universal Turing Machine (UTM) takes an encoding of a TM and the input data as its input
on its tape and behaves as that TM on the input data.
A TM spec could be as follows:
TM = (S, S0, H, T, δ)
Suppose S = {a,b,c,d}, S0 = a, H = {b,d}, T = {0,1}
δ : (a,0) → (b,1,R), (a,1) → (c,1,R),
(c,0) → (d,0,R) and so on
then TM spec:
$abcd$a$bd$01$a0b1Ra1c1Rc0d0R.
where $ is the delimiter
This spec along with the actual input data would be the input to the UTM.
This can be encoded in binary by assigning numbers to each of the characters appearing in
the TM spec.
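The spec string and its binary form can be built mechanically. This is a minimal sketch in Python, not part of the original notes: the names encode_spec and to_binary are hypothetical, and the particular numbering (characters sorted, then numbered from 0 in fixed-width binary) is just one possible assignment, since the notes do not fix one.

```python
def encode_spec(states, start, halting, symbols, delta):
    """Join S, S0, H, T and the transition rules, each part introduced by '$'."""
    rules = "".join(s + a + t + w + m for (s, a), (t, w, m) in delta.items())
    parts = ["".join(states), start, "".join(halting), "".join(symbols), rules]
    return "$" + "$".join(parts)


def to_binary(spec):
    """Assign each distinct character a fixed-width binary number (assumed scheme)."""
    alphabet = sorted(set(spec))
    width = max(1, (len(alphabet) - 1).bit_length())
    code = {ch: format(i, "0{}b".format(width)) for i, ch in enumerate(alphabet)}
    return "".join(code[ch] for ch in spec), code


# The three transitions given in the example.
delta = {
    ("a", "0"): ("b", "1", "R"),
    ("a", "1"): ("c", "1", "R"),
    ("c", "0"): ("d", "0", "R"),
}
spec = encode_spec("abcd", "a", "bd", "01", delta)
bits, code = to_binary(spec)
print(spec)  # $abcd$a$bd$01$a0b1Ra1c1Rc0d0R
print(bits)
```

The spec uses eight distinct characters ($, 0, 1, R, a, b, c, d), so three bits per character are enough under this scheme.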
[Figure: a composite tape with two tracks, 0 1 1 0 1 0 1 0 0 and 0 0 1 1 1 1 1 1 0]
A composite tape consists of many tracks which can be read or written simultaneously.
A composite tape TM (CTM) contains more than one track in its tape.
Equivalence of CTMs and TMs
A CTM is simply a TM with a complex alphabet: each composite symbol packs one symbol
per track, so the four symbols a, b, c, d can stand for the two-track columns 00, 01, 10, 11.
T = {a, b, c, d}
T = {00, 01, 10, 11}
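The correspondence between the two alphabets can be written as a lookup table. This is a minimal sketch in Python, not part of the original notes: the pairing a=00, b=01, c=10, d=11 follows the order in which the two sets are listed, the convention that the first bit is the upper track is an assumption, and the names merge_tracks and split_tracks are hypothetical.

```python
# Assumed pairing: first bit = upper track, second bit = lower track.
COLUMN_TO_SYMBOL = {"00": "a", "01": "b", "10": "c", "11": "d"}


def merge_tracks(top, bottom):
    """Turn two tracks into one single-track tape over {a, b, c, d}."""
    return "".join(COLUMN_TO_SYMBOL[t + b] for t, b in zip(top, bottom))


def split_tracks(tape):
    """Recover the two tracks from a single-track tape."""
    symbol_to_column = {v: k for k, v in COLUMN_TO_SYMBOL.items()}
    columns = [symbol_to_column[s] for s in tape]
    return "".join(c[0] for c in columns), "".join(c[1] for c in columns)


single = merge_tracks("011010100", "001111110")
print(single)                # acdbdbdba
print(split_tracks(single))  # ('011010100', '001111110')
```

Because the mapping is one-to-one, a CTM reading both tracks at once and a TM reading one composite symbol see the same information in each cell.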
Turing Machines with Stay Option
A Turing Machine with stay option (STM) has a third option for movement of the TM head:
left, right or stay.
STM = <S, T, δ, s0, H>
δ : S x T → S x T x {L, R, S}
Equivalence of STMs and TMs
An STM can simulate a TM:
Just don't use the S option.
A TM can simulate an STM:
For L and R moves of a given STM, build a TM that moves correspondingly L or R.
For S moves of the STM, do the following:
1. Move right,
2. Move back left without changing the tape.
STM: (s,a) |-- (s',b,S)
TM: (s,a) |-- (t,b,R)
(t,*) |-- (s',*,L)
where t is a new intermediate state and * stands for any tape symbol.
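The construction can be applied to a whole transition table at once. This is a minimal sketch in Python, not part of the original notes: the name remove_stay is hypothetical, and the sketch introduces a separate intermediate state for the return step (named by appending "_back" to the target state, on the assumption that no existing state has that name) so that the return move cannot clash with the transitions the target state already has.

```python
def remove_stay(delta, symbols):
    """Rewrite every S move as a right move followed by a left move.

    delta maps (state, symbol) -> (state, symbol, move), move in {"L", "R", "S"}.
    symbols must list every tape symbol, including the blank.
    """
    result = {}
    for (state, read), (target, write, move) in delta.items():
        if move != "S":
            result[(state, read)] = (target, write, move)
            continue
        bounce = target + "_back"          # assumed-unused intermediate state
        result[(state, read)] = (bounce, write, "R")
        for sym in symbols:                # step back left, leaving the cell as it is
            result[(bounce, sym)] = (target, sym, "L")
    return result


stm = {("s", "a"): ("s", "b", "S"), ("s", "b"): ("s", "b", "R")}
tm = remove_stay(stm, "ab_")
for rule in sorted(tm.items()):
    print(rule)
```

Each S move costs one extra step and at most one extra state per target state, so the resulting TM is only a constant factor slower than the STM.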
2-way Infinite Turing Machine
In a 2-way infinite TM (2TM), the tape is infinite on both sides.
There is no # that delimits the left end of the tape.
Equivalence of 2TMs and TMs
A 2TM can simulate a TM:
Just don't use the left part of the tape.
A TM can simulate a 2TM:
Simulate a 2-way infinite tape on a one-way infinite tape.
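[Figure: a one-way infinite tape whose cells are labelled 0 1 1 2 2 3 3 4 4 5 5, holding the cells of the 2-way infinite tape]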
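One common way to do this is to fold the 2-way tape so that the cells to the right and to the left of cell 0 alternate on the one-way tape. The minimal Python sketch below assumes that this is what the labels 0 1 1 2 2 3 3 ... stand for, i.e. the order 0, 1, -1, 2, -2, ... with the minus signs lost. The names fold, unfold and move are hypothetical and not from the notes.

```python
def fold(i):
    """Map a 2-way tape index (..., -2, -1, 0, 1, 2, ...) to a one-way index."""
    return 2 * i - 1 if i > 0 else -2 * i


def unfold(j):
    """Inverse of fold: recover the 2-way index from the one-way index."""
    return (j + 1) // 2 if j % 2 else -(j // 2)


def move(j, direction):
    """One-way index reached when the simulated 2TM head moves L or R."""
    return fold(unfold(j) + (1 if direction == "R" else -1))


print([fold(i) for i in (0, 1, -1, 2, -2, 3, -3)])  # [0, 1, 2, 3, 4, 5, 6]
print(move(0, "L"), move(1, "R"))                   # 2 3
```

If the one-way tape keeps # in its first cell, every index is simply shifted by one. A single move of the 2TM becomes a move of at most two cells on the one-way tape.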
[Figure: three tapes A, B and C, each with cells 0 to 7, stored on the single tape of a TM as A0 B0 C0 A1 B1 C1 A2 B2 C2 A3 B3 ..]
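The layout A0 B0 C0 A1 B1 C1 ... places cell i of tape number k at position i x 3 + k of the single tape. This is a minimal sketch in Python, not part of the original notes: the names interleave and cell_index are hypothetical, and tapes are numbered from 0 (A = 0, B = 1, C = 2) as an assumption.

```python
def interleave(tapes):
    """Merge several equally long tapes into one, column by column."""
    return [cell for column in zip(*tapes) for cell in column]


def cell_index(tape_no, pos, n_tapes):
    """Position on the single tape of cell `pos` of tape `tape_no`."""
    return pos * n_tapes + tape_no


tapes = [
    ["A0", "A1", "A2"],
    ["B0", "B1", "B2"],
    ["C0", "C1", "C2"],
]
single = interleave(tapes)
print(single)
print(single[cell_index(1, 2, 3)])  # B2
```

Moving one cell on any of the original tapes corresponds to moving n_tapes cells on the single tape.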
Multi-dimensional TMs
Multi-dimensional TMs (MDTMs) use a multi-dimensional space instead of a single-dimensional
tape.
Non-deterministic TM
A non-deterministic TM (NTM) is defined as:
NTM = <S, T, s0, δ, H>
where δ : S x T → 2^(S x T x {L,R})
Ex: (s2,a) → {(s3,b,L), (s4,a,R)}
[Figure: in state s2 with tape bccaaabccacb, the NTM branches to state s3 with tape bccbaabccacb and to state s4 with tape bccaaabccacb]
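Since the transition function returns a set, one step of an NTM turns a configuration into a set of configurations. This is a minimal sketch in Python, not part of the original notes: the name step is hypothetical, and the head position (the fourth cell, index 3) is inferred from which a changes to b in the s3 branch of the example.

```python
def step(config, delta):
    """Return every configuration reachable in one move.

    config is (state, tape, head); delta maps (state, symbol) to a set of
    (state, symbol, move) triples with move in {"L", "R"}.
    """
    state, tape, head = config
    successors = set()
    for new_state, write, move in delta.get((state, tape[head]), ()):
        new_tape = tape[:head] + write + tape[head + 1:]
        new_head = head + (1 if move == "R" else -1)
        successors.add((new_state, new_tape, new_head))
    return successors


# The example transition from the notes.
DELTA = {("s2", "a"): {("s3", "b", "L"), ("s4", "a", "R")}}

for config in sorted(step(("s2", "bccaaabccacb", 3), DELTA)):
    print(config)
# ('s3', 'bccbaabccacb', 2)
# ('s4', 'bccaaabccacb', 4)
```

Applying step repeatedly to every configuration in the current set explores all branches of the NTM level by level.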