THEORY OF FINITE AUTOMATA
with an Introduction
to FORMAL LANGUAGES
JOHN CARROLL
San Diego State University
DARRELL LONG
University of California, Santa Cruz
The author and publisher of this book have used their best efforts in preparing this book. These efforts
include the development, research, and testing of the theories and programs to determine their effec-
tiveness. The author and publisher make no warranty of any kind, expressed or implied, with regard to
these programs or the documentation contained in this book. The author and publisher shall not be
liable in any event for incidental or consequential damages in connection with, or arising out of, the
furnishing, performance, or use of these programs.
TRADEMARK INFORMATION
UNIX is a registered trademark of AT&T Bell Laboratories.
Turing's World, copyright 1986 by Jon Barwise and John Etchemendy
Apple Macintosh is a registered trademark of Apple Computer Inc.
© 1989 by Prentice-Hall, Inc.
A Division of Simon & Schuster
Englewood Cliffs, New Jersey 07632
All rights reserved. No part of this book may be reproduced, in any form or by any means,
without permission in writing from the publisher.
10 9 8 7 6 5 4 3 2
ISBN 0-13-913708-4
PREFACE vii
0 PRELIMINARIES 1
12 DECIDABILITY 405
REFERENCES 432
INDEX 433
PREFACE
It often seems that mathematicians regularly provide answers well before the rest of
the world finds reasons to ask the questions. The operation of the networks of relays
used in the first computers is exactly described by Boolean functions. George Boole
thereby made his contribution to computer science in the mid-1800s, and Boolean
algebra is used today to represent modern TTL (transistor-transistor logic) circuits.
In the 1930s, Alan Turing formalized the concept of an algorithm with his presen-
tation of an abstract computing device and characterized the limitations of such
machines. In the 1950s, the abstraction of the concepts behind natural language
grammars provided the theoretical basis for computer languages that today guides
the design of compilers.
These three major foundations of computer science, the mathematical description
of computational networks, the limitations of mechanical computation, and the
formal specification of languages, are highly interrelated disciplines, and all require
a great deal of mathematical maturity to appreciate. A computer science under-
graduate is often expected to deal with all these concepts, typically armed only with
a course in discrete mathematics.
This presentation attempts to make it possible for the average student to
acquire more than just the facts about the subject. It is aimed at providing a rea-
sonable level of understanding about the methods of proof and the attendant
thought processes, without burdening the instructor with the formidable task of
simplifying the material. The majority of the proofs are written with a level of detail
that should leave no doubt about how to proceed from one step to the next. These
same proofs thereby provide a template for the exercises and serve as examples of
how to produce formal proofs in the mathematical areas of computer science. It is
not unreasonable to expect to read and understand the material presented here in a
nonclassroom setting. The text is therefore a useful supplement to those approach-
ing a course in computation or formal languages with some trepidation.
This text develops the standard mathematical models of computational de-
vices, and investigates the cognitive and generative capabilities of such machines.
The engineering viewpoint is addressed, both in relation to the construction of such
devices and in the applications of the theory to real-world machines such as traffic
controllers and vending machines. The software viewpoint is also considered, pro-
viding insight into the underpinnings of computer languages. Examples and
applications relating to compiler construction abound.
This material can be tailored to several types of courses. A course in formal
languages that stressed the development of mathematical skills could easily span two
semesters. At the other extreme, a course designed as a prerequisite for a formal
languages sequence might cover Chapters 1 through 7 and parts of Chapters 8
and 12. In particular, Chapter 8 is written so that the discussion of the more robust
grammars (Section 8.1) can be entirely omitted. Section 12.1 is exclusively devoted
to results pertaining to the constructs described in the earlier chapters, and Section
12.3 provides a natural introduction to the theory of computability by developing
the halting problem without relying on Turing machine concepts.
Several people played significant roles in shaping this text. The book grew out
of a set of lecture notes taken by Jack Porter, a student in a one-semester course on
finite automata taught by Sara Baase at San Diego State in the 1970s. Baase's
course was based on five weeks of lectures by Richard M. Karp at the University of
California, Berkeley. The lecture notes were revised by William Root during the
semesters he taught the course at San Diego State. The authors are also indebted to
the many students who helped refine the presentation by suggesting clarifications
and identifying typos, inaccuracies, and sundry other sins. Special thanks to Jon
Barwise and John Etchemendy at Stanford University for their permission to incor-
porate examples from their Turing's World Macintosh software package, available
from Kinko's Academic Courseware Exchange, 255 West Stanley Ave., Ventura,
CA 93001. Robin Fishbaugh was instrumental in shepherding the class notes
through their various electronic forms; her numerous contributions are gratefully
acknowledged.
(Cartoon courtesy of Alexis A. Gilliland)
CHAPTER 0

PRELIMINARIES
This chapter reviews some of the basic concepts used in this text. Many can be found
in standard texts on discrete mathematics. Much of the notation employed in later
chapters is also presented here.
A basic familiarity with the nature of formal proofs is assumed; most proofs given in
this text are complete and rigorous, and the reader is encouraged to work the
exercises in similar detail. A knowledge of logic circuits would be necessary to
construct the machines discussed in this text. Important terminology and techniques
are reviewed here.
Unambiguous statements that can take on the values True or False (denoted by
1 and 0, respectively) can be combined with connectives such as and (∧), or (∨), and
not (¬) to form more complex statements. The truth tables for several useful
connectives are given in Figure 0.1, along with the symbols representing the
physical devices that implement these connectives.
As an example of a complex statement, consider the assertion that two state-
ments p and q take on the same value. This can be rephrased as:

Either (p is true and q is true) or (p is false and q is false).

As the truth table for not shows, a statement r is false exactly when ¬r is true; the
above assertion could be further refined to:

Either (p is true and q is true) or (¬p is true and ¬q is true).
"P~
··il~p
p q pAq p q pVq P q pt q p q p~q
1 1 1 1 1 1 1 1 0 1 1 0
1 0 0 1 0 1 1 0 1 1 0 0
0 1 0 '0 1 1 0 1 1 0 1 0
0 0 0 0 0 0 0 0 1 0 0 1
EXAMPLE 0.1
Circuitry for realizing each of the above statements is displayed in Figure 0.3. Since
the two statements were equivalent, the circuits will exhibit the same behavior for
all combinations of input signals p and q. The second circuit would be less costly to
build since it contains fewer components, and tangible benefits therefore arise when
equivalent but less cumbersome statements can be derived. Techniques for min-
imizing such circuitry are presented in most discrete mathematics texts.
Example 0.1 shows that it is straightforward to implement statement formulas
by circuitry. Recall that the location of the 1 values in the truth table can be used to
find the corresponding principal disjunctive normal form (PDNF) for the expression
Figure 0.3 Two equivalent circuits with inputs p and q, realizing (p∧q)∨(¬p∧¬q) and (p∧q)∨(p↓q) (circuit diagrams omitted)
represented by the truth table. For example, the truth table corresponding to
NAND has 3 rows with 1 values (p = 1, q = 0; p = 0, q = 1; p = 0, q = 0), leading to
three terms in the PDNF expression: (p∧¬q)∨(¬p∧q)∨(¬p∧¬q). This formula
can be implemented as the circuit illustrated in Figure 0.4, and thus a NAND
gate can be replaced by this combination of three ANDs and one OR gate. This
circuit relies on the assurance that a quantity of interest (such as p) will generally be
available in both its negated and unnegated forms. Hence we can count on access to
an input line representing ¬p (rather than feeding the input for p into a NOT gate).
Figure 0.4 A circuit equivalent to a single NAND gate: three AND gates with inputs (p, ¬q), (¬p, q), and (¬p, ¬q) feeding one OR gate
Predicates are often used to make statements about certain objects, such as the
numbers in the set ℤ of integers. For example, Q might represent the property of
being less than 5, in which case Q(x) will represent the statement "x is less than 5."
Thus, Q(3) is true, while Q(7) is false. It is often necessary to make global
statements such as: All integers have the property P, which can be denoted by
(∀x ∈ ℤ)P(x). Note that the dummy variable x was used to state the concept in a
convenient form; x is not meant to represent a particular object, and the statement
could be equivalently phrased as (∀i ∈ ℤ)P(i). For the predicate Q defined above,
the statement (∀x ∈ ℤ)Q(x) is false, while when applied to more restricted domains,
(∀x ∈ {1, 2, 3})Q(x) is true, since it is in this case equivalent to Q(1) ∧ Q(2) ∧ Q(3),
or (1 < 5) ∧ (2 < 5) ∧ (3 < 5).
In a similar fashion, the statement that some integers have the property P will
be denoted by (∃i ∈ ℤ)P(i). For the predicate Q defined above, (∃i ∈ {4, 5, 6})Q(i) is
true, since it is equivalent to Q(4) ∨ Q(5) ∨ Q(6), or (4 < 5) ∨ (5 < 5) ∨ (6 < 5). The
statement (∃y ∈ {7, 8, 9})Q(y) is false.
Note that asserting that it is not the case that all objects have the property P is
equivalent to saying that there is at least one object that does not have the property
P. In symbols, we have
¬(∀x ∈ ℤ)P(x) ⇔ (∃x ∈ ℤ)(¬P(x))
Similarly,
¬(∃x ∈ ℤ)P(x) ⇔ (∀x ∈ ℤ)(¬P(x))
Given two statements A and B, if B is true whenever A is true, we will say that
A implies B, and write A ⇒ B. For example, the truth tables show that p∧q ⇒ p∨q,
since for the case where p∧q is true (p = 1, q = 1), p∨q is true, also. In the cases
where p∧q is false, the value of p∨q is immaterial.
A basic knowledge of set theory is assumed. Some standard special symbols
will be repeatedly used to designate common sets.
∇ Definition 0.1
The set of natural numbers is given by ℕ = {0, 1, 2, 3, 4, ...}.
The set of integers is given by ℤ = {..., -2, -1, 0, 1, 2, ...}.
The set of rational numbers is given by ℚ = {a/b | a ∈ ℤ, b ∈ ℤ, b ≠ 0}.
The set of real numbers (points on the number line) will be denoted by ℝ.
The following concepts and notation will be used frequently throughout the text.
∇ Definition 0.3. Two sets A and B are said to be equal if they contain exactly
the same elements; that is, A = B iff (∀x)(x ∈ A ⇔ x ∈ B).
Δ
Thus, two sets A and B are equal iff A ⊆ B and B ⊆ A. The symbol ⊂ will be used to
denote a proper subset: A ⊂ B iff A ⊆ B and A ≠ B.

∇ Definition 0.4. For sets A and B, the cross product of A with B is the set of
all ordered pairs from A and B; that is, A × B = {(a, b) | a ∈ A ∧ b ∈ B}.
Δ
0.2 RELATIONS
EXAMPLE 0.2
Let X = {1, 2, 3}. The familiar relation < (less than) would then consist of the
following ordered pairs: <: {(1, 2), (1, 3), (2, 3)}, by which we mean to indicate that
1 < 2, 1 < 3, and 2 < 3. (3, 3) ∉ <, since 3 ≮ 3.
Some relations have special properties. For example, the relation "less than"
is transitive, by which we mean that for any numbers x, y, and z, if x < y and y < z,
then x < z. Definition 0.6 describes an important class of relations that have some
familiar properties.
∇ Definition 0.6
A relation is reflexive iff (∀x)(xRx).
A relation is symmetric iff (∀x)(∀y)(xRy ⇒ yRx).
A relation is transitive iff (∀x)(∀y)(∀z)((xRy ∧ yRz) ⇒ xRz).
An equivalence relation is a relation that is reflexive, symmetric, and transitive.
EXAMPLE 0.3
< is not an equivalence relation; while it is transitive, it is not reflexive since 3 ≮ 3.
(It is also not symmetric, since 2 < 3, but 3 ≮ 2.)
EXAMPLE 0.4
and it is clear that (∀x)(∀y)(x = y ⇒ y = x). The equality relation is therefore
symmetric, and it is likewise obvious that = is also reflexive and transitive.
EXAMPLE 0.5
The equivalence classes for = are singleton sets: [1]= = {1}, [5]= = {5}, and so on.
EXAMPLE 0.6
If (x, y) is viewed as the fraction x/y, then R is the relation that identifies
equivalent fractions: 2/3 R 14/21, since 2·21 = 3·14. In this sense, R can be viewed as
the equality operator on the set of rational numbers ℚ.
Note that in this context the equivalence class [2/8]R represents the set of all
"names" for the point one-fourth of the way between 0 and 1; that is,
[2/8]R = {..., -3/-12, -2/-8, -1/-4, 1/4, 2/8, 3/12, 4/16, 5/20, ...}
There are therefore many other ways of designating this same set; for example,
[1/4]R = {..., -3/-12, -2/-8, -1/-4, 1/4, 2/8, 3/12, 4/16, 5/20, ...}
EXAMPLE 0.7
∇ Definition 0.8. Given a set X and sets A1, A2, ..., An, P = {A1, A2, ..., An}
is a partition of X if the sets in P are all subsets of X, they cover X, and are pairwise
disjoint. That is, the following three conditions are satisfied:
(∀i ∈ {1, 2, ..., n})(Ai ⊆ X)
(∀x ∈ X)(∃i ∈ {1, 2, ..., n} ∋ x ∈ Ai)
(∀i, j ∈ {1, 2, ..., n})(i ≠ j ⇒ Ai ∩ Aj = ∅)
∇ Definition 0.9. Given a set X and a partition P = {A1, A2, ..., An} of X, the
relation R(P) in X induced by P is given by
(∀x ∈ X)(∀y ∈ X)(x R(P) y ⇔ (∃i ∈ {1, 2, ..., n} ∋ x ∈ Ai ∧ y ∈ Ai))
EXAMPLE 0.8
Let X = {1, 2, 3, 4, 5} and consider the relation Q = R(S) induced by the partition
S = {{1, 2}, {3, 5}, {4}}. Since 1 and 2 are in the same set, they should be related by Q,
while 1 is not related to 4 by Q because 1 and 4 belong to different sets of the
partition. Q can be described by
Q = {(1, 1), (1, 2), (2, 1), (2, 2), (3, 3), (3, 5), (4, 4), (5, 3), (5, 5)}
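The induced relation can also be computed directly. The following C sketch (not from the text) represents the partition S by recording each element's block and prints exactly the nine pairs of Q:

#include <stdio.h>

/* Build R(S) for S = {{1,2},{3,5},{4}}: x R(S) y iff x and y
   lie in the same block of the partition. */
int main(void)
{
    int block[6] = { 0, 1, 1, 2, 3, 2 };   /* block[x] = block containing x */
    int x, y;

    for (x = 1; x <= 5; x++)
        for (y = 1; y <= 5; y++)
            if (block[x] == block[y])
                printf("(%d, %d) ", x, y);
    printf("\n");
    return 0;
}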
0.3 FUNCTIONS
EXAMPLE 0.10
Let n be a positive integer. Define fn: ℕ → ℕ by fn(j) = the smallest natural number
i for which j ≡ i mod n. f3, for example, is a function and is represented by the
ordered pairs f3: {(0, 0), (1, 1), (2, 2), (3, 0), (4, 1), ...}. This implies that f3(0) = 0,
f3(1) = 1, f3(2) = 2, f3(3) = 0, and so on.
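For natural numbers, fn is just the remainder upon division by n, so it is a one-line computation; a small C illustration (an aside, not from the text):

#include <stdio.h>

/* f_n(j) = the smallest natural number i with j congruent to i mod n;
   for j a natural number this is simply j % n. */
int f(int n, int j)
{
    return j % n;
}

int main(void)
{
    int j;

    for (j = 0; j <= 5; j++)
        printf("f_3(%d) = %d\n", j, f(3, j));   /* prints 0 1 2 0 1 2 */
    return 0;
}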
given first coordinate; in general, a proposed relation may also fail to be well
defined by associating no objects with a potential first coordinate.
EXAMPLE 0.11
The problem with this seemingly innocent definition is that 0.25 is actually an
equivalence class of fractions (recall Example 0.6), and the definition of g was based
on just one representative of that class. We observed that two representatives (2/8
and 5/20) of the same class gave conflicting answers (2 and 5) for the value that g
associated with their class (0.25). While it is possible to define functions on a set of
equivalence classes in a consistent manner, it will always be important to verify that
such functions are single valued.
Selection criteria, which determine whether a candidate does or does not
belong to a given set, are special types of functions.
EXAMPLE 0.12
The characteristic function for the set of odd numbers is the function f2 given in
Example 0.10.
To say that a set is well defined essentially means that the characteristic
function associated with that set is a well-defined function. A set of equivalence
classes can be ill defined if the definition is based on the representatives of those
equivalence classes.
EXAMPLE 0.13
Consider the "set" of fractions that have <?dd numerators, whose characteristic
"function" is defined by:
XB(mln) = 1 if m is odd
and
XB(mln) = 0 if m is even
This characteristic function suffers from flaws similar to those found in the function
g in Example 0.11. 1/4 = 2/8, and yet χB(1/4) = 1 while χB(2/8) = 0, which implies that
the fraction 1/4 belongs to B, while 2/8 is not an element of B. Due to this ambiguous
definition of set membership, B is not a well-defined set. B failed to pass the test: if
x = y, then (x ∈ B iff y ∈ B).
EXAMPLE 0.14
The function g: {1, 2, 3} → {a, b} defined by g(1) = a, g(2) = b, and g(3) = a is onto
since both codomain elements are part of the range of g. However, the function
h: {1, 2, 3} → {a, b, c} defined by h(1) = a, h(2) = b, and h(3) = a is not onto since no
domain element maps to c.
The function f: ℕ → ℕ defined by f(i) = i + 1 (∀i = 0, 1, 2, ...) is not onto
since there is no element x for which f(x) = 0.
EXAMPLE 0.15
∇ Definition 0.16. A function is a bijection iff it is one to one and onto (injective
and surjective).
A bijective function must therefore have exactly one first coordinate associated with
any given second coordinate.
EXAMPLE 0.16
The function f: ℕ → ℕ defined by f(i) = i + 1 (∀i = 0, 1, 2, ...) is injective but not
surjective, so it is not a bijection. However, the function b: ℤ → ℤ defined by
b(i) = i + 1 (∀i = ..., -2, -1, 0, 1, 2, ...) is a bijection. Note that while the rule
for b remains the same as for f, both the domain and range have been expanded,
and many more ordered pairs have been added to form b.
It is often appropriate to take the results produced by one function and apply
the rule specified by a second function. For example, we may have a list associating
students with their height in inches (that is, we have a function relating names with
numbers). The conversion rule for changing inches into centimeters is also a func-
tion (associating any given number of inches with the corresponding length in
centimeters), which can be applied to the heights given in the student list to produce
a new list matching student names with their height in centimeters. This new list is
referred to as the composition of the original two functions.
Note that the composition is not defined unless the codomain of the first function
matches the domain of the second function. In functional notation, g∘f =
{(x, z) | ∃y ∈ Y ∋ f(x) = y and g(y) = z}, and therefore when g∘f is defined, it can be
described by the rule g∘f(x) = g(f(x)).
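The student-height example can be rendered directly in code. In the following C sketch (the function names and data are hypothetical, introduced only for illustration), f maps a student to a height in inches and g converts inches to centimeters, so g∘f maps a student directly to centimeters:

#include <stdio.h>

/* f: student index -> height in inches (made-up sample data) */
double f(int student)
{
    static const double inches[] = { 64.0, 70.5, 68.0 };
    return inches[student];
}

/* g: inches -> centimeters */
double g(double x)
{
    return 2.54 * x;
}

/* (g o f)(student) = g(f(student)) */
double g_of_f(int student)
{
    return g(f(student));
}

int main(void)
{
    int s;

    for (s = 0; s < 3; s++)
        printf("student %d: %.1f in = %.2f cm\n", s, f(s), g_of_f(s));
    return 0;
}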
EXAMPLE 0.17
Consider the functions f3 from Example 0.10 and f from Example 0.14, where
f3: ℕ → ℕ was defined by f3(j) = the smallest natural number i for which j ≡ i mod 3,
and the function f: ℕ → ℕ is defined by f(i) = i + 1. f∘f3 consists of the ordered
pairs {(0, 1), (1, 2), (2, 3), (3, 1), (4, 2), (5, 3), ...} and is represented by the rule
f∘f3(j) = f3(j) + 1, which happens to be the smallest positive number that is con-
gruent to j + 1 mod 3. Note that f3∘f(j) = f3(j + 1), which happens to be the
smallest natural number that is congruent to j + 1 mod 3. This represents the
different set of ordered pairs {(0, 1), (1, 2), (2, 0), (3, 1), (4, 2), (5, 0), ...}. In most
cases, f∘g ≠ g∘f.
When the inverse exists, it is appropriate to use functional notation for f⁻¹ also, and
we therefore have, for any elements a and b, f⁻¹(b) = a iff f(a) = b. Note that if
f: X → Y, then f⁻¹: Y → X.
EXAMPLE 0.18
Consider the ordered pairs for the relation <: {(1, 2), (1, 3), (2, 3)}. The converse is
then >: {(2, 1), (3, 1), (3, 2)}. Thus, the converse of "less than" is the relation "greater
than."
The function b: ℤ → ℤ defined by b(i) = i + 1 (∀i = ..., -2, -1, 0, 1, 2, ...)
has the inverse b⁻¹: ℤ → ℤ defined by b⁻¹(i) = i − 1 (∀i = ..., -2, -1, 0, 1, 2, ...).
The inverse of the function that increments integers by 1 is the function that
decrements integers by the same amount.
The function f: ℤ → ℤ defined by f(i) = i² (∀i = ..., -2, -1, 0, 1, 2, ...) has a
converse that is not a function over the given domain and codomain; the inverse
notation is inappropriate, since f⁻¹(3) is not defined, nor is f⁻¹(-4).
Not surprisingly, if the converse of f is to be a function, the codomain of f
(which will be the new domain of f⁻¹) must satisfy conditions similar to those
imposed on the domain of f. In particular:
If f is a bijection, f⁻¹ must exist and will also be a bijection. In fact, the compositions
f⁻¹∘f and f∘f⁻¹ are the identity functions on the domain and codomain, respectively
(see the exercises).
The size of various sets will frequently be of interest in the topics covered in this
text, and it will occasionally be necessary to consider the set of all subsets of a given
set.
∇ Definition 0.19. Given a set A, the power set of A, denoted by p(A) or 2^A, is
p(A) = {X | X ⊆ A}
EXAMPLE 0.19
p({a, b, c}) = {∅, {a}, {b}, {c}, {a, b}, {a, c}, {b, c}, {a, b, c}}
and
p({ }) = {∅}.
∇ Definition 0.20. Two sets X and Y are equipotent if there exists a bijection
f: X → Y, and we will write ||X|| = ||Y||. ||X|| denotes the cardinality of X, that is, the
number of elements in X.
Δ
That is, sets with the same cardinality or "size" are equipotent. The equipotent
relation is reflexive, symmetric, and transitive and is therefore an equivalence
relation.
EXAMPLE 0.20
The function g: {a, b, c} → {x, y, z} defined by g(a) = z, g(b) = y, and g(c) = x is a
bijection, and thus ||{a, b, c}|| = ||{x, y, z}||. The equivalence class consisting of all sets
that are equipotent to {a, b, c} is generally associated with the cardinal number 3.
Thus, ||{a, b, c}|| = 3; ||{ }|| = 0. {a, b, c} is not equipotent to { }, and hence 3 ≠ 0.
The subset relation allows the sizes of sets to be ordered: ||A|| ≤ ||B|| iff
(∃C)(C ⊆ B ∧ ||A|| = ||C||). We will write ||A|| < ||B|| iff (||A|| ≤ ||B|| and ||A|| ≠ ||B||).
The observations about {a, b, c} and { } imply that 0 < 3.
For ℕ = {0, 1, 2, 3, 4, 5, 6, ...} and E = {0, 2, 4, 6, ...}, the function f: ℕ → E,
defined by f(x) = 2x, is a bijection. The set of natural numbers ℕ is countably
infinite, and its size is often denoted by ℵ₀ = ||ℕ||. The doubling function f shows that
||ℕ|| = ||E||. Similarly, it can be shown that ℤ and ℕ × ℕ are also countably infinite
(see the exercises). A set that is equipotent to one of its proper subsets is called an
infinite set. Since ||ℕ|| = ||E|| and yet E ⊂ ℕ, we know that ℕ must be infinite. No
such correspondence between {a, b, c} and any of its proper subsets is possible, so
{a, b, c} is a finite set. 3 is therefore a finite cardinal number, while ℵ₀ represents an
infinite cardinal number.
Theorem 0.4 compares the size of a set A with the number of subsets of A and
shows that ||A|| < ||p(A)||. For the sets in Example 0.19, we see that 3 < 8 and 0 < 1,
which is not unexpected. It is perhaps surprising to find that the theorem will also
apply to infinite sets; for example, ||ℕ|| < ||p(ℕ)||. This means that there are cardinal
numbers larger than ℵ₀; there are infinite sets that are not countably infinite.
Indeed, the next theorem implies that there is an unending progression of infinite
cardinal numbers.
∇ Theorem 0.4. Let A be any set. Then ||A|| < ||p(A)||.
Proof. There is a bijection between A and the set of all singleton subsets of A,
as shown by the function s: A → {{x} | x ∈ A} defined by s(z) = {z} for each z ∈ A.
Since {{x} | x ∈ A} ⊆ p(A), we have ||A|| ≤ ||p(A)||. It remains to show that
||A|| ≠ ||p(A)||. By definition of cardinality, we must show that there cannot exist a
bijection between A and p(A). The following proof by contradiction will show this.
Assume f: A → p(A) is a function; we will demonstrate that there must exist a
set in p(A) that is not in the range of f, and hence f cannot be onto. Consider an
element z of A and the set f(z) to which it maps. f(z) is a subset of A, and hence z
may or may not belong to f(z). Define B to be the set {y ∈ A | y ∉ f(y)}. B is then
the set of all elements of A that do not appear in the set corresponding to their
image under f. It is impossible for B to be in the range of f, for if it were then there
would be an element of A that maps to this subset: assume w ∈ A and f(w) = B.
Since w is an element of A, it might belong to B, which is a subset of A. If w ∈ B,
then w ∈ f(w), since f(w) = B; but the elements for which y ∈ f(y) were exactly the
ones omitted from B, and thus we would have w ∉ B, which is a contradiction. Our
speculation that w might belong to B is therefore incorrect. The only other option is
that w does not belong to B. But if w ∉ B = f(w), then w is one of the elements that
are supposed to be in B, and we are again faced with the impossibility that w ∉ B and
w ∈ B. In all cases, we reach a contradiction if we assume that there exists an
element w for which f(w) = B. Thus, B was a member of the codomain that is not in
the range of f, and f is therefore not a bijection.
Δ
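Although the theorem concerns arbitrary (including infinite) sets, the diagonal construction can be watched in action on a small finite set. The C program below (an illustration, not from the text) takes A = {0, 1, 2}, represents each subset of A as a 3-bit mask, enumerates all 8³ = 512 functions f: A → p(A), and confirms that the diagonal set B = {y | y ∉ f(y)} is never in the range of f:

#include <stdio.h>

/* Verify the diagonal argument for A = {0, 1, 2}:
   subsets of A are the bitmasks 0..7. */
int main(void)
{
    int f0, f1, f2, hits = 0;

    for (f0 = 0; f0 < 8; f0++)
        for (f1 = 0; f1 < 8; f1++)
            for (f2 = 0; f2 < 8; f2++) {
                int f[3], B = 0, y, z;

                f[0] = f0; f[1] = f1; f[2] = f2;
                for (y = 0; y < 3; y++)
                    if (!((f[y] >> y) & 1))   /* y not in f(y) ... */
                        B |= 1 << y;          /* ... so put y in B */
                for (z = 0; z < 3; z++)
                    if (f[z] == B)            /* does B appear in the range? */
                        hits++;
            }
    printf("functions whose range contains B: %d of 512\n", hits);
    return 0;
}

The count printed is always 0, exactly as the proof predicts: any z with f(z) = B would have to both belong and not belong to B.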
Sets that are finite or are countably infinite are called countable or denumer-
able because their elements can be arranged one after the other (enumerated). We
will often need to prove that a given statement is true in an infinite variety of cases
that can be enumerated by the natural numbers 0, 1,2, .... The assertion that the
sum of the first n positive numbers can be predicted by multiplying n by the number
one larger than n and dividing the result by 2 seems to be true for various test values
of n:
1 + 2 + 3 = 3(3 + 1)/2
1 + 2 + 3 + 4 + 5 = 5(5 + 1)/2
and so on. We would like to show that the assertion is true for all values of
n = 1,2,3, ... , but we clearly could never check the arithmetic individually for an
infinite number of cases. The assertion, which varies according to the particular
number n we choose, can be represented by the statement

P(n): 1 + 2 + 3 + ... + n = (n + 1)n/2

Note that P(n) is not a number; it is the assertion that two numbers are the same and
therefore will only take on the values True and False. We would like to show that
P(n) is true for each positive integer n; that is, (∀n)P(n). Notice that if you were to
attempt to check whether P(101) was true, your work would be considerably
simplified if you already knew how the first 100 numbers added up. If the first 100
summed to 5050, it is clear that 1 + 2 + ... + 99 + 100 + 101 = (1 + 2 + ... +
99 + 100) + 101 = 5050 + 101 = 5151; the hard part of the calculation can be done
without doing arithmetic with 101 separate numbers. Checking that (101 + 1)101/2
agrees with 5151 shows that P(101) is indeed true [that is, as long as we are sure that
our calculations in verifying P(100) are correct]. Essentially, the same technique
could have been used to show that P(6) followed from P(5). This trick of using the
results of previous cases to help verify further cases is reflected in the principle of
mathematical induction.
∇ Theorem 0.5. Let P(n) be a statement for each natural number n ∈ ℕ. From
the two hypotheses
i. P(0)
ii. (∀m ∈ ℕ)(P(m) ⇒ P(m + 1))
we can conclude (∀n ∈ ℕ)P(n).
EXAMPLE 0.21
Consider the statement discussed above, where P(n) was the assertion that
1 + 2 + 3 + ... + (n − 2) + (n − 1) + n adds up to (n + 1)n/2. We will begin with
P(1) (the basis step) and note that 1 = (1 + 1)·1/2, so P(1) is indeed true. For the
inductive step, let m be an arbitrary (but fixed) positive integer, and assume
P(m) is true; that is, 1 + 2 + 3 + ... + (m − 2) + (m − 1) + m adds up to
(m + 1)m/2. We need to show P(m + 1): 1 + 2 + 3 + ... + (m + 1 − 2) +
(m + 1 − 1) + (m + 1) adds up to (m + 1 + 1)(m + 1)/2. As in the case of pro-
ceeding from 100 to 101, we will use the fact that the first m integers add up
correctly (the induction assumption) to see how the first m + 1 integers add up. We
have:
1 + 2 + 3 + ... + (m + 1 − 2) + (m + 1 − 1) + (m + 1)
= (1 + 2 + 3 + ... + (m + 1 − 2) + (m + 1 − 1)) + (m + 1)
= (m + 1)m/2 + (m + 1)
= (m + 1)m/2 + 2(m + 1)/2
= ((m + 1)m + 2(m + 1))/2
= (m + 1)(m + 2)/2
= (m + 1 + 1)(m + 1)/2
P(m + 1) is therefore true, and P(m + 1) indeed follows from P(m). Since m was
arbitrary, (∀m)(P(m) ⇒ P(m + 1)) and, by induction, (∀n ≥ 1)P(n). The formula is
therefore true for every positive integer n. It is interesting to note that, with the
usual convention of defining the sum of no integers to be zero, the formula also
holds for n = 0, and P(0) could have been used as the basis step to prove
(∀n ∈ ℕ)P(n).
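A quick numerical spot-check of P(n) is also easy to write; the C loop below (not a substitute for the induction proof, merely corroborating evidence) compares the running sum with the closed form for the first few n:

#include <stdio.h>

/* Compare 1 + 2 + ... + n with (n + 1)n/2 for n = 1..10. */
int main(void)
{
    long n, sum = 0;

    for (n = 1; n <= 10; n++) {
        sum += n;
        printf("n=%2ld  sum=%3ld  (n+1)n/2=%3ld\n", n, sum, (n + 1) * n / 2);
    }
    return 0;
}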
EXAMPLE 0.22
Consider the statement
Any statement formula using the n variables p1, p2, ..., pn has an equivalent
expression that contains less than n·2^n operators.
0.5 RECURSION
Since this text will be dealing with devices that repeatedly perform certain oper-
ations, it is important to understand the recursive definition of functions and how to
effectively investigate the properties of such functions. Recall that the factorial
function (f(n) = n!) is defined to be the product of the first n integers. Thus,
f(1) =1
f(2) = 1·2 = 2
f(3) = 1·2·3 = 6
f(4) = 1·2·3·4 = 24
and so on. Note that individual definitions get longer as n increases. If we adopt the
convention that f(0) = 1, the factorial function can be recursively defined in terms of
other values produced by the function:
f(0) = 1
f(n + 1) = (n + 1)·f(n)
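The recursive definition translates directly into a recursive procedure. A C version (a sketch; the text's own programs appear in Chapter 1) is:

#include <stdio.h>

/* The factorial function, mirroring f(0) = 1 and
   f(n + 1) = (n + 1) * f(n). */
unsigned long factorial(unsigned int n)
{
    if (n == 0)
        return 1;                    /* base case */
    return n * factorial(n - 1);     /* recursive case */
}

int main(void)
{
    unsigned int n;

    for (n = 0; n <= 5; n++)
        printf("%u! = %lu\n", n, factorial(n));
    return 0;
}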
EXAMPLE 0.23
The constraints for integer constants, which may begin with a sign and must consist
of one or more digits, are succinctly described by the following productions
(replacement rules):
<sign> ::= + | -
<digit> ::= 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9
<natural> ::= <digit> | <digit><natural>
<integer> ::= <natural> | <sign><natural>
The symbol | represents "or," and the rule
<sign> ::= + | -
should be interpreted to mean that the token <sign> can be replaced by either the
symbol + or the symbol -. A typical integer constant is therefore +12, since it can
be derived by applying the above rules in the following fashion:
<integer> ⇒ <sign><natural> ⇒ +<natural> ⇒ +<digit><natural>
⇒ +1<natural> ⇒ +1<digit> ⇒ +12
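Each production also suggests a recognizing procedure, one per token. The following C sketch (the helper names such as is_integer are hypothetical, not from the text) checks a string against the <integer> production by recursive descent:

#include <ctype.h>
#include <stdio.h>

static const char *p;   /* cursor into the candidate string */

/* <digit> ::= 0 | 1 | ... | 9 */
static int match_digit(void)
{
    if (isdigit((unsigned char) *p)) { p++; return 1; }
    return 0;
}

/* <natural> ::= <digit> | <digit><natural>  (one or more digits) */
static int match_natural(void)
{
    if (!match_digit()) return 0;
    while (match_digit())
        ;
    return 1;
}

/* <integer> ::= <natural> | <sign><natural> */
static int match_integer(void)
{
    if (*p == '+' || *p == '-')
        p++;                                 /* optional <sign> */
    return match_natural();
}

int is_integer(const char *s)
{
    p = s;
    return match_integer() && *p == '\0';    /* consume the whole string */
}

int main(void)
{
    const char *tests[] = { "+12", "-007", "42", "+", "1a" };
    int i;

    for (i = 0; i < 5; i++)
        printf("%-5s %s\n", tests[i],
               is_integer(tests[i]) ? "derivable" : "not derivable");
    return 0;
}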
Syntax diagrams for each of the four productions are shown in Figure 0.6. These can
be combined to form a diagram that does not involve the intermediate tokens
<sign>, <digit>, and <natural> (see Figure 0.7).
Figure 0.6 Syntax diagrams for the components of integer constants: (a) sign, (b) natural, (c) digit, (d) integer (diagrams omitted)

Figure 0.7 A syntax diagram for integer constants (diagram omitted)
EXERCISES
0.1. Construct truth tables for:
(a) ¬r ∨ (¬p ↓ ¬q)
(b) (p∧¬q) ∨ ¬(p↑q)
0.2. Draw circuit diagrams for:
(a) ¬(r ∨ (¬p ↓ ¬q)) ↑ (s∧p)
(b) (p∧¬q) ∨ ¬(p↑q)
0.3. Show that the sets {1, 2} × {a, b} and {a, b} × {1, 2} are not equal.
0.4. Let X = {1, 2, 3, 4}.
(a) Determine the set of ordered pairs comprising the relation <.
(b) Determine the set of ordered pairs comprising the relation =.
(c) Since relations are sets of ordered pairs, it makes sense to union them together.
Determine the set = ∪ <.
(d) Determine the set of ordered pairs comprising the relation ≤.
0.5. Let n be a natural number. Show that congruence modulo n, ≡n, is an equivalence
relation.
0.6. Let X = ℕ. Determine the equivalence classes for congruence modulo 0.
0.7. Let X = ℕ. Determine the equivalence classes for congruence modulo 1.
0.8. Let X = ℝ. Determine the equivalence classes for congruence modulo 1.
0.9. Let R be an arbitrary equivalence relation in X. Prove that the distinct equivalence
classes of R form a partition of X.
0.10. Given a set X and a partition P = {A1, A2, ..., An} of X, prove that X equals the union
of the sets in P.
0.11. Given a set X and a partition P = {A1, A2, ..., An} of X, prove that the relation R(P)
in X induced by P is an equivalence relation.
0.12. Let X = {1, 2, 3, 4}.
(a) Give an example of a partition P for which R(P) is a function.
(b) Give an example of a partition P for which R(P) is not a function.
0.13. The following "proof" seems to indicate that a relation that is symmetric and transitive
must also be reflexive:
0.17. Let P be the set of nonnegative real numbers, and consider the function s: P → P
defined by s(x) = x². Show that s⁻¹ exists.
0.18. Let f: X → Y be an arbitrary function. Prove that the converse of f is a function iff f is a
bijection.
0.19. (a) Let -A denote the complement of a set A. Prove that -(-A) = A.
(b) Let -R denote the converse of a relation R. Prove that -(-R) = R.
0.20. Let the functions f: X → Y and g: Y → Z be one to one. Prove that g∘f is one to one.
0.21. Let the functions f: X → Y and g: Y → Z be onto. Prove that g∘f is onto.
0.22. Define two functions for which f∘g = g∘f.
0.23. Define, if possible, a bijection between:
(a) ℕ and ℤ
(b) ℕ and ℕ × ℕ
(c) ℕ and ℚ
(d) ℕ and {a, b, c}
0.24. Use induction to prove that the sum of the cubes of the first n positive integers adds up
to n²(n + 1)²/4.
0.25. Use induction to prove that the sum of the first n positive integers is less than n² (for
n > 1).
0.26. Use induction to prove that, for n > 3, n! > n².
0.27. Use induction to prove that, for n > 3, n! > 2^n.
0.28. Use induction to prove that 1² + 2² + ... + n² = n(n + 1)(2n + 1)/6.
0.29. Prove by induction that X ∩ (X1 ∪ X2 ∪ ... ∪ Xn) = (X ∩ X1) ∪ (X ∩ X2) ∪ ... ∪
(X ∩ Xn).
0.30. Let -A denote the complement of the set A. Prove -(X1 ∪ X2 ∪ ... ∪ Xn) = (-X1) ∩
(-X2) ∩ ... ∩ (-Xn) by induction.
0.31. Use induction to prove that there are 2^n subsets of a set of size n; that is, for a finite set
A, ||p(A)|| = 2^||A||.
0.32. The principle of mathematical induction is often stated in the following form, which
requires (apparently) stronger hypotheses to reach the desired conclusion: Let P(n) be
a statement for each natural number n ∈ ℕ. From the two hypotheses
i. P(0)
ii. (∀m ∈ ℕ)(((∀i ≤ m)P(i)) ⇒ P(m + 1))
we can conclude (∀n ∈ ℕ)P(n). Prove that the strong form of induction is equivalent to
the statement of induction given in the text. Hint: Consider the restatement of the
hypothesis given in Example 0.22.
0.33. Determine what types of strings are defined by the following BNF:
<sign> ::= + | -
<digit> ::= 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9
<natural> ::= <digit> | <digit><natural>
<integer> ::= <natural> | <sign><natural>
<real constant> ::= <integer> |
<integer>. |
<integer>. <natural> |
<integer>. <natural>E<integer>
0.34. A set X is cofinite if the complement of X (with respect to some generally understood
universal set) is finite. Let the universal set be ℤ. Give an example of
p1 p2 p3 | q
 0  0  0 | 0
 0  0  1 | 0
 0  1  0 | 1
 0  1  1 | 0
 1  0  0 | 1
 1  0  1 | 0
 1  1  0 | 0
 1  1  1 | 1

Figure 0.8 The truth table for Exercise 0.41

p1 p2 p3 p4 | q1 q2 q3
 0  0  0  0 |  1  1  1
 0  0  0  1 |  0  1  0
 0  0  1  0 |  1  0  0
 0  0  1  1 |  1  1  0
 0  1  0  0 |  0  1  1
 0  1  0  1 |  1  0  0
 0  1  1  0 |  1  1  0
 0  1  1  1 |  0  0  0
 1  0  0  0 |  1  0  1
 1  0  0  1 |  1  0  0
 1  0  1  0 |  0  1  0
 1  0  1  1 |  0  0  0
 1  1  0  0 |  1  1  1
 1  1  0  1 |  0  1  0
 1  1  1  0 |  0  1  1
 1  1  1  1 |  0  0  0

Figure 0.9 The truth table for Exercise 0.42
CHAPTER 1

INTRODUCTION AND BASIC DEFINITIONS
This chapter introduces the concept of a finite automaton, which is perhaps the
simplest form of abstract computing device. Although finite automata theory is
concerned with relatively simple machines, it is an important foundation of a large
number of concrete and abstract applications. The finite-state control of a finite
automaton is also at the heart of more complex computing devices such as finite-
state transducers (Chapter 7), pushdown automata (Chapter 10), and Turing ma-
chines (Chapter 11).
Applications for finite automata can be found in the algorithms used for string
matching in text editors and spelling checkers and in the lexical analyzers used by
assemblers and compilers. In fact, the best known string matching algorithms are
based on finite automata. Although finite automata are generally thought of as
abstract computing devices, other noncomputer applications are possible. These
applications include traffic signals and vending machines or any device in which
there are a finite set of inputs and a finite set of things that must be "remembered"
by the device.
Briefly, a deterministic finite automaton, also called a recognizer or acceptor, is
a mathematical model of a finite-state computing device that recognizes a set of
words over some alphabet; this set of words is called the language accepted by the
automaton. For each word over the alphabet of the automaton, there is a unique
path through the automaton; if the path ends in what is called a final or accepting
state, then the word traversing this path is in the language accepted by the auto-
maton.
Finite automata represent one attempt at employing a finite description to
rigorously define a (possibly) infinite set of words (that is, a language). Given such a
1.1 ALPHABETS AND WORDS
The devices we will consider are meant to react to and manipulate symbols. Differ-
ent applications may employ different character sets, and we will therefore take care
to explicitly mention the alphabet under consideration.
EXAMPLE 1.1
i. {0, 1}
ii. {a, b, c}
iii. {(0, 0), (0, 1), (1, 0), (1, 1)}
As formally specified in Definition 1.5, the order in which the symbols of the
word occur will be deemed significant, and therefore a word of length 3 can be
identified with an ordered triple belonging to Σ × Σ × Σ. Indeed, one may view the
three-letter word bca as a convenient shorthand for the ordered triple (b, c, a). A
word over an alphabet is thus an ordered string of symbols, where each symbol in
the string is an element of the given alphabet. Obvious examples of words are the
ones you are reading right now, which are words (or strings) over the standard
English alphabet. In some contexts, these strings of symbols are occasionally called
sentences.
EXAMPLE 1.2
Let Σ = {0, 1, 2, 3, 4, 5, 6, 7, 8, 9}; some examples of words over this alphabet are
i. 42
ii. 242342
Even though only three different members of Σ occur in the second example,
the length of 242342 is 6, as each symbol is counted each time it occurs. To easily
and succinctly express these concepts, the absolute value notation will be employed
to denote the length of a string. Thus, |42| = 2, |242342| = 6, and |a1a2a3a4| = 4.

∇ Definition 1.3. For a given alphabet Σ and a word x = a1a2...an over Σ, |x|
denotes the length of x. That is, |a1a2...an| = n.
Δ
When the operation of concatenation is clear from the context, we will adopt the
convention of omitting the symbol for the operator (as is done in arithmetic with the
multiplication operator). Thus xyz refers to x·y·z. In fact, in Chapter 6 it will be
seen that the operation of concatenation has many algebraic properties that are
similar to those of arithmetic multiplication.
It is often necessary to count the number of occurrences of a given symbol
within a word. The notation described in the next definition will be an especially
useful shorthand in many contexts.
∇ Definition 1.6. Given an alphabet Σ and some b ∈ Σ, the length of a word w
with respect to b, denoted |w|b, is the number of occurrences of the letter b within
that word.
Δ
EXAMPLE 1.3
i. |abb|b = 2
ii. |abb|c = 0
iii. |1000000001118881888888|1 = 5
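Computing |w|b is a single pass over the word; a brief C version (an aside, not from the text):

#include <stdio.h>

/* |w|_b: the number of occurrences of the letter b in the word w. */
int length_wrt(const char *w, char b)
{
    int count = 0;

    for (; *w != '\0'; w++)
        if (*w == b)
            count++;
    return count;
}

int main(void)
{
    printf("|abb|_b = %d\n", length_wrt("abb", 'b'));   /* 2 */
    printf("|abb|_c = %d\n", length_wrt("abb", 'c'));   /* 0 */
    return 0;
}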
The empty word is often denoted by ε in other formal language texts. The empty
string serves as the identity element for concatenation. That is, for all strings x,
x·λ = λ·x = x
Even though the empty word is represented by a single character, λ is a string but is
not a member of any alphabet: λ ∉ Σ.
A particular string x can be divided into substrings in several ways. If we
choose to break x up into three substrings u, v, and w, there are many ways to
accomplish this. For example, if x = abccdbc, it could be written as ab·ccd·bc; that
is, x = uvw, where u = ab, v = ccd, and w = bc. This x could also be written as
abc·λ·cdbc, where u = abc, v = λ, and w = cdbc. In this second case, |x| = 7 =
3 + 0 + 4 = |u| + |v| + |w|.
A fundamental structure in formal languages involves sets of words. A simple
example of such a set is Σ^k, the collection of all words of exactly length k (for some
k ∈ ℕ) that can be constructed from the letters of Σ.
EXAMPLE 1.4
If
Σ = {0, 1}
then
Σ^0 = {λ}
Σ^1 = {0, 1}
Σ^2 = {00, 01, 10, 11}
Σ^3 = {000, 001, 010, 011, 100, 101, 110, 111}
λ is the only element of Σ^0, the set of all words containing zero letters from Σ. There
is no difficulty in letting λ be an element (and the only element) of Σ^0, since each Σ^k
is not necessarily an alphabet, but is instead a set of words; λ, according to the
definition, is indeed a word consisting of zero letters.
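For Σ = {0, 1}, the k-letter words correspond to the k-bit binary expansions of 0 through 2^k − 1, which gives a simple way to list Σ^k. A C sketch (not from the text):

#include <stdio.h>

/* List Sigma^k for Sigma = {0, 1}. */
void print_sigma_k(int k)
{
    int n, i;

    printf("Sigma^%d = {", k);
    for (n = 0; n < (1 << k); n++) {
        for (i = k - 1; i >= 0; i--)
            putchar(((n >> i) & 1) ? '1' : '0');
        if (n + 1 < (1 << k))
            printf(", ");
    }
    printf("}\n");   /* for k = 0 the single word printed is the empty word */
}

int main(void)
{
    int k;

    for (k = 0; k <= 3; k++)
        print_sigma_k(k);
    return 0;
}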
Σ* = Σ^0 ∪ Σ^1 ∪ Σ^2 ∪ Σ^3 ∪ ...
and
Σ+ = Σ^1 ∪ Σ^2 ∪ Σ^3 ∪ ...
Σ* is the set of all words that may be constructed from the letters of an alphabet Σ.
Σ+ is the set of all nonempty words that may be constructed from Σ.
Σ*, like the set of natural numbers, is an infinite set. Although Σ* is infinite,
each word in Σ* is of finite length. This property follows from the definition of Σ*
and a property of natural numbers: any k ∈ ℕ must by definition be a finite number.
Σ* is defined to be the union of all Σ^k, k ∈ ℕ. Since each such k is a finite number
and every word in Σ^k is of length k, then every word in Σ^k must be of finite length.
Furthermore, since Σ* is the union of all such Σ^k, every word in Σ* must also be of
finite length. While Σ* can contain arbitrarily long words, each of these words must
be finite, just as every number in ℕ is finite.
Since Σ* is the union of all Σ^k for k ∈ ℕ, Σ* must also contain Σ^0. In other
words, besides containing all words that can be constructed from one or more letters
of Σ, Σ* also contains the empty word λ. While λ ∉ Σ, λ ∈ Σ*. λ represents a string
and not a symbol, and thus the empty string cannot be in the alphabet Σ. However,
λ is included in Σ*, since Σ* is not just an alphabet, but a collection of words over
the alphabet Σ. Note, however, that Σ+ is Σ* − {λ}; Σ+ specifically excludes λ.
1.2 DEFINITION OF A FINITE AUTOMATON

We now have the building blocks necessary to define deterministic finite automata.
A deterministic finite automaton is a mathematical model of a machine that accepts a
particular set of words over some alphabet Σ.
A useful visualization of this concept might be referred to as the black box
model. This conceptualization is built around a black box that houses the finite-state
control. This control reacts to the information provided by the read head, which
extracts data from the input tape. The control also governs the operation of the
output indicator, often depicted as an acceptance light, as shown in Figure 1.1.
There is no limit to the number of symbols that can be on the tape (although
each individual word must be of finite length). As the input tape is read by the
machine, state transitions, which alter the current state of the automaton, take place
within the black box. Depending on the word contained on the input tape, the light
bulb either lights or remains dark when the end of the input string is reached,
indicating acceptance or rejection of the word, respectively. We assume that the
input head can sense when it has passed the last symbol on the tape.
In some sense, a personal computer fits the finite-state control model; it reacts
to each keystroke entered from the keyboard according to the current state of the
CPU and its own internal memory. However, the number of possible bit patterns
that even a small computer can assume is so astronomically large that it is totally
impractical to model a computer in this fashion. Finite-state machines can be
profitably used to qescribe portions of a computer (such as parts of the arithmetic/
logic unit, as discussed in Chapter 7, Example 7.15) and other devices that assume a
reasonable number of states.
Although finite automata are usually thought of as processing strings of letters
over some alphabet, the input can conceptually be elements from any finite set. A
useful example is the "brain" of a vending machine, which, say, dispenses 30¢ candy
bars.
EXAMPLE 1.5
The input to the vending machine is the set of coins {nickel, dime, quarter}, repre-
sented by n, d, and q in Figure 1.2. The machine may only "remember" a finite
number of things; in this case, it will keep track of the amount of money that has
been dropped into the machine. Thus, the machine may be in the "state" of
remembering that no money has yet been deposited (denoted in this example by
<0¢>), or that a single nickel has been inserted (the state labeled <5¢>), or that
either a dime or two nickels have been deposited (<10¢>), and so on. Note that
from state <0¢> there is an arrow labeled by the dime token d pointing to the state
<10¢>, indicating that, at a time when the machine "believes" that no money has
been deposited, the insertion of a dime causes the machine to transfer to the state
that remembers that ten cents has been deposited. From the <0¢> state, the arrows
in the diagram show that if two nickels (n) are input the machine moves through the
<5¢> state and likewise ends in the state labeled <10¢>.
The vending machine thus counts the amount of change dropped into the
machine (up to 50¢). The machine begins in the state labeled <0¢> and follows the
arrows to higher-numbered states as coins are inserted. For example, depositing a
nickel, a dime, and then a quarter would move the machine to the states <5¢>,
<15¢>, and then <40¢>. The states labeled 30¢ and above are doubly encircled to
indicate that enough money has been deposited; if 30¢ or more has been deposited,
then the machine "accepts," indicating that a candy bar may be selected.
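The vending-machine control is easily simulated. The C sketch below (an illustration, not from the text) tracks the deposited amount as the state, capping it at 50¢ as the diagram does, and accepts once 30¢ has been reached:

#include <stdio.h>

/* Simulate the vending machine of Example 1.5 on the input n d q. */
int main(void)
{
    const char coins[] = { 'n', 'd', 'q' };   /* nickel, dime, quarter */
    int state = 0, i;                         /* start state: <0 cents> */

    for (i = 0; i < 3; i++) {
        int value = (coins[i] == 'n') ? 5 : (coins[i] == 'd') ? 10 : 25;

        state += value;
        if (state > 50)
            state = 50;                       /* the diagram stops at <50 cents> */
        printf("coin %c -> state <%d cents>\n", coins[i], state);
    }
    printf("%s\n", state >= 30 ? "accepted: select a candy bar"
                               : "rejected: more money needed");
    return 0;
}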
Finite automata are appropriate whenever there are a finite number of inputs
and only a finite number of situations must be distinguished by the machine. Other
applications include traffic signals and elevators (as discussed in Chapter 7). We now
present a formal mathematical definition of a finite-state machine: a deterministic
finite automaton (DFA) is a quintuple A = <Σ, S, s0, δ, F>, in which Σ is a finite
alphabet, S is a finite set of states, s0 ∈ S is the designated start state, δ: S × Σ → S
is the state transition function, and F ⊆ S is the set of final states.
The input alphabet, Σ, for any deterministic finite automaton A, is the set of
symbols that can appear on the input tape. Each successive symbol in a word will
cause a transition from the present state to another state in the machine. As
specified by the δ function, there is exactly one such state transition for each
combination of a symbol a ∈ Σ and a state s ∈ S. This is the origin of the word
"deterministic" in the phrase "deterministic finite automaton."
The various states represent the memory of the machine. Since the number of
states in the machine is finite, the number of distinguishable situations that can be
remembered by the machine is also finite. This limitation of the device's ability to
store its past history is the origin of the word "finite" in the phrase "deterministic
finite automaton." At any given time during processing, if the previous history of
the machine is considered to be the reactions of the DFA to the letters that have
already been read, then the current state represents all that is known about the
history of the machine.
The start state of the machine is the state in which the machine always begins
processing a string. From this state, successive input symbols from Σ are used by the
δ function to arrive at successive states in the machine. Processing stops when the
string of symbols is exhausted. The state in which the machine is left can either be a
final state, in which case the word is accepted, or it can be any one of the other states
of S, in which case the word is rejected.
To produce a formal description of the concepts defined above, it is necessary
to enumerate each part of the quintuple that comprises the DFA. Σ, S, s0, and F are
easily enumerated, but the function δ can often be tedious to describe. One device
used to display the mapping δ is the state transition diagram. Besides graphically
displaying the transitions of the δ function, the state transition diagram for a deter-
ministic finite automaton also illustrates the other four parts of the quintuple.
A finite automaton state transition diagram is a directed graph. The states of
the machine represent the vertices of the graph, while the mapping of the δ function
describes the edges. Final states are denoted by a doubly encircled state, and the
start state is identified by a straight incoming arrow. Each domain element of the
transition function corresponds to an edge in the directed graph. We formally define
a finite automaton state transition diagram for <Σ, S, s0, δ, F> as a directed graph
G = (V, E), as follows:
i. V = S,
ii. E = {(s, t, a) | s, t ∈ S, a ∈ Σ ∧ δ(s, a) = t},
where V is the set of vertices of the graph, and E is the set of edges connecting these
vertices. Each element of E is an ordered triple, (s, t, a), such that s is the origin
vertex, t is the terminus, and a is the letter from Σ labeling the edge. Thus, for any
vertex there is exactly one edge leaving that vertex for each element of Σ.
EXAMPLE 1.6
In the DFA shown in Figure 1.3, the set of edges E of the graph G is given by
E = {(s0, s1, a), (s0, s2, b), (s1, s1, a), (s1, s2, b), (s2, s1, a), (s2, s0, b)}. The figure also
shows that s0 is the designated start state and that s1 is the only final state. The state
transition function for a finite automaton is often represented in the form of a state
transition table. A state transition table is a matrix with the rows of the matrix
labeled and indexed by the states of the machine, and the columns of the matrix
labeled and indexed by the elements of the input alphabet; the entries in the table
are the states to which the DFA will move. Formally, let T be a state transition table
for some deterministic finite automaton A = <Σ, S, s0, δ, F>, and let s ∈ S and
a ∈ Σ. Then the value of each matrix entry is given by the equation
(∀s ∈ S)(∀a ∈ Σ)(T[s, a] = δ(s, a))
For the automaton in Example 1.6, the state transition table is
δ  | a  b
s0 | s1 s2
s1 | s1 s2
s2 | s1 s0
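This tabular form is exactly what a program would store; the C fragment below (a sketch, not from the text) walks the table of Example 1.6 over a sample word:

#include <stdio.h>

/* The transition table of Example 1.6, indexed by state and by
   letter (a = column 0, b = column 1). */
enum state { s0, s1, s2 };

enum state delta[3][2] = {
    { s1, s2 },   /* from s0 on a, b */
    { s1, s2 },   /* from s1 on a, b */
    { s1, s0 }    /* from s2 on a, b */
};

int main(void)
{
    const char *w = "abab";   /* a sample input word */
    enum state t = s0;        /* begin in the start state */
    const char *c;

    for (c = w; *c != '\0'; c++)
        t = delta[t][*c - 'a'];
    printf("processing %s leaves the machine in s%d\n", w, (int) t);
    return 0;
}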
With δ, we can describe the state in which we will find ourselves after proces-
sing a single letter. We also want to be able to describe the state at which we will
arrive after processing an entire string. We will extend the δ function to cover entire
strings rather than just single letters; δ̄(s, x) will be the state we wind up at when
starting at s and processing, in order, all the letters of the string x. While this is a
relatively easy concept to (vaguely) state in English, it is somewhat awkward to
formally define. To facilitate formal proofs concerning DFAs, we use the following
recursive definition.

∇ Definition 1.11. Given a DFA A = <Σ, S, s0, δ, F>, the extended state transi-
tion function for A, denoted δ̄, is a function δ̄: S × Σ* → S defined recursively as
follows:
i. δ̄(s, λ) = s
ii. (∀a ∈ Σ)(δ̄(s, a) = δ(s, a))
iii. (∀x ∈ Σ*)(∀a ∈ Σ)(δ̄(s, xa) = δ(δ̄(s, x), a))
Δ
∇ Theorem 1.1. Let A = <Σ, S, s0, δ, F> be a DFA. Then
(∀s ∈ S)(∀x ∈ Σ*)(∀y ∈ Σ*)(δ̄(s, xy) = δ̄(δ̄(s, x), y))
Proof. The proof is by induction on the length of y; the inductive step concludes:
Therefore, P(m) ⇒ P(m + 1), and since this implication holds for any nonnegative
integer m, by the principle of mathematical induction we can say that P(n) is true for
all n ∈ ℕ. Since the statement therefore holds for any string y of any length, the
assertion is indeed true for all y in Σ*. This completes the proof of the theorem.
Δ
Note that the statement of Theorem 1.1 is very similar to the rule iii of the
recursive definition of the extended state transition function (Definition 1.11) with
the string y replacing the single letter a. We will see a remarkable number of
situations like this, where a recursive rule defined for a single symbol extends in a
natural manner to a similar rule for arbitrary strings.
As alluded to earlier, the state in which a string terminates is significant; in
particular, it is important to determine whether the terminal state for a string
happens to be one of the states that was designated to be a final state.
const
  MaxWordLength = 255; {an arbitrary constraint}
type
  Word = record
    Length: 0..MaxWordLength;
    Letters: packed array [0..MaxWordLength] of Sigma
  end; {Word}
We say a word w is accepted by a machine A = <Σ, S, s0, δ, F> iff the extended
state transition function δ̄ associated with A maps to a final state from s0 when
processing the word w. This means that the path from the start state ultimately leads
to a final state when the word w is presented to the machine. We will occasionally
say that A recognizes w; a DFA is sometimes referred to as a recognizer.
In other words, a word w is rejected by a machine A = <Σ, S, s0, δ, F> iff the δ̄
function associated with A maps to a nonfinal state from s0 when processing the
word w.
EXAMPLE 1.7
Let
A = <Σ, S, s0, δ, F>
where
Σ = {0, 1}
S = {q0, q1}
s0 = q0
F = {q1}
and δ is given by the table below, so that A is in state q1 exactly when an odd
number of 1s have been read:
δ  | 0  1
q0 | q0 q1
q1 | q1 q0
The functions Delta, DeltaBar, and Accept can be combined to form a
Pascal program that models a DFA. The sample fragments given in Figures 1.4, 1.5,
and 1.7 rightly pass the candidate string as a parameter. A full program would be
complicated by several constraints, including the awkward way in which strings must
be handled in Pascal. To highlight the correspondence between the code modules
and the automata definitions, the program given in Figure 1.8 handles input at the
character level rather than at the word level. The definitions in the procedure
Initialize reflect the structure of the DFA shown in Figure 1.9. Invoking this
program will produce a response to a single input word. For example, a typical
exchange would be
cba
Rejected
Running this program again might produce
cccc
Accepted
This behavior is essentially the same as that of the C program shown in Figure 1.10.
The succinct coding clearly shows the relationship between the components of the
quintuple for the DFA and the corresponding code.
program DFA(input, output);
type
  Sigma = 'a'..'c';
  State = (s0, s1, s2);
var
  TransitionTable: array [State, Sigma] of State;
  FinalState: set of State;

function Delta(s: State; a: Sigma): State;
begin
  Delta := TransitionTable[s, a]
end; { Delta }

function DeltaBar(s: State): State;
var
  t: State;
  a: char;
begin
  t := s;
  while not eoln(input) do
    begin
      read(input, a);
      t := Delta(t, a)
    end;
  DeltaBar := t
end; { DeltaBar }

function Accept: boolean;
begin
  Accept := DeltaBar(s0) in FinalState
end; { Accept }

procedure Initialize;
begin
  { the transition function of the DFA in Figure 1.9, matching }
  { the table in the C program of Figure 1.10 }
  TransitionTable[s0, 'a'] := s1; TransitionTable[s0, 'b'] := s0; TransitionTable[s0, 'c'] := s2;
  TransitionTable[s1, 'a'] := s2; TransitionTable[s1, 'b'] := s0; TransitionTable[s1, 'c'] := s0;
  TransitionTable[s2, 'a'] := s0; TransitionTable[s2, 'b'] := s0; TransitionTable[s2, 'c'] := s1;
  FinalState := [s2]
end; { Initialize }

begin { DFA }
  Initialize;
  if Accept then
    writeln(output, 'Accepted')
  else
    writeln(output, 'Rejected')
end. { DFA }

Figure 1.8 A Pascal program that emulates the DFA shown in Figure 1.9
words over the Roman alphabet and this collection is therefore a language accord-
ing to our definition. Note that a language L, in this context, is simply a list of
words; neither syntax nor semantics are involved in the specification of L. Thus, a
language as defined by Definition 1.14 has little of the structure or relationships one
would normally expect of either a natural language (like English) or a programming
language (like Pascal).
EXAMPLE 1.8
Some other examples of valid languages are
i. ∅
ii. {w ∈ {0, 1}* | |w| > 5}
iii. {λ}
iv. {λ, bilbo, frodo, samwise}
v. {x ∈ {a, b}* | |x|a = |x|b}
#include <stdio.h>
#include <stdlib.h>

#define INDEX(c)     ((int) (c) - (int) 'a')   /* letter -> column; macro name reconstructed */
#define FINAL_STATE  s_2

enum state { s_0, s_1, s_2 };

/*
** This table implements the state transition function and is indexed by
** the current state and the current input letter.
*/
enum state transition_table[3][3] = {
    { s_1, s_0, s_2 },
    { s_2, s_0, s_0 },
    { s_0, s_0, s_1 }
};

enum state delta(enum state s, char c)
{
    return transition_table[s][INDEX(c)];
}

enum state delta_bar(enum state s)
{
    enum state t = s;
    int c;

    /*
    ** Step through the input one letter at a time.
    */
    while ((char) (c = getchar()) != '\n')
        t = delta(t, c);
    return t;
}

int main(void)
{
    if (delta_bar(s_0) == FINAL_STATE)
        printf("Accepted\n");
    else
        printf("Rejected\n");
    exit(0);
}

Figure 1.10 A C program that emulates the DFA shown in Figure 1.9
∇ Definition 1.15. Given a DFA A = <Σ, S, s0, δ, F>, the language accepted by
A, denoted L(A), is defined to be
L(A) = {w ∈ Σ* | δ̄(s0, w) ∈ F}
L(A), the language accepted by a finite automaton A, is the set of all words w from
Σ* for which δ̄(s0, w) ∈ F. In order for a word w to be contained in L(B), the path
through the finite automaton B, as determined by the letters in w, must lead from
the start state to one of the final states.
For deterministic finite automata, the path for a given word w is unique: there
is only one path since, at any given state in the automaton, there is exactly one
transition for each a ∈ Σ. This is not necessarily the case for another variety of finite
automaton, the nondeterministic finite automaton, as will be seen in Chapter 4.
The set of all words over {0, 1} that contain an odd number of 1s is finite automaton
definable, as evidenced by the automaton in Example 1.7, which accepts exactly this
set of words.
1.3 EXAMPLES OF FINITE AUTOMATA

This section illustrates the definitions of the quintuples and the state transition
diagrams for some nontrivial automata. The following example and Example 1.11
deal with the recognition of tokens, an important issue in the construction of
compilers.
EXAMPLE 1.9
The set of FORTRAN identifiers is a finite automaton definable language. This
statement can be proved by verifying that the following machine accepts the set of
all valid FORTRAN 66 identifiers. These identifiers, which represent variable,
subroutine, and array names, can contain from 1 to 6 (nonblank) characters and
must begin with a letter.
δ  | a  b  c ... y  z | 0  1 ... 8  9 | (blank) | Γ
s0 | s1 s1 s1 ... s1 s1 | s7 s7 ... s7 s7 | s0 | s7
s1 | s2 s2 s2 ... s2 s2 | s2 s2 ... s2 s2 | s1 | s7
s2 | s3 s3 s3 ... s3 s3 | s3 s3 ... s3 s3 | s2 | s7
s3 | s4 s4 s4 ... s4 s4 | s4 s4 ... s4 s4 | s3 | s7
s4 | s5 s5 s5 ... s5 s5 | s5 s5 ... s5 s5 | s4 | s7
s5 | s6 s6 s6 ... s6 s6 | s6 s6 ... s6 s6 | s5 | s7
s6 | s7 s7 s7 ... s7 s7 | s7 s7 ... s7 s7 | s6 | s7
s7 | s7 s7 s7 ... s7 s7 | s7 s7 ... s7 s7 | s7 | s7
The entries under the column labeled Γ show the transitions taken for each member
of the set Γ; the column labeled ⊔ shows the transitions taken on a blank. The state
transition diagram of the machine corresponding to this quintuple is displayed in
Figure 1.11. Note that, while each of the 26 letters causes a transition from s₀ to s₁,
a single arrow labeled a-z is sufficient to denote all these transitions. Similarly, the
transition labeled Σ from s₇ indicates that every element of the alphabet follows the
same path.
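A table-driven program in the style of Figure 1.10 emulates this machine directly. The sketch below is ours: it collapses the 26 letter columns and 10 digit columns into character classes (just as the a-z arrow does), skips blanks (on which the table simply loops), and takes s₁ through s₆ as the final states, since those are the states reached by identifiers of length 1 to 6.

#include <stdio.h>
#include <ctype.h>

enum state { s0, s1, s2, s3, s4, s5, s6, s7 };
enum class { LETTER, DIGIT, OTHER };        /* column groups of the table */

/* next[state][class]: rows follow the transition table above */
static const enum state next[8][3] = {
    /* s0 */ { s1, s7, s7 },
    /* s1 */ { s2, s2, s7 },
    /* s2 */ { s3, s3, s7 },
    /* s3 */ { s4, s4, s7 },
    /* s4 */ { s5, s5, s7 },
    /* s5 */ { s6, s6, s7 },
    /* s6 */ { s7, s7, s7 },
    /* s7 */ { s7, s7, s7 }
};

int main(void)
{
    enum state t = s0;
    int c;

    while ((c = getchar()) != '\n' && c != EOF) {
        if (c == ' ')
            continue;                        /* the table loops on blanks */
        t = next[t][isalpha(c) ? LETTER : isdigit(c) ? DIGIT : OTHER];
    }
    printf("%s\n", (t >= s1 && t <= s6) ? "Accepted" : "Rejected");
    return 0;
}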
EXAMPLE 1.10
The DFA M shown in Figure 1.12 accepts only those strings that have an even
number of bs and an even number of as. Thus,

L(M) = {x ∈ {a, b}* | |x|_a ≡ 0 mod 2 ∧ |x|_b ≡ 0 mod 2}

The corresponding quintuple for M = <Σ, S, s₀, δ, F> has the following components:

Σ = {a, b}
S = {<0,0>, <0,1>, <1,0>, <1,1>}
s₀ = <0,0>

δ        a       b
<0,0>   <1,0>   <0,1>
<0,1>   <1,1>   <0,0>
<1,0>   <0,0>   <1,1>
<1,1>   <0,1>   <1,0>

F = {<0,0>}
EXAMPLE 1.11
Consider the set of all real number constants in the modified scientific notation
format described by the BNF in Table 1.1.
TABLE 1.1
<sign> ::= + | -
<digit> ::= 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9
<natural> ::= <digit> | <digit><natural>
<integer> ::= <natural> | <sign><natural>
<real constant> ::= <integer> |
                    <integer>. |
                    <integer>.<natural> |
                    <integer>.<natural>E<integer>
This set of productions defines real number constants like +192.; other constants
derivable from the grammar include

1
3.1415
2.718281828
27.
42.42
1.0E-32
The set of all real number constants that can be derived from the productions given
in Table 1.1 is a FAD language. Let R be the deterministic finite automaton defined
below. The corresponding state transition diagram is given in Figure 1.13.
Σ = {0, 1, 2, 3, 4, 5, 6, 7, 8, 9, +, -, E, .}
S = {s₀, s₁, s₂, s₃, s₄, s₅, s₆, s₇, s₈}
the start state is s₀

δ    0   1   2   3   4   5   6   7   8   9   +   -   E   .
s₀   s₂  s₂  s₂  s₂  s₂  s₂  s₂  s₂  s₂  s₂  s₁  s₁  s₇  s₇
s₁   s₂  s₂  s₂  s₂  s₂  s₂  s₂  s₂  s₂  s₂  s₇  s₇  s₇  s₇
s₂   s₂  s₂  s₂  s₂  s₂  s₂  s₂  s₂  s₂  s₂  s₇  s₇  s₇  s₃
s₃   s₈  s₈  s₈  s₈  s₈  s₈  s₈  s₈  s₈  s₈  s₇  s₇  s₇  s₇
s₄   s₅  s₅  s₅  s₅  s₅  s₅  s₅  s₅  s₅  s₅  s₆  s₆  s₇  s₇
s₅   s₅  s₅  s₅  s₅  s₅  s₅  s₅  s₅  s₅  s₅  s₇  s₇  s₇  s₇
s₆   s₅  s₅  s₅  s₅  s₅  s₅  s₅  s₅  s₅  s₅  s₇  s₇  s₇  s₇
s₇   s₇  s₇  s₇  s₇  s₇  s₇  s₇  s₇  s₇  s₇  s₇  s₇  s₇  s₇
s₈   s₈  s₈  s₈  s₈  s₈  s₈  s₈  s₈  s₈  s₈  s₇  s₇  s₄  s₇

F = {s₂, s₃, s₅, s₈}
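The same table-driven technique emulates R. The following C sketch is ours; the character-classification helper is an assumption, and acceptance tests for the final states F = {s₂, s₃, s₅, s₈} listed above.

#include <stdio.h>
#include <ctype.h>

enum state { S0, S1, S2, S3, S4, S5, S6, S7, S8 };
enum class { DIGIT, SIGN, EXP, DOT, BAD };        /* column groups */

static const enum state next[9][4] = {
    /*        digit  +/-  E    .  */
    /* s0 */ { S2,   S1,  S7,  S7 },
    /* s1 */ { S2,   S7,  S7,  S7 },
    /* s2 */ { S2,   S7,  S7,  S3 },
    /* s3 */ { S8,   S7,  S7,  S7 },
    /* s4 */ { S5,   S6,  S7,  S7 },
    /* s5 */ { S5,   S7,  S7,  S7 },
    /* s6 */ { S5,   S7,  S7,  S7 },
    /* s7 */ { S7,   S7,  S7,  S7 },
    /* s8 */ { S8,   S7,  S4,  S7 }
};

static enum class classify(int c)
{
    if (isdigit(c))           return DIGIT;
    if (c == '+' || c == '-') return SIGN;
    if (c == 'E')             return EXP;
    if (c == '.')             return DOT;
    return BAD;
}

int main(void)
{
    enum state t = S0;
    int c;

    while ((c = getchar()) != '\n' && c != EOF) {
        enum class k = classify(c);
        t = (k == BAD) ? S7 : next[t][k];          /* S7 is the dead state */
    }
    printf("%s\n", (t == S2 || t == S3 || t == S5 || t == S8)
                   ? "Accepted" : "Rejected");
    return 0;
}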
Figure 1.13 The state transition diagram for the DFA R

1.4 CIRCUIT IMPLEMENTATION OF FINITE AUTOMATA
NOT gate      AND gate        OR gate         NAND gate       NOR gate

 p  ¬p        p  q  p∧q       p  q  p∨q       p  q  p|q       p  q  p↓q
 1   0        1  1   1        1  1   1        1  1   0        1  1   0
 0   1        1  0   0        1  0   1        1  0   1        1  0   0
              0  1   0        0  1   1        0  1   1        0  1   0
              0  0   0        0  0   0        0  0   1        0  0   1
Figure 1.14 Common logic gates and their truth tables
Figure 1.15 A clock signal: a square wave alternating between 0 and +5 volts over time
In an asynchronous application like the vending machine, the steady clock
would be replaced by a device that pulsed whenever a new input (such as the
insertion of a coin) was detected.
We need to retain the present status of the network (current state, letter, and
so forth) as we move on to the next input symbol. This is achieved through the use of
a D flip-flop (D stands for data or delay), which uses NAND gates and the clock
signal to store the current value of, say, p', between clock pulses. The symbol for a
D flip-flop (sometimes called a latch) is shown in Figure 1.16, along with the actual
gates that comprise the circuit.
The outputs, p and ¬p, will reflect the value of the input signal p' only after
the high clock pulse is received and will retain that value after the clock drops to low
(even if p' subsequently changes) until the next clock pulse comes along, at which
time the output will reflect the new current value of p'. This is best illustrated by
referring to the NAND truth table and tracing the changes in the circuit. Begin with
clock = p = p' = 0 and ¬p = 1, and verify that the circuit is stable. Now assume
that p' changes to 1, and note that, although some internal values may change, p
and ¬p remain at 0 and 1, respectively; the old value of p' has been "remembered"
by the D flip-flop. Contrast this with the behavior when we strobe the clock: assume
that the clock now also changes to 1, so that we now have clock = p' = ¬p = 1 and
p = 0. When the signal propagates through the network, we find that p and ¬p
have changed to reflect the new value of p': clock = p = p' = 1, and ¬p = 0.
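That trace can be double-checked in software. The C sketch below (ours) simulates one common four-NAND arrangement of a gated D latch — the circuit of Figure 1.16b may differ in gate-level detail — re-evaluating the gates until the network settles.

#include <stdio.h>

static int nand(int x, int y) { return !(x && y); }

/* One gated D latch built from four NAND gates.  q and qbar hold the
   stored outputs p and not-p; a few passes let the network settle. */
static void latch(int d, int clock, int *q, int *qbar)
{
    for (int i = 0; i < 4; i++) {
        int a = nand(d, clock);
        int b = nand(a, clock);
        *q    = nand(a, *qbar);
        *qbar = nand(b, *q);
    }
}

int main(void)
{
    int q = 0, qbar = 1;

    latch(0, 0, &q, &qbar);   /* stable: clock = p' = 0              */
    latch(1, 0, &q, &qbar);   /* p' rises, clock low: p is unchanged */
    printf("clock low,  p'=1: p=%d, not-p=%d\n", q, qbar);
    latch(1, 1, &q, &qbar);   /* strobe the clock: p now follows p'  */
    printf("clock high, p'=1: p=%d, not-p=%d\n", q, qbar);
    return 0;
}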
We will also have to represent the letters of our input alphabet by high and low
voltages (that is, combinations of 0s and 1s). The ASCII alphabet, for example, is
quite naturally represented by 8 bits, a₁a₂a₃a₄a₅a₆a₇a₈, where B, for example, has the
bit pattern 01000010 (66 in binary). One of these bit patterns should be reserved for
Figure 1.16 (a) A data flip-flop, or latch (b) The circuitry for a D flip-flop
indicating the end of our input string, <EOS>. Our convention will be to reserve
binary zero for this role, which means our ASCII end-of-string symbol would be
00000000 (or NULL). In actual applications using the ASCII alphabet, however, a
more appropriate choice for <EOS> might be 00001101 (a carriage return) or
00001010 (a line feed) or 00100000 (a space).
Our alphabets are likely to be far smaller than the ASCII character set, and we
will hence need fewer than 8 bits of information to encode our letters. For example,
if Σ = {b, c}, 2 bits, a₁ and a₂, will suffice. Our choice of encoding could be
00 = <EOS>, 01 = b, 10 = c, and 11 is unused.
EXAMPLE 1.12
When building a logical circuit from the definition of a DFA, we will find it con-
venient to treat <EOS> as an input symbol, and define the state transition function
for it by (∀s ∈ S)(δ(s, <EOS>) = s). Thus, the DFA in Figure 1.17a should be
thought of as shown in Figure 1.17b. As we have only two states, a single state bit
will suffice, representing s₀ by t₁ = 0 and s₁ by t₁ = 1. Since Σ = {b, c}, we will again
use 2 bits, a₁ and a₂, to represent the input symbols. As before, 00 = <EOS>,
01 = b, 10 = c, and 11 is unused.
Figure 1.17 (a) The DFA discussed in Example 1.12 (b) The expanded state
transition diagram for the DFA implemented in Figure 1.18
Determining the state transition function will require knowledge of the cur-
rent state (represented by the status of t₁) and the current input symbol (repre-
sented by the pair of bits a₁ and a₂). These three input values will allow the next state
t₁' to be calculated. From the δ function, we know that

δ(s₀, b) = s₀
δ(s₀, c) = s₁
δ(s₁, b) = s₀
δ(s₁, c) = s₀

These specifications correspond to the following four rows of the truth table for t₁':
t₁  a₁  a₂  t₁'                        t₁  a₁ a₂   t₁'
0   0   1   0                          s₀  01 = b  s₀
0   1   0   1    which represents      s₀  10 = c  s₁
1   0   1   0                          s₁  01 = b  s₀
1   1   0   0                          s₁  10 = c  s₀
Adding the state transitions for <EOS> and using * to represent the outcome for
the two rows corresponding to the unused combination a₁a₂ = 11 fills out the eight
rows of the complete truth table, as shown in Table 1.2.
TABLE 1.2
t₁  a₁  a₂  t₁'
0 0 0 0
0 0 1 0
0 1 0 1
0 1 1 *
1 0 0 1
1 0 1 0
1 1 0 0
1 1 1 *
If we arbitrarily assume that the two don't-care combinations (*) are zero, the
principal disjunctive normal form of t₁' contains just two terms: (¬t₁∧a₁∧¬a₂) ∨
(t₁∧¬a₁∧¬a₂). It is profitable to reassign the don't-care value in the fourth row to
1, since the expression can then be shortened to (¬t₁∧a₁) ∨ (t₁∧¬a₁∧¬a₂) by
applying standard techniques for minimizing Boolean functions. Incorporating this
into a feedback loop with a D flip-flop provides the heart of the digital logic circuit
representing the DFA, as shown in Figure 1.18.
Figure 1.18 The circuitry implementing the DFA discussed in Example 1.12
The accept portion of the circuitry ensures that we do not indicate acceptance
when passing through the final state; it is only activated when we are in a final state
while scanning the <EOS> symbol. Similarly, the reject circuitry can only be
activated when the <EOS> symbol is encountered. When there are several final
states, this part of the circuitry becomes correspondingly more complex. It is in-
structive to follow the effect a string such as bcc has on the above circuit. Define
aᵢ(j) as the jth value the bit aᵢ takes on as the string bcc is processed; that is, aᵢ(j) is
the value of aᵢ during the jth clock pulse. We then have

a₁(1) = 0    a₂(1) = 1    ⇒ b
a₁(2) = 1    a₂(2) = 0    ⇒ c
a₁(3) = 1    a₂(3) = 0    ⇒ c
a₁(4) = 0    a₂(4) = 0    ⇒ <EOS>

Trace the circuit through four clock pulses (starting with t₁ = 0), and observe the
current values that t₁ assumes, noting that it corresponds to the appropriate state of
the machine as each input symbol is scanned.
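The following C sketch (ours) performs exactly this trace, evaluating the minimized next-state expression t₁' = (¬t₁∧a₁) ∨ (t₁∧¬a₁∧¬a₂) once per clock pulse as the encoded string bcc<EOS> is fed in.

#include <stdio.h>

int main(void)
{
    /* bcc<EOS> encoded as (a1, a2) pairs: b = 01, c = 10, <EOS> = 00 */
    int a1[] = { 0, 1, 1, 0 };
    int a2[] = { 1, 0, 0, 0 };
    int t1 = 0;                                   /* start in s0 */

    for (int j = 0; j < 4; j++) {
        /* minimized next-state function from the truth table */
        t1 = (!t1 && a1[j]) || (t1 && !a1[j] && !a2[j]);
        printf("after pulse %d: t1 = %d (state s%d)\n", j + 1, t1, t1);
    }
    return 0;
}

Its output shows t₁ taking the values 0, 1, 0, 0, matching δ(s₀, b) = s₀, δ(s₀, c) = s₁, δ(s₁, c) = s₀, and the <EOS> convention that leaves the state unchanged.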
Note that a six-state machine would require more and substantially larger
truth tables. Since a state encoding would now need to specify t₁, t₂, and t₃, three
different truth tables (for t₁', t₂', and t₃') must be constructed to predict the next state
transition. More significantly, the input variables would include t₁, t₂, t₃, a₁, and a₂,
making each table 32 rows long. Three D flip-flop feedback loops would be neces-
sary to store the three values t₁, t₂, and t₃.
Also, physical logic circuits of this type have the disconcerting habit of initial-
izing to some random configuration the first time power is applied to the network. A
true working model would thus need a reset circuit to initialize each tᵢ to 0 in order to
ensure that the machine started in state s₀. Slightly more complex set-reset flip-flops
can be used to provide a hardware solution to this problem. However, a simple
algorithmic solution would require the input tape to have a leading start-of-string
symbol <SOS>. The definition of the state transition function should be expanded
so that scanning the <SOS> symbol from any state will automatically transfer
control to s₀. We will adopt the convention that <SOS> will be represented by the
highest binary code; in ASCII, for example, this would be 11111111, while in the
preceding example it would be 11. To promote uniformity in the exercises, it is
suggested that <SOS> should always be given the highest binary code and <EOS>
be represented by binary zero; as in the examples given here, the symbols in Σ
should be numbered sequentially according to their natural alphabetical order. In a
similar fashion, numbered states should be given their corresponding binary codes.
The reader should note, however, that other encodings might result in less complex
circuitry.
EXAMPLE 1.13
As a more complex example of automaton circuitry, consider the DFA displayed in
Figure 1.19. Two flip-flops, t₁ and t₂, will be necessary to represent the three states.
TABLE 1.3

t₁  t₂  a₁  a₂   t₁'  t₂'  accept
0   0   0   0    0    0    0
0   0   0   1    0    1    0
0   0   1   0    1    0    0
0   0   1   1    0    0    0
0   1   0   0    0    1    1
0   1   0   1    1    0    0
0   1   1   0    0    0    0
0   1   1   1    0    0    0
1   0   0   0    1    0    1
1   0   0   1    0    0    0
1   0   1   0    0    1    0
1   0   1   1    0    0    0
1   1   0   0    *    *    *
1   1   0   1    *    *    *
1   1   1   0    *    *    *
1   1   1   1    *    *    *
Figure 1.21 The circuitry implementing the DFA discussed in Example 1.13
1.5 APPLICATIONS OF FINITE AUTOMATA

In this chapter we have described the simplest form of finite automaton, the DFA.
Other forms of automata, such as nondeterministic finite automata, pushdown
automata, and Turing machines, are introduced later in the text. We close this
chapter with three examples to motivate the material in the succeeding chapters.
When presenting automata in this chapter, we made no effort to construct the
minimal machine. A minimal machine for a given language is one that has the least
number of states required to accept that language.
EXAMPLE 1.14
In Example 1.5, the vending machine kept track of the amount of change that had
been deposited, up to 50¢. Since the candy bars cost only 30¢, there is no need to
count up to 50¢. In this sense, the machine is not optimal, since a less complex
machine can perform the same task, as shown in Figure 1.22. The corresponding
quintuple is <{n, d, q}, {s₀, s₅, s₁₀, s₁₅, s₂₀, s₂₅, s₃₀}, s₀, δ, {s₃₀}>, where for each state sᵢ,
δ is defined by

δ(sᵢ, n) = s_min{30, i+5}
δ(sᵢ, d) = s_min{30, i+10}
δ(sᵢ, q) = s_min{30, i+25}

Note that the higher-numbered states in Example 1.5 were all effectively
"remembering" the same thing: that enough coins had been deposited. These final
states have been coalesced into a single final state to produce the more efficient
machine in Figure 1.22. In the next two chapters, we develop the theoretical back-
ground and algorithms necessary to construct from an arbitrary DFA the minimal
machine that accepts the same language.
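Rendered in C, the optimized machine needs nothing more than a capped counter. The sketch below is ours; the coin symbols n, d, and q follow the quintuple above.

#include <stdio.h>

/* State s_i records i cents deposited, capped at 30 (the final state). */
static int delta(int i, char coin)
{
    int value = (coin == 'n') ? 5 : (coin == 'd') ? 10 : 25;
    int next = i + value;
    return (next > 30) ? 30 : next;      /* s_min{30, i + value} */
}

int main(void)
{
    int state = 0;
    const char *input = "ndq";           /* nickel, dime, quarter */

    for (const char *p = input; *p != '\0'; p++)
        state = delta(state, *p);
    printf("%s\n", state == 30 ? "Accepted" : "Rejected");
    return 0;
}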
As another illustration of the utility of concepts relating to finite-state ma-
chines, we will consider the formalism used by many text editors to search for a
particular target string pattern in a text file. To find ababb in a file, for example, a
naive approach might consist of checking whether the first five characters of the file
fit this pattern, and next checking characters 2 through 6 to find a match, and so on.
This results in examining file characters more than once; it ought to be possible to
remember past values, and avoid such duplication. Consider the text string
aabababbb. By the time the fifth character is scanned, we have matched the first
four characters of ababb. Unfortunately, a, the sixth character of aabababbb, does
not produce the final match; however, since characters 4, 5, and 6 (aba) now match
the first three characters of the target string, it does allow for the possibility of
characters 4 through 8 matching (as is indeed the case in this example). This leads to
a general rule: If we have matched the first four letters of the target string, and the
next character happens to be a (rather than the desired b), we must remember
that we have now matched the first three letters of the target string.
"Rules" such as these are actually the state transitions in the DFA given in the
next example. State Si represents having matched the first i characters of the target
string, and the rule developed above is succinctly stated as 8(S4' a) = S3'
EXAMPLE 1.15
A DFA that accepts all strings that contain ababb as a substring is displayed in
Figure 1.23. The corresponding quintuple is

<{a, b}, {s₀, s₁, s₂, s₃, s₄, s₅}, s₀, δ, {s₅}>

where δ is defined by

δ    a    b
s₀   s₁   s₀
s₁   s₁   s₂
s₂   s₃   s₀
s₃   s₁   s₄
s₄   s₃   s₅
s₅   s₅   s₅
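Driving this table with the input loop of Figure 1.10 yields a complete pattern recognizer; the following C sketch is ours.

#include <stdio.h>

/* delta for the ababb-matcher: rows are states s0..s5, columns a, b */
static const int delta[6][2] = {
    { 1, 0 },   /* s0 */
    { 1, 2 },   /* s1 */
    { 3, 0 },   /* s2 */
    { 1, 4 },   /* s3 */
    { 3, 5 },   /* s4 */
    { 5, 5 }    /* s5: accepting trap state */
};

int main(void)
{
    int t = 0, c;

    while ((c = getchar()) != '\n' && c != EOF)
        t = delta[t][c - 'a'];          /* 'a' -> column 0, 'b' -> column 1 */
    printf("%s\n", t == 5 ? "Accepted" : "Rejected");
    return 0;
}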
A similar machine, shown in Figure 1.24, accepts strings containing either ababb or
abbbb; its transition table differs only in the row for s₂:

δ    a    b
s₀   s₁   s₀
s₁   s₁   s₂
s₂   s₃   s₃
s₃   s₁   s₄
s₄   s₃   s₅
s₅   s₅   s₅

Figure 1.24 A DFA that accepts strings that contain either ababb or abbbb
In this case, we required one letter between the initial part of the search string (ab)
and the terminal part (bb). It is possible to modify the machine to accept strings that
contain ab, followed by any number of letters, followed by bb. This type of machine
would be useful for identifying comments in many programming languages. For
example, a Pascal comment is essentially of the form (*, followed by most combi-
nations of letters, followed by the first occurrence of *).
It should be noted that the machine in Example 1.15 is highly specialized and
tailored for the specific string ababb; other target strings would require completely
different recognizers. While it appears to require much thought to generate the
appropriate DFA for a given string, we will see how the tools presented in Chapter 4
can be used to automate the entire process.
Example 1.15 indicates how automata can be used to guide the construction of
software for matching designated patterns. Finite-state machines are also useful in
designing hardware that detects designated sequences. Example 4.7 will explore a
communications application, and the following discussion illustrates how these
concepts can be applied to help evaluate the performance of computers.
A computer program is essentially a linear list of machine instructions, stored
in consecutive memory locations. Each memory location holds a sequence of bits
that can be thought of as words composed of 0s and 1s. Different types of
instructions are represented by different patterns of bits. The CPU sequentially
fetches these instructions and chooses its next action by examining the incoming bit
pattern to determine the type of instruction that should be executed. The sequences
of bits that encode the instruction type are called opcodes.
Various performance advantages can be attained when one part of the CPU
prefetches the next instruction while another part executes the current instruction.
However, computers must have the capability of altering the order in which instruc-
tions are executed; branch instructions allow the CPU to avoid the anticipated next
instruction and instead begin executing the instructions stored in some other area of
memory. When a branch occurs, the prefetched instruction will generally need to be
replaced by the proper instruction from the new area of memory. The consequent
delay can degrade the speed with which instructions are executed.
Irrespective of prefetching problems, it should be clear that a branch instruc-
tion followed immediately by another branch instruction is inefficient. If a CPU is
found to be regularly executing two or more consecutive branch instructions, it may
be worthwhile to consider replacing such series of branches with a single branch to
the ultimate destination [FERR]. Such information would be determined by moni-
toring the instruction stream and searching for patterns that represented consecu-
tive branch opcodes. This activity is essentially the pattern recognition problem
discussed in Example 1.15.
It is unwise to try to collect the data representing the contents of the instruc-
tion stream on secondary storage so that it can be analyzed later. The volume of
information and the speed with which it is generated preclude the collection of a
sufficiently large set of data points. Instead, the preferred solution uses a specially
tailored piece of hardware to monitor the contents of the CPU opcode register and
increment a hardware counter each time the appropriate patterns are detected. The
heart of this monitor can be built by transforming the appropriate automaton into
the corresponding logic circuitry, as outlined in Section 1.4. Unlike the automaton
in Example 1.15, the automaton model for this application would allow transitions
out of the final state, so that it may continue to search for successive patterns. The
resulting logic circuitry would accept as input the bit patterns currently present in
the opcode register, and send a pulse to the counter mechanism each time the
accept circuitry was energized.
Note that in this case we would not want to inhibit the accept circuitry by
requiring an <EOS> symbol to be scanned. Indeed, we want the light on our
conceptual black box to flicker as we process the data, since we are intent on
counting the number of times it flickers during the course of our monitoring.
EXAMPLE 1.16
We close this chapter with an illustration of the manner in which computational
algorithms can profitably use the automaton abstraction. Network communications
between independent processors are governed by a protocol that implements a
finite state control [TANE]. The Kermit protocol, developed at Columbia Univer-
sity, is widely employed to communicate between processors and is still most often
used for its original purpose: to transfer files between micros and mainframes
[DACR]. During a file transfer, the send portion of Kermit on the source host is
responsible for delivering data to the receive portion of the Kermit process on the
destination host. The receive portion of Kermit reacts to incoming data in much the
same way as the machines presented in this chapter. The receive program starts in a
state of waiting for a transfer request (in the form of an initialization packet) to
signal the commencement of a file transfer (state R in Figure 1.25). When such a
packet is received, Kermit transitions to the RF state, where it awaits a file-header
packet (which specifies the name of the file about to be transferred). Upon receipt
of the file-header packet, it enters the RD state, where it processes a succession of
data packets (which comprise the body of the file being transferred). An EOF
packet should arrive after all the data are sent, which can then be followed by
another file-header packet (if there is a sequence of files to be transferred) or by a
break packet (if the transfer is complete). In the latter case, Kermit reverts to the
start state R and awaits the next transfer request. The send portion of the Kermit
process on the source host follows the behavior of a slightly more complex automa-
ton. The state transition diagram given in Figure 1.25 succinctly describes the logic
of the receive portion of the Kermit protocol; for simplicity, timeouts and error
conditions are not reflected in the diagram. The input alphabet is {B, D, Z, H, S},
where B represents a break, D is a data packet, Z is EOF, H is a file-header packet,
and S is a send-intention packet. The state set is {A, R, RF, RD}, where A denotes
the abort state, R signifies receive, RF is receive file-header, and RD is receive
data. Note that unexpected packets (such as a data packet received in the start state
R or a break packet received when data packets are expected in state RD) cause a
transition to the abort state A.
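The receive logic can itself be coded as a transition table. The C sketch below is ours and reflects one plausible reading of Figure 1.25: the Z transition out of RD and the B transition out of RF are inferred from the narrative above, and, as stated, all unexpected packets lead to the abort state A.

#include <stdio.h>

enum state  { R, RF, RD, A };            /* A is the abort state           */
enum packet { B, D, Z, H, S, NPACKETS }; /* break, data, EOF, header, send */

/* delta[state][packet]; unexpected packets lead to the abort state A */
static const enum state delta[4][NPACKETS] = {
    /* R  */ { A,  A,  A,  A,  RF },     /* only a send-intention is legal */
    /* RF */ { R,  A,  A,  RD, A  },     /* a break ends the transfer      */
    /* RD */ { A,  RD, RF, A,  A  },     /* EOF returns to await a header  */
    /* A  */ { A,  A,  A,  A,  A  }
};

int main(void)
{
    /* a two-file transfer: S H D D Z H D Z B */
    enum packet session[] = { S, H, D, D, Z, H, D, Z, B };
    enum state t = R;

    for (int i = 0; i < 9; i++)
        t = delta[t][session[i]];
    printf("final state: %s\n", t == R ? "R (transfer complete)" : "not R");
    return 0;
}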
In actuality, the receive protocol does more than just observe the incoming
packets; Kermit sends an acknowledgment (ACK or NAK) of each packet back to
the source host. Receipt of the file header should also cause an appropriate file to be
created and opened, and each succeeding data packet should be verified and its
contents placed sequentially in the new file. A machine model that incorporates
actions in response to input is the subject of Chapter 7, where automata with output
are explored.
EXERCISES
1.1. Recall how we defined δ̄ in this chapter:

(∀s ∈ S)(∀a ∈ Σ) δ̄(s, a) = δ(s, a)
(∀s ∈ S) δ̄(s, λ) = s
(∀s ∈ S)(∀x ∈ Σ*)(∀a ∈ Σ) δ̄(s, ax) = δ̄(δ(s, a), x)

δ̄, here denoted δ̄ₜ, was tail recursive. Tail recursion means that all recursion takes
place at the end of the string. Let us now define an alternative extended transition
function, δ̄ₕ, thusly:

(∀s ∈ S)(∀a ∈ Σ) δ̄ₕ(s, a) = δ(s, a)
(∀s ∈ S) δ̄ₕ(s, λ) = s
(∀s ∈ S)(∀a ∈ Σ)(∀x ∈ Σ*) δ̄ₕ(s, xa) = δ(δ̄ₕ(s, x), a)

It is clear from the definition of δ̄ₕ that all the recursion takes place at the head of the
string. For this reason, δ̄ₕ is called head recursive. Show that the two definitions result
in the same extension of δ, that is, prove by mathematical induction that

(∀s ∈ S)(∀x ∈ Σ*)(δ̄ₜ(s, x) = δ̄ₕ(s, x))
1.2. Consider Example 1.14. The vending machine accepts coins as input, but if you change
your mind (or find you do not have enough change), it will not refund your money.
Modify this example to have another input, <coin-return>, which is represented by r
and which will conceptually return all your coins.
1.3. (a) Specify the quintuple corresponding to the DFA displayed in Figure 1.26.
(b) Describe the language defined by the DFA displayed in Figure 1.26.
Figure 1.26 The automaton discussed in Exercise 1.3
1.4. Construct a state transition diagram and enumerate all five parts of a deterministic
finite automaton A = <{a, b, c}, S, s₀, δ, F> such that
L(A) = {x | |x| is a multiple of 2 or 3}.
1.5. Let Σ = {0, 1}. Construct deterministic finite automata that will accept each of the
following languages, if possible.
(a) L₁ = {x | |x| mod 7 = 4}
(b) L₂ = Σ* − {w | ∃n ∋ w = a₁ ... aₙ ∧ aₙ = 1}
(c) L₃ = {y | |y|₀ = |y|₁}
1.6. Let Σ = {a, b}.
(a) Construct deterministic finite automata A₁, A₂, A₃, and A₄ such that:
i. L(A₁) = {x | (|x|_a is odd) ∧ (|x|_b is even)}
ii. L(A₂) = {y | (|y|_a is even) ∨ (|y|_b is odd)}
iii. L(A₃) = {z | (|z|_a is even) ⊕ (|z|_b is even)} (⊕ represents exclusive-or)
iv. L(A₄) = {z | |z|_a is even}
(b) How does the structure of each of these machines relate to the one defined in
Example 1.10?
1.7. Modify the machine M defined in Example 1.10 so that the language accepted by the
machine consists of strings x ∈ {a, b}*, where both |x|_a and |x|_b are even and |x| > 0;
that is, the new machine should accept L(M) − {λ}.
1.8. Let M = <Σ, S, s₀, δ, F> be an (arbitrary) DFA that accepts the language L(M). Write
down a general procedure for modifying this machine so that it will accept L(M) − {λ}.
(Specify the five parts of the new machine and justify your statements.) It may be
helpful to do this for a specific machine (as in Exercise 1.7) before attempting the
general case.
1.9. Let M = <Σ, S, s₀, δ, F> be an (arbitrary) DFA that accepts the language L(M). Write
down a general procedure for modifying this machine so that it will accept L(M) ∪ {λ}.
(Specify the five parts of the new machine and justify your statements.)
1.10. Let Σ = {a, b, d} and Ψ = {x ∈ Σ* | (x begins with d) ∨ (x contains two consecutive bs)}.
(a) Draw a machine that will accept Ψ.
(b) Formally specify the five parts of the DFA from part (a).
1.11. Let Σ = {a, b, c} and Φ = {x ∈ Σ* | every b in x is immediately followed by c}.
(a) Draw a machine that will accept Φ.
(b) Formally specify the five parts of the DFA from part (a).
1.12. Let Σ = {0, 1, 2, 3, 4, 5, 6, 7, 8, 9}. Consider the base 10 numbers formed by strings from
Σ*: 14 represents fourteen, the three-digit string 205 represents two hundred and five,
and so on. Let Ω = {x ∈ Σ* | the number represented by x is evenly divisible by 7} =
{λ, 0, 00, 000, ..., 7, 07, 007, ..., 14, 21, 28, 35, ...}.
(a) Draw a machine that will accept Ω.
(b) Formally specify the five parts of the DFA from part (a).
1.13. Let Σ = {0, 1, 2, 3, 4, 5, 6, 7, 8, 9}. Let Γ = {x ∈ Σ* | the number represented by x is
evenly divisible by 3}.
(a) Draw a three-state machine that will accept Γ.
(b) Formally specify the five parts of the DFA from part (a).
1.14. Let Σ = {0, 1, 2, 3, 4, 5, 6, 7, 8, 9}. Let K = {x ∈ Σ* | the number represented by x is
evenly divisible by 5}.
(a) Draw a five-state DFA that accepts K.
(b) Formally specify the five parts of the DFA from part (a).
(c) Draw a two-state DFA that accepts K.
(d) Formally specify the five parts of the DFA from part (c).
1.15. Let Σ = {0, 1, 2, 3, 4, 5, 6, 7, 8, 9}. Draw a DFA that accepts the first eight primes.
1.16. (a) Find all ten combinations of u, v, and w such that uvw = cab (one such combina-
tion is u = c, v = λ, w = ab).
(b) In general, if x is of length n, and uvw = x, how many distinct combinations of u, v,
and w will satisfy this constraint?
1.17. Let Σ = {a, b} and E = {x ∈ Σ* | x contains (at least) two consecutive bs ∧ x does not
contain two consecutive as}. Draw a machine that will accept E.
1.18. The FORTRAN identifier in Example 1.9 recognized all alphabetic words, including
those like DO, DATA, END, and STOP, which have different uses in FORTRAN.
Modify Figure 1.11 to produce a DFA that will also reject the words DO and DATA
while still accepting all other valid FORTRAN identifiers.
1.19. Consider the machine defined in Example 1.11. This machine accepts most real-
number constants in scientific notation. However, this machine does have some
(possibly desirable) limitations. These limitations include requiring that a 0 precede
the decimal point when specifying a number with a mantissa less than 1.
(a) Modify Figure 1.13 so that it will accept the set of real-number constants described
by the following BNF.
<sign> ::= + | -
<digit> ::= 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9
<natural> ::= <digit> | <digit><natural>
<integer> ::= <natural> | <sign><natural>
<real constant> ::= <integer> |
                    <integer>. |
                    .<natural> |
                    <sign>.<natural> |
                    .<natural>E<integer> |
                    <sign>.<natural>E<integer> |
                    <integer>.<natural> |
                    <integer>.<natural>E<integer>
(b) Write a program in your favorite programming language to implement the automa-
ton derived in part (a). The program should read a line of text and state whether or
not the word on that line was accepted.
1.20. Show that part (i) of Definition 1.11 is implied by parts (ii) and (iii) of that definition.
1.21. Develop a more succinct description of the transition function given in Example 1.9
(compare with the description in Example 1.10).
1.22. Let the universal set be {a, b}*. Give an example of
(a) A finite set.
(b) A cofinite set.
(c) A set that is neither finite nor cofinite.
1.23. Consider the DFA given in Figure 1.27.
(a) Specify the quintuple for this machine.
(b) Describe the language defined by this machine.
1.24. Consider the set consisting of the names of everyone in China. Is this set a FAD
language?
1.25. Consider the set of all legal infix arithmetic expressions over the alphabet
{A, B, +, -, *, /} without parentheses (assume normal precedence rules apply). Is this
set a FAD language? If so, draw the machine.
1.26. Consider an arbitrary deterministic finite automaton M.
(a) What aspect of the machine determines whether λ ∈ L(M)?
(b) Specify a condition that would guarantee that L(M) = Σ*.
(c) Specify a condition that would guarantee that L(M) = ∅.
1.27. Construct deterministic finite automata to accept each of the following languages.
(a) {x ∈ {a, b, c}* | abc is a substring of x}
(b) {x ∈ {a, b, c}* | acaba is a substring of x}
1.28. Consider Example 1.14. The vending machine had as input nickels, dimes, and quar-
ters. When 30¢ had been deposited, a candy bar could be selected. Modify this
machine to also accept pennies, denoted by p, as an additional input. How does this
affect the number of states in the machine?
1.29. (a) Describe the language defined by the following quintuple (compare with Figure
1.28).
Σ = {a, b}        δ(t₀, a) = t₀
S = {t₀, t₁}      δ(t₀, b) = t₁
s₀ = t₀           δ(t₁, a) = t₁
F = {t₁}          δ(t₁, b) = t₀
(b) Rigorously prove the statement you made in part (a). Hint: First prove the in-
ductive statement
P(n): (∀x ∈ Σⁿ)((δ̄(t₀, x) = t₀ → |x|_b is even) ∧ (δ̄(t₀, x) = t₁ → |x|_b is odd)).
1.30. Consider a vending machine that accepts as input pennies, nickels, dimes, and quarters
and dispenses 10¢ candy bars.
(a) Draw a DFA that models this machine.
(b) Define the quintuple for this machine.
(c) How many states are absolutely necessary to build this machine?
1.31. Consider a vending machine that accepts as input nickels, dimes, and quarters and
dispenses 10¢ candy bars.
(a) Draw a DFA that models this machine.
(b) How many states are absolutely necessary to build this machine?
(c) Using the standard encoding conventions, draw a circuit diagram for this machine
(include <EOS> but not <SOS> in the input alphabet).
1.32. Using the standard encoding conventions, draw a circuit diagram that will implement
the machine given in Exercise 1.29, as follows:
(a) Implements both <EOS> and <SOS>.
(b) Uses neither <EOS> nor <SOS>.
1.33. Using the standard encoding conventions, draw a circuit diagram that will implement
the machine given in Exercise 1.7, as follows:
(a) Implements both <EOS> and <SOS>.
(b) Uses neither <EOS> nor <SOS>.
1.34. Modify Example 1.12 so that it correctly handles the <SOS> symbol; draw the new
circuit diagram.
1.35. Using the standard encoding conventions, draw a circuit diagram that will implement
the machine given in Example 1.6, as follows:
2 CHARACTERIZATION of FAD LANGUAGES

2.1 RIGHT CONGRUENCES

∇ Definition 2.1. Let Σ be an alphabet. A relation P ⊆ Σ* × Σ* is a right congruence iff
i. P is reflexive,
ii. P is symmetric,
iii. P is transitive, and
iv. (∀x, y ∈ Σ*)(x P y ⇒ (∀u ∈ Σ*)(xu P yu)).    Δ
Note that if P is a right congruence, then the first three conditions imply that P must
be an equivalence relation; for example, if Σ = {a, b}, aa P aa by reflexivity, and if
(abb, aba) ∈ P, then by symmetry (aba, abb) ∈ P, and so forth. Furthermore, if
abb P aba, then the right congruence property guarantees that

abba P abaa                   if u = a
abbb P abab                   if u = b
abbaa P abaaa                 if u = aa
abbbbaabb P ababbaabb         if u = bbaabb

and so on. Thus, the presence of just one ordered pair in P requires the existence of
many, many more ordered pairs. This might seem to make right congruences rather
rare objects; there are, however, an infinite number of them, many of them rather
simple, as shown by the following examples.
EXAMPLE 2.1
Let Σ = {a, b}, and let R be defined by x R y ⇔ |x| − |y| is even. It is easy to show
that this R is an equivalence relation (see the exercises) and partitions Σ* into two
equivalence classes: the even-length words and the odd-length words. Furthermore,
R is a right congruence: for example, if x = abb and y = baabb, then abb R baabb,
since |x| − |y| = 3 − 5 = −2, which is even. Note that for any choice of u,
abbu R baabbu, since |xu| − |yu| will also be −2. Thus abbu R baabbu for every
choice of u. The same is true for any other pair of words x and y that are related by
R, and so R is indeed a right congruence.
EXAMPLE 2.2
Let Σ = {a, b, c}, and let R₂ be defined by x R₂ y ⇔ x and y end with the same letter.
It is straightforward to show that R₂ is a right congruence (see the exercises) and
partitions Σ* into four equivalence classes: those words ending in a, those words
ending in b, those words ending in c, and {λ}.

The relation R₂ was based on the placement of letters within words, while
Example 2.1 was based solely on the length of the words. The following definition
illustrates a way to produce a relation on Σ* based on a given set of words L: for a
language L over Σ, the relation R_L is defined by
x R_L y ⇔ (∀u ∈ Σ*)(xu ∈ L ⇔ yu ∈ L).
EXAMPLE 2.3
Let K be the set of all words over {a, b} that are of odd length. Those strings that
are in K are used to define exactly which pairs of strings are in R_K. For example, we
can determine that ab R_K bbaa, since it is true that, for any u ∈ Σ*, either abu ∉ K
and bbaau ∉ K (when |u| is even) or abu ∈ K and bbaau ∈ K (when |u| is odd).
Note that ab and a are not related by R_K, since there are choices for u that would
violate the definition of R_K: abλ ∉ K and yet aλ ∈ K. In this case, R_K turns out to be
the same as the relation R defined in Example 2.1.
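The definition of R_K quantifies over every extension u, which a program can only sample, but a bounded check still illustrates the idea. In the C sketch below (ours), membership in K depends only on length, so trying one extension of each length up to a small bound suffices to reproduce the two comparisons made above.

#include <stdio.h>
#include <string.h>

/* K = words over {a, b} of odd length */
static int inK(size_t len) { return len % 2 == 1; }

/* Approximate test of x R_K y: compare membership of xu and yu for all
   extensions u up to length 8.  Since membership in K depends only on
   length, one extension of each length is enough. */
static int related(const char *x, const char *y)
{
    size_t lx = strlen(x), ly = strlen(y);
    for (size_t k = 0; k <= 8; k++)
        if (inK(lx + k) != inK(ly + k))
            return 0;
    return 1;
}

int main(void)
{
    printf("ab R_K bbaa? %s\n", related("ab", "bbaa") ? "yes" : "no");
    printf("ab R_K a?    %s\n", related("ab", "a") ? "yes" : "no");
    return 0;
}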
Recall that relations are sets of ordered pairs, and thus the claim that these
two relations are equal means that they are equal as sets; an ordered pair belongs to
R exactly when it belongs to R_K:

R = R_K iff (∀x, y ∈ Σ*)(x R y ⇔ x R_K y)

The strings ab and bbaa are related by R in Example 2.1, and they are likewise
related by R_K. A similar statement is true for any other pair that was in the relation
R; it will be in R_K, also. Additionally, it can be shown that elements that were not in
R will not be in R_K either.

Notice that R_K relates more than just the words in K; neither ab nor bbaa
belongs to K, and yet they were related to each other. This simple language K
happens to partition Σ* into two equivalence classes, corresponding to the language
itself and its complement. Less trivial languages will often form many equivalence
classes. The relation R_L defined by a language L has all the properties given in
Definition 2.1.

∇ Theorem 2.1. For any language L over Σ, the relation R_L is a right congruence.

Proof. See the exercises.    Δ
Note that the above theorem is very broad in scope: any language, no matter how
complex, always induces a relation that satisfies all four properties of a right congru-
ence. Thus, R_L always partitions Σ* into equivalence classes. One useful measure of
the complexity of a language L is the degree to which it fragments Σ*, that is, the
number of equivalence classes in R_L.

The rank of the relation in Example 2.3 was 2, since there were two equiv-
alence classes, the set of even-length words and the set of odd-length words. In
Example 2.2, rk(R₂) = 4. The rank of R_L can be thought of as a measure of the
complexity of the underlying language L. Thus, for K in Example 2.3, rk(R_K) = 2,
and K might consequently be considered to be a relatively simple language. Some
languages are too complex to be recognized by finite automata; this relationship will
be explored in the subsequent sections.
While the way in which a language gives rise to a partition of Σ* may seem
mysterious and highly nonintuitive, a deterministic finite automaton naturally dis-
tributes the words of Σ* into equivalence classes. The following definition describes
the manner in which a DFA partitions Σ*.

∇ Definition 2.4. Given a DFA M = <Σ, S, s₀, δ, F>, the relation R^M on Σ* is
defined by (∀x, y ∈ Σ*)(x R^M y ⇔ δ̄(s₀, x) = δ̄(s₀, y)).    Δ

R^M relates all strings that, when starting at s₀, wind up at the same state. It is
easy to show that R^M will be an equivalence relation with (usually) one equivalence
class for each state of M (remember that equivalence classes are by definition
nonempty; what type of state might not have an equivalence class associated with
it?). It is also straightforward to show that the properties of the state transition
function guarantee that R^M is in fact a right congruence (see the exercises).

The equivalence classes of R^M are called initial sets and will be of further
interest in later chapters. For a DFA M = <Σ, S, s₀, δ, F> and a given state t from
M, I(M, t) = {x | δ̄(s₀, x) = t}. This initial set can be thought of as the language
accepted by a machine similar to M, but which has t as its only final state. That is, if
we define M_t = <Σ, S, s₀, δ, {t}>, then I(M, t) = L(M_t).
The notation presented here allows a concise method of denoting both
relations defined by languages and relations defined by automata. It is helpful to
observe that, even in the absence of context, R_X indicates that a relation based on the
language X is being described (since X occurs as a subscript), while the relation R^Y
identifies Y as a machine (since Y occurs as a superscript).

Just as each DFA M gives rise to a right congruence R^M, many right congru-
ences Q can be associated with a DFA, which will be called A_Q. It can be shown that,
if some of the equivalence classes of Q are singled out to form a language L, A_Q will
recognize L.
∇ Definition 2.5. Given a right congruence Q of finite rank and a language L
that is the union of some of the equivalence classes of Q, A_Q is defined by

A_Q = <Σ, S_Q, s₀_Q, δ_Q, F_Q>

where

S_Q = {[x]_Q | x ∈ Σ*}
s₀_Q = [λ]_Q
F_Q = {[x]_Q | x ∈ L}

and δ_Q is defined by

(∀x ∈ Σ*)(∀a ∈ Σ)(δ_Q([x]_Q, a) = [xa]_Q)    Δ
Note that this is a finite-state machine since rk(Q) < ∞, and that if L₁ were a
different collection of equivalence classes of Q, A_Q would remain the same except
for the placement of the final states. In other words, F_Q is the only aspect of this
machine that depends on the language L (or L₁). As small as this change might be, it
should be noted that A_Q is defined both by Q and the language L. It is left for the
reader to show that A_Q is well defined and that L(A_Q) = L (see the exercises). The
corresponding statements will be proved in detail in the next section for the im-
portant special case where Q = R_L.
EXAMPLE 2.4
Let Q ⊆ {a}* × {a}* be the equivalence relation with the following equivalence
classes:

[λ]_Q = {λ} = {a}⁰
[a]_Q = {a} = {a}¹
[aa]_Q = {a}² ∪ {a}³ ∪ {a}⁴ ∪ {a}⁵ ∪ ...

It is easy to show that Q is a right congruence (see the exercises). If L₁ were defined
to be [λ]_Q ∪ [a]_Q, then A_Q would have the structure shown in Figure 2.1a. For the
language defined by the different combination of equivalence classes given by
L₂ = [λ]_Q ∪ [aa]_Q, A_Q would look like the DFA given in Figure 2.1b. This example
illustrates that it is the right congruence Q that establishes the start state and the
transitions, while the language L determines the final state set. It should also be
clear why L must be a union of equivalence classes from Q. The figure shows that a
machine with the structure imposed by Q cannot possibly both reject aaa and accept
aaaa. Either the entire equivalence class [aa]_Q must belong to L, or none of the
strings from [aa]_Q can belong to L.
Figure 2.1 The DFA A_Q with final states chosen for (a) L₁ and (b) L₂
2.2 NERODE'S THEOREM

In this section, we will show that languages that partition Σ* into a finite number of
equivalence classes can be represented by finite automata, while those that yield an
infinite number of classes would require a machine with an infinite number of
states.
EXAMPLE 2.5
The language K given in Example 2.3 can be represented by a finite automaton with
two states; all words that have an even number of letters eventually wind up at state
s₀, while all the odd-length words are taken by δ̄ to s₁. This machine is shown in
Figure 2.2.

It is no coincidence that these states split up the words of Σ* into the same
equivalence classes that R_K does. There is an intimate relationship between
languages that can be represented by a machine with a finite number of states and
languages that induce right congruences with a finite number of equivalence classes,
as shown by the following theorem.
∇ Theorem 2.2. (Nerode) Let L be a language over Σ. Then the following
three statements are equivalent:

1. L is FAD.
2. There exists a right congruence R on Σ* for which L is the (possibly empty)
union of some of the equivalence classes of R, and rk(R) < ∞.
3. rk(R_L) < ∞.

Proof. Because of the transitivity of ⇒, it will be sufficient to show only the
three implications (1) ⇒ (2), (2) ⇒ (3), and (3) ⇒ (1), rather than all six of them.
Proof of (1) ⇒ (2): Assume (1); that is, let L be FAD. Then there is a machine
that accepts L; that is, there exists a finite automaton M = <Σ, S, s₀, δ, F> such that
L(M) = L. Consider the relation R^M on Σ* based on this machine M as given in
Definition 2.4: (∀x, y ∈ Σ*)(x R^M y ⇔ δ̄(s₀, x) = δ̄(s₀, y)).

This R^M will be the relation R we need to prove (2). For each s ∈ S, consider
I(M, s) = {x ∈ Σ* | δ̄(s₀, x) = s}, which represents all strings that wind up at state s
(from s₀). Note that it is easy to define automata for which it is impossible to reach
certain states from the start state; for such states, I(M, s) would be empty. Then
∀s ∈ S, I(M, s) is either an equivalence class of R^M or I(M, s) = ∅. Since there is at
most one equivalence class per state, and there are a finite number of states, it
follows that rk(R^M) is also finite: rk(R^M) ≤ ||S|| < ∞.

However, we have

L = L(M) = {x ∈ Σ* | δ̄(s₀, x) ∈ F} = ⋃_{f∈F} {x ∈ Σ* | δ̄(s₀, x) = f} = ⋃_{f∈F} I(M, f)

That is, L is the union of some of the equivalence classes of the right congruence
R^M, and R^M is indeed of finite rank, and hence (2) is satisfied. Thus (1) ⇒ (2).
Proof of (2) ⇒ (3): Assume that (2) holds; that is, there is a right congruence
R for which L is the union of some of the equivalence classes of the right congruence
R, and rk(R) < ∞. Note that we no longer have (1) as an assumption; there is no
machine (as yet) associated with L.

Case 1: It could be that L is the empty union; that is, that L = ∅. In this case, it
is easy to show that R_L has only one equivalence class (Σ*), and thus rk(R_L) = 1 < ∞
and (3) will be satisfied.

Case 2: In the nontrivial case, L is the union of one or more of the equivalence
classes of the given right congruence R, and it is possible to show that this R must
then be closely related to the R_L induced by the original language L. In particular,
for any strings x and y,

x R y ⇒ (since R is a right congruence)
(∀u ∈ Σ*)(xu R yu) ⇒ (by definition of [ ])
(∀u ∈ Σ*)([xu]_R = [yu]_R) ⇒ (by definition of L as a union of [ ]'s)
(∀u ∈ Σ*)(xu ∈ L ⇔ yu ∈ L) ⇒ (by definition of R_L)
x R_L y

(∀x ∈ Σ*)(∀y ∈ Σ*)(x R y ⇒ x R_L y) means that R refines R_L, and thus each equiv-
alence class of R is entirely contained in an equivalence class of R_L; that is, each
equivalence class of R_L must be a union of one or more equivalence classes of R.
Thus, there are at least as many equivalence classes in R as in R_L, and so
rk(R_L) ≤ rk(R). But by hypothesis, rk(R) is finite, and so R_L must be of finite rank
also, and (3) is satisfied. Thus, in either case, (2) ⇒ (3).
Proof of (3) ⇒ (1): Assume now that condition (3) holds; that is, L is a
language for which R_L is of finite rank. Once again, note that all we know is that R_L
has a finite number of equivalence classes; we do not have either (1) or (2) as a
hypothesis. Indeed, we wish to show (1) by proving that L is accepted by some finite
automaton. We will base the structure of this automaton on the right congruence
R_L, using Definition 2.5 with Q = R_L. A_{R_L} is then defined by

A_{R_L} = <Σ, S_{R_L}, s₀_{R_L}, δ_{R_L}, F_{R_L}>

where

S_{R_L} = {[x]_{R_L} | x ∈ Σ*}
s₀_{R_L} = [λ]_{R_L}
F_{R_L} = {[x]_{R_L} | x ∈ L}

and δ_{R_L} is defined by

(∀x ∈ Σ*)(∀a ∈ Σ)(δ_{R_L}([x]_{R_L}, a) = [xa]_{R_L})

The basic idea in this construction is to define one state for each equivalence
class in R_L, use the equivalence class containing λ as the start state, use those classes
that were made up of words in L as final states, and define δ in a natural manner. We
claim that this machine is really a well-defined finite automaton and that it does
behave as we wish it to; that is, the language accepted by A_{R_L} really is L. In other
words, L(A_{R_L}) = L.

First, note that S_{R_L} is a finite set, since [by the only assumption we have in (3)]
R_L consists of only a finite number of equivalence classes. It can be shown that F_{R_L} is
well defined; if [z]_{R_L} = [y]_{R_L}, then either (both z ∈ L and y ∈ L) or (neither z nor y
belongs to L) (why?). The reader should show that δ_{R_L} is similarly well defined; that
is, if [z]_{R_L} = [y]_{R_L}, it follows that δ_{R_L} is forced to take both transitions to the
same state ([za]_{R_L} = [ya]_{R_L}). Also, a straightforward induction on |y| shows that the
rule for δ_{R_L} extends to a similar rule for δ̄_{R_L}:

(∀x ∈ Σ*)(∀y ∈ Σ*)(δ̄_{R_L}([x]_{R_L}, y) = [xy]_{R_L})

With this preliminary work out of the way, it is possible to easily show that
L(A_{R_L}) = L. Let x be any element of Σ*. Then

x ∈ L(A_{R_L}) ⇔ (by definition of L)
δ̄_{R_L}(s₀_{R_L}, x) ∈ F_{R_L} ⇔ (by definition of s₀_{R_L})
δ̄_{R_L}([λ]_{R_L}, x) ∈ F_{R_L} ⇔ (by definition of δ̄_{R_L} and induction)
[λx]_{R_L} ∈ F_{R_L} ⇔ (by definition of λ)
[x]_{R_L} ∈ F_{R_L} ⇔ (by definition of F_{R_L})
x ∈ L

Consequently, L is exactly the language accepted by this finite automaton; so L
must be FAD, and (1) is satisfied. Thus (3) ⇒ (1). We have therefore come full
circle, and all three conditions are equivalent.    Δ
EXAMPLE 2.6
Let L be the following FAD language: L = Σ* − {λ} = Σ⁺, where Σ = {0, 1}. There
are many finite automata that accept L, one of which is the DFA N given in Figure
2.3. This four-state machine gives rise to a right congruence with four equivalence
classes, as described in (1) ⇒ (2), where

[λ]_{R^N} = I(N, s₀) = {λ}, since λ is the only string that ends up at s₀
[1]_{R^N} = I(N, s₁) = {y | |y| is odd, and y ends with a 1} = {z | δ̄(s₀, z) = s₁}
[11]_{R^N} = I(N, s₂) = {y | |y| is even} − {λ} = {z | δ̄(s₀, z) = s₂}
[000]_{R^N} = I(N, s₃) = {y | |y| is odd, and y ends with a 0} = {z | δ̄(s₀, z) = s₃}

Note that L is indeed I(N, s₁) ∪ I(N, s₂) ∪ I(N, s₃), which is the union of all the
equivalence classes that correspond to final states in N, as required by (2). To
illustrate (2) ⇒ (3), let R be the equivalence relation R^N defined above, let L again
be Σ⁺, and note that (2) is satisfied: L = [1]_R ∪ [11]_R ∪ [000]_R, the union of 3 of the
equivalence classes of a right congruence of rank 4 (which is finite).

As in the proof of (2) ⇒ (3), R_L is refined by R, but in this case R and R_L are
not equal. All the relations from R still hold, such as 11 R 1111, so 11 R_L 1111;
0 R 000, and thus 0 R_L 000, and so forth. It can also be shown that 11 R_L 000, even
though 11 and 000 were not related by R (apply the definition of R_L to convince
yourself of this). Thus, everything in [11]_R is related by R_L to everything in [000]_R;
that is, all the strings belong to the same equivalence class of R_L, even though they
formed separate equivalence classes in R. It may at first appear strange, but the fact
that there are more relations in R_L means that there are fewer equivalence classes in
R_L than in R. Indeed, R_L has only two equivalence classes, {λ} and L. In this case,
three equivalence classes of R collapse to form one large equivalence class of R_L.
Thus {λ} = [λ]_{R_L} = [λ]_R and L = [11]_{R_L} = [1]_R ∪ [11]_R ∪ [000]_R and, as we were as-
sured by (2) ⇒ (3), R refines R_L.
Figure 2.3 The DFA N discussed in Example 2.6
To illustrate (3) ⇒ (1), let's continue to use the L and R_L given above. Since R_L
is of rank 2, we are assured of finding a two-state machine that will accept L. A_{R_L} in
this case would take the form of the automaton P displayed in Figure 2.4. In this DFA,
for example, δ([11]_{R_L}, 0) = [110]_{R_L} = [11]_{R_L}, and [λ]_{R_L} is the start state. [11]_{R_L} is a
final state since 11 ∈ L. Verify that this machine accepts all words except λ; that is,
L(A_{R_L}) = L.
EXAMPLE 2.7
If we were instead to begin with the same language L, but use the two-state
machine P at the end of Example 2.6 to represent L, we would find that L would
consist of only one equivalence class, R^P would have only two equivalence classes,
and R^P would in this case be the same as R_L (see the exercises). R^P turns out to be as
simple as R_L because the machine we started with was as "simple" as we could get
and still represent L. In Chapter 3 we will characterize the idea of a machine being
"as simple as possible," that is, minimal.
The two machines given in Example 2.6 accept the same language. It will be
convenient to formalize this notion of distinct machines "performing the same
task," and we therefore make the following definition.
∇ Definition 2.6. Two DFAs A and B are equivalent iff L(A) = L(B).    Δ
EXAMPLE 2.8
The DFAs N and P from Examples 2.6 and 2.7 are equivalent, since
L(N) = Σ⁺ = L(P).
∇ Definition 2.7. A DFA A = <Σ, S_A, s₀_A, δ_A, F_A> is minimal iff for every DFA
B = <Σ, S_B, s₀_B, δ_B, F_B> for which L(A) = L(B), |S_A| ≤ |S_B|.    Δ
EXAMPLE 2.9
The DFA N from Example 2.6 is clearly not minimal, since the automaton P from
Example 2.7 is equivalent and has fewer states than N. The techniques from Chapter
3 can be used to verify that the automaton P is minimal. More
importantly, minimization techniques will be explored in Chapter 3 that will allow
an optimal machine (like P) to be produced from an inefficient automaton
(like N).
2.3 PUMPING LEMMAS

∇ Theorem 2.3. (The pumping lemma) Let L be an FAD language. Then there
exists an n ∈ ℕ (n may be taken to be the number of states of a DFA that accepts L)
such that for every x ∈ L with |x| ≥ n, there exist u, v, w ∈ Σ* for which x = uvw,
|uv| ≤ n, |v| ≥ 1, and (∀i ∈ ℕ)(uvⁱw ∈ L).

In the proof, a sufficiently long word x = uvw must drive the machine through
some repeated state q: δ̄(s, u) = q, δ̄(q, v) = q, and δ̄(q, w) = f ∈ F. Consequently,
δ̄(s, uw) = δ̄(δ̄(s, u), w) = δ̄(q, w) = f. That is, the string uw winds up in the same
place uvw does; this is illustrated in Figure 2.7a. Note that a similar thing happens if
uv²w is processed:

δ̄(s, uvvw) = δ̄(δ̄(s, u), vvw)
           = δ̄(q, vvw)
           = δ̄(δ̄(q, v), vw)
           = δ̄(q, vw)
           = δ̄(δ̄(q, v), w)
           = δ̄(q, w)
           = f

This behavior is illustrated in Figure 2.7b.
EXAMPLE 2.10
Let E be the set of all even-length words over {a, b}. There is a two-state machine
that accepts E, so E is FAD, and the pumping lemma applies if n is, say, 5. Then
∀x, |x| > 5, if x = a₁a₂a₃ ... aⱼ ∈ E (that is, j is even), we can choose u = λ, v = a₁a₂,
and w = a₃a₄ ... aⱼ. Note that |uv| = 2 ≤ 5, |v| = 2 ≥ 1, and |uvⁱw| = j + 2(i − 1),
which is even, and so (∀i ∈ ℕ)(uvⁱw ∈ E).
If Example 2.10 does not appear truly exciting, there is good reason: the
pumping lemma is generally not applied to FAD languages! (Note: We will see an
application later.) The pumping lemma is often applied to show languages are not
FAD (by proving that the language does not satisfy the pumping lemma). Note that
the contrapositive of Theorem 2.3 is:

∇ Theorem 2.4. Let L be a language over Σ such that for every n ∈ ℕ there
exists an x ∈ L with |x| ≥ n for which, for every combination u, v, w ∈ Σ* with
x = uvw, |uv| ≤ n, and |v| ≥ 1, there is an i ∈ ℕ such that uvⁱw ∉ L. Then L is
not FAD.
EXAMPLE 2.11
Consider L₄ = {y ∈ {0, 1}* | |y|₁ = |y|₀}. We will use Theorem 2.4 to show L₄ is not
FAD: Let n be given, and choose x = 0ⁿ1ⁿ. Then x ∈ L₄, since |x|₁ = n = |x|₀. It
should be observed that x must be dependent on n, and we have no control over n
(in particular, n cannot be replaced by some constant; similarly, while i may be
chosen to be a fixed constant, a proof that covers all possible combinations of u, v,
and w must be given).

Note that this choice of x is "long enough" in that |x| = 2n ≥ n, as required by
Theorem 2.4. For any combination of u, v, w ∈ Σ* such that x = uvw, |uv| ≤ n,
|v| ≥ 1, we hope to find a value for i such that uvⁱw ∉ L₄. Since |uv| ≤ n and the first
n letters of x are all zeros, this narrows down the choices for u, v, and w. They must
be of the form u = 0ʲ and v = 0ᵏ (since |uv| ≤ n and x starts with n zeros), and w
must be the "rest of the string" and look something like w = 0ᵐ1ⁿ. The constraints
on u, v, and w imply that j + k ≤ n, k ≥ 1, and j + k + m = n. If i = 2, we have that
uv²w = 0ⁿ⁺ᵏ1ⁿ; since k ≥ 1, |uv²w|₀ = n + k > n = |uv²w|₁, and so uv²w ∉ L₄.
Theorem 2.4 therefore guarantees that L₄ is not FAD.
EXAMPLE 2.12
Consider L = {aⁱbʲ | i, j ∈ ℕ and i and j are relatively prime}. We will use Theorem
2.4 to show L is not FAD: Let n be given, and choose a prime p larger than n + 1
(we can be assured such a p exists since there are an infinite number of primes). Let
x = a^p b^{(p-1)!}. Since p has no factors other than 1 and p, it has no nontrivial factor in
common with (p − 1)·(p − 2)· ... ·3·2·1, and so p and (p − 1)! are relatively
prime, which guarantees that x ∈ L. The length of x is clearly greater than n, so
Theorem 2.3 should apply, which implies that there must exist a combination
u, v, w ∈ Σ* such that x = uvw, |uv| ≤ n, |v| ≥ 1; we hope to find a value for i such
that uvⁱw ∉ L. Since |uv| ≤ n and the first n letters of x are all as, there must exist
integers j, k, and m for which u = a^j and v = a^k, and w must be the "rest of the
string"; that is, w = a^m b^{(p-1)!}. The constraints on u, v, and w imply that j + k ≤ n,
k ≥ 1, and j + k + m = p. If i = 0, we have that uv⁰w = a^{p-k} b^{(p-1)!}. But p − k is a
number between p − 1 and p − n and hence must match one of the nontrivial
factors in (p − 1)!, which means that uv⁰w ∉ L (why?). Therefore, Theorem 2.3
has been violated, so L could not have been FAD.
The details of the basic argument used to prove the pumping lemma can be
varied to produce other theorems of a similar nature: for example, when processing
x, there must be a state q' repeated within the last n letters. This gives rise to the
following variation of the pumping lemma.

∇ Theorem 2.5. Let L be an FAD language. Then there exists an n ∈ ℕ such
that for every x ∈ L with |x| ≥ n, there exist u, v, w ∈ Σ* for which x = uvw,
|vw| ≤ n, |v| ≥ 1, and (∀i ∈ ℕ)(uvⁱw ∈ L).
The new condition |vw| ≤ n reflects the constraint that some state must be
repeated within the last n letters. The contrapositive of Theorem 2.5 can be useful
in demonstrating that certain languages are not FAD. By repeating our original
reasoning while assuming the string x takes us to a nonfinal state, we obtain yet
another variation.

∇ Theorem 2.6. Let L be an FAD language. Then there exists an n ∈ ℕ such
that for every x ∉ L with |x| ≥ n, there exist u, v, w ∈ Σ* for which x = uvw,
|uv| ≤ n, |v| ≥ 1, and

(∀i ∈ ℕ)(uvⁱw ∉ L)
Proof. See the exercises.
Notice that Theorem 2.6 guarantees that if one "long" string is not in the
language then there is an entire sequence of strings that cannot be in the language.
There are some examples of languages in the exercises where Theorem 2.4 is hard
to apply, but where Theorem 2.5 (or Theorem 2.6) is appropriate.
When i = 0, the pumping lemma states that given a "long" string (uvw) in L
there is a shorter string (uw) that is also in L. If this new string is still of length
greater than n, the pumping lemma can be reapplied to find a still shorter string,
and so on. This technique is the basis for proving the following theorem.

∇ Theorem 2.7. Let L be an FAD language, and let n be the number of states in
a DFA that accepts L. Then for every x = a₁a₂ ... aₘ ∈ L there exist indices
i₁ < i₂ < ... < i_j, with j < n, such that a_{i₁}a_{i₂} ... a_{i_j} ∈ L.

Note that a_{i₁}a_{i₂} ... a_{i_j} represents a string formed by "removing" letters from
perhaps several places in x, and that this new string has length less than n.
Theorem 2.7 can be applied in areas that do not initially seem to relate to
DFAs. Consider an arbitrary right congruence R of (finite) rank n. It can be shown
that each equivalence class of R is guaranteed to contain a representative of length
less than n. For example, consider the relation R given by

[λ]_R = {λ}
[11111]_R = {y | |y| is odd, and y ends with a 1}
[0101]_R = {y | |y| is even and |y| > 0}
[00000]_R = {y | |y| is odd, and y ends with a 0}

In this relation, rk(R) = 4, and appropriate representatives of length less than 4 are
λ, 1, 11, and 100, respectively. That is, [λ]_R = [λ]_R, [1]_R = [11111]_R, [11]_R = [0101]_R,
and [100]_R = [00000]_R. By constructing a DFA based on the right congruence R,
Theorem 2.7 can be used to prove that every equivalence class of R has a "short"
representative (see the exercises).
We have seen that deterministic finite automata are limited in their cognitive
powers, that is, there are languages that are too complex to be recognized by DFAs.
When only a finite set of previous histories can be distinguished, the resulting
languages must have a certain repetitious nature. Allowing automata to instead
have an infinite number of states is uninteresting for several reasons. On the prac-
tical side, it would be inconvenient (to say the least) to physically construct such a
machine. Infinite automata are also of little theoretical interest as they do not
distinguish between simple and complex languages: any language can be accepted
by the infinite analog of a DFA. With an infinite number of states available, the
state transition diagrams can look like trees, with a unique state corresponding to
each word in Σ*. The states corresponding to desired words can simply be made
final states.
More reasonable enhancements to automata will be explored later. Non-
determinism will be presented in Chapter 4, and machines with extended
capabilities will be defined and investigated in Chapters 10 and 11.
EXERCISES
2.1. Let Σ = {a, b, c}. Show that the relation Ψ ⊆ Σ* × Σ* defined by
x Ψ y ⇔ |x| − |y| is odd
is not a right congruence. (Is it an equivalence relation?)
2.16. Prove by induction that, for the strings defined in the discussion of the pumping
lemma, (∀i ∈ ℕ)(δ̄(s, uvⁱw) = f = δ̄(s, uvw)).
2.17. Prove Theorem 2.1.
2.18. (a) Find a language that gives rise to the relation I defined in Exercise 2.8.
(b) Could such a language be FAD? Explain.
2.19. Starting with Theorem 2.3 as a given hypothesis, prove Theorem 2.4.
2.20. Prove Theorem 2.5 by constructing an argument similar to that given for Theorem 2.3.
2.21. Prove Theorem 2.6 by constructing an argument similar to that given for Theorem 2.3.
2.22. Prove Theorem 2.7.
2.23. Let L = {x ∈ {a, b}* | |x|_a < |x|_b}. Show L is not FAD.
2.24. Let G = {x ∈ {a, b}* | |x|_a ≥ |x|_b}. Show G is not FAD.
2.25. Let P = {y ∈ {d}* | ∃ prime p ∋ y = dᵖ} = {dd, ddd, ddddd, d⁷, d¹¹, d¹³, ...}. Prove that P
is not FAD.
2.26. Let Γ = {x ∈ {0, 1, 2}* | ∃w ∈ {0, 1}* ∋ x = w·2·w} = {2, 121, 020, 11211, 10210, ...}.
Prove that Γ is not FAD.
2.27. Let Ψ = {x ∈ {0, 1}* | ∃w ∈ {0, 1}* ∋ x = w·w} = {Λ, 00, 11, 0000, 1010, 1111, ...}. Prove
that Ψ is not FAD.
2.28. Define the reverse of a string w as follows: If w = a₁a₂a₃a₄···aₙ₋₁aₙ, then
w′ = aₙaₙ₋₁···a₄a₃a₂a₁. Let K = {w ∈ {0, 1}* | w = w′} = {Λ, 0, 1, 00, 11, 000, 010,
101, 111, 0000, 0110, ...}. Prove K is not FAD.
2.29. Let Φ = {x ∈ {a, b, c}* | ∃j, k, m ∈ ℕ ∋ x = aʲbᵏcᵐ, where j ≥ 3 and k = m}. Prove Φ is
not FAD. Hint: The first version of the pumping lemma is hard to apply here (why?).
2.30. Let C = {y ∈ {d}* | ∃ nonprime q ∋ y = d^q} = {Λ, d, d⁴, d⁶, d⁸, d⁹, d¹⁰, ...}. Show C is not
FAD. Hint: The first version of the pumping lemma is hard to apply here (why?).
2.31. Assume Σ = {a, b} and L is a language for which R_L has the following three equivalence
classes: {Λ}, {all odd-length words}, {all even-length words except Λ}.
(a) Why couldn't L = {x | |x| is odd}? (Hint: Recompute R_L for L = {x | |x| is odd}.)
(b) List the languages L that could give rise to this R_L.
2.32. Let Σ = {a, b} and let Ψ = {x ∈ Σ* | x has an even number of a's and ends with (at least)
one b}. Describe R_Ψ, and draw a machine accepting Ψ.
2.33. Let E = {x ∈ {a}* | ∃j ∈ ℕ ∋ |x| = j²} = {Λ, a, aaaa, a⁹, a¹⁶, a²⁵, ...}. Prove that E is not
FAD.
2.34. Let Φ = {x ∈ {b}* | ∃j ∈ ℕ ∋ |x| = 2ʲ} = {b, bb, bbbb, b⁸, b¹⁶, b³², ...}. Prove that Φ is not
FAD.
2.35. Let Σ = {a, b}. Assume R_L has the following five equivalence classes: {Λ}, {a}, {aa},
{a³, a⁴, a⁵, a⁶, ...}, {x | x contains (at least) one b}. Also assume that L consists of exactly
one of these equivalence classes.
(a) Which equivalence class is L?
(b) List the other languages L that could give rise to this R_L (and note that they might
consist of several equivalence classes).
2.36. Let Ω = {y ∈ {0, 1}* | (y contains exactly one 0) ∨ (y contains an even number of 1s)}.
Find R_Ω.
2.37. Let Σ = {a, b} and L₁ = {x ∈ Σ* | |x|_a > |x|_b} and L₂ = {x ∈ Σ* | |x|_a < 3}. Which of the
following are FAD? Support your answers.
(a) L₁ (b) L₂ (c) L₁ ∩ L₂ (d) ~L₂ (e) L₁ ∪ L₂
MINIMIZATION OF FINITE AUTOMATA
We have seen that there are many different automata that can be used to represent a
given language. We would like to be able to find an automaton for a language L that
is minimal, that is, a machine that represents the language with the fewest states
possible.
Finding such an optimal DFA will involve transforming a given automaton
into the most efficient equivalent machine. To effectively accomplish this transfor-
mation, we must have a set of clear, unequivocal directions specifying how to
proceed. A procedure is a finite set of instructions that unambiguously defines
deterministic, discrete steps for performing some task. As anyone who has pro-
grammed a computer knows, it is possible to generate procedures that will never
halt for some inputs (or perhaps for all inputs if the program is seriously flawed). An
algorithm is a procedure that is guaranteed to halt on all (legal) inputs. In this
chapter we will specify a procedure for finding a minimal machine and then justify
that this procedure is actually an algorithm. Thus, the theorems and definitions will
show how to transform an inefficient DFA into an optimal automaton in a straight-
forward manner that can be easily programmed.
3.1 HOMOMORPHISMS AND ISOMORPHISMS

One of our goals for this chapter can be stated as follows: Given a language L, we
wish to survey all the machines that recognize L and choose the machine (or
machines) that is "smallest." It will be seen that there is indeed a unique smallest
machine: A_{R_L}. The automaton A_{R_L} will be unique in the sense that any other optimal
DFA looks exactly like A_{R_L} except for a trivial relabeling of the state names. The
concept of two automata "looking alike" will have to be formalized to provide a
basis for our rigorous statements. Machines that "look alike" will be called
isomorphic, and the relabeling specification will be called an isomorphism.
We have already learned some facts about A_{R_L}, which stem from the proof of
Nerode's theorem. These are summarized below and show that A_{R_L} is indeed one of
the optimal machines for the language L.
Also, in the proof of (1) ⇒ (2) in Nerode's theorem, the relation R_M (defined
by a given DFA M = <Σ, S, s₀, δ, F> for which L(M) = L) was used to show
‖S‖ ≥ rk(R_M). Furthermore, in (2) ⇒ (3), right congruences such as R_M that satisfied property (2) must be refinements of R_L, and so rk(R_M) ≥ rk(R_L). Thus
‖S‖ ≥ rk(R_M) ≥ rk(R_L) = ‖S_{R_L}‖, which leads immediately to the following corollary.
That is, a connected machine requires all states to be accessible; every state s of S
must be "reachable" from sci by some string (xs) in I* (different states will require
different strings, and hence it is convenient to associate an appropriate string Xs with
88 Minimization of Finite Automata Chap. 3
the state s). States that are not accessible are sometimes called disconnected,
inaccessible, or unreachable.
EXAMPLE 3.1
The machine defined in Figure 3.1 satisfies the definition of a deterministic finite
automaton, but is disconnected since r cannot be reached by any string from the
start state q. Note that x_q could be Λ or 10, while x_t might be 0 or 111. There is no
candidate for x_r. Furthermore, r could be "thrown away" without affecting the
language that this machine accepts. This will be one of the techniques we will use to
minimize finite automata: removing the inaccessible states.
There is a second way for an automaton to have superfluous states, as shown
by the automata in the following examples. An overabundance of states may be
present, recording nonessential information and consequently distinguishing be-
tween strings in ways that are unnecessary.
EXAMPLE 3.2
Consider the four-state DFA over {a, b}* in which s₀ is the start and only final state,
defined in Figure 3.2. This automaton is clearly connected, but it is still not optimal.
This machine accepts all strings whose length is a multiple of 3, and s₁ and s₂ are
really "remembering" the same information, that is, that we currently have read a
string whose length is one more than a multiple of 3. The fact that some strings that
end in a are sent to s₁, while those that end in b may be sent to s₂, is of no real
importance; we do not have to "remember" what the last letter in the string actually
was in order to correctly accept the given language. The states s₁ and s₂ are in some
sense equivalent, since they are performing the same function. The careful reader
may have noticed that this language could have been recognized with a three-state
machine, in which a single state combines the functions of s₁ and s₂.
Now consider the automaton shown in Figure 3.3, in which there are three
superfluous states. This automaton accepts the same language as the DFA in Figure
3.2, but this time not only are s₁ and s₂ performing the same function, but s₃ and s₄
are "equivalent," and s₀ and s₅ are both "remembering" that there has been a
multiple of three letters seen so far. Note that it is not enough to check that s₁ and s₂
take you to exactly the same places (as was the case in the first example); in this
example, the arrows coming out of s₁ and s₂ do not point to the same places. The
important thing is that, when leaving s₁ or s₂, when a is seen, we go to equivalent
states, and when processing b from s₁ or s₂, we also go to equivalent states. However,
deciding whether two states are equivalent or not is perhaps a little less straight-
forward than it may at first seem. This sets the stage for the appropriate definition of
equivalence.
In other words, we will relate s and t iff it is not possible to distinguish whether we
are starting from state s or state t; each string x ∈ Σ* will either take us to a final state
when starting from s and also take us to a final state when starting from t, or neither
s nor t will take x to a final state.
Another way of looking at this concept is to define new machines that "look
like" A, but have different start states. Given a finite automaton A = <Σ, S, s₀, δ, F>
and two states s, t ∈ S, define a new automaton Aᵗ = <Σ, S, t, δ, F> that has t
as a start state, and another automaton Aˢ = <Σ, S, s, δ, F> having s as a start
state. Then s E_A t ⇔ L(Aˢ) = L(Aᵗ). (Why is this an equivalent definition?) These
sets of words will be used in later chapters and are referred to as terminal sets.
T(A, t) will denote the set of all words that reach final states from t, and thus
T(A, t) = L(Aᵗ) = {x | δ̄(t, x) ∈ F}.
In terms of the black box model presented in Chapter 1, we see that we cannot
distinguish between Aˢ and Aᵗ by placing matching strings on the input tapes and
observing the acceptance lights of the two machines. For any string, both Aˢ and Aᵗ
will accept, or both will reject; without looking inside the black boxes, there is no
way to tell whether we are starting in state s or state t. This highlights the sense in
which s and t are deemed equivalent: we cannot distinguish between s and t by the
subsequent behavior of the automaton.
The modified automaton Aᵗ, which gives rise to the terminal set T(A, t), can be
contrasted with the modified automaton Aₜ = <Σ, S, s₀, δ, {t}> from Chapter 2,
which recognized the initial set I(A, t) = {x | δ̄(s₀, x) = t}. Notice that initial sets are
comprised of strings that move from the start state to the distinguished state t, while
terminal sets are made up of strings that go from t to a final state.
EXAMPLE 3.3
The automaton N discussed in Example 2.6 (Figure 3.4) has the following relations
comprising E_N:
s₀ E_N s₀
s₁ E_N s₁, s₁ E_N s₂, s₁ E_N s₃
s₂ E_N s₁, s₂ E_N s₂, s₂ E_N s₃
s₃ E_N s₁, s₃ E_N s₂, s₃ E_N s₃
Recall that Example 2.6 showed that the minimal machine that accepted L(N)
had two states; it will be seen that it is no coincidence that E_N has exactly the same
number of equivalence classes.
∇ Definition 3.3. A finite automaton A = <Σ, S, s₀, δ, F> is called reduced iff
(∀s, t ∈ S)(s E_A t ⇔ s = t).
∆
The automaton N in Figure 3.4 is not reduced, since Example 3.3 shows that [s₂]_{E_N}
contains three states. On the other hand, the automaton A displayed in Figure 3.5a
is reduced since [s₀]_{E_A} = {s₀} and [s₁]_{E_A} = {s₁}. The concepts of homomorphism and
isomorphism will play an integral part in justifying the correctness of the algorithms
that produce the optimal DFA for a given language. We need to formalize what we
mean when we say that two automata are "the same." The following examples
illustrate the criteria that must exist between similar machines.
EXAMPLE 3.5
We now consider the automaton B shown in Figure 3.5b, which looks suspiciously
like the DFA A given in Figure 3.5a. In fact, it is basically the "same" machine.
While it has been oriented differently (which has no effect on the δ function), and
the start state has been labeled q₀ rather than s₀, and the final state is called q₁ rather
than s₁, A and B are otherwise "identical." For such a relabeling to truly reflect the
same automaton structure, certain conditions must be met, as illustrated in the
following examples.
EXAMPLE 3.6
Consider machine C, defined by the state transition diagram given in Figure 3.5c.
This machine is identical to B, except for the position of the start state. However, it
is not the same machine as B, since it behaves differently (and, in fact, accepts a
different language). Thus we see that it is important for the start state of one
machine to correspond to the start state of the other machine.
Figure 3.5 (a) The automaton A (b) The automaton B (c) The automaton C (d) The automa-
ton D (e) The automaton E
EXAMPLE 3.7
Let machine D be defined by the state transition diagram given in Figure 3.5d. The
automata B and D (Figures 3.5b and 3.5d) look much the same, with start states
corresponding, but they are not the same (and will in fact accept different
languages), because we cannot get the final states to correspond correctly. Even if we
do get the start and final states to agree, we still have to make sure that the
transitions correspond. This is illustrated in the next example.
EXAMPLE 3.8
Consider the machine E given in Figure 3.5e. In this automaton, when leaving the
start state, we travel to a final state if we see 0, and remain at the start state (which is
nonfinal) if we see 1; this is different from what happened in machine A, where we
traveled to a final state regardless of whether we saw 0 or 1. Thus it is seen that we
not only have to find a correspondence (which can be thought of as a function μ)
between the states of our two machines, but we must do this in a way that satisfies
the above three conditions (or else we cannot claim the machines are "the same").
This is summed up in the following definition of a homomorphism.
∇ Definition 3.4. Given two finite automata, A = <Σ, S_A, s_{0A}, δ_A, F_A> and
B = <Σ, S_B, s_{0B}, δ_B, F_B>, and a function μ: S_A → S_B, μ is called a finite automata
homomorphism from A to B iff the following three conditions hold:
i. μ(s_{0A}) = s_{0B}.
ii. (∀s ∈ S_A)(s ∈ F_A ⇔ μ(s) ∈ F_B).
iii. (∀s ∈ S_A)(∀a ∈ Σ)(μ(δ_A(s, a)) = δ_B(μ(s), a)).
∆
EXAMPLE 3.9
Machines A and B in Example 3.5 are homomorphic, since the homomorphism
μ: {s₀, s₁} → {q₀, q₁} defined by μ(s₀) = q₀ and μ(s₁) = q₁ satisfies the three
conditions.
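Conditions (i) through (iii) are mechanical enough to check by program. In the Python sketch below, the encoding of an automaton as a dictionary (keys 'start', 'finals', 'states', 'sigma', and 'delta') and the function name are our own assumptions, not notation from the text; mu is a dictionary carrying states of A to states of B:

    def is_homomorphism(mu, A, B):
        # (i) the start state of A must map to the start state of B
        if mu[A['start']] != B['start']:
            return False
        # (ii) s is final in A exactly when mu(s) is final in B
        if any((s in A['finals']) != (mu[s] in B['finals'])
               for s in A['states']):
            return False
        # (iii) transitions commute with mu:
        #       mu(delta_A(s, a)) = delta_B(mu(s), a)
        return all(mu[A['delta'][(s, a)]] == B['delta'][(mu[s], a)]
                   for s in A['states'] for a in A['sigma'])

Checking for an isomorphism (Definition 3.5 below) would additionally require that mu be a bijection.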
The following example shows that even if we can find a homomorphism that satisfies
the three conditions, the machines might not be the same.
EXAMPLE 3.10
Figure 3.6 (a) The DFA M discussed in Example 3.10 (b) The DFA N discussed in
Example 3.10
Note that this really does say that the transition labeled 0 leaving the start state
conforms; s₀ has a 0-transition pointing to s₁, and so the 0-transition from ψ(s₀)
should point to ψ(s₁) (that is, q₀ should point to q₁). But the transition taken from s₀
upon seeing a 0, in our notation, is δ_M(s₀, 0), and the place q₀ goes to is δ_N(q₀, 0). We
wish to make sure that the state in N corresponding to where the 0-transition from s₀
points, denoted by ψ(δ_M(s₀, 0)), agrees with the state to which q₀ points. Hence we
require ψ(δ_M(s₀, 0)) = δ_N(q₀, 0). In the last formula, q₀ was chosen because that
was the state corresponding to s₀; that is, ψ(s₀) = q₀. Hence, in our formal notation,
we were really checking ψ(δ_M(s₀, 0)) = δ_N(ψ(s₀), 0). Hence, we see that rule (iii)
requires us to check all transitions leading out of all states for all letters; that is,
(∀s ∈ S_M)(∀a ∈ Σ)(ψ(δ_M(s, a)) = δ_N(ψ(s), a)). Applying this rule to each choice of
letters a and states s, we have
ψ(δ_M(s₀, 0)) = ψ(s₁) = q₁ = δ_N(q₀, 0) = δ_N(ψ(s₀), 0)
ψ(δ_M(s₀, 1)) = ψ(s₁) = q₁ = δ_N(q₀, 1) = δ_N(ψ(s₀), 1)
ψ(δ_M(s₁, 0)) = ψ(s₁) = q₁ = δ_N(q₁, 0) = δ_N(ψ(s₁), 0)
ψ(δ_M(s₁, 1)) = ψ(s₀) = q₀ = δ_N(q₁, 1) = δ_N(ψ(s₁), 1)
ψ(δ_M(s₂, 0)) = ψ(s₁) = q₁ = δ_N(q₀, 0) = δ_N(ψ(s₂), 0)
ψ(δ_M(s₂, 1)) = ψ(s₁) = q₁ = δ_N(q₀, 1) = δ_N(ψ(s₂), 1)
Hence ψ is a homomorphism between M and N even though M has three states and
N has two states. While the existence of a homomorphism is not enough to ensure
that the machines are "the same," the exercises for this chapter indicate that the
existence of a homomorphism is enough to ensure that the machines are equivalent.
The extra condition we need to guarantee that the machines are identical (except
for a trivial renaming of the states) is that ψ be a bijection.
∇ Definition 3.5. Given two finite automata A = <Σ, S_A, s_{0A}, δ_A, F_A> and
B = <Σ, S_B, s_{0B}, δ_B, F_B>, and a function μ: S_A → S_B, μ is called a finite automata
isomorphism from A to B iff the following five conditions hold:
i. μ(s_{0A}) = s_{0B}.
ii. (∀s ∈ S_A)(s ∈ F_A ⇔ μ(s) ∈ F_B).
iii. (∀s ∈ S_A)(∀a ∈ Σ)(μ(δ_A(s, a)) = δ_B(μ(s), a)).
iv. μ is a one-to-one function from S_A to S_B.
v. μ is onto S_B.
∆
EXAMPLE 3.11
μ from Example 3.9 is an isomorphism. Example 3.5 illustrated that the automaton
A was essentially "the same" as B except for the way the states were named. Note
that μ can be thought of as the recipe for relabeling the states of A to form a
machine that would then be in the very strictest sense absolutely identical to B. ψ
from Example 3.10 is not an isomorphism because it is not one to one.
∇ Definition 3.6. Given two finite automata A = <Σ, S_A, s_{0A}, δ_A, F_A> and
B = <Σ, S_B, s_{0B}, δ_B, F_B>, A is said to be isomorphic to B iff there exists a finite
automata isomorphism between A and B, and we will write A ≅ B.
∆
EXAMPLE 3.12
Machines A and B from Examples 3.4 and 3.5 are isomorphic. Machines M and N
from Example 3.10 are not isomorphic (and not just because the particular function
ψ fails to satisfy the conditions; we must actually prove that no function exists that
qualifies as an isomorphism between M and N).
Now that we have rigorously defined the concept of two machines being
"essentially identical," we can prove that, given a language L, any reduced and
connected machine A accepting L must be minimal, that is, have as few states as
possible for that particular language. We will prove this assertion by showing that
any such A is isomorphic to A_{R_L}, which was shown in Corollary 3.2 to be the
"smallest" possible machine for L.
∇ Theorem 3.1. Let L be any FAD language over an alphabet Σ, and let
A = <Σ, S, s₀, δ, F> be any reduced and connected automaton that accepts L. Then
A ≅ A_{R_L}.
Proof. We must try to define a reasonable function μ from the states of A to
the states of A_{R_L} (which you should recall corresponded to equivalence classes of
R_L). A natural way to define μ (which happens to work!) is: For each s ∈ S, find a
string x_s ∈ Σ* ∋ δ̄(s₀, x_s) = s. (Since A is connected, we are guaranteed to find such
an x_s. In fact, there may be many strings that take us from s₀ to s; choose any one of
them, and call it x_s.) We need to map s to some equivalence class of R_L; the logical
choice is the class containing x_s. Thus we define
μ(s) = [x_s]_{R_L}
An immediate question comes to mind: There may be several strings that we could
use for x_s; does it matter which one we choose to find the equivalence class? It
would not do if, say, R_L consisted of two equivalence classes, the even-length
strings = [11]_{R_L} and the odd-length strings = [0]_{R_L}, and both δ̄(s₀, 0) and δ̄(s₀, 11)
equaled s. Then, on the one hand, μ(s) should be [0]_{R_L}, and, on the other hand, it
should be [11]_{R_L}. μ must be a function; it cannot send s to two different equivalence
classes. Note that there would be no problem if δ̄(s₀, 11) = s and δ̄(s₀, 1111) = s, since
[11]_{R_L} = [1111]_{R_L}, both of which represent the set of all even-length strings. Here x_s
could be 11, or it could be 1111, and there is no inconsistency in the way in which
μ(s) is defined; in either case, s is mapped by μ to the class of even-length strings.
Thus we must first show:
4. Final states map to final states; that is, (∀s ∈ S)(μ(s) ∈ F_{R_L} ⇔ s ∈ F). Choose
an s ∈ S and pick a corresponding x_s ∈ Σ* such that δ̄(s₀, x_s) = s. Then
5. The transitions match up; that is, (∀s ∈ S)(∀a ∈ Σ)(μ(δ(s, a)) = δ_{R_L}(μ(s), a)).
Choose an s ∈ S and pick a corresponding x_s ∈ Σ* such that δ̄(s₀, x_s) = s. Note
that this implies that [x_s] = μ(s) = μ(δ̄(s₀, x_s)). Then
6. μ is one to one; that is, if μ(s) = μ(t), then s = t. Let s, t ∈ S and assume
μ(s) = μ(t).
μ(s) = μ(t) ⇔ (by definition of =)
(∀u ∈ Σ*)(δ̄_{R_L}(μ(s), u) = δ̄_{R_L}(μ(t), u)) ⇔ [by property (5), induction]
(∀u ∈ Σ*)(μ(δ̄(s, u)) = μ(δ̄(t, u))) ⇒ (by definition of =)
(∀u ∈ Σ*)(μ(δ̄(s, u)) ∈ F_{R_L} ⇔ μ(δ̄(t, u)) ∈ F_{R_L}) ⇔ [by property (4) above]
(∀u ∈ Σ*)(δ̄(s, u) ∈ F ⇔ δ̄(t, u) ∈ F) ⇔ (by definition of E_A)
s E_A t ⇔ (since A is reduced)
s = t
Thus, by results (1) through (6), μ is a well-defined homomorphism that is also
a bijection; so μ is an isomorphism and therefore A ≅ A_{R_L}.
∇ Corollary 3.3. Let A and B be reduced and connected finite automata.
Under these conditions, A is equivalent to B iff A ≅ B.
Proof. If A ≅ B, it is easy to show that A is equivalent to B (as indicated in the
exercises, this implication is true even if A and B are not reduced and connected).
Now assume the hypothesis that A and B are reduced and connected does hold, and
that A is equivalent to B. By Theorem 3.1, A ≅ A_{R_{L(A)}}. Similarly, B ≅ A_{R_{L(B)}}. Since
L(A) = L(B), A_{R_{L(A)}} = A_{R_{L(B)}}. Therefore, A ≅ A_{R_{L(A)}} = A_{R_{L(B)}} ≅ B.
∆
3.2 MINIMIZATION ALGORITHMS

From the results in the previous section, it follows that a reduced and connected
finite automaton must be minimal. This section demonstrates how to transform an
existing DFA into an equivalent machine that is both reduced and connected and
hence is the most efficient machine possible for the given language. The designer of
an automaton can therefore focus solely on producing a machine that recognizes the
correct set of strings (without regard for efficiency), knowing that the techniques
presented in this section can later be employed to shrink the DFA to its optimal size.
The concepts explored in Chapters 4, 5, and 6 will provide further tools to aid in the
design process and corresponding techniques to achieve optimality.
∇ Definition 3.7. Given a finite automaton A = <Σ, S, s₀, δ, F>, define a new
automaton Aᶜ = <Σ, Sᶜ, s₀ᶜ, δᶜ, Fᶜ>, called A connected, by
Sᶜ = {s ∈ S | ∃x ∈ Σ* ∋ δ̄(s₀, x) = s}
s₀ᶜ = s₀
Fᶜ = F ∩ Sᶜ = {f ∈ F | ∃x ∈ Σ* ∋ δ̄(s₀, x) = f}
and δᶜ is derived from the restriction of δ to Sᶜ × Σ:
(∀a ∈ Σ)(∀s ∈ Sᶜ)(δᶜ(s, a) = δ(s, a))
Aᶜ is thus simply the machine A with the unreachable states "thrown away"; s₀
can be reached by x = Λ, so it is a valid choice for the start state in Aᶜ. Fᶜ is simply the
final states that can be reached from s₀, and δᶜ is the collection of transitions that
still come from (and consequently point to) states in the connected portion. Actually, δᶜ was defined to be the transitions that merely come from states in Sᶜ, with
no mention of any restrictions on the range of δᶜ. We must have, however,
δᶜ: Sᶜ × Σ → Sᶜ; in order for Aᶜ to be well defined, δᶜ must be shown to map into the
proper range. It would not do to have a transition leading from a state in Sᶜ to a state
that is not in the new state set of Aᶜ. The fact that δᶜ does indeed have the desired
properties is relegated to the exercises.
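Once Sᶜ is in hand, the construction of Definition 3.7 is a simple restriction. Here is a sketch, reusing the hypothetical dictionary encoding introduced earlier (algorithms for computing Sᶜ itself appear below):

    def connect(A, SC):
        # restrict A to the reachable state set SC (Definition 3.7)
        SC = set(SC)
        return {
            'sigma':  A['sigma'],
            'states': SC,                          # S^C
            'start':  A['start'],                  # s0 is reached by x = lambda
            'finals': A['finals'] & SC,            # F^C = F intersected with S^C
            'delta':  {(s, a): A['delta'][(s, a)]  # delta^C: delta restricted
                       for s in SC for a in A['sigma']},
        }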
EXAMPLE 3.13
Let M=<{a,b},{qo,qbq2,q3},qo,8,{qbQ3}>, as illustrated in Figure 3.7a. By
inspection, the only states that can be reached from the start state are qo and q3'
Sec. 3.2 Minimization Algorithms 99
(a) (b)
Figure 3.7 (a) The DFA M discussed in Example 3.13 (b) The DFA M' discussed
in Example 3.13
Hence MC = <{a, b}, {qo, q3}, qo, 8c, {q3}>' The resulting automaton is shown in Fig-
ure 3.7b. An algorithm for effectively computing SC will be presented later.
∇ Theorem 3.2. Given any finite automaton A, the new machine Aᶜ is indeed
connected.
Proof. This is an immediate consequence of the way Sᶜ was defined.
Definition 3.7 and Theorem 3.2 would be of little consequence if it were not
for the fact that A and Aᶜ accept the same language. Aᶜ is in fact equivalent to A, as
proved in Theorem 3.3.
∇ Theorem 3.3. Given any finite automaton A = <Σ, S, s₀, δ, F>, A and Aᶜ are
equivalent, that is, L(Aᶜ) = L(A).
Proof. Let x ∈ Σ*. Then:
x ∈ L(A) ⇔ (by definition of L)
∃s ∈ S ∋ (δ̄(s₀, x) = s ∧ s ∈ F) ⇔ (by definition of Sᶜ)
s ∈ Sᶜ ∧ s ∈ F ⇔ (by definition of ∩)
s ∈ (Sᶜ ∩ F) ⇔ (by definition of Fᶜ)
s ∈ Fᶜ ⇔ (by definition of s)
δ̄(s₀, x) ∈ Fᶜ ⇔ (by definition of δᶜ and induction)
δ̄ᶜ(s₀, x) ∈ Fᶜ ⇔ (by definition of s₀ᶜ)
δ̄ᶜ(s₀ᶜ, x) ∈ Fᶜ ⇔ (by definition of L)
x ∈ L(Aᶜ)
Thus, given any machine A, we can find an equivalent machine (that is, a
machine that accepts the same language as A) that is connected. Furthermore, there
is an algorithm that can be applied to find Aᶜ (that is, we don't just know that such a
machine exists, we actually have a method for calculating what it is). The definition
of Sᶜ implies that there is a procedure for finding Sᶜ: one can begin enumerating the
strings x in Σ*, and by applying the transition function to each x, the new states that
are reached can be included in Sᶜ. This is not a very satisfactory process because
there are an infinite number of strings in Σ* to check. However, the indicated proof
for Theorem 2.7 shows that, if a state can be reached by a "long" string, then it can
be reached by a "short" string. Thus, we will only need to check the "short" strings.
In particular,
Sᶜ = ⋃_{x ∈ Σ*} δ̄(s₀, x) = ⋃_{x ∈ Q} δ̄(s₀, x)
where Q consists of the "short" strings: Q = {x ∈ Σ* | |x| < ‖S‖}. Thus, Q is the set
of all strings of length less than the number of states in the DFA. Q is a finite set,
and therefore we can check all strings x in Q in a finite amount of time; we therefore
have an algorithm (that is, a procedure that is guaranteed to halt) for finding Sᶜ, and
consequently an algorithm for constructing Aᶜ. Thus, given any machine, we can
find an equivalent machine that is connected. The above method is not very efficient
because many calculations are constantly repeated. A better algorithm based on
Definition 3.10 will be presented later.
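The short-string procedure just outlined might be rendered as follows; this sketch implements the inefficient method and is shown only for contrast with the algorithm based on Definition 3.10:

    from itertools import product

    def reachable_naive(A):
        # compute S^C by tracing every string in Q, the set of strings
        # of length less than ||S||
        SC = set()
        for n in range(len(A['states'])):
            for x in product(A['sigma'], repeat=n):
                s = A['start']
                for a in x:                        # follow delta-bar(s0, x)
                    s = A['delta'][(s, a)]
                SC.add(s)
        return SC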
We now turn our attention to building a reduced machine from an arbitrary
machine. The following definition gives a consistent way to combine the redundant
states identified by the state equivalence relation E_A.
∇ Definition 3.8. Given a finite automaton A = <Σ, S, s₀, δ, F>, define a new
finite automaton A/E_A, called A modulo its state equivalence relation, by
A/E_A = <Σ, S_{E_A}, s_{0E_A}, δ_{E_A}, F_{E_A}>
where
S_{E_A} = {[s]_{E_A} | s ∈ S}
s_{0E_A} = [s₀]_{E_A}
F_{E_A} = {[s]_{E_A} | s ∈ F}
and δ_{E_A} is defined by
(∀a ∈ Σ)(∀[s] ∈ S_{E_A})(δ_{E_A}([s]_{E_A}, a) = [δ(s, a)]_{E_A})
Thus, there is one state in A/E_A for each equivalence class in E_A, the new start
state is the equivalence class containing s₀, and the final states are those equivalence
classes that are made up of states from F. The transition function is also defined in a
natural manner: Given an equivalence class [t]_{E_A} and a letter a, choose one state, say
t, from the class and see what state the old transition specified (δ(t, a)). The new
transition function will choose the equivalence class containing this new state
([δ(t, a)]_{E_A}). Once again, there may be several states in an equivalence class and thus
several states from which to choose. We must make sure that the definition of δ_{E_A}
does not depend on which state of [t]_{E_A} we choose (that is, we must ascertain that δ_{E_A}
is well defined). Similarly, F_{E_A} should be shown to be well defined (see the exercises).
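In code, the quotient construction might be sketched as follows, where cls(s) returns the equivalence class of s (say, as a frozenset of states); the well-definedness argument above is exactly what licenses picking an arbitrary representative of each class:

    def quotient(A, cls):
        # build A/E_A from a class map cls: state -> block (Definition 3.8)
        return {
            'sigma':  A['sigma'],
            'states': {cls(s) for s in A['states']},
            'start':  cls(A['start']),             # [s0]
            'finals': {cls(s) for s in A['finals']},
            # any representative s of the class [s] gives the same result
            'delta':  {(cls(s), a): cls(A['delta'][(s, a)])
                       for s in A['states'] for a in A['sigma']},
        }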
It stands to reason that if we coalesce all the states that performed the same
function (that is, were related by E_A) into a single state, the resulting machine should
no longer have distinct states that perform the same function. We can indeed prove
that this is the case, that is, that A/E_A is reduced.
Since we ultimately want to first apply Definition 3.7 to find a connected DFA
and then apply Definition 3.8 to reduce that DFA, we wish to show that this process
of obtaining a reduced machine does not destroy connectedness. We can be assured
that if Definition 3.8 is applied to a connected machine the result will then be both
connected (Theorem 3.5) and reduced (Theorem 3.4).
∇ Theorem 3.5. If A = <Σ, S, s₀, δ, F> is connected, then A/E_A is connected.
Proof. We need to show that every state in A/E_A can be reached from the start
state of A/E_A. Assume s ∈ S_{E_A}. Then ∃s′ ∈ S ∋ s = [s′]_{E_A}; but A was connected, and
so there exists an x ∈ Σ* such that δ̄(s₀, x) = s′; that is, there is a string that will take
us from s₀ to s′ in the original machine A. This same string will take us from s_{0E_A} to s
in A/E_A, since δ̄_{E_A}(s_{0E_A}, x) = [δ̄(s₀, x)]_{E_A} = [s′]_{E_A} = s.
Finally, we want to show that we do not change the language by reducing the
machine. The following theorem proves that A/E_A and A are indeed equivalent.
∇ Theorem 3.6. Given a finite automaton A = <Σ, S, s₀, δ, F>, L(A/E_A) = L(A).
Proof.
x ∈ L(A/E_A) ⇔ (by definition of L)
δ̄_{E_A}(s_{0E_A}, x) ∈ F_{E_A} ⇔ (by definition of s_{0E_A})
δ̄_{E_A}([s₀]_{E_A}, x) ∈ F_{E_A} ⇔ (by definition of δ_{E_A} and induction)
[δ̄(s₀, x)]_{E_A} ∈ F_{E_A} ⇔ (by definition of F_{E_A})
δ̄(s₀, x) ∈ F ⇔ (by definition of L)
x ∈ L(A)
∇ Theorem 3.7. Given a finite automaton definable language L and any finite
automaton A that accepts L, then there exists an algorithm for constructing the
unique (up to isomorphism) minimum-state finite automaton accepting L.
Proof. For the finite automaton A that accepts L, there is an algorithm for
finding the set of connected states in A, and therefore there exists an algorithm
for constructing Aᶜ, which is a connected automaton with the property that
L(Aᶜ) = L(A) = L.
Furthermore, there exists an algorithm for computing E_{Aᶜ}, the state
equivalence relation on Aᶜ; consequently, there is an algorithm for constructing Aᶜ/E_{Aᶜ}, which is a reduced, connected automaton with the property that
L(Aᶜ/E_{Aᶜ}) = L(Aᶜ) = L(A) = L.
From the main theorem on minimization (Theorem 3.1), we know that
Aᶜ/E_{Aᶜ} ≅ A_{R_L}, and A_{R_L} is the unique (up to isomorphism) minimum-state finite
automaton accepting L.
∇ Definition 3.9. Given a finite automaton A = <Σ, S, s₀, δ, F> and an integer
i, define the ith partial state equivalence relation on A, a relation between the states
of A denoted by E_{iA}, by
(∀s, t ∈ S)(s E_{iA} t ⇔ (∀x ∈ Σ* ∋ |x| ≤ i)(δ̄(s, x) ∈ F ⇔ δ̄(t, x) ∈ F))
EXAMPLE 3.14
Let B be the DFA illustrated in Figure 3.8. Consider the relation E_{0B}. The empty
string Λ can differentiate between q₀ and the final states, but cannot differentiate
between q₁, q₂, q₃, and q₄. Thus E_{0B} has two equivalence classes, {q₀} and
{q₁, q₂, q₃, q₄}.
In E_{1B}, Λ still differentiates q₀ from the other states, but the string 1 can
distinguish q₃ from q₁, q₂, and q₄ since δ̄(q₃, 1) ∉ F, but δ̄(qᵢ, 1) ∈ F for i = 1, 2, and
4. We still cannot distinguish between q₁, q₂, and q₄ with strings of length 0 or
1, so these remain together and E_{1B} = {{q₀}, {q₃}, {q₁, q₂, q₄}}. Similarly, since
δ̄(q₁, 11) ∈ F but δ̄(q₂, 11) ∉ F and δ̄(q₄, 11) ∉ F, E_{2B} = {{q₀}, {q₃}, {q₁}, {q₂, q₄}}. Further investigation shows E_{2B} = E_{3B} = E_{4B} = E_{5B} = ···, and indeed E_B = E_{2B}.
Figure 3.8 The DFA B discussed in Example 3.14
The ith state equivalence relation provides a convenient vehicle for computing
E_A. The behavior exhibited by the relations in Example 3.14 follows a pattern that is
similar for all deterministic finite automata. The following observations will
culminate in a proof that the calculation of successive partial state equivalence
relations is guaranteed to lead to the relation EA.
Given an integer i and a finite alphabet Σ, there is clearly an algorithm for
finding E_{iA} since there are only a finite number of strings in Σ⁰ ∪ Σ¹ ∪ Σ² ∪ ··· ∪ Σⁱ.
Furthermore, given every E_{iA}, there is an expression for E_A:
E_A = E_{0A} ∩ E_{1A} ∩ E_{2A} ∩ E_{3A} ∩ ··· ∩ E_{nA} ∩ ··· = ⋂_{j=0}^{∞} E_{jA}
The proof is relegated to the exercises and is related to the fact that
Σ* = Σ⁰ ∪ Σ¹ ∪ Σ² ∪ ··· ∪ Σⁿ ∪ ···
Finally, it should be clear that if two states cannot be distinguished by strings
of length 7 or less, they cannot be distinguished by strings of length 6 or less, which
means E_{7A} is a refinement of E_{6A}. This principle generalizes, as formalized below.
∇ Lemma 3.1. Given a finite automaton A = <Σ, S, s₀, δ, F> and an integer
m, E_{m+1A} is a refinement of E_{mA}, which means
(∀s, t ∈ S)(s E_{m+1A} t ⇒ s E_{mA} t)
or, viewing the relations as sets of ordered pairs, E_{m+1A} ⊆ E_{mA}.
Lemma 3.2 shows that each E_{mA} is related to the desired E_A. Lemma 3.1 thus
shows that successive E_{mA} relations come closer to "looking like" E_A.
∇ Lemma 3.2. Given a finite automaton A = <Σ, S, s₀, δ, F> and an integer
m, E_A is a refinement of E_{mA}, and so
(∀s, t ∈ S)(s E_A t ⇒ s E_{mA} t)
That is, E_A ⊆ E_{mA}.
∇ Lemma 3.3. Given a finite automaton A = <Σ, S, s₀, δ, F>, E_{0A} has two
equivalence classes, F and S − F (unless either F or S − F is empty, in which case there is
only one equivalence class, S).
Proof. The proof follows immediately from the definition of E_{0A}; the empty
string Λ differentiates between final and nonfinal states, producing the equivalence
classes outlined above.
∆
Given E_{0A} as a starting point, Theorem 3.8 shows how successive relations can
be efficiently calculated.
Note that Theorem 3.8 gives a far superior method for determining successive
E_{jA} relations. The definition required the examination of many (long) strings using
the δ̄ function; Theorem 3.8 allows us to simply check a few letters using the δ
function. Theorems 3.9, 3.10, and 3.11 will assure us that E_A will eventually be
found. The following theorem guarantees that the relations, should they ever begin
to look alike, will continue to look alike as successive relations are computed.
The result in Theorem 3.9 is essential to the proof of the next theorem, which
guarantees that when successive relations look alike they are identical to E_A.
3. Combining (1) and (2), we have (∀q, r ∈ S)(q E_{mA} r ⇔ q E_A r), and so E_{mA} = E_A.
∆
The next theorem guarantees that these relations will eventually look alike
(and so, by Theorem 3.10, we are assured that successive computations of E_{iA} will
yield an expression representing the relation E_A).
not all these relations can be distinct, and so there is some index m for which
E_{mA} = E_{m+1A}.
∆
∇ Corollary 3.5. Given a DFA A = <Σ, S, s₀, δ, F>, there is an algorithm for
computing E_A.
Proof. E_A can be found by using Lemma 3.3 to find E_{0A}, and computing
successive E_{iA} relations using Theorem 3.8 until E_{iA} = E_{i+1A}; this E_{iA} will equal E_A,
and this will all happen before i reaches ‖S‖, the number of states in S. The
procedure is therefore guaranteed to halt.
∆
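The procedure of Corollary 3.5 can be sketched directly. Here each relation E_{iA} is represented by a labeling of the states (two states are related iff they carry the same label), and the one-letter refinement step stands in for the method the text attributes to Theorem 3.8:

    def state_equivalence(A):
        # E_0A: separate final from nonfinal states (Lemma 3.3)
        label = {s: (s in A['finals']) for s in A['states']}
        while True:
            # refine: s E_{i+1}A t iff s E_iA t and, for every letter a,
            # delta(s, a) E_iA delta(t, a)
            new = {s: (label[s],
                       tuple(label[A['delta'][(s, a)]]
                             for a in sorted(A['sigma'])))
                   for s in A['states']}
            if len(set(new.values())) == len(set(label.values())):
                return label                       # E_{i+1}A = E_iA = E_A
            label = new

The loop must terminate within ‖S‖ iterations, since each pass either splits some class or leaves the partition fixed.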
∇ Definition 3.10. Given a finite automaton A = <Σ, S, s₀, δ, F>, the ith partial state set Cᵢ is defined by the following rules: Let C₀ = {s₀} and recursively define
C_{i+1} = Cᵢ ∪ ⋃_{q ∈ Cᵢ, a ∈ Σ} δ(q, a)
C_{‖S‖} must equal Sᶜ (why?), and we will often arrive at the final answer long
before ‖S‖ iterations have been calculated (see the exercises and refer to the
treatment of E_{iA}). It can also be proved (by induction) that Cᵢ represents the set of
all states that can be reached from s₀ by strings of length i or less (see the exercises).
Recall that the definition of Sᶜ involved the extended state transition function
δ̄. Definition 3.10 instead uses the information found in the previous iteration to
avoid calculating paths for long strings. As suggested earlier, there is an even more
efficient method of calculating C_{i+1} from Cᵢ, since only paths from the newly added
states need be explored anew.
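A sketch of this improved computation, exploring only the newly added states on each pass, follows (in the same hypothetical encoding as before):

    def reachable(A):
        # S^C via the partial state sets of Definition 3.10:
        # C_0 = {s0}; C_{i+1} adds delta(q, a) for q in C_i, a in Sigma
        C = {A['start']}
        frontier = set(C)
        while frontier:
            # only transitions out of the newly added states matter
            frontier = {A['delta'][(q, a)]
                        for q in frontier for a in A['sigma']} - C
            C |= frontier
        return C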
EXAMPLE 3.15
and since s₅ leads to no new states, we have C₄ = C₃; as with E_{iA}, we will now find
C₃ = C₄ = C₅ = C₆ = ··· = Sᶜ. The exercises will develop the parallels between the
generation of the partial state sets Cᵢ and the generation of the partial state equivalence relations E_{iA}.
The procedure for recursively calculating successive Cᵢs to determine Sᶜ provides the final algorithm needed to efficiently find the minimal machine corresponding to a given automaton A. From A, we use the Cᵢs to calculate Sᶜ and thereby
define Aᶜ. Theorem 3.8 and related results suggest an efficient algorithm for computing E_{Aᶜ}, from which we can construct Aᶜ/E_{Aᶜ}. Aᶜ/E_{Aᶜ} is indeed the minimal machine equivalent to A, as shown by the results in this chapter. Theorems 3.3 and 3.6
show that Aᶜ/E_{Aᶜ} is equivalent to A. By Theorems 3.2, 3.4, and 3.5, this automaton is
reduced and connected, and Corollary 3.4 guarantees that Aᶜ/E_{Aᶜ} must therefore be
minimal.
The proof of Theorem 3.7 suggests building a minimal equivalent deterministic
finite automaton for A by first shrinking to a connected machine and then reducing
modulo the state equivalence relation, that is, by finding Aᶜ/E_{Aᶜ}. Theorem 3.5
assures us that when we reduce a connected machine it will still be connected. An
alternate strategy would be to first reduce modulo E_A and then shrink to a connected machine, that is, to find (A/E_A)ᶜ. In this case, we would want to make sure
that connecting a reduced machine will still leave us with a reduced machine. It can
be shown that if A is reduced then Aᶜ is reduced (see the exercises), and hence this
method could also be used to find the minimal equivalent DFA.
Finding the minimal equivalent DFA by reducing A first and then eliminating
the disconnected states is, however, less efficient than applying the algorithms in the
opposite order. Finding the connected set of states is simpler than finding the state
equivalence relation, so it is best to eliminate as many states as possible by finding
Sᶜ before embarking on the more complex search for the state equivalence relation.
It should be clear that the algorithms in this chapter are presented in sufficient
detail to easily allow them to be programmed. As suggested in Chapter 1, the final
states can be represented as a set and the transition function as a matrix. The
minimization procedures would then return the minimized matrix and new final
state set.
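Stringing the earlier sketches together gives one possible rendering of the whole pipeline (connect first, then reduce), again under our assumed dictionary encoding rather than the matrix representation the text mentions:

    def minimize(A):
        Ac = connect(A, reachable(A))              # the connected machine A^C
        label = state_equivalence(Ac)              # E_{A^C} as a state labeling
        def cls(s):                                # the block of s: all states
            return frozenset(t for t in Ac['states']   # sharing its label
                             if label[t] == label[s])
        return quotient(Ac, cls)                   # A^C / E_{A^C}

By the theorems cited above, the result is reduced, connected, and equivalent to A, and is therefore minimal.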
As a practical matter then, when generating an automaton to perform a given
task, our concern can be limited to defining a machine that works. No further
creative insight is then necessary to find the minimal machine. Once a machine that
recognizes the desired language is found (however inefficient it may be), the minimi-
zation algorithms can then be applied to produce a machine that is both correct and
efficient.
The proof that a reduced and connected machine is the most efficient was
based on the properties of the automaton A_{R_L} obtained from the right congruence
R_L. This can be proved without relying on the existence of A_{R_L}. We close this
chapter with an outline of such a proof. The details are similar to the proofs given in
Chapter 7 for finite-state transducers.
Theorem 3.3, which was not based in any way on R_L, implies that a minimal
DFA must be connected. Similarly, an immediate corollary of Theorem 3.6 is that a
minimal DFA must be reduced. Thus, a minimal machine is forced to be both
reduced and connected. We now must justify that a reduced and connected machine
is minimal. This result will follow from Corollary 3.3, which can also be proved
without relying on A_{R_L}. The implication (A ≅ B ⇒ A is equivalent to B) is due solely
3.4. Given a finite automaton A = <Σ, S, s₀, δ, F>, show that the function δ_{E_A} given in
Definition 3.8 is well defined.
3.5. Given a finite automaton A = <Σ, S, s₀, δ, F>, show that the set F_{E_A} given in Definition 3.8 is a well-defined set.
3.6. Show that the range of the function δᶜ given in Definition 3.7 is contained in Sᶜ.
3.7. Prove Lemma 3.1.
3.8. Prove Lemma 3.3.
3.9. Prove Theorem 3.9.
3.10. Given a homomorphism μ from the finite automaton A = <Σ, S_A, s_{0A}, δ_A, F_A> to the
DFA B = <Σ, S_B, s_{0B}, δ_B, F_B>, prove by induction that
(∀s ∈ S_A)(∀x ∈ Σ*)(μ(δ̄_A(s, x)) = δ̄_B(μ(s), x))
3.11. Given a homomorphism μ from the finite automaton A = <Σ, S_A, s_{0A}, δ_A, F_A> to the
DFA B = <Σ, S_B, s_{0B}, δ_B, F_B>, prove that L(A) = L(B). As long as it is explicitly cited,
the result of Exercise 3.10 may be used without proof.
3.12. (a) Give an example of a DFA for which A is not connected and A/E_A is not connected.
(b) Give an example of a DFA for which A is not connected but A/E_A is connected.
3.13. Given a finite automaton A = <Σ, S, s₀, δ, F> and the state equivalence relation E_A,
show there exists a homomorphism from A to A/E_A.
3.14. Given a connected finite automaton A = <Σ, S, s₀, δ, F>, show there exists a homomorphism from A to A_{R_{L(A)}} by:
(a) Define a mapping ψ from A to A_{R_{L(A)}}. (No justification need be given.)
(b) Prove that your ψ is well defined.
(c) Prove that ψ is a homomorphism.
3.15. Give an example to show that there may not exist a homomorphism from A to A_{R_{L(A)}} if
A is not connected (see Exercise 3.14).
3.16. Give an example to show that there may still exist a homomorphism from A to A_{R_{L(A)}}
even if A is not connected (see Exercise 3.14).
3.17. Give an example to show that, for the relations R and R_L given in Theorem 2.2, there
need not exist a homomorphism from A_{R_L} to A_R.
3.18. ≅ is an equivalence relation; in Chapter 2 we saw some relations were also right
congruences. Comment on the appropriateness of asking whether ≅ is a right congruence.
3.19. Is E_A a right congruence? Explain your answer.
3.20. Prove that if A is reduced then Aᶜ is reduced.
3.21. For a homomorphism μ between two finite automata A = <Σ, S_A, s_{0A}, δ_A, F_A> and
B = <Σ, S_B, s_{0B}, δ_B, F_B>, prove (∀s, t ∈ S_A)(μ(s) E_B μ(t) ⇔ s E_A t).
3.22. Let M be a DFA, and let L = L(M).
(a) Define a mapping ψ from Mᶜ to A(R_M). (No justification need be given.)
(b) Prove that your ψ is well defined.
(c) Prove that ψ is a homomorphism.
(d) Prove that ψ is a bijection.
(e) Argue that Mᶜ ≅ A(R_M).
3.23. For the machine A given in Figure 3.10a, find:
(a) E_A (list each E_{iA})
(b) L(A)
(c) A_{R_{L(A)}}
(d) R_{L(A)}
(e) A/E_A
3.24. For the machine B given in Figure 3.10b, find:
(a) E_B (list each E_{iB})
(b) L(B)
(c) A_{R_{L(B)}}
(d) R_{L(B)}
(e) B/E_B
Note that your answer to part (e) might contain some disconnected states.
3.25. For the machine C given in Figure 3.10c, find:
(a) E_C (list each E_{iC})
(b) L(C)
(c) A_{R_{L(C)}}
(d) R_{L(C)}
(e) C/E_C
Note that your answer to part (e) might contain some disconnected states.
3.26. For the machine D given in Figure 3.10d, find:
(a) E_D (list each E_{iD})
(b) L(D)
(c) A_{R_{L(D)}}
(d) R_{L(D)}
(e) D/E_D
Note that your answer to part (e) might contain some disconnected states.
Figure 3.10 (a) The DFA A discussed in Exercise 3.23 (b) The DFA B discussed in
Exercise 3.24 (c) The DFA C discussed in Exercise 3.25 (d) The DFA D discussed in
Exercise 3.26
3.27. Without relying on A_{R_L}, prove that if A and B are both reduced and connected
equivalent DFAs then A ≅ B. Give the details for the following steps:
(a) Define an appropriate function ψ between the states of A and the states of B.
(b) Show that ψ is well defined.
(c) Show that ψ is a homomorphism.
(d) Show that ψ is a bijection.
3.28. In the proof of (6) in Theorem 3.1, one transition only involved ⇒ rather than ⇔.
Show by means of an example that the two expressions involved in this transition are
not equivalent.
3.29. Supply reasons for each of the equivalences in the proof of Theorem 3.8.
3.30. Minimize the machine defined in Figure 3.3.
3.31. (a) Give an example of a DFA for which A is not reduced and Aᶜ is not reduced.
(b) Give an example of a DFA for which A is not reduced and Aᶜ is reduced.
3.32. Note that ≅ relates some automata to other automata, and therefore ≅ is a relation
over the set of all deterministic finite automata.
(a) For automata A, B, and C, show that if g is an isomorphism from A to B and f is an
isomorphism from B to C, then f ∘ g is an isomorphism from A to C.
(b) Prove that ≅ is a symmetric relation; that is, formally justify that if there is an
isomorphism from A to B then there is an isomorphism from B to A.
(c) Prove that ≅ is a reflexive relation.
(d) From the results in parts (a), (b), and (c), prove that ≅ is an equivalence relation
over the set of all deterministic finite automata.
3.33. Show that homomorphism is not an equivalence relation over the set of all deterministic finite automata.
3.34. For the relations R and R_L given in Theorem 2.2, show that there exists a homomorphism from A_R to A_{R_L}.
3.35. Prove that if there is a homomorphism from A to B then R_A refines R_B.
3.36. Prove that if A is isomorphic to B then R_A = R_B
(a) By appealing to Exercise 3.35.
(b) Without appealing to Exercise 3.35.
3.37. Consider two deterministic finite automata for which A is not homomorphic to B, but
R_A = R_B.
(a) Give an example of such automata for which L(A) = L(B).
(b) Give an example of such automata for which L(A) ≠ L(B).
(c) Can such examples be found if both A and B are connected and L(A) = L(B)?
(d) Can such examples be found if both A and B are reduced and L(A) = L(B)?
3.38. Disprove that if A is homomorphic to B then R_A = R_B.
3.39. Prove or give a counterexample [assume L = L(M)].
(a) For any DFA M, there exists a homomorphism ψ from A(R_M) to M.
(b) For any DFA M, there exists an isomorphism ψ from A(R_M) to M.
(c) For any DFA M, there exists a homomorphism ψ from M to A(R_M).
3.40. Prove that if A is a minimal DFA then R_A = R_{L(A)}.
3.41. Give an example to show that Exercise 3.40 can be false if A is not minimal.
3.42. Give an example to show that Exercise 3.40 may still hold if A is not minimal.
3.43. Definition 3.8 takes an equivalence relation on the set of states S and defines a machine
based on that relation. In general, we could choose a relation R on S and define a
machine A/R (as we did when we defined A/E_A when the relation R was E_A).
(a) Consider R = E_{0A}. Is A/E_{0A} always well defined? Give an example to illustrate your
answer.
(b) Assume R is a refinement of E_A. Is A/R always well defined? For the cases where it
is well defined, consider the theorems that would correspond to Theorems 3.4, 3.5,
and 3.6 if E_A were replaced by such a refinement R. Which of these theorems
would still be true?
3.44. Given a DFA M, prove or give a counterexample.
(a) There exists a homomorphism from M/E_M to A_{R_{L(M)}}.
(b) There exists a homomorphism from A_{R_{L(M)}} to M/E_M.
3.45. Prove that the bound given for Theorem 3.11 can be sharpened: given a finite automaton A = <Σ, S, s₀, δ, F>, (∃m ∈ ℕ ∋ m < ‖S‖ ∧ E_{mA} = E_{m+1A}).
3.46. Prove or give a counterexample:
(a) If A and B are equivalent, then A and B are isomorphic.
(b) If A and B are isomorphic, then A and B are equivalent.
3.47. Given a finite automaton A = <Σ, S, s₀, δ, F>, prove that the Cᵢs given in Definition
3.10 are nested: (∀i ∈ ℕ)(Cᵢ ⊆ C_{i+1}).
3.48. Prove (by induction) that Cᵢ does indeed represent the set of all states that can be
reached from s₀ by strings of length i or less.
3.49. Prove that, given a finite automaton A = <Σ, S, s₀, δ, F>,
(∃i ∈ ℕ ∋ Cᵢ = C_{i+1}) ⇒ (∀k ∈ ℕ)(Cᵢ = C_{i+k}).
3.50. Prove that, given a DFA A = <Σ, S, s₀, δ, F>, (∃i ∈ ℕ ∋ Cᵢ = C_{i+1}) ⇒ (Cᵢ = Sᶜ).
3.51. Prove that, given a finite automaton A = <Σ, S, s₀, δ, F>, ∃i ∈ ℕ ∋ Cᵢ = C_{i+1}.
3.52. Prove that, given a DFA A = <Σ, S, s₀, δ, F>, (∃i ∈ ℕ ∋ i ≤ ‖S‖ ∧ Cᵢ = Sᶜ).
3.53. Use the results of Exercises 3.47 through 3.52 to argue that the procedure for
generating Sᶜ from successive calculations of Cᵢ is correct and is actually an algorithm.
3.54. Give an example of two DFAs A and B that simultaneously satisfy the following three
criteria:
1. There is a homomorphism from A to B.
2. There is a homomorphism from B to A.
3. There does not exist any isomorphism between A and B.
3.55. Assume R and Q are both right congruences of finite rank, R refines Q, and L is a
union of equivalence classes of Q.
(a) Show that L is also a union of equivalence classes of R.
(b) Show that there exists a homomorphism from A_R to A_Q. (Hint: Do not use the μ
given in Theorem 3.1; there is a far more straightforward way to define a mapping.)
(c) Give an example to show that there need not be a homomorphism from A_Q to A_R.
3.56. Prove that A_Q must be connected.
3.57. Prove that if there is an isomorphism from A to B and A is connected then B must also
be connected.
3.58. Prove that if there is an isomorphism from A to B and B is connected then A must also
be connected.
3.59. Disprove that if there is a homomorphism from A to B and A is connected then B must
also be connected.
3.60. Disprove that if there is a homomorphism from A to B and B is connected then A must
also be connected.
3.61. Given a DFA A, recall the relation R_A on Σ* induced by A. This relation gives rise to
another DFA A(R_A) [with Q = R_A and L = L(A)]. Consider also the connected version
of A, Aᶜ.
(a) Define an isomorphism ψ from A(R_A) to Aᶜ. (No justification need be given.)
NONDETERMINISTIC FINITE
AUTOMATA
Whereas deterministic finite automata are restricted to having exactly one transition from a state for each a ∈ Σ, a nondeterministic finite automaton may have any
number of transitions for a given input symbol, including zero transitions.
When processing an input string, if an NDFA comes to a state from which
there is no transition arc labeled with the next input symbol, the path through the
machine which is being followed is terminated. Termination can take the place of
the "garbage state" (a permanent rejection state) found in many deterministic finite
automata, which is used to reject some strings that are not in the language recog-
nized by the automaton (the state s₇ played this role in Examples 1.11 and 1.9).
EXAMPLE 4.1
Let L = {w ∈ {a, b, c}* | ∃y ∈ {b}* ∋ w = ayc}. We can easily build a nondeterministic
finite automaton that accepts this set of words. One such automaton is displayed in
Figure 4.1. In this example there are no transitions out of s₀ labeled with either b or
c, nor are there any transitions from s₁ labeled with a. From state s₂ there are no
transitions at all. This means that if either b or c is encountered in state s₀, or a is
encountered in state s₁, or any input letter is encountered once we reach state s₂,
the word on the input tape will not be able to follow this particular path through the
machine. Thus, if a word is not fully processed by the NDFA, it will not be
considered accepted (even if the state in which it was prematurely "stuck" was a
final state).
EXAMPLE 4.2
In the NDFA given in Figure 4.3, there are multiple transitions from state s₀:
processing the symbol 0 causes the machine to enter states s₁ and s₂, whereas
processing a 1 causes the machine to enter both state s₃ and state s₂.
EXAMPLE 4.3
We can build a machine that will accept the same language as in Example 4.2, but in
a slightly different way. Note that in Figure 4.4 the multiplicity of start states
simplifies the construction considerably.
i. Σ is an alphabet.
ii. S is a finite nonempty set of states.
iii. S₀ is a set of initial states, a nonempty subset of S.
iv. δ: S × Σ → ℘(S) is the state transition function.
v. F is the set of accepting states, a (possibly empty) subset of S.
∆
The input alphabet, the state space, and even the set of final states are the
same as for deterministic finite automata. The important differences are contained
in the definitions of the initial states and of the δ function.
The set of initial states can be any nonempty subset of the state space. These
can be viewed as multiple entry points into the machine, with each start state
beginning distinct, although not necessarily disjoint, paths through the machine.
The δ function for nondeterministic finite automata differs from the δ function
of deterministic machines in that it maps a single state and a letter to a set of states.
In some texts, one will find δ defined as simply a relation with range S and not as a
function; without any loss of generality we define δ as a function with range ℘(S),
which makes the formal proofs of relevant theorems considerably easier.
EXAMPLE 4.4
δ       a       b
r       {s}     ∅
s       ∅       {r, t}
t       ∅       {t}
We will see later that this machine accepts strings that begin with alternating a's and
b's and end with one or more consecutive b's.
∇ Definition 4.2. Given an NDFA A = <Σ, S, S₀, δ, F>, the extended state transition function for A is the function δ̄: S × Σ* → ℘(S) defined recursively as follows:
(∀s ∈ S) δ̄(s, Λ) = {s}
(∀s ∈ S)(∀a ∈ Σ)(∀x ∈ Σ*) δ̄(s, xa) = ⋃_{q ∈ δ̄(s, x)} δ(q, a)
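Definition 4.2 translates almost verbatim into code. In the sketch below (our own encoding: delta is a dictionary mapping (state, letter) pairs to sets of states, with missing pairs standing for the empty set), strings are Python strings and Λ is the empty string:

    def delta_bar(delta, s, x):
        # delta-bar(s, lambda) = {s}
        if x == "":
            return {s}
        # delta-bar(s, xa) = union of delta(q, a) over q in delta-bar(s, x)
        prefix, a = x[:-1], x[-1]
        return {t for q in delta_bar(delta, s, prefix)
                  for t in delta.get((q, a), set())}

With the table of Example 4.4 encoded as delta = {('r', 'a'): {'s'}, ('s', 'b'): {'r', 't'}, ('t', 'b'): {'t'}}, the call delta_bar(delta, 's', 'bb') returns {'t'}, matching the calculation in Example 4.5 below.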
EXAMPLE 4.5
Consider again the NDFA displayed in Example 4.4. To find all the places a string
such as bb can reach from s, we would first determine what can be reached by the
first b. The reachable states are r and t, since δ̄(s, b) = {r, t}. From these states, we
would then determine what could be reached by the second b (from r, no progress is
possible, but from t, we can again reach t). These calculations are reflected in the
recursive definition of δ̄:
δ̄(s, bb) = ⋃_{q ∈ δ̄(s, b)} δ(q, b) = ⋃_{q ∈ {r, t}} δ(q, b) = δ(r, b) ∪ δ(t, b) = ∅ ∪ {t} = {t}
Because of the multiplicity of initial states and because the δ function is now
set valued, it is possible for a nondeterministic finite automaton to be active in more
than a single state at one time. Whereas in all deterministic finite automata there is
a unique path through the machine labeled with components of w for each w ∈ Σ*,
this is not necessarily the case for nondeterministic finite automata. At any point in
the processing of a string, the δ function maps the input symbol and the current state
to a set of states. This implies that multiple paths through the machine are possible
or that the machine can get "stuck" and be unable to process the remainder of the
string if there is no transition from a state labeled with the appropriate letter. There
is no more than one path for each word if there is exactly one start state and the δ
function always maps to a singleton set (or ∅). If we were to further require that the
δ function have a defined transition to another state for every input symbol, then the
machine that we have would essentially be a deterministic finite automaton. Thus,
all deterministic finite automata are simply a special class of nondeterministic finite
automata; with some trivial changes in notation, any DFA can be thought of as an
NDFA. Indeed, the state transition diagram of a DFA could be a picture of a
well-behaved NDFA. Therefore, any language accepted by a DFA can be accepted
by an NDFA.
EXAMPLE 4.6
Consider the machine given in Example 4.4 and let x = b; the possible paths
through the machine include (1) starting at s and proceeding to r, and (2) starting at
s and proceeding to t. Note that it is not possible to start from t (since t ∉ S₀), and
there is no way to proceed with x = b by starting at r, the other start state.
Now let x = ba and consider the possibilities. The only path through the
machine requires that we start at s, proceed to r, and return to s; starting at s and
proceeding to t leaves no way to process the second letter of x. Starting from r is
again hopeless (what types of strings are good candidates for starting at r?).
Now let x = bab; the possible paths through the machine include (1) starting at
s, proceeding to r, returning to s, and then moving again to r, and (2) starting at s,
proceeding to r, returning to s, and then proceeding to t. Note that starting at s and
moving immediately to t again leaves us with no way to process the remainder of the
string. Both b and bab included paths that terminated at the final state t (among
other places). These strings will be said to be recognized by this NDFA (compare
with Definition 4.3). ba had no path that led to a final state, and as a consequence
we will consider ba to be rejected by this machine.
There are two common ways to visualize the action of a nondeterministic finite automaton:
1. At each state where a multiple transition occurs, the machine replicates into
identical copies of itself, with each copy following one of the possible paths.
2. Multiple states of the machine are allowed to be active, and each of the active
states reacts to each input letter.
It happens that the second viewpoint is the most useful for our purposes. From
a theoretical point of view, we use this as the basis for proving the equivalence of
deterministic and nondeterministic finite automata. It is also a useful model upon
which to base the circuits that implement NDFAs.
The concept of a language for nondeterministic finite automata is different
from that for deterministic machines. Recall that the requirement for a word to be
contained in the language accepted by a deterministic finite automaton was that the
processing of a string would terminate in a final state. This is also the condition for
belonging to the language accepted by a nondeterministic finite automaton; how-
ever, since the path through a nondeterministic finite automaton is not necessarily
unique, only one of the many possible paths need terminate in a final state for the
string to be accepted.
Again conforming with our previous usage, a word that is not accepted is
rejected. The use of the symbol L will be consistent with its usage in previous
chapters, although it does have a different formal definition. As before, L(A) is
used to designate all those strings that are accepted by a finite automaton A. Since
the concept of acceptance must be modified for nondeterministic finite automata,
the formal definition of L is necessarily different (contrast Definitions 4.3 and 1.12).
V Definition 4.4. Given an NDFA A = <Σ, S, S₀, δ, F>, the language accepted
by A, denoted L(A), is

    L(A) = {x ∈ Σ* | (⋃_{q ∈ S₀} δ̂(q, x)) ∩ F ≠ ∅}

Δ
EXAMPLE 4.7
For a more concrete example, consider the problem of a ship attempting to transmit
data to shore at random intervals. The receiver must continually listen, usually to
noise, and recognize when an actual transmission starts so that it can record the data
that follow. Let us assume that the start of a transmission is signaled by the string
010010 (in practice, such a signal string should be much longer to minimize the
possibility of random noise triggering the recording mechanism). In essence, we
wish to build an NDFA that will monitor a bit stream and move to a final state when
the substring 010010 is detected (note that nonfinal states correspond to having the
recording mechanism off, and final states signify that the current data should be
recorded). The reader is encouraged to discover firsthand how hard it is to build a
DFA that correctly implements this machine and contrast that solution to the
NDFA T given in Figure 4.6.
Since the transitions leading to higher states are labeled by the symbols in
010010, it is clear that the last state cannot be reached unless the sequence 010010 is
actually scanned at some point during the processing of the input string. Thus, the
NDFA clearly accepts no word that should be rejected. Conversely, since all possi-
ble legal paths are explored by an NDFA, valid strings will find a way to the final
state. It is sometimes helpful to think of the NDFA as remaining in s₀ while the
initial part of the input string is being processed and then "guessing" when it is the
right time to move to s₁.
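The machine T is short enough to encode directly. A sketch (state names 0 through 6 are our own assumption; it reuses nfa_accepts from the earlier sketch):

    # State 0 loops on both symbols and "guesses" when 010010 begins;
    # state 6 (recording on) is final and loops on both symbols.
    signal = "010010"
    delta = {(0, '0'): {0, 1}, (0, '1'): {0}}    # stay in 0, or start matching
    for i, c in enumerate(signal[1:], start=1):  # chain states 1 -> 2 -> ... -> 6
        delta[(i, c)] = {i + 1}
    delta[(6, '0')] = {6}
    delta[(6, '1')] = {6}
    print(nfa_accepts({0}, {6}, delta, "011010"))    # False: no 010010 occurs
    print(nfa_accepts({0}, {6}, delta, "11010010"))  # True: ...010010 scanned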
It is also possible to model an end-of-transmission signal that turns the recording
device off (see the exercises). The device would remain in various final states
until a valid end-of-transmission string was scanned, at which point it would return
to the (nonfinal) start state.
While the NDFA given in Example 4.7 is very straightforward, it appears to be
hard to simulate this nondeterminism in real time with a deterministic computer. It
has not been difficult to keep track of the multiple paths in the simple machines seen
so far. However, if each state has multiple transitions for a given symbol, the
number of distinct paths a single word can take through an NDFA grows exponen-
tially as the length of the word increases. For example, if each transition allowed a
choice of three destination states, a word of length m would have 3^m possible paths
from one single start state. An improvement can be made by calculating, as each
letter is processed, the set of possible destinations (rather than recording all the
paths). Still, in an n-state NDFA, there are potentially 2^n such combinations of
states. This represents an improvement over the path set, since now the number of
state combinations is independent of the length of the particular word being
processed; it depends only on the number of states in the NDFA, which is fixed. We
will see that keeping track of the set of possible destination states is indeed the best
way to handle an NDFA in a deterministic manner.
Since we have seen in Chapter 1 that it is easy to implement a DFA, we now
explore methods to convert an NDFA to an equivalent DFA. Suppose that we are
given a nondeterministic finite automaton A and that we want to construct a corre-
sponding deterministic finite automaton Ad. Using the concepts in Definitions 4.1
through 4.4, we can proceed in the following fashion. Our general strategy will be to
keep track of all the states that can be reached by some string in the nondeter-
ministic finite automaton. Since we can arbitrarily label the states of an automaton,
we let the state space of Ad be the power set of S. Thus, Sd = P(S), and each state in
the new machine will be labeled by some subset of S. Furthermore, let the start state
of Ad, denoted s₀d, be labeled by the member of P(S) containing those states that are
initial states in A; that is, s₀d = S₀.
Since our general strategy is to "remember" all the states that can be reached
for some string, we can define the δ function in the following natural manner: For
every letter in Σ, let the new state transition function, δd, map to the subset of P(S)
labeled by the union of all those states that are reached from some state contained
in the current state name (according to the old state transition function δ).
According to Definition 4.4, for a word to be contained in the language
accepted by some nondeterministic finite automaton, at least one of the terminal
states was required to be contained in the set of final states. Thus, let the set of final
states in the corresponding deterministic finite automaton be labeled by the subsets
of S that contain at least one of the accepting states in the nondeterministic counter-
part. The formal definition of our corresponding deterministic finite automaton is
given in Definition 4.5.
V Definition 4.5. Given an NDFA A = <Σ, S, S₀, δ, F>, the corresponding de-
terministic finite automaton, Ad = <Σ, Sd, s₀d, δd, Fd>, is defined as follows:

    Sd = P(S)
    s₀d = S₀
    Fd = {Q ∈ Sd | Q ∩ F ≠ ∅}

and δd is the state transition function, δd: Sd × Σ → Sd, defined by

    (∀Q ∈ Sd)(∀a ∈ Σ)  δd(Q, a) = ⋃_{q ∈ Q} δ(q, a)

δd extends to the function δ̂d: Sd × Σ* → Sd as suggested by Theorem 1.1.
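Definition 4.5 can be rendered directly in code. The sketch below is ours (illustrative names; the DFA states are frozensets of NDFA states):

    from itertools import combinations

    # Sketch of Definition 4.5: the full subset construction.
    def subset_dfa(states, alphabet, start_states, final_states, delta):
        power = [frozenset(c) for r in range(len(states) + 1)
                 for c in combinations(sorted(states), r)]        # Sd = P(S)
        d_delta = {(Q, a): frozenset().union(*(delta.get((q, a), set())
                                               for q in Q))
                   for Q in power for a in alphabet}              # union rule
        d_start = frozenset(start_states)                         # s0d = S0
        d_finals = {Q for Q in power if Q & set(final_states)}    # Fd
        return power, d_start, d_delta, d_finals

Note that the machine built here always has 2^|S| states; a later passage in this section shows why generating only the reachable subsets is usually preferable.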
EXAMPLE 4.8
Consider the NDFA B given in Figure 4.7. As specified by Definition 4.5, the
corresponding DFA Bd would look like the machine shown in Figure 4.8. Note that
all the states happen to be accessible in this particular example.
Figure 4.7 The NDFA B discussed in Example 4.8
Figure 4.8 The deterministic equivalent of the NDFA given in Example 4.8
Inductive step: Suppose that the result holds for all x with |x| = k; that is, P(k) is true.
Let y ∈ Σ^(k+1). Then ∃x ∈ Σ^k and ∃a ∈ Σ such that y = xa. Then

    δ̂d(Q, y) = δ̂d(Q, xa)                          (by definition of y)
             = δd(δ̂d(Q, x), a)                     (by Theorem 1.1)
             = δd(⋃_{q ∈ Q} δ̂(q, x), a)            (by the induction hypothesis)
             = ⋃_{q ∈ Q} δd(δ̂(q, x), a)            (since (∀A, B ∈ P(S))(∀a ∈ Σ)(δd(A ∪ B, a) = δd(A, a) ∪ δd(B, a)))
             = ⋃_{q ∈ Q} (⋃_{p ∈ δ̂(q, x)} δ(p, a)) (by Definition 4.5)
             = ⋃_{q ∈ Q} δ̂(q, xa)                  (by Definition 4.2)
             = ⋃_{q ∈ Q} δ̂(q, y)                   (by definition of y)

Therefore, P(k) ⇒ P(k + 1) for all k ≥ 0, and thus by the principle of mathematical
induction we can say that the result holds for all x ∈ Σ*.
Δ
Now that we have established that nondeterministic finite automata and deter-
ministic finite automata are equal in computing power, the reader might wonder
why we bother with nondeterministic finite automata. Even though nondetermin-
istic finite automata cannot recognize any language that cannot be defined by a
DFA, they are very useful both in theory and in machine construction (as illustrated
by Example 4.7). The following examples further illustrate that NDFAs often yield
more natural (and less complex) solutions to a given problem.
EXAMPLE 4.9
Recall the machine from Chapter 1 that accepted a subset of real constants in
scientific notation according to the following BNF:

    <sign>          ::= + | -
    <digit>         ::= 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9
    <natural>       ::= <digit> | <digit><natural>
    <integer>       ::= <natural> | <sign><natural>
    <real constant> ::= <integer> | <integer>. | <integer>.<natural>
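As a quick sanity check, the language generated by this BNF (as reconstructed above) matches a simple regular expression; the following sketch is ours:

    import re
    # optional sign, then a natural, then optionally a point with optional digits
    real_constant = re.compile(r'^[+-]?[0-9]+(\.[0-9]*)?$')
    for s in ['42', '-7.', '+3.14', '.5', '2.7.1']:
        print(s, bool(real_constant.match(s)))   # .5 and 2.7.1 are rejected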
EXAMPLE 4.10
It can be shown that the technique employed in Example 4.10, when applied
to any automaton, will yield a new NDFA that is guaranteed to accept the reverse of
the original language. The material in Chapter 5 will reveal many instances where
the ability to define multiple start states and multiple transitions will be of great
value.
Figure 4.11 An NDFA representing the reverse of the language represented in Figure 4.10
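The reversal technique is mechanical enough to automate: reverse every arrow and exchange the roles of start and final states. A sketch (our encoding, not the text's figures):

    # Sketch: an NDFA for the reverse of the language of the given NDFA.
    def reverse_nfa(start_states, final_states, delta):
        rev = {}
        for (q, a), targets in delta.items():
            for t in targets:
                rev.setdefault((t, a), set()).add(q)  # reverse each transition
        # old final states become start states, and conversely
        return set(final_states), set(start_states), rev

Since a DFA may have several final states, the reversed machine generally needs several start states; this is exactly the freedom NDFAs provide.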
EXAMPLE 4.11
Assume we wish to identify all words that contain at least one of the three strings
10110, 1010, or 01101 as substrings. Consequently, we let L be the set of all words
that are made up of some characters, followed by one of our three target strings,
followed by some other characters. That is,
L = {w ∈ {0, 1}* | w = xyz, x ∈ {0, 1}*, y ∈ {10110, 1010, 01101}, z ∈ {0, 1}*}
We can construct a nondeterministic finite automaton that will accept this language
as follows. First construct three machines each of which will accept one of the
candidates for y. Next, prepend a single state (s₀ in Figure 4.12) that loops on Σ*;
make this state an initial state and draw arrows from it which mimic the transitions
from each of the other three initial states (as shown in Figure 4.12). Finally, append
a single-state machine (s₁₈) that accepts Σ*; draw arrows from each of the final states
to this state. The machine that accepts this language is given in Figure 4.12.
EXAMPLE 4.12
Recall the application in Chapter 1 involving string searching (Example 1.15). The
construction of DFAs involved much thought, but there is an NDFA that solves the
problem in an obvious and straightforward manner. For example, an automaton
that recognizes all strings over the alphabet {a, b} containing the substring aab might
look like the NDFA in Figure 4.13.
Figure 4.13 An NDFA that recognizes all strings over {a, b} containing the substring aab
As is the case for this NDFA, it may be impossible for certain sets of states to
all be active at once. These combinations can never be achieved during the normal
operation of the NDFA. The DFA states corresponding to these combinations will
not be in the connected part of Ad. Applying Definition 4.5 to find the entire
deterministic version and then pruning it down to just the relevant portion is very
inefficient. A better solution is to begin at the start state and "follow transitions" to
new states until no further new states are uncovered. At this point, the relevant
states and their transitions will have all been defined; the remainder of the machine
can be safely ignored. For the NDFA in Figure 4.13, the connected portion of the
equivalent DFA is shown in Figure 4.14. This automaton is still not reduced; the last
three states are all equivalent and can be coalesced to form the minimal machine
given in Figure 4.15.

Figure 4.14 The connected portion of the DFA equivalent to the NDFA given in Example 4.12
The above process can be easily automated; an interesting but frustrating
exercise might involve producing an appropriate set of rules for generating, given a
specific string y, a DFA that will recognize all strings containing the substring y.
Definition 4.5 can be used to generate the appropriate DFA from the obvious
NDFA without subjecting the designer to such frustrations!
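The "follow transitions until no further new states are uncovered" procedure is a standard worklist algorithm. A sketch (names are ours):

    # Sketch: lazy subset construction; only reachable subsets are generated.
    def connected_subset_dfa(alphabet, start_states, delta):
        start = frozenset(start_states)
        trans, seen, frontier = {}, {start}, [start]
        while frontier:
            Q = frontier.pop()
            for a in alphabet:
                R = frozenset().union(*(delta.get((q, a), set()) for q in Q))
                trans[(Q, a)] = R
                if R not in seen:          # a genuinely new combination
                    seen.add(R)
                    frontier.append(R)
        return seen, start, trans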
As mentioned earlier, the presence of multiple paths within an NDFA for a single
word characterizes the nondeterministic nature of these automata. The most profit-
able way to view the operation of an NDFA is to consider the automaton as having
(potentially) several active states, with each of the active states reacting to the next
letter to determine a new set of active states. In fact, by using one D flip-flop per
state, this viewpoint can be directly translated into hardware. When a given state is
active, the corresponding flip-flop will be on, and when it is inactive (that is, it
cannot be reached by the substring that has been processed at this point), it will be
off. As a new letter is processed, a state will be activated (that is, be placed in the
new set of active states) if it can be reached from one of the previously active states.
Thus, the state transition function will again determine the circuitry that feeds into
each flip-flop.
Following the same conventions given for DFAs, the input tape will be as-
sumed to be bounded by special start-of-string <SOS> and end-of-string <EOS>
symbols. The <EOS> character is again used to activate the accept circuitry so that
acceptance is not indicated until all letters on the tape have been processed. As
before, the <SOS> symbol can be employed at the beginning of the string to ensure
that the circuitry begins processing the string from the appropriate start state(s).
Alternately, SR (set-reset) flip-flops can be used to initialize the configuration
without relying on the <SOS> conventions.
EXAMPLE 4.13
Consider the NDFA D given in Figure 4.16. With the <SOS> and <EOS>
transitions illustrated, the complete model would appear as in Figure 4.17.
Two bits of input data (a₁ and a₂) are required to represent the symbols in Σ
together with the <SOS> and <EOS> markers.
Figure 4.16 The NDFA discussed in Example 4.13
Figure 4.17 The expanded state transition diagram for the NDFA in Figure 4.16
TABLE 4.1

    t₁  t₂  a₁  a₂  t₁′  t₂′  accept
    0   0   0   0   0    0    0
    0   0   0   1   0    0    0
    0   0   1   0   0    0    0
    0   0   1   1   1    1    0
    0   1   0   0   0    1    1
    0   1   0   1   1    0    0
    0   1   1   0   1    0    0
    0   1   1   1   1    1    0
    1   0   0   0   1    0    0
    1   0   0   1   1    1    0
    1   0   1   0   0    0    0
    1   0   1   1   1    1    0
    1   1   0   0   1    1    1
    1   1   0   1   1    1    0
    1   1   1   0   1    0    0
    1   1   1   1   1    1    0
The first four rows of Table 4.1 reflect the situation in which a string is
hopelessly stuck, and no states are active. Processing subsequent symbols from Σ
will not change this; both t₁ and t₂ remain 0. The one exception is when the <SOS>
symbol is scanned; in this case, each of the start states is activated (t₁ = 1 and t₂ = 1).
This corrects the situation in which both flip-flops happen to initialize to 0 when
power is first applied to the circuitry. Scanning the <SOS> symbol changes the
state of the flip-flops to reflect the appropriate starting conditions (in this machine,
both states are start states, and therefore both should be active as processing is
begun). Note that each of the rows of Table 4.1 that correspond to scanning <SOS>
show that t₁ and t₂ are reset in the same fashion.
Determining the circuit behavior for the symbols in Σ closely parallels the
definition of δd in Definition 4.5. For example, when state s₁ is active but s₂ is
inactive (t₁ = 1 and t₂ = 0) and a is scanned (a₁ = 0 and a₂ = 1), transitions from s₁
cause both states to next be active (t₁′ = 1 and t₂′ = 1). The other combinations are
calculated similarly. Minimized expressions for the new values of each of the flip-
flops and the accept circuitry are

    t₁′ = (t₁ ∧ ¬a₁) ∨ (t₂ ∧ a₁) ∨ (t₂ ∧ a₂) ∨ (a₁ ∧ a₂)
    t₂′ = (t₂ ∧ ¬a₁ ∧ ¬a₂) ∨ (t₁ ∧ a₂) ∨ (a₁ ∧ a₂)
    accept = (t₂ ∧ ¬a₁ ∧ ¬a₂)
Since similar terms appear in these expressions, these three subcircuits can "share"
the common components, as shown in Figure 4.18 .
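The minimized expressions (as reconstructed above from Table 4.1) translate directly into a next-state function; this sketch is ours:

    # Sketch: one clock cycle of the circuit for the NDFA D.
    def step(t1, t2, a1, a2):
        t1n = (t1 and not a1) or (t2 and a1) or (t2 and a2) or (a1 and a2)
        t2n = (t2 and not a1 and not a2) or (t1 and a2) or (a1 and a2)
        accept = t2 and not a1 and not a2    # a1 = a2 = 0 encodes <EOS>
        return t1n, t2n, accept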
Note that the accept circuitry reflects that a string should be recognized when
some final state is active (S2 in this example) and <EOS> is scanned. In more
complex machines with several final states, lines leading from each of the flip-flops
corresponding to final states would be joined by an OR gate before being ANDed
with the <EOS> condition.
An interesting exercise involves converting the NDFA D given in Example 4.13
to the equivalent DFA Dd, which will have four states: ∅, {s₁}, {s₂}, and {s₁, s₂}.

V Definition 4.6. A nondeterministic finite automaton with λ-transitions is a
quintuple Aλ = <Σ, S, S₀, δλ, F>, where
i. Σ is an alphabet.
ii. S is a finite nonempty set of states.
iii. S₀ is a set of initial states, a nonempty subset of S.
iv. δλ: (S × (Σ ∪ {λ})) → P(S) is the state transition function.
v. F is the set of accepting states, a (possibly empty) subset of S.
Δ

A λ-transition is interpreted with the convention that the machine is capable of making a spontaneous transition to the new
state specified by that λ-transition without processing an input symbol. However,
the machine may also "choose" not to follow this path and instead remain in the
original state. Before we can extend the δλ function to operate on strings from Σ*,
we need the very useful concept of lambda-closure.
The λ-closure of a state is the set of all the states that can be reached from that
state, including itself, by following λ-transitions only. Obviously, one can always
reach the state currently occupied without having to move. Consequently, even if
there are no explicit arcs labeled by λ going back to state t, t is always in the
λ-closure of itself.
EXAMPLE 4.14
Consider the machine given in Figure 4.19, which contains λ-transitions from s₀ to s₁
and from s₁ to s₂. By Definition 4.7,

    Λ(s₀) = {s₀, s₁, s₂}
    Λ(s₁) = {s₁, s₂}
    Λ(s₂) = {s₂}
    Λ(s₃) = {s₃}
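λ-closure is simply reachability along λ-arcs, as the following sketch computes (the lam dictionary, mapping a state to its λ-targets, is our own representation):

    # Sketch: the λ-closure of a state, following λ-arcs only.
    def lambda_closure(state, lam):
        closure, stack = {state}, [state]   # a state always closes over itself
        while stack:
            for t in lam.get(stack.pop(), set()):
                if t not in closure:
                    closure.add(t)
                    stack.append(t)
        return closure

    lam = {'s0': {'s1'}, 's1': {'s2'}}      # the λ-arcs of Figure 4.19
    print(lambda_closure('s0', lam))        # {'s0', 's1', 's2'}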
The δλ function is not extended in the same way as for the nondeterministic
finite automata given in Definition 4.2. Most importantly, due to the effects of the
λ-closure, δ̂λ(s, a) ≠ δλ(s, a). Thus, not only does the δ̂λ function map to a set of
states based on a single letter, but it also includes the λ-closure of those states. This
may seem strange for single letters (strings of length 1), but it is required for
consistency when the δ̂λ function is presented with strings of length greater than 1,
since at each state along the path there can be λ-transitions. Each λ-transition maps
to a new state (which may have λ-transitions of its own) that must be included in this
path and processed by the δ̂λ function.
The nondeterministic finite automaton without λ-transitions that corresponds
to a nondeterministic finite automaton with λ-transitions is given in Definition 4.9.
To account for the case that λ might be in the language accepted by the
automaton Aλ, we add an extra start state q₀ to the corresponding machine A′λ,
which is disconnected from the rest of the machine. If λ ∈ L(Aλ), we also make q₀ a
final state.
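One common way to carry out this construction in code is sketched below (our variant of Definition 4.9; it reuses lambda_closure from the earlier sketch, and q0 is an assumed fresh state name):

    # Sketch: remove λ-transitions.  Each letter move becomes
    # "close, step, close"; q0 is a disconnected start state that is
    # final exactly when the original machine accepts the empty word.
    def eliminate_lambda(states, alphabet, starts, finals, delta, lam):
        def close(S):
            return set().union(*[lambda_closure(q, lam) for q in S])
        new_delta = {(s, a): close(set().union(*[delta.get((q, a), set())
                                                 for q in close({s})]))
                     for s in states for a in alphabet}
        q0 = 'q0'                           # assumed not already a state name
        new_finals = set(finals) | ({q0} if close(set(starts)) & set(finals)
                                    else set())
        return set(states) | {q0}, set(starts) | {q0}, new_finals, new_delta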
EXAMPLE 4.15
Let Aλ represent the NDFA given in Example 4.14. A′λ would then be given by the
NDFA shown in Figure 4.20. This new NDFA does indeed accept the same lan-
guage as Aλ. To show in general that L(Aλ) = L(A′λ), we must first show that the
respective extended state transition functions behave in similar fashions. However,
these two functions can be equivalent only for strings of nonzero length (because of
the effects of the λ-closure in the definition of δ̂λ). This result is established in
Lemma 4.2.
Once we have shown that the extended state transition functions behave
(almost) identically, we can proceed to show that the languages accepted by these
two machines are the same.
Let A′λ be the NDFA without λ-transitions constructed from Aλ as in Definition 4.9.
We will show L(Aλ) = L(A′λ), and thereby prove that the two machines
are equivalent. Because the way A′λ was constructed limits the scope of Lemma 4.2,
the proof is divided into two cases.
Case 1: If x = λ, then by Definition 4.9
    (q₀ ∈ F′ iff λ ∈ L(Aλ))
and so
    (λ ∈ L(A′λ) ⟺ λ ∈ L(Aλ))
Case 2: Assume x ≠ λ. Since there are no transitions leaving q₀, it may be dis-
regarded as one of the start states of A′λ. Then
    x ∈ L(A′λ) ⟹ (by definition of L)
    (⋃_{s₀ ∈ S₀} δ̂′(s₀, x)) ∩ F′ ≠ ∅ ⟹ (since F ⊆ F′)
EXAMPLE 4.16
Suppose that we wanted to construct a machine that will accept the language
L = {x ∈ {a, b, c}* | x contains exactly one b, which is immediately followed by c}. A
machine that accepts this language is given in Figure 4.21.
Suppose we now wish to build a machine that will accept any positive number
of occurrences of various strings from this language concatenated together. In this
case, the resulting language would include all strings (with at least one b) with the
property that each and every b is immediately followed by c. By simply adding a
λ-transition from every final state to the start state, we achieve our objective. The
machine that accepts this new language is shown in Figure 4.22.
EXAMPLE 4.17
As an illustration of how circuitry can be defined for machines with λ-transitions,
consider the NDFA E given in Figure 4.23. This machine is similar to the NDFA D in
Example 4.13, but a λ-transition has been added from s₁ to s₂; that is, δ(s₁, λ) = {s₂}.
This transition implies that s₂ should be considered active whenever s₁ is active.
Consequently, the circuit diagram produced in Example 4.13 need only be slightly
modified by establishing the extra connection indicated by the dotted line shown in
Figure 4.24.
In general, the need for such "extra" connections leaving a given flip-flop
input tᵢ is determined by examining δ(sᵢ, λ), the set of λ-transitions for sᵢ. Note that
the propagation delay in this circuit has been increased; there are signals that must
now propagate through an extra gate during a single clock cycle. The delay will be
exacerbated in automata that contain sequences of λ-transitions. In such cases, the
length of the clock cycle may need to be increased to ensure proper operation. This
problem can be minimized by adding all the connections indicated by Λ(sᵢ), rather
than just adding those implied by δ(sᵢ, λ).
EXERCISES
4.1. Draw the deterministic versions of each of the nondeterministic finite automata shown
in Figure 4.25. In each part, assume Σ = {a, b, c}.
4.2. Consider the automaton given in Example 4.17.
(a) Convert this automaton into an NDFA without λ-transitions using Definition 4.9.
(b) Convert this NDFA into a DFA using Definition 4.5.
Figure 4.25 The nondeterministic finite automata for Exercise 4.1
(d) Convert the NDFA into a DFA using Definition 4.5 (draw the entire machine,
including the disconnected portion).
4.4. Consider the automaton given in Example 4.2.
(a) Using the standard encodings, draw a circuit diagram for this NDFA (include both
<SOS> and <EOS>).
(b) Convert the NDFA into a DFA using Definition 4.5.
4.5. Consider the automaton given in Example 4.3.
(a) Using the standard encodings, draw a circuit diagram for this NDFA (include
neither <SOS> nor <EOS>).
(b) Using the standard encodings, draw a circuit diagram for this NDFA (include both
<SOS> and <EOS>).
(c) Convert the NDFA into a DFA using Definition 4.5 (draw only the connected
portion of the machine).
(d) Is this DFA isomorphic to any of the automata constructed in Exercise 4.4?
4.6. Consider the automaton given in Example 4.14.
(a) Using the standard encodings, draw a circuit diagram for this NDFA (include
neither <SOS> nor <EOS>).
(b) Build the equivalent automaton without λ-transitions using Definition 4.9.
(c) Using the standard encodings, draw a circuit diagram for the NDFA in part (b)
(include neither <SOS> nor <EOS>).
4.7. Consider the automaton given in the second part of Example 4.16.
(a) Using the standard encodings, draw a circuit diagram for this NDFA (include
<EOS> but not <SOS>).
(b) Build the equivalent automaton without λ-transitions using Definition 4.9.
(c) Using the standard encodings, draw a circuit diagram for the NDFA in part (b)
(include <EOS> but not <SOS>).
(d) Convert the NDFA into a DFA using Definition 4.5 (draw only the connected
portion of the machine).
4.8. It is possible to build a deterministic finite automaton Ā such that the language accepted
by this machine is the absolute complement of the language accepted by a machine A
[that is, L(Ā) = Σ* − L(A)] by simply complementing the set of final states (see The-
orem 5.1). Can a similar thing be done for nondeterministic finite automata? If not,
why not? Give an example to support your statements.
4.9. Given a nondeterministic finite automaton A without λ-transitions, show that it is
possible to construct a nondeterministic finite automaton with λ-transitions A′ with the
properties
(1) A' has exactly one start state and exactly one final state and
(2) L(A') = L(A).
4.10. Consider (ii) in Definition 4.8. Can this fact be deduced from parts (i) and (iii)? Justify
your answer.
4.11. If we wanted another way to construct a nondeterministic finite automaton without
λ-transitions corresponding to one that does have them, we could try the following: Let
S′ = S, S′₀ = Λ(S₀), F′ = F, and δ′(s, a) = δλ(s, a) for all a ∈ Σ, s ∈ S. Show that this
works (or if it does not work, explain why not and give an example).
4.12. Using nondeterministic machines with λ-transitions, give an algorithm for constructing
a λ-NDFA having one start state and one final state that will accept the union of two
FAD languages.
4.26. Give an example to show that the domain of Lemma 4.2 cannot be expanded to include
λ; that is, show that δ̂′(s, λ) ≠ δ̂λ(s, λ).
4.27. Refer to Definition 4.5 and prove the fact used in Lemma 4.1:
    (∀A ∈ P(S))(∀B ∈ P(S))(∀a ∈ Σ)(δd(A ∪ B, a) = δd(A, a) ∪ δd(B, a))
4.28. Recall that if a word can reach several states in an NDFA, some of which are final and
some nonfinal, Definition 4.4 requires us to accept that word.
(a) Change the definition of L (A) so that a word is accepted only if every state the word
can reach is final.
(b) Change the definition of Ad to produce a deterministic machine that accepts only
those words specified in part (a).
4.29. Draw the connected part of Td, the deterministic equivalent of the NDFA T in Example
4.7.
4.30. Refer to Example 4.7 and modify the NDFA T so that the machine reverts to a nonfinal
state (that is, turns the recorder off) when the substring 000111 is detected. Note that
000111 functions as the EOT (end of transmission) signal.
4.31. Consider the automaton A given in Example 4.14.
(a) Draw a diagram of A′λ.
(c) Draw A′λd (draw only the connected portion of the machine).
4.32. What is wrong with the following "proof" of Lemma 4.2? Let P(k) be defined by
    P(k): (∀s ∈ S)(∀x ∈ Σ^k)(δ̂′(s, x) = δ̂λ(s, x)).
Basis step (k = 1): (∀s ∈ S)(∀a ∈ Σ)(δ̂′(s, a) = δ′(s, a) = δ̂λ(s, a)).
Inductive step: Suppose that the result holds for all x ∈ Σ^k and let y ∈ Σ^(k+1). Then
(∃x ∈ Σ^k)(∃a ∈ Σ)(y = xa). Then
    δ̂′(s, y) = δ̂′(s, xa)
             = δ̂′(δ̂′(s, x), a)
             = δ̂′(δ̂λ(s, x), a)
             = δ̂λ(δ̂λ(s, x), a)
             = δ̂λ(s, xa)
             = δ̂λ(s, y)
Therefore, P(k) ⇒ P(k + 1) for all k ≥ 1, and by the principle of mathematical
induction, we are assured that the equation holds for all x ∈ Σ⁺.
4.33. Consider the automaton given in Example 4.7.
(a) Using the standard encodings, draw a circuit diagram for this NDFA (include
neither <SOS> nor <EOS>).
(b) Convert the NDFA into a DFA using Definition 4.5 (draw only the connected
portion of the machine).
4.34. Consider the automaton given in Example 4.11.
(a) Using the standard encodings, draw a circuit diagram for this NDFA (include
neither <SOS> nor <EOS>).
(b) Convert the NDFA into a DFA using Definition 4.5 (draw only the connected
portion of the machine).
4.35. Consider the automaton B given in Example 4.8.
(a) Using the standard encodings, draw a circuit diagram for B (include neither
<SOS> nor <EOS>).
(b) Using the standard encodings, draw a circuit diagram for Bd (include neither
<SOS> nor <EOS>). Encode the states in such a way that your circuit is similar
to the one found in part (a).
4.36. Draw a circuit diagram for each NDFA given in Exercise 4.1 (include neither <SOS>
nor <EOS>). Use the standard encodings.
4.37. Draw a circuit diagram for each NDFA given in Exercise 4.1 (include both <SOS> and
<EOS>). Use the standard encodings.
4.38. Definition 3.10 and the associated algorithms were used in Chapter 3 for finding the
connected portion of a DFA.
(a) Adapt Definition 3.10 so that it applies to NDFAs.
(b) Prove that there is an algorithm for finding the connected portion of an NDFA.
CHAPTER 5

CLOSURE PROPERTIES
In this chapter we will look at ways to combine languages that are recognized by
finite automata (that is, FAD languages) and consider whether the combinations
result in other FAD languages. These results will provide insights into the construc-
tion of finite automata and will provide useful information that will have bearing on
the topics covered in later chapters. After the properties of the collection of FAD
languages have been fully explored, other classes of languages will be investigated.
We begin with a review of the concept of closure.
Notice that when many everyday operators combine objects of a given type they
produce an object of the same type. In arithmetic, for example, the multiplication
of any two whole numbers produces another whole number. Recall that this prop-
erty is described by saying that the set of whole numbers is closed under the
operation of multiplication. In contrast, the quotient of two whole numbers is likely
to produce a fraction: the whole numbers are not closed under division. The formal
definition of closure, both for operators that combine two other objects (binary
operators) and those that modify only one object (unary operators) is given below.
V Definition 5.1. The set K is closed under the (binary) operator ⊗ iff
(∀x, y ∈ K)(x ⊗ y ∈ K).
Δ

V Definition 5.2. The set K is closed under the (unary) operator η iff
(∀x ∈ K)(η(x) ∈ K).
Δ
EXAMPLE 5.1
ℕ is closed under + since, if x and y are nonnegative integers, then x + y is another
nonnegative integer; that is, if x, y ∈ ℕ, then x + y ∈ ℕ.
EXAMPLE 5.2
Let P = {X | X is a finite subset of ℕ}; then P is closed under ∪, since the union of
two finite sets is still finite. (If Y and Z are subsets for which |Y| = n < ∞
and |Z| = m < ∞, then |Y ∪ Z| ≤ n + m < ∞. Under what conditions would
|Y ∪ Z| < n + m?)
To show a set K is not closed under a binary operator ⊗, we must show
¬[(∀x, y ∈ K)(x ⊗ y ∈ K)], which means ∃x, y ∈ K such that x ⊗ y ∉ K.
EXAMPLE 5.4
ℕ is not closed under - (subtraction) since 3 - 5 = -2 ∉ ℕ, even though both 3 ∈ ℕ
and 5 ∈ ℕ.
Notice that the set as well as the operator is important when discussing closure
properties; unlike ℕ, the set of all integers ℤ is closed under subtraction. As with the
binary operator in Example 5.4, a single counterexample is sufficient to show that a
given set is not closed under a unary operator.
EXAMPLE 5.5
ℕ is not closed under √ (square root) since 7 ∈ ℕ but √7 ∉ ℕ.
V Definition 5.3. Let Σ be an alphabet. The symbol 𝒟Σ is used to denote the set
of all FAD languages over Σ; that is,
    𝒟Σ = {L ⊆ Σ* | ∃ a deterministic finite automaton M such that L(M) = L}
𝒟Σ is the set of all languages that can be recognized by finite automata. In this
chapter, it is this set whose closure properties with respect to various operations in
Σ* we are most interested in investigating. For example, if there exists a machine
that accepts a language K, then there is also a machine that accepts the complement
of K. That is, if K is FAD, then -K is FAD: 𝒟Σ is closed under -.
    x ∈ L(A⁻) ⟺ (by definition of L)
    δ̂⁻(s₀⁻, x) ∈ F⁻ ⟺ (by induction and the fact that δ = δ⁻)
    δ̂(s₀⁻, x) ∈ F⁻ ⟺ (by definition of s₀⁻)
    δ̂(s₀, x) ∈ F⁻ ⟺ (by definition of F⁻)
    δ̂(s₀, x) ∈ S − F ⟺ (by definition of complement)
    δ̂(s₀, x) ∉ F ⟺ (by definition of L)
    x ∉ L(A) ⟺ (by definition of K)
    x ∉ K ⟺ (by definition of complement)
    x ∈ -K
It turns out that 𝒟Σ is closed under all the common set operators. Notice that
the definition of 𝒟Σ implies that we are working with only one alphabet; if we
combine two machines in some way, it is understood that both automata use exactly
the same input alphabet. This turns out to be not much of a restriction, however, for
if we wish to consider two machines that use different alphabets Σ₁ and Σ₂, we can
simply modify each machine so that it is able to process the new common alphabet
Σ = Σ₁ ∪ Σ₂. It should be clear that this can be done in such a way as not to affect the
language accepted by either machine (see the exercises).
We will now prove that the union of two FAD languages is also FAD. This can
be shown by demonstrating that, given two automata M₁ and M₂, it is possible to
construct another automaton that recognizes the union of the languages accepted by
M₁ and M₂.
EXAMPLE 5.6
Consider the two machines M₁ and M₂ displayed in Figure 5.1. These two machines
can easily be employed to construct a nondeterministic finite automaton that clearly
accepts the appropriate union. We simply need to combine them into a single
machine, which in this case will have two start states, as shown in Figure 5.2.
The structure inside the dotted box should be viewed as a single NDFA with
two start states. Any string that would be accepted by M₁ will reach a final state if it
starts in the "upper half" of the new machine, while strings that are recognized by
M₂ will be accepted by the "lower half" of the machine. Recall that the definition of
acceptance by a nondeterministic finite automaton implies that the NDFA in Figure
5.2 will accept a string if any path leads to a final state. This new NDFA will
therefore accept all the strings that M₁ accepted and all the strings that M₂ accepted.
Furthermore, these are the only strings that will be accepted. This trick is the basis
of the following proof, which demonstrates the convenience of using the NDFA
concept; a proof involving only DFAs would be both longer and less obvious (see
the exercises).
Figure 5.2 The resulting automaton in Example 5.6
The above "proof" is actually incomplete; the transition from line 4 to line 5
actually depends on the assumed properties of 8 u, and not the known properties of
8u. A rigorous justification should include an inductive proof of (or at least a
reference to) the fact that 8 u reflects the same sort of behavior that 8u does; that is,
_ 181(S' x) ifsESI
8 U(s,x) = Vs E SI U S2, Vx E L*
8 2(s, x) if s E S2
The above rule essentially states that the definition that applies to the single letter a
also applies to the string x, and it is easy to prove by induction on the length of x (see
the exercises).
The following theorem, which states that 𝒟Σ is closed under ∩, will be justified
in two separate ways. The first proof will argue that the closure property must hold
due to previous results; no new DFA need be constructed. The drawback to this
type of proof is that we have no suitable guide for actually combining two existing
DFAs into a new machine that will recognize the appropriate intersection (al-
though, as outlined in the exercises, in this case a construction based on the first
proof is fairly easy to generate).
Some operators are so bizarre that a nonconstructive proof of closure is the
best we can hope for; intersection is definitely not that strange, however. In a
second proof of the closure of 𝒟Σ under ∩, Lemma 5.1 will explicitly outline how an
intersection machine could be built. When such constructions can be demonstrated,
we will say that 𝒟Σ is effectively closed under the operator in question (see Theorem
5.12 for a discussion of an operator that is not effectively closed).
Note that the above argument could be made to apply to any collection C of
sets that were known to be closed under union and complementation. A second
proof of Theorem 5.3 might rely on the following lemma, using the "direct" method
of constructing a deterministic machine that accepts L₁ ∩ L₂. This would show that
𝒟Σ is effectively closed under the intersection operator.
V Lemma 5.1. Given deterministic finite automata A₁ = <Σ, S₁, s₀₁, δ₁, F₁> and
A₂ = <Σ, S₂, s₀₂, δ₂, F₂> such that L(A₁) = L₁ and L(A₂) = L₂, define a new DFA
A∩ = <Σ, S∩, s₀∩, δ∩, F∩>, where

    S∩ = S₁ × S₂
    s₀∩ = (s₀₁, s₀₂)
    F∩ = F₁ × F₂

and δ∩: (S₁ × S₂) × Σ → S₁ × S₂ is defined by

    δ∩((s, t), a) = (δ₁(s, a), δ₂(t, a))   ∀s ∈ S₁, ∀t ∈ S₂, ∀a ∈ Σ

Then L(A∩) = L₁ ∩ L₂.
Proof. As usual, the key is to show that x ∈ L(A∩) ⟺ x ∈ L₁ ∩ L₂. The proof
hinges on the inductive statement that δ̂∩ obeys the same rule that defines δ∩; that
is, (∀s ∈ S₁)(∀t ∈ S₂)(∀x ∈ Σ*)(δ̂∩((s, t), x) = (δ̂₁(s, x), δ̂₂(t, x))). The details are left
for the reader (see the exercises).
Δ
The idea behind the above construction is to build a machine that "remem-
bers" the state changes that both A₁ and A₂ make as they each process the same
string, and hence the state set consists of all possible pairs of states from A₁ and A₂.
The goal was to design the transition function δ∩ so that being in state (s, t) in A∩
indicates that A₁ would currently be in state s and A₂ would be in state t. This goal
also motivates the definition of the new start state; we want to begin in the start
states of A₁ and A₂, and hence s₀∩ = (s₀₁, s₀₂). We only wish to accept strings that are
common to both languages, which means that the terminating state in A₁ belongs to
F₁ and the last state reached in A₂ is likewise a final state. This requirement naturally
leads to the definition of F∩, where (s, t) is a final state if and only if both s and t
were final states in their respective machines.
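In code, the product machine of Lemma 5.1 is only a few lines (a sketch with invented names; each di is total because Ai is deterministic):

    # Sketch: the product DFA of Lemma 5.1 for intersection.
    def intersect_dfas(states1, s01, f1, d1, states2, s02, f2, d2, alphabet):
        delta = {((p, q), a): (d1[(p, a)], d2[(q, a)])
                 for p in states1 for q in states2 for a in alphabet}
        start = (s01, s02)                         # run both machines at once
        finals = {(p, q) for p in f1 for q in f2}  # both components final
        return delta, start, finals

Changing the final-state rule to "at least one component is final" would instead yield a machine for the union, which suggests the DFA-only union proof requested in the exercises.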
EXAMPLE 5.7
Consider the two machines A₁ and A₂ displayed in Figure 5.3. Note that A₂ "remem-
bers" whether there have been an even or an odd number of bs, while A₁ "counts"
the number of letters (mod 3). We now demonstrate how the definition in Lemma
5.1 can be applied to form a deterministic machine that accepts the intersection of
L(A₁) and L(A₂). The structure of A∩ would in this case look like the automaton
shown in Figure 5.4. Note that A∩ does indeed keep track of the criteria that both A₁
and A₂ use to accept or reject strings. We will be in a state on the right side of A∩ if
an odd number of bs have been seen and on the left side when an even number of bs
have been processed. At the same time, we will be in the upper, middle, or lower
row of states depending on the total number of letters (mod 3) that have been
processed. There is but one final state, corresponding to the situation where we
have both an odd number of bs and the letter count is 0 (mod 3).
The operations used in the previous three theorems are common to set theory.
We now present some new operators that are special to string algebra. We have
defined concatenation (.) for individual strings, but there is a natural extension of
the definition to languages, as indicated by the next definition.
EXAMPLE 5.8
Note that baa qualifies to be in L₁·L₂ for two reasons: baa = λ·baa and baa = b·aa.
Thus we see that the concatenation contains only eight words rather than the
expected 9 (= 3·3). In general, L₁·L₂ consists of all words that can be formed by the
concatenation of a word from L₁ with a word from L₂; for finite sets, concatenating
an n-word set with an m-word set results in no more than n·m words. As shown in
this example, the number of words can actually be less than n·m. Larger languages
can be concatenated, also. For example, Σ*·Σ = Σ⁺.
The concatenation of two FAD languages is also FAD, as can easily be seen by
employing NDFAs with λ-transitions.
EXAMPLE 5.9
Figure 5.5 illustrates two nondeterministic finite automata B₁ and B₂ that accept the
languages L₁ and L₂ given in Example 5.8. Combining these two machines and
linking the final states of B₁ to the start states of B₂ with λ-transitions yields a new
NDFA that accepts L₁·L₂, as shown in Figure 5.6.
Figure 5.6 An NDFA which accepts the concatenation of the machines discussed
in Example 5.9
EXAMPLE 5.10
Consider the deterministic finite automata A₁ and A₂ displayed in Figure 5.7. These
can similarly be linked together to form an NDFA that accepts the concatenation of
the languages accepted by A₁ and A₂, as shown in Figure 5.8.
    F· = F₂        if λ ∉ L₂
    F· = F₁ ∪ F₂   if λ ∈ L₂

and δ·: (S₁ ∪ S₂) × Σ → P(S₁ ∪ S₂) is defined by

    δ·(s, a) = {δ₁(s, a)}                if s ∈ S₁ − F₁
    δ·(s, a) = {δ₁(s, a), δ₂(s₀₂, a)}    if s ∈ F₁
    δ·(s, a) = {δ₂(s, a)}                if s ∈ S₂

It can be shown that L(A·) = L(A₁)·L(A₂) = L₁·L₂ (see the exercises).
Δ
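A sketch of this construction in code (our names; the two state sets are assumed disjoint, and the lambda_in_L2 flag selects which rule F· follows):

    # Sketch of the Theorem 5.4 construction: each final state of A1 also
    # mimics the transitions out of A2's start state.
    def concat_dfas(states1, s01, f1, d1, states2, s02, f2, d2,
                    alphabet, lambda_in_L2):
        delta = {}
        for s in states1:
            for a in alphabet:
                delta[(s, a)] = {d1[(s, a)]}
                if s in f1:
                    delta[(s, a)].add(d2[(s02, a)])  # also begin matching L2
        for s in states2:
            for a in alphabet:
                delta[(s, a)] = {d2[(s, a)]}
        finals = set(f2) | (set(f1) if lambda_in_L2 else set())
        return {s01}, finals, delta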
EXAMPLE 5.11
Consider the deterministic finite automata A₁ and A₂ in Example 5.10. These can be
linked together to form the NDFA A·, and the reader can indeed verify that the
machine illustrated in Figure 5.9 accepts the concatenation of the languages
accepted by A₁ and A₂. Notice that the new transitions from the final states of A₁
mimic the transitions out of the start state of A₂.
Thus we see that avoiding λ-transitions while defining a concatenation ma-
chine is relatively simple. Unfortunately, avoiding the nondeterministic aspects of
the construction is relatively impractical and would basically entail re-creating the
construction in Definition 4.5 (which outlined the method for converting an NDFA
into a DFA). Whereas it was merely convenient (rather than necessary) to employ
NDFAs to demonstrate that 𝒟Σ is closed under union, the use of nondeterminism is
essential to the proof of closure under concatenation.
EXAMPLE 5.12
Consider the nondeterministic finite automata B₁ and B₂ from Example 5.9. Ap-
plying the analog of Theorem 5.4 (see Exercise 5.43) yields the automaton shown in
Figure 5.10. Notice that each final state of B₁ now mimics the start state of B₂, and t₀
has become a disconnected state. Both s₀ and s₁ are still final states since λ ∈ L(B₂).
EXAMPLE 5.13
Figure 5.10 An NDFA without lambda-moves which accepts the concatenation of the lan-
guages discussed in Example 5.8
Figure 5.11 Candidates for concatenation in which the second machine does not accept λ.
(Example 5.13)
Applying Theorem 5.4 in this case yields the automaton shown in Figure 5.12. In this
construction, s₀ and s₁ are no longer final states since the definition of F· must follow
a different rule when λ ∉ L(B₃). By examining the resulting machine, the reader can
verify that having t₃ as the only final state is indeed the correct strategy for this case.
Besides concatenation, string algebra allows other new operators on languages.
The operators * and +, which have at this point only been defined for
alphabets, likewise have natural extensions to languages. Loosely, we would expect
L* to consist of all words that can be formed by the concatenation of several words
from L.
V Definition 5.5. Let L be a language over some alphabet Σ. Define

    L⁰ = {λ}
    L¹ = L
    L² = L·L
    L³ = L·L² = L·L·L

and in general

    Lⁿ = L·Lⁿ⁻¹, for n = 1, 2, 3, ...

    L* = ⋃_{i=0}^∞ Lⁱ = L⁰ ∪ L¹ ∪ L² ∪ ... = {λ} ∪ L ∪ L·L ∪ L·L·L ∪ ...

    L⁺ = ⋃_{i=1}^∞ Lⁱ = L¹ ∪ L² ∪ L³ ∪ ... = L ∪ L·L ∪ L·L·L ∪ ...

L* is called the Kleene closure of the language L.
Δ
EXAMPLE 5.14
If L = {aa, c}, then L* = {λ, aa, c, aac, caa, aaaa, cc, aaaaaa, aaaac, aacaa, ...}.
EXAMPLE 5.15
If K = {db, b, c}, then K* consists of all words (over {b, c, d}) for which each occur-
rence of d is immediately followed by (at least) one b.
𝒟Σ is closed under both * and +. The technique for Kleene closure is outlined
in Theorem 5.5. The construction for L⁺ is similar (see the exercises).
V Theorem 5.5. For any alphabet Σ, 𝒟Σ is closed under * (Kleene closure).
Proof. Let L belong to 𝒟Σ. Then there is a nondeterministic finite automaton
A = <Σ, S, S₀, δ, F> such that L(A) = L. Define a nondeterministic machine
A* = <Σ, S*, S₀*, δ*, F*>, where

    S* = S ∪ {q₀} (where q₀ is some new element; q₀ ∉ S)
    S₀* = S₀ ∪ {q₀}
    F* = F ∪ {q₀}

and δ*: (S ∪ {q₀}) × Σ → P(S ∪ {q₀}) is defined by

    δ*(s, a) = δ(s, a)                         if s ∉ F ∪ {q₀}
    δ*(s, a) = δ(s, a) ∪ (⋃_{t ∈ S₀} δ(t, a))  if s ∈ F
    δ*(s, a) = ∅                               if s = q₀
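As a sketch in code (our names; q0 is an assumed fresh state, and we take δ*(q0, a) = ∅, since q0 exists only to accept λ):

    # Sketch of the Theorem 5.5 star construction: each final state also
    # mimics the start states, so that accepted words can be chained.
    def star_nfa(states, alphabet, starts, finals, delta):
        q0 = 'q0'                                 # assumed not already a state
        new_delta = {}
        for s in states:
            for a in alphabet:
                targets = set(delta.get((s, a), set()))
                if s in finals:                   # loop back: begin a new word
                    targets |= set().union(*[delta.get((t, a), set())
                                             for t in starts])
                new_delta[(s, a)] = targets
        return (set(states) | {q0}, set(starts) | {q0},
                set(finals) | {q0}, new_delta)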
EXAMPLE 5.16
machine from which the new NDFA is formed. In reviewing Theorems 5.4 and 5.5,
the reader should be able to see the parallel between the differences in the specifica-
tions of the δ function and the differences in the definitions of S₀ and S₀*. It is also
instructive to compare and contrast the proof of Theorem 5.2 to those discussed
above.
The operators discussed in this section, while not as fundamental as those presented
earlier, illustrate some useful techniques for constructing modified automata. Also
explored are techniques that provide existence proofs rather than constructive
proofs.
V Theorem 5.6. For any alphabet Σ, 𝒟Σ is closed under the operator Z, where
Z is defined by
    Z(L) = {x | x is formed by deleting zero or more letters from a word in L}.
Proof. See the exercises and the following example.
Δ
EXAMPLE 5.17
and can be accepted by modifying C so that every transition in the diagram has a
corresponding λ-move (allowing that particular letter to be skipped), as shown in
Figure 5.16.
V Theorem 5.7. For any alphabet Σ, 𝒟Σ is closed under the operator Y, where
Y is defined by
    Y(L) = {x | x is formed by deleting exactly one letter from a word in L}.
Proof. See the exercises and the following example.
Δ
EXAMPLE 5.18
We need a way to skip a letter as was done in Example 5.17, but we must now skip
one and only one letter. The technique for accomplishing this involves using copies
of the original machine. Consider the deterministic finite automaton D displayed in
Figure 5.17. We will use λ-moves to mimic normal transitions, but in this case we
will move from one copy of the machine to an appropriate state in a second copy.
Being in the first copy of the machine will indicate that we have yet to skip a letter,
and being in the second copy will signify that we have followed exactly one λ-move
and have thus skipped exactly one letter. Hence the second copy will be the only
one in which states are deemed final, and the first copy will contain the only start
state. The modified machine for this example might look like the NDFA shown in
Figure 5.18. The string aba, which is accepted by the original machine, should cause
ab, aa, and ba to be accepted by the new machine. Each of these three is indeed
accepted, by following the correct λ-move at the appropriate time. A similar tech-
nique, with the state transition function slightly redefined, could be used to accept
words in which every other letter was deleted. If one wished only to acknowledge
every third letter, three copies of the machine could be suitably connected together
to achieve the desired result (see the exercises).
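The two-copy trick can be sketched with explicit λ-arcs in the style of Chapter 4 (our names; delta here is the transition function of the original DFA):

    # Sketch: accept all words obtained by deleting exactly one letter.
    # Copy 0 means "no letter skipped yet"; copy 1 means "one letter skipped".
    def delete_one_letter(states, alphabet, start, finals, delta):
        new_delta, lam = {}, {}
        for s in states:
            for a in alphabet:
                t = delta[(s, a)]
                new_delta[((s, 0), a)] = {(t, 0)}          # normal move, copy 0
                new_delta[((s, 1), a)] = {(t, 1)}          # normal move, copy 1
                lam.setdefault((s, 0), set()).add((t, 1))  # skip this letter
        return {(start, 0)}, {(f, 1) for f in finals}, new_delta, lam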
While 𝒟Σ is certainly the most important class of languages we have seen so
far, we will now consider some other classes whose properties can be investigated.
The closure properties of other collections of languages will be considered in the
exercises and in later chapters.
V Definition 5.6. Let Σ be an alphabet. Then 𝒲Σ is defined to be the set of all
languages over Σ recognized by NDFAs; that is,
    𝒲Σ = {L ⊆ Σ* | ∃ NDFA N such that L(N) = L}.
Proof. The proof follows immediately from Theorem 4.1 and Exercise 4.25.
The reader should note that Lemma 5.2 simply restates in new terms the
conclusion reached in Chapter 4, where it was proved that NDFAs were exactly as
powerful as DFAs. More specifically, it was shown that any language that could be
recognized by an NDFA could also be recognized by a DFA, and conversely. While
every subset of Σ* represents a language, those in 𝒟Σ have exhibited many nice
properties owing to the convenient representation afforded by finite automata. We
now focus our attention on "the other languages," that is, those that are not in 𝒟Σ.
𝒩Σ is all the "complicated" languages (subsets) that can be formed from Σ*;
that is, 𝒩Σ = P(Σ*) − 𝒟Σ. Be careful not to confuse 𝒩Σ with the set of languages that
can be recognized by NDFAs (𝒲Σ in Definition 5.6).
V Definition 5.8. Let Σ = {a₁, a₂, ..., aₘ} be an alphabet and let Γ be a second
alphabet. Given words w₁, w₂, ..., wₘ over Γ*, define a language homomorphism
ψ: Σ → Γ* by ψ(aᵢ) = wᵢ for each i, which can be extended to ψ̂: Σ* → Γ* by:

    ψ̂(λ) = λ
    (∀a ∈ Σ)(∀x ∈ Σ*)(ψ̂(a·x) = ψ(a)·ψ̂(x))

ψ̂ can be further extended to operate on a language L by defining

    ψ̂(L) = {ψ̂(z) ∈ Γ* | z ∈ L}
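In code, the extension amounts to mapping letter by letter and concatenating; a sketch (ours) using the homomorphism of Example 5.19 below:

    # Sketch: extend a letter-to-word map psi to whole strings.
    def extend(psi, x):
        return ''.join(psi[a] for a in x)   # the empty word maps to itself

    psi = {'a': 'cd', 'b': 'd'}
    print([extend(psi, w) for w in ['', 'ab', 'bb']])  # ['', 'cdd', 'dd']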
EXAMPLE 5.19
Let Σ = {a, b} and Γ = {c, d}, and define ψ by ψ(a) = cd and ψ(b) = d. For
K = {λ, ab, bb}, ψ̂(K) = {λ, cdd, dd}, while for L = {a, b}*, ψ̂(L) represents all words
over {c, d} in which every c is immediately followed by d.
EXAMPLE 5.20
As a second example, let Σ = {), (} and let Γ be the ASCII alphabet. If μ is defined
by μ(() = begin and μ()) = end, then the set M of all strings of matched parentheses
maps to K, the set of all matched begin-end pairs.
EXAMPLE 5.21
Consider the DFA B displayed in Figure 5.19a. For the homomorphism μ defined
by μ(a) = a and μ(b) = a, the automaton that will accept μ̂(L(B)) is shown in Figure
5.19b. Note that even in simple examples like this one the resulting automaton can
be nondeterministic.

Figure 5.19 (a) The automaton discussed in Example 5.21 (b) The resulting auto-
maton for Example 5.21
EXAMPLE 5.22
For the NDFA C displayed in Figure 5.20a and the homomorphism μ defined by
μ(a) = cc and μ(b) = a, the automaton that will accept μ̂(L(C)) is shown in Figure
5.20b. Note that each state of C requires an extra state to accommodate the cc
transition.
Figure 5.20 (a) The automaton discussed in Example 5.22 (b) The resulting auto-
maton for Example 5.22
EXAMPLE 5.23
Consider the identity homomorphism μ: Σ → Σ* defined by (∀a ∈ Σ)(μ(a) = a).
Since μ̂(L) = L, any collection of languages, including 𝒩Σ, is clearly closed under
this homomorphism. Unlike 𝒟Σ, though, there are many homomorphisms under
which 𝒩Σ is not closed.
V Lemma 5.4. Let Σ = {a, b}, and let ψ: Σ → Σ* be defined by ψ(a) = a and
ψ(b) = a. Then 𝒩Σ is not closed under ψ̂.
Proof. Consider the set L of all strings that have the same number of as as bs.
This language is in 𝒩Σ, but ψ̂(L) is the set of all even-length strings of as, which is
clearly not in 𝒩Σ.
Δ
A rather trivial example involves the homomorphism defined by ψ(a) = λ for
every letter a ∈ Σ. Then for all languages L, whether or not L ∈ 𝒩Σ, ψ̂(L) = {λ},
which is definitely not in 𝒩Σ.
V Definition 5.9. Let ψ: Σ → Γ* be a language homomorphism and consider
z ∈ Γ*. The inverse homomorphic image of z under ψ̂ is then
    ψ̂⁻¹(z) = {x ∈ Σ* | ψ̂(x) = z}
For a language L ⊆ Γ*, the inverse homomorphic image of L under ψ̂ is defined by
    ψ̂⁻¹(L) = {x ∈ Σ* | ψ̂(x) ∈ L}
EXAMPLE 5.24
Consider ψ from Lemma 5.4, in which ψ: Σ → Σ* was defined by ψ(a) = a and
ψ(b) = a. Let z = aa. Since ψ̂(bb) = ψ̂(ba) = ψ̂(ab) = ψ̂(aa) = aa,
ψ̂⁻¹(aa) = {bb, ba, ab, aa}. Note that ψ̂⁻¹(ba) = ∅.
For L = {x ∈ {a}* | |x| = 0 mod 3}, ψ̂⁻¹(L) = {x ∈ {a, b}* | |x| = 0 mod 3}. Note
that this second set is definitely larger, since it also contains words with bs in them.
It can be shown that 𝒟Σ is closed under inverse homomorphism. The trick is to
make the state transition function of the new automaton simulate, for a given letter
a, the action the old automaton would have taken for the entire string ψ(a). As the
following proof will illustrate, the only change that need take place is in the δ
function; the newly constructed machine is even deterministic!
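The construction can be sketched as follows (our names; the start state and final states are carried over unchanged):

    # Sketch: a DFA for the inverse homomorphic image.  Reading the letter
    # a simply runs the old DFA on the entire word psi(a).
    def inverse_hom_dfa(states, alphabet, delta, psi):
        def run(s, w):                      # the old machine's extended delta
            for c in w:
                s = delta[(s, c)]
            return s
        return {(s, a): run(s, psi[a]) for s in states for a in alphabet}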
This theorem makes it possible to extend the range of the pumping lemma
(Theorem 2.3) to many otherwise unpleasant problems. The set M given in Exam-
ple 5.20 can easily be shown to violate Theorem 2.3 and is therefore not FAD. The
set K given in Example 5.20 is just as clearly not FAD, but this is quite tedious to
formally prove by the pumping lemma (the number of choices for u, v, and w is
prohibitively large to thoroughly cover). An argument might proceed as follows:
Assume K were FAD. Then M, being the inverse homomorphic image of a FAD
language, must also be FAD. Since M is known (by an easy pumping lemma proof)
to be definitely not FAD, the assumption that K is FAD must be incorrect. Thus,
K ∈ 𝒩Σ.
V Lemma 5.5. Let Σ = {a, b}, and let ψ: Σ → Σ* be defined by ψ(a) = a and
ψ(b) = a. Then 𝒩Σ is not closed under ψ̂⁻¹.
Proof. Consider the set L of all strings that have the same number of as as bs.
This language is in 𝒩Σ, but ψ̂⁻¹(L) is {λ}, which is clearly not in 𝒩Σ.
Δ
We close this chapter by considering two operators for which it is definitely not
convenient to modify the structure of an existing automaton to construct a new
automaton with which to demonstrate closure.
V Theorem 5.11. Let Σ be an alphabet. Define the operator b by
    Lb = {x | (∃y ∈ Σ*)(xy ∈ L ∧ |x| = |y|)}
Then 𝒟Σ is closed under the operator b.
Proof. Lb represents the first halves of all the words in L. For example, if
K = {ad, abaa, ccccc}, then Kb = {a, ab}. Assume that L is FAD. Then there exists a
DFA A = <Σ, S, s₀, δ, F> that accepts L. The proof consists of identifying those
states q that are "midway" between the start state and a final state; specifically, we
need to identify the set of strings for which q is the midpoint. The previous closure
results for union, intersection, homomorphism, and inverse homomorphism will be
used to construct the language representing Lb. Define the length homomorphism
ψ: Σ → {1}* by ψ(a) = 1 for all a ∈ Σ. Note that ψ̂ effectively counts the number of
letters in a word:
    ψ̂(x) = 1^|x|
The following argument can be applied to each state q to determine the set of strings
that use it as a "midway" state.
Consider the initial set for q, I(A, q) = {x | δ̂(s₀, x) = q}, and the terminal set for
q, T(A, q) = {x | δ̂(q, x) ∈ F}. We are interested in finding those words in I(A, q)
that are the same length as words in T(A, q). ψ̂(I(A, q)) represents strings of 1s
whose lengths are the same as words in I(A, q). A similar interpretation can be
given for ψ̂(T(A, q)). Therefore, ψ̂(I(A, q)) ∩ ψ̂(T(A, q)) will reflect those lengths
that are common to both the initial set and the terminal set. The inverse image
under ψ̂ for this set will then reflect only those strings in Σ* that are of the correct
length to reach q from s₀. This set is ψ̂⁻¹(ψ̂(I(A, q)) ∩ ψ̂(T(A, q))). Not all strings of
a given length are likely to reach q, though, so this set must be intersected with
I(A, q) to correctly describe those strings that are both of the proper length and that
reach q from the start state. This set, I(A, q) ∩ ψ̂⁻¹(ψ̂(I(A, q)) ∩ ψ̂(T(A, q))), is thus
the first halves of all words that have q as their midpoint. This process can be
repeated for each of the (finite) number of states in the automaton A, and the union
of the resulting sets will form all the first halves of words that are accepted by A; that
is, the union will equal Lb.
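For intuition, the operator b can also be checked by brute force on small examples (a sketch assuming an accepts predicate; this illustrates the operator, not the construction in the proof):

    from itertools import product

    # Sketch: enumerate even-length accepted words; keep their first halves.
    def first_halves(accepts, alphabet, max_len):
        halves = set()
        for n in range(0, max_len + 1, 2):
            for w in map(''.join, product(alphabet, repeat=n)):
                if accepts(w):
                    halves.add(w[:n // 2])
        return halves

    print(first_halves(lambda w: w in {'ad', 'abaa', 'ccccc'}, 'abcd', 4))
    # prints {'a', 'ab'}, matching Kb above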
Note that by moving the start state of A to q and forming the automaton
A_q = <Σ, S, q, δ, F>, each of the terminal sets T(A, q) can be shown to be FAD.
Similarly, the automaton A^q = <Σ, S, s₀, δ, {q}> illustrates that each initial set
I(A, q) must be FAD, also. Since Lb has now been shown to be formed from these
FAD languages by operators under which 𝒟Σ is closed, Lb must itself be FAD.
Similar calculations would have to be done for each of the other states of A.
Once again, 𝒩Σ does not enjoy the same closure properties that 𝒟Σ does.

V Lemma 5.6. Let Σ be an alphabet. Then 𝒩Σ is not closed under the oper-
ator b.
Proof. Let L = {aⁿbⁿ | n ≥ 0} ∈ 𝒩Σ. Then Lb = {aⁿ | n ≥ 0} ∉ 𝒩Σ.

Other examples that show 𝒩Σ is not closed under the operator b abound. If
K = {x ∈ {a, b}* | |x|a = |x|b}, then Kb = {a, b}*. The last operator we will cover in
this chapter is useful for illustrating closures that may not be effective, that is, for
which there may not exist an algorithm for constructing the desired entity.
Roughly speaking, the quotient consists of the beginnings of those words in L₁ that
terminate in a word from L₂.
EXAMPLE 5.26
Let Σ = {a, b}. {b², b⁴, b⁶, b⁸, b¹⁰, b¹², ...}/{b} = {b¹, b³, b⁵, b⁷, b⁹, b¹¹, ...}. Note that
{b², b⁴, b⁶, b⁸, b¹⁰, b¹², ...}/{a} = ∅.
Note that the above proof did not mention the automaton associated with the
second language ~. Indeed, the definition given for F' is sufficient to argue that the
new automaton does recognize the quotient of the two languages. It was not actually
necessary to deal with an automaton for ~ in order to argue that there must exist a
DFA that recognizes Ll/~' The proof of Theorem 5.12 is thus an existence proof,
but does not indicate whether 2ll~ is effectively closed under quotient. Indeed,
Theorem 5.12 actually proves that the quotient of a FAD language with any other
language (including those in .N~) will always be FAD. However, if it is hard to
determine just which strings in the other language may have the properties we need
to define F'; we may not really know which subset of states F' should actually be
[after all, we could hardly check the property 31(q, y) E F, one string at a time, for
each of an infinite number of strings y in ~ in a finite amount of time]. Fortunately,
it is not necessary to know F' exactly, since there are only a finite number of ways to
choose a set of final states in the automaton A', and the proof of Theorem 5.12
assures us that one of those ways must be the correct one that admits the conclusion
L(A') = L I /L2 •
It would, however, be quite convenient to know what F' actually is so that we
170 Closure Properties Chap. 5
could construct the automaton that actually accepts the quotient; this seems much
more satisfying than just knowing that such a machine must exist! If L2 is FAD, the
existence of an automaton A2 = <Σ, S2, s02, δ2, F2> for which L(A2) = L2 does make
it possible to calculate F' exactly (see the exercises). Thus, 𝒟_Σ is effectively closed
under quotient. In later chapters, languages that may make it impossible to deter-
mine F' will be studied. We defer the details of such problems until then.
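To make the preceding remark concrete, here is a hedged sketch of one way F' can be computed when an automaton A2 for L2 is available; the dict-based encodings and helper names are mine, not the text's. For each candidate state q of A1, a breadth-first search of the product of A1 (restarted at q) with A2 looks for a reachable pair that is final in both machines.

from collections import deque

def quotient_finals(states1, delta1, finals1, delta2, start2, finals2, sigma):
    # Return F' = {q : some y in L(A2) drives A1 from q into F1}.
    f_prime = set()
    for q in states1:
        seen, todo = {(q, start2)}, deque([(q, start2)])
        while todo:
            s, t = todo.popleft()
            if s in finals1 and t in finals2:   # some witness y works for q
                f_prime.add(q)
                break
            for ch in sigma:
                nxt = (delta1[(s, ch)], delta2[(t, ch)])
                if nxt not in seen:
                    seen.add(nxt)
                    todo.append(nxt)
    return f_prime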
∇ Lemma 5.7. Let Σ = {a, b}. 𝒩_Σ is not closed under quotient.
Proof. Consider the set L of all strings that have a different number of as than
bs. This language is in 𝒩_Σ, but L/L = Σ* (why?).
Δ
From the exercises it will become clear that 𝒩_Σ is not closed over most of the
usual (or unusual!) operators. Note that 𝒟_Σ is by contrast a very special set, in that it
appears to be closed over every reasonable unary and binary operation that we
might consider. The question of closure will again arise as more complex classes of
machines and languages are presented in later chapters.
EXERCISES
5.1. Let Σ be an alphabet. Define F_Σ to be the collection of all finite languages over Σ.
Prove or give counterexamples to the following:
(a) F_Σ is closed under complementation.
(b) F_Σ is closed under union.
(c) F_Σ is closed under intersection.
(d) F_Σ is closed under concatenation.
(e) F_Σ is closed under Kleene closure.
(f) F_Σ is closed under relative complement.
5.2. Let Σ be an alphabet. Define C_Σ to be the collection of all cofinite languages over Σ (a
language is cofinite if it is the complement of a finite language). Prove or give counter-
examples to the following:
(a) C_Σ is closed under complementation.
(b) C_Σ is closed under union.
(c) C_Σ is closed under intersection.
(d) C_Σ is closed under concatenation.
(e) C_Σ is closed under Kleene closure.
(f) C_Σ is closed under relative complement.
5.3. Let Σ be an alphabet. Define B_Σ = F_Σ ∪ C_Σ (see Exercises 5.1 and 5.2). Prove or give
counterexamples to the following:
(a) B_Σ is closed under complementation.
(b) B_Σ is closed under union.
(c) B_Σ is closed under intersection.
(d) B_Σ is closed under concatenation.
(e) B_Σ is closed under Kleene closure.
(f) B_Σ is closed under relative complement.
5.4. Let Σ be an alphabet. Define I_Σ to be the collection of all infinite languages over Σ.
Note that I_Σ = ℘(Σ*) - F_Σ (see Exercise 5.1). Prove or give counterexamples to the
following:
(a) I_Σ is closed under complementation.
(b) I_Σ is closed under union.
(c) I_Σ is closed under intersection.
(d) I_Σ is closed under concatenation.
(e) I_Σ is closed under Kleene closure.
(f) I_Σ is closed under relative complement.
5.5. Let Σ be an alphabet. Define J_Σ to be the collection of all languages over Σ that have
infinite complements. Note that J_Σ = ℘(Σ*) - C_Σ (see Exercise 5.2). Prove or give
counterexamples to the following:
(a) J_Σ is closed under complementation.
(b) J_Σ is closed under union.
(c) J_Σ is closed under intersection.
(d) J_Σ is closed under concatenation.
(e) J_Σ is closed under Kleene closure.
(f) J_Σ is closed under relative complement.
5.6. Let Σ be an alphabet. Define E to be the collection of all languages over {a, b} that
contain the word abba. Prove or give counterexamples to the following:
(a) E is closed under complementation.
(b) E is closed under union.
(c) E is closed under intersection.
(d) E is closed under concatenation.
(e) E is closed under Kleene closure.
(f) E is closed under relative complement.
5.7. If a collection of languages is closed under intersection, does it have to be closed under
union? Prove or give a counterexample.
5.8. If a collection of languages is closed under intersection and complement, does it have
to be closed under union? Prove or give a counterexample.
5.9. Show that if a collection of languages is closed under concatenation it is not necessarily
closed under Kleene closure.
5.10. Show that if a collection of languages is closed under Kleene closure it is not necessarily
closed under concatenation.
5.11. Show that if a collection of languages is closed under complementation it is not
necessarily closed under relative complement.
5.12. Give a finite set of numbers that is closed under ∨.
5.13. Give an infinite set of numbers that is closed under ∨.
5.14. Given deterministic machines A1 and A2, use the definition of A∪ and Definition 4.5 to
describe an algorithm for building a deterministic automaton that will accept
L(A1) ∪ L(A2).
5.15. Given deterministic machines A1 and A2, and without relying on the construction used
in Theorem 5.2:
(a) Build a deterministic automaton that will accept L(A1) ∪ L(A2).
(b) Prove that your construction behaves as advertised.
(c) If no minimization is performed in Exercise 5.14, how do the numbers of states in
the machines from Exercise 5.14, from Theorem 5.2, and from part (a) compare?
(Assume A1 has n states and A2 has m states, and give expressions based on these
variables.)
5.16. Let Σ be an alphabet. Define the (unary) operator P by
P(L) = {x | ∃y ∈ Σ* such that xy ∈ L} (for any collection of words L)
P(L) then represents all the prefixes of words in L. For example, if K = {a, bbc, dd},
then P(K) = {λ, a, b, bb, bbc, d, dd}. Prove that 𝒟_Σ is closed under the operator P.
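One way to realize the construction behind Exercise 5.16 is sketched below (an illustration of mine under an assumed dict-based DFA encoding, not the book's notation): a word x is a prefix of a word in L exactly when the state x reaches can still reach a final state, so it suffices to enlarge the final-state set accordingly.

def prefix_finals(states, sigma, delta, finals):
    # Return the enlarged final-state set: all states from which some
    # final state is reachable; states and transitions stay unchanged.
    good = set(finals)
    changed = True
    while changed:                      # backward closure, one step at a time
        changed = False
        for s in states:
            if s not in good and any(delta[(s, ch)] in good for ch in sigma):
                good.add(s)
                changed = True
    return good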
5.17. Let Σ be an alphabet. Define the (unary) operator S by
S(L) = {x | ∃y ∈ Σ* such that yx ∈ L} (for any collection of words L)
S(L) then represents all the suffixes of words in L. For example, if K = {a, bbc, dd},
then S(K) = {λ, a, c, bc, bbc, d, dd}.
(a) Given an automaton accepting L, describe how to modify it to produce an automa-
ton accepting S(L).
(b) Prove that your construction behaves as advertised.
(c) Argue that 𝒟_Σ is closed under the operator S.
5.18. Let Σ be an alphabet. Define the (unary) operator C by
C(L) = {x | ∃y, z ∈ Σ* such that yxz ∈ L} (for any collection of words L)
C(L) then represents all the centers of words in L. For example, if K = {a, bbc, dd},
then C(K) = {λ, a, c, bc, bbc, b, bb, d, dd}.
(a) Given an automaton accepting L, describe how to modify it to produce an automa-
ton accepting C(L).
(b) Prove that your construction behaves as advertised.
(c) Argue that 𝒟_Σ is closed under the operator C.
5.19. Let Σ be an alphabet. Define the (unary) operator F by
F(L) = {x | x ∈ L ∧ (if ∃y ∈ Σ* such that xy ∈ L, then y = λ)}
F(L) then represents all the words in L that are not the beginnings of other words in L.
For example, if K = {ad, ab, abbad}, then F(K) = {ad, abbad}. Prove that 𝒟_Σ is closed
under the operator F.
5.20. Let Σ be an alphabet, and x = a1a2···a(n-1)an ∈ Σ*; define x^r = an a(n-1)···a2a1. For a
language L over Σ, define L^r = {x^r | x ∈ L}. Note that the (unary) reversal operator r is
thus defined by L^r = {an a(n-1)···a3a2a1 | a1a2a3···a(n-1)an ∈ L}, and L^r therefore repre-
sents all the words in L written backward. For example, if K = {λ, ad, bbc, bbad}, then
K^r = {λ, da, cbb, dabb}.
(a) Given an automaton accepting L, describe how to modify it to produce an automa-
ton accepting L^r.
(b) Prove that your construction behaves as advertised.
(c) Argue that 𝒟_Σ is closed under the operator r.
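A sketch of the usual construction for Exercise 5.20 (the encoding, mapping each (state, letter) pair to a set of successor states, is an assumption of mine): reverse every transition and exchange the roles of the start and final states; the resulting machine is nondeterministic and may have several start states.

def reverse_nfa(states, sigma, delta, start, finals):
    # Build an NFA accepting L^r from a DFA accepting L.
    rdelta = {(s, ch): set() for s in states for ch in sigma}
    for (s, ch), t in delta.items():
        rdelta[(t, ch)].add(s)          # each s --ch--> t becomes t --ch--> s
    # (transitions, start states, final states) of the reversed machine:
    return rdelta, set(finals), {start}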
5.21. Let Σ = {a, b, c, d}. Define the (unary) operator G by
G(L) = {w^r·w | w ∈ L}
(see the definition of w^r in Exercise 5.20). As an example, if K = {λ, ad, bbc, bbad},
then G(K) = {λ, daad, cbbbbc, dabbbbad}.
(b) Give examples of languages L1 and L2 for which R(L1 ∩ L2) ≠ R(L1) ∩ R(L2).
Hint: It may be helpful to consider the construction of A∩ given in Lemma 5.1 to
direct your thinking.
5.47. Consider the following assertion: 𝒟_Σ is closed under relative complement; that is, if L1
and L2 are FAD, then L1 - L2 is also FAD.
(a) Prove this by appealing to existing theorems.
(b) Define an appropriate "new" machine.
(c) Prove that the machine constructed in part (b) behaves as advertised.
5.48. Define ℒ_Σ to be the set of all languages recognized by NDFAs with λ-transitions. What
sort of closure properties does ℒ_Σ have? How does ℒ_Σ compare to 𝒟_Σ?
5.49. (a) Give an example of a language L for which λ ∈ L⁺.
(b) Give three examples of languages L for which L⁺ = L.
5.50. Recall that δ∪: (S1 ∪ S2) × Σ → ℘(S1 ∪ S2) was defined by
δ∪(s, a) = δ1(s, a) if s ∈ S1, and δ∪(s, a) = δ2(s, a) if s ∈ S2, for all a ∈ Σ
(b) Was this fact used in the proof of Theorem 5.2?
5.51. Let Σ be an alphabet. Prove or give counterexamples to the following:
(a) 𝒩_Σ is closed under relative complement.
(b) 𝒩_Σ is closed under union.
(c) 𝒩_Σ is closed under concatenation.
(d) 𝒩_Σ is closed under Kleene closure.
(g) If L ∈ 𝒩_Σ, then L^r ∈ 𝒩_Σ.
5.52. Why was it necessary to require that S1 ∩ S2 = ∅ in the proof of Theorem 5.4? Would
any step of the proof be invalid without this assumption? Explain.
5.53. Let Σ be an alphabet. Define E(L) = {z | (∃y ∈ Σ⁺)(∃x ∈ L)(z = yx)}.
(a) Given an automaton accepting L, describe how to modify it to produce an automa-
ton accepting E(L).
(b) Prove that your construction behaves as advertised.
(c) Argue that 𝒟_Σ is closed under the operator E.
5.54. Let Σ be an alphabet. Define B(L) = {z | (∃x ∈ L)(∃y ∈ Σ*)(z = xy)}.
(a) Given an automaton accepting L, describe how to modify it to produce an automa-
ton accepting B(L).
(b) Prove that your construction behaves as advertised.
(c) Argue that 𝒟_Σ is closed under the operator B.
5.55. Let Σ be an alphabet. Define M(L) = {z | (∃x ∈ L)(∃y ∈ Σ⁺)(z = xy)}.
(a) Given an automaton accepting L, describe how to modify it to produce an automa-
ton accepting M(L).
(b) Prove that your construction behaves as advertised.
(c) Argue that 𝒟_Σ is closed under the operator M.
5.56. Refer to the definitions given in Lemma 5.1 and use induction to show that
(∀s ∈ S1)(∀t ∈ S2)(∀x ∈ Σ*)(δ∩((s, t), x) = (δ1(s, x), δ2(t, x)))
5.57. Refer to Lemma 5.1 and prove that L(A∩) = L1 ∩ L2. As long as the reference is
explicitly stated, the result in Exercise 5.56 can be used without proof.
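For experimenting with Exercises 5.56 and 5.57, here is a compact sketch of the product machine A∩ of Lemma 5.1 (dict-based encoding assumed): the states are pairs, and the two component machines run in lockstep, which is precisely the content of Exercise 5.56.

def product_dfa(states1, delta1, start1, finals1,
                states2, delta2, start2, finals2, sigma):
    # DFA whose language is the intersection of the two given languages.
    states = {(s, t) for s in states1 for t in states2}
    delta = {((s, t), ch): (delta1[(s, ch)], delta2[(t, ch)])
             for (s, t) in states for ch in sigma}
    finals = {(s, t) for (s, t) in states if s in finals1 and t in finals2}
    return states, delta, (start1, start2), finals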
5.58. Prove Theorem 5.6.
5.59. Prove Theorem 5.7.
5.60. (a) Cleverly define a machine modification that does not use any λ-moves that could
be used to prove Theorem 5.7 (your new machine is still likely to be non-
deterministic, however).
(b) Prove that your modified machine behaves as advertised.
5.61. Let W(L) = {x | x is formed by deleting one or more letters from a word in L}.
(a) Given an automaton accepting L, describe how to modify it to produce an automa-
ton accepting W (L).
(b) Prove that your construction behaves as advertised.
(c) Argue that 𝒟_Σ is closed under the operator W.
5.62. Let V(L) = {x | x is formed by deleting the odd-positioned letters from a word in L}.
[Note: This refers to the first, third, fifth, and so on, letters in a word. For example, if
abcdef ∈ L, then bdf ∈ V(L).]
(a) Given an automaton accepting L, describe how to modify it to produce an automa-
ton accepting V(L).
(b) Prove that your construction behaves as advertised.
(c) Argue that 𝒟_Σ is closed under the operator V.
5.63. Let U(L) = {x | x is formed by deleting the even-positioned letters from a word in L}.
[Note: This refers to the second, fourth, sixth, and so on, letters in a word. For
example, if abcdefg ∈ L, then aceg ∈ U(L).]
(a) Given an automaton accepting L, describe how to modify it to produce an automa-
ton accepting U(L).
(b) Prove that your construction behaves as advertised.
(c) Argue that 𝒟_Σ is closed under the operator U.
5.64. Let T(L) = {x | x is formed by deleting every third, sixth, ninth, and so on, letter from
a word in L}. [Note: This refers to those letters in a word whose index position is
congruent to 0 mod 3. For example, if abcdefg ∈ L, then abdeg ∈ T(L).]
(a) Given an automaton accepting L, describe how to modify it to produce an automa-
ton accepting T(L).
(b) Prove that your construction behaves as advertised.
(c) Argue that 𝒟_Σ is closed under the operator T.
5.65. Let P = {x | |x| is prime} and let I(L) be defined by I(L) = L ∩ P.
(a) Show that 𝒟_Σ is not closed under I.
(b) Show that F_Σ is closed under I (see Exercise 5.1).
(c) Prove or disprove: C_Σ is closed under I (see Exercise 5.2).
(d) Prove or disprove: B_Σ is closed under I (see Exercise 5.3).
(e) Prove or disprove: I_Σ is closed under I (see Exercise 5.4).
(f) Prove or disprove: J_Σ is closed under I (see Exercise 5.5).
(g) Prove or disprove: E is closed under I (see Exercise 5.6).
(h) Prove or disprove: 𝒩_Σ is closed under I.
5.66. Define C to be the collection of all languages over {a, b} that do not contain λ. Prove or
give counterexamples to the following:
(a) C is closed under complementation.
(b) C is closed under union.
(c) C is closed under intersection.
(d) C is closed under concatenation.
(e) C is closed under Kleene closure.
(f) C is closed under relative complement.
(g) If L ∈ C, then L^r ∈ C.
5.67. (a) Consider the statement that 𝒟_Σ is closed under finite union:
(i) Prove by existing theorems and induction.
(ii) Prove by construction.
(b) Prove or disprove that 𝒟_Σ is closed under infinite union. Justify your assertions.
5.68. Let Σ = {a, b}.
(a) Give examples of three homomorphisms under which 𝒩_Σ is not closed.
(b) Give examples of three homomorphisms under which 𝒩_Σ is closed.
5.69. Let Σ = {a}. Can you find two different homomorphisms under which 𝒩_Σ is not closed?
Justify your conclusions.
5.70. Refer to the construction given in Theorem 5.10.
(a) Prove δ''(s, x) = δ(s, ψ(x)) for all s ∈ S and all x ∈ Σ*.
(b) Complete the proof of Theorem 5.10.
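A sketch of the machine modification used in Theorem 5.10 (encodings assumed as in the earlier sketches; psi maps each letter to a word): to accept ψ⁻¹(L), the new machine reads a letter a but moves as the original DFA would on the entire word ψ(a), which is exactly the identity in part (a) above.

def inverse_hom_delta(delta, psi, states, sigma):
    # delta''(s, a) = delta-hat(s, psi(a))
    def run(s, w):
        for ch in w:
            s = delta[(s, ch)]
        return s
    return {(s, a): run(s, psi[a]) for s in states for a in sigma}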
5.71. Consider the homomorphism ψ given in Lemma 5.4 and the set L of all strings that have
the same number of as as bs.
(a) 𝒟_Σ is closed under inverse homomorphism, but ψ(L) is the set of all even-length
strings of as, and it appears that under ψ⁻¹ the FAD language ψ(L) maps to the
non-FAD language L. Explain the apparent contradiction. Hint: First compute
ψ⁻¹(ψ(L)).
(b) Give an example of a homomorphism for which ψ(ψ⁻¹(L)) ≠ L.
(c) Give an example of a homomorphism for which ψ⁻¹(ψ(L)) ≠ L.
(d) Prove ψ(ψ⁻¹(L)) ⊆ L.
(e) Prove L ⊆ ψ⁻¹(ψ(L)).
5.72. Let I be an alphabet. Define the (unary) operator e by
U={xI3yEI*0l(yxEL 1\ Ixl=lyl)}
U then represents the last halves of all the words in L. For example, if
K = {ad, abaa, ccccc},
e
then K = {d, aa}. Prove that 2tll; is closed under the operator e.
5.73. Refer to the proof of Theorem 5.11 and show that there exists an automaton A for
which it would be incorrect to try to accept L^b by redefining the set of final states to be
the set of "midway" states.
5.74. Consider the sets M and K in Example 5.20. Assume that we have used the pumping
lemma to show that M is not FAD. What would be wrong with arguing that, since M
was not FAD, its homomorphic image cannot be FAD either, and hence K is therefore
not FAD?
5.75. Prove Theorem 5.9.
5.76. Let Σ be the ASCII alphabet. Define a homomorphism that will capitalize all lower-
case letters (and does not change punctuation, spelling, and the like).
5.77. Consider the proof of Theorem 5.12.
(a) Show that for A' defined by A' = <Σ, S1, s01, δ1, F'>, where
REGULAR EXPRESSIONS
In this chapter we will develop a standard notation for denoting FAD languages and
thus explore yet another characterization of these languages. The specification of a
language by an automaton unfortunately does not provide a convenient summary of
those strings that are accepted; it is straightforward to check whether any particular
word belongs to the language, but it is often difficult to get an overall sense of the set
of accepted words. Were the language finite, the individual words could simply be
explicitly listed. The delineation of an infinite set in this manner is clearly impos-
sible.
Up to this point, we have relied on English descriptions of the languages under
consideration. Natural languages are unfortunately imprecise, and even small
machines can have impossibly complex descriptions. The concept of regular expres-
sions provides a clear and concise vehicle for denoting many of the languages we
have studied in the previous chapters.
The definition of set union and the concepts of language concatenation (Definition
5.4) and Kleene closure (Definition 5.5) afford a convenient and powerful method
for building new languages from existing ones. The expression ({a, b}·{c})*·{d} is an
infinite set built from simple alphabets and the operators presented in Chapter 5.
We will see that this type of representation is quite suitable for our purposes and is
intimately related to the finite automaton definable languages.
∇ Definition 6.1. Let Σ = {a1, a2, ..., am} be an alphabet. A regular set over Σ is
any set that can be formed by a sequence of applications of the following rules:
EXAMPLE 6.1
Let Σ = {a, b, c}. Each of the following languages is a regular set:
{λ}   {b} ∪ {c}   {a}·({b} ∪ {c})   {b}·{λ}
{a}*   ({a} ∪ {λ})·({b}*)   { }*   {c}·{ }
The multitude of set brackets in these expressions is somewhat undesirable;
we now present a common shorthand notation to represent such sets. Expressions
like {a}* will simply be written as a*, and {a}·{b} will be shortened to ab. The
notation we wish to use can be formally defined in the following recursive manner.
∇ Definition 6.2. Let Σ = {a1, a2, ..., am} be an alphabet. A regular expression
over Σ is a sequence of symbols formed by repeated application of the following
rules:
i. a1, a2, ..., am are all regular expressions, representing the regular sets
{a1}, {a2}, ..., {am}, respectively.
ii. ∅ is a regular expression representing { }.
iii. ε is a regular expression representing {λ}.
iv. If R1 and R2 are regular expressions corresponding to the sets L1 and L2, then
(R1·R2) is a regular expression representing the set L1·L2.
v. If R1 and R2 are regular expressions corresponding to the sets L1 and L2, then
(R1 ∪ R2) is a regular expression representing the set L1 ∪ L2.
vi. If R1 is a regular expression corresponding to the set L1, then (R1)* is a regular
expression representing the set L1*.
EXAMPLE 6.2
Let Σ = {a, b, c}. The regular sets in Example 6.1 can be represented by the follow-
ing regular expressions:
ε   (b ∪ c)   (a·(b ∪ c))   (b·ε)
(a)*   ((a ∪ ε)·(b)*)   (∅)*   (c·∅)
Note that each expression consists of the "basic building blocks" given by 6.2i
through 6.2iii connected by the operators ∪, ·, and * according to rules 6.2iv
through 6.2vi. Each expression is intended to denote a particular language over Σ.
Such representations of languages are by no means unique. For example,
(a·(b ∪ c)) and ((a·b) ∪ (a·c)) both represent the same set, {ab, ac}. Similarly, (b·ε)
and b both represent {b}.
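To make Definition 6.2 concrete, the following small interpreter (a sketch; the tuple encoding of expressions is my own invention, not the text's) lists the words of length at most n in the set that an expression represents; its six cases correspond directly to rules i through vi.

def lang(expr, n):
    # Words of length <= n in the set the expression represents.
    kind = expr[0]
    if kind == 'sym':                    # rule i: a single letter
        return {expr[1]} if n >= 1 else set()
    if kind == 'empty':                  # rule ii: the empty set
        return set()
    if kind == 'eps':                    # rule iii: {lambda}
        return {''}
    if kind == 'cat':                    # rule iv: concatenation
        return {x + y for x in lang(expr[1], n) for y in lang(expr[2], n)
                if len(x + y) <= n}
    if kind == 'union':                  # rule v: union
        return lang(expr[1], n) | lang(expr[2], n)
    if kind == 'star':                   # rule vi: Kleene closure
        base, words = lang(expr[1], n), {''}
        while True:
            more = {x + y for x in words for y in base if len(x + y) <= n}
            if more <= words:
                return words
            words |= more

# (a.(b U c)) from Example 6.2 denotes {ab, ac}:
e = ('cat', ('sym', 'a'), ('union', ('sym', 'b'), ('sym', 'c')))
print(sorted(lang(e, 3)))                # ['ab', 'ac']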
The intention of the parentheses is to prevent ambiguity; a·b ∪ c could mean
(a·(b ∪ c)) or ((a·b) ∪ c), and the difference is important: the first expression repre-
sents {ab, ac}, while the second represents {ab, c}, which are obviously different
languages. To ease the burden of all these parentheses, we will adopt the following
simplifying conventions.
EXAMPLE 6.3
Thus, a·b ∪ c will be taken to mean ((a·b) ∪ c), not (a·(b ∪ c)), since · has pre-
cedence over ∪. Redundant parentheses that are implied by the precedence rules
can be eliminated, and thus (((a·b) ∪ c)·d) can be written as (ab ∪ c)d. Notice that
b ∪ c* represents (b ∪ (c*)), not (b ∪ c)*. Kleene closure therefore behaves much
like exponentiation does in ordinary algebraic expressions in that it is given prece-
dence over the other operators. Concatenation and union behave much like the
algebraic operators multiplication and addition, respectively. Indeed, some texts
use + instead of ∪ for union; the symbol for concatenation already agrees with that
for multiplication (·), and we will likewise allow the symbol to be omitted in favor of
juxtaposition. The constants ∅ and ε behave much like the numbers 0 and 1 do in
algebra. The common identities x + 0 = x, x·1 = x, and x·0 = 0 have parallels in
language theory (see Lemma 6.1). Indeed, ∅ is the identity for union and ε is the
identity for concatenation.
Thus far we have been very careful to distinguish between the name of an
object and the object itself. In algebra, we are used to saying that the symbol 4
equals the string of symbols (that is, the word) 20 ÷ 5; we really mean that both
names refer to the same object, the concept we generally call the number four. (You
should be able to think of many more strings that are commonly used as a name for
this number, for example, 1111, IV, and 100₂.) We will be equally inexact here, writing
a·(b ∪ c) = (a·b) ∪ (a·c). This will be taken to mean that the sets represented by the
two expressions are equal (as is the case here; both equal {ab, ac}) and will not be
construed to mean that the two expressions themselves are identical (which is clearly
not the case here; the right-hand side has more as, more parentheses, and more
concatenation symbols).
Thus, R1 and R2 are equivalent if L(R1) = L(R2), but this is commonly abbre-
viated R1 = R2. The word "equivalent" has been seen in three different contexts so
far: there are equivalent DFAs, equivalent NDFAs, and now equivalent regular
expressions. In each case, the intent has been to equate constructs that are associ-
ated with the same language. Now that the idea of equality (equivalence) has been
established, some general identities can be outlined. The properties given in
Lemma 6.1 follow directly from the definitions of the operators.
∇ Lemma 6.1. Let Σ be an alphabet, and let R1, R2, and R3 be regular
expressions. Then:
(a) R1 ∪ ∅ = R1
(b) R1·ε = R1 = ε·R1
(c) R1·∅ = ∅ = ∅·R1
(d) R1 ∪ R2 = R2 ∪ R1
(e) R1 ∪ R1 = R1
(f) R1 ∪ (R2 ∪ R3) = (R1 ∪ R2) ∪ R3
(g) R1·(R2·R3) = (R1·R2)·R3
(h) R1·(R2 ∪ R3) = (R1·R2) ∪ (R1·R3)
(i) ε* = ε
(j) ∅* = ε
(k) (R1 ∪ R2)* = (R1* ∪ R2*)*
(l) (R1 ∪ R2)* = (R1*·R2*)*
(m) (R1*)* = R1*
(n) (R1*)·(R1*) = R1*
Furthermore, there are examples of sets for which:
(b') R1 ∪ ε ≠ R1
(d') R1·R2 ≠ R2·R1
(e') R1·R1 ≠ R1
(h') R1 ∪ (R2·R3) ≠ (R1 ∪ R2)·(R1 ∪ R3)
(k') (R1·R2)* ≠ (R1*·R2*)*
(l') (R1·R2)* ≠ (R1* ∪ R2*)*
Proof. Property (h) will be proved here. The remainder are left as exercises.
Note that identity (c) in Lemma 6.1 implies that {a, b}·∅ = ∅, which follows
immediately from the definition of concatenation. If w ∈ {a, b}·∅, then w would
have to be of the form x·y, where x ∈ {a, b} and y ∈ ∅; there are clearly no valid
choices for y, so {a, b}·∅ is empty.
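Assuming a set-of-strings representation, this little argument can even be checked mechanically (a two-line sketch of my own):

concat = lambda K, L: {x + y for x in K for y in L}   # language concatenation
print(concat({'a', 'b'}, set()))                      # set(): no valid choice for y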
Armed with the constructs and properties discussed in the first section, we will now
consider what types of languages can actually be defined by regular expressions.
How general is this method of expressing sets of words? Can the FAD languages be
represented by regular expressions? (Yes). Can all programming languages be
represented by regular expressions? (No). Are regular sets always finite automaton
definable languages? (Yes). We begin by addressing this last question.
∇ Definition 6.4. Let Σ be an alphabet. ℛ_Σ is defined to be the set of all regular
sets over Σ.
Δ
The first question to be considered is, Can every regular set be recognized by a
DFA? That is, is ℛ_Σ ⊆ 𝒟_Σ? It is clear that the "basic building blocks" are recog-
nizable. Figure 6.1 shows three NDFAs that accept { }, {λ}, and {c}, respectively.
Recalling the constructions outlined in Chapter 5, it is easy to see how to combine
these "basic machines" into machines that will accept expressions involving the
operators ∪, ·, and *.
EXAMPLE 6.4
An NDFA that accepts a ∪ b (as suggested by the proof of Theorem 5.2) is shown in
Figure 6.2. Note that it is composed of the basic building blocks for the letters a and
b, as suggested by the constructions in Figure 6.1.
Figure 6.1 NDFAs which recognize regular expressions with zero operators
EXAMPLE 6.5
An NDFA that accepts (a ∪ b)* is shown in Figure 6.3. The automaton given in
Figure 6.2 for (a ∪ b) is modified as suggested by the proof of Theorem 5.5 to
produce the Kleene closure of (a ∪ b). Recall that the "extra" state q0 was added to
ensure that λ is accepted by the new machine.
EXAMPLE 6.6
An NDFA that accepts c·(a ∪ b)* (as suggested by the proof of Theorem 5.4) is
shown in Figure 6.4.
Note that in this last example q0, t0, and s0 are disconnected states, and r1, s1,
and t1 could be coalesced into a single state. The resulting machines are not
advertised to be efficient; the main point is that they can be built. The techniques
illustrated above are used to prove the following lemma.
∇ Lemma 6.2. Let Σ be an alphabet and let R be a regular set over Σ. Then
there is a DFA that accepts R.
Proof. The proof is by induction on the number of operators in the regular
expression describing R (see the exercises). Note that Figure 6.1 effectively
illustrates the basis step: Those regular expressions with zero operators
(∅, ε, a1, a2, ..., am) do indeed correspond to FAD languages. This covers sets gen-
erated by rules i, ii, and iii of Definition 6.2. For sets corresponding to regular
expressions with a positive number of operators, the outermost operator can be
identified, and it will be either ·, ∪, or *, corresponding to an application of rule iv,
v, or vi. The induction assumption will guarantee that the subexpressions used by
the outermost operator have corresponding DFAs. Theorems 5.2, 5.4, and 5.5 can
then be invoked to argue that the entire expression has a corresponding DFA.
Δ
Since we are assured that every regular set can be accepted by a finite automa-
ton, the collection of regular sets is clearly contained in the set of FAD languages.
This also means that those languages that cannot be represented by a DFA (that is,
those contained in 𝒩_Σ) have no chance of being represented by a regular expression.
The next question we will address is whether 𝒟_Σ ⊆ ℛ_Σ, that is, whether every FAD
language can be represented by a regular expression. The reader is invited to take a
sample DFA and try to express the language it accepts by a regular expression. You
will probably be able to do it, but only by guesswork and trial and error. Our first
question appears to have a much more methodical solution: Given a regular expres-
sion, it was a relatively straightforward task to draw an NDFA (and then a DFA); in
fact, we have a set of algorithms for doing just that, and we could program a
computer to do the task for us. This second question does not seem to have an
obvious algorithm connected with it, and we will have to attack the problem using a
new concept: language equations.
In algebra, we are used to algebraic equations such as 3x + 7 = 19. Recall that
a solution to this equation is a numerical value for x that will make the equation true,
that is, make both sides equal. In the above example, there is only one choice for x,
the unique solution 4. Equations can have two different solutions, like x² = 9, no
solutions, like x² = -9, or an infinite number of solutions, like 2(x + 3) = x + 6 + x.
In a similar way, set equations can be solved, such as {a, b, c} = {a, b} ∪ X. Here X
represents a set, and we are again looking for a value for X that will make the
equation true; an obvious choice is X = {c}, but there are other choices, like
X = {b, c} (since {a, b, c} = {a, b} ∪ {b, c}). Such equations may likewise have no solu-
tions, like X ∪ {b} = {a, c}, or an infinite number of solutions, such as X ∪ {b} = X
(what sorts of sets satisfy this last equation?). We wish to look at set equations
where the sets are actually sets of strings, that is, language equations. The type of
equation in which we are most interested has one and only one solution, as outlined
in the next theorem. It is very similar in form and spirit to the theorem in algebra
that says "For any numbers a and b, where a ≠ 0, the equation ax = b has a unique
solution given by x = b ÷ a."
∇ Theorem 6.1. Let A and E be languages over an alphabet Σ, and assume
that λ ∉ A. Then the equation X = E ∪ A·X has exactly one solution, namely
X = A*·E.
Proof. First note that the set A*·E is indeed a solution to this equation, since
A*·E = E ∪ A·(A*·E) (see the exercises). Now assume that some set Y is a solution
to this equation, and let us investigate some of the properties that Y must have: If Y
is a solution, then
Y = E ∪ A·Y ⟹ (by definition of ∪)
E ⊆ Y ∧ A·Y ⊆ Y ⟹ (if E ⊆ Y, then A·E ⊆ A·Y)
A·E ⊆ A·Y ⊆ Y ⟹ (by substitution)
A·A·E ⊆ A·A·Y ⊆ A·Y ⊆ Y ⟹ (by induction)
(∀n ∈ N)(A^n·E ⊆ Y) ⟹ (by definition of A*)
A*·E ⊆ Y
Thus, every solution must contain all of A*·E, and A*·E is in this sense the smallest
solution. This is true regardless of whether or not λ belongs to A.
Now let us assume that λ ∉ A and that we have a solution W that is actually
"bigger" than A*·E; we will show that this is a contradiction, and thus all solutions
must look exactly like A*·E. If W is a solution, W ≠ A*·E, then there must be some
elements in the set W - A*·E; choose a string of minimal length from among these
elements and call it z. Thus z ∈ W and z ∉ A*·E, and since E ⊆ A*·E (why?), z ∉ E.
Since W is a solution, we have
EXAMPLE 6.7
X = {b, c} ∪ {a}·X does indeed have a solution; X can equal {a}*·{b, c}. Note also
that this is the only solution (verify, for example, that X = {a}*·{c} is not a solu-
tion). The equation Z = {b, c} ∪ {a, λ}·Z has several solutions; among them are
Z = {a}*·{b, c} and Z = {a, b, c}*.
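A bounded empirical check of Example 6.7 is easy to automate (a sketch; the helper names are mine): the words of {a}*·{b, c} of length at most n satisfy X = {b, c} ∪ {a}·X when both sides are truncated at length n.

def concat(K, L, n):
    return {x + y for x in K for y in L if len(x + y) <= n}

def star_dot(A, E, n):
    # Words of A*.E of length at most n.
    words, frontier = set(E), set(E)
    while frontier:
        frontier = concat(A, frontier, n) - words   # A^{k+1}.E from A^k.E
        words |= frontier
    return words

A, E, n = {'a'}, {'b', 'c'}, 6
X = star_dot(A, E, n)
assert X == E | concat(A, X, n)     # X = E U A.X, up to length n
print(sorted(X, key=len))           # b, c, ab, ac, aab, aac, ...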
It is instructive to explicitly list the first few elements of {a}*·{b, c} and begin to
check the validity of the solution to the first equation. If Y is a solution, then the two
sides of the equation Y = {b, c} ∪ {a}·Y must be equal. Since both b and c appear on
the right-hand side they must also be on the left-hand side, which clearly means that
they have to be in Y. Once b is known to be in Y, it will give rise to a term on the
right-hand side due to the presence of {a}·Y. Thus, a·b must also be found on the
left-hand side and therefore is in Y, and so on. The resulting sequence of implica-
tions parallels the first part of the proof of Theorem 6.1.
To see intuitively why no string other than those found in {a}*·{b, c} may
belong to a solution for X = {b, c} ∪ {a}·X, consider a string such as aa. If this were
to belong to X, then it would appear on the left-hand side and therefore would have
to appear on the right-hand side as well if the two sides were to indeed be equal. On
the right-hand side are just the two components, {b, c} and {a}·X. aa is clearly not in
{b, c}, so it must be in {a}·X, which does seem plausible; all that is necessary is for a
to be in X, and then aa will belong to {a}·X. If a is in X, though, it must also appear
on the left-hand side, and so a must be on the right-hand side as well. Again, a is not
in {b, c}, so it must be in {a}·X. This can happen only if λ belongs to X so that a·λ
will belong to {a}·X. This implies that λ must now show up on both sides, and this
leads to a contradiction: λ cannot be on the right-hand side since λ clearly is not in
{b, c}, and it cannot belong to {a}·X either, since all these words begin with an a.
This contradiction shows why aa cannot be part of any solution X.
This example illustrates the basic nature of these types of equations: for words
that are not in {a}*·{b, c}, the inclusion of that word in the solution leads to the
inclusion of shorter and shorter strings, which eventually leads to a contradiction.
This property was exploited in the second half of the proof of Theorem 6.1. Rather
than finding shorter and shorter strings, though, it was assumed we already had the
shortest, and we showed that there had to be a still shorter one; this led to the
desired contradiction more directly.
Our main goal will be to solve systems of language equations, since the re-
lationships between the terminal sets of an automaton can be described by such a
system. Systems of language equations are similar in form and spirit to systems of
algebraic equations, such as
3x1 + x2 = 10
x1 - x2 = 2
which has the unique solution x1 = 3, x2 = 1. We will look at systems of language
equations such as
X1 = ε ∪ a·X1 ∪ b·X2
X2 = ∅ ∪ b·X1 ∪ ∅·X2
which has the (unique) solution X1 = (a ∪ bb)*, X2 = b·(a ∪ bb)*. Checking that
this is a solution entails verifying that both equations are satisfied if these expres-
sions are substituted for the variables X1 and X2.
The solution of such systems parallels the solution of algebraic equations. For
example, the system
3x1 + x2 = 10
x1 - x2 = 2
can be solved by treating the second statement as an equation in just the variable x2
and solving as indicated by the algebraic theorem "For any numbers a and b, where
a ≠ 0, the equation ax = b has a unique solution given by x = b ÷ a." The second
statement can be written as (-1)x2 = 2 - x1, which then admits the solution
x2 = (2 - x1) ÷ (-1), or x2 = x1 - 2. This solution can be inserted into the first equa-
tion to eliminate x2 and form an equation solely in x1. Terms can be regrouped and
the algebraic theorem can be applied to find x1. We would have
3x1 + x2 = 10
which becomes
3x1 + (x1 - 2) = 10
or
4x1 - 2 = 10
or
4x1 = 12
or
x1 = 12 ÷ 4
yielding
x1 = 3
This value of x1 can be back-substituted to find the unique solution for x2:
x2 = x1 - 2 = 3 - 2 = 1.
Essentially, the same technique can be applied to any two equations in two
unknowns, and formulas can be developed that predict the coefficients for the
reduced set of equations. Consider the generalized system of algebraic equations
with unknowns x1 and x2, constant terms E1 and E2, and coefficients A11, A12, A21,
and A22:
A11·x1 + A12·x2 = E1
A21·x1 + A22·x2 = E2
Recall the appropriate formulas for reducing this to a single equation of the
form Â11·x1 = Ê1; the new coefficients Â11 and Ê1 can be calculated as
Ê1 = E1·A22 - E2·A12
Â11 = A11·A22 - A12·A21
A similar technique can be used to eliminate variables when there is a larger
number of equations in the system. The following theorem makes similar predic-
tions of the new coefficients for language equations.
∇ Theorem 6.2. Let n ≥ 2 and consider the system of equations in the un-
knowns X1, X2, ..., Xn given by
X1 = E1 ∪ A11X1 ∪ A12X2 ∪ ... ∪ A1(n-1)X(n-1) ∪ A1nXn
X2 = E2 ∪ A21X1 ∪ A22X2 ∪ ... ∪ A2(n-1)X(n-1) ∪ A2nXn
...
Xn = En ∪ An1X1 ∪ An2X2 ∪ ... ∪ An(n-1)X(n-1) ∪ AnnXn
Proof. Treat the nth equation as a single equation in the one unknown Xn, with
E = En ∪ An1X1 ∪ An2X2 ∪ ... ∪ An(n-1)X(n-1) and A = Ann. The solution, A*·E, is
exactly as given by part (c) above:
Xn = Ann*·(En ∪ An1X1 ∪ An2X2 ∪ ... ∪ An(n-1)X(n-1))
or
Xn = Ann*·En ∪ Ann*·An1X1 ∪ Ann*·An2X2 ∪ ... ∪ Ann*·An(n-1)X(n-1)
If there were a unique solution for the terms X1 through X(n-1), then Theorem 6.1
would guarantee a unique solution for Xn, too.
The solution for Xn can be substituted for Xn in each of the other n - 1
equations. If the kth equation is represented by
Xk = Ek ∪ Ak1X1 ∪ Ak2X2 ∪ ... ∪ AknXn
then the substitution will yield
Xk = Ek ∪ Ak1X1 ∪ Ak2X2 ∪ ...
     ∪ (Akn·(Ann*·En ∪ Ann*·An1X1 ∪ Ann*·An2X2 ∪ ... ∪ Ann*·An(n-1)X(n-1)))
By using the distributive law, this becomes
Xk = Ek ∪ Ak1X1 ∪ Ak2X2 ∪ ...
     ∪ (Akn·Ann*·En ∪ Akn·Ann*·An1X1 ∪ Akn·Ann*·An2X2 ∪ ... ∪ Akn·Ann*·An(n-1)X(n-1))
Collecting like terms yields
Xk = (Ek ∪ Akn·Ann*·En) ∪ (Ak1X1 ∪ Akn·Ann*·An1X1)
     ∪ (Ak2X2 ∪ Akn·Ann*·An2X2) ∪ ... ∪ (Ak(n-1)X(n-1) ∪ Akn·Ann*·An(n-1)X(n-1))
or
Xk = (Ek ∪ Akn·Ann*·En) ∪ (Ak1 ∪ Akn·Ann*·An1)X1
     ∪ (Ak2 ∪ Akn·Ann*·An2)X2 ∪ ... ∪ (Ak(n-1) ∪ Akn·Ann*·An(n-1))X(n-1)
The constant term in this equation is (Ek ∪ Akn·Ann*·En), which is exactly the for-
mula given for Êk in part (b). The coefficient for X1 is seen to be
(Ak1 ∪ Akn·Ann*·An1),
while the coefficient for X2 is (Ak2 ∪ Akn·Ann*·An2), and so on. The coefficient for Xj
would then be Âkj = Akj ∪ (Akn·Ann*·Anj), which also agrees with the formula given
in part (b). This is why the solution of the original set of equations agrees with the
solution of the set of n - 1 equations given in part (b).
Part (a) is proved by induction on n: the method outlined above can be
repeated on the new set of n - 1 equations to eliminate X(n-1), and so on, until one
equation in the one unknown X1 is obtained. Theorem 6.1 will guarantee a unique
solution for X1, and part (c) can then be used to find the unique solution for X2, and
so on.
Δ
EXAMPLE 6.8
X1 = ε ∪ a·X1 ∪ b·X2
X2 = ∅ ∪ b·X1 ∪ ∅·X2
The proof of Theorem 6.2 implies the solution for X1 will agree with the solution to
the one-variable equation X1 = Ê1 ∪ Â11X1, where
Ê1 = E1 ∪ (A12·A22*·E2) = ε ∪ (b·∅*·∅) = ε ∪ (b·ε·∅) = ε ∪ ∅ = ε,
and
Â11 = A11 ∪ (A12·A22*·A21) = a ∪ (b·∅*·b) = a ∪ (b·ε·b) = a ∪ bb.
By Theorem 6.1, X1 = (Â11)*·Ê1 = (a ∪ bb)*.
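The elimination just performed is mechanical enough to sketch in code (the string representation of coefficients and the helper names are mine; None stands for ∅, and no simplification of the resulting expressions is attempted).

def union(r, s):
    if r is None: return s
    if s is None: return r
    return f"({r} U {s})"

def cat(r, s):
    if r is None or s is None: return None    # concatenation with 0 gives 0
    return f"{r}.{s}"

def star(r):
    return "e" if r is None else f"({r})*"    # 0* = e

def eliminate_last(A, E):
    # A is an n x n coefficient matrix, E a length-n constant vector;
    # return the (n-1)-variable system with X_n eliminated, applying
    # E_k := E_k U A_kn.A_nn*.E_n and A_kj := A_kj U A_kn.A_nn*.A_nj.
    n = len(E)
    loop = star(A[n-1][n-1])
    newA = [[union(A[k][j], cat(cat(A[k][n-1], loop), A[n-1][j]))
             for j in range(n-1)] for k in range(n-1)]
    newE = [union(E[k], cat(cat(A[k][n-1], loop), E[n-1]))
            for k in range(n-1)]
    return newA, newE

# The system of Example 6.8: eliminating X2 leaves X1 = e U (a U b.e.b).X1,
# whose unique solution by Theorem 6.1 is (a U bb)*.
A = [["a", "b"], ["b", None]]
E = ["e", None]
newA, newE = eliminate_last(A, E)
print(newE[0], "U", f"({newA[0][0]}).X1")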
We will now see that the language accepted by a DFA can be equated with the
solution of a set of language equations, which will allow us to prove the following
important theorem. For a DFA A = <Σ, {s1, s2, ..., sn}, s1, δ, F>, the constant terms
of the corresponding system are given by
Ei = ∅ if si ∉ F, and Ei = ε if si ∈ F, for i = 1, 2, ..., n
and the coefficients by
Aij = the union of all letters a ∈ Σ for which δ(si, a) = sj.
That is, Aij represents the set of all letters that cause a transition from state si to state
sj. Notice that since λ ∉ Σ, none of the sets Aij contains the empty string, and
therefore by Theorem 6.2, there is a unique solution to the system.
However, these equations exactly describe the relationships between the terminal
sets denoted by X1, X2, ..., Xn at the beginning of this proof (compare with Exam-
ple 6.11), and hence the solution will represent exactly those quantities. In particu-
lar, the solution for X1 will be a regular expression for L(A), that is, for L.
Δ
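Reading the system off a DFA is equally mechanical. In the sketch below (dict encoding assumed, as before), Aij collects the letters leading from si to sj and Ei is ε or ∅ according to whether si is final; the sample machine is the DFA of Example 6.9, which follows.

def dfa_equations(states, sigma, delta, finals):
    # Return (A, E) with entries given as regex strings; None stands for 0.
    idx = {s: i for i, s in enumerate(states)}
    n = len(states)
    A = [[None] * n for _ in range(n)]
    for s in states:
        for ch in sigma:
            i, j = idx[s], idx[delta[(s, ch)]]
            A[i][j] = ch if A[i][j] is None else f"({A[i][j]} U {ch})"
    E = ["e" if s in finals else None for s in states]
    return A, E

# Odd number of b's: X1 = 0 U a.X1 U b.X2 and X2 = e U b.X1 U a.X2.
states, sigma, finals = [1, 2], ['a', 'b'], {2}
delta = {(1, 'a'): 1, (1, 'b'): 2, (2, 'a'): 2, (2, 'b'): 1}
print(dfa_equations(states, sigma, delta, finals))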
EXAMPLE 6.9
Consider the DFA B given by the diagram in Figure 6.5, which accepts all strings
with an odd number of bs over {a, b}. This machine generates the following system
of language equations:
X1 = ∅ ∪ aX1 ∪ bX2
X2 = ε ∪ bX1 ∪ aX2
which will have the same solution for X1 as the equation
X1 = Ê1 ∪ Â11X1
where
Ê1 = E1 ∪ (A12·A22*·E2) = ∅ ∪ (b·a*·ε) = b·a*
and
Â11 = A11 ∪ (A12·A22*·A21) = a ∪ (b·a*·b)
so that X1 = (a ∪ b·a*·b)*·b·a*.
EXAMPLE 6.10
Consider again the system described in Example 6.8. This can be thought of as the
set of language equations corresponding to the NDFA called B, illustrated in Figure
6.6a. Note that L(B) is indeed the given solution: L(B) = X1 = (a ∪ bb)*. Notice the
similarity between B and the machine C shown in Figure 6.6b, which has s2 as the
start state. Note that L(C) is given by X2 = b·(a ∪ bb)*, where X2 was the other part
of the solution given in Example 6.8 (verify this). Finally, consider a similar machine
D in Figure 6.6c with both s1 and s2 as start states. Can you quickly write a regular
expression that describes the language accepted by D?
EXAMPLE 6.11
Regular expressions for machines with more than two states can be found by
repeated application of the technique described in Theorem 6.2. For example,
consider the three-state DFA given in Figure 6.7. The solution for this three-state
machine will be explored shortly. We begin by illustrating the natural relationships
between the terminal sets described in Theorem 6.3. First let us note that the
language accepted by this machine includes:
and rewritten as
X1 = ε ∪ bX1 ∪ aX2 ∪ cX3
X2 = bX1 ∪ cX1 ∪ aX3
X3 = aX1 ∪ bX1 ∪ cX2
The union of these four classes should equal X1, which is exactly what the first
equation states.
X2 = bX1 ∪ cX1 ∪ aX3 can be interpreted similarly; ε does not appear in this
equation because there is no way to reach a final state from s2 if no letters are
processed. If at least one letter is processed, then that first letter is an a, b, or c. If it
is a, then we move from state s2 to s3, and the remainder of the string must take us to
a final state from s3 (that is, the remainder must belong to X3). Strings that begin
with an a and are followed by a string from X3 can easily be described by a·X3.
Similarly, strings that start with b or c must move from s2 to s1, and then be followed
by a string from X1. These strings are described by b·X1 and c·X1. The three cases
for reaching a final state from s2 that have just been described are exhaustive (and
mutually exclusive), and so their union should equal all of X2. This is exactly the
relation expressed by the second equation, X2 = bX1 ∪ cX1 ∪ aX3. The last equation
admits a similar interpretation.
None of the above observations are necessary to actually solve the system! The
preceding discussion is intended to illustrate that the natural relationships between
the terminal sets described by Theorem 6.3 and the correspondences we have so
laboriously developed here are succinctly predicted by the language equations.
Once the equations are written down, we can simply apply Theorem 6.2 and reduce
to a system with only two unknowns. We have
E1 = ε, E2 = ∅, E3 = ∅
A11 = b, A12 = a, A13 = c
A21 = b ∪ c, A22 = ∅, A23 = a
A31 = a ∪ b, A32 = c, A33 = ∅
EXAMPLE 6.12
Consider the automaton shown in Figure 6.8. It is similar to the one in Example
6.11, but it now gives rise to four equations in four unknowns. As these equations
are solved, the final Â11 coefficient for X1 will again describe strings that, when
starting at s1 in the automaton, return you to s1 again for the first time; it will agree
with Â11 in Example 6.11. The final constant term associated with X1 (that is, Ê1)
will represent all those strings that deposit you in a final state from s1 without ever
returning to s1. In this automaton, this will be given by Ê1 = de*. Â11*·Ê1 therefore
represents strings that go from s1 back to s1 any number of times, followed by a
string that leaves s1 (for the last time) for a final state.
In general, the final coefficient and constant terms can always be interpreted
in this manner. In Example 6.11, the only way to reach a final state from s1 and
avoid having to return again to s1 was to not leave in the first place; this was reflected
by the fact that Ê1 = ε.
EXAMPLE 6.13
Consider the automaton illustrated in Figure 6.9, which is identical to the DFA in
Example 6.11 except for the placement of the final state. Even though the initial
system of three equations is now different, we can expect Â11 to compute to the
same expression as before. Since Ê1 is supposed to represent all those strings that
deposit you in a final state from s1 without ever returning to s1, one should be able to
predict that the new final constant term will look like Ê1 = a(ac)* ∪ c(ca)*c. An
expression for the language recognized by this automaton would then be given by
X1 = (Â11)*·Ê1
   = (b ∪ c(a ∪ b) ∪ (a ∪ cc)(ac)*((b ∪ c) ∪ a(a ∪ b)))*·(a(ac)* ∪ c(ca)*c)
It may often be convenient to eliminate a variable other than the one that is
numerically last. This can be accomplished by appropriately renumbering the un-
knowns and applying Theorem 6.2 to the new set of equations. For convenience, we
state an analog of Theorem 6.2 that allows the elimination of the mth unknown
from a set of n equations in n unknowns. The following lemma agrees with Theorem
6.2 if m = n.
∇ Lemma 6.3. Let n and m be positive integers and let m ≤ n. Consider the
system of n ≥ 2 equations in the unknowns X1, X2, ..., Xn given by
Xk = Ek ∪ Ak1X1 ∪ Ak2X2 ∪ ... ∪ AknXn, for k = 1, 2, ..., n
in which (∀i, j)(λ ∉ Aij).
The unknown Xm can be eliminated from this system to form the following n - 1
equations in the unknowns X1, X2, ..., X(m-1), X(m+1), ..., Xn:
Xk = Êk ∪ Âk1X1 ∪ Âk2X2 ∪ ... ∪ Âk(m-1)X(m-1) ∪ Âk(m+1)X(m+1) ∪ ... ∪ ÂknXn,
for k = 1, 2, ..., m - 1, m + 1, ..., n
where
Êi = Ei ∪ (Aim·Amm*·Em), for all i = 1, 2, ..., m - 1, m + 1, ..., n
and
Âij = Aij ∪ (Aim·Amm*·Amj), for all i, j = 1, 2, ..., m - 1, m + 1, ..., n
Furthermore, once the solution to the above n - 1 equations is known, that solution
can be used to find the remaining unknown:
Xm = Amm*·(Em ∪ Am1X1 ∪ Am2X2 ∪ ... ∪ Am(m-1)X(m-1)
     ∪ Am(m+1)X(m+1) ∪ ... ∪ AmnXn)
Proof. The proof follows from a renumbering of the equations given in
Theorem 6.2.
Δ
A significant reduction in the size of the expressions representing the solutions
can often be achieved by carefully choosing the order in which to eliminate the
unknowns. This situation can easily arise when solving language equations that
correspond to finite automata. For example, consider the DFA illustrated in Figure
6.10. The equations for this machine are given by
X1 = ∅ ∪ ∅X1 ∪ (0 ∪ 1)X2 ∪ ∅X3
X2 = ε ∪ 0X1 ∪ 1X2 ∪ ∅X3
X3 = ∅ ∪ ∅X1 ∪ (0 ∪ 1)X2 ∪ ∅X3
Using Theorem 6.2 to methodically solve for X1, X2, and X3 involves eliminating X3
and then eliminating X2. Theorem 6.1 can then be used to solve for X1, and then the
back-substitution rules can be employed to find X2 and X3. The regular expressions
found in this manner are quite complex. A striking simplification can be made by
eliminating X3 and then eliminating X1 (instead of X2). The solution for X2 is quite
concise, which leads to simple expressions for X1 and X3 during the back-
substitution phase (see Exercise 6.19).
Let A = <Σ, {s1, s2, ..., sn}, s1, δ, F> be a deterministic finite automaton. We
have seen that the relationships between the terminal sets T(A, si) described in
Chapter 3 give rise to a system of equations. Similarly, the initial sets I(A, si)
defined in Chapter 2 are also interrelated. Recall that, for a state si, I(A, si) is
comprised of strings that, when starting in the start state, lead to the state si. That is,
I(A, si) = {x | δ(s1, x) = si}. The equations we have discussed to this point have been
right linear; that is, the unknowns Xi appear to the right of their coefficients. The
initial sets for an automaton are also related by a system of equations, but these
equations are left linear; the unknowns Yi appear to the left of their coefficients.
The solution for sets of left-linear equations parallels that of right-linear systems.
∇ Theorem 6.4. Let n and m be positive integers and let m ≤ n. Consider the
system of n ≥ 2 equations in the unknowns Y1, Y2, ..., Yn given by
Yk = Ik ∪ Y1Bk1 ∪ Y2Bk2 ∪ ... ∪ YnBkn, for k = 1, 2, ..., n
in which (∀i, j)(λ ∉ Bij).
a. The unknown Ym can be eliminated from this system to form the following
n - 1 equations in the unknowns Y1, Y2, ..., Y(m-1), Y(m+1), ..., Yn:
Yk = Îk ∪ Y1B̂k1 ∪ Y2B̂k2 ∪ ... ∪ Y(m-1)B̂k(m-1) ∪ Y(m+1)B̂k(m+1) ∪ ... ∪ YnB̂kn,
for k = 1, 2, ..., m - 1, m + 1, ..., n
where
Îi = Ii ∪ (Im·Bmm*·Bim), for all i = 1, 2, ..., m - 1, m + 1, ..., n
and
B̂ij = Bij ∪ (Bmj·Bmm*·Bim), for all i, j = 1, 2, ..., m - 1, m + 1, ..., n
b. Once the solution to the above n - 1 equations is known, that solution can be
used to find the remaining unknown:
Ym = (Im ∪ Y1Bm1 ∪ Y2Bm2 ∪ ... ∪ Y(m-1)Bm(m-1)
     ∪ Y(m+1)Bm(m+1) ∪ ... ∪ YnBmn)·Bmm*
∇ Lemma 6.4. Let A = <Σ, {s1, s2, ..., sn}, S0, δ, F> be an NDFA. For each
i = 1, 2, ..., n, let the initial set I(A, si) = {x | si ∈ δ(S0, x)} be denoted by Yi. The
unknowns Y1, Y2, ..., Yn satisfy a system of n left-linear equations of the form
Yk = Ik ∪ Y1Bk1 ∪ Y2Bk2 ∪ ... ∪ YnBkn, for k = 1, 2, ..., n
where the coefficients are given by
Ii = ∅ if si ∉ S0, and Ii = ε if si ∈ S0, for i = 1, 2, ..., n
and
Bij = the union of all letters a ∈ Σ for which si ∈ δ(sj, a), for i, j = 1, 2, ..., n
In contrast to Theorem 6.3, where Aij represented the set of all letters that
cause a transition from state si to state sj, Bij represents the set of all letters that
cause a transition from state sj to state si. That is, Bij = Aji. In the definition in
Theorem 6.3, Ei represented the set of all strings of length zero that can reach final
states from si. Compare this with the definition of Ii above, which represents the set
of all strings of length zero that can reach si from a start state.
The techniques outlined in Theorems 6.1, 6.2, and 6.3 provide the second half of the
correspondence between regular sets and FAD languages. As a consequence, regu-
lar expressions and automata characterize exactly the same class of languages.
Thus the terms FAD language and regular set can be used interchangeably,
since languages accepted by finite automata can be described by regular expres-
sions, and vice versa. Such languages are often referred to as regular languages. The
correspondence will allow, for example, the pumping lemma to be invoked to justify
that certain languages cannot be represented by any regular expression.
ℛ_Σ is therefore closed under every operator for which 𝒟_Σ is closed. We have
now seen two representations for FAD languages, and a third will be presented in
Chapter 8. Since there are effective algorithms for switching from one representa-
tion to another, we may use whichever vehicle is most convenient to describe a
language or prove properties about regular languages. For example, we may use
whichever concept best lends itself to the proof of closure properties. The justifica-
tion that ℛ_Σ is closed under union follows immediately from Definition 6.1; much
more effort was required in Chapter 5 to prove that the union of two languages
represented by DFAs could be represented by another DFA. On the other hand,
attempting to justify closure under complementation by using regular expressions is
an exercise in frustration. We will now see that closure under substitution is con-
veniently proved via regular expressions.
A substitution is similar to a language homomorphism (Definition 5.8), in
which letters were replaced by single words. Substitutions will denote the methodi-
cal replacement of the individual letters within a regular expression with sets of
words. The only restriction on these sets is that they must also be regular expres-
sions, though not necessarily over the same alphabet.
∇ Definition 6.5. Let Σ = {a1, a2, ..., am} be an alphabet and let Γ be a second
alphabet. Given regular expressions R1, R2, ..., Rm over Γ, define a regular set
substitution s: Σ → ℘(Γ*) by s(ai) = Ri for each i = 1, 2, ..., m, which can be ex-
tended to s: Σ* → ℘(Γ*) by
s(λ) = ε
and
(∀a ∈ Σ)(∀x ∈ Σ*)(s(a·x) = s(a)·s(x))
s can be further extended to operate on a language L ⊆ Σ* by defining
s(L) = ∪_{z ∈ L} s(z)
In this context, s: ℘(Σ*) → ℘(Γ*).
Δ
EXAMPLE 6.14
Let Σ = {0} and Γ = {a, b}. Define s(0) = (a ∪ b)·(a ∪ b). From the recursive defini-
tion, s(00) = (a ∪ b)·(a ∪ b)·(a ∪ b)·(a ∪ b). Furthermore, the language s(0*) rep-
resents all even-length strings over {a, b}.
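A substitution can also be applied directly to the tuple-encoded expressions used in the earlier interpreter sketch (again an illustrative encoding of mine, not the text's): each letter node is simply replaced by the regular expression it is mapped to, exactly as in Definition 6.5.

def substitute(expr, s):
    # s maps each letter to a regular expression in the same tuple encoding.
    kind = expr[0]
    if kind == 'sym':
        return s[expr[1]]                # s(a_i) = R_i
    if kind in ('empty', 'eps'):
        return expr
    if kind == 'star':
        return ('star', substitute(expr[1], s))
    return (kind, substitute(expr[1], s), substitute(expr[2], s))

# Example 6.14: s(0) = (a U b).(a U b) turns 0* into all even-length words.
ab = ('union', ('sym', 'a'), ('sym', 'b'))
print(substitute(('star', ('sym', '0')), {'0': ('cat', ab, ab)}))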
The definition of s(L) for a language L allows the domain of the substitution to
be extended all the way to s: ℘(Σ*) → ℘(Γ*). It can be proven that the image of ℛ_Σ
under s is contained in ℛ_Γ (see the exercises); however, the image of 𝒩_Σ under s is
not completely contained in 𝒩_Γ.
In Example 6.14, the language 0* was regular and so was its image under s.
Neither of the sets described in the second example was regular. It is possible to
start with a nonregular set and define a substitution that produces a regular set (see
Lemma 6.5), but it is impossible for the image of a regular set to avoid being regular,
as shown by the next theorem.
∇ Lemma 6.5. The analogous result does not always hold for the nonregular sets.
a. There are examples of regular set substitutions s: Σ → ℘(Σ*) for which 𝒩_Σ is
not closed under s.
b. There are examples of regular set substitutions t: Σ → ℘(Σ*) for which 𝒩_Σ is
closed under t.
Proof. (a) 𝒩_Σ is not closed under some substitutions. Let Σ = {a, b} and define
s(a) = (a ∪ b) and s(b) = (a ∪ b). The image of the nonregular set
L = {x | |x|a = |x|b}
is the set of even-length words, which is regular. Thus L ∈ 𝒩_Σ but s(L) ∉ 𝒩_Σ.
(b) 𝒩_Σ is closed under some substitutions. Some substitutions do preserve non-
regularity (such as the identity substitution t, since for any language L, t(L) = L). In
this case, (∀L)(L ∈ 𝒩_Σ ⟹ t(L) ∈ 𝒩_Σ) and therefore 𝒩_Σ is closed under t.
Δ
Note that a substitution in which each Ri is a single string then conforms to Defini-
tion 5.8 and represents a language homomorphism.
EXERCISES
6.1. Let Σ = {a, b}. Give (if possible) a regular expression that describes the set of all
even-length words in Σ*.
6.2. Let Σ = {a, b}. Give (if possible) a regular expression that describes the set of all words
x in Σ* for which |x| ≤ 2.
6.3. Let Σ = {a, b}. Give (if possible) a regular expression that describes the set of all words
x in Σ* for which |x|a = |x|b.
6.4. Let Σ = {a, b, c}. Give a regular expression that describes the set of all odd-length
words in Σ* that do not end in b.
6.5. Let Σ = {a, b, c}. Give a regular expression that describes the set of all words in Σ* that
do not contain two consecutive cs.
6.6. Let Σ = {a, b, c}. Give a regular expression that describes the set of all words in Σ* that
do contain two consecutive cs.
6.7. Let Σ = {a, b, c}. Give a regular expression that describes the set of all words in Σ* that
do not contain any cs.
6.8. Let Σ = {0, 1}. Give, if possible, regular expressions that will describe each of the
following languages. Try to write these directly from the descriptions (that is, avoid
relying on the nature of the corresponding automata).
(a) L1 = {x | |x| mod 3 = 2}
(b) L2 = Σ* - {w | ∃n ≥ 1 such that w = a1···an ∧ an = 1}
(c) L3 = {y | |y|0 > |y|1}
6.9. Let Σ = {a, b, c}. Give, if possible, regular expressions that will describe each of the
following languages. Try to write these directly from the descriptions (that is, avoid
relying on the nature of the corresponding automata).
(a) L1 = {x | (|x|a is odd) ∧ (|x|b is even)}
(b) L2 = {y | (|y|c is even) ∨ (|y|b is odd)}
(c) L3 = {z | (|z|a is even)}
(d) L4 = {z | |z|a is a prime number}
(e) L5 = {x | abc is a substring of x}
(f) L6 = {x | acaba is a substring of x}
(g) L7 = {x ∈ {a, b, c}* | |x|a ≡ 0 mod 3}
6.10. Let Σ = {a, b, d}. Give a regular expression that will describe
Ψ = {x ∈ Σ* | (x begins with d) ∨ (x contains two consecutive bs)}.
6.11. Let Σ = {a, b, c}. Give a regular expression that will describe
Φ = {x ∈ Σ* | every b in x is immediately followed by c}.
6.12. Let Σ = {0, 1, 2, 3, 4, 5, 6, 7, 8, 9}. Give a regular expression that will describe
Γ = {x ∈ Σ* | the number represented by x is evenly divisible by 3}
= {λ, 0, 00, 000, ..., 3, 03, 003, ..., 6, 9, 12, 15, ...}.
6.13. Let Σ = {0, 1, 2, 3, 4, 5, 6, 7, 8, 9}. Give a regular expression that will describe
K = {x ∈ Σ* | the number represented by x is evenly divisible by 5}.
6.14. Use the exact constructs given in the theorems of Chapter 5 to build an NDFA that
accepts b ∪ a*c (refer to Examples 6.4, 6.5, and 6.6). Do not simplify your answer.
6.15. Give examples of sets that demonstrate the following inequalities listed in Lemma 6.1:
(a) R1 ∪ ε ≠ R1
(b) R1·R2 ≠ R2·R1
(c) R1·R1 ≠ R1
(d) R1 ∪ (R2·R3) ≠ (R1 ∪ R2)·(R1 ∪ R3)
(e) (R1·R2)* ≠ (R1*·R2*)*
(f) (R1·R2)* ≠ (R1* ∪ R2*)*
Find other examples of sets that show the following expressions may be equal under
some conditions:
(g) R1 ∪ ε = R1
(h) R1·R2 = R2·R1 (even if R1 ≠ R2)
(i) R1·R1 = R1
(j) R1 ∪ (R2·R3) = (R1 ∪ R2)·(R1 ∪ R3) (even if R1 ≠ R2 ≠ R3 ≠ R1)
(k) (R1·R2)* = (R1*·R2*)* (even if R1 ≠ R2)
(l) (R1·R2)* = (R1* ∪ R2*)* (even if R1 ≠ R2)
6.16. Prove the equalities listed in Lemma 6.1.
6.17. (a) Consider Theorem 6.1. Find examples of sets A and E that will show that A*·E is
not a unique solution if λ ∈ A.
(b) Find examples of sets A and E that will show that A*·E can be the unique solution
even if λ ∈ A.
6.18. Solve the following set of language equations for X0 and X1 over {0, 1}*:
X0 = (0 ∪ 1)X1
X1 = ε ∪ 1X0 ∪ 0X1
Do you see any relation between these equations and the DFA A in Example 3.4?
6.19. (a) Solve the following set of language equations for X1, X2, and X3 by eliminating X3
and then eliminating X2. Solve for X1 and then back-substitute to find X2 and X3.
Note that these equations arise from the automaton in Figure 6.10.
X1 = ∅ ∪ ∅X1 ∪ (0 ∪ 1)X2 ∪ ∅X3
X2 = ε ∪ 0X1 ∪ 1X2 ∪ ∅X3
X3 = ∅ ∪ ∅X1 ∪ (0 ∪ 1)X2 ∪ ∅X3
(b) Rework part (a) by eliminating X3 and then eliminating X1 (instead of X2).
(c) How does the solution in part (b) compare to the solution in part (a)? Is one more
concise? Are they equivalent?
6.20. Prove Lemma 6.2. [Hint: Let P(m) be the statement that "Every regular expression R
with m or fewer operators represents a regular set that is FAD," and induct on m.]
6.21. Let Σ = {a, b, c}. Find all solutions to the language equation X = X ∪ {b}.
6.22. Prove that, for any languages A and E, A*·E = E ∪ A·(A*·E).
6.23. Give a regular expression that will describe the intersection of the regular sets
(ab ∪ b)*a and (ba ∪ a)*.
6.24. Develop an algorithm that, when applied to two regular expressions, will generate an
expression describing their intersection.
6.25. Verify by direct substitution that X1 = (a ∪ bb)* and X2 = b·(a ∪ bb)* is a solution to
X1 = ε ∪ a·X1 ∪ b·X2
X2 = ∅ ∪ b·X1 ∪ ∅·X2
6.26. (a) Find L(D) for the machine D described in Example 6.10.
(b) Generalize your technique: For a machine A with start states si1, si2, ..., sim, L(A) is
given by what expression?
6.27. Let Σ = {a, b}. Give a regular expression that will describe the complement of the
regular set (ab ∪ b)*a.
6.28. Develop an algorithm that, when applied to a regular expression, will generate an
expression describing the complement.
6.29. Let Σ = {a, b, c}. Define E(L) = {z | (∃y ∈ Σ⁺)(∃x ∈ L)(z = yx)}. Use the regular expres-
sion concepts given in this chapter to argue that ℛ_Σ is closed under the operator E (that
is, don't build a new automaton; build a new regular expression from the old expres-
sion).
6.30. Let Σ = {a, b, c}. Define B(L) = {z | (∃x ∈ L)(∃y ∈ Σ*)(z = xy)}. Use the regular expres-
sion concepts given in this chapter to argue that ℛ_Σ is closed under the operator B (that
is, don't build a new automaton; build a new regular expression from the old expres-
sion).
6.31. Let Σ = {a, b, c}. Define M(L) = {z | (∃x ∈ L)(∃y ∈ Σ⁺)(z = xy)}. Use the regular ex-
pression concepts given in this chapter to argue that ℛ_Σ is closed under the operator M
(that is, don't build a new automaton; build a new regular expression from the old
expression).
6.32. (a) Let Σ = {a, b, c}. Show that there does not exist a unique solution to the following set of language equations:
X₁ = b ∪ λ·X₁ ∪ a·X₂
X₂ = c ∪ ∅·X₁ ∪ λ·X₂
(b) Does this contradict Theorem 6.2? Explain.
6.33. Solve the following set of language equations for X₀ and X₁ over {0, 1}*:
X₀ = 0*1 ∪ (10)*X₀ ∪ 0(0 ∪ 1)X₁
X₁ = λ ∪ 1*01X₀ ∪ 0X₁
6.34. Let Σ = {a, b, c}.
(a) Give a regular expression that describes the set of all words in Σ* that end with c and for which aa, bb, and cc never appear as substrings.
(b) Give a regular expression that describes the set of all words in Σ* that begin with c and for which aa, bb, and cc never appear as substrings.
6.35. Let Σ = {a, b, c}.
(a) Give a regular expression that describes the set of all words in Σ* that contain no more than two cs.
(b) Give a regular expression that describes the set of all words in Σ* that do not have exactly one c.
6.36. Recall that the reverse of a word x, written xʳ, is the word written backward. The reverse of a language is likewise given by Lʳ = {xʳ | x ∈ L}. Let Σ = {a, b, c}.
(a) Note that (R₁ ∪ R₂)ʳ = (R₁ʳ ∪ R₂ʳ) for any regular sets R₁ and R₂. Give similar equivalences for each of the rules in Definition 6.1.
(b) If L were represented by a regular expression, explain how to generate a regular expression representing Lʳ (compare with the technique used in the proof of Theorem 6.6).
(c) Prove part (b) by inducting on the number of operators in the expression.
(d) Use parts (a), (b), and (c) to argue that ℛ_Σ is closed under the operator ʳ.
6.37. Complete the details of the proof of Theorem 6.4.
6.38. Let Σ = {a, b, c}.
(a) Give a regular expression that describes the set of all words in Σ* for which no b is immediately preceded by a.
(b) Give a regular expression that describes the set of all words in Σ* that contain exactly two cs and for which no b is immediately preceded by a.
6.39. Let Σ = {a, b, c}.
(a) Give a regular expression that describes the set of all words in Σ* for which no b is immediately preceded by c.
(b) Give a regular expression that describes the set of all words in Σ* that contain exactly one c and for which no b is immediately preceded by c.
6.40. (a) Use Theorem 6.3 to write the two right-linear equations in two unknowns corre-
sponding to the NDFA given in Figure 6.11.
6.42. (a) Use Theorem 6.3 to write the seven right-linear equations in seven unknowns
corresponding to the NDFA given in Figure 6.13.
(b) Solve these equations for all seven unknowns. Hint: Make use of the simple nature
of these equations to eliminate variables without appealing to Theorem 6.2.
(c) Give a regular expression that corresponds to the language accepted by this
NDFA.
(d) Rework the problem with seven left-linear equations.
6.43. Prove that, for any languages A, B, and Y, if B ⊆ Y, then A·B ⊆ A·Y.
6.44. Let Σ be an alphabet, and let s: Σ → Γ* be a substitution.
(a) Prove that the image of ℛ_Σ under s is contained in ℛ_Γ.
(b) Give an example to show that the image of X_Σ under s need not be completely contained in X_Γ.
6.45. Give a detailed proof of Lemma 6.3.
6.46. Let Σ = {a, b} and B = {x ∈ Σ* | x contains (at least) two consecutive bs ∧ x does not contain two consecutive as}. Draw a machine that will accept B.
6.47. Let Σ = {a, b, c}. Give regular expressions that will describe:
(a) {x ∈ {a, b, c}* | every b in x is eventually followed by c}; that is, x might look like baabacaa, or bcacc, and so on.
(b) {x ∈ {a, b, c}* | every b in x is immediately followed by c}.
6.48. Let Σ = {a, b}. Give, if possible, regular expressions that will describe each of the following languages. Try to write these directly from the descriptions (that is, avoid relying on the nature of the corresponding automata).
(a) The language consisting of all words that have neither consecutive as nor
consecutive bs.
(b) The language consisting of all words that begin and end with different letters.
(c) The language consisting of all words for which the last two letters match.
(d) The language consisting of all words for which the first two letters match.
(e) The language consisting of all words for which the first and last letters match.
6.49. The set of all valid regular expressions over {a, b} is a language over the alphabet {a, b, (, ), ∪, ·, *, ∅, λ}. Show that this language is not FAD.
6.50. Give regular expressions corresponding to the languages accepted by each of the
NDFAs listed in Figure 6.14.
6.51. Complete the details of the proof of Theorem 6.6.
6.52. Prove Lemma 6.4.
6.53. Corollary 6.3 followed immediately from Theorem 6.6. Show that Theorems 5.2, 5.4, and 5.5 are also corollaries of Theorem 6.6.
[Figure 6.14: the NDFAs referenced in Exercise 6.50; diagrams not reproduced]
6.54. Let F be the collection of languages that can be formed by repeated application of the following five rules:
i. {a} ∈ F and {b} ∈ F
ii. { } ∈ F
iii. {λ} ∈ F
iv. If F₁ ∈ F and F₂ ∈ F, then F₁·F₂ ∈ F
v. If F₁ ∈ F and F₂ ∈ F, then F₁ ∪ F₂ ∈ F
Describe the class of languages generated by these five rules.
CHAPTER 7
FINITE-STATE TRANSDUCERS
We have seen that finite-state acceptors are by no means robust enough to accept
standard computer languages like Pascal. Furthermore, even if a DFA could
reliably recognize valid Pascal programs, a machine that only indicates "Yes, this is
a valid program" or "No, this is not a valid program" is certainly not all we expect
from a compiler. To emulate a compiler, it is necessary to have a mechanism that
will produce some output other than a simple yes or no: in this case, we would
expect the corresponding machine language code (if the program compiled success-
fully) or some hint as to the location and nature of the syntax errors (if the program
was invalid).
A machine that accepts input strings and translates them into output strings is
called a sequential machine or transducer. Our conceptual picture of such a device is
only slightly different from the model of a DFA shown in Figure 7.1a. We still have
a finite-state control and an input tape with a read head, but the accept/reject
indicator is replaced by an output tape and writing device, as shown in Figure 7.1b.
These machines do not have the power to model useful compilers, but they can
be employed in many other areas. Applications of sequential machine concepts are
by no means limited to the computer world or even to the normal connotations
associated with "read" and "write." A vending machine is essentially a transducer
that interprets inserted coins and button presses as valid inputs and returns candy
bars and change as output. Elevators, traffic lights, and many other common de-
vices that monitor and react to limited stimuli can be modeled by finite-state
transducers.
The vending machine analogy illustrates that the types of input to a device
(coins) may be very different from the types of output (candy bars). In terms of our
conceptual model, the read head may be capable of recognizing symbols that are different from those that the output head can print. Thus we will have an output alphabet Γ that is not necessarily the same as our input alphabet Σ.
Also essential to our model is a rule that governs what characters are printed.
For our first type of transducer, this rule will depend on both the current internal
state of the machine and the current symbol being scanned by the read head and will
be represented by the function w. Finally, since we are dealing with translation
rather than acceptance/rejection, there is no need to single out accepting states: the
concept of final states can be dispensed with entirely.
EXAMPLE 7.1
Let V = <{n, d, q, b}, {φ, n′, d′, q′, c₀, c₁, c₂, c₃, c₄}, S, s₀, δ, ω> be the FST illustrated in Figure 7.2. V describes the action of a candy machine that dispenses 30¢ Chocolate Explosions. n, d, q denote inputs of nickels, dimes, and quarters (respectively), and b denotes the act of pushing the button to select a candy bar. φ, n′, d′, q′, c₀, c₁, c₂, c₃, c₄ represent the vending machine's responses to these inputs: it may do nothing, return the nickel that was just inserted, return the dime, return the quarter, or dispense a candy bar with 0, 1, 2, 3, or 4 nickels as change, respectively. Note that the transitions agree with the vending machine model presented in Chapter 1; the new model now specifies the action corresponding to the given input.
It is relatively simple to modify the above machine to include a new input r that
signifies that the coin return has been activated and a new output a representing the
release of all coins that have been inserted (see the exercises).
EXAMPLE 7.2
The FST B = <{a, b}, {0, 1}, {s₀, s₁}, s₀, δ, ω> is described by the following state transition and output tables:

δ  | a   b
s₀ | s₀  s₁
s₁ | s₀  s₁

ω  | a   b
s₀ | 0   0
s₁ | 1   1
It should be clear that the above discussion illustrates a very awkward way of describing translations. While ω describes the way in which single letters are translated, the study of finite-state transducers will involve descriptions of how entire strings are translated. This situation is reminiscent of the modification of the state transition function δ, which likewise operated on single letters, to the extended state transition function δ̄ (which was defined for strings). Indeed, what is called for is an extension of ω to ω̄, which will encompass the translation of entire strings. The translation cited in the last example could then be succinctly stated as ω̄(s₀, abaabbaa) = 00100110. That is, the notation ω̄(t, y) is intended to represent the output string produced by a transducer (beginning from state t) in response to the input string y.
The formal recursive definition of ω̄ will depend not only on ω but also on the state transition function δ (and its extension δ̄). δ̄ retains the same conceptual meaning it had for finite-state acceptors: δ̄(s, x) denotes the state reached when starting from s and processing, in sequence, the individual letters of the string x. Furthermore, the conclusion stated in Theorem 1.1 still holds:
(∀x ∈ Σ*)(∀y ∈ Σ*)(∀s ∈ S)(δ̄(s, yx) = δ̄(δ̄(s, y), x))
A similar statement can be made about ω̄ once it has been rigorously defined.
∇ Definition 7.2. Given a FST A = <Σ, Γ, S, s₀, δ, ω>, the extended output function for A, denoted by ω̄, is a function ω̄: S × Σ* → Γ* defined recursively as follows:
i. (∀t ∈ S)(ω̄(t, λ) = λ)
ii. (∀t ∈ S)(∀x ∈ Σ*)(∀a ∈ Σ)(ω̄(t, ax) = ω(t, a)·ω̄(δ(t, a), x))
∆
EXAMPLE 7.3
Let B = <Σ, Γ, S, s₀, δ, ω> be the FST given in Example 7.2. Then
ω̄(s₁, baa) = ω(s₁, b)·ω̄(δ(s₁, b), aa) = 1·ω̄(s₁, aa)
= 1·ω(s₁, a)·ω̄(δ(s₁, a), a) = 11·ω̄(s₀, a) = 110
Note that a three-letter input sequence gives rise to exactly three output symbols: ω̄ is length preserving, in the sense that (∀t ∈ S)(∀x ∈ Σ*)(|ω̄(t, x)| = |x|).
The ω̄ function extends the ω function from single letters to words. Whereas the ω function maps a state and a letter to a single symbol from Γ, the ω̄ function maps a state and a word to an entire string from Γ*. It can be deduced from (i) and (ii) (see the exercises) that (iii) (∀t ∈ S)(∀a ∈ Σ)(ω̄(t, a) = ω(t, a)), which is the observation that ω and ω̄ treat single letters the same. The extended output function ω̄ has properties similar to those of δ̄, in that the single letter a found in the recursive definition could just as well have been split off from the rear of the string; Theorem 7.1 states the general rule for splitting an input string into two parts.
EXAMPLE 7.4
Let B = <Σ, Γ, S, s₀, δ, ω> be the FST given in Example 7.2. Consider the string z = abaabbaa = yx, where y = abaab and x = baa. To apply Theorem 7.1 with t = s₀, we first calculate ω̄(s₀, y) = ω̄(s₀, abaab) = 00100, and δ̄(s₀, y) = s₁. From Example 7.3, ω̄(s₁, baa) = 110, and hence, as required by Theorem 7.1,
00100110 = ω̄(s₀, abaabbaa) = ω̄(s₀, yx) = ω̄(s₀, y)·ω̄(δ̄(s₀, y), x) = 00100·110
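Readers who wish to experiment may find a computational sketch helpful. The following Python fragment (our own encoding, not part of the original text) represents the machine B of Example 7.2 as a pair of dictionaries, implements the recursive definition of ω̄, and verifies both the translation above and the splitting property of Theorem 7.1.

    # The Mealy machine B of Example 7.2, encoded as dictionaries.
    delta = {('s0', 'a'): 's0', ('s0', 'b'): 's1',
             ('s1', 'a'): 's0', ('s1', 'b'): 's1'}
    omega = {('s0', 'a'): '0', ('s0', 'b'): '0',
             ('s1', 'a'): '1', ('s1', 'b'): '1'}

    def omega_bar(t, y):
        # Definition 7.2: omega_bar(t, lambda) = lambda, and
        # omega_bar(t, ax) = omega(t, a) . omega_bar(delta(t, a), x).
        if y == '':
            return ''
        return omega[(t, y[0])] + omega_bar(delta[(t, y[0])], y[1:])

    def delta_bar(s, y):
        # Extended state transition function.
        for a in y:
            s = delta[(s, a)]
        return s

    y, x = 'abaab', 'baa'
    assert omega_bar('s0', y + x) == '00100110'
    # Theorem 7.1: omega_bar(t, yx) = omega_bar(t, y) . omega_bar(delta_bar(t, y), x)
    assert omega_bar('s0', y + x) == omega_bar('s0', y) + omega_bar(delta_bar('s0', y), x)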
For a given FST A with a specified start state, the deterministic nature of finite-state transducers requires that each input string be translated into a unique output string; that is, the relation f_A that associates input strings with their corresponding output strings is a function.
∇ Definition 7.3. Given a FST M = <Σ, Γ, S, s₀, δ, ω>, the translation function for M, denoted by f_M, is the function f_M: Σ* → Γ* defined by f_M(x) = ω̄(s₀, x).
∆
Note that f_M, like ω̄, is length preserving: (∀x ∈ Σ*)(|f_M(x)| = |x|). Consequently, for any n ∈ ℕ, if the domain of f_M were restricted to Σⁿ, then the range of f_M would likewise be contained in Γⁿ.
EXAMPLE 7.5
Let B = <Σ, Γ, S, s₀, δ, ω> be the finite-state transducer given in Figure 7.3. Since ω̄(s₀, abaab) = 00100, f_B(abaab) = 00100. Similarly, f_B(λ) = λ, f_B(a) = 0, f_B(b) = 0, f_B(aa) = 00, f_B(ab) = 00, f_B(ba) = 01, f_B(bb) = 01. Coupled with these seven base definitions, this particular f_B could be recursively defined by
f_B(xaa) = f_B(xa)·0, f_B(xab) = f_B(xa)·0, f_B(xba) = f_B(xb)·1,
and
f_B(xbb) = f_B(xb)·1
f_B in essence replaces as with 0s and bs with 1s, and "delays" the output by one letter. More specifically, the translation function for B takes an entire string and substitutes 0s and 1s for as and bs (respectively), deletes the last letter of the string, and appends a 0 to the front of the resulting string. The purpose of the two states s₀ and s₁ in the FST B is to remember whether the previous symbol was an a or a b (respectively) and output the appropriate replacement letter. Note that 1s are always printed on transitions from s₁, and 0s are printed as we leave s₀.
EXAMPLE 7.6
Let C = <{a, b}, {0, 1}, {t₀, t₁, t₂, t₃}, t₀, δ_C, ω_C> be the FST shown in Figure 7.4. C flags occurrences of the string aab by printing a 1 on the output tape only when the substring aab appears in the input stream.
EXAMPLE 7.7
Consider the function g: {a, b, c}* → {0, 1}*, which replaces each input symbol by 0 unless the next letter is c, in which case 1 is used instead. Thus, g(abcaaccb) = 01001100 and g(abb) = 000.
With n = 2, choosing x = ab, y = caaccb, and z = b shows that g violates Theorem 7.2, so g cannot be FTD.
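The disagreement is easy to check mechanically. As a sketch (only the function g itself comes from the example; the encoding is ours):

    def g(w):
        # Replace each symbol by 0, unless the next letter is c, in which case use 1.
        return ''.join('1' if i + 1 < len(w) and w[i + 1] == 'c' else '0'
                       for i in range(len(w)))

    assert g('abcaaccb') == '01001100' and g('abb') == '000'

    # x+y and x+z begin with the same two letters, yet the outputs disagree
    # in the second position (compare Exercise 7.36):
    x, y, z = 'ab', 'caaccb', 'b'
    print(g(x + y)[:2], g(x + z)[:2])    # prints: 01 00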
Two transducers that perform exactly the same translation over the entire range of input strings from Σ* will be called equivalent transducers. This is in spirit similar to the way equivalence was defined for deterministic finite automata.
EXAMPLE 7.8
The FST C = <{a, b}, {0, 1}, {t₀, t₁, t₂, t₃}, t₀, δ_C, ω_C> given in Figure 7.4 is not minimal. The FST D = <{a, b}, {0, 1}, {q₀, q₁, q₂}, q₀, δ_D, ω_D> given in Figure 7.5 performs the same translation, but has only three states.
The concept of two transducers being essentially the same except for a trivial renaming of the states will again be formalized through the definition of isomorphism (and homomorphism). As before, it will be important to match the respective start states and state transitions; but rather than matching up final states (which do not exist in the FST model), we must instead ensure that the output function is preserved by the relabeling process.
Consider the two FSTs C = <{a, b}, {0, 1}, {t₀, t₁, t₂, t₃}, t₀, δ_C, ω_C>, given in Figure 7.4, and D = <{a, b}, {0, 1}, {q₀, q₁, q₂}, q₀, δ_D, ω_D>, displayed in Figure 7.5. The function μ: {t₀, t₁, t₂, t₃} → {q₀, q₁, q₂}, defined by μ(t₀) = q₀, μ(t₁) = q₁, μ(t₂) = q₂, and μ(t₃) = q₀, is a homomorphism between C and D. Conditions (i) and (ii) are exactly the same criteria used for finite automata homomorphisms and have exactly the same interpretation: the start states must correspond and the transitions must match. The third condition is present to ensure that the properties of the ω function are respected; for example, since t₂ causes 1 to be printed when b is processed, so should the corresponding state in the D machine, which is q₂ = μ(t₂) in this example. Indeed, ω_C(t₂, b) = 1 = ω_D(μ(t₂), b). Such similarities extend to full strings also: note that ω̄_C(t₀, aab) = 001 = ω̄_D(μ(t₀), aab) in this example. The results can be generalized as presented in the next lemma.
∇ Lemma 7.1. If μ is a homomorphism from the FST A to the FST B, then
(∀s ∈ S_A)(∀x ∈ Σ*)(μ(δ̄_A(s, x)) = δ̄_B(μ(s), x))
and
(∀s ∈ S_A)(∀x ∈ Σ*)(ω̄_A(s, x) = ω̄_B(μ(s), x)).
Proof. The proof is by induction on |x| (see the exercises).
∆
As with DFAs, the concepts of reduction and connectedness can be defined. As was the case in Chapter 3, a reduced and connected machine will be isomorphic to every other equivalent minimal machine. The definition for connectedness is essentially unchanged.
That is, every state s of S can be reached by some string (x_s) in Σ*; once again, the choice of the state s will have a bearing on which particular string is used as a representative. States that are not accessible do not affect the translation performed by the transducer; such states can be safely deleted to form a connected version of the machine.
∇ Definition 7.11. Given a FST M = <Σ, Γ, S, s₀, δ, ω>, define the transducer Mᶜ = <Σ, Γ, Sᶜ, s₀ᶜ, δᶜ, ωᶜ>, called M connected, by
Sᶜ = {s ∈ S | (∃x ∈ Σ*)(δ̄(s₀, x) = s)}
s₀ᶜ = s₀
δᶜ is essentially the restriction of δ to Sᶜ × Σ: (∀a ∈ Σ)(∀s ∈ Sᶜ)(δᶜ(s, a) = δ(s, a)),
and ωᶜ is the restriction of ω to Sᶜ × Σ: (∀a ∈ Σ)(∀s ∈ Sᶜ)(ωᶜ(s, a) = ω(s, a)).
∆
Mᶜ is, as in Chapter 3, the machine M with the unreachable states "thrown away."
As with DFAs, trimming a machine in this fashion has no effect on the operation of the transducer. To formally prove this, the following lemma is needed; it implies that the operation of any transducer is indistinguishable from the operation of its connected counterpart.
Connectedness is thus one of the two major requirements for minimality. The other requirement is that no two states behave identically. For DFAs, this translated into statements about acceptance and rejection. For FSTs, this will instead involve the behavior of the output function. The analog to Definition 3.2 is given next.
∇ Definition 7.12. Given a transducer M = <Σ, Γ, S, s₀, δ, ω>, the state equivalence relation on M, E_M, is defined by
(∀s ∈ S)(∀t ∈ S)(s E_M t ⇔ (∀x ∈ Σ*)(ω̄(s, x) = ω̄(t, x)))
∆
In other words, we will relate states s and t if and only if it is not possible to determine, by only observing the output, whether we are starting from state s or state t (no matter what input string is used). The more efficient machines will not have such duplication of states and, as with DFAs, will be said to be reduced.
∇ Definition 7.14. Given a FST M = <Σ, Γ, S, s₀, δ, ω>, define M modulo its state equivalence relation, M/E_M, by M/E_M = <Σ, Γ, S_EM, s₀_EM, δ_EM, ω_EM>, where
S_EM = {[s]_EM | s ∈ S}
s₀_EM = [s₀]_EM
δ_EM is defined by
(∀a ∈ Σ)(∀[s]_EM ∈ S_EM)(δ_EM([s]_EM, a) = [δ(s, a)]_EM),
and ω_EM is defined by
(∀a ∈ Σ)(∀[s]_EM ∈ S_EM)(ω_EM([s]_EM, a) = ω(s, a)).
∆
The proof that δ_EM is well defined is similar to that found in Chapter 3. In an analogous fashion, ω_EM must be shown to be well defined (see the exercises).
All the properties that one would expect of M/E_M are present, as outlined in the following theorem.
An argument similar to that given for Corollary 7.2 shows that being reduced is also a requirement for minimality.
∇ Theorem 7.5. Two reduced and connected FSTs, M₁ = <Σ, Γ, S₁, s₀₁, δ₁, ω₁> and M₂ = <Σ, Γ, S₂, s₀₂, δ₂, ω₂>, are equivalent iff M₁ ≅ M₂.
Proof. By Corollary 7.1, if M₁ ≅ M₂, then M₁ is equivalent to M₂. The converse half of the proof is very reminiscent of that given for Theorem 3.1. We must assume M₁ and M₂ are equivalent and then prove that an isomorphism can be exhibited between M₁ and M₂. A natural way to define such an isomorphism is as follows: Given a state s in M₁, choose a string x_s such that δ̄₁(s₀₁, x_s) = s. Let μ(s) = δ̄₂(s₀₂, x_s). At least one such string x_s must exist for each state of M₁, since M₁ was assumed to be connected. There may be several choices of x_s for a given state s, but all will yield the same value for δ̄₂(s₀₂, x_s), and so μ is well defined (see the exercises). The function μ satisfies the three properties of a homomorphism and turns out to be a bijection (see the exercises). Thus M₁ ≅ M₂. As will be clear from the exercises, the hypothesis that M₁ and M₂ are reduced and connected is crucial to the proof of this part of the theorem.
∆
Note that Theorem 7.5 implies that, as long as we are dealing with reduced and connected machines, f_M₁ = f_M₂ iff M₁ ≅ M₂. The conclusions discussed earlier now follow immediately from Theorem 7.5.
Thus E_iM relates states that cannot be distinguished by strings of length i or less, whereas E_M relates states that cannot be distinguished by any string of any length. All the properties attributable to the analogous relations for finite automata (E_iA) carry over, with essentially the same proofs, to the relations for finite-state transducers (E_iM).
∇ Corollary 7.6. Given a FST M = <Σ, Γ, S, s₀, δ, ω>, there is an algorithm for computing E_M.
Proof. Use Lemma 7.4 to compute successive E_iM relations from E_1M until E_iM = E_{i+1}M; by Lemma 7.3, this E_iM will equal E_M, and this will all happen before i reaches ‖S‖, the number of states in S. Thus the procedure is guaranteed to halt.
∆
∇ Corollary 7.7. Given a FST M = <Σ, Γ, S, s₀, δ, ω>, there is an algorithm for computing the minimal machine equivalent to M.
Proof. Using the algorithm for computing the set of connected states, Mᶜ can be found. The output function is used to find E_1Mᶜ, and the state transition function is then used to calculate successive relations until E_Mᶜ is found. Mᶜ/E_Mᶜ can then be defined and will be the minimal machine equivalent to M.
∆
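The refinement algorithm of Corollaries 7.6 and 7.7 is easily mechanized. The sketch below applies the successive-refinement rule to a four-state Mealy machine; the transition and output tables are our reconstruction of the machine C of Example 7.6 (consistent with the homomorphism data quoted earlier), so they should be read as assumptions rather than as a reproduction of Figure 7.4.

    # Computing E_M by successive refinement (Corollary 7.6).
    states, alphabet = ['t0', 't1', 't2', 't3'], ['a', 'b']
    # Assumed reading of Figure 7.4: C prints 1 exactly when aab has just been read.
    delta = {('t0','a'):'t1', ('t0','b'):'t0', ('t1','a'):'t2', ('t1','b'):'t0',
             ('t2','a'):'t2', ('t2','b'):'t3', ('t3','a'):'t1', ('t3','b'):'t0'}
    omega = {(s, a): '0' for s in states for a in alphabet}
    omega[('t2', 'b')] = '1'

    # E_1M relates states with identical single-letter output (Lemma 7.4).
    eq = {(s, t) for s in states for t in states
          if all(omega[(s, a)] == omega[(t, a)] for a in alphabet)}
    while True:
        # s E_{i+1}M t iff s E_iM t and delta(s,a) E_iM delta(t,a) for every a.
        refined = {(s, t) for (s, t) in eq
                   if all((delta[(s, a)], delta[(t, a)]) in eq for a in alphabet)}
        if refined == eq:
            break
        eq = refined

    classes = {tuple(sorted(t for t in states if (s, t) in eq)) for s in states}
    print(sorted(classes))    # [('t0', 't3'), ('t1',), ('t2',)]

Merging t₀ with t₃ yields a three-state machine, in agreement with the machine D of Example 7.8.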
A practical illustration appears later in Example 7.16, which demonstrates that traffic signal controllers can most naturally be modeled by the transducers discussed in this section.
EXAMPLE 7.10
Let C = <Σ, Γ, S, r₀, δ, ω> be given by
Σ = {a, b}
Γ = {0, 1}
S = {r₀, r₁, r₂, r₃}
s₀ = r₀
The state transition table is shown in Table 7.2.
TABLE 7.2
δ  | a   b
r₀ | r₀  r₂
r₁ | r₀  r₂
r₂ | r₁  r₃
r₃ | r₁  r₃
Finally, the output function is given by ω(r₀) = 0, ω(r₁) = 1, ω(r₂) = 0, and ω(r₃) = 1, or, more succinctly, ω(rᵢ) = i mod 2 for i = 0, 1, 2, 3. All the above information about C is contained in Figure 7.6. This Moore machine performs the same translation as the Mealy machine B in Example 7.2.
Results that were targeted toward a FST in the previous sections were specific to Mealy machines. When the descriptor "transducer" appears in the theorems and definitions presented earlier, the concept or result applies unchanged to both FSTs and MSMs. Most of these results are alluded to but not restated in this section. For example, δ̄ is defined like and behaves like the extended state transition functions for DFAs and FSTs. On the other hand, because of the drastic change in the domain of ω, ω̄ must be modified as outlined below in order for ω̄(s, x) to represent the output string produced when starting at s and processing x.
∇ Definition 7.17. Given a MSM A = <Σ, Γ, S, s₀, δ, ω>, the extended output function for A, denoted again by ω̄, is a function ω̄: S × Σ* → Γ* defined recursively by:
i. (∀t ∈ S)(ω̄(t, λ) = λ)
ii. (∀t ∈ S)(∀x ∈ Σ*)(∀a ∈ Σ)(ω̄(t, ax) = ω(δ(t, a))·ω̄(δ(t, a), x))
∆
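As a quick sanity check, the Moore form of the recursion can be run on the machine C of Example 7.10 (Table 7.2). This sketch (our encoding) confirms that C reproduces the translation computed by the Mealy machine B:

    # The Moore machine C of Example 7.10; omega depends on the state alone.
    delta = {('r0','a'):'r0', ('r0','b'):'r2', ('r1','a'):'r0', ('r1','b'):'r2',
             ('r2','a'):'r1', ('r2','b'):'r3', ('r3','a'):'r1', ('r3','b'):'r3'}
    omega = {'r0': '0', 'r1': '1', 'r2': '0', 'r3': '1'}   # omega(r_i) = i mod 2

    def omega_bar(t, y):
        # Definition 7.17: omega_bar(t, ax) = omega(delta(t,a)) . omega_bar(delta(t,a), x)
        if y == '':
            return ''
        s = delta[(t, y[0])]
        return omega[s] + omega_bar(s, y[1:])

    assert omega_bar('r0', 'abaabbaa') == '00100110'   # same output as B in Example 7.2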
Note that the domain of the function ω has been extended further than usual: in all previous cases, the domain was enlarged from S × Σ to S × Σ*; in this instance, we are beginning with a domain of only S and still extending it to S × Σ*. The above definition allows the following analog of Theorem 7.1 to remain essentially unchanged.
Definition 7.5 applies to Moore machines; two MSMs are equivalent if they define the same translation. Indeed, it is possible for a Mealy machine to be equivalent to a Moore machine, as shown by the transducers in Figures 7.3 and 7.6.
It is easy to turn a Moore machine A = <Σ, Γ, S, s₀, δ, ω> into an equivalent Mealy machine M = <Σ, Γ, S, s₀, δ, ω′>. The first five parts of the transducer are unchanged. Only the sixth component (the output function) must be redefined, as outlined below.
∇ Definition 7.19. Given a Moore machine A = <Σ, Γ, S, s₀, δ, ω>, the corresponding Mealy machine M is given by M = <Σ, Γ, S, s₀, δ, ω′>, where ω′ is defined by
(∀a ∈ Σ)(∀s ∈ S)(ω′(s, a) = ω(δ(s, a)))
∆
Pictorially, all arrows that lead into a given state in the Moore machine should be labeled in the corresponding Mealy machine with the output symbol for that particular state. It follows easily from the definition that the corresponding machines perform the same translation.
EXAMPLE 7.11
Let A = <Σ, Γ, S, r₀, δ, ω> be the Moore machine given in Figure 7.6. The corresponding Mealy machine M = <Σ, Γ, S, s₀, δ, ω′> is then given by
Σ = {a, b}, Γ = {0, 1}, s₀ = r₀
and the state transition table and the output function table are specified as in Tables 7.3a and 7.3b.
TABLE 7.3a          TABLE 7.3b
δ  | a   b          ω′ | a   b
r₀ | r₀  r₂         r₀ | 0   0
r₁ | r₀  r₂         r₁ | 0   0
r₂ | r₁  r₃         r₂ | 1   1
r₃ | r₁  r₃         r₃ | 1   1
The new Mealy machine is shown in Figure 7.7. Note that the arrow labeled a leaving r₁ now has a 0 associated with it, since the state at which the arrow pointed (r₀/0) originally output a 0.
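The conversion of Definition 7.19 amounts to a single dictionary comprehension. A sketch (our encoding) that reproduces Table 7.3b from Table 7.2 and the output assignments of Example 7.10:

    def moore_to_mealy(delta, omega, states, alphabet):
        # Definition 7.19: omega'(s, a) = omega(delta(s, a)); delta is unchanged.
        return {(s, a): omega[delta[(s, a)]] for s in states for a in alphabet}

    delta = {('r0','a'):'r0', ('r0','b'):'r2', ('r1','a'):'r0', ('r1','b'):'r2',
             ('r2','a'):'r1', ('r2','b'):'r3', ('r3','a'):'r1', ('r3','b'):'r3'}
    omega = {'r0': '0', 'r1': '1', 'r2': '0', 'r3': '1'}
    omega_prime = moore_to_mealy(delta, omega, ['r0','r1','r2','r3'], 'ab')
    print(omega_prime[('r1', 'a')])   # '0': the relabeled arrow noted above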
∇ Definition 7.20. Given a Mealy machine M = <Σ, Γ, S, s₀, δ, ω>, the corresponding Moore machine A is given by A = <Σ, Γ, S × Γ, (s₀, a), δ′, ω′>, where a is an (arbitrary) member of Γ,
δ′ is defined by (∀s ∈ S)(∀b ∈ Γ)(∀a ∈ Σ)(δ′((s, b), a) = (δ(s, a), ω(s, a)))
and
ω′ is defined by (∀s ∈ S)(∀b ∈ Γ)(ω′((s, b)) = b).
∆
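The product construction of Definition 7.20 can be sketched just as directly. Applied to the Mealy machine B of Example 7.2, it reproduces the entries of Table 7.4 below (again, the dictionary encoding is ours):

    def mealy_to_moore(delta, omega, states, alphabet, s0, gamma):
        # Definition 7.20: the new states are pairs (s, b) in S x Gamma; the
        # second component records the symbol printed upon entering the state.
        new_delta = {((s, b), a): (delta[(s, a)], omega[(s, a)])
                     for s in states for b in gamma for a in alphabet}
        new_omega = {(s, b): b for s in states for b in gamma}
        return new_delta, new_omega, (s0, gamma[0])   # gamma[0]: the arbitrary member

    delta = {('s0','a'):'s0', ('s0','b'):'s1', ('s1','a'):'s0', ('s1','b'):'s1'}
    omega = {('s0','a'):'0', ('s0','b'):'0', ('s1','a'):'1', ('s1','b'):'1'}
    nd, no, start = mealy_to_moore(delta, omega, ['s0', 's1'], 'ab', 's0', '01')
    print(nd[(('s1', '0'), 'a')])   # ('s0', '1'), as in Table 7.4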
∇ Theorem 7.8. Given a Mealy machine M = <Σ, Γ, S, s₀, δ, ω>, the corresponding Moore machine A = <Σ, Γ, S × Γ, (s₀, a), δ′, ω′> is equivalent to M; that is, (∀x ∈ Σ*)(f_A(x) = f_M(x)).
Proof. The proof is by induction on |x| (see the exercises).
∆
Since every Mealy machine has an equivalent Moore machine and every Moore machine has an equivalent Mealy machine, either construct can be used as the basis for defining what is meant by a translation f being finite-transducer definable.
∇ Corollary 7.8. A translation f is FTD iff f can be defined by a FST M iff f can be defined by a MSM A.
Proof. The proof is immediate from the definition of FTD and Theorems 7.7 and 7.8.
∆
EXAMPLE 7.12
Consider the Mealy machine B from Figure 7.3. The corresponding Moore machine A = <Σ, Γ, S, q₀, δ, ω> is given by
Σ = {a, b}
Γ = {0, 1}
S = {(s₀, 0), (s₀, 1), (s₁, 0), (s₁, 1)}
q₀ = (s₀, 1)
ω((s₀, 0)) = 0, ω((s₀, 1)) = 1, ω((s₁, 0)) = 0, ω((s₁, 1)) = 1
and the state transition table is specified as in Table 7.4.
TABLE 7.4
δ       | a        b
(s₀, 0) | (s₀, 0)  (s₁, 0)
(s₀, 1) | (s₀, 0)  (s₁, 0)
(s₁, 0) | (s₀, 1)  (s₁, 1)
(s₁, 1) | (s₀, 1)  (s₁, 1)
Figure 7.8 displays this new Moore machine. Note that this transducer A, except for the placement of the start state, looks very much like the Moore machine C given in Figure 7.6. Indeed, any ordered pair that is labeled with the original start state would be an acceptable choice for the new start state in the corresponding Moore machine. For example, the automaton A′, which is similar to A but utilizes (s₀, 0) as the new start state, is another Moore machine that is equivalent to the original Mealy machine B. The transition diagram for A′ is shown in Figure 7.9. In fact, by appropriately recasting the definition of isomorphism so that it applies to Moore sequential machines, it can be shown that A′ and C are isomorphic. The definition of isomorphic again guarantees that a renaming of the states can be found that preserves start states, transition functions, and output functions. Indeed, the definition of isomorphism agrees with that of Mealy machines (and of DFAs, for that matter), except that the condition on the output function now involves states rather than state-letter pairs.
∇ Definition 7.21. Given two MSMs A = <Σ, Γ, S_A, s₀A, δ_A, ω_A> and B = <Σ, Γ, S_B, s₀B, δ_B, ω_B>, and a function μ: S_A → S_B, μ is called a Moore machine isomorphism from A to B iff the following five conditions hold:
i. μ(s₀A) = s₀B.
ii. (∀s ∈ S_A)(∀a ∈ Σ)(μ(δ_A(s, a)) = δ_B(μ(s), a)).
iii. (∀s ∈ S_A)(ω_A(s) = ω_B(μ(s))).
iv. μ is a one-to-one function between S_A and S_B.
v. μ is onto S_B.
∆
EXAMPLE 7.13
The two Moore machines A′ in Figure 7.9 and C in Figure 7.6 are indeed isomorphic. There is a function μ from the states of A′ to the states of C that satisfies all five properties of an isomorphism. This correspondence is given by μ((s₀, 0)) = r₀, μ((s₀, 1)) = r₁, μ((s₁, 0)) = r₂, and μ((s₁, 1)) = r₃, succinctly defined by μ((s_i, j)) = r_{2i+j} for i, j ∈ {0, 1}. As before, a homomorphism is meant to represent a correspondence between states that preserves the algebraic structure of the transducer without necessarily being a bijection.
∇ Definition 7.22. Given two MSMs
A = <Σ, Γ, S_A, s₀A, δ_A, ω_A> and B = <Σ, Γ, S_B, s₀B, δ_B, ω_B>,
and a function μ: S_A → S_B, μ is called a Moore machine homomorphism from A to B iff the following three conditions hold:
i. μ(s₀A) = s₀B.
ii. (∀s ∈ S_A)(∀a ∈ Σ)(μ(δ_A(s, a)) = δ_B(μ(s), a)).
iii. (∀s ∈ S_A)(ω_A(s) = ω_B(μ(s))).
∆
The analog of Lemma 7.1 holds: if μ is a Moore machine homomorphism from A to B, then
(∀s ∈ S_A)(∀x ∈ Σ*)(μ(δ̄_A(s, x)) = δ̄_B(μ(s), x))
and
(∀s ∈ S_A)(∀x ∈ Σ*)(ω̄_A(s, x) = ω̄_B(μ(s), x)).
Proof. The proof is by induction on |x| (see the exercises).
It is interesting to note that the MSMs A in Figure 7.8 and A′ in Figure 7.9 are not isomorphic. In fact, there does not even exist a homomorphism (in either direction) between A and A′, since the start states print different symbols, and rules (i) and (iii) therefore conflict. The absence of an isomorphism in this instance illustrates that an analog to Theorem 7.5 cannot be asserted under the definition of Moore sequential machines presented here. Observe that A and A′ are equivalent and they are both minimal (four states are necessary in a Moore machine to perform this translation), yet they are not isomorphic. The reader should contrast this failure with the analogous statement about Mealy machines in Theorem 7.5.
Producing a result comparable to Theorem 7.5 is not possible without a
fundamental adjustment of at least one of the definitions. One possibility is to drop
the distinguished start state from the definition of the Moore machine. This re-
moves condition (i) from the isomorphism definition and thereby resolves the
conflict between (i) and (iii). We have already noted that many applications do not
require a distinguished start state (such as elevators and traffic signal controls),
which makes this adjustment not altogether unreasonable.
A more common alternative is to decree that a Moore sequential machine first
print the character specified by the start state upon being turned on (before any of
the input tape is read) and then proceed as before. This results in output strings that
are always one symbol longer than the corresponding input strings, and the length-
preserving property of transducers is thereby lost. A more substantial drawback
results from the less natural correspondence between Mealy and Moore machines:
no FST can be truly equivalent to any MSM since translations would not even be of
the same length.
The advantage of this decree is that machines like A and A′ (from Figures 7.8 and 7.9) would no longer be equivalent, and hence they would not be expected to be isomorphic. Note that equivalence is lost since, under the new decree for translations, they would produce different output when presented with, say, λ as input: A would print 1 while A′ would produce 0. Our definition of a MSM (Definition 7.16) was chosen to remain compatible with the translations obtained from Mealy machines and to preserve a distinguished state as the start state; these advantages were obtained at the expense of a convenient analog to Theorem 7.5.
A third, and perhaps the best, alternative is to modify what we mean by a
MSM isomorphism. Definition 7.21 can be rephrased to relax the condition that the
start states of the two machines must print the same character.
As with Mealy machines, Moore machines can also be minimized, and a
reduced and connected MSM is guaranteed to be the smallest MSM which performs
that translation. Note that Definitions 7.4 (FTD), 7.5 (equivalence), 7.9 (iso-
morphic), 7.10 (connected), 7.12 (state equivalence relation), 7.13 (reduced), and
7.15 (ith relation) have been phrased to encompass both forms of transducers.
Minor changes (generally involving the domain of the output function) are all that is
necessary to make the remaining definitions and results conform to the Moore
constructs. We begin with a formal definition of minimality, which is in essence the
same as the definitions presented for DFAs and FSTs (Definitions 2.7 and 7.6).
∇ Definition 7.23. Given a MSM A = <Σ, Γ, S_A, s₀A, δ_A, ω_A>, A is the minimal Moore machine for the translation f_A iff for all MSMs B = <Σ, Γ, S_B, s₀B, δ_B, ω_B> for which f_A = f_B, ‖S_A‖ ≤ ‖S_B‖.
∆
∇ Definition 7.24. Given a MSM M = <Σ, Γ, S, s₀, δ, ω>, define the transducer Mᶜ = <Σ, Γ, Sᶜ, s₀ᶜ, δᶜ, ωᶜ>, called M connected, by
Sᶜ = {s ∈ S | (∃x ∈ Σ*)(δ̄(s₀, x) = s)}
s₀ᶜ = s₀
δᶜ is essentially the restriction of δ to Sᶜ × Σ: (∀a ∈ Σ)(∀s ∈ Sᶜ)(δᶜ(s, a) = δ(s, a)),
and ωᶜ is the restriction of ω to Sᶜ: (∀s ∈ Sᶜ)(ωᶜ(s) = ω(s)).
∆
The concept of a reduced Moore machine and the definition of the state equivalence relation are identical in spirit and in form to those presented for Mealy machines (Definitions 7.12 and 7.13). The definition that outlines how to reduce a Moore machine by coalescing states differs from that given for FSTs (Definition 7.14) only in the specification of the output function. In both Definition 7.14 and the following Moore machine analog, the value ω takes for an equivalence class is determined by the value given for a representative of that equivalence class. As before, this natural definition for the output function can be shown to be well defined (see the exercises).
The Moore machine M/E_M has all the properties attributed to the Mealy version. Without changing the nature of the translation, it is guaranteed to produce a MSM that is reduced.
∇ Theorem 7.10
E₀M will generally have one equivalence class for each symbol in Γ; rk(E₀M) could be less than ‖Γ‖ if some output symbols are not printed by any state (remember that equivalence classes are by definition nonempty). The rule for computing E_{i+1}M from E_iM is identical to that given for Mealy machines (and DFAs); only the starting point, E₀M, had to be redefined for Moore machines (compare with Lemma 7.4). Lemmas 7.3 and 7.6 imply that there is an algorithm for finding E_M for any Moore machine M; this was the final computation needed to produce Mᶜ/E_Mᶜ, which will be the minimal Moore machine equivalent to the MSM M.
7.4 TRANSDUCER APPLICATIONS AND CIRCUIT IMPLEMENTATION
The vending machine example that began this chapter showed that the transducer
was capable of modeling many of the machines we deal with in everyday life. This
section gives examples of several types of applications and then shows how to form
the circuitry that will implement such transducers. Transducers can be used not only
to model physical machinery, but can also form the basis for computational
algorithms. The following example can be best thought of not as a model of a
machine that receives files, but as a model of the behavior of the computer algo-
rithm that specifies how such files are to be received.
EXAMPLE 7.14
The transducer metaphor is often used to succinctly describe the structure of many
algorithms commonly used in computer applications, most notably in network com-
munications. Kermit is a popular means of transferring files between mainframes
and microcomputers. A transfer is accomplished by the send portion of Kermit on
the source host exchanging information with the receive portion of Kermit on the
destination host. The two processes communicate by exchanging packets of infor-
mation; these packets comprise the input alphabet of our model. When the Kermit
protocol was examined in Chapter 1 (Example 1.16), it was noted that a full
description of the algorithm must also describe the action taken upon receipt of an
incoming packet; these actions comprise the output alphabet of our model. During
a file transfer, the states of the receiving portion of Kermit on the destination host
are R (awaiting a transfer request), RF (awaiting the name of the file to be trans-
ferred), RD (awaiting more data to be placed in the new file), and A (abort due to
an unrecoverable error). The set of states will again be {A, R, RD, RF}.
Expected inputs are represented by S (an initialization packet, indicating that
a transfer is requested), H (a header packet, containing the name of one of the files
to be created and opened), D (a data packet), Z (an end of file marker, signaling that
no more data need be placed in the currently opened file), and B (break, signaling
the end of transmission). Unexpected input, representing a garbled transmission, is
denoted by X. The input alphabet is therefore Σ = {B, D, H, S, X, Z}.
When Kermit receives a recognizable packet, it sends an acknowledgment
(ACK) back to the other host. This action will be represented in the output alphabet
by the symbol Y. When the receiver expects and gets a valid header packet, it opens
the appropriate file and also acknowledges the packet. This pair of actions is
represented by the output symbol O. W will denote the writing of the packet contents to the opened file and acknowledgment of the packet, and φ will denote that no action is taken. C will indicate that the currently opened file is closed. N will represent the transmission of a NAK (negative acknowledgment), which is used to alert the sender that a garbled packet was detected. The output alphabet is therefore Γ = {C, N, O, W, Y, φ}. The complete algorithm is summed up in the state transition diagram given in Figure 7.10.
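Figure 7.10 itself is not reproduced here. Purely as an illustration of how such a protocol machine can be tabulated, the sketch below records one plausible reading of the prose above; every transition shown is our assumption, not a transcription of the book's diagram.

    # An assumed partial encoding of the Kermit receiver as an FST.
    # States: R, RF, RD, A; inputs: B, D, H, S, X, Z; outputs: C, N, O, W, Y, phi.
    delta_omega = {
        ('R',  'S'): ('RF', 'Y'),    # transfer requested: acknowledge
        ('R',  'X'): ('R',  'N'),    # garbled packet: NAK and keep waiting
        ('RF', 'H'): ('RD', 'O'),    # header received: open the file and acknowledge
        ('RF', 'X'): ('RF', 'N'),    # garbled packet: NAK
        ('RF', 'B'): ('R',  'Y'),    # break: end of transmission
        ('RD', 'D'): ('RD', 'W'),    # data packet: write contents and acknowledge
        ('RD', 'X'): ('RD', 'N'),    # garbled packet: NAK
        ('RD', 'Z'): ('RF', 'C'),    # end of file: close the file
    }
    # Pairs not listed would presumably route to the abort state A with no
    # action (phi); the authoritative table is Figure 7.10.

    state, out = 'R', []
    for packet in ['S', 'H', 'D', 'D', 'Z', 'B']:    # a one-file transfer
        state, action = delta_omega[(state, packet)]
        out.append(action)
    print(''.join(out))   # YOWWCY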
Hardware as well as software can be profitably modeled by finite-state trans-
ducers. The column-by-column addition of two binary numbers is quite naturally
modeled by a simple two-state FST, since the carry bit is the only piece of previous
history needed by the transducer to correctly sum the current column. This dis-
cussion will focus on binary numbers in order to keep the alphabets small, but trivial
extensions will make the two-state machine apply to addition in any base system.
EXAMPLE 7.15
A computation such as the one shown in Figure 7.lla would be divided up into
columns and presented to the FST as indicated in Figure 7.llb (shown in mid-
computation). A digit from the first number and the corresponding digit from the
second number are presented to the transducer as a single input symbol. With the
column pairs represented by standard ordered pairs, the corresponding input tape
Figure 7.11 (a) The addition problem (00110 + 00011) discussed in Example 7.15. (b) Conceptual model of the binary adder discussed in Example 7.15.
One remedy is to include an <EOS> symbol and have the transducer react to <EOS> by printing a y or n to indicate whether or not there was overflow.
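A minimal sketch of the adder as a two-state Mealy machine: the state is simply the carry bit, and each input symbol is a column pair, fed least significant column first (this encoding is ours):

    def adder_step(carry, pair):
        # One column: output digit = (carry + a + b) mod 2; new carry = quotient.
        total = carry + pair[0] + pair[1]
        return total // 2, str(total % 2)

    def add(x, y):
        # Translate the column pairs of two equal-length bit strings.
        carry, out = 0, []
        for a, b in zip(reversed(x), reversed(y)):
            carry, digit = adder_step(carry, (int(a), int(b)))
            out.append(digit)
        return ''.join(reversed(out)), carry   # a leftover carry signals overflow

    print(add('00110', '00011'))   # ('01001', 0): the sum from Figure 7.11a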
While the binary adder is only one small component of a computer, finite-
state transducers can be profitably used to model complete systems; one such
application involves traffic lights. The controller for a large intersection may handle
eight banks of traffic signals for the various straight-ahead and left-turn lanes, as
well as four sets of walk lights (see the exercises). Input about the intersection
conditions is often fed to the controller from pedestrian walk buttons and metal
detectors embedded in the roadway. For simplicity, we will choose a simplified
intersection to illustrate how to model a traffic controller by a transducer. The
simplified example nevertheless incorporates all the essential features of the more
intricate intersections. A full-blown model would only require larger alphabets and
more states.
EXAMPLE 7.16
[Figure 7.14 The intersection discussed in Example 7.16; diagram not reproduced]
Figure 7.15 The state transition diagram for a stoplight modeled as a Moore machine, as discussed in Example 7.16 [diagram not reproduced]
The output symbols can likewise be represented by bits w₁, w₂, w₃, .... We again suggest (solely for simplicity and standardization in the exercises) ordering the symbols in Γ alphabetically and assigning binary codes in ascending order, as was recommended earlier for Σ. We must construct a circuit for generating each wᵢ, in the same manner as we built circuits implementing the accept function for finite automata.
Many practical applications of FSTs (such as traffic signals) operate continu-
ously, rather than starting and stopping for one small string. In such cases, an
<EOS> symbol is not necessary; the circuit operates until power is shut off.
Similarly, an <SOS> symbol is not essential for a traffic signal complex; upon
resuming operation after a power failure, it is usually immaterial whether east-west
traffic first gets a green light or whether it gets a red light in deference to the
north-south traffic. In contrast, it is important for vending machines to initialize to
the proper state or some interesting discounts could be obtained by playing with the
power cord.
EXAMPLE 7.17
Consider the FST displayed in Figure 7.17. If <EOS> and <SOS> are unnecessary, then the input alphabet can be represented by a single bit a₁, with a₁ = 0 representing c and a₁ = 1 representing d. Similarly, the output alphabet can be represented by a single bit w₁, with w₁ = 0 representing a and w₁ = 1 representing b. The states can likewise be represented by a single state bit t₁. The truth table for the next value of the state bit is shown in Table 7.5a.
TABLE 7.5a
t₁  a₁ | t₁′
1   1  | 1
0   1  | 0
1   0  | 0
0   0  | 1
The principal disjunctive normal form for the transition function is therefore seen to be t₁′ = (t₁ ∧ a₁) ∨ (¬t₁ ∧ ¬a₁). The output function can be found in a similar manner, as shown in Table 7.5b.
TABLE 7.5b
t₁  a₁ | w₁
1   1  | 0
0   1  | 1
1   0  | 1
0   0  | 1
Thus, w₁ = (t₁ ↑ a₁), where ↑ denotes the NAND operation. As in Example 1.12, the circuit for t₁ will be fed back into the D flip-flop(s); the circuit for w₁ will form the output for the machine (replacing the acceptance circuit used in DFAs). The complete network is shown in Figure 7.18. Note that we would want the output device to print on the rising edge of the clock cycle, before the new value of t₁ propagates through the circuitry.
A larger output alphabet would require an encoding of several bits; each wᵢ would have its own network of gates, and the complete circuit would then simultaneously generate several bits of output information. As in Chapter 1, additional states or input symbols will add bits to the other encoding schemes and add to the number of rows in the truth tables for δ and ω. Each additional state bit will also require its own D flip-flop and a new truth table for its feedback loop. Each additional state bit doubles the number of states that can be represented, which means that, as was the case with deterministic finite automata, the number of flip-flops grows as the logarithm of the number of states.
[Figure 7.18: the circuit implementing the FST of Example 7.17, with the clock, the D flip-flop for t₁, and the w₁ output network; diagram not reproduced]
EXERCISES
7.1. Let A = <Σ, Γ, S, s₀, δ, ω> be a Mealy machine. Prove the following statements from Theorem 7.1:
(a) (∀x ∈ Σ*)(∀y ∈ Σ*)(∀t ∈ S)(ω̄(t, yx) = ω̄(t, y)·ω̄(δ̄(t, y), x))
(b) (∀x ∈ Σ*)(∀y ∈ Σ*)(∀s ∈ S)(δ̄(s, yx) = δ̄(δ̄(s, y), x))
7.2. Refer to Lemma 7.1 and prove:
(a) (∀s ∈ S_A)(∀x ∈ Σ*)(μ(δ̄_A(s, x)) = δ̄_B(μ(s), x))
(b) (∀s ∈ S_A)(∀x ∈ Σ*)(ω̄_A(s, x) = ω̄_B(μ(s), x))
7.3. Prove Corollary 7.3.
7.4. Prove Corollary 7.4 by showing that a necessary and sufficient condition for a Mealy
machine M to be minimal is that M is both reduced and connected.
7.5. Show that any FTD function f must satisfy a "pumping lemma."
(a) Devise the statement of a theorem that shows that the way any sufficiently long string is translated determines how an entire sequence of longer strings is translated.
(b) Prove the statement made in part (a).
7.6. In each of the following parts, you may assume the results in the preceding parts; for
example, you may assume parts (a) and (b) when proving (c).
(a) Prove Lemma 7.3a.
(b) Prove Lemma 7.3b.
(c) Prove Lemma 7.3c.
(d) Prove Lemma 7.3d.
(e) Prove Lemma 7.3e.
7.7. Given a FST M = <Σ, Γ, S, s₀, δ, ω>, prove the following statements from Lemma 7.4:
(a) E₀M has just one equivalence class, which consists of all of S.
(b) E₁M is defined by s E₁M t ⇔ (∀a ∈ Σ)(ω(s, a) = ω(t, a)).
(c) (∀s ∈ S)(∀t ∈ S)(∀i ≥ 1)(s E_{i+1}M t ⇔ s E_iM t ∧ (∀a ∈ Σ)(δ(s, a) E_iM δ(t, a))).
7.8. Prove Theorem 7.6 by showing that if A = <Σ, Γ, S, s₀, δ, ω> is a Moore machine, then
(∀x ∈ Σ*)(∀y ∈ Σ*)(∀t ∈ S)(ω̄(t, yx) = ω̄(t, y)·ω̄(δ̄(t, y), x)).
7.9. Prove Theorem 7.7.
7.10. Prove Theorem 7.8.
7.11. Use Lemma 7.6 to find E_C in Example 7.10.
7.12. Show that there is a homomorphism from the machine M in Example 7.11 to the machine B in Example 7.2.
7.13. Prove that, in a FST M = <Σ, Γ, S, s₀, δ, ω>, (∀t ∈ S)(∀a ∈ Σ)(ω̄(t, a) = ω(t, a)).
7.14. Modify the vending machine in Example 7.1 so that it can return all the coins that have
been inserted. Let r denote a new input that represents activating the coin return, and
let a represent a new output corresponding to the vending machine releasing all the
coins in its temporary holding area.
7.15. Given a FST M = <Σ, Γ, S, s₀, δ, ω> and M/E_M = <Σ, Γ, S_EM, s₀_EM, δ_EM, ω_EM>, show that δ_EM is well defined.
7.16. Given a FST M = <Σ, Γ, S, s₀, δ, ω> and M/E_M = <Σ, Γ, S_EM, s₀_EM, δ_EM, ω_EM>, show that ω_EM is well defined.
7.17. Give an example that shows that requiring a FST M to be reduced is not a sufficient
condition to ensure that M is minimal.
7.18. Show that the function μ defined in the proof of Theorem 7.5 is well defined.
7.19. Given the function μ defined in the proof of Theorem 7.5, prove that μ is really an isomorphism; that is:
(a) μ(s₀₁) = s₀₂.
(b) (∀s ∈ S₁)(∀a ∈ Σ)(μ(δ₁(s, a)) = δ₂(μ(s), a))
(c) (∀s ∈ S₁)(∀a ∈ Σ)(ω₁(s, a) = ω₂(μ(s), a))
(d) μ is a one-to-one function between S₁ and S₂.
(e) μ is onto S₂.
7.20. Consider a transducer that implements a "one-unit delay" over the alphabets Σ = {a, b} and Γ = {a, b, x}. The first letter of the output string should be x, and the nth letter of the output string should be the (n − 1)st letter of the input string (for n > 1). Thus, the translation of abbab is xabba, and so on.
(a) Define a sextuple for a Mealy machine that will perform this translation.
(b) Draw a Mealy machine that will perform this translation.
(c) Define a sextuple for a Moore machine that will perform this translation.
(d) Draw a Moore machine that will perform this translation.
7.21. Consider the circuit diagram that would correspond to the vending machine in Exam-
ple 7.1.
(a) Does there appear to be any reason to use an <EOS> symbol in the input
alphabet? Explain.
(b) Does there appear to be any reason to use an <SOS> symbol in the input alpha-
bet? Explain.
(c) How many encoding bits are needed for the input alphabet? Define an appropriate
encoding scheme.
(d) How many encoding bits are needed for the output alphabet? Define an appropri-
ate encoding scheme.
(e) How many encoding bits are needed for the state names? Define an appropriate
encoding scheme.
(f) Write the truth table and corresponding (minimized) Boolean function for t₂. Try to make the best possible use of the don't-care combinations.
(g) Write the truth table and corresponding (minimized) Boolean function for w₂. Try to make the best possible use of the don't-care combinations.
(h) Define the other functions and draw the complete circuit for the vending machine.
7.22. Consider the vending machine described in Exercise 7.14.
(a) Does there appear to be any reason to use an <EOS> symbol in the input
alphabet? Explain.
(b) How many encoding bits are needed for the input alphabet? Define an appropriate
encoding scheme.
(c) How many encoding bits are needed for the output alphabet? Define an appropri-
ate encoding scheme.
(d) How many encoding bits are needed for the state names? Define an appropriate
encoding scheme.
(e) Write the truth table and corresponding (minimized) Boolean function for t₃. Try to make the best possible use of the don't-care combinations.
(f) Write the truth table and corresponding (minimized) Boolean function for w₃. Try to make the best possible use of the don't-care combinations.
(g) Define the other functions and draw the complete circuit for the vending machine.
7.23. Use the standard encoding conventions to draw the circuit corresponding to the FST
defined in Example 7.2.
7.24. Use the standard encoding conventions to draw the circuit corresponding to the FST
defined in Example 7.6.
7.25. Use the standard encoding conventions to draw the circuit corresponding to the FST D defined in Example 7.8.
7.26. Give an example that shows that requiring a FST M to be connected is not a sufficient
condition to ensure that M is minimal.
7.27. Consider a transducer that implements a "two-unit delay" over the alphabets Σ = {a, b} and Γ = {a, b, x}. The first two letters of the output string should be xx, and the nth letter of the output string should be the (n − 2)nd letter of the input string (for n > 2). Thus, the translation of abbaba is xxabba, and so on.
(a) Define a sextuple for a Mealy machine that will perform this translation.
(b) Draw a Mealy machine that will perform this translation.
(c) Define a sextuple for a Moore machine that will perform this translation.
(d) Draw a Moore machine that will perform this translation.
7.28. (a) Give an example that shows that the conclusion of Theorem 7.5 can be false if M₁ is not reduced.
(b) What essential property of the proposed isomorphism μ is now absent?
7.29. (a) Give an example that shows that the conclusion of Theorem 7.5 can be false if M₁ is not connected.
(b) What essential property of the proposed isomorphism μ is now absent?
7.30. (a) Give an example that shows that the conclusion of Theorem 7.5 can be false if M₂ is not reduced.
(b) What essential property of the proposed isomorphism μ is now absent?
7.31. (a) Give an example that shows that the conclusion of Theorem 7.5 can be false if M₂ is not connected.
(b) What essential property of the proposed isomorphism μ is now absent?
7.32. (a) Give an example of a FST A for which A is not reduced and Aᶜ is not reduced.
(b) Give an example of a FST A for which A is not reduced and Aᶜ is reduced.
7.33. Complete the proof of Theorem 7.4 by showing:
(a) (∀y ∈ Σ*)(∀t ∈ S)(ω̄(t, y) = ω̄_EM([t]_EM, y)).
(b) M/E_M is equivalent to M.
(c) M/E_M is reduced.
(d) If M is connected, then M/E_M is connected.
7.34. Let Σ = {0, 1} and Γ = {y, n}.
(a) Define f₁(a₁a₂ ... aₘ) = yᵐ if a₁ = 1, and let f₁(a₁a₂ ... aₘ) = nᵐ otherwise. Thus, f₁(10) = yy and f₁(0101) = nnnn. Demonstrate that f₁ is FTD.
(b) Define f₂(a₁a₂ ... aₘ) = yᵐ if aₘ = 1, and let f₂(a₁a₂ ... aₘ) = nᵐ otherwise. Thus, f₂(10) = nn and f₂(0101) = yyyy. Prove that f₂ is not FTD.
7.35. Let Σ = {a, b} and Γ = {0, 1}. Define f₃(a₁a₂ ... aₘ) to be the first m letters of the infinite sequence 010010001000010⁵10⁶10⁷10⁸1 .... Thus, f₃(ababababab) = 0100100010 and f₃(abbaa) = 01001. Argue that f₃ is not FTD.
7.36. Assume f is FTD. Prove that (∀x ∈ Σⁿ)(∀y ∈ Σ*)(∀z ∈ Σ*) (the first n letters of f(xy) must agree with the first n letters of f(xz)).
7.37. Consider an elevator in a building with two floors. Floor 1 has an up button u on the wall, floor 2 has a down button d, and there are buttons labeled 1 and 2 inside the
elevator itself. The four actions taken by the elevator are close the doors, open the
doors, go to floor 1, and go to floor 2. Assume that an inactive elevator will attempt to
close the doors. For simplicity, assume that the model is not to incorporate sensors to
test for improperly closed doors, nor are there buttons to hold the doors open, and the
like. Also assume that when the elevator arrives on a given floor the call button for that
floor is automatically deactivated, rather than modeling the shutoff as a component of
the output.
(a) Define the input alphabet for this transducer (compare with Example 7.16).
[Figure 7.20 The intersection discussed in Exercise 7.40; diagram not reproduced]
7.40. Consider the intersection shown in Figure 7.20. Assume that a simple alternation of straight-ahead traffic is carried out, with no left turns indicated unless the corresponding sensor is activated. Further assume that left-turn traffic will be allowed to precede the opposing traffic.
(a) Define the new input and output alphabets.
(b) Draw a Moore machine that implements this scenario.
(c) Draw a Mealy machine that implements this scenario.
7.41. Consider an adder similar to the one in Example 7.15, but which instead models
addition in base 3.
(a) Define the input and output alphabets.
(b) Draw a Mealy machine that performs this addition.
(c) Draw a Moore machine that performs this addition.
(d) Draw a circuit that will implement the transducer built in part (b); use both
<EOS> and <SOS>.
7.42. Consider an adder similar to the one in Example 7.15, but which models addition in
base 10.
(a) Define the input and output alphabets.
(b) Define the sextuple of a Mealy machine that performs this addition (by indicating
the output and transitions by concise formulas, rather than writing out the 200
entries in the tables).
(c) Define the sextuple of a Moore machine that performs this addition.
(d) Draw a circuit that will implement the transducer built in part (b); use both
<EOS> and <SOS>.
(b) Give an example of a FST for which A is not reduced and Aᶜ is reduced.
7.64. (a) Give an example of a MSM for which A is not reduced and Aᶜ is not reduced.
(b) Give an example of a MSM for which A is not reduced and Aᶜ is reduced.
7.65. Isomorphism (≅) is a relation on the set of all Mealy machines.
(a) Prove that ≅ is a symmetric relation; that is, formally justify that if there is an isomorphism from A to B, then there is an isomorphism from B to A.
CHAPTER 8
REGULAR GRAMMARS
In the preceding chapters, we have seen several ways to characterize the set of FAD
languages: via DFAs, NDFAs, right congruences, and regular expressions. In this
chapter we will look at still another way to represent this class, using the concept of
grammars. This construct is very powerful, and many restrictions must be placed on
the general definition of a grammar in order to limit the scope to FAD languages.
The very restrictive regular grammars will be explored in full detail in this chapter.
The more robust classes of grammars introduced here will be discussed at length in
later chapters.
Much like the rules given in Backus-Naur Form (BNF) in Chapters 0 and 1, the
language-defining power of a grammar stems from the generation of strings through
the successive replacement of symbols in a partially constructed string. These re-
placement rules form the foundation for the definition of programming languages
and are used in compiler construction not only to determine correct syntax, but also
to help determine the meaning of the statements and thereby guide the translation
of a program into machine language.
EXAMPLE 8.1
A BNF that describes the set of all valid FORTRAN identifiers is given below.
Recall that such identifiers must begin with a letter and be followed by no more than
five other letters and numerals. These criteria can be specified by the following set
of rules.
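A sketch of such a set of rules follows; the nonterminal names here are illustrative assumptions, not necessarily those used in the original.

    <identifier> ::= <letter><r5>
    <r5> ::= λ | <symbol><r4>
    <r4> ::= λ | <symbol><r3>
    <r3> ::= λ | <symbol><r2>
    <r2> ::= λ | <symbol><r1>
    <r1> ::= λ | <symbol>
    <symbol> ::= <letter> | <digit>
    <letter> ::= A | B | C | ... | Z
    <digit> ::= 0 | 1 | 2 | ... | 9

Each <rk> nonterminal allows at most k further symbols, so a derived identifier consists of a letter followed by at most five letters or numerals.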
EXAMPLE 8.2
The strings used to represent regular sets (see Chapter 6) could have been succinctly
specified using BNF. Recall that regular languages over, say, {a, b, c} are described
by regular expressions. These regular expressions were strings over the alphabet
{ft, E, a, b, c, U, *,), G, and the formal definition was quite complex. A regular
0 ,
=> (ao(cUR»*
=> (ao(cUE»*
Note that in the intermediate steps of the derivation we do not wish to consider
strings such as (aoR)* to be valid regular expressions. (aoR)* is not a string over the
alphabet {tt, E, a, b, c, U, 0, *,),0, and it does not represent a regular language over
{a, b, c}. To generate a valid regular expression, the derivation must proceed until all
occurrences of R are removed. To differentiate between the symbols that may
remain and those that must be replaced, grammars divide the tokens into terminal
symbols and nonterminal symbols, respectively.
These examples illustrate the essential components of a grammar. A grammar must specify the terminal alphabet, the set of intermediary nonterminal symbols, and the designated start symbol, and it must also enumerate the set of rules for replacing phrases within a derivation with other phrases. In the above examples, the productions have all involved the replacement of single nonterminals with other strings. In an unrestricted grammar, a general replacement rule may allow an entire string α to be replaced by another string β. Thus, aBcD → bcA would be a legal production, and thus whenever the sequence aBcD is found within a derivation it can be replaced by the shorter string bcA.
EXAMPLE 8.3
EXAMPLE 8.4
Languages that contain A, such as {aibiei Ii;::: O} generated in Example 8.3 by the
unrestricted grammar Gil, cannot possibly be represented by a pure context-
sensitive grammar. However, the empty string is actually the only impediment to
finding an alternative collection of productions that all satisfy the condition
Ia I~ If31. The language {aibieil i ;::: 1} can be represented by a pure context-sensitive
grammar, as illustrated by the following grammar. Let G be given by
G = <{A, B, S, T}, {a, b, e}, S, {S---;. aSBe, S---;. aTe, T ---;. b, TB---;. bT, eB---;. Be}>
The derivation to produce aabbcc would now be
S ⇒ (by applying S → aSBc)
aSBc ⇒ (by applying S → aTc)
aaTcBc ⇒ (by applying cB → Bc)
aaTBcc ⇒ (by applying TB → bT)
aabTcc ⇒ (by applying T → b)
aabbcc
The shortest derivation possible in G is S ⇒ aTc ⇒ abc, producing the string abc. In Example 8.3, the shortest
derivation was S ⇒ T ⇒ λ.
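The rewriting process illustrated above can be simulated mechanically. The following sketch (ours, not from the text) hard-codes the grammar G of Example 8.4: successors applies each production at every position where its left side occurs, and a breadth-first search decides whether a given target string is derivable. The length-based pruning is justified because pure context-sensitive productions never shrink a sentential form.

    # A rough sketch (not from the text) of derivation by string rewriting,
    # using the pure context-sensitive grammar G of Example 8.4.
    from collections import deque

    PRODUCTIONS = [
        ("S", "aSBc"), ("S", "aTc"), ("T", "b"),
        ("TB", "bT"), ("cB", "Bc"),
    ]

    def successors(s):
        """Every string directly derivable from s (one production, one spot)."""
        out = []
        for lhs, rhs in PRODUCTIONS:
            i = s.find(lhs)
            while i != -1:
                out.append(s[:i] + rhs + s[i + len(lhs):])
                i = s.find(lhs, i + 1)
        return out

    def derivable(target, start="S"):
        """Breadth-first search for a derivation of target from start."""
        seen, queue = {start}, deque([start])
        while queue:
            s = queue.popleft()
            if s == target:
                return True
            for t in successors(s):
                # Pure context-sensitive productions never contract, so any
                # sentential form longer than the target can be discarded.
                if t not in seen and len(t) <= len(target):
                    seen.add(t)
                    queue.append(t)
        return False

    print(derivable("aabbcc"))  # True
    print(derivable("aabbc"))   # False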
original start symbol. Such grammars and their resulting languages are generally
referred to as type 1 or context sensitive.
The only production α → β that violates the condition |α| ≤ |β| is Z → λ, and
this production cannot play a part in any derivation other than Z ⇒ λ. From the start
symbol Z, the application of Z → λ immediately ends the derivation (producing λ),
while the application of Z → S will provide no further opportunity to use Z → λ,
since the requirement that Z ∉ Ω ∪ Σ means that the other productions will never
allow Z to reappear in the derivation. Thus, G' enhances the generating power of G
only to the extent that G' can produce λ. Every string in L(G) can be derived from
the productions of G', and G' generates no new strings besides λ. This argument
essentially proves that L(G') = L(G) ∪ {λ} (see the exercises).
EXAMPLE 8.5
The language generated by G'' in Example 8.3 was L(G'') = {aⁱbⁱcⁱ | i ≥ 0}. Since
L(G'') is {aⁱbⁱcⁱ | i ≥ 1} ∪ {λ}, it can therefore be represented by a context-sensitive
grammar by modifying the pure context-sensitive grammar in Example 8.4. Let G'
be given by
G' = <{A, B, S, T, Z}, {a, b, c}, Z,
{S → aSBc, S → aTc, T → b, TB → bT, cB → Bc, Z → λ, Z → S}>
The derivation to produce aabbcc would now be
Z ⇒ (by applying Z → S)
S ⇒ (by applying S → aSBc)
aSBc ⇒ (by applying S → aTc)
aaTcBc ⇒ (by applying cB → Bc)
aaTBcc ⇒ (by applying TB → bT)
aabTcc ⇒ (by applying T → b)
aabbcc
This grammar does produce λ, and all other derivations are strictly length-
increasing. Note that this was not the case in the grammar G'' in Example 8.3. The
last step of the derivation shown there transformed a string of length 7 into a string
of length 6. G'' does not satisfy the definition of a context-sensitive grammar; even
though only T could produce λ, T could occur later in the derivation. The presence
of T at later steps destroys the desirable property of having all other derivations be
strictly length-increasing at each step. Definition 8.4 is constructed to ensure that
the start symbol Z can never appear in a later derivation step.
Note that since the length of the left side of a context-free production is 1 and the
right side cannot be empty, pure context-free grammars have no contracting pro-
ductions and are therefore pure context-sensitive grammars. As with pure context-
sensitive grammars, pure context-free grammars cannot generate languages that
contain the empty string.
Productions of the form C → β are called C-rules. As was done with context-
sensitive grammars, this definition uses a new start symbol Z to avoid all such length-
decreasing productions except for a single one of the form Z → λ, which is used only
for generating the empty string. Type 2 languages will therefore always be type 1
languages. Note that the definition ensures that the only production that can
decrease the length of a derivation must be the Z-rule Z → λ.
The grammar corresponding to the BNF given in Example 8.2 would be a
context-free grammar, and thus the collection of all regular expressions is a type 2
language. The grammar given in Example 8.4 is not context free due to the presence
of the production cB → Bc, but this does not yield sufficient evidence to claim that
the resulting language {aⁱbⁱcⁱ | i ≥ 1} is not a context-free language. To support this
claim, it must be shown that no type 2 grammar can generate this language. A
pumping lemma for context-free languages will be presented in Chapter 10 to
provide a tool for measuring the complexity of such languages. Just as there are type
1 languages that are not type 2, there are type 0 languages that are not type 1.
Note that even these very restrictive type 2 grammars can produce languages
that are not FAD. As shown in Example 8.2, the language consisting of the
collection of all strings representing regular expressions is context free. However,
this collection is not FAD, since it is clear that the pumping lemma (Theorem 2.3)
would show that a DFA could not hope to correctly match up unlimited pairs of
parentheses.
Consequently, even more severe restrictions must be placed on grammars if
they are to have generative powers similar to the cognitive powers of a deterministic
finite automaton. The type 3 grammars explored in the next section are precisely
what is required. It will follow from the definitions that all type 3 languages are type
2. It is likewise clear that all type 2 languages must be type 1, and every type 1
language is type O. Thus, a hierarchy of languages is formed, from the most re-
strictive type 3 languages to the most robust type 0 languages. The four classes of
languages are distinct; there are type 2 languages that are not type 3 (for example,
Example 8.2), type 1 languages that are not type 2 (see Chapter 9), and type 0
languages that are not type 1 (see Chapter 12).
The grammatical classes described in Section 8.1 are each capable of generating all
the FAD languages; indeed, they even generate languages that cannot be recog-
nized by finite automata. This section will explore a class of grammars that generate
the class of regular languages: every FAD language can be generated by one of the
right-linear grammars defined below, and yet no right-linear grammar can generate
a non-FAD language.
Right-linear grammars belong to the class of type 3 grammars and generate all
the type 3 languages. Grammars that are right linear are very restrictive; only one
nonterminal can appear, and it must appear at the very end of the expression.
Consequently, in the course of a derivation, new terminals appear only on the right
end of the developing string, and the only time the string might shrink in size is
when a (final) production of the form A → λ is applied. A right-linear grammar may
have several contracting productions that produce λ and may not strictly conform
with the definition of a context-free grammar. However, Corollary 8.3 will show
that every type 3 language is a type 2 language.
Right-linear grammars generate words in the same fashion as the grammars
defined in Section 8.1. The following definition of derivation is tailored to right-
linear grammars, but it can easily be generalized to less restrictive grammars (see
Chapter 9).
While a decorated symbol might be more consistent with our previous extension nota-
tions, ⇒* is most commonly used in the literature.
EXAMPLE 8.6
Let G1 = <{T, S}, {a, b}, S, {S → aS, S → bT, T → aa}>. Then S ⇒* aabaa, since by
Definition 8.2, with x1 = λ, x2 = a, x3 = aa, x4 = aab, x5 = aabaa, A1 = A2 = A3 = S,
A4 = T, and A5 = λ.
S ⇒ aS (by applying S → aS)
⇒ aaS (by applying S → aS)
EXAMPLE 8.7
As in Example 8.6, consider G1 = <{T, S}, {a, b}, S, {S → aS, S → bT, T → aa}>.
Then L(G1) = a*baa = {baa, abaa, aabaa, ...}. Note that each of these words can
certainly be produced by G1; the number of as at the front of the string is entirely
determined by how many times the production S → aS is used in the derivation.
Furthermore, no other words in Σ* can be derived from G1; beginning from S,
the production S → aS may be used several times, but if no other production is
used, a string of the form aⁿS will be produced, and since S ∉ Σ, this is not a valid
string of terminals. The only way to remove the S is to apply the production S → bT,
which will leave a string of the form aⁿbT, which is also not in Σ*. The only
production that can be applied at this point is T → aa, deriving a string of the form
aⁿbaa. A proof involving induction on n would be required to formally prove that
L(G1) = {aⁿbaa | n ∈ ℕ} = a*baa. If G contains many productions, such inductive
proofs can be truly unpleasant.
EXAMPLE 8.8
Consider the grammar Q = <{I, F}, {0, 1, .}, I, {I → 0I | 1I | 0.F | 1.F, F → λ | 0F | 1F}>.
L(Q) is the set of all (terminating) binary numbers, including 101.11, 011.,
10.0, 0.010, and so on.
In a manner similar to that used for automata and regular expressions, we will
consider two grammars to be similar in some fundamental sense if they generate the
same language. The following definition formalizes this notion.
∇ Definition 8.9. Two grammars G1 = <Ω1, Σ, S1, P1> and G2 = <Ω2, Σ, S2, P2>
are called equivalent iff L(G1) = L(G2), and we will write G2 ≡ G1.
Δ
EXAMPLE 8.9
Consider G1 from Examples 8.6 and 8.7, and define the right-linear grammar
G5 = <{Z}, {a, b}, Z, {Z → aZ, Z → baa}>. Then L(G5) = a*baa = L(G1), and there-
fore G5 ≡ G1. The concept of equivalence applies to all types of grammars, whether
or not they are right linear, and hence the grammars G'' and G' from Examples 8.3
and 8.5 are likewise equivalent.
Definition 8.9 marks the fourth distinct use of the operator L and the concept
of equivalence. It has previously been used to denote the language recognized by a
DFA, the language recognized by an NDFA, and the language represented by a
regular expression [although the more precise notation L (R), which is the regular
set represented by the regular expression R, has generally been eschewed in favor of
the more common convention of denoting both the set and the expression by the
same symbol R]. In the larger sense, then, a representation X of a language,
regardless of whether X is a grammar, DFA, NDFA, or regular expression, is
equivalent to another representation Y iff L (X) = L (Y).
Our first goal in this section is to demonstrate that a cognitive representation
of a language (via a DFA) can be replaced by a generative representation (via a
right-linear grammar). In the broader sense of equivalence of representations
discussed above, Lemma 8.1 shows that any language defined by a DFA has an
equivalent representation as a right-linear grammar. We begin with a definition of
the class of all type 3 languages.
∇ Lemma 8.1. Given any alphabet Σ and a DFA A = <Σ, Q, q0, δ, F>, there
exists a right-linear grammar GA for which L(A) = L(GA).
Proof. Without loss of generality, assume Q = {q0, q1, q2, ..., qm}. Define
GA = <Q, Σ, q0, PA>, where PA = {q → a·δ(q, a) | q ∈ Q, a ∈ Σ} ∪ {q → λ | q ∈ F}.
There is one production of the form s → bt for each transition in the DFA, and one
production of the form s → λ for each final state s in F. (It may be helpful to look
over Example 8.10 to get a firmer grasp of the nature of PA before proceeding with
this proof.) Note that the set of nonterminals Ω is made up of the names of the states
in A, and the start symbol S is the name of the start state of A.
The heart of this proof is an inductive argument, which will show that for any
string x = a1a2⋯an ∈ Σ*,
q0 ⇒ a1·(δ(q0, a1))
⇒ a1a2·(δ(q0, a1a2))
EXAMPLE 8.10
Let
B = <{a, b}, {S, T}, S, δ, {T}>
where
δ(S, a) = T, δ(S, b) = T
δ(T, a) = S, δ(T, b) = S
This automaton is shown in Figure 8.1. Applying the construction in Lemma 8.1, we
have Ω = {S, T}, Σ = {a, b}, S = S, and
PB = {S → aT, S → bT, T → aS, T → bS, T → λ}.
Note that the derivation S ⇒ bT ⇒ baS ⇒ babT ⇒ bab mirrors the action of the
DFA as it processes the string bab, recording at each step of the derivation the string
that has been processed so far, followed by the current state of B. Conversely,
in trying to duplicate the action of B as it processes the string ab, we have
S ⇒ aT ⇒ abS, which cannot be transformed into a string of only as and bs without
processing at least one more letter, and hence ab ∉ L(GB). Since S is not a final
state, it cannot be removed from the derivation, corresponding to the rejection of
any string that brings us to a nonfinal state. Those strings that are accepted by B are
exactly those that end in the state T, and for which we will have the opportunity to
use the production T → λ in the corresponding derivation in GB, which will leave us
with a terminal string of only as and bs.
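The construction of Lemma 8.1 is mechanical enough to express directly in code. The sketch below (ours, not from the text; the representation is illustrative) builds the production set from the transition table of the automaton B in Example 8.10.

    # A rough sketch (not from the text) of the Lemma 8.1 construction,
    # hard-coding the automaton B of Example 8.10.
    DELTA = {("S", "a"): "T", ("S", "b"): "T",
             ("T", "a"): "S", ("T", "b"): "S"}
    FINALS = {"T"}

    def grammar_from_dfa(delta, finals):
        """P = {q -> a.delta(q, a)} together with q -> lambda for final q."""
        productions = [(q, a + p) for (q, a), p in delta.items()]
        productions += [(q, "") for q in finals]        # q -> lambda
        return productions

    for lhs, rhs in grammar_from_dfa(DELTA, FINALS):
        print(lhs, "->", rhs if rhs else "lambda")
    # S -> aT, S -> bT, T -> aS, T -> bS, T -> lambda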
",
266 Regular Grammars Chap. 8
EXAMPLE 8.11
Let G1 = <{T, S}, {a, b}, S, {S → aS, S → bT, T → aa}>. Then
AG1 = <{a, b}, {<aS>, <S>, <bT>, <T>, <aa>, <a>, <λ>}, {<S>}, δG1, {<λ>}>,
where δG1 is given by
The grammars we have considered so far are called right linear because productions
are constrained to have the resulting nonterminal appear to the right of the terminal
symbols. We next consider the class of grammars that arises by forcing the lone
nonterminal to appear to the left of the terminal symbols.
Note that a typical production might now look like A → Bcd, where the nonterminal
B occurs to the left of the terminal string cd.
EXAMPLE 8.12
Let G2 = <{A, S}, {a, b}, S, {S → Abaa, A → Aa, A → λ}>. Then
L(G2) = a*baa = {baa, abaa, aabaa, ...} = L(G1),
and so G2 ≡ G1 (compare with Example 8.7). Note that there does not seem to be an
obvious way to transform the right-linear grammar G1 discussed in Example 8.7 into
an equivalent left-linear grammar such as G2 (see the exercises).
As was done for right-linear grammars in the last section, we could show that
these left-linear grammars also generate the set of regular languages by constructing
corresponding machines and grammars (see the exercises). However, we will
instead prove that left-linear grammars are equivalent in power to right-linear
grammars by applying known results from previous chapters. The key to this strat-
egy is the reverse operator r (compare with Example 4.10 and Exercises 5.20 and
6.36).
EXAMPLE 8.13
Consider
G3 = <{T, S}, {a, b, c, d}, S, {S → abS, S → cdT, T → bT, T → b}>.
Then
G3ʳ = <{T, S}, {a, b, c, d}, S, {S → Sba, S → Tdc, T → Tb, T → b}>,
L(G3) = (ab)*cdbb*, L(G3ʳ) = b*bdc(ba)*,
and L(G3ʳ) = (L(G3))ʳ.
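The reversal trick is easy to mechanize when every nonterminal is a single character: reversing each production body turns a right-linear grammar into the corresponding left-linear one. A minimal sketch (ours, not from the text):

    # A rough sketch (not from the text): reversing each production body
    # turns the right-linear grammar G3 into the left-linear grammar G3^r,
    # whose language is the reverse of L(G3). Single-character nonterminals
    # are assumed.
    g3 = {"S": ["abS", "cdT"], "T": ["bT", "b"]}

    def reverse_grammar(g):
        """Reverse the right side of every production."""
        return {lhs: [rhs[::-1] for rhs in bodies] for lhs, bodies in g.items()}

    print(reverse_grammar(g3))
    # {'S': ['Sba', 'Tdc'], 'T': ['Tb', 'b']}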
Thus, the languages generated by left-linear (and hence regular) grammars are
referred to as type 3 languages. The class of type 3 languages is exactly the class of FAD languages.
can be derived from Si by using the productions in P. XS1 then represents L(G), and
these sets satisfy the language equations
XSk = Ek ∪ Ak1XS1 ∪ Ak2XS2 ∪ ⋯ ∪ AknXSn, for k = 1, 2, ..., n
where Ei is the union of all terminal strings x that appear in productions of the form
Si → x, and Aij is the union of all terminal strings x that appear in productions of the
form Si → xSj.
Proof. Since S1 is the start symbol, XS1 is by definition the set of all words that
can be derived from the start symbol, and hence XS1 = L(G). The relationships
between the variables XSj essentially embody the relationships enforced by the
productions in P.
Δ
EXAMPLE 8.14
Consider G1 from Example 8.11, in which
G1 = <{T, S}, {a, b}, S, {S → aS, S → bT, T → aa}>.
EXAMPLE 8.15
Let Σ = {a, b, c}, and consider the set of all words that end in b and for which every c
is immediately followed by a. This can be succinctly described by the grammar
G = <{S}, {a, b, c}, S, {S → aS, S → bS, S → caS, S → b}>. The resulting one equa-
tion in the single unknown XS is XS = b ∪ (a ∪ b ∪ ca)XS, and Theorem 6.1 can be
applied to yield a regular expression for this language; that is, XS = (a ∪ b ∪ ca)*b.
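For a single-nonterminal grammar such as this one, the regular expression can be read directly off the equation. The following sketch (ours, not from the text; the expression is assembled as a plain string) applies Theorem 6.1 in the form X = A*E:

    # A rough sketch (not from the text): Theorem 6.1 solves X = E U AX
    # as X = A*E, so the answer can be assembled textually.
    def solve(E, A):
        """Minimal solution of the language equation X = E U AX."""
        return "(" + A + ")*" + E

    # Example 8.15: X_S = b U (a U b U ca) X_S
    print(solve("b", "a U b U ca"))   # (a U b U ca)*b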
desired solution will always be the minimal solution predicted by the technique used
in Theorem 6.1. The condition prohibiting λ from appearing in the set A in the
equation X = E ∪ AX was required to guarantee a unique solution. Regardless of
the nature of the set A, A*E is guaranteed to be a solution, and it will be contained
in any other solution, as restated in Lemma 8.4.
∇ Lemma 8.4. Let E and A be any two sets, and consider the language equa-
tion X = E ∪ AX. A*E is always a solution for X, and any other solution Y must
satisfy the property A*E ⊆ Y.
Proof. Follows immediately from Theorem 6.1.
Δ
The constant term in this equation is Ek ∪ (Akn·Ann*·En), and the coefficient for Xj is
A'kj = Akj ∪ (Akn·Ann*·Anj), which agrees with the formula given in (a). The substi-
tution for Xn was shown to yield a minimal set of n − 1 equations in the unknowns X1
through Xn−1, and the induction assumption guarantees that the elimination and
back-substitution method yields a minimal solution for W1 through Wn−1. Lemma
8.4 then guarantees that the solution for
Wn = Ann*·En ∪ Ann*·An1W1 ∪ Ann*·An2W2 ∪ ⋯ ∪ Ann*·An(n−1)Wn−1
is minimal, which completes the minimal solution for the original system of n
equations.
Δ
As with Lemma 8.4, the minimal expressions thus generated describe exactly
those terminal strings that can be produced by a right-linear grammar. In an analo-
gous fashion, left-linear grammars give rise to a set of left-linear equations, which
can be solved as indicated in Theorem 6.4.
The above discussion describes the transformation of regular grammars into
regular expressions. Generating grammars from regular expressions hinges on the
interpretation of the six building blocks of regular expressions, as described in
Definition 6.2. Since ℛΣ is the same as the class of FAD languages, all the closure properties known about FAD languages
must also apply to ℛΣ, but it can be instructive to reprove these theorems using
grammatical constructions. Such proofs will also provide guidelines for directly
transforming a regular expression into a grammar without first constructing a corre-
sponding automaton.
∇ Theorem 8.4. Let Σ be an alphabet. Then ℛΣ is effectively closed under
union.
Proof. Let G1 = <Ω1, Σ, S1, P1> and G2 = <Ω2, Σ, S2, P2> be two right-linear
grammars, and without loss of generality assume that Ω1 ∩ Ω2 = ∅. Choose a new
nonterminal Z such that Z ∉ Ω1 ∪ Ω2, and consider the new grammar G∪ defined by
G∪ = <Ω1 ∪ Ω2 ∪ {Z}, Σ, Z, P1 ∪ P2 ∪ {Z → S1, Z → S2}>. It is straightforward to
show that L(G∪) = L(G1) ∪ L(G2) (see the exercises). From the start symbol Z there
are only two productions that can be applied; if Z → S1 is chosen, then the
derivation will have to continue with productions from P1 and produce a word from
L(G1) (why can't productions from P2 be applied?). Similarly, if Z → S2 is chosen
instead, the only result can be a word from L(G2).
Δ
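The union construction itself is only a few lines when a grammar is represented by its start symbol and production list. A minimal sketch (ours, not from the text), assuming disjoint nonterminal sets and a fresh symbol Z:

    # A rough sketch (not from the text) of the Theorem 8.4 construction.
    # A grammar is a (start symbol, production list) pair; the nonterminal
    # sets are assumed disjoint, and "Z" is assumed to be unused by both.
    def union_grammar(g1, g2):
        s1, p1 = g1
        s2, p2 = g2
        return ("Z", p1 + p2 + [("Z", s1), ("Z", s2)])

    g1 = ("S", [("S", "aS"), ("S", "b")])   # generates a*b
    g2 = ("R", [("R", "cR"), ("R", "d")])   # generates c*d
    print(union_grammar(g1, g2))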
In an analogous fashion, effective closure of ℛΣ can be demonstrated for the
operators Kleene closure and concatenation. The proof for Kleene closure is out-
lined below. The construction for concatenation is left for the exercises; the tech-
nique is illustrated in Example 8.18.
That is, all productions in P that end in a nonterminal are retained, while all other
productions in P are appended with the new symbol Z, and the two new productions
Z → λ and Z → S are added. A straightforward induction argument will show that
the derivations that use n applications of productions of the form A → xZ generate
exactly the words in L(G)ⁿ. Consequently, L(G*) = L(G)*.
Δ
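The Kleene-closure construction just described can be sketched the same way (ours, not from the text; single-character symbols are assumed and "Z" is assumed to be a fresh nonterminal):

    # A rough sketch (not from the text) of the Kleene-closure construction.
    def star_grammar(start, productions, nonterminals):
        new = [("Z", ""), ("Z", start)]            # Z -> lambda, Z -> S
        for lhs, rhs in productions:
            if rhs and rhs[-1] in nonterminals:
                new.append((lhs, rhs))             # ends in a nonterminal: keep
            else:
                new.append((lhs, rhs + "Z"))       # else allow a restart via Z
        return ("Z", new)

    print(star_grammar("S", [("S", "aS"), ("S", "b")], {"S"}))
    # ('Z', [('Z', ''), ('Z', 'S'), ('S', 'aS'), ('S', 'bZ')])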
EXAMPLE 8.16
Let Σ = {a, b, c}, and consider the regular expression (a ∪ b). The grammars
G1 = <{R}, {a, b, c}, R, {R → a}> and G2 = <{T}, {a, b, c}, T, {T → b}> can be com-
bined as suggested in Theorem 8.4 (with A playing the role of Z) to form
G = <{T, R, A}, {a, b, c}, A, {A → R, A → T, R → a, T → b}>.
EXAMPLE 8.17
EXAMPLE 8.18
The previous examples illustrate the manner in which regular expressions can
be systematically translated into right-linear grammars. Constructions correspond-
ing to those given in Theorems 8.4, 8.5, and 8.6 can similarly be found for left-linear
grammars (see the exercises).
Normal forms for grammars are quite useful in many contexts. A standard
representation can be especially useful in proving theorems about grammars. For
example, the construction given in Lemma 8.2 would have been more concise and
easier to investigate if complex productions such as S → bcaaT could be avoided.
Indeed, if all productions in the grammar G had been of the form A → aB or A → λ,
both the state set and the state transition function of AG could have been defined
more easily. Other constructions and proofs may also be able to make use of the
simpler types of productions in grammars that conform to such normal forms. The
following theorem guarantees that a given right-linear grammar has a correspond-
ing equivalent grammar containing only productions that conform to the above
standard.
right-linear grammar. By the construction given in Lemma 8.1, all the productions
in this grammar are indeed of the form A → aB or A → λ.
Δ
Note that the proof given is a constructive proof: rather than simply arguing
the existence of such a grammar, a method for obtaining G1 is outlined. The above
theorem could have been proved without relying on automata constructs. Basically,
"long" productions like T → abcR would be replaced by a series of productions
involving newly introduced nonterminals, for example, T → aX, X → bY, Y → cR.
Similarly, a production like T → aa might be replaced by the sequence T → aB,
B → aC, C → λ. If the existence of such a normal form had been available for the
proof of Lemma 8.2, the construction of AG could have been simplified and the
complexity of the proof drastically curtailed. Indeed, the resulting machine would
have contained no λ-moves. Only one state per nonterminal would have been
necessary, with final states corresponding to nonterminals that had productions of
the form A → λ. Productions of the form A → aB would imply that B ∈ δ(A, a).
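The production-splitting idea is straightforward to mechanize. The sketch below (ours, not from the text) reproduces the two replacements just described; it assumes single-character symbols, at least one terminal in every production, and draws fresh nonterminal names from unused uppercase letters.

    # A rough sketch (not from the text) of splitting "long" right-linear
    # productions into the A -> aB / A -> lambda normal form.
    import string

    def normalize(productions, nonterminals):
        fresh = iter(c for c in string.ascii_uppercase if c not in nonterminals)
        result = []
        for lhs, rhs in productions:
            tail = rhs[-1] if rhs and rhs[-1] in nonterminals else None
            body = rhs[:-1] if tail else rhs          # terminal prefix of the rule
            cur = lhs
            for i, a in enumerate(body):
                if i == len(body) - 1 and tail:       # last letter: reattach B
                    result.append((cur, a + tail))
                else:
                    nxt = next(fresh)
                    result.append((cur, a + nxt))
                    cur = nxt
            if not tail:
                result.append((cur, ""))              # cur -> lambda
        return result

    print(normalize([("T", "abcR"), ("T", "aa")], {"T", "R"}))
    # [('T', 'aA'), ('A', 'bB'), ('B', 'cR'), ('T', 'aC'), ('C', 'aD'), ('D', '')]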
EXAMPLE 8.19
G = <{S, T, B, C}, {a, b}, S, {S → aS, S → bT, T → aB, B → aC, C → λ}> can be
represented by the NDFA shown in Figure 8.3.
EXERCISES
8.1. Can strings like abBAdBc (where B and A are nonterminals) ever be derived from the
start symbol S in a right-linear grammar? Explain.
8.2. Given A and GA as defined in Lemma 8.1, let P(n) be the statement that
(∀x ∈ Σⁿ)(∃j ∈ ℕ)[if q0 ⇒* xqj then δA(q0, x) = qj]. Prove that P(n) is true for all n ∈ ℕ.
8.3. Give regular expressions that describe the language generated by:
8.18. Without appealing to results from Chapter 12, outline an algorithm that will determine
whether the language generated by a given regular grammar G is infinite.
8.19. Without appealing to results from Chapter 12, outline an algorithm that will determine
whether two right-linear grammars G1 and G2 generate the same language.
8.20. Consider the grammar
H = <{A, B, S}, {a, b, c}, S, {S → aSBc, S → λ, SB → bS, cB → Bc}>
Determine L(H).
8.21. What is wrong with proving that ℛΣ is closed under concatenation by using the fol-
lowing construction? Let G1 = <Ω1, Σ, S1, P1> and G2 = <Ω2, Σ, S2, P2> be two
right-linear grammars, and, without loss of generality, assume that Ω1 ∩ Ω2 = ∅.
Choose a new nonterminal Z such that Z ∉ Ω1 ∪ Ω2, and define a new grammar
G∘ = <Ω1 ∪ Ω2 ∪ {Z}, Σ, Z, P1 ∪ P2 ∪ {Z → S1·S2}>. Note: It is straightforward to show
that L(G∘) = L(G1)·L(G2) (see Chapter 9).
8.22. Prove that ℛΣ is closed under concatenation by:
(a) Constructing a new grammar G∘ with the property that L(G∘) = L(G1)·L(G2).
(b) Proving that L(G∘) = L(G1)·L(G2).
8.23. Use the constructs presented in this chapter to solve the following problem from
Chapter 4: Given a nondeterministic finite automaton A without λ-transitions, show
that it is possible to construct a nondeterministic finite automaton with λ-transitions A'
with the properties (1) A' has exactly one start state and exactly one final state, and (2)
L(A') = L(A).
8.24. Complete the proof of Lemma 8.2 by:
(a) Defining an appropriate inductive statement.
(b) Proving the statement defined in part (a).
8.25. Complete the proof of Lemma 8.3 by:
(a) Defining an appropriate inductive statement.
(b) Proving the statement defined in part (a).
8.26. Fill in the details in the second half of the proof of Theorem 8.2 by providing reasons
for each of the assertions that were made.
8.27. (a) Refer to Example 8.7 and use induction to formally prove that
L(G1) = {aⁿbaa | n ∈ ℕ}.
(b) Refer to Example 8.9 and use induction to formally prove that
L(G5) = {aⁿbaa | n ∈ ℕ}.
8.28. Notice that regular grammars are defined to have production sets that contain only
right-linear-type productions or only left-linear-type productions. Consider the follow-
ing grammar C, which contains both types of productions:
C = <{S, A, B}, {0, 1}, S, {S → 0A | 1B | 011 | λ, A → S0, B → S1}>.
Note that S ⇒ 0A ⇒ 0S0 ⇒ 01B0 ⇒ 01S10 ⇒ 0110.
(a) Find L(C).
(b) Is L(C) FAD?
(c) Should the definition of regular grammars be expanded to include grammars like
this one? Explain.
8.29. (a) Why was it important to assume that Ω1 ∩ Ω2 = ∅ in the proof of Theorem 8.4?
Give an example.
(b) Why was it possible to assume that Ω1 ∩ Ω2 = ∅ in the proof of Theorem 8.4? Give a
justification.
8.30. Consider the NDFA AG defined in Lemma 8.2. If AG is disconnected, what does this
say about the grammar G?
8.31. Apply Lemma 8.1 to the automata in Figure 8.4.
8.32. (a) Restate Lemma 8.1 so that it directly applies to NDFAs.
(b) Prove this new lemma.
(c) Assume Σ = {a, b, c} and apply this new lemma to the automata in Figure 8.5.
[Figure 8.4 and Figure 8.5: the automata referenced in Exercises 8.31 and 8.32, panels (a) through (g)]
CONTEXT-FREE GRAMMARS
The preceding chapter explored the properties of the type 3 grammars. The next
class of grammars in the language hierarchy, the type 2 or context-free grammars,
are central to the linguistic aspects of computer science. Context-free grammars
were originally used to help specify natural languages and are thus well-suited for
defining computer languages. These context-free grammars represent a much wider
class of languages than did the regular grammars. Due to the need for balancing
parentheses and matched begin-end pairs (among other things), the language Pas-
cal cannot be specified by a regular grammar, but it can be defined with a context-
free grammar. Programming languages are specifically designed to be representable
by context-free grammars in order to take advantage of the desirable properties
inherent in type 2 grammars. These properties are explored in this chapter, while
Chapter 10 investigates the generalized automata corresponding to context-free
languages.
can be directly derived from αAγ by applying the production A → β, and write
αAγ ⇒ αβγ. Furthermore, if (α1 ⇒ α2) ∧ (α2 ⇒ α3) ∧ ⋯ ∧ (αn−1 ⇒ αn), then we will
say that α1 derives αn and write α1 ⇒* αn.
Δ
Recall that for context-free grammars only the start symbol Z can have a
production of the form Z → λ; regular grammars are allowed to have several such
rules.
EXAMPLE 9.1
[Figure: parse tree of an English sentence. <sentence> branches into <subject> and <predicate>; the <subject> is a <noun phrase> built from <modifier> and <noun>; the <predicate> consists of <verb> and <prepositional phrase>; the <prepositional phrase> consists of <preposition> followed by a <noun phrase> built from <modifier> and <noun>.]
EXAMPLE 9.2
Regular grammars form parse trees that are much more restrictive; at any given
level in the tree, only one node can be labeled with a nonterminal. Figure 9.2 shows
the parse tree for the word aaabaa from the grammar
G1 = <{T, S}, {a, b}, S, {S → aS, S → bT, T → aa}>.
In general, since productions in a right-linear grammar allow only the rightmost
symbol to be a nonterminal, parse trees for right-linear grammars will only allow the
rightmost child of a node to have a nontrivial subtree.
EXAMPLE 9.3
Figure 9.2 The parse tree discussed in Example 9.2
Figure 9.3 The parse tree discussed in Example 9.3; its leaves spell out ((a ∪ b)* ∘ c)
EXAMPLE 9.4
For the grammar
G = <{R}, {a, b, c, (, ), λ, ∅, ∪, ∘, *}, R, {R → a | b | c | λ | ∅ | (R∘R) | (R∪R) | R*}>,
each of the following is a valid derivation of the string x = ((a∪b)*∘c).
Derivation 1:
R ⇒ (R∘R)
⇒ (R*∘R)
⇒ ((R∪R)*∘R)
⇒ ((a∪R)*∘R)
⇒ ((a∪b)*∘R)
⇒ ((a∪b)*∘c)
Derivation 2:
R ⇒ (R∘R)
⇒ (R*∘R)
⇒ ((R∪R)*∘R)
⇒ ((R∪R)*∘c)
⇒ ((R∪b)*∘c)
⇒ ((a∪b)*∘c)
Derivation 3:
R ⇒ (R∘R)
⇒ (R∘c)
⇒ (R*∘c)
⇒ ((R∪R)*∘c)
⇒ ((a∪R)*∘c)
⇒ ((a∪b)*∘c)
Derivation 4:
R ⇒ (R∘R)
⇒ (R∘c)
⇒ (R*∘c)
⇒ ((R∪R)*∘c)
⇒ ((R∪b)*∘c)
⇒ ((a∪b)*∘c)
be visited by a preorder traversal. Note that the sequence in which the nonterminals
would be expanded in a leftmost derivation corresponds to the order in which they
appear in the preorder traversal.
9.2 AMBIGUITY
EXAMPLE 9.5
S~AA
~aSaA
~aAAaA
~aaAaA
~aaaaA
~aaaaa
~aA
~aaSa
~aaAAa
~aaaAa
~aaaaa
EXAMPLE 9.6
Figure 9.5 (a) A parse tree for aaaaa in Example 9.5 (b) An alternate parse
tree for aaaaa
<expression> → <identifier>
<expression> → <identifier> - <expression>
<expression> → <expression> - <identifier>
<identifier> → a
<identifier> → b
<identifier> → c
<identifier> → d
L(Gs) then contains the string a - b - d, which can be generated by two distinct
parse trees, as shown in Figure 9.6. Figure 9.6a corresponds to the following
leftmost derivation.
<expression> ⇒ <expression> - <identifier>
⇒ <identifier> - <expression> - <identifier>
⇒ a - <expression> - <identifier>
⇒ a - <identifier> - <identifier>
⇒ a - b - <identifier>
⇒ a - b - d
Figure 9.6 (a) A parse tree for a - b - d in Example 9.6 (b) An alternate parse
tree for a - b - d
In the language L(Gs) discussed in Example 9.6, the ambiguity is again not
inherent in the language itself, but is rather a consequence of the specific produc-
tions in the grammar Gs describing the language. In most programming languages,
the expression a - b - d is allowed and has a well-defined meaning. Most languages
decree that such expressions be evaluated from left to right, and hence a - b - d
would be interpreted as (a - b) - d. This interpretation can be enforced by simply
removing the production
<expression> → <identifier> - <expression>
from Gs to form the new grammar
Gm = <{<expression>, <identifier>}, {a, b, c, d, -}, <expression>, P'>
where P' consists of the productions
<expression> → <identifier>
<expression> → <expression> - <identifier>
<identifier> → a
<identifier> → b
<identifier> → c
<identifier> → d
It should be clear that Gs and Gm are equivalent, and both generate the regular
language ((a∪b∪c∪d)∘-)*∘(a∪b∪c∪d). Gm gives rise to unique parse trees and
is therefore unambiguous. It should be noted that the language could have
been defined with a single nonterminal; a simpler grammar equivalent to Gm is
Gt = <{T}, {a, b, c, d, -}, T, {T → a | b | c | d | T - T}>. However, since Gt is ambigu-
ous, it is much more difficult to work with than Gm. The pair of nonterminals
<expression> and <identifier> is used to circumvent the ambiguity problem in
this language. For the grammar Gm, the production
<expression> → <expression> - <identifier>
contains the nonterminal <expression> to the left of the subtraction token and
<identifier> to the right of the -. Since <identifier> can only be replaced by a
terminal representing a single variable, the resulting parse tree will ensure that the
entire expression to the left of the - will be evaluated before the operation corre-
sponding to this current subtraction token is performed. In this fashion, the dis-
tinction between the two nonterminals forces a left-to-right evaluation sequence. In
fact, a more robust language with other operators like × and ÷ will require more
nonterminals to enforce the default precedence among these operators.
Most modern programming languages employ a solution to the ambiguity
problem that is different from the one just described. Programmers generally do not
want to be constrained by operators that can only be evaluated from left to right,
and hence matched parentheses are used to indicate an order of evaluation that may
differ from the default. Thus, unambiguous grammars that correctly reflect the
meaning of expressions like d - (b - c) or even (a) - ((c - (d))) are sought.
EXAMPLE 9.7
The following grammar Gp allows expressions with parentheses, minus signs, and
single-letter identifiers to be uniquely parsed.
Gp = <{<expression>, <identifier>}, {a, b, c, d, -, (, )}, <expression>, P''>
where P'' consists of the productions
<expression> → ( <expression> )
<expression> → <expression> - ( <expression> )
<expression> → <identifier>
<expression> → <expression> - <identifier>
<identifier> → a
<identifier> → b
<identifier> → c
<identifier> → d
The first two productions in P'', which were not present in P', are designed to
handle the balancing of parentheses. The first rule allows superfluous sets of paren-
theses to be correctly recognized. The second rule ensures that an expression that is
surrounded by parentheses is evaluated before the operator outside those paren-
theses is evaluated. In the absence of parentheses, the left-to-right ordering of the
operators is maintained. Figure 9.7 illustrates the unique parse tree for the
expression (a) - ((c - (d))).
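The order of evaluation that Gp enforces can be mirrored by a small parser. The following sketch (ours, not from the text; the identifier values are arbitrary) reads an expression as a primary, an identifier or a parenthesized expression, followed by zero or more "- primary" steps, which is exactly the left-to-right grouping the grammar's parse trees dictate:

    # A rough sketch (not from the text) of parsing/evaluating expressions
    # with the structure imposed by Gp; VALUES is an illustrative binding.
    VALUES = {"a": 1, "b": 2, "c": 3, "d": 4}

    def evaluate(s):
        tokens = list(s.replace(" ", ""))
        pos = [0]

        def primary():
            if tokens[pos[0]] == "(":            # <expression> -> ( <expression> )
                pos[0] += 1
                v = expression()
                pos[0] += 1                      # consume ")"
                return v
            v = VALUES[tokens[pos[0]]]           # <expression> -> <identifier>
            pos[0] += 1
            return v

        def expression():
            v = primary()
            while pos[0] < len(tokens) and tokens[pos[0]] == "-":
                pos[0] += 1
                v -= primary()                   # left-to-right evaluation
            return v

        return expression()

    print(evaluate("a - b - d"))           # (1 - 2) - 4 = -5
    print(evaluate("(a) - ((c - (d)))"))   # 1 - (3 - 4) = 2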
L(Gp) is a context-free language that is too complex to be regular; the pumping
lemma for regular sets (Theorem 2.3) can be used to show that it is impossible for a
DFA to maintain an unlimited number of corresponding balanced parentheses. This
language, and the others discussed so far, can all be expressed by unambiguous
grammars. It should be clear that every language generated by grammars has ambig-
uous grammars that also generate it, since an unambiguous grammar can always be
modified to become ambiguous. What is not immediately clear is whether there are
languages that can only be generated by ambiguous grammars.
∇ Definition 9.6. Let the class of context-free languages over the alphabet Σ
be denoted by 𝒞Σ. Let the class of unambiguous context-free languages be denoted
by 𝒰Σ.
Δ
∇ Theorem 9.1. There are context-free languages that are inherently ambigu-
ous; that is, 𝒰Σ is properly contained in 𝒞Σ.
[Figure 9.7: the unique parse tree for the expression (a) - ((c - (d)))]
Theorem 9.1 states that there exist inherently ambiguous type 2 languages. No
type 3 language is inherently ambiguous. Even though there are regular grammars
that are ambiguous, every regular grammar has an equivalent grammar that is
unambiguous. This assertion is supported by the following examples and results.
EXAMPLE 9.8
Figure 9.8 The parse trees discussed in Example 9.8
Proof. Let G' be the grammar obtained as follows. Beginning with the right-linear grammar G,
use the construction outlined in Lemma 8.2 to find the corresponding automaton
AG. Use Definition 4.9 to remove the lambda-transitions and Definition 4.5 to
produce a deterministic machine, and then apply the construction outlined in
Lemma 8.1 to form the new right-linear grammar G'. By Lemma 8.2, Theorem 4.2,
Theorem 4.1, and Lemma 8.1, the language defined by each of these constructs is
unchanged, so G' is equivalent to G. Due to the deterministic nature of the machine
from which this new grammar was built, the resulting parse tree for a given string
must be unique, since only one production is applicable at any point in the
derivation. A formal inductive statement of this property is left as an exercise.
Δ
∇ Corollary 9.1. The class ℛΣ of languages generated by regular grammars is
properly contained in 𝒰Σ.
Proof. Containment follows immediately from Theorem 9.2. Proper contain-
ment is demonstrated by the language and grammar discussed in Example 9.3.
Δ
EXAMPLE 9.9
Figure 9.9 (a) The automaton discussed in Example 9.9 (b) The simplified
automaton discussed in Example 9.9 (c) The deterministic automaton discussed
in Example 9.9 (d) The final automaton discussed in Example 9.9
The orderly nature of this resulting type of grammar easily admits the specification
of an algorithm that scans a proposed terminal string and builds the corresponding
parse tree. The partial parse tree for a string such as abb would be as pictured in
Figure 9.10a. This would clearly be an invalid string, since S4 cannot be replaced by
λ. By contrast, the tree for the word abc would produce a complete parse tree, and
it is instructive to step through the process by which it is built. The root of the tree
must be labeled S0, and scanning the first letter of the word abc is sufficient to
determine that the first production to be applied is S0 → aS1 (since no other S0-rule
immediately produces an a). Scanning the next letter provides enough information
to determine that the next S1-rule that is used must be S1 → bS2, and the third letter
admits the production S2 → cS3 and no other. Recognizing the end of the string
causes a check for whether the current nonterminal can produce the empty string.
Since S3 → λ is in the grammar, the string abc is a valid terminal string, and corre-
sponds to the parse tree shown in Figure 9.10b.
Figure 9.10 (a) The partial parse tree for the string abb (b) The parse tree for
the string abc
Grammars that admit scanning algorithms like the one outlined above are
called LL0 grammars, since the parse tree can be deduced using a left-to-right scan
of the proposed string while looking ahead 0 symbols to produce a leftmost deri-
vation. That is, the production that produces a given symbol can be immediately
determined without regard to the symbols that follow.
Note that the grammar G3 = <{T}, {a}, T, {T → aaaT, T → aa}> is LL2; that is,
upon seeing an a, the scanner must look ahead two symbols to see if the end-of-string
marker is imminent. In this grammar, a may be produced by either of the two
T-rules; the letters following this symbol in the proposed string are an important
factor in determining which production must be applied. The language described by
G3 is simple enough to be defined by a grammar that is LL0, since every regular
grammar can be transformed as suggested by the proof of Theorem 9.2.
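The lookahead needed for G3 can be made explicit. In the sketch below (ours, not from the text), the decision at each occurrence of T is taken by peeking past the next a's to see whether enough of the string remains for the rule T → aaaT, falling back to T → aa otherwise:

    # A rough sketch (not from the text) of the lookahead needed to parse
    # G3 = <{T}, {a}, T, {T -> aaaT, T -> aa}> deterministically.
    def parse(w):
        i = 0                                  # a T is expanded at each step
        while w[i:i + 3] == "aaa" and len(w) - i > 3:
            i += 3                             # lookahead says: apply T -> aaaT
        return w[i:] == "aa"                   # otherwise T -> aa must finish

    for w in ["aa", "aaaaa", "aaa", "aaaa"]:
        print(w, parse(w))   # True, True, False, False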
The deterministic orderliness of LLO grammars may be generally unattain-
able, but it represents a desirable goal that a compiler designer would strive to
approximate when specifying a grammatical model of a programming language.
When a grammar is being defined to serve as a guide to construct a compiler, an
LLO grammar is clearly the grammar of choice. Indeed, if even a portion of a
context-free grammar conforms to the LLO property, this is of considerable benefit.
Whereas the technique outlined in Theorem 9.2 could be applied to any regular
language to find a hospitable LLO grammar, programming languages are generally
more complex than regular languages, and these languages are unlikely to have LLO
models. For context-free languages, it is much more likely that it will not be possible
to determine which production (or sequence of productions) will produce the
symbol currently being scanned. In such cases, it will be necessary to look ahead to
successive symbols to make this determination.
A classic example of the need to look ahead in parsing programming lan-
guages is reflected in the following FORTRAN statement:
DO 77 I = 1.5
Since FORTRAN allows blanks within identifiers, this is a valid statement and
should cause the variable DO77I to be assigned the value 1.5. On the other hand,
the statement
DO 77 I = 1,5
specifies a "do" loop, and has an entirely different meaning. A lexical analyzer that
sees the three characters 'DO ' cannot immediately determine whether this
represents a token for a do loop, or is instead part of a variable identifier. It may
have to wait until well after the equal sign is scanned to correctly identify the tokens.
ships between the strings generated by the grammar and the production sequences
generating those strings. In particular, the length of a terminal string may bear very
little relation to the number of productions needed to generate that string.
EXAMPLE 9.10
It should be clear that even more extreme examples can be defined, in which the
number of terminal symbols markedly dominates the number of productions, and
vice versa.
The pumping theorem for context-free grammars (Theorem 9.7) and other
theorems hinge on a more precise relationship between the number of terminal
symbols produced and the number of productions used to produce those symbols.
Grammars whose production sets satisfy more rigorous constraints are needed if
such relationships are to be guaranteed. The constraints should not be so severe that
some context-free languages cannot be generated by a set of productions that
conform to the restrictions. In other words, some well-behaved normal forms are
sought.
A practical step toward that goal is the abolition of productions that cannot
participate in valid derivations. The algorithm for identifying such productions
constitutes an application of the algorithms developed previously for finite auto-
mata. The following definition formally identifies productions that cannot partici-
pate in valid derivations.
EXAMPLE 9.11
Consider the grammar with productions
S → gAe, S → aYB, S → CY
A → bBY, A → ooC
B → dd, B → D
C → jVB, C → gi
D → n
U → kW
V → baXXX, V → oV
W → c
X → fV
Y → Yhm
This grammar illustrates the three basic ways a nonterminal can qualify as useless.
states of the NDFA correspond to nonterminals of the grammar, and one extra
state, denoted by ω, is added to serve as the only final state. A transition from A
to C will arise if a production in P allows A to be replaced by a string contain-
ing the nonterminal C. States corresponding to nonterminals that directly pro-
duce terminal strings will also have transitions to the sole final state ω. Formally,
for the grammar G = <Ω, Σ, S, P> and any nonterminal B ∈ Ω, define the
NDFA AB = <{1}, Ω ∪ {ω}, B, δ, {ω}>, where δ is defined by δ(ω, 1) = ∅, and for
each A ∈ Ω, let
δ(A, 1) = {C | (C ∈ Ω) ∧ (∃α, γ ∈ (Ω ∪ Σ)*)(A → αCγ ∈ P)} ∪ {ω}
if (∃α ∈ Σ*)(A → α ∈ P), and
δ(A, 1) = {C | (C ∈ Ω) ∧ (∃α, γ ∈ (Ω ∪ Σ)*)(A → αCγ ∈ P)}
otherwise.
Note that, for any two nonterminals R and Q in Ω, AR and AQ are identical
except for the specification of the start state. The previously presented algorithms
for determining the set of connected states in an automaton can be applied to these
new automata to identify the useless nonterminals. As noted before, there are three
basic ways a nonterminal can qualify as useless. The inaccessible states in the NDFA
AS correspond to nonterminals of the first type and can be eliminated from both the
grammar and the automata. For each remaining nonterminal B, if the final state ω is
not accessible in AB, then B is a useless nonterminal of the second type and can be
eliminated from further consideration in both the grammar and the automata.
Checking for disconnected states in the pared-down version of AS will identify
useless nonterminals of the third type. The process can be repeated until no further
disconnected states are found.
Δ
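Rather than building the family of automata AB, the same two conditions can be checked directly on the production set. The sketch below (ours, not from the text) applies the test to the grammar of Example 9.11: a nonterminal survives only if it can derive a terminal string and is reachable from S through productions whose nonterminals all have that property.

    # A rough sketch (not from the text) of useless-nonterminal elimination
    # for the grammar of Example 9.11 (single-character symbols assumed).
    PRODUCTIONS = [
        ("S", "gAe"), ("S", "aYB"), ("S", "CY"), ("A", "bBY"), ("A", "ooC"),
        ("B", "dd"), ("B", "D"), ("C", "jVB"), ("C", "gi"), ("D", "n"),
        ("U", "kW"), ("V", "baXXX"), ("V", "oV"), ("W", "c"), ("X", "fV"),
        ("Y", "Yhm"),
    ]
    NONTERMINALS = {lhs for lhs, _ in PRODUCTIONS}

    def useful(productions, start="S"):
        # Phase 1: nonterminals that can derive some terminal string.
        gen, changed = set(), True
        while changed:
            changed = False
            for lhs, rhs in productions:
                if lhs not in gen and all(c not in NONTERMINALS or c in gen
                                          for c in rhs):
                    gen.add(lhs)
                    changed = True
        # Phase 2: reachability from start using only "all-generating" rules.
        reach, frontier = {start}, [start]
        while frontier:
            a = frontier.pop()
            for lhs, rhs in productions:
                if lhs != a or any(c in NONTERMINALS and c not in gen for c in rhs):
                    continue
                for c in rhs:
                    if c in NONTERMINALS and c not in reach:
                        reach.add(c)
                        frontier.append(c)
        return gen & reach

    print(sorted(useful(PRODUCTIONS)))   # ['A', 'C', 'S']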
EXAMPLE 9.12
Consider again the grammar introduced in Example 9.11. The structure of each of
the automata is similar to that of AS, shown in Figure 9.11a. Note that the
disconnected states are indeed Wand U, which can be eliminated from the state
transition table. Checking the accessibility of w in AS, A\ AB , AC , and AD result in
no changes, but V, X, and Yare eliminated when AV , AX, and AY are examined,
resulting in the automaton displayed in Figure 9.11 b. Eliminating transitions
associated with the corresponding useless productions yields the automaton shown
in Figure 9.llc. Checking for disconnected states in this machine reveals the
remaining inaccessible states. Thus, the equivalent grammar GU with no useless
nonterminals contains only the productions S~ gAe, A ~ ooC, and C~ gi.
Figure 9.11 (a) The automaton discussed in Example 9.12 (b) The simplified
automaton discussed in Example 9.12 (c) The final automaton discussed in Exam-
ple 9.12
EXAMPLE 9.13
Consider again the pure context-free grammar
<{S1, S2, S3, S4, S5}, {a, b, c}, S1, {S1 → S2, S2 → S3, S3 → S4, S4 → S5, S5 → a}>
The production set is split into Pⁿ = {S5 → a} and
Pᵘ = {S1 → S2, S2 → S3, S3 → S4, S4 → S5}.
The unit-closure sets are
S1⁺ = {S1, S2, S3, S4, S5}
S2⁺ = {S2, S3, S4, S5}
S3⁺ = {S3, S4, S5}
S4⁺ = {S4, S5}
S5⁺ = {S5}
Since S5 → a and S5 ∈ S3⁺, the production S3 → a is added to P'. The full set of
productions is P' = {S1 → a, S2 → a, S3 → a, S4 → a, S5 → a}. The elimination of useless
nonterminals and productions results in the grammar <{S1}, {a, b, c}, S1, {S1 → a}>.
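Computing the unit-closure sets is a simple reachability problem over the unit productions alone. A minimal sketch (ours, not from the text), using the unit productions Pᵘ of this example:

    # A rough sketch (not from the text) of computing unit-closure sets as
    # reachability through the unit productions of Example 9.13.
    UNIT = {"S1": {"S2"}, "S2": {"S3"}, "S3": {"S4"}, "S4": {"S5"}, "S5": set()}

    def unit_closure(a):
        closure, frontier = {a}, [a]
        while frontier:
            b = frontier.pop()
            for c in UNIT[b]:
                if c not in closure:
                    closure.add(c)
                    frontier.append(c)
        return closure

    for a in sorted(UNIT):
        print(a, "->", sorted(unit_closure(a)))
    # S1 -> [S1, S2, S3, S4, S5], S2 -> [S2, S3, S4, S5], and so on.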
EXAMPLE 9.14
Consider the context-free grammar with productions
Z → S, Z → λ
S → CBh, S → D
A → aaC
B → Sf, B → ggg
C → cA, C → d, C → C
D → E, D → SABC
E → be
The unit closures of each of the appropriate nonterminals and the new productions
they imply are shown below. Note that Z → S is not considered and that the produc-
tions suggested by C → C are already present.
S ⇒ D:  S → SABC
S ⇒ E:  S → be
D ⇒ E:  D → be
C ⇒ C:  C → cA, C → d
B → Sf, B → ggg
C → cA, C → d
D → be, D → SABC
∇ Theorem 9.5.
Every pure context-free language L can be generated by a pure Chomsky
normal form grammar.
Every context-free language L' can be generated by a Chomsky normal form
grammar.
Proof. Again, if the first statement of the theorem is proved, the second will
follow immediately from Definition 8.6. If L is a pure context-free language, then
by Definition 8.5 there is a pure context-free grammar G = <Ω, Σ, S, P> that
generates L. Theorem 9.4 shows that without loss of generality we may assume that
P contains no unit productions. We construct a new grammar G' = <Ω', Σ, S, P'> in
the following manner. Number the productions in P, and consider each production
in turn. If the right side of the kth production consists of only a single symbol, then
it must be a terminal symbol, since there are no unit productions. No modifications
are necessary in this case, and the production is retained for use in the new set of
productions P'. The same is true if the kth production consists of two symbols and
they are both nonterminals. If one or both of the symbols is a terminal, then the rule
must be modified by replacing any terminal symbol a with a new nonterminal Xa.
Whenever such a replacement is done, a production of the form Xa → a must also be
included in the new set of productions P'. If the kth production is A → a1a2a3⋯an,
where the number of (terminal and nonterminal) symbols is n > 2, then new non-
terminals Yk1, Yk2, ..., Yk(n−2) must be introduced and the rule must be replaced by
the set of productions A → a1Yk1, Yk1 → a2Yk2, Yk2 → a3Yk3, ..., Yk(n−2) → an−1an.
Again, if any ai is a terminal symbol such as a, it must be replaced as indicated
earlier by the nonterminal Xa.
Each new set of rules is clearly capable of producing the same effect as the rule
that was replaced. Each nonterminal Yki is used in only one such replacement set to
ensure that the new rules do not combine in unexpected new ways. Tedious but
straightforward inductive proofs will justify that L(G) = L(G').
Δ
EXAMPLE 9.15
The grammar discussed in Example 9.14 can be transformed into CNF by the
algorithm given in Theorem 9.5. After elimination of the unit productions and the
consequent useless productions, the productions (suitably numbered) that must be
examined are
1. S--i> SABC 5. B--i>Sf
2. S--i> be 6. B--i> ggg
3. S--i> CBh 7. C--i>cA
4. A--i>aaC 8. C--i> d
In the corresponding lists given below, notice that only production 8 is retained; the
others are replaced by
S--i> Xbx.
S--i> CY3h Y31 --i> BXh
A--i>XaY4h Y41 --i>X.C
B--i> SXr
B --i> XgY61, Y 61 --i> XgXg
C--i> X.A
C--i>d
310 Context-Free Grammars Chap. 9
and the terminal productions Xb~ b, x.~ e, Xh~ h, X.~ a, Xr~ f, Xg~ g. Since
d did not appear as part of a two-symbol production, the rule ~~ d was not
needed. The above rules, with S as the start symbol, form a pure Chomsky normal
form grammar. The new start symbol Z and productions Z~ S and Z~ A would be
added to this pure context-free grammar to obtain the required CNF.
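The two CNF transformations of Theorem 9.5 can be sketched directly (ours, not from the text; single-character input symbols are assumed, and the Yki numbering follows the order of the production list rather than the book's numbering):

    # A rough sketch (not from the text) of the Theorem 9.5 conversion:
    # terminals in long rules become X_a nonterminals, and rules longer
    # than two symbols are split with fresh Y_ki nonterminals.
    def to_cnf(productions, nonterminals):
        new, terminal_rules = [], {}

        def proxy(c):
            if c in nonterminals:
                return c
            terminal_rules["X" + c] = c      # remember to add X_a -> a
            return "X" + c

        for k, (lhs, rhs) in enumerate(productions):
            if len(rhs) == 1:                # A -> a is already in CNF
                new.append((lhs, [rhs]))
            elif len(rhs) == 2:
                new.append((lhs, [proxy(rhs[0]), proxy(rhs[1])]))
            else:                            # split A -> s1 s2 ... sn, n > 2
                cur = lhs
                for i, c in enumerate(rhs[:-2]):
                    y = "Y%d%d" % (k + 1, i + 1)
                    new.append((cur, [proxy(c), y]))
                    cur = y
                new.append((cur, [proxy(rhs[-2]), proxy(rhs[-1])]))
        new += [(x, [a]) for x, a in terminal_rules.items()]
        return new

    for lhs, rhs in to_cnf([("S", "CBh"), ("A", "aaC")], {"S", "A", "B", "C"}):
        print(lhs, "->", " ".join(rhs))
    # S -> C Y11, Y11 -> B Xh, A -> Xa Y21, Y21 -> Xa C, Xh -> h, Xa -> a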
Other useful properties are also assured for grammars in Chomsky normal
form. When a grammar is in CNF, all parse trees can be represented by binary trees,
and upper and lower bounds on the depth of a parse tree for a string of length n can
be found (see the exercises). The derivational relationship between the number of
production steps used and the number of terminals produced implies that CNF
grammars generate an average of one terminal every two productions. The follow-
ing canonical form requires every production to contain at least one terminal sym-
bol, and grammars in this form must produce strings of length n ( > 0) in no more
than n steps.
In pure Greibach normal form, the grammatical rules are limited to producing
at least one terminal symbol as the first symbol. The original grammar in Example
9.9 is a PGNF grammar, but few of the other grammars presented in this chapter
meet the seemingly mild restrictions required for Greibach normal form. The main
obstacle to obtaining a GNF grammar is the possible presence of left recursion. A
nonterminal A is called left recursive if there is a sequence of one or more prod-
uctions for which A ⇒* Aβ for some string β. Greibach normal form disallows such
occurrences since no production may produce a string starting with a nonterminal.
Replacing productions involved with left recursion is complex, but every context-
free grammar can be transformed into an equivalent GNF grammar, as shown by
Theorem 9.6. Two techniques will be needed to transform the productions into the
appropriate form, and the following lemmas ensure that the grammatical trans-
formations leave the language unchanged. The first indicates how to remove an
X-rule that begins with an undesired nonterminal; Lemma 9.1 specifies a new set of
productions that compensate for the loss.
Xⁿ = {X → γ1, X → γ2, ..., X → γn}. Choose a new nonterminal Y ∉ Ω and let
G'' = <Ω ∪ {Y}, Σ, S, P''>, where P'' = P ∪ {X → γ1Y, X → γ2Y, ..., X → γnY} ∪
{Y → α1, Y → α2, ..., Y → αm} ∪ {Y → α1Y, Y → α2Y, ..., Y → αmY} − X'. Then
L(G) = L(G'').
Proof. As in Lemma 9.1, let each nonterminal A be associated with the set of
sentential forms XA that A can produce, and consider the set of language equations
generated by P. The XX equation is
XX = γ1 ∪ γ2 ∪ ⋯ ∪ γn ∪ XXα1 ∪ XXα2 ∪ ⋯ ∪ XXαm
Solving by the method indicated in Theorem 6.4c for an equivalent expression for
XX shows that
XX = (γ1 ∪ γ2 ∪ ⋯ ∪ γn)(α1 ∪ α2 ∪ ⋯ ∪ αm)*
In the new set of productions P'', the equations of interest are
XX = γ1 ∪ γ2 ∪ ⋯ ∪ γn ∪ γ1XY ∪ γ2XY ∪ ⋯ ∪ γnXY
XY = α1 ∪ α2 ∪ ⋯ ∪ αm ∪ α1XY ∪ α2XY ∪ ⋯ ∪ αmXY
Factoring each equation produces
XX = γ1 ∪ γ2 ∪ ⋯ ∪ γn ∪ (γ1 ∪ γ2 ∪ ⋯ ∪ γn)XY
XY = α1 ∪ α2 ∪ ⋯ ∪ αm ∪ (α1 ∪ α2 ∪ ⋯ ∪ αm)XY
and the second can also be solved for an equivalent expression for XY, yielding
XY = (α1 ∪ α2 ∪ ⋯ ∪ αm)*(α1 ∪ α2 ∪ ⋯ ∪ αm)
Substituting this expression for XY in the XX equation produces
XX = γ1 ∪ ⋯ ∪ γn ∪ (γ1 ∪ ⋯ ∪ γn)(α1 ∪ ⋯ ∪ αm)*(α1 ∪ ⋯ ∪ αm)
which by the distributive law becomes
XX = (γ1 ∪ ⋯ ∪ γn)(λ ∪ (α1 ∪ ⋯ ∪ αm)*(α1 ∪ ⋯ ∪ αm))
Using the fact that λ ∪ B*B = B*, this simplifies to
XX = (γ1 ∪ ⋯ ∪ γn)(α1 ∪ ⋯ ∪ αm)*
Therefore, when XY is eliminated from the sentential forms, XX produces exactly
the same strings as before. This indicates why the productions in the sets
{X → γ1Y, ..., X → γnY} ∪ {Y → α1, ..., Y → αm} ∪ {Y → α1Y, ..., Y → αmY}
can replace the recursive X-rules X → Xα1, X → Xα2, ..., X → Xαm.
Δ
Note that the new production set eliminates all recursive X-rules and does not
introduce any new recursive productions. The techniques discussed in Lemmas 9.1
and 9.2, when applied in the proper order, will transform any context-free grammar
into one that is in Greibach normal form. The appropriate sequence is given in the
next theorem.
∇ Theorem 9.6.
Every pure context-free language L can be generated by a pure Greibach
normal form grammar.
Every context-free language L' can be generated by a Greibach normal form
grammar.
Proof. Because of Definition 8.6, the second statement will follow
immediately from the first. If L is a pure context-free language, then by Definition
8.5 there is a pure context-free grammar G = <{S1, S2, ..., Sr}, Σ, S1, P> that gener-
ates L. We construct a new grammar by applying the transformations discussed in
the previous lemmas.
Phase 1: The replacements suggested by Lemmas 9.1 and 9.2 will be used to
ensure that the increasing condition is met: if Si → Sjα belongs to the new grammar,
then i > j. We transform the Sk-rules for k = r, r − 1, ..., 2, 1 (in that order), consid-
ering the productions for each nonterminal in turn. At the end of the ith iteration,
the top i nonterminals will conform to the increasing condition. After the final step,
all nonterminals (including any newly introduced ones) will conform, all left recur-
sion will be eliminated, and we can proceed to phase 2.
The procedure for the ith iteration is: If an Si-rule of the form Si → Sjα is found
where i < j, eliminate it as specified in Lemma 9.1. This may introduce other rules
of the form Si → Sj'α', in which i is still less than j'. Such new rules will likewise have
to be eliminated via Lemma 9.1, but since the offending subscript will decrease each
time, this process will eventually terminate. Si-rules of the form Si → Sjα where i = j
can then be eliminated according to Lemma 9.2. This will introduce some new
nonterminals, which can be given new, higher-numbered subscripts. Lemma 9.2 is
designed so that the new rules will automatically satisfy the increasing condition
specified earlier. The remaining Si-rules must then conform to the increasing condi-
tion. The process continues with lower-numbered rules until all the rules in the new
production set conform to the increasing condition.
Phase 2: At this point, S1 conforms to the increasing condition, and since there
are no nonterminals with subscripts that are less than 1, all the S1-rules must begin
with terminal symbols, as required by GNF. The only S2-rules that may not conform
to GNF are those of the form S2 → S1α, and Lemma 9.1 can eliminate such rules by
replacing them with the S1-rules. Since all the S1-rules now begin with terminal
symbols, all the new S2-rules will have the same property. This process is applied to
Sk-rules for increasing k until the entire production set conforms to GNF.
The resulting context-free grammar is in GNF, and since all modifications
were of the type allowed by Lemmas 9.1 and 9.2, the new grammar is equivalent to
the original.
Δ
EXAMPLE 9.16
The first nonterminal has a left-recursive rule that must be eliminated by introduc-
ing the new nonterminal S4. In the notation of Lemma 9.2, n = 1, m = 2, "11 = debS 3,
U1 = S2C, and U2 = SlebS3. Eliminating Sl ~ SlS2C and Sl ~ SlSlebS 3 introduces the
new nonterminal Y = S4 and the productions
Sl ~ debS 3S4, S4 ~ S2C, S4 ~ SlebS 3, S4 ~ S2CS4, S4 ~ SlebS 3S4
Phase 1 is now complete. All left-recursion has been eliminated and the grammar
now contains the productions
Sl ~ debS 3S4, Sl ~ debS 3
S2~ SlS!) S2~ d
S3~S2e
The S4 rules are treated similarly. The final set of productions at the comple-
tionof phase 2 contains
Sc"'" debS 3S4, Sl ---;. debS 3
S2 ---;. debS 3S4Sr, S2 ---;. debS 3S!, S2 ---;. d
S3---;' debS 3S4S1e, S3---;' debS 3S1e, S3---;' de
S4 ---;. dc, S4 ---;. debS 3S4S1C, S4 ---;. debS 3S1c, S4 ---;. debS 3S4ebS 3, S4 ---;. debS 3ebS3,
S4 ---;. debS 3S4S1CS4,
S4 ---;. dCS4, S4 ---;. debS 3S1cS4, S4 ---;. debS 3S4ebS 3S4, S4 ---;. debS 3ebS3S4
In this grammar, S2 is now useless and can be eliminated.
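The Lemma 9.2 replacement used in this example is compact enough to state in code. A minimal sketch (ours, not from the text; symbols are single characters and the new nonterminal "Y" is assumed unused):

    # A rough sketch (not from the text) of the Lemma 9.2 replacement for
    # removing immediate left recursion from the X-rules.
    def remove_left_recursion(x, bodies, y="Y"):
        gammas = [b for b in bodies if not b.startswith(x)]       # X -> gamma_i
        alphas = [b[len(x):] for b in bodies if b.startswith(x)]  # X -> X alpha_j
        new_x = [(x, g) for g in gammas] + [(x, g + y) for g in gammas]
        new_y = [(y, a) for a in alphas] + [(y, a + y) for a in alphas]
        return new_x + new_y

    print(remove_left_recursion("X", ["Xa", "g"]))
    # [('X', 'g'), ('X', 'gY'), ('Y', 'a'), ('Y', 'aY')]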
As was the case with type 3 languages, some languages are too complex to be
defined by a context-free grammar. To prove a language L is context-free, one need
only define a grammar that generates L. By contrast, to prove L is not context free,
one must effectively argue that no context-free grammar can possibly generate L.
The pumping lemma for deterministic finite automata (Theorem 2.3) showed that
the repetition of patterns within strings accepted by a DFA was a consequence of
the nature of the finite description. The finiteness of grammatical descriptions like-
wise implies a pumping theorem for languages represented by context-free gram-
mars. The proof is greatly simplified by the properties implied by the existence of
canonical forms for context-free grammars.
corresponding to a distinct leaf in the tree. Let n = 2^(k+1). Choose a string z generated
by G of length at least n (if there are no strings in L that are this long, then the
theorem is vacuously true, and we are done). The binary parse tree for any such
string z must have depth at least k + 1, which implies the existence of a path involv-
ing at least k + 2 nodes, beginning at the root and terminating with a leaf. The
labels on the k + 1 interior nodes along the path must all be nonterminals, and since
‖Ω‖ = k, they cannot all be distinct. Indeed, the repetition must occur within the
"bottom" k + 1 interior nodes along the path. Call the repeated label R (see Figure
9.12), and note that there must exist a derivation for the parse tree that looks like
S ⇒* uRy ⇒* uvRxy ⇒* uvwxy
where u, v, w, x, and y are all terminal strings and z = uvwxy. That is, there are
productions in P that allow R ⇒* vRx and R ⇒* w. Since S ⇒* uRy and R ⇒* w,
S ⇒* uwy is a valid derivation, and uwy is therefore a word in L. Similarly,
S ⇒* uRy ⇒* uvRxy ⇒* uvvRxxy ⇒* uvvwxxy, and so uv²wx²y ∈ L. Induction shows
that each of the strings uvⁱwxⁱy belongs to L for i = 0, 1, 2, .... If both v and x were
empty, these strings would not be distinct words in L. This case cannot arise, as
shown next, and thus the existence of z implies that there is an infinite sequence of
strings that must belong to L.
(Figure 9.12 The parse tree discussed in the proof of Theorem 9.7: the repeated nonterminal R appears twice along the path, at height at most k + 1, above the frontier u v w x y.)

The two occurrences of R were in distinct places in the parse tree, and hence at
least one production was applied in deriving uvRxy from uRy. Since the PCNF grammar G contains neither contracting productions nor unit productions, the sentential form uvRxy must be of greater length than uRy, and hence |v| + |x| > 0. Furthermore, the subtree rooted at the higher occurrence of R was of height k + 1 or less, and hence accounts for no more than 2^(k+1) (= n) terminals. Thus, |vwx| ≤ n. All the criteria described in the pumping theorem are therefore met.
Since a context-free language must be generated by a CNF grammar with a finite number of nonterminals, there must exist a constant n (such as n = 2^(‖Ω‖+1)) for which the existence of a string of length at least n implies the existence of an infinite sequence of distinct strings that must all belong to L, as stated in the theorem.
∆
As with the pumping lemma, the pumping theorem is usually applied to justify that certain languages are complex (by proving that the language does not satisfy the pumping theorem and is thus not context free). Such proofs naturally employ the contrapositive of Theorem 9.7, which is stated next.
Examples 8.5 and 9.17 show that there are context-sensitive languages which are
not context free.
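Such pumping arguments can be prototyped by brute force. The sketch below (purely illustrative; the language, the membership test in_L, and the bound n = 4 are choices made here, not taken from the text) checks every legal decomposition of one long word of {a^n b^n c^n | n ≥ 0} and finds that none of them pumps; the actual proof must, of course, carry the argument through for every possible constant n:

    def in_L(s):                      # membership in {a^n b^n c^n | n >= 0}
        n = len(s) // 3
        return s == "a" * n + "b" * n + "c" * n

    def pumps(z, n, imax=4):
        """True iff some decomposition z = uvwxy with |vwx| <= n and
        |vx| >= 1 keeps u v^i w x^i y inside the language for all i < imax."""
        N = len(z)
        for p in range(N + 1):                          # vwx = z[p:e]
            for e in range(p + 1, min(p + n, N) + 1):
                for q in range(p, e + 1):               # v = z[p:q]
                    for r in range(q, e + 1):           # x = z[r:e]
                        if q == p and r == e:
                            continue                    # |vx| = 0: not allowed
                        u, v, w, x, y = z[:p], z[p:q], z[q:r], z[r:e], z[e:]
                        if all(in_L(u + v * i + w + x * i + y) for i in range(imax)):
                            return True
        return False

    n = 4
    z = "a" * n + "b" * n + "c" * n
    print(pumps(z, n))    # False: no decomposition of z survives pumping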
EXAMPLE 9.17
EXAMPLE 9.18
Recall that 𝒞_Σ represented the class of context-free languages over Σ. The applications of the pumping theorem show that not every language is context free. The ability to show that specific languages are not context free makes it feasible to decide which language operators preserve context-free languages. The context-free languages are closed under most of the operators considered in Chapter 5; the major exceptions are complement and intersection. We begin with a definition of substitution for context-free languages.
V Definition 9.11. Let Σ = {a1, a2, ..., am} be an alphabet and let Γ be a second alphabet. Given context-free languages L1, L2, ..., Lm over Γ, define a substitution s: Σ → ℘(Γ*) by s(ai) = Li for each i = 1, 2, ..., m, which can be extended to s: Σ* → ℘(Γ*) by

s(λ) = {λ}

and

(∀a ∈ Σ)(∀x ∈ Σ*)(s(a·x) = s(a)·s(x))

s can be further extended to operate on a language L ⊆ Σ* by defining s: ℘(Σ*) → ℘(Γ*), where

s(L) = ∪ s(z), the union taken over all z ∈ L
∆
EXAMPLE 9.19
Let L = L(Gt), where

Gt = <{T}, {a, b, c, d, -}, T, {T → a|b|c|d|T-T}>

If the substitution s were defined by s(a) = L1, s(b) = L2, s(c) = L3, s(d) = L4, then
s(L) would represent the set of all unparenthesized FORTRAN expressions in-
volving only the subtraction operator.
In this example, s(L) is a language over a significant portion of the ASCII alphabet, whereas the original alphabet consisted of only five symbols. The result is still context free, and this can be proved for all substitutions of context-free languages into context-free languages. In Example 9.19, the languages L1 through L4 were not only context free, but were in fact regular. There are clearly context-free grammars defining each of them, and it should be obvious how to modify Gt to produce a grammar that generates s(L). If G1 = <Ω1, Σ1, S1, P1> is a grammar generating L1, for example, then occurrences of a in the productions of Gt should simply be replaced by the start symbol S1 of G1 and the productions of P1 added to the new grammar that will generate s(L). This is essentially the technique used to justify Theorem 9.10. The closure theorem is stated for substitutions that do not modify the terminal alphabet, but it is also true in general, as a trivial modification of the following proof would show.
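In outline, the construction reads as follows; this sketch assumes grammars encoded as (nonterminals, terminals, start, rules) tuples with disjoint nonterminal sets, a convenience chosen here rather than the text's notation:

    def substitute(g, subs):
        """Grammar for s(L(g)): replace every terminal a of g by the start
        symbol of subs[a] and pool all the production sets.  Rules are
        (lhs, [symbols]) pairs."""
        N, T, S, P = g
        newN, newT, newP = set(N), set(), []
        for a, (Na, Ta, Sa, Pa) in subs.items():
            newN |= set(Na)
            newT |= set(Ta)
            newP += list(Pa)
        start = {a: subs[a][2] for a in subs}
        for lhs, rhs in P:
            newP.append((lhs, [start.get(sym, sym) for sym in rhs]))
        return (newN, newT, S, newP)

    # Theorem 9.12 below is a special case: with U = <{S},{a,b},S,{S->a, S->b}>
    # and s(a) = L1, s(b) = L2, the resulting grammar generates L1 U L2.
    U = ({"S"}, {"a", "b"}, "S", [("S", ["a"]), ("S", ["b"])])
    G1 = ({"A"}, {"0"}, "A", [("A", ["0", "A"]), ("A", ["0"])])   # L1 = 0+
    G2 = ({"B"}, {"1"}, "B", [("B", ["1", "B"]), ("B", ["1"])])   # L2 = 1+
    print(substitute(U, {"a": G1, "b": G2})[3])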
mars follow immediately from the result for substitution. Closure under union could
be proved by essentially the same method presented in Theorem 8.4. An alternate
proof, based on Theorem 9.10, is given next.
V Theorem 9.12. Let Σ be an alphabet, and let L1 and L2 be context-free languages over Σ. Then L1 ∪ L2 is context free. Thus, 𝒞_Σ is closed under union.
Proof. Assume L1 and L2 are context-free languages over Σ. The grammar U = <{S}, {a, b}, S, {S → a, S → b}> clearly generates the context-free language {a, b}. The substitution defined by s(a) = L1 and s(b) = L2 gives rise to the language s({a, b}), which obviously equals L1 ∪ L2. By Theorem 9.10, this language must be context free.
∆
The exercises show that 𝒞_Σ is not closed under intersection for any alphabet Σ with two or more letters. It was noted in Chapter 5 that De Morgan's laws implied that any collection of languages that is closed under union and complementation must also be closed under intersection. It therefore follows immediately that 𝒞_{a,b,c} cannot be closed under complementation either.
Thus, the context-free languages do not enjoy all of the closure properties that
the regular languages do. However, the distinction between a regular language and
a context-free language is lost if the underlying alphabet contains only one letter, as
shown by the following theorem. The proof demonstrates that there is a certain
regularity in the lengths of the words of any context-free language. It is the relationships between
the different letters in the words of a context-free language that give it the potential
for being non-FAD. If L is a context-free language over the singleton alphabet {a},
then no such complex relationships can exist; the character of a word is determined
solely by its length.
V Theorem 9.15. 𝒞_{a} = 𝒟_{a}; that is, every context-free language over the single-letter alphabet {a} is regular.
Proof. Let L be a context-free language over the singleton alphabet {a}, and assume the CNF grammar G = <Ω, Σ, S, P> generates L. Let n = 2^(‖Ω‖+1). Consider the words in L that are of length n or greater, choose the smallest such word, and denote it by a^h1. Since h1 ≥ n, the pumping theorem can be applied to this word, and hence a^h1 can be written as uvwxy, where u = a^p1, v = a^q1, w = a^r1, x = a^s1, and y = a^t1. Let i1 = q1 + s1. Note that |vwx| ≤ n implies that i1 ≤ n. The pumping theorem then implies that all strings in the set L1 = {a^(h1+k·i1) | k = 0, 1, 2, ...} must belong to L. These account for many of the large words in L. If there are other large words in L, choose the next smallest word a^h2 that is of length greater than n and belongs to L but is not already in the set L1. By a similar argument, there is an integer i2 ≤ n for which all strings in the set L2 = {a^(h2+k·i2) | k = 0, 1, 2, ...} must also belong to L.
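The completed proof covers the long words of L with finitely many such arithmetic progressions of lengths, and each progression {a^(h+k·i) | k ≥ 0} is just the regular set a^h(a^i)*; a finite union of these, together with the finitely many short words, is therefore regular. A small illustration (the helper unary_regex is hypothetical):

    import re

    def unary_regex(short_lengths, progressions):
        """A regular expression for a unary language presented, as in the
        proof, by finitely many short word lengths plus arithmetic
        progressions (h, i) standing for {a^(h + k*i) | k = 0, 1, 2, ...}."""
        parts = ["a{%d}" % n for n in short_lengths]
        parts += ["a{%d}(?:a{%d})*" % (h, i) for h, i in progressions]
        return "(?:" + "|".join(parts) + ")"

    # {a^n | n even} arises from the single progression h = 0, i = 2:
    r = re.compile(unary_regex([], [(0, 2)]))
    print([bool(r.fullmatch("a" * n)) for n in range(6)])
    # [True, False, True, False, True, False]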
EXERCISES
9.5. Consider the proof of Theorem 9.4. Let G = <Ω, Σ, S, P> be a context-free grammar, with the production set divided up into P^U and P^N (the set of unit productions and the set of nonunit productions, respectively). Devise an automaton-based algorithm that correctly calculates B^U = {C | B ⇒* C} for each nonterminal B found in P^U.
9.6. (a) What is wrong with proving that 𝒞_Σ is closed under concatenation by using the following construction? Let G1 = <Ω1, Σ, S1, P1> and G2 = <Ω2, Σ, S2, P2> be two context-free grammars, and without loss of generality assume that Ω1 ∩ Ω2 = ∅. Choose a new nonterminal Z such that Z ∉ Ω1 ∪ Ω2, and define a new grammar Gc = <Ω1 ∪ Ω2 ∪ {Z}, Σ, Z, P1 ∪ P2 ∪ {Z → S1·S2}>. Note: It is straightforward to show that L(Gc) = L(G1)·L(G2).
(b) Modify Gc so that it reflects an appropriate, valid context-free grammar. (Hint: Pay careful attention to the treatment of lambda productions.)
(c) Prove that 𝒞_Σ is closed under concatenation by using the construction defined in part (b).
9.7. Let Σ = {a, b, c}. Show that {a^i b^j c^k | i, j, k ∈ ℕ and i + j = k} is context free.
9.8. (a) Show that the following right-linear grammar is ambiguous.
G = <{S, A, B}, {a}, S, {S → A, S → B, A → aaA, A → λ, B → aaaB, B → λ}>
(b) Use the method outlined in Theorem 9.2 to remove the ambiguity in G.
9.9. The regular expression grammar discussed in Example 9.3 produces strings with needless outermost parentheses, such as ((a∪b)·c).
(a) Define a grammar that generates all the words in this language and strings that are stripped of (only) the outermost parentheses, as in (a∪b)·c.
(b) Define a grammar that generates all the words in this language and also allows extraneous sets of parentheses, such as ((((a)∪b))·c).
9.10. For the regular expression grammar discussed in Example 9.3:
(a) Determine the leftmost derivation for ((a*·b)∪(c·d)*).
(b) Determine the rightmost derivation for ((a*·b)∪(c·d)*).
9.11. Consider the grammars G and G' in the proof of Theorem 9.5. Induct on the number of
steps in a derivation in G to show that L(G) = L(G').
9.12. For a grammar G in Chomsky normal form, prove by induction that for any string x ∈ L(G) other than x = λ, the number of productions applied to derive x is 2|x| − 1.
9.13. (a) For a grammar G in Chomsky normal form and a string x ∈ L(G), state and prove a lower bound on the depth of the parse tree for x.
(b) For a grammar G in Chomsky normal form and a string x ∈ L(G), state and prove an upper bound on the depth of the parse tree for x.
9.14. Convert the following grammars to Chomsky normal form.
(a) <{S, B, C}, {a, b, c}, S, {S → aB, S → abC, B → bc, C → c}>
(b) <{S, A, B}, {a, b, c}, S, {S → cBA, S → B, A → cB, A → AbbS, B → aaa}>
(c) <{R}, {a, b, c, (, ), ε, ∅, ∪, ·, *}, R, {R → a|b|c|ε|∅|(R·R)|(R∪R)|R*}>
(d) <{T}, {a, b, c, d, -, +}, T, {T → a|b|c|d|T-T|T+T}>
9.15. Convert the following grammars to Greibach normal form.
(a) <{S1, S2}, {a, b, c, d, e}, S1, {S1 → S2S1e, S1 → S2b, S2 → S1S2, S2 → c}>
(b) <{S1, S2, S3}, {a, b, c, d, e}, S1, {S1 → S3S1, S1 → S2a, S2 → be, S3 → S2c}>
(c) <{S1, S2, S3}, {a, b, c, d, e}, S1, {S1 → S1S2c, S1 → dS3, S2 → S1S1, S2 → a, S3 → S3e}>
9.16. Let G be a context-free grammar, and obtain G′ from G by adding rules of the form A → λ. Prove that there is a context-free grammar G″ that is equivalent to G′. That is, show that apart from the special rule Z → λ all other lambda productions are unnecessary.
9.17. Prove the following generalization of Lemma 9.1. Let G = <Ω, Σ, S, P> be a context-free grammar, and assume there are strings α and γ and nonterminals X and B for which X → γBα ∈ P. Further assume that the set of all B-rules is given by {B → β1, B → β2, ..., B → βm}, and let G′ = <Ω, Σ, S, P′>, where
P′ = P ∪ {X → γβ1α, X → γβ2α, ..., X → γβmα} − {X → γBα}.
Then L(G) = L(G′).
9.18. Let P = {y ∈ {d}* | ∃ prime p ∋ y = d^p} = {dd, ddd, ddddd, d^7, d^11, d^13, ...}.
(a) Prove that P is not context free by directly applying the pumping theorem.
(b) Prove that P is not context free by using the fact that P is known to be a nonregular language.
9.19. Let Γ = {x ∈ {0, 1, 2}* | ∃w ∈ {0, 1}* ∋ x = w·2·w} = {2, 121, 020, 11211, 10210, ...}. Prove that Γ is not context free.
9.20. Let Ψ = {x ∈ {0, 1}* | ∃w ∈ {0, 1}* ∋ x = w·w} = {λ, 00, 11, 0000, 1010, 1111, ...}. Prove that Ψ is not context free.
9.21. Let Ξ = {x ∈ {b}* | ∃j ∈ ℕ ∋ |x| = 2^j} = {b, bb, bbbb, b^8, b^16, b^32, ...}. Prove that Ξ is not context free.
9.22. Let Φ = {x ∈ {a}* | ∃j ∈ ℕ ∋ |x| = j²} = {λ, a, aaaa, a^9, a^16, a^25, ...}, and let
Φ′ = {x ∈ {b, c, d}* | |x|b ≥ 1 ∧ |x|c = (|x|d)²}.
(a) Prove that Φ is not context free.
(b) Use the conclusion of part (a) and the properties of homomorphism to prove that Φ′ is not context free.
(c) Use Ogden's lemma to directly prove that Φ′ is not context free.
(d) Is it possible to use the pumping theorem to directly prove that Φ′ is not context free?
9.23. Consider L = {y ∈ {0, 1}* | |y|₀ = |y|₁}. Prove or disprove that L is context free.
9.24. Refer to the proof of Theorem 9.9.
(a) Give a formal recursive definition of the path by (1) stating boundary conditions,
and (2) giving a rule for choosing the next node on the path.
(b) Show that the conclusions of Theorem 9.9 follow from the properties of this path.
9.25. Show that 𝒞_Σ is closed under ∪ by directly constructing a new context-free grammar with the appropriate properties.
9.26. Let 𝒩_Σ be the set of all languages over Σ that are not context free. Determine whether or not:
(a) 𝒩_Σ is closed under union.
(b) 𝒩_Σ is closed under complement.
(c) 𝒩_Σ is closed under intersection.
(d) 𝒩_Σ is closed under Kleene closure.
(e) 𝒩_Σ is closed under concatenation.
9.27. Let Σ be an alphabet, and x = a1a2...an−1an ∈ Σ*; define xʳ = anan−1...a2a1. For a language L over Σ, define Lʳ = {xʳ | x ∈ L}. Note that the (unary) reversal operator r is thus defined by Lʳ = {anan−1...a3a2a1 | a1a2a3...an−1an ∈ L}, and Lʳ therefore represents all the words in L written backward. Show that 𝒞_Σ is closed under the operator r.
9.28. For a language L over Σ, define the operator T by Lᵀ = {wʳ·w | w ∈ L}
(see the definition of wʳ in Exercise 9.27). Prove or disprove that 𝒞_Σ is closed under the operator T.
9.29. Prove or disprove that 𝒞_{a,b} is closed under relative complement; that is, if L1 and L2 are context free, then L1 − L2 is also context free.
9.30. (a) Prove that 𝒞_{a,b} is not closed under intersection, nor is it closed under complement.
(b) By defining an appropriate homomorphism, argue that whenever Σ has more than one symbol, 𝒞_Σ is not closed under intersection, nor is it closed under complement.
9.31. Consider the iterative method discussed in the proof of Theorem 9.3. Outline an alternative method based on an automaton with states labeled by the sets in ℘(Ω).
9.32. Consider grammars in Greibach normal form that also satisfy one of the restrictions of
Chomsky normal form; that is, no production has more than two symbols on the right
side.
(a) Show that this is not a "normal form" for context-free languages by demonstrating
that there is a context-free language that cannot be generated by any grammar in
this form.
(b) Characterize the languages generated by grammars that can be represented by this
restrictive form.
9.33. Let L be any collection of words over the alphabet {a}. Prove that L* must be regular.
9.34. If ‖Σ‖ = 1, prove or disprove that 𝒞_Σ is closed under complementation.
9.35. Prove that {a^n b^n c^m | n, m ∈ ℕ} is context free.
9.36. Use Ogden's lemma to prove that {a^k b^n c^m | (k ≠ n) ∧ (n ≠ m)} is not context free.
C H A P T E R 10

PUSHDOWN AUTOMATA
In the earlier part of this text, the representation of languages via regular grammars
was a generative construct equivalent to the cognitive power of deterministic finite
automata and nondeterministic finite automata. Chapter 9 showed that context-free
grammars had more generative potential than did regular grammars, and thus
defined a significantly larger class of languages. This chapter and the next explore
generalizations of the basic automata construct introduced in Chapter 1. In Chapter
4, we discovered that adding nondeterminism did not enhance the language capabil-
ities of an automaton. It seems that more powerful automata will need the ability to
store more than a finite amount of state information, and machines with the ability
to write and read from an indefinitely long tape will now be considered. Automata
that allow unrestricted access to all portions of the tape are the subject of Chapter
11. Such machines are regarded to be as powerful as a general-purpose computer.
This chapter will deal with automata with restricted access to the auxiliary tape.
One such device is known as a pushdown automaton and is strongly related to the
context-free languages.
(Figure: the components of a pushdown automaton, consisting of an input tape, a finite-state control, and a stack tape with its own read/write head)
both a new current state and a new string of symbols from Γ* to replace the top stack symbol. This definition allows the machine to behave nondeterministically, since a current state, input letter, and stack symbol are allowed to have any (finite) number of alternatives for state transitions and strings from Γ* to record on the stack.
The auxiliary tape is similar to that of a finite-state transducer; the second
component of the range of the state transition function in a pushdown automaton
specifies the string to be written on the stack tape. Thus, the functions δ and ω of a FST are essentially combined in the δ function for pushdown automata. The auxiliary tape differs from that of a FST in that the current symbol from Γ on the tape is
sensed by the stack read/write head and can affect the subsequent operation of the
automaton. If no symbols are written to tape during a transition, the tape head
drops back one position and will then be scanning the previous stack symbol. In
essence, a state transition is initiated by the currently scanned symbol on both the
input tape and the stack tape and begins with the stack symbol being popped from
the stack; the state transition is accompanied by a push operation, which writes a
new string of stack symbols on the stack tape. If several symbols are written, the
auxiliary read/write head will move ahead an appropriate amount, and the head will
be positioned over the last of the symbols written. Thus, if exactly one symbol is
written, the stack tape head does not move, and the effect is that the old top-of-
stack symbol is overwritten by the new symbol. When the empty string is to be
written, the effect is a pop followed by a push of no letters, and the stack tape head
retreats one position. If the only remaining stack symbol is removed from the stack
in this fashion, the stack tape head moves off the end of the tape. It would then no
longer be scanning a valid stack symbol, so no further transitions are possible, and
the device halts.
It is possible to manipulate the stack and change states without consuming an
input letter, which is the intent of the λ-moves in the state transition function. Since at most one symbol can be removed from the stack as a result of a transition, λ-moves allow the stack to be shortened by several symbols before the next input symbol is processed.
Acceptance can be defined by requiring the stack to be empty after the entire
input tape is consumed (as was the case with counting automata) or by requiring
that the automaton be in a final state after all the input is consumed. The non-
determinism may allow the device to react to a given input string in several distinct
ways. As with NDFAs, the input word is considered accepted if at least one of the
possible reactions satisfies the criteria for acceptance. For a given PDA, the set of
words accepted by the empty stack criterion will likely differ from the set of words
accepted by the final state condition.
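Both acceptance criteria are easy to experiment with. The bounded search below (a sketch under assumed encodings, with transition tables as Python dictionaries and '' marking λ; it is not code from the text) tries every nondeterministic option:

    from collections import deque

    def accepts(delta, start, bottom, finals, w, by_empty_stack, bound=100000):
        """Bounded breadth-first search over PDA configurations (state,
        unread input, stack), with the stack written top-first.  delta maps
        (state, letter, top) to a list of (state, pushed string) pairs."""
        queue, seen, steps = deque([(start, w, bottom)]), set(), 0
        while queue and steps < bound:
            s, x, stack = queue.popleft()
            steps += 1
            if x == "" and (stack == "" if by_empty_stack else s in finals):
                return True
            if stack == "" or (s, x, stack) in seen:
                continue              # an empty stack halts the device
            seen.add((s, x, stack))
            top, rest = stack[0], stack[1:]
            moves = [("", x)] + ([(x[0], x[1:])] if x else [])
            for a, tail in moves:     # a lambda-move, or consume one letter
                for t, push in delta.get((s, a, top), ()):
                    queue.append((t, tail, push + rest))
        return False

    # P1 of Example 10.1, which accepts {a^n b^n | n >= 1} by empty stack:
    d1 = {("q", "a", "B"): [("q", "A")], ("q", "a", "A"): [("q", "AA")],
          ("q", "b", "A"): [("r", "")], ("r", "b", "A"): [("r", "")]}
    print(accepts(d1, "q", "B", set(), "aabb", True))   # True
    print(accepts(d1, "q", "B", set(), "aab", True))    # False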
EXAMPLE 10.1
Consider the PDA defined by P1 = <{a, b}, {A, B}, {q, r}, q, δ, B, ∅>, where δ is defined by

δ(q, a, B) = {(q, A)}
δ(q, a, A) = {(q, AA)}
δ(q, b, B) = { }
δ(q, b, A) = {(r, λ)}
δ(r, a, B) = { }
δ(r, a, A) = { }
δ(r, b, B) = { }
δ(r, b, A) = {(r, λ)}
The action of the state transition function can be displayed much like that of finite-state transducers. Transition arrows are no longer labeled with just a symbol from the input alphabet, since both a stack symbol and an input symbol now govern the action of the automaton. Thus, arrows are labeled by ordered pairs from Σ × Γ. As with FSTs, this is followed by the output caused by the transition. The diagram corresponding to P1 is shown in Figure 10.2.
(Figure 10.3, parts (a)-(d): successive configurations of P1 while processing the input aabb)
empty string, leaving the stack shorter than before. This is shown in Figure 10.3d. The last of the eight transition rules now applies, leaving the automaton in the configuration shown by Figure 10.3e. Since the stack is now empty, no further moves are possible. However, since the read head has reached the end of the input string, the word aabb is accepted by P1. The word aab would be rejected by P1, since
the automaton would run out of input in a configuration similar to that of Figure 10.3d, in which the stack is not yet empty. The word aabbb would not be accepted because the stack would empty prematurely, leaving P1 stuck in a configuration similar to that of Figure 10.3e, but with the input string incompletely consumed. The word aaba would likewise be rejected because there would be no move from the state r with which to process the final input symbol a.
As with deterministic finite automata, once an input symbol is consumed, it has no further effect on the operation of the pushdown automaton. The current state of the device, the remaining input symbols, and the current stack contents form a triple that describes the current configuration of the PDA. The triple (q, bb, AA) thus describes the configuration of the PDA in Figure 10.3c. When processing aabb, P1 moved through the following sequence of configurations:

(q, aabb, B)
(q, abb, A)
(q, bb, AA)
(r, b, A)
(r, λ, λ)
An ordered pair (t, γ) within the finite set of objects specified by δ(s, a, A) can cause a move in the pushdown automaton P from the configuration (s, ay, Aβ) to the configuration (t, y, γβ). This transition is denoted by (s, ay, Aβ) ⊢ (t, y, γβ).
A sequence of successive moves in which

(s1, x1, α1) ⊢ (s2, x2, α2), (s2, x2, α2) ⊢ (s3, x3, α3), ..., (sm−1, xm−1, αm−1) ⊢ (sm, xm, αm)

is denoted by (s1, x1, α1) ⊢* (sm, xm, αm).
∆
The operator ⊢* reflects the reflexive and transitive closure of ⊢, and thus we also have (s1, x1, α1) ⊢* (s1, x1, α1), and clearly (s1, x1, α1) ⊢ (s2, x2, α2) implies (s1, x1, α1) ⊢* (s2, x2, α2).
EXAMPLE 10.2
For the pushdown automaton P1 in Example 10.1, (q, aabb, B) ⊢* (r, λ, λ) because (q, aabb, B) ⊢ (q, abb, A) ⊢ (q, bb, AA) ⊢ (r, b, A) ⊢ (r, λ, λ).
EXAMPLE 10.3
Consider the pushdown automaton P1 in Example 10.1. Since only strings of the form a^i b^i (for i ≥ 1) allow (q, a^i b^i, B) ⊢* (r, λ, λ), it follows that A(P1) = {a^n b^n | n ≥ 1}. However, F = ∅, and thus L(P1) is clearly ∅.
EXAMPLE 10.4
Consider the pushdown automaton defined by P2 = <{a, b}, {S, C}, {t}, t, δ, S, ∅>, where δ is defined by

δ(t, a, S) = {(t, SC), (t, C)}
δ(t, a, C) = { }
δ(t, b, S) = { }
δ(t, b, C) = {(t, λ)}
δ(t, λ, S) = { }
δ(t, λ, C) = { }
In this automaton, there are two distinct courses of action when the input symbol is
a and the top stack symbol is S, which leads to several possible options when trying
to process the word aabb. One option is to apply the first move whenever possible,
which leads to the sequence of configurations
(t, aabb, S) ⊢ (t, abb, SC) ⊢ (t, bb, SCC).

Since there are no λ-moves and δ(t, b, S) = { }, there are no further moves that can be made, and the input word cannot be completely consumed in this manner. Another option is to choose the second move option exclusively, leading to the abortive sequence (t, aabb, S) ⊢ (t, abb, C); δ(t, a, C) = { }, and processing again cannot be completed. A mixture of the first and second moves results in the sequence (t, aabb, S) ⊢ (t, abb, SC) ⊢ (t, bb, CC) ⊢ (t, b, C) ⊢ (t, λ, λ), and aabb is thus accepted by P2. Further experimentation shows that A(P2) = {a^n b^n | n ≥ 1}. To successfully empty its stack, this automaton must correctly "guess" when the last a is being read and choose the second transition pair, placing only C on the stack.
V Definition 10.4. Two pushdown automata M1 = <Σ, Γ1, S1, s01, δ1, B1, F1> and M2 = <Σ, Γ2, S2, s02, δ2, B2, F2> are called equivalent iff they accept the same language.
∆
EXAMPLE 10.5
The following pushdown automaton illustrates the use of λ-moves and acceptance by final state for the language {a^n b^m | n ≥ 1 ∧ (n = m ∨ n = 2m)}. Let P3 = <{a, b}, {A}, {s0, s1, s2, s3, s4}, s0, δ, A, {s2, s4}>, where δ is defined by

δ(s2, λ, A) = { }
δ(s3, a, A) = { }
δ(s3, b, A) = { }
The finite-state control for this automaton is diagrammed in Figure 10.4. Note that the λ-move from state s3 is not responsible for any nondeterminism in this machine. From s3, only one move is permissible: the λ-move to s4. On the other hand, the λ-move from state s1 does allow a choice of moving to s2 (without moving the read head) or staying at s1 while consuming another input symbol. The choice of moves from state s0 also contributes to the nondeterminism; the device must "guess" whether the number of bs will equal the number of as or whether there will be half as many, and at the appropriate time transfer control to s1 or s3, respectively. Notice that the moves defined by states s3 and s4 allow two stack symbols to be removed for each b consumed. Furthermore, a string like aab can transfer control to s3 as the final b is processed, but the λ-move can then be applied to reach s4 even though there are no more symbols on the input tape.
Since A was the only stack symbol in P3, the language could just as easily have been described by the sack-and-stone counting device described at the beginning of the section. It should be clear that counting automata are essentially pushdown automata with a singleton stack alphabet. Pushdown automata with only one stack symbol cannot generate all the languages that a PDA with two symbols can [DENN]. However, it can be shown that using more than two stack symbols does not contribute to the generative power of a PDA; for example, a PDA with Γ = {A, B, C, D} can be converted into an equivalent machine with Γ′ = {0, 1} and the occurrences of the old stack symbols replaced by the encodings A = 01, B = 001, C = 0001, and D = 00001.
Every NDFA can be simulated by a PDA that simply ignores its stack. In fact,
every NDFA has an equivalent counting automaton, as shown in the following
theorem.
and

(∀s ∈ F)(δ″(s, λ, ¢) = {(s, λ)})

while

(∀s ∈ S − F)(δ″(s, λ, ¢) = { })

The same type of inductive statement proved for A′ holds for A″, and it therefore will follow that exactly those words that terminate in what used to be final states empty the stack, and thus L(A) = A(A″).
∆
where δG is defined, for all a ∈ Σ and all ψ ∈ Ω ∪ Σ, by

δG(s, a, ψ) = {(s, α) | ψ → aα ∈ P},  if ψ ∈ Ω
δG(s, a, ψ) = {(s, λ)},               if ψ ∈ Σ ∧ ψ = a
EXAMPLE 10.6
Consider the pure Greibach normal form grammar

G = <{S}, {a, b}, S, {S → aSb, S → ab}>

which is perhaps the simplest grammar generating {a^n b^n | n ≥ 1}. The automaton PG is then

PG = <{a, b}, {S, a, b}, {s}, s, δG, S, ∅>

where δG is defined by

δG(s, a, S) = {(s, Sb), (s, b)}
δG(s, a, a) = {(s, λ)}
δG(s, a, b) = { }
δG(s, b, S) = { }
δG(s, b, a) = { }
δG(s, b, b) = {(s, λ)}
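Definition 10.6 can be transcribed almost verbatim. In the sketch below (with rule and table encodings assumed, as in the earlier simulator sketch), each GNF production contributes one alternative to the table:

    def pda_from_gnf(terminals, rules, state="s"):
        """Build delta for the one-state PDA of Definition 10.6: each GNF
        rule A -> a alpha contributes (s, alpha) to delta(s, a, A), and each
        terminal on the stack matches itself and is popped.  Rules are
        (left-hand side, right-hand string) pairs."""
        delta = {}
        for lhs, rhs in rules:
            a, alpha = rhs[0], rhs[1:]
            delta.setdefault((state, a, lhs), []).append((state, alpha))
        for a in terminals:
            delta.setdefault((state, a, a), []).append((state, ""))
        return delta

    # Example 10.6: G = <{S}, {a, b}, S, {S -> aSb, S -> ab}>
    d = pda_from_gnf({"a", "b"}, [("S", "aSb"), ("S", "ab")])
    print(d[("s", "a", "S")])     # [('s', 'Sb'), ('s', 'b')], as in the text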
EXAMPLE 10.7
For a slightly more complex example, consider the pure Greibach normal form grammar

G = <{R}, {a, b, c, (, ), ε, ∅, ∪, ·, *}, R, {R → a|b|c|ε|∅|(R·R)|(R∪R)|(R)*}>.

The automaton PG is then

<{a, b, c, (, ), ε, ∅, ∪, ·, *}, {R, a, b, c, (, ), ε, ∅, ∪, ·, *}, {s}, s, δG, R, ∅>,

where δG is comprised of the following nonempty transitions:
In this grammar, it happens that the symbol ( is never pushed onto the stack, and so the last transition is not utilized. Transitions not listed are empty; that is, they are of the form δG(s, d, A) = { }.
Consider the string (a∪(b·c)), which has the following (unique) derivation:

R ⇒ (R∪R)
⇒ (a∪R)
⇒ (a∪(R·R))
⇒ (a∪(b·R))
⇒ (a∪(b·c))
Figure 10.5 illustrates the state of the machine at several points during the move sequence. At each point when an R is the top stack symbol and the input tape head is scanning a (, there are three choices of productions that might have generated the opening parenthesis, and consequently the automaton has three choices with which to replace the R on the stack. If the wrong choice is taken, PG will halt at some future point. For example, if the initial move guessed that the first parenthesis was due to a concatenation operation, the move sequence would be

(s, (a∪(b·c)), R) ⊢ (s, a∪(b·c)), R·R)) ⊢ (s, ∪(b·c)), ·R))

Since there are no λ-moves and the entry for δG(s, ∪, ·) is empty, this attempt can go no further. A construction such as the one given in Definition 10.6 can be shown to produce the desired automaton for any context-free grammar in Greibach normal form.
(Figure 10.5, parts (a)-(c): the input tape and stack of PG at successive points while processing (a∪(b·c)))
V Theorem 10.2. Given any alphabet Σ, 𝒞_Σ ⊆ 𝒫_Σ. In particular, for any context-free grammar G, there is a pushdown automaton that accepts (via empty stack) the language generated by G.
Proof. Let G′ be any context-free grammar. Theorem 9.6 guarantees that there is a pure Greibach normal form grammar G = <Ω, Σ, S, P> for which
(Figure 10.5, parts (d)-(f))
In either case, induction on the number of moves in a sequence will show that

(∀x ∈ Σ*)(∀β ∈ (Σ ∪ Ω)*)((s, x, S) ⊢* (s, λ, β) iff S ⇒* xβ as a leftmost derivation)

Note that xβ is likely to be a sentential form that still contains nonterminals. The words x that result in an empty stack (β = λ) will then be exactly those words that produce an entire string of terminal symbols from the start symbol S (or Z in the case where the grammar contains the two special Z-rules). In other words, L(G′) = A(PG).
∆
Note that when m = 1, the transition (r, A1) ∈ δ(s, a, A) gives rise to a rule of the form A^{sq} → aA1^{rq} for each state q ∈ S.
EXAMPLE 10.8
GP2 = <{Z, Stt, Ctt}, {a, b}, Z, {Z → Stt, Stt → aSttCtt, Stt → aCtt, Ctt → b}>

and GP2 does indeed generate {a^n b^n | n ≥ 1} and is therefore equivalent to P2.
EXAMPLE 10.9
Since there are two stack symbols and two choices for each of the state superscripts, the nonterminal set for the grammar GP1 is {Z, Bqq, Bqr, Brq, Brr, Aqq, Aqr, Arq, Arr}, although some of these will turn out to be useless.
The production set PP1 contains the Z-rules Z → Bqq and Z → Bqr from the first criterion for productions. The transition δ(q, a, B) = {(q, A)} accounts for the productions Bqr → aAqr and Bqq → aAqq. δ(q, a, A) = {(q, AA)} gives rise to the Aqq-rules Aqq → aAqqAqq and Aqq → aAqrArq, and the Aqr-rules Aqr → aAqqAqr and Aqr → aAqrArr. δ(q, b, A) = {(r, λ)} accounts for another Aqr-rule, Aqr → b. Finally, the transition δ(r, b, A) = {(r, λ)} generates the only Arr-rule, Arr → b.
Note that some of the potential nonterminals (Arq, Brq, Brr) are never generated, and others (Aqq, Bqq) cannot produce terminal strings. The resulting grammar, with useless items deleted, is given by

GP1 = <{Z, Bqr, Aqr, Arr}, {a, b}, Z, {Z → Bqr, Bqr → aAqr, Aqr → aAqrArr, Aqr → b, Arr → b}>
Z ⇒ Bqr
⇒ aAqr
⇒ aaAqrArr
⇒ aaaAqrArrArr
⇒ aaabArrArr
⇒ aaabbArr
⇒ aaabbb
Note the relationship between the sequence of stack configurations and the nonterminals in the corresponding sentential form. For example, when aaa has been processed by P1, AAA is on the stack, and when the leftmost derivation has produced aaa, the remaining nonterminals are also three A-based symbols (AqrArrArr). Aqr denotes a nonterminal (which corresponds to the stack symbol A) that will eventually produce a terminal string as the stack shrinks below the current size during a sequence of transitions that lead from state q to state r. This finally happens in the remaining steps, where aaaAqrArrArr ⇒* aaabbb. Arr, by contrast, denotes a nonterminal (again corresponding to the stack symbol A) that will produce a terminal string as the stack shrinks in size during transitions from state r back to state r. In this example, this occurs in the last two steps. The initial stack symbol position held by B is finally vacated during a sequence of transitions from q to r, and hence Bqr appears in the leftmost derivation. On the other hand, it was not possible to vacate B's position during a sequence of moves from q to q, so Bqq consequently does not participate in significant derivations.
The strong correspondence between profitable move sequences in P and valid leftmost derivations in GP forms the cornerstone of the following proof.
V Theorem 10.3. Given any alphabet Σ, 𝒫_Σ ⊆ 𝒞_Σ. In particular, for any pushdown automaton P, there is a context-free grammar GP for which L(GP) = A(P).
Proof. Let P = <Σ, Γ, S, s0, δ, B, ∅> be a pushdown automaton, and let GP be the grammar given in Definition 10.7. The key to the proof is to show that all words accepted by empty stack in the PDA P can be generated by GP and that only such words can be generated by GP. That is, we wish to show that the automaton halts in some state t with an empty stack after processing the terminal string x exactly when there is a leftmost derivation of the form

Z ⇒ B^{s0 t} ⇒* x

That is,

(∀x ∈ Σ*)(Z ⇒ B^{s0 t} ⇒* x ⟺ (s0, x, B) ⊢* (t, λ, λ))

The desired conclusion, that L(GP) = A(P), will follow immediately from this equivalence. The equivalence does not easily lend itself to proof by induction on the length of x; indeed, to progress from the mth to the (m + 1)st step, a more general statement involving more of the nonterminals of GP is needed. The following statement can be proved by induction on the number of moves and leads to the desired conclusion when s = s0 and A = B:

(∀x ∈ Σ*)(∀A ∈ Γ)(∀s ∈ S)(∀t ∈ S)(A^{st} ⇒* x ⟺ (s, x, A) ⊢* (t, λ, λ))

The resulting grammar will then generate A(P), but GP may not be a strict context-free grammar; λ-moves may result in some productions of the form A^{sr} → λ, which will then have to be "removed," as specified by Exercise 9.16.
∆
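The triple construction of Definition 10.7 is likewise mechanical. The following sketch (with A^{st} represented as a tuple, an encoding chosen here rather than taken from the text) reproduces the production set of Example 10.9:

    from itertools import product

    def cfg_from_pda(states, delta, start, bottom):
        """Triple construction of Definition 10.7: the nonterminal A^{st},
        encoded as the tuple (A, s, t), is meant to derive exactly the
        inputs that pop A while P moves from state s to state t."""
        rules = [("Z", [(bottom, start, t)]) for t in states]
        for (s, a, A), moves in delta.items():
            for r, push in moves:
                if push == "":                   # (r, lambda): A^{sr} -> a
                    rules.append(((A, s, r), [a] if a else []))
                else:                            # guess the intermediate states
                    for qs in product(states, repeat=len(push)):
                        body, prev = ([a] if a else []), r
                        for X, q in zip(push, qs):
                            body.append((X, prev, q))
                            prev = q
                        rules.append(((A, s, qs[-1]), body))
        return rules

    # P1 again: the rules of Example 10.9 appear, useless nonterminals and all.
    d1 = {("q", "a", "B"): [("q", "A")], ("q", "a", "A"): [("q", "AA")],
          ("q", "b", "A"): [("r", "")], ("r", "b", "A"): [("r", "")]}
    for lhs, rhs in cfg_from_pda(["q", "r"], d1, "q", "B"):
        print(lhs, "->", rhs)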
Thus, 𝒫_Σ = 𝒞_Σ. Furthermore, only one state in a PDA is truly necessary, as
noted in the following corollary. In essence, this means that for PDAs that accept by
empty stack any state information can be effectively encoded with the information
on the stack.
V Corollary 10.1. For every PDA P that accepts via empty stack, there is an equivalent one-state PDA P′ that also accepts via empty stack.
Proof. Let P be a PDA that accepts via empty stack. Let P′ = PGP. That is, from the original PDA P, find the corresponding context-free grammar GP. By Theorem 10.3, this is equivalent to P. However, by Theorem 10.2, the grammar GP has an equivalent one-state PDA, which must also be equivalent to P.
∆
Unlike the pushdown automata discussed in this section, PDAs that accept via
final state cannot always make do with a single state. As the exercises will make
clear, at least one final and one nonfinal state are necessary. Unlike DFAs, PDAs
with only one state can accept some nontrivial languages, since selected words can
be rejected because there is no appropriate move sequence. However, a single final
state and a single nonfinal state are sufficient, as shown in the following theorem.
V Theorem 10.4. Every pushdown automaton P that accepts via empty stack has an equivalent two-state pushdown automaton Pf that accepts via final state.
Proof. Corollary 10.1 guaranteed that every pushdown automaton that accepts via empty stack has an equivalent one-state pushdown automaton that also accepts via empty stack. Without loss of generality, we may therefore assume that P = <Σ, Γ, {s}, s, δ, B, ∅>. Define Pf by choosing a new state f and two new stack symbols Y and Z such that Y, Z ∉ Γ, and let Pf = <Σ, Γ ∪ {Y, Z}, {s, f}, s, δf, Z, {f}>, where δf is defined by:
Notice that rules 2 and 3 imply that, while the original stack symbols appear on the stack, the machine moves exactly as the original PDA. Rules 5 and 6 indicate that no letters can be consumed while there is a Y or Z on the stack, and no moves are possible once the final state f is reached. Since the bottom-of-the-stack symbol is now the new letter Z, rule 1 is the only rule that initially applies. Its application results in a configuration very much like that of the old PDA, with the symbol Y underneath the old bottom-of-the-stack symbol B. Pf now simulates P until the Y is uncovered (that is, until a point is reached at which the old PDA would have emptied its stack). In such cases (and only in such cases), rule 4 applies, control can be transferred to the final state f, and Pf must then halt.
By inducting on the number of moves in a sequence, it can be shown for any α, β ∈ Γ* that

(∀x, y ∈ Σ*)((s, xy, α) ⊢* (s, y, β) in P ⟺ (s, xy, αY) ⊢* (s, y, βY) in Pf)

From this, with y = β = λ and α = B, it follows that

(∀x ∈ Σ*)((s, x, B) ⊢* (s, λ, λ) in P ⟺ (s, x, BY) ⊢* (s, λ, Y) in Pf)

Consequently, since δf(s, λ, Z) = {(s, BY)} and δf(s, λ, Y) = {(f, Y)},

(∀x ∈ Σ*)((s, x, B) ⊢* (s, λ, λ) in P ⟺ (s, x, Z) ⊢* (f, λ, Y) in Pf)

which implies that (∀x ∈ Σ*)(x ∈ A(P) ⟺ x ∈ L(Pf)), as was to be proved.
∆
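Under the same assumed encodings as the earlier sketches, this conversion amounts to adding two λ-rules around an otherwise unchanged table; a sketch:

    def final_state_version(delta, bottom, s="s", f="f"):
        """Theorem 10.4's conversion of a one-state empty-stack PDA: slip a
        buffer Y and a new bottom Z under the old stack (rule 1); uncovering
        Y means the old machine would have emptied its stack, so a
        lambda-move enters the final state f (rule 4).  Names are assumed."""
        df = dict(delta)                          # rules 2-3: simulate P
        df[(s, "", "Z")] = [(s, bottom + "Y")]    # rule 1
        df[(s, "", "Y")] = [(f, "Y")]             # rule 4
        return df

    # With PG of Example 10.6 and the simulator sketched earlier:
    d = {("s", "a", "S"): [("s", "Sb"), ("s", "b")],
         ("s", "a", "a"): [("s", "")], ("s", "b", "b"): [("s", "")]}
    df = final_state_version(d, "S")
    # accepts(df, "s", "Z", {"f"}, "aabb", False) now returns True.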
Thus, every language which is A(P) for some PDA can be recognized by a PDA that accepts via final state, and this PDA need only employ one final and one nonfinal state. Thus, 𝒫_Σ ⊆ ℱ_Σ. One might conjecture that ℱ_Σ might actually be larger than 𝒫_Σ, since some added capability might arise if more than two states are used in a pushdown automaton that accepts via final state. This is not the case, as demonstrated by the following theorem. Once again, the information stored in the
finite control can effectively be transferred to the stack; only one final and one
nonfinal state are needed to accept any context-free language via final state, and
context-free languages are the only type accepted via final state.
V Theorem 10.5. Every pushdown automaton P that accepts via final state has an equivalent pushdown automaton Pλ that accepts via empty stack.
Proof. Assume that P = <Σ, Γ, S, s0, δ, B, F>. Define Pλ by choosing new stack symbols Y and Z such that Y, Z ∉ Γ and a new state e such that e ∉ S, and let Pλ = <Σ, Γ ∪ {Y, Z}, S ∪ {e}, s0, δλ, Z, ∅>, where δλ is defined by:
The first rule guards against PI. inappropriately accepting if P simply empties
its stack (by padding the stack with the new stack symbol Y). The intent of rules 2
through 4 is to arrange for PI. to simulate the moves of P and allow PI. to enter the
state e when final states can be reached. The state e does not allow any further
symbols to be processed, but does allow the stack contents (including the new buffer
symbol) to be emptied via rules 5 and 6. Thus, PI. has a sequence of moves for input
x that empties the stack exactly when P has a sequence of moves that leads to a final
state.
By inducting on the number of moves in a sequence, it can be shown for any α, β ∈ Γ* that

(∀x, y ∈ Σ*)(∀s, t ∈ S)((s, xy, α) ⊢* (t, y, β) in P ⟺ (s, xy, αY) ⊢* (t, y, βY) in Pλ)

From this, with y = λ, α = B, and t ∈ F, it follows that

(∀x ∈ Σ*)(∀t ∈ F)((s0, x, B) ⊢* (t, λ, β) in P ⟺ (s0, x, BY) ⊢* (t, λ, βY) in Pλ)

Consequently, since δλ(s0, λ, Z) = {(s0, BY)} and δλ(t, λ, A) contains (e, λ) for each t ∈ F, repeated application of rules 5 and 6 implies

(∀x ∈ Σ*)(∀t ∈ F)((s0, x, B) ⊢* (t, λ, β) in P ⟺ (s0, x, Z) ⊢* (e, λ, λ) in Pλ)

This shows that (∀x ∈ Σ*)(x ∈ A(Pλ) ⟺ x ∈ L(P)).
∆
V Theorem 10.6. 𝒞_Σ is closed under intersection with a regular set. That is, if L1 is context free and R2 is regular, L1 ∩ R2 is always context free.
(Figure 10.6 A model of a "pushdown automaton" with two tapes)
Closure properties such as this are quite useful in showing that certain languages are not context free. Consider the set L = {x ∈ {a, b, c}* | |x|a = |x|b = |x|c}. Since the letters in a word can occur in any order, a pumping theorem proof is less straightforward than for the set {a^n b^n c^n | n ≥ 0}. However, if L were context free, then L ∩ a*b*c* would also be context free. But L ∩ a*b*c* = {a^n b^n c^n | n ≥ 0}, and thus L cannot be context free. The exercises suggest other occasions for which closure properties are useful in showing certain languages are not context free.
For the machines discussed in the first portion of this text, it was seen that
nondeterminism did not add to the computing power of DFAs. In contrast, this is
not the case for pushdown automata. There are languages that can be accepted by
nondeterministic pushdown automata that cannot be accepted by any deterministic
pushdown automaton. The following is the broadest definition of what can con-
stitute a deterministic pushdown automaton.
Rule 1 states that, for a given input letter, deterministic pushdown automata cannot have two different choices of destination states or two different choices of strings to place on the stack. Rule 2 ensures that there is no choice of λ-moves either. Furthermore, rule 3 guarantees that there will never be a choice between a λ-move and a transition that consumes a letter; states that have a λ-move can have only that one move; no other transitions of any type are allowed out of that state. Thus, for any string, there is never more than one path through the machine. Unlike deterministic finite automata, deterministic pushdown automata may not always completely process the strings in Σ*; a given string may reach a state that has no further valid moves, or a string may prematurely empty the stack. In each case, the DPDA would halt without processing any further input.
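These conditions are simple to check mechanically on the transition-table encoding used in the earlier sketches; the reading of rule 3 below (per state and stack symbol) is an interpretation chosen here, not the text's wording:

    def is_deterministic(delta, sigma):
        """Check the conditions of Definition 10.8: at most one move per
        table entry (rules 1 and 2), and a lambda-move excludes letter-moves
        from the same state and stack symbol (rule 3)."""
        for (s, a, A), moves in delta.items():
            if len(moves) > 1:
                return False
            if a == "" and moves and any(delta.get((s, b, A)) for b in sigma):
                return False
        return True

    d1 = {("q", "a", "B"): [("q", "A")], ("q", "a", "A"): [("q", "AA")],
          ("q", "b", "A"): [("r", "")], ("r", "b", "A"): [("r", "")]}
    print(is_deterministic(d1, {"a", "b"}))    # True: P1 is deterministic
    d2 = {("s", "a", "S"): [("s", "Sb"), ("s", "b")]}
    print(is_deterministic(d2, {"a", "b"}))    # False: PG of Example 10.6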
EXAMPLE 10.10
The automaton P1 in Example 10.1 was deterministic. The PDAs in Examples 10.4 and 10.5 were not deterministic. The automaton PG derived in Example 10.7 was not deterministic because there were three possible choices of moves listed for δG(s, (, R): {(s, R·R)), (s, R∪R)), (s, R)*)}. These choices corresponded to the three different operators that might have generated the open parenthesis.
ing PDA give an indication of which productions in the underlying grammar were
used; such information is of obvious use in compiler construction. A nondetermin-
istic pushdown automaton is at best a very inefficient tool for parsing; a DPDA is
much better suited to the task.
As mentioned in the proof of Theorem 10.2, each leftmost derivation in G has a corresponding sequence of moves in PG. If G is ambiguous, then there is at least one word with two distinct leftmost derivations, and hence if that word appeared on the input tape of PG, there would be two distinct move sequences leading to acceptance. In this case, PG cannot possibly be deterministic. On the other hand, if PG is nondeterministic, this does not mean that G is ambiguous, as demonstrated by Example 10.7. In parsing a string in that automaton, it may not be immediately obvious which production to use (and hence which transition to take), but for any string there is at most one correct choice; each word has a unique parse tree and a unique leftmost derivation. The grammar in Example 10.7 is not ambiguous, even though the corresponding PDA was nondeterministic.
EXAMPLE 10.11
The following Greibach normal form grammar is similar to the one used to
construct the PDA in Example 10.7, but with the different operators paired with
unique delimiters. Let
G = <{R}, {a, b, c, (, ), {, }, [, ], ε, ∅, ∪, ·, *}, R, {R → a|b|c|ε|∅|(R·R)|[R∪R]|{R}*}>.

The automaton PG is then

<{a, b, c, (, ), {, }, [, ], ε, ∅, ∪, ·, *}, {R, a, b, c, (, ), {, }, [, ], ε, ∅, ∪, ·, *}, {s}, s, δG, R, ∅>

where δG is comprised of the following nonempty transitions:

δG(s, (, R) = {(s, R·R))}
δG(s, [, R) = {(s, R∪R])}
δG(s, {, R) = {(s, R}*)}
δG(s, a, R) = {(s, λ)}
δG(s, b, R) = {(s, λ)}
δG(s, c, R) = {(s, λ)}
δG(s, ε, R) = {(s, λ)}
δG(s, ∅, R) = {(s, λ)}
δG(s, a, a) = {(s, λ)}
δG(s, b, b) = {(s, λ)}
δG(s, c, c) = {(s, λ)}
δG(s, ∅, ∅) = {(s, λ)}
EXAMPLE 10.12
Consider again the language discussed in Example 10.7, which can also be ex-
pressed by the following grammar
H = <{S, T}, {a, b, c, (,), E, 0, U,', *}, S, {S~ (STlalb IcIEI0, T ~ 'S)I US) I)*}>
The automaton PH is then
<{a, b, c, (,), E, 0, U,', *}, {S, T, a, b, c, (,), E, 0, U,', *}, {t}, t, 3H , S, 0>
Thus, even though the PDA PG in Example 10.7 turned out to be nondeterministic, this was not a flaw in the language itself, since PH is an equivalent DPDA. Notice that the grammar G certainly appears to be more straightforward than H: G has fewer nonterminals and fewer productions, and it is a bit harder to understand the relationships between the nonterminals of H. Nevertheless, the LL(0) grammar H led to an efficient parser and G did not.
To take advantage of the resulting reduction in complexity, all major programming languages are designed to be recognized by DPDAs. These constructs naturally lead to a mechanical framework for syntactic analysis. In Example 10.12, the application of the production T → ∪S) [that is, the use of the transition δH(t, ∪, T) = {(t, S))}] signifies that the previous expression and the expression to which S will expand are to be combined with the union operator. It should be easy to see that a similar grammar and DPDA for arithmetic expressions (using +, -, *, and / rather than ∪, ·, and *) would provide a guide for converting such expressions into their equivalent machine code.
Deterministic pushdown automata have some surprising properties. Recall that 𝒞_Σ was not closed under complementation, and since 𝒫_Σ = 𝒞_Σ, there must be some PDAs that define languages whose complement cannot be recognized by any PDA. However, it can be shown that any language accepted by a DPDA must have a complement that can also be recognized by a DPDA. The construction used to prove this statement, in which final and nonfinal states are interchanged in a DPDA
that accepts via final state, is similar to the approach used in Theorem 5.1 for
deterministic finite automata. It is useful to recall why it was crucial in the proof of
Theorem 5.1 to begin with a DFA when interchanging states, rather than using an
NDFA. Strings that have multiple paths in an NDFA that lead to both final and
nonfinal states would be accepted in the original automaton and also in the machine
with the states interchanged. Furthermore, some strings may have no complete
paths through the NDFA and be rejected in both the original and new automata.
The problem of multiple paths does not arise with DPDAs, since by definition no
choice of moves is allowed. However, strings that do not get completely consumed
would be rejected in both the original DPDA and the DPDA with final and nonfinal
states interchanged. Thus, the proof of closure under complement for DPDAs is not
as straightforward as for DFAs. There are three ways an input string might not be
completely consumed: the stack might empty prematurely, there may be no transi-
tion available at some point, or there might only be a cycle of λ-moves available that consumes no further input. The exercises indicate that it is possible to avoid these
problems by padding the stack with a new bottom-of-the-stack symbol, and adding a
"garbage state" to which strings that are hopelessly stuck would transfer.
V Definition 10.9. Given any alphabet Σ, let 𝒜_Σ represent the collection of all languages recognized by deterministic pushdown automata. If L ∈ 𝒜_Σ, then L is said to be a deterministic context-free language (DCFL).
∆
Theorem 10.7 shows that, unlike 𝒫_Σ, 𝒜_Σ is closed under complementation. This divergent behavior has some immediate consequences, as stated below.
properties, they cannot represent the same collection, and 𝒜_Σ ⊆ 𝒫_Σ implies that the containment must be proper.
∆
V Theorem 10.9. Given any alphabet Σ, 𝒜_Σ is closed under complement. 𝒜_Σ is also closed under union, intersection, and difference with a regular set. That is, if L1 is a DCFL and R2 is a FAD language, then the following are deterministic context-free languages:

~L1
L1 ∩ R2
L1 ∪ R2
L1 − R2
R2 − L1

Proof. The proof follows from the above discussion and theorems and the exercises.
∆
EXAMPLE 10.13
These closure properties can often be used to justify that certain languages are
not DCFLs. For example, the language
L = {x ∈ {a, b, c}* | |x|a = |x|b} ∪ {x ∈ {a, b, c}* | |x|b = |x|c}

can be recognized by a PDA but not by a DPDA. If L were a DCFL, then ~L = {x ∈ {a, b, c}* | |x|a ≠ |x|b} ∩ {x ∈ {a, b, c}* | |x|b ≠ |x|c} would also be a DCFL. However, ~L ∩ a*b*c* = {a^k b^n c^m | (k ≠ n) ∧ (n ≠ m)}, which should also be a DCFL.
Ogden's lemma shows that this is not even a CFL (see the exercises), and hence the
original hypothesis that L was a DCFL must be false. The interested reader is
referred to similar discussions in [HOPC] and [DENN].
The restriction that the head scanning the stack tape could only access the
symbol at the top of the stack imposed limitations on the cognitive power of this
class of automata. While the current contents of the top of the stack could be stored
in the finite-state control and be remembered after the stack was popped, only a
finite number of such pops can be recorded within the states of the PDA. At some
point, seeking information further down on the stack will cause an irretrievable loss
of information. One might suspect that if popped items were not erased (so that
they could be revisited and reviewed at some later point) a wider class of languages
might be recognizable. Generalized automata that allow such nondestructive
"backtracking" are called Turing machines and form a significantly more powerful
class of automata. These devices and their derivatives are the subject of the next
chapter.
EXERCISES
10.2. Define a deterministic pushdown automaton P1 with only one state for which A(P1) = {a^n b^n | n ≥ 1}.
10.3. Consider the pushdown automaton defined by P2 = <{a, b}, {S, C}, {t}, t, δ, S, {t}>, where δ is defined by

δ(t, a, S) = {(t, SC), (t, C)}
δ(t, a, C) = { }
δ(t, b, S) = { }
δ(t, b, C) = {(t, λ)}

(a) Give an inductive proof that
(∀i ∈ ℕ)((t, a^i, S) ⊢* (t, λ, α) ⇒ (α = SC^i ∨ α = C^i))
(b) Give an inductive proof that
(∀i ∈ ℕ)((t, x, C^i) ⊢* (t, λ, λ) ⇒ (x = b^i))
(c) Find L(P2); use parts (a) and (b) to rigorously justify your statements.
10.4. Let L = {a^i b^j c^k | i, j, k ∈ ℕ and i + j = k}.
(a) Find a pushdown automaton (which accepts via final state) that recognizes L.
(b) Find a pushdown automaton (which accepts via empty stack) that recognizes L.
(c) Is there a counting automaton that accepts L?
(d) Is there a DPDA that accepts L?
(e) Use Definition 10.7 to find a grammar equivalent to the PDA in part (a).
10.5. Let L = {x ∈ {a, b, c}* | |x|a + |x|b = |x|c}.
(a) Find a pushdown automaton (which accepts via final state) that recognizes L.
(b) Find a pushdown automaton (which accepts via empty stack) that recognizes L.
(c) Is there a counting automaton that accepts L?
(d) Is there a DPDA that accepts L?
(e) Use Definition 10.7 to find a grammar equivalent to the PDA in part (a).
10.6. Prove or disprove that:
(a) 𝒫_Σ is closed under inverse homomorphism.
(b) 𝒜_Σ is closed under inverse homomorphism.
10.7. Give an example of a finite language that cannot be recognized by any one-state PDA that accepts via final state.
10.8. Let L = {a^n b^n c^m d^m | n, m ∈ ℕ}.
(a) Find a pushdown automaton (which accepts via final state) that recognizes L.
(b) Find a pushdown automaton (which accepts via empty stack) that recognizes L.
(c) Is there a DPDA that accepts L?
(d) Is there a counting automaton that accepts L?
(e) Use Definition 10.7 to find a grammar equivalent to the PDA in part (b).
10.9. Refer to Theorem 10.2 and use induction on the number of moves in a sequence to
show that
(∀x ∈ Σ*)(∀β ∈ (Σ ∪ Ω)*)((s, x, S) ⊢* (s, λ, β) iff S ⇒* xβ as a leftmost derivation)
10.10. Consider the grammar
<{R}, {a, b, c, (, ), ε, ∅, ∪, ·, *}, R, {R → a|b|c|ε|∅|(R·R)|(R∪R)|R*}>
(a) Convert this grammar to Greibach normal form, adding the new nonterminal Y.
(b) Use Definition 10.6 on part (a) to find the corresponding PDA.
(c) Use the construct suggested by Theorem 10.4 on part (b) to find the corresponding PDA that accepts via final state.
10.11. Let L = {a^i b^j c^j d^i | i, j ∈ ℕ}.
(a) Find a pushdown automaton (which accepts via final state) that recognizes L.
(b) Find a pushdown automaton (which accepts via empty stack) that recognizes L.
(c) Is there a DPDA that accepts L?
(d) Is there a counting automaton that accepts L?
(e) Use Definition 10.7 to find a grammar equivalent to the PDA in part (b).
10.12. Consider the PDA P3 in Example 10.5. Use Definition 10.7 to find GP3.
10.13. Refer to Theorem 10.3 and use induction to show
(∀x ∈ Σ*)(∀A ∈ Γ)(∀s ∈ S)(∀t ∈ S)(A^{st} ⇒* x ⟺ (s, x, A) ⊢* (t, λ, λ))
10.14. Let L = {a^n b^n c^m d^m | n, m ∈ ℕ} ∪ {a^i b^j c^j d^i | i, j ∈ ℕ}.
(a) Find a pushdown automaton (which accepts via final state) that recognizes L.
(b) Find a pushdown automaton (which accepts via empty stack) that recognizes L.
(c) Is there a DPDA that accepts L?
(d) Is there a counting automaton that accepts L?
(e) Use Definition 10.7 to find a grammar equivalent to the PDA in part (b).
10.15. Consider the PDA PG in Example 10.6. Use Definition 10.7 to find GPG.
10.16. Refer to Theorem 10.4 and use induction to show
(∀α, β ∈ Γ*)(∀x, y ∈ Σ*)((s, xy, α) ⊢* (s, y, β) in P ⟺ (s, xy, αY) ⊢* (s, y, βY) in Pf)
10.17. Refer to Theorem 10.5 and use induction to show
(∀α, β ∈ Γ*)(∀x, y ∈ Σ*)(∀s, t ∈ S)((s, xy, α) ⊢* (t, y, β) in P ⟺ (s, xy, αY) ⊢* (t, y, βY) in Pλ)
10.18. Prove that {x ∈ {a, b, c}* | |x|a = |x|b ∧ |x|b > |x|c} is not context free. (Hint: Use closure properties.)
10.19. (a) Give an appropriate definition for the state transition function of the two-tape automaton pictured in Figure 10.6, stating the new domain and range.
(b) Define a two-tape automaton that accepts {a^n b^n c^n | n ≥ 1} via final state.
10.20. (a) Prove that {a^n b^n c^n | n ≥ 1} is not context free.
(b) Prove that {x ∈ {a, b, c}* | |x|a = |x|b} is not context free. [Hint: Use closure properties and apply part (a).]
10.21. (a) Find a DPDA that accepts
{c^n b^m | (n ≥ 1) ∧ (n = m)} ∪ {a^n b^m | (n ≥ 1) ∧ (n = 2m)}
(b) Define a homomorphism that transforms part (a) into a language that is not a DCFL.
10.22. Use Ogden's lemma to show that {a^k b^n c^m | (k ≠ n) ∧ (n ≠ m)} is not a context-free language.
10.23. Refer to Theorem 10.6 and use induction to show
(∀α, β ∈ Γ*)(∀x, y ∈ Σ*)(∀s1, t1 ∈ S1)(∀s2, t2 ∈ S2)
(((s1, s2), xy, α) ⊢* ((t1, t2), y, β) in P∩ ⟺ (((s1, xy, α) ⊢* (t1, y, β) in P1) ∧ (t2 = δ2(s2, x))))
10.24. Assume that P is a DPDA. Prove that there is an equivalent DPDA P′ (which accepts via final state) for which:
(a) P′ always has a move for all combinations of states, input symbols, and stack symbols.
(b) P′ never empties its stack.
(c) For each input string presented to P′, P′ always scans the entire input string.
10.25. Assume the results of Exercise 10.24, and show that 𝒜_Σ is closed under complementation. (Hint: Exercise 10.24 almost allows the trick of switching final and nonfinal states to work; the main remaining problem involves handling the case where a series of λ-moves may cycle through both final and nonfinal states.)
10.26. Give an example that shows that 𝒜_Σ is not closed under concatenation.
10.27. Give an example that shows that 𝒜_Σ is not closed under Kleene closure.
10.28. Show that {ca^n b^m | (n ≥ 1) ∧ (n = m)} ∪ {a^n b^m | (n ≥ 1) ∧ (n = 2m)} is a DCFL.
10.29. (a) Modify the proof of Theorem 10.6 to show that if L1 is context free and R2 is regular, L1 − R2 is always context free.
(b) Prove the result in part (a) by instead appealing to closure properties for complement and intersection.
10.30. (a) Modify the proof of Theorem 10.6 to show that if L1 is context free and R2 is regular, L1 ∪ R2 is always context free.
(b) Prove the result in part (a) by instead appealing to closure properties for complement and intersection.
10.31. Argue that if L1 is a DCFL and R2 is regular, R2 − L1 is always a DCFL.
10.32. (a) Prove that {w·2·wʳ | w ∈ {0, 1}*} is a DCFL.
(b) Prove that {w·wʳ | w ∈ {0, 1}*} is not a DCFL.
10.33. Give examples to show that even if L1 and L2 are DCFLs:
(a) L1·L2 need not be a DCFL.
(b) L1 − L2 need not be a DCFL.
(c) L1* need not be a DCFL.
(d) L1ʳ need not be a DCFL.
10.34. Consider the quotient operator / given by Definition 5.10. Prove or disprove that:
(a) 𝒫_Σ is closed under quotient.
(b) 𝒜_Σ is closed under quotient.
10.35. Consider the operator b defined in Theorem 5.11. Prove or disprove that:
(a) 𝒫_Σ is closed under the operator b.
(b) 𝒜_Σ is closed under the operator b.
10.36. Consider the operator Y defined in Theorem 5.7. Prove or disprove that:
(a) 𝒫_Σ is closed under the operator Y.
(b) 𝒜_Σ is closed under the operator Y.
10.37. Consider the operator P given in Exercise 5.16. Prove or disprove that:
(a) 𝒫_Σ is closed under the operator P.
(b) 𝒜_Σ is closed under the operator P.
10.38. Consider the operator F given in Exercise 5.19. Prove or disprove that:
(a) 𝒫_Σ is closed under the operator F.
(b) 𝒜_Σ is closed under the operator F.
C H A P T E R 11

TURING MACHINES
In the preceding chapters, we have seen that DFAs and NDFAs represented the
type 3 languages and pushdown automata represented the type 2 languages. In this
chapter we will explore the machine analog to the type 1 and type 0 grammars.
These devices, called Turing machines, are the most powerful automata known and
can recognize every language considered so far in this text. We will also encounter
languages that are too complex to be recognized by any Turing machine. Indeed, we
will see that any other such (finite) scheme for the representation of languages is
likewise forced to be unable to represent all possible languages over a given
alphabet. Turing machines provide a gateway to undecidability, discussed in the
next chapter, and to the general theory of computational complexity, which is rich
enough to warrant much broader treatment than would be possible here.
11.1 DEFINITIONS AND EXAMPLES

Pushdown automata turned out to be the appropriate cognitive devices for the type 2 languages, but further enhancements in the capabilities of the automaton model are necessary to achieve the generality inherent in type 0 and type 1 languages. A (seemingly) minor modification will be all that is required. Turing machines are composed of the familiar components that have already been used in previous classes of automata. As with the earlier constructions, the heart of the device is a finite-state control, which reacts to information scanned by the tape head(s). As with finite-state transducers and pushdown automata, information can be written to the tape as transitions between states are made. Unlike FSTs and PDAs, Turing machines
have only one tape with which to work, which serves both the input and the output
needs of the device. Note that with finite-state transducers the presence of a second
tape was purely for convenience; a single tape, with input symbols overwritten by
the appropriate output symbol as the read head progressed, would have sufficed.
Whereas a pushdown automaton could write an entire string of symbols to the
stack, a Turing machine is constrained to print a single letter at a time. These new
devices would therefore be of less value than PDAs were they not given some other
capability. In all previous classes of automata, the read head was forced to move one
space to the right on each transition (or, in the case of λ-moves, remain stationary).
On each transition, the Turing machine tape head has the option of staying put,
moving right, or moving left. The ability to move back to the left and review
previously written information accounts for the added power of Turing machines.
It is possible to view a Turing machine as a powerful transducer of computable
functions, with an associated function defined much like those for FSTs. That is, as
with finite-state transducers, each word that could be placed on an otherwise blank
tape is associated with the word formed by allowing the Turing machine to operate
on that word. With FSTs, this function was well defined; the machine would process
each letter of the word in a unique way, the read head would eventually find the end
of the word (that is, it would scan a blank), and the device would then halt. With
Turing machines, there is no built-in guarantee that the machine will always halt; since the tape
head can move both right and left, it is possible to define a Turing machine that
would reverberate back and forth between two adjacent spaces indefinitely. A
Turing machine is also not constrained to halt when it scans a blank symbol; it may
overwrite the blank and/or continue moving right indefinitely.
Rather than viewing a Turing machine as a transducer, we will primarily be
concerned with employing it as an acceptor of words placed on the tape. Some
variants of Turing machines are defined with a set of final states, and the criteria for
acceptance would then be that the device both halt and be in a final state. For our
purposes, we will employ the writing capabilities of the Turing machine and simply
require that acceptance be indicated by printing a Y just prior to halting. If such a Y
is never printed or the machine does not halt, the word will be considered rejected.
There may be words that, when placed on the input tape, prevent the machine from ever halting, which is at best a serious inconvenience; if the device has been operating for an extraordinarily long time, we may not be able to tell whether it will never halt (and thus reject the word) or whether we simply need to be patient and wait for it to eventually print the Y. This uncertainty can in some cases
be avoided by finding a superior design for the Turing machine, which would always
halt, printing N when a word is rejected and Y when a word is accepted. This is not
always a matter of being clever in defining the machine; we will see that there are
some languages that are inherently so complex that this goal is impossible to
achieve.
A conceptual model of a Turing machine is shown in Figure 11.1. Note that the
tape head is capable of both reading and overwriting the currently scanned symbol.
As before, the tape is composed of a series of cells, with one symbol per cell. The
[Figure 11.1 A model of a Turing machine: a finite-state control attached to a read/write head scanning one cell of the tape]
tape head will also be allowed to move one cell to either the left or right during a
transition. Note that unlike all previous automata, the tape does not have a "left
end"; it extends indefinitely in both directions. This tape will be used for input,
output, and as a "scratch pad" for any intermediate calculations. At the start of
operation of the device, all but a finite number of contiguous cells are blank. Also,
unlike our earlier devices, the following definition implies that Turing machines may
continue to operate after scanning a blank.
The auxiliary alphabet Γ always includes the blank symbol (denoted by #), and neither Σ nor Γ includes the special symbols L and R, which denote moving the tape head left and right, respectively. The state h is a special halt state, from which no further transitions are possible; h ∉ S. On each transition, the machine may do exactly one of the following:
1. Overprint the cell with a symbol from Σ or Γ (and thus a blank may be printed).
2. Move the tape head one cell to the left (L).
3. Move the tape head one cell to the right (R).
Δ
In the case where a cell is overprinted, the tape head remains positioned on that
cell.
The above definition of a Turing machine is compatible with the construct
implemented by Jon Barwise and John Etchemendy in their Turing's World© soft-
ware package for the Apple® Macintosh. The Turing's World program allows the
user to interactively draw a state transition diagram of a Turing machine and watch
it operate on any given input string. As indicated by the next example, the same
software can be used to produce and test state transition diagrams for deterministic
finite automata.
EXAMPLE 11.1
The following simple Turing machine recognizes the set of even-length words over
{a, b}. The state transition diagram for this device is shown in Figure 11.2 and con-
forms to the conventions introduced in Chapter 7. Transitions between states are
represented by arrows labeled by the symbol that caused the transition. The symbol
after the slash denotes the character to be printed or, in the case of L and R, the direction to move the tape head. The quintuple is <{a, b}, {#, Y, N}, {s₀, s₁}, s₀, δ_T>, where δ_T is given by
δ_T(s₀, a) = (s₁, R)
δ_T(s₀, b) = (s₁, R)
δ_T(s₀, #) = (h, Y)
δ_T(s₁, a) = (s₀, R)
δ_T(s₁, b) = (s₀, R)
δ_T(s₁, #) = (h, N)
This particular Turing machine operates in much the same way as a DFA
would, always moving right as it scans each symbol of the word on the input tape.
When it reaches the end of the word (that is, when it first scans a blank), it prints Y
or N, depending on which state it is in, and halts. It differs from a DFA in that the
accept/reject indication is printed on the tape at the right end of the word. Figure
11.3 shows an alternative way of displaying this machine, in which the halt state is
not explicitly shown. Much like the straight start state arrow that denotes where the
automaton is entered, the new straight arrows show how the machine is left. This
notation is especially appropriate for submachines. As with complex programs, a
complex Turing machine may be comprised of several submodules. Control may be
passed to a submachine, which manipulates the input tape until it halts. Control
may then be passed to a second submachine, which then further modifies the tape
contents. When this submachine would halt, control may be passed on to a third
submachine, or back to the first submachine, and so on. The straight arrows leaving
the state transition diagram can be thought of as exit arrows for a submachine, and
they function much like a return statement in many programming languages. Exam-
ple 11.4 illustrates a Turing machine that employs submachines.
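For readers without access to the Turing's World package, the behavior of a Definition 11.1 style machine is easily mimicked in a conventional language. The following Python sketch is our own illustration (the function and variable names are ours, not part of the formal development): δ is stored as a dictionary, an action is L, R, or a symbol to overprint, and h is the halt state. It is shown running the even-length recognizer of Example 11.1.

BLANK = '#'

def run(delta, word, start='s0', max_steps=10_000):
    """Simulate a Turing machine; return the final nonblank tape contents,
    or None if the machine has not halted within max_steps transitions."""
    tape = dict(enumerate(word))          # sparse, two-way-infinite tape
    state, head = start, 0
    for _ in range(max_steps):
        if state == 'h':                  # the special halt state
            if not tape:
                return BLANK
            return ''.join(tape.get(i, BLANK)
                           for i in range(min(tape), max(tape) + 1))
        sym = tape.get(head, BLANK)
        state, action = delta[(state, sym)]
        if action == 'R':                 # move the tape head right
            head += 1
        elif action == 'L':               # move the tape head left
            head -= 1
        else:                             # overprint; the head stays put
            tape[head] = action
    return None                           # did not halt within the budget

# The even-length recognizer of Example 11.1.
delta_T = {('s0', 'a'): ('s1', 'R'), ('s0', 'b'): ('s1', 'R'),
           ('s0', BLANK): ('h', 'Y'),
           ('s1', 'a'): ('s0', 'R'), ('s1', 'b'): ('s0', 'R'),
           ('s1', BLANK): ('h', 'N')}

print(run(delta_T, 'abab'))   # ababY: even length, accepted
print(run(delta_T, 'aba'))    # abaN:  odd length, rejected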
We will see that any DFA can be emulated by a Turing machine in the manner
suggested by Example 11.1. The following example shows that Turing machines can
recognize languages that are definitely not FAD. In fact, the language accepted in
Example 11.2 is not even context free.
EXAMPLE 11.2
The Turing machine M illustrated in Figure 11.4 operates on words over {a, b, c}.
When started at the leftmost end of the word, it is guaranteed to halt at the
rightmost end and print Y or N. It happens to overwrite the symbols comprising the
input word as it operates, but this is immaterial. In fact, it is possible to design a
slightly more complex machine that restores the word before halting (see Example
11.11). The quintuple is <{a, b, c}, {#, X, Y, N}, {s₀, s₁, s₂, s₃, s₄, s₅, s₆}, s₀, δ>, where δ is as indicated in the diagram in Figure 11.4. It is intended to recognize the language {x ∈ {a, b, c}* | |x|_a = |x|_b = |x|_c}. One possible procedure for processing a string to check if it has the same number of as, bs, and cs is given by the pseudocode below.
while an a remains do
begin
replace a by X
return to leftmost symbol
replace b by X (halting with N if no b remains)
return to leftmost symbol
replace c by X (halting with N if no c remains)
return to leftmost symbol
end
halt with Y if no b or c remains; otherwise halt with N
States s₀ and s₁ in Figure 11.4 check the while condition, and states s₂ through s₆ perform the body of the loop. On each iteration, beginning at the leftmost symbol, state s₀ moves the tape head right, checking for symbols that have not been replaced by X. If it reaches the end of the word (that is, if it scans a blank), the as, bs, and cs all matched, and it halts, printing Y. If b or c is found, state s₁ searches for as; if the end of the string is reached without finding a corresponding a, the machine halts with N, since there were an insufficient number of as. From either s₀ or s₁, control passes to s₂ when an a is scanned, and that a is replaced by X. State s₂, like s₄ and s₆, returns the tape head to the leftmost character. This is done by scanning left until a blank is found and then moving right as control is passed on to the next state. State s₃ searches for b, halting with N if none is found. The first b encountered is otherwise replaced by X, and the Turing machine enters s₄, which then passes control on to s₅ after returning to the leftmost symbol. State s₅ operates much like s₃, searching for c this time, and s₆ returns the tape head to the extreme left if the previous a and b have been matched with c. The process then repeats from s₀.
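The crossing-off strategy is easy to state apart from the machinery; the short Python rendering below (ours, for illustration only — the Turing machine must of course realize each step with sweeps of the tape head) mirrors the pseudocode above.

def equal_abc(word):
    """Cross-off check: 'Y' iff word has equally many as, bs, and cs."""
    tape = list(word)
    def cross_off(letter):
        for i, sym in enumerate(tape):
            if sym == letter:
                tape[i] = 'X'          # replace the first such letter by X
                return True
        return False                   # no unmatched occurrence remains
    while 'a' in tape:                 # the while condition (states s0, s1)
        cross_off('a')
        if not (cross_off('b') and cross_off('c')):
            return 'N'                 # an a could not be matched
    return 'Y' if not ('b' in tape or 'c' in tape) else 'N'

print(equal_abc('babcca'))   # Y
print(equal_abc('ac'))       # N (cf. Example 11.3)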
To see exactly how the machine operates, it is useful to step through the
computation for an input string such as babcca. To do this, conventions to designate
the status of the device are quite helpful. Like the stack in a PDA, the tape contents
may change as transitions occur, and the notation for the configuration of a Turing
machine must reflect those changes. Steps in a computation will be represented
according to the following conventions.
∇ Definition 11.2. Let M = <Σ, Γ, S, s₀, δ> be a Turing machine that is operating on a tape containing ...###αbβ###..., currently in state t with the tape head scanning the b, where α, β ∈ (Σ ∪ Γ)*, α contains no leading blanks, and β has no trailing blanks. This configuration will be represented by αtbβ.
γ ⊢ ψ will be taken to mean that the configuration denoted by ψ is reached in one transition from γ. The symbol ⊢* will denote the reflexive and transitive closure of ⊢.
Δ
That is, the symbol representing the state will be embedded within the string, just to the left of the symbol being scanned. If δ(t, b) = (s, R), then αtbβ ⊢ αbsβ. The new placement of the state label within the string indicates that the tape head has indeed moved right one symbol. The condition S ∩ (Σ ∪ Γ) = ∅ ensures that there is no confusion as to which symbol in the configuration representation denotes the state. As with PDAs, γ ⊢* ψ means that γ produces ψ in zero or more transitions. Note that the leading and trailing blanks are not represented, but α and β may contain blanks. Indeed, b may be a blank. The representation ac###t# indicates that the tape head has moved past the word ac and is scanning the fourth blank to the right of the word (α = ac###, b = #, β = λ). At the other extreme, t##ac shows the tape head two cells to the left of the word (α = λ, b = #, β = #ac). A totally blank tape is represented by t#.
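These conventions translate directly into code. The helper below is a sketch of ours (continuing the dictionary-based tape of the earlier simulator) that renders a machine's status in exactly this form; the three test cases are the configurations just discussed.

BLANK = '#'

def config(tape, state, head):
    """Render a configuration per Definition 11.2: the state label sits
    just left of the scanned cell; outer blanks are stripped."""
    cells = set(tape) | {head}
    lo, hi = min(cells), max(cells)
    while lo < head and tape.get(lo, BLANK) == BLANK:
        lo += 1                        # drop leading blanks left of alpha
    while hi > head and tape.get(hi, BLANK) == BLANK:
        hi -= 1                        # drop trailing blanks right of beta
    out = []
    for i in range(lo, hi + 1):
        if i == head:
            out.append(state)          # embed the state before the scanned cell
        out.append(tape.get(i, BLANK))
    return ''.join(out)

print(config({0: 'a', 1: 'c'}, 't', 5))   # ac###t#
print(config({2: 'a', 3: 'c'}, 't', 0))   # t##ac
print(config({}, 't', 0))                 # t#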
∇ Definition 11.3. For a Turing machine M = <Σ, Γ, S, s₀, δ>, the language accepted by M, denoted by L(M), is L(M) = {x ∈ Σ* | s₀x ⊢* xhY}. A language accepted by a Turing machine is called a Turing-acceptable language.
Δ
It is generally convenient to assume that the special symbol Y is not part of the
input alphabet. Note that words can be rejected if the machine does not print a Y or
if the machine never halts.
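In terms of the run() helper sketched after Example 11.1, this strict criterion is a one-line test (again our own encoding, not part of the formal development): the surviving tape contents must be the unharmed word followed by Y.

def accepts(delta, word):
    """Definition 11.3 acceptance: s0 x |-* x h Y, i.e. the nonblank tape
    at the halt is the untouched word followed immediately by Y."""
    return run(delta, word) == word + 'Y'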
Several reasonable definitions of acceptance can be applied to Turing machines. One of the most common specifies that the language accepted by M is the set of all words for which M simply halts, irrespective of what the final tape contents are. It might be expected that this more robust definition of acceptance would lead to more (or at least different) languages being recognized. However, this definition turns out to yield a device with the same cognitive power as specified by Definition 11.3, as indicated below. More precisely, let us define
L₁(M) = {x ∈ Σ* | M eventually halts when started on a tape containing x}
L₂(M) = {x ∈ Σ* | s₀x ⊢* αhY for some α ∈ (Σ ∪ Γ)*}
L₃(M) = {x ∈ Σ* | M halts with at least one Y on its tape when started on x}
EXAMPLE 11.3
Consider again the machine M in Example 11.2 and the input string babcca. By the strict definition of acceptance given in Definition 11.3, L(M) = {λ}, since λ is the only word that does not get destroyed by M. Using the looser criteria for acceptance yields a more interesting language. The following steps show that s₀babcca ⊢* XXXXXXhY.
s₀babcca ⊢ bs₁abcca ⊢ bs₂Xbcca ⊢ s₂bXbcca ⊢
s₂#bXbcca ⊢ s₃bXbcca ⊢ s₄XXbcca ⊢ s₄#XXbcca ⊢
s₅XXbcca ⊢ Xs₅Xbcca ⊢ XXs₅bcca ⊢ XXbs₅cca ⊢
XXbs₆Xca ⊢ XXs₆bXca ⊢ Xs₆XbXca ⊢ s₆XXbXca ⊢
s₆#XXbXca ⊢ s₀XXbXca ⊢ Xs₀XbXca ⊢ XXs₀bXca ⊢
XXbs₁Xca ⊢* s₂#XXbXcX ⊢*
XXXXs₅cX ⊢* s₆#XXXXXX ⊢*
XXXXXXs₀# ⊢ XXXXXXhY
The string babcca is therefore accepted. The word ac is rejected, since s₀ac ⊢* XchN. Further analysis shows that L₃(M) is exactly {x ∈ {a, b, c}* | |x|_a = |x|_b = |x|_c}. Since the only place Y is printed is at the end of the word on the tape, L₃(M) = L₂(M). Every word eventually causes M to halt with either Y or N on the tape, and so L₁(M) = Σ*.
EXAMPLE 11.4
The composite Turing machine shown in Figure 11.5 employs several submachines
and is based on the parenthesis checker included as a sample in the Turing's World
software. The machine will search for correctly matched parentheses, restoring the
original string and printing Y if the string is syntactically correct, and leaving a $ to
mark the offending position if the string has mismatched parentheses. Asterisks are
recorded to the left of the string as left parentheses are found, and these are erased
as they are matched with right parentheses.
Figure 11.5a shows the main architecture of the Turing machine. The square nodes represent the submachines illustrated in Figures 11.5b and 11.5c. When s₀ encounters a left parenthesis, it marks the occurrence with $ and transfers control to the submachine S₁. S₁ moves the read head to the left end of the string and deposits one * there. The cells to the left of the original string serve as a scratch area; the asterisks record the number of unmatched left parentheses encountered thus far. Submachine S₁ then scans right until the $ is found; it then restores the original left parenthesis. At this point, no further internal moves can be made in S₁, and the arrow leaving s₁₂ indicates that control should be returned to the parent automaton.
The transition leaving the square S₁ node in Figure 11.5a now applies, the tape head moves to the right of the left parenthesis that was just processed by S₁, and control is returned to s₀. s₀ continues to move right past the symbols a and b, uses S₁ to process subsequent left parentheses, and transfers control to the submachine S₂ whenever a right parenthesis is encountered.
Submachine S₂ attempts to match a right parenthesis with a previous left parenthesis. As control was passed to S₂, the right parenthesis was replaced by $ so that this spot on the tape can be identified later. The transitions in state s₂₀ move the tape head left until a blank cell is scanned. If the cell to the right of this blank does not contain an asterisk, s₂₁ has no moves and control is passed back to the parent Turing machine, which will enter S₄ and move right past all the symbols in the word, printing N as it halts. The absence of the asterisk implies that no previous matching left parenthesis had been found, so halting with N is the appropriate action.
If an asterisk had been found, s₂₁ would have replaced it with a blank, and would then have no further moves, and the return arrow would be followed. The blank that is now under the tape head will cause the parent automaton to pass control to S₃, which will move right to the $, which is then restored to ). Control returns to s₀ as the tape head moves past this parenthesis.
The start state continues checking the remainder of the word in this fashion. When the end of the word is reached, S₆ is used to examine the left end of the string;
[Figure 11.5(b): a submachine of the parenthesis checker, with transitions a/R, b/R, (/R, )/R, $/R]
remaining asterisks indicate unmatched left parentheses and will yield N as the machine halts from S₈. If S₆ does not encounter *, the Turing machine halts with Y and accepts the string from S₇.
As more complex examples are considered, one may begin to suspect that any
programming assignment could be carried out on a Turing machine. While it would
be truly unwise to try to make a living selling computers with this architecture, these
devices are generally regarded to be as powerful as any general-purpose computer.
That is, if an algorithm for solving a class of problems can be carried out on a
computer, then there should be a Turing machine that can produce identical output
for each instance of a problem in that class.
The language {x ∈ {a, b, c}* | |x|_a = |x|_b = |x|_c} is not context free, so it cannot be recognized by a PDA. Turing machines can therefore accept some languages that PDAs cannot, and we will see that they can recognize every context-free language.
We began with DFAs, which were then extended to the more powerful PDAs,
which have now been eclipsed by the Turing machine construct. Each of these
classes of automata has been substantially more general than the previous class. If
this text were longer, one might wonder when the next class of superior machines
would be introduced. Barring the application of magic or divine intuition, there
does not seem to be a "next class." That is, any machine that is constrained to
operate algorithmically by a well-defined set of rules appears to have no more
computing power than do Turing machines.
This constraint, "to behave in an algorithmic fashion," is an intuitive notion
without an obvious exact formal expression. Indeed, "behaving like a Turing ma-
chine" is generally regarded as the best way to express this notion! A discussion of
how Turing machines came to be viewed in this manner is perhaps in order. An
excellent in-depth treatment of their history can be found in [BARW].
At the beginning of the twentieth century, mathematicians were searching for a universal algorithm that could be applied to mechanically prove any well-stated mathematical formula. This naturally focused attention on the manipulation of symbols. In 1931, Gödel showed that algorithms of this sort cannot exist. Since this implied that there were classes of problems that could not have an algorithmic solution, attention turned to characterizing those problems that could be effectively "computed." In 1936, Turing introduced his formal device for symbol manipulation and suggested that the definition of an algorithm be based on the Turing machine. He also outlined the halting problem (discussed later), which demonstrated a problem to which no Turing machine could possibly provide the correct answer in all instances. The search for a better, perhaps more powerful characterization of what constitutes an algorithm continued.
While it cannot be proved that it is impossible to find a better formalization that is truly more powerful, on the basis of the accumulating evidence, no one believes that a better formulation exists. For one thing, other attempts at formalization, including grammars, λ-calculus, μ-recursive functions, and Post systems, have all turned out to yield exactly the same computing power as Turing machines. Second, all attempts at "improving" the capabilities of Turing machines have not
expanded the class of languages that can be recognized. Some of these possible
improvements will be examined in the next section. We close this section by for-
malizing what Example 11.1 probably made clear: every DFA can be simulated by a
Turing machine.
This result actually follows trivially from the much stronger results presented
later. Not only is every type 3 language Turing acceptable, but every type 0 language
is Turing acceptable (as will be shown by Theorem 11.2). The above proof presents
the far more straightforward conversion available to type 3 languages and illustrates
the flavor of the inductive arguments needed in other proofs concerning Turing
machines. By using this conversion, the Turing's World software can be employed to
interactively build and test deterministic finite automata on a Macintosh.
EXAMPLE 11.5
Consider the DFA T shown in Figure 11.6, which recognizes all words of even length
over {a, b}. The corresponding Turing machine is illustrated in Example 11.1 (see
Figure 11.2).
11.2 VARIANTS OF TURING MACHINES

There are several ways in which the basic definition of the Turing machine can be
modified. For example, Definition 11.1 disallows the tape head from both moving
and printing during a single transition. It should be clear that if such an effect were
desired at some point it could be effectively accomplished under the more restrictive
Definition 11.1 by adding a state to the finite-state control. The desired symbol
could be printed as control is transferred to the new state. The transition out of the
new state would then move the tape head in the appropriate fashion, thus accom-
plishing in two steps what a "fancier" automaton might do in one step. While this
modification might be convenient, the ability of Definition 11.1 style machines to
simulate this behavior makes it clear that such modified automata are no more
powerful than those given by Definition 11.1. That is, every such modified auto-
maton has an equivalent Turing machine.
It is also possible to examine machines that are more restrictive than Defini-
tion 11.1. If the machine were constrained to write on only a fixed, finite amount of
the tape, this would seriously limit the types of languages that could be recognized.
In fact, only the type 3 languages can be accepted by such machines. Linear
bounded automata, which are Turing machines constrained to write only on the
portion of the tape containing the original input word, are also less powerful than
unrestricted Turing machines and are discussed in a later section. Having an un-
bounded area in which to write is therefore an important factor in the cognitive
power of Turing machines, but it can be shown that the tape need not be unbounded
in both directions. That is, Turing machines that cannot move left of the cell the
tape head originally scanned can perform any calculation that can be carried out by
the less-restrictive machines given by Definition 11.1 (see the exercises).
In deciding whether a Turing machine can simulate the modified machines
suggested below, it is important to remember that the auxiliary alphabet r can be
expanded as necessary, as long as it remains finite. In particular, it is possible to
expand the information content of each cell by adding a second "track" to the tape.
For example, we may wish to add check marks to certain designated cells, as shown
in Figure 11.7. The lower track would contain the original symbols, and the upper
track may or may not have a check mark. This can be accomplished by doubling the
[Figure 11.7 A Turing machine with a two-track tape]
combined size of the alphabets Σ and Γ to include all symbols without check marks and the same symbols with check marks. The new symbols can be thought of as ordered pairs, and erasing a check mark then amounts to rewriting a pair such as (a, ✓) with (a, #). A scheme such as this could be used to modify the automaton in Example 11.2. Rather than replacing designated symbols with X, a check could instead be placed over the original symbol. Just prior to acceptance, each check mark could be erased, leaving the original string to the left of the Y (see Example 11.11).
The foregoing discussion justifies that a Turing machine with a tape head
capable of reading two tracks can be simulated by a Definition 11.1 style Turing
machine; indeed, it is a Turing machine with a slightly more complex alphabet.
When convenient, then, we may assume that we have a Turing machine with two
tracks. A similar argument shows that, for any finite number k, a k-track machine
has an equivalent one-track Turing machine with an expanded alphabet. The sym-
bols on the other tracks can be more varied than just ✓ and #; any finite number of
symbols may appear on any of the tracks. Indeed, a Turing machine may initially
make a copy of the input string on another track to use in a later calculation and/or
to restore the tape to its original form. The ability to preserve the input word in this
manner illustrates why each language L = L₃(A) for some Turing machine A must be Turing acceptable; that is, L = L₃(A) implies that there is a multitrack Turing machine M for which L = L(M).
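The pair encoding is worth seeing concretely. In the sketch below (our illustration), a two-track cell is an ordered pair whose second component carries the optional check mark, and "erasing" a mark is just an overprint of the pair, exactly as described above.

CHECKED, UNCHECKED = 'v', '#'    # second-track contents: check mark or blank

def check(tape, i):
    sym, _ = tape[i]
    tape[i] = (sym, CHECKED)     # overprint the pair: place a check over sym

def uncheck(tape, i):
    sym, _ = tape[i]
    tape[i] = (sym, UNCHECKED)   # rewrite (a, v) as (a, #), erasing the mark

tape = {i: (s, UNCHECKED) for i, s in enumerate('babcca')}
check(tape, 1)                   # mark the a, as the modified Example 11.2 would
print(tape[1])                   # ('a', 'v')
uncheck(tape, 1)
print(tape[1])                   # ('a', '#')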
EXAMPLE 11.6
Conceptualizing the tape as being divided into tracks simplifies many of the argu-
ments concerning modification of the basic Turing machine design. For example, a
modified Turing machine might have two heads that move independently up and
down a single tape, both scanning symbols to determine what transition should be
made and both capable of moving in either direction (or remaining stationary and
overwriting the current cell) as each transition is carried out. Such machines would
be handy for recognizing certain languages. The set {aⁿbⁿ | n ≥ 1} can be easily recog-
nized by such a machine. If both heads started at the left of the word, one head
might first scan right to the first b encountered. The two heads could then begin
moving in unison to the right, comparing symbols as they progressed, until the
leading head encounters a blank and/or the trailing head scans its first b. If these two
events occurred on the same move, the word would be accepted. A single head
Turing machine would have to travel back and forth across the word several times to
ascertain if it contained the same number of as as bs. The ease with which the
two-headed mutation accomplished the same task might make one wonder whether
such a modified machine can recognize any languages that the standard Turing machine cannot.
To justify that a two-headed Turing machine is no more powerful than the type described by Definition 11.1, we must show that any two-headed machine can be simulated by a standard Turing machine; the multitrack construction again suffices, with the extra tracks used to mark the current positions of the two heads.
[Figure 11.8 Emulating a two-headed Turing machine with a three-track tape]
EXAMPLE 11.7
Consider now a device employing several independent tapes with one head for each
tape, as depicted in Figure 11.9. If we think of the tapes as stationary and the heads
mobile, it is easy to see that we could simply glue the tapes together into one thick
tape with several tracks, as indicated in Figure 11.10. The multiple heads would now
scan an entire column of cells, but a head would ignore the information on all but
the track for which it was responsible. In this fashion, a multitape Turing machine
can be simulated by a multihead Turing machine, which can in turn be simulated by
a standard Turing machine. Thus, multitape machines are no more powerful than
the machines considered earlier.
[Figure 11.9 A three-tape Turing machine]
[Figure 11.10 Emulating a three-tape Turing machine with a single three-track tape]
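The gluing described in Example 11.7 amounts to nothing more than forming columns, as a few lines of Python (an illustration of ours) make plain.

from itertools import zip_longest

def glue(*tapes, blank='#'):
    """Stack k tapes into one multitrack tape whose cells are k-tuples."""
    return list(zip_longest(*tapes, fillvalue=blank))

thick = glue('babcca', 'ba', 'abc')   # three tapes, one thick tape
print(thick[0])   # ('b', 'b', 'a') -- one column, scanned as a single symbol
print(thick[3])   # ('c', '#', '#') -- shorter tapes padded with blanks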
EXAMPLE 11.8
simulate enough of that sequence to discover that fact and accept the word. In this
way, we avoid getting trapped in a dead end with no opportunity to pursue the
alternatives.
Implementing the above scheme will produce a deterministic Turing machine
that is equivalent to the original nondeterministic machine. It remains to be shown
that the Turing machine can indeed start over as necessary, and that the possible
move sequences can be enumerated in a reasonable fashion so that they can be
pursued according to the pattern outlined above. A three-tape (deterministic) Tur-
ing machine will suffice. The first tape will keep an inviolate copy of the input string,
which will be copied onto the second tape each time a computation begins anew. A
specific sequence of steps will be carried out on this second scratch tape, after which
the presence of Y will be determined. The third tape is responsible for keeping track
of the iterations and generating the appropriate sequences to be employed. Enum-
erating the sequences is much like the problem of generating words over some
alphabet in lexicographic order (see the exercises). Methods for generating the
"directing sequences" can be found in both [LEWI] and [HOPe]. These references
also propose a more efficient approach to the whole simulation, which is based on
keeping track of the sets of possible configurations, much as was done in Theorem
4.5 for nondeterministic finite automata.
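A crude but faithful version of this enumerate-and-retry scheme can be written out directly. The sketch below is our own rendering — it mirrors the idea rather than the three-tape construction itself — in which each directing sequence is a tuple of choice indices, tried in order of increasing length on a fresh copy of the input.

from itertools import count, product

BLANK = '#'

def nd_accepts(ndelta, word, start='s0'):
    """Simulate a nondeterministic TM whose ndelta maps (state, symbol)
    to a list of (state, action) choices. May loop forever, as a TM may."""
    fanout = max(len(v) for v in ndelta.values())
    for n in count(0):                        # directing sequences of length n
        for seq in product(range(fanout), repeat=n):
            tape = dict(enumerate(word))      # restart on a fresh input copy
            state, head = start, 0
            for choice in seq:                # follow one directing sequence
                if state == 'h':
                    break
                options = ndelta.get((state, tape.get(head, BLANK)), [])
                if choice >= len(options):
                    break                     # sequence not applicable; retry
                state, action = options[choice]
                if action == 'R':   head += 1
                elif action == 'L': head -= 1
                else:               tape[head] = action
            if state == 'h' and 'Y' in tape.values():
                return True                   # some move sequence printed Y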
Thus, neither nondeterminism nor any of the enhancements considered above
improved the computational power of these devices. As mentioned previously, no
one has yet been able to find any mechanical enhancement that does yield a device
that can recognize a language that is not Turing acceptable. Attempts at producing completely different formal systems have fared no better, and there is little cause to
believe that such systems exist. We now turn to characterizing what appears to be
the largest class of algorithmically definable languages. In the next section, we will
see that the Turing-acceptable languages are exactly the type 0 languages intro-
duced in Chapter 8.
∇ Definition 11.4. For a given alphabet Σ, let 𝒯_Σ be the collection of all Turing-acceptable languages, and let 𝔄_Σ be the collection of all type 0 languages.
Δ
The freedom to use several tapes and nondeterminism makes it easier to explore the capabilities of Turing machines and relate 𝒯_Σ to the previous classes of languages encountered. It is now trivial to justify that every PDA can be simulated
by a nondeterministic Turing machine with two tapes. The first tape will hold the
input, which will be scanned by the first tape head, which will only have to move
right or, at worst, remain stationary and reprint the same character it was scanning.
The second tape will function as the stack, with strings pushed or symbols popped in
correspondence with what takes place in the PDA. Since a Turing machine can only
print one symbol at a time, some new states may be needed in the finite-state
control to simulate pushing an entire string, but the translation process is quite
direct.
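The only delicate point is the stack tape. The fragment below (our illustration; the names are ours) shows how a single PDA move that pops a symbol and pushes a string dissolves into one-symbol overprints and head moves on the second tape — exactly the work performed by the extra states of the finite-state control.

def pda_move(tape2, top, push):
    """Replace the top stack symbol on tape 2 by the string push,
    one overprint or head move at a time."""
    tape2[top] = '#'                 # pop: overprint the old top with blank
    top -= 1
    for sym in push:                 # push: print one symbol, step right
        top += 1
        tape2[top] = sym
    return top                       # new position of the stack top

stack = {0: 'Z'}                     # bottom marker Z on tape 2
top = pda_move(stack, 0, 'ZAB')      # simulate: pop Z, push the string ZAB
print([stack[i] for i in range(top + 1)])   # ['Z', 'A', 'B']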
∇ Lemma 11.1. Let Σ be an alphabet. Then 𝒫_Σ ⊂ 𝒯_Σ. That is, every context-free language is Turing acceptable, and the containment is proper.
Proof. Containment follows from the formalization of the above discussion (see the exercises). Example 11.3 presented a language over {a, b, c} that is Turing acceptable but not context free. While the distinction between the regular and the context-free languages disappeared for singleton alphabets, proper containment remains between 𝒫_{a} and 𝒯_{a}, as shown by languages such as {aⁿ | n is a perfect square}.
Δ
In the next section, an even stronger result is discussed, which shows that the
class of Turing-acceptable languages includes much more than just the context-free
languages. Lemma 11.1 is actually an immediate corollary of Theorem 11.2. The
next section also explores the formal relationship between Turing machines and
context-sensitive languages.
11.3 TURING MACHINES, LBAs, AND GRAMMARS

The previous sections have shown that the class of Turing-acceptable languages
properly contains the type 2 languages. We now explore how the type 0 and type 1
languages relate to Turing machines. Since the preceding discussions mentioned
that no formal systems have been found that surpass Turing machines, one would
expect that every language generated by a grammar can be recognized by a Turing
machine. This is indeed the case, as indicated by the following theorem.
∇ Theorem 11.2. Let Σ be an alphabet. Then 𝔄_Σ ⊆ 𝒯_Σ. That is, every type 0 language is Turing acceptable.
Proof. We justify that, given any type 0 grammar G = <Σ, Γ, S, P>, there must be a Turing machine T_G that is equivalent to G. As with the suggested conversion of a PDA to a Turing machine, T_G will employ two tapes and nondeterminism. The first tape again holds the input, which will be compared to the sentential
form generated on the second tape. The second tape begins with only the start
symbol on an otherwise blank tape. The finite-state control is responsible for
nondeterministically guessing the proper sequence of productions to apply, and
with each guess, the second tape is modified to reflect the new sentential form. If at
some point the sentential form agrees with the contents of the first tape, the
machine prints Y and halts. A guess will consist of choosing both an arbitrary
position within the current sentential form and a particular production to attempt to
substitute for the substring beginning at that position. Only words that can be
generated by the grammar will have a sequence of moves that produces Y, and no
word that cannot be generated will be accepted. Thus, the new Turing machine is
equivalent to G.
Δ
EXAMPLE 11.9
Consider the context-sensitive grammar G = <{a, b, c}, {Z, S, A, B, C}, Z, P>, where P contains the productions
1. Z → λ
2. Z → S
3. S → SABC
4. S → ABC
5. AB → BA
6. BA → AB
7. CB → BC
8. BC → CB
9. CA → AC
10. AC → CA
11. A → a
12. B → b
13. C → c
Suppose, for example, that the machine's guesses had produced the sentential form ABC on the second tape and that the next guess selected rule 6 at position 2. This would lead to a failed attempt, since it corresponds to Z ⇒ S ⇒ ABC, and the substring BC beginning at position 2 does not match BA, the left side of rule 6. On the other hand, there is a pattern of guesses that would cause the following sequence of symbols to appear on the second tape:
Z ⇒ S ⇒ ABC ⇒ BAC ⇒ BCA ⇒ BcA ⇒ Bca ⇒ bca
This would lead to a favorable comparison if bca was the word on the input tape.
Note that the Turing machine may have to handle shifting over existing symbols on
the scratch tape to accommodate increases in the size of the sentential form. Since
type 0 grammars allow length-reducing productions, the machine may also be
required to shrink the sentential form when a string of symbols is replaced by a
smaller string.
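The guess-and-check recognizer in the proof of Theorem 11.2 can be imitated deterministically by searching breadth first over sentential forms. The Python sketch below is our own rendering, with a search budget standing in for unbounded tape; it is shown finding the derivation of bca in the grammar of this example.

from collections import deque

def derives(productions, start, word, max_forms=200_000):
    """Semi-decide membership for a type 0 grammar by breadth-first search
    over sentential forms; like the Turing machine, it need not terminate
    (here: it exhausts its budget) on words outside the language."""
    seen, queue = {start}, deque([start])
    while queue and len(seen) < max_forms:
        form = queue.popleft()
        if form == word:
            return True
        for lhs, rhs in productions:        # guess a production ...
            at = form.find(lhs)
            while at != -1:                 # ... and a position to apply it
                new = form[:at] + rhs + form[at + len(lhs):]
                if new not in seen:
                    seen.add(new)
                    queue.append(new)
                at = form.find(lhs, at + 1)
    return None                             # budget exhausted: no verdict

P = [('Z', ''), ('Z', 'S'), ('S', 'SABC'), ('S', 'ABC'),
     ('AB', 'BA'), ('BA', 'AB'), ('CB', 'BC'), ('BC', 'CB'),
     ('CA', 'AC'), ('AC', 'CA'), ('A', 'a'), ('B', 'b'), ('C', 'c')]
print(derives(P, 'Z', 'bca'))               # True, as in the derivation above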
A rather nice feature of type 1 languages is that the length of the sentential form can never decrease (except perhaps for the application of the initial production Z → λ), and hence sentential forms that become longer than the desired word are known to be hopeless. All context-sensitive (that is, type 1) languages can therefore be recognized by a Turing machine that uses an amount of tape proportional to the length of the input string, as outlined below.
That is, the automaton cannot move left of the symbol < nor overwrite it. The LBA likewise cannot move right of the symbol >, and it can only overwrite it with Y or N just prior to halting. The symbols #, L, R, Y, and N retain their former meaning, although # can be dropped from Γ since it will never be scanned. As implied by the following definition, the special markers < and > are intended to delimit the input string, and Definition 11.5 ensures that the automaton cannot move past these limits. As has been seen, the use of several tracks can easily multiply the amount of information that can be stored in a fixed amount of space, and thus the restriction is essentially that the amount of available tape is a linear function of the length of the input string. In practice, any Turing machine variant for which each tape head is constrained to operate within an area that is a multiple of the length of the input string is called a linear bounded automaton.
∇ Definition 11.6. For a linear bounded automaton M = <Σ, Γ, S, s₀, δ>, the language accepted by M, denoted by L(M), is L(M) = {x ∈ Σ* | <s₀x> ⊢* <xhY}. A language accepted by a linear bounded automaton is called a linear bounded language (LBL).
Δ
Note that while the endmarkers must enclose the string x, it is the word x (rather than <x>) that is considered to belong to L(M). As before, other criteria
for acceptance are equivalent to Definition 11.6. The set of all words for which a
LBA merely halts can be shown to be a LBL according to the above definition. The
following example illustrates a linear bounded automaton that is intended to
recognize all words that cause the machine to print Y at the end of the (obliterated)
word. Example 11.13 illustrates a general technique for restoring the input word,
producing an LBA that accepts according to Definition 11.6.
EXAMPLE 11.10
Consider the machine L shown in Figure 11.11 and the input string babcca. The
following steps show that <sobabcca> ~ <XXXXXXh Y.
<sobabcca> f- <bs1abcca> f- <bs2Xbcca> f- <s2bXbcca> f-
s2<bXbcca> f- <s3bXbcca> f- <s4XXbcca> f- s4<XXbcca> f-
<ssXXbcca> f- <XssXbcca> f- <XXssbcca> f- <XXbsscca> f-
<XXbs~ca> f- <XXs6bXca> f- <Xs6XbXca> f- <s6XXbXca> f-
s6<XXbXca> f- <soXXbXca> f- <XsoXbXca> f- <XXsobXca> f-
<XXbs1Xca> ~ s2<XXbXcX> ~
<XXs3bXcX> ~ S4<XXXXCX> ~
<XXXXsscX> ~ S6<XXXXXX> ~
<XXXXXXso> f-<XXXXXXhY
∇ Definition 11.7. For a given alphabet Σ, let ℒ_Σ be the collection of all linear bounded languages, and let 𝒞_Σ be the collection of all context-sensitive (type 1) languages.
Δ
The proof of Theorem 11.2 can be modified to show that all context-sensitive
languages can be recognized by linear bounded automata. Since context-sensitive
languages do not contain contracting productions, no sentential forms that are
longer than the desired word need be considered. Consequently, the two-tape
Turing machine in Theorem 11.2 can operate as a linear bounded automaton. The
first tape with the input word never changes and thus satisfies the boundary re-
striction, while the finite-state control can simply abort any computation on the
second tape that violates the length restriction. Just as Theorem 11.2 showed that 𝔄_Σ ⊆ 𝒯_Σ, we now have a relationship between another pair of cognitive and generative classes.
∇ Theorem 11.3. Let Σ be an alphabet. Then 𝒞_Σ ⊆ ℒ_Σ. That is, every type 1 language is a LBL.
Proof. The proof follows from the formalization of the above discussion (see the exercises).
Δ
We have argued that every type 0 grammar must have an equivalent Turing
machine, and it can conversely be shown that every Turing-acceptable language can
be generated by a type 0 grammar. To do this, it is most convenient to use the very
restrictive criteria for a Turing-acceptable language given in Definition 11.3, in
which the original input string is not destroyed. For Turing machines which behave
in this fashion, the descriptions of the device configurations bear a remarkable
resemblance to the derivations in a grammar.
EXAMPLE 11.11
Consider again the language {x ∈ {a, b, c}* | |x|_a = |x|_b = |x|_c}. As discussed in Example 11.3, the Turing machine in Figure 11.4 destroys the word originally on the input tape. Figure 11.12 depicts a slightly more complex Turing machine that restores the original word just prior to acceptance. It will (fortunately) not generally be necessary for our purposes to restore rejected words, since there are intricate languages for which this is not always possible. The modified quintuple is T = <{a, b, c}, {#, A, B, C, Y, N}, {s₀, s₁, s₂, s₃, s₄, s₅, s₆, s₇, s₈}, s₀, δ>, where δ is as indicated in the diagram in Figure 11.13. "Saving" the original input string is accomplished by replacing occurrences of the different letters by distinct symbols and restoring them later. The implementation reflects one of the first uses suggested for multiple-track machines: using the second track to check off input symbols. For legibility, an a with a check mark above it is denoted by A, while an a with no check
[Figure 11.12 The Turing machine discussed in Example 11.11]
mark is simply denoted by a. The additional states s₇ and s₈ essentially erase the check marks just before halting by replacing A with a, B with b, and C with c.
Consider again the input string babcca processed by the Turing machine in Example 11.3. It is also accepted by this Turing machine, because the following steps show that s₀babcca ⊢* babccahY. Note how closely the steps correspond with those in Example 11.3. The sequence below also illustrates how s₇ converts the string back to lowercase, after which s₈ returns the tape head to the right for acceptance.
s₀babcca ⊢ bs₁abcca ⊢ bs₂Abcca ⊢ s₂bAbcca ⊢
s₂#bAbcca ⊢ s₃bAbcca ⊢ s₄BAbcca ⊢ s₄#BAbcca ⊢
s₅BAbcca ⊢ Bs₅Abcca ⊢ BAs₅bcca ⊢ BAbs₅cca ⊢
BAbs₆Cca ⊢ BAs₆bCca ⊢ Bs₆AbCca ⊢ s₆BAbCca ⊢
s₆#BAbCca ⊢ s₀BAbCca ⊢ Bs₀AbCca ⊢ BAs₀bCca ⊢
BAbs₁Cca ⊢* s₂#BAbCcA ⊢*
BAs₃bCcA ⊢* s₄#BABCcA ⊢*
BABCs₅cA ⊢* s₆#BABCCA ⊢*
BABCCAs₀# ⊢ BABCCs₇A ⊢ BABCCs₇a ⊢ BABCs₇Ca ⊢
BABCs₇ca ⊢* s₇babcca ⊢ s₇#babcca ⊢
s₈babcca ⊢ bs₈abcca ⊢* babccas₈# ⊢
babccahY
If occurrences of the machine transition symbol ⊢ are replaced by the derivation symbol ⇒, the above sequence would look remarkably like a derivation in a type 0 grammar. Indeed, we would like to construct a grammar in which sentential forms like bs₁abcca could be derived from s₀babcca in one step. Since the machine changed configurations because of the transition rule δ(s₀, b) = (s₁, R), this transition should have a corresponding production of the form s₀b → bs₁. Each transition in the Turing machine will be responsible for similar productions.
Unfortunately, the correspondence between transition rules and productions is complicated by the fact that the tape head may occasionally scan blank cells, which must then be added to the sentential form. The special characters [ and ] will bracket the sentential form throughout this stage of the derivation and will indicate the current left and right limits of the tape head travel, respectively. Attempting to move left past the conceptual position of [ (or right past the position of ]) will result in the addition of a blank symbol to the sentential form.
To generate the words accepted by a Turing machine, our grammar will randomly generate a word over Σ, delimit it by brackets, and insert the symbol for the start state at the left edge. The rules derived from the transitions should then be able to transform a string such as [s₀babcca#] into [#babccahY]. Since only the letters in Σ will be considered terminal symbols, the symbols [, ], #, and Y are nonterminals, and the derivation will not yet be complete. To derive terminal strings for just the accepted words, the presence of Y will allow further productions to delete the remaining nonterminals.
∇ Definition 11.8. Given a Turing machine M = <Σ, Γ, S, s₀, δ>, the grammar corresponding to M, G_M, is given by G_M = <Σ, Γ ∪ S ∪ {Z, W, U, V, [, ]}, Z, P_M>, where P_M contains the following classes of productions:
1. Z → [W#] ∈ P_M
(∀a ∈ Σ)([W → [Wa ∈ P_M)
W → s₀ ∈ P_M
2. Each printing transition gives rise to a production rule as follows:
(∀s ∈ S)(∀t ∈ S ∪ {h})(∀a, b ∈ Σ ∪ Γ)(if δ(s, a) = (t, b), then sa → tb ∈ P_M)
Each move right gives rise to a production rule as follows:
(∀s, t ∈ S)(∀a ∈ Σ ∪ Γ)(if δ(s, a) = (t, R), then sa → at ∈ P_M)
If a = #, an additional production is needed:
(∀s, t ∈ S)(if δ(s, #) = (t, R), then s] → #t] ∈ P_M)
Each move left gives rise to a production rule as follows:
(∀s, t ∈ S)(∀a ∈ Σ ∪ Γ)
(if δ(s, a) = (t, L), then [sa → [t#a ∈ P_M ∧ (∀d ∈ Σ ∪ Γ)(dsa → tda ∈ P_M))
3. hY → U ∈ P_M
U# → U ∈ P_M
U] → V ∈ P_M
(∀a ∈ Σ)(aV → Va ∈ P_M)
#V → V ∈ P_M
[V → λ ∈ P_M
Δ
The rules in class 1 are intended to generate all words of the form [s₀x#], where x is an arbitrary member of Σ*. The remaining rules are defined in such a way that only those strings x that are recognized by M can successfully produce a terminal string. Note that once W is replaced by s₀, neither Z nor W can appear in a later sentential form. After s₀ is generated, the rules in class 2 may apply. It can be inductively argued that the derivations arising from the application of these rules directly reflect the changes in the configuration of the Turing machine (see Theorem 11.4).
None of the class 3 productions can be used until the point at which the halt state h appears immediately to the left of the symbol Y; that is, it is the presence of the substring hY that triggers the cleanup of the sentential form.
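Since every class of productions is driven mechanically by δ, the translation of Definition 11.8 can itself be programmed. The generator below is a sketch of ours, reusing the dictionary encoding of the earlier simulator and representing each state name as a multi-character string (so the sketch assumes state names do not accidentally arise as substrings of other symbols). Words derivable from Z with the resulting productions — for example, via the derives() search sketched earlier — are exactly the words the machine accepts.

def grammar_of(delta, sigma, gamma, start='s0', blank='#'):
    """Emit the productions P_M of Definition 11.8 as (lhs, rhs) pairs."""
    P = [('Z', '[W' + blank + ']'), ('W', start)]          # class 1
    P += [('[W', '[W' + a) for a in sigma]
    tape_syms = set(sigma) | set(gamma)
    for (s, a), (t, act) in delta.items():                 # class 2
        if act == 'R':
            P.append((s + a, a + t))
            if a == blank:                                 # moving right off the edge
                P.append((s + ']', blank + t + ']'))
        elif act == 'L':
            P.append(('[' + s + a, '[' + t + blank + a))
            P += [(d + s + a, t + d + a) for d in tape_syms]
        else:                                              # printing transition
            P.append((s + a, t + act))
    P += [('hY', 'U'), ('U' + blank, 'U'), ('U]', 'V')]    # class 3
    P += [(a + 'V', 'V' + a) for a in sigma]
    P += [(blank + 'V', 'V'), ('[V', '')]
    return P

# For the machine of Example 11.1:
print(len(grammar_of(delta_T, 'ab', '#YN')))   # number of productions emitted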
EXAMPLE 11.12
Consider the Turing machine T in Figure 11.13 and the corresponding grammar G_T. Among the many possible derivations involving the class 1 productions is
Z ⇒ [W#] ⇒ [Wa#] ⇒ [Wca#] ⇒ [Wcca#] ⇒ [Wbcca#] ⇒ [Wabcca#]
⇒ [Wbabcca#] ⇒ [s₀babcca#]
Only class 2 productions apply at this point, and there is exactly one production applicable at each step in the following sequence.
[s₀babcca#] ⇒ [bs₁abcca#] ⇒ [bs₂Abcca#] ⇒ [s₂bAbcca#] ⇒
[s₂#bAbcca#] ⇒ [#s₃bAbcca#] ⇒ [#s₄BAbcca#] ⇒ [s₄#BAbcca#] ⇒
[#s₅BAbcca#] ⇒ [#Bs₅Abcca#] ⇒ [#BAs₅bcca#] ⇒ [#BAbs₅cca#] ⇒
[#BAbs₆Cca#] ⇒ [#BAs₆bCca#] ⇒ [#Bs₆AbCca#] ⇒ [#s₆BAbCca#] ⇒
[s₆#BAbCca#] ⇒ [#s₀BAbCca#] ⇒ [#Bs₀AbCca#] ⇒ [#BAs₀bCca#] ⇒
[#BAbs₁Cca#] ⇒* [s₂#BAbCcA#] ⇒*
[#BAs₃bCcA#] ⇒* [s₄#BABCcA#] ⇒*
[#BABCs₅cA#] ⇒* [s₆#BABCCA#] ⇒*
[#BABCCAs₀#] ⇒ [#BABCCs₇A#] ⇒ [#BABCCs₇a#] ⇒ [#BABCs₇Ca#] ⇒
[#BABCs₇ca#] ⇒* [#s₇babcca#] ⇒ [s₇#babcca#] ⇒
[#s₈babcca#] ⇒ [#bs₈abcca#] ⇒* [#babccas₈#] ⇒
[#babccahY]
In Turing machines where the tape head travels further afield, there may be many more blanks enclosed within the brackets. At this point, the class 3 productions take over to tidy up the string:
[#babccahY] ⇒ [#babccaU] ⇒ [#babccaV ⇒ [#babccVa
⇒ [#babcVca ⇒ [#babVcca ⇒ [#baVbcca ⇒ [#bVabcca
⇒ [#Vbabcca ⇒ [Vbabcca ⇒ babcca
As expected, babcca ∈ L(G_T).
∇ Theorem 11.4. Let Σ be an alphabet. Then 𝒯_Σ ⊆ 𝔄_Σ. That is, every Turing-acceptable language can be generated by a type 0 grammar.
Proof. Let M be a Turing machine M = <Σ, Γ, S, s₀, δ>, and let
L(M) = {x ∈ Σ* | s₀x ⊢* xhY},
as specified in the most restrictive sense of a Turing-acceptable language (Definition 11.3). Consider the grammar G_M corresponding to M, as given in Definition 11.8. The previous discussion of G_M provided a general sense of the way in which the productions could be used and justified that they could not be combined in unexpected ways. A rigorous proof requires an explicit formal statement of the general properties that have been discussed. A trivial induction on the length of x shows that, by using just the productions in class 1,
(∀x ∈ Σ*)(Z ⇒* [s₀x#])
Another induction argument establishes the correspondence between sequences of applications of the class 2 productions and sequences of moves in the Turing machine. Specifically, by inducting on the number of transitions, it can be shown that
(∀s, t ∈ S ∪ {h})(∀α, β, γ, ω ∈ (Σ ∪ Γ)*)
(αsβ ⊢* γtω iff (∃i, j, m, n ∈ ℕ)([#ⁱαsβ#ʲ] ⇒* [#ᵐγtω#ⁿ]))
The actual number of padded blanks is related to the extent of the tape head movement, but this is not important for our purposes. The essential observation is that a move sequence in M is related to a derivation sequence in G_M, with perhaps some change in the number of blanks at either end. The above statement was stated in full generality to facilitate the induction proof (see the exercises). We need apply it only in the very limited sense stated below.
(∀x ∈ Σ*)(s₀x ⊢* xhY iff (∃m, n ∈ ℕ)([s₀x#] ⇒* [#ᵐxhY#ⁿ]))
Observe that the productions in class 3 cannot be used unless hY appears in the sentential form after a finite number of steps. As discussed earlier, the presence of hY triggers the class 3 productions, which remove all the remaining nonterminals. Thus, L(M) = L(G_M), as was to be shown.
Δ
Since every Turing machine has an equivalent type 0 grammar and every type 0
grammar generates a Turing-acceptable language, we have two ways of representing
the same class of languages.
As will be seen in Chapter 12, the linear bounded languages are a distinctly smaller class than the Turing-acceptable languages. Theorem 11.3 showed that 𝒞_Σ ⊆ ℒ_Σ, and a technique similar to that used in Theorem 11.4 will show that ℒ_Σ ⊆ 𝒞_Σ. That is, we can show that every linear bounded automaton has an equivalent context-sensitive grammar. Note that the class 1 and 2 productions in Definition 11.8 contained no contracting productions; it was only when the class 3 productions were applied that the sentential form might shrink. When dealing with linear bounded automata, the tape head is restricted to the portion of the tape containing the input string, so there will be no extraneous blanks to delete. The input word on the tape of a linear bounded automaton is bracketed by the distinct symbols < and >, which might be used in the corresponding grammar in a fashion similar to [ and ]. These would be immovable in the sense that no new blanks would be inserted between them and the rest of the bracketed word. Unfortunately, in Definition 11.8 the delimiters [ and ] must eventually disappear, shortening the sentential form. No such shrinking can occur if we hope to produce a context-sensitive grammar.
To overcome this difficulty, it is useful to imagine a three-track tape with the input word on the middle track and a delimiting mark on the upper track of the tape above the first symbol of the word. Another such mark will occur on the lower track below the last character of the input string. These markers will serve as guides to prevent
the tape head from moving past the limits of the input word. For example, if the
linear bounded automaton contained the word <babcca> on its input tape, the tape
for the corresponding three-track automaton would be as pictured in Figure 11.14a.
If the word were accepted, the tape would eventually reach the configuration shown in Figure 11.14b as it halted, printing Y on the lower track. It is a relatively simple
task to convert a linear bounded automaton into a three-track automaton, where
the tape head never moves left of the tape cell with the - in the upper track, and
never moves right of the cell with the - in the lower track (see the exercises). We
will refer to such an automaton as a strict linear bounded automaton. The definitions
[Figure 11.14 (a) A three-track Turing machine employing delimiters; (b) an accepting configuration]
used will depend on the upper and lower track markers occurring in different cells,
which makes the representation of words of length less than two awkward. Since
this construct is motivated by a need to find a context-sensitive grammar, we will
simply modify the resulting grammar to explicitly generate any such short words and
not rely on the above formalism.
EXAMPLE 11.13
[Figure 11.15 The Turing machine discussed in Example 11.13]
In the following definition, note that by the conditions placed on a strict linear bounded automaton, Γ already contains symbols of the form ā and a̲, and hence so will Γ_B. For simplicity, the state set is required to be of the form {s₀, s₁, ..., sₙ}, but clearly the state names of any automaton could be renumbered sequentially to fit the given definition.
∇ Definition 11.9. For a strict linear bounded automaton B = <Σ, Γ, {s₀, s₁, ..., sₙ}, s₀, δ>, the corresponding grammar G_B contains the following classes of productions:
1. If λ ∈ L(B), then Z → λ ∈ P_B
Z → S ∈ P_B
(∀d ∈ Σ)(if d ∈ L(B), then S → d ∈ P_B)
(∀d ∈ Σ)(S → Wd̲ ∈ P_B)
(∀d ∈ Σ)(W → Wd ∈ P_B)
(∀d ∈ Σ)(W → d̄₀ ∈ P_B)
2. Each printing transition gives rise to a production rule as follows:
(∀sᵢ, sⱼ ∈ S)(∀a, b ∈ Σ ∪ Γ)(if δ(sᵢ, a) = (sⱼ, b), then aᵢ → bⱼ ∈ P_B)
Each move right gives rise to a production rule as follows:
(∀sᵢ, sⱼ ∈ S)(∀a ∈ Σ ∪ Γ)(if δ(sᵢ, a) = (sⱼ, R), then (∀d ∈ Σ ∪ Γ)(aᵢd → adⱼ ∈ P_B))
Each move left gives rise to a production rule as follows:
(∀sᵢ, sⱼ ∈ S)(∀a ∈ Σ ∪ Γ)(if δ(sᵢ, a) = (sⱼ, L), then (∀d ∈ Σ ∪ Γ)(daᵢ → dⱼa ∈ P_B))
Each halt with acceptance gives rise to a production rule as follows:
(∀sᵢ ∈ S)(∀b ∈ Σ ∪ Γ)(∀a ∈ Σ)(if δ(sᵢ, b) = (h, a_Y), then bᵢ → a_Y ∈ P_B)
3. (∀a, b ∈ Σ)(b a_Y → b_Y a ∈ P_B)
(∀a, b ∈ Σ)(b̄ a_Y → ba ∈ P_B)
Δ
EXAMPLE 11.14
Consider again the strict linear bounded automaton B given in Figure 11.15 and the corresponding context-sensitive grammar G_B. The following derivation sequences show that babcca ∈ L(G_B):
Z ⇒ S ⇒ Wa̲ ⇒ Wca̲ ⇒ Wcca̲ ⇒ Wbcca̲ ⇒ Wabcca̲ ⇒ b̄₀abcca̲
At this point, only the class 2 productions can be employed, yielding:
b̄₀abcca̲ ⇒ b̄a₁bcca̲ ⇒ b̄A₂bcca̲ ⇒ b̄₂Abcca̲ ⇒
b̄₃Abcca̲ ⇒ B̄₄Abcca̲ ⇒
B̄₅Abcca̲ ⇒ B̄A₅bcca̲ ⇒ B̄Ab₅cca̲ ⇒ B̄Abc₅ca̲ ⇒
B̄AbC₆ca̲ ⇒ B̄Ab₆Cca̲ ⇒ B̄A₆bCca̲ ⇒ B̄₆AbCca̲ ⇒
B̄₀AbCca̲ ⇒ B̄A₀bCca̲ ⇒ B̄Ab₀Cca̲ ⇒
B̄AbC₁ca̲ ⇒* B̄₂AbCcA̲ ⇒*
B̄Ab₃CcA̲ ⇒* B̄₄ABCcA̲ ⇒*
B̄ABCc₅A̲ ⇒* B̄₆ABCCA̲ ⇒*
B̄ABCC₇a̲ ⇒* B̄₇abcca̲ ⇒
b̄₈abcca̲ ⇒ b̄a₈bcca̲ ⇒* b̄abcca̲₈
Finally, since δ(s₈, a̲) = (h, a_Y), the class 3 productions now apply:
b̄abcca̲₈ ⇒ b̄abcca_Y ⇒ b̄abcc_Ya ⇒ b̄abc_Yca ⇒ b̄ab_Ycca ⇒ b̄a_Ybcca ⇒ babcca
Once again, the grammars springing from Definition 11.9 can generate sentential forms corresponding to any string in Σ*, as long as the length of the string is at least two. As with the grammars arising from Definition 11.8, only strings that would have been accepted by the original machine will lead to a terminal string. If the productions of this example were applied to the sentential form b̄₀a̲, there would be exactly one choice of applicable production at each step, until eventually the form B̄A̲₅ is obtained. At this step, no production will apply, and therefore a terminal string cannot be generated from b̄₀a̲. This correspondence between words accepted by the machine B and words generated by the context-sensitive grammar G_B given in Definition 11.9 is the foundation of the following theorem.
V Theorem 11.5. Let I be an alphabet. Then ;£k ~ Ok. That is, every linear
bounded language can be generated by a type 1 grammar.
Proof. Any linear bounded language can be recognized by a strict linear
bounded automaton (see the exercises). Hence, if L is a linear bounded language,
there exists a strict linear bounded automaton B = <I, r, {so, S10 ••• , sn}, sO, B>
which accepts exactly the words in L by printing Y on the lowest of the three tracks
after restoring the original word to the middle track. We will employ the grammar
Ge corresponding to B, as given in Definition 11.9. Example 11.14 illustrated that
these productions can be used in a manner similar to those of Definition 11.8, and it
is easy to justify that they cannot be combined in unexpected ways. Induction on the
length of x will show that by using just the productions in class 1,
(Vx E I *)(Va, b E I)(Z ~ 3oX.!!)
The correspondence between sequences of applications of the class 2 produc-
tions and sequences of moves in B follows as in Theorem 11.4. Due to the myriad
positions that the integer subscript can occupy, and the special cases caused by the
presence of the overbars and underscores, the general induction statement is quite
tedious to state and is left as an exercise. The statement will again be applied to the
special case in which we are interested, as stated below.
(∀x ∈ Σ*)(∀a, b ∈ Σ)(s_0 axb ⊢* ax h b_Y iff ā_0 x b̲ ⇒* a x b_Y)
This establishes the correspondence between words of length at least two accepted
by B and those generated by G_B. Definition 11.9 included specific productions of the
form Z → λ and S → d to ensure that words of length 0 and 1 also corresponded.
This implies that L(B) = L(G_B), as was to be shown.
∆
The proof of Theorem 11.5 argues that there exists a context-sensitive gram-
mar G_B for each strict linear bounded automaton B, and it certainly appears that
given an automaton B we can immediately write down all the productions in P_B, as
specified by Definition 11.9. However, some of the class 1 productions may cause
some trouble. For example, determining whether the production Z → λ is included
in P_B depends on whether the automaton halts with Y when presented with a blank
tape. In the next chapter, we will see that even this simple question cannot be
effectively answered for arbitrary Turing machines! That is, it is impossible to find
an algorithm that, when presented with the state diagram of a Turing machine, can
reliably determine whether or not the machine accepts the empty string. It will be
shown that any such proposed algorithm is guaranteed to give the wrong answer for
some Turing machines. Similarly, it now seems that there might be some uncer-
tainty about which members of Σ give rise to productions of the form S → d.
The productions specified by Definition 11.9 were otherwise quite explicit;
only the productions relating to the immediate generation of a single character or
the empty string were in any way questionable. There are only ‖Σ‖ + 1 such produc-
tions, and some combination of them has to be the correct set of productions to
include in P_B. Thus, as stated in the theorem, we are assured that a context-sensitive
grammar does exist, even if we are unclear as to exactly what productions it should
contain.
As will be seen in Chapter 12, it is possible to determine which words are
accepted (and which are rejected) by linear bounded automata. Unlike unrestricted
Turing machines, there is only a finite span of tape upon which symbols can be
placed. Furthermore, there are only a finite number of characters that can appear in
those cells, a finite number of positions the tape head can be in, and a finite number
of states to consider. The limited number of configurations makes it possible to
determine exactly which words of a given size are recognized by the LBA.
We have seen that every linear bounded automaton is equivalent to a strict
linear bounded automaton, and these have equivalent type 1 grammars. Con-
versely, every type 1 grammar generates a linear bounded language, which implies
there is another correspondence between a generative construct and a recognizing
construct.
proper place on the tape. T_∩ therefore accepts if and only if both T_1 and T_2 accept,
and hence L_1 ∩ L_2 is Turing acceptable.
∆
Note that it was important that, except for the presence of Y after the input
word, T_1 left the tape in the same condition it found it, with the input string intact
for T_2. As with type 3 and type 2 grammars, there is no pleasant way to combine
type 0 grammars to produce a grammar that generates the intersection of type 0
languages, although Theorem 11.7 guarantees that such a grammar must surely
exist.
As shown in Chapter 12, there are some operators under which 𝒯_Σ is not
closed. Complementation is perhaps the most glaring exception. The closure
properties of ℒ_Σ are very similar to those of 𝒯_Σ. In most cases, slight modifications
of the above proofs carry over to the type 1 languages.
For example, if the original machine was an LBA, the new version built for the
reversed language will also be an LBA. In the generative approach,
reversing the characters in type 1 productions still results in a type 1 grammar. That
is, if the original grammar had no contracting productions, neither will the new
grammar.
Proving that the union of two type 1 languages is type 1 is similar to the proof
given in Theorem 11.6, although care must be taken to avoid extraneous produc-
tions of the form Z_1 → λ. Building an intersection machine from two linear bounded
automata can be done exactly as described in Theorem 11.7. The remaining closure
properties are left for the exercises.
∆
It is clear from our definitions that 𝒞_Σ ⊆ 𝒯_Σ, but we have yet to prove that
𝒞_Σ ⊂ 𝒯_Σ. That the inclusion is proper, and that 𝒯_Σ is truly a larger class than 𝒞_Σ, will be
shown to be a consequence of the material considered in Chapter 12. Apart from
this one missing piece, we have over the course of several chapters encountered the
major components of the following hierarchy theorem.
EXERCISES
11.1. By making the appropriate analogies for states and input, answer the musical question
"How is a Turing machine like an elevator?" What essential (missing) component
prevents an elevator from modeling a general computing device?
11.2. Let Σ = {a, b, c} and let L = {w | w = w^r}.
(a) Explicitly define a deterministic, one-tape, one-head Turing machine that will
recognize L.
(b) Justify that there exists a linear bounded automaton that accepts L.
(c) Describe how nondeterminism or additional tapes and heads might be employed
to recognize L.
11.3. Let Σ = {a}. Explicitly define a deterministic, one-tape, one-head Turing machine that
will recognize {a^n | n is a perfect square}.
11.4. Let Σ = {a, b, c}.
(a) Explicitly define a deterministic, one-tape, one-head Turing machine that will
recognize {a^k b^n c^m | (k ≠ n) ∧ (n ≠ m)}.
(b) Explicitly define a deterministic, one-tape, one-head Turing machine that will
recognize {x ∈ {a, b, c}* | |x|_a ≠ |x|_b ∧ |x|_b ≠ |x|_c}.
11.5. (a) Recall that there are several common definitions of acceptance that can be
applied to Turing machines. Design a machine M for which
L(M) = L_1(M) = L_2(M) = L_3(M) = {x ∈ {a, b, c}* | |x|_a = |x|_b = |x|_c}.
(b) For any Turing-acceptable language L, is it always possible to find a correspond-
ing machine for which L(M) = L_1(M) = L_2(M) = L_3(M) = L? Justify your answer.
11.6. Let L = {ww | w ∈ {a, b, c}*}.
(a) Explicitly define a deterministic, one-tape, one-head Turing machine that will
recognize L.
(b) Justify that there exists a linear bounded automaton that accepts L.
(c) Describe how nondeterminism or additional tapes and heads might be employed
to recognize L.
11.7. Given an alphabet Σ = {a_1, a_2, a_3, ..., a_n}, associate each word with the base n number
derived from the subscripts. Thus, a_3 a_2 a_4 is associated with 324, a_1 with 1, and λ with
0. These associated numbers then imply a lexicographic ordering of Σ*.
(a) Given an alphabet I, build a Turing machine that, given an input word x, will
replace that word with the string that follows x in lexicographic order.
(b) Using the machine in part (a) as a submachine, build a Turing machine that will
start with a blank tape and sequentially generate the words in I * in lexicographic
order, erasing the previous word as the following word is generated.
(c) Using the machine in part (a) as a submachine, build a Turing machine that will
start with a blank tape and sequentially enumerate the words in I * in lexico-
graphic order, placing each successive word to the right of the previous word on
the tape, separated by a blank.
(d) Explain how these techniques can be used in building a deterministic version of a
nondeterministic Turing machine.
11.8. Define a semi-infinite tape as one that has a distinct left boundary but extends indefi-
nitely to the right, such as those employed by DFAs.
(a) Given a Turing machine satisfying Definition 11.1, define an equivalent two-track
Turing machine with a semi-infinite tape.
(b) Prove that your construction is equivalent to the original.
11.9. Let Σ = {a}. Explicitly define a deterministic, one-tape, one-head Turing machine that
will recognize {a^n | n is a power of 2} = {a, aa, aaaa, ...}.
11.10. Define a three-head Turing machine that accepts {x ∈ {a, b, c}* | |x|_a = |x|_b = |x|_c}.
Assume that all three heads start on the leftmost character. Is there any need for any
of the heads to ever move left?
11.11. Let Σ be an alphabet. Prove that every context-free language is Turing-acceptable by
providing the details for the construction discussed in Lemma 11.1.
11.12. Let Σ be an alphabet. Prove that every type 1 language is an LBL by providing the
details for the construction discussed in Theorem 11.3.
11.13. Let M = <Σ, Γ, S, s_0, δ> be a linear bounded automaton. Show how to convert M
into a three-track automaton that never scans any cells but those containing the
original word by:
(a) Explicitly defining the new alphabets.
(b) Explicitly defining the new transitions from the old. (Hint: From any state, an old
transition "leaving" the word to scan one of the delimiters must return to the
word in a unique manner.)
(c) Prove that for words of length at least 2 your new strict linear bounded automaton
accepts exactly when M does.
11.14. By adding appropriate new symbols (of the form h_i) and suitable transitions:
(a) Modify the strict linear bounded automaton defined in Exercise 11.13 so that it
correctly handles strings of length 1.
(b) Assume that a strict LBA that initially scans a blank is actually scanning an empty
tape. If we expect to handle the empty string, we cannot insist that a strict linear
bounded automaton never scan a cell that is not part of the input string, since the
tape head must initially look at something. If we instead require that the tape
head of a strict LBA may never actively move to a cell that is not part of the input
string, then the dilemma is solved. Show that such a strict LBA can be found for
any type 1 language.
11.15. Refer to Theorem 11.4 and show, by inducting on the number of transitions, that
(∀s, t ∈ S ∪ {h})(∀α, β, γ, ω ∈ (Σ ∪ Γ)*)
(αsβ ⊢* γtω iff (∃i, j, m, n ∈ ℕ)([#^i αsβ #^j] ⊢* [#^m γtω #^n]))
11.16. State and prove the general induction statement needed to rigorously prove Theorem
11.5.
11.17. If G = <Σ, Γ, Z, P> is a grammar for a type 0 language:
(a) Explain why the following construction may not generate L(G)*: Choose a new start
symbol W, and form G_* = <Σ, Γ ∪ {W}, W, P ∪ {W → λ, W → WW, W → Z}>.
(b) Give an example of a grammar that illustrates this flaw.
(c) Given a type 0 grammar G = <Σ, Γ, Z, P>, define an appropriate grammar G_*
that generates the Kleene closure of L(G).
(d) Prove that the construction defined in part (c) has the property that
L(G_*) = L(G)*.
11.18. Let Σ be an alphabet. Prove that 𝒯_Σ is closed under:
(a) Homomorphism
(b) Inverse homomorphism
(c) Concatenation
(d) Substitution
11.19. (a) Show that any Turing machine A_1 accepting L = L_1(A_1) has an equivalent Turing
machine A_2 for which L = L_2(A_2) by explicitly modifying the quintuple for A_1 and
proving that your construction behaves as desired.
(b) Show that any Turing machine A_2 accepting L = L_2(A_2) has an equivalent Turing
machine A_3 for which L = L_3(A_3) by explicitly modifying the quintuple for A_2 and
proving that your construction behaves as desired.
11.20. Let Σ be an alphabet. Prove that 𝒞_Σ is closed under:
(a) Homomorphism
(b) Inverse homomorphism
(c) Concatenation
(d) Substitution
CHAPTER 12
DECIDABILITY
In this chapter, the nature and limitations of algorithms are explored. We will first
look at the general properties that can be ascertained about finite automata and
FAD languages. For example, we might like to be able to enter the state transition
table of a DFA into a suitably sized array and then run a program that determines
whether the DFA was connected. An algorithm for checking this property was
outlined in Chapter 3. Similarly, we have seen that it is possible to write a program
to check whether an arbitrary DFA is minimal. We know this property can be
reliably checked because we proved that the algorithms in Chapter 3 could be
applied to ascertain the correct answer for virtually every conceivable DFA. There
are an infinite number of DFAs about which the question can be posed, and yet our
algorithm decides the question correctly in all cases. In the following section we
consider questions that can be asked about more complex languages and machines.
In the latter part of this chapter, we will see that unlike the questions in
Sections 12.1 and 12.2, there are some questions that are in a fundamental sense
unanswerable in the general case. That is, there cannot exist an algorithm that
correctly answers such a question in all cases. These questions will be called
undecidable. An undecidable question about Pascal programs is considered in detail
in Section 12.3 and is independent of advanced machine theory. The concept of
undecidability is addressed formally in Section 12.4, and other undecidable prob-
lems are also presented.
Williams
Jones
Smith
A variety of sorting algorithms, when applied to this file, will produce the correct
output. It is also possible to write a program that ignores its input and always prints
the lines
Jones
Smith
Williams
This program does yield the correct answer for the particular problem we wished to
solve, and indeed it solves the sorting problem for all files that contain exactly these
three particular names in some arbitrary order (there are six such files). Thus, this
trivial program is an algorithm that solves the sorting problem for these six specific
instances. A slightly more complex program might be capable of printing two or
three distinct answers, depending on the input, and thus solve the sorting problem
for an even larger (but still finite) class of instances.
It should be clear that producing an algorithm that solves a finite set of
instances is no great accomplishment, since these algorithms are guaranteed to
exist. Such an algorithm could be programmed as one big case statement, which
identifies the particular input instance and produces the corresponding output for
that instance. Algorithms that apply to an infinite set of instances are of much more
theoretical and practical interest.
∇ Definition 12.1. Given a set of instances and a yes-no question that can be
applied to those instances, we will say that the question is decidable if there is an
algorithm for determining in each instance the (correct) answer to the question.
∆
EXAMPLE 12.1
A typical set of instances might be the set of all deterministic finite automata over a
given alphabet Σ; a typical question might be whether a given automaton accepts at
least one string in Σ*.
∇ Theorem 12.1. Given any alphabet Σ and a DFA A = <Σ, S, s_0, δ, F>, it is
decidable whether L(A) = ∅.
Proof. Let n = ‖S‖. Since both Σ and S are finite sets,
B = {λ} ∪ Σ ∪ Σ^2 ∪ ... ∪ Σ^(n-1)
is a finite set, and we can examine each string of this set and still have a procedure
that halts. There is clearly an algorithm for determining the set C of all states that
are reached by these few strings. Specifically,
C = {δ̄(s_0, x) | x ∈ Σ* ∧ |x| < n} = {δ̄(s_0, x) | x ∈ B}.
Note that Theorem 2.7 implies that if a string (of any length) is accepted by A, then
there is another string of length less than n that is also accepted by A. Consequently,
it is sufficient to examine only the "short" strings contained in B rather than
examine all of Σ*. If any of the strings in B lead to a final state (that is, if C ∩ F ≠ ∅),
then the answer to the question is clearly "NO, L(A) is not empty," while if
C ∩ F = ∅, then Theorem 2.7 guarantees that "YES, L(A) is empty" is the correct
answer. We have therefore constructed an algorithm (which computes C and then
examines C ∩ F, both of which can be done in a finite amount of time) for determin-
ing whether the language accepted by a given machine is empty.
∆
The definition of C does not suggest the most efficient algorithm for calcu-
lating the set C; better strategies are available. The technique is similar to that
employed to find the state equivalence relation E_A. C is actually the set of connected
states S^c, which can be calculated recursively as indicated in Definition 3.10. Note
that Theorem 12.1 answers the question posed in Example 12.1. The set of instances
to which this question applies can easily be expanded. It can be shown that it is
decidable whether L(A) = ∅ for any NDFA A by first employing Definition 4.5 to
find the equivalent DFA A_d and then applying the method outlined in Theorem 12.1
to that machine. It is possible to find a much more efficient algorithm for answering
this question that does not rely on the conversion to a DFA (see the exercises).
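The reachable-state computation lends itself to a direct implementation. The following Pascal sketch (an illustration of the technique only; the transition-matrix representation, the sample machine, and all identifiers are assumptions, not taken from the text) closes the start state under the transition function and then looks for a reachable final state:

program EMPTY;
{ A sketch of the emptiness test of Theorem 12.1: compute the set of
  states reachable from the start state and report whether it meets F.
  NSTATES, NSYMBOLS, DELTA, and FINAL are an assumed representation. }
const
  NSTATES = 3;
  NSYMBOLS = 2;
var
  DELTA: array[0..NSTATES - 1, 1..NSYMBOLS] of integer;
  FINAL, REACHED: array[0..NSTATES - 1] of boolean;
  s, a: integer;
  CHANGED, EMPTYLANG: boolean;
begin
  { a small example machine: state 2 is final but unreachable }
  DELTA[0, 1] := 1; DELTA[0, 2] := 0;
  DELTA[1, 1] := 0; DELTA[1, 2] := 1;
  DELTA[2, 1] := 2; DELTA[2, 2] := 2;
  FINAL[0] := false; FINAL[1] := false; FINAL[2] := true;
  for s := 0 to NSTATES - 1 do REACHED[s] := false;
  REACHED[0] := true;                  { the start state s0 }
  repeat                               { close REACHED under delta }
    CHANGED := false;
    for s := 0 to NSTATES - 1 do
      if REACHED[s] then
        for a := 1 to NSYMBOLS do
          if not REACHED[DELTA[s, a]] then
            begin REACHED[DELTA[s, a]] := true; CHANGED := true end
  until not CHANGED;
  EMPTYLANG := true;                   { empty iff no reachable final state }
  for s := 0 to NSTATES - 1 do
    if REACHED[s] and FINAL[s] then EMPTYLANG := false;
  if EMPTYLANG then writeln('YES - L(A) is empty')
  else writeln('NO - L(A) is not empty')
end.

Each pass of the repeat loop either adds a new state or terminates, so at most ‖S‖ passes are made and the procedure is guaranteed to halt.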
Just as the algorithm for converting an NDFA into a DFA allows the empti-
ness question to be answered for NDFAs, the techniques in Chapter 6 justify that
the similar question for regular expressions is decidable. That is, since every regular
expression has an equivalent DFA, the question of whether a regular expression
describes any strings is clearly decidable. Similar extensions can be applied to most
of the results in this section. Just as we can decide whether a DFA A accepts any
strings, we can also decide if A accepts an infinity of strings, as shown by Theorem
12.2. This can be proved by a related appeal to Theorem 12.1, but an efficient
algorithm for answering this question depends on the following lemma.
A question similar to the one posed in Theorem 12.1 is "Does a given DFA
accept a finite or an infinite number of strings?" This is also a decidable question, as
demonstrated by the following theorem. The proof is based on the observation that
a DFA A that accepts no strings of length greater than some fixed constant must by
definition recognize a finite set, while the pumping lemma implies that if L (A)
contains a sufficiently long string, then L(A) must contain an infinite number of
related strings.
∇ Theorem 12.2. Given any alphabet Σ and a DFA A = <Σ, S, s_0, δ, F>, it is
decidable whether L(A) is an infinite set.
Proof. Let n = ‖S‖. Clearly, if A accepts no strings of length n or greater,
then L(A) is finite. From the pumping lemma, we know that if A accepts even one
string of length equal to or greater than n, then A must accept an infinite number
of strings. We still cannot check all the strings of length greater than n and have
a procedure that halts, so Lemma 12.1 will be invoked to argue that if a long string
is accepted by A, then a string whose length is in the range n ≤ |x| < 2n must
be accepted, and it is therefore sufficient to check the strings in this limited
range. Thus, our algorithm will consist of computing the intersection of
{δ̄(s_0, y) | y ∈ Σ* ∧ n ≤ |y| < 2n} and F. L(A) is infinite iff this intersection is non-
empty.
∆
If we were to write a program that consulted the matrix containing the state
transition table for A to actually determine {δ̄(s_0, y) | y ∈ Σ* ∧ n ≤ |y| < 2n}, it would
be very inefficient to implement this computation as implied by the definition.
Repeatedly looking up entries in the state transition table to determine δ̄ for each
word in this large class of specified strings would involve an enormous duplication of
effort. It is far better to recursively calculate R_i = {δ̄(s_0, x) | x ∈ Σ^i}, which represents
the set of all states that can be reached by strings of length exactly i. This can be
easily computed by defining R_0 = {s_0} and using the recursive formula
R_{i+1} = {δ(s, a) | a ∈ Σ, s ∈ R_i}
Successive sets can thereby be calculated from R_0. When R_n is reached, it is checked
against F, and the algorithm halts and returns Yes if they have a common state.
Otherwise, R_{n+1} through R_{2n-1} are checked, and No is returned if no final state
appears in this group. This method is easily adaptable to nondeterministic finite
automata by setting R_0 to be the set of all start states and adjusting the definition of
R_{i+1} to conform to NDFA notation.
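Under the same assumed machine representation as in the earlier sketch, the R_i iteration can be rendered directly; R_{i+1} is computed as the image of R_i under δ, and only the window n ≤ i < 2n is checked against F (the sample machine and all names are again assumptions):

program INFINITE;
{ A sketch of the recursive R(i) calculation: R(i+1) is the image of
  R(i) under the transition function, and L(A) is infinite iff some
  R(i) with n <= i < 2n contains a final state. }
const
  NSTATES = 3;
  NSYMBOLS = 2;
var
  DELTA: array[0..NSTATES - 1, 1..NSYMBOLS] of integer;
  FINAL: array[0..NSTATES - 1] of boolean;
  R, RNEXT: array[0..NSTATES - 1] of boolean;
  s, a, i: integer;
  HIT: boolean;
begin
  { a small example machine accepting an infinite language }
  DELTA[0, 1] := 1; DELTA[0, 2] := 0;
  DELTA[1, 1] := 0; DELTA[1, 2] := 1;
  DELTA[2, 1] := 2; DELTA[2, 2] := 2;
  FINAL[0] := false; FINAL[1] := true; FINAL[2] := false;
  for s := 0 to NSTATES - 1 do R[s] := false;
  R[0] := true;                        { R(0) = { s0 } }
  HIT := false;
  for i := 1 to 2 * NSTATES - 1 do
    begin
      for s := 0 to NSTATES - 1 do RNEXT[s] := false;
      for s := 0 to NSTATES - 1 do    { RNEXT is the image of R under delta }
        if R[s] then
          for a := 1 to NSYMBOLS do RNEXT[DELTA[s, a]] := true;
      R := RNEXT;
      if i >= NSTATES then            { check the window n <= i < 2n }
        for s := 0 to NSTATES - 1 do
          if R[s] and FINAL[s] then HIT := true
    end;
  if HIT then writeln('L(A) is infinite')
  else writeln('L(A) is finite')
end.

Only two boolean vectors are needed regardless of n, since the whole-array assignment copies R_{i+1} over R_i at each step.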
The involved arguments presented in Lemma 12.1 and the proof of Theorem
12.2 are necessary to justify that the above efficient recursive algorithm correctly
answers the question of whether a finite automaton accepts an infinite number of
strings. However, if we were simply interested in justifying that it is decidable
whether L(A) is infinite, it would have been much more convenient to simply adapt
the result of Theorem 12.1. In particular, we could have easily built a DFA that
accepts all strings of length at least n, form the "intersection" machine, and apply
Theorem 12.1 to the new machine.
Specifically, if A is an n-state deterministic finite automaton, consider the
DFA A_n = <Σ, {r_0, r_1, r_2, ..., r_n}, r_0, δ_n, {r_n}>, where δ_n is defined by
The following theorem answers a major question about DFAs: "Are two given
deterministic finite automata equivalent?" At first glance, this appears to be a hard
question; an initial strategy might be to check longer and longer strings, and answer
"No, they are not equivalent" if a string is found that is accepted by one machine
but is not accepted by the other. As in the proof of Theorems 12.1 and 12.2, we
would again be faced with the task of determining when we could confidently stop
checking strings and answer "Yes, they are equivalent."
Such a strategy can be made to work, but an easier method is again available.
We are essentially checking whether the start state of the first machine treats strings
differently than does the start state of the second machine. This problem was
addressed in Chapter 3, and an algorithm that accomplished this sort of checking
has already been presented. This observation provides the basis for the proof of the
following theorem.
∇ Theorem 12.3. Given any alphabet Σ and two DFAs A_1 = <Σ, S_1, s_{01}, δ_1, F_1>
and A_2 = <Σ, S_2, s_{02}, δ_2, F_2>, it is decidable whether L(A_1) = L(A_2).
Proof. Without loss of generality, assume that S_1 ∩ S_2 = ∅, and construct a
new DFA defined by A = <Σ, S_1 ∪ S_2, s_{01}, δ, F_1 ∪ F_2>, where
EXAMPLE 12.2
Consider the two machines A_1 and A_2 displayed in Figure 12.2. The machine A
constructed according to Theorem 12.3 would look like the diagram inside the
dotted box shown in Figure 12.3. This new machine is very definitely disconnected,
and in this example s_{01} is not related to s_{02} by E_A, since these two states treat ab
differently (ab is accepted by A_1 and rejected by A_2). The reader is encouraged to
generate another example using two equivalent machines, and verify that the two
original start states would indeed be related by E_A.
Figure 12.3 The composite machine discussed in Example 12.2
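The check that Theorem 12.3 relies on can be sketched as follows. The fragment below uses a hypothetical layout (states 0 through 2 belong to A_1 and 3 through 5 to A_2, with the two transition tables merged into one matrix); it marks distinguishable pairs of states of the composite machine, in the style of the Chapter 3 algorithm, and then tests whether the two start states were ever marked:

program EQUIV;
{ A sketch of the equivalence test: a pair of states is marked as
  distinguishable if one is final and the other is not, or if some
  symbol leads the pair to an already-marked pair. }
const
  NSTATES = 6; NSYMBOLS = 2;
  START1 = 0; START2 = 3;
var
  DELTA: array[0..NSTATES - 1, 1..NSYMBOLS] of integer;
  FINAL: array[0..NSTATES - 1] of boolean;
  DIST: array[0..NSTATES - 1, 0..NSTATES - 1] of boolean;
  s, t, a: integer;
  CHANGED: boolean;
begin
  for s := 0 to NSTATES - 1 do
    for a := 1 to NSYMBOLS do DELTA[s, a] := s;  { default self-loops }
  for s := 0 to NSTATES - 1 do FINAL[s] := false;
  { A1 (states 0,1) accepts words ending in a; symbol 1 = a, 2 = b }
  DELTA[0, 1] := 1; DELTA[0, 2] := 0; DELTA[1, 1] := 1; DELTA[1, 2] := 0;
  FINAL[1] := true;
  { A2 (states 3,4) accepts words containing an a }
  DELTA[3, 1] := 4; DELTA[3, 2] := 3; DELTA[4, 1] := 4; DELTA[4, 2] := 4;
  FINAL[4] := true;
  for s := 0 to NSTATES - 1 do
    for t := 0 to NSTATES - 1 do
      DIST[s, t] := FINAL[s] <> FINAL[t];        { the base distinction }
  repeat
    CHANGED := false;
    for s := 0 to NSTATES - 1 do
      for t := 0 to NSTATES - 1 do
        if not DIST[s, t] then
          for a := 1 to NSYMBOLS do
            if DIST[DELTA[s, a], DELTA[t, a]] then
              begin DIST[s, t] := true; CHANGED := true end
  until not CHANGED;
  if DIST[START1, START2] then writeln('L(A1) <> L(A2)')
  else writeln('L(A1) = L(A2)')
end.

For the two sample machines the pair (START1, START2) is marked via the string ab, so the program reports that the languages differ, as in Example 12.2.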
converting this NDFA into a DFA. Theorem 3.7 and Corollary 3.5 indicate the
algorithms for minimizing this DFA. Counting the number of final states in this
minimal machine will allow the question to be answered correctly.
∆
The careful reader may have noticed that the minimal machine described in
Chapters 2 and 3 was only advertised to have the minimum total number of states
and has not yet been guaranteed to have the smallest number of final states (perhaps
there is an equivalent machine with many more nonfinal states but fewer final
states). An investigation of the relationship between the final states of the minimal
machine and the equivalence classes comprising the right congruence generated by
this language will show that no equivalent machine can have fewer final states than
the minimal machine has (see the exercises).
The proofs of Theorems 12.3 and 12.4 are good examples of using existing
algorithms to build new algorithms. This technique should be applied whenever
possible in the following exercises. It is certainly useful in resolving the following
question about grammars.
Given two right linear grammars G_1 = <Ω_1, Σ, S_1, P_1> and G_2 = <Ω_2, Σ, S_2, P_2>,
it is clearly decidable whether G_2 is equivalent to G_1. An algorithm can be formed
that simply:
Proof. Recall that a nonterminal is useless if it can never appear in the deri-
vation of any valid terminal string. Essentially, only two things can prevent a
nonterminal X from being effectively used somewhere in a valid derivation: either X
can never appear as part of a partial derivation that begins with only the start
symbol (no matter how many productions we apply), or, once X is generated, it can
never lead to a valid terminal string.
Finding the members of Ω that can be produced from S is a simple recursive
procedure: Begin with Z_0 = {S} and form Z_1 by adding to Z_0 all the nonterminals
that appear on the right side of productions that are used to replace S. Then form Z_2
by adding to Z_1 all the nonterminals that appear on the right side of productions that
are used to replace members of Z_1, and so on. More formally:
Z_0 = {S}
and for i ≥ 0,
Z_{i+1} = Z_i ∪ {Y ∈ Ω | (∃x ∈ Σ*)(∃T ∈ Z_i)(T → xY is a production in P)}
Clearly, Z_0 ⊆ Z_1 ⊆ ... ⊆ Z_n ⊆ ... ⊆ Ω, and as was shown for similar collections of
nested entities (such as E_{0A}, E_{1A}, ... in Chapter 3), after a finite number of steps we
will reach the point where Z_m = Z_{m+1}, and Z_m will then represent the set of all
nonterminals that can be reached from the start symbol S.
In a similar fashion, we can generate another nested sequence of sets
W_0, W_1, ..., where W_i represents the set of all nonterminals that can produce a
terminal string in i or fewer steps. We are again guaranteed to reach a point where
W_n = W_{n+1}, and W_n will indeed be the set of all nonterminals that can ever produce a
valid terminal string.
Z_m ∩ W_n is thus the set of all useful members of Ω, and Ω − (Z_m ∩ W_n) is
therefore the set of all useless nonterminals.
∆
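The two passes of the proof translate almost line for line into a program. In the sketch below (an assumed encoding: nonterminals are single uppercase characters, terminals are lowercase, and the production list is a hypothetical example, not one from the text), Z is closed under reachability and W under productivity, exactly as described above:

program USELESS;
{ A sketch of the two passes in the proof of Theorem 12.6. }
const
  NPROD = 5;
type
  Production = record LHS: char; RHS: string end;
var
  P: array[1..NPROD] of Production;
  Z, W: set of char;
  i, j: integer;
  CHANGED, ALLPROD: boolean;
begin
  { an assumed example grammar:
    S -> aT | bS,  T -> a | bV,  V -> aV   (V is useless) }
  P[1].LHS := 'S'; P[1].RHS := 'aT';
  P[2].LHS := 'S'; P[2].RHS := 'bS';
  P[3].LHS := 'T'; P[3].RHS := 'a';
  P[4].LHS := 'T'; P[4].RHS := 'bV';
  P[5].LHS := 'V'; P[5].RHS := 'aV';
  Z := ['S'];                     { Z0 = { S } }
  repeat                          { Z(i+1): add nonterminals on right sides }
    CHANGED := false;
    for i := 1 to NPROD do
      if P[i].LHS in Z then
        for j := 1 to length(P[i].RHS) do
          if (P[i].RHS[j] in ['A'..'Z']) and not (P[i].RHS[j] in Z) then
            begin Z := Z + [P[i].RHS[j]]; CHANGED := true end
  until not CHANGED;
  W := [];                        { W(i+1): add producers of terminal strings }
  repeat
    CHANGED := false;
    for i := 1 to NPROD do
      if not (P[i].LHS in W) then
        begin
          ALLPROD := true;        { every nonterminal in RHS already in W? }
          for j := 1 to length(P[i].RHS) do
            if (P[i].RHS[j] in ['A'..'Z']) and not (P[i].RHS[j] in W) then
              ALLPROD := false;
          if ALLPROD then
            begin W := W + [P[i].LHS]; CHANGED := true end
        end
  until not CHANGED;
  { a nonterminal is useful exactly when it lies in both Z and W }
  if 'V' in Z * W then writeln('V is useful')
  else writeln('V is useless')
end.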
EXAMPLE 12.3
The techniques employed here should look somewhat familiar. They involve
iteration methods similar to those developed in Chapter 3. In fact, it is possible to
apply the connectivity algorithms for nondeterministic finite automata to this
problem by transforming the right-linear grammar G into the NDFA A_G, as defined
in the proof of Lemma 8.1. The automaton corresponding to the grammar in
Example 12.3 is shown in Figure 12.4. Note that the state labeled <W> is
inaccessible, which means that it cannot be reached from <S>. This indicates that
there is no sequence of productions starting with the start symbol S that will produce
a string containing W.
Checking whether a nonterminal such as V can produce a terminal string is
tantamount to checking whether the language accepted by A_G^V is nonempty, where
A_G^V is A_G with the start state moved to the state labeled <V>. Since both L(A_G^V) and
L(A_G^X) are empty, V and X are useless.
It is fairly easy to find succinct algorithms that answer most of the reasonable
questions one might ask about representations of regular languages. For each of the
more complex classes of languages, there are many reasonable questions that are
not decidable. Several of these will be presented in the following sections. In this
section, we consider some of the answerable questions that can be asked about the
more robust machines and grammars.
a. L(P) is empty.
b. L(P) is finite.
c. L(P) is infinite.
are many standard algorithms for determining paths and components in a graph,
and thus the question of whether x ∈ L(G) is decidable.
∆
The generation of all the edges in the graph generally involves more effort
than is needed to answer the question. A more efficient method is similar to the
recursive calculations used to find the set of connected states in a DFA. Beginning
with {S}, the production set P can be consulted to determine the labels of nodes that
can be derived from S in one step. These new labels can be added to the set of
accessible sentential forms, and the added nodes can be checked until no new labels
are found. The set of sentential forms will then consist of
{w ∈ (Σ ∪ Γ)* | S ⇒* w ∧ |w| ≤ n}
and contain all words in L(G) of length ≤ n. If we are only interested in the specific
word x, then the algorithm can return Yes as soon as x appears in the set of
accessible sentential forms and would return No if x did not appear by the time the
set stopped growing.
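A sketch of this bounded search appears below. The grammar shown (S → aSb | ab, a hypothetical non-contracting example) and the array-based worklist are assumptions for illustration; the essential points are that candidate forms longer than x are discarded, so the set of stored forms must eventually stop growing:

program MEMBER;
{ A sketch of the bounded search: starting from S, generate all
  sentential forms of length at most length(X) and answer Yes iff
  X appears among them. }
const
  NPROD = 2; MAXFORMS = 200;
var
  LHS, RHS: array[1..NPROD] of string;
  FORMS: array[1..MAXFORMS] of string;
  X, F, CAND: string;
  NFORMS, i, j, k: integer;

function KNOWN(s: string): boolean;
var m: integer;
begin
  KNOWN := false;
  for m := 1 to NFORMS do
    if FORMS[m] = s then KNOWN := true
end;

begin
  LHS[1] := 'S'; RHS[1] := 'aSb';   { an assumed example: S -> aSb | ab }
  LHS[2] := 'S'; RHS[2] := 'ab';
  X := 'aaabbb';                    { the word being tested }
  NFORMS := 1; FORMS[1] := 'S';     { the accessible sentential forms }
  i := 1;
  while i <= NFORMS do
    begin
      F := FORMS[i];
      for j := 1 to NPROD do        { try every production ... }
        for k := 1 to length(F) - length(LHS[j]) + 1 do  { ... at every position }
          if copy(F, k, length(LHS[j])) = LHS[j] then
            begin
              CAND := copy(F, 1, k - 1) + RHS[j] +
                      copy(F, k + length(LHS[j]), length(F));
              { keep only forms no longer than X; with no contracting
                productions, longer forms can never shrink back to X }
              if (length(CAND) <= length(X)) and not KNOWN(CAND)
                 and (NFORMS < MAXFORMS) then
                begin NFORMS := NFORMS + 1; FORMS[NFORMS] := CAND end
            end;
      i := i + 1
    end;
  if KNOWN(X) then writeln('Yes - x is in L(G)')
  else writeln('No - x is not in L(G)')
end.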
The above algorithm will suffice for any grammar that does not contain con-
tracting productions, but can clearly give the wrong answers when applied to type 0
grammars. Since the length of sentential forms can both grow and shrink in un-
restricted grammars, the word x may actually be generated by a sequence of produc-
tions that at some point generates a sentential form longer than x. Such a sequence
would not be considered by the method outlined in Theorem 12.9, and the algo-
rithm might answer No when the correct answer is Yes. We could define a pro-
cedure that looked at larger and larger graphs (consisting of more and longer
sentential forms), which would halt and answer Yes if a derivation sequence for x
was discovered. If x actually can be generated by G, this method will eventually
uncover the appropriate sequence. We therefore have a procedure that will reliably
tell us if a word can be generated by an unrestricted grammar. Unless we include a
specification of when to stop and answer No, this procedure is not an algorithm. In
later sections, we will see that it is impossible to determine, for an arbitrary type 0
grammar G, if an arbitrary word x is not generated by G. The question of whether
x ∈ L(G) is not decidable for arbitrary grammars.
It turns out that there are many reasonable questions such as this one that
cannot be determined algorithmically. We begin our overview of undecidable prob-
lems with an analysis of a very reasonable question concerning Pascal programs.
Subsequent sections consider undecidable questions concerning the grammars and
machines covered in this text.
Having now developed a false sense of security about our ability to produce
algorithms for determining many properties about machines and languages, we now
step back and see whether there is anything we cannot do algorithmically. A simple
counting argument will show that there are too many things to calculate and not
enough algorithms with which to calculate them all. It may be helpful to review the
section on cardinality in Chapter 0 and recall that there are different orders of
infinity. A diagonalization argument showed that the natural numbers could not be
put in one-to-one correspondence with the real numbers; there are simply too many
real numbers to allow such a matching to occur. A similar mismatch occurs when
comparing the (countable) number of algorithms to the (uncountable) number of
possible yes-no functions.
By definition, an algorithm is a finite list of instructions, written over some
finite character set. As such, there are only a countable number of different
algorithms that can be written. It may be helpful to consider the set of all Pascal
programs and view each file that contains the ASCII code for a program, which is
essentially a sequence of zeros and ones, as one very long binary integer. Clearly, an
infinite number of Pascal programs can be written, but no more programs than
there are binary integers, so the number of such files is indeed countable.
Now consider the possible lists of answers that could be given to questions
involving a countable number of instances. We will argue that there are an uncount-
able number of yes-no patterns that might describe the answers to such questions.
Notice that the descriptions for automata, grammars, and the like are also finite,
and thus there are a countable number of DFAs, a countable number of grammars,
and so on, that can be described. The questions we asked in the previous sections
were therefore applied to a countable number of instances, and these instances
could be ordered in some well-defined way, much as the natural numbers are
ordered. If we think of a yes response corresponding to the digit 1 and a no response
corresponding to 0, then the corresponding series of answers to a particular ques-
tion can be thought of as an unending sequence of 0s and 1s. By placing a dec-
imal point at the beginning of the sequence, each such pattern can be thought
of as a binary fraction, representing a real number between .00000... = 0 and
.11111... = 1. Conversely, each such real number in this range represents a
sequence of yes-no answers to some question. Since there are an uncountable
number of real numbers between 0 and 1, there are an uncountable number of
answers that might be of interest to us. Some of these answers cannot be obtained
by algorithms, since there are not enough algorithms to go around. Thus, there
must be many questions that are not decidable.
It is not immediately apparent that the existence of undecidable questions is
much of a drawback; perhaps all the "interesting" questions are decidable. After
all, there are an uncountable number of real numbers, yet all computers and many
humans seem to make do with just the countable number of rational numbers.
Unfortunately, there are many simple and meaningful questions that are un-
decidable. We discuss one such question now; others are considered in the next
section.
Just about every programmer has had the experience of running a program
that never produces any output and never shows any sign of halting. For programs
that are fairly short, this is usually not a problem. For major projects that are
expected to take a very long time, there comes an agonizing moment when we have
to give up hope that it is on the verge of producing a useful answer and stop the
program on the assumption that it has entered an infinite loop. While it would be
very nice to have a utility that would look over a program and predict how long it
would run, most of us would settle for a device that would simply predict whether or
not it will ever halt.
It's a good bet that you have never used such a device, which may at first seem
strange since a solution to the halting problem would certainly provide information
that would often be useful. If you have never thought about this before, you might
surmise that the scarcity of such programs is a consequence of any one of several
limiting factors. Perhaps they are inordinately expensive to run, or no one has taken
the time to implement an existing scheme, or perhaps no one has yet figured out
how to develop the appropriate algorithms. In actuality, no one is even looking for a
"halting" algorithm, since no such algorithm can possibly exist.
Let us consider the implications that would arise if such an algorithm could be
programmed in, say, Pascal. We can consider such an algorithm to be implemented
as a Boolean function called HALT, which looks at whatever program happens to
be in the file named data.p and returns the value TRUE if that program will halt,
and returns FALSE if the program in data.p would never halt. Perhaps this function
is general enough to look at source code for many different languages, but we will
see that it is impossible for it to simply respond correctly even when looking solely at
Pascal programs.
The programmer of the function HALT would likely have envisioned it to be
used in a program such as CHECK, shown in Figure 12.5. We will use it in a slightly
different way and show that a contradiction arises if HALT really did solve the
halting problem. Our specific assumptions are that:
Consider the program TEST in Figure 12.6, which is structured so that it will
run forever if the function HALT indicates that the program in the file data.p would
halt, and simply quits if HALT indicates that the program in data.p would not halt.
Some interesting things happen if we run this program after putting a copy of the
source code for TEST in data.p.
If HALT does not produce an answer, then HALT certainly does not behave
as advertised, and we have an immediate contradiction. HALT is supposed to be an
algorithm, so it must eventually return with an answer. Since HALT is a Boolean
function, we have only two cases to consider.
program CHECK;
{ envisioned usage of HALT }
function HALT: boolean;
begin
  { marvelous code goes here }
end; { HALT }
begin { CHECK }
  if HALT then
    writeln('The program in file data.p will halt')
  else
    writeln('The program in file data.p will not halt')
end. { CHECK }
Case 1: HALT returns a value of TRUE to the calling program TEST. This has
two consequences, the first of which is implied by the asserted behavior of HALT.
i. If HALT does what it is supposed to do, this means that the program in data.p
halts. We ran this program with the source code for TEST in data.p, so TEST
must actually halt.
The second consequence comes from examining the code for TEST, and noting
what happens when HALT returns TRUE.
ii. The if statement in the program TEST then causes the infinite loop to be
entered, and TEST runs forever, doing nothing particularly useful.
Our two consequences are that TEST halts and TEST does not halt. This is a clear
contradiction, and so case 1 never occurs.
Case 2: HALT returns a value of FALSE to the calling program TEST. This
likewise has two consequences. Considering the advertised behavior of HALT, this
must mean that the program in data.p, TEST, must not halt. However, the code for
TEST shows that if HALT returns FALSE we execute the else statement, write
one line, and then stop. TEST therefore halts. TEST must again both halt and not
halt.
Whichever way we turn, we reach a contradiction. The only possible conclu-
sion is that the function HALT does not behave as advertised. It must either return
no answer, or give an incorrect answer.
It should be clear that the problem cannot be fixed by having the programmer
who proposed the function HALT fiddle with the code; the above contradiction will
be reached regardless of what code appears between the begin and end statements
in the function HALT. We have shown that any such proposed function is guaran-
teed to behave inappropriately when fed a program such as TEST. In actuality,
program TEST;
{ to be placed in the file data.p }
var FOREVER: boolean;
function HALT: boolean;
begin
  { marvelous code goes here }
end; { HALT }
begin { TEST }
  FOREVER := false;
  if HALT then
    repeat
      FOREVER := false
    until FOREVER
  else
    writeln('This program halts')
end. { TEST }
there are an infinite number of programs that cause HALT to misbehave, but it was
sufficient to demonstrate just one failure to justify that no such function can solve
the general problem.
The above argument demonstrates that the halting problem for Pascal pro-
grams is undecidable or unsolvable. That is, there does not exist a Pascal program
that can always decide correctly, when fed an arbitrary Pascal program, whether
that program halts.
If we were to define an algorithm as "something that can be programmed in
Pascal," we would have shown that there is no algorithm for deciding whether an
arbitrary Pascal program halts. One might suspect that this is therefore not a very
satisfying definition of what an algorithm is, since we have a concise, well-stated
problem that cannot be solved using Pascal. It is generally agreed that the problem
does not lie with some overlooked feature that was inadvertently not incorporated
into Pascal. Clearly, all programming languages suffer from similar inadequacies.
For example, an argument similar to the one presented for Pascal would show that
no C program can be devised that can tell which C programs can halt. Thus, no
other programming language can provide a more robust definition of what an
algorithm is.
There are variations on this theme that likewise lead to contradictions. Might
there be a Pascal program that can check which C programs can halt? If you believe
that every Pascal program can be rewritten as an equivalent C program, the answer
is definitely no; a Pascal program that checks C programs could then be rewritten as
a C program (which checks C programs), and we again reach a contradiction.
It is generally agreed that the limitations do not arise from some correctable
inadequacy in our current methods of implementing algorithms. That is, the limita-
tions of algorithmic solutions seem to be inherent in the nature of algorithms.
Programming languages, Turing machines, grammars, and all other proposed sys-
tems for implementing algorithms have been shown to be subject to the same lim-
itations in computational power. The use of Turing machines to implement algo-
rithms has several implications that apply to the theory of languages. These are
explored in the following sections.
In the previous section, we saw that no Pascal program could always correctly
predict when another Pascal program would halt. A similar statement was true for C
programs, and Turing machines, considered as computing devices, are no different;
no Turing machine solves the halting problem.
Each of us is probably familiar with the way in which a Pascal program reads a
file, and hence it is not hard to imagine a Pascal program that reacts to the code for
another Pascal program. As long as the input alphabet contains at least two sym-
bols, encodings can be defined for the structure of a Turing machine, which allows
the blueprint for its finite state control to be placed on the input tape of another
Turing machine. A binary encoding might be given for the number of states,
followed by codes that enumerate the moves from each of the states. Just as we are
not presently concerned about the exact ASCII codes that define the individual
characters in a file containing a Pascal program, we need not be concerned with the
specific representation used to encode a Turing machine on an input tape.
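One possible such encoding (an assumption for illustration; the text deliberately leaves the scheme unspecified) writes each move as five blocks of 0s separated by single 1s, with 11 separating successive moves:

program ENCODE;
{ A sketch of one hypothetical encoding scheme: each transition
  (state, symbol, newstate, newsymbol, move) becomes five blocks of
  0s separated by single 1s; transitions are separated by 11.  Any
  scheme from which the blueprint can be recovered would do. }
const
  NTRANS = 2;
type
  Transition = record
    STATE, SYMBOL, NEWSTATE, NEWSYMBOL, MOVE: integer  { move: 1 = L, 2 = R }
  end;
var
  T: array[1..NTRANS] of Transition;
  i: integer;
procedure EMIT(n: integer);      { writes n+1 zeros, so 0 is representable }
var k: integer;
begin
  for k := 0 to n do write('0')
end;
begin
  { two assumed sample moves of some machine }
  T[1].STATE := 0; T[1].SYMBOL := 1; T[1].NEWSTATE := 1;
  T[1].NEWSYMBOL := 2; T[1].MOVE := 2;
  T[2].STATE := 1; T[2].SYMBOL := 2; T[2].NEWSTATE := 0;
  T[2].NEWSYMBOL := 1; T[2].MOVE := 1;
  for i := 1 to NTRANS do
    begin
      EMIT(T[i].STATE);     write('1');
      EMIT(T[i].SYMBOL);    write('1');
      EMIT(T[i].NEWSTATE);  write('1');
      EMIT(T[i].NEWSYMBOL); write('1');
      EMIT(T[i].MOVE);
      if i < NTRANS then write('11')
    end;
  writeln
end.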
Consider input tapes that contain the encoding of a Turing machine, followed
by some delimiter, followed by an input word. Assume there exists a Turing
machine H that, given such an encoding of an arbitrary machine and an input word,
always correctly predicts whether the Turing machine represented by that encoding
halts for that particular word. This assumption leads to a contradiction exactly as
shown in the last section for Pascal programs. We would be able to use the machine
H as a submachine in another Turing machine that halts exactly when it is not
supposed to halt, and thereby show that H cannot possibly behave properly.
We will see that the unsolvability of the halting problem will imply that it is not
decidable whether a given string will cause a Turing machine to halt and print Y. If a
word is accepted, this fact can eventually be discovered, but we cannot reliably tell
which words are rejected by an arbitrary Turing machine. If we could, we would
have an algorithm for computing the complement of any Turing-acceptable lan-
guage. In the next section, we will show that there are Turing-acceptable languages
Sec. 12.4 Turing Decidability 423
that have complements that are not Turing-acceptable, which means that a general
algorithm for computing complements cannot exist.
A problem equivalent to the halting problem involves the question of whether
an arbitrary type 0 grammar generates a given word. This can be seen to be almost the
same question as was asked of Turing machines.
read head to the leftmost symbol of x. Control then passes to the original start state
of M. In this manner, T_{M,x} accepts λ exactly when M accepts x.
This correspondence makes it possible to use the Turing machine X as a
submachine in another Turing machine X_H that solves the halting problem. That is,
given an input tape with an encoding of a machine M followed by the symbols for a
word x, X_H can be easily programmed to modify the encoding of M to produce the
encoding of T_{M,x} and leave this new encoding on the input tape before passing
control to the submachine X. X_H then accepts exactly when T_{M,x} accepts λ, which
happens exactly when M halts on input x. X_H would therefore represent an
algorithm for solving the halting problem, which we know cannot exist. The portion
of the machine that modifies the encoding of M is quite elementary, so it must be
the submachine X that cannot exist. Thus, there is no algorithm that can accomplish
the task for which X was designed, that is, determining whether an arbitrary Turing
machine T accepts the empty string.
∆
The conclusion that X was the portion of X_H that behaves improperly is akin to
the observation in the previous section that the main part of the Pascal program
TEST was valid, and hence it must be the function HALT that behaves incorrectly.
We now consider languages whose criterion for membership is related to the halting
problem. Define the language D to be those words that either are not encodings of
Turing machines or are encodings of machines that would halt with Y when pre-
sented with their own encoding on their input tape. The language D is Turing
acceptable, since a multitape machine could copy the input word to a second tape,
check whether the encoding truly represented a valid Turing machine, and then use
the "directions" on the second tape to simulate the action of the encoded machine
on the original input. The multitape machine would halt with Y if the encoding was
invalid or if the simulated machine ever accepts.
On the other hand, the complement of D is not Turing acceptable. Let U be
the set of all valid encodings of Turing machines that do not halt when fed their own
encodings. Then U = ∼D, and there does not exist a machine T for which L(T) = U.
If such a machine existed, it would have an encoding, and this leads to the same
problem encountered with the HALT function in Pascal. This encoding of T is
either a word in U or is not a word in U; both cases lead to contradictions. If the
encoding of T belongs to U, then by definition of U it does not halt when fed its own
encoding. But the assumption that L(T) = U requires that T halt with Y for all
encodings belonging to U, which means T must halt when fed its own encoding. A
similar contradiction is also reached if the encoding of T does not belong to U.
Therefore, no such Turing machine T can exist, and U is an example of a language
that is not Turing-acceptable.
∇ Theorem 12.13. If ‖Σ‖ ≥ 2, then 𝒯_Σ is not closed under complementation.
Proof. Encodings of arbitrary Turing machines can be effectively accom-
plished with only two distinct symbols in the alphabet. The Turing-acceptable
language D described above has a complement U that is not Turing-acceptable.
∆
∇ Corollary 12.2. There is a language that is Turing acceptable but not Turing
decidable. That is, ℋ_Σ ⊂ 𝒯_Σ.
Proof. Definition 12.2 implies that ℋ_Σ ⊆ 𝒯_Σ. By Theorems 12.13 and 12.14,
these two classes have different closure properties, and thus they cannot be equal.
Therefore, ℋ_Σ ⊂ 𝒯_Σ.
∆
Actually, we have already seen a language that is Turing acceptable but not
Turing decidable. D was shown to be Turing acceptable, but if D were Turing
decidable, then its complement would be Turing decidable by Theorem 12.14.
However, ∼D = U, and U is definitely not Turing decidable since it is not even
Turing acceptable.
𝒞_Σ, the context-sensitive languages, is another subclass of 𝒯_Σ. It is possible to
determine how ℋ_Σ relates to 𝒞_Σ and thereby insert ℋ_Σ into the language hierarchy.
∇ Theorem 12.15. Let Σ be an alphabet for which ‖Σ‖ ≥ 2. There is a language
that is Turing decidable but not context sensitive. That is, 𝒞_Σ ⊂ ℋ_Σ.
Proof. By Corollary 12.3, 𝒞_Σ ⊆ ℋ_Σ, and it remains to be shown that there is a
member of ℋ_Σ that is not a member of 𝒞_Σ. By the technique described earlier in
this chapter, one can construct a Turing machine that:
1. Checks if the string on the input tape represents the encoding of a valid
context-sensitive grammar.
2. Calculates the encoding of the corresponding Turing machine.
3. Simulates that Turing machine being fed its own encoding.
This process is guaranteed to halt, since the Turing machine being simulated is
known to be a halting Turing machine. Thus, L ∈ ℋ_Σ. However, if L ∈ 𝒞_Σ, we find
ourselves in a familiar dilemma. If there is a context-sensitive grammar G_L that
generates L, then this grammar would have a corresponding Turing machine T_L,
which would have an encoding x_L. If x_L did not belong to L, then by definition of L it
would be an encoding of a machine (T_L) that did not reject its own encoding (x_L).
Thus, T_L recognizes x_L, and therefore the corresponding grammar G_L must generate
x_L. But then x_L ∈ L(G_L) = L, contradicting the assumption that x_L did not belong to
L. If on the other hand x_L belongs to L, then, by definition of L, T_L must reject its
own encoding (x_L), and thus x_L ∉ L(T_L) = L(G_L) = L, which is another contradic-
tion. Thus, no such context-sensitive grammar G_L can exist, and L is not a context-
sensitive language.
∆
lies between the type 3 and type 2 languages. Corollary 12.2 and Theorem 12.15
show that ℋ_Σ properly lies between the type 1 and type 0 languages and also show
that the type 1 languages are a proper subset of the type 0 languages. The existence
of languages that are not Turing acceptable shows that 𝒯_Σ is properly contained in
℘(Σ*). A counting argument shows that proper containment of 𝒯_Σ in ℘(Σ*) also
holds even if Σ is a singleton set.
∆
The relationships between six distinct and nontrivial classes of languages are
characterized by Theorem 12.16. Each of these classes is defined by a particular
type of automaton. The trivial class of all languages, ℘(Σ*), was shown to have no
mechanical counterpart. We have seen that type 3 languages appear in many useful
applications. Program design, lexical analysis, and various engineering problems
are aided by the use of finite automata concepts. Programming languages are always
defined in such a way that they belong to the class of deterministic context-free
languages, since compilers should operate deterministically. The theory of compiler
construction builds on the material presented here; syntactic analysis, the translation
from source code to machine code, is guided by the generation of parse trees for the
sentences in the program, which in turn give meaning to the code. The type 0
languages represent the fundamental limits of mechanical computation. The concepts
presented in this text provide a foundation for the study of computational complexity
and other elements of computation theory.
EXERCISES
12.1. Verify the assertions made in the proof of Theorem 12.1 concerning Theorem 2.7.
12.2. Prove Lemma 12.1.
12.3. Given an FAD language L, the minimal DFA accepting L, and another machine B for
which L(B) = L, prove that the number of nonfinal states in the minimal machine
must be equal to or less than the number of nonfinal states in B.
12.4. Given two DFAs A_1 = <Σ, S_1, s_{01}, δ_1, F_1> and A_2 = <Σ, S_2, s_{02}, δ_2, F_2>, show that it is
decidable whether L(A_1) ⊆ L(A_2).
12.5. Given any alphabet Σ and a DFA A = <Σ, S, s_0, δ, F>, show that it is decidable
whether L(A) is cofinite. (Note: A set L is cofinite iff its complement is finite, that is,
iff Σ* − L is finite.)
12.6. Given any alphabet Σ and a DFA A = <Σ, S, s_0, δ, F>, show that it is decidable
whether L(A) contains any string of length greater than 1228.
12.7. Given any alphabet Σ and a DFA A = <Σ, S, s_0, δ, F>, show that it is decidable
whether A accepts any even-length strings.
12.8. Given any alphabet Σ and regular expressions R_1 and R_2 over Σ, show that it is
decidable whether R_1 and R_2 represent languages that are complements of each
other.
12.9. Given any alphabet Σ and regular expressions R_1 and R_2 over Σ, show that it is
decidable whether R_1 and R_2 describe any common strings.
12.10. Given any alphabet Σ and a regular expression R_1 over Σ, show that it is decidable
whether there is a DFA with less than 31 states that accepts the language described by
R_1.
12.11. Given any alphabet Σ and a regular expression R_1 over Σ, show that it is decidable
whether there is a DFA with more than 31 states that accepts the language described
by R_1. (You should be able to argue that there is a one-step algorithm that always
supplies the correct yes-no answer to this question.)
12.12. Given any alphabet Σ and a regular expression R over Σ, show that it is decidable
whether there exists an NDFA (with λ-moves) with at most one final state that accepts
the language described by R.
12.13. Given any alphabet Σ and a DFA A = <Σ, S, s_0, δ, F>, show that it is decidable
whether there exists an NDFA (without λ-moves) with at most one final state that
accepts the same language as A does.
12.14. Given any alphabet Σ and regular expressions R_1 and R_2 over Σ, show that it is
decidable whether R_1 = R_2.
12.15. Given any alphabet Σ and regular expressions R_1 and R_2 over Σ (which represent
languages L_1 and L_2, respectively), show that it is decidable whether they generate the
same right congruences (that is, whether R_{L_1} = R_{L_2}).
12.16. Prove Theorem 12.5.
12.17. Outline an efficient algorithm for computing {δ̄(s_0, y) | y ∈ Σ* ∧ n ≤ |y| < 2n} in the
proof of Theorem 12.2, and justify why your procedure always halts.
12.18. Consider intersecting the set {δ̄(s_0, y) | y ∈ Σ* ∧ 5n ≤ |y| < 6n} with F to answer the
question posed in Theorem 12.2. Would this strategy always produce the correct
answer? Justify your claims.
12.19. Show that it is decidable whether two Mealy machines are equivalent.
12.20. Show that it is decidable whether two Moore machines are equivalent.
12.21. Given any alphabet Σ and a regular expression R, show that it is decidable whether R
represents any strings of length greater than 28. Give an argument that does not
depend on finite automata or grammars.
12.22. Given any alphabet Σ and a right-linear grammar G, show that it is decidable whether
L(G) contains any string of length greater than 28. Give an argument that does not
depend on finite automata or regular expressions.
12.23. Refer to the proof of Theorem 12.6 and prove that Z_0 ⊆ Z_1 ⊆ ... ⊆ Z_n ⊆ ... ⊆ Ω.
12.24. Refer to the proof of Theorem 12.6 and prove that if (∃m ∈ ℕ)(Z_m = Z_{m+1}), then Z_m
will then represent the set of all nonterminals that can be reached from the start
symbol S.
12.25. Refer to the proof of Theorem 12.6 and prove that (∃m ∈ ℕ)(Z_m = Z_{m+1}).
12.26. (a) Refer to the proof of Theorem 12.6 and give a formal definition of W_i.
(b) Prove that W_0 ⊆ W_1 ⊆ ... ⊆ W_n ⊆ ... ⊆ Ω.
12.27. Refer to the proof of Theorem 12.6 and prove that if (∃m ∈ ℕ)(W_m = W_{m+1}), then W_m
will represent the set of all nonterminals that can produce valid terminal strings.
12.28. Refer to the proof of Theorem 12.6 and prove that (∃m ∈ ℕ)(W_m = W_{m+1}).
12.29. Let A be an arbitrary NDFA (with λ-moves). A string processed by A may successfully
find several paths through the machine; it is also possible that a string will be rejected
because there are no complete paths available.
(a) Show that it is decidable whether there exists a string with no complete path in A.
(b) Show that it is decidable whether there exists a string that has at least one path
through A that leads to a nonfinal state.
(c) Show that it is decidable whether there exists a string accepted by A for which all
complete paths lead to final states.
(d) Show that it is decidable whether all strings accepted by A have the property that
all their complete paths lead to final states.
(e) Show that it is decidable whether all strings have unique paths through A.
12.30. Given two DFAs A_1 = <Σ, S_1, s_{01}, δ_1, F_1> and A_2 = <Σ, S_2, s_{02}, δ_2, F_2>:
(a) Show that it is decidable whether there exists a homomorphism between A_1 and
A_2.
(b) Show that it is decidable whether there exists an isomorphism between A_1 and A_2.
(c) Show that it is decidable whether there exist more than three isomorphisms
between A_1 and A_2. (Note: There are examples of disconnected DFAs for which
more than three isomorphisms do exist!)
12.31. Given any alphabet Σ and a regular expression R_1 over Σ, show that it is decidable
whether R_1 describes an infinite number of strings. Do this by developing an
algorithm that does not depend on the construction of a DFA, that is, does not
depend on Theorem 12.2.
12.32. Given a Mealy machine M and a Moore machine A, show that it is decidable whether
M is equivalent to A.
12.33. Given any alphabet Σ and regular expressions R_1 and R_2 over Σ, show that it is
decidable whether the language represented by R_2 properly contains that of R_1.
12.34. It can be shown that it is decidable whether L(A) = ∅ for any NDFA A by first finding
the equivalent DFA A_d and applying Theorem 12.1 to that machine.
(a) Give an efficient method for answering this question that does not rely on the
conversion to a DFA.
(b) Give an efficient method for testing whether L(A) is infinite for any NDFA A.
Your method should likewise not rely on the conversion to a DFA.
12.35. Given a DPDA M, show that it is decidable whether L(M) is a regular set.
12.36. (a) Refer to Theorem 12.9 and outline an appropriate algorithm for determining
paths in the graphs discussed.
(b) Give the details for a more efficient recursive algorithm.
12.37. Prove that ℋ_Σ is closed under:
(a) Union
(b) Intersection
(c) Concatenation
(d) Reversal
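The flavor of Exercise 12.37(a) can be previewed with ordinary procedures standing in for halting Turing machines: if membership in L₁ and in L₂ is decided by procedures that always halt, then running both and OR-ing the answers decides L₁ ∪ L₂. The C sketch below uses two invented deciders; the composition, not the particular functions, is the point.

/* Hedged sketch of the closure idea for Exercise 12.37(a).
 * The sample deciders are hypothetical; only totality matters.
 */
#include <stdio.h>
#include <string.h>

typedef int (*decider)(const char *);        /* total: always returns 0 or 1 */

static int ends_in_a(const char *w)
{ size_t n = strlen(w); return n > 0 && w[n-1] == 'a'; }

static int has_even_length(const char *w)
{ return strlen(w) % 2 == 0; }

static int union_decider(decider d1, decider d2, const char *w)
{
    return d1(w) || d2(w);                   /* halts because d1 and d2 halt */
}

int main(void)
{
    const char *w = "abba";
    printf("%s in L1 union L2: %s\n", w,
           union_decider(ends_in_a, has_even_length, w) ? "yes" : "no");
    return 0;
}

Intersection is the same composition with a logical AND; concatenation tries every split of the input, and reversal reverses the input first, and all of these remain total procedures.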
12.38. Let L = L₂(T) for some Turing machine T that halts on all inputs. That is, let L consist
of all strings that cause T to halt with Y somewhere on the tape. Prove that there exists
a halting Turing machine T' for which L = L₂(T) = L(T'). T' must:
1. Halt on all inputs.
2. Place a Y after the input word on an otherwise blank tape for accepted words.
3. Place an N after the input word on an otherwise blank tape for rejected words.
12.39. (a) Assume there is a Turing machine M_X that determines whether an encoding of a
Turing machine T belongs to some set X. Let the class of languages recognized by
INDEX
Context-sensitive grammar (cont.)
  pure, 258
Context-sensitive language, 258-59, 386
Contracting production, 257
Converse relation, 12
Correspondence, one-to-one, 11, 94
Countable set, 15, 418
Countably infinite set, 14
Counting automaton, 328-29, 338
Cross product, 5, 151, 352
𝒟_Σ, 147
DCFL. (see Deterministic context-free languages)
Dead state, 117
Decidability, 405, 407
  of equivalence of DFAs, 410
  of equivalence of regular grammars, 412
  of emptiness
    of regular set, 407
    of context-free language, 414
  of finiteness
    of regular set, 409
    of context-free language, 414
  of membership in CSL, 416
δ̄ function. (see Extended state transition function)
δ function. (see State transition function)
DeMorgan's laws, 3, 151
Denumerable, 15
Derivation, 262, 284-85
  leftmost, 289
  rightmost, 289
Derivation tree, 285
Deterministic context-free languages, 358
  closure properties, 359-60
Deterministic finite automaton, 28
  circuit implementation, 46
  homomorphism, 93
  induced by a right congruence, 69, 71-72
  isomorphism, 87, 94
  minimization, 97
  software implementation, 32-40
Deterministic pushdown automaton, 354
Deterministic Turing machine, 366, 380
DFA. (see Deterministic finite automaton)
D flip-flop, 47, 131
Diagonalization technique, 418, 427
Directly derives, 254, 262
Disjoint sets, 7
Disjunctive normal form. (see Principle disjunctive normal form)
Distinguishable states, 88-89, 222
Distributive laws, 3, 181
DO loop lookahead, 301
Domain of a function, 8
DPDA. (see Deterministic pushdown automaton)
Duality, principle of, 3
Editors, text, 54-56
Effective closure, 151
Empty language, 41
Empty set, 41, 179
Empty stack criteria for PDA acceptance, 330, 335
Empty string, 27
Empty word, 27
Encoding of states and alphabets, 47-52
End-of-file (EOF) packet, 57-58
End-of-string <EOS>, 47-49, 131
End markers for LBA, 384
Enumerate, 15
Enumeration of move sequences, 381
EOF packet, 57-58
<EOS>, 47-49, 131
ε, 27, 179
ε-move, 134
Equality:
  of sets, 4
  of strings, 26
Equation system, 187
  algebraic, 185, 188
  derived from automaton, 191, 200
  derived from grammars, 269-70
  solution of, 185, 189, 198, 199
Equipotent, 13
Equivalence. (see Equivalent)
Equivalence class, 6, 66
Equivalence relation, 5, 65
  rank of, 68
  right congruence, 65-68
  between states (see State equivalence relation)
Equivalent CFG corresponding to PDA, 346
Equivalent CSG corresponding to a strict LBA, 395
Equivalent DFA corresponding to an NDFA, 124-25, 130-31
Equivalent finite automata, 74, 97, 123, 127
Equivalent finite-state transducers, 217-23, 228-31, 232-36
Equivalent FST corresponding to a MSM, 228
Equivalent grammars, 263
Equivalent logical expressions, 2, 16
Equivalent MSM corresponding to a FST, 229-30
Equivalent NDFA without lambda-moves, 136
Equivalent PDA corresponding to CFG, 339-40
Equivalent pushdown automata, 336, 338, 349, 351
Equivalent regular expressions, 181
Equivalent representations, 264
Equivalent states, 88-89, 222
Equivalent Turing machine corresponding to a type 0 grammar, 391
Equivalent type 0 grammar corresponding to a Turing machine, 389
Evaluation of computer performance, 56-57
Existential quantifier, 4
Extended output function, 214, 227
Extended state transition function, 33, 120, 136, 214, 227
  C implementation, 40
  Pascal implementation, 35
ℱ_Σ, 339
Factorial function, 17
FAD. (see Finite automaton definable language)
Fetching instructions, 56
Final state, 30, 119, 211, 329
  criteria for PDA acceptance, 335
Finite automata, equivalent, 74, 97, 123, 127
Finite automaton, 23, 28
  C implementation, 40
  derived from regular expression, 182-84
  deterministic, 28, 30
  minimal deterministic, 87
  nondeterministic, 119
  Pascal implementation, 35
Finite automaton definable language, 41
  closure properties, 148-69
Finite rank, of a relation, 68, 70, 80
Finite set, 14, 170, 209, 409, 416
Finite state control, 28-29, 211, 366
Finite-state transducer, 211
  circuit implementation, 242-44
  homomorphism, 218, 232
  isomorphism, 218, 232
  minimization, 217
Finite transducer definable, 216
Flip-flop:
  D, 47, 131
  SR, 131
∀, 4
Formula, statement, 1, 16-17
FORTRAN identifier grammar, 253-54
FORTRAN identifier language, 41-42
FORTRAN lookahead problem, 301
FST. (see Finite-state transducer)
FTD. (see Finite transducer definable)
Function, 8
  codomain, 8
  composition, 11
  characteristic, 9
  domain, 8
  factorial, 17
  one-to-one, 10
  onto, 10
  homomorphism, 93, 218, 232
  identity, 13, 65, 81, 165, 203
  injective, 10
  inverse, 12
  isomorphism, 94, 218, 232
  output, 211
    extended, 214, 227
  recursive, 17, 33, 374
  range, 10
  state transition, 30, 33, 119, 211, 366, 384
    extended, 33, 120, 136, 214, 227
  surjective, 10
  translation, for transducer, 215
  well-defined, 8, 69, 96
𝒢_Σ, 264
Garbage state, 117
Gates, logic, 2, 46
GCD. (see Greatest common divisor)
Generation of a language, 257
Generative device, 257
GNF. (see Greibach normal form)
Gödel, K., 374
Grammar, 253
  ambiguous, 290
  context-free, 260
  context-sensitive, 258
  decidable questions, 412-16
  hierarchy, 261, 401, 427
  left-linear, 267
  LL(k), 301, 356-57
  pure, 258, 260
  regular, 267, 269
  right-linear, 262
  unambiguous, 290
  unrestricted, 256
Graph, 31, 416
Greibach normal form, 310
ℋ_Σ, 425
Halting problem, 417
  in C, 421
  in Pascal, 419
  for Turing machines, 374
Halt state, 366
Head:
  multiple, 377-78
  read, 28, 210
  read/write, 330, 365
  write, 210
Head recursion, 58-59
Height of a parse tree, 316
Hierarchy:
  of grammars, 253
  of languages, 261, 401, 427
Homomorphism:
  between deterministic finite automata, 93
  between finite-state transducers, 218
  between Moore sequential machines, 232
  language, 163
    closure properties, 164-65, 320, 359-60, 399-400
Identity element, 180
Identity function, 13, 65, 81, 165, 203
Identity relation, 5, 91
Ill-defined. (see Well-defined)
Implementation:
  of deterministic finite automata:
    with hardware, 46
    with software, 32-40
  of nondeterministic finite automata, 131
  of nondeterministic finite automata with lambda-moves, 139-40
  of finite-state transducers, 242-44
Implies, 4
Induction, 15, 33-34, 126
  strong, 21
Inductive step, 16
Infinite automata, 80
Infinite set, 14, 409, 416
Infix language, 84
Inherently ambiguous language, 296
Initial set, 68, 90, 115, 199
Initial state, 30, 119, 211
Injective function, 10
Input alphabet, 30, 119, 211, 329
Input tape, 28
Instance of a question, 407
Integer, 4
Intersection:
  of two languages, 151-53, 163, 322, 359, 399-400
  with a regular set, 352-53
Inverse function, 12
Inverse homomorphism, 165-67
Isomorphic automata, 95-97, 219
Isomorphism:
  between deterministic finite automata, 87, 94
  between finite-state transducers, 218
  between Moore sequential machines, 232
ith partial state equivalence relation, 103, 224, 236
ith partial state set relation, 107
Kermit protocol, 57-58, 237-38
Kleene closure, 158, 274, 321, 359-60, 400
ℒ_Σ, 386
λ. (see Empty string)
λ-calculus, 374
λ-closure, 135
λ-move, 134
λ-transition, 134
Language, 37
  accepted by a deterministic finite automaton, 41
  accepted by a nondeterministic finite automaton, 122
  accepted by a pushdown automaton:
    via empty stack, 335
    via final state, 335
  ambiguous, 290
  cofinite, 170
  context-free, 260, 284
  context-sensitive, 257
  equations, 185, 269-74
  FAD, 41
  finite, 170, 209
  generation of, 257
Principle disjunctive normal form, 2, 50
Procedure, 86, 405 (see Algorithm)
Production, 18, 254-55
  λ-rule, 261
  contracting, 257
  nongenerative, 306
  unit, 306
  useful, 302
  useless, 302
Programming language, 32-40, 56, 60, 63, 65, 138, 163, 355, 417-22
Protocol, Kermit, 57-58, 237-38
Pumping lemma, 75
Pumping theorem, 315
Pure:
  Chomsky normal form, 308
  context-free grammar, 260
  context-sensitive grammar, 258
  Greibach normal form, 310
Pushdown automaton, 327
  closure properties, 352-60
  configuration, 335
  decidable questions, 416
  deterministic, 354-60
  equivalence with context-free grammars, 339, 346
  equivalence with other PDAs, 336, 338, 349, 351
  nondeterministic, 329
  two-stack, 352
Pushdown stack, 328-29
Push onto stack, 328
Quadruple, 256, 258, 260, 262, 267
Question, 407
Quintuple, 30, 119, 366
Quotient:
  of two languages, 169-70
  with a regular language, 169
ℛ_Σ, 182
Range of a function, 10
Rank of a relation, 68
Rational number, 4
Read head, 28, 210
Read/write head, 330, 365
Real number, 4
Recognizer. (see Finite automaton)
Recursion, 17, 58-59
  left, 310
    elimination of, 311-15
Recursive function, 17, 33, 374
Reduced deterministic finite automaton, 91
Reduced machine, 91, 222
Reduced transducer, 222
Reduced version
  of a DFA, 100
  of a transducer, 222, 235
Refinement, 7-8, 71, 104, 224-25
Reflexive relation, 5, 65-66
Reflexive and transitive closure, 255, 262, 335, 370
Regular expression, 179
  identities, 181
Regular expression grammar, 254-55, 286-87, 340, 345
  with unique delimiters, 355
  deterministic, 356-57
Regular grammar, 267, 269 (see Right-linear grammar, Left-linear grammar)
Regular language, 201 (see Regular set)
Regular set, 179
  closure properties, 201-3
  decidability problems, 405-13
  derived from DFA, 184
Reject, 31, 35, 122
Relabelling of states, 91 (see Homomorphism, Isomorphism)
Relation, 5
  converse, 12
  equivalence, 5, 65
  identity, 5, 91
  induced by a language, 67
  induced by a machine, 68
  refinement, 7-8, 71, 104, 224-25
  reflexive, 5, 65-66
  rank, 68
  right congruence, 66
  state equivalence, 88-89, 222
    ith partial, 103, 224, 236
  symmetric, 5, 65-66
  transitive, 5, 65-66
Relatively prime language, 78
Replacement rule. (see Production)
Reset circuitry, 51
Reverse:
  of a language, 128-29, 172, 206-7
  of a string, 82, 172
  tape processing for addition, 239
Right congruence, 65
  corresponding to a DFA, 71-72
  induced by a language, 67
  induced by a DFA, 68
Right-linear grammar, 261
  constructed from a DFA, 264
  equivalence of left-linear grammar, 269
  yielding a NDFA, 266
Right-linear set equations, 185-99
Rightmost derivation, 289
Roman numeral language, 64
Sack and stone automaton, 328
Scientific notation language, 43-46
Semantics, 287
Sentence, 25
Septuple, 329
Sequential machine. (see Finite-state transducer)
Set, 4
  cardinality, 13
  countable, 15, 418
  denumerable, 15
  empty, 41, 179
  equations (see Language equations)
  finite, 14
  infinite, 14
  uncountable, 15, 418
Sextuple, 211, 226
Ship transmission example, 123
Σ. (see Alphabet)
Σᵏ, 27
Σ⁺, 27-28
Σ*, 27-28
Simulating machine behavior, 376-81
<SOS>, 51, 131
Stack, 328-29
  alphabet, 329
  bottom symbol, 329
Start-of-string <SOS>, 51, 131
Start state, 30, 119, 211, 329, 366, 384
Start symbol, 255
State:
  accessible, 87
  accepting, 23
  active, 122, 131
  dead, 117
  disconnected, 88
  distinguishable, 88-89, 222
  final, 30, 119, 211, 329
  garbage, 117
  halt, 366
  inaccessible, 88
  initial, 30, 119, 211
  reachable, 87
  start, 30, 119, 211, 329, 366, 384
  unreachable, 88
State equivalence relation:
  decidability application, 410
  for DFAs, 88-89
  for transducers, 222
  partial, 103, 224, 236
Statement, logical, 1
State transition diagram, 31-32
State transition function, 30, 33, 119, 211, 366, 384
  extended, 33, 120, 136, 214, 227
State transition table, 31-32
Stone and sack automaton, 328
String, 25
  concatenation, 25
  empty, 27
  matching, 55-56, 123, 129, 130
  reverse of, 82
Submachine, 368, 372
Subset, 4
  proper, 5
Substitution:
  regular set, 201
    closure properties, 202-3, 320, 359-60, 399-400
  context-free language, 319
Substring, 27
Subtraction grammar, 291-96, 319
Suffix language, 172
Surjective function, 10
Symbol, 24, 28
  blank, 366
  buffer, 349
  end markers for LBA, 384
  nonterminal, 255
  terminal, 255
Symmetric relation, 5, 65-66
Syntax, 18
  correctness, 287
  diagrams, 18
𝒯_Σ, 381
Tail recursion, 33, 58-59
Tape:
  auxiliary, 330
  blank, 366
  input, 28
  multitrack, 376-77
  output, 211
  stack, 328
  two-dimensional, 379
Tape head, 28, 210, 330
Terminal set, 90, 115, 187, 191, 199
Terminal symbol, 255
∃, 4
Top of a stack, 328
Traffic signal emulation, 240-42, 248-49
Transducer. (see Finite-state transducer, Moore sequential machine)
Transition function. (see State transition function)
Transitive relation, 5, 65-66
Translation function for transducer, 215
Tree. (see Derivation tree, Parse trees)
Truth tables, 1-2, 50, 53, 132, 243-44
Turing-acceptable language, 257, 370
  closure properties, 399-401
Turing-decidable language, 424-25
Turing, A., 374
Turing machine, 366
  acceptance criteria, 370-71
  blank symbol, 366
  bounded:
    on one end, 376
    on both ends (see Linear-bounded automaton)
  configuration, 370
  corresponding grammar, 389
  deterministic, 366
  encoding, 422
  halt state, 366
  halting problem, 374
  linearly bounded, 376, 384
    strict, 392
  moves, 366-67
  multihead, 377-78
  multitape, 379
  multitrack, 376-77
  nondeterministic, 380
  submachines, 368, 372
  two-dimensional, 379
  undecidable problems, 422
Turing's World, 24, 367, 372, 375
Type 0 grammar. (see Unrestricted grammar)
Type 0 language. (see Turing-acceptable language)
Type 1 grammar. (see Context-sensitive grammar)
Type 1 language. (see Context-sensitive language)
Type 2 grammar. (see Context-free grammar)
Type 2 language. (see Context-free language)
Type 3 grammar. (see Regular grammar)
Type 3 language. (see Regular set)
𝒰_Σ, 296
Unambiguous context-free grammar, 296
Unary operator, 147
Uncountable, 15, 418
Undecidable problems:
  Pascal halting problem, 417-22
  Turing machine halting problem, 422
  word acceptance by a Turing machine, 423-24
  word generation in a type 0 grammar, 423
Union, 148-50, 274, 321, 359, 399-400
Unique minimum-state machine, 102, 225, 236
Unit closure, 306
Unit production, 306
Universal quantifier, 4
UNIX, 138
Unrestricted grammar, 256
Unsolvable problem. (see Undecidable problems)
Useful nonterminal, 302
Useful production, 302
Useless nonterminal, 302
Useless production, 302
Vending machine, 29, 54, 212
𝒲_Σ, 162
Well-defined function, 8, 69, 96
Word. (see String)
Write head, 210
𝒴_Σ, 325
Yield in one step, 254, 262
Yield in k steps, 255, 262
𝒵_Σ, 381