FA Equals Regular Express
Kleene's Theorem
In 1956 Kleene proved that any language that can be defined by one of three methods
(a regular expression, a finite automaton, or a transition graph) can be defined by the
other two methods as well. This theorem is the most important result in the theory of
finite automata.
Our text spends a lot of time on the proof of this theorem. The proof is structured as
follows:
The circular nature of this constructive proof provides us with algorithms to change an
example of one of these three structures into either of the other two. Here is a sketch of
the proof:
Part 1. Every finite automaton can be turned into a transition graph. This part is easy
because every finite automaton is already a transition graph.
Part 2. Every transition graph can be turned into a regular expression. A satisfactory
algorithm must work in every case and finish in a finite amount of time. In the next
section of these notes there is an algorithm called NFA to regular expression. An NFA is
a nondeterministic finite automaton. The only difference between an NFA and a TG is
that an NFA allows only a single character per transition. We can convert a TG to an NFA
by adding extra states for any transition on which we find a string of length 2 or more,
then use the NFA to regular expression algorithm to turn the transition graph into a
regular expression.
Part 3. Every regular expression can be turned into a finite automaton. In the next
section there is an algorithm to turn a regular expression into an NFA and another
algorithm to remove the nondeterminism from an NFA and turn it into a DFA
(deterministic finite automaton), the kind of finite automaton we are already familiar with.
Nondeterminism
Algorithms
NFA to DFA
2. Make a transition table for the NFA: Label the rows of the table with the names of
the states, and the columns with the characters of the alphabet. As the entry for
cell (s,c), list the states that can be reached from state s by consuming character
c. Also include any states that can be reached from state s by using Λ-transitions
after consuming character c.
[Figure: example NFA with states 1 through 8 over the alphabet {a, b}, and its transition table.]
3. Determine the start state of the FA by combining the start state of the NFA with
any states reachable from this start state using Λ-transitions. On the above
example, the start state of the FA should be (1,2,6,8) since the start state of the
NFA is 1 and from state 1 you can reach states 2, 6, and 8 on Λ-transitions.
4. Make a transition table for the FA one row at a time. Label the columns of this
table with the characters of the alphabet and label row #1 with the start state
determined in step 3.
Determine the entries for row r as follows: For each character c of the alphabet
and for each individual state s in the label of row r, add the entries in cell (s,c) of
the NFA transition table to cell (r,c) in the FA transition table, except do not
include any entry more than once.
Once row r is completed, let the entries in each cell be the name of a state in the
FA. If that state has not been previously listed in the table, add a new row with
that state as label.
Continue this process until no new states are added.
5. Draw the FA, making every state whose name contains the name of an original
NFA accept state an accept state.
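The steps above can be sketched in Python as follows. This is a minimal sketch, not the notes' own worked example: the dictionary encoding of the NFA, the empty string standing for a Λ label, the helper names, and the small four-state NFA (strings over {a, b} that end in ab) are all assumptions made for illustration.

```python
from itertools import chain

LAMBDA = ""  # assumption of this sketch: the empty string labels a Λ-transition

def closure(nfa, states):
    """Return the given states plus everything reachable on Λ-transitions."""
    stack, seen = list(states), set(states)
    while stack:
        s = stack.pop()
        for t in nfa.get((s, LAMBDA), ()):
            if t not in seen:
                seen.add(t)
                stack.append(t)
    return frozenset(seen)

def nfa_to_dfa(nfa, start, accepts, alphabet):
    """Subset construction following steps 2 to 5 above."""
    # Step 2: NFA table; cell (s, c) lists states reached by c, then by Λ-moves.
    states = {s for (s, _) in nfa} | set(chain.from_iterable(nfa.values())) | {start}
    table = {(s, c): closure(nfa, nfa.get((s, c), ()))
             for s in states for c in alphabet}
    # Step 3: the DFA start state is the NFA start state plus its Λ-closure.
    dfa_start = closure(nfa, {start})
    # Step 4: fill the DFA table one row at a time until no new rows appear.
    dfa, todo = {}, [dfa_start]
    while todo:
        row = todo.pop()
        if row in dfa:
            continue
        dfa[row] = {}
        for c in alphabet:
            cell = frozenset(chain.from_iterable(table[(s, c)] for s in row))
            dfa[row][c] = cell
            if cell not in dfa:
                todo.append(cell)
    # Step 5: a DFA state accepts if its name contains an original accept state.
    dfa_accepts = {row for row in dfa if row & set(accepts)}
    return dfa, dfa_start, dfa_accepts

def accepts_string(dfa, dfa_start, dfa_accepts, text):
    """Run the resulting DFA on text."""
    state = dfa_start
    for ch in text:
        state = dfa[state][ch]
    return state in dfa_accepts

# Hypothetical NFA: 1 -Λ-> 2, 2 loops on a and b, 2 -a-> 3, 3 -b-> 4 (accept).
nfa = {(1, LAMBDA): {2}, (2, "a"): {2, 3}, (2, "b"): {2}, (3, "b"): {4}}
dfa, dfa_start, dfa_accepts = nfa_to_dfa(nfa, 1, {4}, "ab")
```

An empty cell in the DFA table becomes the empty set, which plays the role of the trap state if it is ever reached.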
NFA to Regular Expression
1. Add a new start state and add an edge labeled Λ from this new start state to
each of the original start states. The original start states are no longer start
states.
2. Add a new accept state and Λ-transitions from the original accept states to this
new accept state. The original accept states are no longer accept states. The
new accept state must be different from the new start state.
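3. Combine any edges that exit and enter the same states: replace parallel edges
labeled r1 and r2 with a single edge labeled r1 + r2.
4. Eliminate any states, other than the new accept state, that have no edges going
out to other states. Eliminate any edges to these states as well.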
5. Repeat until the only remaining states are the start state and the accept state:
Select a state s to be eliminated. For each entering edge (r,s) and for each
leaving edge (s,t) create a bypass edge (r,t). If there is a self-loop (s,s) on state s
then label edge (r,t) with the concatenation of the label on (r,s) with the *-closure
of the label on (s,s) and with the label on (s,t). Else, label edge (r,t) with the
concatenation of the label on (r,s) with the label on (s,t). Eliminate edge (r,s).
Once all entering edges to s have been eliminated, eliminate state s and all its
leaving edges. Combine any edges that exit and enter the same states as in step
3.
6. Combine all edges from the start state to the accept state as in step 3. The
regular expression on the resulting edge is the output of this algorithm.
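The elimination procedure above can be sketched in Python. This is only an illustration under stated assumptions: edges are given as (from, to, label) triples, the names START and ACCEPT are reserved for the two new states, states are eliminated in sorted order, and the helper `matches` (which translates the notes' + notation into Python's `re` syntax) exists only to check the result. Step 4 needs no separate code here because a state with no leaving edges produces no bypass edges and simply disappears.

```python
import re

LAM = "Λ"  # label of a Λ-edge in this sketch

def concat(*parts):
    """Concatenate labels, dropping Λ factors."""
    parts = [p for p in parts if p != LAM]
    return "".join(parts) if parts else LAM

def union(x, y):
    """Combine two parallel labels as in step 3; always parenthesized."""
    return x if x == y else f"({x} + {y})"

def star(x):
    return LAM if x == LAM else f"({x})*"

def add_edge(edges, r, t, label):
    edges[(r, t)] = union(edges[(r, t)], label) if (r, t) in edges else label

def nfa_to_regex(edge_list, starts, accepts):
    edges = {}
    for r, t, label in edge_list:          # step 3: merge parallel edges
        add_edge(edges, r, t, label)
    for s in starts:                       # step 1: new start state
        add_edge(edges, "START", s, LAM)
    for s in accepts:                      # step 2: new accept state
        add_edge(edges, s, "ACCEPT", LAM)
    inner = {x for pair in edges for x in pair} - {"START", "ACCEPT"}
    for s in sorted(inner, key=str):       # step 5: eliminate one state at a time
        loop = edges.pop((s, s), None)
        ins = [(r, lab) for (r, t), lab in edges.items() if t == s]
        outs = [(t, lab) for (r, t), lab in edges.items() if r == s]
        for r, _ in ins:
            del edges[(r, s)]
        for t, _ in outs:
            del edges[(s, t)]
        middle = star(loop) if loop is not None else LAM
        for r, label_in in ins:
            for t, label_out in outs:
                add_edge(edges, r, t, concat(label_in, middle, label_out))
    return edges.get(("START", "ACCEPT"))  # step 6; None means no string is accepted

def matches(regex, text):
    """Check text against a regex in the notes' notation using Python's re."""
    pattern = regex.replace(" + ", "|").replace(LAM, "")
    return re.fullmatch(pattern, text) is not None

# Hypothetical machine: strings over {a, b} with an odd number of b's.
example_edges = [(1, 1, "a"), (1, 2, "b"), (2, 2, "a"), (2, 1, "b")]
odd_bs = nfa_to_regex(example_edges, starts=[1], accepts=[2])
```

The expression produced is correct but not simplified; a different elimination order generally gives a different, equivalent expression.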
Regular Expression to NFA
1. Draw a start state and an accept state and connect them with an edge labeled
with the regular expression given.
2. Recursively apply the following rules until each edge of the finite automaton
contains a single character of the alphabet. Apply the rules to an edge in the
order shown:
a. If an edge is labeled with an expression that consists of two or more
subexpressions "or-ed" together, add new edges so that each
subexpression is on an edge of its own. However, if each subexpression
consists of a single character of the alphabet, you may simply change the
+'s in the expression into commas.
Minimizing a DFA
1. Add a trap state to the machine if it contains an implied trap state and add all the
edges that go to the trap state.
2. Create a stair-step diagram like the one shown below. Suppose the states of the
machine are named 1..n. Then label the rows of the diagram 2..n and the
columns 1..n-1 (label the columns along the bottom). Note that there is exactly one cell
in the diagram for each possible pair of distinct states. The reduction of the machine is
achieved by combining any states that serve the same purpose in the
machine, and the cells of the diagram will be marked according to whether that
pair of states can be combined or not.
3. Draw an X through any cell that pairs an accept state with a non-accept state.
4. Recursively consider the individual cells of the diagram until every cell contains
either a checkmark or an X. For cell (i,j) do the following:
If states i and j have transitions going to the same states for each character of
the alphabet, put a checkmark in cell (i,j).
Else if for some character of the alphabet state i has a transition to an accept
state while state j has a transition to a non-accept state, or vice versa, put an X in
cell (i,j).
Else consider each character of the alphabet separately. If state i has a transition
to state m and state j has a transition to state n for character c, then write m,n in
cell (i,j). Write such a pair for any character for which states i and j have
transitions to two different states. If at some point in the processing all of the cells
representing the different pairs written in cell (i,j) contain checkmarks, then put a
checkmark in cell (i,j). If at some point in the processing one of the cells
representing a pair in cell (i,j) contains an X, then put an X in cell (i,j).
Suppose cell (3,4) contains a checkmark. Then the minimized FA will contain one
state named (3,4) instead of having the two different states 3 and 4. All of the
transitions into the original state 3 and all of the transitions into the original state
4 now come into state (3,4), and the same for the transitions that left from the
original states. A transition from the original 3 to the original 4, or vice versa, will
now be a self-loop on state (3,4).
Suppose cells (3,4), (4,5) and (3,5) contain checkmarks. Then the minimized FA
will contain one state named (3,4,5) instead of having the three different states 3,
4, and 5. The same rules about transitions apply to the combination of three or
more states as applied to a combination of two states.
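The stair-step procedure can be sketched in Python as a marking loop. This is a hedged sketch rather than a literal transcription: instead of writing pairs into cells, it repeatedly marks an X on any pair that leads to an already marked pair, and treats every cell that never receives an X as a checkmark. The three-state machine, the function names, and the dictionary encoding are assumptions made for illustration.

```python
from itertools import combinations

def distinguishable_pairs(states, alphabet, delta, accepts):
    """Return the set of cells (as frozensets) that end up holding an X."""
    # Step 3: X for every accept / non-accept pair.
    marked = {frozenset(p) for p in combinations(states, 2)
              if (p[0] in accepts) != (p[1] in accepts)}
    # Step 4: keep passing over the cells until nothing changes.
    changed = True
    while changed:
        changed = False
        for i, j in combinations(states, 2):
            pair = frozenset((i, j))
            if pair in marked:
                continue
            for c in alphabet:
                target = frozenset((delta[i][c], delta[j][c]))
                if len(target) == 2 and target in marked:
                    marked.add(pair)
                    changed = True
                    break
    return marked

def merge_states(states, alphabet, delta, accepts):
    """Group states whose cells hold checkmarks, e.g. (3,4) or (3,4,5)."""
    marked = distinguishable_pairs(states, alphabet, delta, accepts)
    groups = []
    for s in states:
        for g in groups:
            if frozenset((s, g[0])) not in marked:
                g.append(s)
                break
        else:
            groups.append([s])
    return [tuple(g) for g in groups]

# Hypothetical DFA for "ends in a": states 1 and 3 do the same job.
example_states = [1, 2, 3]
example_delta = {1: {"a": 2, "b": 3}, 2: {"a": 2, "b": 1}, 3: {"a": 2, "b": 1}}
merged = merge_states(example_states, "ab", example_delta, accepts={2})
```

Treating unmarked cells as checkmarks also settles cells whose written pairs refer back to each other, which the hand method can leave undecided.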
Minimizing a DFA
[Figure: example DFA with states 0 through 5 over the alphabet {a, b}.]
1. Add a trap state to the machine if it contains an implied trap state and add all the
edges that go to the trap state.
2. Divide the set of states into two subsets: accept states and non-accept states.
Write this partition as line 1. In the example, line 1 is {0, 1, 3, 5} {2, 4}.
3. Line 2 will be another partitioning of the original set. Two states can remain in the
same subset only if the transitions leaving those states (on each letter of the
alphabet) both go to states that are in the same subset on the previous line.
a. Take the first subset above, {0, 1, 3, 5}. Remove the 0 and place it in a
subset of its own. Then take the 1 and look to see if it can be placed in
the subset with the 0 or if it has to go to a different subset. On the letter a,
0 goes to 1 while 1 goes to 2. States 1 and 2 are in different subsets on
line 1, so we cannot place 1 with 0. Put 1 in a subset of its own.
b. Now take 3. On the letter a, 0 goes to 1 while 3 goes to 4. 1 and 4 are in
different subsets on line 1, so we cannot place 3 with 0. Can we place it
with 1? On the character a, 1 goes to 2 and 3 goes to 4. Since 2 and 4
are in the same subset on line 1, we're good so far. On the character b, 1
goes to 3 and 3 goes to 1. Since 1 and 3 are also in the same subset, we
can put 3 into 1's subset. Now take 5 and see if we can put it with 0. On
the character a, 0 goes to 1 and 5 goes to 5. Since 1 and 5 are in the
same subset on line 1, we may be able to put 5 with 0. On the character
b, 0 goes to 3 and 5 goes to 5. Since 3 and 5 are also in the same subset
on line 1, 5 can be placed with 0. Thus we have
broken {0, 1, 3, 5} into two subsets: {0, 5} {1, 3}
c. Now see if 2 and 4 can stay together or if they have to be broken apart.
On the letter a, 2 goes to 5 and 4 goes to 5. On the letter b, 2 goes to 2
and 4 goes to 4. Since 2 and 4 are in the same subset on line 1, we can
leave 2 and 4 in the same subset. Thus line 2 is {0, 5} {1,3} {2, 4}.
4. We continue this process until no change occurs from one line to the next. Then
we make one state for each of the subsets in the final partitioning. In the above
example we end up with {0} {1,3} {2,4} {5}.
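The partitioning method can be sketched in Python as below. The transitions are the ones quoted in the walkthrough above; taking states 2 and 4 as the accept states is an assumption (the walkthrough only says they form the second subset on line 1), and the function names are hypothetical.

```python
def refine(partition, alphabet, delta):
    """Produce the next line: split each subset by where its states go."""
    block_of = {s: i for i, block in enumerate(partition) for s in block}
    new_partition = []
    for block in partition:
        groups = {}
        for s in block:
            # Two states stay together only if, on every letter, they go to
            # states that share a subset on the previous line.
            key = tuple(block_of[delta[s][c]] for c in alphabet)
            groups.setdefault(key, []).append(s)
        new_partition.extend(groups.values())
    return new_partition

def partition_lines(states, alphabet, delta, accepts):
    """Return every line of the process, from line 1 to the final partition."""
    line1 = [[s for s in states if s not in accepts],
             [s for s in states if s in accepts]]
    lines = [[block for block in line1 if block]]
    while True:
        nxt = refine(lines[-1], alphabet, delta)
        if len(nxt) == len(lines[-1]):   # no subset was split: stop
            return lines
        lines.append(nxt)

# Transitions taken from the walkthrough; accept states {2, 4} are assumed.
delta = {
    0: {"a": 1, "b": 3}, 1: {"a": 2, "b": 3}, 2: {"a": 5, "b": 2},
    3: {"a": 4, "b": 1}, 4: {"a": 5, "b": 4}, 5: {"a": 5, "b": 5},
}
lines = partition_lines([0, 1, 2, 3, 4, 5], "ab", delta, accepts={2, 4})
```

Because a refinement step can only split subsets, comparing the number of subsets on consecutive lines is enough to detect that nothing changed.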
Moore Machines
A Moore machine is like a finite automaton except for the following differences. In a
Moore machine there are two alphabets: an input alphabet and an output alphabet. The
two alphabets may be the same but they do not have to be. Another difference is that
there are no accept states in a Moore machine. Its purpose is not to answer yes or no,
or to accept or reject a string. It is not a language recognizer; it is an output producer.
Each state of a Moore machine produces a one-character output immediately upon the
machine's entry into that state. At the beginning, the start state produces an output
before any input has been read. Thus the output of a Moore machine is one character
longer than its input.
We draw Moore machines in the same way as finite automata but the label in a state is
composed both of the name of the state and the output character that the state
produces. Run the string abab through the following machine and you will find that the
output produced is 10010.
The following Moore machine might be considered a "counting" machine. The output
produced by the machine contains a 1 for each occurrence of the substring aab found in
the input string.
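A Moore machine of this kind can be simulated in a few lines of Python. The four-state machine below is a reconstruction written for this sketch, not a copy of the figure: the state names, the 0/1 output alphabet, and the transition table are assumptions chosen so that the machine emits a 1 exactly when it has just read aab.

```python
def run_moore(delta, output, start, text):
    """Return the output string; the start state prints before any input is read."""
    state = start
    out = [output[state]]
    for ch in text:
        state = delta[state][ch]
        out.append(output[state])   # each state prints on entry
    return "".join(out)

# Hypothetical "counting" machine for the substring aab.
AAB_DELTA = {
    "none": {"a": "a",  "b": "none"},
    "a":    {"a": "aa", "b": "none"},
    "aa":   {"a": "aa", "b": "aab"},
    "aab":  {"a": "a",  "b": "none"},
}
AAB_OUTPUT = {"none": "0", "a": "0", "aa": "0", "aab": "1"}

result = run_moore(AAB_DELTA, AAB_OUTPUT, "none", "aabaab")
occurrences = result.count("1")
```

The output is always one character longer than the input because of the character printed by the start state.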
Mealy Machines
Example:
The following Mealy machine takes the one's complement of its binary input. In other
words, it flips each digit from a 0 to a 1 or from a 1 to a 0.
The next Mealy machine from page 154 of the text increments its binary input. The only
rather disconcerting characteristic of the machine is that we must feed the input number
backwards and the machine produces its output backwards. It also does not work
correctly if the input string consists completely of 1's. In that case every output
bit is 0, because the final carry is lost.
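Both machines can be sketched in Python. The tables below are reconstructions made for this sketch and may differ from the figures in the text: the state names are hypothetical, and the incrementer is written with two states (still carrying, done carrying). The wrapper `increment` reverses the input and the output so the caller can write numbers in the usual order.

```python
def run_mealy(delta, start, text):
    """Each transition yields (next state, output character)."""
    state, out = start, []
    for ch in text:
        state, symbol = delta[state][ch]
        out.append(symbol)
    return "".join(out)

# One's complement: a single state that flips every bit.
COMPLEMENT = {"q": {"0": ("q", "1"), "1": ("q", "0")}}

# Increment, reading the least significant bit first.
INCREMENT = {
    "carry": {"0": ("done", "1"), "1": ("carry", "0")},
    "done":  {"0": ("done", "0"), "1": ("done", "1")},
}

def increment(bits):
    """Feed the number backwards and reverse the backwards output."""
    return run_mealy(INCREMENT, "carry", bits[::-1])[::-1]

flipped = run_mealy(COMPLEMENT, "q", "0110")
twelve = increment("1011")      # 11 + 1
overflow = increment("111")     # all 1's: the carry has nowhere to go
```

The output has the same length as the input, which is why an input of all 1's cannot produce the extra leading 1.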
Here is an example of a Mealy machine that reports on the parity of each 4-bit substring
in its input. For each of the first 3 bits of each 4-bit substring, the machine outputs a 0. If
a 4-bit substring contains an even number of 1's then the machine outputs a 0 on the 4th
bit of that 4-bit substring, otherwise it outputs a 1.
Although Moore and Mealy machines do not accept or reject their input strings, they do
yield information about their input through the output that they produce. Here is a Mealy
machine to count the number of occurrences of aa or bb. It produces a 1 each time it
finds that it has just seen a double letter.
When we talk about equivalence of two Moore machines or two Mealy machines we
mean that, given the same input, they produce the same output. Since a Moore machine
outputs the symbol associated with its start state before it begins processing its input, its
output is always one longer than its input. The output of a Mealy machine is always the
same length as its input. Therefore a Moore machine cannot be equivalent to a Mealy
machine in the above sense. We say that a Moore machine is equivalent to a Mealy
machine if, given the same input, the output of the Moore machine after removing the
first character is the same as the output of the Mealy machine.
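This definition can be checked mechanically on sample inputs. The sketch below compares a hypothetical Moore machine and a hypothetical Mealy machine that both take the one's complement of a binary string; the machines, their state names, and the helper `equivalent_on` are assumptions made for illustration, and agreement on a finite set of inputs is evidence of equivalence, not a proof.

```python
from itertools import product

def run_moore(delta, output, start, text):
    state = start
    out = [output[state]]           # printed before any input is read
    for ch in text:
        state = delta[state][ch]
        out.append(output[state])
    return "".join(out)

def run_mealy(delta, start, text):
    state, out = start, []
    for ch in text:
        state, symbol = delta[state][ch]
        out.append(symbol)
    return "".join(out)

def equivalent_on(moore, mealy, text):
    """Compare the Moore output, minus its first character, with the Mealy output."""
    return run_moore(*moore, text)[1:] == run_mealy(*mealy, text)

# Hypothetical Moore complementer: the state entered remembers the bit to print.
step = {"0": "one", "1": "zero"}
MOORE = ({"s": step, "one": step, "zero": step},
         {"s": "0", "one": "1", "zero": "0"},
         "s")
# Hypothetical Mealy complementer with a single state.
MEALY = ({"q": {"0": ("q", "1"), "1": ("q", "0")}}, "q")

all_agree = all(equivalent_on(MOORE, MEALY, "".join(bits))
                for n in range(6) for bits in product("01", repeat=n))
```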
Using this definition of equivalence, our text proves that for every Moore machine there
is an equivalent Mealy machine and vice versa. It does this with two constructive
algorithms: one for converting a Moore machine to a Mealy machine and one for going
the other direction. We will not study these algorithms because we need to move on to
other material, but they are interesting. Read about them in the text.
Theorem. The set of regular languages is closed under union, concatenation, and the
Kleene star.
Proof. This theorem is easy to prove using regular expressions. Take any two regular
languages L1 and L2 from the set of regular languages. Each of them has a regular
expression corresponding to it. Call the two regular expressions R1 and R2. It is easy to
form a regular expression that corresponds to the union of the two languages by putting
a + between the two regular expressions. Likewise we can concatenate the two regular
expressions or put parentheses around one and place a star outside the parentheses.
These operations form regular expressions that correspond to the concatenation and the
Kleene star of the languages. QED
We can also prove the above theorem using transition graphs (TGs) or
nondeterministic finite automata (NFAs). We will take the parts of the proof one at a
time. First we prove the following:
If L1 and L2 are regular languages then their union, L1+L2, is also regular.
Since L1 and L2 are regular languages, we know that we can draw a TG (or NFA) for
each of them. Call these TGs M1 and M2. To form the union of the two languages we can
create a union machine by creating a new start state with Λ-transitions from this new
state to each of the start states of M1 and M2. Any language for which you can draw a TG
is a regular language, so the union of L1 and L2 is regular.
We can form the TG for the concatenation of two languages as shown in the next
diagram. We first make sure that machine M1 has only one accept state. If it has more
than one, we take away the accept status of all its accept states, draw a new accept
state and add Λ-transitions from each of the original accept states to this new accept
state. Then we add a Λ-transition from the accept state of the first machine to the start
state of the second. The accept state of the first machine is no longer an accept state.
[Figure: M1 joined to M2 by a Λ-transition.]
This new TG defines the concatenation of L1 and L2 (L1L2); therefore L1L2 is a
regular language.
Finally, we can make a TG that accepts the Kleene closure of the language accepted by
machine M1 as shown below. We make sure that M1 has only one start state and one
accept state; otherwise we modify M1 so that it has one start state and one accept state.
Then we add a new start state and draw a Λ-transition from this new state to the start
state of M1. Then we add another Λ-transition from the new start state to the accept
state and one more Λ-transition from the accept state to the new start state.
[Figure: M1 with a new start state and the added Λ-transitions.]
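The three constructions can be sketched in Python on machines stored as plain dictionaries. Everything about the encoding is an assumption of this sketch: the empty string stands for a Λ label, each machine has a single start state, `fresh` hands out unused state names, and `literal` builds a two-state machine for a single character so that there is something to combine.

```python
from itertools import count

LAM = ""          # label standing for a Λ-transition in this sketch
_ids = count()

def fresh():
    """Return a state name not used by any machine built so far."""
    return f"q{next(_ids)}"

def literal(ch):
    s, t = fresh(), fresh()
    return {"start": s, "accepts": {t}, "edges": [(s, ch, t)]}

def single_accept(m):
    """Give m exactly one accept state, adding a new one if necessary."""
    if len(m["accepts"]) == 1:
        return m
    f = fresh()
    edges = m["edges"] + [(a, LAM, f) for a in m["accepts"]]
    return {"start": m["start"], "accepts": {f}, "edges": edges}

def union(m1, m2):
    s = fresh()   # new start state with Λ-transitions to both old start states
    edges = m1["edges"] + m2["edges"] + [(s, LAM, m1["start"]), (s, LAM, m2["start"])]
    return {"start": s, "accepts": m1["accepts"] | m2["accepts"], "edges": edges}

def concat(m1, m2):
    m1 = single_accept(m1)
    (a,) = m1["accepts"]   # Λ-transition from M1's accept state to M2's start
    edges = m1["edges"] + m2["edges"] + [(a, LAM, m2["start"])]
    return {"start": m1["start"], "accepts": set(m2["accepts"]), "edges": edges}

def star(m):
    m = single_accept(m)
    (a,) = m["accepts"]
    s = fresh()            # the three Λ-transitions described above
    edges = m["edges"] + [(s, LAM, m["start"]), (s, LAM, a), (a, LAM, s)]
    return {"start": s, "accepts": {a}, "edges": edges}

def accepts(m, text):
    """Simulate the machine, following Λ-transitions whenever possible."""
    def close(states):
        states, stack = set(states), list(states)
        while stack:
            x = stack.pop()
            for src, label, dst in m["edges"]:
                if src == x and label == LAM and dst not in states:
                    states.add(dst)
                    stack.append(dst)
        return states
    current = close({m["start"]})
    for ch in text:
        current = close({dst for src, label, dst in m["edges"]
                         if src in current and label == ch})
    return bool(current & m["accepts"])

A, B = literal("a"), literal("b")
a_or_b, a_then_b, a_star = union(A, B), concat(A, B), star(A)
any_string = star(union(A, B))
```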
Complement
An FA that accepts L'. That is, it accepts all strings other than ab:
Intersection
Theorem. The set of regular languages is closed under intersection. That is, if L1 and L2
are regular languages then L1 ∩ L2 is a regular language.
Proof. L1 and L2 are sets so we can use set theory in our proof. By De Morgan's Law we
have the following:
L1 ∩ L2 = (L1' ∪ L2')'.
Since L1 and L2 are regular, so are L1' and L2'. So is the union of L1' and L2' and so is the
complement of that union. QED
It is useful to know how to build a finite automaton to accept the intersection of two
languages accepted by finite automata M1 and M2. Let's find a finite automaton to accept
strings that start with a and end with a. This language is the intersection of the two
languages of strings that start with a, and strings that end with a. Here are finite
automata for each of those languages. Note that the trap state in the first one was made
explicit. This is a necessity for our algorithm.
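[Figure: the two finite automata, one accepting strings that start with a (with its trap state drawn explicitly) and one accepting strings that end with a.]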
We want to find a finite automaton that accepts strings that are in both languages. We
build the following table, the start state of which is the combination of the start states of
the above two machines. To determine the table entry for row (1,A) and character a, we
combine the transition on character a from state 1 in the first machine with the transition
on character a from state A in the second machine. We do this for character b as well.
Then we make a row for any combined state that we produced and for which we do not
already have a row. We continue this process until we have a row for every combined
state that is produced by the algorithm. Note that the only accept states in this new
machine will be states that are combinations of accept states from the original two
machines.
The machine above can be simplified to the machine below because states 2A and 2B
are acting together as a trap state.
[Figure: the simplified machine, with states 2A and 2B merged into a single trap state.]
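The table-building procedure for the intersection machine can be sketched in Python. The two machines below are reconstructions for this sketch: the names 1, 2, A and B follow the text (2 is the explicit trap state), while the name 3 for the accept state of the first machine and the exact transitions are assumptions consistent with "starts with a" and "ends with a".

```python
def intersect(d1, s1, acc1, d2, s2, acc2, alphabet):
    """Build the combined table row by row, starting from the paired start states."""
    start = (s1, s2)
    table, todo = {}, [start]
    while todo:
        row = todo.pop()
        if row in table:
            continue
        p, q = row
        # Combine the transition from p in machine 1 with the one from q in machine 2.
        table[row] = {c: (d1[p][c], d2[q][c]) for c in alphabet}
        todo.extend(table[row].values())
    # Only combinations of two accept states are accept states.
    accepts = {(p, q) for (p, q) in table if p in acc1 and q in acc2}
    return table, start, accepts

def run(table, start, accepts, text):
    state = start
    for ch in text:
        state = table[state][ch]
    return state in accepts

# Machine 1: starts with a (state 2 is the explicit trap state).
M1 = {1: {"a": 3, "b": 2}, 2: {"a": 2, "b": 2}, 3: {"a": 3, "b": 3}}
# Machine 2: ends with a.
M2 = {"A": {"a": "B", "b": "A"}, "B": {"a": "B", "b": "A"}}

table, start, accepts = intersect(M1, 1, {3}, M2, "A", {"B"}, "ab")
```

In this table the rows (2, A) and (2, B) lead only to each other and neither is an accept state, which is why they can be merged into one trap state.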