0% found this document useful (0 votes)
17 views

Module 5

The document discusses the conversion of a non-deterministic finite automaton (NFA) to a deterministic finite automaton (DFA) using the subset construction algorithm. It begins by explaining the reasons for converting an NFA to a DFA, as DFAs are faster for string matching. It then provides details on the subset construction algorithm, which is a two-step process involving preprocessing to determine epsilon closures of states, followed by DFA construction. The algorithm constructs a transition table for the DFA to simulate all possible moves of the NFA on input strings. An example conversion is provided to illustrate the process.

Uploaded by

ARSHIYA K
Copyright
© © All Rights Reserved
Available Formats
Download as PDF, TXT or read online on Scribd
0% found this document useful (0 votes)
17 views

Module 5

The document discusses the conversion of a non-deterministic finite automaton (NFA) to a deterministic finite automaton (DFA) using the subset construction algorithm. It begins by explaining the reasons for converting an NFA to a DFA, as DFAs are faster for string matching. It then provides details on the subset construction algorithm, which is a two-step process involving preprocessing to determine epsilon closures of states, followed by DFA construction. The algorithm constructs a transition table for the DFA to simulate all possible moves of the NFA on input strings. An example conversion is provided to illustrate the process.

Uploaded by

ARSHIYA K
Copyright
© © All Rights Reserved
Available Formats
Download as PDF, TXT or read online on Scribd
You are on page 1/ 7

Module 5 – Lexical Phase – NFA to DFA

We have seen conversion of Regular Expression to ϵ-NFA using the Thompson’s Construction
algorithm. In this module, we shall continue our discussion on the conversion from NFA to DFA
since DFA is faster for string matching. The primary objective of this module is to learn how the
subset construction algorithm converts a given NFA or ε-NFA to DFA.

5.1 Reasons for Conversion to DFA


As we have already discussed in the earlier modules, the language of a DFA is define as follows:

L = {w | δ(q, w) = r, where q is the start state and r is in F} (5.1)

Thus from equation (5.1), it is clear that the set of all strings “w” such that there is a path from
the start state to the final state are those that belong to the DFA. The language of NFA is given
by

L = {w | δ(q, w) = R, where ‘q’ is the start state and R is a set of states and at least

a ‘r’ in R is in F} (5.2)

As, it is obvious that from the multiple states that can be reached from the start state on the string
“w”, even if one of them is a final state the string belongs to the NFA. For an ϵ-NFA, if the ε-
closure( R ) contains final state, we say the string “w” belongs to the ϵ-NFA. Thus, it is clear that
the DFA will have more number of states than a NFA or a ε-NFA for the same language L.
However, a DFA will avoid the ambiguity that is present in the NFA in pattern matching for a
string.

5.2. Conversion of NFA to DFA


The subset construction algorithm is used to convert the NFA to DFA. The fundamental
behind the subset construction algorithm is that the state set of a state in a NFA is thought of as a
following “state” of the state in the converted DFA. The following are the input and output of the
subset construction algorithm

• Input. An NFA N=(S,,move,S0,Z)

• Output. A DFA D= (Q,,,I0,F),

Essentially, we are trying to map “S” to “Q”. Q is defined as the subset states of “S”. Hence, Q =
2S . The input symbols of the input and output are same. I0 , the start state of the DFA is defined
as the ϵ-Closure (S0). Hence, the start state of the DFA corresponds to a set of states of the ε-
NFA. The final state F of the DFA, corresponds to all the states that contain the final state Z of
the NFA.
The subset construction algorithm is a two step process.

 Pre-processing step: Determination of E-Closure


 DFA construction
5.2.1 E-Closure determination

-closure(T) is defined as the set of NFA states reachable from NFA state “s” in T on -
transitions alone. Consider a state T such that T S. -closure is defined for all the states “s” in T
of the NFA. Algorithm 5.1 is the pre-processing step to compute E-Closure

-closure (T)
1. push all states in T onto stack;
2. initialize -closure(T) to T;
3. while stack is not empty do
{
4. pop the top element of the stack into t;
5. for each state u with an edge from t to u labeled  do
{ if u is not in -closure (T) {
add u to -closure (T)
push u into stack}
}
}
Algorithm 5.1 -closure (T)

Line 1 is an indication to compute -closure for all the states of the NFA. Hence, a stack is used
to remember all the states in the input NFA. In Line 2, -closure is computed for a state that
belongs to T. Consider one state ‘q’ from the stack. This state is added to the -closure (q).
Hence, this is an indication that -closure is not a null set. In line number 5, a for loop is
initialized which checks all edges labeled ε from the state considered and added to -closure (q).
This state is pushed back to the stack. Since, this new state, for example ‘r’ which is reachable
from ‘q’ on ε, will have edges labeled ε to some other states “S”. These “S” will also belong to
the -closure (q). Hence, this current state is pushed back into the stack.

---------------------------------

Example 5.1 Consider the NFA given in Figure 5.1


Figure 5.1 Example NFA to convert to DFA

In figure 5.1, let us consider computation of ε-Closure of all the states. According to the
algorithm, ε-Closure (0) will include the state”0”. In addition, there is a direct edge from state
“0” on ϵ to states “1” and “7”. Hence,

ε-Closure (0) = {0} U ε-Closure (1) U ε-Closure (7)

Let us consider ε-Closure(1), which will include state “1” and the ε-Closure(2) and ε-Closure(4),
since there is a edge labeled ε from state “1” to “2” and “4”

ε-Closure (1) = {1} U ε-Closure (2) U ε-Closure (4)

On, the other hand, ε-Closure(2) will include just state “2”, since there is no edge labeled ε from
state “2” and so is the case for ε-Closure(4).

ε-Closure (2) = {2}, ε-Closure (4) = {4}

Hence, ϵ-Closure (1) can be determined as

ε-Closure (1) = {1, 2, 4}

If we consider determination of ε-Closure (7), it will include “7” alone as there are no edges on ε
from “7”

ε-Closure (7) = {7}

In a similar manner,

ε-Closure (8) = {8}, ε-Closure (9) = {9}, ε-Closure (10) = {10}


Consider, ε-Closure (3) which will include “3” and there is an edge labeled ε from “3” to “6”, we
need to find ε-Closure (6).

ε-Closure (6) = {6} U ε-Closure (7) U ε-Closure (1)

= {6, 7, 1, 2, 4}

Hence, ε-Closure (3) = {3, 6, 7, 1, 2, 4} and so is ε-Closure (5) = {5, 6, 7, 1, 2, 4}.

-----------------------

After determining the ε-Closure ( ) of all the states, this is fed as input to the subset construction
algorithm to construct the DFA.

5.2.2 Subset Construction Algorithm

The input to the algorithm is the ε-Closure (q) of all the states and the NFA to know the
transition function. The output will be the DFA. This algorithm constructs a transition table
Dtran for the DFA so that D simulate in parallel all possible moves the NFA can do on an input
string. Dstates refer to the states of the DFA. Subset construction algorithm is given in Algorithm
5.2 to construct the DFA from the NFA and the pre-processed input

SubsetConstruction (ε-Closure (S0), NFA)


{
1. Initially, ε-Closure(S0) is the state in Dstates and it is unmarked
2. While there is an unmarked state T in Dstates do
{
3. Mark T;
for each input symbol ‘a’
{
4. U = ϵ-Closure(move(T,a))
5. If U is not in Dstates then
6. add U as an unmarked state to Dstates
7. Dtran[T, a] = U
}
}
}
Algorithm 5.2 Subset construction algorithm

The operation move (T,a), in addition to the ϵ-Closure is used to convert the NFA to DFA. A
move(T,a) is defined as the set of NFA states to which there is a transition on input symbol “a”
from some NFA state “s” in T. ε-Closure(S0) is computed and this is assigned as the start state I0
of the DFA. For all input symbols, ‘a’, the ε-Closure (S0) is referred and the states that are
reachable from every state in ϵ-Closure (S0) on the input symbol ‘a’ is defined as move (ε-
Closure (S0), a). The resultant will be another set of states, which is indicated as the next state in
the DFA. This procedure is carried out, till there are no more new states in the resultant DFA.
The last step is to indicate the final states of the DFA. The final state F of the DFA is
defined as

F= {I | I ∈ Q, such that I ∩ Z < >}

Line 1 of the algorithm 5.2 initializes the start state of the DFA as the ϵ-Closure (start state of
NFA). Line 2, is a while loop, which iterates till there are no more states being created. Line 3,
accepts the start state as one of the DFA’s states and marks it. Line 4 runs a “for” loop which
defines the transition from the initial state of the DFA on all input symbols. A single state on the
DFA consists of the set of states of the NFA. The transition on an input symbol ‘a’ in the DFA is
defined as the union of the states given in the DFA’s state, by referring to their individual
transitions in the NFA. After identifying the transition, union of the ϵ-Closure of all the states
that are reached is determined and this set of states is one state of the DFA. The new state U is
labeled as unmarked, indicating that transition from this state has not been defined in line 6 of
the algorithm. The transition table of the DFA, DTran is updated with transition information.

---------

Consider example 5.1, which is given in figure 5.1. The start state of the DFA is ϵ-Closure (0).
We already know that ϵ-Closure (0) = {0, 1, 2, 4, 7}. This is the start state of the DFA and is
indicated by state “A”. From this state, move (0, a), (1, a), (2,a), (4, a), (7, a) is determined. The
input NFA is referred and the transitions are only from 2, and 4 which is to state 3 and 8
respectively. Then we compute ϵ-Closure ((move (0, a)). This is nothing but ϵ-Closure (3,8). This
is computed as union of ϵ-Closure (3) and ϵ-Closure (8) = {3, 6, 7, 1, 2, 4} U {8} = {3, 6, 7, 1, 2,
4, 8}. This is referred to as state B in the DFA. Similarly, from state A on input ‘b’ is computed
as move(0, b), (1, b), (2,b), (4, b), (7, b). There is a transition only from state 4 on “b” which is 5.
Hence, we compute ϵ-Closure(5) = {1,2,4,5,6,7} which is given as state C. Thus Dtran [A,a] = B
and Dtran[A, b] = C and we have two new states B, C. From these two states transitions are
defined on the input symbols, “a” and “b”. The entire, computation is shown in Table 5.1 and the
automaton is represented in figure 5.2 and where “E” is the final state.

Figure 5.2 DFA constructed from NFA


Table 5.1 DTran for the conversion

I a b

A={0,1,2,4,7} B={1,2, 3, 4, 6, 7, 8} C = {1,2,4,5,6,7}

B={1,2, 3, 4, 6, 7, 8} B={1,2, 3, 4, 6, 7, 8} D = {1,2,4,5,6,7,9}

C = {1,2,4,5,6,7} B={1,2, 3, 4, 6, 7, 8} C = {1,2,4,5,6,7}

D = {1,2,4,5,6,7,9} B={1,2, 3, 4, 6, 7, 8} E = {1,2,3,5,6,7,10}

E = {1,2,3,5,6,7,10} B={1,2, 3, 4, 6, 7, 8} C = {1,2,4,5,6,7}

The conversion algorithm, however results in more number of states than that would be actually
required by the DFA. Thus, this would result in a non-minimized DFA as the output.

Figure 5.3 is one more example, of NFA for the keywords. This could be also converted to DFA,
which can be taken as an exercise.

Figure 5.3 Example NFA for keywords in Pascal


Summary

The Thompson’s subset construction algorithm, converts the regular expression to ε-NFA and
then to DFA. The algorithm results in a non-minimized version of the DFA which needs to be
minimized and will be discussed in the next module.

You might also like