0% found this document useful (0 votes)
59 views34 pages

Finite State Machines State Minimization

The document discusses state minimization of finite state machines (FSMs). It describes two key steps to minimize a FSM: (1) remove any unreachable states, and (2) remove any redundant states that have identical transitions and behaviors. To find redundant states, the document defines an equivalence relation between states based on their behaviors when reading all possible input strings. A minimal FSM will have one state per equivalence class. The document also discusses constructing a minimal FSM directly from the equivalence classes without knowing the original FSM.

Uploaded by

Raj Sharma
Copyright
© © All Rights Reserved
We take content rights seriously. If you suspect this is your content, claim it here.
Available Formats
Download as PPT, PDF, TXT or read online on Scribd
0% found this document useful (0 votes)
59 views34 pages

Finite State Machines State Minimization

The document discusses state minimization of finite state machines (FSMs). It describes two key steps to minimize a FSM: (1) remove any unreachable states, and (2) remove any redundant states that have identical transitions and behaviors. To find redundant states, the document defines an equivalence relation between states based on their behaviors when reading all possible input strings. A minimal FSM will have one state per equivalence class. The document also discusses constructing a minimal FSM directly from the equivalence classes without knowing the original FSM.

Uploaded by

Raj Sharma
Copyright
© © All Rights Reserved
We take content rights seriously. If you suspect this is your content, claim it here.
Available Formats
Download as PPT, PDF, TXT or read online on Scribd
You are on page 1/ 34

Finite State Machines

State Minimization

Chapter 5
State Minimization
Consider:

Is this a minimal machine?


State Minimization
Step (1): Get rid of unreachable states.

State 3 is unreachable.

Step (2): Get rid of redundant states.

States 2 and 3 are redundant.


Getting Rid of Unreachable States
We can’t easily find the unreachable states directly.
But we can find the reachable ones and determine the
unreachable ones from there.

An algorithm for finding the reachable states:


Getting Rid of Redundant States
Intuitively, two states are equivalent to each other (and
thus one is redundant) if all strings in * have the same
fate, regardless of which of the two states the machine
is in. But how can we tell this?

The simple case:

Two states have identical sets of transitions out.


Getting Rid of Redundant States
The harder case:

The outcomes in states 2 and 3 are the same, even


though the states aren’t.
Finding a Minimal DFSM
Two problems:
1.Given a regular language, find a minimal DFSM for it.
2.Given a DFSM, find a minimal DFSM equivalent to it.

We focus first on 1. Given a language:

•Capture the notion of equivalence classes of strings with


respect to that language.

•Prove that we can always find a (unique up to state


naming) deterministic FSM with a number of states equal
to the number of equivalence classes of strings.

•Describe an algorithm for finding that DFSM.


Defining Equivalence for Strings
We want to capture the notion that two strings are equivalent or
indistinguishable with respect to a language L if, no matter what
is tacked on to them on the right, either they will both be in L or
neither will. Why is this the right notion? Because it corresponds
naturally to what the states of a recognizing FSM have to
remember.

Example:

(1) a b a b a b

(2) b a a b a b

Suppose L = {w  {a, b}* : |w| is even}. Are (1) and (2) equivalent?

Suppose L = {w  {a, b}* : every a is immediately followed by b}.


Are (1) and (2) equivalent?
Defining Equivalence for Strings
Equivalent, or indistinguishable, string with respect to L:

x,y  *: x L y iff z  * (xz  L iff yz  L).


L is an Equivalence Relation
L is an equivalence relation because it is:
• Reflexive: x  * (x L x), because:
x, z  * (xz  L  xz  L).
• Symmetric: x, y  * (x L y  y L x), because:
x, y, z  * ((xz  L  yz  L) 
(yz  L  xz  L)).
• Transitive: x, y, z  * (((x L y)  (y L w))  (x L w)),
because:
x, y, z  *
(((xz  L  yz  L)  (yz  L  wz  L)) 
(xz  L  wz  L)).
L is an Equivalence Relation

Because L is an equivalence relation:

• No equivalence class of L is empty.

• Each string in * is in exactly one equivalence class of L.


An example of L
 = {a, b}
L = {w  * : |w| is even}

The equivalence classes of L:

[, aa, ab, ba, bb, aaaa, aaab, …] – even length

[a, b, aaa, aab, aba, abb, …] – odd length


Another example
 = {a, b}
L = {w  *: every a is immediately followed by b}

The equivalence classes of L:

[, b, abb, …] [all strings in L].

[a, abbba, …] [all strings that end in a and


have no prior a that is not
followed by a b].

[aa, abaa, …] [all strings that contain at least


one instance of aa].
Yet Another Example of L

 = {a, b}
L = aab*a

The equivalence classes of L:


When More Than One Class Contains
Strings in L
 = {a, b}
L = {w  * : no two adjacent characters are the same}

The equivalence classes of L:

[0] []
[1] [a, aba, ababa, …]
[2] [b, ab, bab, abab, …]
[3] [aa, abaa, ababb…]
Does L Always Have a Finite Number
of Equivalence Classes?

 = {a, b}
L = {anbn, n  0}

 aa aaaa
a aba aaaaa
b aaa

The equivalence classes of L:


The Best We Can Do
Theorem 5.4: Let L be a regular language and let M be
a DFSM that accepts L. Then, the number of states in M
is greater than or equal to the number of equivalence
classes of L.

Proof: Suppose that the number of states in M were


less than the number of equivalence classes of L.
Then, by the pigeonhole principle, there must be at least
one state q that contains strings from at least two
equivalence classes of L. But then M’s future behavior
on those strings will be identical, which is not consistent
with the fact that they are in different equivalence
classes of L.
The Best Is Unique
Theorem 5.5: Let L be a regular language over some alphabet
. Then there is a DFSM M that accepts L and that has
precisely n states where n is the number of equivalence
classes of L. Any other FSM that accepts L must either have
more states than M or it must be equivalent to M except for
state names.

Proof: (by construction)


M = (K, , , s, A), where:
● K contains n states, one for each equivalence class of L.
● s = [], the equivalence class of  under L.
● A = {[x] : x  L}.
● ([x], a) = [xa]. In other words, if M is in the state that
contains some string x, then, after reading the next
symbol, a, it will be in the state that contains xa.
Proof, Continued
We must show that:

• K is finite. Since L is regular, it is accepted by some


DFSM M. M has some finite number of states m. By
Theorem 5.4 (see above), n  m. So K is finite.

  is a function. In other words, it is defined for all (state,


input) pairs and it produces, for each of them, a unique
value. The construction defines a value of  for all
(state, input) pairs. The fact that the construction
guarantees a unique such value follows from the
definition of L.
Proof, Continued
• L(M) = L.
We prove first that, for any string w in *:

([], w) |-M* ([w], ).

([], w) = ([], w1w2…wm) wi  , 1  i  m


|- M ([w1], w2w3 …wm)
|- M ([w1w2], w3w4 …wm)
|- M …
|- M ([w1w2…wm], ) = ([w], )

M accepts w iff [w]  A = {[x] : x  L} iff wL

So M accepts precisely the strings in L, that is, L(M) = L.


Proof, Continued
• There exists no smaller machine M# that also accepts L.
This follows directly from Theorem 5.4 (see above), which
says that the number of equivalence classes of L
imposes a lower bound on the number of states in any
DFSM that accepts L.

• There is no different machine M# that also has n states


and that accepts L.
Example
• L = {w  {a,b}* : no adjacent characters are the same}
• equivalence classes of L:
• [0] [] = start state
• [1] [a, ba, aba, baba, …]
• [2] [b, ab, bab, abab, …]
• [3] [aa, bb, abaa, ababb…]
([x], a) = [xa]
a 1 a

0 b a 3 a,b

b 2 b
The Myhill-Nerode Theorem
Theorem 5.6 (Myhill-Nerode): A language is regular iff the
number of equivalence classes of L is finite.

Proof: Show the two directions of the implication:

L regular  the number of equivalence classes of L is


finite: If L is regular, then there exists some FSM M that
accepts L. M has some finite number of states m. The
cardinality of L  m. So the cardinality of L is finite.

The number of equivalence classes of L is finite  L


regular: If the cardinality of L is finite, then the construction
that was described in the proof of the previous theorem will
build an FSM that accepts L. So L must be regular.
So Where Do We Stand?
1. We know that for any regular language L there exists a minimal
accepting machine ML.
2. We know that |K| of ML equals the number of equivalence
classes of L.
3. We know how to construct ML from L.
4. We know that ML is unique up to the naming of its states.
But is this good enough?

Consider:
Minimizing an Existing DFSM
(Without Knowing L)

Two approaches:

• Begin with M and collapse redundant states, getting rid


of one at a time until the resulting machine is minimal.

• Begin by overclustering the states of L into just two


groups, accepting and nonaccepting. Then iteratively
split those groups apart until all the distinctions that L
requires have been made.
The Overclustering Approach

We need a definition for “equivalent”, i.e., mergeable


states.

Define q  p iff for all strings w  *, either w drives M to


an accepting state from both q and p or it drives M to
a rejecting state from both q and p.
An Example
 = {a, b} L = {w  * : |w| is even}

q2  q3
Constructing  as the Limit of a Sequence of
Approximating Equivalence Relations n

(Where n is the length of the input strings that have


been considered so far)

Consider input strings, starting with , and increasing in


length by 1 at each iteration. Start by way
overgrouping states. Then split them apart as it
becomes apparent (with longer and longer strings)
that their behavior is not identical.
Constructing n
• p 0 q iff they behave equivalently when they read . In
other words, if they are both accepting or both
rejecting states.
• p 1 q iff they behave equivalently when they read any
string of length 1, i.e., if any single character sends both
of them to an accepting state or both of them to a
rejecting state. Note that this is equivalent to saying that
any single character sends them to states that are 0 to
each other.
• p 2 q iff they behave equivalently when they read any
string of length 2, which they will do if, when they read
the first character they land in states that are 1 to each
other. By the definition of 1, they will then yield the
same outcome when they read the single remaining
character.
• And so forth.
Constructing  (cont’d)

More precisely, p,q  K and any n  1,

q n p

iff

1. q n-1 p, and
2. a   ((p, a) n-1 (q, a))
MinDFSM
MinDFSM(M: DFSM) =
1. classes := {A, K-A};
2. While E  classes, p,q  E, c   with [(p,c)] ≠ [(q,c)] do
3. split E such that [p] ≠ [q]
4. remove E from classes
5. add the subclasses of E to classes
6. return M* = (classes, , , [sM], {[q]: q  AM]}), where M* is
constructed as follows:
if M(q, c) = p, then M*([q], c) = [p]
An Example
 = {a, b}

0 = {[1,3,5,6], [2,4]}
1 = {[1,3,5], [6], [2,4]}
2 = {[1,3,5], [2], [4], [6]}
The Result
Summary

● Given any regular language L, there exists a


minimal DFSM M that accepts L.

● M is unique up to the naming of its states.

● Given any DFSM M, the algorithm minDFSM


constructs a minimal DFSM that also accepts
L(M).

You might also like