Optimization of DFA-Based Pattern Matchers
Important States of an NFA
An NFA state is important if it has a non-ε out-transition.
During subset construction, ε-closure(move(T, a)) depends only on the important states of T.
Direct construction relates the important states of the NFA to the symbols in the RE.
Augmented RE
The final (accepting) state has no out-transitions, so it is not important.
Concatenate a unique right end marker # to the RE.
This adds a transition on # out of the accepting state, which makes it important.
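The definition of an important state, and the effect of the end marker, can be sketched in a few lines. This is only an illustration: the NFA below is a hypothetical Thompson-style automaton for a|b, and the transition-table encoding, the `EPSILON` constant, and the function name `important_states` are assumptions, not part of the slides.

```python
EPSILON = ""  # assumed encoding of an epsilon label (not from the slides)


def important_states(transitions):
    """Return the states that have at least one non-epsilon out-transition."""
    return {
        state
        for (state, symbol), targets in transitions.items()
        if symbol != EPSILON and targets
    }


# Hypothetical Thompson-style NFA for a|b; state 5 is the accepting state.
nfa = {
    (0, EPSILON): {1, 3},
    (1, "a"): {2},
    (3, "b"): {4},
    (2, EPSILON): {5},
    (4, EPSILON): {5},
}
print(sorted(important_states(nfa)))  # [1, 3]

# Augmenting the RE with '#' adds a transition on '#' out of state 5,
# so the old accepting state becomes important as well.
augmented = dict(nfa)
augmented[(5, "#")] = {6}
print(sorted(important_states(augmented)))  # [1, 3, 5]
```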
Converting Regular Expression to DFA
A regular expression can be converted into a DFA directly (without creating an NFA first).
First, the given regular expression r is augmented by concatenating it with a special symbol #:
r → (r)#   (augmented regular expression)
Then a syntax tree is created for this augmented regular expression.
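As a rough sketch of these two steps, the function below augments a regular expression with # and builds a syntax tree whose leaves are numbered left to right. The tuple-based tree encoding, the name `parse`, and the restricted syntax (single-character symbols, `|`, `*`, parentheses, an empty alternative standing for ε, no literal `#` in the input, no error handling) are all assumptions made for illustration.

```python
def parse(regex):
    """Parse (regex)# into a tuple tree; leaves are ("leaf", symbol, position)."""
    tokens = list("(" + regex + ")#")  # augmentation: r becomes (r)#
    pos = 0      # index of the next unread token
    counter = 0  # last leaf position handed out

    def peek():
        return tokens[pos] if pos < len(tokens) else None

    def alternation():
        nonlocal pos
        node = concatenation()
        while peek() == "|":
            pos += 1
            node = ("or", node, concatenation())
        return node

    def concatenation():
        node = None
        while peek() is not None and peek() not in "|)":
            right = starred()
            node = right if node is None else ("cat", node, right)
        # An empty alternative, as in (a|), is treated as epsilon.
        return node if node is not None else ("eps",)

    def starred():
        nonlocal pos
        node = atom()
        while peek() == "*":
            pos += 1
            node = ("star", node)
        return node

    def atom():
        nonlocal pos, counter
        ch = tokens[pos]
        pos += 1
        if ch == "(":
            node = alternation()
            pos += 1  # skip the matching ")"
            return node
        counter += 1
        return ("leaf", ch, counter)

    return alternation()


print(parse("(a|b)*a"))
```

For `(a|b)*a` the leaves come out as a = 1, b = 2, a = 3, # = 4, with the end marker as the right child of the root cat-node.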
Converting Regular Expression to DFA
If firstpos and lastpos have been computed for each node, followpos
of each position can be computed by making one depth-first traversal
of the syntax tree.
Followpos
Cat-node with left child c1 and right child c2:
for each position i ∈ lastpos(c1), followpos(i) ⊇ firstpos(c2)
Followpos
Star-node n (with child c1):
for each position i ∈ lastpos(n), followpos(i) ⊇ firstpos(n)
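The two followpos rules can be sketched as a single post-order traversal. As a simplification, this version computes nullable, firstpos and lastpos in the same pass instead of beforehand; those three use the standard textbook rules, which are assumed here rather than taken from the slides. The tuple tree encoding and the name `analyze` are also hypothetical.

```python
from collections import defaultdict


def analyze(node, followpos):
    """Return (nullable, firstpos, lastpos) of node and fill in followpos."""
    kind = node[0]
    if kind == "eps":
        return True, set(), set()
    if kind == "leaf":
        position = node[2]
        followpos[position]  # touch the entry so every position is listed
        return False, {position}, {position}
    if kind == "or":
        n1, f1, l1 = analyze(node[1], followpos)
        n2, f2, l2 = analyze(node[2], followpos)
        return n1 or n2, f1 | f2, l1 | l2
    if kind == "cat":
        n1, f1, l1 = analyze(node[1], followpos)
        n2, f2, l2 = analyze(node[2], followpos)
        for i in l1:  # cat rule: firstpos(c2) follows every i in lastpos(c1)
            followpos[i] |= f2
        first = f1 | f2 if n1 else f1
        last = l1 | l2 if n2 else l2
        return n1 and n2, first, last
    if kind == "star":
        _, f, l = analyze(node[1], followpos)
        for i in l:  # star rule: firstpos(n) follows every i in lastpos(n)
            followpos[i] |= f
        return True, f, l
    raise ValueError(f"unknown node kind: {kind!r}")


# Syntax tree of (a|b)*a# with positions a=1, b=2, a=3, #=4.
tree = ("cat",
        ("cat",
         ("star", ("or", ("leaf", "a", 1), ("leaf", "b", 2))),
         ("leaf", "a", 3)),
        ("leaf", "#", 4))

followpos = defaultdict(set)
nullable, first, last = analyze(tree, followpos)
for position in sorted(followpos):
    print(position, sorted(followpos[position]))
```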
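Example -- (a | b)* a #
Positions: a = 1, b = 2, a = 3, # = 4
Syntax tree annotated with firstpos (red in the figure) and lastpos (blue in the figure):
node                      firstpos   lastpos
cat (root)                {1,2,3}    {4}
  cat                     {1,2,3}    {3}
    *                     {1,2}      {1,2}
      |                   {1,2}      {1,2}
        a (position 1)    {1}        {1}
        b (position 2)    {2}        {2}
    a (position 3)        {3}        {3}
  # (position 4)          {4}        {4}
followpos(1) = {1,2,3}
followpos(2) = {1,2,3}
followpos(3) = {4}
followpos(4) = {}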
S1 = firstpos(root) = {1,2,3}
mark S1
a: followpos(1) ∪ followpos(3) = {1,2,3,4} = S2    move(S1,a) = S2
b: followpos(2) = {1,2,3} = S1    move(S1,b) = S1
mark S2
a: followpos(1) ∪ followpos(3) = {1,2,3,4} = S2    move(S2,a) = S2
b: followpos(2) = {1,2,3} = S1    move(S2,b) = S1
Resulting DFA:
S1 --a--> S2    S1 --b--> S1
S2 --a--> S2    S2 --b--> S1
start state: S1
accepting states: {S2}
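The marking procedure used in this example can be sketched as a worklist loop over sets of positions. The function name `build_dfa`, the `symbol_at` map and the first-in-first-out processing order are assumptions made for illustration; the followpos table is the one from the example.

```python
def build_dfa(start, followpos, symbol_at, end_marker="#"):
    """Build DFA states (sets of positions) directly from a followpos table."""
    start = frozenset(start)
    states = [start]    # discovery order: states[0] is S1, states[1] is S2, ...
    unmarked = [start]
    transitions = {}
    while unmarked:
        S = unmarked.pop(0)  # "mark S"
        symbols = sorted({symbol_at[p] for p in S} - {end_marker})
        for a in symbols:
            # Union of followpos(p) over all positions p in S labelled a.
            target = frozenset().union(
                *(followpos[p] for p in S if symbol_at[p] == a)
            )
            if target not in states:
                states.append(target)
                unmarked.append(target)
            transitions[(S, a)] = target
    # A state is accepting if it contains the position of the end marker.
    accepting = {S for S in states if any(symbol_at[p] == end_marker for p in S)}
    return states, transitions, accepting


# Data for (a|b)*a# taken from the example.
followpos = {1: {1, 2, 3}, 2: {1, 2, 3}, 3: {4}, 4: set()}
symbol_at = {1: "a", 2: "b", 3: "a", 4: "#"}

states, transitions, accepting = build_dfa({1, 2, 3}, followpos, symbol_at)
name = {S: f"S{i + 1}" for i, S in enumerate(states)}
for (S, a), T in transitions.items():
    print(f"move({name[S]},{a}) = {name[T]}")
print("accepting:", sorted(name[S] for S in accepting))
```

Representing each DFA state as a frozenset of positions makes the "have we seen this state already?" check a plain equality test.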
Example -- (a | ε) b c* #
Positions: a = 1, b = 2, c = 3, # = 4
S1 = firstpos(root) = {1,2}
mark S1
a: followpos(1) = {2} = S2    move(S1,a) = S2
b: followpos(2) = {3,4} = S3    move(S1,b) = S3
mark S2
b: followpos(2) = {3,4} = S3    move(S2,b) = S3
mark S3
c: followpos(3) = {3,4} = S3    move(S3,c) = S3
Resulting DFA:
S1 --a--> S2    S1 --b--> S3
S2 --b--> S3
S3 --c--> S3
start state: S1
accepting states: {S3}
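To see this DFA act as a pattern matcher, the transition table can be run over a few input strings. This is only a sketch: the dictionary encoding and the function name `matches` are assumptions, and a missing table entry is treated as a move to a dead state.

```python
def matches(text, transitions, start, accepting):
    """Run the DFA over text; a missing transition means rejection."""
    state = start
    for ch in text:
        state = transitions.get((state, ch))
        if state is None:
            return False
    return state in accepting


# Transition table of the DFA for (a | ε) b c* from the example.
transitions = {
    ("S1", "a"): "S2",
    ("S1", "b"): "S3",
    ("S2", "b"): "S3",
    ("S3", "c"): "S3",
}

for text in ["b", "ab", "abcc", "a", "aab", ""]:
    print(repr(text), matches(text, transitions, "S1", {"S3"}))
```

The strings b, ab and abcc end in S3 and are accepted; a, aab and the empty string are rejected.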