String Matching With Finite Automata: by Caroline Moore

The document discusses string matching using finite automata. It explains what a finite automaton is and how it can be used to search for strings. It also describes how to construct a string matching automaton and compute its transition function to search for patterns in time linear to the text size.

Uploaded by

shahzad5alam

String Matching with Finite Automata

by Caroline Moore
String Matching
Whenever you use a search engine, or a search tool like sed or grep, you are using a string matching program. Many of these programs build finite automata in order to search for your string efficiently.


Finite Automata
A finite automaton is a quintuple (Q, Σ, δ, s, F):
Q: the finite set of states
Σ: the finite input alphabet
δ: the transition function from Q × Σ to Q
s ∈ Q: the start state
F ⊆ Q: the set of final (accepting) states
How it works
A finite automaton accepts the strings of a specific language. It begins in the start state s (also written q_0) and reads characters one at a time from the input string. After each character it makes a transition, given by δ, to its next state. If it is in one of the accepting states when it reaches the end of the input, the string is accepted by the automaton.
Graphic: Eppstein, David. http://www.ics.uci.edu/~eppstein/161/960222.html
The Suffix Function
In order to search for the string, the program uses a suffix function (σ), which measures how much of what it has read so far matches the search string P at any given moment: σ(x) is the length of the longest prefix of P that is also a suffix of x.
Graphic: Reif, John. http://www.cs.duke.edu/education/courses/cps130/fall98/lectures/lect14/node31.html
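As a minimal sketch, assuming the usual textbook definition (the length of the longest prefix of the search string that is also a suffix of what has been read), the suffix function can be written directly in Python. The name `suffix_function` is chosen for illustration and does not come from the slides.

```python
def suffix_function(pattern: str, x: str) -> int:
    """Return the length of the longest prefix of `pattern` that is a suffix of `x`."""
    # Try the longest candidate first; k = 0 (the empty prefix) always matches.
    for k in range(min(len(pattern), len(x)), -1, -1):
        if x.endswith(pattern[:k]):
            return k
    return 0  # not reached: the loop always returns by k = 0


print(suffix_function("nano", "ban"))    # 1, the prefix "n"
print(suffix_function("nano", "banan"))  # 3, the prefix "nan"
print(suffix_function("nano", "xyz"))    # 0, only the empty prefix matches
```

This brute-force version is only meant to make the definition concrete; it is not how an efficient matcher evaluates the function.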
Example: nano
          n       a       o       other
empty:    n       empty   empty   empty
n:        n       na      empty   empty
na:       nan     empty   empty   empty
nan:      n       na      nano    empty
nano:     nano    nano    nano    nano
Graphic & Example: Eppstein, David. http://www.ics.uci.edu/~eppstein/161/960222.html
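One way to encode the table above in Python is as a nested dictionary. This sketch assumes that every cell which is not a partial match of "nano" leads back to the empty state, and it treats any character without an explicit entry as the "other" column. The names `NANO_TABLE` and `contains_nano` are illustrative only.

```python
# State names are the partial matches from the table; "" is the empty state.
NANO_TABLE = {
    "":    {"n": "n"},
    "n":   {"n": "n", "a": "na"},
    "na":  {"n": "nan"},
    "nan": {"n": "n", "a": "na", "o": "nano"},
}


def contains_nano(text: str) -> bool:
    """Run the automaton over `text`; "nano" is an absorbing accept state."""
    state = ""
    for ch in text:
        if state == "nano":
            break  # every transition out of "nano" leads back to "nano"
        # Missing entries correspond to cells that return to the empty state.
        state = NANO_TABLE[state].get(ch, "")
    return state == "nano"


print(contains_nano("nanano"))  # True
print(contains_nano("banana"))  # False
```

Because the last row of the table keeps the automaton in "nano" forever, this version answers whether the text contains the pattern at all rather than listing every occurrence.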
String-Matching Automata
For any pattern P of length m, we can define its string-matching automaton:
Q = {0, 1, ..., m} (states)
q_0 = 0 (start state)
F = {m} (accepting state)
δ(q, a) = σ(P_q a), where P_q is the first q characters of P
The transition function chooses the next state to maintain the invariant:
φ(T_i) = σ(T_i)
where T_i is the first i characters of the text T and φ(T_i) is the state reached after reading T_i.

After scanning in i characters, the state number is the length of the longest prefix of P that is also a suffix of T_i.
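The invariant can be checked mechanically. The Python sketch below computes each transition straight from the definition (the next state is the suffix function applied to P_q followed by the new character) and asserts after every character that the current state equals the suffix function of the text read so far. The helper names `sigma` and `trace_states` are illustrative, and the brute-force `sigma` is only suitable for small inputs.

```python
def sigma(pattern: str, x: str) -> int:
    """Length of the longest prefix of `pattern` that is a suffix of `x`."""
    k = min(len(pattern), len(x))
    while not x.endswith(pattern[:k]):
        k -= 1  # stops at k = 0 at the latest, since every string ends with ""
    return k


def trace_states(pattern: str, text: str) -> list:
    """Return the state after each prefix T_0, T_1, ..., T_n of `text`."""
    state = 0  # q_0 = 0
    states = [state]
    for i, ch in enumerate(text, start=1):
        # delta(q, a) = sigma(P_q a), where P_q is pattern[:q]
        state = sigma(pattern, pattern[:state] + ch)
        # Invariant: the state equals sigma(T_i), where T_i is text[:i]
        assert state == sigma(pattern, text[:i])
        states.append(state)
    return states


print(trace_states("nano", "nanano"))  # [0, 1, 2, 3, 2, 3, 4]
```

The drop from state 3 to state 2 on the fourth character shows the automaton keeping the partial match "na" instead of starting over.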
Finite-Automaton-Matcher
The simple loop structure implies that the running time for a text of length n is O(n).
However, this is only the running time for the actual string matching. It does not include the time it takes to compute the transition function.
Graphic: http://www.cs.duke.edu/education/courses/cps130/fall98/lectures/lect14/node33.html
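A minimal Python sketch of such a matching loop follows. It assumes the transition function is supplied as a list of dictionaries indexed by state, with missing characters treated as a transition to state 0. That representation, the function name `finite_automaton_matcher`, and the hand-built table `NANO_DELTA` are choices made for this example, not something the slides specify.

```python
def finite_automaton_matcher(text, delta, m):
    """Return the 0-based positions at which the pattern starts in `text`.

    `delta[q]` maps a character to the next state; characters missing
    from `delta[q]` are assumed to lead back to state 0.
    """
    shifts = []
    q = 0
    for i, ch in enumerate(text):
        q = delta[q].get(ch, 0)
        if q == m:                    # accepting state reached
            shifts.append(i - m + 1)  # index where this match starts
    return shifts


# Hand-built transition table for the pattern "nano" (m = 4).
NANO_DELTA = [
    {"n": 1},                  # state 0: nothing matched
    {"n": 1, "a": 2},          # state 1: "n"
    {"n": 3},                  # state 2: "na"
    {"n": 1, "a": 2, "o": 4},  # state 3: "nan"
    {"n": 1},                  # state 4: "nano"
]

print(finite_automaton_matcher("nanano nano", NANO_DELTA, 4))  # [2, 7]
```

The loop does one table lookup per text character, which is where the O(n) bound comes from.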
Computing the Transition Function
Compute-Transition-Function(P, Σ)
    m ← length[P]
    for q ← 0 to m
        do for each character a ∈ Σ
            do k ← min(m+1, q+2)
               repeat k ← k-1
               until P_k is a suffix of P_q a
               δ(q, a) ← k
    return δ
This procedure computes δ(q, a) according to its definition. The loop on line 2 (over q) cycles through all the states, while the nested loop on line 3 (over a) cycles through the alphabet. Thus all state-character combinations are accounted for. Lines 4-7 set δ(q, a) to be the largest k such that P_k is a suffix of P_q a.
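A direct Python transcription of this procedure might look like the sketch below. It assumes 0-based strings, stores the transition function as a list of dictionaries (one per state), and uses `str.endswith` for the suffix test. The function name and the representation are illustrative choices.

```python
def compute_transition_function(pattern, alphabet):
    """Naive construction: delta[q][a] is the largest k such that
    pattern[:k] is a suffix of pattern[:q] + a."""
    m = len(pattern)
    delta = []
    for q in range(m + 1):          # every state 0..m
        row = {}
        for a in alphabet:          # every character of the alphabet
            k = min(m, q + 1)       # longest prefix that could possibly match
            while not (pattern[:q] + a).endswith(pattern[:k]):
                k -= 1              # stops at k = 0 at the latest
            row[a] = k
        delta.append(row)
    return delta


for q, row in enumerate(compute_transition_function("nano", "nao")):
    print(q, row)
```

For the pattern "nano" the rows agree with the intuition from the earlier example: from state 3 ("nan"), reading "a" falls back to state 2 and reading "o" reaches the accepting state 4.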
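Running Time of Compute-Transition-Function

Running Time: O(m^3 |Σ|)
Outer loops: (m+1)|Σ| iterations, one per state-character pair
Inner loop: runs at most m+1 times
Testing whether P_k is a suffix of P_q a: requires up to m comparisons
Total: O(m|Σ|) · O(m) · O(m) = O(m^3 |Σ|)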
Improving Running Time
Much faster procedures for computing the transition function exist. The time required to compute δ from P can be improved to O(m|Σ|).
The time it takes to find the string is linear: O(n).

This brings the total runtime to:
O(n + m|Σ|)
Not bad if your pattern and alphabet are fairly small relative to the text you are searching in.
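The slides do not name a specific faster procedure. One commonly used approach, sketched here in Python under that caveat, keeps a fallback state x (the state the automaton would be in after reading the first q characters of P with the very first character dropped) and copies row x when building row q, so each of the m+1 rows costs O(|Σ|). The function name, the list-of-dictionaries representation, and the requirement that every pattern character appears in the alphabet are assumptions of this sketch.

```python
def build_transition_table(pattern, alphabet):
    """Build the transition table in O(m * |alphabet|) time.

    Assumes every character of `pattern` also appears in `alphabet`.
    """
    m = len(pattern)
    # Row 0: only the first pattern character leaves state 0.
    row0 = {a: 0 for a in alphabet}
    if m > 0:
        row0[pattern[0]] = 1
    delta = [row0]

    x = 0  # fallback state: where the automaton is after reading pattern[1:q]
    for q in range(1, m + 1):
        # On a mismatch, state q behaves exactly like the fallback state x.
        row = dict(delta[x])
        if q < m:
            row[pattern[q]] = q + 1   # the matching character advances
            x = delta[x][pattern[q]]  # advance the fallback state as well
        delta.append(row)
    return delta


table = build_transition_table("nano", "nao")
print(table[3])  # {'n': 1, 'a': 2, 'o': 4}
```

The fallback state is always smaller than q, so the row being copied has already been completed by the time it is needed.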
Sources
Cormen, et al. Introduction to Algorithms. MIT Press, Cambridge, 1990. pp. 862-868.

Reif, John. http://www.cs.duke.edu/education/courses/cps130/fall98/lectures/lect14/node28.html

Eppstein, David. http://www.ics.uci.edu/~eppstein/161/960222.html
