Finite Automata

A finite automaton (FA) is a model of computation that consists of a set of states, a transition function between states, an input alphabet, a starting state, and a set of accepting states. Given an input string, an FA reads the string character by character, changing state according to the transition function, and accepts or rejects the string depending on whether the state it is in after the last character is an accepting state. We can construct an FA to search for a pattern in a text by letting the states correspond to matched prefixes of the pattern, with transitions that extend the match or fall back to a shorter matched prefix on a mismatch.

Uploaded by

Praveen Bhushan
Copyright
© Attribution Non-Commercial (BY-NC)

String Matching with Finite Automata

A finite automaton (FA) consists of a tuple (Q, q0, A, Σ, δ), where
i) Q is a finite set of states,
ii) q0 ∈ Q is the starting state,
iii) A ⊆ Q is the set of accepting states,
iv) Σ is the input alphabet (finite),
v) δ: Q × Σ → Q is the transition function.

Given any input string x over the alphabet Σ, an FA
• starts in state q0, and
• reads the string x, character by character, changing state after each character read.

When the FA is in state q and reads character σ, it enters state δ(q,σ).

The finite automaton
• accepts the string x if it ends up in an accepting state, and
• rejects x if it does not end up in an accepting state.
Example:
Q = {0,1,2,3,4,5},
q0 = 0,
A = {2,4},
Σ = {a,b},
and δ(q,σ) is given by the table

    q \ σ    a    b
      0      1    3
      1      2    3
      2      2    4
      3      1    5
      4      2    5
      5      5    5

Example: Suppose our FA reads the string x = bababaabaaabbaab.

    symbol read     b  a  b  a  b  a  a  b  a  a  a  b  b  a  a  b
    new state    0  3  1  3  1  3  1  2  4  2  2  2  4  5  5  5  5

The final state (5) is not an accepting state. So the FA rejects string x.

Our FA accepts exactly those strings that contain two consecutive a's but do not contain two consecutive b's.
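The run of the example FA can be reproduced with a few lines of Python. This is only a minimal sketch: the names DELTA, ACCEPTING, START, run and accepts are illustrative choices, not part of the original notes, and the transition table is simply the δ(q,σ) table of the example written as a dictionary.

```python
# Transition table of the example FA: DELTA[q][sigma] = delta(q, sigma).
# (DELTA, ACCEPTING, START, run, accepts are illustrative names.)
DELTA = {
    0: {"a": 1, "b": 3},
    1: {"a": 2, "b": 3},
    2: {"a": 2, "b": 4},
    3: {"a": 1, "b": 5},
    4: {"a": 2, "b": 5},
    5: {"a": 5, "b": 5},
}
ACCEPTING = {2, 4}
START = 0


def run(x):
    """Return the list of states visited while reading x, starting with START."""
    states = [START]
    for symbol in x:
        states.append(DELTA[states[-1]][symbol])
    return states


def accepts(x):
    """The FA accepts x if the state reached after the last character is accepting."""
    return run(x)[-1] in ACCEPTING


if __name__ == "__main__":
    x = "bababaabaaabbaab"
    print(run(x))      # sequence of states, beginning with the start state 0
    print(accepts(x))  # False: the FA ends in state 5, which is not accepting
```

Calling run on the string from the example yields the same "new state" row as the trace, including the leading start state.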

We can represent our FA graphically like this:

[State diagram of the FA. State 0 = starting state. Double line boundary = accepting state.]

We can think of the states of our finite automaton as recording certain information about the characters read so far:

    State   Have two consecutive   Have two consecutive   Last character
            b's been found?        a's been found?        read
      0     no                     no                     none
      1     no                     no                     a
      2     no                     yes                    a
      3     no                     no                     b
      4     no                     yes                    b
      5     yes                    ---                    ---
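The claim that this FA accepts exactly the strings containing two consecutive a's but no two consecutive b's can be spot-checked by brute force. The sketch below is an illustration rather than a proof: it only tests strings up to a chosen length, and the names DELTA, ACCEPTING, accepts and first_counterexample are hypothetical helpers introduced here.

```python
from itertools import product

# Transition table of the example FA (same table as in the example).
DELTA = {
    0: {"a": 1, "b": 3},
    1: {"a": 2, "b": 3},
    2: {"a": 2, "b": 4},
    3: {"a": 1, "b": 5},
    4: {"a": 2, "b": 5},
    5: {"a": 5, "b": 5},
}
ACCEPTING = {2, 4}


def accepts(x):
    """Run the FA on x from state 0 and report whether it ends in an accepting state."""
    state = 0
    for symbol in x:
        state = DELTA[state][symbol]
    return state in ACCEPTING


def first_counterexample(max_len):
    """Return a string of length <= max_len on which the FA disagrees with
    the description 'contains aa but not bb', or None if there is none."""
    for n in range(max_len + 1):
        for chars in product("ab", repeat=n):
            s = "".join(chars)
            expected = ("aa" in s) and ("bb" not in s)
            if accepts(s) != expected:
                return s
    return None


if __name__ == "__main__":
    print(first_counterexample(10))  # None means no disagreement was found
```

The exhaustive loop examines 2^(n+1) - 1 strings for maximum length n, so it is only practical for small bounds.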
We can construct an FA to search for a pattern P = p_1 p_2 ... p_m in a text T = t_1 t_2 ... t_n.
• The FA will have m+1 states, which we number 0, 1, ..., m.
• State 0 will be the starting state, and state m will be the only accepting state.
• In general, the FA will be in state k if k characters of the pattern have been matched.
    o In other words, the FA is in state k if the k most recently read characters of the text match the first k pattern characters. (Below, t_{j+k-1} is the last character read.)

        T:  t_1 t_2 ...  t_j  t_{j+1}  t_{j+2}  ...  t_{j+k-2}  t_{j+k-1}  t_{j+k}  ...
        P:               p_1  p_2      p_3      ...  p_{k-1}    p_k        p_{k+1}  ...

• If the next text character t_{j+k} equals p_{k+1}, we have matched k+1 characters, and the FA enters state k+1.
    o In other words, δ(k, p_{k+1}) = k+1.
• If the next text character t_{j+k} differs from p_{k+1}, then the FA enters a state 0, 1, 2, ..., or k, depending on how many initial pattern characters match text characters ending with t_{j+k}.
    o We shift the pattern right till we obtain a match, or exhaust the pattern.

        T:  t_1 t_2 ...  t_j  t_{j+1}  t_{j+2}  ...  t_{j+k-2}  t_{j+k-1}  t_{j+k}  ...
        P:                    p_1      p_2      ...  p_{k-2}    p_{k-1}    p_k      ...
        If match, enter state k; else continue.

        T:  t_1 t_2 ...  t_j  t_{j+1}  t_{j+2}  ...  t_{j+k-2}  t_{j+k-1}  t_{j+k}  ...
        P:                             p_1      ...  p_{k-3}    p_{k-2}    p_{k-1}  ...
        If match, enter state k-1; else continue.

      We continue like this till we reach

        T:  t_1 t_2 ...  t_j  t_{j+1}  t_{j+2}  ...  t_{j+k-2}  t_{j+k-1}  t_{j+k}  ...
        P:                                                                 p_1      ...
        If match, enter state 1; else state 0.

    o If σ = t_{j+k} ≠ p_{k+1}, then δ(k, σ) = largest integer d such that

        t_{j+k-(d-1)} ... t_{j+k-2} t_{j+k-1} t_{j+k} = p_1 ... p_{d-2} p_{d-1} p_d

    o This makes it appear that δ(k, σ) depends on the text as well as the pattern, but recall that to be in state k to begin with we must have a match

        T:  t_1 t_2 ...  t_j  t_{j+1}  t_{j+2}  ...  t_{j+k-2}  t_{j+k-1}  t_{j+k}  ...
        P:               p_1  p_2      p_3      ...  p_{k-1}    p_k        p_{k+1}  ...

      meaning t_{j+i} = p_{i+1} for i = 0, 1, ..., k-1. So we can say that δ(k, σ) = largest integer d such that

        p_{k-d+2} ... p_{k-1} p_k σ = p_1 ... p_{d-2} p_{d-1} p_d

      Thus δ(k, σ) depends only on the pattern, k, and σ.
• If the FA reaches state m, a match has been found, and the FA remains in state m. (In practice, the computation could stop at this point.)
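The definition of δ can be turned into code almost literally. The following is a minimal sketch of that naive construction, assuming the pattern is a Python string and the alphabet is an iterable of single characters. The function name build_transition_table and the list-of-dictionaries layout are illustrative choices, not taken from the notes.

```python
def build_transition_table(pattern, alphabet):
    """Naive construction of delta straight from the definition.

    delta[k][sigma] is the largest d such that the last d characters of
    p_1 ... p_k sigma equal p_1 ... p_d. State m is absorbing.
    """
    m = len(pattern)
    delta = []
    for k in range(m + 1):
        row = {}
        for sigma in alphabet:
            if k == m:
                # Once the whole pattern has been matched, stay in state m.
                row[sigma] = m
                continue
            candidate = pattern[:k] + sigma
            # Try the longest possible d first and shrink until the
            # suffix of the candidate equals a prefix of the pattern.
            d = k + 1
            while not candidate.endswith(pattern[:d]):
                d -= 1  # d = 0 always succeeds (empty prefix)
            row[sigma] = d
        delta.append(row)
    return delta


if __name__ == "__main__":
    for k, row in enumerate(build_transition_table("ababc", "abc")):
        print(k, row)
```

Note that the case σ = p_{k+1} needs no special handling here: the largest d is then k+1, which agrees with δ(k, p_{k+1}) = k+1.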
Example: A finite automaton to match pattern ababc over alphabet Σ = {a,b,c}.

[State diagram of the FA for pattern ababc, with states 0, 1, 2, 3, 4, 5.]

Matching pattern ababc in text caabaabcabababccb.

    char read       c  a  a  b  a  a  b  c  a  b  a  b  a  b  c  c  b
    new state    0  0  1  1  2  3  1  2  0  1  2  3  4  3  4  5  5  5

Match succeeds when the FA first enters state 5; the computation can stop there.

Recall we compute the transition function, for k < m, by

    δ(m, σ) = m for all σ.
    δ(k, p_{k+1}) = k+1.
    δ(k, σ) = d if σ ≠ p_{k+1}, where d ≤ k is maximal subject to a match

        p_{k-d+2} p_{k-d+3} ... p_{k-1} p_k σ = p_1 p_2 ... p_{d-2} p_{d-1} p_d

Straight from the definition, we can compute
• δ(i,x) in O(m^2) time,
• all (m+1)|Σ| entries of δ in O(m^3) time, if we treat |Σ| as constant. (It may be a fairly large constant, e.g., 256.)

This isn't too bad, since typically the pattern is fairly short compared to the text.

But much more efficient constructions are known. (They are fairly simple, and reduce the time to O(m) if |Σ| is treated as constant.)

Once we have constructed a finite automaton for the pattern, searching a text t_1 t_2 ... t_n for the pattern works wonderfully.
• Search time is O(n).
• Each character in the text is examined just once, in sequential order.

    state = 0;
    for ( i = 1, 2, ..., n ) {
        state = δ( state, t_i );
        if ( state == m )
            Match succeeds in position i-m+1; stop;
    }
    Match fails;
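The search loop, together with one of the faster constructions, can be sketched in Python as follows. The notes do not say which efficient construction they have in mind, so this is an assumption: the sketch uses the well-known technique of copying each new row from a "fallback" state (the state the FA would be in after reading the pattern with its first character removed), which fills the table in O(m|Σ|) time. The names build_dfa and find_first are hypothetical, and every character of the text is assumed to belong to the alphabet.

```python
def build_dfa(pattern, alphabet):
    """Build delta for a non-empty pattern in O(m * |alphabet|) time.

    Assumed technique (not confirmed by the notes): row k is copied from
    the row of the fallback state, then the matching transition is set.
    """
    m = len(pattern)
    if m == 0:
        raise ValueError("pattern must be non-empty")
    delta = [dict.fromkeys(alphabet, 0) for _ in range(m + 1)]
    delta[0][pattern[0]] = 1
    fallback = 0  # state reached after reading pattern[1:k]
    for k in range(1, m):
        for sigma in alphabet:
            delta[k][sigma] = delta[fallback][sigma]  # mismatch cases
        delta[k][pattern[k]] = k + 1                  # match case
        fallback = delta[fallback][pattern[k]]
    for sigma in alphabet:
        delta[m][sigma] = m  # state m is absorbing
    return delta


def find_first(pattern, text, alphabet):
    """Return the 1-based position of the first match, or None if none.

    Follows the search loop of the notes: one table lookup per text
    character, stopping as soon as state m is reached.
    """
    delta = build_dfa(pattern, alphabet)
    m = len(pattern)
    state = 0
    for i, ch in enumerate(text, start=1):
        state = delta[state][ch]  # KeyError if ch is outside the alphabet
        if state == m:
            return i - m + 1
    return None


if __name__ == "__main__":
    print(find_first("ababc", "caabaabcabababccb", "abc"))  # 11
```

On the example text the loop stops after reading the 15th character, so the reported position is 15 - 5 + 1 = 11, the start of the occurrence of ababc.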
