Data Structure & Algorithms - Pattern Matching
Data Structures and Algorithms II

Regular Expressions
Patterns can be represented by just using two
special characters:
“|” represents alternatives.
ab|cd matches ab or cd.
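A quick way to try the alternation rule is Python's re module. This is only an illustration, and it assumes the alternation character is written "|" as in most regular expression syntaxes; the name pattern is made up for this sketch.

```python
import re

# Assumption: alternatives are written with "|", as in Python's re module.
pattern = re.compile(r"ab|cd")

for s in ["ab", "cd", "ad"]:
    # fullmatch succeeds only if the whole string matches the pattern
    print(s, bool(pattern.fullmatch(s)))
```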
Pattern Matching: Motivation
For many applications we want tools to find if a given
string matches some criteria, e.g.,
and “abstraction”.
These can typically be done by checking if strings
match a pattern.
We need:

Regular Expressions
Given the following regular expressions, which of
the example strings do you think they would match?
(c|de)*
(a*b|c+)
1. b
2. aaaa
3. ccc
4. ab
?*(ie|ei)?*
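One way to check guesses for the first two expressions is to run them through Python's re module. The sketch below assumes that "match" means the whole string must match (hence fullmatch) and that alternatives are written with "|"; the helper matching_strings is a name invented here.

```python
import re

patterns = [r"(c|de)*", r"(a*b|c+)"]
strings = ["b", "aaaa", "ccc", "ab"]

def matching_strings(pattern, candidates):
    """Return the candidates that the pattern matches in full."""
    regex = re.compile(pattern)
    return [s for s in candidates if regex.fullmatch(s)]

for p in patterns:
    print(p, "matches", matching_strings(p, strings))
```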
Regular Expressions and Finite State
Machines (FSMs)
These nodes represent states, and the
connections represent transitions between
them.
The nodes in our pattern matcher capture the
state “in which a certain character in the pattern

Implementing the Machine
graph ADTs (discussed later). But we never
allow a node to have more than two neighbours,
so a simpler data structure is possible.
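Because no state has more than two neighbours, the machine can be sketched as two parallel arrays, one holding the character of each state and one holding its (at most two) successors. The machine below is a hypothetical one for the pattern (a*b|c)d; the state numbering, the names nxt, NULL, START and FINAL, and the use of "?" for a state that consumes no character are assumptions for this sketch.

```python
NULL = "?"   # assumed marker for a state that consumes no character
START = 0
FINAL = 7

# ch[i] is the character state i must read; nxt[i] holds its two successors.
# Hypothetical machine for (a*b|c)d.
ch = [NULL, NULL, NULL, "a", "b", "c", "d", NULL]
nxt = [
    [1, 1],  # 0: start
    [2, 5],  # 1: choose between a*b and c
    [3, 4],  # 2: loop on 'a' or move on to 'b'
    [2, 2],  # 3: 'a', then back to the loop state
    [6, 6],  # 4: 'b'
    [6, 6],  # 5: 'c'
    [7, 7],  # 6: 'd'
    [7, 7],  # 7: final
]

def successors(state):
    """Distinct states reachable from state in one transition."""
    return sorted(set(nxt[state]))

print(successors(1))  # the two branches of the alternation
```

States with a single successor simply store it twice, which keeps every row the same width.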
Algorithm
We’re now ready for an algorithm for pattern
matching.
Example:
Pattern Matching Algorithm
A variant of the stack/queue is used as the
ADT: double ended queue allows nodes to be
character
e.g., (1 2 + 5 6)
States before “+” represent possible states
corresponding to current character.
States after “+” represent possible states
corresponding to next character.

(j=position in string)
dq.put('+'); j=0;
state=next[start][0];
While not at end of string or pattern:
  If (state=='+') { j++; dq.put('+'); }
  (all states for the current character are done, so move on
  and put the marker back at the end of the queue)
  else if (ch[state]==str[j])
    dq.put(next[state][0]);
  (char matches one in pattern, so put next state
  on queue)
  else if (ch[state]=='?') {
    dq.push(next[state][0]);
    dq.push(next[state][1]);
  }
  state=dq.pop();
  (remove state from dq)
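The pseudocode can be sketched in Python with collections.deque, reading dq.put as append (add at the end), dq.push as appendleft (add at the front) and dq.pop as popleft. The machine for (a*b|c)d, the names nxt, NULL, START, FINAL and SCAN, and the choice to require the whole string to match are assumptions of this sketch, not details from the slides.

```python
from collections import deque

SCAN = "+"   # marker between current-character and next-character states
NULL = "?"   # assumed marker for a state that consumes no character
START, FINAL = 0, 7

# Hypothetical machine for (a*b|c)d, stored as two parallel arrays.
ch = [NULL, NULL, NULL, "a", "b", "c", "d", NULL]
nxt = [[1, 1], [2, 5], [3, 4], [2, 2], [6, 6], [6, 6], [7, 7], [7, 7]]

def matches(text):
    """Return True if the machine accepts the whole of text."""
    dq = deque([SCAN])
    j = 0                      # position in the string
    state = nxt[START][0]
    while True:
        if state == SCAN:
            # All states for text[j] have been tried.
            if j == len(text) or not dq:
                return False   # input exhausted, or no state can continue
            j += 1
            dq.append(SCAN)    # put the marker back at the end
        elif state == FINAL:
            if j == len(text):
                return True    # pattern finished exactly at end of input
        elif ch[state] == NULL:
            # No character consumed: both successors belong to the
            # current character, so they go on the front.
            dq.appendleft(nxt[state][0])
            dq.appendleft(nxt[state][1])
        elif j < len(text) and ch[state] == text[j]:
            # Character matched: the successor belongs to the next character.
            dq.append(nxt[state][0])
        state = dq.popleft()   # remove state from dq

for s in ["cd", "aabd", "bd", "ad", "abdd"]:
    print(s, matches(s))
```

The marker is always either the current state or somewhere in the deque, so popleft never runs on an empty deque.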
dq      j   state
(+ 6)   2   5
(6)     2   +
(6 +)   3   +
(+)     3   6
Summary