String Matching: COMP171 Fall 2005

This document discusses string matching algorithms. It describes the naive algorithm, which has a worst-case running time of O(nm) to find all occurrences of a pattern string P of length m in a text T of length n. It then introduces the Boyer-Moore algorithm, which matches the pattern from right to left and shifts the pattern by various amounts depending on a mismatch. This allows it to skip many character comparisons and run faster in practice than the naive algorithm. The shifts are based on tables constructed in O(m + A) time where A is the alphabet size.

Uploaded by

saiteja1234

COMP171

Fall 2005

String Matching

String matching 2

Pattern Matching
Given a text string T[0..n-1] and a pattern P[0..m-1], find all occurrences of the pattern within the text.

Example: for T = 000010001010001 and P = 0001, the occurrences are:

first occurrence starts at T[1]
second occurrence starts at T[5]
third occurrence starts at T[11]


Naïve algorithm

Worst-case running time = O(nm).
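
The slide's own listing of the naïve algorithm did not survive extraction, so the following is only a minimal sketch of the usual approach, not the original code. It tries every alignment of P against T and compares left to right; the function name naive_search is an assumption made for illustration.

```python
def naive_search(text, pattern):
    """Return the start index of every occurrence of pattern in text."""
    n, m = len(text), len(pattern)
    occurrences = []
    # Try every possible alignment of the pattern against the text.
    for start in range(n - m + 1):
        j = 0
        # Compare left to right until a mismatch or a full match.
        while j < m and text[start + j] == pattern[j]:
            j += 1
        if j == m:
            occurrences.append(start)
    return occurrences


print(naive_search("000010001010001", "0001"))  # [1, 5, 11]
```

There are at most n - m + 1 alignments and each costs at most m comparisons, which is where the O(nm) bound comes from.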


Boyer-Moore Algorithm
The basic idea is simple.

We match the pattern P against substrings in the text string T, comparing characters from right to left.

We align the pattern with the beginning of the text string and compare the characters starting from the rightmost character of the pattern. If the comparison fails, we shift the pattern to the right. By how far?


Boyer-Moore Algorithm

Suppose we are comparing the last character P[m-1] of the pattern with some character T[k] in the text. If P[m-1] ≠ T[k], then the pattern does not occur here.

Case (1): if the character T[k] does not appear in P at all, we should shift P all the way to align P[0] with T[k+1], and then match P[m-1] with T[k+m] again. This saves a lot of character comparisons.

Case (2): if the character T[k] appears in P, then we should shift P to align the rightmost occurrence of this character in P with T[k].
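
As a rough illustration of Cases (1) and (2), the shift can be computed directly from the position of the mismatching text character inside P. This is a sketch under the assumption that c is the text character T[k] facing P[m-1] and that c differs from P[m-1]; the function name is hypothetical.

```python
def shift_after_last_char_mismatch(pattern, c):
    """Shift amount when text character c faces P[m-1] and c != P[m-1]."""
    m = len(pattern)
    j = pattern.rfind(c)      # rightmost occurrence of c in P, or -1 if absent
    if j == -1:
        return m              # Case (1): P[0] moves to T[k+1]
    return (m - 1) - j        # Case (2): P[j] moves under T[k]


print(shift_after_last_char_mismatch("abcd", "x"))  # 4
print(shift_after_last_char_mismatch("abcd", "b"))  # 2
```

In Case (1) the whole pattern jumps past T[k], so m text positions are skipped with a single comparison.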


Examples

[Figure: three example shifts, labeled Case (1), Case (2), and Case (1).]


If the last character P[m-1] of the pattern matches T[k], then we continue scanning P from right to left, comparing it against T.

If we find a complete match, we are done.

Otherwise (Case (3)), whenever we fail to find a complete match, we should always shift P to align the next rightmost occurrence of P[m-1] in P with T[k] and try again. If P[m-1] occurs nowhere else in P, the pattern is shifted entirely past T[k], as in Case (1).

[Figure: three example shifts, labeled Case (3), Case (2), and Case (2).]
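
Putting the three cases together, a search loop might look like the sketch below. It is not the lecture's code: the name bm_search is an assumption, the shift is recomputed by scanning P at every step rather than read from a table, and the loop keeps going after a complete match so that all occurrences are reported. The shift rule described in these slides matches what is commonly called the Horspool simplification; the full Boyer-Moore algorithm uses further rules that are not covered here.

```python
def bm_search(text, pattern):
    """Return all start indices of pattern in text, scanning right to left."""
    n, m = len(text), len(pattern)
    occurrences = []
    if m == 0 or m > n:
        return occurrences
    k = m - 1                       # T[k] is aligned with P[m-1]
    while k < n:
        i, j = k, m - 1
        # Compare from the rightmost pattern character towards the left.
        while j >= 0 and text[i] == pattern[j]:
            i -= 1
            j -= 1
        if j < 0:
            occurrences.append(i + 1)   # complete match found
        # Rightmost occurrence of T[k] in P[0..m-2]; this one lookup covers
        # Case (1) (absent), Case (2) and Case (3).
        r = pattern.rfind(text[k], 0, m - 1)
        k += m if r == -1 else (m - 1) - r
    return occurrences


print(bm_search("000010001010001", "0001"))  # [1, 5, 11]
```

Excluding P[m-1] itself from the lookup is what makes the same rule work for Case (3): when T[k] equals P[m-1], the next rightmost occurrence is the one that gets aligned.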


Boyer-Moore algorithm

To implement this, we need to find out, for each character c in the alphabet, the amount of shift needed if P[m-1] is aligned with the character c in the input text and they don't match.

Building this table takes O(m + A) time, where A is the number of possible characters.

Afterwards, matching P with substrings in T is very fast in practice.
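
One way the shift table could be built in O(m + A) time is sketched below. The function name, the default alphabet size of 256, and the use of ord() to index the table are all assumptions; the sketch expects every character code in P to be smaller than alphabet_size.

```python
def build_shift_table(pattern, alphabet_size=256):
    """shift[code of c] = how far to move P when c faces P[m-1]."""
    m = len(pattern)
    # O(A): a character that does not occur in P[0..m-2] shifts by m.
    shift = [m] * alphabet_size
    # O(m): later positions overwrite earlier ones, so the rightmost
    # occurrence of each character in P[0..m-2] determines its shift.
    for j in range(m - 1):
        shift[ord(pattern[j])] = (m - 1) - j
    return shift


table = build_shift_table("0001")
print(table[ord("0")], table[ord("1")], table[ord("x")])  # 1 4 4
```

The last pattern position is left out of the loop on purpose, so the same entry also gives the shift for Case (3), where T[k] equals P[m-1] but the complete match fails.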
