ICS202-11-String Matching
ICS 202
Data Structures and Algorithms
Unit 11
String Matching
Reading Assignment
Outline
Introduction
Straightforward (Brute Force) Algorithm
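The straightforward approach can be sketched as follows: try every alignment of the pattern against the text and compare character by character until a mismatch or a full match. This is a minimal illustration, not the code from the original slides; the class name BruteForceMatch and the method searchBruteForce are hypothetical, and returning a list of start indexes is an assumption made for readability.

```java
import java.util.ArrayList;
import java.util.List;

class BruteForceMatch {
    // Returns every index i at which pattern occurs in text.
    // (Hypothetical helper; an empty pattern is treated as "no matches".)
    static List<Integer> searchBruteForce(String pattern, String text) {
        List<Integer> matches = new ArrayList<>();
        int m = pattern.length();
        int n = text.length();
        if (m == 0) {
            return matches;
        }
        // Try each alignment i of the pattern against the text.
        for (int i = 0; i + m <= n; i++) {
            int j = 0;
            // Compare left to right until a mismatch or the pattern is exhausted.
            while (j < m && text.charAt(i + j) == pattern.charAt(j)) {
                j++;
            }
            if (j == m) {
                matches.add(i);
            }
        }
        return matches;
    }

    public static void main(String[] args) {
        System.out.println(searchBruteForce("aba", "ababa")); // prints [0, 2]
    }
}
```

After a mismatch the pattern shifts by exactly one position and the comparison restarts at j = 0, so in the worst case roughly m comparisons are made for each of the n - m + 1 alignments.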
The Knuth-Morris-Pratt Algorithm
KMP Algorithm - Motivation
[Figure: character-by-character comparison trace (✓ marks a successful comparison), with i=4 and j=2 indicated]
KMP Algorithm - Mechanism
KMP Algorithm – Preprocessing Terms
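A proper prefix of a string x[0..n−1] is a substring x[0..j] with 0 ≤ j ≤ n−2.
◼ The empty string 𝜆 is also a proper prefix.
A proper suffix of x[0..n−1] is a substring x[j..n−1] with 1 ≤ j ≤ n−1.
◼ The empty string 𝜆 is also a proper suffix.
A border of x[0..n−1] is a substring that is both a proper prefix and a proper suffix of x.
› 𝜆 is a border of every non-empty string.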
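To make the preprocessing terms concrete, the sketch below lists every border of a string, where a border is taken to be a proper prefix that is also a proper suffix, with the empty string 𝜆 counted as a border. The class name Borders and the method borders are hypothetical names chosen for this illustration.

```java
import java.util.ArrayList;
import java.util.List;

class Borders {
    // All borders of s, shortest first. For a non-empty s the empty
    // string (lambda) is always the first entry.
    static List<String> borders(String s) {
        List<String> result = new ArrayList<>();
        // len < s.length() keeps the prefix proper (s itself is excluded).
        for (int len = 0; len < s.length(); len++) {
            String prefix = s.substring(0, len);
            if (s.endsWith(prefix)) {
                result.add(prefix);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        // Borders of "abacaba": lambda, "a" and "aba".
        System.out.println(borders("abacaba")); // prints [, a, aba]
    }
}
```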
KMP Algorithm - Preprocessing
If the border is 𝜆
KMP Algorithm – The lps- or next-array
[Table: one column per index 𝑗, with rows 𝑗, P[j], 𝑃[0..𝑗−1] and 𝑛𝑒𝑥𝑡[𝑗]; entries with no non-empty border are shown as 𝜆]
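The table can be reproduced directly from the definition of a border. The sketch below computes, for each prefix of the pattern, the length of its longest border by plain checking. It assumes the lps convention used by computeLPSArray in this unit, where entry i describes pattern[0..i]; a next[j] table defined over 𝑃[0..𝑗−1] would hold the same values shifted by one position. The names BorderTable, longestBorder and borderTable are hypothetical.

```java
import java.util.Arrays;

class BorderTable {
    // Length of the longest border of s (0 when only lambda is a border).
    static int longestBorder(String s) {
        for (int len = s.length() - 1; len > 0; len--) {
            // Does the suffix of length len equal the prefix of length len?
            if (s.startsWith(s.substring(s.length() - len))) {
                return len;
            }
        }
        return 0;
    }

    // table[i] = length of the longest border of pattern[0..i].
    static int[] borderTable(String pattern) {
        int[] table = new int[pattern.length()];
        for (int i = 0; i < pattern.length(); i++) {
            table[i] = longestBorder(pattern.substring(0, i + 1));
        }
        return table;
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(borderTable("abacab"))); // prints [0, 0, 1, 0, 1, 2]
    }
}
```

This direct version is far slower than the linear-time preprocessing, but it is handy for checking a hand-built table.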
The KMP Algorithm – Principle Explained
KMP Algorithm Search Methodology
[Figure: pattern aligned under the text with its 𝑛𝑒𝑥𝑡[𝑗] row; i indexes the text and j indexes the pattern]
The KMP Algorithm – Tracing example 01...
The KMP Algorithm
public static String searchKMP(String pattern, String text)
{
    int M = pattern.length();
    int N = text.length();
    String indexes = "";

    // An empty pattern has no lps[] entries to use; report no matches
    if (M == 0)
        return indexes;

    // Preprocess the pattern (calculate lps[] array)
    // lps[] will hold the longest prefix suffix values for pattern
    int[] lps = computeLPSArray(pattern);

    int i = 0; // index into text
    int j = 0; // index into pattern
    while (i < N) {
        if (pattern.charAt(j) == text.charAt(i)) {
            j++;
            i++;
        }
        if (j == M) {
            // full match ending at i - 1; record its start index
            indexes += (i - j) + " ";
            j = lps[j - 1];
        }
        // mismatch after j matches
        else if (i < N && pattern.charAt(j) != text.charAt(i)) {
            // The first lps[j - 1] characters of the pattern are known
            // to match already, so they are not compared again
            if (j != 0)
                j = lps[j - 1];
            else
                i = i + 1;
        }
    }
    return indexes;
}

static int[] computeLPSArray(String pattern) {
    int M = pattern.length();
    int lps[] = new int[M];
    if (M == 0)
        return lps; // nothing to compute for an empty pattern

    // length of the previous longest prefix suffix
    int j = 0;
    int i = 1;
    lps[0] = 0; // lps[0] is always 0

    // the loop calculates lps[i] for i = 1 to M-1
    while (i < M) {
        if (pattern.charAt(i) == pattern.charAt(j)) {
            j++;
            lps[i] = j;
            i++;
        }
        else // pattern.charAt(i) != pattern.charAt(j)
        {
            if (j != 0) {
                // fall back to the next shorter border; i is not advanced
                j = lps[j - 1];
            }
            else // j == 0: no border ends at position i
            {
                lps[i] = j;
                i++;
            }
        }
    }
    return lps;
}
The KMP Algorithm – Tracing example 02
Comparison of Brute Force and KMP Algorithms
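One way to see the difference is to count character comparisons on an input that is unfavourable for the brute force method. The sketch below is an illustration under stated assumptions rather than the measurement used in the slides: the class ComparisonCount, its counter and its helper methods are hypothetical, the pattern is assumed to be non-empty, and only comparisons between a text character and a pattern character are counted (the preprocessing of the pattern is not).

```java
class ComparisonCount {
    static long comparisons; // hypothetical counter; reset before each run

    // Compares two characters and records that a comparison took place.
    static boolean same(char a, char b) {
        comparisons++;
        return a == b;
    }

    // Brute force: returns the number of occurrences of p in t.
    static int countBruteForce(String p, String t) {
        int found = 0;
        for (int i = 0; i + p.length() <= t.length(); i++) {
            int j = 0;
            while (j < p.length() && same(t.charAt(i + j), p.charAt(j))) {
                j++;
            }
            if (j == p.length()) {
                found++;
            }
        }
        return found;
    }

    // lps[i] = length of the longest border of p[0..i].
    static int[] lps(String p) {
        int[] lps = new int[p.length()];
        int j = 0;
        for (int i = 1; i < p.length(); ) {
            if (p.charAt(i) == p.charAt(j)) {
                lps[i++] = ++j;
            } else if (j != 0) {
                j = lps[j - 1];
            } else {
                lps[i++] = 0;
            }
        }
        return lps;
    }

    // KMP: returns the number of occurrences of p in t (p must be non-empty).
    static int countKmp(String p, String t) {
        int[] lps = lps(p);
        int found = 0, i = 0, j = 0;
        while (i < t.length()) {
            if (same(p.charAt(j), t.charAt(i))) {
                i++;
                j++;
                if (j == p.length()) {
                    found++;
                    j = lps[j - 1];
                }
            } else if (j != 0) {
                j = lps[j - 1]; // shift the pattern; the text index stays put
            } else {
                i++;
            }
        }
        return found;
    }

    public static void main(String[] args) {
        StringBuilder sb = new StringBuilder();
        for (int k = 0; k < 1000; k++) {
            sb.append('a');
        }
        String text = sb.toString();
        String pattern = "aaab";

        comparisons = 0;
        countBruteForce(pattern, text);
        System.out.println("brute force comparisons: " + comparisons);

        comparisons = 0;
        countKmp(pattern, text);
        System.out.println("KMP comparisons: " + comparisons);
    }
}
```

On this input the brute force method re-reads almost the whole pattern at every alignment, while the KMP search never moves the text index backwards, which keeps its comparison count within about twice the text length.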
Some applications of String matching
Exercises