KMP Algorithm: Engineerpro - K01

The document provides an overview of the KMP (Knuth-Morris-Pratt) algorithm for finding occurrences of a string pattern within a haystack. It explains the inefficiencies of brute force methods, introduces the KMP table for optimizing searches, and details the time complexities of both the KMP table calculation and the search process. Additionally, it includes examples and homework problems related to the KMP algorithm.

Uploaded by

minh.tn.hust
Copyright
© All Rights Reserved

EngineerPro - K01

KMP Algorithm

Tran Minh Hieu, 2023.


Contents
1. Introduction
2. Examples
3. Homework

1. Introduction

KMP Algorithm
● The problem: Find all occurrences of the string pattern inside the string haystack.
○ The inputs don't have to be strings; they can be any arrays of comparable data, such as arrays of integers, bytes, etc.
● Definitions:
○ n: length of pattern
○ m: length of haystack
● Brute force algorithm: For every position of haystack, check whether pattern occurs starting at that position.
○ Time complexity: O(n * m)
○ Why is it inefficient?
■ We have to start again from the beginning of pattern for every position of haystack.
■ Can we skip the part of pattern that we already know has matched?
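
The brute force approach described above can be sketched as follows. The function name bruteForceSearch is hypothetical (it does not appear in the slides), and returning no matches for an empty pattern is an assumption made here for simplicity.

```cpp
#include <string>
#include <vector>

// Hypothetical helper (name not from the slides): returns every index of
// haystack at which pattern starts, restarting the comparison at each position.
std::vector<int> bruteForceSearch(const std::string& haystack, const std::string& pattern) {
    std::vector<int> result;
    int m = static_cast<int>(haystack.size());
    int n = static_cast<int>(pattern.size());
    if (n == 0 || n > m) return result;  // assumption: empty pattern yields no matches

    for (int start = 0; start + n <= m; start++) {
        int j = 0;
        // Always restarts from pattern[0]: up to n comparisons per start position.
        while (j < n && haystack[start + j] == pattern[j]) {
            j++;
        }
        if (j == n) result.push_back(start);
    }
    return result;
}
```

For instance, bruteForceSearch("abababa", "aba") reports the overlapping matches at indices 0, 2 and 4. The inner loop forgets everything it learned at the previous start position, which is exactly the waste KMP removes.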

● KMP table (partial match table): For a string s, the KMP table kmpTable is an array of integers with:
kmpTable[0] = 0
kmpTable[i] = max(v | 0 <= v <= i and s[0: v] == s[i - v + 1: i + 1]) for all i > 0
○ In words: kmpTable[i] is the length of the longest proper prefix of s[0: i + 1] that is also a suffix of it.
● When matching pattern in haystack, if a character mismatches, we can use pattern's KMP table to minimize how far we have to fall back in pattern, without ever moving backwards in haystack.
● Algorithm to calculate kmpTable:

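#include <vector>

// Works for std::string, std::vector<int>, or any sequence with size() and operator[].
template <typename Seq>
std::vector<int> calculateKMPTable(const Seq& s) {
    int n = static_cast<int>(s.size());
    std::vector<int> kmpTable(n, 0);  // kmpTable[0] stays 0

    // j = length of the longest prefix of s that is also a suffix of s[0..i-1].
    for (int i = 1, j = 0; i < n; i++) {
        // On mismatch, fall back to the next shorter candidate prefix.
        while (j > 0 && s[i] != s[j]) {
            j = kmpTable[j - 1];
        }
        if (s[i] == s[j]) {
            ++j;
        }
        kmpTable[i] = j;
    }

    return kmpTable;
}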
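
The table built by calculateKMPTable() stores, at index i, the length of the longest proper prefix of s[0..i] that is also a suffix of s[0..i]. As a cross-check, here is a minimal sketch that computes the same values directly from that description. The name naiveKMPTable is hypothetical, and the function is deliberately slow (roughly cubic), so it is only meant for verifying small inputs.

```cpp
#include <string>
#include <vector>

// Hypothetical reference implementation (not from the slides): for each i,
// tries every candidate length v from longest to shortest and keeps the
// first v for which s[0..v-1] equals s[i-v+1..i].
std::vector<int> naiveKMPTable(const std::string& s) {
    int n = static_cast<int>(s.size());
    std::vector<int> table(n, 0);
    for (int i = 1; i < n; i++) {
        for (int v = i; v > 0; v--) {  // v <= i keeps the prefix proper
            if (s.compare(0, v, s, i - v + 1, v) == 0) {
                table[i] = v;
                break;
            }
        }
    }
    return table;
}
```

For s = "abacaba" this gives [0, 0, 1, 0, 1, 2, 3]: the last entry is 3 because "aba" is both a prefix and a suffix of the whole string.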

● Evaluating time complexity of calculateKMPTable():
○ Outer loop: i goes from 1 to n - 1.
○ Inner loop:
■ j is increased by at most one per iteration of the outer loop → at most n - 1 increases in total, so j can go up to n - 1.
■ j may be decreased several times in one iteration, but it can never go below 0 → the total number of decreases cannot exceed the total number of increases, i.e. n - 1.
→ At most 2n - 2 addition/subtraction operations in total, equivalent to O(n) time complexity.

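#include <string>
#include <vector>

// Returns the start index of every occurrence of pattern in haystack.
std::vector<int> kmpSearch(const std::string& haystack, const std::string& pattern) {
    auto result = std::vector<int>();
    int m = haystack.size();
    int n = pattern.size();
    if (n == 0) return result;  // convention chosen here: an empty pattern has no matches
    auto kmpTable = calculateKMPTable(pattern);
    // j = number of pattern characters currently matched.
    for (int i = 0, j = 0; i < m; i++) {
        while (j > 0 && haystack[i] != pattern[j]) {
            j = kmpTable[j - 1];  // fall back to the next shorter candidate
        }

        if (haystack[i] == pattern[j]) {
            if (++j == n) {
                result.push_back(i - n + 1);  // the match ends at position i
                j = kmpTable[j - 1];          // continue, so overlapping matches are found
            }
        }
    }

    return result;
}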

● Evaluating time complexity of kmpSearch():
○ Outer loop: i goes from 0 to m - 1.
○ Inner loop:
■ j is increased by at most one per iteration of the outer loop → at most m increases in total (j itself never exceeds n).
■ j may be decreased several times, but it can never go below 0 → the total number of decreases cannot exceed the total number of increases, i.e. m.
→ At most 2m addition/subtraction operations in the search loop, plus at most 2n - 2 for calculateKMPTable(pattern), equivalent to O(n + m) time complexity.
● Overall complexity of the KMP Algorithm: O(n + m).
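
To make the difference in cost concrete, the following sketch counts character comparisons between haystack and pattern for both approaches. The two counting functions are hypothetical helpers added for illustration; calculateKMPTable() is restated from the slides (specialized to std::string) only so that the block is self-contained. With this way of counting, the KMP loop performs at most 3m comparisons, so it stays linear in m.

```cpp
#include <string>
#include <vector>

// Prefix table as computed by calculateKMPTable() in the slides.
std::vector<int> calculateKMPTable(const std::string& s) {
    int n = static_cast<int>(s.size());
    std::vector<int> kmpTable(n, 0);
    for (int i = 1, j = 0; i < n; i++) {
        while (j > 0 && s[i] != s[j]) j = kmpTable[j - 1];
        if (s[i] == s[j]) ++j;
        kmpTable[i] = j;
    }
    return kmpTable;
}

// Hypothetical helper: comparisons made by the KMP search loop.
long long countKMPComparisons(const std::string& haystack, const std::string& pattern) {
    long long comparisons = 0;
    int m = static_cast<int>(haystack.size());
    int n = static_cast<int>(pattern.size());
    if (n == 0) return 0;
    auto kmpTable = calculateKMPTable(pattern);
    // Every haystack/pattern comparison goes through this lambda so it is counted.
    auto differs = [&](int i, int j) {
        ++comparisons;
        return haystack[i] != pattern[j];
    };
    for (int i = 0, j = 0; i < m; i++) {
        while (j > 0 && differs(i, j)) j = kmpTable[j - 1];
        if (!differs(i, j) && ++j == n) j = kmpTable[j - 1];
    }
    return comparisons;
}

// Hypothetical helper: comparisons made by the brute force search.
long long countBruteForceComparisons(const std::string& haystack, const std::string& pattern) {
    long long comparisons = 0;
    int m = static_cast<int>(haystack.size());
    int n = static_cast<int>(pattern.size());
    for (int start = 0; start + n <= m; start++) {
        for (int j = 0; j < n; j++) {
            ++comparisons;
            if (haystack[start + j] != pattern[j]) break;
        }
    }
    return comparisons;
}
```

A haystack of 20 copies of 'a' with the pattern "aaaab" is a bad case for brute force: each of the 16 start positions costs 5 comparisons, 80 in total, while the KMP count stays below 3 * 20 = 60.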

● Visualization: https://cmps-people.ok.ubc.ca/ylucet/DS/KnuthMorrisPratt.html

2. Examples

Example 1

● https://leetcode.com/problems/find-the-index-of-the-first-occurrence-in-a-string/description/
○ Well, just calculate the KMP table and run kmpSearch(): the first index found is the answer (-1 if there is none)

Example 2

● https://leetcode.com/problems/longest-happy-prefix/description/
○ Well, just calculate the KMP table too: the answer is the prefix whose length is the last entry of the table
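
One possible way to turn this hint into code is sketched below. It assumes the table semantics of calculateKMPTable() from the slides (restated here for std::string so the block is self-contained); the name longestHappyPrefix is a hypothetical choice, not a required signature.

```cpp
#include <string>
#include <vector>

// Prefix table as computed by calculateKMPTable() in the slides.
std::vector<int> calculateKMPTable(const std::string& s) {
    int n = static_cast<int>(s.size());
    std::vector<int> kmpTable(n, 0);
    for (int i = 1, j = 0; i < n; i++) {
        while (j > 0 && s[i] != s[j]) j = kmpTable[j - 1];
        if (s[i] == s[j]) ++j;
        kmpTable[i] = j;
    }
    return kmpTable;
}

// Hypothetical solution sketch: the last table entry is the length of the
// longest proper prefix of s that is also a suffix of s.
std::string longestHappyPrefix(const std::string& s) {
    if (s.empty()) return "";
    auto kmpTable = calculateKMPTable(s);
    return s.substr(0, kmpTable.back());
}
```

For "ababab" the table ends in 4, so the result is "abab"; for "level" it ends in 1, giving "l".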

Example 3

● https://leetcode.com/problems/shortest-palindrome/description/
○ Well, it's the previous problem, but in reverse…
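
A minimal sketch of one common reading of this hint: build s + '#' + reverse(s) and take the last entry of its KMP table, which is the length of the longest palindromic prefix of s. This assumes the separator '#' never occurs in s (the problem restricts s to lowercase letters), and the function name shortestPalindrome is a hypothetical choice.

```cpp
#include <algorithm>
#include <string>
#include <vector>

// Prefix table as computed by calculateKMPTable() in the slides.
std::vector<int> calculateKMPTable(const std::string& s) {
    int n = static_cast<int>(s.size());
    std::vector<int> kmpTable(n, 0);
    for (int i = 1, j = 0; i < n; i++) {
        while (j > 0 && s[i] != s[j]) j = kmpTable[j - 1];
        if (s[i] == s[j]) ++j;
        kmpTable[i] = j;
    }
    return kmpTable;
}

// Hypothetical solution sketch. Assumes '#' does not appear in s.
std::string shortestPalindrome(const std::string& s) {
    std::string reversed(s.rbegin(), s.rend());
    std::string combined = s + '#' + reversed;
    // A prefix of s that is also a suffix of reverse(s) reads the same both
    // ways, so the last table entry is the longest palindromic prefix length.
    int k = calculateKMPTable(combined).back();
    // Mirror the part of s that is not covered by that palindrome.
    std::string tail = s.substr(k);
    std::reverse(tail.begin(), tail.end());
    return tail + s;
}
```

For "abcd" only "a" is a palindromic prefix, so "dcb" is prepended and the result is "dcbabcd".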

Example 4

● https://leetcode.com/problems/repeated-string-match/description/
○ It's the classic KMP problem, but with a twist!
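
One way to handle the twist, sketched under assumptions: repeat a until it is at least as long as b, search with KMP, and if that fails try exactly one more copy (an occurrence of b can start inside any copy of a, so one extra copy is enough). The helper names kmpContains and repeatedStringMatch are hypothetical, and the handling of empty inputs is a choice made here rather than something the slides specify.

```cpp
#include <string>
#include <vector>

// Prefix table as computed by calculateKMPTable() in the slides.
std::vector<int> calculateKMPTable(const std::string& s) {
    int n = static_cast<int>(s.size());
    std::vector<int> kmpTable(n, 0);
    for (int i = 1, j = 0; i < n; i++) {
        while (j > 0 && s[i] != s[j]) j = kmpTable[j - 1];
        if (s[i] == s[j]) ++j;
        kmpTable[i] = j;
    }
    return kmpTable;
}

// Hypothetical helper: true if pattern occurs in haystack (KMP search that
// stops at the first match).
bool kmpContains(const std::string& haystack, const std::string& pattern) {
    int m = static_cast<int>(haystack.size());
    int n = static_cast<int>(pattern.size());
    if (n == 0) return true;
    auto kmpTable = calculateKMPTable(pattern);
    for (int i = 0, j = 0; i < m; i++) {
        while (j > 0 && haystack[i] != pattern[j]) j = kmpTable[j - 1];
        if (haystack[i] == pattern[j] && ++j == n) return true;
    }
    return false;
}

// Hypothetical solution sketch: minimum number of copies of a whose
// concatenation contains b, or -1 if no number of copies works.
int repeatedStringMatch(const std::string& a, const std::string& b) {
    if (b.empty()) return 0;   // assumption: zero copies suffice for an empty b
    if (a.empty()) return -1;  // assumption: an empty a can never contain b
    std::string repeated;
    int copies = 0;
    while (repeated.size() < b.size()) {
        repeated += a;
        copies++;
    }
    if (kmpContains(repeated, b)) return copies;
    repeated += a;  // one extra copy covers matches that start near the end
    if (kmpContains(repeated, b)) return copies + 1;
    return -1;
}
```

For a = "abcd" and b = "cdabcdab", two copies are long enough but do not contain b, while three copies do, so the answer is 3.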

3. Homework

Homework

1. https://leetcode.com/problems/remove-all-occurrences-of-a-substring/description/
○ Implementing it may look challenging, but it actually isn't!
2. https://leetcode.com/problems/form-array-by-concatenating-subarrays-of-another-array/
○ KMP on array + dynamic programming, oh boi…
3. https://leetcode.com/problems/camelcase-matching/
○ Now do this again, but with passion!
