Knuth Morris Pratt (KMP) is an algorithm, which checks the characters from left to right. When a pattern has a sub-pattern appears more than one in the sub-pattern, it uses that property to improve the time complexity, also for in the worst case.
The time complexity of KMP is O(n).
Input and Output
Input: Main String: “AAAABAAAAABBBAAAAB”, The pattern “AAAB” Output: Pattern found at location: 1 Pattern found at location: 7 Pattern found at location: 14
Algorithm
findPrefix(pattern, m, prefArray)
Input − The pattern, the length of pattern and an array to store prefix location
Output − The array to store where prefixes are located
Begin length := 0 prefArray[0] := 0 for all character index ‘i’ of pattern, do if pattern[i] = pattern[length], then increase length by 1 prefArray[i] := length else if length ≠ 0 then length := prefArray[length - 1] decrease i by 1 else prefArray[i] := 0 done End
kmpAlgorithm(text, pattern)
Input: The main text, and the pattern, which will be searched
Output − The location where patterns are found
Begin n := size of text m := size of pattern call findPrefix(pattern, m, prefArray) while i < n, do if text[i] = pattern[j], then increase i and j by 1 if j = m, then print the location (i-j) as there is the pattern j := prefArray[j-1] else if i < n AND pattern[j] ≠ text[i] then if j ≠ 0 then j := prefArray[j - 1] else increase i by 1 done End
Example
#include<iostream>
using namespace std;
void findPrefix(string pattern, int m, int prefArray[]) {
int length = 0;
prefArray[0] = 0; //first place is always 0 as no prefix
for(int i = 1; i<m; i++) {
if(pattern[i] == pattern[length]) {
length++;
prefArray[i] = length;
}else {
if(length != 0) {
length = prefArray[length - 1];
i--; //decrease i to avoid effect of increasing after iteration
}else
prefArray[i] = 0;
}
}
}
void kmpPattSearch(string mainString, string pattern, int *locArray, int &loc) {
int n, m, i = 0, j = 0;
n = mainString.size();
m = pattern.size();
int prefixArray[m]; //prefix array as same size of pattern
findPrefix(pattern, m, prefixArray);
loc = 0;
while(i < n) {
if(mainString[i] == pattern[j]) {
i++; j++;
}
if(j == m) {
locArray[loc] = i-j; //item found at i-j position.
loc++;
j = prefixArray[j-1]; //get the prefix length from array
}else if(i < n && pattern[j] != mainString[i]) {
if(j != 0)
j = prefixArray[j-1];
else
i++;
}
}
}
int main() {
string str = "AAAABAAAAABBBAAAAB";
string patt = "AAAB";
int locationArray[str.size()];
int index;
kmpPattSearch(str, patt, locationArray, index);
for(int i = 0; i<index; i++) {
cout << "Pattern found at location: " <<locationArray[i] << endl;
}
}Output
Pattern found at location: 1 Pattern found at location: 7 Pattern found at location: 14