Open In App

KMP Algorithm for Pattern Searching

Last Updated : 25 Jul, 2025
Comments
Improve
Suggest changes
Like Article
Like
Report

Given two strings: txt, representing the main text, and pat, representing the pattern to be searched.
Find and return all starting indices in txt where the string pat appears as a substring.
The matching should be exact, and the indices should be 0-based, meaning the first character of txt is considered to be at index 0.

Examples:

Input: txt = "abcab", pat = "ab"
Output: [0, 3]
Explanation: The string "ab" occurs twice in txt, first occurrence starts from index 0 and second from index 3.

Input: txt =  "aabaacaadaabaaba", pat =  "aaba"
Output: [0, 9, 12]
Explanation:

kmp-algorithm-for-pattern-searching

[Naive Approach] Naive Pattern Searching Algorithm - O(n × m) Time and O(1) Space

We start at every index in the text and compare it with the first character of the pattern. If they match, we move to the next character in both text and pattern. If there is a mismatch, we start the same process for the next index of the text.

C++
#include <iostream>
#include <vector>
#include <string>
using namespace std;

vector<int> search(string &pat, string &txt) {
    vector<int> res;
    int n = txt.size();
    int m = pat.size();

    for (int i = 0; i <= n - m; i++) {
        int j = 0;

        // compare pattern with substring 
        // starting at index i
        while (j < m && txt[i + j] == pat[j]) {
            j++;
        }

        // if full pattern matched
        if (j == m) {
            res.push_back(i);
        }
    }

    return res;
}

int main() {
    string txt = "aabaacaadaabaaba";
    string pat = "aaba";

    vector<int> res = search(pat, txt);
    for (int i = 0; i < res.size(); i++)
        cout << res[i] << " ";

    return 0;
}
Java
import java.util.ArrayList;

class GfG {
    
    static ArrayList<Integer> search(String pat, String txt) {
        ArrayList<Integer> res = new ArrayList<>();
        int n = txt.length();
        int m = pat.length();

        for (int i = 0; i <= n - m; i++) {
            int j = 0;

            // compare pattern with substring 
            // starting at index i
            while (j < m && txt.charAt(i + j) == pat.charAt(j)) {
                j++;
            }

            // if full pattern matched
            if (j == m) {
                res.add(i);
            }
        }

        return res;
    }

    public static void main(String[] args) {
        String txt = "aabaacaadaabaaba";
        String pat = "aaba";

        ArrayList<Integer> res = search(pat, txt);
        for (int i = 0; i < res.size(); i++)
            System.out.print(res.get(i) + " ");
    }
}
Python
def search(pat, txt):
    res = []
    n = len(txt)
    m = len(pat)

    for i in range(n - m + 1):
        j = 0

        # compare pattern with substring 
        # starting at index i
        while j < m and txt[i + j] == pat[j]:
            j += 1

        # if full pattern matched
        if j == m:
            res.append(i)

    return res


if __name__ == "__main__":
    txt = "aabaacaadaabaaba"
    pat = "aaba"

    res = search(pat, txt)
    for idx in res:
        print(idx, end=" ")
C#
using System;
using System.Collections.Generic;

class GfG {
    static List<int> search(string pat, string txt) {
        List<int> res = new List<int>();
        int n = txt.Length;
        int m = pat.Length;

        for (int i = 0; i <= n - m; i++) {
            int j = 0;

            // compare pattern with substring 
            // starting at index i
            while (j < m && txt[i + j] == pat[j]) {
                j++;
            }

            // if full pattern matched
            if (j == m) {
                res.Add(i);
            }
        }

        return res;
    }

    static void Main() {
        string txt = "aabaacaadaabaaba";
        string pat = "aaba";

        List<int> res = search(pat, txt);
        for (int i = 0; i < res.Count; i++)
            Console.Write(res[i] + " ");
    }
}
JavaScript
// compare pattern with substring 
// starting at index i
function search(pat, txt) {
    let res = [];
    let n = txt.length;
    let m = pat.length;

    for (let i = 0; i <= n - m; i++) {
        let j = 0;

        // compare pattern with substring 
        // starting at index i
        while (j < m && txt[i + j] === pat[j]) {
            j++;
        }

        // if full pattern matched
        if (j === m) {
            res.push(i);
        }
    }

    return res;
}

// Driver Code
let txt = "aabaacaadaabaaba";
let pat = "aaba";

let res = search(pat, txt);
for (let i = 0; i < res.length; i++)
    process.stdout.write(res[i] + " ");

Output
0 9 12 

[Expected Approach] KMP Pattern Searching Algorithm

The KMP algorithm improves pattern matching by avoiding rechecking characters after a mismatch. It uses the degenerating property of patterns — repeated sub-patterns — to skip unnecessary comparisons.

Whenever a mismatch happens after some matches, we already know part of the pattern has matched earlier. KMP uses this information through a preprocessed LPS (Longest Prefix Suffix) array to shift the pattern efficiently, without restarting from the next character in the text.

This allows KMP to run in O(n + m) time, where n is the length of the text and m is the length of the pattern.

Terminologies used in KMP Algorithm:

  • text (txt): The main string in which we want to search for a pattern.
  • pattern (pat): The substring we are trying to find within the text.
  • Match: A match occurs when all characters of the pattern align exactly with a substring of the text.
  • LPS Array (Longest Prefix Suffix): For each position i in the pattern, lps[i] stores the length of the longest proper prefix which is also a suffix in the substring pat[0...i].
  • Proper Prefix: A proper prefix is a prefix that is not equal to the whole string.
  • Suffix: A suffix is a substring that ends at the current position.
  • The LPS array helps us determine how much we can skip in the pattern when a mismatch occurs, thus avoiding redundant comparisons.

Example of lps[] construction:

Example 1: Pattern "aabaaac"

At index 0: "a" → No proper prefix/suffix → lps[0] = 0
At index 1: "aa" → "a" is both prefix and suffix → lps[1] = 1
At index 2: "aab" → No prefix matches suffix → lps[2] = 0
At index 3: "aaba" → "a" is prefix and suffix → lps[3] = 1
At index 4: "aabaa" → "aa" is prefix and suffix → lps[4] = 2
At index 5: "aabaaa" → "aaa" is prefix and suffix → lps[5] = 3
At index 6: "aabaaac" → Mismatch, so reset → lps[6] = 0
Final lps[]: [0, 1, 0, 1, 2, 3, 0]

Example 2: Pattern "abcdabca"

At index 0: lps[0] = 0
At index 1: lps[1] = 0
At index 2:lps[2] = 0
At index 3: lps[3] = 0 (no repetition in "abcd")
At index 4: lps[4] = 1 ("a" repeats)
At index 5: lps[5] = 2 ("ab" repeats)
At index 6: lps[6] = 3 ("abc" repeats)
At index 7: lps[7] = 1 (mismatch, fall back to "a")
Final LPS: [0, 0, 0, 0, 1, 2, 3, 1]

Note: lps[i] could also be defined as the longest prefix which is also a proper suffix. We need to use it properly in one place to make sure that the whole substring is not considered.

Algorithm for Construction of LPS Array:

The value of lps[0] is always 0 because a string of length one has no non-empty proper prefix that is also a suffix. We maintain a variable len, initialized to 0, which keeps track of the length of the previous longest prefix suffix. As we traverse the pattern from index 1 onward, we compare the current character pat[i] with pat[len]. Based on this comparison, we have three possible cases:

Case 1: pat[i] == pat[len]

This means the current character continues the existing prefix-suffix match.
→ We increment len by 1 and assign lps[i] = len.
→ Then, move to the next index.

Case 2: pat[i] != pat[len] and len == 0

There is no prefix that matches any suffix ending at i, and we can't fall back to any earlier matching pattern.
→ So we set lps[i] = 0 and simply move to the next character.

Case 3: pat[i] != pat[len] and len > 0

We cannot extend the previous matching prefix-suffix. However, there might still be a shorter prefix which is also a suffix that matches the current position.
Instead of comparing all prefixes manually, we reuse previously computed LPS values.
→ Since pat[0...len-1] equals pat[i-len...i-1], we can fall back to lps[len - 1] and update len.
→ This reduces the prefix size we're matching against and avoids redundant work.
We do not increment i immediately in this case — instead, we retry the current pat[i] with the new updated len.

Illustration:


Example of Construction of LPS Array:


Implementation of KMP Algorithm:

We initialize two pointers — one for the text string and another for the pattern. When the characters at both pointers match, we increment both pointers and continue the comparison. If they do not match, we reset the pattern pointer to the last value from the LPS array, since that portion of the pattern has already been matched with the text. Additionally, if we have traversed the entire pattern string (i.e., a full match is found), we add the starting index of the pattern's occurrence in the text to the result, and continue the search from the LPS value of the last element in the pattern.

Let’s say we are at position i in the text string and position j in the pattern string when a mismatch occurs:

  • At this point, we know that pat[0..j-1] has already matched with txt[i-j..i-1].
  • The value of lps[j-1] represents the length of the longest proper prefix of the substring pat[0..j-1] that is also a suffix of the same substring.
  • From these two observations, we can conclude that there's no need to recheck the characters in pat[0..lps[j-1]]. Instead, we can directly resume our search from lps[j-1].
C++
#include <iostream>
#include <string>
#include <vector>
using namespace std;

void constructLps(string &pat, vector<int> &lps) {

    // len stores the length of longest prefix which
    // is also a suffix for the previous index
    int len = 0;

    // lps[0] is always 0
    lps[0] = 0;

    int i = 1;
    while (i < pat.length()) {

        // If characters match, increment the size of lps
        if (pat[i] == pat[len]) {
            len++;
            lps[i] = len;
            i++;
        }

        // If there is a mismatch
        else {
            if (len != 0) {

                // Update len to the previous lps value
                // to avoid reduntant comparisons
                len = lps[len - 1];
            }
            else {

                // If no matching prefix found, set lps[i] to 0
                lps[i] = 0;
                i++;
            }
        }
    }
}

vector<int> search(string &pat, string &txt) {
    int n = txt.length();
    int m = pat.length();

    vector<int> lps(m);
    vector<int> res;

    constructLps(pat, lps);

    // Pointers i and j, for traversing
    // the text and pattern
    int i = 0;
    int j = 0;

    while (i < n) {

        // If characters match, move both pointers forward
        if (txt[i] == pat[j]) {
            i++;
            j++;

            // If the entire pattern is matched
            // store the start index in result
            if (j == m) {
                res.push_back(i - j);

                // Use LPS of previous index to
                // skip unnecessary comparisons
                j = lps[j - 1];
            }
        }

        // If there is a mismatch
        else {

            // Use lps value of previous index
            // to avoid redundant comparisons
            if (j != 0)
                j = lps[j - 1];
            else
                i++;
        }
    }
    return res;
}

int main() {
    string txt = "aabaacaadaabaaba";
    string pat = "aaba";

    vector<int> res = search(pat, txt);
    for (int i = 0; i < res.size(); i++)
        cout << res[i] << " ";

    return 0;
}
Java
import java.util.ArrayList;

class GfG {
    
    static void constructLps(String pat, int[] lps) {
        
        // len stores the length of longest prefix which 
        // is also a suffix for the previous index
        int len = 0;

        // lps[0] is always 0
        lps[0] = 0;

        int i = 1;
        while (i < pat.length()) {
            
            // If characters match, increment the size of lps
            if (pat.charAt(i) == pat.charAt(len)) {
                len++;
                lps[i] = len;
                i++;
            }
            
            // If there is a mismatch
            else {
                if (len != 0) {
                    
                    // Update len to the previous lps value 
                    // to avoid redundant comparisons
                    len = lps[len - 1];
                } 
                else {
                    
                    // If no matching prefix found, set lps[i] to 0
                    lps[i] = 0;
                    i++;
                }
            }
        }
    }

    static ArrayList<Integer> search(String pat, String txt) {
        int n = txt.length();
        int m = pat.length();

        int[] lps = new int[m];
        ArrayList<Integer> res = new ArrayList<>();

        constructLps(pat, lps);

        // Pointers i and j, for traversing 
        // the text and pattern
        int i = 0;
        int j = 0;

        while (i < n) {
            // If characters match, move both pointers forward
            if (txt.charAt(i) == pat.charAt(j)) {
                i++;
                j++;

                // If the entire pattern is matched 
                // store the start index in result
                if (j == m) {
                    res.add(i - j);
                    
                    // Use LPS of previous index to 
                    // skip unnecessary comparisons
                    j = lps[j - 1];
                }
            }
            
            // If there is a mismatch
            else {
                
                // Use lps value of previous index
                // to avoid redundant comparisons
                if (j != 0)
                    j = lps[j - 1];
                else
                    i++;
            }
        }
        return res; 
    }

    public static void main(String[] args) {
        String txt = "aabaacaadaabaaba"; 
        String pat = "aaba"; 

        ArrayList<Integer> res = search(pat, txt);
        for (int i = 0; i < res.size(); i++) 
            System.out.print(res.get(i) + " ");
    }
}
Python
def constructLps(pat, lps):
    
    # len stores the length of longest prefix which 
    # is also a suffix for the previous index
    len_ = 0
    m = len(pat)
    
    # lps[0] is always 0
    lps[0] = 0

    i = 1
    while i < m:
        
        # If characters match, increment the size of lps
        if pat[i] == pat[len_]:
            len_ += 1
            lps[i] = len_
            i += 1
        
        # If there is a mismatch
        else:
            if len_ != 0:
                
                # Update len to the previous lps value 
                # to avoid redundant comparisons
                len_ = lps[len_ - 1]
            else:
                
                # If no matching prefix found, set lps[i] to 0
                lps[i] = 0
                i += 1

def search(pat, txt):
    n = len(txt)
    m = len(pat)

    lps = [0] * m
    res = []

    constructLps(pat, lps)

    # Pointers i and j, for traversing 
    # the text and pattern
    i = 0
    j = 0

    while i < n:
        
        # If characters match, move both pointers forward
        if txt[i] == pat[j]:
            i += 1
            j += 1

            # If the entire pattern is matched 
            # store the start index in result
            if j == m:
                res.append(i - j)
                
                # Use LPS of previous index to 
                # skip unnecessary comparisons
                j = lps[j - 1]
        
        # If there is a mismatch
        else:
            
            # Use lps value of previous index
            # to avoid redundant comparisons
            if j != 0:
                j = lps[j - 1]
            else:
                i += 1
    return res 

if __name__ == "__main__":
    txt = "aabaacaadaabaaba"
    pat = "aaba"

    res = search(pat, txt)
    for i in range(len(res)):
        print(res[i], end=" ")
C#
using System;
using System.Collections.Generic;

class GfG {
    static void constructLps(string pat, int[] lps) {
        // len stores the length of longest prefix which 
        // is also a suffix for the previous index
        int len = 0;

        // lps[0] is always 0
        lps[0] = 0;

        int i = 1;
        while (i < pat.Length) {
          
            // If characters match, increment the size of lps
            if (pat[i] == pat[len]) {
                len++;
                lps[i] = len;
                i++;
            }
          
            // If there is a mismatch
            else {
                if (len != 0) {
                  
                    // Update len to the previous lps value 
                    // to avoid redundant comparisons
                    len = lps[len - 1];
                }
                else {
                  
                    // If no matching prefix found, set lps[i] to 0
                    lps[i] = 0;
                    i++;
                }
            }
        }
    }

    static List<int> search(string pat, string txt) {
        int n = txt.Length;
        int m = pat.Length;

        int[] lps = new int[m];
        List<int> res = new List<int>();

        constructLps(pat, lps);

        // Pointers i and j, for traversing 
        // the text and pattern
        int i = 0;
        int j = 0;

        while (i < n) {
          
            // If characters match, move both pointers forward
            if (txt[i] == pat[j]) {
                i++;
                j++;

                // If the entire pattern is matched 
                // store the start index in result
                if (j == m) {
                    res.Add(i - j);
                    
                    // Use LPS of previous index to 
                    // skip unnecessary comparisons
                    j = lps[j - 1];
                }
            }
            
            // If there is a mismatch
            else {
            
                // Use lps value of previous index
                // to avoid redundant comparisons
                if (j != 0)
                    j = lps[j - 1];
                else
                    i++;
            }
        }
        return res;
    }

    static void Main(string[] args) {
        string txt = "aabaacaadaabaaba";
        string pat = "aaba";

        List<int> res = search(pat, txt);
        for (int i = 0; i < res.Count; i++)
            Console.Write(res[i] + " ");
    }
}
JavaScript
function constructLps(pat, lps) {
    
    // len stores the length of longest prefix which 
    // is also a suffix for the previous index
    let len = 0;

    // lps[0] is always 0
    lps[0] = 0;

    let i = 1;
    while (i < pat.length) {
        
        // If characters match, increment the size of lps
        if (pat[i] === pat[len]) {
            len++;
            lps[i] = len;
            i++;
        }
        
        // If there is a mismatch
        else {
            if (len !== 0) {
                
                // Update len to the previous lps value 
                // to avoid redundant comparisons
                len = lps[len - 1];
            } else {
                
                // If no matching prefix found, set lps[i] to 0
                lps[i] = 0;
                i++;
            }
        }
    }
}

function search(pat, txt) {
    const n = txt.length;
    const m = pat.length;

    const lps = new Array(m);
    const res = [];

    constructLps(pat, lps);

    // Pointers i and j, for traversing 
    // the text and pattern
    let i = 0;
    let j = 0;

    while (i < n) {
        
        // If characters match, move both pointers forward
        if (txt[i] === pat[j]) {
            i++;
            j++;

            // If the entire pattern is matched 
            // store the start index in result
            if (j === m) {
                res.push(i - j);
                
                // Use LPS of previous index to 
                // skip unnecessary comparisons
                j = lps[j - 1];
            }
        }
        
        // If there is a mismatch
        else {
            
            // Use lps value of previous index
            // to avoid redundant comparisons
            if (j !== 0)
                j = lps[j - 1];
            else
                i++;
        }
    }
    return res; 
}

// Driver Code
const txt = "aabaacaadaabaaba";
const pat = "aaba";

const res = search(pat, txt);
console.log(res.join(" "));

Output
0 9 12 

Time Complexity: O(n + m), where n is the length of the text and m is the length of the pattern. This is because creating the LPS (Longest Prefix Suffix) array takes O(m) time, and the search through the text takes O(n) time.

Auxiliary Space: O(m), as we need to store the LPS array of size m.


KMP Algorithm (Part 1 : Constructing LPS Array)
Visit Course explore course icon
Video Thumbnail

KMP Algorithm (Part 1 : Constructing LPS Array)

Video Thumbnail

KMP Algorithm (Part 2 : Complete Algorithm)

Video Thumbnail

KMP Algorithm for Pattern Searching

Similar Reads