KMP Algorithm for Pattern Searching
Last Updated : 25 Jul, 2025
Given two strings, txt (the main text) and pat (the pattern to search for), find and return all starting indices in txt where pat appears as a substring.
The matching must be exact, and the indices are 0-based, meaning the first character of txt is at index 0.
Examples:
Input: txt = "abcab", pat = "ab"
Output: [0, 3]
Explanation: The string "ab" occurs twice in txt, first occurrence starts from index 0 and second from index 3.
Input: txt = "aabaacaadaabaaba", pat = "aaba"
Output: [0, 9, 12]
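Explanation: The string "aaba" occurs three times in txt, starting at indices 0, 9 and 12. The last two occurrences overlap.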
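The expected outputs above can be cross-checked with a few lines of Python. This is only a reference sketch built on the built-in str.find, not one of the approaches discussed in this article; the helper name find_all is an assumption made for illustration.

```python
def find_all(txt, pat):
    """Return every 0-based index where pat starts in txt (overlaps included)."""
    res = []
    pos = txt.find(pat)
    while pos != -1:
        res.append(pos)
        # Restart one character later so overlapping matches are not skipped
        pos = txt.find(pat, pos + 1)
    return res


print(find_all("abcab", "ab"))               # [0, 3]
print(find_all("aabaacaadaabaaba", "aaba"))  # [0, 9, 12]
```

Restarting at pos + 1 rather than pos + len(pat) is what lets the overlapping occurrences at indices 9 and 12 both be reported.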
[Naive Approach] Naive Pattern Searching Algorithm - O(n × m) Time and O(1) Space
For every index i in the text, we compare the pattern with the substring of the text starting at i, character by character. If all m characters match, i is added to the result. On the first mismatch, we abandon this alignment and restart the comparison from index i + 1 of the text.
C++
#include <iostream>
#include <vector>
#include <string>
using namespace std;
vector<int> search(string &pat, string &txt) {
    vector<int> res;
    int n = txt.size();
    int m = pat.size();

    for (int i = 0; i <= n - m; i++) {
        int j = 0;

        // compare pattern with substring
        // starting at index i
        while (j < m && txt[i + j] == pat[j]) {
            j++;
        }

        // if full pattern matched
        if (j == m) {
            res.push_back(i);
        }
    }
    return res;
}

int main() {
    string txt = "aabaacaadaabaaba";
    string pat = "aaba";

    vector<int> res = search(pat, txt);
    for (int i = 0; i < res.size(); i++)
        cout << res[i] << " ";
    return 0;
}
Java
import java.util.ArrayList;
class GfG {
    static ArrayList<Integer> search(String pat, String txt) {
        ArrayList<Integer> res = new ArrayList<>();
        int n = txt.length();
        int m = pat.length();

        for (int i = 0; i <= n - m; i++) {
            int j = 0;

            // compare pattern with substring
            // starting at index i
            while (j < m && txt.charAt(i + j) == pat.charAt(j)) {
                j++;
            }

            // if full pattern matched
            if (j == m) {
                res.add(i);
            }
        }
        return res;
    }

    public static void main(String[] args) {
        String txt = "aabaacaadaabaaba";
        String pat = "aaba";

        ArrayList<Integer> res = search(pat, txt);
        for (int i = 0; i < res.size(); i++)
            System.out.print(res.get(i) + " ");
    }
}
Python
def search(pat, txt):
    res = []
    n = len(txt)
    m = len(pat)

    for i in range(n - m + 1):
        j = 0

        # compare pattern with substring
        # starting at index i
        while j < m and txt[i + j] == pat[j]:
            j += 1

        # if full pattern matched
        if j == m:
            res.append(i)
    return res


if __name__ == "__main__":
    txt = "aabaacaadaabaaba"
    pat = "aaba"

    res = search(pat, txt)
    for idx in res:
        print(idx, end=" ")
C#
using System;
using System.Collections.Generic;
class GfG {
    static List<int> search(string pat, string txt) {
        List<int> res = new List<int>();
        int n = txt.Length;
        int m = pat.Length;

        for (int i = 0; i <= n - m; i++) {
            int j = 0;

            // compare pattern with substring
            // starting at index i
            while (j < m && txt[i + j] == pat[j]) {
                j++;
            }

            // if full pattern matched
            if (j == m) {
                res.Add(i);
            }
        }
        return res;
    }

    static void Main() {
        string txt = "aabaacaadaabaaba";
        string pat = "aaba";

        List<int> res = search(pat, txt);
        for (int i = 0; i < res.Count; i++)
            Console.Write(res[i] + " ");
    }
}
JavaScript
function search(pat, txt) {
    let res = [];
    let n = txt.length;
    let m = pat.length;

    for (let i = 0; i <= n - m; i++) {
        let j = 0;

        // compare pattern with substring
        // starting at index i
        while (j < m && txt[i + j] === pat[j]) {
            j++;
        }

        // if full pattern matched
        if (j === m) {
            res.push(i);
        }
    }
    return res;
}

// Driver Code
let txt = "aabaacaadaabaaba";
let pat = "aaba";

let res = search(pat, txt);
for (let i = 0; i < res.length; i++)
    process.stdout.write(res[i] + " ");
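To see where the O(n × m) bound comes from, the following sketch counts the character comparisons made by the naive approach. The function name naive_comparisons and the test strings are chosen for illustration only.

```python
def naive_comparisons(pat, txt):
    """Run the naive search; return (matches, number of character comparisons)."""
    n, m = len(txt), len(pat)
    res = []
    comparisons = 0
    for i in range(n - m + 1):
        j = 0
        while j < m:
            comparisons += 1
            if txt[i + j] != pat[j]:
                break
            j += 1
        # if full pattern matched
        if j == m:
            res.append(i)
    return res, comparisons


# Worst case: every window matches up to the last pattern character
txt = "a" * 20
pat = "a" * 4 + "b"
print(naive_comparisons(pat, txt))  # ([], 80)
```

Here each of the n - m + 1 = 16 windows costs m = 5 comparisons before failing, so the total is 80 even though no match exists.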
[Expected Approach] KMP Pattern Searching Algorithm
The KMP algorithm improves pattern matching by avoiding rechecking characters after a mismatch. It uses the degenerating property of patterns — repeated sub-patterns — to skip unnecessary comparisons.
Whenever a mismatch happens after some matches, we already know part of the pattern has matched earlier. KMP uses this information through a preprocessed LPS (Longest Prefix Suffix) array to shift the pattern efficiently, without restarting from the next character in the text.
This allows KMP to run in O(n + m) time, where n is the length of the text and m is the length of the pattern.
Terminologies used in KMP Algorithm:
- text (txt): The main string in which we want to search for a pattern.
- pattern (pat): The substring we are trying to find within the text.
- Match: A match occurs when all characters of the pattern align exactly with a substring of the text.
- LPS Array (Longest Prefix Suffix): For each position i in the pattern, lps[i] stores the length of the longest proper prefix which is also a suffix in the substring pat[0...i].
- Proper Prefix: A proper prefix is a prefix that is not equal to the whole string.
- Suffix: A suffix of pat[0...i] is a substring that ends at position i.
- The LPS array helps us determine how much we can skip in the pattern when a mismatch occurs, thus avoiding redundant comparisons.
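To make the definition of lps[] concrete, the sketch below computes it directly from the definition by trying every proper prefix length. It is deliberately slow and is not how KMP builds the array; the name lps_by_definition and the sample patterns are assumptions for illustration.

```python
def lps_by_definition(pat):
    """Compute lps[] straight from the definition (slow, for illustration only)."""
    lps = []
    for i in range(len(pat)):
        sub = pat[: i + 1]
        best = 0
        # Proper prefix: its length k must be strictly smaller than len(sub)
        for k in range(1, len(sub)):
            if sub[:k] == sub[-k:]:
                best = k
        lps.append(best)
    return lps


print(lps_by_definition("abab"))  # [0, 0, 1, 2]
print(lps_by_definition("aaaa"))  # [0, 1, 2, 3]
```

A brute-force version like this is mainly useful as a reference when testing a faster construction.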
Example of lps[] construction:
Example 1: Pattern "aabaaac"
At index 0: "a" → No proper prefix/suffix → lps[0] = 0
At index 1: "aa" → "a" is both prefix and suffix → lps[1] = 1
At index 2: "aab" → No prefix matches suffix → lps[2] = 0
At index 3: "aaba" → "a" is prefix and suffix → lps[3] = 1
At index 4: "aabaa" → "aa" is prefix and suffix → lps[4] = 2
At index 5: "aabaaa" → "aa" is prefix and suffix (length 3 fails, since "aab" ≠ "aaa") → lps[5] = 2
At index 6: "aabaaac" → "c" appears nowhere earlier, so no prefix matches a suffix → lps[6] = 0
Final lps[]: [0, 1, 0, 1, 2, 2, 0]
Example 2: Pattern "abcdabca"
At index 0: lps[0] = 0
At index 1: lps[1] = 0
At index 2: lps[2] = 0
At index 3: lps[3] = 0 (no repetition in "abcd")
At index 4: lps[4] = 1 ("a" repeats)
At index 5: lps[5] = 2 ("ab" repeats)
At index 6: lps[6] = 3 ("abc" repeats)
At index 7: lps[7] = 1 (mismatch, fall back to "a")
Final lps[]: [0, 0, 0, 0, 1, 2, 3, 1]
Note: lps[i] could equivalently be defined as the length of the longest prefix that is also a proper suffix. The word "proper" only needs to be applied to one of the two (prefix or suffix); its purpose is to exclude the whole substring pat[0...i] from matching itself.
Algorithm for Construction of LPS Array:
The value of lps[0] is always 0 because a string of length one has no non-empty proper prefix that is also a suffix. We maintain a variable len, initialized to 0, which keeps track of the length of the previous longest prefix suffix. As we traverse the pattern from index 1 onward, we compare the current character pat[i] with pat[len]. Based on this comparison, we have three possible cases:
Case 1: pat[i] == pat[len]
This means the current character continues the existing prefix-suffix match.
→ We increment len by 1 and assign lps[i] = len.
→ Then, move to the next index.
Case 2: pat[i] != pat[len] and len == 0
There is no prefix that matches any suffix ending at i, and we can't fall back to any earlier matching pattern.
→ So we set lps[i] = 0 and simply move to the next character.
Case 3: pat[i] != pat[len] and len > 0
We cannot extend the previous matching prefix-suffix. However, there might still be a shorter prefix which is also a suffix that matches the current position.
Instead of comparing all prefixes manually, we reuse previously computed LPS values.
→ Since pat[0...len-1] equals pat[i-len...i-1], we can fall back to lps[len - 1] and update len.
→ This reduces the prefix size we're matching against and avoids redundant work.
We do not increment i immediately in this case — instead, we retry the current pat[i] with the new updated len.
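The three cases can be followed step by step with a small tracing sketch. The function name construct_lps_traced, the trace format and the sample pattern "aaacaaaa" are assumptions made for this example; the variable length plays the role of len in the description above.

```python
def construct_lps_traced(pat):
    """Build lps[] and record which of the three cases fired at each step."""
    lps = [0] * len(pat)
    trace = []
    length = 0  # length of the previous longest prefix-suffix
    i = 1
    while i < len(pat):
        if pat[i] == pat[length]:
            # Case 1: extend the current prefix-suffix
            length += 1
            lps[i] = length
            trace.append((i, "case 1", length))
            i += 1
        elif length == 0:
            # Case 2: nothing to fall back to
            lps[i] = 0
            trace.append((i, "case 2", 0))
            i += 1
        else:
            # Case 3: fall back to a shorter prefix-suffix and retry pat[i]
            length = lps[length - 1]
            trace.append((i, "case 3", length))
    return lps, trace


lps, trace = construct_lps_traced("aaacaaaa")
for step in trace:
    print(step)  # (index i, case, value of length after the step)
print(lps)  # [0, 1, 2, 0, 1, 2, 3, 3]
```

At i = 3 the trace shows Case 3 firing twice (length goes 2 → 1 → 0) before Case 2 sets lps[3] = 0, and at i = 7 a single fallback from 3 to 2 is followed by a Case 1 match.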
Implementation of KMP Algorithm:
We maintain two pointers: i for the text and j for the pattern. When txt[i] and pat[j] match, we increment both pointers and continue the comparison.
When they do not match and j > 0, we set j = lps[j - 1], because the first lps[j - 1] characters of the pattern are already known to match the text ending at index i - 1. If j is already 0, we simply advance i.
When j reaches m, a full match has been found: we add its starting index i - m to the result and continue the search with j = lps[m - 1], so that overlapping occurrences are also found.
Let’s say we are at position i in the text string and position j in the pattern string when a mismatch occurs:
- At this point, we know that pat[0..j-1] has already matched with txt[i-j..i-1].
- The value of lps[j-1] represents the length of the longest proper prefix of the substring pat[0..j-1] that is also a suffix of the same substring.
- From these two observations, we can conclude that the first lps[j-1] characters of the pattern, pat[0..lps[j-1]-1], already match the text ending at index i-1, so there is no need to recheck them. Instead, we set j = lps[j-1] and resume by comparing txt[i] with pat[j].
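The observations above can be checked mechanically. The sketch below runs the search on a non-empty pattern and asserts, at every mismatch, that the already-matched part and its fallback prefix really do agree with the text. The names build_lps and search_with_checks are hypothetical helpers written for this example.

```python
def build_lps(pat):
    """Compact LPS construction following the three cases described earlier."""
    lps = [0] * len(pat)
    length, i = 0, 1
    while i < len(pat):
        if pat[i] == pat[length]:
            length += 1
            lps[i] = length
            i += 1
        elif length:
            length = lps[length - 1]
        else:
            i += 1
    return lps


def search_with_checks(pat, txt):
    """KMP search (non-empty pat assumed) that asserts the fallback invariant."""
    lps = build_lps(pat)
    res = []
    i = j = 0
    while i < len(txt):
        if txt[i] == pat[j]:
            i += 1
            j += 1
            if j == len(pat):
                res.append(i - j)
                j = lps[j - 1]
        elif j:
            # pat[0..j-1] matched txt[i-j..i-1] before the mismatch
            assert pat[:j] == txt[i - j:i]
            k = lps[j - 1]
            # so the first k pattern characters match the text just before i
            assert pat[:k] == txt[i - k:i]
            j = k
        else:
            i += 1
    return res


print(search_with_checks("aaba", "aabaacaadaabaaba"))  # [0, 9, 12]
```

Neither assertion ever fires, which is exactly why the text pointer i never has to move backwards.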
C++
#include <iostream>
#include <string>
#include <vector>
using namespace std;
void constructLps(string &pat, vector<int> &lps) {
    // len stores the length of longest prefix which
    // is also a suffix for the previous index
    int len = 0;

    // lps[0] is always 0
    lps[0] = 0;

    int i = 1;
    while (i < pat.length()) {
        // If characters match, increment the size of lps
        if (pat[i] == pat[len]) {
            len++;
            lps[i] = len;
            i++;
        }
        // If there is a mismatch
        else {
            if (len != 0) {
                // Update len to the previous lps value
                // to avoid redundant comparisons
                len = lps[len - 1];
            }
            else {
                // If no matching prefix found, set lps[i] to 0
                lps[i] = 0;
                i++;
            }
        }
    }
}

vector<int> search(string &pat, string &txt) {
    int n = txt.length();
    int m = pat.length();

    vector<int> lps(m);
    vector<int> res;

    constructLps(pat, lps);

    // Pointers i and j, for traversing
    // the text and pattern
    int i = 0;
    int j = 0;

    while (i < n) {
        // If characters match, move both pointers forward
        if (txt[i] == pat[j]) {
            i++;
            j++;

            // If the entire pattern is matched
            // store the start index in result
            if (j == m) {
                res.push_back(i - j);

                // Use LPS of previous index to
                // skip unnecessary comparisons
                j = lps[j - 1];
            }
        }
        // If there is a mismatch
        else {
            // Use lps value of previous index
            // to avoid redundant comparisons
            if (j != 0)
                j = lps[j - 1];
            else
                i++;
        }
    }
    return res;
}

int main() {
    string txt = "aabaacaadaabaaba";
    string pat = "aaba";

    vector<int> res = search(pat, txt);
    for (int i = 0; i < res.size(); i++)
        cout << res[i] << " ";
    return 0;
}
Java
import java.util.ArrayList;
class GfG {
    static void constructLps(String pat, int[] lps) {
        // len stores the length of longest prefix which
        // is also a suffix for the previous index
        int len = 0;

        // lps[0] is always 0
        lps[0] = 0;

        int i = 1;
        while (i < pat.length()) {
            // If characters match, increment the size of lps
            if (pat.charAt(i) == pat.charAt(len)) {
                len++;
                lps[i] = len;
                i++;
            }
            // If there is a mismatch
            else {
                if (len != 0) {
                    // Update len to the previous lps value
                    // to avoid redundant comparisons
                    len = lps[len - 1];
                }
                else {
                    // If no matching prefix found, set lps[i] to 0
                    lps[i] = 0;
                    i++;
                }
            }
        }
    }

    static ArrayList<Integer> search(String pat, String txt) {
        int n = txt.length();
        int m = pat.length();

        int[] lps = new int[m];
        ArrayList<Integer> res = new ArrayList<>();

        constructLps(pat, lps);

        // Pointers i and j, for traversing
        // the text and pattern
        int i = 0;
        int j = 0;

        while (i < n) {
            // If characters match, move both pointers forward
            if (txt.charAt(i) == pat.charAt(j)) {
                i++;
                j++;

                // If the entire pattern is matched
                // store the start index in result
                if (j == m) {
                    res.add(i - j);

                    // Use LPS of previous index to
                    // skip unnecessary comparisons
                    j = lps[j - 1];
                }
            }
            // If there is a mismatch
            else {
                // Use lps value of previous index
                // to avoid redundant comparisons
                if (j != 0)
                    j = lps[j - 1];
                else
                    i++;
            }
        }
        return res;
    }

    public static void main(String[] args) {
        String txt = "aabaacaadaabaaba";
        String pat = "aaba";

        ArrayList<Integer> res = search(pat, txt);
        for (int i = 0; i < res.size(); i++)
            System.out.print(res.get(i) + " ");
    }
}
Python
def constructLps(pat, lps):
    # len_ stores the length of longest prefix which
    # is also a suffix for the previous index
    len_ = 0
    m = len(pat)

    # lps[0] is always 0
    lps[0] = 0

    i = 1
    while i < m:
        # If characters match, increment the size of lps
        if pat[i] == pat[len_]:
            len_ += 1
            lps[i] = len_
            i += 1
        # If there is a mismatch
        else:
            if len_ != 0:
                # Update len_ to the previous lps value
                # to avoid redundant comparisons
                len_ = lps[len_ - 1]
            else:
                # If no matching prefix found, set lps[i] to 0
                lps[i] = 0
                i += 1


def search(pat, txt):
    n = len(txt)
    m = len(pat)

    lps = [0] * m
    res = []

    constructLps(pat, lps)

    # Pointers i and j, for traversing
    # the text and pattern
    i = 0
    j = 0

    while i < n:
        # If characters match, move both pointers forward
        if txt[i] == pat[j]:
            i += 1
            j += 1

            # If the entire pattern is matched
            # store the start index in result
            if j == m:
                res.append(i - j)

                # Use LPS of previous index to
                # skip unnecessary comparisons
                j = lps[j - 1]
        # If there is a mismatch
        else:
            # Use lps value of previous index
            # to avoid redundant comparisons
            if j != 0:
                j = lps[j - 1]
            else:
                i += 1
    return res


if __name__ == "__main__":
    txt = "aabaacaadaabaaba"
    pat = "aaba"

    res = search(pat, txt)
    for i in range(len(res)):
        print(res[i], end=" ")
C#
using System;
using System.Collections.Generic;
class GfG {
    static void constructLps(string pat, int[] lps) {
        // len stores the length of longest prefix which
        // is also a suffix for the previous index
        int len = 0;

        // lps[0] is always 0
        lps[0] = 0;

        int i = 1;
        while (i < pat.Length) {
            // If characters match, increment the size of lps
            if (pat[i] == pat[len]) {
                len++;
                lps[i] = len;
                i++;
            }
            // If there is a mismatch
            else {
                if (len != 0) {
                    // Update len to the previous lps value
                    // to avoid redundant comparisons
                    len = lps[len - 1];
                }
                else {
                    // If no matching prefix found, set lps[i] to 0
                    lps[i] = 0;
                    i++;
                }
            }
        }
    }

    static List<int> search(string pat, string txt) {
        int n = txt.Length;
        int m = pat.Length;

        int[] lps = new int[m];
        List<int> res = new List<int>();

        constructLps(pat, lps);

        // Pointers i and j, for traversing
        // the text and pattern
        int i = 0;
        int j = 0;

        while (i < n) {
            // If characters match, move both pointers forward
            if (txt[i] == pat[j]) {
                i++;
                j++;

                // If the entire pattern is matched
                // store the start index in result
                if (j == m) {
                    res.Add(i - j);

                    // Use LPS of previous index to
                    // skip unnecessary comparisons
                    j = lps[j - 1];
                }
            }
            // If there is a mismatch
            else {
                // Use lps value of previous index
                // to avoid redundant comparisons
                if (j != 0)
                    j = lps[j - 1];
                else
                    i++;
            }
        }
        return res;
    }

    static void Main(string[] args) {
        string txt = "aabaacaadaabaaba";
        string pat = "aaba";

        List<int> res = search(pat, txt);
        for (int i = 0; i < res.Count; i++)
            Console.Write(res[i] + " ");
    }
}
JavaScript
function constructLps(pat, lps) {
    // len stores the length of longest prefix which
    // is also a suffix for the previous index
    let len = 0;

    // lps[0] is always 0
    lps[0] = 0;

    let i = 1;
    while (i < pat.length) {
        // If characters match, increment the size of lps
        if (pat[i] === pat[len]) {
            len++;
            lps[i] = len;
            i++;
        }
        // If there is a mismatch
        else {
            if (len !== 0) {
                // Update len to the previous lps value
                // to avoid redundant comparisons
                len = lps[len - 1];
            } else {
                // If no matching prefix found, set lps[i] to 0
                lps[i] = 0;
                i++;
            }
        }
    }
}

function search(pat, txt) {
    const n = txt.length;
    const m = pat.length;

    const lps = new Array(m);
    const res = [];

    constructLps(pat, lps);

    // Pointers i and j, for traversing
    // the text and pattern
    let i = 0;
    let j = 0;

    while (i < n) {
        // If characters match, move both pointers forward
        if (txt[i] === pat[j]) {
            i++;
            j++;

            // If the entire pattern is matched
            // store the start index in result
            if (j === m) {
                res.push(i - j);

                // Use LPS of previous index to
                // skip unnecessary comparisons
                j = lps[j - 1];
            }
        }
        // If there is a mismatch
        else {
            // Use lps value of previous index
            // to avoid redundant comparisons
            if (j !== 0)
                j = lps[j - 1];
            else
                i++;
        }
    }
    return res;
}

// Driver Code
const txt = "aabaacaadaabaaba";
const pat = "aaba";

const res = search(pat, txt);
console.log(res.join(" "));
Time Complexity: O(n + m), where n is the length of the text and m is the length of the pattern. This is because creating the LPS (Longest Prefix Suffix) array takes O(m) time, and the search through the text takes O(n) time.
Auxiliary Space: O(m), as we need to store the LPS array of size m.
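The linear bound on the search phase can be observed by counting comparisons between text and pattern characters. In the sketch below, the function name kmp_comparisons and the test strings are assumptions for illustration, and a non-empty pattern is assumed.

```python
def kmp_comparisons(pat, txt):
    """Return (matches, text-vs-pattern comparisons) for a non-empty pat."""
    m = len(pat)

    # Build lps[] in O(m)
    lps = [0] * m
    length, i = 0, 1
    while i < m:
        if pat[i] == pat[length]:
            length += 1
            lps[i] = length
            i += 1
        elif length:
            length = lps[length - 1]
        else:
            i += 1

    # Search phase: count every txt[i] == pat[j] test
    res, comparisons = [], 0
    i = j = 0
    while i < len(txt):
        comparisons += 1
        if txt[i] == pat[j]:
            i += 1
            j += 1
            if j == m:
                res.append(i - j)
                j = lps[j - 1]
        elif j:
            j = lps[j - 1]
        else:
            i += 1
    return res, comparisons


txt = "a" * 20
pat = "a" * 4 + "b"
matches, comparisons = kmp_comparisons(pat, txt)
print(matches, comparisons, comparisons <= 2 * len(txt))  # [] 36 True
```

Every loop iteration either advances i or decreases j, and j can only grow when i does, so the search performs at most 2n comparisons regardless of the pattern.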