Boyer Moore Algorithm | Good Suffix heuristic
Last Updated : 12 Apr, 2025
Before diving into the Good Suffix Heuristic, it is recommended to first read Boyer Moore Algorithm for Pattern Searching and the Bad Character Heuristic to gain a clear understanding of how this algorithm optimizes pattern searching.
One important part of the Boyer-Moore algorithm is the Good Suffix Heuristic. When the algorithm encounters a mismatch, this heuristic shifts the pattern so that another occurrence of the part (suffix) of the pattern that has already matched lines up with the text. This reduces the number of checks needed, making the search faster.
Good Suffix Heuristic for Pattern Searching
Just like the Bad Character Heuristic, the Good Suffix Heuristic involves a preprocessing step in which a table is generated to speed up the pattern matching process. The Strong Good Suffix Heuristic refines the basic (weak) rule so that the pattern can skip more unnecessary comparisons when a mismatch occurs. Let's break down how it works and its associated preprocessing.
Let t be a substring of the text T that matches a suffix of the pattern P. When a mismatch occurs, the pattern is shifted until one of the following holds:
- Another occurrence of t in P is aligned with t in T
- A prefix of P matches a suffix of t
- The pattern has moved past t
This heuristic helps optimize the search by allowing the pattern to skip over sections of the text that have already been matched, leading to faster string matching.
Case 1: Another occurrence of t in P matched with t in T
If there are other occurrences of the substring t in the pattern P, the pattern is shifted so that the rightmost of these occurrences aligns with the substring t in the text T. This allows the pattern to continue matching efficiently. For example:

Explanation:
In the above example, we have a substring t of text T that matches with pattern P (in green) before the mismatch occurs at index 2. Now, we search for occurrences of t ("AB") within P. We find an occurrence starting at position 1 (highlighted with a yellow background). Therefore, we right shift the pattern by 2 positions to align the occurrence of t in P with t in T.
This is the weak rule of the original Boyer-Moore algorithm, which is not very effective. We will discuss the Strong Good Suffix rule shortly, which provides a more efficient approach.
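A minimal Python sketch of Case 1. The pattern "CABAB" and the helper name `case1_shift` are assumptions chosen for illustration; they reproduce the numbers in the explanation above (t = "AB", another occurrence at position 1, shift of 2).

```python
def case1_shift(pattern, i):
    """Weak rule, Case 1: t = pattern[i:] has matched (0 < i < len(pattern)).

    Returns the shift that aligns the rightmost earlier occurrence of t
    with the matched t, or None if t does not occur anywhere else.
    """
    t = pattern[i:]
    # Limiting the search to pattern[:-1] forces the occurrence to start before i.
    k = pattern.rfind(t, 0, len(pattern) - 1)
    return None if k == -1 else i - k


print(case1_shift("CABAB", 3))  # 2
print(case1_shift("ABCD", 2))   # None
```

When the function returns None, Case 1 does not apply and one of the other cases has to decide the shift.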
Case 2: A prefix of P, which matches with suffix of t in T
It is not always guaranteed that we will find an occurrence of t in P. In some cases, there may be no exact occurrence at all. In such situations, we can instead look for a suffix of t that matches with a prefix of P. If a match is found, we can shift the pattern P to align the matched suffix of t with the prefix of P, allowing us to continue the search effectively. This method helps when direct matches for t are not found, and it ensures that the search can still progress efficiently. For example:

Explanation:
In the above example, t ("BAB") has matched with P (in green) at index 2-4 before the mismatch. Because there is no other occurrence of t in P, we search for a prefix of P that matches a suffix of t. We find the prefix "AB" (in the yellow background) starting at index 0, which matches not the whole of t but its suffix "AB" starting at index 3. So we shift the pattern by 3 positions to align the prefix with the suffix.
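A small Python sketch of Case 2, assuming the pattern "ABBAB" (a hypothetical pattern consistent with the explanation above: t = "BAB" at index 2-4 and the prefix "AB"). The helper name `case2_shift` is also an assumption.

```python
def case2_shift(pattern, i):
    """Weak rule, Case 2: t = pattern[i:] has matched.

    Finds the longest suffix of t that is also a prefix of the pattern and
    returns the shift that lines the two up, or None if there is none.
    """
    m = len(pattern)
    t = pattern[i:]
    # Try the longest candidate first; cap at m - 1 so the shift is never 0.
    for length in range(min(len(t), m - 1), 0, -1):
        if pattern.startswith(t[-length:]):
            return m - length
    return None


print(case2_shift("ABBAB", 2))  # 3
print(case2_shift("CBAAB", 3))  # None
```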
Case 3: P moves past t
If neither of the above two cases applies, we shift the pattern past t. For example:

Explanation:
In the above example, there is no other occurrence of t ("AB") in P, and no prefix of P matches a suffix of t. In that case, no match can begin before index 4, so we shift P past t, i.e. to index 5.
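The three cases can also be read as a single condition: take the smallest shift s for which every matched character that still overlaps the pattern agrees with it. The brute-force Python sketch below follows that reading; the function name and the sample patterns are assumptions, and the quadratic scan is for illustration only, not the table-based preprocessing.

```python
def weak_good_suffix_shift(pattern, i):
    """Smallest shift s >= 1 compatible with the matched suffix pattern[i:].

    For each matched position k, the character that lands on it after the
    shift (pattern[k - s]) must be equal, unless the pattern has already
    moved past that position (k - s < 0).
    """
    m = len(pattern)
    for s in range(1, m + 1):
        if all(k - s < 0 or pattern[k - s] == pattern[k] for k in range(i, m)):
            return s
    return m  # not reached: s = m always satisfies the condition


print(weak_good_suffix_shift("CABAB", 3))  # 2  (Case 1)
print(weak_good_suffix_shift("ABBAB", 2))  # 3  (Case 2)
print(weak_good_suffix_shift("CBAAB", 3))  # 5  (Case 3: move past t)
```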
Strong Good Suffix Heuristic
Suppose the substring q = P[i to n] of the pattern P has matched with the substring t in the text T, and c = P[i-1] is the mismatching character. Unlike the weak rule of the Good Suffix Heuristic, we now search P for an occurrence of t that is not immediately preceded by the character c.
The closest such occurrence of t in P is then aligned with t in T, and the pattern P is shifted accordingly. Aligning an occurrence that is preceded by c would be pointless, because c has already failed to match the text character at that position. The strong rule therefore never shifts less than the weak rule and avoids comparisons that are certain to fail.
For example -

Explanation:
Here a substring q = P[i to n] of the pattern P has matched with a substring t in the text T, and the mismatching character is c = P[i-1]. Instead of accepting any occurrence of t in P, including one that is preceded by the mismatching character c, we look for an occurrence of t in P that is not preceded by c.
- Step 1: Search for t in the pattern P. The first occurrence of t is found at position 4, but this occurrence is preceded by the mismatching character c = "C", so we skip it and continue searching.
- Step 2: At position 1, we find another occurrence of t in P. This occurrence is preceded by the character A, which is not equal to c. Therefore, we shift the pattern by 6 positions to align this occurrence with the matched substring t in the text T.
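The two steps can be sketched in Python as a right-to-left scan over the occurrences of t. The pattern "AACCACCAC" is an assumption chosen to match the positions quoted above (t = "AC" matched at position 7, c = "C", occurrences at positions 4 and 1); the helper name is hypothetical as well.

```python
def strong_case1_shift(pattern, i):
    """Strong rule: t = pattern[i:] matched, c = pattern[i - 1] mismatched.

    Assumes 0 < i < len(pattern). Scans occurrences of t from right to left
    and returns the shift for the first one not preceded by c, or None.
    """
    t = pattern[i:]
    c = pattern[i - 1]
    k = pattern.rfind(t, 0, len(pattern) - 1)
    while k != -1:
        if k == 0 or pattern[k - 1] != c:
            return i - k
        # Preceded by c: skip it and look for an occurrence further left.
        k = pattern.rfind(t, 0, k + len(t) - 1)
    return None


print(strong_case1_shift("AACCACCAC", 7))  # 6
```

On this pattern the weak rule would stop at the occurrence at position 4 and shift by only 3.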
This method helps avoid unnecessary comparisons and ensures that the pattern shifts efficiently, reducing the time complexity of the pattern matching process.
Preprocessing for Good Suffix Heuristic
As part of the preprocessing, an array shift is created. Each entry shift[i] stores the distance by which the pattern is shifted when a mismatch occurs at position i-1, that is, after the suffix of P starting at position i has matched.
1) Preprocessing for Strong Good Suffix
Before we dive into the preprocessing steps, it's important to understand the concept of borders. A border is a substring that is both a proper suffix and a proper prefix. For example, in the string "ccacc", "c" and "cc" are borders because they appear at both ends of the string, but "cca" is not a border.
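The definition can be checked directly with a short Python sketch (the function name `borders` is an assumption):

```python
def borders(s):
    """Return every non-empty proper prefix of s that is also a suffix of s."""
    return [s[:k] for k in range(1, len(s)) if s[:k] == s[-k:]]


print(borders("ccacc"))  # ['c', 'cc']
```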
During preprocessing, we calculate the array bpos (border positions). Each entry bpos[i] stores the starting index of the widest border of the suffix starting at position i in the given pattern P.
- The suffix starting at position m is empty and has no border, so bpos[m] = m + 1 (where m is the length of the pattern).
The shift position is determined by these borders, and the pattern is shifted based on these positions. The following code shows the preprocessing algorithm for the Strong Good Suffix Heuristic:
C++
void preprocess_strong_suffix(int *shift, int *bpos, char *pat, int m)
{
    int i = m, j = m + 1;
    bpos[i] = j;
    while (i > 0)
    {
        while (j <= m && pat[i - 1] != pat[j - 1])
        {
            if (shift[j] == 0)
                shift[j] = j - i;
            j = bpos[j];
        }
        i--;
        j--;
        bpos[i] = j;
    }
}
Explanation of the Code
Initialization:
- i = m and j = m + 1 set up the initial boundary for calculating the borders.
- bpos[i] = j initializes the bpos array, marking the end of the pattern.
Main Loop:
- The main loop starts from the end of the pattern (i = m) and compares the character at pat[i-1] with pat[j-1].
- If the characters match, the border can be extended to the left: i and j are both decremented and bpos[i] = j is recorded.
- If a mismatch is found, we adjust j using bpos[j] and try the next shorter border.
Shifting the Pattern:
- If shift[j] == 0, we calculate the shift based on the current mismatch and store it in shift[j] = j - i. Entries that are already set are left unchanged, so each entry keeps the smallest shift found for it.
Example for bpos[i] = j:
Consider the pattern P = "ABBABAB" where m = 7. Here's how the preprocessing works:
- The suffix "AB" starting at position i = 5 has no border, so bpos[5] = 7.
- The suffix "BABAB" starting at position i = 2 has a border "BAB", so bpos[2] = 4.
This helps in determining where to shift the pattern when a mismatch occurs.
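The bpos values for this example can be reproduced straight from the border definition. The Python sketch below is a brute-force check, not the linear-time preprocessing; it assumes that an empty border is recorded as position m, as in bpos[5] = 7 above, and the function name is hypothetical.

```python
def bpos_bruteforce(pattern):
    """bpos[i] = start index of the widest border of pattern[i:]."""
    m = len(pattern)
    bpos = []
    for i in range(m):
        suffix = pattern[i:]
        n = len(suffix)
        # k = 0 (the empty border) always qualifies, so max() is never empty.
        widest = max(k for k in range(n) if suffix[:k] == suffix[n - k:])
        bpos.append(m - widest)
    bpos.append(m + 1)  # the empty suffix has no border
    return bpos


print(bpos_bruteforce("ABBABAB"))  # [5, 6, 4, 5, 6, 7, 7, 8]
```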
2) Preprocessing for Case 2 (Widest Border of the Pattern)
In this preprocessing step, we determine the widest border of the pattern that is contained in each suffix. The widest border refers to the longest substring that serves as both a proper prefix and a proper suffix of the pattern.
- bpos[0] stores the starting position of the widest border of the entire pattern.
- This value is initially written to every free entry of the shift array, that is, every entry still equal to 0 after the first preprocessing step. Once the suffix becomes shorter than this border (when i reaches bpos[0]), the algorithm proceeds to the next-widest border.
This preprocessing ensures that when a mismatch occurs, the pattern is shifted optimally, either by aligning borders or skipping over parts of the text that are already matched.
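As a worked example of this second step, the Python sketch below fills the free entries for P = "ABBABAB". The input arrays are assumptions obtained by tracing the first preprocessing step on that pattern by hand: bpos = [5, 6, 4, 5, 6, 7, 7, 8] and shift = [0, 0, 0, 0, 2, 0, 4, 1]. The function name `fill_case2` is hypothetical.

```python
def fill_case2(shift, bpos):
    """Fill entries of shift that are still 0 with the current widest border position."""
    m = len(shift) - 1
    j = bpos[0]
    for i in range(m + 1):
        if shift[i] == 0:
            shift[i] = j
        # The suffix is now shorter than the current border: use the next one.
        if i == j:
            j = bpos[j]
    return shift


bpos = [5, 6, 4, 5, 6, 7, 7, 8]
shift_after_step1 = [0, 0, 0, 0, 2, 0, 4, 1]
print(fill_case2(list(shift_after_step1), bpos))  # [5, 5, 5, 5, 2, 5, 4, 1]
```

Entries 4, 6 and 7 keep the values set by the strong rule; every other entry receives 5, the start of the widest border "AB" of the whole pattern.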
Following is the implementation of the search algorithm -
C++
#include <stdio.h>
#include <string.h>
#include <vector>

// preprocessing for strong good suffix rule
void preprocess_strong_suffix(int *shift, int *bpos,
                              char *pat, int m)
{
    // m is the length of pattern
    int i = m, j = m + 1;
    bpos[i] = j;

    while (i > 0)
    {
        /* if character at position i-1 is not equivalent to
           character at j-1, then continue searching to right
           of the pattern for border */
        while (j <= m && pat[i - 1] != pat[j - 1])
        {
            /* the character preceding the occurrence of t in
               pattern P is different than the mismatching character in P,
               we stop skipping the occurrences and shift the pattern
               from i to j */
            if (shift[j] == 0)
                shift[j] = j - i;

            // Update the position of next border
            j = bpos[j];
        }
        /* p[i-1] matched with p[j-1], border is found.
           store the beginning position of border */
        i--; j--;
        bpos[i] = j;
    }
}

// Preprocessing for case 2
void preprocess_case2(int *shift, int *bpos,
                      char *pat, int m)
{
    int i, j;
    j = bpos[0];
    for (i = 0; i <= m; i++)
    {
        /* set the border position of the first character of the pattern
           to all indices in array shift having shift[i] = 0 */
        if (shift[i] == 0)
            shift[i] = j;

        /* suffix becomes shorter than bpos[0], use the position of
           next widest border as value of j */
        if (i == j)
            j = bpos[j];
    }
}

/* Search for a pattern in given text using
   Boyer Moore algorithm with Good suffix rule */
void search(char *text, char *pat)
{
    // s is shift of the pattern with respect to text
    int s = 0, j;
    int m = strlen(pat);
    int n = strlen(text);

    // std::vector is used because variable-length arrays are not
    // standard C++; all entries of shift are initialized to 0
    std::vector<int> bpos(m + 1), shift(m + 1, 0);

    // do preprocessing
    preprocess_strong_suffix(shift.data(), bpos.data(), pat, m);
    preprocess_case2(shift.data(), bpos.data(), pat, m);

    while (s <= n - m)
    {
        j = m - 1;

        /* Keep reducing index j of pattern while characters of
           pattern and text are matching at this shift s */
        while (j >= 0 && pat[j] == text[s + j])
            j--;

        /* If the pattern is present at the current shift, then index j
           will become -1 after the above loop */
        if (j < 0)
        {
            printf("pattern occurs at shift = %d\n", s);
            s += shift[0];
        }
        else
            /* pat[j] != text[s+j], so shift the pattern
               by shift[j+1] positions */
            s += shift[j + 1];
    }
}

int main()
{
    char text[] = "ABAAAABAACD";
    char pat[] = "ABA";
    search(text, pat);
    return 0;
}
Java
class GFG
{
    // preprocessing for strong good suffix rule
    static void preprocess_strong_suffix(int []shift, int []bpos,
                                         char []pat, int m)
    {
        // m is the length of pattern
        int i = m, j = m + 1;
        bpos[i] = j;

        while (i > 0)
        {
            /* if character at position i-1 is not
               equivalent to character at j-1, then
               continue searching to right of the
               pattern for border */
            while (j <= m && pat[i - 1] != pat[j - 1])
            {
                /* the character preceding the occurrence of t
                   in pattern P is different than the mismatching
                   character in P, we stop skipping the occurrences
                   and shift the pattern from i to j */
                if (shift[j] == 0)
                    shift[j] = j - i;

                // Update the position of next border
                j = bpos[j];
            }
            /* p[i-1] matched with p[j-1], border is found.
               store the beginning position of border */
            i--; j--;
            bpos[i] = j;
        }
    }

    // Preprocessing for case 2
    static void preprocess_case2(int []shift, int []bpos,
                                 char []pat, int m)
    {
        int i, j;
        j = bpos[0];
        for (i = 0; i <= m; i++)
        {
            /* set the border position of the first character
               of the pattern to all indices in array shift
               having shift[i] = 0 */
            if (shift[i] == 0)
                shift[i] = j;

            /* suffix becomes shorter than bpos[0],
               use the position of next widest border
               as value of j */
            if (i == j)
                j = bpos[j];
        }
    }

    /* Search for a pattern in given text using
       Boyer Moore algorithm with Good suffix rule */
    static void search(char []text, char []pat)
    {
        // s is shift of the pattern
        // with respect to text
        int s = 0, j;
        int m = pat.length;
        int n = text.length;

        int []bpos = new int[m + 1];
        int []shift = new int[m + 1];

        // initialize all entries of shift to 0
        for (int i = 0; i < m + 1; i++)
            shift[i] = 0;

        // do preprocessing
        preprocess_strong_suffix(shift, bpos, pat, m);
        preprocess_case2(shift, bpos, pat, m);

        while (s <= n - m)
        {
            j = m - 1;

            /* Keep reducing index j of pattern while
               characters of pattern and text are matching
               at this shift s */
            while (j >= 0 && pat[j] == text[s + j])
                j--;

            /* If the pattern is present at the current shift,
               then index j will become -1 after the above loop */
            if (j < 0)
            {
                System.out.printf("pattern occurs at shift = %d\n", s);
                s += shift[0];
            }
            else
                /* pat[j] != text[s+j], so shift the pattern
                   by shift[j+1] positions */
                s += shift[j + 1];
        }
    }

    public static void main(String[] args)
    {
        char []text = "ABAAAABAACD".toCharArray();
        char []pat = "ABA".toCharArray();
        search(text, pat);
    }
}
// This code is contributed by 29AjayKumar
Python
# preprocessing for strong good suffix rule
def preprocess_strong_suffix(shift, bpos, pat, m):
    # m is the length of pattern
    i = m
    j = m + 1
    bpos[i] = j

    while i > 0:
        # if character at position i-1 is not equivalent to
        # character at j-1, then continue searching to right
        # of the pattern for border
        while j <= m and pat[i - 1] != pat[j - 1]:
            # the character preceding the occurrence of t in
            # pattern P is different than the mismatching character in P,
            # we stop skipping the occurrences and shift the pattern
            # from i to j
            if shift[j] == 0:
                shift[j] = j - i

            # Update the position of next border
            j = bpos[j]

        # p[i-1] matched with p[j-1], border is found.
        # store the beginning position of border
        i -= 1
        j -= 1
        bpos[i] = j


# Preprocessing for case 2
def preprocess_case2(shift, bpos, pat, m):
    j = bpos[0]
    for i in range(m + 1):
        # set the border position of the first character of the pattern
        # to all indices in array shift having shift[i] = 0
        if shift[i] == 0:
            shift[i] = j

        # suffix becomes shorter than bpos[0], use the position of
        # next widest border as value of j
        if i == j:
            j = bpos[j]


# Search for a pattern in given text using
# Boyer Moore algorithm with Good suffix rule
def search(text, pat):
    # s is shift of the pattern with respect to text
    s = 0
    m = len(pat)
    n = len(text)

    bpos = [0] * (m + 1)
    shift = [0] * (m + 1)

    # do preprocessing
    preprocess_strong_suffix(shift, bpos, pat, m)
    preprocess_case2(shift, bpos, pat, m)

    while s <= n - m:
        j = m - 1

        # Keep reducing index j of pattern while characters of
        # pattern and text are matching at this shift s
        while j >= 0 and pat[j] == text[s + j]:
            j -= 1

        # If the pattern is present at the current shift, then index j
        # will become -1 after the above loop
        if j < 0:
            print(f'pattern occurs at shift = {s}')
            s += shift[0]
        else:
            # pat[j] != text[s+j], so shift the pattern
            # by shift[j+1] positions
            s += shift[j + 1]


if __name__ == '__main__':
    text = 'ABAAAABAACD'
    pat = 'ABA'
    search(text, pat)
JavaScript
// Preprocessing for strong good suffix rule
function preprocessStrongSuffix(shift, bpos, pat, m) {
    let i = m;
    let j = m + 1;
    bpos[i] = j;

    while (i > 0) {
        // If character at position i-1 is not equivalent to character at j-1,
        // then continue searching to the right of the pattern for a border.
        while (j <= m && pat[i - 1] !== pat[j - 1]) {
            // The character preceding the occurrence of t in pattern P is different
            // than the mismatching character in P, so stop skipping the occurrences
            // and shift the pattern from i to j.
            if (shift[j] === 0) {
                shift[j] = j - i;
            }
            // Update the position of the next border.
            j = bpos[j];
        }
        // pat[i-1] matched with pat[j-1], a border is found.
        // Store the beginning position of the border.
        i--;
        j--;
        bpos[i] = j;
    }
}

// Preprocessing for case 2
function preprocessCase2(shift, bpos, pat, m) {
    let j = bpos[0];
    for (let i = 0; i <= m; i++) {
        // Set the border position of the first character of the pattern
        // to all indices in the shift array having shift[i] = 0.
        if (shift[i] === 0) {
            shift[i] = j;
        }
        // If the suffix becomes shorter than bpos[0], use the position of
        // the next widest border as the value of j.
        if (i === j) {
            j = bpos[j];
        }
    }
}

// Search for a pattern in given text using Boyer-Moore algorithm with Good Suffix rule
function search(text, pat) {
    let s = 0;
    let j;
    const m = pat.length;
    const n = text.length;

    const bpos = new Array(m + 1);
    // fill(0) initializes all entries of shift to 0
    const shift = new Array(m + 1).fill(0);

    // Perform preprocessing
    preprocessStrongSuffix(shift, bpos, pat, m);
    preprocessCase2(shift, bpos, pat, m);

    while (s <= n - m) {
        j = m - 1;

        // Keep reducing index j of pattern while characters of
        // pattern and text are matching at this shift s
        while (j >= 0 && pat[j] === text[s + j]) {
            j--;
        }

        // If the pattern is present at the current shift, then index j
        // will become -1 after the above loop
        if (j < 0) {
            console.log(`Pattern occurs at shift = ${s}`);
            s += shift[0];
        } else {
            // pat[j] != text[s+j], so shift the pattern by shift[j+1] positions
            s += shift[j + 1];
        }
    }
}

function main() {
    const text = "ABAAAABAACD";
    const pat = "ABA";
    search(text, pat);
}

main();
Output
pattern occurs at shift = 0
pattern occurs at shift = 5
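To see where the two reported shifts come from, the Python sketch below runs the same search loop while recording every alignment it tries. The table [2, 2, 2, 1] is an assumption: it is the shift array for "ABA" obtained by tracing the two preprocessing steps by hand. The function name `trace_search` is hypothetical.

```python
def trace_search(text, pat, shift):
    """Run the good-suffix search loop and record each alignment tried."""
    m, n = len(pat), len(text)
    s = 0
    visited, matches = [], []
    while s <= n - m:
        visited.append(s)
        j = m - 1
        # Compare right to left at the current alignment.
        while j >= 0 and pat[j] == text[s + j]:
            j -= 1
        if j < 0:
            matches.append(s)
            s += shift[0]
        else:
            s += shift[j + 1]
    return visited, matches


visited, matches = trace_search("ABAAAABAACD", "ABA", [2, 2, 2, 1])
print(visited)  # [0, 2, 4, 5, 7, 8]
print(matches)  # [0, 5]
```

Only six of the nine possible alignments are examined; alignments 1, 3 and 6 are skipped by the shift table.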