Count of occurrences of each prefix in a string using modified KMP algorithm
Last Updated :
13 Dec, 2021
Given a string S of size N, the task is to count the occurrences of all the prefixes of the given string S.
Examples:
Input: S = "AAAA"
Output:
A occurs 4 times
AA occurs 3 times.
AAA occurs 2 times.
AAAA occurs 1 times.
Explanation:
Below is the illustration of all the prefix:
Input: S = "ABACABA"
Output:
A occurs 4 times
AB occurs 2 times
ABA occurs 2 times
ABAC occurs 1 times
ABACA occurs 1 times
ABACAB occurs 1 times
ABACABA occurs 1 times
Naive Approach:
- Traverse over all the prefixes in set P. Let the x be the prefix.
- Do a sliding window approach of size |x|.
- Check if the current sliding window on S is equal to x. If yes then increase the count[x] by 1.
Time complexity: O(N3)
Auxiliary Space: O(N)
Efficient Approach:
Use the LPS array (also called prefix_function) from the KMP algorithm.
The prefix function for this string is defined as an array LPS of length N, where LPS[i] is the length of the longest proper prefix of the substring S[0…i] which is also a suffix of this substring. Let occ[i] denote the number of occurrences of the prefix of length i.
Below are the steps to implement this approach:
- Compute the LPS array or prefix_function.
- For each value of the prefix function, first, count how many times it occurs in the LPS array.
- The length prefix i appears exactly ans[i] times, then this number must be added to the number of occurrences of its longest suffix that is also a prefix.
- In the end, add 1 to all the values of occ array, because of the original prefix that should be counted as well.
For example:
LPS[i] denotes that in position i, a prefix of length = LPS[i] appears. And this is the longest prefix possible. But shorter prefixes can occur.
For String S = "AAAA", following are the prefixes:
S[0..0] = A
S[0..1] = AA
S[0..2] = AAA
S[0..3] = AAAA
Initially:
occ[A] = 0
occ[AA] = 0
occ[AAA] = 0
occ[AAAA] = 0
Step1: LPS Array of the following string denotes the length of the longest prefix which is also a suffix:
LPS[1] denotes in string AA, A is a suffix and also a prefix as LPS[1] = 1
LPS[2] denotes in string AAA, AA is a suffix and also a prefix as LPS[2] = 2
LPS[3] denotes in string AAAA, AAA is a suffix and also a prefix as LPS[3] = 3
Step 2:Add these occurrences of prefixes as suffixes to the answer in the occ[] array:
Values : Counted substrings
occ[A] = 1 : S[1]
occ[AA] = 1 : S[1..2]
occ[AAA] = 1 : S[1..3]
occ[AAAA] = 0 : NULL(as there is not a prefix "AAAA" which is also a suffix.
Step 3: Now traverse the string in reverse order starting from "AAA" (as the last value will always be 0 since the complete string is not a proper prefix).
Since, string "AAA" S[1..3] contains "AA" S[2..3] as well, which was not counted yet, therefore increment the occurrence of string "AA" in occ["AA"] as occ["AA"] += occ["AAA"]. Below is the count for the same:
Values : Counted substrings
occ[A] = 1 : S[1]
occ[AA] = 2 : S[1..2], S[2..3]
occ[AAA] = 1 : S[1..3]
occ[AAAA] = 0 : NULL
Now string "AA" contains "A" as well, which was not counted yet, therefore increment the occurrence of string "A" in occ["A"] as occ["A"] += occ["AA"]. Below is the count for the same:
Values : Counted substrings
occ[A] = 3 : S[1], S[2], S[3]
occ[AA] = 2 : S[1..2], S[2..3]
occ[AAA] = 1 : S[1..3]
occ[AAAA] = 0 : NULL
Step 4: At last add one to all occurrences for the original prefixes, which are not counted yet.
Values : Counted substrings
occ[A] = 4 : S[1], S[2], S[3], S[0]
occ[AA] = 3 : S[1..2], S[2..3], S[0..1]
occ[AAA] = 2 : S[1..3], S[0..2]
occ[AAAA] = 1 : S[0..3]
Below is the implementation of the above approach:
C++
// C++ program for the above approach
#include <bits/stdc++.h>
using namespace std;
// Function to print the count of all
// prefix in the given string
void print(vector<int>& occ, string& s)
{
// Iterate over string s
for (int i = 1; i <= int(s.size());
i++) {
// Print the prefix and their
// frequency
cout << s.substr(0, i)
<< " occurs "
<< occ[i]
<< " times."
<< endl;
}
}
// Function to implement the LPS
// array to store the longest prefix
// which is also a suffix for every
// substring of the string S
vector<int> prefix_function(string& s)
{
// Array to store LPS values
vector<int> LPS(s.size());
// Value of lps[0] is 0
// by definition
LPS[0] = 0;
// Find the values of LPS[i] for
// the rest of the string using
// two pointers and DP
for (int i = 1;
i < int(s.size());
i++) {
// Initially set the value
// of j as the longest
// prefix that is also a
// suffix for i as LPS[i-1]
int j = LPS[i - 1];
// Check if the suffix of
// length j+1 is also a prefix
while (j > 0 && s[i] != s[j]) {
j = LPS[j - 1];
}
// If s[i] = s[j] then, assign
// LPS[i] as j+1
if (s[i] == s[j]) {
LPS[i] = j + 1;
}
// If we reached j = 0, assign
// LPS[i] as 0 as there was no
// prefix equal to suffix
else {
LPS[i] = 0;
}
}
// Return the calculated
// LPS array
return LPS;
}
// Function to count the occurrence
// of all the prefix in the string S
void count_occurrence(string& s)
{
int n = s.size();
// Call the prefix_function
// to get LPS
vector<int> LPS
= prefix_function(s);
// To store the occurrence of
// all the prefix
vector<int> occ(n + 1);
// Count all the suffixes that
// are also prefix
for (int i = 0; i < n; i++) {
occ[LPS[i]]++;
}
// Add the occurrences of
// i to smaller prefixes
for (int i = n - 1;
i > 0; i--) {
occ[LPS[i - 1]] += occ[i];
}
// Adding 1 to all occ[i] for all
// the original prefix
for (int i = 0; i <= n; i++)
occ[i]++;
// Function Call to print the
// occurrence of all the prefix
print(occ, s);
}
// Driver Code
int main()
{
// Given String
string A = "ABACABA";
// Function Call
count_occurrence(A);
return 0;
}
Java
// Java program for
// the above approach
import java.util.*;
class GFG{
// Function to print the count
// of all prefix in the
// given String
static void print(int[] occ,
String s)
{
// Iterate over String s
for (int i = 1;
i <= s.length() - 1; i++)
{
// Print the prefix and their
// frequency
System.out.print(s.substring(0, i) +
" occurs " + occ[i] +
" times." + "\n");
}
}
// Function to implement the LPS
// array to store the longest prefix
// which is also a suffix for every
// subString of the String S
static int[] prefix_function(String s)
{
// Array to store LPS values
int []LPS = new int[s.length()];
// Value of lps[0] is 0
// by definition
LPS[0] = 0;
// Find the values of LPS[i] for
// the rest of the String using
// two pointers and DP
for (int i = 1;
i < s.length(); i++)
{
// Initially set the value
// of j as the longest
// prefix that is also a
// suffix for i as LPS[i-1]
int j = LPS[i - 1];
// Check if the suffix of
// length j+1 is also a prefix
while (j > 0 &&
s.charAt(i) != s.charAt(j))
{
j = LPS[j - 1];
}
// If s[i] = s[j] then, assign
// LPS[i] as j+1
if (s.charAt(i) == s.charAt(j))
{
LPS[i] = j + 1;
}
// If we reached j = 0, assign
// LPS[i] as 0 as there was no
// prefix equal to suffix
else
{
LPS[i] = 0;
}
}
// Return the calculated
// LPS array
return LPS;
}
// Function to count the occurrence
// of all the prefix in the String S
static void count_occurrence(String s)
{
int n = s.length();
// Call the prefix_function
// to get LPS
int[] LPS = prefix_function(s);
// To store the occurrence of
// all the prefix
int []occ = new int[n + 1];
// Count all the suffixes that
// are also prefix
for (int i = 0; i < n; i++)
{
occ[LPS[i]]++;
}
// Add the occurrences of
// i to smaller prefixes
for (int i = n - 1;
i > 0; i--)
{
occ[LPS[i - 1]] += occ[i];
}
// Adding 1 to all occ[i] for all
// the original prefix
for (int i = 0; i <= n; i++)
occ[i]++;
// Function Call to print the
// occurrence of all the prefix
print(occ, s);
}
// Driver Code
public static void main(String[] args)
{
// Given String
String A = "ABACABA";
// Function Call
count_occurrence(A);
}
}
// This code is contributed by Princi Singh
Python3
# Python3 program for the above approach
# Function to print the count of all
# prefix in the given string
def Print(occ, s):
# Iterate over string s
for i in range(1, len(s) + 1):
# Print the prefix and their
# frequency
print(s[0 : i], "occur", occ[i], "times.")
# Function to implement the LPS
# array to store the longest prefix
# which is also a suffix for every
# substring of the string S
def prefix_function(s):
# Array to store LPS values
# Value of lps[0] is 0
# by definition
LPS = [0 for i in range(len(s))]
# Find the values of LPS[i] for
# the rest of the string using
# two pointers and DP
for i in range(1, len(s)):
# Initially set the value
# of j as the longest
# prefix that is also a
# suffix for i as LPS[i-1]
j = LPS[i - 1]
# Check if the suffix of
# length j+1 is also a prefix
while (j > 0 and s[i] != s[j]):
j = LPS[j - 1]
# If s[i] = s[j] then, assign
# LPS[i] as j+1
if (s[i] == s[j]):
LPS[i] = j + 1
# If we reached j = 0, assign
# LPS[i] as 0 as there was no
# prefix equal to suffix
else:
LPS[i] = 0
# Return the calculated
# LPS array
return LPS
# Function to count the occurrence
# of all the prefix in the string S
def count_occurrence(s):
n = len(s)
# Call the prefix_function
# to get LPS
LPS = prefix_function(s)
# To store the occurrence of
# all the prefix
occ = [0 for i in range(n + 1)]
# Count all the suffixes that
# are also prefix
for i in range(n):
occ[LPS[i]] += 1
# Add the occurrences of
# i to smaller prefixes
for i in range(n - 1, 0, -1):
occ[LPS[i - 1]] += occ[i]
# Adding 1 to all occ[i] for all
# the original prefix
for i in range(n + 1):
occ[i] += 1
# Function Call to print the
# occurrence of all the prefix
Print(occ, s)
# Driver Code
# Given String
A = "ABACABA"
# Function Call
count_occurrence(A)
# This code is contributed by avanitrachhadiya2155
C#
// C# program for
// the above approach
using System;
class GFG{
// Function to print the
// count of all prefix
// in the given String
static void print(int[] occ,
String s)
{
// Iterate over String s
for (int i = 1;
i <= s.Length - 1; i++)
{
// Print the prefix and their
// frequency
Console.Write(s.Substring(0, i) +
" occurs " + occ[i] +
" times." + "\n");
}
}
// Function to implement the LPS
// array to store the longest prefix
// which is also a suffix for every
// subString of the String S
static int[] prefix_function(String s)
{
// Array to store LPS values
int []LPS = new int[s.Length];
// Value of lps[0] is 0
// by definition
LPS[0] = 0;
// Find the values of LPS[i] for
// the rest of the String using
// two pointers and DP
for (int i = 1;
i < s.Length; i++)
{
// Initially set the value
// of j as the longest
// prefix that is also a
// suffix for i as LPS[i-1]
int j = LPS[i - 1];
// Check if the suffix of
// length j+1 is also a prefix
while (j > 0 && s[i] != s[j])
{
j = LPS[j - 1];
}
// If s[i] = s[j] then,
// assign LPS[i] as j+1
if (s[i] == s[j])
{
LPS[i] = j + 1;
}
// If we reached j = 0, assign
// LPS[i] as 0 as there was no
// prefix equal to suffix
else
{
LPS[i] = 0;
}
}
// Return the calculated
// LPS array
return LPS;
}
// Function to count the occurrence
// of all the prefix in the String S
static void count_occurrence(String s)
{
int n = s.Length;
// Call the prefix_function
// to get LPS
int[] LPS = prefix_function(s);
// To store the occurrence of
// all the prefix
int []occ = new int[n + 1];
// Count all the suffixes that
// are also prefix
for (int i = 0; i < n; i++)
{
occ[LPS[i]]++;
}
// Add the occurrences of
// i to smaller prefixes
for (int i = n - 1;
i > 0; i--)
{
occ[LPS[i - 1]] += occ[i];
}
// Adding 1 to all occ[i] for all
// the original prefix
for (int i = 0; i <= n; i++)
occ[i]++;
// Function Call to print the
// occurrence of all the prefix
print(occ, s);
}
// Driver Code
public static void Main(String[] args)
{
// Given String
String A = "ABACABA";
// Function Call
count_occurrence(A);
}
}
// This code is contributed by Amit Katiyar
JavaScript
<script>
// JavaScript program for the above approach
// Function to print the count of all
// prefix in the given string
const print = (occ, s) => {
// Iterate over string s
for (let i = 1; i <= s.length; i++) {
// Print the prefix and their
// frequency
document.write(`${s.substr(0, i)} occurs ${occ[i]} times.<br/>`);
}
}
// Function to implement the LPS
// array to store the longest prefix
// which is also a suffix for every
// substring of the string S
const prefix_function = (s) => {
// Array to store LPS values
let LPS = new Array(s.length).fill(0);
// Value of lps[0] is 0
// by definition
LPS[0] = 0;
// Find the values of LPS[i] for
// the rest of the string using
// two pointers and DP
for (let i = 1; i < s.length; i++) {
// Initially set the value
// of j as the longest
// prefix that is also a
// suffix for i as LPS[i-1]
let j = LPS[i - 1];
// Check if the suffix of
// length j+1 is also a prefix
while (j > 0 && s[i] != s[j]) {
j = LPS[j - 1];
}
// If s[i] = s[j] then, assign
// LPS[i] as j+1
if (s[i] == s[j]) {
LPS[i] = j + 1;
}
// If we reached j = 0, assign
// LPS[i] as 0 as there was no
// prefix equal to suffix
else {
LPS[i] = 0;
}
}
// Return the calculated
// LPS array
return LPS;
}
// Function to count the occurrence
// of all the prefix in the string S
const count_occurrence = (s) => {
let n = s.length;
// Call the prefix_function
// to get LPS
let LPS = prefix_function(s);
// To store the occurrence of
// all the prefix
let occ = new Array(n + 1).fill(0);
// Count all the suffixes that
// are also prefix
for (let i = 0; i < n; i++) {
occ[LPS[i]]++;
}
// Add the occurrences of
// i to smaller prefixes
for (let i = n - 1;
i > 0; i--) {
occ[LPS[i - 1]] += occ[i];
}
// Adding 1 to all occ[i] for all
// the original prefix
for (let i = 0; i <= n; i++)
occ[i]++;
// Function Call to print the
// occurrence of all the prefix
print(occ, s);
}
// Driver Code
// Given String
let A = "ABACABA";
// Function Call
count_occurrence(A);
// This code is contributed by rakeshsahni
</script>
Output: A occurs 4 times.
AB occurs 2 times.
ABA occurs 2 times.
ABAC occurs 1 times.
ABACA occurs 1 times.
ABACAB occurs 1 times.
ABACABA occurs 1 times.
Time Complexity: O(N2)
Auxiliary Space: O(N)
Similar Reads
Remove all occurrences of string t in string s using KMP Algorithm
Given two strings s and t, the task is to remove all occurrences of t in s and return the modified string s, and you have to use the KMP algorithm to solve this. Examples: Input: s = "abcdefgabcabcabdefghabc", t = "abc"Output: "defgdefgh" Input: s = "aaabbbccc", t = "bbb"Output: "aaaccc" Approach: T
8 min read
Count occurrences of strings formed using words in another string
Given a string A and a vector of strings B, the task is to count the number of strings in vector B that only contains the words from A. Examples: Input: A="blue green red yellow"B[]={"blue red", "green pink", "yellow green"}Output: 2 Input: A="apple banana pear"B[]={"apple", "banana apple", "pear ba
7 min read
Count distinct substrings of a string using Rabin Karp algorithm
Given a string, return the number of distinct substrings using Rabin Karp Algorithm. Examples: Input : str = âabaâOutput : 5Explanation :Total number of distinct substring are 5 - "a", "ab", "aba", "b" ,"ba" Input : str = âabcdâOutput : 10Explanation :Total number of distinct substring are 10 - "a",
9 min read
Count occurrences of substring X before every occurrence of substring Y in a given string
Given three strings S, X, and Y consisting of N, A, and B characters respectively, the task is to find the number of occurrences of the substring X before every occurrence of the substring Y in the given string S. Examples: Input S = âabcdefdefabcâ, X = âdefâ, Y = âabcâOutput: 0 2Explanation:First o
6 min read
Count of occurrences of a "1(0+)1" pattern in a string
Given an alphanumeric string, find the number of times a pattern 1(0+)1 occurs in the given string. Here, (0+) signifies the presence of non empty sequence of consecutive 0's. Examples: Input : 1001010001 Output : 3 First sequence is in between 0th and 3rd index. Second sequence is in between 3rd an
7 min read
Count Occurrences of a Given Character in a String
Given a string S and a character 'c', the task is to count the occurrence of the given character in the string.Examples: Input : S = "geeksforgeeks" and c = 'e'Output : 4Explanation: 'e' appears four times in str.Input : S = "abccdefgaa" and c = 'a' Output : 3Explanation: 'a' appears three times in
6 min read
Count of substrings of given string with frequency of each character at most K
Given a string str, the task is to calculate the number of substrings of the given string such that the frequency of each element of the string is almost K. Examples: Input: str = "abab", K = 1Output: 7Explanation: The substrings such that the frequency of each character is atmost 1 are "a", "b", "a
6 min read
Count occurrences of a character in a repeated string
Given an integer N and a lowercase string. The string is repeated infinitely. The task is to find the No. of occurrences of a given character x in first N letters.Examples: Input : N = 10 str = "abcac"Output : 4Explanation: "abcacabcac" is the substring from the infinitely repeated string. In first
8 min read
Minimize length of prefix of string S containing all characters of another string T
Given two string S and T, the task is to find the minimum length prefix from S which consists of all characters of string T. If S does not contain all characters of string T, print -1. Examples: Input: S = "MarvoloGaunt", T = "Tom" Output: 12 Explanation: The 12 length prefix "MarvoloGaunt" contains
7 min read
Count the number of vowels occurring in all the substrings of given string | Set 2
Given a string str[] of length N of lowercase characters containing 0 or more vowels, the task is to find the count of vowels that occurred in all the substrings of the given string. Examples: Input: str = âabcâ Output: 3The given string âabcâ contains only one vowel = âaâ Substrings of âabcâ are =
5 min read