CSES Solutions - String Matching
Last Updated: 02 Apr, 2024
Given a string S and a pattern P, your task is to count the number of positions where the pattern occurs in the string.
Examples:
Input: S = "saippuakauppias", P = "pp"
Output: 2
Explanation: "pp" appears 2 times in S.
Input: S = "aaaa", P = "aa"
Output: 3
Explanation: "aa" appears 3 times in S.
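The expected answers above, including the overlapping matches in the second example, can be reproduced by checking every starting position directly. The following is only a minimal sketch for cross-checking small inputs; count_naive is a hypothetical helper name that does not appear in the original solution, and its worst-case running time is O(N*M), which is why a faster method is used below.

```python
def count_naive(text, pattern):
    """Count occurrences (overlaps included) by testing every start index."""
    n, m = len(text), len(pattern)
    # Assumption: an empty pattern, or one longer than the text, counts as 0.
    if m == 0 or m > n:
        return 0
    count = 0
    for start in range(n - m + 1):
        # Compare the window of length m that begins at `start`.
        if text[start:start + m] == pattern:
            count += 1
    return count


if __name__ == "__main__":
    print(count_naive("saippuakauppias", "pp"))  # 2
    print(count_naive("aaaa", "aa"))             # 3
```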
Approach: To solve the problem, follow the below idea:
To find all occurrences of a pattern in a text we can use various String-Matching algorithms. The Knuth-Morris-Pratt (KMP) algorithm is a suitable choice for this problem. KMP is an efficient string-matching algorithm that can find all occurrences of a pattern in a string in linear time.
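Concatenate the Pattern and Text: The first step is to build the string P + "#" + S, with a separator character # between the pattern and the text. The separator must be a character that occurs in neither string. Because no prefix of the concatenated string longer than the pattern can then reappear later in it, every prefix function value is at most the length of the pattern, and a match can never extend across the boundary between pattern and text.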
Compute the Prefix Function: The computePrefix function computes the prefix function of the concatenated string. The prefix function at position i is the length of the longest proper prefix of the substring ending at position i that is also a suffix of that substring. This function is a key part of the KMP algorithm.
Count the Occurrences: After the prefix function is computed, the next step is to count the occurrences of the pattern in the text. This is done by iterating over the prefix function array and counting how many entries are equal to the pattern length. Because of the separator, such a value can only occur at a position inside the text part, and it means that an occurrence of the pattern ends at that position.
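The three steps above can be traced on the second example. The sketch below is illustrative only: prefix_function and occurrence_starts are hypothetical names, and the code assumes that the separator # occurs in neither string. It prints the prefix function array of "aa#aaaa" and then converts each index i whose value equals the pattern length M into a 0-based start position in the text, which is i - 2*M (the pattern and the separator occupy the first M + 1 positions, and the match ends at i).

```python
def prefix_function(s):
    """pi[i] = length of the longest proper prefix of s[:i+1] that is also its suffix."""
    pi = [0] * len(s)
    for i in range(1, len(s)):
        j = pi[i - 1]
        # Fall back to shorter borders until s[i] extends one of them.
        while j > 0 and s[i] != s[j]:
            j = pi[j - 1]
        if s[i] == s[j]:
            j += 1
        pi[i] = j
    return pi


def occurrence_starts(text, pattern, sep="#"):
    """Return the 0-based start index in `text` of every occurrence of `pattern`.

    Assumption: `sep` occurs in neither `text` nor `pattern`.
    """
    m = len(pattern)
    if m == 0:
        return []
    pi = prefix_function(pattern + sep + text)
    # A value of m at combined index i means a match ends there;
    # it starts at text index i - 2 * m.
    return [i - 2 * m for i, value in enumerate(pi) if value == m]


if __name__ == "__main__":
    print(prefix_function("aa#aaaa"))                 # [0, 1, 0, 1, 2, 2, 2]
    print(occurrence_starts("aaaa", "aa"))            # [0, 1, 2]
    print(occurrence_starts("saippuakauppias", "pp")) # [3, 10]
```

The count required by the problem is simply the length of the returned list.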
Step-by-step algorithm:
- Concatenate the pattern string and the text string with a separator character (#) in between: combined = P + "#" + S.
- Declare the prefix function array pi[] for the concatenated string, and a counter for the occurrences.
- Compute pi[] from left to right. For each position i, start with j = pi[i - 1]. While j > 0 and the characters at positions i and j differ, fall back to j = pi[j - 1]. If the characters at positions i and j match, increment j. Then store pi[i] = j.
- Iterate over pi[]. Each time a value equals the length of the pattern (i.e., all characters of the pattern have matched), an occurrence of the pattern ends at that position, so increment the count of occurrences.
- After the entire array has been scanned, print the count of occurrences.
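A common variant computes the prefix function for the pattern alone and then scans the text against it, which avoids building the concatenated string and needs only O(M) extra memory. The sketch below is one possible way to write it, not the implementation used in this article; prefix_function and count_by_scanning are hypothetical names.

```python
def prefix_function(s):
    """pi[i] = length of the longest proper prefix of s[:i+1] that is also its suffix."""
    pi = [0] * len(s)
    for i in range(1, len(s)):
        j = pi[i - 1]
        while j > 0 and s[i] != s[j]:
            j = pi[j - 1]
        if s[i] == s[j]:
            j += 1
        pi[i] = j
    return pi


def count_by_scanning(text, pattern):
    """Count occurrences of `pattern` in `text` using pi[] of the pattern only."""
    m = len(pattern)
    # Assumption: an empty pattern counts as 0 occurrences.
    if m == 0:
        return 0
    pi = prefix_function(pattern)
    count = 0
    j = 0  # number of pattern characters currently matched
    for ch in text:
        # On a mismatch, fall back in the pattern; the text never moves backwards.
        while j > 0 and ch != pattern[j]:
            j = pi[j - 1]
        if ch == pattern[j]:
            j += 1
        if j == m:
            # Full match: record it, then fall back so overlapping matches are found.
            count += 1
            j = pi[j - 1]
    return count


if __name__ == "__main__":
    print(count_by_scanning("saippuakauppias", "pp"))  # 2
    print(count_by_scanning("aaaa", "aa"))             # 3
```

Both variants run in O(N+M) time; the difference is only in how much extra memory is allocated.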
Below is the implementation of the algorithm:
C++
#include <bits/stdc++.h>
using namespace std;

// Function to compute the prefix function of a string for
// KMP algorithm
vector<int> computePrefix(const string& S)
{
    int N = S.length();
    vector<int> pi(N);
    for (int i = 1; i < N; i++) {
        int j = pi[i - 1];

        // Find the longest proper prefix which is also a
        // suffix
        while (j > 0 && S[i] != S[j])
            j = pi[j - 1];
        if (S[i] == S[j])
            j++;
        pi[i] = j;
    }
    return pi;
}

// Function to count the number of occurrences of a pattern
// in a text using KMP algorithm
int countOccurrences(const string& S, const string& P)
{
    // Concatenate pattern and text with a special character
    // in between
    string combined = P + "#" + S;

    // Compute the prefix function
    vector<int> prefixArray = computePrefix(combined);

    // Convert once to int to avoid comparing signed and
    // unsigned values in the loop
    int M = static_cast<int>(P.size());
    int count = 0;

    // Count the number of times the pattern appears in the
    // text
    for (int value : prefixArray) {
        if (value == M)
            count++;
    }
    return count;
}

// Driver code
int main()
{
    string S = "saippuakauppias";
    string P = "pp";
    cout << countOccurrences(S, P) << "\n";
    return 0;
}
Java
import java.util.*;

public class KMPAlgorithm {
    // Function to compute the prefix function of a string for KMP algorithm
    static List<Integer> computePrefix(String S) {
        int N = S.length();
        List<Integer> pi = new ArrayList<>(Collections.nCopies(N, 0));
        for (int i = 1; i < N; i++) {
            int j = pi.get(i - 1);

            // Find the longest proper prefix which is also a suffix
            while (j > 0 && S.charAt(i) != S.charAt(j))
                j = pi.get(j - 1);
            if (S.charAt(i) == S.charAt(j))
                j++;
            pi.set(i, j);
        }
        return pi;
    }

    // Function to count the number of occurrences of a pattern in a text using KMP algorithm
    static int countOccurrences(String S, String P) {
        // Concatenate pattern and text with a special character in between
        String combined = P + "#" + S;

        // Compute the prefix function
        List<Integer> prefixArray = computePrefix(combined);
        int count = 0;

        // Count the number of times the pattern appears in the text
        for (int i = 0; i < prefixArray.size(); i++) {
            if (prefixArray.get(i) == P.length())
                count++;
        }
        return count;
    }

    // Driver code
    public static void main(String[] args) {
        String S = "saippuakauppias";
        String P = "pp";
        System.out.println(countOccurrences(S, P));
    }
}
Python
# Function to compute the prefix function of a string for
# KMP algorithm
def compute_prefix(s):
    n = len(s)
    pi = [0] * n
    j = 0
    for i in range(1, n):
        while j > 0 and s[i] != s[j]:
            j = pi[j - 1]
        if s[i] == s[j]:
            j += 1
        pi[i] = j
    return pi


# Function to count the number of occurrences of a pattern
# in a text using KMP algorithm
def count_occurrences(s, p):
    # Concatenate pattern and text with a special character
    # in between
    combined = p + "#" + s

    # Compute the prefix function
    prefix_array = compute_prefix(combined)
    count = 0

    # Count the number of times the pattern appears in the
    # text
    for pi in prefix_array:
        if pi == len(p):
            count += 1
    return count


# Driver code
if __name__ == "__main__":
    S = "saippuakauppias"
    P = "pp"
    print(count_occurrences(S, P))
C#
using System;
using System.Collections.Generic;

public class KMPAlgorithm
{
    // Function to compute the prefix function of a string for KMP algorithm
    static List<int> ComputePrefix(string S)
    {
        int N = S.Length;
        List<int> pi = new List<int>(new int[N]);
        for (int i = 1; i < N; i++)
        {
            int j = pi[i - 1];

            // Find the longest proper prefix which is also a suffix
            while (j > 0 && S[i] != S[j])
                j = pi[j - 1];
            if (S[i] == S[j])
                j++;
            pi[i] = j;
        }
        return pi;
    }

    // Function to count the number of occurrences of a pattern in a text using KMP algorithm
    static int CountOccurrences(string S, string P)
    {
        // Concatenate pattern and text with a special character in between
        string combined = P + "#" + S;

        // Compute the prefix function
        List<int> prefixArray = ComputePrefix(combined);
        int count = 0;

        // Count the number of times the pattern appears in the text
        for (int i = 0; i < prefixArray.Count; i++)
        {
            if (prefixArray[i] == P.Length)
                count++;
        }
        return count;
    }

    // Driver code
    public static void Main(string[] args)
    {
        string S = "saippuakauppias";
        string P = "pp";
        Console.WriteLine(CountOccurrences(S, P));
    }
}
JavaScript
// Function to compute the prefix function of a string for
// KMP algorithm
function computePrefix(S) {
    let N = S.length;
    let pi = new Array(N).fill(0);
    for (let i = 1; i < N; i++) {
        let j = pi[i - 1];

        // Find the longest proper prefix which is also a
        // suffix
        while (j > 0 && S[i] != S[j])
            j = pi[j - 1];
        if (S[i] == S[j])
            j++;
        pi[i] = j;
    }
    return pi;
}

// Function to count the number of occurrences of a pattern
// in a text using KMP algorithm
function countOccurrences(S, P) {
    // Concatenate pattern and text with a special character
    // in between
    let combined = P + "#" + S;

    // Compute the prefix function
    let prefixArray = computePrefix(combined);
    let count = 0;

    // Count the number of times the pattern appears in the
    // text
    for (let i = 0; i < prefixArray.length; i++) {
        if (prefixArray[i] == P.length)
            count++;
    }
    return count;
}

// Driver code
let S = "saippuakauppias";
let P = "pp";
console.log(countOccurrences(S, P));
Time Complexity: O(N+M) where N is the length of the text and M is the length of the pattern to be found.
Auxiliary Space: O(N+M), since both the concatenated string and its prefix function array have length N+M+1.