How to Get Longest Common Substring in Python

  1. Longest Common Substring in Python
  2. Longest Common Substring Using Loops in Python
  3. Use Recursion to Find the Longest Common Substring
  4. Use Dynamic Programming to Find the Longest Common Substring in Python
  5. Conclusion
How to Get Longest Common Substring in Python

String in Python stores a sequence of characters in it to perform various operations on them. Substrings are a part of these strings.

ADVERTISEMENT

In this article, we need to find the longest common substring between two given strings and will discuss various solutions for the same.

Longest Common Substring in Python

A substring is a contiguous sequence of characters within a given string. It can be of any length.

The main problem is that we have been provided with two strings, and we need to find a substring that is common between the given strings and should be the longest among all possible common substrings.

 textCopyInput: str1 = "HaveAAppleRegularly"
       str2 = "HaveAnAnApple"
Output: "Apple"

In the above example, we have two common substrings between the given strings, "Have" and "Apple". But since the length of the "Apple" substring is the longest among the rest of the substrings; therefore, it will be displayed as our result.

This problem can be solved using various concepts like recursion and dynamic programming.

Longest Common Substring Using Loops in Python

Loops can be used to iterate through a string. We can iterate through the string here using the loops and find our longest common substring between the two given strings.

We will use both for and while loops in this approach. The following are the steps that need to be followed to find the longest common substring in Python.

Code Example:

Python
 pythonCopystr1 = "PokemonGo"
str2 = "StopPokemonLetsGo"
result = 0
for i in range(len(str1)):
    for j in range(len(str2)):
        k = 0
        while (
            (i + k) < len(str1) and (j + k) < len(str2) and str1[i + k] == str2[j + k]
        ):
            k = k + 1
        result = max(result, k)

print("Length of the longest common substring is:", result)

Output:

 textCopyLength of the longest common substring is: 7

All the substrings of the string can be calculated in O(n^2) time, whereas checking that if the current substring matches with a substring of the second string would take O(m) time. The time complexity for the above approach would be O(n^2 * m) where n and m are lengths of the two given strings, respectively.

However, as we have not taken any extra space to achieve the solution, the space complexity for the above solution would be O(1).

Use Recursion to Find the Longest Common Substring

Recursion refers to a function calling itself. We need a base case and the choice diagram for the problem at hand in recursion.

We will follow the steps below to find the longest common substring in Python.

Code Example:

Python
 pythonCopydef LCS(s1, s2, n, m):
    if n == 0 or m == 0:
        return 0

    if s1[n - 1] == s2[m - 1]:
        return 1 + lcs(s1, s2, n - 1, m - 1)

    else:
        return max(lcs(s1, s2, n, m - 1), lcs(s1, s2, n - 1, m))


s1 = "pokemonGo"
s2 = "watchPokemon"
n = len(s1)
m = len(s2)
res = lcs(s1, s2, n, m)
print("Length of the longest common substring is:", res)

Output:

 textCopyLength of the longest common substring is: 7

The time complexity of the above solution would be O(3^(m+n)), and the space complexity would be O(m+n).

Use Dynamic Programming to Find the Longest Common Substring in Python

The basic idea of dynamic programming is to find the lengths for all the substrings of both the strings and store their respective lengths in a table. Since using recursion, there is a possibility of getting a stack overflow error because, for large inputs, the recursion stack will keep on increasing.

We introduce the concept of dynamic programming in which we form a table and will keep on storing the results for the respective substrings. The same algorithm we used in recursion is also used with some changes.

Now, we will discuss the steps for finding the longest common substring in Python.

Code Example:

Python
 pythonCopydef LCS(X, Y, m, n):

    dp = [[0 for x in range(m + 1)] for y in range(n + 1)]

    for i in range(m + 1):
        for j in range(n + 1):
            if X[i - 1] == Y[j - 1]:
                dp[i][j] = dp[i - 1][j - 1] + 1
            else:
                dp[i][j] = max(dp[i - 1][j], dp[i][j - 1])
    return dp[m][n]


s1 = "playbatball"
s2 = "batballwicket"
n = len(s1)
m = len(s2)
res = lcs(s1, s2, n, m)
print("Length of the longest common substring is:", res)

Output:

 textCopyLength of the longest common substring is: 7

Since there are only two loops in our solution, the time complexity for the above solution would be O(m*n). We are using extra space by forming a table to store the results.

The space complexity of the above solution would be O(m*n).

Conclusion

We have learned three different approaches for finding the longest common substring between two given strings. The first approach, a naive approach, uses three loops and finds all the substrings of the given string and keeps a check on the longest among all the substrings.

The second approach uses recursion, whereas the third approach uses dynamic programming to find the longest common substring in Python. The third approach would be the fastest among all the solutions discussed in this article to solve the problem and should be preferred.

Enjoying our tutorials? Subscribe to DelftStack on YouTube to support us in creating more high-quality video guides. Subscribe

Related Article - Python String