Module V - (4L)
Set and String Problems
In this module, we explore important problems related to sets and strings, focusing on
optimization techniques and algorithms.
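Problem Statement
Set Cover: given a universe U of elements and a collection of subsets of U whose union is U,
find the smallest number of subsets that together cover every element of U.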
Approach
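Set Cover is NP-hard, so no polynomial-time exact algorithm is known, but a greedy algorithm
gives an approximate solution. At each step, the greedy approach selects the subset that covers
the most still-uncovered elements of U. The resulting cover is at most about ln|U| + 1 times
larger than an optimal one.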
Greedy Algorithm
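1. Initialize the set of covered elements as empty.
2. While uncovered elements remain, select the subset that covers the largest number of
uncovered elements, add it to the solution, and mark its elements as covered.
3. Stop when every element of U is covered; the selected subsets form the cover.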
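The greedy steps above can be sketched in Python as follows. This is a minimal illustration, not a reference implementation: the function name greedy_set_cover, the choice to return subset indices, and the sample data U and S are all assumptions made for the example. Ties are broken by taking the first subset with the maximum gain.

```python
def greedy_set_cover(universe, subsets):
    """Return the indices of the subsets picked greedily to cover `universe`.

    Hypothetical helper for illustration; `subsets` is a list of iterables.
    """
    subsets = [set(s) for s in subsets]
    uncovered = set(universe)
    chosen = []
    while uncovered:
        # Pick the subset that covers the most still-uncovered elements.
        best = max(range(len(subsets)), key=lambda i: len(uncovered & subsets[i]))
        gained = uncovered & subsets[best]
        if not gained:
            # No subset helps any more, so the input cannot cover the universe.
            raise ValueError("subsets do not cover the universe")
        chosen.append(best)
        uncovered -= gained
    return chosen


# Example usage (made-up data)
U = set(range(1, 11))
S = [{1, 2, 3, 8}, {1, 2, 3, 4, 5}, {4, 5, 7}, {5, 6, 7}, {6, 7, 8, 9, 10}]
print("Chosen subsets:", greedy_set_cover(U, S))  # Chosen subsets: [1, 4]
```

Each iteration scans every subset, so this simple version runs in time roughly proportional to the number of chosen subsets times the total size of all subsets.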
Problem Statement
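Given two sequences X and Y, find their longest common subsequence (LCS): the longest
subsequence that appears in both sequences in the same order (but not necessarily consecutively).
Let dp[i][j] represent the length of the LCS of the first i characters of X and the first j characters
of Y. With 0-based indexing into X and Y, the recurrence relation is:
dp[i][j] = 0                                 if i = 0 or j = 0
dp[i][j] = dp[i-1][j-1] + 1                  if X[i-1] = Y[j-1]
dp[i][j] = max(dp[i-1][j], dp[i][j-1])       otherwise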
```python
def lcs(X, Y):
    m = len(X)
    n = len(Y)
    # dp[i][j] = length of the LCS of X[:i] and Y[:j]; row 0 and column 0 stay 0
    dp = [[0] * (n + 1) for _ in range(m + 1)]
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            if X[i-1] == Y[j-1]:
                dp[i][j] = dp[i-1][j-1] + 1
            else:
                dp[i][j] = max(dp[i-1][j], dp[i][j-1])
    return dp[m][n]

# Example usage
X = "AGGTAB"
Y = "GXTXAYB"
print("Length of LCS:", lcs(X, Y))
```
Time Complexity
The time complexity of this algorithm is O(m × n), where m and n are the lengths of the two
sequences.
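The dp table gives only the length of the LCS. One way to recover an actual common subsequence is to walk back through the table from dp[m][n], as in the sketch below. The function name lcs_string and the tie-breaking rule (move up when both neighbours are equal) are assumptions made for illustration; when several LCSs exist, a different rule may return a different one.

```python
def lcs_string(X, Y):
    """Return one longest common subsequence of X and Y (illustrative sketch)."""
    m, n = len(X), len(Y)
    # Fill the same table as in the length-only version.
    dp = [[0] * (n + 1) for _ in range(m + 1)]
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            if X[i - 1] == Y[j - 1]:
                dp[i][j] = dp[i - 1][j - 1] + 1
            else:
                dp[i][j] = max(dp[i - 1][j], dp[i][j - 1])

    # Trace back from the bottom-right corner, collecting matched characters.
    out = []
    i, j = m, n
    while i > 0 and j > 0:
        if X[i - 1] == Y[j - 1]:
            out.append(X[i - 1])
            i -= 1
            j -= 1
        elif dp[i - 1][j] >= dp[i][j - 1]:
            i -= 1
        else:
            j -= 1
    # Characters were collected from the end, so reverse them.
    return "".join(reversed(out))


print("LCS:", lcs_string("AGGTAB", "GXTXAYB"))  # LCS: GTAB
```

The traceback takes at most m + n steps, so the overall running time stays O(m × n).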
Summary
In this module, we covered several key problems related to sets and strings:
- Set Cover: An NP-hard optimization problem, approximated using a greedy approach.
- String Matching: Finding exact occurrences of a pattern in a text using naive and efficient
  algorithms like KMP.
- Approximate String Matching: Finding close matches between strings using dynamic
  programming to compute edit distances.
- Longest Common Subsequence: A dynamic programming problem that finds the longest
  subsequence common to two sequences.
These problems have wide-ranging applications in optimization, data analysis, and computational
biology.