DP 2
DP 2
Let us now move to some advanced-level DP questions, which deal with 2D arrays.
Problem Statement: Given an integer matrix of size m*n, you need to find out the
value of minimum cost to reach from the cell (0, 0) to (m-1, n-1). From a cell (i, j), you
can move in three directions : (i+1, j), (i, j+1) and (i+1, j+1). The cost of a path is
defined as the sum of values of each cell through which path passes.
34
3412
2189
4781
The path that should be followed is 3 -> 1 -> 8 -> 1. Hence the output is 13.
Approach: Thinking about the r ecursive approach to reach from the cell (0, 0) to
(m-1, n-1), we need to decide for every cell about the direction to proceed out of
three. We will simply call recursion over all the three choices available to us, and
finally, we will be considering the one with minimum cost and add the current cell’s
value to it.
Let’s now look at the recursive code for this problem:
include <iostream>
using namespace std;
1
int minCostPath(int **input, int m, int n, int i, int j) {
// Base case: reaching out to the destination cell
if(i == m- 1 && j == n- 1) {
return input[i][j];
}
if(i >= m || j >= n) { // Checking if the current row and column are
return INT_MAX; // within the constraints or not, if not, then
} / / returning +infinity, so that it will not be considered as the answer
// Recursive calls
int x = minCostPath(input, m, n, i, j+1); // Towards right direction
int y = minCostPath(input, m, n, i+1, j+1); // Towards diagonally right-down
int z = minCostPath(input, m, n, i+1, j); / / Towards the down direction
// Small Calculation: figuring out the minimum value and then adding
// current cells value to it
int ans = min(x, min(y, z)) + input[i][j];
return ans;
}
int main() {
int m, n;
cin >> m >> n;
int **input = new int*[m];
for(int i = 0; i < m; i++) {
input[i] = new int[n];
for(int j = 0; j < n; j++) {
cin >> input[i][j];
}
}
cout << minCostPath(input, m, n) << endl;
}
Let’s dry run the approach to see the code flow. Suppose, m = 4 and n = 5; then the
recursive call flow looks something like below:
2
Here, we can clearly see that there are many repeated/overlapping recursive
calls(for example: (1,1) is one of them), leading to exponential time complexity, i.e.,
O(3n). If we store the output for each recursive call after their first occurrence, we
can easily avoid the repetition. It means that we can improve this using
memoization.
Now, let’s move on to the Memoization approach.
In memoization, we avoid repeated overlapping calls by storing the output of each
recursive call in an array. In this case, we will be using a 2D array instead of 1D, as
we already discussed in our previous lectures that the storage used for the
memoization is generally the same as the one that recursive calls use to their
maximum.
Refer to the memoization code (along with the comments) below for better
understanding:
3
}
// Small Calculation
int a = min(x, min(y, z)) + input[i][j];
return a;
}
int minCostPath_Mem(int **input, int m, int n, int i, int j) { / / This function will be
int **output = new int*[m]; // called from main()
for(int i = 0; i < m; i++) {
output[i] = new int[n];
for(int j = 0; j < n; j++) {
output[i][j] = -1; / / Initialising the output array by -1. Here, -1
} / / denotes that the value of the current cell is
} / / unknown and could be replaced only after we
// find the same
return minCostPath_Mem(input, m, n, i, j, output);
}
Here, we can observe that as we move from the cell (0,0) to (m-1, n-1), in general,
the i-th row varies from 0 to m-1, and the j-th column runs from 0 to n-1. Hence, the
unique recursive calls will be a maximum of (m-1) * (n-1), which leads to the time
complexity of O(m*n).
To get rid of the recursion, we will now proceed towards the DP approach.
The DP approach is simple. We just need to create a solution array (lets name that
as ans), where ans[i][j] = minimum cost to reach from (i, j) to (m-1, n-1).
4
Now, initialize the last row and last column of the matrix with the sum of their
values and the value, just after it. This is because, in the last row or column, we can
reach there from their forward cell only (You can manually check it), except the cell
(m-1, n-1), which is the value itself.
● ans[m-1][n-1] = input[m-1][n-1]
● ans[m-1][j] = ans[m-1][j+1] + input[m-1][j] (for 0 < j < n)
● ans[i][n-1] = ans[i+1][n-1] + input[i][m-1] (for 0 < i < m)
Next, we will simply fill the rest of our answer matrix by checking out the minimum
among values from where we could reach them. For this, we will use the same
formula as used in the recursive approach:
ans[i][j] = min(ans[i+1][j], min(ans[i+1][j+1], ans[i][j+1])) + input[i][j]
Finally, we will get our answer at the cell (0, 0), which we will return.
The code looks as follows:
ans[m-1][n-1] = input[m-1][n-1];
// Last row
for(int j = n - 2; j >= 0; j--) {
ans[m-1][j] = input[m-1][j] + ans[m-1][j+1];
}
// Last col
for(int i = m-2; i >= 0; i--) {
ans[i][n-1] = input[i][n-1] + ans[i+1][n-1];
}
5
return ans[0][0]; / / Our Final answer as discussed above
}
Note: T
his is the bottom-up approach to solve the question using DP.
Note: S
ubsequence is a part of the string which can be made by omitting none or
some of the characters from that string while maintaining the order of the
characters.
If s1 and s2 are two given strings then z is the common subsequence of s1 and s2, if
z is a subsequence of both of them.
Example 1:
s1 = “abcdef”
s2 = “xyczef”
Here, the longest common subsequence is “cef”; hence the answer is 3 (the length
of LCS).
Example 2:
s1 = “ahkolp”
s2 = “ehyozp”
6
Here, the longest common subsequence is “hop”; hence the answer is 3.
Approach: Let’s first think of a brute-force approach using r ecursion. For LCS, we
are required to first-of-all match the starting characters of both the strings. If they
match, then simply we can break the problem as shown below:
|
s1 = “x yzar”
s2 = “x|qwea”
and the rest of the LCS will be handled by recursion. But, if the first characters do
not match, then we have to figure out that by traversing which of the following
strings, we will get our answer. This can’t be directly predicted by just looking at
them, so we will be traversing over both of them one-by-one and check for the
maximum value of LCS obtained among them to be considered for our answer.
For example:
We can see that their first characters do not match so that we can call recursion
over it in either of the following ways:
A=
B=
7
C=
Check the code below and follow the comments for a better understanding.
// Recursive calls
if(s[0] == t[0]) {
return 1 + lcs(s.substr(1), t.substr(1));
}
else {
int a = lcs(s.substr(1), t); // discarding the first character of string s
int b = lcs(s, t.substr(1)); // discarding the first character of string t
int c = lcs(s.substr(1), t.substr(1)); //discarding the first character of both
return max(a, max(b, c)); // Small calculation
}
}
If we dry run this over the example: s = “ xyz” and t = “zxay”, it will look something
like below:
8
Here, as for each node, we will be making three recursive calls, so the time
complexity will be exponential and is represented as O(2m+n), where m and n are
the lengths of both strings. This is because, if we carefully observe the above code,
then we can skip the third recursive call as it will be covered by the two others.
Consider the diagram below, where we are representing the dry run in terms of its
length taken at each recursive call:
As we can see there are multiple overlapping recursive calls, the solution can be
optimized using m
emoization followed by DP. So, beginning with the memoization
approach, as we want to match all the subsequences of the given two strings, we
9
have to figure out the number of unique recursive calls. For string s, we can make
at most length(s) recursive calls, and similarly, for string t, we can make at most
length(t) recursive calls, which are also dependent on each other’s solution. Hence,
our result can be directly stored in the form of a 2-dimensional array of size
(length(s)+1) * (length(t) + 1) as for string s, we have 0 to length(s) possible
combinations, and the same goes for string t.
So for every index ‘i’ in string s and ‘j’ in string t , we will choose one of the following
two options:
1. If the character s[i] matches t[j], the length of the common subsequence
would be one plus the length of the common subsequence till the i-1 and j-1
indexes in the two respective strings.
2. If the character s[i] does not match t[j], we will take the longest subsequence
by either skipping i-th or j-th character from the respective strings.
Hence, the answer stored in the matrix will be the LCS of both strings when the
length of string s will be ‘i’ and the length of string t will be ‘j’.
// Base case
if(s.size() == 0 || t.size() == 0) {
return 0;
}
10
int ans;
// Recursive calls
if(s[0] == t[0]) {
ans = 1 + lcs_mem(s.substr(1), t.substr(1), output);
}
else {
int a = lcs_mem(s.substr(1), t, output);
int b = lcs_mem(s, t.substr(1), output);
int c = lcs_mem(s.substr(1), t.substr(1), output);
ans = max(a, max(b, c));
}
// Return ans
return ans;
}
}
}
return lcs_mem(s, t, output);
}
11
// Fill 1st row
for(int j = 0; j <= n; j++) { / / as if string t is empty, then the lcs(s, t) = 0
output[0][j] = 0;
}
Time Complexity: We can see that the time complexity of the DP and memoization
approach is reduced to O(m*n) where m and n are the lengths of the given strings.
Edit Distance
Problem statement: G
iven two strings s and t of lengths m and n respectively, find
the Edit Distance between the strings. Edit Distance of two strings is the minimum
number of steps required to make one string equal to another. To do so, you can
perform the following three operations only :
● Delete a character
● Replace a character with another one
● Insert a character
12
Example 1:
s1 =
“but”
s2 = “bat”
Answer: 1
Explanation: W e just need to replace ‘a’ with ‘u’ to transform s2 to s1.
Example 2:
s1 = “cbda”
s2 = “abdca”
Answer: 2
Explanation: W e just need to replace the first ‘a’ with ‘c’ and delete the second ‘c’.
Example 3:
s1 = “ppsspqrt”
s2 = “passpot”
Answer: 3
Explanation: W e just need to replace first ‘a’ with ‘p’, ‘o’ with ‘q’, and insert ‘r’.
Approach: Let’s think about this problem using r ecursion first. We need to apply
each of the three operations on each character of s2 to make it similar to s1 and
then find the minimum among them.
Let’s assume index1 and index2 point to the current indexes of s1 and s2
respectively, so we have two options at every step:
1. If the strings have the same character, we can recursively match for the
remaining lengths of the strings.
2. If the strings do not match, we start three new recursive calls representing
the three edit operations, as mentioned in the problem statement. Consider
the minimum count of operations among the three recursive calls.
13
if(s.size() == 0 || t.size() == 0) {
return max(s.size(), t.size());
}
From here, it is clear that the time complexity is again exponential, which is O(3m+n),
where m and n are the lengths of the given strings.
Also, we can see the overlapping/repeated recursive calls(for example: (2,2) is one
of them), which means this problem can be further solved using memoization.
14
This problem is somehow similar to the LCS problem. We will be using a similar
approach to solve this problem too. The answer to each recursive call will be stored
in a 2-Dimensional array of size (m+1)*(n+1), and the final solution will be obtained
at index (m,n) as each cell will be storing the answer for the given m length of s1
and n length of s2.
int ans;
if(s[0] == t[0]) { // First character matches
ans = editDistance_mem(s.substr(1), t.substr(1), output);
}
else {
int x = editDistance(s.substr(1), t, output) + 1; // Insert
int y = editDistance(s, t.substr(1), output) + 1; / / Delete
int z = editDistance(s.substr(1), t.substr(1), output) + 1; / / Replace
ans = min(x, min(y, z));
}
15
ans[i] = new int[n+1];
for(int j = 0; j <= n; j++) {
ans[i][j] = -1;
}
}
return editDistance_mem(s, t, ans);
}
Time Complexity: As there are (m+1)*(n+1) number of unique calls, hence the time
complexity becomes O(m*n), which is better than the recursive approach.
We have already discussed the basic requirements like output array size, final
output’s position, and the value stored at each position of the output array in the
memoization approach. We have already figured out that this problem is similar to
the LCS question. So, forwarding directly towards the code:
16
}
else {
int a = output[i-1][j];
int b = output[i][j-1];
int c = output[i-1][j-1];
output[i][j] = min(a, min(b, c)) + 1;
}
}
}
return output[m][n]; // final answer
}
Time complexity: It is the same as the memoization approach, i.e., O(m*n).
Knapsack
Problem Statement: Given the weights and values of ‘N’ items, we are asked to put
these items in a knapsack, which has a capacity ‘C’. The goal is to get the maximum
value from the items in the knapsack. Each item can only be selected once, as we
don’t have multiple quantities of any item.
For example:
17
Approach: First-of-all, let’s discuss the brute-force-approach, i.e., the r ecursive
approach. There are two possible cases for every item, either to put that item into
the knapsack or not. If we consider that item, then its value will be contributed
towards the total value, otherwise not. To figure out the maximum value obtained
by maintaining the capacity of the knapsack, we will call recursion over these two
cases simultaneously, and then will consider the maximum value obtained out of
the two.
If we consider a particular weight ‘w’ from the array of weights with value ‘v’ and the
total capacity was ‘C’ with initial value ‘Val’, then the remaining capacity of the
knapsack becomes ‘C-w’, and the value becomes ‘Val + v’.
// If the particular weight’s value extends the limit of knapsack’s remaining
// capacity, then we have to simpy skip it
if(weight[0] > maxWeight) {
return knapsack(weight + 1, values + 1, n - 1, maxWeight);
}
// Recursive calls
//1. Considering the weight
int x = knapsack(weight + 1, values + 1, n - 1, maxWeight - weight[0]) + values[0];
// 2. Skipping the weight and moving forward
int y = knapsack(weight + 1, values + 1, n - 1, maxWeight);
18
Now, the memoization and DP approach is left for you to solve. For the code, refer
to the solution tab of the same. Also, figure out the time complexity for the same by
running the code over some example and by dry running it.
Practice problems:
https://atcoder.jp/contests/dp/tasks
19