Ada 4
This tutorial is a further warm-up on dynamic programming (DP). What we would like you to do for each of the problems is:
• Case I: $x_m = y_n$, that is, the last characters of the two strings match. We claim that, in
this case, $z_k = x_m = y_n$; that is, there is an optimal subsequence $Z^*$ that ends with these
occurrences of this character. Why? Assume that this is not the case.
Now, if neither $x_m$ nor $y_n$ is matched to the last character of $Z^*$, then we can increase the length of
the subsequence by appending this common character. This contradicts the fact that $Z^*$ is an
optimal subsequence.
Now suppose (without loss of generality) that $x_m$ is matched with a different occurrence
of the same character in $Y$; call it $y'$. Since $y_n$ is the last character in $Y$, $y'$ appears before $y_n$. Hence,
instead of matching $x_m$ to $y'$, we can simply match it to $y_n$ without changing the feasibility
or optimality of the solution.
Ok, now that we know the last common character can be taken to be part of $Z^*$, what is the
subproblem that we need to solve? Clearly it is the subproblem on the prefixes
$X[1 \ldots m-1]$ and $Y[1 \ldots n-1]$. The formal claim is
Claim 1 In case I, an optimal solution to the subproblem $X[1 \ldots m-1]$, $Y[1 \ldots n-1]$ is indeed
the subsequence $Z^* \setminus z_k$.
• Case II: Suppose $x_m \neq y_n$. In this case, the last two characters cannot be matched with each
other to form $z_k$, so at least one of $x_m$ and $y_n$ is not used by $Z^*$. Hence, the claim is
Claim 2 In case II, $Z^*$ is an optimal solution to at least one of the subproblems
$X[1 \ldots m-1]$, $Y[1 \ldots n]$ or $X[1 \ldots m]$, $Y[1 \ldots n-1]$.
$\text{Subseq}(\emptyset, Y_j) = 0, \quad \forall j \ge 0$
$\text{Subseq}(X_i, \emptyset) = 0, \quad \forall i \ge 0$
The rest is just converting this into pseudocode using a 2D table. Can you figure out how to
reconstruct an actual solution from this table once it is populated?
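One possible way to fill the table and walk back through it is sketched below in Python. It assumes the standard recurrence suggested by the two cases (add 1 to the diagonal entry on a match, otherwise take the better of dropping $x_i$ or dropping $y_j$); the names `lcs` and `table` are illustrative and not part of the tutorial.

```python
def lcs(x: str, y: str) -> str:
    """Return one longest common subsequence of x and y (a sketch)."""
    m, n = len(x), len(y)
    # table[i][j] = length of an optimal subsequence of x[:i] and y[:j].
    # Row 0 and column 0 are the base cases Subseq(empty, .) = Subseq(., empty) = 0.
    table = [[0] * (n + 1) for _ in range(m + 1)]
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            if x[i - 1] == y[j - 1]:
                # Case I: last characters match, extend the diagonal subproblem.
                table[i][j] = table[i - 1][j - 1] + 1
            else:
                # Case II: drop the last character of x or of y.
                table[i][j] = max(table[i - 1][j], table[i][j - 1])

    # Reconstruction: start at table[m][n] and retrace the choices.
    out = []
    i, j = m, n
    while i > 0 and j > 0:
        if x[i - 1] == y[j - 1]:
            out.append(x[i - 1])
            i -= 1
            j -= 1
        elif table[i - 1][j] >= table[i][j - 1]:
            i -= 1
        else:
            j -= 1
    return "".join(reversed(out))
```

Filling the table takes O(mn) time and the walk back takes O(m + n) steps. When several optimal subsequences exist, the one returned depends on how ties are broken in the walk.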
2 Dictionary
You are given a string of n characters s, which you believe to be a corrupted text document in
which all punctuation has vanished (so that it looks something like "itwasthebestoftimes..."). You
wish to reconstruct the document using a dictionary, which is available in the form of a Boolean
function dict(): for any string w, dict(w) outputs true if w is a valid word and false otherwise. Give a
dynamic programming algorithm that determines whether the string s can be reconstituted as a
sequence of valid words. The running time should be at most $O(n^2)$, assuming each call to dict()
takes unit time.
Solution. This problem has a slightly different flavor since we need to figure out whether a valid
sequence of words can be constructed out of the given gibberish.
Subproblem. Let $T[i]$ be 1 if the suffix $s[i \ldots n]$ can be reconstituted as a sequence of valid words, and
0 otherwise. Hence, the subproblem here has a Boolean value. The base case is $T[n+1] = 1$, since the
empty suffix needs no words. Now the recurrence is quite straightforward:
$T[i] = \max_{j = i, \ldots, n} \big( \text{dict}(s[i \ldots j]) = 1 \wedge T[j+1] = 1 \big)$
The size of the DP table is $1 \times (n + 1)$. Filling each entry takes $O(n)$ in the worst case. Therefore
the time complexity is $O(n^2)$.
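The recurrence can be sketched in Python as follows. The function name `can_segment` and the parameter `is_word` (standing in for dict()) are illustrative assumptions, and the code uses 0-based indices, so `t[i]` refers to the suffix `s[i:]` and `t[n]` is the empty-suffix base case.

```python
from typing import Callable


def can_segment(s: str, is_word: Callable[[str], bool]) -> bool:
    """Decide whether s splits into a sequence of valid words (a sketch)."""
    n = len(s)
    # t[i] is True when the suffix s[i:] can be split into valid words.
    t = [False] * (n + 1)
    t[n] = True  # the empty suffix needs no words
    for i in range(n - 1, -1, -1):
        # Try every end position j of the first word s[i..j];
        # checking t[j + 1] first avoids needless dictionary calls.
        t[i] = any(t[j + 1] and is_word(s[i:j + 1]) for j in range(i, n))
    return t[0]


# Usage with a small hypothetical word set standing in for dict().
words = {"it", "was", "the", "best", "of", "times"}
print(can_segment("itwasthebestoftimes", words.__contains__))
```

This makes $O(n^2)$ calls to the dictionary. In Python each slice also costs time proportional to its length, so the unit-time assumption on dict() matters for the stated bound.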
a user requests the file from server $S_i$, and no copy of the file is present at $S_i$, then the servers
$S_{i+1}, S_{i+2}, S_{i+3}, \ldots, S_n$ are searched in order until a copy of the file is finally found, say at server
$S_j$, where $j > i$. This results in an access cost of $j - i$. (Note that the lower-indexed servers
$S_{i-1}, S_{i-2}, \ldots$ are not consulted in this search.) The access cost is 0 if $S_i$ holds a copy of the file. We
will require that a copy of the file be placed at server $S_n$, so that all such searches will terminate,
at the latest, at $S_n$. We would like to place copies of the file at the servers so as to minimize
the sum of placement and access costs. We know that the accesses are going to happen at every
server $S_1, S_2, \ldots, S_n$. Formally, we say that a configuration is a choice, for each server $S_i$ with
$i = 1, 2, \ldots, n - 1$, of whether to place a copy of the file at $S_i$ or not. (Recall that a copy is always
placed at $S_n$.) The total cost of a configuration is the sum of all placement costs for the selected
servers with a copy of the file, plus the sum of all access costs associated with all n servers. Give a
polynomial-time algorithm to find a configuration of minimum total cost.
Solution.
Subproblems. $T(i)$ = the minimum cost of a solution, assuming we place a copy at $S_i$ and we
only have the servers $S_i, S_{i+1}, \ldots, S_n$, for all $i = 1, 2, \ldots, n$.
Recurrence.
$T(n) = c_n$
$T(i) = c_i + \min_{i+1 \le k \le n} \left[ \frac{(k - i - 1)(k - i)}{2} + T(k) \right], \quad \forall i < n$
Why is the recurrence above correct? Suppose you have placed a copy at $S_i$ (that is required by
the definition of $T(i)$), so you pay $c_i$. Now, imagine that I tell you that the next copy is placed at
server $S_k$, $i + 1 \le k \le n$. Then what is the cost of this configuration? You need to pay the access
cost for serving $S_{i+1}, S_{i+2}, \ldots, S_{k-1}$ with the copy at $S_k$, which is
$(k - i - 1) + (k - i - 2) + \cdots + 1 = \frac{(k - i - 1)(k - i)}{2}$
We do not need to pay for access at $S_i$ since we have already placed a copy there. The other
cost we pay is given by the optimal solution to the subproblem that assumes $S_k$ has a copy and uses only
the servers from $S_k$ to $S_n$. The only catch is that we do not know the correct index k, so we try all
candidates and take the one that gives the minimum total cost.
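A minimal Python sketch of this recurrence follows. The tutorial text does not spell out how the final answer is read off from the $T(i)$ values, since $S_1$ need not hold a copy; the sketch assumes a hypothetical sentinel server $S_0$ with placement cost 0, so that $T(0)$ accounts for the access costs of the servers before the first real copy. The names `min_total_cost`, `cost`, and `nxt` are illustrative.

```python
def min_total_cost(c: list[int]) -> tuple[int, list[int]]:
    """Return (minimum total cost, 1-based servers holding a copy).

    c[i - 1] is the placement cost of server S_i; n = len(c) must be >= 1.
    """
    n = len(c)
    if n == 0:
        raise ValueError("need at least one server")
    # Assumption: a sentinel S_0 with placement cost 0 sits before S_1.
    cost = [0] + list(c)
    t = [0] * (n + 1)    # t[i] = T(i), a copy is placed at S_i
    nxt = [0] * (n + 1)  # nxt[i] = index k of the next copy after S_i
    t[n] = cost[n]
    for i in range(n - 1, -1, -1):
        best, best_k = None, None
        for k in range(i + 1, n + 1):
            # Servers S_{i+1}..S_{k-1} pay (k-i-1) + ... + 1 in access cost.
            candidate = (k - i - 1) * (k - i) // 2 + t[k]
            if best is None or candidate < best:
                best, best_k = candidate, k
        t[i] = cost[i] + best
        nxt[i] = best_k
    # Follow the stored choices from the sentinel to S_n.
    chosen = []
    k = nxt[0]
    while True:
        chosen.append(k)
        if k == n:
            break
        k = nxt[k]
    return t[0], chosen
```

There are $n + 1$ table entries and each takes $O(n)$ time to fill, so the sketch runs in $O(n^2)$ time. With placement costs `[10, 10, 1]`, for instance, the cheapest configuration keeps only the mandatory copy at $S_3$: placement cost 1 plus access costs 2 + 1, for a total of 4.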