
6.

DYNAMIC PROGRAMMING

Aleks Ignjatović, [email protected]


office: K17 504
Course Admin: Song Fang, [email protected]

School of Computer Science and Engineering


UNSW Sydney

Term 2, 2023
Table of Contents 2

1. Introduction

2. Example Problems

3. Applications to graphs

4. Puzzle
What is dynamic programming? 3

The main idea is to solve a large problem recursively by building from (carefully chosen) subproblems of smaller size.

Optimal substructure property
We must choose subproblems in such a way that optimal solutions to subproblems can be combined into an optimal solution for the full problem.
Why is dynamic programming useful? 4

Recently we discussed greedy algorithms, where the problem is viewed as a sequence of stages and we consider only the locally optimal choice at each stage.

We saw that some greedy algorithms are incorrect, i.e. they fail to construct a globally optimal solution.

Also, greedy algorithms are unhelpful for certain types of problems, such as enumeration (“count the number of ways to . . . ”).

Dynamic programming can be used to efficiently consider all the options at each stage.
Why is dynamic programming efficient? 5

We have already seen one problem-solving paradigm that used recursion: divide-and-conquer.

D&C aims to break a large problem into disjoint subproblems, solve those subproblems recursively and recombine.

However, DP is characterised by overlapping subproblems.

Why is dynamic programming efficient? 6

Overlapping subproblems property
We must choose subproblems in such a way that the same subproblem occurs several times in the recursion tree.

When we solve a subproblem, we store the result so that subsequent instances of the same subproblem can be answered by just looking up a value in a table.
The parts of a dynamic programming algorithm 7

A dynamic programming algorithm consists of three parts:

a definition of the subproblems;

a recurrence relation, which determines how the solutions to smaller subproblems are combined to solve a larger subproblem; and

any base cases, which are the trivial subproblems - those for which the recurrence is not required.
Putting it all together 8

The original problem may be one of our subproblems, or it may be solved by combining results from several subproblems, in which case we must also describe this process.

Finally, we should estimate the time complexity of our algorithm.
Table of Contents 9

1. Introduction

2. Example Problems

3. Applications to graphs

4. Puzzle
Longest Increasing Subsequence 10

Problem
Instance: a sequence of n real numbers A[1..n].

Task: determine a subsequence (not necessarily contiguous) of maximum length, in which the values in the subsequence are strictly increasing.
Longest Increasing Subsequence 11

A natural choice for the subproblems is as follows: for each 1 ≤ i ≤ n, let P(i) be the problem of determining the length of the longest increasing subsequence of A[1..i].

However, it is not immediately obvious how to relate these subproblems to each other.

A more convenient specification involves Q(i), the problem of determining opt(i), the length of the longest increasing subsequence of A[1..i] ending at the last element A[i].

Note that the overall solution is recovered by max {opt(i) | 1 ≤ i ≤ n}.
Longest Increasing Subsequence 12

Assume we have solved all the subproblems Q(j) for j < i.

We now look for all indices j < i such that A[j] < A[i].

Among those, we pick the index m for which opt(m) is maximal, and extend that subsequence with A[i].

This forms the basis of our recurrence!

The recurrence is not necessary if i = 1, as there are no previous indices to consider, so this is our base case.
Longest Increasing Subsequence 13

Solution
Subproblems: for each 1 ≤ i ≤ n, let Q(i) be the problem of
determining opt(i), the maximum length of an increasing
subsequence of A[1..i] which ends with A[i].

Recurrence: for i > 1,

opt(i) = max {opt(j) | j < i, A[j] < A[i]} + 1,

where the maximum of an empty set is taken to be 0 (so opt(i) = 1 when no earlier element is smaller than A[i]).

Base case: opt(1) = 1.


Longest Increasing Subsequence 15

i 1 2 3 4 5 6 7 8 9
A[i] 1 5 3 6 2 7 4 8 6
opt(i) 1 2 2 3 2 4 3 5 4

opt(i) = max {opt(j) | j < i, A[j] < A[i]} + 1


Longest Increasing Subsequence 16

Upon computing a value opt(i), we store it in the i-th slot of a table, so that we can look it up in the future.

The overall longest increasing subsequence is the best of those ending at some element, i.e. max {opt(i) | 1 ≤ i ≤ n}.

Each of n subproblems is solved in O(n), and the overall solution is found in O(n). Therefore the time complexity is O(n²).
Longest Increasing Subsequence 17

Why does this produce optimal solutions to subproblems? We can use a kind of “cut and paste” argument.

We claim that truncating the optimal solution for Q(i) must produce an optimal solution of the subproblem Q(m).

Otherwise, if a better solution for Q(m) existed, we could extend that instead to find a better solution for Q(i) as well.
Longest Increasing Subsequence 18

What if the problem asked for not only the length, but the entire longest increasing subsequence?

This is a common extension to such problems, and is easily handled.

In the i-th slot of the table, alongside opt(i) we also store the index m such that the optimal solution for Q(i) extends the optimal solution for Q(m).

After all subproblems have been solved, the longest increasing subsequence can be recovered by backtracking through the table.
Longest Increasing Subsequence 20

i       1 2 3 4 5 6 7 8 9
A[i]    1 5 3 6 2 7 4 8 6
opt(i)  1 2 2 3 2 4 3 5 4
pred(i) – 1 1 2 1 4 3 6 7

opt(i) = max {opt(j) | j < i, A[j] < A[i]} + 1

pred(i) = argmax {opt(j) | j < i, A[j] < A[i]}
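The recurrence and predecessor table above translate directly into code. Below is a minimal Python sketch (the function and variable names are ours, not from the slides) computing both the length and one longest increasing subsequence by backtracking through pred:

```python
def longest_increasing_subsequence(A):
    """Return (length, subsequence) of a longest strictly increasing
    subsequence of A, using the O(n^2) recurrence from the slides."""
    n = len(A)
    if n == 0:
        return 0, []
    opt = [1] * n          # opt[i]: LIS length ending at A[i]
    pred = [None] * n      # pred[i]: index whose solution we extend
    for i in range(1, n):
        for j in range(i):
            if A[j] < A[i] and opt[j] + 1 > opt[i]:
                opt[i] = opt[j] + 1
                pred[i] = j
    # Recover the subsequence by backtracking from the best endpoint.
    best = max(range(n), key=lambda i: opt[i])
    seq, i = [], best
    while i is not None:
        seq.append(A[i])
        i = pred[i]
    return opt[best], seq[::-1]
```

On the slides' example A = [1, 5, 3, 6, 2, 7, 4, 8, 6] this returns length 5 with one optimal subsequence [1, 5, 6, 7, 8].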
Activity Selection 21

Problem
Instance: A list of n activities with starting times si and finishing
times fi . No two activities can take place simultaneously.

Task: Find the maximal total duration of a subset of compatible activities.
Activity Selection 22

Remember, we used the greedy method to solve a somewhat similar problem of finding a subset with the largest possible number of compatible activities, but the greedy method does not work for the present problem.

As before, we start by sorting the activities by their finishing times into a non-decreasing sequence, and henceforth we will assume that f1 ≤ f2 ≤ . . . ≤ fn.
Activity Selection 23

We can then specify the subproblems: for each 1 ≤ i ≤ n, let P(i) be the problem of finding the duration t(i) of a subsequence σ(i) of the first i activities which
1. consists of non-overlapping activities,
2. ends with activity i, and
3. is of maximal total duration among all such sequences.

As in the previous problem, the second condition will simplify the recurrence.
Activity Selection 24

We would like to solve P(i) by appending activity i to σ(j) for some j < i.

We require that activity i not overlap with activity j, i.e. the latter finishes before the former begins.

Among all such j, our recurrence will choose the one which maximises the duration t(j).

There is no need to solve P(1) in this way, as there are no preceding activities.
Activity Selection 25

[Figure: activities i − 4 through i on a timeline, with each activity j starting at sj and finishing at fj; adjacent activities overlap, so only non-adjacent ones are compatible.]

Activity Selection 26

Solution
Subproblems: for each 1 ≤ i ≤ n, let P(i) be the problem of
determining t(i), the maximal duration of a non-overlapping
subsequence of the first i activities which ends with activity i.

Recurrence: for i > 1,

t(i) = max {t(j) | j < i, fj < si} + fi − si,

where the maximum of an empty set is taken to be 0 (so t(i) = fi − si when no compatible activity precedes activity i).

Base case: t(1) = f1 − s1.


Activity Selection 27

Again, the best overall solution is given by max {t(i) | 1 ≤ i ≤ n}.

Sorting the activities took O(n log n). Each of n subproblems is solved in O(n), and the overall solution is found in O(n). Therefore the time complexity is O(n²).
Activity Selection 28

Why does this recurrence produce optimal solutions to subproblems P(i)?

Let the optimal solution of subproblem P(i) be given by the sequence σ = ⟨k1, k2, . . . , km−1, km⟩, where km = i.

We claim: the truncated subsequence σ′ = ⟨k1, k2, . . . , km−1⟩ gives an optimal solution to subproblem P(km−1).

Why? We apply the same “cut and paste” argument!

Activity Selection 29

Suppose instead that P(km−1) is solved by a sequence τ′ of larger total duration than σ′.

Then let τ be the sequence formed by extending τ′ with activity i.

It is clear that τ has larger total duration than σ. This contradicts the earlier definition of σ as the sequence solving P(i).

Thus, the optimal sequence for problem P(i) is obtained from the optimal sequence for problem P(j) (for some j < i) by extending it with i.
Activity Selection 30

Suppose we also want to construct the optimal sequence which solves our problem.

In the i-th slot of our table, we should store not only t(i) but also the value previous(i) = j such that the optimal solution of P(i) extends the optimal solution of subproblem P(j):

previous(i) = argmax {t(j) | j < i, fj < si}.
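The recurrence t(i) can be sketched in Python as follows (a minimal sketch under the slides' compatibility condition fj < si; the function name is ours):

```python
def max_total_duration(activities):
    """activities: list of (start, finish) pairs. Returns the maximal
    total duration of a set of pairwise compatible activities, via
    t(i) = max{t(j) | j < i, f_j < s_i} + f_i - s_i."""
    acts = sorted(activities, key=lambda a: a[1])  # sort by finish time
    n = len(acts)
    t = [0] * n
    for i, (si, fi) in enumerate(acts):
        t[i] = fi - si                 # base case: activity i on its own
        for j in range(i):
            if acts[j][1] < si:        # activity j finishes before i starts
                t[i] = max(t[i], t[j] + fi - si)
    return max(t) if n else 0
```

To also recover the optimal subset, one would store previous(i) alongside t(i), exactly as described above.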


Making Change 31

Problem
Instance: You are given n types of coin denominations of values
v1 < v2 < . . . < vn (all integers). Assume v1 = 1 (so that you can
always make change for any integer amount) and that you have an
unlimited supply of coins of each denomination.

Task: make change for a given integer amount C, using as few coins as possible.
Making Change 32

Attempt 1
Greedily take as many coins of value vn as possible, then vn−1, and so on.

This approach is very tempting, and works for all real-world currencies.

However, it doesn’t work for all sequences vi. In general, we will need to use DP.

Exercise
Design a counterexample to the above algorithm.
Making Change 33

We will try to find the optimal solution for not only C, but every amount up to C.

Assume we have found optimal solutions for every amount j < i and now want to find an optimal solution for amount i.

We consider each coin vk as part of the solution for amount i, and make up the remaining amount i − vk with the previously computed optimal solution.
Making Change 34

Among all of these optimal solutions, which we find in the table we are constructing recursively, we pick one which uses the fewest coins.

Supposing we choose coin m, we obtain an optimal solution opt(i) for amount i by adding one coin of denomination vm to opt(i − vm).

If C = 0 the solution is trivial: use no coins.


Making Change 35

Solution
Subproblems: for each 0 ≤ i ≤ C , let P(i) be the problem of
determining opt(i), the fewest coins needed to make change for an
amount i.

Recurrence: for i > 0,

opt(i) = min {opt(i − vk ) | 1 ≤ k ≤ n, vk ≤ i} + 1.

Base case: opt(0) = 0.


Making Change 36

There is no extra work required to recover the overall solution; it is just opt(C).

Each of C subproblems is solved in O(n) time, so the time complexity is O(nC).

Note
Our algorithm is NOT a polynomial time algorithm in the length of
the input, because C is represented by log C bits, while the
running time is O(nC ). There is no known polynomial time
algorithm for this problem!
Making Change 37

Why does this produce an optimal solution for each amount i ≤ C?

Consider an optimal solution for some amount i, and say this solution includes at least one coin of denomination vm for some 1 ≤ m ≤ n.

Removing this coin must leave an optimal solution for the amount i − vm, again by our “cut and paste” argument.

By considering all coins of value at most i, we can pick m for which the optimal solution for amount i − vm uses the fewest coins.
Making Change 38

Suppose we were required to also determine the exact number of each coin required to make change for amount C.

In the i-th slot of the table, we would store both opt(i) and the coin type k = pred(i) which minimises opt(i − vk).

Then pred(C) is the type of a coin used in an optimal solution for total C, leaving C′ = C − vpred(C) remaining. We then repeat, identifying another coin type pred(C′) used in an optimal solution for total C′, and so on.

Notation
We denote the k that minimises opt(i − vk) by

argmin {opt(i − vk) | 1 ≤ k ≤ n, vk ≤ i}.
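The recurrence together with the pred table can be sketched as follows (a minimal Python sketch, assuming as the slides do that the denominations include 1 so every amount can be made; names are ours). With denominations {1, 3, 4} and C = 6, an instance on which the greedy attempt fails, it finds the two-coin solution 3 + 3:

```python
def make_change(C, denominations):
    """Minimum-coin change for amount C. Assumes 1 is a denomination,
    so every amount is reachable. Returns (fewest_coins, coins_used)."""
    INF = float("inf")
    opt = [0] + [INF] * C     # opt[i]: fewest coins for amount i
    pred = [None] * (C + 1)   # pred[i]: a coin value used optimally for i
    for i in range(1, C + 1):
        for v in denominations:
            if v <= i and opt[i - v] + 1 < opt[i]:
                opt[i] = opt[i - v] + 1
                pred[i] = v
    coins, i = [], C          # backtrack: peel off one coin at a time
    while i > 0:
        coins.append(pred[i])
        i -= pred[i]
    return opt[C], sorted(coins)
```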
Integer Knapsack Problem (with duplicates) 39

Problem
Instance: You have n types of items; all items of kind i are
identical and of weight wi and value vi . All weights are integers.
You can take any number of items of each kind. You also have a
knapsack of capacity C .

Task: Choose a combination of available items which all fit in the knapsack and whose value is as large as possible.
Integer Knapsack Problem (with duplicates) 40

Similarly to the previous problem, we solve for each total weight up to C.

Assume we have solved the problem for all total weights j < i.

We now consider each type of item, the kth of which has weight wk. If this item is included, we would fill the remaining weight with the already computed optimal solution for i − wk.

We choose the m which maximises the total value of the optimal solution for i − wm plus an item of type m, to obtain a packing of total weight i of the highest possible value.
Integer Knapsack Problem (with duplicates) 41

Solution
Subproblems: for each 0 ≤ i ≤ C, let P(i) be the problem of determining opt(i), the maximum value that can be achieved using up to i units of weight, and m(i), the type of some item in such a collection.

Recurrence: for i > 0,

opt(i) = max {opt(i − wk) + vk | 1 ≤ k ≤ n, wk ≤ i},

m(i) = argmax {opt(i − wk) + vk | 1 ≤ k ≤ n, wk ≤ i}.

Base case: if i < min {wk | 1 ≤ k ≤ n}, then opt(i) = 0 and m(i) is undefined.
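The recurrence for the duplicates-allowed case is one nested loop (a minimal Python sketch; the function name is ours):

```python
def unbounded_knapsack(C, items):
    """items: list of (weight, value) pairs; each type may be reused.
    Returns the maximum value achievable with total weight at most C."""
    opt = [0] * (C + 1)        # opt[i]: best value within weight i
    for i in range(1, C + 1):
        for w, v in items:
            if w <= i:         # try item type (w, v) as the last item added
                opt[i] = max(opt[i], opt[i - w] + v)
    return opt[C]
```

Note that opt(i) is monotone here, so "up to i units of weight" and "exactly i units" coincide in the table's final column.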
Integer Knapsack Problem (with duplicates) 42

The overall solution is opt(C), as the optimal knapsack can hold up to C units of weight.

Each of C subproblems is solved in O(n), for a time complexity of O(nC).

Again, our algorithm is NOT polynomial in the length of the input.
Integer Knapsack Problem (without duplicates) 43

Problem
Instance: You have n items, the ith of which has weight wi and
value vi . All weights are integers. You also have a knapsack of
capacity C .

Task: Choose a combination of available items which all fit in the knapsack and whose value is as large as possible.
Integer Knapsack Problem (without duplicates) 44

Let’s use the same subproblems as before, and try to develop a recurrence.

Question
If we know the optimal solution for each total weight j < i, can we
deduce the optimal solution for weight i?

Answer
No! If we begin our solution for weight i with item k, we have
i − wk remaining weight to fill. However, we did not record
whether item k was itself already used in the optimal solution for
that weight.
Integer Knapsack Problem (without duplicates) 45

For each total weight i, we will find the optimal solution using only the first k items.

We can take cases on whether item k is used in the solution:

if so, we have i − wk remaining weight to fill using the first k − 1 items, and

otherwise, we must fill all i units of weight with the first k − 1 items.
Integer Knapsack Problem (without duplicates) 46

Solution
Subproblems: for 0 ≤ i ≤ C and 0 ≤ k ≤ n, let P(i, k) be the
problem of determining opt(i, k), the maximum value that can be
achieved using up to i units of weight and using only the first k
items, and m(i, k), the (largest) index of an item in such a
collection.

Recurrence: for i > 0 and 1 ≤ k ≤ n,

opt(i, k) = max (opt(i, k − 1), opt(i − wk , k − 1) + vk ) ,

with m(i, k) = m(i, k − 1) in the first case and k in the second.

Base cases: if i = 0 or k = 0, then opt(i, k) = 0 and m(i, k) is


undefined.
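The two-dimensional recurrence can be sketched as follows (a minimal Python sketch of the table opt(i, k); names are ours):

```python
def zero_one_knapsack(C, items):
    """items: list of (weight, value) pairs, each usable at most once.
    opt[k][i]: best value using only the first k items, weight limit i."""
    n = len(items)
    opt = [[0] * (C + 1) for _ in range(n + 1)]
    for k in range(1, n + 1):
        w, v = items[k - 1]
        for i in range(C + 1):
            opt[k][i] = opt[k - 1][i]          # case 1: skip item k
            if w <= i:                         # case 2: take item k
                opt[k][i] = max(opt[k][i], opt[k - 1][i - w] + v)
    return opt[n][C]
```

Iterating k in the outer loop and i in the inner loop respects exactly the dependency order discussed on the next slide.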
Integer Knapsack Problem (without duplicates) 47

We need to be careful about the order in which we solve the subproblems.

When we get to P(i, k), the recurrence requires us to have already solved P(i, k − 1) and P(i − wk, k − 1).

This is guaranteed if we solve the subproblems P(i, k) in increasing order of k, then increasing order of capacity i.
Integer Knapsack Problem (without duplicates) 48

[Figure: the DP table with capacities 0..C as columns and items 0..n as rows; the entry for (i, k) depends on the entries (i, k − 1) and (i − wk, k − 1) in the previous row.]

The overall solution is opt(C, n).

Each of O(nC) subproblems is solved in constant time, for a time complexity of O(nC).
Balanced Partition 49

Problem
Instance: a set of n positive integers xi .

Task: partition these integers into two subsets S1 and S2 with sums Σ1 and Σ2 respectively, so as to minimise |Σ1 − Σ2|.
Balanced Partition 50

Suppose without loss of generality that Σ1 ≥ Σ2.

Let Σ = x1 + . . . + xn, the sum of all integers in the set.

Observe that Σ1 + Σ2 = Σ, which is a constant, and upon rearranging it follows that

2(Σ/2 − Σ2) = Σ − 2Σ2 = (Σ − Σ2) − Σ2 = Σ1 − Σ2.

So, all we have to do is find a subset S2 of these numbers with total sum as close to Σ/2 as possible, but not exceeding it.
Balanced Partition 51

For each integer xi in the set, construct an item with both weight and value equal to xi.

Consider the knapsack problem (with duplicate items not allowed), with items as specified above and knapsack capacity ⌊Σ/2⌋.

Solution
The best packing of this knapsack produces an optimally balanced partition, with set S1 given by the items outside the knapsack and set S2 given by the items in the knapsack.
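Since weight equals value here, the knapsack reduction is just subset-sum: find the largest reachable sum not exceeding ⌊Σ/2⌋. A minimal Python sketch (names are ours):

```python
def balanced_partition(xs):
    """Minimum possible |sum(S1) - sum(S2)| over partitions of xs,
    via a 0/1 subset-sum with capacity sum(xs) // 2."""
    total = sum(xs)
    half = total // 2
    reachable = [True] + [False] * half   # reachable[i]: a subset sums to i
    for x in xs:
        # Iterate capacities downwards so each x is used at most once.
        for i in range(half, x - 1, -1):
            if reachable[i - x]:
                reachable[i] = True
    best = max(i for i in range(half + 1) if reachable[i])  # this is Σ2
    return total - 2 * best               # Σ1 - Σ2 = Σ - 2Σ2
```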
Matrix chain multiplication 52

Let A and B be matrices. The matrix product AB exists if A has as many columns as B has rows: if A is m × n and B is n × p, then AB is m × p.

Each element of AB is the dot product of a row of A with a column of B, both of which have length n. Therefore m × n × p multiplications are required to compute AB.

Matrix multiplication is associative, that is, for any three matrices of compatible sizes we have A(BC) = (AB)C.

However, the number of real number multiplications needed to obtain the product can be very different.
Matrix chain multiplication 53

Suppose A is 10 × 100, B is 100 × 5 and C is 5 × 50.

[Figure: the two ways to bracket ABC. Computing (AB)C forms the 10 × 5 product AB with 10 × 100 × 5 multiplications, then multiplies by C with 10 × 5 × 50 more. Computing A(BC) forms the 100 × 50 product BC with 100 × 5 × 50 multiplications, then multiplies by A with 10 × 100 × 50 more.]

Evaluating (AB)C involves only 7500 multiplications, but evaluating A(BC) requires 75000 multiplications!
Matrix chain multiplication 54

Problem
Instance: a compatible sequence of matrices A1 A2 . . . An, where Ai is of dimension si−1 × si.

Task: group them in such a way as to minimise the total number of multiplications needed to find the product matrix.
Matrix chain multiplication 55

How many groupings are there?

The total number of different groupings satisfies the following recurrence (why?):

T(n) = Σ_{i=1}^{n−1} T(i) T(n − i),

with base case T(1) = 1.

One can show that the solution satisfies T(n) = Ω(2ⁿ).

Thus, we cannot efficiently do an exhaustive search for the optimal grouping.
Matrix chain multiplication 56

Instead, we try dynamic programming. A first attempt might be to specify subproblems corresponding to prefixes of the matrix chain, that is, find the optimal grouping for A1 A2 . . . Ai.

This is not enough to construct a recurrence; consider for example splitting the chain as

(A1 A2 . . . Aj)(Aj+1 Aj+2 . . . Ai).

Matrix chain multiplication 57

Instead we should specify a subproblem corresponding to each contiguous subsequence Ai+1 Ai+2 . . . Aj of the chain.

The recurrence will consider all possible ways to place the outermost multiplication, splitting the chain into the product

(Ai+1 . . . Ak)(Ak+1 . . . Aj),

where the first factor is an si × sk matrix and the second is an sk × sj matrix.

No recursion is necessary for subsequences of length one.

Matrix chain multiplication 58

Solution
Subproblems: for all 0 ≤ i < j ≤ n, let P(i, j) be the problem of
determining opt(i, j), the fewest multiplications needed to compute
the product Ai+1 Ai+2 . . . Aj .

Recurrence: for all j − i > 1,

opt(i, j) = min{opt(i, k) + si sk sj + opt(k, j) | i < k < j}.

Base cases: for all 0 ≤ i ≤ n − 1, opt(i, i + 1) = 0.


Matrix chain multiplication 59

We have to choose the order of iteration carefully. To solve a subproblem P(i, j), we must have already solved P(i, k) and P(k, j) for each i < k < j.

The simplest way to ensure this is to solve the subproblems in increasing order of j − i, i.e. subsequence length.
Matrix chain multiplication 60

The last subproblem to be solved is P(0, n), which gives the overall solution.

To recover the actual bracketing required, we should store alongside each value opt(i, j) the splitting point k used to obtain it.

Each of O(n²) subproblems is solved in O(n) time, so the overall time complexity is O(n³).
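Solving the subproblems in increasing order of chain length gives the following minimal Python sketch (the function name is ours):

```python
def matrix_chain(s):
    """s: list of dimensions, where matrix A_i is s[i-1] x s[i].
    Returns the fewest scalar multiplications to compute A_1 ... A_n."""
    n = len(s) - 1                       # number of matrices in the chain
    opt = [[0] * (n + 1) for _ in range(n + 1)]
    for length in range(2, n + 1):       # increasing subsequence length j - i
        for i in range(n - length + 1):
            j = i + length
            # Try every position k of the outermost multiplication.
            opt[i][j] = min(opt[i][k] + s[i] * s[k] * s[j] + opt[k][j]
                            for k in range(i + 1, j))
    return opt[0][n]
```

On the earlier example (10 × 100, 100 × 5, 5 × 50) this returns 7500, matching the (AB)C bracketing.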
Longest Common Subsequence 61

Problem
Instance: two sequences S = ⟨a1, a2, . . . , an⟩ and S* = ⟨b1, b2, . . . , bm⟩.

Task: find the length of a longest common subsequence of S and S*.


Longest Common Subsequence 62

A sequence s is a subsequence of another sequence S if s can be obtained by deleting some of the symbols of S (while preserving the order of the remaining symbols).

Given two sequences S and S*, a sequence s is a Longest Common Subsequence of S and S* if s is a subsequence of both S and S* and is of maximal possible length.

This can be useful as a measurement of the similarity between S and S*.

Example: how similar are the genetic codes of two viruses? Is one of them just a genetic mutation of the other?
Longest Common Subsequence 63

A natural choice of subproblems considers prefixes of both sequences, say

Si = ⟨a1, a2, . . . , ai⟩ and Sj* = ⟨b1, b2, . . . , bj⟩.

If ai and bj are the same symbol (say c), the longest common subsequence of Si and Sj* is formed by appending c to the solution for Si−1 and S*j−1.

Otherwise, a common subsequence of Si and Sj* cannot contain both ai and bj, so we consider discarding either of these symbols.

No recursion is necessary when either Si or Sj* is empty.


Longest Common Subsequence 64

Solution
Subproblems: for all 0 ≤ i ≤ n and all 0 ≤ j ≤ m, let P(i, j) be the problem of determining opt(i, j), the length of the longest common subsequence of the truncated sequences Si = ⟨a1, a2, . . . , ai⟩ and Sj* = ⟨b1, b2, . . . , bj⟩.

Recurrence: for all i, j > 0,

opt(i, j) = opt(i − 1, j − 1) + 1 if ai = bj, and
opt(i, j) = max(opt(i − 1, j), opt(i, j − 1)) otherwise.

Base cases: for all 0 ≤ i ≤ n, opt(i, 0) = 0, and for all 0 ≤ j ≤ m, opt(0, j) = 0.
Longest Common Subsequence 65

Iterating through the subproblems P(i, j) in lexicographic order (increasing i, then increasing j) guarantees that P(i − 1, j), P(i, j − 1) and P(i − 1, j − 1) are solved before P(i, j), so all dependencies are satisfied.

The overall solution is opt(n, m).

Each of O(nm) subproblems is solved in constant time, for an overall time complexity of O(nm).

To reconstruct the longest common subsequence itself, we can record the direction from which the value opt(i, j) was obtained in the table, and backtrack.
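The table and the backtracking step can be sketched together as follows (a minimal Python sketch for string inputs; names are ours):

```python
def lcs(S, T):
    """Length and one longest common subsequence of strings S and T."""
    n, m = len(S), len(T)
    opt = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            if S[i - 1] == T[j - 1]:
                opt[i][j] = opt[i - 1][j - 1] + 1
            else:
                opt[i][j] = max(opt[i - 1][j], opt[i][j - 1])
    # Backtrack from opt[n][m], retracing which case produced each value.
    out, i, j = [], n, m
    while i > 0 and j > 0:
        if S[i - 1] == T[j - 1]:
            out.append(S[i - 1])
            i, j = i - 1, j - 1
        elif opt[i - 1][j] >= opt[i][j - 1]:
            i -= 1
        else:
            j -= 1
    return opt[n][m], "".join(reversed(out))
```

Note that several different subsequences may attain the optimal length; backtracking recovers one of them.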
Longest Common Subsequence 66

[Figure: a worked example of the LCS table, with arrows showing the backtracking path.]
Longest Common Subsequence 67

What if we have to find a longest common subsequence of three sequences S, S*, S**?

Question
Can we do LCS(LCS(S, S*), S**)?

Answer
Not necessarily!
Longest Common Subsequence 68

Let S = ABCDEGG, S* = ACBEEFG and S** = ACCEDGF. Then

LCS(S, S*, S**) = ACEG.

However,

LCS(LCS(S, S*), S**)
= LCS(LCS(ABCDEGG, ACBEEFG), S**)
= LCS(ABEG, ACCEDGF)
= AEG.

Exercise
Confirm that LCS (LCS (S ∗ , S ∗∗ ) , S) and LCS (LCS (S, S ∗∗ ) , S ∗ )
also give wrong answers.
Longest Common Subsequence 69

Problem
Instance: three sequences S = ⟨a1, a2, . . . , an⟩, S* = ⟨b1, b2, . . . , bm⟩ and S** = ⟨c1, c2, . . . , cl⟩.

Task: find the length of a longest common subsequence of S, S* and S**.
Longest Common Subsequence 70

Solution
Subproblems: for all 0 ≤ i ≤ n, all 0 ≤ j ≤ m and all 0 ≤ k ≤ l, let P(i, j, k) be the problem of determining opt(i, j, k), the length of the longest common subsequence of the truncated sequences Si = ⟨a1, a2, . . . , ai⟩, Sj* = ⟨b1, b2, . . . , bj⟩ and Sk** = ⟨c1, c2, . . . , ck⟩.

Recurrence: for all i, j, k > 0,

opt(i, j, k) = opt(i − 1, j − 1, k − 1) + 1 if ai = bj = ck, and
opt(i, j, k) = max(opt(i − 1, j, k), opt(i, j − 1, k), opt(i, j, k − 1)) otherwise.

Base cases: if i = 0, j = 0 or k = 0, then opt(i, j, k) = 0.


Longest Common Subsequence 71

Iterating through the subproblems P(i, j, k) in lexicographic order (increasing i, then increasing j, then increasing k) guarantees that P(i − 1, j, k), P(i, j − 1, k), P(i, j, k − 1) and P(i − 1, j − 1, k − 1) are solved before P(i, j, k), so all dependencies are satisfied.

The overall solution is opt(n, m, l).

Each of O(nml) subproblems is solved in constant time, for an overall time complexity of O(nml).

To reconstruct the longest common subsequence itself, we can record the direction from which the value opt(i, j, k) was obtained in the table, and backtrack.
Shortest Common Supersequence 72

Problem
Instance: two sequences s = ⟨a1, a2, . . . , an⟩ and s* = ⟨b1, b2, . . . , bm⟩.

Task: find a shortest common supersequence S of s and s*, i.e., a shortest possible sequence S such that both s and s* are subsequences of S.
Shortest Common Supersequence 73

Solution
Find a longest common subsequence LCS(s, s*) of s and s*, then add back the differing elements of the two sequences in the right places, in any compatible order.

Example
If s = abacada and s* = xbycazd, then LCS(s, s*) = bcad, and therefore SCS(s, s*) = axbyacazda.
Edit Distance 74

Problem
Instance: two text strings A of length n and B of length m; you want to transform A into B. You are allowed to insert a character, delete a character, and replace a character with another one. An insertion costs cI, a deletion costs cD and a replacement costs cR.

Task: find the lowest total cost transformation of A into B.


Edit Distance 75

Edit distance is another measure of the similarity of pairs of strings.

Note: if all operations have a unit cost, then you are looking for the minimal number of such operations required to transform A into B; this number is called the Levenshtein distance between A and B.

If the sequences are sequences of DNA bases and the costs reflect the probabilities of the corresponding mutations, then the minimal cost represents how closely related the two sequences are.
Edit Distance 76

Again we consider prefixes of both strings, say A[1..i] and B[1..j].

We have the following options to transform A[1..i] into B[1..j]:

1. delete A[i] and then transform A[1..i − 1] into B[1..j];
2. transform A[1..i] into B[1..j − 1] and then append B[j];
3. transform A[1..i − 1] into B[1..j − 1] and, if necessary, replace A[i] by B[j].

If i = 0 or j = 0, we can only insert or delete respectively.

Edit Distance 77

Solution
Subproblems: for all 0 ≤ i ≤ n and 0 ≤ j ≤ m, let P(i, j) be the problem of determining opt(i, j), the minimum cost of transforming the sequence A[1..i] into the sequence B[1..j].

Recurrence: for i, j ≥ 1,

opt(i, j) = min {
  opt(i − 1, j) + cD,
  opt(i, j − 1) + cI,
  opt(i − 1, j − 1)        if A[i] = B[j],
  opt(i − 1, j − 1) + cR   if A[i] ≠ B[j]
}.

Base cases: opt(i, 0) = i·cD and opt(0, j) = j·cI.
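The recurrence and base cases can be sketched as follows (a minimal Python sketch; names are ours, and the default unit costs give the Levenshtein distance):

```python
def edit_distance(A, B, cI=1, cD=1, cR=1):
    """Minimum cost to transform A into B using insertions (cost cI),
    deletions (cost cD) and replacements (cost cR)."""
    n, m = len(A), len(B)
    opt = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        opt[i][0] = i * cD            # delete all of A[1..i]
    for j in range(1, m + 1):
        opt[0][j] = j * cI            # insert all of B[1..j]
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            match = opt[i - 1][j - 1] + (0 if A[i - 1] == B[j - 1] else cR)
            opt[i][j] = min(opt[i - 1][j] + cD,   # delete A[i]
                            opt[i][j - 1] + cI,   # insert B[j]
                            match)                # keep or replace
    return opt[n][m]
```

For example, transforming "kitten" into "sitting" with unit costs takes 3 operations.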


Edit Distance 78

The overall solution is opt(n, m).

Each of O(nm) subproblems is solved in constant time, for a total time complexity of O(nm).
Maximising an expression 79

Problem
Instance: a sequence of numbers with operations +, −, × in
between, for example

1 + 2 − 3 × 6 − 1 − 2 × 3 − 5 × 7 + 2 − 8 × 9.

Task: place brackets in a way that the resulting expression has the
largest possible value.
Maximising an expression 80

What will be the subproblems?

Similar to the matrix chain multiplication problem earlier, it’s not enough to just solve for prefixes A[1..i].

Maybe: for each subsequence of numbers A[i + 1..j], place the brackets so that the resulting expression is maximised?
Maximising an expression 81

How about the recurrence?

It is natural to break down A[i + 1..j] into A[i + 1..k] ◦ A[k + 1..j], where ◦ is one of +, −, ×.

In the case ◦ = +, we want to maximise the values over both A[i + 1..k] and A[k + 1..j].

This doesn’t work for the other two operations!

We should look for placements of brackets not only for the maximal value but also for the minimal value!
Maximising an expression 82

Exercise
Write a complete solution for this problem. Your solution should
include the subproblem specification, recurrence and base cases.
You should also describe how the overall solution is to be obtained,
and analyse the time complexity of the algorithm.
Turtle Tower 83

Problem
Instance: You are given n turtles, and for each turtle you are
given its weight and its strength. The strength of a turtle is the
maximal weight you can put on it without cracking its shell.

Task: find the largest possible number of turtles which you can
stack one on top of the other, without cracking any turtle.

Hint
Surprisingly difficult! Order turtles in increasing order of the sum
of their weight and their strength, and proceed by recursion.

You can find a solution to this problem in additional lecture notes available on Moodle.
Integer Partitions 84

Problem
Instance: a positive integer n.

Task: compute the number of partitions of n, i.e., the number of distinct multisets of positive integers {n1, . . . , nk} such that n1 + . . . + nk = n.
Integer Partitions 85

Hint
It’s not obvious how to construct a recurrence between the number
of partitions of different values of n. Instead consider restricted
partitions!

Let nump(i, j) denote the number of partitions of j in which no part exceeds i, so that the answer is nump(n, n).

The recursion is based on relaxing the allowed size i of the parts of j, for all j up to n. It distinguishes those partitions where all parts are ≤ i − 1 from those where at least one part is exactly i.
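The case split in the hint gives the recurrence nump(i, j) = nump(i − 1, j) + nump(i, j − i), which can be sketched as follows (a minimal Python sketch; the function name is ours):

```python
def count_partitions(n):
    """Number of partitions of n, via nump(i, j): the number of
    partitions of j in which no part exceeds i."""
    # nump[i][j] = nump[i-1][j]    (all parts are <= i - 1)
    #            + nump[i][j-i]    (at least one part is exactly i)
    nump = [[0] * (n + 1) for _ in range(n + 1)]
    for i in range(n + 1):
        nump[i][0] = 1             # the empty partition of 0
    for i in range(1, n + 1):
        for j in range(1, n + 1):
            nump[i][j] = nump[i - 1][j]
            if j >= i:
                nump[i][j] += nump[i][j - i]
    return nump[n][n]
```

For instance, 5 has 7 partitions and 10 has 42.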
Table of Contents 86

1. Introduction

2. Example Problems

3. Applications to graphs

4. Puzzle
Directed acyclic graphs 87

Definition
Recall that a directed acyclic graph (DAG) is exactly that: a
directed graph without (directed) cycles.

[Figure: a small example DAG on a handful of vertices, including b, c and d.]
Topological ordering 88

Definition
Recall that in a directed graph, a topological ordering of the vertices is one in which all edges point “left to right”.

A directed graph admits a topological ordering if and only if it is acyclic.

There may be more than one valid topological ordering for a particular DAG.

A topological ordering can be found in linear time, i.e. O(|V| + |E|).
Shortest path in a directed acyclic graph 89

Problem
Instance: a directed acyclic graph G = (V , E ) in which each edge
e ∈ E has a corresponding weight w (e) (which may be negative),
and a designated vertex s ∈ V .

Task: find the shortest path from s to each vertex t ∈ V .

Notation
Let n = |V | and m = |E |.
Shortest path in a directed acyclic graph 90

If all edge weights are positive, the single source shortest path
problem is solved by Dijkstra’s algorithm in O(m log n).

Later in this lecture, we’ll see how to solve the general single
source shortest path problem in O(nm) using the
Bellman-Ford algorithm.

However, in the special case of directed acyclic graphs, a


simple DP solves this problem in O(n + m), i.e. linear time.
Shortest path in a directed acyclic graph 91

The natural subproblems are finding the shortest path to each vertex.

Each vertex v with an edge to t is a candidate for the


penultimate vertex in an s − t path.

The recurrence considers the path to each such v , plus the


weight of the last edge, and selects the minimum of these
options.

The base case is s itself, where the shortest path is obviously


zero.
Shortest path in a directed acyclic graph 92

Solution
Subproblems: for all t ∈ V , let P(t) be the problem of
determining opt(t), the length of a shortest path from s to t.

Recurrence: for all t ≠ s,

opt(t) = min{opt(v) + w(v, t) | (v, t) ∈ E}.

Base case: opt(s) = 0.


Shortest path in a directed acyclic graph 93

The solution is the entire list of values opt(t).

At first it appears that each of n subproblems is solved in O(n) time, giving a time complexity of O(n²).

However, each edge is only considered once (at the subproblem for its head vertex), so the total work over all subproblems is O(n + m).
Shortest path in a directed acyclic graph 94

Question
In what order should we solve the subproblems?

In any DP algorithm, the recurrence introduces certain


dependencies, and it is crucial that these dependencies are
respected.

Here, opt(t) depends on all the opt(v ) values for vertices v


with outgoing edges to t, so we need to solve P(v ) for each
such v before solving P(t).

We can achieve this by solving the vertices in topological


order, from left to right. All edges point from left to right, so
any vertex with an outgoing edge to t is solved before t is.
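Putting the pieces together gives the following sketch, in which vertices are labelled 0, . . . , n − 1 and the edge-list input format is an illustrative assumption.

```python
def dag_shortest_paths(n, edges, s):
    """Shortest paths from s in a DAG with vertices 0..n-1 and
    weighted edges (u, v, w). Assumes the graph is acyclic."""
    INF = float("inf")
    adj = [[] for _ in range(n)]
    indeg = [0] * n
    for u, v, w in edges:
        adj[u].append((v, w))
        indeg[v] += 1
    # compute a topological order (Kahn's algorithm)
    stack = [v for v in range(n) if indeg[v] == 0]
    order = []
    while stack:
        u = stack.pop()
        order.append(u)
        for v, _ in adj[u]:
            indeg[v] -= 1
            if indeg[v] == 0:
                stack.append(v)
    # DP in topological order: each edge is relaxed exactly once
    opt = [INF] * n
    opt[s] = 0
    for u in order:
        if opt[u] == INF:      # unreachable from s
            continue
        for v, w in adj[u]:
            opt[v] = min(opt[v], opt[u] + w)
    return opt
```

Negative edge weights cause no trouble here, since the topological order guarantees opt(v) is final before any edge out of v is used.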
DP on a directed acyclic graph 95

Many problems on directed acyclic graphs can be solved in the


same way: first use topological sort, then DP over the vertices
in that order.

If we replace the min in the earlier recurrence by max, we


have an algorithm to find the longest path from s to each t.
This problem is much harder on general graphs; indeed, there
is no known algorithm to solve it in polynomial time.

Often a graph will be specified in a way that makes it


obviously acyclic, with a natural topological order.
Assembly line scheduling 96

Problem
Instance: You are given two assembly lines, each consisting of n
workstations. The k th workstation on each assembly line performs
the k th of n jobs.

To bring a new product to the start of assembly line i takes si


units of time.

To retrieve a finished product from the end of assembly line i


takes fi units of time.
Assembly line scheduling 97

Problem (continued)

On assembly line i, the k th job takes ai,k units of time to


complete.

To move the product from station k on assembly line i to


station k + 1 on the other line takes ti,k units of time.

There is no time required to continue from station k to


station k + 1 on the same line.

Task: Find a fastest way to assemble a product using both lines as


necessary.
Assembly line scheduling 98

[Figure: the two assembly lines drawn as a graph. A start node has edges of weight s1 and s2 into the first stations; line i passes through its stations ai,1 , . . . , ai,n in sequence; crossover edges of weight ti,k lead from station k on line i to station k + 1 on the other line; edges of weight f1 and f2 lead from the last stations to the finish node.]

Assembly line scheduling 99

[Figure: close-up of stations k − 2 to k + 1 on both lines, showing the crossover edges t1,k−2 , t1,k−1 , t1,k and t2,k−2 , t2,k−1 , t2,k .]

Assembly line scheduling 100

We will denote internal vertices using the form (i, k) to


represent workstation k on assembly line i.

The problem requires us to find the shortest path from the


start node to the finish node, where unlabelled edges have
zero weight.

This is clearly a directed acyclic graph, and moreover the


topological ordering is obvious:
start, (1, 1), (2, 1), (1, 2), (2, 2), . . . , (1, n), (2, n), finish.
So we can use DP!

There are 2n + 2 vertices and 4n edges, so the DP should take


O(n) time, whereas Dijkstra’s algorithm would take
O(n log n).
Assembly line scheduling 101

We’ll solve for the shortest path from s to each vertex (i, k).

To form a recurrence, we should consider the ways of getting


to workstation k on assembly line i.

We could have come from workstation k − 1 on either line,


after completing the previous job.

The exception is the first workstation, which leads to the base


case.
Assembly line scheduling 102

Solution
Subproblems: for i ∈ {1, 2} and 1 ≤ k ≤ n, let P(i, k) be the
problem of determining opt(i, k), the minimal time taken to
complete the first k jobs, with the k th job performed on assembly
line i.

Recurrence: for k > 1,

opt(1, k) = min (opt(1, k − 1), opt(2, k − 1) + t2,k−1 ) + a1,k


opt(2, k) = min (opt(2, k − 1), opt(1, k − 1) + t1,k−1 ) + a2,k .

Base cases: opt(1, 1) = s1 + a1,1 and opt(2, 1) = s2 + a2,1 .


Assembly line scheduling 103

As the recurrence uses values from both assembly lines, we


have to solve the subproblems in order of increasing k, solving
both P(1, k) and P(2, k) at each stage.

Finally, after obtaining opt(1, n) and opt(2, n), the overall


solution is given by

min (opt(1, n) + f1 , opt(2, n) + f2 ) .

Each of 2n subproblems is solved in constant time, and the


final two subproblems are combined as above in constant time
also. Therefore the overall time complexity is O(n).
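The whole computation can be sketched as follows, keeping only the two current opt values; the argument convention (lists a1, a2 of job times and t1, t2 of transfer times) is an assumption made for illustration.

```python
def assembly_line(s, f, a, t):
    """s = (s1, s2): entry times; f = (f1, f2): exit times;
    a[i][k]: time of job k+1 on line i+1 (0-indexed lists);
    t[i][k]: time to move from station k+1 on line i+1 to
             station k+2 on the other line."""
    n = len(a[0])
    # base cases: enter a line and do the first job
    opt1 = s[0] + a[0][0]
    opt2 = s[1] + a[1][0]
    for k in range(1, n):
        # stay on the same line, or cross over after station k
        new1 = min(opt1, opt2 + t[1][k - 1]) + a[0][k]
        new2 = min(opt2, opt1 + t[0][k - 1]) + a[1][k]
        opt1, opt2 = new1, new2
    # retrieve the finished product from either line
    return min(opt1 + f[0], opt2 + f[1])
```

Note that only O(1) extra space is needed, since each stage depends only on the previous one.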
Assembly line scheduling 104

Remark
This problem is important because it has the same design logic as
the Viterbi algorithm, an extremely important algorithm for many
fields such as speech recognition, decoding convolutional codes in
telecommunications etc. This will be covered in COMP4121
Advanced Algorithms.
Single Source Shortest Paths 105

Problem
Instance: a directed weighted graph G = (V , E ) with edge
weights w (e) which can be negative, but without cycles of
negative total weight, and a designated vertex s ∈ V .

Task: find the weight of the shortest path from vertex s to every
other vertex t.

Notation
Let n = |V | and m = |E |.
Single Source Shortest Paths 106

How does this problem differ to the one solved by Dijkstra’s


algorithm?

In this problem, we allow negative edge weights, so the greedy


strategy no longer works.

Note that we disallow cycles of negative total weight. This is


only because with such a cycle there is no shortest path: you
can take as many laps around a negative cycle as you like.

This problem was first solved by Shimbel in 1955, and was


one of the earliest uses of Dynamic Programming.
Bellman-Ford algorithm 107

Observation
For any vertex t, there is a shortest s − t path without cycles.

Proof Outline
Suppose the opposite. Let p be a shortest s − t path, so it must
contain a cycle. Since there are no negative weight cycles,
removing this cycle produces an s − t path of no greater length.

Observation
It follows that there is a shortest s − t path which contains each vertex at most once, and therefore has at most n − 1 edges.
Bellman-Ford algorithm 108

For every vertex t, let’s find the weight of a shortest s − t


path consisting of at most i edges, for each i up to n − 1.

Suppose the path in question is

p = s → · · · → v → t,

with the final edge going from v to t; write p′ for the prefix s → · · · → v.

Then p′ must itself be a shortest path from s to v of at most i − 1 edges, which is another subproblem!

No such recursion is necessary if t = s, or if i = 0.


Bellman-Ford algorithm 109

Solution
Subproblems: for all 0 ≤ i ≤ n − 1 and all t ∈ V , let P(i, t) be
the problem of determining opt(i, t), the length of a shortest path
from s to t which contains at most i edges.

Recurrence: for all i > 0 and t ≠ s,

opt(i, t) = min{opt(i − 1, v) + w(v, t) | (v, t) ∈ E}.

Base cases: opt(i, s) = 0, and for t 6= s, opt(0, t) = ∞.


Bellman-Ford algorithm 110

The overall solutions are given by opt(n − 1, t).

We proceed in n rounds (i = 0, 1, . . . , n − 1). In each round,


each edge of the graph is considered only once.

Therefore the time complexity is O(nm).


Bellman-Ford algorithm 112

[Figure: a directed graph with vertices s, a, b, c, d and weighted edges
s → a (6), s → c (7), a → b (5), a → c (8), a → d (−4),
b → a (−2), c → b (−3), c → d (9), d → b (7), d → s (2)]

Values of opt(i, t):

  i \ t    s    a    b    c    d
    0      0    ∞    ∞    ∞    ∞
    1      0    6    ∞    7    ∞
    2      0    6    4    7    2
    3      0    2    4    7    2
    4      0    2    4    7   −2

opt(i, t) = min{opt(i − 1, v) + w(v, t) | (v, t) ∈ E}.
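The recurrence can be sketched with a single overwritten row (the space optimisation discussed on a later slide); run on an edge list matching the example above, it reproduces the final row of the table. The vertex labelling 0, . . . , n − 1 is an illustrative convention.

```python
def bellman_ford(n, edges, s):
    """edges: list of (u, v, w) triples. Returns shortest path
    weights from s, assuming no negative-weight cycles."""
    INF = float("inf")
    opt = [INF] * n
    opt[s] = 0
    # n - 1 rounds; each round considers every edge once
    for _ in range(n - 1):
        for u, v, w in edges:
            if opt[u] != INF and opt[u] + w < opt[v]:
                opt[v] = opt[u] + w
    return opt
```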
Bellman-Ford algorithm 113

How do we reconstruct an actual shortest s − t path?

As usual, we’ll store one step at a time and backtrack. Let


pred(i, t) be the immediate predecessor of vertex t on a
shortest s − t path of at most i edges.

The additional recurrence required is

pred(i, t) = argmin{opt(i − 1, v ) + w (v , t) | (v , t) ∈ E }.
v ∈V
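A minimal sketch of this reconstruction: store one predecessor per vertex, updated whenever a relaxation improves opt, then backtrack from t. The vertex labelling 0, . . . , n − 1 and the assumption that t is reachable from s are made for illustration.

```python
def bellman_ford_path(n, edges, s, t):
    """Returns (distance, path) from s to t; assumes no
    negative-weight cycles and that t is reachable from s."""
    INF = float("inf")
    opt = [INF] * n
    pred = [None] * n
    opt[s] = 0
    for _ in range(n - 1):
        for u, v, w in edges:
            if opt[u] != INF and opt[u] + w < opt[v]:
                opt[v] = opt[u] + w
                pred[v] = u   # record the improving predecessor
    # backtrack from t to s, then reverse
    path = [t]
    while path[-1] != s:
        path.append(pred[path[-1]])
    return opt[t], path[::-1]
```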
Bellman-Ford algorithm 114

There are several small improvements that can be made to


this algorithm.

As stated, we build a table of size O(n²), with a new row for each ‘round’.

It is possible to reduce this to O(n). Including opt(i − 1, t) as a candidate for opt(i, t) doesn’t change the final answers, so we can instead maintain a table with only one row and overwrite it at each round.
All Pairs Shortest Paths 115

Problem
Instance: a directed weighted graph G = (V , E ) with edge
weights w (e) which can be negative, but without cycles of
negative total weight.

Task: find the weight of the shortest path from every vertex s to
every other vertex t.

Notation
Let n = |V | and m = |E |.
Floyd-Warshall algorithm 116

We can use a similar idea, this time in terms of the


intermediate vertices allowed on an s − t path.

Label the vertices of V as v1 , v2 , . . . , vn .

Let S be the set of vertices allowed as intermediate vertices.


Initially S is empty, and we add vertices v1 , v2 , . . . , vn one at a
time.
Floyd-Warshall algorithm 117

Question
When is the shortest path from s to t using the first k vertices as
intermediates actually an improvement on the value from the
previous round?

Answer
When there is a shorter path of the form

s → · · · → vk → · · · → t,

where the portions before and after vk use only v1 , . . . , vk−1 as intermediate vertices.
Floyd-Warshall algorithm 118

Solution
Subproblems: for all 1 ≤ i, j ≤ n and 0 ≤ k ≤ n, let P(i, j, k) be
the problem of determining opt(i, j, k), the weight of a shortest
path from vi to vj using only v1 , . . . , vk as intermediate vertices.

Recurrence: for all 1 ≤ i, j, k ≤ n,

opt(i, j, k) = min(opt(i, j, k − 1), opt(i, k, k − 1) + opt(k, j, k − 1)).

Base cases:

0
 if i = j
opt(i, j, 0) = w (i, j) if (vi , vj ) ∈ E .

∞ otherwise.

Floyd-Warshall algorithm 119

Since P(i, j, k) depends on P(i, j, k − 1), P(i, k, k − 1) and


P(k, j, k − 1), we solve subproblems in increasing order of k.

The overall solutions are given by opt(i, j, n), where all


vertices are allowed as intermediates.

Each of O(n³) subproblems is solved in constant time, so the time complexity is O(n³).

The space complexity can again be improved by overwriting


the table every round.
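A sketch with that single-table (overwriting) optimisation; vertices 0, . . . , n − 1 and edge triples are illustrative conventions.

```python
def floyd_warshall(n, edges):
    """All-pairs shortest path weights; edges as (u, v, w) triples.
    Assumes no negative-weight cycles. A single n x n table is
    overwritten, one round per allowed intermediate vertex k."""
    INF = float("inf")
    opt = [[INF] * n for _ in range(n)]
    # base cases: opt(i, j, 0)
    for i in range(n):
        opt[i][i] = 0
    for u, v, w in edges:
        opt[u][v] = min(opt[u][v], w)
    # allow intermediate vertices one at a time
    for k in range(n):
        for i in range(n):
            for j in range(n):
                if opt[i][k] + opt[k][j] < opt[i][j]:
                    opt[i][j] = opt[i][k] + opt[k][j]
    return opt
```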
Table of Contents 120

1. Introduction

2. Example Problems

3. Applications to graphs

4. Puzzle
Puzzle 121

You have 2 lengths of fuse that are guaranteed to burn for


precisely 1 hour each. Other than that fact, you know nothing;
they may burn at different (indeed, at variable) rates, they may be
of different lengths, thicknesses, materials, etc.

How can you use these two fuses to time a 45 minute interval?
That’s All, Folks!!
