Chapter 8
Dynamic Programming
Copyright © 2007 Pearson Addison-Wesley. All rights reserved.
Dynamic Programming
Dynamic Programming is a general algorithm design technique
for solving problems defined by or formulated as recurrences
with overlapping subinstances
• Invented by American mathematician Richard Bellman in the
1950s to solve optimization problems and later assimilated by CS
• “Programming” here means “planning”
• Main idea:
- set up a recurrence relating a solution to a larger instance
to solutions of some smaller instances
- solve smaller instances once
- record solutions in a table
- extract solution to the initial instance from that table
Copyright © 2007 Pearson Addison-Wesley. All rights reserved. A. Levitin “Introduction to the Design & Analysis of Algorithms,” 2 nd ed., Ch. 8 8-2
Example: Fibonacci numbers
• Recall definition of Fibonacci numbers:
F(n) = F(n-1) + F(n-2)
F(0) = 0
F(1) = 1
• Computing the nth Fibonacci number recursively (top-down):
F(n)
F(n-1) + F(n-2)
F(n-2) + F(n-3) F(n-3) + F(n-4)
...
Copyright © 2007 Pearson Addison-Wesley. All rights reserved. A. Levitin “Introduction to the Design & Analysis of Algorithms,” 2 nd ed., Ch. 8 8-3
Example: Fibonacci numbers (cont.)
Computing the nth Fibonacci number using bottom-up iteration and
recording results:
F(0) = 0
F(1) = 1
F(2) = 1+0 = 1
…
F(n-2) =
F(n-1) =
F(n) = F(n-1) + F(n-2)
0 1 1 . . . F(n-2) F(n-1) F(n)
Efficiency:
- time n What if we solve
- space n it recursively?
Copyright © 2007 Pearson Addison-Wesley. All rights reserved. A. Levitin “Introduction to the Design & Analysis of Algorithms,” 2 nd ed., Ch. 8 8-4
Examples of DP algorithms
• Computing a binomial coefficient
• Longest common subsequence
• Warshall’s algorithm for transitive closure
• Floyd’s algorithm for all-pairs shortest paths
• Constructing an optimal binary search tree
• Some instances of difficult discrete optimization problems:
- traveling salesman
- knapsack
Copyright © 2007 Pearson Addison-Wesley. All rights reserved. A. Levitin “Introduction to the Design & Analysis of Algorithms,” 2 nd ed., Ch. 8 8-5
Computing a binomial coefficient by DP
Binomial coefficients are coefficients of the binomial formula:
(a + b)n = C(n,0)anb0 + . . . + C(n,k)an-kbk + . . . + C(n,n)a0bn
Recurrence: C(n,k) = C(n-1,k) + C(n-1,k-1) for n > k > 0
C(n,0) = 1, C(n,n) = 1 for n ≥ 0
Value of C(n,k) can be computed by filling a table:
0 1 2 . . . k-1 k
0 1
1 1 1
.
.
.
n-1 C(n-1,k-1) C(n-1,k)
n C(n,k)
Copyright © 2007 Pearson Addison-Wesley. All rights reserved. A. Levitin “Introduction to the Design & Analysis of Algorithms,” 2 nd ed., Ch. 8 8-6
Computing C(n,k): pseudocode and analysis
Time efficiency: Θ (nk)
Space efficiency: Θ (nk)
Copyright © 2007 Pearson Addison-Wesley. All rights reserved. A. Levitin “Introduction to the Design & Analysis of Algorithms,” 2 nd ed., Ch. 8 8-7
Knapsack Problem by DP
Given n items of
integer weights: w1 w2 … wn
values: v1 v2 … vn
a knapsack of integer capacity W
find most valuable subset of the items that fit into the knapsack
Consider instance defined by first i items and capacity j (j ≤ W).
Let V[i,j] be optimal value of such an instance. Then
max {V[i-1,j], vi + V[i-1,j- wi]} if j- wi ≥ 0
V[i,j] =
V[i-1,j]
{ if j- wi < 0
Initial conditions: V[0,j] = 0 and V[i,0] = 0 8-8
A. Levitin “Introduction to the Design & Analysis of Algorithms,” 2
Copyright © 2007 Pearson Addison-Wesley. All rights reserved. nd
ed., Ch. 8
Knapsack Problem by DP (example)
Example: Knapsack of capacity W = 5
item weight value
1 2 $12
2 1 $10
3 3 $20
4 2 $15 capacity j
0 1 2 3 4 5
0 0 0 0
w1 = 2, v1= 12 1 0 0 12
w2 = 1, v2= 10 2 0 10 12 22 22 22 Backtracing
finds the actual
w3 = 3, v3= 20 3 0 10 12 22 30 32
optimal subset,
w4 = 2, v4= 15 4 0 10 15 25 30 37
? i.e. solution.
Copyright © 2007 Pearson Addison-Wesley. All rights reserved. A. Levitin “Introduction to the Design & Analysis of Algorithms,” 2 nd ed., Ch. 8 8-9
Knapsack Problem by DP (pseudocode)
Algorithm DPKnapsack(w[1..n], v[1..n], W)
var V[0..n,0..W], P[1..n,1..W]: int
for j := 0 to W do
V[0,j] := 0
for i := 0 to n do Running time and space:
V[i,0] := 0 O(nW).
for i := 1 to n do
for j := 1 to W do
if w[i] ≤ j and v[i] + V[i-1,j-w[i]] > V[i-1,j] then
V[i,j] := v[i] + V[i-1,j-w[i]]; P[i,j] := j-w[i]
else
V[i,j] := V[i-1,j]; P[i,j] := j
return V[n,W] and the optimal subset by backtracing
Copyright © 2007 Pearson Addison-Wesley. All rights reserved. A. Levitin “Introduction to the Design & Analysis of Algorithms,” 2 nd ed., Ch. 8 8-10
Longest Common Subsequence (LCS)
A subsequence of a sequence/string S is obtained by
deleting zero or more symbols from S. For example, the
following are some subsequences of “president”: pred, sdn,
predent. In other words, the letters of a subsequence of S
appear in order in S, but they are not required to be
consecutive.
The longest common subsequence problem is to find a
maximum length common subsequence between two
sequences.
Copyright © 2007 Pearson Addison-Wesley. All rights reserved. A. Levitin “Introduction to the Design & Analysis of Algorithms,” 2 nd ed., Ch. 8 8-11
LCS
For instance,
Sequence 1: president
Sequence 2: providence
Its LCS is priden.
president
providence
Copyright © 2007 Pearson Addison-Wesley. All rights reserved. A. Levitin “Introduction to the Design & Analysis of Algorithms,” 2 nd ed., Ch. 8 8-12
LCS
Another example:
Sequence 1: algorithm
Sequence 2: alignment
One of its LCS is algm.
a l g o r i t h m
a l i g n m e n t
Copyright © 2007 Pearson Addison-Wesley. All rights reserved. A. Levitin “Introduction to the Design & Analysis of Algorithms,” 2 nd ed., Ch. 8 8-13
How to compute LCS?
Let A=a1a2…am and B=b1b2…bn .
len(i, j): the length of an LCS between
a1a2…ai and b1b2…bj
With proper initializations, len(i, j) can be computed as follows.
0 if i = 0 or j = 0,
len(i, j ) = len(i − 1, j − 1) + 1 if i, j > 0 and ai = b j ,
max(len(i, j − 1), len(i − 1, j )) if i, j > 0 and a ≠ b .
i j
Copyright © 2007 Pearson Addison-Wesley. All rights reserved. A. Levitin “Introduction to the Design & Analysis of Algorithms,” 2 nd ed., Ch. 8 8-14
procedure LCS-Length(A, B)
1. for i ← 0 to m do len(i,0) = 0
2. for j ← 1 to n do len(0,j) = 0
3. for i ← 1 to m do
4. for j ← 1 to n do
len(i, j ) =len(i −1, j −1) +1
5. if ai =b j then
prev(i, j ) =" "
6. else if len(i −1, j ) ≥len(i, j −1)
len(i, j ) =len(i −1, j )
7. then
prev(i, j ) =" "
len(i, j ) =len(i, j −1)
8. else
prev(i, j ) =" "
9. return len and prev
Copyright © 2007 Pearson Addison-Wesley. All rights reserved. A. Levitin “Introduction to the Design & Analysis of Algorithms,” 2 nd ed., Ch. 8 8-15
i j 0 1 2 3 4 5 6 7 8 9 10
p r o v i d e n c e
0 0 0 0 0 0 0 0 0 0 0 0
1 p 0 1 1 1 1 1 1 1 1 1 1
2 r 0 1 2 2 2 2 2 2 2 2 2
3 e 0 1 2 2 2 2 2 3 3 3 3
4 s 0 1 2 2 2 2 2 3 3 3 3
5 i 0 1 2 2 2 3 3 3 3 3 3
6 d 0 1 2 2 2 3 4 4 4 4 4
7 e 0 1 2 2 2 3 4 5 5 5 5
8 n 0 1 2 2 2 3 4 5 6 6 6
9 t 0 1 2 2 2 3 4 5 6 6 6
Running time and memory: O(mn) and O(mn).
Copyright © 2007 Pearson Addison-Wesley. All rights reserved. A. Levitin “Introduction to the Design & Analysis of Algorithms,” 2 nd ed., Ch. 8 8-16
The backtracing algorithm
procedure Output-LCS(A, prev, i, j)
1 if i = 0 or j = 0 then return
Output − LCS ( A, prev, i − 1, j − 1)
2 if prev(i, j)=” “ then
print ai
3 else if prev(i, j)=” “ then Output-LCS(A, prev, i-1, j)
4 else Output-LCS(A, prev, i, j-1)
Copyright © 2007 Pearson Addison-Wesley. All rights reserved. A. Levitin “Introduction to the Design & Analysis of Algorithms,” 2 nd ed., Ch. 8 8-17
i j 0 1 2 3 4 5 6 7 8 9 10
p r o v i d e n c e
0 0 0 0 0 0 0 0 0 0 0 0
1 p 0 1 1 1 1 1 1 1 1 1 1
2 r 0 1 2 2 2 2 2 2 2 2 2
3 e 0 1 2 2 2 2 2 3 3 3 3
4 s 0 1 2 2 2 2 2 3 3 3 3
5 i 0 1 2 2 2 3 3 3 3 3 3
6 d 0 1 2 2 2 3 4 4 4 4 4
7 e 0 1 2 2 2 3 4 5 5 5 5
8 n 0 1 2 2 2 3 4 5 6 6 6
9 t 0 1 2 2 2 3 4 5 6 6 6
Output: priden
Copyright © 2007 Pearson Addison-Wesley. All rights reserved. A. Levitin “Introduction to the Design & Analysis of Algorithms,” 2 nd ed., Ch. 8 8-18
Warshall’s Algorithm: Transitive Closure
• Computes the transitive closure of a relation
• Alternatively: existence of all nontrivial paths in a digraph
• Example of transitive closure:
3 3
1 1
4 4 0 0 1 0
2 0 0 1 0 2
1 0 0 1 1 1 1 1
0 0 0 0 0 0 0 0
0 1 0 0 1 1 1 1
Copyright © 2007 Pearson Addison-Wesley. All rights reserved. A. Levitin “Introduction to the Design & Analysis of Algorithms,” 2 nd ed., Ch. 8 8-19
Warshall’s Algorithm
Constructs transitive closure T as the last matrix in the sequence
of n-by-n matrices R(0), … , R(k), … , R(n) where
R(k)[i,j] = 1 iff there is nontrivial path from i to j with only the
first k vertices allowed as intermediate
Note that R(0) = A (adjacency matrix), R(n) = T (transitive closure)
3 3 3 3 3
1 1 1 1 1
4 4 4 2 4 4
2 2 2 2
R(0) R(1) R(2) R(3) R(4)
0 0 1 0 0 0 1 0 0 0 1 0 0 0 1 0 0 0 1 0
1 0 0 1 1 0 1 1 1 0 1 1 1 0 1 1 1 1 1 1
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 1 0 0 0 1 0 0 1 1 1 1 1 1 1 1 1 1 1 1
Copyright © 2007 Pearson Addison-Wesley. All rights reserved. A. Levitin “Introduction to the Design & Analysis of Algorithms,” 2 nd ed., Ch. 8 8-20
Warshall’s Algorithm (recurrence)
On the k-th iteration, the algorithm determines for every pair of
vertices i, j if a path exists from i and j with just vertices 1,…,k
allowed as intermediate
R(k)[i,j] =
{ R(k-1)[i,j]
or
(path using just 1 ,…,k-1)
R(k-1)[i,k] and R(k-1)[k,j] (path from i to k
and from k to j
k using just 1 ,…,k-1)
i
Initial condition?
j
Copyright © 2007 Pearson Addison-Wesley. All rights reserved. A. Levitin “Introduction to the Design & Analysis of Algorithms,” 2 nd ed., Ch. 8 8-21
Warshall’s Algorithm (matrix generation)
Recurrence relating elements R(k) to elements of R(k-1) is:
R(k)[i,j] = R(k-1)[i,j] or (R(k-1)[i,k] and R(k-1)[k,j])
It implies the following rules for generating R(k) from R(k-1):
Rule 1 If an element in row i and column j is 1 in R(k-1),
it remains 1 in R(k)
Rule 2 If an element in row i and column j is 0 in R(k-1),
it has to be changed to 1 in R(k) if and only if
the element in its row i and column k and the element
in its column j and row k are both 1’s in R(k-1)
Copyright © 2007 Pearson Addison-Wesley. All rights reserved. A. Levitin “Introduction to the Design & Analysis of Algorithms,” 2 nd ed., Ch. 8 8-22
Warshall’s Algorithm (example)
3
1 0 0 1 0 0 0 1 0
1 0 0 1 1 0 1 1
R(0) = 0 0 0 0 R(1) = 0 0 0 0
2 4
0 1 0 0 0 1 0 0
0 0 1 0 0 0 1 0 0 0 1 0
1 0 1 1 1 0 1 1 1 1 1 1
R(2) = 0 0 0 0 R(3) = 0 0 0 0 R(4) = 0 0 0 0
1 1 1 1 1 1 1 1 1 1 1 1
Copyright © 2007 Pearson Addison-Wesley. All rights reserved. A. Levitin “Introduction to the Design & Analysis of Algorithms,” 2 nd ed., Ch. 8 8-23
Warshall’s Algorithm (pseudocode and analysis)
Time efficiency: Θ (n3)
Space efficiency: Matrices can be written over their predecessors
(with some care), so it’s Θ(n^2).
Copyright © 2007 Pearson Addison-Wesley. All rights reserved. A. Levitin “Introduction to the Design & Analysis of Algorithms,” 2 nd ed., Ch. 8 8-24
Floyd’s Algorithm: All pairs shortest paths
Problem: In a weighted (di)graph, find shortest paths between
every pair of vertices
Same idea: construct solution through series of matrices D(0), …,
D (n) using increasing subsets of the vertices allowed
as intermediate
Example: 4 3
1 0 ∞ 4 ∞
1
6 1 0 4 3
1 5 ∞ ∞ 0 ∞
4 6 5 1 0
2 3
Copyright © 2007 Pearson Addison-Wesley. All rights reserved. A. Levitin “Introduction to the Design & Analysis of Algorithms,” 2 nd ed., Ch. 8 8-25
Floyd’s Algorithm (matrix generation)
On the k-th iteration, the algorithm determines shortest paths
between every pair of vertices i, j that use only vertices among 1,
…,k as intermediate
D(k)[i,j] = min {D(k-1)[i,j], D(k-1)[i,k] + D(k-1)[k,j]}
D(k-1)[i,k]
k
i
D(k-1)[k,j]
D(k-1)[i,j]
j
Initial condition?
Copyright © 2007 Pearson Addison-Wesley. All rights reserved. A. Levitin “Introduction to the Design & Analysis of Algorithms,” 2 nd ed., Ch. 8 8-26
Floyd’s Algorithm (example)
1
2 2 0 ∞ 3 ∞ 0 ∞ 3 ∞
3 6 7 2 0 ∞ ∞ 2 0 5 ∞
D(0) = ∞ 7 0 1 D(1) = ∞ 7 0 1
3
1
4 6 ∞ ∞ 0 6 ∞ 9 0
0 ∞ 3 ∞ 0 10 3 4 0 10 3 4
2 0 5 ∞ 2 0 5 6 2 0 5 6
D(2) = 9 7 0 1 D(3) = 9 7 0 1 D(4) = 7 7 0 1
6 ∞ 9 0 6 16 9 0 6 16 9 0
Copyright © 2007 Pearson Addison-Wesley. All rights reserved. A. Levitin “Introduction to the Design & Analysis of Algorithms,” 2 nd ed., Ch. 8 8-27
Floyd’s Algorithm (pseudocode and analysis)
If D[i,k] + D[k,j] < D[i,j] then P[i,j] k
Since the superscripts k or k-1 make
Time efficiency: Θ (n3)
no difference to D[i,k] and D[k,j].
Space efficiency: Matrices can be written over their predecessors
Note: Works on graphs with negative edges but without negative cycles.
Shortest paths themselves can be found, too. How?
Copyright © 2007 Pearson Addison-Wesley. All rights reserved. A. Levitin “Introduction to the Design & Analysis of Algorithms,” 2 nd ed., Ch. 8 8-28
Optimal Binary Search Trees
Problem: Given n keys a1 < …< an and probabilities p1, …, pn
searching for them, find a BST with a minimum
average number of comparisons in successful search.
Since total number of BSTs with n nodes is given by C(2n,n)/
(n+1), which grows exponentially, brute force is hopeless.
Example: What is an optimal BST for keys A, B, C, and D with
search probabilities 0.1, 0.2, 0.4, and 0.3, respectively?
C
Average # of comparisons
B D
= 1*0.4 + 2*(0.2+0.3) + 3*0.1
A
= 1.7
Copyright © 2007 Pearson Addison-Wesley. All rights reserved. A. Levitin “Introduction to the Design & Analysis of Algorithms,” 2 nd ed., Ch. 8 8-29
DP for Optimal BST Problem
Let C[i,j] be minimum average number of comparisons made in
T[i,j], optimal BST for keys ai < …< aj , where 1 ≤ i ≤ j ≤ n.
Consider optimal BST among all BSTs with some ak (i ≤ k ≤ j )
as their root; T[i,j] is the best among them.
ak C[i,j] =
min {pk · 1 +
i≤k≤j
k-1
∑ ps (level as in T[i,k-1] +1) +
Optimal Optimal s=i
BST for BST for
a i , ..., a k-1 a k+1 , ..., a j
j
∑ ps (level as in T[k+1,j] +1)}
s =k+1
Copyright © 2007 Pearson Addison-Wesley. All rights reserved. A. Levitin “Introduction to the Design & Analysis of Algorithms,” 2 nd ed., Ch. 8 8-30
DP for Optimal BST Problem (cont.)
After simplifications, we obtain the recurrence for C[i,j]:
j
C[i,j] = min {C[i,k-1] + C[k+1,j]} + ∑ ps for 1 ≤ i ≤ j ≤ n
i≤k≤j s=i
C[i,i] = pi for 1 ≤ i ≤ j ≤ n
0 1 j n
1 0 p1 goal
0 p2
i C[i,j]
pn
n+1 0
Copyright © 2007 Pearson Addison-Wesley. All rights reserved. A. Levitin “Introduction to the Design & Analysis of Algorithms,” 2 nd ed., Ch. 8 8-31
Example: key A B C D
probability 0.1 0.2 0.4 0.3
The tables below are filled diagonal by diagonal: the left one is filled
using the recurrence j
C[i,j] = min {C[i,k-1] + C[k+1,j]} + ∑ ps , C[i,i] = pi ;
i≤k≤j s=i
the right one, for trees’ roots, records k’s values giving the minima
j 0 1 2 3 4 j 0 1 2 3 4
i i
1 0 .1 .4 1.1 1.7 1 1 2 3 3 C
2 0 .2 .8 1.4 2 2 3 3 B D
3 0 .4 1.0 3 3 3
A
4 0 .3 4 4
optimal BST
5 0 5
Optimal Binary Search Trees
Copyright © 2007 Pearson Addison-Wesley. All rights reserved. A. Levitin “Introduction to the Design & Analysis of Algorithms,” 2 nd ed., Ch. 8 8-33
Analysis DP for Optimal BST Problem
Time efficiency: Θ(n3) but can be reduced to Θ(n2) by taking
advantage of monotonicity of entries in the
root table, i.e., R[i,j] is always in the range
between R[i,j-1] and R[i+1,j]
Space efficiency: Θ(n2)
Method can be expanded to include unsuccessful searches
Copyright © 2007 Pearson Addison-Wesley. All rights reserved. A. Levitin “Introduction to the Design & Analysis of Algorithms,” 2 nd ed., Ch. 8 8-34