Dynamic Programming
Matrix Chain Multiplication
Dynamic Programming
• An algorithm design technique (like divide and
conquer)
• Divide and conquer
– Partition the problem into independent subproblems
– Solve the subproblems recursively
– Combine the solutions to solve the original problem
Dynamic Programming
• Applicable when subproblems are not independent
– Subproblems share subsubproblems
E.g.: Combinations:
\[ \binom{n}{k} = \binom{n-1}{k} + \binom{n-1}{k-1}, \qquad \binom{n}{1} = n, \qquad \binom{n}{n} = 1 \]
– A divide and conquer approach would repeatedly solve the
common subproblems
– Dynamic programming solves every subproblem just once and
stores the answer in a table
Example: Combinations
Comb(6, 4)
= Comb(5, 3) + Comb(5, 4)
= Comb(4, 2) + Comb(4, 3) + Comb(4, 3) + Comb(4, 4)
= Comb(3, 1) + Comb(3, 2) + Comb(3, 2) + Comb(3, 3) + Comb(3, 2) + Comb(3, 3) + 1
= 3 + Comb(2, 1) + Comb(2, 2) + Comb(2, 1) + Comb(2, 2) + 1 + Comb(2, 1) + Comb(2, 2) + 1 + 1
= 3 + 2 + 1 + 2 + 1 + 1 + 2 + 1 + 1 + 1
= 15
\[ \binom{n}{k} = \binom{n-1}{k} + \binom{n-1}{k-1} \]
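The repeated work in the expansion above can be made visible with a minimal Python sketch. The names `comb_naive` and `comb_memo` and the call counter are illustrative assumptions, not part of the slides; the base cases follow the expansion (Comb(n, 1) = n, Comb(n, n) = 1), so the functions assume 1 ≤ k ≤ n.

```python
from functools import lru_cache


def comb_naive(n, k, counter):
    """Divide-and-conquer recursion; counter[0] counts every call made."""
    counter[0] += 1
    if k == 1:
        return n
    if k == n:
        return 1
    return comb_naive(n - 1, k, counter) + comb_naive(n - 1, k - 1, counter)


@lru_cache(maxsize=None)
def comb_memo(n, k):
    """Same recurrence, but each (n, k) pair is solved once and cached."""
    if k == 1:
        return n
    if k == n:
        return 1
    return comb_memo(n - 1, k) + comb_memo(n - 1, k - 1)


counter = [0]
print(comb_naive(6, 4, counter), "calls:", counter[0])
print(comb_memo(6, 4), "distinct subproblems:", comb_memo.cache_info().misses)
```

The naive version solves Comb(4, 3) and Comb(3, 2) several times, while the cached version touches each distinct pair once, which is the table idea described above.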
Dynamic Programming
• Used for optimization problems
– A set of choices must be made to get an optimal
solution
– Find a solution with the optimal value (minimum or
maximum)
– There may be many solutions that lead to an optimal
value
– Our goal: find an optimal solution
Dynamic Programming Algorithm
1. Characterize the structure of an optimal
solution
2. Recursively define the value of an optimal
solution
3. Compute the value of an optimal solution in a
bottom-up fashion
4. Construct an optimal solution from computed
information (not always necessary)
Matrix-Chain Multiplication
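Problem: given a sequence of matrices $A_1, A_2, \ldots, A_n$, compute the
product:
\[ A_1 \cdot A_2 \cdots A_n \]
• Matrix compatibility:
– For $C = A \cdot B$: $\mathrm{col}_A = \mathrm{row}_B$, $\mathrm{row}_C = \mathrm{row}_A$, $\mathrm{col}_C = \mathrm{col}_B$
– For $C = A_1 \cdot A_2 \cdots A_i \cdot A_{i+1} \cdots A_n$: $\mathrm{col}_i = \mathrm{row}_{i+1}$, $\mathrm{row}_C = \mathrm{row}_{A_1}$, $\mathrm{col}_C = \mathrm{col}_{A_n}$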
MATRIX-MULTIPLY(A, B)
if columns[A] ≠ rows[B]
   then error “incompatible dimensions”
   else for i ← 1 to rows[A]
            do for j ← 1 to columns[B]
                   do C[i, j] ← 0
                      for k ← 1 to columns[A]
                          do C[i, j] ← C[i, j] + A[i, k] · B[k, j]
Cost: rows[A] · cols[A] · cols[B] scalar multiplications
[Figure: entry C[i, j] is computed from row i of A (rows[A] x cols[A]) and column j of B (cols[A] x cols[B]); C is rows[A] x cols[B]]
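As a rough illustration of the cost formula, the pseudocode can be sketched in Python with a counter for scalar multiplications. The function name `matrix_multiply` and the list-of-lists representation are assumptions made for this example.

```python
def matrix_multiply(A, B):
    """Multiply two matrices given as lists of rows.

    Returns (C, mults), where mults is the number of scalar multiplications.
    """
    if len(A[0]) != len(B):
        raise ValueError("incompatible dimensions")
    rows_a, cols_a, cols_b = len(A), len(A[0]), len(B[0])
    C = [[0] * cols_b for _ in range(rows_a)]
    mults = 0
    for i in range(rows_a):
        for j in range(cols_b):
            for k in range(cols_a):
                C[i][j] += A[i][k] * B[k][j]
                mults += 1  # one scalar multiplication per innermost step
    return C, mults


A = [[1, 2, 3], [4, 5, 6]]        # 2 x 3
B = [[7, 8], [9, 10], [11, 12]]   # 3 x 2
C, mults = matrix_multiply(A, B)
print(C, mults)  # mults equals rows[A] * cols[A] * cols[B] = 2 * 3 * 2
```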
Matrix-Chain Multiplication
• In what order should we multiply the matrices?
$A_1 \cdot A_2 \cdots A_n$
• Parenthesize the product to get the order in which
matrices are multiplied
• E.g.: A1 A2 A3 = ((A1 A2) A3)
= (A1 (A2 A3))
• Which one of these orderings should we choose?
– The order in which we multiply the matrices has a
significant impact on the cost of evaluating the product
Example
$A_1 \cdot A_2 \cdot A_3$
• A1: 10 x 100
• A2: 100 x 5
• A3: 5 x 50
1. ((A1 A2) A3): A1 A2 costs 10 x 100 x 5 = 5,000 (result is 10 x 5)
(A1 A2) A3 costs 10 x 5 x 50 = 2,500
Total: 7,500 scalar multiplications
2. (A1 (A2 A3)): A2 A3 costs 100 x 5 x 50 = 25,000 (result is 100 x 50)
A1 (A2 A3) costs 10 x 100 x 50 = 50,000
Total: 75,000 scalar multiplications
One order of magnitude difference!
Matrix-Chain Multiplication:
Problem Statement
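• Given a chain of matrices $A_1, A_2, \ldots, A_n$, where
$A_i$ has dimensions $p_{i-1} \times p_i$, fully parenthesize the
product $A_1 A_2 \cdots A_n$ in a way that minimizes
the number of scalar multiplications.
\[ \underbrace{A_1}_{p_0 \times p_1} \; \underbrace{A_2}_{p_1 \times p_2} \cdots \underbrace{A_i}_{p_{i-1} \times p_i} \; \underbrace{A_{i+1}}_{p_i \times p_{i+1}} \cdots \underbrace{A_n}_{p_{n-1} \times p_n} \]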
What is the number of possible
parenthesizations?
• Exhaustively checking all possible
parenthesizations is not efficient!
• It can be shown that the number of
parenthesizations grows as $\Omega(4^n / n^{3/2})$
(see page 333 in your textbook)
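To get a feel for this growth, the count can be computed directly. The sketch below assumes the standard recurrence P(1) = 1, P(n) = Σ_{k=1}^{n-1} P(k) · P(n − k), which counts the ways to split a chain at its last multiplication; the recurrence and the name `count_parenthesizations` are not stated on the slide.

```python
def count_parenthesizations(n):
    """Number of full parenthesizations of a chain of n matrices (n >= 1)."""
    P = [0] * (n + 1)
    P[1] = 1
    for length in range(2, n + 1):
        # Split the chain into a left part of k matrices and a right part
        # of length - k matrices, for every possible k.
        P[length] = sum(P[k] * P[length - k] for k in range(1, length))
    return P[n]


for n in (5, 10, 20):
    print(n, count_parenthesizations(n))
```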
1. The Structure of an Optimal
Parenthesization
• Notation:
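$A_{i..j} = A_i A_{i+1} \cdots A_j$, $i \le j$
• Suppose that an optimal parenthesization of $A_{i..j}$
splits the product between $A_k$ and $A_{k+1}$, where
$i \le k < j$
\[ A_{i..j} = A_i A_{i+1} \cdots A_j = (A_i A_{i+1} \cdots A_k)(A_{k+1} \cdots A_j) = A_{i..k} \, A_{k+1..j} \]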
Optimal Substructure
\[ A_{i..j} = A_{i..k} \, A_{k+1..j} \]
• The parenthesization of the “prefix” $A_{i..k}$ must be an
optimal parenthesization
• If there were a less costly way to parenthesize $A_{i..k}$, we
could substitute that one in the parenthesization of $A_{i..j}$
and produce a parenthesization with a lower cost than
the optimum: a contradiction!
• By the same argument, the parenthesization of the “suffix” $A_{k+1..j}$
must also be optimal
• An optimal solution to an instance of the matrix-
chain multiplication contains within it optimal
solutions to subproblems
2. A Recursive Solution
• Subproblem:
determine the minimum cost of parenthesizing
$A_{i..j} = A_i A_{i+1} \cdots A_j$ for $1 \le i \le j \le n$
• Let m[i, j] = the minimum number of
multiplications needed to compute $A_{i..j}$
– full problem ($A_{1..n}$): m[1, n]
– i = j: $A_{i..i} = A_i$, so m[i, i] = 0, for i = 1, 2, …, n
2. A Recursive Solution
• Consider the subproblem of parenthesizing
$A_{i..j} = A_i A_{i+1} \cdots A_j$ for $1 \le i \le j \le n$
\[ A_{i..j} = A_{i..k} \, A_{k+1..j} \quad \text{for } i \le k < j \]
• Assume that the optimal parenthesization splits
the product $A_i A_{i+1} \cdots A_j$ at k ($i \le k < j$)
\[ m[i, j] = m[i, k] + m[k+1, j] + p_{i-1} p_k p_j \]
– m[i, k]: min # of multiplications to compute $A_{i..k}$
– m[k+1, j]: min # of multiplications to compute $A_{k+1..j}$
– $p_{i-1} p_k p_j$: # of multiplications to compute the product $A_{i..k} \, A_{k+1..j}$
2. A Recursive Solution (cont.)
\[ m[i, j] = m[i, k] + m[k+1, j] + p_{i-1} p_k p_j \]
• We do not know the value of k
– There are j – i possible values for k: k = i, i+1, …, j-1
• Minimizing the cost of parenthesizing the product
$A_i A_{i+1} \cdots A_j$ becomes:
\[
m[i, j] =
\begin{cases}
0 & \text{if } i = j \\
\min_{i \le k < j} \{ m[i, k] + m[k+1, j] + p_{i-1} p_k p_j \} & \text{if } i < j
\end{cases}
\]
3. Computing the Optimal Costs
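\[
m[i, j] =
\begin{cases}
0 & \text{if } i = j \\
\min_{i \le k < j} \{ m[i, k] + m[k+1, j] + p_{i-1} p_k p_j \} & \text{if } i < j
\end{cases}
\]
• Computing the optimal solution recursively takes
exponential time!
• How many subproblems? $\Theta(n^2)$
– Parenthesize $A_{i..j}$ for $1 \le i \le j \le n$
– One problem for each choice of i and j
[Figure: n x n table m, with i = 1..n on the horizontal axis and j = 1..n on the vertical axis]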
3. Computing the Optimal Costs (cont.)
\[
m[i, j] =
\begin{cases}
0 & \text{if } i = j \\
\min_{i \le k < j} \{ m[i, k] + m[k+1, j] + p_{i-1} p_k p_j \} & \text{if } i < j
\end{cases}
\]
• How do we fill in the table m[1..n, 1..n]?
– Determine which entries of the table are used in computing m[i, j]
\[ A_{i..j} = A_{i..k} \, A_{k+1..j} \]
– Both subproblems are chains strictly shorter than the original chain $A_{i..j}$
– Idea: fill in m such that it corresponds to solving problems of
increasing length
3. Computing the Optimal Costs (cont.)
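\[
m[i, j] =
\begin{cases}
0 & \text{if } i = j \\
\min_{i \le k < j} \{ m[i, k] + m[k+1, j] + p_{i-1} p_k p_j \} & \text{if } i < j
\end{cases}
\]
• Length = 1: i = j, i = 1, 2, …, n (computed first)
• Length = 2: j = i + 1, i = 1, 2, …, n-1 (computed second)
• m[1, n] gives the optimal cost for the full problem
• Compute rows from bottom to top and from left to right
[Figure: n x n table m, with i on the horizontal axis and j on the vertical axis; the diagonal of length-1 entries is filled first, the diagonal of length-2 entries second]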
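Example:
\[ m[i, j] = \min_{i \le k < j} \{ m[i, k] + m[k+1, j] + p_{i-1} p_k p_j \} \]
\[
m[2, 5] = \min
\begin{cases}
m[2, 2] + m[3, 5] + p_1 p_2 p_5 & (k = 2) \\
m[2, 3] + m[4, 5] + p_1 p_3 p_5 & (k = 3) \\
m[2, 4] + m[5, 5] + p_1 p_4 p_5 & (k = 4)
\end{cases}
\]
• Values m[i, j] depend only on values that have been
previously computed
[Figure: 6 x 6 table m, with i = 1..6 on the horizontal axis and j = 1..6 on the vertical axis]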
Example:
\[ m[i, j] = \min_{i \le k < j} \{ m[i, k] + m[k+1, j] + p_{i-1} p_k p_j \} \]
Compute A1 A2 A3
• A1: 10 x 100 (p0 x p1)
• A2: 100 x 5 (p1 x p2)
• A3: 5 x 50 (p2 x p3)
m[i, i] = 0 for i = 1, 2, 3
m[1, 2] = m[1, 1] + m[2, 2] + p0p1p2 (A1A2)
= 0 + 0 + 10 * 100 * 5 = 5,000
m[2, 3] = m[2, 2] + m[3, 3] + p1p2p3 (A2A3)
= 0 + 0 + 100 * 5 * 50 = 25,000
m[1, 3] = min of:
m[1, 1] + m[2, 3] + p0p1p3 = 75,000 (A1(A2A3))
m[1, 2] + m[3, 3] + p0p2p3 = 7,500 ((A1A2)A3)
= 7,500
Table m (the optimal split k is shown in parentheses):
        i = 1      i = 2       i = 3
j = 3   7500 (2)   25000 (2)   0
j = 2   5000 (1)   0
j = 1   0
Matrix-Chain-Order(p)
Running time: $O(n^3)$
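A bottom-up computation of the tables can be sketched as follows. This is a minimal Python version written from the recurrence for m[i, j], not a transcription of the slide's pseudocode; the function name `matrix_chain_order` and the 1-based indexing with an unused row and column 0 are choices made for this example.

```python
import math


def matrix_chain_order(p):
    """p[i-1] x p[i] are the dimensions of matrix A_i, for i = 1..n.

    Returns (m, s): m[i][j] is the minimum number of scalar multiplications
    for A_i..A_j, and s[i][j] is a split point k that achieves it.
    """
    n = len(p) - 1
    m = [[0] * (n + 1) for _ in range(n + 1)]
    s = [[0] * (n + 1) for _ in range(n + 1)]
    for length in range(2, n + 1):            # chains of increasing length
        for i in range(1, n - length + 2):
            j = i + length - 1
            m[i][j] = math.inf
            for k in range(i, j):             # try every split i <= k < j
                q = m[i][k] + m[k + 1][j] + p[i - 1] * p[k] * p[j]
                if q < m[i][j]:
                    m[i][j] = q
                    s[i][j] = k
    return m, s


m, s = matrix_chain_order([10, 100, 5, 50])   # A1: 10x100, A2: 100x5, A3: 5x50
print(m[1][3], s[1][3])
```

The three nested loops, each running at most n times, account for the cubic running time.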
4. Construct the Optimal Solution
• In a similar matrix s we keep the optimal values of k
• s[i, j] = a value of k such that an optimal
parenthesization of $A_{i..j}$ splits the product
between $A_k$ and $A_{k+1}$
[Figure: n x n table s, with i on the horizontal axis and j on the vertical axis; entry s[i, j] holds k]
4. Construct the Optimal Solution
• s[1, n] is associated with the entire product $A_{1..n}$
– The final matrix multiplication will be split at k = s[1, n]
\[ A_{1..n} = A_{1..s[1, n]} \, A_{s[1, n]+1..n} \]
– For each subproduct recursively find the
corresponding value of k that results in an optimal
parenthesization
4. Construct the Optimal Solution
• s[i, j] = value of k such that the optimal
parenthesization of $A_i A_{i+1} \cdots A_j$ splits the
product between $A_k$ and $A_{k+1}$
        i = 1   2   3   4   5   6
j = 6       3   3   3   5   5   -
j = 5       3   3   3   4   -
j = 4       3   3   3   -
j = 3       1   2   -
j = 2       1   -
j = 1       -
• s[1, n] = 3 ⇒ $A_{1..6} = A_{1..3} \, A_{4..6}$
• s[1, 3] = 1 ⇒ $A_{1..3} = A_{1..1} \, A_{2..3}$
• s[4, 6] = 5 ⇒ $A_{4..6} = A_{4..5} \, A_{6..6}$
4. Construct the Optimal Solution (cont.)
PRINT-OPT-PARENS(s, i, j)
if i = j
   then print “A”i
   else print “(”
        PRINT-OPT-PARENS(s, i, s[i, j])
        PRINT-OPT-PARENS(s, s[i, j] + 1, j)
        print “)”
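The recursion can be tried out with a short Python sketch. The function name `opt_parens`, the choice to return a string instead of printing, and the dictionary keyed by (i, j) are assumptions for this example; the split values are copied from the 6-matrix s table shown on these slides.

```python
def opt_parens(s, i, j):
    """Return the optimal parenthesization of A_i..A_j as a string."""
    if i == j:
        return f"A{i}"
    k = s[(i, j)]  # split between A_k and A_{k+1}
    return "(" + opt_parens(s, i, k) + opt_parens(s, k + 1, j) + ")"


# s[(i, j)] = k, taken from the slide's table for six matrices.
s = {
    (1, 2): 1, (1, 3): 1, (1, 4): 3, (1, 5): 3, (1, 6): 3,
    (2, 3): 2, (2, 4): 3, (2, 5): 3, (2, 6): 3,
    (3, 4): 3, (3, 5): 3, (3, 6): 3,
    (4, 5): 4, (4, 6): 5,
    (5, 6): 5,
}
print(opt_parens(s, 1, 6))  # ((A1(A2A3))((A4A5)A6))
```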
Example: $A_1 \cdots A_6$ is printed as ( ( A1 ( A2 A3 ) ) ( ( A4 A5 ) A6 ) )
Trace of PRINT-OPT-PARENS (abbreviated P-O-P) on the table s[1..6, 1..6] of the previous example:
P-O-P(s, 1, 6): s[1, 6] = 3
i = 1, j = 6: print “(”, call P-O-P(s, 1, 3): s[1, 3] = 1
i = 1, j = 3: print “(”, call P-O-P(s, 1, 1): prints “A1”
              call P-O-P(s, 2, 3): s[2, 3] = 2
i = 2, j = 3: print “(”, call P-O-P(s, 2, 2): prints “A2”
              call P-O-P(s, 3, 3): prints “A3”
…
Memoization
• Top-down approach with the efficiency of typical dynamic
programming approach
• Maintaining an entry in a table for the solution to each
subproblem
– memoize the inefficient recursive algorithm
• When a subproblem is first encountered its solution is
computed and stored in that table
• Subsequent “calls” to the subproblem simply look up that
value
Memoized Matrix-Chain
Alg.: MEMOIZED-MATRIX-CHAIN(p)
n ← length[p] – 1
for i ← 1 to n
    do for j ← i to n
           do m[i, j] ← ∞
return LOOKUP-CHAIN(p, 1, n)
• Initialize the m table with large values (∞) that indicate
whether the values of m[i, j] have been computed
• Top-down approach
Memoized Matrix-Chain
Alg.: LOOKUP-CHAIN(p, i, j)
if m[i, j] < ∞
   then return m[i, j]
if i = j
   then m[i, j] ← 0
   else for k ← i to j – 1
            do q ← LOOKUP-CHAIN(p, i, k) +
                   LOOKUP-CHAIN(p, k+1, j) + $p_{i-1} p_k p_j$
               if q < m[i, j]
                  then m[i, j] ← q
return m[i, j]
Running time is $O(n^3)$
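The two procedures above translate almost line for line into Python. In this sketch `math.inf` plays the role of the “not yet computed” marker, and nesting the lookup function inside `memoized_matrix_chain` is a convenience chosen here so that the table m is shared without a global variable.

```python
import math


def memoized_matrix_chain(p):
    """Top-down, memoized minimum cost of multiplying A_1..A_n."""
    n = len(p) - 1
    # math.inf marks entries that have not been computed yet.
    m = [[math.inf] * (n + 1) for _ in range(n + 1)]

    def lookup_chain(i, j):
        if m[i][j] < math.inf:        # already computed: just look it up
            return m[i][j]
        if i == j:
            m[i][j] = 0
        else:
            for k in range(i, j):
                q = (lookup_chain(i, k) + lookup_chain(k + 1, j)
                     + p[i - 1] * p[k] * p[j])
                if q < m[i][j]:
                    m[i][j] = q
        return m[i][j]

    return lookup_chain(1, n)


print(memoized_matrix_chain([10, 100, 5, 50]))  # 7500
```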
Dynamic Programming vs. Memoization
• Advantages of dynamic programming vs.
memoized algorithms
– No overhead for recursion, less overhead for
maintaining the table
– The regular pattern of table accesses may be used to
reduce time or space requirements
• Advantages of memoized algorithms vs.
dynamic programming
– Some subproblems do not need to be solved
Elements of Dynamic Programming
• Optimal Substructure
– An optimal solution to a problem contains within it an
optimal solution to subproblems
– An optimal solution to the entire problem is built in a
bottom-up manner from optimal solutions to
subproblems
• Overlapping Subproblems
– If a recursive algorithm revisits the same subproblems
over and over, the problem has overlapping
subproblems
Parameters of Optimal Substructure
• How many subproblems are used in an optimal
solution for the original problem
– Assembly line: One subproblem (the line that gives best time)
– Matrix multiplication: Two subproblems (subproducts Ai..k, Ak+1..j)
• How many choices we have in determining
which subproblems to use in an optimal solution
– Assembly line: Two choices (line 1 or line 2)
– Matrix multiplication: j - i choices for k (splitting the product)
Parameters of Optimal Substructure
• Intuitively, the running time of a dynamic
programming algorithm depends on two factors:
– Number of subproblems overall
– How many choices we look at for each subproblem
• Assembly line
– $\Theta(n)$ subproblems (n stations)
– 2 choices for each subproblem
– $\Theta(n)$ overall
• Matrix multiplication:
– $\Theta(n^2)$ subproblems ($1 \le i \le j \le n$)
– At most n-1 choices
– $\Theta(n^3)$ overall
Longest Common Subsequence
• Given two sequences
$X = \langle x_1, x_2, \ldots, x_m \rangle$
$Y = \langle y_1, y_2, \ldots, y_n \rangle$
find a maximum length common subsequence
(LCS) of X and Y
• E.g.:
X = ⟨A, B, C, B, D, A, B⟩
• Subsequences of X:
– A subset of elements in the sequence taken in order
⟨A, B, D⟩, ⟨B, C, D, B⟩, etc.
Example
X = ⟨A, B, C, B, D, A, B⟩
Y = ⟨B, D, C, A, B, A⟩
⟨B, C, B, A⟩ and ⟨B, D, A, B⟩ are longest common
subsequences of X and Y (length = 4)
⟨B, C, A⟩, however, is a common subsequence but not an LCS of X and Y (length = 3)
Brute-Force Solution
• For every subsequence of X, check whether it’s
a subsequence of Y
• There are $2^m$ subsequences of X to check
• Each subsequence takes $\Theta(n)$ time to check
– scan Y for first letter, from there scan for second, and
so on
• Running time: $\Theta(n 2^m)$
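For comparison with the dynamic programming solution, the brute-force idea can be sketched in a few lines of Python. The names `is_subsequence` and `lcs_brute_force` are illustrative; trying the longest candidates first is a small shortcut chosen here, and the worst case is still exponential in m.

```python
from itertools import combinations


def is_subsequence(z, y):
    """Scan y once: find z[0], continue from there for z[1], and so on."""
    it = iter(y)
    return all(ch in it for ch in z)


def lcs_brute_force(x, y):
    """Check subsequences of x against y, longest candidates first."""
    for r in range(len(x), -1, -1):
        for idx in combinations(range(len(x)), r):
            z = [x[i] for i in idx]
            if is_subsequence(z, y):
                return z
    return []


print(len(lcs_brute_force("ABCBDAB", "BDCABA")))  # 4
```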
Making the choice
X = ⟨A, B, D, E⟩
Y = ⟨Z, B, E⟩
• Choice: include one element into the common
sequence (E) and solve the resulting
subproblem
X = ⟨A, B, D, G⟩
Y = ⟨Z, B, D⟩
• Choice: exclude an element from a string and
solve the resulting subproblem
Notations
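• Given a sequence $X = \langle x_1, x_2, \ldots, x_m \rangle$ we define
the i-th prefix of X, for i = 0, 1, 2, …, m
\[ X_i = \langle x_1, x_2, \ldots, x_i \rangle \]
• c[i, j] = the length of a LCS of the sequences
$X_i = \langle x_1, x_2, \ldots, x_i \rangle$ and $Y_j = \langle y_1, y_2, \ldots, y_j \rangle$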
A Recursive Solution
Case 1: $x_i = y_j$
e.g.: $X_i$ = ⟨A, B, D, E⟩
$Y_j$ = ⟨Z, B, E⟩
\[ c[i, j] = c[i - 1, j - 1] + 1 \]
– Append $x_i = y_j$ to the LCS of $X_{i-1}$ and $Y_{j-1}$
– Must find a LCS of $X_{i-1}$ and $Y_{j-1}$ ⇒ optimal solution to a
problem includes optimal solutions to subproblems
A Recursive Solution
Case 2: $x_i \ne y_j$
e.g.: $X_i$ = ⟨A, B, D, G⟩
$Y_j$ = ⟨Z, B, D⟩
\[ c[i, j] = \max \{ c[i - 1, j], \; c[i, j - 1] \} \]
– Must solve two problems
• find a LCS of $X_{i-1}$ and $Y_j$: $X_{i-1}$ = ⟨A, B, D⟩ and $Y_j$ = ⟨Z, B, D⟩
• find a LCS of $X_i$ and $Y_{j-1}$: $X_i$ = ⟨A, B, D, G⟩ and $Y_{j-1}$ = ⟨Z, B⟩
• Optimal solution to a problem includes optimal
solutions to subproblems
Overlapping Subproblems
• To find a LCS of X and Y
– we may need to find the LCS of X and $Y_{n-1}$ and
that of $X_{m-1}$ and Y
– Both of the above subproblems contain the subproblem of
finding the LCS of $X_{m-1}$ and $Y_{n-1}$
• Subproblems share subsubproblems
3. Computing the Length of the LCS
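\[
c[i, j] =
\begin{cases}
0 & \text{if } i = 0 \text{ or } j = 0 \\
c[i-1, j-1] + 1 & \text{if } x_i = y_j \\
\max(c[i, j-1], \; c[i-1, j]) & \text{if } x_i \ne y_j
\end{cases}
\]
[Figure: table c with rows i = 0..m (labeled $x_i$) and columns j = 0..n (labeled $y_j$); row 0 and column 0 are all 0; row 1 is filled first, row 2 second, and so on]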
Additional Information
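\[
c[i, j] =
\begin{cases}
0 & \text{if } i = 0 \text{ or } j = 0 \\
c[i-1, j-1] + 1 & \text{if } x_i = y_j \\
\max(c[i, j-1], \; c[i-1, j]) & \text{if } x_i \ne y_j
\end{cases}
\]
A matrix b[i, j]:
• For a subproblem [i, j] it tells us what choice was
made to obtain the optimal value
• If $x_i = y_j$
b[i, j] = “↖”
• Else, if c[i - 1, j] ≥ c[i, j-1]
b[i, j] = “↑”
else
b[i, j] = “←”
[Figure: tables b and c with rows i = 0..m (labeled $x_i$: A, B, C, …, D) and columns j = 0..n (labeled $y_j$: A, C, D, …, F); row 0 and column 0 are all 0; entry c[i, j] is computed from its neighbors c[i-1, j] and c[i, j-1]]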
LCS-LENGTH(X, Y, m, n)
for i ← 1 to m
    do c[i, 0] ← 0          // the length of the LCS is zero if one of the sequences is empty
for j ← 0 to n
    do c[0, j] ← 0
for i ← 1 to m
    do for j ← 1 to n
           do if xi = yj                              // Case 1: xi = yj
                 then c[i, j] ← c[i - 1, j - 1] + 1
                      b[i, j] ← “↖”
                 else if c[i - 1, j] ≥ c[i, j - 1]    // Case 2: xi ≠ yj
                         then c[i, j] ← c[i - 1, j]
                              b[i, j] ← “↑”
                         else c[i, j] ← c[i, j - 1]
                              b[i, j] ← “←”
return c and b
Running time: $\Theta(mn)$
Example
X = ⟨A, B, C, B, D, A, B⟩
Y = ⟨B, D, C, A, B, A⟩
\[
c[i, j] =
\begin{cases}
0 & \text{if } i = 0 \text{ or } j = 0 \\
c[i-1, j-1] + 1 & \text{if } x_i = y_j \\
\max(c[i, j-1], \; c[i-1, j]) & \text{if } x_i \ne y_j
\end{cases}
\]
If $x_i = y_j$: b[i, j] = “↖”
Else if c[i - 1, j] ≥ c[i, j-1]: b[i, j] = “↑”
else: b[i, j] = “←”
        j    0  1  2  3  4  5  6
i       yj      B  D  C  A  B  A
0  xi        0  0  0  0  0  0  0
1  A         0  0  0  0  1  1  1
2  B         0  1  1  1  1  2  2
3  C         0  1  1  2  2  2  2
4  B         0  1  1  2  2  3  3
5  D         0  1  2  2  2  3  3
6  A         0  1  2  2  3  3  4
7  B         0  1  2  2  3  4  4
4. Constructing a LCS
• Start at b[m, n] and follow the arrows
• When we encounter a “↖” in b[i, j] ⇒ $x_i = y_j$ is an element
of the LCS
[Figure: the c table for X = ⟨A, B, C, B, D, A, B⟩ and Y = ⟨B, D, C, A, B, A⟩, with the path of arrows from b[7, 6] back to row 0 highlighted]
PRINT-LCS(b, X, i, j)
if i = 0 or j = 0
   then return
if b[i, j] = “↖”
   then PRINT-LCS(b, X, i - 1, j - 1)
        print xi
elseif b[i, j] = “↑”
   then PRINT-LCS(b, X, i - 1, j)
else PRINT-LCS(b, X, i, j - 1)
Running time: $O(m + n)$
Initial call: PRINT-LCS(b, X, length[X], length[Y])
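LCS-LENGTH and PRINT-LCS can be sketched together in Python. The function names, the strings "diag", "up" and "left" standing in for the three arrows, and collecting the characters into a list instead of printing them are all choices made for this example.

```python
def lcs_length(x, y):
    """Fill tables c (lengths) and b (choices) for sequences x and y."""
    m, n = len(x), len(y)
    c = [[0] * (n + 1) for _ in range(m + 1)]   # row 0 and column 0 stay 0
    b = [[None] * (n + 1) for _ in range(m + 1)]
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            if x[i - 1] == y[j - 1]:            # case 1: x_i = y_j
                c[i][j] = c[i - 1][j - 1] + 1
                b[i][j] = "diag"
            elif c[i - 1][j] >= c[i][j - 1]:    # case 2: x_i != y_j
                c[i][j] = c[i - 1][j]
                b[i][j] = "up"
            else:
                c[i][j] = c[i][j - 1]
                b[i][j] = "left"
    return c, b


def collect_lcs(b, x, i, j, out):
    """Follow the arrows from b[i][j]; append LCS elements to out in order."""
    if i == 0 or j == 0:
        return
    if b[i][j] == "diag":
        collect_lcs(b, x, i - 1, j - 1, out)
        out.append(x[i - 1])
    elif b[i][j] == "up":
        collect_lcs(b, x, i - 1, j, out)
    else:
        collect_lcs(b, x, i, j - 1, out)


X, Y = "ABCBDAB", "BDCABA"
c, b = lcs_length(X, Y)
out = []
collect_lcs(b, X, len(X), len(Y), out)
print(c[len(X)][len(Y)], "".join(out))  # 4 BCBA
```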
Improving the Code
• What can we say about how each entry c[i, j] is
computed?
– It depends only on c[i -1, j - 1], c[i - 1, j], and
c[i, j - 1]
– Eliminate table b and compute in O(1) which of the
three values was used to compute c[i, j]
– We save $\Theta(mn)$ space from table b
– However, we do not asymptotically decrease the
auxiliary space requirements: still need table c
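One way to drop table b is to re-derive each choice from c during the walk back from c[m][n]. The sketch below is a minimal Python version under that assumption; `lcs_table` and `lcs_from_c` are illustrative names, and the tie-breaking rule (prefer the entry above) matches the rule used when b is filled.

```python
def lcs_table(x, y):
    """Fill only the length table c (no b table)."""
    m, n = len(x), len(y)
    c = [[0] * (n + 1) for _ in range(m + 1)]
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            if x[i - 1] == y[j - 1]:
                c[i][j] = c[i - 1][j - 1] + 1
            else:
                c[i][j] = max(c[i - 1][j], c[i][j - 1])
    return c


def lcs_from_c(c, x, y):
    """Walk back from c[m][n], deciding each step in O(1) from c alone."""
    i, j = len(x), len(y)
    out = []
    while i > 0 and j > 0:
        if x[i - 1] == y[j - 1]:          # c[i][j] came from the diagonal
            out.append(x[i - 1])
            i, j = i - 1, j - 1
        elif c[i - 1][j] >= c[i][j - 1]:  # c[i][j] came from above
            i -= 1
        else:                             # c[i][j] came from the left
            j -= 1
    return "".join(reversed(out))


X, Y = "ABCBDAB", "BDCABA"
print(lcs_from_c(lcs_table(X, Y), X, Y))  # BCBA
```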
Improving the Code
• If we only need the length of the LCS
– LCS-LENGTH works only on two rows of c at a time
• The row being computed and the previous row
– We can reduce the asymptotic space requirements by
storing only these two rows
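The two-row idea can be sketched as follows; the name `lcs_length_two_rows` is an assumption for this example. Only the previous row and the row being computed are kept, so the extra space is proportional to n rather than mn, at the price of no longer being able to reconstruct the LCS itself.

```python
def lcs_length_two_rows(x, y):
    """Length of an LCS of x and y using two rows of the c table."""
    prev = [0] * (len(y) + 1)              # row i - 1 (row 0 is all zeros)
    for xi in x:
        curr = [0] * (len(y) + 1)          # row i, column 0 is zero
        for j, yj in enumerate(y, start=1):
            if xi == yj:
                curr[j] = prev[j - 1] + 1
            else:
                curr[j] = max(prev[j], curr[j - 1])
        prev = curr
    return prev[len(y)]


print(lcs_length_two_rows("ABCBDAB", "BDCABA"))  # 4
```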