
CSC373 Week 3: Dynamic Programming Nisarg Shah

This document discusses dynamic programming and recaps greedy algorithms. It introduces the weighted interval scheduling problem and shows how a greedy approach fails for this problem. It then presents the dynamic programming solution, which works by breaking the problem into overlapping subproblems and storing previously computed solutions rather than recomputing them. The dynamic programming approach runs in O(n log n) time through memoization or bottom-up computation. It also discusses how to recover the optimal solution subset after computing the optimal value.



CSC373

Week 3: Dynamic Programming

Nisarg Shah



Recap
• Greedy Algorithms
➢ Interval scheduling
➢ Interval partitioning

➢ Minimizing lateness

➢ Huffman encoding

➢…



Jeff Erickson on greedy algorithms…
The 1950s were not good years for mathematical research.
We had a very interesting gentleman in Washington named
Wilson. He was secretary of Defense, and he actually had a
pathological fear and hatred of the word ‘research’. I’m not
using the term lightly; I’m using it precisely. His face would
suffuse, he would turn red, and he would get violent if
people used the term ‘research’ in his presence. You can
imagine how he felt, then, about the term ‘mathematical’.
The RAND Corporation was employed by the Air Force, and
the Air Force had Wilson as its boss, essentially. Hence, I felt
I had to do something to shield Wilson and the Air Force
from the fact that I was really doing mathematics inside the
RAND Corporation. What title, what name, could I choose?

— Richard Bellman, on the origin of his term ‘dynamic programming’ (1984)

Richard Bellman’s quote from Jeff Erickson’s book



Dynamic Programming
• Outline
➢ Break the problem down into simpler subproblems, solve each subproblem just once, and store the solutions.
➢ The next time the same subproblem occurs, instead of recomputing its solution, simply look up the previously computed solution.
➢ Hopefully, we save a lot of computation at the expense of a modest increase in storage space.
➢ This store-and-reuse trick is also called “memoization”

• How is this different from divide & conquer?


Weighted Interval Scheduling
• Problem
➢ Job 𝑗 starts at time 𝑠𝑗 and finishes at time 𝑓𝑗
➢ Each job 𝑗 has a weight 𝑤𝑗

➢ Two jobs are compatible if they don’t overlap


➢ Goal: find a set 𝑆 of mutually compatible jobs with highest total weight ∑𝑗∈𝑆 𝑤𝑗

• Recall: If all 𝑤𝑗 = 1, then this is simply the interval scheduling problem from last week
➢ The greedy algorithm based on earliest-finish-time ordering was optimal for this case



Recall: Interval Scheduling
• What if we simply try to use the earliest-finish-time greedy algorithm again?
➢ Fails spectacularly!



Weighted Interval Scheduling
• What if we use other orderings?
➢ By weight: choose jobs with highest 𝑤𝑗 first
➢ Maximum weight per time: choose jobs with highest
𝑤𝑗 /(𝑓𝑗 − 𝑠𝑗 ) first
➢ ...

• None of them work!


➢ They’re arbitrarily worse than the optimal solution
➢ In fact, under a certain formalization, “no greedy
algorithm” can produce any “decent approximation” in the
worst case (beyond this course!)



Weighted Interval Scheduling
• Convention
➢ Jobs are sorted by finish time: 𝑓1 ≤ 𝑓2 ≤ ⋯ ≤ 𝑓𝑛
➢ 𝑝(𝑗) = largest index 𝑖 < 𝑗 such that job 𝑖 is compatible with job 𝑗 (i.e. 𝑓𝑖 < 𝑠𝑗)

Among the jobs before job 𝑗, the ones compatible with it are precisely 1, …, 𝑖

E.g. 𝑝(8) = 1, 𝑝(7) = 3, 𝑝(2) = 0
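
As an illustration (not from the slides), a minimal Python sketch of computing 𝑝 for all jobs by binary search; the function name and the (start, finish) representation are assumptions:

import bisect

def compute_p(jobs):
    # jobs: list of (start, finish) pairs, sorted by finish time (job j = jobs[j-1])
    finishes = [f for (_, f) in jobs]
    p = [0] * (len(jobs) + 1)                      # p[0] is unused
    for j, (s_j, _) in enumerate(jobs, start=1):
        # number of earlier jobs with f_i < s_j = largest compatible index i
        p[j] = bisect.bisect_left(finishes, s_j, 0, j - 1)
    return p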



Weighted Interval Scheduling
• The DP approach
➢ Let OPT be an optimal solution
➢ Two cases regarding job 𝑛:
o Option 1: Job 𝑛 is in OPT
• Can't use the incompatible jobs 𝑝(𝑛) + 1, … , 𝑛 − 1
• Must select an optimal subset of jobs from {1, … , 𝑝(𝑛)}
o Option 2: Job 𝑛 is not in OPT
• Must select an optimal subset of jobs from {1, … , 𝑛 − 1}
➢ OPT is the better of the two options
➢ Note: In both cases, knowing how to solve every prefix of our ordering is enough to solve the overall problem



Weighted Interval Scheduling
• The DP approach
➢ 𝑂𝑃𝑇(𝑗) = maximum value from compatible jobs in 1, … , 𝑗
➢ Base case: 𝑂𝑃𝑇(0) = 0
➢ Two cases regarding job 𝑗:
o Job 𝑗 is selected: optimal value is 𝑣𝑗 + 𝑂𝑃𝑇(𝑝(𝑗)) (writing 𝑣𝑗 for the weight 𝑤𝑗 of job 𝑗)
o Job 𝑗 is not selected: optimal value is 𝑂𝑃𝑇(𝑗 − 1)
➢ 𝑂𝑃𝑇(𝑗) is the best of both

➢ Bellman equation:
𝑂𝑃𝑇(𝑗) = 0 if 𝑗 = 0
𝑂𝑃𝑇(𝑗) = max{ 𝑂𝑃𝑇(𝑗 − 1), 𝑣𝑗 + 𝑂𝑃𝑇(𝑝(𝑗)) } if 𝑗 > 0



Brute Force Solution
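
The slide's pseudocode for COMPUTE-OPT is not reproduced here; the following is a hedged Python sketch of the recursion just described, assuming 1-indexed lists v[1..n] (job values/weights) and p[1..n] from the previous slide:

def compute_opt(j, v, p):
    # optimal total weight using only jobs 1..j
    if j == 0:
        return 0
    return max(compute_opt(j - 1, v, p),           # job j is not selected
               v[j] + compute_opt(p[j], v, p))     # job j is selected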



Brute Force Solution

• Q: Worst-case running time of COMPUTE-OPT(𝑛)?

a) Θ(𝑛)
b) Θ(𝑛 log 𝑛)
c) Θ(1.618ⁿ)
d) Θ(2ⁿ)



Brute Force Solution
• Brute force running time
➢ It is possible that 𝑝(𝑗) = 𝑗 − 1 for each 𝑗
➢ Then, we call COMPUTE-OPT(𝑗 − 1) twice in COMPUTE-OPT(𝑗)
➢ So this might take 2ⁿ steps

➢ But we can just check if 𝑗 is compatible with 𝑗 − 1, and if so, only execute the part where we select 𝑗
➢ Now the worst case is where 𝑝(𝑗) = 𝑗 − 2 for each 𝑗
➢ Running time: 𝑇(𝑛) = 𝑇(𝑛 − 1) + 𝑇(𝑛 − 2)
o Fibonacci, golden ratio, … ☺
o 𝑇(𝑛) = 𝑂(𝜑ⁿ), where 𝜑 ≈ 1.618



Dynamic Programming
• Why is the runtime high?
➢ Some solutions are being computed many, many times
o E.g. if 𝑝[5] = 3, then COMPUTE-OPT(5) might call COMPUTE-OPT(4)
and COMPUTE-OPT(3)
o But COMPUTE-OPT(4) might in turn call COMPUTE-OPT(3)

• Memoization trick
➢ Simply remember what you’ve already computed, and re-use the answer if needed in future



Dynamic Program: Top-Down
• Let’s store COMPUTE-OPT(j) in 𝑀[𝑗]
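
A hedged Python sketch of this memoized version (the array M caches the answer for each j; names are illustrative):

def m_compute_opt(j, v, p, M):
    # M[j] stores COMPUTE-OPT(j) once it has been computed; None means "not yet"
    if M[j] is None:
        if j == 0:
            M[j] = 0
        else:
            M[j] = max(m_compute_opt(j - 1, v, p, M),
                       v[j] + m_compute_opt(p[j], v, p, M))
    return M[j]

# usage: M = [None] * (n + 1); best = m_compute_opt(n, v, p, M)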



Dynamic Program: Top-Down
• Claim: This memoized version takes 𝑂(𝑛 log 𝑛) time
➢ Sorting jobs takes 𝑂(𝑛 log 𝑛)
➢ It also takes 𝑂(𝑛 log 𝑛) to do 𝑛 binary searches to compute 𝑝(𝑗) for each 𝑗

➢ M-Compute-OPT(𝑗) is called at most once for each 𝑗
➢ Each such call takes 𝑂(1) time, not counting the time taken by its subroutine calls
➢ So M-Compute-OPT(𝑛) takes only 𝑂(𝑛) time

➢ Overall time is 𝑂(𝑛 log 𝑛)



Dynamic Program: Bottom-Up
• Find an order in which to call the functions so that
the sub-solutions are ready when needed
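
For this problem the right order is simply j = 1, …, n, since OPT(j) depends only on smaller indices. A minimal bottom-up sketch (same assumptions as before):

def bottom_up_opt(n, v, p):
    M = [0] * (n + 1)                              # M[0] = OPT(0) = 0
    for j in range(1, n + 1):
        M[j] = max(M[j - 1], v[j] + M[p[j]])       # Bellman equation
    return M                                       # M[n] is the optimal total weight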



Top-Down vs Bottom-Up
• Top-Down may be preferred…
➢ …when not all sub-solutions need to be computed on
some inputs
➢ …because one does not need to think of the “right order”
in which to compute sub-solutions

• Bottom-Up may be preferred…


➢ …when all sub-solutions will anyway need to be
computed
➢ …because it is sometimes faster, as it avoids recursive-call overhead and unnecessary random memory accesses



Optimal Solution
• This approach gave us the optimal value
• What about the actual solution (subset of jobs)?
➢ Typically, this is done by maintaining the optimal value
and an optimal solution for each subproblem
➢ So, we compute two quantities:

𝑂𝑃𝑇(𝑗) = 0 if 𝑗 = 0
𝑂𝑃𝑇(𝑗) = max{ 𝑂𝑃𝑇(𝑗 − 1), 𝑣𝑗 + 𝑂𝑃𝑇(𝑝(𝑗)) } if 𝑗 > 0

𝑆(𝑗) = ∅ if 𝑗 = 0
𝑆(𝑗) = 𝑆(𝑗 − 1) if 𝑗 > 0 ∧ 𝑂𝑃𝑇(𝑗 − 1) ≥ 𝑣𝑗 + 𝑂𝑃𝑇(𝑝(𝑗))
𝑆(𝑗) = {𝑗} ∪ 𝑆(𝑝(𝑗)) if 𝑗 > 0 ∧ 𝑂𝑃𝑇(𝑗 − 1) < 𝑣𝑗 + 𝑂𝑃𝑇(𝑝(𝑗))



Optimal Solution
𝑂𝑃𝑇(𝑗) = 0 if 𝑗 = 0
𝑂𝑃𝑇(𝑗) = max{ 𝑂𝑃𝑇(𝑗 − 1), 𝑣𝑗 + 𝑂𝑃𝑇(𝑝(𝑗)) } if 𝑗 > 0

𝑆(𝑗) = ∅ if 𝑗 = 0
𝑆(𝑗) = 𝑆(𝑗 − 1) if 𝑗 > 0 ∧ 𝑂𝑃𝑇(𝑗 − 1) ≥ 𝑣𝑗 + 𝑂𝑃𝑇(𝑝(𝑗))
𝑆(𝑗) = {𝑗} ∪ 𝑆(𝑝(𝑗)) if 𝑗 > 0 ∧ 𝑂𝑃𝑇(𝑗 − 1) < 𝑣𝑗 + 𝑂𝑃𝑇(𝑝(𝑗))

This works with both top-down (memoization) and bottom-up approaches. In this problem, we can do something simpler: just compute 𝑂𝑃𝑇 first, and later compute 𝑆 using only 𝑂𝑃𝑇.
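
A sketch of that simpler traceback, assuming the table M from the bottom-up computation (M[j] = OPT(j)); it mirrors the tie-breaking in the S(j) recurrence above:

def find_solution(n, v, p, M):
    S, j = [], n
    while j > 0:
        if v[j] + M[p[j]] > M[j - 1]:              # job j is in some optimal solution
            S.append(j)
            j = p[j]
        else:                                      # job j can be skipped
            j -= 1
    return list(reversed(S))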



Optimal Substructure Property
• Dynamic programming applies well to problems that have the optimal substructure property
➢ The optimal solution to a problem contains (or can be computed easily from) optimal solutions to its subproblems.

• Recall: divide-and-conquer also uses this property


➢ You can think of divide-and-conquer as a special case of
dynamic programming, where the two (or more)
subproblems you need to solve don’t “overlap”
➢ So there’s no need for memoization
➢ In dynamic programming, one of the subproblems may in
turn require solution to the other subproblem…



Knapsack Problem
• Problem
➢ 𝑛 items: item 𝑖 provides value 𝑣𝑖 > 0 and has weight 𝑤𝑖 > 0
➢ Knapsack has weight capacity 𝑊
➢ Assumption: 𝑊, each 𝑣𝑖 , and each 𝑤𝑖 is an integer
➢ Goal: pack the knapsack with a collection of items with
highest total value given that their total weight is at most 𝑊



A First Attempt
• Let 𝑂𝑃𝑇(𝑤) = maximum value we can pack with a knapsack of capacity 𝑤
➢ Goal: Compute 𝑂𝑃𝑇(𝑊)
➢ Claim? 𝑂𝑃𝑇(𝑤) must use at least one item 𝑗 with weight ≤ 𝑤 and then optimally pack the remaining capacity of 𝑤 − 𝑤𝑗

➢ Let 𝑤* = min𝑗 𝑤𝑗
➢ 𝑂𝑃𝑇(𝑤) = 0 if 𝑤 < 𝑤*
➢ 𝑂𝑃𝑇(𝑤) = max𝑗:𝑤𝑗≤𝑤 { 𝑣𝑗 + 𝑂𝑃𝑇(𝑤 − 𝑤𝑗) } if 𝑤 ≥ 𝑤*

• Why is this wrong?


➢ It might use an item more than once!
A Refined Attempt
• 𝑂𝑃𝑇(𝑖, 𝑤) = maximum value we can pack using
only items 1, … , 𝑖 given capacity 𝑤
➢ Goal: Compute 𝑂𝑃𝑇(𝑛, 𝑊)
• Consider item 𝑖
➢ If 𝑤𝑖 > 𝑤, then we can’t choose 𝑖. Just use 𝑂𝑃𝑇(𝑖 − 1, 𝑤)
➢ If 𝑤𝑖 ≤ 𝑤, there are two cases:
o If we choose 𝑖, the best is 𝑣𝑖 + 𝑂𝑃𝑇 𝑖 − 1, 𝑤 − 𝑤𝑖
o If we don’t choose 𝑖, the best is 𝑂𝑃𝑇(𝑖 − 1, 𝑤)
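
A hedged bottom-up Python sketch of this recurrence (items are 1-indexed, with values v[1..n] and weights wt[1..n]; names are assumptions):

def knapsack(n, W, v, wt):
    # OPT[i][w] = max value using items 1..i with capacity w
    OPT = [[0] * (W + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        for w in range(W + 1):
            if wt[i] > w:
                OPT[i][w] = OPT[i - 1][w]          # item i cannot fit
            else:
                OPT[i][w] = max(OPT[i - 1][w],                    # skip item i
                                v[i] + OPT[i - 1][w - wt[i]])     # take item i
    return OPT[n][W]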



Running Time
• Consider possible evaluations 𝑂𝑃𝑇(𝑖, 𝑤)
➢ 𝑖 ∈ 1, … , 𝑛
➢ 𝑤 ∈ {1, … , 𝑊} (recall weights and capacity are integers)
➢ There are 𝑂(𝑛 ⋅ 𝑊) possible evaluations of 𝑂𝑃𝑇
➢ Each is evaluated at most once (memoization)
➢ Each takes 𝑂(1) time to evaluate
➢ So the total running time is 𝑂(𝑛 ⋅ 𝑊)

• Q: Is this polynomial in the input size?

➢ A: No! 𝑊 is encoded using only 𝑂(log 𝑊) bits, so 𝑛 ⋅ 𝑊 can be exponential in the input size. But it's pseudo-polynomial (polynomial in 𝑛 and the numeric value 𝑊).



What if…?
• Note that this algorithm runs in polynomial time
when 𝑊 is polynomially bounded in the length of
the input

• Q: What if instead of 𝑊, 𝑤1, … , 𝑤𝑛 being small integers, we were told that 𝑣1, … , 𝑣𝑛 are small integers?
➢ Then we can use a different dynamic programming
approach!



A Different DP
• 𝑂𝑃𝑇(𝑖, 𝑣) = minimum capacity needed to pack a
total value of at least 𝑣 using items 1, … , 𝑖
➢ Goal: Compute max{ 𝑣 ∈ {1, … , 𝑉} : 𝑂𝑃𝑇(𝑛, 𝑣) ≤ 𝑊 }
• Consider item 𝑖
➢ If we choose 𝑖, we need capacity 𝑤𝑖 + 𝑂𝑃𝑇(𝑖 − 1, 𝑣 − 𝑣𝑖 )
➢ If we don’t choose 𝑖, we need capacity 𝑂𝑃𝑇 𝑖 − 1, 𝑣

𝑂𝑃𝑇(𝑖, 𝑣) = 0 if 𝑣 ≤ 0
𝑂𝑃𝑇(𝑖, 𝑣) = ∞ if 𝑣 > 0, 𝑖 = 0
𝑂𝑃𝑇(𝑖, 𝑣) = min{ 𝑤𝑖 + 𝑂𝑃𝑇(𝑖 − 1, 𝑣 − 𝑣𝑖), 𝑂𝑃𝑇(𝑖 − 1, 𝑣) } if 𝑣 > 0, 𝑖 > 0
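
A matching sketch for this recurrence (min capacity to reach total value at least v), under the same 1-indexed assumptions, with V = v1 + ⋯ + vn:

def knapsack_by_value(n, W, v, wt):
    V = sum(v[1:])
    INF = float("inf")
    # OPT[i][val] = min capacity needed to pack total value >= val using items 1..i
    OPT = [[0 if val == 0 else INF for val in range(V + 1)] for _ in range(n + 1)]
    for i in range(1, n + 1):
        for val in range(1, V + 1):
            take = wt[i] + OPT[i - 1][max(0, val - v[i])]
            skip = OPT[i - 1][val]
            OPT[i][val] = min(take, skip)
    # answer: largest total value achievable within capacity W
    return max(val for val in range(V + 1) if OPT[n][val] <= W)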



A Different DP
• 𝑂𝑃𝑇(𝑖, 𝑣) = minimum capacity needed to pack a
total value of at least 𝑣 using items 1, … , 𝑖
➢ Goal: Compute max{ 𝑣 ∈ {1, … , 𝑉} : 𝑂𝑃𝑇(𝑛, 𝑣) ≤ 𝑊 }
• This approach has running time 𝑂(𝑛 ⋅ 𝑉), where
𝑉 = 𝑣1 + ⋯ + 𝑣𝑛
• So we can get 𝑂(𝑛 ⋅ 𝑊) or 𝑂(𝑛 ⋅ 𝑉)

• Can we remove the dependence on both 𝑉 and 𝑊?


➢ Not likely. Knapsack problem is NP-complete (we’ll see
later).



Looking Ahead: FPTAS
• While we cannot hope to solve the problem exactly in time 𝑂(𝑝𝑜𝑙𝑦(𝑛, log 𝑊, log 𝑉))…
➢ For any 𝜖 > 0, we can get a value that is within a (1 + 𝜖) multiplicative factor of the optimal value in time 𝑂(𝑝𝑜𝑙𝑦(𝑛, log 𝑊, log 𝑉, 1/𝜖))
➢ Such algorithms are known as fully polynomial-time approximation schemes (FPTAS)
➢ Core idea behind FPTAS for knapsack:
o Approximate all weights and values up to the desired precision
o Solve knapsack on approximate input using DP
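
A rough Python sketch of the standard value-scaling scheme; the particular scaling factor K = 𝜖·vmax/n and the (1 − 𝜖) guarantee are the textbook choices, not stated on the slide:

def fptas_knapsack(W, v, wt, eps):
    # v, wt: 1-indexed lists (index 0 unused); assumes each item fits on its own (wt[i] <= W)
    n = len(v) - 1
    K = eps * max(v[1:]) / n                        # scaling factor
    vs = [0] + [int(v[i] / K) for i in range(1, n + 1)]   # scaled-down values
    V = sum(vs)
    INF = float("inf")
    # minw[i][p] = min weight achieving scaled value >= p using items 1..i
    minw = [[0 if p == 0 else INF for p in range(V + 1)] for _ in range(n + 1)]
    take = [[False] * (V + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        for p in range(V + 1):
            skip = minw[i - 1][p]
            with_i = wt[i] + minw[i - 1][max(0, p - vs[i])]
            minw[i][p] = min(skip, with_i)
            take[i][p] = with_i < skip
    best_p = max(p for p in range(V + 1) if minw[n][p] <= W)
    items, p = [], best_p                           # trace back the chosen items
    for i in range(n, 0, -1):
        if take[i][p]:
            items.append(i)
            p = max(0, p - vs[i])
    return items                                    # total value >= (1 - eps) * optimal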



Single-Source Shortest Paths
• Problem
➢ Input: A directed graph 𝐺 = (𝑉, 𝐸) with edge lengths ℓ𝑣𝑤
on each edge (𝑣, 𝑤), and a source vertex 𝑠
➢ Goal: Compute the length of the shortest path from 𝑠 to
every vertex 𝑡

• When ℓ𝑣𝑤 ≥ 0 for each (𝑣, 𝑤)…


➢ Dijkstra’s algorithm can be used for this purpose
➢ But it fails when some edge lengths can be negative
➢ What do we do in this case?



Single-Source Shortest Paths
• Cycle length = sum of lengths of edges in the cycle
• If there is a negative length cycle, shortest paths
are not even well defined…
➢ You can traverse the cycle arbitrarily many times to get
arbitrarily “short” paths



Single-Source Shortest Paths
• But if there are no negative cycles…
➢ Shortest paths are well-defined even when some of the
edge lengths may be negative

• Claim: With no negative cycles, there is always a shortest path from any vertex to any other vertex that is simple
➢ Consider the shortest 𝑠 ⇝ 𝑡 path with the fewest edges among all shortest 𝑠 ⇝ 𝑡 paths
➢ If it has a cycle, removing the cycle creates a path with fewer edges that is no longer than the original path



Optimal Substructure Property
• Consider a simple shortest 𝑠 ⇝ 𝑡 path 𝑃
➢ It could be just a single edge
➢ But if 𝑃 has more than one edge, consider the vertex 𝑢 which immediately precedes 𝑡 in the path
➢ If 𝑠 ⇝ 𝑡 is shortest, 𝑠 ⇝ 𝑢 must be shortest as well, and it must use one fewer edge than the 𝑠 ⇝ 𝑡 path



Optimal Substructure Property
• 𝑂𝑃𝑇(𝑡, 𝑖) = length of the shortest path from 𝑠 to 𝑡 using at most 𝑖 edges
• Then:
➢ Either this path uses at most 𝑖 − 1 edges ⇒ 𝑂𝑃𝑇(𝑡, 𝑖 − 1)
➢ Or it uses 𝑖 edges, ending with some edge (𝑢, 𝑡) ⇒ min𝑢 { 𝑂𝑃𝑇(𝑢, 𝑖 − 1) + ℓ𝑢𝑡 }



Optimal Substructure Property
• 𝑂𝑃𝑇(𝑡, 𝑖) = length of the shortest path from 𝑠 to 𝑡 using at most 𝑖 edges
• Then:
➢ Either this path uses at most 𝑖 − 1 edges ⇒ 𝑂𝑃𝑇(𝑡, 𝑖 − 1)
➢ Or it uses 𝑖 edges, ending with some edge (𝑢, 𝑡) ⇒ min𝑢 { 𝑂𝑃𝑇(𝑢, 𝑖 − 1) + ℓ𝑢𝑡 }

𝑂𝑃𝑇(𝑡, 𝑖) = 0 if 𝑖 = 0 ∨ 𝑡 = 𝑠
𝑂𝑃𝑇(𝑡, 𝑖) = ∞ if 𝑖 = 0 ∧ 𝑡 ≠ 𝑠
𝑂𝑃𝑇(𝑡, 𝑖) = min{ 𝑂𝑃𝑇(𝑡, 𝑖 − 1), min𝑢 { 𝑂𝑃𝑇(𝑢, 𝑖 − 1) + ℓ𝑢𝑡 } } otherwise

➢ Running time: 𝑂(𝑛²) calls, each takes 𝑂(𝑛) time ⇒ 𝑂(𝑛³)

➢ Q: What do you need to store to also get the actual paths?
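
A hedged Python sketch of this DP with the rolling-array optimization (the edge-list representation and names are assumptions); storing a predecessor for each vertex alongside the update is one way to answer the question above:

import math

def shortest_paths_dp(n, edges, s):
    # edges: list of (u, t, length) triples; vertices are 0..n-1; no negative cycles
    prev = [math.inf] * n
    prev[s] = 0                                     # OPT(t, 0)
    for i in range(1, n):                           # n-1 edges suffice for simple paths
        cur = prev[:]                               # option: keep OPT(t, i-1)
        for (u, t, length) in edges:                # option: end with the edge (u, t)
            if prev[u] + length < cur[t]:
                cur[t] = prev[u] + length
        prev = cur
    return prev                                     # prev[t] = shortest s -> t distance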



Side Notes
• Bellman-Ford-Moore algorithm
➢ Improvement over this DP
➢ Running time remains 𝑂(𝑚 ⋅ 𝑛) for 𝑛 vertices, 𝑚 edges
➢ But the space complexity reduces to 𝑂(𝑚 + 𝑛)



Maximum Length Paths?
• Can we use a similar DP to compute maximum
length paths from 𝑠 to all other vertices?

• This is well defined when there are no positive cycles, in which case, yes.

• What if there are positive cycles, but we want maximum length simple paths?



Maximum Length Paths?
• What goes wrong?
➢ Our DP doesn't work because its path from 𝑠 to 𝑡 might use a path from 𝑠 to 𝑢 and an edge from 𝑢 to 𝑡
➢ But the path from 𝑠 to 𝑢 might in turn go through 𝑡
➢ So the overall path may no longer remain simple

• In fact, maximum length simple path is NP-hard


➢ Hamiltonian path problem (i.e. is there a path of length
𝑛 − 1 in a given undirected graph?) is a special case



All-Pairs Shortest Paths
• Problem
➢ Input: A directed graph 𝐺 = (𝑉, 𝐸) with edge lengths ℓ𝑣𝑤
on each edge (𝑣, 𝑤) and no negative cycles
➢ Goal: Compute the length of the shortest path from all
vertices 𝑠 to all other vertices 𝑡

• Simple idea:
➢ Run single-source shortest paths from each source 𝑠
➢ Running time is 𝑂(𝑛⁴)
➢ Actually, we can do this in 𝑂(𝑛³) as well



All-Pairs Shortest Paths
• Problem
➢ Input: A directed graph 𝐺 = (𝑉, 𝐸) with edge lengths ℓ𝑣𝑤
on each edge (𝑣, 𝑤) and no negative cycles
➢ Goal: Compute the length of the shortest path from all
vertices 𝑠 to all other vertices 𝑡

• 𝑂𝑃𝑇(𝑢, 𝑣, 𝑘) = length of the shortest simple path from 𝑢 to 𝑣 in which all intermediate nodes are from {1, … , 𝑘}
• Exercise: Write down the recursion formula of 𝑂𝑃𝑇 such that, given the subsolutions, it requires 𝑂(1) time (see the sketch below for one standard answer)
• Running time: 𝑂(𝑛³) calls, 𝑂(1) per call ⇒ 𝑂(𝑛³)
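
One standard way to fill in the exercise is the Floyd–Warshall recurrence 𝑂𝑃𝑇(𝑢, 𝑣, 𝑘) = min{ 𝑂𝑃𝑇(𝑢, 𝑣, 𝑘 − 1), 𝑂𝑃𝑇(𝑢, 𝑘, 𝑘 − 1) + 𝑂𝑃𝑇(𝑘, 𝑣, 𝑘 − 1) }; a hedged Python sketch (dist is an n × n matrix with ℓ𝑣𝑤 for edges, 0 on the diagonal, and ∞ for missing edges):

def all_pairs_shortest_paths(dist):
    n = len(dist)
    d = [row[:] for row in dist]                    # d[u][v] = OPT(u, v, k) after round k
    for k in range(n):                              # allow vertex k as an intermediate node
        for u in range(n):
            for v in range(n):
                if d[u][k] + d[k][v] < d[u][v]:
                    d[u][v] = d[u][k] + d[k][v]
    return d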
Chain Matrix Product
• Problem
➢ Input: Matrices 𝑀1 , … , 𝑀𝑛 where the dimension of 𝑀𝑖 is
𝑑𝑖−1 × 𝑑𝑖
➢ Goal: Compute 𝑀1 ⋅ 𝑀2 ⋅ … ⋅ 𝑀𝑛

• But matrix multiplication is associative

➢ 𝐴 ⋅ (𝐵 ⋅ 𝐶) = (𝐴 ⋅ 𝐵) ⋅ 𝐶
➢ So isn't the optimal solution going to call the algorithm for multiplying two matrices exactly 𝑛 − 1 times?
➢ Insight: the time it takes to multiply two matrices
depends on their dimensions



Chain Matrix Product
• Assume
➢ We use the brute force approach for matrix multiplication
➢ So multiplying 𝑝 × 𝑞 and 𝑞 × 𝑟 matrices requires 𝑝 ⋅ 𝑞 ⋅ 𝑟
operations

• Example
➢ 𝑀1 is 5 × 10, 𝑀2 is 10 × 100, and 𝑀3 is 100 × 50
➢ (𝑀1 ⋅ 𝑀2) ⋅ 𝑀3 requires 5 ⋅ 10 ⋅ 100 + 5 ⋅ 100 ⋅ 50 = 30,000 ops
➢ 𝑀1 ⋅ (𝑀2 ⋅ 𝑀3) requires 10 ⋅ 100 ⋅ 50 + 5 ⋅ 10 ⋅ 50 = 52,500 ops



Chain Matrix Product
• Note
➢ Our input is simply the dimensions 𝑑0 , 𝑑1 , … , 𝑑𝑛 and not
the actual matrices

• Why is DP right for this problem?


➢ Optimal substructure property
➢ Think of the final product computed, say 𝐴 ⋅ 𝐵
➢ 𝐴 is the product of some prefix, 𝐵 is the product of the
remaining suffix
➢ For the overall optimal computation, each of 𝐴 and 𝐵
should be computed optimally



Chain Matrix Product
• 𝑂𝑃𝑇(𝑖, 𝑗) = min ops required to compute 𝑀𝑖 ⋅ … ⋅ 𝑀𝑗
➢ Here, 1 ≤ 𝑖 ≤ 𝑗 ≤ 𝑛
➢ Q: Why do we not just care about prefixes and suffixes?
o 𝑀1 ⋅ 𝑀2 ⋅ 𝑀3 ⋅ 𝑀4 ⋅ 𝑀5 ⇒ need to know optimal solution for
𝑀2 ⋅ 𝑀3 ⋅ 𝑀4

𝑂𝑃𝑇(𝑖, 𝑗) = 0 if 𝑖 = 𝑗
𝑂𝑃𝑇(𝑖, 𝑗) = min{ 𝑂𝑃𝑇(𝑖, 𝑘) + 𝑂𝑃𝑇(𝑘 + 1, 𝑗) + 𝑑𝑖−1 𝑑𝑘 𝑑𝑗 ∶ 𝑖 ≤ 𝑘 < 𝑗 } if 𝑖 < 𝑗

➢ Running time: 𝑂(𝑛²) calls, 𝑂(𝑛) time per call ⇒ 𝑂(𝑛³)
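
A hedged bottom-up sketch of this recurrence (dims = [d0, d1, …, dn], so 𝑀𝑖 is dims[i−1] × dims[i]; names are illustrative):

def matrix_chain_order(dims):
    n = len(dims) - 1                               # number of matrices
    # OPT[i][j] = min ops to compute M_i * ... * M_j (1-indexed); OPT[i][i] = 0
    OPT = [[0] * (n + 1) for _ in range(n + 1)]
    for length in range(2, n + 1):                  # chain length j - i + 1
        for i in range(1, n - length + 2):
            j = i + length - 1
            OPT[i][j] = min(OPT[i][k] + OPT[k + 1][j] + dims[i - 1] * dims[k] * dims[j]
                            for k in range(i, j))
    return OPT[1][n]

# e.g. matrix_chain_order([5, 10, 100, 50]) returns 30000, matching the example above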



Chain Matrix Product (this slide is not in the scope of the course)

• Can we do better?
➢ Surprisingly, yes. But not by a DP algorithm (that I know of)
➢ Hu & Shing (1981) developed 𝑂(𝑛 log 𝑛) time algorithm by
reducing chain matrix product to the problem of
“optimally” triangulating a regular polygon
Source: Wikipedia

Example
• 𝐴 is 10 × 30, 𝐵 is 30 × 5, 𝐶 is 5 × 60
• The cost of each triangle is the product of the values at its vertices
• Want to minimize the total cost of all triangles

