Algorithms Illuminated: Part 3: Greedy Algorithms and Dynamic Programming Tim Roughgarden
© 2019 by Tim Roughgarden
All rights reserved. No portion of this book may be reproduced in any form
without permission from the publisher, except as permitted by U.S. copyright
law.
First Edition
Contents

Preface
14 Huffman Codes
    14.1 Codes
    14.2 Codes as Trees
    14.3 Huffman's Greedy Algorithm
    *14.4 Proof of Correctness
    Problems
Index
Preface
This series of books has only one goal: to teach the basics of algorithms
in the most accessible way possible. Think of them as a transcript
of what an expert algorithms tutor would say to you over a series of
one-on-one lessons.
There are a number of excellent more traditional and encyclopedic
textbooks about algorithms, any of which usefully complement this
book series with additional details, problems, and topics. I encourage
you to explore and find your own favorites. There are also several
books that, unlike these books, cater to programmers looking for
ready-made algorithm implementations in a specific programming
language. Many such implementations are freely available on the Web
as well.
Additional Resources
These books are based on online courses that are currently running
on the Coursera and Stanford Lagunita platforms. I’ve made several
resources available to help you replicate as much of the online course
experience as you like.
Videos. If you’re more in the mood to watch and listen than
to read, check out the YouTube video playlists available from
www.algorithmsilluminated.org. These videos cover all the topics
in this book series, as well as additional advanced topics. I hope they
exude a contagious enthusiasm for algorithms that, alas, is impossible
to replicate fully on the printed page.
Quizzes. How can you know if you’re truly absorbing the concepts
in this book? Quizzes with solutions and explanations are scattered
throughout the text; when you encounter one, I encourage you to
pause and think about the answer before reading on.
End-of-chapter problems. At the end of each chapter you'll find
several relatively straightforward questions for testing your understanding.
Acknowledgments
These books would not exist without the passion and hunger supplied
by the hundreds of thousands of participants in my algorithms courses
over the years. I am particularly grateful to those who supplied
detailed feedback on an earlier draft of this book: Tonya Blust, Yuan
Cao, Carlos Guia, Jim Humelsine, Vladimir Kokshenev, Bayram
Kuliyev, and Daniel Zingaro.
I always appreciate suggestions and corrections from readers.
These are best communicated through the discussion forums mentioned above.
Tim Roughgarden
New York, NY
April 2019
Chapter 16

Introduction to Dynamic Programming

Pep Talk
I’m not going to tell you what dynamic programming is just yet.
Instead, we’ll devise from scratch an algorithm for a tricky and
concrete computational problem, which will force us to develop a
number of new ideas. After we’ve solved the problem, we’ll zoom out
and identify the ingredients of our solution that exemplify the general
principles of dynamic programming. Then, armed with a template
for developing dynamic programming algorithms and an example
instantiation, we’ll tackle increasingly challenging applications of the
paradigm.
Quiz 16.1
How many different independent sets does a complete graph
with 5 vertices have? What about a cycle graph with 5 vertices?

a) 1 and 2 (respectively)

b) 5 and 10

c) 6 and 11

d) 6 and 16
Output: An independent set S ⊆ V of G with the
maximum-possible sum \sum_{v \in S} w_v of vertex weights.
[Figure: a four-vertex path graph with vertex weights 1, 4, 5, 4]
This graph has 8 independent sets: the empty set, the four singleton
sets, the first and third vertices, the first and fourth vertices, and the
second and fourth vertices. The last of these has the largest total
weight of 8. The number of independent sets of a path graph grows
exponentially with the number of vertices (do you see why?), so there
is no hope of solving the problem via exhaustive search, except in the
tiniest of instances.
S := ∅
sort vertices of V by weight
for each v ∈ V, in nonincreasing order of weight do
    if S ∪ {v} is an independent set of G then
        S := S ∪ {v}
return S
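For concreteness, here is a minimal Python sketch of this greedy heuristic, assuming the path graph is represented simply as a list of vertex weights (the function name greedy_wis and this representation are illustrative choices, not from the original text):

    def greedy_wis(weights):
        """Greedy heuristic for WIS on a path graph, given as a list of
        vertex weights: scan vertices from heaviest to lightest, adding
        a vertex whenever neither of its neighbors has been chosen."""
        n = len(weights)
        chosen = [False] * n
        for v in sorted(range(n), key=lambda u: weights[u], reverse=True):
            # In a path graph, vertex v's only neighbors are v-1 and v+1.
            left_ok = (v == 0) or not chosen[v - 1]
            right_ok = (v == n - 1) or not chosen[v + 1]
            if left_ok and right_ok:
                chosen[v] = True
        return {v for v in range(n) if chosen[v]}

    # On the four-vertex path with weights 1, 4, 5, 4, the greedy
    # algorithm picks the weight-5 vertex and then the weight-1 vertex,
    # for total weight 6 (see Quiz 16.2).
    print(greedy_wis([1, 4, 5, 4]))  # {0, 2}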
Quiz 16.2
What is the total weight of the output of the greedy algorithm
when the input graph is the four-vertex path on page 105? Is
this the maximum possible?
a) 6; no
b) 6; yes
c) 8; no
d) 8; yes
G_1 := first half of G
G_2 := second half of G
S_1 := recursively solve the WIS problem on G_1
S_2 := recursively solve the WIS problem on G_2
combine S_1, S_2 into a solution S for G
return S
[Figure: the four-vertex path graph with vertex weights 1, 4, 5, 4, split into two two-vertex halves]
the first and second recursive calls return the second and third vertices
as the optimal solutions to their respective subproblems. The union
of their solutions is not an independent set due to the conflict at the
boundary between the two solutions. It’s easy to see how to defuse a
border conflict when the input graph has only four vertices; when it
has hundreds or thousands of vertices, not so much.1
Can we do better than a greedy or divide-and-conquer algorithm?
¹ The problem can be solved in O(n²) time by a divide-and-conquer algorithm
that makes four recursive calls rather than two, where n is the number of vertices.
(Do you see how to do this?) Our dynamic programming algorithm for the problem
will run in O(n) time.
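To see the border conflict concretely, here is a hedged Python sketch of one level of the divide-and-conquer idea; the helper brute_force_mwis (exhaustive search over all subsets, with its name and list-of-weights representation assumed for illustration) stands in for the recursive calls:

    from itertools import combinations

    def brute_force_mwis(weights):
        """Exhaustive search: try every subset of vertices of a path graph
        (given as a list of weights) and keep a max-weight independent set."""
        n = len(weights)
        best, best_weight = set(), 0
        for size in range(1, n + 1):
            for subset in combinations(range(n), size):
                # Independent in a path <=> no two chosen vertices adjacent.
                if all(b - a > 1 for a, b in zip(subset, subset[1:])):
                    weight = sum(weights[v] for v in subset)
                    if weight > best_weight:
                        best, best_weight = set(subset), weight
        return best

    def divide_and_conquer_wis(weights):
        """Optimally solve each half, then naively union the results; the
        union can fail to be independent at the border between the halves."""
        mid = len(weights) // 2
        left = brute_force_mwis(weights[:mid])
        right = {mid + v for v in brute_force_mwis(weights[mid:])}
        return left | right

    # On the path with weights 1, 4, 5, 4, the halves' optimal solutions
    # are the second and third vertices, and their union is not independent.
    print(divide_and_conquer_wis([1, 4, 5, 4]))  # {1, 2} -- adjacent!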
with total weight W . What can we say about it? Here’s a tautology: S
either contains the final vertex vn , or it doesn’t. Let’s examine these
cases in reverse order.
[Figure: the four-vertex path graph with vertices v_1, v_2, v_3, v_4 and weights 1, 4, 5, 4; the subgraph G_{n−2} comprises all but the last two vertices]
(i) an MWIS of G_{n−1}; or

(ii) an MWIS of G_{n−2}, supplemented with v_n.
Lemma 16.1 singles out the only two possibilities for an MWIS, so
whichever option has larger total weight is an optimal solution. We
therefore have a recursive formula—a recurrence—for the total weight
of an MWIS:
W_n = max{W_{n−1}, W_{n−2} + w_n},

where the first term corresponds to Case 1 and the second to Case 2.
The same reasoning applies to the prefix graph G_i for every i ≥ 2:

W_i = max{W_{i−1}, W_{i−2} + w_i}.
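Translated directly into Python (a sketch; the 1-indexed vertices of the text become a 0-indexed list here), the recurrence yields the following recursive algorithm:

    def wis_recursive(weights, i=None):
        """Total weight of an MWIS of the first i vertices of a path graph,
        computed straight from the recurrence
        W_i = max(W_{i-1}, W_{i-2} + w_i). As Quiz 16.3 explores, this
        runs in exponential time."""
        if i is None:
            i = len(weights)
        if i <= 0:                       # base case #1: empty graph
            return 0
        if i == 1:                       # base case #2: single vertex
            return weights[0]
        return max(wis_recursive(weights, i - 1),                   # Case 1
                   wis_recursive(weights, i - 2) + weights[i - 1])  # Case 2

    print(wis_recursive([1, 4, 5, 4]))  # 8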
Quiz 16.3
What is the asymptotic running time of the recursive WIS algorithm,
as a function of the number n of vertices? (Choose the strongest
correct statement.)
a) O(n)

b) O(n log n)

c) O(n²)

d) None of the above

⁴ The proof proceeds by induction on the number n of vertices. The base
cases (n = 0, 1) are clearly correct. For the inductive step (n ≥ 2), the inductive
hypothesis guarantees that S_1 and S_2 are indeed MWISs of G_{n−1} and G_{n−2},
respectively. Lemma 16.1 implies that the better of S_1 and S_2 ∪ {v_n} is an MWIS
of G, and this is the output of the algorithm.
Quiz 16.3 shows that our recursive WIS algorithm is no better than
exhaustive search. The next quiz contains the key to unlocking a
radical running time improvement. Think about it carefully before
reading the solution.
Quiz 16.4
Over the entire run of the recursive WIS algorithm, how many
distinct subproblems are ever solved, as a function of the
number n of vertices?

a) Θ(1)⁵

b) Θ(n)

c) Θ(n²)

d) 2^Θ(n)
Quiz 16.4 implies that the exponential running time of our recursive
WIS algorithm stems solely from its absurd redundancy, solving the
same subproblems from scratch over, and over, and over, and over
again. Here’s an idea: The first time we solve a subproblem, why not
save the result in a cache once and for all? Then, if we encounter the
same subproblem again later, we can look up its answer in constant
time rather than recomputing it from scratch.

⁵ If big-O notation is analogous to "less than or equal," then big-theta notation is
analogous to "equal." Formally, a function f(n) is Θ(g(n)) if there are constants c_1
and c_2 such that f(n) is wedged between c_1 · g(n) and c_2 · g(n) for all sufficiently
large n.
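In Python, this caching idea is one decorator away; here is a sketch using the standard library's functools.lru_cache as the subproblem cache:

    from functools import lru_cache

    def wis_memoized(weights):
        """The same recurrence as before, but each subproblem's answer is
        cached the first time it's computed. Only the O(n) distinct
        subproblems (Quiz 16.4) are ever solved, so the running time
        drops to O(n)."""
        @lru_cache(maxsize=None)
        def W(i):
            if i == 0:
                return 0
            if i == 1:
                return weights[0]
            return max(W(i - 1), W(i - 2) + weights[i - 1])
        return W(len(weights))

    print(wis_memoized([1, 4, 5, 4]))  # 8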
16.2 A Linear-Time Algorithm for WIS in Paths
WIS

Input: the weights w_1, w_2, ..., w_n of the vertices of a path graph G.
Output: the total weight of a maximum-weight independent set of G.

A := length-(n+1) array                    // subproblem solutions
A[0] := 0                                  // base case #1
A[1] := w_1                                // base case #2
for i = 2 to n do
    A[i] := max{A[i−1], A[i−2] + w_i}      // Corollary 16.2
return A[n]

vertex weights:         3  2  1  6  4  5
subproblem array A:  0  3  3  4  9  9  14
Each array entry A[i] stores the total weight of an MWIS of the graph G_i, which comprises
the first i vertices and i − 1 edges of the input graph. This follows
from an inductive argument similar to the one in footnote 4. The base
cases A[0] and A[1] are clearly correct. When computing A[i] with
i ≥ 2, by induction, the values A[i−1] and A[i−2] are indeed the
total weights of MWISs of G_{i−1} and G_{i−2}, respectively. Corollary 16.2
then implies that A[i] is computed correctly, as well. In the example
above, the total weight of an MWIS in the original input graph is the
value in the final array entry (14), corresponding to the independent
set consisting of the first, fourth, and sixth vertices.
Theorem 16.3 (Properties of WIS) For every path graph and non-
negative vertex weights, the WIS algorithm runs in linear time and
returns the total weight of a maximum-weight independent set.
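Here is the bottom-up WIS algorithm of Section 16.2 as a Python sketch, returning the entire subproblem array (which the WIS Reconstruction postprocessing step will reuse):

    def wis(weights):
        """Bottom-up WIS on a path graph (list of vertex weights): A[i]
        holds the total weight of an MWIS of the prefix graph G_i.
        Runs in O(n) time."""
        n = len(weights)
        A = [0] * (n + 1)                # A[0] = 0 is base case #1
        if n >= 1:
            A[1] = weights[0]            # base case #2
        for i in range(2, n + 1):
            A[i] = max(A[i - 1],                   # Case 1: v_i excluded
                       A[i - 2] + weights[i - 1])  # Case 2: v_i included
        return A

    # The running example: weights 3, 2, 1, 6, 4, 5.
    print(wis([3, 2, 1, 6, 4, 5]))  # [0, 3, 3, 4, 9, 9, 14]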
[Figure: the prefix subgraphs G_1, G_2, G_3, G_4 of the four-vertex path graph v_1 v_2 v_3 v_4]
The WIS algorithm in Section 16.2.4 computes only the weight pos-
sessed by an MWIS of a path graph, not an MWIS itself. A simple
hack is to modify the WIS algorithm so that each array entry A[i]
records both the total weight of an MWIS of the ith subproblem Gi
and the vertices of an MWIS of Gi that realizes this value.
A better approach, which saves both time and space, is to use
a postprocessing step to reconstruct an MWIS from the tracks in
the mud left by the WIS algorithm in its subproblem array A. For
starters, how do we know whether the last vertex v_n of the input
graph G belongs to an MWIS? The key is again Lemma 16.1, which
states that two and only two candidates are vying to be an MWIS
of G: an MWIS of the graph G_{n−1}, and an MWIS of the graph G_{n−2},
supplemented with v_n. Which one is it? The one with larger total
weight. How do we know which one that is? Just look at the clues
left in the array A! The final values of A[n−1] and A[n−2] record
the total weights of MWISs of G_{n−1} and G_{n−2}, respectively. So:
WIS Reconstruction
S := ∅                                   // vertices in an MWIS
i := n
while i ≥ 2 do
    if A[i−1] ≥ A[i−2] + w_i then        // Case 1 wins
        i := i − 1                       // exclude v_i
    else                                 // Case 2 wins
        S := S ∪ {v_i}                   // include v_i
        i := i − 2                       // exclude v_{i−1}
if i = 1 then                            // base case #2
    S := S ∪ {v_1}
return S
vertex weights:         3  2  1  6  4  5
subproblem array A:  0  3  3  4  9  9  14
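The same backward trace in Python, as a sketch that reuses the array returned by the wis function above (vertices are reported 1-indexed, matching the text):

    def wis_reconstruction(weights, A):
        """Walk backward through the subproblem array A to recover the
        vertices of an MWIS. Runs in O(n) time."""
        S = set()
        i = len(weights)
        while i >= 2:
            if A[i - 1] >= A[i - 2] + weights[i - 1]:  # Case 1 wins
                i -= 1                                 # exclude v_i
            else:                                      # Case 2 wins
                S.add(i)                               # include v_i
                i -= 2                                 # exclude v_{i-1}
        if i == 1:                                     # base case #2
            S.add(1)
        return S

    # On the running example this returns the first, fourth, and sixth
    # vertices, with total weight 3 + 6 + 5 = 14.
    weights = [3, 2, 1, 6, 4, 5]
    print(wis_reconstruction(weights, wis(weights)))  # {1, 4, 6}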
The key that unlocks the potential of dynamic programming for solving
a problem is the identification of the right collection of subproblems.
What properties do we want them to satisfy? Assuming we perform
at least a constant amount of work solving each subproblem, the
number of subproblems is a lower bound on the running time of our
algorithm. Thus, we’d like the number of subproblems to be as low as
possible—our WIS solution used only a linear number of subproblems,
which is usually the best-case scenario. Similarly, the time required to
solve a subproblem (given solutions to smaller subproblems) and to
infer the final solution will factor into the algorithm’s overall running
time.
For example, suppose an algorithm solves at most f (n) different
subproblems (working systematically from “smallest” to “largest”),
using at most g(n) time for each, and performs at most h(n) postpro-
cessing work to extract the final solution (where n denotes the input
size). The algorithm's running time is then at most

f(n) · g(n) + h(n).        (16.1)
The three steps of the recipe call for keeping f (n), g(n), and h(n),
respectively, as small as possible. In the basic WIS algorithm, without
the WIS Reconstruction postprocessing step, we have f (n) = O(n),
g(n) = O(1), and h(n) = O(1), for an overall running time of O(n).
If we include the reconstruction step, the h(n) term jumps to O(n),
but the overall running time O(n) · O(1) + O(n) = O(n) remains
linear.
¹⁶ Richard E. Bellman, Eye of the Hurricane: An Autobiography, World Scientific,
1984, page 159.
¹⁷ It's actually not important that the item values are integers (as opposed to
arbitrary positive real numbers). It is important that the item sizes are integers,
as we'll see in due time.
Problem: Knapsack

Input: Item values v_1, v_2, ..., v_n, item sizes s_1, s_2, ..., s_n,
and a knapsack capacity C. (All positive integers.)

Output: A subset S ⊆ {1, 2, ..., n} of items with the
maximum-possible total value \sum_{i \in S} v_i, subject to the
total size \sum_{i \in S} s_i being at most C.
Quiz 16.5
Consider an instance of the knapsack problem with knapsack
capacity C = 6 and four items:

Item   Value   Size
 1       3      4
 2       2      3
 3       4      2
 4       4      3

What is the total value of an optimal solution?

a) 6

b) 7

c) 8

d) 10
¹⁸ The WIS problem on path graphs is inherently sequential, with the vertices
ordered along the path. This naturally led to subproblems that correspond to
prefixes of the input. The items in the knapsack problem are not inherently
ordered, but to identify the right collection of subproblems, it's helpful to mimic
our previous approach and pretend they're ordered in some arbitrary way. A
"prefix" of the items then corresponds to the first i items in our arbitrary ordering
(for some i ∈ {0, 1, 2, ..., n}). Many other dynamic programming algorithms use
this same trick.
Quiz 16.6
This case analysis shows that two and only two candidates are
vying to be an optimal knapsack solution:

(i) an optimal solution for the first i − 1 items with knapsack capacity c; or

(ii) an optimal solution for the first i − 1 items with knapsack capacity c − s_i, supplemented with the ith item.

Because both c and the items' sizes are integers, the residual capacity c − s_i
in the second case is also an integer.
Knapsack: Subproblems

Compute V_{i,c}, the total value of an optimal knapsack solution
with the first i items and knapsack capacity c.
²⁰ In the WIS problem on path graphs, there's only one dimension in which a
subproblem can get smaller (by having fewer vertices). In the knapsack problem,
there are two (by having fewer items, or less knapsack capacity).
Knapsack

Input: item values v_1, ..., v_n, item sizes s_1, ..., s_n, and a
knapsack capacity C (all positive integers).
Output: the maximum total value of a subset S ⊆ {1, 2, ..., n}
with \sum_{i \in S} s_i ≤ C.

A := (n+1) × (C+1) two-dimensional array
for c = 0 to C do
    A[0][c] := 0                         // base cases
for i = 1 to n do
    for c = 0 to C do
        if s_i > c then                  // item i doesn't fit
            A[i][c] := A[i−1][c]
        else
            A[i][c] := max{A[i−1][c], A[i−1][c − s_i] + v_i}
return A[n][C]
²¹ Or, thinking recursively, each recursive call removes the last item and an
integer number of units of capacity. The only subproblems that can arise in this
way involve some prefix of the items and some integer residual capacity.
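Here is the two-dimensional dynamic program as a Python sketch (items are 1-indexed in the pseudocode above, while the Python lists here are 0-indexed):

    def knapsack(values, sizes, capacity):
        """Bottom-up knapsack: A[i][c] holds the total value of an optimal
        solution using only the first i items with knapsack capacity c.
        Runs in O(nC) time for n items and capacity C."""
        n = len(values)
        A = [[0] * (capacity + 1) for _ in range(n + 1)]
        for i in range(1, n + 1):
            v, s = values[i - 1], sizes[i - 1]
            for c in range(capacity + 1):
                if s > c:                            # item i doesn't fit
                    A[i][c] = A[i - 1][c]
                else:
                    A[i][c] = max(A[i - 1][c],           # Case 1: exclude i
                                  A[i - 1][c - s] + v)   # Case 2: include i
        return A

    # The Quiz 16.5 instance: values 3, 2, 4, 4; sizes 4, 3, 2, 3; C = 6.
    A = knapsack([3, 2, 4, 4], [4, 3, 2, 3], 6)
    print(A[4][6])  # 8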
16.5.5 Example
²² In the notation of (16.1), f(n) = O(nC), g(n) = O(1), and h(n) = O(1).
²³ The running time bound of O(nC) is impressive only if C is small, for
example, if C = O(n) or ideally even smaller. In Part 4 we'll see the reason for the
not-so-blazingly fast running time: there is a precise sense in which the knapsack
problem is a difficult problem.
Final subproblem array for the Quiz 16.5 instance (rows indexed by
residual capacity c, columns by prefix length i):

          i=0  i=1  i=2  i=3  i=4
  c = 6:   0    3    3    7    8
  c = 5:   0    3    3    6    8
  c = 4:   0    3    3    4    4
  c = 3:   0    0    2    4    4
  c = 2:   0    0    0    4    4
  c = 1:   0    0    0    0    0
  c = 0:   0    0    0    0    0
16.5.6 Reconstruction
Knapsack Reconstruction

Input: the subproblem array A computed by the Knapsack algorithm
for item values v_1, ..., v_n, item sizes s_1, ..., s_n, and knapsack
capacity C.
Output: an optimal knapsack solution S.

S := ∅                                   // items in an optimal solution
c := C                                   // remaining capacity
for i = n downto 1 do
    if s_i ≤ c and A[i−1][c − s_i] + v_i ≥ A[i−1][c] then
        S := S ∪ {i}                     // include item i
        c := c − s_i                     // reserve room for it
return S
The same subproblem array, traced backward by the reconstruction
algorithm (rows indexed by residual capacity c, columns by prefix length i):

          i=0  i=1  i=2  i=3  i=4
  c = 6:   0    3    3    7    8
  c = 5:   0    3    3    6    8
  c = 4:   0    3    3    4    4
  c = 3:   0    0    2    4    4
  c = 2:   0    0    0    4    4
  c = 1:   0    0    0    0    0
  c = 0:   0    0    0    0    0
²⁴ In the notation of (16.1), postprocessing with the Knapsack Reconstruction
algorithm increases the h(n) term to O(n). The overall running time O(nC) ·
O(1) + O(n) = O(nC) remains the same.
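And the corresponding backward trace in Python, a sketch that reuses the array returned by the knapsack function above:

    def knapsack_reconstruction(values, sizes, capacity, A):
        """Trace back through the subproblem array A to recover the items
        (1-indexed) in an optimal knapsack solution. Runs in O(n) time."""
        S = set()
        c = capacity
        for i in range(len(values), 0, -1):
            v, s = values[i - 1], sizes[i - 1]
            if s <= c and A[i - 1][c - s] + v >= A[i - 1][c]:
                S.add(i)      # include item i
                c -= s        # reserve capacity for it
        return S

    # On the Quiz 16.5 instance this returns items 3 and 4 (total value 8).
    values, sizes, capacity = [3, 2, 4, 4], [4, 3, 2, 3], 6
    A = knapsack(values, sizes, capacity)
    print(knapsack_reconstruction(values, sizes, capacity, A))  # {3, 4}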
The Upshot
²⁵ For example, suppose C = 2 and consider two items, with v_1 = s_1 = 1 and
v_2 = s_2 = 2. The optimal solution S is {2}. S − {2} is the empty set, but the
only optimal solution to the subproblem consisting of the first item and knapsack
capacity 2 is {1}.
the subproblems.
[Figure: a seven-vertex path graph with vertex weights 5, 3, 1, 7, 2, 4, 6]
Problem 16.1 Consider the path graph above, where vertices are labeled
with their weights. What are the final array entries of the WIS algorithm
from Section 16.2, and which vertices belong to the MWIS?
[Figure: a graph G with subgraphs H and K]

W_G = max{W_H, W_K + w_v}.
Which of the following statements are true? (Choose all that apply.)
and knapsack capacity C = 9. What are the final array entries of the
Knapsack algorithm from Section 16.5, and which items belong to the
optimal solution?
Challenge Problems
Problem 16.5 (H) This problem describes four generalizations of
the knapsack problem. In each, the input consists of item values
v1 , v2 , . . . , vn , item sizes s1 , s2 , . . . , sn , and additional problem-specific
data (all positive integers). Which of these generalizations can be
solved by dynamic programming in time polynomial in the number n
of items and the largest number M that appears in the input? (Choose
all that apply.)
a) Given a positive integer capacity C, compute a subset of items
with the maximum-possible total value subject to having total
size exactly C. (If no such set exists, the algorithm should
correctly detect that fact.)
Programming Problems
Problem 16.6 Implement in your favorite programming language
the WIS and WIS Reconstruction algorithms. (See
www.algorithmsilluminated.org for test cases and challenge data sets.)