MIT6 006F11 Lec20 PDF
Dynamic Programming II of IV
Summary
* DP $\approx$ "careful brute force"
* DP $\approx$ guessing + recursion + memoization
* DP $\approx$ dividing into a reasonable # of subproblems whose solutions relate acyclically, usually via guessing parts of the solution
* time = # subproblems $\times$ time/subproblem, counting each memoized recursive call as $\Theta(1)$
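Steps for designing a DP, with what to analyze at each:
1. define subproblems: count # subproblems
2. guess (part of solution): count # choices
3. relate subproblem solutions (recurrence): compute time/subproblem
4. recurse + memoize OR build DP table bottom-up: check subproblems acyclic/topological order; time = time/subproblem $\times$ # subproblems
5. solve original problem $\Rightarrow$ extra time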
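As an illustration of step 4, here is a minimal Python sketch of both options on the Fibonacci recurrence. The function names `fib_memo` and `fib_bottom_up` are illustrative, not from the lecture, and the base cases $F_1 = F_2 = 1$ are assumed.

```python
from functools import lru_cache


@lru_cache(maxsize=None)
def fib_memo(k: int) -> int:
    """Top-down: recurse + memoize. Each F_k is computed only once."""
    if k <= 2:
        return 1  # assumed base cases F_1 = F_2 = 1
    return fib_memo(k - 1) + fib_memo(k - 2)


def fib_bottom_up(n: int) -> int:
    """Bottom-up: fill the DP table in topological order k = 1, ..., n."""
    table = {}
    for k in range(1, n + 1):
        table[k] = 1 if k <= 2 else table[k - 1] + table[k - 2]
    return table[n]
```

Both versions do $\Theta(1)$ work per subproblem over $n$ subproblems. The bottom-up version makes the topological order explicit and avoids recursion depth limits.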
Lecture 20
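Examples:

Fibonacci
  subprobs:     $F_k$ for $1 \le k \le n$
  # subprobs:   $n$
  guess:        nothing
  # choices:    1
  recurrence:   $F_k = F_{k-1} + F_{k-2}$
  time/subpr:   $\Theta(1)$
  topo. order:  for $k = 1, \ldots, n$
  total time:   $\Theta(n)$
  orig. prob.:  $F_n$
  extra time:   $\Theta(1)$

Shortest Paths
  subprobs:     $\delta_k(s, v)$ for $v \in V$, $0 \le k < |V|$
                = min $s \to v$ path using $\le k$ edges
  # subprobs:   $V^2$
  guess:        edge into $v$ (if any)
  # choices:    indegree($v$) + 1
  recurrence:   $\delta_k(s, v) = \min\{\delta_{k-1}(s, u) + w(u, v) \mid (u, v) \in E\}$
  time/subpr:   $\Theta(1 + \text{indegree}(v))$
  topo. order:  for $k = 0, 1, \ldots, |V| - 1$ for $v \in V$
  total time:   $\Theta(VE)$ + $\Theta(V^2)$ unless efficient about indeg. 0
  orig. prob.:  $\delta_{|V|-1}(s, v)$ for $v \in V$
  extra time:   $\Theta(V)$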
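The shortest-paths column can be sketched bottom-up as follows. This is a hedged illustration, not the lecture's code: the function name `shortest_paths_dp` and the `(u, v, w)` edge-triple format are assumptions. The "+ 1" choice is read here as "use no new edge", i.e. keep $\delta_{k-1}(s, v)$, and the graph is assumed to have no negative-weight cycles.

```python
import math


def shortest_paths_dp(vertices, edges, s):
    """Return {v: delta_{|V|-1}(s, v)} via the delta_k recurrence.

    edges is a list of (u, v, w) triples (an assumed input format).
    """
    # delta_0(s, v): 0 for the source, infinity for every other vertex.
    prev = {v: (0 if v == s else math.inf) for v in vertices}
    for _ in range(1, len(vertices)):       # k = 1, ..., |V| - 1
        cur = dict(prev)                     # "no edge" guess: keep delta_{k-1}(s, v)
        for u, v, w in edges:                # guess the last edge (u, v) into v
            if prev[u] + w < cur[v]:
                cur[v] = prev[u] + w
        prev = cur                           # only the previous layer is needed
    return prev
```

Each round scans every edge once and copies the table once, which matches the $\Theta(VE) + \Theta(V^2)$ total in the table above.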
Text Justification
Split text into "good" lines
obvious (MS Word/Open Office) algorithm: put as many words as fit on the first line, repeat
but this can make very bad lines
[Figure: the words "blah blah blah blah reallylongword" split into lines in two ways, one good (:)) and one bad (:<).]
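1. subproblems: DP[i] = min total badness for the suffix words[i:], where badness(i, j) is the cost of putting words[i:j] on one line
$\Rightarrow$ # subproblems = $\Theta(n)$, where $n$ = # words
2. guess: where the first line ends, i.e. the j such that the first line is words[i:j]
$\Rightarrow$ # choices = $n - i = O(n)$
3. recurrence:
DP[i] = min(badness(i, j) + DP[j] for j in range(i + 1, n + 1))
DP[n] = 0
$\Rightarrow$ time per subproblem = $\Theta(n)$
4. order: for $i = n, n - 1, \ldots, 1, 0$
total time = $\Theta(n^2)$
[Figure 2: DAG, with edges labeled badness(i, j).]
5. solution = DP[0]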
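A minimal bottom-up sketch of this recurrence in Python follows. The badness function is an assumption made for illustration: infinite if the words plus single spaces exceed the page width, otherwise the cube of the leftover space. The names `badness` and `min_total_badness` and the `page_width` parameter are likewise illustrative.

```python
import math


def badness(words, i, j, page_width):
    """Assumed badness of putting words[i:j] on one line."""
    # Letters plus one space between adjacent words. Recomputed on every
    # call for clarity; prefix sums would make this O(1).
    total = sum(len(w) for w in words[i:j]) + (j - i - 1)
    if total > page_width:
        return math.inf
    return (page_width - total) ** 3


def min_total_badness(words, page_width):
    n = len(words)
    dp = [0] * (n + 1)                 # dp[n] = 0: no words left to place
    for i in range(n - 1, -1, -1):     # i = n - 1, ..., 0
        dp[i] = min(badness(words, i, j, page_width) + dp[j]
                    for j in range(i + 1, n + 1))
    return dp[0]
```

For example, `min_total_badness(["aa", "bb", "cc"], 5)` returns 27: one full line of two words (badness 0) plus one line holding a single word (badness $3^3$).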
Perfect-Information Blackjack
Given entire deck order: $c_0, c_1, \ldots, c_{n-1}$
1-player game against stand-on-17 dealer
when should you hit or stand? GUESS
goal: maximize winnings for fixed bet $1
may benefit from losing one hand to improve future hands!
1. subproblems: BJ(i) = best play of $c_i, \ldots, c_{n-1}$ (the remaining cards), where $i$ is # cards "already played"
$\Rightarrow$ # subproblems = $n$
2. guess: how many times player "hits" (hit means draw another card)
$\Rightarrow$ # choices $\le n$
3. recurrence: BJ(i) = max(
    outcome $\in \{+1, 0, -1\}$ + BJ(i + # cards used)    (O(n) to evaluate)
    for # hits in 0, 1, ... if valid play (don't hit after bust)    (O(n) choices)
)
$\Rightarrow$ time/subproblem = $\Theta(n^2)$
4. order: for i in reversed(range(n))
total time = $\Theta(n^3)$
time is really $\sum_{i=0}^{n-1} \sum_{\#h=0}^{n-i-O(1)} \Theta(n - i - \#h) = \Theta(n^3)$ still
5. solution: BJ(0)
detailed recurrence: before memoization (ignoring splits/betting)

BJ(i):
    if n - i < 4: return 0                  (not enough cards)
    for p in range(2, n - i - 1):           (# cards taken)
        player = sum(c_i, c_{i+2}, c_{i+4 : i+p+2})
        if player > 21:                     (bust)
            options.append(-1 (bust) + BJ(i + p + 2))
            break
        for d in range(2, n - i - p):
            dealer = sum(c_{i+1}, c_{i+3}, c_{i+p+2 : i+p+d})
            if dealer >= 17: break
        if dealer > 21: dealer = 0          (bust)
        options.append(cmp(player, dealer) + BJ(i + p + d))
    return max(options)

The two loops enumerate the valid plays, $\Theta(n)$ iterations each, so one call costs $\Theta(n^2)$ with care. cmp(player, dealer) yields the outcomes $-1$, $0$, $+1$.
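A runnable, memoized Python sketch of the detailed recurrence is given below, under several assumptions that are not in the notes. `cards` is a list of integer card values (an ace is whatever fixed value the caller supplies), and `options` is initialized explicitly. The dealer loop's upper bound is taken as `n - i - p + 1` so that the dealer may draw the last remaining card, and `cmp` is replaced by an explicit comparison because Python 3 has no built-in `cmp`. The name `best_winnings` is illustrative.

```python
from functools import lru_cache


def best_winnings(cards):
    """Max total winnings at $1 per hand, given the full deck order."""
    n = len(cards)

    @lru_cache(maxsize=None)
    def bj(i):
        if n - i < 4:
            return 0                          # not enough cards for a hand
        options = []
        for p in range(2, n - i - 1):         # p = # cards the player takes
            player = cards[i] + cards[i + 2] + sum(cards[i + 4:i + p + 2])
            if player > 21:                   # player busts: lose $1
                options.append(-1 + bj(i + p + 2))
                break
            for d in range(2, n - i - p + 1):  # d = # cards the dealer takes
                dealer = (cards[i + 1] + cards[i + 3]
                          + sum(cards[i + p + 2:i + p + d]))
                if dealer >= 17:              # stand-on-17 dealer stops here
                    break
            if dealer > 21:
                dealer = 0                    # dealer busts
            outcome = (player > dealer) - (player < dealer)  # +1, 0, or -1
            options.append(outcome + bj(i + p + d))
        return max(options)

    return bj(0)
```

With the deck `[5, 10, 6, 8, 10]`, standing on 11 loses to the dealer's 18, while hitting once reaches 21 and wins, so `best_winnings` returns 1.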
Parent Pointers
To recover the actual solution in addition to its cost, store parent pointers (which guess was used at each subproblem) and walk back.
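As a hedged illustration, the sketch below applies parent pointers to the text-justification recurrence DP[i] = min(badness(i, j) + DP[j]). The cubic badness function, the name `justify`, and the `page_width` parameter are assumptions for the example, and every single word is assumed to fit within the page width.

```python
import math


def justify(words, page_width):
    """Return the lines of a minimum-badness split of words."""
    n = len(words)

    def badness(i, j):
        # Assumed badness: infinite if words[i:j] overflow, else slack cubed.
        total = sum(len(w) for w in words[i:j]) + (j - i - 1)
        return math.inf if total > page_width else (page_width - total) ** 3

    dp = [0] * (n + 1)
    parent = [None] * (n + 1)       # parent[i] = best guess j at subproblem i
    for i in range(n - 1, -1, -1):
        dp[i], parent[i] = min((badness(i, j) + dp[j], j)
                               for j in range(i + 1, n + 1))

    lines, i = [], 0
    while i < n:                    # walk the pointers starting from DP[0]
        lines.append(" ".join(words[i:parent[i]]))
        i = parent[i]
    return lines
```

Storing one pointer per subproblem adds only $\Theta(n)$ space, and the walk takes $O(n)$ steps, so recovering the solution does not change the asymptotic cost.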
MIT OpenCourseWare
http://ocw.mit.edu
For information about citing these materials or our Terms of Use, visit: http://ocw.mit.edu/terms.