Part4 1 Deterministic Full
Prescriptive Analytics
Dynamic Programming 1
IEDA 3010
Dynamic Programming
Dr. Jin QI
Department of Industrial Engineering and Decision Analytics
Hong Kong University of Science and Technology
Introduction
• Dynamic programming provides a systematic
procedure for determining the optimal combination of
decisions
– Determine a sequence of decisions
• Fundamental idea
– Start with a small portion of the original problem and find
the optimal solution for this smaller problem
– Then gradually enlarge the problem, finding the current
optimal solution from the preceding one, until the original
problem is solved in its entirety
Basic Concept
• Determine the safest route from A to J
• Could I say this?
– Safest path from A to J depends on
• Safest path from B to J
Example 1: Traveling in Ukraine
• The value of $c_{s,x_n}$ is given by the preceding tables for $c_{ij}$ by setting $i = s$ and $j = x_n$ (the immediate destination)
• Because the ultimate destination J is reached at the end of stage 4, $f_5^*(J) = 0$
• The objective is to find $f_1^*(A)$ and the corresponding route
– DP finds it by successively finding $f_4^*(s)$, $f_3^*(s)$, $f_2^*(s)$ for each possible state $s$, and then using $f_2^*(s)$ to solve for $f_1^*(A)$
• Solution procedure
– Start with the smaller problem in which the business delegation has nearly completed its trip and has only one more bus run to go
– n = 4: the route is determined entirely by the current state $s$ and the final destination $x_4 = J$, so the route for this final bus run is s → J
– Therefore $f_4^*(s) = f_4(s, J) = c_{s,J}$, which gives the immediate solution to the n = 4 problem

n = 4:
s    $f_4^*(s)$    $x_4^*$
H    3             J
I    4             J
Figure (stage 3 → stage 4, from state F): $c_{F,H} = 6$, $c_{F,I} = 3$; $f_4^*(H) = 3$, $f_4^*(I) = 4$
Example 1: Traveling in Ukraine
• So, given the current state F, the optimal decision is $x_3^* = I$ with $f_3^*(F) = 7$
• Similar calculations need to be made when you start from the other two possible states, s = E and s = G, with two stages to go. Try it, proceeding both graphically (Fig. 10.1) and algebraically (combining $c_{ij}$ and $f_4^*(s)$ values), to verify the following complete results for the n = 3 problem

n = 3:
s    $f_3(s, H)$    $f_3(s, I)$    $f_3^*(s)$    $x_3^*$
E    4              8              4             H
F    9              7              7             I
G    6              7              6             H

• The solution for the second-stage problem (n = 2), where there are three stages to go, is obtained in a similar fashion. In this case, $f_2(s, x_2) = c_{s,x_2} + f_3^*(x_2)$
Figure (stage 2 → stage 3, from state C): $c_{C,E} = 3$, $c_{C,F} = 2$, $c_{C,G} = 4$; $f_3^*(E) = 4$, $f_3^*(F) = 7$, $f_3^*(G) = 6$
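As a worked instance for state C, the link costs and $f_3^*$ values shown in the stage 2 → stage 3 diagram can be combined as follows. This is a sketch that reads the numbers off that diagram; it reproduces row C of the n = 2 table.

```latex
\begin{align*}
f_2(C, E) &= c_{C,E} + f_3^*(E) = 3 + 4 = 7 \\
f_2(C, F) &= c_{C,F} + f_3^*(F) = 2 + 7 = 9 \\
f_2(C, G) &= c_{C,G} + f_3^*(G) = 4 + 6 = 10 \\
f_2^*(C) &= \min\{7,\ 9,\ 10\} = 7, \qquad x_2^* = E
\end{align*}
```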
n = 2:
s    $f_2(s, E)$    $f_2(s, F)$    $f_2(s, G)$    $f_2^*(s)$    $x_2^*$
B    11             11             12             11            E or F
C    7              9              10             7             E
D    8              8              11             8             E or F

In the first and third rows of this table, note that E and F tie as the minimizing value of $x_2$.
Figure (stage 1 → stage 2, from state A): $c_{A,B} = 2$, $c_{A,C} = 4$, $c_{A,D} = 3$; $f_2^*(B) = 11$, $f_2^*(C) = 7$, $f_2^*(D) = 8$
Example 1: Traveling in Ukraine
• After getting there, the minimum additional cost for stage 2 to the end is $f_2^*(B) = 11$, $f_2^*(C) = 7$, or $f_2^*(D) = 8$
• These calculations are summarized next for the three alternatives for the immediate destination:
– $x_1 = B$: $f_1(A, B) = c_{A,B} + f_2^*(B) = 2 + 11 = 13$
– $x_1 = C$: $f_1(A, C) = c_{A,C} + f_2^*(C) = 4 + 7 = 11$
– $x_1 = D$: $f_1(A, D) = c_{A,D} + f_2^*(D) = 3 + 8 = 11$
• Since 11 is the minimum, given the current state A, the optimal decision is $x_1^* = C$ or $x_1^* = D$ with $f_1^*(A) = 11$

n = 1:
s    $f_1(s, B)$    $f_1(s, C)$    $f_1(s, D)$    $f_1^*(s)$    $x_1^*$
A    13             11             11             11            C or D
• An optimal solution for the entire problem can now be identified from the four tables
Optimal Solutions
• A → C → E → H → J
• A → D → F → I → J
• A → D → E → H → J
• Total cost is 11
Figure 10.2: Graphical display of the dynamic programming solution of the stagecoach problem. Each arrow shows an optimal policy decision (the best immediate destination) from that state, where the number by the state is the resulting cost from there to the end. Following the boldface arrows from A to T (the final node in the figure) gives the three optimal solutions (the three routes giving the minimum total cost of 11).
Each arrow (and the resulting cost) comes from one row in one of the four tables.
You will see in the next section that the special terms describing the particular context of this problem (stage, state, and policy) are part of the general terminology of dynamic programming, with an analogous interpretation in other contexts.
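The backward pass for Example 1 can be sketched in a few lines of Python. The link costs in `COST` are an assumption: they are backed out from the stage tables and diagrams in this example (each entry equals $f_n(s, x_n) - f_{n+1}^*(x_n)$), not copied from the original cost tables. The names `COST`, `STAGES`, `solve`, and `routes` are illustrative only.

```python
# Link costs c[i][j]. ASSUMPTION: derived from the stage tables of Example 1
# (table entry minus the downstream optimal value), not from the source tables.
COST = {
    "A": {"B": 2, "C": 4, "D": 3},
    "B": {"E": 7, "F": 4, "G": 6},
    "C": {"E": 3, "F": 2, "G": 4},
    "D": {"E": 4, "F": 1, "G": 5},
    "E": {"H": 1, "I": 4},
    "F": {"H": 6, "I": 3},
    "G": {"H": 3, "I": 3},
    "H": {"J": 3},
    "I": {"J": 4},
}
# States at the beginning of stages 1..4.
STAGES = [["A"], ["B", "C", "D"], ["E", "F", "G"], ["H", "I"]]


def solve(cost, stages, destination="J"):
    """Backward recursion f_n*(s) = min over x of c[s][x] + f_{n+1}*(x)."""
    f_star = {destination: 0}  # boundary condition f_5*(J) = 0
    best = {}
    for layer in reversed(stages):  # stage 4 first, stage 1 last
        for s in layer:
            totals = {x: c + f_star[x] for x, c in cost[s].items()}
            f_star[s] = min(totals.values())
            # keep every minimizing destination so ties are not lost
            best[s] = sorted(x for x, t in totals.items() if t == f_star[s])
    return f_star, best


def routes(best, start="A", destination="J"):
    """Enumerate all optimal routes by following the stored decisions forward."""
    if start == destination:
        return [[destination]]
    return [[start] + tail
            for x in best[start]
            for tail in routes(best, x, destination)]


f_star, best = solve(COST, STAGES)
all_routes = routes(best)
for r in all_routes:
    print(" -> ".join(r), "cost", f_star["A"])
```

Storing all tied decisions in `best`, rather than a single argmin, is what lets the forward trace recover all three optimal routes.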
Characteristics of DP Problems
1. The problem can be divided into stages, with a policy
decision required at each stage.
– DP problems require making a sequence of interrelated
decisions, where each decision corresponds to one stage of
the problem.
2. Each stage has a number of states associated with
the beginning of that stage.
– States are the various possible conditions in which the
system might be at that stage of the problem
– The number of states may be either finite or infinite
Characteristics of DP Problems
3. The effect of the policy decision at each stage is to
transform the current state to a state associated with
the beginning of the next stage (possibly according
to a probability distribution).
4. The solution procedure is designed to find an optimal
policy for the overall problem
– DP provides a prescription of the optimal policy decision at
each stage for each of the possible states
Characteristics of DP Problems
5. Given the current state, an optimal policy for the
remaining stages is independent of the policy
decisions adopted in previous stages.
– The optimal immediate decision depends on only the current
state and not on how you got there.
6. The solution procedure begins by finding the optimal
policy for the last stage, which is usually trivial.
Characteristics of DP Problems
7. DP problems are solved via a recursive relationship
that identifies the optimal policy for stage n, given
the optimal policy for stage n+1.
8. Using this recursive relationship, the solution
procedure starts at the end and moves backward
stage by stage, each time finding the optimal policy
for that stage, until it finds the optimal policy
starting at the initial stage.
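Characteristics 6 to 8 can be sketched as one generic routine. The version below is a minimal illustration under stated assumptions, not a standard formulation: it assumes a deterministic problem with an additive objective, and every name in it (`backward_induction`, `states`, `decisions`, `step`, `terminal_value`, `better`) is hypothetical. The toy two-stage network at the bottom is made up purely to exercise the function.

```python
from typing import Callable, Dict, Hashable, Iterable, List, Tuple

State = Hashable
Decision = Hashable


def backward_induction(
    num_stages: int,
    states: Callable[[int], Iterable[State]],
    decisions: Callable[[int, State], Iterable[Decision]],
    step: Callable[[int, State, Decision], Tuple[float, State]],
    terminal_value: Callable[[State], float],
    better: Callable[[Iterable[float]], float] = min,
) -> Tuple[Dict[int, Dict[State, float]], Dict[int, Dict[State, List[Decision]]]]:
    """Return (value, policy) where value[n][s] is f_n*(s).

    ASSUMPTION: additive objective, i.e.
    f_n(s, x) = contribution(n, s, x) + f_{n+1}*(next state).
    """
    # Boundary condition after the last stage.
    value = {num_stages + 1: {s: terminal_value(s) for s in states(num_stages + 1)}}
    policy: Dict[int, Dict[State, List[Decision]]] = {}
    # Start at the last stage and move backward stage by stage.
    for n in range(num_stages, 0, -1):
        value[n], policy[n] = {}, {}
        for s in states(n):
            totals = {}
            for x in decisions(n, s):
                contribution, next_state = step(n, s, x)
                totals[x] = contribution + value[n + 1][next_state]
            best = better(totals.values())
            value[n][s] = best
            # The optimal decision depends only on (n, s), not on the path taken.
            policy[n][s] = [x for x, t in totals.items() if t == best]
    return value, policy


# Made-up toy network: A -> {B, C} -> T.
LAYERS = {1: ["A"], 2: ["B", "C"], 3: ["T"]}
COST = {("A", "B"): 1, ("A", "C"): 2, ("B", "T"): 5, ("C", "T"): 1}

value, policy = backward_induction(
    num_stages=2,
    states=lambda n: LAYERS[n],
    decisions=lambda n, s: LAYERS[n + 1],
    step=lambda n, s, x: (COST[(s, x)], x),
    terminal_value=lambda s: 0,
)
print(value[1]["A"], policy[1]["A"])
```

Passing `better=max` turns the same routine into a maximization, which is the only change needed for a profit-type objective.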
Example 2: Fighting Ebola
• Solution procedure
– n = 3: country 3 is the only remaining country
• The profit function $p_3(x_3)$ is increasing in $x_3$
• So it is optimal to allocate all available teams to country 3

n = 3:
$s_3$    $f_3^*(s_3)$    $x_3^*$
0        0               0
1        50              1
2        70              2
3        80              3
4        100             4
5        130             5

• We now move backward to start from the next-to-last stage (n = 2)
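Figure (country 2 → country 3, from state $s_2 = 2$): $p_2(0) = 0$, $p_2(1) = 20$, $p_2(2) = 45$; $f_3^*(2) = 70$, $f_3^*(1) = 50$, $f_3^*(0) = 0$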
Example 2: Fighting Ebola
• For the current state $s_2 = 2$:
– $x_2 = 0$: $f_2(2, 0) = p_2(0) + f_3^*(2) = 0 + 70 = 70$
– $x_2 = 1$: $f_2(2, 1) = p_2(1) + f_3^*(1) = 20 + 50 = 70$
– $x_2 = 2$: $f_2(2, 2) = p_2(2) + f_3^*(0) = 45 + 0 = 45$
• Given the current state 2, the optimal decision is $x_2^* = 0$ or 1 with $f_2^*(2) = 70$
• Proceeding in a similar way with the other possible values of $s_2$ (try it) yields the following table

n = 2: entries are $f_2(s_2, x_2) = p_2(x_2) + f_3^*(s_2 - x_2)$; "-" marks an infeasible decision
$s_2$    $x_2=0$    $x_2=1$    $x_2=2$    $x_2=3$    $x_2=4$    $x_2=5$    $f_2^*(s_2)$    $x_2^*$
0        0          -          -          -          -          -          0               0
1        50         20         -          -          -          -          50              0
2        70         70         45         -          -          -          70              0 or 1
3        80         90         95         75         -          -          95              2
4        100        100        115        125        110        -          125             3
5        130        120        125        145        160        150        160             4
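As a check on one of the "try it" rows, the calculation for $s_2 = 3$ can be written out. The value $p_2(3) = 75$ is read from the last entry of that row (where all three teams go to country 2 and $f_3^*(0) = 0$), so treat it as inferred from the table rather than quoted from the profit table.

```latex
\begin{align*}
f_2(3, 0) &= p_2(0) + f_3^*(3) = 0 + 80 = 80 \\
f_2(3, 1) &= p_2(1) + f_3^*(2) = 20 + 70 = 90 \\
f_2(3, 2) &= p_2(2) + f_3^*(1) = 45 + 50 = 95 \\
f_2(3, 3) &= p_2(3) + f_3^*(0) = 75 + 0 = 75 \\
f_2^*(3) &= \max\{80,\ 90,\ 95,\ 75\} = 95, \qquad x_2^* = 2
\end{align*}
```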
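Figure (country 1 → country 2, from state $s_1 = 5$): link values $p_1(x_1) = 0, 45, 70, 90, 105, 120$ for $x_1 = 0, \dots, 5$; node values $f_2^*(s_2) = 0, 50, 70, 95, 125, 160$ for $s_2 = 0, \dots, 5$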
• n = 1: the only state is $s_1 = 5$, and the links in the diagram run up to the top node with $x_1 = 5$
– The corresponding $p_1(x_1)$ values from Table 10.1 are shown next to the links
– The numbers next to the nodes are obtained from the $f_2^*(s_2)$ column of the n = 2 table
– As with n = 2, the calculation needed for each alternative value of the decision variable involves adding the corresponding link value and node value, as summarized below
• Formula: $f_1(5, x_1) = p_1(x_1) + f_2^*(5 - x_1)$
– $p_1(x_1)$ is given in the country 1 column of Table 10.1
– $f_2^*(5 - x_1)$ is given in the n = 2 table
– $x_1 = 0$: $f_1(5, 0) = p_1(0) + f_2^*(5) = 0 + 160 = 160$
– $x_1 = 1$: $f_1(5, 1) = p_1(1) + f_2^*(4) = 45 + 125 = 170$
– $x_1 = 5$: $f_1(5, 5) = p_1(5) + f_2^*(0) = 120 + 0 = 120$
• The similar calculations for $x_1 = 2, 3, 4$ (try it) verify that $x_1^* = 1$ with $f_1^*(5) = 170$
Example 2: Fighting Ebola
• Optimal Solution
– Allocate 1, 3, and 1 teams to the three countries, respectively
– Total additional population covered by health personnel is 170K
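The whole of Example 2 can be reproduced with a short backward recursion. This is a sketch under assumptions: the profit lists in `P` are read off the stage tables and diagrams in this example (the first row entries and the link values), not copied from Table 10.1 itself, and the names `P`, `TEAMS`, `solve`, and `plan` are illustrative. The boundary value after country 3 is taken as 0, which leads to "give country 3 everything that is left" because $p_3$ is increasing.

```python
# p[n][x] = benefit when country n receives x teams.
# ASSUMPTION: values reconstructed from the n = 3 and n = 2 tables and the
# stage 1 diagram of Example 2.
P = {
    1: [0, 45, 70, 90, 105, 120],
    2: [0, 20, 45, 75, 110, 150],
    3: [0, 50, 70, 80, 100, 130],
}
TEAMS = 5


def solve(p, total):
    """Backward recursion f_n*(s) = max over x <= s of p_n(x) + f_{n+1}*(s - x)."""
    f_next = [0] * (total + 1)  # nothing is earned after the last country
    best = {}
    for n in sorted(p, reverse=True):  # country 3, then 2, then 1
        f_cur, best[n] = [], []
        for s in range(total + 1):  # s = teams still available
            values = [p[n][x] + f_next[s - x] for x in range(s + 1)]
            top = max(values)
            f_cur.append(top)
            # keep all maximizing allocations so ties remain visible
            best[n].append([x for x, v in enumerate(values) if v == top])
        f_next = f_cur
    return f_next, best


f1, best = solve(P, TEAMS)

# Trace one optimal allocation forward from s1 = 5.
s, plan = TEAMS, []
for n in sorted(P):
    x = best[n][s][0]
    plan.append(x)
    s -= x

print("f1*(5) =", f1[TEAMS], "allocation:", plan)
```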
• The recursion minimizes $f_n(s_n, x_n)$ over $x_n$, where
$f_n(s_n, x_n) = p_n(x_n) \cdot f_{n+1}^*(s_n - x_n)$

n = 3:
$s_3$    $f_3^*(s_3)$    $x_3^*$
0        0.80            0
1        0.50            1
2        0.30            2

Figure 10.7: The basic structure for the government space project. At stage n, the decision $x_n$ moves the state from $s_n$ to $s_n - x_n$ at stage n + 1.
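Figure (team 2 → team 3, from state $s_2 = 1$): $x_2 = 1$ has link value $p_2(1) = 0.4$ and leads to $s_3 = 0$ with $f_3^*(0) = 0.8$; $x_2 = 0$ has link value $p_2(0) = 0.6$ and leads to $s_3 = 1$ with $f_3^*(1) = 0.5$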
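• Given the current state 1, the optimal decision is $x_2^* = 0$ with $f_2^*(1) = 0.30$

n = 2: entries are $f_2(s_2, x_2) = p_2(x_2) \cdot f_3^*(s_2 - x_2)$; "-" marks an infeasible decision
$s_2$    $x_2=0$    $x_2=1$    $x_2=2$    $f_2^*(s_2)$    $x_2^*$
0        0.48       -          -          0.48            0
1        0.30       0.32       -          0.30            0
2        0.18       0.20       0.16       0.16            2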
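Figure (team 1 → team 2, from state $s_1 = 2$): link values $p_1(0) = 0.40$, $p_1(1) = 0.20$, $p_1(2) = 0.15$; node values $f_2^*(2) = 0.16$, $f_2^*(1) = 0.30$, $f_2^*(0) = 0.48$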
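The stage 1 calculation for $s_1 = 2$ is not written out in the slides. A worked version, using the link and node values from the team 1 → team 2 diagram and the multiplicative recursion $f_1(2, x_1) = p_1(x_1) \cdot f_2^*(2 - x_1)$, would be:

```latex
\begin{align*}
f_1(2, 0) &= p_1(0) \cdot f_2^*(2) = 0.40 \times 0.16 = 0.064 \\
f_1(2, 1) &= p_1(1) \cdot f_2^*(1) = 0.20 \times 0.30 = 0.060 \\
f_1(2, 2) &= p_1(2) \cdot f_2^*(0) = 0.15 \times 0.48 = 0.072 \\
f_1^*(2) &= \min\{0.064,\ 0.060,\ 0.072\} = 0.060, \qquad x_1^* = 1
\end{align*}
```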
• Optimal Solution
– The optimal solution must have $x_1^* = 1$, which makes $s_2 = 2 - 1 = 1$, so that $x_2^* = 0$, which makes $s_3 = 1 - 0 = 1$, so that $x_3^* = 1$
– Allocate 1, 0, and 1 scientists to the three teams, respectively, so teams 1 and 3 each receive one additional scientist
– The probability that all three teams will fail is then 0.060
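The same backward pass works for this example once the sum is replaced by a product and the maximum by a minimum. The sketch below makes two assumptions: the failure probabilities in `FAIL` are read off the tables and diagrams above (with $p_2(2) = 0.20$ backed out from $0.16 / 0.8$), and the boundary value after team 3 is 1.0 so that it drops out of the product. The names `FAIL`, `SCIENTISTS`, `solve`, and `plan` are illustrative.

```python
# FAIL[n][x] = probability that team n fails when given x extra scientists.
# ASSUMPTION: reconstructed from the n = 3 and n = 2 tables and the diagrams.
FAIL = {
    1: [0.40, 0.20, 0.15],
    2: [0.60, 0.40, 0.20],
    3: [0.80, 0.50, 0.30],
}
SCIENTISTS = 2


def solve(fail, total):
    """Backward recursion f_n*(s) = min over x <= s of p_n(x) * f_{n+1}*(s - x)."""
    f_next = [1.0] * (total + 1)  # neutral element for a product
    best = {}
    for n in sorted(fail, reverse=True):  # team 3, then 2, then 1
        f_cur, best[n] = [], []
        for s in range(total + 1):  # s = scientists still available
            values = [fail[n][x] * f_next[s - x] for x in range(s + 1)]
            low = min(values)
            f_cur.append(low)
            best[n].append(values.index(low))  # first minimizing decision
        f_next = f_cur
    return f_next, best


f1, best = solve(FAIL, SCIENTISTS)

# Trace the optimal allocation forward from s1 = 2.
s, plan = SCIENTISTS, []
for n in sorted(FAIL):
    x = best[n][s]
    plan.append(x)
    s -= x

print("allocation:", plan, "failure probability:", round(f1[SCIENTISTS], 3))
```

Because the values are floating-point products, comparisons against the slide values are best done with a small tolerance rather than exact equality.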
Summary
• DP is a general type of approach to problem solving.
• There is no standard formulation and the particular
equations used must be developed to fit each
situation.
• The key is to identify the stages, states, decisions, and
recursion
• The solution procedure works backward from the last
stage