Chapter 3 Dynamic Programming
Sonia REBAI
Tunis Business School
University of Tunis
Introduction
✓ Dynamic programming (DP) is a recursive optimization approach for making
interdependent, sequential decisions.
✓ The effect of the policy decision at each stage is to transform the current
state into a state associated with the next stage.
▪ Backward method: we start from the last step and work back to the first.
▪ Forward method: we start from the first step and work forward to the last.
Consider an n-step sequential decision problem.
✓ We decompose the problem into n steps, each corresponding to a particular
decision.
✓ Each step feeds the next: the output of one step serves as the input to the
next step.
[Figure: decision and result at each step]
Backward approach
✓ Xi : the decision at step i
✓ Ri(Si, Xi) : the immediate result of decision Xi given that the state is Si
[Figure: sequence of decisions X1, …, Xn-1, Xn]
Step 1
F1(S1,X1) = R1(S1,X1) + F2*(X1)

                                                 Optimal decision
S1 \ X1 |     B     |     C      |     D      | F1*(S1) | X1*
A       | 5 + 3 = 8 | 3 + 7 = 10 | 4 + 8 = 12 |    8    |  B
Thus, the shortest path linking A to H is A-B-F-H with length 8.
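
The Step 1 computation can be sketched in a few lines of Python. This is only an illustration of one backward step: the values of R1 and F2* are those read from the table above, while the function name `backward_step` and the dictionary layout are assumptions made for the sketch, not part of the course material.

```python
# Values read from the Step 1 table: arc lengths out of A (R1) and the
# optimal remaining distances F2*(.) from each successor to H.
R1 = {"B": 5, "C": 3, "D": 4}
F2_STAR = {"B": 3, "C": 7, "D": 8}


def backward_step(immediate, cost_to_go):
    """One backward step: F(S, X) = R(S, X) + F*_next(X), then minimise over X.

    `backward_step` is a hypothetical helper name used only in this sketch.
    """
    totals = {x: immediate[x] + cost_to_go[x] for x in immediate}
    best = min(totals, key=totals.get)  # decision with the smallest total
    return totals, totals[best], best


totals, f1_star, x1_star = backward_step(R1, F2_STAR)
print(totals)            # F1(A, X1) for each candidate X1
print(f1_star, x1_star)  # F1*(A) and X1*
```

The same helper would apply unchanged to the earlier steps of the backward method, provided the corresponding R and F* values are supplied.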
F3(S3,X3) = R3(S3,X3)
Forward approach
✓ Xi : the decision at step i
[Figure: sequence of decisions X1, X2, …, Xn]
Step 3
F3(S3,X3) = R3(S3,X3) + F2*(X3)

                                                    Optimal decision
S3 \ X3 |      E      |     F     |     G      | F3*(S3) | X3*
H       | 4 + 11 = 15 | 1 + 7 = 8 | 3 + 7 = 10 |    8    |  F
Thus, once again we find the same shortest path linking A to H: A-B-F-H with
length 8.
F1(S1,X1) = R1(S1,X1)
Which method to use?
✓ The choice of method depends on what is known about the initial and final
states.
✓ If we know the initial state but not the final state, we use the backward
method.
✓ If we know the final state but not the initial state, we use the forward
method.
✓ If both states are known, both methods apply.
Keep in mind
DP Characteristics
✓ The problem can be decomposed into a number of steps.
Fi(Si,Xi) = f(direct result, optimal cumulative result over the previous steps)
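
Written out for the additive case used in the examples of this chapter, the recursion takes roughly the following form. This is a sketch under assumptions: the transition symbol T_i (the state reached from S_i after decision X_i) is notation introduced here, and "opt" stands for min or max depending on the problem.

```latex
% Backward method: the step solved previously is step i+1.
F_i(S_i, X_i) = R_i(S_i, X_i) + F_{i+1}^{*}\bigl(T_i(S_i, X_i)\bigr),
\qquad
F_i^{*}(S_i) = \operatorname*{opt}_{X_i} F_i(S_i, X_i)

% Forward method: the step solved previously is step i-1.
F_i(S_i, X_i) = R_i(S_i, X_i) + F_{i-1}^{*}\bigl(T_i(S_i, X_i)\bigr),
\qquad
F_i^{*}(S_i) = \operatorname*{opt}_{X_i} F_i(S_i, X_i)
```

In the shortest-path example the transition is simply T_i(S_i, X_i) = X_i, which gives F1(S1,X1) = R1(S1,X1) + F2*(X1).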
DP Characteristics - continued
More precisely:
Step 2 (Sousse)
F2(S2,X2) = R2(S2,X2) + F3*(S2-X2)

                                                                                           Optimal decision
S2 \ X2 |       0       |       10       |      20       |      30       |      40       | F2*(S2) | X2*
20      | 0 + 75 = 75   | 45 + 35 = 80   |       -       |       -       |       -       |   80    | 10
30      | 0 + 95 = 95   | 45 + 75 = 120  | 60 + 35 = 95  |       -       |       -       |   120   | 10
40      | 0 + 110 = 110 | 45 + 95 = 140  | 60 + 75 = 135 | 70 + 35 = 105 |       -       |   140   | 10
50      |       -       | 45 + 110 = 155 | 60 + 95 = 155 | 70 + 75 = 145 | 90 + 35 = 125 |   155   | 10 or 20
Step 1 (Tunis)
F1(S1,X1) = R1(S1,X1) + F2*(S1-X1)

                                                                           Optimal decision
S1 \ X1 |       10       |       20       |       30       |       40       | F1*(S1) | X1*
60      | 30 + 155 = 185 | 50 + 140 = 190 | 90 + 120 = 210 | 100 + 80 = 180 |   210   | 30
Example 2 - continued
Consequently, the optimal allocation is 30,000 TD to the Tunis plant,
10,000 TD to the Sousse plant, and 20,000 TD to the Sfax plant, for a total
revenue of 210,000 TD.
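
The two tables and the resulting allocation can be reproduced with a short Python sketch. The revenue figures and the values F3*(S3) are read from the Step 1 and Step 2 tables (assumed to be in thousands of TD, as the conclusion suggests). The helper name `best_decisions` is hypothetical, and the sketch assumes that whatever budget remains after Tunis and Sousse goes to Sfax.

```python
# Revenues in thousands of TD, read from the Step 1 and Step 2 tables.
R1 = {10: 30, 20: 50, 30: 90, 40: 100}        # Tunis: R1(X1)
R2 = {0: 0, 10: 45, 20: 60, 30: 70, 40: 90}   # Sousse: R2(X2)
F3_STAR = {10: 35, 20: 75, 30: 95, 40: 110}   # Sfax: F3*(S3)


def best_decisions(budget, revenue, next_value):
    """Return (best total, all maximising allocations) for one state.

    An allocation x is feasible only if the remaining budget is a state
    for which the next step has a value (the '-' cells in the tables).
    """
    totals = {x: r + next_value[budget - x]
              for x, r in revenue.items() if budget - x in next_value}
    best = max(totals.values())
    return best, [x for x, total in totals.items() if total == best]


# Step 2 (Sousse): one table row per remaining budget S2.
step2 = {s2: best_decisions(s2, R2, F3_STAR) for s2 in (20, 30, 40, 50)}
F2_STAR = {s2: value for s2, (value, _) in step2.items()}

# Step 1 (Tunis): the full budget S1 = 60 is available.
f1_star, x1_options = best_decisions(60, R1, F2_STAR)

# Recover the allocation by following the optimal decisions forward.
x1 = x1_options[0]
x2 = step2[60 - x1][1][0]
x3 = 60 - x1 - x2  # assumption: Sfax receives the remainder
print(f1_star, (x1, x2, x3))
```

Returning every maximising allocation, rather than only one, keeps ties such as the row S2 = 50 visible.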
the next 4 months. A production run incurs a fixed cost of 3 DT and a variable
cost of 1 DT per unit. At the end of each month, each unit left in stock incurs
a holding cost of 0.5 DT. In any month, the production capacity is 4 units and
the storage capacity is 2 units. The demand over the next 4 months is 1, 3, 2,
and 4 units, respectively. Given that the initial stock is empty, determine the
Ri(Si, Xi) = (3 + 1*Xi) + 0.5*(Si + Xi – Di)   if Xi ≠ 0
Ri(Si, Xi) = 0.5*(Si + Xi – Di)                if Xi = 0
✓ Fi(Si,Xi) = total minimum cost for months i, i+1, …, 4, given that at the
start of month i the stock level is Si and Xi units are to be produced
(i = 1, …, 4).
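
As a minimal sketch, the monthly cost Ri(Si, Xi) can be written as a small Python function. The cost figures (3, 1, and 0.5 DT) come from the problem statement. The function and constant names are assumptions made for illustration.

```python
FIXED_COST = 3.0     # DT per production run
UNIT_COST = 1.0      # DT per unit produced
HOLDING_COST = 0.5   # DT per unit in stock at the end of the month


def stage_cost(stock, produced, demand):
    """Ri(Si, Xi): production cost (if any) plus end-of-month holding cost."""
    production = FIXED_COST + UNIT_COST * produced if produced > 0 else 0.0
    return production + HOLDING_COST * (stock + produced - demand)


# Month 4 (D4 = 4) with S4 = 1 and X4 = 4: 3 + 4 + 0.5 * 1
print(stage_cost(1, 4, 4))
```

The fixed cost is charged only when production is nonzero, which is what makes the two cases of Ri differ.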
Example 2 - continued
✓ F4(S4,X4) = R4(S4,X4)
Step 4 (month 4), D4 = 4
F4(S4,X4) = R4(S4,X4)

                                                                Optimal decision
S4 \ X4 |       2       |         3         |         4         | F4*(S4) | X4*
0       |       -       |         -         | 3 + 4 + 0 = 7     |    7    |  4
1       |       -       | 3 + 3 + 0 = 6     | 3 + 4 + 0.5 = 7.5 |    6    |  3
2       | 3 + 2 + 0 = 5 | 3 + 3 + 0.5 = 6.5 | 3 + 4 + 1 = 8     |    5    |  2
Hence, the optimal production plan is X1* = 2, X2* = 4, X3* = 0, X4* = 4, with
a minimum cost of 20.5 dinars.
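
The whole backward recursion for this example can be checked with a short Python sketch. The data (demands, costs, capacities) come from the problem statement. The function names are hypothetical, and the sketch assumes that demand must be met in the month it occurs and that the end-of-month stock stays between 0 and 2 units in every month, including the last.

```python
DEMAND = [1, 3, 2, 4]                       # units for months 1..4
FIXED_COST, UNIT_COST, HOLDING_COST = 3.0, 1.0, 0.5
MAX_PRODUCTION, MAX_STOCK = 4, 2


def stage_cost(stock, produced, demand):
    """Ri(Si, Xi): production cost (if any) plus end-of-month holding cost."""
    production = FIXED_COST + UNIT_COST * produced if produced > 0 else 0.0
    return production + HOLDING_COST * (stock + produced - demand)


def solve(initial_stock=0):
    """Backward DP over the months, then a forward pass to read the plan."""
    # cost_to_go[s] plays the role of F*(s); nothing is owed after month 4.
    cost_to_go = {s: 0.0 for s in range(MAX_STOCK + 1)}
    policy = []
    for demand in reversed(DEMAND):
        new_cost, decision = {}, {}
        for stock in range(MAX_STOCK + 1):
            options = []
            for produced in range(MAX_PRODUCTION + 1):
                end_stock = stock + produced - demand
                # Feasible only if demand is met and storage is respected.
                if 0 <= end_stock <= MAX_STOCK and end_stock in cost_to_go:
                    total = (stage_cost(stock, produced, demand)
                             + cost_to_go[end_stock])
                    options.append((total, produced))
            if options:
                new_cost[stock], decision[stock] = min(options)
        cost_to_go = new_cost
        policy.insert(0, decision)

    # Forward pass: follow the optimal decisions from the initial stock.
    plan, stock = [], initial_stock
    for demand, decision in zip(DEMAND, policy):
        produced = decision[stock]
        plan.append(produced)
        stock += produced - demand
    return cost_to_go[initial_stock], plan


cost, plan = solve()
print(cost, plan)
```

Under these assumptions the sketch returns the plan and cost stated above.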