Probabilistic Dynamic Programming: Example 1

Tiger Supermarket has purchased 6 gallons of milk to allocate across 3 stores to maximize expected profit. The document outlines a probabilistic dynamic programming approach to solve this allocation problem. It defines the expected revenue functions for each store and uses these to recursively calculate the maximum expected revenue functions f1, f2, f3 that can be achieved by allocating gallons to stores 1, 2, and 3. This leads to two optimal solutions that allocate the 6 gallons across the stores to yield the maximum expected revenue of $9.75, that is, an expected profit of $3.75 after the $6.00 purchase cost.

Uploaded by

utari2210

Probabilistic Dynamic Programming

Example 1:
Tiger Supermarket chain has purchased 6 gallons of milk ($1 per gallon).
The milk is then sold at $2 per gallon.
Leftover milk can be sold at $0.50 per gallon.

Demand

            Store 1        Store 2        Store 3
Gallons     Probability    Probability    Probability
1           0.6            0.5            0.4
2           0.0            0.1            0.3
3           0.4            0.4            0.3

Tiger Supermarket wants to allocate the 6 gallons of milk to the three stores so as to
maximize the expected net daily profit (revenues less costs).
Note that it would be foolish to assign more than 3 gallons to any store, since no store's demand exceeds 3 gallons.
[Figure: solution space drawn as a network over Store 1, Store 2, and Store 3. Arcs are the allocations g1, g2, g3, each ranging from 0 to 3 gallons.]

Dynamic Programming
[Figure: the same network labeled with value functions: f1(6) at Store 1; f2(3), f2(4), f2(5), f2(6) at Store 2; f3(0), f3(1), f3(2), f3(3) at Store 3.]

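Dynamic Programming Procedure

Before we start, note that revenue is a random variable, while the cost is fixed at $1*6 = $6:
MAX {E[Profit]} = MAX {E[Revenue] - Costs} = MAX {E[Revenue]} - $1*6
Maximizing expected profit is therefore equivalent to maximizing expected revenue.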

Define:
rt(gt) = Expected revenue earned from gt gallons assigned to store t.
Example:
r3(1) = E[Revenue] = 0.4*($2) + 0.3*($2) + 0.3*($2)
r3(1) = $2.00
Example:
r3(2) = E[Revenue] = 0.4*($2 + $0.50) + 0.3*($4) + 0.3*($4)
r3(2) = $3.40

Store 1          Store 2          Store 3
r1(0) = $0.00    r2(0) = $0.00    r3(0) = $0.00
r1(1) = $2.00    r2(1) = $2.00    r3(1) = $2.00
r1(2) = $3.10    r2(2) = $3.25    r3(2) = $3.40
r1(3) = $4.20    r2(3) = $4.35    r3(3) = $4.35
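The expected revenues above can be reproduced with a short script. This is a minimal sketch, not part of the original slides: the names DEMAND, PRICE, SALVAGE, and expected_revenue are chosen here for illustration, and the demand probabilities are the ones given in the problem statement.

```python
# DEMAND[store][gallons demanded] = probability (from the problem statement).
DEMAND = {
    1: {1: 0.6, 2: 0.0, 3: 0.4},
    2: {1: 0.5, 2: 0.1, 3: 0.4},
    3: {1: 0.4, 2: 0.3, 3: 0.3},
}
PRICE = 2.00    # selling price per gallon
SALVAGE = 0.50  # price per leftover gallon


def expected_revenue(store, gallons):
    """r_t(g_t): expected revenue from assigning `gallons` to `store`."""
    total = 0.0
    for demand, prob in DEMAND[store].items():
        sold = min(gallons, demand)   # cannot sell more than was assigned
        leftover = gallons - sold     # unsold milk goes at the salvage price
        total += prob * (PRICE * sold + SALVAGE * leftover)
    return total


for store in (1, 2, 3):
    print(store, [round(expected_revenue(store, g), 2) for g in range(4)])
```

Each printed row lists rt(0) through rt(3) for one store and should agree with the table.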

[Figure: each store t receives gt gallons, faces random demand dt, and earns expected revenue rt(gt), for t = 1, 2, 3.]

Define:
ft(x) : maximum expected revenue earned from x gallons of milk assigned
to stores t, t+1,..,3

f3(x) = MAX { r3(g3) }, maximizing over g3, where g3 ≤ x
f2(x) = MAX { r2(g2) + f3(x - g2) }, maximizing over g2, where g2 ≤ x
f1(x) = MAX { r1(g1) + f2(x - g1) }, maximizing over g1, where g1 ≤ x

In general:
ft(x) = MAX { rt(gt) + f_{t+1}(x - gt) }, maximizing over gt, where gt ≤ x
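The recursion can be sketched in a few lines of Python. This is an illustrative implementation under stated assumptions, not the slides' own code: the function names f and optimal_allocations are hypothetical, revenues are stored in integer cents so that ties between allocations compare exactly, and each store is capped at 3 gallons as noted in the problem statement.

```python
from functools import lru_cache

# Expected revenues r_t(g) in cents, copied from the revenue table.
R = {
    1: [0, 200, 310, 420],
    2: [0, 200, 325, 435],
    3: [0, 200, 340, 435],
}
LAST = 3  # index of the last store


@lru_cache(maxsize=None)
def f(t, x):
    """Maximum expected revenue (cents) from x gallons for stores t..LAST."""
    choices = range(min(x, 3) + 1)  # never more than 3 gallons per store
    if t == LAST:
        return max(R[t][g] for g in choices)
    return max(R[t][g] + f(t + 1, x - g) for g in choices)


def optimal_allocations(t, x):
    """All allocations (g_t, ..., g_LAST) that attain f(t, x)."""
    result = []
    for g in range(min(x, 3) + 1):
        if t == LAST:
            if R[t][g] == f(t, x):
                result.append((g,))
        elif R[t][g] + f(t + 1, x - g) == f(t, x):
            result.extend((g,) + rest for rest in optimal_allocations(t + 1, x - g))
    return result


print(f(1, 6) / 100)              # 9.75
print(optimal_allocations(1, 6))  # [(1, 3, 2), (2, 2, 2)]
```

Memoizing f with lru_cache means each state (t, x) is evaluated once, which mirrors filling in the f3, f2, f1 values stage by stage.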

f3(x) = maximum expected revenue earned from x gallons of milk assigned to store 3

f3(0) = r3(0) = $0
f3(1) = r3(1) = $2.00
f3(2) = r3(2) = $3.40
f3(3) = r3(3) = $4.35

f2(3) = maximum expected revenue earned from 3 gallons of milk assigned to
stores 2 and 3

f2(3) = MAX of:
r2(3) + f3(0) = 4.35 + 0    = 4.35
r2(2) + f3(1) = 3.25 + 2.00 = 5.25
r2(1) + f3(2) = 2.00 + 3.40 = 5.40
r2(0) + f3(3) = 0    + 4.35 = 4.35

f2(3) = 5.40
Assign 1 gallon to store 2

f2(4) = maximum expected revenue earned from 4 gallons of milk
assigned to stores 2 and 3

f2(4) = MAX of:
r2(3) + f3(1) = 4.35 + 2.00 = 6.35
r2(2) + f3(2) = 3.25 + 3.40 = 6.65
r2(1) + f3(3) = 2.00 + 4.35 = 6.35

f2(4) = 6.65
Assign 2 gallons to store 2

So far, we have

f3(0) = 0       Assign 0 gallons to store 3
f3(1) = 2.00    Assign 1 gallon to store 3
f3(2) = 3.40    Assign 2 gallons to store 3
f3(3) = 4.35    Assign 3 gallons to store 3
f2(3) = 5.40    Assign 1 gallon to store 2
f2(4) = 6.65    Assign 2 gallons to store 2
f2(5) = 7.75    Assign 3 gallons to store 2
f2(6) = 8.70    Assign 3 gallons to store 2

*Refer to the textbook for the computation of f2(5) and f2(6).

f1(6) = maximum expected revenue earned from 6 gallons of milk
assigned to stores 1, 2 and 3

f1(6) = MAX of:
r1(3) + f2(3) = 4.20 + 5.40 = 9.60
r1(2) + f2(4) = 3.10 + 6.65 = 9.75
r1(1) + f2(5) = 2.00 + 7.75 = 9.75
r1(0) + f2(6) = 0    + 8.70 = 8.70

f1(6) = 9.75
Assign 1 or 2 gallons to store 1

We have two optimal solutions

Solution 1:
Store 1: 2 gallons
Store 2: 2 gallons
Store 3: 2 gallons

Solution 2:
Store 1: 1 gallon
Store 2: 3 gallons
Store 3: 2 gallons

Solution 2

              Store 1    Store 2    Store 3    TOTAL
Gallons       1          3          2          6
E[Revenues]   $2.00      $4.35      $3.40      $9.75
Cost          $1.00      $3.00      $2.00      $6.00
E[Profits]    $1.00      $1.35      $1.40      $3.75

Monte Carlo Simulation

Simulation  g1  g2  g3  d1    d2    d3    Revenue 1  Revenue 2  Revenue 3  Cost 1  Cost 2  Cost 3  Total profit
1           1   3   2   1     1     1     $2.00      $3.00      $2.50      $1.00   $3.00   $2.00   $1.50
2           1   3   2   3     3     2     $2.00      $6.00      $4.00      $1.00   $3.00   $2.00   $6.00
3           1   3   2   1     3     1     $2.00      $6.00      $2.50      $1.00   $3.00   $2.00   $4.50
4           1   3   2   3     1     3     $2.00      $3.00      $4.00      $1.00   $3.00   $2.00   $3.00
5           1   3   2   1     1     2     $2.00      $3.00      $4.00      $1.00   $3.00   $2.00   $3.00
6           1   3   2   1     1     3     $2.00      $3.00      $4.00      $1.00   $3.00   $2.00   $3.00
7           1   3   2   3     3     1     $2.00      $6.00      $2.50      $1.00   $3.00   $2.00   $4.50
8           1   3   2   1     3     1     $2.00      $6.00      $2.50      $1.00   $3.00   $2.00   $4.50
...
98          1   3   2   3     1     3     $2.00      $3.00      $4.00      $1.00   $3.00   $2.00   $3.00
99          1   3   2   3     1     2     $2.00      $3.00      $4.00      $1.00   $3.00   $2.00   $3.00
100         1   3   2   1     3     3     $2.00      $6.00      $4.00      $1.00   $3.00   $2.00   $6.00

Average                 1.76  1.93  1.85  $2.00      $4.40      $3.37      $1.00   $3.00   $2.00   $3.77
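A simulation like the one tabulated above can be sketched as follows. This is an illustrative script rather than the slides' spreadsheet: the function name simulate_profit, the fixed seed, and the number of trials are assumptions made here. With many more than 100 trials, the average profit should settle near the expected profit of $3.75.

```python
import random

# DEMAND[store][gallons demanded] = probability (from the problem statement).
DEMAND = {
    1: {1: 0.6, 2: 0.0, 3: 0.4},
    2: {1: 0.5, 2: 0.1, 3: 0.4},
    3: {1: 0.4, 2: 0.3, 3: 0.3},
}


def simulate_profit(allocation, rng):
    """Profit of one simulated day for allocation = (g1, g2, g3)."""
    profit = 0.0
    for store, gallons in zip((1, 2, 3), allocation):
        values = list(DEMAND[store].keys())
        weights = list(DEMAND[store].values())
        demand = rng.choices(values, weights=weights)[0]  # sample d_t
        sold = min(gallons, demand)
        # $2 per gallon sold, $0.50 per leftover gallon, $1 cost per gallon.
        profit += 2.00 * sold + 0.50 * (gallons - sold) - 1.00 * gallons
    return profit


rng = random.Random(0)   # fixed seed (arbitrary) for repeatability
trials = 100_000         # assumed sample size, larger than the 100 runs shown
average = sum(simulate_profit((1, 3, 2), rng) for _ in range(trials)) / trials
print(round(average, 2))
```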

Example 2: Consider the following three-period inventory problem.

                              Period 1     Period 2     Period 3
Production                    x1           x2           x3
Production cost               3 + 2x1      3 + 2x2      3 + 2x3
Production capacity           4 units      4 units      4 units
Demand                        1 or 2, equally likely, in every period
End-of-period holding cost    $1           $1           $1
Inventory capacity            3 units      3 units      3 units

The production cost applies only when something is produced (c(0) = 0), and the holding cost is $1 per unit.
Initial inventory: 1 unit (i1 = 1)

Inventory on hand at the end of period 3 can be sold at $2 per unit

Assume that in period 1 we decide to produce 2 units (x1 = 2).

SIMULATION, PERIOD 1
Demand: random
Production: $3 + $2(2)
Inventory on hand: $1
Cost: $8

Assume that now we decide to produce x2 = 1.

SIMULATION, PERIOD 2
Demand: random
Production: $3 + $2(1)
Inventory on hand: $1
Cost: $6

Assume that now we decide to produce x3 = 1.

SIMULATION, PERIOD 3
Demand: random
Production: $3 + $2(1)
Inventory on hand: $0
Cost: $5

Total cost
$8 + $6 + $5 - $2 = $17

Solution Space
[Figure: network of inventory levels across Period 1, Period 2, and Period 3. Each arc is labeled with a production and demand pair, for example x1 = 2, d1 = 1 or x1 = 3, d1 = 2.]

A FEASIBLE SOLUTION
[Figure: one feasible path through the network, starting with x1 = 3 in period 1 and followed by production choices x2 and x3 that depend on the inventory reached.]

Dynamic Programming
[Figure: the network labeled with value functions: f1(1) in Period 1; f2(0), f2(1), f2(2), f2(3) in Period 2; f3(0), f3(1), f3(2), f3(3) in Period 3. Arcs carry the decisions x1, x2, x3 and the demands d = 1 or d = 2, ending in the end-of-horizon inventory.]

Dynamic Programming Solution

[Figure: i1 = 1 enters Period 1 (decision x1, demand d1), giving i2; Period 2 (x2, d2) gives i3; Period 3 (x3, d3) ends the horizon. The value functions are f1(1), f2(i), f3(i).]

Define:
ft(i) = minimum expected net cost incurred during the periods t, t+1,..,3
when the inventory at the beginning of period t is i units.
PC = production cost, HC = holding cost, SV = salvage value.

f3(i) = min over x3 of { PC of x3 + E[HC of x3 + i - d3] - E[SV of x3 + i - d3] }
f2(i) = min over x2 of { PC of x2 + E[HC of x2 + i - d2] + E[f3(x2 + i - d2)] }
f1(1) = min over x1 of { PC of x1 + E[HC of x1 + 1 - d1] + E[f2(x1 + 1 - d1)] }

Computing f3(i)
f3(i) = min over x3 of { PC of x3 + E[HC of x3+i-d3] - E[SV of x3+i-d3] }
f3(i) = min over x3 of { c(x3) + $1[0.5(x3+i-1) + 0.5(x3+i-2)] - $2[0.5(x3+i-1) + 0.5(x3+i-2)] }
f3(i) = min over x3 of { c(x3) - x3 - i + 1.5 }

Range of values for i: 0,1,2,3
Range of values for x3:
If i = 0 => 2 ≤ x3 ≤ 4
If i = 1 => 1 ≤ x3 ≤ 3
If i = 2 => 0 ≤ x3 ≤ 2
If i = 3 => 0 ≤ x3 ≤ 1

f3(i) = min over x3 of { c(x3) - x3 - i + 1.5 }

i    x3    c(x3)    c(x3) - x3 - i + 1.5    f3(i)    x3*
0    2     7        6.5                     6.5      2
     3     9        7.5
     4     11       8.5
1    1     5        4.5                     4.5      1
     2     7        5.5
     3     9        6.5
2    0     0        -0.5                    -0.5     0
     1     5        3.5
     2     7        4.5
3    0     0        -1.5                    -1.5     0
     1     5        2.5

Computing f2(i)
f2(i) = min over x2 of { PC of x2 + E[HC of x2+i-d2] + E[f3(x2+i-d2)] }
f2(i) = min over x2 of { c(x2) + $1[0.5(x2+i-1) + 0.5(x2+i-2)] + 0.5f3(x2+i-1) + 0.5f3(x2+i-2) }
f2(i) = min over x2 of { c(x2) + x2 + i - 1.5 + 0.5f3(x2+i-1) + 0.5f3(x2+i-2) }

Range of values for i: 0,1,2,3
Range of values for x2:
If i = 0 => 2 ≤ x2 ≤ 4
If i = 1 => 1 ≤ x2 ≤ 3
If i = 2 => 0 ≤ x2 ≤ 2
If i = 3 => 0 ≤ x2 ≤ 1

f2(i) = min over x2 of { c(x2) + x2 + i - 1.5 + 0.5f3(x2+i-1) + 0.5f3(x2+i-2) }

The "Total" column is the full expression inside the braces.

i    x2    c(x2)    f3(x2+i-1)    f3(x2+i-2)    Total    f2(i)    x2*
0    2     7        4.5           6.5           13       12.5     3 or 4
     3     9        -0.5          4.5           12.5
     4     11       -1.5          -0.5          12.5
1    1     5        4.5           6.5           11       10.5     2 or 3
     2     7        -0.5          4.5           10.5
     3     9        -1.5          -0.5          10.5
2    0     0        4.5           6.5           6        6        0
     1     5        -0.5          4.5           8.5
     2     7        -1.5          -0.5          8.5
3    0     0        -0.5          4.5           3.5      3.5      0
     1     5        -1.5          -0.5          6.5

Computing f1(1)
f1(1) = min over x1 of { PC of x1 + E[HC of x1+1-d1] + E[f2(x1+1-d1)] }
f1(1) = min over x1 of { c(x1) + $1[0.5(x1+1-1) + 0.5(x1+1-2)] + 0.5f2(x1+1-1) + 0.5f2(x1+1-2) }
f1(1) = min over x1 of { c(x1) + x1 - 0.5 + 0.5f2(x1) + 0.5f2(x1-1) }

Range of values for x1: 1 ≤ x1 ≤ 3

The "Total" column is the full expression inside the braces.

x1    c(x1)    f2(x1)    f2(x1-1)    Total    f1(1)    x1*
1     5        10.5      12.5        17       16.25    3
2     7        6.0       10.5        16.75
3     9        3.5       6.0         16.25
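The three-stage computation for the inventory problem can be sketched in Python. This is a minimal illustration under stated assumptions, not the slides' own method of presentation: the names production_cost, feasible, and f are hypothetical, feasible production levels are taken to be those that always meet demand and never exceed the inventory capacity of 3 or the production capacity of 4, and the salvage value is modeled as a terminal "stage 4" with cost -2 per unit.

```python
from functools import lru_cache


def production_cost(x):
    """c(x) = 3 + 2x when x > 0, and 0 when nothing is produced."""
    return 0 if x == 0 else 3 + 2 * x


def feasible(i):
    """Production levels that cover demand 2 and respect both capacities."""
    return [x for x in range(5) if i + x >= 2 and i + x - 1 <= 3]


@lru_cache(maxsize=None)
def f(t, i):
    """Return (minimum expected net cost for periods t..3, optimal x_t values)."""
    if t == 4:                      # end of horizon: sell leftovers at $2 each
        return -2.0 * i, ()
    best, argmin = None, []
    for x in feasible(i):
        cost = float(production_cost(x))
        for demand in (1, 2):       # each demand has probability 0.5
            end = i + x - demand    # end-of-period inventory, held at $1/unit
            cost += 0.5 * (1.0 * end + f(t + 1, end)[0])
        if best is None or cost < best:
            best, argmin = cost, [x]
        elif cost == best:          # values are multiples of 0.25, exact in floats
            argmin.append(x)
    return best, tuple(argmin)


print(f(1, 1))                         # (16.25, (3,))
print([f(2, i) for i in range(4)])
print([f(3, i) for i in range(4)])
```

The second and third print statements list f2(i) and f3(i) with their minimizing production levels for i = 0, 1, 2, 3, which can be compared against the tables.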

Optimal policy:
Period 1: Produce 3 units
Period 2: If inventory on hand is equal to
0 produce 3 or 4 units
1 produce 2 or 3 units
2 produce 0 units
3 produce 0 units
(With 2 or 3 units on hand, do not produce any units.)

Period 3: If inventory on hand is equal to
0 produce 2 units
1 produce 1 unit
2 produce 0 units
3 produce 0 units
(If inventory on hand < 2 units, produce enough units to meet the highest demand. Otherwise, do not produce any units.)

Expected cost of this policy is $16.25

MONTE CARLO SIMULATION

Simulation  i1  x1  d1      i2     x2  d2     i3    x3    d3      i4      c1       c2    c3      SV       TC
1           1   3   2       2      0   2      0     2     2       0       11       0     7       0        18
2           1   3   1       3      0   1      2     0     2       0       12       2     0       0        14
3           1   3   1       3      0   1      2     0     1       1       12       2     1       -2       13
4           1   3   2       2      0   2      0     2     1       1       11       0     8       -2       17
5           1   3   2       2      0   2      0     2     2       0       11       0     7       0        18
6           1   3   1       3      0   2      1     1     1       1       12       1     6       -2       17
7           1   3   1       3      0   1      2     0     2       0       12       2     0       0        14
...
14          1   3   1       3      0   1      2     0     1       1       12       2     1       -2       13
15          1   3   1       3      0   2      1     1     2       0       12       1     5       0        18
16          1   3   1       3      0   1      2     0     1       1       12       2     1       -2       13
17          1   3   1       3      0   2      1     1     2       0       12       1     5       0        18
18          1   3   2       2      0   1      1     1     2       0       11       1     5       0        17
...
94          1   3   1       3      0   2      1     1     1       1       12       1     6       -2       17
95          1   3   1       3      0   2      1     1     1       1       12       1     6       -2       17
96          1   3   1       3      0   1      2     0     1       1       12       2     1       -2       13
97          1   3   2       2      0   1      1     1     1       1       11       1     6       -2       16
98          1   3   1       3      0   1      2     0     1       1       12       2     1       -2       13
99          1   3   2       2      0   2      0     2     1       1       11       0     8       -2       17
100         1   3   2       2      0   1      1     1     2       0       11       1     5       0        17
101         1   3   2       2      0   1      1     1     1       1       11       1     6       -2       16
102         1   3   1       3      0   2      1     1     2       0       12       1     5       0        18
103         1   3   1       3      0   2      1     1     2       0       12       1     5       0        18
...
495         1   3   1       3      0   1      2     0     1       1       12       2     1       -2       13
496         1   3   1       3      0   2      1     1     2       0       12       1     5       0        18
497         1   3   1       3      0   1      2     0     2       0       12       2     0       0        14
498         1   3   1       3      0   1      2     0     2       0       12       2     0       0        14

Average             1.5241  2.476      1.466  1.01  0.99  1.5542  0.4458  11.4759  1.01  4.6426  -0.8916  16.237
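The simulation of the optimal policy can be sketched as follows. This is an illustrative script, not the spreadsheet behind the table: the names POLICY and simulate_once, the fixed seed, and the trial count are assumptions made here, and where the optimal policy allows two production levels (period 2 with 0 or 1 units on hand) the smaller one is picked arbitrarily. The average total cost should settle near the expected cost of $16.25.

```python
import random


def production_cost(x):
    """c(x) = 3 + 2x when x > 0, and 0 when nothing is produced."""
    return 0 if x == 0 else 3 + 2 * x


# POLICY[period][inventory on hand] = units to produce (from the optimal policy).
POLICY = {
    1: {1: 3},
    2: {0: 3, 1: 2, 2: 0, 3: 0},
    3: {0: 2, 1: 1, 2: 0, 3: 0},
}


def simulate_once(rng):
    """Total net cost (TC) of one three-period run starting with i1 = 1."""
    inventory, total = 1, 0.0
    for period in (1, 2, 3):
        produced = POLICY[period][inventory]
        demand = rng.choice((1, 2))              # 1 or 2, equally likely
        inventory = inventory + produced - demand
        # Period cost c_t: production cost plus $1 per unit held at period end.
        total += production_cost(produced) + 1.0 * inventory
    return total - 2.0 * inventory               # salvage leftovers at $2 each


rng = random.Random(0)   # fixed seed (arbitrary) for repeatability
trials = 100_000         # assumed sample size
average = sum(simulate_once(rng) for _ in range(trials)) / trials
print(round(average, 2))
```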
