Stochastic Dynamic Programming

Extending the usage of dynamic programming

Structured problems

Stochastic Dynamic Programming

V. Leclère

December 8th 2023

Vincent Leclère Dynamic Programming 08/12/2023 1 / 36

Presentation Outline

1 Stochastic Dynamic Programming

Stochastic optimal control problem
Dynamic Programming principle
Example

2 Extending the usage of dynamic programming

More flexibility in the framework
Continuous state space

3 Structured problems
Linear Quadratic case
Linear convex case

Stochastic Controlled Dynamic System

A discrete-time controlled stochastic dynamic system is defined by its dynamics

\[ x_{t+1} = f_t(x_t, u_t, \xi_{t+1}) \]

and its initial state

\[ x_0 = \xi_0. \]

The variables are:
    $x_t$, the state of the system,
    $u_t$, the control applied to the system at time $t$,
    $\xi_t$, an exogenous noise.

Usually, $x_t \in X_t$ and $u_t$ belongs to a set depending on the state:

\[ u_t \in U_t(x_t). \]
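
The definition above can be made concrete with a short simulation. The following is only a minimal sketch: the function `simulate`, the dam capacity of 100, the uniform inflow and the "turbine at most 10" rule are all hypothetical choices made for illustration, not part of the lecture.

```python
import random

def simulate(f, policy, x0, noises):
    """Roll the dynamics x_{t+1} = f(t, x_t, u_t, xi_{t+1}) forward in time."""
    xs, us = [x0], []
    x = x0
    for t, xi in enumerate(noises):
        u = policy(t, x)      # the control only sees the current state
        x = f(t, x, u, xi)    # the noise is revealed after the decision
        us.append(u)
        xs.append(x)
    return xs, us

CAPACITY = 100.0  # hypothetical dam capacity

def dam_dynamics(t, x, u, xi):
    # stock after turbining u and receiving inflow xi, capped at capacity
    return min(CAPACITY, x - u + xi)

def dam_policy(t, x):
    # hypothetical rule: turbine at most 10 units, never more than the stock
    return min(x, 10.0)

rng = random.Random(0)
inflows = [rng.uniform(0.0, 20.0) for _ in range(5)]
states, controls = simulate(dam_dynamics, dam_policy, 50.0, inflows)
```

Here the constraint $u_t \in U_t(x_t)$ is enforced inside the policy itself, which never turbines more water than is available.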


Examples

Stock of water in a dam:
    $x_t$ is the amount of water in the dam at time $t$,
    $u_t$ is the amount of water turbined at time $t$,
    $\xi_{t+1}$ is the inflow of water in $[t, t+1)$.

Boat in the ocean:
    $x_t$ is the position of the boat at time $t$,
    $u_t$ is the direction and speed chosen for $[t, t+1)$,
    $\xi_{t+1}$ is the wind and current for $[t, t+1)$.

Subway network:
    $x_t$ is the position and speed of each train at time $t$,
    $u_t$ is the acceleration chosen at time $t$,
    $\xi_{t+1}$ is the delay due to passengers and incidents on the network for $[t, t+1)$.


More considerations about the state

Physical state: the physical value of the controlled system, e.g. the amount of water in your dam, the position of your boat...

Information state: the physical state plus the information you have about the noises, e.g. the amount of water and the weather forecast...

Knowledge state: your current belief about the actual information state (in case of noisy observations), represented as a probability distribution over information states.

The state, in the Dynamic Programming sense, is the information required to define an optimal solution.


Optimization Problem

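\[
\begin{aligned}
\min_{u} \quad & \mathbb{E}\Big[ \sum_{t=0}^{T-1} L_t(x_t, u_t, \xi_{t+1}) + K(x_T) \Big] \\
\text{s.t.} \quad & x_{t+1} = f_t(x_t, u_t, \xi_{t+1}), \quad x_0 = \xi_0 \\
& u_t \in U_t(x_t), \quad x_t \in X_t \\
& \sigma(u_t) \subset \sigma(\xi_0, \dots, \xi_t)
\end{aligned}
\]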

1 We want to minimize the expectation of the sum of costs.
2 The system follows a dynamic given by the function $f_t$.
3 There are stagewise constraints on the controls and states.
4 The controls are functions of the past noises (= non-anticipativity).

Equivalently, non-anticipativity means that each control is a function of the past noises, so we optimize over the functions $\Phi_t$:

\[
\begin{aligned}
\min_{\Phi} \quad & \mathbb{E}\Big[ \sum_{t=0}^{T-1} L_t(x_t, u_t, \xi_{t+1}) + K(x_T) \Big] \\
\text{s.t.} \quad & x_{t+1} = f_t(x_t, u_t, \xi_{t+1}), \quad x_0 = \xi_0 \\
& u_t \in U_t(x_t), \quad x_t \in X_t \\
& u_t = \Phi_t(\xi_0, \dots, \xi_t)
\end{aligned}
\]

Optimization Problem with independence of noises

Assuming stagewise independence of the noises, we can compress information in the following way: instead of a function $\Phi_t(\xi_0, \dots, \xi_t)$ of all past noises, the control is a function $\pi_t$ of the current state only.

\[
\begin{aligned}
\min_{\pi} \quad & \mathbb{E}\Big[ \sum_{t=0}^{T-1} L_t(x_t, u_t, \xi_{t+1}) + K(x_T) \Big] \\
\text{s.t.} \quad & x_{t+1} = f_t(x_t, u_t, \xi_{t+1}), \quad x_0 = \xi_0 \\
& u_t \in U_t(x_t), \quad x_t \in X_t \\
& u_t = \pi_t(x_t)
\end{aligned}
\]

Keeping only the state

For notational ease, we want to formulate the problem above only with states.
Let $X_t(x_t, \xi_{t+1})$ be the set of reachable states, i.e.,

\[ X_t(x_t, \xi_{t+1}) := \big\{ x_{t+1} \in X_{t+1} \mid \exists u_t \in U_t(x_t, \xi_{t+1}),\ x_{t+1} = f_t(x_t, u_t, \xi_{t+1}) \big\}, \]

and $c_t(x_t, x_{t+1}, \xi_{t+1})$ the transition cost from $x_t$ to $x_{t+1}$, i.e.,

\[ c_t(x_t, x_{t+1}, \xi_{t+1}) := \min_{u_t \in U_t(x_t, \xi_{t+1})} \big\{ L_t(x_t, u_t, \xi_{t+1}) \mid x_{t+1} = f_t(x_t, u_t, \xi_{t+1}) \big\}. \]

Then, under independence of noises, the optimization problem reads

\[
\begin{aligned}
\min_{\psi} \quad & \mathbb{E}\Big[ \sum_{t=0}^{T-1} c_t(x_t, x_{t+1}, \xi_{t+1}) + K(x_T) \Big] \\
\text{s.t.} \quad & x_{t+1} \in X_t(x_t, \xi_{t+1}), \quad x_0 = \xi_0 \\
& x_{t+1} = \psi_t(x_t, \xi_{t+1})
\end{aligned}
\]


Bellman’s Principle of Optimality

An optimal policy has the property that whatever the initial state and initial decision are, the remaining decisions must constitute an optimal policy with regard to the state resulting from the first decision. (Richard Bellman)

Richard Ernest Bellman (August 26, 1920 – March 19, 1984)


The shortest path on a graph illustrates Bellman’s Principle of Optimality

For an auto travel analogy, suppose that the fastest route from Los Angeles to Boston passes through Chicago. The principle of optimality translates to the obvious fact that the Chicago to Boston portion of the route is also the fastest route for a trip that starts from Chicago and ends in Boston. (Dimitri P. Bertsekas)


Idea behind dynamic programming

If noises are time independent, then

1 The cost-to-go at time $t$ depends only on the current state.
2 We can compute the cost-to-go recursively for each position, starting from the terminal state and computing optimal trajectories backward.

The optimal cost-to-go of being in state $x$ at time $t$ is:

\[ V_t(x) = \min_{u \in U_t(x)} \mathbb{E}\Big[ L_t(x, u, \xi_{t+1}) + V_{t+1}\big(f_t(x, u, \xi_{t+1})\big) \Big] \]

At time $t$, $V_{t+1}$ gives the cost of the future.
Dynamic Programming is a time decomposition method.


Idea Behind Dynamic Programming

\[
\begin{aligned}
\min_{u_0 \in U_0(x_0)} \quad & \mathbb{E}\Big[ L_0(x_0, u_0, \xi_1) + \underbrace{\min_{u_1, \dots, u_{T-1}} \mathbb{E}\Big[ \sum_{t=1}^{T-1} L_t(x_t, u_t, \xi_{t+1}) + K(x_T) \Big]}_{=: V_1(x_1)} \Big] \\
\text{s.t.} \quad & x_1 = f_0(x_0, u_0, \xi_1) \\
& x_{t+1} = f_t(x_t, u_t, \xi_{t+1}) \in X_{t+1} \\
& u_t \in U_t(x_t) \\
& \sigma(u_t) \subset \sigma(x_t)
\end{aligned}
\]

In general the measurability constraint is $\sigma(u_t) \subset \sigma(\xi_0, \dots, \xi_t)$. Independence of noises is what allows replacing it with $\sigma(u_t) \subset \sigma(x_t)$, so that the inner problem depends on the past only through $x_1$.

Vincent Leclère Dynamic Programming 08/12/2023 11 / 36


Definition of Bellman Value Function

Bellman’s value function $V_{t_0}(x)$ is defined as the value of the problem starting at time $t_0$ from the state $x$.
More precisely we have

\[
\begin{aligned}
V_{t_0}(x) = \min \quad & \mathbb{E}\Big[ \sum_{t=t_0}^{T-1} L_t(x_t, u_t, \xi_{t+1}) + K(x_T) \Big] \\
\text{s.t.} \quad & x_{t+1} = f_t(x_t, u_t, \xi_{t+1}), \quad x_{t_0} = x \\
& u_t \in U_t(x_t), \quad x_t \in X_t \\
& \sigma(u_t) \subset \sigma(\xi_0, \dots, \xi_t)
\end{aligned}
\]


Bellman’s recursion

The core idea of Bellman’s recursion is to see the total (expected) cost as the sum of the current cost and the future cost:

\[
\begin{aligned}
V_t(x_t) = \min_{u_t} \quad & \mathbb{E}\big[ L_t(x_t, u_t, \xi_{t+1}) + V_{t+1}(x_{t+1}) \big] \\
\text{s.t.} \quad & x_{t+1} = f_t(x_t, u_t, \xi_{t+1}) \\
& u_t \in U_t(x_t) \\
& x_{t+1} \in X_{t+1}
\end{aligned}
\]

And we know the final cost function:

\[ V_T(x_T) = K(x_T). \]


Dynamic Programming Algorithm - Discrete Case

Data: Problem parameters
Result: optimal strategy and value
$V_T \equiv K$ ; $V_t \equiv 0$
for $t : T-1 \to 0$ do
    for $x \in X_t$ do
        $V_t(x) = \min_{u \in U_t(x)} \mathbb{E}\big[ L_t(x, u, \xi_{t+1}) + V_{t+1}(f_t(x, u, \xi_{t+1})) \big]$

Writing out the minimization over $u$ and the expectation over $\xi$ as explicit loops, with $Q_t(x, u)$ the expected cost of control $u$ in state $x$:

Data: Problem parameters
Result: optimal strategy and value
$V_T \equiv K$ ; $V_t \equiv 0$
for $t : T-1 \to 0$ do
    for $x \in X_t$ do
        $V_t(x) = +\infty$
        for $u \in U_t(x)$ do
            for $\xi \in \Xi_{t+1}$ do
                $x_{t+1}^{\xi} = f_t(x, u, \xi)$
                if $x_{t+1}^{\xi} \in X_{t+1}$ then
                    $\dot{Q}_t(x, u, \xi) = L_t(x, u, \xi) + V_{t+1}(x_{t+1}^{\xi})$
                else
                    $\dot{Q}_t(x, u, \xi) = +\infty$
            $Q_t(x, u) = \sum_{\xi \in \Xi_{t+1}} \mathbb{P}(\xi_{t+1} = \xi)\, \dot{Q}_t(x, u, \xi)$
            if $Q_t(x, u) < V_t(x)$ then
                $V_t(x) = Q_t(x, u)$ ; $\pi_t(x) = u$
Algorithm 1: Classical stochastic DP algorithm
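
The loops of Algorithm 1 translate almost line by line into code. The following is a minimal sketch, not a reference implementation: the function name `backward_dp`, its callback-based interface and the toy inventory model below it are assumptions made for illustration, and every noise value is assumed to have strictly positive probability.

```python
import math

def backward_dp(T, states, controls, noises, f, L, K):
    """Finite-horizon stochastic DP on finite sets.

    states(t)      -> iterable of states at time t
    controls(t, x) -> iterable of admissible controls U_t(x)
    noises(t)      -> list of (xi, prob) pairs for xi_t, with prob > 0
    """
    V = [dict() for _ in range(T + 1)]
    pi = [dict() for _ in range(T)]
    for x in states(T):
        V[T][x] = K(x)                      # V_T = K
    for t in range(T - 1, -1, -1):          # backward in time
        for x in states(t):
            V[t][x] = math.inf
            for u in controls(t, x):
                q = 0.0                     # Q_t(x, u)
                for xi, p in noises(t + 1):
                    x_next = f(t, x, u, xi)
                    if x_next not in V[t + 1]:
                        q = math.inf        # leaving X_{t+1} is forbidden
                        break
                    q += p * (L(t, x, u, xi) + V[t + 1][x_next])
                if q < V[t][x]:
                    V[t][x] = q
                    pi[t][x] = u
    return V, pi

# Hypothetical toy model: stock in {0, 1, 2}, order u in {0, 1} at unit cost,
# demand xi in {0, 1} with probability 1/2 each, no terminal cost.
V, pi = backward_dp(
    T=2,
    states=lambda t: [0, 1, 2],
    controls=lambda t, x: [0, 1],
    noises=lambda t: [(0, 0.5), (1, 0.5)],
    f=lambda t, x, u, xi: x + u - xi,
    L=lambda t, x, u, xi: float(u),
    K=lambda x: 0.0,
)
```

In this toy model an empty stock forces an order (otherwise a demand of 1 would leave the state set), which is why `pi[0][0]` is 1 while `pi[0][1]` is 0.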

3 curses of dimensionality
Complexity $= O(T \times |X_t| \times |U_t| \times |\Xi_t|)$

Linear in the number of time steps, but we have 3 curses of dimensionality:

1 State. Complexity is exponential in the dimension of $X_t$.
  E.g. 3 independent states each taking 10 values leads to a loop over 1000 points.
2 Decision. Complexity is exponential in the dimension of $U_t$.
  ⇝ due to exhaustive minimization of the inner problem. Can be accelerated using a faster method (e.g. a MILP solver).
3 Expectation. Complexity is exponential in the dimension of $\Xi_t$.
  ⇝ due to expectation computation. Can be accelerated through Monte-Carlo approximation (still at least 1000 points).

In practice, DP is not used for a state of dimension more than 5.

Illustrating dynamic programming with the damsvalley example

[Figure: dams valley with the sites Gnioure, Izourt, Soulcem, Auzat and Sabart]


Illustrating the curse of dimensionality

We are in dimension 5 (not so high in the world of big data!) with 52 timesteps (common in energy management), plus 5 controls and 5 independent noises.

1 We discretize each state dimension in 100 values: $|X_t| = 100^5 = 10^{10}$.
2 We discretize each control dimension in 100 values: $|U_t| = 100^5 = 10^{10}$.
3 We use optimal quantization to discretize the noise space in 10 values: $|\Xi_t| = 10$.

Number of flops: $O(52 \times 10^{10} \times 10^{10} \times 10) \approx O(10^{23})$.
In the TOP500, the best computer performs about $10^{17}$ flops/s.
Even with the most powerful computer, solving this problem takes about $10^{23} / 10^{17} = 10^{6}$ seconds, i.e. roughly 12 days.
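
The order of magnitude can be checked with a few lines of arithmetic. This is a small sketch using only the figures quoted on the slide; counting one flop per (time, state, control, noise) combination is a simplifying assumption. The unrounded product is 5.2 × 10^22, which gives about 6 days, while rounding up to 10^23 gives the roughly 12 days quoted.

```python
T = 52                      # weekly time steps
n_states = 100 ** 5         # 5 state dimensions, 100 values each
n_controls = 100 ** 5       # 5 control dimensions, 100 values each
n_noises = 10               # quantized noise values

# assumption: one flop per (t, x, u, xi) combination
flops = T * n_states * n_controls * n_noises
machine_speed = 1e17        # flops per second, figure quoted on the slide

days_exact = flops / machine_speed / 86400    # with 5.2e22 flops
days_rounded = 1e23 / machine_speed / 86400   # with the rounded 1e23 flops

print(f"{flops:.1e} flops, {days_exact:.1f} to {days_rounded:.1f} days")
```

Either way, a single backward pass costs days of compute on the fastest machine available, for a state of dimension only 5.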


A storage management example

A producer needs to satisfy a weekly demand over 12 weeks.
Storage capacity is 100 units, starting with 50 units.
The producer can produce 0 (cost 0), 10 (cost 20), 20 (cost 30) or 25 (cost 45) units per week.
Demand is random and follows a stagewise independent uniform distribution on {0, 10, 20, 30, 40}.
Storage costs 0.1 per unit per week.
Unmet demand is lost and costs 5 per unit.
Products remaining at the end are sold at 1 per unit.

During a given week:
    the producer decides how much to produce during the week,
    demand is revealed and should be met with current stock and production,
    remaining stock is stored (at a cost); stock above capacity is lost.

Exercise

1 Formulate the problem as a stochastic dynamic program, specifying the state, decision and noise.
2 Write the dynamic programming (Bellman’s) equation.
3 Solve the problem with your favorite programming language.
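
One possible way to approach the exercise is sketched below. It is not the lecture's official solution, and it rests on modeling assumptions that should be checked against the statement: the state is the stock at the start of the week, the decision is the production level, the noise is the demand, the storage cost is charged on the end-of-week stock, and all stock levels are multiples of 5 (because of the production level 25).

```python
T = 12
CAPACITY = 100
STEP = 5   # assumption: stock stays on a grid of multiples of 5
STOCKS = list(range(0, CAPACITY + 1, STEP))
PRODUCTION_COST = {0: 0.0, 10: 20.0, 20: 30.0, 25: 45.0}
DEMANDS = [0, 10, 20, 30, 40]
PROB = 1.0 / len(DEMANDS)          # uniform, stagewise independent

def step(stock, produce, demand):
    """Return (next stock, stage cost) for one week."""
    available = stock + produce
    unmet = max(0, demand - available)                  # lost demand
    next_stock = min(CAPACITY, max(0, available - demand))
    cost = (PRODUCTION_COST[produce]
            + 5.0 * unmet                               # penalty for unmet demand
            + 0.1 * next_stock)                         # storage, end-of-week stock
    return next_stock, cost

def solve():
    V = [dict() for _ in range(T + 1)]
    policy = [dict() for _ in range(T)]
    for s in STOCKS:
        V[T][s] = -1.0 * s          # remaining products are sold at 1 per unit
    for t in range(T - 1, -1, -1):
        for s in STOCKS:
            best_u, best_q = None, float("inf")
            for u in PRODUCTION_COST:
                q = 0.0
                for d in DEMANDS:
                    s_next, cost = step(s, u, d)
                    q += PROB * (cost + V[t + 1][s_next])
                if q < best_q:
                    best_u, best_q = u, q
            V[t][s] = best_q
            policy[t][s] = best_u
    return V, policy

V, policy = solve()
print("expected cost from the initial stock of 50:", V[0][50])
print("first-week production:", policy[0][50])
```

With 21 stock levels, 4 production levels, 5 demand values and 12 weeks, the backward pass visits about 5000 combinations, so the curse of dimensionality is not an issue here.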


Requirements of stochastic DP

\[
\begin{aligned}
\min_{\pi} \quad & \mathbb{E}\Big[ \sum_{t=0}^{T-1} L_t(x_t, u_t, \xi_{t+1}) + K(x_T) \Big] \\
\text{s.t.} \quad & x_{t+1} = f_t(x_t, u_t, \xi_{t+1}), \quad x_0 = x_0 \\
& u_t \in U_t(x_t), \quad x_t \in X_t \\
& u_t = \pi_t(x_t)
\end{aligned}
\]

Assumptions:
    The noises are stagewise independent.
    The only constraint linking stages is the dynamic equation: no other coupling between stages.
    The cost function is additive over stages.
    We consider the expectation of costs.

Markovian noise
Assume that $(\xi_t)_t$ is a Markovian noise, i.e. the law of $\xi_{t+1}$ given the past depends only on $\xi_t$.
We can recover the previous setting by defining an extended state

\[ \tilde{x}_t = (x_t, \xi_t). \]

Bellman equation then becomes:

\[ V_t(x_t, \xi_t) := \min_{u_t \in U_t(x_t)} \mathbb{E}\big[ L_t(x_t, u_t, \xi_{t+1}) + V_{t+1}(x_{t+1}, \xi_{t+1}) \mid \xi_t = \xi_t \big] \]

More precisely, it means that:

1 The value function $V_t$ (and the optimal policy $\pi_t$) depends on both the current physical state $x_t$ and the current noise $\xi_t$.
2 The probability used to average the cost-to-go in the algorithm is the conditional probability given $\xi_t$.
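
The two points above can be sketched in code: the value function is indexed by the pair (state, current noise), and the expectation uses a row of a transition matrix. Everything below is an illustrative assumption, including the function name `backward_dp_markov`, the time-homogeneous interface and the small inventory model with persistent demand.

```python
import math

def backward_dp_markov(T, states, noise_values, trans, controls, f, L, K):
    """DP on the extended state (x, xi); trans[xi][xi_next] = P(xi_next | xi)."""
    V = [dict() for _ in range(T + 1)]
    pi = [dict() for _ in range(T)]
    for x in states:
        for xi in noise_values:
            V[T][(x, xi)] = K(x)
    for t in range(T - 1, -1, -1):
        for x in states:
            for xi in noise_values:
                best_q, best_u = math.inf, None
                for u in controls(x):
                    q = 0.0
                    for xi_next in noise_values:
                        p = trans[xi][xi_next]      # conditional probability
                        if p == 0.0:
                            continue
                        x_next = f(x, u, xi_next)
                        q += p * (L(x, u, xi_next) + V[t + 1][(x_next, xi_next)])
                    if q < best_q:
                        best_q, best_u = q, u
                V[t][(x, xi)] = best_q
                pi[t][(x, xi)] = best_u
    return V, pi

# Hypothetical model: demand in {0, 1} tends to repeat itself from week to week.
trans = {0: {0: 0.9, 1: 0.1}, 1: {0: 0.1, 1: 0.9}}
V, pi = backward_dp_markov(
    T=1,
    states=[0, 1, 2],
    noise_values=[0, 1],
    trans=trans,
    controls=lambda x: [0, 1],
    f=lambda x, u, d: min(2, max(0, x + u - d)),      # next stock, kept in {0, 1, 2}
    L=lambda x, u, d: u + 3.0 * max(0, d - x - u),    # order cost + shortage penalty
    K=lambda x: 0.0,
)
```

With an empty stock, the sketch orders nothing after a week without demand and orders one unit after a week with demand: the optimal control depends on the current noise, not only on the physical state.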

Coupling control

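Consider the following problem, with stagewise independent noise:

\[
\begin{aligned}
\min_{\pi} \quad & \mathbb{E}\Big[ \sum_{t=0}^{T-1} L_t(x_t, u_t, \xi_{t+1}) + K(x_T) \Big] \\
\text{s.t.} \quad & x_{t+1} = f_t(x_t, u_t, \xi_{t+1}), \quad x_0 = x_0 \\
& u_t \in U_t(x_t), \quad x_t \in X_t \\
& u_t = \pi_t(x_t) \\
& \| u_t - u_{t-1} \| \le \delta
\end{aligned}
\]

How can we solve this problem using Dynamic Programming?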
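
One standard answer, given here as a hedged sketch rather than as the lecture's own solution, is to put the previous control into the state, in the same spirit as the extended state used for Markovian noise. The symbols $\tilde{x}_t$, $\tilde{f}_t$, $\tilde{U}_t$ and $u_-$ are notation introduced for this sketch.

```latex
\[
\begin{aligned}
\tilde{x}_t &= (x_t, u_{t-1}), \\
\tilde{f}_t(\tilde{x}_t, u_t, \xi_{t+1}) &= \big( f_t(x_t, u_t, \xi_{t+1}),\ u_t \big), \\
\tilde{U}_t(\tilde{x}_t) &= \{ u \in U_t(x_t) \ :\ \| u - u_{t-1} \| \le \delta \}, \\
V_t(x, u_-) &= \min_{u \in \tilde{U}_t(x, u_-)}
  \mathbb{E}\Big[ L_t(x, u, \xi_{t+1}) + V_{t+1}\big( f_t(x, u, \xi_{t+1}),\ u \big) \Big].
\end{aligned}
\]
```

The coupling constraint becomes a stagewise constraint on the extended state, so the standard recursion applies. The price is a larger state, and the policy becomes a function of $(x_t, u_{t-1})$ instead of $x_t$ alone.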


Delayed control

Consider the following problem, with stagewise independent noise:

\[
\begin{aligned}
\min_{\pi} \quad & \mathbb{E}\Big[ \sum_{t=0}^{T-1} L_t(x_t, u_t, \xi_{t+1}) + K(x_T) \Big] \\
\text{s.t.} \quad & x_{t+1} = f_t(x_t, u_{t-2}, \xi_{t+1}), \quad x_0 = x_0 \\
& u_t \in U_t(x_t), \quad x_t \in X_t \\
& u_t = \pi_t(x_t)
\end{aligned}
\]

How can we solve this problem using Dynamic Programming?
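
A possible answer, again a sketch under assumptions and not necessarily the one intended in the lecture, is to carry the controls that have been chosen but not yet applied inside the state. The names $\tilde{x}_t$, $\tilde{f}_t$, $v_1$ and $v_2$ are introduced for this sketch, and the controls $u_{-1}$, $u_{-2}$ are assumed to be given.

```latex
\[
\begin{aligned}
\tilde{x}_t &= (x_t, u_{t-1}, u_{t-2}), \\
\tilde{f}_t(\tilde{x}_t, u_t, \xi_{t+1}) &= \big( f_t(x_t, u_{t-2}, \xi_{t+1}),\ u_t,\ u_{t-1} \big), \\
V_t(x, v_1, v_2) &= \min_{u \in U_t(x)}
  \mathbb{E}\Big[ L_t(x, u, \xi_{t+1}) + V_{t+1}\big( f_t(x, v_2, \xi_{t+1}),\ u,\ v_1 \big) \Big].
\end{aligned}
\]
```

Here $v_1$ and $v_2$ stand for $u_{t-1}$ and $u_{t-2}$. The extended dynamics only involve the current extended state, the current control and the next noise, so the usual assumptions of stochastic DP hold again, at the cost of a state of higher dimension.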


Bankruptcy

Consider the following problem, with stagewise independent noise:

\[
\begin{aligned}
\min_{\pi} \quad & \mathbb{E}\Big[ \sum_{t=0}^{T-1} L_t(x_t, u_t, \xi_{t+1}) + K(x_T) \Big] \\
\text{s.t.} \quad & x_{t+1} = f_t(x_t, u_t, \xi_{t+1}), \quad x_0 = x_0 \\
& u_t \in U_t(x_t), \quad x_t \in X_t \\
& u_t = \pi_t(x_t)
\end{aligned}
\]

In addition, we assume that we start with a capital $C_0$, and that we must never, under any circumstance, have a negative capital.
How can we solve this problem using Dynamic Programming?


Maximizing probability

Consider the following problem, with stagewise independent noise:

\[
\begin{aligned}
\min_{\pi} \quad & \mathbb{E}\Big[ \sum_{t=0}^{T-1} L_t(x_t, u_t, \xi_{t+1}) + K(x_T) \Big] \\
\text{s.t.} \quad & x_{t+1} = f_t(x_t, u_t, \xi_{t+1}), \quad x_0 = x_0 \\
& u_t \in U_t(x_t), \quad x_t \in X_t \\
& u_t = \pi_t(x_t)
\end{aligned}
\]

We now reconsider our objective function: we want to replace the expectation by the probability that the cost accumulated by the end of the period is negative.
How can we solve this problem by Dynamic Programming?


Dynamic Programming Algorithm - Discrete Case - HD

Data: Problem parameters
Result: optimal trajectory and value
$V_T \equiv K$ ; $V_t \equiv 0$
for $t : T-1 \to 0$ do
    for $x \in X_t$ do
        $V_t(x) = \mathbb{E}\big[ \min_{y \in X_t(x, \xi_{t+1})} c_t(x, y, \xi_{t+1}) + V_{t+1}(y) \big]$

Writing out the expectation over $\xi$ and the minimization over the next state $y$ as explicit loops:

Data: Problem parameters
Result: optimal trajectory and value
$V_T \equiv K$ ; $V_t \equiv 0$
for $t : T-1 \to 0$ do
    for $x \in X_t$ do
        for $\xi \in \Xi_{t+1}$ do
            $\hat{V}_t(x, \xi) = +\infty$
            for $y \in X_t(x, \xi)$ do
                $v_y = c_t(x, y, \xi) + V_{t+1}(y)$
                if $v_y < \hat{V}_t(x, \xi)$ then
                    $\hat{V}_t(x, \xi) = v_y$
                    $\psi_t(x, \xi) = y$
            $V_t(x) = V_t(x) + \mathbb{P}(\xi)\, \hat{V}_t(x, \xi)$
Algorithm 3: Classical stochastic dynamic programming algorithm


Discretized Stochastic Dynamic Programming

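The simplest DP algorithm is obtained by discretizing the state set, and then doing a single backward pass over the grid.

[Figure: discretized state grid, with axes $x_1$ and $x_2$, shown over time]

$\tilde{V}_t \equiv 0$
for $t : T-1 \to 1$ do
    for $x_{in} \in X_{t-1}^{D}$ do
        for $\xi \in \Xi_t$ do
            $\dot{v}_\xi = \min_{x_{out} \in X_t(x_{in}, \xi)} \ell_t(x_{in}, x_{out}, \xi) + \tilde{V}_{t+1}(x_{out})$, denoted $\dot{B}_t(\tilde{V}_{t+1})(x_{in}, \xi)$
            $\tilde{V}_t(x_{in}) \mathrel{+}= \pi_\xi \dot{v}_\xi$, where $\pi_\xi := \mathbb{P}(\xi_t = \xi)$
    Extend the definition of $\tilde{V}_t$ to $X_t$ by interpolation
Algorithm 1: Discretized SDP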
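
The interpolation step can be illustrated on a one-dimensional state. The sketch below is a simplified variant, not the algorithm of the slide: it minimizes over a finite set of controls (decision before the noise) rather than over the next state, it uses piecewise-linear interpolation, and the names `interp` and `discretized_sdp` as well as the toy tracking problem are assumptions made for illustration.

```python
from bisect import bisect_right

def interp(x, grid, values):
    """Piecewise-linear interpolation on an increasing grid (clamped outside)."""
    if x <= grid[0]:
        return values[0]
    if x >= grid[-1]:
        return values[-1]
    j = bisect_right(grid, x)          # grid[j-1] <= x < grid[j]
    w = (x - grid[j - 1]) / (grid[j] - grid[j - 1])
    return (1.0 - w) * values[j - 1] + w * values[j]

def discretized_sdp(T, grid, controls, noises, f, L, K):
    """Backward pass on the grid; noises is a list of (xi, prob) pairs."""
    V = [[0.0] * len(grid) for _ in range(T + 1)]
    V[T] = [K(x) for x in grid]
    for t in range(T - 1, -1, -1):
        for i, x in enumerate(grid):
            best = float("inf")
            for u in controls:
                q = 0.0
                for xi, p in noises:
                    x_next = f(x, u, xi)      # usually falls between grid points
                    q += p * (L(x, u, xi) + interp(x_next, grid, V[t + 1]))
                best = min(best, q)
            V[t][i] = best
    return V

# Hypothetical toy problem: keep a state in [0, 10] close to the target 5.
grid = [float(k) for k in range(11)]
V = discretized_sdp(
    T=1,
    grid=grid,
    controls=[-1.0, 0.0, 1.0],
    noises=[(-0.5, 0.5), (0.5, 0.5)],
    f=lambda x, u, xi: min(10.0, max(0.0, x + u + xi)),
    L=lambda x, u, xi: 0.1 * abs(u),
    K=lambda x: abs(x - 5.0),
)
```

The noise moves the state by half a grid step, so every evaluation of the next value function relies on interpolation between neighbouring grid points. Outside the grid, this sketch clamps to the boundary values.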

Cost-to-go induced policy and Forward Bellman operator

The point of most DP methods is to produce approximations
$\tilde V_t$ of the true value function¹ $V_t$.
From any approximation $\tilde V_t$ of $V_t$, we can define a cost-to-go
induced policy $\psi_t$ by solving the stage problem:

$$\min_{x_{out},\, u_t \in X_t(x_{in}, \xi_t)} \; \underbrace{\ell_{t+1}(x_{in}, x_t, u_t, \xi_t)}_{\text{transition costs}} + \underbrace{\tilde V(x_{out})}_{\text{cost-to-go}}$$

Thus a (sequence of) value function approximations yields a
policy, which can be simulated to obtain trajectories and costs.
➥ Often used to pass information from long-term to short-term
problems.
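
As an illustration of how an approximate cost-to-go yields a policy that can be simulated, here is a minimal Python sketch. The names `induced_policy`, `simulate`, `cost`, `reachable` and `noise_sampler` are hypothetical, the decision is assumed to be taken after observing the noise, and the stage problem is solved by enumeration over a finite candidate set rather than by an optimisation solver.

```python
def induced_policy(t, x_in, xi, cost, reachable, V_next):
    """Next state minimising transition cost plus approximate cost-to-go."""
    return min(
        reachable(t, x_in, xi),
        key=lambda x_out: cost(t, x_in, x_out, xi) + V_next(x_out),
    )

def simulate(x0, T, noise_sampler, cost, reachable, V_tilde):
    """Simulate the induced policy; return the trajectory and its total cost."""
    x, total, path = x0, 0.0, [x0]
    for t in range(T):
        xi = noise_sampler(t)                      # observe the noise
        x_out = induced_policy(t, x, xi, cost, reachable, V_tilde[t + 1])
        total += cost(t, x, x_out, xi)             # only true costs are summed
        x = x_out
        path.append(x)
    return path, total

# Toy data (hypothetical): follow the noise, or stay put if the state is penalised.
cost = lambda t, x_in, x_out, xi: (x_out - x_in - xi) ** 2
reachable = lambda t, x_in, xi: range(5)
zero_V = [lambda x: 0.0] * 4
steep_V = [lambda x: 10.0 * x] * 4
path_a, cost_a = simulate(0, 3, lambda t: 1, cost, reachable, zero_V)
path_b, cost_b = simulate(0, 3, lambda t: 1, cost, reachable, steep_V)
```

The two runs differ only in the cost-to-go passed to the policy, which is how information from a long-term model can steer a short-term decision.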

¹ Sometimes the approximation is of $\dot V_t$ instead.

Linear Quadratic case

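$$\min_{\pi} \; \mathbb{E}\Big[\sum_{t=0}^{T-1} \big(\boldsymbol{x}_t^\top Q_t \boldsymbol{x}_t + \boldsymbol{u}_t^\top R_t \boldsymbol{u}_t\big) + \boldsymbol{x}_T^\top Q_T \boldsymbol{x}_T\Big]$$
$$\text{s.t.} \quad \boldsymbol{x}_{t+1} = A_t \boldsymbol{x}_t + B_t \boldsymbol{u}_t + \boldsymbol{\xi}_t, \qquad \boldsymbol{x}_0 = x_0$$
$$\boldsymbol{u}_t = \pi_t(\boldsymbol{x}_t)$$

Under stagewise independence of the (centered) noise we can show that:
1 The value function is quadratic: $V_t(x_t) = x_t^\top K_t x_t + k_t$.
2 The optimal policy is linear: $\pi_t(x_t) = L_t x_t$.
3 With explicit (Riccati) formulas for $K_t$ and $L_t$:

$$K_T = Q_T, \qquad k_T = 0$$
$$K_t = Q_t + A_t^\top K_{t+1} A_t - A_t^\top K_{t+1} B_t (R_t + B_t^\top K_{t+1} B_t)^{-1} B_t^\top K_{t+1} A_t$$
$$L_t = -(R_t + B_t^\top K_{t+1} B_t)^{-1} B_t^\top K_{t+1} A_t$$

➥ Can be solved for large dimension (say $n \sim 10^4$).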
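
The Riccati recursion above can be sketched with NumPy as follows. This is an illustrative sketch, assuming time-invariant matrices $A$, $B$, $Q$, $R$ for brevity; the function name `riccati` and its arguments are hypothetical. The update of the constant $k_t$ is not given on the slide: the code uses the standard identity $k_t = k_{t+1} + \mathrm{tr}(K_{t+1}\Sigma)$ for centered noise with covariance $\Sigma$, which is an addition here.

```python
import numpy as np

def riccati(A, B, Q, R, Q_T, T, Sigma=None):
    """Backward Riccati recursion; returns K_0, k_0 and the gains [L_0, ..., L_{T-1}]."""
    K, k, gains = Q_T, 0.0, []
    for _ in range(T):                         # t = T-1, ..., 0
        S = R + B.T @ K @ B
        L = -np.linalg.solve(S, B.T @ K @ A)   # L_t = -(R + B'KB)^{-1} B'KA
        if Sigma is not None:
            k += float(np.trace(K @ Sigma))    # noise only shifts the constant term
        K = Q + A.T @ K @ A + A.T @ K @ B @ L  # same as the Riccati formula for K_t
        gains.append(L)
    gains.reverse()
    return K, k, gains

# Scalar sanity check (hypothetical data): A = B = Q = R = Q_T = 1, T = 2.
one = np.array([[1.0]])
K0, k0, gains = riccati(one, one, one, one, one, T=2, Sigma=one)
```

Using `np.linalg.solve` instead of forming the inverse explicitly is a common numerical choice; it does not change the recursion.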

From Dynamic Programming to SDDP

DP is a flexible tool, hampered by the curses of dimensionality.
Numerical illustration (7 dams):
$T = 52$ weeks
$|S| = 100^7$ possible states
$|U| = 10^7$ possible controls
$|\xi_t| = 10$ ($10^{52}$ scenarios)

➥ ≈ 2 days on today’s fastest super-computer
($3 \cdot 10^6$ years for 10 dams)

➥ Can be solved² in ≈ 10 minutes

² Approximately, depending on the problem and precision required...
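
To see where the curse of dimensionality comes from, the following short Python sketch counts the elementary evaluations of a discretized DP pass, one per (stage, state, control, noise value). The function name `dp_operations` and the per-dam figures (100 levels and 10 controls per dam) are assumptions read off $|S| = 100^7$ and $|U| = 10^7$; the count says nothing about actual running time.

```python
def dp_operations(n_dams, T=52, levels_per_dam=100, controls_per_dam=10, n_noise=10):
    """Number of (stage, state, control, noise) evaluations in one backward pass."""
    n_states = levels_per_dam ** n_dams        # |S| grows exponentially with the dams
    n_controls = controls_per_dam ** n_dams    # so does |U|
    return T * n_states * n_controls * n_noise

ops_7 = dp_operations(7)      # 52 * 10**22 evaluations
ops_10 = dp_operations(10)    # three more dams multiply the work by 10**9
n_scenarios = 10 ** 52        # 10 noise values over 52 stages
```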

How can we be so much faster?

Structural assumptions:
convexity
continuous state
➥ duality tools
Sampling instead of exhaustive computation
Iteratively refining the value function estimate at "the right
places" only

➥ Stochastic Dual Dynamic Programming (SDDP), which
has been around for 30 years
is widely used in the energy community
has lots of extensions and variants
has some convergence results, mainly asymptotic


The setting

1 We are in a finite-horizon, stagewise independent framework.
2 The state and control variables are continuous and bounded.
3 The costs are convex (jointly in state and control).
4 The dynamics are linear.
5 The constraint on the control is convex.
6 We are in a relatively complete recourse framework.
Then we can show that the value functions are convex, and we
can approximate them by polyhedral functions.
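
A polyhedral approximation of a convex function is the pointwise maximum of affine functions (cuts). The sketch below illustrates this on the hypothetical example $f(x) = x^2$; the helper names `make_cut` and `polyhedral` are not from the slides, and cuts are stored as (intercept, slope) pairs.

```python
def make_cut(f, grad, x_k):
    """Tangent of a convex differentiable f at x_k, stored as (intercept, slope)."""
    slope = grad(x_k)
    return (f(x_k) - slope * x_k, slope)

def polyhedral(cuts, x):
    """Polyhedral lower approximation: pointwise maximum of the cuts."""
    return max(a + b * x for a, b in cuts)

f = lambda x: x * x
grad = lambda x: 2.0 * x
cuts = [make_cut(f, grad, x_k) for x_k in (-1.0, 0.0, 2.0)]

exact_at_cut_point = polyhedral(cuts, 2.0)   # equals f(2.0): a cut is tight where it was built
below_elsewhere = polyhedral(cuts, 1.0)      # stays below f(1.0) between cut points
```

Adding a cut can only raise the maximum, so the approximation improves monotonically while remaining a lower bound.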


Stochastic Dual Dynamic Programming: principle

The main idea is to update approximations of the value functions
by adding cuts, in order to refine the approximations. We iterate
the following steps:
Forward pass: Given approximations of the value functions, we
    simulate the policy induced by these approximations,
    and obtain a trajectory.
Backward pass: We refine the approximations by adding cuts, in
    order to make the approximations more precise
    around the trajectory.
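
The forward/backward iteration can be sketched on a one-dimensional toy problem. This is a hedged illustration, not the SDDP implementation used in practice: all data (`T`, `NOISE`, bounds, holding cost `H`) are hypothetical, the stage cost $(y - x - \xi)^2 + H y$ is an assumed convex example, the stage problem is solved exactly by enumerating candidate minimisers instead of calling an LP solver, and the cut slope is the derivative of the stage cost in $x$ at the minimiser rather than an LP dual multiplier.

```python
import random

T = 3                                   # horizon (hypothetical)
NOISE = [(-1.0, 0.5), (1.0, 0.5)]       # (xi, probability), stagewise independent
Y_MIN, Y_MAX, H = 0.0, 4.0, 1.0         # state bounds and holding cost (hypothetical)

def stage_cost(x, y, xi):
    """Convex transition cost: quadratic adjustment plus linear holding."""
    return (y - x - xi) ** 2 + H * y

def theta(cuts, y):
    """Polyhedral approximation: pointwise maximum of the cuts a + b*y."""
    return max(a + b * y for a, b in cuts)

def solve_stage(x, xi, cuts):
    """Minimise stage_cost + theta over [Y_MIN, Y_MAX]; return (value, minimiser)."""
    z, cand = x + xi, [Y_MIN, Y_MAX]
    for i, (a_i, b_i) in enumerate(cuts):
        cand.append(z - (b_i + H) / 2.0)                # stationary point on cut i
        for a_j, b_j in cuts[i + 1:]:
            if b_i != b_j:
                cand.append((a_j - a_i) / (b_i - b_j))  # kink between cuts i and j
    cand = [min(max(y, Y_MIN), Y_MAX) for y in cand]
    y = min(cand, key=lambda c: stage_cost(x, c, xi) + theta(cuts, c))
    return stage_cost(x, y, xi) + theta(cuts, y), y

def sddp(x0, n_iter=15, seed=0):
    rng = random.Random(seed)
    cuts = [[(0.0, 0.0)] for _ in range(T + 1)]         # costs are >= 0, so 0 is a valid cut
    lower_bounds = []
    for _ in range(n_iter):
        x, trial = x0, []
        for t in range(T):                              # forward pass: simulate the policy
            trial.append(x)
            xi = rng.choices([w for w, _ in NOISE], [p for _, p in NOISE])[0]
            _, x = solve_stage(x, xi, cuts[t + 1])
        for t in reversed(range(T)):                    # backward pass: add one cut per stage
            slope = intercept = 0.0
            for xi, p in NOISE:
                val, y = solve_stage(trial[t], xi, cuts[t + 1])
                g = -2.0 * (y - trial[t] - xi)          # d stage_cost / dx at the minimiser
                slope += p * g
                intercept += p * (val - g * trial[t])
            cuts[t].append((intercept, slope))
        lower_bounds.append(theta(cuts[0], x0))
    return cuts, lower_bounds

def policy_cost(cuts, t, x):
    """Expected cost of the cut-induced policy, enumerating every scenario."""
    if t == T:
        return 0.0
    total = 0.0
    for xi, p in NOISE:
        _, y = solve_stage(x, xi, cuts[t + 1])
        total += p * (stage_cost(x, y, xi) + policy_cost(cuts, t + 1, y))
    return total
```

The value of the first-stage approximation at the initial state gives a lower bound that can only increase as cuts are added, while `policy_cost` gives an upper bound, here computed exactly because the toy scenario tree is tiny.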


Stochastic Dual Dynamic Programming

time

First forward pass : computing trajectory

Vincent Leclère Dynamic Programming 08/12/2023 35 / 36

Stochastic Dynamic Programming
Linear Quadratic case
Extending the usage of dynamic programming
Linear convex case
Structured problems

Stochastic Dual Dynamic Programming

time

First forward pass : computing trajectory

Vincent Leclère Dynamic Programming 08/12/2023 35 / 36

Stochastic Dynamic Programming
Linear Quadratic case
Extending the usage of dynamic programming
Linear convex case
Structured problems

Stochastic Dual Dynamic Programming

time

First forward pass : computing trajectory

Vincent Leclère Dynamic Programming 08/12/2023 35 / 36

Stochastic Dynamic Programming
Linear Quadratic case
Extending the usage of dynamic programming
Linear convex case
Structured problems

Stochastic Dual Dynamic Programming

time

First forward pass : computing trajectory

Vincent Leclère Dynamic Programming 08/12/2023 35 / 36

Stochastic Dynamic Programming
Linear Quadratic case
Extending the usage of dynamic programming
Linear convex case
Structured problems

Stochastic Dual Dynamic Programming

time

First forward pass : computing trajectory

Vincent Leclère Dynamic Programming 08/12/2023 35 / 36

Stochastic Dynamic Programming
Linear Quadratic case
Extending the usage of dynamic programming
Linear convex case
Structured problems

Stochastic Dual Dynamic Programming

time

First forward pass : computing trajectory

Vincent Leclère Dynamic Programming 08/12/2023 35 / 36

Stochastic Dynamic Programming
Linear Quadratic case
Extending the usage of dynamic programming
Linear convex case
Structured problems

Stochastic Dual Dynamic Programming

time

First forward pass : computing trajectory

Vincent Leclère Dynamic Programming 08/12/2023 35 / 36

Stochastic Dynamic Programming
Linear Quadratic case
Extending the usage of dynamic programming
Linear convex case
Structured problems

Stochastic Dual Dynamic Programming

time

First forward pass : computing trajectory

Vincent Leclère Dynamic Programming 08/12/2023 35 / 36

Stochastic Dynamic Programming
Linear Quadratic case
Extending the usage of dynamic programming
Linear convex case
Structured problems

Stochastic Dual Dynamic Programming

time

First forward pass : computing trajectory

Vincent Leclère Dynamic Programming 08/12/2023 35 / 36

Stochastic Dynamic Programming
Linear Quadratic case
Extending the usage of dynamic programming
Linear convex case
Structured problems

Stochastic Dual Dynamic Programming

time

First forward pass : computing trajectory

Vincent Leclère Dynamic Programming 08/12/2023 35 / 36

Stochastic Dynamic Programming
Linear Quadratic case
Extending the usage of dynamic programming
Linear convex case
Structured problems

Stochastic Dual Dynamic Programming

time

First forward pass : computing trajectory

Vincent Leclère Dynamic Programming 08/12/2023 35 / 36

Stochastic Dynamic Programming
Linear Quadratic case
Extending the usage of dynamic programming
Linear convex case
Structured problems

Stochastic Dual Dynamic Programming

time

First forward pass : computing trajectory

Vincent Leclère Dynamic Programming 08/12/2023 35 / 36

Stochastic Dynamic Programming
Linear Quadratic case
Extending the usage of dynamic programming
Linear convex case
Structured problems

Stochastic Dual Dynamic Programming

time

First forward pass : computing trajectory

Vincent Leclère Dynamic Programming 08/12/2023 35 / 36

Stochastic Dynamic Programming
Linear Quadratic case
Extending the usage of dynamic programming
Linear convex case
Structured problems

Stochastic Dual Dynamic Programming

time

First forward pass : computing trajectory

Vincent Leclère Dynamic Programming 08/12/2023 35 / 36

Stochastic Dynamic Programming
Linear Quadratic case
Extending the usage of dynamic programming
Linear convex case
Structured problems

Stochastic Dual Dynamic Programming

time

First backward pass : refining approximation (adding cuts)

Vincent Leclère Dynamic Programming 08/12/2023 35 / 36

Stochastic Dynamic Programming
Linear Quadratic case
Extending the usage of dynamic programming
Linear convex case
Structured problems

Stochastic Dual Dynamic Programming

time

First backward pass : refining approximation (adding cuts)

Vincent Leclère Dynamic Programming 08/12/2023 35 / 36

Stochastic Dynamic Programming
Linear Quadratic case
Extending the usage of dynamic programming
Linear convex case
Structured problems

Stochastic Dual Dynamic Programming

time

First backward pass : refining approximation (adding cuts)

Vincent Leclère Dynamic Programming 08/12/2023 35 / 36

Stochastic Dynamic Programming
Linear Quadratic case
Extending the usage of dynamic programming
Linear convex case
Structured problems

Stochastic Dual Dynamic Programming

time

First backward pass : refining approximation (adding cuts)

Vincent Leclère Dynamic Programming 08/12/2023 35 / 36

Stochastic Dynamic Programming
Linear Quadratic case
Extending the usage of dynamic programming
Linear convex case
Structured problems

Stochastic Dual Dynamic Programming

time

First backward pass : refining approximation (adding cuts)

Vincent Leclère Dynamic Programming 08/12/2023 35 / 36

Stochastic Dynamic Programming
Linear Quadratic case
Extending the usage of dynamic programming
Linear convex case
Structured problems

Stochastic Dual Dynamic Programming

time

First backward pass : refining approximation (adding cuts)

Vincent Leclère Dynamic Programming 08/12/2023 35 / 36

Stochastic Dynamic Programming
Linear Quadratic case
Extending the usage of dynamic programming
Linear convex case
Structured problems

Stochastic Dual Dynamic Programming

time

First backward pass : refining approximation (adding cuts)

Vincent Leclère Dynamic Programming 08/12/2023 35 / 36

Stochastic Dynamic Programming
Linear Quadratic case
Extending the usage of dynamic programming
Linear convex case
Structured problems

Stochastic Dual Dynamic Programming

time

First backward pass : refining approximation (adding cuts)

Vincent Leclère Dynamic Programming 08/12/2023 35 / 36

Stochastic Dynamic Programming
Linear Quadratic case
Extending the usage of dynamic programming
Linear convex case
Structured problems

Stochastic Dual Dynamic Programming

time

First backward pass : refining approximation (adding cuts)

Vincent Leclère Dynamic Programming 08/12/2023 35 / 36

Stochastic Dynamic Programming
Linear Quadratic case
Extending the usage of dynamic programming
Linear convex case
Structured problems

Stochastic Dual Dynamic Programming

time

First backward pass : refining approximation (adding cuts)

Vincent Leclère Dynamic Programming 08/12/2023 35 / 36

Stochastic Dynamic Programming
Linear Quadratic case
Extending the usage of dynamic programming
Linear convex case
Structured problems

Stochastic Dual Dynamic Programming

time

second forward pass : computing trajectory

Vincent Leclère Dynamic Programming 08/12/2023 35 / 36

Stochastic Dynamic Programming
Linear Quadratic case
Extending the usage of dynamic programming
Linear convex case
Structured problems

Stochastic Dual Dynamic Programming

time

second forward pass : computing trajectory

Vincent Leclère Dynamic Programming 08/12/2023 35 / 36

Stochastic Dynamic Programming
Linear Quadratic case
Extending the usage of dynamic programming
Linear convex case
Structured problems

Stochastic Dual Dynamic Programming

time

second forward pass : computing trajectory

Vincent Leclère Dynamic Programming 08/12/2023 35 / 36

Stochastic Dynamic Programming
Linear Quadratic case
Extending the usage of dynamic programming
Linear convex case
Structured problems

Stochastic Dual Dynamic Programming

time

second forward pass : computing trajectory

Vincent Leclère Dynamic Programming 08/12/2023 35 / 36

Stochastic Dynamic Programming
Linear Quadratic case
Extending the usage of dynamic programming
Linear convex case
Structured problems

Stochastic Dual Dynamic Programming

time

second forward pass : computing trajectory

Vincent Leclère Dynamic Programming 08/12/2023 35 / 36

Stochastic Dynamic Programming
Linear Quadratic case
Extending the usage of dynamic programming
Linear convex case
Structured problems

Stochastic Dual Dynamic Programming

time

second forward pass : computing trajectory

Vincent Leclère Dynamic Programming 08/12/2023 35 / 36

Stochastic Dynamic Programming
Linear Quadratic case
Extending the usage of dynamic programming
Linear convex case
Structured problems

Stochastic Dual Dynamic Programming

time

second forward pass : computing trajectory

Vincent Leclère Dynamic Programming 08/12/2023 35 / 36

Stochastic Dynamic Programming
Linear Quadratic case
Extending the usage of dynamic programming
Linear convex case
Structured problems

Stochastic Dual Dynamic Programming

time

second forward pass : computing trajectory

Vincent Leclère Dynamic Programming 08/12/2023 35 / 36

Stochastic Dynamic Programming
Linear Quadratic case
Extending the usage of dynamic programming
Linear convex case
Structured problems

Stochastic Dual Dynamic Programming

time

second forward pass : computing trajectory

Vincent Leclère Dynamic Programming 08/12/2023 35 / 36

Stochastic Dynamic Programming
Linear Quadratic case
Extending the usage of dynamic programming
Linear convex case
Structured problems

Stochastic Dual Dynamic Programming

time

second forward pass : computing trajectory

Vincent Leclère Dynamic Programming 08/12/2023 35 / 36

Stochastic Dynamic Programming
Linear Quadratic case
Extending the usage of dynamic programming
Linear convex case
Structured problems

Stochastic Dual Dynamic Programming

time

second forward pass : computing trajectory

Vincent Leclère Dynamic Programming 08/12/2023 35 / 36

Stochastic Dynamic Programming
Linear Quadratic case
Extending the usage of dynamic programming
Linear convex case
Structured problems

Stochastic Dual Dynamic Programming

time

second forward pass : computing trajectory

Vincent Leclère Dynamic Programming 08/12/2023 35 / 36

Stochastic Dynamic Programming
Linear Quadratic case
Extending the usage of dynamic programming
Linear convex case
Structured problems

Stochastic Dual Dynamic Programming

time

second backward pass : refining approximation (adding cuts)

Vincent Leclère Dynamic Programming 08/12/2023 35 / 36

Stochastic Dynamic Programming
Linear Quadratic case
Extending the usage of dynamic programming
Linear convex case
Structured problems

Stochastic Dual Dynamic Programming

time

second backward pass : refining approximation (adding cuts)

Vincent Leclère Dynamic Programming 08/12/2023 35 / 36

Stochastic Dynamic Programming
Linear Quadratic case
Extending the usage of dynamic programming
Linear convex case
Structured problems

Stochastic Dual Dynamic Programming

time

second backward pass : refining approximation (adding cuts)

Vincent Leclère Dynamic Programming 08/12/2023 35 / 36

Stochastic Dynamic Programming
Linear Quadratic case
Extending the usage of dynamic programming
Linear convex case
Structured problems

Stochastic Dual Dynamic Programming

time

second backward pass : refining approximation (adding cuts)

Vincent Leclère Dynamic Programming 08/12/2023 35 / 36

Stochastic Dynamic Programming
Linear Quadratic case
Extending the usage of dynamic programming
Linear convex case
Structured problems

Stochastic Dual Dynamic Programming

time

second backward pass : refining approximation (adding cuts)

Vincent Leclère Dynamic Programming 08/12/2023 35 / 36

Stochastic Dynamic Programming
Linear Quadratic case
Extending the usage of dynamic programming
Linear convex case
Structured problems

Stochastic Dual Dynamic Programming

time

second backward pass : refining approximation (adding cuts)

Vincent Leclère Dynamic Programming 08/12/2023 35 / 36

Stochastic Dynamic Programming
Linear Quadratic case
Extending the usage of dynamic programming
Linear convex case
Structured problems

Stochastic Dual Dynamic Programming

time

second backward pass : refining approximation (adding cuts)

Vincent Leclère Dynamic Programming 08/12/2023 35 / 36

Stochastic Dynamic Programming
Linear Quadratic case
Extending the usage of dynamic programming
Linear convex case
Structured problems

Stochastic Dual Dynamic Programming

time

second backward pass : refining approximation (adding cuts)

Vincent Leclère Dynamic Programming 08/12/2023 35 / 36

Stochastic Dynamic Programming
Linear Quadratic case
Extending the usage of dynamic programming
Linear convex case
Structured problems

Stochastic Dual Dynamic Programming

time

second backward pass : refining approximation (adding cuts)

Vincent Leclère Dynamic Programming 08/12/2023 35 / 36

Stochastic Dynamic Programming
Linear Quadratic case
Extending the usage of dynamic programming
Linear convex case
Structured problems

Stochastic Dual Dynamic Programming

time

second backward pass : refining approximation (adding cuts)

Vincent Leclère Dynamic Programming 08/12/2023 35 / 36

Stochastic Dynamic Programming
Linear Quadratic case
Extending the usage of dynamic programming
Linear convex case
Structured problems

Stochastic Dual Dynamic Programming

time

second backward pass : refining approximation (adding cuts)

Vincent Leclère Dynamic Programming 08/12/2023 35 / 36

Stochastic Dynamic Programming
Linear Quadratic case
Extending the usage of dynamic programming
Linear convex case
Structured problems

Stochastic Dual Dynamic Programming

time

third forward pass : computing trajectory

Vincent Leclère Dynamic Programming 08/12/2023 35 / 36

Stochastic Dynamic Programming
Linear Quadratic case
Extending the usage of dynamic programming
Linear convex case
Structured problems

Stochastic Dual Dynamic Programming

time

third forward pass : computing trajectory

Vincent Leclère Dynamic Programming 08/12/2023 35 / 36

Stochastic Dynamic Programming
Linear Quadratic case
Extending the usage of dynamic programming
Linear convex case
Structured problems

Stochastic Dual Dynamic Programming

time

third forward pass : computing trajectory

Vincent Leclère Dynamic Programming 08/12/2023 35 / 36

Stochastic Dynamic Programming
Linear Quadratic case
Extending the usage of dynamic programming
Linear convex case
Structured problems

Stochastic Dual Dynamic Programming

time

third forward pass : computing trajectory

Vincent Leclère Dynamic Programming 08/12/2023 35 / 36

Stochastic Dynamic Programming
Linear Quadratic case
Extending the usage of dynamic programming
Linear convex case
Structured problems

Stochastic Dual Dynamic Programming

time

third forward pass : computing trajectory

Vincent Leclère Dynamic Programming 08/12/2023 35 / 36

Stochastic Dynamic Programming
Linear Quadratic case
Extending the usage of dynamic programming
Linear convex case
Structured problems

Stochastic Dual Dynamic Programming

time

third forward pass : computing trajectory

Vincent Leclère Dynamic Programming 08/12/2023 35 / 36

Stochastic Dynamic Programming
Linear Quadratic case
Extending the usage of dynamic programming
Linear convex case
Structured problems

Stochastic Dual Dynamic Programming

time

third forward pass : computing trajectory

Vincent Leclère Dynamic Programming 08/12/2023 35 / 36

Stochastic Dynamic Programming
Linear Quadratic case
Extending the usage of dynamic programming
Linear convex case
Structured problems

Stochastic Dual Dynamic Programming

time

third forward pass : computing trajectory

Vincent Leclère Dynamic Programming 08/12/2023 35 / 36

Stochastic Dynamic Programming
Linear Quadratic case
Extending the usage of dynamic programming
Linear convex case
Structured problems

Stochastic Dual Dynamic Programming

time

third forward pass : computing trajectory

Vincent Leclère Dynamic Programming 08/12/2023 35 / 36

Stochastic Dynamic Programming
Linear Quadratic case
Extending the usage of dynamic programming
Linear convex case
Structured problems

Stochastic Dual Dynamic Programming

time

third forward pass : computing trajectory

Vincent Leclère Dynamic Programming 08/12/2023 35 / 36

Stochastic Dynamic Programming
Linear Quadratic case
Extending the usage of dynamic programming
Linear convex case
Structured problems

Stochastic Dual Dynamic Programming

time

third forward pass : computing trajectory

Vincent Leclère Dynamic Programming 08/12/2023 35 / 36

Stochastic Dynamic Programming
Linear Quadratic case
Extending the usage of dynamic programming
Linear convex case
Structured problems

Stochastic Dual Dynamic Programming

time

third forward pass : computing trajectory

Vincent Leclère Dynamic Programming 08/12/2023 35 / 36

Stochastic Dynamic Programming
Linear Quadratic case
Extending the usage of dynamic programming
Linear convex case
Structured problems

Stochastic Dual Dynamic Programming

time

third backward pass : refining approximation (adding cuts)

And so on...

SDDP

[Figure: three panels, t=0, t=1 and t=2, each with the state x on the horizontal axis]

Final Cost V2 = V2

Real Bellman function $V_1 = B_1(V_2)$

Real Bellman function $V_0 = B_0(V_1)$
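
The two recursions above can be made concrete by exact backward induction on a small grid. This is only a minimal sketch: the toy inventory model (stock levels 0 to 4, orders 0 to 2, demand 0 or 1 with equal probability), the cost coefficients, the zero final cost and the decision-hazard form of the Bellman operator are all illustrative assumptions, not taken from the slides.

```python
# Toy data (all hypothetical): stock levels, order sizes, equiprobable demands.
STATES = range(5)
CONTROLS = range(3)
NOISES = [(0, 0.5), (1, 0.5)]  # (demand, probability)


def dynamics(x, u, w):
    """Next stock level, clamped to the grid 0..4."""
    return max(0, min(4, x + u - w))


def cost(x, u, w):
    """Ordering cost + holding cost on the next stock + shortage penalty."""
    return 1.0 * u + 0.5 * dynamics(x, u, w) + 3.0 * max(w - x - u, 0)


def bellman(V_next):
    """Bellman operator: B(V)(x) = min_u E[cost(x, u, W) + V(next state)]."""
    return [
        min(
            sum(p * (cost(x, u, w) + V_next[dynamics(x, u, w)]) for w, p in NOISES)
            for u in CONTROLS
        )
        for x in STATES
    ]


V2 = [0.0 for _ in STATES]  # final cost, assumed to be zero here
V1 = bellman(V2)            # V1 = B1(V2)
V0 = bellman(V1)            # V0 = B0(V1)
print(V1)
print(V0)
```

Exhaustive enumeration like this is only viable for small discrete state spaces, which is what motivates replacing the exact functions by polyhedral lower approximations.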

Lower polyhedral approximation $\underline{V}_2$ of $V_2$

Lower polyhedral approximation $\underline{V}_1 = B_1(\underline{V}_2)$ of $V_1$

Lower polyhedral approximation $\underline{V}_0 = B_0(\underline{V}_1)$ of $V_0$

Assume that we have lower polyhedral approximations of $V_t$
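
A lower polyhedral approximation can be stored as a list of cuts and evaluated as their pointwise maximum. The sketch below assumes a one-dimensional state and uses the hypothetical convex function x**2 as a stand-in for a cost-to-go function; the three cuts are its tangents at arbitrarily chosen points.

```python
def evaluate(cuts, x):
    """Value at x of the polyhedral function max_k (intercept_k + slope_k * x)."""
    return max(intercept + slope * x for intercept, slope in cuts)


def true_value(x):
    """Hypothetical convex function playing the role of V_t."""
    return x * x


# Tangents of x**2 at x = -1, 0.5 and 2, stored as (intercept, slope).
cuts = [(-1.0, -2.0), (-0.25, 1.0), (-4.0, 4.0)]

# Gap between the true function and its approximation on a grid of quarters.
grid = [i / 4 for i in range(-12, 13)]
gap = [true_value(x) - evaluate(cuts, x) for x in grid]
print(min(gap), max(gap))
```

The gap is never negative and vanishes exactly at the points where the cuts were built.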

Obtain a lower bound on the value of our problem: $\underline{V}_0^{(2)}(x_0) \le V_0(x_0)$
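
Why this is a lower bound can be spelled out in one line. The notation is an assumption made here: $\underline{V}_t^{(k)}$ is the approximation at iteration $k$, $B_t$ the Bellman operator, $T$ the final stage, and each cut is assumed to lie below $B_t$ applied to the approximation it was built from. The argument uses only the monotonicity of $B_t$:

```latex
\underline{V}_{T}^{(k)} \le V_T, \qquad
\underline{V}_{t+1}^{(k)} \le V_{t+1}
\;\Longrightarrow\;
\underline{V}_{t}^{(k)} \le B_t\bigl(\underline{V}_{t+1}^{(k)}\bigr)
\le B_t\bigl(V_{t+1}\bigr) = V_t ,
\qquad\text{hence}\qquad
\underline{V}_{0}^{(k)}(x_0) \le V_0(x_0).
```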

Apply $F_0$ with $\underline{V}_1^{(2)}$ at $x_0$ and obtain $X_1^{(2)}$

Draw a random realisation $x_1^{(2)}$ of $X_1^{(2)}$

We apply $F_1$ with $\underline{V}_2^{(2)}$ at $x_1^{(2)}$ and obtain $X_2^{(2)}$

Draw a random realisation $x_2^{(2)}$ of $X_2^{(2)}$
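
The forward pass (choose a decision from the current approximation, then draw the next state) can be sketched as follows. Everything specific here is a hypothetical choice for illustration: the scalar inventory dynamics, the cost, the finite control set searched by enumeration (a real SDDP implementation would solve a linear program instead), and the trivial initial cuts.

```python
import random

CONTROLS = [0.0, 0.5, 1.0, 1.5, 2.0]  # hypothetical finite control set
NOISES = [(0.0, 0.5), (1.0, 0.5)]     # (demand, probability), hypothetical


def dynamics(x, u, w):
    return x + u - w


def cost(x, u, w):
    nxt = dynamics(x, u, w)
    return u + 0.5 * max(nxt, 0.0) + 3.0 * max(-nxt, 0.0)


def evaluate(cuts, x):
    """Polyhedral approximation: maximum of the cuts (intercept, slope) at x."""
    return max(a + b * x for a, b in cuts)


def greedy_control(x, cuts_next):
    """Control minimising expected stage cost plus approximate cost-to-go."""
    def q(u):
        return sum(p * (cost(x, u, w) + evaluate(cuts_next, dynamics(x, u, w)))
                   for w, p in NOISES)
    return min(CONTROLS, key=q)


def forward_pass(x0, cuts_by_stage, rng):
    """Simulate one trajectory x_0, ..., x_T under the current cuts."""
    trajectory = [x0]
    for cuts_next in cuts_by_stage[1:]:
        x = trajectory[-1]
        u = greedy_control(x, cuts_next)
        # Draw one realisation of the noise, hence of the random next state.
        w = rng.choices([w for w, _ in NOISES], weights=[p for _, p in NOISES])[0]
        trajectory.append(dynamics(x, u, w))
    return trajectory


# One trivial cut (theta >= 0) per stage t = 0, 1, 2, as before any refinement.
cuts_by_stage = [[(0.0, 0.0)] for _ in range(3)]
trajectory = forward_pass(0.0, cuts_by_stage, random.Random(0))
print(trajectory)
```

The visited states are the trial points at which the backward pass then computes new cuts.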

Compute a cut for $V_2$ at $x_2^{(2)}$

Add the cut to $\underline{V}_2^{(2)}$, which gives $\underline{V}_2^{(3)}$
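
At the final stage the cost function is known, so a cut at the trial point is simply a supporting line built from a value and a subgradient. A minimal sketch, assuming a scalar state and a hypothetical final cost |x - 1| (the function, the trial point and the helper names are illustrative only):

```python
def evaluate(cuts, x):
    """Polyhedral approximation: maximum of the cuts (intercept, slope) at x."""
    return max(a + b * x for a, b in cuts)


def make_cut(value, subgradient, x_trial):
    """Cut theta >= value + subgradient * (x - x_trial), as (intercept, slope)."""
    return (value - subgradient * x_trial, subgradient)


def final_cost(x):
    """Hypothetical convex final cost."""
    return abs(x - 1.0)


def final_subgradient(x):
    """One subgradient of the hypothetical final cost at x."""
    return 1.0 if x >= 1.0 else -1.0


cuts_old = [(0.0, 0.0)]  # current approximation: theta >= 0
x_trial = 3.0            # trial point produced by the forward pass
new_cut = make_cut(final_cost(x_trial), final_subgradient(x_trial), x_trial)
cuts_new = cuts_old + [new_cut]  # refined approximation

print(evaluate(cuts_old, x_trial), evaluate(cuts_new, x_trial), final_cost(x_trial))
```

Adding a cut can only raise the approximation, and it makes it exact at the trial point while keeping it below the true function everywhere.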

A new lower approximation of $V_1$ is $B_1(\underline{V}_2^{(3)})$

Compute the face active at $x_1^{(2)}$
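
In formulas, the active face is the affine function supporting the new approximation at the trial point. A sketch of the cut, assuming convexity (the linear convex case) and writing $\theta$ and $\lambda$ for a value and a subgradient, symbols introduced here for illustration:

```latex
\theta = B_1\bigl(\underline{V}_2^{(3)}\bigr)\bigl(x_1^{(2)}\bigr), \qquad
\lambda \in \partial B_1\bigl(\underline{V}_2^{(3)}\bigr)\bigl(x_1^{(2)}\bigr),
\qquad
x \mapsto \theta + \bigl\langle \lambda,\, x - x_1^{(2)} \bigr\rangle
\;\le\; B_1\bigl(\underline{V}_2^{(3)}\bigr)(x) \;\le\; V_1(x).
```

Only this one affine piece is stored, rather than the whole function $B_1(\underline{V}_2^{(3)})$.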

Add the cut to $\underline{V}_1^{(2)}$, which gives $\underline{V}_1^{(3)}$

A new lower approximation of $V_0$ is $B_0(\underline{V}_1^{(3)})$

Compute the face active at $x_0$

Obtain a new lower bound: $\underline{V}_0^{(3)}(x_0) \le V_0(x_0)$
