Weak Dynamic Programming Principle for Viscosity Solutions
Bruno Bouchard, Nizar Touzi
Abstract. We prove a weak version of the dynamic programming principle for standard stochas-
tic control problems and mixed control-stopping problems, which avoids the technical difficulties
related to the measurable selection argument. In the Markov case, our result is tailor-made for the
derivation of the dynamic programming equation in the sense of viscosity solutions.
1. Introduction. We consider stochastic control problems of the form

V(t, x) := sup_{ν∈U} E[ f(X^ν_{t,x}(T)) ],

where U is the controls set, X^ν is the controlled process, f is some given function, 0 < T ≤ ∞ is a given time horizon, t ∈ [0, T) is the time origin, and x ∈ R^d is some given initial condition. This framework includes the general class of stochastic control problems under the so-called Bolza formulation, the corresponding singular versions, and optimal stopping problems.
A key tool for the analysis of such problems is the so-called dynamic programming principle (DPP), which relates the time-t value function V(t, ·) to any later time-τ value V(τ, ·) for any stopping time τ ∈ [t, T) a.s. A formal statement of the DPP is:
“V(t, x) = v(t, x) := sup_{ν∈U} E[ V(τ, X^ν_τ) | X^ν_t = x ].”   (1.1)
In particular, this result is routinely used in the case of controlled Markov jump-
diffusions in order to derive the corresponding dynamic programming equation in the
sense of viscosity solutions, see Lions [10, 11], Fleming and Soner [8], Touzi [15], for
the case of controlled diffusions, and Oksendal and Sulem [12] for the case of Markov
jump-diffusions.
The statement (1.1) of the DPP is very intuitive and can be easily proved in the
deterministic framework, or in discrete-time with finite probability space. However,
its proof is in general not trivial, and requires, as a first step, that V be measurable.
When the value function V is known to be continuous, the abstract measurability
arguments are not needed and the proof of the dynamic programming principle is
significantly simplified. See e.g. Fleming and Soner [8], or Kabanov and Klueppelberg
[9] in the context of a special singular control problem in finance. Our objective is to
reduce the proof to this simple context in a general situation where the value function
has no a priori regularity.
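In the discrete-time setting with a finite probability space, the DPP (1.1) can indeed be checked directly. The following sketch (our own illustration, with hypothetical transition matrices and terminal reward, not taken from the paper) computes the value function of a controlled Markov chain by backward induction and verifies that it coincides with a brute-force maximization over all feedback policies:

```python
import itertools
import numpy as np

rng = np.random.default_rng(0)
n_states, n_controls, horizon = 3, 2, 2

# Hypothetical data: random transition matrices P[u] and a terminal reward f.
P = rng.random((n_controls, n_states, n_states))
P /= P.sum(axis=2, keepdims=True)          # each P[u] is a stochastic matrix
f = rng.random(n_states)                   # terminal reward f(x)

# Backward induction (the DPP as a recursion): V[t, x] = max_u sum_y P[u, x, y] V[t+1, y].
V = np.zeros((horizon + 1, n_states))
V[horizon] = f
for t in range(horizon - 1, -1, -1):
    V[t] = np.max(P @ V[t + 1], axis=0)    # (P @ V[t+1])[u, x] = E[V(t+1, X_{t+1}) | X_t = x, u]

# Brute-force check: maximize E[f(X_horizon)] over all feedback policies
# pi = (pi_0, ..., pi_{horizon-1}), where pi_t maps states to controls.
def expected_reward(x0, policy):
    dist = np.zeros(n_states); dist[x0] = 1.0
    for pi_t in policy:
        dist = sum(dist[x] * P[pi_t[x], x] for x in range(n_states))
    return float(dist @ f)

for x0 in range(n_states):
    brute = max(expected_reward(x0, pol)
                for pol in itertools.product(
                    itertools.product(range(n_controls), repeat=n_states),
                    repeat=horizon))
    assert abs(brute - V[0, x0]) < 1e-12   # the one-step Bellman recursion attains the sup
```

The measurability difficulties discussed in the paper are invisible here precisely because the probability space is finite; they reappear as soon as the state dynamics are driven by a general process.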
The inequality “V ≤ v” is the easy one, but it still requires that V be measurable.
Our weak formulation avoids this issue. Namely, under fairly general conditions on the controls set and the controlled process, it follows from an easy application of the tower property of conditional expectations that

V(t, x) ≤ sup_{ν∈U} E[ V^∗(τ, X^ν_τ) ],

where V^∗ is the upper semicontinuous envelope of V. The converse inequality is obtained in the weaker form

V(t, x) ≥ sup_{ν∈U} E[ V_∗(τ^ν_n, X^ν_{τ^ν_n}) ],

where τ^ν_n := τ ∧ inf{ s > t : |X^ν_s − x| > n }, and V_∗ is the lower semicontinuous envelope of V.
This result is weaker than the classical DPP (1.1). However, in the controlled
Markov jump-diffusions case, it turns out to be tailor-made for the derivation of the
dynamic programming equation in the sense of viscosity solutions. Section 5 reports
this derivation in the context of controlled jump diffusions.
Finally, Section 4 provides an extension of our argument in order to obtain a weak
dynamic programming principle for mixed control-stopping problems.
2. The stochastic control problem. Let (Ω, F, P) be a probability space sup-
porting a càdlàg Rd -valued process Z with independent increments. Given T > 0, let
F := {Ft , 0 ≤ t ≤ T } be the completion of its natural filtration on [0, T ]. Note that
F satisfies the usual conditions, see e.g. [6]. We assume that F0 is trivial and that
FT = F.
For every t ≥ 0, we set F^t := (F^t_s)_{s≥0}, where F^t_s is the completion of σ(Z_r − Z_t, t ≤ r ≤ s ∨ t) by the null sets of F.
We denote by T the collection of all F-stopping times. For τ_1, τ_2 ∈ T with τ_1 ≤ τ_2 a.s., the subset T_{[τ_1,τ_2]} is the collection of all τ ∈ T such that τ ∈ [τ_1, τ_2] a.s. When τ_1 = 0, we simply write T_{τ_2}. We use the notations T^t_{[τ_1,τ_2]} and T^t_{τ_2} to denote the corresponding sets of F^t-stopping times.
Throughout the paper, the only reason for introducing the filtration F through the process Z is to guarantee the following property of the filtrations F^t.
Remark 2.1. Notice that F^t_s-measurable random variables are independent of F_t for all s, t ≤ T, and that F^t_s is the trivial degenerate σ-algebra for s ≤ t. Similarly, all F^t-stopping times are independent of F_t.
For τ ∈ T and a subset A of a finite dimensional space, we denote by L^0_τ(A) the collection of all F_τ-measurable random variables with values in A. H^0(A) is the collection of all F-progressively measurable processes with values in A, and H^0_rcll(A) is the subset of all processes in H^0(A) which are right-continuous with finite left limits.
In the following, we denote by B_r(z) (resp. ∂B_r(z)) the open ball (resp. its boundary) of radius r > 0 and center z ∈ R^ℓ, ℓ ∈ N.
A suitable choice of the set S in the case of jump-diffusion processes driven by Brow-
nian motion is given in Section 5 below.
Given a Borel function f : R^d −→ R and (t, x) ∈ S, we introduce the reward function J : S × U −→ R:

J(t, x; ν) := E[ f(X^ν_{t,x}(T)) ].   (2.1)
Remark 2.2. The restriction to control processes that are F^t-progressively measurable in the definition of V(t, ·) is natural and consistent with the case t = 0, since F_0 is assumed to be trivial; it is actually commonly used, compare with e.g. [16]. It will be technically important in the following. It also seems a priori necessary in order to ensure that Assumption A4 below makes sense, see Remark 3.2 and the proof of Proposition 5.4 below. However, we will show in Remark 5.2 below that it is not restrictive.
3. Dynamic programming for stochastic control problems. For the purpose of our weak dynamic programming principle, the following assumptions are crucial.
Assumption A For all (t, x) ∈ S and ν ∈ U_t, the controlled state process satisfies:
A1 (Independence) The process X^ν_{t,x} is F^t-progressively measurable.
A2 (Causality) For ν̃ ∈ U_t, τ ∈ T^t_{[t,T]} and A ∈ F^t_τ, if ν = ν̃ on [t, τ] and ν1_A = ν̃1_A on (τ, T], then X^ν_{t,x} 1_A = X^ν̃_{t,x} 1_A.
A3 (Stability under concatenation) For every ν, ν̃ ∈ U_t and θ ∈ T^t_{[t,T]}:

ν1_{[0,θ]} + ν̃1_{(θ,T]} ∈ U_t.
A4 (Consistency with deterministic initial data) For all θ ∈ T^t_{[t,T]}, we have:
a. For P-a.e. ω ∈ Ω, there exists ν̃_ω ∈ U_{θ(ω)} such that

E[ f(X^ν_{t,x}(T)) | F_θ ](ω) ≤ J(θ(ω), X^ν_{t,x}(θ)(ω); ν̃_ω).

b. For t ≤ s ≤ T, θ ∈ T^t_{[t,s]}, ν̃ ∈ U_s, and ν̄ := ν1_{[0,θ]} + ν̃1_{(θ,T]}, we have:

E[ f(X^ν̄_{t,x}(T)) | F_θ ](ω) = J(θ(ω), X^ν_{t,x}(θ)(ω); ν̃) for P-a.e. ω ∈ Ω.
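On a discrete time grid with finitely many scenarios, the concatenation operation appearing in A3 and A4-b is a simple pasting of control paths. The following toy sketch (our own illustration, with hypothetical discrete controls and a hypothetical random time standing in for a stopping time) builds ν1_{[0,θ]} + ν̃1_{(θ,T]} scenario by scenario and checks that the pasted control agrees with ν up to θ:

```python
import numpy as np

rng = np.random.default_rng(1)
n_paths, n_steps = 4, 6  # hypothetical discrete-time scenarios

nu = rng.integers(0, 2, size=(n_paths, n_steps))        # control nu, per scenario and date
nu_tilde = rng.integers(0, 2, size=(n_paths, n_steps))  # control nu~, per scenario and date
theta = rng.integers(0, n_steps, size=n_paths)          # random time per scenario (stand-in for a stopping time)

# Concatenation as in A3: follow nu on [0, theta], switch to nu~ on (theta, T].
s = np.arange(n_steps)
concat = np.where(s[None, :] <= theta[:, None], nu, nu_tilde)

# The pasted control agrees with nu up to and including theta on every scenario.
for w in range(n_paths):
    assert np.array_equal(concat[w, : theta[w] + 1], nu[w, : theta[w] + 1])
```

The same pasting with an event A ∈ F^t_θ selecting between two continuations gives the bifurcation property A5 of Remark 3.4 below.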
Remark 3.1. Assumption A2 above means that the process X^ν_{t,x} is defined (caused) by the control ν pathwise.
Remark 3.2. Let θ be equal to a fixed time s in A4-b. If ν̃ is allowed to depend on F_s, then the left-hand side in A4-b does not coincide with E[ f(X^ν̃_{s,X^ν_{t,x}(s)(ω)}(T)) ]. Hence, the above identity cannot hold in this form.
Remark 3.3. In Section 5 below, we show that Assumption A4-a holds with equality in the jump-diffusion setting. Although we have no example of a control problem for which the equality does not hold, we keep Assumption A4-a in this form because the proof only requires this weaker statement.
Remark 3.4. Assumption A3 above implies the following property of the controls set, which will be needed later:
A5 (Stability under bifurcation) For ν_1, ν_2 ∈ U_t, τ ∈ T^t_{[t,T]} and A ∈ F^t_τ, we have:

ν̄ := ν_1 1_{[0,τ]} + ( 1_A ν_1 + 1_{A^c} ν_2 ) 1_{(τ,T]} ∈ U_t.
Our main result is the following weak version of the dynamic programming principle, which uses the following notation:

V_∗(t, x) := lim inf_{(t′,x′)→(t,x)} V(t′, x′),   V^∗(t, x) := lim sup_{(t′,x′)→(t,x)} V(t′, x′),   (t, x) ∈ S.
Theorem 3.5. Let Assumption A hold true, and assume that V is locally bounded. Then, for every (t, x) ∈ S and for every family of stopping times {θ^ν, ν ∈ U_t} ⊂ T^t_{[t,T]}, we have

V(t, x) ≤ sup_{ν∈U_t} E[ V^∗(θ^ν, X^ν_{t,x}(θ^ν)) ].   (3.1)

Assume further that J(·; ν) ∈ LSC(S) for every ν ∈ U_o. Then, for any function ϕ : S −→ R:

ϕ ∈ USC(S) and V ≥ ϕ  =⇒  V(t, x) ≥ sup_{ν∈U^ϕ_t} E[ ϕ(θ^ν, X^ν_{t,x}(θ^ν)) ],   (3.2)

where U^ϕ_t := { ν ∈ U_t : E[ϕ(θ^ν, X^ν_{t,x}(θ^ν))^+] < ∞ or E[ϕ(θ^ν, X^ν_{t,x}(θ^ν))^−] < ∞ }.
Before proceeding to the proof of this result, we report the following consequence.
Corollary 3.6. Let the conditions of Theorem 3.5 hold. For (t, x) ∈ S, let {θ^ν, ν ∈ U_t} ⊂ T^t_{[t,T]} be a family of stopping times such that X^ν_{t,x} 1_{[t,θ^ν]} is L^∞-bounded for all ν ∈ U_t. Then,

sup_{ν∈U_t} E[ V_∗(θ^ν, X^ν_{t,x}(θ^ν)) ] ≤ V(t, x) ≤ sup_{ν∈U_t} E[ V^∗(θ^ν, X^ν_{t,x}(θ^ν)) ].
Proof. The right-hand side inequality is already provided in Theorem 3.5. Fix r > 0. It follows from standard arguments, see e.g. Lemma 3.5 in [13], that we can find a sequence of continuous functions (ϕ_n)_n such that ϕ_n ≤ V_∗ ≤ V for all n ≥ 1 and such that ϕ_n converges pointwise to V_∗ on [0, T] × B_r(0). Set φ_N := min_{n≥N} ϕ_n for N ≥ 1 and observe that the sequence (φ_N)_N is non-decreasing and converges pointwise to V_∗ on [0, T] × B_r(0). Applying (3.2) of Theorem 3.5 and using the monotone convergence theorem, we then obtain:

V(t, x) ≥ lim_{N→∞} E[ φ_N(θ^ν, X^ν_{t,x}(θ^ν)) ] = E[ V_∗(θ^ν, X^ν_{t,x}(θ^ν)) ] for every ν ∈ U_t,

and the left-hand side inequality follows from the arbitrariness of ν.
Remark 3.7. Notice that the value function V(t, x) is defined by means of U_t as the set of controls. Because of this, the lower semicontinuity of J(·, ν) required in the second part of Theorem 3.5 does not imply that V is lower semicontinuous in its t-variable. See however Remark 5.3 below.
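The continuous approximations of V_∗ used in the proof of Corollary 3.6 can be realized concretely by Lipschitz inf-convolutions, ϕ_n(x) := inf_y { V(y) + n|x − y| }, which are n-Lipschitz, dominated by V, and non-decreasing in n. A one-dimensional sketch on a grid (our own illustration, with a hypothetical discontinuous V, not taken from the paper):

```python
import numpy as np

x = np.linspace(-1.0, 1.0, 401)
V = (x > 0).astype(float)  # lower semicontinuous at 0, since V(0) = 0

def inf_convolution(V, x, n):
    # phi_n(x) = inf_y [ V(y) + n |x - y| ]: n-Lipschitz, phi_n <= V, increasing in n
    return np.min(V[None, :] + n * np.abs(x[:, None] - x[None, :]), axis=1)

phi = [inf_convolution(V, x, n) for n in (1, 2, 4, 8)]

# Monotone: phi_1 <= phi_2 <= ... <= V, with pointwise convergence to V as n grows.
for a, b in zip(phi, phi[1:]):
    assert np.all(a <= b + 1e-12)
assert np.all(phi[-1] <= V + 1e-12)
```

Taking minima φ_N := min_{n≥N} ϕ_n as in the proof then yields a non-decreasing sequence of continuous functions converging pointwise to the lower semicontinuous envelope.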
Proof. [Theorem 3.5] 1. Let ν ∈ U_t be arbitrary and set θ := θ^ν. The first assertion is a direct consequence of Assumption A4-a. Indeed, it implies that, for P-almost all ω ∈ Ω, there exists ν̃_ω ∈ U_{θ(ω)} such that

E[ f(X^ν_{t,x}(T)) | F_θ ](ω) ≤ J(θ(ω), X^ν_{t,x}(θ)(ω); ν̃_ω).

Since, by definition, J(θ(ω), X^ν_{t,x}(θ)(ω); ν̃_ω) ≤ V^∗(θ(ω), X^ν_{t,x}(θ)(ω)), it follows from the tower property of conditional expectations that

E[ f(X^ν_{t,x}(T)) ] = E[ E[ f(X^ν_{t,x}(T)) | F_θ ] ] ≤ E[ V^∗(θ, X^ν_{t,x}(θ)) ].
2. Let ε > 0 be given. Then, by the definition of V, the upper semicontinuity of ϕ and the lower semicontinuity of J(·; ν), there is a family (ν^{(s,y),ε})_{(s,y)∈S} ⊂ U_o together with radii r_{(s,y)} > 0 such that:

ν^{(s,y),ε} ∈ U_s and J(s, y; ν^{(s,y),ε}) ≥ V(s, y) − ε,   (3.4)

and

ϕ(s, y) − ϕ(t′, x′) ≥ −ε and J(s, y; ν^{(s,y),ε}) − J(t′, x′; ν^{(s,y),ε}) ≤ ε for (t′, x′) ∈ B(s, y; r_{(s,y)}),   (3.5)

where, for r > 0 and (s, y) ∈ S,

B(s, y; r) := { (t′, x′) ∈ S : t′ ∈ (s − r, s], |x′ − y| < r }.   (3.6)

Note that we do not use here balls of the usual form B_r(s, y), but consider instead the topology induced by half-closed intervals on [0, T]. The fact that t′ ≤ s for (t′, x′) ∈ B(s, y; r)
will play an important role when appealing to Assumption A4-b in Step 3 below. Clearly, { B(s, y; r) : (s, y) ∈ S, 0 < r ≤ r_{(s,y)} } forms an open covering of (0, T] × R^d. It then follows from the Lindelöf covering theorem, see e.g. [14], Theorem 6.3, Chapter VIII, that we can find a countable sequence (t_i, x_i, r_i)_{i≥1} of elements of S × R, with 0 < r_i ≤ r_{(t_i,x_i)} for all i ≥ 1, such that S ⊂ {0} × R^d ∪ (∪_{i≥1} B(t_i, x_i; r_i)).
Set A_0 := {T} × R^d, C_{−1} := ∅, and define the sequence

A_{i+1} := B(t_{i+1}, x_{i+1}; r_{i+1}) \ C_i, where C_i := C_{i−1} ∪ A_i, i ≥ 0.

With this construction, it follows from (3.4), (3.5), together with the fact that V ≥ ϕ, that the countable family (A_i)_{i≥0} satisfies

(θ, X^ν_{t,x}(θ)) ∈ ∪_{i≥0} A_i P-a.s.,  A_i ∩ A_j = ∅ for i ≠ j ∈ N,  and  J(·; ν^{i,ε}) ≥ ϕ − 3ε on A_i for i ≥ 1,   (3.7)

where ν^{i,ε} := ν^{(t_i,x_i),ε} for i ≥ 1.
3. We now prove (3.2). We fix ν ∈ U_t and θ ∈ T^t_{[t,T]}, and set A^n := ∪_{0≤i≤n} A_i, n ≥ 1. Given ν ∈ U_t, we define

ν^{ε,n}_s := 1_{[t,θ]}(s) ν_s + 1_{(θ,T]}(s) ( ν_s 1_{(A^n)^c}(θ, X^ν_{t,x}(θ)) + Σ_{i=1}^n 1_{A_i}(θ, X^ν_{t,x}(θ)) ν^{i,ε}_s ),  for s ∈ [t, T].

Notice that {(θ, X^ν_{t,x}(θ)) ∈ A_i} ∈ F^t_θ as a consequence of Assumption A1. Then, it follows from the stability under concatenation Assumption A3 and Remark 3.4 that ν^{ε,n} ∈ U_t. By the definition of the neighbourhood (3.6), notice that θ = θ ∧ t_i ≤ t_i on {(θ, X^ν_{t,x}(θ)) ∈ A_i}. Then, using Assumptions A4-b, A2, and (3.7), we deduce that:
E[ f(X^{ν^{ε,n}}_{t,x}(T)) | F_θ ] 1_{A^n}(θ, X^ν_{t,x}(θ))
  = E[ f(X^{ν^{ε,n}}_{t,x}(T)) | F_θ ] 1_{A_0}(θ, X^ν_{t,x}(θ)) + Σ_{i=1}^n E[ f(X^{ν^{ε,n}}_{t,x}(T)) | F_{θ∧t_i} ] 1_{A_i}(θ, X^ν_{t,x}(θ))
  = V(T, X^{ν^{ε,n}}_{t,x}(T)) 1_{A_0}(θ, X^ν_{t,x}(θ)) + Σ_{i=1}^n J(θ∧t_i, X^ν_{t,x}(θ∧t_i); ν^{i,ε}) 1_{A_i}(θ, X^ν_{t,x}(θ))
  ≥ Σ_{i=0}^n ( ϕ(θ, X^ν_{t,x}(θ)) − 3ε ) 1_{A_i}(θ, X^ν_{t,x}(θ))
  = ( ϕ(θ, X^ν_{t,x}(θ)) − 3ε ) 1_{A^n}(θ, X^ν_{t,x}(θ)),

which, by the definition of V and the tower property of conditional expectations, implies

V(t, x) ≥ J(t, x; ν^{ε,n}) ≥ E[ ( ϕ(θ, X^ν_{t,x}(θ)) − 3ε ) 1_{A^n}(θ, X^ν_{t,x}(θ)) ] + E[ f(X^{ν^{ε,n}}_{t,x}(T)) 1_{(A^n)^c}(θ, X^ν_{t,x}(θ)) ].

Since f(X^ν_{t,x}(T)) ∈ L^1, it follows from the dominated convergence theorem that:

V(t, x) ≥ −3ε + lim inf_{n→∞} E[ ϕ(θ, X^ν_{t,x}(θ)) 1_{A^n}(θ, X^ν_{t,x}(θ)) ]
  = −3ε + lim_{n→∞} E[ ϕ(θ, X^ν_{t,x}(θ))^+ 1_{A^n}(θ, X^ν_{t,x}(θ)) ] − lim_{n→∞} E[ ϕ(θ, X^ν_{t,x}(θ))^− 1_{A^n}(θ, X^ν_{t,x}(θ)) ]
  = −3ε + E[ ϕ(θ, X^ν_{t,x}(θ)) ],

where the last equality follows from the left-hand side of (3.7) and from the monotone convergence theorem, due to the fact that either E[ϕ(θ, X^ν_{t,x}(θ))^+] < ∞ or E[ϕ(θ, X^ν_{t,x}(θ))^−] < ∞. The proof of (3.2) is completed by the arbitrariness of ν ∈ U_t and ε > 0.
Remark 3.8. (Lower-semicontinuity condition I) It is clear from the above proof that it suffices to prove the lower semicontinuity of (t, x) ↦ J(t, x; ν) for ν in a subset Ũ_o of U_o such that sup_{ν∈Ũ_t} J(t, x; ν) = V(t, x). Here, Ũ_t is the subset of Ũ_o whose elements are F^t-progressively measurable. In most applications, this allows one to reduce to the case where the controls are essentially bounded or satisfy a strong integrability condition.
Remark 3.9. (Lower-semicontinuity condition II) In the above proof, the lower-semicontinuity assumption is only used to construct the balls B(t_i, x_i; r_i) on which J(t_i, x_i; ν^{i,ε}) − J(·; ν^{i,ε}) ≤ ε. Clearly, it can be alleviated, and it suffices that the lower semicontinuity holds in time from the left, i.e.

lim inf_{t′↑t, x′→x} J(t′, x′; ν) ≥ J(t, x; ν) for every (t, x) ∈ S and ν ∈ U_o.
Remark 3.10. (The Bolza and Lagrange formulations) Consider the stochastic control problem under the so-called Lagrange formulation:

V(t, x) := sup_{ν∈U_t} E[ ∫_t^T Y^ν_{t,x,1}(s) g(s, X^ν_{t,x}(s), ν_s) ds + Y^ν_{t,x,1}(T) f(X^ν_{t,x}(T)) ],

where

dY^ν_{t,x,y}(s) = −Y^ν_{t,x,y}(s) k(s, X^ν_{t,x}(s), ν_s) ds,   Y^ν_{t,x,y}(t) = y > 0.

Then, it is well known that this problem can be converted into the Mayer formulation (2.3) by augmenting the state process to (X, Y, Z), where

dZ^ν_{t,x,y,z}(s) = Y^ν_{t,x,y}(s) g(s, X^ν_{t,x}(s), ν_s) ds,   Z^ν_{t,x,y,z}(t) = z ∈ R,

and by considering the value function V̄ of the resulting Mayer problem. In particular, V(t, x) = V̄(t, x, 1, 0). The first assertion of Theorem 3.5 implies

V(t, x) ≤ sup_{ν∈U_t} E[ Y^ν_{t,x,1}(θ^ν) V^∗(θ^ν, X^ν_{t,x}(θ^ν)) + ∫_t^{θ^ν} Y^ν_{t,x,1}(s) g(s, X^ν_{t,x}(s), ν_s) ds ].   (3.8)
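The augmentation of Remark 3.10 is straightforward to simulate: an Euler scheme for (X, Y, Z) accumulates the discounted running reward in Z, so the Mayer payoff Z(T) + Y(T)f(X(T)) equals the Lagrange payoff along each path. A sketch with hypothetical coefficients μ, σ, g, k and reward f (all chosen for illustration, not taken from the paper):

```python
import numpy as np

rng = np.random.default_rng(42)

# Hypothetical coefficients (a constant control is suppressed for brevity):
mu    = lambda t, x: 0.1 * x            # drift of X
sigma = lambda t, x: 0.2                # diffusion coefficient of X
g     = lambda t, x: x ** 2             # running reward
k     = lambda t, x: 0.05               # discount rate
f     = lambda x: np.maximum(x, 0.0)    # terminal reward

def mayer_payoff(x0, T=1.0, n_steps=200):
    """Euler scheme for the augmented state (X, Y, Z) of Remark 3.10."""
    dt = T / n_steps
    x, y, z = x0, 1.0, 0.0              # Y(t) = 1, Z(t) = 0
    t = 0.0
    for _ in range(n_steps):
        dw = rng.normal(scale=np.sqrt(dt))
        z += y * g(t, x) * dt           # dZ =  Y g dt
        y += -y * k(t, x) * dt          # dY = -Y k dt
        x += mu(t, x) * dt + sigma(t, x) * dw
        t += dt
    return z + y * f(x)                 # Mayer payoff = Lagrange payoff along the path

sample = [mayer_payoff(1.0) for _ in range(1000)]
estimate = float(np.mean(sample))       # Monte Carlo estimate of the reward J(0, 1; nu)
```

Averaging the Mayer payoff over simulated paths estimates the Lagrange-form reward without any explicit discount factor inside the expectation.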
Remark 3.11. (Infinite horizon) Infinite horizon problems can be handled similarly. Following the notations of the previous Remark 3.10, we introduce the infinite horizon stochastic control problem:

V^∞(t, x) := sup_{ν∈U_t} E[ ∫_t^∞ Y^ν_{t,x,1}(s) g(s, X^ν_{t,x}(s), ν_s) ds ].

Then, it is immediately seen that V^∞ satisfies the weak dynamic programming principle (3.8)-(3.9).
4. Dynamic programming for mixed control-stopping problems. In this
section, we provide a direct extension of the dynamic programming principle of The-
orem 3.5 to the larger class of mixed control and stopping problems.
In the context of the previous section, we consider a Borel function f : R^d −→ R, and we assume |f| ≤ f̄ for some continuous function f̄. For (t, x) ∈ S, the reward function J̄ : S × Ū × T_{[t,T]} −→ R is given by

J̄(t, x; ν, τ) := E[ f(X^ν_{t,x}(τ)) ],   (4.1)

and the mixed control-stopping problem is defined by

V̄(t, x) := sup_{(ν,τ)∈Ū_t×T^t_{[t,T]}} J̄(t, x; ν, τ).   (4.2)
In order to extend the result of Theorem 3.5, we shall assume that the following version of A4 holds:
Assumption A4′ For all (t, x) ∈ S, (ν, τ) ∈ Ū_t × T^t_{[t,T]} and θ ∈ T^t_{[t,T]}, we have:
a. For P-a.e. ω ∈ Ω, there exists (ν̃_ω, τ̃_ω) ∈ Ū_{θ(ω)} × T^{θ(ω)}_{[θ(ω),T]} such that

1_{τ≥θ}(ω) E[ f(X^ν_{t,x}(τ)) | F_θ ](ω) ≤ 1_{τ≥θ}(ω) J̄(θ(ω), X^ν_{t,x}(θ)(ω); ν̃_ω, τ̃_ω).

b. For t ≤ s ≤ T, θ ∈ T^t_{[t,s]}, (ν̃, τ̃) ∈ Ū_s × T^s_{[s,T]}, τ̄ := τ 1_{τ<θ} + τ̃ 1_{τ≥θ}, and ν̄ := ν 1_{[0,θ]} + ν̃ 1_{(θ,T]}, we have for P-a.e. ω ∈ Ω:

1_{τ≥θ}(ω) E[ f(X^ν̄_{t,x}(τ̄)) | F_θ ](ω) = 1_{τ≥θ}(ω) J̄(θ(ω), X^ν_{t,x}(θ)(ω); ν̃, τ̃).
Theorem 4.1. Let Assumptions A1, A2, A3 and A4′ hold true. Then, for every (t, x) ∈ S and for every family of stopping times {θ^ν, ν ∈ Ū_t} ⊂ T^t_{[t,T]}:

V̄(t, x) ≤ sup_{(ν,τ)∈Ū_t×T^t_{[t,T]}} E[ 1_{τ<θ^ν} f(X^ν_{t,x}(τ)) + 1_{τ≥θ^ν} V̄^∗(θ^ν, X^ν_{t,x}(θ^ν)) ].   (4.4)

Assume further that the map (t, x) ↦ J̄(t, x; ν, τ) satisfies the following lower-semicontinuity property:

lim inf_{t′↑t, x′→x} J̄(t′, x′; ν, τ) ≥ J̄(t, x; ν, τ) for every (t, x) ∈ S and (ν, τ) ∈ Ū × T.   (4.5)

Then, for any function ϕ : S −→ R:

ϕ ∈ USC(S) and V̄ ≥ ϕ  =⇒  V̄(t, x) ≥ sup_{(ν,τ)∈Ū_t×T^t_{[t,T]}} E[ 1_{τ<θ^ν} f(X^ν_{t,x}(τ)) + 1_{τ≥θ^ν} ϕ(θ^ν, X^ν_{t,x}(θ^ν)) ].   (4.6)
For simplicity, we only provide the proof of Theorem 4.1 for optimal stopping
problems, i.e. in the case where Ū is reduced to a singleton. The dynamic program-
ming principle for mixed control-stopping problems is easily proved by combining the
arguments below with those of the proof of Theorem 3.5.
Proof. (for optimal stopping problems) We omit the control ν from all notations, thus simply writing X_{t,x}(·) and J̄(t, x; τ). Inequality (4.4) follows immediately from the tower property together with Assumption A4′-a, recalling that J̄ ≤ V̄^∗.
We next prove (4.6). Arguing as in Step 2 of the proof of Theorem 3.5, we first observe that, for every ε > 0, we can find a countable family Ā_i ⊂ (t_i − r_i, t_i] × A_i ⊂ S, together with a sequence of stopping times τ^{i,ε} ∈ T^{t_i}_{[t_i,T]}, i ≥ 1, satisfying Ā_0 = {T} × R^d and

∪_{i≥0} Ā_i = S,  Ā_i ∩ Ā_j = ∅ for i ≠ j ∈ N, and J̄(·; τ^{i,ε}) ≥ ϕ − 3ε on Ā_i for i ≥ 1.   (4.7)
Set Ā^n := ∪_{i≤n} Ā_i, n ≥ 1. Given two stopping times θ, τ ∈ T^t_{[t,T]}, it follows from (4.3) (and Assumption A1 in the general mixed control case) that

τ^{n,ε} := τ 1_{τ<θ} + 1_{τ≥θ} ( T 1_{(Ā^n)^c}(θ, X_{t,x}(θ)) + Σ_{i=1}^n τ^{i,ε} 1_{Ā_i}(θ, X_{t,x}(θ)) )

is a stopping time in T^t_{[t,T]}.
We then deduce from the tower property together with
Assumption A4′-b and (4.7) that

V̄(t, x) ≥ J̄(t, x; τ^{n,ε})
 ≥ E[ ( f(X_{t,x}(τ)) 1_{τ<θ} + 1_{τ≥θ} (ϕ(θ, X_{t,x}(θ)) − 3ε) ) 1_{Ā^n}(θ, X_{t,x}(θ)) ]
  + E[ 1_{τ≥θ} f(X_{t,x}(T)) 1_{(Ā^n)^c}(θ, X_{t,x}(θ)) ].
By sending n → ∞ and arguing as in the end of the proof of Theorem 3.5, we deduce
that
V̄(t, x) ≥ E[ f(X_{t,x}(τ)) 1_{τ<θ} + 1_{τ≥θ} ϕ(θ, X_{t,x}(θ)) ] − 3ε,

and the result follows from the arbitrariness of ε > 0 and τ ∈ T^t_{[t,T]}.
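In discrete time with a finite state space (and Ū reduced to a singleton, as in the proof above), the control-stopping DPP reduces to the Snell envelope recursion V̄(t, x) = max{ f(x), E[V̄(t+1, X_{t+1}) | X_t = x] }. The following sketch (our own illustration, with a hypothetical Markov chain and reward) checks this recursion against a brute-force maximization over Markov stopping rules:

```python
import itertools
import numpy as np

rng = np.random.default_rng(3)
n_states, horizon = 3, 2

P = rng.random((n_states, n_states)); P /= P.sum(axis=1, keepdims=True)
f = rng.random(n_states)                      # reward f(x), collected when we stop

# Snell envelope (discrete DPP for optimal stopping):
#   Vbar[T] = f,  Vbar[t, x] = max( f(x), sum_y P[x, y] Vbar[t+1, y] ).
Vbar = np.zeros((horizon + 1, n_states))
Vbar[horizon] = f
for t in range(horizon - 1, -1, -1):
    Vbar[t] = np.maximum(f, P @ Vbar[t + 1])

# Brute force over Markov stopping rules: a stopping region per date, forced stop at T.
def value(x0, regions):
    dist = np.zeros(n_states); dist[x0] = 1.0
    total = 0.0
    for t in range(horizon):
        stop = np.array([x in regions[t] for x in range(n_states)])
        total += float(dist[stop] @ f[stop])  # mass stopping now collects f
        dist = (dist * ~stop) @ P             # remaining mass moves on
    return total + float(dist @ f)            # stop at the horizon with what is left

subsets = list(itertools.chain.from_iterable(
    itertools.combinations(range(n_states), k) for k in range(n_states + 1)))
for x0 in range(n_states):
    brute = max(value(x0, regs) for regs in itertools.product(subsets, repeat=horizon))
    assert abs(brute - Vbar[0, x0]) < 1e-12   # recursion attains the sup over stopping rules
```

For Markov chains, stopping regions of this Markov form are known to be sufficient, which is why the brute-force search matches the backward recursion.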
The difference between Ṽ (t, ·) and V (t, ·) comes from the fact that all controls in U
are considered in the former, while we restrict to controls independent of Ft in the
latter. We claim that
Ṽ = V ,
so that both problems are indeed equivalent. Clearly, Ṽ ≥ V . To see that the converse
holds true, fix (t, x) ∈ [0, T) × R^d and ν ∈ U. Then, ν can be written as a measurable function of the canonical process, ν((ω_s)_{0≤s≤t}, (ω_s − ω_t)_{t≤s≤T}), where, for fixed (ω_s)_{0≤s≤t}, the map ν_{(ω_s)_{0≤s≤t}} : (ω_s − ω_t)_{t≤s≤T} ↦ ν((ω_s)_{0≤s≤t}, (ω_s − ω_t)_{t≤s≤T}) can be viewed as a control independent of F_t. Using the independence of the increments of the Brownian motion and the compound Poisson process, together with Fubini's theorem, it thus follows that

J(t, x; ν) = ∫ E[ f(X^{ν_{(ω_s)_{0≤s≤t}}}_{t,x}(T)) ] dP((ω_s)_{0≤s≤t}) ≤ ∫ V(t, x) dP((ω_s)_{0≤s≤t}),

where the latter equals V(t, x). By the arbitrariness of ν ∈ U, this implies that Ṽ(t, x) ≤ V(t, x).
Remark 5.3. By the previous remark, it follows that the value function V
inherits the lower semicontinuity of the performance criterion required in the second
part of Theorem 3.5, compare with Remark 3.7. This simplification is specific to the
simple stochastic control problem considered in this section, and may not hold in
other control problems, see e.g. [4]. Consequently, we shall deliberately ignore the
lower semicontinuity of V in the subsequent analysis in order to show how to derive
the dynamic programming equation in a general setting.
Let f : R^d −→ R be a lower semicontinuous function with linear growth, and define the performance criterion J by (2.1). Then, it follows that U = U_o and, from (5.2) and the almost sure continuity of (t, x) ↦ X^ν_{t,x}(T), that J(·, ν) is lower semicontinuous, as required in the second part of Theorem 3.5.
The value function V is defined by (2.3). Various types of conditions can be for-
mulated in order to guarantee that V is locally bounded. For instance, if f is bounded
from above, this condition is trivially satisfied. Alternatively, one may restrict the
set U to be bounded, so that the linear growth of f implies corresponding bounds for
V . We do not want to impose such a constraint because we would like to highlight
the fact that our methodology applies to general singular control problems. We therefore leave this issue as a condition to be checked by arguments specific to the problem at hand.
Proposition 5.4. In the above controlled diffusion context, assume further that V is locally bounded. Then, the value function V satisfies the weak dynamic programming principle (3.1)-(3.2).
Proof. Conditions A1, A2 and A3 of Assumption A are obviously satisfied in the present context. It remains to check that A4 holds true. For ω ∈ Ω and r ≥ 0, we denote ω^r_· := ω_{·∧r} and T_r(ω)(·) := ω_{·∨r} − ω_r, so that ω_· = ω^r_· + T_r(ω)(·). Fix (t, x) ∈ S, ν ∈ U_t, θ ∈ T^t_{[t,T]}, and observe that, by the flow property,

E[ f(X^ν_{t,x}(T)) | F_θ ](ω) = ∫ f( X^{ν(ω^{θ(ω)} + T_{θ(ω)}(ω))}_{θ(ω), X^ν_{t,x}(θ)(ω)}(T)(T_{θ(ω)}(ω)) ) dP(T_{θ(ω)}(ω))
 = ∫ f( X^{ν(ω^{θ(ω)} + T_{θ(ω)}(ω̃))}_{θ(ω), X^ν_{t,x}(θ)(ω)}(T)(T_{θ(ω)}(ω̃)) ) dP(ω̃)
 = J(θ(ω), X^ν_{t,x}(θ)(ω); ν̃_ω),
where ν̃_ω(ω̃) := ν(ω^{θ(ω)} + T_{θ(ω)}(ω̃)) is an element of U_{θ(ω)}. This already proves A4-a.
As for A4-b, note that if ν̄ := ν 1_{[0,θ]} + ν̃ 1_{(θ,T]} with ν̃ ∈ U_s and θ ∈ T^t_{[t,s]}, then the same computations imply

E[ f(X^ν̄_{t,x}(T)) | F_θ ](ω) = ∫ f( X^{ν̃(ω^{θ(ω)} + T_{θ(ω)}(ω̃))}_{θ(ω), X^ν_{t,x}(θ)(ω)}(T)(T_{θ(ω)}(ω̃)) ) dP(ω̃),

where we used the flow property together with the fact that X^ν_{t,x} = X^ν̄_{t,x} on [t, θ] and that the dynamics of X^ν̄_{t,x} depends only on ν̃ after θ. Now observe that ν̃ is independent of F_s, and therefore of ω^{θ(ω)}, since θ ≤ s P-a.s. It follows that

E[ f(X^ν̄_{t,x}(T)) | F_θ ](ω) = ∫ f( X^{ν̃(T_s(ω̃))}_{θ(ω), X^ν_{t,x}(θ)(ω)}(T)(T_{θ(ω)}(ω̃)) ) dP(ω̃) = J(θ(ω), X^ν_{t,x}(θ)(ω); ν̃).
Remark 5.5. It can be proved similarly that A4′ holds true in the context of mixed control-stopping problems.
5.2. PDE derivation. We can now show how our weak formulation of the dynamic programming principle allows us to characterize the value function as a discontinuous viscosity solution of a suitable Hamilton-Jacobi-Bellman equation.
Let C^0 denote the set of continuous maps on [0, T] × R^d, endowed with the topology of uniform convergence on compact sets. To (t, x, p, A, ϕ) ∈ [0, T] × R^d × R^d × M_d × C^0, we associate the Hamiltonian of the control problem:

H(t, x, p, A, ϕ) := inf_{u∈U} H^u(t, x, p, A, ϕ),

where, for u ∈ U,

H^u(t, x, p, A, ϕ) := −⟨μ(t, x, u), p⟩ − (1/2) Tr[(σσ′)(t, x, u) A] − ∫_E ( ϕ(t, x + β(t, x, u, e)) − ϕ(t, x) ) λ(de),

and we denote by H_∗ the associated lower semicontinuous envelope:

H_∗(z) := lim inf_{z′→z} H(z′) for z = (t, x, p, A, ϕ) ∈ S × R^d × M_d × C^0.
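For a concrete feel for the Hamiltonian, the sketch below evaluates H^u for a one-dimensional jump-diffusion, approximating the jump integral by a finite measure, and minimizes over a grid of controls. Taking H as the infimum of H^u over the control set is the convention adopted in this illustration, consistent with the sign of H^u above; all coefficients are hypothetical, not taken from the paper:

```python
import numpy as np

# Hypothetical 1-d coefficients mu(t, x, u), sigma(t, x, u), jump size beta(t, x, u, e),
# and a finite jump measure lambda supported on E_pts with weights lam_w.
mu    = lambda t, x, u: u * (1.0 - x)
sigma = lambda t, x, u: 0.3 * (1.0 + abs(u))
beta  = lambda t, x, u, e: 0.1 * u * e
E_pts = np.array([-1.0, 0.5, 2.0])
lam_w = np.array([0.2, 0.5, 0.1])

def H_u(t, x, p, A, phi, u):
    # H^u = -mu p - (1/2) sigma^2 A - \int_E ( phi(t, x + beta) - phi(t, x) ) lambda(de)
    jump = np.sum((phi(t, x + beta(t, x, u, E_pts)) - phi(t, x)) * lam_w)
    return -mu(t, x, u) * p - 0.5 * sigma(t, x, u) ** 2 * A - jump

def H(t, x, p, A, phi, U=np.linspace(-1.0, 1.0, 201)):
    # H = infimum of H^u over a grid standing in for the control set U
    return min(H_u(t, x, p, A, phi, u) for u in U)

phi = lambda t, x: x ** 2        # an example test function
val = H(0.0, 1.0, 2.0, 2.0, phi)
```

In the viscosity-solution arguments below, ϕ plays the role of the smooth test function, and (p, A) of its first and second space derivatives at the test point.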
Proof. 1. We start with the supersolution property. Assume to the contrary that there is (t_0, x_0) ∈ [0, T) × R^d together with a smooth function ϕ : [0, T) × R^d −→ R satisfying

0 = (V_∗ − ϕ)(t_0, x_0) < (V_∗ − ϕ)(t, x) for all (t, x) ∈ [0, T) × R^d, (t, x) ≠ (t_0, x_0),

such that, for some u ∈ U and some r > 0 with t_0 + r < T,

(−∂_t φ + H^u(·, Dφ, D²φ, φ))(t, x) < 0 for all (t, x) ∈ B_r(t_0, x_0),   (5.4)

where we recall that B_r(t_0, x_0) denotes the ball of radius r and center (t_0, x_0). Let (t_n, x_n)_n be a sequence in B_r(t_0, x_0) such that (t_n, x_n, V(t_n, x_n)) → (t_0, x_0, V_∗(t_0, x_0)), let X^n_· := X^u_{t_n,x_n}(·) denote the solution of (5.1) with constant control ν = u and initial condition X^n_{t_n} = x_n, and consider the stopping time

θ_n := inf{ s > t_n : (s, X^n_s) ∉ B_r(t_0, x_0) }.

Note that θ_n < T since t_0 + r < T. Applying Itô's formula to φ(·, X^n), and using (5.4) and (5.2), we see that

φ(t_n, x_n) = E[ φ(θ_n, X^n_{θ_n}) ] − E[ ∫_{t_n}^{θ_n} ( ∂_t φ − H^u(·, Dφ, D²φ, φ) )(s, X^n_s) ds ] ≤ E[ φ(θ_n, X^n_{θ_n}) ].

Now observe that ϕ ≥ φ + η on ([0, T] × R^d) \ B_r(t_0, x_0) for some η > 0. Hence, the above inequality implies that φ(t_n, x_n) ≤ E[ ϕ(θ_n, X^n_{θ_n}) ] − η. Since (φ − V)(t_n, x_n) → 0, we can then find n large enough so that

V(t_n, x_n) ≤ E[ ϕ(θ_n, X^n_{θ_n}) ] − η/2,

which contradicts (3.2), since ϕ ∈ USC(S) and ϕ ≤ V_∗ ≤ V.
2. We now prove the subsolution property. Assume to the contrary that there is (t_0, x_0) ∈ [0, T) × R^d together with a smooth function ϕ : [0, T) × R^d −→ R satisfying

0 = (V^∗ − ϕ)(t_0, x_0) > (V^∗ − ϕ)(t, x) for all (t, x) ∈ [0, T) × R^d, (t, x) ≠ (t_0, x_0),   (5.5)

such that

(−∂_t φ + H^u(·, Dφ, D²φ, φ))(t, x) > 0 for every u ∈ U and (t, x) ∈ B_r(t_0, x_0).   (5.6)

Let (t_n, x_n)_n be a sequence in B_r(t_0, x_0) such that (t_n, x_n, V(t_n, x_n)) → (t_0, x_0, V^∗(t_0, x_0)). For an arbitrary control ν^n ∈ U_{t_n}, let X^n := X^{ν^n}_{t_n,x_n} denote the solution of (5.1) with initial condition X^n_{t_n} = x_n, and set

θ_n := inf{ s > t_n : (s, X^n_s) ∉ B_r(t_0, x_0) }.

Notice that θ_n < T as a consequence of the fact that t_0 + r < T. We may assume without loss of generality that
In view of (5.7), the above inequality implies that φ(t_n, x_n) ≥ E[ V^∗(θ_n, X^n_{θ_n}) ] + 2η,
REFERENCES
[1] G. Barles and C. Imbert, Second-Order Elliptic Integro-Differential Equations: Viscosity So-
lutions’ Theory Revisited, Annales de l’IHP, 25 (2008), pp. 567-585.
[2] D.P. Bertsekas and S.E. Shreve, Stochastic Optimal Control : The Discrete Time Case,
Mathematics in Science and Engineering, 139, Academic Press, 1978.
[3] V.S. Borkar, Optimal Control of Diffusion Processes, Pitman Research Notes 203. Longman
Sci. and Tech. Harlow, 1989.
[4] B. Bouchard, N.-M. Dang and C.-A. Lehalle, Optimal control of trading algorithms: a
general impulse control approach, to appear in SIAM Journal on Financial Mathematics.
[5] M.G. Crandall, H. Ishii and P.-L. Lions, User's guide to viscosity solutions of second order partial differential equations, Bull. Amer. Math. Soc., 27 (1992), pp. 1-67.
[6] C. Dellacherie and P.-A. Meyer, Probabilités et Potentiel, Théorie du potentiel, Hermann, Paris, 1987.
[7] N. El Karoui, Les Aspects probabilistes du contrôle stochastique, Springer Lecture Notes in
Mathematics 876, Springer Verlag, New York, 1981.
[8] W.H. Fleming and H.M. Soner, Controlled Markov Processes and Viscosity Solutions, Second
Edition, Springer, 2006.
[9] Y. Kabanov and C. Klueppelberg, A geometric approach to portfolio optimization in models
with transaction costs, Finance and Stochastics, 8 (2004), pp. 207-227.
[10] P.-L. Lions, Optimal Control of Diffusion Processes and Hamilton-Jacobi-Bellman Equations
I, Comm. PDE., 8 (1983), pp. 1101-1134.
[11] P.-L. Lions, Optimal Control of Diffusion Processes and Hamilton-Jacobi-Bellman Equations,
Part II: Viscosity Solutions and Uniqueness, Comm. PDE., 8 (1983), pp. 1229-1276.
[12] B. Oksendal and A. Sulem, Applied Stochastic Control of Jump Diffusions, Universitext,
Springer (Second edition), 2007.
[13] P. J. Reny, On the Existence of Pure and Mixed Strategy Nash Equilibria in Discontinuous
Games, Econometrica, 67 (1999), pp. 1029-1056.
[14] J. Dugundji, Topology, Allyn and Bacon Series in Advanced Mathematics, Allyn and Bacon, 1966.
[15] N. Touzi, Stochastic Control Problems, Viscosity Solutions, and Application to Finance,
Quaderni, Edizioni della Scuola Normale Superiore, Pisa, 2002.
[16] J. Yong and X. Zhou, Stochastic Controls: Hamiltonian Systems and HJB Equations,
Springer, New York, 1999.