
Chapter 3.

Optimal Stopping of Markov Chains

Optimal stopping theory is well understood in both continuous time and
discrete time. A major development was the theory of the Snell envelope,
which treats the general non-Markovian setting [3, 1]. Applications of
optimal stopping usually specialize to the Markovian setting [2].
The chapter considers the optimal stopping of time-homogeneous Markov
chains. Assume that, defined on some probability space (Ω, P), is a Markov
chain {X_0, X_1, \dots} taking values in some set S (called the state space).
The transition probability kernel is denoted by P(dy|x); i.e.,

    P(X_{n+1} ∈ A \mid X_n = x, X_{n-1}, \dots, X_0) = P(A \mid x),

and

    E[f(X_{n+1}) \mid X_n = x, X_{n-1}, \dots, X_0] = \int_S f(y) \, P(dy|x)

for any measurable function f : S → R. The main content of these
equalities is that the distribution of X_{n+1}, conditional on the history
{X_n, X_{n-1}, \dots, X_0}, is independent of {X_{n-1}, \dots, X_0}.
Let c : S → R and g : S → R be two functions on the state space. The
objective is to minimize the total expected cost

    E\Big[ g(X_τ) + \sum_{j=0}^{τ-1} c(X_j) \Big]

over a class of admissible stopping times τ. Depending on the class of
stopping times, the problem is classified as finite-horizon or infinite-horizon.
We will assume the following condition throughout the chapter, so that
the value function cannot take the values plus or minus infinity.

Condition 1 The function c is non-negative, and g is bounded.

1 Finite-horizon

Suppose N is a fixed non-negative integer (the horizon). Assume the class
of admissible stopping times is {τ : P(τ ≤ N) = 1}. The value function is
denoted by

    v_N(x) := \inf_{\{τ : P(τ ≤ N) = 1\}} E\Big[ g(X_τ) + \sum_{j=0}^{τ-1} c(X_j) \,\Big|\, X_0 = x \Big].    (1)

The DPE for this problem is quite simple: define

    V_N(x) := g(x),
    V_n(x) := \min\Big\{ g(x), \, c(x) + \int_S V_{n+1}(y) P(dy|x) \Big\},

for all n = 0, 1, \dots, N − 1. Then one would expect that v_N(x) = V_0(x).

Proposition 1 The value function v_N(x) = V_0(x), and the optimal stopping
time is

    τ_N^* := \inf\{ j ≥ 0 : V_j(X_j) = g(X_j) \}.

Proof. Consider the process

    Z_n := \sum_{j=0}^{n-1} c(X_j) + V_n(X_n), \qquad n = 0, 1, \dots, N.

Then the process {Z_n : n = 0, 1, \dots, N} is a submartingale. Indeed,

    E[Z_{n+1} \mid X_n, X_{n-1}, \dots, X_0]
        = \sum_{j=0}^{n} c(X_j) + E[V_{n+1}(X_{n+1}) \mid X_n]
        = \sum_{j=0}^{n} c(X_j) + \int_S V_{n+1}(y) P(dy|X_n)
        ≥ \sum_{j=0}^{n-1} c(X_j) + V_n(X_n)
        = Z_n,

where the inequality holds because V_n(X_n) ≤ c(X_n) + \int_S V_{n+1}(y) P(dy|X_n),
by the definition of V_n as a minimum.

In particular, by optional sampling,

    E[Z_τ] ≥ E[Z_0] = V_0(x)

for any stopping time taking values in {0, 1, \dots, N}. But by definition,
V_j ≤ g for every j; thus

    V_0(x) ≤ E[Z_τ] ≤ E\Big[ \sum_{j=0}^{τ-1} c(X_j) + g(X_τ) \Big].

But for every n = 0, 1, \dots, N − 1, we have

    E[Z_{τ_N^* ∧ (n+1)} \mid X_n, X_{n-1}, \dots, X_0]
        = 1_{\{τ_N^* ≤ n\}} Z_{τ_N^*} + 1_{\{τ_N^* ≥ n+1\}} E[Z_{n+1} \mid X_n, X_{n-1}, \dots, X_0]
        = 1_{\{τ_N^* ≤ n\}} Z_{τ_N^*} + 1_{\{τ_N^* ≥ n+1\}} \Big( \sum_{j=0}^{n} c(X_j) + \int_S V_{n+1}(y) P(dy|X_n) \Big)
        = 1_{\{τ_N^* ≤ n\}} Z_{τ_N^*} + 1_{\{τ_N^* ≥ n+1\}} \Big( \sum_{j=0}^{n-1} c(X_j) + V_n(X_n) \Big)
        = 1_{\{τ_N^* ≤ n\}} Z_{τ_N^*} + 1_{\{τ_N^* ≥ n+1\}} Z_n
        = Z_{τ_N^* ∧ n},

where the third equality holds because, on the event \{τ_N^* ≥ n+1\}, we have
V_n(X_n) < g(X_n), so the minimum in the DPE is attained by the integral term.

Therefore,

    V_0(x) = E[Z_0] = E[Z_{τ_N^* ∧ 0}] = E[Z_{τ_N^* ∧ 1}] = \cdots = E[Z_{τ_N^* ∧ N}] = E[Z_{τ_N^*}].

But

    E[Z_{τ_N^*}] = E\Big[ \sum_{j=0}^{τ_N^*-1} c(X_j) + V_{τ_N^*}(X_{τ_N^*}) \Big] = E\Big[ \sum_{j=0}^{τ_N^*-1} c(X_j) + g(X_{τ_N^*}) \Big].

It follows that V_0(x) = v_N(x) and that τ_N^* is optimal for the problem
with horizon N.
The following result is a restatement of Proposition 1; it is the forward
form of the above DPE.

Proposition 2 The value functions {v_n} satisfy v_{n+1} ≤ v_n for every n,
and

    v_0(x) = g(x),
    v_{n+1}(x) = \min\Big\{ g(x), \, c(x) + \int_S v_n(y) P(dy|x) \Big\}.

For each N ≥ 0, the stopping time

    τ_N^* := \inf\{ j ≥ 0 : v_{N-j}(X_j) = g(X_j) \}

is optimal for the problem with horizon N.
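For a chain with finitely many states, the recursion in Proposition 2 is
straightforward to implement: the integral against P(dy|x) becomes a
matrix-vector product. The following is a minimal sketch in Python (not
part of the original notes), assuming the kernel is given as a
row-stochastic matrix P over the state space:

    import numpy as np

    def finite_horizon_values(P, g, c, N):
        """Forward DPE of Proposition 2: v_0 = g and v_{n+1} = min(g, c + P v_n).
        Returns the list [v_0, v_1, ..., v_N] of value functions."""
        g = np.asarray(g, dtype=float)
        v = [g]
        for _ in range(N):
            v.append(np.minimum(g, c + P @ v[-1]))
        return v

The optimal rule for horizon N then stops at the first time j with
v_{N-j}(X_j) = g(X_j).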

Example 1 Let us consider a search problem with the following setup.
Suppose there are three offers {Y_0, Y_1, Y_2} in total. The offers are iid
with common two-point distribution

    P(Y_0 = a) = P(Y_0 = b) = 1/2,

with 0 < a < b. Each solicitation costs c. After the first offer comes in,
what should the agent do so as to minimize the total solicitation cost plus
the minimum price attained?

Solution: The objective is to find

    v_N(x) := \inf_{\{τ : P(τ ≤ N) = 1\}} E\Big[ X_τ + \sum_{j=0}^{τ-1} c \,\Big|\, X_0 = x \Big],

where N := 2 and X_n := Y_0 ∧ \cdots ∧ Y_n.
It is not difficult to check that {X_0, X_1, X_2} is a time-homogeneous
Markov chain taking values in {a, b}, with transition probabilities

    P(X_{n+1} = a \mid X_n = a) = 1,    P(X_{n+1} = b \mid X_n = a) = 0,
    P(X_{n+1} = a \mid X_n = b) = 1/2,  P(X_{n+1} = b \mid X_n = b) = 1/2.

We will write

    c = \frac{b-a}{2} + ε.

Define V_2(x) = x; that is, V_2(a) = a and V_2(b) = b. Then

    V_1(a) = \min\{a, \, c + 1 \cdot V_2(a) + 0 \cdot V_2(b)\} = a,
    V_1(b) = \min\{b, \, c + \tfrac12 V_2(a) + \tfrac12 V_2(b)\} = b ∧ (b + ε),

and

    V_0(a) = \min\{a, \, c + 1 \cdot V_1(a) + 0 \cdot V_1(b)\} = a,
    V_0(b) = \min\{b, \, c + \tfrac12 V_1(a) + \tfrac12 V_1(b)\} = b ∧ (b + 3ε/2).

We have v_2(x) = V_0(x), and the optimal stopping time is

    τ^* = \inf\{n ≥ 0 : V_n(X_n) = X_n\}.

More precisely:

1. For ε ≥ 0, v_2(x) ≡ x, and the optimal stopping time is τ^* = 0; that
is, stop immediately.

2. For ε < 0, v_2(a) = a and v_2(b) = b + 3ε/2. It is not difficult to
see that

    τ^* = \inf\{n ≥ 0 : X_n = a\} ∧ 2.

That is, the optimal policy is to search until the lowest price a is
solicited or the offers are exhausted.
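As a numerical sanity check (again a sketch, not from the notes; the values
a = 1, b = 2, ε = −0.1 are illustrative), one can feed this two-state chain
to the finite_horizon_values sketch above and recover v_2(a) = a and
v_2(b) = b + 3ε/2:

    import numpy as np

    a, b, eps = 1.0, 2.0, -0.1
    c = (b - a) / 2 + eps                  # solicitation cost, here 0.4
    P = np.array([[1.0, 0.0],              # states ordered (a, b); a is absorbing
                  [0.5, 0.5]])
    g = np.array([a, b])                   # stopping cost: the current minimum price
    v = finite_horizon_values(P, g, c, N=2)
    print(v[2])                            # -> [1.0, 1.85], i.e. [a, b + 3*eps/2]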

2 Infinite-horizon

The objective is to determine the value function

    v(x) := \inf_{\{τ : P(τ < ∞) = 1\}} E\Big[ g(X_τ) + \sum_{j=0}^{τ-1} c(X_j) \,\Big|\, X_0 = x \Big].    (2)

The restriction P(τ < ∞) = 1 avoids the ambiguity of defining X_τ on the
set {τ = ∞}.
The DPE associated with this problem is

    V(x) = \min\Big\{ g(x), \, c(x) + \int_S V(y) P(dy|x) \Big\}.    (3)

The following result asserts that the value function satisfies the DPE.

Theorem 1 The value function satisfies the DPE (3); i.e.,

    v(x) = \min\Big\{ g(x), \, c(x) + \int_S v(y) P(dy|x) \Big\}.

An optimal (finite) stopping time exists if and only if the stopping time

    τ^* := \inf\{n ≥ 0 : v(X_n) = g(X_n)\}

is finite, i.e., P(τ^* < ∞) = 1. Furthermore, if τ^* is finite, then τ^* is
an optimal stopping time.

Proof. We claim v(x) = \lim_n ↓ v_n(x) for every x ∈ S. By definition, we
have v ≤ v_{n+1} ≤ v_n for all n, whence v ≤ \lim_n v_n. Now for the reverse
inequality, fix X_0 = x ∈ S and an arbitrary constant ε > 0. There exists a
finite stopping time τ_ε such that

    v(x) + ε ≥ E\Big[ g(X_{τ_ε}) + \sum_{j=0}^{τ_ε-1} c(X_j) \Big].

But thanks to Condition 1, the monotone convergence theorem (MCT), and the
dominated convergence theorem (DCT), we have

    E\Big[ g(X_{τ_ε}) + \sum_{j=0}^{τ_ε-1} c(X_j) \Big] = \lim_n E\Big[ g(X_{τ_ε ∧ n}) + \sum_{j=0}^{(τ_ε ∧ n)-1} c(X_j) \Big] ≥ \lim_n v_n(x).

Therefore,

    v(x) + ε ≥ \lim_n v_n(x).

Since ε is arbitrary, we have v(x) ≥ \lim_n v_n(x), which in turn implies
v(x) = \lim_n v_n(x) for every x ∈ S.
By Proposition 2, we have

    v_{n+1}(x) = \min\Big\{ g(x), \, c(x) + \int_S v_n(y) P(dy|x) \Big\}.

Letting n → ∞ on both sides, and noting that {v_n} is clearly uniformly
bounded, we obtain

    v(x) = \min\Big\{ g(x), \, c(x) + \int_S v(y) P(dy|x) \Big\}

from the DCT.
Fix X_0 = x, and assume now that there exists an optimal finite stopping
time, say σ. Consider the process

    Z_n := \sum_{j=0}^{n-1} c(X_j) + v(X_n), \qquad n = 0, 1, \dots.

Then {Z_n} is a submartingale, as in the proof of Proposition 1. We have

    E[Z_{σ ∧ n}] = E\Big[ \sum_{j=0}^{σ ∧ n - 1} c(X_j) + v(X_{σ ∧ n}) \Big] ≥ E[Z_0] = v(x).

Letting n → ∞, thanks to the MCT, the DCT, the fact that v ≤ g, and the
optimality of σ, we arrive at

    v(x) ≤ E\Big[ \sum_{j=0}^{σ-1} c(X_j) + v(X_σ) \Big] ≤ E\Big[ \sum_{j=0}^{σ-1} c(X_j) + g(X_σ) \Big] = v(x).

In particular, with probability one,

    v(X_σ) = g(X_σ).

This implies σ ≥ τ^*, whence τ^* is finite if there exists an optimal finite
stopping time.
Suppose now τ^* is finite. The same argument as in the proof of
Proposition 1 yields

    v(x) = E[Z_0] = \cdots = E[Z_{τ^* ∧ n}] = E\Big[ \sum_{j=0}^{τ^* ∧ n - 1} c(X_j) + v(X_{τ^* ∧ n}) \Big].

Letting n → ∞ completes the proof, thanks to the MCT, the DCT, and the fact
that v(X_{τ^*}) = g(X_{τ^*}).

Remark 1 For obvious reasons, the set {x ∈ S : v(x) = g(x)} is sometimes
called the "stopping region", and the set {x ∈ S : v(x) < g(x)} is called
the "continuation region".

Remark 2 The forward form of the DPE (Proposition 2) also provides a
recursive numerical algorithm for computing v in general.
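Concretely, under the same finite-state assumption as in the sketch after
Proposition 2 (and again not part of the original notes), one can iterate
the recursion until it stabilizes; since v_n ↓ v by the proof of Theorem 1,
the fixed point approximates the infinite-horizon value function, and the
stopping region of Remark 1 can be read off as {x : v(x) = g(x)}:

    import numpy as np

    def infinite_horizon_value(P, g, c, tol=1e-10, max_iter=100_000):
        """Value iteration for the DPE (3): v <- min(g, c + P v), from v = g.
        The iterates decrease monotonically to the value function of (2)."""
        g = np.asarray(g, dtype=float)
        v = g.copy()
        for _ in range(max_iter):
            v_new = np.minimum(g, c + P @ v)
            if np.max(np.abs(v_new - v)) < tol:
                break
            v = v_new
        return v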

Corollary 1 The value function v is the largest solution to the DPE (3), in
the following sense: if u is another solution to the DPE, then v(x) ≥ u(x)
for every x ∈ S.

Proof. Suppose u is another solution. The proof of Theorem 1 shows that
v(x) = \lim_n v_n(x). In order to show u(x) ≤ v(x), it suffices to show
that u(x) ≤ v_n(x) for every n.
We proceed by induction. For n = 0, u(x) ≤ v_0(x) = g(x) is trivial, since
u solves the DPE. Suppose u(x) ≤ v_n(x) for some n. We want to show
u(x) ≤ v_{n+1}(x). But by Proposition 2,

    v_{n+1}(x) = \min\Big\{ g(x), \, c(x) + \int_S v_n(y) P(dy|x) \Big\}
               ≥ \min\Big\{ g(x), \, c(x) + \int_S u(y) P(dy|x) \Big\}
               = u(x).

This completes the proof.

Corollary 2 Suppose {X_0, X_1, \dots} is an irreducible time-homogeneous
Markov chain with finite state space, i.e., the state space S is finite.
Then τ^* is always finite and optimal.

Proof. Let x̄ ∈ S be such that

    g(x̄) = \min_{x ∈ S} g(x).

Since S is a finite set, such an x̄ always exists. Clearly, v(x̄) = g(x̄):
since c ≥ 0 and g ≥ g(x̄), any stopping time incurs a cost of at least
g(x̄), while stopping immediately at x̄ costs exactly g(x̄). Thus

    τ^* = \inf\{n ≥ 0 : v(X_n) = g(X_n)\} ≤ \inf\{n ≥ 0 : X_n = x̄\} =: σ.

But since {X_n} is an irreducible, finite Markov chain, σ is finite, whence
so is τ^*.

The following proposition is useful for verifying whether a given solution
of the DPE equals the value function.

Proposition 3 Suppose v̄ is a bounded solution to the DPE (3), and that the
stopping time

    τ̄ := \inf\{n ≥ 0 : v̄(X_n) = g(X_n)\}

is finite, i.e., P(τ̄ < ∞) = 1. Then v̄ = v and τ̄ is optimal.

Proof. The proof is exactly the same as the last part of the proof of Theorem
1 with v replaced by v̄. We omit the details.

Remark 3 Suppose β ∈ (0, 1) is a discount factor, and we consider the
discounted optimal stopping problem

    \inf_{\{τ : P(τ < ∞) = 1\}} E\Big[ β^τ g(X_τ) + \sum_{j=0}^{τ-1} β^j c(X_j) \Big].

Then the DPE becomes

    V(x) = \min\Big\{ g(x), \, c(x) + β \int_S V(y) P(dy|x) \Big\}.

All the results we have obtained hold for the discounted problem, as long
as Condition 1 holds.

3 Examples of infinite horizon

Example 2 (Concave majorant) Suppose that {X_0, X_1, \dots} is a simple
symmetric random walk on the integers S = {0, 1, \dots, b}, with absorption
at the two endpoints {0, b}. Let f : S → R be a non-negative function.
Show that the value function

    v(x) := \sup_{\{τ : P(τ < ∞) = 1\}} E[f(X_τ) \mid X_0 = x]

is the least concave majorant of f (i.e., the smallest concave function
that dominates f).

Solution: Theorem 1 implies that the value function v satisfies

    v(x) = \max\Big\{ f(x), \, \int_S v(y) P(dy|x) \Big\}, \qquad ∀ x ∈ S.

Thus v ≥ f. Also, for x ∈ {1, 2, \dots, b − 1}, the above equation yields

    v(x) = \max\Big\{ f(x), \, \tfrac12 v(x+1) + \tfrac12 v(x-1) \Big\}.

In particular, for all such x,

    v(x) ≥ \tfrac12 v(x+1) + \tfrac12 v(x-1).

Thus v is a concave majorant of f.
It remains to show that v is the smallest concave majorant. To this end,
let u be another concave majorant; that is, u ≥ f and u is concave. Note
that the proof of Theorem 1 implies that v = \lim_n v_n (here the limit is
increasing, since the problem is a maximization), where {v_n} is determined
recursively by

    v_0(x) = f(x),
    v_{n+1}(x) = \max\Big\{ f(x), \, \int_S v_n(y) P(dy|x) \Big\}.

It suffices to show that u ≥ v_n for every n. We proceed by induction.


The claim holds when n = 0. Suppose u ≥ v_n for some n. Since u is concave,
it is not difficult to see that

    \int_S u(y) P(dy|x) ≤ u(x)

for every x ∈ S. Then

    v_{n+1}(x) ≤ \max\Big\{ f(x), \, \int_S u(y) P(dy|x) \Big\} ≤ \max\{f(x), u(x)\} = u(x).

We complete the proof.
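The convergence v_n ↑ v can be watched numerically. Here is a small sketch
(not from the notes; the test function f is an arbitrary illustrative
choice) that iterates the recursion on {0, \dots, b} with absorbing
endpoints and returns, to numerical precision, the least concave majorant
of f:

    import numpy as np

    b = 10
    f = np.array([0, 1, 3, 2, 2, 6, 4, 3, 5, 2, 0], dtype=float)  # any f >= 0

    v = f.copy()
    for _ in range(100_000):
        w = v.copy()
        # interior points: max of immediate payoff and one-step average;
        # the absorbing endpoints keep v(0) = f(0) and v(b) = f(b)
        w[1:-1] = np.maximum(f[1:-1], 0.5 * (v[2:] + v[:-2]))
        if np.max(np.abs(w - v)) < 1e-12:
            break
        v = w
    # v is now (numerically) the least concave majorant of f on {0, ..., b}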

Example 3 (Discrete put option) Suppose the log-stock price is modeled by a
simple symmetric random walk {X_0, X_1, \dots} on Z. In other words, the
stock price at time n is \exp(X_n). Let β ∈ (0, 1) be a discount factor.
Compute the value function

    v(x) := \sup_{\{τ : P(τ < ∞) = 1\}} E\big[ β^τ (1 - \exp(X_τ))^+ \,\big|\, X_0 = x \big],

and the optimal strategy. The value function represents the price of a
discrete put option.

Solution: Theorem 1 and Proposition 3 are applicable since Condition 1 is
satisfied; see Remark 3. The DPE associated with this optimal stopping
problem is

    V(x) = \max\Big\{ (1 - e^x)^+, \, β \int_{\mathbb Z} V(y) P(dy|x) \Big\}
         = \max\Big\{ (1 - e^x)^+, \, \frac{β}{2} V(x+1) + \frac{β}{2} V(x-1) \Big\}.

The idea is to find a bounded solution to the DPE, and then apply
Proposition 3.
It is natural to expect that the optimal stopping time will take the form

    \inf\{n ≥ 0 : v(X_n) = (1 - \exp(X_n))^+\} = \inf\{n ≥ 0 : X_n ≤ b\}

for some integer b ∈ Z that has to be determined. Of course, since it is
never optimal to stop if X_n ≥ 0, we also conjecture b < 0.
The above discussion prompts us to consider a candidate bounded solution v̄
such that v̄(x) = 1 − e^x for all x ≤ b, and v̄(x) = (β/2)[v̄(x+1) + v̄(x−1)]
for x > b. The general solution of the latter difference equation is

    v̄(x) = A e^{αx} + B e^{-αx}, \qquad ∀ x ≥ b,

for some constants A and B, where

    α := \log \frac{1 - \sqrt{1 - β^2}}{β}.    (4)

Clearly, α < 0. But since we assume v̄ is bounded, it follows that B = 0,
and thus

    v̄(x) = A e^{αx}, \qquad ∀ x ≥ b.
But at x = b, we must have

    1 - e^b = A e^{αb},

or

    A = e^{-αb} (1 - e^b).

In other words, a candidate solution is

    v̄(x) = \begin{cases} 1 - e^x & \text{if } x ≤ b, \\ e^{α(x-b)} (1 - e^b) & \text{if } x > b. \end{cases}    (5)

But what is the value of the free boundary b ∈ Z? Recall that for v̄ to be a
solution, we must have

    1 - e^x ≥ (β/2)[v̄(x+1) + v̄(x-1)], \qquad ∀ x ≤ b,    (6)

and

    1 - e^x < (β/2)[v̄(x+1) + v̄(x-1)], \qquad ∀ x > b.    (7)

These inequalities indeed uniquely determine b ∈ Z. We have the following
lemma.
Lemma 1 There exists a unique negative integer b ∈ Z such that the above
inequalities (6)-(7) hold. Indeed, b is the unique integer such that
b ∈ (B − 1, B], with

    B = \log \frac{1 - e^α}{1 - e^{α-1}} < 0.

The function v̄ given by (5) is the value function, and the optimal
stopping time is

    τ^* = \inf\{n ≥ 0 : X_n ≤ b\}.

The proof of this lemma is given in the Appendix.
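Numerically, the quantities in (4), (5), and Lemma 1 are immediate to
evaluate. A sketch (not from the notes; β = 0.9 is an illustrative choice):

    import math

    beta = 0.9
    alpha = math.log((1 - math.sqrt(1 - beta**2)) / beta)    # eq. (4); alpha < 0
    B = math.log((1 - math.exp(alpha)) / (1 - math.exp(alpha - 1)))
    b = math.floor(B)                      # the unique integer in (B - 1, B]

    def v_bar(x):
        """Candidate value function of eq. (5)."""
        if x <= b:
            return 1 - math.exp(x)
        return math.exp(alpha * (x - b)) * (1 - math.exp(b))

    print(b, [round(v_bar(x), 4) for x in range(b - 2, 3)])

For β = 0.9 this gives b = −1, so the option is exercised as soon as the
log-price drops below zero.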

Remark 4 In the above example, what happens if the discount factor β = 1?
It is not difficult to see that the value function is v(x) ≡ 1 for every
x ∈ Z. Indeed, it is clear that v(x) ≤ 1 for all x. But for an arbitrary
j ∈ N, let σ_j := \inf\{n ≥ 0 : X_n ≤ -j\}; then σ_j is finite, and we
have, for x ≥ -j,

    v(x) ≥ E\big[(1 - \exp(X_{σ_j}))^+\big] = 1 - \exp(-j) → 1

as j → ∞. This implies that v(x) ≥ 1, whence v(x) = 1. The value function
clearly satisfies the DPE

    v(x) = \max\big\{ (1 - e^x)^+, \, [v(x+1) + v(x-1)]/2 \big\}.

But since the stopping time

    \inf\{n ≥ 0 : v(X_n) = (1 - \exp(X_n))^+\} = ∞,

there is no optimal stopping time. Note that the expected payoffs of {σ_j}
approach the value function (i.e., {σ_j} is an optimizing sequence).

4 When Condition 1 is violated: The verification argument

The results in the previous sections rely on Condition 1. What if the
condition fails to hold? Or, more seriously, what if the cost structure is
not as specified, e.g., the cost is path-dependent? In such cases, one
method we can apply is the so-called verification argument. This approach
has been used many times in Chapter 2, and it is also implicitly used in
the proof of Theorem 1. The next two examples show that this method is
quite general.

Example 4 (Search for the maximum) This is another search model, but with a
different cost structure. Let {Y_0, Y_1, \dots} be a sequence of iid
non-negative random variables, representing the offers, with common
distribution F and such that E[Y_0] < ∞. Let c > 0 be a constant,
representing the cost of each solicitation. The objective for the agent is
to solve the optimization problem

    \max_τ E[Y_0 ∨ Y_1 ∨ \cdots ∨ Y_τ - cτ]

over all finite stopping times τ.


Solution: Let X_n := Y_0 ∨ Y_1 ∨ \cdots ∨ Y_n. Then {X_n} is a
(non-decreasing) Markov chain. We will denote its transition probability
kernel by P(dy|x). Moreover, the optimization problem is equivalent to

    \inf_τ E\Big[ -X_τ + \sum_{j=0}^{τ-1} c \Big].

This does not satisfy Condition 1, since {X_n} may be unbounded. Denote the
value function by

    v(x) = \sup_{\{τ : P(τ < ∞) = 1\}} E[X_τ - cτ \mid X_0 = x].

We will consider two cases separately.

1. If E[Y_0] ≤ c, then the optimal stopping time is τ^* = 0, and the value
function is v(x) ≡ x for all x. Indeed, it is straightforward to check that
the process {X_n - cn} is a supermartingale. Thus, for every finite
stopping time τ and every n ∈ N, we have

    E[X_{τ ∧ n} - c(τ ∧ n)] ≤ E[X_0] = x.

Letting n → ∞ and using the MCT twice, we have

    E[X_τ - cτ] ≤ x.

Thus v(x) ≤ x. But if we take τ^* = 0, then E[X_{τ^*} - cτ^*] = x. Hence
v(x) = x and τ^* = 0 is optimal.

2. Suppose now E[Y_0] > c. We claim that the value function is

    v(x) = \begin{cases} x^* & \text{if } x ≤ x^*, \\ x & \text{if } x > x^*, \end{cases}    (8)

where x^* is the solution to the equation

    c = \int_{x^*}^∞ [1 - F(y)] \, dy.

We also assert that the optimal stopping time is

    τ^* := \inf\{n ≥ 0 : X_n ≥ x^*\} = \inf\{n ≥ 0 : Y_n ≥ x^*\}.

Before proving the claim, it is worth pointing out that x^* always exists,
since

    \int_0^∞ [1 - F(y)] \, dy = E[Y_0] > c,

and that τ^* is obviously finite.
The idea of the proof is the same as in Case 1. Denote by v̄ the RHS of
equation (8). It is not difficult to check that

    v̄(x) = \max\Big\{ x, \, -c + \int v̄(y) P(dy|x) \Big\} = \max\Big\{ x, \, -c + \int v̄(x ∨ y) \, dF(y) \Big\}.    (9)

Indeed, for x > x^*, we have

    -c + \int v̄(x ∨ y) \, dF(y)
        = -c + \int_0^x x \, dF(y) + \int_x^∞ y \, dF(y)
        = -c + E[Y_0] + \int_0^x (x - y) \, dF(y)
        = -c + E[Y_0] + \int_0^x F(y) \, dy
        = -\int_{x^*}^∞ [1 - F(y)] \, dy + \int_0^∞ [1 - F(y)] \, dy + \int_0^x F(y) \, dy
        = \int_0^{x^*} [1 - F(y)] \, dy + \int_0^x F(y) \, dy
        = x - \int_{x^*}^x [1 - F(y)] \, dy,

and thus RHS = x = LHS. For x ≤ x^*, we have similarly

    -c + \int v̄(x ∨ y) \, dF(y) = -c + \int_0^{x^*} x^* \, dF(y) + \int_{x^*}^∞ y \, dF(y) = x^*,

where the last equality uses \int_{x^*}^∞ (y - x^*) \, dF(y) =
\int_{x^*}^∞ [1 - F(y)] \, dy = c. Thus again RHS = x^* = LHS.


It follows from equation (9) that the process {v̄(X_n) − cn} is a
supermartingale. Thus for every finite stopping time τ and n ∈ N, we have

    E[v̄(X_{τ ∧ n}) - c(τ ∧ n)] ≤ v̄(x).

Letting n → ∞, using the MCT twice, then taking the supremum over τ on the
LHS, and observing v̄(x) ≥ x, we arrive at v(x) ≤ v̄(x). But for the
stopping time τ^*, one can show analogously that, for every n ≥ 0,

    E\big[v̄(X_{τ^* ∧ (n+1)}) - c(τ^* ∧ (n+1))\big] = E\big[v̄(X_{τ^* ∧ n}) - c(τ^* ∧ n)\big] = \cdots = v̄(x).

Letting n → ∞ and using the MCT twice, we have

    v̄(x) = E[v̄(X_{τ^*}) - cτ^*] = E[X_{τ^*} - cτ^*].

Thus v̄(x) = v(x) and τ^* is optimal.
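Computationally, x^* is easy to locate: the tail integral
t ↦ \int_t^∞ [1 − F(y)] dy decreases continuously from E[Y_0] to 0, so a
bisection finds the root of c = \int_{x^*}^∞ [1 − F(y)] dy. A sketch (not
from the notes) for an exponential distribution, where the tail integral,
and hence x^*, is explicit:

    import math

    lam, c = 1.0, 0.25                     # illustrative: Y_i ~ Exp(1), so E[Y_0] = 1 > c

    def tail_integral(t):
        """integral_t^infinity (1 - F(y)) dy for the Exp(lam) distribution."""
        return math.exp(-lam * t) / lam

    lo, hi = 0.0, 50.0                     # the tail integral is decreasing in t
    for _ in range(100):                   # bisect on c = tail_integral(x_star)
        mid = (lo + hi) / 2
        lo, hi = (mid, hi) if tail_integral(mid) > c else (lo, mid)
    x_star = (lo + hi) / 2

    print(x_star, -math.log(lam * c) / lam)    # bisection vs. the closed form

The optimal policy is then to accept the first offer Y_n ≥ x^*.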

Example 5 (Discrete up-and-out put option) Suppose the log-stock price is
modeled by a simple symmetric random walk {X_0, X_1, \dots} on Z. Let
β ∈ (0, 1) be a discount factor. Compute the value function

    v(x) := \sup_{\{τ : P(τ < ∞) = 1\}} E\big[ β^τ (1 - \exp(X_τ))^+ \cdot 1_{\{\max_{0 ≤ j ≤ τ} X_j < H\}} \,\big|\, X_0 = x \big].

Here H ∈ Z is a positive integer. The value function represents the price
of a discrete up-and-out put option with barrier H.
Consider the following variational inequality, which is the DPE in detail.

Variational inequality: Find a bounded function V : Z → R_+ and an integer
(free boundary) b < 0 such that

    V(x) = 0, \qquad x ≥ H,    (10)
    V(x) = β [V(x+1) + V(x-1)]/2, \qquad b < x < H,    (11)
    V(x) = 1 - e^x, \qquad x ≤ b,    (12)
    V(x) > (1 - e^x)^+, \qquad b < x < H,    (13)
    V(x) ≥ β [V(x+1) + V(x-1)]/2, \qquad x ≤ b.    (14)

Lemma 2 Suppose (V, b) is a solution to the variational inequality. Then
v(x) = V(x) for all x, and the optimal stopping time is

    τ^* := \inf\{n ≥ 0 : X_n ≤ b\}.
Proof. Let σ := \inf\{n ≥ 0 : X_n ≥ H\}. Then the value function can be
rewritten as

    v(x) = \sup_{\{τ : P(τ < ∞) = 1\}} E\big[ β^τ (1 - \exp(X_τ))^+ \cdot 1_{\{τ < σ\}} \,\big|\, X_0 = x \big].

Now consider the process Z_n := β^{σ ∧ n} V(X_{σ ∧ n}), n = 0, 1, \dots.
We claim that {Z_n} is a supermartingale. Indeed,

    E[Z_{n+1} \mid X_n, \dots, X_0]
        = β^σ V(X_σ) 1_{\{σ ≤ n\}} + 1_{\{σ ≥ n+1\}} β^{n+1} E[V(X_{n+1}) \mid X_n, \dots, X_0]
        = β^σ V(X_σ) 1_{\{σ ≤ n\}} + 1_{\{σ ≥ n+1\}} β^{n+1} [V(X_n + 1) + V(X_n - 1)]/2
        ≤ β^σ V(X_σ) 1_{\{σ ≤ n\}} + 1_{\{σ ≥ n+1\}} β^n V(X_n)
        = Z_n,

where the inequality uses (11) and (14).

It follows from the optional sampling theorem that for any stopping time τ
and any n ∈ N, we have

    E[Z_{τ ∧ n}] ≤ E[Z_0] = V(x).

Letting n → ∞, the DCT further implies that

    E[Z_τ] = \lim_n E[Z_{τ ∧ n}] ≤ V(x).

But

    Z_τ = β^{σ ∧ τ} V(X_{σ ∧ τ}) ≥ β^τ (1 - \exp(X_τ))^+ \cdot 1_{\{τ < σ\}}.

Whence

    V(x) ≥ E\big[ β^τ (1 - \exp(X_τ))^+ \cdot 1_{\{τ < σ\}} \big].

Taking the supremum over τ on the RHS, we have V(x) ≥ v(x).
Now define Z_n^* := β^{σ ∧ τ^* ∧ n} V(X_{σ ∧ τ^* ∧ n}), n = 0, 1, \dots.
Similarly to the above argument, we have

    E[Z_{n+1}^* \mid X_n, \dots, X_0]
        = β^{σ ∧ τ^*} V(X_{σ ∧ τ^*}) 1_{\{σ ∧ τ^* ≤ n\}} + 1_{\{σ ∧ τ^* ≥ n+1\}} β^{n+1} [V(X_n + 1) + V(X_n - 1)]/2
        = β^{σ ∧ τ^*} V(X_{σ ∧ τ^*}) 1_{\{σ ∧ τ^* ≤ n\}} + 1_{\{σ ∧ τ^* ≥ n+1\}} β^n V(X_n)
        = Z_n^*,

where the second equality uses (11), since b < X_n < H on the event
\{σ ∧ τ^* ≥ n+1\}. In particular,

    E\big[β^{σ ∧ τ^* ∧ n} V(X_{σ ∧ τ^* ∧ n})\big] = E[Z_n^*] = E[Z_{n-1}^*] = \cdots = E[Z_0^*] = V(x).

Letting n → ∞, we have, thanks to the DCT,

    E\big[β^{σ ∧ τ^*} V(X_{σ ∧ τ^*})\big] = V(x).

But

    β^{σ ∧ τ^*} V(X_{σ ∧ τ^*}) = β^{τ^*} (1 - \exp(X_{τ^*}))^+ \cdot 1_{\{τ^* < σ\}},

by (10) and (12). Thus V(x) ≤ v(x), which in turn implies that V(x) = v(x)
and that τ^* is optimal.

Lemma 3 The solution to the variational inequality is

    V(x) = \begin{cases} 0 & \text{if } x ≥ H, \\ A e^{αx} + B e^{-αx} & \text{if } b < x < H, \\ 1 - e^x & \text{if } x ≤ b, \end{cases}

with α as defined in (4), and

    A = \frac{(1 - e^b) e^{-αb}}{1 - e^{2α(H-b)}}, \qquad B = \frac{(1 - e^b) e^{αb}}{1 - e^{-2α(H-b)}}.

The free boundary b is the unique integer in the interval (D − 1, D], with
D being the unique negative solution of the equation

    (1 - e^{D-1}) \big[1 - e^{2α(H-D)}\big] + (e^D - 1) \big[e^{-α} - e^{α + 2α(H-D)}\big] = 0.

Proof. The proof of the lemma consists of routine technical details and is
deferred to the Appendix.
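As with Lemma 1, everything here is easy to evaluate numerically: locate D
by bisection on f (which is strictly increasing on y ≤ 0), take b = ⌊D⌋,
and assemble (A, B, V). A sketch (not from the notes; β = 0.9 and H = 3 are
illustrative choices):

    import math

    beta, H = 0.9, 3
    alpha = math.log((1 - math.sqrt(1 - beta**2)) / beta)    # eq. (4)

    def f(y):
        """The strictly increasing function from the proof of Lemma 3."""
        q = math.exp(2 * alpha * (H - y))
        return ((1 - math.exp(y - 1)) * (1 - q)
                + (math.exp(y) - 1) * (math.exp(-alpha) - math.exp(alpha) * q))

    lo, hi = -50.0, 0.0                    # f(lo) < 0 < f(hi); bisect for D
    for _ in range(100):
        mid = (lo + hi) / 2
        lo, hi = (mid, hi) if f(mid) < 0 else (lo, mid)
    D = (lo + hi) / 2
    b = math.floor(D)                      # the unique integer in (D - 1, D]

    A = (1 - math.exp(b)) * math.exp(-alpha * b) / (1 - math.exp(2 * alpha * (H - b)))
    B = (1 - math.exp(b)) * math.exp(alpha * b) / (1 - math.exp(-2 * alpha * (H - b)))

    def V(x):
        """Value of the up-and-out put per Lemma 3."""
        if x >= H:
            return 0.0
        if x <= b:
            return 1 - math.exp(x)
        return A * math.exp(alpha * x) + B * math.exp(-alpha * x)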

A Appendix
Proof of Lemma 1. It suffices to solve for an integer b < 0; the rest is
implied by Proposition 3. The inequalities (6)-(7) amount to

    1 - e^x ≥ (β/2)\big[(1 - e^{x+1}) + (1 - e^{x-1})\big], \qquad ∀ x ≤ b - 1,
    1 - e^b ≥ (β/2)\big[e^α (1 - e^b) + (1 - e^{b-1})\big], \qquad x = b,
    (1 - e^x)^+ < e^{α(x-b)} (1 - e^b), \qquad ∀ x ≥ b + 1.

The last inequality is equivalent to

    e^x + e^{α(x-b)} (1 - e^b) > 1, \qquad ∀ x ≥ b + 1.

Since b < 0, the LHS is a convex function of x that equals 1 at x = b.
Thus the above inequality holds for all x ≥ b + 1 if and only if it holds
at x = b + 1, i.e.,

    e^{b+1} + e^α (1 - e^b) > 1,

or

    b > \log \frac{1 - e^α}{e - e^α} = B - 1.
The inequality corresponding to x = b gives

    e^b \big(1 - β e^α/2 - β e^{-1}/2\big) ≤ 1 - β e^α/2 - β/2.

But observing that e^α + e^{-α} = 2/β, we have

    b ≤ \log \frac{e^{-α} - 1}{e^{-α} - e^{-1}} = B.

These uniquely determine b as the integer in the interval (B − 1, B].
It remains to verify the inequality for x ≤ b − 1, i.e.,

    1 - β ≥ e^x \big(1 - β e/2 - β e^{-1}/2\big), \qquad ∀ x ≤ b - 1.

But this is trivial, since e + e^{-1} > 2 and

    RHS ≤ e^{b-1} (1 - β) < 1 - β.

We finish the proof.

Proof of Lemma 3. Consider the function

    f(y) := (1 - e^{y-1}) \big[1 - e^{2α(H-y)}\big] + (e^y - 1) \big[e^{-α} - e^{α + 2α(H-y)}\big]

for all y ≤ 0. It can be shown that f is a strictly increasing function
with f(-∞) < 0 < f(0) (the details are left to interested students). Thus
there exists a unique solution D < 0 such that f(D) = 0. The monotonicity
of f implies that f(y) ≥ 0 for y ≥ D, and f(y) < 0 for y < D.
Let b be the unique integer such that b ∈ (D − 1, D]. We want to show that
(V, b) defines a solution to the variational inequality. Clearly V is
bounded. It remains to verify inequalities (13)-(14). We first show
inequality (13). Since A > 0 and α < 0, the function x ↦ A e^{2αx} + B is
strictly decreasing, so for b < x < H,

    V(x) = e^{-αx} \big(A e^{2αx} + B\big) > e^{-αx} \big(A e^{2αH} + B\big) = e^{-α(x-H)} V(H) = 0.

Thus, we only need to show V(x) > 1 − e^x for all b < x < H. However,

    \big(A e^{αx} + B e^{-αx}\big)'' = α^2 \big(A e^{αx} + B e^{-αx}\big) = α^2 V(x) > 0

on the interval (b, H). Therefore, V is convex there, and so is V(x) + e^x.
Since V(b) + e^b = 1, inequality (13) amounts to V(b+1) > 1 − e^{b+1}, or
equivalently (after some algebra) f(b+1) > 0, or b > D − 1. But by
definition this inequality holds, and so does (13). Next we show
inequality (14). For x < b, it is equivalent to

    1 - β ≥ e^x \big(1 - β e/2 - β e^{-1}/2\big).

This is trivial for x < b < 0. It remains to show, for x = b,

    1 - e^b ≥ (β/2) \big[ A e^{α(b+1)} + B e^{-α(b+1)} + 1 - e^{b-1} \big].

This, after some algebra, is equivalent to f(b) ≤ 0, or b ≤ D. We complete
the proof.

References

[1] J. Neveu. Mathematical Foundations of the Calculus of Probability.
Holden-Day, San Francisco, 1965.

[2] A. N. Shiryaev. Optimal Stopping Rules. Springer, Berlin, 1978.

[3] J. L. Snell. Applications of martingale system theorems. Trans. Amer.
Math. Soc., 73:293-312, 1952.
