
Optimal Stopping Time

Jae Yun JUN KIM*

Reference: Neil Walton’s lecture notes

1 Optimal stopping problem


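An optimal stopping problem is a Markov decision process with two actions:
• a = 0: stop

• a = 1: continue

and with two types of cost:

\[
c(x, a) =
\begin{cases}
k(x), & \text{for } a = 0,\\
c(x), & \text{for } a = 1,
\end{cases}
\tag{1}
\]

where $k(x)$ is the cost of stopping in state $x$ and $c(x)$ is the cost of continuing from state $x$.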

2 Bellman equation for optimal stopping problem


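Assuming that time is discrete and finite, the Bellman equation for the optimal stopping
problem is

\[
C_s(x) = \min\bigl\{ k(x),\; c(x) + \mathbb{E}_x[C_{s-1}(\hat{X})] \bigr\}
\tag{2}
\]

with $c(x) = 0$ for $s = 0$, and $C_0(x) = k(x)$. Here $s$ is the number of steps remaining, and $\hat{X}$ is the random next state given the current state $x$.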
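The recursion in (2) can be evaluated by backward induction on s. The following is a minimal sketch, assuming a finite state space indexed 0, 1, ..., n-1 and treating s as the number of remaining steps (an interpretation suggested by C_0(x) = k(x)). The function name `stopping_costs` and the 3-state chain are hypothetical and serve only as an illustration.

```python
def stopping_costs(k, c, P, horizon):
    """Return [C_0, C_1, ..., C_horizon], each a list indexed by state.

    k[x]    : cost of stopping in state x
    c[x]    : cost of continuing from state x
    P[x][y] : probability of moving from state x to state y
    horizon : largest number of remaining steps s to evaluate
    """
    n = len(k)
    costs = [list(k)]  # C_0(x) = k(x)
    for _ in range(horizon):
        prev = costs[-1]
        cur = []
        for x in range(n):
            # E[C_{s-1}(X_hat)] given the current state x
            expected_next = sum(P[x][y] * prev[y] for y in range(n))
            cur.append(min(k[x], c[x] + expected_next))
        costs.append(cur)
    return costs


# Hypothetical chain 0 -> 1 -> 2, where state 2 is absorbing.
k = [4.0, 2.0, 0.0]
c = [1.0, 1.0, 1.0]
P = [[0.0, 1.0, 0.0],
     [0.0, 0.0, 1.0],
     [0.0, 0.0, 1.0]]

C = stopping_costs(k, c, P, horizon=3)
for s, row in enumerate(C):
    print(s, row)
```

With these made-up numbers the costs stop changing after two steps, because the chain reaches the absorbing state 2 (where stopping is free) in at most two moves.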


Note: In this problem, we neither control the process nor influence the environment.
We only observe the process and decide the right moment to stop.

3 One step look ahead (OSLA) rule


In the one step look ahead (OSLA) rule, we stop whenever $x \in S$, where

\[
S = \bigl\{ x : k(x) \le c(x) + \mathbb{E}[k(\hat{X})] \bigr\}.
\tag{3}
\]

That is, we stop whenever stopping now costs no more than continuing one step further
and then stopping.
To see this, suppose that we are in state $x$.
The cost of stopping now is $k(x)$.
On the other hand, the cost of continuing now and stopping one step later is
$c(x) + \mathbb{E}[k(\hat{X})]$.
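The set in (3) can be computed directly by comparing the two costs state by state. The sketch below assumes a finite state space indexed 0, 1, ..., n-1. The name `osla_set` and the example chain are hypothetical.

```python
def osla_set(k, c, P):
    """Return the one step look ahead stopping set S of equation (3).

    k[x]    : cost of stopping in state x
    c[x]    : cost of continuing from state x
    P[x][y] : probability of moving from state x to state y
    """
    n = len(k)
    S = set()
    for x in range(n):
        # Cost of continuing once and then stopping: c(x) + E[k(X_hat)]
        look_ahead = c[x] + sum(P[x][y] * k[y] for y in range(n))
        if k[x] <= look_ahead:
            S.add(x)
    return S


# Hypothetical chain 0 -> 1 -> 2, where state 2 is absorbing.
k = [4.0, 2.0, 0.0]
c = [1.0, 1.0, 1.0]
P = [[0.0, 1.0, 0.0],
     [0.0, 0.0, 1.0],
     [0.0, 0.0, 1.0]]

print(osla_set(k, c, P))  # {2}
```

In this example only state 2 belongs to S: in states 0 and 1, moving one step and then stopping is cheaper than stopping immediately.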
* ECE Paris Graduate School of Engineering, 37 quai de Grenelle 75015 Paris, France; [email protected]

4 Closed stopping set
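We say that the set $S \subset \mathcal{X}$ (where $\mathcal{X}$ is the state space) is closed if, once inside the stopping
set, you cannot leave it. That is,

\[
P_{x,y} = 0, \quad \forall x \in S,\; y \notin S.
\tag{4}
\]

Suppose that $S$ is given by the OSLA rule and that $S$ is closed. Then

\[
C_{s-1}(x) = k(x) \ \text{for all } x \in S \implies C_s(x) = k(x) \ \text{for all } x \in S.
\tag{5}
\]

Indeed, if $S$ is a closed stopping set and the current state is $x \in S$, then the next state
$\hat{X}$ also satisfies $\hat{X} \in S$, so $C_{s-1}(\hat{X}) = k(\hat{X})$.
Then, using the Bellman equation, we have

\[
C_s(x) = \min\bigl\{ k(x),\; c(x) + \mathbb{E}[C_{s-1}(\hat{X})] \bigr\}
       = \min\bigl\{ k(x),\; c(x) + \mathbb{E}[k(\hat{X})] \bigr\}
       = k(x),
\tag{6}
\]

where the last equality holds because $x \in S$ means $k(x) \le c(x) + \mathbb{E}[k(\hat{X})]$.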
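Condition (4) is easy to test numerically for a finite chain. The following sketch assumes states indexed 0, 1, ..., n-1 and a transition matrix given as a list of rows. The name `is_closed` and the example chain are hypothetical.

```python
def is_closed(S, P):
    """Check condition (4): P[x][y] == 0 for every x in S and y not in S."""
    n = len(P)
    return all(P[x][y] == 0 for x in S for y in range(n) if y not in S)


# Hypothetical chain 0 -> 1 -> 2, where state 2 is absorbing.
P = [[0.0, 1.0, 0.0],
     [0.0, 0.0, 1.0],
     [0.0, 0.0, 1.0]]

print(is_closed({2}, P))  # True: state 2 never leaves {2}
print(is_closed({1}, P))  # False: state 1 moves to state 2, outside {1}
```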

5 Optimal policy for the optimal stopping problem


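For the finite-time stopping problem, suppose that the set $S$ given by the one step look ahead rule is closed.
Then the one step look ahead rule is an optimal policy.
Proof (by induction on $s$):
We know that $C_0(x) = k(x)$, $\forall x$.
Hence, by the implication (5) established for closed stopping sets,

\[
C_s(x) = k(x), \quad \forall x \in S \text{ and } s \in \mathbb{Z}_+.
\tag{7}
\]

So it is always optimal to stop for $x \in S$.

Further, if $x \notin S$, then $k(x) > c(x) + \mathbb{E}[k(\hat{X})]$. Since $C_{s-1}(y) \le k(y)$ for every state $y$, we have $c(x) + \mathbb{E}[C_{s-1}(\hat{X})] \le c(x) + \mathbb{E}[k(\hat{X})] < k(x)$ for every $s \ge 1$, so continuing is strictly cheaper than stopping.
In conclusion, $\forall x \in S$, it is optimal to stop; and, $\forall x \notin S$, it is optimal to continue.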

6 Example: Finding parking space


You look for a parking space on a street. Each space is free with probability $p = 1 - q$. You
cannot tell whether a space is free until you reach it, and you cannot go backward. Once at a space, you
must decide whether to stop or to continue. From position $s$ (i.e., $s$ spaces from your destination), the
cost of stopping is $s$. The cost of passing your destination without parking is $D$. Construct
the strategy that selects the optimal parking space for the destination.
Answer
Let the state at position $s$ be defined as

\[
x_s = \mathbb{I}[\text{space } s \text{ is free}].
\tag{8}
\]

Now, using the Bellman equation, we have


\[
\begin{aligned}
C_s(1) &= \min\bigl\{ s,\; p\,C_{s-1}(1) + q\,C_{s-1}(0) \bigr\},\\
C_s(0) &= p\,C_{s-1}(1) + q\,C_{s-1}(0).
\end{aligned}
\tag{9}
\]
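The two recursions in (9) can be iterated from the destination outwards. The sketch below is only an illustration: the boundary values C_0(1) = 0 and C_0(0) = D are an assumption (parking at the destination costs 0, passing it costs D), and the name `parking_costs` and the values p = 0.5, D = 10 are hypothetical.

```python
def parking_costs(p, D, n_spaces):
    """Iterate the recursions in (9) for s = 1, ..., n_spaces.

    Returns two lists indexed by position s:
      free[s]  = C_s(1), expected cost when space s is free
      taken[s] = C_s(0), expected cost when space s is occupied
    Assumed boundary values: C_0(1) = 0 and C_0(0) = D.
    """
    q = 1.0 - p
    free = [0.0]
    taken = [float(D)]
    for s in range(1, n_spaces + 1):
        # Expected cost of driving on to position s - 1
        drive_on = p * free[s - 1] + q * taken[s - 1]
        free.append(min(float(s), drive_on))
        taken.append(drive_on)
    return free, taken


free, taken = parking_costs(p=0.5, D=10, n_spaces=5)
for s in range(6):
    decision = "park" if free[s] == float(s) else "drive on"
    print(s, free[s], taken[s], decision)
```

For these hypothetical values, a free space is taken at positions s = 0, 1, 2, while at positions s = 3 and beyond it is cheaper to drive on.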

Let us now consider the following stopping set

\[
S = \{ s : s \le K(s-1) \},
\tag{10}
\]

where $K(s-1)$ is the expected cost of taking the next available space from position $s-1$ onwards.
Let us define

\[
K(s) = p\,s + q\,K(s-1),
\tag{11}
\]

with $K(0) = qD$.
If we solve this difference equation, we have

\[
K(s) = -\frac{q}{p} + s + c\,q^{s+1},
\tag{12}
\]

with $c = D + \frac{1}{p}$.
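As a sanity check, the closed form (12) can be compared numerically with the recursion (11) started from K(0) = qD. This is a small sketch with hypothetical function names and arbitrarily chosen values of p and D.

```python
def K_recursive(s, p, D):
    """Evaluate K(s) from the recursion (11) with K(0) = q * D."""
    q = 1.0 - p
    value = q * D
    for t in range(1, s + 1):
        value = p * t + q * value
    return value


def K_closed_form(s, p, D):
    """Evaluate K(s) from the closed form (12) with c = D + 1/p."""
    q = 1.0 - p
    return -q / p + s + (D + 1.0 / p) * q ** (s + 1)


# Arbitrary illustrative values; any 0 < p < 1 and D >= 0 could be used.
p, D = 0.3, 7.0
max_gap = max(abs(K_recursive(s, p, D) - K_closed_form(s, p, D))
              for s in range(21))
print(max_gap)  # expected to be at the level of floating-point rounding
```

The agreement follows from the structure of (12): s - q/p is a particular solution of (11), c q^{s+1} solves the homogeneous equation, and c is fixed by the boundary value K(0) = qD.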
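Substituting this into the expression of the stopping set, the condition $s \le K(s-1)$ becomes
$s \le s - 1 - \frac{q}{p} + \left(D + \frac{1}{p}\right) q^{s}$, that is,
$\left(D + \frac{1}{p}\right) q^{s} \ge 1 + \frac{q}{p} = \frac{1}{p}$. Hence

\[
S = \{ s : (Dp + 1)\, q^{s} \ge 1 \}.
\tag{13}
\]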

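Hence, the optimal policy is to take the next available space once the condition $(Dp + 1)\,q^{s} \ge 1$
is met. Since $q^{s}$ increases as $s$ decreases, the condition keeps holding at every later position once it holds, so $S$ is closed.
In conclusion, the OSLA rule is optimal.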
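The stopping condition translates into a simple threshold on the position s. The sketch below assumes 0 < p < 1 and D ≥ 0. The function names, the search cap `max_s`, and the example values are hypothetical.

```python
def should_park(s, p, D):
    """Stopping condition: (D*p + 1) * q**s >= 1, with q = 1 - p."""
    q = 1.0 - p
    return (D * p + 1.0) * q ** s >= 1.0


def stopping_threshold(p, D, max_s=10_000):
    """Largest position s (up to max_s) at which a free space is taken.

    The condition always holds at s = 0 because D*p + 1 >= 1, and q**s
    shrinks as s grows, so the positions satisfying it form {0, ..., s*}.
    """
    s = 0
    while s + 1 <= max_s and should_park(s + 1, p, D):
        s += 1
    return s


# Hypothetical values: each space is free with probability 0.5 and
# passing the destination costs 10.
print(stopping_threshold(0.5, 10))  # 2
```

With these numbers (Dp + 1) q^s equals 1.5 at s = 2 and 0.75 at s = 3, so the driver ignores free spaces until two spaces from the destination and then takes the first free one.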
