Asymptotic Properties in Dynamic Programming

Article in International Journal of Game Theory · February 1993

DOI: 10.1007/BF01245566 · Source: RePEc


International Journal of Game Theory (1993) 22:1-11

DOV MONDERER
Department of Economics, Queen's University, Kingston, Canada, K7L 3N6, and
Faculty of Industrial Engineering and Management, the Technion, Haifa, Israel

SYLVAIN SORIN
DMJ (URA CNRS 762), Ecole Normale Supérieure, 45 rue d'Ulm, 75230 Paris, France

Abstract: In the framework of dynamic programming we provide two results:

- An example where uniform convergence of the T-stage value does not imply equality
of the limit and the lower infinite value.
- Generalized Tauberian theorems, that relate uniform convergence of the T-stage value
to uniform convergence of values associated with a general distribution on stages.

1 Introduction

Let $S$ be a state space. For each $s \in S$ let $\emptyset \neq F(s) \subseteq S$, and let $f$ be a real bounded function on $S$. Consider the dynamic programming problem where the decision maker on day $t$, at state $s_t$, has to choose a new state $s_{t+1} \in F(s_t)$, and receives a payoff $f(s_t)$. A play at $s \in S$ is a sequence $(s_t)_{t=0}^{\infty}$ with $s_0 = s$ and $s_{t+1} \in F(s_t)$ for all $t \ge 0$. One traditionally considers the $\lambda$-discounted value $V_\lambda(s)$:

$$V_\lambda(s) = \sup_{(s_t)_{t=0}^{\infty}} (1-\lambda) \sum_{t=0}^{\infty} \lambda^t f(s_t),$$

or the $T$-stage value $V_T(s)$:

$$V_T(s) = \sup_{(s_t)_{t=0}^{T}} \frac{1}{T+1} \sum_{t=0}^{T} f(s_t),$$

where in both cases the supremum ranges over all plays at $s$.

One can also consider other evaluations: Let $\theta = (\theta(t))_{t=0}^{\infty}$ be a probability on the set of non-negative integers and define:

$$V_\theta(s) = \sup_{(s_t)_{t=0}^{\infty}} \sum_{t=0}^{\infty} \theta(t) f(s_t).$$

Lehrer and Sorin (1992) proved that if either one of the limits $\lim_{\lambda \to 1} V_\lambda(s)$ or $\lim_{T \to \infty} V_T(s)$ exists uniformly in $s \in S$, then the other limit also exists uniformly, and the limit functions coincide.
In Section 3 we give sufficient conditions on linearly ordered families $(\Theta, >)$ of probabilities on the integers to get analogous results for $(V_\theta)_{\theta \in \Theta}$ and $(V_T)_{T \ge 0}$.
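For a finite state space both values can be computed directly. The following Python sketch is an illustration under stated assumptions, not part of the original paper: the dictionary-based graph encoding, the function names `t_stage_value` and `discounted_value`, the iteration count, and the two-state example are all hypothetical. It uses backward induction for $V_T$ and iterates the recursion $V_\lambda(s) = (1-\lambda) f(s) + \lambda \max_{s' \in F(s)} V_\lambda(s')$ for $V_\lambda$.

```python
from typing import Dict, List

Graph = Dict[str, List[str]]    # F: state -> non-empty list of successor states
Payoff = Dict[str, float]       # f: state -> bounded payoff


def t_stage_value(F: Graph, f: Payoff, T: int) -> Dict[str, float]:
    """V_T(s) for every state s, by backward induction on the total payoff."""
    # best[s] = largest total payoff over the stages considered so far
    best = {s: f[s] for s in F}
    for _ in range(T):
        best = {s: f[s] + max(best[n] for n in F[s]) for s in F}
    return {s: best[s] / (T + 1) for s in F}


def discounted_value(F: Graph, f: Payoff, lam: float, iters: int = 2000) -> Dict[str, float]:
    """Approximate V_lambda(s) by iterating the one-step recursion."""
    v = {s: 0.0 for s in F}
    for _ in range(iters):
        v = {s: (1 - lam) * f[s] + lam * max(v[n] for n in F[s]) for s in F}
    return v


# Hypothetical example: from "a" one may stay (payoff 0) or move to "b" (payoff 1 forever).
F_example: Graph = {"a": ["a", "b"], "b": ["b"]}
f_example: Payoff = {"a": 0.0, "b": 1.0}

v_T = t_stage_value(F_example, f_example, T=9)            # V_9(a) = 9/10
v_lam = discounted_value(F_example, f_example, lam=0.9)   # V_0.9(a) is close to 0.9
```

In this example $V_T(a) = T/(T+1)$ and $V_\lambda(a) = \lambda$, so both values tend to 1 at state `a`, in line with the Lehrer and Sorin (1992) equivalence quoted above.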

This research was supported by the fund for the promotion of research in the Technion.

0020-7276/93/1/1-11 $2.50 © 1993 Physica-Verlag, Heidelberg

There are other natural ways of evaluating streams of payoffs in dynamic programming (besides those discussed above):
The lower (long-run average) value,

$$\underline{V}(s) = \sup_{(s_t)_{t=0}^{\infty}} \liminf_{T \to \infty} \frac{1}{T+1} \sum_{t=0}^{T} f(s_t),$$

and the upper (long-run average) value,

$$\overline{V}(s) = \sup_{(s_t)_{t=0}^{\infty}} \limsup_{T \to \infty} \frac{1}{T+1} \sum_{t=0}^{T} f(s_t),$$

where, again, the supremum is taken over all plays at $s$.

Lehrer and Monderer (1989) proved that uniform convergence of $(V_\lambda)_{\lambda \in [0,1)}$ to some $V$ implies $V = \overline{V}$, and showed in an example that it does not imply the equality $V = \underline{V}$. If one allows the decision maker to use mixed strategies, i.e., to choose a play at random, and then defines the payoff of each stage as the expectation, one obtains new evaluations. It is clear that the evaluations $V_\lambda$, $V_T$, $V_\theta$, and $\overline{V}$ will not change by allowing mixed strategies, but $\underline{V}$ will change in general. Let

$$\underline{U}(s) = \sup_{\mu \in \Delta} \liminf_{T \to \infty} E_\mu \left( \frac{1}{T+1} \sum_{t=0}^{T} f(s_t) \right),$$

where $\Delta$ is the set of all probabilities on the set of plays, endowed with the cylinder $\sigma$-field, and $E_\mu$ stands for the expectation operator with respect to $\mu$.
Obviously $\underline{U} \ge \underline{V}$. As for the relationship between $\underline{U}$ and the limit $V$ of the discounted value functions, Mertens and Neyman (1981) provided sufficient conditions, stronger than the uniform convergence of $(V_\lambda)_{\lambda \in [0,1)}$ (and satisfied in every finite setup), that ensure the equality $\underline{U} = V$ (even for stochastic games). In Section 2 we show that uniform convergence alone is not sufficient by providing a counter example. See Mertens (1987) for related conjectures, hints, and comments. Other types of necessary conditions, for specific types of dynamic programming problems, are discussed in Dutta (1991).

2 The Counter Example

Every rooted directed tree without terminal nodes naturally defines a dynamic programming problem when we attach payoffs to the nodes. Our dynamic programming problem will be defined as a tree, constructed inductively in the spirit of Lehrer and Monderer (1989).
Given two decreasing vanishing sequences $(a_n)_{n=1}^{\infty}$ and $(\varepsilon_n)_{n=1}^{\infty}$, define for every real number $x$ the tree $T(x)$ as follows:
Every node of $T(x)$ except for the root has outdegree one, and the root itself has countably many branches. On the $n$-th branch of the root the payoff, $g(s)$, is 0 until node $[\varepsilon_n n] + 1$, it then equals $x - a_n$ until node $n$, and from then on it equals 0.
Define a valuation $\varphi$ at each node $s$ different from the root as follows: $\varphi(s) = x - a_n$ for every $s$ in the $n$-th branch appearing before the $n$-th node in this branch, and $\varphi(s) = 0$ for every node thereafter. Set $T_1 = T(1)$. $T_2$ is obtained from $T_1$ by attaching the tree $T(\varphi(s))$ to each node $s$ of $T_1$ different from the root, and keeping the old payoff of $s$ (i.e., its payoff in $T_1$). One can continue naturally and define inductively the trees $T_3, T_4, \ldots$ and finally define $T = \bigcup_{n=1}^{\infty} T_n$. Denote the root of $T$ by $s_0$, and the payoff function by $g$.
Note that although $g$ is bounded from above by 1, it is not necessarily bounded from below. Therefore we replace $g$ with a new bounded payoff function $f$, defined by: $f(s) = \max(g(s), 0)$ for every node $s$ of $T$.
It is clear that $\lim_{T \to \infty} V_T(s) = \varphi(s)$ uniformly on all nodes $s$ of $T$. In particular $V(s_0) = 1$.
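To make the shape of a single branch concrete, here is a small Python sketch. It is only an illustration: the function name `branch_payoffs`, the sample values of $n$, $x$, $a_n$ and $\varepsilon_n$, and the reading of "until node" as inclusive bounds are assumptions, not taken from the paper.

```python
import math
from typing import List


def branch_payoffs(n: int, x: float, a_n: float, eps_n: float, length: int) -> List[float]:
    """Payoffs f = max(g, 0) at nodes 1..length of the n-th branch of T(x).

    Assumed reading: g = 0 on nodes 1..[eps_n * n], g = x - a_n on nodes
    [eps_n * n] + 1..n, and g = 0 after node n.
    """
    first_paying = math.floor(eps_n * n) + 1
    value = max(x - a_n, 0.0)  # truncation f = max(g, 0)
    return [value if first_paying <= k <= n else 0.0 for k in range(1, length + 1)]


# Hypothetical numbers: n = 100, x = 1, a_n = 0.25, eps_n = 0.25.
payoffs = branch_payoffs(n=100, x=1.0, a_n=0.25, eps_n=0.25, length=100)
average_at_node_n = sum(payoffs) / len(payoffs)  # (x - a_n) * (1 - eps_n) = 0.5625
```

Under this reading the average payoff over the first $n$ nodes of the branch is $(x - a_n)(1 - \varepsilon_n)$ up to rounding of $[\varepsilon_n n]$, which approaches $x$ as $a_n$ and $\varepsilon_n$ vanish. This is consistent with the convergence of $V_T$ to $\varphi$ stated above.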
We will show that for a specific choice of the sequences $(\varepsilon_n)_{n=1}^{\infty}$ and $(a_n)_{n=1}^{\infty}$, $\underline{U}(s_0) = 0$.
Let then $\alpha > 0$ and let us prove that $\underline{U}(s_0) < \alpha$. Assume, by way of contradiction, that there exists $\mu \in \Delta$ such that for some integer $M$, $T \ge M$ implies

$$E_\mu \left( \frac{1}{T+1} \sum_{t=0}^{T} f(s_t) \right) \ge \alpha. \tag{2.1}$$

We remark that we can assume that all plays in the support of $\mu$ belong to the following set $\Omega$:
If a play in $\Omega$ is on the $n$-th branch of some $T(\cdot)$, it remains in this branch until exactly node $n$. In fact, if some play leaves the branch before node $[\varepsilon_n n] + 1$, the decision maker will increase his payoff by leaving the branch at its root, and if a play leaves the $n$-th branch after node $n$, it is better for the decision maker to leave it at precisely node $n$. In particular, a play in $\Omega$ never remains in a branch of some $T(\cdot)$ and is thus characterized by a sequence $(m_i)_{i=1}^{\infty}$ of integers inducing the path: branch $m_1$ of $T(1)$ until the $m_1$-th node, $s_{m_1}$, branch $m_2$ of $T(\varphi(s_{m_1}))$ until node $m_2$ of this branch (with the valuation $1 - a_{m_1} - a_{m_2}$), etc. Finally, for every play in $\Omega$,

$$\sum_{i \ge 1} a_{m_i} \le 1. \tag{2.2}$$

Having done the above reduction, we can now replace any strictly positive payoff on any play in $\Omega$ by 1.
The basic idea of the proof is to choose a sequence $(a_n)_{n=1}^{\infty}$ converging very slowly to zero, implying by (2.2) that for every play in $\Omega$, for a set of integers $i$ with positive density, $\varepsilon_{m_i} m_i$ is much larger than $\sum_{k<i} m_k$. Hence, every play in $\Omega$ has "many" large blocks of zeros.
More precisely, let $M_1 = 2$, and define inductively $n_i = \sum_{k \le i} M_k$ and $M_{i+1} = n_i^3$ for every $i \ge 1$. Define $a_{n_i} = \frac{1}{i}$ for all $i$ and extend $a$ by monotonicity to all other integers. Choose $\varepsilon_n = \frac{1}{\sqrt{n}}$. We say that a play $w$ is good in the $i$-th block $I_i = [n_{i-1}, n_i]$ if a sequence of ones starts in this block. That is, if $w$ is determined by $m_1, m_2, \ldots$, there exists $m_k$ adapted to $I_i$ in the sense that

$$\sum_{j<k} m_j + \varepsilon_{m_k} m_k \in I_i. \tag{2.3}$$

Set $S_n(w) = \frac{1}{n} \sum_{t=1}^{n} w_t$. We claim that there exists $i_0$ such that for every $i > i_0$ and for every $w \in \Omega$, if $S_{n_i}(w) \ge \alpha$, then $w$ is good in the $i$-th block. Otherwise, denote by $k$ the largest integer such that the $k$-th sequence of ones in $w$ starts before the $i$-th block. Then $\varepsilon_{m_k} m_k \le n_{i-1}$, and hence $m_k \le n_{i-1}^2 = \frac{M_i}{n_{i-1}}$. This implies that this sequence of ones ends very early in the $i$-th block, and that $w_t = 0$ for $\left(1 - \frac{1}{n_{i-1}}\right) M_i$ $t$'s in this block. As $\frac{n_{i-1}}{M_i} \to 0$ as $i \to \infty$, $S_{n_i}(w)$ must be very small, contradicting our assumption.
Define $J_i(w)$ to be one if $S_{n_i}(w) \ge \alpha$ and 0 otherwise. If $J_i(w) = 1$, one can, by the above claim, define $k(w, i)$ as the smallest $k$ that satisfies (2.3). Denote $\theta_i(w) = a_{m_{k(w,i)}}$ if $J_i(w) = 1$ and 0 otherwise.
Using the monotone convergence theorem and (2.2) we have:

$$1 \ge E_\mu \left( \sum_{i \ge i_0} J_i(w) \theta_i(w) \right) = \sum_{i \ge i_0} E_\mu \big( J_i(w) \theta_i(w) \big) \ge \sum_{i \ge i_0} E_\mu \big( J_i(w) \big) a_{n_i}.$$

Since (2.1) at $n_i$ implies $E_\mu(J_i(w)) \ge \alpha$, we obtain, recalling that $a_{n_i} = \frac{1}{i}$,

$$1 \ge \sum_{i \ge i_0} \frac{1}{i} \alpha,$$

a contradiction. □
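The contradiction rests on the divergence of the harmonic series: a tail sum of terms proportional to $1/i$ cannot stay bounded by 1. The Python sketch below illustrates this numerically. It assumes the final bound has the form $\sum_{i \ge i_0} \alpha / i \le 1$; the function name `terms_until_exceeds_one` and the sample values are illustrative only.

```python
def terms_until_exceeds_one(alpha: float, i0: int) -> int:
    """Number of terms of sum_{i >= i0} alpha / i needed for the partial sum to exceed 1."""
    if alpha <= 0 or i0 < 1:
        raise ValueError("alpha must be positive and i0 must be at least 1")
    total, i = 0.0, i0
    while total <= 1.0:
        total += alpha / i
        i += 1
    return i - i0


count = terms_until_exceeds_one(alpha=0.5, i0=1)  # 0.5 * (1 + 1/2 + 1/3 + 1/4) > 1, so 4 terms
```

The loop terminates for every positive $\alpha$ and every starting index, although the number of terms grows roughly exponentially in $1/\alpha$.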

3 Uniform Convergence

We first establish some notation. Let $D$ denote the set of all probability distributions $\theta$ on the set $N = \{0, 1, 2, \ldots\}$ of non-negative integers that are non-increasing. That is,

$$\theta(t+1) \le \theta(t) \text{ for all } t \in N. \tag{A}$$

For real numbers $\alpha \le \beta$ and for a distribution $\theta$,

$$\theta[\alpha, \beta] = \sum_{\alpha \le t \le \beta} \theta(t).$$

For $\theta \in D$, define $\hat{\theta}$ on $N$ as follows:

$$\hat{\theta}(t) = (\theta(t) - \theta(t+1))(t+1) \text{ for all } t \in N. \tag{3.1}$$

Note that

$$\sum_{t=0}^{T} \hat{\theta}(t) = \sum_{t=0}^{T} \theta(t) - (T+1)\theta(T+1) \text{ for all } T \ge 0. \tag{3.2}$$

Because of (A), $\lim_{t \to \infty} t\theta(t) = 0$, and therefore $\hat{\theta}$ is a probability distribution on $N$.
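As a numerical check of (3.1) and (3.2), the Python sketch below uses the geometric distribution $\theta_\lambda(t) = (1-\lambda)\lambda^t$, which is non-increasing. The function names, the value $\lambda = 0.9$, and the truncation horizon are choices made for this illustration only.

```python
from typing import Callable


def theta_geometric(lam: float) -> Callable[[int], float]:
    """Geometric distribution theta(t) = (1 - lam) * lam**t on {0, 1, 2, ...}."""
    return lambda t: (1 - lam) * lam ** t


def theta_hat(theta: Callable[[int], float], t: int) -> float:
    """Transformed weight from (3.1): (theta(t) - theta(t + 1)) * (t + 1)."""
    return (theta(t) - theta(t + 1)) * (t + 1)


theta = theta_geometric(0.9)
T = 50

# Left- and right-hand sides of (3.2) at T = 50.
lhs = sum(theta_hat(theta, t) for t in range(T + 1))
rhs = sum(theta(t) for t in range(T + 1)) - (T + 1) * theta(T + 1)

# Total mass of the transformed weights, truncated at a long horizon.
total_mass = sum(theta_hat(theta, t) for t in range(5000))
```

For the geometric family the transformed weights are $(1-\lambda)^2 (t+1) \lambda^t$, whose sum over all $t$ is 1, so `total_mass` should be numerically indistinguishable from 1.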
Let $a = (a_t)_{t=0}^{\infty}$ be a bounded sequence. For every $T \ge 0$, denote

$$S_T(a) = \frac{1}{T+1} \sum_{t=0}^{T} a_t,$$

and denote $S(a) = (S_t(a))_{t=0}^{\infty}$. For every probability $\theta$, set

$$S_\theta(a) = \sum_{t=0}^{\infty} \theta(t) a_t.$$

Observe that by (3.1), similarly to the way (3.2) was obtained, we have $S_\theta(a) = S_{\hat{\theta}}(S(a))$ for all sequences $a$ and probabilities $\theta \in D$, that is,

$$\sum_{t=0}^{\infty} \theta(t) a_t = \sum_{t=0}^{\infty} \hat{\theta}(t) S_t(a). \tag{3.3}$$
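Identity (3.3) is a summation by parts, and it can be checked numerically for distributions with finite support. In the Python sketch below (an illustration, with hypothetical helper names and sample data), the uniform distribution on $\{0, \ldots, n\}$ is transformed by (3.1) into a point mass at $n$, so the corresponding evaluation is exactly the average $S_n(a)$.

```python
from typing import List


def running_averages(a: List[float]) -> List[float]:
    """S_t(a) = (a_0 + ... + a_t) / (t + 1) for t = 0, ..., len(a) - 1."""
    averages, total = [], 0.0
    for t, value in enumerate(a):
        total += value
        averages.append(total / (t + 1))
    return averages


def transform(theta: List[float]) -> List[float]:
    """Apply (3.1) to a finitely supported theta (zero beyond the end of the list)."""
    padded = list(theta) + [0.0]
    return [(padded[t] - padded[t + 1]) * (t + 1) for t in range(len(theta))]


def weighted_sum(weights: List[float], values: List[float]) -> float:
    """Sum of weights[t] * values[t] over the common index range."""
    return sum(w * v for w, v in zip(weights, values))


a = [1.0, 0.0, 0.0, 1.0]

uniform = [0.25, 0.25, 0.25, 0.25]   # uniform on {0, 1, 2, 3}
decreasing = [0.4, 0.3, 0.2, 0.1]    # another non-increasing distribution

for theta in (uniform, decreasing):
    left = weighted_sum(theta, a)                                 # left side of (3.3)
    right = weighted_sum(transform(theta), running_averages(a))  # right side of (3.3)
    assert abs(left - right) < 1e-12
```

The uniform case shows why the $T$-stage value fits the same framework as the other evaluations: its transformed distribution puts all its mass on the last stage.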

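We consider linearly ordered families $(\Theta, >)$, where $\Theta \subseteq D$, and "$>$" is a linear (complete) order on $\Theta$ satisfying:

$$\forall \varepsilon > 0,\ \forall N \ge 0,\ \exists \theta_0 \in \Theta \text{ such that } \forall \theta > \theta_0,\ \sum_{t=0}^{N} \theta(t) < \varepsilon, \tag{B}$$

which is obviously equivalent to:

$$\forall \varepsilon > 0,\ \exists \theta_0 \in \Theta \text{ such that } \forall \theta > \theta_0,\ \theta(0) < \varepsilon. \tag{B*}$$

Note that Condition (B) implies that for every $\theta \in \Theta$, there exists $\theta' \in \Theta$ with $\theta < \theta'$. Therefore, the notions of lim, lim inf, lim sup, etc. are naturally defined for real-valued functions on $\Theta$. An increasing sequence $(\theta_n)_{n=0}^{\infty}$ in $\Theta$ is increasing to $\infty$ if for every $\theta \in \Theta$ there exists an integer $N$ such that $\theta_n > \theta$ for all $n \ge N$. For the equivalence results we will need the next properties:

(C) $\exists \varepsilon_0 > 0$ and $\varphi : (0, \varepsilon_0) \to (0, 1)$ such that $\forall \varepsilon < \varepsilon_0$, $\exists J(\varepsilon)$ and a sequence $(\theta_{n,\varepsilon})_{n=J(\varepsilon)}^{\infty}$ that increases to $\infty$ and satisfies:

$$\hat{\theta}_{n,\varepsilon}[(1-\varepsilon)n, n] > \varphi(\varepsilon) \text{ for all } n \ge J(\varepsilon).$$

(D) There exists a sequence $(\theta_n)_{n=0}^{\infty}$ that increases to $\infty$, and $\exists \varepsilon_0 > 0$ and $\psi : (0, \varepsilon_0) \to (0, 1)$ such that $\forall \varepsilon < \varepsilon_0$, $\exists I(\varepsilon)$,

$$\hat{\theta}_n[\psi(\varepsilon)n, n] > 1 - \varepsilon \text{ for all } n \ge I(\varepsilon).$$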

3.1 Preliminary Results

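We will assume without loss of generality that the payoff function in our dynamic programming problem satisfies $0 \le f \le 1$.

Lemma 3.1. $\forall \varepsilon > 0$, $\forall N$, $\exists \theta_0$ such that $\forall \theta > \theta_0$, $\forall s_0 \in S$, $\exists n \ge N$ satisfying $V_n(s_0) \ge V_\theta(s_0) - \varepsilon$.

Proof. By condition (B) and by (3.2), there exists $\theta_0$ such that $\sum_{t=0}^{N} \hat{\theta}(t) \le \frac{\varepsilon}{2}$ for all $\theta > \theta_0$. Let $\theta > \theta_0$, and let $s_0 \in S$. Let $s = (s_t)_{t=0}^{\infty}$ be an $\frac{\varepsilon}{2}$-optimal play for $\theta$ in $s_0$. Then by (3.3),

$$\sum_{t=N+1}^{\infty} \hat{\theta}(t) S_t(f(s)) \ge V_\theta(s_0) - \varepsilon,$$

where $f(s) = (f(s_t))_{t=0}^{\infty}$.

As $\sum_{t \ge N+1} \hat{\theta}(t) \le 1$, the above inequality implies that a convex combination of $\{S_t(f(s)) \mid t \ge N+1\}$ is greater than or equal to $V_\theta(s_0) - \varepsilon$. Therefore there exists $t \ge N+1$ with $S_t(f(s)) \ge V_\theta(s_0) - \varepsilon$, implying $V_t(s_0) \ge V_\theta(s_0) - \varepsilon$. □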

Corollary 3.2.

$$\limsup V_n \ge \limsup V_\theta.$$

Lemma 3.3. $\limsup V_\theta$ is non-increasing along plays. That is,

$$\limsup V_\theta(s_0) \ge \limsup V_\theta(s_1) \text{ for every } s_1 \in F(s_0).$$

Proof. Note that if $(s_t)_{t=1}^{\infty}$ is $\varepsilon$-optimal in $s_1$ for $\theta$, then $s = (s_t)_{t=0}^{\infty}$ is a play in $s_0$. Hence, it suffices to prove that for every $\varepsilon > 0$, for sufficiently large $\theta$,

$$\sum_{t=0}^{\infty} \theta(t) \big( f(s_{t+1}) - f(s_t) \big) < \varepsilon.$$

By rearranging terms and by (3.3), the last inequality can be proved by showing that

$$\sum_{t=0}^{\infty} \hat{\theta}(t) \frac{f(s_{t+1}) - f(s_0)}{t+1} < \varepsilon,$$

which follows easily from Condition (B). □

Lemma 3.4 (Lehrer and Sorin (1992)). $\forall \varepsilon > 0$, $\forall n > \frac{2}{\varepsilon}$, and $\forall s_0$ there exist a play $s = (s_t)_{t=0}^{\infty}$ and a stage $L$ such that

$$\frac{1}{T+1} \sum_{t=0}^{T} f(s_{L+t}) \ge V_n(s_0) - \varepsilon \text{ for every } 0 \le T \le \frac{\varepsilon}{2} n.$$

3.2 From $V_\theta$ to $V_n$

Proposition 1. Assume $\lim_{\theta \to \infty} V_\theta = V$, uniformly. Then

$$\forall \varepsilon > 0\ \exists N \text{ such that } \forall n \ge N,\ V_n \le V + \varepsilon.$$

Proof. Set $\varepsilon_1 = \frac{\varepsilon}{3}$. By the uniform convergence assumption, there exists $\theta_0$ such that

$$|V_{\theta_0}(s_0) - V(s_0)| < \varepsilon_1 \text{ for all } s_0 \in S. \tag{3.4}$$

Let $M$ be an integer satisfying

$$\sum_{t=0}^{M} \hat{\theta}_0(t) > 1 - \varepsilon_1, \tag{3.5}$$

and let $N$ be an integer satisfying $N > \frac{2M}{\varepsilon_1}$. We now show that $N$ satisfies the assertion of the proposition. Indeed, let $n \ge N$, and let $s_0 \in S$. By Lemma 3.4, there exist a play $s = (s_t)_{t=0}^{\infty}$ and an integer $L$ that satisfy the assertion of Lemma 3.4 for $\varepsilon_1$. By (3.3) and (3.5), this implies $V_{\theta_0}(s_L) \ge V_n(s_0) - 2\varepsilon_1$. Therefore $V(s_L) \ge V_n(s_0) - 3\varepsilon_1$, by (3.4). Hence, by Lemma 3.3, and because $3\varepsilon_1 = \varepsilon$,

$$V(s_0) \ge V_n(s_0) - \varepsilon. \quad □$$

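Proposition 2. Assume $(\Theta, >)$ satisfies Condition (C), and uniform convergence of $(V_\theta)_{\theta \in \Theta}$ to $V$. Then

$$\forall \varepsilon > 0,\ \exists N \text{ such that } \forall n \ge N,\ V_n \ge V - \varepsilon.$$

Proof. Otherwise, there exists $\varepsilon > 0$ such that for every $N$, there exist $n \ge N$ and $s_0 \in S$ with $V_n(s_0) < V(s_0) - \varepsilon$. We now choose a particular integer $N$ as follows: set $\varepsilon_1 = \varepsilon_2 = \frac{\varepsilon}{2}$, and choose $\varepsilon_3, \varepsilon_4, \varepsilon_5$ in a way that will be described later. Choose an integer $K$ satisfying the following 4 properties.

(1) $K$ is large enough such that at every play $s = (s_t)_{t=0}^{\infty}$, $\forall n \ge K$, if $V_n(s_0) < V(s_0) - \varepsilon$, then

$$S_T(f(s)) < V(s_0) - \varepsilon_1 \text{ for all } (1 - \varepsilon_1)n < T < n.$$

(2) Let $J(\varepsilon_2)$ and the sequence $(\theta_{n,\varepsilon_2})_{n=J(\varepsilon_2)}^{\infty}$ satisfy the property stated in Condition (C). Choose $K > J(\varepsilon_2)$. That is,

$$\hat{\theta}_n[(1 - \varepsilon_2)n, n] > \varphi(\varepsilon_2) \text{ for every } n \ge K,$$

where $\theta_n = \theta_{n,\varepsilon_2}$.

(3) As $(\theta_n)_{n=K}^{\infty}$ is increasing to $\infty$, and $V_\theta \to V$, we can choose $K$ large enough such that

$$-\varepsilon_4 < V_{\theta_n} - V < \varepsilon_4 \text{ for all } n \ge K.$$

(4) By Proposition 1, we can choose $K$ large enough such that

$$V_n < V + \varepsilon_3 \text{ for all } n \ge K.$$

Finally, choose $N > K$ satisfying

$$\sum_{t=0}^{K} \hat{\theta}_n(t) < \varepsilon_5 \text{ for all } n \ge N.$$

By our initial assumption there exist $n \ge N$ and $s_0$ with $V_n(s_0) < V(s_0) - \varepsilon$. Let $s = (s_t)_{t=0}^{\infty}$ be any play at $s_0$. Set $a_t = \hat{\theta}_n(t) S_t(f(s))$. Then

$$S_{\theta_n}(f(s)) = \sum_{t=0}^{K} a_t + \sum_{K < t < (1-\varepsilon_2)n} a_t + \sum_{(1-\varepsilon_2)n \le t \le n} a_t + \sum_{t > n} a_t.$$

Therefore, by the way we chose $N$,

$$S_{\theta_n}(f(s)) \le V(s_0) + A,$$

where

$$A = \varepsilon_3 + \varepsilon_5 - \varphi(\varepsilon_2)\varepsilon_1.$$

As the last inequality holds for every play at $s_0$,

$$V_{\theta_n}(s_0) \le V(s_0) + A.$$

Hence, by property (3), satisfied by $K$ and hence by $N$, and recalling that $\varepsilon_1 = \varepsilon_2 = \frac{\varepsilon}{2}$, we have

$$\varphi\left(\frac{\varepsilon}{2}\right)\frac{\varepsilon}{2} < \varepsilon_3 + \varepsilon_4 + \varepsilon_5.$$

Thus we can have a contradiction by choosing $\varepsilon_i$, $i = 3, 4, 5$, to be less than $\frac{1}{3}\varphi(\varepsilon_2)\varepsilon_1$. □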

3.3 From $V_n$ to $V_\theta$

Proposition 3. Assume $\lim_{n \to \infty} V_n = W$ uniformly. Then

$$\forall \varepsilon > 0,\ \exists \theta_0 \text{ such that } \forall \theta > \theta_0,\ V_\theta \le W + \varepsilon.$$

Proof. The proof is an immediate consequence of Lemma 3.1. □

Lemma 3.5 (Lehrer and Sorin (1992)). Assume $\lim_{n \to \infty} V_n = W$ uniformly. Then for every $\varepsilon$ small enough, there exists an integer $N$ such that for every $n \ge N$ and $s_0 \in S$, there is a play $s = (s_t)_{t=0}^{\infty}$ at $s_0$ satisfying:

$$\frac{1}{T+1} \sum_{t=0}^{T} f(s_t) \ge W(s_0) - \varepsilon \text{ for every } \varepsilon n \le T \le (1 - \varepsilon)n.$$

Proposition 4. Assume $(\Theta, >)$ satisfies condition (D), and $\lim_{n \to \infty} V_n = W$ uniformly. Then

$$\forall \varepsilon > 0,\ \exists N \text{ such that } \forall n \ge N,\ V_{\theta_n} \ge W - \varepsilon,$$

where $(\theta_n)_{n=0}^{\infty}$ is defined in Condition (D).

Proof. Let $\varepsilon > 0$. Let $\delta > 0$ satisfy $\frac{\delta}{1-\delta} < \min(\psi(\varepsilon), \varepsilon)$. Then by Lemma 3.5 there exists $N$ such that for every $n \ge N$ and $s_0 \in S$, there is a play $s = (s_t)_{t=0}^{\infty}$ at $s_0$ satisfying:

$$\frac{1}{T+1} \sum_{t=0}^{T} f(s_t) > W(s_0) - \delta \text{ for every } \delta n \le T \le (1 - \delta)n.$$

Without loss of generality we can choose $N \ge I(\varepsilon)$. Note that if $m \ge N$ (assuming that $N$ was chosen large enough), there exists $n \ge N$ with

$$[\psi(\varepsilon)m, m] \subseteq [\delta n, (1 - \delta)n].$$

Hence, $\hat{\theta}_m[\psi(\varepsilon)m, m] \ge 1 - \varepsilon$, and $S_T(f(s)) \ge W(s_0) - \delta \ge W(s_0) - \varepsilon$ for $T \in [\psi(\varepsilon)m, m]$. Therefore,

$$V_{\theta_m}(s_0) \ge W(s_0) - 2\varepsilon \text{ for all } m \ge N \text{ and all } s_0 \in S. \quad □$$

Remark 1.
If the sequence $(\theta_n)_{n=0}^{\infty}$ given in Condition (D) is dense in $(\Theta, >)$ (in the sense that uniform convergence of $(V_{\theta_n})_{n=0}^{\infty}$ implies the uniform convergence of $(V_\theta)_{\theta \in \Theta}$), then under conditions (C) and (D), uniform convergence of $(V_n)_{n=0}^{\infty}$ implies uniform convergence of $(V_\theta)_{\theta \in \Theta}$ to the same limit function. As was proved in Lehrer and Sorin (1992), such is the case when $\Theta = \{\theta_\lambda : \lambda \in [0, 1)\}$, where $\theta_\lambda(t) = (1 - \lambda)\lambda^t$, and "$>$" is the natural order on real numbers.

Remark 2.
Let $(\Theta, >)$ be a linearly ordered set of distributions on $N$ satisfying (B), (C*), and (D*), where (C*) and (D*) are obtained from (C) and (D) respectively by replacing $\hat{\theta}$ with $\theta$ everywhere. Define

$$U_\theta(s_0) = \sup_{(s_t)_{t=0}^{\infty}} \sum_{t=0}^{\infty} \theta(t) S_t(f(s)).$$

It is obvious that our proofs yield the equivalence theorem for this solution concept as well. E.g., for every $0 < \lambda < 1$ define

$$U_\lambda(s_0) = \sup_{(s_t)_{t=0}^{\infty}} (1 - \lambda) \sum_{t=0}^{\infty} \lambda^t S_t(f(s)).$$

Then $(U_\lambda)$ converges uniformly if and only if $(V_n)$ converges uniformly, and both share the same limit function.

References

1. Dutta PK, What Do Discounted Optima Converge to? A Theory of Discount Rate Asymptotics in Economic Models, Journal of Economic Theory 55 (1991), 64-94.
2. Lehrer E and Monderer D, Discounting Versus Averaging in Dynamic Programming, Games and Economic Behavior (to appear) (1989).
3. Lehrer E and Sorin S, A Uniform Tauberian Theorem in Dynamic Programming, Mathematics of Operations Research 17 (1992), 303-307.
4. Mertens J-F, Repeated Games, Proceedings of the International Congress of Mathematicians (Berkeley 1986) (1987), 1528-1577.
5. Mertens J-F and Neyman A, Stochastic Games, International Journal of Game Theory 10, 2 (1981), 53-66.

Received June 1992

Revised version February 1993
