Computational Economics: Session 16: Numerical Dynamic Programming

This document outlines methods for solving dynamic programming problems numerically. It discusses: 1) Discrete-time and continuous-time dynamic programming frameworks for finite-state and continuous-state problems. 2) Methods for solving the Bellman equation including value function iteration, policy function iteration, and Gaussian acceleration methods. 3) The document provides details on setting up and solving the dynamic programming problem for both finite and infinite time horizons.


Computational Economics

Session 16:
Numerical Dynamic Programming
Agenda
Discrete-time dynamic programming.
Continuous-time dynamic programming.
Methods for finite-state problems:
Value function iteration.
Policy function iteration.
Gaussian acceleration methods.
Methods for continuous-state problems:
Discretization.
Parametric approximation methods.
Projection methods.
Discrete-Time Dynamic Programming
The objective is to maximize the expected NPV of payoffs

    E [ \sum_{t=0}^{T} \pi(x_t, u_t, t) + W(x_{T+1}) ]

subject to the law of motion

    \Pr(x_{t+1} \le x \mid x_t, u_t, t) = F(x, x_t, u_t, t),    x_0 given.

Notation:
\pi is the per-period payoff.
W is the terminal payoff.
x_t \in X is the state; X is the set of states.
u_t \in D(x_t, t) is the control; D(x_t, t) is the nonempty set of feasible controls in state x_t at time t.
Discrete-Time Dynamic Programming
The value function V(x_t, t) is the maximum expected NPV of payoffs from time t onward if the state at time t is x_t.
The value function V(x_t, t) satisfies the Bellman equation

    V(x_t, t) = \max_{u_t \in D(x_t, t)} \pi(x_t, u_t, t) + E_t \{ V(x_{t+1}, t+1) \mid x_t, u_t \}

with terminal condition V(x_{T+1}, T+1) = W(x_{T+1}).
The optimal policy function U(x_t, t) satisfies

    U(x_t, t) \in \arg\max_{u_t \in D(x_t, t)} \pi(x_t, u_t, t) + E_t \{ V(x_{t+1}, t+1) \mid x_t, u_t \}.

In the autonomous, discounted, infinite-horizon case \pi(x, u, t) is replaced by

    \beta^t \pi(x, u),

where \beta \in [0, 1) is the discount factor, and neither F(\cdot) nor D(\cdot) depends explicitly on t.
The value function V(x) satisfies the Bellman equation

    V(x) = \max_{u \in D(x)} \pi(x, u) + \beta E \{ V(x^+) \mid x, u \}

and the optimal policy function U(x) satisfies

    U(x) \in \arg\max_{u \in D(x)} \pi(x, u) + \beta E \{ V(x^+) \mid x, u \}.
Continuous-Time Dynamic Programming: Deterministic Case
The state at time t is x(t) \in X \subseteq R^n — continuous states.
The objective is to maximize the NPV of payoffs

    \int_0^T e^{-\rho t} \pi(x, u, t) dt + W(x(T))

subject to the law of motion

    \dot{x} = f(x, u, t),    x(0) = x_0.

The Bellman equation is

    \rho V(x, t) - V_t(x, t) = \max_{u \in D(x, t)} \pi(x, u, t) + \sum_{i=1}^n V_{x_i}(x, t) f_i(x, u, t)

with terminal condition V(x, T) = W(x).
In the autonomous infinite-horizon case the Bellman equation becomes

    \rho V(x) = \max_{u \in D(x)} \pi(x, u) + \sum_{i=1}^n V_{x_i}(x) f_i(x, u).
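To see the autonomous equation in action, here is a small worked example (illustrative, not from the slides): "cake eating" in continuous time, with payoff \pi(x, u) = \ln u and law of motion \dot{x} = -u.

```latex
% Worked example: continuous-time cake eating.
\begin{align*}
  \text{HJB:}    &\quad \rho V(x) = \max_u \{\, \ln u - V'(x)\,u \,\}, \qquad \dot{x} = -u \\
  \text{Guess:}  &\quad V(x) = A + \tfrac{1}{\rho}\ln x \;\Rightarrow\; V'(x) = \tfrac{1}{\rho x} \\
  \text{FOC:}    &\quad \tfrac{1}{u} = V'(x) \;\Rightarrow\; U(x) = \rho x \\
  \text{Verify:} &\quad \rho A + \ln x = \ln(\rho x) - \tfrac{1}{\rho x}\,\rho x
                  = \ln\rho + \ln x - 1 \\
                 &\quad \Rightarrow\; A = (\ln\rho - 1)/\rho .
\end{align*}
```

Both sides agree for all x, so the guess solves the HJB equation and the optimal policy is to consume the fraction \rho of the remaining stock.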
Continuous-Time Dynamic Programming: Stochastic Case
Continuous states. Brownian motion.
The objective is to maximize the expected NPV of payoffs

    E [ \int_0^T e^{-\rho t} \pi(x, u, t) dt + W(x(T)) ]

subject to the law of motion

    dx = f(x, u, t) dt + \sigma(x, u, t) dz,    x(0) = x_0,

where
f(x, u, t) is the n x 1 vector of instantaneous drifts;
\sigma(x, u, t) is the n x n matrix of instantaneous standard deviations;
dz is white noise.
The Bellman equation is

    \rho V(x, t) - V_t(x, t) = \max_{u \in D(x, t)} \pi(x, u, t) + \sum_{i=1}^n V_{x_i}(x, t) f_i(x, u, t)
        + \frac{1}{2} tr [ \sigma(x, u, t) \sigma(x, u, t)' V_{xx}(x, t) ]

with terminal condition V(x, T) = W(x), where tr(A) is the trace of the matrix A.
In the autonomous infinite-horizon case the Bellman equation becomes

    \rho V(x) = \max_{u \in D(x)} \pi(x, u) + \sum_{i=1}^n V_{x_i}(x) f_i(x, u)
        + \frac{1}{2} tr [ \sigma(x, u) \sigma(x, u)' V_{xx}(x) ].
Finite-State Problems
The set of states is X = {x_1, x_2, ..., x_n}. Time is discrete.
The law of motion is a controlled discrete-time, finite-state, first-order Markov process, where q_{ij}^t(u) is the probability that the state transits from x_i to x_j if the control is u at time t.
Finite-horizon case: Let V_i^t = V(x_i, t), i = 1, ..., n, t = 0, ..., T+1. The Bellman equation is

    V_i^t = \max_{u \in D(x_i, t)} \pi(x_i, u, t) + \sum_{j=1}^n q_{ij}^t(u) V_j^{t+1}

with terminal condition V_i^{T+1} = W(x_i).
Recursive system of nonlinear equations. Solve backwards from t = T+1 to t = 0 for V_i^t, i = 1, ..., n.
Infinite-horizon case: Let V_i = V(x_i), i = 1, ..., n. The Bellman equation is

    V_i = \max_{u \in D(x_i)} \pi(x_i, u) + \beta \sum_{j=1}^n q_{ij}(u) V_j.

System of nonlinear equations. The contraction mapping theorem ensures existence and uniqueness of a solution.
Finite-State Problems: Value Function Iteration
Define the operator T pointwise by

    (TV)_i = \max_{u \in D(x_i)} \pi(x_i, u) + \beta \sum_{j=1}^n q_{ij}(u) V_j,    i = 1, ..., n.

Value function iteration:
Initialization: Choose initial guess V^0 and stopping criterion \epsilon.
Step 1: Compute V^{l+1} = T V^l.
Step 2: If ||V^{l+1} - V^l|| < \epsilon, stop; otherwise, go to step 1.
The sequence {V^l}_{l=0}^\infty converges linearly at rate \beta to V^* and ||V^{l+1} - V^*|| \le \beta ||V^l - V^*||. Hence,

    ||V^l - V^*|| \le ||V^{l+1} - V^l|| / (1 - \beta).

To ensure ||V^{l+1} - V^*|| < \epsilon, stop if ||V^{l+1} - V^l|| \le \epsilon (1 - \beta) / \beta.
Maximization is costliest. Exploit special structure of the objective (e.g., concavity, monotonicity) whenever possible.
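The iteration is short to implement. The sketch below runs it on a small synthetic problem — the payoffs pi[i, u] and transition probabilities q[u, i, j] are randomly generated stand-ins, not data from the slides — with a stopping rule that guarantees the final iterate is within eps of V*.

```python
import numpy as np

# Synthetic test problem: n states, m controls (made-up data, not from the slides).
rng = np.random.default_rng(0)
n, m, beta, eps = 20, 5, 0.95, 1e-8
pi = rng.uniform(size=(n, m))                 # pi[i, u]: payoff in state i under control u
q = rng.uniform(size=(m, n, n))
q /= q.sum(axis=2, keepdims=True)             # q[u, i, j]: transition prob., rows sum to 1

def bellman(V):
    """The operator T: (TV)_i = max_u { pi(x_i, u) + beta * sum_j q_ij(u) V_j }."""
    return (pi + beta * np.einsum('uij,j->iu', q, V)).max(axis=1)

V = np.zeros(n)
while True:
    V_new = bellman(V)                                            # step 1
    done = np.max(np.abs(V_new - V)) <= eps * (1 - beta) / beta   # step 2
    V = V_new
    if done:
        break
```

Since T is a contraction with modulus beta, the loop terminates, and the stopping rule bounds the true error ||V - V*|| by eps.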
Finite-State Problems: Policy Function Iteration
Define the operator U pointwise by

    (UV)_i \in \arg\max_{u \in D(x_i)} \pi(x_i, u) + \beta \sum_{j=1}^n q_{ij}(u) V_j,    i = 1, ..., n.

Let U_i = U(x_i), i = 1, ..., n, Q_U = (q_{ij}(U_i))_{i,j}, \Pi_U = (\pi(x_i, U_i))_i. Then the value V_U of following policy U forever satisfies the system of linear equations

    V_U = \Pi_U + \beta Q_U V_U    \iff    V_U = (I - \beta Q_U)^{-1} \Pi_U.

Policy function iteration (a.k.a. Howard improvement):
Initialization: Choose initial guess V^0 and stopping criterion \epsilon. (Or: choose U^0 instead of V^0 and go to step 2.)
Step 1: Compute U^{l+1} = U V^l.
Step 2: Solve (I - \beta Q_{U^{l+1}}) V^{l+1} = \Pi_{U^{l+1}} to obtain V^{l+1}.
Step 3: If ||V^{l+1} - V^l|| < \epsilon, stop; otherwise, go to step 1.
Step 2 computes the value of following policy U^{l+1} forever.
Finite-State Problems: Policy Function Iteration
Modified policy iteration with k steps: Replace step 2 by
Step 2a: Set W^0 = V^l.
Step 2b: Compute W^{j+1} = \Pi_{U^{l+1}} + \beta Q_{U^{l+1}} W^j, j = 0, ..., k.
Step 2c: Set V^{l+1} = W^{k+1}.
Step 2 now computes the value of following policy U^{l+1} for k+1 periods.
The sequence {V^l}_{l=0}^\infty converges linearly to V^* and

    ||V^{l+1} - V^*|| \le \min \{ \beta, \beta (1 - \beta^k)(1 - \beta)^{-1} ||U^l - U^*|| + \beta^{k+1} \} ||V^l - V^*||.

The rate approaches \beta^{k+1} as U^l approaches U^* — accelerated convergence.
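Modified policy iteration avoids the linear solve entirely: step 2 becomes k + 1 applications of the (linear) policy-evaluation operator. A sketch, again with made-up synthetic primitives:

```python
import numpy as np

rng = np.random.default_rng(0)
n, m, beta, k = 20, 5, 0.95, 20
pi = rng.uniform(size=(n, m))
q = rng.uniform(size=(m, n, n))
q /= q.sum(axis=2, keepdims=True)

V = np.zeros(n)
for _ in range(500):
    # step 1: greedy policy with respect to V^l
    U = (pi + beta * np.einsum('uij,j->iu', q, V)).argmax(axis=1)
    Q_U = q[U, np.arange(n), :]
    Pi_U = pi[np.arange(n), U]
    W = V.copy()                              # step 2a
    for _ in range(k + 1):                    # step 2b: evaluate the policy k+1 periods
        W = Pi_U + beta * Q_U @ W
    if np.max(np.abs(W - V)) < 1e-10:         # steps 2c and 3
        V = W
        break
    V = W
```

Setting k = 0 recovers value function iteration; a large k approximates exact policy iteration without forming or factoring I - \beta Q_U.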
Finite-State Problems: Gaussian Acceleration Methods
Idea: The Bellman equation is a system of nonlinear equations. Treat it as such!
Pre-Gauss-Jacobi iteration (a.k.a. value function iteration):

    V_i^{l+1} = \max_{u \in D(x_i)} \pi(x_i, u) + \beta \sum_{j=1}^n q_{ij}(u) V_j^l,    i = 1, ..., n.

Gauss-Jacobi iteration:

    V_i^{l+1} = \max_{u \in D(x_i)} [ \pi(x_i, u) + \beta \sum_{j \ne i} q_{ij}(u) V_j^l ] / (1 - \beta q_{ii}(u)),    i = 1, ..., n.

Pre-Gauss-Seidel iteration:

    V_i^{l+1} = \max_{u \in D(x_i)} \pi(x_i, u) + \beta \sum_{j < i} q_{ij}(u) V_j^{l+1} + \beta \sum_{j \ge i} q_{ij}(u) V_j^l,    i = 1, ..., n.

Gauss-Seidel iteration:

    V_i^{l+1} = \max_{u \in D(x_i)} [ \pi(x_i, u) + \beta \sum_{j < i} q_{ij}(u) V_j^{l+1} + \beta \sum_{j > i} q_{ij}(u) V_j^l ] / (1 - \beta q_{ii}(u)),    i = 1, ..., n.
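Because states j < i are overwritten before state i is visited, Gauss-Seidel can be coded as a single in-place sweep; dividing by 1 - \beta q_{ii}(u) absorbs the diagonal term. A sketch with synthetic primitives (made-up payoffs and transitions, not from the slides):

```python
import numpy as np

rng = np.random.default_rng(0)
n, m, beta = 20, 5, 0.95
pi = rng.uniform(size=(n, m))
q = rng.uniform(size=(m, n, n))
q /= q.sum(axis=2, keepdims=True)

V = np.zeros(n)
for sweep in range(2000):
    V_old = V.copy()
    for i in range(n):                        # traverse states in increasing order
        best = -np.inf
        for u in range(m):
            # sum over j != i: entries j < i already hold V^{l+1}, j > i still hold V^l
            off_diag = q[u, i, :] @ V - q[u, i, i] * V[i]
            best = max(best, (pi[i, u] + beta * off_diag) / (1 - beta * q[u, i, i]))
        V[i] = best
    if np.max(np.abs(V - V_old)) < 1e-12:
        break
```

The fixed point is the same V* as for value function iteration; using fresh information within each sweep typically reduces the number of sweeps needed.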
Finite-State Problems: Gaussian Acceleration Methods
Idea: The Gauss-Seidel methods depend on the ordering of states. Exploit it!
Downwind (solid) and upwind (dashed) directions. Source: Judd, K. (1998), Figure 12.1.
Upwind Gauss-Seidel: In iteration l, first order the state space such that q_{i,i+1}(U_i^l) \ge q_{i+1,i}(U_{i+1}^l), i = 1, ..., n-1. Then traverse the state space in decreasing order.
Simulated upwind Gauss-Seidel: In iteration l, first simulate the Markov process under U^l. Then traverse the simulated states in decreasing order.
Alternating sweep Gauss-Seidel: In iteration l, traverse the state space in increasing (decreasing) order if l is odd (even).
Continuous-State Problems: Discretization
Specify a finite-state problem that is similar to the continuous-state problem under consideration.
Example: Optimal growth. The Bellman equation is

    V(k) = \max_{c \in [0, k + f(k)]} u(c) + \beta V(k + f(k) - c).

Replace the set of states [0, \infty) by K = {k_1, k_2, ..., k_n}. Choose K large enough so that the initial and the steady state are contained in it.
To ensure landing on a point in K, take the control to be next period's state and rewrite the Bellman equation as

    V(k) = \max_{k^+ \in K} u(k + f(k) - k^+) + \beta V(k^+).

Remarks:
Easy and robust.
Sometimes requires reformulating the problem and/or altering the set of states and controls.
Requires a large number of points, particularly if the state space is multidimensional.
Inefficient approximation to smooth problems.
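A minimal sketch of the discretized growth model. The primitives u(c) = ln c, f(k) = k**alpha, alpha = 0.3, beta = 0.95, and the grid bounds are illustrative assumptions, chosen so that the steady state (about k = 12 for these numbers) lies inside the grid:

```python
import numpy as np

alpha, beta = 0.3, 0.95
grid = np.linspace(1.0, 15.0, 201)            # the set K; contains the steady state

wealth = grid + grid ** alpha                 # k + f(k)
C = wealth[:, None] - grid[None, :]           # C[i, j]: consumption when k+ = grid[j]
util = np.where(C > 0, np.log(np.maximum(C, 1e-12)), -np.inf)   # infeasible -> -inf

V = np.zeros(grid.size)
for _ in range(5000):
    V_new = (util + beta * V[None, :]).max(axis=1)
    if np.max(np.abs(V_new - V)) < 1e-10:
        V = V_new
        break
    V = V_new

policy = grid[(util + beta * V[None, :]).argmax(axis=1)]   # optimal next-period capital
```

Making the control the next-period state reduces each Bellman update to a single row-wise max over a precomputed n-by-n payoff matrix.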
Continuous-State Problems: Parametric Approximation Methods
Approximate the value function using the family of functions \hat{V}(x; a) and use methods for finite-state problems to choose the parameters a.
Parametric dynamic programming with value function iteration:
Initialization: Choose a functional form for \hat{V}(x; a), where a \in R^m, and a set of points X = {x_i}_{i=1}^n, where n \ge m. Choose initial guess a^0 and stopping criterion \epsilon.
Step 1 (maximization step): Compute

    v_i = \max_{u \in D(x_i)} \pi(x_i, u) + \beta \int \hat{V}(x^+; a^l) dF(x^+, x_i, u),    i = 1, ..., n.

Step 2 (fitting step): Compute a^{l+1} such that \hat{V}(x; a^{l+1}) approximates the Lagrange data {(x_i, v_i)}_{i=1}^n.
Step 3: If ||\hat{V}(x; a^{l+1}) - \hat{V}(x; a^l)|| < \epsilon, stop; otherwise, go to step 1.
Three interconnected components:
Numerical integration.
Maximization.
Function approximation (CompEcon toolbox: help cetools).
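The maximization and fitting steps can be sketched for the deterministic growth model. Everything concrete here is an assumption for illustration — log utility, f(k) = k**alpha, the grids, and a least-squares Chebyshev fit for \hat{V}(x; a); the fitted operator need not be a contraction, so convergence is not guaranteed in general:

```python
import numpy as np
from numpy.polynomial import Chebyshev

# Illustrative growth model: u(c) = ln c, f(k) = k**alpha (assumed, not from the slides).
alpha, beta, deg = 0.3, 0.95, 9
nodes = np.linspace(1.0, 15.0, 30)            # fitting points x_i, with n >= m coefficients
kp = np.linspace(1.0, 15.0, 400)              # fine grid over next-period capital

V_hat = Chebyshev([0.0], domain=[1.0, 15.0])  # initial guess: V_hat(.; a^0) = 0
for _ in range(2000):
    # maximization step: v_i = max_{k+} u(k_i + f(k_i) - k+) + beta * V_hat(k+)
    c = nodes[:, None] + nodes[:, None] ** alpha - kp[None, :]
    vals = np.where(c > 0,
                    np.log(np.maximum(c, 1e-12)) + beta * V_hat(kp)[None, :],
                    -np.inf)
    v = vals.max(axis=1)
    # fitting step: least-squares Chebyshev fit to the Lagrange data {(x_i, v_i)}
    V_new = Chebyshev.fit(nodes, v, deg, domain=[1.0, 15.0])
    # stopping test on the fitted values at the nodes
    if np.max(np.abs(V_new(nodes) - V_hat(nodes))) < 1e-9:
        V_hat = V_new
        break
    V_hat = V_new
```

Because the problem is deterministic, the integral against dF collapses to a point evaluation; a stochastic version would replace V_hat(kp) by a quadrature sum over next-period states.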
Continuous-State Problems: Parametric Approximation Methods
The computable approximation \hat{T} to the contraction mapping T may be neither contractive nor monotonic.
Shape-preserving methods:
Dynamic programming and the shape of the value function. Source: Judd, K. (1998), Figure 12.2.
Linear spline (C^0).
Schumaker shape-preserving spline (C^1; use the envelope theorem to obtain Hermite data {(x_i, v_i, v_i')}_{i=1}^n).
Bilinear and simplicial interpolation (C^0).
Could use policy function iteration instead of value function iteration, but Gauss-Seidel methods are harder to adapt.
Continuous-State Problems: Projection Methods
The Bellman equation is a functional equation.
Approximate the value function using the family of functions \hat{V}(x; a) and choose the parameters a such that \hat{V}(x; a) almost satisfies the Bellman equation.
The residual function is defined pointwise by

    R(x; a) = -\hat{V}(x; a) + \max_{u \in D(x)} \pi(x, u) + \beta \int \hat{V}(x^+; a) dF(x^+, x, u).

Special case: Suppose the FOC ensures optimality. Then the value function V(x) and the optimal policy function U(x) satisfy

    V(x) = \pi(x, U(x)) + \beta \int V(x^+) dF(x^+, x, U(x)),
    0 = \pi_u(x, U(x)) + \beta \int V(x^+) dF_u(x^+, x, U(x)).

Approximate the value function using the family of functions \hat{V}(x; a) and the optimal policy function using \hat{U}(x; b).
Continuous-State Problems: Projection Methods
The residual function is defined pointwise by

    R_1(x; a, b) = -\hat{V}(x; a) + \pi(x, \hat{U}(x; b)) + \beta \int \hat{V}(x^+; a) dF(x^+, x, \hat{U}(x; b)),
    R_2(x; a, b) = \pi_u(x, \hat{U}(x; b)) + \beta \int \hat{V}(x^+; a) dF_u(x^+, x, \hat{U}(x; b)).

Even more special case: Suppose the FOC ensures optimality and can be solved in closed form for U(x). Then the value function V(x) satisfies

    V(x) = \pi(x, U(x)) + \beta \int V(x^+) dF(x^+, x, U(x)).

Approximate the value function using the family of functions \hat{V}(x; a). The residual function is defined pointwise by

    R(x; a) = -\hat{V}(x; a) + \pi(x, U(x)) + \beta \int \hat{V}(x^+; a) dF(x^+, x, U(x)).

Projection methods are natural for continuous-time problems.