11 - Numerical Issues #1
Part 1: Continuous State Variables
I. The complications of continuity: V(x,T) maps from the continuous domain of X to the real line
At stage T-1 of a finite-horizon problem, the Bellman's equation is
1. $V\left(x_{T-1},T-1\right)=\max_{z_{T-1}}E\left[u\left(z_{T-1},x_{T-1},T-1\right)+V\left(x_{T},T\right)\right]$.
Now this equation can easily be evaluated at any finite grid of points, say X={x_1, x_2, x_3, ..., x_n}, since the functional forms of u(), V(x_T,T) and the state equation are known. When we come to the equation for V(x_{T-2},T-2), however, we have
2. $V\left(x_{T-2},T-2\right)=\max_{z_{T-2}}E\left[u\left(z_{T-2},x_{T-2},T-2\right)+V\left(x_{T-1},T-1\right)\right]$.
This may cause problems because from 1 we only know the values of V(x_{T-1},T-1) at the points at which 1 has been evaluated, namely X={x_1, x_2, x_3, ..., x_n}. Since we need to find the value of z_{T-2} that maximizes the RHS of 2, it is likely that some candidate values of z will lead to values for x_{T-1} that are not contained in the grid X. Hence, we're faced with a problem: How do we ensure that we're finding the correct solution to 2 if we only know the values of V(x_{T-1},T-1) at a finite set of points?
If the problem is stochastic, then this issue becomes even more relevant. Suppose there is a continuous probability distribution over x_{T-1} conditional on x_{T-2} and z_{T-2}. Then, for a given choice z_{T-2}, the expected future value of next period's stock will be
$E_{x_{T-1}|x_{T-2},z_{T-2}}\left[V\left(x_{T-1},T-1\right)\right]=\int f\left(x_{T-1}\mid z_{T-2},x_{T-2}\right)V\left(x_{T-1},T-1\right)\,dx_{T-1}$
where f(x_{T-1} | z_{T-2}, x_{T-2}) is the probability distribution of x_{T-1} conditional on z_{T-2} and x_{T-2}.
It might be easier to think of the case of a discrete probability distribution:

$E_{x_{T-1}|x_{T-2},z_{T-2}}\left[V\left(x_{T-1},T-1\right)\right]=\sum_{i=1}^{m}p\left(x_{T-1}^{i};z_{T-2}\right)V\left(x_{T-1}^{i},T-1\right)$

with $p\left(x_{T-1}^{i};z_{T-2}\right)$ being the probability that x_{T-1} takes on a particular value x^i given a particular choice z_{T-2}, with m large compared to n. A discrete specification would typically be used to approximate a continuous distribution.
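For concreteness, a minimal VB sketch of this discrete expectation for one candidate choice z_{T-2} follows; the arrays p() and Vnext(), holding the m probabilities and the corresponding values V(x^i_{T-1},T-1), are assumed for illustration:

EV = 0#
For i = 1 To m
    EV = EV + p(i) * Vnext(i)    ' add the probability-weighted future value
Next i

Note that this presumes we can evaluate V at every possible x^i_{T-1}, which is precisely the difficulty taken up next.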
However, since we've used numerical methods to solve the first equation at only n points or nodes, we don't have observations of V(x_{T-1},T-1) at all the possible values of x_{T-1}. Suppose, for example, you have evaluated V(x_{T-1}) at the eight points in the figure below and came up with the values indicated. Then, in order to solve for V(x_{T-2}), you need to take an expectation of some value that falls between points on this grid. How do we proceed? We now discuss a number of ways around this problem.
[Figure: V(x_{T-1}) evaluated at the eight grid points x_1, ..., x_8.]
Needless to say, this problem does not occur only at T-2, but in all periods except the final one. Hence, it critically affects infinite-horizon problems in the same way. We present the infinite-horizon case below; the extension to a finite-horizon case is straightforward. We should point out, however, that unlike the successive approximation algorithm for DD problems, the convergence of the infinite-horizon algorithm for these problems is not as well behaved: it may not be monotonic and may not converge uniformly.
II. Solution #1: Rounding
The easiest way to handle a continuous state space is to turn it into a discrete space. That is, we treat the value function as if it can only take on n possible values, those associated with the points in our set X. If x_{t+1} should happen to fall outside this set of points, either between the points or completely outside the range, we simply round up or down until we get to a value that we've evaluated. Technically, we might write

$\hat{V}\left(x_{t+1}\right)=V\left(x^{*}\right)$ with $x^{*}=\arg\min_{x\in X}\left|x-x_{t+1}\right|$.
That is, since we don't have an estimate of V(x_{t+1}), we approximate it with the value at the nearest point for which we actually know V.
[Figure: the step-function approximation of V(x_{t+1}) implied by rounding over the grid points x_0, ..., x_7.]
Using the same points from the first figure, the implicit value function that follows from
rounding would take a form like that in the figure above.
It is easy to see that rounding may not be the best way to handle the problem of approximating the value function. For example, if x_1=1.0 and x_2=2.0, then the estimate of the value of x_{t+1}=1.49 would be dramatically different from the approximation of the value of x_{t+1}=1.51.
Usually, however, our value functions are not as nasty looking as the one in the figures. If V() is a nice monotonic function without huge changes in its slope, then the magnitude of the error from rounding can be quite small. Nonetheless, if you want to round, you need to make sure that your grid is tight enough that the rounding errors do not have an overwhelming influence on your results.
A. The rounding algorithm
Implementing rounding numerically is quite simple in principle and could be implemented by an algorithm like the following. Let V be an array of values from the previous stage at each of the points in your grid X; let x be your grid X, which takes on values x_1, x_2, ..., x_nx; and let xTrue be the true value of x_{t+1} for which we want to find an estimate of V(x_{t+1}), say Vest. The following algorithm finds the nearest estimate using rounding. (A note on notation: the indented lines below are actual VB code.)
These lines of code would calculate the value of V(x_{t+1}) for some value x_{t+1} ∉ X.

xTrue = g(x, z, eps)    ' calculates the true value of x_{t+1} as a function of the
                        ' current state, control and random shock

Using the invertgrid function from the MatrixOperations module, we can find the index of the point in x closest to xTrue as follows:

ix = abs(invertgrid(xTrue, x, 999))

The value at xTrue is then found simply:

Vest = V(ix)

Without using the invertgrid function, this could be carried out as follows:

diff = 99999
For ix = 1 To nx                        ' identify the value of x ∈ X that is closest to xTrue
    If abs(x(ix) - xTrue) < diff Then
        diff = abs(x(ix) - xTrue)
        Vest = V(ix)                    ' use V at the closest x(ix) as the estimate of V(xTrue)
    End If
Next ix
Alternatively, suppose your grid is defined as nx+1 equally spaced points starting at xlow and ending at xmax, with the index starting at zero. In this case the distance between neighboring points on the grid is xstep = (xmax - xlow)/nx:

i:   0             1             2             3            ...   nx
x:   xlow+0*xstep  xlow+1*xstep  xlow+2*xstep  xlow+3*xstep  ...   xmax

In this case, you could calculate the index associated with a value x_{t+1} ∉ X using the following, where x_{t+1} is represented by the symbol xt1 and the associated index is identified as iest:

iest = int((xt1 - xlow + 0.5*xstep)/xstep)   ' picks the index of the grid value closest to xt1
Vnext = V(iest)                              ' the estimate of V(xt1) is the iest-th element
                                             ' of the stored array V()
An important modeling decision that must be made when implementing a rounding algorithm is how to treat points that fall completely outside the grid. In the figure and algorithms above I assumed that V(x_{t+1}) is the same as V(x_0) if x_{t+1}<x_0 and is the same as V(x_n) if x_{t+1}>x_n. However, this may not be appropriate. For example, it may give the impression that a decision-maker could drive the state variable to negative infinity without sacrificing any future value. In some problems, therefore, it is necessary to set V(x_{t+1}) equal to a very large negative number for any x_{t+1} that falls completely outside the grid. It is extremely important to be careful in how you handle the edges of your grid in applied dynamic programming; this seemingly small modelling decision can dramatically affect your results. An important goal for your model is that your state grid be specified in a manner such that the edges of the grid do not influence the solution inside the grid and all optimal paths lead to points in the interior of your grid.
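As a minimal sketch of one way to implement this edge handling on the uniform grid above (the penalty value bigNeg is an assumption, not part of the original algorithm):

bigNeg = -9999999#                               ' hypothetical penalty for leaving the grid
If xt1 < xlow Or xt1 > xmax Then
    Vnext = bigNeg                               ' outside the grid entirely: penalize
Else
    iest = int((xt1 - xlow + 0.5*xstep)/xstep)   ' inside the grid: round as before
    Vnext = V(iest)
End If

Whether you clamp to the edge, penalize, or extrapolate should be an explicit, documented choice in your program.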
While rounding is not always the best way to deal with CD problems, it sometimes works
out pretty well. Of course, the more points that you have in your grid, the more accurate
your rounding estimation will be.
B. The Curse of dimensionality
The problem with tightening your grid is that in each stage you have to solve more state problems. If your state space is multi-dimensional, then tightening your grid increases the number of state problems geometrically. This problem is known as "THE CURSE OF DIMENSIONALITY."
The curse of dimensionality refers to the fact that if a problem has m state variables, each of which is allowed to take on n values, then you need to solve the Bellman's equation at n^m points in each stage. For example, a rather coarse grid would be to approximate the state space with only 10 points in each dimension. If you have four state variables, then your computer algorithm must solve the problem at 10^4 or 10,000 points. If each evaluation takes only 1/10th of a second, then each stage would still take 1,000 seconds or 16.7 minutes. Moving from 4 to 5 to 6 variables under the same assumptions would increase each stage loop to 2.7 hours and then to over one day. It would be 317 years before a problem with just 11 state variables completed just one stage. If you contrast this with the relative freedom that one has when choosing how many variables to put into an econometric model, you see that problems of applied dynamic programming are of a very different nature.
As we note in Woodward, Wui and Griffin (2005), despite the incredible increases in the
speed of computers, the curse remains very real.
Although enormous improvements in the computational speed have been
achieved in recent years, this computational burden will continue to limit the
size of DP problems for many years to come. Moore's law is the regular
tendency for the density of computer circuitry (and processor speed) to
double every eighteen months (Schaller). This law, which has held up
surprisingly well since its conception in 1965, has startling implications for
simulation modelers: a simulation model could double in size every 1.5 years
without slowing down. The implications for DP, however, are not nearly so
promising. For example, in a model in which each state variable takes on
just 8 possible values, it would be 4.5 years before just one state variable
could be added without increasing the run time of the program. The solution
of DP problems with hundreds of state variables lies only in the far distant
future.
In many problems, hundreds of iterations of the stage loop are necessary for convergence. Hence, there is an obvious premium on keeping your grid as sparse as possible and, even more critically, on keeping the dimension of your state space as small as possible. Since finding precise answers using rounding usually requires a tight grid, this approach has its limitations. Nonetheless, it should be emphasized that the curse affects all the approaches considered below; it's only the extent to which these approaches are affected that varies. That is, if you can reduce n, the number of grid points, then the consequences of increasing m are not as severe.
John Rust (1997) has proposed an approach that uses rounding on a randomly chosen grid; it can be used to solve very complicated problems involving very large state spaces and is the only approach that I know of that actually overcomes the curse.
III. Solution #2: Interpolation
A. Linear interpolation (also known as linear splines)
As we've seen above, using rounding leads to a step function for the estimated value function. This may not be a problem, but we can usually do better. One simple way to do better is to use linear interpolation to get estimates at points other than those included in our grid, X.
[Figure: the piecewise-linear approximation of V(x_{t+1}) implied by linear interpolation over the grid points x_0, ..., x_7, with multiple arrows indicating alternative extrapolations beyond the grid.]
In this case, the estimate of our value function becomes a piecewise linear and continuous function. Again, as above, there is no uniform rule on how to extrapolate beyond the grid; I have indicated this in the figure using the multiple arrows.
Programming a linear interpolation algorithm is a bit tedious because you first need to identify the indices of the grid points below and above the point x_{t+1} for which you need an estimate of V. I assume in this that the grid X is ordered in the sense that x_1<x_2<...<x_n. A more efficient algorithm, similar to the second one above, can be written if your grid is uniform.
----------------------------------------------------------------
 This code would appear in the subroutine where you calculate
 the value function, V(x_{t+1}).
----------------------------------------------------------------
xTrue = g(x, z, eps)    ' xTrue is a real number that does not fall at a node of the state grid
----------------------------------------------------------------
 First find the index iLow of the last grid point below xTrue;
 x(iLow+1) is then the first member of x that exceeds xTrue.
----------------------------------------------------------------
iLow = nx - 1                       ' defaults, used if xTrue lies above the entire grid
iHi = nx
For ix = 1 To nx
    If x(ix) > xTrue Then
        iLow = ix - 1
        iHi = iLow + 1
        Exit For
    End If
Next ix
----------------------------------------------------------------
 Boundary issues. This will make values that fall outside the
 grid take on the value at the nearest edge.
----------------------------------------------------------------
iLow = application.Max(iLow, 1)
iHi = application.Min(iHi, nx)
----------------------------------------------------------------
 We now use these indices to calculate the linear interpolation.
 V(xTrue) is a weighted sum of V(iLow) and V(iHi), depending upon
 how close xTrue is to one or the other. The values a and b
 indicate what fraction of the distance between x(iLow) and
 x(iHi) has been covered. Note that a+b=1.0.
 At the boundaries, distLoHi = 0.
----------------------------------------------------------------
distLoHi = x(iHi) - x(iLow)
If distLoHi > 0 Then
    a = abs(xTrue - x(iLow))/distLoHi
    b = abs(xTrue - x(iHi))/distLoHi
Else
    a = 1#
    b = 0#
End If
----------------------------------------------------------------
 Now, using a and b, we calculate Vest. Note that a multiplies
 V(iHi) and b multiplies V(iLow): if xTrue is close to x(iLow),
 a is small and b is large, so most of the weight goes to V(iLow).
----------------------------------------------------------------
Vest = a*V(iHi) + b*V(iLow)
Linear interpolation is not a bad way to go about approximating the state space, but it still has some limitations. In particular, while the estimated value function is smoother than under the rounding approach, its derivatives are discontinuous, which can be problematic if your control variable is also continuous (but we'll get into that in later lectures). Also, if the value function is highly nonlinear, then a tight grid will still be needed to obtain a good estimate.
Interpolating in two or more dimensions is a straightforward analogue of the one-dimensional case. As seen in the figure below, the two-dimensional case simply involves calculating the weights a, b, c and d, which sum to 1 as above. The weight applied to each corner is the one taken from the opposite diagonal, e.g.,

$V\left(x_{True}\right)\approx a\,V\left(x_{1}(iHi),x_{2}(iLo)\right)+b\,V\left(x_{1}(iLo),x_{2}(iLo)\right)+c\,V\left(x_{1}(iLo),x_{2}(iHi)\right)+d\,V\left(x_{1}(iHi),x_{2}(iHi)\right)$.

[Figure: x_True lies inside the rectangle with corners at the grid points (x_1(iLo),x_2(iLo)), (x_1(iHi),x_2(iLo)), (x_1(iLo),x_2(iHi)) and (x_1(iHi),x_2(iHi)); x_True divides the rectangle into sub-rectangles with areas a, b, c and d.]
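A minimal VB sketch of this bilinear interpolation on a uniform two-dimensional grid follows; the grid parameters (x1low, x2low, x1step, x2step) and the two-dimensional array V() are illustrative assumptions:

iLo1 = int((x1True - x1low)/x1step)                 ' index just below x1True in dimension 1
iLo2 = int((x2True - x2low)/x2step)                 ' index just below x2True in dimension 2
t1 = (x1True - (x1low + iLo1*x1step))/x1step        ' fractional distance covered in dimension 1
t2 = (x2True - (x2low + iLo2*x2step))/x2step        ' fractional distance covered in dimension 2
' Each corner's weight is the area of the sub-rectangle diagonally opposite it.
b = (1# - t1)*(1# - t2)                             ' weight on V(iLo1, iLo2)
a = t1*(1# - t2)                                    ' weight on V(iLo1+1, iLo2)
c = (1# - t1)*t2                                    ' weight on V(iLo1, iLo2+1)
d = t1*t2                                           ' weight on V(iLo1+1, iLo2+1)
Vest = a*V(iLo1+1, iLo2) + b*V(iLo1, iLo2) + c*V(iLo1, iLo2+1) + d*V(iLo1+1, iLo2+1)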
B. Cubic interpolation or cubic splines
An improvement over linear interpolation is to use cubic interpolation. Cubic splines
yield a smooth approximating function something like the one below. In this case, you
are basically interpolating using both the levels and the partial derivatives of the function.
I will not present the algorithm for cubic interpolation here. A detailed discussion of the use of cubic splines is available in Numerical Recipes, a book by Press et al. (1989) (available online at http://www.nr.com/) that contains careful discussion of many numerical techniques and Fortran code for implementing them that could easily be adapted to VB. Judd (pp. 225-227) also discusses the use of cubic splines.
[Figure: a smooth cubic-spline approximation of V(x_{t+1}) through the grid points x_0, ..., x_7.]
In each of the approaches discussed so far, the estimate of the value function, say $\hat{V}\left(x_{t+1}\right)$, is obtained using a finite number of observations of V at the points in the state grid, x∈X, which were found in the previous stage loop. If x_{t+1} is a number that is not contained in X, then there will be some error, and the expected magnitude of that error declines as we move from rounding, to linear splines, to cubic splines. Moreover, if x_{t+1} is, by coincidence, a value contained in X, then there will be no estimation error.
IV. Solution #3: Functional approximation
The next set of solution methods that we'll consider is to assume that there is an underlying function that describes the value function. In this case the analyst assumes the functional form, and the DP algorithm is used to identify its parameters. This differs in an important way from the methods we've considered so far. Up to now the Bellman's equation in the k-th iteration (i.e., the k-th stage) was calculated using the values of V that you found in iteration k-1. In the functional approximation approach, the value function on the RHS of the Bellman's equation is defined not by a set of values at fixed points in the state grid, but by a set of parameters: the (k-1)-th set of coefficients of the assumed functional form. The updating step between each stage loop, therefore, involves finding a new set of coefficients for the value function. The test of convergence in a successive approximation algorithm might be based on the extent to which the coefficients change from one stage to the next, though it is important to be aware that the scale of these coefficients might be very important.
A. Functional approximation using ordinary polynomials
The first functional approximation method that we consider is the use of ordinary
polynomials. For example, you may assume that the value function can be closely
approximated by a second order Taylor series approximation, i.e.,
$\hat{V}\left(x_{t+1}\right)=a_{0}+a_{1}x_{t+1}+\frac{a_{2}}{2}x_{t+1}^{2}$.
In this case, your problem becomes one of choosing the parameters a_0, a_1 and a_2 at each iteration. Let c_k be the set of coefficients of the value function in the k-th iteration of a successive approximation algorithm; $\hat{V}\left(x_{t+1};c_{k}\right)$ is then the estimated value function conditional on the parameters c_k. The (k+1)-th set of parameters would be found in two steps. First, solve the problem

$\hat{V}\left(x_{t}\right)=\max_{z_{t}}E\left[u\left(z_{t},x_{t}\right)+\hat{V}\left(x_{t+1};c_{k}\right)\right]$

at every point in your grid. Then, use the values $\hat{V}\left(x_{t}\right)$ like data to find the new set of coefficients, c_{k+1}, that give you the best possible approximating function. How might this be done? Well, OLS is not a bad option. We are able, therefore, to get a new set of parameters, c_{k+1}. Each stage, therefore, represents a mapping from c_k to c_{k+1}.
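As a minimal sketch of this updating step for the quadratic case, the new coefficients can be obtained by OLS, fitting V(i) ≈ c0 + c1*x(i) + c2*x(i)^2 over the grid and solving the normal equations by Cramer's rule (all names here are illustrative assumptions; note that a_2 in the Taylor form above equals 2*c2):

Function Det3(a11, a12, a13, a21, a22, a23, a31, a32, a33) As Double
    ' determinant of a 3x3 matrix, used for Cramer's rule
    Det3 = a11*(a22*a33 - a23*a32) - a12*(a21*a33 - a23*a31) + a13*(a21*a32 - a22*a31)
End Function

' Accumulate the sums that enter the normal equations.
S1 = 0#: S2 = 0#: S3 = 0#: S4 = 0#: T0 = 0#: T1 = 0#: T2 = 0#
For i = 1 To nx
    S1 = S1 + x(i): S2 = S2 + x(i)^2
    S3 = S3 + x(i)^3: S4 = S4 + x(i)^4
    T0 = T0 + V(i): T1 = T1 + V(i)*x(i): T2 = T2 + V(i)*x(i)^2
Next i
D = Det3(nx, S1, S2, S1, S2, S3, S2, S3, S4)        ' determinant of the normal equations
c0 = Det3(T0, S1, S2, T1, S2, S3, T2, S3, S4) / D   ' Cramer's rule, replacing one column at a time
c1 = Det3(nx, T0, S2, S1, T1, S3, S2, T2, S4) / D
c2 = Det3(nx, S1, T0, S1, S2, T1, S2, S3, T2) / D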
One significant advantage of using this approach is that we then have a closed-form expression for our estimated value function $\hat{V}\left(x_{t+1};c_{k+1}\right)$. Evaluating a point on this line is as simple as plugging it into the equation with the most recent set of parameters.
Using ordinary polynomials as we have in the case above, however, has significant limitations and is not recommended. As seen in the figure below, for low-order polynomials, the assumed functional form places very strong restrictions on the form that $\hat{V}\left(x_{t+1};c_{k+1}\right)$ might take. Hence, if the true value function is highly nonlinear, the estimate using a 2nd-order polynomial would be quite inaccurate. As the order of the polynomial rises, $\hat{V}\left(x_{t+1};c_{k+1}\right)$ will get closer and closer to the points V(x_t) on the grid. However, the errors at points between the values on the grid can actually rise, and extrapolation beyond the grid is extremely dangerous.
[Figure: the grid values V(x_0), ..., V(x_7) overlaid with a fitted 2nd-order polynomial and an 8th-order polynomial.]
As will be discussed in section C below, there are alternative polynomial forms that are
far superior to the ordinary polynomials used here.
B. Functional approximation using prior knowledge about the functional form of V
If a modeler uses the polynomial approach to approximating the value function, we can say that he or she has assumed that the value function takes a particular functional form. In this case, however, the functional form is arbitrarily chosen to make analysis easy and/or to make the error between V(x_t) and $\hat{V}\left(x_{t};c_{k+1}\right)$ small. In some instances, however, the modeler can use prior knowledge regarding the functional form of V(x_t).
If it is known that V(x_t) is of the form $\tilde{V}\left(x_{t};c\right)$, with parameters c, then the successive approximation algorithm can be implemented in the same way as was done for ordinary polynomials, stepping from c_k to c_{k+1}.
[Figure: strictly increasing points V(x_1), ..., V(x_8) fitted with a squiggly cubic-spline approximation.]
Judd (pp. 437-438) points out that it may be very important to use information about the concavity of the value function if you have it. For example, he considers the case of points that are strictly increasing, as in the figure. A cubic spline might lead to the squiggly approximation of the function that generated these points, as indicated by the line in the figure. This can lead to quite erroneous outcomes since it indicates, for example, that despite the fact that V(x_8)>V(x_7), the estimated value of almost all the points between x_6 and x_7 exceeds the value of the points between x_7 and x_8.
Hence, if you know that the true value function is monotonic or concave, choosing an
approximation method that preserves those characteristics can avoid errors.
C. Functional approximation using Chebyshev polynomials
If the modeler is interested in using a functional approximation method, but does not
have prior knowledge of the form of V(), the use of polynomials is still a possibility.
While ordinary polynomials can give very large errors, the Chebyshev polynomial is a
polynomial with an unintuitive functional form but very attractive numerical properties.
As noted by Press et al. (1989), "The Chebyshev approximation is very nearly the same polynomial as the holy grail of approximating polynomials, the minimax polynomial, which (among all polynomials of the same degree) has the smallest maximum deviation from the true function f(x)" (p. 149).
The computation of Chebyshev polynomials is tedious but relatively easy. For details, I refer you to Numerical Recipes. I have subroutines that I can share with you should you be interested in using this form of polynomial approximation.
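To give a flavor of what is involved, here is a minimal sketch of how the n Chebyshev interpolation nodes on an interval [xlow, xmax] could be generated (the node formula is the standard one; the variable names are assumptions):

pi = 4# * Atn(1#)
For k = 1 To n
    znode = Cos((2# * k - 1#) * pi / (2# * n))             ' root of the degree-n Chebyshev polynomial on [-1,1]
    xnode(k) = xlow + 0.5 * (xmax - xlow) * (znode + 1#)   ' node mapped onto [xlow, xmax]
Next k

If you take this route, the value function is evaluated at these nodes rather than on a uniform grid, which is part of why the grid must be set up in a precise way (see section V below).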
V. Setting up your grid
An important modeling decision that you must make if you are solving problems with a continuous state variable using any of the above techniques is how your grid will be established. Regardless of the method chosen to approximate the value function, a tighter grid will lead to a more precise estimate of your final solution.
There is no general rule to guide how you should set up your grid and how tight you should make it. In many problems, a uniform grid (e.g., x_1=0.1, x_2=0.2, x_3=0.3, ...) is as good as any. In other cases, if you know that the probability of hitting a particular range in the grid is high, then you'll want to have more grid points in that range than in a range where there is a very low probability of actually ending up. However, if the relative values of these low-probability ranges are very high and, therefore, important to getting the correct answer, then the grid may need to be tight in that area as well. If you use Chebyshev polynomials, then the grid must be set up in a very precise way.
How tight your grid is, i.e., the size of n, is typically a decision that you make based on
practical concerns. As a rule, you do not want your results to be sensitive to the size of
the grid, so you should tighten your grid until it doesn't seem to influence your results any
more. However, if your problem is large (i.e. you have lots of state variables), then
tightening your grid may add hours or days to the time it takes your program to run.
Clearly, practical considerations regarding the tradeoff between computing time and
precision also enter into the choice of the grid.
Part 2: Continuous Choice Variables
VI. The additional difficulties associated with continuous choice variables
As we saw above, there are some important problems that arise when the state space of a
dynamic programming problem is continuous. Additional difficulties arise when the
control variable(s) that you are trying to model are actually continuous.
Before we start talking about continuous controls, it's probably worth pointing out that
many control variables that are relevant in economics are not continuous. The cow
replacement problem and the option price problem are two good examples. These
problems are typically referred to as optimal stopping problems. In those cases, the
decision was binary (replace or not, exercise or not). However, many if not most
economic decisions are continuous, not discrete -- how much to consume, how much to
produce, how much of an input should be used, etc. In such problems the question is not
simply whether or not a particular action should be taken, but the level at which that
action should be taken.
Remember the backward-recursion algorithm for solving finite- and infinite-horizon DP problems is as follows:
For each stage (t=T, T-1, T-2, ..., 0 for finite-horizon problems; k=1, 2, ... for infinite-horizon problems) we want to find the value of each point in the state space. In order to identify the value at each point in the state space, we need to solve a maximization problem: identify the choice variable z_t that maximizes

3. $E\left[u\left(z_{t},x_{t},\varepsilon_{t}\right)+V\left(x_{t+1}\right)\right]$, where $x_{t+1}=g\left(z_{t},x_{t},\varepsilon_{t}\right)$.
When the choice variable is discrete, this is easy - we just try all the values and see which
one is the best. But when the choice variable is continuous, it is impossible to check
every possible value using a computer. We will now explore how you might address this
difficulty in practice.
VII. A slight detour -- Numerical integration over continuous probability density
functions
We have not yet covered the basic principles of taking expectations with continuous probability distributions. Hence, I provide here a very quick overview of some methods. Further development is available in chapter 7 of Judd (1998), chapter 5 of Miranda and Fackler's text, and chapter 4 of Press et al.¹
Suppose you want to take an expected value from a continuous distribution using a computer. That is, you hypothesize that the underlying distribution of your random variable, e, is continuous, say normal with mean ē. The PDF of the variable, f(e), therefore, would look like the figure below.
[Figure 1: a bell-shaped PDF f(e) centered at the mean ē.]
A. Numerical integration using a uniform grid
The expected value of some function u(e) with probability density function f(e) is simply $\int_{-\infty}^{+\infty}u\left(e\right)f\left(e\right)de$. The computational problem is that we do not have a closed-form expression for this integral. Hence, numerical approximation methods must be used. The most simplistic way to deal with this problem is simply to divide the range of e into a grid and then calculate the probability of falling into each portion of the grid. This process is demonstrated in the figure below.
[Figure 2: the PDF f(e) divided into ten equal-width grid cells centered at e_1, ..., e_10.]
In this case the expected value of u(e) would be approximated using the sum $\sum_{i=1}^{10}u\left(e_{i}\right)w\left(e_{i}\right)$, where w(e_i) is the probability weight associated with the grid cell centered at e_i. The value of w(e_i) is equal to the area under f(e) in the grid box centered at e_i, with an adjustment to account for the fact that we have truncated off the ends of the distribution, i.e.,

$w\left(e_{i}\right)=\frac{\int_{\underline{e}_{i}}^{\bar{e}_{i}}f\left(e\right)de}{\int_{\underline{e}_{1}}^{\bar{e}_{10}}f\left(e\right)de}$,

where $\bar{e}_{i}$ and $\underline{e}_{i}$ are the upper and lower bounds on the grid cell centered at e_i.

¹ Miranda and Fackler's notes are probably the easiest-to-read option of the three sources noted. Press et al.'s Numerical Recipes for Fortran 77 is also quite readable and has the advantage of including well-commented Fortran 77 code, which you should be able to translate into VB or any other language. These can also be accessed through the internet at http://www.nr.com/.
This is a fairly straightforward process; you could even use a spreadsheet to generate values for w(e_i) for any grid size.
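As a minimal sketch of that computation (this assumes an Excel-VBA setting, where WorksheetFunction.NormSDist returns the standard normal CDF; the bounds elow and ehigh and the cell count ncell are illustrative assumptions):

estep = (ehigh - elow) / ncell
total = WorksheetFunction.NormSDist(ehigh) - WorksheetFunction.NormSDist(elow)
For i = 1 To ncell
    eLo = elow + (i - 1) * estep                        ' lower bound of cell i
    eHi = eLo + estep                                   ' upper bound of cell i
    w(i) = (WorksheetFunction.NormSDist(eHi) _
          - WorksheetFunction.NormSDist(eLo)) / total   ' cell probability, renormalized for truncation
    e(i) = 0.5 * (eLo + eHi)                            ' cell midpoint e_i
Next i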
B. Numerical integration using non-uniform grids
While the uniform grid approach is quite straightforward and intuitive, it is not very efficient. For example, we're getting just as much information about the points e_1 and e_10 as we are about e_5 and e_6, despite the fact that the points located in the center of the distribution carry much more weight in our expectation. An efficient algorithm would spread out the cells in such a way as to get as precise an estimate of the true expectation as possible for any fixed number of grid points.
There are numerous methods that are used to accurately approximate a continuous integral. The Gaussian Quadrature methods are efficient methods for integrating smooth functions. For a detailed discussion of these methods, I refer you to the above-mentioned sources.
The basic idea in Gaussian Quadrature methods is that the points are chosen wisely so that a more accurate approximation of the expectation can be achieved. The basic principle is seen in Figure 3: grid points towards the tails are spaced further apart than the grid points near the mean (though the differences are exaggerated in the figure).
[Figure 3: quadrature nodes e_1, ..., e_10 under the PDF, spaced more widely in the tails than near the mean.]
The formulas that are used to calculate the values of e_i and w(e_i) are quite complicated and involve some pretty tricky programming. Fortunately, well-tested subroutines are available for your use. The Gauss-Normal Quadrature points and the associated weights for grids of 2, 3, 4, 5, 6, 7, 8, 9, 10, 15, 20 and 25 points are in the file GAUNRM.txt, which can be downloaded from the class homepage (follow the link to Programs). The values in that file are for a standard normal distribution with mean of zero and standard deviation of 1.0. The nodes would need to be adjusted for non-standard normal distributions.
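The adjustment is a simple change of variables. As a minimal sketch (mu and sigma here are assumed names for the mean and standard deviation of your shock, and eStd() holds the standard normal nodes read from the file):

For i = 1 To npts
    e(i) = mu + sigma * eStd(i)   ' rescale each node; the weights w(i) are unchanged
Next i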
C. A programming note
Solving a stochastic DP problem involves finding, in each stage and for each state, the z that maximizes E[u(z,x,ε)+V(x_{t+1})]. There are two ways you might address this problem in your program. Suppose ε takes on only two values, say ε_1 and ε_2, with probabilities p_1 and p_2. With only a small number of probabilities you might take your expectation directly using commands such as the following:

u1 = u(z, x, eps1)
xnext1 = g(z, x, eps1)
Vnext1 = V(xnext1)
u2 = u(z, x, eps2)
xnext2 = g(z, x, eps2)
Vnext2 = V(xnext2)
EV = p1*(u1 + Vnext1) + p2*(u2 + Vnext2)
You would then compare EV with V(x) and, if it is better, store it; if not, move on to the
next value of z.
Alternatively, you can build a loop over your probabilities. This is accomplished by putting your ε's and your p's in arrays, say eps(neps) and probs(neps). You then loop over each value of ε, gradually adding to the sum EV.

EV = 0#
For ieps = 1 To neps
    epsnow = eps(ieps)                        ' choose the value of the random variable
    Call UtilityFunction                      ' evaluate u()
    Call StateEquation                        ' evaluate the state equation, g()
    Call VtPlus1Calc                          ' evaluate V(x_{t+1})
    EV = probs(ieps)*(utility + Vnext) + EV   ' add to the sum to obtain EV
Next ieps

Note that the trick here is to set EV = 0 before we start the loop and then gradually add the weighted value function to it as we change the value of ε.
If you are using a precise approximation of a continuous probability distribution, looping in this way makes a lot of sense. It also adds to your programming flexibility: you can use a coarse probability distribution for early runs and a more precise one for your final run once you know that your program is running correctly.
VIII. Methods for solving CC problems
A. Discretize the control space
The simplest approach is to treat the control variable as if it were a discrete variable. Suppose, for example, that you were interested in a variable z that can take on any value between zero and one. Instead of using the infinite number of values between zero and one, perhaps you can get a sufficiently precise answer by only looking at the n+1 values, say

$Z=\left\{0,\tfrac{1}{n},\tfrac{2}{n},\ldots,\tfrac{n-1}{n},1\right\}$.

By treating your variable as if it were discrete, you have greatly simplified your problem, and you can now solve each state problem by simply evaluating which of these n+1 options is the best.
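As a minimal VB sketch of this discretized control loop for a single, deterministic state problem (u(), g() and the value-function approximation Vhat() are assumed, hypothetical routines):

best = -1E+30                        ' running best value; starts very low
For iz = 0 To n
    z = iz / n                       ' candidate control on the grid Z
    xnext = g(z, x)                  ' next period's state
    val = u(z, x) + Vhat(xnext)      ' current benefit plus approximated future value
    If val > best Then
        best = val                   ' keep the best value and the associated control
        zstar = z
    End If
Next iz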
How tight should your control grid be? This depends on your needs. If you are interested in qualitative results from your model, i.e., the general trends, then you should tighten your grid until any further tightening does not alter the qualitative features of your results. If you need precise results, then your grid will probably need to be tighter.
The tighter your grid, the more precise will be your results. However, a tighter grid means a slower program. Let's take a simple example. If you have one choice variable, then doubling the number of points in your control grid will approximately double the time it takes your program to run. If you have two control variables, then doubling the number of points in each dimension will quadruple your run time. In general, an n-fold increase in the grid points of a problem with m choice variables will increase your run time by a factor of n^m. If you have two control variables and your program takes 30 seconds to run with 10 points in each choice grid, then increasing your grid to 100 points in each dimension would increase your run time to 3,000 seconds, or 50 minutes!
Tip: If you use the discretization approach, debug your program with a quite sparse grid (relatively few points) and then increase your precision once your program is running and you think it is giving you the correct answers.
B. Using a hill-climbing algorithm
There are a wide variety of software packages, such as GAMS, that solve optimization problems for continuous variables. They do this by carrying out an organized search over the continuous range of your choice variables between user-defined upper and lower bounds. Although modern algorithms are more sophisticated, Newton's method is representative: starting with an initial guess, successively better guesses of the optimum are made using the function and its first derivative. In general, hill-climbing algorithms make extensive use of the first and, sometimes, second derivatives of your objective function and constraints to move quickly to the optimum.
A hill-climbing algorithm can be plugged into your solution algorithm at the point where the control loop normally fits. This approach has some advantages and disadvantages when compared to a grid-search method. First, good hill-climbing algorithms search over the continuous choice space, so it is possible to come very close to the exact solution to the state problem in every loop. Second, particularly given their accuracy, these algorithms can be quite fast; having to loop over a very fine grid of control options would be comparatively quite slow.
The disadvantage of this approach lies in how much confidence one can have in the solution to each state problem. Recall that we would use the hill-climbing algorithm to solve the problem

$V\left(x_{t}\right)=\max_{z_{t}}E\left[u\left(z_{t},x_{t},\varepsilon_{t}\right)+V\left(x_{t+1}\right)\right]$

subject to the state equation $x_{t+1}=g\left(z_{t},x_{t},\varepsilon_{t}\right)$.

The recursive algorithm for solving finite- or infinite-horizon problems requires that this problem be solved at each point in the state space, i.e., the correct answer must be obtained. It is well known, however, that for highly nonlinear problems, numerical hill-climbing algorithms may not yield the correct solution. If a black box is used, i.e., a computer program into which most users cannot look, one does not have complete confidence that a global maximum has been achieved. Hence, our confidence in our final results is diminished if this method is applied.
The problems with the use of hill-climbing algorithms are particularly severe if rounding or linear interpolation is used to approximate the value function. Hill-climbing algorithms make use of the derivatives of your objective function, but when rounding is used the approximate value function itself is discontinuous, and when linear interpolation is used its first derivatives are discontinuous. I have also had problems with this approach when approximating the value function with a Chebyshev polynomial. Hence, if you decide to use a hill-climbing algorithm to find the optimal choice at each state, it is best to approximate your value function in a way that preserves continuity in the first derivatives of the value function. Also, if you believe that the value function is monotonically increasing in x, then you should ensure that the approximating value function satisfies this property as well.
C. Closed form solution for particular functional forms²
If it is assumed that the value function takes on a particular functional form, e.g., a
polynomial, then it is sometimes possible to find a closed form solution to the Bellman's
equation. Let's look at this in more detail.
Suppose, for example, we assume or know, based on some prior information, that the value function of a two-dimensional DP problem takes the form

4. $V\left(x_{1},x_{2}\right)=a_{00}+a_{10}x_{1}+a_{20}x_{1}^{2}+a_{01}x_{2}+a_{02}x_{2}^{2}+a_{11}x_{1}x_{2}$
where the a_ij are parameters that we need to identify. The (k+1)-th approximation of the value function is found by solving the problem

5. $V^{k+1}\left(x_{1},x_{2}\right)=\max_{z_{t}}E\left[u\left(z_{t},x_{t}\right)+a_{00}^{k}+a_{10}^{k}g^{1}+a_{20}^{k}\left(g^{1}\right)^{2}+a_{01}^{k}g^{2}+a_{02}^{k}\left(g^{2}\right)^{2}+a_{11}^{k}g^{1}g^{2}\right]$

with $x_{t+1}^{1}=g^{1}\left(z_{t},x_{t},\varepsilon_{t}\right)$ and $x_{t+1}^{2}=g^{2}\left(z_{t},x_{t},\varepsilon_{t}\right)$, where $a_{ij}^{k}$ is the k-th approximation of the ij-th coefficient of the value function.

² I've only seen this method applied once, in a paper by Androkovich, Robert A. and Kenneth R. Stollery. 1994. "A Stochastic Dynamic Programming Model of Bycatch Control in Fisheries." Marine Resource Economics 9:19-30. Nonetheless, it's an interesting approach and helps highlight how we solve DP problems in practice.
What is particularly attractive about this specification is that for relatively simple probability distributions and state equations it is possible to obtain closed-form solutions for the optimal policy function of the i-th choice variable, say $z_{i}^{*}\left(x;a\right)$. This policy function indicates the best choice as a continuous function of all possible values of the state variables, contingent on a particular set of coefficients of the value function, a={a_ij}. If an analytical representation of $z_{i}^{*}\left(x;a\right)$ can be obtained, then two approaches might be taken.
First, a numerical approach could be taken in which a set of grid points, X, is specified so that the values V(x) can be calculated explicitly for all x∈X by plugging $z_{i}^{*}\left(x;a\right)$ into 5. With this set of values, the (k+1)-th set of coefficients could be determined using, for example, OLS approximation.
Alternatively, an analytical representation of the value function V^{k+1}(x) can be found. Since this almost certainly would not take the same form as in 4, Androkovich and Stollery (1994) suggest taking a Taylor approximation of the value function to obtain a new set of coefficients, $a_{ij}^{k+1}$. In either case, the solution of the infinite-horizon problem could be found by iterating until $\left|a_{ij}^{k+1}-a_{ij}^{k}\right|<\delta$ for some critical value δ.
If, on the other hand, the true underlying value function cannot accurately be depicted using a second-order polynomial like 4, then this approach will lead to erroneous results. Moreover, the approach is intrinsically inconsistent in that it never obtains a value function V(x) such that $V\left(x_{t}\right)=\max_{z_{t}}E\left[u\left(z_{t},x_{t},\varepsilon_{t}\right)+V\left(x_{t+1}\right)\right]$.
D. Other approaches
There are two other approaches that are frequently used to solve CC problems: what I'll call Euler equation iteration, and linear-quadratic approximation methods. Both of these draw on the fact that CC problems are differentiable. Before introducing these, it will be useful to go over a little theory.
IX. A little theory about infinite-horizon problems
A. The Euler Equilibrium conditions
The key theoretical feature that distinguishes CC problems from problems with discrete choices is the ability to apply the standard principles of differential calculus to the problem. The Bellman's equation of an infinite-horizon problem takes the following form:

$V\left(x_{t}\right)=\max_{z_{t}}E\left[u\left(z_{t},x_{t},\varepsilon_{t}\right)+V\left(x_{t+1}\right)\right]$

where z, x and ε can be scalars or vectors and $x_{t+1}^{i}=g^{i}\left(z_{t},x_{t},\varepsilon_{t}\right)$ for each state variable, i=1,...,m. If the functions u and all the g^i are differentiable in z and x, and V() is differentiable in x, then we know that, for an unconstrained DP problem, the first-order conditions would be satisfied at the optimum for each choice variable z_j, i.e.,

$E\left[\frac{\partial u}{\partial z_{j}}+\sum_{i=1}^{m}\frac{\partial V}{\partial x_{i}}\frac{\partial g^{i}}{\partial z_{j}}\right]=0$.
Letting $\lambda_{i}\left(x\right)\equiv\frac{\partial V}{\partial x_{i}}$ and applying the envelope theorem to the problem,

6. $\lambda_{j}\left(x\right)=E\left[\frac{\partial u}{\partial x_{j}}+\sum_{i=1}^{m}\frac{\partial V}{\partial x_{i}}\frac{\partial g^{i}}{\partial x_{j}}\right]=E\left[\frac{\partial u}{\partial x_{j}}+\sum_{i=1}^{m}\lambda_{i}\frac{\partial g^{i}}{\partial x_{j}}\right]$.
The equations in 6 are typically referred to as the Euler conditions. You should see in
them a close similarity to the maximum conditions of optimal control. In particular, if
you look at Dorfman's derivation, you'll find the deterministic version of the conditions
we have here.
If the problem is subject to intratemporal constraints, the Euler conditions would be
altered to reflect the Kuhn-Tucker conditions but the intuition is fundamentally the same.
B. The steady state and certainty-equivalent steady state of CC problems
For many deterministic DP problems, the optimal strategy will lead to a steady state. That is, following the policy rule set out by your policy function, z*(x), will lead to an evolution in the state space that converges to a steady state. Consider, for example, a simple problem in which the optimal policy function takes a linear form, z*(x)=αx, and the state equation is x_{t+1}=x_t+h(x_t)−z_t for some growth function h(). In this case, following the optimal policy would lead the state variable to a unique steady-state value, as in the figure below, from any initial starting value.
[Figure: the time path of x_t under the optimal policy, converging to a unique steady-state value.]
An appreciation of the steady state can be quite useful in understanding a problem. This
is particularly true when the steady state is reached quickly so that it can be safely
assumed that the agents you are studying will probably be at the steady state at any time.
The steady state of infinite-horizon problems with m state variables and n control variables can be found by solving three sets of equations:

$E\left[\frac{\partial u}{\partial z_{j}}+\sum_{i=1}^{m}\lambda_{i}\frac{\partial g^{i}}{\partial z_{j}}\right]=0$ for j=1,...,n,

$\lambda_{j}=E\left[\frac{\partial u}{\partial x_{j}}+\sum_{i=1}^{m}\lambda_{i}\frac{\partial g^{i}}{\partial x_{j}}\right]$ for j=1,...,m, and

$x_{j}=g^{j}\left(z,x\right)$ for j=1,...,m.
That is, at the optimum steady state, the FOCs of the problem must be satisfied and the state variables must not be changing over time. Note that solving for the steady state does not require knowledge of V; instead, it is based on information about the slope of the value function, λ, at the steady state. While it still may be impossible to analytically solve this system of equations for closed-form expressions for the variables z_i, x_j and λ_j, the system is well specified and it should be possible to solve it numerically (see Judd, 1998, chapter 5).
X. Solution methods for CC problems that utilize the optimality conditions
A. Linear quadratic (LQ) approximation³
A method that has been quite widely used to solve CC dynamic programming problems is to assume that the problem you're interested in solving actually falls into a class of problems for which a nice clean solution exists. If the state equations, g^i(), are linear in z and x and the benefit function, u(), is quadratic, then it is possible to find an analytical solution to stochastic DP problems. This has led to a great deal of analysis of these types of problems. In many presentations of the material covered in this class, LQ problems are presented separately and analyzed in depth. In these notes such specifications are given substantially less emphasis, as I see them as one more means of finding an approximate solution to a true DP problem. If the true problem that you want to solve fits the LQ requirements, then it is obviously important to use the LQ methods to solve it. If your problem does not meet these quite restrictive conditions, then using this approach is just one more way to find an approximate solution to your true underlying problem. In some instances, particularly in the neighborhood of the certainty-equivalent steady state, this approach might be quite useful.

³ Details of this section are taken from Miranda and Fackler (1999).
In an LQ problem, the benefit function takes the form

$u\left(x,z\right)=A_{0}+A_{1}x+A_{2}z+\tfrac{1}{2}x'A_{3}x+x'A_{4}z+\tfrac{1}{2}z'A_{5}z$

where z and x are n×1 and m×1 vectors, A_0 is a scalar, A_1 and A_2 are 1×m and 1×n vectors, and A_3, A_4 and A_5 are conformable matrices.
The state equations in the LQ setup are linear functions of the state and control variables:

$x_{t+1}=G_{0}+G_{1}x_{t}+G_{2}z_{t}+\varepsilon_{t}$

where G_0 is an m×1 vector, G_1 and G_2 are m×m and m×n matrices, and ε is an m×1 vector of random shocks with zero mean.
What makes these types of problems particularly important is that they can be solved explicitly. The policy function and shadow price function λ() are linear functions of the state variables:

$z\left(x\right)=Z_{0}+Z_{x}x$
$\lambda\left(x\right)=\Lambda_{0}+\Lambda_{x}x$

in which Z_0 is an n×1 vector, Z_x is an n×m matrix, Λ_0 is an m×1 vector and Λ_x is an m×m matrix.
The parameter matrices Λ_0 and Λ_x are characterized by the nonlinear Riccati equations. Riccati equations are fixed-point equations that define the coefficients of z(x) and λ(x) above: the elements of the matrices Λ_0 and Λ_x appear on both the right- and left-hand sides of the Riccati equations. The solution of these equations is discussed in Judd (1998, p. 432) and in Miranda and Fackler.
One thing that is particularly interesting about the solution to these problems is that the
solution is entirely independent of the type of stochastic shock. Regardless of the
distribution of the shock, the problem will have the same solution.
B. Using LQ approximation around the certainty-equivalent steady state
One way that the LQ method can be particularly useful is to describe the behavior of a system around the certainty-equivalent steady state (CESS). In this case the first step is to find the variables at the CESS, say x*, λ* and z*.
The second step is to take first- and second-order Taylor series approximations of the
state equations and benefit function respectively at the CESS.
The third step is to then solve the approximate LQ problem. The resulting solution
should yield quite reasonable estimates of the optimal policies in the neighborhood of the
CESS. This could then be used to analyze the behavior of the system in the long run.
For example, this approach might give a quite accurate approximation of the long-term
reaction to a one-period policy change.
LQ methods have advantages and disadvantages when compared to methods that rely on approximating the value function. The numerical methods give an approximate solution to a problem very close to the one you're interested in; LQ methods give an exact solution to a problem that is a rough approximation of the one you're interested in. You choose your poison.
C. Euler equation iteration
In the standard successive approximation technique we update our approximations of the value function. That is, we take one guess at the value function, say $V^{0}(x)$, and then use that to get a new value function, $V^{1}(x)$. By repeating this operation the algorithm converges to a value function V(x) that can then appear on both the right- and left-hand sides of the Bellman's equation

$V\left(x_{t}\right)=\max_{z_{t}}E\left[u\left(z_{t},x_{t},\varepsilon_{t}\right)+V\left(x_{t+1}\right)\right]$.
An alternative approach is to use a similar successive approximation algorithm on the Euler equations. In this case the unknown that we need to successively approximate is the costate variable λ(x). The algorithm follows a pattern much like the successive approximation of the value function.
1. Initialization step: make an initial guess of the values of λ at each point in your grid, say $\lambda^{0}\left(x_{i}\right)$.
Then, for k=1, 2, ...
2. Update the policy function: for each point in your state grid, solve the following system of equations for a set of candidate optimal policies, say z:

$E\left[\frac{\partial u\left(z,x\right)}{\partial z_{i}}+\sum_{j=1}^{m}\lambda_{j}^{k-1}\left(x\right)\frac{\partial g^{j}\left(z,x\right)}{\partial z_{i}}\right]=0$ for each choice variable i.

This system of equations could be solved numerically.
3. Update the costate variables: the candidate policies for each point in your grid are then plugged into the Euler equations in 6 to obtain updated costate values, $\lambda_{i}^{k}\equiv\frac{\partial V^{k}}{\partial x_{i}}$.
4. Convergence check: compare $\lambda_{j}^{k}\left(x\right)$ with $\lambda_{j}^{k-1}\left(x\right)$. If the difference is small, then stop. If not, return to step 2 and continue.
As with value function approximation, a variety of methods can be used to approximate
the function (x), including rounding, interpolation, or functional approximation.
XI. Problems that allow for analytical solutions of the Bellman's equation
Occasionally a problem arises that makes it possible for the optimal choice variable to be found analytically. For example, suppose that x takes on only two values, x_1 and x_2, and that the choice variable, z, affects the probability of being in state 1 according to the continuously differentiable function P(z,x). In this case the Bellman's equation could be written:

$V\left(x_{i}\right)=\max_{z_{t}}\left[u\left(x_{i},z_{t}\right)+P\left(z_{t},x_{i}\right)V\left(x_{1}\right)+\left(1-P\left(z_{t},x_{i}\right)\right)V\left(x_{2}\right)\right]$.
In this case, the first-order condition for the optimum is

$\frac{\partial u\left(x_{i},z_{t}\right)}{\partial z}+\frac{\partial P\left(z_{t},x_{i}\right)}{\partial z}V\left(x_{1}\right)-\frac{\partial P\left(z_{t},x_{i}\right)}{\partial z}V\left(x_{2}\right)=0$.

The functional forms for u() and P() may be such that this equation can be solved analytically. Under these circumstances, rather than using an approximate solution from a hill-climbing or grid-search approach, the analytical solutions can be used to obtain the exact optimal choices at each iteration of the stage loop.
For example, suppose that x_1=2, x_2=1, P(z,x)=(1−z) with z∈[0,1], and $u=x_{t}z_{t}^{\alpha}$. In this case the Bellman's equation would be

$V\left(x_{i}\right)=\max_{z_{t}}\left[x_{i}z_{t}^{\alpha}+\left(1-z_{t}\right)V\left(x_{1}\right)+z_{t}V\left(x_{2}\right)\right]$

and the optimal choice of z for interior solutions can be found using the FOC

$\alpha x_{i}z_{t}^{\alpha-1}-V\left(x_{1}\right)+V\left(x_{2}\right)=0$,

yielding the optimal value of z,

7. $z^{*}=\left(\frac{V\left(x_{1}\right)-V\left(x_{2}\right)}{\alpha x_{i}}\right)^{\frac{1}{\alpha-1}}$.
Starting with a terminal value of, say, V(x_1)=2 and V(x_2)=1, one could find the optimal values z*, substitute them into the Bellman's equation to get updated values of V, and repeat.
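A minimal VB sketch of this iteration follows; the exponent alpha=0.5, the 20-stage horizon, and the clamping of z to [0,1] (equation 7 characterizes only interior solutions) are illustrative assumptions, and the sketch presumes V(x_1)>V(x_2) so that 7 is well defined:

alpha = 0.5
V1 = 2#: V2 = 1#                                     ' terminal values V(x1) and V(x2)
For t = 1 To 20                                      ' step back through 20 stages
    z1 = ((V1 - V2)/(alpha*2#))^(1#/(alpha - 1#))    ' equation 7 at x1=2
    If z1 > 1# Then z1 = 1#
    z2 = ((V1 - V2)/(alpha*1#))^(1#/(alpha - 1#))    ' equation 7 at x2=1
    If z2 > 1# Then z2 = 1#
    V1new = 2#*z1^alpha + (1# - z1)*V1 + z1*V2       ' updated Bellman values
    V2new = 1#*z2^alpha + (1# - z2)*V1 + z2*V2
    V1 = V1new: V2 = V2new
Next t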
XII. References
Androkovich, Robert A., and Kenneth R. Stollery. 1994. "A Stochastic Dynamic Programming Model of Bycatch Control in Fisheries." Marine Resource Economics 9(1):19-30.
Press, William H., Brian P. Flannery, Saul A. Teukolsky and William T. Vetterling. 1989. Numerical Recipes: The Art of Scientific Computing (FORTRAN Version). Cambridge: Cambridge University Press.
Rust, John. 1997. "Using Randomization to Break the Curse of Dimensionality." Econometrica 65(May):487-516.
Woodward, Richard T., Yong-Suhk Wui and Wade L. Griffin. 2005. "Living with the Curse of Dimensionality: Closed-Loop Optimization in a Large-Scale Fisheries Simulation Model." American Journal of Agricultural Economics 87(Feb.):48-60.
XIII. Readings for next class
Kamien and Schwartz, pp. 202-217