
Jessica Wachter Class notes for Finance 911

VIII Characterizing Optimal Consumption and Investment Policies: Dynamic Programming
1. Markov property

2. Recursive formulation of the dynamic problem

3. Euler equation

4. Example: logarithmic utility

5. Infinite horizon recursive formulation

6. Representative agent revisited

1 Markov property

Let Sjt denote the ex-dividend price of asset j at time t and xjt the dividend on this
asset. Then the net return on asset j equals
  rj,t+1 = (Sj,t+1 + xj,t+1)/Sjt − 1,

for j = 1, . . . , M . Let rf,t+1 denote the return from period t to t + 1 on the riskfree
asset. Note that rj is adapted but rf is predictable.

Definition (Conditional expectations). Given a process zt adapted to F,

  Et[zt+s] = E[zt+s | Ft] is the random variable equal to E[zt+s | at] in event at.

Definition (Markov Property). The distribution of returns rj,t+1 (∀j) at time t and the
value of the riskfree rate rf,t+1 depend only on a state vector with finitely many elements
Yt . Moreover the distribution of Yt+1 at time t depends only on Yt .

The Markov property implies that, for any function g such that the expectation exists, and any s > t,

  Et[g(rf,s, r1,s, . . . , rM,s)] = E[g(rf,s, r1,s, . . . , rM,s) | Yt].

That is, knowing Yt is enough to know the whole conditional distribution.
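To make this concrete, here is a minimal simulation sketch. Everything in it is made up for illustration: a two-state chain Yt with hypothetical transition probabilities, and a normal return distribution in each state. The point is that the conditional expectation is computed from Yt alone, never from the past path:

```python
import random

random.seed(0)

P = {0: [0.9, 0.1], 1: [0.3, 0.7]}                   # P(Y_{t+1} = 0 or 1 | Y_t)
mu = {0: 0.02, 1: 0.08}                              # mean return in each state
sigma = {0: 0.05, 1: 0.20}                           # return volatility in each state

def step(y):
    """Draw (Y_{t+1}, r_{t+1}) given only the current state Y_t = y."""
    y_next = 0 if random.random() < P[y][0] else 1
    r_next = random.gauss(mu[y_next], sigma[y_next])
    return y_next, r_next

def cond_mean(y, n=200_000):
    """Monte Carlo estimate of E[r_{t+1} | Y_t = y] (i.e., g = identity)."""
    return sum(step(y)[1] for _ in range(n)) / n

# Exact conditional means: 0.9*0.02 + 0.1*0.08 = 0.026 and 0.3*0.02 + 0.7*0.08 = 0.062.
assert abs(cond_mean(0) - 0.026) < 0.01
assert abs(cond_mean(1) - 0.062) < 0.01
```

Conditioning on any longer history would give the same answer, which is exactly the Markov property.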


2 Recursive formulation of the dynamic problem

Notation:

• rt = M × 1 vector of risky asset returns between times t − 1 and t. (adapted).

• rf t = riskless asset return between time t − 1 and t (predictable).

• θt = M × 1 vector of dollars in the risky assets at time t − 1 (predictable).

• αt = dollars in riskfree asset at t − 1 (predictable).

• Wt = individual’s wealth at time t (including consumption) (adapted).

• ct = consumption in period t (adapted).

• ι = M × 1 vector of 1s

Let θ = {θt }, c = {ct }.


Assumptions

• Markov property holds and Y is the vector of state variables.

• Investor’s utility:

  E[ Σ_{t=0}^{T} β^t u(ct) ]

  with u′ > 0 and u′′ < 0.

Claim: The budget constraint is

  Wt+1 = (1 + rf,t+1)(Wt − ct) + θt+1⊤(rt+1 − rf,t+1 ι)     (1)

for t = 0, . . . , T − 1, and cT ≤ WT.

• Why? By definition,

  Wt = ct + αt+1 + θt+1⊤ ι.

  We also know

  Wt+1 = αt+1(1 + rf,t+1) + θt+1⊤(ι + rt+1).


  Substituting in for αt+1:

  Wt+1 = (Wt − ct − θt+1⊤ ι)(1 + rf,t+1) + θt+1⊤(ι + rt+1).

  Collecting terms in θt+1 leads to (1).
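A minimal numerical check of this claim (all dollar amounts and returns are made up): compute Wt+1 once from the definitional accounting and once from equation (1), and verify they agree.

```python
# Hypothetical portfolio at time t.
W_t, c_t = 100.0, 3.0
theta = [20.0, 15.0]            # dollars in each risky asset, chosen at t
r = [0.07, -0.02]               # realized risky returns r_{t+1}
rf = 0.01                       # riskless return r_{f,t+1} (known at t)

alpha = W_t - c_t - sum(theta)  # dollars in the riskless asset, by definition

# Definitional wealth: alpha grows at 1 + r_f, each theta_j grows at 1 + r_j.
W_def = alpha * (1 + rf) + sum(th * (1 + rj) for th, rj in zip(theta, r))

# Equation (1): W_{t+1} = (1 + r_f)(W_t - c_t) + theta^T (r - r_f * iota).
W_eq1 = (1 + rf) * (W_t - c_t) + sum(th * (rj - rf) for th, rj in zip(theta, r))

assert abs(W_def - W_eq1) < 1e-9
```

The two expressions agree for any numbers, since (1) is just the definition with αt+1 substituted out.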

Define problem P:

  max_{c,θ} E[ Σ_{t=0}^{T} β^t u(ct) ]

subject to (1) and cT ≤ WT.


We now recursively define a function V. We first define the transition between t + 1 and t:

  V(Wt, Yt, t) = max_{ct, θt+1} { β^t u(ct) + E[V(Wt+1, Yt+1, t + 1) | Yt] }   subject to (1).

The boundary condition is

  V(WT, YT, T) = β^T u(WT).

V is called the value function.


Define problem P′: for each t = 0, . . . , T − 1, solve

  max_{ct, θt+1} { β^t u(ct) + E[V(Wt+1, Yt+1, t + 1) | Yt] }

subject to (1).
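The recursion in problem P′ can be sketched numerically by backward induction. The toy case below assumes log utility, a single riskless asset, and made-up parameters; the wealth grid, the consumption search grid, and the linear interpolation of the continuation value are all illustrative implementation choices, not part of the theory.

```python
import math

beta, rf, T = 0.95, 0.02, 5
step = 0.5
grid = [0.5 + step * i for i in range(400)]        # wealth grid 0.5, 1.0, ..., 200.0

def interp(vals, w):
    """Linear interpolation of a function tabulated on `grid`."""
    w = min(max(w, grid[0]), grid[-1])
    i = min(int((w - grid[0]) / step), len(grid) - 2)
    frac = (w - grid[i]) / step
    return (1 - frac) * vals[i] + frac * vals[i + 1]

# Boundary condition: V(W, T) = beta^T log W (consume everything at T).
V = [beta**T * math.log(w) for w in grid]

policy0 = {}
for t in range(T - 1, -1, -1):
    V_next, V = V, []
    for w in grid:
        best_val, best_c = -math.inf, None
        for j in range(1, 100):                    # search over c_t = (j/100) * W_t
            c = (j / 100) * w
            val = beta**t * math.log(c) + interp(V_next, (1 + rf) * (w - c))
            if val > best_val:
                best_val, best_c = val, j / 100
        V.append(best_val)
        if t == 0:
            policy0[w] = best_c

# The grid policy at t = 0 should be near the closed form derived in Section 4,
# c_t/W_t = (1 - beta)/(1 - beta^(T-t+1)) ~ 0.189 for these parameters.
assert abs(policy0[100.0] - (1 - beta) / (1 - beta**(T + 1))) < 0.03
```

Backward induction computes V one date at a time, starting from the boundary condition; the principle of optimality below is what justifies treating each date's maximization separately.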
Theorem (Principle of optimality). Solving problem P is equivalent to solving P′. That is,

  V(Wt, Yt, t) = max_{cs, θs+1, s≥t} Et[ Σ_{s=t}^{T} β^s u(cs) ].

We prove this by induction on the number of remaining periods. To keep notation simple, we assume there is only one risky asset and that rf,t = rf.

Proof. Consider time T. By the boundary condition,

  V(WT, YT, T) = β^T u(WT).

Consider

  max_{cT ≤ WT} β^T u(cT).


Because u′ > 0, the solution is cT = WT. This proves the result for time T.
Now assume that the result holds for time t + 1 and show that it holds for t. By the
induction step, we know
  V(Wt+1, Yt+1, t + 1) = Et+1[ Σ_{s=t+1}^{T} β^s u(ĉs) ]

for some {ĉs , s > t} that is adapted and satisfies the budget constraint from time t + 1
onwards. It follows from the definition of V (Wt , Yt , t) that, for some ĉt
" " T # #
X
V (Wt , Yt , t) = β t u(ĉt ) + E Et+1 β s u(ĉs ) Yt ,
s=t+1

where we have substituted in for V (Wt+1 , Yt+1 , t + 1) inside the expectation. It follows
from the Markov property and the law of iterated expectations that
  V(Wt, Yt, t) = Et[ Σ_{s=t}^{T} β^s u(ĉs) ],

where {ĉs, s = t, . . . , T} represents an adapted consumption path that satisfies the budget constraint from time t onwards.
For a given event at time t, let

  V* = max Et[ Σ_{s=t}^{T} β^s u(cs) ],     (2)

where the max is taken over all adapted consumption paths satisfying the budget constraint from time t onwards. Note that V* solves the problem “all at once.” It follows from the definition of V* that

  V ≤ V*.

Let {c*s, s = t, . . . , T} and {θ*s, s = t, . . . , T} be the solution to (2). Then wealth at time t + 1 satisfies

  Wt+1 = (1 + rf)(Wt − c*t) + θ*t+1(rt+1 − rf).

It follows from the induction step that

  V(Wt+1, Yt+1, t + 1) ≥ Et+1[ Σ_{s=t+1}^{T} β^s u(c*s) ].


Add β^t u(c*t) to both sides of this inequality and take the expectation at time t (using the law of iterated expectations). It follows that

  β^t u(c*t) + Et[V(Wt+1, Yt+1, t + 1)] ≥ V*.

By the Markov property, the LHS can be rewritten as

  β^t u(c*t) + E[V(Wt+1, Yt+1, t + 1) | Yt].

Because V(Wt, Yt, t) maximizes this quantity over all choices of ct and θt+1 subject to the budget constraint, it follows that V is at least as large as the LHS, and therefore

  V ≥ V*,

so V = V*. This completes the induction step, and therefore the proof.

3 Euler equation

Theorem. Assume either finitely many states or u bounded. Then there exists a solution c*(Wt, Yt, t), θ*(Wt, Yt, t), and

1. V(Wt, Yt, t) is strictly concave and increasing in Wt.

2. VW(Wt, Yt, t) = β^t u′(c*t)

3. u′(c*t) = β Et[u′(c*t+1)](1 + rf,t+1)

4. Et[u′(c*t+1)(rj,t+1 − rf,t+1)] = 0

We will prove 2–4.

Proof. • Proof of 2: For simplicity, we write ct rather than c*t and θt rather than θ*t.

First-order conditions using the Bellman equation:

– WRT ct:

  β^t u′(ct) + Et[ VW(t + 1) ∂Wt+1/∂ct ] = 0

We use the shorthand VW(t + 1) = VW(Wt+1, Yt+1, t + 1). Note the application of the chain rule.


– WRT θj,t+1:

  Et[ VW(t + 1) ∂Wt+1/∂θj,t+1 ] = 0

This equation holds for all j = 1, . . . , M.

To evaluate these derivatives, use the dynamic budget constraint

  Wt+1 = (1 + rf,t+1)(Wt − ct) + θt+1⊤(rt+1 − rf,t+1 ι)

and note that Wt is unaffected by ct and θt+1, while θt+1 is unaffected by ct. Therefore

  ∂Wt+1/∂ct = −(1 + rf,t+1)

and

  ∂Wt+1/∂θj,t+1 = rj,t+1 − rf,t+1.

Substituting these derivatives back into the FOCs implies:

  β^t u′(ct) − Et[VW(t + 1)(1 + rf,t+1)] = 0     (i)

  Et[VW(t + 1)(rj,t+1 − rf,t+1)] = 0     (ii)

Differentiate both sides of the Bellman equation with respect to Wt:

  VW(t) = β^t u′(ct) ∂ct/∂Wt + Et[ VW(t + 1) ∂Wt+1/∂Wt ].

The budget constraint allows us to evaluate the derivative ∂Wt+1/∂Wt:

  ∂Wt+1/∂Wt = (1 + rf,t+1)(1 − ∂ct/∂Wt) + (∂θt+1⊤/∂Wt)(rt+1 − rf,t+1 ι).

Therefore,

  VW(t) = β^t u′(ct) ∂ct/∂Wt + Et[ VW(t + 1)(1 + rf,t+1) ](1 − ∂ct/∂Wt)
            + Et[ VW(t + 1)(rt+1 − rf,t+1 ι)⊤ ](∂θt+1/∂Wt)
        = β^t u′(ct) ∂ct/∂Wt + β^t u′(ct)(1 − ∂ct/∂Wt)       [by (i) and (ii)]
        = β^t u′(ct)


• Proof of 3: From 2. and (i):

  β^t u′(ct) − Et[β^{t+1} u′(ct+1)(1 + rf,t+1)] = 0

  ⇒ u′(ct) = β Et[u′(ct+1)(1 + rf,t+1)]

• Proof of 4: From 2. and (ii):

  Et[β^{t+1} u′(ct+1)(rj,t+1 − rf,t+1)] = 0

Note: (3) and (4) together imply

  Et[β u′(ct+1)(1 + rj,t+1)] = u′(ct)   ∀j.

Why? Because

  u′(ct) = β Et[u′(ct+1)(1 + rf,t+1)] + β Et[u′(ct+1)(rj,t+1 − rf,t+1)].

Informal derivation of the Euler equation

What is the change in lifetime utility from shifting a small amount of consumption from
time t to time t + 1?
For small shifts away from the optimum, the change should be zero. Note that if we
reduce consumption at t, we could invest it in risky asset j and have that same amount,
multiplied by 1 + rj,t+1 . Let c∗ be the optimum and c be the new value. Then

  Δ in lifetime utility = β^t (u(ct) − u(c*t)) + β^{t+1} Et[ u(c*t+1 + (c*t − ct)(1 + rj,t+1)) − u(c*t+1) ]

Now use a first-order Taylor expansion around optimal consumption:

  Δ in lifetime utility ≈ β^t u′(c*t)(ct − c*t) + β^{t+1} Et[u′(c*t+1)(1 + rj,t+1)](c*t − ct)

At the optimum, this should be zero. Dividing by β^t (c*t − ct) implies

  u′(c*t) = β Et[u′(c*t+1)(1 + rj,t+1)]

Applying this with the riskfree asset gives us (3), and combining rj and rf gives us (4).
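The Euler equation can be checked numerically in the simplest possible setting. A minimal sketch, assuming a two-period problem with log utility, only the riskless asset, and arbitrary parameter values (with u = log, the optimum c0 = W/(1 + β) is standard):

```python
beta, rf, W = 0.96, 0.02, 100.0

# With u = log and c_1 = (1 + r_f)(W - c_0), the optimum is c_0 = W / (1 + beta).
c0 = W / (1 + beta)
c1 = (1 + rf) * (W - c0)

lhs = 1.0 / c0                       # u'(c_0)
rhs = beta * (1.0 / c1) * (1 + rf)   # beta * E[u'(c_1)(1 + r_f)], deterministic here

assert abs(lhs - rhs) < 1e-12        # Euler equation (3) holds at the optimum
```

Perturbing c0 in either direction makes lhs and rhs diverge, which is exactly the informal marginal argument above.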


4 Example: Logarithmic utility

The investor solves:

  max_{c,θ} E[ Σ_{t=0}^{T} β^t log ct ]

subject to

  Wt+1 = (1 + rf,t+1)(Wt − ct) + θt+1⊤(rt+1 − rf,t+1 ι)

• First characterize the value function. Conjecture

  V(Wt, Yt, t) = g(t) log Wt + k(Yt, t)

for some functions g and k. We verify the conjecture by induction. Note that the conjecture holds at time T:

  V(WT, YT, T) = β^T log WT.

We assume that the conjecture holds at t + 1 and show it holds at t. Set up the following change of variables:

  ĉt = ct/Wt,     θ̂t+1 = θt+1/Wt.

Define the normalized budget constraint:

  Wt+1/Wt = (1 + rf)(1 − ĉt) + θ̂t+1⊤(rt+1 − rf ι).
Apply the Bellman equation and the induction step:

  V(Wt, Yt, t) = max_{ct, θt+1} { β^t log ct + Et[g(t + 1) log Wt+1 + k(Yt+1, t + 1)] }

              = (β^t + g(t + 1)) log Wt
                + max_{ĉt, θ̂t+1} { β^t log ĉt + Et[ g(t + 1) log(Wt+1/Wt) + k(Yt+1, t + 1) ] }.

Recursively define

  g(t) = β^t + g(t + 1)

and

  k(Yt, t) = max_{ĉt, θ̂t+1} { β^t log ĉt + Et[ g(t + 1) log(Wt+1/Wt) + k(Yt+1, t + 1) ] }


where the maximization is subject to the normalized budget constraint. Note that
k in the equation above is a function of Yt and t alone. This verifies the conjecture.
Moreover, it follows from g(T) = β^T and the recursive definition of g that

  g(t) = Σ_{s=t}^{T} β^s.

• Now use the envelope condition to derive implications for the consumption policy:

  β^t / ct = g(t) / Wt,

so

  ct = [ β^t / (β^t + · · · + β^T) ] Wt = [ (1 − β) / (1 − β^{T−t+1}) ] Wt.

Implication: ct/Wt is non-stochastic (does not depend on Yt).
It may seem surprising that this does not depend on investment opportunities (the distribution of returns). In general, we would expect an improvement in investment opportunities to push in two directions. If investment opportunities are better, the agent wants to invest more to take advantage of them (substitution effect). However, the agent is also better off and can enjoy some of that wealth now by consuming more (income effect). In the log utility case, these cancel out exactly.
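A quick numerical check (with arbitrary β and T, purely for illustration) that the summation form and the closed form of the consumption-wealth ratio agree:

```python
beta, T = 0.95, 30

for t in range(T + 1):
    # c_t/W_t as beta^t / (beta^t + ... + beta^T) ...
    ratio_sum = beta**t / sum(beta**s for s in range(t, T + 1))
    # ... and as the closed form (1 - beta)/(1 - beta^(T-t+1)).
    ratio_closed = (1 - beta) / (1 - beta**(T - t + 1))
    assert abs(ratio_sum - ratio_closed) < 1e-12

# At t = T the ratio is (1 - beta)/(1 - beta) = 1: consume all remaining wealth.
assert abs((1 - beta) / (1 - beta**(T - T + 1)) - 1.0) < 1e-12
```

The ratio rises toward one as t approaches T, and it involves only β, t, and T, confirming that it is non-stochastic.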

• Implications for portfolio choice: It follows from the Euler equation that

  Et[u′(ct+1)(rj,t+1 − rf)] = 0.

Substituting in from the envelope condition:

  Et[ ((1 − β^{T−t})/(1 − β)) (1/Wt+1) (rj,t+1 − rf) ] = 0.

Therefore,

  Et[ (1/Wt+1)(rj,t+1 − rf) ] = 0.     (*)

Note that the above holds for all j and therefore constitutes M equations in M unknowns.


This equation allows us to derive a surprising implication of log utility: portfolio choice in the multi-period problem is the same as in the single-period problem. We can see this from equation (*). This is the same first-order condition that would obtain for a single-period investor who has initial wealth Wt − ct. Further, (*) implies that the optimal portfolio is independent of the investor’s horizon. Using the budget constraint,

  Wt+1 = (Wt − ct) [ (1 + rf) + (θt+1⊤/(Wt − ct))(rt+1 − rf ι) ].

Substituting in:

  Et[ (rj,t+1 − rf) / ( (1 + rf) + (θt+1⊤/(Wt − ct))(rt+1 − rf ι) ) ] = 0.     (**)

Let θ̃t+1 = θt+1/(Wt − ct); θ̃ is the portfolio allocation as a fraction of invested wealth. (**) implies that this quantity depends only on the distribution of returns between period t and t + 1, not on the investor’s horizon.

Conclusion: When the utility function is log, the multi-period problem becomes a succession of single-period problems, with no interaction between periods.
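As an illustration, the first-order condition (**) can be solved numerically for θ̃. The sketch below assumes a single risky asset with a hypothetical two-state return distribution and finds the root by bisection; note that nothing in the calculation depends on the horizon T.

```python
probs = [0.5, 0.5]     # hypothetical state probabilities
r = [0.20, -0.10]      # risky return in each state
rf = 0.02              # riskless return

def foc(theta):
    """E[(r - r_f) / (1 + r_f + theta*(r - r_f))]; equals zero at the optimum."""
    return sum(p * (ri - rf) / (1 + rf + theta * (ri - rf))
               for p, ri in zip(probs, r))

# foc is decreasing on [0, 5] with foc(0) > 0 > foc(5) for these numbers: bisect.
lo, hi = 0.0, 5.0
for _ in range(100):
    mid = 0.5 * (lo + hi)
    if foc(mid) > 0:
        lo = mid
    else:
        hi = mid
theta_star = 0.5 * (lo + hi)

# With two states the FOC is linear in theta after clearing denominators,
# giving theta = 0.0612/0.0432 ~ 1.417 for these numbers.
assert abs(theta_star - 0.0612 / 0.0432) < 1e-6
assert abs(foc(theta_star)) < 1e-9
```

Since only the one-period return distribution enters foc, a 30-year investor and a 1-year investor with log utility would pick the same θ̃.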

5 Infinite horizon recursive formulation

We will sometimes be interested in what happens when we take the horizon out to infinity. We are then solving:

  max_{θ,c} E[ Σ_{t=0}^{∞} β^t u(ct) ]

subject to

  Wt+1 = (1 + rf,t+1)(Wt − ct) + θt+1⊤(rt+1 − rf,t+1 ι)   ∀t ≥ 0.

We would like a condition to replace our previous boundary condition cT ≤ WT. Now we need to be careful that the agent does not borrow more and more money (see Duffie, chapter 4, for details).

In practice, the solution to these problems is often less complicated than for the finite-horizon problems. Rather than applying backward induction, we can search for a fixed point. It is possible to show that solving this problem is equivalent to solving for:

  G(Wt, Yt) = max_{ct, θt+1} { u(ct) + β Et[G(Wt+1, Yt+1)] }


and that

  G(Wt, Yt) = Et[ Σ_{s=t}^{∞} β^{s−t} u(c*s) ]

where c*t is optimal consumption. Note that this expression does solve the above fixed-point problem, because

  Et[ Σ_{s=t}^{∞} β^{s−t} u(c*s) ] = u(c*t) + β Et[ Et+1[ Σ_{s=t+1}^{∞} β^{s−(t+1)} u(c*s) ] ].

Note that this is also the limit of the value function (divided by β^t) as the horizon goes to infinity.
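A sketch of the fixed-point approach in the simplest case (log utility, riskless asset only, made-up β): conjecturing G(W) = A log W + B reduces each application of the Bellman operator to a scalar recursion A ← 1 + βA, with optimal consumption share ĉ = 1/(1 + βA). Iterating from G ≡ 0 converges to A = 1/(1 − β) and ĉ = 1 − β (the intercept B follows a similar recursion and is omitted here).

```python
beta = 0.95

A = 0.0                      # start from G = 0 and iterate the Bellman operator
for _ in range(2000):
    A = 1.0 + beta * A       # coefficient on log W after one Bellman step

c_hat = 1.0 / (1.0 + beta * A)   # optimal consumption-wealth ratio at the fixed point

assert abs(A - 1.0 / (1.0 - beta)) < 1e-8
assert abs(c_hat - (1.0 - beta)) < 1e-8
```

This is the infinite-horizon limit of the finite-horizon rule of Section 4: as T − t → ∞, (1 − β)/(1 − β^{T−t+1}) → 1 − β.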

6 Representative agent revisited

We now return to the concept of a security markets equilibrium (SME). Let xjt denote the dividend on the jth risky security. Normalize the number of shares of each risky security to equal one. Assume that the riskless security is in zero net supply (the aggregate number of shares is zero).

Continue to assume a time-additive and state-separable utility function. Let Ct = Σ_i cit denote the aggregate endowment at time t. We will use the terms aggregate endowment and aggregate consumption interchangeably. Because we have normalized the number of shares of each security to equal one, Ct = Σ_{j=1}^{M} xjt.

Theorem. Assume that markets are dynamically complete. Consider a SME (securities market equilibrium) with M risky securities with return vector rt and one riskless security with return rf,t. Then there exists a representative agent with utility function ut such that Ct is a solution to

  max_{θ, C^R} E[ Σ_{t=1}^{T} ut(C^R_t) ]

subject to

  Wt+1 = (1 + rf,t+1)(Wt − C^R_t) + θt+1⊤(rt+1 − rf,t+1 ι)

  C^R_T ≤ WT

Note: requiring that C^R_t = Ct = Σ_{j=1}^{M} xjt is equivalent to requiring that the agent holds all shares in each security (as shown in the homework).
