UNIVERSITY OF COPENHAGEN DEPARTMENT OF ECONOMICS
Lecture 4: The Bellman Operator
Dynamic Programming
Jeppe Druedahl
Department of Economics
15th of February 2016
Infinite horizon, t → ∞
• We know
V^0(M_t) = whatever

V^1(M_t) = max_{C_t ∈ C(M_t)} { u(M_t, C_t) + β V^0(M_{t+1}) }

V^2(M_t) = max_{C_t ∈ C(M_t)} { u(M_t, C_t) + β V^1(M_{t+1}) }

V^3(M_t) = max_{C_t ∈ C(M_t)} { u(M_t, C_t) + β V^2(M_{t+1}) }

...

lim_{n→∞} V^n(M_t) = max_{C_t ∈ C(M_t)} { u(M_t, C_t) + β V^{n-1}(M_{t+1}) } ?

where M_{t+1} = Γ(M_t, C_t)
• Does the limit exist?
Operator notation
• Write the Bellman equation in the following general form
V^n(M_t) = max_{C_t ∈ C(M_t)} { u(M_t, C_t) + β V^{n-1}(Γ(M_t, C_t)) }   for all M_t ∈ M
• Alternatively in operator form
V^n(M_t) = J(V^{n-1})(M_t)   for all M_t ∈ M
• A fixed point is a function V such that
V(M_t) = J(V)(M_t)   for all M_t ∈ M
• Is there always a fixed point, and is it unique?
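• A minimal numerical sketch of the operator J (not from the lecture): an assumed cake-eating example with u = log, Γ(M, C) = M − C and β = 0.95 on a grid over M_t; the model, grid sizes and all names are illustrative assumptions.

```python
# Sketch of the Bellman operator J(V)(M) = max_{C in C(M)} { u(M,C) + beta*V(Gamma(M,C)) }
# for an assumed cake-eating example: u = log, Gamma(M,C) = M - C, beta = 0.95.
import numpy as np

beta = 0.95
grid_M = np.linspace(1e-6, 1.0, 200)             # grid over the state M_t

def J(V):
    """Apply the Bellman operator to a value function V defined on grid_M."""
    V_new = np.empty_like(V)
    for i, M in enumerate(grid_M):
        C = np.linspace(1e-6, M, 100)            # feasible choices, C(M) = (0, M]
        M_next = M - C                           # transition Gamma(M, C)
        V_next = np.interp(M_next, grid_M, V)    # evaluate V off the grid by interpolation
        V_new[i] = np.max(np.log(C) + beta * V_next)
    return V_new

V = np.zeros_like(grid_M)                        # V^0 = whatever (here: zero)
for n in range(1, 6):
    V_new = J(V)                                 # V^n = J(V^{n-1})
    print(n, np.max(np.abs(V_new - V)))          # sup-norm change; shrinks at (least at) rate beta
    V = V_new
```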
Contraction mapping requirement
• Let F(M) be the space of bounded continuous functions

Theorem
Assume u(M_t, C_t) is real-valued, continuous and bounded, 0 < β < 1, and the constraint set C(M_t) is non-empty, compact-valued and continuous. Then J has a unique fixed point V ∈ F(M), and for all V_0 ∈ F(M)

‖J^n(V_0) − V‖ ≤ β^n ‖V_0 − V‖,   n = 0, 1, 2, 3, . . .
• Full proof: Stokey and Lucas (1989), Theorem 4.6
• Main idea: apply Blackwell's sufficient conditions for a contraction mapping, requiring that J is
1 Monotone
2 Discounted
Monotone (requirement 1)
V(M_t) ≥ Q(M_t), ∀ M_t ∈ M  ⇒  J(V)(M_t) ≥ J(Q)(M_t), ∀ M_t ∈ M

C^*_V(M_t) ≡ arg max_{C_t ∈ C(M_t)} { u(M_t, C_t) + β V(Γ(M_t, C_t)) }
C^*_Q(M_t) ≡ arg max_{C_t ∈ C(M_t)} { u(M_t, C_t) + β Q(Γ(M_t, C_t)) }
• Insert into J(V)(M_t)

J(V)(M_t) = max_{C_t ∈ C(M_t)} { u(M_t, C_t) + β V(Γ(M_t, C_t)) }
          = u(M_t, C^*_V(M_t)) + β V(Γ(M_t, C^*_V(M_t)))
          ≥ u(M_t, C^*_Q(M_t)) + β V(Γ(M_t, C^*_Q(M_t)))
          ≥ u(M_t, C^*_Q(M_t)) + β Q(Γ(M_t, C^*_Q(M_t)))
          = J(Q)(M_t)
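• A quick numerical sanity check of this property (not a proof), using the same illustrative cake-eating operator as the earlier sketch; the functions V and Q below are arbitrary assumptions with V ≥ Q pointwise.

```python
# Numerical check of monotonicity: V >= Q pointwise implies J(V) >= J(Q) pointwise,
# using the assumed cake-eating operator (u = log, M_{t+1} = M_t - C_t, beta = 0.95).
import numpy as np

beta = 0.95
grid_M = np.linspace(1e-6, 1.0, 200)

def J(V):
    V_new = np.empty_like(V)
    for i, M in enumerate(grid_M):
        C = np.linspace(1e-6, M, 100)
        V_new[i] = np.max(np.log(C) + beta * np.interp(M - C, grid_M, V))
    return V_new

V = np.sqrt(grid_M)             # any pair of functions with V >= Q everywhere
Q = np.sqrt(grid_M) - 1.0
assert np.all(J(V) >= J(Q))     # monotonicity of J
```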
Discounted (requirement 2)

∃ γ ∈ (0, 1) : J(V + k)(M_t) ≤ J(V)(M_t) + γk for any constant k ≥ 0
• We have

J(V + k)(M_t) = max_{C_t ∈ C(M_t)} { u(M_t, C_t) + β (V(Γ(M_t, C_t)) + k) }
              = max_{C_t ∈ C(M_t)} { u(M_t, C_t) + β V(Γ(M_t, C_t)) } + βk
              = J(V)(M_t) + βk
              ≤ J(V)(M_t) + γk   for γ = β ∈ (0, 1)
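• The same illustrative operator can be used to check discounting numerically: adding a constant k to V raises J(V) by exactly βk, since the maximizing C_t is unchanged.

```python
# Numerical check of discounting: J(V + k) = J(V) + beta*k for a constant k,
# using the same assumed cake-eating operator as above.
import numpy as np

beta = 0.95
grid_M = np.linspace(1e-6, 1.0, 200)

def J(V):
    V_new = np.empty_like(V)
    for i, M in enumerate(grid_M):
        C = np.linspace(1e-6, M, 100)
        V_new[i] = np.max(np.log(C) + beta * np.interp(M - C, grid_M, V))
    return V_new

V = np.sqrt(grid_M)
k = 2.0
assert np.allclose(J(V + k), J(V) + beta * k)   # the maximizing C is unchanged, so equality holds
```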
• What could break down here?
Summarize
1 The uniqueness of the value function can be proven
2 Iteration on the value function can be proven to converge at a rate of β
3 Further properties:
   1 Monotonicity in states that expand the choice set
   2 Concavity if the choice set is convex and u is concave
   3 Differentiability (e.g. Benveniste and Scheinkman (1979), Clausen and Strub (2016))
   4 A unique policy function typically requires that the choice set is convex and u is strictly concave
   5 Boyd's Weighted Contraction Mapping Theorem can be used if returns are unbounded (see Carroll (2012))
Value function iteration (VFI)
Algorithm 11: Find the fixed point V
input : tol.
output: V[•], C*[•]
1  V[•] = 0 for all m ∈ M, δ = ∞
2  while δ > tol. do
3      V−[•] = V[•]
4      V[•], C*[•] = find_V(V[•])
5      δ = max(|V−[:] − V[:]|)
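• A minimal Python sketch of Algorithm 11, again for the illustrative cake-eating problem (u = log, M_{t+1} = M_t − C_t); the model, the grids, the tolerance and the helper name find_V are assumptions for illustration, not the lecture's code.

```python
# Value function iteration following Algorithm 11, for an assumed cake-eating problem.
import numpy as np

beta = 0.95
grid_M = np.linspace(1e-6, 1.0, 200)

def find_V(V):
    """One application of the Bellman operator; returns updated values and policy."""
    V_new = np.empty_like(V)
    C_star = np.empty_like(V)
    for i, M in enumerate(grid_M):
        C = np.linspace(1e-6, M, 100)                       # feasible choices C(M)
        vals = np.log(C) + beta * np.interp(M - C, grid_M, V)
        j = np.argmax(vals)
        V_new[i], C_star[i] = vals[j], C[j]
    return V_new, C_star

tol = 1e-6
V = np.zeros_like(grid_M)               # step 1: V[.] = 0 for all m
delta = np.inf
while delta > tol:                      # step 2
    V_old = V                           # step 3
    V, C_star = find_V(V)               # step 4
    delta = np.max(np.abs(V_old - V))   # step 5: sup-norm change, shrinks at (least at) rate beta
```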
Policy iteration
• Think of step n in VFI where, for all M_t ∈ M, we set

V^n(M_t) = u(M_t, C^{*n}(M_t)) + β E_t[ V^{n-1}(Γ(M_t, C^{*n}(M_t))) ]
C^{*n}(M_t) = arg max_{C_t} { u(M_t, C_t) + β E_t[ V^{n-1}(Γ(M_t, C_t)) ] }
• Alternative: Simulate forward for k periods using C^{*n}(•) as the decision rule, and update by

V^n(M_t) = Σ_{j=0}^{k} β^j u(M_{t+j}, C^{*n}(M_{t+j})) + β^{k+1} V^{n-1}(M_{t+k+1})

where M_{t+j+1} = Γ(M_{t+j}, C^{*n}(M_{t+j}))
• Better convergence? Yes, in terms of speed. No, in terms of basin of attraction
• When everything is discrete: the simulation can be replaced by inversion of a matrix! [Bertel will show you]
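• A minimal sketch of discrete policy iteration (illustrative assumptions throughout, not the lecture's code): policy evaluation by solving (I − βP_C)V = u_C, i.e. a matrix inversion, followed by greedy improvement. The utility matrix and transition probabilities below are random placeholders.

```python
# Policy iteration with discrete states and choices: evaluation by matrix inversion,
# then greedy improvement against the evaluated value function.
import numpy as np

rng = np.random.default_rng(0)
n_states, n_choices, beta = 50, 10, 0.95
u = rng.normal(size=(n_states, n_choices))                          # placeholder u(m, c)
P = rng.dirichlet(np.ones(n_states), size=(n_states, n_choices))    # P[m, c, m'] transition probs

C = np.zeros(n_states, dtype=int)                                   # initial policy
for _ in range(100):
    # policy evaluation: V = u_C + beta*P_C V  =>  V = (I - beta*P_C)^{-1} u_C
    P_C = P[np.arange(n_states), C]                                  # (n_states, n_states)
    u_C = u[np.arange(n_states), C]
    V = np.linalg.solve(np.eye(n_states) - beta * P_C, u_C)
    # policy improvement: greedy choice against the evaluated V
    C_new = np.argmax(u + beta * P @ V, axis=1)
    if np.array_equal(C_new, C):
        break
    C = C_new
```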
Guess and verify
• Consider the neoclassical growth model
V(K_t) = max_{C_t} log C_t + β V(K_{t+1})
s.t.
K_{t+1} = A K_t^α − C_t

• Assume that V(K_t) = a + b log K_t such that

a + b log K_t = max_{K_{t+1}} log(A K_t^α − K_{t+1}) + β (a + b log K_{t+1})
• The FOC then is

1/(A K_t^α − K_{t+1}) = βb/K_{t+1}  ↔  K_{t+1} = (βb/(1 + βb)) A K_t^α
• Insert the FOC and solve for a and b (independent of K_t)
a + b log K_t = log(A K_t^α − (βb/(1 + βb)) A K_t^α) + β (a + b log((βb/(1 + βb)) A K_t^α))
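• For reference, a short completion of the algebra (not on the slide): collect the terms in log K_t and match coefficients.

```latex
% Completing the coefficient matching: collect terms in \log K_t.
\begin{align*}
a + b\log K_t
  &= \log\!\Big(\frac{A K_t^{\alpha}}{1+\beta b}\Big)
     + \beta a + \beta b \log\!\Big(\frac{\beta b}{1+\beta b}\, A K_t^{\alpha}\Big) \\
  &= \alpha(1+\beta b)\log K_t + \text{terms independent of } K_t .
\end{align*}
% Matching coefficients on \log K_t:  b = \alpha(1+\beta b)  =>  b = \alpha/(1-\alpha\beta).
% Then \beta b/(1+\beta b) = \alpha\beta, so the policy function is K_{t+1} = \alpha\beta A K_t^{\alpha}.
```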
Projection methods
• Guess and verify is only possible for very special models
• Value and policy functions might, however, be well approximated by parametric functions (typically polynomials, Weierstrass theorem)
• Solve for the parameters numerically instead of solving
the maximization problems (relying on the first-order
conditions instead)
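• A minimal projection sketch (illustrative assumptions throughout, not the lecture's code): approximate the consumption policy of the growth model from the guess-and-verify slide by a polynomial in log K, and choose the coefficients so the Euler-equation (first-order condition) residuals are near zero on a grid of nodes; no maximization problem is solved directly.

```python
# Projection / collocation sketch for the growth model with log utility (assumed parameters).
import numpy as np
from scipy.optimize import least_squares

alpha, beta, A = 0.3, 0.95, 1.0
K_nodes = np.linspace(0.05, 0.5, 20)                 # collocation nodes for the state K_t

def C_hat(K, theta):
    """Parametric policy guess: log C is a polynomial in log K with coefficients theta."""
    return np.exp(np.polyval(theta, np.log(K)))

def euler_residuals(theta):
    C0 = C_hat(K_nodes, theta)
    K1 = np.maximum(A * K_nodes**alpha - C0, 1e-8)   # K_{t+1} = A K_t^alpha - C_t (kept positive)
    C1 = C_hat(K1, theta)
    # Euler equation with log utility: 1/C_t = beta * alpha * A * K_{t+1}^(alpha-1) / C_{t+1}
    return 1.0 / C0 - beta * alpha * A * K1**(alpha - 1) / C1

theta0 = np.array([0.0, 0.5, -0.5])                  # rough starting guess for the coefficients
sol = least_squares(euler_residuals, theta0)
# Accuracy check: the known closed form is C(K) = (1 - alpha*beta) * A * K**alpha.
```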
The Bellman equation
• The model:
1 A household gets utility from consumption and disutility from labor
2 The household's income depends on whether it works or not
3 The household accumulates human capital by working
4 It can save in an account with an interest rate of r
• Task: Write up the Bellman equation on the white board for
your choice of utility function, wage process and human
capital accumulation equation
Until next
• Ensure that you understand:
• Algorithm 11
• How to set up a Bellman equation
• Go to PadLet and ask or answer a question
(https://padlet.com/jeppe druedahl/dynamic programming)
• Think about: What is the problem with having, respectively:
1 Multiple states
2 Multiple choices
3 Multiple shocks