
UNIVERSITY OF COPENHAGEN — DEPARTMENT OF ECONOMICS

Lecture 4: The Bellman Operator

Dynamic Programming

Jeppe Druedahl
Department of Economics

15th of February 2016


Infinite horizon, t → ∞

• We know:
V^0(M_t) = whatever
V^1(M_t) = max_{C_t ∈ C(M_t)} { u(M_t, C_t) + β V^0(M_{t+1}) }
V^2(M_t) = max_{C_t ∈ C(M_t)} { u(M_t, C_t) + β V^1(M_{t+1}) }
V^3(M_t) = max_{C_t ∈ C(M_t)} { u(M_t, C_t) + β V^2(M_{t+1}) }
...
lim_{n→∞} V^n(M_t) = max_{C_t ∈ C(M_t)} { u(M_t, C_t) + β V^{n−1}(M_{t+1}) } ?

where M_{t+1} = Γ(M_t, C_t)
• Does the limit exist?

Operator notation

• Write the Bellman equation in the following general form:

V^n(M_t) = max_{C_t ∈ C(M_t)} { u(M_t, C_t) + β V^{n−1}(Γ(M_t, C_t)) }   for all M_t ∈ M

• Alternatively, in operator form

V^n(M_t) = J(V^{n−1})(M_t)   for all M_t ∈ M

• A fixed point is a function V such that

V(M_t) = J(V)(M_t)   for all M_t ∈ M

• Is there always a fixed point, and is it unique?
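As a concrete illustration of the operator view (my own toy example, not from the slides), the sketch below writes J for a problem with two discrete states and two discrete choices, so that a value function is just a vector of length two; iterating J from an arbitrary starting point converges to its fixed point.

```python
import numpy as np

# Toy problem (illustrative numbers only): two states, two choices.
beta = 0.9
u = np.array([[1.0, 0.5],      # u[m, c]: flow utility in state m from choice c
              [0.0, 2.0]])
Gamma = np.array([[0, 1],      # Gamma[m, c]: index of next period's state
                  [1, 0]])

def J(V):
    """Bellman operator: (J V)(m) = max_c { u(m, c) + beta * V(Gamma(m, c)) }."""
    return np.max(u + beta * V[Gamma], axis=1)

# Iterate J from an arbitrary V0; the sequence converges to the unique fixed point.
V = np.zeros(2)
for _ in range(500):
    V = J(V)

print(V, np.max(np.abs(J(V) - V)))   # residual |J(V) - V| is (numerically) zero
```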

Contraction mapping requirement

• Let F(M) be the space of bounded continuous functions

Theorem


Assume that u(M_t, C_t) is real-valued, continuous and bounded, that 0 < β < 1, and that the constraint set C(M_t) is non-empty, compact-valued and continuous. Then J has a unique fixed point V ∈ F(M), and for all V_0 ∈ F(M)

‖J^n(V_0) − V‖ ≤ β^n ‖V_0 − V‖,   n = 0, 1, 2, 3, ...

• Full proof: Lucas and Stokey (1989), theorem 4.6


• Main idea: Apply Blackwell’s contraction mapping theorem
requiring that J is
1 Monotone
2 Discounted

Monotone (requirement 1)

V(M_t) ≥ Q(M_t), ∀ M_t ∈ M  ⇒  J(V)(M_t) ≥ J(Q)(M_t), ∀ M_t ∈ M

• Define the maximizers

C*_V(M_t) ≡ arg max_{C_t ∈ C(M_t)} { u(M_t, C_t) + β V(Γ(M_t, C_t)) }
C*_Q(M_t) ≡ arg max_{C_t ∈ C(M_t)} { u(M_t, C_t) + β Q(Γ(M_t, C_t)) }

• Insert into J(V)(M_t):

J(V)(M_t) = max_{C_t ∈ C(M_t)} { u(M_t, C_t) + β V(Γ(M_t, C_t)) }
          = u(M_t, C*_V(M_t)) + β V(Γ(M_t, C*_V(M_t)))
          ≥ u(M_t, C*_Q(M_t)) + β V(Γ(M_t, C*_Q(M_t)))
          ≥ u(M_t, C*_Q(M_t)) + β Q(Γ(M_t, C*_Q(M_t)))
          = J(Q)(M_t)

  (the first inequality holds because C*_V(M_t) is the maximizer, the second because V ≥ Q)
Discounted (requirement 2)

∃ γ ∈ (0, 1) :  J(V + k)(M_t) ≤ J(V)(M_t) + γk   for any constant k ≥ 0 and all M_t ∈ M

• We have

J(V + k)(M_t) = max_{C_t ∈ C(M_t)} { u(M_t, C_t) + β (V(Γ(M_t, C_t)) + k) }
              = max_{C_t ∈ C(M_t)} { u(M_t, C_t) + β V(Γ(M_t, C_t)) } + βk
              = J(V)(M_t) + βk
              ≤ J(V)(M_t) + γk   for γ = β ∈ (0, 1)

• What could break down here?

Summarize

1 The uniqueness of the value function can be proven
2 Iteration on the value function can be proven to converge at a rate of β
3 Further properties:
  1 Monotonicity in states expanding the choice set
  2 Concavity if the choice set is convex and u is concave
  3 Differentiability (e.g. Benveniste and Scheinkman (1979), Clausen and Strub (2016))
4 A unique policy function typically requires that the choice set is convex and u is strictly concave
5 Boyd's Weighted Contraction Mapping Theorem can be used if returns are unbounded (see Carroll (2012))

Value function iteration (VFI)

Algorithm 11: Find the fixed point V
input : tol. = ∞
output: V[•], C*[•]
1  V[•] = 0 all m ∈ M
2  while ? do
3      V−[•] = V[•]
4      V[•], C*[•] = find_V(V[•])
5      δ = max(|V−[:] − V[:]|)
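A minimal Python sketch of Algorithm 11 (my own illustration, not from the slides), applied to a simple cake-eating problem; the blanked-out condition in "while ? do" is filled in with the usual convergence check δ > tol.

```python
import numpy as np

# Cake-eating illustration: the state M is the remaining cake on an integer grid,
# the choice set is C(M) = {1, ..., M}, the transition is M' = M - C, and u = log(C).
beta, tol = 0.95, 1e-8
grid = np.arange(1, 101)                        # grid for M

def find_V(V):
    """Apply the Bellman operator once; return the new values and the maximizers."""
    V_new = np.empty(len(grid))
    C_star = np.empty(len(grid), dtype=int)
    for i, M in enumerate(grid):
        C = np.arange(1, M + 1)                 # feasible choices in state M
        M_next = M - C                          # transition Gamma(M, C)
        cont = np.where(M_next > 0, V[np.maximum(M_next - 1, 0)], 0.0)
        values = np.log(C) + beta * cont
        j = np.argmax(values)
        V_new[i], C_star[i] = values[j], C[j]
    return V_new, C_star

V = np.zeros(len(grid))                         # 1: initial guess V[.] = 0
delta = np.inf
while delta > tol:                              # 2: iterate until convergence
    V_minus = V                                 # 3: store the previous iterate
    V, C_star = find_V(V)                       # 4: update V and the policy
    delta = np.max(np.abs(V_minus - V))         # 5: sup-norm change

print(delta, C_star[:10])                       # converged value change and policy
```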

Policy iteration

• Think of step n in VFI, where for all M_t ∈ M we set

V^n(M_t) = u(M_t, C^{*n}(M_t)) + β E_t[ V^{n−1}(Γ(M_t, C^{*n}(M_t))) ]
C^{*n}(M_t) = arg max_{C_t} { u(M_t, C_t) + β E_t[ V^{n−1}(Γ(M_t, C_t)) ] }

• Alternative: Simulate forward for k periods using C^{*n}(•) as the decision rule, and update by

V^n(M_t) = ∑_{j=0}^{k} β^j u(M_{t+j}, C^{*n}(M_{t+j})) + β^{k+1} V^{n−1}(M_{t+k+1})

  where M_{t+j+1} = Γ(M_{t+j}, C^{*n}(M_{t+j}))


• Better convergence? Yes, in terms of speed. No, in terms of the basin of attraction
• If everything is discrete, the simulation can be replaced by the inversion of a matrix! [Bertel will show you] (see the sketch below)
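A sketch of the fully discrete case (toy numbers of my own, not Bertel's example): with finite states and choices, evaluating a fixed policy π means solving the linear system V = u_π + β P_π V, i.e. V = (I − β P_π)^{-1} u_π, which replaces the k-period simulation; alternating this with a greedy improvement step is policy iteration.

```python
import numpy as np

beta = 0.95
n_states, n_choices = 5, 3
rng = np.random.default_rng(0)
u = rng.normal(size=(n_states, n_choices))                        # u[m, c]
P = rng.dirichlet(np.ones(n_states), size=(n_states, n_choices))  # P[m, c, m']

policy = np.zeros(n_states, dtype=int)         # start from an arbitrary policy
for _ in range(50):
    # 1) policy evaluation by matrix inversion: V = (I - beta*P_pi)^{-1} u_pi
    u_pi = u[np.arange(n_states), policy]
    P_pi = P[np.arange(n_states), policy]
    V = np.linalg.solve(np.eye(n_states) - beta * P_pi, u_pi)
    # 2) policy improvement: greedy choice against the evaluated V
    new_policy = np.argmax(u + beta * P @ V, axis=1)
    if np.array_equal(new_policy, policy):     # stop when the policy is stable
        break
    policy = new_policy

print(policy, V)
```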
Guess and verify

• Consider the neoclassical growth model

V(K_t) = max_{C_t} { log C_t + β V(K_{t+1}) }
s.t.
K_{t+1} = A K_t^α − C_t
• Assume that V(K_t) = a + b log K_t, such that

a + b log K_t = max_{K_{t+1}} { log(A K_t^α − K_{t+1}) + β (a + b log K_{t+1}) }

• The FOC then is

1/(A K_t^α − K_{t+1}) = βb / K_{t+1}   ⇔   K_{t+1} = (βb/(1 + βb)) A K_t^α
• Insert the FOC and solve for a and b (both independent of K_t)

a + b log K_t = log( A K_t^α − (βb/(1 + βb)) A K_t^α ) + β ( a + b log( (βb/(1 + βb)) A K_t^α ) )
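Completing the last step (a standard result, added here as a check rather than taken from the slide): collecting the terms in log K_t on both sides gives

b = α + αβ b   ⇔   b = α / (1 − αβ)

so that βb/(1 + βb) = αβ, and the policy function is

K_{t+1} = αβ A K_t^α,   C_t = (1 − αβ) A K_t^α

while the constant terms give

a = [ log((1 − αβ) A) + (αβ/(1 − αβ)) log(αβ A) ] / (1 − β)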

Projection methods

• Guess and verify is only possible for very special models
• Value and policy functions might, however, be well approximated by parametric functions (typically polynomials, Weierstrass theorem)
• Solve for the parameters numerically instead of solving the maximization problems (relying on the first-order conditions instead); a sketch follows below
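A minimal collocation sketch in Python (my own illustration, with made-up parameter values), using the growth model from the guess-and-verify slide: guess that log C is a polynomial in log K and choose the coefficients so that the Euler equation 1/C_t = β α A K_{t+1}^{α−1} / C_{t+1} holds exactly at a few nodes, instead of solving the maximization problem itself.

```python
import numpy as np
from scipy.optimize import root

alpha, beta, A = 0.3, 0.95, 1.0                # illustrative parameter values
K_nodes = np.linspace(0.05, 0.5, 3)            # 3 collocation nodes, 3 coefficients

def C_approx(theta, K):
    """Parametric guess: log C is a polynomial in log K with coefficients theta."""
    return np.exp(np.polyval(theta, np.log(K)))

def euler_residuals(theta):
    C = C_approx(theta, K_nodes)
    K_next = np.maximum(A * K_nodes**alpha - C, 1e-10)   # law of motion, kept positive
    C_next = C_approx(theta, K_next)
    return 1.0 / C - beta * alpha * A * K_next**(alpha - 1.0) / C_next

theta0 = np.array([0.0, alpha, np.log(0.5 * A)])   # start from "consume half of output"
sol = root(euler_residuals, theta0)

# Compare with the closed form C = (1 - alpha*beta) * A * K^alpha from guess and verify
K_test = np.linspace(0.05, 0.5, 100)
err = np.max(np.abs(C_approx(sol.x, K_test) - (1 - alpha * beta) * A * K_test**alpha))
print(sol.success, err)
```

Because the true policy here is itself log-linear, the fitted polynomial reproduces the closed form almost exactly; in richer models one would typically use more nodes and, e.g., Chebyshev polynomials.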

The Bellman equation

• The model:
  1 A household gets utility from consumption and disutility from labor
  2 The household's income depends on whether it works or not
  3 The household accumulates human capital by working
  4 It can save in an account with an interest rate of r
• Task: Write up the Bellman equation on the whiteboard for your choice of utility function, wage process and human capital accumulation equation (one possible specification is sketched below)
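One possible specification (only an example of the kind of answer the task asks for; the CRRA utility, the fixed wage w, and the parameters ρ, φ, δ are my own choices):

V(M_t, H_t) = max_{C_t, L_t} { C_t^{1−ρ}/(1−ρ) − φ L_t + β V(M_{t+1}, H_{t+1}) }

s.t.
M_{t+1} = (1 + r)(M_t − C_t) + w H_t L_t
H_{t+1} = (1 − δ) H_t + L_t
L_t ∈ {0, 1},   0 < C_t ≤ M_t

Here M_t is beginning-of-period assets and H_t is human capital; labor income w H_t L_t arrives at the end of the period, and a stochastic wage process would add an expectation E_t in front of the continuation value.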

Until next

• Ensure that you understand:
  • Algorithm 11
  • How to set up a Bellman equation
• Go to PadLet and ask or answer a question
  (https://padlet.com/jeppe druedahl/dynamic programming)
• Think about: What is the problem with having, respectively:
  1 Multiple states
  2 Multiple choices
  3 Multiple shocks

