
Advanced Quantitative Economics with Python

Thomas J. Sargent & John Stachurski

Feb 17, 2025


CONTENTS

I Tools and Techniques 3


1 Orthogonal Projections and Their Applications 5
1.1 Overview . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5
1.2 Key Definitions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 6
1.3 The Orthogonal Projection Theorem . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 9
1.4 Orthonormal Basis . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 13
1.5 Projection Via Matrix Algebra . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 14
1.6 Least Squares Regression . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 15
1.7 Orthogonalization and Decomposition . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 17
1.8 Exercises . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 18

2 Continuous State Markov Chains 23


2.1 Overview . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 23
2.2 The Density Case . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 24
2.3 Beyond Densities . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 30
2.4 Stability . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 31
2.5 Exercises . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 34
2.6 Appendix . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 43

3 Reverse Engineering a la Muth 45


3.1 Friedman (1956) and Muth (1960) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 45
3.2 A Process for Which Adaptive Expectations are Optimal . . . . . . . . . . . . . . . . . . . . . . . . 46
3.3 Some Useful State-Space Math . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 47
3.4 Estimates of Unobservables . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 48
3.5 Relationship of Unobservables to Observables . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 49
3.6 MA and AR Representations . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 51

4 Discrete State Dynamic Programming 53


4.1 Overview . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 53
4.2 Discrete DPs . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 54
4.3 Solving Discrete DPs . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 57
4.4 Example: A Growth Model . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 58
4.5 Exercises . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 63
4.6 Solutions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 64
4.7 Appendix: Algorithms . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 73

II LQ Control 75
5 Information and Consumption Smoothing 77

5.1 Overview . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 77
5.2 Two Representations of One Nonfinancial Income Process . . . . . . . . . . . . . . . . . . . . . . . 78
5.3 Application of Kalman filter . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 79
5.4 News Shocks and Less Informative Shocks . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 80
5.5 Representation of 𝜖𝑡 Shock in Terms of Future 𝑦𝑡 . . . . . . . . . . . . . . . . . . . . . . . . . . . . 81
5.6 Representation in Terms of 𝑎𝑡 Shocks . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 81
5.7 Permanent Income Consumption-Smoothing Model . . . . . . . . . . . . . . . . . . . . . . . . . . . 82
5.8 State Space Representations . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 82
5.9 Computations . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 83
5.10 Simulating Income Process and Two Associated Shock Processes . . . . . . . . . . . . . . . . . . . . 89
5.11 Calculating Innovations in Another Way . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 90
5.12 Another Invertibility Issue . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 90

6 Consumption Smoothing with Complete and Incomplete Markets 91


6.1 Overview . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 91
6.2 Background . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 92
6.3 Linear State Space Version of Complete Markets Model . . . . . . . . . . . . . . . . . . . . . . . . . 92
6.4 Model 1 (Complete Markets) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 97
6.5 Model 2 (One-Period Risk-Free Debt Only) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 102

7 Tax Smoothing with Complete and Incomplete Markets 105


7.1 Overview . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 105
7.2 Tax Smoothing with Complete Markets . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 110
7.3 Returns on State-Contingent Debt . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 111
7.4 More Finite Markov Chain Tax-Smoothing Examples . . . . . . . . . . . . . . . . . . . . . . . . . . 115

8 Markov Jump Linear Quadratic Dynamic Programming 143


8.1 Overview . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 143
8.2 Review of useful LQ dynamic programming formulas . . . . . . . . . . . . . . . . . . . . . . . . . . 144
8.3 Linked Riccati equations for Markov LQ dynamic programming . . . . . . . . . . . . . . . . . . . . 144
8.4 Applications . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 145
8.5 Example 1 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 146
8.6 Example 2 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 164
8.7 More examples . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 195

9 How to Pay for a War: Part 1 197


9.1 Overview . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 197
9.2 Public Finance Questions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 198
9.3 Barro (1979) Model . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 199
9.4 Python Class to Solve Markov Jump Linear Quadratic Control Problems . . . . . . . . . . . . . . . . 203
9.5 Barro Model with a Time-varying Interest Rate . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 204

10 How to Pay for a War: Part 2 207


10.1 Overview . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 207
10.2 Two example specifications . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 208
10.3 One- and Two-period Bonds but No Restructuring . . . . . . . . . . . . . . . . . . . . . . . . . . . . 208
10.4 Mapping into an LQ Markov Jump Problem . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 209
10.5 Penalty on Different Issues Across Maturities . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 211
10.6 A Model with Restructuring . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 214
10.7 Restructuring as a Markov Jump Linear Quadratic Control Problem . . . . . . . . . . . . . . . . . . . 216

11 How to Pay for a War: Part 3 221


11.1 Overview . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 221
11.2 Roll-Over Risk . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 221
11.3 A Dead End . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 222

11.4 Better Representation of Roll-Over Risk . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 222

12 Optimal Taxation in an LQ Economy 227


12.1 Overview . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 227
12.2 The Ramsey Problem . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 228
12.3 Implementation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 234
12.4 Examples . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 239
12.5 Exercises . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 245

III Multiple Agent Models 249


13 Default Risk and Income Fluctuations 251
13.1 Overview . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 251
13.2 Structure . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 252
13.3 Equilibrium . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 254
13.4 Computation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 255
13.5 Results . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 260
13.6 Exercises . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 264

14 Globalization and Cycles 271


14.1 Overview . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 271
14.2 Key Ideas . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 272
14.3 Model . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 272
14.4 Simulation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 275
14.5 Exercises . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 285

15 Coase’s Theory of the Firm 289


15.1 Overview . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 289
15.2 The Model . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 291
15.3 Equilibrium . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 293
15.4 Existence, Uniqueness and Computation of Equilibria . . . . . . . . . . . . . . . . . . . . . . . . . . 294
15.5 Implementation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 296
15.6 Exercises . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 299

16 Composite Sorting 303


16.1 Overview . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 303
16.2 Setup . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 303
16.3 Characterization of primal solution . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 307
16.4 Solving primal problem . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 324
16.5 Examples . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 328
16.6 Dual Solution . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 337
16.7 Application . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 346

IV Dynamic Linear Economies 357


17 Recursive Models of Dynamic Linear Economies 359
17.1 A Suite of Models . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 359
17.2 Econometrics . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 376
17.3 Dynamic Demand Curves and Canonical Household Technologies . . . . . . . . . . . . . . . . . . . 378
17.4 Gorman Aggregation and Engel Curves . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 379
17.5 Partial Equilibrium . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 380
17.6 Equilibrium Investment Under Uncertainty . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 381
17.7 A Rosen-Topel Housing Model . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 382

17.8 Cattle Cycles . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 382
17.9 Models of Occupational Choice and Pay . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 383
17.10 Permanent Income Models . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 384
17.11 Gorman Heterogeneous Households . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 385
17.12 Non-Gorman Heterogeneous Households . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 386

18 Growth in Dynamic Linear Economies 389


18.1 Common Structure . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 389
18.2 A Planning Problem . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 390
18.3 Example Economies . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 391

19 Lucas Asset Pricing Using DLE 401


19.1 Asset Pricing Equations . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 402
19.2 Asset Pricing Simulations . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 402

20 IRFs in Hall Models 409


20.1 Example 1: Hall (1978) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 409
20.2 Example 2: Higher Adjustment Costs . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 412
20.3 Example 3: Durable Consumption Goods . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 415

21 Permanent Income Model using the DLE Class 419


21.1 The Permanent Income Model . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 419

22 Rosen Schooling Model 423


22.1 A One-Occupation Model . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 423
22.2 Mapping into HS2013 Framework . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 424

23 Cattle Cycles 429


23.1 The Model . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 429
23.2 Mapping into HS2013 Framework . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 430

24 Shock Non Invertibility 437


24.1 Overview . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 437
24.2 Model . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 438
24.3 Code . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 440

V Risk, Model Uncertainty, and Robustness 443


25 Risk and Model Uncertainty 445
25.1 Overview . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 445
25.2 Basic objects . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 446
25.3 Five preference specifications . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 449
25.4 Expected utility . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 450
25.5 Constraint preferences . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 450
25.6 Multiplier preferences . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 452
25.7 Risk-sensitive preferences . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 452
25.8 Ex post Bayesian preferences . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 455
25.9 Comparing preferences . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 455
25.10 Risk aversion and misspecification aversion . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 457
25.11 Indifference curves . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 458
25.12 State price deflators . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 461
25.13 Iso-utility and iso-entropy curves and expansion paths . . . . . . . . . . . . . . . . . . . . . . . . . . 464
25.14 Bounds on expected utility . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 465
25.15 Why entropy? . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 467

26 Etymology of Entropy 469
26.1 Information Theory . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 469
26.2 A Measure of Unpredictability . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 470
26.3 Mathematical Properties of Entropy . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 470
26.4 Conditional Entropy . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 471
26.5 Independence as Maximum Conditional Entropy . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 471
26.6 Thermodynamics . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 472
26.7 Statistical Divergence . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 472
26.8 Continuous distributions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 472
26.9 Relative entropy and Gaussian distributions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 473
26.10 Von Neumann Entropy . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 473
26.11 Backus-Chernov-Zin Entropy . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 474
26.12 Wiener-Kolmogorov Prediction Error Formula as Entropy . . . . . . . . . . . . . . . . . . . . . . . . 474
26.13 Multivariate Processes . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 475
26.14 Frequency Domain Robust Control . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 475
26.15 Relative Entropy for a Continuous Random Variable . . . . . . . . . . . . . . . . . . . . . . . . . . . 476

27 Robustness 479
27.1 Overview . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 479
27.2 The Model . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 482
27.3 Constructing More Robust Policies . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 483
27.4 Robustness as Outcome of a Two-Person Zero-Sum Game . . . . . . . . . . . . . . . . . . . . . . . 485
27.5 The Stochastic Case . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 489
27.6 Implementation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 491
27.7 Application . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 491
27.8 Appendix . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 496

28 Robust Markov Perfect Equilibrium 499


28.1 Overview . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 499
28.2 Linear Markov Perfect Equilibria with Robust Agents . . . . . . . . . . . . . . . . . . . . . . . . . . 500
28.3 Application . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 503

VI Time Series Models 519


29 Covariance Stationary Processes 521
29.1 Overview . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 521
29.2 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 522
29.3 Spectral Analysis . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 526
29.4 Implementation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 534

30 Estimation of Spectra 543


30.1 Overview . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 543
30.2 Periodograms . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 543
30.3 Smoothing . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 546
30.4 Exercises . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 552

31 Additive and Multiplicative Functionals 559


31.1 Overview . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 559
31.2 A Particular Additive Functional . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 560
31.3 Dynamics . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 561
31.4 Code . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 573
31.5 More About the Multiplicative Martingale . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 576

32 Classical Control with Linear Algebra 583

32.1 Overview . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 583
32.2 A Control Problem . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 584
32.3 Finite Horizon Theory . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 585
32.4 Infinite Horizon Limit . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 589
32.5 Undiscounted Problems . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 591
32.6 Implementation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 593
32.7 Exercises . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 601

33 Classical Prediction and Filtering With Linear Algebra 603


33.1 Overview . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 603
33.2 Finite Dimensional Prediction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 604
33.3 Combined Finite Dimensional Control and Prediction . . . . . . . . . . . . . . . . . . . . . . . . . . 615
33.4 Infinite Horizon Prediction and Filtering Problems . . . . . . . . . . . . . . . . . . . . . . . . . . . . 616
33.5 Exercises . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 620

34 Knowing the Forecasts of Others 623


34.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 623
34.2 The Setting . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 624
34.3 Tactics . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 625
34.4 Equilibrium Conditions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 626
34.5 Equilibrium with 𝜃𝑡 stochastic but observed at 𝑡 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 628
34.6 Guess-and-Verify Tactic . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 631
34.7 Equilibrium with One Noisy Signal on 𝜃𝑡 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 631
34.8 Equilibrium with Two Noisy Signals on 𝜃𝑡 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 637
34.9 Key Step . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 641
34.10 An observed common shock benchmark . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 641
34.11 Comparison of All Signal Structures . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 643
34.12 Notes on History of the Problem . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 645

VII Asset Pricing and Finance 647


35 Asset Pricing II: The Lucas Asset Pricing Model 649
35.1 Overview . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 649
35.2 The Lucas Model . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 649
35.3 Exercises . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 656

36 Elementary Asset Pricing Theory 659


36.1 Overview . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 659
36.2 Key Equation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 660
36.3 Implications of Key Equation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 660
36.4 Expected Return - Beta Representation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 661
36.5 Mean-Variance Frontier . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 663
36.6 Sharpe Ratios and the Price of Risk . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 666
36.7 Mathematical Structure of Frontier . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 666
36.8 Multi-factor Models . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 666
36.9 Empirical Implementations . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 667
36.10 Exercises . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 668

37 Two Modifications of Mean-Variance Portfolio Theory 673


37.1 Overview . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 673
37.2 Mean-Variance Portfolio Choice . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 674
37.3 Estimating Mean and Variance . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 674
37.4 Black-Litterman Starting Point . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 675
37.5 Details . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 676

37.6 Adding Views . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 678
37.7 Bayesian Interpretation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 680
37.8 Curve Decolletage . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 681
37.9 Black-Litterman Recommendation as Regularization . . . . . . . . . . . . . . . . . . . . . . . . . . 685
37.10 A Robust Control Operator . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 687
37.11 A Robust Mean-Variance Portfolio Model . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 688
37.12 Appendix . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 689
37.13 Special Case – IID Sample . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 690
37.14 Dependence and Sampling Frequency . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 690
37.15 Frequency and the Mean Estimator . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 692

38 Irrelevance of Capital Structures with Complete Markets 697


38.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 697
38.2 Competitive equilibrium . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 701
38.3 Code . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 707

39 Equilibrium Capital Structures with Incomplete Markets 717


39.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 717
39.2 Asset Markets . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 720
39.3 Equilibrium verification . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 724
39.4 Pseudo Code . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 724
39.5 Code . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 726
39.6 Examples . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 736
39.7 A picture worth a thousand words . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 753

VIII Dynamic Programming Squared 755


40 Optimal Unemployment Insurance 757
40.1 Overview . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 757
40.2 Shavell and Weiss’s Model . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 757
40.3 Private Information . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 760
40.4 Outcomes . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 767

41 Stackelberg Plans 771


41.1 Overview . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 771
41.2 Duopoly . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 771
41.3 Stackelberg Problem . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 774
41.4 Two Bellman Equations . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 776
41.5 Stackelberg Plan for Duopoly . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 777
41.6 Recursive Representation of Stackelberg Plan . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 779
41.7 Dynamic Programming and Time Consistency of Follower’s Problem . . . . . . . . . . . . . . . . . . 780
41.8 Computing Stackelberg Plan . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 782
41.9 Time Series for Price and Quantities . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 783
41.10 Time Inconsistency of Stackelberg Plan . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 785
41.11 Recursive Formulation of Follower’s Problem . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 786
41.12 Markov Perfect Equilibrium . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 791
41.13 Comparing Markov Perfect Equilibrium and Stackelberg Outcome . . . . . . . . . . . . . . . . . . . 793

42 Machine Learning a Ramsey Plan 795


42.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 795
42.2 The Model . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 796
42.3 Model Components . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 796
42.4 Parameters and Variables . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 798
42.5 Approximation and Truncation parameter 𝑇 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 799

42.6 A Gradient Descent Algorithm . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 800
42.7 A More Structured ML Algorithm . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 804
42.8 Continuation Values . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 811
42.9 Adding Some Human Intelligence . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 814
42.10 What has Machine Learning Taught Us? . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 820

43 Time Inconsistency of Ramsey Plans 823


43.1 Overview . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 823
43.2 Model Components . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 824
43.3 Friedman’s Optimal Rate of Deflation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 825
43.4 Calvo’s Distortion . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 826
43.5 Structure . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 827
43.6 Three Timing Protocols . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 828
43.7 Note on Dynamic Programming Squared . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 828
43.8 A Ramsey Planner . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 829
43.9 Representation of Ramsey Plan . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 831
43.10 Multiple roles of 𝜃𝑡 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 832
43.11 Time inconsistency . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 832
43.12 Constrained-to-Constant-Growth-Rate Ramsey Plan . . . . . . . . . . . . . . . . . . . . . . . . . . . 832
43.13 Markov Perfect Governments . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 833
43.14 Outcomes under Three Timing Protocols . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 834
43.15 Ramsey Planner’s Value Function . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 839
43.16 Perturbing Model Parameters . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 841

44 Sustainable Plans for a Calvo Model 847


44.1 Overview . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 847
44.2 Model Components . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 847
44.3 Another Timing Protocol . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 848
44.4 Sustainable or Credible Plan . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 850
44.5 Whose Plan is It? . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 857

45 Optimal Taxation with State-Contingent Debt 859


45.1 Overview . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 859
45.2 A Competitive Equilibrium with Distorting Taxes . . . . . . . . . . . . . . . . . . . . . . . . . . . . 860
45.3 Recursive Formulation of the Ramsey Problem . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 871
45.4 Examples . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 878

46 Optimal Taxation without State-Contingent Debt 891


46.1 Overview . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 891
46.2 Competitive Equilibrium with Distorting Taxes . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 892
46.3 Recursive Version of AMSS Model . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 900
46.4 Examples . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 907

47 Fluctuating Interest Rates Deliver Fiscal Insurance 919


47.1 Overview . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 919
47.2 Forces at Work . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 920
47.3 Logical Flow of Lecture . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 921
47.4 Example Economy . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 923
47.5 Reverse Engineering Strategy . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 933
47.6 Code for Reverse Engineering . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 934
47.7 Short Simulation for Reverse-engineered: Initial Debt . . . . . . . . . . . . . . . . . . . . . . . . . . 935
47.8 Long Simulation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 944
47.9 BEGS Approximations of Limiting Debt and Convergence Rate . . . . . . . . . . . . . . . . . . . . . 946

48 Fiscal Risk and Government Debt 951

48.1 Overview . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 951
48.2 The Economy . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 952
48.3 Long Simulation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 953
48.4 Asymptotic Mean and Rate of Convergence . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 978

49 Competitive Equilibria of a Model of Chang 987


49.1 Overview . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 987
49.2 Decisions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 989
49.3 Competitive Equilibrium . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 991
49.4 Inventory of Objects in Play . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 992
49.5 Analysis . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 993
49.6 Calculating all Promise-Value Pairs in CE . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 996
49.7 Solving a Continuation Ramsey Planner’s Bellman Equation . . . . . . . . . . . . . . . . . . . . . . . 1012

50 Credible Government Policies in a Model of Chang 1021


50.1 Overview . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1021
50.2 The Setting . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1022
50.3 Calculating the Set of Sustainable Promise-Value Pairs . . . . . . . . . . . . . . . . . . . . . . . . . 1028

IX Other 1045
51 Troubleshooting 1047
51.1 Fixing Your Local Environment . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1047
51.2 Reporting an Issue . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1048

52 References 1049

53 Execution Statistics 1051

Bibliography 1053

Proof Index 1059

Index 1061

Advanced Quantitative Economics with Python

This website presents a set of advanced lectures on quantitative economic modeling.


• Tools and Techniques
– Orthogonal Projections and Their Applications
– Continuous State Markov Chains
– Reverse Engineering a la Muth
– Discrete State Dynamic Programming
• LQ Control
– Information and Consumption Smoothing
– Consumption Smoothing with Complete and Incomplete Markets
– Tax Smoothing with Complete and Incomplete Markets
– Markov Jump Linear Quadratic Dynamic Programming
– How to Pay for a War: Part 1
– How to Pay for a War: Part 2
– How to Pay for a War: Part 3
– Optimal Taxation in an LQ Economy
• Multiple Agent Models
– Default Risk and Income Fluctuations
– Globalization and Cycles
– Coase’s Theory of the Firm
– Composite Sorting
• Dynamic Linear Economies
– Recursive Models of Dynamic Linear Economies
– Growth in Dynamic Linear Economies
– Lucas Asset Pricing Using DLE
– IRFs in Hall Models
– Permanent Income Model using the DLE Class
– Rosen Schooling Model
– Cattle Cycles
– Shock Non Invertibility
• Risk, Model Uncertainty, and Robustness
– Risk and Model Uncertainty
– Etymology of Entropy
– Robustness
– Robust Markov Perfect Equilibrium
• Time Series Models
– Covariance Stationary Processes


– Estimation of Spectra
– Additive and Multiplicative Functionals
– Classical Control with Linear Algebra
– Classical Prediction and Filtering With Linear Algebra
– Knowing the Forecasts of Others
• Asset Pricing and Finance
– Asset Pricing II: The Lucas Asset Pricing Model
– Elementary Asset Pricing Theory
– Two Modifications of Mean-Variance Portfolio Theory
– Irrelevance of Capital Structures with Complete Markets
– Equilibrium Capital Structures with Incomplete Markets
• Dynamic Programming Squared
– Optimal Unemployment Insurance
– Stackelberg Plans
– Machine Learning a Ramsey Plan
– Time Inconsistency of Ramsey Plans
– Sustainable Plans for a Calvo Model
– Optimal Taxation with State-Contingent Debt
– Optimal Taxation without State-Contingent Debt
– Fluctuating Interest Rates Deliver Fiscal Insurance
– Fiscal Risk and Government Debt
– Competitive Equilibria of a Model of Chang
– Credible Government Policies in a Model of Chang
• Other
– Troubleshooting
– References
– Execution Statistics

Part I

Tools and Techniques

CHAPTER ONE

ORTHOGONAL PROJECTIONS AND THEIR APPLICATIONS

1.1 Overview

Orthogonal projection is a cornerstone of vector space methods, with many diverse applications.
These include
• Least squares projection, also known as linear regression
• Conditional expectations for multivariate normal (Gaussian) distributions
• Gram–Schmidt orthogonalization
• QR decomposition
• Orthogonal polynomials
• etc
In this lecture, we focus on
• key ideas
• least squares regression
We’ll require the following imports:

import numpy as np
from scipy.linalg import qr

1.1.1 Further Reading

For background and foundational concepts, see our lecture on linear algebra.
For more proofs and greater theoretical detail, see A Primer in Econometric Theory.
For a complete set of proofs in a general setting, see, for example, [Roman, 2005].
For an advanced treatment of projection in the context of least squares prediction, see this book chapter.


1.2 Key Definitions

Assume 𝑥, 𝑧 ∈ ℝ𝑛 .
Define ⟨𝑥, 𝑧⟩ = ∑𝑖 𝑥𝑖 𝑧𝑖 .
Recall ‖𝑥‖2 = ⟨𝑥, 𝑥⟩.
The law of cosines states that ⟨𝑥, 𝑧⟩ = ‖𝑥‖‖𝑧‖ cos(𝜃) where 𝜃 is the angle between the vectors 𝑥 and 𝑧.
When ⟨𝑥, 𝑧⟩ = 0, then cos(𝜃) = 0 and 𝑥 and 𝑧 are said to be orthogonal and we write 𝑥 ⟂ 𝑧.

For a linear subspace 𝑆 ⊂ ℝ𝑛 , we call 𝑥 ∈ ℝ𝑛 orthogonal to 𝑆 if 𝑥 ⟂ 𝑧 for all 𝑧 ∈ 𝑆, and write 𝑥 ⟂ 𝑆.


The orthogonal complement of linear subspace 𝑆 ⊂ ℝ𝑛 is the set 𝑆 ⟂ ∶= {𝑥 ∈ ℝ𝑛 ∶ 𝑥 ⟂ 𝑆}.
𝑆 ⟂ is a linear subspace of ℝ𝑛
• To see this, fix 𝑥, 𝑦 ∈ 𝑆 ⟂ and 𝛼, 𝛽 ∈ ℝ.
• Observe that if 𝑧 ∈ 𝑆, then
⟨𝛼𝑥 + 𝛽𝑦, 𝑧⟩ = 𝛼⟨𝑥, 𝑧⟩ + 𝛽⟨𝑦, 𝑧⟩ = 𝛼 × 0 + 𝛽 × 0 = 0
• Hence 𝛼𝑥 + 𝛽𝑦 ∈ 𝑆 ⟂ , as was to be shown
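As a quick numerical illustration of these definitions, here is a small check in NumPy (the vectors are chosen for this example only, and the code relies on the imports listed at the start of the lecture):

x = np.array([1.0, 2.0, 2.0])
z = np.array([2.0, 2.0, -3.0])

inner = x @ z                    # inner product ⟨x, z⟩ = Σ_i x_i z_i
norm_x = np.sqrt(x @ x)          # norm ‖x‖ = √⟨x, x⟩

print(inner)                                   # 0.0, so x ⟂ z
print(np.isclose(norm_x, np.linalg.norm(x)))   # True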


A set of vectors {𝑥1 , … , 𝑥𝑘 } ⊂ ℝ𝑛 is called an orthogonal set if 𝑥𝑖 ⟂ 𝑥𝑗 whenever 𝑖 ≠ 𝑗.


If {𝑥1 , … , 𝑥𝑘 } is an orthogonal set, then the Pythagorean Law states that

‖𝑥1 + ⋯ + 𝑥𝑘 ‖2 = ‖𝑥1 ‖2 + ⋯ + ‖𝑥𝑘 ‖2

For example, when 𝑘 = 2, 𝑥1 ⟂ 𝑥2 implies

‖𝑥1 + 𝑥2 ‖2 = ⟨𝑥1 + 𝑥2 , 𝑥1 + 𝑥2 ⟩ = ⟨𝑥1 , 𝑥1 ⟩ + 2⟨𝑥2 , 𝑥1 ⟩ + ⟨𝑥2 , 𝑥2 ⟩ = ‖𝑥1 ‖2 + ‖𝑥2 ‖2
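Here is a quick numerical check of the Pythagorean Law in the 𝑘 = 2 case, a sketch using two orthogonal vectors chosen for illustration:

x1 = np.array([1.0, 2.0, 2.0])
x2 = np.array([2.0, 2.0, -3.0])   # x1 ⟂ x2 because x1 @ x2 == 0

lhs = np.linalg.norm(x1 + x2)**2
rhs = np.linalg.norm(x1)**2 + np.linalg.norm(x2)**2
print(np.isclose(lhs, rhs))       # True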

1.2.1 Linear Independence vs Orthogonality

If 𝑋 ⊂ ℝ𝑛 is an orthogonal set and 0 ∉ 𝑋, then 𝑋 is linearly independent.


Proving this is a nice exercise.
While the converse is not true, a kind of partial converse holds, as we’ll see below.

1.3 The Orthogonal Projection Theorem

What vector within a linear subspace of ℝ𝑛 best approximates a given vector in ℝ𝑛 ?


The next theorem answers this question.
Theorem (OPT) Given 𝑦 ∈ ℝ𝑛 and linear subspace 𝑆 ⊂ ℝ𝑛 , there exists a unique solution to the minimization problem

$$\hat{y} := \operatorname*{arg\,min}_{z \in S} \|y - z\|$$

The minimizer 𝑦 ̂ is the unique vector in ℝ𝑛 that satisfies


• 𝑦̂ ∈ 𝑆
• 𝑦 − 𝑦̂ ⟂ 𝑆
The vector 𝑦 ̂ is called the orthogonal projection of 𝑦 onto 𝑆.
The next figure provides some intuition

1.3.1 Proof of Sufficiency

We’ll omit the full proof.


But we will prove sufficiency of the asserted conditions.
To this end, let 𝑦 ∈ ℝ𝑛 and let 𝑆 be a linear subspace of ℝ𝑛 .
Let 𝑦 ̂ be a vector in ℝ𝑛 such that 𝑦 ̂ ∈ 𝑆 and 𝑦 − 𝑦 ̂ ⟂ 𝑆.
Let 𝑧 be any other point in 𝑆. Since 𝑆 is a linear subspace, 𝑦̂ − 𝑧 ∈ 𝑆, so 𝑦 − 𝑦̂ ⟂ 𝑦̂ − 𝑧, and the Pythagorean Law gives

$$\|y - z\|^2 = \|(y - \hat{y}) + (\hat{y} - z)\|^2 = \|y - \hat{y}\|^2 + \|\hat{y} - z\|^2$$

Hence $\|y - z\| \geq \|y - \hat{y}\|$, which completes the proof.


1.3.2 Orthogonal Projection as a Mapping

For a linear space 𝑌 and a fixed linear subspace 𝑆, we have a functional relationship

𝑦 ∈ 𝑌 ↦ its orthogonal projection 𝑦 ̂ ∈ 𝑆

By the OPT, this is a well-defined mapping or operator from ℝ𝑛 to ℝ𝑛 .


In what follows we denote this operator by a matrix 𝑃
• 𝑃 𝑦 represents the projection 𝑦.̂
• This is sometimes expressed as 𝐸̂𝑆 𝑦 = 𝑃 𝑦, where 𝐸̂ denotes a wide-sense expectations operator and the subscript 𝑆 indicates that we are projecting 𝑦 onto the linear subspace 𝑆.
The operator 𝑃 is called the orthogonal projection mapping onto 𝑆.

It is immediate from the OPT that for any 𝑦 ∈ ℝ𝑛


1. 𝑃 𝑦 ∈ 𝑆 and
2. 𝑦 − 𝑃 𝑦 ⟂ 𝑆
From this, we can deduce additional useful properties, such as
1. ‖𝑦‖2 = ‖𝑃 𝑦‖2 + ‖𝑦 − 𝑃 𝑦‖2 and
2. ‖𝑃 𝑦‖ ≤ ‖𝑦‖
For example, to prove 1, observe that 𝑦 = 𝑃 𝑦 + 𝑦 − 𝑃 𝑦 and apply the Pythagorean law.


Orthogonal Complement

Let 𝑆 ⊂ ℝ𝑛 .
The orthogonal complement of 𝑆 is the linear subspace 𝑆 ⟂ that satisfies 𝑥1 ⟂ 𝑥2 for every 𝑥1 ∈ 𝑆 and 𝑥2 ∈ 𝑆 ⟂ .
Let 𝑌 be a linear space with linear subspace 𝑆 and its orthogonal complement 𝑆 ⟂ .
We write

𝑌 = 𝑆 ⊕ 𝑆⟂

to indicate that for every 𝑦 ∈ 𝑌 there is unique 𝑥1 ∈ 𝑆 and a unique 𝑥2 ∈ 𝑆 ⟂ such that 𝑦 = 𝑥1 + 𝑥2 .
Moreover, 𝑥1 = 𝐸𝑆̂ 𝑦 and 𝑥2 = 𝑦 − 𝐸𝑆̂ 𝑦.
This amounts to another version of the OPT:
Theorem. If 𝑆 is a linear subspace of ℝ𝑛 , 𝐸𝑆̂ 𝑦 = 𝑃 𝑦 and 𝐸𝑆̂ ⟂ 𝑦 = 𝑀 𝑦, then

𝑃 𝑦 ⟂ 𝑀𝑦 and 𝑦 = 𝑃 𝑦 + 𝑀 𝑦 for all 𝑦 ∈ ℝ𝑛

The next figure illustrates


1.4 Orthonormal Basis

An orthogonal set of vectors 𝑂 ⊂ ℝ𝑛 is called an orthonormal set if ‖𝑢‖ = 1 for all 𝑢 ∈ 𝑂.


Let 𝑆 be a linear subspace of ℝ𝑛 and let 𝑂 ⊂ 𝑆.
If 𝑂 is orthonormal and span 𝑂 = 𝑆, then 𝑂 is called an orthonormal basis of 𝑆.
𝑂 is necessarily a basis of 𝑆 (being independent by orthogonality and the fact that no element is the zero vector).
One example of an orthonormal set is the canonical basis {𝑒1 , … , 𝑒𝑛 } that forms an orthonormal basis of ℝ𝑛 , where 𝑒𝑖
is the 𝑖 th unit vector.
If {𝑢1 , … , 𝑢𝑘 } is an orthonormal basis of linear subspace 𝑆, then

$$x = \sum_{i=1}^{k} \langle x, u_i \rangle u_i \quad \text{for all } x \in S$$

To see this, observe that since 𝑥 ∈ span{𝑢1 , … , 𝑢𝑘 }, we can find scalars 𝛼1 , … , 𝛼𝑘 that verify

$$x = \sum_{j=1}^{k} \alpha_j u_j \tag{1.1}$$

Taking the inner product with respect to 𝑢𝑖 gives

$$\langle x, u_i \rangle = \sum_{j=1}^{k} \alpha_j \langle u_j, u_i \rangle = \alpha_i$$

Combining this result with (1.1) verifies the claim.

1.4.1 Projection onto an Orthonormal Basis

When the subspace onto which we are projecting has an orthonormal basis, computing the projection simplifies:
Theorem If {𝑢1 , … , 𝑢𝑘 } is an orthonormal basis for 𝑆, then

$$P y = \sum_{i=1}^{k} \langle y, u_i \rangle u_i, \quad \forall \, y \in \mathbb{R}^n \tag{1.2}$$

Proof: Fix 𝑦 ∈ ℝ𝑛 and let 𝑃 𝑦 be defined as in (1.2).

Clearly, 𝑃 𝑦 ∈ 𝑆.
We claim that 𝑦 − 𝑃 𝑦 ⟂ 𝑆 also holds.
It suffices to show that 𝑦 − 𝑃 𝑦 is orthogonal to every basis vector 𝑢𝑗 .
This is true because

$$\Big\langle y - \sum_{i=1}^{k} \langle y, u_i \rangle u_i, \; u_j \Big\rangle = \langle y, u_j \rangle - \sum_{i=1}^{k} \langle y, u_i \rangle \langle u_i, u_j \rangle = 0$$

(Why is this sufficient to establish the claim that 𝑦 − 𝑃 𝑦 ⟂ 𝑆?)
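To see the theorem at work numerically, here is a small sketch with an orthonormal pair 𝑢1, 𝑢2 chosen by hand for illustration; it builds 𝑃𝑦 from the sum in (1.2) and confirms that the residual is orthogonal to each basis vector:

u1 = np.array([1.0, 0.0, 0.0])
u2 = np.array([0.0, 1.0, 1.0]) / np.sqrt(2)   # {u1, u2} is an orthonormal set

y = np.array([1.0, 3.0, -3.0])

Py = (y @ u1) * u1 + (y @ u2) * u2            # P y = Σ_i ⟨y, u_i⟩ u_i
print(Py)
print(np.isclose((y - Py) @ u1, 0), np.isclose((y - Py) @ u2, 0))   # True True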


1.5 Projection Via Matrix Algebra

Let 𝑆 be a linear subspace of ℝ𝑛 and let 𝑦 ∈ ℝ𝑛 .


We want to compute the matrix 𝑃 that verifies

𝐸𝑆̂ 𝑦 = 𝑃 𝑦

Evidently 𝑃 𝑦 is a linear function from 𝑦 ∈ ℝ𝑛 to 𝑃 𝑦 ∈ ℝ𝑛 .


This reference is useful.
Theorem. Let the columns of 𝑛 × 𝑘 matrix 𝑋 form a basis of 𝑆. Then

𝑃 = 𝑋(𝑋 ′ 𝑋)−1 𝑋 ′

Proof: Given arbitrary 𝑦 ∈ ℝ𝑛 and 𝑃 = 𝑋(𝑋 ′ 𝑋)−1 𝑋 ′ , our claim is that


1. 𝑃 𝑦 ∈ 𝑆, and
2. 𝑦 − 𝑃 𝑦 ⟂ 𝑆
Claim 1 is true because

𝑃 𝑦 = 𝑋(𝑋 ′ 𝑋)−1 𝑋 ′ 𝑦 = 𝑋𝑎 when 𝑎 ∶= (𝑋 ′ 𝑋)−1 𝑋 ′ 𝑦

An expression of the form 𝑋𝑎 is precisely a linear combination of the columns of 𝑋 and hence an element of 𝑆.
Claim 2 is equivalent to the statement

𝑦 − 𝑋(𝑋 ′ 𝑋)−1 𝑋 ′ 𝑦 ⟂ 𝑋𝑏 for all 𝑏 ∈ ℝ𝐾

To verify this, notice that if 𝑏 ∈ ℝ𝐾 , then

(𝑋𝑏)′ [𝑦 − 𝑋(𝑋 ′ 𝑋)−1 𝑋 ′ 𝑦] = 𝑏′ [𝑋 ′ 𝑦 − 𝑋 ′ 𝑦] = 0

The proof is now complete.
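As a numerical sketch of the theorem, we can build 𝑃 = 𝑋(𝑋′𝑋)⁻¹𝑋′ for a small matrix (the same arrays that reappear in Exercise 1.8.3 below) and check the two claims directly:

X = np.array([[1.0, 0.0],
              [0.0, -6.0],
              [2.0, 2.0]])
y = np.array([1.0, 3.0, -3.0])

P = X @ np.linalg.inv(X.T @ X) @ X.T
Py = P @ y

# Claim 2: y - P y is orthogonal to S, i.e. to every column of X
print(np.allclose(X.T @ (y - Py), 0))    # True

# Claim 1: P y lies in S = span(X), since it equals X a with a := (X'X)^{-1} X'y
a = np.linalg.inv(X.T @ X) @ X.T @ y
print(np.allclose(X @ a, Py))            # True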

1.5.1 Starting with the Basis

It is common in applications to start with 𝑛 × 𝑘 matrix 𝑋 with linearly independent columns and let

𝑆 ∶= span 𝑋 ∶= span{col1 𝑋, … , col𝑘 𝑋}

Then the columns of 𝑋 form a basis of 𝑆.


From the preceding theorem, 𝑃 𝑦 = 𝑋(𝑋 ′ 𝑋)−1 𝑋 ′ 𝑦 is the orthogonal projection of 𝑦 onto 𝑆.
In this context, 𝑃 is often called the projection matrix
• The matrix 𝑀 = 𝐼 − 𝑃 satisfies 𝑀 𝑦 = 𝐸𝑆̂ ⟂ 𝑦 and is sometimes called the annihilator matrix.
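Continuing with the arrays 𝑋, 𝑦 and 𝑃 from the sketch above, we can also form the annihilator and confirm the decomposition numerically (again, an illustration rather than part of the original lecture code):

M = np.eye(3) - P                          # annihilator associated with X

print(np.isclose((P @ y) @ (M @ y), 0))    # P y ⟂ M y
print(np.allclose(P @ y + M @ y, y))       # y = P y + M y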

1.5.2 The Orthonormal Case

Suppose that 𝑈 is 𝑛 × 𝑘 with orthonormal columns.


Let 𝑢𝑖 ∶= col𝑖 𝑈 for each 𝑖, let 𝑆 ∶= span 𝑈 and let 𝑦 ∈ ℝ𝑛 .


We know that the projection of 𝑦 onto 𝑆 is

𝑃 𝑦 = 𝑈 (𝑈 ′ 𝑈 )−1 𝑈 ′ 𝑦

Since 𝑈 has orthonormal columns, we have 𝑈 ′ 𝑈 = 𝐼.


Hence
$$P y = U U' y = \sum_{i=1}^{k} \langle u_i, y \rangle u_i$$

We have recovered our earlier result about projecting onto the span of an orthonormal basis.
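Here is a sketch of this equivalence in code, using the qr routine imported at the start of the lecture to produce a matrix with orthonormal columns spanning the same subspace as the 𝑋 and 𝑃 used in the sketches above:

U, R = qr(X, mode='economic')    # columns of U form an orthonormal basis for span(X)

print(np.allclose(U.T @ U, np.eye(2)))    # U'U = I
print(np.allclose(U @ U.T @ y, P @ y))    # U U'y equals the projection P y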

1.5.3 Application: Overdetermined Systems of Equations

Let 𝑦 ∈ ℝ𝑛 and let 𝑋 be 𝑛 × 𝑘 with linearly independent columns.


Given 𝑋 and 𝑦, we seek 𝑏 ∈ ℝ𝑘 that satisfies the system of linear equations 𝑋𝑏 = 𝑦.
If 𝑛 > 𝑘 (more equations than unknowns), then the system is said to be overdetermined.
Intuitively, we may not be able to find a 𝑏 that satisfies all 𝑛 equations.
The best approach here is to
• Accept that an exact solution may not exist.
• Look instead for an approximate solution.
By approximate solution, we mean a 𝑏 ∈ ℝ𝑘 such that 𝑋𝑏 is close to 𝑦.
The next theorem shows that a best approximation is well defined and unique.
The proof uses the OPT.
Theorem The unique minimizer of ‖𝑦 − 𝑋𝑏‖ over 𝑏 ∈ ℝ𝐾 is

𝛽 ̂ ∶= (𝑋 ′ 𝑋)−1 𝑋 ′ 𝑦

Proof: Note that

𝑋 𝛽 ̂ = 𝑋(𝑋 ′ 𝑋)−1 𝑋 ′ 𝑦 = 𝑃 𝑦

Since 𝑃 𝑦 is the orthogonal projection onto span(𝑋) we have

‖𝑦 − 𝑃 𝑦‖ ≤ ‖𝑦 − 𝑧‖ for any 𝑧 ∈ span(𝑋)

Because 𝑋𝑏 ∈ span(𝑋)

‖𝑦 − 𝑋 𝛽‖̂ ≤ ‖𝑦 − 𝑋𝑏‖ for any 𝑏 ∈ ℝ𝐾

This is what we aimed to show.
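As a sketch of this result, the closed-form expression can be compared against NumPy's built-in least squares routine (the arrays below are again those of Exercise 1.8.3; the comparison itself is illustrative and not part of the original lecture):

X = np.array([[1.0, 0.0],
              [0.0, -6.0],
              [2.0, 2.0]])
y = np.array([1.0, 3.0, -3.0])

beta_formula = np.linalg.inv(X.T @ X) @ X.T @ y      # (X'X)^{-1} X'y
beta_lstsq, *_ = np.linalg.lstsq(X, y, rcond=None)   # library least squares solver

print(np.allclose(beta_formula, beta_lstsq))         # True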

1.6 Least Squares Regression

Let’s apply the theory of orthogonal projection to least squares regression.


This approach provides insights about many geometric properties of linear regression.
We treat only some examples.


1.6.1 Squared Risk Measures

Given pairs (𝑥, 𝑦) ∈ ℝ𝐾 × ℝ, consider choosing 𝑓 ∶ ℝ𝐾 → ℝ to minimize the risk

𝑅(𝑓) ∶= 𝔼 [(𝑦 − 𝑓(𝑥))2 ]

If probabilities and hence 𝔼 are unknown, we cannot solve this problem directly.
However, if a sample is available, we can estimate the risk with the empirical risk:

$$\min_{f \in \mathcal{F}} \; \frac{1}{N} \sum_{n=1}^{N} (y_n - f(x_n))^2$$

Minimizing this expression is called empirical risk minimization.


The set ℱ is sometimes called the hypothesis space.
The theory of statistical learning tells us that to prevent overfitting we should take the set ℱ to be relatively simple.
If we let ℱ be the class of linear functions, the problem is
$$\min_{b \in \mathbb{R}^K} \; \sum_{n=1}^{N} (y_n - b' x_n)^2$$

This is the sample linear least squares problem.

1.6.2 Solution

Define the matrices

$$
y := \begin{pmatrix} y_1 \\ y_2 \\ \vdots \\ y_N \end{pmatrix},
\qquad
x_n := \begin{pmatrix} x_{n1} \\ x_{n2} \\ \vdots \\ x_{nK} \end{pmatrix}
= \text{$n$-th obs on all regressors}
$$

and

$$
X := \begin{pmatrix} x_1' \\ x_2' \\ \vdots \\ x_N' \end{pmatrix}
=:
\begin{pmatrix}
x_{11} & x_{12} & \cdots & x_{1K} \\
x_{21} & x_{22} & \cdots & x_{2K} \\
\vdots & \vdots &        & \vdots \\
x_{N1} & x_{N2} & \cdots & x_{NK}
\end{pmatrix}
$$

We assume throughout that 𝑁 > 𝐾 and 𝑋 is full column rank.

If you work through the algebra, you will be able to verify that $\|y - Xb\|^2 = \sum_{n=1}^{N} (y_n - b' x_n)^2$.

Since monotone transforms don't affect minimizers, we have

$$\operatorname*{arg\,min}_{b \in \mathbb{R}^K} \sum_{n=1}^{N} (y_n - b' x_n)^2 = \operatorname*{arg\,min}_{b \in \mathbb{R}^K} \|y - Xb\|$$

By our results about overdetermined linear systems of equations, the solution is

𝛽 ̂ ∶= (𝑋 ′ 𝑋)−1 𝑋 ′ 𝑦

Let 𝑃 and 𝑀 be the projection and annihilator associated with 𝑋:

𝑃 ∶= 𝑋(𝑋 ′ 𝑋)−1 𝑋 ′ and 𝑀 ∶= 𝐼 − 𝑃


The vector of fitted values is

𝑦 ̂ ∶= 𝑋 𝛽 ̂ = 𝑃 𝑦

The vector of residuals is

𝑢̂ ∶= 𝑦 − 𝑦 ̂ = 𝑦 − 𝑃 𝑦 = 𝑀 𝑦

Here are some more standard definitions:

• The total sum of squares is TSS := ‖𝑦‖².
• The sum of squared residuals is SSR := ‖𝑢̂‖².
• The explained sum of squares is ESS := ‖𝑦̂‖².

These quantities satisfy

TSS = ESS + SSR

We can prove this easily using the OPT.
From the OPT we have 𝑦 = 𝑦̂ + 𝑢̂ and 𝑢̂ ⟂ 𝑦̂.
Applying the Pythagorean law completes the proof.
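A small numerical confirmation of TSS = ESS + SSR, sketched with the same toy data set used earlier:

X = np.array([[1.0, 0.0],
              [0.0, -6.0],
              [2.0, 2.0]])
y = np.array([1.0, 3.0, -3.0])

P = X @ np.linalg.inv(X.T @ X) @ X.T
y_hat = P @ y          # fitted values
u_hat = y - y_hat      # residuals

TSS = y @ y
ESS = y_hat @ y_hat
SSR = u_hat @ u_hat
print(np.isclose(TSS, ESS + SSR))   # True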

1.7 Orthogonalization and Decomposition

Let’s return to the connection between linear independence and orthogonality touched on above.
A result of much interest is a famous algorithm for constructing orthonormal sets from linearly independent sets.
The next section gives details.

1.7.1 Gram-Schmidt Orthogonalization

Theorem For each linearly independent set {𝑥1 , … , 𝑥𝑘 } ⊂ ℝ𝑛 , there exists an orthonormal set {𝑢1 , … , 𝑢𝑘 } with

span{𝑥1 , … , 𝑥𝑖 } = span{𝑢1 , … , 𝑢𝑖 } for 𝑖 = 1, … , 𝑘

The Gram-Schmidt orthogonalization procedure constructs such an orthonormal set {𝑢1 , 𝑢2 , … , 𝑢𝑘 }.


One description of this procedure is as follows:
• For 𝑖 = 1, … , 𝑘, form 𝑆𝑖 ∶= span{𝑥1 , … , 𝑥𝑖 } and 𝑆𝑖⟂
• Set 𝑣1 = 𝑥1
• For 𝑖 ≥ 2, set $v_i := \hat{E}_{S_{i-1}^{\perp}} x_i$ and $u_i := v_i / \|v_i\|$

The sequence 𝑢1 , … , 𝑢𝑘 has the stated properties.


A Gram-Schmidt orthogonalization construction is a key idea behind the Kalman filter described in A First Look at the
Kalman filter.
In some exercises below, you are asked to implement this algorithm and test it using projection.


1.7.2 QR Decomposition

The following result uses the preceding algorithm to produce a useful decomposition.
Theorem If 𝑋 is 𝑛 × 𝑘 with linearly independent columns, then there exists a factorization 𝑋 = 𝑄𝑅 where
• 𝑅 is 𝑘 × 𝑘, upper triangular, and nonsingular
• 𝑄 is 𝑛 × 𝑘 with orthonormal columns
Proof sketch: Let
• 𝑥𝑗 ∶= col𝑗 (𝑋)
• {𝑢1 , … , 𝑢𝑘 } be orthonormal with the same span as {𝑥1 , … , 𝑥𝑘 } (to be constructed using Gram–Schmidt)
• 𝑄 be formed from cols 𝑢𝑖
Since 𝑥𝑗 ∈ span{𝑢1 , … , 𝑢𝑗 }, we have

$$x_j = \sum_{i=1}^{j} \langle u_i, x_j \rangle u_i \quad \text{for } j = 1, \ldots, k$$

Some rearranging gives 𝑋 = 𝑄𝑅.

1.7.3 Linear Regression via QR Decomposition

For matrices 𝑋 and 𝑦 that overdetermine 𝛽 in the linear equation system 𝑦 = 𝑋𝛽, we found the least squares approximator
𝛽 ̂ = (𝑋 ′ 𝑋)−1 𝑋 ′ 𝑦.
Using the QR decomposition 𝑋 = 𝑄𝑅 gives

𝛽 ̂ = (𝑅′ 𝑄′ 𝑄𝑅)−1 𝑅′ 𝑄′ 𝑦
= (𝑅′ 𝑅)−1 𝑅′ 𝑄′ 𝑦
= 𝑅−1 (𝑅′ )−1 𝑅′ 𝑄′ 𝑦 = 𝑅−1 𝑄′ 𝑦

Numerical routines would in this case use the alternative form 𝑅𝛽 ̂ = 𝑄′ 𝑦 and back substitution.
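To illustrate (a sketch using SciPy routines, not code from the lecture itself), the following compares the QR route, with the triangular system solved by back substitution, against the normal-equations formula:

import numpy as np
from scipy.linalg import qr, solve_triangular

np.random.seed(0)
X = np.random.randn(20, 3)
y = np.random.randn(20)

Q, R = qr(X, mode='economic')
β_qr = solve_triangular(R, Q.T @ y)        # Solve R β = Q'y by back substitution

β_ls = np.linalg.solve(X.T @ X, X.T @ y)   # (X'X)^{-1} X'y for comparison
print(np.allclose(β_qr, β_ls))             # True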

1.8 Exercises

Exercise 1.8.1
Show that, for any linear subspace 𝑆 ⊂ ℝ𝑛 , 𝑆 ∩ 𝑆 ⟂ = {0}.

Solution to Exercise 1.8.1


If 𝑥 ∈ 𝑆 and 𝑥 ∈ 𝑆 ⟂ , then we have in particular that ⟨𝑥, 𝑥⟩ = 0, but then 𝑥 = 0.

Exercise 1.8.2
Let 𝑃 = 𝑋(𝑋 ′ 𝑋)−1 𝑋 ′ and let 𝑀 = 𝐼 − 𝑃 . Show that 𝑃 and 𝑀 are both idempotent and symmetric. Can you give
any intuition as to why they should be idempotent?


Solution to Exercise 1.8.2


Symmetry and idempotence of 𝑀 and 𝑃 can be established using standard rules for matrix algebra. The intuition behind
idempotence of 𝑀 and 𝑃 is that both are orthogonal projections. After a point is projected into a given subspace, applying
the projection again makes no difference (A point inside the subspace is not shifted by orthogonal projection onto that
space because it is already the closest point in the subspace to itself).

Exercise 1.8.3
Using Gram-Schmidt orthogonalization, produce a linear projection of 𝑦 onto the column space of 𝑋 and verify this
using the projection matrix 𝑃 ∶= 𝑋(𝑋 ′ 𝑋)−1 𝑋 ′ and also using QR decomposition, where:

$$y := \begin{pmatrix} 1 \\ 3 \\ -3 \end{pmatrix}, \qquad
X := \begin{pmatrix} 1 & 0 \\ 0 & -6 \\ 2 & 2 \end{pmatrix}$$

Solution to Exercise 1.8.3


Here’s a function that computes the orthonormal vectors using the GS algorithm given in the lecture

def gram_schmidt(X):
    """
    Implements Gram-Schmidt orthogonalization.

    Parameters
    ----------
    X : an n x k array with linearly independent columns

    Returns
    -------
    U : an n x k array with orthonormal columns

    """

    # Set up
    n, k = X.shape
    U = np.empty((n, k))
    I = np.eye(n)

    # The first col of U is just the normalized first col of X
    v1 = X[:, 0]
    U[:, 0] = v1 / np.sqrt(np.sum(v1 * v1))

    for i in range(1, k):

        # Set up
        b = X[:, i]       # The vector we're going to project
        Z = X[:, 0:i]     # First i columns of X

        # Project onto the orthogonal complement of the col span of Z
        M = I - Z @ np.linalg.inv(Z.T @ Z) @ Z.T
        u = M @ b

        # Normalize
        U[:, i] = u / np.sqrt(np.sum(u * u))

    return U

Here are the arrays we’ll work with

y = [1, 3, -3]

X = [[1, 0],
[0, -6],
[2, 2]]

X, y = [np.asarray(z) for z in (X, y)]

First, let’s try projection of 𝑦 onto the column space of 𝑋 using the ordinary matrix expression:

Py1 = X @ np.linalg.inv(X.T @ X) @ X.T @ y


Py1

array([-0.56521739, 3.26086957, -2.2173913 ])

Now let’s do the same using an orthonormal basis created from our gram_schmidt function

U = gram_schmidt(X)
U

array([[ 0.4472136 , -0.13187609],


[ 0. , -0.98907071],
[ 0.89442719, 0.06593805]])

Py2 = U @ U.T @ y
Py2

array([-0.56521739, 3.26086957, -2.2173913 ])

This is the same answer. So far so good. Finally, let’s try the same thing but with the basis obtained via QR decomposition:

Q, R = qr(X, mode='economic')
Q

array([[-0.4472136 , -0.13187609],
[-0. , -0.98907071],
[-0.89442719, 0.06593805]])


Py3 = Q @ Q.T @ y
Py3

array([-0.56521739, 3.26086957, -2.2173913 ])

Again, we obtain the same answer.



CHAPTER

TWO

CONTINUOUS STATE MARKOV CHAINS

In addition to what’s in Anaconda, this lecture will need the following libraries:

!pip install --upgrade quantecon

2.1 Overview

In a previous lecture, we learned about finite Markov chains, a relatively elementary class of stochastic dynamic models.
The present lecture extends this analysis to continuous (i.e., uncountable) state Markov chains.
Most stochastic dynamic models studied by economists either fit directly into this class or can be represented as continuous
state Markov chains after minor modifications.
In this lecture, our focus will be on continuous Markov models that
• evolve in discrete-time
• are often nonlinear
The fact that we accommodate nonlinear models here is significant, because linear stochastic models have their own highly
developed toolset, as we’ll see later on.
The question that interests us most is: Given a particular stochastic dynamic model, how will the state of the system
evolve over time?
In particular,
• What happens to the distribution of the state variables?
• Is there anything we can say about the “average behavior” of these variables?
• Is there a notion of “steady state” or “long-run equilibrium” that’s applicable to the model?
– If so, how can we compute it?
Answering these questions will lead us to revisit many of the topics that occupied us in the finite state case, such as
simulation, distribution dynamics, stability, ergodicity, etc.

Note: For some people, the term “Markov chain” always refers to a process with a finite or discrete state space. We
follow the mainstream mathematical literature (e.g., [Meyn and Tweedie, 2009]) in using the term to refer to any discrete
time Markov process.

Let’s begin with some imports:


import numpy as np
import matplotlib.pyplot as plt
from scipy.stats import lognorm, beta
from quantecon import LAE
from scipy.stats import norm, gaussian_kde

2.2 The Density Case

You are probably aware that some distributions can be represented by densities and some cannot.
(For example, distributions on the real numbers ℝ that put positive probability on individual points have no density
representation)
We are going to start our analysis by looking at Markov chains where the one-step transition probabilities have density
representations.
The benefit is that the density case offers a very direct parallel to the finite case in terms of notation and intuition.
Once we’ve built some intuition we’ll cover the general case.

2.2.1 Definitions and Basic Properties

In our lecture on finite Markov chains, we studied discrete-time Markov chains that evolve on a finite state space 𝑆.
In this setting, the dynamics of the model are described by a stochastic matrix — a nonnegative square matrix 𝑃 = 𝑃 [𝑖, 𝑗]
such that each row 𝑃 [𝑖, ⋅] sums to one.
The interpretation of 𝑃 is that 𝑃 [𝑖, 𝑗] represents the probability of transitioning from state 𝑖 to state 𝑗 in one unit of time.
In symbols,

ℙ{𝑋𝑡+1 = 𝑗 | 𝑋𝑡 = 𝑖} = 𝑃 [𝑖, 𝑗]

Equivalently,
• 𝑃 can be thought of as a family of distributions 𝑃 [𝑖, ⋅], one for each 𝑖 ∈ 𝑆
• 𝑃 [𝑖, ⋅] is the distribution of 𝑋𝑡+1 given 𝑋𝑡 = 𝑖
(As you probably recall, when using NumPy arrays, 𝑃 [𝑖, ⋅] is expressed as P[i,:])
In this section, we’ll allow 𝑆 to be a subset of ℝ, such as
• ℝ itself
• the positive reals (0, ∞)
• a bounded interval (𝑎, 𝑏)
The family of discrete distributions 𝑃 [𝑖, ⋅] will be replaced by a family of densities 𝑝(𝑥, ⋅), one for each 𝑥 ∈ 𝑆.
Analogous to the finite state case, 𝑝(𝑥, ⋅) is to be understood as the distribution (density) of 𝑋𝑡+1 given 𝑋𝑡 = 𝑥.
More formally, a stochastic kernel on 𝑆 is a function 𝑝 ∶ 𝑆 × 𝑆 → ℝ with the property that
1. 𝑝(𝑥, 𝑦) ≥ 0 for all 𝑥, 𝑦 ∈ 𝑆
2. ∫ 𝑝(𝑥, 𝑦)𝑑𝑦 = 1 for all 𝑥 ∈ 𝑆


(Integrals are over the whole space unless otherwise specified)


For example, let 𝑆 = ℝ and consider the particular stochastic kernel 𝑝𝑤 defined by

$$p_w(x, y) := \frac{1}{\sqrt{2\pi}} \exp\left\{ - \frac{(y - x)^2}{2} \right\} \tag{2.1}$$

What kind of model does 𝑝𝑤 represent?


The answer is, the (normally distributed) random walk

$$X_{t+1} = X_t + \xi_{t+1} \quad \text{where } \{\xi_t\} \overset{\text{IID}}{\sim} N(0, 1) \tag{2.2}$$

To see this, let’s find the stochastic kernel 𝑝 corresponding to (2.2).


Recall that 𝑝(𝑥, ⋅) represents the distribution of 𝑋𝑡+1 given 𝑋𝑡 = 𝑥.
Letting 𝑋𝑡 = 𝑥 in (2.2) and considering the distribution of 𝑋𝑡+1 , we see that 𝑝(𝑥, ⋅) = 𝑁 (𝑥, 1).
In other words, 𝑝 is exactly 𝑝𝑤 , as defined in (2.1).

2.2.2 Connection to Stochastic Difference Equations

In the previous section, we made the connection between stochastic difference equation (2.2) and stochastic kernel (2.1).
In economics and time-series analysis we meet stochastic difference equations of all different shapes and sizes.
It will be useful for us if we have some systematic methods for converting stochastic difference equations into stochastic
kernels.
To this end, consider the generic (scalar) stochastic difference equation given by

𝑋𝑡+1 = 𝜇(𝑋𝑡 ) + 𝜎(𝑋𝑡 ) 𝜉𝑡+1 (2.3)

Here we assume that


• $\{\xi_t\} \overset{\text{IID}}{\sim} \phi$, where 𝜙 is a given density on ℝ
• 𝜇 and 𝜎 are given functions on 𝑆, with 𝜎(𝑥) > 0 for all 𝑥
Example 1: The random walk (2.2) is a special case of (2.3), with 𝜇(𝑥) = 𝑥 and 𝜎(𝑥) = 1.
Example 2: Consider the ARCH model

𝑋𝑡+1 = 𝛼𝑋𝑡 + 𝜎𝑡 𝜉𝑡+1 , 𝜎𝑡2 = 𝛽 + 𝛾𝑋𝑡2 , 𝛽, 𝛾 > 0

Alternatively, we can write the model as

𝑋𝑡+1 = 𝛼𝑋𝑡 + (𝛽 + 𝛾𝑋𝑡2 )1/2 𝜉𝑡+1 (2.4)

This is a special case of (2.3) with 𝜇(𝑥) = 𝛼𝑥 and 𝜎(𝑥) = (𝛽 + 𝛾𝑥2 )1/2 .
Example 3: With stochastic production and a constant savings rate, the one-sector neoclassical growth model leads to a
law of motion for capital per worker such as

𝑘𝑡+1 = 𝑠𝐴𝑡+1 𝑓(𝑘𝑡 ) + (1 − 𝛿)𝑘𝑡 (2.5)

Here
• 𝑠 is the rate of savings
• 𝐴𝑡+1 is a production shock


– The 𝑡 + 1 subscript indicates that 𝐴𝑡+1 is not visible at time 𝑡


• 𝛿 is a depreciation rate
• 𝑓 ∶ ℝ+ → ℝ+ is a production function satisfying 𝑓(𝑘) > 0 whenever 𝑘 > 0
(The fixed savings rate can be rationalized as the optimal policy for a particular set of technologies and preferences (see
[Ljungqvist and Sargent, 2018], section 3.1.2), although we omit the details here).
Equation (2.5) is a special case of (2.3) with 𝜇(𝑥) = (1 − 𝛿)𝑥 and 𝜎(𝑥) = 𝑠𝑓(𝑥).
Now let’s obtain the stochastic kernel corresponding to the generic model (2.3).
To find it, note first that if 𝑈 is a random variable with density 𝑓𝑈 , and 𝑉 = 𝑎 + 𝑏𝑈 for some constants 𝑎, 𝑏 with 𝑏 > 0,
then the density of 𝑉 is given by
$$f_V(v) = \frac{1}{b} f_U \left( \frac{v - a}{b} \right) \tag{2.6}$$
(The proof is below. For a multidimensional version see EDTC, theorem 8.1.3).
Taking (2.6) as given for the moment, we can obtain the stochastic kernel 𝑝 for (2.3) by recalling that 𝑝(𝑥, ⋅) is the
conditional density of 𝑋𝑡+1 given 𝑋𝑡 = 𝑥.
In the present case, this is equivalent to stating that 𝑝(𝑥, ⋅) is the density of 𝑌 ∶= 𝜇(𝑥) + 𝜎(𝑥) 𝜉𝑡+1 when 𝜉𝑡+1 ∼ 𝜙.
Hence, by (2.6),

$$p(x, y) = \frac{1}{\sigma(x)} \phi\left( \frac{y - \mu(x)}{\sigma(x)} \right) \tag{2.7}$$

For example, the growth model in (2.5) has stochastic kernel

$$p(x, y) = \frac{1}{s f(x)} \phi\left( \frac{y - (1 - \delta) x}{s f(x)} \right) \tag{2.8}$$

where 𝜙 is the density of 𝐴𝑡+1 .


(Regarding the state space 𝑆 for this model, a natural choice is (0, ∞) — in which case 𝜎(𝑥) = 𝑠𝑓(𝑥) is strictly positive
for all 𝑥 as required)
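To make (2.7) concrete in code, here is a minimal sketch (the helper name kernel_factory is ours, not part of the lecture's code) that builds a stochastic kernel from the primitives 𝜇, 𝜎 and 𝜙; the growth model kernel (2.8) then comes out as a special case:

import numpy as np
from scipy.stats import lognorm

def kernel_factory(μ, σ, ϕ):
    "Return the stochastic kernel p(x, y) in (2.7), given primitives μ, σ and shock density ϕ."
    def p(x, y):
        return ϕ.pdf((y - μ(x)) / σ(x)) / σ(x)
    return p

# Growth model primitives from (2.5): μ(x) = (1 - δ)x and σ(x) = s f(x) with f(k) = k**α
s, δ, α, a_σ = 0.2, 0.1, 0.4, 0.4
p_growth = kernel_factory(lambda x: (1 - δ) * x,
                          lambda x: s * x**α,
                          lognorm(a_σ))
print(p_growth(1.0, 1.0))    # Density of k_{t+1} evaluated at 1.0 given k_t = 1.0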

2.2.3 Distribution Dynamics

In this section of our lecture on finite Markov chains, we asked the following question: If
1. {𝑋𝑡 } is a Markov chain with stochastic matrix 𝑃
2. the distribution of 𝑋𝑡 is known to be 𝜓𝑡
then what is the distribution of 𝑋𝑡+1 ?
Letting 𝜓𝑡+1 denote the distribution of 𝑋𝑡+1 , the answer we gave was that

$$\psi_{t+1}[j] = \sum_{i \in S} P[i, j] \, \psi_t[i]$$

This intuitive equality states that the probability of being at 𝑗 tomorrow is the probability of visiting 𝑖 today and then
going on to 𝑗, summed over all possible 𝑖.
In the density case, we just replace the sum with an integral and probability mass functions with densities, yielding

𝜓𝑡+1 (𝑦) = ∫ 𝑝(𝑥, 𝑦)𝜓𝑡 (𝑥) 𝑑𝑥, ∀𝑦 ∈ 𝑆 (2.9)


It is convenient to think of this updating process in terms of an operator.


(An operator is just a function, but the term is usually reserved for a function that sends functions into functions)
Let 𝒟 be the set of all densities on 𝑆, and let 𝑃 be the operator from 𝒟 to itself that takes density 𝜓 and sends it into
new density 𝜓𝑃 , where the latter is defined by

(𝜓𝑃 )(𝑦) = ∫ 𝑝(𝑥, 𝑦)𝜓(𝑥)𝑑𝑥 (2.10)

This operator is usually called the Markov operator corresponding to 𝑝

Note: Unlike most operators, we write 𝑃 to the right of its argument, instead of to the left (i.e., 𝜓𝑃 instead of 𝑃 𝜓).
This is a common convention, with the intention being to maintain the parallel with the finite case — see here

With this notation, we can write (2.9) more succinctly as 𝜓𝑡+1 (𝑦) = (𝜓𝑡 𝑃 )(𝑦) for all 𝑦, or, dropping the 𝑦 and letting
“=” indicate equality of functions,

𝜓𝑡+1 = 𝜓𝑡 𝑃 (2.11)

Equation (2.11) tells us that if we specify a distribution for 𝜓0 , then the entire sequence of future distributions can be
obtained by iterating with 𝑃 .
It’s interesting to note that (2.11) is a deterministic difference equation.
Thus, by converting a stochastic difference equation such as (2.3) into a stochastic kernel 𝑝 and hence an operator 𝑃 , we
convert a stochastic difference equation into a deterministic one (albeit in a much higher dimensional space).

Note: Some people might be aware that discrete Markov chains are in fact a special case of the continuous Markov
chains we have just described. The reason is that probability mass functions are densities with respect to the counting
measure.

2.2.4 Computation

To learn about the dynamics of a given process, it’s useful to compute and study the sequences of densities generated by
the model.
One way to do this is to try to implement the iteration described by (2.10) and (2.11) using numerical integration.
However, to produce 𝜓𝑃 from 𝜓 via (2.10), you would need to integrate at every 𝑦, and there is a continuum of such 𝑦.
Another possibility is to discretize the model, but this introduces errors of unknown size.
A nicer alternative in the present setting is to combine simulation with an elegant estimator called the look-ahead estimator.
Let’s go over the ideas with reference to the growth model discussed above, the dynamics of which we repeat here for
convenience:

𝑘𝑡+1 = 𝑠𝐴𝑡+1 𝑓(𝑘𝑡 ) + (1 − 𝛿)𝑘𝑡 (2.12)

Our aim is to compute the sequence {𝜓𝑡 } associated with this model and fixed initial condition 𝜓0 .
To approximate 𝜓𝑡 by simulation, recall that, by definition, 𝜓𝑡 is the density of 𝑘𝑡 given 𝑘0 ∼ 𝜓0 .
If we wish to generate observations of this random variable, all we need to do is
1. draw 𝑘0 from the specified initial condition 𝜓0


2. draw the shocks 𝐴1 , … , 𝐴𝑡 from their specified density 𝜙


3. compute 𝑘𝑡 iteratively via (2.12)
If we repeat this 𝑛 times, we get 𝑛 independent observations 𝑘𝑡1 , … , 𝑘𝑡𝑛 .
With these draws in hand, the next step is to generate some kind of representation of their distribution 𝜓𝑡 .
A naive approach would be to use a histogram, or perhaps a smoothed histogram using SciPy’s gaussian_kde function.
However, in the present setting, there is a much better way to do this, based on the look-ahead estimator.
With this estimator, to construct an estimate of 𝜓𝑡 , we actually generate 𝑛 observations of 𝑘𝑡−1 , rather than 𝑘𝑡 .
Now we take these 𝑛 observations $k_{t-1}^1, \ldots, k_{t-1}^n$ and form the estimate

$$\psi_t^n(y) = \frac{1}{n} \sum_{i=1}^{n} p(k_{t-1}^i, y) \tag{2.13}$$

where 𝑝 is the growth model stochastic kernel in (2.8).


What is the justification for this slightly surprising estimator?
The idea is that, by the strong law of large numbers,

$$\frac{1}{n} \sum_{i=1}^{n} p(k_{t-1}^i, y) \to \mathbb{E}\, p(k_{t-1}^i, y) = \int p(x, y) \psi_{t-1}(x) \, dx = \psi_t(y)$$

with probability one as 𝑛 → ∞.


Here the first equality is by the definition of 𝜓𝑡−1 , and the second is by (2.9).
We have just shown that our estimator 𝜓𝑡𝑛 (𝑦) in (2.13) converges almost surely to 𝜓𝑡 (𝑦), which is just what we want to
compute.
In fact, much stronger convergence results are true (see, for example, this paper).

2.2.5 Implementation

A class called LAE for estimating densities by this technique can be found in lae.py.
Given our use of the __call__ method, an instance of LAE acts as a callable object, which is essentially a function that
can store its own data (see this discussion).
This function returns the right-hand side of (2.13) using
• the data and stochastic kernel that it stores as its instance data
• the value 𝑦 as its argument
The function is vectorized, in the sense that if psi is such an instance and y is an array, then the call psi(y) acts
elementwise.
(This is the reason that we reshaped X and y inside the class — to make vectorization work)
Because the implementation is fully vectorized, it is about as efficient as it would be in C or Fortran.
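As a rough illustration of the idea (this is a stripped-down sketch, not the actual implementation in lae.py), a bare-bones look-ahead estimator only needs the broadcasting trick just described:

import numpy as np

def look_ahead_estimate(p, observations, ygrid):
    """
    Evaluate the look-ahead estimator (2.13) at each point of ygrid, given
    draws of k_{t-1} and a stochastic kernel p(x, y) that broadcasts over
    NumPy arrays in both arguments.
    """
    X = np.asarray(observations).reshape(-1, 1)   # Column of observations
    Y = np.asarray(ygrid).reshape(1, -1)          # Row of evaluation points
    return p(X, Y).mean(axis=0)                   # Average the kernel over observations

Averaging the resulting (n × m) array of kernel values over its rows gives the estimate at all m grid points at once, with no explicit loops.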


2.2.6 Example

The following code is an example of usage for the stochastic growth model described above

# == Define parameters == #
s = 0.2
δ = 0.1
a_σ = 0.4                      # A = exp(B) where B ~ N(0, a_σ)
α = 0.4                        # We set f(k) = k**α
ψ_0 = beta(5, 5, scale=0.5)    # Initial distribution
ϕ = lognorm(a_σ)

def p(x, y):
    """
    Stochastic kernel for the growth model with Cobb-Douglas production.
    Both x and y must be strictly positive.
    """
    d = s * x**α
    return ϕ.pdf((y - (1 - δ) * x) / d) / d

n = 10000    # Number of observations at each date t
T = 30       # Compute density of k_t at 1,...,T+1

# == Generate matrix s.t. t-th column is n observations of k_t == #
k = np.empty((n, T))
A = ϕ.rvs((n, T))
k[:, 0] = ψ_0.rvs(n)    # Draw first column from initial distribution
for t in range(T-1):
    k[:, t+1] = s * A[:, t] * k[:, t]**α + (1 - δ) * k[:, t]

# == Generate T instances of LAE using this data, one for each date t == #
laes = [LAE(p, k[:, t]) for t in range(T)]

# == Plot == #
fig, ax = plt.subplots()
ygrid = np.linspace(0.01, 4.0, 200)
greys = [str(g) for g in np.linspace(0.0, 0.8, T)]
greys.reverse()
for ψ, g in zip(laes, greys):
    ax.plot(ygrid, ψ(ygrid), color=g, lw=2, alpha=0.6)
ax.set_xlabel('capital')
ax.set_title(f'Density of $k_1$ (lighter) to $k_T$ (darker) for $T={T}$')
plt.show()


The figure shows part of the density sequence {𝜓𝑡 }, with each density computed via the look-ahead estimator.
Notice that the sequence of densities shown in the figure seems to be converging — more on this in just a moment.
Another quick comment is that each of these distributions could be interpreted as a cross-sectional distribution (recall
this discussion).

2.3 Beyond Densities

Up until now, we have focused exclusively on continuous state Markov chains where all conditional distributions 𝑝(𝑥, ⋅)
are densities.
As discussed above, not all distributions can be represented as densities.
If the conditional distribution of 𝑋𝑡+1 given 𝑋𝑡 = 𝑥 cannot be represented as a density for some 𝑥 ∈ 𝑆, then we need
a slightly different theory.
The ultimate option is to switch from densities to probability measures, but not all readers will be familiar with measure
theory.
We can, however, construct a fairly general theory using distribution functions.


2.3.1 Example and Definitions

To illustrate the issues, recall that Hopenhayn and Rogerson [Hopenhayn and Rogerson, 1993] study a model of firm
dynamics where individual firm productivity follows the exogenous process
$$X_{t+1} = a + \rho X_t + \xi_{t+1}, \quad \text{where } \{\xi_t\} \overset{\text{IID}}{\sim} N(0, \sigma^2)$$

As is, this fits into the density case we treated above.


However, the authors wanted this process to take values in [0, 1], so they added boundaries at the endpoints 0 and 1.
One way to write this is

𝑋𝑡+1 = ℎ(𝑎 + 𝜌𝑋𝑡 + 𝜉𝑡+1 ) where ℎ(𝑥) ∶= 𝑥 1{0 ≤ 𝑥 ≤ 1} + 1{𝑥 > 1}

If you think about it, you will see that for any given 𝑥 ∈ [0, 1], the conditional distribution of 𝑋𝑡+1 given 𝑋𝑡 = 𝑥 puts
positive probability mass on 0 and 1.
Hence it cannot be represented as a density.
What we can do instead is use cumulative distribution functions (cdfs).
To this end, set

𝐺(𝑥, 𝑦) ∶= ℙ{ℎ(𝑎 + 𝜌𝑥 + 𝜉𝑡+1 ) ≤ 𝑦} (0 ≤ 𝑥, 𝑦 ≤ 1)

This family of cdfs 𝐺(𝑥, ⋅) plays a role analogous to the stochastic kernel in the density case.
The distribution dynamics in (2.9) are then replaced by

𝐹𝑡+1 (𝑦) = ∫ 𝐺(𝑥, 𝑦)𝐹𝑡 (𝑑𝑥) (2.14)

Here 𝐹𝑡 and 𝐹𝑡+1 are cdfs representing the distribution of the current state and next period state.
The intuition behind (2.14) is essentially the same as for (2.9).

2.3.2 Computation

If you wish to compute these cdfs, you cannot use the look-ahead estimator as before.
Indeed, you should not use any density estimator, since the objects you are estimating/computing are not densities.
One good option is simulation as before, combined with the empirical distribution function.
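Here is one minimal way to implement that suggestion (a sketch only; the parameter values and sample sizes are arbitrary choices for illustration): draw from the current distribution, push the draws through the model, and evaluate the empirical cdf of the results:

import numpy as np

a, ρ, σ = 0.1, 0.8, 0.2
h = lambda x: np.clip(x, 0.0, 1.0)         # The boundaries at 0 and 1

def F_next(y, x_draws):
    "Monte Carlo estimate of F_{t+1}(y) in (2.14), given draws x_draws from F_t."
    ξ = np.random.randn(len(x_draws))
    x_next = h(a + ρ * x_draws + σ * ξ)    # One-step-ahead observations
    return np.mean(x_next <= y)            # Empirical distribution function at y

x_draws = np.random.uniform(0, 1, 5000)    # Draws representing the current distribution
print(F_next(0.5, x_draws))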

2.4 Stability

In our lecture on finite Markov chains, we also studied stationarity, stability and ergodicity.
Here we will cover the same topics for the continuous case.
We will, however, treat only the density case (as in this section), where the stochastic kernel is a family of densities.
The general case is relatively similar — references are given below.


2.4.1 Theoretical Results

Analogous to the finite case, given a stochastic kernel 𝑝 and corresponding Markov operator as defined in (2.10), a density
𝜓∗ on 𝑆 is called stationary for 𝑃 if it is a fixed point of the operator 𝑃 .
In other words,

𝜓∗ (𝑦) = ∫ 𝑝(𝑥, 𝑦)𝜓∗ (𝑥) 𝑑𝑥, ∀𝑦 ∈ 𝑆 (2.15)

As with the finite case, if 𝜓∗ is stationary for 𝑃 , and the distribution of 𝑋0 is 𝜓∗ , then, in view of (2.11), 𝑋𝑡 will have
this same distribution for all 𝑡.
Hence 𝜓∗ is the stochastic equivalent of a steady state.
In the finite case, we learned that at least one stationary distribution exists, although there may be many.
When the state space is infinite, the situation is more complicated.
Even existence can fail very easily.
For example, the random walk model has no stationary density (see, e.g., EDTC, p. 210).
However, there are well-known conditions under which a stationary density 𝜓∗ exists.
With additional conditions, we can also get a unique stationary density (𝜓 ∈ 𝒟 and 𝜓 = 𝜓𝑃 ⟹ 𝜓 = 𝜓∗ ), and also
global convergence in the sense that

∀ 𝜓 ∈ 𝒟, 𝜓𝑃 𝑡 → 𝜓∗ as 𝑡 → ∞ (2.16)

This combination of existence, uniqueness and global convergence in the sense of (2.16) is often referred to as global
stability.
Under very similar conditions, we get ergodicity, which means that

$$\frac{1}{n} \sum_{t=1}^{n} h(X_t) \to \int h(x) \psi^*(x) \, dx \quad \text{as } n \to \infty \tag{2.17}$$

for any (measurable) function ℎ ∶ 𝑆 → ℝ such that the right-hand side is finite.
Note that the convergence in (2.17) does not depend on the distribution (or value) of 𝑋0 .
This is actually very important for simulation — it means we can learn about 𝜓∗ (i.e., approximate the right-hand side of
(2.17) via the left-hand side) without requiring any special knowledge about what to do with 𝑋0 .
So what are these conditions we require to get global stability and ergodicity?
In essence, it must be the case that
1. Probability mass does not drift off to the “edges” of the state space.
2. Sufficient “mixing” obtains.
For one such set of conditions see theorem 8.2.14 of EDTC.
In addition
• [Stokey et al., 1989] contains a classic (but slightly outdated) treatment of these topics.
• From the mathematical literature, [Lasota and MacKey, 1994] and [Meyn and Tweedie, 2009] give outstanding
in-depth treatments.
• Section 8.1.2 of EDTC provides detailed intuition, and section 8.3 gives additional references.
• EDTC, section 11.3.4 provides a specific treatment for the growth model we considered in this lecture.


2.4.2 An Example of Stability

As stated above, the growth model treated here is stable under mild conditions on the primitives.
• See EDTC, section 11.3.4 for more details.
We can see this stability in action — in particular, the convergence in (2.16) — by simulating the path of densities from
various initial conditions.
Here is such a figure.

All sequences are converging towards the same limit, regardless of their initial condition.
The details regarding initial conditions and so on are given in this exercise, where you are asked to replicate the figure.

2.4.3 Computing Stationary Densities

In the preceding figure, each sequence of densities is converging towards the unique stationary density 𝜓∗ .
Even from this figure, we can get a fair idea what 𝜓∗ looks like, and where its mass is located.
However, there is a much more direct way to estimate the stationary density, and it involves only a slight modification of
the look-ahead estimator.
Let’s say that we have a model of the form (2.3) that is stable and ergodic.
Let 𝑝 be the corresponding stochastic kernel, as given in (2.7).


To approximate the stationary density 𝜓∗ , we can simply generate a long time-series 𝑋0 , 𝑋1 , … , 𝑋𝑛 and estimate 𝜓∗ via

$$\psi_n^*(y) = \frac{1}{n} \sum_{t=1}^{n} p(X_t, y) \tag{2.18}$$

This is essentially the same as the look-ahead estimator (2.13), except that now the observations we generate are a single
time-series, rather than a cross-section.
The justification for (2.18) is that, with probability one as 𝑛 → ∞,

$$\frac{1}{n} \sum_{t=1}^{n} p(X_t, y) \to \int p(x, y) \psi^*(x) \, dx = \psi^*(y)$$

where the convergence is by (2.17) and the equality on the right is by (2.15).
The right-hand side is exactly what we want to compute.
On top of this asymptotic result, it turns out that the rate of convergence for the look-ahead estimator is very good.
The first exercise helps illustrate this point.

2.5 Exercises

Exercise 2.5.1
Consider the simple threshold autoregressive model
$$X_{t+1} = \theta |X_t| + (1 - \theta^2)^{1/2} \xi_{t+1} \quad \text{where } \{\xi_t\} \overset{\text{IID}}{\sim} N(0, 1) \tag{2.19}$$

This is one of those rare nonlinear stochastic models where an analytical expression for the stationary density is available.
In particular, provided that |𝜃| < 1, there is a unique stationary density 𝜓∗ given by

$$\psi^*(y) = 2\, \phi(y)\, \Phi\left[ \frac{\theta y}{(1 - \theta^2)^{1/2}} \right] \tag{2.20}$$

Here 𝜙 is the standard normal density and Φ is the standard normal cdf.
As an exercise, compute the look-ahead estimate of 𝜓∗ , as defined in (2.18), and compare it with 𝜓∗ in (2.20) to see
whether they are indeed close for large 𝑛.
In doing so, set 𝜃 = 0.8 and 𝑛 = 500.
The next figure shows the result of such a computation
The additional density (black line) is a nonparametric kernel density estimate, added to the solution for illustration.
(You can try to replicate it before looking at the solution if you want to)
As you can see, the look-ahead estimator is a much tighter fit than the kernel density estimator.
If you repeat the simulation you will see that this is consistently the case.

Solution to Exercise 2.5.1


Look-ahead estimation of a TAR stationary density, where the TAR model is

𝑋𝑡+1 = 𝜃|𝑋𝑡 | + (1 − 𝜃2 )1/2 𝜉𝑡+1


and 𝜉𝑡 ∼ 𝑁 (0, 1).


Try running at n = 10, 100, 1000, 10000 to get an idea of the speed of convergence

ϕ = norm()
n = 500
θ = 0.8
# == Frequently used constants == #
d = np.sqrt(1 - θ**2)
δ = θ / d

def ψ_star(y):
    "True stationary density of the TAR Model"
    return 2 * norm.pdf(y) * norm.cdf(δ * y)

def p(x, y):
    "Stochastic kernel for the TAR model."
    return ϕ.pdf((y - θ * np.abs(x)) / d) / d

Z = ϕ.rvs(n)
X = np.empty(n)
X[0] = 0            # Initial condition
for t in range(n-1):
    X[t+1] = θ * np.abs(X[t]) + d * Z[t]
ψ_est = LAE(p, X)
k_est = gaussian_kde(X)

fig, ax = plt.subplots(figsize=(10, 7))
ys = np.linspace(-3, 3, 200)
ax.plot(ys, ψ_star(ys), 'b-', lw=2, alpha=0.6, label='true')
ax.plot(ys, ψ_est(ys), 'g-', lw=2, alpha=0.6, label='look-ahead estimate')
ax.plot(ys, k_est(ys), 'k-', lw=2, alpha=0.6, label='kernel based estimate')
ax.legend(loc='upper left')
plt.show()


Exercise 2.5.2
Replicate the figure on global convergence shown above.
The densities come from the stochastic growth model treated at the start of the lecture.
Begin with the code found above.
Use the same parameters.
For the four initial distributions, use the shifted beta distributions

ψ_0 = beta(5, 5, scale=0.5, loc=i*2)

Solution to Exercise 2.5.2


Here’s one program that does the job

# == Define parameters == #
s = 0.2
δ = 0.1
a_σ = 0.4    # A = exp(B) where B ~ N(0, a_σ)
α = 0.4      # f(k) = k**α

ϕ = lognorm(a_σ)

def p(x, y):
    "Stochastic kernel, vectorized in x.  Both x and y must be positive."
    d = s * x**α
    return ϕ.pdf((y - (1 - δ) * x) / d) / d

n = 1000    # Number of observations at each date t
T = 40      # Compute density of k_t at 1,...,T

fig, axes = plt.subplots(2, 2, figsize=(11, 8))
axes = axes.flatten()
xmax = 6.5

for i in range(4):
    ax = axes[i]
    ax.set_xlim(0, xmax)
    ψ_0 = beta(5, 5, scale=0.5, loc=i*2)    # Initial distribution

    # == Generate matrix s.t. t-th column is n observations of k_t == #
    k = np.empty((n, T))
    A = ϕ.rvs((n, T))
    k[:, 0] = ψ_0.rvs(n)
    for t in range(T-1):
        k[:, t+1] = s * A[:, t] * k[:, t]**α + (1 - δ) * k[:, t]

    # == Generate T instances of lae using this data, one for each t == #
    laes = [LAE(p, k[:, t]) for t in range(T)]

    ygrid = np.linspace(0.01, xmax, 150)
    greys = [str(g) for g in np.linspace(0.0, 0.8, T)]
    greys.reverse()
    for ψ, g in zip(laes, greys):
        ax.plot(ygrid, ψ(ygrid), color=g, lw=2, alpha=0.6)
    ax.set_xlabel('capital')

plt.show()


Exercise 2.5.3
A common way to compare distributions visually is with boxplots.
To illustrate, let’s generate three artificial data sets and compare them with a boxplot.
The three data sets we will use are:

{𝑋1 , … , 𝑋𝑛 } ∼ 𝐿𝑁 (0, 1), {𝑌1 , … , 𝑌𝑛 } ∼ 𝑁 (2, 1), and {𝑍1 , … , 𝑍𝑛 } ∼ 𝑁 (4, 1),

Here is the code and figure:

n = 500
x = np.random.randn(n) # N(0, 1)
x = np.exp(x) # Map x to lognormal
y = np.random.randn(n) + 2.0 # N(2, 1)
z = np.random.randn(n) + 4.0 # N(4, 1)

fig, ax = plt.subplots(figsize=(10, 6.6))


ax.boxplot([x, y, z])
ax.set_xticks((1, 2, 3))
ax.set_ylim(-2, 14)
ax.set_xticklabels(('$X$', '$Y$', '$Z$'), fontsize=16)
plt.show()


Each data set is represented by a box, where the top and bottom of the box are the third and first quartiles of the data,
and the red line in the center is the median.
The boxes give some indication as to
• the location of probability mass for each sample
• whether the distribution is right-skewed (as is the lognormal distribution), etc
Now let’s put these ideas to use in a simulation.
Consider the threshold autoregressive model in (2.19).
We know that the distribution of 𝑋𝑡 will converge to (2.20) whenever |𝜃| < 1.
Let’s observe this convergence from different initial conditions using boxplots.
In particular, the exercise is to generate J boxplot figures, one for each initial condition 𝑋0 in

initial_conditions = np.linspace(8, 0, J)

For each 𝑋0 in this set,


1. Generate 𝑘 time-series of length 𝑛, each starting at 𝑋0 and obeying (2.19).
2. Create a boxplot representing 𝑛 distributions, where the 𝑡-th distribution shows the 𝑘 observations of 𝑋𝑡 .
Use 𝜃 = 0.9, 𝑛 = 20, 𝑘 = 5000, 𝐽 = 8

Solution to Exercise 2.5.3


Here’s a possible solution.


Note the way we use vectorized code to simulate the 𝑘 time series for one boxplot all at once

n = 20
k = 5000
J = 8

θ = 0.9
d = np.sqrt(1 - θ**2)
δ = θ / d

fig, axes = plt.subplots(J, 1, figsize=(10, 4*J))

initial_conditions = np.linspace(8, 0, J)
X = np.empty((k, n))

for j in range(J):

    axes[j].set_ylim(-4, 8)
    axes[j].set_title(f'time series from t = {initial_conditions[j]}')

    Z = np.random.randn(k, n)
    X[:, 0] = initial_conditions[j]
    for t in range(1, n):
        X[:, t] = θ * np.abs(X[:, t-1]) + d * Z[:, t]
    axes[j].boxplot(X)

plt.show()


2.6 Appendix

Here’s the proof of (2.6).


Let 𝐹𝑈 and 𝐹𝑉 be the cumulative distributions of 𝑈 and 𝑉 respectively.
By the definition of 𝑉 , we have 𝐹𝑉 (𝑣) = ℙ{𝑎 + 𝑏𝑈 ≤ 𝑣} = ℙ{𝑈 ≤ (𝑣 − 𝑎)/𝑏}.
In other words, 𝐹𝑉 (𝑣) = 𝐹𝑈 ((𝑣 − 𝑎)/𝑏).
Differentiating with respect to 𝑣 yields (2.6).
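As a quick numerical sanity check of (2.6) (a sketch with an arbitrary choice of 𝑈 and constants), take 𝑈 standard normal, so that 𝑉 = 𝑎 + 𝑏𝑈 is 𝑁(𝑎, 𝑏²), and compare the two sides:

import numpy as np
from scipy.stats import norm

a, b = 2.0, 3.0
v = np.linspace(-8.0, 12.0, 5)
lhs = norm(loc=a, scale=b).pdf(v)    # Density of V = a + bU with U ~ N(0, 1)
rhs = norm.pdf((v - a) / b) / b      # Right-hand side of (2.6)
print(np.allclose(lhs, rhs))         # True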



CHAPTER

THREE

REVERSE ENGINEERING A LA MUTH

In addition to what’s in Anaconda, this lecture uses the quantecon library.

!pip install --upgrade quantecon

We’ll also need the following imports:

import matplotlib.pyplot as plt


import numpy as np

from quantecon import Kalman


from quantecon import LinearStateSpace
np.set_printoptions(linewidth=120, precision=4, suppress=True)

This lecture uses the Kalman filter to reformulate John F. Muth’s first paper [Muth, 1960] about rational expectations.
Muth used classical prediction methods to reverse engineer a stochastic process that renders optimal Milton Friedman’s
[Friedman, 1956] “adaptive expectations” scheme.

3.1 Friedman (1956) and Muth (1960)

Milton Friedman [Friedman, 1956] posited that consumers forecast their future disposable income with the adaptive expectations scheme


$$y_{t+i,t} = K \sum_{j=0}^{\infty} (1 - K)^j y_{t-j} \tag{3.1}$$


where 𝐾 ∈ (0, 1) and 𝑦𝑡+𝑖,𝑡 is a forecast of future 𝑦 over horizon 𝑖.
Milton Friedman justified the exponential smoothing forecasting scheme (3.1) informally, noting that it seemed a plau-
sible way to use past income to forecast future income.
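Computationally, (3.1) is just exponential smoothing and can be built up recursively, each forecast being a convex combination of the newest observation and the previous forecast. Here is a minimal sketch (the function name and the initialization at the first observation are our own choices; the effect of the initialization dies out at rate 1 − K):

import numpy as np

def adaptive_forecasts(y, K):
    "Exponential smoothing: f_t = K y_t + (1 - K) f_{t-1}, the recursive form of (3.1)."
    f = np.empty(len(y))
    f[0] = y[0]                          # Arbitrary initialization
    for t in range(1, len(y)):
        f[t] = K * y[t] + (1 - K) * f[t-1]
    return f

y = np.cumsum(np.random.randn(200))      # An income-like series
print(adaptive_forecasts(y, K=0.2)[-3:])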
In his first paper about rational expectations, John F. Muth [Muth, 1960] reverse-engineered a univariate stochastic
process $\{y_t\}_{t=-\infty}^{\infty}$ for which Milton Friedman's adaptive expectations scheme gives linear least squares forecasts of $y_{t+i}$ for any
horizon 𝑖.
Muth sought a setting and a sense in which Friedman’s forecasting scheme is optimal.
That is, Muth asked for what optimal forecasting question is Milton Friedman’s adaptive expectation scheme the answer.
Muth (1960) used classical prediction methods based on lag-operators and 𝑧-transforms to find the answer to his question.
Please see lectures Classical Control with Linear Algebra and Classical Filtering and Prediction with Linear Algebra for an
introduction to the classical tools that Muth used.


Rather than using those classical tools, in this lecture we apply the Kalman filter to express the heart of Muth’s analysis
concisely.
The lecture First Look at Kalman Filter describes the Kalman filter.
We’ll use limiting versions of the Kalman filter corresponding to what are called stationary values in that lecture.

3.2 A Process for Which Adaptive Expectations are Optimal

Suppose that an observable 𝑦𝑡 is the sum of an unobserved random walk 𝑥𝑡 and an IID shock 𝜖2,𝑡 :

𝑥𝑡+1 = 𝑥𝑡 + 𝜎𝑥 𝜖1,𝑡+1
(3.2)
𝑦𝑡 = 𝑥𝑡 + 𝜎𝑦 𝜖2,𝑡

where
$$\begin{bmatrix} \epsilon_{1,t+1} \\ \epsilon_{2,t} \end{bmatrix} \sim \mathcal{N}(0, I)$$

is an IID process.

Note: A property of the state-space representation (3.2) is that in general neither 𝜖1,𝑡 nor 𝜖2,𝑡 is in the space spanned by
square-summable linear combinations of 𝑦𝑡 , 𝑦𝑡−1 , ….

In general $\begin{bmatrix} \epsilon_{1,t} \\ \epsilon_{2,t} \end{bmatrix}$ has more information about future 𝑦𝑡+𝑗 's than is contained in 𝑦𝑡 , 𝑦𝑡−1 , ….
We can use the asymptotic or stationary values of the Kalman gain and the one-step-ahead conditional state covariance
matrix to compute a time-invariant innovations representation

𝑥𝑡+1
̂ = 𝑥𝑡̂ + 𝐾𝑎𝑡
(3.3)
𝑦𝑡 = 𝑥𝑡̂ + 𝑎𝑡

where 𝑥𝑡̂ = 𝐸[𝑥𝑡 |𝑦𝑡−1 , 𝑦𝑡−2 , …] and 𝑎𝑡 = 𝑦𝑡 − 𝐸[𝑦𝑡 |𝑦𝑡−1 , 𝑦𝑡−2 , …].

Note: A key property about an innovations representation is that 𝑎𝑡 is in the space spanned by square summable linear
combinations of 𝑦𝑡 , 𝑦𝑡−1 , ….

For more ramifications of this property, see the lectures Shock Non-Invertibility and Recursive Models of Dynamic Linear
Economies.
Later we’ll stack these state-space systems (3.2) and (3.3) to display some classic findings of Muth.
But first, let’s create an instance of the state-space system (3.2) then apply the quantecon Kalman class, then uses it to
construct the associated “innovations representation”

# Make some parameter choices


# sigx/sigy are state noise std err and measurement noise std err
μ_0, σ_x, σ_y = 10, 1, 5

# Create a LinearStateSpace object


A, C, G, H = 1, σ_x, 1, σ_y
ss = LinearStateSpace(A, C, G, H, mu_0=μ_0)

# Set prior and initialize the Kalman type


x_hat_0, Σ_0 = 10, 1
kmuth = Kalman(ss, x_hat_0, Σ_0)

# Computes stationary values which we need for the innovation


# representation
S1, K1 = kmuth.stationary_values()

# Extract scalars from nested arrays


S1, K1 = S1.item(), K1.item()

# Form innovation representation state-space


Ak, Ck, Gk, Hk = A, K1, G, 1

ssk = LinearStateSpace(Ak, Ck, Gk, Hk, mu_0=x_hat_0)

3.3 Some Useful State-Space Math

Now we want to map the time-invariant innovations representation (3.3) and the original state-space system (3.2) into a
convenient form for deducing the impulse responses from the original shocks to the 𝑥𝑡 and 𝑥𝑡̂ .
Putting both of these representations into a single state-space system is yet another application of the insight that “finding
the state is an art”.
We’ll define a state vector and appropriate state-space matrices that allow us to represent both systems in one fell swoop.
Note that

𝑎𝑡 = 𝑥𝑡 + 𝜎𝑦 𝜖2,𝑡 − 𝑥𝑡̂

so that
𝑥𝑡+1
̂ = 𝑥𝑡̂ + 𝐾(𝑥𝑡 + 𝜎𝑦 𝜖2,𝑡 − 𝑥𝑡̂ )
= (1 − 𝐾)𝑥𝑡̂ + 𝐾𝑥𝑡 + 𝐾𝜎𝑦 𝜖2,𝑡

The stacked system

$$\begin{bmatrix} x_{t+1} \\ \hat{x}_{t+1} \\ \epsilon_{2,t+1} \end{bmatrix}
= \begin{bmatrix} 1 & 0 & 0 \\ K & (1-K) & K\sigma_y \\ 0 & 0 & 0 \end{bmatrix}
\begin{bmatrix} x_t \\ \hat{x}_t \\ \epsilon_{2,t} \end{bmatrix}
+ \begin{bmatrix} \sigma_x & 0 \\ 0 & 0 \\ 0 & 1 \end{bmatrix}
\begin{bmatrix} \epsilon_{1,t+1} \\ \epsilon_{2,t+1} \end{bmatrix}$$

$$\begin{bmatrix} y_t \\ a_t \end{bmatrix}
= \begin{bmatrix} 1 & 0 & \sigma_y \\ 1 & -1 & \sigma_y \end{bmatrix}
\begin{bmatrix} x_t \\ \hat{x}_t \\ \epsilon_{2,t} \end{bmatrix}$$

is a state-space system that tells us how the shocks $\begin{bmatrix} \epsilon_{1,t+1} \\ \epsilon_{2,t+1} \end{bmatrix}$ affect the states $\hat{x}_{t+1}$, $x_t$, the observable $y_t$, and the innovation $a_t$.
With this tool at our disposal, let’s form the composite system and simulate it


# Create grand state-space for y_t, a_t as observed vars -- Use


# stacking trick above
Af = np.array([[ 1, 0, 0],
[K1, 1 - K1, K1 * σ_y],
[ 0, 0, 0]])
Cf = np.array([[σ_x, 0],
[ 0, K1 * σ_y],
[ 0, 1]])
Gf = np.array([[1, 0, σ_y],
[1, -1, σ_y]])

μ_true, μ_prior = 10, 10


μ_f = np.array([μ_true, μ_prior, 0]).reshape(3, 1)

# Create the state-space


ssf = LinearStateSpace(Af, Cf, Gf, mu_0=μ_f)

# Draw observations of y from the state-space model


N = 50
xf, yf = ssf.simulate(N)

print(f"Kalman gain = {K1}")


print(f"Conditional variance = {S1}")

Kalman gain = 0.1809975124224177


Conditional variance = 5.524937810560442

Now that we have simulated our joint system, we have 𝑥𝑡 , 𝑥𝑡̂ , and 𝑦𝑡 .
We can now investigate how these variables are related by plotting some key objects.

3.4 Estimates of Unobservables

First, let’s plot the hidden state 𝑥𝑡 and the filtered version 𝑥𝑡̂ that is linear-least squares projection of 𝑥𝑡 on the history
𝑦𝑡−1 , 𝑦𝑡−2 , …

fig, ax = plt.subplots()
ax.plot(xf[0, :], label="$x_t$")
ax.plot(xf[1, :], label="Filtered $x_t$")
ax.legend()
ax.set_xlabel("Time")
ax.set_title(r"$x$ vs $\hat{x}$")
plt.show()


Note how 𝑥𝑡 and 𝑥𝑡̂ differ.


For Friedman, 𝑥𝑡̂ and not 𝑥𝑡 is the consumer’s idea about her/his permanent income.

3.5 Relationship of Unobservables to Observables

Now let’s plot 𝑥𝑡 and 𝑦𝑡 .


Recall that 𝑦𝑡 is just 𝑥𝑡 plus white noise

fig, ax = plt.subplots()
ax.plot(yf[0, :], label="y")
ax.plot(xf[0, :], label="x")
ax.legend()
ax.set_title(r"$x$ and $y$")
ax.set_xlabel("Time")
plt.show()


We see above that 𝑦 seems to look like white noise around the values of 𝑥.

3.5.1 Innovations

Recall that we wrote down the innovation representation that depended on 𝑎𝑡 . We now plot the innovations {𝑎𝑡 }:

fig, ax = plt.subplots()
ax.plot(yf[1, :], label="a")
ax.legend()
ax.set_title(r"Innovation $a_t$")
ax.set_xlabel("Time")
plt.show()


3.6 MA and AR Representations

Now we shall extract from the Kalman instance kmuth coefficients of


• a fundamental moving average representation that represents 𝑦𝑡 as a one-sided moving sum of current and past 𝑎𝑡 s
that are square summable linear combinations of 𝑦𝑡 , 𝑦𝑡−1 , ….
• a univariate autoregression representation that depicts the coefficients in a linear least square projection of 𝑦𝑡 on
the semi-infinite history 𝑦𝑡−1 , 𝑦𝑡−2 , ….
Then we’ll plot each of them

# Kalman Methods for MA and VAR


coefs_ma = kmuth.stationary_coefficients(5, "ma")
coefs_var = kmuth.stationary_coefficients(5, "var")

# Coefficients come in a list of arrays, but we


# want to plot them and so need to stack into an array
coefs_ma_array = np.vstack(coefs_ma)
coefs_var_array = np.vstack(coefs_var)

fig, ax = plt.subplots(2)
ax[0].plot(coefs_ma_array, label="MA")
ax[0].legend()
ax[1].plot(coefs_var_array, label="VAR")


ax[1].legend()

plt.show()

The moving average coefficients in the top panel show tell-tale signs of 𝑦𝑡 being a process whose first difference is a
first-order autoregression.
The autoregressive coefficients decline geometrically with decay rate (1 − 𝐾).
These are exactly the target outcomes that Muth (1960) aimed to reverse engineer

print(f'decay parameter 1 - K1 = {1 - K1}')

decay parameter 1 - K1 = 0.8190024875775823



CHAPTER

FOUR

DISCRETE STATE DYNAMIC PROGRAMMING

In addition to what’s in Anaconda, this lecture will need the following libraries:

!pip install --upgrade quantecon

4.1 Overview

In this lecture we discuss a family of dynamic programming problems with the following features:
1. a discrete state space and discrete choices (actions)
2. an infinite horizon
3. discounted rewards
4. Markov state transitions
We call such problems discrete dynamic programs or discrete DPs.
Discrete DPs are the workhorses in much of modern quantitative economics, including
• monetary economics
• search and labor economics
• household savings and consumption theory
• investment theory
• asset pricing
• industrial organization, etc.
When a given model is not inherently discrete, it is common to replace it with a discretized version in order to use discrete
DP techniques.
This lecture covers
• the theory of dynamic programming in a discrete setting, plus examples and applications
• a powerful set of routines for solving discrete DPs from the QuantEcon code library
Let’s start with some imports:

import numpy as np
import matplotlib.pyplot as plt
import quantecon as qe
import scipy.sparse as sparse


from quantecon import compute_fixed_point
from quantecon.markov import DiscreteDP

4.1.1 How to Read this Lecture

We use dynamic programming in many applied lectures, such as


• The shortest path lecture
• The McCall search model lecture
The objective of this lecture is to provide a more systematic and theoretical treatment, including algorithms and imple-
mentation while focusing on the discrete case.

4.1.2 Code

Among other things, the QuantEcon code for solving discrete DPs offers


• a flexible, well-designed interface
• multiple solution methods, including value function and policy function iteration
• high-speed operations via carefully optimized JIT-compiled functions
• the ability to scale to large problems by minimizing vectorized operators and allowing operations on sparse matrices
JIT compilation relies on Numba, which should work seamlessly if you are using Anaconda as suggested.

4.1.3 References

For background reading on dynamic programming and additional applications, see, for example,
• [Ljungqvist and Sargent, 2018]
• [Hernandez-Lerma and Lasserre, 1996], section 3.5
• [Puterman, 2005]
• [Stokey et al., 1989]
• [Rust, 1996]
• [Miranda and Fackler, 2002]
• EDTC, chapter 5

4.2 Discrete DPs

Loosely speaking, a discrete DP is a maximization problem with an objective function of the form

$$\mathbb{E} \sum_{t=0}^{\infty} \beta^t r(s_t, a_t) \tag{4.1}$$

where
• 𝑠𝑡 is the state variable


• 𝑎𝑡 is the action
• 𝛽 is a discount factor
• 𝑟(𝑠𝑡 , 𝑎𝑡 ) is interpreted as a current reward when the state is 𝑠𝑡 and the action chosen is 𝑎𝑡
Each pair (𝑠𝑡 , 𝑎𝑡 ) pins down transition probabilities 𝑄(𝑠𝑡 , 𝑎𝑡 , 𝑠𝑡+1 ) for the next period state 𝑠𝑡+1 .
Thus, actions influence not only current rewards but also the future time path of the state.
The essence of dynamic programming problems is to trade off current rewards vs favorable positioning of the future state
(modulo randomness).
Examples:
• consuming today vs saving and accumulating assets
• accepting a job offer today vs seeking a better one in the future
• exercising an option now vs waiting

4.2.1 Policies

The most fruitful way to think about solutions to discrete DP problems is to compare policies.
In general, a policy is a randomized map from past actions and states to current action.
In the setting formalized below, it suffices to consider so-called stationary Markov policies, which consider only the current
state.
In particular, a stationary Markov policy is a map 𝜎 from states to actions
• 𝑎𝑡 = 𝜎(𝑠𝑡 ) indicates that 𝑎𝑡 is the action to be taken in state 𝑠𝑡
It is known that, for any arbitrary policy, there exists a stationary Markov policy that dominates it at least weakly.
• See section 5.5 of [Puterman, 2005] for discussion and proofs.
In what follows, stationary Markov policies are referred to simply as policies.
The aim is to find an optimal policy, in the sense of one that maximizes (4.1).
Let’s now step through these ideas more carefully.

4.2.2 Formal Definition

Formally, a discrete dynamic program consists of the following components:


1. A finite set of states 𝑆 = {0, … , 𝑛 − 1}.
2. A finite set of feasible actions 𝐴(𝑠) for each state 𝑠 ∈ 𝑆, and a corresponding set of feasible state-action pairs.

SA ∶= {(𝑠, 𝑎) ∣ 𝑠 ∈ 𝑆, 𝑎 ∈ 𝐴(𝑠)}

3. A reward function 𝑟 ∶ SA → ℝ.
4. A transition probability function 𝑄 ∶ SA → Δ(𝑆), where Δ(𝑆) is the set of probability distributions over 𝑆.
5. A discount factor 𝛽 ∈ [0, 1).


We also use the notation 𝐴 ∶= ⋃𝑠∈𝑆 𝐴(𝑠) = {0, … , 𝑚 − 1} and call this set the action space.
A policy is a function 𝜎 ∶ 𝑆 → 𝐴.
A policy is called feasible if it satisfies 𝜎(𝑠) ∈ 𝐴(𝑠) for all 𝑠 ∈ 𝑆.
Denote the set of all feasible policies by Σ.
If a decision-maker uses a policy 𝜎 ∈ Σ, then
• the current reward at time 𝑡 is 𝑟(𝑠𝑡 , 𝜎(𝑠𝑡 ))
• the probability that 𝑠𝑡+1 = 𝑠′ is 𝑄(𝑠𝑡 , 𝜎(𝑠𝑡 ), 𝑠′ )
For each 𝜎 ∈ Σ, define
• 𝑟𝜎 by 𝑟𝜎 (𝑠) ∶= 𝑟(𝑠, 𝜎(𝑠))
• 𝑄𝜎 by 𝑄𝜎 (𝑠, 𝑠′ ) ∶= 𝑄(𝑠, 𝜎(𝑠), 𝑠′ )
Notice that 𝑄𝜎 is a stochastic matrix on 𝑆.
It gives transition probabilities of the controlled chain when we follow policy 𝜎.
If we think of 𝑟𝜎 as a column vector, then so is 𝑄𝑡𝜎 𝑟𝜎 , and the 𝑠-th row of the latter has the interpretation

(𝑄𝑡𝜎 𝑟𝜎 )(𝑠) = 𝔼[𝑟(𝑠𝑡 , 𝜎(𝑠𝑡 )) ∣ 𝑠0 = 𝑠] when {𝑠𝑡 } ∼ 𝑄𝜎 (4.2)

Comments
• {𝑠𝑡 } ∼ 𝑄𝜎 means that the state is generated by stochastic matrix 𝑄𝜎 .
• See this discussion on computing expectations of Markov chains for an explanation of the expression in (4.2).
Notice that we’re not really distinguishing between functions from 𝑆 to ℝ and vectors in ℝ𝑛 .
This is natural because they are in one to one correspondence.

4.2.3 Value and Optimality

Let 𝑣𝜎 (𝑠) denote the discounted sum of expected reward flows from policy 𝜎 when the initial state is 𝑠.
To calculate this quantity we pass the expectation through the sum in (4.1) and use (4.2) to get

$$v_\sigma(s) = \sum_{t=0}^{\infty} \beta^t (Q_\sigma^t r_\sigma)(s) \qquad (s \in S)$$

This function is called the policy value function for the policy 𝜎.
The optimal value function, or simply value function, is the function 𝑣∗ ∶ 𝑆 → ℝ defined by

$$v^*(s) = \max_{\sigma \in \Sigma} v_\sigma(s) \qquad (s \in S)$$

(We can use max rather than sup here because the domain is a finite set)
A policy 𝜎 ∈ Σ is called optimal if 𝑣𝜎 (𝑠) = 𝑣∗ (𝑠) for all 𝑠 ∈ 𝑆.
Given any 𝑤 ∶ 𝑆 → ℝ, a policy 𝜎 ∈ Σ is called 𝑤-greedy if

$$\sigma(s) \in \operatorname*{arg\,max}_{a \in A(s)} \left\{ r(s, a) + \beta \sum_{s' \in S} w(s') Q(s, a, s') \right\} \qquad (s \in S)$$

As discussed in detail below, optimal policies are precisely those that are 𝑣∗ -greedy.


4.2.4 Two Operators

It is useful to define the following operators:


• The Bellman operator 𝑇 ∶ ℝ𝑆 → ℝ𝑆 is defined by

$$(Tv)(s) = \max_{a \in A(s)} \left\{ r(s, a) + \beta \sum_{s' \in S} v(s') Q(s, a, s') \right\} \qquad (s \in S)$$

• For any policy function 𝜎 ∈ Σ, the operator 𝑇𝜎 ∶ ℝ𝑆 → ℝ𝑆 is defined by


$$(T_\sigma v)(s) = r(s, \sigma(s)) + \beta \sum_{s' \in S} v(s') Q(s, \sigma(s), s') \qquad (s \in S)$$
This can be written more succinctly in operator notation as
𝑇𝜎 𝑣 = 𝑟𝜎 + 𝛽𝑄𝜎 𝑣
The two operators are both monotone
• 𝑣 ≤ 𝑤 implies 𝑇 𝑣 ≤ 𝑇 𝑤 pointwise on 𝑆, and similarly for 𝑇𝜎
They are also contraction mappings with modulus 𝛽
• ‖𝑇 𝑣 − 𝑇 𝑤‖ ≤ 𝛽‖𝑣 − 𝑤‖ and similarly for 𝑇𝜎 , where ‖⋅‖ is the max norm
For any policy 𝜎, its value 𝑣𝜎 is the unique fixed point of 𝑇𝜎 .
For proofs of these results and those in the next section, see, for example, EDTC, chapter 10.
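To fix ideas, here is a minimal sketch of 𝑇, 𝑇𝜎 and the policy value 𝑣𝜎 in NumPy, using the array conventions adopted later in this lecture (an n × m reward array R and an n × m × n transition array Q); this is only illustrative and is not the DiscreteDP implementation:

import numpy as np

def bellman_operator(v, R, Q, β):
    "Apply T: maximize r(s, a) + β Σ_{s'} v(s') Q(s, a, s') over a, for each s."
    vals = R + β * Q @ v                    # n x m array of action values
    return vals.max(axis=1)

def T_sigma(v, σ, R, Q, β):
    "Apply T_σ: evaluate the action values at a = σ(s) for each state s."
    vals = R + β * Q @ v
    return vals[np.arange(len(σ)), σ]

def policy_value(σ, R, Q, β):
    "Solve v = r_σ + β Q_σ v, the fixed point of T_σ."
    n = R.shape[0]
    r_σ = R[np.arange(n), σ]
    Q_σ = Q[np.arange(n), σ, :]
    return np.linalg.solve(np.eye(n) - β * Q_σ, r_σ)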

4.2.5 The Bellman Equation and the Principle of Optimality

The main principle of the theory of dynamic programming is that


• the optimal value function 𝑣∗ is a unique solution to the Bellman equation

$$v(s) = \max_{a \in A(s)} \left\{ r(s, a) + \beta \sum_{s' \in S} v(s') Q(s, a, s') \right\} \qquad (s \in S)$$
or in other words, 𝑣∗ is the unique fixed point of 𝑇 , and
• 𝜎∗ is an optimal policy function if and only if it is 𝑣∗ -greedy
By the definition of greedy policies given above, this means that

$$\sigma^*(s) \in \operatorname*{arg\,max}_{a \in A(s)} \left\{ r(s, a) + \beta \sum_{s' \in S} v^*(s') Q(s, a, s') \right\} \qquad (s \in S)$$

4.3 Solving Discrete DPs

Now that the theory has been set out, let’s turn to solution methods.
The code for solving discrete DPs is available in ddp.py from the QuantEcon.py code library.
It implements the three most important solution methods for discrete dynamic programs, namely
• value function iteration
• policy function iteration
• modified policy function iteration
Let’s briefly review these algorithms and their implementation.


4.3.1 Value Function Iteration

Perhaps the most familiar method for solving all manner of dynamic programs is value function iteration.
This algorithm uses the fact that the Bellman operator 𝑇 is a contraction mapping with fixed point 𝑣∗ .
Hence, iterative application of 𝑇 to any initial function 𝑣0 ∶ 𝑆 → ℝ converges to 𝑣∗ .
The details of the algorithm can be found in the appendix.
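Here is a bare-bones sketch of the algorithm under the same array conventions as above (the tolerance and iteration cap are arbitrary choices, and DiscreteDP's implementation is more careful):

import numpy as np

def value_function_iteration(R, Q, β, tol=1e-8, max_iter=10_000):
    "Iterate the Bellman operator from v = 0, then read off a v-greedy policy."
    n, m = R.shape
    v = np.zeros(n)
    for _ in range(max_iter):
        v_new = (R + β * Q @ v).max(axis=1)
        if np.max(np.abs(v_new - v)) < tol:
            v = v_new
            break
        v = v_new
    σ = (R + β * Q @ v).argmax(axis=1)       # A v-greedy policy
    return v, σ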

4.3.2 Policy Function Iteration

This routine, also known as Howard’s policy improvement algorithm, exploits more closely the particular structure of a
discrete DP problem.
Each iteration consists of
1. A policy evaluation step that computes the value 𝑣𝜎 of a policy 𝜎 by solving the linear equation 𝑣 = 𝑇𝜎 𝑣.
2. A policy improvement step that computes a 𝑣𝜎 -greedy policy.
In the current setting, policy iteration computes an exact optimal policy in finitely many iterations.
• See theorem 10.2.6 of EDTC for a proof.
The details of the algorithm can be found in the appendix.
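A compact sketch of Howard's algorithm in the same conventions (illustrative only; it assumes every action is feasible in every state, so that the initial policy below is feasible):

import numpy as np

def policy_iteration(R, Q, β):
    "Alternate policy evaluation (a linear solve) with policy improvement (a greedy step)."
    n, m = R.shape
    σ = np.zeros(n, dtype=int)               # Initial policy, assumed feasible
    while True:
        # Policy evaluation: solve v = r_σ + β Q_σ v
        r_σ = R[np.arange(n), σ]
        Q_σ = Q[np.arange(n), σ, :]
        v = np.linalg.solve(np.eye(n) - β * Q_σ, r_σ)
        # Policy improvement: compute a v_σ-greedy policy
        σ_new = (R + β * Q @ v).argmax(axis=1)
        if np.array_equal(σ_new, σ):
            return v, σ
        σ = σ_new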

4.3.3 Modified Policy Function Iteration

Modified policy iteration replaces the policy evaluation step in policy iteration with “partial policy evaluation”.
The latter computes an approximation to the value of a policy 𝜎 by iterating 𝑇𝜎 for a specified number of times.
This approach can be useful when the state space is very large and the linear system in the policy evaluation step of policy
iteration is correspondingly difficult to solve.
The details of the algorithm can be found in the appendix.

4.4 Example: A Growth Model

Let’s consider a simple consumption-saving model.


A single household either consumes or stores its own output of a single consumption good.
The household starts each period with current stock 𝑠.
Next, the household chooses a quantity 𝑎 to store and consumes 𝑐 = 𝑠 − 𝑎
• Storage is limited by a global upper bound 𝑀 .
• Flow utility is 𝑢(𝑐) = 𝑐𝛼 .
Output is drawn from a discrete uniform distribution on {0, … , 𝐵}.
The next period stock is therefore

𝑠′ = 𝑎 + 𝑈 where 𝑈 ∼ 𝑈 [0, … , 𝐵]

The discount factor is 𝛽 ∈ [0, 1).


4.4.1 Discrete DP Representation

We want to represent this model in the format of a discrete dynamic program.


To this end, we take
• the state variable to be the stock 𝑠
• the state space to be 𝑆 = {0, … , 𝑀 + 𝐵}
– hence 𝑛 = 𝑀 + 𝐵 + 1
• the action to be the storage quantity 𝑎
• the set of feasible actions at 𝑠 to be 𝐴(𝑠) = {0, … , min{𝑠, 𝑀 }}
– hence 𝐴 = {0, … , 𝑀 } and 𝑚 = 𝑀 + 1
• the reward function to be 𝑟(𝑠, 𝑎) = 𝑢(𝑠 − 𝑎)
• the transition probabilities to be
$$Q(s, a, s') := \begin{cases} \frac{1}{B+1} & \text{if } a \le s' \le a + B \\ 0 & \text{otherwise} \end{cases} \tag{4.3}$$

4.4.2 Defining a DiscreteDP Instance

This information will be used to create an instance of DiscreteDP by passing the following information
1. An 𝑛 × 𝑚 reward array 𝑅.
2. An 𝑛 × 𝑚 × 𝑛 transition probability array 𝑄.
3. A discount factor 𝛽.
For 𝑅 we set 𝑅[𝑠, 𝑎] = 𝑢(𝑠 − 𝑎) if 𝑎 ≤ 𝑠 and −∞ otherwise.
For 𝑄 we follow the rule in (4.3).

Note:
• The feasibility constraint is embedded into 𝑅 by setting 𝑅[𝑠, 𝑎] = −∞ for 𝑎 ∉ 𝐴(𝑠).
• Probability distributions for (𝑠, 𝑎) with 𝑎 ∉ 𝐴(𝑠) can be arbitrary.

The following code sets up these objects for us

class SimpleOG:

    def __init__(self, B=10, M=5, α=0.5, β=0.9):
        """
        Set up R, Q and β, the three elements that define an instance of
        the DiscreteDP class.
        """

        self.B, self.M, self.α, self.β = B, M, α, β
        self.n = B + M + 1
        self.m = M + 1

        self.R = np.empty((self.n, self.m))
        self.Q = np.zeros((self.n, self.m, self.n))

        self.populate_Q()
        self.populate_R()

    def u(self, c):
        return c**self.α

    def populate_R(self):
        """
        Populate the R matrix, with R[s, a] = -np.inf for infeasible
        state-action pairs.
        """
        for s in range(self.n):
            for a in range(self.m):
                self.R[s, a] = self.u(s - a) if a <= s else -np.inf

    def populate_Q(self):
        """
        Populate the Q matrix by setting

            Q[s, a, s'] = 1 / (1 + B) if a <= s' <= a + B

        and zero otherwise.
        """
        for a in range(self.m):
            self.Q[:, a, a:(a + self.B + 1)] = 1.0 / (self.B + 1)

Let’s run this code and create an instance of SimpleOG.

g = SimpleOG() # Use default parameters

Instances of DiscreteDP are created using the signature DiscreteDP(R, Q, β).


Let’s create an instance using the objects stored in g

ddp = qe.markov.DiscreteDP(g.R, g.Q, g.β)

Now that we have an instance ddp of DiscreteDP we can solve it as follows

results = ddp.solve(method='policy_iteration')

Let’s see what we’ve got here

dir(results)

['max_iter', 'mc', 'method', 'num_iter', 'sigma', 'v']

(In IPython version 4.0 and above you can also type results. and hit the tab key)
The most important attributes are v, the value function, and σ, the optimal policy

results.v


array([19.01740222, 20.01740222, 20.43161578, 20.74945302, 21.04078099,


21.30873018, 21.54479816, 21.76928181, 21.98270358, 22.18824323,
22.3845048 , 22.57807736, 22.76109127, 22.94376708, 23.11533996,
23.27761762])

results.sigma

array([0, 0, 0, 0, 1, 1, 1, 2, 2, 3, 3, 4, 5, 5, 5, 5])

Since we’ve used policy iteration, these results will be exact unless we hit the iteration bound max_iter.
Let’s make sure this didn’t happen

results.max_iter

250

results.num_iter

Another interesting object is results.mc, which is the controlled chain defined by 𝑄𝜎∗ , where 𝜎∗ is the optimal policy.
In other words, it gives the dynamics of the state when the agent follows the optimal policy.
Since this object is an instance of MarkovChain from QuantEcon.py (see this lecture for more discussion), we can easily
simulate it, compute its stationary distribution and so on.

results.mc.stationary_distributions

array([[0.01732187, 0.04121063, 0.05773956, 0.07426848, 0.08095823,


0.09090909, 0.09090909, 0.09090909, 0.09090909, 0.09090909,
0.09090909, 0.07358722, 0.04969846, 0.03316953, 0.01664061,
0.00995086]])

Here’s the same information in a bar graph
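Here is a minimal sketch of how such a bar chart can be produced from results.mc (it assumes matplotlib.pyplot has been imported as plt, as elsewhere in these lectures):

# A sketch of the bar chart: the stationary distribution of the controlled chain
stationary = results.mc.stationary_distributions[0]

fig, ax = plt.subplots()
ax.bar(range(len(stationary)), stationary)   # one bar per state s = 0, ..., n-1
ax.set_xlabel("state")
ax.set_ylabel("probability")
plt.show()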


What happens if the agent is more patient?

ddp = qe.markov.DiscreteDP(g.R, g.Q, 0.99) # Increase β to 0.99


results = ddp.solve(method='policy_iteration')
results.mc.stationary_distributions

array([[0.00546913, 0.02321342, 0.03147788, 0.04800681, 0.05627127,


0.09090909, 0.09090909, 0.09090909, 0.09090909, 0.09090909,
0.09090909, 0.08543996, 0.06769567, 0.05943121, 0.04290228,
0.03463782]])

If we look at the bar graph we can see the rightward shift in probability mass


4.4.3 State-Action Pair Formulation

The DiscreteDP class in fact provides a second interface for setting up an instance.


One of the advantages of this alternative set up is that it permits the use of a sparse matrix for Q.
(An example of using sparse matrices is given in the exercises below)
The call signature of the second formulation is DiscreteDP(R, Q, β, s_indices, a_indices) where
• s_indices and a_indices are arrays of equal length L enumerating all feasible state-action pairs
• R is an array of length L giving corresponding rewards
• Q is an L x n transition probability array
Here’s how we could set up these objects for the preceding example

B, M, α, β = 10, 5, 0.5, 0.9


n = B + M + 1
m = M + 1

def u(c):
return c**α

s_indices = []
a_indices = []
Q = []
R = []
b = 1.0 / (B + 1)

for s in range(n):
for a in range(min(M, s) + 1): # All feasible a at this s
s_indices.append(s)
a_indices.append(a)
q = np.zeros(n)
q[a:(a + B + 1)] = b # b on these values, otherwise 0
Q.append(q)
R.append(u(s - a))

ddp = qe.markov.DiscreteDP(R, Q, β, s_indices, a_indices)

For larger problems, you might need to write this code more efficiently by vectorizing or using Numba.
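For instance, here is one possible vectorized construction of the same objects with NumPy (a sketch only, assuming B, M, n, m, u and β are defined as in the cell above; the suffixed names are illustrative and simply avoid clobbering the variables already created):

# A vectorized sketch of the state-action pair setup above
s_all = np.arange(n)
a_max = np.minimum(M, s_all)                      # largest feasible action at each state

s_indices_v = np.repeat(s_all, a_max + 1)         # enumerate feasible (s, a) pairs
a_indices_v = np.concatenate([np.arange(k + 1) for k in a_max])

R_v = u(s_indices_v - a_indices_v)                # rewards for the feasible pairs

L_v = len(s_indices_v)
Q_v = np.zeros((L_v, n))
cols = a_indices_v[:, None] + np.arange(B + 1)    # the s' values receiving mass 1/(B+1)
Q_v[np.arange(L_v)[:, None], cols] = 1.0 / (B + 1)

ddp_vec = qe.markov.DiscreteDP(R_v, Q_v, β, s_indices_v, a_indices_v)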

4.5 Exercises

In the stochastic optimal growth lecture from our introductory lecture series, we solve a benchmark model that has an
analytical solution.
The exercise is to replicate this solution using DiscreteDP.


4.6 Solutions

4.6.1 Setup

Details of the model can be found in the lecture on optimal growth.


We let 𝑓(𝑘) = 𝑘𝛼 with 𝛼 = 0.65, 𝑢(𝑐) = log 𝑐, and 𝛽 = 0.95

α = 0.65
f = lambda k: k**α
u = np.log
β = 0.95

Here we want to solve a finite state version of the continuous state model above.
We discretize the state space into a grid of size grid_size=500, from 1e-6 to grid_max=2

grid_max = 2
grid_size = 500
grid = np.linspace(1e-6, grid_max, grid_size)

We choose the action to be the amount of capital to save for the next period (the state is the capital stock at the beginning
of the period).
Thus the state indices and the action indices are both 0, …, grid_size-1.
Action (indexed by) a is feasible at state (indexed by) s if and only if grid[a] < f(grid[s]) (zero consumption
is not allowed because of the log utility).
Thus the Bellman equation is:

$$
v(k) = \max_{0 < k' < f(k)} \; u(f(k) - k') + \beta v(k'),
$$

where 𝑘′ is the capital stock in the next period.


The transition probability array Q will be highly sparse (in fact it is degenerate as the model is deterministic), so we
formulate the problem with state-action pairs, to represent Q in scipy sparse matrix format.
We first construct indices for state-action pairs:

# Consumption matrix, with nonpositive consumption included


C = f(grid).reshape(grid_size, 1) - grid.reshape(1, grid_size)

# State-action indices
s_indices, a_indices = np.where(C > 0)

# Number of state-action pairs


L = len(s_indices)

print(L)
print(s_indices)
print(a_indices)

118841
[ 0 1 1 ... 499 499 499]
[ 0 0 1 ... 389 390 391]


Reward vector R (of length L):

R = u(C[s_indices, a_indices])

(Degenerate) transition probability matrix Q (of shape (L, grid_size)), where we choose the scipy.sparse.lil_matrix
format; any format will do, since internally it will be converted to the csr format:

Q = sparse.lil_matrix((L, grid_size))
Q[np.arange(L), a_indices] = 1

(If you are familiar with the data structure of scipy.sparse.csr_matrix, the following is the most efficient way to create the
Q matrix in the current case)

# data = np.ones(L)
# indptr = np.arange(L+1)
# Q = sparse.csr_matrix((data, a_indices, indptr), shape=(L, grid_size))

Discrete growth model:

ddp = DiscreteDP(R, Q, β, s_indices, a_indices)

Notes
Here we have vectorized the array operations intensively to simplify the code.
As noted, however, vectorization consumes a lot of memory, and it can be prohibitively costly for large grids.

4.6.2 Solving the Model

Solve the dynamic optimization problem:

res = ddp.solve(method='policy_iteration')
v, σ, num_iter = res.v, res.sigma, res.num_iter
num_iter

10

Note that sigma contains the indices of the optimal capital stocks to save for the next period. The following translates
sigma to the corresponding consumption vector.

# Optimal consumption in the discrete version


c = f(grid) - grid[σ]

# Exact solution of the continuous version


ab = α * β
c1 = (np.log(1 - ab) + np.log(ab) * ab / (1 - ab)) / (1 - β)
c2 = α / (1 - ab)

def v_star(k):
return c1 + c2 * np.log(k)

def c_star(k):
return (1 - ab) * k**α

Let us compare the solution of the discrete model with that of the original continuous model


fig, ax = plt.subplots(1, 2, figsize=(14, 4))


ax[0].set_ylim(-40, -32)
ax[0].set_xlim(grid[0], grid[-1])
ax[1].set_xlim(grid[0], grid[-1])

lb0 = 'discrete value function'


ax[0].plot(grid, v, lw=2, alpha=0.6, label=lb0)

lb0 = 'continuous value function'


ax[0].plot(grid, v_star(grid), 'k-', lw=1.5, alpha=0.8, label=lb0)
ax[0].legend(loc='upper left')

lb1 = 'discrete optimal consumption'


ax[1].plot(grid, c, 'b-', lw=2, alpha=0.6, label=lb1)

lb1 = 'continuous optimal consumption'


ax[1].plot(grid, c_star(grid), 'k-', lw=1.5, alpha=0.8, label=lb1)
ax[1].legend(loc='upper left')
plt.show()

The outcomes appear very close to those of the continuous version.


Except for the “boundary” point, the value functions are very close:

np.abs(v - v_star(grid)).max()

121.49819147053378

np.abs(v - v_star(grid))[1:].max()

0.012681735127500815

The optimal consumption functions are close as well:

np.abs(c - c_star(grid)).max()

0.003826523100010082

In fact, the optimal consumption obtained in the discrete version is not really monotone, but the decrements are quite
small:


diff = np.diff(c)
(diff >= 0).all()

False

dec_ind = np.where(diff < 0)[0]


len(dec_ind)

174

np.abs(diff[dec_ind]).max()

0.001961853339766839

The value function is monotone:

(np.diff(v) > 0).all()

True

4.6.3 Comparison of the Solution Methods

Let us solve the problem with the other two methods.

Value Iteration

ddp.epsilon = 1e-4
ddp.max_iter = 500
res1 = ddp.solve(method='value_iteration')
res1.num_iter

294

np.array_equal(σ, res1.sigma)

True


Modified Policy Iteration

res2 = ddp.solve(method='modified_policy_iteration')
res2.num_iter

16

np.array_equal(σ, res2.sigma)

True

Speed Comparison

%timeit ddp.solve(method='value_iteration')
%timeit ddp.solve(method='policy_iteration')
%timeit ddp.solve(method='modified_policy_iteration')

94.9 ms ± 360 μs per loop (mean ± std. dev. of 7 runs, 10 loops each)

9.34 ms ± 16.9 μs per loop (mean ± std. dev. of 7 runs, 100 loops each)

11.3 ms ± 59.9 μs per loop (mean ± std. dev. of 7 runs, 100 loops each)

As is often the case, policy iteration and modified policy iteration are much faster than value iteration.

4.6.4 Replication of the Figures

Using DiscreteDP we replicate the figures shown in the lecture.

Convergence of Value Iteration

Let us first visualize the convergence of the value iteration algorithm as in the lecture, where we use ddp.bellman_operator, implemented as a method of DiscreteDP

w = 5 * np.log(grid) - 25 # Initial condition


n = 35
fig, ax = plt.subplots(figsize=(8,5))
ax.set_ylim(-40, -20)
ax.set_xlim(np.min(grid), np.max(grid))
lb = 'initial condition'
ax.plot(grid, w, color=plt.cm.jet(0), lw=2, alpha=0.6, label=lb)
for i in range(n):
w = ddp.bellman_operator(w)
ax.plot(grid, w, color=plt.cm.jet(i / n), lw=2, alpha=0.6)
lb = 'true value function'
ax.plot(grid, v_star(grid), 'k-', lw=2, alpha=0.8, label=lb)
ax.legend(loc='upper left')

plt.show()

We next plot the consumption policies along with the value iteration

w = 5 * u(grid) - 25 # Initial condition

fig, ax = plt.subplots(3, 1, figsize=(8, 10))


true_c = c_star(grid)

for i, n in enumerate((2, 4, 6)):


ax[i].set_ylim(0, 1)
ax[i].set_xlim(0, 2)
ax[i].set_yticks((0, 1))
ax[i].set_xticks((0, 2))

w = 5 * u(grid) - 25 # Initial condition


compute_fixed_point(ddp.bellman_operator, w, max_iter=n, print_skip=1)
σ = ddp.compute_greedy(w) # Policy indices
c_policy = f(grid) - grid[σ]

ax[i].plot(grid, c_policy, 'b-', lw=2, alpha=0.8,


label='approximate optimal consumption policy')
ax[i].plot(grid, true_c, 'k-', lw=2, alpha=0.8,
label='true optimal consumption policy')
ax[i].legend(loc='upper left')
ax[i].set_title(f'{n} value function iterations')
plt.show()


Iteration Distance Elapsed (seconds)


---------------------------------------------
1 5.518e+00 6.032e-04
2 4.070e+00 9.885e-04
Iteration Distance Elapsed (seconds)
---------------------------------------------
1 5.518e+00 3.893e-04
2 4.070e+00 7.424e-04
3 3.866e+00 1.085e-03
4 3.673e+00 1.462e-03
Iteration Distance Elapsed (seconds)
---------------------------------------------
1 5.518e+00 3.693e-04
2 4.070e+00 7.381e-04
3 3.866e+00 1.083e-03
4 3.673e+00 1.423e-03
5 3.489e+00 1.779e-03
6 3.315e+00 2.120e-03

/home/runner/miniconda3/envs/quantecon/lib/python3.12/site-packages/quantecon/_
↪compute_fp.py:152: RuntimeWarning: max_iter attained before convergence in␣

↪compute_fixed_point

warnings.warn(_non_convergence_msg, RuntimeWarning)


Dynamics of the Capital Stock

Finally, let us work on Exercise 2, where we plot the trajectories of the capital stock for three different discount factors,
0.9, 0.94, and 0.98, with initial condition 𝑘0 = 0.1.

discount_factors = (0.9, 0.94, 0.98)


k_init = 0.1

# Search for the index corresponding to k_init


k_init_ind = np.searchsorted(grid, k_init)

sample_size = 25

fig, ax = plt.subplots(figsize=(8,5))
ax.set_xlabel("time")
ax.set_ylabel("capital")
ax.set_ylim(0.10, 0.30)

# Create a new instance, not to modify the one used above


ddp0 = DiscreteDP(R, Q, β, s_indices, a_indices)

for beta in discount_factors:


ddp0.beta = beta
res0 = ddp0.solve()
k_path_ind = res0.mc.simulate(init=k_init_ind, ts_length=sample_size)
k_path = grid[k_path_ind]
ax.plot(k_path, 'o-', lw=2, alpha=0.75, label=fr'$\beta = {beta}$')

ax.legend(loc='lower right')
plt.show()


4.7 Appendix: Algorithms

This appendix covers the details of the solution algorithms implemented for DiscreteDP.
We will make use of the following notions of approximate optimality:
• For 𝜀 > 0, 𝑣 is called an 𝜀-approximation of 𝑣∗ if ‖𝑣 − 𝑣∗ ‖ < 𝜀.
• A policy 𝜎 ∈ Σ is called 𝜀-optimal if 𝑣𝜎 is an 𝜀-approximation of 𝑣∗ .

4.7.1 Value Iteration

The DiscreteDP value iteration method implements value function iteration as follows
1. Choose any 𝑣0 ∈ ℝ𝑛 , and specify 𝜀 > 0; set 𝑖 = 0.
2. Compute 𝑣𝑖+1 = 𝑇 𝑣𝑖 .
3. If ‖𝑣𝑖+1 − 𝑣𝑖 ‖ < [(1 − 𝛽)/(2𝛽)]𝜀, then go to step 4; otherwise, set 𝑖 = 𝑖 + 1 and go to step 2.
4. Compute a 𝑣𝑖+1 -greedy policy 𝜎, and return 𝑣𝑖+1 and 𝜎.
Given 𝜀 > 0, the value iteration algorithm
• terminates in a finite number of iterations
• returns an 𝜀/2-approximation of the optimal value function and an 𝜀-optimal policy function (unless iter_max
is reached)
(While not explicit, in the actual implementation each algorithm is terminated if the number of iterations reaches
iter_max)
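As a concrete illustration of these steps, here is a bare-bones sketch written directly against dense arrays R (n × m) and Q (n × m × n) such as those built by SimpleOG above; it is not the DiscreteDP implementation itself, just the recursion described in steps 1–4.

import numpy as np

def value_iteration_sketch(R, Q, β, ε=1e-4, max_iter=10_000):
    """
    A sketch of the value iteration steps above.  Infeasible actions are
    assumed to carry reward -np.inf, as in the SimpleOG example.
    """
    n = R.shape[0]
    v = np.zeros(n)                                # step 1: any v_0
    tol = ε * (1 - β) / (2 * β)                    # threshold used in step 3
    for _ in range(max_iter):
        vals = R + β * (Q @ v)                     # (n, m) array of action values
        v_new = vals.max(axis=1)                   # step 2: v_{i+1} = T v_i
        if np.max(np.abs(v_new - v)) < tol:        # step 3: sup-norm test
            v = v_new
            break
        v = v_new
    σ = (R + β * (Q @ v)).argmax(axis=1)           # step 4: a v-greedy policy
    return v, σ

Calling value_iteration_sketch(g.R, g.Q, g.β) should produce essentially the same v and σ as results.v and results.sigma above.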

4.7.2 Policy Iteration

The DiscreteDP policy iteration method runs as follows


1. Choose any 𝑣0 ∈ ℝ𝑛 and compute a 𝑣0 -greedy policy 𝜎0 ; set 𝑖 = 0.
2. Compute the value 𝑣𝜎𝑖 by solving the equation 𝑣 = 𝑇𝜎𝑖 𝑣.
3. Compute a 𝑣𝜎𝑖 -greedy policy 𝜎𝑖+1 ; let 𝜎𝑖+1 = 𝜎𝑖 if possible.
4. If 𝜎𝑖+1 = 𝜎𝑖 , then return 𝑣𝜎𝑖 and 𝜎𝑖+1 ; otherwise, set 𝑖 = 𝑖 + 1 and go to step 2.
The policy iteration algorithm terminates in a finite number of iterations.
It returns an optimal value function and an optimal policy function (unless iter_max is reached).
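A matching sketch of these steps for dense R and Q (again, not the library implementation; step 3's tie-breaking rule "keep σ_i if possible" is omitted for brevity):

import numpy as np

def policy_iteration_sketch(R, Q, β, max_iter=1_000):
    """A sketch of the policy iteration steps above."""
    n = R.shape[0]
    v = np.zeros(n)
    σ = (R + β * (Q @ v)).argmax(axis=1)              # step 1: a v_0-greedy policy
    for _ in range(max_iter):
        # step 2: evaluate σ by solving v = R_σ + β Q_σ v
        R_σ = R[np.arange(n), σ]
        Q_σ = Q[np.arange(n), σ, :]
        v = np.linalg.solve(np.eye(n) - β * Q_σ, R_σ)
        # step 3: compute a v_σ-greedy policy
        σ_new = (R + β * (Q @ v)).argmax(axis=1)
        # step 4: stop when the policy repeats
        if np.array_equal(σ_new, σ):
            break
        σ = σ_new
    return v, σ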

4.7.3 Modified Policy Iteration

The DiscreteDP modified policy iteration method runs as follows:


1. Choose any 𝑣0 ∈ ℝ𝑛 , and specify 𝜀 > 0 and 𝑘 ≥ 0; set 𝑖 = 0.
2. Compute a 𝑣𝑖 -greedy policy 𝜎𝑖+1 ; let 𝜎𝑖+1 = 𝜎𝑖 if possible (for 𝑖 ≥ 1).
3. Compute 𝑢 = 𝑇 𝑣𝑖 (= 𝑇𝜎𝑖+1 𝑣𝑖 ). If span(𝑢 − 𝑣𝑖 ) < [(1 − 𝛽)/𝛽]𝜀, then go to step 5; otherwise go to step 4.
• Span is defined by span(𝑧) = max(𝑧) − min(𝑧).
4. Compute 𝑣𝑖+1 = (𝑇𝜎𝑖+1 )𝑘 𝑢 (= (𝑇𝜎𝑖+1 )𝑘+1 𝑣𝑖 ); set 𝑖 = 𝑖 + 1 and go to step 2.


5. Return 𝑣 = 𝑢 + [𝛽/(1 − 𝛽)][(min(𝑢 − 𝑣𝑖 ) + max(𝑢 − 𝑣𝑖 ))/2]1 and 𝜎𝑖+1 .


Given 𝜀 > 0, provided that 𝑣0 is such that 𝑇 𝑣0 ≥ 𝑣0 , the modified policy iteration algorithm terminates in a finite number
of iterations.
It returns an 𝜀/2-approximation of the optimal value function and an 𝜀-optimal policy function (unless iter_max is
reached).
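Here is a corresponding sketch with k partial policy-evaluation steps per iteration (dense arrays as before; the tie-breaking rule in step 2 is again omitted):

import numpy as np

def modified_policy_iteration_sketch(R, Q, β, ε=1e-4, k=20, max_iter=10_000):
    """A sketch of the modified policy iteration steps above."""
    n = R.shape[0]
    v = np.zeros(n)                                   # step 1 (theory assumes T v_0 >= v_0)
    for _ in range(max_iter):
        vals = R + β * (Q @ v)
        σ = vals.argmax(axis=1)                       # step 2: a v_i-greedy policy
        u = vals.max(axis=1)                          # step 3: u = T v_i
        diff = u - v
        if diff.max() - diff.min() < ε * (1 - β) / β: # span-based stopping rule
            # step 5: return u plus the constant correction term
            return u + β / (1 - β) * (diff.min() + diff.max()) / 2, σ
        # step 4: v_{i+1} = (T_σ)^k u
        R_σ = R[np.arange(n), σ]
        Q_σ = Q[np.arange(n), σ, :]
        for _ in range(k):
            u = R_σ + β * (Q_σ @ u)
        v = u
    return v, σ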
See also the documentation for DiscreteDP.



Part II

LQ Control

CHAPTER

FIVE

INFORMATION AND CONSUMPTION SMOOTHING

In addition to what’s in Anaconda, this lecture employs the following libraries:

!pip install --upgrade quantecon

5.1 Overview

In the linear-quadratic permanent income model of consumption smoothing described in this quantecon lecture, a scalar
parameter 𝛽 ∈ (0, 1) plays two roles:
• it is a discount factor that the consumer applies to future utilities from consumption
• it is the reciprocal of the gross interest rate on risk-free one-period loans
That 𝛽 plays these two roles is essential in delivering the outcome that, regardless of the stochastic process that describes
his non-financial income, the consumer chooses to make consumption follow a random walk (see [Hall, 1978]).
In this lecture, we assign a third role to 𝛽:
• it describes a first-order moving average process for the growth in non-financial income

5.1.1 Same non-financial incomes, different information

We study two consumers who have exactly the same nonfinancial income process and who both conform to the linear-
quadratic permanent income model of consumption smoothing described here.
The two consumers have different information about their future nonfinancial incomes.
A better informed consumer each period receives news in the form of a shock that simultaneously affects both today’s
nonfinancial income and the present value of future nonfinancial incomes in a particular way.
A less informed consumer each period receives a shock that equals the part of today’s nonfinancial income that could not
be forecast from past values of nonfinancial income.
Even though they receive exactly the same nonfinancial incomes each period, our two consumers behave differently
because they have different information about their future nonfinancial incomes.
The second consumer receives less information about future nonfinancial incomes in a sense that we shall make precise.
This difference in their information sets manifests itself in their responding differently to what they regard as time 𝑡
information shocks.
Thus, although at each date they receive exactly the same histories of nonfinancial income, our two consumers receive
different shocks or news about their future nonfinancial incomes.


We use the different behaviors of our consumers as a way to learn about


• operating characteristics of a linear-quadratic permanent income model
• how the Kalman filter introduced in this lecture and/or another representation of the theory of optimal forecasting
introduced in this lecture embody lessons that can be applied to the news and noise literature
• ways of representing and computing optimal decision rules in the linear-quadratic permanent income model
• a Ricardian equivalence outcome that describes effects on optimal consumption of a tax cut at time 𝑡 accompanied
by a foreseen permanent increases in taxes that is just sufficient to cover the interest payments used to service the
risk-free government bonds that are issued to finance the tax cut
• a simple application of alternative ways to factor a covariance generating function along lines described in this
lecture
This lecture can be regarded as an introduction to invertibility issues that take center stage in the analysis of fiscal
foresight by Eric Leeper, Todd Walker, and Susan Yang [Leeper et al., 2013], as well as in chapter 4 of [Sargent et al.,
1991].

5.2 Two Representations of One Nonfinancial Income Process

We study consequences of endowing a consumer with one of two alternative representations for the change in the con-
sumer’s nonfinancial income 𝑦𝑡+1 − 𝑦𝑡 .
For both types of consumer, a parameter 𝛽 ∈ (0, 1) plays three roles.
It appears
• as a discount factor applied to future expected one-period utilities,
• as the reciprocal of a gross interest rate on one-period loans, and
• as a parameter in a first-order moving average that equals the increment in a consumer’s non-financial income
The first representation, which we shall sometimes refer to as the more informative representation, is

𝑦𝑡+1 − 𝑦𝑡 = 𝜖𝑡+1 − 𝛽 −1 𝜖𝑡 (5.1)

where {𝜖𝑡 } is an i.i.d. normally distributed scalar process with means of zero and contemporaneous variances 𝜎𝜖2 .
This representation of the process is used by a consumer who at time 𝑡 knows both 𝑦𝑡 and the shock 𝜖𝑡 and can use both
of them to forecast future 𝑦𝑡+𝑗 ’s.
As we’ll see below, representation (5.1) has the peculiar property that a positive shock 𝜖𝑡+1 leaves the discounted present
value of the consumer’s financial income at time 𝑡 + 1 unaltered.
The second representation of the same {𝑦𝑡 } process is

𝑦𝑡+1 − 𝑦𝑡 = 𝑎𝑡+1 − 𝛽𝑎𝑡 (5.2)

where {𝑎𝑡 } is another i.i.d. normally distributed scalar process, with means of zero and now variances 𝜎𝑎2 > 𝜎𝜖2 .
The i.i.d. shock variances are related by

𝜎𝑎2 = 𝛽 −2 𝜎𝜖2 > 𝜎𝜖2

so that the variance of the innovation exceeds the variance of the original shock by a multiplicative factor 𝛽 −2 .
Representation (5.2) is the innovations representation of equation (5.1) associated with Kalman filtering theory.


To see how this works, note that equating representations (5.1) and (5.2) for 𝑦𝑡+1 −𝑦𝑡 implies 𝜖𝑡+1 −𝛽 −1 𝜖𝑡 = 𝑎𝑡+1 −𝛽𝑎𝑡 ,
which in turn implies

𝑎𝑡+1 = 𝛽𝑎𝑡 + 𝜖𝑡+1 − 𝛽 −1 𝜖𝑡 .

Solving this difference equation backwards for 𝑎𝑡+1 gives, after a few lines of algebra,

$$
a_{t+1} = \epsilon_{t+1} + (\beta - \beta^{-1}) \sum_{j=0}^{\infty} \beta^j \epsilon_{t-j} \tag{5.3}
$$

which we can also write as



$$
a_{t+1} = \sum_{j=0}^{\infty} h_j \epsilon_{t+1-j} \equiv h(L) \epsilon_{t+1}
$$

where L is the one-period lag operator, h(L) = ∑_{j=0}^∞ h_j L^j, I is the identity operator, and

$$
h(L) = \frac{I - \beta^{-1} L}{I - \beta L}
$$
Let g_j ≡ E a_t a_{t−j} be the jth autocovariance of the {a_t} process.
Using calculations in the quantecon lecture, where z ∈ C is a complex variable, the covariance generating function
g(z) = ∑_{j=−∞}^{∞} g_j z^j of the {a_t} process equals

$$
g(z) = \sigma_\epsilon^2 h(z) h(z^{-1}) = \beta^{-2} \sigma_\epsilon^2 > \sigma_\epsilon^2,
$$

which confirms that {a_t} is a serially uncorrelated process with variance

$$
\sigma_a^2 = \beta^{-2} \sigma_\epsilon^2 .
$$

To verify these claims, just notice that 𝑔(𝑧) = 𝛽 −2 𝜎𝜖2 implies that
• 𝑔0 = 𝛽 −2 𝜎𝜖2 , and
• 𝑔𝑗 = 0 for 𝑗 ≠ 0.
Alternatively, if you are uncomfortable with covariance generating functions, note that we can directly calculate 𝜎𝑎2 from
formula (5.3) according to

$$
\sigma_a^2 = \sigma_\epsilon^2 \left[ 1 + (\beta - \beta^{-1})^2 \sum_{j=0}^{\infty} \beta^{2j} \right] = \beta^{-2} \sigma_\epsilon^2 .
$$
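A quick Monte Carlo check of this variance relation (a sketch only; β = 0.95 and σ_ε = 1 are the illustrative values used in the computations later in this lecture):

import numpy as np

# Simulate ε, build a_t from the recursion a_{t+1} = β a_t + ε_{t+1} - β^{-1} ε_t,
# and compare the sample variance of a with β^{-2} σ_ε^2  (parameter values illustrative)
β, σ_ε, T = 0.95, 1.0, 500_000
rng = np.random.default_rng(0)
ε = σ_ε * rng.standard_normal(T)

a = np.empty(T)
a[0] = ε[0]
for t in range(T - 1):
    a[t + 1] = β * a[t] + ε[t + 1] - ε[t] / β

print(a[1_000:].var(), σ_ε**2 / β**2)   # after a burn-in, the two numbers should be close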

5.3 Application of Kalman filter

We can also use the the Kalman filter to obtain representation (5.2) from representation (5.1).
Thus, from equations associated with the Kalman filter, it can be verified that the steady-state Kalman gain 𝐾 = 𝛽 2 and
the steady state conditional covariance

Σ = 𝐸[(𝜖𝑡 − 𝜖𝑡̂ )2 |𝑦𝑡−1 , 𝑦𝑡−2 , …] = (1 − 𝛽 2 )𝜎𝜖2

In a little more detail, let 𝑧𝑡 = 𝑦𝑡 − 𝑦𝑡−1 and form the state-space representation

𝜖𝑡+1 = 0𝜖𝑡 + 𝜖𝑡+1


𝑧𝑡+1 = −𝛽 −1 𝜖𝑡 + 𝜖𝑡+1


and assume that 𝜎𝜖 = 1 for convenience


Let’s compute the steady-state Kalman filter for this system.
Let 𝐾 be the steady-state gain and 𝑎𝑡+1 the one-step ahead innovation.
The steady-state innovations representation is

$$
\begin{aligned}
\hat{\epsilon}_{t+1} &= 0 \, \hat{\epsilon}_t + K a_{t+1} \\
z_{t+1} &= -\beta a_t + a_{t+1}
\end{aligned}
$$

By applying formulas for the steady-state Kalman filter, by hand it is possible to verify that 𝐾 = 𝛽 2 , 𝜎𝑎2 = 𝛽 −2 𝜎𝜖2 = 𝛽 −2 ,
and Σ = (1 − 𝛽 2 )𝜎𝜖2 .
Alternatively, we can obtain these formulas via the classical filtering theory described in this lecture.

5.4 News Shocks and Less Informative Shocks

Representation (5.1) is cast in terms of a news shock 𝜖𝑡+1 that represents a shock to nonfinancial income coming from
taxes, transfers, and other random sources of income changes known to a well-informed person who perhaps has all sorts
of information about the income process.
Representation (5.2) for the same income process is driven by shocks 𝑎𝑡 that contain less information than the news shock
𝜖𝑡 .
Representation (5.2) is called the innovations representation for the {𝑦𝑡 − 𝑦𝑡−1 } process.
It is cast in terms of what time series statisticians call the innovation or fundamental shock that emerges from apply-
ing the theory of optimally predicting nonfinancial income based solely on the information in past levels of growth in
nonfinancial income.
Fundamental for the 𝑦𝑡 process means that the shock 𝑎𝑡 can be expressed as a square-summable linear combination of
𝑦𝑡 , 𝑦𝑡−1 , ….
The shock 𝜖𝑡 is not fundamental because it has more information about the future of the {𝑦𝑡 − 𝑦𝑡−1 } process than is
contained in 𝑎𝑡 .
Representation (5.3) reveals the important fact that the original shock 𝜖𝑡 contains more information about future 𝑦’s than
is contained in the semi-infinite history 𝑦𝑡 = [𝑦𝑡 , 𝑦𝑡−1 , …].
Staring at representation (5.3) for a_{t+1} shows that it consists both of new news ε_{t+1} as well as a long moving average
(β − β⁻¹) ∑_{j=0}^∞ β^j ε_{t−j} of old news.
The more information representation (5.1) asserts that a shock 𝜖𝑡 results in an impulse response to nonfinancial income
of 𝜖𝑡 times the sequence

1, 1 − 𝛽 −1 , 1 − 𝛽 −1 , …

so that a shock that increases nonfinancial income 𝑦𝑡 by 𝜖𝑡 at time 𝑡 is followed by a change in future 𝑦 of 𝜖𝑡 times
1 − 𝛽 −1 < 0 in all subsequent periods.
Because 1 − 𝛽 −1 < 0, this means that a positive shock of 𝜖𝑡 today raises income at time 𝑡 by 𝜖𝑡 and then permanently
decreases all future incomes by (𝛽 −1 − 1)𝜖𝑡 .
This pattern precisely describes the following mental experiment:
• The consumer receives a government transfer of 𝜖𝑡 at time 𝑡.
• The government finances the transfer by issuing a one-period bond on which it pays a gross one-period risk-free
interest rate equal to 𝛽 −1 .


• In each future period, the government rolls over the one-period bond and so continues to borrow 𝜖𝑡 forever.
• The government imposes a lump-sum tax on the consumer in order to pay just the current interest on the original
bond and its rolled over successors.
• Thus, in periods 𝑡 + 1, 𝑡 + 2, …, the government levies a lump-sum tax on the consumer of 𝛽 −1 − 1 that is just
enough to pay the interest on the bond.
The present value of the impulse response or moving average coefficients equals d_ε(β) = 0/(1 − β) = 0, a fact that we'll see
again below.
Representation (5.2), i.e., the innovations representation, asserts that a shock 𝑎𝑡 results in an impulse response to nonfi-
nancial income of 𝑎𝑡 times

1, 1 − 𝛽, 1 − 𝛽, …

so that a shock that increases income 𝑦𝑡 by 𝑎𝑡 at time 𝑡 can be expected to be followed by an increase in 𝑦𝑡+𝑗 of 𝑎𝑡 times
1 − 𝛽 > 0 in all future periods 𝑗 = 1, 2, ….
The present value of the impulse response or moving average coefficients for representation (5.2) is d_a(β) = (1 − β²)/(1 − β) =
1 + β, another fact that will be important below.
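These two present values are easy to check numerically; here is a small sketch (β = 0.95 is again just an illustrative value, and the infinite sums are truncated):

import numpy as np

# Present values of the two impulse response sequences, truncated at J terms
β, J = 0.95, 200
powers = β ** np.arange(J)

resp_ε = np.r_[1.0, np.full(J - 1, 1 - 1 / β)]   # 1, 1 - β^{-1}, 1 - β^{-1}, ...
resp_a = np.r_[1.0, np.full(J - 1, 1 - β)]       # 1, 1 - β, 1 - β, ...

print(powers @ resp_ε)          # ≈ 0
print(powers @ resp_a, 1 + β)   # ≈ 1 + β, up to truncation error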

5.5 Representation of 𝜖𝑡 Shock in Terms of Future 𝑦𝑡

Notice that representation (5.1), namely, y_{t+1} − y_t = −β⁻¹ ε_t + ε_{t+1}, implies the linear difference equation

𝜖𝑡 = 𝛽𝜖𝑡+1 − 𝛽(𝑦𝑡+1 − 𝑦𝑡 ).

Solving forward we obtain



$$
\epsilon_t = \beta \left( y_t - (1 - \beta) \sum_{j=0}^{\infty} \beta^j y_{t+j+1} \right)
$$

This equation shows that ε_t equals β times the one-step-backwards error in optimally backcasting y_t based on the semi-infinite future y_+^t ≡ [y_{t+1}, y_{t+2}, …] via the optimal backcasting formula

$$
E[y_t \mid y_+^t] = (1 - \beta) \sum_{j=0}^{\infty} \beta^j y_{t+j+1}
$$

Thus, ε_t exactly reveals the gap between y_t and E[y_t | y_+^t].

5.6 Representation in Terms of 𝑎𝑡 Shocks

Next notice that representation (5.2), namely, 𝑦𝑡+1 − 𝑦𝑡 = −𝛽𝑎𝑡 + 𝑎𝑡+1 implies the linear difference equation

𝑎𝑡+1 = 𝛽𝑎𝑡 + (𝑦𝑡+1 − 𝑦𝑡 )

Solving this equation backward establishes that the one-step-prediction error 𝑎𝑡+1 is

$$
a_{t+1} = y_{t+1} - (1 - \beta) \sum_{j=0}^{\infty} \beta^j y_{t-j} .
$$

Here the information set is y^t = [y_t, y_{t−1}, …] and a one-step-ahead optimal prediction is

$$
E[y_{t+1} \mid y^t] = (1 - \beta) \sum_{j=0}^{\infty} \beta^j y_{t-j}
$$


5.7 Permanent Income Consumption-Smoothing Model

When we computed optimal consumption-saving policies for our two representations (5.1) and (5.2) by using formulas
obtained with the difference equation approach described in quantecon lecture, we obtained:
for a consumer having the information assumed in the news representation (5.1):

𝑐𝑡+1 − 𝑐𝑡 = 0
𝑏𝑡+1 − 𝑏𝑡 = −𝛽 −1 𝜖𝑡

for a consumer having the more limited information associated with the innovations representation (5.2):

𝑐𝑡+1 − 𝑐𝑡 = (1 − 𝛽 2 )𝑎𝑡+1
𝑏𝑡+1 − 𝑏𝑡 = −𝛽𝑎𝑡

These formulas agree with outcomes from Python programs below that deploy state-space representations and dynamic
programming.
Evidently, although they receive exactly the same histories of nonfinancial income, the two consumers behave differently.
The better informed consumer who has the information sets associated with representation (5.1) responds to each shock
𝜖𝑡+1 by leaving his consumption unaltered and saving all of 𝜖𝑡+1 in anticipation of the permanently increased taxes that he
will bear in order to service the permanent interest payments on the risk-free bonds that the government has presumably
issued to pay for the one-time addition 𝜖𝑡+1 to his time 𝑡 + 1 nonfinancial income.
The less well informed consumer who has information sets associated with representation (5.2) responds to a shock a_{t+1}
by increasing his consumption by what he perceives to be the permanent part of the increase in his nonfinancial income and by
increasing his saving by what he perceives to be the temporary part.
The behavior of the better informed consumer sharply illustrates the behavior predicted in a classic Ricardian equivalence
experiment.

5.8 State Space Representations

We now cast our representations (5.1) and (5.2), respectively, in terms of the following two state space systems:

$$
\begin{aligned}
\begin{bmatrix} y_{t+1} \\ \epsilon_{t+1} \end{bmatrix}
&= \begin{bmatrix} 1 & -\beta^{-1} \\ 0 & 0 \end{bmatrix}
\begin{bmatrix} y_t \\ \epsilon_t \end{bmatrix}
+ \begin{bmatrix} \sigma_\epsilon \\ \sigma_\epsilon \end{bmatrix} v_{t+1} \\
y_t &= \begin{bmatrix} 1 & 0 \end{bmatrix}
\begin{bmatrix} y_t \\ \epsilon_t \end{bmatrix}
\end{aligned}
\tag{5.4}
$$

and

$$
\begin{aligned}
\begin{bmatrix} y_{t+1} \\ a_{t+1} \end{bmatrix}
&= \begin{bmatrix} 1 & -\beta \\ 0 & 0 \end{bmatrix}
\begin{bmatrix} y_t \\ a_t \end{bmatrix}
+ \begin{bmatrix} \sigma_a \\ \sigma_a \end{bmatrix} u_{t+1} \\
y_t &= \begin{bmatrix} 1 & 0 \end{bmatrix}
\begin{bmatrix} y_t \\ a_t \end{bmatrix}
\end{aligned}
\tag{5.5}
$$

where {𝑣𝑡 } and {𝑢𝑡 } are both i.i.d. sequences of univariate standardized normal random variables.
These two alternative income processes are ready to be used in the framework presented in the section “Comparison with
the Difference Equation Approach” in this quantecon lecture.
All the code that we shall use below is presented in that lecture.


5.9 Computations

We shall use Python to form two state-space representations (5.4) and (5.5).
We set the following parameter values 𝜎𝜖 = 1, 𝜎𝑎 = 𝛽 −1 𝜎𝜖 = 𝛽 −1 where 𝛽 is the same value as the discount factor in
the household’s problem in the LQ savings problem in the lecture.
For these two representations, we use the code in this lecture to
• compute optimal decision rules for 𝑐𝑡 , 𝑏𝑡 for the two types of consumers associated with our two representations
of nonfinancial income
• use the value function objects 𝑃 , 𝑑 returned by the code to compute optimal values for the two representations
when evaluated at the initial condition
$$
x_0 = \begin{bmatrix} 10 \\ 0 \end{bmatrix}
$$
for each representation.
• create instances of the LinearStateSpace class for the two representations of the {𝑦𝑡 } process and use them to
obtain impulse response functions of 𝑐𝑡 and 𝑏𝑡 to the respective shocks 𝜖𝑡 and 𝑎𝑡 for the two representations.
• run simulations of {𝑦𝑡 , 𝑐𝑡 , 𝑏𝑡 } of length 𝑇 under both of the representations
We formulate the problem:

$$
\min \sum_{t=0}^{\infty} \beta^t (c_t - \gamma)^2
$$

subject to a sequence of constraints


$$
c_t + b_t = \frac{1}{1+r} b_{t+1} + y_t, \quad t \ge 0
$$
where 𝑦𝑡 follows one of the representations defined above.
Define the control as 𝑢𝑡 ≡ 𝑐𝑡 − 𝛾.
(For simplicity we can assume 𝛾 = 0 below because 𝛾 has no effect on the impulse response functions that interest us.)
The state transition equations under our two representations for the nonfinancial income process {𝑦𝑡 } can be written as

$$
\begin{bmatrix} y_{t+1} \\ \epsilon_{t+1} \\ b_{t+1} \end{bmatrix}
= \underbrace{\begin{bmatrix} 1 & -\beta^{-1} & 0 \\ 0 & 0 & 0 \\ -(1+r) & 0 & 1+r \end{bmatrix}}_{\equiv A_1}
\begin{bmatrix} y_t \\ \epsilon_t \\ b_t \end{bmatrix}
+ \underbrace{\begin{bmatrix} 0 \\ 0 \\ 1+r \end{bmatrix}}_{\equiv B_1}
\begin{bmatrix} c_t \end{bmatrix}
+ \underbrace{\begin{bmatrix} \sigma_\epsilon \\ \sigma_\epsilon \\ 0 \end{bmatrix}}_{\equiv C_1} \nu_{t+1},
$$

and

$$
\begin{bmatrix} y_{t+1} \\ a_{t+1} \\ b_{t+1} \end{bmatrix}
= \underbrace{\begin{bmatrix} 1 & -\beta & 0 \\ 0 & 0 & 0 \\ -(1+r) & 0 & 1+r \end{bmatrix}}_{\equiv A_2}
\begin{bmatrix} y_t \\ a_t \\ b_t \end{bmatrix}
+ \underbrace{\begin{bmatrix} 0 \\ 0 \\ 1+r \end{bmatrix}}_{\equiv B_2}
\begin{bmatrix} c_t \end{bmatrix}
+ \underbrace{\begin{bmatrix} \sigma_a \\ \sigma_a \\ 0 \end{bmatrix}}_{\equiv C_2} u_{t+1} .
$$

As usual, we start by importing packages.

import numpy as np
import quantecon as qe
import matplotlib.pyplot as plt


# Set parameters
β, σϵ = 0.95, 1
σa = σϵ / β

R = 1 / β

# Payoff matrices are the same for two representations


RLQ = np.array([[0, 0, 0],
[0, 0, 0],
[0, 0, 1e-12]]) # put penalty on debt
QLQ = np.array([1.])

# More informative representation state transition matrices


ALQ1 = np.array([[1, -R, 0],
[0, 0, 0],
[-R, 0, R]])
BLQ1 = np.array([[0, 0, R]]).T
CLQ1 = np.array([[σϵ, σϵ, 0]]).T

# Construct and solve the LQ problem


LQ1 = qe.LQ(QLQ, RLQ, ALQ1, BLQ1, C=CLQ1, beta=β)
P1, F1, d1 = LQ1.stationary_values()

# The optimal decision rule for c


-F1

array([[ 1. , -1. , -0.05]])

Evidently, optimal consumption and debt decision rules for the consumer having news representation (5.1) are

𝑐𝑡∗ = 𝑦𝑡 − 𝜖𝑡 − (1 − 𝛽) 𝑏𝑡 ,

𝑏𝑡+1 = 𝛽 −1 𝑐𝑡∗ + 𝛽 −1 𝑏𝑡 − 𝛽 −1 𝑦𝑡
= 𝛽 −1 𝑦𝑡 − 𝛽 −1 𝜖𝑡 − (𝛽 −1 − 1) 𝑏𝑡 + 𝛽 −1 𝑏𝑡 − 𝛽 −1 𝑦𝑡
= 𝑏𝑡 − 𝛽 −1 𝜖𝑡 .

# Innovations representation
ALQ2 = np.array([[1, -β, 0],
[0, 0, 0],
[-R, 0, R]])
BLQ2 = np.array([[0, 0, R]]).T
CLQ2 = np.array([[σa, σa, 0]]).T

LQ2 = qe.LQ(QLQ, RLQ, ALQ2, BLQ2, C=CLQ2, beta=β)


P2, F2, d2 = LQ2.stationary_values()

-F2

array([[ 1. , -0.9025, -0.05 ]])

For a consumer having access only to the information associated with the innovations representation (5.2), the optimal


decision rules are


𝑐𝑡∗ = 𝑦𝑡 − 𝛽 2 𝑎𝑡 − (1 − 𝛽) 𝑏𝑡 ,

𝑏𝑡+1 = 𝛽 −1 𝑐𝑡∗ + 𝛽 −1 𝑏𝑡 − 𝛽 −1 𝑦𝑡
= 𝛽 −1 𝑦𝑡 − 𝛽𝑎𝑡 − (𝛽 −1 − 1) 𝑏𝑡 + 𝛽 −1 𝑏𝑡 − 𝛽 −1 𝑦𝑡
= 𝑏𝑡 − 𝛽𝑎𝑡 .

Now we construct two Linear State Space models that emerge from using optimal policies of the form 𝑢𝑡 = −𝐹 𝑥𝑡 .
Take the more informative original representation (5.1) as an example:

$$
\begin{aligned}
\begin{bmatrix} y_{t+1} \\ \epsilon_{t+1} \\ b_{t+1} \end{bmatrix}
&= (A_1 - B_1 F_1)
\begin{bmatrix} y_t \\ \epsilon_t \\ b_t \end{bmatrix}
+ C_1 \nu_{t+1} \\
\begin{bmatrix} c_t \\ b_t \end{bmatrix}
&= \begin{bmatrix} -F_1 \\ S_b \end{bmatrix}
\begin{bmatrix} y_t \\ \epsilon_t \\ b_t \end{bmatrix}
\end{aligned}
$$
To have the Linear State Space model be of an innovations representation form (5.2), we can simply replace the corre-
sponding matrices.

# Construct two Linear State Space models


Sb = np.array([0, 0, 1])

ABF1 = ALQ1 - BLQ1 @ F1


G1 = np.vstack([-F1, Sb])
LSS1 = qe.LinearStateSpace(ABF1, CLQ1, G1)

ABF2 = ALQ2 - BLQ2 @ F2


G2 = np.vstack([-F2, Sb])
LSS2 = qe.LinearStateSpace(ABF2, CLQ2, G2)

The following code computes impulse response functions of 𝑐𝑡 and 𝑏𝑡 .

J = 5 # Number of coefficients that we want

x_res1, y_res1 = LSS1.impulse_response(j=J)


b_res1 = np.array([x_res1[i][2, 0] for i in range(J)])
c_res1 = np.array([y_res1[i][0, 0] for i in range(J)])

x_res2, y_res2 = LSS2.impulse_response(j=J)


b_res2 = np.array([x_res2[i][2, 0] for i in range(J)])
c_res2 = np.array([y_res2[i][0, 0] for i in range(J)])

c_res1 / σϵ, b_res1 / σϵ

(array([1.99998906e-11, 1.89473923e-11, 1.78947621e-11, 1.68421319e-11,


1.57895017e-11]),
array([ 0. , -1.05263158, -1.05263158, -1.05263158, -1.05263158]))

plt.title("more informative representation")


plt.plot(range(J), c_res1 / σϵ, label="c impulse response function")
plt.plot(range(J), b_res1 / σϵ, label="b impulse response function")
plt.legend()


<matplotlib.legend.Legend at 0x7fc14e8fe450>

The above two impulse response functions show that when the consumer has the information assumed in the more infor-
mative representation (5.1), his response to receiving a positive shock of 𝜖𝑡 is to leave his consumption unchanged and to
save the entire amount of his extra income and then forever roll over the extra bonds that he holds.
To see this, notice that starting from next period on, his debt permanently decreases by β⁻¹.

c_res2 / σa, b_res2 / σa

(array([0.0975, 0.0975, 0.0975, 0.0975, 0.0975]),


array([ 0. , -0.95, -0.95, -0.95, -0.95]))

plt.title("innovations representation")
plt.plot(range(J), c_res2 / σa, label="c impulse response function")
plt.plot(range(J), b_res2 / σa, label="b impulse response function")
plt.plot([0, J-1], [0, 0], '--', color='k')
plt.legend()

<matplotlib.legend.Legend at 0x7fc14e6a64e0>


The above impulse responses show that when the consumer has only the information that is assumed to be available
under the innovations representation (5.2) for {𝑦𝑡 − 𝑦𝑡−1 }, he responds to a positive 𝑎𝑡 by permanently increasing his
consumption.
He accomplishes this by consuming a fraction (1 − 𝛽 2 ) of the increment 𝑎𝑡 to his nonfinancial income and saving the
rest, thereby lowering 𝑏𝑡+1 in order to finance the permanent increment in his consumption.
The preceding computations confirm what we had derived earlier using paper and pencil.
Now let’s simulate some paths of consumption and debt for our two types of consumers while always presenting both
types with the same {𝑦𝑡 } path.

# Set time length for simulation


T = 100

x1, y1 = LSS1.simulate(ts_length=T)
plt.plot(range(T), y1[0, :], label="c")
plt.plot(range(T), x1[2, :], label="b")
plt.plot(range(T), x1[0, :], label="y")
plt.title("more informative representation")
plt.legend()

<matplotlib.legend.Legend at 0x7fc14e6a6d20>


x2, y2 = LSS2.simulate(ts_length=T)
plt.plot(range(T), y2[0, :], label="c")
plt.plot(range(T), x2[2, :], label="b")
plt.plot(range(T), x2[0, :], label="y")
plt.title("innovations representation")
plt.legend()

<matplotlib.legend.Legend at 0x7fc14e1b6c00>


5.10 Simulating Income Process and Two Associated Shock Processes

We now form a single {𝑦𝑡 }𝑇𝑡=0 realization that we will use to simulate decisions associated with our two types of consumer.
We accomplish this in the following steps.
1. We form a {𝑦𝑡 , 𝜖𝑡 } realization by drawing a long simulation of {𝜖𝑡 }𝑇𝑡=0 , where 𝑇 is a big integer, 𝜖𝑡 = 𝜎𝜖 𝑣𝑡 , 𝑣𝑡 is
a standard normal scalar, 𝑦0 = 100, and

𝑦𝑡+1 − 𝑦𝑡 = −𝛽 −1 𝜖𝑡 + 𝜖𝑡+1 .

2. We take the {𝑦𝑡 } realization generated in step 1 and form an innovation process {𝑎𝑡 } from the formulas
$$
\begin{aligned}
a_0 &= 0 \\
a_t &= \sum_{j=0}^{t-1} \beta^j (y_{t-j} - y_{t-j-1}) + \beta^t a_0, \quad t \ge 1
\end{aligned}
$$

3. We throw away the first S observations and form a sample {y_t, ε_t, a_t}_{t=S+1}^{T} as the realization that we'll use in the
following steps.
4. We use the step 3 realization to evaluate and simulate the decision rules for 𝑐𝑡 , 𝑏𝑡 that Python has computed for
us above.
The above steps implement the experiment of comparing decisions made by two consumers having identical incomes at
each date but at each date having different information about their future incomes.
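Here is a minimal sketch of steps 1–3 (it relies on β, σϵ and np from the code cells above; the sample length T_big, the burn-in length S, and the seed are illustrative choices, not the lecture's own):

# Sketch of steps 1-3: simulate {y_t, ε_t} and recover the innovations {a_t}
# (β, σϵ and np come from the cells above; T_big, S and the seed are illustrative)
T_big, S = 10_000, 500
rng = np.random.default_rng(1234)
ε_sim = σϵ * rng.standard_normal(T_big)

# Step 1: y_{t+1} - y_t = -β^{-1} ε_t + ε_{t+1}, with y_0 = 100
y_sim = np.empty(T_big)
y_sim[0] = 100.0
for t in range(T_big - 1):
    y_sim[t + 1] = y_sim[t] - ε_sim[t] / β + ε_sim[t + 1]

# Step 2: a_0 = 0 and a_t = Σ_{j=0}^{t-1} β^j (y_{t-j} - y_{t-j-1}), written recursively
Δy = np.diff(y_sim)
a_sim = np.zeros(T_big)
for t in range(1, T_big):
    a_sim[t] = β * a_sim[t - 1] + Δy[t - 1]

# Step 3: drop the first S observations before using the sample
y_use, ε_use, a_use = y_sim[S:], ε_sim[S:], a_sim[S:]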


5.11 Calculating Innovations in Another Way

Here we use formula (5.3) above to compute 𝑎𝑡+1 as a function of the history 𝜖𝑡+1 , 𝜖𝑡 , 𝜖𝑡−1 , …
Thus, we compute

$$
\begin{aligned}
a_{t+1} &= \beta a_t + \epsilon_{t+1} - \beta^{-1} \epsilon_t \\
&= \beta (\beta a_{t-1} + \epsilon_t - \beta^{-1} \epsilon_{t-1}) + \epsilon_{t+1} - \beta^{-1} \epsilon_t \\
&= \beta^2 a_{t-1} + \beta (\epsilon_t - \beta^{-1} \epsilon_{t-1}) + \epsilon_{t+1} - \beta^{-1} \epsilon_t \\
&\;\; \vdots \\
&= \beta^{t+1} a_0 + \sum_{j=0}^{t} \beta^j (\epsilon_{t+1-j} - \beta^{-1} \epsilon_{t-j}) \\
&= \beta^{t+1} a_0 + \epsilon_{t+1} + (\beta - \beta^{-1}) \sum_{j=0}^{t-1} \beta^j \epsilon_{t-j} - \beta^{t-1} \epsilon_0 .
\end{aligned}
$$

We can verify that we recover the same {𝑎𝑡 } sequence computed earlier.
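A small sketch of that check, reusing ε_sim, a_sim and β from the sketch in the previous section (so that a_0 = 0 there):

# Check the closed-form expression above against the {a_t} recovered from the y history
# (reuses ε_sim, a_sim and β from the previous sketch, where a_0 = 0)
t = 2_000                                      # any date in the sample works
j = np.arange(t)                               # j = 0, ..., t-1
a_closed = (ε_sim[t + 1]
            + (β - 1 / β) * (β**j @ ε_sim[t - j])
            - β**(t - 1) * ε_sim[0])           # the β^{t+1} a_0 term vanishes since a_0 = 0
print(np.isclose(a_closed, a_sim[t + 1]))      # should print True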

5.12 Another Invertibility Issue

This quantecon lecture contains another example of a shock-invertibility issue that is endemic to the LQ permanent income
or consumption smoothing model.
The technical issue discussed there is ultimately the source of the shock-invertibility issues discussed by Eric Leeper,
Todd Walker, and Susan Yang [Leeper et al., 2013] in their analysis of fiscal foresight.



CHAPTER

SIX

CONSUMPTION SMOOTHING WITH COMPLETE AND INCOMPLETE MARKETS

In addition to what’s in Anaconda, this lecture uses the library:

!pip install --upgrade quantecon

6.1 Overview

This lecture describes two types of consumption-smoothing models.


• one is in the complete markets tradition of Kenneth Arrow
• the other is in the incomplete markets tradition of Hall [Hall, 1978]
Complete markets allow a consumer to buy and sell claims contingent on all possible states of the world.
Incomplete markets allow a consumer to buy and sell a limited set of securities, often only a single risk-free security.
Hall [Hall, 1978] worked in an incomplete markets tradition by assuming that the only asset that can be traded is a
risk-free one-period bond.
Hall assumed an exogenous stochastic process of nonfinancial income and an exogenous and time-invariant gross interest
rate on one-period risk-free debt that equals 𝛽 −1 , where 𝛽 ∈ (0, 1) is also a consumer’s intertemporal discount factor.
This is equivalent to saying that it costs 𝛽 of time 𝑡 consumption to buy one unit of consumption at time 𝑡 + 1 for sure.
So 𝛽 is the price of a one-period risk-free claim to consumption next period.
We preserve Hall’s assumption about the interest rate when we describe an incomplete markets version of our model.
In addition, we extend Hall’s assumption about the risk-free interest rate to an appropriate counterpart when we create
another model in which there are markets in a complete array of one-period Arrow state-contingent securities.
We’ll consider two closely related alternative assumptions about the consumer’s exogenous nonfinancial income process:
• that it is generated by a finite 𝑁 state Markov chain (setting 𝑁 = 2 most of the time in this lecture)
• that it is described by a linear state space model with a continuous state vector in ℝ𝑛 driven by a Gaussian vector
IID shock process
We’ll spend most of this lecture studying the finite-state Markov specification, but will begin by studying the linear state
space specification because it is so closely linked to earlier lectures.
Let’s start with some imports:


import numpy as np
import quantecon as qe
import matplotlib.pyplot as plt
import scipy.linalg as la

6.1.1 Relationship to Other Lectures

This lecture can be viewed as a followup to Optimal Savings II: LQ Techniques


This lecture is also a prolegomenon to a lecture on tax-smoothing, Tax Smoothing with Complete and Incomplete Markets

6.2 Background

Outcomes in consumption-smoothing models emerge from two sources:


• a consumer who wants to maximize an intertemporal objective function that expresses its preference for paths of
consumption that are smooth in the sense of varying as little as possible both across time and across realized Markov
states
• opportunities that allow the consumer to transform an erratic nonfinancial income process into a smoother con-
sumption process by buying and selling one or more financial securities
In the complete markets version, each period the consumer can buy or sell a complete set of one-period ahead state-
contingent securities whose payoffs depend on next period’s realization of the Markov state.
• In the two-state Markov chain case, two such securities are traded each period.
• In an 𝑁 state Markov state version, 𝑁 such securities are traded each period.
• In a continuous state Markov state version, a continuum of such securities is traded each period.
These state-contingent securities are commonly called Arrow securities, after Kenneth Arrow.
In the incomplete markets version, the consumer can buy and sell only one security each period, a risk-free one-period
bond with gross one-period return 𝛽 −1 .

6.3 Linear State Space Version of Complete Markets Model

We’ll study a complete markets model adapted to a setting with a continuous Markov state like that in the first lecture on
the permanent income model.
In that model
• a consumer can trade only a single risk-free one-period bond bearing gross one-period risk-free interest rate equal
to 𝛽 −1 .
• a consumer’s exogenous nonfinancial income is governed by a linear state space model driven by Gaussian shocks,
the kind of model studied in an earlier lecture about linear state space models.
Let’s write down a complete markets counterpart of that model.
Suppose that nonfinancial income is governed by the state space system

𝑥𝑡+1 = 𝐴𝑥𝑡 + 𝐶𝑤𝑡+1


𝑦𝑡 = 𝑆𝑦 𝑥𝑡


where 𝑥𝑡 is an 𝑛 × 1 vector and 𝑤𝑡+1 ∼ 𝑁 (0, 𝐼) is IID over time.


We want a natural counterpart of the Hall assumption that the one-period risk-free gross interest rate is 𝛽 −1 .
We make the good guess that prices of one-period ahead Arrow securities are described by the pricing kernel

𝑞𝑡+1 (𝑥𝑡+1 | 𝑥𝑡 ) = 𝛽𝜙(𝑥𝑡+1 | 𝐴𝑥𝑡 , 𝐶𝐶 ′ ) (6.1)

where 𝜙(⋅ | 𝜇, Σ) is a multivariate Gaussian distribution with mean vector 𝜇 and covariance matrix Σ.
With the pricing kernel q_{t+1}(x_{t+1} | x_t) in hand, we can price claims to time t + 1 consumption that pay off when
x_{t+1} ∈ S at time t + 1:

$$
\int_S q_{t+1}(x_{t+1} \mid x_t) \, dx_{t+1}
$$

where S is a subset of ℝ^n.
The price ∫𝑆 𝑞𝑡+1 (𝑥𝑡+1 | 𝑥𝑡 )𝑑𝑥𝑡+1 of such a claim depends on state 𝑥𝑡 because the prices of the 𝑥𝑡+1 -contingent securities
depend on 𝑥𝑡 through the pricing kernel 𝑞(𝑥𝑡+1 | 𝑥𝑡 ).
Let 𝑏(𝑥𝑡+1 ) be a vector of state-contingent debt due at 𝑡 + 1 as a function of the 𝑡 + 1 state 𝑥𝑡+1 .
Using the pricing kernel assumed in (6.1), the value at 𝑡 of 𝑏(𝑥𝑡+1 ) is evidently

𝛽 ∫ 𝑏(𝑥𝑡+1 )𝜙(𝑥𝑡+1 | 𝐴𝑥𝑡 , 𝐶𝐶 ′ )𝑑𝑥𝑡+1 = 𝛽𝔼𝑡 𝑏𝑡+1

In our complete markets setting, the consumer faces a sequence of budget constraints

𝑐𝑡 + 𝑏𝑡 = 𝑦𝑡 + 𝛽𝔼𝑡 𝑏𝑡+1 , 𝑡≥0

Please note that

𝛽𝐸𝑡 𝑏𝑡+1 = 𝛽 ∫ 𝜙𝑡+1 (𝑥𝑡+1 |𝐴𝑥𝑡 , 𝐶𝐶 ′ )𝑏𝑡+1 (𝑥𝑡+1 )𝑑𝑥𝑡+1

or

𝛽𝐸𝑡 𝑏𝑡+1 = ∫ 𝑞𝑡+1 (𝑥𝑡+1 |𝑥𝑡 )𝑏𝑡+1 (𝑥𝑡+1 )𝑑𝑥𝑡+1

which verifies that 𝛽𝐸𝑡 𝑏𝑡+1 is the value of time 𝑡 + 1 state-contingent claims on time 𝑡 + 1 consumption issued by the
consumer at time 𝑡
We can solve the time 𝑡 budget constraint forward to obtain

$$
b_t = \mathbb{E}_t \sum_{j=0}^{\infty} \beta^j (y_{t+j} - c_{t+j})
$$

The consumer cares about the expected value of



$$
\sum_{t=0}^{\infty} \beta^t u(c_t), \quad 0 < \beta < 1
$$

In the incomplete markets version of the model, we assumed that 𝑢(𝑐𝑡 ) = −(𝑐𝑡 − 𝛾)2 , so that the above utility functional
became

$$
- \sum_{t=0}^{\infty} \beta^t (c_t - \gamma)^2, \quad 0 < \beta < 1
$$


But in the complete markets version, it is tractable to assume a more general utility function that satisfies 𝑢′ > 0 and
𝑢″ < 0.
First-order conditions for the consumer’s problem with complete markets and our assumption about Arrow securities
prices are

𝑢′ (𝑐𝑡+1 ) = 𝑢′ (𝑐𝑡 ) for all 𝑡 ≥ 0

which implies 𝑐𝑡 = 𝑐 ̄ for some 𝑐.̄


So it follows that

$$
b_t = \mathbb{E}_t \sum_{j=0}^{\infty} \beta^j (y_{t+j} - \bar{c})
$$

or
$$
b_t = S_y (I - \beta A)^{-1} x_t - \frac{1}{1-\beta} \bar{c} \tag{6.2}
$$

where c̄ satisfies

$$
\bar{b}_0 = S_y (I - \beta A)^{-1} x_0 - \frac{1}{1-\beta} \bar{c} \tag{6.3}
$$

where 𝑏̄0 is an initial level of the consumer’s debt due at time 𝑡 = 0, specified as a parameter of the problem.
Thus, in the complete markets version of the consumption-smoothing model, 𝑐𝑡 = 𝑐,̄ ∀𝑡 ≥ 0 is determined by (6.3) and
the consumer’s debt is the fixed function of the state 𝑥𝑡 described by (6.2).
Please recall that in the LQ permanent income model studied in permanent income model, the state is 𝑥𝑡 , 𝑏𝑡 , where 𝑏𝑡 is
a complicated function of past state vectors 𝑥𝑡−𝑗 .
Notice that in contrast to that incomplete markets model, at time 𝑡 the state vector is 𝑥𝑡 alone in our complete markets
model.
Here’s an example that shows how in this setting the availability of insurance against fluctuating nonfinancial income
allows the consumer completely to smooth consumption across time and across states of the world

def complete_ss(β, b0, x0, A, C, S_y, T=12):


"""
Computes the path of consumption and debt for the previously described
complete markets model where exogenous income follows a linear
state space
"""
# Create a linear state space for simulation purposes
# This adds "b" as a state to the linear state space system
# so that setting the seed places shocks in same place for
# both the complete and incomplete markets economy
# Atilde = np.vstack([np.hstack([A, np.zeros((A.shape[0], 1))]),
# np.zeros((1, A.shape[1] + 1))])
# Ctilde = np.vstack([C, np.zeros((1, 1))])
# S_ytilde = np.hstack([S_y, np.zeros((1, 1))])

lss = qe.LinearStateSpace(A, C, S_y, mu_0=x0)

# Add extra state to initial condition


# x0 = np.hstack([x0, np.zeros(1)])

# Compute the (I - β * A)^{-1}




rm = la.inv(np.eye(A.shape[0]) - β * A)

# Constant level of consumption


cbar = (1 - β) * (S_y @ rm @ x0 - b0)
c_hist = np.full(T, cbar)

# Debt
x_hist, y_hist = lss.simulate(T)
b_hist = np.squeeze(S_y @ rm @ x_hist - cbar / (1 - β))

return c_hist, b_hist, np.squeeze(y_hist), x_hist

# Define parameters
N_simul = 80
α, ρ1, ρ2 = 10.0, 0.9, 0.0
σ = 1.0

A = np.array([[1., 0., 0.],


[α, ρ1, ρ2],
[0., 1., 0.]])
C = np.array([[0.], [σ], [0.]])
S_y = np.array([[1, 1.0, 0.]])
β, b0 = 0.95, -10.0
x0 = np.array([1.0, α / (1 - ρ1), α / (1 - ρ1)])

# Do simulation for complete markets


s = np.random.randint(0, 10000)
np.random.seed(s) # Seeds get set the same for both economies
out = complete_ss(β, b0, x0, A, C, S_y, 80)
c_hist_com, b_hist_com, y_hist_com, x_hist_com = out

fig, ax = plt.subplots(1, 2, figsize=(14, 4))

# Consumption plots
ax[0].set_title('Consumption and income')
ax[0].plot(np.arange(N_simul), c_hist_com, label='consumption')
ax[0].plot(np.arange(N_simul), y_hist_com, label='income', alpha=.6, linestyle='--')
ax[0].legend()
ax[0].set_xlabel('Periods')
ax[0].set_ylim([80, 120])

# Debt plots
ax[1].set_title('Debt and income')
ax[1].plot(np.arange(N_simul), b_hist_com, label='debt')
ax[1].plot(np.arange(N_simul), y_hist_com, label='Income', alpha=.6, linestyle='--')
ax[1].legend()
ax[1].axhline(0, color='k')
ax[1].set_xlabel('Periods')

plt.show()


6.3.1 Interpretation of Graph

In the above graph, please note that:


• nonfinancial income fluctuates in a stationary manner.
• consumption is completely constant.
• the consumer’s debt fluctuates in a stationary manner; in fact, in this case, because nonfinancial income is a first-
order autoregressive process, the consumer’s debt is an exact affine function (meaning linear plus a constant) of the
consumer’s nonfinancial income.

6.3.2 Incomplete Markets Version

The incomplete markets version of the model with nonfinancial income being governed by a linear state space system is
described in permanent income model.
In that incomplete markets setting, consumption follows a random walk and the consumer's debt follows a process with
a unit root.

6.3.3 Finite State Markov Income Process

We now turn to a finite-state Markov version of the model in which the consumer’s nonfinancial income is an exact
function of a Markov state that takes one of 𝑁 values.
We’ll start with a setting in which in each version of our consumption-smoothing model, nonfinancial income is governed
by a two-state Markov chain (it’s easy to generalize this to an 𝑁 state Markov chain).
In particular, the state 𝑠𝑡 ∈ {1, 2} follows a Markov chain with transition probability matrix

𝑃𝑖𝑗 = ℙ{𝑠𝑡+1 = 𝑗 | 𝑠𝑡 = 𝑖}

where ℙ means conditional probability


Nonfinancial income {𝑦𝑡 } obeys

$$
y_t = \begin{cases} \bar{y}_1 & \text{if } s_t = 1 \\ \bar{y}_2 & \text{if } s_t = 2 \end{cases}
$$


A consumer wishes to maximize



$$
\mathbb{E} \left[ \sum_{t=0}^{\infty} \beta^t u(c_t) \right] \quad \text{where } u(c_t) = -(c_t - \gamma)^2 \text{ and } 0 < \beta < 1 \tag{6.4}
$$

Here 𝛾 > 0 is a bliss level of consumption

6.3.4 Market Structure

Our complete and incomplete markets models differ in how thoroughly the market structure allows a consumer to transfer
resources across time and Markov states, there being more transfer opportunities in the complete markets setting than in
the incomplete markets setting.
Watch how these differences in opportunities affect
• how smooth consumption is across time and Markov states
• how the consumer chooses to make his levels of indebtedness behave over time and across Markov states

6.4 Model 1 (Complete Markets)

At each date 𝑡 ≥ 0, the consumer trades a full array of one-period ahead Arrow securities.
We assume that prices of these securities are exogenous to the consumer.
Exogenous means that they are unaffected by the consumer’s decisions.
In Markov state 𝑠𝑡 at time 𝑡, one unit of consumption in state 𝑠𝑡+1 at time 𝑡 + 1 costs 𝑞(𝑠𝑡+1 | 𝑠𝑡 ) units of the time 𝑡
consumption good.
The prices 𝑞(𝑠𝑡+1 | 𝑠𝑡 ) are given and can be organized into a matrix 𝑄 with 𝑄𝑖𝑗 = 𝑞(𝑗|𝑖)
At time 𝑡 = 0, the consumer starts with an inherited level of debt due at time 0 of 𝑏0 units of time 0 consumption goods.
The consumer’s budget constraint at 𝑡 ≥ 0 in Markov state 𝑠𝑡 is

$$
c_t + b_t \le y(s_t) + \sum_j q(j \mid s_t) \, b_{t+1}(j \mid s_t) \tag{6.5}
$$

where 𝑏𝑡 is the consumer’s one-period debt that falls due at time 𝑡 and 𝑏𝑡+1 (𝑗 | 𝑠𝑡 ) are the consumer’s time 𝑡 sales of the
time 𝑡 + 1 consumption good in Markov state 𝑗.
Thus
• 𝑞(𝑗 | 𝑠𝑡 )𝑏𝑡+1 (𝑗 | 𝑠𝑡 ) is a source of time 𝑡 financial income for the consumer in Markov state 𝑠𝑡
• 𝑏𝑡 ≡ 𝑏𝑡 (𝑗 | 𝑠𝑡−1 ) is a source of time 𝑡 expenditures for the consumer when 𝑠𝑡 = 𝑗
Remark: We are ignoring an important technicality here, namely, that the consumer's choice of b_{t+1}(j | s_t) must respect
so-called natural debt limits that assure that it is feasible for the consumer to repay debts due even if he consumes zero
forevermore. We shall discuss such debt limits in another lecture.
A natural analog of Hall’s assumption that the one-period risk-free gross interest rate is 𝛽 −1 is

𝑞(𝑗 | 𝑖) = 𝛽𝑃𝑖𝑗 (6.6)

To understand how this is a natural analogue, observe that in state i it costs ∑_j q(j | i) to purchase one unit of consumption
next period for sure, i.e., no matter what Markov state j occurs at t + 1.


Hence the implied price of a risk-free claim on one unit of consumption next period is

$$
\sum_j q(j \mid i) = \sum_j \beta P_{ij} = \beta
$$

This confirms the sense in which (6.6) is a natural counterpart to Hall’s assumption that the risk-free one-period gross
interest rate is 𝑅 = 𝛽 −1 .
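A small numerical illustration of this point, using the same β and P that serve as defaults in the code later in this lecture:

# With Q = β P, a sure unit of next-period consumption costs β in every Markov state
# (β and P match the default values used in ConsumptionProblem below)
import numpy as np

β = 0.96
P = np.array([[0.8, 0.2],
              [0.4, 0.6]])
Q = β * P                 # Arrow security prices q(j | i)
print(Q.sum(axis=1))      # each row sums to β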
It is timely to recall that the gross one-period risk-free interest rate is the reciprocal of the price at time t of a
risk-free claim on one unit of consumption tomorrow.
First-order necessary conditions for maximizing the consumer’s expected utility subject to the sequence of budget con-
straints (6.5) are
$$
\beta \frac{u'(c_{t+1})}{u'(c_t)} \mathbb{P}\{s_{t+1} \mid s_t\} = q(s_{t+1} \mid s_t)
$$
for all 𝑠𝑡 , 𝑠𝑡+1 or, under our assumption (6.6) about Arrow security prices,

𝑐𝑡+1 = 𝑐𝑡 (6.7)

Thus, our consumer sets 𝑐𝑡 = 𝑐 ̄ for all 𝑡 ≥ 0 for some value 𝑐 ̄ that it is our job now to determine along with values for
𝑏𝑡+1 (𝑗|𝑠𝑡 = 𝑖) for 𝑖 = 1, 2 and 𝑗 = 1, 2.
We’ll use a guess and verify method to determine these objects
Guess: We’ll make the plausible guess that

𝑏𝑡+1 (𝑠𝑡+1 = 𝑗 | 𝑠𝑡 = 𝑖) = 𝑏(𝑗), 𝑖 = 1, 2; 𝑗 = 1, 2 (6.8)

so that the amount borrowed today depends only on tomorrow's Markov state. (Why is this a plausible guess?)
To determine 𝑐,̄ we shall deduce implications of the consumer’s budget constraints in each Markov state today and our
guess (6.8) about the consumer’s debt level choices.
For 𝑡 ≥ 1, these imply

$$
\begin{aligned}
\bar{c} + b(1) &= y(1) + q(1 \mid 1) b(1) + q(2 \mid 1) b(2) \\
\bar{c} + b(2) &= y(2) + q(1 \mid 2) b(1) + q(2 \mid 2) b(2)
\end{aligned}
\tag{6.9}
$$

or

$$
\begin{bmatrix} b(1) \\ b(2) \end{bmatrix}
+ \begin{bmatrix} \bar{c} \\ \bar{c} \end{bmatrix}
= \begin{bmatrix} y(1) \\ y(2) \end{bmatrix}
+ \beta \begin{bmatrix} P_{11} & P_{12} \\ P_{21} & P_{22} \end{bmatrix}
\begin{bmatrix} b(1) \\ b(2) \end{bmatrix}
$$

These are 2 equations in the 3 unknowns 𝑐,̄ 𝑏(1), 𝑏(2)


To get a third equation, we assume that at time 𝑡 = 0, 𝑏0 is debt due; and we assume that at time 𝑡 = 0, the Markov state
𝑠0 = 1
(We could instead have assumed that at time 𝑡 = 0 the Markov state 𝑠0 = 2, which would affect our answer as we shall
see)
Since we have assumed that 𝑠0 = 1, the budget constraint at time 𝑡 = 0 is

𝑐 ̄ + 𝑏0 = 𝑦(1) + 𝑞(1 | 1)𝑏(1) + 𝑞(2 | 1)𝑏(2) (6.10)

where 𝑏0 is the (exogenous) debt the consumer is assumed to bring into period 0
If we substitute (6.10) into the first equation of (6.9) and rearrange, we discover that

𝑏(1) = 𝑏0 (6.11)


We can then use the second equation of (6.9) to deduce the restriction

𝑦(1) − 𝑦(2) + [𝑞(1 | 1) − 𝑞(1 | 2) − 1]𝑏0 + [𝑞(2 | 1) + 1 − 𝑞(2 | 2)]𝑏(2) = 0, (6.12)

an equation that we can solve for the unknown 𝑏(2).


Knowing 𝑏(1) and 𝑏(2), we can solve equation (6.10) for the constant level of consumption 𝑐.̄

6.4.1 Key Outcomes

The preceding calculations indicate that in the complete markets version of our model, we obtain the following striking
results:
• The consumer chooses to make consumption perfectly constant across time and across Markov states.
• State-contingent debt purchases 𝑏𝑡+1 (𝑠𝑡+1 = 𝑗|𝑠𝑡 = 𝑖) depend only on 𝑗
• If the initial Markov state is 𝑠0 = 𝑗 and initial consumer debt is 𝑏0 , then debt in Markov state 𝑗 satisfies 𝑏(𝑗) = 𝑏0
To summarize what we have achieved up to now, we have computed the constant level of consumption 𝑐 ̄ and indicated
how that level depends on the underlying specifications of preferences, Arrow securities prices, the stochastic process of
exogenous nonfinancial income, and the initial debt level 𝑏0
• The consumer’s debt neither accumulates, nor decumulates, nor drifts – instead, the debt level each period is an
exact function of the Markov state, so in the two-state Markov case, it switches between two values.
• We have verified guess (6.8).
• When the state 𝑠𝑡 returns to the initial state 𝑠0 , debt returns to the initial debt level.
• Debt levels in all other states depend on virtually all remaining parameters of the model.

6.4.2 Code

Here’s some code that, among other things, contains a function called consumption_complete().
This function computes {b(i)}_{i=1}^{N} and c̄ as outcomes, given a set of parameters, for the general case with N Markov states
under the assumption of complete markets.

class ConsumptionProblem:
"""
The data for a consumption problem, including some default values.
"""

def __init__(self,
β=.96,
y=[2, 1.5],
b0=3,
P=[[.8, .2],
[.4, .6]],
init=0):
"""
Parameters
----------

β : discount factor
y : list containing the two income levels
b0 : debt in period 0 (= initial state debt level)
P : 2x2 transition matrix
init : index of initial state s0
"""
self.β = β
self.y = np.asarray(y)
self.b0 = b0
self.P = np.asarray(P)
self.init = init

def simulate(self, N_simul=80, random_state=1):


"""
Parameters
----------

N_simul : number of periods for simulation


random_state : random state for simulating Markov chain
"""
# For the simulation define a quantecon MC class
mc = qe.MarkovChain(self.P)
s_path = mc.simulate(N_simul, init=self.init, random_state=random_state)

return s_path

def consumption_complete(cp):
"""
Computes endogenous values for the complete market case.

Parameters
----------

cp : instance of ConsumptionProblem

Returns
-------

c_bar : constant consumption


b : optimal debt in each state

associated with the price system

Q = β * P
"""
β, P, y, b0, init = cp.β, cp.P, cp.y, cp.b0, cp.init # Unpack

Q = β * P # assumed price system

# construct matrices of augmented equation system


n = P.shape[0] + 1

y_aug = np.empty((n, 1))


y_aug[0, 0] = y[init] - b0
y_aug[1:, 0] = y

Q_aug = np.zeros((n, n))


Q_aug[0, 1:] = Q[init, :]

Q_aug[1:, 1:] = Q

A = np.zeros((n, n))
A[:, 0] = 1
A[1:, 1:] = np.eye(n-1)

x = np.linalg.inv(A - Q_aug) @ y_aug

c_bar = x[0, 0]
b = x[1:, 0]

return c_bar, b

def consumption_incomplete(cp, s_path):


"""
Computes endogenous values for the incomplete market case.

Parameters
----------

cp : instance of ConsumptionProblem
s_path : the path of states
"""
β, P, y, b0 = cp.β, cp.P, cp.y, cp.b0 # Unpack

N_simul = len(s_path)

# Useful variables
n = len(y)
y.shape = (n, 1)
v = np.linalg.inv(np.eye(n) - β * P) @ y

# Store consumption and debt path


b_path, c_path = np.ones(N_simul+1), np.ones(N_simul)
b_path[0] = b0

# Optimal decisions from (12) and (13)


db = ((1 - β) * v - y) / β

for i, s in enumerate(s_path):
c_path[i] = (1 - β) * (v - np.full((n, 1), b_path[i]))[s, 0]
b_path[i + 1] = b_path[i] + db[s, 0]

return c_path, b_path[:-1], y[s_path]

Let’s test by checking that 𝑐 ̄ and 𝑏2 satisfy the budget constraint

cp = ConsumptionProblem()
c_bar, b = consumption_complete(cp)
np.isclose(c_bar + b[1] - cp.y[1] - (cp.β * cp.P)[1, :] @ b, 0)

True

Below, we’ll take the outcomes produced by this code – in particular the implied consumption and debt paths – and
compare them with outcomes from an incomplete markets model in the spirit of Hall [Hall, 1978]


6.5 Model 2 (One-Period Risk-Free Debt Only)

This is a version of the original model of Hall (1978) in which the consumer's ability to substitute intertemporally is constrained by his ability to buy or sell only one security, a risk-free one-period bond bearing a constant gross interest rate that equals $\beta^{-1}$.
Given an initial debt 𝑏0 at time 0, the consumer faces a sequence of budget constraints

$$c_t + b_t = y_t + \beta b_{t+1}, \qquad t \geq 0$$

where $\beta$ is the time-$t$ price of a risk-free claim on one unit of consumption at time $t+1$.
First-order conditions for the consumer’s problem are

$$\sum_j u'(c_{t+1,j})\, P_{ij} = u'(c_{t,i})$$

For our assumed quadratic utility function this implies

$$\sum_j c_{t+1,j}\, P_{ij} = c_{t,i} \tag{6.13}$$

which for our finite-state Markov setting is Hall’s (1978) conclusion that consumption follows a random walk.
As we saw in our first lecture on the permanent income model, this leads to

$$b_t = \mathbb{E}_t \sum_{j=0}^\infty \beta^j y_{t+j} - (1-\beta)^{-1} c_t \tag{6.14}$$

and

$$c_t = (1-\beta) \left[ \mathbb{E}_t \sum_{j=0}^\infty \beta^j y_{t+j} - b_t \right] \tag{6.15}$$

Equation (6.15) expresses $c_t$ as a net interest rate factor $1 - \beta$ times the sum of the expected present value of nonfinancial income $\mathbb{E}_t \sum_{j=0}^\infty \beta^j y_{t+j}$ and financial wealth $-b_t$.
Substituting (6.15) into the one-period budget constraint and rearranging leads to

$$b_{t+1} - b_t = \beta^{-1} \left[ (1-\beta)\, \mathbb{E}_t \sum_{j=0}^\infty \beta^j y_{t+j} - y_t \right] \tag{6.16}$$
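For completeness, here is the substitution written out step by step; it is only a rearrangement of the one-period budget constraint and (6.15):

$$\begin{aligned}
\beta b_{t+1} &= c_t + b_t - y_t \\
&= (1-\beta)\left[ \mathbb{E}_t \sum_{j=0}^\infty \beta^j y_{t+j} - b_t \right] + b_t - y_t \\
&= (1-\beta)\, \mathbb{E}_t \sum_{j=0}^\infty \beta^j y_{t+j} + \beta b_t - y_t ,
\end{aligned}$$

so that dividing by $\beta$ and subtracting $b_t$ delivers (6.16).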


Now let’s calculate the key term 𝔼𝑡 ∑𝑗=0 𝛽 𝑗 𝑦𝑡+𝑗 in our finite Markov chain setting.
Define the expected discounted present value of non-financial income

𝑣𝑡 ∶= 𝔼𝑡 ∑ 𝛽 𝑗 𝑦𝑡+𝑗
𝑗=0

which in the spirit of dynamic programming we can write as a Bellman equation

𝑣𝑡 ∶= 𝑦𝑡 + 𝛽𝔼𝑡 𝑣𝑡+1

In our two-state Markov chain setting, 𝑣𝑡 = 𝑣(1) when 𝑠𝑡 = 1 and 𝑣𝑡 = 𝑣(2) when 𝑠𝑡 = 2.
Therefore, we can write our Bellman equation as

𝑣(1) = 𝑦(1) + 𝛽𝑃11 𝑣(1) + 𝛽𝑃12 𝑣(2)


𝑣(2) = 𝑦(2) + 𝛽𝑃21 𝑣(1) + 𝛽𝑃22 𝑣(2)


or

$$\vec v = \vec y + \beta P \vec v$$

where $\vec v = \begin{bmatrix} v(1) \\ v(2) \end{bmatrix}$ and $\vec y = \begin{bmatrix} y(1) \\ y(2) \end{bmatrix}$.

We can also write the last expression as

$$\vec v = (I - \beta P)^{-1} \vec y$$

In our finite Markov chain setting, from expression (6.15), consumption at date 𝑡 when debt is 𝑏𝑡 and the Markov state
today is 𝑠𝑡 = 𝑖 is evidently

$$c(b_t, i) = (1-\beta)\left( \left[ (I - \beta P)^{-1} \vec y\, \right]_i - b_t \right) \tag{6.17}$$

and the increment to debt is

$$b_{t+1} - b_t = \beta^{-1} \left[ (1-\beta)\, v(i) - y(i) \right] \tag{6.18}$$
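Before turning to outcomes, here is a minimal numerical sketch, using the default parameters of the `ConsumptionProblem` class above, that computes $\vec v$, the debt increments in (6.18), and checks that the implied consumption rule (6.17) satisfies the martingale property (6.13).

import numpy as np

# Default parameters of ConsumptionProblem (an illustrative check, not new machinery)
β = 0.96
y = np.array([2.0, 1.5])
P = np.array([[0.8, 0.2],
              [0.4, 0.6]])

v = np.linalg.solve(np.eye(2) - β * P, y)     # v⃗ = (I - βP)^{-1} y⃗
db = ((1 - β) * v - y) / β                    # debt increments, equation (6.18)

b = 3.0                                       # an arbitrary current debt level
for i in range(2):
    c_today = (1 - β) * (v[i] - b)            # equation (6.17)
    c_next = (1 - β) * (v - (b + db[i]))      # (6.17) evaluated at b_{t+1}, each state
    print(np.isclose(P[i, :] @ c_next, c_today))   # equation (6.13) holds: True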

6.5.1 Summary of Outcomes

In contrast to outcomes in the complete markets model, in the incomplete markets model
• consumption drifts over time as a random walk; the level of consumption at time 𝑡 depends on the level of debt that
the consumer brings into the period as well as the expected discounted present value of nonfinancial income at 𝑡.
• the consumer’s debt drifts upward over time in response to low realizations of nonfinancial income and drifts
downward over time in response to high realizations of nonfinancial income.
• the drift over time in the consumer’s debt and the dependence of current consumption on today’s debt level account
for the drift over time in consumption.

6.5.2 The Incomplete Markets Model

The code above also contains a function called consumption_incomplete() that uses (6.17) and (6.18) to
• simulate paths of 𝑦𝑡 , 𝑐𝑡 , 𝑏𝑡+1
• plot these against values of $\bar c$, $b(s_1)$, $b(s_2)$ found in a corresponding complete markets economy
Let’s try this, using the same parameters in both complete and incomplete markets economies

cp = ConsumptionProblem()
s_path = cp.simulate()
N_simul = len(s_path)

c_bar, debt_complete = consumption_complete(cp)

c_path, debt_path, y_path = consumption_incomplete(cp, s_path)

fig, ax = plt.subplots(1, 2, figsize=(14, 4))

ax[0].set_title('Consumption paths')
ax[0].plot(np.arange(N_simul), c_path, label='incomplete market')
ax[0].plot(np.arange(N_simul), np.full(N_simul, c_bar),
label='complete market')
ax[0].plot(np.arange(N_simul), y_path, label='income', alpha=.6, ls='--')
ax[0].legend()
ax[0].set_xlabel('Periods')

ax[1].set_title('Debt paths')
ax[1].plot(np.arange(N_simul), debt_path, label='incomplete market')
ax[1].plot(np.arange(N_simul), debt_complete[s_path],
label='complete market')
ax[1].plot(np.arange(N_simul), y_path, label='income', alpha=.6, ls='--')
ax[1].legend()
ax[1].axhline(0, color='k', ls='--')
ax[1].set_xlabel('Periods')

plt.show()

In the graph on the left, for the same sample path of nonfinancial income 𝑦𝑡 , notice that
• consumption is constant when there are complete markets, but takes a random walk in the incomplete markets
version of the model.
• the consumer’s debt oscillates between two values that are functions of the Markov state in the complete markets
model, while the consumer’s debt drifts in a “unit root” fashion in the incomplete markets economy.

6.5.3 A sequel

In tax smoothing with complete and incomplete markets, we reinterpret the mathematics and Python code presented in this
lecture in order to construct tax-smoothing models in the incomplete markets tradition of Barro [Barro, 1979] as well as
in the complete markets tradition of Lucas and Stokey [Lucas and Stokey, 1983].

CHAPTER

SEVEN

TAX SMOOTHING WITH COMPLETE AND INCOMPLETE MARKETS

In addition to what’s in Anaconda, this lecture uses the library:

!pip install --upgrade quantecon

7.1 Overview

This lecture describes tax-smoothing models that are counterparts to consumption-smoothing models in Consumption
Smoothing with Complete and Incomplete Markets.
• one is in the complete markets tradition of Lucas and Stokey [Lucas and Stokey, 1983].
• the other is in the incomplete markets tradition of Barro [Barro, 1979].
Complete markets allow a government to buy or sell claims contingent on all possible Markov states.
Incomplete markets allow a government to buy or sell only a limited set of securities, often only a single risk-free security.
Barro [Barro, 1979] worked in an incomplete markets tradition by assuming that the only asset that can be traded is a
risk-free one period bond.
In his consumption-smoothing model, Hall [Hall, 1978] had assumed an exogenous stochastic process of nonfinancial
income and an exogenous gross interest rate on one period risk-free debt that equals 𝛽 −1 , where 𝛽 ∈ (0, 1) is also a
consumer’s intertemporal discount factor.
Barro [Barro, 1979] made an analogous assumption about the risk-free interest rate in a tax-smoothing model that turns
out to have the same mathematical structure as Hall’s consumption-smoothing model.
To get Barro’s model from Hall’s, all we have to do is to rename variables.
We maintain Hall’s and Barro’s assumption about the interest rate when we describe an incomplete markets version of
our model.
In addition, we extend their assumption about the interest rate to an appropriate counterpart to create a “complete markets”
model in the style of Lucas and Stokey [Lucas and Stokey, 1983].


7.1.1 Isomorphism between Consumption and Tax Smoothing

For each version of a consumption-smoothing model, a tax-smoothing counterpart can be obtained simply by relabeling
• consumption as tax collections
• a consumer’s one-period utility function as a government’s one-period loss function from collecting taxes that im-
pose deadweight welfare losses
• a consumer’s nonfinancial income as a government’s purchases
• a consumer’s debt as a government’s assets
Thus, we can convert the consumption-smoothing models in lecture Consumption Smoothing with Complete and Incomplete Markets into tax-smoothing models by setting $c_t = T_t$, $y_t = G_t$, and $-b_t = a_t$, where $T_t$ is total tax collections, $\{G_t\}$ is an exogenous government expenditures process, and $a_t$ is the government's holdings of one-period risk-free bonds coming due at the beginning of time $t$.
For elaborations on this theme, please see Optimal Savings II: LQ Techniques and later parts of this lecture.
We’ll spend most of this lecture studying acquire finite-state Markov specification, but will also treat the linear state space
specification.

Link to History

For those who love history, President Thomas Jefferson’s Secretary of Treasury Albert Gallatin (1807) [Gallatin, 1837]
seems to have prescribed policies that come from Barro’s model [Barro, 1979]
Let’s start with some standard imports:

import numpy as np
import quantecon as qe
import matplotlib.pyplot as plt

To exploit the isomorphism between consumption-smoothing and tax-smoothing models, we simply use code from Con-
sumption Smoothing with Complete and Incomplete Markets

7.1.2 Code

Among other things, this code contains a function called consumption_complete().


This function computes $\{b(i)\}_{i=1}^N$ and $\bar c$ as outcomes given a set of parameters for the general case with $N$ Markov states under the assumption of complete markets.

class ConsumptionProblem:
"""
The data for a consumption problem, including some default values.
"""

def __init__(self,
β=.96,
y=[2, 1.5],
b0=3,
P=[[.8, .2],
[.4, .6]],
init=0):
"""
Parameters
----------

β : discount factor
y : list containing the two income levels
b0 : debt in period 0 (= initial state debt level)
P : 2x2 transition matrix
init : index of initial state s0
"""
self.β = β
self.y = np.asarray(y)
self.b0 = b0
self.P = np.asarray(P)
self.init = init

def simulate(self, N_simul=80, random_state=1):


"""
Parameters
----------

N_simul : number of periods for simulation


random_state : random state for simulating Markov chain
"""
# For the simulation define a quantecon MC class
mc = qe.MarkovChain(self.P)
s_path = mc.simulate(N_simul, init=self.init, random_state=random_state)

return s_path

def consumption_complete(cp):
"""
Computes endogenous values for the complete market case.

Parameters
----------

cp : instance of ConsumptionProblem

Returns
-------

c_bar : constant consumption


b : optimal debt in each state

associated with the price system

Q = β * P
"""
β, P, y, b0, init = cp.β, cp.P, cp.y, cp.b0, cp.init # Unpack

Q = β * P # assumed price system

# construct matrices of augmented equation system


n = P.shape[0] + 1

y_aug = np.empty((n, 1))
y_aug[0, 0] = y[init] - b0
y_aug[1:, 0] = y

Q_aug = np.zeros((n, n))


Q_aug[0, 1:] = Q[init, :]
Q_aug[1:, 1:] = Q

A = np.zeros((n, n))
A[:, 0] = 1
A[1:, 1:] = np.eye(n-1)

x = np.linalg.inv(A - Q_aug) @ y_aug

c_bar = x[0, 0]
b = x[1:, 0]

return c_bar, b

def consumption_incomplete(cp, s_path):


"""
Computes endogenous values for the incomplete market case.

Parameters
----------

cp : instance of ConsumptionProblem
s_path : the path of states
"""
β, P, y, b0 = cp.β, cp.P, cp.y, cp.b0 # Unpack

N_simul = len(s_path)

# Useful variables
n = len(y)
y.shape = (n, 1)
v = np.linalg.inv(np.eye(n) - β * P) @ y

# Store consumption and debt path


b_path, c_path = np.ones(N_simul+1), np.ones(N_simul)
b_path[0] = b0

# Optimal decisions from (12) and (13)


db = ((1 - β) * v - y) / β

for i, s in enumerate(s_path):
c_path[i] = (1 - β) * (v - np.full((n, 1), b_path[i]))[s, 0]
b_path[i + 1] = b_path[i] + db[s, 0]

return c_path, b_path[:-1], y[s_path]


7.1.3 Revisiting the consumption-smoothing model

The code above also contains a function called consumption_incomplete() that uses (6.17) and (6.18) to
• simulate paths of 𝑦𝑡 , 𝑐𝑡 , 𝑏𝑡+1
• plot these against values of $\bar c$, $b(s_1)$, $b(s_2)$ found in a corresponding complete markets economy
Let’s try this, using the same parameters in both complete and incomplete markets economies

cp = ConsumptionProblem()
s_path = cp.simulate()
N_simul = len(s_path)

c_bar, debt_complete = consumption_complete(cp)

c_path, debt_path, y_path = consumption_incomplete(cp, s_path)

fig, ax = plt.subplots(1, 2, figsize=(14, 4))

ax[0].set_title('Consumption paths')
ax[0].plot(np.arange(N_simul), c_path, label='incomplete market')
ax[0].plot(np.arange(N_simul), np.full(N_simul, c_bar), label='complete market')
ax[0].plot(np.arange(N_simul), y_path, label='income', alpha=.6, ls='--')
ax[0].legend()
ax[0].set_xlabel('Periods')

ax[1].set_title('Debt paths')
ax[1].plot(np.arange(N_simul), debt_path, label='incomplete market')
ax[1].plot(np.arange(N_simul), debt_complete[s_path], label='complete market')
ax[1].plot(np.arange(N_simul), y_path, label='income', alpha=.6, ls='--')
ax[1].legend()
ax[1].axhline(0, color='k', ls='--')
ax[1].set_xlabel('Periods')

plt.show()

In the graph on the left, for the same sample path of nonfinancial income 𝑦𝑡 , notice that
• consumption is constant when there are complete markets.
• consumption takes a random walk in the incomplete markets version of the model.
• the consumer’s debt oscillates between two values that are functions of the Markov state in the complete markets
model.


• the consumer’s debt drifts because it contains a unit root in the incomplete markets economy.

Relabeling variables to create tax-smoothing models

As indicated above, we relabel variables to acquire tax-smoothing interpretations of the complete markets and incomplete
markets consumption-smoothing models.

fig, ax = plt.subplots(1, 2, figsize=(14, 4))

ax[0].set_title('Tax collection paths')


ax[0].plot(np.arange(N_simul), c_path, label='incomplete market')
ax[0].plot(np.arange(N_simul), np.full(N_simul, c_bar), label='complete market')
ax[0].plot(np.arange(N_simul), y_path, label='govt expenditures', alpha=.6, ls='--')
ax[0].legend()
ax[0].set_xlabel('Periods')
ax[0].set_ylim([1.4, 2.1])

ax[1].set_title('Government assets paths')


ax[1].plot(np.arange(N_simul), debt_path, label='incomplete market')
ax[1].plot(np.arange(N_simul), debt_complete[s_path], label='complete market')
ax[1].plot(np.arange(N_simul), y_path, label='govt expenditures', ls='--')
ax[1].legend()
ax[1].axhline(0, color='k', ls='--')
ax[1].set_xlabel('Periods')

plt.show()

7.2 Tax Smoothing with Complete Markets

It is instructive to focus on a simple tax-smoothing example with complete markets.


This example illustrates how, in a complete markets model like that of Lucas and Stokey [Lucas and Stokey, 1983], the
government purchases insurance from the private sector.
Payouts from the insurance that it had purchased allow the government to avoid raising taxes when emergencies make government expenditures surge.
We assume that government expenditures take one of two values 𝐺1 < 𝐺2 , where Markov state 1 means “peace” and
Markov state 2 means “war”.


The government budget constraint in Markov state 𝑖 is

$$T_i + b_i = G_i + \sum_j Q_{ij}\, b_j$$

where

$$Q_{ij} = \beta P_{ij}$$

is the price today of one unit of goods in Markov state 𝑗 tomorrow when the Markov state is 𝑖 today.
𝑏𝑖 is the government’s level of assets when it arrives in Markov state 𝑖.
That is, 𝑏𝑖 equals one-period state-contingent claims owed to the government that fall due at time 𝑡 when the Markov state
is 𝑖.
Thus, if $b_i < 0$, it means that the government is owed $b_i$, or equivalently owes $-b_i$, when the economy arrives in Markov state $i$ at time $t$.
In our examples below, this happens when in a previous war-time period the government has sold an Arrow security paying off $-b_i$ in peacetime Markov state $i$.
It can be enlightening to express the government’s budget constraint in Markov state 𝑖 as

$$T_i = G_i + \left( \sum_j Q_{ij}\, b_j - b_i \right)$$

in which the term $\left( \sum_j Q_{ij} b_j - b_i \right)$ equals the net amount that the government spends to purchase one-period Arrow securities that will pay off next period in Markov states $j = 1, \ldots, N$ after it has received payments $b_i$ this period.

7.3 Returns on State-Contingent Debt


Notice that $\sum_{j'=1}^N Q_{ij'}\, b(j')$ is the amount that the government spends in Markov state $i$ at time $t$ to purchase one-period state-contingent claims that will pay off in Markov state $j'$ at time $t+1$.
Then the ex post one-period gross return on the portfolio of government assets held from state $i$ at time $t$ to state $j$ at time $t+1$ is

$$R(j \mid i) = \frac{b(j)}{\sum_{j'=1}^N Q_{ij'}\, b(j')}$$

The cumulative return earned from putting 1 unit of time 𝑡 goods into the government portfolio of state-contingent
securities at time 𝑡 and then rolling over the proceeds into the government portfolio each period thereafter is

$$R_T(s_{t+T}, s_{t+T-1}, \ldots, s_t) \equiv R(s_{t+1} \mid s_t)\, R(s_{t+2} \mid s_{t+1}) \cdots R(s_{t+T} \mid s_{t+T-1})$$

Here is some code that computes one-period and cumulative returns on the government portfolio in the finite-state Markov
version of our complete markets model.
Convention: In this code, when 𝑃𝑖𝑗 = 0, we arbitrarily set 𝑅(𝑗|𝑖) to be 0.

def ex_post_gross_return(b, cp):


"""
calculate the ex post one-period gross return on the portfolio
of government assets, given b and Q.
"""
Q = cp.β * cp.P
values = Q @ b

n = len(b)
R = np.zeros((n, n))

for i in range(n):
ind = cp.P[i, :] != 0
R[i, ind] = b[ind] / values[i]

return R

def cumulative_return(s_path, R):


"""
compute cumulative return from holding 1 unit market portfolio
of government bonds, given some simulated state path.
"""
T = len(s_path)

RT_path = np.empty(T)
RT_path[0] = 1
RT_path[1:] = np.cumprod([R[s_path[t], s_path[t+1]] for t in range(T-1)])

return RT_path

7.3.1 An Example of Tax Smoothing

We’ll study a tax-smoothing model with two Markov states.


In Markov state 1, there is peace and government expenditures are low.
In Markov state 2, there is war and government expenditures are high.
We’ll compute optimal policies in both complete and incomplete markets settings.
Then we’ll feed in a particular assumed path of Markov states and study outcomes.
• We’ll assume that the initial Markov state is state 1, which means we start from a state of peace.
• The government then experiences 3 time periods of war and then comes back to peace again.
• The history of Markov states is therefore {𝑝𝑒𝑎𝑐𝑒, 𝑤𝑎𝑟, 𝑤𝑎𝑟, 𝑤𝑎𝑟, 𝑝𝑒𝑎𝑐𝑒}.
In addition, as indicated above, to simplify our example, we’ll set the government’s initial asset level to 1, so that 𝑏1 = 1.
Here’s code that itinitializes government assets to be unity in an initial peace time Markov state.

# Parameters
β = .96

# change notation y to g in the tax-smoothing example


g = [1, 2]
b0 = 1
P = np.array([[.8, .2],
[.4, .6]])

cp = ConsumptionProblem(β, g, b0, P)
Q = β * P

# change notation c_bar to T_bar in the tax-smoothing example


T_bar, b = consumption_complete(cp)
R = ex_post_gross_return(b, cp)
s_path = [0, 1, 1, 1, 0]
RT_path = cumulative_return(s_path, R)

print(f"P \n {P}")
print(f"Q \n {Q}")
print(f"Govt expenditures in peace and war = {g}")
print(f"Constant tax collections = {T_bar}")
print(f"Govt debts in two states = {-b}")

msg = """
Now let's check the government's budget constraint in peace and war.
Our assumptions imply that the government always purchases 0 units of the
Arrow peace security.
"""
print(msg)

AS1 = Q[0, :] @ b
# spending on Arrow security
# since the spending on Arrow peace security is not 0 anymore after we change b0 to 1
print(f"Spending on Arrow security in peace = {AS1}")
AS2 = Q[1, :] @ b
print(f"Spending on Arrow security in war = {AS2}")

print("")
# tax collections minus debt levels
print("Government tax collections minus debt levels in peace and war")
TB1 = T_bar + b[0]
print(f"T+b in peace = {TB1}")
TB2 = T_bar + b[1]
print(f"T+b in war = {TB2}")

print("")
print("Total government spending in peace and war")
G1 = g[0] + AS1
G2 = g[1] + AS2
print(f"Peace = {G1}")
print(f"War = {G2}")

print("")
print("Let's see ex-post and ex-ante returns on Arrow securities")

Π = np.reciprocal(Q)
exret = Π
print(f"Ex-post returns to purchase of Arrow securities = \n {exret}")
exant = Π * P
print(f"Ex-ante returns to purchase of Arrow securities \n {exant}")

print("")
print("The Ex-post one-period gross return on the portfolio of government assets")
print(R)

print("")
print("The cumulative return earned from holding 1 unit market portfolio of␣
↪government bonds")

print(RT_path[-1])

P
[[0.8 0.2]
[0.4 0.6]]
Q
[[0.768 0.192]
[0.384 0.576]]
Govt expenditures in peace and war = [1, 2]
Constant tax collections = 1.2716883116883118
Govt debts in two states = [-1. -2.62337662]

Now let's check the government's budget constraint in peace and war.
Our assumptions imply that the government always purchases 0 units of the
Arrow peace security.

Spending on Arrow security in peace = 1.2716883116883118


Spending on Arrow security in war = 1.895064935064935

Government tax collections minus debt levels in peace and war


T+b in peace = 2.2716883116883118
T+b in war = 3.895064935064935

Total government spending in peace and war


Peace = 2.2716883116883118
War = 3.895064935064935

Let's see ex-post and ex-ante returns on Arrow securities


Ex-post returns to purchase of Arrow securities =
[[1.30208333 5.20833333]
[2.60416667 1.73611111]]
Ex-ante returns to purchase of Arrow securities
[[1.04166667 1.04166667]
[1.04166667 1.04166667]]

The Ex-post one-period gross return on the portfolio of government assets


[[0.78635621 2.0629085 ]
[0.5276864 1.38432018]]

The cumulative return earned from holding 1 unit market portfolio of government bonds

2.0860704239993675


7.3.2 Explanation

In this example, the government always purchases 1 unit of the Arrow security that pays off in peacetime (Markov state 1).
And it purchases a higher amount of the security that pays off in war time (Markov state 2).
Thus, this is an example in which
• during peacetime, the government purchases insurance against the possibility that war breaks out next period
• during wartime, the government purchases insurance against the possibility that war continues another period
• so long as peace continues, the ex post return on insurance against war is low
• when war breaks out or continues, the ex post return on insurance against war is high
• given the history of states that we assumed, the value of one unit of the portfolio of government assets eventually
doubles in the end because of high returns during wartime.
We recommend plugging the quantities computed above into the government budget constraints in the two Markov states and staring at them.
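Here is a quick way to do that check in code; it is only a sketch of the verification and reuses `T_bar`, `b`, `Q`, and `g` from the cell above.

# Verify T̄ + b_i = G_i + Σ_j Q_{ij} b_j in both Markov states,
# reusing T_bar, b, Q, g computed in the previous cell
lhs = T_bar + b
rhs = np.array(g) + Q @ b
print(lhs, rhs, np.allclose(lhs, rhs))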

Exercise 7.3.1
Try changing the Markov transition matrix so that

$$P = \begin{bmatrix} 1 & 0 \\ 0.2 & 0.8 \end{bmatrix}$$

Also, start the system in Markov state 2 (war) with initial government assets −10, so that the government starts the war
in debt and 𝑏2 = −10.
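One possible way to set up this experiment with the tools already defined in this lecture is sketched below; working through and interpreting the output is the exercise. The parameter values simply restate the exercise, `y=[1, 2]` reuses the peace and war expenditure levels from the example above, and `b0` is interpreted as initial government assets, as in that example.

# A sketch of the setup for Exercise 7.3.1 (not a full solution)
P_new = np.array([[1.0, 0.0],
                  [0.2, 0.8]])
cp_ex = ConsumptionProblem(β=0.96, y=[1, 2], b0=-10, P=P_new, init=1)
T_bar_ex, b_ex = consumption_complete(cp_ex)
R_ex = ex_post_gross_return(b_ex, cp_ex)
print(f"Constant tax collections = {T_bar_ex}")
print(f"Govt debts in two states = {-b_ex}")
print(f"Ex-post portfolio returns = \n {R_ex}")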

7.4 More Finite Markov Chain Tax-Smoothing Examples

To interpret some episodes in the fiscal history of the United States, we find it interesting to study a few more examples.
We compute examples in an 𝑁 state Markov setting under both complete and incomplete markets.
These examples differ in how the Markov state jumps between peace and war.
To wrap procedures for solving models, relabeling graphs so that we record government debt rather than government
assets, and displaying results, we construct a Python class.

class TaxSmoothingExample:
"""
construct a tax-smoothing example, by relabeling consumption problem class.
"""
def __init__(self, g, P, b0, states, β=.96,
init=0, s_path=None, N_simul=80, random_state=1):

self.states = states # state names

# if the path of states is not specified


if s_path is None:
self.cp = ConsumptionProblem(β, g, b0, P, init=init)
self.s_path = self.cp.simulate(N_simul=N_simul, random_state=random_state)
# if the path of states is specified
else:
self.cp = ConsumptionProblem(β, g, b0, P, init=s_path[0])
self.s_path = s_path

# solve for complete market case


self.T_bar, self.b = consumption_complete(self.cp)
self.debt_value = - (β * P @ self.b).T

# solve for incomplete market case


self.T_path, self.asset_path, self.g_path = \
consumption_incomplete(self.cp, self.s_path)

# calculate returns on state-contingent debt


self.R = ex_post_gross_return(self.b, self.cp)
self.RT_path = cumulative_return(self.s_path, self.R)

def display(self):

# plot graphs
N = len(self.T_path)

plt.figure()
plt.title('Tax collection paths')
plt.plot(np.arange(N), self.T_path, label='incomplete market')
plt.plot(np.arange(N), np.full(N, self.T_bar), label='complete market')
plt.plot(np.arange(N), self.g_path, label='govt expenditures', alpha=.6, ls='--')
plt.legend()
plt.xlabel('Periods')
plt.show()

plt.title('Government debt paths')


plt.plot(np.arange(N), -self.asset_path, label='incomplete market')
plt.plot(np.arange(N), -self.b[self.s_path], label='complete market')
plt.plot(np.arange(N), self.g_path, label='govt expenditures', ls='--')
plt.plot(np.arange(N), self.debt_value[self.s_path], label="value of debts today")

plt.legend()
plt.axhline(0, color='k', ls='--')
plt.xlabel('Periods')
plt.show()

fig, ax = plt.subplots()
ax.set_title('Cumulative return path (complete markets)')
line1 = ax.plot(np.arange(N), self.RT_path, color='blue')[0]
c1 = line1.get_color()
ax.set_xlabel('Periods')
ax.set_ylabel('Cumulative return', color=c1)

ax_ = ax.twinx()
line2 = ax_.plot(np.arange(N), self.g_path, ls='--', color='green')[0]
c2 = line2.get_color()
ax_.set_ylabel('Government expenditures', color=c2)

plt.show()

# plot detailed information


Q = self.cp.β * self.cp.P

print(f"P \n {self.cp.P}")
print(f"Q \n {Q}")
print(f"Govt expenditures in {', '.join(self.states)} = {self.cp.y.flatten()}
↪")
print(f"Constant tax collections = {self.T_bar}")
print(f"Govt debt in {len(self.states)} states = {-self.b}")

print("")
print(f"Government tax collections minus debt levels in {', '.join(self.
↪states)}")

for i in range(len(self.states)):
TB = self.T_bar + self.b[i]
print(f" T+b in {self.states[i]} = {TB}")

print("")
print(f"Total government spending in {', '.join(self.states)}")
for i in range(len(self.states)):
G = self.cp.y[i, 0] + Q[i, :] @ self.b
print(f" {self.states[i]} = {G}")

print("")
print("Let's see ex-post and ex-ante returns on Arrow securities \n")

print(f"Ex-post returns to purchase of Arrow securities:")


for i in range(len(self.states)):
for j in range(len(self.states)):
if Q[i, j] != 0.:
print(f" π({self.states[j]}|{self.states[i]}) = {1/Q[i, j]}")

print("")
exant = 1 / self.cp.β
print(f"Ex-ante returns to purchase of Arrow securities = {exant}")

print("")
print("The Ex-post one-period gross return on the portfolio of government␣
↪assets")

print(self.R)

print("")
print("The cumulative return earned from holding 1 unit market portfolio of␣
↪government bonds")

print(self.RT_path[-1])


7.4.1 Parameters

γ = .1
λ = .1
ϕ = .1
θ = .1
ψ = .1
g_L = .5
g_M = .8
g_H = 1.2
β = .96

7.4.2 Example 1

This example is designed to produce some stylized versions of tax, debt, and deficit paths followed by the United States
during and after the Civil War and also during and after World War I.
We set the Markov chain to have three states
$$P = \begin{bmatrix} 1-\lambda & \lambda & 0 \\ 0 & 1-\phi & \phi \\ 0 & 0 & 1 \end{bmatrix}$$

where the government expenditure vector is $g = \begin{bmatrix} g_L & g_H & g_M \end{bmatrix}$ with $g_L < g_M < g_H$.


We set 𝑏0 = 1 and assume that the initial Markov state is state 1 so that the system starts off in peace.
These parameters have government expenditure beginning at a low level, surging during the war, then decreasing after
the war to a level that exceeds its prewar level.
(This type of pattern occurred in the US Civil War and World War I experiences.)

g_ex1 = [g_L, g_H, g_M]


P_ex1 = np.array([[1-λ, λ, 0],
[0, 1-ϕ, ϕ],
[0, 0, 1]])
b0_ex1 = 1
states_ex1 = ['peace', 'war', 'postwar']

ts_ex1 = TaxSmoothingExample(g_ex1, P_ex1, b0_ex1, states_ex1, random_state=1)


ts_ex1.display()


P
[[0.9 0.1 0. ]
[0. 0.9 0.1]
[0. 0. 1. ]]
Q
[[0.864 0.096 0. ]
[0. 0.864 0.096]
[0. 0. 0.96 ]]
Govt expenditures in peace, war, postwar = [0.5 1.2 0.8]
Constant tax collections = 0.7548096885813149
Govt debt in 3 states = [-1. -4.07093426 -1.12975779]

Government tax collections minus debt levels in peace, war, postwar


T+b in peace = 1.754809688581315
T+b in war = 4.825743944636679
T+b in postwar = 1.8845674740484442

Total government spending in peace, war, postwar


peace = 1.754809688581315
war = 4.825743944636679
postwar = 1.8845674740484442

Let's see ex-post and ex-ante returns on Arrow securities

Ex-post returns to purchase of Arrow securities:


π(peace|peace) = 1.1574074074074074
π(war|peace) = 10.416666666666666
π(war|war) = 1.1574074074074074
π(postwar|war) = 10.416666666666666
π(postwar|postwar) = 1.0416666666666667

Ex-ante returns to purchase of Arrow securities = 1.0416666666666667

The Ex-post one-period gross return on the portfolio of government assets


[[0.7969336 3.24426428 0. ]
[0. 1.12278592 0.31159337]
[0. 0. 1.04166667]]

The cumulative return earned from holding 1 unit market portfolio of government bonds

0.17908622141460231

# The following shows the use of the wrapper class when a specific state path is given
s_path = [0, 0, 1, 1, 2]
ts_s_path = TaxSmoothingExample(g_ex1, P_ex1, b0_ex1, states_ex1, s_path=s_path)
ts_s_path.display()


P
[[0.9 0.1 0. ]
[0. 0.9 0.1]
[0. 0. 1. ]]
Q
[[0.864 0.096 0. ]
[0. 0.864 0.096]
[0. 0. 0.96 ]]
Govt expenditures in peace, war, postwar = [0.5 1.2 0.8]
Constant tax collections = 0.7548096885813149
Govt debt in 3 states = [-1. -4.07093426 -1.12975779]

Government tax collections minus debt levels in peace, war, postwar


T+b in peace = 1.754809688581315
T+b in war = 4.825743944636679
T+b in postwar = 1.8845674740484442

Total government spending in peace, war, postwar


peace = 1.754809688581315
war = 4.825743944636679
postwar = 1.8845674740484442

Let's see ex-post and ex-ante returns on Arrow securities

Ex-post returns to purchase of Arrow securities:


π(peace|peace) = 1.1574074074074074
π(war|peace) = 10.416666666666666
π(war|war) = 1.1574074074074074
π(postwar|war) = 10.416666666666666
π(postwar|postwar) = 1.0416666666666667

Ex-ante returns to purchase of Arrow securities = 1.0416666666666667

The Ex-post one-period gross return on the portfolio of government assets


[[0.7969336 3.24426428 0. ]
[0. 1.12278592 0.31159337]
[0. 0. 1.04166667]]

The cumulative return earned from holding 1 unit market portfolio of government bonds

0.9045311615620277

7.4.3 Example 2

This example captures a peace followed by a war, eventually followed by a permanent peace.
Here we set
$$P = \begin{bmatrix} 1 & 0 & 0 \\ 0 & 1-\gamma & \gamma \\ \phi & 0 & 1-\phi \end{bmatrix}$$

where the government expenditure vector is $g = \begin{bmatrix} g_L & g_L & g_H \end{bmatrix}$ with $g_L < g_H$.


We assume 𝑏0 = 1 and that the initial Markov state is state 2 so that the system starts off in a temporary peace.

g_ex2 = [g_L, g_L, g_H]


P_ex2 = np.array([[1, 0, 0],
[0, 1-γ, γ],
[ϕ, 0, 1-ϕ]])
b0_ex2 = 1
states_ex2 = ['peace', 'temporary peace', 'war']

ts_ex2 = TaxSmoothingExample(g_ex2, P_ex2, b0_ex2, states_ex2, init=1, random_state=1)


ts_ex2.display()


P
[[1. 0. 0. ]
[0. 0.9 0.1]
[0.1 0. 0.9]]
Q
[[0.96 0. 0. ]
[0. 0.864 0.096]
[0.096 0. 0.864]]
Govt expenditures in peace, temporary peace, war = [0.5 0.5 1.2]
Constant tax collections = 0.6053287197231834
Govt debt in 3 states = [ 2.63321799 -1. -2.51384083]

Government tax collections minus debt levels in peace, temporary peace, war
T+b in peace = -2.0278892733564
T+b in temporary peace = 1.6053287197231834
T+b in war = 3.1191695501730106

Total government spending in peace, temporary peace, war


peace = -2.0278892733564
temporary peace = 1.6053287197231834
war = 3.1191695501730106

Let's see ex-post and ex-ante returns on Arrow securities

Ex-post returns to purchase of Arrow securities:


π(peace|peace) = 1.0416666666666667
π(temporary peace|temporary peace) = 1.1574074074074074
π(war|temporary peace) = 10.416666666666666
π(peace|war) = 10.416666666666666
π(war|war) = 1.1574074074074074

Ex-ante returns to purchase of Arrow securities = 1.0416666666666667

The Ex-post one-period gross return on the portfolio of government assets


[[ 1.04166667 0. 0. ]
[ 0. 0.90470824 2.27429251]
[-1.37206116 0. 1.30985865]]

The cumulative return earned from holding 1 unit market portfolio of government bonds

-9.368991732594216

7.4.4 Example 3

This example features a situation in which one of the states is a war state with no hope of peace next period, while another
state is a war state with a positive probability of peace next period.
The Markov chain is:
$$P = \begin{bmatrix} 1-\lambda & \lambda & 0 & 0 \\ 0 & 1-\phi & \phi & 0 \\ 0 & 0 & 1-\psi & \psi \\ \theta & 0 & 0 & 1-\theta \end{bmatrix}$$

with government expenditure levels for the four states being $\begin{bmatrix} g_L & g_L & g_H & g_H \end{bmatrix}$ where $g_L < g_H$.
We start with 𝑏0 = 1 and 𝑠0 = 1.

g_ex3 = [g_L, g_L, g_H, g_H]


P_ex3 = np.array([[1-λ, λ, 0, 0],
[0, 1-ϕ, ϕ, 0],
[0, 0, 1-ψ, ψ],
[θ, 0, 0, 1-θ ]])
b0_ex3 = 1
states_ex3 = ['peace1', 'peace2', 'war1', 'war2']

ts_ex3 = TaxSmoothingExample(g_ex3, P_ex3, b0_ex3, states_ex3, random_state=1)


ts_ex3.display()


P
[[0.9 0.1 0. 0. ]
[0. 0.9 0.1 0. ]
[0. 0. 0.9 0.1]
[0.1 0. 0. 0.9]]
Q
[[0.864 0.096 0. 0. ]
[0. 0.864 0.096 0. ]
[0. 0. 0.864 0.096]
[0.096 0. 0. 0.864]]
Govt expenditures in peace1, peace2, war1, war2 = [0.5 0.5 1.2 1.2]
Constant tax collections = 0.6927944572748268
Govt debt in 4 states = [-1. -3.42494226 -6.86027714 -4.43533487]

Government tax collections minus debt levels in peace1, peace2, war1, war2
T+b in peace1 = 1.6927944572748268
T+b in peace2 = 4.117736720554273
T+b in war1 = 7.553071593533488
T+b in war2 = 5.128129330254041

Total government spending in peace1, peace2, war1, war2


peace1 = 1.6927944572748268
peace2 = 4.117736720554273
war1 = 7.553071593533487
war2 = 5.128129330254041

Let's see ex-post and ex-ante returns on Arrow securities


Ex-post returns to purchase of Arrow securities:


π(peace1|peace1) = 1.1574074074074074
π(peace2|peace1) = 10.416666666666666
π(peace2|peace2) = 1.1574074074074074
π(war1|peace2) = 10.416666666666666
π(war1|war1) = 1.1574074074074074
π(war2|war1) = 10.416666666666666
π(peace1|war2) = 10.416666666666666
π(war2|war2) = 1.1574074074074074

Ex-ante returns to purchase of Arrow securities = 1.0416666666666667

The Ex-post one-period gross return on the portfolio of government assets


[[0.83836741 2.87135998 0. 0. ]
[0. 0.94670854 1.89628977 0. ]
[0. 0. 1.07983627 0.69814023]
[0.2545741 0. 0. 1.1291214 ]]

The cumulative return earned from holding 1 unit market portfolio of government bonds

0.02371440178864222

7.4.5 Example 4

Here the Markov chain is:


$$P = \begin{bmatrix} 1-\lambda & \lambda & 0 & 0 & 0 \\ 0 & 1-\phi & \phi & 0 & 0 \\ 0 & 0 & 1-\psi & \psi & 0 \\ 0 & 0 & 0 & 1-\theta & \theta \\ 0 & 0 & 0 & 0 & 1 \end{bmatrix}$$

with government expenditure levels for the five states being $\begin{bmatrix} g_L & g_L & g_H & g_H & g_L \end{bmatrix}$ where $g_L < g_H$.
We assume that $b_0 = 1$ and $s_0 = 1$.

g_ex4 = [g_L, g_L, g_H, g_H, g_L]


P_ex4 = np.array([[1-λ, λ, 0, 0, 0],
[0, 1-ϕ, ϕ, 0, 0],
[0, 0, 1-ψ, ψ, 0],
[0, 0, 0, 1-θ, θ],
[0, 0, 0, 0, 1]])
b0_ex4 = 1
states_ex4 = ['peace1', 'peace2', 'war1', 'war2', 'permanent peace']

ts_ex4 = TaxSmoothingExample(g_ex4, P_ex4, b0_ex4, states_ex4, random_state=1)


ts_ex4.display()


P
[[0.9 0.1 0. 0. 0. ]
[0. 0.9 0.1 0. 0. ]
[0. 0. 0.9 0.1 0. ]
[0. 0. 0. 0.9 0.1]
[0. 0. 0. 0. 1. ]]
Q
[[0.864 0.096 0. 0. 0. ]
[0. 0.864 0.096 0. 0. ]
[0. 0. 0.864 0.096 0. ]
[0. 0. 0. 0.864 0.096]
[0. 0. 0. 0. 0.96 ]]
Govt expenditures in peace1, peace2, war1, war2, permanent peace = [0.5 0.5 1.2 1.2 0.5]

Constant tax collections = 0.6349979047185738


Govt debt in 5 states = [-1. -2.82289484 -5.4053292 -1.77211121 3.37494762]

Government tax collections minus debt levels in peace1, peace2, war1, war2, permanent peace

T+b in peace1 = 1.6349979047185736


T+b in peace2 = 3.4578927455370505
T+b in war1 = 6.040327103363229
T+b in war2 = 2.4071091102836433
T+b in permanent peace = -2.7399497132457697

Total government spending in peace1, peace2, war1, war2, permanent peace


peace1 = 1.6349979047185736
peace2 = 3.457892745537051
war1 = 6.040327103363228
war2 = 2.407109110283643
permanent peace = -2.7399497132457697

Let's see ex-post and ex-ante returns on Arrow securities

Ex-post returns to purchase of Arrow securities:


π(peace1|peace1) = 1.1574074074074074
π(peace2|peace1) = 10.416666666666666
π(peace2|peace2) = 1.1574074074074074
π(war1|peace2) = 10.416666666666666
π(war1|war1) = 1.1574074074074074
π(war2|war1) = 10.416666666666666
π(war2|war2) = 1.1574074074074074
π(permanent peace|war2) = 10.416666666666666
π(permanent peace|permanent peace) = 1.0416666666666667

Ex-ante returns to purchase of Arrow securities = 1.0416666666666667

The Ex-post one-period gross return on the portfolio of government assets


[[ 0.8810589 2.48713661 0. 0. 0. ]
[ 0. 0.95436011 1.82742569 0. 0. ]
[ 0. 0. 1.11672808 0.36611394 0. ]
[ 0. 0. 0. 1.46806216 -2.79589276]
[ 0. 0. 0. 0. 1.04166667]]

The cumulative return earned from holding 1 unit market portfolio of government bonds

-11.132109773063616

7.4.6 Example 5

This example captures a case in which the system follows a deterministic path from peace to war, and back to peace again.
Since there is no randomness, the outcomes in the complete markets setting should be the same as in the incomplete markets setting.
The Markov chain is:
$$P = \begin{bmatrix}
0 & 1 & 0 & 0 & 0 & 0 & 0 \\
0 & 0 & 1 & 0 & 0 & 0 & 0 \\
0 & 0 & 0 & 1 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 & 1 & 0 & 0 \\
0 & 0 & 0 & 0 & 0 & 1 & 0 \\
0 & 0 & 0 & 0 & 0 & 0 & 1 \\
0 & 0 & 0 & 0 & 0 & 0 & 1
\end{bmatrix}$$

with government expenditure levels for the seven states being $\begin{bmatrix} g_L & g_L & g_H & g_H & g_H & g_H & g_L \end{bmatrix}$ where $g_L < g_H$.
Assume 𝑏0 = 1 and 𝑠0 = 1.

g_ex5 = [g_L, g_L, g_H, g_H, g_H, g_H, g_L]


P_ex5 = np.array([[0, 1, 0, 0, 0, 0, 0],
[0, 0, 1, 0, 0, 0, 0],
[0, 0, 0, 1, 0, 0, 0],
[0, 0, 0, 0, 1, 0, 0],
[0, 0, 0, 0, 0, 1, 0],
[0, 0, 0, 0, 0, 0, 1],
[0, 0, 0, 0, 0, 0, 1]])
b0_ex5 = 1
states_ex5 = ['peace1', 'peace2', 'war1', 'war2', 'war3', 'permanent peace']

ts_ex5 = TaxSmoothingExample(g_ex5, P_ex5, b0_ex5, states_ex5, N_simul=7, random_state=1)

ts_ex5.display()


P
[[0 1 0 0 0 0 0]
[0 0 1 0 0 0 0]
[0 0 0 1 0 0 0]
[0 0 0 0 1 0 0]
[0 0 0 0 0 1 0]
[0 0 0 0 0 0 1]
[0 0 0 0 0 0 1]]
Q
[[0. 0.96 0. 0. 0. 0. 0. ]
[0. 0. 0.96 0. 0. 0. 0. ]
[0. 0. 0. 0.96 0. 0. 0. ]
[0. 0. 0. 0. 0.96 0. 0. ]
[0. 0. 0. 0. 0. 0.96 0. ]
[0. 0. 0. 0. 0. 0. 0.96]
[0. 0. 0. 0. 0. 0. 0.96]]
Govt expenditures in peace1, peace2, war1, war2, war3, permanent peace = [0.5 0.5 1.2 1.2 1.2 1.2 0.5]

Constant tax collections = 0.5571895472128001


Govt debt in 6 states = [-1. -1.10123911 -1.20669652 -0.58738132 0.05773868 0.72973868 1.42973868]

Government tax collections minus debt levels in peace1, peace2, war1, war2, war3, permanent peace

T+b in peace1 = 1.5571895472128001


T+b in peace2 = 1.6584286588928001
T+b in war1 = 1.7638860668928005
T+b in war2 = 1.1445708668928003
T+b in war3 = 0.4994508668928004
T+b in permanent peace = -0.17254913310719955

Total government spending in peace1, peace2, war1, war2, war3, permanent peace
peace1 = 1.5571895472128
peace2 = 1.6584286588928003
war1 = 1.7638860668928
war2 = 1.1445708668928003
war3 = 0.49945086689280027
permanent peace = -0.17254913310719933

Let's see ex-post and ex-ante returns on Arrow securities

Ex-post returns to purchase of Arrow securities:


π(peace2|peace1) = 1.0416666666666667
π(war1|peace2) = 1.0416666666666667
π(war2|war1) = 1.0416666666666667
π(war3|war2) = 1.0416666666666667
π(permanent peace|war3) = 1.0416666666666667

Ex-ante returns to purchase of Arrow securities = 1.0416666666666667

The Ex-post one-period gross return on the portfolio of government assets


[[0. 1.04166667 0. 0. 0. 0.
0. ]
[0. 0. 1.04166667 0. 0. 0.
0. ]
[0. 0. 0. 1.04166667 0. 0.
0. ]
[0. 0. 0. 0. 1.04166667 0.
0. ]
[0. 0. 0. 0. 0. 1.04166667
0. ]
[0. 0. 0. 0. 0. 0.
1.04166667]
[0. 0. 0. 0. 0. 0.
1.04166667]]

The cumulative return earned from holding 1 unit market portfolio of government bonds

1.2775343959060068

7.4.7 Continuous-State Gaussian Model

To construct a tax-smoothing version of the complete markets consumption-smoothing model with a continuous state
space that we presented in the lecture consumption smoothing with complete and incomplete markets, we simply relabel
variables.
Thus, a government faces a sequence of budget constraints

$$T_t + b_t = g_t + \beta \mathbb{E}_t b_{t+1}, \qquad t \geq 0$$


where 𝑇𝑡 is tax revenues, 𝑏𝑡 are receipts at 𝑡 from contingent claims that the government had purchased at time 𝑡 − 1, and

$$\beta \mathbb{E}_t b_{t+1} \equiv \int q_{t+1}(x_{t+1} \mid x_t)\, b_{t+1}(x_{t+1})\, dx_{t+1}$$

is the value of time 𝑡 + 1 state-contingent claims purchased by the government at time 𝑡.


As above with the consumption-smoothing model, we can solve the time 𝑡 budget constraint forward to obtain

$$b_t = \mathbb{E}_t \sum_{j=0}^\infty \beta^j \left( g_{t+j} - T_{t+j} \right)$$

which can be rearranged to become

$$\mathbb{E}_t \sum_{j=0}^\infty \beta^j g_{t+j} = b_t + \mathbb{E}_t \sum_{j=0}^\infty \beta^j T_{t+j}$$

which states that the present value of government purchases equals the value of government assets at 𝑡 plus the present
value of tax receipts.
With these relabelings, examples presented in consumption smoothing with complete and incomplete markets can be inter-
preted as tax-smoothing models.
Returns: In the continuous state version of our incomplete markets model, the ex post one-period gross rate of return
on the government portfolio equals

$$R(x_{t+1} \mid x_t) = \frac{b(x_{t+1})}{\beta E\left[\, b(x_{t+1}) \mid x_t \,\right]}$$

Related Lectures

Throughout this lecture, we have taken one-period interest rates and Arrow security prices as exogenous objects deter-
mined outside the model and specified them in ways designed to align our models closely with the consumption smoothing
model of Barro [Barro, 1979].
Other lectures make these objects endogenous and describe how a government optimally manipulates prices of government debt, albeit indirectly, via the effects that distorting taxes have on equilibrium prices and allocations.
In optimal taxation in an LQ economy and recursive optimal taxation, we study complete-markets models in which the
government recognizes that it can manipulate Arrow securities prices.
Linear-quadratic versions of the Lucas-Stokey tax-smoothing model are described in Optimal Taxation in an LQ Economy.
That lecture is a warm-up for the non-linear-quadratic model of tax smoothing described in Optimal Taxation with State-
Contingent Debt.
In both Optimal Taxation in an LQ Economy and Optimal Taxation with State-Contingent Debt, the government recognizes
that its decisions affect prices.
In optimal taxation with incomplete markets, we study an incomplete-markets model in which the government also
manipulates prices of government debt.



CHAPTER

EIGHT

MARKOV JUMP LINEAR QUADRATIC DYNAMIC PROGRAMMING

In addition to what’s in Anaconda, this lecture will need the following libraries:

!pip install --upgrade quantecon

8.1 Overview

This lecture describes Markov jump linear quadratic dynamic programming, an extension of the method described
in the first LQ control lecture.
Markov jump linear quadratic dynamic programming is described and analyzed in [Do Val et al., 1999] and the references
cited there.
The method has been applied to problems in macroeconomics and monetary economics by [Svensson et al., 2008] and
[Svensson and Williams, 2009].
The periodic models of seasonality described in chapter 14 of [Hansen and Sargent, 2013] are a special case of Markov
jump linear quadratic problems.
Markov jump linear quadratic dynamic programming combines advantages of
• the computational simplicity of linear quadratic dynamic programming, with
• the ability of finite state Markov chains to represent interesting patterns of random variation.
The idea is to replace the constant matrices that define a linear quadratic dynamic programming problem with 𝑁 sets
of matrices that are fixed functions of the state of an 𝑁 state Markov chain.
The state of the Markov chain together with the continuous 𝑛 × 1 state vector 𝑥𝑡 form the state of the system.
For the class of infinite horizon problems being studied in this lecture, we obtain 𝑁 interrelated matrix Riccati equations
that determine 𝑁 optimal value functions and 𝑁 linear decision rules.
One of these value functions and one of these decision rules apply in each of the 𝑁 Markov states.
That is, when the Markov state is in state 𝑗, the value function and the decision rule for state 𝑗 prevails.


8.2 Review of useful LQ dynamic programming formulas

To begin, it is handy to have the following reminder in mind.


A linear quadratic dynamic programming problem consists of a scalar discount factor 𝛽 ∈ (0, 1), an 𝑛 × 1 state
vector 𝑥𝑡 , an initial condition for 𝑥0 , a 𝑘 × 1 control vector 𝑢𝑡 , a 𝑝 × 1 random shock vector 𝑤𝑡+1 and the following two
triples of matrices:
• A triple of matrices (𝑅, 𝑄, 𝑊 ) defining a loss function
𝑟(𝑥𝑡 , 𝑢𝑡 ) = 𝑥′𝑡 𝑅𝑥𝑡 + 𝑢′𝑡 𝑄𝑢𝑡 + 2𝑢′𝑡 𝑊 𝑥𝑡
• a triple of matrices (𝐴, 𝐵, 𝐶) defining a state-transition law
𝑥𝑡+1 = 𝐴𝑥𝑡 + 𝐵𝑢𝑡 + 𝐶𝑤𝑡+1
The problem is

$$-x_0' P x_0 - \rho = \min_{\{u_t\}_{t=0}^\infty} E \sum_{t=0}^\infty \beta^t r(x_t, u_t)$$

subject to the transition law for the state.


The optimal decision rule has the form

𝑢𝑡 = −𝐹 𝑥𝑡

and the optimal value function is of the form

− (𝑥′𝑡 𝑃 𝑥𝑡 + 𝜌)

where 𝑃 solves the algebraic matrix Riccati equation

$$P = R + \beta A' P A - (\beta B' P A + W)' (Q + \beta B' P B)^{-1} (\beta B' P A + W)$$

and the constant 𝜌 satisfies

𝜌 = 𝛽 (𝜌 + trace(𝑃 𝐶𝐶 ′ ))

and the matrix 𝐹 in the decision rule for 𝑢𝑡 satisfies

$$F = (Q + \beta B' P B)^{-1} (\beta B' P A + W)$$
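As a quick sanity check of these formulas, here is a minimal sketch that solves a small LQ problem with quantecon's `LQ` class and verifies that the returned $P$ and $F$ satisfy the Riccati equation and the decision-rule formula above. The parameter values are made up for illustration and are not part of the lecture's examples.

import numpy as np
import quantecon as qe

# Illustrative LQ problem with made-up matrices (not from the lecture)
β = 0.95
A = np.array([[0.9, 0.1],
              [0.0, 0.8]])
B = np.array([[0.0],
              [1.0]])
C = np.array([[0.1],
              [0.0]])
R = np.eye(2)
Q = np.array([[0.5]])
W = np.zeros((1, 2))          # cross term (LQ calls it N)

lq = qe.LQ(Q, R, A, B, C=C, N=W, beta=β)
P, F, d = lq.stationary_values()

# Check the algebraic Riccati equation and the decision-rule formula
S = β * B.T @ P @ A + W
print(np.allclose(P, R + β * A.T @ P @ A
                  - S.T @ np.linalg.solve(Q + β * B.T @ P @ B, S)))   # True
print(np.allclose(F, np.linalg.solve(Q + β * B.T @ P @ B, S)))        # True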

With the preceding formulas in mind, we are ready to approach Markov Jump linear quadratic dynamic programming.

8.3 Linked Riccati equations for Markov LQ dynamic programming

The key idea is to make the matrices 𝐴, 𝐵, 𝐶, 𝑅, 𝑄, 𝑊 fixed functions of a finite state 𝑠 that is governed by an 𝑁 state
Markov chain.
This makes decision rules depend on the Markov state, and so fluctuate through time in limited ways.
In particular, we use the following extension of a discrete-time linear quadratic dynamic programming problem.
We let 𝑠𝑡 ∈ [1, 2, … , 𝑁 ] be a time 𝑡 realization of an 𝑁 -state Markov chain with transition matrix Π having typical
element Π𝑖𝑗 .


Here 𝑖 denotes today and 𝑗 denotes tomorrow and

Π𝑖𝑗 = Prob(𝑠𝑡+1 = 𝑗|𝑠𝑡 = 𝑖)

We’ll switch between labeling today’s state as 𝑠𝑡 and 𝑖 and between labeling tomorrow’s state as 𝑠𝑡+1 or 𝑗.
The decision-maker solves the minimization problem:

$$\min_{\{u_t\}_{t=0}^\infty} E \sum_{t=0}^\infty \beta^t r(x_t, s_t, u_t)$$

with

𝑟(𝑥𝑡 , 𝑠𝑡 , 𝑢𝑡 ) = 𝑥′𝑡 𝑅𝑠𝑡 𝑥𝑡 + 𝑢′𝑡 𝑄𝑠𝑡 𝑢𝑡 + 2𝑢′𝑡 𝑊𝑠𝑡 𝑥𝑡

subject to linear laws of motion with matrices (𝐴, 𝐵, 𝐶) each possibly dependent on the Markov-state-𝑠𝑡 :

𝑥𝑡+1 = 𝐴𝑠𝑡 𝑥𝑡 + 𝐵𝑠𝑡 𝑢𝑡 + 𝐶𝑠𝑡 𝑤𝑡+1

where {𝑤𝑡+1 } is an i.i.d. stochastic process with 𝑤𝑡+1 ∼ 𝑁 (0, 𝐼).


The optimal decision rule for this problem has the form

𝑢𝑡 = −𝐹𝑠𝑡 𝑥𝑡

and the optimal value functions are of the form

− (𝑥′𝑡 𝑃𝑠𝑡 𝑥𝑡 + 𝜌𝑠𝑡 )

or equivalently

−𝑥′𝑡 𝑃𝑖 𝑥𝑡 − 𝜌𝑖

The optimal value functions −𝑥′ 𝑃𝑖 𝑥 − 𝜌𝑖 for 𝑖 = 1, … , 𝑛 satisfy the 𝑁 interrelated Bellman equations

$$-x' P_i x - \rho_i = \max_u \; - \left[ x' R_i x + u' Q_i u + 2 u' W_i x + \beta \sum_j \Pi_{ij}\, E\!\left( (A_i x + B_i u + C_i w)' P_j (A_i x + B_i u + C_i w) + \rho_j \right) \right]$$

The matrices 𝑃𝑠𝑡 = 𝑃𝑖 and the scalars 𝜌𝑠𝑡 = 𝜌𝑖 , 𝑖 = 1, …, n satisfy the following stacked system of algebraic matrix
Riccati equations:

$$P_i = R_i + \beta \sum_j \Pi_{ij}\, A_i' P_j A_i - \sum_j \Pi_{ij} \left[ (\beta B_i' P_j A_i + W_i)' (Q + \beta B_i' P_j B_i)^{-1} (\beta B_i' P_j A_i + W_i) \right]$$

$$\rho_i = \beta \sum_j \Pi_{ij} \left( \rho_j + \mathrm{trace}(P_j C_i C_i') \right)$$

and the 𝐹𝑖 matrices in the optimal decision rules are

$$F_i = \left( Q_i + \beta \sum_j \Pi_{ij}\, B_i' P_j B_i \right)^{-1} \left( \beta \sum_j \Pi_{ij}\, B_i' P_j A_i + W_i \right)$$
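To make the fixed-point character of these equations concrete, here is a minimal sketch of iterating on them directly. It assumes the cross terms $W_i$ are zero and, as in the formula for $F_i$, takes the expectation $\sum_j \Pi_{ij} P_j$ inside each update; it is only an illustration, since the lecture below uses quantecon's `LQMarkov` class to do this work.

import numpy as np

def markov_riccati_iteration(Π, As, Bs, Rs, Qs, β=0.95, tol=1e-10, max_iter=5_000):
    """
    Iterate the coupled Riccati equations to a fixed point (W_i = 0 assumed).
    """
    N = Π.shape[0]
    n = As[0].shape[0]
    Ps = [np.zeros((n, n)) for _ in range(N)]
    for _ in range(max_iter):
        Ps_new = []
        for i in range(N):
            EP = sum(Π[i, j] * Ps[j] for j in range(N))   # E[P_{s_{t+1}} | s_t = i]
            A, B, R, Q = As[i], Bs[i], Rs[i], Qs[i]
            S = β * B.T @ EP @ A
            Ps_new.append(R + β * A.T @ EP @ A
                          - S.T @ np.linalg.solve(Q + β * B.T @ EP @ B, S))
        if max(np.max(np.abs(Pn - P)) for Pn, P in zip(Ps_new, Ps)) < tol:
            Ps = Ps_new
            break
        Ps = Ps_new
    Fs = []
    for i in range(N):
        EP = sum(Π[i, j] * Ps[j] for j in range(N))
        Fs.append(np.linalg.solve(Qs[i] + β * Bs[i].T @ EP @ Bs[i],
                                  β * Bs[i].T @ EP @ As[i]))
    return Ps, Fs

Applied to the matrices constructed in the examples below, this iteration should reproduce, up to numerical tolerance, the `Ps` and `Fs` computed by `qe.LQMarkov`.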

8.4 Applications

We now describe Python code and some examples.


To begin, we import these Python modules


import numpy as np
import quantecon as qe
import matplotlib.pyplot as plt
from mpl_toolkits.mplot3d import Axes3D

# Set discount factor


β = 0.95

8.5 Example 1

This example is a version of a classic problem of optimally adjusting a variable 𝑘𝑡 to a target level in the face of costly
adjustment.
This provides a model of gradual adjustment.
Given 𝑘0 , the objective function is

$$\max_{\{k_t\}_{t=1}^\infty} E_0 \sum_{t=0}^\infty \beta^t r(s_t, k_t)$$

where the one-period payoff function is

$$r(s_t, k_t) = f_{1,s_t} k_t - f_{2,s_t} k_t^2 - d_{s_t} (k_{t+1} - k_t)^2 ,$$

𝐸0 is a mathematical expectation conditioned on time 0 information 𝑥0 , 𝑠0 and the transition law for continuous state
variable 𝑘𝑡 is

𝑘𝑡+1 − 𝑘𝑡 = 𝑢𝑡

We can think of 𝑘𝑡 as the decision-maker’s capital and 𝑢𝑡 as costs of adjusting the level of capital.
We assume that 𝑓1 (𝑠𝑡 ) > 0, 𝑓2 (𝑠𝑡 ) > 0, and 𝑑 (𝑠𝑡 ) > 0.
Denote the state transition matrix for Markov state 𝑠𝑡 ∈ {1, 2} as Π:

Pr (𝑠𝑡+1 = 𝑗 ∣ 𝑠𝑡 = 𝑖) = Π𝑖𝑗

Let $x_t = \begin{bmatrix} k_t \\ 1 \end{bmatrix}$.
We can represent the one-period payoff function 𝑟 (𝑠𝑡 , 𝑘𝑡 ) as

$$\begin{aligned}
r(s_t, k_t) &= f_{1,s_t} k_t - f_{2,s_t} k_t^2 - d_{s_t} u_t^2 \\
&= - x_t' \underbrace{\begin{bmatrix} f_{2,s_t} & -\tfrac{f_{1,s_t}}{2} \\ -\tfrac{f_{1,s_t}}{2} & 0 \end{bmatrix}}_{\equiv R(s_t)} x_t - \underbrace{d_{s_t}}_{\equiv Q(s_t)} u_t^2
\end{aligned}$$

and the state-transition law as


$$x_{t+1} = \begin{bmatrix} k_{t+1} \\ 1 \end{bmatrix} = \underbrace{I_2}_{\equiv A(s_t)} x_t + \underbrace{\begin{bmatrix} 1 \\ 0 \end{bmatrix}}_{\equiv B(s_t)} u_t$$


def construct_arrays1(f1_vals=[1. ,1.],


f2_vals=[1., 1.],
d_vals=[1., 1.]):
"""
Construct matrices that map the problem described in example 1
into a Markov jump linear quadratic dynamic programming problem
"""

# Number of Markov states


m = len(f1_vals)
# Number of state and control variables
n, k = 2, 1

# Construct sets of matrices for each state


As = [np.eye(n) for i in range(m)]
Bs = [np.array([[1, 0]]).T for i in range(m)]

Rs = np.zeros((m, n, n))
Qs = np.zeros((m, k, k))

for i in range(m):
Rs[i, 0, 0] = f2_vals[i]
Rs[i, 1, 0] = - f1_vals[i] / 2
Rs[i, 0, 1] = - f1_vals[i] / 2

Qs[i, 0, 0] = d_vals[i]

Cs, Ns = None, None

# Compute the optimal k level of the payoff function in each state


k_star = np.empty(m)
for i in range(m):
k_star[i] = f1_vals[i] / (2 * f2_vals[i])

return Qs, Rs, Ns, As, Bs, Cs, k_star

The continuous part of the state 𝑥𝑡 consists of two variables, namely, 𝑘𝑡 and a constant term.

state_vec1 = ["k", "constant term"]

We start with a Markov transition matrix that makes the Markov state be strictly periodic:
$$\Pi_1 = \begin{bmatrix} 0 & 1 \\ 1 & 0 \end{bmatrix},$$
We set 𝑓1,𝑠𝑡 and 𝑓2,𝑠𝑡 to be independent of the Markov state 𝑠𝑡

$$f_{1,1} = f_{1,2} = 1, \qquad f_{2,1} = f_{2,2} = 1$$
In contrast to 𝑓1,𝑠𝑡 and 𝑓2,𝑠𝑡 , we make the adjustment cost 𝑑𝑠𝑡 vary across Markov states 𝑠𝑡 .
We set the adjustment cost to be lower in Markov state 2

𝑑1 = 1, 𝑑2 = 0.5

The following code forms a Markov switching LQ problem and computes the optimal value functions and optimal decision
rules for each Markov state


# Construct Markov transition matrix
Π1 = np.array([[0., 1.],
               [1., 0.]])

# Construct matrices
Qs, Rs, Ns, As, Bs, Cs, k_star = construct_arrays1(d_vals=[1., 0.5])

# Construct a Markov Jump LQ problem
ex1_a = qe.LQMarkov(Π1, Qs, Rs, As, Bs, Cs=Cs, Ns=Ns, beta=β)
# Solve for optimal value functions and decision rules
ex1_a.stationary_values();

Let’s look at the value function matrices and the decision rules for each Markov state

# P(s)
ex1_a.Ps

array([[[ 1.56626026, -0.78313013],
        [-0.78313013, -4.60843493]],

       [[ 1.37424214, -0.68712107],
        [-0.68712107, -4.65643947]]])

# d(s) = 0, since there is no randomness
ex1_a.ds

array([0., 0.])

# F(s)
ex1_a.Fs

array([[[ 0.56626026, -0.28313013]],

[[ 0.74848427, -0.37424214]]])

Now we’ll plot the decision rules and see if they make sense

# Plot the optimal decision rules
k_grid = np.linspace(0., 1., 100)
# Optimal choice in state s1
u1_star = - ex1_a.Fs[0, 0, 1] - ex1_a.Fs[0, 0, 0] * k_grid
# Optimal choice in state s2
u2_star = - ex1_a.Fs[1, 0, 1] - ex1_a.Fs[1, 0, 0] * k_grid

fig, ax = plt.subplots()
ax.plot(k_grid, k_grid + u1_star, label=r"$\overline{s}_1$ (high)")
ax.plot(k_grid, k_grid + u2_star, label=r"$\overline{s}_2$ (low)")

# The optimal k*
ax.scatter([0.5, 0.5], [0.5, 0.5], marker="*")
ax.plot([k_star[0], k_star[0]], [0., 1.0], '--')

# 45 degree line
ax.plot([0., 1.], [0., 1.], '--', label="45 degree line")

ax.set_xlabel("$k_t$")
ax.set_ylabel("$k_{t+1}$")
ax.legend()
plt.show()

The above graph plots 𝑘𝑡+1 = 𝑘𝑡 + 𝑢𝑡 = 𝑘𝑡 − 𝐹 𝑥𝑡 as an affine (i.e., linear in 𝑘𝑡 plus a constant) function of 𝑘𝑡 for both
Markov states 𝑠𝑡 .
It also plots the 45 degree line.
Notice that the two 𝑠𝑡 -dependent closed loop functions that determine 𝑘𝑡+1 as functions of 𝑘𝑡 share the same rest point
(also called a fixed point) at 𝑘𝑡 = 0.5.
Evidently, the optimal decision rule in Markov state 2, in which the adjustment cost is lower, makes 𝑘𝑡+1 a flatter function of 𝑘𝑡.

This happens because when 𝑘𝑡 is not at its fixed point, |𝑢𝑡,2| > |𝑢𝑡,1|, so that the decision-maker adjusts toward the fixed point faster when the Markov state 𝑠𝑡 takes a value that makes it cheaper.
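
Here is a quick numerical check of that claim (our own addition, using the objects already computed above):

# Rest point k solves F(s)[0]*k + F(s)[1] = 0  =>  k = -F(s)[1] / F(s)[0];
# it coincides with k* = f1 / (2 f2) = 0.5 in both Markov states.
for s in range(2):
    rest_point = -ex1_a.Fs[s, 0, 1] / ex1_a.Fs[s, 0, 0]
    print(f"state {s+1}: rest point = {rest_point:.4f}, k* = {k_star[s]:.4f}")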

# Compute time series
T = 20
x0 = np.array([[0., 1.]]).T
x_path = ex1_a.compute_sequence(x0, ts_length=T)[0]



fig, ax = plt.subplots()
ax.plot(range(T), x_path[0, :-1])
ax.set_xlabel("$t$")
ax.set_ylabel("$k_t$")
ax.set_title("Optimal path of $k_t$")
plt.show()

Now we’ll depart from the preceding transition matrix that made the Markov state be strictly periodic.
We’ll begin with symmetric transition matrices of the form

$$
\Pi_2 = \begin{bmatrix} 1-\lambda & \lambda \\ \lambda & 1-\lambda \end{bmatrix}.
$$

λ = 0.8 # high λ
Π2 = np.array([[1-λ, λ],
[λ, 1-λ]])

ex1_b = qe.LQMarkov(Π2, Qs, Rs, As, Bs, Cs=Cs, Ns=Ns, beta=β)
ex1_b.stationary_values();
ex1_b.Fs

array([[[ 0.57291724, -0.28645862]],

[[ 0.74434525, -0.37217263]]])


λ = 0.2 # low λ
Π2 = np.array([[1-λ, λ],
[λ, 1-λ]])

ex1_b = qe.LQMarkov(Π2, Qs, Rs, As, Bs, Cs=Cs, Ns=Ns, beta=β)
ex1_b.stationary_values();
ex1_b.Fs

array([[[ 0.59533259, -0.2976663 ]],

[[ 0.72818728, -0.36409364]]])

We can plot optimal decision rules associated with different 𝜆 values.

λ_vals = np.linspace(0., 1., 10)

F1 = np.empty((λ_vals.size, 2))
F2 = np.empty((λ_vals.size, 2))

for i, λ in enumerate(λ_vals):
    Π2 = np.array([[1-λ, λ],
                   [λ, 1-λ]])

    ex1_b = qe.LQMarkov(Π2, Qs, Rs, As, Bs, Cs=Cs, Ns=Ns, beta=β)
    ex1_b.stationary_values();
    F1[i, :] = ex1_b.Fs[0, 0, :]
    F2[i, :] = ex1_b.Fs[1, 0, :]

for i, state_var in enumerate(state_vec1):
    fig, ax = plt.subplots()
    ax.plot(λ_vals, F1[:, i], label=r"$\overline{s}_1$", color="b")
    ax.plot(λ_vals, F2[:, i], label=r"$\overline{s}_2$", color="r")

    ax.set_xlabel(r"$\lambda$")
    ax.set_ylabel("$F_{s_t}$")
    ax.set_title(f"Coefficient on {state_var}")
    ax.legend()
    plt.show()


Notice how the decision rules’ constants and slopes behave as functions of 𝜆.
Evidently, as the Markov chain becomes more nearly periodic (i.e., as 𝜆 → 1), the dynamic program adjusts capital faster
in the low adjustment cost Markov state to take advantage of what is only temporarily a more favorable time to invest.
Now let’s study situations in which the Markov transition matrix Π is asymmetric
$$
\Pi_3 = \begin{bmatrix} 1-\lambda & \lambda \\ \delta & 1-\delta \end{bmatrix}.
$$

λ, δ = 0.8, 0.2
Π3 = np.array([[1-λ, λ],
[δ, 1-δ]])

ex1_b = qe.LQMarkov(Π3, Qs, Rs, As, Bs, Cs=Cs, Ns=Ns, beta=β)
ex1_b.stationary_values();
ex1_b.Fs

array([[[ 0.57169781, -0.2858489 ]],

[[ 0.72749075, -0.36374537]]])

We can plot optimal decision rules for different 𝜆 and 𝛿 values.

λ_vals = np.linspace(0., 1., 10)
δ_vals = np.linspace(0., 1., 10)



λ_grid = np.empty((λ_vals.size, δ_vals.size))
δ_grid = np.empty((λ_vals.size, δ_vals.size))
F1_grid = np.empty((λ_vals.size, δ_vals.size, len(state_vec1)))
F2_grid = np.empty((λ_vals.size, δ_vals.size, len(state_vec1)))

for i, λ in enumerate(λ_vals):
    λ_grid[i, :] = λ
    δ_grid[i, :] = δ_vals
    for j, δ in enumerate(δ_vals):
        Π3 = np.array([[1-λ, λ],
                       [δ, 1-δ]])

        ex1_b = qe.LQMarkov(Π3, Qs, Rs, As, Bs, Cs=Cs, Ns=Ns, beta=β)
        ex1_b.stationary_values();
        F1_grid[i, j, :] = ex1_b.Fs[0, 0, :]
        F2_grid[i, j, :] = ex1_b.Fs[1, 0, :]

for i, state_var in enumerate(state_vec1):
    fig = plt.figure()
    ax = fig.add_subplot(111, projection='3d')
    # high adjustment cost, blue surface
    ax.plot_surface(λ_grid, δ_grid, F1_grid[:, :, i], color="b")
    # low adjustment cost, red surface
    ax.plot_surface(λ_grid, δ_grid, F2_grid[:, :, i], color="r")
    ax.set_xlabel(r"$\lambda$")
    ax.set_ylabel(r"$\delta$")
    ax.set_zlabel("$F_{s_t}$")
    ax.set_title(f"coefficient on {state_var}")
    plt.show()


The following code defines a wrapper function that computes optimal decision rules for cases with different Markov
transition matrices


def run(construct_func, vals_dict, state_vec):
    """
    A Wrapper function that repeats the computation above
    for different cases
    """

    Qs, Rs, Ns, As, Bs, Cs, k_star = construct_func(**vals_dict)

    # Symmetric Π
    # Notice that pure periodic transition is a special case
    # when λ=1
    print("symmetric Π case:\n")
    λ_vals = np.linspace(0., 1., 10)
    F1 = np.empty((λ_vals.size, len(state_vec)))
    F2 = np.empty((λ_vals.size, len(state_vec)))

    for i, λ in enumerate(λ_vals):
        Π2 = np.array([[1-λ, λ],
                       [λ, 1-λ]])

        mplq = qe.LQMarkov(Π2, Qs, Rs, As, Bs, Cs=Cs, Ns=Ns, beta=β)
        mplq.stationary_values();
        F1[i, :] = mplq.Fs[0, 0, :]
        F2[i, :] = mplq.Fs[1, 0, :]

    for i, state_var in enumerate(state_vec):
        fig = plt.figure()
        ax = fig.add_subplot(111)
        ax.plot(λ_vals, F1[:, i], label=r"$\overline{s}_1$", color="b")
        ax.plot(λ_vals, F2[:, i], label=r"$\overline{s}_2$", color="r")

        ax.set_xlabel(r"$\lambda$")
        ax.set_ylabel(r"$F(\overline{s}_t)$")
        ax.set_title(f"coefficient on {state_var}")
        ax.legend()
        plt.show()

    # Plot optimal k*_{s_t} and k that optimal policies are targeting
    # only for example 1
    if state_vec == ["k", "constant term"]:
        fig = plt.figure()
        ax = fig.add_subplot(111)
        for i in range(2):
            F = [F1, F2][i]
            c = ["b", "r"][i]
            ax.plot([0, 1], [k_star[i], k_star[i]], "--",
                    color=c, label=r"$k^*(\overline{s}_"+str(i+1)+")$")
            ax.plot(λ_vals, - F[:, 1] / F[:, 0], color=c,
                    label=r"$k^{target}(\overline{s}_"+str(i+1)+")$")

        # Plot a vertical line at λ=0.5
        ax.plot([0.5, 0.5], [min(k_star), max(k_star)], "-.")

        ax.set_xlabel(r"$\lambda$")
        ax.set_ylabel("$k$")
        ax.set_title("Optimal k levels and k targets")
        ax.text(0.5, min(k_star)+(max(k_star)-min(k_star))/20, r"$\lambda=0.5$")

        ax.legend(bbox_to_anchor=(1., 1.))
        plt.show()

    # Asymmetric Π
    print("asymmetric Π case:\n")
    δ_vals = np.linspace(0., 1., 10)

    λ_grid = np.empty((λ_vals.size, δ_vals.size))
    δ_grid = np.empty((λ_vals.size, δ_vals.size))
    F1_grid = np.empty((λ_vals.size, δ_vals.size, len(state_vec)))
    F2_grid = np.empty((λ_vals.size, δ_vals.size, len(state_vec)))

    for i, λ in enumerate(λ_vals):
        λ_grid[i, :] = λ
        δ_grid[i, :] = δ_vals
        for j, δ in enumerate(δ_vals):
            Π3 = np.array([[1-λ, λ],
                           [δ, 1-δ]])

            mplq = qe.LQMarkov(Π3, Qs, Rs, As, Bs, Cs=Cs, Ns=Ns, beta=β)
            mplq.stationary_values();
            F1_grid[i, j, :] = mplq.Fs[0, 0, :]
            F2_grid[i, j, :] = mplq.Fs[1, 0, :]

    for i, state_var in enumerate(state_vec):
        fig = plt.figure()
        ax = fig.add_subplot(111, projection='3d')
        ax.plot_surface(λ_grid, δ_grid, F1_grid[:, :, i], color="b")
        ax.plot_surface(λ_grid, δ_grid, F2_grid[:, :, i], color="r")
        ax.set_xlabel(r"$\lambda$")
        ax.set_ylabel(r"$\delta$")
        ax.set_zlabel(r"$F(\overline{s}_t)$")
        ax.set_title(f"coefficient on {state_var}")
        plt.show()

To illustrate the code with another example, we shall set 𝑓2,𝑠𝑡 and 𝑑𝑠𝑡 as constant functions and

𝑓1,1 = 0.5, 𝑓1,2 = 1

Thus, the sole role of the Markov jump state 𝑠𝑡 is to identify times in which capital is very productive and other times in
which it is less productive.
The example below reveals much about the structure of the optimum problem and optimal policies.
Only 𝑓1,𝑠𝑡 varies with 𝑠𝑡 .
So there are different $s_t$-dependent optimal static $k$ levels in different states, $k^*_{s_t} = \frac{f_{1,s_t}}{2 f_{2,s_t}}$, values of $k$ that maximize one-period payoff functions in each state.

We denote a target $k$ level as $k^{target}_{s_t}$, the fixed point of the optimal policy in each state, given the value of $\lambda$.

We call $k^{target}_{s_t}$ a “target” because in each Markov state $s_t$, optimal policies are contraction mappings and will push $k_t$ towards a fixed point $k^{target}_{s_t}$.

When $\lambda \rightarrow 0$, each Markov state becomes close to an absorbing state and consequently $k^{target}_{s_t} \rightarrow k^*_{s_t}$.
But when 𝜆 → 1, the Markov transition matrix becomes more nearly periodic, so the optimum decision rules target more
at the optimal 𝑘 level in the other state in order to enjoy higher expected payoff in the next period.


The switch happens at 𝜆 = 0.5 when both states are equally likely to be reached.
Below we plot an additional figure that shows optimal 𝑘 levels in the two Markov jump states and also how the targeted 𝑘 levels change as 𝜆 changes.

run(construct_arrays1, {"f1_vals":[0.5, 1.]}, state_vec1)

symmetric Π case:


asymmetric Π case:


Set 𝑓1,𝑠𝑡 and 𝑑𝑠𝑡 as constant functions and

𝑓2,1 = 0.5, 𝑓2,2 = 1

run(construct_arrays1, {"f2_vals":[0.5, 1.]}, state_vec1)

symmetric Π case:


asymmetric Π case:


8.6 Example 2

We now add to the example 1 setup another state variable 𝑤𝑡 that follows the evolution law

𝑤𝑡+1 = 𝛼0 (𝑠𝑡 ) + 𝜌 (𝑠𝑡 ) 𝑤𝑡 + 𝜎 (𝑠𝑡 ) 𝜖𝑡+1 , 𝜖𝑡+1 ∼ 𝑁 (0, 1)

We think of 𝑤𝑡 as a rental rate or tax rate that the decision maker pays each period for 𝑘𝑡 .
To capture this idea, we add to the decision-maker’s one-period payoff function the product of 𝑤𝑡 and 𝑘𝑡

𝑟(𝑠𝑡 , 𝑘𝑡 , 𝑤𝑡 ) = 𝑓1,𝑠𝑡 𝑘𝑡 − 𝑓2,𝑠𝑡 𝑘𝑡2 − 𝑑𝑠𝑡 (𝑘𝑡+1 − 𝑘𝑡 )2 − 𝑤𝑡 𝑘𝑡 ,

We now let the continuous part of the state at time $t$ be $x_t = \begin{bmatrix} k_t \\ 1 \\ w_t \end{bmatrix}$ and continue to set the control $u_t = k_{t+1} - k_t$.
We can write the one-period payoff function 𝑟 (𝑠𝑡 , 𝑘𝑡 , 𝑤𝑡 ) as
$$
\begin{aligned}
r(s_t, k_t, w_t) &= f_{1,s_t} k_t - f_{2,s_t} k_t^2 - d_{s_t} (k_{t+1} - k_t)^2 - w_t k_t \\
&= -\left( x_t' \underbrace{\begin{bmatrix} f_{2,s_t} & -\tfrac{f_{1,s_t}}{2} & \tfrac{1}{2} \\ -\tfrac{f_{1,s_t}}{2} & 0 & 0 \\ \tfrac{1}{2} & 0 & 0 \end{bmatrix}}_{\equiv R(s_t)} x_t
+ \underbrace{d_{s_t}}_{\equiv Q(s_t)} u_t^2 \right),
\end{aligned}
$$

and the state-transition law as


$$
x_{t+1} = \begin{bmatrix} k_{t+1} \\ 1 \\ w_{t+1} \end{bmatrix}
= \underbrace{\begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & \alpha_0(s_t) & \rho(s_t) \end{bmatrix}}_{\equiv A(s_t)} x_t
+ \underbrace{\begin{bmatrix} 1 \\ 0 \\ 0 \end{bmatrix}}_{\equiv B(s_t)} u_t
+ \underbrace{\begin{bmatrix} 0 \\ 0 \\ \sigma(s_t) \end{bmatrix}}_{\equiv C(s_t)} \epsilon_{t+1}
$$

def construct_arrays2(f1_vals=[1., 1.],
                      f2_vals=[1., 1.],
                      d_vals=[1., 1.],
                      α0_vals=[1., 1.],
                      ρ_vals=[0.9, 0.9],
                      σ_vals=[1., 1.]):
    """
    Construct matrices that map the problem described in example 2
    into a Markov jump linear quadratic dynamic programming problem.
    """

    m = len(f1_vals)
    n, k, j = 3, 1, 1

    Rs = np.zeros((m, n, n))
    Qs = np.zeros((m, k, k))
    As = np.zeros((m, n, n))
    Bs = np.zeros((m, n, k))
    Cs = np.zeros((m, n, j))

    for i in range(m):
        Rs[i, 0, 0] = f2_vals[i]
        Rs[i, 1, 0] = - f1_vals[i] / 2

        Rs[i, 0, 1] = - f1_vals[i] / 2
        Rs[i, 0, 2] = 1/2
        Rs[i, 2, 0] = 1/2

        Qs[i, 0, 0] = d_vals[i]

        As[i, 0, 0] = 1
        As[i, 1, 1] = 1
        As[i, 2, 1] = α0_vals[i]
        As[i, 2, 2] = ρ_vals[i]

        Bs[i, :, :] = np.array([[1, 0, 0]]).T

        Cs[i, :, :] = np.array([[0, 0, σ_vals[i]]]).T

    Ns = None
    k_star = None

    return Qs, Rs, Ns, As, Bs, Cs, k_star

state_vec2 = ["k", "constant term", "w"]

Only 𝑑𝑠𝑡 depends on 𝑠𝑡 .

run(construct_arrays2, {"d_vals":[1., 0.5]}, state_vec2)

symmetric Π case:


asymmetric Π case:


Only 𝑓1,𝑠𝑡 depends on 𝑠𝑡 .

run(construct_arrays2, {"f1_vals":[0.5, 1.]}, state_vec2)

symmetric Π case:


asymmetric Π case:


Only 𝑓2,𝑠𝑡 depends on 𝑠𝑡 .

run(construct_arrays2, {"f2_vals":[0.5, 1.]}, state_vec2)

symmetric Π case:


asymmetric Π case:


Only 𝛼0 (𝑠𝑡 ) depends on 𝑠𝑡 .

run(construct_arrays2, {"α0_vals":[0.5, 1.]}, state_vec2)

symmetric Π case:


asymmetric Π case:


Only 𝜌𝑠𝑡 depends on 𝑠𝑡 .

run(construct_arrays2, {"ρ_vals":[0.5, 0.9]}, state_vec2)

symmetric Π case:


asymmetric Π case:


Only 𝜎𝑠𝑡 depends on 𝑠𝑡 .

run(construct_arrays2, {"σ_vals":[0.5, 1.]}, state_vec2)

symmetric Π case:


asymmetric Π case:


8.7 More examples

The following lectures describe how Markov jump linear quadratic dynamic programming can be used to extend the
[Barro, 1979] model of optimal tax-smoothing and government debt in several interesting directions
1. How to Pay for a War: Part 1
2. How to Pay for a War: Part 2
3. How to Pay for a War: Part 3



CHAPTER NINE

HOW TO PAY FOR A WAR: PART 1

9.1 Overview

This lecture uses the method of Markov jump linear quadratic dynamic programming that is described in lecture
Markov Jump LQ dynamic programming to extend the [Barro, 1979] model of optimal tax-smoothing and government
debt in a particular direction.
This lecture has two sequels that offer further extensions of the Barro model
1. How to Pay for a War: Part 2
2. How to Pay for a War: Part 3
The extensions are modified versions of his 1979 model suggested by [Barro, 1999] and [Barro and McCleary, 2003].
[Barro, 1979] is about a government that borrows and lends in order to minimize an intertemporal measure of distortions
caused by taxes.
Technical tractability induced [Barro, 1979] to assume that
• the government trades only one-period risk-free debt, and
• the one-period risk-free interest rate is constant
By using Markov jump linear quadratic dynamic programming we can allow interest rates to move over time in empirically
interesting ways.
Also, by expanding the dimension of the state, we can add a maturity composition decision to the government’s problem.
By doing these two things we extend [Barro, 1979] along lines he suggested in [Barro, 1999] and [Barro and McCleary, 2003].
[Barro, 1979] assumed
• that a government faces an exogenous sequence of expenditures that it must finance by a tax collection sequence
whose expected present value equals the initial debt it owes plus the expected present value of those expenditures.

• that the government wants to minimize a measure of tax distortions that is proportional to $E_0 \sum_{t=0}^{\infty} \beta^t T_t^2$, where $T_t$ are total tax collections and $E_0$ is a mathematical expectation conditioned on time 0 information.
• that the government trades only one asset, a risk-free one-period bond.
• that the gross interest rate on the one-period bond is constant and equal to 𝛽 −1 , the reciprocal of the factor 𝛽 at
which the government discounts future tax distortions.
Barro’s model can be mapped into a discounted linear quadratic dynamic programming problem.
Partly inspired by [Barro, 1999] and [Barro and McCleary, 2003], our generalizations of [Barro, 1979], assume
• that the government borrows or saves in the form of risk-free bonds of maturities 1, 2, … , 𝐻.


• that interest rates on those bonds are time-varying and in particular, governed by a jointly stationary stochastic
process.
Our generalizations are designed to fit within a generalization of an ordinary linear quadratic dynamic programming
problem in which matrices that define the quadratic objective function and the state transition function are time-varying
and stochastic.
This generalization, known as a Markov jump linear quadratic dynamic program, combines
• the computational simplicity of linear quadratic dynamic programming, and
• the ability of finite state Markov chains to represent interesting patterns of random variation.
We want the stochastic time variation in the matrices defining the dynamic programming problem to represent variation
over time in
• interest rates
• default rates
• roll-over risks
As described in Markov Jump LQ dynamic programming, the idea underlying Markov jump linear quadratic dynamic
programming is to replace the constant matrices defining a linear quadratic dynamic programming problem with
matrices that are fixed functions of an 𝑁 state Markov chain.
For infinite horizon problems, this leads to 𝑁 interrelated matrix Riccati equations that pin down 𝑁 value functions and
𝑁 linear decision rules, applying to the 𝑁 Markov states.

9.2 Public Finance Questions

[Barro, 1979] is designed to answer questions such as


• Should a government finance an exogenous surge in government expenditures by raising taxes or borrowing?
• How does the answer to that first question depend on the exogenous stochastic process for government expenditures,
for example, on whether the surge in government expenditures can be expected to be temporary or permanent?
[Barro, 1999] and [Barro and McCleary, 2003] are designed to answer more fine-grained questions such as
• What determines whether a government wants to issue short-term or long-term debt?
• How do roll-over risks affect that decision?
• How does the government’s long-short portfolio management decision depend on features of the exogenous stochas-
tic process for government expenditures?
Thus, both the simple and the more fine-grained versions of Barro’s models are ways of precisely formulating the classic
issue of How to pay for a war.
This lecture describes:
• An application of Markov jump LQ dynamic programming to a model in which a government faces exogenous
time-varying interest rates for issuing one-period risk-free debt.
A sequel to this lecture applies Markov jump LQ control to settings in which a government issues risk-free debt of different maturities.

!pip install --upgrade quantecon

Let’s start with some standard imports:


import quantecon as qe
import numpy as np
import matplotlib.pyplot as plt

9.3 Barro (1979) Model

We begin by solving a version of [Barro, 1979] by mapping it into the original LQ framework.
As mentioned in this lecture, the Barro model is mathematically isomorphic with the LQ permanent income model.
Let
• 𝑇𝑡 denote tax collections
• 𝛽 be a discount factor
• 𝑏𝑡,𝑡+1 be time 𝑡 + 1 goods that at 𝑡 the government promises to deliver to time 𝑡 buyers of one-period government
bonds
• 𝐺𝑡 be government purchases
• 𝑝𝑡,𝑡+1 the number of time 𝑡 goods received per time 𝑡 + 1 goods promised to one-period bond purchasers.
Evidently, 𝑝𝑡,𝑡+1 is inversely related to appropriate corresponding gross interest rates on government debt.
In the spirit of [Barro, 1979], the stochastic process of government expenditures is exogenous.
The government’s problem is to choose a plan for taxation and borrowing $\{b_{t+1}, T_t\}_{t=0}^{\infty}$ to minimize

$$
E_0 \sum_{t=0}^{\infty} \beta^t T_t^2
$$

subject to the constraints

𝑇𝑡 + 𝑝𝑡,𝑡+1 𝑏𝑡,𝑡+1 = 𝐺𝑡 + 𝑏𝑡−1,𝑡

𝐺𝑡 = 𝑈𝑔 𝑧𝑡
𝑧𝑡+1 = 𝐴22 𝑧𝑡 + 𝐶2 𝑤𝑡+1
where 𝑤𝑡+1 ∼ 𝑁 (0, 𝐼)
The variables 𝑇𝑡 , 𝑏𝑡,𝑡+1 are control variables chosen at 𝑡, while 𝑏𝑡−1,𝑡 is an endogenous state variable inherited from the
past at time 𝑡 and 𝑝𝑡,𝑡+1 is an exogenous state variable at time 𝑡.
To begin, we assume that 𝑝𝑡,𝑡+1 is constant (and equal to 𝛽)
• later we will extend the model to allow 𝑝𝑡,𝑡+1 to vary over time
To map into the LQ framework, we use $x_t = \begin{bmatrix} b_{t-1,t} \\ z_t \end{bmatrix}$ as the state vector, and $u_t = b_{t,t+1}$ as the control variable.
Therefore, the $(A, B, C)$ matrices are defined by the state-transition law:

$$
x_{t+1} = \begin{bmatrix} 0 & 0 \\ 0 & A_{22} \end{bmatrix} x_t + \begin{bmatrix} 1 \\ 0 \end{bmatrix} u_t + \begin{bmatrix} 0 \\ C_2 \end{bmatrix} w_{t+1}
$$

To find the appropriate (𝑅, 𝑄, 𝑊 ) matrices, we note that 𝐺𝑡 and 𝑏𝑡−1,𝑡 can be written as appropriately defined functions
of the current state:

𝐺𝑡 = 𝑆𝐺 𝑥𝑡 , 𝑏𝑡−1,𝑡 = 𝑆1 𝑥𝑡


If we define 𝑀𝑡 = −𝑝𝑡,𝑡+1 , and let 𝑆 = 𝑆𝐺 + 𝑆1 , then we can write taxation as a function of the states and control using
the government’s budget constraint:

𝑇𝑡 = 𝑆𝑥𝑡 + 𝑀𝑡 𝑢𝑡

It follows that the (𝑅, 𝑄, 𝑊 ) matrices are implicitly defined by:

𝑇𝑡2 = 𝑥′𝑡 𝑆 ′ 𝑆𝑥𝑡 + 𝑢′𝑡 𝑀𝑡′ 𝑀𝑡 𝑢𝑡 + 2𝑢′𝑡 𝑀𝑡′ 𝑆𝑥𝑡

If we assume that 𝑝𝑡,𝑡+1 = 𝛽, then 𝑀𝑡 ≡ 𝑀 = −𝛽.


In this case, none of the LQ matrices are time varying, and we can use the original LQ framework.
We will implement this constant interest-rate version first, assuming that 𝐺𝑡 follows an AR(1) process:

𝐺𝑡+1 = 𝐺 ̄ + 𝜌𝐺𝑡 + 𝜎𝑤𝑡+1

To do this, we set $z_t = \begin{bmatrix} 1 \\ G_t \end{bmatrix}$, and consequently:

$$
A_{22} = \begin{bmatrix} 1 & 0 \\ \bar G & \rho \end{bmatrix}, \qquad C_2 = \begin{bmatrix} 0 \\ \sigma \end{bmatrix}
$$

# Model parameters
β, Gbar, ρ, σ = 0.95, 5, 0.8, 1

# Basic model matrices
A22 = np.array([[1, 0],
                [Gbar, ρ]])

C2 = np.array([[0],
[σ]])

Ug = np.array([[0, 1]])

# LQ framework matrices
A_t = np.zeros((1, 3))
A_b = np.hstack((np.zeros((2, 1)), A22))
A = np.vstack((A_t, A_b))

B = np.zeros((3, 1))
B[0, 0] = 1

C = np.vstack((np.zeros((1, 1)), C2))

Sg = np.hstack((np.zeros((1, 1)), Ug))


S1 = np.zeros((1, 3))
S1[0, 0] = 1
S = S1 + Sg

M = np.array([[-β]])

R = S.T @ S
Q = M.T @ M
W = M.T @ S

# Small penalty on the debt required to implement the no-Ponzi scheme
R[0, 0] = R[0, 0] + 1e-9


We can now create an instance of LQ:

LQBarro = qe.LQ(Q, R, A, B, C=C, N=W, beta=β)
P, F, d = LQBarro.stationary_values()
x0 = np.array([[100, 1, 25]])

We can see the isomorphism by noting that consumption is a martingale in the permanent income model and that taxation
is a martingale in Barro’s model.
We can check this using the 𝐹 matrix of the LQ model.
Because 𝑢𝑡 = −𝐹 𝑥𝑡 , we have

𝑇𝑡 = 𝑆𝑥𝑡 + 𝑀 𝑢𝑡 = (𝑆 − 𝑀 𝐹 )𝑥𝑡

and

𝑇𝑡+1 = (𝑆 − 𝑀 𝐹 )𝑥𝑡+1 = (𝑆 − 𝑀 𝐹 )(𝐴𝑥𝑡 + 𝐵𝑢𝑡 + 𝐶𝑤𝑡+1 ) = (𝑆 − 𝑀 𝐹 )((𝐴 − 𝐵𝐹 )𝑥𝑡 + 𝐶𝑤𝑡+1 )

Therefore, the mathematical expectation of 𝑇𝑡+1 conditional on time 𝑡 information is

𝐸𝑡 𝑇𝑡+1 = (𝑆 − 𝑀 𝐹 )(𝐴 − 𝐵𝐹 )𝑥𝑡

Consequently, taxation is a martingale (𝐸𝑡 𝑇𝑡+1 = 𝑇𝑡 ) if

(𝑆 − 𝑀 𝐹 )(𝐴 − 𝐵𝐹 ) = (𝑆 − 𝑀 𝐹 ),

which holds in this case:

S - M @ F, (S - M @ F) @ (A - B @ F)

(array([[ 0.05000002, 19.79166502,  0.2083334 ]]),
 array([[ 0.05000002, 19.79166504,  0.2083334 ]]))

This explains the fanning out of the conditional empirical distribution of taxation across time, computed by simulating
the Barro model many times and averaging over simulated paths:

T = 500
for i in range(250):
x, u, w = LQBarro.compute_sequence(x0, ts_length=T)
plt.plot(list(range(T+1)), ((S - M @ F) @ x)[0, :])
plt.xlabel('Time')
plt.ylabel('Taxation')
plt.show()


We can see a similar, but a smoother pattern, if we plot government debt over time.

T = 500
for i in range(250):
x, u, w = LQBarro.compute_sequence(x0, ts_length=T)
plt.plot(list(range(T+1)), x[0, :])
plt.xlabel('Time')
plt.ylabel('Govt Debt')
plt.show()


9.4 Python Class to Solve Markov Jump Linear Quadratic Control Problems

To implement the extension to the Barro model in which 𝑝𝑡,𝑡+1 varies over time, we must allow the 𝑀 matrix to be time-varying.

Our 𝑄 and 𝑊 matrices must also vary over time.

We can solve such a model using the LQMarkov class that solves Markov jump linear quadratic control problems as described above.
The code for the class can be viewed here.
The class takes lists of matrices that corresponds to 𝑁 Markov states.
The value and policy functions are then found by iterating on a coupled system of matrix Riccati difference equations.
Optimal 𝑃𝑠 , 𝐹𝑠 , 𝑑𝑠 are stored as attributes.
The class also contains a method that simulates a model.
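
As a compact reminder of the interface used throughout this lecture (a schematic with placeholder names for the per-state matrix lists, not new functionality):

# Build a Markov jump LQ problem from per-state matrices and solve it
lqm = qe.LQMarkov(Π, Qs, Rs, As, Bs, Cs=Cs, Ns=Ws, beta=β)
lqm.stationary_values()      # iterates the coupled Riccati equations
lqm.Ps, lqm.ds, lqm.Fs       # value function matrices, constants, decision rules
x, u, w, s = lqm.compute_sequence(x0, ts_length=250)   # simulate a path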


9.5 Barro Model with a Time-varying Interest Rate

We can use the above class to implement a version of the Barro model with a time-varying interest rate.
A simple way to extend the model is to allow the interest rate to take two possible values.
We set:

$$
p^1_{t,t+1} = \beta + 0.02 = 0.97
$$

$$
p^2_{t,t+1} = \beta - 0.017 = 0.933
$$
Thus, the first Markov state has a low interest rate and the second Markov state has a high interest rate.
We must also specify a transition matrix for the Markov state.
We use:

$$
\Pi = \begin{bmatrix} 0.8 & 0.2 \\ 0.2 & 0.8 \end{bmatrix}
$$

Here, each Markov state is persistent, and the probability of moving from one state to the other is the same in both states.
The choice of parameters means that the unconditional expectation of 𝑝𝑡,𝑡+1 is 0.9515, higher than 𝛽(= 0.95).
If we were to set 𝑝𝑡,𝑡+1 = 0.9515 in the version of the model with a constant interest rate, government debt would
explode.
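
As a quick arithmetic check (our own addition): the symmetric transition matrix has stationary distribution $(0.5, 0.5)$, so

# Unconditional mean of p_{t,t+1} under the (0.5, 0.5) stationary distribution
p_bar = 0.5 * (β + 0.02) + 0.5 * (β - 0.017)
print(p_bar)   # 0.9515, which exceeds β = 0.95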

# Create list of matrices that corresponds to each Markov state
Π = np.array([[0.8, 0.2],
              [0.2, 0.8]])

As = [A, A]
Bs = [B, B]
Cs = [C, C]
Rs = [R, R]

M1 = np.array([[-β - 0.02]])
M2 = np.array([[-β + 0.017]])

Q1 = M1.T @ M1
Q2 = M2.T @ M2
Qs = [Q1, Q2]
W1 = M1.T @ S
W2 = M2.T @ S
Ws = [W1, W2]

# create Markov Jump LQ DP problem instance
lqm = qe.LQMarkov(Π, Qs, Rs, As, Bs, Cs=Cs, Ns=Ws, beta=β)
lqm.stationary_values();

The decision rules are now dependent on the Markov state:

lqm.Fs[0]

array([[-0.98437712, 19.20516427, -0.8314215 ]])


lqm.Fs[1]

array([[-1.01434301, 21.5847983 , -0.83851116]])

Simulating a large number of such economies over time reveals interesting dynamics.
Debt tends to stay low and stable but recurrently surges.

T = 2000
x0 = np.array([[1000, 1, 25]])
for i in range(250):
x, u, w, s = lqm.compute_sequence(x0, ts_length=T)
plt.plot(list(range(T+1)), x[0, :])
plt.xlabel('Time')
plt.ylabel('Govt Debt')
plt.show()



CHAPTER TEN

HOW TO PAY FOR A WAR: PART 2

10.1 Overview

This lecture presents another application of Markov jump linear quadratic dynamic programming and constitutes a sequel
to an earlier lecture.
We use a method introduced in lecture Markov Jump LQ dynamic programming to implement suggestions by [Barro, 1999] and [Barro and McCleary, 2003] for extending his classic 1979 model of tax smoothing.
The [Barro, 1979] model is about a government that borrows and lends in order to help it minimize an intertemporal measure
of distortions caused by taxes.
Technically, [Barro, 1979] model looks a lot like a consumption-smoothing model.
Our generalizations of [Barro, 1979] will also look like souped-up consumption-smoothing models.
Wanting tractability induced [Barro, 1979] to assume that
• the government trades only one-period risk-free debt, and
• the one-period risk-free interest rate is constant
In our earlier lecture, we relaxed the second of these assumptions but not the first.
In particular, we used Markov jump linear quadratic dynamic programming to allow the exogenous interest rate to vary
over time.
In this lecture, we add a maturity composition decision to the government’s problem by expanding the dimension of the
state.
We assume
• that the government borrows or saves in the form of risk-free bonds of maturities 1, 2, … , 𝐻.
• that interest rates on those bonds are time-varying and in particular are governed by a jointly stationary stochastic
process.
In addition to what’s in Anaconda, this lecture deploys the quantecon library:

!pip install --upgrade quantecon

Let’s start with some standard imports:

import quantecon as qe
import numpy as np
import matplotlib.pyplot as plt


10.2 Two example specifications

We’ll describe two possible specifications


• In one, each period the government issues zero-coupon bonds of one- and two-period maturities and redeems them
only when they mature – in this version, the maturity structure of government debt at each date is partly inherited
from the past.
• In the second, the government redesigns the maturity structure of the debt each period.

10.3 One- and Two-period Bonds but No Restructuring

Let
• 𝑇𝑡 denote tax collections
• 𝛽 be a discount factor
• 𝑏𝑡,𝑡+1 be time 𝑡 + 1 goods that the government promises to pay at 𝑡
• 𝑏𝑡,𝑡+2 be time 𝑡 + 2 goods that the government promises to pay at time 𝑡
• 𝐺𝑡 be government purchases
• 𝑝𝑡,𝑡+1 be the number of time 𝑡 goods received per time 𝑡 + 1 goods promised
• 𝑝𝑡,𝑡+2 be the number of time 𝑡 goods received per time 𝑡 + 2 goods promised.
Evidently, 𝑝𝑡,𝑡+1 , 𝑝𝑡,𝑡+2 are inversely related to appropriate corresponding gross interest rates on government debt.
In the spirit of [Barro, 1979], government expenditures are governed by an exogenous stochastic process.
Given initial conditions $b_{-2,0}$, $b_{-1,0}$, $z_0$, $i_0$, where $i_0$ is the initial Markov state, the government chooses a contingency plan for $\{b_{t,t+1}, b_{t,t+2}, T_t\}_{t=0}^{\infty}$ to maximize

$$
-E_0 \sum_{t=0}^{\infty} \beta^t \left[ T_t^2 + c_1 (b_{t,t+1} - b_{t,t+2})^2 \right]
$$

subject to the constraints

$$
T_t = G_t + b_{t-2,t} + b_{t-1,t} - p_{t,t+2} b_{t,t+2} - p_{t,t+1} b_{t,t+1}
$$

$$
G_t = U_{g,s_t} z_t
$$

$$
z_{t+1} = A_{22,s_t} z_t + C_{2,s_t} w_{t+1}
$$

$$
\begin{bmatrix} p_{t,t+1} \\ p_{t,t+2} \\ U_{g,s_t} \\ A_{22,s_t} \\ C_{2,s_t} \end{bmatrix} \sim \text{functions of the Markov state with transition matrix } \Pi
$$

Here
• 𝑤𝑡+1 ∼ 𝑁 (0, 𝐼) and Π𝑖𝑗 is the probability that the Markov state moves from state 𝑖 to state 𝑗 in one period
• 𝑇𝑡 , 𝑏𝑡,𝑡+1 , 𝑏𝑡,𝑡+2 are control variables chosen at time 𝑡
• variables 𝑏𝑡−1,𝑡 , 𝑏𝑡−2,𝑡 are endogenous state variables inherited from the past at time 𝑡
• 𝑝𝑡,𝑡+1 , 𝑝𝑡,𝑡+2 are exogenous state variables at time 𝑡


The parameter 𝑐1 imposes a penalty on the government’s issuing different quantities of one and two-period debt.
This penalty deters the government from taking large “long-short” positions in debt of different maturities.
An example below will show the penalty in action.
As well as extending the model to allow for a maturity decision for government debt, we can also in principle allow the
matrices 𝑈𝑔,𝑠𝑡 , 𝐴22,𝑠𝑡 , 𝐶2,𝑠𝑡 to depend on the Markov state 𝑠𝑡 .
Below, we will often adopt the convention that for matrices appearing in a linear state space, 𝐴𝑡 ≡ 𝐴𝑠𝑡 , 𝐶𝑡 ≡ 𝐶𝑠𝑡 and
so on, so that dependence on 𝑡 is always intermediated through the Markov state 𝑠𝑡 .

10.4 Mapping into an LQ Markov Jump Problem

First, define

𝑏̂𝑡 = 𝑏𝑡−1,𝑡 + 𝑏𝑡−2,𝑡 ,

which is debt due at time 𝑡.


Then define the endogenous part of the state:

$$
\bar b_t = \begin{bmatrix} \hat b_t \\ b_{t-1,t+1} \end{bmatrix}
$$

and the complete state vector

$$
x_t = \begin{bmatrix} \bar b_t \\ z_t \end{bmatrix}
$$

and the control vector


$$
u_t = \begin{bmatrix} b_{t,t+1} \\ b_{t,t+2} \end{bmatrix}
$$

The endogenous part of state vector follows the law of motion:

$$
\begin{bmatrix} \hat b_{t+1} \\ b_{t,t+2} \end{bmatrix}
= \begin{bmatrix} 0 & 1 \\ 0 & 0 \end{bmatrix} \begin{bmatrix} \hat b_t \\ b_{t-1,t+1} \end{bmatrix}
+ \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix} \begin{bmatrix} b_{t,t+1} \\ b_{t,t+2} \end{bmatrix}
$$

or

$$
\bar b_{t+1} = A_{11} \bar b_t + B_1 u_t
$$

Define the following functions of the state

𝐺𝑡 = 𝑆𝐺,𝑡 𝑥𝑡 , 𝑏̂𝑡 = 𝑆1 𝑥𝑡

and

𝑀𝑡 = [−𝑝𝑡,𝑡+1 −𝑝𝑡,𝑡+2 ]

where 𝑝𝑡,𝑡+1 is the discount on one period loans in the discrete Markov state at time 𝑡 and 𝑝𝑡,𝑡+2 is the discount on
two-period loans in the discrete Markov state.
Define

𝑆𝑡 = 𝑆𝐺,𝑡 + 𝑆1


Note that in discrete Markov state 𝑖

𝑇𝑡 = 𝑀𝑡 𝑢𝑡 + 𝑆𝑡 𝑥𝑡

It follows that

𝑇𝑡2 = 𝑥′𝑡 𝑆𝑡′ 𝑆𝑡 𝑥𝑡 + 𝑢′𝑡 𝑀𝑡′ 𝑀𝑡 𝑢𝑡 + 2𝑢′𝑡 𝑀𝑡′ 𝑆𝑡 𝑥𝑡

or

𝑇𝑡2 = 𝑥′𝑡 𝑅𝑡 𝑥𝑡 + 𝑢′𝑡 𝑄𝑡 𝑢𝑡 + 2𝑢′𝑡 𝑊𝑡 𝑥𝑡

where

𝑅𝑡 = 𝑆𝑡′ 𝑆𝑡 , 𝑄𝑡 = 𝑀𝑡′ 𝑀𝑡 , 𝑊𝑡 = 𝑀𝑡′ 𝑆𝑡

Because the payoff function also includes the penalty parameter on issuing debt of different maturities, we have:

𝑇𝑡2 + 𝑐1 (𝑏𝑡,𝑡+1 − 𝑏𝑡,𝑡+2 )2 = 𝑥′𝑡 𝑅𝑡 𝑥𝑡 + 𝑢′𝑡 𝑄𝑡 𝑢𝑡 + 2𝑢′𝑡 𝑊𝑡 𝑥𝑡 + 𝑐1 𝑢′𝑡 𝑄𝑐 𝑢𝑡

where $Q^c = \begin{bmatrix} 1 & -1 \\ -1 & 1 \end{bmatrix}$.
Therefore, the appropriate 𝑄 matrix in the Markov jump LQ problem is:

𝑄𝑐𝑡 = 𝑄𝑡 + 𝑐1 𝑄𝑐

The law of motion of the state in all discrete Markov states 𝑖 is

𝑥𝑡+1 = 𝐴𝑡 𝑥𝑡 + 𝐵𝑢𝑡 + 𝐶𝑡 𝑤𝑡+1

where
$$
A_t = \begin{bmatrix} A_{11} & 0 \\ 0 & A_{22,t} \end{bmatrix}, \qquad
B = \begin{bmatrix} B_1 \\ 0 \end{bmatrix}, \qquad
C_t = \begin{bmatrix} 0 \\ C_{2,t} \end{bmatrix}
$$

Thus, in this problem all the matrices apart from 𝐵 may depend on the Markov state at time 𝑡.
As shown in the previous lecture, when provided with appropriate 𝐴, 𝐵, 𝐶, 𝑅, 𝑄, 𝑊 matrices for each Markov state the
LQMarkov class can solve Markov jump LQ problems.
The function below maps the primitive matrices and parameters from the above two-period model into the matrices that
the LQMarkov class requires:

def LQ_markov_mapping(A22, C2, Ug, p1, p2, c1=0):
    """
    Function which takes A22, C2, Ug, p_{t, t+1}, p_{t, t+2} and penalty
    parameter c1, and returns the required matrices for the LQMarkov
    model: A, B, C, R, Q, W.
    This version uses the condensed version of the endogenous state.
    """

    # Make sure all matrices can be treated as 2D arrays
    A22 = np.atleast_2d(A22)
    C2 = np.atleast_2d(C2)
    Ug = np.atleast_2d(Ug)
    p1 = np.atleast_2d(p1)

    p2 = np.atleast_2d(p2)

    # Find the number of states (z) and shocks (w)
    nz, nw = C2.shape

    # Create A11, B1, S1, S2, Sg, S matrices
    A11 = np.zeros((2, 2))
    A11[0, 1] = 1

    B1 = np.eye(2)

    S1 = np.hstack((np.eye(1), np.zeros((1, nz+1))))
    Sg = np.hstack((np.zeros((1, 2)), Ug))
    S = S1 + Sg

    # Create M matrix
    M = np.hstack((-p1, -p2))

    # Create A, B, C matrices
    A_T = np.hstack((A11, np.zeros((2, nz))))
    A_B = np.hstack((np.zeros((nz, 2)), A22))
    A = np.vstack((A_T, A_B))

    B = np.vstack((B1, np.zeros((nz, 2))))

    C = np.vstack((np.zeros((2, nw)), C2))

    # Create Q^c matrix
    Qc = np.array([[1, -1], [-1, 1]])

    # Create R, Q, W matrices
    R = S.T @ S
    Q = M.T @ M + c1 * Qc
    W = M.T @ S

    return A, B, C, R, Q, W

With the above function, we can proceed to solve the model in two steps:
1. Use LQ_markov_mapping to map 𝑈𝑔,𝑡 , 𝐴22,𝑡 , 𝐶2,𝑡 , 𝑝𝑡,𝑡+1 , 𝑝𝑡,𝑡+2 into the 𝐴, 𝐵, 𝐶, 𝑅, 𝑄, 𝑊 matrices for each
of the 𝑛 Markov states.
2. Use the LQMarkov class to solve the resulting n-state Markov jump LQ problem.

10.5 Penalty on Different Issues Across Maturities

To implement a simple example of the two-period model, we assume that 𝐺𝑡 follows an AR(1) process:

𝐺𝑡+1 = 𝐺 ̄ + 𝜌𝐺𝑡 + 𝜎𝑤𝑡+1

To do this, we set $z_t = \begin{bmatrix} 1 \\ G_t \end{bmatrix}$, and consequently:

$$
A_{22} = \begin{bmatrix} 1 & 0 \\ \bar G & \rho \end{bmatrix}, \qquad
C_2 = \begin{bmatrix} 0 \\ \sigma \end{bmatrix}, \qquad
U_g = \begin{bmatrix} 0 & 1 \end{bmatrix}
$$


Therefore, in this example, 𝐴22 , 𝐶2 and 𝑈𝑔 are not time-varying.


We will assume that there are two Markov states, one with a flatter yield curve, and one with a steeper yield curve.
In state 1, prices are:

$$
p^1_{t,t+1} = \beta, \qquad p^1_{t,t+2} = \beta^2 - 0.02
$$

and in state 2, prices are:

$$
p^2_{t,t+1} = \beta, \qquad p^2_{t,t+2} = \beta^2 + 0.02
$$

We first solve the model with no penalty parameter on different issuance across maturities, i.e. 𝑐1 = 0.
We specify that the transition matrix for the Markov state is

$$
\Pi = \begin{bmatrix} 0.9 & 0.1 \\ 0.1 & 0.9 \end{bmatrix}
$$

Thus, each Markov state is persistent, and there is an equal chance of moving from one to the other.

# Model parameters
β, Gbar, ρ, σ, c1 = 0.95, 5, 0.8, 1, 0
p1, p2, p3, p4 = β, β**2 - 0.02, β, β**2 + 0.02

# Basic model matrices


A22 = np.array([[1, 0], [Gbar, ρ] ,])
C_2 = np.array([[0], [σ]])
Ug = np.array([[0, 1]])

A1, B1, C1, R1, Q1, W1 = LQ_markov_mapping(A22, C_2, Ug, p1, p2, c1)
A2, B2, C2, R2, Q2, W2 = LQ_markov_mapping(A22, C_2, Ug, p3, p4, c1)

# Small penalties on debt required to implement no-Ponzi scheme


R1[0, 0] = R1[0, 0] + 1e-9
R2[0, 0] = R2[0, 0] + 1e-9

# Construct lists of matrices correspond to each state


As = [A1, A2]
Bs = [B1, B2]
Cs = [C1, C2]
Rs = [R1, R2]
Qs = [Q1, Q2]
Ws = [W1, W2]

Π = np.array([[0.9, 0.1],
[0.1, 0.9]])

# Construct and solve the model using the LQMarkov class


lqm = qe.LQMarkov(Π, Qs, Rs, As, Bs, Cs=Cs, Ns=Ws, beta=β)
lqm.stationary_values()

# Simulate the model


x0 = np.array([[100, 50, 1, 10]])
x, u, w, t = lqm.compute_sequence(x0, ts_length=300)

# Plot of one and two-period debt issuance


fig, (ax1, ax2) = plt.subplots(1, 2, figsize=(12, 4))
ax1.plot(u[0, :])

ax1.set_title('One-period debt issuance')
ax1.set_xlabel('Time')
ax2.plot(u[1, :])
ax2.set_title('Two-period debt issuance')
ax2.set_xlabel('Time')
plt.show()

The above simulations show that when no penalty is imposed on different issuances across maturities, the government has
an incentive to take large “long-short” positions in debt of different maturities.
To prevent such outcomes, we set 𝑐1 = 0.01.
This penalty is big enough to motivate the government to issue positive quantities of both one- and two-period debt:

# Put small penalty on different issuance across maturities


c1 = 0.01

A1, B1, C1, R1, Q1, W1 = LQ_markov_mapping(A22, C_2, Ug, p1, p2, c1)
A2, B2, C2, R2, Q2, W2 = LQ_markov_mapping(A22, C_2, Ug, p3, p4, c1)

# Small penalties on debt required to implement no-Ponzi scheme


R1[0, 0] = R1[0, 0] + 1e-9
R2[0, 0] = R2[0, 0] + 1e-9

# Construct lists of matrices


As = [A1, A2]
Bs = [B1, B2]
Cs = [C1, C2]
Rs = [R1, R2]
Qs = [Q1, Q2]
Ws = [W1, W2]

# Construct and solve the model using the LQMarkov class


lqm2 = qe.LQMarkov(Π, Qs, Rs, As, Bs, Cs=Cs, Ns=Ws, beta=β)
lqm2.stationary_values()

# Simulate the model


x, u, w, t = lqm2.compute_sequence(x0, ts_length=300)

# Plot of one and two-period debt issuance



fig, (ax1, ax2) = plt.subplots(1, 2, figsize=(12, 4))
ax1.plot(u[0, :])
ax1.set_title('One-period debt issuance')
ax1.set_xlabel('Time')
ax2.plot(u[1, :])
ax2.set_title('Two-period debt issuance')
ax2.set_xlabel('Time')
plt.show()

10.6 A Model with Restructuring

We now alter two features of the previous model:


1. The maximum horizon of government debt is now extended to a general H periods.
2. The government is able to redesign the maturity structure of debt every period.
We impose a cost on adjusting issuance of each maturity by amending the payoff function to become:
$$
T_t^2 + \sum_{j=0}^{H-1} c_2 \left( b^{t-1}_{t+j} - b^t_{t+j+1} \right)^2
$$

The government’s budget constraint is now:

$$
T_t + \sum_{j=1}^{H} p_{t,t+j} b^t_{t+j} = b^{t-1}_t + \sum_{j=1}^{H-1} p_{t,t+j} b^{t-1}_{t+j} + G_t
$$

To map this into the Markov Jump LQ framework, we define state and control variables.
Let:
$$
\bar b_t = \begin{bmatrix} b^{t-1}_t \\ b^{t-1}_{t+1} \\ \vdots \\ b^{t-1}_{t+H-1} \end{bmatrix}, \qquad
u_t = \begin{bmatrix} b^t_{t+1} \\ b^t_{t+2} \\ \vdots \\ b^t_{t+H} \end{bmatrix}
$$

Thus, 𝑏̄𝑡 is the endogenous state (debt issued last period) and 𝑢𝑡 is the control (debt issued today).
As before, we will also have the exogenous state 𝑧𝑡 , which determines government spending.


Therefore, the full state is:

$$
x_t = \begin{bmatrix} \bar b_t \\ z_t \end{bmatrix}
$$

We also define a vector $p_t$ that contains the time $t$ price of goods in period $t + j$:

$$
p_t = \begin{bmatrix} p_{t,t+1} \\ p_{t,t+2} \\ \vdots \\ p_{t,t+H} \end{bmatrix}
$$

Finally, we define three useful matrices $S_s, S_x, \tilde S_x$:

$$
\begin{bmatrix} p_{t,t+1} \\ p_{t,t+2} \\ \vdots \\ p_{t,t+H-1} \end{bmatrix} = S_s p_t
\quad \text{where} \quad
S_s = \begin{bmatrix} 1 & 0 & 0 & \cdots & 0 \\ 0 & 1 & 0 & \cdots & 0 \\ \vdots & & \ddots & & \vdots \\ 0 & 0 & \cdots & 1 & 0 \end{bmatrix}
$$

$$
\begin{bmatrix} b^{t-1}_{t+1} \\ b^{t-1}_{t+2} \\ \vdots \\ b^{t-1}_{t+H-1} \end{bmatrix} = S_x \bar b_t
\quad \text{where} \quad
S_x = \begin{bmatrix} 0 & 1 & 0 & \cdots & 0 \\ 0 & 0 & 1 & \cdots & 0 \\ \vdots & & & \ddots & \vdots \\ 0 & 0 & \cdots & 0 & 1 \end{bmatrix}
$$

$$
b^{t-1}_t = \tilde S_x \bar b_t
\quad \text{where} \quad
\tilde S_x = \begin{bmatrix} 1 & 0 & 0 & \cdots & 0 \end{bmatrix}
$$

In terms of dimensions, the first two matrices defined above are $(H-1) \times H$.

The last is $1 \times H$.
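
For concreteness, here is a small illustration (our own, with a hypothetical horizon H_demo = 3) of how these selector matrices can be built in NumPy; the same construction appears inside LQ_markov_mapping_restruct below.

# Selector matrices for a maturity horizon of H_demo = 3 (illustrative only)
H_demo = 3
Ss  = np.hstack((np.eye(H_demo - 1), np.zeros((H_demo - 1, 1))))   # (H-1) x H, keeps the first H-1 prices
Sx  = np.hstack((np.zeros((H_demo - 1, 1)), np.eye(H_demo - 1)))   # (H-1) x H, drops debt due today
tSx = np.zeros((1, H_demo))
tSx[0, 0] = 1                                                      # 1 x H, picks out debt due today
print(Ss.shape, Sx.shape, tSx.shape)   # (2, 3) (2, 3) (1, 3)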
We can now write the government’s budget constraint in matrix notation.
We can rearrange the government budget constraint to become
$$
T_t = b^{t-1}_t + \sum_{j=1}^{H-1} p_{t,t+j} b^{t-1}_{t+j} + G_t - \sum_{j=1}^{H} p_{t,t+j} b^t_{t+j}
$$

or

𝑇𝑡 = 𝑆𝑥̃ 𝑏̄𝑡 + (𝑆𝑠 𝑝𝑡 ) ⋅ (𝑆𝑥 𝑏̄𝑡 ) + 𝑈𝑔 𝑧𝑡 − 𝑝𝑡 ⋅ 𝑢𝑡

To express $T_t$ as a function of the full state, we have

$$
T_t = \begin{bmatrix} (\tilde S_x + p_t' S_s' S_x) & U_g \end{bmatrix} x_t - p_t' u_t
$$

To simplify the notation, let $S_t = \begin{bmatrix} (\tilde S_x + p_t' S_s' S_x) & U_g \end{bmatrix}$.


Then

𝑇𝑡 = 𝑆𝑡 𝑥𝑡 − 𝑝𝑡′ 𝑢𝑡

Therefore

𝑇𝑡2 = 𝑥′𝑡 𝑅𝑡 𝑥𝑡 + 𝑢′𝑡 𝑄𝑡 𝑢𝑡 + 2𝑢′𝑡 𝑊𝑡 𝑥𝑡

where

𝑅𝑡 = 𝑆𝑡′ 𝑆𝑡 , 𝑄𝑡 = 𝑝𝑡 𝑝𝑡′ , 𝑊𝑡 = −𝑝𝑡 𝑆𝑡


where to economize on notation we adopt the convention that for the linear state matrices $R_t \equiv R_{s_t}$, $Q_t \equiv Q_{s_t}$ and so on.
We’ll use this convention for the linear state matrices 𝐴, 𝐵, 𝑊 and so on below.
Because the payoff function also includes the penalty parameter for rescheduling, we have:
$$
T_t^2 + \sum_{j=0}^{H-1} c_2 \left( b^{t-1}_{t+j} - b^t_{t+j+1} \right)^2 = T_t^2 + c_2 (\bar b_t - u_t)'(\bar b_t - u_t)
$$

Because the complete state is 𝑥𝑡 and not 𝑏̄𝑡 , we rewrite this as:

𝑇𝑡2 + 𝑐2 (𝑆𝑐 𝑥𝑡 − 𝑢𝑡 )′ (𝑆𝑐 𝑥𝑡 − 𝑢𝑡 )

where 𝑆𝑐 = [𝐼 0]
Multiplying this out gives:

𝑇𝑡2 + 𝑐2 𝑥′𝑡 𝑆𝑐′ 𝑆𝑐 𝑥𝑡 − 2𝑐2 𝑢′𝑡 𝑆𝑐 𝑥𝑡 + 𝑐2 𝑢′𝑡 𝑢𝑡

Therefore, with the cost term, we must amend our 𝑅, 𝑄, 𝑊 matrices as follows:

𝑅𝑡𝑐 = 𝑅𝑡 + 𝑐2 𝑆𝑐′ 𝑆𝑐

𝑄𝑐𝑡 = 𝑄𝑡 + 𝑐2 𝐼

𝑊𝑡𝑐 = 𝑊𝑡 − 𝑐2 𝑆𝑐
To finish mapping into the Markov jump LQ setup, we need to construct the law of motion for the full state.
This is simpler than in the previous setup, as we now have 𝑏̄𝑡+1 = 𝑢𝑡 .
Therefore:
$$
x_{t+1} \equiv \begin{bmatrix} \bar b_{t+1} \\ z_{t+1} \end{bmatrix} = A_t x_t + B u_t + C_t w_{t+1}
$$

where

$$
A_t = \begin{bmatrix} 0 & 0 \\ 0 & A_{22,t} \end{bmatrix}, \qquad
B = \begin{bmatrix} I \\ 0 \end{bmatrix}, \qquad
C_t = \begin{bmatrix} 0 \\ C_{2,t} \end{bmatrix}
$$

This completes the mapping into a Markov jump LQ problem.

10.7 Restructuring as a Markov Jump Linear Quadratic Control Problem

We can define a function that maps the primitives of the model with restructuring into the matrices required by the
LQMarkov class:

def LQ_markov_mapping_restruct(A22, C2, Ug, T, p_t, c=0):
    """
    Function which takes A22, C2, T, p_t, c and returns the
    required matrices for the LQMarkov model: A, B, C, R, Q, W
    Note, p_t should be a T by 1 matrix
    c is the rescheduling cost (a scalar)
    This version uses the condensed version of the endogenous state
    """

    # Make sure all matrices can be treated as 2D arrays
    A22 = np.atleast_2d(A22)
    C2 = np.atleast_2d(C2)
    Ug = np.atleast_2d(Ug)
    p_t = np.atleast_2d(p_t)

    # Find the number of states (z) and shocks (w)
    nz, nw = C2.shape

    # Create Sx, tSx, Ss, S_t matrices (tSx stands for \tilde S_x)
    Ss = np.hstack((np.eye(T-1), np.zeros((T-1, 1))))
    Sx = np.hstack((np.zeros((T-1, 1)), np.eye(T-1)))
    tSx = np.zeros((1, T))
    tSx[0, 0] = 1

    S_t = np.hstack((tSx + p_t.T @ Ss.T @ Sx, Ug))

    # Create A, B, C matrices
    A_T = np.hstack((np.zeros((T, T)), np.zeros((T, nz))))
    A_B = np.hstack((np.zeros((nz, T)), A22))
    A = np.vstack((A_T, A_B))

    B = np.vstack((np.eye(T), np.zeros((nz, T))))
    C = np.vstack((np.zeros((T, nw)), C2))

    # Create cost matrix Sc
    Sc = np.hstack((np.eye(T), np.zeros((T, nz))))

    # Create R_t, Q_t, W_t matrices
    R_c = S_t.T @ S_t + c * Sc.T @ Sc
    Q_c = p_t @ p_t.T + c * np.eye(T)
    W_c = -p_t @ S_t - c * Sc

    return A, B, C, R_c, Q_c, W_c

10.7.1 Example with Restructuring

As an example let 𝐻 = 3.
Assume that there are two Markov states, one with a flatter yield curve, the other with a steeper yield curve.
In state 1, prices are:

$$
p^1_{t,t+1} = 0.9695, \qquad p^1_{t,t+2} = 0.902, \qquad p^1_{t,t+3} = 0.8369
$$

and in state 2, prices are:

$$
p^2_{t,t+1} = 0.9295, \qquad p^2_{t,t+2} = 0.902, \qquad p^2_{t,t+3} = 0.8769
$$

We specify the same transition matrix and 𝐺𝑡 process that we used earlier.


# New model parameters


H = 3
p1 = np.array([[0.9695], [0.902], [0.8369]])
p2 = np.array([[0.9295], [0.902], [0.8769]])
Pi = np.array([[0.9, 0.1], [0.1, 0.9]])

# Put penalty on different issuance across maturities


c2 = 0.5

A1, B1, C1, R1, Q1, W1 = LQ_markov_mapping_restruct(A22, C_2, Ug, H, p1, c2)
A2, B2, C2, R2, Q2, W2 = LQ_markov_mapping_restruct(A22, C_2, Ug, H, p2, c2)

# Small penalties on debt required to implement no-Ponzi scheme


R1[0, 0] = R1[0, 0] + 1e-9
R1[1, 1] = R1[1, 1] + 1e-9
R1[2, 2] = R1[2, 2] + 1e-9
R2[0, 0] = R2[0, 0] + 1e-9
R2[1, 1] = R2[1, 1] + 1e-9
R2[2, 2] = R2[2, 2] + 1e-9

# Construct lists of matrices


As = [A1, A2]
Bs = [B1, B2]
Cs = [C1, C2]
Rs = [R1, R2]
Qs = [Q1, Q2]
Ws = [W1, W2]

# Construct and solve the model using the LQMarkov class


lqm3 = qe.LQMarkov(Π, Qs, Rs, As, Bs, Cs=Cs, Ns=Ws, beta=β)
lqm3.stationary_values()

x0 = np.array([[5000, 5000, 5000, 1, 10]])


x, u, w, t = lqm3.compute_sequence(x0, ts_length=300)

# Plots of different maturities debt issuance

fig, (ax1, ax2, ax3, ax4) = plt.subplots(1, 4, figsize=(11, 3))


ax1.plot(u[0, :])
ax1.set_title('One-period debt issuance')
ax1.set_xlabel('Time')
ax2.plot(u[1, :])
ax2.set_title('Two-period debt issuance')
ax2.set_xlabel('Time')
ax3.plot(u[2, :])
ax3.set_title('Three-period debt issuance')
ax3.set_xlabel('Time')
ax4.plot(u[0, :] + u[1, :] + u[2, :])
ax4.set_title('Total debt issuance')
ax4.set_xlabel('Time')
plt.tight_layout()
plt.show()


# Plot share of debt issuance that is short-term

fig, ax = plt.subplots()
ax.plot((u[0, :] / (u[0, :] + u[1, :] + u[2, :])))
ax.set_title('One-period debt issuance share')
ax.set_xlabel('Time')
plt.show()



CHAPTER ELEVEN

HOW TO PAY FOR A WAR: PART 3

11.1 Overview

This lecture presents another application of Markov jump linear quadratic dynamic programming and constitutes a sequel
to an earlier lecture.
We again use a method introduced in lecture Markov Jump LQ dynamic programming to implement some ideas of [Barro, 1999] and [Barro and McCleary, 2003] that extend the classic [Barro, 1979] model of tax smoothing.
[Barro, 1979] is about a government that borrows and lends in order to help it minimize an intertemporal measure of
distortions caused by taxes.
Technically, [Barro, 1979] looks a lot like a consumption-smoothing model.
Our generalization will also look like a souped-up consumption-smoothing model.
In this lecture, we describe a tax-smoothing problem of a government that faces roll-over risk.
In addition to what’s in Anaconda, this lecture deploys the quantecon library:

!pip install --upgrade quantecon

Let’s start with some standard imports:

import quantecon as qe
import numpy as np
import matplotlib.pyplot as plt

11.2 Roll-Over Risk

Let $T_t$ denote tax collections, $\beta$ a discount factor, $b_{t,t+1}$ time $t+1$ goods that the government promises to pay at $t$, $G_t$ government purchases, $p^t_{t+1}$ the number of time $t$ goods received per time $t+1$ goods promised.
The stochastic process of government expenditures is exogenous.
The government’s problem is to choose a plan for borrowing and tax collections $\{b_{t+1}, T_t\}_{t=0}^{\infty}$ to minimize

$$
E_0 \sum_{t=0}^{\infty} \beta^t T_t^2
$$

subject to the constraints


$$
T_t + p^t_{t+1} b_{t,t+1} = G_t + b_{t-1,t}
$$


𝐺𝑡 = 𝑈𝑔,𝑡 𝑧𝑡

𝑧𝑡+1 = 𝐴22,𝑡 𝑧𝑡 + 𝐶2,𝑡 𝑤𝑡+1


where 𝑤𝑡+1 ∼ 𝑁 (0, 𝐼).
Let
• 𝑇𝑡 , 𝑏𝑡,𝑡+1 be controls chosen at 𝑡
• 𝑏𝑡−1,𝑡 be an endogenous state variable inherited from the past at time 𝑡
• $p^t_{t+1}$ be an exogenous price at time $t$.
This is the same set-up as used in this lecture.
We will consider a situation in which the government faces “roll-over risk”.
Specifically, we shut down the government’s ability to borrow in one of the Markov states.

11.3 A Dead End


A first thought for how to implement this might be to allow $p^t_{t+1}$ to vary over time with:

$$
p^t_{t+1} = \beta
$$

in Markov state 1 and


$$
p^t_{t+1} = 0
$$

in Markov state 2.
Consequently, in the second Markov state, the government is unable to borrow, and the budget constraint becomes 𝑇𝑡 =
𝐺𝑡 + 𝑏𝑡−1,𝑡 .
However, if this is the only adjustment we make in our linear-quadratic model, the government will not set 𝑏𝑡,𝑡+1 = 0,
which is the outcome we want to express roll-over risk in period 𝑡.
Instead, the government would have an incentive to set 𝑏𝑡,𝑡+1 to a large negative number in state 2 – it would accumulate
large amounts of assets to bring into period 𝑡 + 1 because that is cheap
• Riccati equations will tell us this
Thus, we must represent “roll-over risk” some other way.

11.4 Better Representation of Roll-Over Risk

To force the government to set 𝑏𝑡,𝑡+1 = 0, we can instead extend the model to have four Markov states:
1. Good today, good yesterday
2. Good today, bad yesterday
3. Bad today, good yesterday
4. Bad today, bad yesterday


where good is a state in which effectively the government can issue debt and bad is a state in which effectively the
government can’t issue debt.
We’ll explain what effectively means shortly.
We now set

$$
p^t_{t+1} = \beta
$$

in all states.
In addition – and this is important because it defines what we mean by effectively – we put a large penalty on the 𝑏𝑡−1,𝑡
element of the state vector in states 2 and 4.
This will prevent the government from wishing to issue any debt in states 3 or 4 because it would experience a large
penalty from doing so in the next period.
The transition matrix for this formulation is:

$$
\Pi = \begin{bmatrix}
0.95 & 0 & 0.05 & 0 \\
0.95 & 0 & 0.05 & 0 \\
0 & 0.9 & 0 & 0.1 \\
0 & 0.9 & 0 & 0.1
\end{bmatrix}
$$
This transition matrix ensures that the Markov state cannot move, for example, from state 3 to state 1.
Because state 3 is “bad today”, the next period cannot have “good yesterday”.
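
Here is a quick sanity check of that structure (our own addition; the name Π_check is just for this snippet, and the same matrix is constructed as Π in the code below):

# Rows index today's state 1-4, columns tomorrow's state.
# Row 3 ("bad today, good yesterday") puts no mass on state 1,
# since tomorrow cannot have "good yesterday".
Π_check = np.array([[0.95, 0,   0.05, 0  ],
                    [0.95, 0,   0.05, 0  ],
                    [0,    0.9, 0,    0.1],
                    [0,    0.9, 0,    0.1]])
print(Π_check[2])   # [0.  0.9 0.  0.1]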

# Model parameters
β, Gbar, ρ, σ = 0.95, 5, 0.8, 1

# Basic model matrices


A22 = np.array([[1, 0], [Gbar, ρ], ])
C2 = np.array([[0], [σ]])
Ug = np.array([[0, 1]])

# LQ framework matrices
A_t = np.zeros((1, 3))
A_b = np.hstack((np.zeros((2, 1)), A22))
A = np.vstack((A_t, A_b))

B = np.zeros((3, 1))
B[0, 0] = 1

C = np.vstack((np.zeros((1, 1)), C2))

Sg = np.hstack((np.zeros((1, 1)), Ug))


S1 = np.zeros((1, 3))
S1[0, 0] = 1
S = S1 + Sg

R = S.T @ S

# Large penalty on debt in R2 to prevent borrowing in a bad state


R1 = np.copy(R)
R2 = np.copy(R)
R1[0, 0] = R[0, 0] + 1e-9
R2[0, 0] = R[0, 0] + 1e12

M = np.array([[-β]])

Q = M.T @ M
W = M.T @ S

Π = np.array([[0.95, 0, 0.05, 0],


[0.95, 0, 0.05, 0],
[0, 0.9, 0, 0.1],
[0, 0.9, 0, 0.1]])

# Construct lists of matrices that correspond to each state


As = [A, A, A, A]
Bs = [B, B, B, B]
Cs = [C, C, C, C]
Rs = [R1, R2, R1, R2]
Qs = [Q, Q, Q, Q]
Ws = [W, W, W, W]

lqm = qe.LQMarkov(Π, Qs, Rs, As, Bs, Cs=Cs, Ns=Ws, beta=β)


lqm.stationary_values();

Using the same process for 𝐺𝑡 as in this lecture, we shall simulate our model with roll-over risk.
When $p^t_{t+1} = \beta$, government debt fluctuates around zero.
The spikes in the tax collection series indicate periods when the government is unable to access financial markets:
• positive spikes occur when debt is positive and the government must urgently raise tax revenues now
• negative spikes occur when the government has positive asset holdings; an inability to use financial markets in the next period means that the government uses those assets to lower taxation today

x0 = np.array([[0, 1, 25]])
T = 300
x, u, w, state = lqm.compute_sequence(x0, ts_length=T)

# Calculate taxation each period from the budget constraint and the Markov state
tax = np.zeros([T, 1])
for i in range(T):
tax[i, :] = S @ x[:, i] + M @ u[:, i]

# Plot of debt issuance and taxation


fig, (ax1, ax2) = plt.subplots(1, 2, figsize=(12, 3))
ax1.plot(x[0, :])
ax1.set_title('One-period debt issuance')
ax1.set_xlabel('Time')
ax2.plot(tax)
ax2.set_title('Taxation')
ax2.set_xlabel('Time')
plt.show()


We can adjust parameters so that, rather than debt fluctuating around zero, the government is a debtor in every period that it can borrow.

To accomplish this, we simply raise $p^t_{t+1}$ to $\beta + 0.02 = 0.97$.

M = np.array([[-β - 0.02]])

Q = M.T @ M
W = M.T @ S

# Construct lists of matrices


As = [A, A, A, A]
Bs = [B, B, B, B]
Cs = [C, C, C, C]
Rs = [R1, R2, R1, R2]
Qs = [Q, Q, Q, Q]
Ws = [W, W, W, W]

lqm2 = qe.LQMarkov(Π, Qs, Rs, As, Bs, Cs=Cs, Ns=Ws, beta=β)


x, u, w, state = lqm2.compute_sequence(x0, ts_length=T)

# Calculate taxation each period from the budget constraint and the
# Markov state
tax = np.zeros([T, 1])
for i in range(T):
tax[i, :] = S @ x[:, i] + M @ u[:, i]

# Plot of debt issuance and taxation


fig, (ax1, ax2) = plt.subplots(1, 2, figsize=(12, 3))
ax1.plot(x[0, :])
ax1.set_title('One-period debt issuance')
ax1.set_xlabel('Time')
ax2.plot(tax)
ax2.set_title('Taxation')
ax2.set_xlabel('Time')
plt.show()


With a lower interest rate, the government has an incentive to increase debt over time.
However, with “roll-over risk”, debt is recurrently reset to zero and tax collections spike up.
In this model, high costs of a “sudden stop” make the government wary about letting its debt get too high.



CHAPTER

TWELVE

OPTIMAL TAXATION IN AN LQ ECONOMY

In addition to what’s in Anaconda, this lecture will need the following libraries:

!pip install --upgrade quantecon

12.1 Overview

In this lecture, we study optimal fiscal policy in a linear quadratic setting.


We modify a model of Robert Lucas and Nancy Stokey [Lucas and Stokey, 1983] so that convenient formulas for solving
linear-quadratic models can be applied.
The economy consists of a representative household and a benevolent government.
The government finances an exogenous stream of government purchases with state-contingent loans and a linear tax on
labor income.
A linear tax is sometimes called a flat-rate tax.
The household maximizes utility by choosing paths for consumption and labor, taking prices and the government’s tax
rate and borrowing plans as given.
Maximum attainable utility for the household depends on the government’s tax and borrowing plans.
The Ramsey problem [Ramsey, 1927] is to choose tax and borrowing plans that maximize the household’s welfare, taking
the household’s optimizing behavior as given.
There is a large number of competitive equilibria indexed by different government fiscal policies.
The Ramsey planner chooses the best competitive equilibrium.
We want to study the dynamics of tax rates, tax revenues, and government debt under a Ramsey plan.
Because the Lucas and Stokey model features state-contingent government debt, the government debt dynamics differ
substantially from those in a model of Robert Barro [Barro, 1979].
The treatment given here closely follows this manuscript, prepared by Thomas J. Sargent and Francois R. Velde.
We cover only the key features of the problem in this lecture, leaving you to refer to that source for additional results and
intuition.
We’ll need the following imports:

import sys
import numpy as np
import matplotlib.pyplot as plt
from numpy import sqrt, eye, zeros, cumsum
from numpy.random import randn
import scipy.linalg
from collections import namedtuple
from quantecon import nullspace, mc_sample_path, var_quadratic_sum

12.1.1 Model Features

• Linear quadratic (LQ) model


• Representative household
• Stochastic dynamic programming over an infinite horizon
• Distortionary taxation

12.2 The Ramsey Problem

We begin by outlining the key assumptions regarding technology, households and the government sector.

12.2.1 Technology

Labor can be converted one-for-one into a single, non-storable consumption good.


In the usual spirit of the LQ model, the amount of labor supplied in each period is unrestricted.
This is unrealistic, but helpful when it comes to solving the model.
Realistic labor supply can be induced by suitable parameter values.

12.2.2 Households

Consider a representative household who chooses a path {ℓ𝑡 , 𝑐𝑡 } for labor and consumption to maximize

$$
-\mathbb{E}\, \frac{1}{2} \sum_{t=0}^{\infty} \beta^t \left[ (c_t - b_t)^2 + \ell_t^2 \right] \tag{12.1}
$$

subject to the budget constraint



$$
\mathbb{E} \sum_{t=0}^{\infty} \beta^t p^0_t \left[ d_t + (1 - \tau_t)\ell_t + s_t - c_t \right] = 0 \tag{12.2}
$$

Here
• 𝛽 is a discount factor in (0, 1).
• 𝑝𝑡0 is a scaled Arrow-Debreu price at time 0 of history contingent goods at time 𝑡 + 𝑗.
• 𝑏𝑡 is a stochastic preference parameter.
• 𝑑𝑡 is an endowment process.
• 𝜏𝑡 is a flat tax rate on labor income.

228 Chapter 12. Optimal Taxation in an LQ Economy


Advanced Quantitative Economics with Python

• 𝑠𝑡 is a promised time-𝑡 coupon payment on debt issued by the government.


The scaled Arrow-Debreu price 𝑝𝑡0 is related to the unscaled Arrow-Debreu price as follows.
If we let $\pi^0_t(x^t)$ denote the probability (density) of a history $x^t = [x_t, x_{t-1}, \ldots, x_0]$ of the state, then the Arrow-Debreu time 0 price of a claim on one unit of consumption at date $t$, history $x^t$ would be

$$
\frac{\beta^t p^0_t}{\pi^0_t(x^t)}
$$

Thus, our scaled Arrow-Debreu price is the ordinary Arrow-Debreu price multiplied by the discount factor 𝛽 𝑡 and divided
by an appropriate probability.
The budget constraint (12.2) requires that the present value of consumption be restricted to equal the present value of
endowments, labor income and coupon payments on bond holdings.

12.2.3 Government

The government imposes a linear tax on labor income, fully committing to a stochastic path of tax rates at time zero.
The government also issues state-contingent debt.
Given government tax and borrowing plans, we can construct a competitive equilibrium with distorting government taxes.
Among all such competitive equilibria, the Ramsey plan is the one that maximizes the welfare of the representative
consumer.

12.2.4 Exogenous Variables

Endowments, government expenditure, the preference shock process 𝑏𝑡 , and promised coupon payments on initial government debt 𝑠𝑡 are all exogenous, and given by
• 𝑑𝑡 = 𝑆𝑑 𝑥𝑡
• 𝑔𝑡 = 𝑆𝑔 𝑥𝑡
• 𝑏𝑡 = 𝑆𝑏 𝑥𝑡
• 𝑠𝑡 = 𝑆𝑠 𝑥𝑡
The matrices 𝑆𝑑 , 𝑆𝑔 , 𝑆𝑏 , 𝑆𝑠 are primitives and {𝑥𝑡 } is an exogenous stochastic process taking values in ℝ𝑘 .
We consider two specifications for {𝑥𝑡 }.
1. Discrete case: {𝑥𝑡 } is a discrete state Markov chain with transition matrix 𝑃 .
2. VAR case: {𝑥𝑡 } obeys 𝑥𝑡+1 = 𝐴𝑥𝑡 + 𝐶𝑤𝑡+1 where {𝑤𝑡 } is independent zero-mean Gaussian with identity covariance matrix (a minimal simulation sketch is given below).
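As a minimal sketch of the VAR case only — the matrices A and C below are made up for illustration and are not the ones used later in this lecture — the state can be simulated as follows:

import numpy as np

np.random.seed(0)
A = np.array([[0.9, 0.0],
              [0.1, 0.8]])   # hypothetical transition matrix
C = np.array([[0.1],
              [0.05]])       # hypothetical shock loading

T, k = 25, A.shape[0]
x = np.zeros((k, T))
x[:, 0] = 1.0, 0.5           # arbitrary initial condition
for t in range(1, T):
    w = np.random.randn(C.shape[1])   # w_t ~ N(0, I)
    x[:, t] = A @ x[:, t-1] + C @ w

print(x[:, -1])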

12.2.5 Feasibility

The period-by-period feasibility restriction for this economy is

𝑐𝑡 + 𝑔𝑡 = 𝑑𝑡 + ℓ𝑡 (12.3)

A labor-consumption process {ℓ𝑡 , 𝑐𝑡 } is called feasible if (12.3) holds for all 𝑡.


12.2.6 Government Budget Constraint

Where 𝑝𝑡0 is again a scaled Arrow-Debreu price, the time zero government budget constraint is

$$
\mathbb{E} \sum_{t=0}^{\infty} \beta^t p^0_t \left( s_t + g_t - \tau_t \ell_t \right) = 0 \tag{12.4}
$$

12.2.7 Equilibrium

An equilibrium is a feasible allocation {ℓ𝑡 , 𝑐𝑡 }, a sequence of prices {𝑝𝑡0 }, and a tax system {𝜏𝑡 } such that
1. The allocation {ℓ𝑡 , 𝑐𝑡 } is optimal for the household given {𝑝𝑡0 } and {𝜏𝑡 }.
2. The government’s budget constraint (12.4) is satisfied.
The Ramsey problem is to choose the equilibrium {ℓ𝑡 , 𝑐𝑡 , 𝜏𝑡 , 𝑝𝑡0 } that maximizes the household’s welfare.
If {ℓ𝑡 , 𝑐𝑡 , 𝜏𝑡 , 𝑝𝑡0 } solves the Ramsey problem, then {𝜏𝑡 } is called the Ramsey plan.
The solution procedure we adopt is
1. Use the first-order conditions from the household problem to pin down prices and allocations given {𝜏𝑡 }.
2. Use these expressions to rewrite the government budget constraint (12.4) in terms of exogenous variables and
allocations.
3. Maximize the household’s objective function (12.1) subject to the constraint constructed in step 2 and the feasibility
constraint (12.3).
The solution to this maximization problem pins down all quantities of interest.

12.2.8 Solution

Step one is to obtain the first-order conditions for the household’s problem, taking taxes and prices as given.
Letting 𝜇 be the Lagrange multiplier on (12.2), the first-order conditions are 𝑝𝑡0 = (𝑐𝑡 − 𝑏𝑡 )/𝜇 and ℓ𝑡 = (𝑐𝑡 − 𝑏𝑡 )(1 − 𝜏𝑡 ).
Rearranging and normalizing at 𝜇 = 𝑏0 − 𝑐0 , we can write these conditions as

$$
p^0_t = \frac{b_t - c_t}{b_0 - c_0} \quad \text{and} \quad \tau_t = 1 - \frac{\ell_t}{b_t - c_t} \tag{12.5}
$$

Substituting (12.5) into the government’s budget constraint (12.4) yields



$$
\mathbb{E} \sum_{t=0}^{\infty} \beta^t \left[ (b_t - c_t)(s_t + g_t - \ell_t) + \ell_t^2 \right] = 0 \tag{12.6}
$$

The Ramsey problem now amounts to maximizing (12.1) subject to (12.6) and (12.3).
The associated Lagrangian is

$$
\mathcal{L} = \mathbb{E} \sum_{t=0}^{\infty} \beta^t \left\{ -\frac{1}{2}\left[ (c_t - b_t)^2 + \ell_t^2 \right] + \lambda \left[ (b_t - c_t)(\ell_t - s_t - g_t) - \ell_t^2 \right] + \mu_t \left[ d_t + \ell_t - c_t - g_t \right] \right\} \tag{12.7}
$$

The first-order conditions associated with 𝑐𝑡 and ℓ𝑡 are

−(𝑐𝑡 − 𝑏𝑡 ) + 𝜆[−ℓ𝑡 + (𝑔𝑡 + 𝑠𝑡 )] = 𝜇𝑡


and

ℓ𝑡 − 𝜆[(𝑏𝑡 − 𝑐𝑡 ) − 2ℓ𝑡 ] = 𝜇𝑡

Combining these last two equalities with (12.3) and working through the algebra, one can show that

ℓ𝑡 = ℓ𝑡̄ − 𝜈𝑚𝑡 and 𝑐𝑡 = 𝑐𝑡̄ − 𝜈𝑚𝑡 (12.8)

where
• 𝜈 ∶= 𝜆/(1 + 2𝜆)
• ℓ𝑡̄ ∶= (𝑏𝑡 − 𝑑𝑡 + 𝑔𝑡 )/2
• 𝑐𝑡̄ ∶= (𝑏𝑡 + 𝑑𝑡 − 𝑔𝑡 )/2
• 𝑚𝑡 ∶= (𝑏𝑡 − 𝑑𝑡 − 𝑠𝑡 )/2
Apart from 𝜈, all of these quantities are expressed in terms of exogenous variables.
To solve for 𝜈, we can use the government’s budget constraint again.
The term inside the brackets in (12.6) is $(b_t - c_t)(s_t + g_t) - (b_t - c_t)\ell_t + \ell_t^2$.
Using (12.8), the definitions above and the fact that $\bar\ell_t = b_t - \bar c_t$, this term can be rewritten as

$$
(b_t - \bar c_t)(g_t + s_t) + 2 m_t^2 (\nu^2 - \nu)
$$

Reinserting into (12.6), we get


$$
\mathbb{E} \left\{ \sum_{t=0}^{\infty} \beta^t (b_t - \bar c_t)(g_t + s_t) \right\} + (\nu^2 - \nu)\, \mathbb{E} \left\{ \sum_{t=0}^{\infty} \beta^t 2 m_t^2 \right\} = 0 \tag{12.9}
$$

Although it might not be clear yet, we are nearly there because:


• The two expectations terms in (12.9) can be solved for in terms of model primitives.
• This in turn allows us to solve for the Lagrange multiplier 𝜈.
• With 𝜈 in hand, we can go back and solve for the allocations via (12.8).
• Once we have the allocations, prices and the tax system can be derived from (12.5).

12.2.9 Computing the Quadratic Term

Let’s consider how to obtain the term 𝜈 in (12.9).


If we can compute the two expected geometric sums
$$
b_0 := \mathbb{E} \left\{ \sum_{t=0}^{\infty} \beta^t (b_t - \bar c_t)(g_t + s_t) \right\} \quad \text{and} \quad a_0 := \mathbb{E} \left\{ \sum_{t=0}^{\infty} \beta^t 2 m_t^2 \right\} \tag{12.10}
$$

then the problem reduces to solving

𝑏0 + 𝑎0 (𝜈 2 − 𝜈) = 0

for 𝜈.
Provided that 4𝑏0 < 𝑎0 , there is a unique solution 𝜈 ∈ (0, 1/2), and a unique corresponding 𝜆 > 0.
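As a small illustrative sketch — the values of $a_0$ and $b_0$ below are made up, chosen only to satisfy $4 b_0 < a_0$ — the root $\nu \in (0, 1/2)$ can be computed with the quadratic formula, exactly as done in the code later in this lecture:

import numpy as np

a0, b0 = 2.0, 0.3                     # hypothetical values with 4 * b0 < a0
disc = a0**2 - 4 * a0 * b0            # discriminant of a0*ν**2 - a0*ν + b0 = 0
ν = 0.5 * (a0 - np.sqrt(disc)) / a0   # smaller root, lies in (0, 1/2)
λ = ν / (1 - 2 * ν)                   # corresponding Lagrange multiplier

print(ν, λ, b0 + a0 * (ν**2 - ν))     # last value should be (close to) zero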
Let’s work out how to compute mathematical expectations in (12.10).


For the first one, the random variable $(b_t - \bar c_t)(g_t + s_t)$ inside the summation can be expressed as

$$
\frac{1}{2} x_t' (S_b - S_d + S_g)'(S_g + S_s) x_t
$$

For the second expectation in (12.10), the random variable $2 m_t^2$ can be written as

$$
\frac{1}{2} x_t' (S_b - S_d - S_s)'(S_b - S_d - S_s) x_t
$$
It follows that both objects of interest are special cases of the expression

$$
q(x_0) = \mathbb{E} \sum_{t=0}^{\infty} \beta^t x_t' H x_t \tag{12.11}
$$

where 𝐻 is a matrix conformable to 𝑥𝑡 and 𝑥′𝑡 is the transpose of column vector 𝑥𝑡 .


Suppose first that {𝑥𝑡 } is the Gaussian VAR described above.
In this case, the formula for computing 𝑞(𝑥0 ) is known to be 𝑞(𝑥0 ) = 𝑥′0 𝑄𝑥0 + 𝑣, where
• 𝑄 is the solution to 𝑄 = 𝐻 + 𝛽𝐴′ 𝑄𝐴, and
• 𝑣 = trace (𝐶 ′ 𝑄𝐶)𝛽/(1 − 𝛽)
The first equation is known as a discrete Lyapunov equation and can be solved using this function.
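As a minimal sketch of this step — the matrices A, C, H and the initial condition below are made up purely for illustration — the fixed point $Q = H + \beta A' Q A$ can be obtained from scipy's discrete Lyapunov solver, and the constant $v$ then follows from the trace formula:

import numpy as np
from scipy.linalg import solve_discrete_lyapunov

β = 0.95
A = np.array([[0.8, 0.1],
              [0.0, 0.9]])        # hypothetical state transition matrix
C = np.array([[0.5],
              [0.2]])             # hypothetical shock loading
H = np.eye(2)                     # hypothetical quadratic form

# Write Q = H + β A' Q A as X = M X M' + H with M = sqrt(β) A'
M = np.sqrt(β) * A.T
Q = solve_discrete_lyapunov(M, H)

v = np.trace(C.T @ Q @ C) * β / (1 - β)

x0 = np.array([[1.0], [0.5]])
q_x0 = float(x0.T @ Q @ x0) + v   # value of (12.11) at x0
print(q_x0)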

12.2.10 Finite State Markov Case

Next, suppose that {𝑥𝑡 } is the discrete Markov process described above.
Suppose further that each 𝑥𝑡 takes values in the state space {𝑥1 , … , 𝑥𝑁 } ⊂ ℝ𝑘 .
Let ℎ ∶ ℝ𝑘 → ℝ be a given function, and suppose that we wish to evaluate

$$
q(x_0) = \mathbb{E} \sum_{t=0}^{\infty} \beta^t h(x_t) \quad \text{given} \quad x_0 = x^j
$$

For example, in the discussion above, ℎ(𝑥𝑡 ) = 𝑥′𝑡 𝐻𝑥𝑡 .


It is legitimate to pass the expectation through the sum, leading to

$$
q(x_0) = \sum_{t=0}^{\infty} \beta^t (P^t h)[j] \tag{12.12}
$$

Here
• 𝑃 𝑡 is the 𝑡-th power of the transition matrix 𝑃 .
• ℎ is, with some abuse of notation, the vector (ℎ(𝑥1 ), … , ℎ(𝑥𝑁 )).
• (𝑃 𝑡 ℎ)[𝑗] indicates the 𝑗-th element of 𝑃 𝑡 ℎ.
It can be shown that (12.12) is in fact equal to the 𝑗-th element of the vector (𝐼 − 𝛽𝑃 )−1 ℎ.
This last fact is applied in the calculations below.
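As a quick sketch of this fact — with a small two-state chain and a function $h$ that are purely illustrative — the expected discounted sum can be computed by a single linear solve and checked against a truncated version of (12.12):

import numpy as np

β = 0.95
P = np.array([[0.9, 0.1],
              [0.2, 0.8]])          # hypothetical transition matrix
h = np.array([1.0, 2.0])            # h(x^1), h(x^2)

# j-th element of (I - βP)^{-1} h gives q(x_0) when x_0 = x^j
q = np.linalg.solve(np.eye(2) - β * P, h)

# Brute-force check by truncating the infinite sum in (12.12)
q_check = sum(β**t * np.linalg.matrix_power(P, t) @ h for t in range(2000))
print(q, q_check)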


12.2.11 Other Variables

We are interested in tracking several other variables besides the ones described above.
To prepare the way for this, we define

$$
p^t_{t+j} = \frac{b_{t+j} - c_{t+j}}{b_t - c_t}
$$
as the scaled Arrow-Debreu time 𝑡 price of a history contingent claim on one unit of consumption at time 𝑡 + 𝑗.
These are prices that would prevail at time 𝑡 if markets were reopened at time 𝑡.
These prices are constituents of the present value of government obligations outstanding at time 𝑡, which can be expressed
as

$$
B_t := \mathbb{E}_t \sum_{j=0}^{\infty} \beta^j p^t_{t+j} \left( \tau_{t+j} \ell_{t+j} - g_{t+j} \right) \tag{12.13}
$$

Using our expression for prices and the Ramsey plan, we can also write 𝐵𝑡 as
$$
B_t = \mathbb{E}_t \sum_{j=0}^{\infty} \beta^j \, \frac{(b_{t+j} - c_{t+j})(\ell_{t+j} - g_{t+j}) - \ell_{t+j}^2}{b_t - c_t}
$$

This version is more convenient for computation.


Using the equation
$$
p^t_{t+j} = p^t_{t+1} \, p^{t+1}_{t+j}
$$

it is possible to verify that (12.13) implies that



$$
B_t = (\tau_t \ell_t - g_t) + E_t \sum_{j=1}^{\infty} \beta^j p^t_{t+j} \left( \tau_{t+j} \ell_{t+j} - g_{t+j} \right)
$$

and
$$
B_t = (\tau_t \ell_t - g_t) + \beta E_t \, p^t_{t+1} B_{t+1} \tag{12.14}
$$

Define

$$
R_t^{-1} := \mathbb{E}_t \, \beta \, p^t_{t+1} \tag{12.15}
$$

𝑅𝑡 is the gross 1-period risk-free rate for loans between 𝑡 and 𝑡 + 1.

12.2.12 A Martingale

We now want to study the following two objects, namely,

𝜋𝑡+1 ∶= 𝐵𝑡+1 − 𝑅𝑡 [𝐵𝑡 − (𝜏𝑡 ℓ𝑡 − 𝑔𝑡 )]

and the cumulation of 𝜋𝑡


$$
\Pi_t := \sum_{s=0}^{t} \pi_s
$$

The term 𝜋𝑡+1 is the difference between two quantities:


• 𝐵𝑡+1 , the value of government debt at the start of period 𝑡 + 1.


• 𝑅𝑡 [𝐵𝑡 + 𝑔𝑡 − 𝜏𝑡 ℓ𝑡 ], which is what the government would have owed at the beginning of period 𝑡 + 1 if it had simply borrowed at the one-period risk-free rate rather than selling state-contingent securities.
Thus, 𝜋𝑡+1 is the excess payout on the actual portfolio of state-contingent government debt relative to an alternative
portfolio sufficient to finance 𝐵𝑡 + 𝑔𝑡 − 𝜏𝑡 ℓ𝑡 and consisting entirely of risk-free one-period bonds.
Use expressions (12.14) and (12.15) to obtain
$$
\pi_{t+1} = B_{t+1} - \frac{1}{\beta E_t \, p^t_{t+1}} \left[ \beta E_t \, p^t_{t+1} B_{t+1} \right]
$$
or

𝜋𝑡+1 = 𝐵𝑡+1 − 𝐸𝑡̃ 𝐵𝑡+1 (12.16)

where 𝐸𝑡̃ is the conditional mathematical expectation taken with respect to a one-step transition density that has been
formed by multiplying the original transition density with the likelihood ratio
$$
m^t_{t+1} = \frac{p^t_{t+1}}{E_t \, p^t_{t+1}}
$$

It follows from equation (12.16) that

𝐸𝑡̃ 𝜋𝑡+1 = 𝐸𝑡̃ 𝐵𝑡+1 − 𝐸𝑡̃ 𝐵𝑡+1 = 0

which asserts that {𝜋𝑡+1 } is a martingale difference sequence under the distorted probability measure, and that {Π𝑡 } is
a martingale under the distorted probability measure.
In the tax-smoothing model of Robert Barro [Barro, 1979], government debt is a random walk.
In the current model, government debt {𝐵𝑡 } is not a random walk, but the excess payoff {Π𝑡 } on it is.

12.3 Implementation

The following code provides functions for


1. Solving for the Ramsey plan given a specification of the economy.
2. Simulating the dynamics of the major variables.
Description and clarifications are given below

# Set up a namedtuple to store data on the model economy
Economy = namedtuple('economy',
                     ('β',          # Discount factor
                      'Sg',         # Govt spending selector matrix
                      'Sd',         # Exogenous endowment selector matrix
                      'Sb',         # Utility parameter selector matrix
                      'Ss',         # Coupon payments selector matrix
                      'discrete',   # Discrete or continuous -- boolean
                      'proc'))      # Stochastic process parameters

# Set up a namedtuple to store return values for compute_paths()
Path = namedtuple('path',
                  ('g',     # Govt spending
                   'd',     # Endowment
                   'b',     # Utility shift parameter
                   's',     # Coupon payment on existing debt
                   'c',     # Consumption
                   'l',     # Labor
                   'p',     # Price
                   'τ',     # Tax rate
                   'rvn',   # Revenue
                   'B',     # Govt debt
                   'R',     # Risk-free gross return
                   'π',     # One-period risk-free interest rate
                   'Π',     # Cumulative rate of return, adjusted
                   'ξ'))    # Adjustment factor for Π

def compute_paths(T, econ):
    """
    Compute simulated time paths for exogenous and endogenous variables.

    Parameters
    ===========
    T: int
        Length of the simulation

    econ: a namedtuple of type 'Economy', containing
        β        - Discount factor
        Sg       - Govt spending selector matrix
        Sd       - Exogenous endowment selector matrix
        Sb       - Utility parameter selector matrix
        Ss       - Coupon payments selector matrix
        discrete - Discrete exogenous process (True or False)
        proc     - Stochastic process parameters

    Returns
    ========
    path: a namedtuple of type 'Path', containing
        g   - Govt spending
        d   - Endowment
        b   - Utility shift parameter
        s   - Coupon payment on existing debt
        c   - Consumption
        l   - Labor
        p   - Price
        τ   - Tax rate
        rvn - Revenue
        B   - Govt debt
        R   - Risk-free gross return
        π   - One-period risk-free interest rate
        Π   - Cumulative rate of return, adjusted
        ξ   - Adjustment factor for Π

    The corresponding values are flat numpy ndarrays.

    """

    # Simplify names
    β, Sg, Sd, Sb, Ss = econ.β, econ.Sg, econ.Sd, econ.Sb, econ.Ss

    if econ.discrete:
        P, x_vals = econ.proc
    else:
        A, C = econ.proc

    # Simulate the exogenous process x
    if econ.discrete:
        state = mc_sample_path(P, init=0, sample_size=T)
        x = x_vals[:, state]
    else:
        # Generate an initial condition x0 satisfying x0 = A x0
        nx, nx = A.shape
        x0 = nullspace((eye(nx) - A))
        x0 = -x0 if (x0[nx-1] < 0) else x0
        x0 = x0 / x0[nx-1]

        # Generate a time series x of length T starting from x0
        nx, nw = C.shape
        x = zeros((nx, T))
        w = randn(nw, T)
        x[:, 0] = x0.T
        for t in range(1, T):
            x[:, t] = A @ x[:, t-1] + C @ w[:, t]

    # Compute exogenous variable sequences
    g, d, b, s = ((S @ x).flatten() for S in (Sg, Sd, Sb, Ss))

    # Solve for Lagrange multiplier in the govt budget constraint
    # In fact we solve for ν = lambda / (1 + 2*lambda). Here ν is the
    # solution to a quadratic equation a(ν**2 - ν) + b = 0 where
    # a and b are expected discounted sums of quadratic forms of the state.
    Sm = Sb - Sd - Ss
    # Compute a and b
    if econ.discrete:
        ns = P.shape[0]
        F = scipy.linalg.inv(eye(ns) - β * P)
        a0 = 0.5 * (F @ (x_vals.T @ Sm.T)**2)[0]
        H = ((Sb - Sd + Sg) @ x_vals) * ((Sg - Ss) @ x_vals)
        b0 = 0.5 * (F @ H.T)[0]
        a0, b0 = float(a0), float(b0)
    else:
        H = Sm.T @ Sm
        a0 = 0.5 * var_quadratic_sum(A, C, H, β, x0)
        H = (Sb - Sd + Sg).T @ (Sg + Ss)
        b0 = 0.5 * var_quadratic_sum(A, C, H, β, x0)

    # Test that ν has a real solution before assigning
    warning_msg = """
    Hint: you probably set government spending too {}. Elect a {}
    Congress and start over.
    """
    disc = a0**2 - 4 * a0 * b0
    if disc >= 0:
        ν = 0.5 * (a0 - sqrt(disc)) / a0
    else:
        print("There is no Ramsey equilibrium for these parameters.")
        print(warning_msg.format('high', 'Republican'))
        sys.exit(0)

    # Test that the Lagrange multiplier has the right sign
    if ν * (0.5 - ν) < 0:
        print("Negative multiplier on the government budget constraint.")
        print(warning_msg.format('low', 'Democratic'))
        sys.exit(0)

    # Solve for the allocation given ν and x
    Sc = 0.5 * (Sb + Sd - Sg - ν * Sm)
    Sl = 0.5 * (Sb - Sd + Sg - ν * Sm)
    c = (Sc @ x).flatten()
    l = (Sl @ x).flatten()
    p = ((Sb - Sc) @ x).flatten()  # Price without normalization
    τ = 1 - l / (b - c)
    rvn = l * τ

    # Compute remaining variables
    if econ.discrete:
        H = ((Sb - Sc) @ x_vals) * ((Sl - Sg) @ x_vals) - (Sl @ x_vals)**2
        temp = (F @ H.T).flatten()
        B = temp[state] / p
        H = (P[state, :] @ x_vals.T @ (Sb - Sc).T).flatten()
        R = p / (β * H)
        temp = ((P[state, :] @ x_vals.T @ (Sb - Sc).T)).flatten()
        ξ = p[1:] / temp[:T-1]
    else:
        H = Sl.T @ Sl - (Sb - Sc).T @ (Sl - Sg)
        L = np.empty(T)
        for t in range(T):
            L[t] = var_quadratic_sum(A, C, H, β, x[:, t])
        B = L / p
        Rinv = (β * ((Sb - Sc) @ A @ x)).flatten() / p
        R = 1 / Rinv
        AF1 = (Sb - Sc) @ x[:, 1:]
        AF2 = (Sb - Sc) @ A @ x[:, :T-1]
        ξ = AF1 / AF2
        ξ = ξ.flatten()

    π = B[1:] - R[:T-1] * B[:T-1] - rvn[:T-1] + g[:T-1]
    Π = cumsum(π * ξ)

    # Prepare return values
    path = Path(g=g, d=d, b=b, s=s, c=c, l=l, p=p,
                τ=τ, rvn=rvn, B=B, R=R, π=π, Π=Π, ξ=ξ)

    return path

def gen_fig_1(path):
    """
    The parameter is the path namedtuple returned by compute_paths(). See
    the docstring of that function for details.
    """

    T = len(path.c)

    # Prepare axes
    num_rows, num_cols = 2, 2
    fig, axes = plt.subplots(num_rows, num_cols, figsize=(14, 10))
    plt.subplots_adjust(hspace=0.4)
    for i in range(num_rows):
        for j in range(num_cols):
            axes[i, j].grid()
            axes[i, j].set_xlabel('Time')
    bbox = (0., 1.02, 1., .102)
    legend_args = {'bbox_to_anchor': bbox, 'loc': 3, 'mode': 'expand'}
    p_args = {'lw': 2, 'alpha': 0.7}

    # Plot consumption, govt expenditure and revenue
    ax = axes[0, 0]
    ax.plot(path.rvn, label=r'$\tau_t \ell_t$', **p_args)
    ax.plot(path.g, label='$g_t$', **p_args)
    ax.plot(path.c, label='$c_t$', **p_args)
    ax.legend(ncol=3, **legend_args)

    # Plot govt expenditure and debt
    ax = axes[0, 1]
    ax.plot(list(range(1, T+1)), path.rvn, label=r'$\tau_t \ell_t$', **p_args)
    ax.plot(list(range(1, T+1)), path.g, label='$g_t$', **p_args)
    ax.plot(list(range(1, T)), path.B[1:T], label='$B_{t+1}$', **p_args)
    ax.legend(ncol=3, **legend_args)

    # Plot risk-free return
    ax = axes[1, 0]
    ax.plot(list(range(1, T+1)), path.R - 1, label='$R_t - 1$', **p_args)
    ax.legend(ncol=1, **legend_args)

    # Plot revenue, expenditure and risk free rate
    ax = axes[1, 1]
    ax.plot(list(range(1, T+1)), path.rvn, label=r'$\tau_t \ell_t$', **p_args)
    ax.plot(list(range(1, T+1)), path.g, label='$g_t$', **p_args)
    axes[1, 1].plot(list(range(1, T)), path.π, label=r'$\pi_{t+1}$', **p_args)
    ax.legend(ncol=3, **legend_args)

    plt.show()

def gen_fig_2(path):
    """
    The parameter is the path namedtuple returned by compute_paths(). See
    the docstring of that function for details.
    """

    T = len(path.c)

    # Prepare axes
    num_rows, num_cols = 2, 1
    fig, axes = plt.subplots(num_rows, num_cols, figsize=(10, 10))
    plt.subplots_adjust(hspace=0.5)
    bbox = (0., 1.02, 1., .102)
    legend_args = {'bbox_to_anchor': bbox, 'loc': 3, 'mode': 'expand'}
    p_args = {'lw': 2, 'alpha': 0.7}

    # Plot adjustment factor
    ax = axes[0]
    ax.plot(list(range(2, T+1)), path.ξ, label=r'$\xi_t$', **p_args)
    ax.grid()
    ax.set_xlabel('Time')
    ax.legend(ncol=1, **legend_args)

    # Plot adjusted cumulative return
    ax = axes[1]
    ax.plot(list(range(2, T+1)), path.Π, label=r'$\Pi_t$', **p_args)
    ax.grid()
    ax.set_xlabel('Time')
    ax.legend(ncol=1, **legend_args)

    plt.show()

12.3.1 Comments on the Code

The function var_quadratic_sum, imported from quantecon above, computes the value of (12.11) when the exogenous process {𝑥𝑡 } is of the VAR type described above.
Below the definition of the function, you will see definitions of two namedtuple objects, Economy and Path.
The first is used to collect all the parameters and primitives of a given LQ economy, while the second collects output of
the computations.
In Python, a namedtuple is a popular data type from the collections module of the standard library that replicates
the functionality of a tuple, but also allows you to assign a name to each tuple element.
These elements can then be referenced via dotted attribute notation — see for example the use of path in the functions gen_fig_1() and gen_fig_2().
The benefits of using namedtuples:
• Keeps content organized by meaning.
• Helps reduce the number of global variables.
Other than that, our code is long but relatively straightforward.
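As a quick illustration of the namedtuple pattern used here — with a made-up two-field tuple, not the Economy or Path objects themselves:

from collections import namedtuple

# A toy namedtuple, analogous in spirit to Economy and Path
Point = namedtuple('point', ('x', 'y'))

p = Point(x=1.0, y=2.0)
print(p.x, p.y)       # fields accessed via dotted attribute notation
print(tuple(p))       # still behaves like an ordinary tuple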

12.4 Examples

Let’s look at two examples of usage.


12.4.1 The Continuous Case

Our first example adopts the VAR specification described above.


Regarding the primitives, we set
• 𝛽 = 1/1.05
• 𝑏𝑡 = 2.135 and 𝑠𝑡 = 𝑑𝑡 = 0 for all 𝑡
Government spending evolves according to

𝑔𝑡+1 − 𝜇𝑔 = 𝜌(𝑔𝑡 − 𝜇𝑔 ) + 𝐶𝑔 𝑤𝑔,𝑡+1

with 𝜌 = 0.7, 𝜇𝑔 = 0.35 and 𝐶𝑔 = 𝜇𝑔 √1 − 𝜌2 /10.


Here’s the code

# == Parameters == #
β = 1 / 1.05
ρ, mg = .7, .35
A = eye(2)
A[0, :] = ρ, mg * (1-ρ)
C = np.zeros((2, 1))
C[0, 0] = np.sqrt(1 - ρ**2) * mg / 10
Sg = np.array((1, 0)).reshape(1, 2)
Sd = np.array((0, 0)).reshape(1, 2)
Sb = np.array((0, 2.135)).reshape(1, 2)
Ss = np.array((0, 0)).reshape(1, 2)

economy = Economy(β=β, Sg=Sg, Sd=Sd, Sb=Sb, Ss=Ss,
                  discrete=False, proc=(A, C))

T = 50
path = compute_paths(T, economy)
gen_fig_1(path)


The legends on the figures indicate the variables being tracked.


Most obvious from the figure is tax smoothing in the sense that tax revenue is much less variable than government
expenditure.

gen_fig_2(path)


See the original manuscript for comments and interpretation.

12.4.2 The Discrete Case

Our second example adopts a discrete Markov specification for the exogenous process

# == Parameters == #
β = 1 / 1.05
P = np.array([[0.8, 0.2, 0.0],
              [0.0, 0.5, 0.5],
              [0.0, 0.0, 1.0]])

# Possible states of the world
# Each column is a state of the world. The rows are [g d b s 1]
x_vals = np.array([[0.5, 0.5, 0.25],
                   [0.0, 0.0, 0.0],
                   [2.2, 2.2, 2.2],
                   [0.0, 0.0, 0.0],
                   [1.0, 1.0, 1.0]])

Sg = np.array((1, 0, 0, 0, 0)).reshape(1, 5)
Sd = np.array((0, 1, 0, 0, 0)).reshape(1, 5)
Sb = np.array((0, 0, 1, 0, 0)).reshape(1, 5)
Ss = np.array((0, 0, 0, 1, 0)).reshape(1, 5)

economy = Economy(β=β, Sg=Sg, Sd=Sd, Sb=Sb, Ss=Ss,
                  discrete=True, proc=(P, x_vals))

T = 15
path = compute_paths(T, economy)
gen_fig_1(path)

/tmp/ipykernel_8650/2748685684.py:111: DeprecationWarning: Conversion of an array with ndim > 0 to a scalar is deprecated, and will error in future. Ensure you extract a single element from your array before performing this operation. (Deprecated NumPy 1.25.)
  a0, b0 = float(a0), float(b0)


The call gen_fig_2(path) generates

gen_fig_2(path)

See the original manuscript for comments and interpretation.


12.5 Exercises

Exercise 12.5.1
Modify the VAR example given above, setting

𝑔𝑡+1 − 𝜇𝑔 = 𝜌(𝑔𝑡−3 − 𝜇𝑔 ) + 𝐶𝑔 𝑤𝑔,𝑡+1

with 𝜌 = 0.95 and 𝐶𝑔 = 0.7√1 − 𝜌2 .


Produce the corresponding figures.

Solution to Exercise 12.5.1

# == Parameters == #
β = 1 / 1.05
ρ, mg = .95, .35
A = np.array([[0, 0, 0, ρ, mg*(1-ρ)],
              [1, 0, 0, 0, 0],
              [0, 1, 0, 0, 0],
              [0, 0, 1, 0, 0],
              [0, 0, 0, 0, 1]])
C = np.zeros((5, 1))
C[0, 0] = np.sqrt(1 - ρ**2) * mg / 8
Sg = np.array((1, 0, 0, 0, 0)).reshape(1, 5)
Sd = np.array((0, 0, 0, 0, 0)).reshape(1, 5)
# Chosen st. (Sc + Sg) * x0 = 1
Sb = np.array((0, 0, 0, 0, 2.135)).reshape(1, 5)
Ss = np.array((0, 0, 0, 0, 0)).reshape(1, 5)

economy = Economy(β=β, Sg=Sg, Sd=Sd, Sb=Sb,
                  Ss=Ss, discrete=False, proc=(A, C))

T = 50
path = compute_paths(T, economy)

gen_fig_1(path)


gen_fig_2(path)



Part III

Multiple Agent Models

CHAPTER

THIRTEEN

DEFAULT RISK AND INCOME FLUCTUATIONS

In addition to what’s in Anaconda, this lecture will need the following libraries:

!pip install --upgrade quantecon

13.1 Overview

This lecture computes versions of Arellano’s [Arellano, 2008] model of sovereign default.
The model describes interactions among default risk, output, and an equilibrium interest rate that includes a premium for
endogenous default risk.
The decision maker is a government of a small open economy that borrows from risk-neutral foreign creditors.
The foreign lenders must be compensated for default risk.
The government borrows and lends abroad in order to smooth the consumption of its citizens.
The government repays its debt only if it wants to, but declining to pay has adverse consequences.
The interest rate on government debt adjusts in response to the state-dependent default probability chosen by government.
The model yields outcomes that help interpret sovereign default experiences, including
• countercyclical interest rates on sovereign debt
• countercyclical trade balances
• high volatility of consumption relative to output
Notably, long recessions caused by bad draws in the income process increase the government’s incentive to default.
This can lead to
• spikes in interest rates
• temporary losses of access to international credit markets
• large drops in output, consumption, and welfare
• large capital outflows during recessions
Such dynamics are consistent with experiences of many countries.
Let’s start with some imports:


import matplotlib.pyplot as plt


import numpy as np
import quantecon as qe
from numba import njit, prange

13.2 Structure

In this section we describe the main features of the model.

13.2.1 Output, Consumption and Debt

A small open economy is endowed with an exogenous stochastically fluctuating potential output stream {𝑦𝑡 }.
Potential output is realized only in periods in which the government honors its sovereign debt.
The output good can be traded or consumed.
The sequence {𝑦𝑡 } is described by a Markov process with stochastic density kernel 𝑝(𝑦, 𝑦′ ).
Households within the country are identical and rank stochastic consumption streams according to

𝔼 ∑ 𝛽 𝑡 𝑢(𝑐𝑡 ) (13.1)
𝑡=0

Here
• 0 < 𝛽 < 1 is a time discount factor
• 𝑢 is an increasing and strictly concave utility function
Consumption sequences enjoyed by households are affected by the government’s decision to borrow or lend internationally.
The government is benevolent in the sense that its aim is to maximize (13.1).
The government is the only domestic actor with access to foreign credit.
Because household are averse to consumption fluctuations, the government will try to smooth consumption by borrowing
from (and lending to) foreign creditors.

13.2.2 Asset Markets

The only credit instrument available to the government is a one-period bond traded in international credit markets.
The bond market has the following features
• The bond matures in one period and is not state contingent.
• A purchase of a bond with face value 𝐵′ is a claim to 𝐵′ units of the consumption good next period.
• To purchase 𝐵′ next period costs 𝑞𝐵′ now; equivalently, selling −𝐵′ units of next period goods earns the seller −𝑞𝐵′ of today’s goods.
– If 𝐵′ < 0, then −𝑞𝐵′ units of the good are received in the current period, for a promise to repay −𝐵′ units
next period.
– There is an equilibrium price function 𝑞(𝐵′ , 𝑦) that makes 𝑞 depend on both 𝐵′ and 𝑦.


Earnings on the government portfolio are distributed (or, if negative, taxed) lump sum to households.
When the government is not excluded from financial markets, the one-period national budget constraint is

𝑐 = 𝑦 + 𝐵 − 𝑞(𝐵′ , 𝑦)𝐵′ (13.2)

Here and below, a prime denotes a next period value or a claim maturing next period.
To rule out Ponzi schemes, we also require that 𝐵 ≥ −𝑍 in every period.
• 𝑍 is chosen to be sufficiently large that the constraint never binds in equilibrium.

13.2.3 Financial Markets

Foreign creditors
• are risk neutral
• know the domestic output stochastic process {𝑦𝑡 } and observe 𝑦𝑡 , 𝑦𝑡−1 , … , at time 𝑡
• can borrow or lend without limit in an international credit market at a constant international interest rate 𝑟
• receive full payment if the government chooses to pay
• receive zero if the government defaults on its one-period debt due
When a government is expected to default next period with probability 𝛿, the expected value of a promise to pay one unit
of consumption next period is 1 − 𝛿.
Therefore, the discounted expected value of a promise to pay 𝐵 next period is

$$
q = \frac{1 - \delta}{1 + r} \tag{13.3}
$$
Next we turn to how the government in effect chooses the default probability 𝛿.

13.2.4 Government’s Decisions

At each point in time 𝑡, the government chooses between


1. defaulting
2. meeting its current obligations and purchasing or selling an optimal quantity of one-period sovereign debt
Defaulting means declining to repay all of its current obligations.
If the government defaults in the current period, then consumption equals current output.
But a sovereign default has two consequences:
1. Output immediately falls from 𝑦 to ℎ(𝑦), where 0 ≤ ℎ(𝑦) ≤ 𝑦.
• It returns to 𝑦 only after the country regains access to international credit markets.
2. The country loses access to foreign credit markets.


13.2.5 Reentering International Credit Market

While in a state of default, the economy regains access to foreign credit in each subsequent period with probability 𝜃.

13.3 Equilibrium

Informally, an equilibrium is a sequence of interest rates on its sovereign debt, a stochastic sequence of government default
decisions and an implied flow of household consumption such that
1. Consumption and assets satisfy the national budget constraint.
2. The government maximizes household utility taking into account
• the resource constraint
• the effect of its choices on the price of bonds
• consequences of defaulting now for future net output and future borrowing and lending opportunities
3. The interest rate on the government’s debt includes a risk-premium sufficient to make foreign creditors expect on
average to earn the constant risk-free international interest rate.
To express these ideas more precisely, consider first the choices of the government, which
1. enters a period with initial assets 𝐵, or what is the same thing, initial debt to be repaid now of −𝐵
2. observes current output 𝑦, and
3. chooses either
1. to default, or
2. to pay −𝐵 and set next period’s debt due to −𝐵′
In a recursive formulation,
• state variables for the government comprise the pair (𝐵, 𝑦)
• 𝑣(𝐵, 𝑦) is the optimum value of the government’s problem when at the beginning of a period it faces the choice of
whether to honor or default
• 𝑣𝑐 (𝐵, 𝑦) is the value of choosing to pay obligations falling due
• 𝑣𝑑 (𝑦) is the value of choosing to default
𝑣𝑑 (𝑦) does not depend on 𝐵 because, when access to credit is eventually regained, net foreign assets equal 0.
Expressed recursively, the value of defaulting is

𝑣𝑑 (𝑦) = 𝑢(ℎ(𝑦)) + 𝛽 ∫ {𝜃𝑣(0, 𝑦′ ) + (1 − 𝜃)𝑣𝑑 (𝑦′ )} 𝑝(𝑦, 𝑦′ )𝑑𝑦′

The value of paying is

$$
v_c(B, y) = \max_{B' \geq -Z} \left\{ u(y - q(B', y)B' + B) + \beta \int v(B', y')\, p(y, y')\, dy' \right\}
$$

The three value functions are linked by

𝑣(𝐵, 𝑦) = max{𝑣𝑐 (𝐵, 𝑦), 𝑣𝑑 (𝑦)}

The government chooses to default when

𝑣𝑐 (𝐵, 𝑦) < 𝑣𝑑 (𝑦)


and hence given 𝐵′ the probability of default next period is

𝛿(𝐵′ , 𝑦) ∶= ∫ 𝟙{𝑣𝑐 (𝐵′ , 𝑦′ ) < 𝑣𝑑 (𝑦′ )}𝑝(𝑦, 𝑦′ )𝑑𝑦′ (13.4)

Given zero profits for foreign creditors in equilibrium, we can combine (13.3) and (13.4) to pin down the bond price
function:
$$
q(B', y) = \frac{1 - \delta(B', y)}{1 + r} \tag{13.5}
$$

13.3.1 Definition of Equilibrium

An equilibrium is
• a pricing function 𝑞(𝐵′ , 𝑦),
• a triple of value functions (𝑣𝑐 (𝐵, 𝑦), 𝑣𝑑 (𝑦), 𝑣(𝐵, 𝑦)),
• a decision rule telling the government when to default and when to pay as a function of the state (𝐵, 𝑦), and
• an asset accumulation rule that, conditional on choosing not to default, maps (𝐵, 𝑦) into 𝐵′
such that
• The three Bellman equations for (𝑣𝑐 (𝐵, 𝑦), 𝑣𝑑 (𝑦), 𝑣(𝐵, 𝑦)) are satisfied
• Given the price function 𝑞(𝐵′ , 𝑦), the default decision rule and the asset accumulation decision rule attain the
optimal value function 𝑣(𝐵, 𝑦), and
• The price function 𝑞(𝐵′ , 𝑦) satisfies equation (13.5)

13.4 Computation

Let’s now compute an equilibrium of Arellano’s model.


The equilibrium objects are the value function 𝑣(𝐵, 𝑦), the associated default decision rule, and the pricing function
𝑞(𝐵′ , 𝑦).
We’ll use our code to replicate Arellano’s results.
After that we’ll perform some additional simulations.
We use a slightly modified version of the algorithm recommended by Arellano.
• The appendix to [Arellano, 2008] recommends value function iteration until convergence, updating the price, and
then repeating.
• Instead, we update the bond price at every value function iteration step.
The second approach is faster and the two different procedures deliver very similar results.
Here is a more detailed description of our algorithm:
1. Guess a pair of non-default and default value functions 𝑣𝑐 and 𝑣𝑑 .
2. Using these functions, calculate the value function 𝑣, the corresponding default probabilities and the price function
𝑞.
3. At each pair (𝐵, 𝑦),
1. update the value of defaulting 𝑣𝑑 (𝑦).


2. update the value of remaining 𝑣𝑐 (𝐵, 𝑦).


4. Check for convergence. If converged, stop – if not, go to step 2.
We use simple discretization on a grid of asset holdings and income levels.
The output process is discretized using a quadrature method due to Tauchen.
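As a brief sketch of this discretization step — the parameter values below are illustrative only; the values actually used are set in the class that follows — qe.markov.tauchen returns a Markov chain object whose state_values and P attributes give the discretized grid and transition matrix:

import numpy as np
import quantecon as qe

# Discretize log-income AR(1): log y' = ρ log y + η w,  w ~ N(0, 1)
ρ, η = 0.945, 0.025                     # illustrative persistence and shock std
mc = qe.markov.tauchen(5, ρ, η, 0, 3)   # 5 grid points, 3 std devs each side

y_grid = np.exp(mc.state_values)        # income levels
P = mc.P                                # transition probabilities
print(y_grid)
print(P.sum(axis=1))                    # each row sums to one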
As we have in other places, we accelerate our code using Numba.
We define a class that will store parameters, grids and transition probabilities.

class Arellano_Economy:
    " Stores data and creates primitives for the Arellano economy. "

    def __init__(self,
                 B_grid_size=251,      # Grid size for bonds
                 B_grid_min=-0.45,     # Smallest B value
                 B_grid_max=0.45,      # Largest B value
                 y_grid_size=51,       # Grid size for income
                 β=0.953,              # Time discount parameter
                 γ=2.0,                # Utility parameter
                 r=0.017,              # Lending rate
                 ρ=0.945,              # Persistence in the income process
                 η=0.025,              # Standard deviation of the income process
                 θ=0.282,              # Prob of re-entering financial markets
                 def_y_param=0.969):   # Parameter governing income in default

        # Save parameters
        self.β, self.γ, self.r, = β, γ, r
        self.ρ, self.η, self.θ = ρ, η, θ

        self.y_grid_size = y_grid_size
        self.B_grid_size = B_grid_size
        self.B_grid = np.linspace(B_grid_min, B_grid_max, B_grid_size)
        mc = qe.markov.tauchen(y_grid_size, ρ, η, 0, 3)
        self.y_grid, self.P = np.exp(mc.state_values), mc.P

        # The index at which B_grid is (close to) zero
        self.B0_idx = np.searchsorted(self.B_grid, 1e-10)

        # Output received while in default, with same shape as y_grid
        self.def_y = np.minimum(def_y_param * np.mean(self.y_grid), self.y_grid)

    def params(self):
        return self.β, self.γ, self.r, self.ρ, self.η, self.θ

    def arrays(self):
        return self.P, self.y_grid, self.B_grid, self.def_y, self.B0_idx

Notice how the class returns the data it stores as simple numerical values and arrays via the methods params and
arrays.
We will use this data in the Numba-jitted functions defined below.
Jitted functions prefer simple arguments, since type inference is easier.
Here is the utility function.


@njit
def u(c, γ):
    return c**(1-γ)/(1-γ)

Here is a function to compute the bond price at each state, given 𝑣𝑐 and 𝑣𝑑 .

@njit
def compute_q(v_c, v_d, q, params, arrays):
    """
    Compute the bond price function q(b, y) at each (b, y) pair.

    This function writes to the array q that is passed in as an argument.
    """

    # Unpack
    β, γ, r, ρ, η, θ = params
    P, y_grid, B_grid, def_y, B0_idx = arrays

    for B_idx in range(len(B_grid)):
        for y_idx in range(len(y_grid)):
            # Compute default probability and corresponding bond price
            delta = P[y_idx, v_c[B_idx, :] < v_d].sum()
            q[B_idx, y_idx] = (1 - delta) / (1 + r)

Next we introduce Bellman operators that update 𝑣𝑑 and 𝑣𝑐 .

@njit
def T_d(y_idx, v_c, v_d, params, arrays):
    """
    The RHS of the Bellman equation when income is at index y_idx and
    the country has chosen to default. Returns an update of v_d.
    """
    # Unpack
    β, γ, r, ρ, η, θ = params
    P, y_grid, B_grid, def_y, B0_idx = arrays

    current_utility = u(def_y[y_idx], γ)
    v = np.maximum(v_c[B0_idx, :], v_d)
    cont_value = np.sum((θ * v + (1 - θ) * v_d) * P[y_idx, :])

    return current_utility + β * cont_value

@njit
def T_c(B_idx, y_idx, v_c, v_d, q, params, arrays):
    """
    The RHS of the Bellman equation when the country is not in a
    defaulted state on their debt. Returns a value that corresponds to
    v_c[B_idx, y_idx], as well as the optimal level of bond sales B'.
    """
    # Unpack
    β, γ, r, ρ, η, θ = params
    P, y_grid, B_grid, def_y, B0_idx = arrays
    B = B_grid[B_idx]
    y = y_grid[y_idx]

    # Compute the RHS of Bellman equation
    current_max = -1e10
    # Step through choices of next period B'
    for Bp_idx, Bp in enumerate(B_grid):
        c = y + B - q[Bp_idx, y_idx] * Bp
        if c > 0:
            v = np.maximum(v_c[Bp_idx, :], v_d)
            val = u(c, γ) + β * np.sum(v * P[y_idx, :])
            if val > current_max:
                current_max = val
                Bp_star_idx = Bp_idx
    return current_max, Bp_star_idx

Here is a fast function that calls these operators in the right sequence.

@njit(parallel=True)
def update_values_and_prices(v_c, v_d,     # Current guess of value functions
                             B_star, q,    # Arrays to be written to
                             params, arrays):

    # Unpack
    β, γ, r, ρ, η, θ = params
    P, y_grid, B_grid, def_y, B0_idx = arrays
    y_grid_size = len(y_grid)
    B_grid_size = len(B_grid)

    # Compute bond prices and write them to q
    compute_q(v_c, v_d, q, params, arrays)

    # Allocate memory
    new_v_c = np.empty_like(v_c)
    new_v_d = np.empty_like(v_d)

    # Calculate and return new guesses for v_c and v_d
    for y_idx in prange(y_grid_size):
        new_v_d[y_idx] = T_d(y_idx, v_c, v_d, params, arrays)
        for B_idx in range(B_grid_size):
            new_v_c[B_idx, y_idx], Bp_idx = T_c(B_idx, y_idx,
                                                v_c, v_d, q, params, arrays)
            B_star[B_idx, y_idx] = Bp_idx

    return new_v_c, new_v_d

We can now write a function that will use the Arellano_Economy class and the functions defined above to compute
the solution to our model.
We do not need to JIT compile this function since it only consists of outer loops (and JIT compiling makes almost zero
difference).
In fact, one of the jobs of this function is to take an instance of Arellano_Economy, which is hard for the JIT
compiler to handle, and strip it down to more basic objects, which are then passed out to jitted functions.

def solve(model, tol=1e-8, max_iter=10_000):
    """
    Given an instance of Arellano_Economy, this function computes the optimal
    policy and value functions.
    """
    # Unpack
    params = model.params()
    arrays = model.arrays()
    y_grid_size, B_grid_size = model.y_grid_size, model.B_grid_size

    # Initial conditions for v_c and v_d
    v_c = np.zeros((B_grid_size, y_grid_size))
    v_d = np.zeros(y_grid_size)

    # Allocate memory
    q = np.empty_like(v_c)
    B_star = np.empty_like(v_c, dtype=int)

    current_iter = 0
    dist = np.inf
    while (current_iter < max_iter) and (dist > tol):

        if current_iter % 100 == 0:
            print(f"Entering iteration {current_iter}.")

        new_v_c, new_v_d = update_values_and_prices(v_c, v_d, B_star, q,
                                                    params, arrays)
        # Check tolerance and update
        dist = np.max(np.abs(new_v_c - v_c)) + np.max(np.abs(new_v_d - v_d))
        v_c = new_v_c
        v_d = new_v_d
        current_iter += 1

    print(f"Terminating at iteration {current_iter}.")
    return v_c, v_d, q, B_star

Finally, we write a function that will allow us to simulate the economy once we have the policy functions

def simulate(model, T, v_c, v_d, q, B_star, y_idx=None, B_idx=None):
    """
    Simulates the Arellano 2008 model of sovereign debt

    Here `model` is an instance of `Arellano_Economy` and `T` is the length of
    the simulation. Endogenous objects `v_c`, `v_d`, `q` and `B_star` are
    assumed to come from a solution to `model`.
    """
    # Unpack elements of the model
    B0_idx = model.B0_idx
    y_grid = model.y_grid
    B_grid, y_grid, P = model.B_grid, model.y_grid, model.P

    # Set initial conditions to middle of grids
    if y_idx is None:
        y_idx = np.searchsorted(y_grid, y_grid.mean())
    if B_idx is None:
        B_idx = B0_idx
    in_default = False

    # Create Markov chain and simulate income process
    mc = qe.MarkovChain(P, y_grid)
    y_sim_indices = mc.simulate_indices(T+1, init=y_idx)

    # Allocate memory for outputs
    y_sim = np.empty(T)
    y_a_sim = np.empty(T)
    B_sim = np.empty(T)
    q_sim = np.empty(T)
    d_sim = np.empty(T, dtype=int)

    # Perform simulation
    t = 0
    while t < T:

        # Store the value of y_t and B_t
        y_sim[t] = y_grid[y_idx]
        B_sim[t] = B_grid[B_idx]

        # if in default:
        if v_c[B_idx, y_idx] < v_d[y_idx] or in_default:
            y_a_sim[t] = model.def_y[y_idx]
            d_sim[t] = 1
            Bp_idx = B0_idx
            # Re-enter financial markets next period with prob θ
            in_default = False if np.random.rand() < model.θ else True
        else:
            y_a_sim[t] = y_sim[t]
            d_sim[t] = 0
            Bp_idx = B_star[B_idx, y_idx]

        q_sim[t] = q[Bp_idx, y_idx]

        # Update time and indices
        t += 1
        y_idx = y_sim_indices[t]
        B_idx = Bp_idx

    return y_sim, y_a_sim, B_sim, q_sim, d_sim

13.5 Results

Let’s start by trying to replicate the results obtained in [Arellano, 2008].


In what follows, all results are computed using Arellano’s parameter values.
The values can be seen in the __init__ method of the Arellano_Economy shown above.
For example, r=0.017 matches the average quarterly rate on a 5 year US treasury over the period 1983–2001.
Details on how to compute the figures are reported as solutions to the exercises.
The first figure shows the bond price schedule and replicates Figure 3 of Arellano, where 𝑦𝐿 and 𝑦𝐻 are particular below average and above average values of output 𝑦.
• 𝑦𝐿 is 5% below the mean of the 𝑦 grid values
• 𝑦𝐻 is 5% above the mean of the 𝑦 grid values


The grid used to compute this figure was relatively fine (y_grid_size, B_grid_size = 51, 251), which explains the minor differences between this and Arellano’s figure.
The figure shows that
• Higher levels of debt (larger −𝐵′ ) induce larger discounts on the face value, which correspond to higher interest
rates.
• Lower income also causes more discounting, as foreign creditors anticipate greater likelihood of default.
The next figure plots value functions and replicates the right hand panel of Figure 4 of [Arellano, 2008].
We can use the results of the computation to study the default probability 𝛿(𝐵′ , 𝑦) defined in (13.4).
The next plot shows these default probabilities over (𝐵′ , 𝑦) as a heat map.
As anticipated, the probability that the government chooses to default in the following period increases with indebtedness
and falls with income.
Next let’s run a time series simulation of {𝑦𝑡 }, {𝐵𝑡 } and 𝑞(𝐵𝑡+1 , 𝑦𝑡 ).
The grey vertical bars correspond to periods when the economy is excluded from financial markets because of a past
default.
One notable feature of the simulated data is the nonlinear response of interest rates.
Periods of relative stability are followed by sharp spikes in the discount rate on government debt.


13.6 Exercises

Exercise 13.6.1
To the extent that you can, replicate the figures shown above
• Use the parameter values listed as defaults in Arellano_Economy.
• The time series will of course vary depending on the shock draws.

Solution to Exercise 13.6.1


Compute the value function, policy and equilibrium prices

ae = Arellano_Economy()

v_c, v_d, q, B_star = solve(ae)

Entering iteration 0.

Entering iteration 100.

Entering iteration 200.

Entering iteration 300.

Terminating at iteration 399.

Compute the bond price schedule as seen in figure 3 of Arellano (2008)

# Unpack some useful names


B_grid, y_grid, P = ae.B_grid, ae.y_grid, ae.P
B_grid_size, y_grid_size = len(B_grid), len(y_grid)
r = ae.r

# Create "Y High" and "Y Low" values as 5% devs from mean
high, low = np.mean(y_grid) * 1.05, np.mean(y_grid) * .95
iy_high, iy_low = (np.searchsorted(y_grid, x) for x in (high, low))

fig, ax = plt.subplots(figsize=(10, 6.5))


ax.set_title("Bond price schedule $q(y, B')$")

# Extract a suitable plot grid


x = []
q_low = []
q_high = []
for i, B in enumerate(B_grid):
    if -0.35 <= B <= 0:  # To match fig 3 of Arellano
        x.append(B)
        q_low.append(q[i, iy_low])
        q_high.append(q[i, iy_high])
ax.plot(x, q_high, label="$y_H$", lw=2, alpha=0.7)
ax.plot(x, q_low, label="$y_L$", lw=2, alpha=0.7)
ax.set_xlabel("$B'$")
ax.legend(loc='upper left', frameon=False)
plt.show()

Draw a plot of the value functions

v = np.maximum(v_c, np.reshape(v_d, (1, y_grid_size)))

fig, ax = plt.subplots(figsize=(10, 6.5))


ax.set_title("Value Functions")
ax.plot(B_grid, v[:, iy_high], label="$y_H$", lw=2, alpha=0.7)
ax.plot(B_grid, v[:, iy_low], label="$y_L$", lw=2, alpha=0.7)
ax.legend(loc='upper left')
ax.set(xlabel="$B$", ylabel="$v(y, B)$")
ax.set_xlim(min(B_grid), max(B_grid))
plt.show()


Draw a heat map for default probability

xx, yy = B_grid, y_grid


zz = np.empty_like(v_c)

for B_idx in range(B_grid_size):
    for y_idx in range(y_grid_size):
        zz[B_idx, y_idx] = P[y_idx, v_c[B_idx, :] < v_d].sum()

# Create figure
fig, ax = plt.subplots(figsize=(10, 6.5))
hm = ax.pcolormesh(xx, yy, zz.T)
cax = fig.add_axes([.92, .1, .02, .8])
fig.colorbar(hm, cax=cax)
ax.axis([xx.min(), 0.05, yy.min(), yy.max()])
ax.set(xlabel="$B'$", ylabel="$y$", title="Probability of Default")
plt.show()


Plot a time series of major variables simulated from the model

T = 250
np.random.seed(42)
y_sim, y_a_sim, B_sim, q_sim, d_sim = simulate(ae, T, v_c, v_d, q, B_star)

# Pick up default start and end dates


start_end_pairs = []
i = 0
while i < len(d_sim):
    if d_sim[i] == 0:
        i += 1
    else:
        # If we get to here we're in default
        start_default = i
        while i < len(d_sim) and d_sim[i] == 1:
            i += 1
        end_default = i - 1
        start_end_pairs.append((start_default, end_default))

plot_series = (y_sim, B_sim, q_sim)


titles = 'output', 'foreign assets', 'bond price'

fig, axes = plt.subplots(len(plot_series), 1, figsize=(10, 12))


fig.subplots_adjust(hspace=0.3)

for ax, series, title in zip(axes, plot_series, titles):
    # Determine suitable y limits
    s_max, s_min = max(series), min(series)
    s_range = s_max - s_min
    y_max = s_max + s_range * 0.1
    y_min = s_min - s_range * 0.1
    ax.set_ylim(y_min, y_max)
    for pair in start_end_pairs:
        ax.fill_between(pair, (y_min, y_min), (y_max, y_max),
                        color='k', alpha=0.3)
    ax.grid()
    ax.plot(range(T), series, lw=2, alpha=0.7)
    ax.set(title=title, xlabel="time")

plt.show()



CHAPTER

FOURTEEN

GLOBALIZATION AND CYCLES

14.1 Overview

In this lecture, we review the paper Globalization and Synchronization of Innovation Cycles by Kiminori Matsuyama,
Laura Gardini and Iryna Sushko.
This model helps us understand several interesting stylized facts about the world economy.
One of these is synchronized business cycles across different countries.
Most existing models that generate synchronized business cycles do so by assumption, since they tie output in each country
to a common shock.
They also fail to explain certain features of the data, such as the fact that the degree of synchronization tends to increase
with trade ties.
By contrast, in the model we consider in this lecture, synchronization is both endogenous and increasing with the extent
of trade integration.
In particular, as trade costs fall and international competition increases, innovation incentives become aligned and countries synchronize their innovation cycles.
Let’s start with some imports:

import numpy as np
import matplotlib.pyplot as plt
from numba import jit
from ipywidgets import interact

14.1.1 Background

The model builds on work by Judd [Judd, 1985], Deneckere and Judd [Deneckere and Judd, 1992] and Helpman and Krugman [Helpman and Krugman, 1985] by developing a two-country model with trade and innovation.
On the technical side, the paper introduces the concept of coupled oscillators to economic modeling.
As we will see, coupled oscillators arise endogenously within the model.
Below we review the model and replicate some of the results on synchronization of innovation across countries.


14.2 Key Ideas

It is helpful to begin with an overview of the mechanism.

14.2.1 Innovation Cycles

As discussed above, two countries produce and trade with each other.
In each country, firms innovate, producing new varieties of goods and, in doing so, receiving temporary monopoly power.
Imitators follow and, after one period of monopoly, what had previously been new varieties now enter competitive production.
Firms have incentives to innovate and produce new goods when the mass of varieties of goods currently in production is
relatively low.
In addition, there are strategic complementarities in the timing of innovation.
Firms have incentives to innovate in the same period, so as to avoid competing with substitutes that are competitively
produced.
This leads to temporal clustering in innovations in each country.
After a burst of innovation, the mass of goods currently in production increases.
However, goods also become obsolete, so that not all survive from period to period.
This mechanism generates a cycle, where the mass of varieties increases through simultaneous innovation and then falls
through obsolescence.

14.2.2 Synchronization

In the absence of trade, the timing of innovation cycles in each country is decoupled.
This will be the case when trade costs are prohibitively high.
If trade costs fall, then goods produced in each country penetrate each other’s markets.
As illustrated below, this leads to synchronization of business cycles across the two countries.

14.3 Model

Let’s write down the model more formally.


(The treatment is relatively terse since full details can be found in the original paper)
Time is discrete with 𝑡 = 0, 1, ….
There are two countries indexed by 𝑗 or 𝑘.
In each country, a representative household inelastically supplies 𝐿𝑗 units of labor at wage rate 𝑤𝑗,𝑡 .
Without loss of generality, it is assumed that 𝐿1 ≥ 𝐿2 .
Households consume a single nontradeable final good which is produced competitively.
Its production involves combining two types of tradeable intermediate inputs via
$$
Y_{k,t} = C_{k,t} = \left( \frac{X^o_{k,t}}{1 - \alpha} \right)^{1-\alpha} \left( \frac{X_{k,t}}{\alpha} \right)^{\alpha}
$$


Here $X^o_{k,t}$ is a homogeneous input which can be produced from labor using a linear, one-for-one technology.
It is freely tradeable, competitively supplied, and homogeneous across countries.
By choosing the price of this good as numeraire and assuming both countries find it optimal to always produce the
homogeneous good, we can set 𝑤1,𝑡 = 𝑤2,𝑡 = 1.
The good 𝑋𝑘,𝑡 is a composite, built from many differentiated goods via
$$
X_{k,t}^{1 - \frac{1}{\sigma}} = \int_{\Omega_t} \left[ x_{k,t}(\nu) \right]^{1 - \frac{1}{\sigma}} d\nu
$$

Here 𝑥𝑘,𝑡 (𝜈) is the total amount of a differentiated good 𝜈 ∈ Ω𝑡 that is produced.
The parameter 𝜎 > 1 is the direct partial elasticity of substitution between a pair of varieties and Ω𝑡 is the set of varieties
available in period 𝑡.
We can split the varieties into those which are supplied competitively and those supplied monopolistically; that is, $\Omega_t = \Omega^c_t + \Omega^m_t$.

14.3.1 Prices

Demand for differentiated inputs is


$$
x_{k,t}(\nu) = \left( \frac{p_{k,t}(\nu)}{P_{k,t}} \right)^{-\sigma} \frac{\alpha L_k}{P_{k,t}}
$$
Here
• 𝑝𝑘,𝑡 (𝜈) is the price of the variety 𝜈 and
• 𝑃𝑘,𝑡 is the price index for differentiated inputs in 𝑘, defined by
$$
\left[ P_{k,t} \right]^{1-\sigma} = \int_{\Omega_t} \left[ p_{k,t}(\nu) \right]^{1-\sigma} d\nu
$$

The price of a variety also depends on the origin, 𝑗, and destination, 𝑘, of the goods because shipping varieties between
countries incurs an iceberg trade cost 𝜏𝑗,𝑘 .
Thus the effective price in country 𝑘 of a variety 𝜈 produced in country 𝑗 becomes 𝑝𝑘,𝑡 (𝜈) = 𝜏𝑗,𝑘 𝑝𝑗,𝑡 (𝜈).
Using these expressions, we can derive the total demand for each variety, which is

$$
D_{j,t}(\nu) = \sum_k \tau_{j,k} \, x_{k,t}(\nu) = \alpha A_{j,t} \left( p_{j,t}(\nu) \right)^{-\sigma}
$$

where

$$
A_{j,t} := \sum_k \frac{\rho_{j,k} L_k}{(P_{k,t})^{1-\sigma}} \quad \text{and} \quad \rho_{j,k} = (\tau_{j,k})^{1-\sigma} \leq 1
$$

It is assumed that 𝜏1,1 = 𝜏2,2 = 1 and 𝜏1,2 = 𝜏2,1 = 𝜏 for some 𝜏 > 1, so that

𝜌1,2 = 𝜌2,1 = 𝜌 ∶= 𝜏 1−𝜎 < 1

The value 𝜌 ∈ [0, 1) is a proxy for the degree of globalization.


Producing one unit of each differentiated variety requires 𝜓 units of labor, so the marginal cost is equal to 𝜓 for 𝜈 ∈ Ω𝑗,𝑡 .
Additionally, all competitive varieties will have the same price (because of equal marginal cost), which means that, for
all 𝜈 ∈ Ω𝑐 ,
$$p_{j,t}(\nu) = p^c_{j,t} := \psi \quad \text{and} \quad D_{j,t} = y^c_{j,t} := \alpha A_{j,t} (p^c_{j,t})^{-\sigma}$$


Monopolists will have the same marked-up price, so, for all 𝜈 ∈ Ω𝑚 ,

$$p_{j,t}(\nu) = p^m_{j,t} := \frac{\psi}{1 - \frac{1}{\sigma}} \quad \text{and} \quad D_{j,t} = y^m_{j,t} := \alpha A_{j,t} (p^m_{j,t})^{-\sigma}$$

Define
$$\theta := \frac{p^c_{j,t}}{p^m_{j,t}} \frac{y^c_{j,t}}{y^m_{j,t}} = \left(1 - \frac{1}{\sigma}\right)^{1-\sigma}$$
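As a quick sanity check on this definition (not part of the model's derivation), one can verify numerically that the price and quantity ratios multiply out to $(1 - 1/\sigma)^{1-\sigma}$; the values chosen for $\psi$, $\sigma$, $\alpha$ and $A_{j,t}$ below are arbitrary placeholders.

import numpy as np

ψ, σ, α, A = 1.0, 3.0, 0.5, 1.0                  # placeholder values
p_c = ψ                                          # competitive price
p_m = ψ / (1 - 1 / σ)                            # marked-up monopoly price
y_c = α * A * p_c ** (-σ)                        # demand at the competitive price
y_m = α * A * p_m ** (-σ)                        # demand at the monopoly price
print((p_c / p_m) * (y_c / y_m), (1 - 1 / σ) ** (1 - σ))   # both evaluate to 2.25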

Using the preceding definitions and some algebra, the price indices can now be rewritten as
$$\left(\frac{P_{k,t}}{\psi}\right)^{1-\sigma} = M_{k,t} + \rho M_{j,t} \quad \text{where} \quad M_{j,t} := N^c_{j,t} + \frac{N^m_{j,t}}{\theta}$$

The symbols $N^c_{j,t}$ and $N^m_{j,t}$ denote the measures of $\Omega^c$ and $\Omega^m$ respectively.

14.3.2 New Varieties

To introduce a new variety, a firm must hire 𝑓 units of labor per variety in each country.
Monopolist profits must be less than or equal to zero in expectation, so
$$N^m_{j,t} \ge 0, \quad \pi^m_{j,t} := (p^m_{j,t} - \psi) y^m_{j,t} - f \le 0 \quad \text{and} \quad \pi^m_{j,t} N^m_{j,t} = 0$$

With further manipulations, this becomes

$$N^m_{j,t} = \theta(M_{j,t} - N^c_{j,t}) \ge 0, \quad \frac{1}{\sigma}\left[\frac{\alpha L_j}{\theta(M_{j,t} + \rho M_{k,t})} + \frac{\alpha L_k}{\theta(M_{j,t} + M_{k,t}/\rho)}\right] \le f$$

14.3.3 Law of Motion

With $\delta$ denoting the fraction of varieties that survive obsolescence each period (i.e., are not exogenously destroyed), the dynamic equation for the measure of firms becomes

$$N^c_{j,t+1} = \delta(N^c_{j,t} + N^m_{j,t}) = \delta(N^c_{j,t} + \theta(M_{j,t} - N^c_{j,t}))$$

We will work with a normalized measure of varieties


$$n_{j,t} := \frac{\theta \sigma f N^c_{j,t}}{\alpha(L_1 + L_2)}, \quad i_{j,t} := \frac{\theta \sigma f N^m_{j,t}}{\alpha(L_1 + L_2)}, \quad m_{j,t} := \frac{\theta \sigma f M_{j,t}}{\alpha(L_1 + L_2)} = n_{j,t} + \frac{i_{j,t}}{\theta}$$

We also use $s_j := \frac{L_j}{L_1 + L_2}$ to denote the share of labor employed in country $j$.
We can use these definitions and the preceding expressions to obtain a law of motion for 𝑛𝑡 ∶= (𝑛1,𝑡 , 𝑛2,𝑡 ).
In particular, given an initial condition $n_0 = (n_{1,0}, n_{2,0}) \in \mathbb{R}^2_+$, the equilibrium trajectory $\{n_t\}_{t=0}^{\infty} = \{(n_{1,t}, n_{2,t})\}_{t=0}^{\infty}$ is obtained by iterating on $n_{t+1} = F(n_t)$ where $F : \mathbb{R}^2_+ \to \mathbb{R}^2_+$ is given by

$$F(n_t) = \begin{cases}
\big(\delta(\theta s_1(\rho) + (1-\theta) n_{1,t}), \; \delta(\theta s_2(\rho) + (1-\theta) n_{2,t})\big) & \text{for } n_t \in D_{LL} \\
\big(\delta n_{1,t}, \; \delta n_{2,t}\big) & \text{for } n_t \in D_{HH} \\
\big(\delta n_{1,t}, \; \delta(\theta h_2(n_{1,t}) + (1-\theta) n_{2,t})\big) & \text{for } n_t \in D_{HL} \\
\big(\delta(\theta h_1(n_{2,t}) + (1-\theta) n_{1,t}), \; \delta n_{2,t}\big) & \text{for } n_t \in D_{LH}
\end{cases}$$


Here

$$\begin{aligned}
D_{LL} &:= \{(n_1, n_2) \in \mathbb{R}^2_+ \,|\, n_j \le s_j(\rho)\} \\
D_{HH} &:= \{(n_1, n_2) \in \mathbb{R}^2_+ \,|\, n_j \ge h_j(n_k)\} \\
D_{HL} &:= \{(n_1, n_2) \in \mathbb{R}^2_+ \,|\, n_1 \ge s_1(\rho) \text{ and } n_2 \le h_2(n_1)\} \\
D_{LH} &:= \{(n_1, n_2) \in \mathbb{R}^2_+ \,|\, n_1 \le h_1(n_2) \text{ and } n_2 \ge s_2(\rho)\}
\end{aligned}$$

while

$$s_1(\rho) = 1 - s_2(\rho) = \min\left\{\frac{s_1 - \rho s_2}{1 - \rho}, \; 1\right\}$$

and $h_j(n_k)$ is defined implicitly by the equation

$$1 = \frac{s_j}{h_j(n_k) + \rho n_k} + \frac{s_k}{h_j(n_k) + n_k/\rho}$$

Rewriting the equation above gives us a quadratic equation in terms of ℎ𝑗 (𝑛𝑘 ).


Since we know that $h_j(n_k) > 0$, we can solve the quadratic equation and take the positive root.
This gives us
$$h_j(n_k)^2 + \left(\left(\rho + \frac{1}{\rho}\right) n_k - s_j - s_k\right) h_j(n_k) + \left(n_k^2 - \frac{s_j n_k}{\rho} - s_k n_k \rho\right) = 0$$
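Before turning to the JIT-compiled implementation below, here is a small standalone sketch (with arbitrary illustrative values for $s_1$, $s_2$, $\rho$ and $n_k$) checking that the positive root of this quadratic does satisfy the implicit equation defining $h_j(n_k)$.

import numpy as np

s1, s2, ρ, nk = 0.5, 0.5, 0.2, 0.3               # illustrative values
sj, sk = s1, s2                                  # evaluate h_1(n_2), say

b = (ρ + 1 / ρ) * nk - sj - sk                   # quadratic coefficients (a = 1)
c = nk ** 2 - sj * nk / ρ - sk * ρ * nk
hj = (-b + np.sqrt(b ** 2 - 4 * c)) / 2          # positive root

print(sj / (hj + ρ * nk) + sk / (hj + nk / ρ))   # ≈ 1, as required by the implicit equation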

14.4 Simulation

Let’s try simulating some of these trajectories.


We will focus in particular on whether or not innovation cycles synchronize across the two countries.
As we will see, this depends on initial conditions.
For some parameterizations, synchronization will occur for “most” initial conditions, while for others synchronization will
be rare.
The computational burden of testing synchronization across many initial conditions is not trivial.
In order to make our code fast, we will use just-in-time (JIT) compiled functions that will get called and handled by our class.
These are the @jit statements that you see below (review this lecture if you don’t recall how to use JIT compilation).
Here’s the main body of code

@jit(nopython=True)
def _hj(j, nk, s1, s2, θ, δ, ρ):
"""
If we expand the implicit function for h_j(n_k) then we find that
it is quadratic. We know that h_j(n_k) > 0 so we can get its
value by using the quadratic form
"""
# Find out whose h we are evaluating
if j == 1:
sj = s1
sk = s2
else:
sj = s2


sk = s1

# Coefficients on the quadratic a x^2 + b x + c = 0


a = 1.0
b = ((ρ + 1 / ρ) * nk - sj - sk)
c = (nk * nk - (sj * nk) / ρ - sk * ρ * nk)

# Positive solution of quadratic form


root = (-b + np.sqrt(b * b - 4 * a * c)) / (2 * a)

return root

@jit(nopython=True)
def DLL(n1, n2, s1_ρ, s2_ρ, s1, s2, θ, δ, ρ):
"Determine whether (n1, n2) is in the set DLL"
return (n1 <= s1_ρ) and (n2 <= s2_ρ)

@jit(nopython=True)
def DHH(n1, n2, s1_ρ, s2_ρ, s1, s2, θ, δ, ρ):
"Determine whether (n1, n2) is in the set DHH"
return (n1 >= _hj(1, n2, s1, s2, θ, δ, ρ)) and \
(n2 >= _hj(2, n1, s1, s2, θ, δ, ρ))

@jit(nopython=True)
def DHL(n1, n2, s1_ρ, s2_ρ, s1, s2, θ, δ, ρ):
"Determine whether (n1, n2) is in the set DHL"
return (n1 >= s1_ρ) and (n2 <= _hj(2, n1, s1, s2, θ, δ, ρ))

@jit(nopython=True)
def DLH(n1, n2, s1_ρ, s2_ρ, s1, s2, θ, δ, ρ):
"Determine whether (n1, n2) is in the set DLH"
return (n1 <= _hj(1, n2, s1, s2, θ, δ, ρ)) and (n2 >= s2_ρ)

@jit(nopython=True)
def one_step(n1, n2, s1_ρ, s2_ρ, s1, s2, θ, δ, ρ):
"""
Takes a current value for (n_{1, t}, n_{2, t}) and returns the
values (n_{1, t+1}, n_{2, t+1}) according to the law of motion.
"""
# Depending on where we are, evaluate the right branch
if DLL(n1, n2, s1_ρ, s2_ρ, s1, s2, θ, δ, ρ):
n1_tp1 = δ * (θ * s1_ρ + (1 - θ) * n1)
n2_tp1 = δ * (θ * s2_ρ + (1 - θ) * n2)
elif DHH(n1, n2, s1_ρ, s2_ρ, s1, s2, θ, δ, ρ):
n1_tp1 = δ * n1
n2_tp1 = δ * n2
elif DHL(n1, n2, s1_ρ, s2_ρ, s1, s2, θ, δ, ρ):
n1_tp1 = δ * n1
n2_tp1 = δ * (θ * _hj(2, n1, s1, s2, θ, δ, ρ) + (1 - θ) * n2)
elif DLH(n1, n2, s1_ρ, s2_ρ, s1, s2, θ, δ, ρ):
n1_tp1 = δ * (θ * _hj(1, n2, s1, s2, θ, δ, ρ) + (1 - θ) * n1)
n2_tp1 = δ * n2

return n1_tp1, n2_tp1

@jit(nopython=True)



def n_generator(n1_0, n2_0, s1_ρ, s2_ρ, s1, s2, θ, δ, ρ):
"""
Given an initial condition, continues to yield new values of
n1 and n2
"""
n1_t, n2_t = n1_0, n2_0
while True:
n1_tp1, n2_tp1 = one_step(n1_t, n2_t, s1_ρ, s2_ρ, s1, s2, θ, δ, ρ)
yield (n1_tp1, n2_tp1)
n1_t, n2_t = n1_tp1, n2_tp1

@jit(nopython=True)
def _pers_till_sync(n1_0, n2_0, s1_ρ, s2_ρ, s1, s2, θ, δ, ρ, maxiter, npers):
"""
Takes initial values and iterates forward to see whether
the histories eventually end up in sync.

If countries are symmetric then as soon as the two countries have the
same measure of firms then they will be synchronized -- However, if
they are not symmetric then it is possible they have the same measure
of firms but are not yet synchronized. To address this, we check whether
firms stay synchronized for `npers` periods with Euclidean norm

Parameters
----------
n1_0 : scalar(Float)
Initial normalized measure of firms in country one
n2_0 : scalar(Float)
Initial normalized measure of firms in country two
maxiter : scalar(Int)
Maximum number of periods to simulate
npers : scalar(Int)
Number of periods we would like the countries to have the
same measure for

Returns
-------
synchronized : scalar(Bool)
Did the two economies end up synchronized
pers_2_sync : scalar(Int)
The number of periods required until they synchronized
"""
# Initialize the status of synchronization
synchronized = False
pers_2_sync = maxiter
iters = 0

# Initialize generator
n_gen = n_generator(n1_0, n2_0, s1_ρ, s2_ρ, s1, s2, θ, δ, ρ)

# Will use a counter to determine how many times in a row


# the firm measures are the same
nsync = 0

while (not synchronized) and (iters < maxiter):


# Increment the number of iterations and get next values



iters += 1
n1_t, n2_t = next(n_gen)

# Check whether same in this period


if abs(n1_t - n2_t) < 1e-8:
nsync += 1
# If not, then reset the nsync counter
else:
nsync = 0

# If we have been in sync for npers then stop and countries


# became synchronized nsync periods ago
if nsync > npers:
synchronized = True
pers_2_sync = iters - nsync

return synchronized, pers_2_sync

@jit(nopython=True)
def _create_attraction_basis(s1_ρ, s2_ρ, s1, s2, θ, δ, ρ,
maxiter, npers, npts):
# Create unit range with npts
synchronized, pers_2_sync = False, 0
unit_range = np.linspace(0.0, 1.0, npts)

# Allocate space to store time to sync


time_2_sync = np.empty((npts, npts))
# Iterate over initial conditions
for (i, n1_0) in enumerate(unit_range):
for (j, n2_0) in enumerate(unit_range):
synchronized, pers_2_sync = _pers_till_sync(n1_0, n2_0, s1_ρ,
s2_ρ, s1, s2, θ, δ,
ρ, maxiter, npers)
time_2_sync[i, j] = pers_2_sync

return time_2_sync

# == Now we define a class for the model == #

class MSGSync:
"""
The paper "Globalization and Synchronization of Innovation Cycles" presents
a two-country model with endogenous innovation cycles. Combines elements
from Deneckere Judd (1985) and Helpman Krugman (1985) to allow for a
model with trade that has firms who can introduce new varieties into
the economy.

We focus on being able to determine whether the two countries eventually


synchronize their innovation cycles. To do this, we only need a few
of the many parameters. In particular, we need the parameters listed
below

Parameters
----------
s1 : scalar(Float)



Amount of total labor in country 1 relative to total worldwide labor
θ : scalar(Float)
A measure of how much more of the competitive variety is used in
production of final goods
δ : scalar(Float)
Percentage of firms that are not exogenously destroyed every period
ρ : scalar(Float)
Measure of how expensive it is to trade between countries
"""
def __init__(self, s1=0.5, θ=2.5, δ=0.7, ρ=0.2):
# Store model parameters
self.s1, self.θ, self.δ, self.ρ = s1, θ, δ, ρ

# Store other cutoffs and parameters we use


self.s2 = 1 - s1
self.s1_ρ = self._calc_s1_ρ()
self.s2_ρ = 1 - self.s1_ρ

def _unpack_params(self):
return self.s1, self.s2, self.θ, self.δ, self.ρ

def _calc_s1_ρ(self):
# Unpack params
s1, s2, θ, δ, ρ = self._unpack_params()

# s_1(ρ) = min(val, 1)
val = (s1 - ρ * s2) / (1 - ρ)
return min(val, 1)

def simulate_n(self, n1_0, n2_0, T):


"""
Simulates the values of (n1, n2) for T periods

Parameters
----------
n1_0 : scalar(Float)
Initial normalized measure of firms in country one
n2_0 : scalar(Float)
Initial normalized measure of firms in country two
T : scalar(Int)
Number of periods to simulate

Returns
-------
n1 : Array(Float64, ndim=1)
A history of normalized measures of firms in country one
n2 : Array(Float64, ndim=1)
A history of normalized measures of firms in country two
"""
# Unpack parameters
s1, s2, θ, δ, ρ = self._unpack_params()
s1_ρ, s2_ρ = self.s1_ρ, self.s2_ρ

# Allocate space
n1 = np.empty(T)
n2 = np.empty(T)


# Create the generator


n1[0], n2[0] = n1_0, n2_0
n_gen = n_generator(n1_0, n2_0, s1_ρ, s2_ρ, s1, s2, θ, δ, ρ)

# Simulate for T periods


for t in range(1, T):
# Get next values
n1_tp1, n2_tp1 = next(n_gen)

# Store in arrays
n1[t] = n1_tp1
n2[t] = n2_tp1

return n1, n2

def pers_till_sync(self, n1_0, n2_0, maxiter=500, npers=3):


"""
Takes initial values and iterates forward to see whether
the histories eventually end up in sync.

If countries are symmetric then as soon as the two countries have the
same measure of firms then they will be synchronized -- However, if
they are not symmetric then it is possible they have the same measure
of firms but are not yet synchronized. To address this, we check whether
firms stay synchronized for `npers` periods with Euclidean norm

Parameters
----------
n1_0 : scalar(Float)
Initial normalized measure of firms in country one
n2_0 : scalar(Float)
Initial normalized measure of firms in country two
maxiter : scalar(Int)
Maximum number of periods to simulate
npers : scalar(Int)
Number of periods we would like the countries to have the
same measure for

Returns
-------
synchronized : scalar(Bool)
Did the two economies end up synchronized
pers_2_sync : scalar(Int)
The number of periods required until they synchronized
"""
# Unpack parameters
s1, s2, θ, δ, ρ = self._unpack_params()
s1_ρ, s2_ρ = self.s1_ρ, self.s2_ρ

return _pers_till_sync(n1_0, n2_0, s1_ρ, s2_ρ,


s1, s2, θ, δ, ρ, maxiter, npers)

def create_attraction_basis(self, maxiter=250, npers=3, npts=50):


"""
Creates an attraction basis for values of n on [0, 1] X [0, 1]



with npts in each dimension
"""
# Unpack parameters
s1, s2, θ, δ, ρ = self._unpack_params()
s1_ρ, s2_ρ = self.s1_ρ, self.s2_ρ

ab = _create_attraction_basis(s1_ρ, s2_ρ, s1, s2, θ, δ,


ρ, maxiter, npers, npts)

return ab

14.4.1 Time Series of Firm Measures

We write a short function below that exploits the preceding code and plots two time series.
Each time series gives the dynamics for the two countries.
The time series share parameters but differ in their initial condition.
Here’s the function

def plot_timeseries(n1_0, n2_0, s1=0.5, θ=2.5,


δ=0.7, ρ=0.2, ax=None, title=''):
"""
Plot a single time series with initial conditions
"""
if ax is None:
fig, ax = plt.subplots()

# Create the MSG Model and simulate with initial conditions


model = MSGSync(s1, θ, δ, ρ)
n1, n2 = model.simulate_n(n1_0, n2_0, 25)

ax.plot(np.arange(25), n1, label="$n_1$", lw=2)


ax.plot(np.arange(25), n2, label="$n_2$", lw=2)

ax.legend()
ax.set(title=title, ylim=(0.15, 0.8))

return ax

# Create figure
fig, ax = plt.subplots(2, 1, figsize=(10, 8))

plot_timeseries(0.15, 0.35, ax=ax[0], title='Not Synchronized')


plot_timeseries(0.4, 0.3, ax=ax[1], title='Synchronized')

fig.tight_layout()

plt.show()


In the first case, innovation in the two countries does not synchronize.
In the second case, different initial conditions are chosen, and the cycles become synchronized.

14.4.2 Basin of Attraction

Next, let’s study the initial conditions that lead to synchronized cycles more systematically.
We generate time series from a large collection of different initial conditions and mark those conditions with different
colors according to whether synchronization occurs or not.
The next display shows exactly this for four different parameterizations (one for each subfigure).
Dark colors indicate synchronization, while light colors indicate failure to synchronize.
As you can see, larger values of 𝜌 translate to more synchronization.
You are asked to replicate this figure in the exercises.
In the solution to the exercises, you’ll also find a figure with sliders, allowing you to experiment with different parameters.
Here’s one snapshot from the interactive figure


14.5 Exercises

Exercise 14.5.1
Replicate the figure shown above by coloring initial conditions according to whether or not synchronization occurs from
those conditions.

Solution to Exercise 14.5.1

def plot_attraction_basis(s1=0.5, θ=2.5, δ=0.7, ρ=0.2, npts=250, ax=None):


if ax is None:
fig, ax = plt.subplots()

# Create attraction basis


unitrange = np.linspace(0, 1, npts)
model = MSGSync(s1, θ, δ, ρ)
ab = model.create_attraction_basis(npts=npts)
cf = ax.pcolormesh(unitrange, unitrange, ab, cmap="viridis")

return ab, cf

fig = plt.figure(figsize=(14, 12))

# Left - Bottom - Width - Height


ax0 = fig.add_axes((0.05, 0.475, 0.38, 0.35), label="axes0")
ax1 = fig.add_axes((0.5, 0.475, 0.38, 0.35), label="axes1")
ax2 = fig.add_axes((0.05, 0.05, 0.38, 0.35), label="axes2")
ax3 = fig.add_axes((0.5, 0.05, 0.38, 0.35), label="axes3")

params = [[0.5, 2.5, 0.7, 0.2],


[0.5, 2.5, 0.7, 0.4],
[0.5, 2.5, 0.7, 0.6],
[0.5, 2.5, 0.7, 0.8]]

ab0, cf0 = plot_attraction_basis(*params[0], npts=500, ax=ax0)


ab1, cf1 = plot_attraction_basis(*params[1], npts=500, ax=ax1)
ab2, cf2 = plot_attraction_basis(*params[2], npts=500, ax=ax2)
ab3, cf3 = plot_attraction_basis(*params[3], npts=500, ax=ax3)

cbar_ax = fig.add_axes([0.9, 0.075, 0.03, 0.725])


plt.colorbar(cf0, cax=cbar_ax)

ax0.set_title(r"$s_1=0.5$, $\theta=2.5$, $\delta=0.7$, $\rho=0.2$",


fontsize=22)
ax1.set_title(r"$s_1=0.5$, $\theta=2.5$, $\delta=0.7$, $\rho=0.4$",
fontsize=22)
ax2.set_title(r"$s_1=0.5$, $\theta=2.5$, $\delta=0.7$, $\rho=0.6$",
fontsize=22)
ax3.set_title(r"$s_1=0.5$, $\theta=2.5$, $\delta=0.7$, $\rho=0.8$",
fontsize=22)

fig.suptitle("Synchronized versus Asynchronized 2-cycles",


x=0.475, y=0.915, size=26)
plt.show()


Additionally, instead of just seeing 4 plots at once, we might want to be able to change 𝜌 manually and see how it affects the plot in real time. Below we use an interactive plot to do this.
Note, interactive plotting requires the ipywidgets module to be installed and enabled.

Note: This interactive plot is disabled on this static webpage. In order to use it, we recommend running this notebook locally.

def interact_attraction_basis(ρ=0.2, maxiter=250, npts=250):


# Create the figure and axis that we will plot on
fig, ax = plt.subplots(figsize=(12, 10))
# Create model and attraction basis
s1, θ, δ = 0.5, 2.5, 0.75
model = MSGSync(s1, θ, δ, ρ)
ab = model.create_attraction_basis(maxiter=maxiter, npts=npts)
# Color map with colormesh
unitrange = np.linspace(0, 1, npts)
cf = ax.pcolormesh(unitrange, unitrange, ab, cmap="viridis")
cbar_ax = fig.add_axes([0.95, 0.15, 0.05, 0.7])
plt.colorbar(cf, cax=cbar_ax)
plt.show()
return None


from ipywidgets import interact  # import needed for the interactive figure

fig = interact(interact_attraction_basis,
ρ=(0.0, 1.0, 0.05),
maxiter=(50, 5000, 50),
npts=(25, 750, 25))



CHAPTER

FIFTEEN

COASE’S THEORY OF THE FIRM

15.1 Overview

In 1937, Ronald Coase wrote a brilliant essay on the nature of the firm [Coase, 1937].
Coase was writing at a time when the Soviet Union was rising to become a significant industrial power.
At the same time, many free-market economies were afflicted by a severe and painful depression.
This contrast led to an intensive debate on the relative merits of decentralized, price-based allocation versus top-down
planning.
In the midst of this debate, Coase made an important observation: even in free-market economies, a great deal of top-
down planning does in fact take place.
This is because firms form an integral part of free-market economies and, within firms, allocation is by planning.
In other words, free-market economies blend both planning (within firms) and decentralized production coordinated by
prices.
The question Coase asked is this: if prices and free markets are so efficient, then why do firms even exist?
Couldn’t the associated within-firm planning be done more efficiently by the market?
We’ll use the following imports:

import numpy as np
import matplotlib.pyplot as plt
from scipy.optimize import fminbound

15.1.1 Why Firms Exist

On top of asking a deep and fascinating question, Coase also supplied an illuminating answer: firms exist because of
transaction costs.
Here’s one example of a transaction cost:
Suppose agent A is considering setting up a small business and needs a web developer to construct and help run an online
store.
She can use the labor of agent B, a web developer, by writing up a freelance contract for these tasks and agreeing on a
suitable price.
But contracts like this can be time-consuming and difficult to verify
• How will agent A be able to specify exactly what she wants, to the finest detail, when she herself isn’t sure how the
business will evolve?


• And what if she isn’t familiar with web technology? How can she specify all the relevant details?
• And, if things go badly, will failure to comply with the contract be verifiable in court?
In this situation, perhaps it will be easier to employ agent B under a simple labor contract.
The cost of this contract is far smaller because such contracts are simpler and more standard.
The basic agreement in a labor contract is: B will do what A asks him to do for the term of the contract, in return for a
given salary.
Making this agreement is much easier than trying to map every task out in advance in a contract that will hold up in a
court of law.
So agent A decides to hire agent B and a firm of nontrivial size appears, due to transaction costs.

15.1.2 A Trade-Off

Actually, we haven’t yet come to the heart of Coase’s investigation.


The issue of why firms exist is a binary question: should firms have positive size or zero size?
A better and more general question is: what determines the size of firms?
The answer Coase came up with was that “a firm will tend to expand until the costs of organizing an extra transaction
within the firm become equal to the costs of carrying out the same transaction by means of an exchange on the open
market…” ([Coase, 1937], p. 395).
But what are these internal and external costs?
In short, Coase envisaged a trade-off between
• transaction costs, which add to the expense of operating between firms, and
• diminishing returns to management, which adds to the expense of operating within firms
We discussed an example of transaction costs above (contracts).
The other cost, diminishing returns to management, is a catch-all for the idea that big operations are increasingly costly
to manage.
For example, you could think of management as a pyramid, so hiring more workers to implement more tasks requires
expansion of the pyramid, and hence labor costs grow at a rate more than proportional to the range of tasks.
Diminishing returns to management makes in-house production expensive, favoring small firms.

15.1.3 Summary

Here’s a summary of our discussion:


• Firms grow because transaction costs encourage them to take some operations in house.
• But as they get large, in-house operations become costly due to diminishing returns to management.
• The size of firms is determined by balancing these effects, thereby equalizing the marginal costs of each form of
operation.


15.1.4 A Quantitative Interpretation

Coase's ideas were expressed verbally, without any mathematics.


In fact, his essay is a wonderful example of how far you can get with clear thinking and plain English.
However, plain English is not good for quantitative analysis, so let’s bring some mathematical and computation tools to
bear.
In doing so we’ll add a bit more structure than Coase did, but this price will be worth paying.
Our exposition is based on [Kikuchi et al., 2018].

15.2 The Model

The model we study involves production of a single unit of a final good.


Production requires a linearly ordered chain, requiring sequential completion of a large number of processing stages.
The stages are indexed by 𝑡 ∈ [0, 1], with 𝑡 = 0 indicating that no tasks have been undertaken and 𝑡 = 1 indicating that
the good is complete.

15.2.1 Subcontracting

The subcontracting scheme by which tasks are allocated across firms is illustrated in the figure below

In this example,
• Firm 1 receives a contract to sell one unit of the completed good to a final buyer.
• Firm 1 then forms a contract with firm 2 to purchase the partially completed good at stage 𝑡1 , with the intention of
implementing the remaining 1 − 𝑡1 tasks in-house (i.e., processing from stage 𝑡1 to stage 1).
• Firm 2 repeats this procedure, forming a contract with firm 3 to purchase the good at stage 𝑡2 .
• Firm 3 decides to complete the chain, selecting 𝑡3 = 0.
At this point, production unfolds in the opposite direction (i.e., from upstream to downstream).


• Firm 3 completes processing stages from 𝑡3 = 0 up to 𝑡2 and transfers the good to firm 2.
• Firm 2 then processes from 𝑡2 up to 𝑡1 and transfers the good to firm 1,
• Firm 1 processes from 𝑡1 to 1 and delivers the completed good to the final buyer.
The length of the interval of stages (range of tasks) carried out by firm 𝑖 is denoted by ℓ𝑖 .

Each firm chooses only its upstream boundary, treating its downstream boundary as given.
The benefit of this formulation is that it implies a recursive structure for the decision problem for each firm.
In choosing how many processing stages to subcontract, each successive firm faces essentially the same decision problem
as the firm above it in the chain, with the only difference being that the decision space is a subinterval of the decision
space for the firm above.
We will exploit this recursive structure in our study of equilibrium.

15.2.2 Costs

Recall that we are considering a trade-off between two types of costs.


Let’s discuss these costs and how we represent them mathematically.
Diminishing returns to management means rising costs per task when a firm expands the range of productive activities
coordinated by its managers.
We represent these ideas by taking the cost of carrying out ℓ tasks in-house to be 𝑐(ℓ), where 𝑐 is increasing and strictly
convex.
Thus, the average cost per task rises with the range of tasks performed in-house.
We also assume that 𝑐 is continuously differentiable, with 𝑐(0) = 0 and 𝑐′ (0) > 0.
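For example, one specification satisfying all of these requirements is $c(\ell) = e^{10\ell} - 1$, which is also the default cost function used in the code later in this lecture; here is a quick numerical sanity check of its properties on a grid.

import numpy as np

c = lambda l: np.exp(10 * l) - 1          # c(0) = 0, c'(0) = 10 > 0
grid = np.linspace(0, 0.5, 6)
print(np.round(c(grid), 2))               # increasing in the range of tasks
print(np.all(np.diff(c(grid), 2) > 0))    # positive second differences: convex on this grid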
Transaction costs are represented as a wedge between the buyer’s and seller’s prices.
It matters little for us whether the transaction cost is borne by the buyer or the seller.
Here we assume that the cost is borne only by the buyer.
In particular, when two firms agree to a trade at face value 𝑣, the buyer’s total outlay is 𝛿𝑣, where 𝛿 > 1.
The seller receives only 𝑣, and the difference is paid to agents outside the model.


15.3 Equilibrium

We assume that all firms are ex-ante identical and act as price takers.
As price takers, they face a price function 𝑝, which is a map from [0, 1] to ℝ+ , with 𝑝(𝑡) interpreted as the price of the
good at processing stage 𝑡.
There is a countable infinity of firms indexed by 𝑖 and no barriers to entry.
The cost of supplying the initial input (the good processed up to stage zero) is set to zero for simplicity.
Free entry and the infinite fringe of competitors rule out positive profits for incumbents, since any incumbent could be
replaced by a member of the competitive fringe filling the same role in the production chain.
Profits are never negative in equilibrium because firms can freely exit.

15.3.1 Informal Definition of Equilibrium

An equilibrium in this setting is an allocation of firms and a price function such that
1. all active firms in the chain make zero profits, including suppliers of raw materials
2. no firm in the production chain has an incentive to deviate, and
3. no inactive firms can enter and extract positive profits

15.3.2 Formal Definition of Equilibrium

Let’s make this definition more formal.


(You might like to skip this section on first reading)
An allocation of firms is a nonnegative sequence {ℓ𝑖 }𝑖∈ℕ such that ℓ𝑖 = 0 for all sufficiently large 𝑖.
Recalling the figures above,
• ℓ𝑖 represents the range of tasks implemented by the 𝑖-th firm
As a labeling convention, we assume that firms enter in order, with firm 1 being the furthest downstream.
An allocation $\{\ell_i\}$ is called feasible if $\sum_{i \ge 1} \ell_i = 1$.
In a feasible allocation, the entire production process is completed by finitely many firms.
Given a feasible allocation, {ℓ𝑖 }, let {𝑡𝑖 } represent the corresponding transaction stages, defined by

𝑡0 = 𝑠 and 𝑡𝑖 = 𝑡𝑖−1 − ℓ𝑖 (15.1)

In particular, 𝑡𝑖−1 is the downstream boundary of firm 𝑖 and 𝑡𝑖 is its upstream boundary.
As transaction costs are incurred only by the buyer, its profits are

𝜋𝑖 = 𝑝(𝑡𝑖−1 ) − 𝑐(ℓ𝑖 ) − 𝛿𝑝(𝑡𝑖 ) (15.2)

Given a price function 𝑝 and a feasible allocation {ℓ𝑖 }, let


• {𝑡𝑖 } be the corresponding firm boundaries.
• {𝜋𝑖 } be corresponding profits, as defined in (15.2).
This price-allocation pair is called an equilibrium for the production chain if


1. 𝑝(0) = 0,
2. 𝜋𝑖 = 0 for all 𝑖, and
3. $p(s) - c(s-t) - \delta p(t) \le 0$ for any pair $s, t$ with $0 \le t \le s \le 1$.
The rationale behind these conditions was given in our informal definition of equilibrium above.
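As a purely illustrative translation of these conditions into code (not part of the formal development), here is a sketch of a function that checks them numerically, up to a tolerance, for a candidate price function `p`, a feasible allocation `ell`, a cost function `c` and a transaction cost parameter `δ`; all of these are assumed inputs supplied by the user.

import numpy as np

def check_equilibrium(p, ell, c, δ, tol=1e-6, gridsize=100):
    """
    Rough numerical check of conditions 1-3 above for a candidate price
    function p (a callable on [0, 1]) and feasible allocation ell
    (task ranges summing to 1, firm 1 first).
    """
    t = 1 - np.cumsum(ell)                      # upstream boundaries t_i
    t_prev = np.concatenate(([1.0], t[:-1]))    # downstream boundaries t_{i-1}
    profits = p(t_prev) - c(np.array(ell)) - δ * p(t)

    cond1 = abs(p(0.0)) < tol                   # p(0) = 0
    cond2 = np.all(np.abs(profits) < tol)       # zero profits for active firms
    grid = np.linspace(0, 1, gridsize)
    cond3 = all(p(s) - c(s - u) - δ * p(u) <= tol
                for s in grid for u in grid if u <= s)   # no profitable entry
    return cond1, cond2, cond3

Once the equilibrium price function and transaction stages are computed in the implementation section below, the corresponding allocation $\ell_i^* = t_{i-1}^* - t_i^*$ can be passed to this function.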

15.4 Existence, Uniqueness and Computation of Equilibria

We have defined an equilibrium but does one exist? Is it unique? And, if so, how can we compute it?

15.4.1 A Fixed Point Method

To address these questions, we introduce the operator 𝑇 mapping a nonnegative function 𝑝 on [0, 1] to 𝑇 𝑝 via
$$Tp(s) = \min_{t \le s} \{c(s-t) + \delta p(t)\} \quad \text{for all } s \in [0,1]. \tag{15.3}$$

Here and below, the restriction 0 ≤ 𝑡 in the minimum is understood.


The operator 𝑇 is similar to a Bellman operator.
Under this analogy, 𝑝 corresponds to a value function and 𝛿 to a discount factor.
But 𝛿 > 1, so 𝑇 is not a contraction in any obvious metric, and in fact, 𝑇 𝑛 𝑝 diverges for many choices of 𝑝.
Nevertheless, there exists a domain on which 𝑇 is well-behaved: the set of convex increasing continuous functions
𝑝 ∶ [0, 1] → ℝ such that 𝑐′ (0)𝑠 ≤ 𝑝(𝑠) ≤ 𝑐(𝑠) for all 0 ≤ 𝑠 ≤ 1.
We denote this set of functions by 𝒫.
In [Kikuchi et al., 2018] it is shown that the following statements are true:
1. 𝑇 maps 𝒫 into itself.
2. 𝑇 has a unique fixed point in 𝒫, denoted below by 𝑝∗ .
3. For all 𝑝 ∈ 𝒫 we have 𝑇 𝑘 𝑝 → 𝑝∗ uniformly as 𝑘 → ∞.
Now consider the choice function
$$t^*(s) := \text{the solution to } \min_{t \le s}\{c(s-t) + \delta p^*(t)\} \tag{15.4}$$

By definition, 𝑡∗ (𝑠) is the cost-minimizing upstream boundary for a firm that is contracted to deliver the good at stage 𝑠
and faces the price function 𝑝∗ .
Since 𝑝∗ lies in 𝒫 and since 𝑐 is strictly convex, it follows that the right-hand side of (15.4) is continuous and strictly
convex in 𝑡.
Hence the minimizer 𝑡∗ (𝑠) exists and is uniquely defined.
We can use 𝑡∗ to construct an equilibrium allocation as follows:
Recall that firm 1 sells the completed good at stage $s = 1$, so its optimal upstream boundary is $t^*(1)$.
Hence firm 2's optimal upstream boundary is $t^*(t^*(1))$.
Continuing in this way produces the sequence {𝑡∗𝑖 } defined by

$$t_0^* = 1 \quad \text{and} \quad t_i^* = t^*(t_{i-1}^*) \tag{15.5}$$

The sequence ends when a firm chooses to complete all remaining tasks.


We label this firm (and hence the number of firms in the chain) as

$$n^* := \inf\{i \in \mathbb{N} : t_i^* = 0\} \tag{15.6}$$

The task allocation corresponding to (15.5) is given by ℓ𝑖∗ ∶= 𝑡∗𝑖−1 − 𝑡∗𝑖 for all 𝑖.
In [Kikuchi et al., 2018] it is shown that
1. The value 𝑛∗ in (15.6) is well-defined and finite,
2. the allocation {ℓ𝑖∗ } is feasible, and
3. the price function 𝑝∗ and this allocation together forms an equilibrium for the production chain.
While the proofs are too long to repeat here, much of the insight can be obtained by observing that, as a fixed point of
𝑇 , the equilibrium price function must satisfy

$$p^*(s) = \min_{t \le s} \{c(s-t) + \delta p^*(t)\} \quad \text{for all } s \in [0,1] \tag{15.7}$$

From this equation, it is clear that profits are zero for all incumbent firms: since $t_i^*$ attains the minimum, $p^*(t_{i-1}^*) = c(\ell_i^*) + \delta p^*(t_i^*)$, which by (15.2) gives $\pi_i = 0$.

15.4.2 Marginal Conditions

We can develop some additional insights on the behavior of firms by examining marginal conditions associated with the
equilibrium.
As a first step, let ℓ∗ (𝑠) ∶= 𝑠 − 𝑡∗ (𝑠).
This is the cost-minimizing range of in-house tasks for a firm with downstream boundary 𝑠.
In [Kikuchi et al., 2018] it is shown that 𝑡∗ and ℓ∗ are increasing and continuous, while 𝑝∗ is continuously differentiable
at all 𝑠 ∈ (0, 1) with

$$(p^*)'(s) = c'(\ell^*(s)) \tag{15.8}$$

Equation (15.8) follows from 𝑝∗ (𝑠) = min𝑡≤𝑠 {𝑐(𝑠 − 𝑡) + 𝛿𝑝∗ (𝑡)} and the envelope theorem for derivatives.
A related equation is the first order condition for $p^*(s) = \min_{t \le s}\{c(s-t) + \delta p^*(t)\}$, the minimization problem for a firm with downstream boundary $s$, which is

$$\delta (p^*)'(t^*(s)) = c'(s - t^*(s)) \tag{15.9}$$

This condition matches the marginal condition expressed verbally by Coase that we stated above:
“A firm will tend to expand until the costs of organizing an extra transaction within the firm become equal
to the costs of carrying out the same transaction by means of an exchange on the open market…”
Combining (15.8) and (15.9) and evaluating at 𝑠 = 𝑡𝑖 , we see that active firms that are adjacent satisfy

$$\delta c'(\ell_{i+1}^*) = c'(\ell_i^*) \tag{15.10}$$

In other words, the marginal in-house cost per task at a given firm is equal to that of its upstream partner multiplied by
gross transaction cost.
This expression can be thought of as a Coase–Euler equation, which determines inter-firm efficiency by indicating how
two costly forms of coordination (markets and management) are jointly minimized in equilibrium.


15.5 Implementation

For most specifications of primitives, there is no closed-form solution for the equilibrium as far as we are aware.
However, we know that we can compute the equilibrium corresponding to a given transaction cost parameter 𝛿 and a cost
function 𝑐 by applying the results stated above.
In particular, we can
1. fix initial condition 𝑝 ∈ 𝒫,
2. iterate with 𝑇 until 𝑇 𝑛 𝑝 has converged to 𝑝∗ , and
3. recover firm choices via the choice function (15.4)
At each iterate, we will use continuous piecewise linear interpolation of functions.
To begin, here’s a class to store primitives and a grid:

class ProductionChain:

def __init__(self,
n=1000,
delta=1.05,
c=lambda t: np.exp(10 * t) - 1):

self.n, self.delta, self.c = n, delta, c


self.grid = np.linspace(1e-04, 1, n)

Now let’s implement and iterate with 𝑇 until convergence.


Recalling that our initial condition must lie in 𝒫, we set 𝑝0 = 𝑐

def compute_prices(pc, tol=1e-5, max_iter=5000):


"""
Compute prices by iterating with T

* pc is an instance of ProductionChain
* The initial condition is p = c

"""
delta, c, n, grid = pc.delta, pc.c, pc.n, pc.grid
p = c(grid) # Initial condition is c(s), as an array
new_p = np.empty_like(p)
error = tol + 1
i = 0

while error > tol and i < max_iter:


for j, s in enumerate(grid):
Tp = lambda t: delta * np.interp(t, grid, p) + c(s - t)
new_p[j] = Tp(fminbound(Tp, 0, s))
error = np.max(np.abs(p - new_p))
p = new_p
i = i + 1

if i < max_iter:
print(f"Iteration converged in {i} steps")
else:
print(f"Warning: iteration hit upper bound {max_iter}")
p_func = lambda x: np.interp(x, grid, p)


return p_func

The next function computes the optimal choice of upstream boundary and the range of tasks implemented for a firm facing price function p_function and with downstream boundary $s$.

def optimal_choices(pc, p_function, s):


"""
Takes p_func as the true function, minimizes on [0,s]

Returns optimal upstream boundary t_star and optimal size of


firm ell_star

In fact, the algorithm minimizes on [-1,s] and then takes the


max of the minimizer and zero. This gives more accurate results
for s close to zero

"""
delta, c = pc.delta, pc.c
f = lambda t: delta * p_function(t) + c(s - t)
t_star = max(fminbound(f, -1, s), 0)
ell_star = s - t_star
return t_star, ell_star

The allocation of firms can be computed by recursively stepping through firms’ choices of their respective upstream
boundary, treating the previous firm’s upstream boundary as their own downstream boundary.
In doing so, we start with firm 1, who has downstream boundary 𝑠 = 1.

def compute_stages(pc, p_function):


s = 1.0
transaction_stages = [s]
while s > 0:
s, ell = optimal_choices(pc, p_function, s)
transaction_stages.append(s)
return np.array(transaction_stages)

Let’s try this at the default parameters.


The next figure shows the equilibrium price function, as well as the boundaries of firms as vertical lines

pc = ProductionChain()
p_star = compute_prices(pc)

transaction_stages = compute_stages(pc, p_star)

fig, ax = plt.subplots()

ax.plot(pc.grid, p_star(pc.grid))
ax.set_xlim(0.0, 1.0)
ax.set_ylim(0.0)
for s in transaction_stages:
ax.axvline(x=s, c="0.5")
plt.show()


Iteration converged in 2 steps

Here’s the function ℓ∗ , which shows how large a firm with downstream boundary 𝑠 chooses to be

ell_star = np.empty(pc.n)
for i, s in enumerate(pc.grid):
t, e = optimal_choices(pc, p_star, s)
ell_star[i] = e

fig, ax = plt.subplots()
ax.plot(pc.grid, ell_star, label=r"$\ell^*$")
ax.legend(fontsize=14)
plt.show()


Note that downstream firms choose to be larger, a point we return to below.
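As a check on the marginal condition (15.10), and assuming the objects pc, p_star and transaction_stages computed above are still in scope, the following sketch compares $\delta c'(\ell_{i+1}^*)$ with $c'(\ell_i^*)$ for the first few adjacent firms, using the derivative of the default cost function $c(\ell) = e^{10\ell} - 1$; up to interpolation error and boundary effects, the printed ratios should be close to one.

c_prime = lambda l: 10 * np.exp(10 * l)       # derivative of the default cost function

ell = -np.diff(transaction_stages)            # task ranges ell_i, firm 1 first
for i in range(min(5, len(ell) - 1)):
    ratio = pc.delta * c_prime(ell[i + 1]) / c_prime(ell[i])
    print(f"firms {i + 1} and {i + 2}: δ c'(ell_{i + 2}) / c'(ell_{i + 1}) = {ratio:.3f}")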

15.6 Exercises

Exercise 15.6.1
The number of firms is endogenously determined by the primitives.
What do you think will happen in terms of the number of firms as 𝛿 increases? Why?
Check your intuition by computing the number of firms at delta in (1.01, 1.05, 1.1).

Solution to Exercise 15.6.1


Here is one solution

for delta in (1.01, 1.05, 1.1):

pc = ProductionChain(delta=delta)
p_star = compute_prices(pc)
transaction_stages = compute_stages(pc, p_star)
num_firms = len(transaction_stages)
print(f"When delta={delta} there are {num_firms} firms")

Iteration converged in 2 steps


When delta=1.01 there are 64 firms


Iteration converged in 2 steps


When delta=1.05 there are 41 firms

Iteration converged in 2 steps


When delta=1.1 there are 35 firms

Exercise 15.6.2
The value added of firm 𝑖 is 𝑣𝑖 ∶= 𝑝∗ (𝑡𝑖−1 ) − 𝑝∗ (𝑡𝑖 ).
One of the interesting predictions of the model is that value added is increasing with downstreamness, as are several other
measures of firm size.
Can you give any intuition?
Try to verify this phenomenon (value added increasing with downstreamness) using the code above.

Solution to Exercise 15.6.2


Firm size increases with downstreamness because 𝑝∗ , the equilibrium price function, is increasing and strictly convex.
This means that, for a given producer, the marginal cost of the input purchased from the producer just upstream from
itself in the chain increases as we go further downstream.
Hence downstream firms choose to do more in house than upstream firms — and are therefore larger.
The equilibrium price function is strictly convex due to both transaction costs and diminishing returns to management.
One way to put this is that firms are prevented from completely mitigating the costs associated with diminishing returns
to management — which induce convexity — by transaction costs. This is because transaction costs force firms to have
nontrivial size.
Here’s one way to compute and graph value added across firms

pc = ProductionChain()
p_star = compute_prices(pc)
stages = compute_stages(pc, p_star)

va = []

for i in range(len(stages) - 1):


va.append(p_star(stages[i]) - p_star(stages[i+1]))

fig, ax = plt.subplots()
ax.plot(va, label="value added by firm")
ax.set_xticks((5, 25))
ax.set_xticklabels(("downstream firms", "upstream firms"))
plt.show()

Iteration converged in 2 steps


CHAPTER

SIXTEEN

COMPOSITE SORTING

16.1 Overview

Optimal transport theory studies how one (marginal) probability measure can be related to another (marginal) probability measure in an ideal way.
The output of such a theory is a coupling of the two probability measures, i.e., a joint probability measure having those two marginal probability measures.
This lecture describes how Job Boerma, Aleh Tsyvinski, Ruodo Wang, and Zhenyuan Zhang [Boerma et al., 2024] used
optimal transport theory to formulate and solve an equilibrium of a model in which wages and allocations of workers
across jobs adjust to match measures of different types with measures of different types of occupations.
Production technologies allow firms to shape the costs of mismatch, with the consequence that costs of mismatch can be concave.
That means it is possible that in equilibrium there is neither positive assortative nor negative assortative matching, an outcome that [Boerma et al., 2024] call composite assortative matching.
For example, in an equilibrium with composite matching, identical workers can sort into different occupations, some
positively and some negatively.
[Boerma et al., 2024] show how this can generate distinct distributions of labor earnings within and across occupations.
This lecture describes the [Boerma et al., 2024] model and presents Python code for computing equilibria.
The lecture applies the code to the [Boerma et al., 2024] model of labor markets.
As with an earlier QuantEcon lecture on optimal transport, a key tool will be linear programming.

16.2 Setup

𝑋 and 𝑌 are finite sets that represent two distinct types of people to be matched.
For each 𝑥 ∈ 𝑋, let a positive integer 𝑛𝑥 be the number of agents of type 𝑥.
Similarly, let a positive integer $m_y$ be the number of agents of type $y \in Y$.
We refer to these two measures as marginals.
We assume that

∑ 𝑛𝑥 = ∑ 𝑚𝑦 =∶ 𝑁
𝑥∈𝑋 𝑦∈𝑌

so that the matching problem is balanced.


Given a cost function $c : X \times Y \to \mathbb{R}$, the (discrete) optimal transport problem is

$$\begin{aligned}
\min_{\mu \ge 0} \quad & \sum_{(x,y) \in X \times Y} \mu_{xy} c_{xy} \\
\text{s.t.} \quad & \sum_{y \in Y} \mu_{xy} = n_x \quad \text{for all } x \in X \\
& \sum_{x \in X} \mu_{xy} = m_y \quad \text{for all } y \in Y
\end{aligned}$$

Given our discreteness assumptions about $X$ and $Y$, the problem admits an integer solution $\mu \in \mathbb{Z}_+^{X \times Y}$, i.e., $\mu_{xy}$ is a non-negative integer for each $x \in X$, $y \in Y$.
We will study integer solutions.
Two points about restricting ourselves to integer solutions are worth mentioning:
• it is without loss of generality for computational purposes, since every problem with float marginals can be trans-
formed into an equivalent problem with integer marginals;
• although the mathematical structure that we present actually works for arbitrary real marginals, some of our Python
implementations would fail to work with float arithmetic.
We focus on a specific instance of an optimal transport problem:
We assume that 𝑋 and 𝑌 are finite subsets of ℝ and that the cost function satisfies 𝑐𝑥𝑦 = ℎ(|𝑥 − 𝑦|) for all 𝑥, 𝑦 ∈ ℝ, for
an ℎ ∶ ℝ+ → ℝ+ that is strictly concave and strictly increasing and grounded (i.e., ℎ(0) = 0).
Such an ℎ satisfies the following
Lemma. If ℎ ∶ ℝ+ → ℝ+ is strictly concave and grounded, then ℎ is strictly subadditive, i.e. for all 𝑥, 𝑦 ∈ ℝ+ , 0 < 𝑥 < 𝑦,
we have

ℎ(𝑥 + 𝑦) < ℎ(𝑥) + ℎ(𝑦)

Proof. For $\alpha \in (0,1)$ and $x > 0$ we have, by strict concavity and groundedness, $h(\alpha x) > \alpha h(x) + (1-\alpha) h(0) = \alpha h(x)$.
Now fix $x, y \in \mathbb{R}_+$, $0 < x < y$, and let $\alpha = \frac{x}{x+y}$; the previous observation gives $h(x) = h(\alpha(x+y)) > \alpha h(x+y)$ and $h(y) = h((1-\alpha)(x+y)) > (1-\alpha) h(x+y)$; summing these inequalities delivers the result. □
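As a small numerical illustration of the Lemma (not a substitute for the proof), take $h(z) = z^{1/\zeta}$ with $\zeta = 2$ and two arbitrary positive values:

import numpy as np

ζ = 2
h = lambda z: z ** (1 / ζ)
x, y = 1.5, 4.0
print(h(x + y), "<", h(x) + h(y))     # strict subadditivity: h(x + y) < h(x) + h(y)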
In the following implementation we assume that the cost function is $c_{xy} = |x-y|^{1/\zeta}$ for $\zeta > 1$, i.e. $h(z) = z^{1/\zeta}$ for $z \in \mathbb{R}_+$.
Hence, our problem is

$$\begin{aligned}
\min_{\mu \in \mathbb{Z}_+^{X \times Y}} \quad & \sum_{(x,y) \in X \times Y} \mu_{xy} |x-y|^{1/\zeta} \\
\text{s.t.} \quad & \sum_{y \in Y} \mu_{xy} = n_x \quad \text{for all } x \in X \\
& \sum_{x \in X} \mu_{xy} = m_y \quad \text{for all } y \in Y
\end{aligned}$$
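Since this is a linear program, a small instance can be handed directly to a generic solver; here is a minimal sketch with made-up types and marginals (the implementation developed below will instead exploit the concave structure of the cost).

import numpy as np
from scipy.optimize import linprog

ζ = 2
X, Y = np.array([0.0, 1.0, 3.0]), np.array([0.5, 2.0, 3.5])   # hypothetical types
n_x, m_y = np.array([2, 1, 1]), np.array([1, 1, 2])           # marginals, both summing to 4

cost = np.abs(X[:, None] - Y[None, :]) ** (1 / ζ)             # c_xy = |x - y|^(1/ζ)

# Equality constraints on the flattened μ (row-major): row sums = n_x, column sums = m_y
A_eq = np.zeros((len(X) + len(Y), len(X) * len(Y)))
for i in range(len(X)):
    A_eq[i, i * len(Y):(i + 1) * len(Y)] = 1
for j in range(len(Y)):
    A_eq[len(X) + j, j::len(Y)] = 1
b_eq = np.concatenate([n_x, m_y])

res = linprog(cost.flatten(), A_eq=A_eq, b_eq=b_eq, bounds=(0, None), method='highs')
print(res.x.reshape(len(X), len(Y)).round(2), res.fun)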

Let’s start setting up some Python code.


We use the following imports:

import numpy as np
from scipy.optimize import linprog
from itertools import chain
import pandas as pd
from collections import namedtuple
import matplotlib.pyplot as plt


import matplotlib.patches as patches
from matplotlib.ticker import MaxNLocator
from matplotlib import cm
from matplotlib.colors import Normalize

The following Python class takes as inputs sets of types 𝑋, 𝑌 ⊂ ℝ, marginals 𝑛, 𝑚 with positive integer entries such that
∑𝑥∈𝑋 𝑛𝑥 = ∑𝑦∈𝑌 𝑚𝑦 and cost parameter 𝜁 > 1.

The cost function is stored as an |𝑋| × |𝑌 | matrix with (𝑥, 𝑦)-entry equal to |𝑥 − 𝑦|1/𝜁 , i.e., the cost of matching an agent
of type 𝑥 ∈ 𝑋 with an agent of type 𝑦 ∈ 𝑌 .

class ConcaveCostOT():
def __init__(self, X_types=None, Y_types=None, n_x =None, m_y=None, ζ=2):

# Sets of types
self.X_types, self.Y_types = X_types, Y_types

# Marginals
if X_types is not None and Y_types is not None:
non_empty_types = True
self.n_x = np.ones(len(X_types), dtype=int) if n_x is None else n_x
self.m_y = np.ones(len(Y_types), dtype=int) if m_y is None else m_y
else:
non_empty_types = False
self.n_x, self.m_y = n_x, m_y

# Cost function: |X|x|Y| matrix


self.ζ = ζ
if non_empty_types:
self.cost_x_y = np.abs(X_types[:, None] - Y_types[None, :]) \
** (1 / ζ)
else:
self.cost_x_y = None

Let’s consider a random instance with given numbers of types |𝑋| and |𝑌 | and a given number of agents.
First, we generate random types 𝑋 and 𝑌 .
Then we generate random quantities for each type so that there are 𝑁 agents for each side.

number_of_x_types = 20
number_of_y_types = 20
N_agents_per_side = 60

np.random.seed(1)

## Generate random types


# generate random support for distributions of types
support_size = 50
random_support = np.unique(np.random.uniform(0,200, size=support_size))

# generate types
X_types_example = np.random.choice(random_support,
size=number_of_x_types, replace=False)
Y_types_example = np.random.choice(random_support,
size=number_of_y_types, replace=False)

## Generate random integer types quantities summing to N_agents_per_side

# generate integer vectors of length n_types summing to n_agents


def random_marginal(n_types, n_agents):
cuts = np.sort(np.random.choice(np.arange(1,n_agents),
size= n_types-1, replace=False))
segments = np.diff(np.concatenate(([0], cuts, [n_agents])))
return segments

# Create a method to assign random marginals to our class


def assign_random_marginals(self,random_seed):
np.random.seed(random_seed)
self.n_x = random_marginal(len(self.X_types), N_agents_per_side)
self.m_y = random_marginal(len(self.Y_types), N_agents_per_side)

ConcaveCostOT.assign_random_marginals = assign_random_marginals

# Create an instance of our class and generate random marginals


example_pb = ConcaveCostOT(X_types_example, Y_types_example, ζ=2)
example_pb.assign_random_marginals(random_seed=1)

We use $F$ (resp. $G$) to denote the cumulative distribution function associated with the measure $n$ (resp. $m$).
Thus, $F(z) = \sum_{x \le z :\, n_x > 0} n_x$ and $G(z) = \sum_{y \le z :\, m_y > 0} m_y$ for $z \in \mathbb{R}$.
Notice that we are not normalizing the measures, so $F(\infty) = G(\infty) = N$.
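Using the example instance constructed above, $F$ and $G$ evaluated at the (sorted) type points can be obtained with a cumulative sum; this is just an illustration of the definitions.

order_x = np.argsort(example_pb.X_types)
F_at_X = np.cumsum(example_pb.n_x[order_x])      # F evaluated at the sorted X types
order_y = np.argsort(example_pb.Y_types)
G_at_Y = np.cumsum(example_pb.m_y[order_y])      # G evaluated at the sorted Y types
print(F_at_X[-1], G_at_Y[-1])                    # both equal N = 60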


The following method plots the marginals on the real line
• blue for 𝑋 types,
• red for 𝑌 types.
Note that there are possible overlaps between 𝑋 and 𝑌 .

def plot_marginals(self, figsize=(15, 8), title='Distributions of types'):

plt.figure(figsize=figsize)

# Scatter plot n_x


plt.scatter(self.X_types, self.n_x, color='blue', label='n_x')
plt.vlines(self.X_types, ymin=0, ymax= self.n_x,
color='blue', linestyles='dashed')

# Scatter plot m_y


plt.scatter(self.Y_types, - self.m_y, color='red', label='m_y')
plt.vlines(self.Y_types, ymin=0, ymax=- self.m_y,
color='red', linestyles='dashed')

# Add grid and y=0 axis


plt.grid(True)
plt.axhline(0, color='black', linewidth=1)
plt.gca().spines['bottom'].set_position(('data', 0))



# Labeling the axes and the title
plt.ylabel('frequency')
plt.title(title)
plt.gca().yaxis.set_major_locator(MaxNLocator(integer=True))
plt.legend()
plt.show()

ConcaveCostOT.plot_marginals = plot_marginals

example_pb.plot_marginals()

16.3 Characterization of primal solution

16.3.1 Three properties of an optimal solution

We now indicate important properties that are satisfied by an optimal solution.


1. Maximal number of perfect pairs
2. No intersecting pairs
3. Layering
(Maximal number of perfect pairs)
If (𝑧, 𝑧) ∈ 𝑋 × 𝑌 for some 𝑧 ∈ ℝ then in each optimal solution there are min{𝑛𝑧 , 𝑚𝑧 } matches between type 𝑧 ∈ 𝑋
and 𝑧 ∈ 𝑌 .
Indeed, assume by contradiction that at an optimal solution we have (𝑧, 𝑦) and (𝑥, 𝑧) matched in positive amounts for
𝑦, 𝑥 ≠ 𝑧.


We can verify that reassigning the minimum of such quantities to the pairs (𝑧, 𝑧) and (𝑥, 𝑦) improves upon the current
matching since

ℎ(|𝑥 − 𝑦|) ≤ ℎ(|𝑥 − 𝑧| + |𝑧 − 𝑦|) < ℎ(|𝑥 − 𝑧|) + ℎ(|𝑧 − 𝑦|)

where the first inequality follows from triangle inequality and the fact that ℎ is increasing and the strict inequality from
strict subadditivity.
We can then repeat the operation for any other analogous pair of matches involving 𝑧, while improving the value, until
we have mass min{𝑛𝑧 , 𝑚𝑧 } on match (𝑧, 𝑧).
Viewing the matching 𝜇 as a measure on 𝑋 × 𝑌 with marginals 𝑛 and 𝑚, this property says that in any optimal 𝜇 we
have 𝜇𝑧𝑧 = 𝑛𝑧 ∧ 𝑚𝑧 for (𝑧, 𝑧) in the diagonal {(𝑥, 𝑦) ∈ 𝑋 × 𝑌 ∶ 𝑥 = 𝑦} of ℝ × ℝ.
The following method finds perfect pairs and returns the on-diagonal matchings as well as the residual off-diagonal
marginals.

def match_perfect_pairs(self):

# Find pairs on diagonal and related mass


perfect_pairs_x, perfect_pairs_y = np.where(
self.X_types[:,None] == self.Y_types[None,:])
Δ_q = np.minimum(self.n_x[perfect_pairs_x] ,self.m_y[perfect_pairs_y])

# Compute off-diagonal residual masses for each side


n_x_off_diag = self.n_x.copy()
n_x_off_diag[perfect_pairs_x]-= Δ_q

m_y_off_diag = self.m_y.copy()
m_y_off_diag[perfect_pairs_y] -= Δ_q

# Compute on-diagonal matching


matching_diag = np.zeros((len(self.X_types), len(self.Y_types)), dtype=int)
matching_diag[perfect_pairs_x, perfect_pairs_y] = Δ_q

return n_x_off_diag, m_y_off_diag , matching_diag

ConcaveCostOT.match_perfect_pairs = match_perfect_pairs

n_x_off_diag, m_y_off_diag , matching_diag = example_pb.match_perfect_pairs()


print(f"On-diagonal matches: {matching_diag.sum()}")
print(f"Residual types in X: {len(n_x_off_diag[n_x_off_diag >0])}")
print(f"Residual types in Y: {len(m_y_off_diag[m_y_off_diag >0])}")

On-diagonal matches: 15
Residual types in X: 14
Residual types in Y: 16

We can therefore create a new instance with the residual marginals that will feature no perfect pairs.
Later we shall add the on-diagonal matching to the solution of this new instance.
We refer to this instance as "off-diagonal" since the product measure of the residual marginals $n \otimes m$ features zero mass on the diagonal of $\mathbb{R} \times \mathbb{R}$.
In the rest of this section, we will focus on this instance.
We create a subclass to study the residual off-diagonal problem.


The subclass inherits the attributes and the methods from the original class.
We let 𝑍 ∶= 𝑋 ⊔ 𝑌 , where ⊔ denotes the union of disjoint sets. We will
• index types 𝑋 as {0, … , |𝑋| − 1} and types 𝑌 as {|𝑋|, … , |𝑋| + |𝑌 | − 1};
• store the cost function as a |𝑍| × |𝑍| matrix with entry (𝑧, 𝑧 ′ ) equal to 𝑐𝑥𝑦 if 𝑧 = 𝑥 ∈ 𝑋 and 𝑧 ′ = 𝑦 ∈ 𝑌 or
𝑧 = 𝑦 ∈ 𝑌 and 𝑧 ′ = 𝑥 ∈ 𝑋 or equal to +∞ if 𝑧 and 𝑧 ′ belong to the same side
– (the latter is just customary, since these “infinitely penalized” entries are actually never accessed in the im-
plementation);
• let 𝑞 be a vector of size |𝑍| whose 𝑧-th entry equals 𝑛𝑥 if type 𝑥 is the 𝑧-th smallest type in 𝑍 and −𝑚𝑦 if type 𝑦
is the 𝑧-th smallest type in 𝑍; hence 𝑞 encodes capacities of both sides on the (ascending) sorted set of types.
Finally, we add a method to flexibly add a pair (𝑖, 𝑗) with 𝑖 ∈ {0, … , |𝑋| − 1}, 𝑗 ∈ {|𝑋|, … , |𝑋| + |𝑌 | − 1} or
𝑗 ∈ {0, … , |𝑋| − 1}, 𝑖 ∈ {|𝑋|, … , |𝑋| + |𝑌 | − 1} to a matching matrix of size |𝑋| × |𝑌 |.

class OffDiagonal(ConcaveCostOT):
def __init__(self, X_types, Y_types, n_x, m_y, ζ):
super().__init__(X_types, Y_types, n_x, m_y, ζ)

# Types (unsorted)
self.types_list = np.concatenate((X_types,Y_types))

# Cost function: |Z|x|Z| matrix


self.cost_z_z = np.ones((len(self.types_list),
len(self.types_list))) * np.inf

# upper-right block
self.cost_z_z[:len(self.X_types), len(self.X_types):] = self.cost_x_y

# lower-left block
self.cost_z_z[len(self.X_types):, :len(self.X_types)] = self.cost_x_y.T

## Distributions of types
# sorted types and index identifier for each z in support
self.type_z = np.argsort(self.types_list)
self.support_z = self.types_list[self.type_z]

# signed quantity for each type z


self.q_z = np.concatenate([n_x, - m_y])[self.type_z]

# Method that adds a pair (i,j) to the matching matrix


def add_pair_to_matching(self, pair_ids, matching):
if pair_ids[0] < pair_ids[1]:
# the pair of indices correspond to a pair (x,y)
matching[pair_ids[0], pair_ids[1]-len(self.X_types)] = 1
else:
# the pair of indices correspond to a pair (y,x)
matching[pair_ids[1], pair_ids[0]-len(self.X_types)] = 1

We add a function that returns an instance of the off-diagonal subclass as well as the on-diagonal matching and the indices
of the residual off-diagonal types.
These indices will come in handy for adding the off-diagonal matching matrix to the diagonal matching matrix we just found,
since the former will have a smaller size if there are perfect pairs in the original problem.


def generate_offD_onD_matching(self):
# Match perfect pairs and compute on-diagonal matching
n_x_off_diag, m_y_off_diag , matching_diag = self.match_perfect_pairs()

# Find indices of residual non-zero quantities for each side


nonzero_id_x = np.flatnonzero(n_x_off_diag)
nonzero_id_y = np.flatnonzero(m_y_off_diag)

# Create new instance with off-diagonal types


off_diagonal = OffDiagonal(self.X_types[nonzero_id_x],
self.Y_types[nonzero_id_y],
n_x_off_diag[nonzero_id_x],
m_y_off_diag[nonzero_id_y],
self.ζ)

return off_diagonal, (nonzero_id_x, nonzero_id_y, matching_diag)

ConcaveCostOT.generate_offD_onD_matching = generate_offD_onD_matching

We apply it to our example:

example_off_diag, _ = example_pb.generate_offD_onD_matching()

Let’s plot the residual marginals to verify visually that there are no overlappings between types from distinct sides in the
off-diagonal instance.

example_off_diag.plot_marginals(title='Distributions of types: off-diagonal')

(No intersecting pairs) This property summarizes the following fact:


• represent both types on the real line and draw a semicircle joining (𝑥, 𝑦) for all pairs (𝑥, 𝑦) ∈ 𝑋 × 𝑌 that are matched in a solution
• these semicircles do not intersect (unless they share one of the endpoints).


A proof proceeds by contradiction.


Let’s consider types 𝑥, 𝑥′ ∈ 𝑋 and 𝑦, 𝑦′ ∈ 𝑌 .
Matched pairs can "intersect" (or be tangent) in two ways.
We will show that in both cases the partial matching among types 𝑥, 𝑥′ , 𝑦, 𝑦′ can be improved by uncrossing, i.e. reas-
signing the quantities while improving on the solution and reducing the number of intersecting pairs.
The first case of intersecting pairs is

𝑥 < 𝑦 < 𝑦 ′ < 𝑥′

with pairs (𝑥, 𝑦′ ) and (𝑥′ , 𝑦) matched in positive quantities.


Then it follows from strict monotonicity of ℎ that ℎ(|𝑥 − 𝑦|) < ℎ(|𝑥 − 𝑦′ |) and ℎ(|𝑥′ − 𝑦′ |) < ℎ(|𝑥′ − 𝑦|), hence
ℎ(|𝑥 − 𝑦|) + ℎ(|𝑥′ − 𝑦′ |) < ℎ(|𝑥 − 𝑦′ |) + ℎ(|𝑥′ − 𝑦|).
Therefore, we can take the minimum of the masses of the matched pairs (𝑥, 𝑦′ ) and (𝑥′ , 𝑦) and reallocate it to the pairs
(𝑥, 𝑦) and (𝑥′ , 𝑦′ ), thereby strictly improving the cost among 𝑥, 𝑦, 𝑥′ , 𝑦′ .
The second case of intersecting pairs is

𝑥 < 𝑥′ < 𝑦 ′ < 𝑦

with pairs (𝑥, 𝑦′ ) and (𝑥′ , 𝑦) matched.


In this case we have

|𝑥 − 𝑦′ | + |𝑥′ − 𝑦| = |𝑥 − 𝑦| + |𝑥′ − 𝑦′ |

Letting $\alpha := \frac{|x-y'| - |x'-y'|}{|x-y| - |x'-y'|} \in (0,1)$, we have $|x-y'| = \alpha|x-y| + (1-\alpha)|x'-y'|$ and $|x'-y| = (1-\alpha)|x-y| + \alpha|x'-y'|$.
Hence, by strict concavity of $h$,

$$h(|x-y'|) + h(|x'-y|) > \alpha h(|x-y|) + (1-\alpha)h(|x'-y'|) + (1-\alpha)h(|x-y|) + \alpha h(|x'-y'|) = h(|x-y|) + h(|x'-y'|).$$

Therefore, as in the first case, we can strictly improve the cost among 𝑥, 𝑦, 𝑥′ , 𝑦′ by uncrossing the pairs.
Finally, it remains to argue that in both cases uncrossing operations do not increase the number of intersections with other
matched pairs.
It can indeed be shown on a case-by-case basis that, in both of the above cases, for any other matched pair (𝑥″ , 𝑦″ ) the
number of intersections between pairs (𝑥, 𝑦), (𝑥′ , 𝑦′ ) and the pair (𝑥″ , 𝑦″ ) (i.e., after uncrossing) is not larger than the
number of intersections between pairs (𝑥, 𝑦′ ), (𝑥′ , 𝑦) and the pair (𝑥″ , 𝑦″ ) (i.e., before uncrossing), hence the uncrossing
operations above reduce the number of intersections.
We conclude that if a matching features intersecting pairs, it can be modified via a sequence of uncrossing operations
into a matching without intersecting pairs while improving on the value.
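
To make the argument concrete, here is a small numerical sanity check of the two uncrossing inequalities; it is a minimal
sketch assuming the hypothetical choice ℎ(𝑑) = √𝑑, which is strictly concave and increasing.

import numpy as np

h = np.sqrt  # assumed strictly concave, increasing cost of displacement (illustrative only)

# Case 1: x < y < y' < x', with (x, y') and (x', y) matched
x, y, y_p, x_p = 0.0, 1.0, 2.0, 5.0
print(h(abs(x - y)) + h(abs(x_p - y_p)) < h(abs(x - y_p)) + h(abs(x_p - y)))  # True

# Case 2: x < x' < y' < y, with (x, y') and (x', y) matched
x, x_p, y_p, y = 0.0, 1.0, 2.0, 3.0
print(h(abs(x - y)) + h(abs(x_p - y_p)) < h(abs(x - y_p)) + h(abs(x_p - y)))  # True

In both cases the uncrossed matching has a strictly smaller cost, as the argument above predicts.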
(Layering) Recall that there are 2𝑁 individual agents, each agent 𝑖 having type 𝑧𝑖 ∈ 𝑋 ⊔ 𝑌 .
We write the disjoint union 𝑋 ⊔ 𝑌 because, having passed to the off-diagonal matching, the two type sets are now disjoint.
To simplify our explanation of this property, assume for now that each agent has its own distinct type (i.e., |𝑋| = |𝑌 | = 𝑁
and 𝑛 = 𝑚 = 1𝑁 ), in which case the optimal transport problem is also referred to as assignment problem.
Let’s index agents according to their types:

𝑧1 < 𝑧2 ⋯ < 𝑧2𝑁−1 < 𝑧2𝑁 .

Suppose that agents 𝑖 of type 𝑧𝑖 and 𝑗 of type 𝑧𝑗 , with 𝑧𝑖 < 𝑧𝑗 , are matched in a particular optimal solution.


Then there is an equal number of agents from each side in {𝑖 + 1, … , 𝑗 − 1}, if this set is not empty.
Indeed, if this were not the case, then some agent 𝑘 ∈ {𝑖 + 1, … , 𝑗 − 1} would be matched with some agent ℓ with
ℓ ∉ {𝑖, … , 𝑗}, i.e., there would be types

𝑧𝑖 < 𝑧𝑘 < 𝑧𝑗 < 𝑧ℓ

with matches (𝑧𝑖 , 𝑧𝑗 ) and (𝑧𝑘 , 𝑧ℓ ), violating the no intersecting pairs property.
We conclude that we can define a binary relation on [𝑁 ] such that 𝑖 ∼ 𝑗 if there is an equal number of agents of each
side in {𝑖, 𝑖 + 1, … , 𝑗} (or if this set is empty).
This is an equivalence relation, so we can find associated equivalence classes that we call layers.
By the reasoning above, in an optimal solution all pairs 𝑖, 𝑗 (of opposite sides) which are matched belong to the same
layer, hence we can solve the assignment problem associated to each layer and then add up the solutions.
In terms of distributions, 𝑖 and 𝑗, of types 𝑥 ∈ 𝑋 and 𝑦 ∈ 𝑌 respectively, belong to the same layer (i.e., 𝑥 ∼ 𝑦) if and
only if 𝐹 (𝑦−) − 𝐹 (𝑥) = 𝐺(𝑦−) − 𝐺(𝑥).
If 𝐹 and 𝐺 were continuous, then 𝐹 (𝑦) − 𝐹 (𝑥) = 𝐺(𝑦) − 𝐺(𝑥) ⟺ 𝐹 (𝑥) − 𝐺(𝑥) = 𝐹 (𝑦) − 𝐺(𝑦).
This suggests that the following quantity plays an important role:

𝐻(𝑧) ∶= 𝐹 (𝑧) − 𝐺(𝑧), for 𝑧 ∈ ℝ.
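
As a quick illustration, the following minimal standalone sketch computes 𝐻 for small hypothetical type sets with unit
masses (unrelated to the example above), using the fact that 𝐻 is a cumulative sum of signed point masses.

import numpy as np

x_types = np.array([0.0, 2.0, 5.0])   # hypothetical X types with unit masses
y_types = np.array([1.0, 3.0, 6.0])   # hypothetical Y types with unit masses

# Signed point masses on the joint support: +1 for X types, -1 for Y types
support = np.concatenate([x_types, y_types])
signs = np.concatenate([np.ones_like(x_types), -np.ones_like(y_types)])
order = np.argsort(support)

# H(z) = F(z) - G(z), evaluated at the sorted support points
H = np.cumsum(signs[order])
print(list(zip(support[order], H)))

This mirrors the computation H_z = np.cumsum(self.q_z) used in the methods below.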

Returning to our general (integer) discrete setting, let’s plot 𝐻.


Notice that 𝐻 is right-continuous (being the difference of right-continuous functions) and that upward (resp. downward)
jumps correspond to point masses of agents with types from 𝑋 (resp. 𝑌 ).

def plot_H_z(self, figsize=(15, 8), range_x_axis=None, scatter=True):


# Determine H(z) = F(z) - G(z)
H_z = np.cumsum(self.q_z)

# Plot H(z)
plt.figure(figsize=figsize)
plt.axhline(0, color='black', linewidth=1)

# determine the step points for horizontal lines


step = np.concatenate(([self.support_z.min() - .05 * self.support_z.ptp()],
self.support_z,
[self.support_z.max() + .05 * self.support_z.ptp()]))
height = np.concatenate(([0], H_z, [0]))

# plot the horizontal lines of the step function


for i in range(len(step) - 1):
plt.plot([step[i], step[i+1]], [height[i], height[i]], color='black')

# draw dashed vertical lines for the step function


for i in range(1, len(step) - 1):
plt.plot([step[i], step[i]], [height[i-1], height[i]],
color='black', linestyle='--')

# plot discontinuities points of H(z)


if scatter:
plt.scatter(np.sort(self.X_types), H_z[self.q_z > 0], color='blue')
plt.scatter(np.sort(self.Y_types), H_z[self.q_z < 0], color='red')

if range_x_axis is not None:




plt.xlim(range_x_axis)

# Add labels and title


plt.title('Underqualification Measure (Off-Diagonal)')
plt.xlabel('$z$')
plt.ylabel('$H(z)$')
plt.grid(False)
plt.gca().yaxis.set_major_locator(MaxNLocator(integer=True))
plt.show()

OffDiagonal.plot_H_z = plot_H_z

example_off_diag.plot_H_z()

The layering property extends to the general discrete setting.


There are |𝐻(ℝ)| − 1 layers in total.
Enumerating the range of 𝐻 as 𝐻(ℝ) = {ℎ1 , ℎ2 , … , ℎ|𝐻(ℝ)| } with ℎ1 < ℎ2 < ⋯ < ℎ|𝐻(ℝ)| , we can define layer 𝐿ℓ , for
ℓ ∈ {1, … , |𝐻(ℝ)| − 1} as the collection of types 𝑧 ∈ 𝑍 such that

𝐻(𝑧−) ≤ ℎℓ < ℎℓ+1 ≤ 𝐻(𝑧),

(which are types in 𝑋), or

𝐻(𝑧) ≤ ℎℓ < ℎℓ+1 ≤ 𝐻(𝑧−),

which are types in 𝑌 .


The mass associated with layer 𝐿ℓ is 𝑀ℓ = ℎℓ+1 − ℎℓ .
Intuitively, a layer 𝐿ℓ assigns the same mass 𝑀ℓ to each of the types in 𝑍 that it contains, i.e., the problem within the layer is unitary.
A unitary problem is essentially an assignment problem up to a constant: we can solve the problem with unit mass and
then rescale a solution by 𝑀ℓ .


Moreover, each layer 𝐿ℓ contains an even number of types 𝑁ℓ ∈ 2ℕ, which are alternating, i.e., ordering them as
𝑧1 < 𝑧2 ⋯ < 𝑧𝑁ℓ −1 < 𝑧𝑁ℓ all odd (or even, respectively) indexed types belong to the same side.
The following method finds the layers associated with distributions 𝐹 and 𝐺.
Again, types in 𝑋 are indexed with {0, … , |𝑋| − 1} and types in 𝑌 with {|𝑋|, … , |𝑋| + |𝑌 | − 1}.
Using these indices (instead of the types themselves) to represent a layer lets us keep track of the side of each type in the
layer without storing any extra information: because a layer is alternating, knowing the side of its first type would already
pin down the sides of all its types, and the indices encode the sides directly.
In addition, using indices will let us extract the cost function within a layer from the cost function 𝑐𝑧𝑧′ computed offline.

def find_layers(self):
# Compute H(z) on the joint support
H_z = np.concatenate([[0], np.cumsum(self.q_z)])

# Compute the range of H, i.e. H(R), stored in ascending order


layers_height = np.unique(H_z)

# Compute the mass of each layer


layers_mass = np.diff(layers_height)

# Compute layers
    # the following |H(R)| x |Z| matrix has entry (l, z) equal to 1 iff type z belongs to layer l

layers_01 = ((H_z[None, :-1] <= layers_height[:-1, None])


* (layers_height[1:, None] <= H_z[None, 1:]) |
(H_z[None, 1:] <= layers_height[:-1, None])
* (layers_height[1:, None] <= H_z[None, :-1]))

# each layer is reshaped as a list of indices correponding to types


layers = [self.type_z[layers_01[ell]]
for ell in range(len(layers_height)-1)]

return layers, layers_mass, layers_height, H_z

OffDiagonal.find_layers = find_layers

layers_list_example, layers_mass_example, _, _ = example_off_diag.find_layers()


print(layers_list_example)

[array([23, 10]), array([27, 3, 23, 10]), array([16, 2, 21, 3, 25, 8, 23, 12]),
 array([16, 2, 21, 3, 25, 12]), array([22, 0, 16, 2, 21, 3, 18, 12]),
 array([15, 0, 16, 2, 14, 5, 21, 3, 18, 9]),
 array([20, 0, 16, 2, 14, 5, 21, 3, 19, 11, 24, 1, 18, 9]),
 array([ 2, 26, 5, 21, 3, 19, 4, 18]),
 array([ 2, 26, 7, 21, 3, 19, 4, 17, 6, 18]),
 array([13, 26, 7, 21, 3, 19, 6, 18]), array([ 6, 18]), array([ 6, 28]), array([ 6, 29])]

The following method gives a graphical representation of the layers.


From the picture it is easy to spot two key features described above:
• types are alternating
• the layer problem is unitary


def plot_layers(self, figsize=(15, 8)):


# Find layers
layers, layers_mass , layers_height, H_z = self.find_layers()

plt.figure(figsize=figsize)

# Plot H(z)
step = np.concatenate(([self.support_z.min() - .05 * self.support_z.ptp()],
self.support_z,
[self.support_z.max() + .05 * self.support_z.ptp()]))
height = np.concatenate((H_z, [0]))
plt.step(step, height, where='post', color='black', label='CDF', zorder=1)

# Plot layers
colors = cm.viridis(np.linspace(0, 1, len(layers)))
for ell, layer in enumerate(layers):
plt.vlines(self.types_list[layer], layers_height[ell] ,
layers_height[ell] + layers_mass[ell],
color=colors[ell], linewidth=2)
plt.scatter(self.types_list[layer],
np.ones(len(layer)) * layers_height[ell]
+.5 * layers_mass[ell],
color=colors[ell], s=50)

plt.axhline(layers_height[ell], color=colors[ell],
linestyle=':', linewidth=1.5, zorder=0)

# Add labels and title


plt.xlabel('$z$')
plt.title('Layers')
plt.gca().yaxis.set_major_locator(MaxNLocator(integer=True))
plt.show()

OffDiagonal.plot_layers = plot_layers

example_off_diag.plot_layers()


16.3.2 Solving a layer

Recall that layer 𝐿ℓ consists of a list of distinct types from 𝑌 ⊔ 𝑋

𝑧1 < 𝑧2 ⋯ < 𝑧𝑁ℓ −1 < 𝑧𝑁ℓ ,

which is alternating.
The problem within a layer is unitary.
Hence we can solve the problem with unit masses and later rescale the solution by the layer’s mass 𝑀ℓ .
Let us select a layer from the example above (we pick the one with maximum number of types) and plot the types on the
real line

# Pick layer with maximum number of types


layer_id_example = max(enumerate(layers_list_example),
key = lambda x: len(x[1]))[0]
layer_example = layers_list_example[layer_id_example]

# Plot layer types


def plot_layer_types(self, layer, mass, figsize=(15, 3)):

plt.figure(figsize=figsize)

# Scatter plot n_x


x_layer = layer[layer < len(self.X_types)]
y_layer = layer[layer >= len(self.X_types)] - len(self.X_types)
M_ell = np.ones(len(x_layer))* mass

plt.scatter(self.X_types[x_layer], M_ell, color='blue', label='X types')


plt.vlines(self.X_types[x_layer], ymin=0, ymax= M_ell,


color='blue', linestyles='dashed')

# Scatter plot m_y


plt.scatter(self.Y_types[y_layer], - M_ell, color='red', label='Y types')
plt.vlines(self.Y_types[y_layer], ymin=0, ymax=- M_ell,
color='red', linestyles='dashed')

# Add grid and y=0 axis


# plt.grid(True)
plt.axhline(0, color='black', linewidth=1)
plt.gca().spines['bottom'].set_position(('data', 0))

# Labeling the axes and the title


plt.ylabel('mass')
plt.title('Distributions of types in the layer')
plt.gca().yaxis.set_major_locator(MaxNLocator(integer=True))
plt.gca().spines['top'].set_visible(False)
plt.gca().spines['right'].set_visible(False)
plt.legend()
plt.show()

ConcaveCostOT.plot_layer_types = plot_layer_types

example_off_diag.plot_layer_types(layer_example,
layers_mass_example[layer_id_example])

Given the structure of a layer and the no intersecting pairs property, the optimal matching and value of the layer can be
found recursively.
Indeed, if in a certain optimal matching agents 1 and 𝑗 ∈ [𝑁ℓ ], with 𝑗 − 1 odd, are paired, then there is no matching between agents
in [2, 𝑗 − 1] and those in [𝑗 + 1, 𝑁ℓ ] (if both are nonempty, i.e., 𝑗 is not 2 or 𝑁ℓ ).
Hence in such an optimal solution agents in [2, 𝑗 − 1] are matched among themselves.
Since [𝑧2 , 𝑧𝑗−1 ] (as well as [𝑧𝑗+1 , 𝑧𝑁ℓ ]) is alternating, we can reason recursively.
Let 𝑉𝑖𝑗 be the optimal value of matching agents in [𝑖, 𝑗] with 𝑖, 𝑗 ∈ [𝑁ℓ ], 𝑗 − 𝑖 ∈ {1, 3, … , 𝑁ℓ − 1}.
Suppose that we have computed the value 𝑉𝑖𝑗 for all 𝑖, 𝑗 ∈ [𝑁ℓ ] with 𝑗 − 𝑖 ∈ {1, 3, … , 𝑡 − 2} for some odd natural number 𝑡.
Then, for 𝑖, 𝑗 ∈ [𝑁ℓ ] with 𝑗 − 𝑖 = 𝑡 we have

𝑉𝑖𝑗 = min_{𝑘∈{𝑖+1,𝑖+3,…,𝑗}} {𝑐𝑖𝑘 + 𝑉𝑖+1,𝑘−1 + 𝑉𝑘+1,𝑗 }

with the RHS depending only on previously computed values.


We set the boundary conditions at 𝑡 = −1: 𝑉𝑖+1,𝑖 = 0 for each 𝑖 ∈ [𝑁ℓ ], so that we can apply the same Bellman equation
at 𝑡 = 1.
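
Before turning to the class method, here is a minimal standalone sketch of this recursion on a hypothetical four-type layer
(0-indexed agents, with the illustrative cost ℎ(𝑑) = √𝑑); both the types and the cost are assumptions made only for this example.

import numpy as np

z = np.array([0.0, 1.0, 3.0, 7.0])            # hypothetical layer types, alternating sides
c = np.sqrt(np.abs(z[:, None] - z[None, :]))  # cost between any two types in the layer
n = len(z)

# V[i, j] = optimal value of matching agents i, ..., j-1; empty segments have value 0
V = np.zeros((n + 1, n + 1))
for t in range(1, n, 2):                      # j - i = t with t odd (0-indexed agents)
    for i in range(n - t):
        j = i + t                             # last agent of the segment
        # agent i is matched with some k in {i+1, i+3, ..., j}
        V[i, j + 1] = min(c[i, k] + V[i + 1, k] + V[k + 1, j + 1]
                          for k in range(i + 1, j + 1, 2))

print(V[0, n])   # prints 3.0: here types 0 and 1 are paired, and types 3 and 7 are paired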


The following method takes as input the layer's type indices and computes the value function as a matrix
[𝑉𝑖𝑗 ]𝑖∈[𝑁ℓ +1],𝑗∈[𝑁ℓ ] .
In order to distinguish entries that are relevant for our computations from those that are never accessed, we initialize this
matrix as full of NaN values.

def solve_bellman_eqs(self,layer):
# Recover cost function within the layer
cost_i_j = self.cost_z_z[layer[:,None],layer[None,:]]

# Initialize value function


V_i_j = np.full((len(layer)+1,len(layer)), np.nan)

# Add boundary conditions


i_bdry = np.arange(len(layer))
V_i_j[i_bdry+1, i_bdry] = 0

t = 1
while t < len(layer):
# Select agents i in [n_L-t] (with potential partners j's in [t,n_L])
i_t = np.arange(len(layer)-t)

# For each i, select each k with |k-i| <= t


# (potential partners of i within segment)
index_ik = i_t[:,None] + np.arange(1, t+1, 2)[None,:]

# Compute optimal value for pairs with |i-j| = t


V_i_j[i_t, i_t + t] = (cost_i_j[i_t[:,None], index_ik] +
V_i_j[i_t[:,None] + 1, index_ik - 1] +
V_i_j[index_ik + 1, i_t[:,None] + t]).min(1)
# Go to next odd integer
t += 2

return V_i_j

OffDiagonal.solve_bellman_eqs = solve_bellman_eqs

Let’s compute values for the layer from our example.


Only non-NaN entries are actually used in the computations.

# Compute layer value function


V_i_j = example_off_diag.solve_bellman_eqs(layer_example)

print(f"Type indices in the layer: {layer_example}")


print('##########################')
print("Section of the Value function of the layer:")
print(V_i_j.round(2)[:min(10, V_i_j.shape[0]),
:min(10, V_i_j.shape[1])])

Type indices in the layer: [20 0 16 2 14 5 21 3 19 11 24 1 18 9]


##########################
Section of the Value function of the layer:
[[ nan 4.29 nan 5.73 nan 9.82 nan 13.9 nan 14.52]
[ 0. nan 2.75 nan 6.17 nan 8.44 nan 10.56 nan]
[ nan 0. nan 1.44 nan 5.52 nan 9.6 nan 10.22]
[ nan nan 0. nan 3.58 nan 5.84 nan 7.96 nan]


[ nan nan nan 0. nan 4.08 nan 8.16 nan 8.78]
[ nan nan nan nan 0. nan 2.26 nan 4.38 nan]
[ nan nan nan nan nan 0. nan 4.08 nan 4.7 ]
[ nan nan nan nan nan nan 0. nan 2.12 nan]
[ nan nan nan nan nan nan nan 0. nan 0.62]
[ nan nan nan nan nan nan nan nan 0. nan]]

Having computed the value function, we can recover the optimal matching as the policy that attains the minima in the
Bellman equations.
We start from agent 1 and match it with the 𝑘 that achieves the minimum in the equation associated with 𝑉1,𝑁ℓ .
Then we store segments [2, 𝑘 − 1] and [𝑘 + 1, 𝑁ℓ ] (if not empty).
In general, given a segment [𝑖, 𝑗], we match 𝑖 with the 𝑘 that achieves the minimum in the equation associated with 𝑉𝑖𝑗 and
store the segments [𝑖 + 1, 𝑘 − 1] and [𝑘 + 1, 𝑗] (if not empty).
The algorithm proceeds until there are no segments left.

def find_layer_matching(self, V_i_j, layer):


# Initialize
segments_to_process = [np.arange(len(layer))]
matching = np.zeros((len(self.X_types),len(self.Y_types)), bool)

while segments_to_process:
# Pick i, first agent of the segment
# and potential partners i+1,i+3,..., in the segment
segment = segments_to_process[0]
i_0 = segment[0]
potential_matches = np.arange(i_0, segment[-1], 2) + 1

# Compute optimal partner j_i


obj = (self.cost_z_z[layer[i_0],layer[potential_matches]] +
V_i_j[i_0 +1, potential_matches -1] +
V_i_j[potential_matches +1,segment[-1]])

j_i_0 = np.argmin(obj)*2 + (i_0 + 1)

# Add matched pair (i,j_i)


self.add_pair_to_matching(layer[[i_0,j_i_0]], matching)

# Update segments to process:


# remove current segment
segments_to_process = segments_to_process[1:]

# add [i+1,j-1] and [j+1,last agent of the segment]


if j_i_0 > i_0 + 1:
segments_to_process.append(np.arange(i_0 + 1, j_i_0))
if j_i_0 < segment[-1]:
segments_to_process.append(np.arange(j_i_0 + 1, segment[-1] + 1))

return matching

OffDiagonal.find_layer_matching = find_layer_matching

Let's apply this method to our example to find the matching within the layer and then rescale it by 𝑀ℓ .
Note that the unscaled value equals 𝑉1,𝑁ℓ .


matching_layer = example_off_diag.find_layer_matching(V_i_j, layer_example)
print(f"Value of the layer (unscaled): "
      f"{(matching_layer * example_off_diag.cost_x_y).sum()}")
print(f"Value of the layer (scaled by the mass = "
      f"{layers_mass_example[layer_id_example]}): "
      f"{layers_mass_example[layer_id_example] * (matching_layer * example_off_diag.cost_x_y).sum()}")

Value of the layer (unscaled): 24.764959193288938


Value of the layer (scaled by the mass = 1): 24.764959193288938

The following method plots the matching within a layer.


We apply it to the layer from our example.

def plot_layer_matching(self, layer, matching_layer):


# Create the figure and axis
fig, ax = plt.subplots(figsize=(15, 15))

# Plot the points on the x-axis


X_types_layer = self.X_types[layer[layer < len(self.X_types)]]
Y_types_layer = self.Y_types[layer[layer >= len(self.X_types)]
- len(self.X_types)]
ax.scatter(X_types_layer, np.zeros_like(X_types_layer), color='blue',
s = 20, zorder=5)
ax.scatter(Y_types_layer, np.zeros_like(Y_types_layer), color='red',
s = 20, zorder=5)

# Draw semicircles for each row in matchings


matched_types = np.where(matching_layer >0)
matched_types_x = self.X_types[matched_types[0]]
matched_types_y = self.Y_types[matched_types[1]]

for iter in range(len(matched_types_x)):


width = abs(matched_types_x[iter] - matched_types_y[iter])
center = (matched_types_x[iter] + matched_types_y[iter]) / 2
height = width
semicircle = patches.Arc((center, 0), width, height, theta1=0,
theta2=180, lw=3)
ax.add_patch(semicircle)

# Add title and layout settings


plt.title('Optimal Layer Matching' )
ax.set_aspect('equal')
plt.gca().spines['bottom'].set_position(('data', 0))
ax.spines['left'].set_color('none')
ax.spines['top'].set_color('none')
ax.spines['right'].set_color('none')
ax.yaxis.set_ticks([])
ax.set_ylim(bottom= -self.support_z.ptp() / 100)

plt.show()

ConcaveCostOT.plot_layer_matching = plot_layer_matching


example_off_diag.plot_layer_matching(layer_example, matching_layer)

Solving a layer in a smarter way

We now present two key results in the context of OT with concave type costs.
We refer to [Boerma et al., 2024] and [Delon et al., 2011] for proofs.
Consider the problem faced within a layer, i.e., types from 𝑌 ⊔ 𝑋

𝑧1 < 𝑧2 ⋯ < 𝑧𝑁ℓ −1 < 𝑧𝑁ℓ , 𝑁ℓ ∈ 2ℕ

are alternating and the problem is unitary.


Given a matching on [1, 𝑘], 𝑘 ∈ [𝑁ℓ ], 𝑘 even, we say that a matched pair (𝑖, 𝑗) within this matching is hidden if there is
a matched pair (𝑖′ , 𝑗′ ) with 𝑖′ < 𝑖 < 𝑗 < 𝑗′ .
Visually, the arc joining (𝑖′ , 𝑗′ ) surmounts the arc joining (𝑖, 𝑗).
Theorem (DSS) Given an optimal matching on [1, 𝑘], if (𝑖, 𝑗) is hidden in this matching, then the pair (𝑖, 𝑗) belongs to
every optimal matching on [1, 𝑁ℓ ] and is hidden in this matching too.
As a consequence, there exists a more efficient way to compute the value function within a layer.
It can be shown that solving the following second-order difference equations delivers the same result as the Bellman
equations above:

𝑉𝑖𝑗 = min{𝑐𝑖𝑗 + 𝑉𝑖+1,𝑗−1 , 𝑉𝑖+2,𝑗 + 𝑉𝑖,𝑗−2 − 𝑉𝑖+2,𝑗−2 }

for 𝑖, 𝑗 ∈ [𝑁ℓ ], 𝑗 − 𝑖 odd, with boundary conditions 𝑉𝑖+1,𝑖 = 0 for 𝑖 ∈ [0, 𝑁ℓ ] and 𝑉𝑖+2,𝑖−1 = −𝑐𝑖,𝑖+1 for 𝑖 ∈ [𝑁ℓ − 1] .
The following method uses these equations to compute the value function that is stored as a matrix [𝑉𝑖𝑗 ]𝑖∈[𝑁ℓ +1],𝑗∈[𝑁ℓ +1] .

def solve_bellman_eqs_DSS(self,layer):
# Recover cost function within the layer
cost_i_j = self.cost_z_z[layer[:,None],layer[None,:]]

# Initialize value function


V_i_j = np.full((len(layer)+1,len(layer)+1), np.nan)

# Add boundary conditions


V_i_j[np.arange(len(layer)+1), np.arange(len(layer)+1)] = 0
i_bdry = np.arange(len(layer)-1)
V_i_j[i_bdry+2,i_bdry] = - cost_i_j[i_bdry, i_bdry+1]

t = 1
while t < len(layer):
# Select agents i in [n_l-t] and potential partner j=i+t for each i
i_t = np.arange(len(layer)-t)
j_t = i_t + t +1



# Compute optimal values for ij with j-i = t
V_i_j[i_t, j_t] = np.minimum(cost_i_j[i_t, j_t-1]
+ V_i_j[i_t + 1, j_t - 1],
V_i_j[i_t, j_t - 2] + V_i_j[i_t + 2, j_t]
- V_i_j[i_t + 2, j_t - 2])

## Go to next odd integer


t += 2

return V_i_j

OffDiagonal.solve_bellman_eqs_DSS = solve_bellman_eqs_DSS

Let’s apply the algorithm to our example and compare outcomes with those attained with the Bellman equations above.

V_i_j_DSS = example_off_diag.solve_bellman_eqs_DSS(layer_example)

print(f"Type indices of the layer: {layer_example}")


print('##########################')

print("Section of Value function of the layer:")


print(V_i_j_DSS.round(2)[:min(10, V_i_j_DSS.shape[0]),
:min(10, V_i_j_DSS.shape[1])])

print('##########################')
print(f"Difference with previous Bellman equations: \
{(V_i_j_DSS[:,1:] - V_i_j)[V_i_j >= 0].sum()}")

Type indices of the layer: [20 0 16 2 14 5 21 3 19 11 24 1 18 9]


##########################
Section of Value function of the layer:
[[ 0. nan 4.29 nan 5.73 nan 9.82 nan 13.9 nan]
[ nan 0. nan 2.75 nan 6.17 nan 8.44 nan 10.56]
[-4.29 nan 0. nan 1.44 nan 5.52 nan 9.6 nan]
[ nan -2.75 nan 0. nan 3.58 nan 5.84 nan 7.96]
[ nan nan -1.44 nan 0. nan 4.08 nan 8.16 nan]
[ nan nan nan -3.58 nan 0. nan 2.26 nan 4.38]
[ nan nan nan nan -4.08 nan 0. nan 4.08 nan]
[ nan nan nan nan nan -2.26 nan 0. nan 2.12]
[ nan nan nan nan nan nan -4.08 nan 0. nan]
[ nan nan nan nan nan nan nan -2.12 nan 0. ]]
##########################
Difference with previous Bellman equations: 4.440892098500626e-14

We can actually compute the optimal matching within the layer simultaneously with computing the value function, rather
than sequentially.
The key idea is that, if at some step of the computation of the values the left branch of the minimum above achieves the
minimum, say 𝑉𝑖𝑗 = 𝑐𝑖𝑗 + 𝑉𝑖+1,𝑗−1 , then (𝑖, 𝑗) are optimally matched on [𝑖, 𝑗] and by the theorem above we get that a
matching on [𝑖 + 1, 𝑗 − 1] which achieves 𝑉𝑖+1,𝑗−1 belongs to an optimal matching on the whole layer (since it is covered
by the arc (𝑖, 𝑗) in [𝑖, 𝑗]).
We can therefore proceed as follows.
We initialize an empty matching and a list of all the agents in the layer (representing the agents which are not matched
yet).


Then whenever the left branch of the minimum is achieved for some (𝑖, 𝑗) in the computation of 𝑉 , we take the collections
of agents 𝑘1 , … , 𝑘𝑀 in [𝑖 + 1, 𝑗 − 1] (in ascending order, i.e. with 𝑧𝑘𝑝 < 𝑧𝑘𝑝+1 ) that are not matched yet (if any) and
add to the matching the pairs (𝑘1 , 𝑘2 ), (𝑘3 , 𝑘4 ), … , (𝑘𝑀−1 , 𝑘𝑀 ).
Thus, we match each unmatched agent 𝑘𝑝 in [𝑖 + 1, 𝑗 − 1] with the closest unmatched right neighbour 𝑘𝑝+1 (starting from
𝑘1 ).
Intuitively, if 𝑘𝑝 were optimally matched with some 𝑘𝑞 in [𝑖 + 1, 𝑗 − 1] and not with 𝑘𝑝+1 , then 𝑘𝑝+1 would have already
been hidden by the match (𝑘𝑝 , 𝑘𝑞 ) from some previous computation (because |𝑘𝑝 − 𝑘𝑞 | < |𝑖 − 𝑗|) and it would therefore
be matched.
Finally, if the process above leaves some unmatched agents, we proceed by matching each of these agents with its closest
unmatched right neighbour, starting again from the leftmost of this collection.
To gain understanding, note that this situation happens when the left branch is achieved only for pairs 𝑖, 𝑗 with |𝑖 − 𝑗| = 1,
which leads to the optimal matching (1, 2), (3, 4), … , (𝑁ℓ − 1, 𝑁ℓ ).

def find_layer_matching_DSS(self,layer):
# Recover cost function within the layer
cost_i_j = self.cost_z_z[layer[:,None],layer[None,:]]

# Add boundary conditions


V_i_j = np.zeros((len(layer)+1,len(layer)+1))
i_bdry = np.arange(len(layer)-1)
V_i_j[i_bdry+2,i_bdry] = - cost_i_j[i_bdry, i_bdry+1]

# Initialize matching and list of to-match agents


unmatched = np.ones(len(layer), dtype = bool)
matching = np.zeros((len(self.X_types),len(self.Y_types)), bool)

t = 1
while t < len(layer):
# Compute optimal value for pairs with |i-j| = t
i_t = np.arange(len(layer)-t)
j_t = i_t + t + 1

left_branch = cost_i_j[i_t, j_t-1] + V_i_j[i_t + 1, j_t - 1]


V_i_j[i_t, j_t] = np.minimum(left_branch, V_i_j[i_t, j_t - 2]
+ V_i_j[i_t + 2, j_t] - V_i_j[i_t + 2, j_t - 2])

# Select each i for which left branch achieves minimum in the V_{i,i+t}
left_branch_achieved = i_t[left_branch == V_i_j[i_t, j_t]]

# Update matching
for i in left_branch_achieved:
# for each agent k in [i+1,i+t-1]
for k in np.arange(i+1,i+t)[unmatched[range(i+1,i+t)]]:
# if k is unmatched
if unmatched[k] == True:
# find unmatched right neighbour
j_k = np.arange(k+1,len(layer))[unmatched[k+1:]][0]
# add pair to matching
self.add_pair_to_matching(layer[[k, j_k]], matching)
# remove pair from unmatched agents list
unmatched[[k, j_k]] = False

# go to next odd integer


t += 2

# Each unmatched agent is matched with the next unmatched agent


for i in np.arange(len(layer))[unmatched]:
# if i is unmatched
if unmatched[i] == True:
# find unmatched right neighbour
j_i = np.arange(i+1,len(layer))[unmatched[i+1:]][0]
# add pair to matching
self.add_pair_to_matching(layer[[i, j_i]], matching)
# remove pair from unmatched agents list
unmatched[[i, j_i]] = False

return matching

OffDiagonal.find_layer_matching_DSS = find_layer_matching_DSS

matching_layer_DSS = example_off_diag.find_layer_matching_DSS(layer_example)
print(f" Value of layer with DSS recursive equations \
{(matching_layer_DSS * example_off_diag.cost_x_y).sum()}")
print(f" Value of layer with Bellman equations \
{(matching_layer * example_off_diag.cost_x_y).sum()}")

Value of layer with DSS recursive equations 24.764959193288938


Value of layer with Bellman equations 24.764959193288938

example_off_diag.plot_layer_matching(layer_example, matching_layer_DSS)

16.4 Solving primal problem

The following method assembles our components in order to solve the primal problem.
First, we match perfect pairs, store the resulting on-diagonal matching, and create an off-diagonal instance with the residual
marginals.
Then we compute the set of layers of the residual distributions.
Finally, we solve each layer and put together matchings within each layer with the on-diagonal matchings.
We then return the full matching, the off-diagonal matching, and the off-diagonal instance.

def solve_primal_pb(self):
# Compute on-diagonal matching, create new instance with residual types
off_diagoff_diagonal, match_tuple = self.generate_offD_onD_matching()
nonzero_id_x, nonzero_id_y, matching_diag = match_tuple

# Compute layers


layers_list, layers_mass, _, _ = off_diagoff_diagonal.find_layers()

# Solve layers to compute off-diagonal matching


matching_off_diag = np.zeros_like(off_diagoff_diagonal.cost_x_y, dtype=int)

for ell, layer in enumerate(layers_list):


V_i_j = off_diagoff_diagonal.solve_bellman_eqs(layer)
matching_off_diag += layers_mass[ell] \
* off_diagoff_diagonal.find_layer_matching(V_i_j, layer)

# Add together on- and off-diagonal matchings


matching = matching_diag.copy()
matching[np.ix_(nonzero_id_x, nonzero_id_y)] += matching_off_diag

return matching, matching_off_diag, off_diagoff_diagonal

ConcaveCostOT.solve_primal_pb = solve_primal_pb

matching, matching_off_diag, off_diagoff_diagonal = example_pb.solve_primal_pb()

We implement a similar method that adopts the DSS algorithm.

def solve_primal_DSS(self):
# Compute on-diagonal matching, create new instance with residual types
off_diagoff_diagonal, match_tuple = self.generate_offD_onD_matching()
nonzero_id_x, nonzero_id_y, matching_diag = match_tuple

# Find layers
layers, layers_mass, _, _ = off_diagoff_diagonal.find_layers()

# Solve layers to compute off-diagonal matching


matching_off_diag = np.zeros_like(off_diagoff_diagonal.cost_x_y, dtype=int)

for ell, layer in enumerate(layers):


matching_off_diag += layers_mass[ell] \
* off_diagoff_diagonal.find_layer_matching_DSS(layer)

# Add together on- and off-diagonal matchings


matching = matching_diag.copy()
matching[np.ix_(nonzero_id_x, nonzero_id_y)] += matching_off_diag

return matching, matching_off_diag, off_diagoff_diagonal

ConcaveCostOT.solve_primal_DSS = solve_primal_DSS

DSS_tuple = example_pb.solve_primal_DSS()
matching_DSS, matching_off_diag_DSS, off_diagoff_diagonal_DSS = DSS_tuple

By drawing semicircles joining the matched agents (with distinct types), we can visualize the off-diagonal matching.
In the following figure, widths and colors of semicircles indicate relative numbers of agents that are "transported" along
an arc.

def plot_matching(self, matching_off_diag, title, figsize=(15, 15),




add_labels=False, plot_H_z=False, scatter=True):

# Create the figure and axis


fig, ax = plt.subplots(figsize=figsize)

# Plot types on the real line


if scatter:
ax.scatter(self.X_types, np.zeros_like(self.X_types), color='blue',
s=20, zorder=5)
ax.scatter(self.Y_types, np.zeros_like(self.Y_types), color='red',
s=20, zorder=5)

# Add labels for X_types and Y_types if add_labels is True


if add_labels:
# Remove x-axis ticks
ax.set_xticks([])

# Add labels
for i, x in enumerate(self.X_types):
ax.annotate(f'$x_{{{i }}}$', (x, 0), textcoords="offset points",
xytext=(0, -15), ha='center', color='blue', fontsize=12)
for j, y in enumerate(self.Y_types):
ax.annotate(f'$y_{{{j }}}$', (y, 0), textcoords="offset points",
xytext=(0, -15), ha='center', color='red', fontsize=12)

# Draw semicircles for each pair of matched types


matched_types = np.where(matching_off_diag > 0)
matched_types_x = self.X_types[matched_types[0]]
matched_types_y = self.Y_types[matched_types[1]]

count = matching_off_diag[matched_types]
colors = plt.cm.Greys(np.linspace(0.5, 1.5, count.max() + 1))
max_height = 0
for iter in range(len(count)):
width = abs(matched_types_x[iter] - matched_types_y[iter])
center = (matched_types_x[iter] + matched_types_y[iter]) / 2
height = width
max_height = max(max_height, height)
semicircle = patches.Arc((center, 0), width, height,
theta1=0, theta2=180,
color=colors[count[iter]],
lw=count[iter] * (2.2 / count.max()))
ax.add_patch(semicircle)

# Title and layout settings for the main plot


plt.title(title)
ax.set_aspect('equal')
plt.axhline(0, color='black', linewidth=1)
ax.spines['bottom'].set_position(('data', 0))
ax.spines['left'].set_color('none')
ax.spines['top'].set_color('none')
ax.spines['right'].set_color('none')
ax.yaxis.set_ticks([])
ax.set_ylim(- self.X_types.ptp() / 10,
(max_height / 2) + self.X_types.ptp()*.01)



# Plot H_z on the main axis if enabled
if plot_H_z:
H_z = np.cumsum(self.q_z)

step = np.concatenate(([self.support_z.min()
- .02 * self.support_z.ptp()],
self.support_z,
[self.support_z.max()
+ .02 * self.support_z.ptp()]))

H_z = H_z/H_z.ptp() * self.support_z.ptp() /2


height = np.concatenate(([0], H_z, [0]))

# Plot the compressed H_z on the same main x-axis


ax.step(step, height, color='green', lw=2,
label='$H_z$', where='post')

# Set the y-limit to keep H_z and maximum circle size in the plot
ax.set_ylim(np.min(H_z) - H_z.ptp() *.01,
np.maximum(np.max(H_z), max_height / 2) + H_z.ptp() *.01)

# Add label and legend for H_z


ax.legend(loc="upper right")

plt.show()

ConcaveCostOT.plot_matching = plot_matching

off_diagoff_diagonal.plot_matching(matching_off_diag,
title='Optimal Matching (off-diagonal)', plot_H_z=True)
off_diagoff_diagonal_DSS.plot_matching(matching_off_diag_DSS,
title='Optimal Matching (off-diagonal) with DSS algorithm')


16.4.1 Verify with linear programming

Let's verify some of the preceding findings using linear programming.

def solve_1to1(c_i_j, n_x, m_y, return_dual=False):


n, m = np.shape(c_i_j)

# Constraint matrix
M_z_a = np.vstack([np.kron(np.eye(n), np.ones(m)),
np.kron(np.ones(n), np.eye(m))])
# Constraint vector
q = np.concatenate((n_x, m_y))

# Solve the linear programming problem using linprog from scipy


result = linprog(c_i_j.flatten(), A_eq=M_z_a, b_eq=q,
bounds=(0, None), method='highs')

if return_dual:
return (np.round(result.x).astype(int).reshape([n, m]),
result.eqlin.marginals)
else:
return np.round(result.x).astype(int).reshape([n, m])

mu_x_y_LP = solve_1to1(example_pb.cost_x_y,
example_pb.n_x,
example_pb.m_y)
print(f"Value of LP (scipy): {(mu_x_y_LP * example_pb.cost_x_y).sum()}")
print(f"Value (plain Bellman equations): {(matching * example_pb.cost_x_y).sum()}")
print(f"Value (DSS): {(matching_DSS * example_pb.cost_x_y).sum()}")

Value of LP (scipy): 143.45490363125705


Value (plain Bellman equations): 143.45490363125705
Value (DSS): 143.45490363125705

16.5 Examples

16.5.1 Example 1

We study optimal transport problems on the real line with cost 𝑐(𝑥, 𝑦) = ℎ(|𝑥 − 𝑦|) for a strictly concave and increasing
function ℎ ∶ ℝ+ → ℝ+ .
The outcome is called composite sorting.


Here, we will focus on 𝑐(𝑥, 𝑦) = |𝑥 − 𝑦|^{1/𝜁} for 𝜁 > 1.
To appreciate differences with positive assortative matching (PAM) note that the latter is induced by a cost of the form
ℎ(𝑥 − 𝑦) for some strictly convex ℎ ∶ ℝ → ℝ+ .
See Santambrogio 2015, Ch. 2.2.
For example, the cost function |𝑥 − 𝑦|𝑝 , 𝑝 > 1 induces PAM.
On the other hand, negative assortative matching (NAM) arises if 𝑐(𝑥, 𝑦) = ℎ(𝑥 − 𝑦) with ℎ ∶ ℝ → ℝ+ strictly concave.
For example, the cost function −|𝑥 − 𝑦|𝑝 , 𝑝 > 1, induces NAM.
Thus, NAM corresponds to a matching that maximizes a transport problem criterion with gain function 𝑔(𝑥, 𝑦) = |𝑥−𝑦|𝑝 .
Note how PAM and NAM differ from composite sorting:
Composite sorting is induced by a cost that is the composition of a strictly concave increasing function ℎ and a convex
function | ⋅ | applied to displacement 𝑥 − 𝑦.
Different functions ℎ potentially induce different matchings.
The following example shows that composite matching can feature both positive and negative assortative patterns.
Suppose that there are two agents per side and types

𝑥0 < 𝑦0 < 𝑥1 < 𝑦1

There are two feasible matchings, one corresponding to PAM, the other to NAM.
• The first features two displacements |𝑥0 − 𝑦0 |, |𝑥1 − 𝑦1 |
• The second features a large displacement |𝑥0 − 𝑦1 | and a small displacement |𝑥1 − 𝑦0 |.
Evidently,
• PAM corresponds to the matching with two medium-sized displacements, because the corresponding cost is strictly
convex and increasing in the displacement.
• NAM corresponds to the matching with a small displacement and a large displacement, because the gain is strictly
convex and increasing in the displacement.
In this example, composite sorting ends up coinciding with NAM, but this is something of a coincidence:
• in composite matching the cost function is strictly concave and increasing in the displacement.

N = 2
p = 2
ζ = 2

# Solve composite sorting problem


example_1 = ConcaveCostOT(np.array([0,5]),
np.array([4,10]),
ζ=ζ)
matching_CS, _ ,_ = example_1.solve_primal_DSS()

# Solve PAM and NAM


# I use the linear programs to compute PAM and NAM,
# but of course they can be computed directly

convex_cost = np.abs(example_1.X_types[:,None] - example_1.Y_types[None,:])**p

#PAM: |x-y|^p , p>1


matching_PAM = solve_1to1(convex_cost, example_1.n_x, example_1.m_y)

#NAM: -|x-y|^p , p>1


matching_NAM = solve_1to1(-convex_cost, example_1.n_x, example_1.m_y)

# Plot the matchings


example_1.plot_matching(matching_CS,
title=f'Composite Sorting: $|x-y|^{{1/{ζ}}}$',
figsize=(5,5), add_labels=True)
example_1.plot_matching(matching_PAM, title='PAM',
figsize=(5,5), add_labels=True)

To explore the coincidental resemblance to a NAM outcome, let's shift type 𝑦0 to the left while keeping it between 𝑥0 and
𝑥1 .
PAM and NAM are invariant to any such shift.
However, for a large enough shift, composite sorting now coincides with PAM.

N = 2
ζ = 2
p = 2

# Solve composite sorting problem


example_1 = ConcaveCostOT(np.array([0,5]),
np.array([1,10]) ,
ζ = ζ)
matching_CS, _ ,_ = example_1.solve_primal_DSS()

# Solve PAM and NAM




convex_cost = np.abs(example_1.X_types[:,None] - example_1.Y_types[None,:])**p

matching_PAM = solve_1to1(convex_cost, example_1.n_x, example_1.m_y)


matching_NAM = solve_1to1(-convex_cost, example_1.n_x, example_1.m_y)

# Plot the matchings


example_1.plot_matching(matching_CS,
title = f'Composite Sorting: $|x-y|^{{1/{ζ}}}$',
figsize = (5,5), add_labels = True)
example_1.plot_matching(matching_PAM, title = 'PAM',
figsize = (5,5), add_labels = True)
example_1.plot_matching(matching_NAM, title = 'NAM',
figsize = (5,5), add_labels = True)

Finally, notice that the Monge problem cost function |𝑥 − 𝑦| equals the limit of the composite sorting cost |𝑥 − 𝑦|1/𝜁 as
𝜁 ↓ 1 and also the limit of |𝑥 − 𝑦|𝑝 as 𝑝 ↓ 1.
Evidently, the Monge problem is solved by both the PAM and the composite sorting assignment that arises for 𝜁 ↓ 1.


In the following example, the Monge cost of the composite sorting assignment equals the Monge cost of PAM.
Consequently, it is optimal for the Monge problem.

N = 10

ζ = 1.01
p = 2
np.random.seed(1)
X_types = np.random.uniform(0,10, size=N)
Y_types = np.random.uniform(0,10, size=N)

# Solve composite sorting problem


example_1 = ConcaveCostOT(X_types, Y_types, ζ=ζ)

matching_CS, _ ,_ = example_1.solve_primal_DSS()

# Solve PAM and NAM


convex_cost = np.abs(X_types[:,None] - Y_types[None,:])** p

matching_PAM = solve_1to1(convex_cost, example_1.n_x, example_1.m_y)


matching_NAM = solve_1to1(-convex_cost, example_1.n_x, example_1.m_y)

example_1.plot_matching(matching_CS,
title=f'Composite Sorting: $|x-y|^{{1/{ζ}}}$', figsize=(5,5))
example_1.plot_matching(matching_PAM, title = 'PAM', figsize=(5,5))

monge_cost_comp = (matching_CS * np.abs(X_types[:,None] - Y_types[None,:])).sum()


monge_cost_PAM = (matching_PAM * np.abs(example_1.X_types[:,None]
- example_1.Y_types[None,:])).sum()
print("Monge cost of the composite matching assignment:")
print(monge_cost_comp)
print("Monge cost of PAM:")
print(monge_cost_PAM)

Monge cost of the composite matching assignment:


10.530287849572634
Monge cost of PAM:
10.530287849572636


16.5.2 Example 2

The following example has five agents per side.


The composite sorting assignment differs from both PAM and NAM.
Composite sorting features a hierarchical structure, with each hierarchy positively sorted.
Indeed, consider the composite sorting assignment and note that
• the only arcs visible from above are the ones corresponding to pairings (𝑥0 , 𝑦3 ) and (𝑦4 , 𝑥4 );
• after removing these agents, the only arcs visible from above correspond to (𝑥1 , 𝑦1 ) and (𝑥3 , 𝑦2 ) ;
• after removing these agents, the only arc/pairing left is (𝑥2 , 𝑦0 ).
Note that, at each iteration, the partial assignment corresponding to the arcs visible from above features positive
assortativeness.
Another distinct feature of composite matching stands out from the figures:
• arcs do not intersect

N = 5
ζ = 2
p = 2

X_types_example_2 = np.array([-2,0,2,9, 15])


Y_types_example_2 = np.array([3,6,10,12, 14])

# Solve composite sorting problem


example_2 = ConcaveCostOT(X_types_example_2, Y_types_example_2, ζ=ζ)

matching_CS, _ ,_ = example_2.solve_primal_DSS()

# Solve PAM and NAM


convex_cost = np.abs(X_types_example_2[:,None] - Y_types_example_2[None,:])** p

matching_PAM = solve_1to1(convex_cost, example_2.n_x, example_2.m_y)


matching_NAM = solve_1to1(-convex_cost, example_2.n_x, example_2.m_y)

example_2.plot_matching(matching_CS, title = 'Composite Sorting: $|x-y|^{1/2}$',


figsize = (5,5), add_labels=True)
example_2.plot_matching(matching_PAM, title = 'PAM',
figsize = (5,5), add_labels=True)
example_2.plot_matching(matching_NAM, title = 'NAM',
figsize = (5,5), add_labels=True)


16.5.3 Example 3

[Boerma et al., 2024] provide the following example.


There are four agents per side and three types per side (so the problem is not unitary, as opposed to the examples above).

X_types_example_3 = np.array([0,5,9])
Y_types_example_3 = np.array([1,6,10])
n_x_example_3 = np.array([2,1,1], dtype= int)
m_y_example_3 = np.array([1,1,2], dtype= int)

example_3 = ConcaveCostOT(X_types_example_3, Y_types_example_3,




n_x_example_3, m_y_example_3, ζ = 2)
example_3.plot_marginals(figsize = (5,5))

In the case of positive assortative matching (PAM), the two agents with the lowest value 𝑥0 are matched with the
lowest-valued agents on the other side, 𝑦0 and 𝑦1 .
Similarly, the agents with the highest value 𝑦2 are matched with the highest-valued types on the other side, 𝑥1 and 𝑥2 .
Composite sorting features both negative and positive sorting patterns: agents of type 𝑥0 are matched with both the
bottom 𝑦0 and the top 𝑦2 of the distribution.

matching_CS, _ ,_ = example_3.solve_primal_DSS()

convex_cost = np.abs(example_3.X_types[:,None] - example_3.Y_types[None,:])**2


matching_PAM = solve_1to1(convex_cost, example_3.n_x, example_3.m_y)
matching_NAM = solve_1to1(-convex_cost, example_3.n_x, example_3.m_y)

example_3.plot_matching(matching_PAM, title = 'PAM',


figsize = (5,5), add_labels= True)
example_3.plot_matching(matching_CS, title = 'Composite Sorting',
figsize = (5,5), add_labels= True)
example_3.plot_matching(matching_NAM, title = 'NAM',
figsize = (5,5), add_labels= True)


16.6 Dual Solution

Let’s recall the formulation

𝑉𝑃 = min_{𝜇≥0} ∑_{(𝑥,𝑦)∈𝑋×𝑌} 𝜇𝑥𝑦 𝑐𝑥𝑦

s.t. ∑_{𝑦∈𝑌} 𝜇𝑥𝑦 = 𝑛𝑥 for all 𝑥 ∈ 𝑋

     ∑_{𝑥∈𝑋} 𝜇𝑥𝑦 = 𝑚𝑦 for all 𝑦 ∈ 𝑌

The dual problem is

𝑉𝐷 = max_{𝜙,𝜓} ∑_{𝑥∈𝑋} 𝑛𝑥 𝜙𝑥 + ∑_{𝑦∈𝑌} 𝑚𝑦 𝜓𝑦

s.t. 𝜙𝑥 + 𝜓𝑦 ≤ 𝑐𝑥𝑦 for all (𝑥, 𝑦) ∈ 𝑋 × 𝑌

where (𝜙, 𝜓) are dual variables, which can be interpreted as shadow cost of agents in 𝑋 and 𝑌 , respectively.
Since the dual is feasible and bounded, 𝑉𝑃 = 𝑉𝐷 (strong duality prevails).
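
As a quick numerical check of strong duality, we can recover dual variables directly from the linear program via the
solve_1to1 helper defined in the linear programming subsection above with return_dual=True; the following sketch reuses
example_pb and assumes the equality-constraint marginals are ordered as in the constraint vector there (first the 𝑛𝑥
constraints, then the 𝑚𝑦 constraints).

mu_LP, duals = solve_1to1(example_pb.cost_x_y, example_pb.n_x, example_pb.m_y,
                          return_dual=True)

# Split the equality-constraint duals into (ϕ_x, ψ_y)
ϕ_LP = duals[:len(example_pb.n_x)]
ψ_LP = duals[len(example_pb.n_x):]

V_P = (mu_LP * example_pb.cost_x_y).sum()
V_D = (example_pb.n_x * ϕ_LP).sum() + (example_pb.m_y * ψ_LP).sum()
print(V_P, V_D)   # the two values should coincide up to numerical precision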
Assume now that 𝑦𝑥𝑦 = 𝛼𝑥 + 𝛾𝑦 − 𝑐𝑥𝑦 is the output generated by matching 𝑥 and 𝑦.
It includes the sum of 𝑥 and 𝑦 specific amenities/outputs minus the cost 𝑐𝑥𝑦 .
Then we can formulate the following problem and its dual

𝑊𝑃 = max_{𝜇≥0} ∑_{(𝑥,𝑦)∈𝑋×𝑌} 𝜇𝑥𝑦 𝑦𝑥𝑦

s.t. ∑_{𝑦∈𝑌} 𝜇𝑥𝑦 = 𝑛𝑥 for all 𝑥 ∈ 𝑋

     ∑_{𝑥∈𝑋} 𝜇𝑥𝑦 = 𝑚𝑦 for all 𝑦 ∈ 𝑌

𝑊𝐷 = min_{𝑢,𝑣} ∑_{𝑥∈𝑋} 𝑛𝑥 𝑢𝑥 + ∑_{𝑦∈𝑌} 𝑚𝑦 𝑣𝑦

s.t. 𝑢𝑥 + 𝑣𝑦 ≥ 𝑦𝑥𝑦 for all (𝑥, 𝑦) ∈ 𝑋 × 𝑌
Given the constraints, the solutions of the primal problem 𝑊𝑃 do not depend on 𝛼, 𝛾 and coincide with the solutions of
the cost minimization problem 𝑉𝑃 .
The values are related by 𝑊𝑃 = ∑_{𝑥∈𝑋} 𝑛𝑥 𝛼𝑥 + ∑_{𝑦∈𝑌} 𝑚𝑦 𝛾𝑦 − 𝑉𝑃 .
The dual solutions of 𝑉𝐷 and 𝑊𝐷 are related by 𝑢𝑥 = 𝛼𝑥 − 𝜙𝑥 and 𝑣𝑦 = 𝛾𝑦 − 𝜓𝑦 .
The dual solution (𝑢, 𝑣) of 𝑊𝐷 can be interpreted as equilibrium utilities of the agents, which include the individual
specific amenities and equilibrium shadow costs.
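
The relation between the two formulations can be checked numerically; the following sketch reuses example_pb and
solve_1to1 from above, with hypothetical amenity vectors 𝛼, 𝛾 drawn at random.

rng = np.random.default_rng(0)
α = rng.uniform(0, 5, size=len(example_pb.n_x))
γ = rng.uniform(0, 5, size=len(example_pb.m_y))

# Output y_{xy} = α_x + γ_y - c_{xy}; maximizing it amounts to minimizing its negative
y_x_y = α[:, None] + γ[None, :] - example_pb.cost_x_y
mu_W = solve_1to1(-y_x_y, example_pb.n_x, example_pb.m_y)
mu_V = solve_1to1(example_pb.cost_x_y, example_pb.n_x, example_pb.m_y)

W_P = (mu_W * y_x_y).sum()
V_P = (mu_V * example_pb.cost_x_y).sum()
print(np.isclose(W_P, (example_pb.n_x * α).sum()
                      + (example_pb.m_y * γ).sum() - V_P))   # True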
[Boerma et al., 2024] propose an efficient method to compute the dual variables from the optimal matching (primal
solution) in the case of composite sorting.
Let’s generate an instance and compute the optimal matching.

num_agents = 8

np.random.seed(1)

X_types_assignment_pb = np.random.uniform(0, 10, size=num_agents)




Y_types_assignment_pb = np.random.uniform(0, 10, size=num_agents)

# Create instance of the problem


exam_assign = ConcaveCostOT(X_types_assignment_pb, Y_types_assignment_pb)

# Solve primal problem


assignment, assignment_OD, exam_assign_OD = exam_assign.solve_primal_DSS()

# Plot matching
add_labels = True if num_agents < 16 else False
exam_assign_OD.plot_matching(assignment_OD, title = f'Composite Sorting',
figsize=(10,10), add_labels=add_labels)

Having computed the optimal matching, we say that a pair (𝑥0 , 𝑦0 ) is a subpair of a matched pair (𝑥, 𝑦) if 𝑥0 , 𝑦0 lie in
the open interval between 𝑥 and 𝑦 and the pair (𝑥0 , 𝑦0 ) is not nested within another matched pair lying between 𝑥 and 𝑦.
The following method computes the subpairs of the optimal matching of the off-diagonal instance.
The output of this method is a dictionary with keys corresponding to matched pairs and an “artificial pair” which collects
all arcs which are visible from above.
Values of each key (𝑥0 , 𝑦0 ) are the subpairs ordered so that the first subpair is the subpair with the 𝑥 type closest to 𝑥0
and the last subpair is the subpair with the 𝑦 type closest to 𝑦0 .

def sort_subpairs(self, subpairs, x_smaller_y=True ):

x_key = min if x_smaller_y else max


y_key = max if x_smaller_y else min

first_pair = x_key(subpairs, key=lambda pair: self.X_types[pair[0]])


last_pair = y_key(subpairs, key=lambda pair: self.Y_types[pair[1]])

intermediate_pairs = [pair for pair in subpairs




if pair != first_pair and pair != last_pair]

return [first_pair] + intermediate_pairs + [last_pair]

ConcaveCostOT.sort_subpairs = sort_subpairs

def find_subpairs(self, matching, return_pairs_between = False):

# Create set of matched pairs of types and add an artificial pair


matched_pairs = set( zip(* np.where(matching > 0)))

# Initialize dictionary to store subpairs


subpairs = {}
pairs_between = {}

# Find subpairs (both nested and non-nested) for each matched pair
for matched_pair in matched_pairs | {'artificial_pair'}:
# Determine the interval of the matched pair
if matched_pair != 'artificial_pair':
min_type, max_type = sorted([self.X_types[matched_pair[0]],
self.Y_types[matched_pair[1]]])
else:
min_type, max_type = (-np.inf, np.inf)

# Add all pairs in the interval to the list of nested_subpairs


pairs_between[matched_pair] = {
pair for pair in matched_pairs if pair != matched_pair and
min_type <= self.X_types[pair[0]] <= max_type and
min_type <= self.Y_types[pair[1]] <= max_type}

subpairs = {key: value.copy() for key, value in pairs_between.items()}

# Remove nested pairs


for matched_pair in matched_pairs | {'artificial_pair'}:
# Compute all nested subpairs
nested_subpairs = set(chain.from_iterable(subpairs[pair]
for pair in subpairs[matched_pair]))
# Remove nested pairs from subpairs[matched_pair]
subpairs[matched_pair] -= nested_subpairs
# subpairs[matched_pair].discard(matched_pair)
subpairs[matched_pair] = list(subpairs[matched_pair])

# Order the subpairs:


# the first (last) pair should have x (y) close to pair_x (pair_y)
if matched_pair != 'artificial_pair' and len(subpairs[matched_pair]) > 1:
subpairs[matched_pair] = self.sort_subpairs(
subpairs[matched_pair],
x_smaller_y=self.X_types[matched_pair[0]]
< self.Y_types[matched_pair[1]])

if return_pairs_between:
return subpairs, pairs_between
return subpairs

OffDiagonal.find_subpairs = find_subpairs


subpairs, pairs_between = exam_assign_OD.find_subpairs(assignment,


return_pairs_between = True)
subpairs

{(5, 5): [(4, 7), (1, 3)],


(3, 1): [(7, 0), (0, 2)],
'artificial_pair': [(5, 5), (2, 6)],
(7, 0): [],
(6, 4): [],
(0, 2): [],
(2, 6): [],
(1, 3): [],
(4, 7): [(6, 4), (3, 1)]}

The algorithm to compute the dual variables has a hierarchical structure: it starts from the matched pairs with no subpairs
and then moves to those pairs whose subpairs have already been processed.
We can visualize this hierarchical structure by computing the order in which the pairs will be processed and plotting the
matching with the color of each arc corresponding to its level in the hierarchy.

## Compute Hierarchies

def find_hierarchies(subpairs):

# Initialize sets for faster membership checks


pairs_to_process = set(subpairs.keys()) # All pairs to process
processed_pairs = set() # Pairs that have been processed

# Initialize ready_to_process with pairs that have no subpairs


ready_to_process = {pair for pair, sublist in subpairs.items()
if len(sublist) == 0}

# Initialize hierarchies with the first level


hierarchies = [list(ready_to_process)]

# Continue processing while there are unprocessed pairs


while len(processed_pairs) < len(subpairs):
# Mark ready_to_process pairs as processed
processed_pairs.update(ready_to_process)

# Remove ready_to_process pairs from pairs_to_process


pairs_to_process -= ready_to_process

# Find new ready_to_process pairs that have all their subpairs processed
ready_to_process = {
pair for pair in pairs_to_process
if all(subpair in processed_pairs for subpair in subpairs[pair])}

# Append the new ready_to_process to hierarchies


hierarchies.append(list(ready_to_process))

return hierarchies

## Plot Hierarchies

def plot_hierarchies(self, subpairs, scatter=True, range_x_axis=None):




# Compute hierarchies
hierarchies = find_hierarchies(subpairs)

# Create the figure and axis


fig, ax = plt.subplots(figsize=(15, 15))

# Plot types on the real line (blue for X_types, red for Y_types)
size_marker = 20 if scatter else 0
ax.scatter(self.X_types, np.zeros_like(self.X_types), color='blue',
s=size_marker, zorder=5, label='X_types')
ax.scatter(self.Y_types, np.zeros_like(self.Y_types), color='red',
s=size_marker, zorder=5, label='Y_types')

# Plot arcs
# Create a colormap ('viridis' or 'coolwarm', 'plasma')
cmap = plt.colormaps['plasma']
for level, hierarchy in enumerate(hierarchies):
color = (cmap(level / (len(hierarchies) - 1))
if len(hierarchies) > 1 else cmap(0))
for pair in hierarchy:
if pair == 'artificial_pair':
continue

min_type, max_type = sorted([self.X_types[pair[0]],


self.Y_types[pair[1]]])
width = max_type - min_type
center = (max_type + min_type) / 2
# Semicircle height can be the same as the width for a perfect arc
height = width
semicircle = patches.Arc((center, 0), width, height,
theta1=0, theta2=180,
color=color, lw = 3)
ax.add_patch(semicircle)

if range_x_axis is not None:


ax.set_xlim(range_x_axis)
ax.set_ylim(- self.X_types.ptp() / 10,
(range_x_axis[1] - range_x_axis[0]) / 2 )

# Title and layout settings for the main plot


plt.title('Hierarchies of the optimal matching (off-diagonal)')
ax.set_aspect('equal')
plt.axhline(0, color='black', linewidth=1)
ax.spines['bottom'].set_position(('data', 0))
ax.spines['left'].set_color('none')
ax.spines['top'].set_color('none')
ax.spines['right'].set_color('none')
ax.yaxis.set_ticks([]) # Hide the y-axis ticks

# Add a colorbar to represent hierarchy levels


sm = cm.ScalarMappable(cmap=cmap,
norm=Normalize(vmin=0, vmax= len(hierarchies) - 1))
sm.set_array([])
cbar = plt.colorbar(sm, ax=ax, orientation='vertical', pad=0.1, shrink=0.2)
# Show only min and max levels
cbar.set_ticks([0, len(hierarchies) - 1])



# Label the ticks for clarity
cbar.set_ticklabels(['Lowest', 'Highest'])

plt.show()

OffDiagonal.plot_hierarchies = plot_hierarchies

exam_assign_OD.plot_hierarchies(subpairs)

We proceed to describe and implement the algorithm to compute the dual solution.
As already mentioned, the algorithm starts from the matched pairs (𝑥0 , 𝑦0 ) with no subpairs and assigns the (temporary)
values 𝜙𝑥0 = 𝑐𝑥0 𝑦0 and 𝜓𝑦0 = 0, i.e., the 𝑥 type bears the whole cost of the match.
The algorithm then proceeds sequentially by processing any matched pair whose subpairs have already been processed.
After picking any such matched pair (𝑥0 , 𝑦0 ), the dual variables already computed for the processed subpairs need to be
made "comparable".
Indeed, for any subpair (𝑥1 , 𝑦1 ) of (𝑥0 , 𝑦0 ), the dual variables of all the types between 𝑥1 and 𝑦1 satisfy dual feasibility
and complementary slackness locally, i.e., 𝜙𝑥 + 𝜓𝑦 ≤ 𝑐𝑥𝑦 with equality if (𝑥, 𝑦) is a matched pair, for all types 𝑥, 𝑦 between
𝑥1 and 𝑦1 .
But dual feasibility is not satisfied globally in general; for instance, it might not be satisfied for two subpairs (𝑥1 , 𝑦1 ) and
(𝑥2 , 𝑦2 ) of (𝑥0 , 𝑦0 ).
Therefore, letting (𝑥1 , 𝑦1 ), … , (𝑥𝑝 , 𝑦𝑝 ) be the subpairs of (𝑥0 , 𝑦0 ), we compute a solution (𝛽2 , … , 𝛽𝑝 ) of the system of
linear inequalities

max(𝑐𝑥0 𝑦0 − 𝑐𝑥0 𝑦𝑖 − 𝑐𝑥𝑗 𝑦0 , −𝑐𝑥𝑗 𝑦𝑖 ) + 𝑐𝑥𝑖 𝑦𝑖 ≤ ∑_{𝑘=𝑖+1}^{𝑗} 𝛽𝑘 ≤ min(𝑐𝑥0 𝑦𝑗 + 𝑐𝑥𝑖 𝑦0 − 𝑐𝑥0 𝑦0 , 𝑐𝑥𝑖 𝑦𝑗 ) − 𝑐𝑥𝑗 𝑦𝑗 , for all 1 ≤ 𝑖 < 𝑗 ≤ 𝑝.

Then for all 𝑖 ∈ [𝑝] we compute the adjustment Δ𝑖 = ∑_{𝑘=𝑖+1}^{𝑝} 𝛽𝑘 + 𝜙𝑥𝑝 − 𝜙𝑥𝑖 and modify the dual variables

𝜙𝑥 ← 𝜙𝑥 + Δ𝑖
𝜓𝑦 ← 𝜓𝑦 − Δ𝑖 ,

for all matched pairs (𝑥, 𝑦) between 𝑥𝑖 and 𝑦𝑖 .


After this step, the dual variables of the types between 𝑥0 and 𝑦0 satisfy dual feasibility and complementary slackness;
we can then proceed to compute the dual variables for 𝑥0 and 𝑦0 by setting

𝜓𝑦0 = min_{𝑖∈[𝑝]} {𝑐𝑥𝑖 𝑦0 − 𝜙𝑥𝑖 }

𝜙𝑥0 = 𝑐𝑥0 𝑦0 − 𝜓𝑦0 .

The pair (𝑥0 , 𝑦0 ) is now processed.


The following method computes the solution 𝛽 of the linear system of inequalities above.

def compute_betas(self, pair, subpairs):


types_subpairs = np.array(subpairs)

# Define the bounds of the linear inequality system


if pair == 'artificial_pair':
bounds = (- self.cost_x_y[types_subpairs[:,0][:,None],
types_subpairs[:,1][None,:]]
+ self.cost_x_y[types_subpairs[:,0],
types_subpairs[:,1]][None,:])
else:
bounds = (np.maximum(
self.cost_x_y[pair]
- self.cost_x_y[pair[0], types_subpairs[:,1]][None,:]
- self.cost_x_y[types_subpairs[:,0],pair[1]][:,None],
- self.cost_x_y[types_subpairs[:,0][:,None],
types_subpairs[:,1][None,:]]
)
+ self.cost_x_y[types_subpairs[:,0], types_subpairs[:,1]][None,:])

# Define linear inequality system


num_subpairs = len(types_subpairs)
c_1 = (np.arange(num_subpairs)[:, None, None]
>= np.arange(num_subpairs)[None, None, :])
c_2 = (np.arange(num_subpairs)[None, None, :]
> np.arange(num_subpairs)[ None,:, None])
sum_tensor = (c_1 & c_2).astype(int)

sum_tensor -= sum_tensor.transpose(1, 0, 2)

# Solve the system of linear inequalities


result = linprog(c = np.zeros(num_subpairs),
A_ub= - sum_tensor.reshape(num_subpairs**2, num_subpairs),
b_ub= - bounds.flatten(),
bounds=(None,None),
method='highs')

beta = result.x
beta[0] = 0

return beta

OffDiagonal.compute_betas = compute_betas

The following method iteratively processes the matched pairs of the off-diagonal matching as explained above.

def compute_dual_off_diagonal(self, subpairs, pairs_between):



# Initialize dual variables
ϕ_x = np.zeros(len(self.X_types))
ψ_y = np.zeros(len(self.Y_types))

# Initialize sets for faster membership checks


pairs_to_process = set(subpairs.keys()) # All pairs to process
processed_pairs = set() # Pairs that have been processed

# Initialize ready_to_process with pairs that have no subpairs


ready_to_process = {pair for pair, sublist in subpairs.items()
if len(sublist) == 0}

while len(processed_pairs) < len(subpairs):

# 1. Pick any subpair which is ready to process


for pair in ready_to_process:

# 2. If there are no subpairs, φ_x = c_{xy} and ψ_y = 0


if len(subpairs[pair]) == 0:
ϕ_x[pair[0]] = self.cost_x_y[pair]
ψ_y[pair[1]] = 0

# 3. If there are subpairs:


else:
# (a) compute betas
beta = self.compute_betas(pair, subpairs[pair])

# (b) adjust potentials of types between each subpair of the pair


for i, subpair in enumerate(subpairs[pair]):
# update potentials of these types
types_between_subpair = np.array(
list(pairs_between[subpair]) + [subpair])

Δ_subpair = (beta[np.arange(i+1,len(subpairs[pair]))].sum()
+ ϕ_x[subpairs[pair][-1][0]]
- ϕ_x[subpair[0]])

ϕ_x[ types_between_subpair[:,0]] += Δ_subpair


ψ_y[ types_between_subpair[:,1]] -= Δ_subpair

# (c) compute potentials of the pair


subpairs_x = np.array(subpairs[pair])[:,0]
subpairs_y = np.array(subpairs[pair])[:,1]

if pair != 'artificial_pair':
if pair[0] == subpairs_x[0]:
ψ_y[pair[1]] = np.min(self.cost_x_y[pair[0], subpairs_y]
- ψ_y[subpairs_y]) + self.cost_x_y[pair]
else:
ψ_y[pair[1]] = np.min(self.cost_x_y[subpairs_x,
pair[1]] - ϕ_x[subpairs_x] )

ϕ_x[pair[0]] = self.cost_x_y[pair] - ψ_y[pair[1]]

# Add pair to processed pairs



processed_pairs.add(pair)

# Remove ready_to_process from pairs_to_process


pairs_to_process -= ready_to_process

# Add to ready_to_process pairs for which all subpairs are in processed_pairs


ready_to_process = {pair for pair in pairs_to_process
if all(subpair in processed_pairs for subpair in subpairs[pair])}

return ϕ_x, ψ_y

OffDiagonal.compute_dual_off_diagonal = compute_dual_off_diagonal

We apply the algorithm to our example and check that dual feasibility (𝜙𝑥 + 𝜓𝑦 ≤ 𝑐𝑥𝑦 for all 𝑥 ∈ 𝑋 and 𝑦 ∈ 𝑌 ) as well
as strong duality (𝑉𝑃 = 𝑉𝐷 ) are satisfied.

ϕ_x , ψ_y = exam_assign_OD.compute_dual_off_diagonal(subpairs, pairs_between)

# Check dual feasibility


dual_feasibility_i_j = ϕ_x[:,None] + ψ_y[None,:] - exam_assign_OD.cost_x_y
print('Violations of dual feasibility:' , np.sum(dual_feasibility_i_j > 1e-10))

dual_sol = (exam_assign_OD.n_x * ϕ_x).sum() + (exam_assign_OD.m_y* ψ_y).sum()


primal_sol = (assignment_OD * exam_assign_OD.cost_x_y).sum()

# Check strong duality


print('Value of dual solution: ', dual_sol)
print('Value of primal solution: ', primal_sol)

# # Check the value of the primal problem


if len(exam_assign_OD.n_x) * len(exam_assign_OD.m_y) < 1000:
mu_x_y , p_z= solve_1to1(exam_assign_OD.cost_x_y,
exam_assign_OD.n_x,
exam_assign_OD.m_y,
return_dual = True)
print('Value of primal solution (scipy)',
(mu_x_y * exam_assign_OD.cost_x_y).sum())

Violations of dual feasibility: 0


Value of dual solution: 9.03369035213102
Value of primal solution: 9.03369035213102
Value of primal solution (scipy) 9.03369035213102

Having computed the dual variables of the off-diagonal types, we compute the dual variables for perfectly matched pairs by setting

$$ \phi_x = \min_{y \in Y^{OD}} \{ c_{xy} - \psi_y \}, \qquad \psi_y = \min_{x \in X^{OD}} \{ c_{xy} - \phi_x \} $$

where $X^{OD}$ and $Y^{OD}$ are the types of the off-diagonal instance, for which the dual variables have already been computed.
The following method computes the full dual solution from the primal solution.


def compute_dual_solution(self, matching_off_diag):

# Compute the dual solution for the off-diagonal types


off_diag, match_tuple = self.generate_offD_onD_matching()
nonzero_id_x, nonzero_id_y, matching_diag = match_tuple

subpairs, pairs_between = off_diag.find_subpairs(matching_off_diag,


return_pairs_between = True)
ϕ_x_off_diag, ψ_x_off_diag = off_diag.compute_dual_off_diagonal(
subpairs,pairs_between)

# Compute the dual solution for the on-diagonal types


ϕ_x = np.ones(len(self.X_types)) * np.inf
ψ_y = np.ones(len(self.Y_types)) * np.inf

ϕ_x[nonzero_id_x] = ϕ_x_off_diag
ψ_y[nonzero_id_y] = ψ_x_off_diag

ϕ_x = np.min( self.cost_x_y - ψ_y[None,:] , axis = 1)


ψ_y = np.min( self.cost_x_y - ϕ_x[:,None] , axis = 0)

return ϕ_x, ψ_y

ConcaveCostOT.compute_dual_solution = compute_dual_solution

ϕ_x, ψ_y = exam_assign.compute_dual_solution(assignment_OD)

dual_feasibility_i_j = ϕ_x[:,None] + ψ_y[None,:] - exam_assign.cost_x_y


print('Violations of dual feasibility:' , np.sum(dual_feasibility_i_j > 1e-10))
print('Value of dual solution: ', (exam_assign.n_x * ϕ_x).sum()
+ (exam_assign.m_y * ψ_y).sum())
print('Value of primal solution: ', (assignment * exam_assign.cost_x_y).sum())

Violations of dual feasibility: 0


Value of dual solution: 9.03369035213102
Value of primal solution: 9.03369035213102

16.7 Application

16.7.1 Data

We now replicate the empirical analysis carried out by [Boerma et al., 2024].
The dataset is obtained from the American Community Survey and contains individual level data on income, age and
occupation.
The occupation of each individual consists of a Standard Occupational Classification (SOC) code.
There are 497 codes in total.
We consider only employed (civilian) individuals with ages between 25 and 60 from 2010 to 2017.
To visualize log-wage dispersion, we group the individuals by occupation and compute the mean and standard deviation
of the wages within each occupation.


Then we sort occupations by average log-earnings within each occupation.


The resulting summary dataset is included in the file acs_data_summary.csv
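For concreteness, here is a minimal sketch of how such an occupation-level summary could be constructed from raw ACS microdata. The column names EARNINGS and SOC_CODE are hypothetical placeholders, and we assume the mean and standard deviation are taken over log earnings, as the discussion of log-wage dispersion suggests.

import numpy as np
import pandas as pd

def summarize_by_occupation(micro_df, wage_col='EARNINGS', occ_col='SOC_CODE'):
    # Group individual-level records by occupation code and compute
    # the mean and standard deviation of log earnings, plus head counts
    micro_df = micro_df.assign(log_wage=np.log(micro_df[wage_col]))
    summary = micro_df.groupby(occ_col)['log_wage'].agg(
        mean_Earnings='mean', std_Earnings='std', count='size')
    # Sort occupations by average log earnings, as in the text
    return summary.sort_values('mean_Earnings').reset_index()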

data_path = '_static/lecture_specific/match_transport/'
occupation_df = pd.read_csv(data_path + 'acs_data_summary.csv')

We plot the wage standard deviation for the sorted occupations.

# Scatter plot wage dispersion for each occupation


plt.figure(figsize=(10, 6))

# Scatter plot with marker size proportional to count


plt.scatter(
occupation_df.index,
occupation_df['std_Earnings'],
# marker_sizes
s = 1000 * (occupation_df['count'] / occupation_df['count'].max()),
# transparency
alpha = 0.5,
label = 'Occupations'
)

# Polynomial interpolation
x = np.arange(len(occupation_df))
y = occupation_df['std_Earnings']
degree = 5
p = np.poly1d(np.polyfit(x, y, degree) )
plt.plot(x, p(x), color='red')

# Add labels and title


plt.xlabel("Occupations", fontsize=12)
plt.ylabel("Wage Dispersion", fontsize=12)
plt.xticks([], fontsize=8)

plt.show()

We also plot the average wages for each occupation (SOC code). Again, occupations are ordered by increasing average
wage.

# Scatter plot average wage for each occupation


plt.figure(figsize=(10, 6))

# Scatter plot with marker size proportional to count


plt.scatter(
occupation_df.index,
occupation_df['mean_Earnings'],
alpha = 0.5, # transparency
label = 'Occupations'
)

# Polynomial interpolation
x = np.arange(len(occupation_df))
y = occupation_df['mean_Earnings']
degree = 5
p = np.poly1d(np.polyfit(x, y, degree) )
plt.plot(x, p(x), color='red')

# Add labels and title
plt.xlabel("Occupations", fontsize=12)
plt.ylabel("Average Wage", fontsize=12)
plt.xticks([], fontsize=8)

plt.show()

Fig. 16.1: Average wage for each Standard Occupational Classification (SOC) code. The codes are sorted by average wage on the horizontal axis. In red, a polynomial of degree 5 is fitted to the data. The size of the marker is proportional to the number of individuals in the occupation.

Fig. 16.2: Average wage for each Standard Occupational Classification (SOC) code. The codes are sorted by average wage on the horizontal axis. In red, a polynomial of degree 5 is fitted to the data.

16.7.2 Model

parameters_1980 = namedtuple('Params_Jobs', [
'mean_1', 'var_1', 'mean_2', 'var_2', 'mixing_weight', 'var_workers'
])(
mean_1=0.38,
var_1=0.06,
mean_2=0.0,
var_2=0.75,
mixing_weight=0.36,
var_workers=0.2
)
num_agents=1500

def generate_types_application(self, num_agents, params, random_seed=1):

mean_1, var_1, mean_2, var_2, mixing_weight, var_workers = params

np.random.seed(random_seed)

# Job types
job_types = np.where(np.random.rand(num_agents) < mixing_weight,
np.random.lognormal(mean_1, var_1, num_agents),
np.random.lognormal(mean_2, var_2, num_agents))

# Worker types
mean_workers = - var_workers/ 2
worker_types = np.random.lognormal(mean_workers, var_workers, num_agents)

# Check that worker and job types have distinct values


assert len(np.unique(worker_types)) == num_agents
assert len(np.unique(job_types)) == num_agents

# Assign types to the instance


self.X_types = worker_types
self.Y_types = job_types


# Assign unitary marginals
self.n_x = np.ones(num_agents, dtype=int)
self.m_y = np.ones(num_agents, dtype=int)

# Assign cost matrix


self.cost_x_y = np.abs(worker_types[:, None] \
- job_types[None, :]) ** (1/self.ζ)

ConcaveCostOT.generate_types_application = generate_types_application

# Create an instance of ConcaveCostOT class and generate types


model_1980 = ConcaveCostOT()
model_1980.generate_types_application(num_agents, parameters_1980)

Since we will consider examples with a large number of agents, it will be convenient to visualize the distributions as
histograms approximating the pdfs.

def plot_marginals_pdf(self, bins, figsize=(15, 8),


range_x_axis=None, title='Distributions of types'):

plt.figure(figsize=figsize)

# Plotting histogram for X_types (approximating PDF)


plt.hist(self.X_types, bins=bins, density=True, color='blue', alpha=0.7,
label='PDF of worker types',
edgecolor='blue', range = range_x_axis)

# Plotting histogram for Y_types (approximating PDF)


counts, edges = np.histogram(self.Y_types, bins=bins,
density=True, range=range_x_axis)
plt.bar(edges[:-1], -counts, width=np.diff(edges), color='red', alpha=0.7,
label='PDF of job types ', align='edge', edgecolor='red')

# Add grid and y=0 axis


plt.grid(False)
plt.axhline(0, color='black', linewidth=1)
plt.gca().spines['bottom'].set_position(('data', 0))

# Set the x-axis limits based on the range argument


if range_x_axis is not None:
plt.xlim(range_x_axis)

# Labeling the axes and the title


plt.ylabel('Density')
plt.title(title)
plt.gca().yaxis.set_major_locator(MaxNLocator(integer=True))
plt.legend()
plt.yticks([])

plt.show()

ConcaveCostOT.plot_marginals_pdf = plot_marginals_pdf

We plot the histograms and the measure of underqualification for the worker types and job types. We then compute the primal solution and plot the matching.


# Plot pdf
range_x_axis = (0, 4)
model_1980.plot_marginals_pdf(figsize=(8, 5),
bins=300, range_x_axis=range_x_axis)

# Plot H_z
model_OD_1980 , _ = model_1980.generate_offD_onD_matching()
model_OD_1980.plot_H_z(figsize=(8, 5), range_x_axis=range_x_axis, scatter=False)


# Compute optimal matching and plot off diagonal matching


matching_1980, matching_OD_1980, model_OD_1980 = model_1980.solve_primal_DSS()
model_OD_1980.plot_matching(matching_OD_1980,
title = 'Optimal Matching (off-diagonal)',
figsize=(10, 10), plot_H_z=True, scatter=False)

From the optimal matching we compute and visualize the hierarchies.


We then find the dual solution (𝜙, 𝜓) and compute the wages as 𝑤𝑥 = 𝑔(𝑥) − 𝜙𝑥 , assuming that the type-specific
productivity of type 𝑥 is 𝑔(𝑥) = 𝑥.

# Find subpairs and plot hierarchies


subpairs, pairs_between = model_OD_1980.find_subpairs(matching_OD_1980,
return_pairs_between=True)
model_OD_1980.plot_hierarchies(subpairs, scatter=False,
range_x_axis=range_x_axis)

# Compute dual solution: φ_x and ψ_y


ϕ_worker_x_1980 , ψ_firm_y_1980 = model_OD_1980.compute_dual_off_diagonal(
subpairs, pairs_between)

# Check dual feasibility


dual_feasibility_i_j = ϕ_worker_x_1980[:,None] + ψ_firm_y_1980[None,:] \
- model_OD_1980.cost_x_y
print('Dual feasibility violation:', dual_feasibility_i_j.max())

# Check strong duality


dual_sol = (model_OD_1980.n_x * ϕ_worker_x_1980).sum() \
+ (model_OD_1980.m_y * ψ_firm_y_1980).sum()
primal_sol = (matching_OD_1980 * model_OD_1980.cost_x_y).sum()

print('Value of dual solution: ', dual_sol)


print('Value of primal solution: ', primal_sol)

# Compute wages: wage_x = x - φ_x


wage_worker_x_1980 = model_1980.X_types - ϕ_worker_x_1980

Dual feasibility violation: 4.322138159526534e-08


Value of dual solution: 825.8167029184153
Value of primal solution: 825.8167029184157

Let’s plot average wages and wage dispersion generated by the model.

def plot_wages_application(wages):



plt.figure(figsize=(10, 6))
plt.plot(np.sort(wages), label='Wages')
plt.xlabel("Occupations", fontsize=12)
plt.ylabel("Wages", fontsize=12)
plt.grid(True)
plt.show()

def plot_wage_dispersion_model(wage_worker_x, bins=100,


title='Wage Dispersion', figsize=(10, 6)):
# Compute the percentiles
percentiles = np.linspace(0, 100, bins + 1)
bin_edges = np.percentile(wage_worker_x, percentiles)

# Compute the standard deviation within each percentile range


stds = []
for i in range(bins):
# Compute the standard deviation for the current bin
bin_data = wage_worker_x[
(wage_worker_x >= bin_edges[i]) & (wage_worker_x < bin_edges[i + 1])]
if len(bin_data) > 1:
stds.append(np.std(bin_data))
else:
stds.append(0)

# Plot the standard deviations for each percentile as bars


plt.figure(figsize=figsize)
plt.bar(range(bins), stds, width=1.0, color='grey',
alpha=0.7, edgecolor='white')
plt.xlabel('Percentile', fontsize=12)
plt.ylabel('Standard Deviation', fontsize=12)
plt.title(title, fontsize=14)
plt.grid(axis='y', linestyle='-', alpha=0.6)

plot_wages_application(wage_worker_x_1980)


plot_wage_dispersion_model(wage_worker_x_1980, bins=100)



Part IV

Dynamic Linear Economies

CHAPTER SEVENTEEN

RECURSIVE MODELS OF DYNAMIC LINEAR ECONOMIES

“Mathematics is the art of giving the same name to different things” – Henri Poincare
“Complete market economies are all alike” – Robert E. Lucas, Jr., (1989)
“Every partial equilibrium model can be reinterpreted as a general equilibrium model.” – Anonymous

17.1 A Suite of Models

This lecture presents a class of linear-quadratic-Gaussian models of general economic equilibrium designed by Lars Peter
Hansen and Thomas J. Sargent [Hansen and Sargent, 2013].
The class of models is implemented in a Python class DLE that is part of quantecon.
Subsequent lectures use the DLE class to implement various instances that have appeared in the economics literature
1. Growth in Dynamic Linear Economies
2. Lucas Asset Pricing using DLE
3. IRFs in Hall Model
4. Permanent Income Using the DLE class
5. Rosen schooling model
6. Cattle cycles
7. Shock Non Invertibility

17.1.1 Overview of the Models

In saying that “complete markets are all alike”, Robert E. Lucas, Jr. was noting that all of them have
• a commodity space.
• a space dual to the commodity space in which prices reside.
• endowments of resources.
• peoples’ preferences over goods.
• physical technologies for transforming resources into goods.
• random processes that govern shocks to technologies and preferences and associated information flows.
• a single budget constraint per person.
• the existence of a representative consumer even when there are many people in the model.


• a concept of competitive equilibrium.


• theorems connecting competitive equilibrium allocations to allocations that would be chosen by a benevolent social
planner.
The models have no frictions such as …
• Enforcement difficulties
• Information asymmetries
• Other forms of transactions costs
• Externalities
The models extensively use the powerful ideas of
• Indexing commodities and their prices by time (John R. Hicks).
• Indexing commodities and their prices by chance (Kenneth Arrow).
Much of the imperialism of complete markets models comes from applying these two tricks.
The Hicks trick of indexing commodities by time is the idea that dynamics are a special case of statics.
The Arrow trick of indexing commodities by chance is the idea that analysis of trade under uncertainty is a special
case of the analysis of trade under certainty.
The [Hansen and Sargent, 2013] class of models specify the commodity space, preferences, technologies, stochastic
shocks and information flows in ways that allow the models to be analyzed completely using only the tools of linear time
series models and linear-quadratic optimal control described in the two lectures Linear State Space Models and Linear
Quadratic Control.
There are costs and benefits associated with the simplifications and specializations needed to make a particular model fit
within the [Hansen and Sargent, 2013] class
• the costs are that linear-quadratic structures are sometimes too confining.
• benefits include computational speed, simplicity, and ability to analyze many model features analytically or nearly
analytically.
A variety of superficially different models are all instances of the [Hansen and Sargent, 2013] class of models
• Lucas asset pricing model
• Lucas-Prescott model of investment under uncertainty
• Asset pricing models with habit persistence
• Rosen-Topel equilibrium model of housing
• Rosen schooling models
• Rosen-Murphy-Scheinkman model of cattle cycles
• Hansen-Sargent-Tallarini model of robustness and asset pricing
• Many more …
The diversity of these models conceals an essential unity that illustrates the quotation by Robert E. Lucas, Jr., with which
we began this lecture.


17.1.2 Forecasting?

A consequence of a single budget constraint per person plus the Hicks-Arrow tricks is that households and firms need not
forecast.
But there exist equivalent structures called recursive competitive equilibria in which they do appear to need to forecast.
In these structures, to forecast, households and firms use:
• equilibrium pricing functions, and
• knowledge of the Markov structure of the economy’s state vector.

17.1.3 Theory and Econometrics

For an application of the [Hansen and Sargent, 2013] class of models, the outcome of theorizing is a stochastic process,
i.e., a probability distribution over sequences of prices and quantities, indexed by parameters describing preferences,
technologies, and information flows.
Another name for that object is a likelihood function, a key object of both frequentist and Bayesian statistics.
There are two important uses of an equilibrium stochastic process or likelihood function.
The first is to solve the direct problem.
The direct problem takes as inputs values of the parameters that define preferences, technologies, and information flows
and as an output characterizes or simulates random paths of quantities and prices.
The second use of an equilibrium stochastic process or likelihood function is to solve the inverse problem.
The inverse problem takes as an input a time series sample of observations on a subset of prices and quantities determined
by the model and from them makes inferences about the parameters that define the model’s preferences, technologies,
and information flows.

17.1.4 More Details

A [Hansen and Sargent, 2013] economy consists of lists of matrices that describe peoples’ household technologies, their
preferences over consumption services, their production technologies, and their information sets.
There are complete markets in history-contingent commodities.
Competitive equilibrium allocations and prices
• satisfy equations that are easy to write down and solve
• have representations that are convenient econometrically
Different example economies manifest themselves simply as different settings for various matrices.
[Hansen and Sargent, 2013] use these tools:
• A theory of recursive dynamic competitive economies
• Linear optimal control theory
• Recursive methods for estimating and interpreting vector autoregressions
The models are flexible enough to express alternative senses of a representative household
• A single ‘stand-in’ household of the type used to good effect by Edward C. Prescott.
• Heterogeneous households satisfying conditions for Gorman aggregation into a representative household.


• Heterogeneous household technologies that violate conditions for Gorman aggregation but are still susceptible to aggregation into a single representative household via ‘non-Gorman’ or ‘mongrel’ aggregation.
These three alternative types of aggregation have different consequences in terms of how prices and allocations can be
computed.
In particular, can prices and an aggregate allocation be computed before the equilibrium allocation to individual hetero-
geneous households is computed?
• Answers are “Yes” for Gorman aggregation, “No” for non-Gorman aggregation.
In summary, the insights and practical benefits from economics to be introduced in this lecture are
• Deeper understandings that come from recognizing common underlying structures.
• Speed and ease of computation that comes from unleashing a common suite of Python programs.
We’ll use the following mathematical tools
• Stochastic Difference Equations (Linear).
• Duality: LQ Dynamic Programming and Linear Filtering are the same things mathematically.
• The Spectral Factorization Identity (for understanding vector autoregressions and non-Gorman aggregation).
So here is our roadmap.
We’ll describe sets of matrices that pin down
• Information
• Technologies
• Preferences
Then we’ll describe
• Equilibrium concept and computation
• Econometric representation and estimation

17.1.5 Stochastic Model of Information Flows and Outcomes

We’ll use stochastic linear difference equations to describe information flows and equilibrium outcomes.
The sequence {𝑤𝑡 ∶ 𝑡 = 1, 2, …} is said to be a martingale difference sequence adapted to {𝐽𝑡 ∶ 𝑡 = 0, 1, …} if
𝐸(𝑤𝑡+1 |𝐽𝑡 ) = 0 for 𝑡 = 0, 1, … .

The sequence $\{w_t : t = 1, 2, \ldots\}$ is said to be conditionally homoskedastic if $E(w_{t+1} w_{t+1}' \mid J_t) = I$ for $t = 0, 1, \ldots$.
We assume that the {𝑤𝑡 ∶ 𝑡 = 1, 2, …} process is conditionally homoskedastic.
Let {𝑥𝑡 ∶ 𝑡 = 1, 2, …} be a sequence of 𝑛-dimensional random vectors, i.e. an 𝑛-dimensional stochastic process.
The process {𝑥𝑡 ∶ 𝑡 = 1, 2, …} is constructed recursively using an initial random vector 𝑥0 ∼ 𝒩(𝑥0̂ , Σ0 ) and a time-
invariant law of motion:

𝑥𝑡+1 = 𝐴𝑥𝑡 + 𝐶𝑤𝑡+1

for 𝑡 = 0, 1, … where 𝐴 is an 𝑛 by 𝑛 matrix and 𝐶 is an 𝑛 by 𝑁 matrix.


Evidently, the distribution of 𝑥𝑡+1 conditional on 𝑥𝑡 is 𝒩(𝐴𝑥𝑡 , 𝐶𝐶 ′ ).
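As a quick illustration of this law of motion, the following sketch simulates $x_{t+1} = A x_t + C w_{t+1}$ for arbitrary illustrative matrices A and C; these particular values are placeholders, not taken from the text.

import numpy as np

def simulate_lss(A, C, x0, T, seed=0):
    # Simulate x_{t+1} = A x_t + C w_{t+1} with w_{t+1} iid N(0, I)
    rng = np.random.default_rng(seed)
    n, N = C.shape
    x = np.empty((T + 1, n))
    x[0] = x0
    for t in range(T):
        x[t + 1] = A @ x[t] + C @ rng.standard_normal(N)
    return x

# Illustrative two-state example with a single shock
A = np.array([[0.9, 0.1], [0.0, 0.8]])
C = np.array([[1.0], [0.5]])
x_path = simulate_lss(A, C, x0=np.zeros(2), T=100)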


17.1.6 Information Sets

Let 𝐽0 be generated by 𝑥0 and 𝐽𝑡 be generated by 𝑥0 , 𝑤1 , … , 𝑤𝑡 , which means that 𝐽𝑡 consists of the set of all measurable
functions of {𝑥0 , 𝑤1 , … , 𝑤𝑡 }.

17.1.7 Prediction Theory

The optimal forecast of 𝑥𝑡+1 given current information is

𝐸(𝑥𝑡+1 ∣ 𝐽𝑡 ) = 𝐴𝑥𝑡

and the one-step-ahead forecast error is

𝑥𝑡+1 − 𝐸(𝑥𝑡+1 ∣ 𝐽𝑡 ) = 𝐶𝑤𝑡+1

The covariance matrix of 𝑥𝑡+1 conditioned on 𝐽𝑡 is

𝐸(𝑥𝑡+1 − 𝐸(𝑥𝑡+1 ∣ 𝐽𝑡 ))(𝑥𝑡+1 − 𝐸(𝑥𝑡+1 ∣ 𝐽𝑡 ))′ = 𝐶𝐶 ′

A nonrecursive expression for 𝑥𝑡 as a function of 𝑥0 , 𝑤1 , 𝑤2 , … , 𝑤𝑡 is

$$ x_t = A x_{t-1} + C w_t = A^2 x_{t-2} + A C w_{t-1} + C w_t = \Big[\sum_{\tau=0}^{t-1} A^\tau C w_{t-\tau}\Big] + A^t x_0 $$

Shift forward in time:


$$ x_{t+j} = \sum_{s=0}^{j-1} A^s C w_{t+j-s} + A^j x_t $$

Projecting on the information set {𝑥0 , 𝑤𝑡 , 𝑤𝑡−1 , … , 𝑤1 } gives

𝐸𝑡 𝑥𝑡+𝑗 = 𝐴𝑗 𝑥𝑡

where 𝐸𝑡 (⋅) ≡ 𝐸[(⋅) ∣ 𝑥0 , 𝑤𝑡 , 𝑤𝑡−1 , … , 𝑤1 ] = 𝐸(⋅) ∣ 𝐽𝑡 , and 𝑥𝑡 is in 𝐽𝑡 .


It is useful to obtain the covariance matrix of the 𝑗-step-ahead prediction error

$$ x_{t+j} - E_t x_{t+j} = \sum_{s=0}^{j-1} A^s C w_{t+j-s} . $$

Evidently,

$$ E_t (x_{t+j} - E_t x_{t+j})(x_{t+j} - E_t x_{t+j})' = \sum_{k=0}^{j-1} A^k C C' (A')^k \equiv v_j $$

$v_j$ can be calculated recursively via

$$ v_1 = C C', \qquad v_j = C C' + A v_{j-1} A', \quad j \geq 2 $$
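A minimal sketch of this recursion for the prediction-error covariance matrices, reusing illustrative A and C arrays (any conformable arrays would do):

import numpy as np

def prediction_error_covariances(A, C, J):
    # v_1 = CC', v_j = CC' + A v_{j-1} A' for j >= 2
    CC = C @ C.T
    v = [CC]
    for _ in range(2, J + 1):
        v.append(CC + A @ v[-1] @ A.T)
    return v

A = np.array([[0.9, 0.1], [0.0, 0.8]])
C = np.array([[1.0], [0.5]])
v_list = prediction_error_covariances(A, C, J=5)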


17.1.8 Orthogonal Decomposition

To decompose these covariances into parts attributable to the individual components of 𝑤𝑡, we let 𝑖𝜏 be an 𝑁-dimensional column vector of zeroes except in position 𝜏, where there is a one. Define the matrix

$$ \upsilon_{j,\tau} = \sum_{k=0}^{j-1} A^k C i_\tau i_\tau' C' (A')^k . $$

Note that $\sum_{\tau=1}^{N} i_\tau i_\tau' = I$, so that we have

$$ \sum_{\tau=1}^{N} \upsilon_{j,\tau} = v_j $$

Evidently, the matrices {𝜐𝑗,𝜏 , 𝜏 = 1, … , 𝑁 } give an orthogonal decomposition of the covariance matrix of 𝑗-step-ahead
prediction errors into the parts attributable to each of the components 𝜏 = 1, … , 𝑁 .
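A minimal sketch that computes the matrices $\upsilon_{j,\tau}$ and verifies that they add up to $v_j$; the A and C arrays below are illustrative placeholders.

import numpy as np

def upsilon(A, C, j):
    # Return [υ_{j,1}, ..., υ_{j,N}] with υ_{j,τ} = Σ_{k=0}^{j-1} A^k C i_τ i_τ' C' A'^k
    n, N = C.shape
    pieces = []
    for tau in range(N):
        i_tau = np.zeros((N, 1))
        i_tau[tau, 0] = 1.0
        total, Ak = np.zeros((n, n)), np.eye(n)
        for _ in range(j):
            total += Ak @ C @ i_tau @ i_tau.T @ C.T @ Ak.T
            Ak = A @ Ak
        pieces.append(total)
    return pieces

A = np.array([[0.9, 0.1], [0.0, 0.8]])
C = np.array([[1.0, 0.2], [0.5, 0.3]])
pieces = upsilon(A, C, j=4)
v_j = sum(pieces)   # equals Σ_{k=0}^{j-1} A^k C C' A'^k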

17.1.9 Taste and Technology Shocks

𝐸(𝑤𝑡 ∣ 𝐽𝑡−1 ) = 0 and 𝐸(𝑤𝑡 𝑤𝑡′ ∣ 𝐽𝑡−1 ) = 𝐼 for 𝑡 = 1, 2, …


𝑏𝑡 = 𝑈𝑏 𝑧𝑡 and 𝑑𝑡 = 𝑈𝑑 𝑧𝑡 ,

𝑈𝑏 and 𝑈𝑑 are matrices that select entries of 𝑧𝑡 . The law of motion for {𝑧𝑡 ∶ 𝑡 = 0, 1, …} is
𝑧𝑡+1 = 𝐴22 𝑧𝑡 + 𝐶2 𝑤𝑡+1 for 𝑡 = 0, 1, …

where 𝑧0 is a given initial condition. The eigenvalues of the matrix 𝐴22 have absolute values that are less than or equal
to one.
Thus, in summary, our model of information and shocks is
𝑧𝑡+1 = 𝐴22 𝑧𝑡 + 𝐶2 𝑤𝑡+1
𝑏𝑡 = 𝑈 𝑏 𝑧 𝑡
𝑑𝑡 = 𝑈𝑑 𝑧𝑡 .
We can now briefly summarize other components of our economies, in particular
• Production technologies
• Household technologies
• Household preferences

17.1.10 Production Technology

Where 𝑐𝑡 is a vector of consumption rates, 𝑘𝑡 is a vector of physical capital goods, 𝑔𝑡 is a vector of intermediate production goods, and 𝑑𝑡 is a vector of technology shocks, the production technology is
Φ𝑐 𝑐𝑡 + Φ𝑔 𝑔𝑡 + Φ𝑖 𝑖𝑡 = Γ𝑘𝑡−1 + 𝑑𝑡
𝑘𝑡 = Δ𝑘 𝑘𝑡−1 + Θ𝑘 𝑖𝑡
𝑔𝑡 ⋅ 𝑔𝑡 = ℓ𝑡2
Here Φ𝑐 , Φ𝑔 , Φ𝑖 , Γ, Δ𝑘 , Θ𝑘 are all matrices conformable to the vectors they multiply and ℓ𝑡 is a disutility generating
resource supplied by the household.
For technical reasons that facilitate computations, we make the following.
Assumption: [Φ𝑐 Φ𝑔 ] is nonsingular.


17.1.11 Household Technology

Households confront a technology that allows them to devote consumption goods to construct a vector ℎ𝑡 of household
capital goods and a vector 𝑠𝑡 of utility generating house services
𝑠𝑡 = Λℎ𝑡−1 + Π𝑐𝑡
ℎ𝑡 = Δℎ ℎ𝑡−1 + Θℎ 𝑐𝑡
where Λ, Π, Δℎ , Θℎ are matrices that pin down the household technology.
We make the following
Assumption: The absolute values of the eigenvalues of Δℎ are less than or equal to one.
Below, we’ll outline further assumptions that we shall occasionally impose.

17.1.12 Preferences

Where 𝑏𝑡 is a stochastic process of preference shocks that will play the role of demand shifters, the representative house-
hold orders stochastic processes of consumption services 𝑠𝑡 according to

$$ -\left(\frac{1}{2}\right) E \sum_{t=0}^{\infty} \beta^t \left[ (s_t - b_t) \cdot (s_t - b_t) + \ell_t^2 \right] \Big| J_0, \qquad 0 < \beta < 1 $$

We now proceed to give examples of production and household technologies that appear in various models that appear
in the literature.
First, we give examples of production Technologies

Φ𝑐 𝑐𝑡 + Φ𝑔 𝑔𝑡 + Φ𝑖 𝑖𝑡 = Γ𝑘𝑡−1 + 𝑑𝑡

∣ 𝑔𝑡 ∣≤ ℓ𝑡
so we’ll be looking for specifications of the matrices Φ𝑐 , Φ𝑔 , Φ𝑖 , Γ, Δ𝑘 , Θ𝑘 that define them.

17.1.13 Endowment Economy

There is a single consumption good that cannot be stored over time.


In time period 𝑡, there is an endowment 𝑑𝑡 of this single good.
There is neither a capital stock, nor an intermediate good, nor a rate of investment.
So 𝑐𝑡 = 𝑑𝑡 .
To implement this specification, we can choose 𝐴22 , 𝐶2 , and 𝑈𝑑 to make 𝑑𝑡 follow any of a variety of stochastic processes.
To satisfy our earlier rank assumption, we set:

𝑐𝑡 + 𝑖𝑡 = 𝑑1𝑡

𝑔𝑡 = 𝜙1 𝑖𝑡
where 𝜙1 is a small positive number.
To implement this version, we set Δ𝑘 = Θ𝑘 = 0 and
$$ \Phi_c = \begin{bmatrix} 1 \\ 0 \end{bmatrix}, \quad \Phi_i = \begin{bmatrix} 1 \\ \phi_1 \end{bmatrix}, \quad \Phi_g = \begin{bmatrix} 0 \\ -1 \end{bmatrix}, \quad \Gamma = \begin{bmatrix} 0 \\ 0 \end{bmatrix}, \quad d_t = \begin{bmatrix} d_{1t} \\ 0 \end{bmatrix} $$
We can use this specification to create a linear-quadratic version of Lucas’s (1978) asset pricing model.


17.1.14 Single-Period Adjustment Costs

There is a single consumption good, a single intermediate good, and a single investment good.
The technology is described by

𝑐𝑡 = 𝛾𝑘𝑡−1 + 𝑑1𝑡 , 𝛾 > 0


𝜙1 𝑖𝑡 = 𝑔𝑡 + 𝑑2𝑡 , 𝜙1 > 0
ℓ𝑡2 = 𝑔𝑡2
𝑘𝑡 = 𝛿𝑘 𝑘𝑡−1 + 𝑖𝑡 , 0 < 𝛿𝑘 < 1

Set
$$ \Phi_c = \begin{bmatrix} 1 \\ 0 \end{bmatrix}, \quad \Phi_g = \begin{bmatrix} 0 \\ -1 \end{bmatrix}, \quad \Phi_i = \begin{bmatrix} 0 \\ \phi_1 \end{bmatrix}, \quad \Gamma = \begin{bmatrix} \gamma \\ 0 \end{bmatrix}, \quad \Delta_k = \delta_k, \quad \Theta_k = 1 $$
We set 𝐴22 , 𝐶2 and 𝑈𝑑 to make (𝑑1𝑡 , 𝑑2𝑡 )′ = 𝑑𝑡 follow a desired stochastic process.
Now we describe some examples of preferences, which as we have seen are ordered by

$$ -\left(\frac{1}{2}\right) E \sum_{t=0}^{\infty} \beta^t \left[ (s_t - b_t) \cdot (s_t - b_t) + \ell_t^2 \right] \Big| J_0, \qquad 0 < \beta < 1 $$

where household services are produced via the household technology

ℎ𝑡 = Δℎ ℎ𝑡−1 + Θℎ 𝑐𝑡

𝑠𝑡 = Λℎ𝑡−1 + Π𝑐𝑡
and we make
Assumption: The absolute values of the eigenvalues of Δℎ are less than or equal to one.
Later we shall introduce canonical household technologies that satisfy an ‘invertibility’ requirement relating sequences
{𝑠𝑡 } of services and {𝑐𝑡 } of consumption flows.
And we’ll describe how to obtain a canonical representation of a household technology from one that is not canonical.
Here are some examples of household preferences.
Time Separable preferences

$$ -\frac{1}{2} E \sum_{t=0}^{\infty} \beta^t \left[ (c_t - b_t)^2 + \ell_t^2 \right] \Big| J_0, \qquad 0 < \beta < 1 $$

Consumer Durables

ℎ𝑡 = 𝛿ℎ ℎ𝑡−1 + 𝑐𝑡 , 0 < 𝛿ℎ < 1

Services at 𝑡 are related to the stock of durables at the beginning of the period:

𝑠𝑡 = 𝜆ℎ𝑡−1 , 𝜆 > 0

Preferences are ordered by

$$ -\frac{1}{2} E \sum_{t=0}^{\infty} \beta^t \left[ (\lambda h_{t-1} - b_t)^2 + \ell_t^2 \right] \Big| J_0 $$


Set Δℎ = 𝛿ℎ , Θℎ = 1, Λ = 𝜆, Π = 0.
Habit Persistence
$$ -\left(\frac{1}{2}\right) E \sum_{t=0}^{\infty} \beta^t \Big[ \big(c_t - \lambda(1 - \delta_h) \sum_{j=0}^{\infty} \delta_h^j c_{t-j-1} - b_t\big)^2 + \ell_t^2 \Big] \Big| J_0 $$

$$ 0 < \beta < 1, \quad 0 < \delta_h < 1, \quad \lambda > 0 $$

Here the effective bliss point $b_t + \lambda(1 - \delta_h) \sum_{j=0}^{\infty} \delta_h^j c_{t-j-1}$ shifts in response to a moving average of past consumption.
Initial Conditions

Preferences of this form require an initial condition for the geometric sum ∑𝑗=0 𝛿ℎ𝑗 𝑐𝑡−𝑗−1 that we specify as an initial
condition for the ‘stock of household durables,’ ℎ−1 .
Set

ℎ𝑡 = 𝛿ℎ ℎ𝑡−1 + (1 − 𝛿ℎ )𝑐𝑡 , 0 < 𝛿ℎ < 1

$$ h_t = (1 - \delta_h) \sum_{j=0}^{t} \delta_h^j c_{t-j} + \delta_h^{t+1} h_{-1} $$

𝑠𝑡 = −𝜆ℎ𝑡−1 + 𝑐𝑡 , 𝜆 > 0
To implement, set Λ = −𝜆, Π = 1, Δℎ = 𝛿ℎ , Θℎ = 1 − 𝛿ℎ .
Seasonal Habit Persistence
$$ -\left(\frac{1}{2}\right) E \sum_{t=0}^{\infty} \beta^t \Big[ \big(c_t - \lambda(1 - \delta_h) \sum_{j=0}^{\infty} \delta_h^j c_{t-4j-4} - b_t\big)^2 + \ell_t^2 \Big] $$

$$ 0 < \beta < 1, \quad 0 < \delta_h < 1, \quad \lambda > 0 $$

Here the effective bliss point $b_t + \lambda(1 - \delta_h) \sum_{j=0}^{\infty} \delta_h^j c_{t-4j-4}$ shifts in response to a moving average of past consumptions of the same quarter.
To implement, set

ℎ̃ 𝑡 = 𝛿ℎ ℎ̃ 𝑡−4 + (1 − 𝛿ℎ )𝑐𝑡 , 0 < 𝛿ℎ < 1

This implies that

$$ h_t = \begin{bmatrix} \tilde h_t \\ \tilde h_{t-1} \\ \tilde h_{t-2} \\ \tilde h_{t-3} \end{bmatrix} = \begin{bmatrix} 0 & 0 & 0 & \delta_h \\ 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 1 & 0 \end{bmatrix} \begin{bmatrix} \tilde h_{t-1} \\ \tilde h_{t-2} \\ \tilde h_{t-3} \\ \tilde h_{t-4} \end{bmatrix} + \begin{bmatrix} (1 - \delta_h) \\ 0 \\ 0 \\ 0 \end{bmatrix} c_t $$

with consumption services

𝑠𝑡 = − [0 0 0 −𝜆] ℎ𝑡−1 + 𝑐𝑡 , 𝜆>0

Adjustment Costs.
Recall

$$ -\left(\frac{1}{2}\right) E \sum_{t=0}^{\infty} \beta^t \left[ (c_t - b_{1t})^2 + \lambda^2 (c_t - c_{t-1})^2 + \ell_t^2 \right] \Big| J_0 $$


0<𝛽<1 , 𝜆>0
To capture adjustment costs, set

ℎ𝑡 = 𝑐𝑡

$$ s_t = \begin{bmatrix} 0 \\ -\lambda \end{bmatrix} h_{t-1} + \begin{bmatrix} 1 \\ \lambda \end{bmatrix} c_t $$
so that

𝑠1𝑡 = 𝑐𝑡

𝑠2𝑡 = 𝜆(𝑐𝑡 − 𝑐𝑡−1 )


We set the first component 𝑏1𝑡 of 𝑏𝑡 to capture the stochastic bliss process and set the second component identically equal
to zero.
Thus, we set Δℎ = 0, Θℎ = 1

$$ \Lambda = \begin{bmatrix} 0 \\ -\lambda \end{bmatrix}, \qquad \Pi = \begin{bmatrix} 1 \\ \lambda \end{bmatrix} $$
Multiple Consumption Goods
$$ \Lambda = \begin{bmatrix} 0 \\ 0 \end{bmatrix} \quad \text{and} \quad \Pi = \begin{bmatrix} \pi_1 & 0 \\ \pi_2 & \pi_3 \end{bmatrix} $$

$$ -\frac{1}{2} \beta^t (\Pi c_t - b_t)' (\Pi c_t - b_t) $$

$$ \mu_t = -\beta^t \left[ \Pi' \Pi \, c_t - \Pi' b_t \right] $$

$$ c_t = -(\Pi' \Pi)^{-1} \beta^{-t} \mu_t + (\Pi' \Pi)^{-1} \Pi' b_t $$
This is called the Frisch demand function for consumption.
We can think of the vector 𝜇𝑡 as playing the role of prices, up to a common factor, for all dates and states.
The scale factor is determined by the choice of numeraire.
Notions of substitutes and complements can be defined in terms of these Frisch demand functions.
Two goods can be said to be substitutes if the cross-price effect is positive and to be complements if this effect is
negative.
Hence this classification is determined by the off-diagonal element of −(Π′ Π)−1 , which is equal to 𝜋2 𝜋3 / det(Π′ Π).
If 𝜋2 and 𝜋3 have the same sign, the goods are substitutes.
If they have opposite signs, the goods are complements.
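A minimal sketch of this classification rule for the two-good household technology above; the numerical values of π1, π2, π3 are illustrative only.

import numpy as np

# Illustrative parameters for Π (Λ is the zero matrix here)
π1, π2, π3 = 1.0, 0.5, 0.8
Π = np.array([[π1, 0.0],
              [π2, π3]])

# Off-diagonal element of -(Π'Π)^{-1} gives the cross-price effect
cross_effect = -np.linalg.inv(Π.T @ Π)[0, 1]
label = 'substitutes' if cross_effect > 0 else 'complements'
print(f'Cross-price effect {cross_effect:.3f}: goods are {label}')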
To summarize, our economic structure consists of the matrices that define the following components:
Information and shocks
𝑧𝑡+1 = 𝐴22 𝑧𝑡 + 𝐶2 𝑤𝑡+1
𝑏𝑡 = 𝑈 𝑏 𝑧 𝑡
𝑑𝑡 = 𝑈𝑑 𝑧𝑡
Production Technology
Φ𝑐 𝑐𝑡 + Φ𝑔 𝑔𝑡 + Φ𝑖 𝑖𝑡 = Γ𝑘𝑡−1 + 𝑑𝑡
𝑘𝑡 = Δ𝑘 𝑘𝑡−1 + Θ𝑘 𝑖𝑡
𝑔𝑡 ⋅ 𝑔𝑡 = ℓ𝑡2


Household Technology

𝑠𝑡 = Λℎ𝑡−1 + Π𝑐𝑡
ℎ𝑡 = Δℎ ℎ𝑡−1 + Θℎ 𝑐𝑡

Preferences

$$ -\left(\frac{1}{2}\right) E \sum_{t=0}^{\infty} \beta^t \left[ (s_t - b_t) \cdot (s_t - b_t) + \ell_t^2 \right] \Big| J_0, \qquad 0 < \beta < 1 $$

Next steps: we move on to discuss two closely connected concepts


• A Planning Problem or Optimal Resource Allocation Problem
• Competitive Equilibrium

17.1.15 Optimal Resource Allocation

Imagine a planner who chooses sequences $\{c_t, i_t, g_t\}_{t=0}^{\infty}$ to maximize

$$ -\frac{1}{2} E \sum_{t=0}^{\infty} \beta^t \left[ (s_t - b_t) \cdot (s_t - b_t) + g_t \cdot g_t \right] \Big| J_0 $$

subject to the constraints

Φ𝑐 𝑐𝑡 + Φ𝑔 𝑔𝑡 + Φ𝑖 𝑖𝑡 = Γ𝑘𝑡−1 + 𝑑𝑡 ,
𝑘𝑡 = Δ𝑘 𝑘𝑡−1 + Θ𝑘 𝑖𝑡 ,
ℎ𝑡 = Δℎ ℎ𝑡−1 + Θℎ 𝑐𝑡 ,
𝑠𝑡 = Λℎ𝑡−1 + Π𝑐𝑡 ,
𝑧𝑡+1 = 𝐴22 𝑧𝑡 + 𝐶2 𝑤𝑡+1 , 𝑏𝑡 = 𝑈𝑏 𝑧𝑡 , and 𝑑𝑡 = 𝑈𝑑 𝑧𝑡

and initial conditions for ℎ−1 , 𝑘−1 , and 𝑧0 .


Throughout, we shall impose the following square summability conditions
$$ E \sum_{t=0}^{\infty} \beta^t h_t \cdot h_t \mid J_0 < \infty \quad \text{and} \quad E \sum_{t=0}^{\infty} \beta^t k_t \cdot k_t \mid J_0 < \infty $$

Define:

$$ L_0^2 = \Big[ \{y_t\} : y_t \text{ is a random variable in } J_t \text{ and } E \sum_{t=0}^{\infty} \beta^t y_t^2 \mid J_0 < +\infty \Big] $$

Thus, we require that each component of ℎ𝑡 and each component of 𝑘𝑡 belong to 𝐿20 .
We shall compare and utilize two approaches to solving the planning problem
• Lagrangian formulation
• Dynamic programming


17.1.16 Lagrangian Formulation

Form the Lagrangian



$$
\begin{aligned}
\mathcal{L} = - E \sum_{t=0}^{\infty} \beta^t \Big[ & \left(\tfrac{1}{2}\right) \big[(s_t - b_t) \cdot (s_t - b_t) + g_t \cdot g_t\big] \\
& + M_t^{d\prime} \cdot (\Phi_c c_t + \Phi_g g_t + \Phi_i i_t - \Gamma k_{t-1} - d_t) \\
& + M_t^{k\prime} \cdot (k_t - \Delta_k k_{t-1} - \Theta_k i_t) \\
& + M_t^{h\prime} \cdot (h_t - \Delta_h h_{t-1} - \Theta_h c_t) \\
& + M_t^{s\prime} \cdot (s_t - \Lambda h_{t-1} - \Pi c_t) \Big] \Big| J_0
\end{aligned}
$$


The planner maximizes ℒ with respect to the quantities $\{c_t, i_t, g_t\}_{t=0}^{\infty}$ and minimizes with respect to the Lagrange multipliers $M_t^d, M_t^k, M_t^h, M_t^s$.
First-order necessary conditions for maximization with respect to 𝑐𝑡 , 𝑔𝑡 , ℎ𝑡 , 𝑖𝑡 , 𝑘𝑡 , and 𝑠𝑡 , respectively, are:
$$
\begin{aligned}
& -\Phi_c' M_t^d + \Theta_h' M_t^h + \Pi' M_t^s = 0, \\
& -g_t - \Phi_g' M_t^d = 0, \\
& -M_t^h + \beta E(\Delta_h' M_{t+1}^h + \Lambda' M_{t+1}^s) \mid J_t = 0, \\
& -\Phi_i' M_t^d + \Theta_k' M_t^k = 0, \\
& -M_t^k + \beta E(\Delta_k' M_{t+1}^k + \Gamma' M_{t+1}^d) \mid J_t = 0, \\
& -s_t + b_t - M_t^s = 0
\end{aligned}
$$
for 𝑡 = 0, 1, ….
In addition, we have the complementary slackness conditions (these recover the original transition equations) and also
transversality conditions
$$ \lim_{t \to \infty} \beta^t E[M_t^{k\prime} k_t] \mid J_0 = 0, \qquad \lim_{t \to \infty} \beta^t E[M_t^{h\prime} h_t] \mid J_0 = 0 $$

The system formed by the FONCs and the transition equations can be handed over to Python.
Python will solve the planning problem for fixed parameter values.
Here are the Python Ready Equations
$$
\begin{aligned}
& -\Phi_c' M_t^d + \Theta_h' M_t^h + \Pi' M_t^s = 0, \\
& -g_t - \Phi_g' M_t^d = 0, \\
& -M_t^h + \beta E(\Delta_h' M_{t+1}^h + \Lambda' M_{t+1}^s) \mid J_t = 0, \\
& -\Phi_i' M_t^d + \Theta_k' M_t^k = 0, \\
& -M_t^k + \beta E(\Delta_k' M_{t+1}^k + \Gamma' M_{t+1}^d) \mid J_t = 0, \\
& -s_t + b_t - M_t^s = 0, \\
& \Phi_c c_t + \Phi_g g_t + \Phi_i i_t = \Gamma k_{t-1} + d_t, \\
& k_t = \Delta_k k_{t-1} + \Theta_k i_t, \\
& h_t = \Delta_h h_{t-1} + \Theta_h c_t, \\
& s_t = \Lambda h_{t-1} + \Pi c_t, \\
& z_{t+1} = A_{22} z_t + C_2 w_{t+1}, \quad b_t = U_b z_t, \quad d_t = U_d z_t
\end{aligned}
$$
The Lagrange multipliers or shadow prices satisfy
$$ M_t^s = b_t - s_t $$

$$ M_t^h = E\Big[\sum_{\tau=1}^{\infty} \beta^\tau (\Delta_h')^{\tau-1} \Lambda' M_{t+\tau}^s \mid J_t\Big] $$

$$ M_t^d = \begin{bmatrix} \Phi_c' \\ \Phi_g' \end{bmatrix}^{-1} \begin{bmatrix} \Theta_h' M_t^h + \Pi' M_t^s \\ -g_t \end{bmatrix} $$

$$ M_t^k = E\Big[\sum_{\tau=1}^{\infty} \beta^\tau (\Delta_k')^{\tau-1} \Gamma' M_{t+\tau}^d \mid J_t\Big] $$

$$ M_t^i = \Theta_k' M_t^k $$


Although it is possible to use matrix operator methods to solve the above Python ready equations, that is not the approach
we’ll use.
Instead, we’ll use dynamic programming to get recursive representations for both quantities and shadow prices.

17.1.17 Dynamic Programming

Dynamic Programming always starts with the word let.


Thus, let 𝑉 (𝑥0 ) be the optimal value function for the planning problem as a function of the initial state vector 𝑥0 .
(Thus, in essence, dynamic programming amounts to an application of a guess and verify method in which we begin
with a guess about the answer to the problem we want to solve. That’s why we start with let 𝑉 (𝑥0 ) be the (value of the)
answer to the problem, then establish and verify a bunch of conditions 𝑉 (𝑥0 ) has to satisfy if indeed it is the answer)
The optimal value function 𝑉 (𝑥) satisfies the Bellman equation

$$ V(x_0) = \max_{c_0, i_0, g_0} \big[ -0.5[(s_0 - b_0) \cdot (s_0 - b_0) + g_0 \cdot g_0] + \beta E V(x_1) \big] $$

subject to the linear constraints

Φ𝑐 𝑐0 + Φ𝑔 𝑔0 + Φ𝑖 𝑖0 = Γ𝑘−1 + 𝑑0 ,
𝑘0 = Δ𝑘 𝑘−1 + Θ𝑘 𝑖0 ,
ℎ0 = Δℎ ℎ−1 + Θℎ 𝑐0 ,
𝑠0 = Λℎ−1 + Π𝑐0 ,
𝑧1 = 𝐴22 𝑧0 + 𝐶2 𝑤1 , 𝑏0 = 𝑈𝑏 𝑧0 and 𝑑0 = 𝑈𝑑 𝑧0

Because this is a linear-quadratic dynamic programming problem, it turns out that the value function has the form

𝑉 (𝑥) = 𝑥′ 𝑃 𝑥 + 𝜌

Thus, we want to solve an instance of the following linear-quadratic dynamic programming problem:
Choose a contingency plan for $\{x_{t+1}, u_t\}_{t=0}^{\infty}$ to maximize

$$ - E \sum_{t=0}^{\infty} \beta^t \left[ x_t' R x_t + u_t' Q u_t + 2 u_t' W' x_t \right], \qquad 0 < \beta < 1 $$

subject to

𝑥𝑡+1 = 𝐴𝑥𝑡 + 𝐵𝑢𝑡 + 𝐶𝑤𝑡+1 , 𝑡 ≥ 0

where 𝑥0 is given; 𝑥𝑡 is an 𝑛 × 1 vector of state variables, and 𝑢𝑡 is a 𝑘 × 1 vector of control variables.


We assume 𝑤𝑡+1 is a martingale difference sequence with 𝐸𝑤𝑡 𝑤𝑡′ = 𝐼, and that 𝐶 is a matrix conformable to 𝑥 and 𝑤.


The optimal value function 𝑉 (𝑥) satisfies the Bellman equation

$$ V(x_t) = \max_{u_t} \big\{ -(x_t' R x_t + u_t' Q u_t + 2 u_t' W x_t) + \beta E_t V(x_{t+1}) \big\} $$

where maximization is subject to

𝑥𝑡+1 = 𝐴𝑥𝑡 + 𝐵𝑢𝑡 + 𝐶𝑤𝑡+1 , 𝑡 ≥ 0

𝑉 (𝑥𝑡 ) = −𝑥′𝑡 𝑃 𝑥𝑡 − 𝜌
𝑃 satisfies

𝑃 = 𝑅 + 𝛽𝐴′ 𝑃 𝐴 − (𝛽𝐴′ 𝑃 𝐵 + 𝑊 )(𝑄 + 𝛽𝐵′ 𝑃 𝐵)−1 (𝛽𝐵′ 𝑃 𝐴 + 𝑊 ′ )

This equation in 𝑃 is called the algebraic matrix Riccati equation.


The optimal decision rule is 𝑢𝑡 = −𝐹 𝑥𝑡 , where

𝐹 = (𝑄 + 𝛽𝐵′ 𝑃 𝐵)−1 (𝛽𝐵′ 𝑃 𝐴 + 𝑊 ′ )

The optimum decision rule for 𝑢𝑡 is independent of the parameters 𝐶, and so of the noise statistics.
Iterating on the Bellman operator leads to

$$ V_{j+1}(x_t) = \max_{u_t} \big\{ -(x_t' R x_t + u_t' Q u_t + 2 u_t' W x_t) + \beta E_t V_j(x_{t+1}) \big\} $$

$$ V_j(x_t) = -x_t' P_j x_t - \rho_j $$
where 𝑃𝑗 and 𝜌𝑗 satisfy the equations

𝑃𝑗+1 = 𝑅 + 𝛽𝐴′ 𝑃𝑗 𝐴 − (𝛽𝐴′ 𝑃𝑗 𝐵 + 𝑊 )(𝑄 + 𝛽𝐵′ 𝑃𝑗 𝐵)−1 (𝛽𝐵′ 𝑃𝑗 𝐴 + 𝑊 ′ )


𝜌𝑗+1 = 𝛽𝜌𝑗 + 𝛽 trace 𝑃𝑗 𝐶𝐶 ′
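A minimal sketch of iterating on these two recursions until (approximate) convergence; the matrices A, B, C, R, Q, W and the discount factor β are assumed to be given and conformable. In practice one can also use the LQ class from the quantecon package for this computation.

import numpy as np

def solve_lq(A, B, C, R, Q, W, β, tol=1e-10, max_iter=10_000):
    # Iterate P_{j+1} = R + βA'P_jA - (βA'P_jB + W)(Q + βB'P_jB)^{-1}(βB'P_jA + W')
    # and ρ_{j+1} = βρ_j + β trace(P_j CC'); return P, ρ and the decision rule F
    n = A.shape[0]
    P, ρ = np.zeros((n, n)), 0.0
    for _ in range(max_iter):
        F = np.linalg.solve(Q + β * B.T @ P @ B, β * B.T @ P @ A + W.T)
        P_new = R + β * A.T @ P @ A - (β * A.T @ P @ B + W) @ F
        ρ_new = β * ρ + β * np.trace(P @ C @ C.T)
        converged = np.max(np.abs(P_new - P)) < tol and abs(ρ_new - ρ) < tol
        P, ρ = P_new, ρ_new
        if converged:
            break
    F = np.linalg.solve(Q + β * B.T @ P @ B, β * B.T @ P @ A + W.T)
    return P, ρ, F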

We can now state the planning problem as a dynamic programming problem



$$ \max_{\{u_t, x_{t+1}\}} \; - E \sum_{t=0}^{\infty} \beta^t \left[ x_t' R x_t + u_t' Q u_t + 2 u_t' W' x_t \right], \qquad 0 < \beta < 1 $$

where maximization is subject to

𝑥𝑡+1 = 𝐴𝑥𝑡 + 𝐵𝑢𝑡 + 𝐶𝑤𝑡+1 , 𝑡 ≥ 0

$$ x_t = \begin{bmatrix} h_{t-1} \\ k_{t-1} \\ z_t \end{bmatrix}, \qquad u_t = i_t $$

where

$$ A = \begin{bmatrix} \Delta_h & \Theta_h U_c [\Phi_c \ \Phi_g]^{-1} \Gamma & \Theta_h U_c [\Phi_c \ \Phi_g]^{-1} U_d \\ 0 & \Delta_k & 0 \\ 0 & 0 & A_{22} \end{bmatrix} $$

$$ B = \begin{bmatrix} -\Theta_h U_c [\Phi_c \ \Phi_g]^{-1} \Phi_i \\ \Theta_k \\ 0 \end{bmatrix}, \qquad C = \begin{bmatrix} 0 \\ 0 \\ C_2 \end{bmatrix} $$

$$ \begin{bmatrix} x_t \\ u_t \end{bmatrix}' S \begin{bmatrix} x_t \\ u_t \end{bmatrix} = \begin{bmatrix} x_t \\ u_t \end{bmatrix}' \begin{bmatrix} R & W \\ W' & Q \end{bmatrix} \begin{bmatrix} x_t \\ u_t \end{bmatrix} $$


𝑆 = (𝐺′ 𝐺 + 𝐻 ′ 𝐻)/2
𝐻 = [Λ ⋮ Π𝑈𝑐 [Φ𝑐 Φ𝑔 ]−1 Γ ⋮ Π𝑈𝑐 [Φ𝑐 Φ𝑔 ]−1 𝑈𝑑 − 𝑈𝑏 ⋮ −Π𝑈𝑐 [Φ𝑐 Φ𝑔 ]−1 Φ𝑖 ]
𝐺 = 𝑈𝑔 [Φ𝑐 Φ𝑔 ]−1 [0 ⋮ Γ ⋮ 𝑈𝑑 ⋮ −Φ𝑖 ].
Lagrange multipliers as gradient of value function
A useful fact is that Lagrange multipliers equal gradients of the planner’s value function
$$ \mathcal{M}_t^k = M_k x_t \ \text{ and } \ \mathcal{M}_t^h = M_h x_t, \ \text{ where } \ M_k = 2 \beta [0 \ I \ 0] P A^o, \quad M_h = 2 \beta [I \ 0 \ 0] P A^o $$

$$ \mathcal{M}_t^s = M_s x_t \ \text{ where } \ M_s = (S_b - S_s) \ \text{ and } \ S_b = [0 \ 0 \ U_b] $$

$$ \mathcal{M}_t^d = M_d x_t \ \text{ where } \ M_d = \begin{bmatrix} \Phi_c' \\ \Phi_g' \end{bmatrix}^{-1} \begin{bmatrix} \Theta_h' M_h + \Pi' M_s \\ -S_g \end{bmatrix} $$

$$ \mathcal{M}_t^c = M_c x_t \ \text{ where } \ M_c = \Theta_h' M_h + \Pi' M_s, \qquad \mathcal{M}_t^i = M_i x_t \ \text{ where } \ M_i = \Theta_k' M_k $$
We will use this fact and these equations to compute competitive equilibrium prices.

17.1.18 Other mathematical infrastructure

Let’s start with describing the commodity space and pricing functional for our competitive equilibrium.
For the commodity space, we use

$$ L_0^2 = \Big[ \{y_t\} : y_t \text{ is a random variable in } J_t \text{ and } E \sum_{t=0}^{\infty} \beta^t y_t^2 \mid J_0 < +\infty \Big] $$

For pricing functionals, we express values as inner products

$$ \pi(c) = E \sum_{t=0}^{\infty} \beta^t p_t^0 \cdot c_t \mid J_0 $$

where 𝑝𝑡0 belongs to 𝐿20 .


With these objects in our toolkit, we move on to state the problem of a Representative Household in a competitive
equilibrium.

17.1.19 Representative Household

The representative household owns the endowment process and initial stocks of ℎ and 𝑘 and chooses stochastic processes for $\{c_t, s_t, h_t, \ell_t\}_{t=0}^{\infty}$, each element of which is in $L_0^2$, to maximize

$$ -\frac{1}{2} E_0 \sum_{t=0}^{\infty} \beta^t \left[ (s_t - b_t) \cdot (s_t - b_t) + \ell_t^2 \right] $$

subject to

$$ E \sum_{t=0}^{\infty} \beta^t p_t^0 \cdot c_t \mid J_0 = E \sum_{t=0}^{\infty} \beta^t (w_t^0 \ell_t + \alpha_t^0 \cdot d_t) \mid J_0 + v_0 \cdot k_{-1} $$

𝑠𝑡 = Λℎ𝑡−1 + Π𝑐𝑡
ℎ𝑡 = Δℎ ℎ𝑡−1 + Θℎ 𝑐𝑡 , ℎ−1 , 𝑘−1 given
We now describe the problems faced by two types of firms called type I and type II.


17.1.20 Type I Firm

A type I firm rents capital and labor and endowments and produces 𝑐𝑡 , 𝑖𝑡 .
It chooses stochastic processes for {𝑐𝑡 , 𝑖𝑡 , 𝑘𝑡 , ℓ𝑡 , 𝑔𝑡 , 𝑑𝑡 }, each element of which is in 𝐿20 , to maximize

$$ E_0 \sum_{t=0}^{\infty} \beta^t \left( p_t^0 \cdot c_t + q_t^0 \cdot i_t - r_t^0 \cdot k_{t-1} - w_t^0 \ell_t - \alpha_t^0 \cdot d_t \right) $$

subject to

Φ𝑐 𝑐𝑡 + Φ𝑔 𝑔𝑡 + Φ𝑖 𝑖𝑡 = Γ𝑘𝑡−1 + 𝑑𝑡

− ℓ𝑡2 + 𝑔𝑡 ⋅ 𝑔𝑡 = 0

17.1.21 Type II Firm

A firm of type II acquires capital via investment and then rents stocks of capital to the 𝑐, 𝑖-producing type I firm.
A type II firm is a price taker facing the vector 𝑣0 and the stochastic processes {𝑟𝑡0 , 𝑞𝑡0 }.
The firm chooses $k_{-1}$ and stochastic processes for $\{k_t, i_t\}_{t=0}^{\infty}$ to maximize

$$ E \sum_{t=0}^{\infty} \beta^t \left( r_t^0 \cdot k_{t-1} - q_t^0 \cdot i_t \right) \mid J_0 \; - \; v_0 \cdot k_{-1} $$

subject to

𝑘𝑡 = Δ𝑘 𝑘𝑡−1 + Θ𝑘 𝑖𝑡

17.1.22 Competitive Equilibrium: Definition

We can now state the following.


Definition: A competitive equilibrium is a price system $[v_0, \{p_t^0, w_t^0, \alpha_t^0, q_t^0, r_t^0\}_{t=0}^{\infty}]$ and an allocation $\{c_t, i_t, k_t, h_t, g_t, d_t\}_{t=0}^{\infty}$ that satisfy the following conditions:
• Each component of the price system and the allocation resides in the space 𝐿20 .
• Given the price system and given ℎ−1 , 𝑘−1 , the allocation solves the representative household’s problem and the
problems of the two types of firms.
Versions of the two classical welfare theorems prevail under our assumptions.
We exploit that fact in our algorithm for computing a competitive equilibrium.
Step 1: Solve the planning problem by using dynamic programming.
The allocation (i.e., quantities) that solve the planning problem are the competitive equilibrium quantities.
Step 2: use the following formulas to compute the equilibrium price system

$$
\begin{aligned}
p_t^0 &= \left[ \Pi' M_t^s + \Theta_h' M_t^h \right] / \mu_0^w = M_t^c / \mu_0^w \\
w_t^0 &= | S_g x_t | / \mu_0^w \\
r_t^0 &= \Gamma' M_t^d / \mu_0^w \\
q_t^0 &= \Theta_k' M_t^k / \mu_0^w = M_t^i / \mu_0^w \\
\alpha_t^0 &= M_t^d / \mu_0^w \\
v_0 &= \Gamma' M_0^d / \mu_0^w + \Delta_k' M_0^k / \mu_0^w
\end{aligned}
$$

Verification: With this price system, values can be assigned to the Lagrange multipliers for each of our three classes of
agents that cause all first-order necessary conditions to be satisfied at these prices and at the quantities associated with
the optimum of the planning problem.

17.1.23 Asset pricing

An important use of an equilibrium pricing system is to do asset pricing.


Thus, imagine that we are presented a dividend stream $\{y_t\} \in L_0^2$ and want to compute the value of a perpetual claim to this stream.

To value this asset we simply take price times quantity and add to get an asset value: $a_0 = E \sum_{t=0}^{\infty} \beta^t p_t^0 \cdot y_t \mid J_0$.

To compute $a_0$ we proceed as follows.

We let

$$ y_t = U_a x_t, \qquad a_0 = E \sum_{t=0}^{\infty} \beta^t x_t' Z_a x_t \mid J_0, \qquad Z_a = U_a' M_c / \mu_0^w $$

We have the following convenient formulas:

$$ a_0 = x_0' \mu_a x_0 + \sigma_a $$

$$ \mu_a = \sum_{\tau=0}^{\infty} \beta^\tau (A^{o\prime})^\tau Z_a A^{o\tau}, \qquad \sigma_a = \frac{\beta}{1-\beta} \, \mathrm{trace} \Big( Z_a \sum_{\tau=0}^{\infty} \beta^\tau (A^o)^\tau C C' (A^{o\prime})^\tau \Big) $$
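A minimal sketch that computes μ_a and σ_a by iterating on the fixed-point equations μ_a = Z_a + βA°'μ_aA° and V = CC' + βA°VA°', assuming the matrices A°, C, Z_a are given and that √β times the spectral radius of A° is below one so the sums converge.

import numpy as np

def asset_price_terms(Ao, C, Za, β, tol=1e-12, max_iter=100_000):
    # μ_a = Σ_τ β^τ (Ao')^τ Za Ao^τ  and  V = Σ_τ β^τ Ao^τ CC' (Ao')^τ
    n = Ao.shape[0]
    μa, V, CC = np.zeros((n, n)), np.zeros((n, n)), C @ C.T
    for _ in range(max_iter):
        μa_new = Za + β * Ao.T @ μa @ Ao
        V_new = CC + β * Ao @ V @ Ao.T
        gap = max(np.max(np.abs(μa_new - μa)), np.max(np.abs(V_new - V)))
        μa, V = μa_new, V_new
        if gap < tol:
            break
    σa = β / (1 - β) * np.trace(Za @ V)
    return μa, σa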

17.1.24 Re-Opening Markets

We have assumed that all trading occurs once-and-for-all at time 𝑡 = 0.


If we were to re-open markets at some time 𝑡 > 0 at the time 𝑡 wealth levels implicitly defined by time 0 trades, we would obtain the same equilibrium allocation (i.e., quantities) and the following time 𝑡 price system:

$$ L_t^2 = \Big[ \{y_s\}_{s=t}^{\infty} : y_s \text{ is a random variable in } J_s \text{ for } s \geq t \text{ and } E \sum_{s=t}^{\infty} \beta^{s-t} y_s^2 \mid J_t < +\infty \Big] $$

$$
\begin{aligned}
p_s^t &= M_c x_s / [\bar e_j M_c x_t], \quad s \geq t \\
w_s^t &= | S_g x_s | / [\bar e_j M_c x_t], \quad s \geq t \\
r_s^t &= \Gamma' M_d x_s / [\bar e_j M_c x_t], \quad s \geq t \\
q_s^t &= M_i x_s / [\bar e_j M_c x_t], \quad s \geq t \\
\alpha_s^t &= M_d x_s / [\bar e_j M_c x_t], \quad s \geq t \\
v_t &= [\Gamma' M_d + \Delta_k' M_k] x_t / [\bar e_j M_c x_t]
\end{aligned}
$$


17.2 Econometrics

Up to now, we have described how to solve the direct problem that maps model parameters into an (equilibrium)
stochastic process of prices and quantities.
Recall the inverse problem of inferring model parameters from a single realization of a time series of some of the prices
and quantities.
Another name for the inverse problem is econometrics.
An advantage of the [Hansen and Sargent, 2013] structure is that it comes with a self-contained theory of econometrics.
It is really just a tale of two state-space representations.
Here they are:
Original State-Space Representation:

𝑥𝑡+1 = 𝐴𝑜 𝑥𝑡 + 𝐶𝑤𝑡+1
𝑦𝑡 = 𝐺𝑥𝑡 + 𝑣𝑡

where 𝑣𝑡 is a martingale difference sequence of measurement errors that satisfies 𝐸𝑣𝑡 𝑣𝑡′ = 𝑅, 𝐸𝑤𝑡+1 𝑣𝑠′ = 0 for all
𝑡 + 1 ≥ 𝑠 and

𝑥0 ∼ 𝒩(𝑥0̂ , Σ0 )

Innovations Representation:

𝑥𝑡+1
̂ = 𝐴𝑜 𝑥𝑡̂ + 𝐾𝑡 𝑎𝑡
𝑦𝑡 = 𝐺𝑥𝑡̂ + 𝑎𝑡 ,

where 𝑎𝑡 = 𝑦𝑡 − 𝐸[𝑦𝑡 |𝑦𝑡−1 ], 𝐸𝑎𝑡 𝑎′𝑡 ≡ Ω𝑡 = 𝐺Σ𝑡 𝐺′ + 𝑅.


Compare numbers of shocks in the two representations:
• 𝑛𝑤 + 𝑛𝑦 versus 𝑛𝑦
Compare spaces spanned
• 𝐻(𝑦𝑡 ) ⊂ 𝐻(𝑤𝑡 , 𝑣𝑡 )
• 𝐻(𝑦𝑡 ) = 𝐻(𝑎𝑡 )
Kalman Filter:.
Kalman gain:

𝐾𝑡 = 𝐴𝑜 Σ𝑡 𝐺′ (𝐺Σ𝑡 𝐺′ + 𝑅)−1

Riccati Difference Equation:

Σ𝑡+1 = 𝐴𝑜 Σ𝑡 𝐴𝑜′ + 𝐶𝐶 ′
− 𝐴𝑜 Σ𝑡 𝐺′ (𝐺Σ𝑡 𝐺′ + 𝑅)−1 𝐺Σ𝑡 𝐴𝑜′

Innovations Representation as Whitener


Whitening Filter:

𝑎𝑡 = 𝑦𝑡 − 𝐺𝑥𝑡̂
𝑥𝑡+1
̂ = 𝐴𝑜 𝑥𝑡̂ + 𝐾𝑡 𝑎𝑡


can be used recursively to construct a record of innovations {𝑎𝑡 }𝑇𝑡=0 from an (𝑥0̂ , Σ0 ) and a record of observations
{𝑦𝑡 }𝑇𝑡=0 .
Limiting Time-Invariant Innovations Representation

$$
\begin{aligned}
\Sigma &= A^o \Sigma A^{o\prime} + C C' - A^o \Sigma G' (G \Sigma G' + R)^{-1} G \Sigma A^{o\prime} \\
K &= A^o \Sigma G' (G \Sigma G' + R)^{-1}
\end{aligned}
$$

$$
\begin{aligned}
\hat x_{t+1} &= A^o \hat x_t + K a_t \\
y_t &= G \hat x_t + a_t
\end{aligned}
$$

where $E a_t a_t' \equiv \Omega = G \Sigma G' + R$.
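A minimal sketch that iterates the Riccati difference equation to its fixed point and returns the stationary Σ, K and Ω, assuming A°, C, G and R are given; the quantecon package also provides a Kalman class that delivers the same objects.

import numpy as np

def stationary_kalman(Ao, C, G, R, tol=1e-10, max_iter=100_000):
    # Iterate Σ' = AoΣAo' + CC' - AoΣG'(GΣG' + R)^{-1}GΣAo' to convergence
    n = Ao.shape[0]
    Σ, CC = np.eye(n), C @ C.T
    for _ in range(max_iter):
        K = Ao @ Σ @ G.T @ np.linalg.inv(G @ Σ @ G.T + R)
        Σ_new = Ao @ Σ @ Ao.T + CC - K @ G @ Σ @ Ao.T
        gap = np.max(np.abs(Σ_new - Σ))
        Σ = Σ_new
        if gap < tol:
            break
    Ω = G @ Σ @ G.T + R
    K = Ao @ Σ @ G.T @ np.linalg.inv(Ω)
    return Σ, K, Ω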

17.2.1 Factorization of Likelihood Function

Sample of observations {𝑦𝑠 }𝑇𝑠=0 on a (𝑛𝑦 × 1) vector.

𝑓(𝑦𝑇 , 𝑦𝑇 −1 , … , 𝑦0 ) = 𝑓𝑇 (𝑦𝑇 |𝑦𝑇 −1 , … , 𝑦0 )𝑓𝑇 −1 (𝑦𝑇 −1 |𝑦𝑇 −2 , … , 𝑦0 ) ⋯ 𝑓1 (𝑦1 |𝑦0 )𝑓0 (𝑦0 )
= 𝑔𝑇 (𝑎𝑇 )𝑔𝑇 −1 (𝑎𝑇 −1 ) … 𝑔1 (𝑎1 )𝑓0 (𝑦0 ).

Gaussian Log-Likelihood:
$$ -0.5 \sum_{t=0}^{T} \big\{ n_y \ln(2\pi) + \ln |\Omega_t| + a_t' \Omega_t^{-1} a_t \big\} $$
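A minimal sketch of this log-likelihood, taking as inputs lists of innovations a_t and covariance matrices Ω_t produced by the Kalman recursions above.

import numpy as np

def gaussian_log_likelihood(a_list, Ω_list):
    # -0.5 Σ_t { n_y ln(2π) + ln|Ω_t| + a_t' Ω_t^{-1} a_t }
    ny = len(a_list[0])
    ll = 0.0
    for a_t, Ω_t in zip(a_list, Ω_list):
        ll -= 0.5 * (ny * np.log(2 * np.pi)
                     + np.linalg.slogdet(Ω_t)[1]
                     + a_t @ np.linalg.solve(Ω_t, a_t))
    return ll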

17.2.2 Covariance Generating Functions

Autocovariance: 𝐶𝑥 (𝜏 ) = 𝐸𝑥𝑡 𝑥′𝑡−𝜏 .



Generating Function: $S_x(z) = \sum_{\tau=-\infty}^{\infty} C_x(\tau) z^\tau, \ z \in \mathbb{C}$.

17.2.3 Spectral Factorization Identity

Original state-space representation has too many shocks and implies:

𝑆𝑦 (𝑧) = 𝐺(𝑧𝐼 − 𝐴𝑜 )−1 𝐶𝐶 ′ (𝑧 −1 𝐼 − (𝐴𝑜 )′ )−1 𝐺′ + 𝑅

Innovations representation has as many shocks as dimension of 𝑦𝑡 and implies

𝑆𝑦 (𝑧) = [𝐺(𝑧𝐼 − 𝐴𝑜 )−1 𝐾 + 𝐼][𝐺Σ𝐺′ + 𝑅][𝐾 ′ (𝑧 −1 𝐼 − 𝐴𝑜′ )−1 𝐺′ + 𝐼]

Equating these two leads to:

𝐺(𝑧𝐼 − 𝐴𝑜 )−1 𝐶𝐶 ′ (𝑧 −1 𝐼 − 𝐴𝑜′ )−1 𝐺′ + 𝑅 =


[𝐺(𝑧𝐼 − 𝐴𝑜 )−1 𝐾 + 𝐼][𝐺Σ𝐺′ + 𝑅][𝐾 ′ (𝑧 −1 𝐼 − 𝐴𝑜′ )−1 𝐺′ + 𝐼].

Key Insight: The zeros of the polynomial det[𝐺(𝑧𝐼 − 𝐴𝑜 )−1 𝐾 + 𝐼] all lie inside the unit circle, which means that 𝑎𝑡
lies in the space spanned by square summable linear combinations of 𝑦𝑡 .

𝐻(𝑎𝑡 ) = 𝐻(𝑦𝑡 )

Key Property: Invertibility


17.2.4 Wold and Vector Autoregressive Representations

Let’s start with some lag operator arithmetic.


The lag operator 𝐿 and the inverse lag operator 𝐿−1 each map an infinite sequence into an infinite sequence according to
the transformation rules

𝐿𝑥𝑡 ≡ 𝑥𝑡−1

𝐿−1 𝑥𝑡 ≡ 𝑥𝑡+1
A Wold moving average representation for {𝑦𝑡 } is

𝑦𝑡 = [𝐺(𝐼 − 𝐴𝑜 𝐿)−1 𝐾𝐿 + 𝐼]𝑎𝑡

Applying the inverse of the operator on the right side and using

[𝐺(𝐼 − 𝐴𝑜 𝐿)−1 𝐾𝐿 + 𝐼]−1 = 𝐼 − 𝐺[𝐼 − (𝐴𝑜 − 𝐾𝐺)𝐿]−1 𝐾𝐿

gives the vector autoregressive representation



$$ y_t = \sum_{j=1}^{\infty} G (A^o - K G)^{j-1} K y_{t-j} + a_t $$
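A minimal sketch that builds the autoregressive coefficient matrices G(A° − KG)^{j−1}K, truncating the infinite sum at a chosen number of lags; A°, K and G are assumed to come from the innovations representation above.

import numpy as np

def var_coefficients(Ao, K, G, n_lags):
    # Coefficient on y_{t-j} is G (Ao - K G)^{j-1} K, j = 1, ..., n_lags
    coeffs = []
    M = np.eye(Ao.shape[0])          # holds (Ao - K G)^{j-1}
    for _ in range(n_lags):
        coeffs.append(G @ M @ K)
        M = (Ao - K @ G) @ M
    return coeffs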

17.3 Dynamic Demand Curves and Canonical Household Technologies

17.3.1 Canonical Household Technologies

ℎ𝑡 = Δℎ ℎ𝑡−1 + Θℎ 𝑐𝑡
𝑠𝑡 = Λℎ𝑡−1 + Π𝑐𝑡
𝑏𝑡 = 𝑈 𝑏 𝑧 𝑡
Definition: A household service technology (Δℎ , Θℎ , Π, Λ, 𝑈𝑏 ) is said to be canonical if
• Π is nonsingular, and

• the absolute values of the eigenvalues of $(\Delta_h - \Theta_h \Pi^{-1} \Lambda)$ are strictly less than $1/\sqrt{\beta}$.
Key invertibility property: A canonical household service technology maps a service process {𝑠𝑡 } in 𝐿20 into a corre-
sponding consumption process {𝑐𝑡 } for which the implied household capital stock process {ℎ𝑡 } is also in 𝐿20 .
An inverse household technology:

𝑐𝑡 = −Π−1 Λℎ𝑡−1 + Π−1 𝑠𝑡


ℎ𝑡 = (Δℎ − Θℎ Π−1 Λ)ℎ𝑡−1 + Θℎ Π−1 𝑠𝑡

The restriction on the eigenvalues of the matrix (Δℎ − Θℎ Π−1 Λ) keeps the household capital stock {ℎ𝑡 } in 𝐿20 .
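A minimal sketch of a function that checks the two requirements for given matrices; it simply tests nonsingularity of Π and the eigenvalue bound 1/√β stated above.

import numpy as np

def is_canonical(Δh, Θh, Λ, Π, β):
    # Requirement 1: Π nonsingular
    # Requirement 2: eigenvalues of Δh - Θh Π^{-1} Λ have modulus below 1/sqrt(β)
    if np.linalg.matrix_rank(Π) < Π.shape[0]:
        return False
    M = Δh - Θh @ np.linalg.inv(Π) @ Λ
    return np.max(np.abs(np.linalg.eigvals(M))) < 1 / np.sqrt(β)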


17.3.2 Dynamic Demand Functions



$$ \rho_t^0 \equiv \Pi^{-1\prime} \Big[ p_t^0 - \Theta_h' E_t \sum_{\tau=1}^{\infty} \beta^\tau (\Delta_h' - \Lambda' \Pi^{-1\prime} \Theta_h')^{\tau-1} \Lambda' \Pi^{-1\prime} p_{t+\tau}^0 \Big] $$

$$ s_{i,t} = \Lambda h_{i,t-1}, \qquad h_{i,t} = \Delta_h h_{i,t-1} $$

where $h_{i,-1} = h_{-1}$.

$$ W_0 = E_0 \sum_{t=0}^{\infty} \beta^t (w_t^0 \ell_t + \alpha_t^0 \cdot d_t) + v_0 \cdot k_{-1} $$

$$ \mu_0^w = \frac{E_0 \sum_{t=0}^{\infty} \beta^t \rho_t^0 \cdot (b_t - s_{i,t}) - W_0}{E_0 \sum_{t=0}^{\infty} \beta^t \rho_t^0 \cdot \rho_t^0} $$

$$ c_t = -\Pi^{-1} \Lambda h_{t-1} + \Pi^{-1} b_t - \Pi^{-1} \mu_0^w E_t \big\{ \Pi'^{-1} - \Pi'^{-1} \Theta_h' [I - (\Delta_h' - \Lambda' \Pi'^{-1} \Theta_h') \beta L^{-1}]^{-1} \Lambda' \Pi'^{-1} \beta L^{-1} \big\} p_t^0 $$

$$ h_t = \Delta_h h_{t-1} + \Theta_h c_t $$
This system expresses consumption demands at date 𝑡 as functions of: (i) time-𝑡 conditional expectations of future scaled Arrow-Debreu prices $\{p_{t+s}^0\}_{s=0}^{\infty}$; (ii) the stochastic process for the household's endowment $\{d_t\}$ and preference shock $\{b_t\}$, as mediated through the multiplier $\mu_0^w$ and wealth $W_0$; and (iii) past values of consumption, as mediated through the state variable $h_{t-1}$.

17.4 Gorman Aggregation and Engel Curves

We shall explore how the dynamic demand schedule for consumption goods opens up the possibility of satisfying Gorman’s
(1953) conditions for aggregation in a heterogeneous consumer model.
The first equation of our demand system is an Engel curve for consumption that is linear in the marginal utility $\mu_0^w$ of individual wealth with a coefficient on $\mu_0^w$ that depends only on prices.

The multiplier $\mu_0^w$ depends on wealth in an affine relationship, so that consumption is linear in wealth.

In a model with multiple consumers who have the same household technologies (Δℎ , Θℎ , Λ, Π) but possibly different
preference shock processes and initial values of household capital stocks, the coefficient on the marginal utility of wealth
is the same for all consumers.
Gorman showed that when Engel curves satisfy this property, there exists a unique community or aggregate preference
ordering over aggregate consumption that is independent of the distribution of wealth.

17.4.1 Re-Opened Markets



$$ \rho_t^t \equiv \Pi^{-1\prime} \Big[ p_t^t - \Theta_h' E_t \sum_{\tau=1}^{\infty} \beta^\tau (\Delta_h' - \Lambda' \Pi^{-1\prime} \Theta_h')^{\tau-1} \Lambda' \Pi^{-1\prime} p_{t+\tau}^t \Big] $$

$$ s_{i,t} = \Lambda h_{i,t-1}, \qquad h_{i,t} = \Delta_h h_{i,t-1}, $$

where now $h_{i,t-1} = h_{t-1}$. Define time 𝑡 wealth $W_t$

$$ W_t = E_t \sum_{j=0}^{\infty} \beta^j (w_{t+j}^t \ell_{t+j} + \alpha_{t+j}^t \cdot d_{t+j}) + v_t \cdot k_{t-1} $$

$$ \mu_t^w = \frac{E_t \sum_{j=0}^{\infty} \beta^j \rho_{t+j}^t \cdot (b_{t+j} - s_{i,t+j}) - W_t}{E_t \sum_{j=0}^{\infty} \beta^j \rho_{t+j}^t \cdot \rho_{t+j}^t} $$

$$ c_t = -\Pi^{-1} \Lambda h_{t-1} + \Pi^{-1} b_t - \Pi^{-1} \mu_t^w E_t \big\{ \Pi'^{-1} - \Pi'^{-1} \Theta_h' [I - (\Delta_h' - \Lambda' \Pi'^{-1} \Theta_h') \beta L^{-1}]^{-1} \Lambda' \Pi'^{-1} \beta L^{-1} \big\} p_t^t $$

$$ h_t = \Delta_h h_{t-1} + \Theta_h c_t $$

17.4.2 Dynamic Demand

Define a time 𝑡 continuation of a sequence $\{z_t\}_{t=0}^{\infty}$ as the sequence $\{z_\tau\}_{\tau=t}^{\infty}$. The demand system indicates that the time 𝑡 vector of demands for $c_t$ is influenced by:

• Through the multiplier $\mu_t^w$, the time 𝑡 continuation of the preference shock process $\{b_t\}$ and the time 𝑡 continuation of $\{s_{i,t}\}$.

• The time 𝑡 − 1 level of household durables $h_{t-1}$.

• Everything that affects the household's time 𝑡 wealth, including its stock of physical capital $k_{t-1}$ and its value $v_t$, the time 𝑡 continuation of the factor prices $\{w_t, \alpha_t\}$, the household's continuation endowment process, and the household's continuation plan for $\{\ell_t\}$.

• The time 𝑡 continuation of the vector of prices $\{p_t^t\}$.

17.4.3 Attaining a Canonical Household Technology

Apply the following version of a factorization identity:

$$ [\Pi + \beta^{1/2} L^{-1} \Lambda (I - \beta^{1/2} L^{-1} \Delta_h)^{-1} \Theta_h]' \, [\Pi + \beta^{1/2} L \Lambda (I - \beta^{1/2} L \Delta_h)^{-1} \Theta_h] = [\hat\Pi + \beta^{1/2} L^{-1} \hat\Lambda (I - \beta^{1/2} L^{-1} \Delta_h)^{-1} \Theta_h]' \, [\hat\Pi + \beta^{1/2} L \hat\Lambda (I - \beta^{1/2} L \Delta_h)^{-1} \Theta_h] $$

The factorization identity guarantees that the $[\hat\Lambda, \hat\Pi]$ representation satisfies both requirements for a canonical representation.

17.5 Partial Equilibrium

Now we’ll provide quick overviews of examples of economies that fit within our framework
We provide details for a number of these examples in subsequent lectures
1. Growth in Dynamic Linear Economies
2. Lucas Asset Pricing using DLE
3. IRFs in Hall Model
4. Permanent Income Using the DLE class
5. Rosen schooling model
6. Cattle cycles


7. Shock Non Invertibility


We’ll start with an example of a partial equilibrium in which we posit demand and supply curves
Suppose that we want to capture the dynamic demand curve:

$$ c_t = -\Pi^{-1} \Lambda h_{t-1} + \Pi^{-1} b_t - \Pi^{-1} \mu_0^w E_t \big\{ \Pi'^{-1} - \Pi'^{-1} \Theta_h' [I - (\Delta_h' - \Lambda' \Pi'^{-1} \Theta_h') \beta L^{-1}]^{-1} \Lambda' \Pi'^{-1} \beta L^{-1} \big\} p_t $$

$$ h_t = \Delta_h h_{t-1} + \Theta_h c_t $$

From material described earlier in this lecture, we know how to reverse engineer preferences that generate this demand
system
• note how the demand equations are cast in terms of the matrices in our standard preference representation
Now let’s turn to supply.
A representative firm takes as given and beyond its control the stochastic process $\{p_t\}_{t=0}^{\infty}$.

The firm sells its output 𝑐𝑡 in a competitive market each period.


Only spot markets convene at each date 𝑡 ≥ 0.
The firm also faces an exogenous process of cost disturbances 𝑑𝑡 .
The firm chooses stochastic processes $\{c_t, g_t, i_t, k_t\}_{t=0}^{\infty}$ to maximize

$$ E_0 \sum_{t=0}^{\infty} \beta^t \left\{ p_t \cdot c_t - g_t \cdot g_t / 2 \right\} $$

subject to given 𝑘−1 and

Φ𝑐 𝑐𝑡 + Φ𝑖 𝑖𝑡 + Φ𝑔 𝑔𝑡 = Γ𝑘𝑡−1 + 𝑑𝑡
𝑘𝑡 = Δ𝑘 𝑘𝑡−1 + Θ𝑘 𝑖𝑡 .

17.6 Equilibrium Investment Under Uncertainty

A representative firm maximizes



$$ E \sum_{t=0}^{\infty} \beta^t \left\{ p_t c_t - g_t^2 / 2 \right\} $$

subject to the technology

𝑐𝑡 = 𝛾𝑘𝑡−1
𝑘𝑡 = 𝛿𝑘 𝑘𝑡−1 + 𝑖𝑡
𝑔𝑡 = 𝑓1 𝑖𝑡 + 𝑓2 𝑑𝑡

where 𝑑𝑡 is a cost shifter, 𝛾 > 0, and 𝑓1 > 0 is a cost parameter and 𝑓2 = 1. Demand is governed by

𝑝𝑡 = 𝛼0 − 𝛼1 𝑐𝑡 + 𝑢𝑡

where 𝑢𝑡 is a demand shifter with mean zero and 𝛼0 , 𝛼1 are positive parameters.
Assume that 𝑢𝑡 , 𝑑𝑡 are uncorrelated first-order autoregressive processes.


17.7 A Rosen-Topel Housing Model

$$ R_t = b_t + \alpha h_t, \qquad p_t = E_t \sum_{\tau=0}^{\infty} (\beta \delta_h)^\tau R_{t+\tau} $$

where $h_t$ is the stock of housing at time 𝑡, $R_t$ is the rental rate for housing, $p_t$ is the price of new houses, and $b_t$ is a demand shifter; $\alpha < 0$ is a demand parameter, and $\delta_h$ is a depreciation factor for houses.
We cast this demand specification within our class of models by letting the stock of houses ℎ𝑡 evolve according to
ℎ𝑡 = 𝛿ℎ ℎ𝑡−1 + 𝑐𝑡 , 𝛿ℎ ∈ (0, 1)
where $c_t$ is the rate of production of new houses.

Houses produce services $s_t$ according to $s_t = \bar\lambda h_t$ or $s_t = \lambda h_{t-1} + \pi c_t$, where $\lambda = \bar\lambda \delta_h$ and $\pi = \bar\lambda$.

We can take $\bar\lambda \rho_t^0 = R_t$ as the rental rate on housing at time 𝑡, measured in units of time 𝑡 consumption (housing).
Demand for housing services is
𝑠𝑡 = 𝑏𝑡 − 𝜇0 𝜌𝑡0
where the price of new houses 𝑝𝑡 is related to 𝜌𝑡0 by 𝜌𝑡0 = 𝜋−1 [𝑝𝑡 − 𝛽𝛿ℎ 𝐸𝑡 𝑝𝑡+1 ].

17.8 Cattle Cycles

Rosen, Murphy, and Scheinkman (1994). Let 𝑝𝑡 be the price of freshly slaughtered beef, 𝑚𝑡 the feeding cost of preparing
an animal for slaughter, ℎ̃ 𝑡 the one-period holding cost for a mature animal, 𝛾1 ℎ̃ 𝑡 the one-period holding cost for a yearling,
and 𝛾0 ℎ̃ 𝑡 the one-period holding cost for a calf.
The cost processes $\{\tilde h_t, m_t\}_{t=0}^{\infty}$ are exogenous, while the stochastic process $\{p_t\}_{t=0}^{\infty}$ is determined by a rational expectations equilibrium. Let $\tilde x_t$ be the breeding stock, and $\tilde y_t$ be the total stock of animals.
The law of motion for cattle stocks is

$$ \tilde x_t = (1 - \delta) \tilde x_{t-1} + g \tilde x_{t-3} - c_t $$

where $c_t$ is a rate of slaughtering. The total head-count of cattle

$$ \tilde y_t = \tilde x_t + g \tilde x_{t-1} + g \tilde x_{t-2} $$

is the sum of adults, calves, and yearlings, respectively.
A representative farmer chooses $\{c_t, \tilde x_t\}$ to maximize

$$ E_0 \sum_{t=0}^{\infty} \beta^t \big\{ p_t c_t - \tilde h_t \tilde x_t - (\gamma_0 \tilde h_t)(g \tilde x_{t-1}) - (\gamma_1 \tilde h_t)(g \tilde x_{t-2}) - m_t c_t - \Psi(\tilde x_t, \tilde x_{t-1}, \tilde x_{t-2}, c_t) \big\} $$

where

$$ \Psi = \frac{\psi_1}{2} \tilde x_t^2 + \frac{\psi_2}{2} \tilde x_{t-1}^2 + \frac{\psi_3}{2} \tilde x_{t-2}^2 + \frac{\psi_4}{2} c_t^2 $$
Demand is governed by
𝑐𝑡 = 𝛼0 − 𝛼1 𝑝𝑡 + 𝑑𝑡̃
where $\alpha_0 > 0$, $\alpha_1 > 0$, and $\{\tilde d_t\}_{t=0}^{\infty}$ is a stochastic process with mean zero representing a demand shifter.

For more details see Cattle cycles


17.9 Models of Occupational Choice and Pay

We’ll describe the following pair of schooling models that view education as a time-to-build process:
• Rosen schooling model for engineers
• Two-occupation model

17.9.1 Market for Engineers

Ryoo and Rosen’s (2004) [Ryoo and Rosen, 2004] model consists of the following equations:
first, a demand curve for engineers

𝑤𝑡 = −𝛼𝑑 𝑁𝑡 + 𝜖1𝑡 , 𝛼𝑑 > 0

second, a time-to-build structure of the education process

𝑁𝑡+𝑘 = 𝛿𝑁 𝑁𝑡+𝑘−1 + 𝑛𝑡 , 0 < 𝛿𝑁 < 1

third, a definition of the discounted present value of each new engineering student

$$ v_t = \beta^k E_t \sum_{j=0}^{\infty} (\beta \delta_N)^j w_{t+k+j}; $$

and fourth, a supply curve of new students driven by 𝑣𝑡

𝑛𝑡 = 𝛼𝑠 𝑣𝑡 + 𝜖2𝑡 , 𝛼𝑠 > 0

Here {𝜖1𝑡 , 𝜖2𝑡 } are stochastic processes of labor demand and supply shocks.
Definition: A partial equilibrium is a stochastic process $\{w_t, N_t, v_t, n_t\}_{t=0}^{\infty}$ satisfying these four equations, and initial conditions $N_{-1}, n_{-s}, s = 1, \ldots, -k$.
We sweep the time-to-build structure and the demand for engineers into the household technology and put the supply of new engineers into the technology for producing goods.

$$ s_t = [\lambda_1 \ 0 \ \cdots \ 0] \begin{bmatrix} h_{1t-1} \\ h_{2t-1} \\ \vdots \\ h_{k+1,t-1} \end{bmatrix} + 0 \cdot c_t $$

$$ \begin{bmatrix} h_{1t} \\ h_{2t} \\ \vdots \\ h_{k,t} \\ h_{k+1,t} \end{bmatrix} = \begin{bmatrix} \delta_N & 1 & 0 & \cdots & 0 \\ 0 & 0 & 1 & \cdots & 0 \\ \vdots & \vdots & \vdots & \ddots & \vdots \\ 0 & \cdots & \cdots & 0 & 1 \\ 0 & 0 & 0 & \cdots & 0 \end{bmatrix} \begin{bmatrix} h_{1t-1} \\ h_{2t-1} \\ \vdots \\ h_{k,t-1} \\ h_{k+1,t-1} \end{bmatrix} + \begin{bmatrix} 0 \\ 0 \\ \vdots \\ 0 \\ 1 \end{bmatrix} c_t $$

This specification sets Rosen’s 𝑁𝑡 = ℎ1𝑡−1 , 𝑛𝑡 = 𝑐𝑡 , ℎ𝜏+1,𝑡−1 = 𝑛𝑡−𝜏 , 𝜏 = 1, … , 𝑘, and uses the home-produced service
to capture the demand for labor. Here 𝜆1 embodies Rosen’s demand parameter 𝛼𝑑 .
• The supply of new workers becomes our consumption.
• The dynamic demand curve becomes Rosen’s dynamic supply curve for new workers.
Remark: This has an Imai-Keane flavor.
For more details and Python code see Rosen schooling model.


17.9.2 Skilled and Unskilled Workers

First, a demand curve for labor

⎡ w_{ut} ⎤ = α_d ⎡ N_{ut} ⎤ + ε_{1t}
⎣ w_{st} ⎦       ⎣ N_{st} ⎦

where α_d is a (2 × 2) matrix of demand parameters and ε_{1t} is a vector of demand shifters; second, time-to-train
specifications for skilled and unskilled labor, respectively:

𝑁𝑠𝑡+𝑘 = 𝛿𝑁 𝑁𝑠𝑡+𝑘−1 + 𝑛𝑠𝑡


𝑁𝑢𝑡 = 𝛿𝑁 𝑁𝑢𝑡−1 + 𝑛𝑢𝑡 ;

where 𝑁𝑠𝑡 , 𝑁𝑢𝑡 are stocks of the two types of labor, and 𝑛𝑠𝑡 , 𝑛𝑢𝑡 are entry rates into the two occupations.
third, definitions of discounted present values of new entrants to the skilled and unskilled occupations, respectively:

v_{st} = E_t β^k ∑_{j=0}^∞ (β δ_N)^j w_{s,t+k+j}

v_{ut} = E_t ∑_{j=0}^∞ (β δ_N)^j w_{u,t+j}

where 𝑤𝑢𝑡 , 𝑤𝑠𝑡 are wage rates for the two occupations; and fourth, supply curves for new entrants:

⎡ n_{st} ⎤ = α_s ⎡ v_{ut} ⎤ + ε_{2t}
⎣ n_{ut} ⎦       ⎣ v_{st} ⎦

Short Cut
As an alternative, Siow simply used the equalizing differences condition

𝑣𝑢𝑡 = 𝑣𝑠𝑡

17.10 Permanent Income Models

We’ll describe a class of permanent income models that feature


• Many consumption goods and services
• A single capital good with 𝑅𝛽 = 1
• The physical production technology
𝜙𝑐 ⋅ 𝑐𝑡 + 𝑖𝑡 = 𝛾𝑘𝑡−1 + 𝑒𝑡
𝑘𝑡 = 𝑘𝑡−1 + 𝑖𝑡
𝜙𝑖 𝑖𝑡 − 𝑔𝑡 = 0
Implication One:
Equality of Present Values of Moving Average Coefficients of 𝑐 and 𝑒

k_{t-1} = β ∑_{j=0}^∞ β^j (φ_c ⋅ c_{t+j} − e_{t+j})   ⇒

k_{t-1} = β ∑_{j=0}^∞ β^j E(φ_c ⋅ c_{t+j} − e_{t+j}) | J_t   ⇒


∑_{j=0}^∞ β^j (φ_c)′ χ_j = ∑_{j=0}^∞ β^j ε_j

where χ_j w_t is the response of c_{t+j} to w_t and ε_j w_t is the response of the endowment e_{t+j} to w_t.


Implication Two:
Martingales

ℳ^k_t = E(ℳ^k_{t+1} | J_t)

ℳ^e_t = E(ℳ^e_{t+1} | J_t)

and

ℳ^c_t = (Φ_c)′ ℳ^d_t = φ_c ℳ^e_t

For more details see Permanent Income Using the DLE class
Testing Permanent Income Models:
We have two types of implications of permanent income models:
• Equality of present values of moving average coefficients.
• Martingale ℳ𝑘𝑡 .
These have been tested in work by Hansen, Sargent, and Roberts (1991) [Sargent et al., 1991] and by Attanasio and
Pavoni (2011) [Attanasio and Pavoni, 2011].

17.11 Gorman Heterogeneous Households

We now assume that there is a finite number of households, each with its own household technology and preferences over
consumption services.
Household 𝑗 orders preferences over consumption processes according to

− (1/2) E ∑_{t=0}^∞ β^t [ (s_{jt} − b_{jt}) ⋅ (s_{jt} − b_{jt}) + ℓ_{jt}^2 ] ∣ J_0

𝑠𝑗𝑡 = Λ ℎ𝑗,𝑡−1 + Π 𝑐𝑗𝑡

ℎ𝑗𝑡 = Δℎ ℎ𝑗,𝑡−1 + Θℎ 𝑐𝑗𝑡


and ℎ𝑗,−1 is given

b_{jt} = U_{bj} z_t

E ∑_{t=0}^∞ β^t p^0_t ⋅ c_{jt} ∣ J_0 = E ∑_{t=0}^∞ β^t (w^0_t ℓ_{jt} + α^0_t ⋅ d_{jt}) ∣ J_0 + v_0 ⋅ k_{j,−1} ,

where k_{j,−1} is given. The j-th consumer owns an endowment process d_{jt}, governed by the stochastic process d_{jt} = U_{dj} z_t.
We refer to this as a setting with Gorman heterogeneous households.
This specification confines heterogeneity among consumers to:
• differences in the preference processes {𝑏𝑗𝑡 }, represented by different selections of 𝑈𝑏𝑗
• differences in the endowment processes {𝑑𝑗𝑡 }, represented by different selections of 𝑈𝑑𝑗
• differences in ℎ𝑗,−1 and


• differences in 𝑘𝑗,−1
The matrices Λ, Π, Δℎ , Θℎ do not depend on 𝑗.
This makes everybody's demand system have the form described earlier, with different μ^w_{0j}'s (reflecting different wealth
levels), different b_{jt} preference shock processes, and different initial conditions for household capital stocks.
Punchline: there exists a representative consumer.
We can use the representative consumer to compute a competitive equilibrium aggregate allocation and price system.
With the equilibrium aggregate allocation and price system in hand, we can then compute allocations to each household.
Computing Allocations to Individuals:
Set

ℓ_{jt} = (μ^w_{0j} / μ^w_{0a}) ℓ_{at}

Then solve the following equation for μ^w_{0j}:

μ^w_{0j} E_0 ∑_{t=0}^∞ β^t { ρ^0_t ⋅ ρ^0_t + (w^0_t / μ^w_{0a}) ℓ_{at} } = E_0 ∑_{t=0}^∞ β^t { ρ^0_t ⋅ (b_{jt} − s_{jt}) − α^0_t ⋅ d_{jt} } − v_0 k_{j,−1}

s_{jt} − b_{jt} = μ^w_{0j} ρ^0_t

c_{jt} = −Π^{−1} Λ h_{j,t−1} + Π^{−1} s_{jt}

h_{jt} = (Δ_h − Θ_h Π^{−1} Λ) h_{j,t−1} + Π^{−1} Θ_h s_{jt}

Here h_{j,−1} is given.

17.12 Non-Gorman Heterogeneous Households

We now describe a less tractable type of heterogeneity across households that we dub Non-Gorman heterogeneity.
Here is the specification:
Preferences and Household Technologies:

− (1/2) E ∑_{t=0}^∞ β^t [ (s_{it} − b_{it}) ⋅ (s_{it} − b_{it}) + ℓ_{it}^2 ] ∣ J_0

𝑠𝑖𝑡 = Λ𝑖 ℎ𝑖𝑡−1 + Π𝑖 𝑐𝑖𝑡


ℎ𝑖𝑡 = Δℎ𝑖 ℎ𝑖𝑡−1 + Θℎ𝑖 𝑐𝑖𝑡 , 𝑖 = 1, 2.
𝑏𝑖𝑡 = 𝑈𝑏𝑖 𝑧𝑡

𝑧𝑡+1 = 𝐴22 𝑧𝑡 + 𝐶2 𝑤𝑡+1


Production Technology

Φ𝑐 (𝑐1𝑡 + 𝑐2𝑡 ) + Φ𝑔 𝑔𝑡 + Φ𝑖 𝑖𝑡 = Γ𝑘𝑡−1 + 𝑑1𝑡 + 𝑑2𝑡

𝑘𝑡 = Δ𝑘 𝑘𝑡−1 + Θ𝑘 𝑖𝑡

𝑔𝑡 ⋅ 𝑔𝑡 = ℓ𝑡2 , ℓ𝑡 = ℓ1𝑡 + ℓ2𝑡

𝑑𝑖𝑡 = 𝑈𝑑𝑖 𝑧𝑡 , 𝑖 = 1, 2


Pareto Problem:

− (1/2) λ E_0 ∑_{t=0}^∞ β^t [ (s_{1t} − b_{1t}) ⋅ (s_{1t} − b_{1t}) + ℓ_{1t}^2 ]

− (1/2) (1 − λ) E_0 ∑_{t=0}^∞ β^t [ (s_{2t} − b_{2t}) ⋅ (s_{2t} − b_{2t}) + ℓ_{2t}^2 ]

Mongrel Aggregation: Static


There is what we call a kind of mongrel aggregation in this setting.
We first describe the idea within a simple static setting in which there is a single consumer with a static demand curve and
implied preferences:

c_t = Π^{−1} b_t − μ_0 Π^{−1} Π^{−1′} p_t

An inverse demand curve is

p_t = μ_0^{−1} Π′ b_t − μ_0^{−1} Π′ Π c_t

Integrating the marginal utility vector shows that preferences can be taken to be

(−2 μ_0)^{−1} (Π c_t − b_t) ⋅ (Π c_t − b_t)

Key Insight: Factor the inverse of a ‘covariance matrix’.


Now assume that there are two consumers, 𝑖 = 1, 2, with demand curves

c_{it} = Π_i^{−1} b_{it} − μ_{0i} Π_i^{−1} Π_i^{−1′} p_t

c_{1t} + c_{2t} = (Π_1^{−1} b_{1t} + Π_2^{−1} b_{2t}) − (μ_{01} Π_1^{−1} Π_1^{−1′} + μ_{02} Π_2^{−1} Π_2^{−1′}) p_t

Setting c_{1t} + c_{2t} = c_t and solving for p_t gives

p_t = (μ_{01} Π_1^{−1} Π_1^{−1′} + μ_{02} Π_2^{−1} Π_2^{−1′})^{−1} (Π_1^{−1} b_{1t} + Π_2^{−1} b_{2t})
      − (μ_{01} Π_1^{−1} Π_1^{−1′} + μ_{02} Π_2^{−1} Π_2^{−1′})^{−1} c_t

Punchline: choose Π associated with the aggregate ordering to satisfy

μ_0^{−1} Π′ Π = (μ_{01} Π_1^{−1} Π_1^{−1′} + μ_{02} Π_2^{−1} Π_2^{−1′})^{−1}

Dynamic Analogue:
We now describe how to extend mongrel aggregation to a dynamic setting.
The key comparison is
• Static: factor a covariance matrix-like object
• Dynamic: factor a spectral-density matrix-like object
Programming Problem for Dynamic Mongrel Aggregation:
Our strategy for deducing the mongrel preference ordering over 𝑐𝑡 = 𝑐1𝑡 + 𝑐2𝑡 is to solve the programming problem:
choose {𝑐1𝑡 , 𝑐2𝑡 } to maximize the criterion

− ∑_{t=0}^∞ β^t [ λ (s_{1t} − b_{1t}) ⋅ (s_{1t} − b_{1t}) + (1 − λ)(s_{2t} − b_{2t}) ⋅ (s_{2t} − b_{2t}) ]


subject to

h_{jt} = Δ_{hj} h_{j,t−1} + Θ_{hj} c_{jt} ,   j = 1, 2

s_{jt} = Λ_j h_{j,t−1} + Π_j c_{jt} ,   j = 1, 2

c_{1t} + c_{2t} = c_t

with (h_{1,−1}, h_{2,−1}) given and {b_{1t}}, {b_{2t}}, {c_t} being known and fixed sequences.
Substituting the {𝑐1𝑡 , 𝑐2𝑡 } sequences that solve this problem as functions of {𝑏1𝑡 , 𝑏2𝑡 , 𝑐𝑡 } into the objective determines
a mongrel preference ordering over {𝑐𝑡 } = {𝑐1𝑡 + 𝑐2𝑡 }.
In solving this problem, it is convenient to proceed by using Fourier transforms. For details, please see [Hansen and
Sargent, 2013] where they deploy a
Secret Weapon: Another application of the spectral factorization identity.
Concluding remark: The models in the [Hansen and Sargent, 2013] class described in this lecture are all complete markets
models. We have exploited the fact that complete market models are all alike to allow us to define a class that gives the
same name to different things in the spirit of Henri Poincaré.
Could we create such a class for incomplete markets models?
That would be nice, but before trying it would be wise to contemplate the remainder of a statement by Robert E. Lucas,
Jr., with which we began this lecture.
“Complete market economies are all alike but each incomplete market economy is incomplete in its own
individual way.” Robert E. Lucas, Jr., (1989)



CHAPTER

EIGHTEEN

GROWTH IN DYNAMIC LINEAR ECONOMIES

This is another member of a suite of lectures that use the quantecon DLE class to instantiate models within the [Hansen
and Sargent, 2013] class of models described in detail in Recursive Models of Dynamic Linear Economies.
In addition to what’s included in Anaconda, this lecture uses the quantecon library.

!pip install --upgrade quantecon

This lecture describes several complete market economies having a common linear-quadratic-Gaussian structure.
Three examples of such economies show how the DLE class can be used to compute equilibria of such economies in
Python and to illustrate how different versions of these economies can or cannot generate sustained growth.
We require the following imports

import numpy as np
import matplotlib.pyplot as plt
from quantecon import DLE

18.1 Common Structure

Our example economies have the following features


• Information flows are governed by an exogenous stochastic process 𝑧𝑡 that follows

𝑧𝑡+1 = 𝐴22 𝑧𝑡 + 𝐶2 𝑤𝑡+1

where 𝑤𝑡+1 is a martingale difference sequence.


• Preference shocks 𝑏𝑡 and technology shocks 𝑑𝑡 are linear functions of 𝑧𝑡

𝑏𝑡 = 𝑈𝑏 𝑧𝑡

𝑑𝑡 = 𝑈𝑑 𝑧𝑡

• Consumption and physical investment goods are produced using the following technology

Φ𝑐 𝑐𝑡 + Φ𝑔 𝑔𝑡 + Φ𝑖 𝑖𝑡 = Γ𝑘𝑡−1 + 𝑑𝑡

𝑘𝑡 = Δ𝑘 𝑘𝑡−1 + Θ𝑘 𝑖𝑡
𝑔𝑡 ⋅ 𝑔𝑡 = 𝑙2𝑡
where 𝑐𝑡 is a vector of consumption goods, 𝑔𝑡 is a vector of intermediate goods, 𝑖𝑡 is a vector of investment goods,
𝑘𝑡 is a vector of physical capital goods, and 𝑙𝑡 is the amount of labor supplied by the representative household.


• Preferences of a representative household are described by

− (1/2) 𝔼 ∑_{t=0}^∞ β^t [ (s_t − b_t) ⋅ (s_t − b_t) + l_t^2 ] ,   0 < β < 1

𝑠𝑡 = Λℎ𝑡−1 + Π𝑐𝑡

ℎ𝑡 = Δℎ ℎ𝑡−1 + Θℎ 𝑐𝑡

where 𝑠𝑡 is a vector of consumption services, and ℎ𝑡 is a vector of household capital stocks.


Thus, an instance of this class of economies is described by the matrices

{𝐴22 , 𝐶2 , 𝑈𝑏 , 𝑈𝑑 , Φ𝑐 , Φ𝑔 , Φ𝑖 , Γ, Δ𝑘 , Θ𝑘 , Λ, Π, Δℎ , Θℎ }

and the scalar 𝛽.

18.2 A Planning Problem

The first welfare theorem asserts that a competitive equilibrium allocation solves the following planning problem.
Choose {𝑐𝑡 , 𝑠𝑡 , 𝑖𝑡 , ℎ𝑡 , 𝑘𝑡 , 𝑔𝑡 }∞
𝑡=0 to maximize

− (1/2) 𝔼 ∑_{t=0}^∞ β^t [ (s_t − b_t) ⋅ (s_t − b_t) + g_t ⋅ g_t ]

subject to the linear constraints

Φ𝑐 𝑐𝑡 + Φ𝑔 𝑔𝑡 + Φ𝑖 𝑖𝑡 = Γ𝑘𝑡−1 + 𝑑𝑡

𝑘𝑡 = Δ𝑘 𝑘𝑡−1 + Θ𝑘 𝑖𝑡

ℎ𝑡 = Δℎ ℎ𝑡−1 + Θℎ 𝑐𝑡

𝑠𝑡 = Λℎ𝑡−1 + Π𝑐𝑡
and

𝑧𝑡+1 = 𝐴22 𝑧𝑡 + 𝐶2 𝑤𝑡+1

𝑏𝑡 = 𝑈 𝑏 𝑧 𝑡

𝑑𝑡 = 𝑈𝑑 𝑧𝑡
The DLE class in Python maps this planning problem into a linear-quadratic dynamic programming problem and then
solves it by using QuantEcon’s LQ class.
(See Section 5.5 of Hansen & Sargent (2013) [Hansen and Sargent, 2013] for a full description of how to map these
economies into an LQ setting, and how to use the solution to the LQ problem to construct the output matrices in order to
simulate the economies)
The state for the LQ problem is

x_t = ⎡ h_{t−1} ⎤
      ⎢ k_{t−1} ⎥
      ⎣  z_t    ⎦


and the control variable is 𝑢𝑡 = 𝑖𝑡 .


Once the LQ problem has been solved, the law of motion for the state is

𝑥𝑡+1 = (𝐴 − 𝐵𝐹 )𝑥𝑡 + 𝐶𝑤𝑡+1

where the optimal control law is 𝑢𝑡 = −𝐹 𝑥𝑡 .


Letting 𝐴𝑜 = 𝐴 − 𝐵𝐹 we write this law of motion as

𝑥𝑡+1 = 𝐴𝑜 𝑥𝑡 + 𝐶𝑤𝑡+1
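
The following sketch illustrates how such a law of motion can be simulated once A^o and C are in hand; the small matrices below are placeholders rather than those of any economy in this lecture.

import numpy as np

# Simulate x_{t+1} = A^o x_t + C w_{t+1} for illustrative placeholder matrices
rng = np.random.default_rng(0)

A_o = np.array([[0.9, 0.1],
                [0.0, 0.5]])
C = np.array([[1.0],
              [0.5]])

T = 50
x = np.zeros((2, 1))
path = np.empty((T, 2))
for t in range(T):
    path[t] = x.ravel()
    w = rng.standard_normal((1, 1))   # i.i.d. shock (a martingale difference)
    x = A_o @ x + C @ w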

18.3 Example Economies

Each of the example economies shown here will share a number of components. In particular, for each we will consider
preferences of the form

− (1/2) 𝔼 ∑_{t=0}^∞ β^t [ (s_t − b_t)^2 + l_t^2 ] ,   0 < β < 1

𝑠𝑡 = 𝜆ℎ𝑡−1 + 𝜋𝑐𝑡

ℎ𝑡 = 𝛿ℎ ℎ𝑡−1 + 𝜃ℎ 𝑐𝑡

𝑏𝑡 = 𝑈𝑏 𝑧𝑡
Technology of the form

𝑐𝑡 + 𝑖𝑡 = 𝛾1 𝑘𝑡−1 + 𝑑1𝑡

𝑘𝑡 = 𝛿𝑘 𝑘𝑡−1 + 𝑖𝑡

𝑔𝑡 = 𝜙1 𝑖𝑡 , 𝜙1 > 0
⎡ d_{1t} ⎤ = U_d z_t
⎣   0    ⎦
And information of the form
z_{t+1} = ⎡ 1  0   0  ⎤ z_t + ⎡ 0  0 ⎤ w_{t+1}
          ⎢ 0 0.8  0  ⎥       ⎢ 1  0 ⎥
          ⎣ 0  0  0.5 ⎦       ⎣ 0  1 ⎦

U_b = [ 30  0  0 ]

U_d = ⎡ 5  1  0 ⎤
      ⎣ 0  0  0 ⎦
We shall vary {𝜆, 𝜋, 𝛿ℎ , 𝜃ℎ , 𝛾1 , 𝛿𝑘 , 𝜙1 } and the initial state 𝑥0 across the three economies.


18.3.1 Example 1: Hall (1978)

First, we set parameters such that consumption follows a random walk. In particular, we set
λ = 0, π = 1, γ_1 = 0.1, ϕ_1 = 0.00001, δ_k = 0.95, β = 1/1.05
(In this economy δ_h and θ_h are arbitrary as household capital does not enter the equation for consumption services. We
set them to values that will become useful in Example 3.)
It is worth noting that this choice of parameter values ensures that 𝛽(𝛾1 + 𝛿𝑘 ) = 1.
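
A one-line check of this condition, using the parameter values just listed:

import numpy as np

β, γ_1, δ_k = 1 / 1.05, 0.1, 0.95
print(np.isclose(β * (γ_1 + δ_k), 1.0))   # True
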
For simulations of this economy, we choose an initial condition of

𝑥0 = [ 5 150 1 0 0 ]

# Parameter Matrices
γ_1 = 0.1
ϕ_1 = 1e-5

ϕ_c, ϕ_g, ϕ_i, γ, δ_k, θ_k = (np.array([[1], [0]]),


np.array([[0], [1]]),
np.array([[1], [-ϕ_1]]),
np.array([[γ_1], [0]]),
np.array([[.95]]),
np.array([[1]]))

β, l_λ, π_h, δ_h, θ_h = (np.array([[1 / 1.05]]),


np.array([[0]]),
np.array([[1]]),
np.array([[.9]]),
np.array([[1]]) - np.array([[.9]]))

a22, c2, ub, ud = (np.array([[1, 0, 0],


[0, 0.8, 0],
[0, 0, 0.5]]),
np.array([[0, 0],
[1, 0],
[0, 1]]),
np.array([[30, 0, 0]]),
np.array([[5, 1, 0],
[0, 0, 0]]))

# Initial condition
x0 = np.array([[5], [150], [1], [0], [0]])

info1 = (a22, c2, ub, ud)


tech1 = (ϕ_c, ϕ_g, ϕ_i, γ, δ_k, θ_k)
pref1 = (β, l_λ, π_h, δ_h, θ_h)

These parameter values are used to define an economy of the DLE class.

econ1 = DLE(info1, tech1, pref1)

We can then simulate the economy for a chosen length of time, from our initial state vector 𝑥0

econ1.compute_sequence(x0, ts_length=300)


The economy stores the simulated values for each variable. Below we plot consumption and investment

# This is the right panel of Fig 5.7.1 from p.105 of HS2013


plt.plot(econ1.c[0], label='Cons.')
plt.plot(econ1.i[0], label='Inv.')
plt.legend()
plt.show()

Inspection of the plot shows that the sample paths of consumption and investment drift in ways that suggest that each has
or nearly has a random walk or unit root component.
This is confirmed by checking the eigenvalues of 𝐴𝑜

econ1.endo, econ1.exo

(array([0.9, 1. ]), array([1. , 0.8, 0.5]))

The endogenous eigenvalue that appears to be unity reflects the random walk character of consumption in Hall’s model.
• Actually, the largest endogenous eigenvalue is very slightly below 1.
• This outcome comes from the small adjustment cost 𝜙1 .

econ1.endo[1]

0.9999999999904767

The fact that the largest endogenous eigenvalue is strictly less than unity in modulus means that it is possible to compute
the non-stochastic steady state of consumption, investment and capital.


econ1.compute_steadystate()
np.set_printoptions(precision=3, suppress=True)
print(econ1.css, econ1.iss, econ1.kss)

[[4.999]] [[-0.001]] [[-0.023]]

However, the near-unity endogenous eigenvalue means that these steady state values are of little relevance.

18.3.2 Example 2: Altered Growth Condition

We generate our next economy by making two alterations to the parameters of Example 1.
• First, we raise 𝜙1 from 0.00001 to 1.
– This will lower the endogenous eigenvalue that is close to 1, causing the economy to head more quickly to
the vicinity of its non-stochastic steady-state.
• Second, we raise 𝛾1 from 0.1 to 0.15.
– This has the effect of raising the optimal steady-state value of capital.
We also start the economy off from an initial condition with a lower capital stock

𝑥0 = [ 5 20 1 0 0 ]

Therefore, we need to define the following new parameters

γ2 = 0.15
γ22 = np.array([[γ2], [0]])

ϕ_12 = 1
ϕ_i2 = np.array([[1], [-ϕ_12]])

tech2 = (ϕ_c, ϕ_g, ϕ_i2, γ22, δ_k, θ_k)

x02 = np.array([[5], [20], [1], [0], [0]])

Creating the DLE class and then simulating gives the following plot for consumption and investment

econ2 = DLE(info1, tech2, pref1)

econ2.compute_sequence(x02, ts_length=300)

plt.plot(econ2.c[0], label='Cons.')
plt.plot(econ2.i[0], label='Inv.')
plt.legend()
plt.show()


Simulating our new economy shows that consumption grows quickly in the early stages of the sample.
However, it then settles down around the new non-stochastic steady-state level of consumption of 17.5, which we find as
follows

econ2.compute_steadystate()
print(econ2.css, econ2.iss, econ2.kss)

[[17.5]] [[6.25]] [[125.]]

The economy converges faster to this level than in Example 1 because the largest endogenous eigenvalue of 𝐴𝑜 is now
significantly lower than 1.

econ2.endo, econ2.exo

(array([0.9 , 0.952]), array([1. , 0.8, 0.5]))


18.3.3 Example 3: A Jones-Manuelli (1990) Economy

For our third economy, we choose parameter values with the aim of generating sustained growth in consumption, invest-
ment and capital.
To do this, we set parameters so that Jones and Manuelli’s “growth condition” is just satisfied.
In our notation, just satisfying the growth condition is actually equivalent to setting 𝛽(𝛾1 + 𝛿𝑘 ) = 1, the condition that
was necessary for consumption to be a random walk in Hall’s model.
Thus, we lower 𝛾1 back to 0.1.
In our model, this is a necessary but not sufficient condition for growth.
To generate growth we set preference parameters to reflect habit persistence.
In particular, we set 𝜆 = −1, 𝛿ℎ = 0.9 and 𝜃ℎ = 1 − 𝛿ℎ = 0.1.
This makes preferences assume the form

− (1/2) 𝔼 ∑_{t=0}^∞ β^t [ (c_t − b_t − (1 − δ_h) ∑_{j=0}^∞ δ_h^j c_{t−j−1})^2 + l_t^2 ]

These preferences reflect habit persistence



• the effective "bliss point" b_t + (1 − δ_h) ∑_{j=0}^∞ δ_h^j c_{t−j−1} now shifts in response to a moving average of past consumption
Since 𝛿ℎ and 𝜃ℎ were defined earlier, the only change we need to make from the parameters of Example 1 is to define
the new value of 𝜆.

l_λ2 = np.array([[-1]])
pref2 = (β, l_λ2, π_h, δ_h, θ_h)

econ3 = DLE(info1, tech1, pref2)

We simulate this economy from the original state vector

econ3.compute_sequence(x0, ts_length=300)

# This is the right panel of Fig 5.10.1 from p.110 of HS2013


plt.plot(econ3.c[0], label='Cons.')
plt.plot(econ3.i[0], label='Inv.')
plt.legend()
plt.show()


Thus, adding habit persistence to the Hall model of Example 1 is enough to generate sustained growth in our economy.
The eigenvalues of 𝐴𝑜 in this new economy are

econ3.endo, econ3.exo

(array([1.+0.j, 1.-0.j]), array([1. , 0.8, 0.5]))

We now have two unit endogenous eigenvalues. One stems from satisfying the growth condition (as in Example 1).
The other unit eigenvalue results from setting 𝜆 = −1.
To show the importance of both of these for generating growth, we consider the following experiments.

18.3.4 Example 3.1: Varying Sensitivity

Next we raise 𝜆 to -0.7

l_λ3 = np.array([[-0.7]])
pref3 = (β, l_λ3, π_h, δ_h, θ_h)

econ4 = DLE(info1, tech1, pref3)

econ4.compute_sequence(x0, ts_length=300)

plt.plot(econ4.c[0], label='Cons.')
plt.plot(econ4.i[0], label='Inv.')
plt.legend()
plt.show()


We no longer achieve sustained growth if 𝜆 is raised from -1 to -0.7.


This is related to the fact that one of the endogenous eigenvalues is now less than 1.

econ4.endo, econ4.exo

(array([0.97, 1. ]), array([1. , 0.8, 0.5]))

18.3.5 Example 3.2: More Impatience

Next let’s lower 𝛽 to 0.94

β_2 = np.array([[0.94]])
pref4 = (β_2, l_λ, π_h, δ_h, θ_h)

econ5 = DLE(info1, tech1, pref4)

econ5.compute_sequence(x0, ts_length=300)

plt.plot(econ5.c[0], label='Cons.')
plt.plot(econ5.i[0], label='Inv.')
plt.legend()
plt.show()


Growth also fails if we lower 𝛽, since we now have 𝛽(𝛾1 + 𝛿𝑘 ) < 1.


Consumption and investment explode downwards, as a lower value of 𝛽 causes the representative consumer to front-load
consumption.
This explosive path shows up in the second endogenous eigenvalue now being larger than one.

econ5.endo, econ5.exo

(array([0.9 , 1.013]), array([1. , 0.8, 0.5]))



CHAPTER

NINETEEN

LUCAS ASSET PRICING USING DLE

This is one of a suite of lectures that use the quantecon DLE class to instantiate models within the [Hansen and Sargent,
2013] class of models described in detail in Recursive Models of Dynamic Linear Economies.
In addition to what’s in Anaconda, this lecture uses the quantecon library

!pip install --upgrade quantecon

This lecture uses the DLE class to price payout streams that are linear functions of the economy’s state vector, as well as
risk-free assets that pay out one unit of the first consumption good with certainty.
We assume basic knowledge of the class of economic environments that fall within the domain of the DLE class.
Many details about the basic environment are contained in the lecture Growth in Dynamic Linear Economies.
We’ll also need the following imports

import numpy as np
import matplotlib.pyplot as plt
from quantecon import DLE

We use a linear-quadratic version of an economy that Lucas (1978) [Lucas, 1978] used to develop an equilibrium theory
of asset prices:
Preferences
− (1/2) 𝔼 ∑_{t=0}^∞ β^t [ (c_t − b_t)^2 + l_t^2 ] | J_0

𝑠𝑡 = 𝑐𝑡
𝑏𝑡 = 𝑈𝑏 𝑧𝑡
Technology

𝑐𝑡 = 𝑑1𝑡

𝑘𝑡 = 𝛿𝑘 𝑘𝑡−1 + 𝑖𝑡
𝑔𝑡 = 𝜙1 𝑖𝑡 , 𝜙1 > 0
⎡ d_{1t} ⎤ = U_d z_t
⎣   0    ⎦
Information
z_{t+1} = ⎡ 1  0   0  ⎤ z_t + ⎡ 0  0 ⎤ w_{t+1}
          ⎢ 0 0.8  0  ⎥       ⎢ 1  0 ⎥
          ⎣ 0  0  0.5 ⎦       ⎣ 0  1 ⎦


U_b = [ 30  0  0 ]

U_d = ⎡ 5  1  0 ⎤
      ⎣ 0  0  0 ⎦

𝑥0 = [ 5 150 1 0 0 ]

19.1 Asset Pricing Equations

[Hansen and Sargent, 2013] show that the time t value of a permanent claim to a stream y_s = U_a x_s , s ≥ t is:

a_t = (x_t′ μ_a x_t + σ_a) / (ē_1 M_c x_t)

with


μ_a = ∑_{τ=0}^∞ β^τ (A^{o′})^τ Z_a A^{oτ}

σ_a = (β / (1 − β)) trace( Z_a ∑_{τ=0}^∞ β^τ (A^o)^τ C C′ (A^{o′})^τ )

where

Z_a = U_a M_c

The use of 𝑒1̄ indicates that the first consumption good is the numeraire.
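
As an illustration of how these objects can be computed in practice, note that μ_a and the inner sum in σ_a each solve a discounted Lyapunov equation; the sketch below uses small placeholder matrices for A^o, C and Z_a, which in applications would come from a solved DLE economy.

import numpy as np
from scipy.linalg import solve_discrete_lyapunov

β = 1 / 1.05
# Placeholder matrices for illustration only
A_o = np.array([[0.9, 0.0],
                [0.1, 0.5]])
C = np.array([[1.0],
              [0.3]])
Z_a = np.array([[1.0, 0.2],
                [0.2, 0.5]])

# μ_a solves μ_a = Z_a + β A_o' μ_a A_o
μ_a = solve_discrete_lyapunov(np.sqrt(β) * A_o.T, Z_a)

# The inner sum in σ_a solves V = C C' + β A_o V A_o'
V = solve_discrete_lyapunov(np.sqrt(β) * A_o, C @ C.T)
σ_a = β / (1 - β) * np.trace(Z_a @ V)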

19.2 Asset Pricing Simulations

gam = 0
γ = np.array([[gam], [0]])
ϕ_c = np.array([[1], [0]])
ϕ_g = np.array([[0], [1]])
ϕ_1 = 1e-4
ϕ_i = np.array([[0], [-ϕ_1]])
δ_k = np.array([[.95]])
θ_k = np.array([[1]])
β = np.array([[1 / 1.05]])
ud = np.array([[5, 1, 0],
[0, 0, 0]])
a22 = np.array([[1, 0, 0],
[0, 0.8, 0],
[0, 0, 0.5]])
c2 = np.array([[0, 1, 0],
[0, 0, 1]]).T
l_λ = np.array([[0]])
π_h = np.array([[1]])
δ_h = np.array([[.9]])
θ_h = np.array([[1]]) - δ_h
ub = np.array([[30, 0, 0]])
x0 = np.array([[5, 150, 1, 0, 0]]).T

info1 = (a22, c2, ub, ud)


tech1 = (ϕ_c, ϕ_g, ϕ_i, γ, δ_k, θ_k)
pref1 = (β, l_λ, π_h, δ_h, θ_h)


econ1 = DLE(info1, tech1, pref1)

After specifying a “Pay” matrix, we simulate the economy.


The particular choice of “Pay” used below means that we are pricing a perpetual claim on the endowment process 𝑑1𝑡

econ1.compute_sequence(x0, ts_length=100, Pay=np.array([econ1.Sd[0, :]]))

The graph below plots the price of this claim over time:

### Fig 7.12.1 from p.147 of HS2013


plt.plot(econ1.Pay_Price, label='Price of Tree')
plt.legend()
plt.show()

The next plot displays the realized gross rate of return on this “Lucas tree” as well as on a risk-free one-period bond:

### Left panel of Fig 7.12.2 from p.148 of HS2013


plt.plot(econ1.Pay_Gross, label='Tree')
plt.plot(econ1.R1_Gross, label='Risk-Free')
plt.legend()
plt.show()


np.corrcoef(econ1.Pay_Gross[1:, 0], econ1.R1_Gross[1:, 0])

array([[ 1. , -0.45097342],
[-0.45097342, 1. ]])

Above we have also calculated the correlation coefficient between these two returns.
To give an idea of how the term structure of interest rates moves in this economy, the next plot displays the net rates of
return on one-period and five-period risk-free bonds:

### Right panel of Fig 7.12.2 from p.148 of HS2013


plt.plot(econ1.R1_Net, label='One-Period')
plt.plot(econ1.R5_Net, label='Five-Period')
plt.legend()
plt.show()


From the above plot, we can see the tendency of the term structure to slope up when rates are low and to slope down
when rates are high.
Comparing it to the previous plot of the price of the “Lucas tree”, we can also see that net rates of return are low when
the price of the tree is high, and vice versa.
We now plot the realized gross rate of return on a “Lucas tree” as well as on a risk-free one-period bond when the
autoregressive parameter for the endowment process is reduced to 0.4:

a22_2 = np.array([[1, 0, 0],


[0, 0.4, 0],
[0, 0, 0.5]])
info2 = (a22_2, c2, ub, ud)

econ2 = DLE(info2, tech1, pref1)


econ2.compute_sequence(x0, ts_length=100, Pay=np.array([econ2.Sd[0, :]]))

### Left panel of Fig 7.12.3 from p.148 of HS2013


plt.plot(econ2.Pay_Gross, label='Tree')
plt.plot(econ2.R1_Gross, label='Risk-Free')
plt.legend()
plt.show()


np.corrcoef(econ2.Pay_Gross[1:, 0], econ2.R1_Gross[1:, 0])

array([[ 1. , -0.63164195],
[-0.63164195, 1. ]])

The correlation between these two gross rates is now more negative.
Next, we again plot the net rates of return on one-period and five-period risk-free bonds:

### Right panel of Fig 7.12.3 from p.148 of HS2013


plt.plot(econ2.R1_Net, label='One-Period')
plt.plot(econ2.R5_Net, label='Five-Period')
plt.legend()
plt.show()


We can see the tendency of the term structure to slope up when rates are low (and down when rates are high) has been
accentuated relative to the first instance of our economy.



CHAPTER

TWENTY

IRFS IN HALL MODELS

This is another member of a suite of lectures that use the quantecon DLE class to instantiate models within the [Hansen
and Sargent, 2013] class of models described in detail in Recursive Models of Dynamic Linear Economies.
In addition to what’s in Anaconda, this lecture uses the quantecon library.

!pip install --upgrade quantecon

We’ll make these imports:

import numpy as np
import matplotlib.pyplot as plt
from quantecon import DLE

This lecture shows how the DLE class can be used to create impulse response functions for three related economies,
starting from Hall (1978) [Hall, 1978].
Knowledge of the basic economic environment is assumed.
See the lecture “Growth in Dynamic Linear Economies” for more details.

20.1 Example 1: Hall (1978)

First, we set parameters to make consumption (almost) follow a random walk.


We set
λ = 0, π = 1, γ_1 = 0.1, ϕ_1 = 0.00001, δ_k = 0.95, β = 1/1.05
(In this example 𝛿ℎ and 𝜃ℎ are arbitrary as household capital does not enter the equation for consumption services.
We set them to values that will become useful in Example 3)
It is worth noting that this choice of parameter values ensures that 𝛽(𝛾1 + 𝛿𝑘 ) = 1.
For simulations of this economy, we choose an initial condition of:

𝑥0 = [ 5 150 1 0 0 ]

γ_1 = 0.1
γ = np.array([[γ_1], [0]])
ϕ_c = np.array([[1], [0]])
ϕ_g = np.array([[0], [1]])


ϕ_1 = 1e-5
ϕ_i = np.array([[1], [-ϕ_1]])
δ_k = np.array([[.95]])
θ_k = np.array([[1]])
β = np.array([[1 / 1.05]])
l_λ = np.array([[0]])
π_h = np.array([[1]])
δ_h = np.array([[.9]])
θ_h = np.array([[1]])
a22 = np.array([[1, 0, 0],
[0, 0.8, 0],
[0, 0, 0.5]])
c2 = np.array([[0, 0],
[1, 0],
[0, 1]])
ud = np.array([[5, 1, 0],
[0, 0, 0]])
ub = np.array([[30, 0, 0]])
x0 = np.array([[5], [150], [1], [0], [0]])

info1 = (a22, c2, ub, ud)


tech1 = (ϕ_c, ϕ_g, ϕ_i, γ, δ_k, θ_k)
pref1 = (β, l_λ, π_h, δ_h, θ_h)

These parameter values are used to define an economy of the DLE class.
We can then simulate the economy for a chosen length of time, from our initial state vector 𝑥0 .
The economy stores the simulated values for each variable. Below we plot consumption and investment:

econ1 = DLE(info1, tech1, pref1)


econ1.compute_sequence(x0, ts_length=300)

# This is the right panel of Fig 5.7.1 from p.105 of HS2013


plt.plot(econ1.c[0], label='Cons.')
plt.plot(econ1.i[0], label='Inv.')
plt.legend()
plt.show()


The DLE class can be used to create impulse response functions for each of the endogenous variables:
{𝑐𝑡 , 𝑠𝑡 , ℎ𝑡 , 𝑖𝑡 , 𝑘𝑡 , 𝑔𝑡 }.
If no selector vector for the shock is specified, the default choice is to give IRFs to the first shock in 𝑤𝑡+1 .
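
For instance, to trace out responses to the second element of w_{t+1} instead, a selector vector can be passed explicitly (a minimal sketch using the econ1 economy defined above):

# Give IRFs to the second shock in w_{t+1} instead of the default first one
shock_2 = np.array([[0], [1]])
econ1.irf(ts_length=40, shock=shock_2)
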
Below we plot the impulse response functions of investment and consumption to an endowment innovation (the first
shock) in the Hall model:

econ1.irf(ts_length=40, shock=None)
# This is the left panel of Fig 5.7.1 from p.105 of HS2013
plt.plot(econ1.c_irf, label='Cons.')
plt.plot(econ1.i_irf, label='Inv.')
plt.legend()
plt.show()


It can be seen that the endowment shock has permanent effects on the level of both consumption and investment, consistent
with the endogenous unit eigenvalue in this economy.
Investment is much more responsive to the endowment shock at shorter time horizons.

20.2 Example 2: Higher Adjustment Costs

We generate our next economy by making only one change to the parameters of Example 1: we raise the parameter
associated with the cost of adjusting capital, ϕ_1, from 0.00001 to 0.2.
This will lower the endogenous eigenvalue that is unity in Example 1 to a value slightly below 1.

ϕ_12 = 0.2
ϕ_i2 = np.array([[1], [-ϕ_12]])
tech2 = (ϕ_c, ϕ_g, ϕ_i2, γ, δ_k, θ_k)

econ2 = DLE(info1, tech2, pref1)


econ2.compute_sequence(x0, ts_length = 300)

# This is the right panel of Fig 5.8.1 from p.106 of HS2013


plt.plot(econ2.c[0], label='Cons.')
plt.plot(econ2.i[0], label='Inv.')
plt.legend()
plt.show()


econ2.irf(ts_length=40,shock=None)
# This is the left panel of Fig 5.8.1 from p.106 of HS2013
plt.plot(econ2.c_irf,label='Cons.')
plt.plot(econ2.i_irf,label='Inv.')
plt.legend()
plt.show()


econ2.endo

array([0.9 , 0.99657126])

econ2.compute_steadystate()
print(econ2.css, econ2.iss, econ2.kss)

[[5.]] [[2.92940472e-12]] [[5.85879555e-11]]

The first graph shows that there seems to be a downward trend in both consumption and investment.
This is a consequence of the decrease in the largest endogenous eigenvalue from unity in the earlier economy, caused by
the higher adjustment cost.
The present economy has a nonstochastic steady state value of 5 for consumption and 0 for both capital and investment.
Because the largest endogenous eigenvalue is still close to 1, the economy heads only slowly towards these mean values.
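
To get a feel for how slow this convergence is, a back-of-the-envelope sketch using the eigenvalue printed above gives the implied half-life of deviations from the steady state:

import numpy as np

# Half-life of deviations implied by the largest endogenous eigenvalue
λ_max = 0.99657126
half_life = np.log(0.5) / np.log(λ_max)
print(half_life)    # roughly 200 periods
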
The impulse response functions now show that an endowment shock does not have a permanent effect on the levels of
either consumption or investment.


20.3 Example 3: Durable Consumption Goods

We generate our third economy by raising 𝜙1 further, to 1.0. We also raise the production function parameter from 0.1
to 0.15 (which raises the non-stochastic steady state value of capital above zero).
We also change the specification of preferences to make the consumption good durable.
Specifically, we allow for a single durable household good obeying:

ℎ𝑡 = 𝛿ℎ ℎ𝑡−1 + 𝑐𝑡 , 0 < 𝛿ℎ < 1

Services are related to the stock of durables at the beginning of the period:

𝑠𝑡 = 𝜆ℎ𝑡−1 , 𝜆 > 0

And preferences are ordered by:

− (1/2) 𝔼 ∑_{t=0}^∞ β^t [ (λ h_{t−1} − b_t)^2 + l_t^2 ] | J_0

To implement this, we set 𝜆 = 0.1 and 𝜋 = 0 (we have already set 𝜃ℎ = 1 and 𝛿ℎ = 0.9).
We start from an initial condition that makes consumption begin near around its non-stochastic steady state.

ϕ_13 = 1
ϕ_i3 = np.array([[1], [-ϕ_13]])

γ_12 = 0.15
γ_2 = np.array([[γ_12], [0]])

l_λ2 = np.array([[0.1]])
π_h2 = np.array([[0]])

x01 = np.array([[150], [100], [1], [0], [0]])

tech3 = (ϕ_c, ϕ_g, ϕ_i3, γ_2, δ_k, θ_k)


pref2 = (β, l_λ2, π_h2, δ_h, θ_h)

econ3 = DLE(info1, tech3, pref2)


econ3.compute_sequence(x01, ts_length=300)

# This is the right panel of Fig 5.11.1 from p.111 of HS2013


plt.plot(econ3.c[0], label='Cons.')
plt.plot(econ3.i[0], label='Inv.')
plt.legend()
plt.show()


In contrast to Hall’s original model of Example 1, it is now investment that is much smoother than consumption.
This illustrates how making consumption goods durable tends to undo the strong consumption smoothing result that Hall
obtained.

econ3.irf(ts_length=40, shock=None)
# This is the left panel of Fig 5.11.1 from p.111 of HS2013
plt.plot(econ3.c_irf, label='Cons.')
plt.plot(econ3.i_irf, label='Inv.')
plt.legend()
plt.show()


The impulse response functions confirm that consumption is now much more responsive to an endowment shock (and
investment less so) than in Example 1.
As in Example 2, the endowment shock has permanent effects on neither variable.



CHAPTER

TWENTYONE

PERMANENT INCOME MODEL USING THE DLE CLASS

This lecture is part of a suite of lectures that use the quantecon DLE class to instantiate models within the [Hansen and
Sargent, 2013] class of models described in detail in Recursive Models of Dynamic Linear Economies.
In addition to what’s included in Anaconda, this lecture uses the quantecon library.

!pip install --upgrade quantecon

This lecture adds a third solution method for the linear-quadratic-Gaussian permanent income model with 𝛽𝑅 = 1, com-
plementing the other two solution methods described in Optimal Savings I: The Permanent Income Model and Optimal
Savings II: LQ Techniques and this Jupyter notebook.
The additional solution method uses the DLE class.
In this way, we map the permanent income model into the framework of Hansen & Sargent (2013) “Recursive Models
of Dynamic Linear Economies” [Hansen and Sargent, 2013].
We’ll also require the following imports

import numpy as np
import matplotlib.pyplot as plt
from quantecon import DLE

np.set_printoptions(suppress=True, precision=4)

21.1 The Permanent Income Model

The LQ permanent income model is an example of a savings problem.


A consumer has preferences over consumption streams that are ordered by the utility functional

E_0 ∑_{t=0}^∞ β^t u(c_t)                                        (21.1)

where 𝐸𝑡 is the mathematical expectation conditioned on the consumer’s time 𝑡 information, 𝑐𝑡 is time 𝑡 consumption,
𝑢(𝑐) is a strictly concave one-period utility function, and 𝛽 ∈ (0, 1) is a discount factor.
The LQ model gets its name partly from assuming that the utility function 𝑢 is quadratic:

𝑢(𝑐) = −.5(𝑐 − 𝛾)2

where 𝛾 > 0 is a bliss level of consumption.


The consumer maximizes the utility functional (21.1) by choosing a consumption, borrowing plan {𝑐𝑡 , 𝑏𝑡+1 }∞
𝑡=0 subject
to the sequence of budget constraints

c_t + b_t = R^{−1} b_{t+1} + y_t ,   t ≥ 0                      (21.2)

where 𝑦𝑡 is an exogenous stationary endowment process, 𝑅 is a constant gross risk-free interest rate, 𝑏𝑡 is one-period
risk-free debt maturing at 𝑡, and 𝑏0 is a given initial condition.
We shall assume that 𝑅−1 = 𝛽.
Equation (21.2) is linear.
We use another set of linear equations to model the endowment process.
In particular, we assume that the endowment process has the state-space representation

z_{t+1} = A_{22} z_t + C_2 w_{t+1}
y_t = U_y z_t                                                   (21.3)

where 𝑤𝑡+1 is an IID process with mean zero and identity contemporaneous covariance matrix, 𝐴22 is a stable matrix,
its eigenvalues being strictly below unity in modulus, and 𝑈𝑦 is a selection vector that identifies 𝑦 with a particular linear
combination of the 𝑧𝑡 .
We impose the following condition on the consumption, borrowing plan:

E_0 ∑_{t=0}^∞ β^t b_t^2 < +∞                                    (21.4)

This condition suffices to rule out Ponzi schemes.


(We impose this condition to rule out a borrow-more-and-more plan that would allow the household to enjoy bliss con-
sumption forever)
The state vector confronting the household at 𝑡 is

x_t = ⎡ z_t ⎤
      ⎣ b_t ⎦

where 𝑏𝑡 is its one-period debt falling due at the beginning of period 𝑡 and 𝑧𝑡 contains all variables useful for forecasting
its future endowment.
We assume that {𝑦𝑡 } follows a second order univariate autoregressive process:

𝑦𝑡+1 = 𝛼 + 𝜌1 𝑦𝑡 + 𝜌2 𝑦𝑡−1 + 𝜎𝑤𝑡+1

21.1.1 Solution with the DLE Class

One way of solving this model is to map the problem into the framework outlined in Section 4.8 of [Hansen and Sargent,
2013] by setting up our technology, information and preference matrices as follows:
Technology:  ϕ_c = ⎡ 1 ⎤ ,  ϕ_g = ⎡ 0 ⎤ ,  ϕ_i = ⎡    −1    ⎤ ,  Γ = ⎡ −1 ⎤ ,  Δ_k = 0 ,  Θ_k = R.
                   ⎣ 0 ⎦         ⎣ 1 ⎦         ⎣ −0.00001 ⎦        ⎣  0 ⎦

Information:  A_22 = ⎡ 1   0    0  ⎤ ,  C_2 = ⎡ 0 ⎤ ,  U_b = [ γ  0  0 ] ,  U_d = ⎡ 0  1  0 ⎤ .
                     ⎢ α  ρ_1  ρ_2 ⎥         ⎢ σ ⎥                                ⎣ 0  0  0 ⎦
                     ⎣ 0   1    0  ⎦         ⎣ 0 ⎦
Preferences: Λ = 0, Π = 1, Δℎ = 0, Θℎ = 0.
We set parameters


𝛼 = 10, 𝛽 = 0.95, 𝜌1 = 0.9, 𝜌2 = 0, 𝜎 = 1


(The value of 𝛾 does not affect the optimal decision rule)
The chosen matrices mean that the household’s technology is:

c_t + k_{t−1} = i_t + y_t

k_t / R = i_t

l_t^2 = (0.00001)^2 i_t^2
Combining the first two of these gives the budget constraint of the permanent income model, where 𝑘𝑡 = 𝑏𝑡+1 .
The third equation is a very small penalty on debt-accumulation to rule out Ponzi schemes.
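
Before handing the model to the DLE class, here is a small sketch (using the parameter values above) checking that the information matrices displayed earlier reproduce the AR(2) law of motion for the endowment:

import numpy as np

# Verify that A22 and C2 reproduce y_{t+1} = α + ρ_1 y_t + ρ_2 y_{t-1} + σ w_{t+1},
# with z_t = [1, y_t, y_{t-1}]'
α, ρ_1, ρ_2, σ = 10, 0.9, 0, 1

A22 = np.array([[1, 0,   0  ],
                [α, ρ_1, ρ_2],
                [0, 1,   0  ]])
C2 = np.array([[0], [σ], [0]])

rng = np.random.default_rng(0)
z = np.array([[1.0], [100.0], [100.0]])   # start at the unconditional mean of y
for t in range(5):
    w = rng.standard_normal((1, 1))
    z_new = A22 @ z + C2 @ w
    assert np.isclose(z_new[1, 0],
                      α + ρ_1 * z[1, 0] + ρ_2 * z[2, 0] + σ * w[0, 0])
    z = z_new
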
We set up this instance of the DLE class below:

α, β, ρ_1, ρ_2, σ = 10, 0.95, 0.9, 0, 1

γ = np.array([[-1], [0]])
ϕ_c = np.array([[1], [0]])
ϕ_g = np.array([[0], [1]])
ϕ_1 = 1e-5
ϕ_i = np.array([[-1], [-ϕ_1]])
δ_k = np.array([[0]])
θ_k = np.array([[1 / β]])
β = np.array([[β]])
l_λ = np.array([[0]])
π_h = np.array([[1]])
δ_h = np.array([[0]])
θ_h = np.array([[0]])

a22 = np.array([[1, 0, 0],


[α, ρ_1, ρ_2],
[0, 1, 0]])

c2 = np.array([[0], [σ], [0]])


ud = np.array([[0, 1, 0],
[0, 0, 0]])
ub = np.array([[100, 0, 0]])

x0 = np.array([[0], [0], [1], [0], [0]])

info1 = (a22, c2, ub, ud)


tech1 = (ϕ_c, ϕ_g, ϕ_i, γ, δ_k, θ_k)
pref1 = (β, l_λ, π_h, δ_h, θ_h)
econ1 = DLE(info1, tech1, pref1)

To check the solution of this model with that from the LQ problem, we select the 𝑆𝑐 matrix from the DLE class.
The solution to the DLE economy has:

𝑐𝑡 = 𝑆𝑐 𝑥𝑡

econ1.Sc


array([[ 0. , -0.05 , 65.5172, 0.3448, 0. ]])

The state vector in the DLE class is:


x_t = ⎡ h_{t−1} ⎤
      ⎢ k_{t−1} ⎥
      ⎣  z_t    ⎦
where 𝑘𝑡−1 = 𝑏𝑡 is set up to be 𝑏𝑡 in the permanent income model.
The state vector in the LQ problem is [ z_t , b_t ]′.
Consequently, the relevant elements of econ1.Sc coincide with those of −F that we obtain when we apply other approaches to the
same model in the lecture Optimal Savings II: LQ Techniques and this Jupyter notebook.
The plot below quickly replicates the first two figures of that lecture and that notebook to confirm that the solutions are
the same

fig, (ax1, ax2) = plt.subplots(1, 2, figsize=(12, 4))

for i in range(25):
econ1.compute_sequence(x0, ts_length=150)
ax1.plot(econ1.c[0], c='g')
ax1.plot(econ1.d[0], c='b')
ax1.plot(econ1.c[0], label='Consumption', c='g')
ax1.plot(econ1.d[0], label='Income', c='b')
ax1.legend()

for i in range(25):
econ1.compute_sequence(x0, ts_length=150)
ax2.plot(econ1.k[0], color='r')
ax2.plot(econ1.k[0], label='Debt', c='r')
ax2.legend()
plt.show()

CHAPTER

TWENTYTWO

ROSEN SCHOOLING MODEL

This lecture is yet another part of a suite of lectures that use the quantecon DLE class to instantiate models within the
[Hansen and Sargent, 2013] class of models described in detail in Recursive Models of Dynamic Linear Economies.
In addition to what’s included in Anaconda, this lecture uses the quantecon library

!pip install --upgrade quantecon

We’ll also need the following imports:

import numpy as np
import matplotlib.pyplot as plt
from collections import namedtuple
from quantecon import DLE

22.1 A One-Occupation Model

Ryoo and Rosen’s (2004) [Ryoo and Rosen, 2004] partial equilibrium model determines
• a stock of “Engineers” 𝑁𝑡
• a number of new entrants in engineering school, 𝑛𝑡
• the wage rate of engineers, 𝑤𝑡
It takes k periods of schooling to become an engineer.
The model consists of the following equations:
• a demand curve for engineers:
𝑤𝑡 = −𝛼𝑑 𝑁𝑡 + 𝜖𝑑𝑡
• a time-to-build structure of the education process:
𝑁𝑡+𝑘 = 𝛿𝑁 𝑁𝑡+𝑘−1 + 𝑛𝑡
• a definition of the discounted present value of each new engineering student:

v_t = β^k 𝔼 ∑_{j=0}^∞ (β δ_N)^j w_{t+k+j}

• a supply curve of new students driven by present value 𝑣𝑡 :


𝑛𝑡 = 𝛼𝑠 𝑣𝑡 + 𝜖𝑠𝑡


22.2 Mapping into HS2013 Framework

We represent this model in the [Hansen and Sargent, 2013] framework by


• sweeping the time-to-build structure and the demand for engineers into the household technology, and
• putting the supply of engineers into the technology for producing goods

22.2.1 Preferences

Π = 0 ,   Λ = [ α_d  0  ⋯  0 ] ,   Δ_h = ⎡ δ_N 1 0 ⋯ 0 ⎤ ,   Θ_h = ⎡ 0 ⎤
                                         ⎢ 0   0 1 ⋯ 0 ⎥          ⎢ 0 ⎥
                                         ⎢ ⋮   ⋮ ⋮ ⋱ ⋮ ⎥          ⎢ ⋮ ⎥
                                         ⎢ 0   ⋯ ⋯ 0 1 ⎥          ⎢ 0 ⎥
                                         ⎣ 0   0 0 ⋯ 0 ⎦          ⎣ 1 ⎦

where Λ is a 1 x (k+1) matrix, Δ_h is a (k+1) x (k+1) matrix, and Θ_h is a (k+1) x 1 matrix.
This specification sets 𝑁𝑡 = ℎ1𝑡−1 , 𝑛𝑡 = 𝑐𝑡 , ℎ𝜏+1,𝑡−1 = 𝑛𝑡−(𝑘−𝜏) for 𝜏 = 1, ..., 𝑘.
Below we set things up so that the number of years of education, 𝑘, can be varied.

22.2.2 Technology

To capture Ryoo and Rosen’s [Ryoo and Rosen, 2004] supply curve, we use the physical technology:

𝑐𝑡 = 𝑖𝑡 + 𝑑1𝑡

𝜓1 𝑖𝑡 = 𝑔𝑡
where 𝜓1 is inversely proportional to 𝛼𝑠 .

22.2.3 Information

Because we want 𝑏𝑡 = 𝜖𝑑𝑡 and 𝑑1𝑡 = 𝜖𝑠𝑡 , we set

A_22 = ⎡ 1  0    0  ⎤ ,   C_2 = ⎡ 0  0 ⎤ ,   U_b = [ 30  0  1 ] ,   U_d = ⎡ 10  1  0 ⎤
       ⎢ 0  ρ_s  0  ⎥          ⎢ 1  0 ⎥                                   ⎣  0  0  0 ⎦
       ⎣ 0  0   ρ_d ⎦          ⎣ 0  1 ⎦

where 𝜌𝑠 and 𝜌𝑑 describe the persistence of the supply and demand shocks

Information = namedtuple('Information', ['a22', 'c2','ub','ud'])


Technology = namedtuple('Technology', ['ϕ_c', 'ϕ_g', 'ϕ_i', 'γ', 'δ_k', 'θ_k'])
Preferences = namedtuple('Preferences', ['β', 'l_λ', 'π_h', 'δ_h', 'θ_h'])


22.2.4 Effects of Changes in Education Technology and Demand

We now study how changing


• the number of years of education required to become an engineer and
• the slope of the demand curve
affects responses to demand shocks.
To begin, we set 𝑘 = 4 and 𝛼𝑑 = 0.1

k = 4 # Number of periods of schooling required to become an engineer

β = np.array([[1 / 1.05]])
α_d = np.array([[0.1]])
α_s = 1
ε_1 = 1e-7
λ_1 = np.full((1, k), ε_1)
# Use of ε_1 is a trick to acquire detectability, see HS2013 p. 228 footnote 4
l_λ = np.hstack((α_d, λ_1))
π_h = np.array([[0]])

δ_n = np.array([[0.95]])
d1 = np.vstack((δ_n, np.zeros((k - 1, 1))))
d2 = np.hstack((d1, np.eye(k)))
δ_h = np.vstack((d2, np.zeros((1, k + 1))))

θ_h = np.vstack((np.zeros((k, 1)),


np.ones((1, 1))))

ψ_1 = 1 / α_s

ϕ_c = np.array([[1], [0]])


ϕ_g = np.array([[0], [-1]])
ϕ_i = np.array([[-1], [ψ_1]])
γ = np.array([[0], [0]])

δ_k = np.array([[0]])
θ_k = np.array([[0]])

ρ_s = 0.8
ρ_d = 0.8

a22 = np.array([[1, 0, 0],


[0, ρ_s, 0],
[0, 0, ρ_d]])

c2 = np.array([[0, 0], [10, 0], [0, 10]])


ub = np.array([[30, 0, 1]])
ud = np.array([[10, 1, 0], [0, 0, 0]])

info1 = Information(a22, c2, ub, ud)


tech1 = Technology(ϕ_c, ϕ_g, ϕ_i, γ, δ_k, θ_k)
pref1 = Preferences(β, l_λ, π_h, δ_h, θ_h)

econ1 = DLE(info1, tech1, pref1)

We create three other instances by:


1. Raising 𝛼𝑑 to 2
2. Raising 𝑘 to 7
3. Raising 𝑘 to 10

α_d = np.array([[2]])
l_λ = np.hstack((α_d, λ_1))
pref2 = Preferences(β, l_λ, π_h, δ_h, θ_h)
econ2 = DLE(info1, tech1, pref2)

α_d = np.array([[0.1]])

k = 7
λ_1 = np.full((1, k), ε_1)
l_λ = np.hstack((α_d, λ_1))
d1 = np.vstack((δ_n, np.zeros((k - 1, 1))))
d2 = np.hstack((d1, np.eye(k)))
δ_h = np.vstack((d2, np.zeros((1, k+1))))
θ_h = np.vstack((np.zeros((k, 1)),
np.ones((1, 1))))

Pref3 = Preferences(β, l_λ, π_h, δ_h, θ_h)


econ3 = DLE(info1, tech1, Pref3)

k = 10
λ_1 = np.full((1, k), ε_1)
l_λ = np.hstack((α_d, λ_1))
d1 = np.vstack((δ_n, np.zeros((k - 1, 1))))
d2 = np.hstack((d1, np.eye(k)))
δ_h = np.vstack((d2, np.zeros((1, k + 1))))
θ_h = np.vstack((np.zeros((k, 1)),
np.ones((1, 1))))

pref4 = Preferences(β, l_λ, π_h, δ_h, θ_h)


econ4 = DLE(info1, tech1, pref4)

shock_demand = np.array([[0], [1]])

econ1.irf(ts_length=25, shock=shock_demand)
econ2.irf(ts_length=25, shock=shock_demand)
econ3.irf(ts_length=25, shock=shock_demand)
econ4.irf(ts_length=25, shock=shock_demand)

The first figure plots the impulse response of 𝑛𝑡 (on the left) and 𝑁𝑡 (on the right) to a positive demand shock, for 𝛼𝑑 = 0.1
and 𝛼𝑑 = 2.
When 𝛼𝑑 = 2, the number of new students 𝑛𝑡 rises initially, but the response then turns negative.
A positive demand shock raises wages, drawing new students into the profession.
However, these new students raise 𝑁𝑡 .
The higher is 𝛼𝑑 , the larger the effect of this rise in 𝑁𝑡 on wages.
This counteracts the demand shock’s positive effect on wages, reducing the number of new students in subsequent periods.
Consequently, when 𝛼𝑑 is lower, the effect of a demand shock on 𝑁𝑡 is larger


fig, (ax1, ax2) = plt.subplots(1, 2, figsize=(12, 4))


ax1.plot(econ1.c_irf,label=r'$\alpha_d = 0.1$')
ax1.plot(econ2.c_irf,label=r'$\alpha_d = 2$')
ax1.legend()
ax1.set_title('Response of $n_t$ to a demand shock')

ax2.plot(econ1.h_irf[:, 0], label=r'$\alpha_d = 0.1$')


ax2.plot(econ2.h_irf[:, 0], label=r'$\alpha_d = 2$')
ax2.legend()
ax2.set_title('Response of $N_t$ to a demand shock')
plt.show()

The next figure plots the impulse response of 𝑛𝑡 (on the left) and 𝑁𝑡 (on the right) to a positive demand shock, for 𝑘 = 4,
𝑘 = 7 and 𝑘 = 10 (with 𝛼𝑑 = 0.1)

fig, (ax1, ax2) = plt.subplots(1, 2, figsize=(12, 4))


ax1.plot(econ1.c_irf, label='$k=4$')
ax1.plot(econ3.c_irf, label='$k=7$')
ax1.plot(econ4.c_irf, label='$k=10$')
ax1.legend()
ax1.set_title('Response of $n_t$ to a demand shock')

ax2.plot(econ1.h_irf[:,0], label='$k=4$')
ax2.plot(econ3.h_irf[:,0], label='$k=7$')
ax2.plot(econ4.h_irf[:,0], label='$k=10$')
ax2.legend()
ax2.set_title('Response of $N_t$ to a demand shock')
plt.show()


Both panels in the above figure show that raising 𝑘 lowers the effect of a positive demand shock on entry into the engi-
neering profession.
Increasing the number of periods of schooling lowers the number of new students in response to a demand shock.
This occurs because with longer required schooling, new students ultimately benefit less from the impact of that shock
on wages.



CHAPTER

TWENTYTHREE

CATTLE CYCLES

This is another member of a suite of lectures that use the quantecon DLE class to instantiate models within the [Hansen
and Sargent, 2013] class of models described in detail in Recursive Models of Dynamic Linear Economies.
In addition to what’s in Anaconda, this lecture uses the quantecon library.

!pip install --upgrade quantecon

This lecture uses the DLE class to construct instances of the “Cattle Cycles” model of Rosen, Murphy and Scheinkman
(1994) [Rosen et al., 1994].
That paper constructs a rational expectations equilibrium model to understand sources of recurrent cycles in US cattle
stocks and prices.
We make the following imports:

import numpy as np
import matplotlib.pyplot as plt
from collections import namedtuple
from quantecon import DLE
from math import sqrt

23.1 The Model

The model features a static linear demand curve and a “time-to-grow” structure for cattle.
Let 𝑝𝑡 be the price of slaughtered beef, 𝑚𝑡 the cost of preparing an animal for slaughter, ℎ𝑡 the holding cost for a mature
animal, 𝛾1 ℎ𝑡 the holding cost for a yearling, and 𝛾0 ℎ𝑡 the holding cost for a calf.
The cost processes {h_t, m_t}_{t=0}^∞ are exogenous, while the price process {p_t}_{t=0}^∞ is determined within a rational
expectations equilibrium.
Let 𝑥𝑡 be the breeding stock, and 𝑦𝑡 be the total stock of cattle.
The law of motion for the breeding stock is

𝑥𝑡 = (1 − 𝛿)𝑥𝑡−1 + 𝑔𝑥𝑡−3 − 𝑐𝑡

where 𝑔 < 1 is the number of calves that each member of the breeding stock has each year, and 𝑐𝑡 is the number of cattle
slaughtered.
The total headcount of cattle is

𝑦𝑡 = 𝑥𝑡 + 𝑔𝑥𝑡−1 + 𝑔𝑥𝑡−2


This equation states that the total number of cattle equals the sum of adults, calves and yearlings, respectively.
A representative farmer chooses {𝑐𝑡 , 𝑥𝑡 } to maximize:

𝔼_0 ∑_{t=0}^∞ β^t { p_t c_t − h_t x_t − γ_0 h_t (g x_{t−1}) − γ_1 h_t (g x_{t−2}) − m_t c_t − (ψ_1/2) x_t^2 − (ψ_2/2) x_{t−1}^2 − (ψ_3/2) x_{t−3}^2 − (ψ_4/2) c_t^2 }

subject to the law of motion for 𝑥𝑡 , taking as given the stochastic laws of motion for the exogenous processes, the equi-
librium price process, and the initial state [𝑥−1 , 𝑥−2 , 𝑥−3 ].
Remark The 𝜓𝑗 parameters are very small quadratic costs that are included for technical reasons to make well posed and
well behaved the linear quadratic dynamic programming problem solved by the fictitious planner who in effect chooses
equilibrium quantities and shadow prices.
Demand for beef is governed by c_t = a_0 − a_1 p_t + d̃_t , where d̃_t is a stochastic process with mean zero, representing a
demand shifter.

23.2 Mapping into HS2013 Framework

23.2.1 Preferences

We set Λ = 0, Δ_h = 0, Θ_h = 0, Π = α_1^{−1/2} and b_t = Π d̃_t + Π α_0.
With these settings, the FOC for the household’s problem becomes the demand curve of the “Cattle Cycles” model.

23.2.2 Technology

To capture the law of motion for cattle, we set

Δ_k = ⎡ (1 − δ)  0  g ⎤ ,   Θ_k = ⎡ 1 ⎤
      ⎢    1     0  0 ⎥          ⎢ 0 ⎥
      ⎣    0     1  0 ⎦          ⎣ 0 ⎦
(where 𝑖𝑡 = −𝑐𝑡 ).
To capture the production of cattle, we set

Φ_c = ⎡  1   ⎤ ,  Φ_g = ⎡ 0 0 0 0 ⎤ ,  Φ_i = ⎡ 1 ⎤ ,  Γ = ⎡     0        0     0   ⎤
      ⎢ f_1  ⎥         ⎢ 1 0 0 0 ⎥         ⎢ 0 ⎥        ⎢ f_1(1 − δ)   0   g f_1 ⎥
      ⎢  0   ⎥         ⎢ 0 1 0 0 ⎥         ⎢ 0 ⎥        ⎢    f_3       0     0   ⎥
      ⎢  0   ⎥         ⎢ 0 0 1 0 ⎥         ⎢ 0 ⎥        ⎢     0       f_5    0   ⎥
      ⎣ −f_7 ⎦         ⎣ 0 0 0 1 ⎦         ⎣ 0 ⎦        ⎣     0        0     0   ⎦

23.2.3 Information

We set
A_22 = ⎡ 1   0    0    0  ⎤ ,   C_2 = ⎡ 0  0  0  ⎤ ,   U_b = [ Π α_0  0  0  Π ] ,   U_d = ⎡    0    ⎤
       ⎢ 0  ρ_1   0    0  ⎥          ⎢ 1  0  0  ⎥                                        ⎢ f_2 U_h ⎥
       ⎢ 0   0   ρ_2   0  ⎥          ⎢ 0  1  0  ⎥                                        ⎢ f_4 U_h ⎥
       ⎣ 0   0    0   ρ_3 ⎦          ⎣ 0  0  15 ⎦                                        ⎢ f_6 U_h ⎥
                                                                                          ⎣ f_8 U_m ⎦

To map this into our class, we set f_1^2 = Ψ_1/2 , f_2^2 = Ψ_2/2 , f_3^2 = Ψ_3/2 , 2 f_1 f_2 = 1 , 2 f_3 f_4 = γ_0 g , 2 f_5 f_6 = γ_1 g.


# We define namedtuples in this way as it allows us to check, for example,


# what matrices are associated with a particular technology.

Information = namedtuple('Information', ['a22', 'c2', 'ub', 'ud'])


Technology = namedtuple('Technology', ['ϕ_c', 'ϕ_g', 'ϕ_i', 'γ', 'δ_k', 'θ_k'])
Preferences = namedtuple('Preferences', ['β', 'l_λ', 'π_h', 'δ_h', 'θ_h'])

We set parameters to those used by [Rosen et al., 1994]

β = np.array([[0.909]])
lλ = np.array([[0]])

a1 = 0.5
πh = np.array([[1 / (sqrt(a1))]])
δh = np.array([[0]])
θh = np.array([[0]])

δ = 0.1
g = 0.85
f1 = 0.001
f3 = 0.001
f5 = 0.001
f7 = 0.001

ϕc = np.array([[1], [f1], [0], [0], [-f7]])

ϕg = np.array([[0, 0, 0, 0],
[1, 0, 0, 0],
[0, 1, 0, 0],
[0, 0, 1,0],
[0, 0, 0, 1]])

ϕi = np.array([[1], [0], [0], [0], [0]])

γ = np.array([[ 0, 0, 0],
[f1 * (1 - δ), 0, g * f1],
[ f3, 0, 0],
[ 0, f5, 0],
[ 0, 0, 0]])

δk = np.array([[1 - δ, 0, g],
[ 1, 0, 0],
[ 0, 1, 0]])

θk = np.array([[1], [0], [0]])

ρ1 = 0
ρ2 = 0
ρ3 = 0.6
a0 = 500
γ0 = 0.4
γ1 = 0.7
f2 = 1 / (2 * f1)
f4 = γ0 * g / (2 * f3)
f6 = γ1 * g / (2 * f5)
f8 = 1 / (2 * f7)


a22 = np.array([[1, 0, 0, 0],
[0, ρ1, 0, 0],
[0, 0, ρ2, 0],
[0, 0, 0, ρ3]])

c2 = np.array([[0, 0, 0],
[1, 0, 0],
[0, 1, 0],
[0, 0, 15]])

πh_scalar = πh.item()
ub = np.array([[πh_scalar * a0, 0, 0, πh_scalar]])
uh = np.array([[50, 1, 0, 0]])
um = np.array([[100, 0, 1, 0]])
ud = np.vstack(([0, 0, 0, 0],
f2 * uh, f4 * uh, f6 * uh, f8 * um))

Notice that we have set 𝜌1 = 𝜌2 = 0, so ℎ𝑡 and 𝑚𝑡 consist of a constant and a white noise component.
We set up the economy using tuples for information, technology and preference matrices below.
We also construct two extra information matrices, corresponding to cases when 𝜌3 = 1 and 𝜌3 = 0 (as opposed to the
baseline case of 𝜌3 = 0.6).

info1 = Information(a22, c2, ub, ud)


tech1 = Technology(ϕc, ϕg, ϕi, γ, δk, θk)
pref1 = Preferences(β, lλ, πh, δh, θh)

ρ3_2 = 1
a22_2 = np.array([[1, 0, 0, 0],
[0, ρ1, 0, 0],
[0, 0, ρ2, 0],
[0, 0, 0, ρ3_2]])

info2 = Information(a22_2, c2, ub, ud)

ρ3_3 = 0
a22_3 = np.array([[1, 0, 0, 0],
[0, ρ1, 0, 0],
[0, 0, ρ2, 0],
[0, 0, 0, ρ3_3]])

info3 = Information(a22_3, c2, ub, ud)

# Example of how we can look at the matrices associated with a given namedtuple
info1.a22

array([[1. , 0. , 0. , 0. ],
[0. , 0. , 0. , 0. ],
[0. , 0. , 0. , 0. ],
[0. , 0. , 0. , 0.6]])

# Use tuples to define DLE class


econ1 = DLE(info1, tech1, pref1)
econ2 = DLE(info2, tech1, pref1)

econ3 = DLE(info3, tech1, pref1)

# Calculate steady-state in baseline case and use to set the initial condition
econ1.compute_steadystate(nnc=4)
x0 = econ1.zz

econ1.compute_sequence(x0, ts_length=100)

[Rosen et al., 1994] use the model to understand the sources of recurrent cycles in total cattle stocks.
Plotting 𝑦𝑡 for a simulation of their model shows its ability to generate cycles in quantities

# Calculation of y_t
totalstock = econ1.k[0] + g * econ1.k[1] + g * econ1.k[2]
fig, ax = plt.subplots()
ax.plot(totalstock)
ax.set_xlim((-1, 100))
ax.set_title('Total number of cattle')
plt.show()

In their Figure 3, [Rosen et al., 1994] plot the impulse response functions of consumption and the breeding stock of cattle
to the demand shock, 𝑑𝑡̃ , under the three different values of 𝜌3 .
We replicate their Figure 3 below


shock_demand = np.array([[0], [0], [1]])

econ1.irf(ts_length=25, shock=shock_demand)
econ2.irf(ts_length=25, shock=shock_demand)
econ3.irf(ts_length=25, shock=shock_demand)

fig, (ax1, ax2) = plt.subplots(1, 2, figsize=(12, 4))


ax1.plot(econ1.c_irf, label=r'$\rho=0.6$')
ax1.plot(econ2.c_irf, label=r'$\rho=1$')
ax1.plot(econ3.c_irf, label=r'$\rho=0$')
ax1.set_title('Consumption response to demand shock')
ax1.legend()

ax2.plot(econ1.k_irf[:, 0], label=r'$\rho=0.6$')


ax2.plot(econ2.k_irf[:, 0], label=r'$\rho=1$')
ax2.plot(econ3.k_irf[:, 0], label=r'$\rho=0$')
ax2.set_title('Breeding stock response to demand shock')
ax2.legend()
plt.show()

The above figures show how consumption patterns differ markedly, depending on the persistence of the demand shock:
• If it is purely transitory (𝜌3 = 0) then consumption rises immediately but is later reduced to build stocks up again.
• If it is permanent (𝜌3 = 1), then consumption falls immediately, in order to build up stocks to satisfy the permanent
rise in future demand.
In Figure 4 of their paper, [Rosen et al., 1994] plot the response to a demand shock of the breeding stock and the total
stock, for 𝜌3 = 0 and 𝜌3 = 0.6.
We replicate their Figure 4 below

total1_irf = econ1.k_irf[:, 0] + g * econ1.k_irf[:, 1] + g * econ1.k_irf[:, 2]


total3_irf = econ3.k_irf[:, 0] + g * econ3.k_irf[:, 1] + g * econ3.k_irf[:, 2]

fig, (ax1, ax2) = plt.subplots(1, 2, figsize=(12, 4))


ax1.plot(econ1.k_irf[:, 0], label='Breeding Stock')
ax1.plot(total1_irf, label='Total Stock')
ax1.set_title(r'$\rho=0.6$')

ax2.plot(econ3.k_irf[:, 0], label='Breeding Stock')


ax2.plot(total3_irf, label='Total Stock')

ax2.set_title(r'$\rho=0$')
plt.show()

The fact that 𝑦𝑡 is a weighted moving average of 𝑥𝑡 creates a humped shape response of the total stock in response to
demand shocks, contributing to the cyclicality seen in the first graph of this lecture.



CHAPTER

TWENTYFOUR

SHOCK NON INVERTIBILITY

24.1 Overview

This is another member of a suite of lectures that use the quantecon DLE class to instantiate models within the [Hansen
and Sargent, 2013] class of models described in Recursive Models of Dynamic Linear Economies.
In addition to what’s in Anaconda, this lecture uses the quantecon library.

!pip install --upgrade quantecon

We’ll make these imports:

import numpy as np
import quantecon as qe
import matplotlib.pyplot as plt
from quantecon import DLE
from math import sqrt

This lecture describes an early contribution to what is now often called a news and noise issue.
In particular, it analyzes a shock-invertibility issue that is endemic within a class of permanent income models.
Technically, the invertibility problem indicates a situation in which histories of the shocks in an econometrician's autoregressive or Wold moving average representation span a smaller information space than do the shocks that are seen by the agents inside the econometrician's model.
An econometrician who is unaware of the problem would misinterpret shocks and likely responses to them.
A shock-invertibility issue that is technically close to the one studied here is discussed by Eric Leeper, Todd Walker, and Susan
Yang [Leeper et al., 2013] in their analysis of fiscal foresight.
A distinct shock-invertibility issue is present in the special LQ consumption smoothing model in this quantecon lecture
Information and Consumption Smoothing.


24.2 Model

We consider the following modification of Robert Hall’s (1978) model [Hall, 1978] in which the endowment process is
the sum of two orthogonal autoregressive processes:
Preferences
$$-\frac{1}{2}\, \mathbb{E}\sum_{t=0}^{\infty} \beta^t \left[(c_t - b_t)^2 + l_t^2\right] \Big|\, J_0$$

$$s_t = c_t$$

$$b_t = U_b z_t$$

Technology

$$c_t + i_t = \gamma k_{t-1} + d_t$$

$$k_t = \delta_k k_{t-1} + i_t$$

$$g_t = \phi_1 i_t , \quad \phi_1 > 0$$

$$g_t \cdot g_t = l_t^2$$
Information
$$z_{t+1} = \begin{bmatrix} 1 & 0 & 0 & 0 & 0 & 0 \\ 0 & 0.9 & 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 & 0 & 0 \\ 0 & 0 & 1 & 0 & 0 & 0 \\ 0 & 0 & 0 & 1 & 0 & 0 \\ 0 & 0 & 0 & 0 & 1 & 0 \end{bmatrix} z_t + \begin{bmatrix} 0 & 0 \\ 1 & 0 \\ 0 & 4 \\ 0 & 0 \\ 0 & 0 \\ 0 & 0 \end{bmatrix} w_{t+1}$$

$$U_b = \begin{bmatrix} 30 & 0 & 0 & 0 & 0 & 0 \end{bmatrix}$$

$$U_d = \begin{bmatrix} 5 & 1 & 1 & 0.8 & 0.6 & 0.4 \\ 0 & 0 & 0 & 0 & 0 & 0 \end{bmatrix}$$
The preference shock is constant at 30, while the endowment process is the sum of a constant and two orthogonal processes.
Specifically:

$$d_t = 5 + d_{1t} + d_{2t}$$

$$d_{1t} = 0.9 d_{1t-1} + w_{1t}$$

$$d_{2t} = 4w_{2t} + 0.8(4w_{2t-1}) + 0.6(4w_{2t-2}) + 0.4(4w_{2t-3})$$


𝑑1𝑡 is a first-order AR process, while 𝑑2𝑡 is a third-order pure moving average process.
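As an aside, here is a minimal sketch, independent of the DLE setup below and using only NumPy and Matplotlib, that simulates the two endowment components with illustrative standard normal shock draws to contrast the persistence of the AR(1) component with the purely transitory MA(3) component.

import numpy as np
import matplotlib.pyplot as plt

np.random.seed(0)
T = 60
w1 = np.random.randn(T)
w2 = np.random.randn(T)

# d_{1t} = 0.9 d_{1t-1} + w_{1t}: a persistent first-order AR process
d1 = np.zeros(T)
for t in range(1, T):
    d1[t] = 0.9 * d1[t-1] + w1[t]

# d_{2t} = 4 w_{2t} + 0.8(4 w_{2t-1}) + 0.6(4 w_{2t-2}) + 0.4(4 w_{2t-3}): a pure MA(3)
ma_weights = 4 * np.array([1.0, 0.8, 0.6, 0.4])
d2 = np.convolve(w2, ma_weights)[:T]

fig, ax = plt.subplots()
ax.plot(d1, label=r'$d_{1t}$ (AR(1))')
ax.plot(d2, label=r'$d_{2t}$ (MA(3))')
ax.legend()
plt.show()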

γ_1 = 0.05
γ = np.array([[γ_1], [0]])
ϕ_c = np.array([[1], [0]])
ϕ_g = np.array([[0], [1]])
ϕ_1 = 0.00001
ϕ_i = np.array([[1], [-ϕ_1]])
δ_k = np.array([[1]])
θ_k = np.array([[1]])


β = np.array([[1 / 1.05]])
l_λ = np.array([[0]])
π_h = np.array([[1]])
δ_h = np.array([[.9]])
θ_h = np.array([[1]]) - δ_h
ud = np.array([[5, 1, 1, 0.8, 0.6, 0.4],
[0, 0, 0, 0, 0, 0]])
a22 = np.zeros((6, 6))
# Chase's great trick
a22[[0, 1, 3, 4, 5], [0, 1, 2, 3, 4]] = np.array([1.0, 0.9, 1.0, 1.0, 1.0])
c2 = np.zeros((6, 2))
c2[[1, 2], [0, 1]] = np.array([1.0, 4.0])
ub = np.array([[30, 0, 0, 0, 0, 0]])
x0 = np.array([[5], [150], [1], [0], [0], [0], [0], [0]])

info1 = (a22, c2, ub, ud)


tech1 = (ϕ_c, ϕ_g, ϕ_i, γ, δ_k, θ_k)
pref1 = (β, l_λ, π_h, δ_h, θ_h)

econ1 = DLE(info1, tech1, pref1)

We define the household’s net of interest deficit as 𝑐𝑡 − 𝑑𝑡 .


Hall’s model imposes “expected present-value budget balance” in the sense that

$$\mathbb{E}\sum_{j=0}^{\infty} \beta^j (c_{t+j} - d_{t+j})\,\Big|\, J_t = \beta^{-1} k_{t-1} \quad \forall t$$

Define a moving average representation of $(c_t, c_t - d_t)$ in terms of the $w_t$'s to be

$$\begin{bmatrix} c_t \\ c_t - d_t \end{bmatrix} = \begin{bmatrix} \sigma_1(L) \\ \sigma_2(L) \end{bmatrix} w_t$$

Hall’s model imposes the restriction 𝜎2 (𝛽) = [0 0].


• The consumer who lives inside this model observes histories of both components of the endowment process 𝑑1𝑡
and 𝑑2𝑡 .
• The econometrician has data on the history of the pair [𝑐𝑡 , 𝑑𝑡 ], but not directly on the history of 𝑤𝑡 ’s.
• The econometrician obtains a Wold representation for the process [𝑐𝑡 , 𝑐𝑡 − 𝑑𝑡 ]:
$$\begin{bmatrix} c_t \\ c_t - d_t \end{bmatrix} = \begin{bmatrix} \sigma_1^*(L) \\ \sigma_2^*(L) \end{bmatrix} u_t$$
A representation with equivalent shocks would be recovered by estimating a bivariate vector autoregression for 𝑐𝑡 , 𝑐𝑡 −𝑑𝑡 .
The Appendix of chapter 8 of [Hansen and Sargent, 2013] explains why the impulse response functions in the Wold
representation estimated by the econometrician do not resemble the impulse response functions that depict the response
of consumption and the net-of-interest deficit to innovations 𝑤𝑡 to the consumer’s information.
Technically, 𝜎2 (𝛽) = [0 0] implies that the history of 𝑢𝑡 s spans a smaller linear space than does the history of 𝑤𝑡 s.
This means that 𝑢𝑡 will typically be a distributed lag of 𝑤𝑡 that is not concentrated at zero lag:

$$u_t = \sum_{j=0}^{\infty} \alpha_j w_{t-j}$$

Thus, the econometrician’s news 𝑢𝑡 typically responds belatedly to the consumer’s news 𝑤𝑡 .


24.3 Code

We will construct Figures from Chapter 8 Appendix E of [Hansen and Sargent, 2013] to illustrate these ideas:

# This is Fig 8.E.1 from p.188 of HS2013

econ1.irf(ts_length=40, shock=None)

fig, (ax1, ax2) = plt.subplots(1, 2, figsize=(12, 4))


ax1.plot(econ1.c_irf, label='Consumption')
ax1.plot(econ1.c_irf - econ1.d_irf[:,0].reshape(40,1), label='Deficit')
ax1.legend()
ax1.set_title('Response to $w_{1t}$')

shock2 = np.array([[0], [1]])


econ1.irf(ts_length=40, shock=shock2)

ax2.plot(econ1.c_irf, label='Consumption')
ax2.plot(econ1.c_irf - econ1.d_irf[:,0].reshape(40, 1), label='Deficit')
ax2.legend()
ax2.set_title('Response to $w_{2t}$')
plt.show()

The above figure displays the impulse response of consumption and the net-of-interest deficit to the innovations 𝑤𝑡 to the
consumer’s non-financial income or endowment process.
Consumption displays the characteristic “random walk” response with respect to each innovation.
Each endowment innovation leads to a temporary surplus followed by a permanent net-of-interest deficit.
The temporary surplus just offsets the permanent deficit in terms of expected present value.
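As a numerical check on this present-value offset, the following minimal sketch reuses econ1 from above together with the discount factor 1/1.05 set earlier; the 200-period horizon is an arbitrary truncation, so the discounted sum is only approximately zero.

import numpy as np

β_scalar = 1 / 1.05
# Response to the first endowment shock over a longer horizon
econ1.irf(ts_length=200, shock=None)
deficit_irf = econ1.c_irf[:, 0] - econ1.d_irf[:, 0]

# The discounted sum of the deficit response should (approximately) vanish,
# consistent with the restriction σ₂(β) = 0
pv = np.sum(β_scalar ** np.arange(200) * deficit_irf)
print(pv)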

G_HS = np.vstack([econ1.Sc, econ1.Sc-econ1.Sd[0, :].reshape(1, 8)])


H_HS = 1e-8 * np.eye(2) # Set very small so there is no measurement error
lss_hs = qe.LinearStateSpace(econ1.A0, econ1.C, G_HS, H_HS)

hs_kal = qe.Kalman(lss_hs)
w_lss = hs_kal.whitener_lss()
ma_coefs = hs_kal.stationary_coefficients(50, 'ma')

# This is Fig 8.E.2 from p.189 of HS2013



ma_coefs = ma_coefs
jj = 50
y1_w1 = np.empty(jj)
y2_w1 = np.empty(jj)
y1_w2 = np.empty(jj)
y2_w2 = np.empty(jj)

for t in range(jj):
y1_w1[t] = ma_coefs[t][0, 0]
y1_w2[t] = ma_coefs[t][0, 1]
y2_w1[t] = ma_coefs[t][1, 0]
y2_w2[t] = ma_coefs[t][1, 1]

# This scales the impulse responses to match those in the book


y1_w1 = sqrt(hs_kal.stationary_innovation_covar()[0, 0]) * y1_w1
y2_w1 = sqrt(hs_kal.stationary_innovation_covar()[0, 0]) * y2_w1
y1_w2 = sqrt(hs_kal.stationary_innovation_covar()[1, 1]) * y1_w2
y2_w2 = sqrt(hs_kal.stationary_innovation_covar()[1, 1]) * y2_w2

fig, (ax1, ax2) = plt.subplots(1, 2, figsize=(12, 4))


ax1.plot(y1_w1, label='Consumption')
ax1.plot(y2_w1, label='Deficit')
ax1.legend()
ax1.set_title('Response to $u_{1t}$')

ax2.plot(y1_w2, label='Consumption')
ax2.plot(y2_w2, label='Deficit')
ax2.legend()
ax2.set_title('Response to $u_{2t}$')
plt.show()

The above figure displays the impulse response of consumption and the deficit to the innovations in the econometrician’s
Wold representation
• this is the object that would be recovered from a high order vector autoregression on the econometrician's observations.
Consumption responds only to the first innovation
• this is indicative of the Granger causality imposed on the [𝑐𝑡 , 𝑐𝑡 −𝑑𝑡 ] process by Hall’s model: consumption Granger
causes 𝑐𝑡 − 𝑑𝑡 , with no reverse causality.


# This is Fig 8.E.3 from p.189 of HS2013

jj = 20
irf_wlss = w_lss.impulse_response(jj)
ycoefs = irf_wlss[1]
# Pull out the shocks
a1_w1 = np.empty(jj)
a1_w2 = np.empty(jj)
a2_w1 = np.empty(jj)
a2_w2 = np.empty(jj)

for t in range(jj):
a1_w1[t] = ycoefs[t][0, 0]
a1_w2[t] = ycoefs[t][0, 1]
a2_w1[t] = ycoefs[t][1, 0]
a2_w2[t] = ycoefs[t][1, 1]

fig, (ax1, ax2) = plt.subplots(1, 2, figsize=(12, 4))


ax1.plot(a1_w1, label='Consumption innov.')
ax1.plot(a2_w1, label='Deficit innov.')
ax1.set_title('Response to $w_{1t}$')
ax1.legend()
ax2.plot(a1_w2, label='Consumption innov.')
ax2.plot(a2_w2, label='Deficit innov.')
ax2.legend()
ax2.set_title('Response to $w_{2t}$')
plt.show()

The above figure displays the impulse responses of 𝑢𝑡 to 𝑤𝑡 , as depicted in:



$$u_t = \sum_{j=0}^{\infty} \alpha_j w_{t-j}$$

While the responses of the innovations to consumption are concentrated at lag zero for both components of 𝑤𝑡 , the
responses of the innovations to (𝑐𝑡 − 𝑑𝑡 ) are spread over time (especially in response to 𝑤1𝑡 ).
Thus, the innovations to (𝑐𝑡 − 𝑑𝑡 ) as revealed by the vector autoregression depend on what the economic agent views as
“old news”.



Part V

Risk, Model Uncertainty, and Robustness

CHAPTER

TWENTYFIVE

RISK AND MODEL UNCERTAINTY

25.1 Overview

As an introduction to one possible approach to modeling Knightian uncertainty, this lecture describes static representations of five classes of preferences over risky prospects.
These preference specifications allow us to distinguish risk from uncertainty along lines proposed by [Knight, 1921].
All five preference specifications incorporate risk aversion, meaning displeasure from risks governed by well known
probability distributions.
Two of them also incorporate uncertainty aversion, meaning dislike of not knowing a probability distribution.
The preference orderings are
• Expected utility preferences
• Constraint preferences
• Multiplier preferences
• Risk-sensitive preferences
• Ex post Bayesian expected utility preferences
This labeling scheme is taken from [Hansen and Sargent, 2001].
Constraint and multiplier preferences express aversion to not knowing a unique probability distribution that describes
random outcomes.
Expected utility, risk-sensitive, and ex post Bayesian expected utility preferences all attribute a unique known probability
distribution to a decision maker.
We present things in a simple before-and-after one-period setting.
In addition to learning about these preference orderings, this lecture also describes some interesting code for computing
and graphing some representations of indifference curves, utility functions, and related objects.
Staring at these indifference curves provides insights into the different preferences.
Watch for the presence of a kink at the 45 degree line for the constraint preference indifference curves.
We begin with some Python imports that we'll use to create some graphs.

# Package imports
import numpy as np
import matplotlib as mpl
import matplotlib.pyplot as plt
from matplotlib import rc, cm


from mpl_toolkits.mplot3d import Axes3D
from scipy import optimize, stats
from scipy.io import loadmat
from matplotlib.collections import LineCollection
from matplotlib.colors import ListedColormap, BoundaryNorm
from numba import njit

25.2 Basic objects

Basic ingredients are


• a set of states of the world
• plans describing outcomes as functions of the state of the world,
• a utility function mapping outcomes into utilities
• either a probability distribution or a set of probability distributions over states of the world; and
• a way of measuring a discrepancy between two probability distributions.
In more detail, we’ll work with the following setting.
• A finite set of possible states 𝐼 = {𝑖 = 1, … , 𝐼}.
• A (consumption) plan is a function 𝑐 ∶ 𝐼 → ℝ.
• 𝑢 ∶ ℝ → ℝ is a utility function.
• $\pi$ is an $I \times 1$ vector of nonnegative probabilities over states, with $\pi_i \ge 0$ and $\sum_{i=1}^{I} \pi_i = 1$.
• Relative entropy $\operatorname{ent}(\pi, \hat{\pi})$ of a probability vector $\hat{\pi}$ with respect to a probability vector $\pi$ is the expected value of the logarithm of the likelihood ratio $m_i \doteq \left(\frac{\hat{\pi}_i}{\pi_i}\right)$ under distribution $\hat{\pi}$, defined as

$$\operatorname{ent}(\pi, \hat{\pi}) = \sum_{i=1}^{I} \hat{\pi}_i \log\left(\frac{\hat{\pi}_i}{\pi_i}\right) = \sum_{i=1}^{I} \pi_i \left(\frac{\hat{\pi}_i}{\pi_i}\right) \log\left(\frac{\hat{\pi}_i}{\pi_i}\right)$$

or

$$\operatorname{ent}(\pi, \hat{\pi}) = \sum_{i=1}^{I} \pi_i m_i \log m_i .$$

Remark: A likelihood ratio $m_i$ is a discrete random variable. For any discrete random variable $\{x_i\}_{i=1}^{I}$, the expected value of $x$ under the $\hat{\pi}$ distribution can be represented as the expected value under the $\pi$ distribution of the product of $x_i$ times the `shock' $m_i$:

$$\hat{E}x = \sum_{i=1}^{I} x_i \hat{\pi}_i = \sum_{i=1}^{I} m_i x_i \pi_i = E m x,$$

where $\hat{E}$ is the mathematical expectation under the $\hat{\pi}$ distribution and $E$ is the expectation under the $\pi$ distribution.

Evidently,

$$\hat{E} 1 = E m = 1$$

and relative entropy is

$$E m \log m = \hat{E} \log m .$$
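A minimal sketch of this definition in Python, assuming $\pi_i > 0$ wherever $\hat{\pi}_i > 0$ and using the convention $0 \log 0 = 0$:

import numpy as np

def relative_entropy(π, π_hat):
    "Compute ent(π, π̂) = Σ_i π_i m_i log m_i with m_i = π̂_i / π_i."
    π, π_hat = np.asarray(π, dtype=float), np.asarray(π_hat, dtype=float)
    mask = π_hat > 0                       # convention: 0 log 0 = 0
    m = π_hat[mask] / π[mask]              # likelihood ratio
    return np.sum(π[mask] * m * np.log(m))

print(relative_entropy([0.5, 0.5], [0.5, 0.5]))   # 0.0
print(relative_entropy([0.5, 0.5], [0.9, 0.1]))   # ≈ 0.368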


In the three figures below, we plot relative entropy from several perspectives.
Our first figure depicts entropy as a function of 𝜋1̂ when 𝐼 = 2 and 𝜋1 = .5.
When 𝜋1 ∈ (0, 1), entropy is finite for both 𝜋1̂ = 0 and 𝜋1̂ = 1 because lim𝑥→0 𝑥 log 𝑥 = 0
However, when 𝜋1 = 0 or 𝜋1 = 1, entropy is infinite.

Fig. 25.1: Figure 1

The heat maps in the next two figures vary both 𝜋1̂ and 𝜋1 .
The following figure plots entropy.


The next figure plots the logarithm of entropy.


25.3 Five preference specifications

We describe five types of preferences over plans.


• Expected utility preferences
• Constraint preferences
• Multiplier preferences
• Risk-sensitive preferences
• Ex post Bayesian expected utility preferences
Expected utility, risk-sensitive, and ex post Bayesian preferences are each cast in terms of a unique probability distribution, so they can express risk-aversion, but not model ambiguity aversion.
Multiplier and constraint preferences both express aversion to concerns about model misspecification, i.e., model uncertainty; both are cast in terms of a set or sets of probability distributions.
• The set of distributions expresses the decision maker’s ambiguity about the probability model.


• Minimization over probability distributions expresses his aversion to ambiguity.

25.4 Expected utility

A decision maker is said to have expected utility preferences when he ranks plans 𝑐 by their expected utilities
$$\sum_{i=1}^{I} u(c_i)\pi_i , \qquad (25.1)$$

where 𝑢 is a unique utility function and 𝜋 is a unique probability measure over states.
• A known 𝜋 expresses risk.
• Curvature of 𝑢 expresses risk aversion.

25.5 Constraint preferences

A decision maker is said to have constraint preferences when he ranks plans 𝑐 according to
$$\min_{\{m_i \ge 0\}_{i=1}^{I}} \sum_{i=1}^{I} m_i \pi_i u(c_i) \qquad (25.2)$$

subject to

$$\sum_{i=1}^{I} \pi_i m_i \log m_i \le \eta \qquad (25.3)$$

and

$$\sum_{i=1}^{I} \pi_i m_i = 1. \qquad (25.4)$$

In (25.3), 𝜂 ≥ 0 defines an entropy ball of probability distributions 𝜋̂ = 𝑚𝜋 that surround a baseline distribution 𝜋.
As noted earlier, $\sum_{i=1}^{I} m_i \pi_i u(c_i)$ is the expected value of $u(c)$ under a twisted probability distribution $\{\hat{\pi}_i\}_{i=1}^{I} = \{m_i \pi_i\}_{i=1}^{I}$.
Larger values of the entropy constraint 𝜂 indicate more apprehension about the baseline probability distribution {𝜋𝑖 }𝐼𝑖=1 .
Following [Hansen and Sargent, 2001] and [Hansen and Sargent, 2008], we call minimization problem (25.2) subject to (25.3) and (25.4) a constraint problem.
To find minimizing probabilities, we form a Lagrangian
$$L = \sum_{i=1}^{I} m_i \pi_i u(c_i) + \tilde{\theta}\left[\sum_{i=1}^{I} \pi_i m_i \log m_i - \eta\right] \qquad (25.5)$$

where 𝜃 ̃ ≥ 0 is a Lagrange multiplier associated with the entropy constraint.


Subject to the additional constraint that $\sum_{i=1}^{I} m_i \pi_i = 1$, we want to minimize (25.5) with respect to $\{m_i\}_{i=1}^{I}$ and to maximize it with respect to $\tilde{\theta}$.


The minimizing probability distortions (likelihood ratios) are

$$\tilde{m}_i(c; \tilde{\theta}) = \frac{\exp(-u(c_i)/\tilde{\theta})}{\sum_j \pi_j \exp(-u(c_j)/\tilde{\theta})}. \qquad (25.6)$$

To compute the Lagrange multiplier $\tilde{\theta}(c, \eta)$, we must solve

$$\sum_i \pi_i \tilde{m}_i(c; \tilde{\theta}) \log\bigl(\tilde{m}_i(c; \tilde{\theta})\bigr) = \eta$$

or

$$\sum_i \pi_i \frac{\exp(-u(c_i)/\tilde{\theta})}{\sum_j \pi_j \exp(-u(c_j)/\tilde{\theta})} \log\left[\frac{\exp(-u(c_i)/\tilde{\theta})}{\sum_j \pi_j \exp(-u(c_j)/\tilde{\theta})}\right] = \eta \qquad (25.7)$$

for $\tilde{\theta} = \tilde{\theta}(c; \eta)$.

For a fixed $\eta$, the $\tilde{\theta}$ that solves equation (25.7) is evidently a function of the consumption plan $c$.

With $\tilde{\theta}(c; \eta)$ in hand we can obtain worst-case probabilities as functions $\pi_i \tilde{m}_i(c; \eta)$ of $\eta$.
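The following minimal sketch, assuming log utility and the two-state example $c = (2, 1)$, $\pi = (.5, .5)$ used later in this lecture, computes the worst-case distortions (25.6) for a given $\tilde{\theta}$ and solves (25.7) for $\tilde{\theta}(c; \eta)$ by root finding; the bracket for the root finder is chosen for this example.

import numpy as np
from scipy.optimize import brentq

def worst_case_m(c, π, θ, u=np.log):
    "Worst-case likelihood ratios m̃_i(c; θ) from (25.6)."
    weights = np.exp(-u(c) / θ)
    return weights / (π @ weights)

def entropy_of_distortion(c, π, θ, u=np.log):
    "Relative entropy Σ_i π_i m_i log m_i of the distortion at θ."
    m = worst_case_m(c, π, θ, u)
    return np.sum(π * m * np.log(m))

def solve_θ_tilde(c, π, η, u=np.log, θ_lo=0.05, θ_hi=1e3):
    "Solve (25.7): find the θ̃ whose implied entropy equals η."
    return brentq(lambda θ: entropy_of_distortion(c, π, θ, u) - η, θ_lo, θ_hi)

c = np.array([2.0, 1.0])
π = np.array([0.5, 0.5])
θ_tilde = solve_θ_tilde(c, π, η=0.25)
print(θ_tilde)                              # Lagrange multiplier θ̃(c; η)
print(worst_case_m(c, π, θ_tilde) * π)      # worst-case probabilities π̂

For these inputs the recovered multiplier is close to the value $\tilde{\theta} \approx .42$ mentioned below in connection with the indifference-curve figures.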

The indirect (expected) utility function under constraint preferences is


$$\sum_{i=1}^{I} \pi_i \tilde{m}_i(c_i; \eta) u(c_i) = \sum_{i=1}^{I} \pi_i \left[\frac{\exp(-\tilde{\theta}^{-1} u(c_i))}{\sum_{j=1}^{I} \exp(-\tilde{\theta}^{-1} u(c_j))\pi_j}\right] u(c_i). \qquad (25.8)$$

Entropy evaluated at the minimizing probability distortion (25.6) equals 𝐸 𝑚̃ log 𝑚̃ or


$$\begin{aligned}
\sum_{i=1}^{I} & \left[\frac{\exp(-\tilde{\theta}^{-1} u(c_i))}{\sum_{j=1}^{I} \exp(-\tilde{\theta}^{-1} u(c_j))\pi_j}\right] \times \left\{-\tilde{\theta}^{-1} u(c_i) + \log\left(\sum_{j=1}^{I} \exp(-\tilde{\theta}^{-1} u(c_j))\pi_j\right)\right\}\pi_i \\
&= -\tilde{\theta}^{-1} \sum_{i=1}^{I} \pi_i \left[\frac{\exp(-\tilde{\theta}^{-1} u(c_i))}{\sum_{j=1}^{I} \exp(-\tilde{\theta}^{-1} u(c_j))\pi_j}\right] u(c_i) + \log\left(\sum_{j=1}^{I} \exp(-\tilde{\theta}^{-1} u(c_j))\pi_j\right).
\end{aligned} \qquad (25.9)$$

Expression (25.9) implies that


$$-\tilde{\theta} \log\left(\sum_{j=1}^{I} \exp(-\tilde{\theta}^{-1} u(c_j))\pi_j\right) = \sum_{i=1}^{I} \pi_i \left[\frac{\exp(-\tilde{\theta}^{-1} u(c_i))}{\sum_{j=1}^{I} \exp(-\tilde{\theta}^{-1} u(c_j))\pi_j}\right] u(c_i) + \tilde{\theta}(c; \eta) \sum_{i=1}^{I} \log \tilde{m}_i(c; \eta)\, \tilde{m}_i(c; \eta)\, \pi_i , \qquad (25.10)$$

where the last term is 𝜃 ̃ times the entropy of the worst-case probability distribution.


25.6 Multiplier preferences

A decision maker is said to have multiplier preferences when he ranks consumption plans 𝑐 according to
$$\mathsf{T}u(c) \doteq \min_{\{m_i \ge 0\}_{i=1}^{I}} \sum_{i=1}^{I} \pi_i m_i \left[u(c_i) + \theta \log m_i\right] \qquad (25.11)$$

where minimization is subject to


$$\sum_{i=1}^{I} \pi_i m_i = 1.$$

Here $\theta \in (\underline{\theta}, +\infty)$ is a ‘penalty parameter’ that governs a ‘cost’ to an ‘evil alter ego’ who distorts probabilities by choosing $\{m_i\}_{i=1}^{I}$.
Lower values of the penalty parameter 𝜃 express more apprehension about the baseline probability distribution 𝜋.
Following [Hansen and Sargent, 2001] and [Hansen and Sargent, 2008], we call the minimization problem on the right
side of (25.11) a multiplier problem.
The minimizing probability distortion that solves the multiplier problem is

$$\hat{m}_i(c; \theta) = \frac{\exp(-u(c_i)/\theta)}{\sum_j \pi_j \exp(-u(c_j)/\theta)}. \qquad (25.12)$$

We can solve
$$\sum_i \pi_i \frac{\exp(-u(c_i)/\theta)}{\sum_j \pi_j \exp(-u(c_j)/\theta)} \log\left[\frac{\exp(-u(c_i)/\theta)}{\sum_j \pi_j \exp(-u(c_j)/\theta)}\right] = \tilde{\eta} \qquad (25.13)$$

to find an entropy level $\tilde{\eta}(c; \theta)$ associated with multiplier preferences with penalty parameter $\theta$ and allocation $c$.

For a fixed $\theta$, the $\tilde{\eta}$ that solves equation (25.13) is a function of the consumption plan $c$.
The forms of expressions (25.6) and (25.12) are identical, but the Lagrange multiplier 𝜃 ̃ appears in (25.6), while the
penalty parameter 𝜃 appears in (25.12).
Formulas (25.6) and (25.12) show that worst-case probabilities are context specific in the sense that they depend on both
the utility function 𝑢 and the consumption plan 𝑐.
If we add 𝜃 times entropy under the worst-case model to expected utility under the worst-case model, we find that the
indirect expected utility function under multiplier preferences is
$$-\theta \log\left(\sum_{j=1}^{I} \exp(-\theta^{-1} u(c_j))\pi_j\right). \qquad (25.14)$$

25.7 Risk-sensitive preferences


Substituting $\hat{m}_i$ into $\sum_{i=1}^{I} \pi_i \hat{m}_i \left[u(c_i) + \theta \log \hat{m}_i\right]$ gives the indirect utility function

$$\mathsf{T}u(c) \doteq -\theta \log \sum_{i=1}^{I} \pi_i \exp(-u(c_i)/\theta). \qquad (25.15)$$

Here T𝑢 in (25.15) is the risk-sensitivity operator of [Jacobson, 1973], [Whittle, 1981], and [Whittle, 1990].


It defines a risk-sensitive preference ordering over plans 𝑐.


Because it is not linear in utilities 𝑢(𝑐𝑖 ) and probabilities 𝜋𝑖 , it is said not to be separable across states.
Because risk-sensitive preferences use a unique probability distribution, they apparently express no model distrust or
ambiguity.
Instead, they make an additional adjustment for risk-aversion beyond that embedded in the curvature of 𝑢.
For 𝐼 = 2, 𝑐1 = 2, 𝑐2 = 1, 𝑢(𝑐) = ln 𝑐, the following figure plots the risk-sensitive criterion T𝑢(𝑐) defined in (25.15) as
a function of 𝜋1 for values of 𝜃 of 100 and .6.
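A minimal sketch reproducing the idea behind this figure, evaluating $\mathsf{T}u(c) = -\theta \log \sum_i \pi_i \exp(-u(c_i)/\theta)$ on a grid of $\pi_1$ values for the two values of $\theta$:

import numpy as np
import matplotlib.pyplot as plt

def T_u(π1, θ, c=(2.0, 1.0), u=np.log):
    "Risk-sensitive criterion (25.15) for a two-state plan c."
    π = np.array([π1, 1 - π1])
    return -θ * np.log(np.sum(π * np.exp(-u(np.array(c)) / θ)))

π1_grid = np.linspace(0.01, 0.99, 99)
fig, ax = plt.subplots()
for θ in (100, 0.6):
    ax.plot(π1_grid, [T_u(π1, θ) for π1 in π1_grid], label=fr'$\theta={θ}$')
ax.set_xlabel(r'$\pi_1$')
ax.set_ylabel(r'$\mathsf{T}u(c)$')
ax.legend()
plt.show()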

For large values of 𝜃, T𝑢(𝑐) is approximately linear in the probability 𝜋1 , but for lower values of 𝜃, T𝑢(𝑐) has considerable
curvature as a function of 𝜋1 .
Under expected utility, i.e., 𝜃 = +∞, T𝑢(𝑐) is linear in 𝜋1 , but it is convex as a function of 𝜋1 when 𝜃 < +∞.
The two panels in the next figure below can help us to visualize the extra adjustment for risk that the risk-sensitive operator
entails.
This will help us understand how the T transformation works by envisioning what function is being averaged.


The panel on the right portrays how the transformation $\exp\left(\frac{-u(c)}{\theta}\right)$ sends $u(c)$ to a new function by (i) flipping the sign, and (ii) increasing curvature in proportion to $\theta$.
In the left panel, the red line is our tool for computing the mathematical expectation for different values of 𝜋.
The green dot indicates the mathematical expectation of $\exp\left(\frac{-u(c)}{\theta}\right)$ when $\pi = .5$.

Notice that the distance between the green dot and the curve is greater in the transformed space than the original space
as a result of additional curvature.
The inverse transformation $-\theta \log E\left[\exp\left(\frac{-u(c)}{\theta}\right)\right]$ generates the green dot on the left panel that constitutes the risk-sensitive utility index.
The gap between the green dot and the red line on the left panel measures the additional adjustment for risk that risk-
sensitive preferences make relative to plain vanilla expected utility preferences.

25.7.1 Digression on moment generating functions

The risk-sensitivity operator T is intimately connected to a moment generating function.


In particular, a principal constituent of the T operator, namely,
$$E \exp(-u(c_i)/\theta) = \sum_{i=1}^{I} \pi_i \exp(-u(c_i)/\theta)$$

is evidently a moment generating function for the random variable 𝑢(𝑐𝑖 ), while
$$g(\theta^{-1}) \doteq \log \sum_{i=1}^{I} \pi_i \exp(-u(c_i)/\theta)$$

is a cumulant generating function,


$$g(\theta^{-1}) = \sum_{j=1}^{\infty} \kappa_j \frac{(-\theta^{-1})^j}{j!} ,$$

where 𝜅𝑗 is the 𝑗th cumulant of the random variable 𝑢(𝑐).


Then
$$\mathsf{T}u(c) = -\theta g(\theta^{-1}) = -\theta \sum_{j=1}^{\infty} \kappa_j \frac{(-\theta^{-1})^j}{j!} .$$


In general, when 𝜃 < +∞, T𝑢(𝑐) depends on cumulants of all orders.


These statements extend to cases with continuous probability distributions for 𝑐 and therefore for 𝑢(𝑐).
For the special case $u(c) \sim \mathcal{N}(\mu_u, \sigma_u^2)$, $\kappa_1 = \mu_u$, $\kappa_2 = \sigma_u^2$, and $\kappa_j = 0\ \forall j \ge 3$, so

$$\mathsf{T}u(c) = \mu_u - \frac{1}{2\theta}\sigma_u^2 , \qquad (25.16)$$
which becomes expected utility 𝜇𝑢 when 𝜃−1 = 0.
The right side of equation (25.16) is a special case of stochastic differential utility preferences in which consumption
plans are ranked not just by their expected utilities 𝜇𝑢 but also the variances 𝜎𝑢2 of their expected utilities.
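A minimal sketch checking the Gaussian special case (25.16) by Monte Carlo, with illustrative values of $\mu_u$, $\sigma_u$, and $\theta$:

import numpy as np

np.random.seed(0)
μ_u, σ_u, θ = 1.0, 0.5, 2.0
u_draws = μ_u + σ_u * np.random.randn(1_000_000)

# T u(c) = -θ log E exp(-u(c)/θ) versus the closed form μ_u - σ_u²/(2θ)
monte_carlo = -θ * np.log(np.mean(np.exp(-u_draws / θ)))
closed_form = μ_u - σ_u**2 / (2 * θ)
print(monte_carlo, closed_form)   # the two numbers should be close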

25.8 Ex post Bayesian preferences

A decision maker is said to have ex post Bayesian preferences when he ranks consumption plans according to the
expected utility function

$$\sum_i \hat{\pi}_i(c^*) u(c_i) \qquad (25.17)$$

where 𝜋(𝑐̂ ∗ ) is the worst-case probability distribution associated with multiplier or constraint preferences evaluated at a
particular consumption plan 𝑐∗ = {𝑐𝑖∗ }𝐼𝑖=1 .
At 𝑐∗ , an ex post Bayesian’s indifference curves are tangent to those for multiplier and constraint preferences with appro-
priately chosen 𝜃 and 𝜂, respectively.

25.9 Comparing preferences

For the special case in which 𝐼 = 2, 𝑐1 = 2, 𝑐2 = 1, 𝑢(𝑐) = ln 𝑐, and 𝜋1 = .5, the following two figures depict how
worst-case probabilities are determined under constraint and multiplier preferences, respectively.
The first figure graphs entropy as a function of $\hat{\pi}_1$.

It also plots expected utility under the twisted probability distribution, namely, $\hat{E}u(c) = u(c_2) + \hat{\pi}_1(u(c_1) - u(c_2))$, which is evidently a linear function of $\hat{\pi}_1$.
The entropy constraint $\sum_{i=1}^{I} \pi_i m_i \log m_i \le \eta$ implies a convex set $\hat{\Pi}_1$ of $\hat{\pi}_1$'s that constrains the adversary who chooses $\hat{\pi}_1$, namely, the set of $\hat{\pi}_1$'s for which the entropy curve lies below the horizontal dotted line at an entropy level of $\eta = .25$.

Unless $u(c_1) = u(c_2)$, the $\hat{\pi}_1$ that minimizes $\hat{E}u(c)$ is at the boundary of the set $\hat{\Pi}_1$.


The next figure shows the function $\sum_{i=1}^{I} \pi_i m_i \left[u(c_i) + \theta \log m_i\right]$ that is to be minimized in the multiplier problem.
The argument of the function is 𝜋1̂ = 𝑚1 𝜋1 .


Evidently, from this figure and also from formula (25.12), lower values of 𝜃 lead to lower, and thus more distorted,
minimizing values of 𝜋1̂ .
The figure indicates how one can construct a Lagrange multiplier 𝜃 ̃ associated with a given entropy constraint 𝜂 and a
given consumption plan.
Thus, to draw the figure, we set the penalty parameter for multiplier preferences 𝜃 so that the minimizing 𝜋1̂ equals the
minimizing 𝜋1̂ for the constraint problem from the previous figure.
The penalty parameter 𝜃 = .42 also equals the Lagrange multiplier 𝜃 ̃ on the entropy constraint for the constraint pref-
erences depicted in the previous figure because the 𝜋1̂ that minimizes the asymmetric curve associated with penalty
parameter 𝜃 = .42 is the same 𝜋1̂ associated with the intersection of the entropy curve and the entropy constraint dashed
vertical line.

25.10 Risk aversion and misspecification aversion

All five types of preferences use curvature of 𝑢 to express risk aversion.


Constraint preferences express concern about misspecification, or ambiguity for short, with a positive $\eta$ that circumscribes an entropy ball around an approximating probability distribution $\pi$, and they express aversion to model misspecification through minimization with respect to a likelihood ratio $m$.

Multiplier preferences express misspecification concerns with a parameter $\theta < +\infty$ that penalizes deviations from the approximating model as measured by relative entropy, and they express aversion to misspecification concerns with minimization over a probability distortion $m$.
By penalizing minimization over the likelihood ratio 𝑚, a decrease in 𝜃 represents an increase in ambiguity (or what
[Knight, 1921] called uncertainty) about the specification of the baseline approximating model {𝜋𝑖 }𝐼𝑖=1 .


Formula (25.6) asserts that the decision maker acts as if he is pessimistic relative to an approximating model $\pi$.
It expresses what [Bucklew, 2004] [p. 27] calls a statistical version of Murphy’s law:
The probability of anything happening is in inverse ratio to its desirability.
The minimizing likelihood ratio 𝑚̂ slants worst-case probabilities 𝜋̂ exponentially to increase probabilities of events that
give lower utilities.
As expressed by the value function bound (25.19) to be displayed below, the decision maker uses pessimism instrumentally to protect himself against model misspecification.

The penalty parameter $\theta$ for multiplier preferences or the entropy level $\eta$ that determines the Lagrange multiplier $\tilde{\theta}$ for constraint preferences controls how adversely the decision maker exponentially slants probabilities.
A decision rule is said to be undominated in the sense of Bayesian decision theory if there exists a probability distribution
𝜋 for which it is optimal.
A decision rule is said to be admissible if it is undominated.
[Hansen and Sargent, 2008] use ex post Bayesian preferences to show that robust decision rules are undominated and
therefore admissible.

25.11 Indifference curves

Indifference curves illuminate how concerns about robustness affect asset pricing and utility costs of fluctuations. For
𝐼 = 2, the slopes of the indifference curves for our five preference specifications are
• Expected utility:

$$\frac{dc_2}{dc_1} = -\frac{\pi_1 u'(c_1)}{\pi_2 u'(c_2)}$$

• Constraint and ex post Bayesian preferences:

$$\frac{dc_2}{dc_1} = -\frac{\hat{\pi}_1 u'(c_1)}{\hat{\pi}_2 u'(c_2)}$$

where 𝜋1̂ , 𝜋2̂ are the minimizing probabilities computed from the worst-case distortions (25.6) from the constraint
problem at (𝑐1 , 𝑐2 ).
• Multiplier and risk-sensitive preferences:

$$\frac{dc_2}{dc_1} = -\frac{\pi_1 \exp(-u(c_1)/\theta)\, u'(c_1)}{\pi_2 \exp(-u(c_2)/\theta)\, u'(c_2)}$$

When 𝑐1 > 𝑐2 , the exponential twisting formula (25.12) implies that 𝜋1̂ < 𝜋1 , which in turn implies that the indifference
curves through (𝑐1 , 𝑐2 ) for both constraint and multiplier preferences are flatter than the indifference curve associated
with expected utility preferences.
As we shall see soon when we discuss state price deflators, this gives rise to higher estimates of prices of risk.
For an example with 𝑢(𝑐) = ln 𝑐, 𝐼 = 2, and 𝜋1 = .5, the next two figures show indifference curves for expected utility,
multiplier, and constraint preferences.
The following figure shows indifference curves going through a point along the 45 degree line.


Kink at 45 degree line


Notice the kink in the indifference curve for constraint preferences at the 45 degree line.
To understand the source of the kink, consider how the Lagrange multiplier and worst-case probabilities vary with the
consumption plan under constraint preferences.
For fixed 𝜂, a given plan 𝑐, and a utility function increasing in 𝑐, worst case probabilities are fixed numbers 𝜋1̂ < .5
when 𝑐1 > 𝑐2 and 𝜋1̂ > .5 when 𝑐2 > 𝑐1 .
This pattern makes the Lagrange multiplier 𝜃 ̃ vary discontinuously at 𝜋1̂ = .5.
The discontinuity in the worst case 𝜋1̂ at the 45 degree line accounts for the kink at the 45 degree line in an indifference
curve for constraint preferences associated with a given positive entropy constraint 𝜂.
The code for generating the preceding figure is somewhat intricate: we formulate a root-finding problem for finding indifference curves.
Here is a brief literary description of the method we use.
Parameters
• Consumption bundle 𝑐 = (1, 1)
• Penalty parameter 𝜃 = 2
• Utility function 𝑢 = log
• Probability vector 𝜋 = (0.5, 0.5)
Algorithm:
• Compute 𝑢̄ = 𝜋1 𝑢 (𝑐1 ) + 𝜋2 𝑢 (𝑐2 )
• Given values for 𝑐1 , solve for values of 𝑐2 such that 𝑢̄ = 𝑢 (𝑐1 , 𝑐2 ):
– Expected utility: $c_{2,EU} = u^{-1}\left(\frac{\bar{u} - \pi_1 u(c_1)}{\pi_2}\right)$ (see the sketch after this list)

– Multiplier preferences: solve

$$\bar{u} - \sum_i \pi_i \frac{\exp\left(\frac{-u(c_i)}{\theta}\right)}{\sum_j \exp\left(\frac{-u(c_j)}{\theta}\right)} \left(u(c_i) + \theta \log\left(\frac{\exp\left(\frac{-u(c_i)}{\theta}\right)}{\sum_j \exp\left(\frac{-u(c_j)}{\theta}\right)}\right)\right) = 0$$

numerically

– Constraint preferences: solve

$$\bar{u} - \sum_i \pi_i \frac{\exp\left(\frac{-u(c_i)}{\theta^*}\right)}{\sum_j \exp\left(\frac{-u(c_j)}{\theta^*}\right)} u(c_i) = 0$$

numerically, where $\theta^*$ solves

$$\sum_i \pi_i \frac{\exp\left(\frac{-u(c_i)}{\theta^*}\right)}{\sum_j \exp\left(\frac{-u(c_j)}{\theta^*}\right)} \log\left(\frac{\exp\left(\frac{-u(c_i)}{\theta^*}\right)}{\sum_j \exp\left(\frac{-u(c_j)}{\theta^*}\right)}\right) - \eta = 0$$

numerically.
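Here is a minimal sketch of the expected-utility step of the algorithm above, assuming $u = \log$ and the parameters listed earlier:

import numpy as np

π1, π2 = 0.5, 0.5
u, u_inv = np.log, np.exp
c_bar = (1.0, 1.0)
u_bar = π1 * u(c_bar[0]) + π2 * u(c_bar[1])

# For each c1 on a grid, solve ū = π1 u(c1) + π2 u(c2) for c2
c1_grid = np.linspace(0.5, 2.0, 9)
c2_EU = u_inv((u_bar - π1 * u(c1_grid)) / π2)
print(np.round(c2_EU, 3))   # c2 falls as c1 rises along the indifference curve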

Remark: It seems that the constraint problem is hard to solve in its original form, i.e. by finding the distorting measure
that minimizes the expected utility.
It seems that viewing equation (25.7) as a root finding problem works much better.
But notice that equation (25.7) does not always have a solution.
Under 𝑢 = log, 𝑐1 = 𝑐2 = 1, we have:

$$\sum_i \pi_i \frac{\exp\left(\frac{-u(c_i)}{\tilde{\theta}}\right)}{\sum_j \pi_j \exp\left(\frac{-u(c_j)}{\tilde{\theta}}\right)} \log\left(\frac{\exp\left(\frac{-u(c_i)}{\tilde{\theta}}\right)}{\sum_j \pi_j \exp\left(\frac{-u(c_j)}{\tilde{\theta}}\right)}\right) = 0$$
Conjecture: when our numerical method fails, it is because the derivative of the objective doesn't exist for our choice of parameters.
Remark: It is tricky to get the algorithm to work properly for all values of 𝑐1 . In particular, parameters were chosen
with graduate student descent.
Tangent indifference curves off 45 degree line
For a given $\eta$ and a given allocation $(c_1, c_2)$ off the 45 degree line, by solving equations (25.7) and (25.13), we can find $\tilde{\theta}(\eta, c)$ and $\tilde{\eta}(\theta, c)$ that make indifference curves for multiplier and constraint preferences be tangent to one another.
The following figure shows indifference curves for multiplier and constraint preferences through a point off the 45 degree
line, namely, (𝑐(1), 𝑐(2)) = (3, 1), at which 𝜂 and 𝜃 are adjusted to render the indifference curves for constraint and
multiplier preferences tangent.

Note that all three lines of the left graph intersect at (1, 3). While the intersection at (3, 1) is hard-coded, the intersection
at (1,3) arises from the computation, which confirms that the code seems to be working properly.
As we move along the (kinked) indifference curve for the constraint preferences for a given $\eta$, the worst-case probabilities remain constant, but the Lagrange multiplier $\tilde{\theta}$ on the entropy constraint $\sum_{i=1}^{I} \pi_i m_i \log m_i \le \eta$ varies with $(c_1, c_2)$.


As we move along the (smooth) indifference curve for the multiplier preferences for a given penalty parameter 𝜃, the
implied entropy 𝜂 ̃ from equation (25.13) and the worst-case probabilities both change with (𝑐1 , 𝑐2 ).
For constraint preferences, there is a kink in the indifference curve.
For ex post Bayesian preferences, there are effectively two sets of indifference curves depending on which side of the 45
degree line the (𝑐1 , 𝑐2 ) endowment point sits.
There are two sets of indifference curves because, while the worst-case probabilities differ above and below the 45 degree
line, the idea of ex post Bayesian preferences is to use a single probability distribution to compute expected utilities for
all consumption bundles.
Indifference curves through point (𝑐1 , 𝑐2 ) = (3, 1) for expected logarithmic utility (less curved smooth line), multiplier
(more curved line), constraint (solid line kinked at 45 degree line), and ex post Bayesian (dotted lines) preferences. The
worst-case probability 𝜋1̂ < .5 when 𝑐1 = 3 > 𝑐2 = 1 and 𝜋1̂ > .5 when 𝑐1 = 1 < 𝑐2 = 3.

25.12 State price deflators

Concerns about model uncertainty boost prices of risk that are embedded in state-price deflators. With complete markets,
let 𝑞𝑖 be the price of consumption in state 𝑖.
The budget set of a representative consumer having endowment $\bar{c} = \{\bar{c}_i\}_{i=1}^{I}$ is expressed by $\sum_i q_i (c_i - \bar{c}_i) \le 0$.
When a representative consumer has multiplier preferences, the state prices are

$$q_i = \pi_i \hat{m}_i u'(\bar{c}_i) = \pi_i \left(\frac{\exp(-u(\bar{c}_i)/\theta)}{\sum_j \pi_j \exp(-u(\bar{c}_j)/\theta)}\right) u'(\bar{c}_i). \qquad (25.18)$$

The worst-case likelihood ratio 𝑚̂ 𝑖 operates to increase prices 𝑞𝑖 in relatively low utility states 𝑖.
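A minimal sketch of (25.18), assuming log utility, the endowment $\bar{c} = (3, 1)$ and $\pi = (.5, .5)$ used in the figures below, and an illustrative penalty parameter $\theta$:

import numpy as np

π = np.array([0.5, 0.5])
c_bar = np.array([3.0, 1.0])
θ = 1.0                      # illustrative penalty parameter
u = np.log

def u_prime(c):
    return 1 / c             # marginal utility for u = log

# Worst-case likelihood ratios m̂_i and state prices q_i from (25.18)
m_hat = np.exp(-u(c_bar) / θ)
m_hat = m_hat / (π @ m_hat)
q = π * m_hat * u_prime(c_bar)
print(q)   # the low-utility state (c̄ = 1) carries the higher state price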
State prices agree under multiplier and constraint preferences when 𝜂 and 𝜃 are adjusted according to (25.7) or (25.13)
to make the indifference curves tangent at the endowment point.
The next figure can help us think about state-price deflators under our different preference orderings.
In this figure, budget line and indifference curves through point (𝑐1 , 𝑐2 ) = (3, 1) for expected logarithmic utility, multi-
plier, constraint (kinked at 45 degree line), and ex post Bayesian (dotted lines) preferences.
Figure 2.7:


Because budget constraints are linear, asset prices are identical under multiplier and constraint preferences for which 𝜃
and 𝜂 are adjusted to verify (25.7) or (25.13) at a given consumption endowment {𝑐𝑖 }𝐼𝑖=1 .
However, as we note next, though they are tangent at the endowment point, the fact that indifference curves differ for
multiplier and constraint preferences means that certainty equivalent consumption compensations of the kind that [Lucas,
1987], [Hansen et al., 1999], [Tallarini, 2000], and [Barillas et al., 2009] used to measure the costs of business cycles
must differ.

25.12.1 Consumption-equivalent measures of uncertainty aversion

For each of our five types of preferences, the following figure allows us to construct a certainty equivalent point (𝑐∗ , 𝑐∗ )
on the 45 degree line that renders the consumer indifferent between it and the risky point (𝑐(1), 𝑐(2)) = (3, 1).
Figure 2.8:


The figure indicates that the certainty equivalent level 𝑐∗ is higher for the consumer with expected utility preferences than
for the consumer with multiplier preferences, and that it is higher for the consumer with multiplier preferences than for
the consumer with constraint preferences.
The gap between these certainty equivalents measures the uncertainty aversion of the multiplier preferences or constraint
preferences consumer.
The gap between the expected value .5𝑐(1) + .5𝑐(2) at point A and the certainty equivalent for the expected utility
decision maker at point B is a measure of his risk aversion.
The gap between points 𝐵 and 𝐶 measures the multiplier preference consumer’s aversion to model uncertainty.
The gap between points B and D measures the constraint preference consumer’s aversion to model uncertainty.


25.13 Iso-utility and iso-entropy curves and expansion paths

The following figures show iso-entropy and iso-utility lines for the special case in which $I = 3$, $\pi_1 = .3$, $\pi_2 = .4$, and the utility function is $u(c) = \frac{c^{1-\alpha}}{1-\alpha}$ with $\alpha = 0$ and $\alpha = 3$, respectively, for the fixed plan $c(1) = 1$, $c(2) = 2$, $c(3) = 3$.
The iso-utility lines are the level curves of

𝜋1̂ 𝑐1 + 𝜋2̂ 𝑐2 + (1 − 𝜋1̂ − 𝜋2̂ )𝑐3

and are linear in (𝜋1̂ , 𝜋2̂ ).


This is what it means to say ‘expected utility is linear in probabilities.’
Both figures plot the locus of points of tangency between the iso-entropy and the iso-utility curves that is traced out as
one varies 𝜃−1 in the interval [0, 2].
While the iso-entropy lines are identical in the two figures, these ‘expansion paths’ differ because the utility functions differ, meaning that for a given $\theta$ and $(c_1, c_2, c_3)$ triple, the worst-case probabilities $\hat{\pi}_i(\theta) = \pi_i \frac{\exp(-u(c_i)/\theta)}{E \exp(-u(c)/\theta)}$ differ as we vary $\theta$, causing the associated entropies to differ.
Color bars:
• First color bar: variation in 𝜃
• Second color bar: variation in utility levels
• Third color bar: variation in entropy levels


25.14 Bounds on expected utility


Suppose that a decision maker wants a lower bound on expected utility $\sum_{i=1}^{I} \hat{\pi}_i u(c_i)$ that is satisfied for any distribution $\hat{\pi}$ with relative entropy less than or equal to $\eta$.
An attractive feature of multiplier and constraint preferences is that they carry with them such a bound.
To show this, it is useful to collect some findings in the following string of inequalities associated with multiplier preferences:
$$\begin{aligned}
\mathsf{T}_\theta u(c) &= -\theta \log \sum_{i=1}^{I} \exp\left(\frac{-u(c_i)}{\theta}\right)\pi_i \\
&= \sum_{i=1}^{I} m_i^* \pi_i \left(u(c_i) + \theta \log m_i^*\right) \\
&\le \sum_{i=1}^{I} m_i \pi_i u(c_i) + \theta \sum_{i=1}^{I} m_i \log m_i\, \pi_i
\end{aligned}$$

where $m_i^* \propto \exp\left(\frac{-u(c_i)}{\theta}\right)$ are the worst-case distortions to probabilities.

The inequality in the last line just asserts that minimizers minimize.
Therefore, we have the following useful bound:
$$\sum_{i=1}^{I} m_i \pi_i u(c_i) \ge \mathsf{T}_\theta u(c) - \theta \sum_{i=1}^{I} \pi_i m_i \log m_i . \qquad (25.19)$$

The left side is expected utility under the probability distribution {𝑚𝑖 𝜋𝑖 }.
The right side is a lower bound on expected utility under all distributions, expressed as an affine function of relative entropy $\sum_{i=1}^{I} \pi_i m_i \log m_i$.

The bound is attained for $m_i = m_i^* \propto \exp\left(\frac{-u(c_i)}{\theta}\right)$.
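A minimal sketch checking the bound (25.19) numerically for randomly drawn distortions $m$ with $\sum_i \pi_i m_i = 1$, using illustrative values of $\pi$, $c$, and $\theta$:

import numpy as np

rng = np.random.default_rng(0)
π = np.array([0.3, 0.4, 0.3])
c = np.array([1.0, 2.0, 3.0])
u, θ = np.log, 0.8

# Intercept of the bound: the risk-sensitive criterion T_θ u(c)
T_θu = -θ * np.log(π @ np.exp(-u(c) / θ))

for _ in range(5):
    m = rng.uniform(0.1, 2.0, size=3)
    m = m / (π @ m)                               # enforce Σ_i π_i m_i = 1
    lhs = np.sum(m * π * u(c))
    rhs = T_θu - θ * np.sum(π * m * np.log(m))
    assert lhs >= rhs - 1e-12
print("bound (25.19) holds for all sampled distortions")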


The intercept in the bound is the risk-sensitive criterion T𝜃 𝑢(𝑐), while the slope is the penalty parameter 𝜃.
Lowering 𝜃 does two things:
• it lowers the intercept T𝜃 𝑢(𝑐), which makes the bound less informative for small values of entropy; and
• it lowers the absolute value of the slope, which makes the bound more informative for larger values of relative entropy $\sum_{i=1}^{I} \pi_i m_i \log m_i$.
The following figure reports best-case and worst-case expected utilities.
We calculate the lines in this figure numerically by solving optimization problems with respect to the change of measure.

In this figure, expected utility is on the co-ordinate axis while entropy is on the ordinate axis.
The lower curved line depicts expected utility under the worst-case model associated with each value of entropy $\eta$ recorded on the ordinate axis, i.e., it is $\sum_{i=1}^{I} \pi_i \tilde{m}_i(\tilde{\theta}(c, \eta)) u(c_i)$, where $\tilde{m}_i(\tilde{\theta}(\eta)) \propto \exp\left(\frac{-u(c_i)}{\tilde{\theta}}\right)$ and $\tilde{\theta}$ is the Lagrange multiplier associated with the constraint that entropy cannot exceed the value on the ordinate axis.

The higher curved line depicts expected utility under the best-case model indexed by the value of the Lagrange multiplier $\check{\theta} > 0$ associated with each value of entropy less than or equal to $\eta$ recorded on the ordinate axis, i.e., it is $\sum_{i=1}^{I} \pi_i \check{m}_i(\check{\theta}(\eta)) u(c_i)$, where $\check{m}_i(\check{\theta}(c, \eta)) \propto \exp\left(\frac{u(c_i)}{\check{\theta}}\right)$.
∑𝑖=1 𝜋𝑖 𝑚̌ 𝑖 (𝜃(𝜂))𝑢(𝑐 𝑖 ) where 𝑚̌ 𝑖 (𝜃(𝑐, 𝜂)) ∝ exp( 𝜃 ̌ ).

(Here 𝜃 ̌ is the Lagrange multiplier associated with max-max expected utility.)


Points between these two curves are possible values of expected utility for some distribution with entropy less than or
equal to the value 𝜂 on the ordinate axis.
The straight line depicts the right side of inequality (25.19) for a particular value of the penalty parameter 𝜃.
As noted, when one lowers 𝜃, the intercept T𝜃 𝑢(𝑐) and the absolute value of the slope both decrease.


Thus, as 𝜃 is lowered, T𝜃 𝑢(𝑐) becomes a more conservative estimate of expected utility under the approximating model
𝜋.
However, as 𝜃 is lowered, the robustness bound (25.19) becomes more informative for sufficiently large values of entropy.
The slope of the straight line depicting a bound is $-\theta$, and the projection of the point of tangency with the curve depicting the lower bound of expected utility is the entropy associated with that $\theta$ when it is interpreted as a Lagrange multiplier on the entropy constraint in the constraint problem.
This is an application of the envelope theorem.

25.15 Why entropy?

Beyond the helpful mathematical fact that it leads directly to convenient exponential twisting formulas (25.6) and (25.12) for worst-case probability distortions, there are two related justifications for using entropy to measure discrepancies between probability distributions.
One arises from the role of entropy in statistical tests for discriminating between models.
The other comes from axioms.

25.15.1 Entropy and statistical detection

Robust control theory starts with a decision maker who has constructed a good baseline approximating model whose free
parameters he has estimated to fit historical data well.
The decision maker recognizes that actual outcomes might be generated by one of a vast number of other models that fit
the historical data nearly as well as his.
Therefore, he wants to evaluate outcomes under a set of alternative models that are plausible in the sense of being statis-
tically close to his model.
He uses relative entropy to quantify what close means.
[Anderson et al., 2003] and [Barillas et al., 2009] describe links between entropy and large deviations bounds on test statistics for discriminating between models, in particular, statistics that describe the probability of making an error in applying a likelihood ratio test to decide whether model A or model B generated a data record of length $T$.
For a given sample size, an informative bound on the detection error probability is a function of the entropy parameter 𝜂
in constraint preferences. [Anderson et al., 2003] and [Barillas et al., 2009] use detection error probabilities to calibrate
reasonable values of 𝜂.
[Anderson et al., 2003] and [Hansen and Sargent, 2008] also use detection error probabilities to calibrate reasonable
values of the penalty parameter 𝜃 in multiplier preferences.
For a fixed sample size and a fixed 𝜃, they would calculate the worst-case 𝑚̂ 𝑖 (𝜃), an associated entropy 𝜂(𝜃), and an
associated detection error probability. In this way they build up a detection error probability as a function of 𝜃.
They then invert this function to calibrate 𝜃 to deliver a reasonable detection error probability.
To indicate outcomes from this approach, the following figure plots the histogram for U.S. quarterly consumption growth
along with a representative agent’s approximating density and a worst-case density that [Barillas et al., 2009] show imply
high measured market prices of risk even when a representative consumer has the unit coefficient of relative risk aversion
associated with a logarithmic one-period utility function.


The density for the approximating model is log 𝑐𝑡+1 − log 𝑐𝑡 = 𝜇 + 𝜎𝑐 𝜖𝑡+1 where 𝜖𝑡+1 ∼ 𝑁 (0, 1) and 𝜇 and 𝜎𝑐 are
estimated by maximum likelihood from the U.S. quarterly data in the histogram over the period 1948.I-2006.IV.
The consumer's value function under logarithmic utility implies that the worst-case model is $\log c_{t+1} - \log c_t = (\mu + \sigma_c w) + \sigma_c \tilde{\epsilon}_{t+1}$, where $\{\tilde{\epsilon}_{t+1}\}$ is also a normalized Gaussian random sequence and where $w$ is calculated by setting a detection error probability to .05.
The worst-case model appears to fit the histogram nearly as well as the approximating model.

25.15.2 Axiomatic justifications

Multiplier and constraint preferences are both special cases of what [Maccheroni et al., 2006] call variational preferences.
They provide an axiomatic foundation for variational preferences and describe how they express ambiguity aversion.
Constraint preferences are particular instances of the multiple priors model of [Gilboa and Schmeidler, 1989].



CHAPTER

TWENTYSIX

ETYMOLOGY OF ENTROPY

This lecture describes and compares several notions of entropy.


Among the senses of entropy, we’ll encounter these
• A measure of uncertainty of a random variable advanced by Claude Shannon [Shannon and Weaver, 1949]
• A key object governing thermodynamics
• Kullback and Leibler’s measure of the statistical divergence between two probability distributions
• A measure of the volatility of stochastic discount factors that appear in asset pricing theory
• Measures of unpredictability that occur in classical Wiener-Kolmogorov linear prediction theory
• A frequency domain criterion for constructing robust decision rules
The concept of entropy plays an important role in robust control formulations described in this lecture Risk and Model
Uncertainty and in this lecture Robustness.

26.1 Information Theory

In information theory [Shannon and Weaver, 1949], entropy is a measure of the unpredictability of a random variable.
To illustrate things, let 𝑋 be a discrete random variable taking values 𝑥1 , … , 𝑥𝑛 with probabilities 𝑝𝑖 = Prob(𝑋 = 𝑥𝑖 ) ≥
0, ∑𝑖 𝑝𝑖 = 1.
Claude Shannon’s [Shannon and Weaver, 1949] definition of entropy is

$$H(p) = \sum_i p_i \log_b(p_i^{-1}) = -\sum_i p_i \log_b(p_i). \qquad (26.1)$$

where log𝑏 denotes the log function with base 𝑏.


Inspired by the limit

$$\lim_{p \downarrow 0} p \log p = \lim_{p \downarrow 0} \frac{\log p}{p^{-1}} = \lim_{p \downarrow 0} (-p) = 0,$$

we set 𝑝 log 𝑝 = 0 in equation (26.1).


Typical bases for the logarithm are 2, 𝑒, and 10.
In the information theory literature, logarithms of base 2, 𝑒, and 10 are associated with units of information called bits,
nats, and dits, respectively.
Shannon typically used base 2.


26.2 A Measure of Unpredictability

For a discrete random variable $X$ with probability density $p = \{p_i\}_{i=1}^{n}$, the surprisal for state $i$ is $s_i = \log\left(\frac{1}{p_i}\right)$.

The quantity $\log\left(\frac{1}{p_i}\right)$ is called the surprisal because it is inversely related to the likelihood that state $i$ will occur.

Note that entropy 𝐻(𝑝) equals the expected surprisal


$$H(p) = \sum_i p_i s_i = \sum_i p_i \log\left(\frac{1}{p_i}\right) = -\sum_i p_i \log(p_i).$$

26.2.1 Example

Take a possibly unfair coin, so $X = \{0, 1\}$ with $p = \text{Prob}(X = 1) \in [0, 1]$.


Then

$$H(p) = -(1 - p) \log(1 - p) - p \log p.$$

Evidently,

$$H'(p) = \log(1 - p) - \log p = 0$$

at $p = .5$, and $H''(p) = -\frac{1}{1-p} - \frac{1}{p} < 0$ for $p \in (0, 1)$.
So 𝑝 = .5 maximizes entropy, while entropy is minimized at 𝑝 = 0 and 𝑝 = 1.
Thus, among all coins, a fair coin is the most unpredictable.
See Fig. 26.1

26.2.2 Example
Take an $n$-sided possibly unfair die with a probability distribution $\{p_i\}_{i=1}^{n}$. The die is fair if $p_i = \frac{1}{n}\ \forall i$.

Among all dice, a fair die maximizes entropy.


For a fair die, entropy equals $H(p) = -n^{-1} \sum_i \log\left(\frac{1}{n}\right) = \log(n)$.
To specify the expected number of bits needed to isolate the outcome of one roll of a fair 𝑛-sided die requires log2 (𝑛)
bits of information.
For example, if 𝑛 = 2, log2 (2) = 1.
For 𝑛 = 3, log2 (3) = 1.585.
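A minimal sketch computing Shannon entropy in bits for the coin and die examples above, using the convention $0 \log 0 = 0$:

import numpy as np

def H(p, base=2):
    "Shannon entropy of a probability vector p, in units of log base `base`."
    p = np.asarray(p, dtype=float)
    p = p[p > 0]                       # convention: 0 log 0 = 0
    return -np.sum(p * np.log(p)) / np.log(base)

print(H([0.5, 0.5]))        # fair coin: 1 bit
print(H([0.9, 0.1]))        # unfair coin: ≈ 0.469 bits, less than 1
print(H(np.ones(3) / 3))    # fair three-sided die: log2(3) ≈ 1.585 bits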

26.3 Mathematical Properties of Entropy

For a discrete random variable with probability vector 𝑝, entropy 𝐻(𝑝) is a function that satisfies
• 𝐻 is continuous.
• 𝐻 is symmetric: 𝐻(𝑝1 , 𝑝2 , … , 𝑝𝑛 ) = 𝐻(𝑝𝑟1 , … , 𝑝𝑟𝑛 ) for any permutation 𝑟1 , … , 𝑟𝑛 of 1, … , 𝑛.
• A uniform distribution maximizes $H(p)$: $H(p_1, \ldots, p_n) \le H\left(\frac{1}{n}, \ldots, \frac{1}{n}\right)$.
• Maximum entropy increases with the number of states: $H\left(\frac{1}{n}, \ldots, \frac{1}{n}\right) \le H\left(\frac{1}{n+1}, \ldots, \frac{1}{n+1}\right)$.
• Entropy is not affected by events with zero probability.


Fig. 26.1: Entropy as a function of 𝜋1̂ when 𝜋1 = .5.

26.4 Conditional Entropy

Let $(X, Y)$ be a bivariate discrete random vector with outcomes $x_1, \ldots, x_n$ and $y_1, \ldots, y_m$, respectively, occurring with probability density $p(x_i, y_j)$.
Conditional entropy 𝐻(𝑋|𝑌 ) is defined as

$$H(X|Y) = \sum_{i,j} p(x_i, y_j) \log \frac{p(y_j)}{p(x_i, y_j)}. \qquad (26.2)$$

Here $\frac{p(y_j)}{p(x_i, y_j)}$, the reciprocal of the conditional probability of $x_i$ given $y_j$, can be defined as the conditional surprisal.
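A minimal sketch of (26.2) for a small joint distribution; the uniform independent example is illustrative:

import numpy as np

p_xy = np.array([[0.25, 0.25],
                 [0.25, 0.25]])          # joint pmf, rows index x, columns index y
p_y = p_xy.sum(axis=0)                   # marginal of Y

# H(X|Y) = Σ_{i,j} p(x_i, y_j) log( p(y_j) / p(x_i, y_j) )
H_X_given_Y = np.sum(p_xy * np.log(p_y / p_xy))
print(H_X_given_Y)                       # = log 2 ≈ 0.693 in this example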

26.5 Independence as Maximum Conditional Entropy

Let 𝑚 = 𝑛 and [𝑥1 , … , 𝑥𝑛 ] = [𝑦1 , … , 𝑦𝑛 ].


Let ∑𝑗 𝑝(𝑥𝑖 , 𝑦𝑗 ) = ∑𝑗 𝑝(𝑥𝑗 , 𝑦𝑖 ) for all 𝑖, so that the marginal distributions of 𝑥 and 𝑦 are identical.
Thus, 𝑥 and 𝑦 are identically distributed, but they are not necessarily independent.
Consider the following problem: choose a joint distribution 𝑝(𝑥𝑖 , 𝑦𝑗 ) to maximize conditional entropy (26.2) subject to
the restriction that 𝑥 and 𝑦 are identically distributed.
The conditional-entropy-maximizing 𝑝(𝑥𝑖 , 𝑦𝑗 ) sets

$$\frac{p(x_i, y_j)}{p(y_j)} = \sum_j p(x_i, y_j) = p(x_i) \quad \forall i.$$


Thus, among all joint distributions with identical marginal distributions, the conditional entropy maximizing joint distri-
bution makes 𝑥 and 𝑦 be independent.

26.6 Thermodynamics

Josiah Willard Gibbs (see https://en.wikipedia.org/wiki/Josiah_Willard_Gibbs) defined entropy as

$$S = -k_B \sum_i p_i \log p_i \qquad (26.3)$$

where 𝑝𝑖 is the probability of a micro state and 𝑘𝐵 is Boltzmann’s constant.


• The Boltzmann constant 𝑘𝑏 relates energy at the micro particle level with the temperature observed at the macro
level. It equals what is called a gas constant divided by an Avogadro constant.
The second law of thermodynamics states that the entropy of a closed physical system increases until 𝑆 defined in (26.3)
attains a maximum.

26.7 Statistical Divergence

Let 𝑋 be a discrete state space 𝑥1 , … , 𝑥𝑛 and let 𝑝 and 𝑞 be two discrete probability distributions on 𝑋.
Assume that $\frac{p_i}{q_i} \in (0, \infty)$ for all $i$ for which $p_i > 0$.
Then the Kullback-Leibler statistical divergence, also called relative entropy, is defined as

$$D(p|q) = \sum_i p_i \log\left(\frac{p_i}{q_i}\right) = \sum_i q_i \left(\frac{p_i}{q_i}\right) \log\left(\frac{p_i}{q_i}\right). \qquad (26.4)$$

Evidently,

$$D(p|q) = -\sum_i p_i \log q_i + \sum_i p_i \log p_i = H(p, q) - H(p),$$

where $H(p, q) = -\sum_i p_i \log q_i$ is the cross-entropy.


It is easy to verify, as we have done above, that 𝐷(𝑝|𝑞) ≥ 0 and that 𝐷(𝑝|𝑞) = 0 implies that 𝑝𝑖 = 𝑞𝑖 when 𝑞𝑖 > 0.

26.8 Continuous distributions

For a continuous random variable, Kullback-Leibler divergence between two densities 𝑝 and 𝑞 is defined as

$$D(p|q) = \int p(x) \log\left(\frac{p(x)}{q(x)}\right) dx.$$


26.9 Relative entropy and Gaussian distributions

We want to compute relative entropy for two continuous densities $\phi$ and $\hat{\phi}$ when $\phi$ is $N(0, I)$ and $\hat{\phi}$ is $N(w, \Sigma)$, where the covariance matrix $\Sigma$ is nonsingular.

We seek a formula for

$$\mathrm{ent} = \int \left(\log \hat{\phi}(\varepsilon) - \log \phi(\varepsilon)\right) \hat{\phi}(\varepsilon)\, d\varepsilon .$$

Claim
$$\mathrm{ent} = -\frac{1}{2} \log \det \Sigma + \frac{1}{2} w'w + \frac{1}{2} \mathrm{trace}(\Sigma - I). \qquad (26.5)$$
Proof
The log likelihood ratio is

$$\log \hat{\phi}(\varepsilon) - \log \phi(\varepsilon) = \frac{1}{2}\left[-(\varepsilon - w)' \Sigma^{-1} (\varepsilon - w) + \varepsilon' \varepsilon - \log \det \Sigma\right]. \qquad (26.6)$$
Observe that
$$-\frac{1}{2} \int (\varepsilon - w)' \Sigma^{-1} (\varepsilon - w)\, \hat{\phi}(\varepsilon)\, d\varepsilon = -\frac{1}{2} \mathrm{trace}(I).$$
Applying the identity 𝜀 = 𝑤 + (𝜀 − 𝑤) gives
$$\frac{1}{2} \varepsilon' \varepsilon = \frac{1}{2} w'w + \frac{1}{2} (\varepsilon - w)'(\varepsilon - w) + w'(\varepsilon - w).$$
Taking mathematical expectations
$$\frac{1}{2} \int \varepsilon' \varepsilon\, \hat{\phi}(\varepsilon)\, d\varepsilon = \frac{1}{2} w'w + \frac{1}{2} \mathrm{trace}(\Sigma).$$
Combining terms gives

$$\mathrm{ent} = \int (\log \hat{\phi} - \log \phi)\, \hat{\phi}\, d\varepsilon = -\frac{1}{2} \log \det \Sigma + \frac{1}{2} w'w + \frac{1}{2} \mathrm{trace}(\Sigma - I), \qquad (26.7)$$
2 2 2
which agrees with equation (26.5). Notice the separate appearances of the mean distortion 𝑤 and the covariance distortion
Σ − 𝐼 in equation (26.7).
Extension
Let 𝑁0 = 𝒩(𝜇0 , Σ0 ) and 𝑁1 = 𝒩(𝜇1 , Σ1 ) be two multivariate Gaussian distributions.
Then
$$D(N_0|N_1) = \frac{1}{2}\left(\mathrm{trace}(\Sigma_1^{-1} \Sigma_0) + (\mu_1 - \mu_0)' \Sigma_1^{-1} (\mu_1 - \mu_0) - \log\left(\frac{\det \Sigma_0}{\det \Sigma_1}\right) - k\right). \qquad (26.8)$$
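A minimal sketch of formula (26.8), checked against a Monte Carlo estimate of $E_0[\log p_0(x) - \log p_1(x)]$; the means and covariance matrices are illustrative:

import numpy as np
from scipy.stats import multivariate_normal

def kl_gaussian(μ0, Σ0, μ1, Σ1):
    "Relative entropy D(N0|N1) between two multivariate Gaussians, as in (26.8)."
    k = len(μ0)
    Σ1_inv = np.linalg.inv(Σ1)
    diff = μ1 - μ0
    return 0.5 * (np.trace(Σ1_inv @ Σ0) + diff @ Σ1_inv @ diff
                  - np.log(np.linalg.det(Σ0) / np.linalg.det(Σ1)) - k)

μ0, Σ0 = np.zeros(2), np.eye(2)
μ1, Σ1 = np.array([1.0, 0.0]), np.array([[1.5, 0.2], [0.2, 1.0]])

rng = np.random.default_rng(0)
x = rng.multivariate_normal(μ0, Σ0, size=200_000)
mc = np.mean(multivariate_normal(μ0, Σ0).logpdf(x)
             - multivariate_normal(μ1, Σ1).logpdf(x))
print(kl_gaussian(μ0, Σ0, μ1, Σ1), mc)   # the two numbers should be close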

26.10 Von Neumann Entropy

Let 𝑃 and 𝑄 be two positive-definite symmetric matrices.


A measure of the divergence between $P$ and $Q$ is

𝐷(𝑃 |𝑄) = trace(𝑃 ln 𝑃 − 𝑃 ln 𝑄 − 𝑃 + 𝑄)


where the log of a matrix is defined here (https://en.wikipedia.org/wiki/Logarithm_of_a_matrix).


A density matrix 𝑃 from quantum mechanics is a positive definite matrix with trace 1.
The von Neumann entropy of a density matrix 𝑃 is

𝑆 = −trace(𝑃 ln 𝑃 )

26.11 Backus-Chernov-Zin Entropy

After flipping signs, [Backus et al., 2014] use Kullback-Leibler relative entropy as a measure of volatility of stochastic
discount factors that they assert is useful for characterizing features of both the data and various theoretical models of
stochastic discount factors.

Where $p_{t+1}$ is the physical or true measure, $p_{t+1}^*$ is the risk-neutral measure, and $E_t$ denotes conditional expectation under the $p_{t+1}$ measure, [Backus et al., 2014] define entropy as

$$L_t(p_{t+1}^*/p_{t+1}) = -E_t \log(p_{t+1}^*/p_{t+1}). \qquad (26.9)$$

Evidently, by virtue of the minus sign in equation (26.9),

$$L_t(p_{t+1}^*/p_{t+1}) = D_{KL,t}(p_{t+1}|p_{t+1}^*), \qquad (26.10)$$

where 𝐷𝐾𝐿,𝑡 denotes conditional relative entropy.


Let $m_{t+1}$ be a stochastic discount factor, $r_{t+1}$ a gross one-period return on a risky security, and $(r_{t+1}^1)^{-1} \equiv q_t^1 = E_t m_{t+1}$ be the reciprocal of a risk-free one-period gross rate of return. Then

$$E_t(m_{t+1} r_{t+1}) = 1$$

[Backus et al., 2014] note that a stochastic discount factor satisfies



𝑚𝑡+1 = 𝑞𝑡1 𝑝𝑡+1 /𝑝𝑡+1 .

They derive the following entropy bound


$$E L_t(m_{t+1}) \ge E(\log r_{t+1} - \log r_{t+1}^1)$$

which they propose as a complement to a Hansen-Jagannathan [Hansen and Jagannathan, 1991] bound.

26.12 Wiener-Kolmogorov Prediction Error Formula as Entropy

Let $\{x_t\}_{t=-\infty}^{\infty}$ be a covariance stationary stochastic process with mean zero and spectral density $S_x(\omega)$.

The variance of 𝑥 is
$$\sigma_x^2 = \left(\frac{1}{2\pi}\right) \int_{-\pi}^{\pi} S_x(\omega)\, d\omega.$$

As described in chapter XIV of [Sargent, 1987], the Wiener-Kolmogorov formula for the one-period ahead prediction error is

$$\sigma_\epsilon^2 = \exp\left[\left(\frac{1}{2\pi}\right) \int_{-\pi}^{\pi} \log S_x(\omega)\, d\omega\right]. \qquad (26.11)$$


Occasionally the logarithm of the one-step-ahead prediction error $\sigma_\epsilon^2$ is called entropy because it measures unpredictability.
Consider the following problem reminiscent of one described earlier.
Problem:
Among all covariance stationary univariate processes with unconditional variance 𝜎𝑥2 , find a process with maximal one-
step-ahead prediction error.
The maximizer is a process with spectral density

𝑆𝑥 (𝜔) = 2𝜋𝜎𝑥2 .

Thus, among all univariate covariance stationary processes with variance 𝜎𝑥2 , a process with a flat spectral density is the
most uncertain, in the sense of one-step-ahead prediction error variance.
This no-patterns-across-time outcome for a temporally dependent process resembles the no-pattern-across-states outcome
for the static entropy maximizing coin or die in the classic information theoretic analysis described above.
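A minimal sketch checking formula (26.11) for an AR(1) process $x_t = \rho x_{t-1} + \epsilon_t$ with unit innovation variance, whose spectral density is $S_x(\omega) = 1/|1 - \rho e^{-i\omega}|^2$; the formula should return $\sigma_\epsilon^2 = 1$.

import numpy as np

ρ = 0.8
ω = np.linspace(-np.pi, np.pi, 100_001)
S_x = 1.0 / np.abs(1 - ρ * np.exp(-1j * ω))**2

# (1/2π) ∫ log S_x(ω) dω approximated by the mean over a uniform grid on [-π, π]
σ_ε2 = np.exp(np.log(S_x).mean())
print(σ_ε2)   # ≈ 1.0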

26.13 Multivariate Processes



Let $y_t$ be an $n \times 1$ covariance stationary stochastic process with mean $0$ and matrix covariogram $C_y(j) = E y_t y_{t-j}'$ and spectral density matrix

$$S_y(\omega) = \sum_{j=-\infty}^{\infty} e^{-i\omega j} C_y(j), \quad \omega \in [-\pi, \pi].$$

Let

$$y_t = D(L)\epsilon_t \equiv \sum_{j=0}^{\infty} D_j \epsilon_{t-j}$$

be a Wold representation for 𝑦, where 𝐷(0)𝜖𝑡 is a vector of one-step-ahead errors in predicting 𝑦𝑡 conditional on the
infinite history 𝑦𝑡−1 = [𝑦𝑡−1 , 𝑦𝑡−2 , …] and 𝜖𝑡 is an 𝑛 × 1 vector of serially uncorrelated random disturbances with mean
zero and identity contemporaneous covariance matrix 𝐸𝜖𝑡 𝜖′𝑡 = 𝐼.
Linear-least-squares predictors have one-step-ahead prediction error covariance matrix 𝐷(0)𝐷(0)′ that satisfies

log det[𝐷(0)𝐷(0)′] = (1/2𝜋) ∫_{−𝜋}^{𝜋} log det[𝑆𝑦(𝜔)] 𝑑𝜔.     (26.12)

Being a measure of the unpredictability of an 𝑛×1 vector covariance stationary stochastic process, the left side of (26.12)
is sometimes called entropy.
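As a numerical check on (26.12) (not in the original lecture, with hypothetical matrices), consider a bivariate VAR(1) 𝑦𝑡 = 𝐴𝑦𝑡−1 + 𝐶𝜖𝑡 with 𝐸𝜖𝑡𝜖′𝑡 = 𝐼, so that 𝐷(0) = 𝐶 and 𝑆𝑦(𝜔) = (𝐼 − 𝐴𝑒^{−𝑖𝜔})−1𝐶𝐶′(𝐼 − 𝐴′𝑒^{𝑖𝜔})−1; the two sides of (26.12) should agree.

import numpy as np

A = np.array([[0.7, 0.2],
              [0.0, 0.5]])        # hypothetical stable VAR(1) matrix
C = np.array([[1.0, 0.0],
              [0.3, 0.8]])        # impact matrix, so D(0) = C

I = np.eye(2)
ωs = np.linspace(-np.pi, np.pi, 20_001)

def log_det_S(ω):
    # log det of the spectral density matrix at frequency ω
    H = np.linalg.inv(I - A * np.exp(-1j * ω))
    S = H @ (C @ C.T) @ H.conj().T
    return np.log(np.linalg.det(S).real)

lhs = np.log(np.linalg.det(C @ C.T))
rhs = np.mean([log_det_S(ω) for ω in ωs])   # ≈ (1/2π) ∫ log det S_y(ω) dω
print(lhs, rhs)                             # the two should agree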

26.14 Frequency Domain Robust Control

Chapter 8 of [Hansen and Sargent, 2008] adapts work in the control theory literature to define a frequency domain
entropy criterion for robust control as

∫_Γ log det[𝜃𝐼 − 𝐺𝐹(𝜁)′𝐺𝐹(𝜁)] 𝑑𝜆(𝜁),     (26.13)

where 𝜃 ∈ (𝜃̲, +∞) is a positive robustness parameter (with 𝜃̲ a lower breakdown value) and 𝐺𝐹(𝜁) is a 𝜁-transform of the objective function.


Hansen and Sargent [Hansen and Sargent, 2008] show that criterion (26.13) can be represented as

log det[𝐷(0)′𝐷(0)] = ∫_Γ log det[𝜃𝐼 − 𝐺𝐹(𝜁)′𝐺𝐹(𝜁)] 𝑑𝜆(𝜁),     (26.14)

for an appropriate covariance stationary stochastic process derived from 𝜃, 𝐺𝐹 (𝜁).


This explains the moniker maximum entropy robust control for decision rules 𝐹 designed to maximize criterion (26.13).

26.15 Relative Entropy for a Continuous Random Variable

Let 𝑥 be a continuous random variable with density 𝜙(𝑥), and let 𝑔(𝑥) be a nonnegative random variable satisfying
∫ 𝑔(𝑥)𝜙(𝑥)𝑑𝑥 = 1.
The relative entropy of the distorted density 𝜙̂(𝑥) = 𝑔(𝑥)𝜙(𝑥) is defined as

ent(𝑔) = ∫ 𝑔(𝑥) log 𝑔(𝑥)𝜙(𝑥)𝑑𝑥.

Fig. 26.2 plots the functions 𝑔 log 𝑔 and 𝑔 − 1 over the interval 𝑔 ≥ 0.
That relative entropy ent(𝑔) ≥ 0 can be established by noting (a) that 𝑔 log 𝑔 ≥ 𝑔 − 1 (see Fig. 26.2) and (b) that under
𝜙, 𝐸𝑔 = 1.
Fig. 26.3 and Fig. 26.4 display aspects of relative entropy visually for a continuous random variable 𝑥 for two densities
with likelihood ratio 𝑔 ≥ 0.
Where the numerator density is 𝒩(0, 1), for two denominator Gaussian densities 𝒩(0, 1.5) and 𝒩(0, .95), respectively,
Fig. 26.3 and Fig. 26.4 display the functions 𝑔 log 𝑔 and 𝑔 − 1 as functions of 𝑥.
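The following sketch (not part of the original lecture) evaluates ent(𝑔) numerically for the likelihood ratio 𝑔 of a 𝒩(0, 1) density to a 𝒩(0, 1.5) denominator density, treating the second parameter as a variance (an assumption), and compares the result with the closed-form relative entropy between two zero-mean Gaussians.

import numpy as np
from scipy.stats import norm

σ1, σ2 = 1.0, np.sqrt(1.5)            # standard deviations of the two densities

x = np.linspace(-10, 10, 20_001)
dx = x[1] - x[0]
phi = norm.pdf(x, scale=σ2)           # denominator (benchmark) density
g = norm.pdf(x, scale=σ1) / phi       # likelihood ratio; E g = 1 under phi

ent_numerical = np.sum(g * np.log(g) * phi) * dx
ent_closed = 0.5 * (σ1**2 / σ2**2 - 1 + np.log(σ2**2 / σ1**2))

print(ent_numerical, ent_closed)      # both approximately 0.036 and nonnegative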

Fig. 26.2: The function 𝑔 log 𝑔 for 𝑔 ≥ 0. For a random variable 𝑔 with 𝐸𝑔 = 1, 𝐸𝑔 log 𝑔 ≥ 0.


Fig. 26.3: Graphs of 𝑔 log 𝑔 and 𝑔 − 1 where 𝑔 is the ratio of the density of a 𝒩(0, 1) random variable to the density of
a 𝒩(0, 1.5) random variable. Under the 𝒩(0, 1.5) density, 𝐸𝑔 = 1.

Fig. 26.4: 𝑔 log 𝑔 and 𝑔 − 1 where 𝑔 is the ratio of the density of a 𝒩(0, 1) random variable to the density of a 𝒩(0, .95) random variable. Under the 𝒩(0, .95) density, 𝐸𝑔 = 1.



CHAPTER

TWENTYSEVEN

ROBUSTNESS

In addition to what’s in Anaconda, this lecture will need the following libraries:

!pip install --upgrade quantecon

27.1 Overview

This lecture modifies a Bellman equation to express a decision-maker’s doubts about transition dynamics.
His specification doubts make the decision-maker want a robust decision rule.
Robust means insensitive to misspecification of transition dynamics.
The decision-maker has a single approximating model of the transition dynamics.
He calls it approximating to acknowledge that he doesn’t completely trust it.
He fears that transition dynamics are actually determined by another model that he cannot describe explicitly.
All that he knows is that the actual data-generating model is in some (uncountable) set of models that surrounds his
approximating model.
He quantifies the discrepancy between his approximating model and the genuine data-generating model by using a quantity
called entropy.
(We’ll explain what entropy means below)
He wants a decision rule that will work well enough no matter which of those other models actually governs outcomes.
This is what it means for his decision rule to be “robust to misspecification of an approximating model”.
This may sound like too much to ask for, but ….
… a secret weapon is available to design robust decision rules.
The secret weapon is max-min control theory.
A value-maximizing decision-maker enlists the aid of an (imaginary) value-minimizing model chooser to construct bounds
on the value attained by a given decision rule under different models of the transition dynamics.
The original decision-maker uses those bounds to construct a decision rule with an assured performance level, no matter
which model actually governs outcomes.

Note: In reading this lecture, please don’t think that our decision-maker is paranoid when he conducts a worst-case
analysis. By designing a rule that works well against a worst-case, his intention is to construct a rule that will work well
across a set of models.


Let’s start with some imports:

import pandas as pd
import numpy as np
from scipy.linalg import eig
import matplotlib.pyplot as plt
import quantecon as qe

27.1.1 Sets of Models Imply Sets Of Values

Our “robust” decision-maker wants to know how well a given rule will work when he does not know a single transition
law ….
… he wants to know sets of values that will be attained by a given decision rule 𝐹 under a set of transition laws.
Ultimately, he wants to design a decision rule 𝐹 that shapes the set of values in ways that he prefers.
With this in mind, consider the following graph, which relates to a particular decision problem to be explained below

The figure shows a value-entropy correspondence for a particular decision rule 𝐹 .


The shaded set is the graph of the correspondence, which maps entropy to a set of values associated with a set of models
that surround the decision-maker’s approximating model.
Here


• Value refers to a sum of discounted rewards obtained by applying the decision rule 𝐹 when the state starts at some
fixed initial state 𝑥0 .
• Entropy is a non-negative number that measures the size of a set of models surrounding the decision-maker’s approximating model.
– Entropy is zero when the set includes only the approximating model, indicating that the decision-maker completely trusts the approximating model.
– Entropy is bigger, and the set of surrounding models is bigger, the less the decision-maker trusts the approximating model of the transition dynamics.
The shaded region indicates that for all models having entropy less than or equal to the number on the horizontal axis,
the value obtained will be somewhere within the indicated set of values.
Now let’s compare sets of values associated with two different decision rules, 𝐹𝑟 and 𝐹𝑏 .
In the next figure,
• The red set shows the value-entropy correspondence for decision rule 𝐹𝑟 .
• The blue set shows the value-entropy correspondence for decision rule 𝐹𝑏 .

The blue correspondence is skinnier than the red correspondence.


This conveys the sense in which the decision rule 𝐹𝑏 is more robust than the decision rule 𝐹𝑟
• more robust means that the set of values is less sensitive to increasing misspecification as measured by entropy


Notice that the less robust rule 𝐹𝑟 promises higher values for small misspecifications (small entropy).
(But it is more fragile in the sense that it is more sensitive to perturbations of the approximating model)
Below we’ll explain in detail how to construct these sets of values for a given 𝐹 , but for now ….
Here is a hint about the secret weapons we’ll use to construct these sets
• We’ll use some min problems to construct the lower bounds
• We’ll use some max problems to construct the upper bounds
We will also describe how to choose 𝐹 to shape the sets of values.
This will involve crafting a skinnier set at the cost of a lower level (at least for low values of entropy).

27.1.2 Inspiring Video

If you want to understand more about why one serious quantitative researcher is interested in this approach, we recom-
mend Lars Peter Hansen’s Nobel lecture.

27.1.3 Other References

Our discussion in this lecture is based on


• [Hansen and Sargent, 2000]
• [Hansen and Sargent, 2008]

27.2 The Model

For simplicity, we present ideas in the context of a class of problems with linear transition laws and quadratic objective
functions.
To fit in with our earlier lecture on LQ control, we will treat loss minimization rather than value maximization.
To begin, recall the infinite horizon LQ problem, where an agent chooses a sequence of controls {𝑢𝑡 } to minimize

∑_{𝑡=0}^{∞} 𝛽^𝑡 {𝑥′𝑡𝑅𝑥𝑡 + 𝑢′𝑡𝑄𝑢𝑡}     (27.1)

subject to the linear law of motion

𝑥𝑡+1 = 𝐴𝑥𝑡 + 𝐵𝑢𝑡 + 𝐶𝑤𝑡+1 , 𝑡 = 0, 1, 2, … (27.2)

As before,
• 𝑥𝑡 is 𝑛 × 1, 𝐴 is 𝑛 × 𝑛
• 𝑢𝑡 is 𝑘 × 1, 𝐵 is 𝑛 × 𝑘
• 𝑤𝑡 is 𝑗 × 1, 𝐶 is 𝑛 × 𝑗
• 𝑅 is 𝑛 × 𝑛 and 𝑄 is 𝑘 × 𝑘


Here 𝑥𝑡 is the state, 𝑢𝑡 is the control, and 𝑤𝑡 is a shock vector.


For now, we take {𝑤𝑡} ∶= {𝑤𝑡}_{𝑡=1}^{∞} to be deterministic — a single fixed sequence.

We also allow for model uncertainty on the part of the agent solving this optimization problem.
In particular, the agent takes 𝑤𝑡 = 0 for all 𝑡 ≥ 0 as a benchmark model but admits the possibility that this model might
be wrong.
As a consequence, she also considers a set of alternative models expressed in terms of sequences {𝑤𝑡 } that are more or
less “close” to the zero sequence.
She seeks a policy that will do well enough for a set of alternative models whose members are pinned down by sequences
{𝑤𝑡 }.
A sequence {𝑤𝑡 } might represent
• nonlinearities absent from the approximating model
• time variations in parameters of the approximating model
• omitted state variables in the approximating model
• neglected history dependencies …
• and other potential sources of misspecification
Soon we’ll quantify the quality of a model specification in terms of the maximal size of the discounted sum

∑_{𝑡=0}^{∞} 𝛽^{𝑡+1} 𝑤′𝑡+1 𝑤𝑡+1.

27.3 Constructing More Robust Policies

If our agent takes {𝑤𝑡 } as a given deterministic sequence, then, drawing on ideas in earlier lectures on dynamic program-
ming, we can anticipate Bellman equations such as

𝐽𝑡−1(𝑥) = min_𝑢 {𝑥′𝑅𝑥 + 𝑢′𝑄𝑢 + 𝛽 𝐽𝑡(𝐴𝑥 + 𝐵𝑢 + 𝐶𝑤)}

(Here 𝐽 depends on 𝑡 because the sequence {𝑤𝑡 } is not recursive)


Our tool for studying robustness is to construct a rule that works well even if an adverse sequence {𝑤𝑡 } occurs.
In our framework, “adverse” means “loss increasing”.
As we’ll see, this will eventually lead us to construct a Bellman equation

𝐽(𝑥) = min_𝑢 max_𝑤 {𝑥′𝑅𝑥 + 𝑢′𝑄𝑢 + 𝛽 [𝐽(𝐴𝑥 + 𝐵𝑢 + 𝐶𝑤) − 𝜃𝑤′𝑤]}     (27.3)

Notice that we’ve added the penalty term −𝜃𝑤′ 𝑤.


Since 𝑤′ 𝑤 = ‖𝑤‖2 , this term becomes influential when 𝑤 moves away from the origin.
The penalty parameter 𝜃 controls how much we penalize the maximizing agent for “harming” the minimizing agent.
By raising 𝜃 more and more, we more and more limit the ability of maximizing agent to distort outcomes relative to the
approximating model.
So bigger 𝜃 is implicitly associated with smaller distortion sequences {𝑤𝑡 }.


27.3.1 Analyzing the Bellman Equation

So what does 𝐽 in (27.3) look like?


As with the ordinary LQ control model, 𝐽 takes the form 𝐽 (𝑥) = 𝑥′ 𝑃 𝑥 for some symmetric positive definite matrix 𝑃 .
One of our main tasks will be to analyze and compute the matrix 𝑃 .
Related tasks will be to study associated feedback rules for 𝑢𝑡 and 𝑤𝑡+1 .
First, using matrix calculus, you will be able to verify that

max_𝑤 {(𝐴𝑥 + 𝐵𝑢 + 𝐶𝑤)′𝑃(𝐴𝑥 + 𝐵𝑢 + 𝐶𝑤) − 𝜃𝑤′𝑤} = (𝐴𝑥 + 𝐵𝑢)′𝒟(𝑃)(𝐴𝑥 + 𝐵𝑢)     (27.4)

where

𝒟(𝑃 ) ∶= 𝑃 + 𝑃 𝐶(𝜃𝐼 − 𝐶 ′ 𝑃 𝐶)−1 𝐶 ′ 𝑃 (27.5)

and 𝐼 is a 𝑗 × 𝑗 identity matrix. Substituting this expression for the maximum into (27.3) yields

𝑥′𝑃𝑥 = min_𝑢 {𝑥′𝑅𝑥 + 𝑢′𝑄𝑢 + 𝛽 (𝐴𝑥 + 𝐵𝑢)′𝒟(𝑃)(𝐴𝑥 + 𝐵𝑢)}     (27.6)

Using similar mathematics, the solution to this minimization problem is 𝑢 = −𝐹𝑥 where 𝐹 ∶= (𝑄 + 𝛽𝐵′𝒟(𝑃)𝐵)−1𝛽𝐵′𝒟(𝑃)𝐴.
Substituting this minimizer back into (27.6) and working through the algebra gives 𝑥′ 𝑃 𝑥 = 𝑥′ ℬ(𝒟(𝑃 ))𝑥 for all 𝑥, or,
equivalently,

𝑃 = ℬ(𝒟(𝑃 ))

where 𝒟 is the operator defined in (27.5) and

ℬ(𝑃 ) ∶= 𝑅 − 𝛽 2 𝐴′ 𝑃 𝐵(𝑄 + 𝛽𝐵′ 𝑃 𝐵)−1 𝐵′ 𝑃 𝐴 + 𝛽𝐴′ 𝑃 𝐴

The operator ℬ is the standard (i.e., non-robust) LQ Bellman operator, and 𝑃 = ℬ(𝑃 ) is the standard matrix Riccati
equation coming from the Bellman equation — see this discussion.
Under some regularity conditions (see [Hansen and Sargent, 2008]), the operator ℬ ∘ 𝒟 has a unique positive definite
fixed point, which we denote below by 𝑃 ̂ .
A robust policy, indexed by 𝜃, is 𝑢 = −𝐹 ̂ 𝑥 where

𝐹 ̂ ∶= (𝑄 + 𝛽𝐵′ 𝒟(𝑃 ̂ )𝐵)−1 𝛽𝐵′ 𝒟(𝑃 ̂ )𝐴 (27.7)

We also define

𝐾̂ ∶= (𝜃𝐼 − 𝐶 ′ 𝑃 ̂ 𝐶)−1 𝐶 ′ 𝑃 ̂ (𝐴 − 𝐵𝐹 ̂ ) (27.8)

The interpretation of 𝐾̂ is that 𝑤𝑡+1 = 𝐾̂𝑥𝑡 on the worst-case path of {𝑥𝑡}, in the sense that this vector is the maximizer of (27.4) evaluated at the fixed rule 𝑢 = −𝐹̂𝑥.
Note that 𝑃 ̂ , 𝐹 ̂ , 𝐾̂ are all determined by the primitives and 𝜃.
Note also that if 𝜃 is very large, then 𝒟 is approximately equal to the identity mapping.
Hence, when 𝜃 is large, 𝑃 ̂ and 𝐹 ̂ are approximately equal to their standard LQ values.
Furthermore, when 𝜃 is large, 𝐾̂ is approximately equal to zero.
Conversely, smaller 𝜃 is associated with greater fear of model misspecification and greater concern for robustness.
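To make the fixed-point characterization concrete, here is a minimal sketch (not from the lecture, with hypothetical scalar primitives written in matrix form) that iterates ℬ ∘ 𝒟 to convergence and then forms 𝐹̂ and 𝐾̂ from (27.7) and (27.8); the RBLQ class used later in this lecture packages the same computation.

import numpy as np

# Hypothetical primitives (n = k = j = 1)
A, B, C = np.array([[0.9]]), np.array([[1.0]]), np.array([[0.5]])
R, Q = np.array([[1.0]]), np.array([[1.0]])
β, θ = 0.95, 2.0
I = np.eye(C.shape[1])

def D(P):
    # 𝒟(P) = P + P C (θ I - C'P C)^{-1} C'P, equation (27.5)
    return P + P @ C @ np.linalg.solve(θ * I - C.T @ P @ C, C.T @ P)

def B_op(P):
    # ℬ(P) = R - β² A'P B (Q + β B'P B)^{-1} B'P A + β A'P A
    return R - β**2 * A.T @ P @ B @ np.linalg.solve(Q + β * B.T @ P @ B, B.T @ P @ A) \
           + β * A.T @ P @ A

# Iterate P <- ℬ(𝒟(P)) to the fixed point P_hat
P = np.zeros_like(R)
for _ in range(1000):
    P_new = B_op(D(P))
    diff = np.max(np.abs(P_new - P))
    P = P_new
    if diff < 1e-10:
        break

F_hat = np.linalg.solve(Q + β * B.T @ D(P) @ B, β * B.T @ D(P) @ A)        # (27.7)
K_hat = np.linalg.solve(θ * I - C.T @ P @ C, C.T @ P @ (A - B @ F_hat))    # (27.8)
print(P, F_hat, K_hat)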


27.4 Robustness as Outcome of a Two-Person Zero-Sum Game

What we have done above can be interpreted in terms of a two-person zero-sum game in which 𝐹 ̂ , 𝐾̂ are Nash equilibrium
objects.
Agent 1 is our original agent, who seeks to minimize loss in the LQ program while admitting the possibility of misspec-
ification.
Agent 2 is an imaginary malevolent player.
Agent 2’s malevolence helps the original agent to compute bounds on his value function across a set of models.
We begin with agent 2’s problem.

27.4.1 Agent 2’s Problem

Agent 2
1. knows a fixed policy 𝐹 specifying the behavior of agent 1, in the sense that 𝑢𝑡 = −𝐹 𝑥𝑡 for all 𝑡
2. responds by choosing a shock sequence {𝑤𝑡 } from a set of paths sufficiently close to the benchmark sequence
{0, 0, 0, …}

A natural way to say “sufficiently close to the zero sequence” is to restrict the summed inner product ∑_{𝑡=1}^{∞} 𝑤′𝑡𝑤𝑡 to be small.
However, to obtain a time-invariant recursive formulation, it turns out to be convenient to restrict a discounted inner
product

∑_{𝑡=1}^{∞} 𝛽^𝑡 𝑤′𝑡𝑤𝑡 ≤ 𝜂     (27.9)

Now let 𝐹 be a fixed policy, and let 𝐽𝐹 (𝑥0 , w) be the present-value cost of that policy given sequence w ∶= {𝑤𝑡 } and
initial condition 𝑥0 ∈ ℝ𝑛 .
Substituting −𝐹 𝑥𝑡 for 𝑢𝑡 in (27.1), this value can be written as

𝐽𝐹(𝑥0, w) ∶= ∑_{𝑡=0}^{∞} 𝛽^𝑡 𝑥′𝑡(𝑅 + 𝐹′𝑄𝐹)𝑥𝑡     (27.10)

where

𝑥𝑡+1 = (𝐴 − 𝐵𝐹 )𝑥𝑡 + 𝐶𝑤𝑡+1 (27.11)

and the initial condition 𝑥0 is as specified in the left side of (27.10).


Agent 2 chooses w to maximize agent 1’s loss 𝐽𝐹 (𝑥0 , w) subject to (27.9).
Using a Lagrangian formulation, we can express this problem as

max_w ∑_{𝑡=0}^{∞} 𝛽^𝑡 {𝑥′𝑡(𝑅 + 𝐹′𝑄𝐹)𝑥𝑡 − 𝛽𝜃(𝑤′𝑡+1𝑤𝑡+1 − 𝜂)}

where {𝑥𝑡} satisfies (27.11) and 𝜃 is a Lagrange multiplier on constraint (27.9).


For the moment, let’s take 𝜃 as fixed, allowing us to drop the constant 𝛽𝜃𝜂 term in the objective function, and hence write
the problem as

max_w ∑_{𝑡=0}^{∞} 𝛽^𝑡 {𝑥′𝑡(𝑅 + 𝐹′𝑄𝐹)𝑥𝑡 − 𝛽𝜃𝑤′𝑡+1𝑤𝑡+1}


or, equivalently,

min_w ∑_{𝑡=0}^{∞} 𝛽^𝑡 {−𝑥′𝑡(𝑅 + 𝐹′𝑄𝐹)𝑥𝑡 + 𝛽𝜃𝑤′𝑡+1𝑤𝑡+1}     (27.12)

subject to (27.11).
What’s striking about this optimization problem is that it is once again an LQ discounted dynamic programming problem,
with w = {𝑤𝑡 } as the sequence of controls.
The expression for the optimal policy can be found by applying the usual LQ formula (see here).
We denote it by 𝐾(𝐹 , 𝜃), with the interpretation 𝑤𝑡+1 = 𝐾(𝐹 , 𝜃)𝑥𝑡 .
The remaining step for agent 2’s problem is to set 𝜃 to enforce the constraint (27.9), which can be done by choosing
𝜃 = 𝜃𝜂 such that

𝛽 ∑_{𝑡=0}^{∞} 𝛽^𝑡 𝑥′𝑡 𝐾(𝐹, 𝜃𝜂)′𝐾(𝐹, 𝜃𝜂)𝑥𝑡 = 𝜂     (27.13)

Here 𝑥𝑡 is given by (27.11) — which in this case becomes 𝑥𝑡+1 = (𝐴 − 𝐵𝐹 + 𝐶𝐾(𝐹 , 𝜃))𝑥𝑡 .
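As a concrete sketch of this step (not from the lecture, with hypothetical matrices), agent 2's problem (27.12) can be cast directly as an ordinary LQ problem in which w is the control, and the worst-case feedback 𝐾(𝐹, 𝜃) read off from its solution; the F_to_K method of the RBLQ class described later packages the same computation.

import numpy as np
import quantecon as qe

# Hypothetical primitives and an arbitrary fixed rule u_t = -F x_t
A, B, C = np.array([[0.9]]), np.array([[1.0]]), np.array([[0.5]])
R, Q = np.array([[1.0]]), np.array([[1.0]])
β, θ = 0.95, 2.0
F = np.array([[0.4]])

# Problem (27.12): state law x' = (A - BF)x + Cw,
# period loss -x'(R + F'QF)x + βθ w'w, with w as the control
lq2 = qe.LQ(β * θ * np.eye(C.shape[1]),   # control weight on w
            -(R + F.T @ Q @ F),           # state weight
            A - B @ F,                    # transition matrix
            C,                            # loading on the control w
            beta=β)
P2, F_lq, d2 = lq2.stationary_values()
K = -F_lq                                 # worst-case feedback: w_{t+1} = K(F, θ) x_t
print(K)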

27.4.2 Using Agent 2’s Problem to Construct Bounds on Value Sets

The Lower Bound

Define the minimized object on the right side of problem (27.12) as 𝑅𝜃 (𝑥0 , 𝐹 ).
Because “minimizers minimize” we have
𝑅𝜃(𝑥0, 𝐹) ≤ ∑_{𝑡=0}^{∞} 𝛽^𝑡 {−𝑥′𝑡(𝑅 + 𝐹′𝑄𝐹)𝑥𝑡} + 𝛽𝜃 ∑_{𝑡=0}^{∞} 𝛽^𝑡 𝑤′𝑡+1𝑤𝑡+1,

where 𝑥𝑡+1 = (𝐴 − 𝐵𝐹 + 𝐶𝐾(𝐹 , 𝜃))𝑥𝑡 and 𝑥0 is a given initial condition.


This inequality in turn implies the inequality

𝑅𝜃(𝑥0, 𝐹) − 𝜃 ent ≤ ∑_{𝑡=0}^{∞} 𝛽^𝑡 {−𝑥′𝑡(𝑅 + 𝐹′𝑄𝐹)𝑥𝑡}     (27.14)

where

ent ∶= 𝛽 ∑_{𝑡=0}^{∞} 𝛽^𝑡 𝑤′𝑡+1𝑤𝑡+1

The left side of inequality (27.14) is a straight line with slope −𝜃.
Technically, it is a “separating hyperplane”.
At a particular value of entropy, the line is tangent to the lower bound of values as a function of entropy.
In particular, the lower bound on the left side of (27.14) is attained when

ent = 𝛽 ∑_{𝑡=0}^{∞} 𝛽^𝑡 𝑥′𝑡 𝐾(𝐹, 𝜃)′𝐾(𝐹, 𝜃)𝑥𝑡     (27.15)

To construct the lower bound on the set of values associated with all perturbations w satisfying the entropy constraint
(27.9) at a given entropy level, we proceed as follows:


• For a given 𝜃, solve the minimization problem (27.12).


• Compute the minimizer 𝑅𝜃 (𝑥0 , 𝐹 ) and the associated entropy using (27.15).
• Compute the lower bound on the value function 𝑅𝜃 (𝑥0 , 𝐹 ) − 𝜃 ent and plot it against ent.
• Repeat the preceding three steps for a range of values of 𝜃 to trace out the lower bound.

Note: This procedure sweeps out a set of separating hyperplanes indexed by different values for the Lagrange multiplier
𝜃.

The Upper Bound

To construct an upper bound we use a very similar procedure.


We simply replace the minimization problem (27.12) with the maximization problem

𝑉𝜃̃(𝑥0, 𝐹) = max_w ∑_{𝑡=0}^{∞} 𝛽^𝑡 {−𝑥′𝑡(𝑅 + 𝐹′𝑄𝐹)𝑥𝑡 − 𝛽𝜃̃𝑤′𝑡+1𝑤𝑡+1}     (27.16)

where now 𝜃 ̃ > 0 penalizes the choice of w with larger entropy.


(Notice that 𝜃 ̃ = −𝜃 in problem (27.12))
Because “maximizers maximize” we have
𝑉𝜃̃(𝑥0, 𝐹) ≥ ∑_{𝑡=0}^{∞} 𝛽^𝑡 {−𝑥′𝑡(𝑅 + 𝐹′𝑄𝐹)𝑥𝑡} − 𝛽𝜃̃ ∑_{𝑡=0}^{∞} 𝛽^𝑡 𝑤′𝑡+1𝑤𝑡+1

which in turn implies the inequality



𝑉𝜃̃(𝑥0, 𝐹) + 𝜃̃ ent ≥ ∑_{𝑡=0}^{∞} 𝛽^𝑡 {−𝑥′𝑡(𝑅 + 𝐹′𝑄𝐹)𝑥𝑡}     (27.17)

where

ent ≡ 𝛽 ∑_{𝑡=0}^{∞} 𝛽^𝑡 𝑤′𝑡+1𝑤𝑡+1

The left side of inequality (27.17) is a straight line with slope 𝜃̃.
The upper bound on the left side of (27.17) is attained when

ent = 𝛽 ∑_{𝑡=0}^{∞} 𝛽^𝑡 𝑥′𝑡 𝐾(𝐹, 𝜃̃)′𝐾(𝐹, 𝜃̃)𝑥𝑡     (27.18)

To construct the upper bound on the set of values associated with all perturbations w with a given entropy, we proceed much as we did for the lower bound
• For a given 𝜃,̃ solve the maximization problem (27.16).
• Compute the maximizer 𝑉𝜃 ̃(𝑥0 , 𝐹 ) and the associated entropy using (27.18).

• Compute the upper bound on the value function 𝑉𝜃 ̃(𝑥0 , 𝐹 ) + 𝜃 ̃ ent and plot it against ent.

• Repeat the preceding three steps for a range of values of 𝜃 ̃ to trace out the upper bound.


Reshaping the Set of Values

Now in the interest of reshaping these sets of values by choosing 𝐹 , we turn to agent 1’s problem.

27.4.3 Agent 1’s Problem

Now we turn to agent 1, who solves



min_{𝑢𝑡} ∑_{𝑡=0}^{∞} 𝛽^𝑡 {𝑥′𝑡𝑅𝑥𝑡 + 𝑢′𝑡𝑄𝑢𝑡 − 𝛽𝜃𝑤′𝑡+1𝑤𝑡+1}     (27.19)

where {𝑤𝑡+1 } satisfies 𝑤𝑡+1 = 𝐾𝑥𝑡 .


In other words, agent 1 minimizes

∑_{𝑡=0}^{∞} 𝛽^𝑡 {𝑥′𝑡(𝑅 − 𝛽𝜃𝐾′𝐾)𝑥𝑡 + 𝑢′𝑡𝑄𝑢𝑡}     (27.20)

subject to

𝑥𝑡+1 = (𝐴 + 𝐶𝐾)𝑥𝑡 + 𝐵𝑢𝑡 (27.21)

Once again, the expression for the optimal policy can be found here — we denote it by 𝐹 ̃ .

27.4.4 Nash Equilibrium

Clearly, the 𝐹 ̃ we have obtained depends on 𝐾, which, in agent 2’s problem, depended on an initial policy 𝐹 .
Holding all other parameters fixed, we can represent this relationship as a mapping Φ, where

𝐹 ̃ = Φ(𝐾(𝐹 , 𝜃))

The map 𝐹 ↦ Φ(𝐾(𝐹 , 𝜃)) corresponds to a situation in which


1. agent 1 uses an arbitrary initial policy 𝐹
2. agent 2 best responds to agent 1 by choosing 𝐾(𝐹 , 𝜃)
3. agent 1 best responds to agent 2 by choosing 𝐹 ̃ = Φ(𝐾(𝐹 , 𝜃))
As you may have already guessed, the robust policy 𝐹 ̂ defined in (27.7) is a fixed point of the mapping Φ.
In particular, for any given 𝜃,
1. 𝐾(𝐹̂, 𝜃) = 𝐾̂, where 𝐾̂ is as given in (27.8)
2. Φ(𝐾)̂ = 𝐹 ̂
A sketch of the proof is given in the appendix.


27.5 The Stochastic Case

Now we turn to the stochastic case, where the sequence {𝑤𝑡 } is treated as an IID sequence of random vectors.
In this setting, we suppose that our agent is uncertain about the conditional probability distribution of 𝑤𝑡+1 .
The agent takes the standard normal distribution 𝑁 (0, 𝐼) as the baseline conditional distribution, while admitting the
possibility that other “nearby” distributions prevail.
These alternative conditional distributions of 𝑤𝑡+1 might depend nonlinearly on the history 𝑥𝑠 , 𝑠 ≤ 𝑡.
To implement this idea, we need a notion of what it means for one distribution to be near another one.
Here we adopt a very useful measure of closeness for distributions known as the relative entropy, or Kullback-Leibler
divergence.
For densities 𝑝, 𝑞, the Kullback-Leibler divergence of 𝑞 from 𝑝 is defined as

𝐷𝐾𝐿(𝑝, 𝑞) ∶= ∫ ln [𝑝(𝑥)/𝑞(𝑥)] 𝑝(𝑥) 𝑑𝑥
Using this notation, we replace (27.3) with the stochastic analog

𝐽(𝑥) = min_𝑢 max_{𝜓∈𝒫} {𝑥′𝑅𝑥 + 𝑢′𝑄𝑢 + 𝛽 [∫ 𝐽(𝐴𝑥 + 𝐵𝑢 + 𝐶𝑤) 𝜓(𝑑𝑤) − 𝜃𝐷𝐾𝐿(𝜓, 𝜙)]}     (27.22)

Here 𝒫 represents the set of all densities on ℝ𝑛 and 𝜙 is the benchmark distribution 𝑁 (0, 𝐼).
The distribution 𝜓 is chosen as the least desirable conditional distribution in terms of next period outcomes, while taking into account the penalty term 𝜃𝐷𝐾𝐿(𝜓, 𝜙).
This penalty term plays a role analogous to the one played by the deterministic penalty 𝜃𝑤′ 𝑤 in (27.3), since it discourages
large deviations from the benchmark.

27.5.1 Solving the Model

The maximization problem in (27.22) appears highly nontrivial — after all, we are maximizing over an infinite dimen-
sional space consisting of the entire set of densities.
However, it turns out that the solution is tractable, and in fact also falls within the class of normal distributions.
First, we note that 𝐽 has the form 𝐽 (𝑥) = 𝑥′ 𝑃 𝑥 + 𝑑 for some positive definite matrix 𝑃 and constant real number 𝑑.
Moreover, it turns out that if (𝐼 − 𝜃−1 𝐶 ′ 𝑃 𝐶)−1 is nonsingular, then

max_{𝜓∈𝒫} {∫(𝐴𝑥 + 𝐵𝑢 + 𝐶𝑤)′𝑃(𝐴𝑥 + 𝐵𝑢 + 𝐶𝑤) 𝜓(𝑑𝑤) − 𝜃𝐷𝐾𝐿(𝜓, 𝜙)}
        = (𝐴𝑥 + 𝐵𝑢)′𝒟(𝑃)(𝐴𝑥 + 𝐵𝑢) + 𝜅(𝜃, 𝑃)     (27.23)

where

𝜅(𝜃, 𝑃 ) ∶= 𝜃 ln[det(𝐼 − 𝜃−1 𝐶 ′ 𝑃 𝐶)−1 ]

and the maximizer is the Gaussian distribution

𝜓 = 𝑁 ((𝜃𝐼 − 𝐶 ′ 𝑃 𝐶)−1 𝐶 ′ 𝑃 (𝐴𝑥 + 𝐵𝑢), (𝐼 − 𝜃−1 𝐶 ′ 𝑃 𝐶)−1 ) (27.24)

Substituting the expression for the maximum into Bellman equation (27.22) and using 𝐽 (𝑥) = 𝑥′ 𝑃 𝑥 + 𝑑 gives

𝑥′𝑃𝑥 + 𝑑 = min_𝑢 {𝑥′𝑅𝑥 + 𝑢′𝑄𝑢 + 𝛽 (𝐴𝑥 + 𝐵𝑢)′𝒟(𝑃)(𝐴𝑥 + 𝐵𝑢) + 𝛽 [𝑑 + 𝜅(𝜃, 𝑃)]}     (27.25)


Since constant terms do not affect minimizers, the solution is the same as (27.6), leading to

𝑥′ 𝑃 𝑥 + 𝑑 = 𝑥′ ℬ(𝒟(𝑃 ))𝑥 + 𝛽 [𝑑 + 𝜅(𝜃, 𝑃 )]

To solve this Bellman equation, we take 𝑃 ̂ to be the positive definite fixed point of ℬ ∘ 𝒟.
In addition, we take 𝑑 ̂ as the real number solving 𝑑 = 𝛽 [𝑑 + 𝜅(𝜃, 𝑃 )], which is

𝑑̂ ∶= (𝛽/(1 − 𝛽)) 𝜅(𝜃, 𝑃)     (27.26)

The robust policy in this stochastic case is the minimizer in (27.25), which is once again 𝑢 = −𝐹 ̂ 𝑥 for 𝐹 ̂ given by (27.7).
Substituting the robust policy into (27.24) we obtain the worst-case shock distribution:

𝑤𝑡+1 ∼ 𝑁(𝐾̂𝑥𝑡, (𝐼 − 𝜃−1𝐶′𝑃̂𝐶)−1)

where 𝐾̂ is given by (27.8).


Note that the mean of the worst-case shock distribution is equal to the same worst-case 𝑤𝑡+1 as in the earlier deterministic
setting.
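As a brief sketch (not from the lecture, with hypothetical parameter values), once 𝑃̂, 𝐹̂, 𝐾̂ are available, say from the RBLQ class introduced in the next section, the constant 𝑑̂ in (27.26) and the worst-case shock distribution can be computed directly.

import numpy as np
import quantecon as qe

# Hypothetical scalar primitives
A, B, C = np.array([[0.9]]), np.array([[1.0]]), np.array([[0.5]])
R, Q = np.array([[1.0]]), np.array([[1.0]])
β, θ = 0.95, 2.0

rlq = qe.RBLQ(Q, R, A, B, C, β, θ)
F_hat, K_hat, P_hat = rlq.robust_rule()

I = np.eye(C.shape[1])
κ = -θ * np.log(np.linalg.det(I - C.T @ P_hat @ C / θ))   # κ(θ, P)
d_hat = β / (1 - β) * κ                                   # equation (27.26)

# Worst-case shock distribution: w_{t+1} ~ N(K_hat @ x_t, Σ_w)
Σ_w = np.linalg.inv(I - C.T @ P_hat @ C / θ)
print(d_hat, Σ_w)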

27.5.2 Computing Other Quantities

Before turning to implementation, we briefly outline how to compute several other quantities of interest.

Worst-Case Value of a Policy

One thing we will be interested in doing is holding a policy fixed and computing the discounted loss associated with that
policy.
So let 𝐹 be a given policy and let 𝐽𝐹 (𝑥) be the associated loss, which, by analogy with (27.22), satisfies

𝐽𝐹(𝑥) = max_{𝜓∈𝒫} {𝑥′(𝑅 + 𝐹′𝑄𝐹)𝑥 + 𝛽 [∫ 𝐽𝐹((𝐴 − 𝐵𝐹)𝑥 + 𝐶𝑤) 𝜓(𝑑𝑤) − 𝜃𝐷𝐾𝐿(𝜓, 𝜙)]}

Writing 𝐽𝐹 (𝑥) = 𝑥′ 𝑃𝐹 𝑥 + 𝑑𝐹 and applying the same argument used to derive (27.23) we get

𝑥′ 𝑃𝐹 𝑥 + 𝑑𝐹 = 𝑥′ (𝑅 + 𝐹 ′ 𝑄𝐹 )𝑥 + 𝛽 [𝑥′ (𝐴 − 𝐵𝐹 )′ 𝒟(𝑃𝐹 )(𝐴 − 𝐵𝐹 )𝑥 + 𝑑𝐹 + 𝜅(𝜃, 𝑃𝐹 )]

To solve this we take 𝑃𝐹 to be the fixed point

𝑃𝐹 = 𝑅 + 𝐹 ′ 𝑄𝐹 + 𝛽(𝐴 − 𝐵𝐹 )′ 𝒟(𝑃𝐹 )(𝐴 − 𝐵𝐹 )

and
𝑑𝐹 ∶= (𝛽/(1 − 𝛽)) 𝜅(𝜃, 𝑃𝐹) = (𝛽/(1 − 𝛽)) 𝜃 ln[det(𝐼 − 𝜃−1𝐶′𝑃𝐹𝐶)−1]     (27.27)
If you skip ahead to the appendix, you will be able to verify that −𝑃𝐹 is the solution to the Bellman equation in agent 2’s
problem discussed above — we use this in our computations.
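Here is a minimal sketch (not from the lecture, with hypothetical matrices and a hypothetical fixed policy) of the fixed-point computation for 𝑃𝐹 and the constant 𝑑𝐹 in (27.27); in practice the evaluate_F method of the RBLQ class performs this calculation.

import numpy as np

# Hypothetical primitives and a fixed policy F
A, B, C = np.array([[0.9]]), np.array([[1.0]]), np.array([[0.5]])
R, Q = np.array([[1.0]]), np.array([[1.0]])
β, θ = 0.95, 2.0
F = np.array([[0.4]])
I = np.eye(C.shape[1])

def D(P):
    # 𝒟(P) = P + P C (θ I - C'P C)^{-1} C'P
    return P + P @ C @ np.linalg.solve(θ * I - C.T @ P @ C, C.T @ P)

# Iterate P_F = R + F'QF + β (A - BF)' 𝒟(P_F) (A - BF) to a fixed point
ABF = A - B @ F
P_F = np.zeros_like(R)
for _ in range(2000):
    P_new = R + F.T @ Q @ F + β * ABF.T @ D(P_F) @ ABF
    diff = np.max(np.abs(P_new - P_F))
    P_F = P_new
    if diff < 1e-12:
        break

# Constant term, equation (27.27)
d_F = β / (1 - β) * θ * np.log(np.linalg.det(np.linalg.inv(I - C.T @ P_F @ C / θ)))
print(P_F, d_F)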


27.6 Implementation

The QuantEcon.py package provides a class called RBLQ for implementation of robust LQ optimal control.
The code can be found on GitHub.
Here is a brief description of the methods of the class
• d_operator() and b_operator() implement 𝒟 and ℬ respectively
• robust_rule() and robust_rule_simple() both solve for the triple 𝐹 ̂ , 𝐾,̂ 𝑃 ̂ , as described in equations
(27.7) – (27.8) and the surrounding discussion
– robust_rule() is more efficient
– robust_rule_simple() is more transparent and easier to follow
• K_to_F() and F_to_K() solve the decision problems of agent 1 and agent 2 respectively
• compute_deterministic_entropy() computes the left-hand side of (27.13)
• evaluate_F() computes the loss and entropy associated with a given policy — see this discussion
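For orientation, here is a minimal usage sketch with hypothetical matrices (the application below uses the actual model matrices):

import numpy as np
import quantecon as qe

# Hypothetical LQ primitives
A, B, C = np.array([[0.9]]), np.array([[1.0]]), np.array([[0.5]])
R, Q = np.array([[1.0]]), np.array([[1.0]])
β, θ = 0.95, 2.0

rlq = qe.RBLQ(Q, R, A, B, C, β, θ)

# Robust rule: u_t = -F_hat x_t, with worst-case shock w_{t+1} = K_hat x_t
F_hat, K_hat, P_hat = rlq.robust_rule()

# Loss and entropy objects associated with holding the policy F_hat fixed
K_F, P_F, d_F, O_F, o_F = rlq.evaluate_F(F_hat)
print(F_hat, K_hat)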

27.7 Application

Let us consider a monopolist similar to this one, but now facing model uncertainty.
The inverse demand function is 𝑝𝑡 = 𝑎0 − 𝑎1 𝑦𝑡 + 𝑑𝑡 .
where
𝑑𝑡+1 = 𝜌𝑑𝑡 + 𝜎𝑑𝑤𝑡+1,     {𝑤𝑡} ∼ IID 𝑁(0, 1)

and all parameters are strictly positive.


The period return function for the monopolist is

𝑟𝑡 = 𝑝𝑡𝑦𝑡 − 𝛾 (𝑦𝑡+1 − 𝑦𝑡)²/2 − 𝑐𝑦𝑡

Its objective is to maximize expected discounted profits, or, equivalently, to minimize 𝔼 ∑_{𝑡=0}^{∞} 𝛽^𝑡 (−𝑟𝑡).
To form a linear regulator problem, we take the state and control to be

𝑥𝑡 = [1, 𝑦𝑡, 𝑑𝑡]′     and     𝑢𝑡 = 𝑦𝑡+1 − 𝑦𝑡
Setting 𝑏 ∶= (𝑎0 − 𝑐)/2 we define

        ⎡ 0     𝑏     0  ⎤
𝑅 = −   ⎢ 𝑏    −𝑎1   1/2 ⎥     and     𝑄 = 𝛾/2
        ⎣ 0    1/2    0  ⎦

For the transition matrices, we set


      ⎡ 1  0  0 ⎤          ⎡ 0 ⎤          ⎡ 0  ⎤
𝐴 =   ⎢ 0  1  0 ⎥,    𝐵 =  ⎢ 1 ⎥,    𝐶 =  ⎢ 0  ⎥
      ⎣ 0  0  𝜌 ⎦          ⎣ 0 ⎦          ⎣ 𝜎𝑑 ⎦
Our aim is to compute the value-entropy correspondences shown above.


The parameters are

𝑎0 = 100, 𝑎1 = 0.5, 𝜌 = 0.9, 𝜎𝑑 = 0.05, 𝛽 = 0.95, 𝑐 = 2, 𝛾 = 50.0

The standard normal distribution for 𝑤𝑡 is understood as the agent’s baseline, with uncertainty parameterized by 𝜃.
We compute value-entropy correspondences for two policies
1. The no concern for robustness policy 𝐹0 , which is the ordinary LQ loss minimizer.
2. A “moderate” concern for robustness policy 𝐹𝑏 , with 𝜃 = 0.02.
The code for producing the graph shown above, with blue being for the robust policy, is as follows

# Model parameters

a_0 = 100
a_1 = 0.5
ρ = 0.9
σ_d = 0.05
β = 0.95
c = 2
γ = 50.0

θ = 0.02
ac = (a_0 - c) / 2.0

# Define LQ matrices

R = np.array([[0., ac, 0.],


[ac, -a_1, 0.5],
[0., 0.5, 0.]])

R = -R # For minimization
Q = γ / 2

A = np.array([[1., 0., 0.],


[0., 1., 0.],
[0., 0., ρ]])
B = np.array([[0.],
[1.],
[0.]])
C = np.array([[0.],
[0.],
[σ_d]])

# ----------------------------------------------------------------------- #
# Functions
# ----------------------------------------------------------------------- #

def evaluate_policy(θ, F):

"""
Given θ (scalar, dtype=float) and policy F (array_like), returns the
value associated with that policy under the worst case path for {w_t},
as well as the entropy level.
"""



rlq = qe.RBLQ(Q, R, A, B, C, β, θ)
K_F, P_F, d_F, O_F, o_F = rlq.evaluate_F(F)
x0 = np.array([[1.], [0.], [0.]])
    # extract scalars to avoid relying on float() applied to size-1 arrays
    value = -(x0.T @ P_F @ x0).item() - d_F
    entropy = (x0.T @ O_F @ x0).item() + o_F
    return value, entropy

def value_and_entropy(emax, F, bw, grid_size=1000):

"""
Compute the value function and entropy levels for a θ path
increasing until it reaches the specified target entropy value.

Parameters
==========
emax: scalar
The target entropy value

F: array_like
The policy function to be evaluated

bw: str
A string specifying whether the implied shock path follows best
or worst assumptions. The only acceptable values are 'best' and
'worst'.

Returns
=======
df: pd.DataFrame
A pandas DataFrame containing the value function and entropy
values up to the emax parameter. The columns are 'value' and
'entropy'.
"""

if bw == 'worst':
θs = 1 / np.linspace(1e-8, 1000, grid_size)
else:
θs = -1 / np.linspace(1e-8, 1000, grid_size)

df = pd.DataFrame(index=θs, columns=('value', 'entropy'))

for θ in θs:
df.loc[θ] = evaluate_policy(θ, F)
if df.loc[θ, 'entropy'] >= emax:
break

df = df.dropna(how='any')
return df

# ------------------------------------------------------------------------ #
# Main
# ------------------------------------------------------------------------ #



# Compute the optimal rule
optimal_lq = qe.LQ(Q, R, A, B, C, beta=β)
Po, Fo, do = optimal_lq.stationary_values()

# Compute a robust rule given θ


baseline_robust = qe.RBLQ(Q, R, A, B, C, β, θ)
Fb, Kb, Pb = baseline_robust.robust_rule()

# Check the positive definiteness of worst-case covariance matrix to


# ensure that θ exceeds the breakdown point
test_matrix = np.identity(Pb.shape[0]) - (C.T @ Pb @ C) / θ
eigenvals, eigenvecs = eig(test_matrix)
assert (eigenvals >= 0).all(), 'θ below breakdown point.'

emax = 1.6e6

optimal_best_case = value_and_entropy(emax, Fo, 'best')


robust_best_case = value_and_entropy(emax, Fb, 'best')
optimal_worst_case = value_and_entropy(emax, Fo, 'worst')
robust_worst_case = value_and_entropy(emax, Fb, 'worst')

fig, ax = plt.subplots()

ax.set_xlim(0, emax)
ax.set_ylabel("Value")
ax.set_xlabel("Entropy")
ax.grid()

for axis in 'x', 'y':


plt.ticklabel_format(style='sci', axis=axis, scilimits=(0, 0))

plot_args = {'lw': 2, 'alpha': 0.7}

colors = 'r', 'b'

df_pairs = ((optimal_best_case, optimal_worst_case),


(robust_best_case, robust_worst_case))

class Curve:

def __init__(self, x, y):


self.x, self.y = x, y

def __call__(self, z):


return np.interp(z, self.x, self.y)

for c, df_pair in zip(colors, df_pairs):


curves = []
for df in df_pair:
# Plot curves
x, y = df['entropy'], df['value']
x, y = (np.asarray(a, dtype='float') for a in (x, y))
egrid = np.linspace(0, emax, 100)



curve = Curve(x, y)
ax.plot(egrid, curve(egrid), color=c, **plot_args)
curves.append(curve)
# Color fill between curves
ax.fill_between(egrid,
curves[0](egrid),
curves[1](egrid),
color=c, alpha=0.1)

plt.show()




Here’s another such figure, with 𝜃 = 0.002 instead of 0.02


Can you explain the different shape of the value-entropy correspondence for the robust policy?

27.8 Appendix

We sketch the proof only of the first claim in this section, which is that, for any given 𝜃, 𝐾(𝐹̂, 𝜃) = 𝐾̂, where 𝐾̂ is as given in (27.8).
This is the content of the next lemma.
Lemma. If 𝑃 ̂ is the fixed point of the map ℬ ∘ 𝒟 and 𝐹 ̂ is the robust policy as given in (27.7), then

𝐾(𝐹 ̂ , 𝜃) = (𝜃𝐼 − 𝐶 ′ 𝑃 ̂ 𝐶)−1 𝐶 ′ 𝑃 ̂ (𝐴 − 𝐵𝐹 ̂ ) (27.28)

Proof: As a first step, observe that when 𝐹 = 𝐹 ̂ , the Bellman equation associated with the LQ problem (27.11) – (27.12)
is

𝑃 ̃ = −𝑅 − 𝐹 ̂ ′ 𝑄𝐹 ̂ − 𝛽 2 (𝐴 − 𝐵𝐹 ̂ )′ 𝑃 ̃ 𝐶(𝛽𝜃𝐼 + 𝛽𝐶 ′ 𝑃 ̃ 𝐶)−1 𝐶 ′ 𝑃 ̃ (𝐴 − 𝐵𝐹 ̂ ) + 𝛽(𝐴 − 𝐵𝐹 ̂ )′ 𝑃 ̃ (𝐴 − 𝐵𝐹 ̂ ) (27.29)

(revisit this discussion if you don’t know where (27.29) comes from) and the optimal policy is

𝑤𝑡+1 = −𝛽(𝛽𝜃𝐼 + 𝛽𝐶 ′ 𝑃 ̃ 𝐶)−1 𝐶 ′ 𝑃 ̃ (𝐴 − 𝐵𝐹 ̂ )𝑥𝑡

Suppose for a moment that −𝑃 ̂ solves the Bellman equation (27.29).


In this case, the policy becomes

𝑤𝑡+1 = (𝜃𝐼 − 𝐶 ′ 𝑃 ̂ 𝐶)−1 𝐶 ′ 𝑃 ̂ (𝐴 − 𝐵𝐹 ̂ )𝑥𝑡

which is exactly the claim in (27.28).


Hence it remains only to show that −𝑃 ̂ solves (27.29), or, in other words,

𝑃 ̂ = 𝑅 + 𝐹 ̂ ′ 𝑄𝐹 ̂ + 𝛽(𝐴 − 𝐵𝐹 ̂ )′ 𝑃 ̂ 𝐶(𝜃𝐼 − 𝐶 ′ 𝑃 ̂ 𝐶)−1 𝐶 ′ 𝑃 ̂ (𝐴 − 𝐵𝐹 ̂ ) + 𝛽(𝐴 − 𝐵𝐹 ̂ )′ 𝑃 ̂ (𝐴 − 𝐵𝐹 ̂ )

Using the definition of 𝒟, we can rewrite the right-hand side more simply as

𝑅 + 𝐹 ̂ ′ 𝑄𝐹 ̂ + 𝛽(𝐴 − 𝐵𝐹 ̂ )′ 𝒟(𝑃 ̂ )(𝐴 − 𝐵𝐹 ̂ )

Although it involves a substantial amount of algebra, it can be shown that the latter is just 𝑃 ̂ .

Hint: Use the fact that 𝑃 ̂ = ℬ(𝒟(𝑃 ̂ ))



CHAPTER

TWENTYEIGHT

ROBUST MARKOV PERFECT EQUILIBRIUM

In addition to what’s in Anaconda, this lecture will need the following libraries:

!pip install --upgrade quantecon

28.1 Overview

This lecture describes a Markov perfect equilibrium with robust agents.


We focus on special settings with
• two players
• quadratic payoff functions
• linear transition rules for the state vector
These specifications simplify calculations and allow us to give a simple example that illustrates basic forces.
This lecture is based on ideas described in chapter 15 of [Hansen and Sargent, 2008] and in Markov perfect equilibrium
and Robustness.
Let’s start with some standard imports:

import numpy as np
import quantecon as qe
from scipy.linalg import solve
import matplotlib.pyplot as plt

28.1.1 Basic Setup

Decisions of two agents affect the motion of a state vector that appears as an argument of payoff functions of both agents.
As described in Markov perfect equilibrium, when decision-makers have no concerns about the robustness of their de-
cision rules to misspecifications of the state dynamics, a Markov perfect equilibrium can be computed via backward
recursion on two sets of equations
• a pair of Bellman equations, one for each agent.
• a pair of equations that express linear decision rules for each agent as functions of that agent’s continuation value
function as well as parameters of preferences and state transition matrices.


This lecture shows how a similar equilibrium concept and similar computational procedures apply when we impute con-
cerns about robustness to both decision-makers.
A Markov perfect equilibrium with robust agents will be characterized by
• a pair of Bellman equations, one for each agent.
• a pair of equations that express linear decision rules for each agent as functions of that agent’s continuation value
function as well as parameters of preferences and state transition matrices.
• a pair of equations that express linear decision rules for worst-case shocks for each agent as functions of that agent’s
continuation value function as well as parameters of preferences and state transition matrices.
Below, we’ll construct a robust firms version of the classic duopoly model with adjustment costs analyzed in Markov
perfect equilibrium.

28.2 Linear Markov Perfect Equilibria with Robust Agents

As we saw in Markov perfect equilibrium, the study of Markov perfect equilibria in dynamic games with two players
leads us to an interrelated pair of Bellman equations.
In linear quadratic dynamic games, these “stacked Bellman equations” become “stacked Riccati equations” with a tractable
mathematical structure.

28.2.1 Modified Coupled Linear Regulator Problems

We consider a general linear quadratic regulator game with two players, each of whom fears model misspecifications.
We often call the players agents.
The agents share a common baseline model for the transition dynamics of the state vector
• this is a counterpart of a ‘rational expectations’ assumption of shared beliefs
But now one or more agents doubt that the baseline model is correctly specified.
The agents express the possibility that their baseline specification is incorrect by adding a contribution 𝐶𝑣𝑖𝑡 to the time
𝑡 transition law for the state
• 𝐶 is the usual volatility matrix that appears in stochastic versions of optimal linear regulator problems.
• 𝑣𝑖𝑡 is a possibly history-dependent vector of distortions to the dynamics of the state that agent 𝑖 uses to represent
misspecification of the original model.
For convenience, we’ll start with a finite horizon formulation, where 𝑡0 is the initial date and 𝑡1 is the common terminal
date.
Player 𝑖 takes a sequence {𝑢−𝑖𝑡 } as given and chooses a sequence {𝑢𝑖𝑡 } to minimize and {𝑣𝑖𝑡 } to maximize
∑_{𝑡=𝑡0}^{𝑡1−1} 𝛽^{𝑡−𝑡0} {𝑥′𝑡𝑅𝑖𝑥𝑡 + 𝑢′𝑖𝑡𝑄𝑖𝑢𝑖𝑡 + 𝑢′−𝑖𝑡𝑆𝑖𝑢−𝑖𝑡 + 2𝑥′𝑡𝑊𝑖𝑢𝑖𝑡 + 2𝑢′−𝑖𝑡𝑀𝑖𝑢𝑖𝑡 − 𝜃𝑖𝑣′𝑖𝑡𝑣𝑖𝑡}     (28.1)

while thinking that the state evolves according to

𝑥𝑡+1 = 𝐴𝑥𝑡 + 𝐵1 𝑢1𝑡 + 𝐵2 𝑢2𝑡 + 𝐶𝑣𝑖𝑡 (28.2)

Here
• 𝑥𝑡 is an 𝑛 × 1 state vector, 𝑢𝑖𝑡 is a 𝑘𝑖 × 1 vector of controls for player 𝑖, and


• 𝑣𝑖𝑡 is an ℎ × 1 vector of distortions to the state dynamics that concern player 𝑖


• 𝑅𝑖 is 𝑛 × 𝑛
• 𝑆𝑖 is 𝑘−𝑖 × 𝑘−𝑖
• 𝑄𝑖 is 𝑘𝑖 × 𝑘𝑖
• 𝑊𝑖 is 𝑛 × 𝑘𝑖
• 𝑀𝑖 is 𝑘−𝑖 × 𝑘𝑖
• 𝐴 is 𝑛 × 𝑛
• 𝐵𝑖 is 𝑛 × 𝑘𝑖
• 𝐶 is 𝑛 × ℎ
• 𝜃𝑖 ∈ [𝜃𝑖 , +∞] is a scalar multiplier parameter of player 𝑖
If 𝜃𝑖 = +∞, player 𝑖 completely trusts the baseline model.
If 𝜃𝑖 < +∞, player 𝑖 suspects that some other unspecified model actually governs the transition dynamics.
The term 𝜃𝑖𝑣′𝑖𝑡𝑣𝑖𝑡 is a time 𝑡 contribution to an entropy penalty that an (imaginary) loss-maximizing agent inside agent 𝑖’s mind charges for distorting the law of motion in a way that harms agent 𝑖.
• the imaginary loss-maximizing agent helps the loss-minimizing agent by helping him construct bounds on the
behavior of his decision rule over a large set of alternative models of state transition dynamics.

28.2.2 Computing Equilibrium

We formulate a linear robust Markov perfect equilibrium as follows.


Player 𝑖 employs linear decision rules 𝑢𝑖𝑡 = −𝐹𝑖𝑡 𝑥𝑡 , where 𝐹𝑖𝑡 is a 𝑘𝑖 × 𝑛 matrix.
Player 𝑖’s malevolent alter ego employs decision rules 𝑣𝑖𝑡 = 𝐾𝑖𝑡 𝑥𝑡 where 𝐾𝑖𝑡 is an ℎ × 𝑛 matrix.
A robust Markov perfect equilibrium is a pair of sequences {𝐹1𝑡 , 𝐹2𝑡 } and a pair of sequences {𝐾1𝑡 , 𝐾2𝑡 } over 𝑡 =
𝑡0 , … , 𝑡1 − 1 that satisfy
• {𝐹1𝑡 , 𝐾1𝑡 } solves player 1’s robust decision problem, taking {𝐹2𝑡 } as given, and
• {𝐹2𝑡 , 𝐾2𝑡 } solves player 2’s robust decision problem, taking {𝐹1𝑡 } as given.
If we substitute 𝑢2𝑡 = −𝐹2𝑡 𝑥𝑡 into (28.1) and (28.2), then player 1’s problem becomes minimization-maximization of
∑_{𝑡=𝑡0}^{𝑡1−1} 𝛽^{𝑡−𝑡0} {𝑥′𝑡Π1𝑡𝑥𝑡 + 𝑢′1𝑡𝑄1𝑢1𝑡 + 2𝑢′1𝑡Γ1𝑡𝑥𝑡 − 𝜃1𝑣′1𝑡𝑣1𝑡}     (28.3)

subject to

𝑥𝑡+1 = Λ1𝑡 𝑥𝑡 + 𝐵1 𝑢1𝑡 + 𝐶𝑣1𝑡 (28.4)

where
• Λ𝑖𝑡 ∶= 𝐴 − 𝐵−𝑖 𝐹−𝑖𝑡

• Π𝑖𝑡 ∶= 𝑅𝑖 + 𝐹′−𝑖𝑡𝑆𝑖𝐹−𝑖𝑡
• Γ𝑖𝑡 ∶= 𝑊𝑖′ − 𝑀𝑖′ 𝐹−𝑖𝑡


This is an LQ robust dynamic programming problem of the type studied in the Robustness lecture, which can be solved
by working backward.
Maximization with respect to distortion 𝑣1𝑡 leads to the following version of the 𝒟 operator from the Robustness lecture,
namely

𝒟1 (𝑃 ) ∶= 𝑃 + 𝑃 𝐶(𝜃1 𝐼 − 𝐶 ′ 𝑃 𝐶)−1 𝐶 ′ 𝑃 (28.5)

The matrix 𝐹1𝑡 in the policy rule 𝑢1𝑡 = −𝐹1𝑡 𝑥𝑡 that solves agent 1’s problem satisfies

𝐹1𝑡 = (𝑄1 + 𝛽𝐵1′ 𝒟1 (𝑃1𝑡+1 )𝐵1 )−1 (𝛽𝐵1′ 𝒟1 (𝑃1𝑡+1 )Λ1𝑡 + Γ1𝑡 ) (28.6)

where 𝑃1𝑡 solves the matrix Riccati difference equation

𝑃1𝑡 = Π1𝑡 − (𝛽𝐵1′𝒟1(𝑃1𝑡+1)Λ1𝑡 + Γ1𝑡)′(𝑄1 + 𝛽𝐵1′𝒟1(𝑃1𝑡+1)𝐵1)−1(𝛽𝐵1′𝒟1(𝑃1𝑡+1)Λ1𝑡 + Γ1𝑡) + 𝛽Λ′1𝑡𝒟1(𝑃1𝑡+1)Λ1𝑡     (28.7)

Similarly, the policy that solves player 2’s problem is

𝐹2𝑡 = (𝑄2 + 𝛽𝐵2′ 𝒟2 (𝑃2𝑡+1 )𝐵2 )−1 (𝛽𝐵2′ 𝒟2 (𝑃2𝑡+1 )Λ2𝑡 + Γ2𝑡 ) (28.8)

where 𝑃2𝑡 solves

𝑃2𝑡 = Π2𝑡 − (𝛽𝐵2′𝒟2(𝑃2𝑡+1)Λ2𝑡 + Γ2𝑡)′(𝑄2 + 𝛽𝐵2′𝒟2(𝑃2𝑡+1)𝐵2)−1(𝛽𝐵2′𝒟2(𝑃2𝑡+1)Λ2𝑡 + Γ2𝑡) + 𝛽Λ′2𝑡𝒟2(𝑃2𝑡+1)Λ2𝑡     (28.9)

Here in all cases 𝑡 = 𝑡0 , … , 𝑡1 − 1 and the terminal conditions are 𝑃𝑖𝑡1 = 0.


The solution procedure is to use equations (28.6), (28.7), (28.8), and (28.9), and “work backwards” from time 𝑡1 − 1.
Since we’re working backwards, 𝑃1𝑡+1 and 𝑃2𝑡+1 are taken as given at each stage.
Moreover, since
• some terms on the right-hand side of (28.6) contain 𝐹2𝑡
• some terms on the right-hand side of (28.8) contain 𝐹1𝑡
we need to solve these 𝑘1 + 𝑘2 equations simultaneously.

28.2.3 Key Insight

As in Markov perfect equilibrium, a key insight here is that equations (28.6) and (28.8) are linear in 𝐹1𝑡 and 𝐹2𝑡 .
After these equations have been solved, we can take 𝐹𝑖𝑡 and solve for 𝑃𝑖𝑡 in (28.7) and (28.9).
Notice how 𝑗’s control law 𝐹𝑗𝑡 is a function of {𝐹𝑖𝑠 , 𝑠 ≥ 𝑡, 𝑖 ≠ 𝑗}.
Thus, agent 𝑖’s choice of {𝐹𝑖𝑡 ; 𝑡 = 𝑡0 , … , 𝑡1 − 1} influences agent 𝑗’s choice of control laws.
However, in the Markov perfect equilibrium of this game, each agent is assumed to ignore the influence that his choice
exerts on the other agent’s choice.
After these equations have been solved, we can also deduce associated sequences of worst-case shocks.


28.2.4 Worst-case Shocks

For agent 𝑖 the maximizing or worst-case shock 𝑣𝑖𝑡 is

𝑣𝑖𝑡 = 𝐾𝑖𝑡 𝑥𝑡

where

𝐾𝑖𝑡 = 𝜃𝑖−1(𝐼 − 𝜃𝑖−1𝐶′𝑃𝑖,𝑡+1𝐶)−1𝐶′𝑃𝑖,𝑡+1(𝐴 − 𝐵1𝐹1𝑡 − 𝐵2𝐹2𝑡)

28.2.5 Infinite Horizon

We often want to compute the solutions of such games for infinite horizons, in the hope that the decision rules 𝐹𝑖𝑡 settle
down to be time-invariant as 𝑡1 → +∞.
In practice, we usually fix 𝑡1 and compute the equilibrium of an infinite horizon game by driving 𝑡0 → −∞.
This is the approach we adopt in the next section.

28.2.6 Implementation

We use the function nnash_robust to compute a Markov perfect equilibrium of the infinite horizon linear quadratic dynamic game with robust planners in the manner described above.

28.3 Application

28.3.1 A Duopoly Model

Without concerns for robustness, the model is identical to the duopoly model from the Markov perfect equilibrium lecture.
To begin, we briefly review the structure of that model.
Two firms are the only producers of a good the demand for which is governed by a linear inverse demand function

𝑝 = 𝑎0 − 𝑎1 (𝑞1 + 𝑞2 ) (28.10)

Here 𝑝 = 𝑝𝑡 is the price of the good, 𝑞𝑖 = 𝑞𝑖𝑡 is the output of firm 𝑖 = 1, 2 at time 𝑡 and 𝑎0 > 0, 𝑎1 > 0.
In (28.10) and what follows,
• the time subscript is suppressed when possible to simplify notation
• 𝑥̂ denotes a next period value of variable 𝑥
Each firm recognizes that its output affects total output and therefore the market price.
The one-period payoff function of firm 𝑖 is price times quantity minus adjustment costs:

𝜋𝑖 = 𝑝𝑞𝑖 − 𝛾(𝑞𝑖̂ − 𝑞𝑖 )2 , 𝛾 > 0, (28.11)

Substituting the inverse demand curve (28.10) into (28.11) lets us express the one-period payoff as

𝜋𝑖 (𝑞𝑖 , 𝑞−𝑖 , 𝑞𝑖̂ ) = 𝑎0 𝑞𝑖 − 𝑎1 𝑞𝑖2 − 𝑎1 𝑞𝑖 𝑞−𝑖 − 𝛾(𝑞𝑖̂ − 𝑞𝑖 )2 , (28.12)

where 𝑞−𝑖 denotes the output of the firm other than 𝑖.



The objective of the firm is to maximize ∑_{𝑡=0}^{∞} 𝛽^𝑡 𝜋𝑖𝑡.
Firm 𝑖 chooses a decision rule that sets next period quantity 𝑞𝑖̂ as a function 𝑓𝑖 of the current state (𝑞𝑖 , 𝑞−𝑖 ).
This completes our review of the duopoly model without concerns for robustness.
Now we activate robustness concerns of both firms.
To map a robust version of the duopoly model into coupled robust linear-quadratic dynamic programming problems, we
again define the state and controls as

𝑥𝑡 ∶= [1, 𝑞1𝑡, 𝑞2𝑡]′     and     𝑢𝑖𝑡 ∶= 𝑞𝑖,𝑡+1 − 𝑞𝑖𝑡,   𝑖 = 1, 2
If we write

𝑥′𝑡 𝑅𝑖 𝑥𝑡 + 𝑢′𝑖𝑡 𝑄𝑖 𝑢𝑖𝑡

where 𝑄1 = 𝑄2 = 𝛾,

        ⎡   0     −𝑎0/2    0   ⎤                    ⎡   0       0     −𝑎0/2 ⎤
𝑅1 ∶=   ⎢ −𝑎0/2    𝑎1     𝑎1/2 ⎥    and    𝑅2 ∶=    ⎢   0       0      𝑎1/2 ⎥
        ⎣   0      𝑎1/2    0   ⎦                    ⎣ −𝑎0/2    𝑎1/2    𝑎1   ⎦

then we recover the one-period payoffs (28.11) for the two firms in the duopoly model.
The law of motion for the state 𝑥𝑡 is 𝑥𝑡+1 = 𝐴𝑥𝑡 + 𝐵1 𝑢1𝑡 + 𝐵2 𝑢2𝑡 where

        ⎡ 1  0  0 ⎤            ⎡ 0 ⎤            ⎡ 0 ⎤
𝐴 ∶=    ⎢ 0  1  0 ⎥,    𝐵1 ∶=  ⎢ 1 ⎥,    𝐵2 ∶=  ⎢ 0 ⎥
        ⎣ 0  0  1 ⎦            ⎣ 0 ⎦            ⎣ 1 ⎦
A robust decision rule of firm 𝑖 will take the form 𝑢𝑖𝑡 = −𝐹𝑖 𝑥𝑡 , inducing the following closed-loop system for the
evolution of 𝑥 in the Markov perfect equilibrium:

𝑥𝑡+1 = (𝐴 − 𝐵1𝐹1 − 𝐵2𝐹2)𝑥𝑡     (28.13)

28.3.2 Parameters and Solution

Consider the duopoly model with parameter values of:


• 𝑎0 = 10
• 𝑎1 = 2
• 𝛽 = 0.96
• 𝛾 = 12
From these, we computed the infinite horizon MPE without robustness using the code

import numpy as np
import quantecon as qe

# Parameters
a0 = 10.0
a1 = 2.0
β = 0.96


γ = 12.0

# In LQ form
A = np.eye(3)
B1 = np.array([[0.], [1.], [0.]])
B2 = np.array([[0.], [0.], [1.]])

R1 = [[ 0., -a0 / 2, 0.],


[-a0 / 2., a1, a1 / 2.],
[ 0, a1 / 2., 0.]]

R2 = [[ 0., 0., -a0 / 2],


[ 0., 0., a1 / 2.],
[-a0 / 2, a1 / 2., a1]]

Q1 = Q2 = γ
S1 = S2 = W1 = W2 = M1 = M2 = 0.0

# Solve using QE's nnash function


F1, F2, P1, P2 = qe.nnash(A, B1, B2, R1, R2, Q1,
Q2, S1, S2, W1, W2, M1,
M2, beta=β)

# Display policies
print("Computed policies for firm 1 and firm 2:\n")
print(f"F1 = {F1}")
print(f"F2 = {F2}")
print("\n")

Computed policies for firm 1 and firm 2:

F1 = [[-0.66846615 0.29512482 0.07584666]]


F2 = [[-0.66846615 0.07584666 0.29512482]]

Markov Perfect Equilibrium with Robustness

We add robustness concerns to the Markov Perfect Equilibrium model by extending the function qe.nnash (link) into
a robustness version by adding the maximization operator 𝒟(𝑃 ) into the backward induction.
The MPE with robustness function is nnash_robust.
The function’s code is as follows

def nnash_robust(A, C, B1, B2, R1, R2, Q1, Q2, S1, S2, W1, W2, M1, M2,
θ1, θ2, beta=1.0, tol=1e-8, max_iter=1000):

r"""
Compute the limit of a Nash linear quadratic dynamic game with
robustness concern.

In this problem, player i minimizes


.. math::
\sum_{t=0}^{\infty}


\left\{
x_t' r_i x_t + 2 x_t' w_i
u_{it} +u_{it}' q_i u_{it} + u_{jt}' s_i u_{jt} + 2 u_{jt}'
m_i u_{it}
\right\}
subject to the law of motion
.. math::
x_{it+1} = A x_t + b_1 u_{1t} + b_2 u_{2t} + C w_{it+1}
and a perceived control law :math:`u_j(t) = - f_j x_t` for the other
player.

The player i also concerns about the model misspecification,


and maximizes
.. math::
\sum_{t=0}^{\infty}
\left\{
\beta^{t+1} \theta_{i} w_{it+1}'w_{it+1}
\right\}

The solution computed in this routine is the :math:`f_i` and


:math:`P_i` of the associated double optimal linear regulator
problem.

Parameters
----------
A : scalar(float) or array_like(float)
Corresponds to the MPE equations, should be of size (n, n)
C : scalar(float) or array_like(float)
As above, size (n, c), c is the size of w
B1 : scalar(float) or array_like(float)
As above, size (n, k_1)
B2 : scalar(float) or array_like(float)
As above, size (n, k_2)
R1 : scalar(float) or array_like(float)
As above, size (n, n)
R2 : scalar(float) or array_like(float)
As above, size (n, n)
Q1 : scalar(float) or array_like(float)
As above, size (k_1, k_1)
Q2 : scalar(float) or array_like(float)
As above, size (k_2, k_2)
S1 : scalar(float) or array_like(float)
As above, size (k_1, k_1)
S2 : scalar(float) or array_like(float)
As above, size (k_2, k_2)
W1 : scalar(float) or array_like(float)
As above, size (n, k_1)
W2 : scalar(float) or array_like(float)
As above, size (n, k_2)
M1 : scalar(float) or array_like(float)
As above, size (k_2, k_1)
M2 : scalar(float) or array_like(float)
As above, size (k_1, k_2)
θ1 : scalar(float)
Robustness parameter of player 1
θ2 : scalar(float)



Robustness parameter of player 2
beta : scalar(float), optional(default=1.0)
Discount factor
tol : scalar(float), optional(default=1e-8)
This is the tolerance level for convergence
max_iter : scalar(int), optional(default=1000)
This is the maximum number of iterations allowed

Returns
-------
F1 : array_like, dtype=float, shape=(k_1, n)
Feedback law for agent 1
F2 : array_like, dtype=float, shape=(k_2, n)
Feedback law for agent 2
P1 : array_like, dtype=float, shape=(n, n)
The steady-state solution to the associated discrete matrix
Riccati equation for agent 1
P2 : array_like, dtype=float, shape=(n, n)
The steady-state solution to the associated discrete matrix
Riccati equation for agent 2
"""

# Unload parameters and make sure everything is a matrix


params = A, C, B1, B2, R1, R2, Q1, Q2, S1, S2, W1, W2, M1, M2
params = map(np.asmatrix, params)
A, C, B1, B2, R1, R2, Q1, Q2, S1, S2, W1, W2, M1, M2 = params

# Multiply A, B1, B2 by sqrt(β) to enforce discounting


A, B1, B2 = [np.sqrt(β) * x for x in (A, B1, B2)]

# Initial values
n = A.shape[0]
k_1 = B1.shape[1]
k_2 = B2.shape[1]

v1 = np.eye(k_1)
v2 = np.eye(k_2)
P1 = np.eye(n) * 1e-5
P2 = np.eye(n) * 1e-5
F1 = np.random.randn(k_1, n)
F2 = np.random.randn(k_2, n)

for it in range(max_iter):
# Update
F10 = F1
F20 = F2

I = np.eye(C.shape[1])

# D1(P1)
# Note: INV1 may not be solved if the matrix is singular
INV1 = solve(θ1 * I - C.T @ P1 @ C, I)
D1P1 = P1 + P1 @ C @ INV1 @ C.T @ P1



# D2(P2)
# Note: INV2 may not be solved if the matrix is singular
INV2 = solve(θ2 * I - C.T @ P2 @ C, I)
D2P2 = P2 + P2 @ C @ INV2 @ C.T @ P2

G2 = solve(Q2 + B2.T @ D2P2 @ B2, v2)


G1 = solve(Q1 + B1.T @ D1P1 @ B1, v1)
H2 = G2 @ B2.T @ D2P2
H1 = G1 @ B1.T @ D1P1

# Break up the computation of F1, F2


F1_left = v1 - (H1 @ B2 + G1 @ M1.T) @ (H2 @ B1 + G2 @ M2.T)
F1_right = H1 @ A + G1 @ W1.T - \
(H1 @ B2 + G1 @ M1.T) @ (H2 @ A + G2 @ W2.T)
F1 = solve(F1_left, F1_right)
F2 = H2 @ A + G2 @ W2.T - (H2 @ B1 + G2 @ M2.T) @ F1

Λ1 = A - B2 @ F2
Λ2 = A - B1 @ F1
Π1 = R1 + F2.T @ S1 @ F2
Π2 = R2 + F1.T @ S2 @ F1
Γ1 = W1.T - M1.T @ F2
Γ2 = W2.T - M2.T @ F1

# Compute P1 and P2
P1 = Π1 - (B1.T @ D1P1 @ Λ1 + Γ1).T @ F1 + \
Λ1.T @ D1P1 @ Λ1
P2 = Π2 - (B2.T @ D2P2 @ Λ2 + Γ2).T @ F2 + \
Λ2.T @ D2P2 @ Λ2

dd = np.max(np.abs(F10 - F1)) + np.max(np.abs(F20 - F2))

if dd < tol: # success!


break

else:
raise ValueError(f'No convergence: Iteration limit of {max_iter} \
reached in nnash')

return F1, F2, P1, P2

28.3.3 Some Details

Firm 𝑖 wants to minimize


∑_{𝑡=𝑡0}^{𝑡1−1} 𝛽^{𝑡−𝑡0} {𝑥′𝑡𝑅𝑖𝑥𝑡 + 𝑢′𝑖𝑡𝑄𝑖𝑢𝑖𝑡 + 𝑢′−𝑖𝑡𝑆𝑖𝑢−𝑖𝑡 + 2𝑥′𝑡𝑊𝑖𝑢𝑖𝑡 + 2𝑢′−𝑖𝑡𝑀𝑖𝑢𝑖𝑡}

where
𝑥𝑡 ∶= [1, 𝑞1𝑡, 𝑞2𝑡]′     and     𝑢𝑖𝑡 ∶= 𝑞𝑖,𝑡+1 − 𝑞𝑖𝑡,   𝑖 = 1, 2


and
        ⎡   0     −𝑎0/2    0   ⎤                ⎡   0       0     −𝑎0/2 ⎤
𝑅1 ∶=   ⎢ −𝑎0/2    𝑎1     𝑎1/2 ⎥,       𝑅2 ∶=   ⎢   0       0      𝑎1/2 ⎥,
        ⎣   0      𝑎1/2    0   ⎦                ⎣ −𝑎0/2    𝑎1/2    𝑎1   ⎦

𝑄1 = 𝑄2 = 𝛾,   𝑆1 = 𝑆2 = 0,   𝑊1 = 𝑊2 = 0,   𝑀1 = 𝑀2 = 0

The parameters of the duopoly model are:


• 𝑎0 = 10
• 𝑎1 = 2
• 𝛽 = 0.96
• 𝛾 = 12

# Parameters
a0 = 10.0
a1 = 2.0
β = 0.96
γ = 12.0

# In LQ form
A = np.eye(3)
B1 = np.array([[0.], [1.], [0.]])
B2 = np.array([[0.], [0.], [1.]])

R1 = [[ 0., -a0 / 2, 0.],


[-a0 / 2., a1, a1 / 2.],
[ 0, a1 / 2., 0.]]

R2 = [[ 0., 0., -a0 / 2],


[ 0., 0., a1 / 2.],
[-a0 / 2, a1 / 2., a1]]

Q1 = Q2 = γ
S1 = S2 = W1 = W2 = M1 = M2 = 0.0

Consistency Check

We first conduct a comparison test to check if nnash_robust agrees with qe.nnash in the non-robustness case in
which each 𝜃𝑖 ≈ +∞

# Solve using QE's nnash function


F1, F2, P1, P2 = qe.nnash(A, B1, B2, R1, R2, Q1,
Q2, S1, S2, W1, W2, M1,
M2, beta=β)

# Solve using nnash_robust


F1r, F2r, P1r, P2r = nnash_robust(A, np.zeros((3, 1)), B1, B2, R1, R2, Q1,
Q2, S1, S2, W1, W2, M1, M2, 1e-10,
1e-10, beta=β)

print('F1 and F1r should be the same: ', np.allclose(F1, F1r))


print('F2 and F2r should be the same: ', np.allclose(F2, F2r))


print('P1 and P1r should be the same: ', np.allclose(P1, P1r))
print('P2 and P2r should be the same: ', np.allclose(P2, P2r))

F1 and F1r should be the same: True


F2 and F2r should be the same: True
P1 and P1r should be the same: True
P2 and P2r should be the same: True

We can see that the results are consistent across the two functions.

Comparative Dynamics under Baseline Transition Dynamics

We want to compare the dynamics of price and output under the baseline MPE model with those under the baseline
model under the robust decision rules within the robust MPE.
This means that we simulate the state dynamics under the MPE equilibrium closed-loop transition matrix

𝐴𝑜 = 𝐴 − 𝐵 1 𝐹1 − 𝐵 2 𝐹2

where 𝐹1 and 𝐹2 are the firms’ robust decision rules within the robust Markov perfect equilibrium
• by simulating under the baseline model transition dynamics and the robust MPE rules we are in effect assuming that at
the end of the day firms’ concerns about misspecification of the baseline model do not materialize.
• a short way of saying this is that misspecification fears are all ‘just in the minds’ of the firms.
• simulating under the baseline model is a common practice in the literature.
• note that some assumption about the model that actually governs the data has to be made in order to create a
simulation.
• later we will describe the (erroneous) beliefs of the two firms that justify their robust decisions as best responses
to transition laws that are distorted relative to the baseline model.
After simulating 𝑥𝑡 under the baseline transition dynamics and robust decision rules 𝐹𝑖 , 𝑖 = 1, 2, we extract and plot
industry output 𝑞𝑡 = 𝑞1𝑡 + 𝑞2𝑡 and price 𝑝𝑡 = 𝑎0 − 𝑎1 𝑞𝑡 .
Here we set the robustness and volatility matrix parameters as follows:
• 𝜃1 = 0.02
• 𝜃2 = 0.04
• 𝐶 = [0, 0.01, 0.01]′
Because we have set 𝜃1 < 𝜃2 < +∞ we know that
• both firms fear that the baseline specification of the state transition dynamics are incorrect.
• firm 1 fears misspecification more than firm 2.

# Robustness parameters and matrix


C = np.asmatrix([[0], [0.01], [0.01]])
θ1 = 0.02
θ2 = 0.04
n = 20

# Solve using nnash_robust


F1r, F2r, P1r, P2r = nnash_robust(A, C, B1, B2, R1, R2, Q1,
Q2, S1, S2, W1, W2, M1, M2,
θ1, θ2, beta=β)

# MPE output and price


AF = A - B1 @ F1 - B2 @ F2
x = np.empty((3, n))
x[:, 0] = 1, 1, 1
for t in range(n - 1):
x[:, t + 1] = AF @ x[:, t]
q1 = x[1, :]
q2 = x[2, :]
q = q1 + q2 # Total output, MPE
p = a0 - a1 * q # Price, MPE

# RMPE output and price


AO = A - B1 @ F1r - B2 @ F2r
xr = np.empty((3, n))
xr[:, 0] = 1, 1, 1
for t in range(n - 1):
xr[:, t+1] = AO @ xr[:, t]
qr1 = xr[1, :]
qr2 = xr[2, :]
qr = qr1 + qr2 # Total output, RMPE
pr = a0 - a1 * qr # Price, RMPE

# RMPE heterogeneous beliefs output and price


I = np.eye(C.shape[1])
# Worst-case distortions v_{it} = K_i x_t, using the formula in Section 28.2.4
# with the robust MPE value matrices P1r, P2r and the robust closed loop AO
K1 = solve(θ1 * I - C.T @ P1r @ C, C.T @ P1r @ AO)
AOCK1 = AO + C @ K1

K2 = solve(θ2 * I - C.T @ P2r @ C, C.T @ P2r @ AO)
AOCK2 = AO + C @ K2
xrp1 = np.empty((3, n))
xrp2 = np.empty((3, n))
xrp1[:, 0] = 1, 1, 1
xrp2[:, 0] = 1, 1, 1
for t in range(n - 1):
xrp1[:, t + 1] = AOCK1 @ xrp1[:, t]
xrp2[:, t + 1] = AOCK2 @ xrp2[:, t]
qrp11 = xrp1[1, :]
qrp12 = xrp1[2, :]
qrp21 = xrp2[1, :]
qrp22 = xrp2[2, :]
qrp1 = qrp11 + qrp12 # Total output, RMPE from player 1's belief
qrp2 = qrp21 + qrp22 # Total output, RMPE from player 2's belief
prp1 = a0 - a1 * qrp1 # Price, RMPE from player 1's belief
prp2 = a0 - a1 * qrp2 # Price, RMPE from player 2's belief

The following code prepares graphs that compare market-wide output 𝑞1𝑡 + 𝑞2𝑡 and the price of the good 𝑝𝑡 under
equilibrium decision rules 𝐹𝑖 , 𝑖 = 1, 2 from an ordinary Markov perfect equilibrium and the decision rules under a
Markov perfect equilibrium with robust firms with multiplier parameters 𝜃𝑖 , 𝑖 = 1, 2 set as described above.
Both industry output and price are under the transition dynamics associated with the baseline model; only the decision
rules 𝐹𝑖 differ across the two equilibrium objects presented.

fig, axes = plt.subplots(2, 1, figsize=(9, 9))

ax = axes[0]
ax.plot(q, 'g-', lw=2, alpha=0.75, label='MPE output')
ax.plot(qr, 'm-', lw=2, alpha=0.75, label='RMPE output')
ax.set(ylabel="output", xlabel="time", ylim=(2, 4))
ax.legend(loc='upper left', frameon=0)

ax = axes[1]
ax.plot(p, 'g-', lw=2, alpha=0.75, label='MPE price')
ax.plot(pr, 'm-', lw=2, alpha=0.75, label='RMPE price')
ax.set(ylabel="price", xlabel="time")
ax.legend(loc='upper right', frameon=0)
plt.show()

Under the dynamics associated with the baseline model, the price path is higher with the Markov perfect equilibrium
robust decision rules than it is with decision rules for the ordinary Markov perfect equilibrium.
So is the industry output path.
To dig a little beneath the forces driving these outcomes, we want to plot 𝑞1𝑡 and 𝑞2𝑡 in the Markov perfect equilibrium
with robust firms and to compare them with corresponding objects in the Markov perfect equilibrium without robust firms

fig, axes = plt.subplots(2, 1, figsize=(9, 9))

ax = axes[0]
ax.plot(q1, 'g-', lw=2, alpha=0.75, label='firm 1 MPE output')
ax.plot(qr1, 'b-', lw=2, alpha=0.75, label='firm 1 RMPE output')
ax.set(ylabel="output", xlabel="time", ylim=(1, 2))
ax.legend(loc='upper left', frameon=0)

ax = axes[1]
ax.plot(q2, 'g-', lw=2, alpha=0.75, label='firm 2 MPE output')
ax.plot(qr2, 'r-', lw=2, alpha=0.75, label='firm 2 RMPE output')
ax.set(ylabel="output", xlabel="time", ylim=(1, 2))
ax.legend(loc='upper left', frameon=0)
plt.show()

Evidently, firm 1’s output path is substantially lower when firms are robust, while firm 2’s output path is virtually the
same as it would be in an ordinary Markov perfect equilibrium with no robust firms.
Recall that we have set 𝜃1 = .02 and 𝜃2 = .04, so that firm 1 fears misspecification of the baseline model substantially
more than does firm 2
• but also please notice that firm 2’s behavior in the Markov perfect equilibrium with robust firms responds to the
decision rule 𝐹1 𝑥𝑡 employed by firm 1.
• thus it is something of a coincidence that its output is almost the same in the two equilibria.

Larger concerns about misspecification induce firm 1 to be more cautious than firm 2 in predicting market price and the
output of the other firm.
To explore this, we study next how ex-post the two firms’ beliefs about state dynamics differ in the Markov perfect
equilibrium with robust firms.
(by ex-post we mean after extremization of each firm’s intertemporal objective)

Heterogeneous Beliefs

As before, let 𝐴𝑜 = 𝐴 − 𝐵1 𝐹1𝑟 − 𝐵2 𝐹2𝑟 , where in a robust MPE, 𝐹𝑖𝑟 is a robust decision rule for firm 𝑖.
Worst-case forecasts of 𝑥𝑡 starting from 𝑡 = 0 differ between the two firms.
This means that worst-case forecasts of industry output 𝑞1𝑡 + 𝑞2𝑡 and price 𝑝𝑡 also differ between the two firms.
To find these worst-case beliefs, we compute the following three “closed-loop” transition matrices
• 𝐴𝑜
• 𝐴𝑜 + 𝐶𝐾1
• 𝐴𝑜 + 𝐶𝐾2
We call the first transition law, namely, 𝐴𝑜 , the baseline transition under firms’ robust decision rules.
We call the second and third the worst-case transitions under the robust decision rules for firms 1 and 2, respectively.
From {𝑥𝑡 } paths generated by each of these transition laws, we pull off the associated price and total output sequences.
The following code plots them

print('Baseline Robust transition matrix AO is: \n', np.round(AO, 3))


print('Player 1\'s worst-case transition matrix AOCK1 is: \n', \
np.round(AOCK1, 3))
print('Player 2\'s worst-case transition matrix AOCK2 is: \n', \
np.round(AOCK2, 3))

Baseline Robust transition matrix AO is:


[[ 1. 0. 0. ]
[ 0.666 0.682 -0.074]
[ 0.671 -0.071 0.694]]
Player 1's worst-case transition matrix AOCK1 is:
[[ 0.998 0.002 0. ]
[ 0.664 0.685 -0.074]
[ 0.669 -0.069 0.694]]
Player 2's worst-case transition matrix AOCK2 is:
[[ 0.999 0. 0.001]
[ 0.665 0.683 -0.073]
[ 0.67 -0.071 0.695]]

# == Plot == #
fig, axes = plt.subplots(2, 1, figsize=(9, 9))

ax = axes[0]
ax.plot(qrp1, 'b--', lw=2, alpha=0.75,
label='RMPE worst-case belief output player 1')
ax.plot(qrp2, 'r:', lw=2, alpha=0.75,
label='RMPE worst-case belief output player 2')
ax.plot(qr, 'm-', lw=2, alpha=0.75, label='RMPE output')
ax.set(ylabel="output", xlabel="time", ylim=(2, 4))
ax.legend(loc='upper left', frameon=0)

ax = axes[1]
ax.plot(prp1, 'b--', lw=2, alpha=0.75,
label='RMPE worst-case belief price player 1')
ax.plot(prp2, 'r:', lw=2, alpha=0.75,
label='RMPE worst-case belief price player 2')
ax.plot(pr, 'm-', lw=2, alpha=0.75, label='RMPE price')
ax.set(ylabel="price", xlabel="time")
ax.legend(loc='upper right', frameon=0)
plt.show()

We see from the above graph that under robustness concerns, player 1 and player 2 have heterogeneous beliefs about total
output and the goods price even though they share the same baseline model and information

• firm 1 thinks that total output will be higher and price lower than does firm 2
• this leads firm 1 to produce less than firm 2
These beliefs justify (or rationalize) the Markov perfect equilibrium robust decision rules.
This means that the robust rules are the unique optimal rules (or best responses) to the indicated worst-case transition
dynamics.
([Hansen and Sargent, 2008] discuss how this property of robust decision rules is connected to the concept of admissibility
in Bayesian statistical decision theory)


Part VI

Time Series Models

CHAPTER

TWENTYNINE

COVARIANCE STATIONARY PROCESSES

In addition to what’s in Anaconda, this lecture will need the following libraries:

!pip install --upgrade quantecon

29.1 Overview

In this lecture we study covariance stationary linear stochastic processes, a class of models routinely used to study
economic and financial time series.
This class has the advantage of being
1. simple enough to be described by an elegant and comprehensive theory
2. relatively broad in terms of the kinds of dynamics it can represent
We consider these models in both the time and frequency domain.

29.1.1 ARMA Processes

We will focus much of our attention on linear covariance stationary models with a finite number of parameters.
In particular, we will study stationary ARMA processes, which form a cornerstone of the standard theory of time series
analysis.
Every ARMA process can be represented in linear state space form.
However, ARMA processes have some important structure that makes it valuable to study them separately.

29.1.2 Spectral Analysis

Analysis in the frequency domain is also called spectral analysis.


In essence, spectral analysis provides an alternative representation of the autocovariance function of a covariance
stationary process.
Having a second representation of this important object
• shines a light on the dynamics of the process in question
• allows for a simpler, more tractable representation in some important cases
The famous Fourier transform and its inverse are used to map between the two representations.


29.1.3 Other Reading

For supplementary reading, see


• [Ljungqvist and Sargent, 2018], chapter 2
• [Sargent, 1987], chapter 11
• John Cochrane’s notes on time series analysis, chapter 8
• [Shiriaev, 1995], chapter 6
• [Cryer and Chan, 2008], all
Let’s start with some imports:

import numpy as np
import matplotlib.pyplot as plt
import quantecon as qe

29.2 Introduction

Consider a sequence of random variables {𝑋𝑡 } indexed by 𝑡 ∈ ℤ and taking values in ℝ.


Thus, {𝑋𝑡 } begins in the infinite past and extends to the infinite future — a convenient and standard assumption.
As in other fields, successful economic modeling typically assumes the existence of features that are constant over time.
If these assumptions are correct, then each new observation 𝑋𝑡 , 𝑋𝑡+1 , … can provide additional information about the
time-invariant features, allowing us to learn as more data arrive.
For this reason, we will focus in what follows on processes that are stationary — or become so after a transformation (see
for example this lecture).

29.2.1 Definitions

A real-valued stochastic process {𝑋𝑡 } is called covariance stationary if


1. Its mean 𝜇 ∶= 𝔼𝑋𝑡 does not depend on 𝑡.
2. For all 𝑘 in ℤ, the 𝑘-th autocovariance 𝛾(𝑘) ∶= 𝔼(𝑋𝑡 − 𝜇)(𝑋𝑡+𝑘 − 𝜇) is finite and depends only on 𝑘.
The function 𝛾 ∶ ℤ → ℝ is called the autocovariance function of the process.
Throughout this lecture, we will work exclusively with zero-mean (i.e., 𝜇 = 0) covariance stationary processes.
The zero-mean assumption costs nothing in terms of generality since working with non-zero-mean processes involves no
more than adding a constant.

29.2.2 Example 1: White Noise

Perhaps the simplest class of covariance stationary processes is the class of white noise processes.
A process {𝜖𝑡 } is called a white noise process if
1. 𝔼𝜖𝑡 = 0
2. 𝛾(𝑘) = 𝜎2 1{𝑘 = 0} for some 𝜎 > 0
(Here 1{𝑘 = 0} is defined to be 1 if 𝑘 = 0 and zero otherwise)
White noise processes play the role of building blocks for processes with more complicated dynamics.
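Here is a quick simulation-based check of these two properties (the sample size and 𝜎 below are arbitrary illustrative choices): the sample autocovariance should be close to 𝜎2 at lag zero and close to zero at other lags.

import numpy as np

σ = 2.0
n = 100_000
np.random.seed(42)
ϵ = σ * np.random.randn(n)                       # simulated Gaussian white noise

for k in range(4):
    γ_hat = np.mean(ϵ[k:] * ϵ[:n-k])             # sample autocovariance at lag k
    print(f"k = {k}: sample γ(k) = {γ_hat:.3f}, theory = {σ**2 if k == 0 else 0}")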

29.2.3 Example 2: General Linear Processes

From the simple building block provided by white noise, we can construct a very flexible family of covariance stationary
processes — the general linear processes

𝑋𝑡 = ∑_{𝑗=0}^∞ 𝜓𝑗 𝜖𝑡−𝑗 , 𝑡 ∈ ℤ (29.1)

where
• {𝜖𝑡 } is white noise

• {𝜓𝑡 } is a square summable sequence in ℝ (that is, ∑_{𝑡=0}^∞ 𝜓𝑡² < ∞)
The sequence {𝜓𝑡 } is often called a linear filter.
Equation (29.1) is said to present a moving average process or a moving average representation.
With some manipulations, it is possible to confirm that the autocovariance function for (29.1) is

𝛾(𝑘) = 𝜎² ∑_{𝑗=0}^∞ 𝜓𝑗 𝜓𝑗+𝑘 (29.2)

By the Cauchy–Schwarz inequality, the sum on the right-hand side of (29.2) converges, so 𝛾(𝑘) is finite for every 𝑘.
Evidently, 𝛾(𝑘) does not depend on 𝑡.

29.2.4 Wold Representation

Remarkably, the class of general linear processes goes a long way towards describing the entire class of zero-mean
covariance stationary processes.
In particular, Wold’s decomposition theorem states that every zero-mean covariance stationary process {𝑋𝑡 } can be
written as

𝑋𝑡 = ∑_{𝑗=0}^∞ 𝜓𝑗 𝜖𝑡−𝑗 + 𝜂𝑡

where
• {𝜖𝑡 } is white noise
• {𝜓𝑡 } is square summable
• 𝜓0 𝜖𝑡 is the one-step ahead prediction error in forecasting 𝑋𝑡 as a linear least-squares function of the infinite history
𝑋𝑡−1 , 𝑋𝑡−2 , …


• 𝜂𝑡 can be expressed as a linear function of 𝑋𝑡−1 , 𝑋𝑡−2 , … and is perfectly predictable over arbitrarily long horizons
For the method of constructing a Wold representation, intuition, and further discussion, see [Sargent, 1987], p. 286.

29.2.5 AR and MA

General linear processes are a very broad class of processes.


It often pays to specialize to those for which there exists a representation having only finitely many parameters.
(Experience and theory combine to indicate that models with a relatively small number of parameters typically perform
better than larger models, especially for forecasting)
One very simple example of such a model is the first-order autoregressive or AR(1) process

𝑋𝑡 = 𝜙𝑋𝑡−1 + 𝜖𝑡 where |𝜙| < 1 and {𝜖𝑡 } is white noise (29.3)



By direct substitution, it is easy to verify that 𝑋𝑡 = ∑_{𝑗=0}^∞ 𝜙^𝑗 𝜖𝑡−𝑗 .
Hence {𝑋𝑡 } is a general linear process.
Applying (29.2) to the previous expression for 𝑋𝑡 , we get the AR(1) autocovariance function

𝛾(𝑘) = 𝜙^𝑘 𝜎²/(1 − 𝜙²), 𝑘 = 0, 1, … (29.4)
The next figure plots an example of this function for 𝜙 = 0.8 and 𝜙 = −0.8 with 𝜎 = 1.

num_rows, num_cols = 2, 1
fig, axes = plt.subplots(num_rows, num_cols, figsize=(10, 8))
plt.subplots_adjust(hspace=0.4)

for i, ϕ in enumerate((0.8, -0.8)):


ax = axes[i]
times = list(range(16))
acov = [ϕ**k / (1 - ϕ**2) for k in times]
ax.plot(times, acov, 'bo-', alpha=0.6,
label=fr'autocovariance, $\phi = {ϕ:.2}$')
ax.legend(loc='upper right')
ax.set(xlabel='time', xlim=(0, 15))
ax.hlines(0, 0, 15, linestyle='--', alpha=0.5)
plt.show()


Another very simple process is the MA(1) process (here MA means “moving average”)

𝑋𝑡 = 𝜖𝑡 + 𝜃𝜖𝑡−1

You will be able to verify that

𝛾(0) = 𝜎2 (1 + 𝜃2 ), 𝛾(1) = 𝜎2 𝜃, and 𝛾(𝑘) = 0 ∀𝑘 > 1
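One way to confirm these values is to evaluate the sum in (29.2) with the MA(1) filter 𝜓 = (1, 𝜃, 0, 0, …); the snippet below (a small check using arbitrary illustrative values for 𝜎 and 𝜃) does exactly that.

import numpy as np

σ, θ = 1.5, 0.4
ψ = np.array([1.0, θ])                       # MA(1) filter: ψ_0 = 1, ψ_1 = θ, ψ_j = 0 for j > 1

def γ(k):
    # γ(k) = σ² Σ_j ψ_j ψ_{j+k}, with the sum truncated at the filter length
    tail = ψ[k:]
    return σ**2 * np.sum(ψ[:len(tail)] * tail)

print(np.isclose(γ(0), σ**2 * (1 + θ**2)))   # True
print(np.isclose(γ(1), σ**2 * θ))            # True
print(γ(2))                                  # 0.0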

The AR(1) can be generalized to an AR(𝑝) and likewise for the MA(1).
Putting all of this together, we get the

29.2.6 ARMA Processes

A stochastic process {𝑋𝑡 } is called an autoregressive moving average process, or ARMA(𝑝, 𝑞), if it can be written as

𝑋𝑡 = 𝜙1 𝑋𝑡−1 + ⋯ + 𝜙𝑝 𝑋𝑡−𝑝 + 𝜖𝑡 + 𝜃1 𝜖𝑡−1 + ⋯ + 𝜃𝑞 𝜖𝑡−𝑞 (29.5)

where {𝜖𝑡 } is white noise.


An alternative notation for ARMA processes uses the lag operator 𝐿.
Def. Given arbitrary variable 𝑌𝑡 , let 𝐿𝑘 𝑌𝑡 ∶= 𝑌𝑡−𝑘 .
It turns out that


• lag operators facilitate succinct representations for linear stochastic processes


• algebraic manipulations that treat the lag operator as an ordinary scalar are legitimate
Using 𝐿, we can rewrite (29.5) as

𝐿0 𝑋𝑡 − 𝜙1 𝐿1 𝑋𝑡 − ⋯ − 𝜙𝑝 𝐿𝑝 𝑋𝑡 = 𝐿0 𝜖𝑡 + 𝜃1 𝐿1 𝜖𝑡 + ⋯ + 𝜃𝑞 𝐿𝑞 𝜖𝑡 (29.6)

If we let 𝜙(𝑧) and 𝜃(𝑧) be the polynomials

𝜙(𝑧) ∶= 1 − 𝜙1 𝑧 − ⋯ − 𝜙𝑝 𝑧 𝑝 and 𝜃(𝑧) ∶= 1 + 𝜃1 𝑧 + ⋯ + 𝜃𝑞 𝑧𝑞 (29.7)

then (29.6) becomes

𝜙(𝐿)𝑋𝑡 = 𝜃(𝐿)𝜖𝑡 (29.8)

In what follows we always assume that the roots of the polynomial 𝜙(𝑧) lie outside the unit circle in the complex plane.
This condition is sufficient to guarantee that the ARMA(𝑝, 𝑞) process is covariance stationary.
In fact, it implies that the process falls within the class of general linear processes described above.
That is, given an ARMA(𝑝, 𝑞) process {𝑋𝑡 } satisfying the unit circle condition, there exists a square summable sequence
{𝜓𝑡 } with 𝑋𝑡 = ∑_{𝑗=0}^∞ 𝜓𝑗 𝜖𝑡−𝑗 for all 𝑡.
The sequence {𝜓𝑡 } can be obtained by a recursive procedure outlined on page 79 of [Cryer and Chan, 2008].
The function 𝑡 ↦ 𝜓𝑡 is often called the impulse response function.
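For concreteness, here is one simple version of such a recursion (a sketch of ours, with arbitrary illustrative coefficients, not necessarily the exact procedure in [Cryer and Chan, 2008]): matching powers of 𝑧 in 𝜙(𝑧)𝜓(𝑧) = 𝜃(𝑧) gives 𝜓0 = 1 and 𝜓𝑗 = 𝜃𝑗 + ∑_{𝑖=1}^{min(𝑗,𝑝)} 𝜙𝑖 𝜓𝑗−𝑖 , where 𝜃𝑗 ∶= 0 for 𝑗 > 𝑞.

import numpy as np

ϕ = np.array([0.5, -0.2])          # AR coefficients φ_1, φ_2 (illustrative)
θ = np.array([0.3])                # MA coefficient θ_1 (illustrative)

# Check that the roots of φ(z) = 1 - φ_1 z - φ_2 z² lie outside the unit circle
roots = np.roots(np.r_[-ϕ[::-1], 1.0])
print("covariance stationary:", np.all(np.abs(roots) > 1))

# Impulse response ψ_0, ψ_1, ... via the recursion above
J = 10
ψ = np.zeros(J)
ψ[0] = 1.0
for j in range(1, J):
    ma_term = θ[j - 1] if j - 1 < len(θ) else 0.0
    ar_term = sum(ϕ[i] * ψ[j - 1 - i] for i in range(min(j, len(ϕ))))
    ψ[j] = ma_term + ar_term
print(np.round(ψ, 4))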

29.3 Spectral Analysis

Autocovariance functions provide a great deal of information about covariance stationary processes.
In fact, for zero-mean Gaussian processes, the autocovariance function characterizes the entire joint distribution.
Even for non-Gaussian processes, it provides a significant amount of information.
It turns out that there is an alternative representation of the autocovariance function of a covariance stationary process,
called the spectral density.
At times, the spectral density is easier to derive, easier to manipulate, and provides additional intuition.

29.3.1 Complex Numbers

Before discussing the spectral density, we invite you to recall the main properties of complex numbers (or skip to the next
section).
It can be helpful to remember that, in a formal sense, complex numbers are just points (𝑥, 𝑦) ∈ ℝ2 endowed with a
specific notion of multiplication.
When (𝑥, 𝑦) is regarded as a complex number, 𝑥 is called the real part and 𝑦 is called the imaginary part.
The modulus or absolute value of a complex number 𝑧 = (𝑥, 𝑦) is just its Euclidean norm in ℝ2 , but is usually written as
|𝑧| instead of ‖𝑧‖.
The product of two complex numbers (𝑥, 𝑦) and (𝑢, 𝑣) is defined to be (𝑥𝑢 − 𝑣𝑦, 𝑥𝑣 + 𝑦𝑢), while addition is standard
pointwise vector addition.
When endowed with these notions of multiplication and addition, the set of complex numbers forms a field — addition
and multiplication play well together, just as they do in ℝ.


The complex number (𝑥, 𝑦) is often written as 𝑥 + 𝑖𝑦, where 𝑖 is called the imaginary unit and is understood to obey
𝑖2 = −1.
The 𝑥 + 𝑖𝑦 notation provides an easy way to remember the definition of multiplication given above, because, proceeding
naively,

(𝑥 + 𝑖𝑦)(𝑢 + 𝑖𝑣) = 𝑥𝑢 − 𝑦𝑣 + 𝑖(𝑥𝑣 + 𝑦𝑢)

Converted back to our first notation, this becomes (𝑥𝑢 − 𝑣𝑦, 𝑥𝑣 + 𝑦𝑢) as promised.
Complex numbers can be represented in the polar form 𝑟𝑒𝑖𝜔 where

𝑟𝑒𝑖𝜔 ∶= 𝑟(cos(𝜔) + 𝑖 sin(𝜔)) = 𝑥 + 𝑖𝑦

where 𝑥 = 𝑟 cos(𝜔), 𝑦 = 𝑟 sin(𝜔), and 𝜔 = arctan(𝑦/𝑥), so that tan(𝜔) = 𝑦/𝑥.
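These operations are easy to experiment with using Python’s built-in complex type and the cmath module:

import cmath

z = 1 + 1j
r, ω = abs(z), cmath.phase(z)                 # modulus and angle
print(r, ω)                                   # √2 ≈ 1.414 and π/4 ≈ 0.785
print(r * cmath.exp(1j * ω))                  # recovers (approximately) 1 + 1j
print((1 + 2j) * (3 - 1j))                    # (xu - yv) + i(xv + yu) = (5+5j)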

29.3.2 Spectral Densities

Let {𝑋𝑡 } be a covariance stationary process with autocovariance function 𝛾 satisfying ∑𝑘 𝛾(𝑘)2 < ∞.
The spectral density 𝑓 of {𝑋𝑡 } is defined as the discrete time Fourier transform of its autocovariance function 𝛾.

𝑓(𝜔) ∶= ∑_{𝑘∈ℤ} 𝛾(𝑘)𝑒^{−𝑖𝜔𝑘} , 𝜔 ∈ ℝ

(Some authors normalize the expression on the right by constants such as 1/𝜋 — the convention chosen makes little
difference provided you are consistent).
Using the fact that 𝛾 is even, in the sense that 𝛾(𝑡) = 𝛾(−𝑡) for all 𝑡, we can show that

𝑓(𝜔) = 𝛾(0) + 2 ∑_{𝑘≥1} 𝛾(𝑘) cos(𝜔𝑘) (29.9)

It is not difficult to confirm that 𝑓 is


• real-valued
• even (𝑓(𝜔) = 𝑓(−𝜔) ), and
• 2𝜋-periodic, in the sense that 𝑓(2𝜋 + 𝜔) = 𝑓(𝜔) for all 𝜔
It follows that the values of 𝑓 on [0, 𝜋] determine the values of 𝑓 on all of ℝ — the proof is an exercise.
For this reason, it is standard to plot the spectral density only on the interval [0, 𝜋].
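As a small numerical illustration of these three properties, the snippet below builds 𝑓 from an arbitrary symmetric, finitely supported autocovariance function (our own example values) and checks them at one value of 𝜔:

import numpy as np

γ = {0: 2.0, 1: 0.7, -1: 0.7, 2: -0.3, -2: -0.3}     # γ(k), zero outside these lags

def f(ω):
    return sum(g * np.exp(-1j * ω * k) for k, g in γ.items())

ω = 0.9
print(np.isclose(f(ω).imag, 0.0))             # real-valued
print(np.isclose(f(ω), f(-ω)))                # even
print(np.isclose(f(ω), f(ω + 2 * np.pi)))     # 2π-periodic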

29.3.3 Example 1: White Noise

Consider a white noise process {𝜖𝑡 } with standard deviation 𝜎.


It is easy to check that in this case 𝑓(𝜔) = 𝜎2 . So 𝑓 is a constant function.
As we will see, this can be interpreted as meaning that “all frequencies are equally present”.
(White light has this property when frequency refers to the visible spectrum, a connection that provides the origins of the
term “white noise”)


29.3.4 Example 2: AR and MA and ARMA

It is an exercise to show that the MA(1) process 𝑋𝑡 = 𝜃𝜖𝑡−1 + 𝜖𝑡 has a spectral density

𝑓(𝜔) = 𝜎2 (1 + 2𝜃 cos(𝜔) + 𝜃2 ) (29.10)

With a bit more effort, it’s possible to show (see, e.g., p. 261 of [Sargent, 1987]) that the spectral density of the AR(1)
process 𝑋𝑡 = 𝜙𝑋𝑡−1 + 𝜖𝑡 is

𝑓(𝜔) = 𝜎²/(1 − 2𝜙 cos(𝜔) + 𝜙²) (29.11)

More generally, it can be shown that the spectral density of the ARMA process (29.5) is

𝑓(𝜔) = ∣𝜃(𝑒^{𝑖𝜔})/𝜙(𝑒^{𝑖𝜔})∣² 𝜎² (29.12)

where
• 𝜎 is the standard deviation of the white noise process {𝜖𝑡 }.
• the polynomials 𝜙(⋅) and 𝜃(⋅) are as defined in (29.7).
The derivation of (29.12) uses the fact that convolutions become products under Fourier transformations.
The proof is elegant and can be found in many places — see, for example, [Sargent, 1987], chapter 11, section 4.
It’s a nice exercise to verify that (29.10) and (29.11) are indeed special cases of (29.12).
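A numerical check of this claim at a single frequency (with arbitrary parameter values) looks like this:

import numpy as np

σ, ω = 1.0, 0.7

# MA(1): θ(z) = 1 + θ_1 z and φ(z) = 1 in (29.12)
θ1 = 0.5
print(np.isclose(σ**2 * (1 + 2 * θ1 * np.cos(ω) + θ1**2),           # (29.10)
                 np.abs(1 + θ1 * np.exp(1j * ω))**2 * σ**2))         # (29.12)

# AR(1): θ(z) = 1 and φ(z) = 1 - φ_1 z in (29.12)
ϕ1 = 0.8
print(np.isclose(σ**2 / (1 - 2 * ϕ1 * np.cos(ω) + ϕ1**2),            # (29.11)
                 np.abs(1 / (1 - ϕ1 * np.exp(1j * ω)))**2 * σ**2))   # (29.12)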

29.3.5 Interpreting the Spectral Density

Plotting (29.11) reveals the shape of the spectral density for the AR(1) model when 𝜙 takes the values 0.8 and -0.8
respectively.

def ar1_sd(ϕ, ω):


return 1 / (1 - 2 * ϕ * np.cos(ω) + ϕ**2)

ωs = np.linspace(0, np.pi, 180)


num_rows, num_cols = 2, 1
fig, axes = plt.subplots(num_rows, num_cols, figsize=(10, 8))
plt.subplots_adjust(hspace=0.4)

# Autocovariance when phi = 0.8


for i, ϕ in enumerate((0.8, -0.8)):
ax = axes[i]
sd = ar1_sd(ϕ, ωs)
ax.plot(ωs, sd, 'b-', alpha=0.6, lw=2,
label=fr'spectral density, $\phi = {ϕ:.2}$')
ax.legend(loc='upper center')
ax.set(xlabel='frequency', xlim=(0, np.pi))
plt.show()


These spectral densities correspond to the autocovariance functions for the AR(1) process shown above.
Informally, we think of the spectral density as being large at those 𝜔 ∈ [0, 𝜋] at which the autocovariance function seems
approximately to exhibit big damped cycles.
To see the idea, let’s consider why, in the lower panel of the preceding figure, the spectral density for the case 𝜙 = −0.8
is large at 𝜔 = 𝜋.
Recall that the spectral density can be expressed as

𝑓(𝜔) = 𝛾(0) + 2 ∑_{𝑘≥1} 𝛾(𝑘) cos(𝜔𝑘) = 𝛾(0) + 2 ∑_{𝑘≥1} (−0.8)^𝑘 cos(𝜔𝑘) (29.13)

When we evaluate this at 𝜔 = 𝜋, we get a large number because cos(𝜋𝑘) is large and positive when (−0.8)𝑘 is positive,
and large in absolute value and negative when (−0.8)𝑘 is negative.
Hence the product is always large and positive, and hence the sum of the products on the right-hand side of (29.13) is
large.
These ideas are illustrated in the next figure, which has 𝑘 on the horizontal axis.

ϕ = -0.8
times = list(range(16))
y1 = [ϕ**k / (1 - ϕ**2) for k in times]
y2 = [np.cos(np.pi * k) for k in times]
y3 = [a * b for a, b in zip(y1, y2)]

num_rows, num_cols = 3, 1
fig, axes = plt.subplots(num_rows, num_cols, figsize=(10, 8))
plt.subplots_adjust(hspace=0.25)

# Autocovariance when ϕ = -0.8


ax = axes[0]
ax.plot(times, y1, 'bo-', alpha=0.6, label=r'$\gamma(k)$')
ax.legend(loc='upper right')
ax.set(xlim=(0, 15), yticks=(-2, 0, 2))
ax.hlines(0, 0, 15, linestyle='--', alpha=0.5)

# Cycles at frequency π
ax = axes[1]
ax.plot(times, y2, 'bo-', alpha=0.6, label=r'$\cos(\pi k)$')
ax.legend(loc='upper right')
ax.set(xlim=(0, 15), yticks=(-1, 0, 1))
ax.hlines(0, 0, 15, linestyle='--', alpha=0.5)

# Product
ax = axes[2]
ax.stem(times, y3, label=r'$\gamma(k) \cos(\pi k)$')
ax.legend(loc='upper right')
ax.set(xlim=(0, 15), ylim=(-3, 3), yticks=(-1, 0, 1, 2, 3))
ax.hlines(0, 0, 15, linestyle='--', alpha=0.5)
ax.set_xlabel("k")

plt.show()


On the other hand, if we evaluate 𝑓(𝜔) at 𝜔 = 𝜋/3, then the cycles are not matched, the sequence 𝛾(𝑘) cos(𝜔𝑘) contains
both positive and negative terms, and hence the sum of these terms is much smaller.

ϕ = -0.8
times = list(range(16))
y1 = [ϕ**k / (1 - ϕ**2) for k in times]
y2 = [np.cos(np.pi * k/3) for k in times]
y3 = [a * b for a, b in zip(y1, y2)]

num_rows, num_cols = 3, 1
fig, axes = plt.subplots(num_rows, num_cols, figsize=(10, 8))
plt.subplots_adjust(hspace=0.25)

# Autocovariance when phi = -0.8


ax = axes[0]
ax.plot(times, y1, 'bo-', alpha=0.6, label=r'$\gamma(k)$')
ax.legend(loc='upper right')
ax.set(xlim=(0, 15), yticks=(-2, 0, 2))
ax.hlines(0, 0, 15, linestyle='--', alpha=0.5)

# Cycles at frequency π
ax = axes[1]
ax.plot(times, y2, 'bo-', alpha=0.6, label=r'$\cos(\pi k/3)$')
ax.legend(loc='upper right')
ax.set(xlim=(0, 15), yticks=(-1, 0, 1))
ax.hlines(0, 0, 15, linestyle='--', alpha=0.5)

# Product
ax = axes[2]
ax.stem(times, y3, label=r'$\gamma(k) \cos(\pi k/3)$')
ax.legend(loc='upper right')
ax.set(xlim=(0, 15), ylim=(-3, 3), yticks=(-1, 0, 1, 2, 3))
ax.hlines(0, 0, 15, linestyle='--', alpha=0.5)
ax.set_xlabel("$k$")

plt.show()

In summary, the spectral density is large at frequencies 𝜔 where the autocovariance function exhibits damped cycles.


29.3.6 Inverting the Transformation

We have just seen that the spectral density is useful in the sense that it provides a frequency-based perspective on the
autocovariance structure of a covariance stationary process.
Another reason that the spectral density is useful is that it can be “inverted” to recover the autocovariance function via
the inverse Fourier transform.
In particular, for all 𝑘 ∈ ℤ, we have
𝛾(𝑘) = (1/2𝜋) ∫_{−𝜋}^{𝜋} 𝑓(𝜔)𝑒^{𝑖𝜔𝑘} 𝑑𝜔 (29.14)

This is convenient in situations where the spectral density is easier to calculate and manipulate than the autocovariance
function.
(For example, the expression (29.12) for the ARMA spectral density is much easier to work with than the expression for
the ARMA autocovariance)
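For the AR(1) case we can check (29.14) directly, comparing a numerical version of the integral against the closed form (29.4); the snippet below (our own check, using scipy.integrate.quad and arbitrary parameter values) does this.

import numpy as np
from scipy.integrate import quad

ϕ, σ = 0.8, 1.0
f = lambda ω: σ**2 / (1 - 2 * ϕ * np.cos(ω) + ϕ**2)      # spectral density (29.11)

for k in range(4):
    # Since f is even, the e^{iωk} factor in (29.14) can be replaced by cos(ωk)
    integral, _ = quad(lambda ω: f(ω) * np.cos(ω * k), -np.pi, np.pi)
    print(np.isclose(integral / (2 * np.pi), ϕ**k * σ**2 / (1 - ϕ**2)))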

29.3.7 Mathematical Theory

This section is loosely based on [Sargent, 1987], p. 249-253, and included for those who
• would like a bit more insight into spectral densities
• and have at least some background in Hilbert space theory
Others should feel free to skip to the next section — none of this material is necessary to progress to computation.
Recall that every separable Hilbert space 𝐻 has a countable orthonormal basis {ℎ𝑘 }.
The nice thing about such a basis is that every 𝑓 ∈ 𝐻 satisfies

𝑓 = ∑_{𝑘} 𝛼𝑘 ℎ𝑘 where 𝛼𝑘 ∶= ⟨𝑓, ℎ𝑘 ⟩ (29.15)

where ⟨⋅, ⋅⟩ denotes the inner product in 𝐻.


Thus, 𝑓 can be represented to any degree of precision by linearly combining basis vectors.
The scalar sequence 𝛼 = {𝛼𝑘 } is called the Fourier coefficients of 𝑓, and satisfies ∑𝑘 |𝛼𝑘 |2 < ∞.
In other words, 𝛼 is in ℓ2 , the set of square summable sequences.
Consider an operator 𝑇 that maps 𝛼 ∈ ℓ2 into its expansion ∑𝑘 𝛼𝑘 ℎ𝑘 ∈ 𝐻.
The Fourier coefficients of 𝑇 𝛼 are just 𝛼 = {𝛼𝑘 }, as you can verify by confirming that ⟨𝑇 𝛼, ℎ𝑘 ⟩ = 𝛼𝑘 .
Using elementary results from Hilbert space theory, it can be shown that
• 𝑇 is one-to-one — if 𝛼 and 𝛽 are distinct in ℓ2 , then so are their expansions in 𝐻.
• 𝑇 is onto — if 𝑓 ∈ 𝐻 then its preimage in ℓ2 is the sequence 𝛼 given by 𝛼𝑘 = ⟨𝑓, ℎ𝑘 ⟩.
• 𝑇 is a linear isometry — in particular, ⟨𝛼, 𝛽⟩ = ⟨𝑇 𝛼, 𝑇 𝛽⟩.
Summarizing these results, we say that any separable Hilbert space is isometrically isomorphic to ℓ2 .
In essence, this says that each separable Hilbert space we consider is just a different way of looking at the fundamental
space ℓ2 .
With this in mind, let’s specialize to a setting where
• 𝛾 ∈ ℓ2 is the autocovariance function of a covariance stationary process, and 𝑓 is the spectral density.


• 𝐻 = 𝐿2 , where 𝐿2 is the set of square summable functions on the interval [−𝜋, 𝜋], with inner product ⟨𝑔, ℎ⟩ = ∫_{−𝜋}^{𝜋} 𝑔(𝜔)ℎ(𝜔)𝑑𝜔.
• {ℎ𝑘 } = the orthonormal basis for 𝐿2 given by the set of trigonometric functions
  ℎ𝑘 (𝜔) = 𝑒^{𝑖𝜔𝑘}/√(2𝜋), 𝑘 ∈ ℤ, 𝜔 ∈ [−𝜋, 𝜋]
Using the definition of 𝑇 from above and the fact that 𝑓 is even, we now have

𝑇 𝛾 = ∑_{𝑘∈ℤ} 𝛾(𝑘) 𝑒^{𝑖𝜔𝑘}/√(2𝜋) = (1/√(2𝜋)) 𝑓(𝜔) (29.16)

In other words, apart from a scalar multiple, the spectral density is just a transformation of 𝛾 ∈ ℓ2 under a certain linear
isometry — a different way to view 𝛾.
In particular, it is an expansion of the autocovariance function with respect to the trigonometric basis functions in 𝐿2 .
As discussed above, the Fourier coefficients of 𝑇 𝛾 are given by the sequence 𝛾, and, in particular, 𝛾(𝑘) = ⟨𝑇 𝛾, ℎ𝑘 ⟩.
Transforming this inner product into its integral expression and using (29.16) gives (29.14), justifying our earlier
expression for the inverse transform.

29.4 Implementation

Most code for working with covariance stationary models deals with ARMA models.
Python code for studying ARMA models can be found in the tsa submodule of statsmodels.
Since this code doesn’t quite cover our needs — particularly vis-a-vis spectral analysis — we’ve put together the module
arma.py, which is part of the QuantEcon.py package.
The module provides functions for mapping ARMA(𝑝, 𝑞) models into their
1. impulse response function
2. simulated time series
3. autocovariance function
4. spectral density

29.4.1 Application

Let’s use this code to replicate the plots on pages 68–69 of [Ljungqvist and Sargent, 2018].
Here are some functions to generate the plots

def plot_impulse_response(arma, ax=None):


if ax is None:
ax = plt.gca()
yi = arma.impulse_response()
ax.stem(list(range(len(yi))), yi)
ax.set(xlim=(-0.5), ylim=(min(yi)-0.1, max(yi)+0.1),
title='Impulse response', xlabel='time', ylabel='response')
return ax

def plot_spectral_density(arma, ax=None):


if ax is None:
ax = plt.gca()
w, spect = arma.spectral_density(two_pi=False)
ax.semilogy(w, spect)
ax.set(xlim=(0, np.pi), ylim=(0, np.max(spect)),
title='Spectral density', xlabel='frequency', ylabel='spectrum')
return ax

def plot_autocovariance(arma, ax=None):


if ax is None:
ax = plt.gca()
acov = arma.autocovariance()
ax.stem(list(range(len(acov))), acov)
ax.set(xlim=(-0.5, len(acov) - 0.5), title='Autocovariance',
xlabel='time', ylabel='autocovariance')
return ax

def plot_simulation(arma, ax=None):


if ax is None:
ax = plt.gca()
x_out = arma.simulation()
ax.plot(x_out)
ax.set(title='Sample path', xlabel='time', ylabel='state space')
return ax

def quad_plot(arma):
"""
Plots the impulse response, spectral_density, autocovariance,
and one realization of the process.

"""
num_rows, num_cols = 2, 2
fig, axes = plt.subplots(num_rows, num_cols, figsize=(10, 7))
plot_functions = [plot_impulse_response,
plot_spectral_density,
plot_autocovariance,
plot_simulation]
for plot_func, ax in zip(plot_functions, axes.flatten()):
plot_func(arma, ax)
plt.tight_layout()
plt.show()

Now let’s call these functions to generate plots.


As a warmup, let’s make sure things look right when we use the pure white noise model 𝑋𝑡 = 𝜖𝑡 .

ϕ = 0.0
θ = 0.0
arma = qe.ARMA(ϕ, θ)
quad_plot(arma)

/home/runner/miniconda3/envs/quantecon/lib/python3.12/site-packages/matplotlib/
↪cbook.py:1762: ComplexWarning: Casting complex values to real discards the␣

↪imaginary part

return math.isfinite(val)
/home/runner/miniconda3/envs/quantecon/lib/python3.12/site-packages/matplotlib/
↪cbook.py:1398: ComplexWarning: Casting complex values to real discards the␣

↪imaginary part

return np.asarray(x, float)


/tmp/ipykernel_6988/4271821819.py:15: UserWarning: Attempt to set non-positive␣
↪ylim on a log-scaled axis will be ignored.

ax.set(xlim=(0, np.pi), ylim=(0, np.max(spect)),


/home/runner/miniconda3/envs/quantecon/lib/python3.12/site-packages/matplotlib/
↪transforms.py:993: ComplexWarning: Casting complex values to real discards the␣

↪imaginary part

self._points[:, 1] = interval

If we look carefully, things look good: the spectrum is the flat line at 100 at the very top of the spectrum graph, which
is as it should be.
Also
• the variance equals 1 = (1/2𝜋) ∫_{−𝜋}^{𝜋} 1 𝑑𝜔, as it should.
• the covariogram and impulse response look as they should.
• it is actually challenging to visualize a time series realization of white noise – a sequence of surprises – but this too
looks pretty good.
To get some more examples, as our laboratory we’ll replicate quartets of graphs that [Ljungqvist and Sargent, 2018] use
to teach “how to read spectral densities”.
Ljungqvist and Sargent’s first model is 𝑋𝑡 = 1.3𝑋𝑡−1 − .7𝑋𝑡−2 + 𝜖𝑡


ϕ = 1.3, -.7
θ = 0.0
arma = qe.ARMA(ϕ, θ)
quad_plot(arma)

/tmp/ipykernel_6988/4271821819.py:15: UserWarning: Attempt to set non-positive␣


↪ylim on a log-scaled axis will be ignored.

ax.set(xlim=(0, np.pi), ylim=(0, np.max(spect)),

Ljungqvist and Sargent’s second model is 𝑋𝑡 = .9𝑋𝑡−1 + 𝜖𝑡

ϕ = 0.9
θ = -0.0
arma = qe.ARMA(ϕ, θ)
quad_plot(arma)

/tmp/ipykernel_6988/4271821819.py:15: UserWarning: Attempt to set non-positive␣


↪ylim on a log-scaled axis will be ignored.

ax.set(xlim=(0, np.pi), ylim=(0, np.max(spect)),


Ljungqvist and Sargent’s third model is 𝑋𝑡 = .8𝑋𝑡−4 + 𝜖𝑡

ϕ = 0., 0., 0., .8


θ = -0.0
arma = qe.ARMA(ϕ, θ)
quad_plot(arma)

/tmp/ipykernel_6988/4271821819.py:15: UserWarning: Attempt to set non-positive␣


↪ylim on a log-scaled axis will be ignored.

ax.set(xlim=(0, np.pi), ylim=(0, np.max(spect)),


Ljungqvist and Sargent’s fourth model is 𝑋𝑡 = .98𝑋𝑡−1 + 𝜖𝑡 − .7𝜖𝑡−1

ϕ = .98
θ = -0.7
arma = qe.ARMA(ϕ, θ)
quad_plot(arma)

/tmp/ipykernel_6988/4271821819.py:15: UserWarning: Attempt to set non-positive␣


↪ylim on a log-scaled axis will be ignored.

ax.set(xlim=(0, np.pi), ylim=(0, np.max(spect)),


29.4.2 Explanation

The call
arma = ARMA(ϕ, θ, σ)
creates an instance arma that represents the ARMA(𝑝, 𝑞) model

𝑋𝑡 = 𝜙1 𝑋𝑡−1 + ... + 𝜙𝑝 𝑋𝑡−𝑝 + 𝜖𝑡 + 𝜃1 𝜖𝑡−1 + ... + 𝜃𝑞 𝜖𝑡−𝑞

If ϕ and θ are arrays or sequences, then the interpretation will be


• ϕ holds the vector of parameters (𝜙1 , 𝜙2 , ..., 𝜙𝑝 ).
• θ holds the vector of parameters (𝜃1 , 𝜃2 , ..., 𝜃𝑞 ).
The parameter σ is always a scalar, the standard deviation of the white noise.
We also permit ϕ and θ to be scalars, in which case the model will be interpreted as

𝑋𝑡 = 𝜙𝑋𝑡−1 + 𝜖𝑡 + 𝜃𝜖𝑡−1

The two numerical packages most useful for working with ARMA models are scipy.signal and numpy.fft.
The package scipy.signal expects the parameters to be passed into its functions in a manner consistent with the
alternative ARMA notation (29.8).
For example, the impulse response sequence {𝜓𝑡 } discussed above can be obtained using scipy.signal.
dimpulse, and the function call should be of the form
times, ψ = dimpulse((ma_poly, ar_poly, 1), n=impulse_length)


where ma_poly and ar_poly correspond to the polynomials in (29.7) — that is,
• ma_poly is the vector (1, 𝜃1 , 𝜃2 , … , 𝜃𝑞 )
• ar_poly is the vector (1, −𝜙1 , −𝜙2 , … , −𝜙𝑝 )
To this end, we also maintain the arrays ma_poly and ar_poly as instance data, with their values computed automat-
ically from the values of phi and theta supplied by the user.
If the user decides to change the value of either theta or phi ex-post by assignments such as arma.phi = (0.5,
0.2) or arma.theta = (0, -0.1), then ma_poly and ar_poly should update automatically to reflect these new parameters.
This is achieved in our implementation by using descriptors.
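A property (one kind of descriptor) is enough to see the flavor of this mechanism; the class below is a toy sketch of ours, not the actual ARMA implementation.

import numpy as np

class ARMASketch:

    def __init__(self, phi):
        self.phi = phi                                  # runs the setter below

    @property
    def phi(self):
        return self._phi

    @phi.setter
    def phi(self, value):
        self._phi = np.atleast_1d(np.asarray(value, dtype=float))
        self.ar_poly = np.r_[1.0, -self._phi]           # (1, -φ_1, ..., -φ_p)

arma_sketch = ARMASketch(0.5)
print(arma_sketch.ar_poly)                              # [ 1.  -0.5]
arma_sketch.phi = (0.5, 0.2)
print(arma_sketch.ar_poly)                              # [ 1.  -0.5 -0.2]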

29.4.3 Computing the Autocovariance Function

As discussed above, for ARMA processes the spectral density has a simple representation that is relatively easy to calculate.
Given this fact, the easiest way to obtain the autocovariance function is to recover it from the spectral density via the
inverse Fourier transform.
Here we use NumPy’s Fourier transform package np.fft, which wraps a standard Fortran-based package called FFTPACK.
A look at the np.fft documentation shows that the inverse transform np.fft.ifft takes a given sequence 𝐴0 , 𝐴1 , … , 𝐴𝑛−1
and returns the sequence 𝑎0 , 𝑎1 , … , 𝑎𝑛−1 defined by

𝑎𝑘 = (1/𝑛) ∑_{𝑡=0}^{𝑛−1} 𝐴𝑡 𝑒^{𝑖𝑘2𝜋𝑡/𝑛}

Thus, if we set 𝐴𝑡 = 𝑓(𝜔𝑡 ), where 𝑓 is the spectral density and 𝜔𝑡 ∶= 2𝜋𝑡/𝑛, then

𝑎𝑘 = (1/𝑛) ∑_{𝑡=0}^{𝑛−1} 𝑓(𝜔𝑡 )𝑒^{𝑖𝜔𝑡 𝑘} = (1/2𝜋)(2𝜋/𝑛) ∑_{𝑡=0}^{𝑛−1} 𝑓(𝜔𝑡 )𝑒^{𝑖𝜔𝑡 𝑘} , 𝜔𝑡 ∶= 2𝜋𝑡/𝑛

For 𝑛 sufficiently large, we then have


𝑎𝑘 ≈ (1/2𝜋) ∫_{0}^{2𝜋} 𝑓(𝜔)𝑒^{𝑖𝜔𝑘} 𝑑𝜔 = (1/2𝜋) ∫_{−𝜋}^{𝜋} 𝑓(𝜔)𝑒^{𝑖𝜔𝑘} 𝑑𝜔

(You can check the last equality)


In view of (29.14), we have now shown that, for 𝑛 sufficiently large, 𝑎𝑘 ≈ 𝛾(𝑘) — which is exactly what we want to
compute.
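Here is a minimal sketch of that recipe, applied to the AR(1) spectral density (29.11) with arbitrary parameter values and compared against the exact autocovariances (29.4):

import numpy as np

ϕ, σ, n = 0.8, 1.0, 2**10
ω = 2 * np.pi * np.arange(n) / n                      # ω_t = 2πt/n
A = σ**2 / (1 - 2 * ϕ * np.cos(ω) + ϕ**2)             # A_t = f(ω_t)
a = np.fft.ifft(A)                                    # a_k ≈ γ(k) for small k

for k in range(4):
    print(np.isclose(a[k].real, ϕ**k * σ**2 / (1 - ϕ**2)))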



CHAPTER

THIRTY

ESTIMATION OF SPECTRA

In addition to what’s in Anaconda, this lecture will need the following libraries:

!pip install --upgrade quantecon

30.1 Overview

In a previous lecture, we covered some fundamental properties of covariance stationary linear stochastic processes.
One objective for that lecture was to introduce spectral densities — a standard and very useful technique for analyzing
such processes.
In this lecture, we turn to the problem of estimating spectral densities and other related quantities from data.
Estimates of the spectral density are computed using what is known as a periodogram — which in turn is computed via
the famous fast Fourier transform.
Once the basic technique has been explained, we will apply it to the analysis of several key macroeconomic time series.
For supplementary reading, see [Sargent, 1987] or [Cryer and Chan, 2008].
Let’s start with some standard imports:

import numpy as np
import matplotlib.pyplot as plt
from quantecon import ARMA, periodogram, ar_periodogram

30.2 Periodograms

Recall that the spectral density 𝑓 of a covariance stationary process with autocovariance function 𝛾 can be written

𝑓(𝜔) = 𝛾(0) + 2 ∑_{𝑘≥1} 𝛾(𝑘) cos(𝜔𝑘), 𝜔 ∈ ℝ

Now consider the problem of estimating the spectral density of a given time series, when 𝛾 is unknown.
In particular, let 𝑋0 , … , 𝑋𝑛−1 be 𝑛 consecutive observations of a single time series that is assumed to be covariance
stationary.


The most common estimator of the spectral density of this process is the periodogram of 𝑋0 , … , 𝑋𝑛−1 , which is defined
as
𝐼(𝜔) ∶= (1/𝑛) ∣∑_{𝑡=0}^{𝑛−1} 𝑋𝑡 𝑒^{𝑖𝑡𝜔} ∣² , 𝜔 ∈ ℝ (30.1)

(Recall that |𝑧| denotes the modulus of complex number 𝑧)


Alternatively, 𝐼(𝜔) can be expressed as
𝐼(𝜔) = (1/𝑛) {[∑_{𝑡=0}^{𝑛−1} 𝑋𝑡 cos(𝜔𝑡)]² + [∑_{𝑡=0}^{𝑛−1} 𝑋𝑡 sin(𝜔𝑡)]²}
It is straightforward to show that the function 𝐼 is even and 2𝜋-periodic (i.e., 𝐼(𝜔) = 𝐼(−𝜔) and 𝐼(𝜔 + 2𝜋) = 𝐼(𝜔) for
all 𝜔 ∈ ℝ).
From these two results, you will be able to verify that the values of 𝐼 on [0, 𝜋] determine the values of 𝐼 on all of ℝ.
The next section helps to explain the connection between the periodogram and the spectral density.

30.2.1 Interpretation

To interpret the periodogram, it is convenient to focus on its values at the Fourier frequencies
𝜔𝑗 ∶= 2𝜋𝑗/𝑛 , 𝑗 = 0, … , 𝑛 − 1
In what sense is 𝐼(𝜔𝑗 ) an estimate of 𝑓(𝜔𝑗 )?
The answer is straightforward, although it does involve some algebra.
With a bit of effort, one can show that for any integer 𝑗 > 0,
∑_{𝑡=0}^{𝑛−1} 𝑒^{𝑖𝑡𝜔𝑗} = ∑_{𝑡=0}^{𝑛−1} exp{𝑖2𝜋𝑗𝑡/𝑛} = 0
Letting 𝑋̄ denote the sample mean 𝑛⁻¹ ∑_{𝑡=0}^{𝑛−1} 𝑋𝑡 , we then have
𝑛𝐼(𝜔𝑗 ) = ∣∑_{𝑡=0}^{𝑛−1} (𝑋𝑡 − 𝑋̄ )𝑒^{𝑖𝑡𝜔𝑗} ∣² = ∑_{𝑡=0}^{𝑛−1} (𝑋𝑡 − 𝑋̄ )𝑒^{𝑖𝑡𝜔𝑗} ∑_{𝑟=0}^{𝑛−1} (𝑋𝑟 − 𝑋̄ )𝑒^{−𝑖𝑟𝜔𝑗}

By carefully working through the sums, one can transform this to


𝑛𝐼(𝜔𝑗 ) = ∑_{𝑡=0}^{𝑛−1} (𝑋𝑡 − 𝑋̄ )² + 2 ∑_{𝑘=1}^{𝑛−1} ∑_{𝑡=𝑘}^{𝑛−1} (𝑋𝑡 − 𝑋̄ )(𝑋𝑡−𝑘 − 𝑋̄ ) cos(𝜔𝑗 𝑘)

Now let
𝛾̂ (𝑘) ∶= (1/𝑛) ∑_{𝑡=𝑘}^{𝑛−1} (𝑋𝑡 − 𝑋̄ )(𝑋𝑡−𝑘 − 𝑋̄ ), 𝑘 = 0, 1, … , 𝑛 − 1
This is the sample autocovariance function, the natural “plug-in estimator” of the autocovariance function 𝛾.
(“Plug-in estimator” is an informal term for an estimator found by replacing expectations with sample means)
With this notation, we can now write
𝐼(𝜔𝑗 ) = 𝛾̂ (0) + 2 ∑_{𝑘=1}^{𝑛−1} 𝛾̂ (𝑘) cos(𝜔𝑗 𝑘)

Recalling our expression for 𝑓 given above, we see that 𝐼(𝜔𝑗 ) is just a sample analog of 𝑓(𝜔𝑗 ).
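The following snippet (a check of ours that can be run on any simulated series) confirms the identity just derived: the periodogram at a Fourier frequency, computed via np.fft.fft, equals 𝛾̂ (0) + 2 ∑_{𝑘=1}^{𝑛−1} 𝛾̂ (𝑘) cos(𝜔𝑗 𝑘).

import numpy as np

np.random.seed(0)
n = 128
X = np.random.randn(n)
X_bar = X.mean()

j = 5                                           # any integer with 0 < j < n
ω_j = 2 * np.pi * j / n

I_fft = np.abs(np.fft.fft(X)[j])**2 / n         # periodogram at ω_j via the FFT

γ_hat = [np.sum((X[k:] - X_bar) * (X[:n-k] - X_bar)) / n for k in range(n)]
I_sum = γ_hat[0] + 2 * sum(γ_hat[k] * np.cos(ω_j * k) for k in range(1, n))

print(np.isclose(I_fft, I_sum))                 # True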


30.2.2 Calculation

Let’s now consider how to compute the periodogram as defined in (30.1).


There are already functions available that will do this for us — an example is statsmodels.tsa.stattools.
periodogram in the statsmodels package.
However, it is very simple to replicate their results, and this will give us a platform to make useful extensions.
The most common way to calculate the periodogram is via the discrete Fourier transform, which in turn is implemented
through the fast Fourier transform algorithm.
In general, given a sequence 𝑎0 , … , 𝑎𝑛−1 , the discrete Fourier transform computes the sequence
𝐴𝑗 ∶= ∑_{𝑡=0}^{𝑛−1} 𝑎𝑡 exp{𝑖2𝜋𝑡𝑗/𝑛}, 𝑗 = 0, … , 𝑛 − 1

With numpy.fft.fft imported as fft and 𝑎0 , … , 𝑎𝑛−1 stored in NumPy array a, the function call fft(a) returns
the values 𝐴0 , … , 𝐴𝑛−1 as a NumPy array.
It follows that when the data 𝑋0 , … , 𝑋𝑛−1 are stored in array X, the values 𝐼(𝜔𝑗 ) at the Fourier frequencies, which are
given by
(1/𝑛) ∣∑_{𝑡=0}^{𝑛−1} 𝑋𝑡 exp{𝑖2𝜋𝑡𝑗/𝑛}∣² , 𝑗 = 0, … , 𝑛 − 1

can be computed by np.abs(fft(X))**2 / len(X).

Note: The NumPy function abs acts elementwise, and correctly handles complex numbers (by computing their modulus,
which is exactly what we need).

A function called periodogram that puts all this together can be found here.
Let’s generate some data for this function using the ARMA class from QuantEcon.py (see the lecture on linear processes
for more details).
Here’s a code snippet that, once the preceding code has been run, generates data from the process

𝑋𝑡 = 0.5𝑋𝑡−1 + 𝜖𝑡 − 0.8𝜖𝑡−2 (30.2)

where {𝜖𝑡 } is white noise with unit variance, and compares the periodogram to the actual spectral density

n = 40 # Data size
ϕ, θ = 0.5, (0, -0.8) # AR and MA parameters
lp = ARMA(ϕ, θ)
X = lp.simulation(ts_length=n)

fig, ax = plt.subplots()
x, y = periodogram(X)
ax.plot(x, y, 'b-', lw=2, alpha=0.5, label='periodogram')
x_sd, y_sd = lp.spectral_density(two_pi=False, res=120)
ax.plot(x_sd, y_sd, 'r-', lw=2, alpha=0.8, label='spectral density')
ax.legend()
plt.show()


/home/runner/miniconda3/envs/quantecon/lib/python3.12/site-packages/matplotlib/
↪cbook.py:1762: ComplexWarning: Casting complex values to real discards the␣

↪imaginary part

return math.isfinite(val)
/home/runner/miniconda3/envs/quantecon/lib/python3.12/site-packages/matplotlib/
↪cbook.py:1398: ComplexWarning: Casting complex values to real discards the␣

↪imaginary part

return np.asarray(x, float)

This estimate looks rather disappointing, but the data size is only 40, so perhaps it’s not surprising that the estimate is
poor.
However, if we try again with n = 1200 the outcome is not much better
The periodogram is far too irregular relative to the underlying spectral density.
This brings us to our next topic.

30.3 Smoothing

There are two related issues here.


One is that, given the way the fast Fourier transform is implemented, the number of points 𝜔 at which 𝐼(𝜔) is estimated
increases in line with the amount of data.
In other words, although we have more data, we are also using it to estimate more values.
A second issue is that densities of all types are fundamentally hard to estimate without parametric assumptions.
Typically, nonparametric estimation of densities requires some degree of smoothing.


The standard way that smoothing is applied to periodograms is by taking local averages.
In other words, the value 𝐼(𝜔𝑗 ) is replaced with a weighted average of the adjacent values

𝐼(𝜔𝑗−𝑝 ), 𝐼(𝜔𝑗−𝑝+1 ), … , 𝐼(𝜔𝑗 ), … , 𝐼(𝜔𝑗+𝑝 )

This weighted average can be written as


𝐼𝑆 (𝜔𝑗 ) ∶= ∑_{ℓ=−𝑝}^{𝑝} 𝑤(ℓ)𝐼(𝜔𝑗+ℓ ) (30.3)

where the weights 𝑤(−𝑝), … , 𝑤(𝑝) are a sequence of 2𝑝 + 1 nonnegative values summing to one.
In general, larger values of 𝑝 indicate more smoothing — more on this below.
The next figure shows the kind of sequence typically used.
Note the smaller weights towards the edges and larger weights in the center, so that more distant values from 𝐼(𝜔𝑗 ) have
less weight than closer ones in the sum (30.3).

def hanning_window(M):
w = [0.5 - 0.5 * np.cos(2 * np.pi * n/(M-1)) for n in range(M)]
return w

window = hanning_window(25) / np.abs(sum(hanning_window(25)))


x = np.linspace(-12, 12, 25)
fig, ax = plt.subplots(figsize=(9, 7))
ax.plot(x, window)
ax.set_title("Hanning window")
ax.set_ylabel("Weights")
ax.set_xlabel("Position in sequence of weights")
plt.show()


30.3.1 Estimation with Smoothing

Our next step is to provide code that will not only estimate the periodogram but also provide smoothing as required.
Such functions have been written in estspec.py and are available once you’ve installed QuantEcon.py.
The GitHub listing displays three functions, smooth(), periodogram(), ar_periodogram(). We will discuss
the first two here and the third one below.
The periodogram() function returns a periodogram, optionally smoothed via the smooth() function.
Regarding the smooth() function, since smoothing adds a nontrivial amount of computation, we have applied a fairly
terse array-centric method based around np.convolve.
Readers are left either to explore or simply to use this code according to their interests.
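For readers who just want the idea, here is a much simpler sketch of that approach (it is not the estspec.py code, and it handles the edges of the frequency grid crudely):

import numpy as np

def smooth_sketch(I_vals, window_len=7):
    weights = np.hanning(window_len)
    weights = weights / weights.sum()              # nonnegative weights summing to one, as in (30.3)
    return np.convolve(I_vals, weights, mode='same')

I_vals = np.abs(np.fft.fft(np.random.randn(400)))**2 / 400
print(smooth_sketch(I_vals)[:5])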
The next three figures each show smoothed and unsmoothed periodograms, as well as the population or “true” spectral
density.
(The model is the same as before — see equation (30.2) — and there are 400 observations)
From the top figure to bottom, the window length is varied from small to large.
In looking at the figure, we can see that for this model and data size, the window length chosen in the middle figure
provides the best fit.
Relative to this value, the first window length provides insufficient smoothing, while the third gives too much smoothing.


Of course in real estimation problems, the true spectral density is not visible and the choice of appropriate smoothing
will have to be made based on judgement/priors or some other theory.

30.3.2 Pre-Filtering and Smoothing

In the code listing, we showed three functions from the file estspec.py.
The third function in the file (ar_periodogram()) adds a pre-processing step to periodogram smoothing.
First, we describe the basic idea, and after that we give the code.
The essential idea is to
1. Transform the data in order to make estimation of the spectral density more efficient.
2. Compute the periodogram associated with the transformed data.
3. Reverse the effect of the transformation on the periodogram, so that it now estimates the spectral density of the
original process.
Step 1 is called pre-filtering or pre-whitening, while step 3 is called recoloring.
The first step is called pre-whitening because the transformation is usually designed to turn the data into something closer
to white noise.
Why would this be desirable in terms of spectral density estimation?
The reason is that we are smoothing our estimated periodogram based on estimated values at nearby points — recall
(30.3).
The underlying assumption that makes this a good idea is that the true spectral density is relatively regular — the value
of 𝐼(𝜔) is close to that of 𝐼(𝜔′ ) when 𝜔 is close to 𝜔′ .
This will not be true in all cases, but it is certainly true for white noise.
For white noise, 𝐼 is as regular as possible — it is a constant function.
In this case, values of 𝐼(𝜔′ ) at points 𝜔′ near to 𝜔 provide the maximum possible amount of information about the value
𝐼(𝜔).
Another way to put this is that if 𝐼 is relatively constant, then we can use a large amount of smoothing without introducing
too much bias.

30.3.3 The AR(1) Setting

Let’s examine this idea more carefully in a particular setting — where the data are assumed to be generated by an AR(1)
process.
(More general ARMA settings can be handled using similar techniques to those described below)
Suppose in particular that {𝑋𝑡 } is covariance stationary and AR(1), with

𝑋𝑡+1 = 𝜇 + 𝜙𝑋𝑡 + 𝜖𝑡+1 (30.4)

where 𝜇 and 𝜙 ∈ (−1, 1) are unknown parameters and {𝜖𝑡 } is white noise.
It follows that if we regress 𝑋𝑡+1 on 𝑋𝑡 and an intercept, the residuals will approximate white noise.
Let
• 𝑔 be the spectral density of {𝜖𝑡 } — a constant function, as discussed above
• 𝐼0 be the periodogram estimated from the residuals — an estimate of 𝑔


• 𝑓 be the spectral density of {𝑋𝑡 } — the object we are trying to estimate


In view of an earlier result we obtained while discussing ARMA processes, 𝑓 and 𝑔 are related by
𝑓(𝜔) = ∣1/(1 − 𝜙𝑒^{𝑖𝜔})∣² 𝑔(𝜔) (30.5)
This suggests that the recoloring step, which constructs an estimate 𝐼 of 𝑓 from 𝐼0 , should set
𝐼(𝜔) = ∣1/(1 − 𝜙̂ 𝑒^{𝑖𝜔})∣² 𝐼0 (𝜔)

where 𝜙 ̂ is the OLS estimate of 𝜙.


The code for ar_periodogram() — the third function in estspec.py — does exactly this. (See the code here).
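To make the three steps concrete, here is a rough sketch of our own (with no smoothing and no special treatment of end effects; the estspec.py function is the one used for the figures):

import numpy as np

np.random.seed(42)
n, ϕ_true = 150, -0.9
X = np.zeros(n)
for t in range(n - 1):                               # simulate (30.4) with μ = 0
    X[t + 1] = ϕ_true * X[t] + np.random.randn()

# Step 1: pre-whiten -- regress X_{t+1} on an intercept and X_t, keep the residuals
Y = X[1:]
Z = np.column_stack([np.ones(n - 1), X[:-1]])
coefs, *_ = np.linalg.lstsq(Z, Y, rcond=None)
μ_hat, ϕ_hat = coefs
e = Y - Z @ coefs

# Step 2: periodogram of the (approximately white) residuals
m = len(e)
ω = 2 * np.pi * np.arange(m) / m
I0 = np.abs(np.fft.fft(e))**2 / m

# Step 3: recolor using the estimated AR(1) transfer function, as in the display above
I = I0 * np.abs(1 / (1 - ϕ_hat * np.exp(1j * ω)))**2
print(ϕ_hat, I[:3])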
The next figure shows realizations of the two kinds of smoothed periodograms
1. “standard smoothed periodogram”, the ordinary smoothed periodogram, and
2. “AR smoothed periodogram”, the pre-whitened and recolored one generated by ar_periodogram()
The periodograms are calculated from time series drawn from (30.4) with 𝜇 = 0 and 𝜙 = −0.9.
Each time series is of length 150.
The difference between the three subfigures is just randomness — each one uses a different draw of the time series.
In all cases, periodograms are fit with the “hamming” window and window length of 65.
Overall, the fit of the AR smoothed periodogram is much better, in the sense of being closer to the true spectral density.

30.4 Exercises

Exercise 30.4.1
Replicate this figure (modulo randomness).
The model is as in equation (30.2) and there are 400 observations.
For the smoothed periodogram, the window type is “hamming”.

Solution to Exercise 30.4.1

## Data
n = 400
ϕ = 0.5
θ = 0, -0.8
lp = ARMA(ϕ, θ)
X = lp.simulation(ts_length=n)

fig, ax = plt.subplots(3, 1, figsize=(10, 12))

for i, wl in enumerate((15, 55, 175)): # Window lengths

x, y = periodogram(X)
ax[i].plot(x, y, 'b-', lw=2, alpha=0.5, label='periodogram')
x_sd, y_sd = lp.spectral_density(two_pi=False, res=120)


ax[i].plot(x_sd, y_sd, 'r-', lw=2, alpha=0.8, label='spectral density')

x, y_smoothed = periodogram(X, window='hamming', window_len=wl)


ax[i].plot(x, y_smoothed, 'k-', lw=2, label='smoothed periodogram')

ax[i].legend()
ax[i].set_title(f'window length = {wl}')
plt.show()

Exercise 30.4.2
Replicate this figure (modulo randomness).
The model is as in equation (30.4), with 𝜇 = 0, 𝜙 = −0.9 and 150 observations in each time series.
All periodograms are fit with the “hamming” window and window length of 65.

Solution to Exercise 30.4.2

lp = ARMA(-0.9)
wl = 65

fig, ax = plt.subplots(3, 1, figsize=(10,12))

for i in range(3):
X = lp.simulation(ts_length=150)
ax[i].set_xlim(0, np.pi)

x_sd, y_sd = lp.spectral_density(two_pi=False, res=180)


ax[i].semilogy(x_sd, y_sd, 'r-', lw=2, alpha=0.75,
label='spectral density')

x, y_smoothed = periodogram(X, window='hamming', window_len=wl)


ax[i].semilogy(x, y_smoothed, 'k-', lw=2, alpha=0.75,
label='standard smoothed periodogram')

x, y_ar = ar_periodogram(X, window='hamming', window_len=wl)


ax[i].semilogy(x, y_ar, 'b-', lw=2, alpha=0.75,
label='AR smoothed periodogram')

ax[i].legend(loc='upper left')
plt.show()


CHAPTER

THIRTYONE

ADDITIVE AND MULTIPLICATIVE FUNCTIONALS

In addition to what’s in Anaconda, this lecture will need the following libraries:

!pip install --upgrade quantecon

31.1 Overview

Many economic time series display persistent growth that prevents them from being asymptotically stationary and ergodic.
For example, outputs, prices, and dividends typically display irregular but persistent growth.
Asymptotic stationarity and ergodicity are key assumptions needed to make it possible to learn by applying statistical
methods.
But there are good ways to model time series that have persistent growth that still enable statistical learning based on a
law of large numbers for an asymptotically stationary and ergodic process.
Thus, [Hansen, 2012] described two classes of time series models that accommodate growth.
They are
1. additive functionals that display random “arithmetic growth”
2. multiplicative functionals that display random “geometric growth”
These two classes of processes are closely connected.
If a process {𝑦𝑡 } is an additive functional and 𝜙𝑡 = exp(𝑦𝑡 ), then {𝜙𝑡 } is a multiplicative functional.
In this lecture, we describe both additive functionals and multiplicative functionals.
We also describe and compute decompositions of additive and multiplicative processes into four components:
1. a constant
2. a trend component
3. an asymptotically stationary component
4. a martingale
We describe how to construct, simulate, and interpret these components.
More details about these concepts and algorithms can be found in Hansen [Hansen, 2012] and Hansen and Sargent [Hansen
and Sargent, 2024].
Let’s start with some imports:


import numpy as np
import scipy.linalg as la
import quantecon as qe
import matplotlib.pyplot as plt
from scipy.stats import norm, lognorm

31.2 A Particular Additive Functional

[Hansen, 2012] describes a general class of additive functionals.


This lecture focuses on a subclass of these: a scalar process {𝑦𝑡 }_{𝑡=0}^∞ whose increments are driven by a Gaussian
vector autoregression.
Our special additive functional displays interesting time series behavior while also being easy to construct, simulate, and
analyze by using linear state-space tools.
We construct our additive functional from two pieces, the first of which is a first-order vector autoregression (VAR)

𝑥𝑡+1 = 𝐴𝑥𝑡 + 𝐵𝑧𝑡+1 (31.1)

Here
• 𝑥𝑡 is an 𝑛 × 1 vector,
• 𝐴 is an 𝑛 × 𝑛 stable matrix (all eigenvalues lie within the open unit circle),
• 𝑧𝑡+1 ∼ 𝑁 (0, 𝐼) is an 𝑚 × 1 IID shock,
• 𝐵 is an 𝑛 × 𝑚 matrix, and
• 𝑥0 ∼ 𝑁 (𝜇0 , Σ0 ) is a random initial condition for 𝑥
The second piece is an equation that expresses increments of {𝑦𝑡 }_{𝑡=0}^∞ as linear functions of

• a scalar constant 𝜈,
• the vector 𝑥𝑡 , and
• the same Gaussian vector 𝑧𝑡+1 that appears in the VAR (31.1)
In particular,

𝑦𝑡+1 − 𝑦𝑡 = 𝜈 + 𝐷𝑥𝑡 + 𝐹 𝑧𝑡+1 (31.2)

Here 𝑦0 ∼ 𝑁 (𝜇𝑦0 , Σ𝑦0 ) is a random initial condition for 𝑦.


The nonstationary random process {𝑦𝑡 }_{𝑡=0}^∞ displays systematic but random arithmetic growth.

31.2.1 Linear State-Space Representation

A convenient way to represent our additive functional is to use a linear state space system.
To do this, we set up state and observation vectors

𝑥̂ 𝑡 = (1, 𝑥𝑡 , 𝑦𝑡 )′ and 𝑦̂ 𝑡 = (𝑥𝑡 , 𝑦𝑡 )′


Next we construct a linear system

⎡ 1    ⎤   ⎡ 1  0  0 ⎤ ⎡ 1  ⎤   ⎡ 0 ⎤
⎢ 𝑥𝑡+1 ⎥ = ⎢ 0  𝐴  0 ⎥ ⎢ 𝑥𝑡 ⎥ + ⎢ 𝐵 ⎥ 𝑧𝑡+1
⎣ 𝑦𝑡+1 ⎦   ⎣ 𝜈  𝐷  1 ⎦ ⎣ 𝑦𝑡 ⎦   ⎣ 𝐹 ⎦

⎡ 𝑥𝑡 ⎤   ⎡ 0  𝐼  0 ⎤ ⎡ 1  ⎤
⎢    ⎥ = ⎢          ⎥ ⎢ 𝑥𝑡 ⎥
⎣ 𝑦𝑡 ⎦   ⎣ 0  0  1 ⎦ ⎣ 𝑦𝑡 ⎦
This can be written as

𝑥̂ 𝑡+1 = 𝐴̂ 𝑥̂ 𝑡 + 𝐵̂ 𝑧𝑡+1
𝑦̂ 𝑡 = 𝐷̂ 𝑥̂ 𝑡

which is a standard linear state space system.


To study it, we could map it into an instance of LinearStateSpace from QuantEcon.py.
But here we will use a different set of code for simulation, for reasons described below.

31.3 Dynamics

Let’s run some simulations to build intuition.


In doing so we’ll assume that 𝑧𝑡+1 is scalar and that 𝑥𝑡̃ follows a 4th-order scalar autoregression.

𝑥̃ 𝑡+1 = 𝜙1 𝑥̃ 𝑡 + 𝜙2 𝑥̃ 𝑡−1 + 𝜙3 𝑥̃ 𝑡−2 + 𝜙4 𝑥̃ 𝑡−3 + 𝜎𝑧𝑡+1 (31.3)

in which the zeros 𝑧 of the polynomial

𝜙(𝑧) = (1 − 𝜙1 𝑧 − 𝜙2 𝑧2 − 𝜙3 𝑧3 − 𝜙4 𝑧4 )

are strictly greater than unity in absolute value.


(Being a zero of 𝜙(𝑧) means that 𝜙(𝑧) = 0)
Let the increment in {𝑦𝑡 } obey

𝑦𝑡+1 − 𝑦𝑡 = 𝜈 + 𝑥𝑡̃ + 𝜎𝑧𝑡+1

with an initial condition for 𝑦0 .


While (31.3) is not a first order system like (31.1), we know that it can be mapped into a first order system.
• For an example of such a mapping, see this example.
In fact, this whole model can be mapped into the additive functional system definition in (31.1) – (31.2) by appropriate
selection of the matrices 𝐴, 𝐵, 𝐷, 𝐹 .
You can try writing these matrices down now as an exercise — correct expressions appear in the code below.
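For reference, here is one such mapping written as a sketch (the parameter values are illustrative placeholders; the lecture's own code below remains the authoritative version). With state 𝑥𝑡 = (𝑥̃ 𝑡 , 𝑥̃ 𝑡−1 , 𝑥̃ 𝑡−2 , 𝑥̃ 𝑡−3 )′ , one natural choice of matrices is:

import numpy as np

ϕ_1, ϕ_2, ϕ_3, ϕ_4 = 0.5, -0.2, 0.0, 0.5     # illustrative AR coefficients
σ, ν = 0.01, 0.01                            # illustrative volatility and drift

A = np.array([[ϕ_1, ϕ_2, ϕ_3, ϕ_4],
              [1,   0,   0,   0  ],
              [0,   1,   0,   0  ],
              [0,   0,   1,   0  ]])
B = np.array([[σ], [0], [0], [0]])           # the shock enters only the first state
D = np.array([[1, 0, 0, 0]])                 # picks out x̃_t in (31.2)
F = np.array([[σ]])

print(np.all(np.abs(np.linalg.eigvals(A)) < 1))    # A should be a stable matrix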


31.3.1 Simulation

When simulating we embed our variables into a bigger system.


This system also constructs the components of the decompositions of 𝑦𝑡 and of exp(𝑦𝑡 ) proposed by Hansen [Hansen,
2012].
All of these objects are computed using the code below

class AMF_LSS_VAR:
"""
This class transforms an additive (multiplicative)
functional into a QuantEcon linear state space system.
"""

def __init__(self, A, B, D, F=None, ν=None):


# Unpack required elements
self.nx, self.nk = B.shape
self.A, self.B = A, B

# Checking the dimension of D (extended from the scalar case)


if len(D.shape) > 1 and D.shape[0] != 1:
self.nm = D.shape[0]
self.D = D
elif len(D.shape) > 1 and D.shape[0] == 1:
self.nm = 1
self.D = D
else:
self.nm = 1
self.D = np.expand_dims(D, 0)

# Create space for additive decomposition


self.add_decomp = None
self.mult_decomp = None

# Set F
if not np.any(F):
self.F = np.zeros((self.nk, 1))
else:
self.F = F

# Set ν
if not np.any(ν):
self.ν = np.zeros((self.nm, 1))
elif type(ν) == float:
self.ν = np.asarray([[ν]])
elif len(ν.shape) == 1:
self.ν = np.expand_dims(ν, 1)
else:
self.ν = ν

if self.ν.shape[0] != self.D.shape[0]:
raise ValueError("The dimension of ν is inconsistent with D!")

# Construct BIG state space representation


self.lss = self.construct_ss()

def construct_ss(self):


"""
This creates the state space representation that can be passed
into the quantecon LSS class.
"""
# Pull out useful info
nx, nk, nm = self.nx, self.nk, self.nm
A, B, D, F, ν = self.A, self.B, self.D, self.F, self.ν
if self.add_decomp:
ν, H, g = self.add_decomp
else:
ν, H, g = self.additive_decomp()

# Auxiliary blocks with 0's and 1's to fill out the lss matrices
nx0c = np.zeros((nx, 1))
nx0r = np.zeros(nx)
nx1 = np.ones(nx)
nk0 = np.zeros(nk)
ny0c = np.zeros((nm, 1))
ny0r = np.zeros(nm)
ny1m = np.eye(nm)
ny0m = np.zeros((nm, nm))
nyx0m = np.zeros_like(D)

# Build A matrix for LSS


# Order of states is: [1, t, xt, yt, mt]
A1 = np.hstack([1, 0, nx0r, ny0r, ny0r]) # Transition for 1
A2 = np.hstack([1, 1, nx0r, ny0r, ny0r]) # Transition for t
# Transition for x_{t+1}
A3 = np.hstack([nx0c, nx0c, A, nyx0m.T, nyx0m.T])
# Transition for y_{t+1}
A4 = np.hstack([ν, ny0c, D, ny1m, ny0m])
# Transition for m_{t+1}
A5 = np.hstack([ny0c, ny0c, nyx0m, ny0m, ny1m])
Abar = np.vstack([A1, A2, A3, A4, A5])

# Build B matrix for LSS


Bbar = np.vstack([nk0, nk0, B, F, H])

# Build G matrix for LSS


# Order of observation is: [xt, yt, mt, st, tt]
# Selector for x_{t}
G1 = np.hstack([nx0c, nx0c, np.eye(nx), nyx0m.T, nyx0m.T])
G2 = np.hstack([ny0c, ny0c, nyx0m, ny1m, ny0m]) # Selector for y_{t}
# Selector for martingale
G3 = np.hstack([ny0c, ny0c, nyx0m, ny0m, ny1m])
G4 = np.hstack([ny0c, ny0c, -g, ny0m, ny0m]) # Selector for stationary
G5 = np.hstack([ny0c, ν, nyx0m, ny0m, ny0m]) # Selector for trend
Gbar = np.vstack([G1, G2, G3, G4, G5])

# Build H matrix for LSS


Hbar = np.zeros((Gbar.shape[0], nk))

# Build LSS type


x0 = np.hstack([1, 0, nx0r, ny0r, ny0r])
S0 = np.zeros((len(x0), len(x0)))
lss = qe.LinearStateSpace(Abar, Bbar, Gbar, Hbar, mu_0=x0, Sigma_0=S0)


return lss

def additive_decomp(self):
"""
Return values for the martingale decomposition
- ν : unconditional mean difference in Y
- H : coefficient for the (linear) martingale component (κ_a)
- g : coefficient for the stationary component g(x)
- Y_0 : it should be the function of X_0 (for now set it to 0.0)
"""
I = np.identity(self.nx)
A_res = la.solve(I - self.A, I)
g = self.D @ A_res
H = self.F + self.D @ A_res @ self.B

return self.ν, H, g

def multiplicative_decomp(self):
"""
Return values for the multiplicative decomposition (Example 5.4.4.)
- ν_tilde : eigenvalue
- H : vector for the Jensen term
"""
ν, H, g = self.additive_decomp()
ν_tilde = ν + (.5)*np.expand_dims(np.diag(H @ H.T), 1)

return ν_tilde, H, g

def loglikelihood_path(self, x, y):


A, B, D, F = self.A, self.B, self.D, self.F
k, T = y.shape
FF = F @ F.T
FFinv = la.inv(FF)
temp = y[:, 1:] - y[:, :-1] - D @ x[:, :-1]
obs = temp * FFinv * temp
obssum = np.cumsum(obs)
scalar = (np.log(la.det(FF)) + k*np.log(2*np.pi))*np.arange(1, T)

return -(.5)*(obssum + scalar)

def loglikelihood(self, x, y):


llh = self.loglikelihood_path(x, y)

return llh[-1]


Plotting

The code below adds some functions that generate plots for instances of the AMF_LSS_VAR class.

def plot_given_paths(amf, T, ypath, mpath, spath, tpath,


mbounds, sbounds, horline=0, show_trend=True):

# Allocate space
trange = np.arange(T)

# Create figure
fig, ax = plt.subplots(2, 2, sharey=True, figsize=(15, 8))

# Plot all paths together


ax[0, 0].plot(trange, ypath[0, :], label="$y_t$", color="k")
ax[0, 0].plot(trange, mpath[0, :], label="$m_t$", color="m")
ax[0, 0].plot(trange, spath[0, :], label="$s_t$", color="g")
if show_trend:
ax[0, 0].plot(trange, tpath[0, :], label="$t_t$", color="r")
ax[0, 0].axhline(horline, color="k", linestyle="-.")
ax[0, 0].set_title("One Path of All Variables")
ax[0, 0].legend(loc="upper left")

# Plot Martingale Component


ax[0, 1].plot(trange, mpath[0, :], "m")
ax[0, 1].plot(trange, mpath.T, alpha=0.45, color="m")
ub = mbounds[1, :]
lb = mbounds[0, :]

ax[0, 1].fill_between(trange, lb, ub, alpha=0.25, color="m")


ax[0, 1].set_title("Martingale Components for Many Paths")
ax[0, 1].axhline(horline, color="k", linestyle="-.")

# Plot Stationary Component


ax[1, 0].plot(spath[0, :], color="g")
ax[1, 0].plot(spath.T, alpha=0.25, color="g")
ub = sbounds[1, :]
lb = sbounds[0, :]
ax[1, 0].fill_between(trange, lb, ub, alpha=0.25, color="g")
ax[1, 0].axhline(horline, color="k", linestyle="-.")
ax[1, 0].set_title("Stationary Components for Many Paths")

# Plot Trend Component


if show_trend:
ax[1, 1].plot(tpath.T, color="r")
ax[1, 1].set_title("Trend Components for Many Paths")
ax[1, 1].axhline(horline, color="k", linestyle="-.")

return fig

def plot_additive(amf, T, npaths=25, show_trend=True):


"""
Plots for the additive decomposition.
Acts on an instance amf of the AMF_LSS_VAR class

"""
# Pull out right sizes so we know how to increment

nx, nk, nm = amf.nx, amf.nk, amf.nm

# Allocate space (nm is the number of additive functionals -


# we want npaths for each)
mpath = np.empty((nm*npaths, T))
mbounds = np.empty((nm*2, T))
spath = np.empty((nm*npaths, T))
sbounds = np.empty((nm*2, T))
tpath = np.empty((nm*npaths, T))
ypath = np.empty((nm*npaths, T))

# Simulate for as long as we wanted


moment_generator = amf.lss.moment_sequence()
# Pull out population moments
for t in range (T):
tmoms = next(moment_generator)
ymeans = tmoms[1]
yvar = tmoms[3]

# Lower and upper bounds - for each additive functional


for ii in range(nm):
li, ui = ii*2, (ii+1)*2
mscale = np.sqrt(yvar[nx+nm+ii, nx+nm+ii])
sscale = np.sqrt(yvar[nx+2*nm+ii, nx+2*nm+ii])
if mscale == 0.0:
mscale = 1e-12 # avoids a RuntimeWarning from calculating ppf
if sscale == 0.0: # of normal distribution with std dev = 0.
sscale = 1e-12 # sets std dev to small value instead

madd_dist = norm(ymeans[nx+nm+ii], mscale)


sadd_dist = norm(ymeans[nx+2*nm+ii], sscale)

mbounds[li:ui, t] = madd_dist.ppf([0.01, .99])


sbounds[li:ui, t] = sadd_dist.ppf([0.01, .99])

# Pull out paths


for n in range(npaths):
x, y = amf.lss.simulate(T)
for ii in range(nm):
ypath[npaths*ii+n, :] = y[nx+ii, :]
mpath[npaths*ii+n, :] = y[nx+nm + ii, :]
spath[npaths*ii+n, :] = y[nx+2*nm + ii, :]
tpath[npaths*ii+n, :] = y[nx+3*nm + ii, :]

add_figs = []

for ii in range(nm):
li, ui = npaths*(ii), npaths*(ii+1)
LI, UI = 2*(ii), 2*(ii+1)
add_figs.append(plot_given_paths(amf, T,
ypath[li:ui,:],
mpath[li:ui,:],
spath[li:ui,:],
tpath[li:ui,:],
mbounds[LI:UI,:],
sbounds[LI:UI,:],

show_trend=show_trend))

add_figs[ii].suptitle(f'Additive decomposition of $y_{ii+1}$',


fontsize=14)

return add_figs

def plot_multiplicative(amf, T, npaths=25, show_trend=True):


"""
Plots for the multiplicative decomposition

"""
# Pull out right sizes so we know how to increment
nx, nk, nm = amf.nx, amf.nk, amf.nm
# Matrices for the multiplicative decomposition
ν_tilde, H, g = amf.multiplicative_decomp()

# Allocate space (nm is the number of functionals -


# we want npaths for each)
mpath_mult = np.empty((nm*npaths, T))
mbounds_mult = np.empty((nm*2, T))
spath_mult = np.empty((nm*npaths, T))
sbounds_mult = np.empty((nm*2, T))
tpath_mult = np.empty((nm*npaths, T))
ypath_mult = np.empty((nm*npaths, T))

# Simulate for as long as we wanted


moment_generator = amf.lss.moment_sequence()
# Pull out population moments
for t in range(T):
tmoms = next(moment_generator)
ymeans = tmoms[1]
yvar = tmoms[3]

# Lower and upper bounds - for each multiplicative functional


for ii in range(nm):
li, ui = ii*2, (ii+1)*2
Mdist = lognorm(np.sqrt(yvar[nx+nm+ii, nx+nm+ii]).item(),
scale=np.exp(ymeans[nx+nm+ii] \
- t * (.5)
* np.expand_dims(
np.diag(H @ H.T),
1
)[ii]
).item()
)
Sdist = lognorm(np.sqrt(yvar[nx+2*nm+ii, nx+2*nm+ii]).item(),
scale = np.exp(-ymeans[nx+2*nm+ii]).item())
mbounds_mult[li:ui, t] = Mdist.ppf([.01, .99])
sbounds_mult[li:ui, t] = Sdist.ppf([.01, .99])

# Pull out paths


for n in range(npaths):
x, y = amf.lss.simulate(T)
for ii in range(nm):


ypath_mult[npaths*ii+n, :] = np.exp(y[nx+ii, :])
mpath_mult[npaths*ii+n, :] = np.exp(y[nx+nm + ii, :] \
- np.arange(T)*(.5)
* np.expand_dims(np.diag(H
@ H.T),
1)[ii]
)
spath_mult[npaths*ii+n, :] = 1/np.exp(-y[nx+2*nm + ii, :])
tpath_mult[npaths*ii+n, :] = np.exp(y[nx+3*nm + ii, :]
+ np.arange(T)*(.5)
* np.expand_dims(np.diag(H
@ H.T),
1)[ii]
)

mult_figs = []

for ii in range(nm):
li, ui = npaths*(ii), npaths*(ii+1)
LI, UI = 2*(ii), 2*(ii+1)

mult_figs.append(plot_given_paths(amf,T,
ypath_mult[li:ui,:],
mpath_mult[li:ui,:],
spath_mult[li:ui,:],
tpath_mult[li:ui,:],
mbounds_mult[LI:UI,:],
sbounds_mult[LI:UI,:],
1,
show_trend=show_trend))
mult_figs[ii].suptitle(f'Multiplicative decomposition of \
$y_{ii+1}$', fontsize=14)

return mult_figs

def plot_martingale_paths(amf, T, mpath, mbounds, horline=1, show_trend=False):


# Allocate space
trange = np.arange(T)

# Create figure
fig, ax = plt.subplots(1, 1, figsize=(10, 6))

# Plot Martingale Component


ub = mbounds[1, :]
lb = mbounds[0, :]
ax.fill_between(trange, lb, ub, color="#ffccff")
ax.axhline(horline, color="k", linestyle="-.")
ax.plot(trange, mpath.T, linewidth=0.25, color="#4c4c4c")

return fig

def plot_martingales(amf, T, npaths=25):

# Pull out right sizes so we know how to increment


nx, nk, nm = amf.nx, amf.nk, amf.nm
# Matrices for the multiplicative decomposition


ν_tilde, H, g = amf.multiplicative_decomp()

# Allocate space (nm is the number of functionals -


# we want npaths for each)
mpath_mult = np.empty((nm*npaths, T))
mbounds_mult = np.empty((nm*2, T))

# Simulate for as long as we wanted


moment_generator = amf.lss.moment_sequence()
# Pull out population moments
for t in range (T):
tmoms = next(moment_generator)
ymeans = tmoms[1]
yvar = tmoms[3]

# Lower and upper bounds - for each functional


for ii in range(nm):
li, ui = ii*2, (ii+1)*2
Mdist = lognorm(np.sqrt(yvar[nx+nm+ii, nx+nm+ii]).item(),
scale= np.exp(ymeans[nx+nm+ii] \
- t * (.5)
* np.expand_dims(
np.diag(H @ H.T),
1)[ii]

).item()
)
mbounds_mult[li:ui, t] = Mdist.ppf([.01, .99])

# Pull out paths


for n in range(npaths):
x, y = amf.lss.simulate(T)
for ii in range(nm):
mpath_mult[npaths*ii+n, :] = np.exp(y[nx+nm + ii, :] \
- np.arange(T) * (.5)
* np.expand_dims(np.diag(H
@ H.T),
1)[ii]
)

mart_figs = []

for ii in range(nm):
li, ui = npaths*(ii), npaths*(ii+1)
LI, UI = 2*(ii), 2*(ii+1)
mart_figs.append(plot_martingale_paths(amf, T, mpath_mult[li:ui, :],
mbounds_mult[LI:UI, :],
horline=1))
mart_figs[ii].suptitle(f'Martingale components for many paths of \
$y_{ii+1}$', fontsize=14)

return mart_figs

For now, we just plot 𝑦𝑡 and 𝑥𝑡 , postponing until later a description of exactly how we compute them.


ϕ_1, ϕ_2, ϕ_3, ϕ_4 = 0.5, -0.2, 0, 0.5


σ = 0.01
ν = 0.01 # Growth rate

# A matrix should be n x n
A = np.array([[ϕ_1, ϕ_2, ϕ_3, ϕ_4],
[ 1, 0, 0, 0],
[ 0, 1, 0, 0],
[ 0, 0, 1, 0]])

# B matrix should be n x k
B = np.array([[σ, 0, 0, 0]]).T

D = np.array([1, 0, 0, 0]) @ A
F = np.array([1, 0, 0, 0]) @ B

amf = AMF_LSS_VAR(A, B, D, F, ν=ν)

T = 150
x, y = amf.lss.simulate(T)

fig, ax = plt.subplots(2, 1, figsize=(10, 9))

ax[0].plot(np.arange(T), y[amf.nx, :], color='k')


ax[0].set_title('Path of $y_t$')
ax[1].plot(np.arange(T), y[0, :], color='g')
ax[1].axhline(0, color='k', linestyle='-.')
ax[1].set_title('Associated path of $x_t$')
plt.show()


Notice the irregular but persistent growth in 𝑦𝑡 .

31.3.2 Decomposition

Hansen and Sargent [Hansen and Sargent, 2024] describe how to construct a decomposition of an additive functional into
four parts:
• a constant inherited from initial values 𝑥0 and 𝑦0
• a linear trend
• a martingale
• an (asymptotically) stationary component
To attain this decomposition for the particular class of additive functionals defined by (31.1) and (31.2), we first construct the matrices

$$H := F + D(I - A)^{-1} B$$
$$g := D(I - A)^{-1}$$


Then the Hansen [Hansen, 2012], [Hansen and Sargent, 2024] decomposition is

$$y_t = \underbrace{t\nu}_{\text{trend component}} + \overbrace{\sum_{j=1}^{t} H z_j}^{\text{Martingale component}} - \underbrace{g x_t}_{\text{stationary component}} + \overbrace{g x_0 + y_0}^{\text{initial conditions}}$$

At this stage, you should pause and verify that 𝑦𝑡+1 − 𝑦𝑡 satisfies (31.2).
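Here is one way to carry out that verification, using (31.1) and the definitions of $H$ and $g$ above:

$$
\begin{aligned}
y_{t+1} - y_t &= \nu + H z_{t+1} - g\,(x_{t+1} - x_t) \\
&= \nu + H z_{t+1} - D(I - A)^{-1}\big[(A - I)x_t + B z_{t+1}\big] \\
&= \nu + D x_t + \big[H - D(I - A)^{-1} B\big] z_{t+1} \\
&= \nu + D x_t + F z_{t+1},
\end{aligned}
$$

which is exactly (31.2).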
It is convenient for us to introduce the following notation:
• $\tau_t = \nu t$, a linear, deterministic trend
• $m_t = \sum_{j=1}^{t} H z_j$, a martingale with time $t+1$ increment $H z_{t+1}$
• $s_t = g x_t$, an (asymptotically) stationary component
We want to characterize and simulate components 𝜏𝑡 , 𝑚𝑡 , 𝑠𝑡 of the decomposition.
A convenient way to do this is to construct an appropriate instance of a linear state space system by using LinearStateSpace
from QuantEcon.py.
This will allow us to use the routines in LinearStateSpace to study dynamics.
To start, observe that, under the dynamics in (31.1) and (31.2) and with the definitions just given,

$$
\begin{bmatrix} 1 \\ t+1 \\ x_{t+1} \\ y_{t+1} \\ m_{t+1} \end{bmatrix} =
\begin{bmatrix}
1 & 0 & 0 & 0 & 0 \\
1 & 1 & 0 & 0 & 0 \\
0 & 0 & A & 0 & 0 \\
\nu & 0 & D & 1 & 0 \\
0 & 0 & 0 & 0 & 1
\end{bmatrix}
\begin{bmatrix} 1 \\ t \\ x_t \\ y_t \\ m_t \end{bmatrix} +
\begin{bmatrix} 0 \\ 0 \\ B \\ F \\ H \end{bmatrix} z_{t+1}
$$

and

$$
\begin{bmatrix} x_t \\ y_t \\ \tau_t \\ m_t \\ s_t \end{bmatrix} =
\begin{bmatrix}
0 & 0 & I & 0 & 0 \\
0 & 0 & 0 & 1 & 0 \\
0 & \nu & 0 & 0 & 0 \\
0 & 0 & 0 & 0 & 1 \\
0 & 0 & -g & 0 & 0
\end{bmatrix}
\begin{bmatrix} 1 \\ t \\ x_t \\ y_t \\ m_t \end{bmatrix}
$$

With

$$
\tilde{x} := \begin{bmatrix} 1 \\ t \\ x_t \\ y_t \\ m_t \end{bmatrix}
\quad \text{and} \quad
\tilde{y} := \begin{bmatrix} x_t \\ y_t \\ \tau_t \\ m_t \\ s_t \end{bmatrix}
$$

we can write this as the linear state space system

$$\tilde{x}_{t+1} = \tilde{A} \tilde{x}_t + \tilde{B} z_{t+1}$$

$$\tilde{y}_t = \tilde{D} \tilde{x}_t$$

By picking out components of $\tilde{y}_t$, we can track all variables of interest.
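As a quick sanity check, the sketch below reuses the amf instance constructed in the simulation above (whose initial conditions are $x_0 = 0$ and $y_0 = 0$, so the initial-conditions term vanishes) and verifies the decomposition numerically:

nx, nm = amf.nx, amf.nm
x, y = amf.lss.simulate(250)

y_t = y[nx, :]            # the additive functional itself
m_t = y[nx + nm, :]       # martingale component
s_t = y[nx + 2*nm, :]     # stationary component, recorded as -g x_t
τ_t = y[nx + 3*nm, :]     # linear trend ν t

print(np.allclose(y_t, τ_t + m_t + s_t))   # True up to floating point error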


31.4 Code

The class AMF_LSS_VAR mentioned above does all that we want to study our additive functional.
In fact, AMF_LSS_VAR does more because it allows us to study an associated multiplicative functional as well.
(A hint that it does more is the name of the class – here AMF stands for “additive and multiplicative functional” – the
code computes and displays objects associated with multiplicative functionals too.)
Let’s use this code (embedded above) to explore the example process described above.
If you run the code that simulated that example again and then the method call below, you will generate (modulo randomness) the following plot.

plot_additive(amf, T)
plt.show()

When we plot multiple realizations of a component in the 2nd, 3rd, and 4th panels, we also plot the population 95%
probability coverage sets computed using the LinearStateSpace class.
We have chosen to simulate many paths, all starting from the same non-random initial conditions 𝑥0 , 𝑦0 (you can tell this
from the shape of the 95% probability coverage shaded areas).
Notice the tell-tale signs of these probability coverage shaded areas:

• the purple one for the martingale component 𝑚𝑡 grows with 𝑡
• the green one for the stationary component 𝑠𝑡 converges to a constant band


31.4.1 Associated Multiplicative Functional

Where {𝑦𝑡 } is our additive functional, let 𝑀𝑡 = exp(𝑦𝑡 ).


As mentioned above, the process {𝑀𝑡 } is called a multiplicative functional.
Corresponding to the additive decomposition described above, we have a multiplicative decomposition of $M_t / M_0$:

$$\frac{M_t}{M_0} = \exp(t\nu)\, \exp\Big(\sum_{j=1}^{t} H \cdot z_j\Big)\, \exp\big(D(I-A)^{-1} x_0 - D(I-A)^{-1} x_t\big)$$

or

$$\frac{M_t}{M_0} = \exp(\tilde{\nu} t) \Big(\frac{\widetilde{M}_t}{\widetilde{M}_0}\Big) \Big(\frac{\tilde{e}(x_0)}{\tilde{e}(x_t)}\Big)$$

where

$$\tilde{\nu} = \nu + \frac{H \cdot H}{2}, \qquad \widetilde{M}_t = \exp\Big(\sum_{j=1}^{t}\Big(H \cdot z_j - \frac{H \cdot H}{2}\Big)\Big), \qquad \widetilde{M}_0 = 1$$

and

$$\tilde{e}(x) = \exp[g(x)] = \exp[D(I - A)^{-1} x]$$

An instance of class AMF_LSS_VAR (above) includes this associated multiplicative functional as an attribute.
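The Jensen adjustment $\tilde{\nu} = \nu + \frac{H \cdot H}{2}$ reflects the identity $E\exp(H \cdot z) = \exp(H \cdot H / 2)$ for $z \sim N(0, I)$. Here is a small Monte Carlo sketch of that identity, reusing the amf instance from above:

ν_tilde, H, g = amf.multiplicative_decomp()

z = np.random.randn(1_000_000)
print(np.exp(0.5 * (H @ H.T)[0, 0]))    # analytical Jensen factor
print(np.mean(np.exp(H[0, 0] * z)))     # Monte Carlo estimate of E exp(H z)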
Let’s plot this multiplicative functional for our example.
If you run the code that simulated that example again and then the method call in the cell below, you'll obtain the graph that follows.

plot_multiplicative(amf, T)
plt.show()


As before, when we plotted multiple realizations of a component in the 2nd, 3rd, and 4th panels, we also plotted population
95% confidence bands computed using the LinearStateSpace class.
Comparing this figure and the last also helps show how geometric growth differs from arithmetic growth.
The top right panel of the above graph shows a panel of martingales associated with the panel of 𝑀𝑡 = exp(𝑦𝑡 ) that we
have generated for a limited horizon 𝑇 .
It is interesting to see how the martingale behaves as $T \to +\infty$.
Let’s see what happens when we set 𝑇 = 12000 instead of 150.

31.4.2 Peculiar Large Sample Property

Hansen and Sargent [Hansen and Sargent, 2024] (ch. 8) describe the following two properties of the martingale component $\widetilde{M}_t$ of the multiplicative decomposition

• while $E_0 \widetilde{M}_t = 1$ for all $t \geq 0$, nevertheless …
• as $t \to +\infty$, $\widetilde{M}_t$ converges to zero almost surely

The first property follows from the fact that $\widetilde{M}_t$ is a multiplicative martingale with initial condition $\widetilde{M}_0 = 1$.

The second is a peculiar property noted and proved by Hansen and Sargent [Hansen and Sargent, 2024].
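A quick heuristic for the second property: applying the law of large numbers to $\log \widetilde{M}_t$ gives

$$\frac{1}{t} \log \widetilde{M}_t = \frac{1}{t} \sum_{j=1}^{t} \Big(H \cdot z_j - \frac{H \cdot H}{2}\Big) \;\to\; -\frac{H \cdot H}{2} < 0 \quad \text{almost surely},$$

so $\log \widetilde{M}_t \to -\infty$ and hence $\widetilde{M}_t \to 0$ almost surely, even though $E_0 \widetilde{M}_t = 1$ for every $t$.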
The following simulation of many paths of $\widetilde{M}_t$ illustrates both properties

np.random.seed(10021987)
plot_martingales(amf, 12000)
plt.show()


The dotted line in the above graph is the mean $E \widetilde{M}_t = 1$ of the martingale.
It remains constant at unity, illustrating the first property.
The purple 95 percent frequency coverage interval collapses around zero, illustrating the second property.

31.5 More About the Multiplicative Martingale

Let's drill down and study the probability distribution of the multiplicative martingale $\{\widetilde{M}_t\}_{t=0}^{\infty}$ in more detail.

As we have seen, it has representation

$$\widetilde{M}_t = \exp\Big(\sum_{j=1}^{t}\Big(H \cdot z_j - \frac{H \cdot H}{2}\Big)\Big), \qquad \widetilde{M}_0 = 1$$

where $H = [F + D(I - A)^{-1} B]$.

It follows that $\log \widetilde{M}_t \sim \mathcal{N}\big(-\tfrac{t H \cdot H}{2},\; t H \cdot H\big)$ and that consequently $\widetilde{M}_t$ is log normal.
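This log normality also makes the first property of the previous section transparent: the mean of a log normal variable with these parameters is

$$E \widetilde{M}_t = \exp\Big(-\frac{t H \cdot H}{2} + \frac{t H \cdot H}{2}\Big) = 1.$$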

31.5.1 Simulating a Multiplicative Martingale Again

Next, we want a program to simulate the likelihood ratio process $\{\widetilde{M}_t\}_{t=0}^{\infty}$.

In particular, we want to simulate 5000 sample paths of length $T$ for the case in which $x$ is a scalar and $[A, B, D, F] = [0.8, 0.001, 1.0, 0.01]$ and $\nu = 0.005$.

After accomplishing this, we want to display and study histograms of $\widetilde{M}_T^i$ for various values of $T$.
Here is code that accomplishes these tasks.

31.5.2 Sample Paths

Let's write a program to simulate sample paths of $\{x_t, y_t\}_{t=0}^{\infty}$.

We’ll do this by formulating the additive functional as a linear state space model and putting the LinearStateSpace class
to work.

class AMF_LSS_VAR:
"""
This class is written to transform a scalar additive functional
into a linear state space system.
"""
def __init__(self, A, B, D, F=0.0, ν=0.0):
# Unpack required elements
self.A, self.B, self.D, self.F, self.ν = A, B, D, F, ν

# Create space for additive decomposition


self.add_decomp = None
self.mult_decomp = None

# Construct BIG state space representation


self.lss = self.construct_ss()

def construct_ss(self):

"""
This creates the state space representation that can be passed
into the quantecon LSS class.
"""
# Pull out useful info
A, B, D, F, ν = self.A, self.B, self.D, self.F, self.ν
nx, nk, nm = 1, 1, 1
if self.add_decomp:
ν, H, g = self.add_decomp
else:
ν, H, g = self.additive_decomp()

# Build A matrix for LSS


# Order of states is: [1, t, xt, yt, mt]
A1 = np.hstack([1, 0, 0, 0, 0]) # Transition for 1
A2 = np.hstack([1, 1, 0, 0, 0]) # Transition for t
A3 = np.hstack([0, 0, A, 0, 0]) # Transition for x_{t+1}
A4 = np.hstack([ν, 0, D, 1, 0]) # Transition for y_{t+1}
A5 = np.hstack([0, 0, 0, 0, 1]) # Transition for m_{t+1}
Abar = np.vstack([A1, A2, A3, A4, A5])

# Build B matrix for LSS


Bbar = np.vstack([0, 0, B, F, H])

# Build G matrix for LSS


# Order of observation is: [xt, yt, mt, st, tt]
G1 = np.hstack([0, 0, 1, 0, 0]) # Selector for x_{t}
G2 = np.hstack([0, 0, 0, 1, 0]) # Selector for y_{t}
G3 = np.hstack([0, 0, 0, 0, 1]) # Selector for martingale
G4 = np.hstack([0, 0, -g, 0, 0]) # Selector for stationary
G5 = np.hstack([0, ν, 0, 0, 0]) # Selector for trend
Gbar = np.vstack([G1, G2, G3, G4, G5])

# Build H matrix for LSS


Hbar = np.zeros((1, 1))

# Build LSS type


x0 = np.hstack([1, 0, 0, 0, 0])
S0 = np.zeros((5, 5))
lss = qe.LinearStateSpace(Abar, Bbar, Gbar, Hbar,
mu_0=x0, Sigma_0=S0)

return lss

def additive_decomp(self):
"""
Return values for the martingale decomposition (Proposition 4.3.3.)
- ν : unconditional mean difference in Y
- H : coefficient for the (linear) martingale component (kappa_a)
- g : coefficient for the stationary component g(x)
- Y_0 : it should be the function of X_0 (for now set it to 0.0)
"""
A_res = 1 / (1 - self.A)
g = self.D * A_res
H = self.F + self.D * A_res * self.B


return self.ν, H, g

def multiplicative_decomp(self):
"""
Return values for the multiplicative decomposition (Example 5.4.4.)
- ν_tilde : eigenvalue
- H : vector for the Jensen term
"""
ν, H, g = self.additive_decomp()
ν_tilde = ν + (.5) * H**2

return ν_tilde, H, g

def loglikelihood_path(self, x, y):


A, B, D, F = self.A, self.B, self.D, self.F
T = y.T.size
FF = F**2
FFinv = 1 / FF
temp = y[1:] - y[:-1] - D * x[:-1]
obs = temp * FFinv * temp
obssum = np.cumsum(obs)
scalar = (np.log(FF) + np.log(2 * np.pi)) * np.arange(1, T)

return (-0.5) * (obssum + scalar)

def loglikelihood(self, x, y):


llh = self.loglikelihood_path(x, y)

return llh[-1]

The heavy lifting is done inside the AMF_LSS_VAR class.


The following code adds some simple functions that make it straightforward to generate sample paths from an instance
of AMF_LSS_VAR.

def simulate_xy(amf, T):


"Simulate individual paths."
foo, bar = amf.lss.simulate(T)
x = bar[0, :]
y = bar[1, :]

return x, y

def simulate_paths(amf, T=150, I=5000):


"Simulate multiple independent paths."

# Allocate space
storeX = np.empty((I, T))
storeY = np.empty((I, T))

for i in range(I):
# Do specific simulation
x, y = simulate_xy(amf, T)

# Fill in our storage matrices


storeX[i, :] = x

storeY[i, :] = y

return storeX, storeY

def population_means(amf, T=150):


# Allocate Space
xmean = np.empty(T)
ymean = np.empty(T)

# Pull out moment generator


moment_generator = amf.lss.moment_sequence()

for tt in range (T):


tmoms = next(moment_generator)
ymeans = tmoms[1]
xmean[tt] = ymeans[0]
ymean[tt] = ymeans[1]

return xmean, ymean

Now that we have these functions in our toolkit, let’s apply them to run some simulations.

def simulate_martingale_components(amf, T=1000, I=5000):


# Get the multiplicative decomposition
ν, H, g = amf.multiplicative_decomp()

# Allocate space
add_mart_comp = np.empty((I, T))

# Simulate and pull out additive martingale component


for i in range(I):
foo, bar = amf.lss.simulate(T)

# Martingale component is third component


add_mart_comp[i, :] = bar[2, :]

mul_mart_comp = np.exp(add_mart_comp - (np.arange(T) * H**2)/2)

return add_mart_comp, mul_mart_comp

# Build model
amf_2 = AMF_LSS_VAR(0.8, 0.001, 1.0, 0.01,.005)

amc, mmc = simulate_martingale_components(amf_2, 1000, 5000)

amcT = amc[:, -1]


mmcT = mmc[:, -1]

print("The (min, mean, max) of additive Martingale component in period T is")


print(f"\t ({np.min(amcT)}, {np.mean(amcT)}, {np.max(amcT)})")

print("The (min, mean, max) of multiplicative Martingale component \


in period T is")
print(f"\t ({np.min(mmcT)}, {np.mean(mmcT)}, {np.max(mmcT)})")


The (min, mean, max) of additive Martingale component in period T is
	 (-1.8379907335579106, 0.011040789361757435, 1.4697384727035145)
The (min, mean, max) of multiplicative Martingale component in period T is
	 (0.14222026893384476, 1.006753060146832, 3.8858858377907133)

Let's plot the probability density functions for $\log \widetilde{M}_t$ for $t = 100, 500, 1000, 10000, 100000$.

Then let's use the plots to investigate how these densities evolve through time.

We will plot the densities of $\log \widetilde{M}_t$ for different values of $t$.

Note: scipy.stats.lognorm expects you to pass the standard deviation of the underlying normal first ($\sqrt{t H \cdot H}$) and then the exponentiated mean as a keyword argument scale (scale=np.exp(-t * H2 / 2)).
• See the documentation here.
This is peculiar, so make sure you are careful in working with the log normal distribution.
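A minimal sanity check of that parameterization (with arbitrary illustrative numbers):

from scipy.stats import lognorm
import numpy as np

m, s = -0.3, 0.7                      # mean and std dev of the underlying normal
dist = lognorm(s, scale=np.exp(m))    # shape = std dev, scale = exp(mean)

print(dist.median(), np.exp(m))              # both equal exp(mean)
print(dist.mean(), np.exp(m + s**2 / 2))     # both equal exp(mean + variance/2)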

Here is some code that tackles these tasks

def Mtilde_t_density(amf, t, xmin=1e-8, xmax=5.0, npts=5000):

# Pull out the multiplicative decomposition


νtilde, H, g = amf.multiplicative_decomp()
H2 = H*H

# The distribution
mdist = lognorm(np.sqrt(t*H2), scale=np.exp(-t*H2/2))
x = np.linspace(xmin, xmax, npts)
pdf = mdist.pdf(x)

return x, pdf

def logMtilde_t_density(amf, t, xmin=-15.0, xmax=15.0, npts=5000):

# Pull out the multiplicative decomposition


νtilde, H, g = amf.multiplicative_decomp()
H2 = H*H

# The distribution
lmdist = norm(-t*H2/2, np.sqrt(t*H2))
x = np.linspace(xmin, xmax, npts)
pdf = lmdist.pdf(x)

return x, pdf

times_to_plot = [10, 100, 500, 1000, 2500, 5000]


dens_to_plot = map(lambda t: Mtilde_t_density(amf_2, t, xmin=1e-8, xmax=6.0),
times_to_plot)
ldens_to_plot = map(lambda t: logMtilde_t_density(amf_2, t, xmin=-10.0,
xmax=10.0), times_to_plot)

fig, ax = plt.subplots(3, 2, figsize=(14, 14))


ax = ax.flatten()
fig.suptitle(r"Densities of $\tilde{M}_t$", fontsize=18, y=1.02)


for (it, dens_t) in enumerate(dens_to_plot):
x, pdf = dens_t
ax[it].set_title(f"Density for time {times_to_plot[it]}")
ax[it].fill_between(x, np.zeros_like(pdf), pdf)

plt.tight_layout()
plt.show()

These probability density functions help us understand the mechanics underlying the peculiar property of our multiplicative martingale
• As 𝑇 grows, most of the probability mass shifts leftward toward zero.


• For example, note that most mass is near 1 for $T = 10$ or $T = 100$ but most of it is near 0 for $T = 5000$.
• As $T$ grows, the tail of the density of $\widetilde{M}_T$ lengthens toward the right.
• Enough mass moves toward the right tail to keep $E \widetilde{M}_T = 1$ even as most mass in the distribution of $\widetilde{M}_T$ collapses around 0.

31.5.3 Multiplicative Martingale as Likelihood Ratio Process

This lecture studies likelihood processes and likelihood ratio processes.


A likelihood ratio process is a multiplicative martingale with mean unity.
Likelihood ratio processes exhibit the peculiar property that naturally also appears here.



CHAPTER

THIRTYTWO

CLASSICAL CONTROL WITH LINEAR ALGEBRA

32.1 Overview

In an earlier lecture Linear Quadratic Dynamic Programming Problems, we have studied how to solve a special class
of dynamic optimization and prediction problems by applying the method of dynamic programming. In this class of
problems
• the objective function is quadratic in states and controls.
• the one-step transition function is linear.
• shocks are IID Gaussian or martingale differences.
In this lecture and a companion lecture Classical Filtering with Linear Algebra, we study the classical theory of linear-
quadratic (LQ) optimal control problems.
The classical approach does not use the two closely related methods – dynamic programming and Kalman filtering –
that we describe in other lectures, namely, Linear Quadratic Dynamic Programming Problems and A First Look at the
Kalman Filter.
Instead, it uses either
• 𝑧-transform and lag operator methods, or
• matrix decompositions applied to linear systems of first-order conditions for optimum problems.
In this lecture and the sequel Classical Filtering with Linear Algebra, we mostly rely on elementary linear algebra.
The main tool from linear algebra we’ll put to work here is LU decomposition.
We’ll begin with discrete horizon problems.
Then we’ll view infinite horizon problems as appropriate limits of these finite horizon problems.
Later, we will examine the close connection between LQ control and least-squares prediction and filtering problems.
These classes of problems are connected in the sense that to solve each, essentially the same mathematics is used.
Let’s start with some standard imports:

import numpy as np
import matplotlib.pyplot as plt


32.1.1 References

Useful references include [Whittle, 1963], [Hansen and Sargent, 1980], [Orfanidis, 1988], [Athanasios and Pillai, 1991],
and [Muth, 1960].

32.2 A Control Problem

Let $L$ be the lag operator, so that, for a sequence $\{x_t\}$, we have $L x_t = x_{t-1}$.
More generally, let 𝐿𝑘 𝑥𝑡 = 𝑥𝑡−𝑘 with 𝐿0 𝑥𝑡 = 𝑥𝑡 and

𝑑(𝐿) = 𝑑0 + 𝑑1 𝐿 + … + 𝑑𝑚 𝐿𝑚

where 𝑑0 , 𝑑1 , … , 𝑑𝑚 is a given scalar sequence.
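As a concrete illustration (a sketch with made-up coefficients), applying $d(L)$ to a sequence simply forms the moving sum $d_0 x_t + d_1 x_{t-1} + \cdots + d_m x_{t-m}$:

import numpy as np

d = np.array([1.0, -0.8, 0.2])    # d(L) = 1 - 0.8 L + 0.2 L²,  so m = 2
x = np.random.randn(10)

# d(L)x_t for t = m, ..., len(x) - 1
dLx = np.array([d @ x[t - 2:t + 1][::-1] for t in range(2, len(x))])

# The same thing via convolution ('valid' keeps only fully overlapping terms)
print(np.allclose(dLx, np.convolve(x, d, mode='valid')))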


Consider the discrete-time control problem

$$\max_{\{y_t\}} \lim_{N \to \infty} \sum_{t=0}^{N} \beta^t \Big\{ a_t y_t - \frac{1}{2} h y_t^2 - \frac{1}{2}\big[d(L) y_t\big]^2 \Big\}, \tag{32.1}$$

where

• $h$ is a positive parameter and $\beta \in (0, 1)$ is a discount factor.
• $\{a_t\}_{t \geq 0}$ is a sequence of exponential order less than $\beta^{-1/2}$, by which we mean $\lim_{t \to \infty} \beta^{t/2} a_t = 0$.

Maximization in (32.1) is subject to initial conditions for $y_{-1}, y_{-2}, \ldots, y_{-m}$.

Maximization is over infinite sequences $\{y_t\}_{t \geq 0}$.

32.2.1 Example

The formulation of the LQ problem given above is broad enough to encompass many useful models.
As a simple illustration, recall that in LQ Control: Foundations we consider a monopolist facing stochastic demand shocks
and adjustment costs.
Let's consider a deterministic version of this problem, where the monopolist maximizes the discounted sum

$$\sum_{t=0}^{\infty} \beta^t \pi_t$$

and

$$\pi_t = p_t q_t - c q_t - \gamma (q_{t+1} - q_t)^2 \quad \text{with} \quad p_t = \alpha_0 - \alpha_1 q_t + d_t$$

In this expression, $q_t$ is output, $c$ is average cost of production, and $d_t$ is a demand shock.

The term $\gamma(q_{t+1} - q_t)^2$ represents adjustment costs.

You will be able to confirm that the objective function can be rewritten as (32.1) (a sketch of the algebra follows this list) when

• $a_t := \alpha_0 + d_t - c$
• $h := 2\alpha_1$
• $d(L) := \sqrt{2\gamma}(I - L)$
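A sketch of the algebra behind that claim: substituting the demand curve into profits gives

$$\pi_t = (\alpha_0 + d_t - c)\, q_t - \alpha_1 q_t^2 - \gamma (q_{t+1} - q_t)^2,$$

which matches the period payoff $a_t y_t - \frac{h}{2} y_t^2 - \frac{1}{2}[d(L) y_t]^2$ in (32.1) under the identifications above, with $y_t = q_t$ (and up to a one-period shift in the timing of the adjustment-cost term).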
Further examples of this problem for factor demand, economic growth, and government policy problems are given in ch.
IX of [Sargent, 1987].


32.3 Finite Horizon Theory

We first study a finite 𝑁 version of the problem.


Later we will study an infinite horizon problem solution as a limiting version of a finite horizon problem.
(This will require being careful because the limits as 𝑁 → ∞ of the necessary and sufficient conditions for maximizing
finite 𝑁 versions of (32.1) are not sufficient for maximizing (32.1))
We begin by
1. fixing 𝑁 > 𝑚,
2. differentiating the finite version of (32.1) with respect to 𝑦0 , 𝑦1 , … , 𝑦𝑁 , and
3. setting these derivatives to zero.
For 𝑡 = 0, … , 𝑁 − 𝑚 these first-order necessary conditions are the Euler equations.
For 𝑡 = 𝑁 − 𝑚 + 1, … , 𝑁 , the first-order conditions are a set of terminal conditions.
Consider the term

$$
\begin{aligned}
J &= \sum_{t=0}^{N} \beta^t \big[d(L) y_t\big]\big[d(L) y_t\big] \\
  &= \sum_{t=0}^{N} \beta^t \big(d_0 y_t + d_1 y_{t-1} + \cdots + d_m y_{t-m}\big)\big(d_0 y_t + d_1 y_{t-1} + \cdots + d_m y_{t-m}\big)
\end{aligned}
$$

Differentiating $J$ with respect to $y_t$ for $t = 0, 1, \ldots, N - m$ gives

$$
\begin{aligned}
\frac{\partial J}{\partial y_t} &= 2\beta^t d_0 d(L) y_t + 2\beta^{t+1} d_1 d(L) y_{t+1} + \cdots + 2\beta^{t+m} d_m d(L) y_{t+m} \\
&= 2\beta^t \big(d_0 + d_1 \beta L^{-1} + d_2 \beta^2 L^{-2} + \cdots + d_m \beta^m L^{-m}\big) d(L) y_t
\end{aligned}
$$

We can write this more succinctly as

$$\frac{\partial J}{\partial y_t} = 2\beta^t d(\beta L^{-1}) d(L) y_t \tag{32.2}$$

Differentiating $J$ with respect to $y_t$ for $t = N - m + 1, \ldots, N$ gives

$$
\begin{aligned}
\frac{\partial J}{\partial y_N} &= 2\beta^N d_0 d(L) y_N \\
\frac{\partial J}{\partial y_{N-1}} &= 2\beta^{N-1} \big[d_0 + \beta d_1 L^{-1}\big] d(L) y_{N-1} \\
&\;\;\vdots \\
\frac{\partial J}{\partial y_{N-m+1}} &= 2\beta^{N-m+1} \big[d_0 + \beta L^{-1} d_1 + \cdots + \beta^{m-1} L^{-m+1} d_{m-1}\big] d(L) y_{N-m+1}
\end{aligned} \tag{32.3}
$$

With these preliminaries under our belts, we are ready to differentiate (32.1).
Differentiating (32.1) with respect to 𝑦𝑡 for 𝑡 = 0, … , 𝑁 − 𝑚 gives the Euler equations

[ℎ + 𝑑 (𝛽𝐿−1 ) 𝑑(𝐿)]𝑦𝑡 = 𝑎𝑡 , 𝑡 = 0, 1, … , 𝑁 − 𝑚 (32.4)

The system of equations (32.4) forms a $2m$th order linear difference equation that must hold for the values of $t$ indicated.


Differentiating (32.1) with respect to $y_t$ for $t = N - m + 1, \ldots, N$ gives the terminal conditions

$$
\begin{aligned}
\beta^N \big(a_N - h y_N - d_0 d(L) y_N\big) &= 0 \\
\beta^{N-1} \big(a_{N-1} - h y_{N-1} - (d_0 + \beta d_1 L^{-1}) d(L) y_{N-1}\big) &= 0 \\
&\;\;\vdots \\
\beta^{N-m+1} \big(a_{N-m+1} - h y_{N-m+1} - (d_0 + \beta L^{-1} d_1 + \cdots + \beta^{m-1} L^{-m+1} d_{m-1}) d(L) y_{N-m+1}\big) &= 0
\end{aligned} \tag{32.5}
$$

In the finite 𝑁 problem, we want simultaneously to solve (32.4) subject to the 𝑚 initial conditions 𝑦−1 , … , 𝑦−𝑚 and the
𝑚 terminal conditions (32.5).
These conditions uniquely pin down the solution of the finite 𝑁 problem.
That is, for the finite 𝑁 problem, conditions (32.4) and (32.5) are necessary and sufficient for a maximum, by concavity
of the objective function.
Next, we describe how to obtain the solution using matrix methods.

32.3.1 Matrix Methods

Let’s look at how linear algebra can be used to tackle and shed light on the finite horizon LQ control problem.

A Single Lag Term

Let’s begin with the special case in which 𝑚 = 1.


We want to solve the system of $N + 1$ linear equations

$$
\begin{aligned}
\big[h + d(\beta L^{-1}) d(L)\big] y_t &= a_t, \quad t = 0, 1, \ldots, N - 1 \\
\beta^N \big[a_N - h y_N - d_0 d(L) y_N\big] &= 0
\end{aligned} \tag{32.6}
$$

where $d(L) = d_0 + d_1 L$.

These equations are to be solved for $y_0, y_1, \ldots, y_N$ as functions of $a_0, a_1, \ldots, a_N$ and $y_{-1}$.

Let

$$\phi(L) = \phi_0 + \phi_1 L + \beta \phi_1 L^{-1} = h + d(\beta L^{-1}) d(L) = (h + d_0^2 + d_1^2) + d_1 d_0 L + d_1 d_0 \beta L^{-1}$$

Then we can represent (32.6) as the matrix equation

$$
\begin{bmatrix}
(\phi_0 - d_1^2) & \phi_1 & 0 & 0 & \cdots & \cdots & 0 \\
\beta \phi_1 & \phi_0 & \phi_1 & 0 & \cdots & \cdots & 0 \\
0 & \beta \phi_1 & \phi_0 & \phi_1 & \cdots & \cdots & 0 \\
\vdots & \vdots & \vdots & \ddots & \vdots & \vdots & \vdots \\
0 & \cdots & \cdots & \cdots & \beta \phi_1 & \phi_0 & \phi_1 \\
0 & \cdots & \cdots & \cdots & 0 & \beta \phi_1 & \phi_0
\end{bmatrix}
\begin{bmatrix} y_N \\ y_{N-1} \\ y_{N-2} \\ \vdots \\ y_1 \\ y_0 \end{bmatrix}
=
\begin{bmatrix} a_N \\ a_{N-1} \\ a_{N-2} \\ \vdots \\ a_1 \\ a_0 - \phi_1 y_{-1} \end{bmatrix} \tag{32.7}
$$

or

$$W \bar{y} = \bar{a} \tag{32.8}$$

Notice how we have chosen to arrange the 𝑦𝑡 ’s in reverse time order.


The matrix 𝑊 on the left side of (32.7) is “almost” a Toeplitz matrix (where each descending diagonal is constant).
There are two sources of deviation from the form of a Toeplitz matrix


1. The first element differs from the remaining diagonal elements, reflecting the terminal condition.
2. The sub-diagonal elements equal $\beta$ times the super-diagonal elements.
The solution of (32.8) can be expressed in the form

$$\bar{y} = W^{-1} \bar{a} \tag{32.9}$$

which represents each element $y_t$ of $\bar{y}$ as a function of the entire vector $\bar{a}$.

That is, $y_t$ is a function of past, present, and future values of $a$'s, as well as of the initial condition $y_{-1}$.

An Alternative Representation

An alternative way to express the solution to (32.7) or (32.8) is in so-called feedback-feedforward form.
The idea here is to find a solution expressing 𝑦𝑡 as a function of past 𝑦’s and current and future 𝑎’s.
To achieve this solution, one can use an LU decomposition of 𝑊 .
There always exists a decomposition of 𝑊 of the form 𝑊 = 𝐿𝑈 where
• 𝐿 is an (𝑁 + 1) × (𝑁 + 1) lower triangular matrix.
• 𝑈 is an (𝑁 + 1) × (𝑁 + 1) upper triangular matrix.
The factorization can be normalized so that the diagonal elements of 𝑈 are unity.
Using the LU representation in (32.9), we obtain

$$U \bar{y} = L^{-1} \bar{a} \tag{32.10}$$

Since $L^{-1}$ is lower triangular, this representation expresses $y_t$ as a function of
• lagged $y$'s (via the term $U \bar{y}$), and
• current and future $a$'s (via the term $L^{-1} \bar{a}$)
Because there are zeros everywhere in the matrix on the left of (32.7) except on the diagonal, super-diagonal, and sub-
diagonal, the 𝐿𝑈 decomposition takes
• 𝐿 to be zero except in the diagonal and the leading sub-diagonal.
• 𝑈 to be zero except on the diagonal and the super-diagonal.
Thus, (32.10) has the form

$$
\begin{bmatrix}
1 & U_{12} & 0 & 0 & \cdots & 0 & 0 \\
0 & 1 & U_{23} & 0 & \cdots & 0 & 0 \\
0 & 0 & 1 & U_{34} & \cdots & 0 & 0 \\
0 & 0 & 0 & 1 & \cdots & 0 & 0 \\
\vdots & \vdots & \vdots & \vdots & \ddots & \vdots & \vdots \\
0 & 0 & 0 & 0 & \cdots & 1 & U_{N, N+1} \\
0 & 0 & 0 & 0 & \cdots & 0 & 1
\end{bmatrix}
\begin{bmatrix} y_N \\ y_{N-1} \\ y_{N-2} \\ y_{N-3} \\ \vdots \\ y_1 \\ y_0 \end{bmatrix}
=
$$

$$
\begin{bmatrix}
L^{-1}_{11} & 0 & 0 & \cdots & 0 \\
L^{-1}_{21} & L^{-1}_{22} & 0 & \cdots & 0 \\
L^{-1}_{31} & L^{-1}_{32} & L^{-1}_{33} & \cdots & 0 \\
\vdots & \vdots & \vdots & \ddots & \vdots \\
L^{-1}_{N,1} & L^{-1}_{N,2} & L^{-1}_{N,3} & \cdots & 0 \\
L^{-1}_{N+1,1} & L^{-1}_{N+1,2} & L^{-1}_{N+1,3} & \cdots & L^{-1}_{N+1, N+1}
\end{bmatrix}
\begin{bmatrix} a_N \\ a_{N-1} \\ a_{N-2} \\ \vdots \\ a_1 \\ a_0 - \phi_1 y_{-1} \end{bmatrix}
$$

where $L^{-1}_{ij}$ is the $(i, j)$ element of $L^{-1}$ and $U_{ij}$ is the $(i, j)$ element of $U$.


Note how the left side for a given 𝑡 involves 𝑦𝑡 and one lagged value 𝑦𝑡−1 while the right side involves all future values of
the forcing process 𝑎𝑡 , 𝑎𝑡+1 , … , 𝑎𝑁 .
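Here is a minimal numerical sketch of the mechanics just described, for a single-lag case with made-up values of $h$, $d_0$, $d_1$: build a small version of $W$, factor it as $W = LU$ with a unit-diagonal $U$, and recover $\bar{y}$ from the feedback-feedforward representation.

import numpy as np
import scipy.linalg as la

h, d0, d1, β = 1.0, 1.0, -0.5, 1.0     # illustrative values
ϕ0, ϕ1 = h + d0**2 + d1**2, d0 * d1
N = 5

# Build W as in (32.7): tridiagonal apart from the top-left terminal correction
W = np.zeros((N + 1, N + 1))
for i in range(N + 1):
    W[i, i] = ϕ0
    if i < N:
        W[i, i + 1] = ϕ1
        W[i + 1, i] = β * ϕ1
W[0, 0] = ϕ0 - d1**2

np.random.seed(0)
a_bar = np.random.randn(N + 1)

# LU factorization, renormalized so that U has a unit diagonal
PL, U = la.lu(W, permute_l=True)       # W = PL @ U
D = np.diag(np.diag(U))
U, L = la.solve(D, U), PL @ D          # now W = L @ U with unit-diagonal U

# Feedback-feedforward form: U @ y_bar = L^{-1} @ a_bar
y_bar = np.linalg.solve(U, np.linalg.solve(L, a_bar))
print(np.allclose(W @ y_bar, a_bar))   # True: same solution as y_bar = W^{-1} a_bar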

Additional Lag Terms

We briefly indicate how this approach extends to the problem with $m > 1$.

Assume that $\beta = 1$ and let $D_{m+1}$ be the $(m+1) \times (m+1)$ symmetric matrix whose elements are determined from the following formula:

$$D_{jk} = d_0 d_{k-j} + d_1 d_{k-j+1} + \ldots + d_{j-1} d_{k-1}, \qquad k \geq j$$

Let $I_{m+1}$ be the $(m+1) \times (m+1)$ identity matrix.

Let $\phi_j$ be the coefficients in the expansion $\phi(L) = h + d(L^{-1}) d(L)$.

Then the first order conditions (32.4) and (32.5) can be expressed as

$$(D_{m+1} + h I_{m+1}) \begin{bmatrix} y_N \\ y_{N-1} \\ \vdots \\ y_{N-m} \end{bmatrix} = \begin{bmatrix} a_N \\ a_{N-1} \\ \vdots \\ a_{N-m} \end{bmatrix} + M \begin{bmatrix} y_{N-m-1} \\ y_{N-m-2} \\ \vdots \\ y_{N-2m} \end{bmatrix}$$

where $M$ is $(m+1) \times m$ and

$$M_{ij} = \begin{cases} D_{i-j,\, m+1} & \text{for } i > j \\ 0 & \text{for } i \leq j \end{cases}$$

and

$$
\begin{aligned}
\phi_m y_{N-1} + \phi_{m-1} y_{N-2} + \ldots + \phi_0 y_{N-m-1} + \phi_1 y_{N-m-2} + \ldots + \phi_m y_{N-2m-1} &= a_{N-m-1} \\
\phi_m y_{N-2} + \phi_{m-1} y_{N-3} + \ldots + \phi_0 y_{N-m-2} + \phi_1 y_{N-m-3} + \ldots + \phi_m y_{N-2m-2} &= a_{N-m-2} \\
&\;\;\vdots \\
\phi_m y_{m+1} + \phi_{m-1} y_m + \ldots + \phi_0 y_1 + \phi_1 y_0 + \ldots + \phi_m y_{-m+1} &= a_1 \\
\phi_m y_m + \phi_{m-1} y_{m-1} + \ldots + \phi_0 y_0 + \phi_1 y_{-1} + \ldots + \phi_m y_{-m} &= a_0
\end{aligned}
$$
As before, we can express this equation as 𝑊 𝑦 ̄ = 𝑎.̄
The matrix on the left of this equation is “almost” Toeplitz, the exception being the leading 𝑚 × 𝑚 submatrix in the upper
left-hand corner.
We can represent the solution in feedback-feedforward form by obtaining a decomposition $LU = W$, and obtain

$$U \bar{y} = L^{-1} \bar{a} \tag{32.11}$$

$$\sum_{j=0}^{t} U_{-t+N+1,\, -t+N+j+1}\, y_{t-j} = \sum_{j=0}^{N-t} L^{-1}_{-t+N+1,\, -t+N+1-j}\, \bar{a}_{t+j}, \qquad t = 0, 1, \ldots, N$$

where $L^{-1}_{t,s}$ is the element in the $(t, s)$ position of $L^{-1}$, and similarly for $U$.

The left side of equation (32.11) is the “feedback” part of the optimal control law for 𝑦𝑡 , while the right-hand side is the
“feedforward” part.
We note that there is a different control law for each 𝑡.
Thus, in the finite horizon case, the optimal control law is time-dependent.


It is natural to suspect that as $N \to \infty$, (32.11) becomes equivalent to the solution of our infinite horizon problem, which below we shall show can be expressed as

$$c(L) y_t = c(\beta L^{-1})^{-1} a_t,$$

so that as $N \to \infty$ we expect that for each fixed $t$, $U_{t, t-j} \to c_j$ and $L^{-1}_{t, t+j}$ approaches the coefficient on $L^{-j}$ in the expansion of $c(\beta L^{-1})^{-1}$.

This suspicion is true under general conditions that we shall study later.

For now, we note that by creating the matrix $W$ for large $N$ and factoring it into the $LU$ form, good approximations to $c(L)$ and $c(\beta L^{-1})^{-1}$ can be obtained.

32.4 Infinite Horizon Limit

For the infinite horizon problem, we propose to discover first-order necessary conditions by taking the limits of (32.4)
and (32.5) as 𝑁 → ∞.
This approach is valid, and the limits of (32.4) and (32.5) as 𝑁 approaches infinity are first-order necessary conditions
for a maximum.
However, for the infinite horizon problem with 𝛽 < 1, the limits of (32.4) and (32.5) are, in general, not sufficient for a
maximum.
That is, the limits of (32.5) do not provide enough information uniquely to determine the solution of the Euler equation
(32.4) that maximizes (32.1).
As we shall see below, a side condition on the path of 𝑦𝑡 that together with (32.4) is sufficient for an optimum is

$$\sum_{t=0}^{\infty} \beta^t h y_t^2 < \infty \tag{32.12}$$

All paths that satisfy the Euler equations, except the one that we shall select below, violate this condition and, therefore,
evidently lead to (much) lower values of (32.1) than does the optimal path selected by the solution procedure below.
Consider the characteristic equation associated with the Euler equation

$$h + d(\beta z^{-1}) d(z) = 0 \tag{32.13}$$

Notice that if $\tilde{z}$ is a root of equation (32.13), then so is $\beta \tilde{z}^{-1}$.

Thus, the roots of (32.13) come in "$\beta$-reciprocal" pairs.

Assume that the roots of (32.13) are distinct.

Let the roots be, in descending order according to their moduli, $z_1, z_2, \ldots, z_{2m}$.

From the reciprocal pairs property and the assumption of distinct roots, it follows that $|z_j| > \sqrt{\beta}$ for $j \leq m$ and $|z_j| < \sqrt{\beta}$ for $j > m$.

It also follows that $z_{2m-j} = \beta z_{j+1}^{-1}$, $j = 0, 1, \ldots, m - 1$.

Therefore, the characteristic polynomial on the left side of (32.13) can be expressed as

$$
\begin{aligned}
h + d(\beta z^{-1}) d(z) &= z^{-m} z_0 (z - z_1) \cdots (z - z_m)(z - z_{m+1}) \cdots (z - z_{2m}) \\
&= z^{-m} z_0 (z - z_1)(z - z_2) \cdots (z - z_m)(z - \beta z_m^{-1}) \cdots (z - \beta z_2^{-1})(z - \beta z_1^{-1})
\end{aligned} \tag{32.14}
$$

where $z_0$ is a constant.


In (32.14), we substitute $(z - z_j) = -z_j \big(1 - \frac{1}{z_j} z\big)$ and $(z - \beta z_j^{-1}) = z\big(1 - \frac{\beta}{z_j} z^{-1}\big)$ for $j = 1, \ldots, m$ to get

$$h + d(\beta z^{-1}) d(z) = (-1)^m (z_0 z_1 \cdots z_m) \Big(1 - \frac{1}{z_1} z\Big) \cdots \Big(1 - \frac{1}{z_m} z\Big) \Big(1 - \frac{1}{z_1} \beta z^{-1}\Big) \cdots \Big(1 - \frac{1}{z_m} \beta z^{-1}\Big)$$

Now define $c(z) = \sum_{j=0}^{m} c_j z^j$ as

$$c(z) = \big[(-1)^m z_0 z_1 \cdots z_m\big]^{1/2} \Big(1 - \frac{z}{z_1}\Big)\Big(1 - \frac{z}{z_2}\Big) \cdots \Big(1 - \frac{z}{z_m}\Big) \tag{32.15}$$

Notice that (32.14) can be written

$$h + d(\beta z^{-1}) d(z) = c(\beta z^{-1}) c(z) \tag{32.16}$$

It is useful to write (32.15) as

$$c(z) = c_0 (1 - \lambda_1 z) \ldots (1 - \lambda_m z) \tag{32.17}$$

where

$$c_0 = \big[(-1)^m z_0 z_1 \cdots z_m\big]^{1/2}; \qquad \lambda_j = \frac{1}{z_j}, \; j = 1, \ldots, m$$

Since $|z_j| > \sqrt{\beta}$ for $j = 1, \ldots, m$ it follows that $|\lambda_j| < 1/\sqrt{\beta}$ for $j = 1, \ldots, m$.

Using (32.17), we can express the factorization (32.16) as

$$h + d(\beta z^{-1}) d(z) = c_0^2 (1 - \lambda_1 z) \cdots (1 - \lambda_m z)(1 - \lambda_1 \beta z^{-1}) \cdots (1 - \lambda_m \beta z^{-1})$$

In sum, we have constructed a factorization (32.16) of the characteristic polynomial for the Euler equation in which the zeros of $c(z)$ exceed $\beta^{1/2}$ in modulus, and the zeros of $c(\beta z^{-1})$ are less than $\beta^{1/2}$ in modulus.
Using (32.16), we now write the Euler equation as

𝑐(𝛽𝐿−1 ) 𝑐 (𝐿) 𝑦𝑡 = 𝑎𝑡

The unique solution of the Euler equation that satisfies condition (32.12) is

𝑐(𝐿) 𝑦𝑡 = 𝑐 (𝛽𝐿−1 )−1 𝑎𝑡 (32.18)

This can be established by using an argument paralleling that in chapter IX of [Sargent, 1987].
To exhibit the solution in a form paralleling that of [Sargent, 1987], we use (32.17) to write (32.18) as

$$(1 - \lambda_1 L) \cdots (1 - \lambda_m L) y_t = \frac{c_0^{-2} a_t}{(1 - \beta \lambda_1 L^{-1}) \cdots (1 - \beta \lambda_m L^{-1})} \tag{32.19}$$

Using partial fractions, we can write the characteristic polynomial on the right side of (32.19) as

$$\sum_{j=1}^{m} \frac{A_j}{1 - \lambda_j \beta L^{-1}} \qquad \text{where} \quad A_j := \frac{c_0^{-2}}{\prod_{i \neq j} \big(1 - \frac{\lambda_i}{\lambda_j}\big)}$$

Then (32.19) can be written

$$(1 - \lambda_1 L) \cdots (1 - \lambda_m L) y_t = \sum_{j=1}^{m} \frac{A_j}{1 - \lambda_j \beta L^{-1}} a_t$$


or

$$(1 - \lambda_1 L) \cdots (1 - \lambda_m L) y_t = \sum_{j=1}^{m} A_j \sum_{k=0}^{\infty} (\lambda_j \beta)^k a_{t+k} \tag{32.20}$$

Equation (32.20) expresses the optimum sequence for 𝑦𝑡 in terms of 𝑚 lagged 𝑦’s, and 𝑚 weighted infinite geometric
sums of future 𝑎𝑡 ’s.
Furthermore, (32.20) is the unique solution of the Euler equation that satisfies the initial conditions and condition (32.12).
In effect, condition (32.12) compels us to solve the “unstable” roots of ℎ + 𝑑(𝛽𝑧 −1 )𝑑(𝑧) forward (see [Sargent, 1987]).
The step of factoring the polynomial $h + d(\beta z^{-1}) d(z)$ into $c(\beta z^{-1}) c(z)$, where the zeros of $c(z)$ all have modulus exceeding $\sqrt{\beta}$, is central to solving the problem.
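Here is a minimal numerical sketch of that factorization step for an illustrative single-lag case with $\beta = 1$ (the same computations appear, in more general form, inside the LQFilter class below):

import numpy as np

h, β = 1.0, 1.0
d = np.array([1.0, -0.5])               # d(L) = 1 - 0.5 L, so m = 1
m = len(d) - 1

# Coefficients of z^m [h + d(βz^{-1}) d(z)], highest power of z first
ϕ = np.zeros(2 * m + 1)
for i in range(-m, m + 1):
    ϕ[m - i] = np.sum(np.diag(np.outer(d, d), k=-i))
ϕ[m] += h

roots = np.roots(ϕ)
roots = roots[np.argsort(np.abs(roots))[::-1]]   # descending moduli
z_outside = roots[:m]                            # the zeros of c(z) are 1/λ_j

z0 = ϕ.sum() / np.poly1d(roots, True)(1)
c0 = np.sqrt(((-1)**m * z0 * np.prod(z_outside)).real)
λ = 1 / z_outside

def c(z):
    return c0 * np.prod(1 - λ * z)

# Check h + d(β z^{-1}) d(z) = c(β z^{-1}) c(z) at a few points
for z in [0.7, 1.3, -0.4]:
    lhs = h + np.polyval(d[::-1], β / z) * np.polyval(d[::-1], z)
    print(np.isclose(lhs, c(β / z) * c(z)))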
We note two features of the solution (32.20)

• Since $|\lambda_j| < 1/\sqrt{\beta}$ for all $j$, it follows that $|\lambda_j \beta| < \sqrt{\beta}$.
• The assumption that $\{a_t\}$ is of exponential order less than $1/\sqrt{\beta}$ is sufficient to guarantee that the geometric sums of future $a_t$'s on the right side of (32.20) converge.
We immediately see that those sums will converge under the weaker condition that $\{a_t\}$ is of exponential order less than $\phi^{-1}$ where $\phi = \max\{\beta \lambda_i, \; i = 1, \ldots, m\}$.

Note that with $a_t$ identically zero, (32.20) implies that in general $|y_t|$ eventually grows exponentially at a rate given by $\max_i |\lambda_i|$.

The condition $\max_i |\lambda_i| < 1/\sqrt{\beta}$ guarantees that condition (32.12) is satisfied.

In fact, $\max_i |\lambda_i| < 1/\sqrt{\beta}$ is a necessary condition for (32.12) to hold.
Were (32.12) not satisfied, the objective function would diverge to −∞, implying that the 𝑦𝑡 path could not be optimal.
For example, with $a_t = 0$ for all $t$, it is easy to describe a naive (nonoptimal) policy for $\{y_t, t \geq 0\}$ that gives a finite value of (32.1).
We can simply let 𝑦𝑡 = 0 for 𝑡 ≥ 0.
This policy involves at most 𝑚 nonzero values of ℎ𝑦𝑡2 and [𝑑(𝐿)𝑦𝑡 ]2 , and so yields a finite value of (32.1).
Therefore it is easy to dominate a path that violates (32.12).

32.5 Undiscounted Problems

It is worthwhile focusing on a special case of the LQ problems above: the undiscounted problem that emerges when
𝛽 = 1.
In this case, the Euler equation is

(ℎ + 𝑑(𝐿−1 )𝑑(𝐿)) 𝑦𝑡 = 𝑎𝑡

The factorization of the characteristic polynomial (32.16) becomes

(ℎ + 𝑑 (𝑧 −1 )𝑑(𝑧)) = 𝑐 (𝑧 −1 ) 𝑐 (𝑧)


where

$$
\begin{aligned}
c(z) &= c_0 (1 - \lambda_1 z) \ldots (1 - \lambda_m z) \\
c_0 &= \big[(-1)^m z_0 z_1 \ldots z_m\big]^{1/2} \\
|\lambda_j| &< 1 \;\; \text{for } j = 1, \ldots, m \\
\lambda_j &= \frac{1}{z_j} \;\; \text{for } j = 1, \ldots, m \\
z_0 &= \text{constant}
\end{aligned}
$$

The solution of the problem becomes

$$(1 - \lambda_1 L) \cdots (1 - \lambda_m L) y_t = \sum_{j=1}^{m} A_j \sum_{k=0}^{\infty} \lambda_j^k a_{t+k}$$

32.5.1 Transforming Discounted to Undiscounted Problem

Discounted problems can always be converted into undiscounted problems via a simple transformation.

Consider problem (32.1) with $0 < \beta < 1$.

Define the transformed variables

$$\tilde{a}_t = \beta^{t/2} a_t, \qquad \tilde{y}_t = \beta^{t/2} y_t \tag{32.21}$$

Then notice that $\beta^t [d(L) y_t]^2 = [\tilde{d}(L) \tilde{y}_t]^2$ with $\tilde{d}(L) = \sum_{j=0}^{m} \tilde{d}_j L^j$ and $\tilde{d}_j = \beta^{j/2} d_j$.

Then the original criterion function (32.1) is equivalent to

$$\lim_{N \to \infty} \sum_{t=0}^{N} \Big\{ \tilde{a}_t \tilde{y}_t - \frac{1}{2} h \tilde{y}_t^2 - \frac{1}{2} \big[\tilde{d}(L) \tilde{y}_t\big]^2 \Big\} \tag{32.22}$$

which is to be maximized over sequences $\{\tilde{y}_t, t = 0, \ldots\}$ subject to $\tilde{y}_{-1}, \cdots, \tilde{y}_{-m}$ given and $\{\tilde{a}_t, t = 1, \ldots\}$ a known bounded sequence.

The Euler equation for this problem is $[h + \tilde{d}(L^{-1}) \tilde{d}(L)] \tilde{y}_t = \tilde{a}_t$.

The solution is

$$(1 - \tilde{\lambda}_1 L) \cdots (1 - \tilde{\lambda}_m L) \tilde{y}_t = \sum_{j=1}^{m} \tilde{A}_j \sum_{k=0}^{\infty} \tilde{\lambda}_j^k \tilde{a}_{t+k}$$

or

$$\tilde{y}_t = \tilde{f}_1 \tilde{y}_{t-1} + \cdots + \tilde{f}_m \tilde{y}_{t-m} + \sum_{j=1}^{m} \tilde{A}_j \sum_{k=0}^{\infty} \tilde{\lambda}_j^k \tilde{a}_{t+k}, \tag{32.23}$$

where $\tilde{c}(z^{-1}) \tilde{c}(z) = h + \tilde{d}(z^{-1}) \tilde{d}(z)$, and where

$$\big[(-1)^m \tilde{z}_0 \tilde{z}_1 \ldots \tilde{z}_m\big]^{1/2} (1 - \tilde{\lambda}_1 z) \ldots (1 - \tilde{\lambda}_m z) = \tilde{c}(z), \quad \text{where } |\tilde{\lambda}_j| < 1$$

We leave it to the reader to show that (32.23) implies the equivalent form of the solution

$$y_t = f_1 y_{t-1} + \cdots + f_m y_{t-m} + \sum_{j=1}^{m} A_j \sum_{k=0}^{\infty} (\lambda_j \beta)^k a_{t+k}$$

where

$$f_j = \tilde{f}_j \beta^{-j/2}, \quad A_j = \tilde{A}_j, \quad \lambda_j = \tilde{\lambda}_j \beta^{-1/2} \tag{32.24}$$
The transformations (32.21) and the inverse formulas (32.24) allow us to solve a discounted problem by first solving a
related undiscounted problem.


32.6 Implementation

Here’s the code that computes solutions to the LQ problem using the methods described above.

import numpy as np
import scipy.stats as spst
import scipy.linalg as la

class LQFilter:

def __init__(self, d, h, y_m, r=None, h_eps=None, β=None):


"""

Parameters
----------
d : list or numpy.array (1-D or a 2-D column vector)
The order of the coefficients: [d_0, d_1, ..., d_m]
h : scalar
Parameter of the objective function (corresponding to the
quadratic term)
y_m : list or numpy.array (1-D or a 2-D column vector)
Initial conditions for y
r : list or numpy.array (1-D or a 2-D column vector)
The order of the coefficients: [r_0, r_1, ..., r_k]
(optional, if not defined -> deterministic problem)
β : scalar
Discount factor (optional, default value is one)
"""

self.h = h
self.d = np.asarray(d)
self.m = self.d.shape[0] - 1

self.y_m = np.asarray(y_m)

if self.m == self.y_m.shape[0]:
self.y_m = self.y_m.reshape(self.m, 1)
else:
raise ValueError("y_m must be of length m = {self.m:d}")

#---------------------------------------------
# Define the coefficients of ϕ upfront
#---------------------------------------------
ϕ = np.zeros(2 * self.m + 1)
for i in range(- self.m, self.m + 1):
ϕ[self.m - i] = np.sum(np.diag(self.d.reshape(self.m + 1, 1) \
@ self.d.reshape(1, self.m + 1),
k=-i
)
)
ϕ[self.m] = ϕ[self.m] + self.h
self.ϕ = ϕ

#-----------------------------------------------------
# If r is given calculate the vector ϕ_r
#-----------------------------------------------------
if r is None:

pass
else:
self.r = np.asarray(r)
self.k = self.r.shape[0] - 1
ϕ_r = np.zeros(2 * self.k + 1)
for i in range(- self.k, self.k + 1):
ϕ_r[self.k - i] = np.sum(np.diag(self.r.reshape(self.k + 1, 1) \
@ self.r.reshape(1, self.k + 1),
k=-i
)
)
if h_eps is None:
self.ϕ_r = ϕ_r
else:
ϕ_r[self.k] = ϕ_r[self.k] + h_eps
self.ϕ_r = ϕ_r

#-----------------------------------------------------
# If β is given, define the transformed variables
#-----------------------------------------------------
if β is None:
self.β = 1
else:
self.β = β
self.d = self.β**(np.arange(self.m + 1)/2) * self.d
self.y_m = self.y_m * (self.β**(- np.arange(1, self.m + 1)/2)) \
.reshape(self.m, 1)

def construct_W_and_Wm(self, N):


"""
This constructs the matrices W and W_m for a given number of periods N
"""

m = self.m
d = self.d

W = np.zeros((N + 1, N + 1))
W_m = np.zeros((N + 1, m))

#---------------------------------------
# Terminal conditions
#---------------------------------------

D_m1 = np.zeros((m + 1, m + 1))


M = np.zeros((m + 1, m))

# (1) Construct the D_{m+1} matrix using the formula

for j in range(m + 1):


for k in range(j, m + 1):
D_m1[j, k] = d[:j + 1] @ d[k - j: k + 1]

# Make the matrix symmetric


D_m1 = D_m1 + D_m1.T - np.diag(np.diag(D_m1))

# (2) Construct the M matrix using the entries of D_m1

for j in range(m):
for i in range(j + 1, m + 1):
M[i, j] = D_m1[i - j - 1, m]

#----------------------------------------------
# Euler equations for t = 0, 1, ..., N-(m+1)
#----------------------------------------------
ϕ = self.ϕ

W[:(m + 1), :(m + 1)] = D_m1 + self.h * np.eye(m + 1)


W[:(m + 1), (m + 1):(2 * m + 1)] = M

for i, row in enumerate(np.arange(m + 1, N + 1 - m)):


W[row, (i + 1):(2 * m + 2 + i)] = ϕ

for i in range(1, m + 1):


W[N - m + i, -(2 * m + 1 - i):] = ϕ[:-i]

for i in range(m):
W_m[N - i, :(m - i)] = ϕ[(m + 1 + i):]

return W, W_m

def roots_of_characteristic(self):
"""
This function calculates z_0 and the 2m roots of the characteristic
equation associated with the Euler equation (1.7)

Note:
------
numpy.poly1d(roots, True) defines a polynomial using its roots that can
be evaluated at any point. If x_1, x_2, ... , x_m are the roots then
p(x) = (x - x_1)(x - x_2)...(x - x_m)
"""
m = self.m
ϕ = self.ϕ

# Calculate the roots of the 2m-polynomial


roots = np.roots(ϕ)
# Sort the roots according to their length (in descending order)
roots_sorted = roots[np.argsort(abs(roots))[::-1]]

z_0 = ϕ.sum() / np.poly1d(roots, True)(1)


z_1_to_m = roots_sorted[:m] # We need only those outside the unit circle

λ = 1 / z_1_to_m

return z_1_to_m, z_0, λ

def coeffs_of_c(self):
'''
This function computes the coefficients {c_j, j = 0, 1, ..., m} for
c(z) = sum_{j = 0}^{m} c_j z^j

Based on the expression (1.9). The order is


c_coeffs = [c_0, c_1, ..., c_{m-1}, c_m]
'''
z_1_to_m, z_0 = self.roots_of_characteristic()[:2]

c_0 = (z_0 * np.prod(z_1_to_m).real * (- 1)**self.m)**(.5)


c_coeffs = np.poly1d(z_1_to_m, True).c * z_0 / c_0

return c_coeffs[::-1]

def solution(self):
"""
This function calculates {λ_j, j=1,...,m} and {A_j, j=1,...,m}
of the expression (1.15)
"""
λ = self.roots_of_characteristic()[2]
c_0 = self.coeffs_of_c()[-1]

A = np.zeros(self.m, dtype=complex)
for j in range(self.m):
denom = 1 - λ/λ[j]
A[j] = c_0**(-2) / np.prod(denom[np.arange(self.m) != j])

return λ, A

def construct_V(self, N):


'''
This function constructs the covariance matrix for x^N (see section 6)
for a given period N
'''
V = np.zeros((N, N))
ϕ_r = self.ϕ_r

for i in range(N):
for j in range(N):
if abs(i-j) <= self.k:
V[i, j] = ϕ_r[self.k + abs(i-j)]

return V

def simulate_a(self, N):


"""
Assuming that the u's are normal, this method draws a random path
for x^N
"""
V = self.construct_V(N + 1)
d = spst.multivariate_normal(np.zeros(N + 1), V)

return d.rvs()

def predict(self, a_hist, t):


"""
This function implements the prediction formula discussed in section 6 (1.59)
It takes a realization for a^N, and the period in which the prediction is
formed

Output: E[abar | a_t, a_{t-1}, ..., a_1, a_0]


"""

N = np.asarray(a_hist).shape[0] - 1
a_hist = np.asarray(a_hist).reshape(N + 1, 1)
V = self.construct_V(N + 1)

aux_matrix = np.zeros((N + 1, N + 1))


aux_matrix[:(t + 1), :(t + 1)] = np.eye(t + 1)
L = la.cholesky(V).T
Ea_hist = la.inv(L) @ aux_matrix @ L @ a_hist

return Ea_hist

def optimal_y(self, a_hist, t=None):


"""
- if t is NOT given it takes a_hist (list or numpy.array) as a
deterministic a_t
- if t is given, it solves the combined control prediction problem
(section 7)(by default, t == None -> deterministic)

for a given sequence of a_t (either deterministic or a particular


realization), it calculates the optimal y_t sequence using the method
of the lecture

Note:
------
scipy.linalg.lu normalizes L, U so that L has unit diagonal elements
To make things consistent with the lecture, we need an auxiliary
diagonal matrix D which renormalizes L and U
"""

N = np.asarray(a_hist).shape[0] - 1
W, W_m = self.construct_W_and_Wm(N)

L, U = la.lu(W, permute_l=True)
D = np.diag(1 / np.diag(U))
U = D @ U
L = L @ np.diag(1 / np.diag(D))

J = np.fliplr(np.eye(N + 1))

if t is None: # If the problem is deterministic

a_hist = J @ np.asarray(a_hist).reshape(N + 1, 1)

#--------------------------------------------
# Transform the 'a' sequence if β is given
#--------------------------------------------
if self.β != 1:
a_hist = a_hist * (self.β**(np.arange(N + 1) / 2))[::-1] \
.reshape(N + 1, 1)

a_bar = a_hist - W_m @ self.y_m # a_bar from the lecture


Uy = np.linalg.solve(L, a_bar) # U @ y_bar = L^{-1}
y_bar = np.linalg.solve(U, Uy) # y_bar = U^{-1}L^{-1}


# Reverse the order of y_bar with the matrix J
J = np.fliplr(np.eye(N + self.m + 1))
# y_hist : concatenated y_m and y_bar
y_hist = J @ np.vstack([y_bar, self.y_m])

#--------------------------------------------
# Transform the optimal sequence back if β is given
#--------------------------------------------
if self.β != 1:
y_hist = y_hist * (self.β**(- np.arange(-self.m, N + 1)/2)) \
.reshape(N + 1 + self.m, 1)

return y_hist, L, U, y_bar

else: # If the problem is stochastic and we look at it

Ea_hist = self.predict(a_hist, t).reshape(N + 1, 1)


Ea_hist = J @ Ea_hist

a_bar = Ea_hist - W_m @ self.y_m # a_bar from the lecture


Uy = np.linalg.solve(L, a_bar) # U @ y_bar = L^{-1}
y_bar = np.linalg.solve(U, Uy) # y_bar = U^{-1}L^{-1}

# Reverse the order of y_bar with the matrix J


J = np.fliplr(np.eye(N + self.m + 1))
# y_hist : concatenated y_m and y_bar
y_hist = J @ np.vstack([y_bar, self.y_m])

return y_hist, L, U, y_bar

32.6.1 Example

In this application, we’ll have one lag, with

𝑑(𝐿)𝑦𝑡 = 𝛾(𝐼 − 𝐿)𝑦𝑡 = 𝛾(𝑦𝑡 − 𝑦𝑡−1 )

Suppose for the moment that 𝛾 = 0.


Then the intertemporal component of the LQ problem disappears, and the agent simply wants to maximize 𝑎𝑡 𝑦𝑡 − ℎ𝑦𝑡2 /2
in each period.
This means that the agent chooses 𝑦𝑡 = 𝑎𝑡 /ℎ.
In the following we’ll set ℎ = 1, so that the agent just wants to track the {𝑎𝑡 } process.
However, as we increase 𝛾, the agent gives greater weight to a smooth time path.
Hence {𝑦𝑡 } evolves as a smoothed version of {𝑎𝑡 }.
The {𝑎𝑡 } sequence we’ll choose as a stationary cyclic process plus some white noise.
Here’s some code that generates a plot when 𝛾 = 0.8

# Set seed and generate a_t sequence


np.random.seed(123)
n = 100
a_seq = np.sin(np.linspace(0, 5 * np.pi, n)) + 2 + 0.1 * np.random.randn(n)

def plot_simulation(γ=0.8, m=1, h=1, y_m=2):

d = γ * np.asarray([1, -1])
y_m = np.asarray(y_m).reshape(m, 1)

testlq = LQFilter(d, h, y_m)


y_hist, L, U, y = testlq.optimal_y(a_seq)
y = y[::-1] # Reverse y

# Plot simulation results

fig, ax = plt.subplots(figsize=(10, 6))


p_args = {'lw' : 2, 'alpha' : 0.6}
time = range(len(y))
ax.plot(time, a_seq / h, 'k-o', ms=4, lw=2, alpha=0.6, label='$a_t$')
ax.plot(time, y, 'b-o', ms=4, lw=2, alpha=0.6, label='$y_t$')
ax.set(title=rf'Dynamics with $\gamma = {γ}$',
xlabel='Time',
xlim=(0, max(time))
)
ax.legend()
ax.grid()
plt.show()

plot_simulation()

Here’s what happens when we change 𝛾 to 5.0

plot_simulation(γ=5)

And here’s 𝛾 = 10

plot_simulation(γ=10)


32.7 Exercises

Exercise 32.7.1
Consider solving a discounted version (𝛽 < 1) of problem (32.1), as follows.
Convert (32.1) to the undiscounted problem (32.22).
Let the solution of (32.22) in feedback form be
$$
(1 - \tilde\lambda_1 L) \cdots (1 - \tilde\lambda_m L)\, \tilde y_t = \sum_{j=1}^{m} \tilde A_j \sum_{k=0}^{\infty} \tilde\lambda_j^k \tilde a_{t+k}
$$

or

$$
\tilde y_t = \tilde f_1 \tilde y_{t-1} + \cdots + \tilde f_m \tilde y_{t-m} + \sum_{j=1}^{m} \tilde A_j \sum_{k=0}^{\infty} \tilde\lambda_j^k \tilde a_{t+k} \tag{32.25}
$$

Here

• $h + \tilde d(z^{-1}) \tilde d(z) = \tilde c(z^{-1}) \tilde c(z)$

• $\tilde c(z) = [(-1)^m \tilde z_0 \tilde z_1 \cdots \tilde z_m]^{1/2} (1 - \tilde\lambda_1 z) \cdots (1 - \tilde\lambda_m z)$

where the $\tilde z_j$ are the zeros of $h + \tilde d(z^{-1})\, \tilde d(z)$.

Prove that (32.25) implies that the solution for 𝑦𝑡 in feedback form is
$$
y_t = f_1 y_{t-1} + \cdots + f_m y_{t-m} + \sum_{j=1}^{m} A_j \sum_{k=0}^{\infty} \beta^k \lambda_j^k a_{t+k}
$$


where $f_j = \tilde f_j \beta^{-j/2}$, $A_j = \tilde A_j$, and $\lambda_j = \tilde\lambda_j \beta^{-1/2}$.

Exercise 32.7.2
Solve the optimal control problem, maximize
$$
\sum_{t=0}^{2} \left\{ a_t y_t - \frac{1}{2} [(1 - 2L) y_t]^2 \right\}
$$

subject to 𝑦−1 given, and {𝑎𝑡 } a known bounded sequence.


Express the solution in the “feedback form” (32.20), giving numerical values for the coefficients.
Make sure that the boundary conditions (32.5) are satisfied.

Note: This problem differs from the problem in the text in one important way: instead of ℎ > 0 in (32.1), ℎ = 0. This
has an important influence on the solution.

Exercise 32.7.3
Solve the infinite time-optimal control problem to maximize
$$
\lim_{N \to \infty} \sum_{t=0}^{N} -\frac{1}{2} [(1 - 2L) y_t]^2,
$$

subject to 𝑦−1 given. Prove that the solution is

$$
y_t = 2 y_{t-1} = 2^{t+1} y_{-1} \qquad t > 0
$$

Exercise 32.7.4
Solve the infinite time problem, to maximize
$$
\lim_{N \to \infty} \sum_{t=0}^{N} (.0000001)\, y_t^2 - \frac{1}{2} [(1 - 2L) y_t]^2
$$

subject to 𝑦−1 given. Prove that the solution 𝑦𝑡 = 2𝑦𝑡−1 violates condition (32.12), and so is not optimal.
Prove that the optimal solution is approximately 𝑦𝑡 = .5𝑦𝑡−1 .



CHAPTER

THIRTYTHREE

CLASSICAL PREDICTION AND FILTERING WITH LINEAR ALGEBRA

33.1 Overview

This is a sequel to the earlier lecture Classical Control with Linear Algebra.
That lecture used linear algebra – in particular, the LU decomposition – to formulate and solve a class of linear-quadratic
optimal control problems.
In this lecture, we’ll be using a closely related decomposition, the Cholesky decomposition, to solve linear prediction and
filtering problems.
We exploit the useful fact that there is an intimate connection between two superficially different classes of problems:
• deterministic linear-quadratic (LQ) optimal control problems
• linear least squares prediction and filtering problems
The first class of problems involves no randomness, while the second is all about randomness.
Nevertheless, essentially the same mathematics solves both types of problem.
This connection, which is often termed “duality,” is present whether one uses “classical” or “recursive” solution procedures.
In fact, we saw duality at work earlier when we formulated control and prediction problems recursively in lectures LQ
dynamic programming problems, A first look at the Kalman filter, and The permanent income model.
A useful consequence of duality is that
• With every LQ control problem, there is implicitly affiliated a linear least squares prediction or filtering problem.
• With every linear least squares prediction or filtering problem there is implicitly affiliated a LQ control problem.
An understanding of these connections has repeatedly proved useful in cracking interesting applied problems.
For example, Sargent [Sargent, 1987] [chs. IX, XIV] and Hansen and Sargent [Hansen and Sargent, 1980] formulated
and solved control and filtering problems using 𝑧-transform methods.
In this lecture, we begin to investigate these ideas by using mostly elementary linear algebra.
This is the main purpose and focus of the lecture.
However, after showing matrix algebra formulas, we’ll summarize classic infinite-horizon formulas built on 𝑧-transform
and lag operator methods.
And we’ll occasionally refer to some of these formulas from the infinite dimensional problems as we present the finite
time formulas and associated linear algebra.
We’ll start with the following standard import:


import numpy as np

33.1.1 References

Useful references include [Whittle, 1963], [Hansen and Sargent, 1980], [Orfanidis, 1988], [Athanasios and Pillai, 1991],
and [Muth, 1960].

33.2 Finite Dimensional Prediction

Let (𝑥1 , 𝑥2 , … , 𝑥𝑇 )′ = 𝑥 be a 𝑇 × 1 vector of random variables with mean 𝔼𝑥 = 0 and covariance matrix 𝔼𝑥𝑥′ = 𝑉 .
Here 𝑉 is a 𝑇 × 𝑇 positive definite matrix.
The 𝑖, 𝑗 component 𝐸𝑥𝑖 𝑥𝑗 of 𝑉 is the inner product between 𝑥𝑖 and 𝑥𝑗 .
We regard the random variables as being ordered in time so that 𝑥𝑡 is thought of as the value of some economic variable
at time 𝑡.
For example, 𝑥𝑡 could be generated by the random process described by the Wold representation presented in equation
(33.16) in the section below on infinite dimensional prediction and filtering.
In that case, $V_{ij}$ is given by the coefficient on $z^{|i-j|}$ in the expansion of $g_x(z) = d(z)\, d(z^{-1}) + h$, which equals $h + \sum_{k=0}^{\infty} d_k d_{k+|i-j|}$.
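To make this mapping from $(d, h)$ to $V$ concrete, here is a minimal sketch that builds the Toeplitz covariance matrix directly from the moving average coefficients. The helper names `autocovariances` and `build_V` are ours; the lecture's `LQFilter` class performs the same computation in its `construct_V` method.

import numpy as np
from scipy.linalg import toeplitz

def autocovariances(d, h):
    # c_tau = sum_k d_k d_{k+tau} for tau = 0, ..., m, with h added at tau = 0
    d = np.asarray(d, dtype=float)
    m = len(d) - 1
    c = np.array([d[:len(d) - tau] @ d[tau:] for tau in range(m + 1)])
    c[0] += h
    return c

def build_V(d, h, T):
    # V_ij = c_{|i-j|}, which is zero beyond lag m
    c = autocovariances(d, h)
    col = np.zeros(T)
    col[:len(c)] = c
    return toeplitz(col)

print(build_V(d=[1.0, -2.0], h=0.0, T=5))

With $d = (1, -2)$ and $h = 0$ this reproduces the $5 \times 5$ covariance matrix printed in Example 1 below.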
We want to construct 𝑗 step ahead linear least squares predictors of the form

𝔼̂ [𝑥𝑇 |𝑥𝑇 −𝑗 , 𝑥𝑇 −𝑗+1 , … , 𝑥1 ]

where 𝔼̂ is the linear least squares projection operator.


(Sometimes 𝔼̂ is called the wide-sense expectations operator)
To find linear least squares predictors it is helpful first to construct a 𝑇 × 1 vector 𝜀 of random variables that form an
orthonormal basis for the vector of random variables 𝑥.
The key insight here comes from noting that because the covariance matrix $V$ of $x$ is positive definite and symmetric, there exists a (Cholesky) decomposition of $V$ such that

$$
V = L^{-1} (L^{-1})'
$$

and

$$
L V L' = I
$$

where $L$ and $L^{-1}$ are both lower triangular.

Form the $T \times 1$ random vector $\varepsilon = L x$.

The random vector $\varepsilon$ is an orthonormal basis for $x$ because

• $L$ is nonsingular

• $\mathbb{E}\, \varepsilon \varepsilon' = L\, \mathbb{E} x x'\, L' = I$

• $x = L^{-1} \varepsilon$

(A small numerical check of these properties appears in the sketch below.)
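Here is that check; the $3 \times 3$ covariance matrix below is an illustrative choice of ours, not one taken from the lecture.

import numpy as np

V = np.array([[ 5., -2.,  0.],
              [-2.,  5., -2.],
              [ 0., -2.,  5.]])      # a small positive definite covariance matrix

Li = np.linalg.cholesky(V)           # lower triangular, V = Li @ Li.T, so Li plays the role of L^{-1}
L = np.linalg.inv(Li)                # lower triangular L with L V L' = I

print(np.allclose(L @ V @ L.T, np.eye(3)))   # True: ε = L x has identity covariance
print(np.allclose(Li, np.tril(Li)))          # True: L^{-1} is lower triangular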


It is enlightening to write out and interpret the equations 𝐿𝑥 = 𝜀 and 𝐿−1 𝜀 = 𝑥.


First, we’ll write 𝐿𝑥 = 𝜀

$$
\begin{aligned}
L_{11} x_1 &= \varepsilon_1 \\
L_{21} x_1 + L_{22} x_2 &= \varepsilon_2 \\
&\;\vdots \\
L_{T1} x_1 + \cdots + L_{TT} x_T &= \varepsilon_T
\end{aligned} \tag{33.1}
$$

or

$$
\sum_{j=0}^{t-1} L_{t,t-j}\, x_{t-j} = \varepsilon_t, \qquad t = 1, 2, \ldots, T \tag{33.2}
$$

Next, we write $L^{-1} \varepsilon = x$

$$
\begin{aligned}
x_1 &= L^{-1}_{11} \varepsilon_1 \\
x_2 &= L^{-1}_{22} \varepsilon_2 + L^{-1}_{21} \varepsilon_1 \\
&\;\vdots \\
x_T &= L^{-1}_{TT} \varepsilon_T + L^{-1}_{T,T-1} \varepsilon_{T-1} + \cdots + L^{-1}_{T,1} \varepsilon_1
\end{aligned} \tag{33.3}
$$

or

$$
x_t = \sum_{j=0}^{t-1} L^{-1}_{t,t-j}\, \varepsilon_{t-j} \tag{33.4}
$$

where $L^{-1}_{i,j}$ denotes the $i, j$ element of $L^{-1}$.

From (33.2), it follows that $\varepsilon_t$ is in the linear subspace spanned by $x_t, x_{t-1}, \ldots, x_1$.

From (33.4) it follows that $x_t$ is in the linear subspace spanned by $\varepsilon_t, \varepsilon_{t-1}, \ldots, \varepsilon_1$.

Equation (33.2) forms a sequence of autoregressions that for $t = 1, \ldots, T$ express $x_t$ as linear functions of $x_s, s = 1, \ldots, t-1$ and a random variable $(L_{t,t})^{-1} \varepsilon_t$ that is orthogonal to each component of $x_s, s = 1, \ldots, t-1$.

(Here $(L_{t,t})^{-1}$ denotes the reciprocal of $L_{t,t}$ while $L^{-1}_{t,t}$ denotes the $t,t$ element of $L^{-1}$.)

The equivalence of the subspaces spanned by $\varepsilon_t, \ldots, \varepsilon_1$ and $x_t, \ldots, x_1$ means that for $t - 1 \geq m \geq 1$

$$
\hat{\mathbb{E}}[x_t \mid x_{t-m}, x_{t-m-1}, \ldots, x_1] = \hat{\mathbb{E}}[x_t \mid \varepsilon_{t-m}, \varepsilon_{t-m-1}, \ldots, \varepsilon_1] \tag{33.5}
$$

To proceed, it is useful to drill down and note that for $t - 1 \geq m \geq 1$ we can rewrite (33.4) in the form of the moving average representation

$$
x_t = \sum_{j=0}^{m-1} L^{-1}_{t,t-j}\, \varepsilon_{t-j} + \sum_{j=m}^{t-1} L^{-1}_{t,t-j}\, \varepsilon_{t-j} \tag{33.6}
$$

Representation (33.6) is an orthogonal decomposition of $x_t$ into a part $\sum_{j=m}^{t-1} L^{-1}_{t,t-j} \varepsilon_{t-j}$ that lies in the space spanned by $[x_{t-m}, x_{t-m+1}, \ldots, x_1]$ and an orthogonal component $\sum_{j=0}^{m-1} L^{-1}_{t,t-j} \varepsilon_{t-j}$ that does not lie in that space but instead in a linear space known as its orthogonal complement.

It follows that

$$
\hat{\mathbb{E}}[x_t \mid x_{t-m}, x_{t-m-1}, \ldots, x_1] = \sum_{j=m}^{t-1} L^{-1}_{t,t-j}\, \varepsilon_{t-j}
$$


33.2.1 Implementation

Here’s the code that computes solutions to LQ control and filtering problems using the methods described here and in
Classical Control with Linear Algebra.

import numpy as np
import scipy.stats as spst
import scipy.linalg as la

class LQFilter:

def __init__(self, d, h, y_m, r=None, h_eps=None, β=None):


"""

Parameters
----------
d : list or numpy.array (1-D or a 2-D column vector)
The order of the coefficients: [d_0, d_1, ..., d_m]
h : scalar
Parameter of the objective function (corresponding to the
quadratic term)
y_m : list or numpy.array (1-D or a 2-D column vector)
Initial conditions for y
r : list or numpy.array (1-D or a 2-D column vector)
The order of the coefficients: [r_0, r_1, ..., r_k]
(optional, if not defined -> deterministic problem)
β : scalar
Discount factor (optional, default value is one)
"""

self.h = h
self.d = np.asarray(d)
self.m = self.d.shape[0] - 1

self.y_m = np.asarray(y_m)

if self.m == self.y_m.shape[0]:
self.y_m = self.y_m.reshape(self.m, 1)
else:
raise ValueError(f"y_m must be of length m = {self.m:d}")

#---------------------------------------------
# Define the coefficients of ϕ upfront
#---------------------------------------------
ϕ = np.zeros(2 * self.m + 1)
for i in range(- self.m, self.m + 1):
ϕ[self.m - i] = np.sum(np.diag(self.d.reshape(self.m + 1, 1) \
@ self.d.reshape(1, self.m + 1),
k=-i
)
)
ϕ[self.m] = ϕ[self.m] + self.h
self.ϕ = ϕ

#-----------------------------------------------------
# If r is given calculate the vector ϕ_r
#-----------------------------------------------------
if r is None:
pass
else:
self.r = np.asarray(r)
self.k = self.r.shape[0] - 1
ϕ_r = np.zeros(2 * self.k + 1)
for i in range(- self.k, self.k + 1):
ϕ_r[self.k - i] = np.sum(np.diag(self.r.reshape(self.k + 1, 1) \
@ self.r.reshape(1, self.k + 1),
k=-i
)
)
if h_eps is None:
self.ϕ_r = ϕ_r
else:
ϕ_r[self.k] = ϕ_r[self.k] + h_eps
self.ϕ_r = ϕ_r

#-----------------------------------------------------
# If β is given, define the transformed variables
#-----------------------------------------------------
if β is None:
self.β = 1
else:
self.β = β
self.d = self.β**(np.arange(self.m + 1)/2) * self.d
self.y_m = self.y_m * (self.β**(- np.arange(1, self.m + 1)/2)) \
.reshape(self.m, 1)

def construct_W_and_Wm(self, N):


"""
This constructs the matrices W and W_m for a given number of periods N
"""

m = self.m
d = self.d

W = np.zeros((N + 1, N + 1))
W_m = np.zeros((N + 1, m))

#---------------------------------------
# Terminal conditions
#---------------------------------------

D_m1 = np.zeros((m + 1, m + 1))


M = np.zeros((m + 1, m))

# (1) Constuct the D_{m+1} matrix using the formula

for j in range(m + 1):


for k in range(j, m + 1):
D_m1[j, k] = d[:j + 1] @ d[k - j: k + 1]

# Make the matrix symmetric


D_m1 = D_m1 + D_m1.T - np.diag(np.diag(D_m1))


# (2) Construct the M matrix using the entries of D_m1

for j in range(m):
for i in range(j + 1, m + 1):
M[i, j] = D_m1[i - j - 1, m]

#----------------------------------------------
# Euler equations for t = 0, 1, ..., N-(m+1)
#----------------------------------------------
ϕ = self.ϕ

W[:(m + 1), :(m + 1)] = D_m1 + self.h * np.eye(m + 1)


W[:(m + 1), (m + 1):(2 * m + 1)] = M

for i, row in enumerate(np.arange(m + 1, N + 1 - m)):


W[row, (i + 1):(2 * m + 2 + i)] = ϕ

for i in range(1, m + 1):


W[N - m + i, -(2 * m + 1 - i):] = ϕ[:-i]

for i in range(m):
W_m[N - i, :(m - i)] = ϕ[(m + 1 + i):]

return W, W_m

def roots_of_characteristic(self):
"""
This function calculates z_0 and the 2m roots of the characteristic
equation associated with the Euler equation (1.7)

Note:
------
numpy.poly1d(roots, True) defines a polynomial using its roots that can
be evaluated at any point. If x_1, x_2, ... , x_m are the roots then
p(x) = (x - x_1)(x - x_2)...(x - x_m)
"""
m = self.m
ϕ = self.ϕ

# Calculate the roots of the 2m-polynomial


roots = np.roots(ϕ)
# Sort the roots according to their length (in descending order)
roots_sorted = roots[np.argsort(abs(roots))[::-1]]

z_0 = ϕ.sum() / np.poly1d(roots, True)(1)


z_1_to_m = roots_sorted[:m] # We need only those outside the unit circle

λ = 1 / z_1_to_m

return z_1_to_m, z_0, λ

def coeffs_of_c(self):
'''
This function computes the coefficients {c_j, j = 0, 1, ..., m} for
c(z) = sum_{j = 0}^{m} c_j z^j


Based on the expression (1.9). The order is
c_coeffs = [c_0, c_1, ..., c_{m-1}, c_m]
'''
z_1_to_m, z_0 = self.roots_of_characteristic()[:2]

c_0 = (z_0 * np.prod(z_1_to_m).real * (- 1)**self.m)**(.5)


c_coeffs = np.poly1d(z_1_to_m, True).c * z_0 / c_0

return c_coeffs[::-1]

def solution(self):
"""
This function calculates {λ_j, j=1,...,m} and {A_j, j=1,...,m}
of the expression (1.15)
"""
λ = self.roots_of_characteristic()[2]
c_0 = self.coeffs_of_c()[-1]

A = np.zeros(self.m, dtype=complex)
for j in range(self.m):
denom = 1 - λ/λ[j]
A[j] = c_0**(-2) / np.prod(denom[np.arange(self.m) != j])

return λ, A

def construct_V(self, N):


'''
This function constructs the covariance matrix for x^N (see section 6)
for a given period N
'''
V = np.zeros((N, N))
ϕ_r = self.ϕ_r

for i in range(N):
for j in range(N):
if abs(i-j) <= self.k:
V[i, j] = ϕ_r[self.k + abs(i-j)]

return V

def simulate_a(self, N):


"""
Assuming that the u's are normal, this method draws a random path
for x^N
"""
V = self.construct_V(N + 1)
d = spst.multivariate_normal(np.zeros(N + 1), V)

return d.rvs()

def predict(self, a_hist, t):


"""
This function implements the prediction formula discussed in section 6 (1.59)
It takes a realization for a^N, and the period in which the prediction is
formed

Output: E[abar | a_t, a_{t-1}, ..., a_1, a_0]
"""

N = np.asarray(a_hist).shape[0] - 1
a_hist = np.asarray(a_hist).reshape(N + 1, 1)
V = self.construct_V(N + 1)

aux_matrix = np.zeros((N + 1, N + 1))


aux_matrix[:(t + 1), :(t + 1)] = np.eye(t + 1)
L = la.cholesky(V).T
Ea_hist = la.inv(L) @ aux_matrix @ L @ a_hist

return Ea_hist

def optimal_y(self, a_hist, t=None):


"""
- if t is NOT given it takes a_hist (list or numpy.array) as a
deterministic a_t
- if t is given, it solves the combined control prediction problem
(section 7)(by default, t == None -> deterministic)

for a given sequence of a_t (either deterministic or a particular


realization), it calculates the optimal y_t sequence using the method
of the lecture

Note:
------
scipy.linalg.lu normalizes L, U so that L has unit diagonal elements
To make things consistent with the lecture, we need an auxiliary
diagonal matrix D which renormalizes L and U
"""

N = np.asarray(a_hist).shape[0] - 1
W, W_m = self.construct_W_and_Wm(N)

L, U = la.lu(W, permute_l=True)
D = np.diag(1 / np.diag(U))
U = D @ U
L = L @ np.diag(1 / np.diag(D))

J = np.fliplr(np.eye(N + 1))

if t is None: # If the problem is deterministic

a_hist = J @ np.asarray(a_hist).reshape(N + 1, 1)

#--------------------------------------------
# Transform the 'a' sequence if β is given
#--------------------------------------------
if self.β != 1:
a_hist = a_hist * (self.β**(np.arange(N + 1) / 2))[::-1] \
.reshape(N + 1, 1)

a_bar = a_hist - W_m @ self.y_m # a_bar from the lecture


Uy = np.linalg.solve(L, a_bar) # U @ y_bar = L^{-1}
y_bar = np.linalg.solve(U, Uy) # y_bar = U^{-1}L^{-1}

# Reverse the order of y_bar with the matrix J


J = np.fliplr(np.eye(N + self.m + 1))
# y_hist : concatenated y_m and y_bar
y_hist = J @ np.vstack([y_bar, self.y_m])

#--------------------------------------------
# Transform the optimal sequence back if β is given
#--------------------------------------------
if self.β != 1:
y_hist = y_hist * (self.β**(- np.arange(-self.m, N + 1)/2)) \
.reshape(N + 1 + self.m, 1)

return y_hist, L, U, y_bar

else: # If the problem is stochastic and we look at it

Ea_hist = self.predict(a_hist, t).reshape(N + 1, 1)


Ea_hist = J @ Ea_hist

a_bar = Ea_hist - W_m @ self.y_m # a_bar from the lecture


Uy = np.linalg.solve(L, a_bar) # U @ y_bar = L^{-1}
y_bar = np.linalg.solve(U, Uy) # y_bar = U^{-1}L^{-1}

# Reverse the order of y_bar with the matrix J


J = np.fliplr(np.eye(N + self.m + 1))
# y_hist : concatenated y_m and y_bar
y_hist = J @ np.vstack([y_bar, self.y_m])

return y_hist, L, U, y_bar

Let’s use this code to tackle two interesting examples.

33.2.2 Example 1

Consider a stochastic process with moving average representation

𝑥𝑡 = (1 − 2𝐿)𝜀𝑡

where 𝜀𝑡 is a serially uncorrelated random process with mean zero and variance unity.
If we were to use the tools associated with infinite dimensional prediction and filtering to be described below, we would
use the Wiener-Kolmogorov formula (33.21) to compute the linear least squares forecasts 𝔼[𝑥𝑡+𝑗 ∣ 𝑥𝑡 , 𝑥𝑡−1 , …], for
𝑗 = 1, 2.
But we can do everything we want by instead using our finite dimensional tools and setting 𝑑 = 𝑟, generating an instance
of LQFilter, then invoking pertinent methods of LQFilter.

m = 1
y_m = np.asarray([.0]).reshape(m, 1)
d = np.asarray([1, -2])
r = np.asarray([1, -2])
h = 0.0
example = LQFilter(d, h, y_m, r=d)


The Wold representation is computed by example.coeffs_of_c().


Let’s check that it “flips roots” as required

example.coeffs_of_c()

array([ 2., -1.])

example.roots_of_characteristic()

(array([2.]), -2.0, array([0.5]))

Now let’s form the covariance matrix of a time series vector of length 𝑁 and put it in 𝑉 .
Then we’ll take a Cholesky decomposition of $V = L^{-1} (L^{-1})'$ and use it to form the vector of “moving average representations” $x = L^{-1} \varepsilon$ and the vector of “autoregressive representations” $L x = \varepsilon$.

V = example.construct_V(N=5)
print(V)

[[ 5. -2. 0. 0. 0.]
[-2. 5. -2. 0. 0.]
[ 0. -2. 5. -2. 0.]
[ 0. 0. -2. 5. -2.]
[ 0. 0. 0. -2. 5.]]

Notice how the lower rows of the “moving average representations” are converging to the appropriate infinite history
Wold representation to be described below when we study infinite horizon-prediction and filtering

Li = np.linalg.cholesky(V)
print(Li)

[[ 2.23606798 0. 0. 0. 0. ]
[-0.89442719 2.04939015 0. 0. 0. ]
[ 0. -0.97590007 2.01186954 0. 0. ]
[ 0. 0. -0.99410024 2.00293902 0. ]
[ 0. 0. 0. -0.99853265 2.000733 ]]

Notice how the lower rows of the “autoregressive representations” are converging to the appropriate infinite-history au-
toregressive representation to be described below when we study infinite horizon-prediction and filtering

L = np.linalg.inv(Li)
print(L)

[[0.4472136 0. 0. 0. 0. ]
[0.19518001 0.48795004 0. 0. 0. ]
[0.09467621 0.23669053 0.49705012 0. 0. ]
[0.04698977 0.11747443 0.2466963 0.49926632 0. ]
[0.02345182 0.05862954 0.12312203 0.24917554 0.49981682]]


33.2.3 Example 2

Consider a stochastic process 𝑋𝑡 with moving average representation



$X_t = (1 - \sqrt{2} L^2)\, \varepsilon_t$

where 𝜀𝑡 is a serially uncorrelated random process with mean zero and variance unity.
Let’s find a Wold moving average representation for 𝑥𝑡 that will prevail in the infinite-history context to be studied in
detail below.
To do this, we’ll use the Wiener-Kolmogorov formula (33.21) presented below to compute the linear least squares
forecasts 𝔼̂ [𝑋𝑡+𝑗 ∣ 𝑋𝑡−1 , …] for 𝑗 = 1, 2, 3.
We proceed in the same way as in example 1

m = 2
y_m = np.asarray([.0, .0]).reshape(m, 1)
d = np.asarray([1, 0, -np.sqrt(2)])
r = np.asarray([1, 0, -np.sqrt(2)])
h = 0.0
example = LQFilter(d, h, y_m, r=d)
example.coeffs_of_c()

array([ 1.41421356, -0. , -1. ])

example.roots_of_characteristic()

(array([ 1.18920712, -1.18920712]),


-1.4142135623731122,
array([ 0.84089642, -0.84089642]))

V = example.construct_V(N=8)
print(V)

[[ 3. 0. -1.41421356 0. 0. 0.
0. 0. ]
[ 0. 3. 0. -1.41421356 0. 0.
0. 0. ]
[-1.41421356 0. 3. 0. -1.41421356 0.
0. 0. ]
[ 0. -1.41421356 0. 3. 0. -1.41421356
0. 0. ]
[ 0. 0. -1.41421356 0. 3. 0.
-1.41421356 0. ]
[ 0. 0. 0. -1.41421356 0. 3.
0. -1.41421356]
[ 0. 0. 0. 0. -1.41421356 0.
3. 0. ]
[ 0. 0. 0. 0. 0. -1.41421356
0. 3. ]]

Li = np.linalg.cholesky(V)
print(Li[-3:, :])


[[ 0. 0. 0. -0.9258201 0. 1.46385011
0. 0. ]
[ 0. 0. 0. 0. -0.96609178 0.
1.43759058 0. ]
[ 0. 0. 0. 0. 0. -0.96609178
0. 1.43759058]]

L = np.linalg.inv(Li)
print(L)

[[0.57735027 0. 0. 0. 0. 0.
0. 0. ]
[0. 0.57735027 0. 0. 0. 0.
0. 0. ]
[0.3086067 0. 0.65465367 0. 0. 0.
0. 0. ]
[0. 0.3086067 0. 0.65465367 0. 0.
0. 0. ]
[0.19518001 0. 0.41403934 0. 0.68313005 0.
0. 0. ]
[0. 0.19518001 0. 0.41403934 0. 0.68313005
0. 0. ]
[0.13116517 0. 0.27824334 0. 0.45907809 0.
0.69560834 0. ]
[0. 0.13116517 0. 0.27824334 0. 0.45907809
0. 0.69560834]]

33.2.4 Prediction

It immediately follows from the “orthogonality principle” of least squares (see [Athanasios and Pillai, 1991] or [Sargent,
1987] [ch. X]) that
$$
\begin{aligned}
\hat{\mathbb{E}}[x_t \mid x_{t-m}, x_{t-m+1}, \ldots, x_1]
&= \sum_{j=m}^{t-1} L^{-1}_{t,t-j}\, \varepsilon_{t-j} \\
&= [L^{-1}_{t,1}\; L^{-1}_{t,2}, \ldots, L^{-1}_{t,t-m}\; 0\; 0 \ldots 0]\, L\, x
\end{aligned} \tag{33.7}
$$

This can be interpreted as a finite-dimensional version of the Wiener-Kolmogorov 𝑚-step ahead prediction formula.
We can use (33.7) to represent the linear least squares projection of the vector 𝑥 conditioned on the first 𝑠 observations
[𝑥𝑠 , 𝑥𝑠−1 … , 𝑥1 ].
We have

$$
\hat{\mathbb{E}}[x \mid x_s, x_{s-1}, \ldots, x_1] = L^{-1}
\begin{bmatrix} I_s & 0 \\ 0 & 0_{(t-s)} \end{bmatrix} L x \tag{33.8}
$$

This formula will be convenient in representing the solution of control problems under uncertainty.
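Here is a minimal sketch of formula (33.8); it mirrors what the `predict` method of `LQFilter` does, but the function name, the covariance matrix, and the data vector below are illustrative choices of ours.

import numpy as np

def project_on_first_s(V, x, s):
    # E-hat[x | x_s, ..., x_1] = L^{-1} diag(I_s, 0) L x, where V = L^{-1} (L^{-1})'
    T = V.shape[0]
    Li = np.linalg.cholesky(V)          # lower triangular L^{-1}
    L = np.linalg.inv(Li)
    selector = np.zeros((T, T))
    selector[:s, :s] = np.eye(s)        # keep only the first s orthonormal shocks
    return Li @ selector @ L @ np.asarray(x).reshape(T, 1)

V = np.array([[5., -2., 0.], [-2., 5., -2.], [0., -2., 5.]])
x = np.array([1.0, 2.0, 3.0])
print(project_on_first_s(V, x, s=3).flatten())   # [1. 2. 3.]: conditioning on everything returns x
print(project_on_first_s(V, x, s=1).flatten())   # x_1 is reproduced; the remaining entries are predictions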
Equation (33.4) can be recognized as a finite dimensional version of a moving average representation.
Equation (33.2) can be viewed as a finite dimension version of an autoregressive representation.
Notice that even if the 𝑥𝑡 process is covariance stationary, so that 𝑉 is such that 𝑉𝑖𝑗 depends only on |𝑖−𝑗|, the coefficients
in the moving average representation are time-dependent, there being a different moving average for each 𝑡.


If 𝑥𝑡 is a covariance stationary process, the last row of 𝐿−1 converges to the coefficients in the Wold moving average
representation for {𝑥𝑡 } as 𝑇 → ∞.
Further, if $x_t$ is covariance stationary, for fixed $k$ and $j > 0$, $L^{-1}_{T,T-j}$ converges to $L^{-1}_{T-k,T-k-j}$ as $T \to \infty$.

That is, the “bottom” rows of 𝐿−1 converge to each other and to the Wold moving average coefficients as 𝑇 → ∞.
This last observation gives one simple and widely-used practical way of forming a finite 𝑇 approximation to a Wold
moving average representation.

First, form the covariance matrix $\mathbb{E} x x' = V$, then obtain the Cholesky decomposition $L^{-1} (L^{-1})'$ of $V$, which can be accomplished quickly on a computer.
The last row of 𝐿−1 gives the approximate Wold moving average coefficients.
This method can readily be generalized to multivariate systems.
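Here is a sketch of that recipe applied to the moving average process of Example 1 above, $x_t = (1 - 2L)\varepsilon_t$; the truncation length $T = 30$ is an arbitrary choice of ours.

import numpy as np
from scipy.linalg import toeplitz

T = 30
c = np.zeros(T)
c[0], c[1] = 5.0, -2.0        # autocovariances of (1 - 2L)ε_t: c_0 = 1 + 4, c_1 = 1 * (-2)
V = toeplitz(c)               # T x T covariance matrix

Li = np.linalg.cholesky(V)    # V = L^{-1} (L^{-1})'
print(Li[-1, -2:])            # ≈ [-1., 2.]: the Wold representation x_t = 2 η_t - η_{t-1}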

33.3 Combined Finite Dimensional Control and Prediction

Consider the finite-dimensional control problem, maximize


$$
\mathbb{E} \sum_{t=0}^{N} \left\{ a_t y_t - \frac{1}{2} h y_t^2 - \frac{1}{2} [d(L) y_t]^2 \right\}, \qquad h > 0
$$

where $d(L) = d_0 + d_1 L + \ldots + d_m L^m$, $L$ is the lag operator, $\bar a = [a_N, a_{N-1}, \ldots, a_1, a_0]'$ a random vector with mean zero and $\mathbb{E}\, \bar a \bar a' = V$.
The variables 𝑦−1 , … , 𝑦−𝑚 are given.
Maximization is over choices of 𝑦0 , 𝑦1 … , 𝑦𝑁 , where 𝑦𝑡 is required to be a linear function of {𝑦𝑡−𝑠−1 , 𝑡 + 𝑚 − 1 ≥
0; 𝑎𝑡−𝑠 , 𝑡 ≥ 𝑠 ≥ 0}.
We saw in the lecture Classical Control with Linear Algebra that the solution of this problem under certainty could be
represented in the feedback-feedforward form

$$
U \bar y = L^{-1} \bar a + K \begin{bmatrix} y_{-1} \\ \vdots \\ y_{-m} \end{bmatrix}
$$

for some $(N + 1) \times m$ matrix $K$.


Using a version of formula (33.7), we can express $\hat{\mathbb{E}}[\bar a \mid a_s, a_{s-1}, \ldots, a_0]$ as

$$
\hat{\mathbb{E}}[\bar a \mid a_s, a_{s-1}, \ldots, a_0] = \tilde U^{-1} \begin{bmatrix} 0 & 0 \\ 0 & I_{(s+1)} \end{bmatrix} \tilde U \bar a
$$

where $I_{(s+1)}$ is the $(s+1) \times (s+1)$ identity matrix, and $V = \tilde U^{-1} (\tilde U^{-1})'$, where $\tilde U$ is the upper triangular Cholesky factor of the covariance matrix $V$.

(We have reversed the time axis in dating the $a$'s relative to earlier.)
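The sketch below checks the displayed projection formula on a small randomly generated covariance matrix; the construction of the upper triangular factor through the exchange matrix $J$ and all numerical values are ours, not part of the lecture code.

import numpy as np

N, s = 4, 2
rng = np.random.default_rng(0)
A = rng.standard_normal((N + 1, N + 1))
V = A @ A.T + (N + 1) * np.eye(N + 1)            # a positive definite covariance matrix

J = np.fliplr(np.eye(N + 1))                     # exchange matrix
Ui = J @ np.linalg.cholesky(J @ V @ J) @ J       # upper triangular, Ui @ Ui.T = V, so Ui plays the role of U_tilde^{-1}
U = np.linalg.inv(Ui)                            # the upper triangular U_tilde

selector = np.zeros((N + 1, N + 1))
selector[-(s + 1):, -(s + 1):] = np.eye(s + 1)   # keep the shocks dated s, s-1, ..., 0

a_bar = rng.standard_normal(N + 1)               # a_bar = [a_N, ..., a_1, a_0]'
Ea = Ui @ selector @ U @ a_bar

print(np.allclose(Ui @ Ui.T, V))                      # True: the factorization is as claimed
print(np.allclose(Ea[-(s + 1):], a_bar[-(s + 1):]))   # True: a_s, ..., a_0 are reproduced exactly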
The time axis can be reversed in representation (33.8) by replacing $L$ with $L^T$.

The optimal decision rule to use at time $0 \leq t \leq N$ is then given by the $(N - t + 1)$th row of

$$
U \bar y = L^{-1} \tilde U^{-1} \begin{bmatrix} 0 & 0 \\ 0 & I_{(t+1)} \end{bmatrix} \tilde U \bar a + K \begin{bmatrix} y_{-1} \\ \vdots \\ y_{-m} \end{bmatrix}
$$


33.4 Infinite Horizon Prediction and Filtering Problems

It is instructive to compare the finite-horizon formulas based on linear algebra decompositions of finite-dimensional
covariance matrices with classic formulas for infinite horizon and infinite history prediction and control problems.
These classic infinite horizon formulas used the mathematics of 𝑧-transforms and lag operators.
We’ll meet interesting lag operator and 𝑧-transform counterparts to our finite horizon matrix formulas.
We pose two related prediction and filtering problems.
We let 𝑌𝑡 be a univariate 𝑚th order moving average, covariance stationary stochastic process,

$$
Y_t = d(L) u_t \tag{33.9}
$$

where $d(L) = \sum_{j=0}^{m} d_j L^j$, and $u_t$ is a serially uncorrelated stationary random process satisfying

$$
\begin{aligned}
\mathbb{E} u_t &= 0 \\
\mathbb{E} u_t u_s &= \begin{cases} 1 & \text{if } t = s \\ 0 & \text{otherwise} \end{cases}
\end{aligned} \tag{33.10}
$$

We impose no conditions on the zeros of 𝑑(𝑧).


A second covariance stationary process is 𝑋𝑡 given by

𝑋𝑡 = 𝑌𝑡 + 𝜀𝑡 (33.11)

where 𝜀𝑡 is a serially uncorrelated stationary random process with 𝔼𝜀𝑡 = 0 and 𝔼𝜀𝑡 𝜀𝑠 = 0 for all distinct 𝑡 and 𝑠.
We also assume that 𝔼𝜀𝑡 𝑢𝑠 = 0 for all 𝑡 and 𝑠.
The linear least squares prediction problem is to find the $L^2$ random variable $\hat X_{t+j}$ among linear combinations of $\{X_t, X_{t-1}, \ldots\}$ that minimizes $\mathbb{E}(\hat X_{t+j} - X_{t+j})^2$.

That is, the problem is to find a $\gamma_j(L) = \sum_{k=0}^{\infty} \gamma_{jk} L^k$ such that $\sum_{k=0}^{\infty} |\gamma_{jk}|^2 < \infty$ and $\mathbb{E}[\gamma_j(L) X_t - X_{t+j}]^2$ is minimized.

The linear least squares filtering problem is to find a $b(L) = \sum_{j=0}^{\infty} b_j L^j$ such that $\sum_{j=0}^{\infty} |b_j|^2 < \infty$ and $\mathbb{E}[b(L) X_t - Y_t]^2$ is minimized.
Interesting versions of these problems related to the permanent income theory were studied by [Muth, 1960].

33.4.1 Problem Formulation

These problems are solved as follows.


The covariograms of 𝑌 and 𝑋 and their cross covariogram are, respectively,

$$
\begin{aligned}
C_X(\tau) &= \mathbb{E} X_t X_{t-\tau} \\
C_Y(\tau) &= \mathbb{E} Y_t Y_{t-\tau} \qquad \tau = 0, \pm 1, \pm 2, \ldots \\
C_{Y,X}(\tau) &= \mathbb{E} Y_t X_{t-\tau}
\end{aligned} \tag{33.12}
$$


The covariance and cross-covariance generating functions are defined as

$$
\begin{aligned}
g_X(z) &= \sum_{\tau=-\infty}^{\infty} C_X(\tau) z^\tau \\
g_Y(z) &= \sum_{\tau=-\infty}^{\infty} C_Y(\tau) z^\tau \\
g_{YX}(z) &= \sum_{\tau=-\infty}^{\infty} C_{YX}(\tau) z^\tau
\end{aligned} \tag{33.13}
$$

The generating functions can be computed by using the following facts.


Let 𝑣1𝑡 and 𝑣2𝑡 be two mutually and serially uncorrelated white noises with unit variances.
That is, $\mathbb{E} v_{1t}^2 = \mathbb{E} v_{2t}^2 = 1$, $\mathbb{E} v_{1t} = \mathbb{E} v_{2t} = 0$, $\mathbb{E} v_{1t} v_{2s} = 0$ for all $t$ and $s$, $\mathbb{E} v_{1t} v_{1t-j} = \mathbb{E} v_{2t} v_{2t-j} = 0$ for all $j \neq 0$.
Let 𝑥𝑡 and 𝑦𝑡 be two random processes given by

𝑦𝑡 = 𝐴(𝐿)𝑣1𝑡 + 𝐵(𝐿)𝑣2𝑡
𝑥𝑡 = 𝐶(𝐿)𝑣1𝑡 + 𝐷(𝐿)𝑣2𝑡

Then, as shown for example in [Sargent, 1987] [ch. XI], it is true that

$$
\begin{aligned}
g_y(z) &= A(z) A(z^{-1}) + B(z) B(z^{-1}) \\
g_x(z) &= C(z) C(z^{-1}) + D(z) D(z^{-1}) \\
g_{yx}(z) &= A(z) C(z^{-1}) + B(z) D(z^{-1})
\end{aligned} \tag{33.14}
$$

Applying these formulas to (33.9) – (33.12), we have

$$
\begin{aligned}
g_Y(z) &= d(z) d(z^{-1}) \\
g_X(z) &= d(z) d(z^{-1}) + h \\
g_{YX}(z) &= d(z) d(z^{-1})
\end{aligned} \tag{33.15}
$$

The key step in obtaining solutions to our problems is to factor the covariance generating function 𝑔𝑋 (𝑧) of 𝑋.
The solutions of our problems are given by formulas due to Wiener and Kolmogorov.
These formulas utilize the Wold moving average representation of the 𝑋𝑡 process,

$$
X_t = c(L)\, \eta_t \tag{33.16}
$$

where $c(L) = \sum_{j=0}^{m} c_j L^j$, with

$$
c_0 \eta_t = X_t - \hat{\mathbb{E}}[X_t \mid X_{t-1}, X_{t-2}, \ldots] \tag{33.17}
$$

Here 𝔼̂ is the linear least squares projection operator.


Equation (33.17) is the condition that 𝑐0 𝜂𝑡 can be the one-step-ahead error in predicting 𝑋𝑡 from its own past values.
Condition (33.17) requires that 𝜂𝑡 lie in the closed linear space spanned by [𝑋𝑡 , 𝑋𝑡−1 , …].
This will be true if and only if the zeros of 𝑐(𝑧) do not lie inside the unit circle.
It is an implication of (33.17) that 𝜂𝑡 is a serially uncorrelated random process and that normalization can be imposed so
that 𝔼𝜂𝑡2 = 1.
Consequently, an implication of (33.16) is that the covariance generating function of 𝑋𝑡 can be expressed as

𝑔𝑋 (𝑧) = 𝑐 (𝑧) 𝑐 (𝑧 −1 ) (33.18)


It remains to discuss how 𝑐(𝐿) is to be computed.


Combining (33.14) and (33.18) gives

𝑑(𝑧) 𝑑(𝑧 −1 ) + ℎ = 𝑐 (𝑧) 𝑐 (𝑧 −1 ) (33.19)

Therefore, we have already shown constructively how to factor the covariance generating function 𝑔𝑋 (𝑧) = 𝑑(𝑧) 𝑑 (𝑧 −1 )+
ℎ.
We now introduce the annihilation operator:
$$
\left[ \sum_{j=-\infty}^{\infty} f_j L^j \right]_+ \equiv \sum_{j=0}^{\infty} f_j L^j \tag{33.20}
$$

In words, [ ]+ means “ignore negative powers of 𝐿”.
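As a concrete illustration, here is a minimal sketch of the annihilation operator acting on a finite Laurent polynomial in $L$; the coefficient-array representation and the function name are ours.

import numpy as np

def annihilate(coeffs, lowest_power):
    # Drop all terms with negative powers of L from sum_j f_j L^j
    powers = np.arange(lowest_power, lowest_power + len(coeffs))
    keep = powers >= 0
    return np.asarray(coeffs)[keep], powers[keep]

# [2 L^{-2} + 3 L^{-1} + 5 + 7 L]_+ = 5 + 7 L
coeffs, powers = annihilate([2, 3, 5, 7], lowest_power=-2)
print(coeffs, powers)      # [5 7] [0 1]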


We have defined the solution of the prediction problem as $\hat{\mathbb{E}}[X_{t+j} \mid X_t, X_{t-1}, \ldots] = \gamma_j(L) X_t$.

Assuming that the roots of $c(z) = 0$ all lie outside the unit circle, the Wiener-Kolmogorov formula for $\gamma_j(L)$ holds:

$$
\gamma_j(L) = \left[ \frac{c(L)}{L^j} \right]_+ c(L)^{-1} \tag{33.21}
$$

We have defined the solution of the filtering problem as $\hat{\mathbb{E}}[Y_t \mid X_t, X_{t-1}, \ldots] = b(L) X_t$.

The Wiener-Kolmogorov formula for $b(L)$ is

$$
b(L) = \left[ \frac{g_{YX}(L)}{c(L^{-1})} \right]_+ c(L)^{-1}
$$

or

$$
b(L) = \left[ \frac{d(L) d(L^{-1})}{c(L^{-1})} \right]_+ c(L)^{-1} \tag{33.22}
$$

Formulas (33.21) and (33.22) are discussed in detail in [Whittle, 1983] and [Sargent, 1987].
The interested reader can there find several examples of the use of these formulas in economics. Some classic examples using these formulas are due to [Muth, 1960].
As an example of the usefulness of formula (33.22), we let 𝑋𝑡 be a stochastic process with Wold moving average repre-
sentation

$$
X_t = c(L) \eta_t
$$

where $\mathbb{E} \eta_t^2 = 1$, $c_0 \eta_t = X_t - \hat{\mathbb{E}}[X_t \mid X_{t-1}, \ldots]$, and $c(L) = \sum_{j=0}^{m} c_j L^j$.

Suppose that at time 𝑡, we wish to predict a geometric sum of future 𝑋’s, namely

$$
y_t \equiv \sum_{j=0}^{\infty} \delta^j X_{t+j} = \frac{1}{1 - \delta L^{-1}} X_t
$$

given knowledge of 𝑋𝑡 , 𝑋𝑡−1 , ….


We shall use (33.22) to obtain the answer.
Using the standard formulas (33.14), we have that

$$
\begin{aligned}
g_{yx}(z) &= \frac{1}{1 - \delta z^{-1}}\, c(z) c(z^{-1}) \\
g_x(z) &= c(z) c(z^{-1})
\end{aligned}
$$


Then (33.22) becomes

$$
b(L) = \left[ \frac{c(L)}{1 - \delta L^{-1}} \right]_+ c(L)^{-1} \tag{33.23}
$$

In order to evaluate the term in the annihilation operator, we use the following result from [Hansen and Sargent, 1980].
Proposition Let
• $g(z) = \sum_{j=0}^{\infty} g_j z^j$ where $\sum_{j=0}^{\infty} |g_j|^2 < +\infty$.

• $h(z^{-1}) = (1 - \delta_1 z^{-1}) \cdots (1 - \delta_n z^{-1})$, where $|\delta_j| < 1$, for $j = 1, \ldots, n$.

Then

$$
\left[ \frac{g(z)}{h(z^{-1})} \right]_+ = \frac{g(z)}{h(z^{-1})} - \sum_{j=1}^{n} \frac{\delta_j g(\delta_j)}{\prod_{k=1, k \neq j}^{n} (\delta_j - \delta_k)} \left( \frac{1}{z - \delta_j} \right) \tag{33.24}
$$

and, alternatively,

$$
\left[ \frac{g(z)}{h(z^{-1})} \right]_+ = \sum_{j=1}^{n} B_j \left( \frac{z g(z) - \delta_j g(\delta_j)}{z - \delta_j} \right) \tag{33.25}
$$

where $B_j = 1 / \prod_{k=1, k \neq j}^{n} (1 - \delta_k / \delta_j)$.
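Here is a numerical sanity check of formula (33.25) in the simplest case $n = 1$, where $h(z^{-1}) = 1 - \delta z^{-1}$ and $B_1 = 1$; the particular $g(z)$ and $\delta$ below are arbitrary illustrative choices of ours.

import numpy as np

g = np.array([1.0, 0.5, -0.3])    # g(z) = 1 + 0.5 z - 0.3 z^2, lowest power first
δ = 0.6

# Left side: the coefficient on z^j of [g(z) / (1 - δ z^{-1})]_+ is sum_{i >= j} g_i δ^{i-j}
lhs = np.array([sum(g[i] * δ**(i - j) for i in range(j, len(g))) for j in range(len(g))])

# Right side of (33.25): divide z g(z) - δ g(δ) by (z - δ)
zg = np.concatenate(([-δ * np.polyval(g[::-1], δ)], g))           # coefficients of z g(z) - δ g(δ), lowest first
quotient, remainder = np.polydiv(zg[::-1], np.array([1.0, -δ]))   # numpy wants highest power first
rhs = quotient[::-1]

print(np.allclose(lhs, rhs), np.allclose(remainder, 0.0))         # True True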

Applying formula (33.25) of the proposition to evaluating (33.23) with 𝑔(𝑧) = 𝑐(𝑧) and ℎ(𝑧−1 ) = 1 − 𝛿𝑧 −1 gives

$$
b(L) = \left[ \frac{L c(L) - \delta c(\delta)}{L - \delta} \right] c(L)^{-1}
$$

or

$$
b(L) = \frac{1 - \delta c(\delta) L^{-1} c(L)^{-1}}{1 - \delta L^{-1}}
$$

Thus, we have

$$
\hat{\mathbb{E}} \left[ \sum_{j=0}^{\infty} \delta^j X_{t+j} \,\Big|\, X_t, X_{t-1}, \ldots \right] = \left[ \frac{1 - \delta c(\delta) L^{-1} c(L)^{-1}}{1 - \delta L^{-1}} \right] X_t \tag{33.26}
$$

This formula is useful in solving stochastic versions of problem 1 of lecture Classical Control with Linear Algebra in which
the randomness emerges because {𝑎𝑡 } is a stochastic process.
The problem is to maximize
$$
\mathbb{E}_0 \lim_{N \to \infty} \sum_{t=0}^{N} \beta^t \left[ a_t y_t - \frac{1}{2} h y_t^2 - \frac{1}{2} [d(L) y_t]^2 \right] \tag{33.27}
$$

where 𝔼𝑡 is mathematical expectation conditioned on information known at 𝑡, and where {𝑎𝑡 } is a covariance stationary
stochastic process with Wold moving average representation

𝑎𝑡 = 𝑐(𝐿) 𝜂𝑡

where
𝑛̃
𝑐(𝐿) = ∑ 𝑐𝑗 𝐿𝑗
𝑗=0


and

$$
\eta_t = a_t - \hat{\mathbb{E}}[a_t \mid a_{t-1}, \ldots]
$$

The problem is to maximize (33.27) with respect to a contingency plan expressing 𝑦𝑡 as a function of information known
at 𝑡, which is assumed to be (𝑦𝑡−1 , 𝑦𝑡−2 , … , 𝑎𝑡 , 𝑎𝑡−1 , …).
The solution of this problem can be achieved in two steps.
First, ignoring the uncertainty, we can solve the problem assuming that {𝑎𝑡 } is a known sequence.
The solution is, from above,

𝑐(𝐿)𝑦𝑡 = 𝑐(𝛽𝐿−1 )−1 𝑎𝑡

or
$$
(1 - \lambda_1 L) \cdots (1 - \lambda_m L) y_t = \sum_{j=1}^{m} A_j \sum_{k=0}^{\infty} (\lambda_j \beta)^k a_{t+k} \tag{33.28}
$$

Second, the solution of the problem under uncertainty is obtained by replacing the terms on the right-hand side of the
above expressions with their linear least squares predictors.
Using (33.26) and (33.28), we have the following solution
$$
(1 - \lambda_1 L) \cdots (1 - \lambda_m L) y_t = \sum_{j=1}^{m} A_j \left[ \frac{1 - \beta \lambda_j c(\beta \lambda_j) L^{-1} c(L)^{-1}}{1 - \beta \lambda_j L^{-1}} \right] a_t
$$

Blaschke factors
The following is a useful piece of mathematics underlying “root flipping”.
Let $\pi(z) = \sum_{j=0}^{m} \pi_j z^j$ and let $z_1, \ldots, z_k$ be the zeros of $\pi(z)$ that are inside the unit circle, $k < m$.

Then define

$$
\theta(z) = \pi(z) \left( \frac{z_1 z - 1}{z - z_1} \right) \left( \frac{z_2 z - 1}{z - z_2} \right) \cdots \left( \frac{z_k z - 1}{z - z_k} \right)
$$

The term multiplying 𝜋(𝑧) is termed a “Blaschke factor”.


Then it can be proved directly that

𝜃(𝑧 −1 )𝜃(𝑧) = 𝜋(𝑧 −1 )𝜋(𝑧)

and that the zeros of 𝜃(𝑧) are not inside the unit circle.
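A quick numerical check of both claims, for an illustrative polynomial with one zero inside and one zero outside the unit circle (the example is ours):

import numpy as np

def π(z):
    return (1 - 0.5 * z) * (1 - 2.0 * z)        # zeros at z = 2 and z = 0.5

def θ(z):
    z1 = 0.5                                    # the zero of π inside the unit circle
    return π(z) * (z1 * z - 1) / (z - z1)       # multiply by the Blaschke factor

rng = np.random.default_rng(1)
z = rng.standard_normal(5) + 1j * rng.standard_normal(5)
print(np.allclose(θ(1 / z) * θ(z), π(1 / z) * π(z)))   # True: θ(z^{-1})θ(z) = π(z^{-1})π(z)
print(abs(θ(2.0)))                                     # 0.0: the zero at z = 0.5 has been flipped to z = 2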

33.5 Exercises

Exercise 33.5.1
Let $Y_t = (1 - 2L) u_t$ where $u_t$ is a mean zero white noise with $\mathbb{E} u_t^2 = 1$. Let

$$
X_t = Y_t + \varepsilon_t
$$

where $\varepsilon_t$ is a serially uncorrelated white noise with $\mathbb{E} \varepsilon_t^2 = 9$, and $\mathbb{E} \varepsilon_t u_s = 0$ for all $t$ and $s$.


Find the Wold moving average representation for 𝑋𝑡 .


Find a formula for the $A_{1j}$'s in

$$
\hat{\mathbb{E}} X_{t+1} \mid X_t, X_{t-1}, \ldots = \sum_{j=0}^{\infty} A_{1j} X_{t-j}
$$

Find a formula for the $A_{2j}$'s in

$$
\hat{\mathbb{E}} X_{t+2} \mid X_t, X_{t-1}, \ldots = \sum_{j=0}^{\infty} A_{2j} X_{t-j}
$$

Exercise 33.5.2
Multivariable Prediction: Let $Y_t$ be an $(n \times 1)$ vector stochastic process with moving average representation

$$
Y_t = D(L) U_t
$$

where $D(L) = \sum_{j=0}^{m} D_j L^j$, $D_j$ an $n \times n$ matrix, $U_t$ an $(n \times 1)$ vector white noise with $\mathbb{E} U_t = 0$ for all $t$, $\mathbb{E} U_t U_s' = 0$ for all $s \neq t$, and $\mathbb{E} U_t U_t' = I$ for all $t$.

Let $\varepsilon_t$ be an $n \times 1$ vector white noise with mean $0$ and contemporaneous covariance matrix $H$, where $H$ is a positive definite matrix.

Let $X_t = Y_t + \varepsilon_t$.

Define the covariograms as $C_X(\tau) = \mathbb{E} X_t X_{t-\tau}'$, $C_Y(\tau) = \mathbb{E} Y_t Y_{t-\tau}'$, $C_{YX}(\tau) = \mathbb{E} Y_t X_{t-\tau}'$.

Then define the matrix covariance generating function, as in (32.21), only interpret all the objects in (32.21) as matrices.

Show that the covariance generating functions are given by

$$
\begin{aligned}
g_Y(z) &= D(z) D(z^{-1})' \\
g_X(z) &= D(z) D(z^{-1})' + H \\
g_{YX}(z) &= D(z) D(z^{-1})'
\end{aligned}
$$

A factorization of $g_X(z)$ can be found (see [Rozanov, 1967] or [Whittle, 1983]) of the form

$$
D(z) D(z^{-1})' + H = C(z) C(z^{-1})', \qquad C(z) = \sum_{j=0}^{m} C_j z^j
$$

where the zeros of $|C(z)|$ do not lie inside the unit circle.

A vector Wold moving average representation of $X_t$ is then

$$
X_t = C(L) \eta_t
$$

where $\eta_t$ is an $(n \times 1)$ vector white noise that is “fundamental” for $X_t$.

That is, $X_t - \hat{\mathbb{E}}[X_t \mid X_{t-1}, X_{t-2}, \ldots] = C_0 \eta_t$.

The optimum predictor of $X_{t+j}$ is

$$
\hat{\mathbb{E}}[X_{t+j} \mid X_t, X_{t-1}, \ldots] = \left[ \frac{C(L)}{L^j} \right]_+ \eta_t
$$

If $C(L)$ is invertible, i.e., if the zeros of $\det C(z)$ lie strictly outside the unit circle, then this formula can be written

$$
\hat{\mathbb{E}}[X_{t+j} \mid X_t, X_{t-1}, \ldots] = \left[ \frac{C(L)}{L^j} \right]_+ C(L)^{-1} X_t
$$

CHAPTER

THIRTYFOUR

KNOWING THE FORECASTS OF OTHERS

In addition to what’s in Anaconda, this lecture will need the following libraries:

!pip install --upgrade quantecon


!conda install -y -c plotly plotly plotly-orca

34.1 Introduction

Robert E. Lucas, Jr. [Robert E. Lucas, 1975], Kenneth Kasa [Kasa, 2000], and Robert Townsend [Townsend, 1983]
showed that putting decision makers into environments in which they want to infer persistent hidden state variables from
equilibrium prices and quantities can elongate and amplify impulse responses to aggregate shocks.
This provides a promising way to think about amplification mechanisms in business cycle models.
Townsend [Townsend, 1983] noted that living in such environments makes decision makers want to forecast forecasts of
others.
This theme has been pursued for situations in which decision makers’ imperfect information forces them to pursue an
infinite recursion that involves forming beliefs about the beliefs of others (e.g., [Allen et al., 2002]).
Lucas [Robert E. Lucas, 1975] side stepped having decision makers forecast the forecasts of other decision makers by
assuming that they simply pool their information before forecasting.
A pooling equilibrium like Lucas’s plays a prominent role in this lecture.
Because he didn’t assume such pooling, [Townsend, 1983] confronted the forecasting the forecasts of others problem.
To formulate the problem recursively required that Townsend define a decision maker’s state vector.
Townsend concluded that his original model required an intractable infinite dimensional state space.
Therefore, he constructed a more manageable approximating model in which a hidden Markov component of a demand
shock is revealed to all firms after a fixed, finite number of periods.
In this lecture, we illustrate again the theme that finding the state is an art by showing how to formulate Townsend’s
original model in terms of a low-dimensional state space.
We show that Townsend’s model shares equilibrium prices and quantities with those that prevail in a pooling equilibrium.
That finding emerged from a line of research about Townsend’s model that built on [Pearlman et al., 1986] and that
culminated in [Pearlman and Sargent, 2005] .
Rather than directly deploying the [Pearlman et al., 1986] machinery here, we shall instead implement a sneaky guess-
and-verify tactic.
• We first compute a pooling equilibrium and represent it as an instance of a linear state-space system provided by
the Python class quantecon.LinearStateSpace.


• Leaving the state-transition equation for the pooling equilibrium unaltered, we alter the observation vector for a
firm to match what it is in Townsend’s original model. So rather than directly observing the signal received by firms
in the other industry, a firm sees the equilibrium price of the good produced by the other industry.
• We compute a population linear least squares regression of the noisy signal at time 𝑡 that firms in the other industry
would receive in a pooling equilibrium on time 𝑡 information that a firm receives in Townsend’s original model.
• The 𝑅2 in this regression equals 1.
• That verifies that a firm’s information set in Townsend’s original model equals its information set in a pooling
equilibrium.
• Therefore, equilibrium prices and quantities in Townsend’s original model equal those in a pooling equilibrium.

34.1.1 A Sequence of Models

We proceed by describing a sequence of models of two industries that are linked in a single way:
• shocks to the demand curves for their products have a common component.
The models are simplified versions of Townsend’s [Townsend, 1983].
Townsend’s is a model of a rational expectations equilibrium in which firms want to forecast forecasts of others.
In Townsend’s model, firms condition their forecasts on observed endogenous variables whose equilibrium laws of motion
are determined by their own forecasting functions.
We shall assemble model components progressively in ways that can help us to appreciate the structure of the pooling
equilibrium that ultimately interests us.
While keeping all other aspects of the model the same, we shall study consequences of alternative assumptions about
what decision makers observe.
Technically, this lecture deploys concepts and tools that appear in First Look at Kalman Filter and Rational Expectations
Equilibrium.

34.2 The Setting

We cast all variables in terms of deviations from means.


Therefore, we omit constants from inverse demand curves and other functions.
Firms in industry 𝑖 = 1, 2 use a single factor of production, capital 𝑘𝑡𝑖 , to produce output of a single good, 𝑦𝑡𝑖 .
Firms bear quadratic costs of adjusting their capital stocks.
A representative firm in industry 𝑖 has production function 𝑦𝑡𝑖 = 𝑓𝑘𝑡𝑖 , 𝑓 > 0.
The firm acts as a price taker with respect to output price 𝑃𝑡𝑖 , and maximizes

𝐸0𝑖 ∑ 𝛽 𝑡 {𝑃𝑡𝑖 𝑓𝑘𝑡𝑖 − .5ℎ(𝑘𝑡+1
𝑖
− 𝑘𝑡𝑖 )2 } , ℎ > 0. (34.1)
𝑡=0

Demand in industry 𝑖 is described by the inverse demand curve

𝑃𝑡𝑖 = −𝑏𝑌𝑡𝑖 + 𝜃𝑡 + 𝜖𝑖𝑡 , 𝑏 > 0, (34.2)

where 𝑃𝑡𝑖 is the price of good 𝑖 at 𝑡, 𝑌𝑡𝑖 = 𝑓𝐾𝑡𝑖 is output in market 𝑖, 𝜃𝑡 is a persistent component of a demand shock
that is common across the two industries, and 𝜖𝑖𝑡 is an industry specific component of the demand shock that is i.i.d. and
whose time 𝑡 marginal distribution is 𝒩(0, 𝜎𝜖2 ).


We assume that 𝜃𝑡 is governed by

𝜃𝑡+1 = 𝜌𝜃𝑡 + 𝑣𝑡 (34.3)

where {𝑣𝑡 } is an i.i.d. sequence of Gaussian shocks, each with mean zero and variance 𝜎𝑣2 .
To simplify notation, we’ll study a special case by setting ℎ = 𝑓 = 1.
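As a small aside, here is a minimal sketch simulating the demand shock process (34.3); the values $\rho = 0.8$ and $\sigma_v = 0.5$ match the calibration adopted in the code later in this lecture.

import numpy as np

ρ, σ_v, T = 0.8, 0.5, 200
rng = np.random.default_rng(0)
θ = np.zeros(T + 1)
for t in range(T):
    θ[t + 1] = ρ * θ[t] + σ_v * rng.standard_normal()   # θ_{t+1} = ρ θ_t + v_t
print(θ[:5])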
Costs of adjusting their capital stocks impart to firms an incentive to forecast the price of the good that they sell.
Throughout, we use the rational expectations equilibrium concept presented in this lecture Rational Expectations Equi-
librium.
We let capital letters denote market wide objects and lower case letters denote objects chosen by a representative firm.
In each industry, a competitive equilibrium prevails.
To rationalize the big 𝐾, little 𝑘 connection, we can think of there being a continuum of firms in industry 𝑖, with each
1
firm being indexed by 𝜔 ∈ [0, 1] and 𝐾 𝑖 = ∫0 𝑘𝑖 (𝜔)𝑑𝜔.
In equilibrium, 𝑘𝑡𝑖 = 𝐾𝑡𝑖 , but we must distinguish between 𝑘𝑡𝑖 and 𝐾𝑡𝑖 when we pose the firm’s optimization problem.

34.3 Tactics

We shall compute equilibrium laws of motion for capital in industry 𝑖 under a sequence of assumptions about what a
representative firm observes.
Successive members of this sequence make a representative firm’s information more and more obscure.
We begin with the most information, then gradually withdraw information in a way that approaches and eventually reaches
the Townsend-like information structure that we are ultimately interested in.
Thus, we shall compute equilibria under the following alternative information structures:
• Perfect foresight: future values of 𝜃𝑡 , 𝜖𝑖𝑡 are observed in industry 𝑖.
• Observed history of stochastic 𝜃𝑡 : {𝜃𝑡 , 𝜖𝑖𝑡 } are realizations from a stochastic process; current and past values of
each are observed at time 𝑡 but future values are not.
• One noise-ridden observation on 𝜃𝑡 : values of {𝜃𝑡 , 𝜖𝑖𝑡 } separately are never observed. However, at time 𝑡, a
history 𝑤𝑡 of scalar noise-ridden observations on 𝜃𝑡 is observed at time 𝑡.
• Two noise-ridden observations on 𝜃𝑡 : values of {𝜃𝑡 , 𝜖𝑖𝑡 } separately are never observed. However, at time 𝑡, a
history 𝑤𝑡 of two noise-ridden observations on 𝜃𝑡 is observed at time 𝑡.
Successive computations build one on previous ones.
We proceed by first finding an equilibrium under perfect foresight.
To compute an equilibrium with current and past but not future values of 𝜃𝑡 observed, we use a certainty equivalence prin-
ciple to justify modifying the perfect foresight equilibrium by replacing future values of 𝜃𝑠 , 𝜖𝑖𝑠 , 𝑠 ≥ 𝑡 with mathematical
expectations conditioned on 𝜃𝑡 .
This provides the equilibrium when 𝜃𝑡 is observed at 𝑡 but future 𝜃𝑡+𝑗 and 𝜖𝑖𝑡+𝑗 are not observed.
To find an equilibrium when a history 𝑤𝑡 observations of a single noise-ridden 𝜃𝑡 is observed, we again apply a certainty
equivalence principle and replace future values of the random variables 𝜃𝑠 , 𝜖𝑖𝑠 , 𝑠 ≥ 𝑡 with their mathematical expectations
conditioned on 𝑤𝑡 .
To find an equilibrium when a history 𝑤𝑡 of two noisy signals on 𝜃𝑡 is observed, we replace future values of the random
variables 𝜃𝑠 , 𝜖𝑖𝑠 , 𝑠 ≥ 𝑡 with their mathematical expectations conditioned on history 𝑤𝑡 .
We call the equilibrium with two noise-ridden observations on 𝜃𝑡 a pooling equilibrium.


• It corresponds to an arrangement in which at the beginning of each period firms in industries 1 and 2 somehow get
together and share information about current values of their noisy signals on 𝜃.
We want ultimately to compare outcomes in a pooling equilibrium with an equilibrium under the following alternative
information structure for a firm in industry 𝑖 that originally interested Townsend [Townsend, 1983]:
• Firm 𝑖’s noise-ridden signal on 𝜃𝑡 and the price in industry −𝑖, a firm in industry 𝑖 observes a history 𝑤𝑡 of
one noise-ridden signal on 𝜃𝑡 and a history of industry −𝑖’s price is observed. (Here −𝑖 means ``not 𝑖’’.)
With this information structure, a representative firm 𝑖 sees the price as well as the aggregate endogenous state variable
𝑌𝑡𝑖 in its own industry.
That allows it to infer the total demand shock 𝜃𝑡 + 𝜖𝑖𝑡 .
However, at time 𝑡, the firm sees only 𝑃𝑡−𝑖 and does not see 𝑌𝑡−𝑖 , so that a firm in industry 𝑖 does not directly observe
𝜃𝑡 + 𝜖−𝑖
𝑡 .

Nevertheless, it will turn out that equilibrium prices and quantities in this equilibrium equal their counterparts in a pooling
equilibrium because firms in industry 𝑖 are able to infer the noisy signal about the demand shock received by firms in
industry −𝑖.
We shall verify this assertion by using a guess and verify tactic that involves running a least squares regression and
inspecting its 𝑅2 .1

34.4 Equilibrium Conditions

It is convenient to solve a firm’s problem without uncertainty by forming the Lagrangian:



𝑖
𝐽 = ∑ 𝛽 𝑡 {𝑃𝑡𝑖 𝑘𝑡𝑖 − .5(𝜇𝑖𝑡 )2 + 𝜙𝑡𝑖 [𝑘𝑡𝑖 + 𝜇𝑖𝑡 − 𝑘𝑡+1 ]}
𝑡=0

𝑖
where {𝜙𝑡𝑖 } is a sequence of Lagrange multipliers on the transition law 𝑘𝑡+1 = 𝑘𝑡𝑖 + 𝜇𝑖𝑡 .
First order conditions for the nonstochastic problem are
𝑖
𝜙𝑡𝑖 = 𝛽𝜙𝑡+1 𝑖
+ 𝛽𝑃𝑡+1
(34.4)
𝜇𝑖𝑡 = 𝜙𝑡𝑖 .

Substituting the demand function (34.2) for 𝑃𝑡𝑖 , imposing the condition 𝑘𝑡𝑖 = 𝐾𝑡𝑖 that makes representative firm be
representative, and using definition (34.6) of 𝑔𝑡𝑖 , the Euler equation (34.4) lagged by one period can be expressed as
−𝑏𝑘𝑡𝑖 + 𝜃𝑡 + 𝜖𝑖𝑡 + (𝑘𝑡+1
𝑖
− 𝑘𝑡𝑖 ) − 𝑔𝑡𝑖 = 0 or
𝑖
𝑘𝑡+1 = (𝑏 + 1)𝑘𝑡𝑖 − 𝜃𝑡 − 𝜖𝑖𝑡 + 𝑔𝑡𝑖 (34.5)

where we define 𝑔𝑡𝑖 by

𝑔𝑡𝑖 = 𝛽 −1 (𝑘𝑡𝑖 − 𝑘𝑡−1


𝑖
) (34.6)

We can write Euler equation (34.4) as:

𝑔𝑡𝑖 = 𝑃𝑡𝑖 + 𝛽𝑔𝑡+1


𝑖
(34.7)

In addition, we have the law of motion for 𝜃𝑡 , (34.3), and the demand equation (34.2).
1 [Pearlman and Sargent, 2005] verified this assertion using a different tactic, namely, by constructing analytic formulas for an equilibrium under

the incomplete information structure and confirming that they match the pooling equilibrium formulas derived here.


In summary, with perfect foresight, equilibrium conditions for industry 𝑖 comprise the following system of difference
equations:
𝑖
𝑘𝑡+1 = (1 + 𝑏)𝑘𝑡𝑖 − 𝜖𝑖𝑡 − 𝜃𝑡 + 𝑔𝑡𝑖
𝜃𝑡+1 = 𝜌𝜃𝑡 + 𝑣𝑡
𝑖
(34.8)
𝑔𝑡+1 = 𝛽 −1 (𝑔𝑡𝑖 − 𝑃𝑡𝑖 )
𝑃𝑡𝑖 = −𝑏𝑘𝑡𝑖 + 𝜖𝑖𝑡 + 𝜃𝑡

Without perfect foresight, the same system prevails except that the following equation replaces the third equation of
(34.8):
𝑖
𝑔𝑡+1,𝑡 = 𝛽 −1 (𝑔𝑡𝑖 − 𝑃𝑡𝑖 )

where 𝑥𝑡+1,𝑡 denotes the mathematical expectation of 𝑥𝑡+1 conditional on information at time 𝑡.

34.4.1 Equilibrium under perfect foresight

Our first step is to compute the equilibrium law of motion for 𝑘𝑡𝑖 under perfect foresight.
Let 𝐿 be the lag operator.2
Equations (34.7) and (34.5) imply the second order difference equation in 𝑘𝑡𝑖 :3

[(𝐿−1 − (1 + 𝑏))(1 − 𝛽𝐿−1 ) + 𝑏] 𝑘𝑡𝑖 = 𝛽𝐿−1 𝜖𝑖𝑡 + 𝛽𝐿−1 𝜃𝑡 . (34.9)

Factor the polynomial in 𝐿 on the left side as:

−𝛽[𝐿−2 − (𝛽 −1 + (1 + 𝑏))𝐿−1 + 𝛽 −1 ] = 𝜆̃ −1 (𝐿−1 − 𝜆)(1


̃ ̃ −1 )
− 𝜆𝛽𝐿

where |𝜆|̃ < 1 is the smaller root and 𝜆 is the larger root of (𝜆 − 1)(𝜆 − 1/𝛽) = 𝑏𝜆.
Therefore, (34.9) can be expressed as

𝜆̃ −1 (𝐿−1 − 𝜆)(1
̃ ̃ −1 )𝑘𝑡𝑖 = 𝛽𝐿−1 𝜖𝑖𝑡 + 𝛽𝐿−1 𝜃𝑡 .
− 𝜆𝛽𝐿

Solving the stable root backwards and the unstable root forwards gives

̃
𝜆𝛽
𝑖
𝑘𝑡+1 ̃ 𝑡𝑖 +
= 𝜆𝑘 (𝜖𝑖 + 𝜃𝑡+1 ).
1 − 𝜆𝛽𝐿̃ −1 𝑡+1

Recall that we have already set 𝑘𝑖 = 𝐾 𝑖 at the appropriate point in the argument, namely, after having derived the
first-order necessary conditions for a representative firm in industry 𝑖.
Thus, under perfect foresight the equilibrium capital stock in industry 𝑖 satisfies

𝑖
𝑘𝑡+1 ̃ 𝑡𝑖 + ∑(𝜆𝛽)
= 𝜆𝑘 ̃ 𝑗 (𝜖𝑖 + 𝜃𝑡+𝑗 ). (34.10)
𝑡+𝑗
𝑗=1

Next, we shall investigate consequences of replacing future values of (𝜖𝑖𝑡+𝑗 + 𝜃𝑡+𝑗 ) in equation (34.10) with alternative
forecasting schemes.
In particular, we shall compute equilibrium laws of motion for capital under alternative assumptions about information
available to firms in market 𝑖.
2 See [Sargent, 1987], especially chapters IX and XIV, for principles that guide solving some roots backwards and others forwards.
3 As noted by [Sargent, 1987], this difference equation is the Euler equation for a planning problem that maximizes the discounted sum of consumer
plus producer surplus.


34.5 Equilibrium with 𝜃𝑡 stochastic but observed at 𝑡

If future 𝜃’s are unknown at 𝑡, it is appropriate to replace all random variables on the right side of (34.10) with their
conditional expectations based on the information available to decision makers in market 𝑖.
For now, we assume that this information set is 𝐼𝑡𝑝 = [𝜃𝑡 𝜖𝑖𝑡 ], where 𝑧 𝑡 represents the semi-infinite history of variable
𝑧𝑠 up to time 𝑡.
Later we shall give firms less information.
To obtain an appropriate counterpart to (34.10) under our current assumption about information, we apply a certainty
equivalence principle.
In particular, it is appropriate to take (34.10) and replace each term (𝜖𝑖𝑡+𝑗 +𝜃𝑡+𝑗 ) on the right side with 𝐸[(𝜖𝑖𝑡+𝑗 +𝜃𝑡+𝑗 )|𝜃𝑡 ].
After using (34.3) and the i.i.d. assumption about {𝜖𝑖𝑡 }, this gives
̃
𝜆𝛽𝜌
𝑖
𝑘𝑡+1 ̃ 𝑡𝑖 +
= 𝜆𝑘 𝜃𝑡
̃
1 − 𝜆𝛽𝜌
or
𝑖 ̃ 𝑡𝑖 + 𝜌
𝑘𝑡+1 = 𝜆𝑘 𝜃 (34.11)
𝜆−𝜌 𝑡

where 𝜆 ≡ (𝛽 𝜆)̃ −1 .
For our purposes, it is convenient to represent the equilibrium {𝑘𝑡𝑖 }𝑡 process recursively as
1
𝑖
𝑘𝑡+1 ̃ 𝑡𝑖 +
= 𝜆𝑘 𝜃̂
𝜆 − 𝜌 𝑡+1
̂ = 𝜌𝜃 (34.12)
𝜃𝑡+1 𝑡
𝜃𝑡+1 = 𝜌𝜃𝑡 + 𝑣𝑡 .

34.5.1 Filtering

One noisy signal

We get closer to the original Townsend model that interests us by now assuming that firms in market 𝑖 do not observe 𝜃𝑡 .
Instead they observe a history 𝑤𝑡 of noisy signals at time 𝑡.
In particular, assume that
𝑤𝑡 = 𝜃𝑡 + 𝑒𝑡
(34.13)
𝜃𝑡+1 = 𝜌𝜃𝑡 + 𝑣𝑡

where 𝑒𝑡 and 𝑣𝑡 are mutually independent i.i.d. Gaussian shock processes with means of zero and variances 𝜎𝑒2 and 𝜎𝑣2 ,
respectively.
Define
̂ = 𝐸(𝜃 |𝑤𝑡 )
𝜃𝑡+1 𝑡+1

where 𝑤𝑡 = [𝑤𝑡 , 𝑤𝑡−1 , … , 𝑤0 ] denotes the history of the 𝑤𝑠 process up to and including 𝑡.
Associated with the state-space representation (34.13) is the time-invariant innovations representation
̂ = 𝜌𝜃 ̂ + 𝜅𝑎
𝜃𝑡+1 𝑡 𝑡
(34.14)
̂
𝑤 =𝜃 +𝑎
𝑡 𝑡 𝑡


where 𝑎𝑡 ≡ 𝑤𝑡 − 𝐸(𝑤𝑡 |𝑤𝑡−1 ) is the innovations process in 𝑤𝑡 and the Kalman gain 𝜅 is
$$
\kappa = \frac{\rho p}{p + \sigma_e^2} \tag{34.15}
$$

and where $p$ satisfies the Riccati equation

$$
p = \sigma_v^2 + \frac{p \rho^2 \sigma_e^2}{\sigma_e^2 + p}. \tag{34.16}
$$
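Because (34.16) is scalar, it can also be solved by simple fixed-point iteration, as in the sketch below; the parameter values are those adopted later in this lecture, and the lecture itself solves this step with quantecon.solve_discrete_riccati.

ρ, σ_v, σ_e = 0.8, 0.5, 0.6
p = σ_v ** 2
for _ in range(200):                                      # the map is a contraction, so iteration converges
    p = σ_v ** 2 + p * ρ ** 2 * σ_e ** 2 / (σ_e ** 2 + p)
κ = ρ * p / (p + σ_e ** 2)                                # the Kalman gain (34.15)
print(p, κ)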

State-reconstruction error

Define the state reconstruction error 𝜃𝑡̃ by

𝜃𝑡̃ = 𝜃𝑡 − 𝜃𝑡̂ .

Then 𝑝 = 𝐸 𝜃𝑡2̃ .
Equations (34.13) and (34.14) imply

$$
\tilde\theta_{t+1} = (\rho - \kappa) \tilde\theta_t + v_t - \kappa e_t. \tag{34.17}
$$

Notice that we can express $\hat\theta_{t+1}$ as

$$
\hat\theta_{t+1} = [\rho \theta_t + v_t] + [\kappa e_t - (\rho - \kappa) \tilde\theta_t - v_t], \tag{34.18}
$$

where the first term in brackets equals $\theta_{t+1}$ and the second term in brackets equals $-\tilde\theta_{t+1}$.
We can express (34.11) as

$$
k^i_{t+1} = \tilde\lambda k^i_t + \frac{1}{\lambda - \rho} E[\theta_{t+1} \mid \theta^t]. \tag{34.19}
$$

An application of a certainty equivalence principle asserts that when only 𝑤𝑡 is observed, a corresponding equilibrium
{𝑘𝑡𝑖 } process can be found by replacing the information set 𝜃𝑡 with 𝑤𝑡 in (34.19).
Making this substitution and using (34.18) leads to

$$
k^i_{t+1} = \tilde\lambda k^i_t + \frac{\rho}{\lambda - \rho} \theta_t + \frac{\kappa}{\lambda - \rho} e_t - \frac{\rho - \kappa}{\lambda - \rho} \tilde\theta_t. \tag{34.20}
$$

Simplifying equation (34.18), we also have

$$
\hat\theta_{t+1} = \rho \theta_t + \kappa e_t - (\rho - \kappa) \tilde\theta_t. \tag{34.21}
$$

Equations (34.20), (34.21) describe an equilibrium when 𝑤𝑡 is observed.

34.5.2 A new state variable

Relative to (34.11), the equilibrium acquires a new state variable, namely, the 𝜃–reconstruction error, 𝜃𝑡̃ .
For a subsequent argument, by using (34.15), it is convenient to write (34.20) as

$$
k^i_{t+1} = \tilde\lambda k^i_t + \frac{\rho}{\lambda - \rho} \theta_t + \frac{1}{\lambda - \rho} \frac{p \rho}{p + \sigma_e^2} e_t - \frac{1}{\lambda - \rho} \frac{\rho \sigma_e^2}{p + \sigma_e^2} \tilde\theta_t \tag{34.22}
$$


In summary, when decision makers in market $i$ observe a semi-infinite history $w^t$ of noisy signals $w_t$ on $\theta_t$ at $t$, an equilibrium law of motion for $k_t^i$ can be represented as

$$
\begin{aligned}
k^i_{t+1} &= \tilde\lambda k^i_t + \frac{1}{\lambda - \rho} \hat\theta_{t+1} \\
\hat\theta_{t+1} &= \rho \theta_t + \frac{\rho p}{p + \sigma_e^2} e_t - \frac{\rho \sigma_e^2}{p + \sigma_e^2} \tilde\theta_t \\
\tilde\theta_{t+1} &= \frac{\rho \sigma_e^2}{p + \sigma_e^2} \tilde\theta_t - \frac{p \rho}{p + \sigma_e^2} e_t + v_t \\
\theta_{t+1} &= \rho \theta_t + v_t.
\end{aligned} \tag{34.23}
$$

34.5.3 Two Noisy Signals

We now construct a pooling equilibrium by assuming that at time 𝑡 a firm in industry 𝑖 receives a vector 𝑤𝑡 of two noisy
signals on 𝜃𝑡 :

𝜃𝑡+1 = 𝜌𝜃𝑡 + 𝑣𝑡
1 𝑒
𝑤𝑡 = [ ] 𝜃𝑡 + [ 1𝑡 ]
1 𝑒2𝑡

To justify that what we are constructing is a pooling equilibrium, we can assume that

$$
\begin{bmatrix} e_{1t} \\ e_{2t} \end{bmatrix} = \begin{bmatrix} \epsilon_t^1 \\ \epsilon_t^2 \end{bmatrix}
$$

so that a firm in industry $i$ observes the noisy signals on $\theta_t$ presented to firms in both industries $i$ and $-i$.
The pertinent innovations representation now becomes
̂ = 𝜌𝜃 ̂ + 𝜅𝑎
𝜃𝑡+1 𝑡 𝑡
1 (34.24)
𝑤𝑡 = [ ] 𝜃𝑡̂ + 𝑎𝑡
1

where 𝑎𝑡 ≡ 𝑤𝑡 − 𝐸[𝑤𝑡 |𝑤𝑡−1 ] is a (2 × 1) vector of innovations in 𝑤𝑡 and 𝜅 is now a (1 × 2) vector of Kalman gains.
Formulas for the Kalman filter imply that
𝜌𝑝
𝜅= [1 1] (34.25)
2𝑝 + 𝜎𝑒2

where 𝑝 = 𝐸 𝜃𝑡̃ 𝜃𝑡𝑇̃ now satisfies the Riccati equation

𝑝𝜌2 𝜎𝑒2
𝑝 = 𝜎𝑣2 + . (34.26)
2𝑝 + 𝜎𝑒2
Thus, when a representative firm in industry 𝑖 observes two noisy signals on 𝜃𝑡 , we can express the equilibrium law of
motion for capital recursively as
1
𝑖
𝑘𝑡+1 ̃ 𝑡𝑖 +
= 𝜆𝑘 𝜃̂
𝜆 − 𝜌 𝑡+1
̂ 𝜌𝑝 𝜌𝜎𝑒2
𝜃𝑡+1 = 𝜌𝜃𝑡 + (𝑒1𝑡 + 𝑒2𝑡 ) − 𝜃̃
2𝑝 + 𝜎𝑒2 2𝑝 + 𝜎𝑒2 𝑡 (34.27)
̃ 𝜌𝜎𝑒2 𝑝𝜌
𝜃𝑡+1 = 𝜃̃ − (𝑒 + 𝑒2𝑡 ) + 𝑣𝑡
2𝑝 + 𝜎𝑒2 𝑡 2𝑝 + 𝜎𝑒2 1𝑡
𝜃𝑡+1 = 𝜌𝜃𝑡 + 𝑣𝑡 .


Below, by using a guess-and-verify tactic, we shall show that outcomes in this pooling equilibrium equal those in an
equilibrium under the alternative information structure that interested Townsend [Townsend, 1983] but that originally
seemed too challenging to compute.4

34.6 Guess-and-Verify Tactic

As a preliminary step we shall take our recursive representation (34.23) of an equilibrium in industry 𝑖 with one noisy
signal on 𝜃𝑡 and perform the following steps:
• Compute 𝜆 and 𝜆̃ by posing a root-finding problem and solving it with numpy.roots
• Compute 𝑝 by forming the appropriate discrete Riccati equation and then solving it using quantecon.solve_discrete_riccati
• Add a measurement equation for 𝑃𝑡𝑖 = 𝑏𝑘𝑡𝑖 + 𝜃𝑡 + 𝑒𝑡 , 𝜃𝑡 + 𝑒𝑡 , and 𝑒𝑡 to system (34.23).
• Write the resulting system in state-space form and encode it using quantecon.LinearStateSpace
• Use methods of the quantecon.LinearStateSpace to compute impulse response functions of 𝑘𝑡𝑖 with
respect to shocks 𝑣𝑡 , 𝑒𝑡 .
After analyzing the one-noisy-signal structure in this way, by making appropriate modifications we shall analyze the
two-noisy-signal structure.
We proceed to analyze first the one-noisy-signal structure and then the two-noisy-signal structure.

34.7 Equilibrium with One Noisy Signal on 𝜃𝑡

34.7.1 Step 1: Solve for 𝜆̃ and 𝜆

1. Cast $(\lambda - 1)\left(\lambda - \frac{1}{\beta}\right) = b\lambda$ as $p(\lambda) = 0$ where $p$ is a polynomial function of $\lambda$.
2. Use numpy.roots to solve for the roots of $p$
3. Verify $\lambda \approx \frac{1}{\beta\tilde\lambda}$

Note that $p(\lambda) = \lambda^2 - \left(1 + b + \frac{1}{\beta}\right)\lambda + \frac{1}{\beta}$.

34.7.2 Step 2: Solve for 𝑝


1. Cast $p = \sigma_v^2 + \frac{p\rho^2\sigma_e^2}{p + \sigma_e^2}$ as a discrete matrix Riccati equation.
2. Use quantecon.solve_discrete_riccati to solve for $p$
3. Verify $p \approx \sigma_v^2 + \frac{p\rho^2\sigma_e^2}{p + \sigma_e^2}$

4 [Pearlman and Sargent, 2005] verify the same claim by applying machinery of [Pearlman et al., 1986].


Note that:

$$
A = \begin{bmatrix} \rho \end{bmatrix}, \quad
B = \begin{bmatrix} 1 \end{bmatrix}, \quad
R = \begin{bmatrix} \sigma_e^2 \end{bmatrix}, \quad
Q = \begin{bmatrix} \sigma_v^2 \end{bmatrix}, \quad
N = \begin{bmatrix} 0 \end{bmatrix}
$$

34.7.3 Step 3: Represent the system using quantecon.LinearStateSpace

We use the following representation for constructing the quantecon.LinearStateSpace instance.

$$
\underbrace{\begin{bmatrix} e_{t+1} \\ k^i_{t+1} \\ \tilde\theta_{t+1} \\ P_{t+1} \\ \theta_{t+1} \\ v_{t+1} \end{bmatrix}}_{x_{t+1}}
=
\underbrace{\begin{bmatrix}
0 & 0 & 0 & 0 & 0 & 0 \\
\frac{\kappa}{\lambda-\rho} & \tilde\lambda & \frac{-\kappa\sigma_e^2/p}{\lambda-\rho} & 0 & \frac{\rho}{\lambda-\rho} & 0 \\
-\kappa & 0 & \frac{\kappa\sigma_e^2}{p} & 0 & 0 & 1 \\
\frac{b\kappa}{\lambda-\rho} & b\tilde\lambda & \frac{-b\kappa\sigma_e^2/p}{\lambda-\rho} & 0 & \frac{b\rho}{\lambda-\rho}+\rho & 1 \\
0 & 0 & 0 & 0 & \rho & 1 \\
0 & 0 & 0 & 0 & 0 & 0
\end{bmatrix}}_{A}
\underbrace{\begin{bmatrix} e_t \\ k^i_t \\ \tilde\theta_t \\ P_t \\ \theta_t \\ v_t \end{bmatrix}}_{x_t}
+
\underbrace{\begin{bmatrix}
\sigma_e & 0 \\
0 & 0 \\
0 & 0 \\
\sigma_e & 0 \\
0 & 0 \\
0 & \sigma_v
\end{bmatrix}}_{C}
\begin{bmatrix} z_{1,t+1} \\ z_{2,t+1} \end{bmatrix}
$$

$$
\underbrace{\begin{bmatrix} P_t \\ e_t + \theta_t \\ e_t \end{bmatrix}}_{y_t}
=
\underbrace{\begin{bmatrix}
0 & 0 & 0 & 1 & 0 & 0 \\
1 & 0 & 0 & 0 & 1 & 0 \\
1 & 0 & 0 & 0 & 0 & 0
\end{bmatrix}}_{G}
\underbrace{\begin{bmatrix} e_t \\ k^i_t \\ \tilde\theta_t \\ P_t \\ \theta_t \\ v_t \end{bmatrix}}_{x_t}
+
\underbrace{\begin{bmatrix} 0 \\ 0 \\ 0 \end{bmatrix}}_{H} w_{t+1}
$$

$$
\begin{bmatrix} z_{1,t+1} \\ z_{2,t+1} \\ w_{t+1} \end{bmatrix} \sim \mathcal{N}(0, I), \qquad
\kappa = \frac{\rho p}{p + \sigma_e^2}
$$
This representation includes extraneous variables such as 𝑃𝑡 in the state vector.
We formulate things in this way because it allows us easily to compute covariances of these variables with other
components of the state vector (step 5 above) by using the stationary_distributions method of the
LinearStateSpace class.

import numpy as np
import quantecon as qe
import plotly.graph_objects as go
import plotly.offline as pyo
from statsmodels.regression.linear_model import OLS
from IPython.display import display, Latex, Image

pyo.init_notebook_mode(connected=True)

β = 0.9 # Discount factor


ρ = 0.8 # Persistence parameter for the hidden state


b = 1.5 # Demand curve parameter
σ_v = 0.5 # Standard deviation of shock to θ_t
σ_e = 0.6 # Standard deviation of shocks to w_t

# Compute λ
poly = np.array([1, -(1 + β + b) / β, 1 / β])
roots_poly = np.roots(poly)
λ_tilde = roots_poly.min()
λ = roots_poly.max()

# Verify that λ = (βλ_tilde) ^ (-1)


tol = 1e-12
np.max(np.abs(λ - 1 / (β * λ_tilde))) < tol

True

A_ricc = np.array([[ρ]])
B_ricc = np.array([[1.]])
R_ricc = np.array([[σ_e ** 2]])
Q_ricc = np.array([[σ_v ** 2]])
N_ricc = np.zeros((1, 1))
p = qe.solve_discrete_riccati(A_ricc, B_ricc, Q_ricc, R_ricc, N_ricc).item()

p_one = p # Save for comparison later

# Verify that p = σ_v ^ 2 + p * ρ ^ 2 - (ρ * p) ^ 2 / (p + σ_e ** 2)


tol = 1e-12
np.abs(p - (σ_v ** 2 + p * ρ ** 2 - (ρ * p) ** 2 / (p + σ_e ** 2))) < tol

True

κ = ρ * p / (p + σ_e ** 2)
κ_prod = κ * σ_e ** 2 / p

κ_one = κ # Save for comparison later

A_lss = np.array([[0., 0., 0., 0., 0., 0.],
                  [κ / (λ - ρ), λ_tilde, -κ_prod / (λ - ρ), 0., ρ / (λ - ρ), 0.],
                  [-κ, 0., κ_prod, 0., 0., 1.],
                  [b * κ / (λ - ρ), b * λ_tilde, -b * κ_prod / (λ - ρ), 0., b * ρ / (λ - ρ) + ρ, 1.],
                  [0., 0., 0., 0., ρ, 1.],
                  [0., 0., 0., 0., 0., 0.]])

C_lss = np.array([[σ_e, 0.],
                  [0., 0.],
                  [0., 0.],
                  [σ_e, 0.],
                  [0., 0.],
                  [0., σ_v]])



G_lss = np.array([[0., 0., 0., 1., 0., 0.],
[1., 0., 0., 0., 1., 0.],
[1., 0., 0., 0., 0., 0.]])

mu_0 = np.array([0., 0., 0., 0., 0., 0.])

lss = qe.LinearStateSpace(A_lss, C_lss, G_lss, mu_0=mu_0)

ts_length = 100_000
x, y = lss.simulate(ts_length, random_state=1)

# Verify that two ways of computing P_t match


np.max(np.abs(np.array([[1., b, 0., 0., 1., 0.]]) @ x - x[3])) < 1e-12

True

34.7.4 Step 4: Compute impulse response functions

To compute impulse response functions of 𝑘𝑡𝑖 , we use the impulse_response method of the quantecon.
LinearStateSpace class and plot outcomes.

xcoef, ycoef = lss.impulse_response(j=21)


data = np.array([xcoef])[0, :, 1, :]

fig = go.Figure(data=go.Scatter(y=data[:-1, 0], name=r'$e_{t+1}$'))


fig.add_trace(go.Scatter(y=data[1:, 1], name=r'$v_{t+1}$'))
fig.update_layout(title=r'Impulse Response Function',
xaxis_title='Time',
yaxis_title=r'$k^{i}_{t}$')
fig1 = fig
# Export to PNG file
Image(fig1.to_image(format="png"))
# fig1.show() will provide interactive plot when running
# notebook locally


34.7.5 Step 5: Compute stationary covariance matrices and population regressions

We compute stationary covariance matrices by calling the stationary_distributions method of the
quantecon.LinearStateSpace class.
By appropriately decomposing the covariance matrix of the state vector, we obtain ingredients of pertinent population
regression coefficients.
Define
$$
\Sigma_x = \begin{bmatrix} \Sigma_{11} & \Sigma_{12} \\ \Sigma_{21} & \Sigma_{22} \end{bmatrix}
$$

where $\Sigma_{11}$ is the covariance matrix of dependent variables and $\Sigma_{22}$ is the covariance matrix of independent variables.

Regression coefficients are $\beta = \Sigma_{21}\Sigma_{22}^{-1}$.

To verify an instance of a law of large numbers computation, we construct a long simulation of the state vector and for the
resulting sample compute the ordinary least-squares estimator of 𝛽 that we shall compare with corresponding population
regression coefficients.

_, _, Σ_x, Σ_y, Σ_yx = lss.stationary_distributions()

Σ_11 = Σ_x[0, 0]
Σ_12 = Σ_x[0, 1:4]
Σ_21 = Σ_x[1:4, 0]
Σ_22 = Σ_x[1:4, 1:4]

reg_coeffs = Σ_12 @ np.linalg.inv(Σ_22)

print('Regression coefficients (e_t on k_t, P_t, \\tilde{\\theta_t})')


print('------------------------------')
print(r'k_t:', reg_coeffs[0])
print(r'\tilde{\theta_t}:', reg_coeffs[1])
print(r'P_t:', reg_coeffs[2])

Regression coefficients (e_t on k_t, P_t, \tilde{\theta_t})


------------------------------
k_t: -3.275556845219769
\tilde{\theta_t}: -0.9649461170475457
P_t: 0.9649461170475457

# Compute R squared
R_squared = reg_coeffs @ Σ_x[1:4, 1:4] @ reg_coeffs / Σ_x[0, 0]
R_squared

0.9649461170475461

# Verify that the computed coefficients are close to least squares estimates
model = OLS(x[0], x[1:4].T)
reg_res = model.fit()
np.max(np.abs(reg_coeffs - reg_res.params)) < 1e-2

True

# Verify that R_squared matches least squares estimate


np.abs(reg_res.rsquared - R_squared) < 1e-2

True

# Verify that θ_t + e_t can be recovered


model = OLS(y[1], x[1:4].T)
reg_res = model.fit()
np.abs(reg_res.rsquared - 1.) < 1e-6

True


34.8 Equilibrium with Two Noisy Signals on 𝜃𝑡

Steps 1, 4, and 5 are identical to those for the one-noisy-signal structure.


Step 2 requires a straightforward modification.
For step 3, we construct the following state-space representation so that we can get our hands on all of the random
processes that we require in order to compute a regression of the noisy signal about 𝜃 from the other industry that a firm
receives directly in a pooling equilibrium against information that a firm would receive in Townsend’s original model.
For this purpose, we include equilibrium goods prices from both industries in the state vector:

$$
\underbrace{\begin{bmatrix} e_{1,t+1} \\ e_{2,t+1} \\ k^i_{t+1} \\ \tilde\theta_{t+1} \\ P^1_{t+1} \\ P^2_{t+1} \\ \theta_{t+1} \\ v_{t+1} \end{bmatrix}}_{x_{t+1}}
=
\underbrace{\begin{bmatrix}
0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 \\
\frac{\kappa}{\lambda-\rho} & \frac{\kappa}{\lambda-\rho} & \tilde\lambda & \frac{-\kappa\sigma_e^2/p}{\lambda-\rho} & 0 & 0 & \frac{\rho}{\lambda-\rho} & 0 \\
-\kappa & -\kappa & 0 & \frac{\kappa\sigma_e^2}{p} & 0 & 0 & 0 & 1 \\
\frac{b\kappa}{\lambda-\rho} & \frac{b\kappa}{\lambda-\rho} & b\tilde\lambda & \frac{-b\kappa\sigma_e^2/p}{\lambda-\rho} & 0 & 0 & \frac{b\rho}{\lambda-\rho}+\rho & 1 \\
\frac{b\kappa}{\lambda-\rho} & \frac{b\kappa}{\lambda-\rho} & b\tilde\lambda & \frac{-b\kappa\sigma_e^2/p}{\lambda-\rho} & 0 & 0 & \frac{b\rho}{\lambda-\rho}+\rho & 1 \\
0 & 0 & 0 & 0 & 0 & 0 & \rho & 1 \\
0 & 0 & 0 & 0 & 0 & 0 & 0 & 0
\end{bmatrix}}_{A}
\underbrace{\begin{bmatrix} e_{1,t} \\ e_{2,t} \\ k^i_t \\ \tilde\theta_t \\ P^1_t \\ P^2_t \\ \theta_t \\ v_t \end{bmatrix}}_{x_t}
+
\underbrace{\begin{bmatrix}
\sigma_e & 0 & 0 \\
0 & \sigma_e & 0 \\
0 & 0 & 0 \\
0 & 0 & 0 \\
\sigma_e & 0 & 0 \\
0 & \sigma_e & 0 \\
0 & 0 & 0 \\
0 & 0 & \sigma_v
\end{bmatrix}}_{C}
\begin{bmatrix} z_{1,t+1} \\ z_{2,t+1} \\ z_{3,t+1} \end{bmatrix}
$$

$$
\underbrace{\begin{bmatrix} P^1_t \\ P^2_t \\ e_{1,t} + \theta_t \\ e_{2,t} + \theta_t \\ e_{1,t} \\ e_{2,t} \end{bmatrix}}_{y_t}
=
\underbrace{\begin{bmatrix}
0 & 0 & 0 & 0 & 1 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 & 0 & 1 & 0 & 0 \\
1 & 0 & 0 & 0 & 0 & 0 & 1 & 0 \\
0 & 1 & 0 & 0 & 0 & 0 & 1 & 0 \\
1 & 0 & 0 & 0 & 0 & 0 & 0 & 0 \\
0 & 1 & 0 & 0 & 0 & 0 & 0 & 0
\end{bmatrix}}_{G}
\underbrace{\begin{bmatrix} e_{1,t} \\ e_{2,t} \\ k^i_t \\ \tilde\theta_t \\ P^1_t \\ P^2_t \\ \theta_t \\ v_t \end{bmatrix}}_{x_t}
+
\underbrace{\begin{bmatrix} 0 \\ 0 \\ 0 \\ 0 \\ 0 \\ 0 \end{bmatrix}}_{H} w_{t+1}
$$

$$
\begin{bmatrix} z_{1,t+1} \\ z_{2,t+1} \\ z_{3,t+1} \\ w_{t+1} \end{bmatrix} \sim \mathcal{N}(0, I), \qquad
\kappa = \frac{\rho p}{2p + \sigma_e^2}
$$

A_ricc = np.array([[ρ]])
B_ricc = np.array([[np.sqrt(2)]])
R_ricc = np.array([[σ_e ** 2]])
Q_ricc = np.array([[σ_v ** 2]])
N_ricc = np.zeros((1, 1))
p = qe.solve_discrete_riccati(A_ricc, B_ricc, Q_ricc, R_ricc, N_ricc).item()

p_two = p # Save for comparison later

# Verify that p = σ_v^2 + (pρ^2σ_e^2) / (2p + σ_e^2)


tol = 1e-12
np.abs(p - (σ_v ** 2 + p * ρ ** 2 * σ_e ** 2 / (2 * p + σ_e ** 2))) < tol

True


κ = ρ * p / (2 * p + σ_e ** 2)
κ_prod = κ * σ_e ** 2 / p

κ_two = κ # Save for comparison later

A_lss = np.array([[0., 0., 0., 0., 0., 0., 0., 0.],
                  [0., 0., 0., 0., 0., 0., 0., 0.],
                  [κ / (λ - ρ), κ / (λ - ρ), λ_tilde, -κ_prod / (λ - ρ), 0., 0., ρ / (λ - ρ), 0.],
                  [-κ, -κ, 0., κ_prod, 0., 0., 0., 1.],
                  [b * κ / (λ - ρ), b * κ / (λ - ρ), b * λ_tilde, -b * κ_prod / (λ - ρ), 0., 0., b * ρ / (λ - ρ) + ρ, 1.],
                  [b * κ / (λ - ρ), b * κ / (λ - ρ), b * λ_tilde, -b * κ_prod / (λ - ρ), 0., 0., b * ρ / (λ - ρ) + ρ, 1.],
                  [0., 0., 0., 0., 0., 0., ρ, 1.],
                  [0., 0., 0., 0., 0., 0., 0., 0.]])

C_lss = np.array([[σ_e, 0., 0.],
                  [0., σ_e, 0.],
                  [0., 0., 0.],
                  [0., 0., 0.],
                  [σ_e, 0., 0.],
                  [0., σ_e, 0.],
                  [0., 0., 0.],
                  [0., 0., σ_v]])

G_lss = np.array([[0., 0., 0., 0., 1., 0., 0., 0.],
                  [0., 0., 0., 0., 0., 1., 0., 0.],
                  [1., 0., 0., 0., 0., 0., 1., 0.],
                  [0., 1., 0., 0., 0., 0., 1., 0.],
                  [1., 0., 0., 0., 0., 0., 0., 0.],
                  [0., 1., 0., 0., 0., 0., 0., 0.]])

mu_0 = np.array([0., 0., 0., 0., 0., 0., 0., 0.])

lss = qe.LinearStateSpace(A_lss, C_lss, G_lss, mu_0=mu_0)

ts_length = 100_000
x, y = lss.simulate(ts_length, random_state=1)

xcoef, ycoef = lss.impulse_response(j=20)

data = np.array([xcoef])[0, :, 2, :]

fig = go.Figure(data=go.Scatter(y=data[:-1, 0], name=r'$e_{1,t+1}$'))


fig.add_trace(go.Scatter(y=data[:-1, 1], name=r'$e_{2,t+1}$'))
fig.add_trace(go.Scatter(y=data[1:, 2], name=r'$v_{t+1}$'))
fig.update_layout(title=r'Impulse Response Function',
xaxis_title='Time',
yaxis_title=r'$k^{i}_{t}$')
fig2=fig
# Export to PNG file
Image(fig2.to_image(format="png"))
# fig2.show() will provide interactive plot when running
# notebook locally

_, _, Σ_x, Σ_y, Σ_yx = lss.stationary_distributions()

Σ_11 = Σ_x[1, 1]
Σ_12 = Σ_x[1, 2:5]
Σ_21 = Σ_x[2:5, 1]
Σ_22 = Σ_x[2:5, 2:5]

reg_coeffs = Σ_12 @ np.linalg.inv(Σ_22)

print('Regression coefficients (e_{2,t} on k_t, P^{1}_t, \\tilde{\\theta_t})')


print('------------------------------')
print(r'k_t:', reg_coeffs[0])
print(r'\tilde{\theta_t}:', reg_coeffs[1])
print(r'P_t:', reg_coeffs[2])

Regression coefficients (e_{2,t} on k_t, P^{1}_t, \tilde{\theta_t})


------------------------------
k_t: 0.0
\tilde{\theta_t}: 0.0
P_t: 0.0

# Compute R squared
R_squared = reg_coeffs @ Σ_x[2:5, 2:5] @ reg_coeffs / Σ_x[1, 1]
R_squared

0.0

# Verify that the computed coefficients are close to least squares estimates
model = OLS(x[1], x[2:5].T)
reg_res = model.fit()
np.max(np.abs(reg_coeffs - reg_res.params)) < 1e-2

True

# Verify that R_squared matches least squares estimate


np.abs(reg_res.rsquared - R_squared) < 1e-2

True

_, _, Σ_x, Σ_y, Σ_yx = lss.stationary_distributions()

Σ_11 = Σ_x[1, 1]
Σ_12 = Σ_x[1, 2:6]
Σ_21 = Σ_x[2:6, 1]
Σ_22 = Σ_x[2:6, 2:6]

reg_coeffs = Σ_12 @ np.linalg.inv(Σ_22)

print('Regression coefficients (e_{2,t} on k_t, P^{1}_t, P^{2}_t, \\tilde{\\theta_t})')
print('------------------------------')
print(r'k_t:', reg_coeffs[0])
print(r'\tilde{\theta_t}:', reg_coeffs[1])
print(r'P^{1}_t:', reg_coeffs[2])
print(r'P^{2}_t:', reg_coeffs[3])

Regression coefficients (e_{2,t} on k_t, P^{1}_t, P^{2}_t, \tilde{\theta_t})


------------------------------
k_t: -3.1373589171035627
\tilde{\theta_t}: -0.9242343967443672
P^{1}_t: -0.037882801627816154
P^{2}_t: 0.9621171983721835

# Compute R squared
R_squared = reg_coeffs @ Σ_x[2:6, 2:6] @ reg_coeffs / Σ_x[1, 1]
R_squared


0.9621171983721837

34.9 Key Step

Now we come to the key step for verifying that equilibrium outcomes for prices and quantities are identical in the pooling
equilibrium and in the original model that led Townsend to deduce an infinite-dimensional state space.
We accomplish this by computing a population linear least squares regression of the noisy signal that firms in the other
industry receive in a pooling equilibrium on time 𝑡 information that a firm would receive in Townsend’s original model.
Let’s compute the regression and stare at the 𝑅2 :

# Verify that θ_t + e^{2}_t can be recovered

# θ_t + e^{2}_t on k^{i}_t, P^{1}_t, P^{2}_t, \\tilde{\\theta_t}

model = OLS(y[1], x[2:6].T)


reg_res = model.fit()
np.abs(reg_res.rsquared - 1.) < 1e-6

True

reg_res.rsquared

1.0

The 𝑅2 in this regression equals 1.


That verifies that a firm’s information set in Townsend’s original model equals its information set in a pooling equilibrium.
Therefore, equilibrium prices and quantities in Townsend’s original model equal those in a pooling equilibrium.

34.10 An observed common shock benchmark

For purposes of comparison, it is useful to construct a model in which the demand disturbances in both industries still
share a common persistent component $\theta_t$, but in which the persistent component $\theta$ is observed each period.
In this case, firms share the same information immediately and have no need to deploy signal-extraction techniques.
Thus, consider a version of our model in which histories of both 𝜖𝑖𝑡 and 𝜃𝑡 are observed by a representative firm.
In this case, the firm’s optimal decision rule is described by
$$
k^i_{t+1} = \tilde\lambda k^i_t + \frac{1}{\lambda - \rho}\hat\theta_{t+1}
$$

where $\hat\theta_{t+1} = E_t \theta_{t+1}$ is given by

$$
\hat\theta_{t+1} = \rho\theta_t
$$


Thus, the firm’s decision rule can be expressed

$$
k^i_{t+1} = \tilde\lambda k^i_t + \frac{\rho}{\lambda - \rho}\theta_t
$$

Consequently, when a history 𝜃𝑠 , 𝑠 ≤ 𝑡 is observed without noise, the following state space system prevails:

$$
\begin{aligned}
\begin{bmatrix} \theta_{t+1} \\ k^i_{t+1} \end{bmatrix} &=
\begin{bmatrix} \rho & 0 \\ \frac{\rho}{\lambda - \rho} & \tilde\lambda \end{bmatrix}
\begin{bmatrix} \theta_t \\ k^i_t \end{bmatrix} +
\begin{bmatrix} \sigma_v \\ 0 \end{bmatrix} z_{1,t+1} \\
\begin{bmatrix} \theta_t \\ k^i_t \end{bmatrix} &=
\begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix}
\begin{bmatrix} \theta_t \\ k^i_t \end{bmatrix} +
\begin{bmatrix} 0 \\ 0 \end{bmatrix} z_{1,t+1}
\end{aligned}
$$

where $z_{1,t+1}$ is a scalar iid standardized Gaussian process.


As usual, the system can be written as

𝑥𝑡+1 = 𝐴𝑥𝑡 + 𝐶𝑧𝑡+1


𝑦𝑡 = 𝐺𝑥𝑡 + 𝐻𝑤𝑡+1

In order once again to use the quantecon class quantecon.LinearStateSpace, let’s form pertinent state-space
matrices

Ao_lss = np.array([[ρ, 0.],
                   [ρ / (λ - ρ), λ_tilde]])

Co_lss = np.array([[σ_v], [0.]])

Go_lss = np.identity(2)

muo_0 = np.array([0., 0.])

lsso = qe.LinearStateSpace(Ao_lss, Co_lss, Go_lss, mu_0=muo_0)

Now let’s form and plot an impulse response function of 𝑘𝑡𝑖 to shocks 𝑣𝑡 to 𝜃𝑡+1

xcoef, ycoef = lsso.impulse_response(j=21)


data = np.array([ycoef])[0, :, 1, :]

fig = go.Figure(data=go.Scatter(y=data[:-1, 0], name=r'$z_{t+1}$'))


fig.update_layout(title=r'Impulse Response Function',
xaxis_title= r'lag $j$',
yaxis_title=r'$k^{i}_{t}$')
fig3 = fig
# Export to PNG file
Image(fig3.to_image(format="png"))
# fig3.show() will provide interactive plot when running
# notebook locally


34.11 Comparison of All Signal Structures

It is enlightening to plot side by side the impulse response functions for capital for the two noisy-signal information structures
and the noiseless signal on $\theta$ that we have just presented.
Please remember that the two-signal structure corresponds to the pooling equilibrium and also Townsend’s original
model.

fig_comb = go.Figure(data=[
*fig1.data,
*fig2.update_traces(xaxis='x2', yaxis='y2').data,
*fig3.update_traces(xaxis='x3', yaxis='y3').data
]).set_subplots(1, 3,
subplot_titles=("One noisy-signal",
"Two noisy-signal",
"No Noise"),
horizontal_spacing=0.02,
shared_yaxes=True)
# Export to PNG file
Image(fig_comb.to_image(format="png"))
# fig_comb.show() # will provide interactive plot when running
# notebook locally


The three panels in the graph above show that


• responses of 𝑘𝑡𝑖 to shocks 𝑣𝑡 to the hidden Markov demand state 𝜃𝑡 process are largest in the no-noisy-signal
structure in which the firm observes 𝜃𝑡 at time 𝑡
• responses of 𝑘𝑡𝑖 to shocks 𝑣𝑡 to the hidden Markov demand state 𝜃𝑡 process are smaller in the two-noisy-signal
structure
• responses of 𝑘𝑡𝑖 to shocks 𝑣𝑡 to the hidden Markov demand state 𝜃𝑡 process are smallest in the one-noisy-signal
structure
With respect to the iid demand shocks 𝑒𝑡 the graphs show that
• responses of 𝑘𝑡𝑖 to shocks 𝑒𝑡 to the hidden Markov demand state 𝜃𝑡 process are smallest (i.e., nonexistent) in the
no-noisy-signal structure in which the firm observes 𝜃𝑡 at time 𝑡
• responses of 𝑘𝑡𝑖 to shocks 𝑒𝑡 to the hidden Markov demand state 𝜃𝑡 process are larger in the two-noisy-signal
structure
• responses of 𝑘𝑡𝑖 to idiosyncratic own-market noise-shocks 𝑒𝑡 are largest in the one-noisy-signal structure
Among other things, these findings indicate that time series correlations and coherences between outputs in the two
industries are higher in the two-noisy-signals or pooling model than they are in the one-noisy signal model.
The enhanced influence of the shocks 𝑣𝑡 to the hidden Markov demand state 𝜃𝑡 process that emerges from the two-noisy-
signal model relative to the one-noisy-signal model is a symptom of a lower equilibrium hidden-state reconstruction error
variance in the two-signal model:

display(Latex('$\\textbf{Reconstruction error variances}$'))


display(Latex(f'One-noise structure: {round(p_one, 6)}'))
display(Latex(f'Two-noise structure: {round(p_two, 6)}'))


Reconstruction error variances

One-noise structure: 0.36618

Two-noise structure: 0.324062

Kalman gains for the two structures are

display(Latex('$\\textbf{Kalman Gains}$'))
display(Latex(f'One noisy-signal structure: {round(κ_one, 6)}'))
display(Latex(f'Two noisy-signals structure: {round(κ_two, 6)}'))

Kalman Gains

One noisy-signal structure: 0.403404

Two noisy-signals structure: 0.25716

Another lesson that comes from the preceding three-panel graph is that the presence of iid noise 𝜖𝑖𝑡 in industry 𝑖 generates
a response in 𝑘𝑡−𝑖 in the two-noisy-signal structure, but not in the one-noisy-signal structure.

34.12 Notes on History of the Problem

To truncate what he saw as an intractable, infinite dimensional state space, Townsend constructed an approximating model
in which the common hidden Markov demand shock is revealed to all firms after a fixed number of periods.
Thus,
• Townsend wanted to assume that at time 𝑡 firms in industry 𝑖 observe 𝑘𝑡𝑖 , 𝑌𝑡𝑖 , 𝑃𝑡𝑖 , (𝑃 −𝑖 )𝑡 , where (𝑃 −𝑖 )𝑡 is the
history of prices in the other market up to time 𝑡.
• Because that turned out to be too challenging, Townsend made a sensible alternative assumption that eased his
calculations: that after a large number 𝑆 of periods, firms in industry 𝑖 observe the hidden Markov component of
the demand shock 𝜃𝑡−𝑆 .
Townsend argued that the more manageable model could do a good job of approximating the intractable model in which
the Markov component of the demand shock remains unobserved for ever.
By applying technical machinery of [Pearlman et al., 1986], [Pearlman and Sargent, 2005] showed that there is a recursive
representation of the equilibrium of the perpetually and symmetrically uninformed model that Townsend wanted to solve
[Townsend, 1983].
A reader of [Pearlman and Sargent, 2005] will notice that their representation of the equilibrium of Townsend’s model
exactly matches that of the pooling equilibrium presented here.
We have structured our notation in this lecture to facilitate comparison of the pooling equilibrium constructed here with
the equilibrium of Townsend's model reported in [Pearlman and Sargent, 2005].
The computational method of [Pearlman and Sargent, 2005] is recursive: it enlists the Kalman filter and invariant subspace
methods for solving systems of Euler equations5 .
5 See [Anderson et al., 1996] for an account of invariant subspace methods.


As [Singleton, 1987], [Kasa, 2000], and [Sargent, 1991] also found, the equilibrium is fully revealing: observed prices
tell participants in industry 𝑖 all of the information held by participants in market −𝑖 (−𝑖 means not 𝑖).
This means that higher-order beliefs play no role: observing equilibrium prices in effect lets decision makers pool their
information sets6 .
The disappearance of higher order beliefs means that decision makers in this model do not really face a problem of
forecasting the forecasts of others.
Because those forecasts are the same as their own, they know them.

34.12.1 Further historical remarks

Sargent [Sargent, 1991] proposed a way to compute an equilibrium without making Townsend’s approximation.
Extending the reasoning of [Muth, 1960], Sargent noticed that it is possible to summarize the relevant history with a low
dimensional object, namely, a small number of current and lagged forecasting errors.
Positing an equilibrium in a space of perceived laws of motion for endogenous variables that take the form of a vector
autoregressive moving average process, Sargent described an equilibrium as a fixed point of a mapping from the perceived law
of motion to the actual law of motion of that form.
Sargent worked in the time domain and proceeded to guess and verify the appropriate orders of the autoregressive and
moving average pieces of the equilibrium representation.
By working in the frequency domain [Kasa, 2000] showed how to discover the appropriate orders of the autoregressive
and moving average parts, and also how to compute an equilibrium.
The [Pearlman and Sargent, 2005] recursive computational method, which stays in the time domain, also discovered
appropriate orders of the autoregressive and moving average pieces.
In addition, by displaying equilibrium representations in the form of [Pearlman et al., 1986], [Pearlman and Sargent,
2005] showed how the moving average piece is linked to the innovation process of the hidden persistent component of
the demand shock.
That scalar innovation process is the additional state variable contributed by the problem of extracting a signal from
equilibrium prices that decision makers face in Townsend’s model.

6 See [Allen et al., 2002] for a discussion of information assumptions needed to create a situation in which higher order beliefs appear in equilibrium

decision rules. A way to read our findings in light of [Allen et al., 2002] is that, relative to the number of signals agents observe, Townsend’s section 8
model has too few random shocks to get higher order beliefs to play a role.



Part VII

Asset Pricing and Finance

CHAPTER

THIRTYFIVE

ASSET PRICING II: THE LUCAS ASSET PRICING MODEL

35.1 Overview

As stated in an earlier lecture, an asset is a claim on a stream of prospective payments.


What is the correct price to pay for such a claim?
The elegant asset pricing model of Lucas [Lucas, 1978] attempts to answer this question in an equilibrium setting with
risk-averse agents.
While we mentioned some consequences of Lucas’ model earlier, it is now time to work through the model more carefully
and try to understand where the fundamental asset pricing equation comes from.
A side benefit of studying Lucas’ model is that it provides a beautiful illustration of model building in general and equi-
librium pricing in competitive models in particular.
Another difference to our first asset pricing lecture is that the state space and shock will be continuous rather than discrete.
Let’s start with some imports:

import numpy as np
from numba import njit, prange
from scipy.stats import lognorm
import matplotlib.pyplot as plt

35.2 The Lucas Model

Lucas studied a pure exchange economy with a representative consumer (or household), where
• Pure exchange means that all endowments are exogenous.
• Representative consumer means that either
– there is a single consumer (sometimes also referred to as a household), or
– all consumers have identical endowments and preferences
Either way, the assumption of a representative agent means that prices adjust to eradicate desires to trade.
This makes it very easy to compute competitive equilibrium prices.


35.2.1 Basic Setup

Let’s review the setup.

Assets

There is a single “productive unit” that costlessly generates a sequence of consumption goods $\{y_t\}_{t=0}^{\infty}$.

Another way to view $\{y_t\}_{t=0}^{\infty}$ is as a consumption endowment for this economy.

We will assume that this endowment is Markovian, following the exogenous process

𝑦𝑡+1 = 𝐺(𝑦𝑡 , 𝜉𝑡+1 )

Here {𝜉𝑡 } is an IID shock sequence with known distribution 𝜙 and 𝑦𝑡 ≥ 0.


An asset is a claim on all or part of this endowment stream.

The consumption goods $\{y_t\}_{t=0}^{\infty}$ are nonstorable, so holding assets is the only way to transfer wealth into the future.
For the purposes of intuition, it’s common to think of the productive unit as a “tree” that produces fruit.
Based on this idea, a “Lucas tree” is a claim on the consumption endowment.

Consumers

A representative consumer ranks consumption streams {𝑐𝑡 } according to the time separable utility functional

$$
\mathbb{E}\sum_{t=0}^{\infty} \beta^t u(c_t) \tag{35.1}
$$

Here
• 𝛽 ∈ (0, 1) is a fixed discount factor.
• 𝑢 is a strictly increasing, strictly concave, continuously differentiable period utility function.
• 𝔼 is a mathematical expectation.

35.2.2 Pricing a Lucas Tree

What is an appropriate price for a claim on the consumption endowment?


We’ll price an ex-dividend claim, meaning that
• the seller retains this period’s dividend
• the buyer pays 𝑝𝑡 today to purchase a claim on
– 𝑦𝑡+1 and
– the right to sell the claim tomorrow at price 𝑝𝑡+1
Since this is a competitive model, the first step is to pin down consumer behavior, taking prices as given.
Next, we’ll impose equilibrium constraints and try to back out prices.
In the consumer problem, the consumer’s control variable is the share 𝜋𝑡 of the claim held in each period.


Thus, the consumer problem is to maximize (35.1) subject to

𝑐𝑡 + 𝜋𝑡+1 𝑝𝑡 ≤ 𝜋𝑡 𝑦𝑡 + 𝜋𝑡 𝑝𝑡

along with 𝑐𝑡 ≥ 0 and 0 ≤ 𝜋𝑡 ≤ 1 at each 𝑡.


The decision to hold share 𝜋𝑡 is actually made at time 𝑡 − 1.
But this value is inherited as a state variable at time 𝑡, which explains the choice of subscript.

The Dynamic Program

We can write the consumer problem as a dynamic programming problem.


Our first observation is that prices depend on current information, and current information is really just the endowment
process up until the current period.
In fact, the endowment process is Markovian, so that the only relevant information is the current state 𝑦 ∈ ℝ+ (dropping
the time subscript).
This leads us to guess an equilibrium where price is a function 𝑝 of 𝑦.
Remarks on the solution method
• Since this is a competitive (read: price taking) model, the consumer will take this function 𝑝 as given.
• In this way, we determine consumer behavior given 𝑝 and then use equilibrium conditions to recover 𝑝.
• This is the standard way to solve competitive equilibrium models.
Using the assumption that price is a given function 𝑝 of 𝑦, we write the value function and constraint as

$$
v(\pi, y) = \max_{c, \pi'}\left\{u(c) + \beta\int v(\pi', G(y, z))\phi(dz)\right\}
$$

subject to

𝑐 + 𝜋′ 𝑝(𝑦) ≤ 𝜋𝑦 + 𝜋𝑝(𝑦) (35.2)

We can invoke the fact that utility is increasing to claim equality in (35.2) and hence eliminate the constraint, obtaining

$$
v(\pi, y) = \max_{\pi'}\left\{u[\pi(y + p(y)) - \pi' p(y)] + \beta\int v(\pi', G(y, z))\phi(dz)\right\} \tag{35.3}
$$

The solution to this dynamic programming problem is an optimal policy expressing either 𝜋′ or 𝑐 as a function of the
state (𝜋, 𝑦).
• Each one determines the other, since 𝑐(𝜋, 𝑦) = 𝜋(𝑦 + 𝑝(𝑦)) − 𝜋′ (𝜋, 𝑦)𝑝(𝑦)

Next Steps

What we need to do now is determine equilibrium prices.


It seems that to obtain these, we will have to
1. Solve this two-dimensional dynamic programming problem for the optimal policy.
2. Impose equilibrium constraints.
3. Solve out for the price function 𝑝(𝑦) directly.
However, as Lucas showed, there is a related but more straightforward way to do this.


Equilibrium Constraints

Since the consumption good is not storable, in equilibrium we must have 𝑐𝑡 = 𝑦𝑡 for all 𝑡.
In addition, since there is one representative consumer (alternatively, since all consumers are identical), there should be
no trade in equilibrium.
In particular, the representative consumer owns the whole tree in every period, so 𝜋𝑡 = 1 for all 𝑡.
Prices must adjust to satisfy these two constraints.

The Equilibrium Price Function

Now observe that the first-order condition for (35.3) can be written as

𝑢′ (𝑐)𝑝(𝑦) = 𝛽 ∫ 𝑣1′ (𝜋′ , 𝐺(𝑦, 𝑧))𝜙(𝑑𝑧)

where 𝑣1′ is the derivative of 𝑣 with respect to its first argument.


To obtain 𝑣1′ we can simply differentiate the right-hand side of (35.3) with respect to 𝜋, yielding

𝑣1′ (𝜋, 𝑦) = 𝑢′ (𝑐)(𝑦 + 𝑝(𝑦))

Next, we impose the equilibrium constraints while combining the last two equations to get

$$
p(y) = \beta\int \frac{u'[G(y, z)]}{u'(y)}\left[G(y, z) + p(G(y, z))\right]\phi(dz) \tag{35.4}
$$
In sequential rather than functional notation, we can also write this as

$$
p_t = \mathbb{E}_t\left[\beta\frac{u'(c_{t+1})}{u'(c_t)}\left(y_{t+1} + p_{t+1}\right)\right] \tag{35.5}
$$
This is the famous consumption-based asset pricing equation.
Before discussing it further we want to solve out for prices.

35.2.3 Solving the Model

Equation (35.4) is a functional equation in the unknown function 𝑝.


The solution is an equilibrium price function 𝑝∗ .
Let’s look at how to obtain it.

Setting up the Problem

Instead of solving for it directly we’ll follow Lucas’ indirect approach, first setting

𝑓(𝑦) ∶= 𝑢′ (𝑦)𝑝(𝑦) (35.6)

so that (35.4) becomes

𝑓(𝑦) = ℎ(𝑦) + 𝛽 ∫ 𝑓[𝐺(𝑦, 𝑧)]𝜙(𝑑𝑧) (35.7)

Here ℎ(𝑦) ∶= 𝛽 ∫ 𝑢′ [𝐺(𝑦, 𝑧)]𝐺(𝑦, 𝑧)𝜙(𝑑𝑧) is a function that depends only on the primitives.


Equation (35.7) is a functional equation in 𝑓.


The plan is to solve out for 𝑓 and convert back to 𝑝 via (35.6).
To solve (35.7) we’ll use a standard method: convert it to a fixed point problem.
First, we introduce the operator 𝑇 mapping 𝑓 into 𝑇 𝑓 as defined by

(𝑇 𝑓)(𝑦) = ℎ(𝑦) + 𝛽 ∫ 𝑓[𝐺(𝑦, 𝑧)]𝜙(𝑑𝑧) (35.8)

In what follows, we refer to 𝑇 as the Lucas operator.


The reason we do this is that a solution to (35.7) now corresponds to a function 𝑓 ∗ satisfying (𝑇 𝑓 ∗ )(𝑦) = 𝑓 ∗ (𝑦) for all 𝑦.
In other words, a solution is a fixed point of 𝑇 .
This means that we can use fixed point theory to obtain and compute the solution.

A Little Fixed Point Theory

Let 𝑐𝑏ℝ+ be the set of continuous bounded functions 𝑓 ∶ ℝ+ → ℝ+ .


We now show that
1. 𝑇 has exactly one fixed point 𝑓 ∗ in 𝑐𝑏ℝ+ .
2. For any 𝑓 ∈ 𝑐𝑏ℝ+ , the sequence 𝑇 𝑘 𝑓 converges uniformly to 𝑓 ∗ .

Note: If you find the mathematics heavy going you can take 1–2 as given and skip to the next section

Recall the Banach contraction mapping theorem.


It tells us that the previous statements will be true if we can find an 𝛼 < 1 such that

‖𝑇 𝑓 − 𝑇 𝑔‖ ≤ 𝛼‖𝑓 − 𝑔‖, ∀ 𝑓, 𝑔 ∈ 𝑐𝑏ℝ+ (35.9)

Here $\|h\| := \sup_{x \in \mathbb{R}_+} |h(x)|$.

To see that (35.9) is valid, pick any 𝑓, 𝑔 ∈ 𝑐𝑏ℝ+ and any 𝑦 ∈ ℝ+ .


Observe that, since integrals get larger when absolute values are moved to the inside,

|𝑇 𝑓(𝑦) − 𝑇 𝑔(𝑦)| = ∣𝛽 ∫ 𝑓[𝐺(𝑦, 𝑧)]𝜙(𝑑𝑧) − 𝛽 ∫ 𝑔[𝐺(𝑦, 𝑧)]𝜙(𝑑𝑧)∣

≤ 𝛽 ∫ |𝑓[𝐺(𝑦, 𝑧)] − 𝑔[𝐺(𝑦, 𝑧)]| 𝜙(𝑑𝑧)

≤ 𝛽 ∫ ‖𝑓 − 𝑔‖𝜙(𝑑𝑧)

= 𝛽‖𝑓 − 𝑔‖

Since the right-hand side is an upper bound, taking the sup over all 𝑦 on the left-hand side gives (35.9) with 𝛼 ∶= 𝛽.


35.2.4 Computation – An Example

The preceding discussion tells us that we can compute $f^*$ by picking any arbitrary $f \in cb\mathbb{R}_+$ and then iterating with $T$.
The equilibrium price function 𝑝∗ can then be recovered by 𝑝∗ (𝑦) = 𝑓 ∗ (𝑦)/𝑢′ (𝑦).
Let’s try this when ln 𝑦𝑡+1 = 𝛼 ln 𝑦𝑡 + 𝜎𝜖𝑡+1 where {𝜖𝑡 } is IID and standard normal.
Utility will take the isoelastic form $u(c) = c^{1-\gamma}/(1-\gamma)$, where $\gamma > 0$ is the coefficient of relative risk aversion.
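Under this specification $G(y, z) = y^\alpha z$ with $z := e^{\sigma\epsilon}$ lognormal, and $u'(c) = c^{-\gamma}$, so the function $h$ in (35.7) reduces to

$$
h(y) = \beta \int (y^\alpha z)^{1-\gamma}\,\phi(dz) = \beta\, y^{\alpha(1-\gamma)}\, \mathbb{E}\left[z^{1-\gamma}\right],
$$

which is the expectation that the LucasTree class below approximates by Monte Carlo.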
We will set up a LucasTree class to hold parameters of the model

class LucasTree:
"""
Class to store parameters of the Lucas tree model.

"""

def __init__(self,
γ=2, # CRRA utility parameter
β=0.95, # Discount factor
α=0.90, # Correlation coefficient
σ=0.1, # Volatility coefficient
grid_size=100):

self.γ, self.β, self.α, self.σ = γ, β, α, σ

# Set the grid interval to contain most of the mass of the


# stationary distribution of the consumption endowment
ssd = self.σ / np.sqrt(1 - self.α**2)
grid_min, grid_max = np.exp(-4 * ssd), np.exp(4 * ssd)
self.grid = np.linspace(grid_min, grid_max, grid_size)
self.grid_size = grid_size

# Set up distribution for shocks


self.ϕ = lognorm(σ)
self.draws = self.ϕ.rvs(500)

self.h = np.empty(self.grid_size)
for i, y in enumerate(self.grid):
self.h[i] = β * np.mean((y**α * self.draws)**(1 - γ))

The following function takes an instance of the LucasTree and generates a jitted version of the Lucas operator

def operator_factory(tree, parallel_flag=True):

"""
Returns approximate Lucas operator, which computes and returns the
updated function Tf on the grid points.

tree is an instance of the LucasTree class

"""

grid, h = tree.grid, tree.h


α, β = tree.α, tree.β
z_vec = tree.draws



@njit(parallel=parallel_flag)
def T(f):
"""
The Lucas operator
"""

# Turn f into a function


Af = lambda x: np.interp(x, grid, f)

Tf = np.empty_like(f)
# Apply the T operator to f using Monte Carlo integration
for i in prange(len(grid)):
y = grid[i]
Tf[i] = h[i] + β * np.mean(Af(y**α * z_vec))

return Tf

return T

To solve the model, we write a function that iterates using the Lucas operator to find the fixed point.

def solve_model(tree, tol=1e-6, max_iter=500):


"""
Compute the equilibrium price function associated with Lucas
tree

* tree is an instance of LucasTree

"""
# Simplify notation
grid, grid_size = tree.grid, tree.grid_size
γ = tree.γ

T = operator_factory(tree)

i = 0
f = np.ones_like(grid) # Initial guess of f
error = tol + 1
while error > tol and i < max_iter:
Tf = T(f)
error = np.max(np.abs(Tf - f))
f = Tf
i += 1

price = f * grid**γ # Back out price vector

return price

Solving the model and plotting the resulting price function

tree = LucasTree()
price_vals = solve_model(tree)

fig, ax = plt.subplots(figsize=(10, 6))


ax.plot(tree.grid, price_vals, label='$p*(y)$')


ax.set_xlabel('$y$')
ax.set_ylabel('price')
ax.legend()
plt.show()

We see that the price is increasing, even if we remove all serial correlation from the endowment process.
The reason is that a larger current endowment reduces current marginal utility.
The price must therefore rise to induce the household to consume the entire endowment (and hence satisfy the resource
constraint).
What happens with a more patient consumer?
Here the orange line corresponds to the previous parameters and the green line is price when 𝛽 = 0.98.
We see that when consumers are more patient the asset becomes more valuable, and the price of the Lucas tree shifts up.
Exercise 1 asks you to replicate this figure.

35.3 Exercises

Exercise 35.3.1
Replicate the figure to show how discount factors affect prices.

Solution to Exercise 35.3.1


fig, ax = plt.subplots(figsize=(10, 6))

for β in (.95, 0.98):


tree = LucasTree(β=β)
grid = tree.grid
price_vals = solve_model(tree)
label = rf'$\beta = {β}$'
ax.plot(grid, price_vals, lw=2, alpha=0.7, label=label)

ax.legend(loc='upper left')
ax.set(xlabel='$y$', ylabel='price', xlim=(min(grid), max(grid)))
plt.show()

CHAPTER

THIRTYSIX

ELEMENTARY ASSET PRICING THEORY

36.1 Overview

This lecture is about some implications of asset-pricing theories that are based on the equation 𝐸𝑚𝑅 = 1, where 𝑅 is
the gross return on an asset, 𝑚 is a stochastic discount factor, and 𝐸 is a mathematical expectation with respect to a joint
probability distribution of 𝑅 and 𝑚.
Instances of this equation occur in many models.

Note: Chapter 1 of [Ljungqvist and Sargent, 2018] describes the role that this equation plays in a diverse set of models
in macroeconomics, monetary economics, and public finance.

We aim to convey insights about empirical implications of this equation brought out in the work of Lars Peter Hansen
[Hansen and Richard, 1987] and Lars Peter Hansen and Ravi Jagannathan [Hansen and Jagannathan, 1991].
By following their footsteps, from that single equation we’ll derive
• a mean-variance frontier
• a single-factor model of excess returns
To do this, we use two ideas:
• the equation 𝐸𝑚𝑅 = 1 that is implied by an application of a law of one price
• a Cauchy-Schwartz inequality
In particular, we’ll apply a Cauchy-Schwartz inequality to a population linear least squares regression equation that is
implied by 𝐸𝑚𝑅 = 1.
We’ll also describe how practitioners have implemented the model using
• cross sections of returns on many assets
• time series of returns on various assets
For background and basic concepts about linear least squares projections, see our lecture orthogonal projections and their
applications.
As a sequel to the material here, please see our lecture two modifications of mean-variance portfolio theory.


36.2 Key Equation

We begin with a key asset pricing equation:

𝐸𝑚𝑅𝑖 = 1 (36.1)

for 𝑖 = 1, … , 𝐼 and where

𝑚 = stochastic discount factor


𝑅𝑖 = random gross return on asset 𝑖
𝐸 ∼ mathematical expectation

The random gross return 𝑅𝑖 for every asset 𝑖 and the scalar stochastic discount factor 𝑚 live in a common probability
space.
[Hansen and Richard, 1987] and [Hansen and Jagannathan, 1991] explain how existence of a scalar stochastic discount
factor that verifies equation (36.1) is implied by a law of one price that requires that all portfolios of assets that bring the
same payouts have the same price.
They also explain how the absence of an arbitrage opportunity implies that the stochastic discount factor 𝑚 ≥ 0.
In order to say something about the uniqueness of a stochastic discount factor, we would have to impose more theoretical
structure than we do in this lecture.
For example, in complete markets models like those illustrated in this lecture equilibrium capital structures with incom-
plete markets, the stochastic discount factor is unique.
In incomplete markets models like those illustrated in this lecture the Aiyagari model, the stochastic discount factor is
not unique.

36.3 Implications of Key Equation

We combine key equation (36.1) with a remark of Lars Peter Hansen that “asset pricing theory is all about covariances”.

Note: Lars Hansen’s remark is a concise summary of ideas in [Hansen and Richard, 1987] and [Hansen and Jagannathan,
1991]. Important foundations of these ideas were set down by [Ross, 1976], [Ross, 1978], [Harrison and Kreps, 1979],
[Kreps, 1981], and [Chamberlain and Rothschild, 1983].

This remark of Lars Hansen refers to the fact that interesting restrictions can be deduced by recognizing that 𝐸𝑚𝑅𝑖 is a
component of the covariance between 𝑚 and 𝑅𝑖 and then using that fact to rearrange equation (36.1).
Let’s do this step by step.
First note that the definition of a covariance cov (𝑚, 𝑅𝑖 ) = 𝐸(𝑚 − 𝐸𝑚)(𝑅𝑖 − 𝐸𝑅𝑖 ) implies that

𝐸𝑚𝑅𝑖 = 𝐸𝑚𝐸𝑅𝑖 + cov (𝑚, 𝑅𝑖 )

Substituting this result into equation (36.1) gives

1 = 𝐸𝑚𝐸𝑅𝑖 + cov (𝑚, 𝑅𝑖 ) (36.2)

Next note that for a risk-free asset with non-random gross return 𝑅𝑓 , equation (36.1) becomes

1 = 𝐸𝑅𝑓 𝑚 = 𝑅𝑓 𝐸𝑚.


This is true because we can pull the constant 𝑅𝑓 outside the mathematical expectation.
It follows that the gross return on a risk-free asset is

𝑅𝑓 = 1/𝐸(𝑚)

Using this formula for 𝑅𝑓 in equation (36.2) and rearranging, it follows that

𝑅𝑓 = 𝐸𝑅𝑖 + cov (𝑚, 𝑅𝑖 ) 𝑅𝑓

which can be rearranged to become

𝐸𝑅𝑖 = 𝑅𝑓 − cov (𝑚, 𝑅𝑖 ) 𝑅𝑓 .

It follows that we can express an excess return 𝐸𝑅𝑖 − 𝑅𝑓 on asset 𝑖 relative to the risk-free rate as

𝐸𝑅𝑖 − 𝑅𝑓 = − cov (𝑚, 𝑅𝑖 ) 𝑅𝑓 (36.3)
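An illustrative calculation, using assumed numbers rather than calibrated values: with $E(m) = .99$ the risk-free gross return is $R^f = 1/E(m)$, and an asset whose return has $\mathrm{cov}(m, R^i) = -.005$ earns the excess return $-\mathrm{cov}(m, R^i)R^f$.

Em = 0.99           # assumed mean of the stochastic discount factor
cov_mRi = -0.005    # assumed covariance between m and R^i
Rf = 1 / Em
excess_return = -cov_mRi * Rf
print(Rf, excess_return)   # ≈ 1.0101 and ≈ 0.00505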

Equation (36.3) can be rearranged to display important parts of asset pricing theory.

36.4 Expected Return - Beta Representation

We can obtain the celebrated expected-return-Beta -representation for gross return 𝑅𝑖 by simply rearranging excess
return equation (36.3) to become


$$
E R^i = R^f + \underbrace{\left(\frac{\mathrm{cov}(R^i, m)}{\mathrm{var}(m)}\right)}_{\beta_{i,m} = \text{regression coefficient}}\underbrace{\left(-\frac{\mathrm{var}(m)}{E(m)}\right)}_{\lambda_m = \text{price of risk}}
$$

or

𝐸𝑅𝑖 = 𝑅𝑓 + 𝛽𝑖,𝑚 𝜆𝑚 (36.4)

Here
• 𝛽𝑖,𝑚 is a (population) least squares regression coefficient of gross return 𝑅𝑖 on stochastic discount factor 𝑚
• 𝜆𝑚 is minus the variance of 𝑚 divided by the mean of 𝑚, an object that is sometimes called a price of risk.
Because 𝜆𝑚 < 0, equation (36.4) asserts that
• assets whose returns are positively correlated with the stochastic discount factor (SDF) 𝑚 have expected returns
lower than the risk-free rate 𝑅𝑓
• assets whose returns are negatively correlated with the SDF 𝑚 have expected returns higher than the risk-free
rate 𝑅𝑓
These patterns will be discussed more below.
In particular, we’ll see that returns that are perfectly negatively correlated with the SDF 𝑚 have a special status:
• they are on a mean-variance frontier
Before we dive into that more, we’ll pause to look at an example of an SDF.
To interpret representation (36.4), the following widely used example helps.
Example


Let 𝑐𝑡 be the logarithm of the consumption of a representative consumer or just a single consumer for whom we have
consumption data.
A popular model of 𝑚 is

$$
m_{t+1} = \beta \frac{U'(C_{t+1})}{U'(C_t)}
$$

where $C_t$ is consumption at time $t$, $\beta = \exp(-\rho)$ is a discount factor with $\rho$ being the discount rate, and $U(\cdot)$ is a
concave, twice-differentiable utility function.

For a constant relative risk aversion (CRRA) utility function $U(C) = \frac{C^{1-\gamma}}{1-\gamma}$, marginal utility is $U'(C) = C^{-\gamma}$.
In this case, letting 𝑐𝑡 = log(𝐶𝑡 ), we can write 𝑚𝑡+1 as

𝑚𝑡+1 = exp(−𝜌) exp(−𝛾(𝑐𝑡+1 − 𝑐𝑡 ))

where 𝜌 > 0, 𝛾 > 0.


A popular model for the growth of log of consumption is

𝑐𝑡+1 − 𝑐𝑡 = 𝜇 + 𝜎𝑐 𝜖𝑡+1

where 𝜖𝑡+1 ∼ 𝒩(0, 1).


Here $\{c_t\}$ is a random walk with drift $\mu$, a good approximation to US per capita consumption growth.
Again here
• 𝛾 > 0 is a coefficient of relative risk aversion
• 𝜌 > 0 is a fixed intertemporal discount rate
So we have

𝑚𝑡+1 = exp(−𝜌) exp(−𝛾𝜇 − 𝛾𝜎𝑐 𝜖𝑡+1 )

In this case
$$
E m_{t+1} = \exp(-\rho)\exp\left(-\gamma\mu + \frac{\sigma_c^2\gamma^2}{2}\right)
$$

and

$$
\mathrm{var}(m_{t+1}) = \left[E(m_{t+1})\right]^2\left[\exp(\sigma_c^2\gamma^2) - 1\right]
$$
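A minimal simulation sketch that checks the two moment formulas above; the values assumed for $\rho$, $\gamma$, $\mu$, $\sigma_c$ are purely illustrative, not estimates.

import numpy as np

ρ, γ, μ, σ_c = 0.02, 2.0, 0.018, 0.02   # assumed, for illustration only
ε = np.random.default_rng(0).standard_normal(1_000_000)
m = np.exp(-ρ) * np.exp(-γ * μ - γ * σ_c * ε)

Em = np.exp(-ρ) * np.exp(-γ * μ + σ_c ** 2 * γ ** 2 / 2)
var_m = Em ** 2 * (np.exp(σ_c ** 2 * γ ** 2) - 1)

print(m.mean(), Em)      # sample mean vs. formula
print(m.var(), var_m)    # sample variance vs. formula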

When 𝛾 > 0, it is true that


• when consumption growth is high, 𝑚 is low
• when consumption growth is low, 𝑚 is high
According to representation (36.4), an asset with a gross return 𝑅𝑖 that is expected to be high when consumption growth
is low has 𝛽𝑖,𝑚 positive and a low expected return.
• because it has a high gross return when consumption growth is low, it is a good hedge against consumption risk.
That justifies its low average return.
An asset with an 𝑅𝑖 that is low when consumption growth is low has 𝛽𝑖,𝑚 negative and a high expected return.
• because it has a low gross return when consumption growth is low, it is a poor hedge against consumption risk.
That justifies its high average return.


36.5 Mean-Variance Frontier

Now we’ll derive the celebrated mean-variance frontier.


We do this using a method deployed by Lars Peter Hansen and Scott Richard [Hansen and Richard, 1987].

Note: Methods of Hansen and Richard are described and used extensively by [Cochrane, 2005].

Their idea was to rearrange the key equation (36.1), namely, $EmR^i = 1$, and then to apply a Cauchy-Schwarz inequality.
A convenient way to remember the Cauchy-Schwartz inequality in our context is that it says that an 𝑅2 in any regression
has to be less than or equal to 1.
(Please note that here 𝑅2 denotes the coefficient of determination in a regression, not a return on an asset!)
Let’s apply that idea to deduce

1 = 𝐸 (𝑚𝑅𝑖 ) = 𝐸(𝑚)𝐸 (𝑅𝑖 ) + 𝜌𝑚,𝑅𝑖 𝜎(𝑚)𝜎 (𝑅𝑖 ) (36.5)

where the correlation coefficient 𝜌𝑚,𝑅𝑖 is defined as

$$
\rho_{m,R^i} \equiv \frac{\mathrm{cov}(m, R^i)}{\sigma(m)\sigma(R^i)}
$$

and where $\sigma(\cdot)$ denotes the standard deviation of the variable in parentheses.
Equation (36.5) implies

$$
E R^i = R^f - \rho_{m,R^i}\frac{\sigma(m)}{E(m)}\sigma(R^i)
$$

Because 𝜌𝑚,𝑅𝑖 ∈ [−1, 1], it follows that |𝜌𝑚,𝑅𝑖 | ≤ 1 and that

$$
\left|E R^i - R^f\right| \leqslant \frac{\sigma(m)}{E(m)}\sigma(R^i) \tag{36.6}
$$

Inequality (36.6) delineates a mean-variance frontier


(Actually, it looks more like a mean-standard-deviation frontier)
Evidently, points on the frontier correspond to gross returns that are perfectly correlated (either positively or negatively)
with the stochastic discount factor 𝑚.
We summarize this observation as
$$
\rho_{m,R^i} = \begin{cases} +1 & \implies R^i \text{ is on a lower frontier} \\ -1 & \implies R^i \text{ is on an upper frontier} \end{cases}
$$

Now let’s use matplotlib to draw a mean variance frontier.


In drawing a frontier, we’ll set 𝜎(𝑚) = .25 and 𝐸𝑚 = .99, values roughly consistent with what many studies calibrate
from quarterly US data.

import matplotlib.pyplot as plt


import numpy as np

# Define the function to plot


def y(x, alpha, beta):


return alpha + beta*x
def z(x, alpha, beta):
return alpha - beta*x

sigmam = .25
Em = .99

# Set the values of alpha and beta


alpha = 1/Em
beta = sigmam/Em

# Create a range of values for x


x = np.linspace(0, .15, 100)

# Calculate the values of y and z


y_values = y(x, alpha, beta)
z_values = z(x, alpha, beta)

# Create a figure and axes object


fig, ax = plt.subplots()

# Plot y
ax.plot(x, y_values, label=r'$R^f + \frac{\sigma(m)}{E(m)} \sigma(R^i)$')
ax.plot(x, z_values, label=r'$R^f - \frac{\sigma(m)}{E(m)} \sigma(R^i)$')

plt.title('mean standard deviation frontier')


plt.xlabel(r"$\sigma(R^i)$")
plt.ylabel(r"$E (R^i) $")
plt.text(.053, 1.015, "(.05,1.015)")
ax.plot(.05, 1.015, 'o', label=r"$(\sigma(R^j), E R^j)$")
# Add a legend and show the plot
ax.legend()
plt.show()


The figure shows two straight lines, the blue upper one being the locus of $(\sigma(R^i), E(R^i))$ pairs that are on the mean-variance frontier or mean-standard-deviation frontier.
The green dot refers to a return 𝑅𝑗 that is not on the frontier and that has moments (𝜎(𝑅𝑗 ), 𝐸𝑅𝑗 ) = (.05, 1.015).
It is described by the statistical model

𝑅 𝑗 = 𝑅 𝑖 + 𝜖𝑗

where 𝑅𝑖 is a return that is on the frontier and 𝜖𝑗 is a random variable that has mean zero and that is orthogonal to 𝑅𝑖 .
Then 𝐸𝑅𝑗 = 𝐸𝑅𝑖 and, as a consequence of 𝑅𝑗 not being on the frontier,

𝜎2 (𝑅𝑗 ) = 𝜎2 (𝑅𝑖 ) + 𝜎2 (𝜖𝑗 )

The length of a horizontal line from the point 𝜎(𝑅𝑗 ), 𝐸(𝑅𝑗 ) = .05, 1.015 to the frontier equals

$$
\sqrt{\sigma^2(R^i) + \sigma^2(\epsilon^j)} - \sigma(R^i)
$$

This is a measure of the part of the risk in 𝑅𝑗 that is not priced because it is uncorrelated with the stochastic discount
factor and so can be diversified away (i.e., averaged out to zero by holding a diversified portfolio).
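With the calibration used in the figure ($\sigma(m) = .25$, $E(m) = .99$) and the plotted point $(\sigma(R^j), E(R^j)) = (.05, 1.015)$, this horizontal distance, which equals $\sigma(R^j) - \sigma(R^i)$, can be computed directly:

Em, sigmam = 0.99, 0.25
sigma_Rj, E_Rj = 0.05, 1.015
Rf = 1 / Em
sigma_Ri = (E_Rj - Rf) * Em / sigmam   # frontier return with the same mean
print(sigma_Rj - sigma_Ri)             # ≈ 0.031, the diversifiable part of the risk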


36.6 Sharpe Ratios and the Price of Risk

An asset’s Sharpe ratio is defined as

$$
\frac{E(R^i) - R^f}{\sigma(R^i)}
$$

The above figure reminds us that all assets 𝑅𝑖 whose returns are on the mean-standard deviation frontier satisfy

$$
\frac{E(R^i) - R^f}{\sigma(R^i)} = \frac{\sigma(m)}{E m}
$$

The ratio $\frac{\sigma(m)}{E m}$ is often called the market price of risk.
Evidently it equals the maximum Sharpe ratio for any asset or portfolio of assets.
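A quick numerical check with the calibration used in the frontier figure above:

sigmam, Em = 0.25, 0.99
print(sigmam / Em)   # maximal Sharpe ratio ≈ 0.2525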

36.7 Mathematical Structure of Frontier

The mathematical structure of the mean-variance frontier described by inequality (36.6) implies that
• all returns on the frontier are perfectly correlated.
Thus,
– Let 𝑅𝑚 , 𝑅𝑚𝑣 be two returns on the frontier.
– Then for some scalar 𝑎, a return 𝑅𝑚𝑣 on the mean-variance frontier satisfies the affine equation 𝑅𝑚𝑣 =
𝑅𝑓 + 𝑎 (𝑅𝑚 − 𝑅𝑓 ) . This is an exact equation with no residual.
• each return 𝑅𝑚𝑣 that is on the mean-variance frontier is perfectly (negatively) correlated with 𝑚
– $(\rho_{m,R^{mv}} = -1) \Rightarrow \begin{cases} m = a + b R^{mv} \\ R^{mv} = e + d m \end{cases}$ for some scalars $a, b, e, d$,
Therefore, any return on the mean-variance frontier is a legitimate stochastic discount factor
• for any mean-variance-efficient return 𝑅𝑚𝑣 that is on the frontier but that is not 𝑅𝑓 , there exists a single-beta
representation for any return 𝑅𝑖 that takes the form:
𝐸𝑅𝑖 = 𝑅𝑓 + 𝛽𝑖,𝑅𝑚𝑣 [𝐸 (𝑅𝑚𝑣 ) − 𝑅𝑓 ] (36.7)
• the regression coefficient 𝛽𝑖,𝑅𝑚𝑣 is often called asset 𝑖’s beta
• The special case of a single-beta representation (36.7) with 𝑅𝑖 = 𝑅𝑚𝑣 is
𝐸𝑅𝑚𝑣 = 𝑅𝑓 + 1 ⋅ [𝐸 (𝑅𝑚𝑣 ) − 𝑅𝑓 ]

36.8 Multi-factor Models

The single-beta representation (36.7) is a special case of the multi-factor model

𝐸𝑅𝑖 = 𝛾 + 𝛽𝑖,𝑎 𝜆𝑎 + 𝛽𝑖,𝑏 𝜆𝑏 + ⋯

where 𝜆𝑗 is the price of being exposed to risk factor 𝑓𝑡𝑗 and 𝛽𝑖,𝑗 is asset 𝑖’s exposure to that risk factor.


To uncover the 𝛽𝑖,𝑗 ’s, one takes data on time series of the risk factors 𝑓𝑡𝑗 that are being priced and specifies the following
least squares regression

$$
\begin{aligned}
R^i_t &= a_i + \beta_{i,a} f^a_t + \beta_{i,b} f^b_t + \ldots + \epsilon^i_t, \quad t = 1, 2, \ldots, T \\
\epsilon^i_t &\perp f^j_t, \quad i = 1, 2, \ldots, I; \; j = a, b, \ldots
\end{aligned} \tag{36.8}
$$

Special cases are:


• a popular single-factor model specifies the single factor 𝑓𝑡 to be the return on the market portfolio
• another popular single-factor model called the consumption-based model specifies the factor to be $m_{t+1} = \beta\frac{u'(c_{t+1})}{u'(c_t)}$, where $c_t$ is a representative consumer's time $t$ consumption.

As a reminder, model objects are interpreted as follows:


• 𝛽𝑖,𝑎 is the exposure of return 𝑅𝑖 to risk factor 𝑓𝑎
• 𝜆𝑎 is the price of exposure to risk factor 𝑓𝑎

36.9 Empirical Implementations

We briefly describe empirical implementations of multi-factor generalizations of the single-factor model described above.
Two representations of a multi-factor model play important roles in empirical applications.
One is the time series regression (36.8)
The other representation entails a cross-section regression of average returns 𝐸𝑅𝑖 for assets 𝑖 = 1, 2, … , 𝐼 on prices
of risk 𝜆𝑗 for 𝑗 = 𝑎, 𝑏, 𝑐, …
Here is the cross-section regression specification for a multi-factor model:

𝐸𝑅𝑖 = 𝛾 + 𝛽𝑖,𝑎 𝜆𝑎 + 𝛽𝑖,𝑏 𝜆𝑏 + ⋯

Testing strategies:
Time-series and cross-section regressions play roles in both estimating and testing beta representation models.
The basic idea is to implement the following two steps.
Step 1:
• Estimate 𝑎𝑖 , 𝛽𝑖,𝑎 , 𝛽𝑖,𝑏 , ⋯ by running a time series regression: 𝑅𝑡𝑖 on a constant and 𝑓𝑡𝑎 , 𝑓𝑡𝑏 , …
Step 2:
• take the 𝛽𝑖,𝑗 ’s estimated in step one as regressors together with data on average returns 𝐸𝑅𝑖 over some period and
then estimate the cross-section regression
$$
\underbrace{E(R^i)}_{\text{average return over time series}} = \gamma + \underbrace{\beta_{i,a}}_{\text{regressor}}\underbrace{\lambda_a}_{\text{regression coefficient}} + \underbrace{\beta_{i,b}}_{\text{regressor}}\underbrace{\lambda_b}_{\text{regression coefficient}} + \cdots + \underbrace{\alpha_i}_{\text{pricing errors}}, \quad i = 1, \ldots, I; \qquad \underbrace{\alpha_i \perp \beta_{i,j}, \; j = a, b, \ldots}_{\text{least squares orthogonality condition}}
$$

• Here ⟂ means orthogonal to


• estimate 𝛾, 𝜆𝑎 , 𝜆𝑏 , … by an appropriate regression technique, recognizing that the regressors have been generated
by a step 1 regression.
Note that presumably the risk-free return 𝐸𝑅𝑓 = 𝛾.
For excess returns 𝑅𝑒𝑖 = 𝑅𝑖 − 𝑅𝑓 we have

𝐸𝑅𝑒𝑖 = 𝛽𝑖,𝑎 𝜆𝑎 + 𝛽𝑖,𝑏 𝜆𝑏 + ⋯ + 𝛼𝑖 , 𝑖 = 1, … , 𝐼
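Before turning to the exercises, here is a minimal sketch of the two-step procedure just described, run on artificial data; the number of factors, assets, and periods and the values of $\gamma$, $\lambda_a$, $\lambda_b$ below are illustrative assumptions rather than the calibration used in the exercises.

import numpy as np

np.random.seed(0)
T, I = 2000, 8
γ_0, λ_a, λ_b = 0.01, 0.04, 0.02                   # assumed constant and prices of risk
β_true = np.random.uniform(0.2, 1.5, size=(I, 2))  # assumed factor exposures

f = np.random.normal(size=(T, 2))                  # two risk factors
f -= f.mean(axis=0)                                # demean so sample factor means are zero
ϵ = 0.05 * np.random.normal(size=(T, I))           # idiosyncratic noise
R = γ_0 + (f + np.array([λ_a, λ_b])) @ β_true.T + ϵ

# Step 1: time-series regressions of each R^i on a constant and the factors
X = np.column_stack([np.ones(T), f])
β_hat = np.linalg.lstsq(X, R, rcond=None)[0][1:].T

# Step 2: cross-section regression of average returns on the estimated betas
Z = np.column_stack([np.ones(I), β_hat])
γ_hat, λ_a_hat, λ_b_hat = np.linalg.lstsq(Z, R.mean(axis=0), rcond=None)[0]
print(γ_hat, λ_a_hat, λ_b_hat)   # should be close to 0.01, 0.04, 0.02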


In the following exercises, we illustrate aspects of these empirical strategies on artificial data.
Our basic tools are random number generator that we shall use to create artificial samples that conform to the theory and
least squares regressions that let us watch aspects of the theory at work.
These exercises will further convince us that asset pricing theory is mostly about covariances and least squares regressions.

36.10 Exercises

Let’s start with some imports.

import numpy as np
import statsmodels.api as sm
import matplotlib.pyplot as plt

Lots of our calculations will involve computing population and sample OLS regressions.
So we define a function for simple univariate OLS regression that calls the OLS routine from statsmodels.

def simple_ols(X, Y, constant=False):

if constant:
X = sm.add_constant(X)

model = sm.OLS(Y, X)
res = model.fit()

β_hat = res.params[-1]
σ_hat = np.sqrt(res.resid @ res.resid / res.df_resid)

return β_hat, σ_hat

Exercise 36.10.1
Look at the equation,

𝑅𝑡𝑖 − 𝑅𝑓 = 𝛽𝑖,𝑅𝑚 (𝑅𝑡𝑚 − 𝑅𝑓 ) + 𝜎𝑖 𝜀𝑖,𝑡 .

Verify that this equation is a regression equation.

Solution to Exercise 36.10.1


To verify that it is a regression equation we must show that the residual is orthogonal to the regressor.
Our assumptions about mutual orthogonality imply that

𝐸 [𝜖𝑖,𝑡 ] = 0, 𝐸 [𝜖𝑖,𝑡 𝑢𝑡 ] = 0

It follows that
𝐸 [𝜎𝑖 𝜖𝑖,𝑡 (𝑅𝑡𝑚 − 𝑅𝑓 )] = 𝐸 [𝜎𝑖 𝜖𝑖,𝑡 (𝜉 + 𝜆𝑢𝑡 )]
= 𝜎𝑖 𝜉𝐸 [𝜖𝑖,𝑡 ] + 𝜎𝑖 𝜆𝐸 [𝜖𝑖,𝑡 𝑢𝑡 ]
=0


Exercise 36.10.2
Give a formula for the regression coefficient 𝛽𝑖,𝑅𝑚 .

Solution to Exercise 36.10.2


The regression coefficient 𝛽𝑖,𝑅𝑚 is

$$
\beta_{i,R^m} = \frac{\mathrm{Cov}\left(R^i_t - R^f, R^m_t - R^f\right)}{\mathrm{Var}\left(R^m_t - R^f\right)}
$$

Exercise 36.10.3
As in many sciences, it is useful to distinguish a direct problem from an inverse problem.
• A direct problem involves simulating a particular model with known parameter values.
• An inverse problem involves using data to estimate or choose a particular parameter vector from a manifold of
models indexed by a set of parameter vectors.
Please assume the parameter values provided below and then simulate 2000 observations from the theory specified above
for 5 assets, 𝑖 = 1, … , 5.

𝐸 [𝑅𝑓 ] = 0.02
𝜎𝑓 = 0.00
𝜉 = 0.06
𝜆 = 0.08
𝛽1,𝑅𝑚 = 0.2
𝜎1 = 0.04
𝛽2,𝑅𝑚 = .4
𝜎2 = 0.04
𝛽3,𝑅𝑚 = .6
𝜎3 = 0.04
𝛽4,𝑅𝑚 = .8
𝜎4 = 0.04
𝛽5,𝑅𝑚 = 1.0
𝜎5 = 0.04

More Exercises
Now come some even more fun parts!
Our theory implies that there exist values of two scalars, 𝑎 and 𝑏, such that a legitimate stochastic discount factor is:

𝑚𝑡 = 𝑎 + 𝑏𝑅𝑡𝑚

The parameters 𝑎, 𝑏 must satisfy the following equations:

𝐸[(𝑎 + 𝑏𝑅𝑡𝑚 )𝑅𝑡𝑚 )] = 1


𝐸[(𝑎 + 𝑏𝑅𝑡𝑚 )𝑅𝑡𝑓 )] = 1


Solution to Exercise 36.10.3


Direct Problem:

# Code for the direct problem

# assign the parameter values


ERf = 0.02
σf = 0.00 # Standard deviation of the shock to the risk-free rate
ξ = 0.06
λ = 0.08
βi = np.array([0.2, .4, .6, .8, 1.0])
σi = np.array([0.04, 0.04, 0.04, 0.04, 0.04])

# in this cell we set the number of assets and number of observations


# we first set T to a large number to verify our computation results
T = 2000
N = 5

# simulate i.i.d. random shocks


e = np.random.normal(size=T)
u = np.random.normal(size=T)
ϵ = np.random.normal(size=(N, T))

# simulate the return on a risk-free asset


Rf = ERf + σf * e

# simulate the return on the market portfolio


excess_Rm = ξ + λ * u
Rm = Rf + excess_Rm

# simulate the return on asset i


Ri = np.empty((N, T))
for i in range(N):
    Ri[i, :] = Rf + βi[i] * excess_Rm + σi[i] * ϵ[i, :]

Now that we have a panel of data, we’d like to solve the inverse problem by assuming the theory specified above and
estimating the coefficients given above.

# Code for the inverse problem

Inverse Problem:
We will solve the inverse problem by simple OLS regressions.
1. estimate 𝐸 [𝑅𝑓 ] and 𝜎𝑓

ERf_hat, σf_hat = simple_ols(np.ones(T), Rf)

ERf_hat, σf_hat


(0.020000000000000046, 4.5114090308141905e-17)

Let’s compare these with the true population parameter values.

ERf, σf

(0.02, 0.0)

2. 𝜉 and 𝜆

ξ_hat, λ_hat = simple_ols(np.ones(T), Rm - Rf)

ξ_hat, λ_hat

(0.060225944676975, 0.07779632562028074)

ξ, λ

(0.06, 0.08)

3. 𝛽𝑖,𝑅𝑚 and 𝜎𝑖

βi_hat = np.empty(N)
σi_hat = np.empty(N)

for i in range(N):
    βi_hat[i], σi_hat[i] = simple_ols(Rm - Rf, Ri[i, :] - Rf)

βi_hat, σi_hat

(array([0.19181196, 0.41525333, 0.59730297, 0.79558054, 1.00028957]),
 array([0.03912597, 0.03938404, 0.03951595, 0.03953494, 0.0386066 ]))

βi, σi

(array([0.2, 0.4, 0.6, 0.8, 1. ]), array([0.04, 0.04, 0.04, 0.04, 0.04]))

Q: How close did your estimates come to the parameters we specified?

Exercise 36.10.4
Using the equations above, find a system of two linear equations that you can solve for 𝑎 and 𝑏 as functions of the
parameters (𝜆, 𝜉, 𝐸[𝑅𝑓 ]).
Write a function that can solve these equations.
Please check the condition number of a key matrix that must be inverted to determine 𝑎, 𝑏.


Solution to Exercise 36.10.4


The system of two linear equations is shown below:

𝑎(𝐸(𝑅𝑓 ) + 𝜉) + 𝑏((𝐸(𝑅𝑓 ) + 𝜉)2 + 𝜆2 + 𝜎𝑓2 ) = 1

𝑎𝐸(𝑅𝑓 ) + 𝑏(𝐸(𝑅𝑓 )2 + 𝜉𝐸(𝑅𝑓 ) + 𝜎𝑓2 ) = 1

# Code here
def solve_ab(ERf, σf, λ, ξ):

    M = np.empty((2, 2))
    M[0, 0] = ERf + ξ
    M[0, 1] = (ERf + ξ) ** 2 + λ ** 2 + σf ** 2
    M[1, 0] = ERf
    M[1, 1] = ERf ** 2 + ξ * ERf + σf ** 2

    a, b = np.linalg.solve(M, np.ones(2))
    condM = np.linalg.cond(M)

    return a, b, condM

Let’s try to solve 𝑎 and 𝑏 using the actual model parameters.

a, b, condM = solve_ab(ERf, σf, λ, ξ)

a, b, condM

(87.49999999999999, -468.7499999999999, 54.406619883717504)

Exercise 36.10.5
Using the estimates of the parameters that you generated above, compute the implied stochastic discount factor.

Solution to Exercise 36.10.5


Now let's pass the estimated values of 𝐸(𝑅𝑓 ), 𝜎𝑓 , 𝜆, and 𝜉 to the function solve_ab.

a_hat, b_hat, condM_hat = solve_ab(ERf_hat, σf_hat, λ_hat, ξ_hat)

a_hat, b_hat, condM_hat

(89.9163014776351, -497.54853792443083, 57.76878113601545)
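To complete the exercise, here is a minimal sketch (it assumes the cells above have been run) that builds the implied stochastic discount factor 𝑚𝑡 = 𝑎 + 𝑏𝑅𝑡𝑚 from the estimated coefficients and checks the pricing equations 𝐸[𝑚𝑡 𝑅𝑡 ] = 1 as sample averages.

# Sketch (assumes the cells above have been run): implied SDF from the
# estimated (a, b) and sample-moment checks of the pricing equations E[m R] = 1
m_hat = a_hat + b_hat * Rm

print(np.mean(m_hat * Rm))          # should be close to 1
print(np.mean(m_hat * Rf))          # should be close to 1
print(np.mean(m_hat * Ri, axis=1))  # should be close to 1 for each asset i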



CHAPTER

THIRTYSEVEN

TWO MODIFICATIONS OF MEAN-VARIANCE PORTFOLIO THEORY

37.1 Overview

This lecture describes extensions to the classical mean-variance portfolio theory summarized in our lecture Elementary
Asset Pricing Theory.
The classic theory described there assumes that a decision maker completely trusts the statistical model that he posits to
govern the joint distribution of returns on a list of available assets.
Both extensions described here put distrust of that statistical model into the mind of the decision maker.
One is a model of Black and Litterman [Black and Litterman, 1992] that imputes to the decision maker distrust of
historically estimated mean returns but still complete trust of estimated covariances of returns.
The second model also imputes to the decision maker doubts about his statistical model, but now by saying that, because
of that distrust, the decision maker uses a version of robust control theory described in this lecture Robustness.
The famous Black-Litterman (1992) [Black and Litterman, 1992] portfolio choice model was motivated by the finding
that with high frequency or moderately high frequency data, means are more difficult to estimate than variances.
A model of robust portfolio choice that we’ll describe below also begins from the same starting point.
To begin, we'll take for granted that means are more difficult to estimate than covariances and will focus on how Black and
Litterman, on the one hand, and robust control theorists, on the other, would recommend modifying the mean-variance
portfolio choice model to take that into account.
At the end of this lecture, we shall use some rates of convergence results and some simulations to verify how means are
more difficult to estimate than variances.
Among the ideas in play in this lecture will be
• Mean-variance portfolio theory
• Bayesian approaches to estimating linear regressions
• A risk-sensitivity operator and its connection to robust control theory
In summary, we’ll describe two ways to modify the classic mean-variance portfolio choice model in ways designed to
make its recommendations more plausible.
Both of the adjustments that we describe are designed to confront a widely recognized embarrassment to mean-variance
portfolio theory, namely, that it usually implies taking very extreme long-short portfolio positions.
The two approaches build on a common and widespread hunch – that because it is much easier statistically to estimate
covariances of excess returns than it is to estimate their means, it makes sense to adjust investors’ subjective beliefs about
mean returns in order to render more plausible decisions.
Let’s start with some imports:


import numpy as np
import scipy.stats as stat
import matplotlib.pyplot as plt
from numba import jit

37.2 Mean-Variance Portfolio Choice

A risk-free security earns one-period net return 𝑟𝑓 .


An 𝑛 × 1 vector of risky securities earns an 𝑛 × 1 vector 𝑟 ⃗ − 𝑟𝑓 1 of excess returns, where 1 is an 𝑛 × 1 vector of ones.
The excess return vector is multivariate normal with mean 𝜇 and covariance matrix Σ, which we express either as

𝑟 ⃗ − 𝑟𝑓 1 ∼ 𝒩(𝜇, Σ)

or

𝑟 ⃗ − 𝑟𝑓 1 = 𝜇 + 𝐶𝜖

where 𝜖 ∼ 𝒩(0, 𝐼) is an 𝑛 × 1 random vector.


Let 𝑤 be an 𝑛 × 1 vector of portfolio weights.
A portfolio with weight vector 𝑤 earns the excess return

𝑤′ (𝑟 ⃗ − 𝑟𝑓 1) ∼ 𝒩(𝑤′ 𝜇, 𝑤′ Σ𝑤)

The mean-variance portfolio choice problem is to choose 𝑤 to maximize

𝑈 (𝜇, Σ; 𝑤) = 𝑤′ 𝜇 − (𝛿/2) 𝑤′ Σ𝑤 (37.1)
where 𝛿 > 0 is a risk-aversion parameter. The first-order condition for maximizing (37.1) with respect to the vector 𝑤 is

𝜇 = 𝛿Σ𝑤

which implies the following design of a risky portfolio:

𝑤 = (𝛿Σ)−1 𝜇 (37.2)
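As a tiny illustration of formula (37.2), with made-up numbers rather than data, the optimal weights for a two-asset example can be computed directly:

# Tiny illustration of w = (δ Σ)^{-1} μ with hypothetical two-asset inputs
import numpy as np

δ = 2.0                                    # hypothetical risk-aversion parameter
μ = np.array([0.03, 0.05])                 # hypothetical mean excess returns
Σ = np.array([[0.04, 0.01],
              [0.01, 0.09]])               # hypothetical covariance matrix

w = np.linalg.solve(δ * Σ, μ)              # optimal risky portfolio weights
print(w)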

37.3 Estimating Mean and Variance

The key inputs into the portfolio choice model (37.2) are
• estimates of the parameters 𝜇, Σ of the random excess return vector(𝑟 ⃗ − 𝑟𝑓 1)
• the risk-aversion parameter 𝛿
A standard way of estimating 𝜇 is maximum-likelihood or least squares; that amounts to estimating 𝜇 by a sample mean
of excess returns and estimating Σ by a sample covariance matrix.


37.4 Black-Litterman Starting Point

When estimates of 𝜇 and Σ from historical sample means and covariances have been combined with plausible values of
the risk-aversion parameter 𝛿 to compute an optimal portfolio from formula (37.2), a typical outcome has been 𝑤’s with
extreme long and short positions.
A common reaction to these outcomes is that they are so implausible that a portfolio manager cannot recommend them
to a customer.

np.random.seed(12)

N = 10 # Number of assets
T = 200 # Sample size

# random market portfolio (sum is normalized to 1)


w_m = np.random.rand(N)
w_m = w_m / (w_m.sum())

# True risk premia and variance of excess return (constructed


# so that the Sharpe ratio is 1)
μ = (np.random.randn(N) + 5) /100 # Mean excess return (risk premium)
S = np.random.randn(N, N) # Random matrix for the covariance matrix
V = S @ S.T # Turn the random matrix into symmetric psd
# Make sure that the Sharpe ratio is one
Σ = V * (w_m @ μ)**2 / (w_m @ V @ w_m)

# Risk aversion of market portfolio holder


δ = 1 / np.sqrt(w_m @ Σ @ w_m)

# Generate a sample of excess returns


excess_return = stat.multivariate_normal(μ, Σ)
sample = excess_return.rvs(T)

# Estimate μ and Σ
μ_est = sample.mean(0).reshape(N, 1)
Σ_est = np.cov(sample.T)

w = np.linalg.solve(δ * Σ_est, μ_est)

fig, ax = plt.subplots(figsize=(8, 5))


ax.set_title('Mean-variance portfolio weights recommendation and the market portfolio')

ax.plot(np.arange(N)+1, w, 'o', c='k', label='$w$ (mean-variance)')


ax.plot(np.arange(N)+1, w_m, 'o', c='r', label='$w_m$ (market portfolio)')
ax.vlines(np.arange(N)+1, 0, w, lw=1)
ax.vlines(np.arange(N)+1, 0, w_m, lw=1)
ax.axhline(0, c='k')
ax.axhline(-1, c='k', ls='--')
ax.axhline(1, c='k', ls='--')
ax.set_xlabel('Assets')
ax.xaxis.set_ticks(np.arange(1, N+1, 1))
plt.legend(numpoints=1, fontsize=11)
plt.show()


Black and Litterman responded to this situation in the following way:


• They continue to accept (37.2) as a good model for choosing an optimal portfolio 𝑤.
• They want to continue to allow the customer to express his or her risk tolerance by setting 𝛿.
• Leaving Σ at its maximum-likelihood value, they push 𝜇 away from its maximum-likelihood value in a way designed
to make portfolio choices that are more plausible in terms of conforming to what most people actually do.
In particular, given Σ and a plausible value of 𝛿, Black and Litterman reverse engineered a vector 𝜇𝐵𝐿 of mean excess
returns that makes the 𝑤 implied by formula (37.2) equal the actual market portfolio 𝑤𝑚 , so that

𝑤𝑚 = (𝛿Σ)−1 𝜇𝐵𝐿

37.5 Details

Let’s define

𝑤𝑚′ 𝜇 ≡ (𝑟𝑚 − 𝑟𝑓 )

as the (scalar) excess return on the market portfolio 𝑤𝑚 .


Define

𝜎 2 = 𝑤𝑚′ Σ𝑤𝑚

as the variance of the excess return on the market portfolio 𝑤𝑚 .


Define

SR𝑚 = (𝑟𝑚 − 𝑟𝑓 )/𝜎

as the Sharpe-ratio on the market portfolio 𝑤𝑚 .
Let 𝛿𝑚 be the value of the risk aversion parameter that induces an investor to hold the market portfolio in light of the
optimal portfolio choice rule (37.2).
Evidently, portfolio rule (37.2) then implies that 𝑟𝑚 − 𝑟𝑓 = 𝛿𝑚 𝜎2 , so that

𝛿𝑚 = (𝑟𝑚 − 𝑟𝑓 )/𝜎2

or

𝛿𝑚 = SR𝑚 /𝜎
Following the Black-Litterman philosophy, our first step will be to back out a value of 𝛿𝑚 from
• an estimate of the Sharpe-ratio, and
• our maximum likelihood estimate of 𝜎 drawn from our estimates of 𝑤𝑚 and Σ
The second key Black-Litterman step is then to use this value of 𝛿 together with the maximum likelihood estimate of Σ
to deduce a 𝜇BL that verifies portfolio rule (37.2) at the market portfolio 𝑤 = 𝑤𝑚

𝜇𝑚 = 𝛿𝑚 Σ𝑤𝑚

The starting point of the Black-Litterman portfolio choice model is thus a pair (𝛿𝑚 , 𝜇𝑚 ) that tells the customer to hold
the market portfolio.

# Observed mean excess market return


r_m = w_m @ μ_est

# Estimated variance of the market portfolio


σ_m = w_m @ Σ_est @ w_m

# Sharpe-ratio
sr_m = r_m / np.sqrt(σ_m)

# Risk aversion of market portfolio holder


d_m = r_m / σ_m

# Derive "view" which would induce the market portfolio


μ_m = (d_m * Σ_est @ w_m).reshape(N, 1)

x = np.arange(N) + 1
fig, ax = plt.subplots(figsize=(8, 5))
ax.set_title(r'Difference between $\hat{\mu}$ (estimate) and $\mu_{BL}$ (market implied)')

ax.plot(x, μ_est, 'o', c='k', label=r'$\hat{\mu}$')


ax.plot(x, μ_m, 'o', c='r', label=r'$\mu_{BL}$')
ax.vlines(x, μ_m, μ_est, lw=1)
ax.axhline(0, c='k', ls='--')
ax.set_xlabel('Assets')
ax.xaxis.set_ticks(np.arange(1, N+1, 1))
plt.legend(numpoints=1)
plt.show()


37.6 Adding Views

Black and Litterman start with a baseline customer who asserts that he or she shares the market’s views, which means
that he or she believes that excess returns are governed by

𝑟 ⃗ − 𝑟𝑓 1 ∼ 𝒩(𝜇𝐵𝐿 , Σ) (37.3)

Black and Litterman would advise that customer to hold the market portfolio of risky securities.
Black and Litterman then imagine a consumer who would like to express a view that differs from the market’s.
The consumer wants appropriately to mix his view with the market’s before using (37.2) to choose a portfolio.
Suppose that the customer’s view is expressed by a hunch that rather than (37.3), excess returns are governed by

𝑟 ⃗ − 𝑟𝑓 1 ∼ 𝒩(𝜇,̂ 𝜏 Σ)

where 𝜏 > 0 is a scalar parameter that determines how the decision maker wants to mix his view 𝜇̂ with the market’s
view 𝜇BL .
Black and Litterman would then use a formula like the following one to mix the views 𝜇̂ and 𝜇BL

𝜇̃ = (Σ−1 + (𝜏 Σ)−1 )−1 (Σ−1 𝜇𝐵𝐿 + (𝜏 Σ)−1 𝜇)̂ (37.4)

Black and Litterman would then advise the customer to hold the portfolio associated with these views implied by rule
(37.2):

𝑤̃ = (𝛿Σ)−1 𝜇̃


This portfolio 𝑤̃ will deviate from the portfolio 𝑤𝐵𝐿 in amounts that depend on the mixing parameter 𝜏 .
If 𝜇̂ is the maximum likelihood estimator and 𝜏 is chosen so as to put heavy weight on this view, then the customer's portfolio
will involve big short-long positions.

def black_litterman(λ, μ1, μ2, Σ1, Σ2):
    """
    This function calculates the Black-Litterman mixture
    mean excess return
    """
    Σ1_inv = np.linalg.inv(Σ1)
    Σ2_inv = np.linalg.inv(Σ2)

    μ_tilde = np.linalg.solve(Σ1_inv + λ * Σ2_inv,
                              Σ1_inv @ μ1 + λ * Σ2_inv @ μ2)
    return μ_tilde

τ = 1
μ_tilde = black_litterman(1, μ_m, μ_est, Σ_est, τ * Σ_est)

# The Black-Litterman recommendation for the portfolio weights


w_tilde = np.linalg.solve(δ * Σ_est, μ_tilde)

def BL_plot(τ):
    μ_tilde = black_litterman(1, μ_m, μ_est, Σ_est, τ * Σ_est)
    w_tilde = np.linalg.solve(δ * Σ_est, μ_tilde)

    fig, ax = plt.subplots(1, 2, figsize=(16, 6))

    ax[0].plot(np.arange(N)+1, μ_est, 'o', c='k',
               label=r'$\hat{\mu}$ (subj view)')
    ax[0].plot(np.arange(N)+1, μ_m, 'o', c='r',
               label=r'$\mu_{BL}$ (market)')
    ax[0].plot(np.arange(N)+1, μ_tilde, 'o', c='y',
               label=r'$\tilde{\mu}$ (mixture)')
    ax[0].vlines(np.arange(N)+1, μ_m, μ_est, lw=1)
    ax[0].axhline(0, c='k', ls='--')
    ax[0].set(xlim=(0, N+1), xlabel='Assets',
              title=r'Relationship between $\hat{\mu}$, $\mu_{BL}$, and $\tilde{\mu}$')
    ax[0].xaxis.set_ticks(np.arange(1, N+1, 1))
    ax[0].legend(numpoints=1)

    ax[1].set_title('Black-Litterman portfolio weight recommendation')
    ax[1].plot(np.arange(N)+1, w, 'o', c='k', label=r'$w$ (mean-variance)')
    ax[1].plot(np.arange(N)+1, w_m, 'o', c='r', label=r'$w_{m}$ (market, BL)')
    ax[1].plot(np.arange(N)+1, w_tilde, 'o', c='y',
               label=r'$\tilde{w}$ (mixture)')
    ax[1].vlines(np.arange(N)+1, 0, w, lw=1)
    ax[1].vlines(np.arange(N)+1, 0, w_m, lw=1)
    ax[1].axhline(0, c='k')
    ax[1].axhline(-1, c='k', ls='--')
    ax[1].axhline(1, c='k', ls='--')
    ax[1].set(xlim=(0, N+1), xlabel='Assets',
              title='Black-Litterman portfolio weight recommendation')
    ax[1].xaxis.set_ticks(np.arange(1, N+1, 1))
    ax[1].legend(numpoints=1)
    plt.show()

BL_plot(τ)

37.7 Bayesian Interpretation

Consider the following Bayesian interpretation of the Black-Litterman recommendation.


The prior belief over the mean excess returns is consistent with the market portfolio and is given by

𝜇 ∼ 𝒩(𝜇𝐵𝐿 , Σ)

Given a particular realization of the mean excess returns 𝜇 one observes the average excess returns 𝜇̂ on the market
according to the distribution

𝜇̂ ∣ 𝜇, Σ ∼ 𝒩(𝜇, 𝜏 Σ)

where 𝜏 is typically small capturing the idea that the variation in the mean is smaller than the variation of the individual
random variable.
Given the realized excess returns one should then update the prior over the mean excess returns according to Bayes rule.
The corresponding posterior over mean excess returns is normally distributed with mean

(Σ−1 + (𝜏 Σ)−1 )−1 (Σ−1 𝜇𝐵𝐿 + (𝜏 Σ)−1 𝜇)̂

The covariance matrix is

(Σ−1 + (𝜏 Σ)−1 )−1

Hence, the Black-Litterman recommendation is consistent with the Bayes update of the prior over the mean excess returns
in light of the realized average excess returns on the market.


37.8 Curve Decolletage

Consider two independent “competing” views on the excess market returns

𝑟𝑒⃗ ∼ 𝒩(𝜇𝐵𝐿 , Σ)

and

𝑟𝑒⃗ ∼ 𝒩(𝜇,̂ 𝜏 Σ)

A special feature of the multivariate normal random variable 𝑍 is that its density function depends only on the (Euclidean)
length of its realization 𝑧.
Formally, let the 𝑘-dimensional random vector be

𝑍 ∼ 𝒩(𝜇, Σ)

then

𝑍 ̄ ≡ Σ−1/2 (𝑍 − 𝜇) ∼ 𝒩(0, 𝐼)

and so the points where the density takes the same value can be described by the ellipse

𝑧 ̄ ⋅ 𝑧 ̄ = (𝑧 − 𝜇)′ Σ−1 (𝑧 − 𝜇) = 𝑑 ̄ (37.5)

where 𝑑 ̄ ∈ ℝ+ denotes the (transformation) of a particular density value.


The curves defined by equation (37.5) can be labeled as iso-likelihood ellipses
Remark: More generally there is a class of density functions that possesses this feature, i.e.

∃𝑔 ∶ ℝ+ ↦ ℝ+ and 𝑐 ≥ 0, s.t. the density 𝑓 of 𝑍 has the form 𝑓(𝑧) = 𝑐𝑔(𝑧 ⋅ 𝑧)

This property is called spherical symmetry (see p 81. in Leamer (1978) [Leamer, 1978]).
In our specific example, we can use the pair (𝑑1̄ , 𝑑2̄ ) as being two “likelihood” values for which the corresponding iso-
likelihood ellipses in the excess return space are given by

(𝑟𝑒⃗ − 𝜇𝐵𝐿 )′ Σ−1 (𝑟𝑒⃗ − 𝜇𝐵𝐿 ) = 𝑑1̄

(𝑟𝑒⃗ − 𝜇)̂ ′ (𝜏 Σ)−1 (𝑟𝑒⃗ − 𝜇)̂ = 𝑑2̄

Notice that for particular 𝑑1̄ and 𝑑2̄ values the two ellipses have a tangency point.
These tangency points, indexed by the pairs (𝑑1̄ , 𝑑2̄ ), characterize points 𝑟𝑒⃗ from which there exists no deviation where
one can increase the likelihood of one view without decreasing the likelihood of the other view.
The pairs (𝑑1̄ , 𝑑2̄ ) for which there is such a point outlines a curve in the excess return space. This curve is reminiscent of
the Pareto curve in an Edgeworth-box setting.
Dickey (1975) [Dickey, 1975] calls it a curve decolletage.
Leamer (1978) [Leamer, 1978] calls it an information contract curve and describes it by the following program: maximize
the likelihood of one view, say the Black-Litterman recommendation while keeping the likelihood of the other view at
least at a prespecified constant 𝑑2̄

𝑑1̄ (𝑑2̄ ) ≡ max_{𝑟𝑒⃗ } (𝑟𝑒⃗ − 𝜇𝐵𝐿 )′ Σ−1 (𝑟𝑒⃗ − 𝜇𝐵𝐿 )

subject to (𝑟𝑒⃗ − 𝜇)̂ ′ (𝜏 Σ)−1 (𝑟𝑒⃗ − 𝜇)̂ ≥ 𝑑2̄


Denoting the multiplier on the constraint by 𝜆, the first-order condition is

2(𝑟𝑒⃗ − 𝜇𝐵𝐿 )′ Σ−1 + 2𝜆(𝑟𝑒⃗ − 𝜇)̂ ′ (𝜏 Σ)−1 = 0

which defines the information contract curve between 𝜇𝐵𝐿 and 𝜇̂

𝑟𝑒⃗ = (Σ−1 + 𝜆(𝜏 Σ)−1 )−1 (Σ−1 𝜇𝐵𝐿 + 𝜆(𝜏 Σ)−1 𝜇)̂ (37.6)

Note that if 𝜆 = 1, (37.6) is equivalent with (37.4) and it identifies one point on the information contract curve.
Furthermore, because 𝜆 is a function of the minimum likelihood 𝑑2̄ on the RHS of the constraint, by varying 𝑑2̄ (or 𝜆 ),
we can trace out the whole curve as the figure below illustrates.

np.random.seed(1987102)

N = 2 # Number of assets
T = 200 # Sample size
τ = 0.8

# Random market portfolio (sum is normalized to 1)


w_m = np.random.rand(N)
w_m = w_m / (w_m.sum())

μ = (np.random.randn(N) + 5) / 100
S = np.random.randn(N, N)
V = S @ S.T
Σ = V * (w_m @ μ)**2 / (w_m @ V @ w_m)

excess_return = stat.multivariate_normal(μ, Σ)
sample = excess_return.rvs(T)

μ_est = sample.mean(0).reshape(N, 1)
Σ_est = np.cov(sample.T)

σ_m = w_m @ Σ_est @ w_m


d_m = (w_m @ μ_est) / σ_m
μ_m = (d_m * Σ_est @ w_m).reshape(N, 1)

N_r1, N_r2 = 100, 100


r1 = np.linspace(-0.04, .1, N_r1)
r2 = np.linspace(-0.02, .15, N_r2)

λ_grid = np.linspace(.001, 20, 100)


curve = np.asarray([black_litterman(λ, μ_m, μ_est, Σ_est,
τ * Σ_est).flatten() for λ in λ_grid])

λ = 1

def decolletage(λ):
    dist_r_BL = stat.multivariate_normal(μ_m.squeeze(), Σ_est)
    dist_r_hat = stat.multivariate_normal(μ_est.squeeze(), τ * Σ_est)

    X, Y = np.meshgrid(r1, r2)
    XY = np.stack((X, Y), axis=-1)
    Z_BL = dist_r_BL.pdf(XY)
    Z_hat = dist_r_hat.pdf(XY)


    μ_tilde = black_litterman(λ, μ_m, μ_est, Σ_est, τ * Σ_est).flatten()

    fig, ax = plt.subplots(figsize=(10, 6))

    ax.contourf(X, Y, Z_hat, cmap='viridis', alpha=.4)
    ax.contourf(X, Y, Z_BL, cmap='viridis', alpha=.4)
    ax.contour(X, Y, Z_BL, [dist_r_BL.pdf(μ_tilde)], cmap='viridis', alpha=.9)
    ax.contour(X, Y, Z_hat, [dist_r_hat.pdf(μ_tilde)], cmap='viridis', alpha=.9)
    ax.scatter(μ_est[0], μ_est[1])
    ax.scatter(μ_m[0], μ_m[1])
    ax.scatter(μ_tilde[0], μ_tilde[1], c='k', s=20*3)

    ax.plot(curve[:, 0], curve[:, 1], c='k')

    ax.axhline(0, c='k', alpha=.8)
    ax.axvline(0, c='k', alpha=.8)
    ax.set_xlabel(r'Excess return on the first asset, $r_{e, 1}$')
    ax.set_ylabel(r'Excess return on the second asset, $r_{e, 2}$')
    ax.text(μ_est[0] + 0.003, μ_est[1], r'$\hat{\mu}$')
    ax.text(μ_m[0] + 0.003, μ_m[1] + 0.005, r'$\mu_{BL}$')
    plt.show()

decolletage(λ)

Note that the curve connecting the two points 𝜇̂ and 𝜇𝐵𝐿 is a straight line, which comes from the fact that the covariance matrices
of the two competing distributions (views) are proportional to each other.
To illustrate the fact that this is not necessarily the case, consider another example using the same parameter values,
except that the "second view" constituting the constraint has covariance matrix 𝜏 𝐼 instead of 𝜏 Σ.
This leads to the following figure, in which the curve connecting 𝜇̂ and 𝜇𝐵𝐿 bends

λ_grid = np.linspace(.001, 20000, 1000)




curve = np.asarray([black_litterman(λ, μ_m, μ_est, Σ_est,
τ * np.eye(N)).flatten() for λ in λ_grid])
λ = 200

def decolletage(λ):
    dist_r_BL = stat.multivariate_normal(μ_m.squeeze(), Σ_est)
    dist_r_hat = stat.multivariate_normal(μ_est.squeeze(), τ * np.eye(N))

    X, Y = np.meshgrid(r1, r2)
    XY = np.stack((X, Y), axis=-1)
    Z_BL = dist_r_BL.pdf(XY)
    Z_hat = dist_r_hat.pdf(XY)

    μ_tilde = black_litterman(λ, μ_m, μ_est, Σ_est, τ * np.eye(N)).flatten()

    fig, ax = plt.subplots(figsize=(10, 6))

    ax.contourf(X, Y, Z_hat, cmap='viridis', alpha=.4)
    ax.contourf(X, Y, Z_BL, cmap='viridis', alpha=.4)
    ax.contour(X, Y, Z_BL, [dist_r_BL.pdf(μ_tilde)], cmap='viridis', alpha=.9)
    ax.contour(X, Y, Z_hat, [dist_r_hat.pdf(μ_tilde)], cmap='viridis', alpha=.9)
    ax.scatter(μ_est[0], μ_est[1])
    ax.scatter(μ_m[0], μ_m[1])
    ax.scatter(μ_tilde[0], μ_tilde[1], c='k', s=20*3)

    ax.plot(curve[:, 0], curve[:, 1], c='k')

    ax.axhline(0, c='k', alpha=.8)
    ax.axvline(0, c='k', alpha=.8)
    ax.set_xlabel(r'Excess return on the first asset, $r_{e, 1}$')
    ax.set_ylabel(r'Excess return on the second asset, $r_{e, 2}$')
    ax.text(μ_est[0] + 0.003, μ_est[1], r'$\hat{\mu}$')
    ax.text(μ_m[0] + 0.003, μ_m[1] + 0.005, r'$\mu_{BL}$')
    plt.show()

decolletage(λ)


37.9 Black-Litterman Recommendation as Regularization

First, consider the OLS regression


min_{𝛽} ‖𝑋𝛽 − 𝑦‖2

which yields the solution

𝛽̂𝑂𝐿𝑆 = (𝑋 ′ 𝑋)−1 𝑋 ′ 𝑦

A common performance measure of estimators is the mean squared error (MSE).


An estimator is "good" if its MSE is relatively small. Suppose that 𝛽0 is the "true" value of the coefficient, then the MSE
of the OLS estimator is

mse(𝛽̂𝑂𝐿𝑆 , 𝛽0 ) ∶= 𝔼‖𝛽̂𝑂𝐿𝑆 − 𝛽0 ‖2 = 𝔼‖𝛽̂𝑂𝐿𝑆 − 𝔼𝛽̂𝑂𝐿𝑆 ‖2 + ‖𝔼𝛽̂𝑂𝐿𝑆 − 𝛽0 ‖2

where the first term is the variance and the second term is the (squared) bias.

From this decomposition, one can see that in order for the MSE to be small, both the bias and the variance terms must
be small.
For example, consider the case when 𝑋 is a 𝑇 -vector of ones (where 𝑇 is the sample size), so 𝛽̂𝑂𝐿𝑆 is simply the sample
average, while 𝛽0 ∈ ℝ is defined by the true mean of 𝑦.
In this example the MSE is

mse(𝛽̂𝑂𝐿𝑆 , 𝛽0 ) = (1/𝑇 2 ) 𝔼 (∑_{t=1}^{T} (𝑦𝑡 − 𝛽0 ))2 + 0

where the first term is the variance and the second term (zero) is the bias.

However, because there is a trade-off between the estimator’s bias and variance, there are cases when by permitting a
small bias we can substantially reduce the variance so overall the MSE gets smaller.


A typical scenario when this proves to be useful is when the number of coefficients to be estimated is large relative to the
sample size.
In these cases, one approach to handle the bias-variance trade-off is the so called Tikhonov regularization.
A general form with regularization matrix Γ can be written as

min_{𝛽} {‖𝑋𝛽 − 𝑦‖2 + ‖Γ(𝛽 − 𝛽̃)‖2 }

which yields the solution

𝛽̂𝑅𝑒𝑔 = (𝑋 ′ 𝑋 + Γ′ Γ)−1 (𝑋 ′ 𝑦 + Γ′ Γ𝛽̃)

Substituting the value of 𝛽̂𝑂𝐿𝑆 yields

𝛽̂𝑅𝑒𝑔 = (𝑋 ′ 𝑋 + Γ′ Γ)−1 (𝑋 ′ 𝑋 𝛽̂𝑂𝐿𝑆 + Γ′ Γ𝛽̃)

Often, the regularization matrix takes the form Γ = 𝜆𝐼 with 𝜆 > 0 and 𝛽 ̃ = 0.
Then the Tikhonov regularization is equivalent to what is called ridge regression in statistics.
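A small illustration of this estimator on simulated data (not part of the lecture's empirical work, and with arbitrary numbers) shows the shrinkage toward zero as 𝜆 grows:

# Illustrative sketch: Tikhonov/ridge regularization with Γ = λ I and β̃ = 0
import numpy as np

np.random.seed(0)
T, K = 50, 10
X = np.random.normal(size=(T, K))
β0 = np.ones(K)                             # hypothetical "true" coefficients
y = X @ β0 + np.random.normal(scale=3.0, size=T)

def tikhonov(X, y, Γ, β_tilde):
    "β_Reg = (X'X + Γ'Γ)^{-1} (X'y + Γ'Γ β̃)"
    return np.linalg.solve(X.T @ X + Γ.T @ Γ, X.T @ y + Γ.T @ Γ @ β_tilde)

for λ in [0.0, 1.0, 5.0]:
    β_reg = tikhonov(X, y, λ * np.eye(K), np.zeros(K))
    print(λ, np.linalg.norm(β_reg))         # the norm shrinks toward zero as λ grows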
To illustrate how this estimator addresses the bias-variance trade-off, we compute the MSE of the ridge estimator

mse(𝛽̂ridge , 𝛽0 ) = (1/(𝑇 + 𝜆)2 ) 𝔼 (∑_{t=1}^{T} (𝑦𝑡 − 𝛽0 ))2 + (𝜆/(𝑇 + 𝜆))2 𝛽02

where the first term is the variance and the second term is the (squared) bias.

The ridge regression shrinks the coefficients of the estimated vector towards zero relative to the OLS estimates thus
reducing the variance term at the cost of introducing a “small” bias.
However, there is nothing special about the zero vector.
When 𝛽 ̃ ≠ 0 shrinkage occurs in the direction of 𝛽.̃
Now, we can give a regularization interpretation of the Black-Litterman portfolio recommendation.
To this end, first simplify the equation (37.4) that characterizes the Black-Litterman recommendation

𝜇̃ = (Σ−1 + (𝜏 Σ)−1 )−1 (Σ−1 𝜇𝐵𝐿 + (𝜏 Σ)−1 𝜇̂)
  = (1 + 𝜏 −1 )−1 ΣΣ−1 (𝜇𝐵𝐿 + 𝜏 −1 𝜇̂)
  = (1 + 𝜏 −1 )−1 (𝜇𝐵𝐿 + 𝜏 −1 𝜇̂)

In our case, 𝜇̂ is the estimated mean excess returns of securities. This could be written as a vector autoregression where
• 𝑦 is the stacked vector of observed excess returns of size (𝑁 𝑇 × 1) – 𝑁 securities and 𝑇 observations.

• 𝑋 = 𝑇 −1 (𝐼𝑁 ⊗ 𝜄𝑇 ) where 𝐼𝑁 is the identity matrix and 𝜄𝑇 is a column vector of ones.
Correspondingly, the OLS regression of 𝑦 on 𝑋 would yield the mean excess returns as coefficients.

With Γ = √𝜏 𝑇 −1 (𝐼𝑁 ⊗ 𝜄𝑇 ) we can write the regularized version of the mean excess return estimation

𝛽̂𝑅𝑒𝑔 = (𝑋 ′ 𝑋 + Γ′ Γ)−1 (𝑋 ′ 𝑋 𝛽̂𝑂𝐿𝑆 + Γ′ Γ𝛽̃)
     = (1 + 𝜏 )−1 𝑋 ′ 𝑋(𝑋 ′ 𝑋)−1 (𝛽̂𝑂𝐿𝑆 + 𝜏 𝛽̃)
     = (1 + 𝜏 )−1 (𝛽̂𝑂𝐿𝑆 + 𝜏 𝛽̃)
     = (1 + 𝜏 −1 )−1 (𝜏 −1 𝛽̂𝑂𝐿𝑆 + 𝛽̃)

Given that 𝛽̂𝑂𝐿𝑆 = 𝜇̂ and 𝛽̃ = 𝜇𝐵𝐿 in the Black-Litterman model, we have the following interpretation of the model's
recommendation.


The estimated (personal) view of the mean excess returns, 𝜇̂ that would lead to extreme short-long positions are “shrunk”
towards the conservative market view, 𝜇𝐵𝐿 , that leads to the more conservative market portfolio.
So the Black-Litterman procedure results in a recommendation that is a compromise between the conservative market
portfolio and the more extreme portfolio that is implied by estimated “personal” views.
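A quick numerical check of this interpretation (it assumes the cells above have been run, so that black_litterman, μ_m, μ_est, Σ_est, and τ are in memory): with covariance matrices Σ and 𝜏 Σ, the Black-Litterman mixture is exactly the convex combination (1 + 𝜏 −1 )−1 (𝜇𝐵𝐿 + 𝜏 −1 𝜇̂) derived above.

# Check (assumes the cells above have been run): with proportional covariance
# matrices the Black-Litterman mixture equals (1 + 1/τ)^{-1} (μ_BL + μ_hat / τ)
μ_mix = black_litterman(1, μ_m, μ_est, Σ_est, τ * Σ_est)
μ_combo = (1 + 1/τ)**(-1) * (μ_m + μ_est / τ)
print(np.max(np.abs(μ_mix - μ_combo)))   # ≈ 0 up to rounding error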

37.10 A Robust Control Operator

The Black-Litterman approach is partly inspired by the econometric insight that it is easier to estimate covariances of
excess returns than the means.
That is what gave Black and Litterman license to adjust investors’ perception of mean excess returns while not tampering
with the covariance matrix of excess returns.
The robust control theory is another approach that also hinges on adjusting mean excess returns but not covariances.
Associated with a robust control problem is what Hansen and Sargent [Hansen and Sargent, 2001], [Hansen and Sargent,
2008] call a T operator.
Let’s define the T operator as it applies to the problem at hand.
Let 𝑥 be an 𝑛 × 1 Gaussian random vector with mean vector 𝜇 and covariance matrix Σ = 𝐶𝐶 ′ . This means that 𝑥 can
be represented as

𝑥 = 𝜇 + 𝐶𝜖

where 𝜖 ∼ 𝒩(0, 𝐼).


Let 𝜙(𝜖) denote the associated standardized Gaussian density.
Let 𝑚(𝜖, 𝜇) be a likelihood ratio, meaning that it satisfies
• 𝑚(𝜖, 𝜇) > 0
• ∫ 𝑚(𝜖, 𝜇)𝜙(𝜖)𝑑𝜖 = 1
That is, 𝑚(𝜖, 𝜇) is a non-negative random variable with mean 1.
Multiplying 𝜙(𝜖) by the likelihood ratio 𝑚(𝜖, 𝜇) produces a distorted distribution for 𝜖, namely

𝜙 ̃(𝜖) = 𝑚(𝜖, 𝜇)𝜙(𝜖)

The next concept that we need is the entropy of the distorted distribution 𝜙 ̃ with respect to 𝜙.
Entropy is defined as

ent = ∫ log 𝑚(𝜖, 𝜇)𝑚(𝜖, 𝜇)𝜙(𝜖)𝑑𝜖

or

ent = ∫ log 𝑚(𝜖, 𝜇) 𝜙 ̃(𝜖) 𝑑𝜖

That is, relative entropy is the expected value of the log likelihood ratio log 𝑚, where the expectation is taken with respect to the
twisted density 𝜙 ̃.
Relative entropy is non-negative. It is a measure of the discrepancy between two probability distributions.
As such, it plays an important role in governing the behavior of statistical tests designed to discriminate one probability
distribution from another.


We are ready to define the T operator.


Let 𝑉 (𝑥) be a value function.
Define

T (𝑉 (𝑥)) = min_{𝑚(𝜖,𝜇)} ∫ 𝑚(𝜖, 𝜇)[𝑉 (𝜇 + 𝐶𝜖) + 𝜃 log 𝑚(𝜖, 𝜇)]𝜙(𝜖)𝑑𝜖
         = −𝜃 log ∫ exp (−𝑉 (𝜇 + 𝐶𝜖)/𝜃) 𝜙(𝜖)𝑑𝜖

This asserts that T is an indirect utility function for a minimization problem in which an adversary chooses a distorted
probability distribution 𝜙 ̃ to lower expected utility, subject to a penalty term that gets bigger the larger is relative entropy.
Here the penalty parameter

𝜃 ∈ [𝜃̲, +∞]

is a robustness parameter; when it is +∞, there is no scope for the minimizing agent to distort the distribution, so no
robustness to alternative distributions is acquired.
As 𝜃 is lowered, more robustness is achieved.

Note: The T operator is sometimes called a risk-sensitivity operator.

We shall apply T to the special case of a linear value function 𝑤′ (𝑟 ⃗ − 𝑟𝑓 1) where 𝑟 ⃗ − 𝑟𝑓 1 ∼ 𝒩(𝜇, Σ) or 𝑟 ⃗ − 𝑟𝑓 1 = 𝜇 + 𝐶𝜖
and 𝜖 ∼ 𝒩(0, 𝐼).
The associated worst-case distribution of 𝜖 is Gaussian with mean 𝑣 = −𝜃−1 𝐶 ′ 𝑤 and covariance matrix 𝐼
(When the value function is affine, the worst-case distribution distorts the mean vector of 𝜖 but not the covariance matrix
of 𝜖).
For utility function argument 𝑤′ (𝑟 ⃗ − 𝑟𝑓 1)

T(𝑤′ (𝑟 ⃗ − 𝑟𝑓 1)) = 𝑤′ 𝜇 + 𝜁 − (1/(2𝜃)) 𝑤′ Σ𝑤

and entropy is

𝑣′ 𝑣/2 = (1/(2𝜃2 )) 𝑤′ 𝐶𝐶 ′ 𝑤
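A Monte Carlo sketch (with hypothetical numbers, not taken from the lecture's code) can check the quadratic adjustment: applying the definition of T to the payoff 𝑤′ (𝜇 + 𝐶𝜖) by simulation should reproduce 𝑤′ 𝜇 − (1/(2𝜃)) 𝑤′ Σ𝑤 up to any constant term and simulation error.

# Monte Carlo sketch (hypothetical inputs): evaluate
# -θ log E[exp(-w'(μ + Cε)/θ)] and compare with w'μ - (1/(2θ)) w'Σw
import numpy as np

np.random.seed(0)
θ = 2.0
w = np.array([0.5, 0.3, 0.2])
μ = np.array([0.04, 0.02, 0.03])
C = np.array([[0.05, 0.00, 0.00],
              [0.01, 0.04, 0.00],
              [0.00, 0.02, 0.03]])
Σ = C @ C.T

ε = np.random.normal(size=(1_000_000, 3))
payoff = (μ + ε @ C.T) @ w                    # w'(μ + Cε) for each draw
T_mc = -θ * np.log(np.mean(np.exp(-payoff / θ)))
T_closed = w @ μ - w @ Σ @ w / (2 * θ)
print(T_mc, T_closed)                         # should be close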

37.11 A Robust Mean-Variance Portfolio Model

According to criterion (37.1), the mean-variance portfolio choice problem chooses 𝑤 to maximize

𝐸[𝑤′ (𝑟 ⃗ − 𝑟𝑓 1)] − (𝛿/2) var[𝑤′ (𝑟 ⃗ − 𝑟𝑓 1)]

which equals

𝑤′ 𝜇 − (𝛿/2) 𝑤′ Σ𝑤
A robust decision maker can be modeled as replacing the mean return 𝐸[𝑤(𝑟 ⃗ − 𝑟𝑓 1)] with the risk-sensitive criterion

T[𝑤(𝑟 ⃗ − 𝑟𝑓 1)] = 𝑤′ 𝜇 − (1/(2𝜃)) 𝑤′ Σ𝑤


that comes from replacing the mean 𝜇 of 𝑟 ⃗ − 𝑟𝑓 1 with the worst-case mean

𝜇 − 𝜃−1 Σ𝑤

Notice how the worst-case mean vector depends on the portfolio 𝑤.


The operator T is the indirect utility function that emerges from solving a problem in which an agent who chooses
probabilities does so in order to minimize the expected utility of a maximizing agent (in our case, the maximizing agent
chooses portfolio weights 𝑤).
The robust version of the mean-variance portfolio choice problem is then to choose a portfolio 𝑤 that maximizes

T[𝑤(𝑟 ⃗ − 𝑟𝑓 1)] − (𝛿/2) 𝑤′ Σ𝑤

or

𝑤′ (𝜇 − 𝜃−1 Σ𝑤) − (𝛿/2) 𝑤′ Σ𝑤 (37.7)

The maximizer of (37.7) is

𝑤rob = (1/(𝛿 + 𝛾)) Σ−1 𝜇

where 𝛾 ≡ 𝜃−1 is sometimes called the risk-sensitivity parameter.


An increase in the risk-sensitivity parameter 𝛾 shrinks the portfolio weights toward zero in the same way that an increase
in risk aversion does.
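As a small illustration (reusing the two-asset estimates μ_est, Σ_est and the δ computed in the cells above, purely for illustration), here is how the robust weights shrink as 𝛾 grows:

# Sketch (assumes the cells above have been run): robust portfolio weights
# w_rob = (δ + γ)^{-1} Σ^{-1} μ for a few values of γ = θ^{-1}
for γ in [0.0, 0.5, 1.0, 2.0]:
    w_rob = np.linalg.solve((δ + γ) * Σ_est, μ_est)
    print(γ, w_rob.flatten())

The case 𝛾 = 0 reproduces the ordinary mean-variance weights.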

37.12 Appendix

We want to illustrate the “folk theorem” that with high or moderate frequency data, it is more difficult to estimate means
than variances.
In order to operationalize this statement, we take two analog estimators:
• sample average: 𝑋̄ 𝑁 = (1/𝑁 ) ∑_{i=1}^{N} 𝑋𝑖
• sample variance: 𝑆𝑁 = (1/(𝑁 − 1)) ∑_{i=1}^{N} (𝑋𝑖 − 𝑋̄ 𝑁 )2
to estimate the unconditional mean and unconditional variance of the random variable 𝑋, respectively.
To measure the “difficulty of estimation”, we use mean squared error (MSE), that is the average squared difference
between the estimator and the true value.
Assuming that the process {𝑋𝑖 } is ergodic, both analog estimators are known to converge to their true values as the sample
size 𝑁 goes to infinity.
More precisely, for all 𝜀 > 0,

lim_{𝑁→∞} 𝑃 {∣𝑋̄ 𝑁 − 𝔼𝑋∣ > 𝜀} = 0

and

lim_{𝑁→∞} 𝑃 {|𝑆𝑁 − 𝕍𝑋| > 𝜀} = 0

A necessary condition for these convergence results is that the associated MSEs vanish as 𝑁 goes to infinity, or in other
words,

MSE(𝑋̄ 𝑁 , 𝔼𝑋) = 𝑜(1) and MSE(𝑆𝑁 , 𝕍𝑋) = 𝑜(1)


Even if the MSEs converge to zero, the associated rates might be different. Looking at the limit of the relative MSE (as
the sample size grows to infinity)

MSE(𝑆𝑁 , 𝕍𝑋) / MSE(𝑋̄ 𝑁 , 𝔼𝑋) = 𝑜(1)/𝑜(1) → 𝐵   as 𝑁 → ∞

can inform us about the relative (asymptotic) rates.


We will show that in general, with dependent data, the limit 𝐵 depends on the sampling frequency.
In particular, we find that the rate of convergence of the variance estimator is less sensitive to increased sampling frequency
than the rate of convergence of the mean estimator.
Hence, we can expect the relative asymptotic rate, 𝐵, to get smaller with higher frequency data, illustrating that “it is
more difficult to estimate means than variances”.
That is, we need significantly more data to obtain a given precision of the mean estimate than for our variance estimate.

37.13 Special Case – IID Sample

We start our analysis with the benchmark case of IID data.


Consider a sample of size 𝑁 generated by the following IID process,

𝑋𝑖 ∼ 𝒩(𝜇, 𝜎2 )

Taking 𝑋̄ 𝑁 to estimate the mean, the MSE is

MSE(𝑋̄ 𝑁 , 𝜇) = 𝜎2 /𝑁

Taking 𝑆𝑁 to estimate the variance, the MSE is

MSE(𝑆𝑁 , 𝜎2 ) = 2𝜎4 /(𝑁 − 1)
Both estimators are unbiased and hence the MSEs reflect the corresponding variances of the estimators.
Furthermore, both MSEs are 𝑜(1) with a (multiplicative) factor of difference in their rates of convergence:

MSE(𝑆𝑁 , 𝜎2 ) / MSE(𝑋̄ 𝑁 , 𝜇) = (𝑁 /(𝑁 − 1)) 2𝜎2 → 2𝜎2   as 𝑁 → ∞

We are interested in how this (asymptotic) relative rate of convergence changes as increasing sampling frequency puts
dependence into the data.
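A quick simulation (illustrative only, with arbitrary parameter values) confirms the two IID formulas above before we turn to dependent data:

# Illustrative check (arbitrary values): MSE(X̄_N, μ) ≈ σ²/N and
# MSE(S_N, σ²) ≈ 2σ⁴/(N - 1) under IID normal sampling
import numpy as np

np.random.seed(0)
μ0, σ0, N, M = 0.0, 2.0, 50, 200_000     # M = number of Monte Carlo replications
X = np.random.normal(μ0, σ0, size=(M, N))

mse_mean = np.mean((X.mean(axis=1) - μ0)**2)
mse_var = np.mean((X.var(axis=1, ddof=1) - σ0**2)**2)

print(mse_mean, σ0**2 / N)               # both ≈ 0.08
print(mse_var, 2 * σ0**4 / (N - 1))      # both ≈ 0.65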

37.14 Dependence and Sampling Frequency

To investigate how sampling frequency affects relative rates of convergence, we assume that the data are generated by a
mean-reverting continuous time process of the form

𝑑𝑋𝑡 = −𝜅(𝑋𝑡 − 𝜇)𝑑𝑡 + 𝜎𝑑𝑊𝑡

where 𝜇 is the unconditional mean, 𝜅 > 0 is a persistence parameter, and {𝑊𝑡 } is a standardized Brownian motion.


Observations arising from this system in particular discrete periods 𝒯(ℎ) ≡ {𝑛ℎ ∶ 𝑛 ∈ ℤ} with ℎ > 0 can be described
by the following process

𝑋𝑡+1 = (1 − exp(−𝜅ℎ))𝜇 + exp(−𝜅ℎ)𝑋𝑡 + 𝜖𝑡,ℎ

where

𝜖𝑡,ℎ ∼ 𝒩(0, Σℎ )   with   Σℎ = 𝜎2 (1 − exp(−2𝜅ℎ))/(2𝜅)
We call ℎ the frequency parameter, whereas 𝑛 represents the number of lags between observations.
Hence, the effective distance between two observations 𝑋𝑡 and 𝑋𝑡+𝑛 in the discrete time notation is equal to ℎ ⋅ 𝑛 in
terms of the underlying continuous time process.
Straightforward calculations show that the autocorrelation function for the stochastic process {𝑋𝑡 }𝑡∈𝒯(ℎ) is

Γℎ (𝑛) ≡ corr(𝑋𝑡+ℎ𝑛 , 𝑋𝑡 ) = exp(−𝜅ℎ𝑛)

and the auto-covariance function is

𝛾ℎ (𝑛) ≡ cov(𝑋𝑡+ℎ𝑛 , 𝑋𝑡 ) = exp(−𝜅ℎ𝑛)𝜎2 /(2𝜅).

It follows that if 𝑛 = 0, the unconditional variance is given by 𝛾ℎ (0) = 𝜎2 /(2𝜅) irrespective of the sampling frequency.
The following figure illustrates how the dependence between the observations is related to the sampling frequency
• For any given ℎ, the autocorrelation converges to zero as we increase the distance – 𝑛– between the observations.
This represents the “weak dependence” of the 𝑋 process.
• Moreover, for a fixed lag length, 𝑛, the dependence vanishes as the sampling frequency goes to infinity. In fact,
letting ℎ go to ∞ gives back the case of IID data.

μ = .0
κ = .1
σ = .5
var_uncond = σ**2 / (2 * κ)

n_grid = np.linspace(0, 40, 100)


autocorr_h1 = np.exp(-κ * n_grid * 1)
autocorr_h2 = np.exp(-κ * n_grid * 2)
autocorr_h5 = np.exp(-κ * n_grid * 5)
autocorr_h1000 = np.exp(-κ * n_grid * 1e8)

fig, ax = plt.subplots(figsize=(8, 4))


ax.plot(n_grid, autocorr_h1, label=r'$h=1$', c='darkblue', lw=2)
ax.plot(n_grid, autocorr_h2, label=r'$h=2$', c='darkred', lw=2)
ax.plot(n_grid, autocorr_h5, label=r'$h=5$', c='orange', lw=2)
ax.plot(n_grid, autocorr_h1000, label=r'"$h=\infty$"', c='darkgreen', lw=2)
ax.legend()
ax.grid()
ax.set(title=r'Autocorrelation functions, $\Gamma_h(n)$',
xlabel=r'Lags between observations, $n$')
plt.show()


37.15 Frequency and the Mean Estimator

Consider again the AR(1) process generated by discrete sampling with frequency ℎ. Assume that we have a sample of
size 𝑁 and we would like to estimate the unconditional mean – in our case the true mean is 𝜇.
Again, the sample average is an unbiased estimator of the unconditional mean

𝔼[𝑋̄ 𝑁 ] = (1/𝑁 ) ∑_{i=1}^{N} 𝔼[𝑋𝑖 ] = 𝔼[𝑋0 ] = 𝜇

The variance of the sample mean is given by

𝕍(𝑋̄ 𝑁 ) = 𝕍((1/𝑁 ) ∑_{i=1}^{N} 𝑋𝑖 )
        = (1/𝑁 2 ) (∑_{i=1}^{N} 𝕍(𝑋𝑖 ) + 2 ∑_{i=1}^{N−1} ∑_{s=i+1}^{N} cov(𝑋𝑖 , 𝑋𝑠 ))
        = (1/𝑁 2 ) (𝑁 𝛾(0) + 2 ∑_{i=1}^{N−1} 𝑖 ⋅ 𝛾(ℎ ⋅ (𝑁 − 𝑖)))
        = (1/𝑁 2 ) (𝑁 (𝜎2 /(2𝜅)) + 2 ∑_{i=1}^{N−1} 𝑖 ⋅ exp(−𝜅ℎ(𝑁 − 𝑖)) (𝜎2 /(2𝜅)))

It is explicit in the above equation that time dependence in the data inflates the variance of the mean estimator through
the covariance terms.
Moreover, as we can see, a higher sampling frequency—smaller ℎ—makes all the covariance terms larger, everything
else being fixed.
This implies a relatively slower rate of convergence of the sample average for high-frequency data.


Intuitively, stronger dependence across observations for high-frequency data reduces the “information content” of each
observation relative to the IID case.
We can upper bound the variance term in the following way

𝕍(𝑋̄ 𝑁 ) = (1/𝑁 2 ) (𝑁 (𝜎2 /(2𝜅)) + 2 ∑_{i=1}^{N−1} 𝑖 ⋅ exp(−𝜅ℎ(𝑁 − 𝑖)) (𝜎2 /(2𝜅)))
        ≤ (𝜎2 /(2𝜅𝑁 )) (1 + 2 ∑_{i=1}^{N−1} exp(−𝜅ℎ𝑖))
        = (𝜎2 /(2𝜅𝑁 )) (1 + 2 (1 − exp(−𝜅ℎ)^{N−1})/(1 − exp(−𝜅ℎ)))

where 𝜎2 /(2𝜅𝑁 ) is the variance in the IID case.

Asymptotically, the term exp(−𝜅ℎ)^{N−1} vanishes and the dependence in the data inflates the benchmark IID variance by
a factor of

(1 + 2/(1 − exp(−𝜅ℎ)))

This long run factor is larger the higher is the frequency (the smaller is ℎ).
Therefore, we expect the asymptotic relative MSEs, 𝐵, to change with time-dependent data. We just saw that the mean
estimator's rate is roughly changing by a factor of

(1 + 2/(1 − exp(−𝜅ℎ)))
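For concreteness, the factor can be evaluated at a few sampling frequencies (this uses the 𝜅 set in the cell above):

# The closed-form inflation factor (1 + 2/(1 - exp(-κ h))) at a few
# sampling frequencies, using the κ defined above
for h in [0.1, 1, 5, 20, 80]:
    print(h, 1 + 2 / (1 - np.exp(-κ * h)))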
Unfortunately, the variance estimator’s MSE is harder to derive.
Nonetheless, we can approximate it by using (large sample) simulations, thus getting an idea about how the asymptotic
relative MSE changes with the sampling frequency ℎ relative to the IID case that we compute in closed form.

@jit
def sample_generator(h, N, M):
    ϕ = (1 - np.exp(-κ * h)) * μ
    ρ = np.exp(-κ * h)
    s = σ**2 * (1 - np.exp(-2 * κ * h)) / (2 * κ)

    mean_uncond = μ
    std_uncond = np.sqrt(σ**2 / (2 * κ))

    ε_path = np.random.normal(0, np.sqrt(s), (M, N))

    y_path = np.zeros((M, N + 1))
    y_path[:, 0] = np.random.normal(mean_uncond, std_uncond, M)

    for i in range(N):
        y_path[:, i + 1] = ϕ + ρ * y_path[:, i] + ε_path[:, i]

    return y_path

# Generate large sample for different frequencies


N_app, M_app = 1000, 30000 # Sample size, number of simulations
h_grid = np.linspace(.1, 80, 30)

var_est_store = []


mean_est_store = []
labels = []

for h in h_grid:
    labels.append(h)
    sample = sample_generator(h, N_app, M_app)
    mean_est_store.append(np.mean(sample, 1))
    var_est_store.append(np.var(sample, 1))

var_est_store = np.array(var_est_store)
mean_est_store = np.array(mean_est_store)

# Save mse of estimators


mse_mean = np.var(mean_est_store, 1) + (np.mean(mean_est_store, 1) - μ)**2
mse_var = np.var(var_est_store, 1) \
+ (np.mean(var_est_store, 1) - var_uncond)**2

benchmark_rate = 2 * var_uncond # IID case

# Relative MSE for large samples


rate_h = mse_var / mse_mean

fig, ax = plt.subplots(figsize=(8, 5))


ax.plot(h_grid, rate_h, c='darkblue', lw=2,
label=r'large sample relative MSE, $B(h)$')
ax.axhline(benchmark_rate, c='k', ls='--', label=r'IID benchmark')
ax.set_title('Relative MSE for large samples as a function of sampling \
frequency \n MSE($S_N$) relative to MSE($\\bar X_N$)')
ax.set_xlabel('Sampling frequency, $h$')
ax.legend()
plt.show()


The above figure illustrates the relationship between the asymptotic relative MSEs and the sampling frequency
• We can see that with low-frequency data – large values of ℎ – the ratio of asymptotic rates approaches the IID
case.
• As ℎ gets smaller – the higher the frequency – the relative performance of the variance estimator is better in the
sense that the ratio of asymptotic rates gets smaller. That is, as the time dependence gets more pronounced, the
rate of convergence of the mean estimator’s MSE deteriorates more than that of the variance estimator.



CHAPTER

THIRTYEIGHT

IRRELEVANCE OF CAPITAL STRUCTURES WITH COMPLETE MARKETS

In addition to what’s in Anaconda, this lecture will need the following libraries:

!pip install --upgrade quantecon


!conda install -y -c plotly plotly plotly-orca

38.1 Introduction

This is a prolegomenon to another lecture Equilibrium Capital Structures with Incomplete Markets about a model with
incomplete markets authored by Bisin, Clementi, and Gottardi [Bisin et al., 2018].
We adopt specifications of preferences and technologies very close to Bisin, Clementi, and Gottardi's but unlike them
assume that there are complete markets in one-period Arrow securities.
This simplification of BCG’s setup helps us by
• creating a benchmark economy to compare with outcomes in BCG’s incomplete markets economy
• creating a good guess for initial values of some equilibrium objects to be computed in BCG’s incomplete markets
economy via an iterative algorithm
• illustrating classic complete markets outcomes that include
– indeterminacy of consumers’ portfolio choices
– indeterminacy of firms’ financial structures that underlies a Modigliani-Miller theorem [Modigliani and
Miller, 1958]
• introducing Big K, little k issues in a simple context that will recur in the BCG incomplete markets
environment
A Big K, little k analysis also played roles in this quantecon lecture as well as here and here.


38.1.1 Setup

The economy lasts for two periods, 𝑡 = 0, 1.


There are two types of consumers named 𝑖 = 1, 2.
A scalar random variable 𝜖 with probability density 𝑔(𝜖) affects both
• the return in period 1 from investing 𝑘 ≥ 0 in physical capital in period 0.
• exogenous period 1 endowments of the consumption good for agents of types 𝑖 = 1 and 𝑖 = 2.
Type 𝑖 = 1 and 𝑖 = 2 agents’ period 1 endowments are correlated with the return on physical capital in different ways.
We discuss two arrangements:
• a command economy in which a benevolent planner chooses 𝑘 and allocates goods to the two types of consumers
in each period and each random second period state
• a competitive equilibrium with markets in claims on physical capital and a complete set (possibly a continuum) of
one-period Arrow securities that pay period 1 consumption goods contingent on the realization of random variable
𝜖.

38.1.2 Endowments

There is a single consumption good in period 0 and at each random state 𝜖 in period 1.
Economy-wide endowments in periods 0 and 1 are

𝑤0
𝑤1 (𝜖) in state 𝜖

Soon we’ll explain how aggregate endowments are divided between type 𝑖 = 1 and type 𝑖 = 2 consumers.
We don’t need to do that in order to describe a social planning problem.

38.1.3 Technology:

Where 𝛼 ∈ (0, 1) and 𝐴 > 0

𝑐01 + 𝑐02 + 𝑘 = 𝑤01 + 𝑤02

𝑐11 (𝜖) + 𝑐12 (𝜖) = 𝑤11 (𝜖) + 𝑤12 (𝜖) + 𝑒𝜖 𝐴𝑘𝛼 ,    𝑘 ≥ 0

38.1.4 Preferences:

A consumer of type 𝑖 orders period 0 consumption 𝑐0𝑖 and state 𝜖, period 1 consumption 𝑐1𝑖 (𝜖) by

𝑢𝑖 = 𝑢(𝑐0𝑖 ) + 𝛽 ∫ 𝑢(𝑐1𝑖 (𝜖))𝑔(𝜖)𝑑𝜖, 𝑖 = 1, 2

𝛽 ∈ (0, 1) and the one-period utility function is


𝑐1−𝛾
1−𝛾 if 𝛾 ≠ 1
𝑢(𝑐) = {
log 𝑐 if 𝛾 = 1


38.1.5 Parameterizations

Following BCG, we shall employ the following parameterizations:

𝜖 ∼ 𝒩(𝜇, 𝜎2 )
𝑢(𝑐) = 𝑐^{1−𝛾} /(1 − 𝛾)
𝑤1𝑖 (𝜖) = 𝑒^{−𝜒𝑖 𝜇 − .5𝜒𝑖2 𝜎2 + 𝜒𝑖 𝜖} ,    𝜒𝑖 ∈ [0, 1]

Sometimes instead of assuming 𝜖 ∼ 𝑔(𝜖) = 𝒩(0, 𝜎2 ), we'll assume that 𝑔(⋅) is a probability mass function that serves as
a discrete approximation to a standardized normal density.

38.1.6 Pareto criterion and planning problem

The planner’s objective function is

obj = 𝜙1 𝑢1 + 𝜙2 𝑢2 , 𝜙𝑖 ≥ 0, 𝜙 1 + 𝜙2 = 1

where 𝜙𝑖 ≥ 0 is a Pareto weight that the planner attaches to a consumer of type 𝑖.


We form the following Lagrangian for the planner’s problem:
𝐿 = ∑_{i=1}^{2} 𝜙𝑖 [𝑢(𝑐0𝑖 ) + 𝛽 ∫ 𝑢(𝑐1𝑖 (𝜖))𝑔(𝜖)𝑑𝜖]
    + 𝜆0 [𝑤01 + 𝑤02 − 𝑘 − 𝑐01 − 𝑐02 ]
    + 𝛽 ∫ 𝜆1 (𝜖) [𝑤11 (𝜖) + 𝑤12 (𝜖) + 𝑒𝜖 𝐴𝑘𝛼 − 𝑐11 (𝜖) − 𝑐12 (𝜖)] 𝑔(𝜖)𝑑𝜖

First-order necessary optimality conditions for the planning problem are:

𝑐01 ∶ 𝜙1 𝑢′ (𝑐01 ) − 𝜆0 = 0
𝑐02 ∶ 𝜙2 𝑢′ (𝑐02 ) − 𝜆0 = 0
𝑐11 (𝜖) ∶ 𝜙1 𝛽𝑢′ (𝑐11 (𝜖))𝑔(𝜖) − 𝛽𝜆1 (𝜖)𝑔(𝜖) = 0
𝑐12 (𝜖) ∶ 𝜙2 𝛽𝑢′ (𝑐12 (𝜖))𝑔(𝜖) − 𝛽𝜆1 (𝜖)𝑔(𝜖) = 0

𝑘∶ − 𝜆0 + 𝛽𝛼𝐴𝑘𝛼−1 ∫ 𝜆1 (𝜖)𝑒𝜖 𝑔(𝜖)𝑑𝜖 = 0

The first four equations imply that

𝑢′ (𝑐11 (𝜖))/𝑢′ (𝑐01 ) = 𝑢′ (𝑐12 (𝜖))/𝑢′ (𝑐02 ) = 𝜆1 (𝜖)/𝜆0

𝑢′ (𝑐01 )/𝑢′ (𝑐02 ) = 𝑢′ (𝑐11 (𝜖))/𝑢′ (𝑐12 (𝜖)) = 𝜙2 /𝜙1

These together with the fifth first-order condition for the planner imply the following equation that determines an optimal
choice of capital

1 = 𝛽𝛼𝐴𝑘𝛼−1 ∫ (𝑢′ (𝑐1𝑖 (𝜖))/𝑢′ (𝑐0𝑖 )) 𝑒𝜖 𝑔(𝜖)𝑑𝜖

for 𝑖 = 1, 2.


38.1.7 Helpful observations and bookkeeping

Evidently,

𝑢′ (𝑐) = 𝑐−𝛾

and
𝑢′ (𝑐1 )/𝑢′ (𝑐2 ) = (𝑐1 /𝑐2 )−𝛾 = 𝜙2 /𝜙1

where it is to be understood that this equation holds for 𝑐1 = 𝑐01 and 𝑐2 = 𝑐02 and also for 𝑐1 = 𝑐11 (𝜖) and 𝑐2 = 𝑐12 (𝜖) for
all 𝜖.
With the same understanding, it follows that

(𝑐1 /𝑐2 ) = (𝜙2 /𝜙1 )^{−1/𝛾}

Let 𝑐 = 𝑐1 + 𝑐2 .
It follows from the preceding equation that

𝑐1 = 𝜂𝑐
𝑐2 = (1 − 𝜂)𝑐

where 𝜂 ∈ [0, 1] is a function of 𝜙1 and 𝛾.


Consequently, we can write the planner’s first-order condition for 𝑘 as
1 = 𝛽𝛼𝐴𝑘𝛼−1 ∫ ((𝑤1 (𝜖) + 𝐴𝑘𝛼 𝑒𝜖 )/(𝑤0 − 𝑘))−𝛾 𝑒𝜖 𝑔(𝜖)𝑑𝜖

which is one equation to be solved for 𝑘 ≥ 0.


Anticipating a Big K, little k idea widely used in macroeconomics, to be discussed in detail below, let 𝐾 be the
value of 𝑘 that solves the preceding equation so that
1 = 𝛽𝛼𝐴𝐾 𝛼−1 ∫ ((𝑤1 (𝜖) + 𝐴𝐾 𝛼 𝑒𝜖 )/(𝑤0 − 𝐾))−𝛾 𝑔(𝜖)𝑒𝜖 𝑑𝜖 (38.1)

The associated optimal consumption allocation is

𝐶0 = 𝑤0 − 𝐾
𝐶1 (𝜖) = 𝑤1 (𝜖) + 𝐴𝐾 𝛼 𝑒𝜖
𝑐01 = 𝜂𝐶0
𝑐02 = (1 − 𝜂)𝐶0
𝑐11 (𝜖) = 𝜂𝐶1 (𝜖)
𝑐12 (𝜖) = (1 − 𝜂)𝐶1 (𝜖)

where 𝜂 ∈ [0, 1] is the consumption share parameter mentioned above that is a function of the Pareto weight 𝜙1 and the
utility curvature parameter 𝛾.
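Although the lecture proceeds analytically, here is a minimal numerical sketch of equation (38.1): it solves for 𝐾 by bisection, approximating the integral over 𝜖 with Gauss-Hermite quadrature. All parameter values below are hypothetical choices made purely for illustration.

# A minimal numerical sketch (not the lecture's code): solve the planner's
# first-order condition (38.1) for K by bisection, approximating the integral
# over ε with Gauss-Hermite quadrature.  All parameter values below are
# hypothetical choices made purely for illustration.
import numpy as np
from scipy.optimize import brentq

A, α, β, γ = 1.0, 0.3, 0.96, 2.0          # hypothetical technology and preference parameters
μ_ϵ, σ_ϵ = 0.0, 0.2                        # ε ~ N(μ_ϵ, σ_ϵ²)
χ1, χ2 = 0.0, 0.9                          # endowment loadings from the parameterization above
w0 = 1.0                                   # hypothetical aggregate period-0 endowment

# Gauss-Hermite nodes and weights for E[f(ε)] with ε ~ N(μ_ϵ, σ_ϵ²)
nodes, weights = np.polynomial.hermite.hermgauss(15)
ϵ_nodes = μ_ϵ + np.sqrt(2) * σ_ϵ * nodes
g_weights = weights / np.sqrt(np.pi)

def w1_agg(eps):
    "Aggregate period-1 endowment w1(ε) under the parameterization above."
    return sum(np.exp(-χ * μ_ϵ - 0.5 * χ**2 * σ_ϵ**2 + χ * eps) for χ in (χ1, χ2))

def planner_foc(K):
    "1 minus the right side of equation (38.1)."
    C1 = w1_agg(ϵ_nodes) + A * K**α * np.exp(ϵ_nodes)
    C0 = w0 - K
    rhs = β * α * A * K**(α - 1) * np.sum((C1 / C0)**(-γ) * np.exp(ϵ_nodes) * g_weights)
    return 1 - rhs

K_star = brentq(planner_foc, 1e-6, w0 - 1e-6)
print(K_star)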


Remarks

The relative Pareto weight parameter 𝜂 does not appear in equation (38.1) that determines 𝐾.
Neither does it influence 𝐶0 or 𝐶1 (𝜖), which depend solely on 𝐾.
The role of 𝜂 is to determine how to allocate total consumption between the two types of consumers.
Thus, the planner’s choice of 𝐾 does not interact with how it wants to allocate consumption.

38.2 Competitive equilibrium

We now describe a competitive equilibrium for an economy that has specifications of consumer preferences, technology,
and aggregate endowments that are identical to those in the preceding planning problem.
While prices do not appear in the planning problem – only quantities do – prices play an important role in a competitive
equilibrium.
To understand how the planning economy is related to a competitive equilibrium, we now turn to the Big K, little
k distinction.

38.2.1 Measures of agents and firms

We follow BCG in assuming that there are unit measures of


• consumers of type 𝑖 = 1
• consumers of type 𝑖 = 2
• firms with access to the production technology that converts 𝑘 units of time 0 good into 𝐴𝑘𝛼 𝑒𝜖 units of the time 1
good in random state 𝜖
Thus, let 𝜔 ∈ [0, 1] index a particular consumer of type 𝑖.
Then define Big 𝐶 𝑖 as

𝐶 𝑖 = ∫_{0}^{1} 𝑐𝑖 (𝜔) 𝑑𝜔

In the same spirit, let 𝜁 ∈ [0, 1] index a particular firm. Then define Big 𝐾 as

𝐾 = ∫_{0}^{1} 𝑘(𝜁) 𝑑𝜁

The assumption that there are continua of our three types of agents plays an important role making each individual agent
into a powerless price taker:
• an individual consumer chooses its own (infinitesimal) part 𝑐𝑖 (𝜔) of 𝐶 𝑖 taking prices as given
• an individual firm chooses its own (infinitesimal) part 𝑘(𝜁) of 𝐾 taking prices as given
• equilibrium prices depend on the Big K, Big C objects 𝐾 and 𝐶
Nevertheless, in equilibrium, 𝐾 = 𝑘, 𝐶 𝑖 = 𝑐𝑖
The assumption about measures of agents is thus a powerful device for making a host of competitive agents take as given
equilibrium prices that are determined by the independent decisions of hosts of agents who behave just like they do.


Ownership

Consumers of type 𝑖 own the following exogenous quantities of the consumption good in periods 0 and 1:

𝑤0𝑖 ,    𝑖 = 1, 2
𝑤1𝑖 (𝜖),    𝑖 = 1, 2

where

∑𝑖 𝑤0𝑖 = 𝑤0

∑𝑖 𝑤1𝑖 (𝜖) = 𝑤1 (𝜖)

Consumers also own shares in a firm that operates the technology for converting nonnegative amounts of the time 0
consumption good one-for-one into a capital good 𝑘 that produces 𝐴𝑘𝛼 𝑒𝜖 units of the time 1 consumption good in time
1 state 𝜖.
Consumers of types 𝑖 = 1, 2 are endowed with 𝜃0𝑖 shares of a firm and

𝜃01 + 𝜃02 = 1

Asset markets

At time 0, consumers trade the following assets with other consumers and with firms:
• equities (also known as stocks) issued by firms
• one-period Arrow securities that pay one unit of consumption at time 1 when the shock 𝜖 assumes a particular value
Later, we’ll allow the firm to issue bonds too, but not now.

38.2.2 Objects appearing in a competitive equilibrium

Let
• 𝑎𝑖 (𝜖) be consumer 𝑖 ’s purchases of claims on time 1 consumption in state 𝜖
• 𝑞(𝜖) be a pricing kernel for one-period Arrow securities
• 𝜃0𝑖 ≥ 0 be consumer 𝑖's initial share of the firm, ∑𝑖 𝜃0𝑖 = 1
• 𝜃𝑖 be the fraction of a firm’s shares purchased by consumer 𝑖 at time 𝑡 = 0
• 𝑉 be the value of the representative firm
• 𝑉 ̃ be the value of equity issued by the representative firm
• 𝐾, 𝐶0 be two scalars and 𝐶1 (𝜖) a function that we use to construct a guess about an equilibrium pricing kernel for
Arrow securities
We proceed to describe constrained optimum problems faced by consumers and a representative firm in a competitive
equilibrium.


38.2.3 A representative firm’s problem

A representative firm takes Arrow security prices 𝑞(𝜖) as given.


The firm purchases capital 𝑘 ≥ 0 from consumers at time 0 and finances itself by issuing equity at time 0.
The firm produces time 1 goods 𝐴𝑘𝛼 𝑒𝜖 in state 𝜖 and pays all of these earnings to owners of its equity.
The value of a firm’s equity at time 0 can be computed by multiplying its state-contingent earnings by their Arrow securities
prices and then adding over all contingencies:

𝑉 ̃ = ∫ 𝐴𝑘𝛼 𝑒𝜖 𝑞(𝜖)𝑑𝜖

Owners of a firm want it to choose 𝑘 to maximize

𝑉 = −𝑘 + ∫ 𝐴𝑘𝛼 𝑒𝜖 𝑞(𝜖)𝑑𝜖

The firm’s first-order necessary condition for an optimal 𝑘 is

−1 + 𝛼𝐴𝑘𝛼−1 ∫ 𝑒𝜖 𝑞(𝜖)𝑑𝜖 = 0

The time 0 value of a representative firm is

𝑉 = −𝑘 + 𝑉 ̃

The right side equals the value of equity minus the cost of the time 0 goods that it purchases and uses as capital.

38.2.4 A consumer’s problem

We now pose a consumer’s problem in a competitive equilibrium.


As a price taker, each consumer faces a given Arrow securities pricing kernel 𝑞(𝜖), a given value of a firm 𝑉 that has
chosen capital stock 𝑘, a price of equity 𝑉 ̃ , and prospective next period random dividends 𝐴𝑘𝛼 𝑒𝜖 .
If we evaluate consumer 𝑖’s time 1 budget constraint at zero consumption 𝑐1𝑖 (𝜖) = 0 and solve for −𝑎𝑖 (𝜖) we obtain

−𝑎𝑖̄ (𝜖; 𝜃𝑖 ) = 𝑤1𝑖 (𝜖) + 𝜃𝑖 𝐴𝑘𝛼 𝑒𝜖 (38.2)

The quantity −𝑎𝑖̄ (𝜖; 𝜃𝑖 ) is the maximum amount that it is feasible for consumer 𝑖 to repay to his Arrow security creditors
at time 1 in state 𝜖.
Notice that −𝑎𝑖̄ (𝜖; 𝜃𝑖 ) defined in (38.2) depends on
• his endowment 𝑤1𝑖 (𝜖) at time 1 in state 𝜖
• his share 𝜃𝑖 of a representive firm’s dividends
These constitute two sources of collateral that back the consumer’s issues of Arrow securities that pay off in state 𝜖
Consumer 𝑖 chooses a scalar 𝑐0𝑖 and a function 𝑐1𝑖 (𝜖) to maximize

𝑢(𝑐0𝑖 ) + 𝛽 ∫ 𝑢(𝑐1𝑖 (𝜖))𝑔(𝜖)𝑑𝜖

subject to time 0 and time 1 budget constraints

𝑐0𝑖 ≤ 𝑤0𝑖 + 𝜃0𝑖 𝑉 − ∫ 𝑞(𝜖)𝑎𝑖 (𝜖)𝑑𝜖 − 𝜃𝑖 𝑉 ̃

𝑐1𝑖 (𝜖) ≤ 𝑤1𝑖 (𝜖) + 𝜃𝑖 𝐴𝑘𝛼 𝑒𝜖 + 𝑎𝑖 (𝜖)


Attach Lagrange multiplier 𝜆𝑖0 to the budget constraint at time 0 and scaled Lagrange multiplier 𝛽𝜆𝑖1 (𝜖)𝑔(𝜖) to the budget
constraint at time 1 and state 𝜖, then form the Lagrangian

𝐿𝑖 = 𝑢(𝑐0𝑖 ) + 𝛽 ∫ 𝑢(𝑐1𝑖 (𝜖))𝑔(𝜖)𝑑𝜖
    + 𝜆𝑖0 [𝑤0𝑖 + 𝜃0𝑖 𝑉 − ∫ 𝑞(𝜖)𝑎𝑖 (𝜖)𝑑𝜖 − 𝜃𝑖 𝑉 ̃ − 𝑐0𝑖 ]
    + 𝛽 ∫ 𝜆𝑖1 (𝜖)[𝑤1𝑖 (𝜖) + 𝜃𝑖 𝐴𝑘𝛼 𝑒𝜖 + 𝑎𝑖 (𝜖) − 𝑐1𝑖 (𝜖)]𝑔(𝜖)𝑑𝜖

Off corners, first-order necessary conditions for an optimum with respect to 𝑐0𝑖 , 𝑐1𝑖 (𝜖), and 𝑎𝑖 (𝜖) are

𝑐0𝑖 ∶ 𝑢′ (𝑐0𝑖 ) − 𝜆𝑖0 = 0


𝑐1𝑖 (𝜖) ∶ 𝛽𝑢′ (𝑐1𝑖 (𝜖))𝑔(𝜖) − 𝛽𝜆𝑖1 (𝜖)𝑔(𝜖) = 0
𝑎𝑖 (𝜖) ∶ − 𝜆𝑖0 𝑞(𝜖) + 𝛽𝜆𝑖1 (𝜖) = 0

These equations imply that consumer 𝑖 adjusts its consumption plan to satisfy

𝑢′ (𝑐1𝑖 (𝜖))
𝑞(𝜖) = 𝛽 ( ) 𝑔(𝜖) (38.3)
𝑢′ (𝑐0𝑖 )

To deduce a restriction on equilibrium prices, we solve the period 1 budget constraint to express 𝑎𝑖 (𝜖) as

𝑎𝑖 (𝜖) = 𝑐1𝑖 (𝜖) − 𝑤1𝑖 (𝜖) − 𝜃𝑖 𝐴𝑘𝛼 𝑒𝜖

then substitute the expression on the right side into the time 0 budget constraint and rearrange to get the single intertem-
poral budget constraint

𝑤0𝑖 + 𝜃0𝑖 𝑉 + ∫ 𝑤1𝑖 (𝜖)𝑞(𝜖)𝑑𝜖 + 𝜃𝑖 [𝐴𝑘𝛼 ∫ 𝑒𝜖 𝑞(𝜖)𝑑𝜖 − 𝑉 ̃ ] ≥ 𝑐0𝑖 + ∫ 𝑐1𝑖 (𝜖)𝑞(𝜖)𝑑𝜖 (38.4)

The right side of inequality (38.4) is the present value of consumer 𝑖’s consumption while the left side is the present value
of consumer 𝑖’s endowment when consumer 𝑖 buys 𝜃𝑖 shares of equity.
From inequality (38.4), we deduce two findings.
1. No arbitrage profits condition:
Unless

𝑉 ̃ = 𝐴𝑘𝛼 ∫ 𝑒𝜖 𝑞(𝜖)𝑑𝜖 (38.5)

an arbitrage opportunity would be open.


If

𝑉 ̃ > 𝐴𝑘𝛼 ∫ 𝑒𝜖 𝑞(𝜖)𝑑𝜖

the consumer could afford an arbitrarily high present value of consumption by setting 𝜃𝑖 to an arbitrarily large negative
number.
If

𝑉 ̃ < 𝐴𝑘𝛼 ∫ 𝑒𝜖 𝑞(𝜖)𝑑𝜖

the consumer could afford an arbitrarily high present value of consumption by setting 𝜃𝑖 to be arbitrarily large positive
number.


Since resources are finite, there can exist no such arbitrage opportunity in a competitive equilibrium.
Therefore, it must be true that the following no arbitrage condition prevails:

𝑉 ̃ = ∫ 𝐴𝑘𝛼 𝑒𝜖 𝑞(𝜖; 𝐾)𝑑𝜖 (38.6)

Equation (38.6) asserts that the value of equity equals the value of the state-contingent dividends 𝐴𝑘𝛼 𝑒𝜖 evaluated at the
Arrow security prices 𝑞(𝜖; 𝐾) that we have expressed as a function of 𝐾.
We’ll say more about this equation later.
2. Indeterminacy of portfolio
When the no-arbitrage pricing equation (38.6) prevails, a consumer of type 𝑖’s choice 𝜃𝑖 of equity is indeterminate.
Consumer of type 𝑖 can offset any choice of 𝜃𝑖 by setting an appropriate schedule 𝑎𝑖 (𝜖) for purchasing state-contingent
securities.

38.2.5 Computing competitive equilibrium prices and quantities

Having computed an allocation that solves the planning problem, we can readily compute a competitive equilibrium via
the following steps that, as we’ll see, relies heavily on the Big K, little k, Big C, little c logic mentioned
earlier:
• a competitive equilbrium allocation equals the allocation chosen by the planner
• competitive equilibrium prices and the value of a firm’s equity are encoded in shadow prices from the planning
problem that depend on Big 𝐾 and Big 𝐶.
To substantiate that this procedure is valid, we proceed as follows.
With 𝐾 in hand, we make the following guess for competitive equilibrium Arrow securities prices
𝑞(𝜖; 𝐾) = 𝛽 (𝑢′ (𝑤1 (𝜖) + 𝐴𝐾 𝛼 𝑒𝜖 ) / 𝑢′ (𝑤0 − 𝐾)) = 𝛽 ((𝑤1 (𝜖) + 𝐴𝐾 𝛼 𝑒𝜖 ) / (𝑤0 − 𝐾))^{−𝛾}    (38.7)
To confirm the guess, we begin by considering its consequences for the firm’s choice of 𝑘.
With Arrow securities prices (38.7), the firm’s first-order necessary condition for choosing 𝑘 becomes

−1 + 𝛼𝐴𝑘𝛼−1 ∫ 𝑒𝜖 𝑞(𝜖; 𝐾)𝑑𝜖 = 0 (38.8)

which can be verified to be satisfied if the firm sets

𝑘=𝐾

because by setting 𝑘 = 𝐾 equation (38.8) becomes equivalent with the planner’s first-order condition (38.1) for setting
𝐾.
To pose a consumer’s problem in a competitive equilibrium, we require not only the above guess for the Arrow securities
pricing kernel 𝑞(𝜖) but the value of equity 𝑉 ̃ :

𝑉 ̃ = ∫ 𝐴𝐾 𝛼 𝑒𝜖 𝑞(𝜖; 𝐾)𝑑𝜖 (38.9)

Let 𝑉 ̃ be the value of equity implied by Arrow securities price function (38.7) and formula (38.9).
At the Arrow securities prices 𝑞(𝜖) given by (38.7) and equity value 𝑉 ̃ given by (38.9), consumers 𝑖 = 1, 2 choose
consumption allocations and portfolios that satisfy the first-order necessary conditions

𝛽 (𝑢′ (𝑐1𝑖 (𝜖)) / 𝑢′ (𝑐0𝑖 )) 𝑔(𝜖) = 𝑞(𝜖; 𝐾)


It can be verified directly that the following choices satisfy these equations

𝑐01 + 𝑐02 = 𝐶0 = 𝑤0 − 𝐾
𝑐11 (𝜖) + 𝑐12 (𝜖) = 𝐶1 (𝜖) = 𝑤1 (𝜖) + 𝐴𝐾 𝛼 𝑒𝜖
𝑐12 (𝜖)/𝑐11 (𝜖) = 𝑐02 /𝑐01 = (1 − 𝜂)/𝜂

for an 𝜂 ∈ (0, 1) that depends on consumers’ endowments [𝑤01 , 𝑤02 , 𝑤11 (𝜖), 𝑤12 (𝜖), 𝜃01 , 𝜃02 ].
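Here is a quick way to see why proportional allocations of this form work for any 𝜂 ∈ (0, 1): with the CRRA utility assumed above, the scale factor 𝜂 cancels from each consumer's intertemporal marginal rate of substitution,

𝛽 𝑢′ (𝜂𝐶1 (𝜖)) / 𝑢′ (𝜂𝐶0 ) = 𝛽 (𝜂𝐶1 (𝜖))^{−𝛾} / (𝜂𝐶0 )^{−𝛾} = 𝛽 (𝐶1 (𝜖)/𝐶0 )^{−𝛾} = 𝛽 𝑢′ (𝐶1 (𝜖)) / 𝑢′ (𝐶0 ),

so each consumer's Euler equation holds at the aggregate-based pricing kernel (38.7) no matter how aggregate consumption is split between the two consumers.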
Remark: Multiple arrangements of endowments [𝑤01 , 𝑤02 , 𝑤11 (𝜖), 𝑤12 (𝜖), 𝜃01 , 𝜃02 ] are associated with the same distribution of
wealth 𝜂. Can you explain why?

Hint: Think about the portfolio indeterminacy finding above.

38.2.6 Modigliani-Miller theorem

We now allow a firm to issue both bonds and equity.


Payouts from equity and bonds, respectively, are

𝑑𝑒 (𝑘, 𝑏; 𝜖) = max {𝑒𝜖 𝐴𝑘𝛼 − 𝑏, 0}
𝑑𝑏 (𝑘, 𝑏; 𝜖) = min {𝑒𝜖 𝐴𝑘𝛼 /𝑏, 1}
Thus, one unit of the bond pays one unit of consumption at time 1 in state 𝜖 if 𝐴𝑘𝛼 𝑒𝜖 − 𝑏 ≥ 0, which is true when
𝜖 ≥ 𝜖∗ = log (𝑏/(𝐴𝑘𝛼 )), and pays 𝐴𝑘𝛼 𝑒𝜖 /𝑏 units of time 1 consumption in state 𝜖 when 𝜖 < 𝜖∗ .
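To fix ideas about these payouts, here is a minimal sketch (the function name and the numerical example are ours, not part of the lecture's class code) that evaluates the default threshold and both payoffs at a given (𝑘, 𝑏, 𝜖):

import numpy as np

def payoffs(k, b, ϵ, A=2.5, α=0.6):
    """Equity payoff d_e, bond payoff d_b, and default threshold ϵ*."""
    output = A * k**α * np.exp(ϵ)          # A k^α e^ϵ
    ϵ_star = np.log(b / (A * k**α))        # default threshold
    d_e = max(output - b, 0.0)             # equity is the residual claimant
    d_b = min(output / b, 1.0)             # bonds get a pro-rata share of output in default
    return ϵ_star, d_e, d_b

# Example: a firm with k = 0.2 and b = 0.5 defaults in the bad state ϵ = -1.0
print(payoffs(0.2, 0.5, -1.0))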
The value of the firm is now the sum of equity plus the value of bonds, which we denote

𝑉 ̃ + 𝑏𝑝(𝑘, 𝑏)

where 𝑝(𝑘, 𝑏) is the price of one unit of the bond when a firm with 𝑘 units of physical capital issues 𝑏 bonds.
We continue to assume that there are complete markets in Arrow securities with pricing kernel 𝑞(𝜖).
A version of the no-arbitrage-in-equilibrium argument that we presented earlier implies that the value of equity and the
price of bonds are
𝑉 ̃ = 𝐴𝑘𝛼 ∫_{𝜖∗}^{∞} 𝑒𝜖 𝑞(𝜖)𝑑𝜖 − 𝑏 ∫_{𝜖∗}^{∞} 𝑞(𝜖)𝑑𝜖
𝑝(𝑘, 𝑏) = (𝐴𝑘𝛼 /𝑏) ∫_{−∞}^{𝜖∗} 𝑒𝜖 𝑞(𝜖)𝑑𝜖 + ∫_{𝜖∗}^{∞} 𝑞(𝜖)𝑑𝜖

Consequently, the value of the firm is



𝑉 ̃ + 𝑝(𝑘, 𝑏)𝑏 = 𝐴𝑘𝛼 ∫_{−∞}^{∞} 𝑒𝜖 𝑞(𝜖)𝑑𝜖,

which is the same expression that we obtained above when we assumed that the firm issued only equity.
We thus obtain a version of the celebrated Modigliani-Miller theorem [Modigliani and Miller, 1958] about firms’ finance:
Modigliani-Miller theorem:
• The value of a firm is independent of the mix of equity and bonds that it uses to finance its physical capital.


• The firm's decision about how much physical capital to purchase does not depend on whether it finances those
purchases by issuing bonds or equity.
• The firm's choice of whether to finance itself by issuing equity or bonds is indeterminate.
Please note the role of the assumption of complete markets in Arrow securities in substantiating these claims.
In Equilibrium Capital Structures with Incomplete Markets, we will assume that markets are (very) incomplete – we’ll shut
down markets in almost all Arrow securities.
That will pull the rug from underneath the Modigliani-Miller theorem.

38.3 Code

We create a class object BCG_complete_markets to compute equilibrium allocations of the complete market BCG
model given a list of parameter values.
It consists of 4 functions that do the following things:
• opt_k computes the planner’s optimal capital 𝐾
– First, create a grid for capital.
– Then for each value of capital stock in the grid, compute the left side of the planner’s first-order necessary
condition for 𝑘, that is,
𝛽𝛼𝐴𝐾 𝛼−1 ∫ ((𝑤1 (𝜖) + 𝐴𝐾 𝛼 𝑒𝜖 ) / (𝑤0 − 𝐾))^{−𝛾} 𝑒𝜖 𝑔(𝜖)𝑑𝜖 − 1 = 0

– Find 𝑘 that solves this equation.


• q computes Arrow security prices as a function of the productivity shock 𝜖 and capital 𝐾:
𝑞(𝜖; 𝐾) = 𝛽 (𝑢′ (𝑤1 (𝜖) + 𝐴𝐾 𝛼 𝑒𝜖 ) / 𝑢′ (𝑤0 − 𝐾))

• V solves for the firm value given capital 𝑘:

𝑉 = −𝑘 + ∫ 𝐴𝑘𝛼 𝑒𝜖 𝑞(𝜖; 𝐾)𝑑𝜖

• opt_c computes optimal consumptions 𝑐0𝑖 and 𝑐1𝑖 (𝜖):


– The function first computes weight 𝜂 using the budget constraint for agent 1:

𝑤01 + 𝜃01 𝑉 + ∫ 𝑤11 (𝜖)𝑞(𝜖)𝑑𝜖 = 𝑐01 + ∫ 𝑐11 (𝜖)𝑞(𝜖)𝑑𝜖 = 𝜂 (𝐶0 + ∫ 𝐶1 (𝜖)𝑞(𝜖)𝑑𝜖)

where
𝐶0 = 𝑤0 − 𝐾
𝐶1 (𝜖) = 𝑤1 (𝜖) + 𝐴𝐾 𝛼 𝑒𝜖

– It computes consumption for each agent as


𝑐01 = 𝜂𝐶0
𝑐02 = (1 − 𝜂)𝐶0
𝑐11 (𝜖) = 𝜂𝐶1 (𝜖)
𝑐12 (𝜖) = (1 − 𝜂)𝐶1 (𝜖)


The list of parameters includes:


• 𝜒1 , 𝜒2 : Correlation parameters for agents 1 and 2. Default values are 0 and 0.9, respectively.
• 𝑤01 , 𝑤02 : Initial endowments. Default values are 1.
• 𝜃01 , 𝜃02 : Consumers’ initial shares of a representative firm. Default values are 0.5.
• 𝜓: CRRA risk parameter. Default value is 3.
• 𝛼: Returns to scale production function parameter. Default value is 0.6.
• 𝐴: Productivity of technology. Default value is 2.5.
• 𝜇, 𝜎: Mean and standard deviation of the log of the shock. Default values are -0.025 and 0.4, respectively.
• 𝛽: time preference discount factor. Default value is .96.
• nb_points_integ: number of points used for integration through Gauss-Hermite quadrature. Default value is
10 (see the quadrature sketch below).
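The class approximates every integral against the normal density 𝑔 with Gauss-Hermite quadrature after a change of variables. Here is a minimal sketch of that convention, written separately from the class (the function name is ours):

import numpy as np

def normal_expectation(f, μ=-0.025, σ=0.4, n=10):
    """Approximate E[f(ϵ)] for ϵ ~ N(μ, σ²) with Gauss-Hermite quadrature."""
    x, w = np.polynomial.hermite.hermgauss(n)   # nodes/weights for ∫ e^{-x²} h(x) dx
    ϵ = np.sqrt(2) * σ * x + μ                  # change of variables ϵ = √2 σ x + μ
    return np.sum(w * f(ϵ)) / np.sqrt(np.pi)

# Sanity check: E[e^ϵ] = exp(μ + σ²/2) for a lognormal shock
print(normal_expectation(np.exp), np.exp(-0.025 + 0.4**2 / 2))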

import numpy as np
import matplotlib.pyplot as plt
from scipy.stats import norm
from numba import njit, prange
from quantecon.optimize import root_finding

#=========== Class: BCG for complete markets ===========#


class BCG_complete_markets:

# init method or constructor


def __init__(self,
𝜒1 = 0,
𝜒2 = 0.9,
w10 = 1,
w20 = 1,
𝜃10 = 0.5,
𝜃20 = 0.5,
𝜓 = 3,
𝛼 = 0.6,
A = 2.5,
𝜇 = -0.025,
𝜎 = 0.4,
𝛽 = 0.96,
nb_points_integ = 10):

#=========== Setup ===========#


# Risk parameters
self.𝜒1 = 𝜒1
self.𝜒2 = 𝜒2

# Other parameters
self.𝜓 = 𝜓
self.𝛼 = 𝛼
self.A = A
self.𝜇 = 𝜇
self.𝜎 = 𝜎
self.𝛽 = 𝛽

# Utility

self.u = lambda c: (c**(1-𝜓)) / (1-𝜓)

# Production
self.f = njit(lambda k: A * (k ** 𝛼))
self.Y = lambda 𝜖, k: np.exp(𝜖) * self.f(k)

# Initial endowments
self.w10 = w10
self.w20 = w20
self.w0 = w10 + w20

# Initial holdings
self.𝜃10 = 𝜃10
self.𝜃20 = 𝜃20

# Endowments at t=1
w11 = njit(lambda 𝜖: np.exp(-𝜒1*𝜇 - 0.5*(𝜒1**2)*(𝜎**2) + 𝜒1*𝜖))
w21 = njit(lambda 𝜖: np.exp(-𝜒2*𝜇 - 0.5*(𝜒2**2)*(𝜎**2) + 𝜒2*𝜖))
self.w11 = w11
self.w21 = w21

self.w1 = njit(lambda 𝜖: w11(𝜖) + w21(𝜖))

# Normal PDF
self.g = lambda x: norm.pdf(x, loc=𝜇, scale=𝜎)

# Integration
x, self.weights = np.polynomial.hermite.hermgauss(nb_points_integ)
self.points_integral = np.sqrt(2) * 𝜎 * x + 𝜇

self.k_foc = k_foc_factory(self)

#=========== Optimal k ===========#


# Function: solve for optimal k
def opt_k(self, plot=False):
w0 = self.w0

# Grid for k
kgrid = np.linspace(1e-4, w0-1e-4, 100)

# get FONC values for each k in the grid


kfoc_list = [];
for k in kgrid:
kfoc = self.k_foc(k, self.𝜒1, self.𝜒2)
kfoc_list.append(kfoc)

# Plot FONC for k


if plot:
fig, ax = plt.subplots(figsize=(8,7))
ax.plot(kgrid, kfoc_list, color='blue', label=r'FONC for k')
ax.axhline(0, color='red', linestyle='--')
ax.legend()
ax.set_xlabel(r'k')
plt.show()

# Find k that solves the FONC


kk = root_finding.newton_secant(self.k_foc, 1e-2, args=(self.𝜒1, self.𝜒2)).root

return kk

#=========== Arrow security price ===========#


# Function: Compute Arrow security price
def q(self,𝜖,k):
𝛽 = self.𝛽
𝜓 = self.𝜓
w0 = self.w0
w1 = self.w1
fk = self.f(k)
g = self.g

return 𝛽 * ((w1(𝜖) + np.exp(𝜖)*fk) / (w0 - k))**(-𝜓)

#=========== Firm value V ===========#


# Function: compute firm value V
def V(self, k):
q = self.q
fk = self.f(k)
weights = self.weights
integ = lambda 𝜖: np.exp(𝜖) * fk * q(𝜖, k)

return -k + np.sum(weights * integ(self.points_integral)) / np.sqrt(np.pi)

#=========== Optimal c ===========#


# Function: Compute optimal consumption choices c
def opt_c(self, k=None, plot=False):
w1 = self.w1
w0 = self.w0
w10 = self.w10
w11 = self.w11
𝜃10 = self.𝜃10
Y = self.Y
q = self.q
V = self.V
weights = self.weights

if k is None:
k = self.opt_k()

# Solve for the ratio of consumption from the intertemporal B.C.


fk = self.f(k)

c1 = lambda 𝜖: (w1(𝜖) + np.exp(𝜖)*fk)*q(𝜖,k)


denom = np.sum(weights * c1(self.points_integral)) / np.sqrt(np.pi) + (w0 - k)

w11q = lambda 𝜖: w11(𝜖)*q(𝜖,k)


num = w10 + 𝜃10 * V(k) + np.sum(weights * w11q(self.points_integral)) / np.
↪sqrt(np.pi)

𝜂 = num / denom


# Consumption choices
c10 = 𝜂 * (w0 - k)
c20 = (1-𝜂) * (w0 - k)
c11 = lambda 𝜖: 𝜂 * (w1(𝜖)+Y(𝜖,k))
c21 = lambda 𝜖: (1-𝜂) * (w1(𝜖)+Y(𝜖,k))

return c10, c20, c11, c21

def k_foc_factory(model):
𝜓 = model.𝜓
f = model.f
𝛽 = model.𝛽
𝛼 = model.𝛼
A = model.A
𝜓 = model.𝜓
w0 = model.w0
𝜇 = model.𝜇
𝜎 = model.𝜎

weights = model.weights
points_integral = model.points_integral

w11 = njit(lambda 𝜖, 𝜒1, : np.exp(-𝜒1*𝜇 - 0.5*(𝜒1**2)*(𝜎**2) + 𝜒1*𝜖))


w21 = njit(lambda 𝜖, 𝜒2: np.exp(-𝜒2*𝜇 - 0.5*(𝜒2**2)*(𝜎**2) + 𝜒2*𝜖))
w1 = njit(lambda 𝜖, 𝜒1, 𝜒2: w11(𝜖, 𝜒1) + w21(𝜖, 𝜒2))

@njit
def integrand(𝜖, 𝜒1, 𝜒2, k=1e-4):
fk = f(k)
return (w1(𝜖, 𝜒1, 𝜒2) + np.exp(𝜖) * fk) ** (-𝜓) * np.exp(𝜖)

@njit
def k_foc(k, 𝜒1, 𝜒2):
int_k = np.sum(weights * integrand(points_integral, 𝜒1, 𝜒2, k=k)) / np.
↪sqrt(np.pi)

mul = 𝛽 * 𝛼 * A * k ** (𝛼 - 1) / ((w0 - k) ** (-𝜓))


val = mul * int_k - 1

return val

return k_foc


38.3.1 Examples

Below we provide some examples of how to use BCG_complete_markets.

1st example

In the first example, we set up instances of BCG complete markets models.


We can use either default parameter values or set parameter values as we want.
The two instances of the BCG complete markets model, mdl1 and mdl2, represent the model with default parameter
settings and with agent 2’s income correlation altered to be 𝜒2 = −0.9, respectively.

# Example: BCG model for complete markets


mdl1 = BCG_complete_markets()
mdl2 = BCG_complete_markets(𝜒2=-0.9)

Let’s plot the agents’ time-1 endowments with respect to shocks to see the difference in the two models:

#==== Figure 1: HH endowments and firm productivity ====#


# Realizations of innovation from -1 to 1
epsgrid = np.linspace(-1,1,1000)

fig, ax = plt.subplots(1,2,figsize=(14,6))
ax[0].plot(epsgrid, mdl1.w11(epsgrid), color='black', label=r'Agent 1\'s endowment')
ax[0].plot(epsgrid, mdl1.w21(epsgrid), color='blue', label=r'Agent 2\'s endowment')
ax[0].plot(epsgrid, mdl1.Y(epsgrid,1), color='red', label=r'Production with $k=1$')
ax[0].set_xlim([-1,1])
ax[0].set_ylim([0,7])
ax[0].set_xlabel(r'$\epsilon$',fontsize=12)
ax[0].set_title(r'Model with $\chi_1 = 0$, $\chi_2 = 0.9$')
ax[0].legend()
ax[0].grid()

ax[1].plot(epsgrid, mdl2.w11(epsgrid), color='black', label=r'Agent 1\'s endowment')


ax[1].plot(epsgrid, mdl2.w21(epsgrid), color='blue', label=r'Agent 2\'s endowment')
ax[1].plot(epsgrid, mdl2.Y(epsgrid,1), color='red', label=r'Production with $k=1$')
ax[1].set_xlim([-1,1])
ax[1].set_ylim([0,7])
ax[1].set_xlabel(r'$\epsilon$',fontsize=12)
ax[1].set_title(r'Model with $\chi_1 = 0$, $\chi_2 = -0.9$')
ax[1].legend()
ax[1].grid()

plt.show()


Let’s also compare the optimal capital stock, 𝑘, and optimal time-0 consumption of agent 2, 𝑐02 , for the two models:

# Print optimal k
kk_1 = mdl1.opt_k()
kk_2 = mdl2.opt_k()

print('The optimal k for model 1: {:.5f}'.format(kk_1))


print('The optimal k for model 2: {:.5f}'.format(kk_2))

# Print optimal time-0 consumption for agent 2


c20_1 = mdl1.opt_c(k=kk_1)[1]
c20_2 = mdl2.opt_c(k=kk_2)[1]

print('The optimal c20 for model 1: {:.5f}'.format(c20_1))


print('The optimal c20 for model 2: {:.5f}'.format(c20_2))

The optimal k for model 1: 0.14235


The optimal k for model 2: 0.13791

The optimal c20 for model 1: 0.90205


The optimal c20 for model 2: 0.92862

2nd example

In the second example, we illustrate how the optimal choice of 𝑘 is influenced by the correlation parameter 𝜒𝑖 .
We will need to install the plotly package for 3D illustration. See https://plotly.com/python/getting-started/ for further
instructions.

# Mesh grid of 𝜒1 and 𝜒2 values
N = 30
𝜒1grid, 𝜒2grid = np.meshgrid(np.linspace(-1,1,N),
np.linspace(-1,1,N))


k_foc = k_foc_factory(mdl1)

# Create grid for k


kgrid = np.zeros_like(𝜒1grid)

w0 = mdl1.w0

@njit(parallel=True)
def fill_k_grid(kgrid):
# Loop: Compute optimal k for each (𝜒1, 𝜒2) pair on the grid
for i in prange(N):
for j in prange(N):
X1 = 𝜒1grid[i, j]
X2 = 𝜒2grid[i, j]
k = root_finding.newton_secant(k_foc, 1e-2, args=(X1, X2)).root
kgrid[i, j] = k

%%time
fill_k_grid(kgrid)

CPU times: user 4.57 s, sys: 105 ms, total: 4.68 s


Wall time: 4.67 s

%%time
# Second-run
fill_k_grid(kgrid)

CPU times: user 7.65 ms, sys: 985 μs, total: 8.64 ms
Wall time: 2.87 ms

#=== Example: Plot optimal k with different correlations ===#

from IPython.display import Image


# Import plotly
import plotly.graph_objs as go

# Plot optimal k
fig = go.Figure(data=[go.Surface(x=𝜒1grid, y=𝜒2grid, z=kgrid)])
fig.update_layout(scene = dict(xaxis_title='x - 𝜒1',
yaxis_title='y - 𝜒2',
zaxis_title='z - k',
aspectratio=dict(x=1,y=1,z=1)))
fig.update_layout(width=500,
height=500,
margin=dict(l=50, r=50, b=65, t=90))
fig.update_layout(scene_camera=dict(eye=dict(x=2, y=-2, z=1.5)))

# Export to PNG file


Image(fig.to_image(format="png"))
# fig.show() will provide interactive plot when running
# notebook locally

[Figure: 3D surface plot of the optimal capital 𝑘 over the (𝜒1 , 𝜒2 ) grid]


CHAPTER

THIRTYNINE

EQUILIBRIUM CAPITAL STRUCTURES WITH INCOMPLETE MARKETS

In addition to what’s in Anaconda, this lecture will need the following libraries:

!pip install --upgrade quantecon


!conda install -y -c plotly plotly plotly-orca

39.1 Introduction

This is an extension of an earlier lecture Irrelevance of Capital Structure with Complete Markets about a complete markets
model.
In contrast to that lecture, this one describes an instance of a model authored by Bisin, Clementi, and Gottardi [Bisin et
al., 2018] in which financial markets are incomplete.
Instead of being able to trade equities and a full set of one-period Arrow securities as they can in Irrelevance of Capital
Structure with Complete Markets, here consumers and firms trade only equity and a bond.
It is useful to watch how outcomes differ in the two settings.
In the complete markets economy in Irrelevance of Capital Structure with Complete Markets
• there is a unique stochastic discount factor that prices all assets
• consumers’ portfolio choices are indeterminate
• firms’ financial structures are indeterminate, so the model embodies an instance of a Modigliani-Miller irrelevance
theorem [Modigliani and Miller, 1958]
• the aggregate of all firms' financial structures is indeterminate, a consequence of there being redundant assets
In the incomplete markets economy studied here
• there is not a unique equilibrium stochastic discount factor
• different stochastic discount factors price different assets
• consumers’ portfolio choices are determinate
• while individual firms' financial structures are indeterminate, thus conforming to part of a Modigliani-Miller theorem [Modigliani and Miller, 1958], the aggregate of all firms' financial structures is determinate.
A Big K, little k analysis played an important role in the previous lecture Irrelevance of Capital Structure with
Complete Markets.
A more subtle version of a Big K, little k features in the BCG incomplete markets environment here.


We use it to convey the heart of what BCG call a rational conjectures equilibrium in which conjectures are about
equilibrium pricing functions in regions of the state space that an average consumer or firm does not visit in equilibrium.
Note that the absence of complete markets means that now we cannot compute competitive equilibrium prices and
allocations by first solving the simple planning problem that we did in Irrelevance of Capital Structure with Complete
Markets.
Instead, we compute an equilibrium by solving a system of simultaneous inequalities.
(Here we do not address the interesting question of whether there is a different planning problem that we could use to
compute a competitive equilibrium allocation.)

39.1.1 Setup

We adopt specifications of preferences and technologies used by Bisin, Clementi, and Gottardi (2018) [Bisin et al., 2018]
and in our earlier lecture on a complete markets version of their model.
The economy lasts for two periods, 𝑡 = 0, 1.
There are two types of consumers named 𝑖 = 1, 2.
A scalar random variable 𝜖 affects both
• a representative firm’s physical return 𝑓(𝑘)𝑒𝜖 in period 1 from investing 𝑘 ≥ 0 in capital in period 0.
• period 1 endowments 𝑤1𝑖 (𝜖) of the consumption good for agents 𝑖 = 1 and 𝑖 = 2.

39.1.2 Ownership

A consumer of type 𝑖 is endowed with 𝑤0𝑖 units of the time 0 good and 𝑤1𝑖 (𝜖) of the time 1 good when the random variable
takes value 𝜖.
At the start of period 0, a consumer of type 𝑖 also owns 𝜃0𝑖 shares of a representative firm.

39.1.3 Measures of agents and firms

As in the companion lecture Irrelevance of Capital Structure with Complete Markets that studies a complete markets version
of the model, we follow BCG in assuming that there are unit measures of
• consumers of type 𝑖 = 1
• consumers of type 𝑖 = 2
• firms with access to a production technology that converts 𝑘 units of time 0 good into 𝐴𝑘𝛼 𝑒𝜖 units of the time 1
good in random state 𝜖
Thus, let 𝜔 ∈ [0, 1] index a particular consumer of type 𝑖.
Then define Big 𝐶 𝑖 as
𝐶 𝑖 = ∫_0^1 𝑐𝑖 (𝜔) 𝑑𝜔

with components

𝐶0𝑖 = ∫_0^1 𝑐0𝑖 (𝜔) 𝑑𝜔
𝐶1𝑖 (𝜖) = ∫_0^1 𝑐1𝑖 (𝜖; 𝜔) 𝑑𝜔


In the same spirit, let 𝜁 ∈ [0, 1] index a particular firm and let firm 𝜁 purchase 𝑘(𝜁) units of capital and issue 𝑏(𝜁) bonds.
Then define Big 𝐾 and Big 𝐵 as
𝐾 = ∫_0^1 𝑘(𝜁) 𝑑𝜁,  𝐵 = ∫_0^1 𝑏(𝜁) 𝑑𝜁

The assumption that there are equal measures of our three types of agents justifies our assumption that each individual
agent is a powerless price taker:
• an individual consumer chooses its own (infinitesimal) part 𝑐𝑖 (𝜔) of 𝐶 𝑖 taking prices as given
• an individual firm chooses its own (infinitesimal) part 𝑘(𝜁) of 𝐾 and 𝑏(𝜁) of 𝐵 taking pricing functions as given
• However, equilibrium prices depend on the Big K, Big B, Big C objects 𝐾, 𝐵, and 𝐶
The assumption about measures of agents is a powerful device for making a host of competitive agents take as given the
equilibrium prices that turn out to be determined by the decisions of hosts of agents who are just like them.
We call an equilibrium symmetric if
• all type 𝑖 consumers choose the same consumption profiles so that 𝑐𝑖 (𝜔) = 𝐶 𝑖 for all 𝜔 ∈ [0, 1]
• all firms choose the same levels of 𝑘 and 𝑏 so that 𝑘(𝜁) = 𝐾, 𝑏(𝜁) = 𝐵 for all 𝜁 ∈ [0, 1]
In this lecture, we restrict ourselves to describing symmetric equilibria.

39.1.4 Endowments

Per capita economy-wide endowments in periods 0 and 1 are

𝑤0 = 𝑤01 + 𝑤02
𝑤1 (𝜖) = 𝑤11 (𝜖) + 𝑤12 (𝜖) in state 𝜖

39.1.5 Feasibility:

Where 𝛼 ∈ (0, 1) and 𝐴 > 0

𝐶01 + 𝐶02 = 𝑤01 + 𝑤02 − 𝐾
𝐶11 (𝜖) + 𝐶12 (𝜖) = 𝑤11 (𝜖) + 𝑤12 (𝜖) + 𝑒𝜖 ∫_0^1 𝑓(𝑘(𝜁)) 𝑑𝜁,  𝑘 ≥ 0

where 𝑓(𝑘) = 𝐴𝑘𝛼 , 𝐴 > 0, 𝛼 ∈ (0, 1).

39.1.6 Parameterizations

Following BCG, we shall employ the following parameterizations:

𝜖 ∼ 𝒩(𝜇, 𝜎^2 )
𝑢(𝑐) = 𝑐^{1−𝛾} / (1 − 𝛾)
𝑤1𝑖 (𝜖) = 𝑒^{−𝜒𝑖 𝜇 − .5 𝜒𝑖^2 𝜎^2 + 𝜒𝑖 𝜖},  𝜒𝑖 ∈ [0, 1]

Sometimes instead of assuming 𝜖 ∼ 𝑔(𝜖) = 𝒩(0, 𝜎^2 ), we'll assume that 𝑔(⋅) is a probability mass function that serves as
a discrete approximation to a standardized normal density.
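A useful feature of this endowment specification is that the expected time-1 endowment equals 1 for every value of 𝜒𝑖 , since E[𝑒^{𝜒𝑖 𝜖}] = 𝑒^{𝜒𝑖 𝜇 + .5 𝜒𝑖^2 𝜎^2} when 𝜖 ∼ 𝒩(𝜇, 𝜎^2 ). Here is a minimal numerical check of that property (the function name is ours, and the parameter values are the defaults used below):

import numpy as np
from scipy.stats import norm
from scipy.integrate import quad

μ, σ = -0.025, 0.4

def expected_endowment(χ):
    # E[w_1^i(ϵ)] with w_1^i(ϵ) = exp(-χμ - 0.5χ²σ² + χϵ) and ϵ ~ N(μ, σ²)
    w1 = lambda ϵ: np.exp(-χ*μ - 0.5*χ**2*σ**2 + χ*ϵ)
    val, _ = quad(lambda ϵ: w1(ϵ) * norm.pdf(ϵ, loc=μ, scale=σ), -10, 10)
    return val

print([round(expected_endowment(χ), 6) for χ in (0.0, 0.5, 0.9)])  # all ≈ 1.0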


39.1.7 Preferences:

A consumer of type 𝑖 orders period 0 consumption 𝑐0𝑖 and state 𝜖-period 1 consumption 𝑐𝑖 (𝜖) by

𝑢𝑖 = 𝑢(𝑐0𝑖 ) + 𝛽 ∫ 𝑢(𝑐1𝑖 (𝜖))𝑔(𝜖)𝑑𝜖, 𝑖 = 1, 2

where 𝛽 ∈ (0, 1) and the one-period utility function is

𝑢(𝑐) = 𝑐^{1−𝛾}/(1 − 𝛾) if 𝛾 ≠ 1,  and 𝑢(𝑐) = log 𝑐 if 𝛾 = 1

39.1.8 Risk-sharing motives

The two types of agents’ period 1 endowments have different correlations with the physical return on capital.
Endowment differences give agents incentives to trade risks that in the complete market version of the model showed up
in their demands for equity and in their demands and supplies of one-period Arrow securities.
In the incomplete-markets setting under study here, these differences show up in differences in the two types of consumers’
demands for a typical firm’s bonds and equity, the only two assets that agents can now trade.

39.2 Asset Markets

Markets are incomplete: ex cathedra we the model builders declare that only equities and bonds issued by representative
firms can be traded.
Let 𝜃𝑖 and 𝜉 𝑖 be a consumer of type 𝑖’s post-trade holdings of equity and bonds, respectively.
A firm issues bonds promising to pay 𝑏 units of consumption at time 𝑡 = 1 and purchases 𝑘 units of physical capital at
time 𝑡 = 0.
When 𝑒𝜖 𝐴𝑘𝛼 < 𝑏 at time 1, the firm defaults and its output is divided equally among bondholders.
Evidently, when the productivity shock 𝜖 < 𝜖∗ = log (𝑏/(𝐴𝑘𝛼 )), the firm defaults on its debt.
Payoffs to equity and debt at date 1 as functions of the productivity shock 𝜖 are thus

𝑑𝑒 (𝑘, 𝑏; 𝜖) = max {𝑒𝜖 𝐴𝑘𝛼 − 𝑏, 0}
𝑑𝑏 (𝑘, 𝑏; 𝜖) = min {𝑒𝜖 𝐴𝑘𝛼 /𝑏, 1}    (39.1)
A firm faces a bond price function 𝑝(𝑘, 𝑏) when it issues 𝑏 bonds and purchases 𝑘 units of physical capital.
A firm’s equity is worth 𝑞(𝑘, 𝑏) when it issues 𝑏 bonds and purchases 𝑘 units of physical capital.
A firm regards an equity-pricing function 𝑞(𝑘, 𝑏) and a bond pricing function 𝑝(𝑘, 𝑏) as exogenous in the sense that they
are not affected by its choices of 𝑘 and 𝑏.
Consumers face equilibrium prices 𝑞 ̌ and 𝑝̌ for bonds and equities, where 𝑞 ̌ and 𝑝̌ are both scalars.
Consumers are price takers and only need to know the scalars 𝑞,̌ 𝑝.̌
Firms are price function takers and must know the functions 𝑞(𝑘, 𝑏), 𝑝(𝑘, 𝑏) in order completely to pose their optimum
problems.


39.2.1 Consumers

Each consumer of type 𝑖 is endowed with 𝑤0𝑖 of the time 0 consumption good, 𝑤1𝑖 (𝜖) of the time 1, state 𝜖 consumption
good and also owns a fraction 𝜃0𝑖 ∈ (0, 1) of the initial value of a representative firm, where 𝜃01 + 𝜃02 = 1.
The initial value of a representative firm is 𝑉 (an object to be determined in a rational expectations equilibrium).
Consumer 𝑖 buys 𝜃𝑖 shares of equity and buys bonds worth 𝑝𝜉̌ 𝑖 where 𝑝̌ is the bond price.
Being a price-taker, a consumer takes 𝑉 , 𝑞,̌ 𝑝,̌ and 𝐾, 𝐵 as given.
Consumers know that equilibrium payoff functions for bonds and equities take the form
𝑑𝑒 (𝐾, 𝐵; 𝜖) = max {𝑒𝜖 𝐴𝐾 𝛼 − 𝐵, 0}
𝑑𝑏 (𝐾, 𝐵; 𝜖) = min {𝑒𝜖 𝐴𝐾 𝛼 /𝐵, 1}
Consumer 𝑖’s optimization problem is

max_{𝑐0𝑖 , 𝜃𝑖 , 𝜉 𝑖 , 𝑐1𝑖 (𝜖)}  𝑢(𝑐0𝑖 ) + 𝛽 ∫ 𝑢(𝑐1𝑖 (𝜖))𝑔(𝜖) 𝑑𝜖

subject to

𝑐0𝑖 = 𝑤0𝑖 + 𝜃0𝑖 𝑉 − 𝑞 ̌ 𝜃𝑖 − 𝑝 ̌ 𝜉 𝑖 ,
𝑐1𝑖 (𝜖) = 𝑤1𝑖 (𝜖) + 𝜃𝑖 𝑑𝑒 (𝐾, 𝐵; 𝜖) + 𝜉 𝑖 𝑑𝑏 (𝐾, 𝐵; 𝜖) ∀ 𝜖,
𝜃𝑖 ≥ 0, 𝜉 𝑖 ≥ 0.
The last two inequalities impose that the consumer cannot short sell either equity or bonds.
In a rational expectations equilibrium, 𝑞 ̌ = 𝑞(𝐾, 𝐵) and 𝑝̌ = 𝑝(𝐾, 𝐵)
We form consumer 𝑖’s Lagrangian:

𝐿𝑖 ∶= 𝑢(𝑐0𝑖 ) + 𝛽 ∫ 𝑢(𝑐1𝑖 (𝜖))𝑔(𝜖) 𝑑𝜖

+ 𝜆𝑖0 [𝑤0𝑖 + 𝜃0𝑖 𝑉 − 𝑞 ̌ 𝜃𝑖 − 𝑝 ̌ 𝜉 𝑖 − 𝑐0𝑖 ]

+ 𝛽 ∫ 𝜆𝑖1 (𝜖) [𝑤1𝑖 (𝜖) + 𝜃𝑖 𝑑𝑒 (𝐾, 𝐵; 𝜖) + 𝜉 𝑖 𝑑𝑏 (𝐾, 𝐵; 𝜖) − 𝑐1𝑖 (𝜖)] 𝑔(𝜖) 𝑑𝜖

Consumer 𝑖’s first-order necessary conditions for an optimum include:


𝑐0𝑖 ∶ 𝑢′ (𝑐0𝑖 ) = 𝜆𝑖0
𝑐1𝑖 (𝜖) ∶ 𝑢′ (𝑐1𝑖 (𝜖)) = 𝜆𝑖1 (𝜖)
𝜃𝑖 ∶ 𝛽 ∫ 𝜆𝑖1 (𝜖)𝑑𝑒 (𝐾, 𝐵; 𝜖)𝑔(𝜖) 𝑑𝜖 ≤ 𝜆𝑖0 𝑞 ̌ (= if 𝜃𝑖 > 0)
𝜉 𝑖 ∶ 𝛽 ∫ 𝜆𝑖1 (𝜖)𝑑𝑏 (𝐾, 𝐵; 𝜖)𝑔(𝜖) 𝑑𝜖 ≤ 𝜆𝑖0 𝑝 ̌ (= if 𝜉 𝑖 > 0)

We can combine and rearrange consumer 𝑖’s first-order conditions to become:


𝑞 ̌ ≥ 𝛽 ∫ (𝑢′ (𝑐1𝑖 (𝜖)) / 𝑢′ (𝑐0𝑖 )) 𝑑𝑒 (𝐾, 𝐵; 𝜖)𝑔(𝜖) 𝑑𝜖 (= if 𝜃𝑖 > 0)
𝑝 ̌ ≥ 𝛽 ∫ (𝑢′ (𝑐1𝑖 (𝜖)) / 𝑢′ (𝑐0𝑖 )) 𝑑𝑏 (𝐾, 𝐵; 𝜖)𝑔(𝜖) 𝑑𝜖 (= if 𝜉 𝑖 > 0)
These inequalities imply that in a symmetric rational expectations equilibrium consumption allocations and prices satisfy
𝑞 ̌ = max_𝑖 𝛽 ∫ (𝑢′ (𝑐1𝑖 (𝜖)) / 𝑢′ (𝑐0𝑖 )) 𝑑𝑒 (𝐾, 𝐵; 𝜖)𝑔(𝜖) 𝑑𝜖
𝑝 ̌ = max_𝑖 𝛽 ∫ (𝑢′ (𝑐1𝑖 (𝜖)) / 𝑢′ (𝑐0𝑖 )) 𝑑𝑏 (𝐾, 𝐵; 𝜖)𝑔(𝜖) 𝑑𝜖
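The BCG_incomplete_markets class presented later implements this "max over agents" pricing in its valuations_by_agent method. As a preview, here is a minimal sketch that assumes candidate consumption plans and payoff functions are supplied as ordinary Python callables (all names here are ours):

import numpy as np
from scipy.integrate import quad
from scipy.stats import norm

β, γ = 0.96, 3.0
g = lambda ϵ: norm.pdf(ϵ, loc=-0.025, scale=0.4)   # shock density

def asset_prices(c0, c1, d_e, d_b, bound=3):
    """q̌ and p̌ as the max over agents of β E[(u'(c1)/u'(c0)) * payoff]."""
    def valuation(i, d):
        integrand = lambda ϵ: β * (c1[i](ϵ) / c0[i])**(-γ) * d(ϵ) * g(ϵ)
        return quad(integrand, -bound, bound)[0]
    q = max(valuation(i, d_e) for i in (0, 1))
    p = max(valuation(i, d_b) for i in (0, 1))
    return q, p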


39.2.2 Pricing functions

When individual firms solve their optimization problems, they take big 𝐶 𝑖 ’s as fixed objects that they don’t influence.
A representative firm faces a price function 𝑞(𝑘, 𝑏) for its equity and a price function 𝑝(𝑘, 𝑏) per unit of bonds that satisfy

𝑞(𝑘, 𝑏) = max_𝑖 𝛽 ∫ (𝑢′ (𝐶1𝑖 (𝜖)) / 𝑢′ (𝐶0𝑖 )) 𝑑𝑒 (𝑘, 𝑏; 𝜖)𝑔(𝜖) 𝑑𝜖
𝑝(𝑘, 𝑏) = max_𝑖 𝛽 ∫ (𝑢′ (𝐶1𝑖 (𝜖)) / 𝑢′ (𝐶0𝑖 )) 𝑑𝑏 (𝑘, 𝑏; 𝜖)𝑔(𝜖) 𝑑𝜖

where the payoff functions are described by equations (39.1).


Notice the appearance of big 𝐶 𝑖 ’s on the right sides of these two equations that define equilibrium pricing functions.
The two price functions describe outcomes not only for equilibrium choices 𝐾, 𝐵 of capital 𝑘 and debt 𝑏, but also for any
out-of-equilibrium pairs (𝑘, 𝑏) ≠ (𝐾, 𝐵).
The firm is assumed to know both price functions.
This means that the firm understands that its choice of 𝑘, 𝑏 influences how markets price its equity and debt.
This package of assumptions is sometimes called rational conjectures (about price functions).
BCG give credit to Makowski for emphasizing and clarifying how rational conjectures are components of rational ex-
pectations equilibria.

39.2.3 Firms

The firm chooses capital 𝑘 and debt 𝑏 to maximize its market value:

𝑉 ≡ max_{𝑘,𝑏} {−𝑘 + 𝑞(𝑘, 𝑏) + 𝑝(𝑘, 𝑏)𝑏}

Attributing value maximization to the firm is a good idea because in equilibrium consumers of both types want a firm to
maximize its value.
In the special quantitative examples studied here
• consumers of types 𝑖 = 1, 2 both hold equity
• only consumers of type 𝑖 = 2 hold debt; consumers of type 𝑖 = 1 hold none.
These outcomes occur because we follow BCG and set parameters so that a type 2 consumer’s stochastic endowment of
the consumption good in period 1 is more correlated with the firm’s output than is a type 1 consumer’s.
This gives consumers of type 2 a motive to hedge their second period endowment risk by holding bonds (they also choose
to hold some equity).
These outcomes mean that the pricing functions end up satisfying

𝑞(𝑘, 𝑏) = 𝛽 ∫ (𝑢′ (𝐶11 (𝜖)) / 𝑢′ (𝐶01 )) 𝑑𝑒 (𝑘, 𝑏; 𝜖)𝑔(𝜖) 𝑑𝜖 = 𝛽 ∫ (𝑢′ (𝐶12 (𝜖)) / 𝑢′ (𝐶02 )) 𝑑𝑒 (𝑘, 𝑏; 𝜖)𝑔(𝜖) 𝑑𝜖
𝑝(𝑘, 𝑏) = 𝛽 ∫ (𝑢′ (𝐶12 (𝜖)) / 𝑢′ (𝐶02 )) 𝑑𝑏 (𝑘, 𝑏; 𝜖)𝑔(𝜖) 𝑑𝜖

Recall that 𝜖∗ (𝑘, 𝑏) ≡ log (𝑏/(𝐴𝑘𝛼 )) is a firm's default threshold.


We can rewrite the pricing functions as:



𝑞(𝑘, 𝑏) = 𝛽 ∫_{𝜖∗}^{∞} (𝑢′ (𝐶1𝑖 (𝜖)) / 𝑢′ (𝐶0𝑖 )) (𝑒𝜖 𝐴𝑘𝛼 − 𝑏) 𝑔(𝜖) 𝑑𝜖,  𝑖 = 1, 2
𝑝(𝑘, 𝑏) = 𝛽 ∫_{−∞}^{𝜖∗} (𝑢′ (𝐶12 (𝜖)) / 𝑢′ (𝐶02 )) (𝑒𝜖 𝐴𝑘𝛼 /𝑏) 𝑔(𝜖) 𝑑𝜖 + 𝛽 ∫_{𝜖∗}^{∞} (𝑢′ (𝐶12 (𝜖)) / 𝑢′ (𝐶02 )) 𝑔(𝜖) 𝑑𝜖

Firm’s optimization problem

The firm’s optimization problem is

𝑉 ≡ max_{𝑘,𝑏} {−𝑘 + 𝑞(𝑘, 𝑏) + 𝑝(𝑘, 𝑏)𝑏}

The firm’s first-order necessary conditions with respect to 𝑘 and 𝑏, respectively, are

𝑘 ∶ −1 + 𝜕𝑞(𝑘, 𝑏)/𝜕𝑘 + 𝑏 𝜕𝑝(𝑘, 𝑏)/𝜕𝑘 = 0
𝑏 ∶ 𝜕𝑞(𝑘, 𝑏)/𝜕𝑏 + 𝑝(𝑘, 𝑏) + 𝑏 𝜕𝑝(𝑘, 𝑏)/𝜕𝑏 = 0
We use the Leibniz integral rule several times to arrive at the following derivatives:

𝜕𝑞(𝑘, 𝑏)/𝜕𝑘 = 𝛽𝛼𝐴𝑘^{𝛼−1} ∫_{𝜖∗}^{∞} (𝑢′ (𝐶1𝑖 (𝜖)) / 𝑢′ (𝐶0𝑖 )) 𝑒𝜖 𝑔(𝜖)𝑑𝜖,  𝑖 = 1, 2
𝜕𝑞(𝑘, 𝑏)/𝜕𝑏 = −𝛽 ∫_{𝜖∗}^{∞} (𝑢′ (𝐶1𝑖 (𝜖)) / 𝑢′ (𝐶0𝑖 )) 𝑔(𝜖)𝑑𝜖,  𝑖 = 1, 2
𝜕𝑝(𝑘, 𝑏)/𝜕𝑘 = 𝛽𝛼 (𝐴𝑘^{𝛼−1}/𝑏) ∫_{−∞}^{𝜖∗} (𝑢′ (𝐶12 (𝜖)) / 𝑢′ (𝐶02 )) 𝑒𝜖 𝑔(𝜖)𝑑𝜖
𝜕𝑝(𝑘, 𝑏)/𝜕𝑏 = −𝛽 (𝐴𝑘𝛼 /𝑏^2 ) ∫_{−∞}^{𝜖∗} (𝑢′ (𝐶12 (𝜖)) / 𝑢′ (𝐶02 )) 𝑒𝜖 𝑔(𝜖)𝑑𝜖
Special case: We confine ourselves to a special case in which both types of consumer hold positive equities so that
𝜕𝑞(𝑘, 𝑏)/𝜕𝑘 and 𝜕𝑞(𝑘, 𝑏)/𝜕𝑏 are related to rates of intertemporal substitution for both agents.
Substituting these partial derivatives into the above first-order conditions for 𝑘 and 𝑏, respectively, we obtain the following
versions of those first order conditions:

𝑘 ∶ −1 + 𝛽𝛼𝐴𝑘^{𝛼−1} ∫_{−∞}^{∞} (𝑢′ (𝐶12 (𝜖)) / 𝑢′ (𝐶02 )) 𝑒𝜖 𝑔(𝜖)𝑑𝜖 = 0    (39.2)
𝑏 ∶ ∫_{𝜖∗}^{∞} (𝑢′ (𝐶11 (𝜖)) / 𝑢′ (𝐶01 )) 𝑔(𝜖) 𝑑𝜖 = ∫_{𝜖∗}^{∞} (𝑢′ (𝐶12 (𝜖)) / 𝑢′ (𝐶02 )) 𝑔(𝜖) 𝑑𝜖    (39.3)
where again recall that 𝜖∗ (𝑘, 𝑏) ≡ log (𝑏/(𝐴𝑘𝛼 )).
Taking 𝐶0𝑖 , 𝐶1𝑖 (𝜖) as given, these are two equations that we want to solve for the firm’s optimal decisions 𝑘, 𝑏.
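As a bridge to the kfoc and bfoc computations in the code below, here is a minimal sketch of the two residuals in (39.2) and (39.3), assuming the aggregate consumption plans are supplied as Python scalars and callables and using an untruncated normal density for simplicity (the function and argument names are ours):

import numpy as np
from scipy.integrate import quad
from scipy.stats import norm

β, γ, A, α = 0.96, 3.0, 2.5, 0.6
g = lambda ϵ: norm.pdf(ϵ, loc=-0.025, scale=0.4)

def firm_foc_residuals(k, b, C0, C1, bound=3):
    """Residuals of (39.2) and (39.3); C0[i] is a scalar and C1[i] a callable, i = 0 for agent 1, i = 1 for agent 2."""
    ϵ_star = np.log(b / (A * k**α))
    m = lambda i, ϵ: (C1[i](ϵ) / C0[i])**(-γ)        # agent i's IMRS (up to β)
    k_res = β * α * A * k**(α - 1) * quad(lambda ϵ: m(1, ϵ) * np.exp(ϵ) * g(ϵ),
                                          -bound, bound)[0] - 1
    b_res = quad(lambda ϵ: (m(0, ϵ) - m(1, ϵ)) * g(ϵ), ϵ_star, bound)[0]
    return k_res, b_res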


39.3 Equilibrium verification

On page 5 of BCG (2018), the authors say


If the price conjectures corresponding to the plan chosen by firms in equilibrium are correct, that is equal to the market prices
𝑞 ̌ and 𝑝,̌ it is immediate to verify that the rationality of the conjecture coincides with the agents’ Euler equations.
Here BCG are describing how they go about verifying that when they set little 𝑘, little 𝑏 from the firm’s first-order
conditions equal to the big 𝐾, big 𝐵 at the big 𝐶’s that appear in the pricing functions, then
• consumers’ Euler equations are satisfied if little 𝑐’s are equated to Big 𝐶’s
• firms’ first-order necessary conditions for 𝑘, 𝑏 are satisfied.
• 𝑞 ̌ = 𝑞(𝐾, 𝐵) and 𝑝̌ = 𝑝(𝐾, 𝐵).

39.4 Pseudo Code

Before displaying our Python code for computing a BCG incomplete markets equilibrium, we’ll sketch some pseudo code
that describes its logical flow.
Here goes:
1. Set upper and lower bounds for firm value as 𝑉ℎ and 𝑉𝑙 , for capital as 𝑘ℎ and 𝑘𝑙 , and for debt as 𝑏ℎ and 𝑏𝑙 .
2. Conjecture firm value 𝑉 = (𝑉ℎ + 𝑉𝑙 )/2.
3. Conjecture debt level 𝑏 = (𝑏ℎ + 𝑏𝑙 )/2.
4. Conjecture capital 𝑘 = (𝑘ℎ + 𝑘𝑙 )/2.
5. Compute the default threshold 𝜖∗ ≡ log (𝑏/(𝐴𝑘𝛼 )).
6. (In this step we abuse notation by freezing 𝑉 , 𝑘, 𝑏 and in effect temporarily treating them as Big 𝐾, 𝐵 values. Thus,
in this step 6 little 𝑘, 𝑏 are frozen at the guessed values of 𝐾, 𝐵.) Fixing the values of 𝑉 , 𝑏 and 𝑘, compute optimal
choices of consumption 𝑐𝑖 with consumers' FOCs. Assume that only agent 2 holds debt: 𝜉 2 = 𝑏 and that both
agents hold equity: 0 < 𝜃𝑖 < 1 for 𝑖 = 1, 2.
7. Set high and low bounds for equity holdings for agent 1 as 𝜃ℎ1 and 𝜃𝑙1 . Guess 𝜃1 = (𝜃ℎ1 + 𝜃𝑙1 )/2, and 𝜃2 = 1 − 𝜃1 .
While |𝜃ℎ1 − 𝜃𝑙1 | is large:
• Compute agent 1’s valuation of the equity claim with a fixed-point iteration:
𝑞1 = 𝛽 ∫ (𝑢′ (𝑐11 (𝜖)) / 𝑢′ (𝑐01 )) 𝑑𝑒 (𝑘, 𝑏; 𝜖)𝑔(𝜖) 𝑑𝜖

where
𝑐11 (𝜖) = 𝑤11 (𝜖) + 𝜃1 𝑑𝑒 (𝑘, 𝑏; 𝜖)
and
𝑐01 = 𝑤01 + 𝜃01 𝑉 − 𝑞1 𝜃1
• Compute agent 2’s valuation of the bond claim with a fixed-point iteration:
𝑝 = 𝛽 ∫ (𝑢′ (𝑐12 (𝜖)) / 𝑢′ (𝑐02 )) 𝑑𝑏 (𝑘, 𝑏; 𝜖)𝑔(𝜖) 𝑑𝜖

where
𝑐12 (𝜖) = 𝑤12 (𝜖) + 𝜃2 𝑑𝑒 (𝑘, 𝑏; 𝜖) + 𝑏
and


𝑐02 = 𝑤02 + 𝜃02 𝑉 − 𝑞1 𝜃2 − 𝑝𝑏


• Compute agent 2’s valuation of the equity claim with a fixed-point iteration:
𝑞2 = 𝛽 ∫ (𝑢′ (𝑐12 (𝜖)) / 𝑢′ (𝑐02 )) 𝑑𝑒 (𝑘, 𝑏; 𝜖)𝑔(𝜖) 𝑑𝜖

where
𝑐12 (𝜖) = 𝑤12 (𝜖) + 𝜃2 𝑑𝑒 (𝑘, 𝑏; 𝜖) + 𝑏
and
𝑐02 = 𝑤02 + 𝜃02 𝑉 − 𝑞2 𝜃2 − 𝑝𝑏
• If 𝑞1 > 𝑞2 , set 𝜃𝑙1 = 𝜃1 ; otherwise, set 𝜃ℎ1 = 𝜃1 .
• Repeat steps 6Aa through 6Ad until |𝜃ℎ1 − 𝜃𝑙1 | is small.
8. Set bond price as 𝑝 and equity price as 𝑞 = max(𝑞1 , 𝑞2 ).
9. Compute optimal choices of consumption:

𝑐01 = 𝑤01 + 𝜃01 𝑉 − 𝑞𝜃1


𝑐02 = 𝑤02 + 𝜃02 𝑉 − 𝑞𝜃2 − 𝑝𝑏
𝑐11 (𝜖) = 𝑤11 (𝜖) + 𝜃1 𝑑𝑒 (𝑘, 𝑏; 𝜖)
𝑐12 (𝜖) = 𝑤12 (𝜖) + 𝜃2 𝑑𝑒 (𝑘, 𝑏; 𝜖) + 𝑏

10. (Here we confess to abusing notation again, but now in a different way. In step 7, we interpret frozen 𝑐𝑖 s as Big
𝐶 𝑖 . We do this to solve the firm’s problem.) Fixing the values of 𝑐0𝑖 and 𝑐1𝑖 (𝜖), compute optimal choices of capital
𝑘 and debt level 𝑏 using the firm’s first order necessary conditions.
11. Compute deviations from the firm’s FONC for capital 𝑘 as:
𝑘𝑓𝑜𝑐 = 𝛽𝛼𝐴𝑘^{𝛼−1} (∫ (𝑢′ (𝑐12 (𝜖)) / 𝑢′ (𝑐02 )) 𝑒𝜖 𝑔(𝜖) 𝑑𝜖) − 1

• If 𝑘𝑓𝑜𝑐 > 0, Set 𝑘𝑙 = 𝑘; otherwise, set 𝑘ℎ = 𝑘.


• Repeat steps 4 through 7A until |𝑘ℎ − 𝑘𝑙 | is small.
12. Compute deviations from the firm’s FONC for debt level 𝑏 as:
𝑏𝑓𝑜𝑐 = 𝛽 [∫_{𝜖∗}^{∞} (𝑢′ (𝑐11 (𝜖)) / 𝑢′ (𝑐01 )) 𝑔(𝜖) 𝑑𝜖 − ∫_{𝜖∗}^{∞} (𝑢′ (𝑐12 (𝜖)) / 𝑢′ (𝑐02 )) 𝑔(𝜖) 𝑑𝜖]

• If 𝑏𝑓𝑜𝑐 > 0, Set 𝑏ℎ = 𝑏; otherwise, set 𝑏𝑙 = 𝑏.


• Repeat steps 3 through 7B until |𝑏ℎ − 𝑏𝑙 | is small.
13. Given prices 𝑞 and 𝑝 from step 6, and the firm choices of 𝑘 and 𝑏 from step 7, compute the synthetic firm value:
𝑉𝑥 = −𝑘 + 𝑞 + 𝑝𝑏
• If 𝑉𝑥 > 𝑉 , then set 𝑉𝑙 = 𝑉 ; otherwise, set 𝑉ℎ = 𝑉 .
• Repeat steps 1 through 8 until |𝑉𝑥 − 𝑉 | is small.
14. Ultimately, the algorithm returns equilibrium capital 𝑘∗ , debt 𝑏∗ and firm value 𝑉 ∗ , as well as the following equi-
librium values:
• Equity holdings 𝜃1,∗ = 𝜃1 (𝑘∗ , 𝑏∗ )
• Prices 𝑞 ∗ = 𝑞(𝑘∗ , 𝑏∗ ), 𝑝∗ = 𝑝(𝑘∗ , 𝑏∗ )
• Consumption plans 𝐶01,∗ = 𝑐01 (𝑘∗ , 𝑏∗ ), 𝐶02,∗ = 𝑐02 (𝑘∗ , 𝑏∗ ), 𝐶11,∗ (𝜖) = 𝑐11 (𝑘∗ , 𝑏∗ ; 𝜖), 𝐶12,∗ (𝜖) = 𝑐12 (𝑘∗ , 𝑏∗ ; 𝜖).
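The nested structure of this algorithm (bisection on 𝑉 , then 𝑏, then 𝑘, with an inner search over 𝜃1 ) can be hard to follow in prose. Here is a minimal structural sketch of the loops, with the economic computations passed in as placeholder callables; the function and argument names are ours and are not part of the BCG_incomplete_markets class below.

def solve_equilibrium_sketch(consumer_side, kfoc_resid, bfoc_resid,
                             Vl=0.0, Vh=0.5, bbot=0.1, btop=0.8,
                             kbot=0.01, ktop=0.25, tol=1e-4):
    """Skeleton of the BCG algorithm: bisection on V, then b, then k.

    consumer_side(V, k, b) -> (q, p, θ1)   # placeholder for steps 6-9
    kfoc_resid(k, b, q, p) -> float        # placeholder for step 11
    bfoc_resid(k, b, q, p) -> float        # placeholder for step 12
    """
    V_crit = 1.0
    while V_crit > tol:                       # outer loop on firm value V
        V = (Vl + Vh) / 2
        bl, bh = bbot, btop
        while bh - bl > tol:                  # middle loop on debt b
            b = (bl + bh) / 2
            kl, kh = kbot, ktop
            while kh - kl > tol:              # inner loop on capital k
                k = (kl + kh) / 2
                q, p, θ1 = consumer_side(V, k, b)
                if kfoc_resid(k, b, q, p) > 0:
                    kl = k
                else:
                    kh = k
            if bfoc_resid(k, b, q, p) > 0:
                bh = b
            else:
                bl = b
        V_new = -k + q + p * b                # step 13: synthetic firm value
        if V_new > V:
            Vl = V
        else:
            Vh = V
        V_crit = abs(V_new - V)
    return k, b, V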


39.5 Code

We create a Python class BCG_incomplete_markets to compute the equilibrium allocations of the incomplete
market BCG model, given a set of parameter values.
The class includes the following methods, i.e., functions:
• solve_eq: solves the BCG model and returns the equilibrium values of capital 𝑘, debt 𝑏 and firm value 𝑉 , as
well as
– agent 1’s equity holdings 𝜃1,∗
– prices 𝑞 ∗ , 𝑝∗
– consumption plans 𝐶01,∗ , 𝐶02,∗ , 𝐶11,∗ (𝜖), 𝐶12,∗ (𝜖).
• eq_valuation: inputs equilibrium consumption plans 𝐶 ∗ and outputs the following valuations for each pair of
(𝑘, 𝑏) in the grid:
– the firm 𝑉 (𝑘, 𝑏)
– the equity 𝑞(𝑘, 𝑏)
– the bond 𝑝(𝑘, 𝑏).
Parameters include:
• 𝜒1 , 𝜒2 : correlation parameter for agent 1 and 2. Default values are respectively 0 and 0.9.
• 𝑤01 , 𝑤02 : initial endowments. Default values are respectively 0.9 and 1.1.
• 𝜃01 , 𝜃02 : initial holding of the firm. Default values are 0.5.
• 𝜓: risk parameter. Default value is 3.
• 𝛼: Production function parameter. Default value is 0.6.
• 𝐴: Productivity of the firm. Default value is 2.5.
• 𝜇, 𝜎: Mean and standard deviation of the shock distribution. Default values are respectively -0.025 and 0.4
• 𝛽: Discount factor. Default value is 0.96.
• bound: Bound for truncated normal distribution. Default value is 3.

import numpy as np
from scipy.stats import truncnorm
from scipy.integrate import quad
from numba import njit

class BCG_incomplete_markets:

# init method or constructor


def __init__(self,
𝜒1 = 0,
𝜒2 = 0.9,
w10 = 0.9,
w20 = 1.1,
𝜃10 = 0.5,
𝜃20 = 0.5,
𝜓1 = 3,
𝜓2 = 3,
𝛼 = 0.6,

A = 2.5,
𝜇 = -0.025,
𝜎 = 0.4,
𝛽 = 0.96,
bound = 3,
Vl = 0,
Vh = 0.5,
kbot = 0.01,
#ktop = ( *A)**(1/(1- )),
ktop = 0.25,
bbot = 0.1,
btop = 0.8):

#=========== Setup ===========#


# Risk parameters
self.𝜒1 = 𝜒1
self.𝜒2 = 𝜒2

# Other parameters
self.𝜓1 = 𝜓1
self.𝜓2 = 𝜓2
self.𝛼 = 𝛼
self.A = A
self.𝜇 = 𝜇
self.𝜎 = 𝜎
self.𝛽 = 𝛽
self.bound = bound

# Bounds for firm value, capital, and debt


self.Vl = Vl
self.Vh = Vh
self.kbot = kbot
#self.kbot = ( *A)**(1/(1- ))
self.ktop = ktop
self.bbot = bbot
self.btop = btop

# Utility
self.u = njit(lambda c: (c**(1-𝜓)) / (1-𝜓))

# Initial endowments
self.w10 = w10
self.w20 = w20
self.w0 = w10 + w20

# Initial holdings
self.𝜃10 = 𝜃10
self.𝜃20 = 𝜃20

# Endowments at t=1
self.w11 = njit(lambda 𝜖: np.exp(-𝜒1*𝜇 - 0.5*(𝜒1**2)*(𝜎**2) + 𝜒1*𝜖))
self.w21 = njit(lambda 𝜖: np.exp(-𝜒2*𝜇 - 0.5*(𝜒2**2)*(𝜎**2) + 𝜒2*𝜖))
self.w1 = njit(lambda 𝜖: self.w11(𝜖) + self.w21(𝜖))

# Truncated normal
ta, tb = (-bound - 𝜇) / 𝜎, (bound - 𝜇) / 𝜎


rv = truncnorm(ta, tb, loc=𝜇, scale=𝜎)
𝜖_range = np.linspace(ta, tb, 1000000)
pdf_range = rv.pdf(𝜖_range)
self.g = njit(lambda 𝜖: np.interp(𝜖, 𝜖_range, pdf_range))

#*************************************************************
# Function: Solve for equilibrium of the BCG model
#*************************************************************
def solve_eq(self, print_crit=True):

# Load parameters
𝜓1 = self.𝜓1
𝜓2 = self.𝜓2
𝛼 = self.𝛼
A = self.A
𝛽 = self.𝛽
bound = self.bound
Vl = self.Vl
Vh = self.Vh
kbot = self.kbot
ktop = self.ktop
bbot = self.bbot
btop = self.btop
w10 = self.w10
w20 = self.w20
𝜃10 = self.𝜃10
𝜃20 = self.𝜃20
w11 = self.w11
w21 = self.w21
g = self.g

# We need to find a fixed point on the value of the firm


V_crit = 1

Y = njit(lambda 𝜖, fk: np.exp(𝜖)*fk)


intqq1 = njit(lambda 𝜖, fk, 𝜃1, 𝜓1, b: (w11(𝜖) + 𝜃1*(Y(𝜖, fk) - b))**(-𝜓1)*(Y(𝜖, fk) - b)*g(𝜖))

intp1 = njit(lambda 𝜖, fk, 𝜓2, b: (Y(𝜖, fk)/b)*(w21(𝜖) + Y(𝜖, fk))**(-𝜓2)*g(𝜖))

intp2 = njit(lambda 𝜖, fk, 𝜃2, 𝜓2, b: (w21(𝜖) + 𝜃2*(Y(𝜖, fk)-b) + b)**(-𝜓2)*g(𝜖))

intqq2 = njit(lambda 𝜖, fk, 𝜃2, 𝜓2, b: (w21(𝜖) + 𝜃2*(Y(𝜖, fk)-b) + b)**(-𝜓2)*(Y(𝜖, fk) - b)*g(𝜖))

intk1 = njit(lambda 𝜖, fk, 𝜓2: (w21(𝜖) + Y(𝜖, fk))**(-𝜓2)*np.exp(𝜖)*g(𝜖))

intk2 = njit(lambda 𝜖, fk, 𝜃2, 𝜓2, b: (w21(𝜖) + 𝜃2*(Y(𝜖, fk)-b) + b)**(-𝜓2)*np.exp(𝜖)*g(𝜖))

intB1 = njit(lambda 𝜖, fk, 𝜃1, 𝜓1, b: (w11(𝜖) + 𝜃1*(Y(𝜖, fk) - b))**(-𝜓1)*g(𝜖))

intB2 = njit(lambda 𝜖, fk, 𝜃2, 𝜓2, b: (w21(𝜖) + 𝜃2*(Y(𝜖, fk) - b) + b)**(-𝜓2)*g(𝜖))

while V_crit>1e-4:

# We begin by adding the guess for the value of the firm to endowment
V = (Vl+Vh)/2


ww10 = w10 + 𝜃10*V
ww20 = w20 + 𝜃20*V

# Figure out the optimal level of debt


bl = bbot
bh = btop
b_crit=1

while b_crit>1e-5:

# Setting the conjecture for debt


b = (bl+bh)/2

# Figure out the optimal level of capital


kl = kbot
kh = ktop
k_crit=1

while k_crit>1e-5:

# Setting the conjecture for capital


k = (kl+kh)/2

# Production
fk = A*(k**𝛼)
# Y = lambda : np.exp( )*fk

# Compute integration threshold


epstar = np.log(b/fk)

#**************************************************************
# Compute the prices and allocations consistent with consumers'
# Euler equations
#**************************************************************

# We impose the following:


# Agent 1 buys equity
# Agent 2 buys equity and all debt
# Agents trade such that prices converge

#========
# Agent 1
#========
# Holdings
𝜉1 = 0
𝜃1a = 0.3
𝜃1b = 1

while abs(𝜃1b - 𝜃1a) > 0.001:

𝜃1 = (𝜃1a + 𝜃1b) / 2

# qq1 is the equity price consistent with agent-1 Euler Equation

## Note: Price is in the date-0 budget constraint of the agent


## First, compute the constant term that is not influenced by q
## that is, E[u'(c^{1}_{1})d^{e}(k,B)]

# intqq1 = lambda 𝜖: (w11(𝜖) + 𝜃1*(Y(𝜖, fk) - b))**(-𝜓1)*(Y(𝜖, fk) - b)*g(𝜖)
# const_qq1 = 𝛽 * quad(intqq1,epstar,bound)[0]

const_qq1 = 𝛽 * quad(intqq1,epstar,bound, args=(fk, 𝜃1, 𝜓1, b))[0]

## Second, iterate to get the equity price q


qq1l = 0
qq1h = ww10
diff = 1
while diff > 1e-7:
qq1 = (qq1l+qq1h)/2
rhs = const_qq1/((ww10-qq1*𝜃1)**(-𝜓1));
if (rhs > qq1):
qq1l = qq1
else:
qq1h = qq1
diff = abs(qq1l-qq1h)

#========
# Agent 2
#========
𝜉2 = b - 𝜉1
𝜃2 = 1 - 𝜃1

# p is the bond price consistent with agent-2 Euler Equation


## Note: Price is in the date-0 budget constraint of the agent

## First, compute the constant term that is not influenced by␣


p

## that is, E[u'(c^{2}_{1})d^{b}(k,B)]


# intp1 = lambda 𝜖: (Y(𝜖, fk)/b)*(w21(𝜖) + Y(𝜖, fk))**(-𝜓2)*g(𝜖)
# intp2 = lambda 𝜖: (w21(𝜖) + 𝜃2*(Y(𝜖, fk)-b) + b)**(-𝜓2)*g(𝜖)
# const_p = 𝛽 * (quad(intp1,-bound,epstar)[0] + quad(intp2,epstar,bound)[0])

const_p = 𝛽 * (quad(intp1,-bound,epstar, args=(fk, 𝜓2, b))[0]
               + quad(intp2,epstar,bound, args=(fk, 𝜃2, 𝜓2, b))[0])

## iterate to get the bond price p


pl = 0
ph = ww20/b
diff = 1
while diff > 1e-7:
p = (pl+ph)/2
rhs = const_p/((ww20-qq1*𝜃2-p*b)**(-𝜓2))
if (rhs > p):
pl = p
else:


ph = p
diff = abs(pl-ph)

# qq2 is the equity price consistent with agent-2 Euler Equation
# intqq2 = lambda 𝜖: (w21(𝜖) + 𝜃2*(Y(𝜖, fk)-b) + b)**(-𝜓2)*(Y(𝜖, fk) - b)*g(𝜖)
const_qq2 = 𝛽 * quad(intqq2,epstar,bound, args=(fk, 𝜃2, 𝜓2, b))[0]

qq2l = 0
qq2h = ww20
diff = 1
while diff > 1e-7:
qq2 = (qq2l+qq2h)/2
rhs = const_qq2/((ww20-qq2*𝜃2-p*b)**(-𝜓2));
if (rhs > qq2):
qq2l = qq2
else:
qq2h = qq2
diff = abs(qq2l-qq2h)

# q be the maximum valuation for the equity among agents


## This will be the equity price based on Makowski's criterion
q = max(qq1,qq2)

#================
# Update holdings
#================
if qq1 > qq2:
𝜃1a = 𝜃1
else:
𝜃1b = 𝜃1

#================
# Get consumption
#================
c10 = ww10 - q*𝜃1
c11 = lambda 𝜖: w11(𝜖) + 𝜃1*max(Y(𝜖, fk)-b,0)
c20 = ww20 - q*(1-𝜃1) - p*b
c21 = lambda 𝜖: w21(𝜖) + (1-𝜃1)*max(Y(𝜖, fk)-b,0) + min(Y(𝜖, fk), b)

#*************************************************
# Compute the first order conditions for the firm
#*************************************************

#===========
# Equity FOC
#===========
# Only agent 2's IMRS is relevant
# intk1 = lambda 𝜖: (w21(𝜖) + Y(𝜖, fk))**(-𝜓2)*np.exp(𝜖)*g(𝜖)
# intk2 = lambda 𝜖: (w21(𝜖) + 𝜃2*(Y(𝜖, fk)-b) + b)**(-𝜓2)*np.exp(𝜖)*g(𝜖)

# kfoc_num = quad(intk1,-bound,epstar)[0] + quad(intk2,epstar,bound)[0]


kfoc_num = quad(intk1,-bound,epstar, args=(fk, 𝜓2))[0] + quad(intk2,epstar,bound, args=(fk, 𝜃2, 𝜓2, b))[0]

kfoc_denom = (ww20- q*𝜃2 - p*b)**(-𝜓2)


kfoc = 𝛽*𝛼*A*(k**(𝛼-1))*(kfoc_num/kfoc_denom) - 1

if (kfoc > 0):


kl = k
else:
kh = k
k_crit = abs(kh-kl)

if print_crit:
print("critical value of k: {:.5f}".format(k_crit))

#=========
# Bond FOC
#=========
# intB1 = lambda 𝜖: (w11(𝜖) + 𝜃1*(Y(𝜖, fk) - b))**(-𝜓1)*g(𝜖)
# intB2 = lambda 𝜖: (w21(𝜖) + 𝜃2*(Y(𝜖, fk) - b) + b)**(-𝜓2)*g(𝜖)

# bfoc1 = quad(intB1,epstar,bound)[0] / (ww10 - q*𝜃1)**(-𝜓1)
# bfoc2 = quad(intB2,epstar,bound)[0] / (ww20 - q*𝜃2 - p*b)**(-𝜓2)

bfoc1 = quad(intB1,epstar,bound, args=(fk, 𝜃1, 𝜓1, b))[0] / (ww10 - q*𝜃1)**(-𝜓1)
bfoc2 = quad(intB2,epstar,bound, args=(fk, 𝜃2, 𝜓2, b))[0] / (ww20 - q*𝜃2 - p*b)**(-𝜓2)

bfoc = bfoc1 - bfoc2

if (bfoc > 0):


bh = b
else:
bl = b
b_crit = abs(bh-bl)

if print_crit:
print("#=== critical value of b: {:.5f}".format(b_crit))

# Compute the value of the firm


value_x = -k + q + p*b
if (value_x > V):
Vl = V
else:
Vh = V
V_crit = abs(value_x-V)

if print_crit:
print("#====== critical value of V: {:.5f}".format(V_crit))

print('k,b,p,q,kfoc,bfoc,epstar,V,V_crit')
formattedList = ["%.3f" % member for member in [k,
b,
p,
q,
kfoc,


bfoc,
epstar,
V,
V_crit]]
print(formattedList)

#*********************************
# Equilibrium values
#*********************************

# Return the results


kss = k
bss = b
Vss = V
qss = q
pss = p
c10ss = c10
c11ss = c11
c20ss = c20
c21ss = c21
𝜃1ss = 𝜃1

# Print the results


print('finished')
# print('k,b,p,q,kfoc,bfoc,epstar,V,V_crit')
#formattedList = ["%.3f" % member for member in [kss,
# bss,
# pss,
# qss,
# kfoc,
# bfoc,
# epstar,
# Vss,
# V_crit]]
#print(formattedList)

return kss,bss,Vss,qss,pss,c10ss,c11ss,c20ss,c21ss,𝜃1ss

#*************************************************************
# Function: Equity and bond valuations by different agents
#*************************************************************
def valuations_by_agent(self,
c10, c11, c20, c21,
k, b):

# Load parameters
𝜓1 = self.𝜓1
𝜓2 = self.𝜓2
𝛼 = self.𝛼
A = self.A
𝛽 = self.𝛽
bound = self.bound
Vl = self.Vl
Vh = self.Vh


kbot = self.kbot
ktop = self.ktop
bbot = self.bbot
btop = self.btop
w10 = self.w10
w20 = self.w20
𝜃10 = self.𝜃10
𝜃20 = self.𝜃20
w11 = self.w11
w21 = self.w21
g = self.g

# Get functions for IMRS/state price density


IMRS1 = lambda 𝜖: 𝛽 * (c11(𝜖)/c10)**(-𝜓1)*g(𝜖)
IMRS2 = lambda 𝜖: 𝛽 * (c21(𝜖)/c20)**(-𝜓2)*g(𝜖)

# Production
fk = A*(k**𝛼)
Y = lambda 𝜖: np.exp(𝜖)*fk

# Compute integration threshold


epstar = np.log(b/fk)

# Compute equity valuation with agent 1's IMRS


intQ1 = lambda 𝜖: IMRS1(𝜖)*(Y(𝜖) - b)
Q1 = quad(intQ1, epstar, bound)[0]

# Compute bond valuation with agent 1's IMRS


intP1 = lambda 𝜖: IMRS1(𝜖)*Y(𝜖)/b
P1 = quad(intP1, -bound, epstar)[0] + quad(IMRS1, epstar, bound)[0]

# Compute equity valuation with agent 2's IMRS


intQ2 = lambda 𝜖: IMRS2(𝜖)*(Y(𝜖) - b)
Q2 = quad(intQ2, epstar, bound)[0]

# Compute bond valuation with agent 2's IMRS


intP2 = lambda 𝜖: IMRS2(𝜖)*Y(𝜖)/b
P2 = quad(intP2, -bound, epstar)[0] + quad(IMRS2, epstar, bound)[0]

return Q1,Q2,P1,P2

#*************************************************************
# Function: equilibrium valuations for firm, equity, bond
#*************************************************************
def eq_valuation(self, c10, c11, c20, c21, N=30):

# Load parameters
𝜓1 = self.𝜓1
𝜓2 = self.𝜓2
𝛼 = self.𝛼
A = self.A
𝛽 = self.𝛽
bound = self.bound
Vl = self.Vl
Vh = self.Vh


kbot = self.kbot
ktop = self.ktop
bbot = self.bbot
btop = self.btop
w10 = self.w10
w20 = self.w20
𝜃10 = self.𝜃10
𝜃20 = self.𝜃20
w11 = self.w11
w21 = self.w21
g = self.g

# Create grids
kgrid, bgrid = np.meshgrid(np.linspace(kbot,ktop,N),
np.linspace(bbot,btop,N))
Vgrid = np.zeros_like(kgrid)
Qgrid = np.zeros_like(kgrid)
Pgrid = np.zeros_like(kgrid)

# Loop: firm value


for i in range(N):
for j in range(N):

# Get capital and debt


k = kgrid[i,j]
b = bgrid[i,j]

# Valuations by each agent


Q1,Q2,P1,P2 = self.valuations_by_agent(c10,
c11,
c20,
c21,
k,
b)

# The prices will be the maximum of the valuations


Q = max(Q1,Q2)
P = max(P1,P2)

# Compute firm value


V = -k + Q + P*b
Vgrid[i,j] = V
Qgrid[i,j] = Q
Pgrid[i,j] = P

return kgrid, bgrid, Vgrid, Qgrid, Pgrid


39.6 Examples

Below we show some examples computed with the class BCG_incomplete_markets.

39.6.1 First example

In the first example, we set up an instance of the BCG incomplete markets model with default parameter values.

mdl = BCG_incomplete_markets()
kss,bss,Vss,qss,pss,c10ss,c11ss,c20ss,c21ss,𝜃1ss = mdl.solve_eq(print_crit=False)

print(-kss+qss+pss*bss)
print(Vss)
print(𝜃1ss)

0.10073912888808995
0.100830078125
0.98564453125

Python reports to us that the equilibrium firm value is 𝑉 = 0.101, with capital 𝑘 = 0.151 and debt 𝑏 = 0.484.
Let’s verify some things that have to be true if our algorithm has truly found an equilibrium.
Thus, let’s see if the firm is actually maximizing its firm value given the equilibrium pricing function 𝑞(𝑘, 𝑏) for equity
and 𝑝(𝑘, 𝑏) for bonds.

kgrid, bgrid, Vgrid, Qgrid, Pgrid = mdl.eq_valuation(c10ss, c11ss, c20ss, c21ss,N=30)

print('Maximum valuation of the firm value in the (k,B) grid: {:.5f}'.format(Vgrid.max()))

print('Equilibrium firm value: {:.5f}'.format(Vss))

Maximum valuation of the firm value in the (k,B) grid: 0.10074


Equilibrium firm value: 0.10083

Up to the approximation involved in using a discrete grid, these numbers give us comfort that the firm does indeed seem
to be maximizing its value at the top of the value hill on the (𝑘, 𝑏) plane that it faces.
Below we will plot the firm’s value as a function of 𝑘, 𝑏.
We’ll also plot the equilibrium price functions 𝑞(𝑘, 𝑏) and 𝑝(𝑘, 𝑏).

from IPython.display import Image


import matplotlib.pyplot as plt
from mpl_toolkits import mplot3d
import plotly.graph_objs as go

# Firm Valuation
fig = go.Figure(data=[go.Scatter3d(x=[kss],
y=[bss],
z=[Vss],
mode='markers',
marker=dict(size=3, color='red')),

go.Surface(x=kgrid,
y=bgrid,
z=Vgrid,
colorscale='Greens',opacity=0.6)])

fig.update_layout(scene = dict(
xaxis_title='x - Capital k',
yaxis_title='y - Debt b',
zaxis_title='z - Firm Value V',
aspectratio = dict(x=1,y=1,z=1)),
width=700,
height=700,
margin=dict(l=50, r=50, b=65, t=90))
fig.update_layout(scene_camera=dict(eye=dict(x=1.5, y=-1.5, z=2)))
fig.update_layout(title='Equilibrium firm valuation for the grid of (k,b)')

# Export to PNG file


Image(fig.to_image(format="png"))
# fig.show() will provide interactive plot when running
# code locally


A Modigliani-Miller theorem?

The red dot in the above graph is both an equilibrium (𝑏, 𝑘) chosen by a representative firm and the equilibrium 𝐵, 𝐾
pair chosen by the aggregate of all firms.
Thus, in equilibrium it is true that

(𝑏, 𝑘) = (𝐵, 𝐾)

But an individual firm named 𝜁 ∈ [0, 1] neither knows nor cares whether it sets (𝑏(𝜁), 𝑘(𝜁)) = (𝐵, 𝐾).
Indeed the above graph has a ridge of 𝑏(𝜁)’s that also maximize the firm’s value so long as it sets 𝑘(𝜁) = 𝐾.
Here it is important that the measure of firms that deviate from setting 𝑏 at the red dot is very small – measure zero – so
that 𝐵 remains at the red dot even while one firm 𝜁 deviates.


So within this equilibrium, there is a qualified Modigliani-Miller theorem that asserts that firm 𝜁’s value is independent
of how it mixes its financing between equity and bonds (so long as it is not what other firms do on average).
Thus, while an individual firm 𝜁's financial structure is indeterminate, the market's financial structure is determinate and
sits at the red dot in the above graph.
This contrasts sharply with the unqualified Modigliani-Miller theorem described in the complete markets model in the
lecture Irrelevance of Capital Structure with Complete Markets.
There the market’s financial structure was indeterminate.
These subtle distinctions bear more thought and exploration.
So we will do some calculations to ferret out a sense in which the equilibrium (𝑘, 𝑏) = (𝐾, 𝐵) outcome at the red dot in
the above graph is stable.
In particular, we’ll explore the consequences of some choices of 𝑏 = 𝐵 that deviate from the red dot and ask whether
firm 𝜁 would want to remain at that 𝑏.
In more detail, here is what we’ll do:
1. Obtain equilibrium values of capital and debt as 𝑘∗ = 𝐾 and 𝑏∗ = 𝐵, the red dot above.
2. Now fix 𝑘∗ and let 𝑏∗∗ = 𝑏∗ − 𝑒 for some 𝑒 > 0. Conjecture that big 𝐾 = 𝑘∗ but big 𝐵 = 𝑏∗∗ .
3. Take 𝐾 and 𝐵 and compute intertemporal marginal rates of substitution (IMRS's) as we did before.
4. Take the new IMRS to the firm's problem and plot the 3D surface of firm valuations implied by this new IMRS.
5. Check if the value at 𝑘∗ , 𝑏∗∗ is at the top of this new 3D surface.
6. Repeat these calculations for 𝑏∗∗ = 𝑏∗ + 𝑒.
To conduct the above procedures, we create a function off_eq_check that inputs the BCG model instance parameters,
equilibrium capital 𝐾 = 𝑘∗ and debt 𝐵 = 𝑏∗ , and a perturbation of debt 𝑒.
The function outputs the fixed point firm values 𝑉 ∗∗ , prices 𝑞 ∗∗ , 𝑝∗∗ , and consumption choices 𝑐∗∗ .
Importantly, we relax the condition that only agent 2 holds bonds.
Now both agents can hold bonds, i.e., 0 ≤ 𝜉 1 ≤ 𝐵 and 𝜉 1 + 𝜉 2 = 𝐵.
That implies the consumers’ budget constraints are:

𝑐01 = 𝑤01 + 𝜃01 𝑉 − 𝑞𝜃1 − 𝑝𝜉 1
𝑐02 = 𝑤02 + 𝜃02 𝑉 − 𝑞𝜃2 − 𝑝𝜉 2
𝑐11 (𝜖) = 𝑤11 (𝜖) + 𝜃1 𝑑𝑒 (𝑘, 𝑏; 𝜖) + 𝜉 1 𝑑𝑏 (𝑘, 𝑏; 𝜖)
𝑐12 (𝜖) = 𝑤12 (𝜖) + 𝜃2 𝑑𝑒 (𝑘, 𝑏; 𝜖) + 𝜉 2 𝑑𝑏 (𝑘, 𝑏; 𝜖)
The function also outputs agent 1’s bond holdings 𝜉1 .

def off_eq_check(mdl,kss,bss,e=0.1):
# Big K and big B
k = kss
b = bss + e

# Load parameters
𝜓1 = mdl.𝜓1
𝜓2 = mdl.𝜓2
𝛼 = mdl.𝛼
A = mdl.A
𝛽 = mdl.𝛽
bound = mdl.bound

Vl = mdl.Vl
Vh = mdl.Vh
kbot = mdl.kbot
ktop = mdl.ktop
bbot = mdl.bbot
btop = mdl.btop
w10 = mdl.w10
w20 = mdl.w20
𝜃10 = mdl.𝜃10
𝜃20 = mdl.𝜃20
w11 = mdl.w11
w21 = mdl.w21
g = mdl.g

Y = njit(lambda 𝜖, fk: np.exp(𝜖)*fk)


intqq1 = njit(lambda 𝜖, fk, 𝜃1, 𝜓1, 𝜉1, b: (w11(𝜖) + 𝜃1*(Y(𝜖, fk) - b) + 𝜉1)**(-𝜓1)*(Y(𝜖, fk) - b)*g(𝜖))

intpp1a = njit(lambda 𝜖, fk, 𝜓1, 𝜉1, b: (Y(𝜖, fk)/b)*(w11(𝜖) + Y(𝜖, fk)/b*𝜉1)**(-𝜓1)*g(𝜖))

intpp1b = njit(lambda 𝜖, fk, 𝜃1, 𝜓1, 𝜉1, b: (w11(𝜖) + 𝜃1*(Y(𝜖, fk)-b) + 𝜉1)**(-𝜓1)*g(𝜖))

intpp2a = njit(lambda 𝜖, fk, 𝜓2, 𝜉2, b: (Y(𝜖, fk)/b)*(w21(𝜖) + Y(𝜖, fk)/b*𝜉2)**(-𝜓2)*g(𝜖))

intpp2b = njit(lambda 𝜖, fk, 𝜃2, 𝜓2, 𝜉2, b: (w21(𝜖) + 𝜃2*(Y(𝜖, fk)-b) + 𝜉2)**(-𝜓2)*g(𝜖))

intqq2 = njit(lambda 𝜖, fk, 𝜃2, 𝜓2, b: (w21(𝜖) + 𝜃2*(Y(𝜖, fk)-b) + b)**(-𝜓2)*(Y(𝜖, fk) - b)*g(𝜖))

# Loop: Find fixed points V, q and p


V_crit = 1
while V_crit>1e-5:

# We begin by adding the guess for the value of the firm to endowment
V = (Vl+Vh)/2
ww10 = w10 + 𝜃10*V
ww20 = w20 + 𝜃20*V

# Production
fk = A*(k**𝛼)
# Y = lambda : np.exp( )*fk

# Compute integration threshold


epstar = np.log(b/fk)

#**************************************************************
# Compute the prices and allocations consistent with consumers'
# Euler equations
#**************************************************************

# We impose the following:


# Agent 1 buys equity
# Agent 2 buys equity and all debt
# Agents trade such that prices converge


#========
# Agent 1
#========
# Holdings
𝜉1a = 0
𝜉1b = b/2
p = 0.3

while abs(𝜉1b - 𝜉1a) > 0.001:

𝜉1 = (𝜉1a + 𝜉1b) / 2
𝜃1a = 0.3
𝜃1b = 1

while abs(𝜃1b - 𝜃1a) > (0.001/b):

𝜃1 = (𝜃1a + 𝜃1b) / 2

# qq1 is the equity price consistent with agent-1 Euler Equation


## Note: Price is in the date-0 budget constraint of the agent

## First, compute the constant term that is not influenced by q


## that is, E[u'(c^{1}_{1})d^{e}(k,B)]
# intqq1 = lambda 𝜖: (w11(𝜖) + 𝜃1*(Y(𝜖, fk) - b) + 𝜉1)**(-𝜓1)*(Y(𝜖, fk) - b)*g(𝜖)
# const_qq1 = 𝛽 * quad(intqq1,epstar,bound)[0]
const_qq1 = 𝛽 * quad(intqq1,epstar,bound, args=(fk, 𝜃1, 𝜓1, 𝜉1, b))[0]

## Second, iterate to get the equity price q


qq1l = 0
qq1h = ww10
diff = 1
while diff > 1e-7:
qq1 = (qq1l+qq1h)/2
rhs = const_qq1/((ww10-qq1*𝜃1-p*𝜉1)**(-𝜓1));
if (rhs > qq1):
qq1l = qq1
else:
qq1h = qq1
diff = abs(qq1l-qq1h)

# pp1 is the bond price consistent with agent-2 Euler Equation


## Note: Price is in the date-0 budget constraint of the agent

## First, compute the constant term that is not influenced by p


## that is, E[u'(c^{1}_{1})d^{b}(k,B)]
# intpp1a = lambda 𝜖: (Y(𝜖, fk)/b)*(w11(𝜖) + Y(𝜖, fk)/b*𝜉1)**(-𝜓1)*g(𝜖)
# intpp1b = lambda 𝜖: (w11(𝜖) + 𝜃1*(Y(𝜖, fk)-b) + 𝜉1)**(-𝜓1)*g(𝜖)
# const_pp1 = 𝛽 * (quad(intpp1a,-bound,epstar)[0] + quad(intpp1b,epstar,bound)[0])

const_pp1 = 𝛽 * (quad(intpp1a,-bound,epstar, args=(fk, 𝜓1, 𝜉1, b))[0]
                 + quad(intpp1b,epstar,bound, args=(fk, 𝜃1, 𝜓1, 𝜉1, b))[0])


## iterate to get the bond price p
pp1l = 0
pp1h = ww10/b
diff = 1
while diff > 1e-7:
pp1 = (pp1l+pp1h)/2
rhs = const_pp1/((ww10-qq1*𝜃1-pp1*𝜉1)**(-𝜓1))
if (rhs > pp1):
pp1l = pp1
else:
pp1h = pp1
diff = abs(pp1l-pp1h)

#========
# Agent 2
#========
𝜉2 = b - 𝜉1
𝜃2 = 1 - 𝜃1

# pp2 is the bond price consistent with agent-2 Euler Equation


## Note: Price is in the date-0 budget constraint of the agent

## First, compute the constant term that is not influenced by p


## that is, E[u'(c^{2}_{1})d^{b}(k,B)]
# intpp2a = lambda 𝜖: (Y(𝜖, fk)/b)*(w21(𝜖) + Y(𝜖, fk)/b*𝜉2)**(-𝜓2)*g(𝜖)
# intpp2b = lambda 𝜖: (w21(𝜖) + 𝜃2*(Y(𝜖, fk)-b) + 𝜉2)**(-𝜓2)*g(𝜖)
# const_pp2 = 𝛽 * (quad(intpp2a,-bound,epstar)[0] + quad(intpp2b,epstar,bound)[0])

const_pp2 = 𝛽 * (quad(intpp2a,-bound,epstar, args=(fk, 𝜓2, 𝜉2, b))[0]
                 + quad(intpp2b,epstar,bound, args=(fk, 𝜃2, 𝜓2, 𝜉2, b))[0])

## iterate to get the bond price p


pp2l = 0
pp2h = ww20/b
diff = 1
while diff > 1e-7:
pp2 = (pp2l+pp2h)/2
rhs = const_pp2/((ww20-qq1*𝜃2-pp2*𝜉2)**(-𝜓2))
if (rhs > pp2):
pp2l = pp2
else:
pp2h = pp2
diff = abs(pp2l-pp2h)

# p be the maximum valuation for the bond among agents


## This will be the equity price based on Makowski's criterion
p = max(pp1,pp2)

# qq2 is the equity price consistent with agent-2 Euler Equation


# intqq2 = lambda : (w21( ) + 2*(Y( , fk)-b) + b)**(- 2)*(Y( , fk) -
↪ b)*g( )
# const_qq2 = * quad(intqq2,epstar,bound)[0]

(continues on next page)

742 Chapter 39. Equilibrium Capital Structures with Incomplete Markets


Advanced Quantitative Economics with Python

(continued from previous page)


const_qq2 = 𝛽 * quad(intqq2,epstar,bound, args=(fk, 𝜃2, 𝜓2, b))[0]
qq2l = 0
qq2h = ww20
diff = 1
while diff > 1e-7:
qq2 = (qq2l+qq2h)/2
rhs = const_qq2/((ww20-qq2*𝜃2-p*𝜉2)**(-𝜓2));
if (rhs > qq2):
qq2l = qq2
else:
qq2h = qq2
diff = abs(qq2l-qq2h)

# q be the maximum valuation for the equity among agents


## This will be the equity price based on Makowski's criterion
q = max(qq1,qq2)

#================
# Update holdings
#================
if qq1 > qq2:
𝜃1a = 𝜃1
else:
𝜃1b = 𝜃1

#print(p,q, 1, 1)

if pp1 > pp2:


𝜉1a = 𝜉1
else:
𝜉1b = 𝜉1

#================
# Get consumption
#================
c10 = ww10 - q*𝜃1 - p*𝜉1
c11 = lambda 𝜖: w11(𝜖) + 𝜃1*max(Y(𝜖, fk)-b,0) + 𝜉1*min(Y(𝜖, fk)/b,1)
c20 = ww20 - q*(1-𝜃1) - p*(b-𝜉1)
c21 = lambda 𝜖: w21(𝜖) + (1-𝜃1)*max(Y(𝜖, fk)-b,0) + (b-𝜉1)*min(Y(𝜖, fk)/b,1)

# Compute the value of the firm


value_x = -k + q + p*b
if (value_x > V):
Vl = V
else:
Vh = V
V_crit = abs(value_x-V)

return V,k,b,p,q,c10,c11,c20,c21,𝜉1

Here is our strategy for checking stability of an equilibrium.


We use off_eq_check to obtain consumption plans for both agents at the conjectured big 𝐾 and big 𝐵.
Then we input consumption plans into the function eq_valuation from the BCG model class and plot the agents’
valuations associated with different choices of 𝑘 and 𝑏.


Our hunch is that (𝑘∗ , 𝑏∗∗ ) is not at the top of the firm valuation 3D surface so that the firm is not maximizing its value
if it chooses 𝑘 = 𝐾 = 𝑘∗ and 𝑏 = 𝐵 = 𝑏∗∗ .
That indicates that (𝑘∗ , 𝑏∗∗ ) is not an equilibrium capital structure for the firm.
We first check the case in which 𝑏∗∗ = 𝑏∗ − 𝑒 where 𝑒 = 0.1:

#====================== Experiment 1 ======================#


Ve1,ke1,be1,pe1,qe1,c10e1,c11e1,c20e1,c21e1,𝜉1e1 = off_eq_check(mdl,
kss,
bss,
e=-0.1)

# Firm Valuation
kgride1, bgride1, Vgride1, Qgride1, Pgride1 = mdl.eq_valuation(c10e1, c11e1, c20e1, c21e1, N=20)

print('Maximum valuation of the firm value in the (k,b) grid: {:.4f}'.format(Vgride1.max()))

print('Equilibrium firm value: {:.4f}'.format(Ve1))

fig = go.Figure(data=[go.Scatter3d(x=[ke1],
y=[be1],
z=[Ve1],
mode='markers',
marker=dict(size=3, color='red')),
go.Surface(x=kgride1,
y=bgride1,
z=Vgride1,
colorscale='Greens',opacity=0.6)])

fig.update_layout(scene = dict(
xaxis_title='x - Capital k',
yaxis_title='y - Debt b',
zaxis_title='z - Firm Value V',
aspectratio = dict(x=1,y=1,z=1)),
width=700,
height=700,
margin=dict(l=50, r=50, b=65, t=90))
fig.update_layout(scene_camera=dict(eye=dict(x=1.5, y=-1.5, z=2)))
fig.update_layout(title='Equilibrium firm valuation for the grid of (k,b)')

# Export to PNG file


Image(fig.to_image(format="png"))
# fig.show() will provide interactive plot when running
# code locally

Maximum valuation of the firm value in the (k,b) grid: 0.1191


Equilibrium firm value: 0.1118


In the above 3D surface of prospective firm valuations, the perturbed choice (𝑘∗ , 𝑏∗ − 𝑒), represented by the red dot, is
not at the top.
The firm could issue more debt and attain a higher firm valuation from the market.
Therefore, (𝑘∗ , 𝑏∗ − 𝑒) would not be an equilibrium.
Next, we check for 𝑏∗∗ = 𝑏∗ + 𝑒.

#====================== Experiment 2 ======================#


Ve2,ke2,be2,pe2,qe2,c10e2,c11e2,c20e2,c21e2,𝜉1e2 = off_eq_check(mdl,
kss,
bss,
e=0.1)

# Firm Valuation
kgride2, bgride2, Vgride2, Qgride2, Pgride2 = mdl.eq_valuation(c10e2, c11e2, c20e2, c21e2, N=20)

print('Maximum valuation of the firm value in the (k,b) grid: {:.4f}'.format(Vgride2.max()))

print('Equilibrium firm value: {:.4f}'.format(Ve2))

fig = go.Figure(data=[go.Scatter3d(x=[ke2],
y=[be2],
z=[Ve2],
mode='markers',
marker=dict(size=3, color='red')),
go.Surface(x=kgride2,
y=bgride2,
z=Vgride2,
colorscale='Greens',opacity=0.6)])

fig.update_layout(scene = dict(
xaxis_title='x - Capital k',
yaxis_title='y - Debt b',
zaxis_title='z - Firm Value V',
aspectratio = dict(x=1,y=1,z=1)),
width=700,
height=700,
margin=dict(l=50, r=50, b=65, t=90))
fig.update_layout(scene_camera=dict(eye=dict(x=1.5, y=-1.5, z=2)))
fig.update_layout(title='Equilibrium firm valuation for the grid of (k,b)')

# Export to PNG file


Image(fig.to_image(format="png"))
# fig.show() will provide interactive plot when running
# code locally

Maximum valuation of the firm value in the (k,b) grid: 0.1082


Equilibrium firm value: 0.0974


In contrast to (𝑘∗ , 𝑏∗ − 𝑒), the 3D surface for (𝑘∗ , 𝑏∗ + 𝑒) now indicates that a firm would want to decrease its debt issuance
to attain a higher valuation.
That incentive to deviate means that (𝑘∗ , 𝑏∗ + 𝑒) is not an equilibrium capital structure for the firm.
Interestingly, if consumers were to anticipate that firms would over-issue debt, i.e. 𝐵 > 𝑏∗ , then both types of consumer
would want to hold corporate debt.
For example, 𝜉 1 > 0:

print('Bond holdings of agent 1: {:.3f}'.format(𝜉1e2))

Bond holdings of agent 1: 0.039

Our two stability experiments suggest that the equilibrium capital structure (𝑘∗ , 𝑏∗ ) is locally unique even though at the
equilibrium an individual firm would be willing to deviate from the representative firms’ equilibrium debt choice.


These experiments thus refine our discussion of the qualified Modigliani-Miller theorem that prevails in this example
economy.

Equilibrium equity and bond price functions

It is also interesting to look at the equilibrium price functions 𝑞(𝑘, 𝑏) and 𝑝(𝑘, 𝑏) faced by firms in our rational expectations
equilibrium.

# Equity Valuation
fig = go.Figure(data=[go.Scatter3d(x=[kss],
y=[bss],
z=[qss],
mode='markers',
marker=dict(size=3, color='red')),
go.Surface(x=kgrid,
y=bgrid,
z=Qgrid,
colorscale='Blues',opacity=0.6)])

fig.update_layout(scene = dict(
xaxis_title='x - Capital k',
yaxis_title='y - Debt b',
zaxis_title='z - Equity price q',
aspectratio = dict(x=1,y=1,z=1)),
width=700,
height=700,
margin=dict(l=50, r=50, b=65, t=90))
fig.update_layout(scene_camera=dict(eye=dict(x=1.5, y=-1.5, z=2)))
fig.update_layout(title='Equilibrium equity valuation for the grid of (k,b)')

# Export to PNG file


Image(fig.to_image(format="png"))
# fig.show() will provide interactive plot when running
# code locally


# Bond Valuation
fig = go.Figure(data=[go.Scatter3d(x=[kss],
y=[bss],
z=[pss],
mode='markers',
marker=dict(size=3, color='red')),
go.Surface(x=kgrid,
y=bgrid,
z=Pgrid,
colorscale='Oranges',opacity=0.6)])

fig.update_layout(scene = dict(
xaxis_title='x - Capital k',
yaxis_title='y - Debt b',
zaxis_title='z - Bond price p',
aspectratio = dict(x=1,y=1,z=1)),
width=700,
height=700,
margin=dict(l=50, r=50, b=65, t=90))
fig.update_layout(scene_camera=dict(eye=dict(x=1.5, y=-1.5, z=2)))
fig.update_layout(title='Equilibrium bond valuation for the grid of (k,b)')

# Export to PNG file


Image(fig.to_image(format="png"))
# fig.show() will provide interactive plot when running
# code locally


39.6.2 Comments on equilibrium pricing functions

The equilibrium pricing functions displayed above merit study and reflection.
They reveal the countervailing effects on a firm’s valuations of bonds and equities that lie beneath the Modigliani-Miller
ridge apparent in our earlier graph of an individual firm 𝜁’s value as a function of 𝑘(𝜁), 𝑏(𝜁).

39.6.3 Another example economy

We illustrate how the fraction of initial endowments held by agent 2, $w_0^2/(w_0^1 + w_0^2)$, affects an equilibrium capital structure $(k, b) = (K, B)$ as well as associated equilibrium allocations.

We are interested in how agents 1 and 2 value equity and bonds.

$$
Q^i = \beta \int \frac{u'(C_1^{i,*}(\epsilon))}{u'(C_0^{i,*})} \, d^e(k^*, b^*; \epsilon) \, g(\epsilon) \, d\epsilon
$$

$$
P^i = \beta \int \frac{u'(C_1^{i,*}(\epsilon))}{u'(C_0^{i,*})} \, d^b(k^*, b^*; \epsilon) \, g(\epsilon) \, d\epsilon
$$

The function valuations_by_agent is used in calculating these valuations.
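As an aside, a valuation of this type can also be approximated directly by numerical quadrature once a consumption plan is in hand. The following is a minimal sketch under assumptions, not a reproduction of the internals of valuations_by_agent: the payoff function de, the density g, the default threshold epstar, the truncation point bound, and the CRRA utility are stand-ins for the corresponding objects defined earlier in the lecture.

import numpy as np
from scipy.integrate import quad

def equity_valuation_sketch(c0, c1, de, g, 𝛽, 𝜎, epstar, bound):
    """
    Approximate Q^i = 𝛽 ∫ u'(c1(𝜖))/u'(c0) d^e(𝜖) g(𝜖) d𝜖 by quadrature.

    c0      : scalar date-0 consumption
    c1      : function 𝜖 -> date-1 consumption
    de      : function 𝜖 -> equity payoff d^e(k*, b*; 𝜖)
    g       : density of 𝜖
    𝛽, 𝜎    : discount factor and CRRA coefficient
    epstar  : default threshold (equity pays zero below it)
    bound   : truncation point for the integral
    """
    uprime = lambda c: c**(-𝜎)                      # CRRA marginal utility
    integrand = lambda 𝜖: uprime(c1(𝜖)) / uprime(c0) * de(𝜖) * g(𝜖)
    # Equity pays only in no-default states, i.e. for 𝜖 ≥ epstar
    return 𝛽 * quad(integrand, epstar, bound)[0]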

# Lists for storage


wlist = []
klist = []
blist = []
qlist = []
plist = []
Vlist = []
tlist = []
q1list = []
q2list = []
p1list = []
p2list = []

# For loop: optimization for each endowment combination


for i in range(10):
    print(i)

    # Save fraction
    w10 = 0.9 - 0.05*i
    w20 = 1.1 + 0.05*i
    wlist.append(w20/(w10+w20))

    # Create the instance
    mdl = BCG_incomplete_markets(w10 = w10, w20 = w20, ktop = 0.5, btop = 2.5)

    # Solve for equilibrium
    kss,bss,Vss,qss,pss,c10ss,c11ss,c20ss,c21ss,𝜃1ss = mdl.solve_eq(print_crit=False)

    # Store the equilibrium results
    klist.append(kss)
    blist.append(bss)
    qlist.append(qss)
    plist.append(pss)
    Vlist.append(Vss)
    tlist.append(𝜃1ss)

    # Evaluations of equity and bond by each agent
    Q1,Q2,P1,P2 = mdl.valuations_by_agent(c10ss, c11ss, c20ss, c21ss, kss, bss)

    # Save the valuations
    q1list.append(Q1)
    q2list.append(Q2)
    p1list.append(P1)
    p2list.append(P2)

# Plot
fig, ax = plt.subplots(3,2,figsize=(12,12))
ax[0,0].plot(wlist,klist)
ax[0,0].set_title('capital')
ax[0,1].plot(wlist,blist)
ax[0,1].set_title('debt')
ax[1,0].plot(wlist,qlist)
ax[1,0].set_title('equity price')
ax[1,1].plot(wlist,plist)
ax[1,1].set_title('bond price')
ax[2,0].plot(wlist,Vlist)
ax[2,0].set_title('firm value')
ax[2,0].set_xlabel('fraction of initial endowment held by agent 2',fontsize=13)

# Create a list of Default thresholds


A = mdl.A
𝛼 = mdl.𝛼
epslist = []
for i in range(len(wlist)):
    bb = blist[i]
    kk = klist[i]
    eps = np.log(bb/(A*kk**𝛼))
    epslist.append(eps)

# Plot (cont.)
ax[2,1].plot(wlist,epslist)
ax[2,1].set_title(r'default threshold $\epsilon^*$')
ax[2,1].set_xlabel('fraction of initial endowment held by agent 2',fontsize=13)
plt.show()


39.7 A picture worth a thousand words

Please stare at the above panels.


They describe how equilibrium prices and quantities respond to alterations in the structure of society’s hedging desires
across economies with different allocations of the initial endowment to our two types of agents.
Now let’s see how the two types of agents value bonds and equities, keeping in mind that the type that values the asset
highest determines the equilibrium price (and thus the pertinent set of Big 𝐶’s).

# Comparing the prices


fig, ax = plt.subplots(1,3,figsize=(16,6))

ax[0].plot(wlist,q1list,label='agent 1',color='green')
ax[0].plot(wlist,q2list,label='agent 2',color='blue')
ax[0].plot(wlist,qlist,label='equity price',color='red',linestyle='--')
ax[0].legend()
ax[0].set_title('equity valuations')
ax[0].set_xlabel('fraction of initial endowment held by agent 2',fontsize=11)

ax[1].plot(wlist,p1list,label='agent 1',color='green')
ax[1].plot(wlist,p2list,label='agent 2',color='blue')
ax[1].plot(wlist,plist,label='bond price',color='red',linestyle='--')
ax[1].legend()
ax[1].set_title('bond valuations')
ax[1].set_xlabel('fraction of initial endowment held by agent 2',fontsize=11)

ax[2].plot(wlist,tlist,color='blue')
ax[2].set_title('equity holdings by agent 1')
ax[2].set_xlabel('fraction of initial endowment held by agent 2',fontsize=11)

plt.show()

It is rewarding to stare at the above plots too.


In equilibrium, equity valuations are the same across the two types of agents but bond valuations are not.
Agents of type 2 value bonds more highly (they want more hedging).
Taken together with our earlier plot of equity holdings, these graphs confirm our earlier conjecture that while both types
of agents hold equities, only agents of type 2 hold bonds.



Part VIII

Dynamic Programming Squared

CHAPTER

FORTY

OPTIMAL UNEMPLOYMENT INSURANCE

40.1 Overview

This lecture describes a model of optimal unemployment insurance created by Shavell and Weiss (1979) [Shavell and
Weiss, 1979].
We use recursive techniques of Hopenhayn and Nicolini (1997) [Hopenhayn and Nicolini, 1997] to compute optimal
insurance plans for Shavell and Weiss’s model.
Hopenhayn and Nicolini’s model is a generalization of Shavell and Weiss’s along dimensions that we’ll soon describe.

40.2 Shavell and Weiss’s Model

An unemployed worker orders stochastic processes of consumption and search effort $\{c_t, a_t\}_{t=0}^{\infty}$ according to

$$
E \sum_{t=0}^{\infty} \beta^t \left[ u(c_t) - a_t \right] \qquad (40.1)
$$

where 𝛽 ∈ (0, 1) and 𝑢(𝑐) is strictly increasing, twice differentiable, and strictly concave.
We assume that 𝑢(0) is well defined.
We require that 𝑐𝑡 ≥ 0 and 𝑎𝑡 ≥ 0.
All jobs are alike and pay wage 𝑤 > 0 units of the consumption good each period forever.
An unemployed worker searches with effort 𝑎 and with probability 𝑝(𝑎) receives a permanent job at the beginning of the
next period.
Furthermore, 𝑎 = 0 when the worker is employed.
The probability of finding a job is 𝑝(𝑎).
𝑝 is an increasing, strictly concave, and twice differentiable function of 𝑎 that satisfies 𝑝(𝑎) ∈ [0, 1] for 𝑎 ≥ 0, 𝑝(0) = 0.

Note: When we compute examples below, we'll assume the same 𝑝(𝑎) function as [Hopenhayn and Nicolini, 1997],
namely, 𝑝(𝑎) = 1 − exp(−𝑟𝑎), where 𝑟 is a parameter that we’ll calibrate to hit the same target that [Hopenhayn and
Nicolini, 1997] did, namely, an empirical hazard rate of leaving unemployment.

The consumption good is nonstorable.


An unemployed worker has no savings and cannot borrow or lend.


The unemployed worker’s only source of consumption smoothing over time and across states is an insurance agency or
planner.
Once a worker has found a job, he is beyond the planner’s grasp.
• This is Shavell and Weiss’s assumption, but not Hopenhayn and Nicolini’s.
• Hopenhayn and Nicolini allow the unemployment insurance agency to impose history-dependent taxes on previ-
ously unemployed workers.
• Since there is no incentive problem after the worker has found a job, it is optimal for the agency to provide an
employed worker with a constant level of consumption.
• Hence, Hopenhayn and Nicolini’s insurance agency imposes a permanent per-period history-dependent tax on a
previously unemployed but presently employed worker.

40.2.1 Autarky

As a benchmark, we first study the fate of an unemployed worker who has no access to unemployment insurance.
Because employment is an absorbing state for the worker, we work backward from that state.
Let 𝑉 𝑒 be the expected sum of discounted one-period utilities of an employed worker.
Once the worker is employed, 𝑎 = 0, making his period utility be 𝑢(𝑐) − 𝑎 = 𝑢(𝑤) forever.
Therefore,
$$
V^e = \frac{u(w)}{1 - \beta}. \qquad (40.2)
$$

Now let 𝑉 𝑢 be the expected discounted present value of utility for an unemployed worker who chooses consumption,
effort pair (𝑐, 𝑎) optimally.
Value 𝑉 𝑢 satisfies the Bellman equation

$$
V^u = \max_{a \ge 0} \left\{ u(0) - a + \beta \left[ p(a) V^e + (1 - p(a)) V^u \right] \right\}. \qquad (40.3)
$$

The first-order condition for a maximum is

$$
\beta p'(a) \left[ V^e - V^u \right] \le 1, \qquad (40.4)
$$

with equality if 𝑎 > 0.


Since there is no state variable in this infinite horizon problem, there is a time-invariant optimal search intensity 𝑎 and an
associated value of being unemployed 𝑉 𝑢 .
Let 𝑉aut = 𝑉 𝑢 solve Bellman equation (40.3).
Equations (40.3) and (40.4) form the basis for an iterative algorithm for computing $V^u = V_{\text{aut}}$ (a minimal sketch follows the list below).
• Let $V_j^u$ be the estimate of $V_{\text{aut}}$ at the $j$th iteration.
• Use this value in equation (40.4) and solve for an estimate of effort $a_j$.
• Use this value in a version of equation (40.3) with $V_j^u$ on the right side to compute $V_{j+1}^u$.
• Iterate to convergence.
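The following is a minimal sketch of that iteration, for concreteness only; it is not the code used later in the lecture. The values of β, σ, and w match the class defaults used below, while r is an arbitrary placeholder rather than the calibrated value, and the closed form for effort exploits the Hopenhayn–Nicolini functional form 𝑝(𝑎) = 1 − exp(−𝑟𝑎).

import numpy as np

# Illustrative primitives (β, σ, w follow the parametrization used later; r is a placeholder)
β, σ, w, r = 0.999, 0.5, 100, 3e-4
u = lambda c: c**(1 - σ) / (1 - σ)
p = lambda a: 1 - np.exp(-r * a)              # job-finding probability
Ve = u(w) / (1 - β)                           # value of employment, eq. (40.2)

V = Ve - 1_000                                # initial guess for V_aut
for j in range(20_000):
    # Solve the FOC (40.4) for effort; for p(a) = 1 - exp(-ra) the solution
    # is a = max{0, log(rβ(Ve - V))/r}, cf. the formula in (40.11)
    x = r * β * (Ve - V)
    a = np.log(x) / r if x > 1 else 0.0
    # Update V from the right side of the Bellman equation (40.3)
    V_new = u(0) - a + β * (p(a) * Ve + (1 - p(a)) * V)
    if abs(V_new - V) < 1e-8:
        break
    V = V_new

print(f"V_aut ≈ {V_new:.1f}, search effort a ≈ {a:.1f}, hazard p(a) ≈ {p(a):.3f}")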


40.2.2 Full Information

Another benchmark model helps set the stage for the model with private information that we ultimately want to study.
We temporarily assume that an unemployment insurance agency has full information about the unemployed worker.
We assume that the insurance agency can control both the consumption and the search effort of an unemployed worker.
The agency wants to design an unemployment insurance contract to give the unemployed worker expected discounted
utility 𝑉 > 𝑉aut .
The agency, i.e., the planner, wants to deliver value 𝑉 efficiently, meaning in a way that minimizes the expected discounted
present value of costs, using 𝛽 as the discount factor.
We formulate the optimal insurance problem recursively.
Let 𝐶(𝑉 ) be the expected discounted cost of giving the worker expected discounted utility 𝑉 .
The cost function is strictly convex because a higher 𝑉 implies a lower marginal utility of the worker; that is, additional
expected utils can be awarded to the worker only at an increasing marginal cost in terms of the consumption good.
Given 𝑉 , the planner assigns first-period pair (𝑐, 𝑎) and promised continuation value 𝑉 𝑢 next period if the worker is
unlucky and does not find a job this period.
The planner sets (𝑐, 𝑎, 𝑉 𝑢 ) as functions of 𝑉 and to satisfy the following Bellman equation for associated cost function
𝐶(𝑉 ):

$$
C(V) = \min_{c, a, V^u} \left\{ c + \beta [1 - p(a)] C(V^u) \right\}, \qquad (40.5)
$$

where minimization is subject to the promise-keeping constraint

$$
V \le u(c) - a + \beta \left\{ p(a) V^e + [1 - p(a)] V^u \right\}. \qquad (40.6)
$$

Here 𝑉 𝑒 is given by equation (40.2), which reflects the assumption that once the worker is employed, he is beyond the
reach of the unemployment insurance agency.
The right side of Bellman equation (40.5) is attained by policy functions 𝑐 = 𝑐(𝑉 ), 𝑎 = 𝑎(𝑉 ), and 𝑉 𝑢 = 𝑉 𝑢 (𝑉 ).
The promise-keeping constraint, equation (40.6), asserts that the 3-tuple (𝑐, 𝑎, 𝑉 𝑢 ) attains at least 𝑉 .
Let 𝜃 be a Lagrange multiplier on constraint (40.6).
At an interior solution, first-order conditions with respect to 𝑐, 𝑎, and 𝑉 𝑢 , respectively, are
$$
\begin{aligned}
\theta &= \frac{1}{u'(c)}, \\
C(V^u) &= \theta \left[ \frac{1}{\beta p'(a)} - (V^e - V^u) \right], \\
C'(V^u) &= \theta. \qquad (40.7)
\end{aligned}
$$

The envelope condition 𝐶 ′ (𝑉 ) = 𝜃 and the third equation of (40.7) imply that 𝐶 ′ (𝑉 𝑢 ) = 𝐶 ′ (𝑉 ).
Strict convexity of 𝐶 then implies that 𝑉 𝑢 = 𝑉 .
Applied repeatedly over time, 𝑉 𝑢 = 𝑉 makes the continuation value remain constant during the entire spell of unem-
ployment.
The first equation of (40.7) determines 𝑐, and the second equation of (40.7) determines 𝑎, both as functions of promised
value 𝑉 .
That 𝑉 𝑢 = 𝑉 then implies that 𝑐 and 𝑎 are held constant during the unemployment spell.


Thus, the unemployed worker’s consumption 𝑐 and search effort 𝑎 are both fully smoothed during the unemployment
spell.
But the worker’s consumption is not smoothed across states of employment and unemployment unless 𝑉 = 𝑉 𝑒 .

40.2.3 Incentive Problem

The preceding efficient insurance scheme assumes that the insurance agency controls both 𝑐 and 𝑎.
The insurance agency cannot simply provide 𝑐 and then allow the worker to choose 𝑎.
Here is why.
The agency delivers a value 𝑉 𝑢 higher than the autarky value 𝑉aut by doing two things.
It increases the unemployed worker’s consumption 𝑐 and decreases his search effort 𝑎.
The prescribed search effort is higher than what the worker would choose if he were to be guaranteed consumption level
𝑐 while he remains unemployed.
This follows from the first two equations of (40.7) and the fact that the insurance scheme is costly, 𝐶(𝑉 𝑢 ) > 0, which
imply [𝛽𝑝′ (𝑎)]−1 > (𝑉 𝑒 − 𝑉 𝑢 ).
Now look at the worker’s first-order condition (40.4) under autarky.
It implies that if search effort 𝑎 > 0, then [𝛽𝑝′ (𝑎)]−1 = [𝑉 𝑒 −𝑉 𝑢 ], which is inconsistent with the inequality [𝛽𝑝′ (𝑎)]−1 >
(𝑉 𝑒 − 𝑉 𝑢 ) that prevails when 𝑎 > 0 when the agency controls both 𝑎 and 𝑐.
If he were free to choose 𝑎, the worker would therefore want to fulfill (40.4), either at equality so long as 𝑎 > 0, or by
setting 𝑎 = 0 otherwise.
Starting from the 𝑎 associated with the full-information social insurance scheme in which the agency controls both 𝑐 and
𝑎, the worker would establish the desired equality in (40.4) by lowering 𝑎, thereby decreasing the term [𝛽𝑝′ (𝑎)]−1 (which
also lowers (𝑉 𝑒 − 𝑉 𝑢 ) when the value of being unemployed 𝑉 𝑢 increases).
If an equality can be established before 𝑎 reaches zero, this would be the worker’s preferred search effort; otherwise the
worker would find it optimal to accept the insurance payment, set 𝑎 = 0, and never work again.
Thus, since the worker does not take the cost of the insurance scheme into account, he would choose a search effort below
the socially optimal, full-information level.
The full-information contract thus relies on the agency’s ability to control both the unemployed worker’s consumption and
his search effort.

40.3 Private Information

Following [Shavell and Weiss, 1979] and [Hopenhayn and Nicolini, 1997], now assume that the unemployment insurance
agency cannot observe or control 𝑎, though it can observe and control 𝑐.
The worker is free to choose 𝑎, which puts expression (40.4), the worker’s first-order condition under autarky, back in
the picture.
• We are assuming that the worker’s best response to the unemployment insurance arrangement is completely char-
acterized by the first-order condition (40.4), an instance of the so-called first-order approach to incentive problems.
Given a contract, the individual will choose search effort according to first-order condition (40.4).
This fact motivates the insurance agency to design an unemployment insurance contract that respects this restriction.
Thus, the contract design problem is now to minimize the right side of equation (40.5) subject to expression (40.6) and
the incentive constraint (40.4).


Since the restrictions (40.4) and (40.6) are not linear and generally do not define a convex set, it becomes challenging to
provide conditions under which the solution to the dynamic programming problem results in a convex function 𝐶(𝑉 ).
• Sometimes this complication can be handled by convexifying the constraint set by introducing lotteries.
• A common finding is that optimal plans do not involve lotteries, because convexity of the constraint set is a sufficient
but not necessary condition for convexity of the cost function.
• In order to characterize the optimal solution, we follow Hopenhayn and Nicolini (1997) [Hopenhayn and Nicolini,
1997] by hopefully proceeding under the assumption that 𝐶(𝑉 ) is strictly convex.
Let 𝜂 be the multiplier on constraint (40.4), while 𝜃 continues to denote the multiplier on constraint (40.6).
But now we replace the weak inequality in (40.6) by an equality.
• We do this because the unemployment insurance agency cannot award a higher utility than 𝑉 because that might
violate an incentive-compatibility constraint for exerting the proper search effort in earlier periods.
At an interior solution, first-order conditions with respect to 𝑐, 𝑎, and 𝑉 𝑢 , respectively, are
$$
\begin{aligned}
\theta &= \frac{1}{u'(c)}, \\
C(V^u) &= \theta \left[ \frac{1}{\beta p'(a)} - (V^e - V^u) \right] - \eta \frac{p''(a)}{p'(a)} (V^e - V^u) \\
       &= -\eta \frac{p''(a)}{p'(a)} (V^e - V^u), \\
C'(V^u) &= \theta - \eta \frac{p'(a)}{1 - p(a)}, \qquad (40.8)
\end{aligned}
$$

where the second equality in the second equation in (40.8) follows from strict equality of the incentive constraint (40.4)
when 𝑎 > 0.
As long as the insurance scheme is associated with costs, so that 𝐶(𝑉 𝑢 ) > 0, the first-order condition in the second
equation of (40.8) implies that the multiplier 𝜂 is strictly positive.
The third equation of (40.8) and the envelope condition 𝐶′(𝑉) = 𝜃
together allow us to conclude that 𝐶′(𝑉 𝑢) < 𝐶′(𝑉).
Convexity of 𝐶 then implies that 𝑉 𝑢 < 𝑉 .
After we have also used the first equation of (40.8), it follows that in order to provide the proper incentives, the consump-
tion of the unemployed worker must decrease as the duration of the unemployment spell lengthens.
It also follows from (40.4) at equality that search effort 𝑎 rises as 𝑉 𝑢 falls, i.e., it rises with the duration of unemployment.
The dependence of benefits on the duration of unemployment is designed to provide the worker an incentive to search.
To understand this, from the third equation of (40.8), notice how the conclusion that consumption falls with the duration of
unemployment depends on the assumption that more search effort raises the prospect of finding a job, i.e., that 𝑝′ (𝑎) > 0.
If 𝑝′ (𝑎) = 0, then the third equation of (40.8) and the strict convexity of 𝐶 imply that 𝑉 𝑢 = 𝑉 .
Thus, when 𝑝′ (𝑎) = 0, there is no reason for the planner to make consumption fall with the duration of unemployment.


40.3.1 Computational Details

It is useful to note that there are natural lower and upper bounds to the set of continuation values 𝑉 𝑢 .
The lower bound is the expected lifetime utility in autarky, 𝑉aut .
To compute an upper bound, represent condition (40.4) as

$$
V^u \ge V^e - [\beta p'(a)]^{-1},
$$

with equality if 𝑎 > 0.


If there is zero search effort, then $V^u \ge V^e - [\beta p'(0)]^{-1}$.
Therefore, to rule out zero search effort we require

$$
V^u < V^e - [\beta p'(0)]^{-1}.
$$

(Remember that 𝑝″ (𝑎) < 0.)


This step gives our upper bound for 𝑉 𝑢 .
To formulate the Bellman equation numerically, we suggest using the constraints to eliminate 𝑐 and 𝑎 as choice variables,
thereby reducing the Bellman equation to a minimization over the one choice variable 𝑉 𝑢 .
First express the promise-keeping constraint (40.6) at equality as

$$
u(c) = V + a - \beta \left\{ p(a) V^e + [1 - p(a)] V^u \right\}
$$

so that consumption is

$$
c = u^{-1} \left( V + a - \beta \left[ p(a) V^e + (1 - p(a)) V^u \right] \right). \qquad (40.9)
$$

Similarly, solving the inequality (40.4) for $a$ leads to

$$
a = \max \left\{ 0, \; p'^{-1} \left( \frac{1}{\beta (V^e - V^u)} \right) \right\}. \qquad (40.10)
$$

When we specialize (40.10) to the functional form for $p(a)$ used by Hopenhayn and Nicolini, we obtain

$$
a = \max \left\{ 0, \; \frac{\log[r \beta (V^e - V^u)]}{r} \right\}. \qquad (40.11)
$$

Formulas (40.9) and (40.11) express $(c, a)$ as functions of $V$ and the continuation value $V^u$.

Using these functions allows us to write the Bellman equation in $C(V)$ as

$$
C(V) = \min_{V^u} \left\{ c + \beta [1 - p(a)] C(V^u) \right\} \qquad (40.12)
$$

where $c$ and $a$ are given by equations (40.9) and (40.11).

40.3.2 Python Computations

We’ll approximate the planner’s optimal cost function with cubic splines.
To do this, we’ll load some useful modules

import numpy as np
import scipy as sp
import matplotlib.pyplot as plt


We first create a class to set up a particular parametrization.

class params_instance:

    def __init__(self,
                 r,
                 β = 0.999,
                 σ = 0.500,
                 w = 100,
                 n_grid = 50):

        self.β,self.σ,self.w,self.r = β,σ,w,r
        self.n_grid = n_grid
        uw = self.w**(1-self.σ)/(1-self.σ)  # Utility from consuming all wage
        self.Ve = uw/(1-β)

40.3.3 Parameter Values

For the other parameters appearing in the above Python code, we'll calibrate the parameter 𝑟 that pins down the function
𝑝(𝑎) = 1 − exp(−𝑟𝑎) to match an observed hazard rate – the probability that an unemployed worker finds a job each period –
in US data.
In particular, we seek an 𝑟 so that in autarky p(a(r)) = 0.1, where a is the optimal search effort.
First, we create some helper functions.

# The probability of finding a job given search effort, a and parameter r.
def p(a,r):
    return 1-np.exp(-r*a)

def invp_prime(x,r):
    return -np.log(x/r)/r

def p_prime(a,r):
    return r*np.exp(-r*a)

# The utility function
def u(self,c):
    return (c**(1-self.σ))/(1-self.σ)

def u_inv(self,x):
    return ((1-self.σ)*x)**(1/(1-self.σ))

Recall that under autarky the value for an unemployed worker satisfies the Bellman equation

$$
V^u = \max_{a} \left\{ u(0) - a + \beta \left[ p_r(a) V^e + (1 - p_r(a)) V^u \right] \right\} \qquad (40.13)
$$

At the optimal choice of $a$, we have the first-order necessary condition

$$
\beta p_r'(a) [V^e - V^u] \le 1 \qquad (40.14)
$$

with equality when $a > 0$.


Given a value of the parameter $\bar r$, we can solve the autarky problem as follows:
1. Guess 𝑉 𝑢 ∈ ℝ+
2. Given 𝑉 𝑢 , use the FOC (40.14) to calculate the implied optimal search effort 𝑎


3. Evaluate the difference between the LHS and RHS of the Bellman equation (40.13)
4. Update guess for 𝑉 𝑢 accordingly, then return to 2) and repeat until the Bellman equation is satisfied.
For a given 𝑟 and guess 𝑉 𝑢 , the function Vu_error calculates the error in the Bellman equation under the optimal
search intensity.
We’ll soon use this as an input to computing 𝑉 𝑢 .

# The error in the Bellman equation that requires equality at
# the optimal choices.
def Vu_error(self,Vu,r):
    β = self.β
    Ve = self.Ve

    a = invp_prime(1/(β*(Ve-Vu)),r)
    error = u(self,0) - a + β*(p(a,r)*Ve + (1-p(a,r))*Vu) - Vu
    return error

Since the calibration exercise is to match the hazard rate under autarky to the data, we must find a parameter 𝑟 to match
p(a,r) = 0.1.
The function r_error below calculates, for a given guess of 𝑟, the difference between the model-implied equilibrium
hazard rate and 0.1.
We’ll use this to compute a calibrated 𝑟∗ .

# The error of our p(a^*) relative to our calibration target
def r_error(self,r):
    β = self.β
    Ve = self.Ve

    Vu_star = sp.optimize.fsolve(Vu_error_Λ,15000,args = (r))
    a_star = invp_prime(1/(β*(Ve-Vu_star)),r)  # Assuming a>0
    return p(a_star,r) - 0.1

Now, let us create an instance of the model with our parametrization

params = params_instance(r = 1e-2)


# Create some lambda functions useful for fsolve function
Vu_error_Λ = lambda Vu,r: Vu_error(params,Vu,r)
r_error_Λ = lambda r: r_error(params,r)

We want to compute an 𝑟 that is consistent with the hazard rate 0.1 in autarky.
To do so, we will use a bisection strategy.

r_calibrated = sp.optimize.brentq(r_error_Λ,1e-10,1-1e-10)
print(f"Parameter to match 0.1 hazard rate: r = {r_calibrated}")

Vu_aut = sp.optimize.fsolve(Vu_error_Λ,15000,args = (r_calibrated))[0]


a_aut = invp_prime(1/(params.β*(params.Ve-Vu_aut)),r_calibrated)

print(f"Check p at r: {p(a_aut,r_calibrated)}")

Parameter to match 0.1 hazard rate: r = 0.0003431409393866592


Check p at r: 0.10000000000001996


/tmp/ipykernel_9486/2412693371.py:6: RuntimeWarning: The iteration is not making good progress, as measured by the
  improvement from the last five Jacobian evaluations.
  Vu_star = sp.optimize.fsolve(Vu_error_Λ,15000,args = (r))

Now that we have calibrated the parameter 𝑟, we can continue with solving the model with private information.

40.3.4 Computation under Private Information

Our approach to solving the full model follows ideas of Judd (1998) [Judd, 1998], who uses a polynomial to approximate
the value function and a numerical optimizer to perform the optimization at each iteration.

Note: For further details of the Judd (1998) [Judd, 1998] method, see [Ljungqvist and Sargent, 2018], Section 5.7.

We will use cubic splines to interpolate across a pre-set grid of points to approximate the value function.
Our strategy involves finding a function 𝐶(𝑉 ) – the expected cost of giving the worker value 𝑉 – that satisfies the Bellman
equation:

$$
C(V) = \min_{c, a, V^u} \left\{ c + \beta [1 - p(a)] C(V^u) \right\} \qquad (40.15)
$$

Notice that in equations (40.9) and (40.11), we have analytical solutions of 𝑐 and 𝑎 in terms of promised value 𝑉 and 𝑉 𝑢
(and other parameters).
We can substitute these equations for 𝑐 and 𝑎 and obtain the functional equation (40.12).

def calc_c(self,Vu,V,a):
    '''
    Calculates the optimal consumption choice coming from the constraint of the insurer's problem
    (which is also a Bellman equation)
    '''
    β,Ve,r = self.β,self.Ve,self.r

    c = u_inv(self,V + a - β*(p(a,r)*Ve + (1-p(a,r))*Vu))
    return c

def calc_a(self,Vu):
    '''
    Calculates the optimal effort choice coming from the worker's effort optimality condition.
    '''
    r,β,Ve = self.r,self.β,self.Ve

    a_temp = np.log(r*β*(Ve - Vu))/r
    a = max(0,a_temp)
    return a

With these analytical solutions for optimal 𝑐 and 𝑎 in hand, we can reduce the minimization to (40.12) in the single
variable 𝑉 𝑢 .
With this in hand, we have our algorithm.


40.3.5 Algorithm

1. Fix a set of grid points $grid_V$ for $V$ and $grid_{V^u}$ for $V^u$.
2. Guess a function $C_0(V)$ that is evaluated at the grid $grid_V$.
3. For each point in $grid_V$ find the $V^u$ that minimizes the expression on the right side of (40.12). We find the minimum
by evaluating the right side of (40.12) at each point in $grid_{V^u}$ and then finding the minimum using cubic splines.
4. Evaluating the minimum across all points in $grid_V$ gives you another function $C_1(V)$.
5. If $C_0(V)$ and $C_1(V)$ are sufficiently different, then repeat steps 3-4 again. Otherwise, we are done.
6. Thus, the iterations are $C_{j+1}(V) = \min_{c,a,V^u} \{ c + \beta [1 - p(a)] C_j(V^u) \}$.
The function iterate_C below executes step 3 in the above algorithm.

# Operator iterate_C that calculates the next iteration of the cost function.
def iterate_C(self,C_old,Vu_grid):
    '''
    We solve the model by minimising the value function across a grid of possible promised values.
    '''
    β,r,n_grid = self.β,self.r,self.n_grid

    C_new = np.zeros(n_grid)
    cons_star = np.zeros(n_grid)
    a_star = np.zeros(n_grid)
    V_star = np.zeros(n_grid)

    C_new2 = np.zeros(n_grid)
    V_star2 = np.zeros(n_grid)

    for V_i in range(n_grid):

        C_Vi_temp = np.zeros(n_grid)
        cons_Vi_temp = np.zeros(n_grid)
        a_Vi_temp = np.zeros(n_grid)

        for Vu_i in range(n_grid):

            a_i = calc_a(self,Vu_grid[Vu_i])
            c_i = calc_c(self,Vu_grid[Vu_i],Vu_grid[V_i],a_i)

            C_Vi_temp[Vu_i] = c_i + β*(1-p(a_i,r))*C_old[Vu_i]
            cons_Vi_temp[Vu_i] = c_i
            a_Vi_temp[Vu_i] = a_i

        # Interpolate across the grid to get better approximation of the minimum
        C_Vi_temp_interp = sp.interpolate.interp1d(Vu_grid,C_Vi_temp, kind = 'cubic')
        cons_Vi_temp_interp = sp.interpolate.interp1d(Vu_grid,cons_Vi_temp, kind = 'cubic')
        a_Vi_temp_interp = sp.interpolate.interp1d(Vu_grid,a_Vi_temp, kind = 'cubic')

        res = sp.optimize.minimize_scalar(C_Vi_temp_interp,method='bounded',bounds = (Vu_min,Vu_max))
        V_star[V_i] = res.x
        C_new[V_i] = res.fun

        # Save the associated consumption and search policy functions as well
        cons_star[V_i] = cons_Vi_temp_interp(V_star[V_i])
        a_star[V_i] = a_Vi_temp_interp(V_star[V_i])

    return C_new,V_star,cons_star,a_star

The following code executes steps 4 and 5 in the Algorithm until convergence to a function 𝐶 ∗ (𝑉 ).

def solve_incomplete_info_model(self,Vu_grid,Vu_aut,tol = 1e-6,max_iter = 10000):

    iter = 0
    error = 1

    C_init = np.ones(self.n_grid)*0
    C_old = np.copy(C_init)

    while iter < max_iter and error > tol:

        C_new,V_new,cons_star,a_star = iterate_C(self,C_old,Vu_grid)
        error = np.max(np.abs(C_new - C_old))

        # Only print the iterations every 50 steps
        if iter % 50 == 0:
            print(f"Iteration: {iter}, error:{error}")
        C_old = np.copy(C_new)
        iter += 1

    return C_new,V_new,cons_star,a_star

40.4 Outcomes

Using the above functions, we create another instance of the parameters with our calibrated parameter 𝑟.

# Create another instance with the correct r now
params = params_instance(r = r_calibrated)

# Set up grid
Vu_min = Vu_aut
Vu_max = params.Ve - 1/(params.β*p_prime(0,params.r))
Vu_grid = np.linspace(Vu_min,Vu_max,params.n_grid)

# Solve model
C_star,V_star,cons_star,a_star = solve_incomplete_info_model(params,Vu_grid,Vu_aut,tol = 1e-6,max_iter = 10000)

# Since we have the policy functions in grid form, we will interpolate them
# to be able to evaluate any promised value


cons_star_interp = sp.interpolate.interp1d(Vu_grid,cons_star)
a_star_interp = sp.interpolate.interp1d(Vu_grid,a_star)
V_star_interp = sp.interpolate.interp1d(Vu_grid,V_star)

Iteration: 0, error:72.95964854907824


Iteration: 50, error:12.222761762480786

Iteration: 100, error:0.12875960366727668

Iteration: 150, error:0.0009402349710398994

Iteration: 200, error:6.115462838351959e-06

40.4.1 Replacement Ratios and Continuation Values

Let’s graph the replacement ratio (𝑐/𝑤) and search effort 𝑎 as functions of the duration of unemployment.
We’ll do this for three levels of 𝑉0 , the lowest being the autarky value 𝑉aut .
We accomplish this by using the optimal policy functions V_star, cons_star and a_star computed above as well as
the following iterative procedure:

# Replacement ratio and effort as a function of unemployment duration


T_max = 52
Vu_t = np.empty((T_max,3))
cons_t = np.empty((T_max-1,3))
a_t = np.empty((T_max-1,3))

# Calculate the replacement ratios depending on different initial


# promised values
Vu_0_hold = np.array([Vu_aut,16942,17000])

for i,Vu_0 in enumerate(Vu_0_hold):
    Vu_t[0,i] = Vu_0
    for t in range(1,T_max):
        cons_t[t-1,i] = cons_star_interp(Vu_t[t-1,i])
        a_t[t-1,i] = a_star_interp(Vu_t[t-1,i])
        Vu_t[t,i] = V_star_interp(Vu_t[t-1,i])

fontSize = 10
plt.rc('font', size=fontSize) # controls default text sizes
plt.rc('axes', titlesize=fontSize) # fontsize of the axes title
plt.rc('axes', labelsize=fontSize) # fontsize of the x and y labels
plt.rc('xtick', labelsize=fontSize) # fontsize of the tick labels
plt.rc('ytick', labelsize=fontSize) # fontsize of the tick labels
plt.rc('legend', fontsize=fontSize) # legend fontsize

f1 = plt.figure(figsize = (8,8))
plt.subplot(2,1,1)
plt.plot(range(T_max-1),cons_t[:,0]/params.w,label = '$V^u_0$ = 16759 (aut)',color = 'red')
plt.plot(range(T_max-1),cons_t[:,1]/params.w,label = '$V^u_0$ = 16942',color = 'blue')
plt.plot(range(T_max-1),cons_t[:,2]/params.w,label = '$V^u_0$ = 17000',color = 'green')
plt.ylabel("Replacement ratio (c/w)")
plt.legend()
plt.title("Optimal replacement ratio")

plt.subplot(2,1,2)
plt.plot(range(T_max-1),a_t[:,0],color = 'red')
plt.plot(range(T_max-1),a_t[:,1],color = 'blue')
plt.plot(range(T_max-1),a_t[:,2],color = 'green')
plt.ylim(0,320)
plt.ylabel("Optimal search effort (a)")
plt.xlabel("Duration of unemployment")
plt.title("Optimal search effort")
plt.show()

For an initial promised value $V^u = V_{\text{aut}}$, the planner chooses the autarky level of 0 for the replacement ratio and instructs the worker to search at the autarky search intensity, regardless of the duration of unemployment.

But for $V^u > V_{\text{aut}}$, the planner makes the replacement ratio decline and search effort increase with the duration of unemployment.

40.4.2 Interpretations

The downward slope of the replacement ratio when 𝑉 𝑢 > 𝑉aut is a consequence of the planner’s limited information
about the worker’s search effort.
By providing the worker with a duration-dependent schedule of replacement ratios, the planner induces the worker in
effect to reveal his/her search effort to the planner.
We saw earlier that with full information, the planner would smooth consumption over an unemployment spell by keeping
the replacement ratio constant.
With private information, the planner can’t observe the worker’s search effort and therefore makes the replacement ratio
fall.
Evidently, search effort rises as the duration of unemployment increases, especially early in an unemployment spell.
There is a carrot-and-stick aspect to the replacement rate and search effort schedules:
• the carrot occurs in the form of high compensation and low search effort early in an unemployment spell.
• the stick occurs in the form of low compensation and high search effort later in the spell.
We shall encounter a related carrot-and-stick feature in our other lectures about dynamic programming squared.
The planner offers declining benefits and induces increased search effort as the duration of an unemployment spell rises in
order to provide an unemployed worker with proper incentives, not to punish an unlucky worker who has been unemployed
for a long time.
The planner believes that a worker who has been unemployed a long time is unlucky, not that he has done anything wrong
(e.g.,that he has not lived up to the contract).
Indeed, the contract is designed to induce the unemployed workers to search in the way the planner expects.
The falling consumption and rising search effort of the unlucky ones with long unemployment spells are simply costs that
have to be paid in order to provide proper incentives.



CHAPTER

FORTYONE

STACKELBERG PLANS

In addition to what’s in Anaconda, this lecture will need the following libraries:

!pip install --upgrade quantecon

41.1 Overview

This lecture formulates and computes a plan that a Stackelberg leader uses to manipulate forward-looking decisions of a
Stackelberg follower that depend on continuation sequences of decisions made once and for all by the Stackelberg leader
at time 0.
To facilitate computation and interpretation, we formulate things in a context that allows us to apply dynamic programming
for linear-quadratic models.
Technically, our calculations are closely related to ones described in this lecture.
From the beginning, we carry along a linear-quadratic model of duopoly in which firms face adjustment costs that make
them want to forecast actions of other firms that influence future prices.
Let’s start with some standard imports:

import numpy as np
import numpy.linalg as la
import quantecon as qe
from quantecon import LQ
import matplotlib.pyplot as plt

41.2 Duopoly

Time is discrete and is indexed by 𝑡 = 0, 1, ….


Two firms produce a single good whose demand is governed by the linear inverse demand curve

$$
p_t = a_0 - a_1 (q_{1t} + q_{2t})
$$

where $q_{it}$ is output of firm $i$ at time $t$ and $a_0$ and $a_1$ are both positive.

$q_{10}, q_{20}$ are given numbers that serve as initial conditions at time 0.

By incurring a cost equal to

$$
\gamma v_{it}^2, \qquad \gamma > 0,
$$


firm $i$ can change its output according to

$$
q_{it+1} = q_{it} + v_{it}
$$

Firm $i$'s profits at time $t$ equal

$$
\pi_{it} = p_t q_{it} - \gamma v_{it}^2
$$

Firm $i$ wants to maximize the present value of its profits

$$
\sum_{t=0}^{\infty} \beta^t \pi_{it}
$$

where 𝛽 ∈ (0, 1) is a time discount factor.

41.2.1 Stackelberg Leader and Follower



Each firm $i = 1, 2$ chooses a sequence $\vec q_i \equiv \{q_{it+1}\}_{t=0}^{\infty}$ once and for all at time 0.

We let firm 2 be a Stackelberg leader and firm 1 be a Stackelberg follower.

The leader firm 2 goes first and chooses $\{q_{2t+1}\}_{t=0}^{\infty}$ once and for all at time 0.

Knowing that firm 2 has chosen $\{q_{2t+1}\}_{t=0}^{\infty}$, the follower firm 1 goes second and chooses $\{q_{1t+1}\}_{t=0}^{\infty}$ once and for all at time 0.

In choosing $\vec q_2$, firm 2 takes into account that firm 1 will base its choice of $\vec q_1$ on firm 2's choice of $\vec q_2$.

41.2.2 Statement of Leader’s and Follower’s Problems

We can express firm 1’s problem as


max Π1 (𝑞1⃗ ; 𝑞2⃗ )
𝑞1⃗

where the appearance behind the semi-colon indicates that 𝑞2⃗ is given.
Firm 1’s problem induces the best response mapping

𝑞1⃗ = 𝐵(𝑞2⃗ )

(Here 𝐵 maps a sequence into a sequence)


The Stackelberg leader’s problem is
max Π2 (𝐵(𝑞2⃗ ), 𝑞2⃗ )
𝑞2⃗

whose maximizer is a sequence 𝑞2⃗ that depends on the initial conditions 𝑞10 , 𝑞20 and the parameters of the model 𝑎0 , 𝑎1 , 𝛾.
This formulation captures key features of the model
• Both firms make once-and-for-all choices at time 0.
• This is true even though both firms are choosing sequences of quantities that are indexed by time.
• The Stackelberg leader chooses first within time 0, knowing that the Stackelberg follower will choose second
within time 0.
While our abstract formulation reveals the timing protocol and equilibrium concept well, it obscures details that must be
addressed when we want to compute and interpret a Stackelberg plan and the follower’s best response to it.
To gain insights about these things, we study them in more detail.


41.2.3 Firms’ Problems

Firm 1 acts as if firm 2's sequence $\{q_{2t+1}\}_{t=0}^{\infty}$ is given and beyond its control.

Firm 2 knows that firm 1 chooses second and takes this into account in choosing $\{q_{2t+1}\}_{t=0}^{\infty}$.

In the spirit of working backward, we study firm 1's problem first, taking $\{q_{2t+1}\}_{t=0}^{\infty}$ as given.

We can formulate firm 1's optimum problem in terms of the Lagrangian

$$
L = \sum_{t=0}^{\infty} \beta^t \left\{ a_0 q_{1t} - a_1 q_{1t}^2 - a_1 q_{1t} q_{2t} - \gamma v_{1t}^2 + \lambda_t [q_{1t} + v_{1t} - q_{1t+1}] \right\}
$$

Firm 1 seeks a maximum with respect to $\{q_{1t+1}, v_{1t}\}_{t=0}^{\infty}$ and a minimum with respect to $\{\lambda_t\}_{t=0}^{\infty}$.

We approach this problem using methods described in [Ljungqvist and Sargent, 2018], chapter 2, appendix A and [Sargent, 1987], chapter IX.

First-order conditions for this problem are

$$
\begin{aligned}
\frac{\partial L}{\partial q_{1t}} &= a_0 - 2 a_1 q_{1t} - a_1 q_{2t} + \lambda_t - \beta^{-1} \lambda_{t-1} = 0, \quad t \ge 1 \\
\frac{\partial L}{\partial v_{1t}} &= -2 \gamma v_{1t} + \lambda_t = 0, \quad t \ge 0
\end{aligned}
$$

These first-order conditions and the constraint $q_{1t+1} = q_{1t} + v_{1t}$ can be rearranged to take the form

$$
\begin{aligned}
v_{1t} &= \beta v_{1t+1} + \frac{\beta a_0}{2\gamma} - \frac{\beta a_1}{\gamma} q_{1t+1} - \frac{\beta a_1}{2\gamma} q_{2t+1} \\
q_{1t+1} &= q_{1t} + v_{1t}
\end{aligned}
$$

We can substitute the second equation into the first equation to obtain

$$
(q_{1t+1} - q_{1t}) = \beta (q_{1t+2} - q_{1t+1}) + c_0 - c_1 q_{1t+1} - c_2 q_{2t+1}
$$

where $c_0 = \frac{\beta a_0}{2\gamma}$, $c_1 = \frac{\beta a_1}{\gamma}$, $c_2 = \frac{\beta a_1}{2\gamma}$.

This equation can in turn be rearranged to become

$$
- q_{1t} + (1 + \beta + c_1) q_{1t+1} - \beta q_{1t+2} = c_0 - c_2 q_{2t+1} \qquad (41.1)
$$

Equation (41.1) is a second-order difference equation in the sequence $\vec q_1$ whose solution we want.

It satisfies two boundary conditions:

• an initial condition for $q_{1,0}$, which is given

• a terminal condition requiring that $\lim_{T \to +\infty} \beta^T q_{1t}^2 < +\infty$

Using the lag operators described in [Sargent, 1987], chapter IX, difference equation (41.1) can be written as

$$
\beta \left( 1 - \frac{1 + \beta + c_1}{\beta} L + \beta^{-1} L^2 \right) q_{1t+2} = -c_0 + c_2 q_{2t+1}
$$

The polynomial in the lag operator on the left side can be factored as

$$
\left( 1 - \frac{1 + \beta + c_1}{\beta} L + \beta^{-1} L^2 \right) = (1 - \delta_1 L)(1 - \delta_2 L) \qquad (41.2)
$$

where $0 < \delta_1 < 1 < \frac{1}{\sqrt{\beta}} < \delta_2$.


Because $\delta_2 > \frac{1}{\sqrt{\beta}}$, the operator $(1 - \delta_2 L)$ contributes an unstable component if solved backwards but a stable component if solved forwards.
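As an aside, the roots $\delta_1$ and $\delta_2$ can be computed numerically from the quadratic implied by the factorization (41.2): matching coefficients gives $\delta_1 + \delta_2 = (1+\beta+c_1)/\beta$ and $\delta_1 \delta_2 = 1/\beta$. The sketch below is illustrative only and uses assumed parameter values $a_0 = 10$, $a_1 = 2$, $\beta = 0.96$, $\gamma = 12$ rather than values fixed anywhere in the text.

import numpy as np

# Assumed illustrative parameters
a0, a1, β, γ = 10.0, 2.0, 0.96, 12.0
c1 = β * a1 / γ

# δ1, δ2 are the roots of  z² - ((1+β+c1)/β) z + 1/β = 0
δ1, δ2 = np.sort(np.roots([1.0, -(1 + β + c1) / β, 1.0 / β]))

print(δ1, 1/np.sqrt(β), δ2)   # should display 0 < δ1 < 1 < 1/√β < δ2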
Mechanically, write

$$
(1 - \delta_2 L) = -\delta_2 L (1 - \delta_2^{-1} L^{-1})
$$

and compute the following inverse operator

$$
\left[ -\delta_2 L (1 - \delta_2^{-1} L^{-1}) \right]^{-1} = -\delta_2^{-1} (1 - \delta_2^{-1} L^{-1})^{-1} L^{-1}
$$

Operating on both sides of equation (41.2) with $\beta^{-1}$ times this inverse operator gives the follower's decision rule for setting $q_{1t+1}$ in the feedback-feedforward form

$$
q_{1t+1} = \delta_1 q_{1t} - c_0 \delta_2^{-1} \beta^{-1} \frac{1}{1 - \delta_2^{-1}} + c_2 \delta_2^{-1} \beta^{-1} \sum_{j=0}^{\infty} \delta_2^{-j} q_{2t+j+1}, \quad t \ge 0 \qquad (41.3)
$$

The problem of the Stackelberg leader firm 2 is to choose the sequence $\{q_{2t+1}\}_{t=0}^{\infty}$ to maximize its discounted profits

$$
\sum_{t=0}^{\infty} \beta^t \left\{ (a_0 - a_1 (q_{1t} + q_{2t})) q_{2t} - \gamma (q_{2t+1} - q_{2t})^2 \right\}
$$

subject to the sequence of constraints (41.3) for 𝑡 ≥ 0.


We can put a sequence $\{\theta_t\}_{t=0}^{\infty}$ of Lagrange multipliers on the sequence of equations (41.3) and formulate the following Lagrangian for the Stackelberg leader firm 2's problem

$$
\begin{aligned}
\tilde L = & \sum_{t=0}^{\infty} \beta^t \left\{ (a_0 - a_1 (q_{1t} + q_{2t})) q_{2t} - \gamma (q_{2t+1} - q_{2t})^2 \right\} \\
& + \sum_{t=0}^{\infty} \beta^t \theta_t \left\{ \delta_1 q_{1t} - c_0 \delta_2^{-1} \beta^{-1} \frac{1}{1 - \delta_2^{-1}} + c_2 \delta_2^{-1} \beta^{-1} \sum_{j=0}^{\infty} \delta_2^{-j} q_{2t+j+1} - q_{1t+1} \right\}
\end{aligned} \qquad (41.4)
$$

subject to initial conditions for $q_{1t}, q_{2t}$ at $t = 0$.


Remarks: We have formulated the Stackelberg problem in a space of sequences.
The max-min problem associated with firm 2's Lagrangian (41.4) is unpleasant because the time $t$ component of firm 2's payoff function depends on the entire future of its choices of $\{q_{2t+j}\}_{j=0}^{\infty}$.

This renders a direct attack on the problem in the space of sequences cumbersome.
Therefore, below we will formulate the Stackelberg leader’s problem recursively.
We’ll proceed by putting our duopoly model into a broader class of models with the same general structure.

41.3 Stackelberg Problem

We formulate a class of linear-quadratic Stackelberg leader-follower problems of which our duopoly model is an instance.
We use the optimal linear regulator (a.k.a. the linear-quadratic dynamic programming problem described in LQ Dynamic
Programming problems) to represent a Stackelberg leader’s problem recursively.
Let 𝑧𝑡 be an 𝑛𝑧 × 1 vector of natural state variables.
Let 𝑥𝑡 be an 𝑛𝑥 × 1 vector of endogenous forward-looking variables that are physically free to jump at 𝑡.
In our duopoly example 𝑥𝑡 = 𝑣1𝑡 , the time 𝑡 decision of the Stackelberg follower.


Let 𝑢𝑡 be a vector of decisions chosen by the Stackelberg leader at 𝑡.


The 𝑧𝑡 vector is inherited from the past.
But 𝑥𝑡 is a decision made by the Stackelberg follower at time 𝑡 that is the follower’s best response to the choice of an
entire sequence of decisions made by the Stackelberg leader at time 𝑡 = 0.
Let

$$
y_t = \begin{bmatrix} z_t \\ x_t \end{bmatrix}
$$

Represent the Stackelberg leader’s one-period loss function as

$$
r(y, u) = y' R y + u' Q u
$$

Subject to an initial condition for $z_0$, but not for $x_0$, the Stackelberg leader wants to maximize

$$
- \sum_{t=0}^{\infty} \beta^t r(y_t, u_t) \qquad (41.5)
$$

The Stackelberg leader faces the model

$$
\begin{bmatrix} I & 0 \\ G_{21} & G_{22} \end{bmatrix}
\begin{bmatrix} z_{t+1} \\ x_{t+1} \end{bmatrix}
=
\begin{bmatrix} \hat A_{11} & \hat A_{12} \\ \hat A_{21} & \hat A_{22} \end{bmatrix}
\begin{bmatrix} z_t \\ x_t \end{bmatrix}
+ \hat B u_t \qquad (41.6)
$$

We assume that the matrix $\begin{bmatrix} I & 0 \\ G_{21} & G_{22} \end{bmatrix}$ on the left side of equation (41.6) is invertible, so that we can multiply both sides by its inverse to obtain

$$
\begin{bmatrix} z_{t+1} \\ x_{t+1} \end{bmatrix}
=
\begin{bmatrix} A_{11} & A_{12} \\ A_{21} & A_{22} \end{bmatrix}
\begin{bmatrix} z_t \\ x_t \end{bmatrix}
+ B u_t \qquad (41.7)
$$

or

$$
y_{t+1} = A y_t + B u_t \qquad (41.8)
$$

41.3.1 Interpretation of Second Block of Equations

The Stackelberg follower’s best response mapping is summarized by the second block of equations of (41.7).
In particular, these equations are the first-order conditions of the Stackelberg follower’s optimization problem (i.e., its
Euler equations).
These Euler equations summarize the forward-looking aspect of the follower’s behavior and express how its time 𝑡 decision
depends on the leader’s actions at times 𝑠 ≥ 𝑡.
When combined with a stability condition to be imposed below, the Euler equations summarize the follower’s best response
to the sequence of actions by the leader.
The Stackelberg leader maximizes (41.5) by choosing sequences $\{u_t, x_t, z_{t+1}\}_{t=0}^{\infty}$ subject to (41.8) and an initial condition for $z_0$.
Note that we have an initial condition for 𝑧0 but not for 𝑥0 .
𝑥0 is among the variables to be chosen at time 0 by the Stackelberg leader.
The Stackelberg leader uses its understanding of the responses restricted by (41.8) to manipulate the follower’s decisions.


41.3.2 More Mechanical Details

For any vector $a_t$, define $\vec a_t = [a_t, a_{t+1}, \ldots]$.

Define a feasible set of $(\vec y_1, \vec u_0)$ sequences

$$
\Omega(y_0) = \{ (\vec y_1, \vec u_0) : y_{t+1} = A y_t + B u_t, \; \forall t \ge 0 \}
$$

Please remember that the follower’s system of Euler equations is embedded in the system of dynamic equations 𝑦𝑡+1 =
𝐴𝑦𝑡 + 𝐵𝑢𝑡 .
Note that the definition of Ω(𝑦0 ) treats 𝑦0 as given.
Although it is taken as given in Ω(𝑦0 ), eventually, the 𝑥0 component of 𝑦0 is to be chosen by the Stackelberg leader.

41.3.3 Two Subproblems

Once again we use backward induction.


We express the Stackelberg problem in terms of two subproblems.
Subproblem 1 is solved by a continuation Stackelberg leader at each date 𝑡 ≥ 0.
Subproblem 2 is solved by the Stackelberg leader at 𝑡 = 0.
The two subproblems are designed
• to respect the timing protocol in which the follower chooses 𝑞1⃗ after seeing 𝑞2⃗ chosen by the leader
• to make the leader choose 𝑞2⃗ while respecting that 𝑞1⃗ will be the follower’s best response to 𝑞2⃗
• to represent the leader’s problem recursively by artfully choosing the leader’s state variables and the control variables
available to the leader
Subproblem 1

$$
v(y_0) = \max_{(\vec y_1, \vec u_0) \in \Omega(y_0)} \; - \sum_{t=0}^{\infty} \beta^t r(y_t, u_t)
$$

Subproblem 2

$$
w(z_0) = \max_{x_0} v(y_0)
$$

Subproblem 1 takes the vector of forward-looking variables 𝑥0 as given.


Subproblem 2 optimizes over 𝑥0 .
The value function 𝑤(𝑧0 ) tells the value of the Stackelberg plan as a function of the vector of natural state variables 𝑧0 at
time 0.

41.4 Two Bellman Equations

We now describe Bellman equations for 𝑣(𝑦) and 𝑤(𝑧0 ).


Subproblem 1
The value function 𝑣(𝑦) in subproblem 1 satisfies the Bellman equation

$$
v(y) = \max_{u, y^*} \{ -r(y, u) + \beta v(y^*) \} \qquad (41.9)
$$


where the maximization is subject to

$$
y^* = A y + B u
$$

and $y^*$ denotes next period's value.

Substituting $v(y) = -y' P y$ into Bellman equation (41.9) gives

$$
-y' P y = \max_{u, y^*} \{ -y' R y - u' Q u - \beta y^{*\prime} P y^* \}
$$

which, as in the lecture on the linear regulator, gives rise to the algebraic matrix Riccati equation

$$
P = R + \beta A' P A - \beta^2 A' P B (Q + \beta B' P B)^{-1} B' P A
$$

and the optimal decision rule coefficient vector

$$
F = \beta (Q + \beta B' P B)^{-1} B' P A
$$

where the optimal decision rule is

$$
u_t = -F y_t
$$

Subproblem 2

We find an optimal $x_0$ by equating to zero the gradient of $v(y_0)$ with respect to $x_0$:

$$
-2 P_{21} z_0 - 2 P_{22} x_0 = 0,
$$

which implies that

$$
x_0 = -P_{22}^{-1} P_{21} z_0 \qquad (41.10)
$$
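Since the lecture imports LQ from quantecon, one natural way to carry out these two subproblems numerically is to hand the matrices to an optimal linear regulator and then read off $x_0$ from the partition of $P$. The following is a minimal sketch under assumptions, not the lecture's own implementation: the matrices A, B, R, Q and the number of natural state variables nz are assumed to have been built already, and the signs follow the regulator convention of minimizing $y'Ry + u'Qu$, which is equivalent to maximizing (41.5).

import numpy as np
from quantecon import LQ

# Assumed inputs: A, B, R, Q, β as in (41.5)-(41.8), nz = number of natural
# state variables in z_t (so y_t stacks z_t above x_t), and z0 an initial z.
def stackelberg_subproblems(A, B, R, Q, β, nz, z0):
    # Subproblem 1: a standard LQ problem in y_t, solved with quantecon's LQ class
    lq = LQ(Q, R, A, B, beta=β)
    P, F, d = lq.stationary_values()      # v(y) = -y'Py and u_t = -F y_t

    # Subproblem 2: choose x0 to maximize v([z0, x0]), i.e. eq. (41.10)
    P21 = P[nz:, :nz]
    P22 = P[nz:, nz:]
    x0 = -np.linalg.solve(P22, P21 @ z0)
    return P, F, x0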

41.5 Stackelberg Plan for Duopoly

Now let’s map our duopoly model into the above setup.
We formulate a state vector

$$
y_t = \begin{bmatrix} z_t \\ x_t \end{bmatrix}
$$

where for our duopoly model

$$
z_t = \begin{bmatrix} 1 \\ q_{2t} \\ q_{1t} \end{bmatrix}, \qquad x_t = v_{1t},
$$

where $x_t = v_{1t}$ is the time $t$ decision of the follower firm 1, $u_t$ is the time $t$ decision of the leader firm 2, and

$$
v_{1t} = q_{1t+1} - q_{1t}, \qquad u_t = q_{2t+1} - q_{2t}.
$$

For our duopoly model, initial conditions for the natural state variables in $z_t$ are

$$
z_0 = \begin{bmatrix} 1 \\ q_{20} \\ q_{10} \end{bmatrix}
$$

while $x_0 = v_{10} = q_{11} - q_{10}$ is a choice variable for the Stackelberg leader firm 2, one that will ultimately be chosen
according to an optimal rule prescribed by (41.10) for subproblem 2 above.
That the Stackelberg leader firm 2 chooses 𝑥0 = 𝑣10 is subtle.
Of course, $x_0 = v_{10}$ emerges from the feedback-feedforward solution (41.3) of firm 1's system of Euler equations, so that it is actually firm 1 that sets $x_0$.

But firm 2 manipulates firm 1's choice through firm 2's choice of the sequence $\vec q_{2,1} = \{q_{2t+1}\}_{t=0}^{\infty}$.


41.5.1 Calculations to Prepare Duopoly Model

Now we’ll proceed to cast our duopoly model within the framework of the more general linear-quadratic structure de-
scribed above.
That will allow us to compute a Stackelberg plan simply by enlisting a Riccati equation to solve a linear-quadratic dynamic
program.
As emphasized above, firm 1 acts as if firm 2’s decisions {𝑞2𝑡+1 , 𝑣2𝑡 }∞
𝑡=0 are given and beyond its control.

41.5.2 Firm 1’s Problem

We again formulate firm 1’s optimum problem in terms of the Lagrangian



𝐿 = ∑ 𝛽 𝑡 {𝑎0 𝑞1𝑡 − 𝑎1 𝑞1𝑡
2 2
− 𝑎1 𝑞1𝑡 𝑞2𝑡 − 𝛾𝑣1𝑡 + 𝜆𝑡 [𝑞1𝑡 + 𝑣1𝑡 − 𝑞1𝑡+1 ]}
𝑡=0

Firm 1 seeks a maximum with respect to {𝑞1𝑡+1 , 𝑣1𝑡 }∞ ∞


𝑡=0 and a minimum with respect to {𝜆𝑡 }𝑡=0 .

First-order conditions for this problem are


𝜕𝐿
= 𝑎0 − 2𝑎1 𝑞1𝑡 − 𝑎1 𝑞2𝑡 + 𝜆𝑡 − 𝛽 −1 𝜆𝑡−1 = 0, 𝑡≥1
𝜕𝑞1𝑡
𝜕𝐿
= −2𝛾𝑣1𝑡 + 𝜆𝑡 = 0, 𝑡 ≥ 0
𝜕𝑣1𝑡
These first-order order conditions and the constraint 𝑞1𝑡+1 = 𝑞1𝑡 + 𝑣1𝑡 can be rearranged to take the form
𝛽𝑎0 𝛽𝑎 𝛽𝑎
𝑣1𝑡 = 𝛽𝑣1𝑡+1 + − 1 𝑞1𝑡+1 − 1 𝑞2𝑡+1
2𝛾 𝛾 2𝛾
𝑞𝑡+1 = 𝑞1𝑡 + 𝑣1𝑡
We use these two equations as components of the following linear system that confronts a Stackelberg continuation leader
at time 𝑡
1 0 0 0 1 1 0 0 0 1 0
⎡ 0 1 0 0 ⎤ ⎡𝑞2𝑡+1 ⎤ ⎡0 1 0 0⎤ ⎡𝑞2𝑡 ⎤ ⎡1⎤
⎢ ⎥⎢ ⎥ ⎢ ⎥⎢ ⎥ + ⎢ ⎥𝑣
⎢ 0 0 1 0 ⎥ ⎢𝑞1𝑡+1 ⎥ = ⎢0 0 1 1⎥ ⎢𝑞1𝑡 ⎥ ⎢0⎥ 2𝑡
𝛽𝑎0
⎣ 2𝛾 − 𝛽𝑎
2𝛾
1
− 𝛽𝑎𝛾 1 𝛽 ⎦ ⎣𝑣1𝑡+1 ⎦ ⎣0 0 0 1⎦ ⎣𝑣1𝑡 ⎦ ⎣0⎦
2
Time 𝑡 revenues of firm 2 are $\pi_{2t} = a_0 q_{2t} - a_1 q_{2t}^2 - a_1 q_{1t} q_{2t}$, which evidently equal

$$
z_t' R_1 z_t \equiv
\begin{bmatrix} 1 \\ q_{2t} \\ q_{1t} \end{bmatrix}'
\begin{bmatrix}
0 & \frac{a_0}{2} & 0 \\
\frac{a_0}{2} & -a_1 & -\frac{a_1}{2} \\
0 & -\frac{a_1}{2} & 0
\end{bmatrix}
\begin{bmatrix} 1 \\ q_{2t} \\ q_{1t} \end{bmatrix}
$$
If we set 𝑄 = 𝛾, then firm 2's period 𝑡 profits can be written

$$
y_t' R y_t - Q v_{2t}^2
$$

where

$$
y_t = \begin{bmatrix} z_t \\ x_t \end{bmatrix}
$$

with 𝑥𝑡 = 𝑣1𝑡 and

$$
R = \begin{bmatrix} R_1 & 0 \\ 0 & 0 \end{bmatrix}
$$


We’ll report results of implementing this code soon.


But first, we want to represent the Stackelberg leader’s optimal choices recursively.
It is important to do this for several reasons:
• properly to interpret a representation of the Stackelberg leader’s choice as a sequence of history-dependent functions
• to formulate a recursive version of the follower’s choice problem
First, let’s get a recursive representation of the Stackelberg leader’s choice of 𝑞2⃗ for our duopoly model.

41.6 Recursive Representation of Stackelberg Plan

In order to attain an appropriate representation of the Stackelberg leader’s history-dependent plan, we will employ what
amounts to a version of the Big K, little k device often used in macroeconomics by distinguishing 𝑧𝑡 , which depends
partly on decisions 𝑥𝑡 of the followers, from another vector 𝑧𝑡̌ , which does not.
We will use $\check z_t$ and its history $\check z^t = [\check z_t, \check z_{t-1}, \ldots, \check z_0]$ to describe the sequence of the Stackelberg leader's decisions that the Stackelberg follower takes as given.

Thus, we let $\check y_t' = \begin{bmatrix} \check z_t' & \check x_t' \end{bmatrix}$ with initial condition $\check z_0 = z_0$ given.
That we distinguish 𝑧𝑡̌ from 𝑧𝑡 is part and parcel of the Big K, little k device in this instance.
We have demonstrated that a Stackelberg plan for $\{u_t\}_{t=0}^{\infty}$ has a recursive representation

$$
\begin{aligned}
\check x_0 &= -P_{22}^{-1} P_{21} z_0 \\
u_t &= -F \check y_t, \quad t \geq 0 \\
\check y_{t+1} &= (A - BF) \check y_t, \quad t \geq 0
\end{aligned}
$$

From this representation, we can deduce the sequence of functions $\sigma = \{\sigma_t(\check z^t)\}_{t=0}^{\infty}$ that comprise a Stackelberg plan.

For convenience, let $\check A \equiv A - BF$ and partition $\check A$ conformably to the partition $\check y_t = \begin{bmatrix} \check z_t \\ \check x_t \end{bmatrix}$ as

$$
\begin{bmatrix} \check A_{11} & \check A_{12} \\ \check A_{21} & \check A_{22} \end{bmatrix}
$$

Let $H_0^0 \equiv -P_{22}^{-1} P_{21}$ so that $\check x_0 = H_0^0 \check z_0$.

Then iterations on $\check y_{t+1} = \check A \check y_t$ starting from initial condition $\check y_0 = \begin{bmatrix} \check z_0 \\ H_0^0 \check z_0 \end{bmatrix}$ imply that for $t \geq 1$

$$
\check x_t = \sum_{j=1}^{t} H_j^t \check z_{t-j}
$$

where

$$
\begin{aligned}
H_1^t &= \check A_{21} \\
H_2^t &= \check A_{22} \check A_{21} \\
&\ \ \vdots \\
H_{t-1}^t &= \check A_{22}^{t-2} \check A_{21} \\
H_t^t &= \check A_{22}^{t-1} \left( \check A_{21} + \check A_{22} H_0^0 \right)
\end{aligned}
$$
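As an illustration, here is a minimal sketch of generating the coefficients $H_j^t$ from the partitioned closed-loop matrix; it assumes the arrays A, B, F, P21, P22 computed in the code of this lecture are available and that, as in the duopoly model, the natural state has three components and x is a scalar.

import numpy as np

A_check = A - B @ F                          # closed-loop matrix
A21, A22 = A_check[3:, :3], A_check[3:, 3:]  # partition conformable with (z, x)
H00 = -np.linalg.solve(P22, P21)             # H_0^0

def H_coeffs(t):
    """Return [H_1^t, ..., H_t^t] following the recursion above."""
    H = [np.linalg.matrix_power(A22, j - 1) @ A21 for j in range(1, t)]
    H.append(np.linalg.matrix_power(A22, t - 1) @ (A21 + A22 @ H00))
    return H

# the date-t decision is then sum_j H_coeffs(t)[j-1] @ z_check[t-j]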


An optimal decision rule for the Stackelberg leader's choice of 𝑢𝑡 is

$$
u_t = -F \check y_t \equiv -\begin{bmatrix} F_z & F_x \end{bmatrix} \begin{bmatrix} \check z_t \\ \check x_t \end{bmatrix}
$$

or

$$
u_t = -F_z \check z_t - F_x \sum_{j=1}^{t} H_j^t \check z_{t-j} = \sigma_t(\check z^t) \tag{41.11}
$$

Representation (41.11) confirms that whenever $F_x \neq 0$, the typical situation, the time 𝑡 component $\sigma_t$ of a Stackelberg plan is history-dependent, meaning that the Stackelberg leader's choice 𝑢𝑡 depends not just on $\check z_t$ but on components of $\check z^{t-1}$.

41.6.1 Comments and Interpretations

Because we set 𝑧0̌ = 𝑧0 , it will turn out that 𝑧𝑡 = 𝑧𝑡̌ for all 𝑡 ≥ 0.
Then why did we distinguish 𝑧𝑡̌ from 𝑧𝑡 ?
The answer is that if we want to present to the Stackelberg follower a history-dependent representation of the Stackel-
berg leader’s sequence 𝑞2⃗ , we must use representation (41.11) cast in terms of the history 𝑧 𝑡̌ and not a corresponding
representation cast in terms of 𝑧𝑡 .

41.7 Dynamic Programming and Time Consistency of Follower's Problem

Given the sequence 𝑞2⃗ chosen by the Stackelberg leader in our duopoly model, it turns out that the Stackelberg follower’s
problem is recursive in the natural state variables that confront a follower at any time 𝑡 ≥ 0.
This means that the follower’s plan is time consistent.
To verify these claims, we’ll formulate a recursive version of a follower’s problem that builds on our recursive represen-
tation of the Stackelberg leader’s plan and our use of the Big K, little k idea.

41.7.1 Recursive Formulation of a Follower’s Problem

We now use what amounts to another “Big 𝐾, little 𝑘” trick (see rational expectations equilibrium) to formulate a recursive
version of a follower’s problem cast in terms of an ordinary Bellman equation.
Firm 1, the follower, faces $\{q_{2t}\}_{t=0}^{\infty}$ as a given quantity sequence chosen by the leader and believes that its output price at 𝑡 satisfies

$$
p_t = a_0 - a_1 (q_{1t} + q_{2t}), \quad t \geq 0
$$

Our challenge is to represent $\{q_{2t}\}_{t=0}^{\infty}$ as a given sequence.

To do so, recall that under the Stackelberg plan, firm 2 sets output according to the $q_{2t}$ component of

$$
y_{t+1} = \begin{bmatrix} 1 \\ q_{2t} \\ q_{1t} \\ x_t \end{bmatrix}
$$


which is governed by

$$
y_{t+1} = (A - BF) y_t
$$

To obtain a recursive representation of a $\{q_{2t}\}$ sequence that is exogenous to firm 1, we define a state $\tilde y_t$

$$
\tilde y_t = \begin{bmatrix} 1 \\ q_{2t} \\ \tilde q_{1t} \\ \tilde x_t \end{bmatrix}
$$

that evolves according to

$$
\tilde y_{t+1} = (A - BF) \tilde y_t
$$

subject to the initial condition $\tilde q_{10} = q_{10}$ and $\tilde x_0 = x_0$, where $x_0 = -P_{22}^{-1} P_{21} z_0$ as stated above.

Firm 1's state vector is

$$
X_t = \begin{bmatrix} \tilde y_t \\ q_{1t} \end{bmatrix}
$$

It follows that the follower firm 1 faces the law of motion

$$
\begin{bmatrix} \tilde y_{t+1} \\ q_{1t+1} \end{bmatrix} =
\begin{bmatrix} A - BF & 0 \\ 0 & 1 \end{bmatrix}
\begin{bmatrix} \tilde y_t \\ q_{1t} \end{bmatrix} +
\begin{bmatrix} 0 \\ 1 \end{bmatrix} x_t \tag{41.12}
$$

This specification assures that from the point of view of firm 1, $q_{2t}$ is an exogenous process.
Here
• $\tilde q_{1t}, \tilde x_t$ play the role of Big 𝐾
• $q_{1t}, x_t$ play the role of little 𝑘
The time 𝑡 component of firm 1's objective is

$$
X_t' \tilde R X_t - \tilde Q x_t^2 =
\begin{bmatrix} 1 \\ q_{2t} \\ \tilde q_{1t} \\ \tilde x_t \\ q_{1t} \end{bmatrix}'
\begin{bmatrix}
0 & 0 & 0 & 0 & \frac{a_0}{2} \\
0 & 0 & 0 & 0 & -\frac{a_1}{2} \\
0 & 0 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 & 0 \\
\frac{a_0}{2} & -\frac{a_1}{2} & 0 & 0 & -a_1
\end{bmatrix}
\begin{bmatrix} 1 \\ q_{2t} \\ \tilde q_{1t} \\ \tilde x_t \\ q_{1t} \end{bmatrix}
- \gamma x_t^2
$$
Firm 1's optimal decision rule is

$$
x_t = -\tilde F X_t
$$

and its state evolves according to

$$
X_{t+1} = (\tilde A - \tilde B \tilde F) X_t
$$
under its optimal decision rule.
Later we shall compute $\tilde F$ and verify that when we set

$$
X_0 = \begin{bmatrix} 1 \\ q_{20} \\ q_{10} \\ x_0 \\ q_{10} \end{bmatrix}
$$

we recover

$$
x_0 = -\tilde F X_0,
$$
which will verify that we have properly set up a recursive representation of the follower’s problem facing the Stackelberg
leader’s 𝑞2⃗ .


41.7.2 Time Consistency of Follower’s Plan

The follower can solve its problem using dynamic programming because its problem is recursive in what for it are the natural state variables, namely

$$
\begin{bmatrix} 1 \\ q_{2t} \\ \tilde q_{1t} \\ \tilde x_t \end{bmatrix}
$$
It follows that the follower’s plan is time consistent.

41.8 Computing Stackelberg Plan

Here is our code to compute a Stackelberg plan via the linear-quadratic dynamic program described above.
Let’s use it to compute the Stackelberg plan.

# Parameters
a0 = 10
a1 = 2
β = 0.96
γ = 120
n = 300
tol0 = 1e-8
tol1 = 1e-16
tol2 = 1e-2

βs = np.ones(n)
βs[1:] = β
βs = βs.cumprod()

# In LQ form
Alhs = np.eye(4)

# Euler equation coefficients


Alhs[3, :] = β * a0 / (2 * γ), -β * a1 / (2 * γ), -β * a1 / γ, β

Arhs = np.eye(4)
Arhs[2, 3] = 1

Alhsinv = la.inv(Alhs)

A = Alhsinv @ Arhs

B = Alhsinv @ np.array([[0, 1, 0, 0]]).T

R = np.array([[0, -a0 / 2, 0, 0],


[-a0 / 2, a1, a1 / 2, 0],
[0, a1 / 2, 0, 0],
[0, 0, 0, 0]])

Q = np.array([[γ]])

# Solve using QE's LQ class




# LQ solves minimization problems which is why the sign of R and Q was changed
lq = LQ(Q, R, A, B, beta=β)
P, F, d = lq.stationary_values(method='doubling')

P22 = P[3:, 3:]


P21 = P[3:, :3]
P22inv = la.inv(P22)
H_0_0 = -P22inv @ P21

# Simulate forward

π_leader = np.zeros(n)

z0 = np.array([[1, 1, 1]]).T
x0 = H_0_0 @ z0
y0 = np.vstack((z0, x0))

yt, ut = lq.compute_sequence(y0, ts_length=n)[:2]

π_matrix = (R + F.T @ Q @ F)

for t in range(n):
π_leader[t] = -(yt[:, t].T @ π_matrix @ yt[:, t])

# Display policies
print("Computed policy for Continuation Stackelberg leader\n")
print(f"F = {F}")

Computed policy for Continuation Stackelberg leader

F = [[-1.58004454 0.29461313 0.67480938 6.53970594]]

41.9 Time Series for Price and Quantities

Now let’s use the code to compute and display outcomes as a Stackelberg plan unfolds.
The following code plots quantities chosen by the Stackelberg leader and follower, together with the equilibrium output
price.

q_leader = yt[1, :-1]


q_follower = yt[2, :-1]
q = q_leader + q_follower # Total output, Stackelberg
p = a0 - a1 * q # Price, Stackelberg

fig, ax = plt.subplots(figsize=(9, 5.8))


ax.plot(range(n), q_leader, 'b-', lw=2, label='leader output')
ax.plot(range(n), q_follower, 'r-', lw=2, label='follower output')
ax.plot(range(n), p, 'g-', lw=2, label='price')
ax.set_title('Output and prices, Stackelberg duopoly')
ax.legend(frameon=False)
ax.set_xlabel('t')
plt.show()


41.9.1 Value of Stackelberg Leader

We'll compute the value 𝑤(𝑥0) attained by the Stackelberg leader, where 𝑥0 is given by the maximizer (41.10) of subproblem 2.
We’ll compute it two ways and get the same answer.
In addition to being a useful check on the accuracy of our coding, computing things in these two ways helps us think
about the structure of the problem.

v_leader_forward = np.sum(βs * π_leader)


v_leader_direct = -yt[:, 0].T @ P @ yt[:, 0]

# Display values
print("Computed values for the Stackelberg leader at t=0:\n")
print(f"v_leader_forward(forward sim) = {v_leader_forward:.4f}")
print(f"v_leader_direct (direct) = {v_leader_direct:.4f}")

Computed values for the Stackelberg leader at t=0:

v_leader_forward(forward sim) = 150.0316


v_leader_direct (direct) = 150.0324


# Manually checks whether P is approximately a fixed point


P_next = (R + F.T @ Q @ F + β * (A - B @ F).T @ P @ (A - B @ F))
(P - P_next < tol0).all()

True

# Manually checks whether two different ways of computing the


# value function give approximately the same answer
v_expanded = -((y0.T @ R @ y0 + ut[:, 0].T @ Q @ ut[:, 0] +
β * (y0.T @ (A - B @ F).T @ P @ (A - B @ F) @ y0)))
(v_leader_direct - v_expanded < tol0)[0, 0]

True

41.10 Time Inconsistency of Stackelberg Plan

In the code below we compare two values


• the continuation value $v(y_t) = -y_t' P y_t$ earned by a continuation Stackelberg leader who inherits state $y_t$ at 𝑡
• the value $w(\hat x_t)$ of a reborn Stackelberg leader who, at date 𝑡 along the Stackelberg plan, inherits state $z_t$ at 𝑡 but who discards $x_t$ from the time 𝑡 continuation of the original Stackelberg plan and resets it to $\hat x_t = -P_{22}^{-1} P_{21} z_t$

The difference between these two values is a tell-tale sign of the time inconsistency of the Stackelberg plan.

# Compute value function over time with a reset at time t


vt_leader = np.zeros(n)
vt_reset_leader = np.empty_like(vt_leader)

yt_reset = yt.copy()
yt_reset[-1, :] = (H_0_0 @ yt[:3, :])

for t in range(n):
vt_leader[t] = -yt[:, t].T @ P @ yt[:, t]
vt_reset_leader[t] = -yt_reset[:, t].T @ P @ yt_reset[:, t]

fig, axes = plt.subplots(3, 1, figsize=(10, 7))

axes[0].plot(range(n+1), (- F @ yt).flatten(), 'bo',


label='Stackelberg leader', ms=2)
axes[0].plot(range(n+1), (- F @ yt_reset).flatten(), 'ro',
label='reborn at t Stackelberg leader', ms=2)
axes[0].set(title=r' $u_{t} = q_{2t+1} - q_t$', xlabel='t')
axes[0].legend()

axes[1].plot(range(n+1), yt[3, :], 'bo', ms=2)


axes[1].plot(range(n+1), yt_reset[3, :], 'ro', ms=2)
axes[1].set(title=r' $x_{t} = q_{1t+1} - q_{1t}$', xlabel='t')

axes[2].plot(range(n), vt_leader, 'bo', ms=2)


axes[2].plot(range(n), vt_reset_leader, 'ro', ms=2)
axes[2].set(title=r'$v(y_{t})$ and $w(\hat x_t)$', xlabel='t')

plt.tight_layout()
plt.show()

The figure above shows

• in the third panel that for 𝑡 ≥ 1 the reborn at 𝑡 Stackelberg leader's value $w(\hat x_t)$ exceeds the continuation value $v(y_t)$ of the time 0 Stackelberg leader
• in the first panel that for 𝑡 ≥ 1 the reborn at 𝑡 Stackelberg leader wants to reduce his output below that prescribed by the time 0 Stackelberg leader
• in the second panel that for 𝑡 ≥ 1 the reborn at 𝑡 Stackelberg leader wants to increase the output of the follower, firm 1, above that prescribed by the time 0 Stackelberg leader

Taken together, these outcomes express the time inconsistency of the original time 0 Stackelberg leader's plan.

41.11 Recursive Formulation of Follower’s Problem

We now formulate and compute the recursive version of the follower’s problem.
We check that the recursive Big 𝐾 , little 𝑘 formulation of the follower’s problem produces the same output path 𝑞1⃗ that
we computed when we solved the Stackelberg problem

A_tilde = np.eye(5)
A_tilde[:4, :4] = A - B @ F

R_tilde = np.array([[0, 0, 0, 0, -a0 / 2],


[0, 0, 0, 0, a1 / 2],
[0, 0, 0, 0, 0],
[0, 0, 0, 0, 0],
[-a0 / 2, a1 / 2, 0, 0, a1]])

Q_tilde = Q
B_tilde = np.array([[0, 0, 0, 0, 1]]).T

lq_tilde = LQ(Q_tilde, R_tilde, A_tilde, B_tilde, beta=β)


P_tilde, F_tilde, d_tilde = lq_tilde.stationary_values(method='doubling')

y0_tilde = np.vstack((y0, y0[2]))


yt_tilde = lq_tilde.compute_sequence(y0_tilde, ts_length=n)[0]

# Checks that the recursive formulation of the follower's problem gives


# the same solution as the original Stackelberg problem
fig, ax = plt.subplots()
ax.plot(yt_tilde[4], 'r', label="q_tilde")
ax.plot(yt_tilde[2], 'b', label="q")
ax.legend()
plt.show()

Note: Variables with _tilde are obtained from solving the follower’s problem – those without are from the Stackelberg
problem


# Maximum absolute difference in quantities over time between


# the first and second solution methods
np.max(np.abs(yt_tilde[4] - yt_tilde[2]))

4.440892098500626e-16

# x0 == x0_tilde
yt[:, 0][-1] - (yt_tilde[:, 1] - yt_tilde[:, 0])[-1] < tol0

True

41.11.1 Explanation of Alignment

If we inspect coefficients in the decision rule −𝐹 ̃ , we should be able to spot why the follower chooses to set 𝑥𝑡 = 𝑥𝑡̃
when it sets 𝑥𝑡 = −𝐹 ̃ 𝑋𝑡 in the recursive formulation of the follower problem.
Can you spot what features of 𝐹 ̃ imply this?

Hint: Remember the components of 𝑋𝑡

# Policy function in the follower's problem


F_tilde.round(4)

array([[ 0. , -0. , -0.1032, -1. , 0.1032]])

# Value function in the Stackelberg problem


P

array([[ 963.54083615, -194.60534465, -511.62197962, -5258.22585724],


[ -194.60534465, 37.3535753 , 81.97712513, 784.76471234],
[ -511.62197962, 81.97712513, 247.34333344, 2517.05126111],
[-5258.22585724, 784.76471234, 2517.05126111, 25556.16504097]])

# Value function in the follower's problem


P_tilde

array([[-1.81991134e+01, 2.58003020e+00, 1.56048755e+01,


1.51229815e+02, -5.00000000e+00],
[ 2.58003020e+00, -9.69465925e-01, -5.26007958e+00,
-5.09764310e+01, 1.00000000e+00],
[ 1.56048755e+01, -5.26007958e+00, -3.22759027e+01,
-3.12791908e+02, -1.23823802e+01],
[ 1.51229815e+02, -5.09764310e+01, -3.12791908e+02,
-3.03132584e+03, -1.20000000e+02],
[-5.00000000e+00, 1.00000000e+00, -1.23823802e+01,
-1.20000000e+02, 1.43823802e+01]])


# Manually check that P is an approximate fixed point


(P - ((R + F.T @ Q @ F) + β * (A - B @ F).T @ P @ (A - B @ F)) < tol0).all()

True

# Compute `P_guess` using `F_tilde_star`


F_tilde_star = -np.array([[0, 0, 0, 1, 0]])
P_guess = np.zeros((5, 5))

for i in range(1000):
P_guess = ((R_tilde + F_tilde_star.T @ Q @ F_tilde_star) +
β * (A_tilde - B_tilde @ F_tilde_star).T @ P_guess
@ (A_tilde - B_tilde @ F_tilde_star))

# Value function in the follower's problem


-(y0_tilde.T @ P_tilde @ y0_tilde)[0, 0]

112.65590740578115

# Value function with `P_guess`


-(y0_tilde.T @ P_guess @ y0_tilde)[0, 0]

112.65590740578136

# Compute policy using policy iteration algorithm


F_iter = (β * la.inv(Q + β * B_tilde.T @ P_guess @ B_tilde)
@ B_tilde.T @ P_guess @ A_tilde)

for i in range(100):
# Compute P_iter
P_iter = np.zeros((5, 5))
for j in range(1000):
P_iter = ((R_tilde + F_iter.T @ Q @ F_iter) + β
* (A_tilde - B_tilde @ F_iter).T @ P_iter
@ (A_tilde - B_tilde @ F_iter))

# Update F_iter
F_iter = (β * la.inv(Q + β * B_tilde.T @ P_iter @ B_tilde)
@ B_tilde.T @ P_iter @ A_tilde)

dist_vec = (P_iter - ((R_tilde + F_iter.T @ Q @ F_iter)


+ β * (A_tilde - B_tilde @ F_iter).T @ P_iter
@ (A_tilde - B_tilde @ F_iter)))

if np.max(np.abs(dist_vec)) < 1e-8:


dist_vec2 = (F_iter - (β * la.inv(Q + β * B_tilde.T @ P_iter @ B_tilde)
@ B_tilde.T @ P_iter @ A_tilde))

if np.max(np.abs(dist_vec2)) < 1e-8:


F_iter
else:
print("The policy didn't converge: try increasing the number of \
outer loop iterations")
else:
print("`P_iter` didn't converge: try increasing the number of inner \
loop iterations")

# Simulate the system using `F_tilde_star` and check that it gives the
# same result as the original solution

yt_tilde_star = np.zeros((n, 5))


yt_tilde_star[0, :] = y0_tilde.flatten()

for t in range(n-1):
yt_tilde_star[t+1, :] = (A_tilde - B_tilde @ F_tilde_star) \
@ yt_tilde_star[t, :]

fig, ax = plt.subplots()
ax.plot(yt_tilde_star[:, 4], 'r', label="q_tilde")
ax.plot(yt_tilde[2], 'b', label="q")
ax.legend()
plt.show()

# Maximum absolute difference


np.max(np.abs(yt_tilde_star[:, 4] - yt_tilde[2, :-1]))

0.0


41.12 Markov Perfect Equilibrium

The state vector is

$$
z_t = \begin{bmatrix} 1 \\ q_{2t} \\ q_{1t} \end{bmatrix}
$$
and the state transition dynamics are

𝑧𝑡+1 = 𝐴𝑧𝑡 + 𝐵1 𝑣1𝑡 + 𝐵2 𝑣2𝑡

where 𝐴 is a 3 × 3 identity matrix and

$$
B_1 = \begin{bmatrix} 0 \\ 0 \\ 1 \end{bmatrix}, \qquad B_2 = \begin{bmatrix} 0 \\ 1 \\ 0 \end{bmatrix}
$$
The Markov perfect decision rules are

𝑣1𝑡 = −𝐹1 𝑧𝑡 , 𝑣2𝑡 = −𝐹2 𝑧𝑡

and in the Markov perfect equilibrium, the state evolves according to

𝑧𝑡+1 = (𝐴 − 𝐵1 𝐹1 − 𝐵2 𝐹2 )𝑧𝑡

# In LQ form
A = np.eye(3)
B1 = np.array([[0], [0], [1]])
B2 = np.array([[0], [1], [0]])

R1 = np.array([[0, 0, -a0 / 2],


[0, 0, a1 / 2],
[-a0 / 2, a1 / 2, a1]])

R2 = np.array([[0, -a0 / 2, 0],


[-a0 / 2, a1, a1 / 2],
[0, a1 / 2, 0]])

Q1 = Q2 = γ
S1 = S2 = W1 = W2 = M1 = M2 = 0.0

# Solve using QE's nnash function


F1, F2, P1, P2 = qe.nnash(A, B1, B2, R1, R2, Q1,
Q2, S1, S2, W1, W2, M1,
M2, beta=β, tol=tol1)

# Simulate forward
AF = A - B1 @ F1 - B2 @ F2
z = np.empty((3, n))
z[:, 0] = 1, 1, 1
for t in range(n-1):
z[:, t+1] = AF @ z[:, t]

# Display policies
print("Computed policies for firm 1 and firm 2:\n")
print(f"F1 = {F1}")
print(f"F2 = {F2}")


Computed policies for firm 1 and firm 2:

F1 = [[-0.22701363 0.03129874 0.09447113]]


F2 = [[-0.22701363 0.09447113 0.03129874]]

q1 = z[1, :]
q2 = z[2, :]
q = q1 + q2 # Total output, MPE
p = a0 - a1 * q # Price, MPE

fig, ax = plt.subplots(figsize=(9, 5.8))


ax.plot(range(n), q, 'b-', lw=2, label='total output')
ax.plot(range(n), p, 'g-', lw=2, label='price')
ax.set_title('Output and prices, duopoly MPE')
ax.legend(frameon=False)
ax.set_xlabel('t')
plt.show()

# Computes the maximum difference between the two quantities of the two firms
np.max(np.abs(q1 - q2))

8.881784197001252e-16


# Compute values
u1 = (- F1 @ z).flatten()
u2 = (- F2 @ z).flatten()

π_1 = p * q1 - γ * (u1) ** 2
π_2 = p * q2 - γ * (u2) ** 2

v1_forward = np.sum(βs * π_1)


v2_forward = np.sum(βs * π_2)

v1_direct = (- z[:, 0].T @ P1 @ z[:, 0])


v2_direct = (- z[:, 0].T @ P2 @ z[:, 0])

# Display values
print("Computed values for firm 1 and firm 2:\n")
print(f"v1(forward sim) = {v1_forward:.4f}; v1 (direct) = {v1_direct:.4f}")
print(f"v2 (forward sim) = {v2_forward:.4f}; v2 (direct) = {v2_direct:.4f}")

Computed values for firm 1 and firm 2:

v1(forward sim) = 133.3303; v1 (direct) = 133.3296


v2 (forward sim) = 133.3303; v2 (direct) = 133.3296

# Sanity check
Λ1 = A - B2 @ F2
lq1 = qe.LQ(Q1, R1, Λ1, B1, beta=β)
P1_ih, F1_ih, d = lq1.stationary_values()

v2_direct_alt = - z[:, 0].T @ lq1.P @ z[:, 0] + lq1.d

(np.abs(v2_direct - v2_direct_alt) < tol2).all()

True

41.13 Comparing Markov Perfect Equilibrium and Stackelberg Outcome

It is enlightening to compare equilibrium values for firms 1 and 2 under two alternative settings:
• A Markov perfect equilibrium like that described in this lecture
• A Stackelberg equilibrium
The following code performs the required computations, then plots the continuation values.

vt_MPE = np.zeros(n)
vt_follower = np.zeros(n)

for t in range(n):
vt_MPE[t] = -z[:, t].T @ P1 @ z[:, t]
vt_follower[t] = -yt_tilde[:, t].T @ P_tilde @ yt_tilde[:, t]


fig, ax = plt.subplots()
ax.plot(vt_MPE, 'b', label='MPE')
ax.plot(vt_leader, 'r', label='Stackelberg leader')
ax.plot(vt_follower, 'g', label='Stackelberg follower')
ax.set_title(r'Values for MPE duopolists and Stackelberg firms')
ax.set_xlabel('t')
ax.legend(loc=(1.05, 0))
plt.show()

# Display values
print("Computed values:\n")
print(f"vt_leader(y0) = {vt_leader[0]:.4f}")
print(f"vt_follower(y0) = {vt_follower[0]:.4f}")
print(f"vt_MPE(y0) = {vt_MPE[0]:.4f}")

Computed values:

vt_leader(y0) = 150.0324
vt_follower(y0) = 112.6559
vt_MPE(y0) = 133.3296

# Compute the difference in total value between the Stackelberg and the MPE
vt_leader[0] + vt_follower[0] - 2 * vt_MPE[0]

-3.9709425620890784



CHAPTER

FORTYTWO

MACHINE LEARNING A RAMSEY PLAN

42.1 Introduction

This lecture uses what we call a machine learning approach to compute a Ramsey plan for a version of a model
of Calvo [Calvo, 1978].
We use another approach to compute a Ramsey plan for Calvo’s model in another quantecon lecture Time Inconsistency
of Ramsey Plans.
The Time Inconsistency of Ramsey Plans lecture uses an analytic approach based on dynamic programming
squared to guide computations.
Dynamic programming squared provides information about the structure of mathematical objects in terms of which a
Ramsey plan can be represented recursively.
Using that information paves the way to computing a Ramsey plan efficiently.
Included in the structural information that dynamic programming squared provides in quantecon lecture Time Inconsis-
tency of Ramsey Plans are
• a state variable that confronts a continuation Ramsey planner, and
• two Bellman equations
– one that describes the behavior of the representative agent
– another that describes decision problems of a Ramsey planner and of a continuation Ramsey planner
In this lecture, we approach the Ramsey planner in a less sophisticated way that proceeds without knowing the mathe-
matical structure imparted by dynamic programming squared.
We simply choose a pair of infinite sequences of real numbers that maximizes a Ramsey planner’s objective function.
The pair consists of
• a sequence 𝜃 ⃗ of inflation rates
• a sequence 𝜇⃗ of money growth rates
Because it fails to take advantage of the structure recognized by dynamic programming squared and, relative to the
dynamic programming squared approach, proliferates parameters, we take the liberty of calling this a machine learning
approach.
This is similar to what other machine learning algorithms also do.
Comparing the calculations in this lecture with those in our sister lecture Time Inconsistency of Ramsey Plans provides us
with a laboratory that can help us appreciate promises and limits of machine learning approaches more generally.
In this lecture, we’ll actually deploy two machine learning approaches.


• the first is really lazy


– it writes a Python function that computes the Ramsey planner’s objective as a function of a money growth
rate sequence and hands it over to a gradient descent optimizer
• the second is less lazy
– it exerts the mental effort required to express the Ramsey planner's objective as an affine quadratic form in 𝜇⃗, computes first-order conditions for an optimum, arranges them into a system of simultaneous linear equations for 𝜇⃗ and then 𝜃⃗, then solves them.
Each of these machine learning (ML) approaches recovers the same Ramsey plan that we compute in quantecon lecture
Time Inconsistency of Ramsey Plans by using dynamic programming squared.
However, the recursive structure of the Ramsey plan lies hidden within some of the objects calculated by our ML ap-
proaches.
To ferret out that structure, we have to ask the right questions.
We pose some of those questions at the end of this lecture and answer them by running some linear regressions on
components of 𝜇,⃗ 𝜃,⃗ and another vector that we’ll define later.
Human intelligence, not the artificial intelligence deployed in our machine learning approach, is a key input
into choosing which regressions to run.

42.2 The Model

We study a linear-quadratic version of a model that Guillermo Calvo [Calvo, 1978] used to illustrate the time inconsis-
tency of optimal government plans.
Calvo’s model focuses on intertemporal tradeoffs between
• utility accruing from a representative agent’s anticipations of future deflation that lower the agent’s cost of holding
real money balances and prompt him to increase his liquidity, as measured by his stock of real money balances,
and
• social costs associated with the distorting taxes that a government levies to acquire the paper money that it destroys
in order to generate prospective deflation
The model features
• rational expectations
• costly government actions at all dates 𝑡 ≥ 1 that increase the representative agent’s utilities at dates before 𝑡
The model combines ideas from papers by Cagan [Cagan, 1956], [Sargent and Wallace, 1973], and Calvo [Calvo, 1978].

42.3 Model Components

There is no uncertainty.
Let:
• 𝑝𝑡 be the log of the price level
• 𝑚𝑡 be the log of nominal money balances
• 𝜃𝑡 = 𝑝𝑡+1 − 𝑝𝑡 be the net rate of inflation between 𝑡 and 𝑡 + 1
• 𝜇𝑡 = 𝑚𝑡+1 − 𝑚𝑡 be the net rate of growth of nominal balances


The demand for real balances is governed by a perfect foresight version of a Cagan [Cagan, 1956] demand function for
real balances:

𝑚𝑡 − 𝑝𝑡 = −𝛼(𝑝𝑡+1 − 𝑝𝑡 ) , 𝛼 > 0 (42.1)

for 𝑡 ≥ 0.
Equation (42.1) asserts that the representative agent’s demand for real balances is inversely related to the representative
agent’s expected rate of inflation, which equals the actual rate of inflation because there is no uncertainty here.
(When there is no uncertainty, an assumption of rational expectations becomes equivalent to perfect foresight).
Subtracting the demand function (42.1) at time 𝑡 from the demand function at 𝑡 + 1 gives:

𝜇𝑡 − 𝜃𝑡 = −𝛼𝜃𝑡+1 + 𝛼𝜃𝑡

or

$$
\theta_t = \frac{\alpha}{1+\alpha} \theta_{t+1} + \frac{1}{1+\alpha} \mu_t \tag{42.2}
$$

Because $\alpha > 0$, we have $0 < \frac{\alpha}{1+\alpha} < 1$.

Definition 42.3.1

For scalar $b_t$, let $L^2$ be the space of sequences $\{b_t\}_{t=0}^{\infty}$ that satisfy

$$
\sum_{t=0}^{\infty} b_t^2 < +\infty
$$

We say that a sequence that belongs to $L^2$ is square summable.

When we assume that the sequence $\vec\mu = \{\mu_t\}_{t=0}^{\infty}$ is square summable and also require that the sequence $\vec\theta = \{\theta_t\}_{t=0}^{\infty}$ is square summable, the linear difference equation (42.2) can be solved forward to get:

$$
\theta_t = \frac{1}{1+\alpha} \sum_{j=0}^{\infty} \left( \frac{\alpha}{1+\alpha} \right)^j \mu_{t+j}, \quad t \geq 0 \tag{42.3}
$$
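As a quick sanity check, the following sketch, with an assumed value α = 1 and a hypothetical square-summable path μ_t = 0.5**t, verifies numerically that the forward solution (42.3) satisfies the difference equation (42.2).

import numpy as np

α = 1.0
λ = α / (1 + α)
T_check, J = 30, 60                    # horizon for the check; truncation of the infinite sum
μ = 0.5 ** np.arange(T_check + J)      # decays fast, so truncation error is negligible

# θ_t from (42.3), truncating the infinite sum at J terms
θ = np.array([(1 / (1 + α)) * np.sum(λ**np.arange(J) * μ[t:t+J])
              for t in range(T_check)])

# (42.2): θ_t should equal λ θ_{t+1} + μ_t / (1 + α)
lhs = θ[:-1]
rhs = λ * θ[1:] + (1 / (1 + α)) * μ[:T_check - 1]
print(np.max(np.abs(lhs - rhs)))       # ≈ 0 up to truncation error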

The government values a representative household's utility of real balances at time 𝑡 according to the utility function

$$
U(m_t - p_t) = u_0 + u_1 (m_t - p_t) - \frac{u_2}{2} (m_t - p_t)^2, \quad u_0 > 0, u_1 > 0, u_2 > 0 \tag{42.4}
$$

The money demand function (42.1) and the utility function (42.4) imply that

$$
U(-\alpha \theta_t) = u_0 + u_1 (-\alpha \theta_t) - \frac{u_2}{2} (-\alpha \theta_t)^2 . \tag{42.5}
$$

Note: The "bliss level" of real balances is $\frac{u_1}{u_2}$; the inflation rate that attains it is $-\frac{u_1}{u_2 \alpha}$.
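To see where these come from, note from (42.5) that

$$
\frac{d}{d\theta_t} U(-\alpha\theta_t) = -\alpha u_1 - u_2 \alpha^2 \theta_t,
$$

which is zero at $\theta_t = -\frac{u_1}{u_2 \alpha}$; at that inflation rate real balances equal $-\alpha\theta_t = \frac{u_1}{u_2}$, the bliss level.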

Via equation (42.3), a government plan $\vec\mu = \{\mu_t\}_{t=0}^{\infty}$ leads to a sequence of inflation rates $\vec\theta = \{\theta_t\}_{t=0}^{\infty}$.

We assume that the government incurs social costs $\frac{c}{2} \mu_t^2$ when it changes the stock of nominal money balances at rate $\mu_t$ at time 𝑡.

Therefore, the one-period welfare function of a benevolent government is

$$
s(\theta_t, \mu_t) = U(-\alpha \theta_t) - \frac{c}{2} \mu_t^2 .
$$


The Ramsey planner's criterion is

$$
V = \sum_{t=0}^{\infty} \beta^t s(\theta_t, \mu_t) \tag{42.6}
$$

where $\beta \in (0, 1)$ is a discount factor.

The Ramsey planner chooses a vector of money growth rates $\vec\mu$ to maximize criterion (42.6) subject to equations (42.3) and the restriction

$$
\vec\theta \in L^2 \tag{42.7}
$$

Equations (42.3) and (42.7) imply that $\vec\theta$ is a function of $\vec\mu$.

In particular, the inflation rate $\theta_t$ satisfies

$$
\theta_t = (1 - \lambda) \sum_{j=0}^{\infty} \lambda^j \mu_{t+j}, \quad t \geq 0 \tag{42.8}
$$

where

$$
\lambda = \frac{\alpha}{1+\alpha} .
$$
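To see that (42.8) merely rewrites (42.3), note that $\lambda = \frac{\alpha}{1+\alpha}$ implies $1-\lambda = \frac{1}{1+\alpha}$, so that

$$
\frac{1}{1+\alpha}\left(\frac{\alpha}{1+\alpha}\right)^j = (1-\lambda)\lambda^j
$$

term by term.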

42.4 Parameters and Variables

Parameters:
• Demand for money parameter is 𝛼 > 0; we set its default value 𝛼 = 1
– Induced demand function for money parameter is $\lambda = \frac{\alpha}{1+\alpha}$

• Utility function parameters are 𝑢0 , 𝑢1 , 𝑢2 and 𝛽 ∈ (0, 1)


• Cost parameter of tax distortions associated with setting 𝜇𝑡 ≠ 0 is 𝑐
• A horizon truncation parameter: a positive integer 𝑇 > 0
Variables:
• 𝜃𝑡 = 𝑝𝑡+1 − 𝑝𝑡 where 𝑝𝑡 is log of price level
• 𝜇𝑡 = 𝑚𝑡+1 − 𝑚𝑡 where 𝑚𝑡 is log of money supply

42.4.1 Basic Objects

To prepare the way for our calculations, we’ll remind ourselves of the mathematical objects in play.
• sequences of inflation rates and money creation rates:

$$
(\vec\theta, \vec\mu) = \{\theta_t, \mu_t\}_{t=0}^{\infty}
$$

• A planner's value function

$$
V = \sum_{t=0}^{\infty} \beta^t \left( h_0 + h_1 \theta_t + h_2 \theta_t^2 - \frac{c}{2} \mu_t^2 \right) \tag{42.9}
$$

where we set $h_0, h_1, h_2$ to match

$$
u_0 + u_1 (-\alpha \theta_t) - \frac{u_2}{2} (-\alpha \theta_t)^2
$$


with

$$
h_0 + h_1 \theta_t + h_2 \theta_t^2
$$

To make our parameters match as we want, we set

$$
h_0 = u_0, \qquad h_1 = -\alpha u_1, \qquad h_2 = -\frac{u_2 \alpha^2}{2}
$$
A Ramsey planner chooses 𝜇⃗ to maximize the government’s value function (42.9) subject to equations (42.8).
A solution 𝜇⃗ of this problem is called a Ramsey plan.

42.4.2 Timing protocol

Following Calvo [Calvo, 1978], we assume that the government chooses the money growth sequence 𝜇⃗ once and for all
at, or before, time 0.
An optimal government plan under this timing protocol is an example of what is often called a Ramsey plan.
Notice that while the government is in effect choosing a bivariate time series $(\vec\mu, \vec\theta)$, the government's problem is static in the sense that it treats that time series as a single object to be chosen at a single point in time.

42.5 Approximation and Truncation parameter 𝑇

We anticipate that under a Ramsey plan the sequences {𝜃𝑡 } and {𝜇𝑡 } both converge to stationary values.
Thus, we guess that under the optimal policy lim𝑡→+∞ 𝜇𝑡 = 𝜇.̄
Convergence of 𝜇𝑡 to 𝜇̄ together with formula (42.8) for the inflation rate then implies that lim𝑡→+∞ 𝜃𝑡 = 𝜇̄ as well.
We’ll guess a time 𝑇 large enough that 𝜇𝑡 has gotten very close to the limit 𝜇.̄
Then we’ll approximate 𝜇⃗ by a truncated vector with the property that

𝜇𝑡 = 𝜇̄ ∀𝑡 ≥ 𝑇

We’ll approximate 𝜃 ⃗ with a truncated vector with the property that

𝜃𝑡 = 𝜃 ̄ ∀𝑡 ≥ 𝑇

Formula for truncated 𝜃 ⃗


In light of our approximation that 𝜇𝑡 = 𝜇̄ for all 𝑡 ≥ 𝑇 , we seek a function that takes

𝜇̃ = [𝜇0 𝜇1 ⋯ 𝜇𝑇 −1 𝜇]̄

as an input and as an output gives

𝜃 ̃ = [𝜃0 𝜃1 ⋯ 𝜃𝑇 −1 𝜃]̄

where 𝜃 ̄ = 𝜇̄ and 𝜃𝑡 satisfies


$$
\theta_t = (1 - \lambda) \sum_{j=0}^{T-1-t} \lambda^j \mu_{t+j} + \lambda^{T-t} \bar\mu \tag{42.10}
$$


for 𝑡 = 0, 1, … , 𝑇 − 1.
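Here is a minimal plain-NumPy sketch of the mapping defined by (42.10); the JAX function compute_θ defined below performs the same computation.

import numpy as np

def truncated_θ(μ_tilde, α=1.0):
    """Map μ̃ = [μ_0, ..., μ_{T-1}, μ̄] into θ̃ using (42.10)."""
    μ_tilde = np.asarray(μ_tilde)
    λ = α / (1 + α)
    T = len(μ_tilde) - 1
    μ_bar = μ_tilde[-1]
    θ = np.empty(T + 1)
    for t in range(T):
        j = np.arange(T - t)
        θ[t] = (1 - λ) * np.sum(λ**j * μ_tilde[t:T]) + λ**(T - t) * μ_bar
    θ[T] = μ_bar
    return θ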
Formula for 𝑉
Having specified a truncated vector $\tilde\mu$ and having computed $\tilde\theta$ by using formula (42.10), we shall write a Python function that computes

$$
\tilde V = \sum_{t=0}^{\infty} \beta^t \left( h_0 + h_1 \tilde\theta_t + h_2 \tilde\theta_t^2 - \frac{c}{2} \mu_t^2 \right) \tag{42.11}
$$

or more precisely

$$
\tilde V = \sum_{t=0}^{T-1} \beta^t \left( h_0 + h_1 \tilde\theta_t + h_2 \tilde\theta_t^2 - \frac{c}{2} \mu_t^2 \right) + \frac{\beta^T}{1-\beta} \left( h_0 + h_1 \bar\mu + h_2 \bar\mu^2 - \frac{c}{2} \bar\mu^2 \right)
$$

where $\tilde\theta_t, t = 0, 1, \ldots, T-1$ satisfies formula (42.10).
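The second expression arises because, under the truncation, the summand is constant for $t \geq T$, so the tail of the sum collapses: $\sum_{t=T}^{\infty} \beta^t = \frac{\beta^T}{1-\beta}$.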

42.6 A Gradient Descent Algorithm

We now describe code that maximizes the criterion function (42.9) subject to equations (42.8) by choice of the truncated vector $\tilde\mu$.

We use a brute force or machine learning approach that just hands our problem off to code that maximizes 𝑉 with respect to the components of $\tilde\mu$ by using gradient descent (the code actually minimizes $-V$).

We hope that answers will agree with those obtained by other more structured methods in this quantecon lecture Time Inconsistency of Ramsey Plans.

42.6.1 Implementation

We will implement the above in Python using JAX and Optax libraries.
We use the following imports in this lecture

!pip install --upgrade quantecon


!pip install --upgrade optax
!pip install --upgrade statsmodels

from quantecon import LQ


import numpy as np
import jax.numpy as jnp
from jax import jit, grad
import optax
import statsmodels.api as sm
import matplotlib.pyplot as plt

We'll eventually want to compare the results we obtain here to those obtained in this quantecon lecture Time Inconsistency of Ramsey Plans.
To enable us to do that, we copy the class ChangLQ used in that lecture.
We hide the cell that copies the class, but readers can find details of the class in this quantecon lecture Time Inconsistency
of Ramsey Plans.
Now we compute the value of 𝑉 under this setup, and compare it against those obtained in this section Outcomes under
Three Timing Protocols of the sister quantecon lecture Time Inconsistency of Ramsey Plans.


# Assume β=0.85, c=2, T=40.


T = 40
clq = ChangLQ(β=0.85, c=2, T=T)

@jit
def compute_θ(μ, α=1):
λ = α / (1 + α)
T = len(μ) - 1
μbar = μ[-1]

# Create an array of powers for λ


λ_powers = λ ** jnp.arange(T + 1)

# Compute the weighted sums for all t


weighted_sums = jnp.array(
[jnp.sum(λ_powers[:T-t] * μ[t:T]) for t in range(T)])

# Compute θ values except for the last element


θ = (1 - λ) * weighted_sums + λ**(T - jnp.arange(T)) * μbar

# Set the last element


θ = jnp.append(θ, μbar)

return θ

@jit
def compute_hs(u0, u1, u2, α):
h0 = u0
h1 = -u1 * α
h2 = -0.5 * u2 * α**2

return h0, h1, h2

@jit
def compute_V(μ, β, c, α=1, u0=1, u1=0.5, u2=3):
θ = compute_θ(μ, α)

h0, h1, h2 = compute_hs(u0, u1, u2, α)

T = len(μ) - 1
t = np.arange(T)

# Compute sum except for the last element


V_sum = np.sum(β**t * (h0 + h1 * θ[:T] + h2 * θ[:T]**2 - 0.5 * c * μ[:T]**2))

# Compute the final term


V_final = (β**T / (1 - β)) * (h0 + h1 * μ[-1] + h2 * μ[-1]**2 - 0.5 * c * μ[-1]**2)

V = V_sum + V_final

return V

V_val = compute_V(clq.μ_series, β=0.85, c=2)



# Check the result with the ChangLQ class in previous lecture
print(f'deviation = {np.abs(V_val - clq.J_series[0])}') # good!

deviation = 1.430511474609375e-06

Now we want to maximize the function 𝑉 by choice of 𝜇.


We will use optax.adam from the optax library.

def adam_optimizer(grad_func, init_params,


lr=0.1,
max_iter=10_000,
error_tol=1e-7):

# Set initial parameters and optimizer


params = init_params
optimizer = optax.adam(learning_rate=lr)
opt_state = optimizer.init(params)

# Update parameters and gradients


@jit
def update(params, opt_state):
grads = grad_func(params)
updates, opt_state = optimizer.update(grads, opt_state)
params = optax.apply_updates(params, updates)
return params, opt_state, grads

# Gradient descent loop


for i in range(max_iter):
params, opt_state, grads = update(params, opt_state)

if jnp.linalg.norm(grads) < error_tol:


print(f"Converged after {i} iterations.")
break

if i % 100 == 0:
print(f"Iteration {i}, grad norm: {jnp.linalg.norm(grads)}")

return params

Here we use automatic differentiation functionality in JAX with grad.

# Initial guess for μ


μ_init = jnp.zeros(T)

# Maximization instead of minimization


grad_V = jit(grad(
lambda μ: -compute_V(μ, β=0.85, c=2)))

%%time

# Optimize μ
optimized_μ = adam_optimizer(grad_V, μ_init)

print(f"optimized μ = \n{optimized_μ}")


Iteration 0, grad norm: 0.8627105951309204


Iteration 100, grad norm: 0.003303058445453644
Iteration 200, grad norm: 1.6979402062133886e-05
Converged after 280 iterations.
optimized μ =
[-0.06450712 -0.09033988 -0.10068494 -0.10482776 -0.1064868 -0.1071512
-0.10741723 -0.10752378 -0.10756643 -0.10758355 -0.1075904 -0.10759311
-0.10759423 -0.10759469 -0.10759488 -0.10759495 -0.10759498 -0.10759498
-0.10759496 -0.10759498 -0.10759496 -0.10759496 -0.10759495 -0.10759497
-0.10759496 -0.107595 -0.107595 -0.10759497 -0.10759496 -0.10759496
-0.10759494 -0.10759494 -0.10759492 -0.10759494 -0.10759492 -0.10759493
-0.10759496 -0.10759498 -0.10759498 -0.10759498]
CPU times: user 716 ms, sys: 64.4 ms, total: 780 ms
Wall time: 519 ms

print(f"original μ = \n{clq.μ_series}")

original μ =
[-0.06450708 -0.09033982 -0.10068489 -0.10482772 -0.10648677 -0.10715115
-0.10741722 -0.10752377 -0.10756644 -0.10758352 -0.10759037 -0.10759311
-0.1075942 -0.10759464 -0.10759482 -0.10759489 -0.10759492 -0.10759493
-0.10759493 -0.10759494 -0.10759494 -0.10759494 -0.10759494 -0.10759494
-0.10759494 -0.10759494 -0.10759494 -0.10759494 -0.10759494 -0.10759494
-0.10759494 -0.10759494 -0.10759494 -0.10759494 -0.10759494 -0.10759494
-0.10759494 -0.10759494 -0.10759494 -0.10759494]

print(f'deviation = {np.linalg.norm(optimized_μ - clq.μ_series)}')

deviation = 2.308478030954575e-07

compute_V(optimized_μ, β=0.85, c=2)

Array(6.8357825, dtype=float32)

compute_V(clq.μ_series, β=0.85, c=2)

Array(6.835783, dtype=float32)

42.6.2 Restricting 𝜇𝑡 = 𝜇̄ for all 𝑡

We take a brief detour to solve a restricted version of the Ramsey problem defined above.
First, recall that a Ramsey planner chooses 𝜇⃗ to maximize the government’s value function (42.9) subject to equations
(42.8).
We now define a distinct problem in which the planner chooses 𝜇⃗ to maximize the government’s value function (42.9)
subject to equation (42.8) and the additional restriction that 𝜇𝑡 = 𝜇̄ for all 𝑡.
The solution of this problem is a time-invariant 𝜇𝑡 that this quantecon lecture Time Inconsistency of Ramsey Plans calls
𝜇𝐶𝑅 .


# Initial guess for single μ


μ_init = jnp.zeros(1)

# Maximization instead of minimization


grad_V = jit(grad(
lambda μ: -compute_V(μ, β=0.85, c=2)))

# Optimize μ
optimized_μ_CR = adam_optimizer(grad_V, μ_init)

print(f"optimized μ = \n{optimized_μ_CR}")

Iteration 0, grad norm: 3.333333969116211


Iteration 100, grad norm: 0.0049784183502197266
Iteration 200, grad norm: 6.771087646484375e-05
Converged after 282 iterations.
optimized μ =
[-0.10000004]

Comparing it to 𝜇𝐶𝑅 in Time Inconsistency of Ramsey Plans, we again obtained very close answers.

np.linalg.norm(clq.μ_CR - optimized_μ_CR)

3.7252903e-08

V_CR = compute_V(optimized_μ_CR, β=0.85, c=2)


V_CR

Array(6.8333354, dtype=float32)

compute_V(jnp.array([clq.μ_CR]), β=0.85, c=2)

Array(6.8333344, dtype=float32)

42.7 A More Structured ML Algorithm

By thinking about the mathematical structure of the Ramsey problem and using some linear algebra, we can simplify the
problem that we hand over to a machine learning algorithm.
We start by recalling that the Ramsey problem chooses $\vec\mu$ to maximize the government's value function (42.9) subject to equation (42.8).
This turns out to be an optimization problem with a quadratic objective function and linear constraints.
First-order conditions for this problem are a set of simultaneous linear equations in 𝜇.⃗
If we trust that the second-order conditions for a maximum are also satisfied (they are in our problem), we can compute
the Ramsey plan by solving these equations for 𝜇.⃗
We’ll apply this approach here and compare answers with what we obtained above with the gradient descent approach.


To remind us of the setting, remember that we have assumed that

𝜇𝑡 = 𝜇𝑇 ∀𝑡 ≥ 𝑇

and that

𝜃𝑡 = 𝜃𝑇 = 𝜇𝑇 ∀𝑡 ≥ 𝑇

Again, define

$$
\vec\theta = \begin{bmatrix} \theta_0 \\ \theta_1 \\ \vdots \\ \theta_{T-1} \\ \theta_T \end{bmatrix}, \qquad
\vec\mu = \begin{bmatrix} \mu_0 \\ \mu_1 \\ \vdots \\ \mu_{T-1} \\ \mu_T \end{bmatrix}
$$

Write the system of 𝑇 + 1 equations (42.10) that relate $\vec\theta$ to a choice of $\vec\mu$ as the single matrix equation

$$
\frac{1}{1-\lambda}
\begin{bmatrix}
1 & -\lambda & 0 & 0 & \cdots & 0 & 0 \\
0 & 1 & -\lambda & 0 & \cdots & 0 & 0 \\
0 & 0 & 1 & -\lambda & \cdots & 0 & 0 \\
\vdots & \vdots & \vdots & \vdots & \ddots & -\lambda & 0 \\
0 & 0 & 0 & 0 & \cdots & 1 & -\lambda \\
0 & 0 & 0 & 0 & \cdots & 0 & 1-\lambda
\end{bmatrix}
\begin{bmatrix} \theta_0 \\ \theta_1 \\ \theta_2 \\ \vdots \\ \theta_{T-1} \\ \theta_T \end{bmatrix}
=
\begin{bmatrix} \mu_0 \\ \mu_1 \\ \mu_2 \\ \vdots \\ \mu_{T-1} \\ \mu_T \end{bmatrix}
$$

or

$$
A \vec\theta = \vec\mu
$$

or

$$
\vec\theta = B \vec\mu
$$

where

$$
B = A^{-1}
$$

def construct_B(α, T):


λ = α / (1 + α)

A = (jnp.eye(T, T) - λ*jnp.eye(T, T, k=1))/(1-λ)


A = A.at[-1, -1].set(A[-1, -1]*(1-λ))

B = jnp.linalg.inv(A)
return A, B

A, B = construct_B(α=clq.α, T=T)

print(f'A = \n {A}')

A =
[[ 2. -1. 0. ... 0. 0. 0.]
[ 0. 2. -1. ... 0. 0. 0.]
[ 0. 0. 2. ... 0. 0. 0.]
...
[ 0. 0. 0. ... 2. -1. 0.]
[ 0. 0. 0. ... 0. 2. -1.]
[ 0. 0. 0. ... 0. 0. 1.]]


# Compute θ using optimized_μ


θs = np.array(compute_θ(optimized_μ))
μs = np.array(optimized_μ)

np.allclose(θs, B @ clq.μ_series)

True

As before, the Ramsey planner's criterion is

$$
V = \sum_{t=0}^{\infty} \beta^t \left( h_0 + h_1 \theta_t + h_2 \theta_t^2 - \frac{c}{2} \mu_t^2 \right)
$$

With our assumption above, criterion 𝑉 can be rewritten as

$$
\begin{aligned}
V &= \sum_{t=0}^{T-1} \beta^t \left( h_0 + h_1 \theta_t + h_2 \theta_t^2 - \frac{c}{2} \mu_t^2 \right) \\
&\quad + \frac{\beta^T}{1-\beta} \left( h_0 + h_1 \theta_T + h_2 \theta_T^2 - \frac{c}{2} \mu_T^2 \right)
\end{aligned}
$$

To help us write 𝑉 as a quadratic plus affine form, define

$$
\vec\beta = \begin{bmatrix} 1 \\ \beta \\ \vdots \\ \beta^{T-1} \\ \frac{\beta^T}{1-\beta} \end{bmatrix}
$$

Then we have:

$$
h_1 \sum_{t=0}^{\infty} \beta^t \theta_t = h_1 \cdot \vec\beta^{\,T} \vec\theta = (h_1 \cdot B^T \vec\beta)^T \vec\mu = g^T \vec\mu
$$

where $g = h_1 \cdot B^T \vec\beta$ is a $(T+1) \times 1$ vector,

$$
h_2 \sum_{t=0}^{\infty} \beta^t \theta_t^2 = \vec\mu^{\,T} \left( B^T (h_2 \cdot \vec\beta \cdot I) B \right) \vec\mu = \vec\mu^{\,T} M \vec\mu
$$

where $M = B^T (h_2 \cdot \vec\beta \cdot I) B$ is a $(T+1) \times (T+1)$ matrix,

$$
\frac{c}{2} \sum_{t=0}^{\infty} \beta^t \mu_t^2 = \vec\mu^{\,T} \left( \frac{c}{2} \cdot \vec\beta \cdot I \right) \vec\mu = \vec\mu^{\,T} F \vec\mu
$$

where $F = \frac{c}{2} \cdot \vec\beta \cdot I$ is a $(T+1) \times (T+1)$ matrix.

It follows that

$$
\begin{aligned}
J = V - h_0 &= \sum_{t=0}^{\infty} \beta^t \left( h_1 \theta_t + h_2 \theta_t^2 - \frac{c}{2} \mu_t^2 \right) \\
&= g^T \vec\mu + \vec\mu^{\,T} M \vec\mu - \vec\mu^{\,T} F \vec\mu \\
&= g^T \vec\mu + \vec\mu^{\,T} (M - F) \vec\mu \\
&= g^T \vec\mu + \vec\mu^{\,T} G \vec\mu
\end{aligned}
$$


where 𝐺 = 𝑀 − 𝐹 .
To compute the optimal government plan we want to maximize 𝐽 with respect to 𝜇.⃗
We use linear algebra formulas for differentiating linear and quadratic forms to compute the gradient of 𝐽 with respect
to 𝜇⃗

$$
\frac{\partial}{\partial \vec\mu} J = g + 2 G \vec\mu .
$$

Setting $\frac{\partial}{\partial \vec\mu} J = 0$, the maximizing $\vec\mu$ is

$$
\vec\mu^R = -\frac{1}{2} G^{-1} g
$$

The associated optimal inflation sequence is

$$
\vec\theta^R = B \vec\mu^R
$$

42.7.1 Two implementations

With the more structured approach, we can update our gradient descent exercise with compute_J

def compute_J(μ, β, c, α=1, u0=1, u1=0.5, u2=3):


T = len(μ) - 1

h0, h1, h2 = compute_hs(u0, u1, u2, α)


λ = α / (1 + α)

_, B = construct_B(α, T+1)

β_vec = jnp.hstack([β**jnp.arange(T),
(β**T/(1-β))])

θ = B @ μ
βθ_sum = jnp.sum((β_vec * h1) * θ)
βθ_square_sum = β_vec * h2 * θ.T @ θ
βμ_square_sum = 0.5 * c * β_vec * μ.T @ μ

return βθ_sum + βθ_square_sum - βμ_square_sum

# Initial guess for μ


μ_init = jnp.zeros(T)

# Maximization instead of minimization


grad_J = jit(grad(
lambda μ: -compute_J(μ, β=0.85, c=2)))

%%time

# Optimize μ
optimized_μ = adam_optimizer(grad_J, μ_init)

print(f"optimized μ = \n{optimized_μ}")


Iteration 0, grad norm: 0.8627105951309204


Iteration 100, grad norm: 0.003303033299744129
Iteration 200, grad norm: 1.6926183889154345e-05
Converged after 280 iterations.
optimized μ =
[-0.06450712 -0.09033988 -0.10068493 -0.10482774 -0.1064868 -0.1071512
-0.10741723 -0.10752378 -0.10756644 -0.10758355 -0.10759039 -0.10759313
-0.10759424 -0.10759471 -0.10759488 -0.10759497 -0.10759498 -0.10759498
-0.10759498 -0.10759497 -0.10759494 -0.10759497 -0.10759496 -0.10759497
-0.10759497 -0.10759498 -0.107595 -0.10759494 -0.10759496 -0.10759496
-0.10759494 -0.10759493 -0.10759493 -0.10759494 -0.10759494 -0.10759495
-0.10759497 -0.107595 -0.10759497 -0.10759495]
CPU times: user 517 ms, sys: 250 ms, total: 768 ms
Wall time: 362 ms

print(f"original μ = \n{clq.μ_series}")

original μ =
[-0.06450708 -0.09033982 -0.10068489 -0.10482772 -0.10648677 -0.10715115
-0.10741722 -0.10752377 -0.10756644 -0.10758352 -0.10759037 -0.10759311
-0.1075942 -0.10759464 -0.10759482 -0.10759489 -0.10759492 -0.10759493
-0.10759493 -0.10759494 -0.10759494 -0.10759494 -0.10759494 -0.10759494
-0.10759494 -0.10759494 -0.10759494 -0.10759494 -0.10759494 -0.10759494
-0.10759494 -0.10759494 -0.10759494 -0.10759494 -0.10759494 -0.10759494
-0.10759494 -0.10759494 -0.10759494 -0.10759494]

print(f'deviation = {np.linalg.norm(optimized_μ - clq.μ_series)}')

deviation = 2.3748542332668876e-07

V_R = compute_V(optimized_μ, β=0.85, c=2)


V_R

Array(6.8357825, dtype=float32)

We find that by exploiting more knowledge about the structure of the problem, we can significantly speed up our computation.
We can also derive a closed-form solution for 𝜇⃗

def compute_μ(β, c, T, α=1, u0=1, u1=0.5, u2=3):


h0, h1, h2 = compute_hs(u0, u1, u2, α)

_, B = construct_B(α, T+1)

β_vec = jnp.hstack([β**jnp.arange(T),
(β**T/(1-β))])

g = h1 * B.T @ β_vec
M = B.T @ (h2 * jnp.diag(β_vec)) @ B
F = c/2 * jnp.diag(β_vec)
G = M - F

return jnp.linalg.solve(2*G, -g)

μ_closed = compute_μ(β=0.85, c=2, T=T-1)


print(f'closed-form μ = \n{μ_closed}')

closed-form μ =
[-0.0645071 -0.09033982 -0.1006849 -0.1048277 -0.10648677 -0.10715113
-0.10741723 -0.10752378 -0.10756643 -0.10758351 -0.10759034 -0.10759313
-0.10759421 -0.10759464 -0.10759482 -0.1075949 -0.10759489 -0.10759492
-0.10759492 -0.10759491 -0.10759495 -0.10759494 -0.10759495 -0.10759493
-0.10759491 -0.10759491 -0.10759494 -0.10759491 -0.10759491 -0.10759495
-0.10759498 -0.10759492 -0.10759494 -0.10759485 -0.10759497 -0.10759495
-0.10759493 -0.10759494 -0.10759498 -0.10759494]

print(f'deviation = {np.linalg.norm(μ_closed - clq.μ_series)}')

deviation = 1.47137171779832e-07

compute_V(μ_closed, β=0.85, c=2)

Array(6.835783, dtype=float32)

print(f'deviation = {np.linalg.norm(B @ μ_closed - θs)}')

deviation = 2.535387864099903e-07

We can check the gradient of the analytical solution against the JAX computed version

def compute_grad(μ, β, c, α=1, u0=1, u1=0.5, u2=3):


T = len(μ) - 1

h0, h1, h2 = compute_hs(u0, u1, u2, α)

_, B = construct_B(α, T+1)

β_vec = jnp.hstack([β**jnp.arange(T),
(β**T/(1-β))])

g = h1 * B.T @ β_vec
M = (h2 * B.T @ jnp.diag(β_vec) @ B)
F = c/2 * jnp.diag(β_vec)
G = M - F
return g + (2*G @ μ)

closed_grad = compute_grad(jnp.ones(T), β=0.85, c=2)

closed_grad


Array([-3.75 , -4.0625 , -3.8906252 , -3.5257816 , -3.1062894 ,


-2.6950336 , -2.3181221 , -1.9840758 , -1.6933005 , -1.4427234 ,
-1.2280239 , -1.0446749 , -0.8884009 , -0.7553544 , -0.64215815,
-0.5458878 , -0.46403134, -0.39444 , -0.33528066, -0.28499195,
-0.24224481, -0.20590894, -0.17502302, -0.14876978, -0.12645441,
-0.10748631, -0.09136339, -0.0776589 , -0.06601007, -0.05610856,
-0.04769228, -0.04053844, -0.03445768, -0.02928903, -0.02489567,
-0.02116132, -0.01798713, -0.01528906, -0.0129957 , -0.07364222], ␣
↪dtype=float32)

- grad_J(jnp.ones(T))

Array([-3.75 , -4.0625 , -3.890625 , -3.5257816 , -3.1062894 ,


-2.6950336 , -2.3181224 , -1.9840759 , -1.6933005 , -1.4427235 ,
-1.228024 , -1.0446749 , -0.8884009 , -0.7553544 , -0.6421581 ,
-0.54588777, -0.46403137, -0.39444 , -0.33528066, -0.28499192,
-0.24224481, -0.20590894, -0.175023 , -0.14876977, -0.12645441,
-0.10748631, -0.0913634 , -0.0776589 , -0.06601007, -0.05610857,
-0.04769228, -0.04053844, -0.03445768, -0.02928903, -0.02489568,
-0.02116132, -0.01798712, -0.01528906, -0.0129957 , -0.07364222], ␣
↪dtype=float32)

print(f'deviation = {np.linalg.norm(closed_grad - (- grad_J(jnp.ones(T))))}')

deviation = 4.074267394571507e-07

Let’s plot the Ramsey plan’s 𝜇𝑡 and 𝜃𝑡 for 𝑡 = 0, … , 𝑇 against 𝑡.

# Compute θ using optimized_μ


θs = np.array(compute_θ(optimized_μ))
μs = np.array(optimized_μ)

# Plot the two sequences


Ts = np.arange(T)

plt.scatter(Ts, μs, label=r'$\mu_t$', alpha=0.7)


plt.scatter(Ts, θs, label=r'$\theta_t$', alpha=0.7)
plt.xlabel(r'$t$')
plt.legend()
plt.show()


Note that while 𝜃𝑡 is less than 𝜇𝑡 for low 𝑡’s, it eventually converges to the limit 𝜇̄ of 𝜇𝑡 as 𝑡 → +∞.
This pattern reflects how formula (42.3) makes 𝜃𝑡 be a weighted average of future 𝜇𝑡 ’s.

42.8 Continuation Values

For subsequent analysis, it will be useful to compute a sequence $\{v_t\}_{t=0}^{T}$ of what we'll call continuation values along a Ramsey plan.

To do so, we'll start at date 𝑇 and compute

$$
v_T = \frac{1}{1-\beta} s(\bar\mu, \bar\mu).
$$
Then starting from 𝑡 = 𝑇 − 1, we’ll iterate backwards on the recursion

𝑣𝑡 = 𝑠(𝜃𝑡 , 𝜇𝑡 ) + 𝛽𝑣𝑡+1

for 𝑡 = 𝑇 − 1, 𝑇 − 2, … , 0.

# Define function for s and U in section 41.3


def s(θ, μ, u0, u1, u2, α, c):
U = lambda x: u0 + u1 * x - (u2 / 2) * x**2
return U(-α*θ) - (c / 2) * μ**2

# Calculate v_t sequence backward


def compute_vt(μ, β, c, u0=1, u1=0.5, u2=3, α=1):

T = len(μ)
θ = compute_θ(μ, α)

v_t = np.zeros(T)
μ_bar = μ[-1]

# Reduce parameters
s_p = lambda θ, μ: s(θ, μ,
u0=u0, u1=u1, u2=u2, α=α, c=c)

# Define v_T
v_t[T-1] = (1 / (1 - β)) * s_p(μ_bar, μ_bar)

# Backward iteration
for t in reversed(range(T-1)):
v_t[t] = s_p(θ[t], μ[t]) + β * v_t[t+1]

return v_t

v_t = compute_vt(μs, β=0.85, c=2)

The initial continuation value 𝑣0 should equal the optimized value of the Ramsey planner’s criterion 𝑉 defined in equation
(42.6).
Indeed, we find that the deviation is very small:

print(f'deviation = {np.linalg.norm(v_t[0] - V_R)}')

deviation = 4.76837158203125e-07

We can also verify approximate equality by inspecting a graph of 𝑣𝑡 against 𝑡 for 𝑡 = 0, … , 𝑇 along with the value attained
by a restricted Ramsey planner 𝑉 𝐶𝑅 and the optimized value of the ordinary Ramsey planner 𝑉 𝑅

# Plot the scatter plot


plt.scatter(Ts, v_t, label='$v_t$')

# Plot horizontal lines


plt.axhline(V_CR, color='C1', alpha=0.5)
plt.axhline(V_R, color='C2', alpha=0.5)

# Add labels
plt.text(max(Ts) + max(Ts)*0.07, V_CR, '$V^{CR}$', color='C1',
va='center', clip_on=False, fontsize=15)
plt.text(max(Ts) + max(Ts)*0.07, V_R, '$V^R$', color='C2',
va='center', clip_on=False, fontsize=15)
plt.xlabel(r'$t$')
plt.ylabel(r'$v_t$')

plt.tight_layout()
plt.show()

Fig. 42.1 shows interesting patterns:

• The sequence of continuation values $\{v_t\}_{t=0}^{T}$ is monotonically decreasing

812 Chapter 42. Machine Learning a Ramsey Plan


Advanced Quantitative Economics with Python

Fig. 42.1: Continuation values

42.8. Continuation Values 813


Advanced Quantitative Economics with Python

• Evidently, 𝑣0 > 𝑉 𝐶𝑅 > 𝑣𝑇 so that


– the value 𝑣0 of the ordinary Ramsey plan exceeds the value 𝑉 𝐶𝑅 of the special Ramsey plan in which the
planner is constrained to set 𝜇𝑡 = 𝜇𝐶𝑅 for all 𝑡.
– the continuation value 𝑣𝑇 of the ordinary Ramsey plan for 𝑡 ≥ 𝑇 is constant and is less than the value 𝑉 𝐶𝑅
of the special Ramsey plan in which the planner is constrained to set 𝜇𝑡 = 𝜇𝐶𝑅 for all 𝑡

Note: The continuation value 𝑣𝑇 is what some researchers call the “value of a Ramsey plan under a time-less perspective.”
A more descriptive phrase is “the value of the worst continuation Ramsey plan.”

42.9 Adding Some Human Intelligence

We have used our machine learning algorithms to compute a Ramsey plan.


By plotting it, we learned that the Ramsey planner makes $\vec\mu$ and $\vec\theta$ both vary over time.

• $\vec\theta$ and $\vec\mu$ both decline monotonically
• both of them converge from above to the same constant $\bar\mu$
Hidden from view, there is a recursive structure in the 𝜇,⃗ 𝜃 ⃗ chosen by the Ramsey planner that we want to bring out.
To do so, we’ll have to add some human intelligence to the artificial intelligence embodied in our machine learning
approach.
To proceed, we’ll compute least squares linear regressions of some components of 𝜃 ⃗ and 𝜇⃗ on others.
We hope that these regressions will reveal structure hidden within the 𝜇𝑅 ⃗ sequences associated with a Ramsey plan.
⃗ , 𝜃𝑅
It is worth pausing to think about roles being played here by human intelligence and artificial intelligence.
Artificial intelligence in the form of some Python code and a computer is running the regressions for us.
But we are free to regress anything on anything else.
Human intelligence tells us what regressions to run.
Additional inputs of human intelligence will be required fully to appreciate what those regressions reveal about the struc-
ture of a Ramsey plan.

Note: When we eventually get around to trying to understand the regressions below, it will be worthwhile to study the reasoning that led Chang [Chang, 1998] to choose 𝜃𝑡 as his key state variable.

We begin by regressing 𝜇𝑡 on a constant and 𝜃𝑡.

This might seem strange because, after all, equation (42.3) asserts that inflation at time 𝑡 is determined by $\{\mu_s\}_{s=t}^{\infty}$.

Nevertheless, we’ll run this regression anyway.

# First regression: μ_t on a constant and θ_t


X1_θ = sm.add_constant(θs)
model1 = sm.OLS(μs, X1_θ)
results1 = model1.fit()

# Print regression summary


print("Regression of μ_t on a constant and θ_t:")
print(results1.summary(slim=True))


Regression of μ_t on a constant and θ_t:


OLS Regression Results
==============================================================================
Dep. Variable: y R-squared: 1.000
Model: OLS Adj. R-squared: 1.000
No. Observations: 40 F-statistic: 1.489e+13
Covariance Type: nonrobust Prob (F-statistic): 6.90e-222
==============================================================================
coef std err t P>|t| [0.025 0.975]
------------------------------------------------------------------------------
const 0.0645 4.42e-08 1.46e+06 0.000 0.065 0.065
x1 1.5995 4.14e-07 3.86e+06 0.000 1.600 1.600
==============================================================================

Notes:
[1] Standard Errors assume that the covariance matrix of the errors is correctly␣
↪specified.

Our regression tells us that the affine function

𝜇𝑡 = .0645 + 1.5995𝜃𝑡

fits perfectly along the Ramsey outcome 𝜇,⃗ 𝜃.⃗

Note: Of course, this means that a regression of 𝜃𝑡 on 𝜇𝑡 and a constant would also fit perfectly.

Let’s plot the regression line 𝜇𝑡 = .0645 + 1.5995𝜃𝑡 and the points (𝜃𝑡 , 𝜇𝑡 ) that lie on it for 𝑡 = 0, … , 𝑇 .

plt.scatter(θs, μs, label=r'$\mu_t$')


plt.plot(θs, results1.predict(X1_θ), 'grey', label=r'$\hat \mu_t$', linestyle='--')
plt.xlabel(r'$\theta_t$')
plt.ylabel(r'$\mu_t$')
plt.legend()
plt.show()


The time 0 pair (𝜃0 , 𝜇0 ) appears as the point on the upper right.
Points (𝜃𝑡 , 𝜇𝑡 ) for succeeding times appear further and further to the lower left and eventually converge to (𝜇,̄ 𝜇).
̄
Next, we’ll run a linear regression of 𝜃𝑡+1 against 𝜃𝑡 and a constant.

# Second regression: θ_{t+1} on a constant and θ_t


θ_t = np.array(θs[:-1]) # θ_t
θ_t1 = np.array(θs[1:]) # θ_{t+1}
X2_θ = sm.add_constant(θ_t) # Add a constant term for the intercept
model2 = sm.OLS(θ_t1, X2_θ)
results2 = model2.fit()

# Print regression summary


print("\nRegression of θ_{t+1} on a constant and θ_t:")
print(results2.summary(slim=True))

Regression of θ_{t+1} on a constant and θ_t:


OLS Regression Results
==============================================================================
Dep. Variable: y R-squared: 1.000
Model: OLS Adj. R-squared: 1.000
No. Observations: 39 F-statistic: 7.775e+11
Covariance Type: nonrobust Prob (F-statistic): 1.41e-192
==============================================================================
coef std err t P>|t| [0.025 0.975]
------------------------------------------------------------------------------
const -0.0645 4.84e-08 -1.33e+06 0.000 -0.065 -0.065
x1 0.4005 4.54e-07 8.82e+05 0.000 0.400 0.400
==============================================================================

Notes:
[1] Standard Errors assume that the covariance matrix of the errors is correctly␣
↪specified.

We find that the regression line fits perfectly and thus discover the affine relationship

𝜃𝑡+1 = −.0645 + .4005𝜃𝑡

that prevails along the Ramsey outcome for inflation.


Let’s plot 𝜃𝑡 for 𝑡 = 0, 1, … , 𝑇 along the line.

plt.scatter(θ_t, θ_t1, label=r'$\theta_{t+1}$')


plt.plot(θ_t, results2.predict(X2_θ), color='grey', label=r'$\hat θ_{t+1}$', linestyle='--')

plt.xlabel(r'$\theta_t$')
plt.ylabel(r'$\theta_{t+1}$')
plt.legend()

plt.tight_layout()
plt.show()

Points for succeeding times appear further and further to the lower left and eventually converge to $(\bar\mu, \bar\mu)$.


Next we ask Python to regress continuation value $v_t$ against a constant, $\theta_t$, and $\theta_t^2$:

$$v_t = g_0 + g_1 \theta_t + g_2 \theta_t^2 .$$

# Third regression: v_t on a constant, θ_t and θ^2_t


X3_θ = np.column_stack((np.ones(T), θs, θs**2))
model3 = sm.OLS(v_t, X3_θ)
results3 = model3.fit()

# Print regression summary


print("\nRegression of v_t on a constant, θ_t and θ^2_t:")
print(results3.summary(slim=True))

Regression of v_t on a constant, θ_t and θ^2_t:


OLS Regression Results
==============================================================================
Dep. Variable: y R-squared: 1.000
Model: OLS Adj. R-squared: 1.000
No. Observations: 40 F-statistic: 5.474e+08
Covariance Type: nonrobust Prob (F-statistic): 6.09e-139
==============================================================================
coef std err t P>|t| [0.025 0.975]
------------------------------------------------------------------------------
const 6.8052 5.91e-06 1.15e+06 0.000 6.805 6.805
x1 -0.7581 0.000 -6028.976 0.000 -0.758 -0.758
x2 -4.6996 0.001 -7131.888 0.000 -4.701 -4.698
==============================================================================

Notes:
[1] Standard Errors assume that the covariance matrix of the errors is correctly specified.

[2] The condition number is large, 3.5e+04. This might indicate that there are
strong multicollinearity or other numerical problems.

The regression has an 𝑅2 equal to 1 and so fits perfectly.


However, notice the warning about the high condition number.
As indicated in the printout, this is a consequence of 𝜃𝑡 and 𝜃𝑡2 being highly correlated along the Ramsey plan.

np.corrcoef(θs, θs**2)

array([[ 1. , -0.99942156],
[-0.99942156, 1. ]])

Let’s plot 𝑣𝑡 against 𝜃𝑡 along with the nonlinear regression line.

θ_grid = np.linspace(min(θs), max(θs), 100)


X3_grid = np.column_stack((np.ones(len(θ_grid)), θ_grid, θ_grid**2))

plt.scatter(θs, v_t)
plt.plot(θ_grid, results3.predict(X3_grid), color='grey',
label=r'$\hat v_t$', linestyle='--')
plt.axhline(V_CR, color='C1', alpha=0.5)

plt.text(max(θ_grid) - max(θ_grid)*0.025, V_CR, '$V^{CR}$', color='C1',
va='center', clip_on=False, fontsize=15)

plt.xlabel(r'$\theta_{t}$')
plt.ylabel(r'$v_t$')
plt.legend()

plt.tight_layout()
plt.show()

The highest continuation value $v_0$ at $t = 0$ appears at the peak of the quadratic function $g_0 + g_1\theta_t + g_2\theta_t^2$.
Subsequent values of $v_t$ for $t \geq 1$ appear to the lower left of the pair $(\theta_0, v_0)$ and converge monotonically from above to $v_T$ at time $T$.
The value 𝑉 𝐶𝑅 attained by the Ramsey plan that is restricted to be a constant 𝜇𝑡 = 𝜇𝐶𝑅 sequence appears as a horizontal
line.
Evidently, continuation values 𝑣𝑡 > 𝑉 𝐶𝑅 for 𝑡 = 0, 1, 2 while 𝑣𝑡 < 𝑉 𝐶𝑅 for 𝑡 ≥ 3.


42.10 What has Machine Learning Taught Us?

Our regressions tell us that along the Ramsey outcome $\vec\mu^R, \vec\theta^R$, the linear function

$$\mu_t = .0645 + 1.5995 \theta_t$$

fits perfectly and that so do the regression lines

$$\theta_{t+1} = -.0645 + .4005 \theta_t$$

$$v_t = 6.8052 - .7580 \theta_t - 4.6991 \theta_t^2 .$$


Assembling these regressions, we have discovered that for our single Ramsey outcome path $\vec\mu^R, \vec\theta^R$, the following relationships prevail along a Ramsey plan:

$$\begin{aligned}
\theta_0 &= \theta_0^R \\
\mu_t &= b_0 + b_1 \theta_t \\
\theta_{t+1} &= d_0 + d_1 \theta_t
\end{aligned} \tag{42.12}$$

where the initial value $\theta_0^R$ was computed along with other components of $\vec\mu^R, \vec\theta^R$ when we computed the Ramsey plan, and where $b_0, b_1, d_0, d_1$ are parameters whose values we estimated with our regressions.
In addition, we learned that continuation values are described by the quadratic function

$$v_t = g_0 + g_1 \theta_t + g_2 \theta_t^2$$

We discovered these relationships by running some carefully chosen regressions and staring at the results, noticing that the $R^2$'s of unity tell us that the fits are perfect.
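As a sanity check, we can print those $R^2$'s directly; this sketch assumes results1, results2, and results3 from the regressions above are still in scope:

# Print the R² of each regression run above (results1, results2, results3 assumed in scope)
print(results1.rsquared, results2.rsquared, results3.rsquared)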
We have learned much about the structure of the Ramsey problem.
However, by using the methods and ideas that we have deployed in this lecture, it is challenging to say more.
There are many other linear regressions among components of $\vec\mu^R, \vec\theta^R$ that would also have given us perfect fits.
For example, we could have regressed $\theta_t$ on $\mu_t$ and obtained the same $R^2$ value.
Actually, wouldn’t that direction of fit have made more sense?
After all, the Ramsey planner chooses $\vec\mu$, while $\vec\theta$ is an outcome that reflects the representative agent's response to the Ramsey planner's choice of $\vec\mu$.
Isn't it more natural then to expect that we'd learn more about the structure of the Ramsey problem from a regression of components of $\vec\theta$ on components of $\vec\mu$?
To answer these questions, we’ll have to deploy more economic theory.
We do that in this quantecon lecture Time Inconsistency of Ramsey Plans.
There, we’ll discover that system (42.12) is actually a very good way to represent a Ramsey plan because it reveals many
things about its structure.
Indeed, in that lecture, we show how to compute the Ramsey plan using dynamic programming squared and provide
a Python class ChangLQ that performs the calculations.
We have deployed ChangLQ earlier in this lecture to compute a baseline Ramsey plan to which we have compared
outcomes from our application of the cruder machine learning approaches studied here.
Let's use the code to compute the parameters $b_0, b_1$ of the decision rule for $\mu_t$ and the parameters $d_0, d_1$ of the updating rule for $\theta_{t+1}$ in representation (42.12).
First, we’ll again use ChangLQ to compute these objects (along with a number of others).


clq = ChangLQ(β=0.85, c=2, T=T)

Now let’s print out the decision rule for 𝜇𝑡 uncovered by applying dynamic programming squared.

print("decision rule for μ")


print(f'-(b_0, b_1) = ({-clq.b0:.6f}, {-clq.b1:.6f})')

decision rule for μ


-(b_0, b_1) = (0.064507, 1.599536)

Now let’s print out the decision rule for 𝜃𝑡+1 uncovered by applying dynamic programming squared.

print("decision rule for θ(t+1) as function of θ(t)")


print(f'(d_0, d_1) = ({clq.d0:.6f}, {clq.d1:.6f})')

decision rule for θ(t+1) as function of θ(t)


(d_0, d_1) = (-0.064507, 0.400464)

Evidently, these agree with the relationships that we discovered by running regressions on the Ramsey outcomes $\vec\mu^R, \vec\theta^R$ that we constructed with either of our machine learning algorithms.
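Here is a minimal sketch of that comparison in code; it assumes results1, results2, and the ChangLQ instance clq defined just above are in scope:

# Compare OLS estimates with the dynamic-programming-squared objects (clq, results1, results2 assumed in scope)
print(np.allclose(results1.params, [-clq.b0, -clq.b1], atol=1e-4))  # μ_t = b_0 + b_1 θ_t
print(np.allclose(results2.params, [clq.d0, clq.d1], atol=1e-4))    # θ_{t+1} = d_0 + d_1 θ_t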
We have set the stage for this quantecon lecture Time Inconsistency of Ramsey Plans.
We close this lecture by giving a hint about an insight of Chang [Chang, 1998] that underlies much of quantecon lecture
Time Inconsistency of Ramsey Plans.
Chang noticed how equation (42.3) shows that an equivalence class of continuation money growth sequences $\{\mu_{t+j}\}_{j=0}^\infty$ deliver the same $\theta_t$.
Consequently, equations (42.1) and (42.3) indicate that $\theta_t$ intermediates how the government's choices of $\mu_{t+j}, j = 0, 1, \ldots$ impinge on time $t$ real balances $m_t - p_t = -\alpha\theta_t$.
In lecture Time Inconsistency of Ramsey Plans, we’ll see how Chang [Chang, 1998] put this insight to work.



CHAPTER

FORTYTHREE

TIME INCONSISTENCY OF RAMSEY PLANS

In addition to what’s in Anaconda, this lecture will need the following libraries:

!pip install --upgrade quantecon

43.1 Overview

This lecture describes a linear-quadratic version of a model that Guillermo Calvo [Calvo, 1978] used to analyze the time
inconsistency of optimal government plans.
We use the model as a laboratory in which we explore consequences of different timing protocols for government decision
making.
The model focuses on intertemporal tradeoffs between
• benefits that anticipations of future deflation generate by decreasing costs of holding real money balances and
thereby increasing a representative agent’s liquidity, as measured by his or her holdings of real money balances, and
• costs associated with the distorting taxes that the government must levy in order to acquire the paper money that it
will destroy in order to generate anticipated deflation
Model features include
• rational expectations
• alternative possible timing protocols for government choices of a sequence of money growth rates
• costly government actions at all dates 𝑡 ≥ 1 that increase household utilities at dates before 𝑡
• alternative possible sets of Bellman equations, one set for each timing protocol
– for example, in a timing protocol used to pose a Ramsey plan, a government chooses an infinite sequence of
money supply growth rates once and for all at time 0.
– in this timing protocol, there are two value functions and associated Bellman equations, one that expresses a
representative private expectation of future inflation as a function of current and future government actions,
another that describes the value function of a Ramsey planner
– in other timing protocols, other Bellman equations and associated value functions will appear
A theme of this lecture is that timing protocols for government decisions affect outcomes.
We’ll use ideas from papers by Cagan [Cagan, 1956], Calvo [Calvo, 1978], and Chang [Chang, 1998] as well as from
chapter 19 of [Ljungqvist and Sargent, 2018].
In addition, we’ll use ideas from linear-quadratic dynamic programming described in Linear Quadratic Control as applied
to Ramsey problems in Stackelberg plans.


We specify model fundamentals in ways that allow us to use linear-quadratic discounted dynamic programming to com-
pute an optimal government plan under each of our timing protocols.
A sister lecture Machine Learning a Ramsey Plan studies some of the same models but does not use dynamic programming.
Instead it uses a machine learning approach that does not explicitly recognize the recursive structure of the Ramsey problem that Chang [Chang, 1998] saw and that we exploit in this lecture.

We’ll start with some imports:

import numpy as np
from quantecon import LQ
import matplotlib.pyplot as plt
from matplotlib.ticker import FormatStrFormatter
import pandas as pd
from IPython.display import display, Math

43.2 Model Components

There is no uncertainty.
Let:
• 𝑝𝑡 be the log of the price level
• 𝑚𝑡 be the log of nominal money balances
• 𝜃𝑡 = 𝑝𝑡+1 − 𝑝𝑡 be the net rate of inflation between 𝑡 and 𝑡 + 1
• 𝜇𝑡 = 𝑚𝑡+1 − 𝑚𝑡 be the net rate of growth of nominal balances
The demand for real balances is governed by a discrete time version of Sargent and Wallace’s [Sargent and Wallace, 1973]
perfect foresight version of a Cagan [Cagan, 1956] demand function for real balances:

𝑚𝑡 − 𝑝𝑡 = −𝛼(𝑝𝑡+1 − 𝑝𝑡 ) , 𝛼 > 0 (43.1)

for 𝑡 ≥ 0.
Equation (43.1) asserts that the demand for real balances is inversely related to the public’s expected rate of inflation,
which equals the actual rate of inflation because there is no uncertainty here.

Note: When there is no uncertainty, an assumption of rational expectations becomes equivalent to perfect foresight.
[Sargent, 1977] presents a rational expectations version of the model when there is uncertainty.

Subtracting the demand function (43.1) at time $t$ from the time $t+1$ version of this demand function gives

$$\mu_t - \theta_t = -\alpha\theta_{t+1} + \alpha\theta_t$$

or

$$\theta_t = \frac{\alpha}{1+\alpha}\theta_{t+1} + \frac{1}{1+\alpha}\mu_t \tag{43.2}$$

Because $\alpha > 0$, $0 < \frac{\alpha}{1+\alpha} < 1$.


Definition: For scalar $b_t$, let $L^2$ be the space of sequences $\{b_t\}_{t=0}^\infty$ satisfying

$$\sum_{t=0}^\infty b_t^2 < +\infty$$

We say that a sequence that belongs to $L^2$ is square summable.

When we assume that the sequence $\vec\mu = \{\mu_t\}_{t=0}^\infty$ is square summable and we require that the sequence $\vec\theta = \{\theta_t\}_{t=0}^\infty$ is square summable, the linear difference equation (43.2) can be solved forward to get:

$$\theta_t = \frac{1}{1+\alpha} \sum_{j=0}^\infty \left(\frac{\alpha}{1+\alpha}\right)^j \mu_{t+j} \tag{43.3}$$

Insight: Chang [Chang, 1998] noted that equations (43.1) and (43.3) show that $\theta_t$ intermediates how choices of $\mu_{t+j}, j = 0, 1, \ldots$ impinge on time $t$ real balances $m_t - p_t = -\alpha\theta_t$.
An equivalence class of continuation money growth sequences $\{\mu_{t+j}\}_{j=0}^\infty$ deliver the same $\theta_t$.

We shall use this insight to simplify our analysis of alternative government policy problems.
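Here is a minimal numerical sketch of that equivalence-class observation; the value α = 1 and the particular money growth paths are illustrative assumptions, and the infinite sum in (43.3) is truncated:

import numpy as np

# Two different continuation μ sequences that deliver the same θ_0 (α and the paths are assumptions)
α = 1.0
w = (α / (1 + α)) ** np.arange(300) / (1 + α)   # truncated weights from (43.3)
μ1 = np.full(300, 0.05)                          # a constant money growth path
μ2 = μ1.copy()
μ2[1] += 0.01                                    # perturb μ_1 ...
μ2[2] -= 0.01 * w[1] / w[2]                      # ... and offset at μ_2 so the weighted sum is unchanged
print(np.isclose(w @ μ1, w @ μ2))                # True: both paths imply the same θ_0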
That future rates of money creation influence earlier rates of inflation makes timing protocols matter for modeling optimal
government policies.
We can represent restriction (43.3) as

$$\begin{bmatrix} 1 \\ \theta_{t+1} \end{bmatrix} = \begin{bmatrix} 1 & 0 \\ 0 & \frac{1+\alpha}{\alpha} \end{bmatrix} \begin{bmatrix} 1 \\ \theta_t \end{bmatrix} + \begin{bmatrix} 0 \\ -\frac{1}{\alpha} \end{bmatrix} \mu_t \tag{43.4}$$
or

𝑥𝑡+1 = 𝐴𝑥𝑡 + 𝐵𝜇𝑡 (43.5)

Even though 𝜃0 is to be determined by our model and so is not an initial condition, as it ordinarily would be in the state-
space model described in our lecture on Linear Quadratic Control, we nevertheless write the model in the state-space
form (43.5).
We use form (43.5) because we want to apply an approach described in our lecture on Stackelberg plans.
Notice that $\frac{1+\alpha}{\alpha} > 1$ is an eigenvalue of transition matrix $A$ that threatens to destabilize the state-space system.
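A small sketch makes the point concrete; setting α = 1 here is an illustrative assumption:

import numpy as np

# The state-space matrices in (43.4)-(43.5) and the eigenvalues of A (α = 1 is an assumption)
α = 1.0
A = np.array([[1, 0], [0, (1 + α) / α]])
B = np.array([[0], [-1 / α]])
print(np.linalg.eigvals(A))   # [1., 2.]: the eigenvalue (1+α)/α = 2 exceeds 1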

Indeed, for arbitrary sequences $\vec\mu = \{\mu_t\}_{t=0}^\infty$, the sequence $\vec\theta = \{\theta_t\}_{t=0}^\infty$ will not be square summable.

But the government planner will design a decision rule for 𝜇𝑡 that stabilizes the system and renders 𝜃 ⃗ square summable.
The government values a representative household's utility of real balances at time $t$ according to the utility function

$$U(m_t - p_t) = u_0 + u_1(m_t - p_t) - \frac{u_2}{2}(m_t - p_t)^2, \quad u_0 > 0, u_1 > 0, u_2 > 0 \tag{43.6}$$

The money demand function (43.1) and the utility function (43.6) imply that

$$U(-\alpha\theta_t) = u_0 + u_1(-\alpha\theta_t) - \frac{u_2}{2}(-\alpha\theta_t)^2 . \tag{43.7}$$
2

43.3 Friedman’s Optimal Rate of Deflation


According to (43.7), the bliss level of real balances is $\frac{u_1}{u_2}$ and the inflation rate that attains it is

$$\theta_t = \theta^* = -\frac{u_1}{u_2 \alpha} \tag{43.8}$$


Milton Friedman recommended that the government withdraw and destroy money at a rate that implies an inflation rate
given by (43.8).
In our setting, that could be accomplished by setting

𝜇𝑡 = 𝜇∗ = 𝜃∗ , 𝑡 ≥ 0 (43.9)

where 𝜃∗ is given by equation (43.8).
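At the parameter values used later in this lecture (u1 = 0.5, u2 = 3, α = 1, the ChangLQ defaults), a one-line sketch evaluates this bliss inflation rate:

# Bliss inflation rate (43.8) at the lecture's default parameters
u1, u2, α = 0.5, 3, 1
print(-u1 / (u2 * α))   # -0.1666..., the θ* ≈ -0.167 reported in the tables below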


Milton Friedman assumed that the taxes that government imposes to collect money at rate 𝜇𝑡 do not distort economic
decisions, e.g., they are lump-sum taxes.

43.4 Calvo’s Distortion

The starting point of Calvo [Calvo, 1978] and Chang [Chang, 1998] is that lump sum taxes are not available.
Instead, the government acquires money by levying taxes that distort decisions and thereby impose costs on the represen-
tative consumer.
In the models of Calvo [Calvo, 1978] and Chang [Chang, 1998], the government takes those tax-distortion costs into
account.
The government balances the costs of imposing the distorting taxes needed to acquire the money that it destroys in order
to generate deflation against the benefits that expected deflation generates by raising the representative household’s real
money balances.
Let’s see how the government does that.
Via equation (43.3), a government plan 𝜇⃗ = {𝜇𝑡 }∞ ⃗ ∞
𝑡=0 leads to a sequence of inflation outcomes 𝜃 = {𝜃𝑡 }𝑡=0 .

The government incurs social costs $\frac{c}{2}\mu_t^2$ at $t$ when it changes the stock of nominal money balances at rate $\mu_t$.
Therefore, the one-period welfare function of a benevolent government is:

$$s(\theta_t, \mu_t) := -r(x_t, \mu_t) = \begin{bmatrix} 1 & \theta_t \end{bmatrix} \begin{bmatrix} u_0 & -\frac{u_1\alpha}{2} \\ -\frac{u_1\alpha}{2} & -\frac{u_2\alpha^2}{2} \end{bmatrix} \begin{bmatrix} 1 \\ \theta_t \end{bmatrix} - \frac{c}{2}\mu_t^2 = -x_t' R x_t - Q \mu_t^2 \tag{43.10}$$
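As a sketch (not part of the lecture's own code), we can confirm that the quadratic form in (43.10) reproduces $U(-\alpha\theta_t) - \frac{c}{2}\mu_t^2$ at an arbitrary point; the parameter values and the test point are assumptions:

import numpy as np

# Check the quadratic-form representation in (43.10) at one (θ, μ) point
u0, u1, u2, α, c = 1, 0.5, 3, 1, 2
M = np.array([[u0, -u1 * α / 2],
              [-u1 * α / 2, -u2 * α**2 / 2]])   # the matrix multiplying [1, θ] in (43.10)
θ, μ = -0.05, 0.02
x = np.array([1, θ])
quad_form = x @ M @ x - c / 2 * μ**2
U = u0 + u1 * (-α * θ) - u2 / 2 * (-α * θ)**2
print(np.isclose(quad_form, U - c / 2 * μ**2))  # True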

The government's time 0 value is

$$v_0 = -\sum_{t=0}^\infty \beta^t r(x_t, \mu_t) = \sum_{t=0}^\infty \beta^t s(\theta_t, \mu_t) \tag{43.11}$$

where 𝛽 ∈ (0, 1) is a discount factor.

Note: We define $r(x_t, \mu_t) := -s(\theta_t, \mu_t)$ in order to represent the government's maximization problem in terms of our Python code for solving linear quadratic discounted dynamic programs. In the first LQ control lecture and some other quantecon lectures, we formulated these as loss minimization problems.

The government's time $t$ continuation value $v_t$ is

$$v_t = \sum_{j=0}^\infty \beta^j s(\theta_{t+j}, \mu_{t+j}). \tag{43.12}$$

We can represent dependence of $v_0$ on $(\vec\theta, \vec\mu)$ recursively via the difference equation

$$v_t = s(\theta_t, \mu_t) + \beta v_{t+1} \tag{43.13}$$


It is useful to evaluate (43.13) under a time-invariant money growth rate $\mu_t = \bar\mu$ that according to equation (43.3) would bring forth a constant inflation rate equal to $\bar\mu$.
Under that policy,

$$v_t = V(\bar\mu) = \frac{s(\bar\mu, \bar\mu)}{1-\beta} \tag{43.14}$$

for all $t \geq 0$.
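A short sketch evaluates (43.14) at the default parameters; the particular choice of $\bar\mu$ here anticipates a formula derived below, so it is an assumption at this point:

# Value of a forever-constant money growth rate via (43.14), at the default parameters
u0, u1, u2, α, c, β = 1, 0.5, 3, 1, 2, 0.85

def s(θ, μ):
    "One-period return s(θ, μ) implied by (43.10)."
    return u0 + u1 * (-α * θ) - u2 / 2 * (-α * θ)**2 - c / 2 * μ**2

μ_bar = -α * u1 / (α**2 * u2 + c)   # the constrained-to-constant-μ rate derived below in (43.26)
print(s(μ_bar, μ_bar) / (1 - β))    # V(μ̄)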
Values of $V(\bar\mu)$ computed according to formula (43.14) for three different values of $\bar\mu$ will play important roles below.
• $V(\mu^{MPE})$ is the value attained by the government in a Markov perfect equilibrium
• $V(\mu_\infty^R)$ is the value that a continuation Ramsey planner attains at $t \to +\infty$
  – We shall discover that $V(\mu_\infty^R)$ is the worst continuation value attained along a Ramsey plan
• $V(\mu^{CR})$ is the value attained by the government in a constrained-to-constant-$\mu$ equilibrium

43.5 Structure

The following structure is induced by a representative agent’s behavior as summarized by the demand function for money
(43.1) that leads to equation (43.3), which tells how future settings of 𝜇 affect the current value of 𝜃.
Equation (43.3) maps a policy sequence of money growth rates $\vec\mu = \{\mu_t\}_{t=0}^\infty \in L^2$ into an inflation sequence $\vec\theta = \{\theta_t\}_{t=0}^\infty \in L^2$.
These in turn induce a discounted value to a government sequence $\vec v = \{v_t\}_{t=0}^\infty \in L^2$ that satisfies recursion (43.13).

Thus, a triple of sequences $(\vec\mu, \vec\theta, \vec v)$ depends on a sequence $\vec\mu \in L^2$.


At this point 𝜇⃗ ∈ 𝐿2 is an arbitrary exogenous policy.
A theory of government decisions will make 𝜇⃗ endogenous, i.e., a theoretical output instead of an input.

43.5.1 Intertemporal Aspects

Criterion function (43.11) and the constraint system (43.5) exhibit the following structure:
• Setting the money growth rate $\mu_t \neq 0$ imposes costs $\frac{c}{2}\mu_t^2$ at time $t$ and at no other times; but
• The money growth rate $\mu_t$ affects the government's one-period utilities at all dates $s = 0, 1, \ldots, t$.
This structure sets the stage for the emergence of a time-inconsistent optimal government plan under a Ramsey timing
protocol
• it is also called a Stackelberg timing protocol.
We’ll study outcomes under a Ramsey timing protocol.
We’ll also study outcomes under other timing protocols.


43.6 Three Timing Protocols

We consider three models of government policy making that differ in


• what a policymaker chooses, either a sequence 𝜇⃗ or just 𝜇𝑡 in a single period 𝑡.
• when a policymaker chooses, either once and for all at time 0, or at one or more times 𝑡 ≥ 0.
• what a policymaker assumes about how its choice of 𝜇𝑡 affects the representative agent’s expectations about earlier
and later inflation rates.
In two of our models, a single policymaker chooses a sequence $\{\mu_t\}_{t=0}^\infty$ once and for all, knowing how $\mu_t$ affects household one-period utilities at dates $s = 0, 1, \ldots, t-1$
• these two models thus employ a Ramsey or Stackelberg timing protocol.
In a third model, there is a sequence of policymakers indexed by $t \in \{0, 1, \ldots\}$, each of whom sets only $\mu_t$.
• a time 𝑡 policymaker cares only about 𝑣𝑡 and ignores effects that its choice of 𝜇𝑡 has on 𝑣𝑠 at dates 𝑠 = 0, 1, … , 𝑡−1.
The three models differ with respect to timing protocols, constraints on government choices, and government policymak-
ers’ beliefs about how their decisions affect the representative agent’s beliefs about future government decisions.
The models are distinguished by their having either
• A single Ramsey planner that chooses a sequence $\{\mu_t\}_{t=0}^\infty$ once and for all at time 0; or
• A single Ramsey planner that chooses a sequence $\{\mu_t\}_{t=0}^\infty$ once and for all at time 0 subject to the constraint that $\mu_t = \mu$ for all $t \geq 0$; or
• A sequence of distinct policymakers indexed by 𝑡 = 0, 1, 2, …
– a time 𝑡 policymaker chooses 𝜇𝑡 only and forecasts that future government decisions are unaffected by its
choice.
The first model describes a Ramsey plan chosen by a Ramsey planner.
The second model describes a Ramsey plan chosen by a Ramsey planner constrained to choose a time-invariant $\mu$.
The third model describes a Markov perfect equilibrium.

Note: In the quantecon lecture Sustainable Plans for a Calvo Model, we’ll study outcomes under another timing protocol
in which there is a sequence of separate policymakers. A time 𝑡 policymaker chooses only 𝜇𝑡 but believes that its choice
of 𝜇𝑡 shapes the representative agent’s beliefs about future rates of money creation and inflation, and through them, future
government actions. This is a model of a credible government policy, also called a sustainable plan. The relationship
between outcomes in the first (Ramsey) timing protocol and the Sustainable Plans for a Calvo Model timing protocol
and belief structure is the subject of a literature on sustainable or credible public policies (Chari and Kehoe [Chari and
Kehoe, 1990] [Stokey, 1989], and Stokey [Stokey, 1991]).

43.7 Note on Dynamic Programming Squared

We’ll begin with the timing protocol associated with a Ramsey plan and deploy an application of what we nickname
dynamic programming squared.
The nickname refers to the feature that a value satisfying one Bellman equation appears as an argument in a value function
associated with a second Bellman equation.
Thus, two Bellman equations appear:


• equation (43.2) expresses how $\theta_t$ depends on $\mu_t$ and $\theta_{t+1}$
• equation (43.13) expresses how value $v_t$ depends on $(\mu_t, \theta_t)$ and $v_{t+1}$
A value 𝜃 from one Bellman equation appears as an argument of a second Bellman equation for another value 𝑣.

43.8 A Ramsey Planner

A Ramsey planner chooses $\{\mu_t, \theta_t\}_{t=0}^\infty$ to maximize (43.11) subject to the law of motion (43.5).

We split this problem into two stages, as in the lecture Stackelberg plans and [Ljungqvist and Sargent, 2018] Chapter 19.
In the first stage, we take the initial inflation rate 𝜃0 as given and pose an ordinary discounted dynamic programming
problem that in our setting becomes an LQ discounted dynamic programming problem.
In the second stage, we choose an optimal initial inflation rate 𝜃0 .
Define a feasible set of $\{x_{t+1}, \mu_t\}_{t=0}^\infty$ sequences, with each sequence belonging to $L^2$:

$$\Omega(x_0) = \left\{ \{x_{t+1}, \mu_t\}_{t=0}^\infty : \; x_{t+1} = A x_t + B \mu_t, \; \forall t \geq 0 \right\}$$

where we require that $\{x_{t+1}, \mu_t\}_{t=0}^\infty \in L^2 \times L^2$.

43.8.1 Subproblem 1

The value function

$$J(x_0) = \max_{\{x_{t+1}, \mu_t\}_{t=0}^\infty \in \Omega(x_0)} \; \sum_{t=0}^\infty \beta^t s(x_t, \mu_t) \tag{43.15}$$

satisfies the Bellman equation

$$J(x) = \max_{\mu, x'} \left\{ s(x, \mu) + \beta J(x') \right\}$$

subject to

$$x' = A x + B \mu$$

As in the lecture Stackelberg plans, we can map this problem into a linear-quadratic control problem and deduce an
optimal value function 𝐽 (𝑥).
Guessing that 𝐽 (𝑥) = −𝑥′ 𝑃 𝑥 and substituting into the Bellman equation gives rise to the algebraic matrix Riccati
equation:

𝑃 = 𝑅 + 𝛽𝐴′ 𝑃 𝐴 − 𝛽 2 𝐴′ 𝑃 𝐵(𝑄 + 𝛽𝐵′ 𝑃 𝐵)−1 𝐵′ 𝑃 𝐴

and an optimal decision rule

𝜇𝑡 = −𝐹 𝑥𝑡

where

𝐹 = 𝛽(𝑄 + 𝛽𝐵′ 𝑃 𝐵)−1 𝐵′ 𝑃 𝐴 (43.16)

The QuantEcon LQ class solves for 𝐹 and 𝑃 given inputs 𝑄, 𝑅, 𝐴, 𝐵, and 𝛽.
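Here is a compact sketch of that call at the default parameter values; it anticipates the ChangLQ class presented later in this lecture, so the particular parameters are assumptions at this point:

import numpy as np
from quantecon import LQ

# Subproblem 1 via the QuantEcon LQ class (default parameters assumed)
u0, u1, u2, α, c, β = 1, 0.5, 3, 1, 2, 0.85
R = -np.array([[u0, -u1 * α / 2],
               [-u1 * α / 2, -u2 * α**2 / 2]])
Q = np.array([[c / 2]])
A = np.array([[1, 0], [0, (1 + α) / α]])
B = np.array([[0], [-1 / α]])
P, F, d = LQ(Q, R, A, B, beta=β).stationary_values()
print(-F)              # (b_0, b_1) in μ_t = b_0 + b_1 θ_t
print((A - B @ F)[1])  # (d_0, d_1) in θ_{t+1} = d_0 + d_1 θ_t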


The value function for a (continuation) Ramsey planner is

$$v_t = -\begin{bmatrix} 1 & \theta_t \end{bmatrix} \begin{bmatrix} P_{11} & P_{12} \\ P_{21} & P_{22} \end{bmatrix} \begin{bmatrix} 1 \\ \theta_t \end{bmatrix}$$

or

$$v_t = -P_{11} - 2P_{21}\theta_t - P_{22}\theta_t^2$$

or

$$v_t = g_0 + g_1\theta_t + g_2\theta_t^2 \tag{43.17}$$

where

$$g_0 = -P_{11}, \quad g_1 = -2P_{21}, \quad g_2 = -P_{22}$$

The Ramsey plan for setting $\mu_t$ is

$$\mu_t = -\begin{bmatrix} F_1 & F_2 \end{bmatrix} \begin{bmatrix} 1 \\ \theta_t \end{bmatrix}$$

or

$$\mu_t = b_0 + b_1\theta_t \tag{43.18}$$

where $b_0 = -F_1$, $b_1 = -F_2$ and $F$ satisfies equation (43.16).


The Ramsey planner’s decision rule for updating 𝜃𝑡+1 is

$$\theta_{t+1} = d_0 + d_1\theta_t \tag{43.19}$$

where $[\, d_0 \;\; d_1 \,]$ is the second row of the closed-loop matrix $A - BF$ computed in subproblem 1 above.
The linear quadratic control problem (43.15) satisfies regularity conditions that guarantee that 𝐴 − 𝐵𝐹 is a stable matrix
(i.e., its maximum eigenvalue is strictly less than 1 in absolute value).
Consequently, we are assured that

|𝑑1 | < 1, (43.20)

a stability condition that will play an important role.


It remains for us to describe how the Ramsey planner sets 𝜃0 .
Subproblem 2 does that.

43.8.2 Subproblem 2

The value of the Ramsey problem is

$$V^R = \max_{x_0} J(x_0)$$

We abuse notation slightly by writing $J(x)$ as $J(\theta)$ and rewrite the above equation as

Note: Since $x = \begin{bmatrix} 1 \\ \theta \end{bmatrix}$, it follows that $\theta$ is the only component of $x$ that can possibly vary.

$$V^R = \max_{\theta_0} J(\theta_0)$$

Evidently, 𝑉 𝑅 is the maximum value of 𝑣0 defined in equation (43.11).


Value function $J(\theta_0)$ satisfies

$$J(\theta_0) = -\begin{bmatrix} 1 & \theta_0 \end{bmatrix} \begin{bmatrix} P_{11} & P_{12} \\ P_{21} & P_{22} \end{bmatrix} \begin{bmatrix} 1 \\ \theta_0 \end{bmatrix} = -P_{11} - 2P_{21}\theta_0 - P_{22}\theta_0^2$$

The first-order necessary condition for maximizing $J(\theta_0)$ with respect to $\theta_0$ is

$$-2P_{21} - 2P_{22}\theta_0 = 0$$

which implies

$$\theta_0 = \theta_0^R = -\frac{P_{21}}{P_{22}}$$

43.9 Representation of Ramsey Plan

The preceding calculations indicate that we can represent a Ramsey plan 𝜇⃗ recursively with the following system created
in the spirit of Chang [Chang, 1998]:

$$\begin{aligned}
\theta_0 &= \theta_0^R \\
\mu_t &= b_0 + b_1\theta_t \\
v_t &= g_0 + g_1\theta_t + g_2\theta_t^2 \\
\theta_{t+1} &= d_0 + d_1\theta_t, \quad d_0 > 0, \; d_1 \in (0, 1)
\end{aligned} \tag{43.21}$$

where 𝑏0 , 𝑏1 , 𝑔0 , 𝑔1 , 𝑔2 are positive parameters that we shall compute with Python code below.
From condition (43.20), we know that |𝑑1 | < 1.
To interpret system (43.21), think of the sequence $\{\theta_t\}_{t=0}^\infty$ as a sequence of synthetic promised inflation rates.
For some purposes, we can think of these promised inflation rates just as computational devices for generating a sequence $\vec\mu$ of money growth rates that when substituted into equation (43.3) generate actual rates of inflation.
It can be verified that if we substitute a plan $\vec\mu = \{\mu_t\}_{t=0}^\infty$ that satisfies these equations into equation (43.3), we obtain the same sequence $\vec\theta$ generated by the system (43.21).
(Here an application of the Big $K$, little $k$ trick is again at work.)
Thus, within the Ramsey plan, promised inflation equals actual inflation.
System (43.21) implies that under the Ramsey plan

$$\theta_t = d_0 \left(\frac{1 - d_1^t}{1 - d_1}\right) + d_1^t \theta_0^R , \tag{43.22}$$

Because $d_1 \in (0, 1)$, it follows from (43.22) that as $t \to \infty$, $\theta_t^R$ converges to

$$\lim_{t \to +\infty} \theta_t^R = \theta_\infty^R = \frac{d_0}{1 - d_1} . \tag{43.23}$$

Furthermore, we shall see that $\theta_t^R$ converges to $\theta_\infty^R$ from above.
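A small sketch illustrates (43.22)-(43.23) with the decay parameters reported elsewhere in this lecture series for β = 0.85, c = 2; the starting value below is an arbitrary illustration:

# Iterate θ_{t+1} = d_0 + d_1 θ_t and compare with the limit d_0 / (1 - d_1)
d0, d1 = -0.064507, 0.400464   # values for β = 0.85, c = 2 reported in these lectures
θ = -0.05                      # an arbitrary starting value
for _ in range(50):
    θ = d0 + d1 * θ
print(θ, d0 / (1 - d1))        # both ≈ θ_∞^R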


Meanwhile, $\mu_t$ varies over time according to

$$\mu_t = b_0 + b_1 d_0 \left(\frac{1 - d_1^t}{1 - d_1}\right) + b_1 d_1^t \theta_0^R . \tag{43.24}$$

Variation of $\vec\mu^R, \vec\theta^R, \vec v^R$ over time is a symptom of time inconsistency.
• The Ramsey planner reaps immediate benefits from promising lower inflation later to be achieved by costly dis-
torting taxes.
• These benefits are intermediated by reductions in expected inflation that precede the reductions in money creation
rates that rationalize them, as indicated by equation (43.3).

43.10 Multiple roles of 𝜃𝑡

The inflation rate 𝜃𝑡 plays three roles:


• In equation (43.3), 𝜃𝑡 is the actual rate of inflation between 𝑡 and 𝑡 + 1.
• In equation (43.2) and (43.3), 𝜃𝑡 is also the public’s expected rate of inflation between 𝑡 and 𝑡 + 1.
• In system (43.21), 𝜃𝑡 is a promised rate of inflation chosen by the Ramsey planner at time 0.
That the same variable 𝜃𝑡 takes on these multiple roles brings insights about
• whether the government follows or leads the market,
• forward guidance, and
• inflation targeting.

43.11 Time inconsistency

As discussed in Stackelberg plans and Optimal taxation with state-contingent debt, a continuation Ramsey plan is not a
Ramsey plan.
This is a concise way of characterizing the time inconsistency of a Ramsey plan.
In the present context, a symptom of time inconsistency is that the Ramsey planner chooses to make $\mu_t$ a non-constant function of time $t$ despite the fact that, other than time itself, there is no other state variable.
Thus, in our context, time-variation of 𝜇⃗ chosen by a Ramsey planner is the telltale sign of the Ramsey plan’s time
inconsistency.

43.12 Constrained-to-Constant-Growth-Rate Ramsey Plan

We can use brute force to create a government plan that is time consistent, i.e., that is a time-invariant function of time.
We simply constrain a planner to choose a time-invariant money growth rate 𝜇̄ so that

𝜇𝑡 = 𝜇,̄ ∀𝑡 ≥ 0.

We assume that the government knows the perfect foresight outcome implied by equation (43.2) that 𝜃𝑡 = 𝜇̄ when 𝜇𝑡 = 𝜇̄
for all 𝑡 ≥ 0.
It follows that the value of such a plan is given by $V(\bar\mu)$ defined in equation (43.14).


Then our restricted Ramsey planner chooses $\bar\mu$ to maximize $V(\bar\mu)$.
We can express $V(\bar\mu)$ as

$$V(\bar\mu) = (1-\beta)^{-1}\left[U(-\alpha\bar\mu) - \frac{c}{2}\bar\mu^2\right] \tag{43.25}$$

With the quadratic form (43.6) for the utility function $U$, the maximizing $\bar\mu$ is

$$\mu^{CR} = \arg\max_{\bar\mu} V(\bar\mu) = -\frac{\alpha u_1}{\alpha^2 u_2 + c} \tag{43.26}$$

The optimal value attained by a constrained-to-constant-$\mu$ Ramsey planner is

$$V(\mu^{CR}) \equiv V^{CR} = (1-\beta)^{-1}\left[U(-\alpha\mu^{CR}) - \frac{c}{2}(\mu^{CR})^2\right] \tag{43.27}$$
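A quick numerical sketch confirms (43.26) by maximizing (43.25) on a grid at the default parameters (the grid itself is an assumption):

import numpy as np

# Maximize V(μ̄) in (43.25) on a grid and compare with the closed form (43.26)
u0, u1, u2, α, c, β = 1, 0.5, 3, 1, 2, 0.85
U = lambda x: u0 + u1 * x - u2 / 2 * x**2
V = lambda μ: (U(-α * μ) - c / 2 * μ**2) / (1 - β)
grid = np.linspace(-0.5, 0.5, 100_001)
print(grid[np.argmax(V(grid))], -α * u1 / (α**2 * u2 + c))   # both ≈ -0.1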
Recall that time-variation of $\vec\mu$ chosen by a Ramsey planner is the telltale sign of the Ramsey plan's time inconsistency.
Obviously, our constrained-to-constant-$\mu$ Ramsey planner must choose a plan that is time consistent.

43.13 Markov Perfect Governments

To generate an alternative model of time-consistent government decision making, we assume another timing protocol.
In this one, there is a sequence of government policymakers.
A time 𝑡 government chooses 𝜇𝑡 and expects all future governments to set 𝜇𝑡+𝑗 = 𝜇.̄
This assumption mirrors an assumption made in this QuantEcon lecture: Markov Perfect Equilibrium.
When it sets 𝜇𝑡 , the government at 𝑡 believes that 𝜇̄ is unaffected by its choice of 𝜇𝑡 .
According to equation (43.3), the time $t$ rate of inflation is then

$$\theta_t = \frac{1}{1+\alpha}\mu_t + \frac{\alpha}{1+\alpha}\bar\mu, \tag{43.28}$$

which expresses inflation $\theta_t$ as a geometric weighted average of money growth today $\mu_t$ and money growth from tomorrow onward $\bar\mu$.
Given $\bar\mu$, the time $t$ government chooses $\mu_t$ to maximize:

$$H(\mu_t, \bar\mu) = U(-\alpha\theta_t) - \frac{c}{2}\mu_t^2 + \beta V(\bar\mu) \tag{43.29}$$

where $V(\bar\mu)$ is given by formula (43.14) for the time 0 value $v_0$ of recursion (43.13) under a money supply growth rate that is forever constant at $\bar\mu$.
Substituting (43.28) into (43.29) and expanding gives:

$$H(\mu_t, \bar\mu) = u_0 + u_1\left(-\frac{\alpha^2}{1+\alpha}\bar\mu - \frac{\alpha}{1+\alpha}\mu_t\right) - \frac{u_2}{2}\left(-\frac{\alpha^2}{1+\alpha}\bar\mu - \frac{\alpha}{1+\alpha}\mu_t\right)^2 - \frac{c}{2}\mu_t^2 + \beta V(\bar\mu) \tag{43.30}$$

The first-order necessary condition for maximizing $H(\mu_t, \bar\mu)$ with respect to $\mu_t$ is:

$$-\frac{\alpha}{1+\alpha}u_1 - u_2\left(-\frac{\alpha^2}{1+\alpha}\bar\mu - \frac{\alpha}{1+\alpha}\mu_t\right)\left(-\frac{\alpha}{1+\alpha}\right) - c\mu_t = 0$$
Rearranging we get the time 𝑡 government’s best response map

𝜇𝑡 = 𝑓(𝜇)̄


where

$$f(\bar\mu) = \frac{-u_1}{\frac{1+\alpha}{\alpha}c + \frac{\alpha}{1+\alpha}u_2} - \frac{\alpha^2 u_2}{\left[\frac{1+\alpha}{\alpha}c + \frac{\alpha}{1+\alpha}u_2\right](1+\alpha)}\,\bar\mu$$

A Markov Perfect Equilibrium (MPE) outcome $\mu^{MPE}$ is a fixed point of the best response map:

$$\mu^{MPE} = f(\mu^{MPE})$$

Calculating $\mu^{MPE}$, we find

$$\mu^{MPE} = \frac{-u_1}{\frac{1+\alpha}{\alpha}c + \frac{\alpha}{1+\alpha}u_2 + \frac{\alpha^2}{1+\alpha}u_2}$$

This can be simplified to

$$\mu^{MPE} = -\frac{\alpha u_1}{\alpha^2 u_2 + (1+\alpha)c}. \tag{43.31}$$

The value of a Markov perfect equilibrium is

$$V^{MPE} = \frac{s(\mu^{MPE}, \mu^{MPE})}{1-\beta} \tag{43.32}$$
or

𝑉 𝑀𝑃 𝐸 = 𝑉 (𝜇𝑀𝑃 𝐸 )

where 𝑉 (⋅) is given by formula (43.14).
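The fixed point can also be found by simply iterating the best response map; this sketch uses the default parameters and an arbitrary starting point:

# Iterate μ ← f(μ) to the MPE fixed point and compare with the closed form (43.31)
u1, u2, α, c = 0.5, 3, 1, 2
D = (1 + α) / α * c + α / (1 + α) * u2          # the common denominator in f
f = lambda μ_bar: -u1 / D - α**2 * u2 / (D * (1 + α)) * μ_bar
μ = 0.0
for _ in range(100):
    μ = f(μ)
print(μ, -α * u1 / (α**2 * u2 + (1 + α) * c))   # both ≈ -0.0714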


Under the Markov perfect timing protocol
• a government takes $\bar\mu$ as given when it chooses $\mu_t$
• we equate $\mu_t = \bar\mu$ only after we have computed a time $t$ government's first-order condition for $\mu_t$.

43.14 Outcomes under Three Timing Protocols

We want to compare outcome sequences {𝜃𝑡 , 𝜇𝑡 } under three timing protocols associated with
• a standard Ramsey plan with its time-varying {𝜃𝑡 , 𝜇𝑡 } sequences
• a Markov perfect equilibrium, with its time-invariant {𝜃𝑡 , 𝜇𝑡 } sequences
• a nonstandard Ramsey plan in which the planner is restricted to choose a time-invariant 𝜇𝑡 = 𝜇 for all 𝑡 ≥ 0.
We have computed closed form formulas for several of these outcomes, which we find it convenient to repeat here.
In particular, the constrained-to-constant-inflation Ramsey inflation outcome is $\mu^{CR}$, which according to equation (43.26) is

$$\theta^{CR} = -\frac{\alpha u_1}{\alpha^2 u_2 + c}$$

Equation (43.31) implies that the Markov perfect constant inflation rate is

$$\theta^{MPE} = -\frac{\alpha u_1}{\alpha^2 u_2 + (1+\alpha)c}$$

According to equation (43.8), the bliss level of inflation that we associated with a Friedman rule is

$$\theta^* = -\frac{u_1}{u_2 \alpha}$$


Proposition 1: When $c = 0$, $\theta^{MPE} = \theta^{CR} = \theta^*$ and $\theta_0^R = \theta_\infty^R$.
The first two equalities follow from the preceding three equations.
We'll illustrate the third equality that equates $\theta_0^R$ to $\theta_\infty^R$ with some quantitative examples below.
Proposition 1 draws attention to how a positive tax distortion parameter $c$ alters the optimal rate of deflation that Milton Friedman financed by imposing a lump sum tax.
We'll compute
• $(\vec\theta^R, \vec\mu^R)$: ordinary time-varying Ramsey sequences
• $(\theta^{MPE} = \mu^{MPE})$: Markov perfect equilibrium (MPE) fixed values
• $(\theta^{CR} = \mu^{CR})$: fixed values associated with a constrained-to-time-invariant-$\mu$ Ramsey plan
• $\theta^*$: bliss level of inflation prescribed by a Friedman rule
We will create a class ChangLQ that solves the models and stores their values

class ChangLQ:
"""
Class to solve LQ Chang model
"""
def __init__(self, β, c, α=1, u0=1, u1=0.5, u2=3, T=1000, θ_n=200):
# Record parameters
self.α, self.u0, self.u1, self.u2 = α, u0, u1, u2
self.β, self.c, self.T, self.θ_n = β, c, T, θ_n

self.setup_LQ_matrices()
self.solve_LQ_problem()
self.compute_policy_functions()
self.simulate_ramsey_plan()
self.compute_θ_range()
self.compute_value_and_policy()

def setup_LQ_matrices(self):
# LQ Matrices
self.R = -np.array([[self.u0, -self.u1 * self.α / 2],
[-self.u1 * self.α / 2,
-self.u2 * self.α**2 / 2]])
self.Q = -np.array([[-self.c / 2]])
self.A = np.array([[1, 0], [0, (1 + self.α) / self.α]])
self.B = np.array([[0], [-1 / self.α]])

def solve_LQ_problem(self):
# Solve LQ Problem (Subproblem 1)
lq = LQ(self.Q, self.R, self.A, self.B, beta=self.β)
self.P, self.F, self.d = lq.stationary_values()

# Compute g0, g1, and g2 (43.17)


self.g0, self.g1, self.g2 = [-self.P[0, 0],
-2 * self.P[1, 0], -self.P[1, 1]]

# Compute b0 and b1 (43.18)


[[self.b0, self.b1]] = self.F

# Compute d0 and d1 (43.19)


self.cl_mat = (self.A - self.B @ self.F) # Closed loop matrix
[[self.d0, self.d1]] = self.cl_mat[1:]

# Solve Subproblem 2
self.θ_R = -self.P[0, 1] / self.P[1, 1]

# Find the bliss level of θ


self.θ_B = -self.u1 / (self.u2 * self.α)

def compute_policy_functions(self):
# Solve the Markov Perfect Equilibrium
self.μ_MPE = -self.u1 / ((1 + self.α) / self.α * self.c
+ self.α / (1 + self.α)
* self.u2 + self.α**2
/ (1 + self.α) * self.u2)
self.θ_MPE = self.μ_MPE
self.μ_CR = -self.α * self.u1 / (self.u2 * self.α**2 + self.c)
self.θ_CR = self.μ_CR

# Calculate value under MPE and CR economy


self.J_θ = lambda θ_array: - np.array([1, θ_array]) \
@ self.P @ np.array([1, θ_array]).T
self.V_θ = lambda θ: (self.u0 + self.u1 * (-self.α * θ)
- self.u2 / 2 * (-self.α * θ)**2
- self.c / 2 * θ**2) / (1 - self.β)

self.J_MPE = self.V_θ(self.μ_MPE)
self.J_CR = self.V_θ(self.μ_CR)

def simulate_ramsey_plan(self):
# Simulate Ramsey plan for large number of periods
θ_series = np.vstack((np.ones((1, self.T)), np.zeros((1, self.T))))
μ_series = np.zeros(self.T)
J_series = np.zeros(self.T)
θ_series[1, 0] = self.θ_R
[μ_series[0]] = -self.F.dot(θ_series[:, 0])
J_series[0] = self.J_θ(θ_series[1, 0])

for i in range(1, self.T):


θ_series[:, i] = self.cl_mat @ θ_series[:, i-1]
[μ_series[i]] = -self.F @ θ_series[:, i]
J_series[i] = self.J_θ(θ_series[1, i])

self.J_series = J_series
self.μ_series = μ_series
self.θ_series = θ_series

def compute_θ_range(self):
# Find the range of θ in Ramsey plan
θ_LB = min(min(self.θ_series[1, :]), self.θ_B)
θ_UB = max(max(self.θ_series[1, :]), self.θ_MPE)
θ_range = θ_UB - θ_LB
self.θ_LB = θ_LB - 0.05 * θ_range
self.θ_UB = θ_UB + 0.05 * θ_range
self.θ_range = θ_range

def compute_value_and_policy(self):
# Create the θ_space


self.θ_space = np.linspace(self.θ_LB, self.θ_UB, 200)

# Find value function and policy functions over range of θ


self.J_space = np.array([self.J_θ(θ) for θ in self.θ_space])
self.μ_space = -self.F @ np.vstack((np.ones(200), self.θ_space))
x_prime = self.cl_mat @ np.vstack((np.ones(200), self.θ_space))
self.θ_prime = x_prime[1, :]
self.CR_space = np.array([self.V_θ(θ) for θ in self.θ_space])

self.μ_space = self.μ_space[0, :]

# Calculate J_range, J_LB, and J_UB


self.J_range = np.ptp(self.J_space)
self.J_LB = np.min(self.J_space) - 0.05 * self.J_range
self.J_UB = np.max(self.J_space) + 0.05 * self.J_range

Let’s create an instance of ChangLQ with the following parameters:

clq = ChangLQ(β=0.85, c=2)

The following code plots policy functions for a continuation Ramsey planner.
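The plotting cell itself is not reproduced in this text; the following is only a rough sketch of such a plot built from the attributes that ChangLQ stores, with colors chosen to match the description below:

# Sketch of the policy-function figure described below, using attributes stored by ChangLQ
plt.plot(clq.θ_space, clq.θ_prime, color='blue', label=r'$\theta_{t+1}$')
plt.plot(clq.θ_space, clq.μ_space, color='green', label=r'$\mu_t$')
plt.plot(clq.θ_space, clq.θ_space, color='black', linestyle=':', label='45-degree line')
plt.xlabel(r'$\theta_t$')
plt.legend()
plt.show()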

The dotted line in the above graph is the 45-degree line.


The blue line shows the choice of 𝜃𝑡+1 = 𝜃′ chosen by a continuation Ramsey planner who inherits 𝜃𝑡 = 𝜃.


The green line shows a continuation Ramsey planner's choice of $\mu_t = \mu$ as a function of an inherited $\theta_t = \theta$.
Dynamics under the Ramsey plan are confined to $\theta \in [\theta_\infty^R, \theta_0^R]$.
The blue and green lines intersect each other and the 45-degree line at $\theta = \theta_\infty^R$.
Notice that for $\theta \in (\theta_\infty^R, \theta_0^R]$
• $\theta' < \theta$ because the blue line is below the 45-degree line
• $\mu > \theta$ because the green line is above the 45-degree line
It follows that under the Ramsey plan $\{\theta_t\}$ and $\{\mu_t\}$ both converge monotonically from above to $\theta_\infty^R$.
The next code plots the Ramsey planner’s value function 𝐽 (𝜃).
We know that $J(\theta)$ is maximized at $\theta_0^R$, the best time 0 promised inflation rate.
The figure also plots the limiting value $\theta_\infty^R$, the limiting value of promised inflation rate $\theta_t$ under the Ramsey plan as $t \to +\infty$.
The figure also indicates an MPE inflation rate $\theta^{MPE}$, the inflation $\theta^{CR}$ under a Ramsey plan constrained to a constant money creation rate, and a bliss inflation $\theta^*$.

In the above graph, notice that $\theta^* < \theta_\infty^R < \theta^{CR} < \theta_0^R < \theta^{MPE}$:
• $\theta_0^R < \theta^{MPE}$: the initial Ramsey inflation rate exceeds the MPE inflation rate
• $\theta_\infty^R < \theta^{CR} < \theta_0^R$: the initial Ramsey deflation rate, and the associated tax distortion cost $c\mu_0^2$, is less than the limiting Ramsey inflation rate $\theta_\infty^R$ and the associated tax distortion cost $\mu_\infty^2$
• $\theta^* < \theta_\infty^R$: the limiting Ramsey inflation rate exceeds the bliss level of inflation


In some subsequent calculations, we'll use our Python code to study how gaps between these outcomes vary depending on parameters such as the cost parameter $c$ and the discount factor $\beta$.

43.15 Ramsey Planner’s Value Function

The next code plots the Ramsey Planner’s value function 𝐽 (𝜃) as well as the value function of a constrained Ramsey
planner who must choose a constant 𝜇.
Because a time-invariant $\mu$ implies a time-invariant $\theta$, we take the liberty of labeling this value function $V(\theta)$.
We’ll use the code to plot 𝐽 (𝜃) and 𝑉 (𝜃) for several values of the discount factor 𝛽 and the cost parameter 𝑐 that multiplies
𝜇2𝑡 in the Ramsey planner’s one-period payoff function.
In all of the graphs below, we disarm the Proposition 1 equivalence results by setting 𝑐 > 0.
The graphs reveal interesting relationships among 𝜃’s associated with various timing protocols:
• $J(\theta) \geq V(\theta)$
• $J(\theta_\infty^R) = V(\theta_\infty^R)$
Before doing anything else, let's write code to verify our claim that $J(\theta_\infty^R) = V(\theta_\infty^R)$.
Here is the code.

θ_inf = clq.θ_series[1, -1]


np.allclose(clq.J_θ(θ_inf),
clq.V_θ(θ_inf))

True

So we have verified our claim that $J(\theta_\infty^R) = V(\theta_\infty^R)$.
Since $J(\theta_\infty^R) = V(\theta_\infty^R)$ occurs at a tangency point at which $J(\theta)$ is increasing in $\theta$, it follows that

$$V(\theta_\infty^R) \leq J(\theta^{CR}) \tag{43.33}$$

with strict inequality when $c > 0$.
Thus, the value of the plan that sets the money growth rate $\mu_t = \theta_\infty^R$ for all $t \geq 0$ is worse than the value attained by a Ramsey planner who is constrained to set a constant $\mu_t$.
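A one-line numerical check of this claim, using the clq instance and θ_inf computed above:

# V(θ_∞^R) is below V(θ^CR) when c > 0 (uses clq and θ_inf from the cells above)
print(clq.V_θ(θ_inf) < clq.V_θ(clq.θ_CR))   # True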
Now let’s write some code to plot outcomes under our three timing protocols.
For some default parameter values, the next figure plots the Ramsey planner’s continuation value function 𝐽 (𝜃) (orange
curve) and the restricted-to-constant-𝜇 Ramsey planner’s value function 𝑉 (𝜃) (blue curve).
The figure uses colored arrows to indicate locations of $\theta^*$, $\theta_\infty^R$, $\theta^{CR}$, $\theta_0^R$, and $\theta^{MPE}$, ordered as they are from left to right, on the $\theta$ axis.


In the above figure, notice that
• the orange $J$ value function lies above the blue $V$ value function except at $\theta = \theta_\infty^R$
• the maximizer $\theta_0^R$ of $J(\theta)$ occurs at the top of the orange curve
• the maximizer $\theta^{CR}$ of $V(\theta)$ occurs at the top of the blue curve
• the "timeless perspective" inflation and money creation rate $\theta_\infty^R$ occurs where $J(\theta)$ is tangent to $V(\theta)$
• the Markov perfect inflation and money creation rate $\theta^{MPE}$ exceeds $\theta_0^R$
• the value $V(\theta^{MPE})$ of the Markov perfect money creation rate $\theta^{MPE}$ is less than the value $V(\theta_\infty^R)$ of the worst continuation Ramsey plan
• the continuation value $J(\theta^{MPE})$ of the Markov perfect money creation rate $\theta^{MPE}$ is greater than both the value $V(\theta_\infty^R)$ and the continuation value $J(\theta_\infty^R)$ of the worst continuation Ramsey plan


43.16 Perturbing Model Parameters

Now let's present some graphs that teach us how outcomes change when we assume different values of $\beta$.

The horizontal dotted lines indicate values $V(\mu_\infty^R)$, $V(\mu^{CR})$, and $V(\mu^{MPE})$ of the time-invariant money growth rates $\mu_\infty^R$, $\mu^{CR}$, and $\mu^{MPE}$, respectively.

Notice how $J(\theta)$ and $V(\theta)$ are tangent and increasing at $\theta = \theta_\infty^R$, which implies that $\theta^{CR} > \theta_\infty^R$ and $J(\theta^{CR}) > J(\theta_\infty^R)$.

Notice how changes in $\beta$ alter $\theta_\infty^R$ and $\theta_0^R$ but neither $\theta^*$, $\theta^{CR}$, nor $\theta^{MPE}$, in accord with formulas (43.8), (43.26), and (43.31), which imply that

$$\theta^{CR} = -\frac{\alpha u_1}{\alpha^2 u_2 + c}$$

$$\theta^{MPE} = -\frac{\alpha u_1}{\alpha^2 u_2 + (1+\alpha)c}$$

$$\theta^* = -\frac{u_1}{u_2 \alpha}$$
The following table summarizes some outcomes.

            β = 0.7, c = 2    β = 0.8, c = 2    β = 0.99, c = 2
θ*          −0.167            −0.167            −0.167
θ^R_∞       −0.121            −0.111            −0.100
θ^CR        −0.100            −0.100            −0.100
θ^R_0       −0.083            −0.081            −0.079
θ^MPE       −0.071            −0.071            −0.071

But let’s see what happens when we change 𝑐.

# Increase c to 100
fig, axes = plt.subplots(1, 3, figsize=(12, 5))
c_values = [1, 10, 100]

clqs = [ChangLQ(β=0.85, c=c) for c in c_values]
plt_clqs(clqs, axes)

generate_table(clqs, dig=4)

            β = 0.85, c = 1    β = 0.85, c = 10    β = 0.85, c = 100
θ*          −0.167             −0.167              −0.167
θ^R_∞       −0.131             −0.044              −0.006
θ^CR        −0.125             −0.038              −0.005
θ^R_0       −0.107             −0.028              −0.003
θ^MPE       −0.100             −0.022              −0.003

The above table and figures show how changes in $c$ alter $\theta_\infty^R$ and $\theta_0^R$ as well as $\theta^{CR}$ and $\theta^{MPE}$, but not $\theta^*$, again in accord with formulas (43.8), (43.26), and (43.31).
Notice that as $c$ gets larger and larger, $\theta_\infty^R$, $\theta_0^R$ and $\theta^{CR}$ all converge to $\theta^{MPE}$.
Now let’s watch what happens when we drive 𝑐 toward zero.

# Decrease c towards 0
fig, axes = plt.subplots(1, 3, figsize=(12, 5))
c_limits = [1, 0.1, 0.01]

clqs = [ChangLQ(β=0.85, c=c) for c in c_limits]


plt_clqs(clqs, axes)


The above graphs indicate that as $c$ approaches zero, $\theta_\infty^R$, $\theta_0^R$, $\theta^{CR}$, and $\theta^{MPE}$ all approach $\theta^*$.
This makes sense, because it was by adding costs of distorting taxes that Calvo [Calvo, 1978] drove a wedge between
Friedman’s optimal deflation rate and the inflation rates chosen by a Ramsey planner.
The following code plots sequences 𝜇⃗ and 𝜃 ⃗ prescribed by a Ramsey plan as well as the constant levels 𝜇𝐶𝑅 and 𝜇𝑀𝑃 𝐸 .
The following graphs report values for the value function parameters 𝑔0 , 𝑔1 , 𝑔2 , and the Ramsey policy function parameters
𝑏0 , 𝑏1 , 𝑑0 , 𝑑1 associated with the indicated parameter pair 𝛽, 𝑐.
We’ll vary 𝛽 while keeping a small 𝑐.
After that we’ll study consequences of raising 𝑐.
We’ll watch how the decay rate 𝑑1 governing the dynamics of 𝜃𝑡𝑅 is affected by alterations in the parameters 𝛽, 𝑐.

for β in β_values:
clq = ChangLQ(β=β, c=2)
generate_param_table(clq)
plot_ramsey_MPE(clq)

𝑔0 𝑔1 𝑔2 𝑏0 𝑏1 𝑑0 𝑑1
𝛽 = 0.7, 𝑐 = 2 3.39 −0.75 −4.54 −0.06 −1.52 −0.06 0.48


𝑔0 𝑔1 𝑔2 𝑏0 𝑏1 𝑑0 𝑑1
𝛽 = 0.8, 𝑐 = 2 5.1 −0.76 −4.65 −0.06 −1.58 −0.06 0.42

𝑔0 𝑔1 𝑔2 𝑏0 𝑏1 𝑑0 𝑑1
𝛽 = 0.99, 𝑐 = 2 102.47 −0.76 −4.81 −0.07 −1.65 −0.07 0.35

Notice how 𝑑1 changes as we raise the discount factor parameter 𝛽.


Now let’s study how increasing 𝑐 affects 𝜃,⃗ 𝜇⃗ outcomes.

# Increase c to 100
for c in c_values:
clq = ChangLQ(β=0.85, c=c)
generate_param_table(clq)
plot_ramsey_MPE(clq)

𝑔0 𝑔1 𝑔2 𝑏0 𝑏1 𝑑0 𝑑1
𝛽 = 0.85, 𝑐 = 1 6.84 −0.68 −3.19 −0.09 −1.69 −0.09 0.31


𝑔0 𝑔1 𝑔2 𝑏0 𝑏1 𝑑0 𝑑1
𝛽 = 0.85, 𝑐 = 10 6.72 −0.92 −16.16 −0.02 −1.47 −0.02 0.53

𝑔0 𝑔1 𝑔2 𝑏0 𝑏1 𝑑0 𝑑1
𝛽 = 0.85, 𝑐 = 100 6.67 −0.99 −143.29 −0.0 −1.42 −0.0 0.58

Evidently, increasing 𝑐 causes the decay factor 𝑑1 to increase.


Next, let’s look at consequences of increasing the demand for real balances parameter 𝛼 from its default value 𝛼 = 1 to
𝛼 = 4.

# Increase α from 1 to 4
for c in [10, 100]:
clq = ChangLQ(α=4, β=0.85, c=c)
generate_param_table(clq)
plot_ramsey_MPE(clq)

𝑔0 𝑔1 𝑔2 𝑏0 𝑏1 𝑑0 𝑑1
𝛽 = 0.85, 𝑐 = 10 6.84 −4.62 −82.32 −0.05 −2.33 −0.01 0.67


𝑔0 𝑔1 𝑔2 𝑏0 𝑏1 𝑑0 𝑑1
𝛽 = 0.85, 𝑐 = 100 6.74 −8.03 −390.65 −0.01 −1.47 −0.0 0.88

The above panels for an 𝛼 = 4 setting indicate that 𝛼 and 𝑐 affect outcomes in interesting ways.
We leave it to the reader to explore consequences of other constellations of parameter values.

43.16.1 Implausibility of Ramsey Plan

Many economists regard a time inconsistent plan as implausible because they question the plausibility of a timing protocol in which a plan for setting a sequence of policy variables is chosen once and for all at time 0.
For that reason, the Markov perfect equilibrium concept attracts many economists.
• A Markov perfect equilibrium plan is constructed to insure that a sequence of government policymakers who choose
sequentially do not want to deviate from it.

43.16.2 Ramsey Plan Strikes Back

Research by Abreu [Abreu, 1988], Chari and Kehoe [Chari and Kehoe, 1990], and Stokey [Stokey, 1989], [Stokey, 1991] described conditions under which a Ramsey plan can be rescued from the complaint that it is not credible.
They accomplished this by expanding the description of a plan to include expectations about adverse consequences of
deviating from it that can serve to deter deviations.
We turn to such theories in this quantecon lecture Sustainable Plans for a Calvo Model.



CHAPTER

FORTYFOUR

SUSTAINABLE PLANS FOR A CALVO MODEL

44.1 Overview

This is a sequel to this quantecon lecture Time Inconsistency of Ramsey Plans.


That lecture studied a linear-quadratic version of a model that Guillermo Calvo [Calvo, 1978] used to study the time inconsistency of the optimal government plan that emerges when a Stackelberg government (a.k.a. a Ramsey planner) at time 0 once and for all chooses a sequence $\vec\mu = \{\mu_t\}_{t=0}^\infty$ of gross rates of growth in the supply of money.

A consequence of that choice is a (rational expectations equilibrium) sequence $\vec\theta = \{\theta_t\}_{t=0}^\infty$ of gross rates of increase in the price level that we call inflation rates.
[Calvo, 1978] showed that a Ramsey plan would not emerge from alternative timing protocols and associated supplementary assumptions about what government authorities who set $\mu_t$ at time $t$ believe about how future government authorities who set $\mu_{t+j}$ for $j > 0$ will respond to their decisions.
In this lecture, we explore another set of assumptions about what government authorities who set $\mu_t$ at time $t$ believe about how future government authorities who set $\mu_{t+j}$ for $j > 0$ will respond to their decisions.
We shall assume that there is a sequence of separate policymakers; a time $t$ policymaker chooses only $\mu_t$, but now believes that its choice of $\mu_t$ shapes the representative agent's beliefs about future rates of money creation and inflation, and through them, future government actions.
This timing protocol and belief structure leads to a model of a credible government policy, also known as a sustainable
plan.
In quantecon lecture Time Inconsistency of Ramsey Plans we used ideas from papers by Cagan [Cagan, 1956], Calvo
[Calvo, 1978], and Chang [Chang, 1998].
In addition to those ideas, we’ll also use ideas from Abreu [Abreu, 1988], Stokey [Stokey, 1989], [Stokey, 1991], and
Chari and Kehoe [Chari and Kehoe, 1990] to study outcomes under our timing protocol.

44.2 Model Components

We’ll start with a brief review of the setup.


There is no uncertainty.
Let
• 𝑝𝑡 be the log of the price level
• 𝑚𝑡 be the log of nominal money balances
• 𝜃𝑡 = 𝑝𝑡+1 − 𝑝𝑡 be the net rate of inflation between 𝑡 and 𝑡 + 1


• 𝜇𝑡 = 𝑚𝑡+1 − 𝑚𝑡 be the net rate of growth of nominal balances


The demand for real balances is governed by a perfect foresight version of a Cagan [Cagan, 1956] demand function for
real balances:

𝑚𝑡 − 𝑝𝑡 = −𝛼(𝑝𝑡+1 − 𝑝𝑡 ) , 𝛼 > 0 (44.1)

for 𝑡 ≥ 0.
Equation (44.1) asserts that the demand for real balances is inversely related to the public’s expected rate of inflation,
which equals the actual rate of inflation because there is no uncertainty here.
(When there is no uncertainty, an assumption of rational expectations becomes equivalent to perfect foresight.)
Subtracting the demand function (44.1) at time $t$ from the demand function at $t+1$ gives:

$$\mu_t - \theta_t = -\alpha\theta_{t+1} + \alpha\theta_t$$

or

$$\theta_t = \frac{\alpha}{1+\alpha}\theta_{t+1} + \frac{1}{1+\alpha}\mu_t \tag{44.2}$$

Because $\alpha > 0$, $0 < \frac{\alpha}{1+\alpha} < 1$.
Definition: For scalar $b_t$, let $L^2$ be the space of sequences $\{b_t\}_{t=0}^\infty$ satisfying

$$\sum_{t=0}^\infty b_t^2 < +\infty$$

We say that a sequence that belongs to $L^2$ is square summable.

When we assume that the sequence $\vec\mu = \{\mu_t\}_{t=0}^\infty$ is square summable and we require that the sequence $\vec\theta = \{\theta_t\}_{t=0}^\infty$ is square summable, the linear difference equation (44.2) can be solved forward to get:

$$\theta_t = \frac{1}{1+\alpha} \sum_{j=0}^\infty \left(\frac{\alpha}{1+\alpha}\right)^j \mu_{t+j} \tag{44.3}$$

Insight: In the spirit of Chang [Chang, 1998], equations (44.1) and (44.3) show that $\theta_t$ intermediates how choices of $\mu_{t+j}, j = 0, 1, \ldots$ impinge on time $t$ real balances $m_t - p_t = -\alpha\theta_t$.
An equivalence class of continuation money growth sequences $\{\mu_{t+j}\}_{j=0}^\infty$ deliver the same $\theta_t$.

That future rates of money creation influence earlier rates of inflation makes timing protocols matter for modeling optimal
government policies.
Quantecon lecture Time Inconsistency of Ramsey Plans used this insight to simplify analysis of alternative government
policy problems.

44.3 Another Timing Protocol

The Quantecon lecture Time Inconsistency of Ramsey Plans considered three models of government policy making that
differ in
• what a policymaker chooses, either a sequence 𝜇⃗ or just 𝜇𝑡 in a single period 𝑡.
• when a policymaker chooses, either once and for all at time 0, or at some time or times 𝑡 ≥ 0.
• what a policymaker assumes about how its choice of 𝜇𝑡 affects the representative agent’s expectations about inflation
rates.


In this lecture, there is a sequence of policymakers, each of whom sets 𝜇𝑡 at 𝑡 only.


To set the stage, recall that in a Markov perfect equilibrium
• a time 𝑡 policymaker cares only about 𝑣𝑡 and ignores effects that its choice of 𝜇𝑡 has on 𝑣𝑠 at dates 𝑠 = 0, 1, … , 𝑡−1.
In particular, in a Markov perfect equilibrium, there is a sequence indexed by 𝑡 = 0, 1, 2, … of separate policymakers; a
time 𝑡 policymaker chooses 𝜇𝑡 only and forecasts that future government decisions are unaffected by its choice.
By way of contrast, in this lecture there is sequence of distinct policymakers; a time 𝑡 policymaker chooses only 𝜇𝑡 ,
but now believes that its choice of 𝜇𝑡 shapes the representative agent’s beliefs about future rates of money creation and
inflation, and through them, future government actions.
This timing protocol and belief structure leads to a model of a credible government policy, also known as a sustainable plan.
The relationship between outcomes under a (Ramsey) timing protocol and the timing protocol and belief structure in this lecture is the subject of a literature on sustainable or credible public policies created by Abreu [Abreu, 1988], Chari and Kehoe [Chari and Kehoe, 1990], and Stokey [Stokey, 1989], [Stokey, 1991].
They discovered conditions under which a Ramsey plan can be rescued from the complaint that it is not credible.
They accomplished this by expanding the description of a plan to include expectations about adverse consequences of
deviating from it that can serve to deter deviations.
In this version of our model
• the government does not set $\{\mu_t\}_{t=0}^\infty$ once and for all at $t = 0$
• instead it sets $\mu_t$ at time $t$
• the representative agent's forecasts of $\{\mu_{t+j+1}, \theta_{t+j+1}\}_{j=0}^\infty$ respond to whether the government at $t$ confirms or disappoints its forecasts of $\mu_t$ brought into period $t$ from period $t-1$.
• the government at each time 𝑡 understands how the representative agent’s forecasts will respond to its choice of 𝜇𝑡 .
• at each 𝑡, the government chooses 𝜇𝑡 to maximize a continuation discounted utility.

44.3.1 Government Decisions

𝜇⃗ is chosen by a sequence of government decision makers, one for each 𝑡 ≥ 0.


The time 𝑡 decision maker chooses 𝜇𝑡 .
We assume the following within-period and between-period timing protocol for each $t \geq 0$:
• at time $t-1$, private agents expect that the government will set $\mu_t = \tilde\mu_t$, and more generally that it will set $\mu_{t+j} = \tilde\mu_{t+j}$ for all $j \geq 0$.
• The forecasts $\{\tilde\mu_{t+j}\}_{j \geq 0}$ determine a $\theta_t = \tilde\theta_t$ and an associated log of real balances $m_t - p_t = -\alpha\tilde\theta_t$ at $t$.
• Given those expectations and an associated $\theta_t = \tilde\theta_t$, at $t$ a government is free to set $\mu_t \in \mathbb{R}$.
• If the government at $t$ confirms the representative agent's expectations by setting $\mu_t = \tilde\mu_t$ at time $t$, private agents expect the continuation government policy $\{\tilde\mu_{t+j+1}\}_{j=0}^\infty$ and therefore bring expectation $\tilde\theta_{t+1}$ into period $t+1$.
• If the government at $t$ disappoints private agents by setting $\mu_t \neq \tilde\mu_t$, private agents expect $\{\mu_j^A\}_{j=0}^\infty$ as the continuation policy for $t+1$, i.e., $\{\mu_{t+j+1}\} = \{\mu_j^A\}_{j=0}^\infty$, and therefore expect an associated $\theta_0^A$ for $t+1$. Here $\vec\mu^A = \{\mu_j^A\}_{j=0}^\infty$ is an alternative government plan to be described below.


44.3.2 Temptation to Deviate from Plan

The government's one-period return function $s(\theta, \mu)$ described in equation (43.10) of the quantecon lecture Time Inconsistency of Ramsey Plans has the property that for all $\theta$

𝑠(𝜃, 0) ≥ 𝑠(𝜃, 𝜇)

This inequality implies that whenever the policy calls for the government to set 𝜇 ≠ 0, the government could raise its
one-period payoff by setting 𝜇 = 0.
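A small sketch verifies this inequality over a grid of (θ, μ) pairs; the parameter values and the grid are assumptions:

import numpy as np

# Check s(θ, 0) ≥ s(θ, μ) for the one-period return implied by (43.10)
u0, u1, u2, α, c = 1, 0.5, 3, 1, 2
s = lambda θ, μ: u0 + u1 * (-α * θ) - u2 / 2 * (-α * θ)**2 - c / 2 * μ**2
θ_grid = np.linspace(-0.2, 0.2, 41)
μ_grid = np.linspace(-0.2, 0.2, 41)
print(all(s(θ, 0) >= s(θ, μ) for θ in θ_grid for μ in μ_grid))   # True, since -c/2 μ² ≤ 0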
Disappointing private sector expectations in that way would increase the government’s current payoff but would have
adverse consequences for subsequent government payoffs because the private sector would alter its expectations about
future settings of 𝜇.
The temporary gain constitutes the government’s temptation to deviate from a plan.
If the government at 𝑡 is to resist the temptation to raise its current payoff, it is only because it forecasts adverse con-
sequences that its setting of 𝜇𝑡 would bring for continuation government payoffs via alterations in the private sector’s
expectations.

44.4 Sustainable or Credible Plan

We call a plan 𝜇⃗ sustainable or credible if at each 𝑡 ≥ 0 the government chooses to confirm private agents’ prior
expectation of its setting for 𝜇𝑡 .
The government will choose to confirm prior expectations only if the long-term loss from disappointing private sector expectations – coming from the government's understanding of the way the private sector adjusts its expectations in response to having its prior expectations at 𝑡 disappointed – outweighs the short-term gain from disappointing those expectations.
The theory of sustainable or credible plans assumes throughout that private sector expectations about what future governments will do are based on the assumption that governments at times 𝑡 ≥ 0 always act to maximize the continuation discounted utilities that describe those governments' purposes.
This aspect of the theory means that credible plans always come in pairs:
• a credible (continuation) plan to be followed if the government at 𝑡 confirms private sector expectations
• a credible plan to be followed if the government at 𝑡 disappoints private sector expectations
That credible plans come in pairs threatens to bring an explosion of plans to keep track of
• each credible plan itself consists of two credible plans
• therefore, the number of plans underlying one plan is unbounded
But Dilip Abreu showed how to render manageable the number of plans that must be kept track of.
The key is an object called a self-enforcing plan.
We’ll proceed to compute one.
In addition to what’s in Anaconda, this lecture will use the following libraries:

!pip install --upgrade quantecon

We’ll start with some imports:


import numpy as np
from quantecon import LQ
import matplotlib.pyplot as plt
import pandas as pd

44.4.1 Abreu’s Self-Enforcing Plan

A plan $\vec\mu^A$ (here the superscript $A$ is for Abreu) is said to be self-enforcing if
• the consequence of disappointing the representative agent's expectations at time $j$ is to restart plan $\vec\mu^A$ at time $j+1$
• the consequence of restarting the plan is sufficiently adverse that it forever deters all deviations from the plan

More precisely, a government plan $\vec\mu^A$ with equilibrium inflation sequence $\vec\theta^A$ is self-enforcing if

$$
\begin{aligned}
v_j^A &= s(\theta_j^A, \mu_j^A) + \beta v_{j+1}^A \\
&\geq s(\theta_j^A, 0) + \beta v_0^A \equiv v_j^{A,D}, \qquad j \geq 0
\end{aligned} \tag{44.4}
$$

(Here it is useful to recall that setting 𝜇 = 0 is the maximizing choice for the government’s one-period return function)
The first line tells the consequences of confirming the representative agent’s expectations by following the plan, while the
second line tells the consequences of disappointing the representative agent’s expectations by deviating from the plan.
A consequence of the inequality stated in the definition is that a self-enforcing plan is credible.
Self-enforcing plans can be used to construct other credible plans, including ones with better values.
Thus, where $\vec v^A$ is the value associated with a self-enforcing plan $\vec\mu^A$, a sufficient condition for another plan $\vec\mu$ associated with inflation $\vec\theta$ and value $\vec v$ to be credible is that

$$
\begin{aligned}
v_j &= s(\theta_j, \mu_j) + \beta v_{j+1} \\
&\geq s(\theta_j, 0) + \beta v_0^A \quad \forall j \geq 0
\end{aligned} \tag{44.5}
$$

For this condition to be satisfied it is necessary and sufficient that

$$
s(\theta_j, 0) - s(\theta_j, \mu_j) < \beta \left( v_{j+1} - v_0^A \right)
$$

The left side of the above inequality is the government’s gain from deviating from the plan, while the right side is the
government’s loss from deviating from the plan.
A government never wants to deviate from a credible plan.
Abreu taught us that a key step in constructing a credible plan is first constructing a self-enforcing plan that has a low time 0 value.
The idea is to use the self-enforcing plan as a continuation plan whenever the government's choice at time 𝑡 fails to confirm private agents' expectations.
We shall use a construction featured in Abreu ([Abreu, 1988]) to construct a self-enforcing plan with low time 0 value.


44.4.2 Abreu’s Carrot-Stick Plan

[Abreu, 1988] invented a way to create a self-enforcing plan with a low initial value.
Imitating his idea, we can construct a self-enforcing plan 𝜇⃗ with a low time 0 value to the government by insisting that
future government decision makers set 𝜇𝑡 to a value yielding low one-period utilities to the household for a long time,
after which government decisions thereafter yield high one-period utilities.
• Low one-period utilities early are a stick
• High one-period utilities later are a carrot
Consider a candidate plan $\vec\mu^A$ that sets $\mu_t^A = \bar\mu$ (a high positive number) for $T_A$ periods, and then reverts to the Ramsey plan.

Denote this sequence by $\{\mu_t^A\}_{t=0}^\infty$.

The sequence of inflation rates implied by this plan, $\{\theta_t^A\}_{t=0}^\infty$, can be calculated using:

$$
\theta_t^A = \frac{1}{1+\alpha} \sum_{j=0}^{\infty} \left( \frac{\alpha}{1+\alpha} \right)^j \mu_{t+j}^A
$$

The value of $\{\theta_t^A, \mu_t^A\}_{t=0}^\infty$ at time 0 is

$$
v_0^A = \sum_{t=0}^{T_A - 1} \beta^t s(\theta_t^A, \mu_t^A) + \beta^{T_A} J(\theta_0^R)
$$

For an appropriate 𝑇𝐴 , this plan can be verified to be self-enforcing and therefore credible.
From quantecon lecture Time Inconsistency of Ramsey Plans, we’ll again bring in the Python class ChangLQ that constructs
equilibria under timing protocols studied in that lecture.

class ChangLQ:
"""
Class to solve LQ Chang model
"""
def __init__(self, β, c, α=1, u0=1, u1=0.5, u2=3, T=1000, θ_n=200):
# Record parameters
self.α, self.u0, self.u1, self.u2 = α, u0, u1, u2
self.β, self.c, self.T, self.θ_n = β, c, T, θ_n

self.setup_LQ_matrices()
self.solve_LQ_problem()
self.compute_policy_functions()
self.simulate_ramsey_plan()
self.compute_θ_range()
self.compute_value_and_policy()

def setup_LQ_matrices(self):
# LQ Matrices
self.R = -np.array([[self.u0, -self.u1 * self.α / 2],
[-self.u1 * self.α / 2,
-self.u2 * self.α**2 / 2]])
self.Q = -np.array([[-self.c / 2]])
self.A = np.array([[1, 0], [0, (1 + self.α) / self.α]])
self.B = np.array([[0], [-1 / self.α]])

def solve_LQ_problem(self):
# Solve LQ Problem (Subproblem 1)
lq = LQ(self.Q, self.R, self.A, self.B, beta=self.β)
self.P, self.F, self.d = lq.stationary_values()

# Compute g0, g1, and g2 (41.16)


self.g0, self.g1, self.g2 = [-self.P[0, 0],
-2 * self.P[1, 0], -self.P[1, 1]]

# Compute b0 and b1 (41.17)


[[self.b0, self.b1]] = self.F

# Compute d0 and d1 (41.18)


self.cl_mat = (self.A - self.B @ self.F) # Closed loop matrix
[[self.d0, self.d1]] = self.cl_mat[1:]

# Solve Subproblem 2
self.θ_R = -self.P[0, 1] / self.P[1, 1]

# Find the bliss level of θ


self.θ_B = -self.u1 / (self.u2 * self.α)

def compute_policy_functions(self):
# Solve the Markov Perfect Equilibrium
self.μ_MPE = -self.u1 / ((1 + self.α) / self.α * self.c
+ self.α / (1 + self.α)
* self.u2 + self.α**2
/ (1 + self.α) * self.u2)
self.θ_MPE = self.μ_MPE
self.μ_CR = -self.α * self.u1 / (self.u2 * self.α**2 + self.c)
self.θ_CR = self.μ_CR

# Calculate value under MPE and CR economy


self.J_θ = lambda θ_array: - np.array([1, θ_array]) \
@ self.P @ np.array([1, θ_array]).T
self.V_θ = lambda θ: (self.u0 + self.u1 * (-self.α * θ)
- self.u2 / 2 * (-self.α * θ)**2
- self.c / 2 * θ**2) / (1 - self.β)

self.J_MPE = self.V_θ(self.μ_MPE)
self.J_CR = self.V_θ(self.μ_CR)

def simulate_ramsey_plan(self):
# Simulate Ramsey plan for large number of periods
θ_series = np.vstack((np.ones((1, self.T)), np.zeros((1, self.T))))
μ_series = np.zeros(self.T)
J_series = np.zeros(self.T)
θ_series[1, 0] = self.θ_R
[μ_series[0]] = -self.F.dot(θ_series[:, 0])
J_series[0] = self.J_θ(θ_series[1, 0])

for i in range(1, self.T):


θ_series[:, i] = self.cl_mat @ θ_series[:, i-1]
[μ_series[i]] = -self.F @ θ_series[:, i]
J_series[i] = self.J_θ(θ_series[1, i])

self.J_series = J_series

self.μ_series = μ_series
self.θ_series = θ_series

def compute_θ_range(self):
# Find the range of θ in Ramsey plan
θ_LB = min(min(self.θ_series[1, :]), self.θ_B)
θ_UB = max(max(self.θ_series[1, :]), self.θ_MPE)
θ_range = θ_UB - θ_LB
self.θ_LB = θ_LB - 0.05 * θ_range
self.θ_UB = θ_UB + 0.05 * θ_range
self.θ_range = θ_range

def compute_value_and_policy(self):
# Create the θ_space
self.θ_space = np.linspace(self.θ_LB, self.θ_UB, 200)

# Find value function and policy functions over range of θ


self.J_space = np.array([self.J_θ(θ) for θ in self.θ_space])
self.μ_space = -self.F @ np.vstack((np.ones(200), self.θ_space))
x_prime = self.cl_mat @ np.vstack((np.ones(200), self.θ_space))
self.θ_prime = x_prime[1, :]
self.CR_space = np.array([self.V_θ(θ) for θ in self.θ_space])

self.μ_space = self.μ_space[0, :]

# Calculate J_range, J_LB, and J_UB


self.J_range = np.ptp(self.J_space)
self.J_LB = np.min(self.J_space) - 0.05 * self.J_range
self.J_UB = np.max(self.J_space) + 0.05 * self.J_range

Let’s create an instance of ChangLQ with the following parameters:

clq = ChangLQ(β=0.85, c=2)

44.4.3 Example of Self-Enforcing Plan

The following example implements an Abreu stick-and-carrot plan.


The government sets $\mu_t^A = 0.1$ for $t = 0, 1, \ldots, 9$ and then starts the Ramsey plan.

We have computed outcomes for this plan.

For this plan, we plot the $\theta^A$, $\mu^A$ sequences as well as the implied $v^A$ sequence.

Notice that because the government sets money supply growth high for 10 periods, inflation starts high.

Inflation gradually declines because people expect the government to lower the money growth rate after period 10.

From the 10th period onwards, the inflation rate $\theta_t^A$ associated with this Abreu plan starts the Ramsey plan from its beginning, i.e., $\theta_{t+10}^A = \theta_t^R \;\; \forall t \geq 0$.
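The code used to compute these outcomes is not reproduced in this extract. Below is a minimal sketch of how paths like these could be built from the ChangLQ instance clq created above; the truncation horizon N, the terminal-value approximation via clq.J_θ, and the helper s are our own illustrative choices, not the lecture's hidden code:

T_A, μ_bar, N = 10, 0.1, 40   # stick length, stick value, truncation horizon

# money growth: μ̄ for T_A periods, then the Ramsey μ-sequence from its start
μ_A = np.concatenate([np.full(T_A, μ_bar), clq.μ_series[:N - T_A]])

# inflation via θ_t = (1/(1+α)) Σ_j (α/(1+α))^j μ_{t+j}, truncated at horizon N
α = clq.α
θ_A = np.array([(1 / (1 + α)) *
                np.sum((α / (1 + α))**np.arange(N - t) * μ_A[t:])
                for t in range(N)])

# one-period return s(θ, μ) and continuation values v_t, computed backwards
def s(θ, μ):
    return (clq.u0 + clq.u1 * (-α * θ)
            - clq.u2 / 2 * (-α * θ)**2 - clq.c / 2 * μ**2)

v_A = np.zeros(N + 1)
v_A[N] = clq.J_θ(θ_A[-1])          # rough terminal continuation value
for t in reversed(range(N)):
    v_A[t] = s(θ_A[t], μ_A[t]) + clq.β * v_A[t + 1]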

[Figure: time paths of $\theta_t^A$, $\mu_t^A$, and $v_t^A$ under the Abreu stick-and-carrot plan]

To confirm that the plan $\vec\mu^A$ is self-enforcing, we plot an object that we call $V_t^{A,D}$, defined in the key inequality in the second line of equation (44.4) above.

$V_t^{A,D}$ is the value at $t$ of deviating from the self-enforcing plan $\vec\mu^A$ by setting $\mu_t = 0$ and then restarting the plan at $v_0^A$ at $t+1$:

$$
v_t^{A,D} = s(\theta_t^A, 0) + \beta v_0^A
$$
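The plotting code for this comparison is not reproduced in this extract. A minimal sketch, using the matplotlib import above and assuming that the arrays clq.V_A and clq.V_dev (the same ones used in the numerical check below) hold the sequences $v_t^A$ and $v_t^{A,D}$:

T_plot = 20
fig, ax = plt.subplots(figsize=(8, 5))
ax.plot(clq.V_A[:T_plot], '-o', label=r'$v_t^A$')
ax.plot(clq.V_dev[:T_plot], '-o', label=r'$v_t^{A,D}$')
ax.set(xlabel='$t$', title='Self-enforcing check for the Abreu plan')
ax.legend()
plt.show()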

In the above graph $v_t^A > v_t^{A,D}$, which confirms that $\vec\mu^A$ is a self-enforcing plan.
We can also verify the inequalities required for $\vec\mu^A$ to be self-enforcing numerically as follows

np.all(clq.V_A[0:20] > clq.V_dev[0:20])

True

Given that plan $\vec\mu^A$ is self-enforcing, we can check that the Ramsey plan $\vec\mu^R$ is credible by verifying that:

$$
v_t^R \geq s(\theta_t^R, 0) + \beta v_0^A, \quad \forall t \geq 0
$$

def check_ramsey(clq, T=1000):


# Make sure Ramsey plan is sustainable
R_dev = np.zeros(T)
for t in range(T):
R_dev[t] = (clq.u0 + clq.u1 * (-clq.θ_series[1, t])
- clq.u2 / 2 * (-clq.θ_series[1, t])**2) \
+ clq.β * clq.V_A[0]

return np.all(clq.J_series > R_dev)

check_ramsey(clq)

True

44.4.4 Recursive Representation of a Sustainable Plan

We can represent a sustainable plan recursively by taking the continuation value 𝑣𝑡 as a state variable.
We form the following 3-tuple of functions:

$$
\begin{aligned}
\hat\mu_t &= \nu_\mu(v_t) \\
\theta_t &= \nu_\theta(v_t) \\
v_{t+1} &= \nu_v(v_t, \mu_t)
\end{aligned} \tag{44.6}
$$

In addition to these equations, we need an initial value 𝑣0 to characterize a sustainable plan.


The first equation of (44.6) tells the recommended value of 𝜇𝑡̂ as a function of the promised value 𝑣𝑡 .
The second equation of (44.6) tells the inflation rate as a function of 𝑣𝑡 .
The third equation of (44.6) updates the continuation value in a way that depends on whether the government at 𝑡 confirms
the representative agent’s expectations by setting 𝜇𝑡 equal to the recommended value 𝜇𝑡̂ , or whether it disappoints those
expectations.


44.5 Whose Plan is It?

A credible government plan 𝜇⃗ plays multiple roles.


• It is a sequence of actions chosen by the government.
• It is a sequence of the representative agent’s forecasts of government actions.
Thus, 𝜇⃗ is both a government policy and a collection of the representative agent’s forecasts of government policy.
Does the government choose policy actions or does it simply confirm prior private sector forecasts of those actions?
An argument in favor of the government chooses interpretation comes from noting that the theory of credible plans builds
in a theory that the government each period chooses the action that it wants.
An argument in favor of the simply confirm interpretation is gathered from staring at the key inequality (44.5) that defines
a credible policy.
We have also computed credible plans for a government or sequence of governments that choose sequentially.
These include
• a self-enforcing plan that gives a low initial value 𝑣0 .
• a better plan – possibly one that attains values associated with a Ramsey plan – that is not self-enforcing.



CHAPTER

FORTYFIVE

OPTIMAL TAXATION WITH STATE-CONTINGENT DEBT

In addition to what’s in Anaconda, this lecture will need the following libraries:

!pip install --upgrade quantecon

45.1 Overview

This lecture describes a celebrated model of optimal fiscal policy by Robert E. Lucas, Jr., and Nancy Stokey [Lucas and
Stokey, 1983].
The model revisits classic issues about how to pay for a war.
Here a war means a more or less temporary surge in an exogenous government expenditure process.
The model features
• a government that must finance an exogenous stream of government expenditures with either
– a flat rate tax on labor, or
– purchases and sales from a full array of Arrow state-contingent securities
• a representative household that values consumption and leisure
• a linear production function mapping labor into a single good
• a Ramsey planner who at time 𝑡 = 0 chooses a plan for taxes and trades of Arrow securities for all 𝑡 ≥ 0
After first presenting the model in a space of sequences, we shall represent it recursively in terms of two Bellman equations
formulated along lines that we encountered in Dynamic Stackelberg models.
As in Dynamic Stackelberg models, to apply dynamic programming we shall define the state vector artfully.
In particular, we shall include forward-looking variables that summarize optimal responses of private agents to a Ramsey
plan.
See Optimal taxation for analysis within a linear-quadratic setting.
Let’s start with some standard imports:

import numpy as np
import matplotlib.pyplot as plt
from scipy.optimize import root
from quantecon import MarkovChain
from quantecon.optimize.nelder_mead import nelder_mead
from numba import njit, prange, float64
from numba.experimental import jitclass


45.2 A Competitive Equilibrium with Distorting Taxes

At time 𝑡 ≥ 0 a random variable 𝑠𝑡 belongs to a time-invariant set 𝑆 = [1, 2, … , 𝑆].


For 𝑡 ≥ 0, a history 𝑠𝑡 = [𝑠𝑡 , 𝑠𝑡−1 , … , 𝑠0 ] of an exogenous state 𝑠𝑡 has joint probability density 𝜋𝑡 (𝑠𝑡 ).
We begin by assuming that government purchases 𝑔𝑡 (𝑠𝑡 ) at time 𝑡 ≥ 0 depend on 𝑠𝑡 .
Let 𝑐𝑡 (𝑠𝑡 ), ℓ𝑡 (𝑠𝑡 ), and 𝑛𝑡 (𝑠𝑡 ) denote consumption, leisure, and labor supply, respectively, at history 𝑠𝑡 and date 𝑡.
A representative household is endowed with one unit of time that can be divided between leisure ℓ𝑡 and labor 𝑛𝑡 :

𝑛𝑡 (𝑠𝑡 ) + ℓ𝑡 (𝑠𝑡 ) = 1 (45.1)

Output equals 𝑛𝑡 (𝑠𝑡 ) and can be divided between 𝑐𝑡 (𝑠𝑡 ) and 𝑔𝑡 (𝑠𝑡 )

𝑐𝑡 (𝑠𝑡 ) + 𝑔𝑡 (𝑠𝑡 ) = 𝑛𝑡 (𝑠𝑡 ) (45.2)

A representative household's preferences over $\{c_t(s^t), \ell_t(s^t)\}_{t=0}^\infty$ are ordered by

$$
\sum_{t=0}^\infty \sum_{s^t} \beta^t \pi_t(s^t) u[c_t(s^t), \ell_t(s^t)] \tag{45.3}
$$

where the utility function 𝑢 is increasing, strictly concave, and three times continuously differentiable in both arguments.
The technology pins down a pre-tax wage rate to unity for all 𝑡, 𝑠𝑡 .
The government imposes a flat-rate tax 𝜏𝑡 (𝑠𝑡 ) on labor income at time 𝑡, history 𝑠𝑡 .
There are complete markets in one-period Arrow securities.
One unit of an Arrow security issued at time 𝑡 at history 𝑠𝑡 and promising to pay one unit of time 𝑡 + 1 consumption in
state 𝑠𝑡+1 costs 𝑝𝑡+1 (𝑠𝑡+1 |𝑠𝑡 ).
The government issues one-period Arrow securities each period.
The government has a sequence of budget constraints whose time 𝑡 ≥ 0 component is

$$
g_t(s^t) = \tau_t(s^t) n_t(s^t) + \sum_{s_{t+1}} p_{t+1}(s_{t+1}|s^t) b_{t+1}(s_{t+1}|s^t) - b_t(s_t|s^{t-1}) \tag{45.4}
$$

where
• 𝑝𝑡+1 (𝑠𝑡+1 |𝑠𝑡 ) is a competitive equilibrium price of one unit of consumption at date 𝑡 + 1 in state 𝑠𝑡+1 at date 𝑡
and history 𝑠𝑡 .
• 𝑏𝑡 (𝑠𝑡 |𝑠𝑡−1 ) is government debt falling due at time 𝑡, history 𝑠𝑡 .
Government debt 𝑏0 (𝑠0 ) is an exogenous initial condition.
The representative household has a sequence of budget constraints whose time 𝑡 ≥ 0 component is

$$
c_t(s^t) + \sum_{s_{t+1}} p_{t+1}(s_{t+1}|s^t) b_{t+1}(s_{t+1}|s^t) = \left[1 - \tau_t(s^t)\right] n_t(s^t) + b_t(s_t|s^{t-1}) \quad \forall t \geq 0 \tag{45.5}
$$

A government policy is an exogenous sequence $\{g(s_t)\}_{t=0}^\infty$, a tax rate sequence $\{\tau_t(s^t)\}_{t=0}^\infty$, and a government debt sequence $\{b_{t+1}(s^{t+1})\}_{t=0}^\infty$.

A feasible allocation is a consumption-labor supply plan $\{c_t(s^t), n_t(s^t)\}_{t=0}^\infty$ that satisfies (45.2) at all $t, s^t$.

A price system is a sequence of Arrow security prices $\{p_{t+1}(s_{t+1}|s^t)\}_{t=0}^\infty$.

The household faces the price system as a price-taker and takes the government policy as given.


The household chooses $\{c_t(s^t), \ell_t(s^t)\}_{t=0}^\infty$ to maximize (45.3) subject to (45.5) and (45.1) for all $t, s^t$.

A competitive equilibrium with distorting taxes is a feasible allocation, a price system, and a government policy such
that
• Given the price system and the government policy, the allocation solves the household’s optimization problem.
• Given the allocation, government policy, and price system, the government’s budget constraint is satisfied for all
𝑡, 𝑠𝑡 .

Note: There are many competitive equilibria with distorting taxes.

They are indexed by different government policies.


The Ramsey problem or optimal taxation problem is to choose a competitive equilibrium with distorting taxes that
maximizes (45.3).

45.2.1 Arrow-Debreu Version of Price System

We find it convenient sometimes to work with the Arrow-Debreu price system that is implied by a sequence of Arrow
securities prices.
Let 𝑞𝑡0 (𝑠𝑡 ) be the price at time 0, measured in time 0 consumption goods, of one unit of consumption at time 𝑡, history
𝑠𝑡 .
The following recursion relates Arrow-Debreu prices $\{q_t^0(s^t)\}_{t=0}^\infty$ to Arrow securities prices $\{p_{t+1}(s_{t+1}|s^t)\}_{t=0}^\infty$

$$
q_{t+1}^0(s^{t+1}) = p_{t+1}(s_{t+1}|s^t)\, q_t^0(s^t) \quad \text{s.t.} \quad q_0^0(s^0) = 1 \tag{45.6}
$$

Arrow-Debreu prices are useful when we want to compress a sequence of budget constraints into a single intertemporal
budget constraint, as we shall find it convenient to do below.
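For intuition, here is a tiny numerical illustration of the recursion (45.6) along a single history; the one-period Arrow prices below are made-up numbers:

import numpy as np

p = np.array([0.95, 0.97, 0.94])            # p_{t+1}(s_{t+1} | s^t) along one history
q = np.concatenate(([1.0], np.cumprod(p)))  # q^0_0 = 1, q^0_{t+1} = p_{t+1} q^0_t
print(q)                                    # time-0 prices of consumption at t = 0, 1, 2, 3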

45.2.2 Primal Approach

We apply a popular approach to solving a Ramsey problem, called the primal approach.
The idea is to use first-order conditions for household optimization to eliminate taxes and prices in favor of quantities,
then pose an optimization problem cast entirely in terms of quantities.
After Ramsey quantities have been found, taxes and prices can then be unwound from the allocation.
The primal approach uses four steps:
1. Obtain first-order conditions of the household's problem and solve them for $\{q_t^0(s^t), \tau_t(s^t)\}_{t=0}^\infty$ as functions of the allocation $\{c_t(s^t), n_t(s^t)\}_{t=0}^\infty$.
2. Substitute these expressions for taxes and prices in terms of the allocation into the household’s present-value budget
constraint.
• This intertemporal constraint involves only the allocation and is regarded as an implementability constraint.
3. Find the allocation that maximizes the utility of the representative household (45.3) subject to the feasibility constraints (45.1) and (45.2) and the implementability condition derived in step 2.
• This optimal allocation is called the Ramsey allocation.
4. Use the Ramsey allocation together with the formulas from step 1 to find taxes and prices.


45.2.3 The Implementability Constraint

By sequential substitution of one one-period budget constraint (45.5) into another, we can obtain the household’s present-
value budget constraint:
$$
\sum_{t=0}^\infty \sum_{s^t} q_t^0(s^t) c_t(s^t) = \sum_{t=0}^\infty \sum_{s^t} q_t^0(s^t)\left[1 - \tau_t(s^t)\right] n_t(s^t) + b_0 \tag{45.7}
$$

$\{q_t^0(s^t)\}_{t=1}^\infty$ can be interpreted as a time 0 Arrow-Debreu price system.

To approach the Ramsey problem, we study the household’s optimization problem.


First-order conditions for the household's problem for $\ell_t(s^t)$ and $b_t(s_{t+1}|s^t)$, respectively, imply

$$
(1 - \tau_t(s^t)) = \frac{u_l(s^t)}{u_c(s^t)} \tag{45.8}
$$

and

$$
p_{t+1}(s_{t+1}|s^t) = \beta \pi(s_{t+1}|s^t) \left( \frac{u_c(s^{t+1})}{u_c(s^t)} \right) \tag{45.9}
$$

where $\pi(s_{t+1}|s^t)$ is the probability distribution of $s_{t+1}$ conditional on history $s^t$.

Equation (45.9) implies that the Arrow-Debreu price system satisfies

$$
q_t^0(s^t) = \beta^t \pi_t(s^t) \frac{u_c(s^t)}{u_c(s^0)} \tag{45.10}
$$

(The stochastic process {𝑞𝑡0 (𝑠𝑡 )} is an instance of what finance economists call a stochastic discount factor process.)
Using the first-order conditions (45.8) and (45.9) to eliminate taxes and prices from (45.7), we derive the implementability
condition

$$
\sum_{t=0}^\infty \sum_{s^t} \beta^t \pi_t(s^t) \left[ u_c(s^t) c_t(s^t) - u_\ell(s^t) n_t(s^t) \right] - u_c(s^0) b_0 = 0 \tag{45.11}
$$

The Ramsey problem is to choose a feasible allocation that maximizes



$$
\sum_{t=0}^\infty \sum_{s^t} \beta^t \pi_t(s^t) u[c_t(s^t), 1 - n_t(s^t)] \tag{45.12}
$$

subject to (45.11).

45.2.4 Solution Details

First, define a “pseudo utility function”

𝑉 [𝑐𝑡 (𝑠𝑡 ), 𝑛𝑡 (𝑠𝑡 ), Φ] = 𝑢[𝑐𝑡 (𝑠𝑡 ), 1 − 𝑛𝑡 (𝑠𝑡 )] + Φ [𝑢𝑐 (𝑠𝑡 )𝑐𝑡 (𝑠𝑡 ) − 𝑢ℓ (𝑠𝑡 )𝑛𝑡 (𝑠𝑡 )] (45.13)

where Φ is a Lagrange multiplier on the implementability condition (45.11).


Next form the Lagrangian

$$
J = \sum_{t=0}^\infty \sum_{s^t} \beta^t \pi_t(s^t) \Bigl\{ V[c_t(s^t), n_t(s^t), \Phi] + \theta_t(s^t)\bigl[n_t(s^t) - c_t(s^t) - g_t(s^t)\bigr] \Bigr\} - \Phi u_c(0) b_0 \tag{45.14}
$$

where $\{\theta_t(s^t); \forall s^t\}_{t \geq 0}$ is a sequence of Lagrange multipliers on the feasibility conditions (45.2).


Given an initial government debt 𝑏0 , we want to maximize 𝐽 with respect to {𝑐𝑡 (𝑠𝑡 ), 𝑛𝑡 (𝑠𝑡 ); ∀𝑠𝑡 }𝑡≥0 and to minimize
with respect to Φ and with respect to {𝜃(𝑠𝑡 ); ∀𝑠𝑡 }𝑡≥0 .
The first-order conditions for the Ramsey problem for periods 𝑡 ≥ 1 and 𝑡 = 0, respectively, are

$$
\begin{aligned}
c_t(s^t)\colon &\; (1+\Phi) u_c(s^t) + \Phi \left[ u_{cc}(s^t) c_t(s^t) - u_{\ell c}(s^t) n_t(s^t) \right] - \theta_t(s^t) = 0, \quad t \geq 1 \\
n_t(s^t)\colon &\; -(1+\Phi) u_\ell(s^t) - \Phi \left[ u_{c\ell}(s^t) c_t(s^t) - u_{\ell\ell}(s^t) n_t(s^t) \right] + \theta_t(s^t) = 0, \quad t \geq 1
\end{aligned} \tag{45.15}
$$

and

$$
\begin{aligned}
c_0(s^0, b_0)\colon &\; (1+\Phi) u_c(s^0, b_0) + \Phi \left[ u_{cc}(s^0, b_0) c_0(s^0, b_0) - u_{\ell c}(s^0, b_0) n_0(s^0, b_0) \right] - \theta_0(s^0, b_0) \\
&\qquad - \Phi u_{cc}(s^0, b_0) b_0 = 0 \\
n_0(s^0, b_0)\colon &\; -(1+\Phi) u_\ell(s^0, b_0) - \Phi \left[ u_{c\ell}(s^0, b_0) c_0(s^0, b_0) - u_{\ell\ell}(s^0, b_0) n_0(s^0, b_0) \right] + \theta_0(s^0, b_0) \\
&\qquad + \Phi u_{c\ell}(s^0, b_0) b_0 = 0
\end{aligned} \tag{45.16}
$$

Please note how these first-order conditions differ between 𝑡 = 0 and 𝑡 ≥ 1.


It is instructive to use first-order conditions (45.15) for 𝑡 ≥ 1 to eliminate the multipliers 𝜃𝑡 (𝑠𝑡 ).
For convenience, we suppress the time subscript and the index 𝑠𝑡 and obtain

$$
\begin{aligned}
&(1+\Phi) u_c(c, 1-c-g) + \Phi \left[ c\, u_{cc}(c, 1-c-g) - (c+g) u_{\ell c}(c, 1-c-g) \right] \\
&\quad = (1+\Phi) u_\ell(c, 1-c-g) + \Phi \left[ c\, u_{c\ell}(c, 1-c-g) - (c+g) u_{\ell\ell}(c, 1-c-g) \right]
\end{aligned} \tag{45.17}
$$

where we have imposed conditions (45.1) and (45.2).


Equation (45.17) is one equation that can be solved to express the unknown 𝑐 as a function of the exogenous variable 𝑔
and the Lagrange multiplier Φ.
We also know that time 𝑡 = 0 quantities 𝑐0 and 𝑛0 satisfy

$$
\begin{aligned}
&(1+\Phi) u_c(c, 1-c-g) + \Phi \left[ c\, u_{cc}(c, 1-c-g) - (c+g) u_{\ell c}(c, 1-c-g) \right] \\
&\quad = (1+\Phi) u_\ell(c, 1-c-g) + \Phi \left[ c\, u_{c\ell}(c, 1-c-g) - (c+g) u_{\ell\ell}(c, 1-c-g) \right] + \Phi (u_{cc} - u_{c,\ell}) b_0
\end{aligned} \tag{45.18}
$$

Notice that a counterpart to 𝑏0 does not appear in (45.17), so 𝑐 does not directly depend on it for 𝑡 ≥ 1.
But things are different for time 𝑡 = 0.
An analogous argument for the 𝑡 = 0 equations (45.16) leads to one equation that can be solved for 𝑐0 as a function of
the pair (𝑔(𝑠0 ), 𝑏0 ) and the Lagrange multiplier Φ.
These outcomes mean that the following statement would be true even when government purchases are history-dependent
functions 𝑔𝑡 (𝑠𝑡 ) of the history of 𝑠𝑡 .
Proposition: If government purchases are equal after two histories 𝑠𝑡 and 𝑠𝜏̃ for 𝑡, 𝜏 ≥ 0, i.e., if

𝑔𝑡 (𝑠𝑡 ) = 𝑔𝜏 (𝑠𝜏̃ ) = 𝑔

then it follows from (45.17) that the Ramsey choices of consumption and leisure, (𝑐𝑡 (𝑠𝑡 ), ℓ𝑡 (𝑠𝑡 )) and (𝑐𝑗 (𝑠𝜏̃ ), ℓ𝑗 (𝑠𝜏̃ )),
are identical.
The proposition asserts that the optimal allocation is a function of the currently realized quantity of government purchases
𝑔 only and does not depend on the specific history that preceded that realization of 𝑔.


45.2.5 The Ramsey Allocation for a Given Multiplier

Temporarily take Φ as given.


We shall compute 𝑐0 (𝑠0 , 𝑏0 ) and 𝑛0 (𝑠0 , 𝑏0 ) from the first-order conditions (45.16).
Evidently, for 𝑡 ≥ 1, 𝑐 and 𝑛 depend on the time 𝑡 realization of 𝑔 only.
But for 𝑡 = 0, 𝑐 and 𝑛 depend on both 𝑔0 and the government’s initial debt 𝑏0 .
Thus, while 𝑏0 influences 𝑐0 and 𝑛0 , there appears no analogous variable 𝑏𝑡 that influences 𝑐𝑡 and 𝑛𝑡 for 𝑡 ≥ 1.
The absence of 𝑏𝑡 as a direct determinant of the Ramsey allocation for 𝑡 ≥ 1 and its presence for 𝑡 = 0 is a symptom of
the time-inconsistency of a Ramsey plan.
Of course, 𝑏0 affects the Ramsey allocation for 𝑡 ≥ 1 indirectly through its effect on Φ.
Φ has to take a value that assures that the household and the government’s budget constraints are both satisfied at a
candidate Ramsey allocation and price system associated with that Φ.

45.2.6 Further Specialization

At this point, it is useful to specialize the model in the following ways.


We assume that 𝑠 is governed by a finite state Markov chain with states 𝑠 ∈ [1, … , 𝑆] and transition matrix Π, where

Π(𝑠′ |𝑠) = Prob(𝑠𝑡+1 = 𝑠′ |𝑠𝑡 = 𝑠)

Also, assume that government purchases 𝑔 are an exact time-invariant function 𝑔(𝑠) of 𝑠.
We maintain these assumptions throughout the remainder of this lecture.

45.2.7 Determining the Lagrange Multiplier

We complete the Ramsey plan by computing the Lagrange multiplier Φ on the implementability constraint (45.11).
Government budget balance restricts Φ via the following line of reasoning.
The household’s first-order conditions imply

$$
(1 - \tau_t(s^t)) = \frac{u_l(s^t)}{u_c(s^t)} \tag{45.19}
$$

and the implied one-period Arrow securities prices

$$
p_{t+1}(s_{t+1}|s^t) = \beta \Pi(s_{t+1}|s_t) \frac{u_c(s^{t+1})}{u_c(s^t)} \tag{45.20}
$$

Substituting from (45.19), (45.20), and the feasibility condition (45.2) into the recursive version (45.5) of the household
budget constraint gives

$$
\begin{aligned}
u_c(s^t)\left[n_t(s^t) - g_t(s^t)\right] &+ \beta \sum_{s_{t+1}} \Pi(s_{t+1}|s_t) u_c(s^{t+1}) b_{t+1}(s_{t+1}|s^t) \\
&= u_l(s^t) n_t(s^t) + u_c(s^t) b_t(s_t|s^{t-1})
\end{aligned} \tag{45.21}
$$

Define $x_t(s^t) = u_c(s^t) b_t(s_t|s^{t-1})$.

Notice that $x_t(s^t)$ appears on the right side of (45.21) while $\beta$ times the conditional expectation of $x_{t+1}(s^{t+1})$ appears on the left side.


Hence the equation shares much of the structure of a simple asset pricing equation with 𝑥𝑡 being analogous to the price
of the asset at time 𝑡.
We learned earlier that for a Ramsey allocation $c_t(s^t)$, $n_t(s^t)$, and $b_t(s_t|s^{t-1})$, and therefore also $x_t(s^t)$, are each functions of $s_t$ only, being independent of the history $s^{t-1}$ for $t \geq 1$.
That means that we can express equation (45.21) as

$$
u_c(s)\left[n(s) - g(s)\right] + \beta \sum_{s'} \Pi(s'|s) x'(s') = u_l(s) n(s) + x(s) \tag{45.22}
$$

where 𝑠′ denotes a next period value of 𝑠 and 𝑥′ (𝑠′ ) denotes a next period value of 𝑥.
Given 𝑛(𝑠) for 𝑠 = 1, … , 𝑆, equation (45.22) is easy to solve for 𝑥(𝑠) for 𝑠 = 1, … , 𝑆.
If we let $\vec n, \vec g, \vec x$ denote $S \times 1$ vectors whose $i$th elements are the respective $n$, $g$, and $x$ values when $s = i$, and let $\Pi$ be the transition matrix for the Markov state $s$, then we can express (45.22) as the matrix equation

$$
\vec u_c (\vec n - \vec g) + \beta \Pi \vec x = \vec u_l \vec n + \vec x \tag{45.23}
$$

This is a system of $S$ linear equations in the $S \times 1$ vector $\vec x$, whose solution is

$$
\vec x = (I - \beta \Pi)^{-1} \left[ \vec u_c (\vec n - \vec g) - \vec u_l \vec n \right] \tag{45.24}
$$

In these equations, by $\vec u_c \vec n$, for example, we mean element-by-element multiplication of the two vectors.

After solving for $\vec x$, we can find $b(s_t|s^{t-1})$ in Markov state $s_t = s$ from $b(s) = \frac{x(s)}{u_c(s)}$, or the matrix equation

$$
\vec b = \frac{\vec x}{\vec u_c} \tag{45.25}
$$

where division here means an element-by-element division of the respective components of the 𝑆 × 1 vectors 𝑥⃗ and 𝑢⃗𝑐 .
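As a quick illustration of (45.24) and (45.25), here is a direct NumPy translation; the vectors $\vec u_c$, $\vec u_l$, $\vec n$, $\vec g$ and the transition matrix Π below are made-up numbers used only to show the linear algebra:

import numpy as np

β = 0.9
Π = np.array([[0.8, 0.2],
              [0.3, 0.7]])
u_c = np.array([1.2, 1.0])
u_l = np.array([0.9, 1.1])
n = np.array([0.55, 0.60])
g = np.array([0.1, 0.2])

# (45.24): x = (I - βΠ)^{-1} [u_c(n - g) - u_l n], with element-wise products
x = np.linalg.solve(np.eye(2) - β * Π, u_c * (n - g) - u_l * n)

# (45.25): b = x / u_c, element-wise
b = x / u_c
print(x, b)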
Here is a computational algorithm:
1. Start with a guess for the value for Φ, then use the first-order conditions and the feasibility conditions to compute
𝑐(𝑠𝑡 ), 𝑛(𝑠𝑡 ) for 𝑠 ∈ [1, … , 𝑆] and 𝑐0 (𝑠0 , 𝑏0 ) and 𝑛0 (𝑠0 , 𝑏0 ), given Φ.
• these are 2(𝑆 + 1) equations in 2(𝑆 + 1) unknowns.
2. Solve the 𝑆 equations (45.24) for the 𝑆 elements of 𝑥.⃗
• these depend on Φ.
3. Find a Φ that satisfies

$$
u_{c,0} b_0 = u_{c,0}(n_0 - g_0) - u_{l,0} n_0 + \beta \sum_{s=1}^{S} \Pi(s|s_0) x(s) \tag{45.26}
$$

by gradually raising Φ if the left side of (45.26) exceeds the right side and lowering Φ if the left side is less than
the right side.
4. After computing a Ramsey allocation, recover the flat tax rate on labor from (45.8) and the implied one-period
Arrow securities prices from (45.9).
In summary, when 𝑔𝑡 is a time-invariant function of a Markov state 𝑠𝑡 , a Ramsey plan can be constructed by solving
3𝑆 + 3 equations for 𝑆 components each of 𝑐,⃗ 𝑛,⃗ and 𝑥⃗ together with 𝑛0 , 𝑐0 , and Φ.


45.2.8 Time Inconsistency

Let $\{\tau_t(s^t)\}_{t=0}^\infty, \{b_{t+1}(s_{t+1}|s^t)\}_{t=0}^\infty$ be a time 0, state $s_0$ Ramsey plan.

Then $\{\tau_j(s^j)\}_{j=t}^\infty, \{b_{j+1}(s_{j+1}|s^j)\}_{j=t}^\infty$ is a time $t$, history $s^t$ continuation of a time 0, state $s_0$ Ramsey plan.

A time 𝑡, history 𝑠𝑡 Ramsey plan is a Ramsey plan that starts from initial conditions 𝑠𝑡 , 𝑏𝑡 (𝑠𝑡 |𝑠𝑡−1 ).
A time 𝑡, history 𝑠𝑡 continuation of a time 0, state 0 Ramsey plan is not a time 𝑡, history 𝑠𝑡 Ramsey plan.
This means that a Ramsey plan is not time consistent.
Another way to say the same thing is that a Ramsey plan is time inconsistent.
The reason is that a continuation Ramsey plan takes 𝑢𝑐𝑡 𝑏𝑡 (𝑠𝑡 |𝑠𝑡−1 ) as given, not 𝑏𝑡 (𝑠𝑡 |𝑠𝑡−1 ).
We shall discuss this more below.

45.2.9 Specification with CRRA Utility

In our calculations below and in a subsequent lecture based on an extension of the Lucas-Stokey model by Aiyagari, Marcet, Sargent, and Seppälä (2002) [Aiyagari et al., 2002], we shall modify the one-period utility function assumed above.
(We adopted the preceding utility specification because it was the one used in the original Lucas-Stokey paper [Lucas
and Stokey, 1983]. We shall soon revert to that specification in a subsequent section.)
We will modify their specification by instead assuming that the representative agent has utility function

$$
u(c, n) = \frac{c^{1-\sigma}}{1-\sigma} - \frac{n^{1+\gamma}}{1+\gamma}
$$
where 𝜎 > 0, 𝛾 > 0.
We continue to assume that

𝑐𝑡 + 𝑔 𝑡 = 𝑛 𝑡

We eliminate leisure from the model.


We also eliminate Lucas and Stokey’s restriction that ℓ𝑡 + 𝑛𝑡 ≤ 1.
We replace these two things with the assumption that labor 𝑛𝑡 ∈ [0, +∞].
With these adjustments, the analysis of Lucas and Stokey prevails once we make the following replacements

𝑢ℓ (𝑐, ℓ) ∼ −𝑢𝑛 (𝑐, 𝑛)


𝑢𝑐 (𝑐, ℓ) ∼ 𝑢𝑐 (𝑐, 𝑛)
𝑢ℓ,ℓ (𝑐, ℓ) ∼ 𝑢𝑛𝑛 (𝑐, 𝑛)
𝑢𝑐,𝑐 (𝑐, ℓ) ∼ 𝑢𝑐,𝑐 (𝑐, 𝑛)
𝑢𝑐,ℓ (𝑐, ℓ) ∼ 0

With these understandings, equations (45.17) and (45.18) simplify in the case of the CRRA utility function.
They become

(1 + Φ)[𝑢𝑐 (𝑐) + 𝑢𝑛 (𝑐 + 𝑔)] + Φ[𝑐𝑢𝑐𝑐 (𝑐) + (𝑐 + 𝑔)𝑢𝑛𝑛 (𝑐 + 𝑔)] = 0 (45.27)

and

(1 + Φ)[𝑢𝑐 (𝑐0 ) + 𝑢𝑛 (𝑐0 + 𝑔0 )] + Φ[𝑐0 𝑢𝑐𝑐 (𝑐0 ) + (𝑐0 + 𝑔0 )𝑢𝑛𝑛 (𝑐0 + 𝑔0 )] − Φ𝑢𝑐𝑐 (𝑐0 )𝑏0 = 0 (45.28)


In equation (45.27), it is understood that 𝑐 and 𝑔 are each functions of the Markov state 𝑠.
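To make (45.27) concrete, here is a minimal sketch that solves it for 𝑐 at a given Φ and 𝑔 under the CRRA utility above; the values of σ, γ, Φ, and 𝑔 are illustrative only:

import numpy as np
from scipy.optimize import brentq

σ, γ, Φ, g = 2.0, 2.0, 0.1, 0.15

def foc(c):
    # (45.27) with u_c = c^{-σ}, u_cc = -σ c^{-σ-1}, u_n = -n^γ, u_nn = -γ n^{γ-1}
    n = c + g
    return ((1 + Φ) * (c**(-σ) - n**γ)
            + Φ * (c * (-σ) * c**(-σ - 1) + n * (-γ) * n**(γ - 1)))

c = brentq(foc, 1e-6, 2.0)   # foc → +∞ as c → 0 and is negative for large c
print(c, c + g)              # Ramsey c and n in a state with this g, given Φ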
In addition, the time 𝑡 = 0 budget constraint is satisfied at 𝑐0 and initial government debt 𝑏0 :
$$
b_0 + g_0 = \tau_0 (c_0 + g_0) + \beta \sum_{s=1}^{S} \Pi(s|s_0) \frac{u_c(s)}{u_{c,0}} b_1(s) \tag{45.29}
$$

where $\tau_0$ is the time $t = 0$ tax rate.

In equation (45.29), it is understood that

$$
\tau_0 = 1 - \frac{u_{l,0}}{u_{c,0}}
$$

45.2.10 Sequence Implementation

The above steps are implemented in a class called SequentialLS

class SequentialLS:

'''
Class that takes a preference object, state transition matrix,
and state contingent government expenditure plan as inputs, and
solves the sequential allocation problem described above.
It returns optimal allocations about consumption and labor supply,
as well as the multiplier on the implementability constraint Φ.
'''

def __init__(self,
pref,
π=np.full((2, 2), 0.5),
g=np.array([0.1, 0.2])):

# Initialize from pref object attributes


self.β, self.π, self.g = pref.β, π, g
self.mc = MarkovChain(self.π)
self.S = len(π) # Number of states
self.pref = pref

# Find the first best allocation


self.find_first_best()

def FOC_first_best(self, c, g):


'''
First order conditions that characterize
the first best allocation.
'''

pref = self.pref
Uc, Ul = pref.Uc, pref.Ul

n = c + g
l = 1 - n

return Uc(c, l) - Ul(c, l)

def find_first_best(self):
'''
Find the first best allocation
'''
S, g = self.S, self.g

res = root(self.FOC_first_best, np.full(S, 0.5), args=(g,))

if (res.fun > 1e-10).any():


raise Exception('Could not find first best')

self.cFB = res.x
self.nFB = self.cFB + g

def FOC_time1(self, c, Φ, g):


'''
First order conditions that characterize
optimal time 1 allocation problems.
'''

pref = self.pref
Uc, Ucc, Ul, Ull, Ulc = pref.Uc, pref.Ucc, pref.Ul, pref.Ull, pref.Ulc

n = c + g
l = 1 - n

LHS = (1 + Φ) * Uc(c, l) + Φ * (c * Ucc(c, l) - n * Ulc(c, l))


RHS = (1 + Φ) * Ul(c, l) + Φ * (c * Ulc(c, l) - n * Ull(c, l))

diff = LHS - RHS

return diff

def time1_allocation(self, Φ):


'''
Computes optimal allocation for time t >= 1 for a given Φ
'''
pref = self.pref
S, g = self.S, self.g

# use the first best allocation as initial guess


res = root(self.FOC_time1, self.cFB, args=(Φ, g))

if (res.fun > 1e-10).any():


raise Exception('Could not find LS allocation.')

c = res.x
n = c + g
l = 1 - n

# Compute x
I = pref.Uc(c, n) * c - pref.Ul(c, l) * n
x = np.linalg.solve(np.eye(S) - self.β * self.π, I)

return c, n, x

def FOC_time0(self, c0, Φ, g0, b0):

'''
First order conditions that characterize
time 0 allocation problem.
'''

pref = self.pref
Ucc, Ulc = pref.Ucc, pref.Ulc

n0 = c0 + g0
l0 = 1 - n0

diff = self.FOC_time1(c0, Φ, g0)


diff -= Φ * (Ucc(c0, l0) - Ulc(c0, l0)) * b0

return diff

def implementability(self, Φ, b0, s0, cn0_arr):


'''
Compute the differences between the RHS and LHS
of the implementability constraint given Φ,
initial debt, and initial state.
'''

pref, π, g, β = self.pref, self.π, self.g, self.β


Uc, Ul = pref.Uc, pref.Ul
g0 = self.g[s0]

c, n, x = self.time1_allocation(Φ)

res = root(self.FOC_time0, cn0_arr[0], args=(Φ, g0, b0))


c0 = res.x
n0 = c0 + g0
l0 = 1 - n0

cn0_arr[:] = c0.item(), n0.item()

LHS = Uc(c0, l0) * b0


RHS = Uc(c0, l0) * c0 - Ul(c0, l0) * n0 + β * π[s0] @ x

return RHS - LHS

def time0_allocation(self, b0, s0):


'''
Finds the optimal time 0 allocation given
initial government debt b0 and state s0
'''

# use the first best allocation as initial guess


cn0_arr = np.array([self.cFB[s0], self.nFB[s0]])

res = root(self.implementability, 0., args=(b0, s0, cn0_arr))

if (res.fun > 1e-10).any():


raise Exception('Could not find time 0 LS allocation.')

Φ = res.x[0]

c0, n0 = cn0_arr

return Φ, c0, n0

def τ(self, c, n):


'''
Computes τ given c, n
'''
pref = self.pref
Uc, Ul = pref.Uc, pref.Ul

return 1 - Ul(c, 1-n) / Uc(c, 1-n)

def simulate(self, b0, s0, T, sHist=None):


'''
Simulates planners policies for T periods
'''
pref, π, β = self.pref, self.π, self.β
Uc = pref.Uc

if sHist is None:
sHist = self.mc.simulate(T, s0)

cHist, nHist, Bhist, τHist, ΦHist = np.empty((5, T))


RHist = np.empty(T-1)

# Time 0
Φ, cHist[0], nHist[0] = self.time0_allocation(b0, s0)
τHist[0] = self.τ(cHist[0], nHist[0])
Bhist[0] = b0
ΦHist[0] = Φ

# Time 1 onward
for t in range(1, T):
c, n, x = self.time1_allocation(Φ)
τ = self.τ(c, n)
u_c = Uc(c, 1-n)
s = sHist[t]
Eu_c = π[sHist[t-1]] @ u_c
cHist[t], nHist[t], Bhist[t], τHist[t] = c[s], n[s], x[s] / u_c[s], τ[s]
RHist[t-1] = Uc(cHist[t-1], 1-nHist[t-1]) / (β * Eu_c)
ΦHist[t] = Φ

gHist = self.g[sHist]
yHist = nHist

return [cHist, nHist, Bhist, τHist, gHist, yHist, sHist, ΦHist, RHist]


45.3 Recursive Formulation of the Ramsey Problem

We now temporarily revert to Lucas and Stokey’s specification.


We start by noting that 𝑥𝑡 (𝑠𝑡 ) = 𝑢𝑐 (𝑠𝑡 )𝑏𝑡 (𝑠𝑡 |𝑠𝑡−1 ) in equation (45.21) appears to be a purely “forward-looking” variable.
But 𝑥𝑡 (𝑠𝑡 ) is a natural candidate for a state variable in a recursive formulation of the Ramsey problem, one that records
history-dependence and so is backward-looking.

45.3.1 Intertemporal Delegation

To express a Ramsey plan recursively, we imagine that a time 0 Ramsey planner is followed by a sequence of continuation
Ramsey planners at times 𝑡 = 1, 2, ….
A “continuation Ramsey planner” at time 𝑡 ≥ 1 has a different objective function and faces different constraints and state
variables than does the Ramsey planner at time 𝑡 = 0.
A key step in representing a Ramsey plan recursively is to regard the marginal utility scaled government debts 𝑥𝑡 (𝑠𝑡 ) =
𝑢𝑐 (𝑠𝑡 )𝑏𝑡 (𝑠𝑡 |𝑠𝑡−1 ) as predetermined quantities that continuation Ramsey planners at times 𝑡 ≥ 1 are obligated to attain.
Continuation Ramsey planners do this by choosing continuation policies that induce the representative household to make
choices that imply that 𝑢𝑐 (𝑠𝑡 )𝑏𝑡 (𝑠𝑡 |𝑠𝑡−1 ) = 𝑥𝑡 (𝑠𝑡 ).
A time 𝑡 ≥ 1 continuation Ramsey planner faces 𝑥𝑡 , 𝑠𝑡 as state variables.
A time 𝑡 ≥ 1 continuation Ramsey planner delivers 𝑥𝑡 by choosing a suitable 𝑛𝑡 , 𝑐𝑡 pair and a list of 𝑠𝑡+1 -contingent
continuation quantities 𝑥𝑡+1 to bequeath to a time 𝑡 + 1 continuation Ramsey planner.
While a time 𝑡 ≥ 1 continuation Ramsey planner faces 𝑥𝑡 , 𝑠𝑡 as state variables, the time 0 Ramsey planner faces 𝑏0 , not
𝑥0 , as a state variable.
Furthermore, the Ramsey planner cares about (𝑐0 (𝑠0 ), ℓ0 (𝑠0 )), while continuation Ramsey planners do not.
The time 0 Ramsey planner hands a state-contingent function that makes 𝑥1 a function of 𝑠1 to a time 1, state 𝑠1 continuation Ramsey planner.
These lines of delegated authorities and responsibilities across time express the continuation Ramsey planners’ obligations
to implement their parts of an original Ramsey plan that had been designed once-and-for-all at time 0.

45.3.2 Two Bellman Equations

After 𝑠𝑡 has been realized at time 𝑡 ≥ 1, the state variables confronting the time 𝑡 continuation Ramsey planner are
(𝑥𝑡 , 𝑠𝑡 ).
• Let 𝑉 (𝑥, 𝑠) be the value of a continuation Ramsey plan at 𝑥𝑡 = 𝑥, 𝑠𝑡 = 𝑠 for 𝑡 ≥ 1.
• Let 𝑊 (𝑏, 𝑠) be the value of a Ramsey plan at time 0 at 𝑏0 = 𝑏 and 𝑠0 = 𝑠.
We work backward by preparing a Bellman equation for 𝑉 (𝑥, 𝑠) first, then a Bellman equation for 𝑊 (𝑏, 𝑠).


45.3.3 The Continuation Ramsey Problem

The Bellman equation for a time 𝑡 ≥ 1 continuation Ramsey planner is

$$
V(x, s) = \max_{n, \{x'(s')\}} u(n - g(s), 1 - n) + \beta \sum_{s' \in S} \Pi(s'|s) V(x', s') \tag{45.30}
$$

where maximization over $n$ and the $S$ elements of $x'(s')$ is subject to the single implementability constraint for $t \geq 1$:

$$
x = u_c (n - g(s)) - u_l n + \beta \sum_{s' \in S} \Pi(s'|s) x'(s') \tag{45.31}
$$

Here 𝑢𝑐 and 𝑢𝑙 are today’s values of the marginal utilities.


For each given value of 𝑥, 𝑠, the continuation Ramsey planner chooses 𝑛 and 𝑥′ (𝑠′ ) for each 𝑠′ ∈ 𝑆.
Associated with a value function 𝑉 (𝑥, 𝑠) that solves Bellman equation (45.30) are 𝑆 + 1 time-invariant policy functions

$$
\begin{aligned}
n_t &= f(x_t, s_t), \quad t \geq 1 \\
x_{t+1}(s_{t+1}) &= h(s_{t+1}; x_t, s_t), \quad s_{t+1} \in S, \; t \geq 1
\end{aligned} \tag{45.32}
$$

45.3.4 The Ramsey Problem

The Bellman equation of the time 0 Ramsey planner is

$$
W(b_0, s_0) = \max_{n_0, \{x'(s_1)\}} u(n_0 - g_0, 1 - n_0) + \beta \sum_{s_1 \in S} \Pi(s_1|s_0) V(x'(s_1), s_1) \tag{45.33}
$$

where maximization over $n_0$ and the $S$ elements of $x'(s_1)$ is subject to the time 0 implementability constraint

$$
u_{c,0} b_0 = u_{c,0}(n_0 - g_0) - u_{l,0} n_0 + \beta \sum_{s_1 \in S} \Pi(s_1|s_0) x'(s_1) \tag{45.34}
$$

coming from restriction (45.26).

Associated with a value function $W(b_0, s_0)$ that solves Bellman equation (45.33) are $S + 1$ time 0 policy functions

$$
\begin{aligned}
n_0 &= f_0(b_0, s_0) \\
x_1(s_1) &= h_0(s_1; b_0, s_0)
\end{aligned} \tag{45.35}
$$

Notice the appearance of state variables (𝑏0 , 𝑠0 ) in the time 0 policy functions for the Ramsey planner as compared to
(𝑥𝑡 , 𝑠𝑡 ) in the policy functions (45.32) for the time 𝑡 ≥ 1 continuation Ramsey planners.

The value function $V(x_t, s_t)$ of the time $t$ continuation Ramsey planner equals $E_t \sum_{\tau=t}^\infty \beta^{\tau-t} u(c_\tau, l_\tau)$, where consumption and leisure processes are evaluated along the original time 0 Ramsey plan.

45.3.5 First-Order Conditions

Attach a Lagrange multiplier Φ1 (𝑥, 𝑠) to constraint (45.31) and a Lagrange multiplier Φ0 to constraint (45.26).
Time 𝑡 ≥ 1: First-order conditions for the time 𝑡 ≥ 1 constrained maximization problem on the right side of the
continuation Ramsey planner’s Bellman equation (45.30) are

𝛽Π(𝑠′ |𝑠)𝑉𝑥 (𝑥′ , 𝑠′ ) − 𝛽Π(𝑠′ |𝑠)Φ1 = 0 (45.36)

for 𝑥′ (𝑠′ ) and

(1 + Φ1 )(𝑢𝑐 − 𝑢𝑙 ) + Φ1 [𝑛(𝑢𝑙𝑙 − 𝑢𝑙𝑐 ) + (𝑛 − 𝑔(𝑠))(𝑢𝑐𝑐 − 𝑢𝑙𝑐 )] = 0 (45.37)


for 𝑛.
Given Φ1 , equation (45.37) is one equation to be solved for 𝑛 as a function of 𝑠 (or of 𝑔(𝑠)).
Equation (45.36) implies 𝑉𝑥 (𝑥′ , 𝑠′ ) = Φ1 , while an envelope condition is 𝑉𝑥 (𝑥, 𝑠) = Φ1 , so it follows that

𝑉𝑥 (𝑥′ , 𝑠′ ) = 𝑉𝑥 (𝑥, 𝑠) = Φ1 (𝑥, 𝑠) (45.38)

Time 𝑡 = 0: For the time 0 problem on the right side of the Ramsey planner’s Bellman equation (45.33), first-order
conditions are

𝑉𝑥 (𝑥(𝑠1 ), 𝑠1 ) = Φ0 (45.39)

for 𝑥(𝑠1 ), 𝑠1 ∈ 𝑆, and

$$
\begin{aligned}
(1 + \Phi_0)(u_{c,0} - u_{n,0}) &+ \Phi_0 \left[ n_0 (u_{ll,0} - u_{lc,0}) + (n_0 - g(s_0))(u_{cc,0} - u_{cl,0}) \right] \\
&- \Phi_0 (u_{cc,0} - u_{cl,0}) b_0 = 0
\end{aligned} \tag{45.40}
$$

Notice similarities and differences between the first-order conditions for 𝑡 ≥ 1 and for 𝑡 = 0.
An additional term is present in (45.40) except in three special cases
• 𝑏0 = 0, or
• 𝑢𝑐 is constant (i.e., preferences are quasi-linear in consumption), or
• initial government assets are sufficiently large to finance all government purchases with interest earnings from those
assets so that Φ0 = 0
Except in these special cases, the allocation and the labor tax rate as functions of 𝑠𝑡 differ between dates 𝑡 = 0 and
subsequent dates 𝑡 ≥ 1.
Naturally, the first-order conditions in this recursive formulation of the Ramsey problem agree with the first-order conditions derived when we first formulated the Ramsey plan in the space of sequences.

45.3.6 State Variable Degeneracy

Equations (45.38) and (45.39) imply that Φ0 = Φ1 and that

𝑉𝑥 (𝑥𝑡 , 𝑠𝑡 ) = Φ0 (45.41)

for all 𝑡 ≥ 1.
When 𝑉 is concave in 𝑥, this implies state-variable degeneracy along a Ramsey plan in the sense that for 𝑡 ≥ 1, 𝑥𝑡 will
be a time-invariant function of 𝑠𝑡 .
Given Φ0 , this function mapping 𝑠𝑡 into 𝑥𝑡 can be expressed as a vector 𝑥⃗ that solves equation (45.34) for 𝑛 and 𝑐 as
functions of 𝑔 that are associated with Φ = Φ0 .

45.3.7 Manifestations of Time Inconsistency

While the marginal utility adjusted level of government debt 𝑥𝑡 is a key state variable for the continuation Ramsey planners
at 𝑡 ≥ 1, it is not a state variable at time 0.
The time 0 Ramsey planner faces 𝑏0 , not 𝑥0 = 𝑢𝑐,0 𝑏0 , as a state variable.
The discrepancy in state variables faced by the time 0 Ramsey planner and the time 𝑡 ≥ 1 continuation Ramsey planners
captures the differing obligations and incentives faced by the time 0 Ramsey planner and the time 𝑡 ≥ 1 continuation
Ramsey planners.


• The time 0 Ramsey planner is obligated to honor government debt 𝑏0 measured in time 0 consumption goods.
• The time 0 Ramsey planner can manipulate the value of government debt as measured by 𝑢𝑐,0 𝑏0 .
• In contrast, time 𝑡 ≥ 1 continuation Ramsey planners are obligated not to alter values of debt, as measured by
𝑢𝑐,𝑡 𝑏𝑡 , that they inherit from a preceding Ramsey planner or continuation Ramsey planner.
When government expenditures 𝑔𝑡 are a time-invariant function of a Markov state 𝑠𝑡 , a Ramsey plan and associated
Ramsey allocation feature marginal utilities of consumption 𝑢𝑐 (𝑠𝑡 ) that, given Φ, for 𝑡 ≥ 1 depend only on 𝑠𝑡 , but that
for 𝑡 = 0 depend on 𝑏0 as well.
This means that 𝑢𝑐 (𝑠𝑡 ) will be a time-invariant function of 𝑠𝑡 for 𝑡 ≥ 1, but except when 𝑏0 = 0, a different function for
𝑡 = 0.
This in turn means that prices of one-period Arrow securities 𝑝𝑡+1 (𝑠𝑡+1 |𝑠𝑡 ) = 𝑝(𝑠𝑡+1 |𝑠𝑡 ) will be the same time-invariant
functions of (𝑠𝑡+1 , 𝑠𝑡 ) for 𝑡 ≥ 1, but a different function 𝑝0 (𝑠1 |𝑠0 ) for 𝑡 = 0, except when 𝑏0 = 0.
The differences between these time 0 and time 𝑡 ≥ 1 objects reflect the Ramsey planner’s incentive to manipulate Arrow
security prices and, through them, the value of initial government debt 𝑏0 .

45.3.8 Recursive Implementation

The above steps are implemented in a class called RecursiveLS.

class RecursiveLS:

'''
Compute the planner's allocation by solving Bellman
equation.
'''

def __init__(self,
pref,
x_grid,
π=np.full((2, 2), 0.5),
g=np.array([0.1, 0.2])):

self.π, self.g, self.S = π, g, len(π)


self.pref, self.x_grid = pref, x_grid

bounds = np.empty((self.S, 2))

# bound for n
bounds[0] = 0, 1

# bound for xprime


for s in range(self.S-1):
bounds[s+1] = x_grid.min(), x_grid.max()

self.bounds = bounds

# initialization of time 1 value function


self.V = None

def time1_allocation(self, V=None, tol=1e-7):


'''
Solve the optimal time 1 allocation problem
by iterating Bellman value function.
'''

π, g, S = self.π, self.g, self.S


pref, x_grid, bounds = self.pref, self.x_grid, self.bounds

# initial guess of value function


if V is None:
V = np.zeros((len(x_grid), S))

# initial guess of policy


z = np.empty((len(x_grid), S, S+2))

# guess of n
z[:, :, 1] = 0.5

# guess of xprime
for s in range(S):
for i in range(S-1):
z[:, s, i+2] = x_grid

while True:
# value function iteration
V_new, z_new = T(V, z, pref, π, g, x_grid, bounds)

if np.max(np.abs(V - V_new)) < tol:


break

V = V_new
z = z_new

self.V = V_new
self.z1 = z_new
self.c1 = z_new[:, :, 0]
self.n1 = z_new[:, :, 1]
self.xprime1 = z_new[:, :, 2:]

return V_new, z_new

def time0_allocation(self, b0, s0):


'''
Find the optimal time 0 allocation by maximization.
'''

if self.V is None:
self.time1_allocation()

π, g, S = self.π, self.g, self.S


pref, x_grid, bounds = self.pref, self.x_grid, self.bounds
V, z1 = self.V, self.z1

x = 1. # x is arbitrary
res = nelder_mead(obj_V,
z1[0, s0, 1:-1],
args=(x, s0, V, pref, π, g, x_grid, b0),
bounds=bounds,
tol_f=1e-10)

n0, xprime0 = IC(res.x, x, s0, b0, pref, π, g)


c0 = n0 - g[s0]
z0 = np.array([c0, n0, *xprime0])

self.z0 = z0
self.n0 = n0
self.c0 = n0 - g[s0]
self.xprime0 = xprime0

return z0

def τ(self, c, n):


'''
Computes τ given c, n
'''
pref = self.pref
uc, ul = pref.Uc(c, 1-n), pref.Ul(c, 1-n)

return 1 - ul / uc

def simulate(self, b0, s0, T, sHist=None):


'''
Simulates Ramsey plan for T periods
'''
pref, π, x_grid = self.pref, self.π, self.x_grid
Uc = pref.Uc

if sHist is None:
sHist = self.mc.simulate(T, s0)

cHist, nHist, Bhist, τHist, xHist = np.empty((5, T))


RHist = np.zeros(T-1)

# Time 0
self.time0_allocation(b0, s0)
cHist[0], nHist[0], xHist[0] = self.c0, self.n0, self.xprime0[s0]
τHist[0] = self.τ(cHist[0], nHist[0])
Bhist[0] = b0

# Time 1 onward
for t in range(1, T):
s, x = sHist[t], xHist[t-1]
cHist[t] = np.interp(x, self.x_grid, self.c1[:, s])
nHist[t] = np.interp(x, self.x_grid, self.n1[:, s])

τHist[t] = self.τ(cHist[t], nHist[t])

Bhist[t] = x / Uc(cHist[t], 1-nHist[t])

c, n = np.empty((2, self.S))
for sprime in range(self.S):
c[sprime] = np.interp(x, x_grid, self.c1[:, sprime])
n[sprime] = np.interp(x, x_grid, self.n1[:, sprime])
Euc = π[sHist[t-1]] @ Uc(c, 1-n)
RHist[t-1] = Uc(cHist[t-1], 1-nHist[t-1]) / (self.pref.β * Euc)

# still inside the time loop: record the continuation x carried into t+1
if t < T-1:
    sprime = sHist[t+1]
    xHist[t] = np.interp(x, self.x_grid, self.xprime1[:, s, sprime])

# after the time loop
gHist = self.g[sHist]
yHist = nHist

return [cHist, nHist, Bhist, τHist, gHist, yHist, xHist, RHist]

# Helper functions

@njit(parallel=True)
def T(V, z, pref, π, g, x_grid, bounds):
'''
One step iteration of Bellman value function.
'''

S = len(π)

V_new = np.empty_like(V)
z_new = np.empty_like(z)

for i in prange(len(x_grid)):
x = x_grid[i]
for s in prange(S):
res = nelder_mead(obj_V,
z[i, s, 1:-1],
args=(x, s, V, pref, π, g, x_grid),
bounds=bounds,
tol_f=1e-10)

# optimal policy
n, xprime = IC(res.x, x, s, None, pref, π, g)
z_new[i, s, 0] = n - g[s] # c
z_new[i, s, 1] = n # n
z_new[i, s, 2:] = xprime # xprime

V_new[i, s] = res.fun

return V_new, z_new

@njit
def obj_V(z_sub, x, s, V, pref, π, g, x_grid, b0=None):
'''
The objective on the right hand side of the Bellman equation.
z_sub contains guesses of n and xprime[:-1].
'''

S = len(π)
β, U = pref.β, pref.U

# find (n, xprime) that satisfies implementability constraint


n, xprime = IC(z_sub, x, s, b0, pref, π, g)
c, l = n-g[s], 1-n

# if xprime[-1] violates bound, return large penalty
if (xprime[-1] < x_grid.min()):
return -1e9 * (1 + np.abs(xprime[-1] - x_grid.min()))
elif (xprime[-1] > x_grid.max()):
return -1e9 * (1 + np.abs(xprime[-1] - x_grid.max()))

# prepare Vprime vector


Vprime = np.empty(S)
for sprime in range(S):
Vprime[sprime] = np.interp(xprime[sprime], x_grid, V[:, sprime])

# compute the objective value


obj = U(c, l) + β * π[s] @ Vprime

return obj

@njit
def IC(z_sub, x, s, b0, pref, π, g):
'''
Find xprime[-1] that satisfies the implementability condition
given the guesses of n and xprime[:-1].
'''

β, Uc, Ul = pref.β, pref.Uc, pref.Ul

n = z_sub[0]
xprime = np.empty(len(π))
xprime[:-1] = z_sub[1:]

c, l = n-g[s], 1-n
uc = Uc(c, l)
ul = Ul(c, l)

if b0 is None:
diff = x
else:
diff = uc * b0

diff -= uc * (n - g[s]) - ul * n + β * π[s][:-1] @ xprime[:-1]


xprime[-1] = diff / (β * π[s][-1])

return n, xprime

45.4 Examples

We return to the setup with CRRA preferences described above.


45.4.1 Anticipated One-Period War

This example illustrates in a simple setting how a Ramsey planner manages risk.
Government expenditures are known for sure in all periods except one
• For 𝑡 < 3 and 𝑡 > 3 we assume that 𝑔𝑡 = 𝑔𝑙 = 0.1.
• At 𝑡 = 3 a war occurs with probability 0.5.
– If there is war, 𝑔3 = 𝑔ℎ = 0.2
– If there is no war 𝑔3 = 𝑔𝑙 = 0.1
We define the components of the state vector as the following six $(t, g)$ pairs: $(0, g_l), (1, g_l), (2, g_l), (3, g_l), (3, g_h), (t \geq 4, g_l)$.
We think of these 6 states as corresponding to 𝑠 = 1, 2, 3, 4, 5, 6.
The transition matrix is

$$
\Pi = \begin{pmatrix}
0 & 1 & 0 & 0 & 0 & 0 \\
0 & 0 & 1 & 0 & 0 & 0 \\
0 & 0 & 0 & 0.5 & 0.5 & 0 \\
0 & 0 & 0 & 0 & 0 & 1 \\
0 & 0 & 0 & 0 & 0 & 1 \\
0 & 0 & 0 & 0 & 0 & 1
\end{pmatrix}
$$

Government expenditures at each state are

$$
g = \begin{pmatrix} 0.1 \\ 0.1 \\ 0.1 \\ 0.1 \\ 0.2 \\ 0.1 \end{pmatrix}
$$
We assume that the representative agent has utility function

$$
u(c, n) = \frac{c^{1-\sigma}}{1-\sigma} - \frac{n^{1+\gamma}}{1+\gamma}
$$
and set 𝜎 = 2, 𝛾 = 2, and the discount factor 𝛽 = 0.9.

Note: For convenience in terms of matching our code, we have expressed utility as a function of 𝑛 rather than leisure 𝑙.

This utility function is implemented in the class CRRAutility.

crra_util_data = [
('β', float64),
('σ', float64),
('γ', float64)
]

@jitclass(crra_util_data)
class CRRAutility:

def __init__(self,
β=0.9,
σ=2,
γ=2):

self.β, self.σ, self.γ = β, σ, γ

# Utility function
def U(self, c, l):
# Note: `l` should not be interpreted as labor, it is an auxiliary
# variable used to conveniently match the code and the equations
# in the lecture
σ = self.σ
if σ == 1.:
U = np.log(c)
else:
U = (c**(1 - σ) - 1) / (1 - σ)
return U - (1-l) ** (1 + self.γ) / (1 + self.γ)

# Derivatives of utility function


def Uc(self, c, l):
return c ** (-self.σ)

def Ucc(self, c, l):


return -self.σ * c ** (-self.σ - 1)

def Ul(self, c, l):


return (1-l) ** self.γ

def Ull(self, c, l):


return -self.γ * (1-l) ** (self.γ - 1)

def Ucl(self, c, l):


return 0

def Ulc(self, c, l):


return 0

We set initial government debt 𝑏0 = 1.


We can now plot the Ramsey tax under both realizations of time 𝑡 = 3 government expenditures
• black when 𝑔3 = .1, and
• red when 𝑔3 = .2

π = np.array([[0, 1, 0, 0, 0, 0],
[0, 0, 1, 0, 0, 0],
[0, 0, 0, 0.5, 0.5, 0],
[0, 0, 0, 0, 0, 1],
[0, 0, 0, 0, 0, 1],
[0, 0, 0, 0, 0, 1]])

g = np.array([0.1, 0.1, 0.1, 0.2, 0.1, 0.1])


crra_pref = CRRAutility()

# Solve sequential problem


seq = SequentialLS(crra_pref, π=π, g=g)
sHist_h = np.array([0, 1, 2, 3, 5, 5, 5])
sHist_l = np.array([0, 1, 2, 4, 5, 5, 5])
sim_seq_h = seq.simulate(1, 0, 7, sHist_h)
sim_seq_l = seq.simulate(1, 0, 7, sHist_l)

fig, axes = plt.subplots(3, 2, figsize=(14, 10))


titles = ['Consumption', 'Labor Supply', 'Government Debt',
'Tax Rate', 'Government Spending', 'Output']

for ax, title, sim_l, sim_h in zip(axes.flatten(),


titles,
sim_seq_l[:6],
sim_seq_h[:6]):
ax.set(title=title)
ax.plot(sim_l, '-ok', sim_h, '-or', alpha=0.7)
ax.grid()

plt.tight_layout()
plt.show()

Tax smoothing
• the tax rate is constant for all 𝑡 ≥ 1
– For 𝑡 ≥ 1, 𝑡 ≠ 3, this is a consequence of 𝑔𝑡 being the same at all those dates.
– For 𝑡 = 3, it is a consequence of the special one-period utility function that we have assumed.
– Under other one-period utility functions, the time 𝑡 = 3 tax rate could be either higher or lower than for dates
𝑡 ≥ 1, 𝑡 ≠ 3.


• the tax rate is the same at 𝑡 = 3 for both the high 𝑔𝑡 outcome and the low 𝑔𝑡 outcome
We have assumed that at 𝑡 = 0, the government owes positive debt 𝑏0 .
It sets the time 𝑡 = 0 tax rate partly with an eye to reducing the value 𝑢𝑐,0 𝑏0 of 𝑏0 .
It does this by increasing consumption at time 𝑡 = 0 relative to consumption in later periods.
This has the consequence of lowering the time 𝑡 = 0 value of the gross interest rate for risk-free loans between periods 𝑡
and 𝑡 + 1, which equals
$$
R_t = \frac{u_{c,t}}{\beta \mathbb{E}_t \left[ u_{c,t+1} \right]}
$$

A tax policy that makes time 𝑡 = 0 consumption higher than time 𝑡 = 1 consumption evidently decreases the risk-free one-period interest rate, 𝑅𝑡 , at 𝑡 = 0.
Lowering the time 𝑡 = 0 risk-free interest rate makes time 𝑡 = 0 consumption goods cheaper relative to consumption
goods at later dates, thereby lowering the value 𝑢𝑐,0 𝑏0 of initial government debt 𝑏0 .
We see this in a figure below that plots the time path for the risk-free interest rate under both realizations of the time
𝑡 = 3 government expenditure shock.
The following plot illustrates how the government lowers the interest rate at time 0 by raising consumption

fix, ax = plt.subplots(figsize=(8, 5))


ax.set_title('Gross Interest Rate')
ax.plot(sim_seq_l[-1], '-ok', sim_seq_h[-1], '-or', alpha=0.7)
ax.grid()
plt.show()


45.4.2 Government Saving

At time 𝑡 = 0 the government evidently dissaves since 𝑏1 > 𝑏0 .


• This is a consequence of it setting a lower tax rate at 𝑡 = 0, implying more consumption at 𝑡 = 0.
At time 𝑡 = 1, the government evidently saves since it has set the tax rate sufficiently high to allow it to set 𝑏2 < 𝑏1 .
• Its motive for doing this is that it anticipates a likely war at 𝑡 = 3.
At time 𝑡 = 2 the government trades state-contingent Arrow securities to hedge against war at 𝑡 = 3.
• It purchases a security that pays off when 𝑔3 = 𝑔ℎ .
• It sells a security that pays off when 𝑔3 = 𝑔𝑙 .
• These purchases are designed in such a way that regardless of whether or not there is a war at 𝑡 = 3, the government
will begin period 𝑡 = 4 with the same government debt.
• The time 𝑡 = 4 debt level can be serviced with revenues from the constant tax rate set at times 𝑡 ≥ 1.
At times 𝑡 ≥ 4 the government rolls over its debt, knowing that the tax rate is set at a level that raises enough revenue to
pay for government purchases and interest payments on its debt.

45.4.3 Time 0 Manipulation of Interest Rate

We have seen that when 𝑏0 > 0, the Ramsey plan sets the time 𝑡 = 0 tax rate partly with an eye toward lowering a
risk-free interest rate for one-period loans between times 𝑡 = 0 and 𝑡 = 1.
By lowering this interest rate, the plan makes time 𝑡 = 0 goods cheap relative to consumption goods at later times.
By doing this, it lowers the value of time 𝑡 = 0 debt that it has inherited and must finance.

45.4.4 Time 0 and Time-Inconsistency

In the preceding example, the Ramsey tax rate at time 0 differs from its value at time 1.
To explore what is going on here, let’s simplify things by removing the possibility of war at time 𝑡 = 3.
The Ramsey problem then includes no randomness because 𝑔𝑡 = 𝑔𝑙 for all 𝑡.
The figure below plots the Ramsey tax rates and gross interest rates at time 𝑡 = 0 and time 𝑡 ≥ 1 as functions of the
initial government debt (using the sequential allocation solution and a CRRA utility function defined above)

tax_seq = SequentialLS(CRRAutility(), g=np.array([0.15]), π=np.ones((1, 1)))

n = 100
tax_policy = np.empty((n, 2))
interest_rate = np.empty((n, 2))
gov_debt = np.linspace(-1.5, 1, n)

for i in range(n):
tax_policy[i] = tax_seq.simulate(gov_debt[i], 0, 2)[3]
interest_rate[i] = tax_seq.simulate(gov_debt[i], 0, 3)[-1]

fig, axes = plt.subplots(2, 1, figsize=(10,8), sharex=True)


titles = ['Tax Rate', 'Gross Interest Rate']

for ax, title, plot in zip(axes, titles, [tax_policy, interest_rate]):


ax.plot(gov_debt, plot[:, 0], gov_debt, plot[:, 1], lw=2)
ax.set(title=title, xlim=(min(gov_debt), max(gov_debt)))
ax.grid()

axes[0].legend(('Time $t=0$', r'Time $t \geq 1$'))


axes[1].set_xlabel('Initial Government Debt')

fig.tight_layout()
plt.show()

The figure indicates that if the government enters with positive debt, it sets a tax rate at 𝑡 = 0 that is less than all later tax
rates.
By setting a lower tax rate at 𝑡 = 0, the government raises consumption, which reduces the value 𝑢𝑐,0 𝑏0 of its initial debt.
It does this by increasing 𝑐0 and thereby lowering 𝑢𝑐,0 .
Conversely, if 𝑏0 < 0, the Ramsey planner sets the tax rate at 𝑡 = 0 higher than in subsequent periods.
A side effect of lowering time 𝑡 = 0 consumption is that it raises the one-period interest rate at time 𝑡 = 0 above that of subsequent periods.
There are only two values of initial government debt at which the tax rate is constant for all 𝑡 ≥ 0.
The first is 𝑏0 = 0


• Here the government can’t use the 𝑡 = 0 tax rate to alter the value of the initial debt.
The second occurs when the government enters with sufficiently large assets that the Ramsey planner can achieve first
best and sets 𝜏𝑡 = 0 for all 𝑡.
It is only for these two values of initial government debt that the Ramsey plan is time-consistent.
Another way of saying this is that, except for these two values of initial government debt, a continuation of a Ramsey
plan is not a Ramsey plan.
To illustrate this, consider a Ramsey planner who starts with an initial government debt 𝑏1 associated with one of the
Ramsey plans computed above.
Call 𝜏1𝑅 the time 𝑡 = 0 tax rate chosen by the Ramsey planner confronting this value for initial government debt.
The figure below shows both the tax rate at time 1 chosen by our original Ramsey planner and what a new Ramsey planner
would choose for its time 𝑡 = 0 tax rate

tax_seq = SequentialLS(CRRAutility(), g=np.array([0.15]), π=np.ones((1, 1)))

n = 100
tax_policy = np.empty((n, 2))
τ_reset = np.empty((n, 2))
gov_debt = np.linspace(-1.5, 1, n)

for i in range(n):
tax_policy[i] = tax_seq.simulate(gov_debt[i], 0, 2)[3]
τ_reset[i] = tax_seq.simulate(gov_debt[i], 0, 1)[3]

fig, ax = plt.subplots(figsize=(10, 6))


ax.plot(gov_debt, tax_policy[:, 1], gov_debt, τ_reset, lw=2)
ax.set(xlabel='Initial Government Debt', title='Tax Rate',
xlim=(min(gov_debt), max(gov_debt)))
ax.legend((r'$\tau_1$', r'$\tau_1^R$'))
ax.grid()

fig.tight_layout()
plt.show()


The tax rates in the figure are equal for only two values of initial government debt.

45.4.5 Tax Smoothing and non-CRRA Preferences

The complete tax smoothing for 𝑡 ≥ 1 in the preceding example is a consequence of our having assumed CRRA preferences.
To see what is driving this outcome, we begin by noting that the Ramsey tax rate for 𝑡 ≥ 1 is a time-invariant function
𝜏 (Φ, 𝑔) of the Lagrange multiplier on the implementability constraint and government expenditures.
For CRRA preferences, we can exploit the relations 𝑈𝑐𝑐 𝑐 = −𝜎𝑈𝑐 and 𝑈𝑛𝑛 𝑛 = 𝛾𝑈𝑛 to derive

$$
\frac{\left(1 + (1 - \sigma)\Phi\right) U_c}{\left(1 + (1 - \gamma)\Phi\right) U_n} = 1
$$
from the first-order conditions.
This equation immediately implies that the tax rate is constant.
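As a quick numerical cross-check, here is a sketch that reuses the SequentialLS and CRRAutility classes defined earlier in this lecture (and reproduced in the next lecture, whose copy exposes the time0_allocation, time1_allocation, and τ methods used below); the initial debt level and the default two-state specification π = [[0.5, 0.5], [0.5, 0.5]], g = [0.1, 0.2] are assumptions of the sketch:

# Sketch: with CRRA preferences, the time t >= 1 tax rate is the same in
# both spending states, even though g differs across them.
crra_check = SequentialLS(CRRAutility())
Φ, c0, n0 = crra_check.time0_allocation(0.5, 0)   # b0 = 0.5, s0 = 0 (arbitrary)
c, n, x = crra_check.time1_allocation(Φ)
print(crra_check.τ(c, n))                         # two (nearly) identical entries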
For other preferences, the tax rate may not be constant.
For example, let the period utility function be

𝑢(𝑐, 𝑛) = log(𝑐) + 0.69 log(1 − 𝑛)

We will create a new class LogUtility to represent this utility function

log_util_data = [
('β', float64),
('ψ', float64)
]

@jitclass(log_util_data)
class LogUtility:

def __init__(self,
β=0.9,
ψ=0.69):

self.β, self.ψ = β, ψ

# Utility function
def U(self, c, l):
return np.log(c) + self.ψ * np.log(l)

# Derivatives of utility function


def Uc(self, c, l):
return 1 / c

def Ucc(self, c, l):


return -c**(-2)

def Ul(self, c, l):


return self.ψ / l

def Ull(self, c, l):


return -self.ψ / l**2

def Ucl(self, c, l):


return 0

def Ulc(self, c, l):


return 0

Also, suppose that 𝑔𝑡 follows a two-state IID process with equal probabilities attached to 𝑔𝑙 and 𝑔ℎ .
To compute the tax rate, we will use both the sequential and recursive approaches described above.
The figure below plots a sample path of the Ramsey tax rate

log_example = LogUtility()
# Solve sequential problem
seq_log = SequentialLS(log_example)

# Initialize grid for value function iteration and solve


x_grid = np.linspace(-3., 3., 200)

# Solve recursive problem


rec_log = RecursiveLS(log_example, x_grid)

T_length = 20
sHist = np.array([0, 0, 0, 0, 0,
0, 0, 0, 1, 1,
0, 0, 0, 1, 1,
1, 1, 1, 1, 0])

# Simulate
sim_seq = seq_log.simulate(0.5, 0, T_length, sHist)
sim_rec = rec_log.simulate(0.5, 0, T_length, sHist)

fig, axes = plt.subplots(3, 2, figsize=(14, 10))
titles = ['Consumption', 'Labor Supply', 'Government Debt',
'Tax Rate', 'Government Spending', 'Output']

for ax, title, sim_s, sim_b in zip(axes.flatten(), titles, sim_seq[:6], sim_rec[:6]):


ax.plot(sim_s, '-ob', sim_b, '-xk', alpha=0.7)
ax.set(title=title)
ax.grid()

axes.flatten()[0].legend(('Sequential', 'Recursive'))
fig.tight_layout()
plt.show()

As should be expected, the recursive and sequential solutions produce almost identical allocations.
Unlike outcomes with CRRA preferences, the tax rate is not perfectly smoothed.
Instead, the government raises the tax rate when 𝑔𝑡 is high.


45.4.6 Further Comments

A related lecture describes an extension of the Lucas-Stokey model by Aiyagari, Marcet, Sargent, and Seppälä (2002)
[Aiyagari et al., 2002].
In the AMSS economy, only a risk-free bond is traded.
That lecture compares the recursive representation of the Lucas-Stokey model presented in this lecture with one for an
AMSS economy.
By comparing these recursive formulations, we shall glean a sense in which the dimension of the state is lower in the
Lucas-Stokey model.
Accompanying that difference in dimension will be different dynamics of government debt.



CHAPTER

FORTYSIX

OPTIMAL TAXATION WITHOUT STATE-CONTINGENT DEBT

In addition to what’s in Anaconda, this lecture will need the following libraries:

!pip install --upgrade quantecon


!pip install interpolation

46.1 Overview

Let’s start with following imports:

import numpy as np
import matplotlib.pyplot as plt
from scipy.optimize import root
from interpolation.splines import eval_linear, UCGrid, nodes
from quantecon import optimize, MarkovChain
from numba import njit, prange, float64
from numba.experimental import jitclass

In an earlier lecture, we described a model of optimal taxation with state-contingent debt due to Robert E. Lucas, Jr., and
Nancy Stokey [Lucas and Stokey, 1983].
Aiyagari, Marcet, Sargent, and Seppälä [Aiyagari et al., 2002] (hereafter, AMSS) studied optimal taxation in a model
without state-contingent debt.
In this lecture, we
• describe assumptions and equilibrium concepts
• solve the model
• implement the model numerically
• conduct some policy experiments
• compare outcomes with those in a corresponding complete-markets model
We begin with an introduction to the model.


46.2 Competitive Equilibrium with Distorting Taxes

Many but not all features of the economy are identical to those of the Lucas-Stokey economy.
Let’s start with things that are identical.
For 𝑡 ≥ 0, a history of the state is represented by 𝑠𝑡 = [𝑠𝑡 , 𝑠𝑡−1 , … , 𝑠0 ].
Government purchases 𝑔(𝑠) are an exact time-invariant function of 𝑠.
Let 𝑐𝑡 (𝑠𝑡 ), ℓ𝑡 (𝑠𝑡 ), and 𝑛𝑡 (𝑠𝑡 ) denote consumption, leisure, and labor supply, respectively, at history 𝑠𝑡 at time 𝑡.
Each period a representative household is endowed with one unit of time that can be divided between leisure ℓ𝑡 and labor
𝑛𝑡 :

𝑛𝑡 (𝑠𝑡 ) + ℓ𝑡 (𝑠𝑡 ) = 1 (46.1)

Output equals 𝑛𝑡 (𝑠𝑡 ) and can be divided between consumption 𝑐𝑡 (𝑠𝑡 ) and 𝑔(𝑠𝑡 )

𝑐𝑡 (𝑠𝑡 ) + 𝑔(𝑠𝑡 ) = 𝑛𝑡 (𝑠𝑡 ) (46.2)

Output is not storable.


The technology pins down a pre-tax wage rate to unity for all 𝑡, 𝑠𝑡 .
A representative household's preferences over {𝑐𝑡 (𝑠𝑡 ), ℓ𝑡 (𝑠𝑡 )}∞𝑡=0 are ordered by

$$
\sum_{t=0}^{\infty} \sum_{s^t} \beta^t \pi_t(s^t)\, u\left[c_t(s^t), \ell_t(s^t)\right] \tag{46.3}
$$

where
• 𝜋𝑡 (𝑠𝑡 ) is a joint probability distribution over the sequence 𝑠𝑡 , and
• the utility function 𝑢 is increasing, strictly concave, and three times continuously differentiable in both arguments.
The government imposes a flat rate tax 𝜏𝑡 (𝑠𝑡 ) on labor income at time 𝑡, history 𝑠𝑡 .
Lucas and Stokey assumed that there are complete markets in one-period Arrow securities; also see smoothing models.
It is at this point that AMSS [Aiyagari et al., 2002] modify the Lucas and Stokey economy.
AMSS allow the government to issue only one-period risk-free debt each period.
Ruling out complete markets in this way is a step in the direction of making total tax collections behave more like that
prescribed in Robert Barro (1979) [Barro, 1979] than they do in Lucas and Stokey (1983) [Lucas and Stokey, 1983].

46.2.1 Risk-free One-Period Debt Only

In period 𝑡 and history 𝑠𝑡 , let


• 𝑏𝑡+1 (𝑠𝑡 ) be the amount of the time 𝑡 + 1 consumption good that at time 𝑡, history 𝑠𝑡 the government promised to
pay
• 𝑅𝑡 (𝑠𝑡 ) be the gross interest rate on risk-free one-period debt between periods 𝑡 and 𝑡 + 1
• 𝑇𝑡 (𝑠𝑡 ) be a non-negative lump-sum transfer to the representative household1
1 In an allocation that solves the Ramsey problem and that levies distorting taxes on labor, why would the government ever want to hand revenues

back to the private sector? It would not in an economy with state-contingent debt, since any such allocation could be improved by lowering distortionary
taxes rather than handing out lump-sum transfers. But, without state-contingent debt there can be circumstances when a government would like to make
lump-sum transfers to the private sector.


That 𝑏𝑡+1 (𝑠𝑡 ) is the same for all realizations of 𝑠𝑡+1 captures its risk-free character.
The market value at time 𝑡 of government debt maturing at time 𝑡 + 1 equals 𝑏𝑡+1 (𝑠𝑡 ) divided by 𝑅𝑡 (𝑠𝑡 ).
The government’s budget constraint in period 𝑡 at history 𝑠𝑡 is

$$
b_t(s^{t-1}) = \tau^n_t(s^t)\, n_t(s^t) - g(s_t) - T_t(s^t) + \frac{b_{t+1}(s^t)}{R_t(s^t)}
\equiv z_t(s^t) + \frac{b_{t+1}(s^t)}{R_t(s^t)}, \tag{46.4}
$$

where 𝑧𝑡 (𝑠𝑡 ) is the net-of-interest government surplus.


To rule out Ponzi schemes, we assume that the government is subject to a natural debt limit (to be discussed in a
forthcoming lecture).
The consumption Euler equation for a representative household able to trade only one-period risk-free debt with one-
period gross interest rate 𝑅𝑡 (𝑠𝑡 ) is

$$
\frac{1}{R_t(s^t)} = \sum_{s^{t+1}|s^t} \beta\, \pi_{t+1}(s^{t+1}|s^t)\, \frac{u_c(s^{t+1})}{u_c(s^t)}
$$

Substituting this expression into the government’s budget constraint (46.4) yields:

$$
b_t(s^{t-1}) = z_t(s^t) + \beta \sum_{s^{t+1}|s^t} \pi_{t+1}(s^{t+1}|s^t)\, \frac{u_c(s^{t+1})}{u_c(s^t)}\, b_{t+1}(s^t) \tag{46.5}
$$

Components of 𝑧𝑡 (𝑠𝑡 ) on the right side depend on 𝑠𝑡 , but the left side is required to depend only on 𝑠𝑡−1 .
This is what it means for one-period government debt to be risk-free.
Therefore, the right side of equation (46.5) also has to depend only on 𝑠𝑡−1 .
This requirement will give rise to measurability constraints on the Ramsey allocation to be discussed soon.
If we replace 𝑏𝑡+1 (𝑠𝑡 ) on the right side of equation (46.5) by the right side of next period’s budget constraint (associated
with a particular realization 𝑠𝑡 ) we get

$$
b_t(s^{t-1}) = z_t(s^t) + \sum_{s^{t+1}|s^t} \beta\, \pi_{t+1}(s^{t+1}|s^t)\, \frac{u_c(s^{t+1})}{u_c(s^t)}
\left[ z_{t+1}(s^{t+1}) + \frac{b_{t+2}(s^{t+1})}{R_{t+1}(s^{t+1})} \right]
$$

After making similar repeated substitutions for all future occurrences of government indebtedness, and by invoking a
natural debt limit, we arrive at:

$$
b_t(s^{t-1}) = \sum_{j=0}^{\infty} \sum_{s^{t+j}|s^t} \beta^j\, \pi_{t+j}(s^{t+j}|s^t)\, \frac{u_c(s^{t+j})}{u_c(s^t)}\, z_{t+j}(s^{t+j}) \tag{46.6}
$$

Notice how the conditioning sets in equation (46.6) differ: they are 𝑠𝑡−1 on the left side and 𝑠𝑡 on the right side.
Now let’s
• substitute the resource constraint into the net-of-interest government surplus, and
• use the household’s first-order condition 1 − 𝜏𝑡𝑛 (𝑠𝑡 ) = 𝑢ℓ (𝑠𝑡 )/𝑢𝑐 (𝑠𝑡 ) to eliminate the labor tax rate
so that we can express the net-of-interest government surplus 𝑧𝑡 (𝑠𝑡 ) as

$$
z_t(s^t) = \left[1 - \frac{u_\ell(s^t)}{u_c(s^t)}\right]\left[c_t(s^t) + g(s_t)\right] - g(s_t) - T_t(s^t)\,. \tag{46.7}
$$
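As a quick illustration (a sketch with made-up numbers, assuming the CRRA preferences used in the examples below so that 𝑢𝑐 = 𝑐−𝜎 and the marginal disutility of labor 𝑛𝛾 plays the role of 𝑢ℓ ), the net-of-interest surplus can be computed directly from an allocation:

σ, γ = 2, 2
c, g, T = 0.9, 0.1, 0.0                  # hypothetical consumption, spending, transfers
n = c + g                                # resource constraint (46.2)

u_c = c ** (-σ)
u_l = n ** γ

τ = 1 - u_l / u_c                        # labor tax implied by the household FOC
z = (1 - u_l / u_c) * (c + g) - g - T    # equation (46.7)
print(τ, z)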


If we substitute appropriate versions of the right side of (46.7) for 𝑧𝑡+𝑗 (𝑠𝑡+𝑗 ) into equation (46.6), we obtain a sequence
of implementability constraints on a Ramsey allocation in an AMSS economy.
Expression (46.6) at time 𝑡 = 0 and initial state 𝑠0 was also an implementability constraint on a Ramsey allocation in a
Lucas-Stokey economy:

$$
b_0(s_{-1}) = \mathbb{E}_0 \sum_{j=0}^{\infty} \beta^j\, \frac{u_c(s^j)}{u_c(s^0)}\, z_j(s^j) \tag{46.8}
$$

Indeed, it was the only implementability constraint there.


But now we also have a large number of additional implementability constraints

$$
b_t(s^{t-1}) = \mathbb{E}_t \sum_{j=0}^{\infty} \beta^j\, \frac{u_c(s^{t+j})}{u_c(s^t)}\, z_{t+j}(s^{t+j}) \tag{46.9}
$$

Equation (46.9) must hold for each 𝑠𝑡 for each 𝑡 ≥ 1.

46.2.2 Comparison with Lucas-Stokey Economy

The expression on the right side of (46.9) in the Lucas-Stokey (1983) economy would equal the present value of a
continuation stream of government net-of-interest surpluses evaluated at what would be competitive equilibrium Arrow-
Debreu prices at date 𝑡.
In the Lucas-Stokey economy, that present value is measurable with respect to 𝑠𝑡 .
In the AMSS economy, the restriction that government debt be risk-free imposes that that same present value must be
measurable with respect to 𝑠𝑡−1 .
In a language used in the literature on incomplete markets models, it can be said that the AMSS model requires that at
each (𝑡, 𝑠𝑡 ) what would be the present value of continuation government net-of-interest surpluses in the Lucas-Stokey
model must belong to the marketable subspace of the AMSS model.

46.2.3 Ramsey Problem Without State-contingent Debt

After we have substituted the resource constraint into the utility function, we can express the Ramsey problem as being
to choose an allocation that solves

$$
\max_{\{c_t(s^t),\, b_{t+1}(s^t)\}} \mathbb{E}_0 \sum_{t=0}^{\infty} \beta^t\, u\left(c_t(s^t),\, 1 - c_t(s^t) - g(s_t)\right)
$$

where the maximization is subject to



$$
\mathbb{E}_0 \sum_{j=0}^{\infty} \beta^j\, \frac{u_c(s^j)}{u_c(s^0)}\, z_j(s^j) \geq b_0(s_{-1}) \tag{46.10}
$$

and

$$
\mathbb{E}_t \sum_{j=0}^{\infty} \beta^j\, \frac{u_c(s^{t+j})}{u_c(s^t)}\, z_{t+j}(s^{t+j}) = b_t(s^{t-1}) \quad \forall\; t,\, s^t \tag{46.11}
$$

given 𝑏0 (𝑠−1 ).


Lagrangian Formulation

Let 𝛾0 (𝑠0 ) be a non-negative Lagrange multiplier on constraint (46.10).


As in the Lucas-Stokey economy, this multiplier is strictly positive when the government must resort to distortionary
taxation; otherwise it equals zero.
A consequence of the assumption that there are no markets in state-contingent securities and that a market exists only in a risk-free security is that we have to attach a stochastic process {𝛾𝑡 (𝑠𝑡 )}∞𝑡=1 of Lagrange multipliers to the implementability constraints (46.11).
Depending on how the constraints bind, these multipliers can be positive or negative:

𝛾𝑡 (𝑠𝑡 ) ≥ (≤) 0 if the constraint binds in the following direction

$$
\mathbb{E}_t \sum_{j=0}^{\infty} \beta^j\, \frac{u_c(s^{t+j})}{u_c(s^t)}\, z_{t+j}(s^{t+j}) \;\geq\; (\leq)\; b_t(s^{t-1})
$$

A negative multiplier 𝛾𝑡 (𝑠𝑡 ) < 0 means that if we could relax constraint (46.11), we would like to increase the beginning-
of-period indebtedness for that particular realization of history 𝑠𝑡 .
That would let us reduce the beginning-of-period indebtedness for some other history2 .
These features flow from the fact that the government cannot use state-contingent debt and therefore cannot allocate its
indebtedness efficiently across future states.

46.2.4 Some Calculations

It is helpful to apply two transformations to the Lagrangian.


Multiply constraint (46.10) by 𝑢𝑐 (𝑠0 ) and the constraints (46.11) by 𝛽 𝑡 𝑢𝑐 (𝑠𝑡 ).
Then a Lagrangian for the Ramsey problem can be represented as

$$
\begin{aligned}
J = {}& \mathbb{E}_0 \sum_{t=0}^{\infty} \beta^t \Bigl\{ u\bigl(c_t(s^t),\, 1 - c_t(s^t) - g(s_t)\bigr) \\
& \quad + \gamma_t(s^t)\Bigl[\mathbb{E}_t \sum_{j=0}^{\infty} \beta^j u_c(s^{t+j})\, z_{t+j}(s^{t+j}) - u_c(s^t)\, b_t(s^{t-1})\Bigr]\Bigr\} \\
= {}& \mathbb{E}_0 \sum_{t=0}^{\infty} \beta^t \Bigl\{ u\bigl(c_t(s^t),\, 1 - c_t(s^t) - g(s_t)\bigr) \\
& \quad + \Psi_t(s^t)\, u_c(s^t)\, z_t(s^t) - \gamma_t(s^t)\, u_c(s^t)\, b_t(s^{t-1})\Bigr\}
\end{aligned} \tag{46.12}
$$

where

Ψ𝑡 (𝑠𝑡 ) = Ψ𝑡−1 (𝑠𝑡−1 ) + 𝛾𝑡 (𝑠𝑡 ) and Ψ−1 (𝑠−1 ) = 0 (46.13)

In (46.12), the second equality uses the law of iterated expectations and Abel’s summation formula (also called summation
by parts, see this page).
First-order conditions with respect to 𝑐𝑡 (𝑠𝑡 ) can be expressed as

$$
\begin{aligned}
u_c(s^t) - u_\ell(s^t) &+ \Psi_t(s^t)\Bigl\{\bigl[u_{cc}(s^t) - u_{c\ell}(s^t)\bigr] z_t(s^t) + u_c(s^t)\, z_c(s^t)\Bigr\} \\
&- \gamma_t(s^t)\bigl[u_{cc}(s^t) - u_{c\ell}(s^t)\bigr] b_t(s^{t-1}) = 0
\end{aligned} \tag{46.14}
$$
2 From the first-order conditions for the Ramsey problem, there exists another realization 𝑠𝑡̃ with the same history up until the previous period, i.e., 𝑠𝑡−1̃ = 𝑠𝑡−1 , but where the multiplier on constraint (46.11) takes a positive value, so 𝛾𝑡 (𝑠𝑡̃ ) > 0.


and with respect to 𝑏𝑡 (𝑠𝑡 ) as


𝔼𝑡 [𝛾𝑡+1 (𝑠𝑡+1 ) 𝑢𝑐 (𝑠𝑡+1 )] = 0 (46.15)

If we substitute 𝑧𝑡 (𝑠𝑡 ) from (46.7) and its derivative 𝑧𝑐 (𝑠𝑡 ) into the first-order condition (46.14), we find two differences
from the corresponding condition for the optimal allocation in a Lucas-Stokey economy with state-contingent government
debt.
1. The term involving 𝑏𝑡 (𝑠𝑡−1 ) in the first-order condition (46.14) does not appear in the corresponding expression
for the Lucas-Stokey economy.
• This term reflects the constraint that beginning-of-period government indebtedness must be the same across
all realizations of next period’s state, a constraint that would not be present if government debt could be
state-contingent.
2. The Lagrange multiplier Ψ𝑡 (𝑠𝑡 ) in the first-order condition (46.14) may change over time in response to realizations
of the state, while the multiplier Φ in the Lucas-Stokey economy is time-invariant.
We need some code from an earlier lecture on optimal taxation with state-contingent debt, namely its sequential allocation implementation:

class SequentialLS:

'''
Class that takes a preference object, state transition matrix,
and state contingent government expenditure plan as inputs, and
solves the sequential allocation problem described above.
It returns optimal allocations about consumption and labor supply,
as well as the multiplier on the implementability constraint Φ.
'''

def __init__(self,
pref,
π=np.full((2, 2), 0.5),
g=np.array([0.1, 0.2])):

# Initialize from pref object attributes


self.β, self.π, self.g = pref.β, π, g
self.mc = MarkovChain(self.π)
self.S = len(π) # Number of states
self.pref = pref

# Find the first best allocation


self.find_first_best()

def FOC_first_best(self, c, g):


'''
First order conditions that characterize
the first best allocation.
'''

pref = self.pref
Uc, Ul = pref.Uc, pref.Ul

n = c + g
l = 1 - n

return Uc(c, l) - Ul(c, l)

def find_first_best(self):
'''
Find the first best allocation
'''
S, g = self.S, self.g

res = root(self.FOC_first_best, np.full(S, 0.5), args=(g,))

if (res.fun > 1e-10).any():


raise Exception('Could not find first best')

self.cFB = res.x
self.nFB = self.cFB + g

def FOC_time1(self, c, Φ, g):


'''
First order conditions that characterize
optimal time 1 allocation problems.
'''

pref = self.pref
Uc, Ucc, Ul, Ull, Ulc = pref.Uc, pref.Ucc, pref.Ul, pref.Ull, pref.Ulc

n = c + g
l = 1 - n

LHS = (1 + Φ) * Uc(c, l) + Φ * (c * Ucc(c, l) - n * Ulc(c, l))


RHS = (1 + Φ) * Ul(c, l) + Φ * (c * Ulc(c, l) - n * Ull(c, l))

diff = LHS - RHS

return diff

def time1_allocation(self, Φ):


'''
Computes optimal allocation for time t >= 1 for a given Φ
'''
pref = self.pref
S, g = self.S, self.g

# use the first best allocation as initial guess


res = root(self.FOC_time1, self.cFB, args=(Φ, g))

if (res.fun > 1e-10).any():


raise Exception('Could not find LS allocation.')

c = res.x
n = c + g
l = 1 - n

# Compute x
I = pref.Uc(c, n) * c - pref.Ul(c, l) * n
x = np.linalg.solve(np.eye(S) - self.β * self.π, I)

return c, n, x

def FOC_time0(self, c0, Φ, g0, b0):
'''
First order conditions that characterize
time 0 allocation problem.
'''

pref = self.pref
Ucc, Ulc = pref.Ucc, pref.Ulc

n0 = c0 + g0
l0 = 1 - n0

diff = self.FOC_time1(c0, Φ, g0)


diff -= Φ * (Ucc(c0, l0) - Ulc(c0, l0)) * b0

return diff

def implementability(self, Φ, b0, s0, cn0_arr):


'''
Compute the differences between the RHS and LHS
of the implementability constraint given Φ,
initial debt, and initial state.
'''

pref, π, g, β = self.pref, self.π, self.g, self.β


Uc, Ul = pref.Uc, pref.Ul
g0 = self.g[s0]

c, n, x = self.time1_allocation(Φ)

res = root(self.FOC_time0, cn0_arr[0], args=(Φ, g0, b0))


c0 = res.x
n0 = c0 + g0
l0 = 1 - n0

cn0_arr[:] = c0.item(), n0.item()

LHS = Uc(c0, l0) * b0


RHS = Uc(c0, l0) * c0 - Ul(c0, l0) * n0 + β * π[s0] @ x

return RHS - LHS

def time0_allocation(self, b0, s0):


'''
Finds the optimal time 0 allocation given
initial government debt b0 and state s0
'''

# use the first best allocation as initial guess


cn0_arr = np.array([self.cFB[s0], self.nFB[s0]])

res = root(self.implementability, 0., args=(b0, s0, cn0_arr))

if (res.fun > 1e-10).any():


raise Exception('Could not find time 0 LS allocation.')

Φ = res.x[0]
c0, n0 = cn0_arr

return Φ, c0, n0

def τ(self, c, n):


'''
Computes τ given c, n
'''
pref = self.pref
Uc, Ul = pref.Uc, pref.Ul

return 1 - Ul(c, 1-n) / Uc(c, 1-n)

def simulate(self, b0, s0, T, sHist=None):


'''
Simulates planners policies for T periods
'''
pref, π, β = self.pref, self.π, self.β
Uc = pref.Uc

if sHist is None:
sHist = self.mc.simulate(T, s0)

cHist, nHist, Bhist, τHist, ΦHist = np.empty((5, T))


RHist = np.empty(T-1)

# Time 0
Φ, cHist[0], nHist[0] = self.time0_allocation(b0, s0)
τHist[0] = self.τ(cHist[0], nHist[0])
Bhist[0] = b0
ΦHist[0] = Φ

# Time 1 onward
for t in range(1, T):
c, n, x = self.time1_allocation(Φ)
τ = self.τ(c, n)
u_c = Uc(c, 1-n)
s = sHist[t]
Eu_c = π[sHist[t-1]] @ u_c
cHist[t], nHist[t], Bhist[t], τHist[t] = c[s], n[s], x[s] / u_c[s], τ[s]
RHist[t-1] = Uc(cHist[t-1], 1-nHist[t-1]) / (β * Eu_c)
ΦHist[t] = Φ

gHist = self.g[sHist]
yHist = nHist

return [cHist, nHist, Bhist, τHist, gHist, yHist, sHist, ΦHist, RHist]

To analyze the AMSS model, we find it useful to adopt a recursive formulation using techniques like those in our lectures
on dynamic Stackelberg models and optimal taxation with state-contingent debt.


46.3 Recursive Version of AMSS Model

We now describe a recursive formulation of the AMSS economy.


We have noted that from the point of view of the Ramsey planner, the restriction to one-period risk-free securities
• leaves intact the single implementability constraint on allocations (46.8) from the Lucas-Stokey economy, but
• adds measurability constraints (46.6) on functions of tails of allocations at each time and history
We now explore how these constraints alter Bellman equations for a time 0 Ramsey planner and for time 𝑡 ≥ 1, history
𝑠𝑡 continuation Ramsey planners.

46.3.1 Recasting State Variables

In the AMSS setting, the government faces a sequence of budget constraints

𝜏𝑡 (𝑠𝑡 )𝑛𝑡 (𝑠𝑡 ) + 𝑇𝑡 (𝑠𝑡 ) + 𝑏𝑡+1 (𝑠𝑡 )/𝑅𝑡 (𝑠𝑡 ) = 𝑔𝑡 + 𝑏𝑡 (𝑠𝑡−1 )

where 𝑅𝑡 (𝑠𝑡 ) is the gross risk-free rate of interest between 𝑡 and 𝑡 + 1 at history 𝑠𝑡 and 𝑇𝑡 (𝑠𝑡 ) are non-negative transfers.
Throughout this lecture, we shall set transfers to zero (for some issues about the limiting behavior of debt, this is possibly
an important difference from AMSS [Aiyagari et al., 2002], who restricted transfers to be non-negative).
In this case, the household faces a sequence of budget constraints

𝑏𝑡 (𝑠𝑡−1 ) + (1 − 𝜏𝑡 (𝑠𝑡 ))𝑛𝑡 (𝑠𝑡 ) = 𝑐𝑡 (𝑠𝑡 ) + 𝑏𝑡+1 (𝑠𝑡 )/𝑅𝑡 (𝑠𝑡 ) (46.16)

The household’s first-order conditions are 𝑢𝑐,𝑡 = 𝛽𝑅𝑡 𝔼𝑡 𝑢𝑐,𝑡+1 and (1 − 𝜏𝑡 )𝑢𝑐,𝑡 = 𝑢𝑙,𝑡 .
Using these to eliminate 𝑅𝑡 and 𝜏𝑡 from budget constraint (46.16) gives

$$
b_t(s^{t-1}) + \frac{u_{l,t}(s^t)}{u_{c,t}(s^t)}\, n_t(s^t) = c_t(s^t) + \frac{\beta\,(\mathbb{E}_t u_{c,t+1})\, b_{t+1}(s^t)}{u_{c,t}(s^t)} \tag{46.17}
$$
or

𝑢𝑐,𝑡 (𝑠𝑡 )𝑏𝑡 (𝑠𝑡−1 ) + 𝑢𝑙,𝑡 (𝑠𝑡 )𝑛𝑡 (𝑠𝑡 ) = 𝑢𝑐,𝑡 (𝑠𝑡 )𝑐𝑡 (𝑠𝑡 ) + 𝛽(𝔼𝑡 𝑢𝑐,𝑡+1 )𝑏𝑡+1 (𝑠𝑡 ) (46.18)

Now define
$$
x_t \equiv \beta\, b_{t+1}(s^t)\, \mathbb{E}_t u_{c,t+1} = u_{c,t}(s^t)\, \frac{b_{t+1}(s^t)}{R_t(s^t)} \tag{46.19}
$$

and represent the household’s budget constraint at time 𝑡, history 𝑠𝑡 as


$$
\frac{u_{c,t}\, x_{t-1}}{\beta\, \mathbb{E}_{t-1} u_{c,t}} = u_{c,t}\, c_t - u_{l,t}\, n_t + x_t \tag{46.20}
$$

for 𝑡 ≥ 1.
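To connect the scaled state 𝑥𝑡 back to the par value of debt, a minimal sketch (with assumed parameters and a hypothetical consumption vector) simply inverts definition (46.19); this mirrors the step b_hist[t+1] = x_ / (β * Eu_c) in the simulation code later in this lecture:

import numpy as np

σ, β = 2, 0.9
Π = np.full((2, 2), 0.5)          # assumed transition matrix
c_next = np.array([0.9, 0.8])     # hypothetical next-period consumption by state
u_c_next = c_next ** (-σ)

s = 0                             # today's Markov state
Eu_c = Π[s] @ u_c_next            # E_t u_{c,t+1}

x_t = 0.5                         # hypothetical value of the scaled state
b_next = x_t / (β * Eu_c)         # invert (46.19): b_{t+1} = x_t / (β E_t u_{c,t+1})
print(b_next)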

46.3.2 Measurability Constraints

Write equation (46.18) as

$$
b_t(s^{t-1}) = c_t(s^t) - \frac{u_{l,t}(s^t)}{u_{c,t}(s^t)}\, n_t(s^t) + \frac{\beta\,(\mathbb{E}_t u_{c,t+1})\, b_{t+1}(s^t)}{u_{c,t}} \tag{46.21}
$$


The right side of equation (46.21) expresses the time 𝑡 value of government debt in terms of a linear combination of
terms whose individual components are measurable with respect to 𝑠𝑡 .
The sum of terms on the right side of equation (46.21) must equal 𝑏𝑡 (𝑠𝑡−1 ).
That implies that it has to be measurable with respect to 𝑠𝑡−1 .
Equations (46.21) are the measurability constraints that the AMSS model adds to the single time 0 implementation constraint imposed in the Lucas and Stokey model.

46.3.3 Two Bellman Equations

Let Π(𝑠|𝑠− ) be a Markov transition matrix whose entries tell probabilities of moving from state 𝑠− to state 𝑠 in one
period.
Let
• 𝑉 (𝑥− , 𝑠− ) be the continuation value of a continuation Ramsey plan at 𝑥𝑡−1 = 𝑥− , 𝑠𝑡−1 = 𝑠− for 𝑡 ≥ 1
• 𝑊 (𝑏, 𝑠) be the value of the Ramsey plan at time 0 at 𝑏0 = 𝑏 and 𝑠0 = 𝑠
We distinguish between two types of planners:
For 𝑡 ≥ 1, the value function for a continuation Ramsey planner satisfies the Bellman equation

$$
V(x_-, s_-) = \max_{\{n(s),\, x(s)\}} \sum_s \Pi(s|s_-)\left[u\bigl(n(s) - g(s),\, 1 - n(s)\bigr) + \beta V\bigl(x(s), s\bigr)\right] \tag{46.22}
$$

subject to the following collection of implementability constraints, one for each 𝑠 ∈ 𝑆:

$$
\frac{u_c(s)\, x_-}{\beta \sum_{\tilde s} \Pi(\tilde s|s_-)\, u_c(\tilde s)} = u_c(s)\bigl(n(s) - g(s)\bigr) - u_l(s)\, n(s) + x(s) \tag{46.23}
$$

A continuation Ramsey planner at 𝑡 ≥ 1 takes (𝑥𝑡−1 , 𝑠𝑡−1 ) = (𝑥− , 𝑠− ) as given and before 𝑠 is realized chooses
(𝑛𝑡 (𝑠𝑡 ), 𝑥𝑡 (𝑠𝑡 )) = (𝑛(𝑠), 𝑥(𝑠)) for 𝑠 ∈ 𝑆.
The Ramsey planner takes (𝑏0 , 𝑠0 ) as given and chooses (𝑛0 , 𝑥0 ).
The value function 𝑊 (𝑏0 , 𝑠0 ) for the time 𝑡 = 0 Ramsey planner satisfies the Bellman equation

$$
W(b_0, s_0) = \max_{n_0,\, x_0}\; u(n_0 - g_0,\, 1 - n_0) + \beta V(x_0, s_0) \tag{46.24}
$$

where maximization is subject to

𝑢𝑐,0 𝑏0 = 𝑢𝑐,0 (𝑛0 − 𝑔0 ) − 𝑢𝑙,0 𝑛0 + 𝑥0 (46.25)

46.3.4 Martingale Supersedes State-Variable Degeneracy

Let 𝜇(𝑠|𝑠− )Π(𝑠|𝑠− ) be a Lagrange multiplier on the constraint (46.23) for state 𝑠.
After forming an appropriate Lagrangian, we find that the continuation Ramsey planner’s first-order condition with respect
to 𝑥(𝑠) is

𝛽𝑉𝑥 (𝑥(𝑠), 𝑠) = 𝜇(𝑠|𝑠− ) (46.26)

Applying an envelope theorem to Bellman equation (46.22) gives

$$
V_x(x_-, s_-) = \sum_s \Pi(s|s_-)\, \mu(s|s_-)\, \frac{u_c(s)}{\beta \sum_{\tilde s} \Pi(\tilde s|s_-)\, u_c(\tilde s)} \tag{46.27}
$$


Equations (46.26) and (46.27) imply that

$$
V_x(x_-, s_-) = \sum_s \left( \Pi(s|s_-)\, \frac{u_c(s)}{\sum_{\tilde s} \Pi(\tilde s|s_-)\, u_c(\tilde s)} \right) V_x\bigl(x(s), s\bigr) \tag{46.28}
$$

Equation (46.28) states that 𝑉𝑥 (𝑥, 𝑠) is a risk-adjusted martingale.


Saying that 𝑉𝑥 (𝑥, 𝑠) is a risk-adjusted martingale means that 𝑉𝑥 (𝑥, 𝑠) is a martingale with respect to the probability
distribution over 𝑠𝑡 sequences that are generated by the twisted transition probability matrix:

$$
\check\Pi(s|s_-) \equiv \Pi(s|s_-)\, \frac{u_c(s)}{\sum_{\tilde s} \Pi(\tilde s|s_-)\, u_c(\tilde s)}
$$

Exercise 46.3.1
Please verify that Π̌(𝑠|𝑠− ) is a valid Markov transition density, i.e., that its elements are all non-negative and that for each 𝑠− , the sum over 𝑠 equals unity.
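Here is a sketch of one way to check the claim numerically (it is not a substitute for the short argument the exercise asks for; the transition matrix and marginal utilities below are arbitrary):

import numpy as np

Π = np.array([[0.7, 0.3],
              [0.4, 0.6]])        # any Markov transition matrix
u_c = np.array([1.5, 0.9])        # hypothetical marginal utilities by state

# Reweight each row of Π by u_c(s) and renormalize by the row sum Π @ u_c
Π_twisted = Π * u_c / (Π @ u_c)[:, None]

assert np.all(Π_twisted >= 0)                      # non-negative entries
assert np.allclose(Π_twisted.sum(axis=1), 1.0)     # each row sums to one
print(Π_twisted)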

46.3.5 Absence of State Variable Degeneracy

Along a Ramsey plan, the state variable 𝑥𝑡 = 𝑥𝑡 (𝑠𝑡 , 𝑏0 ) becomes a function of the history 𝑠𝑡 and initial government debt
𝑏0 .
In the Lucas-Stokey model, we found that
• a counterpart to 𝑉𝑥 (𝑥, 𝑠) is time-invariant and equal to the Lagrange multiplier on the Lucas-Stokey implementability constraint
• time invariance of 𝑉𝑥 (𝑥, 𝑠) is the source of a key feature of the Lucas-Stokey model, namely, state variable
degeneracy in which 𝑥𝑡 is an exact time-invariant function of 𝑠𝑡 .
That 𝑉𝑥 (𝑥, 𝑠) varies over time according to a twisted martingale means that there is no state-variable degeneracy in the
AMSS model.
In the AMSS model, both 𝑥 and 𝑠 are needed to describe the state.
This property of the AMSS model transmits a twisted martingale component to consumption, employment, and the tax
rate.

46.3.6 Digression on Non-negative Transfers

Throughout this lecture, we have imposed that transfers 𝑇𝑡 = 0.


AMSS [Aiyagari et al., 2002] instead imposed a nonnegativity constraint 𝑇𝑡 ≥ 0 on transfers.
They also considered a special case of quasi-linear preferences, 𝑢(𝑐, 𝑙) = 𝑐 + 𝐻(𝑙).
In this case, 𝑉𝑥 (𝑥, 𝑠) ≤ 0 is a non-positive martingale.
By the martingale convergence theorem 𝑉𝑥 (𝑥, 𝑠) converges almost surely.
Furthermore, when the Markov chain Π(𝑠|𝑠− ) and the government expenditure function 𝑔(𝑠) are such that 𝑔𝑡 is perpetually random, 𝑉𝑥 (𝑥, 𝑠) almost surely converges to zero.


For quasi-linear preferences, the first-order condition for maximizing (46.22) subject to (46.23) with respect to 𝑛(𝑠)
becomes

(1 − 𝜇(𝑠|𝑠− ))(1 − 𝑢𝑙 (𝑠)) + 𝜇(𝑠|𝑠− )𝑛(𝑠)𝑢𝑙𝑙 (𝑠) = 0

When 𝜇(𝑠|𝑠− ) = 𝛽𝑉𝑥 (𝑥(𝑠), 𝑠) converges to zero, in the limit 𝑢𝑙 (𝑠) = 1 = 𝑢𝑐 (𝑠), so that 𝜏 (𝑥(𝑠), 𝑠) = 0.
Thus, in the limit, if 𝑔𝑡 is perpetually random, the government accumulates sufficient assets to finance all expenditures
from earnings on those assets, returning any excess revenues to the household as non-negative lump-sum transfers.

46.3.7 Code

The recursive formulation is implemented as follows

class AMSS:
# WARNING: THE CODE IS EXTREMELY SENSITIVE TO CHOICES OF PARAMETERS.
# DO NOT CHANGE THE PARAMETERS AND EXPECT IT TO WORK

def __init__(self, pref, β, Π, g, x_grid, bounds_v):


self.β, self.Π, self.g = β, Π, g
self.x_grid = x_grid
self.n = x_grid[0][2]
self.S = len(Π)
self.bounds = bounds_v
self.pref = pref

self.T_v, self.T_w = bellman_operator_factory(Π, β, x_grid, g,


bounds_v)

self.V_solved = False
self.W_solved = False

def compute_V(self, V, σ_v_star, tol_vfi, maxitr, print_itr):

T_v = self.T_v

self.success = False

V_new = np.zeros_like(V)

Δ = 1.0
for itr in range(maxitr):
T_v(V, V_new, σ_v_star, self.pref)

Δ = np.max(np.abs(V_new - V))

if Δ < tol_vfi:
self.V_solved = True
print('Successfully completed VFI after %i iterations'
% (itr+1))
break

if (itr + 1) % print_itr == 0:
print('Error at iteration %i : ' % (itr + 1), Δ)

V[:] = V_new[:]
self.V = V
self.σ_v_star = σ_v_star

return V, σ_v_star

def compute_W(self, b_0, W, σ_w_star):


T_w = self.T_w
V = self.V

T_w(W, σ_w_star, V, b_0, self.pref)

self.W = W
self.σ_w_star = σ_w_star
self.W_solved = True
print('Successfully solved the time 0 problem.')

return W, σ_w_star

def solve(self, V, σ_v_star, b_0, W, σ_w_star, tol_vfi=1e-7,


maxitr=1000, print_itr=10):
print("===============")
print("Solve time 1 problem")
print("===============")
self.compute_V(V, σ_v_star, tol_vfi, maxitr, print_itr)
print("===============")
print("Solve time 0 problem")
print("===============")
self.compute_W(b_0, W, σ_w_star)

def simulate(self, s_hist, b_0):


if not (self.V_solved and self.W_solved):
msg = "V and W need to be successfully computed before simulation."
raise ValueError(msg)

pref = self.pref
x_grid, g, β, S = self.x_grid, self.g, self.β, self.S
σ_v_star, σ_w_star = self.σ_v_star, self.σ_w_star

T = len(s_hist)
s_0 = s_hist[0]

# Pre-allocate
n_hist = np.zeros(T)
x_hist = np.zeros(T)
c_hist = np.zeros(T)
τ_hist = np.zeros(T)
b_hist = np.zeros(T)
g_hist = np.zeros(T)

# Compute t = 0
l_0, T_0 = σ_w_star[s_0]
c_0 = (1 - l_0) - g[s_0]
x_0 = (-pref.Uc(c_0, l_0) * (c_0 - T_0 - b_0) +
pref.Ul(c_0, l_0) * (1 - l_0))

n_hist[0] = (1 - l_0)
x_hist[0] = x_0
c_hist[0] = c_0
τ_hist[0] = 1 - pref.Ul(c_0, l_0) / pref.Uc(c_0, l_0)
b_hist[0] = b_0
g_hist[0] = g[s_0]

# Compute t > 0
for t in range(T - 1):
x_ = x_hist[t]
s_ = s_hist[t]
l = np.zeros(S)
T = np.zeros(S)
for s in range(S):
x_arr = np.array([x_])
l[s] = eval_linear(x_grid, σ_v_star[s_, :, s], x_arr)
T[s] = eval_linear(x_grid, σ_v_star[s_, :, S+s], x_arr)

c = (1 - l) - g
u_c = pref.Uc(c, l)
Eu_c = Π[s_] @ u_c

x = u_c * x_ / (β * Eu_c) - u_c * (c - T) + pref.Ul(c, l) * (1 - l)

c_next = c[s_hist[t+1]]
l_next = l[s_hist[t+1]]

x_hist[t+1] = x[s_hist[t+1]]
n_hist[t+1] = 1 - l_next
c_hist[t+1] = c_next
τ_hist[t+1] = 1 - pref.Ul(c_next, l_next) / pref.Uc(c_next, l_next)
b_hist[t+1] = x_ / (β * Eu_c)
g_hist[t+1] = g[s_hist[t+1]]

return c_hist, n_hist, b_hist, τ_hist, g_hist, n_hist

def obj_factory(Π, β, x_grid, g):


S = len(Π)

@njit
def obj_V(σ, state, V, pref):
# Unpack state
s_, x_ = state

l = σ[:S]
T = σ[S:]

c = (1 - l) - g
u_c = pref.Uc(c, l)
Eu_c = Π[s_] @ u_c
x = u_c * x_ / (β * Eu_c) - u_c * (c - T) + pref.Ul(c, l) * (1 - l)

V_next = np.zeros(S)

for s in range(S):

V_next[s] = eval_linear(x_grid, V[s], np.array([x[s]]))

out = Π[s_] @ (pref.U(c, l) + β * V_next)

return out

@njit
def obj_W(σ, state, V, pref):
# Unpack state
s_, b_0 = state
l, T = σ

c = (1 - l) - g[s_]
x = -pref.Uc(c, l) * (c - T - b_0) + pref.Ul(c, l) * (1 - l)

V_next = eval_linear(x_grid, V[s_], np.array([x]))

out = pref.U(c, l) + β * V_next

return out

return obj_V, obj_W

def bellman_operator_factory(Π, β, x_grid, g, bounds_v):


obj_V, obj_W = obj_factory(Π, β, x_grid, g)
n = x_grid[0][2]
S = len(Π)
x_nodes = nodes(x_grid)

@njit(parallel=True)
def T_v(V, V_new, σ_star, pref):
for s_ in prange(S):
for x_i in prange(n):
state = (s_, x_nodes[x_i])
x0 = σ_star[s_, x_i]
res = optimize.nelder_mead(obj_V, x0, bounds=bounds_v,
args=(state, V, pref))

if res.success:
V_new[s_, x_i] = res.fun
σ_star[s_, x_i] = res.x
else:
print("Optimization routine failed.")

bounds_w = np.array([[-9.0, 1.0], [0., 10.]])

def T_w(W, σ_star, V, b_0, pref):


for s_ in prange(S):
state = (s_, b_0)
x0 = σ_star[s_]
res = optimize.nelder_mead(obj_W, x0, bounds=bounds_w,
args=(state, V, pref))

W[s_] = res.fun
σ_star[s_] = res.x

return T_v, T_w

46.4 Examples

We now turn to some examples.

46.4.1 Anticipated One-Period War

In our lecture on optimal taxation with state-contingent debt we studied how the government manages uncertainty in a
simple setting.
As in that lecture, we assume the one-period utility function
$$
u(c, n) = \frac{c^{1-\sigma}}{1-\sigma} - \frac{n^{1+\gamma}}{1+\gamma}
$$

Note: For convenience in matching our computer code, we have expressed utility as a function of 𝑛 rather than leisure
𝑙.

We first consider a government expenditure process that we studied earlier in a lecture on optimal taxation with state-
contingent debt.
Government expenditures are known for sure in all periods except one.
• For 𝑡 < 3 or 𝑡 > 3 we assume that 𝑔𝑡 = 𝑔𝑙 = 0.1.
• At 𝑡 = 3 a war occurs with probability 0.5.
– If there is war, 𝑔3 = 𝑔ℎ = 0.2.
– If there is no war 𝑔3 = 𝑔𝑙 = 0.1.
A useful trick is to define components of the state vector as the following six (𝑡, 𝑔) pairs:
(0, 𝑔𝑙 ), (1, 𝑔𝑙 ), (2, 𝑔𝑙 ), (3, 𝑔𝑙 ), (3, 𝑔ℎ ), (𝑡 ≥ 4, 𝑔𝑙 )
We think of these 6 states as corresponding to 𝑠 = 1, 2, 3, 4, 5, 6.
The transition matrix is
$$
P = \begin{pmatrix}
0 & 1 & 0 & 0 & 0 & 0 \\
0 & 0 & 1 & 0 & 0 & 0 \\
0 & 0 & 0 & 0.5 & 0.5 & 0 \\
0 & 0 & 0 & 0 & 0 & 1 \\
0 & 0 & 0 & 0 & 0 & 1 \\
0 & 0 & 0 & 0 & 0 & 1
\end{pmatrix}
$$
The government expenditure at each state is
$$
g = \begin{pmatrix} 0.1 \\ 0.1 \\ 0.1 \\ 0.1 \\ 0.2 \\ 0.1 \end{pmatrix}
$$


We assume the same utility parameters as in the Lucas-Stokey economy.


This utility function is implemented in the following class.

crra_util_data = [
('β', float64),
('σ', float64),
('γ', float64)
]

@jitclass(crra_util_data)
class CRRAutility:

def __init__(self,
β=0.9,
σ=2,
γ=2):

self.β, self.σ, self.γ = β, σ, γ

# Utility function
def U(self, c, l):
# Note: `l` should not be interpreted as labor, it is an auxiliary
# variable used to conveniently match the code and the equations
# in the lecture
σ = self.σ
if σ == 1.:
U = np.log(c)
else:
U = (c**(1 - σ) - 1) / (1 - σ)
return U - (1-l) ** (1 + self.γ) / (1 + self.γ)

# Derivatives of utility function


def Uc(self, c, l):
return c ** (-self.σ)

def Ucc(self, c, l):


return -self.σ * c ** (-self.σ - 1)

def Ul(self, c, l):


return (1-l) ** self.γ

def Ull(self, c, l):


return -self.γ * (1-l) ** (self.γ - 1)

def Ucl(self, c, l):


return 0

def Ulc(self, c, l):


return 0

The following figure plots Ramsey plans under complete and incomplete markets for both possible realizations of the
state at time 𝑡 = 3.
Ramsey outcomes and policies when the government has access to state-contingent debt are represented by black lines
and by red lines when there is only a risk-free bond.
Paths with circles are histories in which there is peace, while those with triangles denote war.


# WARNING: DO NOT EXPECT THE CODE TO WORK IF YOU CHANGE PARAMETERS


σ = 2
γ = 2
β = 0.9
Π = np.array([[0, 1, 0, 0, 0, 0],
[0, 0, 1, 0, 0, 0],
[0, 0, 0, 0.5, 0.5, 0],
[0, 0, 0, 0, 0, 1],
[0, 0, 0, 0, 0, 1],
[0, 0, 0, 0, 0, 1]])
g = np.array([0.1, 0.1, 0.1, 0.2, 0.1, 0.1])

x_min = -1.5555
x_max = 17.339
x_num = 300

x_grid = UCGrid((x_min, x_max, x_num))

crra_pref = CRRAutility(β=β, σ=σ, γ=γ)

S = len(Π)
bounds_v = np.vstack([np.hstack([np.full(S, -10.), np.zeros(S)]),
np.hstack([np.ones(S) - g, np.full(S, 10.)])]).T

amss_model = AMSS(crra_pref, β, Π, g, x_grid, bounds_v)

# WARNING: DO NOT EXPECT THE CODE TO WORK IF YOU CHANGE PARAMETERS


V = np.zeros((len(Π), x_num))
V[:] = -nodes(x_grid).T ** 2

σ_v_star = np.ones((S, x_num, S * 2))


σ_v_star[:, :, :S] = 0.0

W = np.empty(len(Π))
b_0 = 1.0
σ_w_star = np.ones((S, 2))
σ_w_star[:, 0] = -0.05

%%time

amss_model.solve(V, σ_v_star, b_0, W, σ_w_star)

===============
Solve time 1 problem
===============

Error at iteration 10 : 1.110064840137854

Error at iteration 20 : 0.30784885876438395

Error at iteration 30 : 0.03221851531398379


Error at iteration 40 : 0.014347598008733087

Error at iteration 50 : 0.0031219444631354065

Error at iteration 60 : 0.0010783647355108172

Error at iteration 70 : 0.0003761255356202753

Error at iteration 80 : 0.0001318127597098595

Error at iteration 90 : 4.650031579878089e-05

Error at iteration 100 : 1.801377708510188e-05

Error at iteration 110 : 6.175872600877597e-06

Error at iteration 120 : 2.4450291853383987e-06

Error at iteration 130 : 1.0836745989450947e-06

Error at iteration 140 : 5.682877084467464e-07

Error at iteration 150 : 3.567560966644123e-07

Error at iteration 160 : 2.5837734796141376e-07

Error at iteration 170 : 2.047536575844333e-07

Error at iteration 180 : 1.7066849622437985e-07

Error at iteration 190 : 1.4622035848788073e-07

Error at iteration 200 : 1.27387780324284e-07

Error at iteration 210 : 1.1226231499961159e-07

Successfully completed VFI after 220 iterations


===============
Solve time 0 problem
===============


Successfully solved the time 0 problem.


CPU times: user 2min 8s, sys: 1.6 s, total: 2min 10s
Wall time: 1min 27s

# Solve the LS model


ls_model = SequentialLS(crra_pref, g=g, π=Π)

# WARNING: DO NOT EXPECT THE CODE TO WORK IF YOU CHANGE PARAMETERS


s_hist_h = np.array([0, 1, 2, 3, 5, 5, 5])
s_hist_l = np.array([0, 1, 2, 4, 5, 5, 5])

sim_h_amss = amss_model.simulate(s_hist_h, b_0)


sim_l_amss = amss_model.simulate(s_hist_l, b_0)

sim_h_ls = ls_model.simulate(b_0, 0, 7, s_hist_h)


sim_l_ls = ls_model.simulate(b_0, 0, 7, s_hist_l)

fig, axes = plt.subplots(3, 2, figsize=(14, 10))


titles = ['Consumption', 'Labor Supply', 'Government Debt',
'Tax Rate', 'Government Spending', 'Output']

for ax, title, ls_l, ls_h, amss_l, amss_h in zip(axes.flatten(), titles,


sim_l_ls, sim_h_ls,
sim_l_amss, sim_h_amss):
ax.plot(ls_l, '-ok', ls_h, '-^k', amss_l, '-or', amss_h, '-^r',
alpha=0.7)
ax.set(title=title)
ax.grid()

plt.tight_layout()
plt.show()


How a Ramsey planner responds to war depends on the structure of the asset market.
If it is able to trade state-contingent debt, then at time 𝑡 = 2
• the government purchases an Arrow security that pays off when 𝑔3 = 𝑔ℎ
• the government sells an Arrow security that pays off when 𝑔3 = 𝑔𝑙
• the Ramsey planner designs these purchases and sales so that, regardless of whether or not there is a war at 𝑡 = 3, the government begins period 𝑡 = 4 with the same government debt
This pattern facilitates smoothing tax rates across states.
The government without state-contingent debt cannot do this.
Instead, it must enter time 𝑡 = 3 with the same level of debt falling due whether there is peace or war at 𝑡 = 3.
The risk-free rate between time 2 and time 3 is unusually low because at time 2 consumption at time 3 is expected to be
unusually low.
A low risk-free rate of return on government debt between time 2 and time 3 allows the government to enter period 3
with lower government debt than it entered period 2.
To finance a war at time 3 it raises taxes and issues more debt to carry into perpetual peace that begins in period 4.
To service the additional debt burden, it raises taxes in all future periods.
The absence of state-contingent debt leads to an important difference in the optimal tax policy.
When the Ramsey planner has access to state-contingent debt, the optimal tax policy is history independent
• the tax rate is a function of the current level of government spending only, given the Lagrange multiplier on the
implementability constraint


Without state-contingent debt, the optimal tax rate is history dependent.


• A war at time 𝑡 = 3 causes a permanent increase in the tax rate.
• Peace at time 𝑡 = 3 causes a permanent reduction in the tax rate.

Perpetual War Alert

History dependence occurs more dramatically in a case in which the government perpetually faces the prospect of war.
This case was studied in the final example of the lecture on optimal taxation with state-contingent debt.
There, each period the government faces a constant probability, 0.5, of war.
In addition, this example features the following preferences

𝑢(𝑐, 𝑛) = log(𝑐) + 0.69 log(1 − 𝑛)

Accordingly, we will re-define our utility function.

log_util_data = [
('β', float64),
('ψ', float64)
]

@jitclass(log_util_data)
class LogUtility:

def __init__(self,
β=0.9,
ψ=0.69):

self.β, self.ψ = β, ψ

# Utility function
def U(self, c, l):
return np.log(c) + self.ψ * np.log(l)

# Derivatives of utility function


def Uc(self, c, l):
return 1 / c

def Ucc(self, c, l):


return -c**(-2)

def Ul(self, c, l):


return self.ψ / l

def Ull(self, c, l):


return -self.ψ / l**2

def Ucl(self, c, l):


return 0

def Ulc(self, c, l):


return 0

With these preferences, Ramsey tax rates will vary even in the Lucas-Stokey model with state-contingent debt.


The figure below plots optimal tax policies for both the economy with state-contingent debt (circles) and the economy
with only a risk-free bond (triangles).

# WARNING: DO NOT EXPECT THE CODE TO WORK IF YOU CHANGE PARAMETERS


ψ = 0.69
Π = np.full((2, 2), 0.5)
β = 0.9
g = np.array([0.1, 0.2])

x_min = -3.4107
x_max = 3.709
x_num = 300

x_grid = UCGrid((x_min, x_max, x_num))


log_pref = LogUtility(β=β, ψ=ψ)

S = len(Π)
bounds_v = np.vstack([np.zeros(2 * S), np.hstack([1 - g, np.ones(S)]) ]).T

V = np.zeros((len(Π), x_num))
V[:] = -(nodes(x_grid).T + x_max) ** 2 / 14

σ_v_star = 1 - np.full((S, x_num, S * 2), 0.55)

W = np.empty(len(Π))
b_0 = 0.5
σ_w_star = 1 - np.full((S, 2), 0.55)

amss_model = AMSS(log_pref, β, Π, g, x_grid, bounds_v)

%%time

amss_model.solve(V, σ_v_star, b_0, W, σ_w_star, tol_vfi=3e-5, maxitr=3000,


print_itr=100)

===============
Solve time 1 problem
===============

Error at iteration 100 : 0.0011569123052908026

Error at iteration 200 : 0.0005024948171925558

Error at iteration 300 : 0.0002995649778405607

Error at iteration 400 : 0.00020753209923363158

Error at iteration 500 : 0.00015556566848218267

Error at iteration 600 : 0.0001228034492957164


Error at iteration 700 : 0.00010068689697462219

Error at iteration 800 : 8.474340939912395e-05

Error at iteration 900 : 7.290920770763876e-05

Error at iteration 1000 : 6.375694017535238e-05

Error at iteration 1100 : 5.642689428775327e-05

Error at iteration 1200 : 5.045426282634935e-05

Error at iteration 1300 : 4.561168914030134e-05

Error at iteration 1400 : 4.150059282892471e-05

Error at iteration 1500 : 3.799110186264443e-05

Error at iteration 1600 : 3.5163266918658564e-05

Error at iteration 1700 : 3.263979350620616e-05

Error at iteration 1800 : 3.0359381506528393e-05

Successfully completed VFI after 1818 iterations


===============
Solve time 0 problem
===============

Successfully solved the time 0 problem.


CPU times: user 1min 58s, sys: 1.29 s, total: 1min 59s
Wall time: 1min 28s

ls_model = SequentialLS(log_pref, g=g, π=Π) # Solve sequential problem

# WARNING: DO NOT EXPECT THE CODE TO WORK IF YOU CHANGE PARAMETERS


s_hist = np.array([0, 0, 0, 0, 0, 0, 0, 0, 1, 1,
0, 0, 0, 1, 1, 1, 1, 1, 1, 0])

T = len(s_hist)

sim_amss = amss_model.simulate(s_hist, b_0)


sim_ls = ls_model.simulate(0.5, 0, T, s_hist)

titles = ['Consumption', 'Labor Supply', 'Government Debt',


'Tax Rate', 'Government Spending', 'Output']

fig, axes = plt.subplots(3, 2, figsize=(14, 10))

for ax, title, ls, amss in zip(axes.flatten(), titles, sim_ls, sim_amss):


ax.plot(ls, '-ok', amss, '-^b')
ax.set(title=title)
ax.grid()

axes[0, 0].legend(('Complete Markets', 'Incomplete Markets'))


plt.tight_layout()
plt.show()

When the government experiences a prolonged period of peace, it is able to reduce government debt and set persistently
lower tax rates.
However, the government finances a long war by borrowing and raising taxes.
This results in a drift away from policies with state-contingent debt that depends on the history of shocks.
This is even more evident in the following figure that plots the evolution of the two policies over 200 periods.
This outcome reflects the presence of a force for precautionary saving that the incomplete markets structure imparts to
the Ramsey plan.
In this subsequent lecture and this subsequent lecture, some ultimate consequences of that force are explored.


T = 200
s_0 = 0
mc = MarkovChain(Π)

s_hist_long = mc.simulate(T, init=s_0, random_state=5)

sim_amss = amss_model.simulate(s_hist_long, b_0)


sim_ls = ls_model.simulate(0.5, 0, T, s_hist_long)

titles = ['Consumption', 'Labor Supply', 'Government Debt',


'Tax Rate', 'Government Spending', 'Output']

fig, axes = plt.subplots(3, 2, figsize=(14, 10))

for ax, title, ls, amss in zip(axes.flatten(), titles, sim_ls, \


sim_amss):
ax.plot(ls, '-k', amss, '-.b', alpha=0.5)
ax.set(title=title)
ax.grid()

axes[0, 0].legend(('Complete Markets','Incomplete Markets'))


plt.tight_layout()
plt.show()



CHAPTER

FORTYSEVEN

FLUCTUATING INTEREST RATES DELIVER FISCAL INSURANCE

In addition to what’s in Anaconda, this lecture will need the following libraries:

!pip install --upgrade quantecon

47.1 Overview

This lecture extends our investigations of how optimal policies for levying a flat-rate tax on labor income and issuing
government debt depend on whether there are complete markets for debt.
A Ramsey allocation and Ramsey policy in the AMSS [Aiyagari et al., 2002] model described in optimal taxation without
state-contingent debt generally differs from a Ramsey allocation and Ramsey policy in the Lucas-Stokey [Lucas and Stokey,
1983] model described in optimal taxation with state-contingent debt.
This is because the implementability restriction that a competitive equilibrium with a distorting tax imposes on allocations
in the Lucas-Stokey model is just one among a set of implementability conditions imposed in the AMSS model.
These additional constraints require that time 𝑡 components of a Ramsey allocation for the AMSS model be measurable
with respect to time 𝑡 − 1 information.
The measurability constraints imposed by the AMSS model are inherited from the restriction that only one-period risk-
free bonds can be traded.
Differences between the Ramsey allocations in the two models indicate that at least some of the implementability constraints of the AMSS model of optimal taxation without state-contingent debt are violated at the Ramsey allocation of a corresponding [Lucas and Stokey, 1983] model with state-contingent debt.
Another way to say this is that differences between the Ramsey allocations of the two models indicate that some of the
measurability constraints imposed by the AMSS model are violated at the Ramsey allocation of the Lucas-Stokey
model.
Nonzero Lagrange multipliers on those constraints make the Ramsey allocation for the AMSS model differ from the
Ramsey allocation for the Lucas-Stokey model.
This lecture studies a special AMSS model in which
• The exogenous state variable 𝑠𝑡 is governed by a finite-state Markov chain.
• With an arbitrary budget-feasible initial level of government debt, the measurability constraints
– bind for many periods, but ….
– eventually, they stop binding evermore, so that …
– in the tail of the Ramsey plan, the Lagrange multipliers 𝛾𝑡 (𝑠𝑡 ) on the AMSS implementability constraints
(46.8) are zero.


• After the implementability constraints (46.8) no longer bind in the tail of the AMSS Ramsey plan
– history dependence of the AMSS state variable 𝑥𝑡 vanishes and 𝑥𝑡 becomes a time-invariant function of the
Markov state 𝑠𝑡 .
– the par value of government debt becomes constant over time so that 𝑏𝑡+1 (𝑠𝑡 ) = 𝑏̄ for 𝑡 ≥ 𝑇 for a sufficiently
large 𝑇 .
– 𝑏̄ < 0, so that the tail of the Ramsey plan instructs the government always to make a constant par value of
risk-free one-period loans to the private sector.
– the one-period gross interest rate 𝑅𝑡 (𝑠𝑡 ) on risk-free debt converges to a time-invariant function of the
Markov state 𝑠𝑡 .
• For a particular 𝑏0 < 0 (i.e., a positive level of initial government loans to the private sector), the measurability
constraints never bind.
• In this special case
– the par value 𝑏𝑡+1 (𝑠𝑡 ) = 𝑏̄ of government debt at time 𝑡 and Markov state 𝑠𝑡 is constant across time and
states, but ….
– the market value 𝑏̄/𝑅𝑡 (𝑠𝑡 ) of government debt at time 𝑡 varies as a time-invariant function of the Markov state 𝑠𝑡 .
– fluctuations in the interest rate make gross earnings on government debt 𝑅𝑡 (𝑠𝑡 )𝑏̄ fully insure the gross-of-gross-interest-payments government budget against fluctuations in government expenditures.
– the state variable 𝑥 in a recursive representation of a Ramsey plan is a time-invariant function of the Markov
state for 𝑡 ≥ 0.
• In this special case, the Ramsey allocation in the AMSS model agrees with that in a Lucas-Stokey [Lucas and
Stokey, 1983] complete markets model in which the same amount of state-contingent debt falls due in all states
tomorrow
– it is a situation in which the Ramsey planner loses nothing from not being able to trade state-contingent debt
and being restricted to exchange only risk-free debt.
• This outcome emerges only when we initialize government debt at a particular 𝑏0 < 0.
In a nutshell, the reason for this striking outcome is that at a particular level of risk-free government assets, fluctuations
in the one-period risk-free interest rate provide the government with complete insurance against stochastically varying
government expenditures.
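A back-of-the-envelope way to see the mechanism (a sketch only): take the government budget constraint (46.4) from the previous lecture, set transfers to zero, and freeze the par value of debt at a constant 𝑏̄; then the required tax revenue satisfies

$$
\tau_t n_t = g(s_t) + \bar b\left(1 - \frac{1}{R_t(s_t)}\right)
$$

With 𝑏̄ < 0 the second term is interest income for the government, and if 𝑅𝑡 (𝑠𝑡 ) varies across Markov states in just the right way, that income moves with 𝑔(𝑠𝑡 ) so that the required revenue 𝜏𝑡 𝑛𝑡 need not vary across states.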
Let’s start with some imports:

import matplotlib.pyplot as plt


from scipy.optimize import fsolve, fmin

47.2 Forces at Work

The forces driving asymptotic outcomes here are examples of dynamics present in a more general class of incomplete
markets models analyzed in [Bhandari et al., 2017] (BEGS).
BEGS provide conditions under which government debt under a Ramsey plan converges to an invariant distribution.
BEGS construct approximations to that asymptotically invariant distribution of government debt under a Ramsey plan.
BEGS also compute an approximation to a Ramsey plan’s rate of convergence to that limiting invariant distribution.

920 Chapter 47. Fluctuating Interest Rates Deliver Fiscal Insurance


Advanced Quantitative Economics with Python

We shall use the BEGS approximating limiting distribution and their approximating rate of convergence to help interpret
outcomes here.
For a long time, the Ramsey plan puts a nontrivial martingale-like component into the par value of government debt as
part of the way that the Ramsey plan imperfectly smooths distortions from the labor tax rate across time and Markov
states.
But BEGS show that binding implementability constraints slowly push government debt in a direction designed to let the
government use fluctuations in equilibrium interest rates rather than fluctuations in par values of debt to insure against
shocks to government expenditures.
• This is a weak (but unrelenting) force that, starting from a positive initial debt level, for a long time is dominated
by the stochastic martingale-like component of debt dynamics that the Ramsey planner uses to facilitate imperfect
tax-smoothing across time and states.
• This weak force slowly drives the par value of government assets to a constant level at which the government can
completely insure against government expenditure shocks while shutting down the stochastic component of debt
dynamics.
• At that point, the tail of the par value of government debt becomes a trivial martingale: it is constant over time.

47.3 Logical Flow of Lecture

We present ideas in the following order


• We describe a two-state AMSS economy and generate a long simulation starting from a positive initial government
debt.
• We observe that in a long simulation starting from positive government debt, the par value of government debt
eventually converges to a constant 𝑏.̄
• In fact, the par value of government debt converges to the same constant level 𝑏̄ for alternative realizations of the
Markov government expenditure process and for alternative settings of initial government debt 𝑏0 .
• We reverse engineer a particular value of initial government debt 𝑏0 (it turns out to be negative) for which the
continuation debt moves to 𝑏̄ immediately.
• We note that for this particular initial debt 𝑏0 , the Ramsey allocations for the AMSS economy and the Lucas-Stokey
model are identical
– we verify that the LS Ramsey planner chooses to purchase identical claims to time 𝑡 + 1 consumption for all
Markov states tomorrow for each Markov state today.
• We compute the BEGS approximations to check how accurately they describe the dynamics of the long-simulation.

47.3.1 Equations from Lucas-Stokey (1983) Model

Although we are studying an AMSS [Aiyagari et al., 2002] economy, a Lucas-Stokey [Lucas and Stokey, 1983] economy
plays an important role in the reverse-engineering calculation to be described below.
For that reason, it is helpful to have key equations underlying a Ramsey plan for the Lucas-Stokey economy readily
available.
Recall first-order conditions for a Ramsey allocation for the Lucas-Stokey economy.
For 𝑡 ≥ 1, these take the form

$$
\begin{aligned}
(1 + \Phi)\, u_c(c, 1-c-g) &+ \Phi \left[ c\, u_{cc}(c, 1-c-g) - (c+g)\, u_{\ell c}(c, 1-c-g) \right] \\
= (1 + \Phi)\, u_\ell(c, 1-c-g) &+ \Phi \left[ c\, u_{c\ell}(c, 1-c-g) - (c+g)\, u_{\ell\ell}(c, 1-c-g) \right]
\end{aligned}
\qquad (47.1)
$$

There is one such equation for each value of the Markov state 𝑠𝑡 .
Given an initial Markov state, the time 𝑡 = 0 quantities 𝑐0 and 𝑏0 satisfy
$$
\begin{aligned}
(1 + \Phi)\, u_c(c, 1-c-g) &+ \Phi \left[ c\, u_{cc}(c, 1-c-g) - (c+g)\, u_{\ell c}(c, 1-c-g) \right] \\
= (1 + \Phi)\, u_\ell(c, 1-c-g) &+ \Phi \left[ c\, u_{c\ell}(c, 1-c-g) - (c+g)\, u_{\ell\ell}(c, 1-c-g) \right] + \Phi (u_{cc} - u_{c\ell})\, b_0
\end{aligned}
\qquad (47.2)
$$
In addition, the time 𝑡 = 0 budget constraint is satisfied at 𝑐0 and initial government debt 𝑏0
$$
b_0 + g_0 = \tau_0 (c_0 + g_0) + \frac{\bar b}{R_0}
\qquad (47.3)
$$
where 𝑅0 is the gross interest rate for the Markov state 𝑠0 that is assumed to prevail at time 𝑡 = 0 and 𝜏0 is the time
𝑡 = 0 tax rate.
In equation (47.3), it is understood that
$$
\tau_0 = 1 - \frac{u_{\ell,0}}{u_{c,0}}
$$

$$
R_0^{-1} = \beta \sum_{s=1}^{S} \Pi(s \mid s_0) \frac{u_c(s)}{u_{c,0}}
$$
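For example, here is a minimal sketch of these two computations; the marginal-utility numbers are hypothetical placeholders used only to illustrate the formulas (the solve_cb function later in this lecture performs the same calculation inside the reverse-engineering exercise):

import numpy as np

β = 0.9
Π_row = np.array([0.5, 0.5])        # Π(s | s0) for the two Markov states
uc = np.array([1.13, 1.25])         # hypothetical values of u_c(s)
uc0, ul0 = 1.13, 0.85               # hypothetical time-0 marginal utilities u_{c,0}, u_{l,0}

τ0 = 1 - ul0 / uc0                  # time-0 tax rate
R0 = 1 / (β * Π_row @ (uc / uc0))   # time-0 gross interest rate on risk-free debt
print(τ0, R0)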
It is useful to transform some of the above equations to forms that are more natural for analyzing the case of a CRRA
utility specification that we shall use in our example economies.

47.3.2 Specification with CRRA Utility

As in lectures optimal taxation without state-contingent debt and optimal taxation with state-contingent debt, we assume
that the representative agent has utility function
$$
u(c, n) = \frac{c^{1-\sigma}}{1-\sigma} - \frac{n^{1+\gamma}}{1+\gamma}
$$
and set 𝜎 = 2, 𝛾 = 2, and the discount factor 𝛽 = 0.9.
We eliminate leisure from the model and continue to assume that

𝑐𝑡 + 𝑔 𝑡 = 𝑛 𝑡

The analysis of Lucas and Stokey prevails once we make the following replacements
𝑢ℓ (𝑐, ℓ) ∼ −𝑢𝑛 (𝑐, 𝑛)
𝑢𝑐 (𝑐, ℓ) ∼ 𝑢𝑐 (𝑐, 𝑛)
𝑢ℓ,ℓ (𝑐, ℓ) ∼ 𝑢𝑛𝑛 (𝑐, 𝑛)
𝑢𝑐,𝑐 (𝑐, ℓ) ∼ 𝑢𝑐,𝑐 (𝑐, 𝑛)
𝑢𝑐,ℓ (𝑐, ℓ) ∼ 0
With these understandings, equations (47.1) and (47.2) simplify in the case of the CRRA utility function.
They become

(1 + Φ)[𝑢𝑐 (𝑐) + 𝑢𝑛 (𝑐 + 𝑔)] + Φ[𝑐𝑢𝑐𝑐 (𝑐) + (𝑐 + 𝑔)𝑢𝑛𝑛 (𝑐 + 𝑔)] = 0 (47.4)

and

(1 + Φ)[𝑢𝑐 (𝑐0 ) + 𝑢𝑛 (𝑐0 + 𝑔0 )] + Φ[𝑐0 𝑢𝑐𝑐 (𝑐0 ) + (𝑐0 + 𝑔0 )𝑢𝑛𝑛 (𝑐0 + 𝑔0 )] − Φ𝑢𝑐𝑐 (𝑐0 )𝑏0 = 0 (47.5)

In equation (47.4), it is understood that 𝑐 and 𝑔 are each functions of the Markov state 𝑠.
The CRRA utility function is represented in the following class.

import numpy as np

class CRRAutility:

def __init__(self,
β=0.9,
σ=2,
γ=2,
π=np.full((2, 2), 0.5),
G=np.array([0.1, 0.2]),
Θ=np.ones(2),
transfers=False):

self.β, self.σ, self.γ = β, σ, γ


self.π, self.G, self.Θ, self.transfers = π, G, Θ, transfers

# Utility function
def U(self, c, n):
σ = self.σ
if σ == 1.:
U = np.log(c)
else:
U = (c**(1 - σ) - 1) / (1 - σ)
return U - n**(1 + self.γ) / (1 + self.γ)

# Derivatives of utility function


def Uc(self, c, n):
return c**(-self.σ)

def Ucc(self, c, n):


return -self.σ * c**(-self.σ - 1)

def Un(self, c, n):


return -n**self.γ

def Unn(self, c, n):


return -self.γ * n**(self.γ - 1)

47.4 Example Economy

We set the following parameter values.


The Markov state 𝑠𝑡 takes two values, namely, 0, 1.
The initial Markov state is 0.
The Markov transition matrix has all entries equal to .5 (it is .5 times a 2 × 2 matrix of ones), so the 𝑠𝑡 process is IID.
Government expenditures 𝑔(𝑠) equal .1 in Markov state 0 and .2 in Markov state 1.
We set preference parameters as follows:

𝛽 = .9
𝜎=2
𝛾=2
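These choices coincide with the default arguments of the CRRAutility class defined above, so the preference object for this example economy can be built, for instance, like this (a minimal sketch; the simulation code below constructs the same object with CRRAutility()):

u = CRRAutility(β=0.9,
                σ=2,
                γ=2,
                π=np.full((2, 2), 0.5),   # IID two-state Markov chain
                G=np.array([0.1, 0.2]),   # g(s) in states 0 and 1
                Θ=np.ones(2))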

Here are several classes that do most of the work for us.
The code is mostly taken or adapted from the earlier lectures optimal taxation without state-contingent debt and optimal
taxation with state-contingent debt.

import numpy as np
from scipy.optimize import root
from quantecon import MarkovChain

class SequentialAllocation:

'''
Class that takes CESutility or BGPutility object as input returns
planner's allocation as a function of the multiplier on the
implementability constraint μ.
'''

def __init__(self, model):

# Initialize from model object attributes


self.β, self.π, self.G = model.β, model.π, model.G
self.mc, self.Θ = MarkovChain(self.π), model.Θ
self.S = len(model.π) # Number of states
self.model = model

# Find the first best allocation


self.find_first_best()

def find_first_best(self):
'''
Find the first best allocation
'''
model = self.model
S, Θ, G = self.S, self.Θ, self.G
Uc, Un = model.Uc, model.Un

def res(z):
c = z[:S]
n = z[S:]
return np.hstack([Θ * Uc(c, n) + Un(c, n), Θ * n - c - G])

res = root(res, np.full(2 * S, 0.5))

if not res.success:
raise Exception('Could not find first best')

self.cFB = res.x[:S]
self.nFB = res.x[S:]

# Multiplier on the resource constraint


self.ΞFB = Uc(self.cFB, self.nFB)
self.zFB = np.hstack([self.cFB, self.nFB, self.ΞFB])

def time1_allocation(self, μ):


'''
Computes optimal allocation for time t >= 1 for a given μ
'''
model = self.model
S, Θ, G = self.S, self.Θ, self.G
Uc, Ucc, Un, Unn = model.Uc, model.Ucc, model.Un, model.Unn

def FOC(z):
c = z[:S]
n = z[S:2 * S]
Ξ = z[2 * S:]
# FOC of c
return np.hstack([Uc(c, n) - μ * (Ucc(c, n) * c + Uc(c, n)) - Ξ,
Un(c, n) - μ * (Unn(c, n) * n + Un(c, n)) \
+ Θ * Ξ, # FOC of n
Θ * n - c - G])

# Find the root of the first-order condition


res = root(FOC, self.zFB)
if not res.success:
raise Exception('Could not find LS allocation.')
z = res.x
c, n, Ξ = z[:S], z[S:2 * S], z[2 * S:]

# Compute x
I = Uc(c, n) * c + Un(c, n) * n
x = np.linalg.solve(np.eye(S) - self.β * self.π, I)

return c, n, x, Ξ

def time0_allocation(self, B_, s_0):


'''
Finds the optimal allocation given initial government debt B_ and
state s_0
'''
model, π, Θ, G, β = self.model, self.π, self.Θ, self.G, self.β
Uc, Ucc, Un, Unn = model.Uc, model.Ucc, model.Un, model.Unn

# First order conditions of planner's problem


def FOC(z):
μ, c, n, Ξ = z
xprime = self.time1_allocation(μ)[2]
return np.hstack([Uc(c, n) * (c - B_) + Un(c, n) * n + β * π[s_0]
@ xprime,
Uc(c, n) - μ * (Ucc(c, n)
* (c - B_) + Uc(c, n)) - Ξ,
Un(c, n) - μ * (Unn(c, n) * n
+ Un(c, n)) + Θ[s_0] * Ξ,
(Θ * n - c - G)[s_0]])

# Find root
res = root(FOC, np.array(
[0, self.cFB[s_0], self.nFB[s_0], self.ΞFB[s_0]]))
if not res.success:
raise Exception('Could not find time 0 LS allocation.')

return res.x

def time1_value(self, μ):

'''
Find the value associated with multiplier μ
'''
c, n, x, Ξ = self.time1_allocation(μ)
U = self.model.U(c, n)
V = np.linalg.solve(np.eye(self.S) - self.β * self.π, U)
return c, n, x, V

def Τ(self, c, n):


'''
Computes Τ given c, n
'''
model = self.model
Uc, Un = model.Uc(c, n), model.Un(c, n)

return 1 + Un / (self.Θ * Uc)

def simulate(self, B_, s_0, T, sHist=None):


'''
Simulates planners policies for T periods
'''
model, π, β = self.model, self.π, self.β
Uc = model.Uc

if sHist is None:
sHist = self.mc.simulate(T, s_0)

cHist, nHist, Bhist, ΤHist, μHist = np.zeros((5, T))


RHist = np.zeros(T - 1)

# Time 0
μ, cHist[0], nHist[0], _ = self.time0_allocation(B_, s_0)
ΤHist[0] = self.Τ(cHist[0], nHist[0])[s_0]
Bhist[0] = B_
μHist[0] = μ

# Time 1 onward
for t in range(1, T):
c, n, x, Ξ = self.time1_allocation(μ)
Τ = self.Τ(c, n)
u_c = Uc(c, n)
s = sHist[t]
Eu_c = π[sHist[t - 1]] @ u_c
cHist[t], nHist[t], Bhist[t], ΤHist[t] = c[s], n[s], x[s] / u_c[s], \
Τ[s]
RHist[t - 1] = Uc(cHist[t - 1], nHist[t - 1]) / (β * Eu_c)
μHist[t] = μ

return [cHist, nHist, Bhist, ΤHist, sHist, μHist, RHist]

import numpy as np
from scipy.optimize import fmin_slsqp
from scipy.optimize import root
from quantecon import MarkovChain

class RecursiveAllocationAMSS:

def __init__(self, model, μgrid, tol_diff=1e-7, tol=1e-7):

self.β, self.π, self.G = model.β, model.π, model.G


self.mc, self.S = MarkovChain(self.π), len(model.π) # Number of states
self.Θ, self.model, self.μgrid = model.Θ, model, μgrid
self.tol_diff, self.tol = tol_diff, tol

# Find the first best allocation


self.solve_time1_bellman()
self.T.time_0 = True # Bellman equation now solves time 0 problem

def solve_time1_bellman(self):
'''
Solve the time 1 Bellman equation for calibration model and
initial grid μgrid0
'''
model, μgrid0 = self.model, self.μgrid
π = model.π
S = len(model.π)

# First get initial fit from Lucas Stokey solution.


# Need to change things to be ex ante
pp = SequentialAllocation(model)
interp = interpolator_factory(2, None)

def incomplete_allocation(μ_, s_):


c, n, x, V = pp.time1_value(μ_)
return c, n, π[s_] @ x, π[s_] @ V
cf, nf, xgrid, Vf, xprimef = [], [], [], [], []
for s_ in range(S):
c, n, x, V = zip(*map(lambda μ: incomplete_allocation(μ, s_), μgrid0))
c, n = np.vstack(c).T, np.vstack(n).T
x, V = np.hstack(x), np.hstack(V)
xprimes = np.vstack([x] * S)
cf.append(interp(x, c))
nf.append(interp(x, n))
Vf.append(interp(x, V))
xgrid.append(x)
xprimef.append(interp(x, xprimes))
cf, nf, xprimef = fun_vstack(cf), fun_vstack(nf), fun_vstack(xprimef)
Vf = fun_hstack(Vf)
policies = [cf, nf, xprimef]

# Create xgrid
x = np.vstack(xgrid).T
xbar = [x.min(0).max(), x.max(0).min()]
xgrid = np.linspace(xbar[0], xbar[1], len(μgrid0))
self.xgrid = xgrid

# Now iterate on Bellman equation


T = BellmanEquation(model, xgrid, policies, tol=self.tol)
diff = 1
while diff > self.tol_diff:

PF = T(Vf)

Vfnew, policies = self.fit_policy_function(PF)


diff = np.abs((Vf(xgrid) - Vfnew(xgrid)) / Vf(xgrid)).max()

print(diff)
Vf = Vfnew

# Store value function policies and Bellman Equations


self.Vf = Vf
self.policies = policies
self.T = T

def fit_policy_function(self, PF):


'''
Fits the policy functions
'''
S, xgrid = len(self.π), self.xgrid
interp = interpolator_factory(3, 0)
cf, nf, xprimef, Tf, Vf = [], [], [], [], []
for s_ in range(S):
PFvec = np.vstack([PF(x, s_) for x in self.xgrid]).T
Vf.append(interp(xgrid, PFvec[0, :]))
cf.append(interp(xgrid, PFvec[1:1 + S]))
nf.append(interp(xgrid, PFvec[1 + S:1 + 2 * S]))
xprimef.append(interp(xgrid, PFvec[1 + 2 * S:1 + 3 * S]))
Tf.append(interp(xgrid, PFvec[1 + 3 * S:]))
policies = fun_vstack(cf), fun_vstack(
nf), fun_vstack(xprimef), fun_vstack(Tf)
Vf = fun_hstack(Vf)
return Vf, policies

def Τ(self, c, n):


'''
Computes Τ given c and n
'''
model = self.model
Uc, Un = model.Uc(c, n), model.Un(c, n)

return 1 + Un / (self.Θ * Uc)

def time0_allocation(self, B_, s0):


'''
Finds the optimal allocation given initial government debt B_ and
state s_0
'''
PF = self.T(self.Vf)
z0 = PF(B_, s0)
c0, n0, xprime0, T0 = z0[1:]
return c0, n0, xprime0, T0

def simulate(self, B_, s_0, T, sHist=None):


'''
Simulates planners policies for T periods
'''
model, π = self.model, self.π

Uc = model.Uc
cf, nf, xprimef, Tf = self.policies

if sHist is None:
sHist = simulate_markov(π, s_0, T)

cHist, nHist, Bhist, xHist, ΤHist, THist, μHist = np.zeros((7, T))


# Time 0
cHist[0], nHist[0], xHist[0], THist[0] = self.time0_allocation(B_, s_0)
ΤHist[0] = self.Τ(cHist[0], nHist[0])[s_0]
Bhist[0] = B_
μHist[0] = self.Vf[s_0](xHist[0])

# Time 1 onward
for t in range(1, T):
s_, x, s = sHist[t - 1], xHist[t - 1], sHist[t]
c, n, xprime, T = cf[s_, :](x), nf[s_, :](
x), xprimef[s_, :](x), Tf[s_, :](x)

Τ = self.Τ(c, n)[s]
u_c = Uc(c, n)
Eu_c = π[s_, :] @ u_c

μHist[t] = self.Vf[s](xprime[s])

cHist[t], nHist[t], Bhist[t], ΤHist[t] = c[s], n[s], x / Eu_c, Τ


xHist[t], THist[t] = xprime[s], T[s]
return [cHist, nHist, Bhist, ΤHist, THist, μHist, sHist, xHist]

class BellmanEquation:
'''
Bellman equation for the continuation of the Lucas-Stokey Problem
'''

def __init__(self, model, xgrid, policies0, tol, maxiter=1000):

self.β, self.π, self.G = model.β, model.π, model.G


self.S = len(model.π) # Number of states
self.Θ, self.model, self.tol = model.Θ, model, tol
self.maxiter = maxiter

self.xbar = [min(xgrid), max(xgrid)]


self.time_0 = False

self.z0 = {}
cf, nf, xprimef = policies0

for s_ in range(self.S):
for x in xgrid:
self.z0[x, s_] = np.hstack([cf[s_, :](x),
nf[s_, :](x),
xprimef[s_, :](x),
np.zeros(self.S)])

self.find_first_best()

def find_first_best(self):
'''
Find the first best allocation
'''
model = self.model
S, Θ, Uc, Un, G = self.S, self.Θ, model.Uc, model.Un, self.G

def res(z):
c = z[:S]
n = z[S:]
return np.hstack([Θ * Uc(c, n) + Un(c, n), Θ * n - c - G])

res = root(res, np.full(2 * S, 0.5))


if not res.success:
raise Exception('Could not find first best')

self.cFB = res.x[:S]
self.nFB = res.x[S:]
IFB = Uc(self.cFB, self.nFB) * self.cFB + \
Un(self.cFB, self.nFB) * self.nFB

self.xFB = np.linalg.solve(np.eye(S) - self.β * self.π, IFB)

self.zFB = {}
for s in range(S):
self.zFB[s] = np.hstack(
[self.cFB[s], self.nFB[s], self.π[s] @ self.xFB, 0.])

def __call__(self, Vf):


'''
Given continuation value function next period return value function this
period return T(V) and optimal policies
'''
if not self.time_0:
def PF(x, s): return self.get_policies_time1(x, s, Vf)
else:
def PF(B_, s0): return self.get_policies_time0(B_, s0, Vf)
return PF

def get_policies_time1(self, x, s_, Vf):


'''
Finds the optimal policies
'''
model, β, Θ, G, S, π = self.model, self.β, self.Θ, self.G, self.S, self.π
U, Uc, Un = model.U, model.Uc, model.Un

def objf(z):
c, n, xprime = z[:S], z[S:2 * S], z[2 * S:3 * S]

Vprime = np.empty(S)
for s in range(S):
Vprime[s] = Vf[s](xprime[s])

return -π[s_] @ (U(c, n) + β * Vprime)

def objf_prime(x):

epsilon = 1e-7
x0 = np.asfarray(x)
f0 = np.atleast_1d(objf(x0))
jac = np.zeros([len(x0), len(f0)])
dx = np.zeros(len(x0))
for i in range(len(x0)):
dx[i] = epsilon
jac[i] = (objf(x0+dx) - f0)/epsilon
dx[i] = 0.0

return jac.transpose()

def cons(z):
c, n, xprime, T = z[:S], z[S:2 * S], z[2 * S:3 * S], z[3 * S:]
u_c = Uc(c, n)
Eu_c = π[s_] @ u_c
return np.hstack([
x * u_c / Eu_c - u_c * (c - T) - Un(c, n) * n - β * xprime,
Θ * n - c - G])

if model.transfers:
bounds = [(0., 100)] * S + [(0., 100)] * S + \
[self.xbar] * S + [(0., 100.)] * S
else:
bounds = [(0., 100)] * S + [(0., 100)] * S + \
[self.xbar] * S + [(0., 0.)] * S
out, fx, _, imode, smode = fmin_slsqp(objf, self.z0[x, s_],
f_eqcons=cons, bounds=bounds,
fprime=objf_prime, full_output=True,
iprint=0, acc=self.tol, iter=self.
↪maxiter)

if imode > 0:
raise Exception(smode)

self.z0[x, s_] = out


return np.hstack([-fx, out])

def get_policies_time0(self, B_, s0, Vf):


'''
Finds the optimal policies
'''
model, β, Θ, G = self.model, self.β, self.Θ, self.G
U, Uc, Un = model.U, model.Uc, model.Un

def objf(z):
c, n, xprime = z[:-1]

return -(U(c, n) + β * Vf[s0](xprime))

def cons(z):
c, n, xprime, T = z
return np.hstack([
-Uc(c, n) * (c - B_ - T) - Un(c, n) * n - β * xprime,

(Θ * n - c - G)[s0]])

if model.transfers:
bounds = [(0., 100), (0., 100), self.xbar, (0., 100.)]
else:
bounds = [(0., 100), (0., 100), self.xbar, (0., 0.)]
out, fx, _, imode, smode = fmin_slsqp(objf, self.zFB[s0], f_eqcons=cons,
bounds=bounds, full_output=True,
iprint=0)

if imode > 0:
raise Exception(smode)

return np.hstack([-fx, out])

import numpy as np
from scipy.interpolate import UnivariateSpline

class interpolate_wrapper:

def __init__(self, F):


self.F = F

def __getitem__(self, index):


return interpolate_wrapper(np.asarray(self.F[index]))

def reshape(self, *args):


self.F = self.F.reshape(*args)
return self

def transpose(self):
self.F = self.F.transpose()

def __len__(self):
return len(self.F)

def __call__(self, xvec):


x = np.atleast_1d(xvec)
shape = self.F.shape
if len(x) == 1:
fhat = np.hstack([f(x) for f in self.F.flatten()])
return fhat.reshape(shape)
else:
fhat = np.vstack([f(x) for f in self.F.flatten()])
return fhat.reshape(np.hstack((shape, len(x))))

class interpolator_factory:

def __init__(self, k, s):


self.k, self.s = k, s

def __call__(self, xgrid, Fs):


shape, m = Fs.shape[:-1], Fs.shape[-1]
Fs = Fs.reshape((-1, m))
F = []
xgrid = np.sort(xgrid) # Sort xgrid
for Fhat in Fs:
F.append(UnivariateSpline(xgrid, Fhat, k=self.k, s=self.s))
return interpolate_wrapper(np.array(F).reshape(shape))

def fun_vstack(fun_list):

Fs = [IW.F for IW in fun_list]


return interpolate_wrapper(np.vstack(Fs))

def fun_hstack(fun_list):

Fs = [IW.F for IW in fun_list]


return interpolate_wrapper(np.hstack(Fs))

def simulate_markov(π, s_0, T):

sHist = np.empty(T, dtype=int)


sHist[0] = s_0
S = len(π)
for t in range(1, T):
sHist[t] = np.random.choice(np.arange(S), p=π[sHist[t - 1]])

return sHist

47.5 Reverse Engineering Strategy

We can reverse engineer a value 𝑏0 of initial debt due that renders the AMSS measurability constraints not binding from
time 𝑡 = 0 onward.
We accomplish this by recognizing that if the AMSS measurability constraints never bind, then the AMSS allocation
and Ramsey plan is equivalent with that for a Lucas-Stokey economy in which for each period 𝑡 ≥ 0, the government
promises to pay the same state-contingent amount 𝑏̄ in each state tomorrow.
This insight tells us to find a 𝑏0 and other fundamentals for the Lucas-Stokey [Lucas and Stokey, 1983] model that make
the Ramsey planner want to borrow the same value 𝑏̄ next period for all states and all dates.
We accomplish this by using various equations for the Lucas-Stokey [Lucas and Stokey, 1983] model presented in optimal
taxation with state-contingent debt.
We use the following steps.
Step 1: Pick an initial Φ.
Step 2: Given that Φ, jointly solve two versions of equation (47.4) for 𝑐(𝑠), 𝑠 = 1, 2 associated with the two values for
𝑔(𝑠), 𝑠 = 1, 2.
Step 3: Solve the following equation for 𝑥⃗

$$
\vec x = (I - \beta \Pi)^{-1} \left[ \vec u_c (\vec n - \vec g) - \vec u_\ell \, \vec n \right]
\qquad (47.6)
$$

Step 4: After solving for 𝑥⃗, we can find 𝑏(𝑠𝑡 |𝑠𝑡−1 ) in Markov state 𝑠𝑡 = 𝑠 from 𝑏(𝑠) = 𝑥(𝑠)/𝑢𝑐 (𝑠) or the matrix equation

$$
\vec b = \frac{\vec x}{\vec u_c}
\qquad (47.7)
$$

Step 5: Compute 𝐽 (Φ) = (𝑏(1) − 𝑏(2))².

Step 6: Put steps 2 through 5 in a function minimizer and find a Φ that minimizes 𝐽 (Φ).
Step 7: At the value of Φ and the value of 𝑏̄ that emerged from step 6, solve equations (47.5) and (47.3) jointly for 𝑐0 , 𝑏0 .

47.6 Code for Reverse Engineering

Here is code to do the calculations for us.

u = CRRAutility()

def min_Φ(Φ):

g1, g2 = u.G # Government spending in s=0 and s=1

# Solve Φ(c)
def equations(unknowns, Φ):
c1, c2 = unknowns
# First argument of .Uc and second argument of .Un are redundant

# Set up simultaneous equations


eq = lambda c, g: (1 + Φ) * (u.Uc(c, 1) - -u.Un(1, c + g)) + \
Φ * ((c + g) * u.Unn(1, c + g) + c * u.Ucc(c, 1))

# Return equation evaluated at s=1 and s=2


return np.array([eq(c1, g1), eq(c2, g2)]).flatten()

global c1 # Update c1 globally


global c2 # Update c2 globally

c1, c2 = fsolve(equations, np.ones(2), args=(Φ))

uc = u.Uc(np.array([c1, c2]), 1) # uc(n - g)


# ul(n) = -un(c + g)
ul = -u.Un(1, np.array([c1 + g1, c2 + g2])) * [c1 + g1, c2 + g2]
# Solve for x
x = np.linalg.solve(np.eye((2)) - u.β * u.π, uc * [c1, c2] - ul)

global b # Update b globally


b = x / uc
loss = (b[0] - b[1])**2

return loss

Φ_star = fmin(min_Φ, .1, ftol=1e-14)

Optimization terminated successfully.


Current function value: 0.000000
Iterations: 24
Function evaluations: 48

To recover and print out 𝑏̄

b_bar = b[0]
b_bar

-1.0757576567504166

To complete the reverse engineering exercise by jointly determining 𝑐0 , 𝑏0 , we set up a function that returns two simultaneous equations.

def solve_cb(unknowns, Φ, b_bar, s=1):

c0, b0 = unknowns

g0 = u.G[s-1]

R_0 = u.β * u.π[s] @ [u.Uc(c1, 1) / u.Uc(c0, 1), u.Uc(c2, 1) / u.Uc(c0, 1)]


R_0 = 1 / R_0

τ_0 = 1 + u.Un(1, c0 + g0) / u.Uc(c0, 1)

eq1 = τ_0 * (c0 + g0) + b_bar / R_0 - b0 - g0


eq2 = (1 + Φ) * (u.Uc(c0, 1) + u.Un(1, c0 + g0)) \
+ Φ * (c0 * u.Ucc(c0, 1) + (c0 + g0) * u.Unn(1, c0 + g0)) \
- Φ * u.Ucc(c0, 1) * b0

return np.array([eq1, eq2.item()], dtype='float64')

To solve the equations for 𝑐0 , 𝑏0 , we use SciPy’s fsolve function

c0, b0 = fsolve(solve_cb, np.array([1., -1.], dtype='float64'),


args=(Φ_star, b[0], 1), xtol=1.0e-12)
c0, b0

(0.9344994030900681, -1.0386984075517638)

Thus, we have reverse engineered an initial 𝑏0 = −1.038698407551764 that ought to render the AMSS measurability
constraints slack.
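As a quick sanity check (not part of the lecture's original calculations), we can plug the reverse-engineered pair back into solve_cb and confirm that both time-0 equations hold to numerical precision:

# Residuals of equations (47.5) and (47.3) at the reverse-engineered (c0, b0);
# both entries should be numerically zero
print(solve_cb(np.array([c0, b0]), Φ_star, b_bar, s=1))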

47.7 Short Simulation for Reverse-engineered Initial Debt

The following graph shows simulations of outcomes for both a Lucas-Stokey economy and for an AMSS economy starting
from initial government debt equal to 𝑏0 = −1.038698407551764.
These graphs report outcomes for both the Lucas-Stokey economy with complete markets and the AMSS economy with
one-period risk-free debt only.

μ_grid = np.linspace(-0.09, 0.1, 100)

log_example = CRRAutility()

log_example.transfers = True # Government can use transfers


log_sequential = SequentialAllocation(log_example) # Solve sequential problem

log_bellman = RecursiveAllocationAMSS(log_example, μ_grid,
tol_diff=1e-10, tol=1e-10)

T = 20
sHist = np.array([0, 0, 0, 0, 0, 0, 0, 0, 1, 1,
0, 0, 0, 1, 1, 1, 1, 1, 1, 0])

sim_seq = log_sequential.simulate(-1.03869841, 0, T, sHist)


sim_bel = log_bellman.simulate(-1.03869841, 0, T, sHist)

titles = ['Consumption', 'Labor Supply', 'Government Debt',


'Tax Rate', 'Government Spending', 'Output']

# Government spending paths


sim_seq[4] = log_example.G[sHist]
sim_bel[4] = log_example.G[sHist]

# Output paths
sim_seq[5] = log_example.Θ[sHist] * sim_seq[1]
sim_bel[5] = log_example.Θ[sHist] * sim_bel[1]

fig, axes = plt.subplots(3, 2, figsize=(14, 10))

for ax, title, seq, bel in zip(axes.flatten(), titles, sim_seq, sim_bel):


ax.plot(seq, '-ok', bel, '-^b')
ax.set(title=title)
ax.grid()

axes[0, 0].legend(('Complete Markets', 'Incomplete Markets'))


plt.tight_layout()
plt.show()

/home/runner/miniconda3/envs/quantecon/lib/python3.12/site-packages/scipy/optimize/
↪_slsqp_py.py:437: RuntimeWarning: Values in x were outside bounds during a␣

↪minimize step, clipping to bounds

fx = wrapped_fun(x)
/home/runner/miniconda3/envs/quantecon/lib/python3.12/site-packages/scipy/optimize/
↪_slsqp_py.py:441: RuntimeWarning: Values in x were outside bounds during a␣

↪minimize step, clipping to bounds

g = append(wrapped_grad(x), 0.0)
/home/runner/miniconda3/envs/quantecon/lib/python3.12/site-packages/scipy/optimize/
↪_slsqp_py.py:495: RuntimeWarning: Values in x were outside bounds during a␣

↪minimize step, clipping to bounds

a_eq = vstack([con['jac'](x, *con['args'])

/tmp/ipykernel_6773/108196118.py:24: RuntimeWarning: divide by zero encountered in␣


↪reciprocal

U = (c**(1 - σ) - 1) / (1 - σ)
/tmp/ipykernel_6773/108196118.py:29: RuntimeWarning: divide by zero encountered in␣
↪power

return c**(-self.σ)
/tmp/ipykernel_6773/1277371586.py:249: RuntimeWarning: invalid value encountered␣
↪in divide

x * u_c / Eu_c - u_c * (c - T) - Un(c, n) * n - β * xprime,


/tmp/ipykernel_6773/1277371586.py:249: RuntimeWarning: invalid value encountered␣
↪in multiply

x * u_c / Eu_c - u_c * (c - T) - Un(c, n) * n - β * xprime,

0.04094445433232542

0.001673211146137493

0.001484674847917127

0.001313772136887205

0.0011814037130420663

0.001055965336102068

0.0009446661649946108

0.0008463807319492324

0.0007560453788611131

0.0006756001033938903

0.000604152845540819

0.0005396004518747859

0.00048207169166290613

0.00043082732064067867

0.00038481851351225495

0.000343835217593145

0.0003072436935049677

0.0002745009146233244

0.00024531773293589513

0.00021923324298642947

0.00019593539310787213

0.00017514303481690137

0.0001565593985003591

0.00013996737081815812

0.00012514457789841946

0.00011190070823325749

0.0001000702000922041

8.949728534363834e-05

8.00497532414663e-05

7.160585250570457e-05

6.405840591557493e-05

5.731160522780524e-05

5.1279701373366633e-05

4.588651722582404e-05

4.106390497232627e-05

3.6750969979187823e-05

3.289357328148953e-05

2.9443322731171715e-05

2.6356778254647064e-05

2.3595477005441402e-05

2.1124867549068547e-05

1.8914292342161616e-05

1.6935989661294087e-05

1.5165570482803087e-05

1.3581075188566359e-05

1.2162766163347089e-05

1.0893227516817513e-05

9.756678182519297e-06

8.739234428152772e-06

7.828320614508025e-06

7.012602839408298e-06

6.2821988113865695e-06

5.628118884533389e-06

5.0424276120745635e-06

4.517800318375349e-06

4.048011435284343e-06

3.6271819852132397e-06

3.250228025571809e-06

2.91255521672949e-06

2.6100632205124585e-06

2.339096372677708e-06

2.096300057053759e-06

1.8787856014677842e-06

1.6838896002658147e-06

1.5092763000475938e-06

1.352790440377663e-06

1.2125870135921682e-06

1.0869367592654264e-06

9.74329344948381e-07

8.734258726613521e-07
7.82979401245993e-07

7.019280421759928e-07

6.292786681149374e-07

5.641636376342722e-07

5.058008139530142e-07

4.5348427330256424e-07

4.0659062310367744e-07

3.6455314441729855e-07

3.2687002299145745e-07
2.930882045255147e-07

2.6280345786809706e-07
2.356529429295176e-07

2.1131168850248635e-07
1.8948851788438695e-07

1.6992245629426705e-07

1.5237965358488245e-07
1.3665054480740185e-07

1.2254729288266142e-07
1.0990157880047098e-07

9.85625196806722e-08

8.839490296454315e-08

7.927751099544721e-08

7.110169892267009e-08

6.377012234144897e-08

5.719543299951795e-08

5.129944108294742e-08
4.6011930465755267e-08

4.127024907212617e-08

3.7017901411273995e-08

3.320421136675924e-08

2.9783836454122435e-08

2.6716185879207155e-08

2.3964828404060055e-08

2.1497111441656643e-08
1.928376711102591e-08

1.7298534286134342e-08
1.5517887041510468e-08

1.3920711115842077e-08

1.2488086772484325e-08
1.120303914946054e-08

1.0050349805051883e-08
9.016372957223345e-09

8.088867717275256e-09
7.256860052028448e-09

6.5105080491085e-09
5.8409842196277625e-09

5.240371187393206e-09

4.701571286205833e-09
4.2182149401635156e-09

3.784594252430241e-09

3.3955835551064364e-09
3.0465910785331343e-09

2.7334965385949916e-09
2.4526029798499404e-09

2.2005967896788517e-09
1.9745023230252437e-09

1.7716540861495694e-09

1.5896779606666392e-09

1.4263644656786832e-09

1.279915801041798e-09
1.1484611488603225e-09

1.0305702313922867e-09

9.247647878021015e-10
8.298468061604299e-10

7.446744286173443e-10

6.682506157688693e-10
5.996765544062293e-10

5.381420956749845e-10

4.829271458904042e-10

4.3337871811544764e-10
3.8891892933983235e-10

3.4902066124392655e-10

3.1321799130111273e-10

2.8109002457092086e-10

2.5225950288597284e-10
2.263868938948011e-10

2.0316830484184638e-10

1.8233409175417047e-10
1.6363582056463494e-10

1.4685617665861112e-10
1.3179940303096093e-10

1.1828486777347211e-10

1.0615888599012755e-10
9.527490070407684e-11

The Ramsey allocations and Ramsey outcomes are identical for the Lucas-Stokey and AMSS economies.
This outcome confirms the success of our reverse-engineering exercises.
Notice how for 𝑡 ≥ 1, the tax rate is a constant - so is the par value of government debt.
However, output and labor supply are both nontrivial time-invariant functions of the Markov state.

47.8 Long Simulation

The following graph shows the par value of government debt and the flat-rate tax on labor income for a long simulation
for our sample economy.
For the same realization of a government expenditure path, the graph reports outcomes for two economies
• the gray lines are for the Lucas-Stokey economy with complete markets
• the blue lines are for the AMSS economy with risk-free one-period debt only
For both economies, initial government debt due at time 0 is 𝑏0 = .5.
For the Lucas-Stokey complete markets economy, the government debt plotted is 𝑏𝑡+1 (𝑠𝑡+1 ).
• Notice that this is a time-invariant function of the Markov state from the beginning.
For the AMSS incomplete markets economy, the government debt plotted is 𝑏𝑡+1 (𝑠𝑡 ).
• Notice that this is a martingale-like random process that eventually seems to converge to a constant 𝑏̄ ≈ −1.07.
• Notice that the limiting value 𝑏̄ < 0 so that asymptotically the government makes a constant level of risk-free loans
to the public.

• In the simulation displayed, as well as in other simulations we have run, the par value of government debt converges to about −1.07 after between 1400 and 2000 periods.
For the AMSS incomplete markets economy, the marginal tax rate on labor income 𝜏𝑡 converges to a constant
• labor supply and output each converge to time-invariant functions of the Markov state

T = 2000 # Set T to 2000 periods

sim_seq_long = log_sequential.simulate(0.5, 0, T)
sHist_long = sim_seq_long[-3]
sim_bel_long = log_bellman.simulate(0.5, 0, T, sHist_long)

titles = ['Government Debt', 'Tax Rate']

fig, axes = plt.subplots(2, 1, figsize=(14, 10))

for ax, title, id in zip(axes.flatten(), titles, [2, 3]):


ax.plot(sim_seq_long[id], '-k', sim_bel_long[id], '-.b', alpha=0.5)
ax.set(title=title)
ax.grid()

axes[0].legend(('Complete Markets', 'Incomplete Markets'))


plt.tight_layout()
plt.show()

47.8.1 Remarks about Long Simulation

As remarked above, after 𝑏𝑡+1 (𝑠𝑡 ) has converged to a constant, the measurability constraints in the AMSS model cease
to bind
• the associated Lagrange multipliers on those implementability constraints converge to zero
This leads us to seek an initial value of government debt 𝑏0 that renders the measurability constraints slack from time
𝑡 = 0 onward
• a tell-tale sign of this situation is that the Ramsey planner in a corresponding Lucas-Stokey economy would instruct
the government to issue a constant level of government debt 𝑏𝑡+1 (𝑠𝑡+1 ) across the two Markov states
We now describe how to find such an initial level of government debt.

47.9 BEGS Approximations of Limiting Debt and Convergence Rate

It is useful to link the outcome of our reverse engineering exercise to limiting approximations constructed by BEGS
[Bhandari et al., 2017].
BEGS [Bhandari et al., 2017] used a slightly different notation to represent a generalization of the AMSS model.
We’ll introduce a version of their notation so that readers can quickly relate notation that appears in their key formulas to
the notation that we have used.
BEGS work with objects 𝐵𝑡 , ℬ𝑡 , ℛ𝑡 , 𝒳𝑡 that are related to our notation by
$$
\begin{aligned}
\mathcal{R}_t &= \frac{u_{c,t}}{u_{c,t-1}} R_{t-1} = \frac{u_{c,t}}{\beta E_{t-1} u_{c,t}} \\
B_t &= \frac{b_{t+1}(s^t)}{R_t(s^t)} \\
b_t(s^{t-1}) &= \mathcal{R}_{t-1} B_{t-1} \\
\mathcal{B}_t &= u_{c,t} B_t = (\beta E_t u_{c,t+1})\, b_{t+1}(s^t) \\
\mathcal{X}_t &= u_{c,t} [g_t - \tau_t n_t]
\end{aligned}
$$

In terms of their notation, equation (44) of [Bhandari et al., 2017] expresses the time 𝑡 state 𝑠 government budget constraint as

ℬ(𝑠) = ℛ𝜏 (𝑠, 𝑠− )ℬ− + 𝒳𝜏(𝑠) (𝑠) (47.8)

where the dependence on 𝜏 is to remind us that these objects depend on the tax rate and 𝑠− is last period’s Markov state.
BEGS interpret random variations in the right side of (47.8) as a measure of fiscal risk composed of
• interest-rate-driven fluctuations in time 𝑡 effective payments due on the government portfolio, namely,
ℛ𝜏 (𝑠, 𝑠− )ℬ− , and
• fluctuations in the effective government deficit 𝒳𝑡

47.9.1 Asymptotic Mean

BEGS give conditions under which the ergodic mean of ℬ𝑡 is

$$
\mathcal{B}^* = - \frac{\mathrm{cov}^\infty(\mathcal{R}, \mathcal{X})}{\mathrm{var}^\infty(\mathcal{R})}
\qquad (47.9)
$$
where the superscript ∞ denotes a moment taken with respect to an ergodic distribution.
Formula (47.9) presents ℬ∗ as a regression coefficient of 𝒳𝑡 on ℛ𝑡 in the ergodic distribution.
This regression coefficient emerges as the minimizer for a variance-minimization problem:

ℬ∗ = argminℬ var(ℛℬ + 𝒳) (47.10)

The minimand in criterion (47.10) is the measure of fiscal risk associated with a given tax-debt policy that appears on
the right side of equation (47.8).
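To see why the minimizer of (47.10) is the regression coefficient in (47.9), expand the minimand and differentiate with respect to ℬ (a short derivation that uses only the definitions of variances and covariances):

$$
\mathrm{var}(\mathcal{R}\mathcal{B} + \mathcal{X})
= \mathcal{B}^2 \, \mathrm{var}(\mathcal{R}) + 2 \mathcal{B} \, \mathrm{cov}(\mathcal{R}, \mathcal{X}) + \mathrm{var}(\mathcal{X}),
\qquad
2 \mathcal{B} \, \mathrm{var}(\mathcal{R}) + 2 \, \mathrm{cov}(\mathcal{R}, \mathcal{X}) = 0
\;\Longrightarrow\;
\mathcal{B}^* = - \frac{\mathrm{cov}(\mathcal{R}, \mathcal{X})}{\mathrm{var}(\mathcal{R})}
$$

Evaluating the expanded minimand at ℬ∗ gives the fiscal-risk criterion 𝐽 (ℬ∗ ) that appears in equation (47.13) below.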
Expressing formula (47.9) in terms of our notation tells us that 𝑏̄ should approximately equal
$$
\hat b = \frac{\mathcal{B}^*}{\beta E_t u_{c,t+1}}
\qquad (47.11)
$$

47.9.2 Rate of Convergence

BEGS also derive the following approximation to the rate of convergence to ℬ∗ from an arbitrary initial condition.
$$
\frac{E_t (\mathcal{B}_{t+1} - \mathcal{B}^*)}{\mathcal{B}_t - \mathcal{B}^*} \approx \frac{1}{1 + \beta^2 \mathrm{var}(\mathcal{R})}
\qquad (47.12)
$$
(See the equation above equation (47) in [Bhandari et al., 2017])

47.9.3 Formulas and Code Details

For our example, we describe some code that we use to compute the steady state mean and the rate of convergence to it.
The values of 𝜋(𝑠) are 0.5, 0.5.
We can then construct 𝒳(𝑠), ℛ(𝑠), 𝑢𝑐 (𝑠) for our two states using the definitions above.
We can then construct 𝛽𝐸𝑡−1 𝑢𝑐 = 𝛽 ∑𝑠 𝑢𝑐 (𝑠)𝜋(𝑠), cov(ℛ(𝑠), 𝒳(𝑠)) and var(ℛ(𝑠)) to be plugged into formula (47.11).
We also want to compute var(𝒳).
To compute the variances and covariance, we use the following standard formulas.
Temporarily let 𝑥(𝑠), 𝑦(𝑠), 𝑠 = 1, 2 be arbitrary random variables.

Then we define

$$
\mu_x = \sum_s x(s) \pi(s)
$$

$$
\mathrm{var}(x) = \left( \sum_s x(s)^2 \pi(s) \right) - \mu_x^2
$$

$$
\mathrm{cov}(x, y) = \left( \sum_s x(s) y(s) \pi(s) \right) - \mu_x \mu_y
$$

After we compute these moments, we compute the BEGS approximation to the asymptotic mean 𝑏̂ in formula (47.11).

After that, we move on to compute ℬ∗ in formula (47.9).


We’ll also evaluate the BEGS criterion (47.8) at the limiting value ℬ∗

$$
J(\mathcal{B}^*) = \mathrm{var}(\mathcal{R}) \, (\mathcal{B}^*)^2 + 2 \mathcal{B}^* \mathrm{cov}(\mathcal{R}, \mathcal{X}) + \mathrm{var}(\mathcal{X})
\qquad (47.13)
$$

Here are some functions that we’ll use to compute key objects that we want

def mean(x):
'''Returns mean for x given initial state'''
x = np.array(x)
return x @ u.π[s]

def variance(x):
x = np.array(x)
return x**2 @ u.π[s] - mean(x)**2

def covariance(x, y):


x, y = np.array(x), np.array(y)
return x * y @ u.π[s] - mean(x) * mean(y)

Now let’s form the two random variables ℛ, 𝒳 appearing in the BEGS approximating formulas

u = CRRAutility()

s = 0
c = [0.940580824225584, 0.8943592757759343] # Vector for c
g = u.G # Vector for g
n = c + g # Vector for n (labor supply)
τ = lambda s: 1 + u.Un(1, n[s]) / u.Uc(c[s], 1)

R_s = lambda s: u.Uc(c[s], n[s]) / (u.β * (u.Uc(c[0], n[0]) * u.π[0, 0] \


+ u.Uc(c[1], n[1]) * u.π[1, 0]))
X_s = lambda s: u.Uc(c[s], n[s]) * (g[s] - τ(s) * n[s])

R = [R_s(0), R_s(1)]
X = [X_s(0), X_s(1)]

print(f"R, X = {R}, {X}")

R, X = [1.055169547122964, 1.1670526750992583], [0.06357685646224803, 0.


↪19251010100512958]

Now let’s compute the ingredients of the approximating limit and the approximating rate of convergence

bstar = -covariance(R, X) / variance(R)


div = u.β * (u.Uc(c[0], n[0]) * u.π[s, 0] + u.Uc(c[1], n[1]) * u.π[s, 1])
bhat = bstar / div
bhat

-1.0757585378303758

Print out 𝑏̂ and 𝑏̄

bhat, b_bar

(-1.0757585378303758, -1.0757576567504166)

So we have

bhat - b_bar

-8.810799592140484e-07

These outcomes show that 𝑏̂ does a remarkably good job of approximating 𝑏̄.
Next, let’s compute the BEGS fiscal criterion that 𝑏̂ is minimizing

Jmin = variance(R) * bstar**2 + 2 * bstar * covariance(R, X) + variance(X)


Jmin

-9.020562075079397e-17

This is machine zero, a verification that 𝑏̂ succeeds in minimizing the nonnegative fiscal cost criterion 𝐽 (ℬ∗ ) defined in
BEGS and in equation (47.13) above.
Let’s push our luck and compute the mean reversion speed in the formula above equation (47) in [Bhandari et al., 2017].

den2 = 1 + (u.β**2) * variance(R)


speedrever = 1/den2
print(f'Mean reversion speed = {speedrever}')

Mean reversion speed = 0.9974715478249827
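According to (47.12), expected deviations of ℬ𝑡 from ℬ∗ shrink by this factor each period, so the number of periods 𝑇 needed to reduce an initial deviation to one percent of its starting size satisfies, approximately,

$$
\left( \frac{1}{1 + \beta^2 \mathrm{var}(\mathcal{R})} \right)^{T} = 0.01
\quad \Longrightarrow \quad
T = \frac{\ln(0.01)}{\ln \left( \frac{1}{1 + \beta^2 \mathrm{var}(\mathcal{R})} \right)}
$$

which is the calculation performed in the next code cell.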

Now let’s compute the implied mean time to get to within 0.01 of the limit

ttime = np.log(.01) / np.log(speedrever)


print(f"Time to get within .01 of limit = {ttime}")

Time to get within .01 of limit = 1819.0360880098472

The slow rate of convergence and the implied time of getting within one percent of the limiting value do a good job of
approximating our long simulation above.
In a subsequent lecture we shall study an extension of the model in which the force highlighted in this lecture causes
government debt to converge to a nontrivial distribution instead of the single debt level discovered here.

CHAPTER FORTYEIGHT: FISCAL RISK AND GOVERNMENT DEBT

In addition to what’s in Anaconda, this lecture will need the following libraries:

!pip install --upgrade quantecon

48.1 Overview

This lecture studies government debt in an AMSS economy [Aiyagari et al., 2002] of the type described in Optimal
Taxation without State-Contingent Debt.
We study the behavior of government debt as time 𝑡 → +∞.
We use these techniques
• simulations
• a regression coefficient from the tail of a long simulation that allows us to verify that the asymptotic mean of
government debt solves a fiscal-risk minimization problem
• an approximation to the mean of an ergodic distribution of government debt
• an approximation to the rate of convergence to an ergodic distribution of government debt
We apply tools that are applicable to more general incomplete markets economies that are presented on pages 648 - 650
in section III.D of [Bhandari et al., 2017] (BEGS).
We study an AMSS economy [Aiyagari et al., 2002] with three Markov states driving government expenditures.
• In a previous lecture, we showed that with only two Markov states, it is possible that endogenous interest rate
fluctuations eventually can support complete markets allocations and Ramsey outcomes.
• The presence of three states prevents the full spanning that eventually prevails in the two-state example featured in
Fiscal Insurance via Fluctuating Interest Rates.
The lack of full spanning means that the ergodic distribution of the par value of government debt is nontrivial, in contrast
to the situation in Fiscal Insurance via Fluctuating Interest Rates in which the ergodic distribution of the par value of
government debt is concentrated on one point.
Nevertheless, [Bhandari et al., 2017] (BEGS) establish that, for general settings that include ours, the Ramsey planner
steers government assets to a level that comes as close as possible to providing full spanning in a precise sense defined
by BEGS that we describe below.
We use code constructed in Fluctuating Interest Rates Deliver Fiscal Insurance.
Warning: Key equations in [Bhandari et al., 2017] section III.D carry typos that we correct below.
Let’s start with some imports:


import matplotlib.pyplot as plt


from scipy.optimize import minimize

48.2 The Economy

As in Optimal Taxation without State-Contingent Debt and Optimal Taxation with State-Contingent Debt, we assume that
the representative agent has utility function

$$
u(c, n) = \frac{c^{1-\sigma}}{1-\sigma} - \frac{n^{1+\gamma}}{1+\gamma}
$$
We work directly with labor supply instead of leisure.
We assume that

𝑐𝑡 + 𝑔 𝑡 = 𝑛 𝑡

The Markov state 𝑠𝑡 takes three values, namely, 0, 1, 2.


The initial Markov state is 0.
The Markov transition matrix has all entries equal to 1/3 (it is (1/3) times a 3 × 3 matrix of ones), so the 𝑠𝑡 process is IID.
Government expenditures 𝑔(𝑠) equal .1 in Markov state 0, .2 in Markov state 1, and .3 in Markov state 2.
We set preference parameters

𝛽 = .9
𝜎=2
𝛾=2

The following Python code sets up the economy

import numpy as np

class CRRAutility:

def __init__(self,
β=0.9,
σ=2,
γ=2,
π=np.full((2, 2), 0.5),
G=np.array([0.1, 0.2]),
Θ=np.ones(2),
transfers=False):

self.β, self.σ, self.γ = β, σ, γ


self.π, self.G, self.Θ, self.transfers = π, G, Θ, transfers

# Utility function
def U(self, c, n):
σ = self.σ
if σ == 1.:
U = np.log(c)
else:
U = (c**(1 - σ) - 1) / (1 - σ)
return U - n**(1 + self.γ) / (1 + self.γ)

# Derivatives of utility function


def Uc(self, c, n):
return c**(-self.σ)

def Ucc(self, c, n):


return -self.σ * c**(-self.σ - 1)

def Un(self, c, n):


return -n**self.γ

def Unn(self, c, n):


return -self.γ * n**(self.γ - 1)
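With this class in hand, the three-state economy described above can be instantiated as follows (the same call appears in the long-simulation code later in this lecture):

log_example = CRRAutility(π=np.full((3, 3), 1 / 3),
                          G=np.array([0.1, 0.2, 0.3]),
                          Θ=np.ones(3))
log_example.transfers = True   # Government can use transfers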

48.2.1 First and Second Moments

We’ll want first and second moments of some key random variables below.
The following code computes these moments; the code is recycled from Fluctuating Interest Rates Deliver Fiscal Insurance.

def mean(x, s):


'''Returns mean for x given initial state'''
x = np.array(x)
return x @ u.π[s]

def variance(x, s):


x = np.array(x)
return x**2 @ u.π[s] - mean(x, s)**2

def covariance(x, y, s):


x, y = np.array(x), np.array(y)
return x * y @ u.π[s] - mean(x, s) * mean(y, s)

48.3 Long Simulation

To generate a long simulation we use the following code.


We begin by showing the code that we used in earlier lectures on the AMSS model.
Here it is

import numpy as np
from scipy.optimize import root
from quantecon import MarkovChain

class SequentialAllocation:

'''
Class that takes CESutility or BGPutility object as input returns
planner's allocation as a function of the multiplier on the
implementability constraint μ.
'''

def __init__(self, model):

# Initialize from model object attributes


self.β, self.π, self.G = model.β, model.π, model.G
self.mc, self.Θ = MarkovChain(self.π), model.Θ
self.S = len(model.π) # Number of states
self.model = model

# Find the first best allocation


self.find_first_best()

def find_first_best(self):
'''
Find the first best allocation
'''
model = self.model
S, Θ, G = self.S, self.Θ, self.G
Uc, Un = model.Uc, model.Un

def res(z):
c = z[:S]
n = z[S:]
return np.hstack([Θ * Uc(c, n) + Un(c, n), Θ * n - c - G])

res = root(res, np.full(2 * S, 0.5))

if not res.success:
raise Exception('Could not find first best')

self.cFB = res.x[:S]
self.nFB = res.x[S:]

# Multiplier on the resource constraint


self.ΞFB = Uc(self.cFB, self.nFB)
self.zFB = np.hstack([self.cFB, self.nFB, self.ΞFB])

def time1_allocation(self, μ):


'''
Computes optimal allocation for time t >= 1 for a given μ
'''
model = self.model
S, Θ, G = self.S, self.Θ, self.G
Uc, Ucc, Un, Unn = model.Uc, model.Ucc, model.Un, model.Unn

def FOC(z):
c = z[:S]
n = z[S:2 * S]
Ξ = z[2 * S:]
# FOC of c
return np.hstack([Uc(c, n) - μ * (Ucc(c, n) * c + Uc(c, n)) - Ξ,
Un(c, n) - μ * (Unn(c, n) * n + Un(c, n)) \

+ Θ * Ξ, # FOC of n
Θ * n - c - G])

# Find the root of the first-order condition


res = root(FOC, self.zFB)
if not res.success:
raise Exception('Could not find LS allocation.')
z = res.x
c, n, Ξ = z[:S], z[S:2 * S], z[2 * S:]

# Compute x
I = Uc(c, n) * c + Un(c, n) * n
x = np.linalg.solve(np.eye(S) - self.β * self.π, I)

return c, n, x, Ξ

def time0_allocation(self, B_, s_0):


'''
Finds the optimal allocation given initial government debt B_ and
state s_0
'''
model, π, Θ, G, β = self.model, self.π, self.Θ, self.G, self.β
Uc, Ucc, Un, Unn = model.Uc, model.Ucc, model.Un, model.Unn

# First order conditions of planner's problem


def FOC(z):
μ, c, n, Ξ = z
xprime = self.time1_allocation(μ)[2]
return np.hstack([Uc(c, n) * (c - B_) + Un(c, n) * n + β * π[s_0]
@ xprime,
Uc(c, n) - μ * (Ucc(c, n)
* (c - B_) + Uc(c, n)) - Ξ,
Un(c, n) - μ * (Unn(c, n) * n
+ Un(c, n)) + Θ[s_0] * Ξ,
(Θ * n - c - G)[s_0]])

# Find root
res = root(FOC, np.array(
[0, self.cFB[s_0], self.nFB[s_0], self.ΞFB[s_0]]))
if not res.success:
raise Exception('Could not find time 0 LS allocation.')

return res.x

def time1_value(self, μ):


'''
Find the value associated with multiplier μ
'''
c, n, x, Ξ = self.time1_allocation(μ)
U = self.model.U(c, n)
V = np.linalg.solve(np.eye(self.S) - self.β * self.π, U)
return c, n, x, V

def Τ(self, c, n):


'''
Computes Τ given c, n

'''
model = self.model
Uc, Un = model.Uc(c, n), model.Un(c, n)

return 1 + Un / (self.Θ * Uc)

def simulate(self, B_, s_0, T, sHist=None):


'''
Simulates planners policies for T periods
'''
model, π, β = self.model, self.π, self.β
Uc = model.Uc

if sHist is None:
sHist = self.mc.simulate(T, s_0)

cHist, nHist, Bhist, ΤHist, μHist = np.zeros((5, T))


RHist = np.zeros(T - 1)

# Time 0
μ, cHist[0], nHist[0], _ = self.time0_allocation(B_, s_0)
ΤHist[0] = self.Τ(cHist[0], nHist[0])[s_0]
Bhist[0] = B_
μHist[0] = μ

# Time 1 onward
for t in range(1, T):
c, n, x, Ξ = self.time1_allocation(μ)
Τ = self.Τ(c, n)
u_c = Uc(c, n)
s = sHist[t]
Eu_c = π[sHist[t - 1]] @ u_c
cHist[t], nHist[t], Bhist[t], ΤHist[t] = c[s], n[s], x[s] / u_c[s], \
Τ[s]
RHist[t - 1] = Uc(cHist[t - 1], nHist[t - 1]) / (β * Eu_c)
μHist[t] = μ

return [cHist, nHist, Bhist, ΤHist, sHist, μHist, RHist]

import numpy as np
from scipy.optimize import fmin_slsqp
from scipy.optimize import root
from quantecon import MarkovChain

class RecursiveAllocationAMSS:

def __init__(self, model, μgrid, tol_diff=1e-7, tol=1e-7):

self.β, self.π, self.G = model.β, model.π, model.G


self.mc, self.S = MarkovChain(self.π), len(model.π) # Number of states
self.Θ, self.model, self.μgrid = model.Θ, model, μgrid
self.tol_diff, self.tol = tol_diff, tol

# Find the first best allocation


self.solve_time1_bellman()
self.T.time_0 = True # Bellman equation now solves time 0 problem

def solve_time1_bellman(self):
'''
Solve the time 1 Bellman equation for calibration model and
initial grid μgrid0
'''
model, μgrid0 = self.model, self.μgrid
π = model.π
S = len(model.π)

# First get initial fit from Lucas Stokey solution.


# Need to change things to be ex ante
pp = SequentialAllocation(model)
interp = interpolator_factory(2, None)

def incomplete_allocation(μ_, s_):


c, n, x, V = pp.time1_value(μ_)
return c, n, π[s_] @ x, π[s_] @ V
cf, nf, xgrid, Vf, xprimef = [], [], [], [], []
for s_ in range(S):
c, n, x, V = zip(*map(lambda μ: incomplete_allocation(μ, s_), μgrid0))
c, n = np.vstack(c).T, np.vstack(n).T
x, V = np.hstack(x), np.hstack(V)
xprimes = np.vstack([x] * S)
cf.append(interp(x, c))
nf.append(interp(x, n))
Vf.append(interp(x, V))
xgrid.append(x)
xprimef.append(interp(x, xprimes))
cf, nf, xprimef = fun_vstack(cf), fun_vstack(nf), fun_vstack(xprimef)
Vf = fun_hstack(Vf)
policies = [cf, nf, xprimef]

# Create xgrid
x = np.vstack(xgrid).T
xbar = [x.min(0).max(), x.max(0).min()]
xgrid = np.linspace(xbar[0], xbar[1], len(μgrid0))
self.xgrid = xgrid

# Now iterate on Bellman equation


T = BellmanEquation(model, xgrid, policies, tol=self.tol)
diff = 1
while diff > self.tol_diff:
PF = T(Vf)

Vfnew, policies = self.fit_policy_function(PF)


diff = np.abs((Vf(xgrid) - Vfnew(xgrid)) / Vf(xgrid)).max()

print(diff)
Vf = Vfnew

# Store value function policies and Bellman Equations


self.Vf = Vf
self.policies = policies

self.T = T

def fit_policy_function(self, PF):


'''
Fits the policy functions
'''
S, xgrid = len(self.π), self.xgrid
interp = interpolator_factory(3, 0)
cf, nf, xprimef, Tf, Vf = [], [], [], [], []
for s_ in range(S):
PFvec = np.vstack([PF(x, s_) for x in self.xgrid]).T
Vf.append(interp(xgrid, PFvec[0, :]))
cf.append(interp(xgrid, PFvec[1:1 + S]))
nf.append(interp(xgrid, PFvec[1 + S:1 + 2 * S]))
xprimef.append(interp(xgrid, PFvec[1 + 2 * S:1 + 3 * S]))
Tf.append(interp(xgrid, PFvec[1 + 3 * S:]))
policies = fun_vstack(cf), fun_vstack(
nf), fun_vstack(xprimef), fun_vstack(Tf)
Vf = fun_hstack(Vf)
return Vf, policies

def Τ(self, c, n):


'''
Computes Τ given c and n
'''
model = self.model
Uc, Un = model.Uc(c, n), model.Un(c, n)

return 1 + Un / (self.Θ * Uc)

def time0_allocation(self, B_, s0):


'''
Finds the optimal allocation given initial government debt B_ and
state s_0
'''
PF = self.T(self.Vf)
z0 = PF(B_, s0)
c0, n0, xprime0, T0 = z0[1:]
return c0, n0, xprime0, T0

def simulate(self, B_, s_0, T, sHist=None):


'''
Simulates planners policies for T periods
'''
model, π = self.model, self.π
Uc = model.Uc
cf, nf, xprimef, Tf = self.policies

if sHist is None:
sHist = simulate_markov(π, s_0, T)

cHist, nHist, Bhist, xHist, ΤHist, THist, μHist = np.zeros((7, T))


# Time 0
cHist[0], nHist[0], xHist[0], THist[0] = self.time0_allocation(B_, s_0)
ΤHist[0] = self.Τ(cHist[0], nHist[0])[s_0]
Bhist[0] = B_

μHist[0] = self.Vf[s_0](xHist[0])

# Time 1 onward
for t in range(1, T):
s_, x, s = sHist[t - 1], xHist[t - 1], sHist[t]
c, n, xprime, T = cf[s_, :](x), nf[s_, :](
x), xprimef[s_, :](x), Tf[s_, :](x)

Τ = self.Τ(c, n)[s]
u_c = Uc(c, n)
Eu_c = π[s_, :] @ u_c

μHist[t] = self.Vf[s](xprime[s])

cHist[t], nHist[t], Bhist[t], ΤHist[t] = c[s], n[s], x / Eu_c, Τ


xHist[t], THist[t] = xprime[s], T[s]
return [cHist, nHist, Bhist, ΤHist, THist, μHist, sHist, xHist]

class BellmanEquation:
'''
Bellman equation for the continuation of the Lucas-Stokey Problem
'''

def __init__(self, model, xgrid, policies0, tol, maxiter=1000):

self.β, self.π, self.G = model.β, model.π, model.G


self.S = len(model.π) # Number of states
self.Θ, self.model, self.tol = model.Θ, model, tol
self.maxiter = maxiter

self.xbar = [min(xgrid), max(xgrid)]


self.time_0 = False

self.z0 = {}
cf, nf, xprimef = policies0

for s_ in range(self.S):
for x in xgrid:
self.z0[x, s_] = np.hstack([cf[s_, :](x),
nf[s_, :](x),
xprimef[s_, :](x),
np.zeros(self.S)])

self.find_first_best()

def find_first_best(self):
'''
Find the first best allocation
'''
model = self.model
S, Θ, Uc, Un, G = self.S, self.Θ, model.Uc, model.Un, self.G

def res(z):
c = z[:S]
n = z[S:]

return np.hstack([Θ * Uc(c, n) + Un(c, n), Θ * n - c - G])

res = root(res, np.full(2 * S, 0.5))


if not res.success:
raise Exception('Could not find first best')

self.cFB = res.x[:S]
self.nFB = res.x[S:]
IFB = Uc(self.cFB, self.nFB) * self.cFB + \
Un(self.cFB, self.nFB) * self.nFB

self.xFB = np.linalg.solve(np.eye(S) - self.β * self.π, IFB)

self.zFB = {}
for s in range(S):
self.zFB[s] = np.hstack(
[self.cFB[s], self.nFB[s], self.π[s] @ self.xFB, 0.])

def __call__(self, Vf):


'''
Given continuation value function next period return value function this
period return T(V) and optimal policies
'''
if not self.time_0:
def PF(x, s): return self.get_policies_time1(x, s, Vf)
else:
def PF(B_, s0): return self.get_policies_time0(B_, s0, Vf)
return PF

def get_policies_time1(self, x, s_, Vf):


'''
Finds the optimal policies
'''
model, β, Θ, G, S, π = self.model, self.β, self.Θ, self.G, self.S, self.π
U, Uc, Un = model.U, model.Uc, model.Un

def objf(z):
c, n, xprime = z[:S], z[S:2 * S], z[2 * S:3 * S]

Vprime = np.empty(S)
for s in range(S):
Vprime[s] = Vf[s](xprime[s])

return -π[s_] @ (U(c, n) + β * Vprime)

def objf_prime(x):

epsilon = 1e-7
x0 = np.asfarray(x)
f0 = np.atleast_1d(objf(x0))
jac = np.zeros([len(x0), len(f0)])
dx = np.zeros(len(x0))
for i in range(len(x0)):
dx[i] = epsilon
jac[i] = (objf(x0+dx) - f0)/epsilon
dx[i] = 0.0

return jac.transpose()

def cons(z):
c, n, xprime, T = z[:S], z[S:2 * S], z[2 * S:3 * S], z[3 * S:]
u_c = Uc(c, n)
Eu_c = π[s_] @ u_c
return np.hstack([
x * u_c / Eu_c - u_c * (c - T) - Un(c, n) * n - β * xprime,
Θ * n - c - G])

if model.transfers:
bounds = [(0., 100)] * S + [(0., 100)] * S + \
[self.xbar] * S + [(0., 100.)] * S
else:
bounds = [(0., 100)] * S + [(0., 100)] * S + \
[self.xbar] * S + [(0., 0.)] * S
out, fx, _, imode, smode = fmin_slsqp(objf, self.z0[x, s_],
                                      f_eqcons=cons, bounds=bounds,
                                      fprime=objf_prime, full_output=True,
                                      iprint=0, acc=self.tol,
                                      iter=self.maxiter)

if imode > 0:
raise Exception(smode)

self.z0[x, s_] = out


return np.hstack([-fx, out])

def get_policies_time0(self, B_, s0, Vf):


'''
Finds the optimal policies
'''
model, β, Θ, G = self.model, self.β, self.Θ, self.G
U, Uc, Un = model.U, model.Uc, model.Un

def objf(z):
c, n, xprime = z[:-1]

return -(U(c, n) + β * Vf[s0](xprime))

def cons(z):
c, n, xprime, T = z
return np.hstack([
-Uc(c, n) * (c - B_ - T) - Un(c, n) * n - β * xprime,
(Θ * n - c - G)[s0]])

if model.transfers:
bounds = [(0., 100), (0., 100), self.xbar, (0., 100.)]
else:
bounds = [(0., 100), (0., 100), self.xbar, (0., 0.)]
out, fx, _, imode, smode = fmin_slsqp(objf, self.zFB[s0], f_eqcons=cons,
bounds=bounds, full_output=True,
iprint=0)

if imode > 0:

raise Exception(smode)

return np.hstack([-fx, out])

import numpy as np
from scipy.interpolate import UnivariateSpline

class interpolate_wrapper:

def __init__(self, F):


self.F = F

def __getitem__(self, index):


return interpolate_wrapper(np.asarray(self.F[index]))

def reshape(self, *args):


self.F = self.F.reshape(*args)
return self

def transpose(self):
self.F = self.F.transpose()

def __len__(self):
return len(self.F)

def __call__(self, xvec):


x = np.atleast_1d(xvec)
shape = self.F.shape
if len(x) == 1:
fhat = np.hstack([f(x) for f in self.F.flatten()])
return fhat.reshape(shape)
else:
fhat = np.vstack([f(x) for f in self.F.flatten()])
return fhat.reshape(np.hstack((shape, len(x))))

class interpolator_factory:

def __init__(self, k, s):


self.k, self.s = k, s

def __call__(self, xgrid, Fs):


shape, m = Fs.shape[:-1], Fs.shape[-1]
Fs = Fs.reshape((-1, m))
F = []
xgrid = np.sort(xgrid) # Sort xgrid
for Fhat in Fs:
F.append(UnivariateSpline(xgrid, Fhat, k=self.k, s=self.s))
return interpolate_wrapper(np.array(F).reshape(shape))

def fun_vstack(fun_list):

Fs = [IW.F for IW in fun_list]


return interpolate_wrapper(np.vstack(Fs))

def fun_hstack(fun_list):

Fs = [IW.F for IW in fun_list]


return interpolate_wrapper(np.hstack(Fs))

def simulate_markov(π, s_0, T):

sHist = np.empty(T, dtype=int)


sHist[0] = s_0
S = len(π)
for t in range(1, T):
sHist[t] = np.random.choice(np.arange(S), p=π[sHist[t - 1]])

return sHist
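As a quick check of the last helper, here is a minimal usage sketch of simulate_markov; the 3-state IID transition matrix below is purely illustrative and is not part of the lecture's own code:

π_demo = np.full((3, 3), 1 / 3)        # illustrative IID transition matrix
sHist_demo = simulate_markov(π_demo, 0, 10)
print(sHist_demo)                      # ten states drawn from {0, 1, 2}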

Next, we show the code that we use to generate a very long simulation starting from initial government debt equal to 0.5.
Here is a graph of a long simulation of 102000 periods.

μ_grid = np.linspace(-0.09, 0.1, 100)

log_example = CRRAutility(π=np.full((3, 3), 1 / 3),
                          G=np.array([0.1, 0.2, .3]),
                          Θ=np.ones(3))

log_example.transfers = True    # Government can use transfers

log_sequential = SequentialAllocation(log_example)   # Solve sequential problem
log_bellman = RecursiveAllocationAMSS(log_example, μ_grid,
                                      tol=1e-12, tol_diff=1e-10)

T = 102000 # Set T to 102000 periods

sim_seq_long = log_sequential.simulate(0.5, 0, T)
sHist_long = sim_seq_long[-3]
sim_bel_long = log_bellman.simulate(0.5, 0, T, sHist_long)

titles = ['Government Debt', 'Tax Rate']

fig, axes = plt.subplots(2, 1, figsize=(10, 8))

for ax, title, id in zip(axes.flatten(), titles, [2, 3]):
    ax.plot(sim_seq_long[id], '-k', sim_bel_long[id], '-.b', alpha=0.5)
    ax.set(title=title)
    ax.grid()

axes[0].legend(('Complete Markets', 'Incomplete Markets'))


plt.tight_layout()
plt.show()

/home/runner/miniconda3/envs/quantecon/lib/python3.12/site-packages/scipy/optimize/_slsqp_py.py:437: RuntimeWarning: Values in x were outside bounds during a minimize step, clipping to bounds
  fx = wrapped_fun(x)
/home/runner/miniconda3/envs/quantecon/lib/python3.12/site-packages/scipy/optimize/_slsqp_py.py:441: RuntimeWarning: Values in x were outside bounds during a minimize step, clipping to bounds
  g = append(wrapped_grad(x), 0.0)
/home/runner/miniconda3/envs/quantecon/lib/python3.12/site-packages/scipy/optimize/_slsqp_py.py:495: RuntimeWarning: Values in x were outside bounds during a minimize step, clipping to bounds
  a_eq = vstack([con['jac'](x, *con['args'])
/tmp/ipykernel_6826/108196118.py:24: RuntimeWarning: divide by zero encountered in reciprocal
  U = (c**(1 - σ) - 1) / (1 - σ)
/tmp/ipykernel_6826/108196118.py:29: RuntimeWarning: divide by zero encountered in power
  return c**(-self.σ)
/tmp/ipykernel_6826/1277371586.py:249: RuntimeWarning: invalid value encountered in divide
  x * u_c / Eu_c - u_c * (c - T) - Un(c, n) * n - β * xprime,
/tmp/ipykernel_6826/1277371586.py:249: RuntimeWarning: invalid value encountered in multiply
  x * u_c / Eu_c - u_c * (c - T) - Un(c, n) * n - β * xprime,

0.038266353387659546
0.0015144378246632448
0.0013387575049931865
...
9.047094182757221e-11

(The solver prints its convergence criterion at every iteration; the sequence falls monotonically from about 3.8e-2 to about 9.0e-11, and the intermediate values are omitted here.)

The long simulation apparently indicates eventual convergence to an ergodic distribution.


It takes about 1000 periods to reach the ergodic distribution – an outcome that is forecast by approximations to rates of
convergence that appear in BEGS [Bhandari et al., 2017] and that we discuss in Fluctuating Interest Rates Deliver Fiscal
Insurance.
Let’s discard the first 2000 observations of the simulation and construct the histogram of the par value of government
debt.
We obtain the following graph for the histogram of the last 100,000 observations on the par value of government debt.
The black vertical line denotes the sample mean for the last 100,000 observations included in the histogram; the green vertical line denotes the value of $\frac{\mathcal{B}^*}{E u_c}$, associated with a sample from our approximation to the ergodic distribution, where $\mathcal{B}^*$ is a regression coefficient to be described below; the red vertical line denotes an approximation by [Bhandari et al., 2017] to the mean of the ergodic distribution that can be computed before the ergodic distribution has been approximated, as described below.
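Since the rendered histogram does not survive in this text, here is a minimal plotting sketch, not the lecture's own figure code, of how such a histogram can be produced from the simulated series; the series index 2 for government debt follows the plotting code used elsewhere in this lecture, and only the black sample-mean line is drawn (the green and red lines use objects computed in the next section):

bel_debt = np.array(sim_bel_long[2])[2000:]    # drop the first 2,000 observations
fig, ax = plt.subplots(figsize=(8, 5))
ax.hist(bel_debt, bins=50, alpha=0.7)
ax.axvline(bel_debt.mean(), color='black')     # sample mean of the par value of debt
ax.set(title='Histogram of the par value of government debt')
plt.show()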
Before moving on to discuss the histogram and the vertical lines approximating the ergodic mean of government debt in
more detail, the following graphs show government debt and taxes early in the simulation, for periods 1-100 and 101 to
200 respectively.

titles = ['Government Debt', 'Tax Rate']

fig, axes = plt.subplots(4, 1, figsize=(10, 15))


for i, id in enumerate([2, 3]):
    axes[i].plot(sim_seq_long[id][:99], '-k', sim_bel_long[id][:99],
                 '-.b', alpha=0.5)
    axes[i+2].plot(range(100, 199), sim_seq_long[id][100:199], '-k',
                   range(100, 199), sim_bel_long[id][100:199], '-.b',
                   alpha=0.5)
    axes[i].set(title=titles[i])
    axes[i+2].set(title=titles[i])
    axes[i].grid()
    axes[i+2].grid()

axes[0].legend(('Complete Markets', 'Incomplete Markets'))


plt.tight_layout()
plt.show()


For the short samples early in our simulated sample of 102,000 observations, fluctuations in government debt and the tax rate conceal the weak but inexorable force that the Ramsey planner puts into both series, driving them toward ergodic marginal distributions that are far from these early observations:
• early observations are more influenced by the initial value of the par value of government debt than by the ergodic
mean of the par value of government debt
• much later observations are more influenced by the ergodic mean and are independent of the par value of initial
government debt

48.4 Asymptotic Mean and Rate of Convergence

We apply the results of BEGS [Bhandari et al., 2017] to interpret


• the mean of the ergodic distribution of government debt
• the rate of convergence to the ergodic distribution from an arbitrary initial government debt
We begin by computing objects required by the theory of section III.i of BEGS [Bhandari et al., 2017].
As in Fiscal Insurance via Fluctuating Interest Rates, we recall that BEGS [Bhandari et al., 2017] used a particular notation
to represent what we can regard as their generalization of an AMSS model.
We introduce some of the [Bhandari et al., 2017] notation so that readers can quickly relate notation that appears in key
BEGS formulas to the notation that we have used in previous lectures here and here.
BEGS work with objects $B_t, \mathcal{B}_t, \mathcal{R}_t, \mathcal{X}_t$ that are related to notation that we used in earlier lectures by

$$
\mathcal{R}_t = \frac{u_{c,t}}{u_{c,t-1}} R_{t-1} = \frac{u_{c,t}}{\beta E_{t-1} u_{c,t}},
\qquad
B_t = \frac{b_{t+1}(s^t)}{R_t(s^t)},
\qquad
b_t(s^{t-1}) = \mathcal{R}_{t-1} B_{t-1},
$$

$$
\mathcal{B}_t = u_{c,t} B_t = (\beta E_t u_{c,t+1})\, b_{t+1}(s^t),
\qquad
\mathcal{X}_t = u_{c,t} [g_t - \tau_t n_t]
$$

BEGS [Bhandari et al., 2017] call 𝒳𝑡 the effective government deficit and ℬ𝑡 the effective government debt.
Equation (44) of [Bhandari et al., 2017] expresses the time 𝑡 state 𝑠 government budget constraint as

$$
\mathcal{B}(s) = \mathcal{R}_\tau(s, s_-)\mathcal{B}_- + \mathcal{X}_\tau(s) \tag{48.1}
$$

where the dependence on 𝜏 is meant to remind us that these objects depend on the tax rate; 𝑠− is last period’s Markov
state.
BEGS interpret random variations in the right side of (48.1) as fiscal risks generated by
• interest-rate-driven fluctuations in time 𝑡 effective payments due on the government portfolio, namely,
ℛ𝜏 (𝑠, 𝑠− )ℬ− , and
• fluctuations in the effective government deficit 𝒳𝑡


48.4.1 Asymptotic Mean

BEGS give conditions under which the ergodic mean of ℬ𝑡 is approximated by


$$
\mathcal{B}^* = -\frac{\operatorname{cov}^{\infty}(\mathcal{R}_t, \mathcal{X}_t)}{\operatorname{var}^{\infty}(\mathcal{R}_t)} \tag{48.2}
$$
where the superscript ∞ denotes a moment taken with respect to an ergodic distribution.
Formula (48.2) represents ℬ∗ as a regression coefficient of 𝒳𝑡 on ℛ𝑡 in the ergodic distribution.
Regression coefficient ℬ∗ solves a variance-minimization problem:

$$
\mathcal{B}^* = \operatorname*{argmin}_{\mathcal{B}} \; \operatorname{var}^{\infty}(\mathcal{R}\mathcal{B} + \mathcal{X}) \tag{48.3}
$$

The minimand in criterion (48.3) measures fiscal risk associated with a given tax-debt policy that appears on the right
side of equation (48.1).
Expressing formula (48.2) in terms of our notation tells us that the ergodic mean of the par value 𝑏 of government debt
in the AMSS model should be approximately
$$
\hat b = \frac{\mathcal{B}^*}{\beta E(E_t u_{c,t+1})} = \frac{\mathcal{B}^*}{\beta E(u_{c,t+1})} \tag{48.4}
$$

where mathematical expectations are taken with respect to the ergodic distribution.
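As a check on these formulas, here is a minimal sketch, not part of the lecture's own computations, of how (48.2) and (48.4) could be evaluated from hypothetical simulated arrays R_sim, X_sim, and uc_sim drawn from an approximation to the ergodic distribution:

def begs_approximations(R_sim, X_sim, uc_sim, β):
    # ℬ* in (48.2): minus the regression coefficient of X on R
    Rbar, Xbar = R_sim.mean(), X_sim.mean()
    B_star = -np.mean((R_sim - Rbar) * (X_sim - Xbar)) / np.mean((R_sim - Rbar)**2)
    # b̂ in (48.4): rescale by β E u_{c,t+1}
    b_hat = B_star / (β * uc_sim.mean())
    return B_star, b_hat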

48.4.2 Rate of Convergence

BEGS also derive the following approximation to the rate of convergence to ℬ∗ from an arbitrary initial condition.

$$
\frac{E_t(\mathcal{B}_{t+1} - \mathcal{B}^*)}{\mathcal{B}_t - \mathcal{B}^*} \approx \frac{1}{1 + \beta^2 \operatorname{var}^{\infty}(\mathcal{R})} \tag{48.5}
$$

(See the equation above equation (47) in BEGS [Bhandari et al., 2017])
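In the same spirit, here is a one-function sketch of the approximation (48.5), again assuming a hypothetical array R_sim of effective returns drawn from the ergodic distribution:

def begs_convergence_rate(R_sim, β):
    return 1 / (1 + β**2 * np.var(R_sim))    # per-period rate in (48.5)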

48.4.3 More Advanced Topic

The remainder of this lecture is about technical material based on formulas from BEGS [Bhandari et al., 2017].
The topic involves interpreting and extending formula (48.3) for the ergodic mean ℬ∗ .

48.4.4 Chicken and Egg

Notice how attributes of the ergodic distribution for ℬ𝑡 appear on the right side of formula (48.3) for approximating the
ergodic mean via ℬ∗ .
Therefore, formula (48.3) is not useful for estimating the mean of the ergodic distribution in advance of actually approximating the ergodic distribution.
• we need to know the ergodic distribution to compute the right side of formula (48.3)
So the primary use of equation (48.3) is how it confirms that the ergodic distribution solves a fiscal-risk minimization
problem.
As an example, notice how we used the formula for the mean of ℬ in the ergodic distribution of the special AMSS
economy in Fiscal Insurance via Fluctuating Interest Rates


• first we computed the ergodic distribution using a reverse-engineering construction


• then we verified that ℬ∗ agrees with the mean of that distribution

48.4.5 Approximating the Ergodic Mean

BEGS [Bhandari et al., 2017] also propose an approximation to ℬ∗ that can be computed without first approximating the ergodic distribution.
To construct the BEGS approximation to ℬ∗ , we just follow steps set forth on pages 648 - 650 of section III.D of [Bhandari
et al., 2017]
• notation in BEGS might be confusing at first sight, so it is important to stare and digest before computing
• there are also some sign errors in the [Bhandari et al., 2017] text that we’ll want to correct here
Here is a step-by-step description of the BEGS [Bhandari et al., 2017] approximation procedure.

48.4.6 Step by Step

Step 1: For a given 𝜏 we compute a vector of values 𝑐𝜏 (𝑠), 𝑠 = 1, 2, … , 𝑆 that satisfy

$$
(1 - \tau)\, c_\tau(s)^{-\sigma} - (c_\tau(s) + g(s))^{\gamma} = 0
$$

This is a nonlinear equation to be solved for 𝑐𝜏 (𝑠), 𝑠 = 1, … , 𝑆.


𝑆 = 3 in our case, but we’ll write code for a general integer 𝑆.
Typo alert: Please note that there is a sign error in equation (42) of BEGS [Bhandari et al., 2017] – it should be a minus
rather than a plus in the middle.
• We have made the appropriate correction in the above equation.
Step 2: Knowing 𝑐𝜏 (𝑠), 𝑠 = 1, … , 𝑆 for a given 𝜏 , we want to compute the random variables

$$
\mathcal{R}_\tau(s) = \frac{c_\tau(s)^{-\sigma}}{\beta \sum_{s'=1}^{S} c_\tau(s')^{-\sigma} \pi(s')}
$$

and

$$
\mathcal{X}_\tau(s) = (c_\tau(s) + g(s))^{1+\gamma} - c_\tau(s)^{1-\sigma}
$$

each for 𝑠 = 1, … , 𝑆.
BEGS call ℛ𝜏 (𝑠) the effective return on risk-free debt and they call 𝒳𝜏 (𝑠) the effective government deficit.
Step 3: With the preceding objects in hand, for a given ℬ, we seek a 𝜏 that satisfies

$$
\mathcal{B} = -\frac{\beta}{1-\beta} E \mathcal{X}_\tau \equiv -\frac{\beta}{1-\beta} \sum_s \mathcal{X}_\tau(s) \pi(s)
$$

This equation says that at a constant discount factor 𝛽, equivalent government debt ℬ equals the present value of the mean
effective government surplus.
Another typo alert: there is a sign error in equation (46) of BEGS [Bhandari et al., 2017] –the left side should be
multiplied by −1.
• We have made this correction in the above equation.


For a given ℬ, let a 𝜏 that solves the above equation be called 𝜏 (ℬ).
We’ll use a Python root solver to find a 𝜏 that solves this equation for a given ℬ.
We’ll use this function to induce a function 𝜏 (ℬ).
Step 4: With a Python program that computes 𝜏 (ℬ) in hand, next we write a Python function to compute the random
variable.

$$
J(\mathcal{B})(s) = \mathcal{R}_{\tau(\mathcal{B})}(s)\,\mathcal{B} + \mathcal{X}_{\tau(\mathcal{B})}(s), \qquad s = 1, \ldots, S
$$

Step 5: Now that we have a way to compute the random variable 𝐽 (ℬ)(𝑠), 𝑠 = 1, … , 𝑆, via a composition of Python
functions, we can use the population variance function that we defined in the code above to construct a function var(𝐽 (ℬ)).
We put var(𝐽 (ℬ)) into a Python function minimizer and compute

$$
\mathcal{B}^* = \operatorname*{argmin}_{\mathcal{B}} \; \operatorname{var}(J(\mathcal{B}))
$$

Step 6: Next we take the minimizer ℬ∗ and the Python functions for computing means and variances and compute
$$
\text{rate} = \frac{1}{1 + \beta^2 \operatorname{var}(\mathcal{R}_{\tau(\mathcal{B}^*)})}
$$

Ultimate outputs of this string of calculations are two scalars

$$
(\mathcal{B}^*, \ \text{rate})
$$

Step 7: Compute the divisor

$$
\mathrm{div} = \beta E u_{c,t+1}
$$

and then compute the mean of the par value of government debt in the AMSS model

$$
\hat b = \frac{\mathcal{B}^*}{\mathrm{div}}
$$
In the two-Markov-state AMSS economy in Fiscal Insurance via Fluctuating Interest Rates, 𝐸𝑡 𝑢𝑐,𝑡+1 = 𝐸𝑢𝑐,𝑡+1 in the
ergodic distribution.
We have confirmed that this formula very accurately describes a constant par value of government debt that
• supports full fiscal insurance via fluctuating interest rates, and
• is the limit of government debt as 𝑡 → +∞
In the three-Markov-state economy of this lecture, the par value of government debt fluctuates in a history-dependent
way even asymptotically.
In this economy, 𝑏̂ given by the above formula approximates the mean of the ergodic distribution of the par value of government debt

• so while the approximation circumvents the chicken and egg problem that surrounds the much better approximation associated with the green vertical line, it does so by enlarging the approximation error

• 𝑏̂ is represented by the red vertical line plotted in the histogram of the last 100,000 observations of our simulation of the par value of government debt plotted above

• the approximation is fairly accurate but not perfect


48.4.7 Execution

Now let’s move on to compute things step by step.

Step 1

u = CRRAutility(π=np.full((3, 3), 1 / 3),
                G=np.array([0.1, 0.2, .3]),
                Θ=np.ones(3))

τ = 0.05  # Initial guess of τ (to display calculations along the way)


S = len(u.G) # Number of states

def solve_c(c, τ, u):
    return (1 - τ) * c**(-u.σ) - (c + u.G)**u.γ

# .x extracts the solution array from the result object returned by root
c = root(solve_c, np.ones(S), args=(τ, u)).x
c

array([0.93852387, 0.89231015, 0.84858872])

root(solve_c, np.ones(S), args=(τ, u))

 message: The solution converged.
 success: True
  status: 1
     fun: [ 5.618e-10 -4.769e-10  1.175e-11]
       x: [ 9.385e-01  8.923e-01  8.486e-01]
  method: hybr
    nfev: 11
    fjac: [[-9.999e-01 -4.954e-03 -1.261e-02]
           [-5.156e-03  9.999e-01  1.610e-02]
           [-1.253e-02 -1.616e-02  9.998e-01]]
       r: [ 4.269e+00  8.685e-02 -6.301e-02 -4.713e+00 -7.433e-02 -5.508e+00]
     qtf: [ 1.556e-08  1.283e-08  7.899e-11]


Step 2

n = c + u.G # Compute labor supply

48.4.8 Note about Code

Remember that in our code 𝜋 is a 3 × 3 transition matrix.


But because we are studying an IID case, 𝜋 has identical rows and we need only to compute objects for one row of 𝜋.
This explains why at some places below we set 𝑠 = 0 just to pick off the first row of 𝜋.

48.4.9 Running the code

Let’s take the code out for a spin.


First, let’s compute ℛ and 𝒳 according to our formulas

def compute_R_X(τ, u, s):
    c = root(solve_c, np.ones(S), args=(τ, u)).x   # Solve for vector of c's
    div = u.β * (u.Uc(c[0], n[0]) * u.π[s, 0] \
                 + u.Uc(c[1], n[1]) * u.π[s, 1] \
                 + u.Uc(c[2], n[2]) * u.π[s, 2])
    R = c**(-u.σ) / (div)
    X = (c + u.G)**(1 + u.γ) - c**(1 - u.σ)
    return R, X

c**(-u.σ) @ u.π

array([1.25997521, 1.25997521, 1.25997521])

u.π

array([[0.33333333, 0.33333333, 0.33333333],
       [0.33333333, 0.33333333, 0.33333333],
       [0.33333333, 0.33333333, 0.33333333]])

We only want unconditional expectations because we are in an IID case.


So we’ll set 𝑠 = 0 and just pick off expectations associated with the first row of 𝜋

s = 0

R, X = compute_R_X(τ, u, s)

Let’s look at the random variables ℛ, 𝒳

R

array([1.00116313, 1.10755123, 1.22461897])


mean(R, s)

1.1111111111111112

X

array([0.05457803, 0.18259396, 0.33685546])

mean(X, s)

0.19134248445303795

X @ u.π

array([0.19134248, 0.19134248, 0.19134248])

Step 3

def solve_τ(τ, B, u, s):
    R, X = compute_R_X(τ, u, s)
    return ((u.β - 1) / u.β) * B - X @ u.π[s]

Note that 𝐵 is a scalar.


Let’s try out our method computing 𝜏

s = 0
B = 1.0

τ = root(solve_τ, .1, args=(B, u, s)).x[0] # Very sensitive to initial value


τ

0.2740159773695818

In the above cell, B is fixed at 1 and 𝜏 is to be computed as a function of B.

Note that 0.1 is the initial value for 𝜏 in the root-finding algorithm.

Step 4

def min_J(B, u, s):
    # Very sensitive to initial value of τ
    τ = root(solve_τ, .5, args=(B, u, s)).x[0]
    R, X = compute_R_X(τ, u, s)
    return variance(R * B + X, s)


min_J(B, u, s)

0.035564405653720765

Step 6

B_star = minimize(min_J, .5, args=(u, s)).x[0]


B_star

-1.199483167941158

n = c + u.G  # Compute labor supply

div = u.β * (u.Uc(c[0], n[0]) * u.π[s, 0] \
             + u.Uc(c[1], n[1]) * u.π[s, 1] \
             + u.Uc(c[2], n[2]) * u.π[s, 2])

B_hat = B_star/div
B_hat

-1.0577661126390971

τ_star = root(solve_τ, 0.05, args=(B_star, u, s)).x[0]


τ_star

0.09572916798461703

R_star, X_star = compute_R_X(τ_star, u, s)


R_star, X_star

(array([0.9998398 , 1.10746593, 1.2260276 ]),


array([0.0020272 , 0.12464752, 0.27315299]))

rate = 1 / (1 + u.β**2 * variance(R_star, s))


rate

0.9931353432732218

root(solve_c, np.ones(S), args=(τ_star, u)).x

array([0.9264382 , 0.88027117, 0.83662635])



CHAPTER

FORTYNINE

COMPETITIVE EQUILIBRIA OF A MODEL OF CHANG

In addition to what’s in Anaconda, this lecture will need the following libraries:

!pip install polytope

49.1 Overview

This lecture describes how Chang [Chang, 1998] analyzed competitive equilibria and a best competitive equilibrium
called a Ramsey plan.
He did this by
• characterizing a competitive equilibrium recursively in a way also employed in the dynamic Stackelberg problems
and Calvo model lectures to pose Stackelberg problems in linear economies, and then
• appropriately adapting an argument of Abreu, Pearce, and Stacchetti [Abreu et al., 1990] to describe key features
of the set of competitive equilibria
Roberto Chang [Chang, 1998] chose a model of Calvo [Calvo, 1978] as a simple structure that conveys ideas that apply
more broadly.
A textbook version of Chang’s model appears in chapter 25 of [Ljungqvist and Sargent, 2018].
This lecture and Credible Government Policies in Chang Model can be viewed as more sophisticated and complete treat-
ments of the topics discussed in Ramsey plans, time inconsistency, sustainable plans.
Both this lecture and Credible Government Policies in Chang Model make extensive use of an idea to which we apply the
nickname dynamic programming squared.
In dynamic programming squared problems there are typically two interrelated Bellman equations
• A Bellman equation for a set of agents or followers with value or value function 𝑣𝑎 .
• A Bellman equation for a principal or Ramsey planner or Stackelberg leader with value or value function 𝑣𝑝 in
which 𝑣𝑎 appears as an argument.
We encountered problems with this structure in dynamic Stackelberg problems, optimal taxation with state-contingent debt,
and other lectures.
We’ll start with some standard imports:

import numpy as np
import polytope
import matplotlib.pyplot as plt


`polytope` failed to import `cvxopt.glpk`.

will use `scipy.optimize.linprog`

49.1.1 The Setting

First, we introduce some notation.


For a sequence of scalars $\vec z \equiv \{z_t\}_{t=0}^{\infty}$, let $\vec z^t = (z_0, \ldots, z_t)$ and $\vec z_t = (z_t, z_{t+1}, \ldots)$.

An infinitely lived representative agent and an infinitely lived government exist at dates 𝑡 = 0, 1, ….
The objects in play are
• an initial quantity 𝑀−1 of nominal money holdings
• a sequence of inverse money growth rates ℎ⃗ and an associated sequence of nominal money holdings 𝑀⃗
• a sequence of values of money 𝑞 ⃗
• a sequence of real money holdings 𝑚⃗
• a sequence of total tax collections 𝑥⃗
• a sequence of per capita rates of consumption 𝑐 ⃗
• a sequence of per capita incomes 𝑦 ⃗
A benevolent government chooses sequences (𝑀⃗ , ℎ,⃗ 𝑥)⃗ subject to a sequence of budget constraints and other constraints
imposed by competitive equilibrium.
Given tax collection and price of money sequences, a representative household chooses sequences (𝑐,⃗ 𝑚)
⃗ of consumption
and real balances.
In competitive equilibrium, the price of money sequence 𝑞 ⃗ clears markets, thereby reconciling decisions of the government
and the representative household.
Chang adopts a version of a model that [Calvo, 1978] designed to exhibit time-inconsistency of a Ramsey policy in a
simple and transparent setting.
By influencing the representative household’s expectations, government actions at time 𝑡 affect components of household
utilities for periods 𝑠 before 𝑡.
When setting a path for monetary expansion rates, the government takes into account how the household’s anticipations
of the government’s future actions affect the household’s current decisions.
The ultimate source of time inconsistency is that a time 0 Ramsey planner takes these effects into account in designing a
plan of government actions for 𝑡 ≥ 0.


49.2 Decisions

49.2.1 The Household’s Problem

A representative household faces a nonnegative value of money sequence 𝑞 ⃗ and sequences 𝑦,⃗ 𝑥⃗ of income and total tax
collections, respectively.
Facing vector 𝑞 ⃗ as a price taker, the representative household chooses nonnegative sequences 𝑐,⃗ 𝑀⃗ of consumption and
nominal balances, respectively, to maximize

$$
\sum_{t=0}^{\infty} \beta^t \left[ u(c_t) + v(q_t M_t) \right] \tag{49.1}
$$

subject to

𝑞𝑡 𝑀𝑡 ≤ 𝑦𝑡 + 𝑞𝑡 𝑀𝑡−1 − 𝑐𝑡 − 𝑥𝑡 (49.2)

and

𝑞𝑡 𝑀𝑡 ≤ 𝑚̄ (49.3)

Here 𝑞𝑡 is the reciprocal of the price level at 𝑡, which we can also call the value of money.
Chang [Chang, 1998] assumes that
• 𝑢 ∶ ℝ+ → ℝ is twice continuously differentiable, strictly concave, and strictly increasing;
• 𝑣 ∶ ℝ+ → ℝ is twice continuously differentiable and strictly concave;
• $\lim_{c \to 0} u'(c) = \lim_{m \to 0} v'(m) = +\infty$;
• there is a finite level 𝑚 = 𝑚𝑓 such that 𝑣′ (𝑚𝑓 ) = 0
The household carries real balances out of a period equal to 𝑚𝑡 = 𝑞𝑡 𝑀𝑡 .
Inequality (49.2) is the household’s time 𝑡 budget constraint.
It tells how real balances 𝑞𝑡 𝑀𝑡 carried out of period 𝑡 depend on real balances 𝑞𝑡 𝑀𝑡−1 carried into period 𝑡, income,
consumption, taxes.
Equation (49.3) imposes an exogenous upper bound 𝑚̄ on the household’s choice of real balances, where 𝑚̄ ≥ 𝑚𝑓 .

49.2.2 Government
𝑀𝑡−1
The government chooses a sequence of inverse money growth rates with time 𝑡 component ℎ𝑡 ≡ 𝑀𝑡 ∈ Π ≡ [𝜋, 𝜋],
where 0 < 𝜋 < 1 < 𝛽1 ≤ 𝜋.
The government purchases no goods.
It taxes only to acquire paper currency that it will withdraw from circulation (e.g., by burning it).
Let 𝑝𝑡 be the price level at time 𝑡, measured as time 𝑡 dollars per unit of the consumption good.
Evidently, the value of paper currency measured in units of the consumption good at time $t$ is

$$
q_t = \frac{1}{p_t}
$$
The government faces a sequence of budget constraints with time 𝑡 component
$$
x_t + \frac{M_t - M_{t-1}}{p_t} = 0,
$$


where $x_t$ is the real value of revenue that the government raises from taxes and $\frac{M_t - M_{t-1}}{p_t}$ is the real value of revenue that the government raises by printing new paper currency.
Evidently, this budget constraint can be rewritten as

−𝑥𝑡 = 𝑞𝑡 (𝑀𝑡 − 𝑀𝑡−1 )

which by using the definitions of 𝑚𝑡 and ℎ𝑡 can also be expressed as

−𝑥𝑡 = 𝑚𝑡 (1 − ℎ𝑡 ) (49.4)

The restrictions $m_t \in [0, \bar m]$ and $h_t \in \Pi = [\underline{\pi}, \overline{\pi}]$ evidently imply that $x_t \in X \equiv [(\underline{\pi} - 1)\bar m, (\overline{\pi} - 1)\bar m]$.

We define the set $E \equiv [0, \bar m] \times \Pi \times X$, so that we require that $(m, h, x) \in E$.
To represent the idea that taxes are distorting, Chang makes the following assumption about outcomes for per capita
output:

𝑦𝑡 = 𝑓(𝑥𝑡 ), (49.5)

where 𝑓 ∶ ℝ → ℝ satisfies 𝑓(𝑥) > 0, 𝑓(𝑥) is twice continuously differentiable, 𝑓 ″ (𝑥) < 0, 𝑓 ′ (0) = 0, and 𝑓(𝑥) = 𝑓(−𝑥)
for all 𝑥 ∈ ℝ, so that subsidies and taxes are equally distorting.
Example parameterizations
In some of our Python code deployed later in this lecture, we’ll assume the following functional forms:

$$
u(c) = \log(c)
$$

$$
v(m) = \frac{1}{500}\left(m\bar m - 0.5 m^2\right)^{0.5}
$$

$$
f(x) = 180 - (0.4x)^2
$$
The tax distortion function
Calvo’s and Chang’s purpose is not to model the causes of tax distortions in any detail but simply to summarize the outcome
of those distortions via the function 𝑓(𝑥).
A key part of the specification is that tax distortions are increasing in the absolute value of tax revenues.
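For concreteness, here is a small Python sketch of these functional forms; the value of m̄ is illustrative, and the same forms reappear inside the ChangModel class later in this lecture:

mbar = 30                                        # illustrative value of m̄
u = lambda c: np.log(c)
v = lambda m: 1 / 500 * (mbar * m - 0.5 * m**2)**0.5
f = lambda x: 180 - (0.4 * x)**2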
Ramsey plan: A Ramsey plan is a competitive equilibrium that maximizes (49.1).
Within-period timing of decisions is as follows:
• first, the government chooses ℎ𝑡 and 𝑥𝑡 ;
• then given 𝑞 ⃗ and its expectations about future values of 𝑥 and 𝑦’s, the household chooses 𝑀𝑡 and therefore 𝑚𝑡
because 𝑚𝑡 = 𝑞𝑡 𝑀𝑡 ;
• then output 𝑦𝑡 = 𝑓(𝑥𝑡 ) is realized;
• finally 𝑐𝑡 = 𝑦𝑡
This within-period timing confronts the government with choices framed by how the private sector wants to respond when
the government takes time 𝑡 actions that differ from what the private sector had expected.
This consideration will be important in lecture credible government policies when we study credible government policies.
The model is designed to focus on the intertemporal trade-offs between the welfare benefits of deflation and the welfare
costs associated with the high tax collections required to retire money at a rate that delivers deflation.
A benevolent time 0 government can promote utility generating increases in real balances only by imposing sufficiently
large distorting tax collections.
To promote the welfare increasing effects of high real balances, the government wants to induce gradual deflation.


49.2.3 Household’s Problem

Given $M_{-1}$ and $\{q_t\}_{t=0}^{\infty}$, the household's problem is

$$
\mathcal{L} = \max_{\vec c, \vec M} \min_{\vec\lambda, \vec\mu} \sum_{t=0}^{\infty} \beta^t \Big\{ u(c_t) + v(M_t q_t) + \lambda_t \left[ y_t - c_t - x_t + q_t M_{t-1} - q_t M_t \right] + \mu_t \left[ \bar m - q_t M_t \right] \Big\}
$$
First-order conditions with respect to $c_t$ and $M_t$, respectively, are

$$
u'(c_t) = \lambda_t
$$

$$
q_t \left[ u'(c_t) - v'(M_t q_t) \right] \le \beta u'(c_{t+1}) q_{t+1}, \quad = \text{ if } M_t q_t < \bar m
$$
The last equation expresses Karush-Kuhn-Tucker complementary slackness conditions (see here).
These insist that the inequality is an equality at an interior solution for 𝑀𝑡 .
Using $h_t = \frac{M_{t-1}}{M_t}$ and $q_t = \frac{m_t}{M_t}$ in these first-order conditions and rearranging implies

𝑚𝑡 [𝑢′ (𝑐𝑡 ) − 𝑣′ (𝑚𝑡 )] ≤ 𝛽𝑢′ (𝑓(𝑥𝑡+1 ))𝑚𝑡+1 ℎ𝑡+1 , = if 𝑚𝑡 < 𝑚̄ (49.6)

Define the following key variable

𝜃𝑡+1 ≡ 𝑢′ (𝑓(𝑥𝑡+1 ))𝑚𝑡+1 ℎ𝑡+1 (49.7)

This is real money balances at time 𝑡 + 1 measured in units of marginal utility, which Chang refers to as ‘the marginal
utility of real balances’.
From the standpoint of the household at time 𝑡, equation (49.7) shows that 𝜃𝑡+1 intermediates the influences of
(𝑥𝑡+1
⃗ , 𝑚⃗ 𝑡+1 ) on the household’s choice of real balances 𝑚𝑡 .
By “intermediates” we mean that the future paths (𝑥𝑡+1
⃗ , 𝑚⃗ 𝑡+1 ) influence 𝑚𝑡 entirely through their effects on the scalar
𝜃𝑡+1 .
The observation that the one dimensional promised marginal utility of real balances 𝜃𝑡+1 functions in this way is an
important step in constructing a class of competitive equilibria that have a recursive representation.
A closely related observation pervaded the analysis of Stackelberg plans in lecture dynamic Stackelberg problems.

49.3 Competitive Equilibrium

Definition:
• A government policy is a pair of sequences (ℎ,⃗ 𝑥)⃗ where ℎ𝑡 ∈ Π ∀𝑡 ≥ 0.
• A price system is a nonnegative value of money sequence 𝑞.⃗
• An allocation is a triple of nonnegative sequences (𝑐,⃗ 𝑚,⃗ 𝑦).

It is required that time 𝑡 components (𝑚𝑡 , 𝑥𝑡 , ℎ𝑡 ) ∈ 𝐸.
Definition:
Given 𝑀−1 , a government policy (ℎ,⃗ 𝑥),
⃗ price system 𝑞,⃗ and allocation (𝑐,⃗ 𝑚,⃗ 𝑦)⃗ are said to be a competitive equilibrium
if
• 𝑚𝑡 = 𝑞𝑡 𝑀𝑡 and 𝑦𝑡 = 𝑓(𝑥𝑡 ).
• The government budget constraint is satisfied.
• Given 𝑞,⃗ 𝑥,⃗ 𝑦,⃗ (𝑐,⃗ 𝑚)
⃗ solves the household’s problem.


49.4 Inventory of Objects in Play

Chang constructs the following objects


1. A set Ω of initial marginal utilities of money 𝜃0
• Let Ω denote the set of initial promised marginal utilities of money 𝜃0 associated with competitive equilibria.
• Chang exploits the fact that a competitive equilibrium consists of a first period outcome (ℎ0 , 𝑚0 , 𝑥0 ) and a
continuation competitive equilibrium with marginal utility of money 𝜃1 ∈ Ω.
2. Competitive equilibria that have a recursive representation
• A competitive equilibrium with a recursive representation consists of an initial 𝜃0 and a four-tuple of functions
(ℎ, 𝑚, 𝑥, Ψ) mapping 𝜃 into this period’s (ℎ, 𝑚, 𝑥) and next period’s 𝜃, respectively.
• A competitive equilibrium can be represented recursively by iterating on

$$
\begin{aligned}
h_t &= h(\theta_t) \\
m_t &= m(\theta_t) \\
x_t &= x(\theta_t) \\
\theta_{t+1} &= \Psi(\theta_t)
\end{aligned} \tag{49.8}
$$

starting from $\theta_0$ (a minimal iteration sketch under hypothetical policy functions appears just after this list)

The range and domain of $\Psi(\cdot)$ are both $\Omega$
3. A recursive representation of a Ramsey plan
• A recursive representation of a Ramsey plan is a recursive competitive equilibrium $\theta_0, (h, m, x, \Psi)$ that, among all recursive competitive equilibria, maximizes $\sum_{t=0}^{\infty} \beta^t \left[ u(c_t) + v(q_t M_t) \right]$.
• The Ramsey planner chooses 𝜃0 , (ℎ, 𝑚, 𝑥, Ψ) from among the set of recursive competitive equilibria at time
0.
• Iterations on the function Ψ determine subsequent 𝜃𝑡 ’s that summarize the aspects of the continuation com-
petitive equilibria that influence the household’s decisions.
• At time 0, the Ramsey planner commits to this implied sequence $\{\theta_t\}_{t=0}^{\infty}$ and therefore to an associated sequence of continuation competitive equilibria.
4. A characterization of time-inconsistency of a Ramsey plan
• Imagine that after a ‘revolution’ at time 𝑡 ≥ 1, a new Ramsey planner is given the opportunity to ignore history
and solve a brand new Ramsey plan.
• This new planner would want to reset the 𝜃𝑡 associated with the original Ramsey plan to 𝜃0 .
• The incentive to reinitialize 𝜃𝑡 associated with this revolution experiment indicates the time-inconsistency of
the Ramsey plan.
• By resetting 𝜃 to 𝜃0 , the new planner avoids the costs at time 𝑡 that the original Ramsey planner must pay
to reap the beneficial effects that the original Ramsey plan for 𝑠 ≥ 𝑡 had achieved via its influence on the
household’s decisions for 𝑠 = 0, … , 𝑡 − 1.
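Here is the minimal iteration sketch promised above: given hypothetical policy functions h_fun, m_fun, x_fun and a law of motion Ψ_fun (none of which have been computed at this point in the lecture), iterating on (49.8) is just a loop.

def iterate_recursive_ce(θ0, h_fun, m_fun, x_fun, Ψ_fun, T=50):
    θ, path = θ0, []
    for t in range(T):
        path.append((h_fun(θ), m_fun(θ), x_fun(θ), θ))   # this period's outcomes
        θ = Ψ_fun(θ)                                     # next period's promised θ
    return path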


49.5 Analysis

A competitive equilibrium is a triple of sequences (𝑚,⃗ 𝑥,⃗ ℎ)⃗ ∈ 𝐸 ∞ that satisfies (49.2), (49.3), and (49.6).
Chang works with a set of competitive equilibria defined as follows.
Definition: 𝐶𝐸 = {(𝑚,⃗ 𝑥,⃗ ℎ)⃗ ∈ 𝐸 ∞ such that (49.2), (49.3), and (49.6) are satisfied }.
𝐶𝐸 is not empty because there exists a competitive equilibrium with ℎ𝑡 = 1 for all 𝑡 ≥ 1, namely, an equilibrium with
a constant money supply and constant price level.
Chang establishes that 𝐶𝐸 is also compact.
Chang makes the following key observation that combines ideas of Abreu, Pearce, and Stacchetti [Abreu et al., 1990]
with insights of Kydland and Prescott [Kydland and Prescott, 1980].
Proposition: The continuation of a competitive equilibrium is a competitive equilibrium.
That is, (𝑚,⃗ 𝑥,⃗ ℎ)⃗ ∈ 𝐶𝐸 implies that (𝑚⃗ 𝑡 , 𝑥𝑡⃗ , ℎ⃗ 𝑡 ) ∈ 𝐶𝐸 ∀ 𝑡 ≥ 1.
(Lecture dynamic Stackelberg problems also used a version of this insight)
We can now state that a Ramsey problem is to

$$
\max_{(\vec m, \vec x, \vec h) \in E^{\infty}} \sum_{t=0}^{\infty} \beta^t \left[ u(c_t) + v(m_t) \right]
$$

subject to restrictions (49.2), (49.3), and (49.6).


Evidently, associated with any competitive equilibrium (𝑚0 , 𝑥0 ) is an implied value of 𝜃0 = 𝑢′ (𝑓(𝑥0 ))(𝑚0 + 𝑥0 ).
To bring out a recursive structure inherent in the Ramsey problem, Chang defines the set

Ω = {𝜃 ∈ ℝ such that 𝜃 = 𝑢′ (𝑓(𝑥0 ))(𝑚0 + 𝑥0 ) for some (𝑚,⃗ 𝑥,⃗ ℎ)⃗ ∈ 𝐶𝐸}

Equation (49.6) inherits from the household’s Euler equation for money holdings the property that the value of 𝑚0
consistent with the representative household’s choices depends on (ℎ⃗ 1 , 𝑚⃗ 1 ).
This dependence is captured in the definition above by making Ω be the set of first period values of 𝜃0 satisfying 𝜃0 =

𝑢′ (𝑓(𝑥0 ))(𝑚0 + 𝑥0 ) for first period component (𝑚0 , ℎ0 ) of competitive equilibrium sequences (𝑚,⃗ 𝑥,⃗ ℎ).
Chang establishes that Ω is a nonempty and compact subset of ℝ+ .
Next Chang advances:
Definition: Γ(𝜃) = {(𝑚,⃗ 𝑥,⃗ ℎ)⃗ ∈ 𝐶𝐸|𝜃 = 𝑢′ (𝑓(𝑥0 ))(𝑚0 + 𝑥0 )}.
Thus, Γ(𝜃) is the set of competitive equilibrium sequences (𝑚,⃗ 𝑥,⃗ ℎ)⃗ whose first period components (𝑚0 , ℎ0 ) deliver the
prescribed value 𝜃 for first period marginal utility.
If we knew the sets Ω, Γ(𝜃), we could use the following two-step procedure to find at least the value of the Ramsey
outcome to the representative household
1. Find the indirect value function 𝑤(𝜃) defined as

$$
w(\theta) = \max_{(\vec m, \vec x, \vec h) \in \Gamma(\theta)} \sum_{t=0}^{\infty} \beta^t \left[ u(f(x_t)) + v(m_t) \right]
$$

2. Compute the value of the Ramsey outcome by solving max𝜃∈Ω 𝑤(𝜃).


Thus, Chang states the following


Proposition:
𝑤(𝜃) satisfies the Bellman equation

$$
w(\theta) = \max_{x, m, h, \theta'} \left\{ u(f(x)) + v(m) + \beta w(\theta') \right\} \tag{49.9}
$$

where maximization is subject to

(𝑚, 𝑥, ℎ) ∈ 𝐸 and 𝜃′ ∈ Ω (49.10)

and

𝜃 = 𝑢′ (𝑓(𝑥))(𝑚 + 𝑥) (49.11)

and

−𝑥 = 𝑚(1 − ℎ) (49.12)

and

𝑚 ⋅ [𝑢′ (𝑓(𝑥)) − 𝑣′ (𝑚)] ≤ 𝛽𝜃′ , = if 𝑚 < 𝑚̄ (49.13)

Before we use this proposition to recover a recursive representation of the Ramsey plan, note that the proposition relies
on knowing the set Ω.
To find Ω, Chang uses the insights of Kydland and Prescott [Kydland and Prescott, 1980] together with a method based on
the Abreu, Pearce, and Stacchetti [Abreu et al., 1990] iteration to convergence on an operator 𝐵 that maps continuation
values into values.
We want an operator that maps a continuation 𝜃 into a current 𝜃.
Chang lets 𝑄 be a nonempty, bounded subset of ℝ.
Elements of the set 𝑄 are taken to be candidate values for continuation marginal utilities.
Chang defines an operator

$$
B(Q) = \left\{ \theta \in \mathbb{R} : \text{there is } (m, x, h, \theta') \in E \times Q \text{ such that (49.11), (49.12), and (49.13) hold} \right\}
$$


Thus, 𝐵(𝑄) is the set of first period 𝜃’s attainable with (𝑚, 𝑥, ℎ) ∈ 𝐸 and some 𝜃′ ∈ 𝑄.
Proposition:
1. 𝑄 ⊂ 𝐵(𝑄) implies 𝐵(𝑄) ⊂ Ω (‘self-generation’).
2. Ω = 𝐵(Ω) (‘factorization’).
The proposition characterizes Ω as the largest fixed point of 𝐵.
It is easy to establish that 𝐵(𝑄) is a monotone operator.
This property allows Chang to compute Ω as the limit of iterations on 𝐵 provided that iterations begin from a sufficiently
large initial set.
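To fix ideas, here is a rough, discretized sketch of iterating on B; it is not Chang's algorithm or the lecture's own code. It uses the functional forms assumed elsewhere in this lecture, treats candidate sets Q as intervals for simplicity, and the values of β, m̄ and the grid sizes are purely illustrative.

β, mbar = 0.8, 30
uc_p = lambda c: 1 / c                                    # u'(c) for u(c) = log(c)
v_p = lambda m: 0.5 / 500 * (mbar*m - 0.5*m**2)**(-0.5) * (mbar - m)
f = lambda x: 180 - (0.4 * x)**2                          # f(x) > 0 on the grids below

h_grid = np.linspace(0.9, 1 / β, 20)                      # candidate h values in Π
m_grid = np.linspace(1e-3, mbar, 20)                      # candidate m values in [0, m̄]

def B_op(Q_lo, Q_hi):
    """One application of B to the interval Q = [Q_lo, Q_hi]."""
    thetas = []
    for h in h_grid:
        for m in m_grid:
            x = m * (h - 1)
            euler = m * (uc_p(f(x)) - v_p(m))             # left side of (49.13)
            if np.isclose(m, mbar):                       # inequality allowed at m = m̄
                ok = euler <= β * Q_hi
            else:                                         # equality: θ' = euler / β must lie in Q
                ok = Q_lo <= euler / β <= Q_hi
            if ok:
                thetas.append(uc_p(f(x)) * (m + x))       # θ from (49.11)
    return min(thetas), max(thetas)

# iterate from a large initial interval; the bounds settle down on (roughly) Ω
Q = (0.0, 1.0)
for _ in range(20):
    Q = B_op(*Q)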


49.5.1 Some Useful Notation

Let ℎ⃗ 𝑡 = (ℎ0 , ℎ1 , … , ℎ𝑡 ) denote a history of inverse money creation rates with time 𝑡 component ℎ𝑡 ∈ Π.
A government strategy $\sigma = \{\sigma_t\}_{t=0}^{\infty}$ is a $\sigma_0 \in \Pi$ and for $t \ge 1$ a sequence of functions $\sigma_t : \Pi^{t-1} \to \Pi$.
Chang restricts the government’s choice of strategies to the following space:

𝐶𝐸𝜋 = {ℎ⃗ ∈ Π∞ ∶ there is some (𝑚,⃗ 𝑥)⃗ such that (𝑚,⃗ 𝑥,⃗ ℎ)⃗ ∈ 𝐶𝐸}

In words, 𝐶𝐸𝜋 is the set of money growth sequences consistent with the existence of competitive equilibria.
Chang observes that 𝐶𝐸𝜋 is nonempty and compact.
Definition: 𝜎 is said to be admissible if for all 𝑡 ≥ 1 and after any history ℎ⃗ 𝑡−1 , the continuation ℎ⃗ 𝑡 implied by 𝜎 belongs
to 𝐶𝐸𝜋 .
Admissibility of 𝜎 means that anticipated policy choices associated with 𝜎 are consistent with the existence of competitive
equilibria after each possible subsequent history.
After any history ℎ⃗ 𝑡−1 , admissibility restricts the government’s choice in period 𝑡 to the set

𝐶𝐸𝜋0 = {ℎ ∈ Π ∶ there is ℎ⃗ ∈ 𝐶𝐸𝜋 with ℎ = ℎ0 }

In words, 𝐶𝐸𝜋0 is the set of all first period money growth rates ℎ = ℎ0 , each of which is consistent with the existence of
a sequence of money growth rates ℎ⃗ starting from ℎ0 in the initial period and for which a competitive equilibrium exists.
Remark:

$$
CE_\pi^0 = \left\{ h \in \Pi : \text{there is } (m, \theta') \in [0, \bar m] \times \Omega \text{ such that } m\left[ u'(f((h-1)m)) - v'(m) \right] \le \beta\theta' \text{ with equality if } m < \bar m \right\}.
$$

Definition: An allocation rule is a sequence of functions $\vec\alpha = \{\alpha_t\}_{t=0}^{\infty}$ such that $\alpha_t : \Pi^t \to [0, \bar m] \times X$.

Thus, the time $t$ component of $\alpha_t(h^t)$ is a pair of functions $(m_t(h^t), x_t(h^t))$.
Definition: Given an admissible government strategy 𝜎, an allocation rule 𝛼 is called competitive if given any history
ℎ⃗ 𝑡−1 and ℎ𝑡 ∈ 𝐶𝐸𝜋0 , the continuations of 𝜎 and 𝛼 after (ℎ⃗ 𝑡−1 , ℎ𝑡 ) induce a competitive equilibrium sequence.

49.5.2 Another Operator

At this point it is convenient to introduce another operator that can be used to compute a Ramsey plan.
For computing a Ramsey plan, this operator is wasteful because it works with a state vector that is bigger than necessary.

We introduce this operator because it helps to prepare the way for Chang's operator called $\tilde D(Z)$ that we shall describe in lecture credible government policies.

It is also useful because a fixed point of the operator to be defined here provides a good guess for an initial set from which to initiate iterations on Chang's set-to-set operator $\tilde D(Z)$ to be described in lecture credible government policies.
Let 𝑆 be the set of all pairs (𝑤, 𝜃) of competitive equilibrium values and associated initial marginal utilities.
Let 𝑊 be a bounded set of values in ℝ.
Let 𝑍 be a nonempty subset of 𝑊 × Ω.
Think of using pairs (𝑤′ , 𝜃′ ) drawn from 𝑍 as candidate continuation value, 𝜃 pairs.
Define the operator

$$
D(Z) = \{ (w, \theta) : \text{there is } h \in CE_\pi^0 \text{ and a four-tuple } (m(h), x(h), w'(h), \theta'(h)) \in [0, \bar m] \times X \times Z
$$

such that

$$
w = u(f(x(h))) + v(m(h)) + \beta w'(h) \tag{49.14}
$$

$$
\theta = u'(f(x(h)))\,(m(h) + x(h)) \tag{49.15}
$$

$$
x(h) = m(h)(h - 1) \tag{49.16}
$$

$$
m(h)\left( u'(f(x(h))) - v'(m(h)) \right) \le \beta \theta'(h) \tag{49.17}
$$

with equality if $m(h) < \bar m \}$
It is possible to establish the following.
Proposition:
1. If 𝑍 ⊂ 𝐷(𝑍), then 𝐷(𝑍) ⊂ 𝑆 (‘self-generation’).
2. 𝑆 = 𝐷(𝑆) (‘factorization’).
Proposition:
1. Monotonicity of 𝐷: 𝑍 ⊂ 𝑍 ′ implies 𝐷(𝑍) ⊂ 𝐷(𝑍 ′ ).
2. 𝑍 compact implies that 𝐷(𝑍) is compact.
It can be shown that 𝑆 is compact and that therefore there exists a (𝑤, 𝜃) pair within this set that attains the highest
possible value 𝑤.
This (𝑤, 𝜃) pair is associated with a Ramsey plan.
Further, we can compute 𝑆 by iterating to convergence on 𝐷 provided that one begins with a sufficiently large initial set
𝑆0 .
As a very useful by-product, the algorithm that finds the largest fixed point 𝑆 = 𝐷(𝑆) also produces the Ramsey plan,
its value 𝑤, and the associated competitive equilibrium.

49.6 Calculating all Promise-Value Pairs in CE

Above we have defined the 𝐷(𝑍) operator as:

$$
D(Z) = \{ (w, \theta) : \exists h \in CE_\pi^0 \text{ and } (m(h), x(h), w'(h), \theta'(h)) \in [0, \bar m] \times X \times Z
$$

such that

$$
w = u(f(x(h))) + v(m(h)) + \beta w'(h)
$$

$$
\theta = u'(f(x(h)))\,(m(h) + x(h))
$$

$$
x(h) = m(h)(h - 1)
$$

$$
m(h)\left( u'(f(x(h))) - v'(m(h)) \right) \le \beta\theta'(h) \quad (\text{with equality if } m(h) < \bar m) \}
$$
We noted that the set 𝑆 can be found by iterating to convergence on 𝐷, provided that we start with a sufficiently large
initial set 𝑆0 .
Our implementation builds on ideas in this notebook.
To find 𝑆 we use a numerical algorithm called the outer hyperplane approximation algorithm.


It was invented by Judd, Yeltekin, Conklin [Judd et al., 2003].


This algorithm constructs the smallest convex set that contains the fixed point of the 𝐷(𝑆) operator.
Given that we are finding the smallest convex set that contains 𝑆, we can represent it on a computer as the intersection
of a finite number of half-spaces.
Let 𝐻 be a set of subgradients, and 𝐶 be a set of hyperplane levels.
We approximate 𝑆 by:

$$
\tilde S = \{ (w, \theta) \mid H \cdot (w, \theta) \le C \}
$$

A key feature of this algorithm is that we discretize the action space, i.e., we create a grid of possible values for $m$ and $h$ (note that $x$ is implied by $m$ and $h$). This discretization simplifies computation of $\tilde S$ by allowing us to find it by solving a sequence of linear programs.
The outer hyperplane approximation algorithm proceeds as follows:
1. Initialize subgradients, 𝐻, and hyperplane levels, 𝐶0 .
2. Given a set of subgradients, 𝐻, and hyperplane levels, 𝐶𝑡 , for each subgradient ℎ𝑖 ∈ 𝐻:
• Solve a linear program (described below) for each action in the action space.
• Find the maximum and update the corresponding hyperplane level, 𝐶𝑖,𝑡+1 .
3. If |𝐶𝑡+1 − 𝐶𝑡 | > 𝜖, return to 2.
Step 1 simply creates a large initial set 𝑆0 .
Given some set 𝑆𝑡 , Step 2 then constructs the set 𝑆𝑡+1 = 𝐷(𝑆𝑡 ). The linear program in Step 2 is designed to construct
a set 𝑆𝑡+1 that is as large as possible while satisfying the constraints of the 𝐷(𝑆) operator.
To do this, for each subgradient ℎ𝑖 , and for each point in the action space (𝑚𝑗 , ℎ𝑗 ), we solve the following problem:

$$
\max_{[w', \theta']} \; h_i \cdot (w, \theta)
$$

subject to

$$
H \cdot (w', \theta') \le C_t
$$

$$
w = u(f(x_j)) + v(m_j) + \beta w'
$$

$$
\theta = u'(f(x_j))\,(m_j + x_j)
$$

$$
x_j = m_j(h_j - 1)
$$

$$
m_j\left( u'(f(x_j)) - v'(m_j) \right) \le \beta\theta' \quad (= \text{ if } m_j < \bar m)
$$
This problem maximizes the hyperplane level for a given set of actions.
The second part of Step 2 then finds the maximum possible hyperplane level across the action space.
The algorithm constructs a sequence of progressively smaller sets 𝑆𝑡+1 ⊂ 𝑆𝑡 ⊂ 𝑆𝑡−1 ⋯ ⊂ 𝑆0 .
Step 3 ends the algorithm when the difference between these sets is small enough.
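For concreteness, here is a minimal sketch of the linear program in Step 2 for a single subgradient h_i and a single action (m_j, h_j); it is not the lecture's ChangModel code, and the values of β, m̄, the subgradients H, the hyperplane levels C_t and the chosen action are all illustrative.

from scipy.optimize import linprog

β, mbar = 0.8, 30
uc, uc_p = np.log, lambda c: 1 / c
v = lambda m: 1 / 500 * (mbar*m - 0.5*m**2)**0.5
v_p = lambda m: 0.5 / 500 * (mbar*m - 0.5*m**2)**(-0.5) * (mbar - m)
f = lambda x: 180 - (0.4 * x)**2

N = 10                                                    # number of subgradients
angles = np.linspace(0, 2 * np.pi, N, endpoint=False)
H = np.column_stack([np.cos(angles), np.sin(angles)])     # subgradients on the unit circle
C_t = np.full(N, 500.0)                                   # current hyperplane levels (large initial set)

h_i = H[0]                                                # one subgradient
m_j, h_j = 15.0, 1.05                                     # one point on the action grid
x_j = m_j * (h_j - 1)
w_const = uc(f(x_j)) + v(m_j)                             # w = w_const + β w'
θ_j = uc_p(f(x_j)) * (m_j + x_j)                          # θ is pinned down by the action
euler_rhs = -m_j * (uc_p(f(x_j)) - v_p(m_j))              # Euler restriction: -β θ' ≤ euler_rhs

# choice variables are (w', θ'); maximizing h_i·(w, θ) amounts to maximizing h_i[0]·β·w'
c_obj = [-h_i[0] * β, 0.0]
if m_j == mbar:                                           # Euler condition as an inequality
    A_ub, b_ub = np.vstack([H, [0.0, -β]]), np.append(C_t, euler_rhs)
    A_eq = b_eq = None
else:                                                     # Euler condition holds with equality
    A_ub, b_ub = H, C_t
    A_eq, b_eq = [[0.0, -β]], [euler_rhs]

res = linprog(c_obj, A_ub=A_ub, b_ub=b_ub, A_eq=A_eq, b_eq=b_eq,
              bounds=[(None, None), (None, None)])
if res.status == 0:
    level = h_i[0] * (w_const + β * res.x[0]) + h_i[1] * θ_j   # candidate hyperplane level

In the full algorithm this problem is solved for every subgradient and every action on the grid, and the largest resulting level for each subgradient becomes the updated hyperplane level $C_{i,t+1}$.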
We have created a Python class that solves the model assuming the following functional forms:

$$
u(c) = \log(c), \qquad v(m) = \frac{1}{500}\left(m\bar m - 0.5 m^2\right)^{0.5}, \qquad f(x) = 180 - (0.4x)^2
$$

The remaining parameters $\{\beta, \bar m, \underline{h}, \overline{h}\}$ are then variables to be specified for an instance of the Chang class.
Below we use the class to solve the model and plot the resulting equilibrium set, once with 𝛽 = 0.3 and once with 𝛽 = 0.8.
(Here we have set the number of subgradients to 10 in order to speed up the code for now - we can increase accuracy by
increasing the number of subgradients)

"""
Provides a class called ChangModel to solve different
parameterizations of the Chang (1998) model.
"""

import numpy as np
import quantecon as qe
import time

from scipy.spatial import ConvexHull


from scipy.optimize import linprog, minimize, minimize_scalar
from scipy.interpolate import UnivariateSpline
import numpy.polynomial.chebyshev as cheb

class ChangModel:
"""
Class to solve for the competitive and sustainable sets in the Chang (1998)
model, for different parameterizations.
"""

def __init__(self, β, mbar, h_min, h_max, n_h, n_m, N_g):


# Record parameters
self.β, self.mbar, self.h_min, self.h_max = β, mbar, h_min, h_max
self.n_h, self.n_m, self.N_g = n_h, n_m, N_g

# Create other parameters


self.m_min = 1e-9
self.m_max = self.mbar
self.N_a = self.n_h*self.n_m

# Utility and production functions


uc = lambda c: np.log(c)
uc_p = lambda c: 1/c
v = lambda m: 1/500 * (mbar * m - 0.5 * m**2)**0.5
v_p = lambda m: 0.5/500 * (mbar * m - 0.5 * m**2)**(-0.5) * (mbar - m)
u = lambda h, m: uc(f(h, m)) + v(m)

def f(h, m):


x = m * (h - 1)
f = 180 - (0.4 * x)**2
return f

def θ(h, m):


x = m * (h - 1)
θ = uc_p(f(h, m)) * (m + x)
return θ

# Create set of possible action combinations, A


A1 = np.linspace(h_min, h_max, n_h).reshape(n_h, 1)

A2 = np.linspace(self.m_min, self.m_max, n_m).reshape(n_m, 1)
self.A = np.concatenate((np.kron(np.ones((n_m, 1)), A1),
np.kron(A2, np.ones((n_h, 1)))), axis=1)

# Pre-compute utility and output vectors


self.euler_vec = -np.multiply(self.A[:, 1], \
uc_p(f(self.A[:, 0], self.A[:, 1])) - v_p(self.A[:, 1]))
self.u_vec = u(self.A[:, 0], self.A[:, 1])
self.Θ_vec = θ(self.A[:, 0], self.A[:, 1])
self.f_vec = f(self.A[:, 0], self.A[:, 1])
self.bell_vec = np.multiply(uc_p(f(self.A[:, 0],
self.A[:, 1])),
np.multiply(self.A[:, 1],
(self.A[:, 0] - 1))) \
+ np.multiply(self.A[:, 1],
v_p(self.A[:, 1]))

# Find extrema of (w, θ) space for initial guess of equilibrium sets


p_vec = np.zeros(self.N_a)
w_vec = np.zeros(self.N_a)
for i in range(self.N_a):
p_vec[i] = self.Θ_vec[i]
w_vec[i] = self.u_vec[i]/(1 - β)

w_space = np.array([min(w_vec[~np.isinf(w_vec)]),
max(w_vec[~np.isinf(w_vec)])])
p_space = np.array([0, max(p_vec[~np.isinf(w_vec)])])
self.p_space = p_space

# Set up hyperplane levels and gradients for iterations


def SG_H_V(N, w_space, p_space):
"""
This function initializes the subgradients, hyperplane levels,
and extreme points of the value set by choosing an appropriate
origin and radius. It is based on a similar function in QuantEcon's
Games.jl
"""

# First, create a unit circle. Want points placed on [0, 2π]


inc = 2 * np.pi / N
degrees = np.arange(0, 2 * np.pi, inc)

# Points on circle
H = np.zeros((N, 2))
for i in range(N):
x = degrees[i]
H[i, 0] = np.cos(x)
H[i, 1] = np.sin(x)

# Then calculate origin and radius


o = np.array([np.mean(w_space), np.mean(p_space)])
r1 = max((max(w_space) - o[0])**2, (o[0] - min(w_space))**2)
r2 = max((max(p_space) - o[1])**2, (o[1] - min(p_space))**2)
r = np.sqrt(r1 + r2)

# Now calculate vertices


Z = np.zeros((2, N))
for i in range(N):
Z[0, i] = o[0] + r*H.T[0, i]
Z[1, i] = o[1] + r*H.T[1, i]

# Corresponding hyperplane levels


C = np.zeros(N)
for i in range(N):
C[i] = np.dot(Z[:, i], H[i, :])

return C, H, Z

C, self.H, Z = SG_H_V(N_g, w_space, p_space)


C = C.reshape(N_g, 1)
self.c0_c, self.c0_s, self.c1_c, self.c1_s = np.copy(C), np.copy(C), \
np.copy(C), np.copy(C)
self.z0_s, self.z0_c, self.z1_s, self.z1_c = np.copy(Z), np.copy(Z), \
np.copy(Z), np.copy(Z)

self.w_bnds_s, self.w_bnds_c = (w_space[0], w_space[1]), \


(w_space[0], w_space[1])
self.p_bnds_s, self.p_bnds_c = (p_space[0], p_space[1]), \
(p_space[0], p_space[1])

# Create dictionaries to save equilibrium set for each iteration


self.c_dic_s, self.c_dic_c = {}, {}
self.c_dic_s[0], self.c_dic_c[0] = self.c0_s, self.c0_c

def solve_worst_spe(self):
"""
Method to solve for BR(Z). See p.449 of Chang (1998)
"""

p_vec = np.full(self.N_a, np.nan)


c = [1, 0]

# Pre-compute constraints
aineq_mbar = np.vstack((self.H, np.array([0, -self.β])))
bineq_mbar = np.vstack((self.c0_s, 0))

aineq = self.H
bineq = self.c0_s
aeq = [[0, -self.β]]

for j in range(self.N_a):
# Only try if consumption is possible
if self.f_vec[j] > 0:
# If m = mbar, use inequality constraint
if self.A[j, 1] == self.mbar:
bineq_mbar[-1] = self.euler_vec[j]
res = linprog(c, A_ub=aineq_mbar, b_ub=bineq_mbar,
bounds=(self.w_bnds_s, self.p_bnds_s))
else:
beq = self.euler_vec[j]
res = linprog(c, A_ub=aineq, b_ub=bineq, A_eq=aeq, b_eq=beq,
bounds=(self.w_bnds_s, self.p_bnds_s))


if res.status == 0:
p_vec[j] = self.u_vec[j] + self.β * res.x[0]

# Max over h and min over other variables (see Chang (1998) p.449)
self.br_z = np.nanmax(np.nanmin(p_vec.reshape(self.n_m, self.n_h), 0))

def solve_subgradient(self):
"""
Method to solve for E(Z). See p.449 of Chang (1998)
"""

# Pre-compute constraints
aineq_C_mbar = np.vstack((self.H, np.array([0, -self.β])))
bineq_C_mbar = np.vstack((self.c0_c, 0))

aineq_C = self.H
bineq_C = self.c0_c
aeq_C = [[0, -self.β]]

aineq_S_mbar = np.vstack((np.vstack((self.H, np.array([0, -self.β]))),


np.array([-self.β, 0])))
bineq_S_mbar = np.vstack((self.c0_s, np.zeros((2, 1))))

aineq_S = np.vstack((self.H, np.array([-self.β, 0])))


bineq_S = np.vstack((self.c0_s, 0))
aeq_S = [[0, -self.β]]

# Update maximal hyperplane level


for i in range(self.N_g):
c_a1a2_c, t_a1a2_c = np.full(self.N_a, -np.inf), \
np.zeros((self.N_a, 2))
c_a1a2_s, t_a1a2_s = np.full(self.N_a, -np.inf), \
np.zeros((self.N_a, 2))

c = [-self.H[i, 0], -self.H[i, 1]]

for j in range(self.N_a):
# Only try if consumption is possible
if self.f_vec[j] > 0:

# COMPETITIVE EQUILIBRIA
# If m = mbar, use inequality constraint
if self.A[j, 1] == self.mbar:
bineq_C_mbar[-1] = self.euler_vec[j]
res = linprog(c, A_ub=aineq_C_mbar, b_ub=bineq_C_mbar,
bounds=(self.w_bnds_c, self.p_bnds_c))
# If m < mbar, use equality constraint
else:
beq_C = self.euler_vec[j]
res = linprog(c, A_ub=aineq_C, b_ub=bineq_C, A_eq = aeq_C,
b_eq = beq_C, bounds=(self.w_bnds_c, \
self.p_bnds_c))
if res.status == 0:
c_a1a2_c[j] = self.H[i, 0] * (self.u_vec[j] \
+ self.β * res.x[0]) + self.H[i, 1] * self.Θ_vec[j]
t_a1a2_c[j] = res.x

# SUSTAINABLE EQUILIBRIA
# If m = mbar, use inequality constraint
if self.A[j, 1] == self.mbar:
bineq_S_mbar[-2] = self.euler_vec[j]
bineq_S_mbar[-1] = self.u_vec[j] - self.br_z
res = linprog(c, A_ub=aineq_S_mbar, b_ub=bineq_S_mbar,
bounds=(self.w_bnds_s, self.p_bnds_s))
# If m < mbar, use equality constraint
else:
bineq_S[-1] = self.u_vec[j] - self.br_z
beq_S = self.euler_vec[j]
res = linprog(c, A_ub=aineq_S, b_ub=bineq_S, A_eq = aeq_S,
b_eq = beq_S, bounds=(self.w_bnds_s, \
self.p_bnds_s))
if res.status == 0:
c_a1a2_s[j] = self.H[i, 0] * (self.u_vec[j] \
+ self.β*res.x[0]) + self.H[i, 1] * self.Θ_vec[j]
t_a1a2_s[j] = res.x

idx_c = np.where(c_a1a2_c == max(c_a1a2_c))[0][0]


self.z1_c[:, i] = np.array([self.u_vec[idx_c]
+ self.β * t_a1a2_c[idx_c, 0],
self.Θ_vec[idx_c]])

idx_s = np.where(c_a1a2_s == max(c_a1a2_s))[0][0]


self.z1_s[:, i] = np.array([self.u_vec[idx_s]
+ self.β * t_a1a2_s[idx_s, 0],
self.Θ_vec[idx_s]])

for i in range(self.N_g):
self.c1_c[i] = np.dot(self.z1_c[:, i], self.H[i, :])
self.c1_s[i] = np.dot(self.z1_s[:, i], self.H[i, :])

def solve_sustainable(self, tol=1e-5, max_iter=250):


"""
Method to solve for the competitive and sustainable equilibrium sets.
"""

t = time.time()
diff = tol + 1
iters = 0

print('### --------------- ###')


print('Solving Chang Model Using Outer Hyperplane Approximation')
print('### --------------- ### \n')

print('Maximum difference when updating hyperplane levels:')

while diff > tol and iters < max_iter:


iters = iters + 1
self.solve_worst_spe()
self.solve_subgradient()
diff = max(np.maximum(abs(self.c0_c - self.c1_c),
abs(self.c0_s - self.c1_s)))
print(diff)

# Update hyperplane levels


self.c0_c, self.c0_s = np.copy(self.c1_c), np.copy(self.c1_s)

# Update bounds for w and θ


wmin_c, wmax_c = np.min(self.z1_c, axis=1)[0], \
np.max(self.z1_c, axis=1)[0]
pmin_c, pmax_c = np.min(self.z1_c, axis=1)[1], \
np.max(self.z1_c, axis=1)[1]

wmin_s, wmax_s = np.min(self.z1_s, axis=1)[0], \


np.max(self.z1_s, axis=1)[0]
pmin_S, pmax_S = np.min(self.z1_s, axis=1)[1], \
np.max(self.z1_s, axis=1)[1]

self.w_bnds_s, self.w_bnds_c = (wmin_s, wmax_s), (wmin_c, wmax_c)


self.p_bnds_s, self.p_bnds_c = (pmin_S, pmax_S), (pmin_c, pmax_c)

# Save iteration
self.c_dic_c[iters], self.c_dic_s[iters] = np.copy(self.c1_c), \
np.copy(self.c1_s)
self.iters = iters

elapsed = time.time() - t
print('Convergence achieved after {} iterations and {} \
seconds'.format(iters, round(elapsed, 2)))

def solve_bellman(self, θ_min, θ_max, order, disp=False, tol=1e-7, maxiters=100):


"""
Continuous Method to solve the Bellman equation in section 25.3
"""
mbar = self.mbar

# Utility and production functions


uc = lambda c: np.log(c)
uc_p = lambda c: 1 / c
v = lambda m: 1 / 500 * (mbar * m - 0.5 * m**2)**0.5
v_p = lambda m: 0.5/500 * (mbar*m - 0.5 * m**2)**(-0.5) * (mbar - m)
u = lambda h, m: uc(f(h, m)) + v(m)

def f(h, m):


x = m * (h - 1)
f = 180 - (0.4 * x)**2
return f

def θ(h, m):


x = m * (h - 1)
θ = uc_p(f(h, m)) * (m + x)
return θ

# Bounds for Maximization


lb1 = np.array([self.h_min, 0, θ_min])
ub1 = np.array([self.h_max, self.mbar - 1e-5, θ_max])
lb2 = np.array([self.h_min, θ_min])
ub2 = np.array([self.h_max, θ_max])

# Initialize Value Function coefficients
# Calculate roots of Chebyshev polynomial
k = np.linspace(order, 1, order)
roots = np.cos((2 * k - 1) * np.pi / (2 * order))
# Scale to approximation space
s = θ_min + (roots - -1) / 2 * (θ_max - θ_min)
# Create a basis matrix
Φ = cheb.chebvander(roots, order - 1)
c = np.zeros(Φ.shape[0])

# Function to minimize and constraints


def p_fun(x):
scale = -1 + 2 * (x[2] - θ_min)/(θ_max - θ_min)
p_fun = - (u(x[0], x[1]) \
+ self.β * np.dot(cheb.chebvander(scale, order - 1), c))
return p_fun

def p_fun2(x):
scale = -1 + 2*(x[1] - θ_min)/(θ_max - θ_min)
p_fun = - (u(x[0],mbar) \
+ self.β * np.dot(cheb.chebvander(scale, order - 1), c))
return p_fun

cons1 = ({'type': 'eq', 'fun': lambda x: uc_p(f(x[0], x[1])) * x[1]


* (x[0] - 1) + v_p(x[1]) * x[1] + self.β * x[2] - θ},
{'type': 'eq', 'fun': lambda x: uc_p(f(x[0], x[1]))
* x[0] * x[1] - θ})
cons2 = ({'type': 'ineq', 'fun': lambda x: uc_p(f(x[0], mbar)) * mbar
* (x[0] - 1) + v_p(mbar) * mbar + self.β * x[1] - θ},
{'type': 'eq', 'fun': lambda x: uc_p(f(x[0], mbar))
* x[0] * mbar - θ})

bnds1 = np.concatenate([lb1.reshape(3, 1), ub1.reshape(3, 1)], axis=1)


bnds2 = np.concatenate([lb2.reshape(2, 1), ub2.reshape(2, 1)], axis=1)

# Bellman Iterations
diff = 1
iters = 1

while diff > tol:


# 1. Maximization, given value function guess
p_iter1 = np.zeros(order)
for i in range(order):
θ = s[i]
res = minimize(p_fun,
lb1 + (ub1-lb1) / 2,
method='SLSQP',
bounds=bnds1,
constraints=cons1,
tol=1e-10)
if res.success == True:
p_iter1[i] = -p_fun(res.x)
res = minimize(p_fun2,
lb2 + (ub2-lb2) / 2,
method='SLSQP',
bounds=bnds2,

constraints=cons2,
tol=1e-10)
if -p_fun2(res.x) > p_iter1[i] and res.success == True:
p_iter1[i] = -p_fun2(res.x)

# 2. Bellman updating of Value Function coefficients


c1 = np.linalg.solve(Φ, p_iter1)
# 3. Compute distance and update
diff = np.linalg.norm(c - c1)
if bool(disp == True):
print(diff)
c = np.copy(c1)
iters = iters + 1
if iters > maxiters:
print('Convergence failed after {} iterations'.format(maxiters))
break

self.θ_grid = s
self.p_iter = p_iter1
self.Φ = Φ
self.c = c
print('Convergence achieved after {} iterations'.format(iters))

# Check residuals
θ_grid_fine = np.linspace(θ_min, θ_max, 100)
resid_grid = np.zeros(100)
p_grid = np.zeros(100)
θ_prime_grid = np.zeros(100)
m_grid = np.zeros(100)
h_grid = np.zeros(100)
for i in range(100):
θ = θ_grid_fine[i]
res = minimize(p_fun,
lb1 + (ub1-lb1) / 2,
method='SLSQP',
bounds=bnds1,
constraints=cons1,
tol=1e-10)
if res.success == True:
p = -p_fun(res.x)
p_grid[i] = p
θ_prime_grid[i] = res.x[2]
h_grid[i] = res.x[0]
m_grid[i] = res.x[1]
res = minimize(p_fun2,
lb2 + (ub2-lb2)/2,
method='SLSQP',
bounds=bnds2,
constraints=cons2,
tol=1e-10)
if -p_fun2(res.x) > p and res.success == True:
p = -p_fun2(res.x)
p_grid[i] = p
θ_prime_grid[i] = res.x[1]
h_grid[i] = res.x[0]
m_grid[i] = self.mbar

scale = -1 + 2 * (θ - θ_min)/(θ_max - θ_min)
resid_grid[i] = np.dot(cheb.chebvander(scale, order-1), c) - p

self.resid_grid = resid_grid
self.θ_grid_fine = θ_grid_fine
self.θ_prime_grid = θ_prime_grid
self.m_grid = m_grid
self.h_grid = h_grid
self.p_grid = p_grid
self.x_grid = m_grid * (h_grid - 1)

# Simulate
θ_series = np.zeros(31)
m_series = np.zeros(30)
h_series = np.zeros(30)

# Find initial θ
def ValFun(x):
scale = -1 + 2*(x - θ_min)/(θ_max - θ_min)
p_fun = np.dot(cheb.chebvander(scale, order - 1), c)
return -p_fun

res = minimize(ValFun,
(θ_min + θ_max)/2,
bounds=[(θ_min, θ_max)])
θ_series[0] = res.x

# Simulate
for i in range(30):
θ = θ_series[i]
res = minimize(p_fun,
lb1 + (ub1-lb1)/2,
method='SLSQP',
bounds=bnds1,
constraints=cons1,
tol=1e-10)
if res.success == True:
p = -p_fun(res.x)
h_series[i] = res.x[0]
m_series[i] = res.x[1]
θ_series[i+1] = res.x[2]
res2 = minimize(p_fun2,
lb2 + (ub2-lb2)/2,
method='SLSQP',
bounds=bnds2,
constraints=cons2,
tol=1e-10)
if -p_fun2(res2.x) > p and res2.success == True:
h_series[i] = res2.x[0]
m_series[i] = self.mbar
θ_series[i+1] = res2.x[1]

self.θ_series = θ_series
self.m_series = m_series
self.h_series = h_series
self.x_series = m_series * (h_series - 1)

ch1 = ChangModel(β=0.3, mbar=30, h_min=0.9, h_max=2, n_h=8, n_m=35, N_g=10)


ch1.solve_sustainable()

### --------------- ###


Solving Chang Model Using Outer Hyperplane Approximation
### --------------- ###

Maximum difference when updating hyperplane levels:

[1.9168]

[0.66782]

[0.49235]

[0.32412]

[0.19022]

[0.10863]

[0.05817]

[0.0262]

[0.01836]

[0.01415]

[0.00297]

[0.00089]

[0.00027]

[0.00008]

[0.00002]

[0.00001]
Convergence achieved after 16 iterations and 38.65 seconds

def plot_competitive(ChangModel):
"""
Method that only plots competitive equilibrium set
"""
poly_C = polytope.Polytope(ChangModel.H, ChangModel.c1_c)
ext_C = polytope.extreme(poly_C)

fig, ax = plt.subplots(figsize=(7, 5))

ax.set_xlabel('w', fontsize=16)
ax.set_ylabel(r"$\theta$", fontsize=18)

ax.fill(ext_C[:,0], ext_C[:,1], 'r', zorder=0)


ChangModel.min_theta = min(ext_C[:, 1])
ChangModel.max_theta = max(ext_C[:, 1])

# Add point showing Ramsey Plan


idx_Ramsey = np.where(ext_C[:, 0] == max(ext_C[:, 0]))[0][0]
R = ext_C[idx_Ramsey, :]
ax.scatter(R[0], R[1], 150, 'black', 'o', zorder=1)
w_min = min(ext_C[:, 0])

# Label Ramsey Plan slightly to the right of the point


ax.annotate("R", xy=(R[0], R[1]), xytext=(R[0] + 0.03 * (R[0] - w_min),
R[1]), fontsize=18)

plt.tight_layout()
plt.show()

plot_competitive(ch1)

ch2 = ChangModel(β=0.8, mbar=30, h_min=0.9, h_max=1/0.8,


n_h=8, n_m=35, N_g=10)
ch2.solve_sustainable()

### --------------- ###


Solving Chang Model Using Outer Hyperplane Approximation
### --------------- ###

Maximum difference when updating hyperplane levels:

[0.06369]

[0.02476]

[0.02153]

[0.01915]

[0.01795]

[0.01642]

[0.01507]

[0.01284]

[0.01106]

[0.00694]

[0.0085]

[0.00781]

[0.00433]

[0.00492]

[0.00303]

[0.00182]

[0.00638]

[0.00116]

[0.00093]

[0.00075]

[0.0006]

[0.00494]

[0.00038]

[0.00121]

[0.00024]

[0.0002]

[0.00016]

[0.00013]

[0.0001]

[0.00008]

[0.00006]

[0.00005]

[0.00004]

[0.00003]

[0.00003]

[0.00002]

[0.00002]

[0.00001]

[0.00001]

[0.00001]
Convergence achieved after 40 iterations and 114.68 seconds

plot_competitive(ch2)

49.7 Solving a Continuation Ramsey Planner’s Bellman Equation

In this section we solve the Bellman equation confronting a continuation Ramsey planner.
The construction of a Ramsey plan is decomposed into two subproblems in Ramsey plans, time inconsistency, sustainable plans and dynamic Stackelberg problems.
• Subproblem 1 is faced by a sequence of continuation Ramsey planners at 𝑡 ≥ 1.
• Subproblem 2 is faced by a Ramsey planner at 𝑡 = 0.
The problem is:

$$
J(\theta) = \max_{m, x, h, \theta'} \; u(f(x)) + v(m) + \beta J(\theta')
$$

subject to:

$$
\theta \le u'(f(x)) x + v'(m) m + \beta \theta'
$$

$$
\theta = u'(f(x)) (m + x)
$$

$$
x = m(h - 1)
$$

$$
(m, x, h) \in E
$$

$$
\theta' \in \Omega
$$
To solve this Bellman equation, we must know the set Ω.

We have solved the Bellman equation for the two sets of parameter values for which we computed the equilibrium value
sets above.
Hence for these parameter configurations, we know the bounds of Ω.
The two sets of parameters differ only in the level of 𝛽.
From the figures earlier in this lecture, we know that when 𝛽 = 0.3, Ω = [0.0088, 0.0499], and when 𝛽 = 0.8,
Ω = [0.0395, 0.2193]
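
Under the hood, `solve_bellman` approximates $J(\theta)$ with a Chebyshev polynomial of a given order fitted at collocation nodes on $[\theta_{\min}, \theta_{\max}]$. The following minimal sketch, which is not part of the lecture code, reproduces the node and basis construction used inside the class, here with the values passed below for the $\beta = 0.3$ case:

import numpy as np
import numpy.polynomial.chebyshev as cheb

θ_min, θ_max, order = 0.01, 0.0499, 30   # values used below for the β = 0.3 case

# Chebyshev roots on [-1, 1], scaled to the approximation interval [θ_min, θ_max]
k = np.linspace(order, 1, order)
roots = np.cos((2 * k - 1) * np.pi / (2 * order))
s = θ_min + (roots + 1) / 2 * (θ_max - θ_min)

# Basis matrix Φ; given values p of the maximand at the nodes, the coefficient
# vector c of the approximation solves Φ @ c = p
Φ = cheb.chebvander(roots, order - 1)
print(s[:3], Φ.shape)
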

ch1 = ChangModel(β=0.3, mbar=30, h_min=0.99, h_max=1/0.3,


n_h=8, n_m=35, N_g=50)
ch2 = ChangModel(β=0.8, mbar=30, h_min=0.1, h_max=1/0.8,
n_h=20, n_m=50, N_g=50)

/tmp/ipykernel_7471/1608401414.py:33: RuntimeWarning: invalid value encountered in␣


↪log

uc = lambda c: np.log(c)

ch1.solve_bellman(θ_min=0.01, θ_max=0.0499, order=30, tol=1e-6)


ch2.solve_bellman(θ_min=0.045, θ_max=0.15, order=30, tol=1e-6)

/tmp/ipykernel_7471/1608401414.py:382: DeprecationWarning: Conversion of an array␣


↪with ndim > 0 to a scalar is deprecated, and will error in future. Ensure you␣

↪extract a single element from your array before performing this operation.␣

↪(Deprecated NumPy 1.25.)

p_iter1[i] = -p_fun(res.x)
/tmp/ipykernel_7471/1608401414.py:309: RuntimeWarning: invalid value encountered␣
↪in log

uc = lambda c: np.log(c)

Convergence achieved after 15 iterations

/tmp/ipykernel_7471/1608401414.py:427: DeprecationWarning: Conversion of an array␣


↪with ndim > 0 to a scalar is deprecated, and will error in future. Ensure you␣

↪extract a single element from your array before performing this operation.␣

↪(Deprecated NumPy 1.25.)

p_grid[i] = p
/tmp/ipykernel_7471/1608401414.py:444: DeprecationWarning: Conversion of an array␣
↪with ndim > 0 to a scalar is deprecated, and will error in future. Ensure you␣

↪extract a single element from your array before performing this operation.␣

↪(Deprecated NumPy 1.25.)

resid_grid[i] = np.dot(cheb.chebvander(scale, order-1), c) - p

/tmp/ipykernel_7471/1608401414.py:468: DeprecationWarning: Conversion of an array␣


↪with ndim > 0 to a scalar is deprecated, and will error in future. Ensure you␣

↪extract a single element from your array before performing this operation.␣

↪(Deprecated NumPy 1.25.)

θ_series[0] = res.x

/home/runner/miniconda3/envs/quantecon/lib/python3.12/site-packages/scipy/optimize/
↪_slsqp_py.py:437: RuntimeWarning: Values in x were outside bounds during a␣

↪minimize step, clipping to bounds

fx = wrapped_fun(x)

/home/runner/miniconda3/envs/quantecon/lib/python3.12/site-packages/scipy/optimize/
↪_slsqp_py.py:441: RuntimeWarning: Values in x were outside bounds during a␣

↪minimize step, clipping to bounds

g = append(wrapped_grad(x), 0.0)
/home/runner/miniconda3/envs/quantecon/lib/python3.12/site-packages/scipy/optimize/
↪_slsqp_py.py:495: RuntimeWarning: Values in x were outside bounds during a␣

↪minimize step, clipping to bounds

a_eq = vstack([con['jac'](x, *con['args'])


/home/runner/miniconda3/envs/quantecon/lib/python3.12/site-packages/scipy/optimize/
↪_slsqp_py.py:501: RuntimeWarning: Values in x were outside bounds during a␣

↪minimize step, clipping to bounds

a_ieq = vstack([con['jac'](x, *con['args'])

Convergence achieved after 72 iterations

First, a quick check that our approximations of the value functions are good.
We do this by calculating the residuals between iterates on the value function on a fine grid:

max(abs(ch1.resid_grid)), max(abs(ch2.resid_grid))

(6.46313155971967e-06, 6.875358415925348e-07)

The value functions plotted below trace out the right edges of the sets of equilibrium values plotted above

fig, axes = plt.subplots(1, 2, figsize=(12, 4))

for ax, model in zip(axes, (ch1, ch2)):


ax.plot(model.θ_grid, model.p_iter)
ax.set(xlabel=r"$\theta$",
ylabel=r"$J(\theta)$",
title=rf"$\beta = {model.β}$")

plt.show()

The next figure plots the optimal policy functions; values of 𝜃′ , 𝑚, 𝑥, ℎ for each value of the state 𝜃:

for model in (ch1, ch2):

fig, axes = plt.subplots(2, 2, figsize=(12, 6), sharex=True)


fig.suptitle(rf"$\beta = {model.β}$", fontsize=16)

plots = [model.θ_prime_grid, model.m_grid,


model.h_grid, model.x_grid]
labels = [r"$\theta'$", "$m$", "$h$", "$x$"]

for ax, plot, label in zip(axes.flatten(), plots, labels):


ax.plot(model.θ_grid_fine, plot)
ax.set_xlabel(r"$\theta$", fontsize=14)
ax.set_ylabel(label, fontsize=14)

plt.show()

With the first set of parameter values, the value of 𝜃′ chosen by the Ramsey planner quickly hits the upper limit of Ω.
But with the second set of parameters it converges to a value in the interior of the set.
Consequently, the choice of $\bar\theta$ is clearly important with the first set of parameter values.
One way of seeing this is plotting $\theta'(\theta)$ for each set of parameters.
With the first set of parameter values, this function does not intersect the 45-degree line until $\bar\theta$, whereas in the second set of parameter values, it intersects in the interior.

fig, axes = plt.subplots(1, 2, figsize=(12, 4))

for ax, model in zip(axes, (ch1, ch2)):


ax.plot(model.θ_grid_fine, model.θ_prime_grid, label=r"$\theta'(\theta)$")
ax.plot(model.θ_grid_fine, model.θ_grid_fine, label=r"$\theta$")
ax.set(xlabel=r"$\theta$", title=rf"$\beta = {model.β}$")

axes[0].legend()
plt.show()

Subproblem 2 is equivalent to the planner choosing the initial value of 𝜃 (i.e. the value which maximizes the value
function).
From this starting point, we can then trace out the paths for $\{\theta_t, m_t, h_t, x_t\}_{t=0}^{\infty}$ that support this equilibrium.

These are shown below for both sets of parameters.
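
The first entry of `θ_series` is found by maximizing the approximated value function. A minimal sketch of that step, assuming a solved instance such as `ch1` and the same `θ_min`, `θ_max` and `order` passed to `solve_bellman` above:

import numpy as np
import numpy.polynomial.chebyshev as cheb
from scipy.optimize import minimize

θ_min, θ_max, order = 0.01, 0.0499, 30   # must match the call to solve_bellman

def neg_J(θ):
    θ = float(np.atleast_1d(θ)[0])                  # minimize passes a length-1 array
    scale = -1 + 2 * (θ - θ_min) / (θ_max - θ_min)  # map θ to [-1, 1]
    return -float(cheb.chebval(scale, ch1.c))       # Chebyshev series with coefficients ch1.c

res = minimize(neg_J, x0=(θ_min + θ_max) / 2, bounds=[(θ_min, θ_max)])
θ0 = res.x[0]   # the planner's time-0 choice of θ
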

for model in (ch1, ch2):

fig, axes = plt.subplots(2, 2, figsize=(12, 6))


fig.suptitle(rf"$\beta = {model.β}$")

plots = [model.θ_series, model.m_series, model.h_series, model.x_series]


labels = [r"$\theta$", "$m$", "$h$", "$x$"]

for ax, plot, label in zip(axes.flatten(), plots, labels):


ax.plot(plot)
ax.set(xlabel='t', ylabel=label)

plt.show()

49.7.1 Next Steps

In Credible Government Policies in Chang Model we shall find a subset of competitive equilibria that are sustainable in the sense that a sequence of government administrations that chooses sequentially, rather than once and for all at time 0, will choose to implement them.
In the process of constructing them, we shall construct another, smaller set of competitive equilibria.

CHAPTER FIFTY

CREDIBLE GOVERNMENT POLICIES IN A MODEL OF CHANG

In addition to what’s in Anaconda, this lecture will need the following libraries:

!pip install polytope

50.1 Overview

Some of the material in this lecture and competitive equilibria in the Chang model can be viewed as more sophisticated
and complete treatments of the topics discussed in Ramsey plans, time inconsistency, sustainable plans.
This lecture assumes almost the same economic environment analyzed in competitive equilibria in the Chang model.
The only change – and it is a substantial one – is the timing protocol for making government decisions.
In competitive equilibria in the Chang model, a Ramsey planner chose a comprehensive government policy once-and-for-all
at time 0.
Now in this lecture, there is no time 0 Ramsey planner.
Instead there is a sequence of government decision-makers, one for each 𝑡.
The time $t$ government decision-maker chooses time $t$ government actions after forecasting what future governments will do.
We use the notion of a sustainable plan proposed in [Chari and Kehoe, 1990], also referred to as a credible public policy
in [Stokey, 1989].
Technically, this lecture starts where lecture competitive equilibria in the Chang model on Ramsey plans within the Chang
[Chang, 1998] model stopped.
That lecture presents recursive representations of competitive equilibria and a Ramsey plan for a version of a model of
Calvo [Calvo, 1978] that Chang used to analyze and illustrate these concepts.
We used two operators to characterize competitive equilibria and a Ramsey plan, respectively.
In this lecture, we define a credible public policy or sustainable plan.
Starting from a large enough initial set $Z_0$, we use iterations on Chang's set-to-set operator $\tilde D(Z)$ to compute a set of values associated with sustainable plans.
Chang's operator $\tilde D(Z)$ is closely connected with the operator $D(Z)$ introduced in lecture competitive equilibria in the Chang model.
• $\tilde D(Z)$ incorporates all of the restrictions imposed in constructing the operator $D(Z)$, but ….
• It adds some additional restrictions
– these additional restrictions incorporate the idea that a plan must be sustainable.


– sustainable means that the government wants to implement it at all times after all histories.
Let’s start with some standard imports:

import numpy as np
import polytope
import matplotlib.pyplot as plt

`polytope` failed to import `cvxopt.glpk`.

will use `scipy.optimize.linprog`

50.2 The Setting

We begin by reviewing the set up deployed in competitive equilibria in the Chang model.
Chang’s model, adopted from Calvo, is designed to focus on the intertemporal trade-offs between the welfare benefits
of deflation and the welfare costs associated with the high tax collections required to retire money at a rate that delivers
deflation.
A benevolent time 0 government can promote utility generating increases in real balances only by imposing an infinite
sequence of sufficiently large distorting tax collections.
To promote the welfare increasing effects of high real balances, the government wants to induce gradual deflation.
We start by reviewing notation.
For a sequence of scalars $\vec z \equiv \{z_t\}_{t=0}^{\infty}$, let $\vec z^t = (z_0, \ldots, z_t)$ and $\vec z_t = (z_t, z_{t+1}, \ldots)$.

An infinitely lived representative agent and an infinitely lived government exist at dates 𝑡 = 0, 1, ….
The objects in play are
• an initial quantity 𝑀−1 of nominal money holdings
• a sequence of inverse money growth rates ℎ⃗ and an associated sequence of nominal money holdings 𝑀⃗
• a sequence of values of money 𝑞 ⃗
• a sequence of real money holdings 𝑚⃗
• a sequence of total tax collections 𝑥⃗
• a sequence of per capita rates of consumption 𝑐 ⃗
• a sequence of per capita incomes 𝑦 ⃗
A benevolent government chooses sequences (𝑀⃗ , ℎ,⃗ 𝑥)⃗ subject to a sequence of budget constraints and other constraints
imposed by competitive equilibrium.
Given tax collection and price of money sequences, a representative household chooses sequences (𝑐,⃗ 𝑚)
⃗ of consumption
and real balances.
In competitive equilibrium, the price of money sequence 𝑞 ⃗ clears markets, thereby reconciling decisions of the government
and the representative household.

50.2.1 The Household’s Problem

A representative household faces a nonnegative value of money sequence 𝑞 ⃗ and sequences 𝑦,⃗ 𝑥⃗ of income and total tax
collections, respectively.
The household chooses nonnegative sequences 𝑐,⃗ 𝑀⃗ of consumption and nominal balances, respectively, to maximize

$$
\sum_{t=0}^{\infty} \beta^t \left[ u(c_t) + v(q_t M_t) \right] \tag{50.1}
$$

subject to

𝑞𝑡 𝑀𝑡 ≤ 𝑦𝑡 + 𝑞𝑡 𝑀𝑡−1 − 𝑐𝑡 − 𝑥𝑡 (50.2)

and

𝑞𝑡 𝑀𝑡 ≤ 𝑚̄ (50.3)

Here 𝑞𝑡 is the reciprocal of the price level at 𝑡, also known as the value of money.
Chang [Chang, 1998] assumes that
• 𝑢 ∶ ℝ+ → ℝ is twice continuously differentiable, strictly concave, and strictly increasing;
• 𝑣 ∶ ℝ+ → ℝ is twice continuously differentiable and strictly concave;
• $\lim_{c \to 0} u'(c) = \lim_{m \to 0} v'(m) = +\infty$;
• there is a finite level 𝑚 = 𝑚𝑓 such that 𝑣′ (𝑚𝑓 ) = 0
Real balances carried out of a period equal 𝑚𝑡 = 𝑞𝑡 𝑀𝑡 .
Inequality (50.2) is the household’s time 𝑡 budget constraint.
It tells how real balances 𝑞𝑡 𝑀𝑡 carried out of period 𝑡 depend on income, consumption, taxes, and real balances 𝑞𝑡 𝑀𝑡−1
carried into the period.
Equation (50.3) imposes an exogenous upper bound 𝑚̄ on the choice of real balances, where 𝑚̄ ≥ 𝑚𝑓 .

50.2.2 Government
The government chooses a sequence of inverse money growth rates with time $t$ component $h_t \equiv \frac{M_{t-1}}{M_t} \in \Pi \equiv [\underline\pi, \overline\pi]$, where $0 < \underline\pi < 1 < \frac{1}{\beta} \le \overline\pi$.
The government faces a sequence of budget constraints with time 𝑡 component

−𝑥𝑡 = 𝑞𝑡 (𝑀𝑡 − 𝑀𝑡−1 )

which, by using the definitions of 𝑚𝑡 and ℎ𝑡 , can also be expressed as

−𝑥𝑡 = 𝑚𝑡 (1 − ℎ𝑡 ) (50.4)

The restrictions $m_t \in [0, \bar m]$ and $h_t \in \Pi$ evidently imply that $x_t \in X \equiv [(\underline\pi - 1)\bar m, (\overline\pi - 1)\bar m]$.
We define the set $E \equiv [0, \bar m] \times \Pi \times X$, so that we require that $(m, h, x) \in E$.
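
For example, a small numerical check of the government budget constraint (50.4) and the set $X$, using illustrative values for $\bar m$ and the bounds of $\Pi$ (these particular numbers are not taken from the lecture):

m_bar = 30.0                  # illustrative upper bound on real balances
π_low, π_high = 0.9, 2.0      # illustrative bounds on inverse money growth

# Feasible tax collections implied by m ∈ [0, m_bar] and h ∈ [π_low, π_high]
X = ((π_low - 1) * m_bar, (π_high - 1) * m_bar)

# For any admissible (m, h), the x implied by -x = m(1 - h) lies in X
m, h = 10.0, 1.05
x = m * (h - 1)
assert X[0] <= x <= X[1]
print(X, x)
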
To represent the idea that taxes are distorting, Chang makes the following assumption about outcomes for per capita
output:

𝑦𝑡 = 𝑓(𝑥𝑡 ) (50.5)

where 𝑓 ∶ ℝ → ℝ satisfies 𝑓(𝑥) > 0, 𝑓(𝑥) is twice continuously differentiable, 𝑓 ″ (𝑥) < 0, 𝑓 ′ (0) = 0, and 𝑓(𝑥) = 𝑓(−𝑥)
for all 𝑥 ∈ ℝ, so that subsidies and taxes are equally distorting.
The purpose is not to model the causes of tax distortions in any detail but simply to summarize the outcome of those
distortions via the function 𝑓(𝑥).
A key part of the specification is that tax distortions are increasing in the absolute value of tax revenues.
The government chooses a competitive equilibrium that maximizes (50.1).

50.2.3 Within-period Timing Protocol

For the results in this lecture, the timing of actions within a period is important because of the incentives that it activates.
Chang assumed the following within-period timing of decisions:
• first, the government chooses ℎ𝑡 and 𝑥𝑡 ;
• then given 𝑞 ⃗ and its expectations about future values of 𝑥 and 𝑦’s, the household chooses 𝑀𝑡 and therefore 𝑚𝑡
because 𝑚𝑡 = 𝑞𝑡 𝑀𝑡 ;
• then output 𝑦𝑡 = 𝑓(𝑥𝑡 ) is realized;
• finally 𝑐𝑡 = 𝑦𝑡
This within-period timing confronts the government with choices framed by how the private sector wants to respond when
the government takes time 𝑡 actions that differ from what the private sector had expected.
This timing will shape the incentives confronting the government at each history that are to be incorporated in the construction of the $\tilde D$ operator below.

50.2.4 Household’s Problem

Given $M_{-1}$ and $\{q_t\}_{t=0}^{\infty}$, the household's problem is

$$
\mathcal{L} = \max_{\vec c, \vec M} \min_{\vec \lambda, \vec \mu} \sum_{t=0}^{\infty} \beta^t \big\{ u(c_t) + v(M_t q_t) + \lambda_t [y_t - c_t - x_t + q_t M_{t-1} - q_t M_t] + \mu_t [\bar m - q_t M_t] \big\}
$$

First-order conditions with respect to $c_t$ and $M_t$, respectively, are

$$
u'(c_t) = \lambda_t
$$

$$
q_t [u'(c_t) - v'(M_t q_t)] \le \beta u'(c_{t+1}) q_{t+1}, \quad = \text{ if } M_t q_t < \bar m
$$

Using $h_t = \frac{M_{t-1}}{M_t}$ and $q_t = \frac{m_t}{M_t}$ in these first-order conditions and rearranging implies

$$
m_t [u'(c_t) - v'(m_t)] \le \beta u'(f(x_{t+1})) m_{t+1} h_{t+1}, \quad = \text{ if } m_t < \bar m \tag{50.6}
$$

Define the following key variable

𝜃𝑡+1 ≡ 𝑢′ (𝑓(𝑥𝑡+1 ))𝑚𝑡+1 ℎ𝑡+1 (50.7)

This is real money balances at time 𝑡 + 1 measured in units of marginal utility, which Chang refers to as ‘the marginal
utility of real balances’.
From the standpoint of the household at time $t$, equation (50.7) shows that $\theta_{t+1}$ intermediates the influences of $(\vec x_{t+1}, \vec m_{t+1})$ on the household's choice of real balances $m_t$.

By "intermediates" we mean that the future paths $(\vec x_{t+1}, \vec m_{t+1})$ influence $m_t$ entirely through their effects on the scalar $\theta_{t+1}$.
The observation that the one dimensional promised marginal utility of real balances 𝜃𝑡+1 functions in this way is an
important step in constructing a class of competitive equilibria that have a recursive representation.
A closely related observation pervaded the analysis of Stackelberg plans in dynamic Stackelberg problems and the Calvo
model.

50.2.5 Competitive Equilibrium

Definition:
• A government policy is a pair of sequences (ℎ,⃗ 𝑥)⃗ where ℎ𝑡 ∈ Π ∀𝑡 ≥ 0.
• A price system is a non-negative value of money sequence 𝑞.⃗
• An allocation is a triple of non-negative sequences (𝑐,⃗ 𝑚,⃗ 𝑦).

It is required that time 𝑡 components (𝑚𝑡 , 𝑥𝑡 , ℎ𝑡 ) ∈ 𝐸.
Definition:
Given $M_{-1}$, a government policy $(\vec h, \vec x)$, price system $\vec q$, and allocation $(\vec c, \vec m, \vec y)$ are said to be a competitive equilibrium if
• $m_t = q_t M_t$ and $y_t = f(x_t)$.
• The government budget constraint is satisfied.
• Given $\vec q$, $\vec x$, $\vec y$, the pair $(\vec c, \vec m)$ solves the household's problem.

50.2.6 A Credible Government Policy

Chang works with a credible government policy with a recursive representation:
• Here there is no time 0 Ramsey planner.
• Instead there is a sequence of governments, one for each 𝑡, that choose time 𝑡 government actions after forecasting
what future governments will do.

• Let $w = \sum_{t=0}^{\infty} \beta^t [u(c_t) + v(q_t M_t)]$ be a value associated with a particular competitive equilibrium.
• A recursive representation of a credible government policy is a pair of initial conditions (𝑤0 , 𝜃0 ) and a five-tuple
of functions

ℎ(𝑤𝑡 , 𝜃𝑡 ), 𝑚(ℎ𝑡 , 𝑤𝑡 , 𝜃𝑡 ), 𝑥(ℎ𝑡 , 𝑤𝑡 , 𝜃𝑡 ), 𝜒(ℎ𝑡 , 𝑤𝑡 , 𝜃𝑡 ), Ψ(ℎ𝑡 , 𝑤𝑡 , 𝜃𝑡 )

mapping 𝑤𝑡 , 𝜃𝑡 and in some cases ℎ𝑡 into ℎ̂ 𝑡 , 𝑚𝑡 , 𝑥𝑡 , 𝑤𝑡+1 , and 𝜃𝑡+1 , respectively.


• Starting from an initial condition (𝑤0 , 𝜃0 ), a credible government policy can be constructed by iterating on these
functions in the following order that respects the within-period timing:

$$
\begin{aligned}
\hat h_t &= h(w_t, \theta_t) \\
m_t &= m(h_t, w_t, \theta_t) \\
x_t &= x(h_t, w_t, \theta_t) \\
w_{t+1} &= \chi(h_t, w_t, \theta_t) \\
\theta_{t+1} &= \Psi(h_t, w_t, \theta_t)
\end{aligned}
\tag{50.8}
$$

• Here it is to be understood that $\hat h_t$ is the action that the government policy instructs the government to take, while $h_t$, possibly not equal to $\hat h_t$, is some other action that the government is free to take at time $t$.
The plan is credible if it is in the time 𝑡 government’s interest to execute it.
Credibility requires that the plan be such that for all possible choices of ℎ𝑡 that are consistent with competitive equilibria,

$$
u(f(x(\hat h_t, w_t, \theta_t))) + v(m(\hat h_t, w_t, \theta_t)) + \beta \chi(\hat h_t, w_t, \theta_t)
\ge u(f(x(h_t, w_t, \theta_t))) + v(m(h_t, w_t, \theta_t)) + \beta \chi(h_t, w_t, \theta_t)
$$

so that at each instance and circumstance of choice, a government attains a weakly higher lifetime utility with continuation value $w_{t+1} = \chi(\hat h_t, w_t, \theta_t)$ by adhering to the plan and confirming the associated time $t$ action $\hat h_t$ that the public had expected earlier.
Please note the subtle change in arguments of the functions used to represent a competitive equilibrium and a Ramsey
plan, on the one hand, and a credible government plan, on the other hand.
The extra arguments appearing in the functions used to represent a credible plan come from allowing the government to
contemplate disappointing the private sector’s expectation about its time 𝑡 choice ℎ̂ 𝑡 .
A credible plan induces the government to confirm the private sector’s expectation.
The recursive representation of the plan uses the evolution of continuation values to deter the government from wanting
to disappoint the private sector’s expectations.
Technically, a Ramsey plan and a credible plan both incorporate history dependence.
For a Ramsey plan, this is encoded in the dynamics of the state variable 𝜃𝑡 , a promised marginal utility that the Ramsey
plan delivers to the private sector.
For a credible government plan, the two-dimensional state vector $(w_t, \theta_t)$ encodes history dependence.

50.2.7 Sustainable Plans

A government strategy $\sigma$ and an allocation rule $\alpha$ are said to constitute a sustainable plan (SP) if:
1. 𝜎 is admissible.
2. Given 𝜎, 𝛼 is competitive.
3. After any history ℎ⃗ 𝑡−1 , the continuation of 𝜎 is optimal for the government; i.e., the sequence ℎ⃗ 𝑡 induced by 𝜎
after ℎ⃗ 𝑡−1 maximizes over 𝐶𝐸𝜋 given 𝛼.
Given any history ℎ⃗ 𝑡−1 , the continuation of a sustainable plan is a sustainable plan.
Let $\Theta = \{(\vec m, \vec x, \vec h) \in CE : \text{there is an SP whose outcome is } (\vec m, \vec x, \vec h)\}$.

Sustainable outcomes are elements of $\Theta$.


Now consider the space

$$
S = \Big\{ (w, \theta) : \text{there is a sustainable outcome } (\vec m, \vec x, \vec h) \in \Theta \text{ with value } w = \sum_{t=0}^{\infty} \beta^t [u(f(x_t)) + v(m_t)] \text{ and such that } u'(f(x_0))(m_0 + x_0) = \theta \Big\}
$$

The space $S$ is a compact subset of $W \times \Omega$ where $W = [\underline w, \overline w]$ is the space of values associated with sustainable plans.
Here $\underline w$ and $\overline w$ are finite bounds on the set of values.
Because there is at least one sustainable plan, 𝑆 is nonempty.

Now recall the within-period timing protocol, which we can depict (ℎ, 𝑥) → 𝑚 = 𝑞𝑀 → 𝑦 = 𝑐.
With this timing protocol in mind, the time 0 component of an SP has the following components:
1. A period 0 action $\hat h \in \Pi$ that the public expects the government to take, together with subsequent within-period consequences $m(\hat h), x(\hat h)$ when the government acts as expected.

2. For any first-period action ℎ ≠ ℎ̂ with ℎ ∈ 𝐶𝐸𝜋0 , a pair of within-period consequences 𝑚(ℎ), 𝑥(ℎ) when the
government does not act as the public had expected.
3. For every ℎ ∈ Π, a pair (𝑤′ (ℎ), 𝜃′ (ℎ)) ∈ 𝑆 to carry into next period.
These components must be such that it is optimal for the government to choose ℎ̂ as expected; and for every possible
ℎ ∈ Π, the government budget constraint and the household’s Euler equation must hold with continuation 𝜃 being 𝜃′ (ℎ).
Given the timing protocol within the model, the representative household’s response to a government deviation to ℎ ≠
ℎ̂ from a prescribed ℎ̂ consists of a first-period action 𝑚(ℎ) and associated subsequent actions, together with future
equilibrium prices, captured by (𝑤′ (ℎ), 𝜃′ (ℎ)).
At this point, Chang introduces an idea in the spirit of Abreu, Pearce, and Stacchetti [Abreu et al., 1990].
Let 𝑍 be a nonempty subset of 𝑊 × Ω.
Think of using pairs (𝑤′ , 𝜃′ ) drawn from 𝑍 as candidate continuation value, promised marginal utility pairs.
Define the following operator:

$$
\tilde D(Z) = \{ (w, \theta) : \text{there is } \hat h \in CE^0_\pi \text{ and for each } h \in CE^0_\pi \text{ a four-tuple } (m(h), x(h), w'(h), \theta'(h)) \in [0, \bar m] \times X \times Z \tag{50.9}
$$

such that

$$
w = u(f(x(\hat h))) + v(m(\hat h)) + \beta w'(\hat h) \tag{50.10}
$$

$$
\theta = u'(f(x(\hat h)))(m(\hat h) + x(\hat h)) \tag{50.11}
$$

and for all $h \in CE^0_\pi$

$$
w \ge u(f(x(h))) + v(m(h)) + \beta w'(h) \tag{50.12}
$$

$$
x(h) = m(h)(h - 1) \tag{50.13}
$$

and

$$
m(h)\big(u'(f(x(h))) - v'(m(h))\big) \le \beta \theta'(h), \quad \text{with equality if } m(h) < \bar m \} \tag{50.14}
$$

This operator adds the key incentive constraint to the conditions that had defined the earlier 𝐷(𝑍) operator defined in
competitive equilibria in the Chang model.
Condition (50.12) requires that the plan deter the government from wanting to take one-shot deviations when candidate
continuation values are drawn from 𝑍.
Proposition:
1. If $Z \subset \tilde D(Z)$, then $\tilde D(Z) \subset S$ ('self-generation').
2. $S = \tilde D(S)$ ('factorization').
Proposition:
1. Monotonicity of $\tilde D$: $Z \subset Z'$ implies $\tilde D(Z) \subset \tilde D(Z')$.
2. $Z$ compact implies that $\tilde D(Z)$ is compact.
Chang establishes that 𝑆 is compact and that therefore there exists a highest value SP and a lowest value SP.
Further, the preceding structure allows Chang to compute 𝑆 by iterating to convergence on 𝐷̃ provided that one begins
with a sufficiently large initial set 𝑍0 .
This structure delivers the following recursive representation of a sustainable outcome:
1. choose an initial (𝑤0 , 𝜃0 ) ∈ 𝑆;
2. generate a sustainable outcome recursively by iterating on (50.8), which we repeat here for convenience:

$$
\begin{aligned}
\hat h_t &= h(w_t, \theta_t) \\
m_t &= m(h_t, w_t, \theta_t) \\
x_t &= x(h_t, w_t, \theta_t) \\
w_{t+1} &= \chi(h_t, w_t, \theta_t) \\
\theta_{t+1} &= \Psi(h_t, w_t, \theta_t)
\end{aligned}
$$
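
For concreteness, here is a minimal sketch of step 2 written as a Python loop. The callables `h_fun`, `m_fun`, `x_fun`, `χ_fun` and `Ψ_fun` are hypothetical stand-ins for the five functions in (50.8); they are not part of the lecture code.

def simulate_sustainable_outcome(w0, θ0, h_fun, m_fun, x_fun, χ_fun, Ψ_fun, T=30):
    """Iterate on (50.8), assuming the government confirms the prescribed action."""
    w, θ = w0, θ0
    path = []
    for t in range(T):
        h_hat = h_fun(w, θ)         # prescribed action ĥ_t = h(w_t, θ_t)
        m = m_fun(h_hat, w, θ)      # real balances when the government adheres
        x = x_fun(h_hat, w, θ)      # tax collections when the government adheres
        path.append((h_hat, m, x, w, θ))
        w, θ = χ_fun(h_hat, w, θ), Ψ_fun(h_hat, w, θ)   # continuation (w_{t+1}, θ_{t+1})
    return path
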

50.3 Calculating the Set of Sustainable Promise-Value Pairs

Above we defined the $\tilde D(Z)$ operator as (50.9).
Chang (1998) provides a method for dealing with the final three constraints.
These incentive constraints ensure that the government wants to choose ℎ̂ as the private sector had expected it to.
Chang's simplification starts from the idea that, when considering whether or not to confirm the private sector's expectation, the government only needs to consider the payoff of the best possible deviation.
Equally, to provide incentives to the government, we only need to consider the harshest possible punishment.
Let ℎ denote some possible deviation. Chang defines:

$$
P(h; Z) = \min_{m, x, w', \theta'} \; u(f(x)) + v(m) + \beta w'
$$

where the minimization is subject to

$$
x = m(h - 1)
$$

$$
m\big(u'(f(x)) - v'(m)\big) \le \beta \theta' \quad (\text{with equality if } m < \bar m)
$$

$$
(m, x, w', \theta') \in [0, \bar m] \times X \times Z
$$


For a given deviation ℎ, this problem finds the worst possible sustainable value.
We then define:

$$
BR(Z) = \max_{h} P(h; Z) \quad \text{subject to } h \in CE^0_\pi
$$

𝐵𝑅(𝑍) is the value of the government’s most tempting deviation.


With this in hand, we can define a new operator $E(Z)$ that is equivalent to the $\tilde D(Z)$ operator but simpler to implement:

$$
E(Z) = \{ (w, \theta) : \exists h \in CE^0_\pi \text{ and } (m(h), x(h), w'(h), \theta'(h)) \in [0, \bar m] \times X \times Z
$$

such that

$$
w = u(f(x(h))) + v(m(h)) + \beta w'(h)
$$

$$
\theta = u'(f(x(h)))(m(h) + x(h))
$$

$$
x(h) = m(h)(h - 1)
$$

$$
m(h)\big(u'(f(x(h))) - v'(m(h))\big) \le \beta \theta'(h) \quad (\text{with equality if } m(h) < \bar m)
$$

and

$$
w \ge BR(Z) \}
$$

Aside from the final incentive constraint, this is the same as the operator in competitive equilibria in the Chang model.
Consequently, to implement this operator we just need to add one step to our outer hyperplane approximation algorithm:
1. Initialize subgradients, 𝐻, and hyperplane levels, 𝐶0 .
2. Given a set of subgradients, 𝐻, and hyperplane levels, 𝐶𝑡 , calculate 𝐵𝑅(𝑆𝑡 ).
3. Given 𝐻, 𝐶𝑡 , and 𝐵𝑅(𝑆𝑡 ), for each subgradient ℎ𝑖 ∈ 𝐻:
• Solve a linear program (described below) for each action in the action space.
• Find the maximum and update the corresponding hyperplane level, 𝐶𝑖,𝑡+1 .
4. If |𝐶𝑡+1 − 𝐶𝑡 | > 𝜖, return to 2.
Step 1 simply creates a large initial set 𝑆0 .
Given some set 𝑆𝑡 , Step 2 then constructs the value 𝐵𝑅(𝑆𝑡 ).
To do this, we solve the following problem for each point in the action space (𝑚𝑗 , ℎ𝑗 ):

$$
\min_{[w', \theta']} \; u(f(x_j)) + v(m_j) + \beta w'
$$

subject to

$$
H \cdot (w', \theta') \le C_t
$$

$$
x_j = m_j (h_j - 1)
$$

$$
m_j \big(u'(f(x_j)) - v'(m_j)\big) \le \beta \theta' \quad (= \text{ if } m_j < \bar m)
$$
This gives us a matrix of possible values, corresponding to each point in the action space.
To find 𝐵𝑅(𝑍), we minimize over the 𝑚 dimension and maximize over the ℎ dimension.
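
In the `ChangModel` class below, these values are stored in a flat vector `p_vec` of length `n_m * n_h`, so this min-max is a single reshape-and-reduce step. A small illustration with made-up numbers (the grid sizes here are purely illustrative):

import numpy as np

n_m, n_h = 4, 3                     # illustrative grid sizes
p_vec = np.random.rand(n_m * n_h)   # stand-in for the solved deviation values
p_vec[5] = np.nan                   # infeasible actions are left as NaN

# Rows index m, columns index h: minimize over m (axis 0), then maximize over h
br_z = np.nanmax(np.nanmin(p_vec.reshape(n_m, n_h), axis=0))
print(br_z)
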
Step 3 then constructs the set 𝑆𝑡+1 = 𝐸(𝑆𝑡 ). The linear program in Step 3 is designed to construct a set 𝑆𝑡+1 that is as
large as possible while satisfying the constraints of the 𝐸(𝑆) operator.
To do this, for each subgradient ℎ𝑖 , and for each point in the action space (𝑚𝑗 , ℎ𝑗 ), we solve the following problem:

$$
\max_{[w', \theta']} \; h_i \cdot (w, \theta)
$$

subject to

$$
H \cdot (w', \theta') \le C_t
$$

$$
w = u(f(x_j)) + v(m_j) + \beta w'
$$

$$
\theta = u'(f(x_j))(m_j + x_j)
$$

$$
x_j = m_j (h_j - 1)
$$

$$
m_j \big(u'(f(x_j)) - v'(m_j)\big) \le \beta \theta' \quad (= \text{ if } m_j < \bar m)
$$

$$
w \ge BR(Z)
$$
This problem maximizes the hyperplane level for a given set of actions.
The second part of Step 3 then finds the maximum possible hyperplane level across the action space.
The algorithm constructs a sequence of progressively smaller sets 𝑆𝑡+1 ⊂ 𝑆𝑡 ⊂ 𝑆𝑡−1 ⋯ ⊂ 𝑆0 .
Step 4 ends the algorithm when the difference between these sets is small enough.
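
Putting steps 1-4 together, the outer loop has the following shape. This is a stripped-down sketch of the `solve_sustainable` method in the `ChangModel` class below, which delegates step 2 to `solve_worst_spe` and step 3 to `solve_subgradient`:

import numpy as np

def solve_sustainable_sketch(model, tol=1e-5, max_iter=250):
    """Outer hyperplane approximation: iterate until hyperplane levels converge."""
    diff, iters = tol + 1, 0
    while diff > tol and iters < max_iter:
        iters += 1
        model.solve_worst_spe()       # Step 2: compute BR(S_t)
        model.solve_subgradient()     # Step 3: update hyperplane levels C_{t+1}
        diff = max(np.maximum(abs(model.c0_c - model.c1_c),
                              abs(model.c0_s - model.c1_s)))
        model.c0_c, model.c0_s = np.copy(model.c1_c), np.copy(model.c1_s)
    return iters
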
We have created a Python class that solves the model assuming the following functional forms:

$$
u(c) = \log(c)
$$

$$
v(m) = \frac{1}{500} (m \bar m - 0.5 m^2)^{0.5}
$$

$$
f(x) = 180 - (0.4 x)^2
$$

The remaining parameters $\{\beta, \bar m, \underline h, \bar h\}$ are then variables to be specified for an instance of the Chang class.
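
A quick numerical check of these functional forms (using, for example, $\bar m = 30$ as in the instances below) confirms that $v'(m) > 0$ for $0 < m < \bar m$ and $v'(\bar m) = 0$, so the satiation level $m^f$ coincides with $\bar m$:

import numpy as np

mbar = 30

u = lambda c: np.log(c)
v = lambda m: 1 / 500 * (mbar * m - 0.5 * m**2)**0.5
v_p = lambda m: 0.5 / 500 * (mbar * m - 0.5 * m**2)**(-0.5) * (mbar - m)
f = lambda x: 180 - (0.4 * x)**2

print(v_p(15), v_p(mbar))   # positive below mbar, zero at mbar
print(f(0), u(f(0)))        # output and utility when tax collections are zero
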
Below we use the class to solve the model and plot the resulting equilibrium set, once with 𝛽 = 0.3 and once with 𝛽 = 0.8.
We also plot the (larger) competitive equilibrium sets, which we described in competitive equilibria in the Chang model.
(We have set the number of subgradients to 10 in order to speed up the code for now. We can increase accuracy by
increasing the number of subgradients)
The following code computes sustainable plans

"""
Provides a class called ChangModel to solve different
parameterizations of the Chang (1998) model.
"""

import numpy as np
import quantecon as qe
import time

from scipy.spatial import ConvexHull


from scipy.optimize import linprog, minimize, minimize_scalar
from scipy.interpolate import UnivariateSpline
import numpy.polynomial.chebyshev as cheb

class ChangModel:
"""
Class to solve for the competitive and sustainable sets in the Chang (1998)
model, for different parameterizations.
"""

def __init__(self, β, mbar, h_min, h_max, n_h, n_m, N_g):


# Record parameters
self.β, self.mbar, self.h_min, self.h_max = β, mbar, h_min, h_max
self.n_h, self.n_m, self.N_g = n_h, n_m, N_g

# Create other parameters


self.m_min = 1e-9
self.m_max = self.mbar
self.N_a = self.n_h*self.n_m
# Utility and production functions


uc = lambda c: np.log(c)
uc_p = lambda c: 1/c
v = lambda m: 1/500 * (mbar * m - 0.5 * m**2)**0.5
v_p = lambda m: 0.5/500 * (mbar * m - 0.5 * m**2)**(-0.5) * (mbar - m)
u = lambda h, m: uc(f(h, m)) + v(m)

def f(h, m):


x = m * (h - 1)
f = 180 - (0.4 * x)**2
return f

def θ(h, m):


x = m * (h - 1)
θ = uc_p(f(h, m)) * (m + x)
return θ

# Create set of possible action combinations, A


A1 = np.linspace(h_min, h_max, n_h).reshape(n_h, 1)
A2 = np.linspace(self.m_min, self.m_max, n_m).reshape(n_m, 1)
self.A = np.concatenate((np.kron(np.ones((n_m, 1)), A1),
np.kron(A2, np.ones((n_h, 1)))), axis=1)

# Pre-compute utility and output vectors


self.euler_vec = -np.multiply(self.A[:, 1], \
uc_p(f(self.A[:, 0], self.A[:, 1])) - v_p(self.A[:, 1]))
self.u_vec = u(self.A[:, 0], self.A[:, 1])
self.Θ_vec = θ(self.A[:, 0], self.A[:, 1])
self.f_vec = f(self.A[:, 0], self.A[:, 1])
self.bell_vec = np.multiply(uc_p(f(self.A[:, 0],
self.A[:, 1])),
np.multiply(self.A[:, 1],
(self.A[:, 0] - 1))) \
+ np.multiply(self.A[:, 1],
v_p(self.A[:, 1]))

# Find extrema of (w, θ) space for initial guess of equilibrium sets


p_vec = np.zeros(self.N_a)
w_vec = np.zeros(self.N_a)
for i in range(self.N_a):
p_vec[i] = self.Θ_vec[i]
w_vec[i] = self.u_vec[i]/(1 - β)

w_space = np.array([min(w_vec[~np.isinf(w_vec)]),
max(w_vec[~np.isinf(w_vec)])])
p_space = np.array([0, max(p_vec[~np.isinf(w_vec)])])
self.p_space = p_space

# Set up hyperplane levels and gradients for iterations


def SG_H_V(N, w_space, p_space):
"""
This function initializes the subgradients, hyperplane levels,
and extreme points of the value set by choosing an appropriate
origin and radius. It is based on a similar function in QuantEcon's
Games.jl

"""

# First, create a unit circle. Want points placed on [0, 2π]


inc = 2 * np.pi / N
degrees = np.arange(0, 2 * np.pi, inc)

# Points on circle
H = np.zeros((N, 2))
for i in range(N):
x = degrees[i]
H[i, 0] = np.cos(x)
H[i, 1] = np.sin(x)

# Then calculate origin and radius


o = np.array([np.mean(w_space), np.mean(p_space)])
r1 = max((max(w_space) - o[0])**2, (o[0] - min(w_space))**2)
r2 = max((max(p_space) - o[1])**2, (o[1] - min(p_space))**2)
r = np.sqrt(r1 + r2)

# Now calculate vertices


Z = np.zeros((2, N))
for i in range(N):
Z[0, i] = o[0] + r*H.T[0, i]
Z[1, i] = o[1] + r*H.T[1, i]

# Corresponding hyperplane levels


C = np.zeros(N)
for i in range(N):
C[i] = np.dot(Z[:, i], H[i, :])

return C, H, Z

C, self.H, Z = SG_H_V(N_g, w_space, p_space)


C = C.reshape(N_g, 1)
self.c0_c, self.c0_s, self.c1_c, self.c1_s = np.copy(C), np.copy(C), \
np.copy(C), np.copy(C)
self.z0_s, self.z0_c, self.z1_s, self.z1_c = np.copy(Z), np.copy(Z), \
np.copy(Z), np.copy(Z)

self.w_bnds_s, self.w_bnds_c = (w_space[0], w_space[1]), \


(w_space[0], w_space[1])
self.p_bnds_s, self.p_bnds_c = (p_space[0], p_space[1]), \
(p_space[0], p_space[1])

# Create dictionaries to save equilibrium set for each iteration


self.c_dic_s, self.c_dic_c = {}, {}
self.c_dic_s[0], self.c_dic_c[0] = self.c0_s, self.c0_c

def solve_worst_spe(self):
"""
Method to solve for BR(Z). See p.449 of Chang (1998)
"""

p_vec = np.full(self.N_a, np.nan)


c = [1, 0]

# Pre-compute constraints
aineq_mbar = np.vstack((self.H, np.array([0, -self.β])))
bineq_mbar = np.vstack((self.c0_s, 0))

aineq = self.H
bineq = self.c0_s
aeq = [[0, -self.β]]

for j in range(self.N_a):
# Only try if consumption is possible
if self.f_vec[j] > 0:
# If m = mbar, use inequality constraint
if self.A[j, 1] == self.mbar:
bineq_mbar[-1] = self.euler_vec[j]
res = linprog(c, A_ub=aineq_mbar, b_ub=bineq_mbar,
bounds=(self.w_bnds_s, self.p_bnds_s))
else:
beq = self.euler_vec[j]
res = linprog(c, A_ub=aineq, b_ub=bineq, A_eq=aeq, b_eq=beq,
bounds=(self.w_bnds_s, self.p_bnds_s))
if res.status == 0:
p_vec[j] = self.u_vec[j] + self.β * res.x[0]

# Max over h and min over other variables (see Chang (1998) p.449)
self.br_z = np.nanmax(np.nanmin(p_vec.reshape(self.n_m, self.n_h), 0))

def solve_subgradient(self):
"""
Method to solve for E(Z). See p.449 of Chang (1998)
"""

# Pre-compute constraints
aineq_C_mbar = np.vstack((self.H, np.array([0, -self.β])))
bineq_C_mbar = np.vstack((self.c0_c, 0))

aineq_C = self.H
bineq_C = self.c0_c
aeq_C = [[0, -self.β]]

aineq_S_mbar = np.vstack((np.vstack((self.H, np.array([0, -self.β]))),


np.array([-self.β, 0])))
bineq_S_mbar = np.vstack((self.c0_s, np.zeros((2, 1))))

aineq_S = np.vstack((self.H, np.array([-self.β, 0])))


bineq_S = np.vstack((self.c0_s, 0))
aeq_S = [[0, -self.β]]

# Update maximal hyperplane level


for i in range(self.N_g):
c_a1a2_c, t_a1a2_c = np.full(self.N_a, -np.inf), \
np.zeros((self.N_a, 2))
c_a1a2_s, t_a1a2_s = np.full(self.N_a, -np.inf), \
np.zeros((self.N_a, 2))

c = [-self.H[i, 0], -self.H[i, 1]]

for j in range(self.N_a):
# Only try if consumption is possible
if self.f_vec[j] > 0:

# COMPETITIVE EQUILIBRIA
# If m = mbar, use inequality constraint
if self.A[j, 1] == self.mbar:
bineq_C_mbar[-1] = self.euler_vec[j]
res = linprog(c, A_ub=aineq_C_mbar, b_ub=bineq_C_mbar,
bounds=(self.w_bnds_c, self.p_bnds_c))
# If m < mbar, use equality constraint
else:
beq_C = self.euler_vec[j]
res = linprog(c, A_ub=aineq_C, b_ub=bineq_C, A_eq = aeq_C,
b_eq = beq_C, bounds=(self.w_bnds_c, \
self.p_bnds_c))
if res.status == 0:
c_a1a2_c[j] = self.H[i, 0] * (self.u_vec[j] \
+ self.β * res.x[0]) + self.H[i, 1] * self.Θ_vec[j]
t_a1a2_c[j] = res.x

# SUSTAINABLE EQUILIBRIA
# If m = mbar, use inequality constraint
if self.A[j, 1] == self.mbar:
bineq_S_mbar[-2] = self.euler_vec[j]
bineq_S_mbar[-1] = self.u_vec[j] - self.br_z
res = linprog(c, A_ub=aineq_S_mbar, b_ub=bineq_S_mbar,
bounds=(self.w_bnds_s, self.p_bnds_s))
# If m < mbar, use equality constraint
else:
bineq_S[-1] = self.u_vec[j] - self.br_z
beq_S = self.euler_vec[j]
res = linprog(c, A_ub=aineq_S, b_ub=bineq_S, A_eq = aeq_S,
b_eq = beq_S, bounds=(self.w_bnds_s, \
self.p_bnds_s))
if res.status == 0:
c_a1a2_s[j] = self.H[i, 0] * (self.u_vec[j] \
+ self.β*res.x[0]) + self.H[i, 1] * self.Θ_vec[j]
t_a1a2_s[j] = res.x

idx_c = np.where(c_a1a2_c == max(c_a1a2_c))[0][0]


self.z1_c[:, i] = np.array([self.u_vec[idx_c]
+ self.β * t_a1a2_c[idx_c, 0],
self.Θ_vec[idx_c]])

idx_s = np.where(c_a1a2_s == max(c_a1a2_s))[0][0]


self.z1_s[:, i] = np.array([self.u_vec[idx_s]
+ self.β * t_a1a2_s[idx_s, 0],
self.Θ_vec[idx_s]])

for i in range(self.N_g):
self.c1_c[i] = np.dot(self.z1_c[:, i], self.H[i, :])
self.c1_s[i] = np.dot(self.z1_s[:, i], self.H[i, :])

def solve_sustainable(self, tol=1e-5, max_iter=250):


"""

Method to solve for the competitive and sustainable equilibrium sets.
"""

t = time.time()
diff = tol + 1
iters = 0

print('### --------------- ###')


print('Solving Chang Model Using Outer Hyperplane Approximation')
print('### --------------- ### \n')

print('Maximum difference when updating hyperplane levels:')

while diff > tol and iters < max_iter:


iters = iters + 1
self.solve_worst_spe()
self.solve_subgradient()
diff = max(np.maximum(abs(self.c0_c - self.c1_c),
abs(self.c0_s - self.c1_s)))
print(diff)

# Update hyperplane levels


self.c0_c, self.c0_s = np.copy(self.c1_c), np.copy(self.c1_s)

# Update bounds for w and θ


wmin_c, wmax_c = np.min(self.z1_c, axis=1)[0], \
np.max(self.z1_c, axis=1)[0]
pmin_c, pmax_c = np.min(self.z1_c, axis=1)[1], \
np.max(self.z1_c, axis=1)[1]

wmin_s, wmax_s = np.min(self.z1_s, axis=1)[0], \


np.max(self.z1_s, axis=1)[0]
pmin_S, pmax_S = np.min(self.z1_s, axis=1)[1], \
np.max(self.z1_s, axis=1)[1]

self.w_bnds_s, self.w_bnds_c = (wmin_s, wmax_s), (wmin_c, wmax_c)


self.p_bnds_s, self.p_bnds_c = (pmin_S, pmax_S), (pmin_c, pmax_c)

# Save iteration
self.c_dic_c[iters], self.c_dic_s[iters] = np.copy(self.c1_c), \
np.copy(self.c1_s)
self.iters = iters

elapsed = time.time() - t
print('Convergence achieved after {} iterations and {} \
seconds'.format(iters, round(elapsed, 2)))

def solve_bellman(self, θ_min, θ_max, order, disp=False, tol=1e-7, maxiters=100):


"""
Continuous Method to solve the Bellman equation in section 25.3
"""
mbar = self.mbar

# Utility and production functions


uc = lambda c: np.log(c)
uc_p = lambda c: 1 / c

v = lambda m: 1 / 500 * (mbar * m - 0.5 * m**2)**0.5
v_p = lambda m: 0.5/500 * (mbar*m - 0.5 * m**2)**(-0.5) * (mbar - m)
u = lambda h, m: uc(f(h, m)) + v(m)

def f(h, m):


x = m * (h - 1)
f = 180 - (0.4 * x)**2
return f

def θ(h, m):


x = m * (h - 1)
θ = uc_p(f(h, m)) * (m + x)
return θ

# Bounds for Maximization


lb1 = np.array([self.h_min, 0, θ_min])
ub1 = np.array([self.h_max, self.mbar - 1e-5, θ_max])
lb2 = np.array([self.h_min, θ_min])
ub2 = np.array([self.h_max, θ_max])

# Initialize Value Function coefficients


# Calculate roots of Chebyshev polynomial
k = np.linspace(order, 1, order)
roots = np.cos((2 * k - 1) * np.pi / (2 * order))
# Scale to approximation space
s = θ_min + (roots - -1) / 2 * (θ_max - θ_min)
# Create a basis matrix
Φ = cheb.chebvander(roots, order - 1)
c = np.zeros(Φ.shape[0])

# Function to minimize and constraints


def p_fun(x):
scale = -1 + 2 * (x[2] - θ_min)/(θ_max - θ_min)
p_fun = - (u(x[0], x[1]) \
+ self.β * np.dot(cheb.chebvander(scale, order - 1), c))
return p_fun

def p_fun2(x):
scale = -1 + 2*(x[1] - θ_min)/(θ_max - θ_min)
p_fun = - (u(x[0],mbar) \
+ self.β * np.dot(cheb.chebvander(scale, order - 1), c))
return p_fun

cons1 = ({'type': 'eq', 'fun': lambda x: uc_p(f(x[0], x[1])) * x[1]


* (x[0] - 1) + v_p(x[1]) * x[1] + self.β * x[2] - θ},
{'type': 'eq', 'fun': lambda x: uc_p(f(x[0], x[1]))
* x[0] * x[1] - θ})
cons2 = ({'type': 'ineq', 'fun': lambda x: uc_p(f(x[0], mbar)) * mbar
* (x[0] - 1) + v_p(mbar) * mbar + self.β * x[1] - θ},
{'type': 'eq', 'fun': lambda x: uc_p(f(x[0], mbar))
* x[0] * mbar - θ})

bnds1 = np.concatenate([lb1.reshape(3, 1), ub1.reshape(3, 1)], axis=1)


bnds2 = np.concatenate([lb2.reshape(2, 1), ub2.reshape(2, 1)], axis=1)

# Bellman Iterations

diff = 1
iters = 1

while diff > tol:


# 1. Maximization, given value function guess
p_iter1 = np.zeros(order)
for i in range(order):
θ = s[i]
res = minimize(p_fun,
lb1 + (ub1-lb1) / 2,
method='SLSQP',
bounds=bnds1,
constraints=cons1,
tol=1e-10)
if res.success == True:
p_iter1[i] = -p_fun(res.x)
res = minimize(p_fun2,
lb2 + (ub2-lb2) / 2,
method='SLSQP',
bounds=bnds2,
constraints=cons2,
tol=1e-10)
if -p_fun2(res.x) > p_iter1[i] and res.success == True:
p_iter1[i] = -p_fun2(res.x)

# 2. Bellman updating of Value Function coefficients


c1 = np.linalg.solve(Φ, p_iter1)
# 3. Compute distance and update
diff = np.linalg.norm(c - c1)
if bool(disp == True):
print(diff)
c = np.copy(c1)
iters = iters + 1
if iters > maxiters:
print('Convergence failed after {} iterations'.format(maxiters))
break

self.θ_grid = s
self.p_iter = p_iter1
self.Φ = Φ
self.c = c
print('Convergence achieved after {} iterations'.format(iters))

# Check residuals
θ_grid_fine = np.linspace(θ_min, θ_max, 100)
resid_grid = np.zeros(100)
p_grid = np.zeros(100)
θ_prime_grid = np.zeros(100)
m_grid = np.zeros(100)
h_grid = np.zeros(100)
for i in range(100):
θ = θ_grid_fine[i]
res = minimize(p_fun,
lb1 + (ub1-lb1) / 2,
method='SLSQP',
bounds=bnds1,

constraints=cons1,
tol=1e-10)
if res.success == True:
p = -p_fun(res.x)
p_grid[i] = p
θ_prime_grid[i] = res.x[2]
h_grid[i] = res.x[0]
m_grid[i] = res.x[1]
res = minimize(p_fun2,
lb2 + (ub2-lb2)/2,
method='SLSQP',
bounds=bnds2,
constraints=cons2,
tol=1e-10)
if -p_fun2(res.x) > p and res.success == True:
p = -p_fun2(res.x)
p_grid[i] = p
θ_prime_grid[i] = res.x[1]
h_grid[i] = res.x[0]
m_grid[i] = self.mbar
scale = -1 + 2 * (θ - θ_min)/(θ_max - θ_min)
resid_grid[i] = np.dot(cheb.chebvander(scale, order-1), c) - p

self.resid_grid = resid_grid
self.θ_grid_fine = θ_grid_fine
self.θ_prime_grid = θ_prime_grid
self.m_grid = m_grid
self.h_grid = h_grid
self.p_grid = p_grid
self.x_grid = m_grid * (h_grid - 1)

# Simulate
θ_series = np.zeros(31)
m_series = np.zeros(30)
h_series = np.zeros(30)

# Find initial θ
def ValFun(x):
scale = -1 + 2*(x - θ_min)/(θ_max - θ_min)
p_fun = np.dot(cheb.chebvander(scale, order - 1), c)
return -p_fun

res = minimize(ValFun,
(θ_min + θ_max)/2,
bounds=[(θ_min, θ_max)])
θ_series[0] = res.x

# Simulate
for i in range(30):
θ = θ_series[i]
res = minimize(p_fun,
lb1 + (ub1-lb1)/2,
method='SLSQP',
bounds=bnds1,
constraints=cons1,
tol=1e-10)

if res.success == True:
p = -p_fun(res.x)
h_series[i] = res.x[0]
m_series[i] = res.x[1]
θ_series[i+1] = res.x[2]
res2 = minimize(p_fun2,
lb2 + (ub2-lb2)/2,
method='SLSQP',
bounds=bnds2,
constraints=cons2,
tol=1e-10)
if -p_fun2(res2.x) > p and res2.success == True:
h_series[i] = res2.x[0]
m_series[i] = self.mbar
θ_series[i+1] = res2.x[1]

self.θ_series = θ_series
self.m_series = m_series
self.h_series = h_series
self.x_series = m_series * (h_series - 1)

50.3.1 Comparison of Sets

The set of (𝑤, 𝜃) associated with sustainable plans is smaller than the set of (𝑤, 𝜃) pairs associated with competitive
equilibria, since the additional constraints associated with sustainability must also be satisfied.
Let’s compute two examples, one with a low 𝛽, another with a higher 𝛽

ch1 = ChangModel(β=0.3, mbar=30, h_min=0.9, h_max=2, n_h=8, n_m=35, N_g=10)

ch1.solve_sustainable()

### --------------- ###


Solving Chang Model Using Outer Hyperplane Approximation
### --------------- ###

Maximum difference when updating hyperplane levels:

[1.9168]

[0.66782]

[0.49235]

[0.32412]

[0.19022]

[0.10863]

[0.05817]

[0.0262]

[0.01836]

[0.01415]

[0.00297]

[0.00089]

[0.00027]

[0.00008]

[0.00002]

[0.00001]
Convergence achieved after 16 iterations and 38.57 seconds

The following plot shows both the set of 𝑤, 𝜃 pairs associated with competitive equilibria (in red) and the smaller set of
𝑤, 𝜃 pairs associated with sustainable plans (in blue).

def plot_equilibria(ChangModel):
"""
Method to plot both equilibrium sets
"""
fig, ax = plt.subplots(figsize=(7, 5))

ax.set_xlabel('w', fontsize=16)
ax.set_ylabel(r"$\theta$", fontsize=18)

poly_S = polytope.Polytope(ChangModel.H, ChangModel.c1_s)


poly_C = polytope.Polytope(ChangModel.H, ChangModel.c1_c)
ext_C = polytope.extreme(poly_C)
ext_S = polytope.extreme(poly_S)

ax.fill(ext_C[:, 0], ext_C[:, 1], 'r', zorder=-1)


ax.fill(ext_S[:, 0], ext_S[:, 1], 'b', zorder=0)

# Add point showing Ramsey Plan


idx_Ramsey = np.where(ext_C[:, 0] == max(ext_C[:, 0]))[0][0]
R = ext_C[idx_Ramsey, :]
ax.scatter(R[0], R[1], 150, 'black', 'o', zorder=1)
w_min = min(ext_C[:, 0])

# Label Ramsey Plan slightly to the right of the point
ax.annotate("R", xy=(R[0], R[1]),
xytext=(R[0] + 0.03 * (R[0] - w_min),
R[1]), fontsize=18)

plt.tight_layout()
plt.show()

plot_equilibria(ch1)

Evidently, the Ramsey plan, denoted by the 𝑅, is not sustainable.


Let’s raise the discount factor and recompute the sets

ch2 = ChangModel(β=0.8, mbar=30, h_min=0.9, h_max=1/0.8,


n_h=8, n_m=35, N_g=10)

ch2.solve_sustainable()

### --------------- ###


Solving Chang Model Using Outer Hyperplane Approximation
### --------------- ###

Maximum difference when updating hyperplane levels:

[0.06369]

[0.02476]

[0.02153]

[0.01915]

[0.01795]

[0.01642]

[0.01507]

[0.01284]

[0.01106]

[0.00694]

[0.0085]

[0.00781]

[0.00433]

[0.00492]

[0.00303]

[0.00182]

[0.00638]

[0.00116]

[0.00093]

[0.00075]

[0.0006]

[0.00494]

[0.00038]

[0.00121]

[0.00024]

[0.0002]

[0.00016]

[0.00013]

[0.0001]

[0.00008]

[0.00006]

[0.00005]

[0.00004]

[0.00003]

[0.00003]

[0.00002]

[0.00002]

[0.00001]

[0.00001]

[0.00001]
Convergence achieved after 40 iterations and 113.3 seconds

Let’s plot both sets

plot_equilibria(ch2)


Evidently, the Ramsey plan is now sustainable.
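The numerical cross-check sketched earlier (again, our own helper rather than part of the lecture code) should agree with the picture:

ramsey_is_sustainable(ch2)   # expected: True for this higher-𝛽 economy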



Part IX

Other

CHAPTER

FIFTYONE

TROUBLESHOOTING

This page is for readers experiencing errors when running the code from the lectures.

51.1 Fixing Your Local Environment

The basic assumption of the lectures is that code in a lecture should execute whenever
1. it is executed in a Jupyter notebook and
2. the notebook is running on a machine with the latest version of Anaconda Python.
You have installed Anaconda, haven’t you, following the instructions in this lecture?
Assuming that you have, the most common source of problems for our readers is that their Anaconda distribution is not
up to date.
Here’s a useful article on how to update Anaconda.
Another option is to simply remove Anaconda and reinstall.
You also need to keep the external code libraries, such as QuantEcon.py, up to date.
For this task you can either
• use pip install --upgrade quantecon on the command line, or
• execute !pip install --upgrade quantecon within a Jupyter notebook (see the example below).
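For example, inside a notebook cell you can upgrade the package and then check which version is installed (the version reported will depend on your setup):

!pip install --upgrade quantecon

import quantecon
print(quantecon.__version__)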
If your local environment is still not working, you can do two things.
First, you can use a remote machine instead, by clicking on the Launch Notebook icon available for each lecture.

Second, you can report an issue, so we can try to fix your local setup.
We like getting feedback on the lectures, so please don't hesitate to get in touch.


51.2 Reporting an Issue

One way to give feedback is to raise an issue through our issue tracker.
Please be as specific as possible. Tell us where the problem is and give as much detail about your local setup as you can.
Another feedback option is to use our discourse forum.
Finally, you can provide direct feedback to [email protected]



CHAPTER

FIFTYTWO

REFERENCES



CHAPTER

FIFTYTHREE

EXECUTION STATISTICS

This table contains the latest execution statistics.

Document Modified Method Run Time (s) Status


BCG_complete_mkts 2025-02-17 03:15 cache 48.29 ✅
BCG_incomplete_mkts 2025-02-17 03:17 cache 82.6 ✅
additive_functionals 2025-02-17 03:17 cache 13.4 ✅
amss 2025-02-17 03:20 cache 185.83 ✅
amss2 2025-02-17 03:21 cache 60.41 ✅
amss3 2025-02-17 03:26 cache 272.94 ✅
arellano 2025-02-17 03:27 cache 82.22 ✅
arma 2025-02-17 03:27 cache 6.66 ✅
asset_pricing_lph 2025-02-17 03:27 cache 2.59 ✅
black_litterman 2025-02-17 03:28 cache 30.84 ✅
calvo 2025-02-17 03:28 cache 8.83 ✅
calvo_abreu 2025-02-17 03:28 cache 3.75 ✅
calvo_machine_learn 2025-02-17 22:20 cache 24.72 ✅
cattle_cycles 2025-02-17 03:28 cache 3.65 ✅
chang_credible 2025-02-17 03:31 cache 156.58 ✅
chang_ramsey 2025-02-17 03:37 cache 351.21 ✅
classical_filtering 2025-02-17 03:37 cache 1.45 ✅
coase 2025-02-17 03:37 cache 4.29 ✅
cons_news 2025-02-17 03:37 cache 4.33 ✅
discrete_dp 2025-02-17 03:37 cache 30.41 ✅
dyn_stack 2025-02-17 03:38 cache 5.88 ✅
entropy 2025-02-17 03:38 cache 1.04 ✅
estspec 2025-02-17 03:38 cache 4.63 ✅
five_preferences 2025-02-17 03:39 cache 51.25 ✅
growth_in_dles 2025-02-17 03:39 cache 4.12 ✅
hs_invertibility_example 2025-02-17 03:39 cache 4.27 ✅
hs_recursive_models 2025-02-17 03:39 cache 0.99 ✅
intro 2025-02-17 03:39 cache 0.99 ✅
irfs_in_hall_model 2025-02-17 03:39 cache 4.1 ✅
knowing_forecasts_of_others 2025-02-17 03:39 cache 22.34 ✅
lqramsey 2025-02-17 03:39 cache 5.04 ✅
lu_tricks 2025-02-17 03:39 cache 2.09 ✅
lucas_asset_pricing_dles 2025-02-17 03:39 cache 3.83 ✅
lucas_model 2025-02-17 03:40 cache 12.42 ✅
markov_jump_lq 2025-02-17 03:41 cache 77.98 ✅
match_transport 2025-02-17 03:41 cache 22.91 ✅
matsuyama 2025-02-17 03:41 cache 6.7 ✅
muth_kalman 2025-02-17 03:41 cache 4.05 ✅
opt_tax_recur 2025-02-17 03:43 cache 69.65 ✅
orth_proj 2025-02-17 03:43 cache 1.34 ✅
permanent_income_dles 2025-02-17 03:43 cache 4.06 ✅
rob_markov_perf 2025-02-17 03:43 cache 3.84 ✅
robustness 2025-02-17 03:43 cache 4.59 ✅
rosen_schooling_model 2025-02-17 03:43 cache 4.04 ✅
smoothing 2025-02-17 03:43 cache 4.21 ✅
smoothing_tax 2025-02-17 03:43 cache 6.41 ✅
stationary_densities 2025-02-17 03:43 cache 8.34 ✅
status 2025-02-17 03:43 cache 4.36 ✅
tax_smoothing_1 2025-02-17 03:43 cache 10.82 ✅
tax_smoothing_2 2025-02-17 03:44 cache 4.4 ✅
tax_smoothing_3 2025-02-17 03:44 cache 4.46 ✅
troubleshooting 2025-02-17 03:39 cache 0.99 ✅
un_insure 2025-02-17 03:44 cache 12.25 ✅
zreferences 2025-02-17 03:39 cache 0.99 ✅

These lectures are built on Linux instances through GitHub Actions.

These lectures use the following Python version

!python --version

Python 3.12.7

and the following package versions

!conda list



PROOF INDEX

square-summable
square-summable (calvo_machine_learn), 797



INDEX

A
AR, 528
ARMA, 525, 528
ARMA Processes, 521

B
Bellman Equation, 479

C
Coase's Theory of the Firm, 289
Complex Numbers, 526
Consumption
    Tax, 91
Covariance Stationary, 522
Covariance Stationary Processes, 521
    AR, 524
    MA, 524

D
Discrete State Dynamic Programming, 53

E
Elementary Asset Pricing, 659

F
Fixed Point Theory, 653

G
General Linear Processes, 523

L
Linear Markov Perfect Equilibria, 500
Lucas Model, 649
    Assets, 650
    Computation, 654
    Consumers, 650
    Dynamic Program, 651
    Equilibrium Constraints, 652
    Equilibrium Price Function, 652
    Pricing, 650
    Solving, 652

M
MA, 528
Markov Chains
    Continuous State, 23
Markov Perfect Equilibrium
    Applications, 503
    Overview, 499
Models
    Additive functionals, 559, 823, 847
    Lucas Asset Pricing, 649

N
Nonparametric Estimation, 546

O
Orthogonal Projection, 5

P
Periodograms, 543
    Computation, 545
    Interpretation, 544
python, 43, 142, 195, 205, 219, 359, 388, 399, 407, 417, 422, 428, 435, 672

R
Ramsey Problem
    Optimal Taxation, 227
Robustness, 479

S
Smoothing, 546
    Tax, 105
Spectra
    Estimation, 543
Spectra, Estimation
    AR(1) Setting, 551
    Fast Fourier Transform, 543
    Pre-Filtering, 551
    Smoothing, 546, 549, 551
Spectral Analysis, 521, 526
Spectral Densities, 527
Spectral Density, 528
    interpretation, 528
    Inverting the Transformation, 533
    Mathematical Theory, 533

W
White Noise, 523, 527
Wold Representation, 523