
A First Course in Quantitative Economics with Python

Thomas J. Sargent & John Stachurski

Sep 25, 2023


CONTENTS

I Introduction

1 About These Lectures
1.1 About
1.2 Level
1.3 Credits

II Economic Data

2 Economic Growth Evidence
2.1 Overview
2.2 Setting up
2.3 GDP plots
2.4 The industrialized world
2.5 Constructing a plot similar to Tooze's
2.6 Regional analysis

3 Business Cycles
3.1 Overview
3.2 Data acquisition
3.3 GDP growth rate
3.4 Unemployment
3.5 Synchronization
3.6 Leading indicators and correlated factors

4 Income and Wealth Inequality
4.1 Overview
4.2 The Lorenz curve
4.3 The Gini coefficient
4.4 Top shares
4.5 Exercises

III Essential Tools

5 Linear Equations and Matrix Algebra
5.1 Overview
5.2 A two good example
5.3 Vectors
5.4 Matrix operations
5.5 Solving systems of equations
5.6 Exercises

6 Eigenvalues and Eigenvectors
6.1 Overview
6.2 Matrices as transformations
6.3 Types of transformations
6.4 Matrix multiplication as composition
6.5 Iterating on a fixed map
6.6 Eigenvalues
6.7 The Neumann Series Lemma
6.8 Exercises

7 Introduction to Supply and Demand
7.1 Overview
7.2 Supply and demand
7.3 Generalizations
7.4 Exercises

IV Linear Dynamics

8 Present Values
8.1 Overview
8.2 Present value calculations
8.3 Analytical expressions
8.4 More about bubbles
8.5 Gross rate of return
8.6 Exercises

9 Consumption Smoothing
9.1 Overview
9.2 Friedman-Hall consumption-smoothing model
9.3 Mechanics of Consumption smoothing model
9.4 Wrapping up the consumption-smoothing model
9.5 Difference equations with linear algebra

10 Equalizing Difference Model
10.1 Overview
10.2 The indifference condition
10.3 Reinterpreting the model: workers and entrepreneurs
10.4 Entrepreneur-worker interpretation
10.5 An application of calculus

11 Price Level Histories
11.1 Four Centuries of Price Levels
11.2 Ends of Four Big Inflations
11.3 Starting and Stopping Big Inflations

12 A Fiscal Theory of the Price Level
12.1 Introduction
12.2 Structure of the model
12.3 Continuation values
12.4 Sequel

13 A Fiscal Theory of Price Level with Adaptive Expectations
13.1 Introduction
13.2 Structure of the model
13.3 Representing key equations with linear algebra
13.4 Harvesting returns from our matrix formulation
13.5 Forecast errors
13.6 Technical condition for stability

14 Geometric Series for Elementary Economics
14.1 Overview
14.2 Key formulas
14.3 Example: The Money Multiplier in Fractional Reserve Banking
14.4 Example: The Keynesian Multiplier
14.5 Example: Interest Rates and Present Values
14.6 Back to the Keynesian multiplier

V Probability and Distributions

15 Distributions and Probabilities
15.1 Outline
15.2 Common distributions
15.3 Observed distributions

16 LLN and CLT
16.1 Overview
16.2 The law of large numbers
16.3 Breaking the LLN
16.4 Central limit theorem
16.5 Exercises

17 Monte Carlo and Option Pricing
17.1 Overview
17.2 An introduction to Monte Carlo
17.3 Pricing a european call option under risk neutrality
17.4 Pricing via a dynamic model
17.5 Exercises

18 Heavy-Tailed Distributions
18.1 Overview
18.2 Visual comparisons
18.3 Heavy tails in economic cross-sections
18.4 Failure of the LLN
18.5 Why do heavy tails matter?
18.6 Classifying tail properties
18.7 Further reading
18.8 Exercises

19 Racial Segregation
19.1 Outline
19.2 The model
19.3 Results
19.4 Exercises

VI Nonlinear Dynamics

20 The Solow-Swan Growth Model
20.1 The model
20.2 A graphical perspective
20.3 Growth in continuous time
20.4 Exercises

21 Dynamics in One Dimension
21.1 Overview
21.2 Some definitions
21.3 Stability
21.4 Graphical analysis
21.5 Exercises

22 The Cobweb Model
22.1 Overview
22.2 History
22.3 The model
22.4 Naive expectations
22.5 Adaptive expectations
22.6 Exercises

23 The Overlapping Generations Model
23.1 Overview
23.2 Environment
23.3 Supply of capital
23.4 Demand for capital
23.5 Equilibrium
23.6 Dynamics
23.7 CRRA preferences
23.8 Exercises

24 Commodity Prices
24.1 Outline
24.2 Data
24.3 The competitive storage model
24.4 The model
24.5 Equilibrium
24.6 Code

VII Stochastic Dynamics

25 Markov Chains: Basic Concepts
25.1 Overview
25.2 Definitions and examples
25.3 Simulation
25.4 Distributions over time
25.5 Stationary distributions
25.6 Computing expectations

26 Markov Chains: Irreducibility and Ergodicity
26.1 Overview
26.2 Irreducibility
26.3 Ergodicity
26.4 Exercises

27 Univariate Time Series with Matrix Algebra
27.1 Overview
27.2 Samuelson's model
27.3 Adding a random term
27.4 Computing population moments
27.5 Moving average representation
27.6 A forward looking model

VIII Optimization

28 Linear Programming
28.1 Overview
28.2 Example 1: Production Problem
28.3 Example 2: Investment Problem
28.4 Standard form
28.5 Exercises

29 Shortest Paths
29.1 Overview
29.2 Outline of the problem
29.3 Finding least-cost paths
29.4 Solving for minimum cost-to-go
29.5 Exercises

IX Modeling in Higher Dimensions

30 The Perron-Frobenius Theorem
30.1 Nonnegative matrices
30.2 Exercises

31 Input-Output Models
31.1 Overview
31.2 Input output analysis
31.3 Production possibility frontier
31.4 Prices
31.5 Linear programs
31.6 Leontief inverse
31.7 Applications of graph theory
31.8 Exercises

32 A Lake Model of Employment
32.1 Outline
32.2 The Lake model
32.3 Dynamics
32.4 Exercise

33 Networks
33.1 Outline
33.2 Economic and financial networks
33.3 An introduction to graph theory
33.4 Weighted graphs
33.5 Adjacency matrices
33.6 Properties
33.7 Network centrality
33.8 Further reading
33.9 Exercises

X Markets and Competitive Equilibrium

34 Supply and Demand with Many Goods
34.1 Overview
34.2 Formulas from linear algebra
34.3 From utility function to demand curve
34.4 Endowment economy
34.5 Digression: Marshallian and Hicksian demand curves
34.6 Dynamics and risk as special cases
34.7 Economies with endogenous supplies of goods
34.8 Multi-good welfare maximization problem

35 Market Equilibrium with Heterogeneity
35.1 Overview
35.2 An simple example
35.3 Pure exchange economy
35.4 Implementation
35.5 Deducing a representative consumer

XI Estimation

36 Simple Linear Regression Model
36.1 How does error change with respect to 𝛼 and 𝛽
36.2 Calculating optimal values

37 Maximum Likelihood Estimation
37.1 Introduction
37.2 Maximum likelihood estimation
37.3 Pareto distribution
37.4 What is the best distribution?
37.5 Exercises

XII Other

38 Troubleshooting
38.1 Fixing your local environment
38.2 Reporting an issue

39 References

40 Execution Statistics

Bibliography

Proof Index

Index
A First Course in Quantitative Economics with Python

This lecture series provides an introduction to quantitative economics using Python.


The lectures were designed and written by Thomas J. Sargent and John Stachurski, with extensive help from the rest of
the QuantEcon team.
• Introduction
– About These Lectures
• Economic Data
– Economic Growth Evidence
– Business Cycles
– Income and Wealth Inequality
• Essential Tools
– Linear Equations and Matrix Algebra
– Eigenvalues and Eigenvectors
– Introduction to Supply and Demand
• Linear Dynamics
– Present Values
– Consumption Smoothing
– Equalizing Difference Model
– Price Level Histories
– A Fiscal Theory of the Price Level
– A Fiscal Theory of Price Level with Adaptive Expectations
– Geometric Series for Elementary Economics
• Probability and Distributions
– Distributions and Probabilities
– LLN and CLT
– Monte Carlo and Option Pricing
– Heavy-Tailed Distributions
– Racial Segregation
• Nonlinear Dynamics
– The Solow-Swan Growth Model
– Dynamics in One Dimension
– The Cobweb Model
– The Overlapping Generations Model
– Commodity Prices
• Stochastic Dynamics
– Markov Chains: Basic Concepts
– Markov Chains: Irreducibility and Ergodicity


– Univariate Time Series with Matrix Algebra


• Optimization
– Linear Programming
– Shortest Paths
• Modeling in Higher Dimensions
– The Perron-Frobenius Theorem
– Input-Output Models
– A Lake Model of Employment
– Networks
• Markets and Competitive Equilibrium
– Supply and Demand with Many Goods
– Market Equilibrium with Heterogeneity
• Estimation
– Simple Linear Regression Model
– Maximum Likelihood Estimation
• Other
– Troubleshooting
– References
– Execution Statistics

Part I

Introduction

CHAPTER 1: ABOUT THESE LECTURES

1.1 About

This lecture series introduces quantitative economics using elementary mathematics and statistics plus computer code
written in Python.
The lectures emphasize simulation and visualization through code as a way to convey ideas, rather than focusing on
mathematical details.
Although the presentation is quite novel, the ideas are rather foundational.
We emphasize the deep and fundamental importance of economic theory, as well as the value of analyzing data and
understanding stylized facts.
The lectures can be used for university courses, self-study, reading groups or workshops.
Researchers and policy professionals might also find some parts of the series valuable for their work.
We hope the lectures will be of interest to students of economics who want to learn both economics and computing, as
well as students from fields such as computer science and engineering who are curious about economics.

1.2 Level

The lecture series is aimed at undergraduate students.


The level of the lectures varies from truly introductory (suitable for first year undergraduates or even high school students)
to more intermediate.
The more intermediate lectures require comfort with linear algebra and some mathematical maturity (e.g., calmly reading
theorems and trying to understand their meaning).
In general, easier lectures occur earlier in the lecture series and harder lectures occur later.
We assume that readers have covered the easier parts of the QuantEcon lecture series on Python programming.
In particular, readers should be familiar with basic Python syntax including Python functions. Knowledge of classes and
Matplotlib will be beneficial but not essential.


1.3 Credits

In building this lecture series, we had invaluable assistance from research assistants at QuantEcon, as well as our QuantEcon colleagues. Without their help this series would not have been possible.
In particular, we sincerely thank and give credit to
• Aakash Gupta
• Shu Hu
• Jiacheng Li
• Smit Lunagariya
• Matthew McKay
• Maanasee Sharma
• Humphrey Yang
We also thank Noritaka Kudoh for encouraging us to start this project and providing thoughtful suggestions.



Part II

Economic Data

CHAPTER 2: ECONOMIC GROWTH EVIDENCE

2.1 Overview

In this lecture we use Python, Pandas, and Matplotlib to download, organize, and visualize historical data on GDP growth.
In addition to learning how to deploy these tools more generally, we’ll use them to describe facts about economic growth
experiences across many countries over several centuries.
Such “growth facts” are interesting for a variety of reasons.
Explaining growth facts is a principal purpose of both “development economics” and “economic history”.
And growth facts are important inputs into historians’ studies of geopolitical forces and dynamics.
Thus, Adam Tooze’s account of the geopolitical precedents and antecedents of World War I begins by describing how
Gross National Products of European Great Powers had evolved during the 70 years preceding 1914 (see chapter 1 of
[Too14]).
Using the very same data that Tooze used to construct his figure, here is our version of his chapter 1 figure.

(This is just a copy of our figure Fig. 2.6. We describe how we constructed it later in this lecture.)
Chapter 1 of [Too14] used his graph to show how US GDP started the 19th century way behind the GDP of the British Empire, how US GDP had caught up with the GDP of the British Empire by the end of the 19th century, and how during the first half of the 20th century US GDP surpassed that of the British Empire.
For Adam Tooze, that fact was a key geopolitical underpinning for the “American century”.
Looking at this graph and how it set the geopolitical stage for “the American (20th) century” naturally tempts one to want
a counterpart to his graph for 2014 or later.
(An impatient reader seeking a hint at the answer might now want to jump ahead and look at figure Fig. 2.7.)
As we’ll see, reasoning by analogy, this graph perhaps set the stage for an “XXX (21st) century”, where you are free to
fill in your guess for country XXX.
As we gather data to construct those two graphs, we’ll also study growth experiences for a number of countries for time
horizons extending as far back as possible.
These graphs will portray how the “Industrial Revolution” began in Britain in the late 18th century, then migrated to one
country after another.
In a nutshell, this lecture records growth trajectories of various countries over long time periods.
While some countries have experienced long-term rapid growth that has lasted a hundred years or more, others have not.
Since populations differ across countries and vary within a country over time, it will be interesting to describe both total GDP and GDP per capita as they evolve within a country.
First let’s import the packages needed to explore what the data says about long run growth

import pandas as pd
import os
import matplotlib as mpl
import matplotlib.pyplot as plt
import matplotlib.cm as cm
import numpy as np
from collections import namedtuple
from matplotlib.lines import Line2D

2.2 Setting up

A project initiated by Angus Maddison has collected many historical time series related to economic growth, some dating
back to the first century.
The data can be downloaded from the Maddison Historical Statistics webpage by clicking on the “Latest Maddison Project
Release”.
For convenience, here is a copy of the 2020 data in Excel format.
Let’s read it into a pandas dataframe:

data = pd.read_excel("datasets/mpd2020.xlsx", sheet_name='Full data')


data

countrycode country year gdppc pop


0 AFG Afghanistan 1820 NaN 3280.00000
1 AFG Afghanistan 1870 NaN 4207.00000
2 AFG Afghanistan 1913 NaN 5730.00000
3 AFG Afghanistan 1950 1156.0000 8150.00000
4 AFG Afghanistan 1951 1170.0000 8284.00000
... ... ... ... ... ...
21677 ZWE Zimbabwe 2014 1594.0000 13313.99205
21678 ZWE Zimbabwe 2015 1560.0000 13479.13812
21679 ZWE Zimbabwe 2016 1534.0000 13664.79457
21680 ZWE Zimbabwe 2017 1582.3662 13870.26413
21681 ZWE Zimbabwe 2018 1611.4052 14096.61179

[21682 rows x 5 columns]

We can see that this dataset contains GDP per capita (gdppc) and population (pop) for many countries and years.
Let’s look at how many and which countries are available in this dataset

len(data.country.unique())

169

We can now explore some of the 169 countries that are available.
Let’s loop over each country to understand which years are available for each country

cntry_years = []
for cntry in data.country.unique():
    cy_data = data[data.country == cntry]['year']
    ymin, ymax = cy_data.min(), cy_data.max()
    cntry_years.append((cntry, ymin, ymax))
cntry_years = pd.DataFrame(cntry_years,
                           columns=['country', 'Min Year', 'Max Year']).set_index('country')

cntry_years

Min Year Max Year


country
Afghanistan 1820 2018
Angola 1950 2018
Albania 1 2018
United Arab Emirates 1950 2018
Argentina 1800 2018
... ... ...
Yemen 1820 2018
Former Yugoslavia 1 2018
South Africa 1 2018
Zambia 1950 2018
Zimbabwe 1950 2018

[169 rows x 2 columns]

Let’s now reshape the original data into some convenient variables to enable quicker access to countries time series data.
We can build a useful mapping between country codes and country names in this dataset

code_to_name = (data[['countrycode', 'country']]
                .drop_duplicates()
                .reset_index(drop=True)
                .set_index(['countrycode']))

Then we can quickly focus on GDP per capita (gdppc)


data

countrycode country year gdppc pop


0 AFG Afghanistan 1820 NaN 3280.00000
1 AFG Afghanistan 1870 NaN 4207.00000
2 AFG Afghanistan 1913 NaN 5730.00000
3 AFG Afghanistan 1950 1156.0000 8150.00000
4 AFG Afghanistan 1951 1170.0000 8284.00000
... ... ... ... ... ...
21677 ZWE Zimbabwe 2014 1594.0000 13313.99205
21678 ZWE Zimbabwe 2015 1560.0000 13479.13812
21679 ZWE Zimbabwe 2016 1534.0000 13664.79457
21680 ZWE Zimbabwe 2017 1582.3662 13870.26413
21681 ZWE Zimbabwe 2018 1611.4052 14096.61179

[21682 rows x 5 columns]

gdppc = data.set_index(['countrycode','year'])['gdppc']
gdppc = gdppc.unstack('countrycode')

gdppc

countrycode AFG AGO ALB ARE ARG \


year
1 NaN NaN NaN NaN NaN
730 NaN NaN NaN NaN NaN
1000 NaN NaN NaN NaN NaN
1090 NaN NaN NaN NaN NaN
1120 NaN NaN NaN NaN NaN
... ... ... ... ... ...
2014 2022.0000 8673.0000 9808.0000 72601.0000 19183.0000
2015 1928.0000 8689.0000 10032.0000 74746.0000 19502.0000
2016 1929.0000 8453.0000 10342.0000 75876.0000 18875.0000
2017 2014.7453 8146.4354 10702.1201 76643.4984 19200.9061
2018 1934.5550 7771.4418 11104.1665 76397.8181 18556.3831

countrycode ARM AUS AUT AZE BDI ... \


year ...
1 NaN NaN NaN NaN NaN ...
730 NaN NaN NaN NaN NaN ...
1000 NaN NaN NaN NaN NaN ...
1090 NaN NaN NaN NaN NaN ...
1120 NaN NaN NaN NaN NaN ...
... ... ... ... ... ... ...
2014 9735.0000 47867.0000 41338.0000 17439.0000 748.0000 ...
2015 10042.0000 48357.0000 41294.0000 17460.0000 694.0000 ...
2016 10080.0000 48845.0000 41445.0000 16645.0000 665.0000 ...
2017 10859.3783 49265.6135 42177.3706 16522.3072 671.3169 ...
2018 11454.4251 49830.7993 42988.0709 16628.0553 651.3589 ...

countrycode URY USA UZB VEN VNM \


year
1 NaN NaN NaN NaN NaN
730 NaN NaN NaN NaN NaN
1000 NaN NaN NaN NaN NaN
1090 NaN NaN NaN NaN NaN
1120 NaN NaN NaN NaN NaN
... ... ... ... ... ...
2014 19160.0000 51664.0000 9085.0000 20317.0000 5455.0000
2015 19244.0000 52591.0000 9720.0000 18802.0000 5763.0000
2016 19468.0000 53015.0000 10381.0000 15219.0000 6062.0000
2017 19918.1361 54007.7698 10743.8666 12879.1350 6422.0865
2018 20185.8360 55334.7394 11220.3702 10709.9506 6814.1423

countrycode YEM YUG ZAF ZMB ZWE


year
1 NaN NaN NaN NaN NaN
730 NaN NaN NaN NaN NaN
1000 NaN NaN NaN NaN NaN
1090 NaN NaN NaN NaN NaN
1120 NaN NaN NaN NaN NaN
... ... ... ... ... ...
2014 4054.0000 14627.0000 12242.0000 3478.0000 1594.0000
2015 2844.0000 14971.0000 12246.0000 3478.0000 1560.0000
2016 2506.0000 15416.0000 12139.0000 3479.0000 1534.0000
2017 2321.9239 15960.8432 12189.3579 3497.5818 1582.3662
2018 2284.8899 16558.3123 12165.7948 3534.0337 1611.4052

[772 rows x 169 columns]

We create a color mapping between country codes and colors for consistency
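
The cell that builds color_mapping does not survive in this extraction. A minimal sketch of one way to construct it is below; the choice of the tab20 colormap is an assumption, not necessarily the palette used in the published figures.

# Sketch of the missing cell (assumed): give every country code a fixed color
# so that a country keeps the same color across all of the plots below.
country_codes = data['countrycode'].unique()
colors = cm.tab20(np.linspace(0, 0.95, len(country_codes)))
color_mapping = {code: color for code, color in zip(country_codes, colors)}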

2.3 GDP plots

Looking at the United Kingdom we can first confirm we are using the correct country code

fig, ax = plt.subplots(dpi=300)
cntry = 'GBR'
_ = gdppc[cntry].plot(
    ax=fig.gca(),
    ylabel='International $\'s',
    xlabel='Year',
    linestyle='-',
    color=color_mapping['GBR'])

Note: International Dollars are a hypothetical unit of currency that has the same purchasing power parity that the U.S.
Dollar has in the United States at any given time. They are also known as Geary–Khamis dollars (GK Dollars).

We can see that the data is non-continuous for longer periods in the early 250 years of this millennium, so we could
choose to interpolate to get a continuous line plot.
Here we use dashed lines to indicate interpolated trends


Fig. 2.1: GDP per Capita (GBR)

fig, ax = plt.subplots(dpi=300)
cntry = 'GBR'
ax.plot(gdppc[cntry].interpolate(),
        linestyle='--',
        lw=2,
        color=color_mapping[cntry])
ax.plot(gdppc[cntry],
        linestyle='-',
        lw=2,
        color=color_mapping[cntry])
ax.set_ylabel('International $\'s')
ax.set_xlabel('Year')
plt.show()

Fig. 2.2: GDP per Capita (GBR)

We can now put this into a function to generate plots for a list of countries

def draw_interp_plots(series, ylabel, xlabel, color_mapping, code_to_name, lw,
                      logscale, ax):

    for i, c in enumerate(cntry):
        # Get the interpolated data
        df_interpolated = series[c].interpolate(limit_area='inside')
        interpolated_data = df_interpolated[series[c].isnull()]

        # Plot the interpolated data with dashed lines
        ax.plot(interpolated_data,
                linestyle='--',
                lw=lw,
                alpha=0.7,
                color=color_mapping[c])

        # Plot the non-interpolated data with solid lines
        ax.plot(series[c],
                linestyle='-',
                lw=lw,
                color=color_mapping[c],
                alpha=0.8,
                label=code_to_name.loc[c]['country'])

        if logscale == True:
            ax.set_yscale('log')

    # Draw the legend outside the plot
    ax.legend(loc='center left', bbox_to_anchor=(1, 0.5), frameon=False)
    ax.set_ylabel(ylabel)
    ax.set_xlabel(xlabel)

    return ax

As you can see from this chart, economic growth started in earnest in the 18th century and continued for the next two
hundred years.
How does this compare with other countries’ growth trajectories?


Let’s look at the United States (USA), United Kingdom (GBR), and China (CHN)
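
The code cell that produces Fig. 2.3 is not reproduced in this extraction. A plausible sketch, reusing draw_interp_plots defined above, is below; the figure size and the 1500 start year are assumptions.

# Sketch of the missing plotting cell for Fig. 2.3 (assumed, not the published cell)
fig, ax = plt.subplots(dpi=300, figsize=(10, 6))
cntry = ['CHN', 'GBR', 'USA']
ax = draw_interp_plots(gdppc[cntry].loc[1500:],
                       'International $\'s', 'Year',
                       color_mapping, code_to_name, 2, False, ax)
plt.show()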

Fig. 2.3: GDP per Capita, 1500- (China, UK, USA)

The preceding graph of per capita GDP strikingly reveals how the spread of the industrial revolution has over time gradually lifted the living standards of substantial groups of people
• most of the growth happened in the past 150 years after the industrial revolution.
• per capita GDP in the US and UK rose and diverged from that of China from 1820 to 1940.
• the gap closed rapidly after 1950, and especially after the late 1970s.
• these outcomes reflect complicated combinations of technological and economic-policy factors that students of economic growth try to understand and quantify.
It is fascinating to see China's GDP per capita levels from 1500 through to the 1970s.
Notice the long period of declining GDP per capita levels from the 1700s until the early 20th century.
Thus, the graph indicates
• a long economic downturn and stagnation after the Closed-door Policy by the Qing government.
• China’s very different experience than the UK’s after the onset of the industrial revolution in the UK.
• how the Self-Strengthening Movement seemed mostly to fail.
• how stunning have been the growth achievements of modern Chinese economic policies by the PRC that culminated
with its late 1970s reform and liberalization.

Fig. 2.4: GDP per Capita, 1500-2000 (China)

We can also look at the United States (USA) and United Kingdom (GBR) in more detail.
In the following graph, please watch for
• impact of trade policy (Navigation Act).
• productivity changes brought by the industrial revolution.
• how the US gradually approaches and then surpasses the UK, setting the stage for the "American Century".
• the often unanticipated consequences of wars.
• interruptions and scars left by business cycle recessions and depressions.

Fig. 2.5: GDP per Capita, 1500-2000 (UK and US)

2.4 The industrialized world

Now we’ll construct some graphs of interest to geopolitical historians like Adam Tooze.
We'll focus on total Gross Domestic Product (GDP) (as a proxy for "national geopolitical-military power") rather than focusing on GDP per capita (as a proxy for living standards).

data = pd.read_excel("datasets/mpd2020.xlsx", sheet_name='Full data')


data.set_index(['countrycode', 'year'], inplace=True)
data['gdp'] = data['gdppc'] * data['pop']
gdp = data['gdp'].unstack('countrycode')


2.4.1 Early industrialization (1820 to 1940)

We first visualize the trend of China, the Former Soviet Union, Japan, the UK and the US.
The most notable trend is the rise of the US, surpassing the UK in the 1860s and China in the 1880s.
The growth continued until the large dip in the 1930s when the Great Depression hit.
Meanwhile, Russia experienced significant setbacks during World War I and recovered significantly after the February
Revolution.

fig, ax = plt.subplots(dpi=300)
ax = fig.gca()
cntry = ['CHN', 'SUN', 'JPN', 'GBR', 'USA']
start_year, end_year = (1820, 1945)
ax = draw_interp_plots(gdp[cntry].loc[start_year:end_year],
                       'International $\'s', 'Year',
                       color_mapping, code_to_name, 2, False, ax)

2.5 Constructing a plot similar to Tooze’s

In this section we describe how we have constructed a version of the striking figure from chapter 1 of [Too14] that we
discussed at the start of this lecture.
Let's first define a collection of countries that consists of the British Empire (BEM) so we can replicate that series in Tooze's chart.


Fig. 2.6: GDP in the early industrialization era

BEM = ['GBR', 'IND', 'AUS', 'NZL', 'CAN', 'ZAF']

# Interpolate incomplete time-series
gdp['BEM'] = gdp[BEM].loc[start_year-1:end_year].interpolate(method='index').sum(axis=1)

Let’s take a look at the aggregation that represents the British Empire.

gdp['BEM'].plot() # The first year is np.nan due to interpolation

<Axes: xlabel='year'>


code_to_name

country
countrycode
AFG Afghanistan
AGO Angola
ALB Albania
ARE United Arab Emirates
ARG Argentina
... ...
YEM Yemen
YUG Former Yugoslavia
ZAF South Africa
ZMB Zambia
ZWE Zimbabwe

[169 rows x 1 columns]

Now let’s assemble our series and get ready to plot them.

# Define colour mapping and name for BEM
color_mapping['BEM'] = color_mapping['GBR']  # Set the color to be the same as Great Britain

# Add British Empire to code_to_name
bem = pd.DataFrame(["British Empire"], index=["BEM"], columns=['country'])
bem.index.name = 'countrycode'
code_to_name = pd.concat([code_to_name, bem])


fig, ax = plt.subplots(dpi=300)
ax = fig.gca()
cntry = ['DEU', 'USA', 'SUN', 'BEM', 'FRA', 'JPN']
start_year, end_year = (1821, 1945)
ax = draw_interp_plots(gdp[cntry].loc[start_year:end_year],
                       'Real GDP in 2011 $\'s', 'Year',
                       color_mapping, code_to_name, 2, False, ax)
plt.savefig("./_static/lecture_specific/long_run_growth/tooze_ch1_graph.png",
            dpi=300, bbox_inches='tight')
plt.show()

At the start of this lecture, we noted how US GDP came from “nowhere” at the start of the 19th century to rival and then
overtake the GDP of the British Empire by the end of the 19th century, setting the geopolitical stage for the “American
(twentieth) century”.
Let’s move forward in time and start roughly where Tooze’s graph stopped after World War II.
In the spirit of Tooze’s chapter 1 analysis, doing this will provide some information about geopolitical realities today.

2.5.1 The modern era (1950 to 2020)

The following graph displays how quickly China has grown, especially since the late 1970s.

fig, ax = plt.subplots(dpi=300)
ax = fig.gca()
cntry = ['CHN', 'SUN', 'JPN', 'GBR', 'USA']
start_year, end_year = (1950, 2020)
ax = draw_interp_plots(gdp[cntry].loc[start_year:end_year],
                       'International $\'s', 'Year',
                       color_mapping, code_to_name, 2, False, ax)

Fig. 2.7: GDP in the modern era

It is tempting to compare this graph with figure Fig. 2.6 that showed the US overtaking the UK near the start of the "American Century", a version of the graph featured in chapter 1 of [Too14].


2.6 Regional analysis

We often want to study historical experiences of countries outside the club of “World Powers”.
Fortunately, the Maddison Historical Statistics dataset also includes regional aggregations

data = pd.read_excel("datasets/mpd2020.xlsx", sheet_name='Regional data',
                     header=(0, 1, 2), index_col=0)
data.columns = data.columns.droplevel(level=2)

We can save the raw data in a more convenient format to build a single table of regional GDP per capita

regionalgdppc = data['gdppc_2011'].copy()
regionalgdppc.index = pd.to_datetime(regionalgdppc.index, format='%Y')

Let’s interpolate based on time to fill in any gaps in the dataset for the purpose of plotting

regionalgdppc.interpolate(method='time', inplace=True)

and record a dataset of world GDP per capita

worldgdppc = regionalgdppc['World GDP pc']

fig = plt.figure(dpi=300)
ax = fig.gca()
ax = worldgdppc.plot(
    ax=ax,
    xlabel='Year',
    ylabel='2011 US$',
)


Fig. 2.8: World GDP per capita

Looking more closely, let's first compare the time series for Western Offshoots and Sub-Saharan Africa, and then look more broadly at a number of different regions around the world.
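
The lecture's own code for this two-region comparison is not reproduced here; the following is an illustrative sketch, and the two column names are assumptions about the Maddison regional sheet (check regionalgdppc.columns and adjust).

# Illustrative two-region comparison (column names are assumed)
fig, ax = plt.subplots(dpi=300)
for region in ['Western Offshoots', 'Sub-Sahara Africa']:
    series = regionalgdppc[region]
    ax.plot(series.index, series.values, label=region)
ax.set_yscale('log')
ax.set_ylabel('International $\'s')
ax.set_xlabel('Year')
ax.legend()
plt.show()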
Again we see the divergence of the West from the rest of the world after the industrial revolution and the convergence of
the world after the 1950s

fig = plt.figure(dpi=300)
ax = fig.gca()
line_styles = ['-', '--', ':', '-.', '.', 'o', '-', '--', '-']
ax = regionalgdppc.plot(ax=ax, style=line_styles)
ax.set_yscale('log')
plt.legend(loc='lower center',
           ncol=3, bbox_to_anchor=[0.5, -0.4])
plt.show()


Fig. 2.9: Regional GDP per capita



CHAPTER 3: BUSINESS CYCLES

3.1 Overview

In this lecture we review some empirical aspects of business cycles.


Business cycles are fluctuations in economic activity over time.
These include expansions (also called booms) and contractions (also called recessions).
For our study, we will use economic indicators from the World Bank and FRED.
In addition to the packages already installed by Anaconda, this lecture requires

!pip install wbgapi


!pip install pandas-datareader

We use the following imports

import matplotlib.pyplot as plt
import pandas as pd
import numpy as np
import scipy.stats as st
import datetime
import wbgapi as wb
import pandas_datareader.data as web

Here’s some minor code to help with colors in our plots.
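
That code is not reproduced in this extraction. A minimal sketch of the kind of styling cell it refers to follows; the particular colors and line styles are assumptions.

# Sketch of the missing styling cell (assumed): set a default color/linestyle
# cycle so that different countries are visually distinct in the plots below.
cycler = plt.cycler(linestyle=['-', '-.', '--', ':'],
                    color=['#377eb8', '#ff7f00', '#4daf4a', '#ff334f'])
plt.rc('axes', prop_cycle=cycler)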

3.2 Data acquisition

We will use the World Bank’s data API wbgapi and pandas_datareader to retrieve data.
We can use wb.series.info with the argument q to query available data from the World Bank.
For example, let’s retrieve the GDP growth data ID to query GDP growth data.

wb.series.info(q='GDP growth')

id value
----------------- ---------------------
NY.GDP.MKTP.KD.ZG GDP growth (annual %)
1 elements


Now we use this series ID to obtain the data.

gdp_growth = wb.data.DataFrame('NY.GDP.MKTP.KD.ZG',
                               ['USA', 'ARG', 'GBR', 'GRC', 'JPN'],
                               labels=True)
gdp_growth

Country YR1960 YR1961 YR1962 YR1963 YR1964 \


economy
JPN Japan NaN 12.043536 8.908973 8.473642 11.676708
GRC Greece NaN 13.203841 0.364811 11.844868 9.409677
GBR United Kingdom NaN 2.677119 1.102910 4.874384 5.533659
ARG Argentina NaN 5.427843 -0.852022 -5.308197 10.130298
USA United States NaN 2.300000 6.100000 4.400000 5.800000

YR1965 YR1966 YR1967 YR1968 ... YR2013 YR2014 \


economy ...
JPN 5.819708 10.638562 11.082142 12.882468 ... 2.005100 0.296206
GRC 10.768011 6.494502 5.669485 7.203719 ... -2.515997 0.475696
GBR 2.142177 1.573100 2.786475 5.441083 ... 1.819863 3.199703
ARG 10.569433 -0.659726 3.191997 4.822501 ... 2.405324 -2.512615
USA 6.400000 6.500000 2.500000 4.800000 ... 1.841875 2.287776

YR2015 YR2016 YR2017 YR2018 YR2019 YR2020 \


economy
JPN 1.560627 0.753827 1.675332 0.643391 -0.402169 -4.278604
GRC -0.196088 -0.487173 1.092149 1.668429 1.884342 -9.004044
GBR 2.393103 2.165206 2.443570 1.705021 1.604309 -11.030858
ARG 2.731160 -2.080328 2.818503 -2.617396 -2.000861 -9.943235
USA 2.706370 1.667472 2.241921 2.945385 2.294439 -2.767803

YR2021 YR2022
economy
JPN 2.142487 1.028625
GRC 8.434426 5.913708
GBR 7.597471 4.101621
ARG 10.398249 5.243044
USA 5.945485 2.061593

[5 rows x 64 columns]

We can look at the series’ metadata to learn more about the series (click to expand).

wb.series.metadata.get('NY.GDP.MKTP.KD.ZG')

3.3 GDP growth rate

First we look at GDP growth.


Let’s source our data from the World Bank and clean it.

# Use the series ID retrieved before
gdp_growth = wb.data.DataFrame('NY.GDP.MKTP.KD.ZG',
                               ['USA', 'ARG', 'GBR', 'GRC', 'JPN'],
                               labels=True)
gdp_growth = gdp_growth.set_index('Country')
gdp_growth.columns = gdp_growth.columns.str.replace('YR', '').astype(int)

Here’s a first look at the data

gdp_growth

1960 1961 1962 1963 1964 1965 \


Country
Japan NaN 12.043536 8.908973 8.473642 11.676708 5.819708
Greece NaN 13.203841 0.364811 11.844868 9.409677 10.768011
United Kingdom NaN 2.677119 1.102910 4.874384 5.533659 2.142177
Argentina NaN 5.427843 -0.852022 -5.308197 10.130298 10.569433
United States NaN 2.300000 6.100000 4.400000 5.800000 6.400000

1966 1967 1968 1969 ... 2013 \


Country ...
Japan 10.638562 11.082142 12.882468 12.477895 ... 2.005100
Greece 6.494502 5.669485 7.203719 11.563667 ... -2.515997
United Kingdom 1.573100 2.786475 5.441083 1.924097 ... 1.819863
Argentina -0.659726 3.191997 4.822501 9.679526 ... 2.405324
United States 6.500000 2.500000 4.800000 3.100000 ... 1.841875

2014 2015 2016 2017 2018 2019 \


Country
Japan 0.296206 1.560627 0.753827 1.675332 0.643391 -0.402169
Greece 0.475696 -0.196088 -0.487173 1.092149 1.668429 1.884342
United Kingdom 3.199703 2.393103 2.165206 2.443570 1.705021 1.604309
Argentina -2.512615 2.731160 -2.080328 2.818503 -2.617396 -2.000861
United States 2.287776 2.706370 1.667472 2.241921 2.945385 2.294439

2020 2021 2022


Country
Japan -4.278604 2.142487 1.028625
Greece -9.004044 8.434426 5.913708
United Kingdom -11.030858 7.597471 4.101621
Argentina -9.943235 10.398249 5.243044
United States -2.767803 5.945485 2.061593

[5 rows x 63 columns]

We write a function to generate plots for individual countries taking into account the recessions.
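
The function and the g_params, b_params and t_params styling dictionaries it uses are not reproduced in this extraction. The stand-in below keeps the same call signature but is a simplified sketch: the shaded downturn windows are hand-picked for illustration rather than taken from NBER recession dates.

# Simplified stand-in for the missing cell (assumed, not the published code)
g_params = {'alpha': 0.7}                    # growth-line styling
b_params = {'color': 'grey', 'alpha': 0.2}   # shading for downturns
t_params = {'color': 'grey', 'fontsize': 9,
            'va': 'center', 'ha': 'center'}  # text labels for downturns

def plot_series(data, country, ylabel, txt_pos, ax,
                g_params, b_params, t_params, baseline=0):
    """Plot one country's series and shade a few major downturn episodes."""
    series = data.loc[country]
    ax.plot(series.index, series.values, label=country, **g_params)

    # Shade selected global downturns (illustrative windows, not NBER dates)
    downturns = {'Oil crisis\n(1974)': (1973, 1975),
                 'GFC\n(2008)': (2007, 2009),
                 'Covid-19\n(2020)': (2019, 2021)}
    ymin, ymax = ax.get_ylim()
    y_text = ymin + txt_pos * (ymax - ymin)
    for label, (start, end) in downturns.items():
        ax.axvspan(start, end, **b_params)
        ax.text((start + end) / 2, y_text, label, **t_params)

    if baseline is not None:
        ax.axhline(y=baseline, color='black', linestyle='--', lw=1)
    ax.set_ylabel(ylabel)
    ax.legend(loc='upper left')
    return ax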
Let’s start with the United States.

fig, ax = plt.subplots()

country = 'United States'
ylabel = 'GDP growth rate (%)'
plot_series(gdp_growth, country,
            ylabel, 0.1, ax,
            g_params, b_params, t_params)
plt.show()


Fig. 3.1: United States (GDP growth rate %)

GDP growth is positive on average and trending slightly downward over time.
We also see fluctuations in GDP growth over time, some of which are quite large.
Let’s look at a few more countries to get a basis for comparison.
The United Kingdom (UK) has a similar pattern to the US, with a slow decline in the growth rate and significant fluctuations.
Notice the very large dip during the Covid-19 pandemic.

fig, ax = plt.subplots()

country = 'United Kingdom'
plot_series(gdp_growth, country,
            ylabel, 0.1, ax,
            g_params, b_params, t_params)
plt.show()

Fig. 3.2: United Kingdom (GDP growth rate %)

Now let’s consider Japan, which experienced rapid growth in the 1960s and 1970s, followed by slowed expansion in the
past two decades.
Major dips in the growth rate coincided with the Oil Crisis of the 1970s, the Global Financial Crisis (GFC) and the
Covid-19 pandemic.

fig, ax = plt.subplots()

country = 'Japan'
plot_series(gdp_growth, country,
            ylabel, 0.1, ax,
            g_params, b_params, t_params)
plt.show()

Fig. 3.3: Japan (GDP growth rate %)

Now let’s study Greece.

fig, ax = plt.subplots()

country = 'Greece'
plot_series(gdp_growth, country,
            ylabel, 0.1, ax,
            g_params, b_params, t_params)
plt.show()

Fig. 3.4: Greece (GDP growth rate %)

Greece experienced a very large drop in GDP growth around 2010-2011, during the peak of the Greek debt crisis.
Next let’s consider Argentina.

fig, ax = plt.subplots()

country = 'Argentina'
plot_series(gdp_growth, country,
            ylabel, 0.1, ax,
            g_params, b_params, t_params)
plt.show()

Fig. 3.5: Argentina (GDP growth rate %)

Notice that Argentina has experienced far more volatile cycles than the economies examined above.
At the same time, Argentina’s growth rate did not fall during the two developed economy recessions in the 1970s and
1990s.

3.4 Unemployment

Another important measure of business cycles is the unemployment rate.


We study unemployment using rate data from FRED spanning 1929-1942 and 1948-2022, combined with unemployment rate data over 1942-1948 estimated by the Census Bureau.
Let’s plot the unemployment rate in the US from 1929 to 2022 with recessions defined by the NBER.
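
The retrieval and plotting code is not reproduced in this extraction. The sketch below covers only the post-1948 portion, assuming the FRED series UNRATE (civilian unemployment rate) and USREC (NBER recession indicator); the 1929-1948 estimates mentioned above are not reconstructed here.

# Partial sketch of the missing cell (assumed series IDs and styling)
start_date = datetime.datetime(1948, 1, 1)
end_date = datetime.datetime(2022, 12, 31)

unrate = web.DataReader('UNRATE', 'fred', start_date, end_date)  # unemployment rate (%)
usrec = web.DataReader('USREC', 'fred', start_date, end_date)    # 1 during NBER recessions

fig, ax = plt.subplots()
ax.plot(unrate.index, unrate['UNRATE'], lw=1)
ax.fill_between(usrec.index, 0, 1, where=usrec['USREC'] == 1,
                transform=ax.get_xaxis_transform(), color='grey', alpha=0.2)
ax.set_ylabel('unemployment rate (%)')
plt.show()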

The plot shows that


• expansions and contractions of the labor market have been highly correlated with recessions.

• cycles are, in general, asymmetric: sharp rises in unemployment are followed by slow recoveries.
It also shows us how unique labor market conditions were in the US during the post-pandemic recovery.
The labor market recovered at an unprecedented rate after the shock in 2020-2021.

3.5 Synchronization

In our previous discussion, we found that developed economies have had relatively synchronized periods of recession.
At the same time, this synchronization did not appear in Argentina until the 2000s.
Let’s examine this trend further.
With slight modifications, we can use our previous function to draw a plot that includes multiple countries.
Here we compare the GDP growth rate of developed economies and developing economies.
We use the United Kingdom, United States, Germany, and Japan as examples of developed economies.

We choose Brazil, China, Argentina, and Mexico as representative developing economies.
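
The code used to produce the comparison figures below is not shown in the text. As a minimal sketch (assuming gdp_growth is the DataFrame displayed earlier, with countries in the index and years in the columns, that all eight countries appear in it, and leaving out the recession shading used in the actual figures), the comparison could be drawn roughly as follows.

developed = ['United Kingdom', 'United States', 'Germany', 'Japan']
developing = ['Brazil', 'China', 'Argentina', 'Mexico']

for title, countries in [('Developed economies', developed),
                         ('Developing economies', developing)]:
    fig, ax = plt.subplots()
    for country in countries:
        # Each row of gdp_growth holds one country's annual growth rates,
        # with the columns giving the years
        ax.plot(gdp_growth.columns, gdp_growth.loc[country],
                label=country, alpha=0.8)
    ax.set_title(title)
    ax.set_ylabel('GDP growth rate (%)')
    ax.legend()
    plt.show()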

Fig. 3.7: Developed economies (GDP growth rate %)

Fig. 3.8: Developing economies (GDP growth rate %)

The comparison of GDP growth rates above suggests that business cycles are becoming more synchronized in 21st-century recessions.
However, emerging and less developed economies often experience more volatile changes throughout the economic cycles.

Despite the synchronization in GDP growth, the experience of individual countries during the recession often differs.
We use the unemployment rate and the recovery of labor market conditions as another example.
Here we compare the unemployment rate of the United States, the United Kingdom, Japan, and France.

Fig. 3.9: Developed economies (unemployment rate %)

We see that France, with its strong labor unions, typically experiences relatively slow labor market recoveries after negative
shocks.
We also notice that Japan has a history of very low and stable unemployment rates.

3.6 Leading indicators and correlated factors

Examining leading indicators and correlated factors helps policymakers to understand the causes and results of business
cycles.
We will discuss potential leading indicators and correlated factors from three perspectives: consumption, production, and
credit level.


3.6.1 Consumption

Consumption depends on consumers' confidence in their future income and in the overall performance of the economy.
One widely cited indicator for consumer confidence is the consumer sentiment index published by the University of
Michigan.
Here we plot the University of Michigan Consumer Sentiment Index and year-on-year core consumer price index (CPI)
change from 1978-2022 in the US.
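
The code that loads and plots these series is hidden in the text. As a rough sketch, the data could be pulled from FRED with pandas_datareader (UMCSENT and CPILFESL are the standard FRED codes for the Michigan sentiment index and core CPI, but treat the package choice and the styling here as assumptions rather than the code behind the figure).

import datetime
import matplotlib.pyplot as plt
from pandas_datareader import data as pdr

start, end = datetime.datetime(1978, 1, 1), datetime.datetime(2022, 12, 31)

sentiment = pdr.DataReader('UMCSENT', 'fred', start, end)    # consumer sentiment index
core_cpi = pdr.DataReader('CPILFESL', 'fred', start, end)    # core CPI (index level)
core_cpi_yoy = core_cpi['CPILFESL'].pct_change(12) * 100     # year-on-year % change

fig, ax = plt.subplots()
ax.plot(sentiment.index, sentiment['UMCSENT'], label='consumer sentiment')
ax.set_ylabel('consumer sentiment index')

ax2 = ax.twinx()                                             # second axis for the CPI series
ax2.plot(core_cpi_yoy.index, core_cpi_yoy, color='C1', label='core CPI (YoY %)')
ax2.set_ylabel('core CPI change (%)')

ax.legend(loc='upper left')
ax2.legend(loc='upper right')
plt.show()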

Fig. 3.10: Consumer sentiment index and YoY CPI change, US

We see that
• consumer sentiment often remains high during expansions and drops before recessions.
• there is a clear negative correlation between consumer sentiment and the CPI.
When the price of consumer commodities rises, consumer confidence diminishes.
This trend is more significant during stagflation.


3.6.2 Production

Real industrial output is highly correlated with recessions in the economy.


However, it is not a leading indicator, as the peak of contraction in production is delayed relative to consumer confidence
and inflation.
We plot the real industrial output change from the previous year from 1919 to 2022 in the US to show this trend.
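
The plotting code is again hidden. A hedged sketch using the FRED industrial production index (series code INDPRO, which begins in 1919) might look like this.

import datetime
import matplotlib.pyplot as plt
from pandas_datareader import data as pdr

indpro = pdr.DataReader('INDPRO', 'fred',
                        datetime.datetime(1919, 1, 1),
                        datetime.datetime(2022, 12, 31))
yoy = indpro['INDPRO'].pct_change(12) * 100   # change from the same month one year earlier

fig, ax = plt.subplots()
ax.plot(yoy.index, yoy)
ax.set_ylabel('YoY real output change (%)')
plt.show()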

Fig. 3.11: YoY real output change, US (%)

We observe the delayed contraction in the plot across recessions.

3.6.3 Credit level

Credit contractions often occur during recessions, as lenders become more cautious and borrowers become more hesitant
to take on additional debt.
This is due to factors such as a decrease in overall economic activity and gloomy expectations for the future.
One example is domestic credit to the private sector by banks in the UK.
The following graph shows the domestic credit to the private sector as a percentage of GDP by banks from 1970 to 2022
in the UK.

Note that the credit rises during economic expansions and stagnates or even contracts after recessions.

Fig. 3.12: Domestic credit to private sector by banks (% of GDP)


CHAPTER

FOUR

INCOME AND WEALTH INEQUALITY

4.1 Overview

In this section we
• provide motivation for the techniques deployed in the lecture and
• import code libraries needed for our work.

4.1.1 Some history

Many historians argue that inequality played a key role in the fall of the Roman Republic.
After defeating Carthage and invading Spain, money flowed into Rome and greatly enriched those in power.
Meanwhile, ordinary citizens were taken from their farms to fight for long periods, diminishing their wealth.
The resulting growth in inequality caused political turmoil that shook the foundations of the republic.
Eventually, the Roman Republic gave way to a series of dictatorships, starting with Octavian (Augustus) in 27 BCE.
This history is fascinating in its own right, and we can see some parallels with certain countries in the modern world.
Many recent political debates revolve around inequality.
Many economic policies, from taxation to the welfare state, are aimed at addressing inequality.

4.1.2 Measurement

One problem with these debates is that inequality is often poorly defined.
Moreover, debates on inequality are often tied to political beliefs.
This is dangerous for economists because allowing political beliefs to shape our findings reduces objectivity.
To bring a truly scientific perspective to the topic of inequality we must start with careful definitions.
In this lecture we discuss standard measures of inequality used in economic research.
For each of these measures, we will look at both simulated and real data.
We will install the following libraries.

!pip install --upgrade quantecon interpolation

And we use the following imports.


import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import quantecon as qe
import random as rd
from interpolation import interp

4.2 The Lorenz curve

One popular measure of inequality is the Lorenz curve.


In this section we define the Lorenz curve and examine its properties.

4.2.1 Definition

The Lorenz curve takes a sample 𝑤1 , … , 𝑤𝑛 and produces a curve 𝐿.


We suppose that the sample 𝑤1 , … , 𝑤𝑛 has been sorted from smallest to largest.
To aid our interpretation, suppose that we are measuring wealth
• 𝑤1 is the wealth of the poorest member of the population and
• 𝑤𝑛 is the wealth of the richest member of the population.
The curve 𝐿 is just a function 𝑦 = 𝐿(𝑥) that we can plot and interpret.
To create it we first generate data points (𝑥𝑖 , 𝑦𝑖 ) according to

$$
x_i = \frac{i}{n},
\qquad
y_i = \frac{\sum_{j \le i} w_j}{\sum_{j \le n} w_j},
\qquad i = 1, \ldots, n
$$

Now the Lorenz curve 𝐿 is formed from these data points using interpolation.
(If we use a line plot in Matplotlib, the interpolation will be done for us.)
The meaning of the statement 𝑦 = 𝐿(𝑥) is that the lowest (100 × 𝑥)% of people have (100 × 𝑦)% of all wealth.
• if 𝑥 = 0.5 and 𝑦 = 0.1, then the bottom 50% of the population owns 10% of the wealth.
In the discussion above we focused on wealth but the same ideas apply to income, consumption, etc.
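
As a sketch of the definition (the lecture itself relies on qe.lorenz_curve below), the data points can be computed directly as follows.

import numpy as np

def lorenz_points(w):
    "Return the (x_i, y_i) points defining the Lorenz curve of the sample w."
    w = np.sort(np.asarray(w))        # sort from smallest to largest
    n = len(w)
    x = np.arange(1, n + 1) / n       # x_i = i / n
    y = np.cumsum(w) / w.sum()        # y_i = cumulative share of the total
    return x, y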

4.2.2 Lorenz curves of simulated data

Let’s look at some examples and try to build understanding.


In the next figure, we generate 𝑛 = 2000 draws from a lognormal distribution and treat these draws as our population.
The straight line (𝑥 = 𝐿(𝑥) for all 𝑥) corresponds to perfect equality.
The lognormal draws produce a less equal distribution.
For example, if we imagine these draws as being observations of wealth across a sample of households, then the dashed
lines show that the bottom 80% of households own just over 40% of total wealth.


n = 2000
sample = np.exp(np.random.randn(n))

fig, ax = plt.subplots()

f_vals, l_vals = qe.lorenz_curve(sample)


ax.plot(f_vals, l_vals, label=f'lognormal sample', lw=2)
ax.plot(f_vals, f_vals, label='equality', lw=2)

ax.legend(fontsize=12)

ax.vlines([0.8], [0.0], [0.43], alpha=0.5, colors='k', ls='--')


ax.hlines([0.43], [0], [0.8], alpha=0.5, colors='k', ls='--')

ax.set_ylim((0, 1))
ax.set_xlim((0, 1))

plt.show()

Fig. 4.1: Lorenz curve of simulated data


4.2.3 Lorenz curves for US data

Next let’s look at the real data, focusing on income and wealth in the US in 2016.
The following code block imports a subset of the dataset SCF_plus, which is derived from the Survey of Consumer
Finances (SCF).

url = 'https://media.githubusercontent.com/media/QuantEcon/high_dim_data/main/SCF_
↪plus/SCF_plus_mini.csv'

df = pd.read_csv(url)
df = df.dropna()
df_income_wealth = df

df_income_wealth.head()

year n_wealth t_income l_income weights nw_groups ti_groups


0 1950 266933.75 55483.027 0.0 0.998732 50-90% 50-90%
1 1950 87434.46 55483.027 0.0 0.998732 50-90% 50-90%
2 1950 795034.94 55483.027 0.0 0.998732 Top 10% 50-90%
3 1950 94531.78 55483.027 0.0 0.998732 50-90% 50-90%
4 1950 166081.03 55483.027 0.0 0.998732 50-90% 50-90%

The following code block uses data stored in dataframe df_income_wealth to generate the Lorenz curves.
(The code is somewhat complex because we need to adjust the data according to population weights supplied by the SCF.)
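
That code is not reproduced here. As a minimal sketch of the weighted calculation for a single year and a single variable (the helper name is ours; the hidden code builds the lists f_vals_nw, l_vals_nw, and so on across all survey years), the idea is:

def weighted_lorenz(values, weights):
    "Lorenz curve when each observation carries a survey weight."
    order = np.argsort(values)
    v = np.asarray(values)[order]
    w = np.asarray(weights)[order]
    cum_people = np.cumsum(w) / w.sum()             # weighted population share
    cum_value = np.cumsum(v * w) / (v * w).sum()    # weighted share of the total
    return cum_people, cum_value

# For example, net wealth in 2016 (column names as in the table above)
df16 = df_income_wealth[df_income_wealth['year'] == 2016]
f_nw_2016, l_nw_2016 = weighted_lorenz(df16['n_wealth'], df16['weights'])
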
Now we plot Lorenz curves for net wealth, total income and labor income in the US in 2016.

fig, ax = plt.subplots()

ax.plot(f_vals_nw[-1], l_vals_nw[-1], label=f'net wealth')


ax.plot(f_vals_ti[-1], l_vals_ti[-1], label=f'total income')
ax.plot(f_vals_li[-1], l_vals_li[-1], label=f'labor income')
ax.plot(f_vals_nw[-1], f_vals_nw[-1], label=f'equality')

ax.legend(fontsize=12)
plt.show()

Fig. 4.2: 2016 US Lorenz curves

Here all the income and wealth measures are pre-tax.


Total income is the sum of all of a household's income sources, including labor income but excluding capital gains.
One key finding from this figure is that wealth inequality is significantly more extreme than income inequality.

4.3 The Gini coefficient

The Lorenz curve is a useful visual representation of inequality in a distribution.


Another popular measure of income and wealth inequality is the Gini coefficient.
The Gini coefficient is just a number, rather than a curve.
In this section we discuss the Gini coefficient and its relationship to the Lorenz curve.


4.3.1 Definition

As before, suppose that the sample 𝑤1 , … , 𝑤𝑛 has been sorted from smallest to largest.
The Gini coefficient is defined for the sample above as
$$
G := \frac{\sum_{i=1}^{n} \sum_{j=1}^{n} |w_j - w_i|}{2 n \sum_{i=1}^{n} w_i}.
\tag{4.1}
$$

The Gini coefficient is closely related to the Lorenz curve.


In fact, it can be shown that its value is twice the area between the line of equality and the Lorenz curve (i.e., the shaded area in the figure below).
The idea is that 𝐺 = 0 indicates complete equality, while 𝐺 = 1 indicates complete inequality.
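
As a sketch, (4.1) can also be coded directly (this is quadratic in the sample size, so only useful for small samples; the lectures rely on qe.gini_coefficient instead).

def gini_from_definition(w):
    "Compute G in (4.1): sum of |w_j - w_i| over all pairs, divided by 2n * sum(w)."
    w = np.asarray(w)
    n = len(w)
    return np.abs(w[:, None] - w[None, :]).sum() / (2 * n * w.sum())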

fig, ax = plt.subplots()

f_vals, l_vals = qe.lorenz_curve(sample)


ax.plot(f_vals, l_vals, label=f'lognormal sample', lw=2)
ax.plot(f_vals, f_vals, label='equality', lw=2)

ax.legend(fontsize=12)

ax.vlines([0.8], [0.0], [0.43], alpha=0.5, colors='k', ls='--')


ax.hlines([0.43], [0], [0.8], alpha=0.5, colors='k', ls='--')

ax.fill_between(f_vals, l_vals, f_vals, alpha=0.06)

ax.set_ylim((0, 1))
ax.set_xlim((0, 1))

ax.text(0.04, 0.5, r'$G = 2 \times$ shaded area', fontsize=12)

plt.show()

Fig. 4.3: Shaded Lorenz curve of simulated data

4.3.2 Gini coefficient dynamics of simulated data

Let’s examine the Gini coefficient in some simulations.


The following code computes the Gini coefficients for five different populations.
Each of these populations is generated by drawing from a lognormal distribution with parameters 𝜇 (mean) and 𝜎 (standard
deviation).
To create the five populations, we vary 𝜎 over a grid of length 5 between 0.2 and 4.
In each case we set 𝜇 = −𝜎2 /2.
This implies that the mean of the distribution does not change with 𝜎.
(You can check this by looking up the expression for the mean of a lognormal distribution.)


k = 5
σ_vals = np.linspace(0.2, 4, k)
n = 2_000

ginis = []

for σ in σ_vals:
μ = -σ**2 / 2
y = np.exp(μ + σ * np.random.randn(n))
ginis.append(qe.gini_coefficient(y))

def plot_inequality_measures(x, y, legend, xlabel, ylabel):

fig, ax = plt.subplots()
ax.plot(x, y, marker='o', label=legend)

ax.set_xlabel(xlabel, fontsize=12)
ax.set_ylabel(ylabel, fontsize=12)

ax.legend(fontsize=12)
plt.show()

plot_inequality_measures(σ_vals,
ginis,
'simulated',
'$\sigma$',
'gini coefficients')

Fig. 4.4: Gini coefficients of simulated data

The plots show that inequality rises with 𝜎, according to the Gini coefficient.

4.3.3 Gini coefficient dynamics for US data

Now let’s look at Gini coefficients for US data derived from the SCF.
The following code creates a list called Ginis.
It stores the Gini coefficients computed from the dataframe df_income_wealth using the gini_coefficient function from the QuantEcon library.
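
That code is hidden in the text. A rough sketch of what it might do is shown below; resampling each year's observations in proportion to their survey weights is our assumption about the method, and only the resulting lists matter for what follows.

years = sorted(df_income_wealth['year'].unique())
variables = ['n_wealth', 't_income', 'l_income']

Ginis = []
for var in variables:
    ginis = []
    for year in years:
        dfy = df_income_wealth[df_income_wealth['year'] == year]
        probs = dfy['weights'] / dfy['weights'].sum()
        # Resample observations in proportion to their survey weights
        sample = np.random.choice(dfy[var], size=5_000, replace=True, p=probs)
        ginis.append(qe.gini_coefficient(sample))
    Ginis.append(ginis)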

ginis_nw, ginis_ti, ginis_li = Ginis

Let’s plot the Gini coefficients for net wealth, labor income and total income.

# use an average to replace an outlier in labor income gini


ginis_li_new = ginis_li
ginis_li_new[5] = (ginis_li[4] + ginis_li[6]) / 2

xlabel = "year"
ylabel = "gini coefficient"

fig, ax = plt.subplots()
ax.plot(years, ginis_nw, marker='o')

ax.set_xlabel(xlabel, fontsize=12)
ax.set_ylabel(ylabel, fontsize=12)

plt.show()

Fig. 4.5: Gini coefficients of US net wealth

xlabel = "year"
ylabel = "gini coefficient"

fig, ax = plt.subplots()

ax.plot(years, ginis_li_new, marker='o', label="labor income")


ax.plot(years, ginis_ti, marker='o', label="total income")

ax.set_xlabel(xlabel, fontsize=12)
ax.set_ylabel(ylabel, fontsize=12)

ax.legend(fontsize=12)
plt.show()

Fig. 4.6: Gini coefficients of US income

We see that, by this measure, inequality in wealth and income has risen substantially since 1980.
The wealth time series exhibits a strong U-shape.

4.4 Top shares

Another popular measure of inequality is the top shares.


Measuring specific shares is less complex than the Lorenz curve or the Gini coefficient.
In this section we show how to compute top shares.

4.4.1 Definition

As before, suppose that the sample 𝑤1 , … , 𝑤𝑛 has been sorted from smallest to largest.
Given the Lorenz curve 𝑦 = 𝐿(𝑥) defined above, the top 100 × 𝑝% share is defined as

$$
T(p) = 1 - L(1 - p) \approx \frac{\sum_{j \ge i} w_j}{\sum_{j \le n} w_j},
\qquad i = \lfloor n(1 - p) \rfloor
\tag{4.2}
$$

Here ⌊⋅⌋ is the floor function, which rounds any number down to the integer less than or equal to that number.
The following code uses the data from dataframe df_income_wealth to generate another dataframe
df_topshares.
df_topshares stores the top 10 percent shares of total income, labor income and net wealth from 1950 to 2016 in the US.
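
The construction of df_topshares is also hidden. Here is a sketch of a weighted top-share computation that would produce the column names used in the plotting code below (everything else in this block is an assumption).

def weighted_top_share(values, weights, p=0.1):
    "Share of the weighted total held by the top 100*p% of the weighted population."
    order = np.argsort(values)
    v = np.asarray(values)[order]
    w = np.asarray(weights)[order]
    cum_people = np.cumsum(w) / w.sum()
    top = cum_people >= 1 - p
    return (v[top] * w[top]).sum() / (v * w).sum()

rows = []
for year in years:   # `years` as constructed above
    dfy = df_income_wealth[df_income_wealth['year'] == year]
    rows.append({'topshare_n_wealth': weighted_top_share(dfy['n_wealth'], dfy['weights']),
                 'topshare_t_income': weighted_top_share(dfy['t_income'], dfy['weights']),
                 'topshare_l_income': weighted_top_share(dfy['l_income'], dfy['weights'])})
df_topshares = pd.DataFrame(rows, index=years)
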
Then let’s plot the top shares.

xlabel = "year"
ylabel = "top $10\%$ share"

fig, ax = plt.subplots()

ax.plot(years, df_topshares["topshare_l_income"],
marker='o', label="labor income")
ax.plot(years, df_topshares["topshare_n_wealth"],
marker='o', label="net wealth")
ax.plot(years, df_topshares["topshare_t_income"],
marker='o', label="total income")

ax.set_xlabel(xlabel, fontsize=12)
ax.set_ylabel(ylabel, fontsize=12)

ax.legend(fontsize=12)
plt.show()

Fig. 4.7: US top shares

4.5 Exercises

Exercise 4.5.1
Using simulation, compute the top 10 percent shares for the collection of lognormal distributions associated with the
random variables 𝑤𝜎 = exp(𝜇 + 𝜎𝑍), where 𝑍 ∼ 𝑁 (0, 1) and 𝜎 varies over a finite grid between 0.2 and 4.
As 𝜎 increases, so does the variance of 𝑤𝜎 .
To focus on volatility, adjust 𝜇 at each step to maintain the equality 𝜇 = −𝜎2 /2.
For each 𝜎, generate 2,000 independent draws of 𝑤𝜎 and calculate the Lorenz curve and Gini coefficient.
Confirm that higher variance generates more dispersion in the sample, and hence greater inequality.

Solution to Exercise 4.5.1


Here is one solution:

def calculate_top_share(s, p=0.1):

s = np.sort(s)
n = len(s)
index = int(n * (1 - p))
return s[index:].sum() / s.sum()

k = 5
σ_vals = np.linspace(0.2, 4, k)
n = 2_000

topshares = []
ginis = []
f_vals = []
l_vals = []

for σ in σ_vals:
μ = -σ ** 2 / 2
y = np.exp(μ + σ * np.random.randn(n))
f_val, l_val = qe._inequality.lorenz_curve(y)
f_vals.append(f_val)
l_vals.append(l_val)
ginis.append(qe._inequality.gini_coefficient(y))
topshares.append(calculate_top_share(y))

plot_inequality_measures(σ_vals,
topshares,
"simulated data",
"$\sigma$",
"top $10\%$ share")

plot_inequality_measures(σ_vals,
ginis,
"simulated data",
"$\sigma$",
"gini coefficient")

Fig. 4.9: Gini coefficients of simulated data

fig, ax = plt.subplots()
ax.plot([0,1],[0,1], label=f"equality")
for i in range(len(f_vals)):
ax.plot(f_vals[i], l_vals[i], label=f"$\sigma$ = {σ_vals[i]}")
plt.legend()
plt.show()

Fig. 4.10: Lorenz curves for simulated data

Exercise 4.5.2
According to the definition of the top shares (4.2) we can also calculate the top percentile shares using the Lorenz curve.
Compute the top shares of US net wealth using the corresponding Lorenz curves data: f_vals_nw, l_vals_nw and
linear interpolation.
Plot the top shares generated from Lorenz curve and the top shares approximated from data together.

Solution to Exercise 4.5.2


Here is one solution:

def lorenz2top(f_val, l_val, p=0.1):


t = lambda x: interp(f_val, l_val, x)
return 1- t(1 - p)

top_shares_nw = []
for f_val, l_val in zip(f_vals_nw, l_vals_nw):
top_shares_nw.append(lorenz2top(f_val, l_val))

xlabel = "year"
ylabel = "top $10\%$ share"

fig, ax = plt.subplots()

ax.plot(years, df_topshares["topshare_n_wealth"], marker='o',\


label="net wealth-approx")
ax.plot(years, top_shares_nw, marker='o', label="net wealth-lorenz")

ax.set_xlabel(xlabel, fontsize=12)
ax.set_ylabel(ylabel, fontsize=12)

ax.legend(fontsize=12)
plt.show()

Fig. 4.11: US top shares: approximation vs Lorenz



Part III

Essential Tools

CHAPTER

FIVE

LINEAR EQUATIONS AND MATRIX ALGEBRA

5.1 Overview

Many problems in economics and finance require solving linear equations.


In this lecture we discuss linear equations and their applications.
To illustrate the importance of linear equations, we begin with a two good model of supply and demand.
The two good case is so simple that solutions can be calculated by hand.
But often we need to consider markets containing many goods.
In the multiple goods case we face large systems of linear equations, with many equations and unknowns.
To handle such systems we need two things:
• matrix algebra (and the knowledge of how to use it) plus
• computer code to apply matrix algebra to the problems of interest.
This lecture covers these steps.
We will use the following packages:

import numpy as np
import matplotlib.pyplot as plt

5.2 A two good example

In this section we discuss a simple two good example and solve it by


1. pencil and paper
2. matrix algebra
The second method is more general, as we will see.


5.2.1 Pencil and paper methods

Suppose that we have two related goods, such as


• propane and ethanol, and
• rice and wheat, etc.
To keep things simple, we label them as good 0 and good 1.
The demand for each good depends on the price of both goods:
$$
\begin{aligned}
q_0^d &= 100 - 10 p_0 - 5 p_1 \\
q_1^d &= 50 - p_0 - 10 p_1
\end{aligned}
\tag{5.1}
$$

(We are assuming demand decreases when the price of either good goes up, but other cases are also possible.)

Let's suppose that supply is given by

$$
\begin{aligned}
q_0^s &= 10 p_0 + 5 p_1 \\
q_1^s &= 5 p_0 + 10 p_1
\end{aligned}
\tag{5.2}
$$

Equilibrium holds when supply equals demand ($q_0^s = q_0^d$ and $q_1^s = q_1^d$).

This yields the linear system

$$
\begin{aligned}
100 - 10 p_0 - 5 p_1 &= 10 p_0 + 5 p_1 \\
50 - p_0 - 10 p_1 &= 5 p_0 + 10 p_1
\end{aligned}
\tag{5.3}
$$

We can solve this with pencil and paper to get

$$p_0 = 4.41 \quad \text{and} \quad p_1 = 1.18.$$

Inserting these results into either (5.1) or (5.2) yields the equilibrium quantities

$$q_0 = 50 \quad \text{and} \quad q_1 = 33.82.$$

5.2.2 Looking forward

Pencil and paper methods are easy in the two good case.
But what if there are many goods?
For such problems we need matrix algebra.
Before solving problems with matrix algebra, let’s first recall the basics of vectors and matrices, in both theory and
computation.

5.3 Vectors

A vector of length 𝑛 is just a sequence (or array, or tuple) of 𝑛 numbers, which we write as 𝑥 = (𝑥1 , … , 𝑥𝑛 ) or
𝑥 = [𝑥1 , … , 𝑥𝑛 ].
We can write these sequences either horizontally or vertically.
But when we use matrix operations, our default assumption is that vectors are column vectors.
The set of all 𝑛-vectors is denoted by ℝ𝑛 .
For example,


• ℝ2 is the plane — the set of pairs (𝑥1 , 𝑥2 ).


• ℝ3 is 3 dimensional space — the set of vectors (𝑥1 , 𝑥2 , 𝑥3 ).
Often vectors are represented visually as arrows from the origin to the point.
Here’s a visualization.

5.3.1 Vector operations

Sometimes we want to modify vectors.


The two most common operators on vectors are addition and scalar multiplication, which we now describe.
When we add two vectors, we add them element-by-element.
For example,

$$
\begin{bmatrix} 4 \\ -2 \end{bmatrix} +
\begin{bmatrix} 3 \\ 3 \end{bmatrix} =
\begin{bmatrix} 4 + 3 \\ -2 + 3 \end{bmatrix} =
\begin{bmatrix} 7 \\ 1 \end{bmatrix}.
$$

In general,

$$
x + y =
\begin{bmatrix} x_1 \\ x_2 \\ \vdots \\ x_n \end{bmatrix} +
\begin{bmatrix} y_1 \\ y_2 \\ \vdots \\ y_n \end{bmatrix} :=
\begin{bmatrix} x_1 + y_1 \\ x_2 + y_2 \\ \vdots \\ x_n + y_n \end{bmatrix}.
$$

We can visualise vector addition in ℝ2 as follows.


Scalar multiplication is an operation that multiplies a vector 𝑥 with a scalar elementwise.


For example,

$$
-2
\begin{bmatrix} 3 \\ -7 \end{bmatrix} =
\begin{bmatrix} -2 \times 3 \\ -2 \times -7 \end{bmatrix} =
\begin{bmatrix} -6 \\ 14 \end{bmatrix}.
$$

More generally, it takes a number $\gamma$ and a vector $x$ and produces

$$
\gamma x :=
\begin{bmatrix} \gamma x_1 \\ \gamma x_2 \\ \vdots \\ \gamma x_n \end{bmatrix}.
$$

Scalar multiplication is illustrated in the next figure.


In Python, a vector can be represented as a list or tuple, such as x = [2, 4, 6] or x = (2, 4, 6).
However, it is more common to represent vectors with NumPy arrays.
One advantage of NumPy arrays is that scalar multiplication and addition have very natural syntax.

x = np.ones(3) # Vector of three ones


y = np.array((2, 4, 6)) # Converts tuple (2, 4, 6) into a NumPy array
x + y # Add (element-by-element)

array([3., 5., 7.])

4 * x # Scalar multiply

array([4., 4., 4.])

5.3.2 Inner product and norm

The inner product of vectors 𝑥, 𝑦 ∈ ℝ𝑛 is defined as

$$
x^\top y =
\begin{bmatrix} x_1 & x_2 & \cdots & x_n \end{bmatrix}
\begin{bmatrix} y_1 \\ y_2 \\ \vdots \\ y_n \end{bmatrix}
= x_1 y_1 + x_2 y_2 + \cdots + x_n y_n
:= \sum_{i=1}^{n} x_i y_i.
$$

The norm of a vector $x$ represents its "length" (i.e., its distance from the zero vector) and is defined as

$$
\| x \| := \sqrt{x^\top x} := \left( \sum_{i=1}^{n} x_i^2 \right)^{1/2}.
$$


The expression ‖𝑥 − 𝑦‖ can be thought of as the “distance” between 𝑥 and 𝑦.


The inner product and norm can be computed as follows

np.sum(x*y) # Inner product of x and y

12.0

x @ y # Another way to compute the inner product

12.0

np.sqrt(np.sum(x**2)) # Norm of x, method one

1.7320508075688772

np.linalg.norm(x) # Norm of x, method two

1.7320508075688772

5.4 Matrix operations

When we discussed linear price systems, we mentioned using matrix algebra.


Matrix algebra is similar to algebra for numbers.
Let’s review some details.

5.4.1 Addition and scalar multiplication

Just as was the case for vectors, we can add, subtract and scalar multiply matrices.
Scalar multiplication and addition are generalizations of the vector case:
Here is an example of scalar multiplication

$$
3
\begin{bmatrix} 2 & -13 \\ 0 & 5 \end{bmatrix} =
\begin{bmatrix} 6 & -39 \\ 0 & 15 \end{bmatrix}.
$$

In general for a number $\gamma$ and any matrix $A$,

$$
\gamma A =
\gamma
\begin{bmatrix} a_{11} & \cdots & a_{1k} \\ \vdots & \vdots & \vdots \\ a_{n1} & \cdots & a_{nk} \end{bmatrix} :=
\begin{bmatrix} \gamma a_{11} & \cdots & \gamma a_{1k} \\ \vdots & \vdots & \vdots \\ \gamma a_{n1} & \cdots & \gamma a_{nk} \end{bmatrix}.
$$

Consider this example of matrix addition,

$$
\begin{bmatrix} 1 & 5 \\ 7 & 3 \end{bmatrix} +
\begin{bmatrix} 12 & -1 \\ 0 & 9 \end{bmatrix} =
\begin{bmatrix} 13 & 4 \\ 7 & 12 \end{bmatrix}.
$$


In general,

$$
A + B =
\begin{bmatrix} a_{11} & \cdots & a_{1k} \\ \vdots & \vdots & \vdots \\ a_{n1} & \cdots & a_{nk} \end{bmatrix} +
\begin{bmatrix} b_{11} & \cdots & b_{1k} \\ \vdots & \vdots & \vdots \\ b_{n1} & \cdots & b_{nk} \end{bmatrix} :=
\begin{bmatrix} a_{11} + b_{11} & \cdots & a_{1k} + b_{1k} \\ \vdots & \vdots & \vdots \\ a_{n1} + b_{n1} & \cdots & a_{nk} + b_{nk} \end{bmatrix}.
$$

In the latter case, the matrices must have the same shape in order for the definition to make sense.

5.4.2 Matrix multiplication

We also have a convention for multiplying two matrices.


The rule for matrix multiplication generalizes the idea of inner products discussed above.
If 𝐴 and 𝐵 are two matrices, then their product 𝐴𝐵 is formed by taking as its 𝑖, 𝑗-th element the inner product of the 𝑖-th
row of 𝐴 and the 𝑗-th column of 𝐵.
If 𝐴 is 𝑛 × 𝑘 and 𝐵 is 𝑗 × 𝑚, then to multiply 𝐴 and 𝐵 we require 𝑘 = 𝑗, and the resulting matrix 𝐴𝐵 is 𝑛 × 𝑚.
Here’s an example of a 2 × 2 matrix multiplied by a 2 × 1 vector.

$$
A x =
\begin{bmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \end{bmatrix}
\begin{bmatrix} x_1 \\ x_2 \end{bmatrix} =
\begin{bmatrix} a_{11} x_1 + a_{12} x_2 \\ a_{21} x_1 + a_{22} x_2 \end{bmatrix}
$$

As an important special case, consider multiplying 𝑛 × 𝑘 matrix 𝐴 and 𝑘 × 1 column vector 𝑥.


According to the preceding rule, this gives us an 𝑛 × 1 column vector.

$$
A x =
\begin{bmatrix}
a_{11} & a_{12} & \cdots & a_{1k} \\
\vdots & \vdots &        & \vdots \\
a_{i1} & a_{i2} & \cdots & a_{ik} \\
\vdots & \vdots &        & \vdots \\
a_{n1} & a_{n2} & \cdots & a_{nk}
\end{bmatrix}_{n \times k}
\begin{bmatrix} x_1 \\ x_2 \\ \vdots \\ x_k \end{bmatrix}_{k \times 1} :=
\begin{bmatrix}
a_{11} x_1 + a_{12} x_2 + \cdots + a_{1k} x_k \\
\vdots \\
a_{i1} x_1 + a_{i2} x_2 + \cdots + a_{ik} x_k \\
\vdots \\
a_{n1} x_1 + a_{n2} x_2 + \cdots + a_{nk} x_k
\end{bmatrix}_{n \times 1}
\tag{5.4}
$$

Here is a simple illustration of multiplication of two matrices.

$$
A B =
\begin{bmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \end{bmatrix}
\begin{bmatrix} b_{11} & b_{12} \\ b_{21} & b_{22} \end{bmatrix} :=
\begin{bmatrix}
a_{11} b_{11} + a_{12} b_{21} & a_{11} b_{12} + a_{12} b_{22} \\
a_{21} b_{11} + a_{22} b_{21} & a_{21} b_{12} + a_{22} b_{22}
\end{bmatrix}
$$

There are many tutorials to help you further visualize this operation, such as
• this one, or
• the discussion on the Wikipedia page.

Note: Unlike number products, 𝐴𝐵 and 𝐵𝐴 are not generally the same thing.

One important special case is the identity matrix, which has ones on the principal diagonal and zero elsewhere:

$$
I =
\begin{bmatrix} 1 & \cdots & 0 \\ \vdots & \ddots & \vdots \\ 0 & \cdots & 1 \end{bmatrix}
$$
It is a useful exercise to check the following:
• if 𝐴 is 𝑛 × 𝑘 and 𝐼 is the 𝑘 × 𝑘 identity matrix, then 𝐴𝐼 = 𝐴, and
• if 𝐼 is the 𝑛 × 𝑛 identity matrix, then 𝐼𝐴 = 𝐴.
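
For instance, here is a quick numerical check of these two facts.

A = np.array([[1, 2, 3],
              [4, 5, 6]])                   # a 2 x 3 matrix
print(np.allclose(A @ np.identity(3), A))   # A I = A
print(np.allclose(np.identity(2) @ A, A))   # I A = A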


5.4.3 Matrices in NumPy

NumPy arrays are also used as matrices, and have fast, efficient functions and methods for all the standard matrix operations.
You can create them manually from tuples of tuples (or lists of lists) as follows

A = ((1, 2),
(3, 4))

type(A)

tuple

A = np.array(A)

type(A)

numpy.ndarray

A.shape

(2, 2)

The shape attribute is a tuple giving the number of rows and columns — see here for more discussion.
To get the transpose of A, use A.transpose() or, more simply, A.T.
There are many convenient functions for creating common matrices (matrices of zeros, ones, etc.) — see here.
Since operations are performed elementwise by default, scalar multiplication and addition have very natural syntax.

A = np.identity(3) # 3 x 3 identity matrix


B = np.ones((3, 3)) # 3 x 3 matrix of ones
2 * A

array([[2., 0., 0.],


[0., 2., 0.],
[0., 0., 2.]])

A + B

array([[2., 1., 1.],


[1., 2., 1.],
[1., 1., 2.]])

To multiply matrices we use the @ symbol.

Note: In particular, A @ B is matrix multiplication, whereas A * B is element-by-element multiplication.


5.4.4 Two good model in matrix form

We can now revisit the two good model and solve (5.3) numerically via matrix algebra.
This involves some extra steps but the method is widely applicable — as we will see when we include more goods.
First we rewrite (5.1) as

$$
q^d = D p + h
\quad \text{where} \quad
q^d = \begin{bmatrix} q_0^d \\ q_1^d \end{bmatrix}, \quad
D = \begin{bmatrix} -10 & -5 \\ -1 & -10 \end{bmatrix}
\quad \text{and} \quad
h = \begin{bmatrix} 100 \\ 50 \end{bmatrix}.
\tag{5.5}
$$

Recall that 𝑝 ∈ ℝ2 is the price of two goods.


(Please check that 𝑞 𝑑 = 𝐷𝑝 + ℎ represents the same equations as (5.1).)
We rewrite (5.2) as

$$
q^s = C p
\quad \text{where} \quad
q^s = \begin{bmatrix} q_0^s \\ q_1^s \end{bmatrix}
\quad \text{and} \quad
C = \begin{bmatrix} 10 & 5 \\ 5 & 10 \end{bmatrix}.
\tag{5.6}
$$

Now equality of supply and demand can be expressed as 𝑞 𝑠 = 𝑞 𝑑 , or

𝐶𝑝 = 𝐷𝑝 + ℎ.

We can rearrange the terms to get

(𝐶 − 𝐷)𝑝 = ℎ.

If all of the terms were numbers, we could solve for prices as 𝑝 = ℎ/(𝐶 − 𝐷).
Matrix algebra allows us to do something similar: we can solve for equilibrium prices using the inverse of 𝐶 − 𝐷:

𝑝 = (𝐶 − 𝐷)−1 ℎ. (5.7)

Before we implement the solution let us consider a more general setting.

5.4.5 More goods

It is natural to think about demand systems with more goods.


For example, even within energy commodities there are many different goods, including crude oil, gasoline, coal, natural
gas, ethanol, and uranium.
The prices of these goods are related, so it makes sense to study them together.
Pencil and paper methods become very time consuming with large systems.
But fortunately the matrix methods described above are essentially unchanged.
In general, we can write the demand equation as 𝑞 𝑑 = 𝐷𝑝 + ℎ, where
• 𝑞 𝑑 is an 𝑛 × 1 vector of demand quantities for 𝑛 different goods.
• 𝐷 is an 𝑛 × 𝑛 “coefficient” matrix.
• ℎ is an 𝑛 × 1 vector of constant values.
Similarly, we can write the supply equation as 𝑞 𝑠 = 𝐶𝑝 + 𝑒, where
• 𝑞 𝑠 is an 𝑛 × 1 vector of supply quantities for the same goods.
• 𝐶 is an 𝑛 × 𝑛 “coefficient” matrix.


• 𝑒 is an 𝑛 × 1 vector of constant values.


To find an equilibrium, we solve 𝐷𝑝 + ℎ = 𝐶𝑝 + 𝑒, or

(𝐷 − 𝐶)𝑝 = 𝑒 − ℎ. (5.8)

Then the price vector of the n different goods is

𝑝 = (𝐷 − 𝐶)−1 (𝑒 − ℎ).

5.4.6 General linear systems

A more general version of the problem described above looks as follows.

$$
\begin{aligned}
a_{11} x_1 + a_{12} x_2 + \cdots + a_{1n} x_n &= b_1 \\
\vdots \qquad\qquad &\quad\ \vdots \\
a_{n1} x_1 + a_{n2} x_2 + \cdots + a_{nn} x_n &= b_n
\end{aligned}
\tag{5.9}
$$

The objective here is to solve for the “unknowns” 𝑥1 , … , 𝑥𝑛 .


We take as given the coefficients 𝑎11 , … , 𝑎𝑛𝑛 and constants 𝑏1 , … , 𝑏𝑛 .
Notice that we are treating a setting where the number of unknowns equals the number of equations.
This is the case where we are most likely to find a well-defined solution.
(The other cases are referred to as overdetermined and underdetermined systems of equations — we defer discussion of
these cases until later lectures.)
In matrix form, the system (5.9) becomes

$$
A x = b
\quad \text{where} \quad
A =
\begin{bmatrix} a_{11} & \cdots & a_{1n} \\ \vdots & \vdots & \vdots \\ a_{n1} & \cdots & a_{nn} \end{bmatrix}
\quad \text{and} \quad
b = \begin{bmatrix} b_1 \\ \vdots \\ b_n \end{bmatrix}.
\tag{5.10}
$$
For example, (5.8) has this form with

𝐴 = 𝐷 − 𝐶, 𝑏 =𝑒−ℎ and 𝑥 = 𝑝.

When considering problems such as (5.10), we need to ask at least some of the following questions
• Does a solution actually exist?
• If a solution exists, how should we compute it?

5.5 Solving systems of equations

Recall again the system of equations (5.9), which we write here again as

𝐴𝑥 = 𝑏. (5.11)

The problem we face is to find a vector 𝑥 ∈ ℝ𝑛 that solves (5.11), taking 𝑏 and 𝐴 as given.
We may not always find a unique vector 𝑥 that solves (5.11).
We illustrate two such cases below.


5.5.1 No solution

Consider the system of equations given by,


𝑥 + 3𝑦 = 3
2𝑥 + 6𝑦 = −8.
It can be verified manually that this system has no possible solution.
To illustrate why this situation arises let’s plot the two lines.

fig, ax = plt.subplots()
x = np.linspace(-10, 10)
plt.plot(x, (3-x)/3, label=f'$x + 3y = 3$')
plt.plot(x, (-8-2*x)/6, label=f'$2x + 6y = -8$')
plt.legend()
plt.show()

Clearly, these are parallel lines and hence we will never find a point 𝑥 ∈ ℝ2 such that these lines intersect.
Thus, this system has no possible solution.
We can rewrite this system in matrix form as
$$
A x = b
\quad \text{where} \quad
A = \begin{bmatrix} 1 & 3 \\ 2 & 6 \end{bmatrix}
\quad \text{and} \quad
b = \begin{bmatrix} 3 \\ -8 \end{bmatrix}.
\tag{5.12}
$$
Note that the second row of $A$, $(2, 6)$, is just a scalar multiple of the first row, $(1, 3)$.
The rows of matrix 𝐴 in this case are called linearly dependent.

Note: Advanced readers can find a detailed explanation of linear dependence and independence here.
But these details are not needed in what follows.


5.5.2 Many solutions

Now consider,
𝑥 − 2𝑦 = −4
−2𝑥 + 4𝑦 = 8.

Any vector 𝑣 = (𝑥, 𝑦) such that 𝑥 = 2𝑦 − 4 will solve the above system.
Since we can find infinite such vectors this system has infinitely many solutions.
This is because the rows of the corresponding matrix
$$
A = \begin{bmatrix} 1 & -2 \\ -2 & 4 \end{bmatrix}
\tag{5.13}
$$
are linearly dependent — can you see why?
We now impose conditions on 𝐴 in (5.11) that rule out these problems.

5.5.3 Nonsingular matrices

To every square matrix we can assign a unique number called the determinant.
For 2 × 2 matrices, the determinant is given by,
$$
\begin{vmatrix} a & b \\ c & d \end{vmatrix} = a d - b c.
$$
If the determinant of 𝐴 is not zero, then we say that 𝐴 is nonsingular.
A square matrix 𝐴 is nonsingular if and only if the rows and columns of 𝐴 are linearly independent.
A more detailed explanation of matrix inverse can be found here.
You can check yourself that the matrices in (5.12) and (5.13) with linearly dependent rows are singular.
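
For example, here is a quick numerical check of that claim.

A1 = np.array([[1, 3],
               [2, 6]])    # matrix from (5.12)
A2 = np.array([[1, -2],
               [-2, 4]])   # matrix from (5.13)
print(np.linalg.det(A1), np.linalg.det(A2))   # both determinants are (numerically) zero
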
This gives us a useful one-number summary of whether or not a square matrix can be inverted.
In particular, a square matrix 𝐴 has a nonzero determinant, if and only if it possesses an inverse matrix 𝐴−1 , with the
property that 𝐴𝐴−1 = 𝐴−1 𝐴 = 𝐼.
As a consequence, if we pre-multiply both sides of 𝐴𝑥 = 𝑏 by 𝐴−1 , we get

𝑥 = 𝐴−1 𝑏. (5.14)

This is the solution to 𝐴𝑥 = 𝑏 — the solution we are looking for.

5.5.4 Linear equations with NumPy

In the two good example we obtained the matrix equation,

𝑝 = (𝐶 − 𝐷)−1 ℎ.

where 𝐶, 𝐷 and ℎ are given by (5.5) and (5.6).


This equation is analogous to (5.14) with 𝐴 = 𝐶 − 𝐷, 𝑏 = ℎ, and 𝑥 = 𝑝.
We can now solve for equilibrium prices with NumPy’s linalg submodule.
All of these routines are Python front ends to time-tested and highly optimized FORTRAN code.


C = ((10, 5), # Matrix C


(5, 10))

Now we change this to a NumPy array.

C = np.array(C)

D = ((-10, -5), # Matrix D


(-1, -10))
D = np.array(D)

h = np.array((100, 50)) # Vector h


h.shape = 2,1 # Transforming h to a column vector

from numpy.linalg import det, inv


A = C - D
# Check that A is nonsingular (non-zero determinant), and hence invertible
det(A)

340.0000000000001

A_inv = inv(A) # compute the inverse


A_inv

array([[ 0.05882353, -0.02941176],


[-0.01764706, 0.05882353]])

p = A_inv @ h # equilibrium prices


p

array([[4.41176471],
[1.17647059]])

q = C @ p # equilibrium quantities
q

array([[50. ],
[33.82352941]])

Notice that we get the same solutions as the pencil and paper case.
We can also solve for 𝑝 using solve(A, h) as follows.

from numpy.linalg import solve


p = solve(A, h) # equilibrium prices
p

array([[4.41176471],
[1.17647059]])


q = C @ p # equilibrium quantities
q

array([[50. ],
[33.82352941]])

Observe how we can solve for 𝑥 = 𝐴−1 𝑦 either via inv(A) @ y or by using solve(A, y).
The latter method uses a different algorithm that is numerically more stable and hence should be the default option.

5.6 Exercises

Exercise 5.6.1
Let’s consider a market with 3 commodities - good 0, good 1 and good 2.
The demand for each good depends on the price of the other two goods and is given by:

$$
\begin{aligned}
q_0^d &= 90 - 15 p_0 + 5 p_1 + 5 p_2 \\
q_1^d &= 60 + 5 p_0 - 10 p_1 + 10 p_2 \\
q_2^d &= 50 + 5 p_0 + 5 p_1 - 5 p_2
\end{aligned}
$$

(Here demand decreases when own price increases but increases when prices of other goods increase.)
The supply of each good is given by:

$$
\begin{aligned}
q_0^s &= -10 + 20 p_0 \\
q_1^s &= -15 + 15 p_1 \\
q_2^s &= -5 + 10 p_2
\end{aligned}
$$

Equilibrium holds when supply equals demand, i.e, 𝑞0𝑑 = 𝑞0𝑠 , 𝑞1𝑑 = 𝑞1𝑠 and 𝑞2𝑑 = 𝑞2𝑠 .
1. Set up the market as a system of linear equations.
2. Use matrix algebra to solve for equilibrium prices. Do this using both the numpy.linalg.solve and inv(A)
methods. Compare the solutions.

Solution to Exercise 5.6.1


The generated system would be:

$$
\begin{aligned}
35 p_0 - 5 p_1 - 5 p_2 &= 100 \\
-5 p_0 + 25 p_1 - 10 p_2 &= 75 \\
-5 p_0 - 5 p_1 + 15 p_2 &= 55
\end{aligned}
$$

In matrix form we will write this as:

$$
A p = b
\quad \text{where} \quad
A = \begin{bmatrix} 35 & -5 & -5 \\ -5 & 25 & -10 \\ -5 & -5 & 15 \end{bmatrix}, \quad
p = \begin{bmatrix} p_0 \\ p_1 \\ p_2 \end{bmatrix}
\quad \text{and} \quad
b = \begin{bmatrix} 100 \\ 75 \\ 55 \end{bmatrix}
$$


import numpy as np
from numpy.linalg import det

A = np.array([[35, -5, -5], # matrix A


[-5, 25, -10],
[-5, -5, 15]])

b = np.array((100, 75, 55)) # column vector b


b.shape = (3, 1)

det(A) # check if A is nonsingular

9999.99999999999

# Using inverse
from numpy.linalg import inv

A_inv = inv(A)

p = A_inv @ b
p

array([[4.9625],
[7.0625],
[7.675 ]])

# Using numpy.linalg.solve
from numpy.linalg import solve
p = solve(A, b)
p

array([[4.9625],
[7.0625],
[7.675 ]])

The solution is given by $p_0 = 4.9625$, $p_1 = 7.0625$ and $p_2 = 7.675$.

Exercise 5.6.2
Earlier in the lecture we discussed cases where the system of equations given by 𝐴𝑥 = 𝑏 has no solution.
In this case 𝐴𝑥 = 𝑏 is called an inconsistent system of equations.
When faced with an inconsistent system we try to find the best “approximate” solution.
There are various methods to do this, one such method is the method of least squares.
Suppose we have an inconsistent system

𝐴𝑥 = 𝑏 (5.15)

where 𝐴 is an 𝑚 × 𝑛 matrix and 𝑏 is an 𝑚 × 1 column vector.


A least squares solution to (5.15) is an 𝑛 × 1 column vector 𝑥̂ such that, for all other vectors 𝑥 ∈ ℝ𝑛 , the distance from
𝐴𝑥̂ to 𝑏 is less than the distance from 𝐴𝑥 to 𝑏.


That is,

‖𝐴𝑥̂ − 𝑏‖ ≤ ‖𝐴𝑥 − 𝑏‖

It can be shown that, for the system of equations 𝐴𝑥 = 𝑏, the least squares solution 𝑥̂ is

𝑥̂ = (𝐴𝑇 𝐴)−1 𝐴𝑇 𝑏 (5.16)

Now consider the general equation of a linear demand curve of a good given by:

𝑝 = 𝑚 − 𝑛𝑞

where 𝑝 is the price of the good and 𝑞 is the quantity demanded.


Suppose we are trying to estimate the values of 𝑚 and 𝑛.
We do this by repeatedly observing the price and quantity (for example, each month) and then choosing 𝑚 and 𝑛 to fit
the relationship between 𝑝 and 𝑞.
We have the following observations:

Price    Quantity demanded
1        9
3        7
8        3

Requiring the demand curve 𝑝 = 𝑚 − 𝑛𝑞 to pass through all these points leads to the following three equations:

1 = 𝑚 − 9𝑛
3 = 𝑚 − 7𝑛
8 = 𝑚 − 3𝑛

Thus we obtain a system of equations $A x = b$ where

$$
A = \begin{bmatrix} 1 & -9 \\ 1 & -7 \\ 1 & -3 \end{bmatrix}, \quad
x = \begin{bmatrix} m \\ n \end{bmatrix}
\quad \text{and} \quad
b = \begin{bmatrix} 1 \\ 3 \\ 8 \end{bmatrix}.
$$
It can be verified that this system has no solutions.
(The problem is that we have three equations and only two unknowns.)
We will thus try to find the best approximate solution for 𝑥.
1. Use (5.16) and matrix algebra to find the least squares solution 𝑥.̂
2. Find the least squares solution using numpy.linalg.lstsq and compare the results.

Solution to Exercise 5.6.2

import numpy as np
from numpy.linalg import inv

# Using matrix algebra


A = np.array([[1, -9], # matrix A
[1, -7],
[1, -3]])
A_T = np.transpose(A) # transpose of matrix A

b = np.array((1, 3, 8)) # column vector b


b.shape = (3, 1)

x = inv(A_T @ A) @ A_T @ b
x

array([[11.46428571],
[ 1.17857143]])

# Using numpy.linalg.lstsq
x, res, _, _ = np.linalg.lstsq(A, b, rcond=None)

# Print the least squares solution and the squared residual
print(f"x̂ = {x}")
print(f"‖Ax̂ - b‖² = {res[0]}")

x̂ = [[11.46428571]
 [ 1.17857143]]
‖Ax̂ - b‖² = 0.07142857142857081

Here is a visualization of how the least squares method approximates the equation of a line connecting a set of points.
We can also describe this as “fitting” a line between a set of points.

fig, ax = plt.subplots()
p = np.array((1, 3, 8))
q = np.array((9, 7, 3))

a, b = x

ax.plot(q, p, 'o', label='observations', markersize=5)


ax.plot(q, a - b*q, 'r', label='Fitted line')
plt.xlabel('quantity demanded')
plt.ylabel('price')
plt.legend()
plt.show()


5.6.1 Further reading

The documentation of the numpy.linalg submodule can be found here.


More advanced topics in linear algebra can be found here.



CHAPTER

SIX

EIGENVALUES AND EIGENVECTORS

6.1 Overview

Eigenvalues and eigenvectors are a relatively advanced topic in linear algebra.


At the same time, these concepts are extremely useful for
• economic modeling (especially dynamics!)
• statistics
• some parts of applied mathematics
• machine learning
• and many other fields of science.
In this lecture we explain the basics of eigenvalues and eigenvectors and introduce the Neumann Series Lemma.
We assume in this lecture that students are familiar with matrices and understand the basics of matrix algebra.
We will use the following imports:

import matplotlib.pyplot as plt


import numpy as np
from numpy.linalg import matrix_power
from matplotlib.lines import Line2D
from matplotlib.patches import FancyArrowPatch
from mpl_toolkits.mplot3d import proj3d

6.2 Matrices as transformations

Let’s start by discussing an important concept concerning matrices.


6.2.1 Mapping vectors to vectors

One way to think about a matrix is as a rectangular collection of numbers.


Another way to think about a matrix is as a map (i.e., as a function) that transforms vectors to new vectors.
To understand the second point of view, suppose we multiply an 𝑛 × 𝑚 matrix 𝐴 with an 𝑚 × 1 column vector 𝑥 to
obtain an 𝑛 × 1 column vector 𝑦:

𝐴𝑥 = 𝑦

If we fix 𝐴 and consider different choices of 𝑥, we can understand 𝐴 as a map transforming 𝑥 to 𝐴𝑥.
Because 𝐴 is 𝑛 × 𝑚, it transforms 𝑚-vectors to 𝑛-vectors.
We can write this formally as 𝐴 ∶ ℝ𝑚 → ℝ𝑛 .
You might argue that if 𝐴 is a function then we should write 𝐴(𝑥) = 𝑦 rather than 𝐴𝑥 = 𝑦 but the second notation is
more conventional.

6.2.2 Square matrices

Let’s restrict our discussion to square matrices.


In the above discussion, this means that 𝑚 = 𝑛 and 𝐴 maps ℝ𝑛 to itself.
This means 𝐴 is an 𝑛 × 𝑛 matrix that maps (or “transforms”) a vector 𝑥 in ℝ𝑛 to a new vector 𝑦 = 𝐴𝑥 also in ℝ𝑛 .
Here’s one example:

$$
\begin{bmatrix} 2 & 1 \\ -1 & 1 \end{bmatrix}
\begin{bmatrix} 1 \\ 3 \end{bmatrix} =
\begin{bmatrix} 5 \\ 2 \end{bmatrix}
$$

Here, the matrix

$$
A = \begin{bmatrix} 2 & 1 \\ -1 & 1 \end{bmatrix}
$$

transforms the vector $x = \begin{bmatrix} 1 \\ 3 \end{bmatrix}$ to the vector $y = \begin{bmatrix} 5 \\ 2 \end{bmatrix}$.
Let’s visualize this using Python:

A = np.array([[2, 1],
[-1, 1]])

from math import sqrt

fig, ax = plt.subplots()
# Set the axes through the origin

for spine in ['left', 'bottom']:


ax.spines[spine].set_position('zero')
for spine in ['right', 'top']:
ax.spines[spine].set_color('none')

ax.set(xlim=(-2, 6), ylim=(-2, 4), aspect=1)

vecs = ((1, 3), (5, 2))
c = ['r', 'black']
for i, v in enumerate(vecs):
ax.annotate('', xy=v, xytext=(0, 0),
arrowprops=dict(color=c[i],
shrink=0,
alpha=0.7,
width=0.5))

ax.text(0.2 + 1, 0.2 + 3, 'x=$(1,3)$')


ax.text(0.2 + 5, 0.2 + 2, 'Ax=$(5,2)$')

ax.annotate('', xy=(sqrt(10/29) * 5, sqrt(10/29) * 2), xytext=(0, 0),


arrowprops=dict(color='purple',
shrink=0,
alpha=0.7,
width=0.5))

ax.annotate('', xy=(1, 2/5), xytext=(1/3, 1),


arrowprops={'arrowstyle': '->',
'connectionstyle': 'arc3,rad=-0.3'},
horizontalalignment='center')
ax.text(0.8, 0.8, f'θ', fontsize=14)

plt.show()

One way to understand this transformation is that 𝐴


• first rotates 𝑥 by some angle 𝜃 and
• then scales it by some scalar 𝛾 to obtain the image 𝑦 of 𝑥.


6.3 Types of transformations

Let’s examine some standard transformations we can perform with matrices.


Below we visualize transformations by thinking of vectors as points instead of arrows.
We consider how a given matrix transforms
• a grid of points and
• a set of points located on the unit circle in ℝ2 .
To build the transformations we will use two functions, called grid_transform and circle_transform.
Each of these functions visualizes the actions of a given 2 × 2 matrix 𝐴.
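
The definitions of these two helpers are not shown in the text. As a minimal sketch of what grid_transform might look like (the real version also colors the points and handles circle_transform similarly; treat this as an illustration only):

def grid_transform(A):
    "Plot a grid of points and their images under the 2 x 2 matrix A."
    xvals = np.linspace(-4, 4, 9)
    yvals = np.linspace(-3, 3, 7)
    # A 2 x N array whose columns are the grid points
    points = np.column_stack([[x, y] for x in xvals for y in yvals])
    images = A @ points

    fig, axes = plt.subplots(1, 2, figsize=(10, 5))
    for ax, pts, title in zip(axes, (points, images), ('points $x$', 'points $Ax$')):
        ax.scatter(pts[0], pts[1], s=10)
        ax.set_title(title)
        ax.set_xlim(-6, 6)
        ax.set_ylim(-6, 6)
        ax.set_aspect('equal')
    plt.show()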

6.3.1 Scaling

A matrix of the form


$$
\begin{bmatrix} \alpha & 0 \\ 0 & \beta \end{bmatrix}
$$

scales vectors across the x-axis by a factor 𝛼 and along the y-axis by a factor 𝛽.
Here we illustrate a simple example where 𝛼 = 𝛽 = 3.

A = np.array([[3, 0], # scaling by 3 in both directions


[0, 3]])
grid_transform(A)
circle_transform(A)


6.3.2 Shearing

A “shear” matrix of the form


$$
\begin{bmatrix} 1 & \lambda \\ 0 & 1 \end{bmatrix}
$$

stretches vectors along the x-axis by an amount proportional to the y-coordinate of a point.

A = np.array([[1, 2], # shear along x-axis


[0, 1]])
grid_transform(A)
circle_transform(A)


6.3.3 Rotation

A matrix of the form


$$
\begin{bmatrix} \cos \theta & \sin \theta \\ -\sin \theta & \cos \theta \end{bmatrix}
$$

is called a rotation matrix.


This matrix rotates vectors clockwise by an angle 𝜃.

θ = np.pi/4 # 45 degree clockwise rotation


A = np.array([[np.cos(θ), np.sin(θ)],
[-np.sin(θ), np.cos(θ)]])
grid_transform(A)


6.3.4 Permutation

The permutation matrix


$$
\begin{bmatrix} 0 & 1 \\ 1 & 0 \end{bmatrix}
$$
interchanges the coordinates of a vector.

A = np.column_stack([[0, 1], [1, 0]])


grid_transform(A)

More examples of common transformation matrices can be found here.

6.4 Matrix multiplication as composition

Since matrices act as functions that transform one vector to another, we can apply the concept of function composition to
matrices as well.

6.4.1 Linear compositions

Consider the two matrices


$$
A = \begin{bmatrix} 0 & 1 \\ -1 & 0 \end{bmatrix}
\quad \text{and} \quad
B = \begin{bmatrix} 1 & 2 \\ 0 & 1 \end{bmatrix}
$$
What will the output be when we try to obtain 𝐴𝐵𝑥 for some 2 × 1 vector 𝑥?
$$
\underbrace{\begin{bmatrix} 0 & 1 \\ -1 & 0 \end{bmatrix}}_{A}
\underbrace{\begin{bmatrix} 1 & 2 \\ 0 & 1 \end{bmatrix}}_{B}
\overbrace{\begin{bmatrix} 1 \\ 3 \end{bmatrix}}^{x}
\to
\underbrace{\begin{bmatrix} 0 & 1 \\ -1 & -2 \end{bmatrix}}_{AB}
\overbrace{\begin{bmatrix} 1 \\ 3 \end{bmatrix}}^{x}
\to
\overbrace{\begin{bmatrix} 3 \\ -7 \end{bmatrix}}^{y}
$$

$$
\underbrace{\begin{bmatrix} 0 & 1 \\ -1 & 0 \end{bmatrix}}_{A}
\underbrace{\begin{bmatrix} 1 & 2 \\ 0 & 1 \end{bmatrix}}_{B}
\overbrace{\begin{bmatrix} 1 \\ 3 \end{bmatrix}}^{x}
\to
\underbrace{\begin{bmatrix} 0 & 1 \\ -1 & 0 \end{bmatrix}}_{A}
\overbrace{\begin{bmatrix} 7 \\ 3 \end{bmatrix}}^{Bx}
\to
\overbrace{\begin{bmatrix} 3 \\ -7 \end{bmatrix}}^{y}
$$


We can observe that applying the transformation 𝐴𝐵 on the vector 𝑥 is the same as first applying 𝐵 on 𝑥 and then applying
𝐴 on the vector 𝐵𝑥.
Thus the matrix product 𝐴𝐵 is the composition of the matrix transformations 𝐴 and 𝐵.
This means first apply transformation 𝐵 and then transformation 𝐴.
When we matrix multiply an 𝑛 × 𝑚 matrix 𝐴 with an 𝑚 × 𝑘 matrix 𝐵 the obtained matrix product is an 𝑛 × 𝑘 matrix
𝐴𝐵.
Thus, if 𝐴 and 𝐵 are transformations such that 𝐴 ∶ ℝ𝑚 → ℝ𝑛 and 𝐵 ∶ ℝ𝑘 → ℝ𝑚 , then 𝐴𝐵 transforms ℝ𝑘 to ℝ𝑛 .
Viewing matrix multiplication as composition of maps helps us understand why, under matrix multiplication, 𝐴𝐵 is
generally not equal to 𝐵𝐴.
(After all, when we compose functions, the order usually matters.)

6.4.2 Examples

Let $A$ be the $90^\circ$ clockwise rotation matrix given by
$\begin{bmatrix} 0 & 1 \\ -1 & 0 \end{bmatrix}$
and let $B$ be a shear matrix along the x-axis given by
$\begin{bmatrix} 1 & 2 \\ 0 & 1 \end{bmatrix}$.
We will visualize how a grid of points changes when we apply the transformation 𝐴𝐵 and then compare it with the
transformation 𝐵𝐴.

A = np.array([[0, 1], # 90 degree clockwise rotation


[-1, 0]])
B = np.array([[1, 2], # shear along x-axis
[0, 1]])

Shear then rotate

grid_composition_transform(A, B) # transformation AB


Rotate then shear

grid_composition_transform(B,A) # transformation BA

It is evident that the transformation 𝐴𝐵 is not the same as the transformation 𝐵𝐴.

6.5 Iterating on a fixed map

In economics (and especially in dynamic modeling), we are often interested in analyzing behavior where we repeatedly
apply a fixed matrix.
For example, given a vector 𝑣 and a matrix 𝐴, we are interested in studying the sequence

𝑣, 𝐴𝑣, 𝐴𝐴𝑣 = 𝐴2 𝑣, …

Let’s first see examples of a sequence of iterates (𝐴𝑘 𝑣)𝑘≥0 under different maps 𝐴.

def plot_series(A, v, n):

B = np.array([[1, -1],
[1, 0]])

fig, ax = plt.subplots()

ax.set(xlim=(-4, 4), ylim=(-4, 4))


ax.set_xticks([])
ax.set_yticks([])
for spine in ['left', 'bottom']:
ax.spines[spine].set_position('zero')
for spine in ['right', 'top']:
ax.spines[spine].set_color('none')

θ = np.linspace(0, 2 * np.pi, 150)


r = 2.5
x = r * np.cos(θ)
y = r * np.sin(θ)
x1 = x.reshape(1, -1)
y1 = y.reshape(1, -1)
xy = np.concatenate((x1, y1), axis=0)

ellipse = B @ xy
ax.plot(ellipse[0, :], ellipse[1, :], color='black',
linestyle=(0, (5, 10)), linewidth=0.5)

# Initialize holder for trajectories


colors = plt.cm.rainbow(np.linspace(0, 1, 20))

for i in range(n):
iteration = matrix_power(A, i) @ v
v1 = iteration[0]
v2 = iteration[1]
ax.scatter(v1, v2, color=colors[i])
if i == 0:
ax.text(v1+0.25, v2, f'$v$')
elif i == 1:
ax.text(v1+0.25, v2, f'$Av$')
elif 1 < i < 4:
ax.text(v1+0.25, v2, f'$A^{i}v$')
plt.show()

A = np.array([[sqrt(3) + 1, -2],
[1, sqrt(3) - 1]])
A = (1/(2*sqrt(2))) * A
v = (-3, -3)
n = 12

plot_series(A, v, n)

Here with each iteration the vectors get shorter, i.e., move closer to the origin.
In this case, repeatedly multiplying a vector by 𝐴 makes the vector “spiral in”.


B = np.array([[sqrt(3) + 1, -2],
[1, sqrt(3) - 1]])
B = (1/2) * B
v = (2.5, 0)
n = 12

plot_series(B, v, n)

Here with each iteration vectors do not tend to get longer or shorter.
In this case, repeatedly multiplying a vector by 𝐴 simply “rotates it around an ellipse”.

B = np.array([[sqrt(3) + 1, -2],
[1, sqrt(3) - 1]])
B = (1/sqrt(2)) * B
v = (-1, -0.25)
n = 6

plot_series(B, v, n)


Here with each iteration vectors tend to get longer, i.e., farther from the origin.
In this case, repeatedly multiplying a vector by 𝐴 makes the vector “spiral out”.
We thus observe that the sequence (𝐴𝑘 𝑣)𝑘≥0 behaves differently depending on the map 𝐴 itself.
We now discuss the property of A that determines this behavior.

6.6 Eigenvalues

In this section we introduce the notions of eigenvalues and eigenvectors.

6.6.1 Definitions

Let 𝐴 be an 𝑛 × 𝑛 square matrix.


If $\lambda$ is a scalar and $v$ is a non-zero $n$-vector such that

$$
A v = \lambda v,
$$

then we say that $\lambda$ is an eigenvalue of $A$, and $v$ is the corresponding eigenvector.


Thus, an eigenvector of 𝐴 is a nonzero vector 𝑣 such that when the map 𝐴 is applied, 𝑣 is merely scaled.
The next figure shows two eigenvectors (blue arrows) and their images under 𝐴 (red arrows).
As expected, the image 𝐴𝑣 of each 𝑣 is just a scaled version of the original

from numpy.linalg import eig

A = [[1, 2],
[2, 1]]
A = np.array(A)
evals, evecs = eig(A)
evecs = evecs[:, 0], evecs[:, 1]

fig, ax = plt.subplots(figsize=(10, 8))


# Set the axes through the origin
for spine in ['left', 'bottom']:
ax.spines[spine].set_position('zero')
for spine in ['right', 'top']:
ax.spines[spine].set_color('none')
# ax.grid(alpha=0.4)

xmin, xmax = -3, 3


ymin, ymax = -3, 3
ax.set(xlim=(xmin, xmax), ylim=(ymin, ymax))

# Plot each eigenvector


for v in evecs:
ax.annotate('', xy=v, xytext=(0, 0),
arrowprops=dict(facecolor='blue',
shrink=0,
alpha=0.6,
width=0.5))

# Plot the image of each eigenvector


for v in evecs:
v = A @ v
ax.annotate('', xy=v, xytext=(0, 0),
arrowprops=dict(facecolor='red',
shrink=0,
alpha=0.6,
width=0.5))

# Plot the lines they run through


x = np.linspace(xmin, xmax, 3)
for v in evecs:
a = v[1] / v[0]
ax.plot(x, a * x, 'b-', lw=0.4)

plt.show()


6.6.2 Complex values

So far our definition of eigenvalues and eigenvectors seems straightforward.


There is one complication we haven’t mentioned yet:
When solving 𝐴𝑣 = 𝜆𝑣,
• 𝜆 is allowed to be a complex number and
• 𝑣 is allowed to be an 𝑛-vector of complex numbers.
We will see some examples below.

6.6.3 Some mathematical details

We note some mathematical details for more advanced readers.


(Other readers can skip to the next section.)
The eigenvalue equation is equivalent to (𝐴 − 𝜆𝐼)𝑣 = 0.
This equation has a nonzero solution 𝑣 only when the columns of 𝐴 − 𝜆𝐼 are linearly dependent.
This in turn is equivalent to stating the determinant is zero.
Hence, to find all eigenvalues, we can look for 𝜆 such that the determinant of 𝐴 − 𝜆𝐼 is zero.
This problem can be expressed as one of solving for the roots of a polynomial in 𝜆 of degree 𝑛.
This in turn implies the existence of 𝑛 solutions in the complex plane, although some might be repeated.


6.6.4 Facts

Some nice facts about the eigenvalues of a square matrix 𝐴 are as follows:
1. the determinant of 𝐴 equals the product of the eigenvalues
2. the trace of 𝐴 (the sum of the elements on the principal diagonal) equals the sum of the eigenvalues
3. if 𝐴 is symmetric, then all of its eigenvalues are real
4. if 𝐴 is invertible and 𝜆1 , … , 𝜆𝑛 are its eigenvalues, then the eigenvalues of 𝐴−1 are 1/𝜆1 , … , 1/𝜆𝑛 .
A corollary of the last statement is that a matrix is invertible if and only if all its eigenvalues are nonzero.
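
Here is a quick numerical check of facts 1 and 2 using a sample matrix (the other facts can be checked similarly).

A = np.array([[1, 2],
              [2, 1]])
evals = np.linalg.eigvals(A)
print(np.isclose(np.prod(evals), np.linalg.det(A)))   # determinant = product of eigenvalues
print(np.isclose(np.sum(evals), np.trace(A)))         # trace = sum of eigenvalues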

6.6.5 Computation

Using NumPy, we can solve for the eigenvalues and eigenvectors of a matrix as follows

from numpy.linalg import eig

A = ((1, 2),
(2, 1))

A = np.array(A)
evals, evecs = eig(A)
evals # eigenvalues

array([ 3., -1.])

evecs # eigenvectors

array([[ 0.70710678, -0.70710678],


[ 0.70710678, 0.70710678]])

Note that the columns of evecs are the eigenvectors.


Since any scalar multiple of an eigenvector is an eigenvector with the same eigenvalue (which can be verified), the eig
routine normalizes the length of each eigenvector to one.
The eigenvectors and eigenvalues of a map 𝐴 determine how a vector 𝑣 is transformed when we repeatedly multiply by
𝐴.
This is discussed further later.

6.7 The Neumann Series Lemma

In this section we present a famous result about series of matrices that has many applications in economics.


6.7.1 Scalar series

Here’s a fundamental result about series:


If 𝑎 is a number and |𝑎| < 1, then

$$
\sum_{k=0}^{\infty} a^k = \frac{1}{1 - a} = (1 - a)^{-1}
\tag{6.1}
$$

For a one-dimensional linear equation $x = a x + b$ where $x$ is unknown, we can thus conclude that the solution $x^*$ is given by:

$$
x^* = \frac{b}{1 - a} = \sum_{k=0}^{\infty} a^k b
$$

6.7.2 Matrix series

A generalization of this idea exists in the matrix setting.


Consider the system of equations 𝑥 = 𝐴𝑥 + 𝑏 where 𝐴 is an 𝑛 × 𝑛 square matrix and 𝑥 and 𝑏 are both column vectors
in ℝ𝑛 .
Using matrix algebra we can conclude that the solution to this system of equations will be given by:

𝑥∗ = (𝐼 − 𝐴)−1 𝑏 (6.2)

What guarantees the existence of a unique vector 𝑥∗ that satisfies (6.2)?


The following is a fundamental result in functional analysis that generalizes (6.1) to a multivariate case.

Theorem 6.7.1 (Neumann Series Lemma)


Let 𝐴 be a square matrix and let 𝐴𝑘 be the 𝑘-th power of 𝐴.
Let 𝑟(𝐴) be the spectral radius of 𝐴, defined as max𝑖 |𝜆𝑖 |, where
• {𝜆𝑖 }𝑖 is the set of eigenvalues of 𝐴 and
• |𝜆𝑖 | is the modulus of the complex number 𝜆𝑖
Neumann’s Theorem states the following: If 𝑟(𝐴) < 1, then 𝐼 − 𝐴 is invertible, and

$$(I - A)^{-1} = \sum_{k=0}^{\infty} A^k$$

We can see the Neumann Series Lemma in action in the following example.

A = np.array([[0.4, 0.1],
[0.7, 0.2]])

evals, evecs = eig(A) # finding eigenvalues and eigenvectors

r = max(abs(λ) for λ in evals) # compute spectral radius


print(r)


0.582842712474619

The spectral radius 𝑟(𝐴) obtained is less than 1.


Thus, we can apply the Neumann Series Lemma to find (𝐼 − 𝐴)−1 .

I = np.identity(2) # 2 x 2 identity matrix


B = I - A

B_inverse = np.linalg.inv(B) # direct inverse method

A_sum = np.zeros((2, 2)) # power series sum of A


A_power = I
for i in range(50):
A_sum += A_power
A_power = A_power @ A

Let’s check equality between the sum and the inverse methods.

np.allclose(A_sum, B_inverse)

True

Although we truncate the infinite sum at 𝑘 = 50, both methods give us the same result which illustrates the result of the
Neumann Series Lemma.

6.8 Exercises

Exercise 6.8.1
Power iteration is a method for finding the greatest absolute eigenvalue of a diagonalizable matrix.
The method starts with a random vector 𝑏0 and repeatedly applies the matrix 𝐴 to it

$$b_{k+1} = \frac{A b_k}{\lVert A b_k \rVert}$$

A thorough discussion of the method can be found here.


In this exercise, first implement the power iteration method and use it to find the greatest absolute eigenvalue and its
corresponding eigenvector.
Then visualize the convergence.

Solution to Exercise 6.8.1


Here is one solution.
We start by looking into the distance between the eigenvector approximation and the true eigenvector.


# Define a matrix A
A = np.array([[1, 0, 3],
[0, 2, 0],
[3, 0, 1]])

num_iters = 20

# Define a random starting vector b


b = np.random.rand(A.shape[1])

# Get the leading eigenvector of matrix A


eigenvector = np.linalg.eig(A)[1][:, 0]

errors = []
res = []

# Power iteration loop


for i in range(num_iters):
# Multiply b by A
b = A @ b
# Normalize b
b = b / np.linalg.norm(b)
# Append b to the list of eigenvector approximations
res.append(b)
err = np.linalg.norm(np.array(b)
- eigenvector)
errors.append(err)

greatest_eigenvalue = np.dot(A @ b, b) / np.dot(b, b)


print(f'The approximated greatest absolute eigenvalue is \
{greatest_eigenvalue:.2f}')
print('The real eigenvalue is', np.linalg.eig(A)[0])

# Plot the eigenvector approximations for each iteration


plt.figure(figsize=(10, 6))
plt.xlabel('iterations')
plt.ylabel('error')
_ = plt.plot(errors)

The approximated greatest absolute eigenvalue is 4.00


The real eigenvalue is [ 4. -2. 2.]

Then we can look at the trajectory of the eigenvector approximation.

# Set up the figure and axis for 3D plot


fig = plt.figure()
ax = fig.add_subplot(111, projection='3d')

# Plot the eigenvectors


ax.scatter(eigenvector[0],
eigenvector[1],
eigenvector[2],
color='r', s=80)

for i, vec in enumerate(res):


ax.scatter(vec[0], vec[1], vec[2],
color='b',
alpha=(i+1)/(num_iters+1),
s=80)

ax.set_xlabel('x')
ax.set_ylabel('y')
ax.set_zlabel('z')
ax.tick_params(axis='both', which='major', labelsize=7)

points = [plt.Line2D([0], [0], linestyle='none',


c=i, marker='o') for i in ['r', 'b']]
ax.legend(points, ['actual eigenvector',
r'approximated eigenvector ($b_k$)'])
ax.set_box_aspect(aspect=None, zoom=0.8)

plt.show()

Exercise 6.8.2
We have discussed the trajectory of the vector 𝑣 after being transformed by 𝐴.
Consider the matrix $A = \begin{bmatrix} 1 & 2 \\ 1 & 1 \end{bmatrix}$ and the vector $v = \begin{bmatrix} 2 \\ -2 \end{bmatrix}$.
Try to compute the trajectory of 𝑣 after being transformed by 𝐴 for 𝑛 = 4 iterations and plot the result.

Solution to Exercise 6.8.2


A = np.array([[1, 2],
[1, 1]])
v = (0.4, -0.4)
n = 11

# Compute eigenvectors and eigenvalues


eigenvalues, eigenvectors = np.linalg.eig(A)

print(f'eigenvalues:\n {eigenvalues}')
print(f'eigenvectors:\n {eigenvectors}')

plot_series(A, v, n)

eigenvalues:
[ 2.41421356 -0.41421356]
eigenvectors:
[[ 0.81649658 -0.81649658]
[ 0.57735027 0.57735027]]

The result seems to converge to the eigenvector of 𝐴 with the largest eigenvalue.
Let’s use a vector field to visualize the transformation brought by A.
(This is a more advanced topic in linear algebra, please step ahead if you are comfortable with the math.)

# Create a grid of points


x, y = np.meshgrid(np.linspace(-5, 5, 15),
np.linspace(-5, 5, 20))

# Apply the matrix A to each point in the vector field


vec_field = np.stack([x, y])
u, v = np.tensordot(A, vec_field, axes=1)

# Plot the transformed vector field


c = plt.streamplot(x, y, u - x, v - y,
density=1, linewidth=None, color='#A23BEC')
c.lines.set_alpha(0.5)
c.arrows.set_alpha(0.5)

# Draw eigenvectors
origin = np.zeros((2, len(eigenvectors)))
parameters = {'color': ['b', 'g'], 'angles': 'xy',
'scale_units': 'xy', 'scale': 0.1, 'width': 0.01}
plt.quiver(*origin, eigenvectors[0],
eigenvectors[1], **parameters)
plt.quiver(*origin, - eigenvectors[0],
- eigenvectors[1], **parameters)

colors = ['b', 'g']


lines = [Line2D([0], [0], color=c, linewidth=3) for c in colors]
labels = ["2.4 eigenspace", "0.4 eigenspace"]
plt.legend(lines, labels, loc='center left',
bbox_to_anchor=(1, 0.5))

plt.xlabel("x")
plt.ylabel("y")
plt.grid()
plt.gca().set_aspect('equal', adjustable='box')
plt.show()

Fig. 6.3: Convergence towards eigenvectors


Note that the vector field converges to the eigenvector of 𝐴 with the largest eigenvalue and diverges from the eigenvector
of 𝐴 with the smallest eigenvalue.
In fact, the eigenvectors are also the directions in which the matrix 𝐴 stretches or shrinks the space.
Specifically, the eigenvector with the largest eigenvalue is the direction in which the matrix 𝐴 stretches the space the most.
We will see more intriguing examples in the following exercise.

Exercise 6.8.3
Previously, we demonstrated the trajectory of the vector 𝑣 after being transformed by 𝐴 for three different matrices.
Use the visualization in the previous exercise to explain the trajectory of the vector 𝑣 after being transformed by 𝐴 for
the three different matrices.

Solution to Exercise 6.8.3


Here is one solution

figure, ax = plt.subplots(1, 3, figsize=(15, 5))


A = np.array([[sqrt(3) + 1, -2],
[1, sqrt(3) - 1]])
A = (1/(2*sqrt(2))) * A

B = np.array([[sqrt(3) + 1, -2],
[1, sqrt(3) - 1]])
B = (1/2) * B

C = np.array([[sqrt(3) + 1, -2],
[1, sqrt(3) - 1]])
C = (1/sqrt(2)) * C

examples = [A, B, C]

for i, example in enumerate(examples):


M = example

# Compute right eigenvectors and eigenvalues


eigenvalues, eigenvectors = np.linalg.eig(M)
print(f'Example {i+1}:\n')
print(f'eigenvalues:\n {eigenvalues}')
print(f'eigenvectors:\n {eigenvectors}\n')

eigenvalues_real = eigenvalues.real
eigenvectors_real = eigenvectors.real

# Create a grid of points


x, y = np.meshgrid(np.linspace(-20, 20, 15),
np.linspace(-20, 20, 20))

# Apply the matrix A to each point in the vector field


vec_field = np.stack([x, y])
u, v = np.tensordot(M, vec_field, axes=1)
# Plot the transformed vector field


c = ax[i].streamplot(x, y, u - x, v - y, density=1,
linewidth=None, color='#A23BEC')
c.lines.set_alpha(0.5)
c.arrows.set_alpha(0.5)

# Draw eigenvectors
parameters = {'color': ['b', 'g'], 'angles': 'xy',
'scale_units': 'xy', 'scale': 1,
'width': 0.01, 'alpha': 0.5}
origin = np.zeros((2, len(eigenvectors)))
ax[i].quiver(*origin, eigenvectors_real[0],
eigenvectors_real[1], **parameters)
ax[i].quiver(*origin,
- eigenvectors_real[0],
- eigenvectors_real[1],
**parameters)

ax[i].set_xlabel("x-axis")
ax[i].set_ylabel("y-axis")
ax[i].grid()
ax[i].set_aspect('equal', adjustable='box')

plt.show()

Example 1:

eigenvalues:
[0.61237244+0.35355339j 0.61237244-0.35355339j]
eigenvectors:
[[0.81649658+0.j 0.81649658-0.j ]
[0.40824829-0.40824829j 0.40824829+0.40824829j]]

Example 2:

eigenvalues:
[0.8660254+0.5j 0.8660254-0.5j]
eigenvectors:
[[0.81649658+0.j 0.81649658-0.j ]
[0.40824829-0.40824829j 0.40824829+0.40824829j]]

Example 3:

eigenvalues:
[1.22474487+0.70710678j 1.22474487-0.70710678j]
eigenvectors:
[[0.81649658+0.j 0.81649658-0.j ]
[0.40824829-0.40824829j 0.40824829+0.40824829j]]

The vector fields explain why we observed the trajectories of the vector 𝑣 multiplied by 𝐴 iteratively before.
The pattern demonstrated here is because we have complex eigenvalues and eigenvectors.
We can plot the complex plane for one of the matrices using Arrow3D class retrieved from stackoverflow.

Fig. 6.4: Vector fields of the three matrices

class Arrow3D(FancyArrowPatch):
def __init__(self, xs, ys, zs, *args, **kwargs):
super().__init__((0, 0), (0, 0), *args, **kwargs)
self._verts3d = xs, ys, zs

def do_3d_projection(self):
xs3d, ys3d, zs3d = self._verts3d
xs, ys, zs = proj3d.proj_transform(xs3d, ys3d, zs3d,
self.axes.M)
self.set_positions((0.1*xs[0], 0.1*ys[0]),
(0.1*xs[1], 0.1*ys[1]))

return np.min(zs)

eigenvalues, eigenvectors = np.linalg.eig(A)

# Create meshgrid for vector field


x, y = np.meshgrid(np.linspace(-2, 2, 15),
np.linspace(-2, 2, 15))

# Calculate vector field (real and imaginary parts)


u_real = A[0][0] * x + A[0][1] * y
v_real = A[1][0] * x + A[1][1] * y
u_imag = np.zeros_like(x)
v_imag = np.zeros_like(y)

# Create 3D figure
fig = plt.figure()
ax = fig.add_subplot(111, projection='3d')
vlength = np.linalg.norm(eigenvectors)
ax.quiver(x, y, u_imag, u_real-x, v_real-y, v_imag-u_imag,
colors='b', alpha=0.3, length=.2,
arrow_length_ratio=0.01)

arrow_prop_dict = dict(mutation_scale=5,
arrowstyle='-|>', shrinkA=0, shrinkB=0)

# Plot 3D eigenvectors
for c, i in zip(['b', 'g'], [0, 1]):
a = Arrow3D([0, eigenvectors[0][i].real],
[0, eigenvectors[1][i].real],
[0, eigenvectors[1][i].imag],
color=c, **arrow_prop_dict)
ax.add_artist(a)

# Set axis labels and title


ax.set_xlabel('x')
ax.set_ylabel('y')
ax.set_zlabel('Im')
ax.set_box_aspect(aspect=None, zoom=0.8)

plt.draw()
plt.show()

Fig. 6.5: 3D plot of the vector field

CHAPTER SEVEN: INTRODUCTION TO SUPPLY AND DEMAND

7.1 Overview

This lecture is about some models of equilibrium prices and quantities, one of the main topics of elementary microeconomics.
Throughout the lecture, we focus on models with one good and one price.
In a subsequent lecture we will investigate settings with many goods.
Key infrastructure concepts that we’ll encounter in this lecture are
• inverse demand curves
• inverse supply curves
• consumer surplus
• producer surplus
• social welfare as the sum of consumer and producer surpluses
• relationship between equilibrium quantity and social welfare optimum
Throughout the lectures, we’ll assume that inverse demand and supply curves are affine functions of quantity.
(“Affine” means “linear plus a constant” and here is a nice discussion about it.)
We’ll also assume affine inverse supply and demand functions when we study models with multiple consumption goods in
our subsequent lecture.
We do this in order to simplify the exposition and enable us to use just a few tools from linear algebra, namely, matrix
multiplication and matrix inversion.
In our exposition we will use the following imports.

import numpy as np
import matplotlib.pyplot as plt


7.2 Supply and demand

We study a market for a single good in which buyers and sellers exchange a quantity 𝑞 for a price 𝑝.
Quantity 𝑞 and price 𝑝 are both scalars.
We assume that inverse demand and supply curves for the good are:

𝑝 = 𝑑0 − 𝑑1 𝑞, 𝑑0 , 𝑑1 > 0

𝑝 = 𝑠0 + 𝑠1 𝑞, 𝑠0 , 𝑠1 > 0
We call them inverse demand and supply curves because price is on the left side of the equation rather than on the right
side as it would be in a direct demand or supply function.
Here is a class that stores parameters for our single good market, as well as implementing the inverse demand and supply
curves.

class Market:

def __init__(self,
d_0=1.0, # demand intercept
d_1=0.6, # demand slope
s_0=0.1, # supply intercept
s_1=0.4): # supply slope

self.d_0, self.d_1 = d_0, d_1


self.s_0, self.s_1 = s_0, s_1

def inverse_demand(self, q):


return self.d_0 - self.d_1 * q

def inverse_supply(self, q):


return self.s_0 + self.s_1 * q

Let’s create an instance.

market = Market()

Here is a plot of these two functions using market.
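(The following is a minimal plotting sketch; the quantity grid and its bounds are an arbitrary choice.)

q_grid = np.linspace(0, 1.5, 200)  # quantity grid; bounds are an arbitrary choice

fig, ax = plt.subplots()
ax.plot(q_grid, market.inverse_demand(q_grid), label='inverse demand')
ax.plot(q_grid, market.inverse_supply(q_grid), label='inverse supply')
ax.set_xlabel('quantity')
ax.set_ylabel('price')
ax.legend()
plt.show()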


In the above graph, an equilibrium price-quantity pair occurs at the intersection of the supply and demand curves.

7.2.1 Consumer surplus

Let a quantity 𝑞 be given and let 𝑝 ∶= 𝑑0 − 𝑑1 𝑞 be the corresponding price on the inverse demand curve.
We define consumer surplus 𝑆𝑐 (𝑞) as the area under an inverse demand curve minus 𝑝𝑞:
$$S_c(q) := \int_0^q (d_0 - d_1 x) \, dx - p q \qquad (7.1)$$

The next figure illustrates


Consumer surplus provides a measure of total consumer welfare at quantity 𝑞.


The idea is that the inverse demand curve 𝑑0 − 𝑑1 𝑞 shows a consumer’s willingness to pay for an additional increment of
the good at a given quantity 𝑞.
The difference between willingness to pay and the actual price is consumer surplus.
The value 𝑆𝑐 (𝑞) is the “sum” (i.e., integral) of these surpluses when the total quantity purchased is 𝑞 and the purchase
price is 𝑝.
Evaluating the integral in the definition of consumer surplus (7.1) gives
$$S_c(q) = d_0 q - \frac{1}{2} d_1 q^2 - p q$$

7.2.2 Producer surplus

Let a quantity 𝑞 be given and let 𝑝 ∶= 𝑠0 + 𝑠1 𝑞 be the corresponding price on the inverse supply curve.
We define producer surplus as 𝑝𝑞 minus the area under an inverse supply curve
$$S_p(q) := p q - \int_0^q (s_0 + s_1 x) \, dx \qquad (7.2)$$

The next figure illustrates


Producer surplus measures total producer welfare at quantity 𝑞


The idea is similar to that of consumer surplus.
The inverse supply curve 𝑠0 + 𝑠1 𝑞 shows the price at which producers are prepared to sell, given quantity 𝑞.
The difference between willingness to sell and the actual price is producer surplus.
The value 𝑆𝑝 (𝑞) is the integral of these surpluses.
Evaluating the integral in the definition of producer surplus (7.2) gives
$$S_p(q) = p q - s_0 q - \frac{1}{2} s_1 q^2$$

7.2.3 Social welfare

Sometimes economists measure social welfare by a welfare criterion that equals consumer surplus plus producer surplus,
assuming that consumers and producers pay the same price:
$$W(q) = \int_0^q (d_0 - d_1 x) \, dx - \int_0^q (s_0 + s_1 x) \, dx$$

Evaluating the integrals gives


$$W(q) = (d_0 - s_0) q - \frac{1}{2} (d_1 + s_1) q^2$$
Here is a Python function that evaluates this social welfare at a given quantity 𝑞 and a fixed set of parameters.

def W(q, market):


# Unpack
d_0, d_1, s_0, s_1 = market.d_0, market.d_1, market.s_0, market.s_1
# Compute and return welfare
return (d_0 - s_0) * q - 0.5 * (d_1 + s_1) * q**2

The next figure plots welfare as a function of 𝑞.
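(A minimal plotting sketch; the quantity grid q_vals and its bounds are an arbitrary choice.)

q_vals = np.linspace(0, 1.3, 200)  # quantity grid; bounds are an arbitrary choice

fig, ax = plt.subplots()
ax.plot(q_vals, W(q_vals, market), label='welfare')
ax.legend(frameon=False)
ax.set_xlabel('quantity')
plt.show()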

Let’s now give a social planner the task of maximizing social welfare.
To compute a quantity that maximizes the welfare criterion, we differentiate 𝑊 with respect to 𝑞 and then set the derivative
to zero.
$$\frac{dW(q)}{dq} = d_0 - s_0 - (d_1 + s_1) q = 0$$
Solving for 𝑞 yields

$$q = \frac{d_0 - s_0}{s_1 + d_1} \qquad (7.3)$$

Let’s remember the quantity 𝑞 given by equation (7.3) that a social planner would choose to maximize consumer surplus
plus producer surplus.
We’ll compare it to the quantity that emerges in a competitive equilibrium that equates supply to demand.


7.2.4 Competitive equilibrium

Instead of equating quantities supplied and demanded, we can accomplish the same thing by equating demand price to
supply price:

$$p = d_0 - d_1 q = s_0 + s_1 q$$

If we solve the equation defined by the second equality in the above line for 𝑞, we obtain

$$q = \frac{d_0 - s_0}{s_1 + d_1} \qquad (7.4)$$

This is the competitive equilibrium quantity.


Observe that the equilibrium quantity equals the same 𝑞 given by equation (7.3).
The outcome that the quantity determined by equation (7.3) equates supply to demand brings us a key finding:
• a competitive equilibrium quantity maximizes our welfare criterion
This is a version of the first fundamental welfare theorem.
It also brings a useful competitive equilibrium computation strategy:
• after solving the welfare problem for an optimal quantity, we can read a competitive equilibrium price from either
supply price or demand price at the competitive equilibrium quantity
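Here is a minimal sketch of that strategy using the market instance created above (the variable names are illustrative):

# Competitive equilibrium quantity from (7.4)
q_eq = (market.d_0 - market.s_0) / (market.d_1 + market.s_1)

# The equilibrium price can be read from either curve; the two values should agree
p_from_demand = market.inverse_demand(q_eq)
p_from_supply = market.inverse_supply(q_eq)

print(q_eq, p_from_demand, p_from_supply)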

7.3 Generalizations

In a later lecture, we’ll derive generalizations of the above demand and supply curves from other objects.
Our generalizations will extend the preceding analysis of a market for a single good to the analysis of 𝑛 simultaneous
markets in 𝑛 goods.
In addition
• we’ll derive demand curves from a consumer problem that maximizes a utility function subject to a budget
constraint.
• we’ll derive supply curves from the problem of a producer who is price taker and maximizes his profits minus total
costs that are described by a cost function.

7.4 Exercises

Suppose now that the inverse demand and supply curves are modified to take the form

$$p = i_d(q) := d_0 - d_1 q^{0.6}$$

$$p = i_s(q) := s_0 + s_1 q^{1.8}$$
All parameters are positive, as before.

Exercise 7.4.1
Define a new Market class that holds the same parameter values as before by changing the inverse_demand and
inverse_supply methods to match these new definitions.


Using the class, plot the inverse demand and supply curves 𝑖𝑑 and 𝑖𝑠

Solution to Exercise 7.4.1

class Market:

def __init__(self,
d_0=1.0, # demand intercept
d_1=0.6, # demand slope
s_0=0.1, # supply intercept
s_1=0.4): # supply slope

self.d_0, self.d_1 = d_0, d_1


self.s_0, self.s_1 = s_0, s_1

def inverse_demand(self, q):


return self.d_0 - self.d_1 * q**0.6

def inverse_supply(self, q):


return self.s_0 + self.s_1 * q**1.8

Let’s create an instance.

market = Market()

Here is a plot of inverse supply and demand.
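(One way to produce such a plot is sketched below; the quantity grid q_vals, whose bounds are an arbitrary choice, is reused in the welfare plot below.)

q_vals = np.linspace(0, 1.5, 200)  # quantity grid; bounds are an arbitrary choice

fig, ax = plt.subplots()
ax.plot(q_vals, market.inverse_demand(q_vals), label='inverse demand')
ax.plot(q_vals, market.inverse_supply(q_vals), label='inverse supply')
ax.set_xlabel('quantity')
ax.set_ylabel('price')
ax.legend()
plt.show()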

Exercise 7.4.2


As before, consumer surplus at 𝑞 is the area under the demand curve minus price times quantity:
$$S_c(q) = \int_0^q i_d(x) \, dx - p q$$

Here 𝑝 is set to 𝑖𝑑 (𝑞)


Producer surplus is price times quantity minus the area under the inverse supply curve:
$$S_p(q) = p q - \int_0^q i_s(x) \, dx$$

Here 𝑝 is set to 𝑖𝑠 (𝑞).


Social welfare is the sum of consumer and producer surplus under the assumption that the price is the same for buyers
and sellers:
$$W(q) = \int_0^q i_d(x) \, dx - \int_0^q i_s(x) \, dx$$

Solve the integrals and write a function to compute this quantity numerically at given 𝑞.
Plot welfare as a function of 𝑞.

Solution to Exercise 7.4.2


Solving the integrals gives

$$W(q) = d_0 q - \frac{d_1 q^{1.6}}{1.6} - \left( s_0 q + \frac{s_1 q^{2.8}}{2.8} \right)$$

Here’s a Python function that computes this value:

def W(q, market):


# Unpack
d_0, d_1 = market.d_0, market.d_1
s_0, s_1 = market.s_0, market.s_1
# Compute and return welfare
S_c = d_0 * q - d_1 * q**1.6 / 1.6
S_p = s_0 * q + s_1 * q**2.8 / 2.8
return S_c - S_p

The next figure plots welfare as a function of 𝑞.

fig, ax = plt.subplots()
ax.plot(q_vals, W(q_vals, market), label='welfare')
ax.legend(frameon=False)
ax.set_xlabel('quantity')
plt.show()


Exercise 7.4.3
Due to nonlinearities, the new welfare function is not easy to maximize with pencil and paper.
Maximize it using scipy.optimize.minimize_scalar instead.

Solution to Exercise 7.4.3

from scipy.optimize import minimize_scalar

def objective(q):
return -W(q, market)

result = minimize_scalar(objective, bounds=(0, 10))


print(result.message)

Solution found.

maximizing_q = result.x
print(f"{maximizing_q: .5f}")

0.90564

Exercise 7.4.4


Now compute the equilibrium quantity by finding the price that equates supply and demand.
You can do this numerically by finding the root of the excess demand function

𝑒𝑑 (𝑞) ∶= 𝑖𝑑 (𝑞) − 𝑖𝑠 (𝑞)

You can use scipy.optimize.newton to compute the root.


Initialize newton with a starting guess somewhere close to 1.0.
(Similar initial conditions will give the same result.)
You should find that the equilibrium quantity agrees with the welfare-maximizing quantity, in line with the first fundamental welfare theorem.

Solution to Exercise 7.4.4

from scipy.optimize import newton

def excess_demand(q):
return market.inverse_demand(q) - market.inverse_supply(q)

equilibrium_q = newton(excess_demand, 0.99)


print(f"{equilibrium_q: .5f}")

0.90564


Part IV

Linear Dynamics

CHAPTER EIGHT: PRESENT VALUES

8.1 Overview

This lecture describes the present value model that is a starting point of much asset pricing theory.
We’ll use the calculations described here in several subsequent lectures.
Our only tool is some elementary linear algebra operations, namely, matrix multiplication and matrix inversion.
Let’s dive in.
Let
• {𝑑𝑡 }𝑇𝑡=0 be a sequence of dividends or “payouts”
• {𝑝𝑡 }𝑇𝑡=0 be a sequence of prices of a claim on the continuation of the asset stream from date 𝑡 on, namely, {𝑑𝑠 }𝑇𝑠=𝑡
• 𝛿 ∈ (0, 1) be a one-period “discount factor”
• 𝑝𝑇∗ +1 be a terminal price of the asset at time 𝑇 + 1
We assume that the dividend stream {𝑑𝑡 }𝑇𝑡=0 and the terminal price 𝑝𝑇∗ +1 are both exogenous.
Assume the sequence of asset pricing equations

𝑝𝑡 = 𝑑𝑡 + 𝛿𝑝𝑡+1 , 𝑡 = 0, 1, … , 𝑇 (8.1)

This is a “cost equals benefits” formula.


It says that the cost of buying the asset today equals the reward for holding it for one period (which is the dividend 𝑑𝑡 )
and then selling it, at 𝑡 + 1.
The future value 𝑝𝑡+1 is discounted using 𝛿 to shift it to a present value, so it is comparable with 𝑑𝑡 and 𝑝𝑡 .
We want to solve for the asset price sequence {𝑝𝑡 }𝑇𝑡=0 as a function of {𝑑𝑡 }𝑇𝑡=0 and 𝑝𝑇∗ +1 .
In this lecture we show how this can be done using matrix algebra.
We will use the following imports

import numpy as np
import matplotlib.pyplot as plt


8.2 Present value calculations

The equations in (8.1) can be stacked, as in


$$\begin{aligned}
p_0 &= d_0 + \delta p_1 \\
p_1 &= d_1 + \delta p_2 \\
&\ \ \vdots \\
p_{T-1} &= d_{T-1} + \delta p_T \\
p_T &= d_T + \delta p^*_{T+1}
\end{aligned} \qquad (8.2)$$
Write the system (8.2) of 𝑇 + 1 asset pricing equations as the single matrix equation
$$\begin{bmatrix}
1 & -\delta & 0 & \cdots & 0 & 0 \\
0 & 1 & -\delta & \cdots & 0 & 0 \\
0 & 0 & 1 & \cdots & 0 & 0 \\
\vdots & \vdots & \vdots & \ddots & \vdots & \vdots \\
0 & 0 & 0 & \cdots & 1 & -\delta \\
0 & 0 & 0 & \cdots & 0 & 1
\end{bmatrix}
\begin{bmatrix} p_0 \\ p_1 \\ p_2 \\ \vdots \\ p_{T-1} \\ p_T \end{bmatrix}
=
\begin{bmatrix} d_0 \\ d_1 \\ d_2 \\ \vdots \\ d_{T-1} \\ d_T \end{bmatrix}
+
\begin{bmatrix} 0 \\ 0 \\ 0 \\ \vdots \\ 0 \\ \delta p^*_{T+1} \end{bmatrix} \qquad (8.3)$$

Exercise 8.2.1
Carry out the matrix multiplication in (8.3) by hand and confirm that you recover the equations in (8.2).

In vector-matrix notation, we can write the system (8.3) as

𝐴𝑝 = 𝑑 + 𝑏 (8.4)

Here 𝐴 is the matrix on the left side of equation (8.3), while


$$p = \begin{bmatrix} p_0 \\ p_1 \\ \vdots \\ p_T \end{bmatrix}, \quad
d = \begin{bmatrix} d_0 \\ d_1 \\ \vdots \\ d_T \end{bmatrix}, \quad \text{and} \quad
b = \begin{bmatrix} 0 \\ 0 \\ \vdots \\ \delta p^*_{T+1} \end{bmatrix}$$
The solution for prices is given by

𝑝 = 𝐴−1 (𝑑 + 𝑏) (8.5)

Here is a small example, where the dividend stream is given by

𝑑𝑡+1 = 1.05𝑑𝑡 , 𝑡 = 0, 1, … , 𝑇 − 1.

T = 6
current_d = 1.0
d = []
for t in range(T+1):
d.append(current_d)
current_d = current_d * 1.05

fig, ax = plt.subplots()
ax.plot(d, 'o', label='dividends')
ax.legend()
ax.set_xlabel('time')
plt.show()


We set 𝛿 and 𝑝𝑇∗ +1 to

δ = 0.99
p_star = 10.0

Let’s build the matrix 𝐴

A = np.zeros((T+1, T+1))
for i in range(T+1):
for j in range(T+1):
if i == j:
A[i, j] = 1
if j < T:
A[i, j+1] = -δ

Let’s inspect 𝐴

array([[ 1. , -0.99, 0. , 0. , 0. , 0. , 0. ],
[ 0. , 1. , -0.99, 0. , 0. , 0. , 0. ],
[ 0. , 0. , 1. , -0.99, 0. , 0. , 0. ],
[ 0. , 0. , 0. , 1. , -0.99, 0. , 0. ],
[ 0. , 0. , 0. , 0. , 1. , -0.99, 0. ],
[ 0. , 0. , 0. , 0. , 0. , 1. , -0.99],
[ 0. , 0. , 0. , 0. , 0. , 0. , 1. ]])

Now let's solve for prices using (8.5).


b = np.zeros(T+1)
b[-1] = δ * p_star
p = np.linalg.solve(A, d + b)
fig, ax = plt.subplots()
ax.plot(p, 'o', label='asset price')
ax.legend()
ax.set_xlabel('time')
plt.show()

We can also consider a cyclically growing dividend sequence, such as

𝑑𝑡+1 = 1.01𝑑𝑡 + 0.1 sin 𝑡, 𝑡 = 0, 1, … , 𝑇 − 1.

T = 100
current_d = 1.0
d = []
for t in range(T+1):
d.append(current_d)
current_d = current_d * 1.01 + 0.1 * np.sin(t)

fig, ax = plt.subplots()
ax.plot(d, 'o-', ms=4, alpha=0.8, label='dividends')
ax.legend()
ax.set_xlabel('time')
plt.show()


Exercise 8.2.2
Compute the corresponding asset price sequence when 𝑝𝑇∗ +1 = 0 and 𝛿 = 0.98.

Solution to Exercise 8.2.2


We proceed as above after modifying parameters and 𝐴.

δ = 0.98
p_star = 0.0
A = np.zeros((T+1, T+1))
for i in range(T+1):
for j in range(T+1):
if i == j:
A[i, j] = 1
if j < T:
A[i, j+1] = -δ

b = np.zeros(T+1)
b[-1] = δ * p_star
p = np.linalg.solve(A, d + b)
fig, ax = plt.subplots()
ax.plot(p, 'o-', ms=4, alpha=0.8, label='asset price')
ax.legend()
ax.set_xlabel('time')
plt.show()


The weighted averaging associated with the present value calculation largely eliminates the cycles.

8.3 Analytical expressions

It can be verified that the inverse of the matrix 𝐴 in (8.3) is

$$A^{-1} = \begin{bmatrix}
1 & \delta & \delta^2 & \cdots & \delta^{T-1} & \delta^T \\
0 & 1 & \delta & \cdots & \delta^{T-2} & \delta^{T-1} \\
\vdots & \vdots & \vdots & \cdots & \vdots & \vdots \\
0 & 0 & 0 & \cdots & 1 & \delta \\
0 & 0 & 0 & \cdots & 0 & 1
\end{bmatrix} \qquad (8.6)$$

Exercise 8.3.1
Check this by showing that 𝐴𝐴−1 is equal to the identity matrix.
(By the inverse matrix theorem, a matrix 𝐵 is the inverse of 𝐴 whenever 𝐴𝐵 is the identity.)
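For a small numerical version of this check, here is a sketch (the values of T and δ are arbitrary):

T, δ = 6, 0.99

A = np.eye(T+1) - δ * np.eye(T+1, k=1)                     # the matrix A in (8.3)
A_inv = sum(δ**k * np.eye(T+1, k=k) for k in range(T+1))   # the candidate inverse in (8.6)

print(np.allclose(A @ A_inv, np.eye(T+1)))                 # True if (8.6) is correct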

If we use the expression (8.6) in (8.5) and perform the indicated matrix multiplication, we shall find that

$$p_t = \sum_{s=t}^{T} \delta^{s-t} d_s + \delta^{T+1-t} p^*_{T+1} \qquad (8.7)$$

Pricing formula (8.7) asserts that two components sum to the asset price $p_t$:
• a fundamental component $\sum_{s=t}^{T} \delta^{s-t} d_s$ that equals the discounted present value of prospective dividends


• a bubble component $\delta^{T+1-t} p^*_{T+1}$


The fundamental component is pinned down by the discount factor 𝛿 and the “fundamentals” of the asset (in this case,
the dividends).
The bubble component is the part of the price that is not pinned down by fundamentals.
It is sometimes convenient to rewrite the bubble component as

$$c \delta^{-t}$$

where

$$c \equiv \delta^{T+1} p^*_{T+1}$$

8.4 More about bubbles

For a few moments, let’s focus on the special case of an asset that will never pay dividends, in which case

$$\begin{bmatrix} d_0 \\ d_1 \\ d_2 \\ \vdots \\ d_{T-1} \\ d_T \end{bmatrix}
= \begin{bmatrix} 0 \\ 0 \\ 0 \\ \vdots \\ 0 \\ 0 \end{bmatrix}$$

In this case system (8.1) of our 𝑇 + 1 asset pricing equations takes the form of the single matrix equation

$$\begin{bmatrix}
1 & -\delta & 0 & \cdots & 0 & 0 \\
0 & 1 & -\delta & \cdots & 0 & 0 \\
0 & 0 & 1 & \cdots & 0 & 0 \\
\vdots & \vdots & \vdots & \ddots & \vdots & \vdots \\
0 & 0 & 0 & \cdots & 1 & -\delta \\
0 & 0 & 0 & \cdots & 0 & 1
\end{bmatrix}
\begin{bmatrix} p_0 \\ p_1 \\ p_2 \\ \vdots \\ p_{T-1} \\ p_T \end{bmatrix}
=
\begin{bmatrix} 0 \\ 0 \\ 0 \\ \vdots \\ 0 \\ \delta p^*_{T+1} \end{bmatrix} \qquad (8.8)$$

Evidently, if $p^*_{T+1} = 0$, a price vector $p$ with all entries zero solves this equation, and only the fundamental component of our pricing formula (8.7) is present.
But let’s activate the bubble component by setting

$$p^*_{T+1} = c \delta^{-(T+1)} \qquad (8.9)$$

for some positive constant 𝑐.


In this case, it can be verified that when we multiply both sides of (8.8) by the matrix 𝐴−1 presented in equation (8.6),
we shall find that

$$p_t = c \delta^{-t} \qquad (8.10)$$
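A quick numerical check of (8.10) is sketched below (the values of T, δ and c are arbitrary):

T, δ, c = 6, 0.99, 2.0

A = np.eye(T+1) - δ * np.eye(T+1, k=1)
d = np.zeros(T+1)                        # the asset never pays dividends
b = np.zeros(T+1)
b[-1] = δ * c * δ**(-(T+1))              # δ p*_{T+1} with p*_{T+1} = c δ^{-(T+1)}

p = np.linalg.solve(A, d + b)
print(np.allclose(p, c * δ**(-np.arange(T+1))))   # p_t = c δ^{-t}
print(p[1:] / p[:-1])                             # each gross return equals 1/δ

The printed ratios anticipate the gross rate of return discussed in the next section.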


8.5 Gross rate of return

Define the gross rate of return on holding the asset from period 𝑡 to period 𝑡 + 1 as
$$R_t = \frac{p_{t+1}}{p_t} \qquad (8.11)$$

Equation (8.10) confirms that an asset whose sole source of value is a bubble earns a gross rate of return

$$R_t = \delta^{-1} > 1.$$

8.6 Exercises

Exercise 8.6.1
Give analytical expressions for the asset price $p_t$ under the following settings for $d$ and $p^*_{T+1}$:
1. $p^*_{T+1} = 0, \ d_t = g^t d_0$ (a modified version of the Gordon growth formula)
2. $p^*_{T+1} = g^{T+1} d_0, \ d_t = g^t d_0$ (the plain vanilla Gordon growth formula)
3. $p^*_{T+1} = 0, \ d_t = 0$ (price of a worthless stock)
4. $p^*_{T+1} = c \delta^{-(T+1)}, \ d_t = 0$ (price of a pure bubble stock)



CHAPTER NINE: CONSUMPTION SMOOTHING

9.1 Overview

Technically, this lecture is a sequel to this quantecon lecture present values, although it might not seem so at first.
It will take a while for a "present value" or asset price to appear explicitly in this lecture, but when it does it will be a key actor.
In this lecture, we'll study a famous model of the "consumption function" that Milton Friedman [Fri56] and Robert Hall [Hal78] proposed to fit some empirical data patterns that the simple Keynesian model described in this quantecon lecture geometric series had missed.
The key insight of Friedman and Hall was that today’s consumption ought not to depend just on today’s non-financial
income: it should also depend on a person’s anticipations of her future non-financial incomes at various dates.
In this lecture, we’ll study what is sometimes called the “consumption-smoothing model” using only linear algebra, in
particular matrix multiplication and matrix inversion.
Formulas presented in present value formulas are at the core of the consumption smoothing model because they are used
to define a consumer’s “human wealth”.
As usual, we’ll start with by importing some Python modules.

import numpy as np
import matplotlib.pyplot as plt
from collections import namedtuple

Our model describes the behavior of a consumer who lives from time 𝑡 = 0, 1, … , 𝑇 , receives a stream {𝑦𝑡 }𝑇𝑡=0 of
non-financial income and chooses a consumption stream {𝑐𝑡 }𝑇𝑡=0 .
We usually think of the non-financial income stream as coming from the person’s salary from supplying labor.
The model takes that non-financial income stream as an input, regarding it as “exogenous” in the sense of not being
determined by the model.
The consumer faces a gross interest rate of 𝑅 > 1 that is constant over time, at which she is free to borrow or lend, up to
some limits that we’ll describe below.
To set up the model, let
• 𝑇 ≥ 2 be a positive integer that constitutes a time-horizon
• 𝑦 = {𝑦𝑡 }𝑇𝑡=0 be an exogenous sequence of non-negative non-financial incomes 𝑦𝑡
• $a = \{a_t\}_{t=0}^{T+1}$ be a sequence of financial wealth
• 𝑐 = {𝑐𝑡 }𝑇𝑡=0 be a sequence of non-negative consumption rates
• 𝑅 ≥ 1 be a fixed gross one period rate of return on financial assets


• 𝛽 ∈ (0, 1) be a fixed discount factor


• 𝑎0 be a given initial level of financial assets
• 𝑎𝑇 +1 ≥ 0 be a terminal condition on final assets
While the sequence of financial wealth 𝑎 is to be determined by the model, it must satisfy two boundary conditions that
require it to be equal to 𝑎0 at time 0 and 𝑎𝑇 +1 at time 𝑇 + 1.
The terminal condition 𝑎𝑇 +1 ≥ 0 requires that the consumer not die leaving debts.
(We'll see that a utility maximizing consumer won't want to die leaving positive assets, so she'll arrange her affairs to make $a_{T+1} = 0$.)
The consumer faces a sequence of budget constraints that constrains the triple of sequences 𝑦, 𝑐, 𝑎

𝑎𝑡+1 = 𝑅(𝑎𝑡 + 𝑦𝑡 − 𝑐𝑡 ), 𝑡 = 0, 1, … 𝑇 (9.1)

Notice that there are 𝑇 + 1 such budget constraints, one for each 𝑡 = 0, 1, … , 𝑇 .
Given a sequence 𝑦 of non-financial income, there is a big set of pairs (𝑎, 𝑐) of (financial wealth, consumption) sequences
that satisfy the sequence of budget constraints (9.1).
Our model has the following logical flow.
• start with an exogenous non-financial income sequence 𝑦, an initial financial wealth 𝑎0 , and a candidate consumption
path 𝑐.
• use the system of equations (9.1) for 𝑡 = 0, … , 𝑇 to compute a path 𝑎 of financial wealth
• verify that 𝑎𝑇 +1 satisfies the terminal wealth constraint 𝑎𝑇 +1 ≥ 0.
– If it does, declare that the candidate path is budget feasible.
– if the candidate consumption path is not budget feasible, propose a path with less consumption sometimes
and start over
Below, we’ll describe how to execute these steps using linear algebra – matrix inversion and multiplication.
The above procedure seems like a sensible way to find “budget-feasible” consumption paths 𝑐, i.e., paths that are consistent
with the exogenous non-financial income stream 𝑦, the initial financial asset level 𝑎0 , and the terminal asset level 𝑎𝑇 +1 .
In general, there will be many budget feasible consumption paths 𝑐.
Among all budget-feasible consumption paths, which one should the consumer want to choose?
We shall eventually evaluate alternative budget feasible consumption paths 𝑐 using the following welfare criterion
$$W = \sum_{t=0}^{T} \beta^t \left( g_1 c_t - \frac{g_2}{2} c_t^2 \right) \qquad (9.2)$$

where 𝑔1 > 0, 𝑔2 > 0.


The fact that the utility function $g_1 c_t - \frac{g_2}{2} c_t^2$ has diminishing marginal utility imparts a preference for consumption that is very smooth when $\beta R \approx 1$.
Indeed, we shall see that when 𝛽𝑅 = 1 (a condition assumed by Milton Friedman [Fri56] and Robert Hall [Hal78]), this
criterion assigns higher welfare to smoother consumption paths.
By smoother we mean as close as possible to being constant over time.
The preference for smooth consumption paths that is built into the model gives it the name “consumption smoothing
model”.
Let’s dive in and do some calculations that will help us understand how the model works.


Here we use default parameters 𝑅 = 1.05, 𝑔1 = 1, 𝑔2 = 1/2, and 𝑇 = 65.


We create a namedtuple to store these parameters with default values.

ConsumptionSmoothing = namedtuple("ConsumptionSmoothing",
["R", "g1", "g2", "β_seq", "T"])

def creat_cs_model(R=1.05, g1=1, g2=1/2, T=65):
    # pass the supplied parameters through to the namedtuple
    β = 1/R
    β_seq = np.array([β**i for i in range(T+1)])
    return ConsumptionSmoothing(R=R, g1=g1, g2=g2,
                                β_seq=β_seq, T=T)

9.2 Friedman-Hall consumption-smoothing model

A key object in the model is what Milton Friedman called “human” or “non-financial” wealth at time 0:

𝑦0
𝑇 ⎡𝑦 ⎤
ℎ0 ≡ ∑ 𝑅−𝑡 𝑦𝑡 = [1 𝑅−1 ⋯ 𝑅−𝑇 ] ⎢ 1 ⎥
𝑡=0
⎢ ⋮ ⎥
⎣𝑦𝑇 ⎦

Human or non-financial wealth is evidently just the present value at time 0 of the consumer’s non-financial income stream
𝑦.
Notice that formally it very much resembles the asset price that we computed in this quantecon lecture present values.
Indeed, this is why Milton Friedman called it “human capital”.
By iterating on equation (9.1) and imposing the terminal condition

𝑎𝑇 +1 = 0,

it is possible to convert a sequence of budget constraints into the single intertemporal constraint
$$\sum_{t=0}^{T} R^{-t} c_t = a_0 + h_0,$$

which says that the present value of the consumption stream equals the sum of financial and non-financial (or human) wealth.
Robert Hall [Hal78] showed that when 𝛽𝑅 = 1, a condition Milton Friedman had also assumed, it is “optimal” for a
consumer to smooth consumption by setting

$$c_t = c_0, \quad t = 0, 1, \ldots, T$$

(Later we’ll present a “variational argument” that shows that this constant path is indeed optimal when 𝛽𝑅 = 1.)
In this case, we can use the intertemporal budget constraint to write
$$c_t = c_0 = \left( \sum_{t=0}^{T} R^{-t} \right)^{-1} (a_0 + h_0), \quad t = 0, 1, \ldots, T. \qquad (9.3)$$

Equation (9.3) is the consumption-smoothing model in a nutshell.


9.3 Mechanics of Consumption smoothing model

As promised, we’ll provide step by step instructions on how to use linear algebra, readily implemented in Python, to
compute all the objects in play in the consumption-smoothing model.
In the calculations below, we’ll set default values of 𝑅 > 1, e.g., 𝑅 = 1.05, and 𝛽 = 𝑅−1 .

9.3.1 Step 1

For some (𝑇 + 1) × 1 𝑦 vector, use matrix algebra to compute ℎ0


$$h_0 = \sum_{t=0}^{T} R^{-t} y_t = \begin{bmatrix} 1 & R^{-1} & \cdots & R^{-T} \end{bmatrix}
\begin{bmatrix} y_0 \\ y_1 \\ \vdots \\ y_T \end{bmatrix}$$

9.3.2 Step 2

Compute the optimal level of consumption 𝑐0


$$c_t = c_0 = \left( \frac{1 - R^{-1}}{1 - R^{-(T+1)}} \right) \left( a_0 + \sum_{t=0}^{T} R^{-t} y_t \right), \quad t = 0, 1, \ldots, T$$

9.3.3 Step 3

In this step, we use the system of equations (9.1) for 𝑡 = 0, … , 𝑇 to compute a path 𝑎 of financial wealth.
To do this, we translated that system of difference equations into a single matrix equation as follows (we’ll say more about
the mechanics of using linear algebra to solve such difference equations later in the last part of this lecture):
$$\begin{bmatrix}
1 & 0 & 0 & \cdots & 0 & 0 & 0 \\
-R & 1 & 0 & \cdots & 0 & 0 & 0 \\
0 & -R & 1 & \cdots & 0 & 0 & 0 \\
\vdots & \vdots & \vdots & \ddots & \vdots & \vdots & \vdots \\
0 & 0 & 0 & \cdots & -R & 1 & 0 \\
0 & 0 & 0 & \cdots & 0 & -R & 1
\end{bmatrix}
\begin{bmatrix} a_1 \\ a_2 \\ a_3 \\ \vdots \\ a_T \\ a_{T+1} \end{bmatrix}
= R
\begin{bmatrix} y_0 + a_0 - c_0 \\ y_1 - c_0 \\ y_2 - c_0 \\ \vdots \\ y_{T-1} - c_0 \\ y_T - c_0 \end{bmatrix}$$
Multiply both sides by the inverse of the matrix on the left side to compute
$$\begin{bmatrix} a_1 \\ a_2 \\ a_3 \\ \vdots \\ a_T \\ a_{T+1} \end{bmatrix}$$

It should turn out automatically that

$$a_{T+1} = 0.$$

We have built into our calculations that the consumer leaves life with exactly zero assets, just barely satisfying the terminal condition that $a_{T+1} \geq 0$.
Let’s verify this with our Python code.
First we implement this model in compute_optimal


def compute_optimal(model, a0, y_seq):

    R, T = model.R, model.T

    # non-financial wealth
    h0 = model.β_seq @ y_seq    # since β = 1/R

    # c0
    c0 = (1 - 1/R) / (1 - (1/R)**(T+1)) * (a0 + h0)
    c_seq = c0*np.ones(T+1)

    # verify by solving the stacked budget constraints (9.1),
    # a_{t+1} = R(a_t + y_t - c_t), for the wealth path
    A = np.diag(-R*np.ones(T), k=-1) + np.eye(T+1)
    b = y_seq - c_seq
    b[0] = b[0] + a0

    a_seq = np.linalg.inv(A) @ (R * b)
    a_seq = np.concatenate([[a0], a_seq])

    return c_seq, a_seq

We use an example where the consumer inherits 𝑎0 < 0 (which can be interpreted as student debt).
The non-financial income process {𝑦𝑡 }𝑇𝑡=0 is constant and positive up to 𝑡 = 45 and then becomes zero afterward.

# Financial wealth
a0 = -2 # such as "student debt"

# non-financial Income process


y_seq = np.concatenate([np.ones(46), np.zeros(20)])

cs_model = creat_cs_model()
c_seq, a_seq = compute_optimal(cs_model, a0, y_seq)

print('check a_T+1=0:',
np.abs(a_seq[-1] - 0) <= 1e-8)

check a_T+1=0: True

The visualization shows the path of non-financial income, consumption, and financial assets.

# Sequence Length
T = cs_model.T

plt.plot(range(T+1), y_seq, label='non-financial income')


plt.plot(range(T+1), c_seq, label='consumption')
plt.plot(range(T+2), a_seq, label='financial wealth')
plt.plot(range(T+2), np.zeros(T+2), '--')

plt.legend()
plt.xlabel(r'$t$')
plt.ylabel(r'$c_t,y_t,a_t$')
plt.show()


Note that 𝑎𝑇 +1 = 0 is satisfied.


We can further evaluate the welfare using the formula (9.2)

def welfare(model, c_seq):


β_seq, g1, g2 = model.β_seq, model.g1, model.g2

u_seq = g1 * c_seq - g2/2 * c_seq**2


return β_seq @ u_seq

print('Welfare:', welfare(cs_model, c_seq))

Welfare: 13.285050962183433

9.3.4 Feasible consumption variations

Earlier, we had promised to present an argument that supports our claim that a constant consumption path $c_t = c_0$ for all $t$ is optimal.
Let’s do that now.
Although simple and direct, the approach we’ll take is actually an example of what is called the “calculus of variations”.
Let’s dive in and see what the key idea is.
To explore what types of consumption paths are welfare-improving, we shall create an admissible consumption path
variation sequence {𝑣𝑡 }𝑇𝑡=0 that satisfies
$$\sum_{t=0}^{T} R^{-t} v_t = 0$$

This equation says that the present value of admissible variations must be zero.


(So once again, we encounter our formula for the present value of an “asset”.)
Here we’ll compute a two-parameter class of admissible variations of the form

$$v_t = \xi_1 \phi^t - \xi_0$$

We say two and not three-parameter class because 𝜉0 will be a function of (𝜙, 𝜉1 ; 𝑅) that guarantees that the variation is
feasible.
Let’s compute that function.
We require
$$\sum_{t=0}^{T} R^{-t} \left[ \xi_1 \phi^t - \xi_0 \right] = 0$$

which implies that


$$\xi_1 \sum_{t=0}^{T} \phi^t R^{-t} - \xi_0 \sum_{t=0}^{T} R^{-t} = 0$$

which implies that

$$\xi_1 \frac{1 - (\phi R^{-1})^{T+1}}{1 - \phi R^{-1}} - \xi_0 \frac{1 - R^{-(T+1)}}{1 - R^{-1}} = 0$$
which implies that

$$\xi_0 = \xi_0(\phi, \xi_1; R) = \xi_1 \left( \frac{1 - R^{-1}}{1 - R^{-(T+1)}} \right) \left( \frac{1 - (\phi R^{-1})^{T+1}}{1 - \phi R^{-1}} \right)$$

This is our formula for 𝜉0 .


Evidently, if 𝑐𝑜 is a budget-feasible consumption path, then so is 𝑐𝑜 + 𝑣, where 𝑣 is a budget-feasible variation.
Given 𝑅, we thus have a two parameter class of budget feasible variations 𝑣 that we can use to compute alternative
consumption paths, then evaluate their welfare.
Now let’s compute and visualize the variations

def compute_variation(model, ξ1, ϕ, a0, y_seq, verbose=1):


R, T, β_seq = model.R, model.T, model.β_seq

ξ0 = ξ1*((1 - 1/R) / (1 - (1/R)**(T+1))) * ((1 - (ϕ/R)**(T+1)) / (1 - ϕ/R))


v_seq = np.array([(ξ1*ϕ**t - ξ0) for t in range(T+1)])

if verbose == 1:
print('check feasible:', np.isclose(β_seq @ v_seq, 0)) # since β = 1/R

c_opt, _ = compute_optimal(model, a0, y_seq)


cvar_seq = c_opt + v_seq

return cvar_seq

We visualize variations with 𝜉1 ∈ {.01, .05} and 𝜙 ∈ {.95, 1.02}


fig, ax = plt.subplots()

ξ1s = [.01, .05]


ϕs= [.95, 1.02]
colors = {.01: 'tab:blue', .05: 'tab:green'}

params = np.array(np.meshgrid(ξ1s, ϕs)).T.reshape(-1, 2)

for i, param in enumerate(params):


ξ1, ϕ = param
print(f'variation {i}: ξ1={ξ1}, ϕ={ϕ}')
cvar_seq = compute_variation(model=cs_model,
ξ1=ξ1, ϕ=ϕ, a0=a0,
y_seq=y_seq)
print(f'welfare={welfare(cs_model, cvar_seq)}')
print('-'*64)
if i % 2 == 0:
ls = '-.'
else:
ls = '-'
ax.plot(range(T+1), cvar_seq, ls=ls,
color=colors[ξ1],
label=fr'$\xi_1 = {ξ1}, \phi = {ϕ}$')

plt.plot(range(T+1), c_seq,
color='orange', label=r'Optimal $\vec{c}$ ')

plt.legend()
plt.xlabel(r'$t$')
plt.ylabel(r'$c_t$')
plt.show()

variation 0: ξ1=0.01, ϕ=0.95


check feasible: True
welfare=13.285009346064834
----------------------------------------------------------------
variation 1: ξ1=0.01, ϕ=1.02
check feasible: True
welfare=13.284911631015438
----------------------------------------------------------------
variation 2: ξ1=0.05, ϕ=0.95
check feasible: True
welfare=13.284010559218512
----------------------------------------------------------------
variation 3: ξ1=0.05, ϕ=1.02
check feasible: True
welfare=13.28156768298361
----------------------------------------------------------------


We can even use the Python np.gradient command to compute derivatives of welfare with respect to our two parameters.
We are teaching the key idea beneath the calculus of variations.
First, we define the welfare with respect to 𝜉1 and 𝜙

def welfare_rel(ξ1, ϕ):


"""
Compute welfare of variation sequence
for given ϕ, ξ1 with a consumption smoothing model
"""

cvar_seq = compute_variation(cs_model, ξ1=ξ1,


ϕ=ϕ, a0=a0,
y_seq=y_seq,
verbose=0)
return welfare(cs_model, cvar_seq)

# Vectorize the function to allow array input


welfare_vec = np.vectorize(welfare_rel)

Then we can visualize the relationship between welfare and 𝜉1 and compute its derivatives

ξ1_arr = np.linspace(-0.5, 0.5, 20)

plt.plot(ξ1_arr, welfare_vec(ξ1_arr, 1.02))


plt.ylabel('welfare')
plt.xlabel(r'$\xi_1$')
plt.show()

welfare_grad = welfare_vec(ξ1_arr, 1.02)


welfare_grad = np.gradient(welfare_grad)
plt.plot(ξ1_arr, welfare_grad)
plt.ylabel('derivative of welfare')
plt.xlabel(r'$\xi_1$')
plt.show()

The same can be done on 𝜙


ϕ_arr = np.linspace(-0.5, 0.5, 20)

plt.plot(ϕ_arr, welfare_vec(0.05, ϕ_arr))


plt.ylabel('welfare')
plt.xlabel(r'$\phi$')
plt.show()

welfare_grad = welfare_vec(0.05, ϕ_arr)


welfare_grad = np.gradient(welfare_grad)
plt.plot(ϕ_arr, welfare_grad)
plt.ylabel('derivative of welfare')
plt.xlabel(r'$\phi$')
plt.show()


9.4 Wrapping up the consumption-smoothing model

The consumption-smoothing model of Milton Friedman [Fri56] and Robert Hall [Hal78] is a cornerstone of modern macro that has important ramifications about the size of the Keynesian "fiscal policy multiplier" described briefly in quantecon lecture geometric series.
In particular, Milton Friedman and others showed that it lowered the fiscal policy multiplier relative to the one implied
by the simple Keynesian consumption function presented in geometric series.
Friedman and Hall’s work opened the door to a lively literature on the aggregate consumption function and implied fiscal
multipliers that remains very active today.

9.5 Difference equations with linear algebra

In the preceding sections we have used linear algebra to solve a consumption smoothing model.
The same tools from linear algebra – matrix multiplication and matrix inversion – can be used to study many other dynamic
models too.
We'll conclude this lecture by giving a couple of examples.
In particular, we’ll describe a useful way of representing and “solving” linear difference equations.
To generate some 𝑦 vectors, we’ll just write down a linear difference equation with appropriate initial conditions and then
use linear algebra to solve it.


9.5.1 First-order difference equation

We’ll start with a first-order linear difference equation for {𝑦𝑡 }𝑇𝑡=0 :

𝑦𝑡 = 𝜆𝑦𝑡−1 , 𝑡 = 1, 2, … , 𝑇

where 𝑦0 is a given initial condition.


We can cast this set of 𝑇 equations as a single matrix equation

$$\begin{bmatrix}
1 & 0 & 0 & \cdots & 0 & 0 \\
-\lambda & 1 & 0 & \cdots & 0 & 0 \\
0 & -\lambda & 1 & \cdots & 0 & 0 \\
\vdots & \vdots & \vdots & \cdots & \vdots & \vdots \\
0 & 0 & 0 & \cdots & -\lambda & 1
\end{bmatrix}
\begin{bmatrix} y_1 \\ y_2 \\ y_3 \\ \vdots \\ y_T \end{bmatrix}
=
\begin{bmatrix} \lambda y_0 \\ 0 \\ 0 \\ \vdots \\ 0 \end{bmatrix}$$

Multiplying both sides by the inverse of the matrix on the left provides the solution

$$\begin{bmatrix} y_1 \\ y_2 \\ y_3 \\ \vdots \\ y_T \end{bmatrix}
=
\begin{bmatrix}
1 & 0 & 0 & \cdots & 0 & 0 \\
\lambda & 1 & 0 & \cdots & 0 & 0 \\
\lambda^2 & \lambda & 1 & \cdots & 0 & 0 \\
\vdots & \vdots & \vdots & \cdots & \vdots & \vdots \\
\lambda^{T-1} & \lambda^{T-2} & \lambda^{T-3} & \cdots & \lambda & 1
\end{bmatrix}
\begin{bmatrix} \lambda y_0 \\ 0 \\ 0 \\ \vdots \\ 0 \end{bmatrix} \qquad (9.4)$$
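Here is a minimal Python sketch of this calculation (the values of T, λ and y0 are arbitrary):

T, λ, y0 = 10, 0.9, 1.0

A = np.eye(T) - λ * np.eye(T, k=-1)     # the T x T matrix on the left
b = np.zeros(T)
b[0] = λ * y0

y = np.linalg.solve(A, b)               # y_1, ..., y_T
print(np.allclose(y, y0 * λ**np.arange(1, T+1)))   # agrees with y_t = λ^t y_0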

Exercise 9.5.1
In (9.4), we multiplied by the inverse of the matrix $A$. In this exercise, please confirm that

$$\begin{bmatrix}
1 & 0 & 0 & \cdots & 0 & 0 \\
\lambda & 1 & 0 & \cdots & 0 & 0 \\
\lambda^2 & \lambda & 1 & \cdots & 0 & 0 \\
\vdots & \vdots & \vdots & \cdots & \vdots & \vdots \\
\lambda^{T-1} & \lambda^{T-2} & \lambda^{T-3} & \cdots & \lambda & 1
\end{bmatrix}$$

is the inverse of $A$ and check that $A A^{-1} = I$.

9.5.2 Second order difference equation

A second-order linear difference equation for {𝑦𝑡 }𝑇𝑡=0 is

𝑦𝑡 = 𝜆1 𝑦𝑡−1 + 𝜆2 𝑦𝑡−2 , 𝑡 = 1, 2, … , 𝑇

where now $y_0$ and $y_{-1}$ are two given initial conditions determined outside the model.
As we did with the first-order difference equation, we can cast this set of 𝑇 equations as a single matrix equation

$$\begin{bmatrix}
1 & 0 & 0 & \cdots & 0 & 0 & 0 \\
-\lambda_1 & 1 & 0 & \cdots & 0 & 0 & 0 \\
-\lambda_2 & -\lambda_1 & 1 & \cdots & 0 & 0 & 0 \\
\vdots & \vdots & \vdots & \cdots & \vdots & \vdots & \vdots \\
0 & 0 & 0 & \cdots & -\lambda_2 & -\lambda_1 & 1
\end{bmatrix}
\begin{bmatrix} y_1 \\ y_2 \\ y_3 \\ \vdots \\ y_T \end{bmatrix}
=
\begin{bmatrix} \lambda_1 y_0 + \lambda_2 y_{-1} \\ \lambda_2 y_0 \\ 0 \\ \vdots \\ 0 \end{bmatrix}$$

Multiplying both sides by the inverse of the matrix on the left again provides the solution.
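A similar sketch works for the second-order case (again, the parameter values below are arbitrary):

T, λ1, λ2 = 10, 1.1, -0.3
y0, y_m1 = 1.0, 0.5                      # y_0 and y_{-1}

A = (np.eye(T)
     - λ1 * np.eye(T, k=-1)
     - λ2 * np.eye(T, k=-2))
b = np.zeros(T)
b[0] = λ1 * y0 + λ2 * y_m1
b[1] = λ2 * y0

y = np.linalg.solve(A, b)                # y_1, ..., y_T

# compare with direct iteration of y_t = λ1 y_{t-1} + λ2 y_{t-2}
y_iter, prev, prev2 = [], y0, y_m1
for _ in range(T):
    prev, prev2 = λ1 * prev + λ2 * prev2, prev
    y_iter.append(prev)
print(np.allclose(y, y_iter))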


Exercise 9.5.2
As an exercise, we ask you to represent and solve a third order linear difference equation. How many initial conditions
must you specify?



CHAPTER TEN: EQUALIZING DIFFERENCE MODEL

10.1 Overview

This lecture presents a model of the college-high-school wage gap in which the “time to build” a college graduate plays a
key role.
The model is "incomplete" in the sense that it is just one "condition" in the form of a single equation that would be part of a set of equations comprising all "equilibrium conditions" of a more fully articulated model.
The condition featured in our model determines a college, high-school wage ratio that equalizes the present values of a
high school worker and a college educated worker.
The idea behind this condition is that lifetime earnings have to adjust to make someone indifferent between going to
college and not going to college.
(The job of the “other equations” in a more complete model would be to fill in details about what adjusts to bring about
this outcome.)
It is just one instance of an “equalizing difference” theory of relative wage rates, a class of theories dating back at least to
Adam Smith’s Wealth of Nations [Smi10].
For most of this lecture, the only mathematical tools that we'll use are from linear algebra, in particular, matrix multiplication and matrix inversion.
However, at the very end of the lecture, we’ll use calculus just in case readers want to see how computing partial derivatives
could let us present some findings more concisely.
(And doing that will let us show off how good Python is at doing calculus!)
But if you don’t know calculus, our tools from linear algebra are certainly enough.
As usual, we’ll start by importing some Python modules.

import numpy as np
import matplotlib.pyplot as plt


10.2 The indifference condition

The key idea is that the initial college wage premium has to adjust to make a representative worker indifferent between
going to college and not going to college.
Let
• 𝑅 > 1 be the gross rate of return on a one-period bond
• 𝑡 = 0, 1, 2, … 𝑇 denote the years that a person either works or attends college
• 0 denote the first period after high school that a person can go to work
• 𝑇 denote the last period that a person works
• 𝑤𝑡ℎ be the wage at time 𝑡 of a high school graduate
• 𝑤𝑡𝑐 be the wage at time 𝑡 of a college graduate
• 𝛾ℎ > 1 be the (gross) rate of growth of wages of a high school graduate, so that 𝑤𝑡ℎ = 𝑤0ℎ 𝛾ℎ𝑡
• 𝛾𝑐 > 1 be the (gross) rate of growth of wages of a college graduate, so that 𝑤𝑡𝑐 = 𝑤0𝑐 𝛾𝑐𝑡
• 𝐷 be the upfront monetary costs of going to college
If someone goes to work immediately after high school and works for the 𝑇 + 1 years 𝑡 = 0, 1, 2, … , 𝑇 , she earns present
value
$$h_0 = \sum_{t=0}^{T} R^{-t} w_t^h = w_0^h \left[ \frac{1 - (R^{-1}\gamma_h)^{T+1}}{1 - R^{-1}\gamma_h} \right] \equiv w_0^h A_h$$

where

$$A_h = \left[ \frac{1 - (R^{-1}\gamma_h)^{T+1}}{1 - R^{-1}\gamma_h} \right].$$
The present value ℎ0 is the “human wealth” at the beginning of time 0 of someone who chooses not to attend college but
instead to go to work immediately at the wage of a high school graduate.
If someone goes to college for the four years 𝑡 = 0, 1, 2, 3 during which she earns 0, but then goes to work immediately
after college and works for the 𝑇 − 3 years 𝑡 = 4, 5, … , 𝑇 , she earns present value
$$c_0 = \sum_{t=4}^{T} R^{-t} w_t^c = w_0^c (R^{-1}\gamma_c)^4 \left[ \frac{1 - (R^{-1}\gamma_c)^{T-3}}{1 - R^{-1}\gamma_c} \right] \equiv w_0^c A_c$$

where

$$A_c = (R^{-1}\gamma_c)^4 \left[ \frac{1 - (R^{-1}\gamma_c)^{T-3}}{1 - R^{-1}\gamma_c} \right]$$
The present value 𝑐0 is the “human wealth” at the beginning of time 0 of someone who chooses to attend college for four
years and then start to work at time 𝑡 = 4 at the wage of a college graduate.
Assume that college tuition plus four years of room and board paid for up front costs 𝐷.
So net of monetary cost of college, the present value of attending college as of the first period after high school is

𝑐0 − 𝐷

We now formulate a pure equalizing difference model of the initial college-high-school wage gap $\phi$, defined by

$$w_0^c = \phi w_0^h$$


We suppose that 𝑅, 𝛾ℎ , 𝛾𝑐 , 𝑇 and also 𝑤0ℎ are fixed parameters.


We start by noting that the pure equalizing difference model asserts that the college-high-school wage gap $\phi$ solves an "equalizing" equation that sets the present value of not going to college equal to the present value of going to college:

ℎ0 = 𝑐 0 − 𝐷

or

𝑤0ℎ 𝐴ℎ = 𝜙𝑤0ℎ 𝐴𝑐 − 𝐷. (10.1)

This is the “indifference condition” that is at the heart of the model.


Solving equation (10.1) for the college wage premium 𝜙 we obtain

$$\phi = \frac{A_h}{A_c} + \frac{D}{w_0^h A_c}. \qquad (10.2)$$

In a free college special case 𝐷 = 0 so that the only cost of going to college is the forgone earnings from not working as
a high school worker.
In that case,
$$\phi = \frac{A_h}{A_c}.$$

Soon we’ll write Python code to compute the gap and plot it as a function of its determinants.
But first we’ll describe a possible alternative interpretation of our model.

10.3 Reinterpreting the model: workers and entrepreneurs

We can add a parameter and reinterpret variables to get a model of entrepreneurs versus workers.
We now let ℎ be the present value of a “worker”.
We define the present value of an entrepreneur to be
$$c_0 = \pi \sum_{t=4}^{T} R^{-t} w_t^c$$

where 𝜋 ∈ (0, 1) is the probability that an entrepreneur’s “project” succeeds.


For our model of workers and firms, we’ll interpret 𝐷 as the cost of becoming an entrepreneur.
This cost might include costs of hiring workers, office space, and lawyers.
What we used to call the college, high school wage gap 𝜙 now becomes the ratio of a successful entrepreneur’s earnings
to a worker’s earnings.
We’ll find that as 𝜋 decreases, 𝜙 increases.
Now let’s write some Python code to compute 𝜙 and plot it as a function of some of its determinants.
We can have some fun providing some example calculations that tweak various parameters, prominently including
𝛾ℎ , 𝛾𝑐 , 𝑅.


class equalizing_diff:
"""
A class of the equalizing difference model
"""

def __init__(self, R, T, γ_h, γ_c, w_h0, D=0, π=None):


# one switches to the weak model by setting π
self.R, self.γ_h, self.γ_c, self.w_h0, self.D = R, γ_h, γ_c, w_h0, D
self.T, self.π = T, π

def compute_gap(self):
R, γ_h, γ_c, w_h0, D = self.R, self.γ_h, self.γ_c, self.w_h0, self.D
T, π = self.T, self.π

A_h = (1 - (γ_h/R)**(T+1)) / (1 - γ_h/R)


A_c = (1 - (γ_c/R)**(T-3)) / (1 - γ_c/R) * (γ_c/R)**4

# tweaked model
if π is not None:
A_c = π*A_c

ϕ = A_h/A_c + D/(w_h0*A_c)
return ϕ

We can build some functions to help do comparative statics using vectorization instead of loops.
For a given instance of the class, we want to compute 𝜙 when one parameter changes and others remain unchanged.
Let’s do an example.

# ϕ_R
def ϕ_R(mc, R_new):
mc_new = equalizing_diff(R_new, mc.T, mc.γ_h, mc.γ_c, mc.w_h0, mc.D, mc.π)
return mc_new.compute_gap()

ϕ_R = np.vectorize(ϕ_R)

# ϕ_γh
def ϕ_γh(mc, γh_new):
mc_new = equalizing_diff(mc.R, mc.T, γh_new, mc.γ_c, mc.w_h0, mc.D, mc.π)
return mc_new.compute_gap()

ϕ_γh = np.vectorize(ϕ_γh)

# ϕ_γc
def ϕ_γc(mc, γc_new):
mc_new = equalizing_diff(mc.R, mc.T, mc.γ_h, γc_new, mc.w_h0, mc.D, mc.π)
return mc_new.compute_gap()

ϕ_γc = np.vectorize(ϕ_γc)

# ϕ_π
def ϕ_π(mc, π_new):
mc_new = equalizing_diff(mc.R, mc.T, mc.γ_h, mc.γ_c, mc.w_h0, mc.D, π_new)
return mc_new.compute_gap()

ϕ_π = np.vectorize(ϕ_π)


# set benchmark parameters


R = 1.05
T = 40
γ_h, γ_c = 1.01, 1.01
w_h0 = 1
D = 10

# create an instance
ex1 = equalizing_diff(R=R, T=T, γ_h=γ_h, γ_c=γ_c, w_h0=w_h0, D=D)
gap1 = ex1.compute_gap()

print(gap1)

1.8041412724969135

Let’s not charge for college and recompute 𝜙.


The initial college wage premium should go down.

# free college
ex2 = equalizing_diff(R, T, γ_h, γ_c, w_h0, D=0)
gap2 = ex2.compute_gap()
print(gap2)

1.2204649517903732

Let us construct some graphs that show us how the initial college-high-school wage ratio 𝜙 would change if one of its
determinants were to change.
Let’s start with the gross interest rate 𝑅.

R_arr = np.linspace(1, 1.2, 50)


plt.plot(R_arr, ϕ_R(ex1, R_arr))
plt.xlabel(r'$R$')
plt.ylabel(r'wage gap')
plt.show()


Evidently, the initial wage ratio 𝜙 must rise to compensate a prospective college student for waiting to start receiving income – remember that while she is earning nothing in years 𝑡 = 0, 1, 2, 3, the high school worker is earning a salary.
Now let's study what happens to the initial wage ratio 𝜙 if the rate of growth of college wages rises, holding constant other determinants of 𝜙.

γc_arr = np.linspace(1, 1.2, 50)


plt.plot(γc_arr, ϕ_γc(ex1, γc_arr))
plt.xlabel(r'$\gamma_c$')
plt.ylabel(r'wage gap')
plt.show()


Notice how the initial wage gap falls when the rate of growth 𝛾𝑐 of college wages rises.
It falls to “equalize” the present values of the two types of career, one as a high school worker, the other as a college
worker.
Can you guess what happens to the initial wage ratio 𝜙 when next we vary the rate of growth of high school wages, holding
all other determinants of 𝜙 constant?
The following graph shows what happens.

γh_arr = np.linspace(1, 1.1, 50)


plt.plot(γh_arr, ϕ_γh(ex1, γh_arr))
plt.xlabel(r'$\gamma_h$')
plt.ylabel(r'wage gap')
plt.show()


10.4 Entrepreneur-worker interpretation

Now let’s adopt the entrepreneur-worker interpretation of our model.


If the probability that a new business succeeds is 0.2, let's compute the initial wage premium for successful entrepreneurs.

# a model of an entrepreneur
ex3 = equalizing_diff(R, T, γ_h, γ_c, w_h0, π=0.2)
gap3 = ex3.compute_gap()

print(gap3)

6.102324758951866

Now let’s study how the initial wage premium for successful entrepreneurs depend on the success probability.

π_arr = np.linspace(0.2, 1, 50)


plt.plot(π_arr, ϕ_π(ex3, π_arr))
plt.ylabel(r'wage gap')
plt.xlabel(r'$\pi$')
plt.show()


Does the graph make sense to you?

10.5 An application of calculus

So far, we have used only linear algebra and it has been a good enough tool for us to figure out how our model works.
However, someone who knows calculus might ask “Instead of plotting those graphs, why didn’t you just take partial
derivatives?”
We'll briefly do just that: yes, the questioner is correct, and partial derivatives are indeed a good tool for discovering the "comparative statics" properties of our model.
A reader who doesn’t know calculus could read no further and feel confident that applying linear algebra has taught us the
main properties of the model.
But for a reader interested in how we can get Python to do all the hard work involved in computing partial derivatives,
we’ll say a few things about that now.
We’ll use the Python module ‘sympy’ to compute partial derivatives of 𝜙 with respect to the parameters that determine it.
Let’s import key functions from sympy.

from sympy import Symbol, Lambda, symbols

Define symbols

γ_h, γ_c, w_h0, D = symbols('\gamma_h, \gamma_c, w_0^h, D', real=True)


R, T = Symbol('R', real=True), Symbol('T', integer=True)

Define function 𝐴ℎ


A_h = Lambda((γ_h, R, T), (1 - (γ_h/R)**(T+1)) / (1 - γ_h/R))


A_h

$$\left( (\gamma_h, R, T) \mapsto \frac{1 - \left(\frac{\gamma_h}{R}\right)^{T+1}}{1 - \frac{\gamma_h}{R}} \right)$$

Define function 𝐴𝑐

A_c = Lambda((γ_c, R, T), (1 - (γ_c/R)**(T-3)) / (1 - γ_c/R) * (γ_c/R)**4)


A_c

$$\left( (\gamma_c, R, T) \mapsto \frac{\gamma_c^{4}\left(1 - \left(\frac{\gamma_c}{R}\right)^{T-3}\right)}{R^{4}\left(1 - \frac{\gamma_c}{R}\right)} \right)$$

Now, define 𝜙

ϕ = Lambda((D, γ_h, γ_c, R, T, w_h0),
           A_h(γ_h, R, T)/A_c(γ_c, R, T) + D/(w_h0*A_c(γ_c, R, T)))

$$\left( (D, \gamma_h, \gamma_c, R, T, w_0^h) \mapsto
\frac{D R^{4}\left(1 - \frac{\gamma_c}{R}\right)}{\gamma_c^{4}\, w_0^h \left(1 - \left(\frac{\gamma_c}{R}\right)^{T-3}\right)}
+ \frac{R^{4}\left(1 - \left(\frac{\gamma_h}{R}\right)^{T+1}\right)\left(1 - \frac{\gamma_c}{R}\right)}{\gamma_c^{4}\left(1 - \frac{\gamma_h}{R}\right)\left(1 - \left(\frac{\gamma_c}{R}\right)^{T-3}\right)} \right)$$

We begin by setting default parameter values.

R_value = 1.05
T_value = 40
γ_h_value, γ_c_value = 1.01, 1.01
w_h0_value = 1
D_value = 10

Now let's compute 𝜕𝜙/𝜕𝐷 and then evaluate it at the default values

ϕ_D = ϕ(D, γ_h, γ_c, R, T, w_h0).diff(D)


ϕ_D

$$\frac{R^{4}\left(1 - \frac{\gamma_c}{R}\right)}{\gamma_c^{4}\, w_0^h \left(1 - \left(\frac{\gamma_c}{R}\right)^{T-3}\right)}$$

# Numerical value at default parameters


ϕ_D_func = Lambda((D, γ_h, γ_c, R, T, w_h0), ϕ_D)
ϕ_D_func(D_value, γ_h_value, γ_c_value, R_value, T_value, w_h0_value)


0.058367632070654

Thus, we find that raising the college cost 𝐷 increases the initial college wage premium 𝜙.
Compute 𝜕𝜙/𝜕𝑇 and evaluate it at the default parameters

ϕ_T = ϕ(D, γ_h, γ_c, R, T, w_h0).diff(T)


ϕ_T

$$\frac{\partial \phi}{\partial T}
= \frac{D R^{4}\left(\frac{\gamma_c}{R}\right)^{T-3}\left(1-\frac{\gamma_c}{R}\right)\log\frac{\gamma_c}{R}}{\gamma_c^{4}\, w_0^h \left(1-\left(\frac{\gamma_c}{R}\right)^{T-3}\right)^{2}}
- \frac{R^{4}\left(\frac{\gamma_h}{R}\right)^{T+1}\left(1-\frac{\gamma_c}{R}\right)\log\frac{\gamma_h}{R}}{\gamma_c^{4}\left(1-\frac{\gamma_h}{R}\right)\left(1-\left(\frac{\gamma_c}{R}\right)^{T-3}\right)}
+ \frac{R^{4}\left(\frac{\gamma_c}{R}\right)^{T-3}\left(1-\left(\frac{\gamma_h}{R}\right)^{T+1}\right)\left(1-\frac{\gamma_c}{R}\right)\log\frac{\gamma_c}{R}}{\gamma_c^{4}\left(1-\frac{\gamma_h}{R}\right)\left(1-\left(\frac{\gamma_c}{R}\right)^{T-3}\right)^{2}}$$

# Numerical value at default parameters


ϕ_T_func = Lambda((D, γ_h, γ_c, R, T, w_h0), ϕ_T)
ϕ_T_func(D_value, γ_h_value, γ_c_value, R_value, T_value, w_h0_value)

−0.00973478032996598

We find that raising 𝑇 decreases the initial college wage premium 𝜙.

This is because college graduates now have longer careers over which to "pay off" the time and other costs they incurred to go to college.
Let's compute 𝜕𝜙/𝜕𝛾ℎ and evaluate it at the default parameters.

ϕ_γ_h = ϕ(D, γ_h, γ_c, R, T, w_h0).diff(γ_h)


ϕ_γ_h

$$\frac{\partial \phi}{\partial \gamma_h}
= - \frac{R^{4}\left(\frac{\gamma_h}{R}\right)^{T+1}(T+1)\left(1-\frac{\gamma_c}{R}\right)}{\gamma_h\, \gamma_c^{4}\left(1-\frac{\gamma_h}{R}\right)\left(1-\left(\frac{\gamma_c}{R}\right)^{T-3}\right)}
+ \frac{R^{3}\left(1-\left(\frac{\gamma_h}{R}\right)^{T+1}\right)\left(1-\frac{\gamma_c}{R}\right)}{\gamma_c^{4}\left(1-\frac{\gamma_h}{R}\right)^{2}\left(1-\left(\frac{\gamma_c}{R}\right)^{T-3}\right)}$$

# Numerical value at default parameters


ϕ_γ_h_func = Lambda((D, γ_h, γ_c, R, T, w_h0), ϕ_γ_h)
ϕ_γ_h_func(D_value, γ_h_value, γ_c_value, R_value, T_value, w_h0_value)

17.8590485545256

We find that raising 𝛾ℎ increases the initial college wage premium 𝜙, as we found in our graphical analysis earlier.
Compute 𝜕𝜙/𝜕𝛾𝑐 and evaluate it numerically at the default parameter values

ϕ_γ_c = ϕ(D, γ_h, γ_c, R, T, w_h0).diff(γ_c)


ϕ_γ_c


$$\frac{\partial \phi}{\partial \gamma_c}
= \phi \left[\frac{(T-3)\left(\frac{\gamma_c}{R}\right)^{T-4}}{R\left(1 - \left(\frac{\gamma_c}{R}\right)^{T-3}\right)} - \frac{4}{\gamma_c} - \frac{1}{R\left(1 - \frac{\gamma_c}{R}\right)}\right]$$

# Numerical value at default parameters


ϕ_γ_c_func = Lambda((D, γ_h, γ_c, R, T, w_h0), ϕ_γ_c)
ϕ_γ_c_func(D_value, γ_h_value, γ_c_value, R_value, T_value, w_h0_value)

−31.6486401973376

We find that raising 𝛾𝑐 decreases the initial college wage premium 𝜙, as we found in our graphical analysis earlier.
Let's compute 𝜕𝜙/𝜕𝑅 and evaluate it numerically at the default parameter values

ϕ_R = ϕ(D, γ_h, γ_c, R, T, w_h0).diff(R)


ϕ_R

$$\frac{\partial \phi}{\partial R}
= \frac{1}{A_c}\left[\frac{(T+1)\left(\frac{\gamma_h}{R}\right)^{T+1}}{R\left(1 - \frac{\gamma_h}{R}\right)} - \frac{\frac{\gamma_h}{R}\left(1 - \left(\frac{\gamma_h}{R}\right)^{T+1}\right)}{R\left(1 - \frac{\gamma_h}{R}\right)^{2}}\right]
+ \phi \left[\frac{4}{R} + \frac{\frac{\gamma_c}{R}}{R\left(1 - \frac{\gamma_c}{R}\right)} - \frac{(T-3)\left(\frac{\gamma_c}{R}\right)^{T-3}}{R\left(1 - \left(\frac{\gamma_c}{R}\right)^{T-3}\right)}\right]$$

# Numerical value at default parameters


ϕ_R_func = Lambda((D, γ_h, γ_c, R, T, w_h0), ϕ_R)
ϕ_R_func(D_value, γ_h_value, γ_c_value, R_value, T_value, w_h0_value)

13.2642738659429

We find that raising the gross interest rate 𝑅 increases the initial college wage premium 𝜙, as we found in our graphical analysis earlier.
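
Finally, for readers who want to evaluate these symbolic derivatives over whole grids of parameter values, SymPy's lambdify can compile an expression into a NumPy function. The following short sketch (ours, not part of the lecture's own code) does this for 𝜕𝜙/𝜕𝑅 using the symbols and default values defined above:

from sympy import lambdify

# compile the symbolic derivative ∂φ/∂R into a NumPy-callable function
ϕ_R_num = lambdify((D, γ_h, γ_c, R, T, w_h0), ϕ_R, 'numpy')

# evaluate it on a grid of interest rates, holding other parameters at defaults
R_grid = np.linspace(1.02, 1.2, 50)
ϕ_R_num(D_value, γ_h_value, γ_c_value, R_grid, T_value, w_h0_value)[:3]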



CHAPTER

ELEVEN

PRICE LEVEL HISTORIES

This lecture offers some scraps of historical evidence about fluctuations in levels of aggregate price indexes.
The rate of growth of the price level is called inflation in the popular press and in discussions among central bankers and
treasury officials.
The price level is measured in units of domestic currency per units of a representative bundle of consumption goods.
Thus, in the US, the price level at 𝑡 is measured in dollars in month 𝑡 or year 𝑡 per unit of the consumption bundle.
Until the early 20th century, throughout much of the west, although price levels fluctuated from year to year, they didn’t
have much of a trend.
Thus, they tended to end a century at close to a level at which they started it.
Things were different in the 20th century, as we shall see in this lecture.
This lecture will set the stage for some subsequent lectures about a particular theory that economists use to think about
determinants of the price level.

11.1 Four Centuries of Price Levels

We begin by displaying some data that originally appeared on page 35 of [SV02].


The data are price levels for four "hard currency" countries from 1600 to 1914.
The four countries are
• France
• Spain (Castile)
• United Kingdom
• United States
In the present context, the phrase hard currency means that the countries were on a commodity-money standard: money
consisted of gold and silver coins that circulated at values largely determined by the weights of their gold and silver
contents.
(Under a gold or silver standard, some money also consisted of "warehouse certificates" that represented paper claims on gold or silver coins. Bank notes issued by the government or private banks can be viewed as examples of such "warehouse certificates".)
The data we want to study originally appeared in a graph on page 35 of [SV02].
As usual, we’ll start by importing some Python modules.


import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import matplotlib.dates as mdates
import datetime

!pip install xlrd

We’ll start by bringing these data into Pandas from a spreadsheet.

# import data
df_fig5 = pd.read_excel('datasets/longprices.xls', sheet_name='all',
                        header=2, index_col=0).iloc[1:]

df_fig5.index = df_fig5.index.astype(int)

df_fig5.head(5)

UK US France Castile
1600 72.455009 NaN NaN 48.559223
1601 84.609771 NaN NaN 41.760932
1602 74.349258 NaN NaN 40.158477
1603 70.718615 NaN NaN 40.595510
1604 63.773036 NaN NaN 44.480248

We first plot price levels over the period 1600-1914.


During most years in this time interval, the countries were on a gold or silver standard.

df_fig5_bef1914 = df_fig5[df_fig5.index <= 1915]

# create plot
cols = ['UK', 'US', 'France', 'Castile']

fig, ax = plt.subplots(1, 1, figsize=[8, 5], dpi=200)

for col in cols:
    ax.plot(df_fig5_bef1914.index, df_fig5_bef1914[col], label=col)

ax.spines[['right', 'top']].set_visible(False)
ax.legend()
ax.set_ylabel('Index 1913 = 100')
ax.set_xlim(xmin=1600)
plt.tight_layout()
fig.text(.5, .0001, "Price Levels", ha='center')
plt.show()


We say “most years” because there were temporary lapses from the gold or silver standard.
By staring at the graph carefully, you might be able to guess when these temporary lapses occurred, because they were
also times during which price levels rose markedly from what had been average values during more typical years.
• 1791-1797 in France (the French Revolution)
• 1776-1793 in the US (the US War for Independence from Great Britain)
• 1861-1865 in the US (the US Civil War)
During each of these episodes, the gold/silver standard was temporarily abandoned as a government printed paper money
to help it finance war expenditures.
Despite these temporary lapses, a striking thing about the figure is that price levels hovered around roughly constant
long-term levels for over three centuries.
Two other features of the figure attracted the attention of leading economists such as Irving Fisher of Yale University and John Maynard Keynes of Cambridge University in the early 20th century.
• There was considerable year-to-year instability of the price levels despite their being anchored to the same average level in the long term
• While using valuable gold and silver as coins was a time-tested way to anchor the price level by limiting the supply
of money, it cost real resources.
– that is, society paid a high “opportunity cost” for using gold and silver as coins; gold and silver could instead
be used as valuable jewelry and also as an industrial input.
Keynes and Fisher proposed what they suggested would be a socially more efficient way to achieve a price level that would be at least as firmly anchored, and would also exhibit smaller year-to-year fluctuations.
In particular, they argued that a well-managed central bank could achieve price level stability by
• issuing a limited supply of paper currency
• guaranteeing that it would not print money to finance government expenditures
Thus, the waste from using gold and silver as coins prompted John Maynard Keynes to call a commodity standard a
“barbarous relic.”
A paper fiat money system disposes of all reserves behind a currency.


But notice that in doing so, it also eliminates an automatic supply mechanism constraining the price level.
A low-inflation paper fiat money system replaces that automatic mechanism with an enlightened government that commits
itself to limiting the quantity of a pure token, no-cost currency.
Now let's see what happened to the price level in our four countries when, after 1914, one after another of them left the gold/silver standard.
We'll show a version of the complete graph that originally appeared on page 35 of [SV02].
The graph shows logarithms of price levels for our four "hard currency" countries from 1600 to 2000.
Although we didn't have to use logarithms in our earlier graphs that stopped in 1914, we use logarithms now because we want also to fit observations after 1914 in the same graph as the earlier observations.
All four of the countries eventually permanently left the gold standard by modifying their monetary and fiscal policies in several ways, starting with the outbreak of the Great War in 1914.

# create plot
cols = ['UK', 'US', 'France', 'Castile']

fig, ax = plt.subplots(1, 1, figsize=[8, 5], dpi=200)

for col in cols:
    ax.plot(df_fig5.index, df_fig5[col])
    ax.text(x=df_fig5.index[-1]+2, y=df_fig5[col].iloc[-1], s=col)

ax.spines[['right', 'top']].set_visible(False)
ax.set_yscale('log')
ax.set_ylabel('Index 1913 = 100')
ax.set_xlim(xmin=1600)
ax.set_ylim([10, 1e6])
plt.tight_layout()
fig.text(.5, .0001, "Logs of Price Levels", ha='center')
plt.show()

The graph shows that achieving price level stability with a well-managed paper money system proved to be more challenging than Irving Fisher and Keynes perhaps imagined.


Actually, earlier economists and statesmen knew about the possibility of fiat money systems long before Keynes and Fisher
advocated them in the early 20th century.
It was because earlier proponents of a commodity money system did not trust governments properly to manage a fiat
money system that they were willing to pay the resource costs associated with setting up and maintaining a commodity
money system.
In light of the high inflation episodes that many countries experienced in the twentieth century after they abandoned
commodity monies, it is difficult to criticize them for their preference to stay on the pre-1914 gold/silver standard.
The breadth and length of the inflationary experiences of the twentieth century, the century of paper money, are historically
unprecedented.

11.2 Ends of Four Big Inflations

In the wake of World War I, which ended in November 1918, monetary and fiscal authorities struggled to achieve price
level stability without being on a gold or silver standard.
We present four graphs from “The Ends of Four Big Inflations” from chapter 3 of [Sar13].
The graphs depict logarithms of price levels during the early post World War I years for four countries:
• Figure 3.1, Retail prices Austria, 1921-1924 (page 42)
• Figure 3.2, Wholesale prices Hungary, 1921-1924 (page 43)
• Figure 3.3, Wholesale prices, Poland, 1921-1924 (page 44)
• Figure 3.4, Wholesale prices, Germany, 1919-1924 (page 45)
We have added logarithms of the exchange rates vis a vis the US dollar to each of the four graphs from chapter 3 of
[Sar13].
Data underlying our graphs appear in tables in an appendix to chapter 3 of [Sar13]. We have transcribed all of these data into a spreadsheet chapter_3.xlsx that we read into Pandas.
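
The code below also relies on two small cleaning helpers, process_entry and process_df, whose definitions are not reproduced in the text. The following is only a rough sketch of what such helpers might do; in particular, the 'Year' and 'Month' column names are our own assumption, not taken from the spreadsheet.

def process_entry(entry):
    # Sketch: coerce a raw spreadsheet cell to a float, returning NaN otherwise
    if isinstance(entry, str):
        entry = entry.replace(',', '').strip()
    try:
        return float(entry)
    except (TypeError, ValueError):
        return np.nan

def process_df(df):
    # Sketch: index a table by month (the 'Year'/'Month' column names are hypothetical)
    df = df.copy()
    years, months = df.pop('Year'), df.pop('Month')
    df.index = [datetime.datetime(int(y), int(m), 1) for y, m in zip(years, months)]
    return df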

# import data
xls = pd.ExcelFile('datasets/chapter_3.xlsx')

# unpack and combine all series


sheet_index = [(2, 3, 4), (9, 10), (14, 15, 16), (21, 18, 19)]
remove_row = [(-2, -2, -2), (-7, -10), (-6, -4, -3), (-19, -3, -6)]

df_list = []

for i in range(4):

    indices, rows = sheet_index[i], remove_row[i]

    sheet_list = [pd.read_excel(xls, 'Table3.' + str(ind), header=1)
                    .iloc[:row]
                    .applymap(process_entry)
                  for ind, row in zip(indices, rows)]

    sheet_list = [process_df(df) for df in sheet_list]

    df_list.append(pd.concat(sheet_list, axis=1))

df_Aus, df_Hung, df_Pol, df_Germ = df_list

Let’s dive in and construct graphs for our four countries.


For each country, we’ll plot two graphs.


The first graph plots logarithms of


• price levels
• exchange rates vis a vis US dollars
For each country, the scale on the right side of a graph will pertain to the price level while the scale on the left side of a
graph will pertain to the exchange rate.
For each country, the second graph plots a three-month moving average of the inflation rate defined as 𝑝𝑡 − 𝑝𝑡−1 .
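
The plotting calls below use two helper functions, create_pe_plot and create_pr_plot, that are not shown in the text. Minimal sketches of what they might look like are given here (our own illustrations; colors, labels, and styling are assumptions, not the lecture's actual implementation):

def create_pe_plot(p_seq, e_seq, index, labs, ax):
    # Sketch: log exchange rate on the left scale, log price level on the right scale
    p_lab, e_lab = labs

    ax.plot(index, np.log(e_seq), label=e_lab, color='tab:orange')
    ax.set_ylabel('log exchange rate')
    ax.legend(loc='lower right')

    ax1 = ax.twinx()
    ax1.plot(index, np.log(p_seq), label=p_lab, color='tab:blue')
    ax1.set_ylabel('log price index')
    ax1.legend(loc='upper left')
    return ax1

def create_pr_plot(p_seq, index, ax):
    # Sketch: three-month moving average of monthly log price changes, p_t - p_{t-1}
    π = np.log(p_seq).diff()
    ax.plot(index, π.rolling(3).mean(), color='tab:blue')
    ax.set_ylabel('inflation (3-month moving average)')
    return ax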

11.2.1 Austria

The sources of our data are:


• Table 3.3, exp 𝑝
• Table 3.4, exchange rate with US

df_Aus.head(5)

Total Note Circulation Retail price index, 52 commodities \


1919-01-01 NaN NaN
1919-02-01 NaN NaN
1919-03-01 4687056.0 NaN
1919-04-01 5577851.0 NaN
1919-05-01 5960003.0 NaN

Exchange Rate
1919-01-01 17.09
1919-02-01 20.72
1919-03-01 25.85
1919-04-01 26.03
1919-05-01 24.75

p_seq = df_Aus['Retail price index, 52 commodities']


e_seq = df_Aus['Exchange Rate']

lab = ['Retail Price Index', 'Exchange Rate']

# create plot
fig, ax = plt.subplots(figsize=[10,7], dpi=200)
_ = create_pe_plot(p_seq, e_seq, df_Aus.index, lab, ax)

# connect disjunct parts


plt.figtext(0.5, -0.02, 'Austria', horizontalalignment='center', fontsize=12)
plt.show()


fig, ax = plt.subplots(figsize=[10,7], dpi=200)


_ = create_pr_plot(p_seq, df_Aus.index, ax)

plt.figtext(0.5, -0.02, 'Austria', horizontalalignment='center', fontsize=12)


plt.show()


Staring at the above graphs conveys the following impressions to the authors of this lecture at quantecon.
• an episode of “hyperinflation” with rapidly rising log price level and very high monthly inflation rates
• a sudden stop of the hyperinflation as indicated by the abrupt flattening of the log price level and a marked permanent
drop in the three-month average of inflation
• a US dollar exchange rate that shadows the price level.
We’ll see similar patterns in the next three episodes that we’ll study now.

11.2.2 Hungary

The source of our data for Hungary is:


• Table 3.10, price level exp 𝑝 and exchange rate

df_Hung.head(5)

Gold coin and bullion Silver coin Foreign currency and exchange \
1921-01-01 NaN NaN NaN
1921-02-01 NaN NaN NaN
1921-03-01 NaN NaN NaN
1921-04-01 NaN NaN NaN
1921-05-01 NaN NaN NaN

Bills discounted Advances on securities Advances to treasury \


1921-01-01 10924.0 195.0 NaN
1921-02-01 13202.0 162.0 NaN
1921-03-01 12862.0 160.0 NaN
1921-04-01 12178.0 110.0 NaN
1921-05-01 11847.0 111.0 NaN

Notes in circulation Currrent accounts and deposits \


1921-01-01 15206.0 3851.0
1921-02-01 15571.0 5531.0
1921-03-01 15650.0 5246.0
1921-04-01 13114.0 6802.0
1921-05-01 13686.0 5760.0

Hungarian index of prices Cents per crown in New York


1921-01-01 NaN NaN
1921-02-01 NaN NaN
1921-03-01 NaN NaN
1921-04-01 NaN NaN
1921-05-01 NaN NaN

m_seq = df_Hung['Notes in circulation']


p_seq = df_Hung['Hungarian index of prices']
e_seq = 1/df_Hung['Cents per crown in New York']
rb_seq = np.log(m_seq) - np.log(p_seq)

lab = ['Hungarian Index of Prices', '1/Cents per Crown in New York']

# create plot
fig, ax = plt.subplots(figsize=[10,7], dpi=200)

_ = create_pe_plot(p_seq, e_seq, df_Hung.index, lab, ax)

plt.figtext(0.5, -0.02, 'Hungary', horizontalalignment='center', fontsize=12)


plt.show()

fig, ax = plt.subplots(figsize=[10,7], dpi=200)


_ = create_pr_plot(p_seq, df_Hung.index, ax)

plt.figtext(0.5, -0.02, 'Hungary', horizontalalignment='center', fontsize=12)


plt.show()


11.2.3 Poland

The sources of our data for Poland are:


• Table 3.15, price level exp 𝑝
• Table 3.15, exchange rate

Note: To construct the price level series from the data in the spreadsheet, we instructed Pandas to follow the same
procedures implemented in chapter 3 of [Sar13]. We spliced together three series - Wholesale price index, Wholesale
Price Index: On paper currency basis, and Wholesale Price Index: On zloty basis. We adjusted the sequence based on the
price level ratio at the last period of the available previous series and glued them to construct a single series. We dropped
the exchange rate after June 1924, when the zloty was adopted. We did this because we don’t have the price measured in
zloty. We used the old currency in June to compute the exchange rate adjustment.

df_Pol.head(5)

Gold Silver (including base coin) Balances with foreign banks \


1919-01-01 NaN NaN NaN
1919-02-01 NaN NaN NaN
1919-03-01 3.7 4.2 3.9
1919-04-01 3.7 4.4 9.4
1919-05-01 3.7 8.9 5.8

Discounts Advances: Commercial Advances: Government \


1919-01-01 5.0 194.7 209.9
1919-02-01 4.2 196.4 315.0
1919-03-01 3.5 189.7 400.0
1919-04-01 2.5 192.8 575.0
1919-05-01 1.8 193.2 925.0

Note circulation Gold and Silver (together) \


1919-01-01 1098.1 NaN
1919-02-01 1160.0 NaN
1919-03-01 1223.2 NaN
1919-04-01 1346.0 NaN
1919-05-01 1548.3 NaN

Wholesale price index \


1919-01-01 NaN
1919-02-01 NaN
1919-03-01 NaN
1919-04-01 NaN
1919-05-01 NaN

Wholesale Price Index: On paper currency basis \


1919-01-01 NaN
1919-02-01 NaN
1919-03-01 NaN
1919-04-01 NaN
1919-05-01 NaN

Wholesale Price Index: On zloty basis \


1919-01-01 NaN
1919-02-01 NaN
1919-03-01 NaN
1919-04-01 NaN
1919-05-01 NaN

Cents per Polish mark (zloty after May 1924)


1919-01-01 NaN
1919-02-01 NaN
1919-03-01 NaN
1919-04-01 NaN
1919-05-01 NaN

# splice three price series in different units


p_seq1 = df_Pol['Wholesale price index'].copy()
p_seq2 = df_Pol['Wholesale Price Index: On paper currency basis'].copy()
p_seq3 = df_Pol['Wholesale Price Index: On zloty basis'].copy()

# non-nan part
ch_index_1 = p_seq1[~p_seq1.isna()].index[-1]
ch_index_2 = p_seq2[~p_seq2.isna()].index[-2]

adj_ratio12 = p_seq1[ch_index_1]/p_seq2[ch_index_1]
adj_ratio23 = p_seq2[ch_index_2]/p_seq3[ch_index_2]

# glue three series


p_seq = pd.concat([p_seq1[:ch_index_1],
adj_ratio12 * p_seq2[ch_index_1:ch_index_2],
adj_ratio23 * p_seq3[ch_index_2:]])

p_seq = p_seq[~p_seq.index.duplicated(keep='first')]

# exchange rate
e_seq = 1/df_Pol['Cents per Polish mark (zloty after May 1924)']
e_seq[e_seq.index > '05-01-1924'] = np.nan

lab = ['Wholesale Price Index', '1/Cents per Polish Mark']

# create plot
fig, ax = plt.subplots(figsize=[10,7], dpi=200)
ax1 = create_pe_plot(p_seq, e_seq, df_Pol.index, lab, ax)

plt.figtext(0.5, -0.02, 'Poland', horizontalalignment='center', fontsize=12)


plt.show()

fig, ax = plt.subplots(figsize=[10,7], dpi=200)


_ = create_pr_plot(p_seq, df_Pol.index, ax)

plt.figtext(0.5, -0.02, 'Poland', horizontalalignment='center', fontsize=12)


plt.show()


11.2.4 Germany

The sources of our data for Germany are the following tables from chapter 3 of [Sar13]:
• Table 3.18, wholesale price level exp 𝑝
• Table 3.19, exchange rate

df_Germ.head(5)

Day Gold Held Abroad Gold In vault Gold Total \


1919-01-01 NaN NaN NaN NaN
1919-02-01 NaN NaN NaN NaN
1919-03-01 NaN NaN NaN NaN
1919-04-01 NaN NaN NaN NaN
1919-05-01 NaN NaN NaN NaN

Other metallic currency Total coin and bullion Foreign exchange \


1919-01-01 NaN NaN NaN
1919-02-01 NaN NaN NaN
1919-03-01 NaN NaN NaN
1919-04-01 NaN NaN NaN
1919-05-01 NaN NaN NaN

Treasury and loan bank notes Notes of other banks \


1919-01-01 NaN NaN
1919-02-01 NaN NaN
1919-03-01 NaN NaN
1919-04-01 NaN NaN
1919-05-01 NaN NaN

Rentenbank notes ... \


1919-01-01 NaN ...
1919-02-01 NaN ...
1919-03-01 NaN ...
1919-04-01 NaN ...
1919-05-01 NaN ...

Total discounted treasury and commercial bills Advances \


1919-01-01 NaN NaN
1919-02-01 NaN NaN
1919-03-01 NaN NaN
1919-04-01 NaN NaN
1919-05-01 NaN NaN

Securities Notes in circulation Public Demand deposits \


1919-01-01 NaN NaN NaN
1919-02-01 NaN NaN NaN
1919-03-01 NaN NaN NaN
1919-04-01 NaN NaN NaN
1919-05-01 NaN NaN NaN

Other Demand deposits Total demand deposits \


1919-01-01 NaN NaN
1919-02-01 NaN NaN
1919-03-01 NaN NaN
1919-04-01 NaN NaN
1919-05-01 NaN NaN

Demand deposits due to the Rentenbank \


1919-01-01 NaN
1919-02-01 NaN
1919-03-01 NaN
1919-04-01 NaN
1919-05-01 NaN

Price index (on basis of marks before July 1924, reichsmarks after) \
1919-01-01 262.0
1919-02-01 270.0
1919-03-01 274.0
1919-04-01 286.0
1919-05-01 297.0

Cents per mark


1919-01-01 NaN
1919-02-01 NaN
1919-03-01 NaN
1919-04-01 NaN
1919-05-01 NaN

[5 rows x 22 columns]

p_seq = df_Germ['Price index (on basis of marks before July 1924, reichsmarks after)'].copy()

e_seq = 1/df_Germ['Cents per mark']

lab = ['Price Index', '1/Cents per Mark']

# create plot
fig, ax = plt.subplots(figsize=[9,5], dpi=200)
ax1 = create_pe_plot(p_seq, e_seq, df_Germ.index, lab, ax)

plt.figtext(0.5, -0.06, 'Germany', horizontalalignment='center', fontsize=12)


plt.show()

p_seq = df_Germ['Price index (on basis of marks before July 1924, reichsmarks after)'].copy()

e_seq = 1/df_Germ['Cents per mark'].copy()

# adjust the price level/exchange rate after the currency reform


p_seq[p_seq.index > '06-01-1924'] = p_seq[p_seq.index > '06-01-1924'] * 1e12
e_seq[e_seq.index > '12-01-1923'] = e_seq[e_seq.index > '12-01-1923'] * 1e12

lab = ['Price Index (Marks or converted to Marks)', '1/Cents per Mark (or Reichsmark␣
↪converted to Mark)']

# create plot
fig, ax = plt.subplots(figsize=[10,7], dpi=200)
ax1 = create_pe_plot(p_seq, e_seq, df_Germ.index, lab, ax)

plt.figtext(0.5, -0.02, 'Germany', horizontalalignment='center', fontsize=12)


plt.show()


fig, ax = plt.subplots(figsize=[10,7], dpi=200)


_ = create_pr_plot(p_seq, df_Germ.index, ax)

plt.figtext(0.5, -0.02, 'Germany', horizontalalignment='center', fontsize=12)


plt.show()


11.3 Starting and Stopping Big Inflations

A striking thing about our four graphs is how quickly the (log) price levels in Austria, Hungary, Poland, and Germany
leveled off after having been rising so quickly.
These “sudden stops” are also revealed by the permanent drops in three-month moving averages of inflation for the four
countries.
In addition, the US dollar exchange rates for each of the four countries shadowed their price levels.
• This pattern is an instance of a force modeled in the purchasing power parity theory of exchange rates.
Each of these big inflations seemed to have “stopped on a dime”.
Chapter 3 of [Sar13] attempts to offer an explanation for this remarkable pattern.
In a nutshell, here is his story.
After World War I, the United States was on the gold standard. The US government stood ready to convert a dollar into
a specified amount of gold on demand. To understate things, immediately after the war, Hungary, Austria, Poland, and
Germany were not on the gold standard.
In practice, their currencies were largely “fiat” or “unbacked”, meaning that they were not backed by credible government
promises to convert them into gold or silver coins on demand. The governments of these countries resorted to the printing
of new unbacked money to finance government deficits. (The notes were “backed” mainly by treasury bills that, in those
times, could not be expected to be paid off by levying taxes, but only by printing more notes or treasury bills.) This was
done on such a scale that it led to a depreciation of the currencies of spectacular proportions. In the end, the German
mark stabilized at 1 trillion (10¹²) paper marks to the prewar gold mark, the Polish mark at 1.8 million paper marks to
the gold zloty, the Austrian crown at 14,400 paper crowns to the prewar Austro-Hungarian crown, and the Hungarian
krone at 14,500 paper crowns to the prewar Austro-Hungarian crown.
Chapter 3 of [Sar13] focuses on the deliberate changes in policy that Hungary, Austria, Poland, and Germany made to end
their hyperinflations. The hyperinflations were each ended by restoring or virtually restoring convertibility to the dollar
or equivalently to gold.
The story told in [Sar13] is grounded in a "fiscal theory of the price level" described in this lecture and further discussed
in this lecture.
Those lectures discuss theories about what holders of those rapidly depreciating currencies were thinking about them and
how that shaped responses of inflation to government policies.



CHAPTER

TWELVE

A FISCAL THEORY OF THE PRICE LEVEL

12.1 Introduction

As usual, we’ll start by importing some Python modules.

import numpy as np
from collections import namedtuple
import matplotlib.pyplot as plt

We’ll use linear algebra first to explain and then do some experiments with a “fiscal theory of the price level”.
A fiscal theory of the price level was described by Thomas Sargent and Neil Wallace in chapter 5 of [Sar13], which reprints a 1981 article titled "Unpleasant Monetarist Arithmetic".
Sometimes people call it a "monetary theory of the price level" because fiscal effects on the price level occur through the effects of government fiscal policy decisions on the path of the money supply.
• a government's fiscal policies determine whether its expenditures exceed its tax collections
• if its expenditures exceed its tax collections, it can cover the difference by printing money
• that leads to effects on the price level as the price level path adjusts to equate the supply of money to the demand for money
The theory has been extended, criticized, and applied by John Cochrane in [Coc23].
In another lecture price level histories, we described some European hyperinflations that occurred in the wake of World
War I.
Elemental forces at work in the fiscal theory of the price level help to understand those episodes.
According to this theory, when the government persistently spends more than it collects in taxes and prints money to
finance the shortfall (the “shortfall” is called the “government deficit”), it puts upward pressure on the price level and
generates persistent inflation.
The “fiscal theory of the price level” asserts that
• to start a persistent inflation the government simply persistently runs a money-financed government deficit
• to stop a persistent inflation the government simply stops persistently running a money-financed government deficit
Our model is a “rational expectations” (or “perfect foresight”) version of a model that Philip Cagan [Cag56] used to study
the monetary dynamics of hyperinflations.
While Cagan didn’t use that “rational expectations” version of the model, Thomas Sargent [Sar82] did when he studied
the Ends of Four Big Inflations in Europe after World War I.


• this lecture fiscal theory of the price level with adaptive expectations describes a version of the model that does
not impose “rational expectations” but instead uses what Cagan and his teacher Milton Friedman called “adaptive
expectations”
– a reader of both lectures will notice that the algebra is easier and more streamlined in the present rational
expectations version of the model
– this can be traced to the following source: the adaptive expectations version of the model has more endogenous
variables and more free parameters
Some of our quantitative experiments with our rational expectations version of the model are designed to illustrate how
the fiscal theory explains the abrupt end of those big inflations.
In those experiments, we'll encounter an instance of a "velocity dividend" that has sometimes accompanied successful inflation stabilization programs.
To facilitate using linear matrix algebra as our main mathematical tool, we’ll use a finite horizon version of the model.
As in the present values and consumption smoothing lectures, the only linear algebra that we’ll be using are matrix multi-
plication and matrix inversion.

12.2 Structure of the model

The model consists of


• a function that expresses the demand for real balances of government printed money as an inverse function of the
public’s expected rate of inflation
• an exogenous sequence of rates of growth of the money supply. The money supply grows because the government
is printing it to finance some of its expenditures
• an equilibrium condition that equates the demand for money to the supply
• a “perfect foresight” assumption that the public’s expected rate of inflation equals the actual rate of inflation.
To represent the model formally, let
• 𝑚𝑡 be the log of the supply of nominal money balances;
• 𝜇𝑡 = 𝑚𝑡+1 − 𝑚𝑡 be the net rate of growth of nominal balances;
• 𝑝𝑡 be the log of the price level;
• 𝜋𝑡 = 𝑝𝑡+1 − 𝑝𝑡 be the net rate of inflation between 𝑡 and 𝑡 + 1;
• 𝜋𝑡∗ be the public’s expected rate of inflation between 𝑡 and 𝑡 + 1;
• 𝑇 the horizon – i.e., the last period for which the model will determine 𝑝𝑡
• 𝜋𝑇∗ +1 the terminal rate of inflation between times 𝑇 and 𝑇 + 1.
The demand for real balances $\exp(m_t^d - p_t)$ is governed by the following version of the Cagan demand function

$$m_t^d - p_t = -\alpha \pi_t^*, \quad \alpha > 0; \quad t = 0, 1, \ldots, T. \tag{12.1}$$

This equation asserts that the demand for real balances is inversely related to the public’s expected rate of inflation.
People somehow acquire perfect foresight by their having solved a forecasting problem.
This lets us set

𝜋𝑡∗ = 𝜋𝑡 , (12.2)


while equating demand for money to supply lets us set 𝑚𝑑𝑡 = 𝑚𝑡 for all 𝑡 ≥ 0.
The preceding equations then imply

$$m_t - p_t = -\alpha(p_{t+1} - p_t), \quad \alpha > 0 \tag{12.3}$$
To fill in details about what it means for private agents to have perfect foresight, we subtract equation (12.3) at time 𝑡 from the same equation at time 𝑡 + 1 to get

$$\mu_t - \pi_t = -\alpha \pi_{t+1} + \alpha \pi_t,$$

which we rewrite as a forward-looking first-order linear difference equation in 𝜋𝑠 with 𝜇𝑠 as a "forcing variable":

$$\pi_t = \frac{\alpha}{1+\alpha}\pi_{t+1} + \frac{1}{1+\alpha}\mu_t, \quad t = 0, 1, \ldots, T$$

where $0 < \frac{\alpha}{1+\alpha} < 1$.

Setting $\delta = \frac{\alpha}{1+\alpha}$ lets us represent the preceding equation as

$$\pi_t = \delta \pi_{t+1} + (1-\delta)\mu_t, \quad t = 0, 1, \ldots, T$$
Write this system of 𝑇 + 1 equations as the single matrix equation

$$\begin{bmatrix}
1 & -\delta & 0 & 0 & \cdots & 0 & 0 \\
0 & 1 & -\delta & 0 & \cdots & 0 & 0 \\
0 & 0 & 1 & -\delta & \cdots & 0 & 0 \\
\vdots & \vdots & \vdots & \vdots & \ddots & \vdots & \vdots \\
0 & 0 & 0 & 0 & \cdots & 1 & -\delta \\
0 & 0 & 0 & 0 & \cdots & 0 & 1
\end{bmatrix}
\begin{bmatrix} \pi_0 \\ \pi_1 \\ \pi_2 \\ \vdots \\ \pi_{T-1} \\ \pi_T \end{bmatrix}
= (1-\delta)
\begin{bmatrix} \mu_0 \\ \mu_1 \\ \mu_2 \\ \vdots \\ \mu_{T-1} \\ \mu_T \end{bmatrix}
+
\begin{bmatrix} 0 \\ 0 \\ 0 \\ \vdots \\ 0 \\ \delta \pi_{T+1}^* \end{bmatrix} \tag{12.4}$$
By multiplying both sides of equation (12.4) by the inverse of the matrix on the left side, we can calculate

$$\pi \equiv \begin{bmatrix} \pi_0 \\ \pi_1 \\ \pi_2 \\ \vdots \\ \pi_{T-1} \\ \pi_T \end{bmatrix}$$

It turns out that

$$\pi_t = (1-\delta)\sum_{s=t}^{T} \delta^{s-t}\mu_s + \delta^{T+1-t}\pi_{T+1}^* \tag{12.5}$$
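
As a quick sanity check (our own illustration, not part of the lecture), one can verify numerically for a small horizon that inverting the matrix in equation (12.4) reproduces the closed form (12.5):

import numpy as np

T, δ, π_end = 5, 0.8, 0.1
μ = np.linspace(0.5, 0.0, T+1)

# matrix form (12.4)
A = np.eye(T+1) - δ * np.eye(T+1, k=1)
b = (1 - δ) * μ
b[-1] += δ * π_end
π_matrix = np.linalg.solve(A, b)

# closed form (12.5)
π_closed = np.array([(1 - δ) * sum(δ**(s - t) * μ[s] for s in range(t, T+1))
                     + δ**(T + 1 - t) * π_end
                     for t in range(T+1)])

assert np.allclose(π_matrix, π_closed)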

We can represent the equations

$$m_{t+1} = m_t + \mu_t, \quad t = 0, 1, \ldots, T$$

as the matrix equation

$$\begin{bmatrix}
1 & 0 & 0 & \cdots & 0 & 0 \\
-1 & 1 & 0 & \cdots & 0 & 0 \\
0 & -1 & 1 & \cdots & 0 & 0 \\
\vdots & \vdots & \vdots & \ddots & \vdots & \vdots \\
0 & 0 & 0 & \cdots & 1 & 0 \\
0 & 0 & 0 & \cdots & -1 & 1
\end{bmatrix}
\begin{bmatrix} m_1 \\ m_2 \\ m_3 \\ \vdots \\ m_T \\ m_{T+1} \end{bmatrix}
=
\begin{bmatrix} \mu_0 \\ \mu_1 \\ \mu_2 \\ \vdots \\ \mu_{T-1} \\ \mu_T \end{bmatrix}
+
\begin{bmatrix} m_0 \\ 0 \\ 0 \\ \vdots \\ 0 \\ 0 \end{bmatrix} \tag{12.6}$$

Multiplying both sides of equation (12.6) by the inverse of the matrix on the left will give

$$m_t = m_0 + \sum_{s=0}^{t-1} \mu_s, \quad t = 1, \ldots, T+1 \tag{12.7}$$

Equation (12.7) shows that the log of the money supply at 𝑡 equals the log of the initial money supply 𝑚0 plus the accumulation of rates of money growth between times 0 and 𝑡.
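
Readers who prefer to avoid the matrix inversion can confirm equation (12.7) with a cumulative sum; a small sketch of ours:

import numpy as np

m0, μ_seq = 1.0, np.full(5, 0.5)                          # any initial log money stock and growth path
m_seq = m0 + np.concatenate([[0.0], np.cumsum(μ_seq)])    # m_0, m_1, ..., m_{T+1}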


12.3 Continuation values

To determine the continuation inflation rate 𝜋𝑇∗ +1 we shall proceed by applying the following infinite-horizon version of
equation (12.5) at time 𝑡 = 𝑇 + 1:

$$\pi_t = (1-\delta)\sum_{s=t}^{\infty}\delta^{s-t}\mu_s, \tag{12.8}$$

and by also assuming the following continuation path for 𝜇𝑡 beyond 𝑇 :

𝜇𝑡+1 = 𝛾 ∗ 𝜇𝑡 , 𝑡 ≥ 𝑇.

Plugging the preceding equation into equation (12.8) at 𝑡 = 𝑇 + 1 and rearranging we can deduce that

$$\pi_{T+1}^* = \frac{1-\delta}{1-\delta\gamma^*}\,\gamma^*\mu_T \tag{12.9}$$

where we require that |𝛾 ∗ 𝛿| < 1.


Let’s implement and solve this model.
First, we store parameters in a namedtuple:

# Create the rational expectation version of Cagan model in finite time


CaganREE = namedtuple("CaganREE",
["m0", "T", "μ_seq", "α", "δ", "π_end"])

def create_cagan_model(m0, α, T, μ_seq):


δ = α/(1 + α)
π_end = μ_seq[-1] # compute terminal expected inflation
return CaganREE(m0, T, μ_seq, α, δ, π_end)
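
Note that setting π_end = μ_seq[-1] specializes formula (12.9) to the case 𝛾∗ = 1 used in the experiments below. For a general 𝛾∗, the terminal condition could instead be computed directly from (12.9); here is a small sketch (the function name is ours, not part of the lecture's code):

def terminal_inflation(μ_T, δ, γ_star):
    # π*_{T+1} from formula (12.9); requires |δ γ*| < 1
    assert abs(δ * γ_star) < 1
    return (1 - δ) * γ_star * μ_T / (1 - δ * γ_star)

# with γ* = 1 this reduces to μ_T, matching π_end = μ_seq[-1] above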

Here we use the following parameter values:

# parameters
T = 80
T1 = 60
α = 5
m0 = 1

μ0 = 0.5
μ_star = 0

Now we can solve the model to compute 𝜋𝑡 , 𝑚𝑡 and 𝑝𝑡 for 𝑡 = 1, … , 𝑇 + 1 using the matrix equation above

def solve(model):
model_params = model.m0, model.T, model.π_end, model.μ_seq, model.α, model.δ
m0, T, π_end, μ_seq, α, δ = model_params
A1 = np.eye(T+1, T+1) - δ * np.eye(T+1, T+1, k=1)
A2 = np.eye(T+1, T+1) - np.eye(T+1, T+1, k=-1)

b1 = (1-δ) * μ_seq + np.concatenate([np.zeros(T), [δ * π_end]])


b2 = μ_seq + np.concatenate([[m0], np.zeros(T)])

π_seq = np.linalg.inv(A1) @ b1
m_seq = np.linalg.inv(A2) @ b2
π_seq = np.append(π_seq, π_end)


m_seq = np.append(m0, m_seq)

p_seq = m_seq + α * π_seq

return π_seq, m_seq, p_seq
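
A small implementation note (ours, not the lecture's): for larger T it is usually preferable to solve the linear systems directly rather than forming matrix inverses explicitly. The two lines that use np.linalg.inv could, for example, be replaced by

π_seq = np.linalg.solve(A1, b1)
m_seq = np.linalg.solve(A2, b2)

which compute the same solutions.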

12.3.1 Some quantitative experiments

In the experiments below, we’ll use formula (12.9) as our terminal condition for expected inflation.
In devising these experiments, we’ll make assumptions about {𝜇𝑡 } that are consistent with formula (12.9).
We describe several such experiments.
In all of them,

$$\mu_t = \mu^*, \quad t \geq T_1$$

so that, in terms of our notation and formula for $\pi_{T+1}^*$ above, $\gamma^* = 1$.

Experiment 1: Foreseen sudden stabilization

In this experiment, we'll study how, when 𝛼 > 0, a foreseen inflation stabilization has effects on inflation that precede it.
We’ll study a situation in which the rate of growth of the money supply is 𝜇0 from 𝑡 = 0 to 𝑡 = 𝑇1 and then permanently
falls to 𝜇∗ at 𝑡 = 𝑇1 .
Thus, let 𝑇1 ∈ (0, 𝑇 ).
So where 𝜇0 > 𝜇∗ , we assume that

$$\mu_{t+1} = \begin{cases} \mu_0, & t = 0, \ldots, T_1 - 1 \\ \mu^*, & t \geq T_1 \end{cases}$$

We’ll start by executing a version of our “experiment 1” in which the government implements a foreseen sudden permanent
reduction in the rate of money creation at time 𝑇1 .
The following code performs the experiment and plots outcomes.

def solve_and_plot(m0, α, T, μ_seq):


model_params = create_cagan_model(m0=m0, α=α, T=T, μ_seq=μ_seq)
π_seq, m_seq, p_seq = solve(model_params)
T_seq = range(T + 2)

fig, ax = plt.subplots(5, figsize=[5, 12], dpi=200)

ax[0].plot(T_seq[:-1], μ_seq)
ax[0].set_ylabel(r'$\mu$')

ax[1].plot(T_seq, π_seq)
ax[1].set_ylabel(r'$\pi$')

ax[2].plot(T_seq, m_seq - p_seq)


ax[2].set_ylabel(r'$m - p$')

ax[3].plot(T_seq, m_seq)
ax[3].set_ylabel(r'$m$')

ax[4].plot(T_seq, p_seq)
ax[4].set_ylabel(r'$p$')

for i in range(5):
ax[i].set_xlabel(r'$t$')

plt.tight_layout()
plt.show()

return π_seq, m_seq, p_seq

μ_seq_1 = np.append(μ0*np.ones(T1+1), μ_star*np.ones(T-T1))

# solve and plot


π_seq_1, m_seq_1, p_seq_1 = solve_and_plot(m0=m0, α=α,
T=T, μ_seq=μ_seq_1)


The plot of the money growth rate 𝜇𝑡 in the top panel portrays a sudden reduction from .5 to 0 at time 𝑇1 = 60.
This brings about a gradual reduction of the inflation rate 𝜋𝑡 that precedes the money supply growth rate reduction at time
𝑇1 .
Notice how the inflation rate declines smoothly (i.e., continuously) to 0 at 𝑇1 – unlike the money growth rate, it does not
suddenly “jump” downward at 𝑇1 .
This is because the reduction in 𝜇 at 𝑇1 has been foreseen from the start.
While the log money supply portrayed in the bottom panel has a kink at 𝑇1 , the log price level does not – it is “smooth”
– once again a consequence of the fact that the reduction in 𝜇 has been foreseen.
To set the stage for our next experiment, we want to study the determinants of the price level a little more.

12.3.2 The log price level

We can use equations (12.1) and (12.2) to discover that the log of the price level satisfies

𝑝𝑡 = 𝑚𝑡 + 𝛼𝜋𝑡 (12.10)

or, by using equation (12.5),


$$p_t = m_t + \alpha\left[(1-\delta)\sum_{s=t}^{T}\delta^{s-t}\mu_s + \delta^{T+1-t}\pi_{T+1}^*\right] \tag{12.11}$$

In our next experiment, we’ll study a “surprise” permanent change in the money growth that beforehand was completely
unanticipated.
At time 𝑇1 when the "surprise" money growth rate change occurs, to satisfy equation (12.10), the log of real balances jumps upward as 𝜋𝑡 jumps downward.
But in order for 𝑚𝑡 − 𝑝𝑡 to jump, which variable jumps, 𝑚𝑇1 or 𝑝𝑇1 ?

12.3.3 What jumps?

What jumps at 𝑇1 ?
Is it 𝑝𝑇1 or 𝑚𝑇1 ?
If we insist that the money supply 𝑚𝑇1 is locked at its value 𝑚1𝑇1 inherited from the past, then formula (12.10) implies that the price level jumps downward at time 𝑇1, to coincide with the downward jump in 𝜋𝑇1.
An alternative assumption about the money supply level is that as part of the “inflation stabilization”, the government
resets 𝑚𝑇1 according to

$$m_{T_1}^2 - m_{T_1}^1 = \alpha(\pi^1 - \pi^2) \tag{12.12}$$

By letting money jump according to equation (12.12) the monetary authority prevents the price level from falling at the
moment that the unanticipated stabilization arrives.
In various research papers about stabilizations of high inflations, the jump in the money supply described by equation
(12.12) has been called “the velocity dividend” that a government reaps from implementing a regime change that sustains
a permanently lower inflation rate.


Technical details about whether 𝑝 or 𝑚 jumps at 𝑇1

We have noted that with a constant expected forward sequence $\mu_s = \bar{\mu}$ for $s \geq t$, $\pi_t = \bar{\mu}$.
A consequence is that either $m$ or $p$ must "jump" at $T_1$.
We’ll study both cases.

𝑚𝑇1 does not jump.

$$\begin{aligned} m_{T_1} &= m_{T_1 - 1} + \mu_0 \\ \pi_{T_1} &= \mu^* \\ p_{T_1} &= m_{T_1} + \alpha \pi_{T_1} \end{aligned}$$

Simply glue the sequences for 𝑡 ≤ 𝑇1 and 𝑡 > 𝑇1.

𝑚𝑇1 jumps.

We reset 𝑚𝑇1 so that 𝑝𝑇1 = (𝑚𝑇1 −1 + 𝜇0 ) + 𝛼𝜇0 , with 𝜋𝑇1 = 𝜇∗ .


Then,

$$m_{T_1} = p_{T_1} - \alpha\pi_{T_1} = (m_{T_1 - 1} + \mu_0) + \alpha(\mu_0 - \mu^*)$$

We then compute for the remaining 𝑇 − 𝑇1 periods with 𝜇𝑠 = 𝜇∗ , ∀𝑠 ≥ 𝑇1 and the initial condition 𝑚𝑇1 from above.
We are now technically equipped to discuss our next experiment.

Experiment 2: an unforeseen sudden stabilization

This experiment deviates a little bit from a pure version of our “perfect foresight” assumption by assuming that a sudden
permanent reduction in 𝜇𝑡 like that analyzed in experiment 1 is completely unanticipated.
Such a completely unanticipated shock is popularly known as an “MIT shock”.
The mental experiment involves switching at time 𝑇1 from an initial “continuation path” for {𝜇𝑡 , 𝜋𝑡 } to another path that
involves a permanently lower inflation rate.
Initial Path: $\mu_t = \mu_0$ for all $t \geq 0$. So this path is for $\{\mu_t\}_{t=0}^{\infty}$; the associated path for $\pi_t$ has $\pi_t = \mu_0$.

Revised Continuation Path: Where $\mu_0 > \mu^*$, we construct a continuation path $\{\mu_s\}_{s=T_1}^{\infty}$ by setting $\mu_s = \mu^*$ for all $s \geq T_1$. The perfect foresight continuation path for $\pi$ is $\pi_s = \mu^*$.

To capture a "completely unanticipated permanent shock to the $\{\mu\}$ process" at time $T_1$, we simply glue the $\mu_t, \pi_t$ that emerge under path 2 for $t \geq T_1$ to the $\mu_t, \pi_t$ path that had emerged under path 1 for $t = 0, \ldots, T_1 - 1$.
We can do the MIT shock calculations mostly by hand.
Thus, for path 1, 𝜋𝑡 = 𝜇0 for all 𝑡 ∈ [0, 𝑇1 − 1], while for path 2, 𝜇𝑠 = 𝜇∗ for all 𝑠 ≥ 𝑇1 .
We now move on to experiment 2, our “MIT shock”, completely unforeseen sudden stabilization.
We set this up so that the {𝜇𝑡 } sequences that describe the sudden stabilization are identical to those for experiment 1,
the foreseen sudden stabilization.
The following code does the calculations and plots outcomes.


# path 1
μ_seq_2_path1 = μ0 * np.ones(T+1)

mc1 = create_cagan_model(m0=m0, α=α,


T=T, μ_seq=μ_seq_2_path1)
π_seq_2_path1, m_seq_2_path1, p_seq_2_path1 = solve(mc1)

# continuation path
μ_seq_2_cont = μ_star * np.ones(T-T1)

mc2 = create_cagan_model(m0=m_seq_2_path1[T1+1],
α=α, T=T-1-T1, μ_seq=μ_seq_2_cont)
π_seq_2_cont, m_seq_2_cont1, p_seq_2_cont1 = solve(mc2)

# regime 1 - simply glue π_seq, μ_seq


μ_seq_2 = np.concatenate([μ_seq_2_path1[:T1+1],
μ_seq_2_cont])
π_seq_2 = np.concatenate([π_seq_2_path1[:T1+1],
π_seq_2_cont])
m_seq_2_regime1 = np.concatenate([m_seq_2_path1[:T1+1],
m_seq_2_cont1])
p_seq_2_regime1 = np.concatenate([p_seq_2_path1[:T1+1],
p_seq_2_cont1])

# regime 2 - reset m_T1


m_T1 = (m_seq_2_path1[T1] + μ0) + α*(μ0 - μ_star)

mc = create_cagan_model(m0=m_T1, α=α,
T=T-1-T1, μ_seq=μ_seq_2_cont)
π_seq_2_cont2, m_seq_2_cont2, p_seq_2_cont2 = solve(mc)

m_seq_2_regime2 = np.concatenate([m_seq_2_path1[:T1+1],
m_seq_2_cont2])
p_seq_2_regime2 = np.concatenate([p_seq_2_path1[:T1+1],
p_seq_2_cont2])

T_seq = range(T+2)

# plot both regimes


fig, ax = plt.subplots(5, 1, figsize=[5, 12], dpi=200)

ax[0].plot(T_seq[:-1], μ_seq_2)
ax[0].set_ylabel(r'$\mu$')

ax[1].plot(T_seq, π_seq_2)
ax[1].set_ylabel(r'$\pi$')

ax[2].plot(T_seq, m_seq_2_regime1 - p_seq_2_regime1)


ax[2].set_ylabel(r'$m - p$')

ax[3].plot(T_seq, m_seq_2_regime1,
label='Smooth $m_{T_1}$')
ax[3].plot(T_seq, m_seq_2_regime2,
label='Jumpy $m_{T_1}$')
ax[3].set_ylabel(r'$m$')

ax[4].plot(T_seq, p_seq_2_regime1,
label='Smooth $m_{T_1}$')
ax[4].plot(T_seq, p_seq_2_regime2,
label='Jumpy $m_{T_1}$')
ax[4].set_ylabel(r'$p$')

for i in range(5):
ax[i].set_xlabel(r'$t$')

for i in [3, 4]:


ax[i].legend()

plt.tight_layout()
plt.show()


We invite you to compare these graphs with corresponding ones for the foreseen stabilization analyzed in experiment 1
above.
Note how the inflation graph in the second panel is now identical to the money growth graph in the top panel, and how the log of real balances portrayed in the third panel jumps upward at time 𝑇1.
The bottom two panels plot 𝑚 and 𝑝 under two possible ways that 𝑚𝑇1 might adjust as required by the upward jump in 𝑚 − 𝑝 at 𝑇1.
• the orange line lets 𝑚𝑇1 jump upward in order to make sure that the log price level 𝑝𝑇1 does not fall.
• the blue line lets 𝑝𝑇1 fall while stopping the money supply from jumping.
Here is a way to interpret what the government is doing when the orange line policy is in place.
The government prints money to finance expenditure with the “velocity dividend” that it reaps from the increased demand
for real balances brought about by the permanent decrease in the rate of growth of the money supply.
The next code generates a multi-panel graph that includes outcomes of both experiments 1 and 2.
That allows us to assess how important it is to understand whether the sudden permanent drop in 𝜇𝑡 at 𝑡 = 𝑇1 is fully anticipated, as in experiment 1, or completely unanticipated, as in experiment 2.

# compare foreseen vs unforeseen shock


fig, ax = plt.subplots(5, figsize=[5, 12], dpi=200)

ax[0].plot(T_seq[:-1], μ_seq_2)
ax[0].set_ylabel(r'$\mu$')

ax[1].plot(T_seq, π_seq_2,
label='Unforeseen')
ax[1].plot(T_seq, π_seq_1,
label='Foreseen', color='tab:green')
ax[1].set_ylabel(r'$\pi$')

ax[2].plot(T_seq,
m_seq_2_regime1 - p_seq_2_regime1,
label='Unforeseen')
ax[2].plot(T_seq, m_seq_1 - p_seq_1,
label='Foreseen', color='tab:green')
ax[2].set_ylabel(r'$m - p$')

ax[3].plot(T_seq, m_seq_2_regime1,
label=r'Unforeseen (Smooth $m_{T_1}$)')
ax[3].plot(T_seq, m_seq_2_regime2,
label=r'Unforeseen ($m_{T_1}$ jumps)')
ax[3].plot(T_seq, m_seq_1,
label='Foreseen shock')
ax[3].set_ylabel(r'$m$')

ax[4].plot(T_seq, p_seq_2_regime1,
label=r'Unforeseen (Smooth $m_{T_1}$)')
ax[4].plot(T_seq, p_seq_2_regime2,
label=r'Unforeseen ($m_{T_1}$ jumps)')
ax[4].plot(T_seq, p_seq_1,
label='Foreseen')
ax[4].set_ylabel(r'$p$')

for i in range(5):
ax[i].set_xlabel(r'$t$')
for i in range(1, 5):


ax[i].legend()

plt.tight_layout()
plt.show()


It is instructive to compare the preceding graphs with graphs of log price levels and inflation rates for data from four big
inflations described in this lecture.
In particular, in the above graphs, notice how a gradual fall in inflation precedes the “sudden stop” when it has been
anticipated long beforehand, but how inflation instead falls abruptly when the permanent drop in money supply growth is
unanticipated.
It seems to the author team at quantecon that the drops in inflation near the ends of the four hyperinflations described in
this lecture more closely resemble outcomes from the experiment 2 “unforeseen stabilization”.
(It is fair to say that the preceding informal pattern recognition exercise should be supplemented with a more formal
structural statistical analysis.)

Experiment 3: Foreseen gradual stabilization


Instead of a foreseen sudden stabilization of the type studied with experiment 1, it is also interesting to study the consequences of a foreseen gradual stabilization.
Thus, suppose that 𝜙 ∈ (0, 1), that 𝜇0 > 𝜇∗ , and that for 𝑡 = 0, … , 𝑇 − 1

𝜇𝑡 = 𝜙𝑡 𝜇0 + (1 − 𝜙𝑡 )𝜇∗ .

Next we perform an experiment in which there is a perfectly foreseen gradual decrease in the rate of growth of the money
supply.
The following code does the calculations and plots the results.

# parameters
ϕ = 0.9
μ_seq = np.array([ϕ**t * μ0 + (1-ϕ**t)*μ_star for t in range(T)])
μ_seq = np.append(μ_seq, μ_star)

# solve and plot


π_seq, m_seq, p_seq = solve_and_plot(m0=m0, α=α, T=T, μ_seq=μ_seq)


12.4 Sequel

This lecture fiscal theory of the price level with adaptive expectations describes an “adaptive expectations” version of
Cagan’s model.
The dynamics become more complicated and so does the algebra.
Nowadays, the "rational expectations" version of the model is more popular among central bankers and economists advising them.



CHAPTER

THIRTEEN

A FISCAL THEORY OF PRICE LEVEL WITH ADAPTIVE EXPECTATIONS

13.1 Introduction

As usual, we’ll start by importing some Python modules.

import numpy as np
from collections import namedtuple
import matplotlib.pyplot as plt

This lecture is a sequel or prequel to this lecture fiscal theory of the price level.
We’ll use linear algebra to do some experiments with an alternative “fiscal theory of the price level”.
Like the model in this lecture fiscal theory of the price level, the model asserts that when a government persistently spends
more than it collects in taxes and prints money to finance the shortfall, it puts upward pressure on the price level and
generates persistent inflation.
Instead of the “perfect foresight” or “rational expectations” version of the model in this lecture fiscal theory of the price
level, our model in the present lecture is an “adaptive expectations” version of a model that Philip Cagan [Cag56] used to
study the monetary dynamics of hyperinflations.
It combines these components:
• a demand function for real money balances that asserts that the logarithm of the quantity of real balances demanded
depends inversely on the public’s expected rate of inflation
• an adaptive expectations model that describes how the public’s anticipated rate of inflation responds to past values
of actual inflation
• an equilibrium condition that equates the demand for money to the supply
• an exogenous sequence of rates of growth of the money supply
Our model stays quite close to Cagan’s original specification.
As in the present values and consumption smoothing lectures, the only linear algebra operations that we’ll be using are
matrix multiplication and matrix inversion.
To facilitate using linear matrix algebra as our principal mathematical tool, we’ll use a finite horizon version of the model.


13.2 Structure of the model

Let
• 𝑚𝑡 be the log of the supply of nominal money balances;
• 𝜇𝑡 = 𝑚𝑡+1 − 𝑚𝑡 be the net rate of growth of nominal balances;
• 𝑝𝑡 be the log of the price level;
• 𝜋𝑡 = 𝑝𝑡+1 − 𝑝𝑡 be the net rate of inflation between 𝑡 and 𝑡 + 1;
• 𝜋𝑡∗ be the public’s expected rate of inflation between 𝑡 and 𝑡 + 1;
• 𝑇 the horizon – i.e., the last period for which the model will determine 𝑝𝑡
• 𝜋0∗ public’s initial expected rate of inflation between time 0 and time 1.
The demand for real balances $\exp(m_t^d - p_t)$ is governed by the following version of the Cagan demand function

$$m_t^d - p_t = -\alpha \pi_t^*, \quad \alpha > 0; \quad t = 0, 1, \ldots, T. \tag{13.1}$$

This equation asserts that the demand for real balances is inversely related to the public’s expected rate of inflation.
Equating the logarithm 𝑚𝑑𝑡 of the demand for money to the logarithm 𝑚𝑡 of the supply of money in equation (13.1) and
solving for the logarithm 𝑝𝑡 of the price level gives

𝑝𝑡 = 𝑚𝑡 + 𝛼𝜋𝑡∗ (13.2)

Taking the difference between equation (13.2) at time 𝑡 + 1 and at time 𝑡 gives

𝜋𝑡 = 𝜇𝑡 + 𝛼𝜋𝑡+1 − 𝛼𝜋𝑡∗ (13.3)

We assume that the expected rate of inflation 𝜋𝑡∗ is governed by the Friedman-Cagan adaptive expectations scheme

$$\pi_{t+1}^* = \lambda \pi_t^* + (1-\lambda)\pi_t \tag{13.4}$$
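
In code, one step of the adaptive expectations scheme (13.4) is just a convex combination of the previous forecast and realized inflation; a one-line sketch (ours):

def update_expectation(π_star, π, λ):
    # equation (13.4): next period's expected inflation
    return λ * π_star + (1 - λ) * π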

As exogenous inputs into the model, we take initial conditions 𝑚0 , 𝜋0∗ and a money growth sequence 𝜇 = {𝜇𝑡 }𝑇𝑡=0 .
As endogenous outputs of our model we want to find sequences $\pi = \{\pi_t\}_{t=0}^T$, $p = \{p_t\}_{t=0}^T$ as functions of the exogenous inputs.
We’ll do some mental experiments by studying how the model outputs vary as we vary the model inputs.

13.3 Representing key equations with linear algebra

We begin by writing the equation (13.4) adaptive expectations model for 𝜋𝑡∗ for 𝑡 = 0, … , 𝑇 as

$$\begin{bmatrix}
1 & 0 & 0 & \cdots & 0 & 0 \\
-\lambda & 1 & 0 & \cdots & 0 & 0 \\
0 & -\lambda & 1 & \cdots & 0 & 0 \\
\vdots & \vdots & \vdots & \ddots & \vdots & \vdots \\
0 & 0 & 0 & \cdots & -\lambda & 1
\end{bmatrix}
\begin{bmatrix} \pi_0^* \\ \pi_1^* \\ \pi_2^* \\ \vdots \\ \pi_{T+1}^* \end{bmatrix}
= (1-\lambda)
\begin{bmatrix}
0 & 0 & 0 & \cdots & 0 \\
1 & 0 & 0 & \cdots & 0 \\
0 & 1 & 0 & \cdots & 0 \\
\vdots & \vdots & \vdots & \ddots & \vdots \\
0 & 0 & 0 & \cdots & 1
\end{bmatrix}
\begin{bmatrix} \pi_0 \\ \pi_1 \\ \pi_2 \\ \vdots \\ \pi_T \end{bmatrix}
+
\begin{bmatrix} \pi_0^* \\ 0 \\ 0 \\ \vdots \\ 0 \end{bmatrix}$$

Write this equation as

$$A\pi^* = (1-\lambda)B\pi + \pi_0^* \tag{13.5}$$


where the (𝑇 + 2) × (𝑇 + 2)matrix 𝐴, the (𝑇 + 2) × (𝑇 + 1) matrix 𝐵, and the vectors 𝜋∗ , 𝜋0 , 𝜋0∗ are defined implicitly
by aligning these two equations.
Next we write the key equation (13.3) in matrix notation as
$$
\begin{bmatrix}
\pi_0 \\ \pi_1 \\ \pi_2 \\ \vdots \\ \pi_T
\end{bmatrix}
=
\begin{bmatrix}
\mu_0 \\ \mu_1 \\ \mu_2 \\ \vdots \\ \mu_T
\end{bmatrix}
+
\begin{bmatrix}
-\alpha & \alpha & 0 & \cdots & 0 & 0 \\
0 & -\alpha & \alpha & \cdots & 0 & 0 \\
0 & 0 & -\alpha & \cdots & 0 & 0 \\
\vdots & \vdots & \vdots & \cdots & \alpha & 0 \\
0 & 0 & 0 & \cdots & -\alpha & \alpha
\end{bmatrix}
\begin{bmatrix}
\pi_0^* \\ \pi_1^* \\ \pi_2^* \\ \vdots \\ \pi_{T+1}^*
\end{bmatrix}
$$
Represent the previous equation system in terms of vectors and matrices as

𝜋 = 𝜇 + 𝐶𝜋∗ (13.6)

where the (𝑇 + 1) × (𝑇 + 2) matrix 𝐶 is defined implicitly to align this equation with the preceding equation system.

13.4 Harvesting returns from our matrix formulation

We now have all of the ingredients we need to solve for 𝜋 as a function of 𝜇, 𝜋0 , 𝜋0∗ .
Combine equations (13.5) and (13.6) to get

$$
\begin{aligned}
A \pi^* & = (1-\lambda) B \pi + \pi_0^* \\
& = (1-\lambda) B \left[ \mu + C \pi^* \right] + \pi_0^*
\end{aligned}
$$
which implies that

[𝐴 − (1 − 𝜆)𝐵𝐶] 𝜋∗ = (1 − 𝜆)𝐵𝜇 + 𝜋0∗

Multiplying both sides of the above equation by the inverse of the matrix on the left side gives
$$
\pi^* = \left[ A - (1-\lambda) B C \right]^{-1} \left[ (1-\lambda) B \mu + \pi_0^* \right] \tag{13.7}
$$

Having solved equation (13.7) for 𝜋∗ , we can use equation (13.6) to solve for 𝜋:

𝜋 = 𝜇 + 𝐶𝜋∗

We have thus solved for two of the key endogenous time series determined by our model, namely, the sequence 𝜋∗ of
expected inflation rates and the sequence 𝜋 of actual inflation rates.
Knowing these, we can then quickly calculate the associated sequence 𝑝 of the logarithm of the price level from equation
(13.2).
Let’s fill in the details for this step.
Since we now know 𝜇 it is easy to compute 𝑚.
Thus, notice that we can represent the equations

𝑚𝑡+1 = 𝑚𝑡 + 𝜇𝑡 , 𝑡 = 0, 1, … , 𝑇

as the matrix equation


$$
\begin{bmatrix}
1 & 0 & 0 & \cdots & 0 & 0 \\
-1 & 1 & 0 & \cdots & 0 & 0 \\
0 & -1 & 1 & \cdots & 0 & 0 \\
\vdots & \vdots & \vdots & \ddots & 0 & 0 \\
0 & 0 & 0 & \cdots & 1 & 0 \\
0 & 0 & 0 & \cdots & -1 & 1
\end{bmatrix}
\begin{bmatrix}
m_1 \\ m_2 \\ m_3 \\ \vdots \\ m_T \\ m_{T+1}
\end{bmatrix}
=
\begin{bmatrix}
\mu_0 \\ \mu_1 \\ \mu_2 \\ \vdots \\ \mu_{T-1} \\ \mu_T
\end{bmatrix}
+
\begin{bmatrix}
m_0 \\ 0 \\ 0 \\ \vdots \\ 0 \\ 0
\end{bmatrix}
\tag{13.8}
$$


Multiplying both sides of equation (13.8) with the inverse of the matrix on the left will give
$$
m_t = m_0 + \sum_{s=0}^{t-1} \mu_s, \quad t = 1, \ldots, T+1 \tag{13.9}
$$

Equation (13.9) shows that the log of the money supply at 𝑡 equals the log 𝑚0 of the initial money supply plus accumulation
of rates of money growth between times 0 and 𝑡.
We can then compute 𝑝𝑡 for each 𝑡 from equation (13.2).
We can write a compact formula for 𝑝 as

$$
p = m + \alpha \hat{\pi}^*
$$

where
$$
\hat{\pi}^* = \begin{bmatrix} \pi_0^* \\ \pi_1^* \\ \pi_2^* \\ \vdots \\ \pi_T^* \end{bmatrix},
$$
which is just 𝜋∗ with the last element dropped.

13.5 Forecast errors

Our computations will verify that

𝜋̂∗ ≠ 𝜋,

so that in general

𝜋𝑡∗ ≠ 𝜋𝑡 , 𝑡 = 0, 1, … , 𝑇 (13.10)

This outcome is typical in models in which an adaptive expectations hypothesis like equation (13.4) appears as a component.
In the lecture on the fiscal theory of the price level, we studied a version of the model that replaces hypothesis (13.4) with a
"perfect foresight" or "rational expectations" hypothesis.

Cagan_Adaptive = namedtuple("Cagan_Adaptive",
["α", "m0", "Eπ0", "T", "λ"])

def create_cagan_model(α, m0, Eπ0, T, λ):


return Cagan_Adaptive(α, m0, Eπ0, T, λ)

Here we define the parameters.

# parameters
T = 80
T1 = 60
α = 5
λ = 0.9 # 0.7
m0 = 1

μ0 = 0.5
μ_star = 0

md = create_cagan_model(α=α, m0=m0, Eπ0=μ0, T=T, λ=λ)


We solve the model and plot variables of interests using the following functions.

def solve(model, μ_seq):


" Solve the Cagan model in finite time. "

model_params = model.α, model.m0, model.Eπ0, model.T, model.λ


α, m0, Eπ0, T, λ = model_params

A = np.eye(T+2, T+2) - λ*np.eye(T+2, T+2, k=-1)


B = np.eye(T+2, T+1, k=-1)
C = -α*np.eye(T+1, T+2) + α*np.eye(T+1, T+2, k=1)
Eπ0_seq = np.append(Eπ0, np.zeros(T+1))

# Eπ_seq is of length T+2


Eπ_seq = np.linalg.inv(A - (1-λ)*B @ C) @ ((1-λ) * B @ μ_seq + Eπ0_seq)

# π_seq is of length T+1


π_seq = μ_seq + C @ Eπ_seq

D = np.eye(T+1, T+1) - np.eye(T+1, T+1, k=-1)


m0_seq = np.append(m0, np.zeros(T))

# m_seq is of length T+2


m_seq = np.linalg.inv(D) @ (μ_seq + m0_seq)
m_seq = np.append(m0, m_seq)

# p_seq is of length T+2


p_seq = m_seq + α * Eπ_seq

return π_seq, Eπ_seq, m_seq, p_seq

def solve_and_plot(model, μ_seq):

π_seq, Eπ_seq, m_seq, p_seq = solve(model, μ_seq)

T_seq = range(model.T+2)

fig, ax = plt.subplots(5, 1, figsize=[5, 12], dpi=200)


ax[0].plot(T_seq[:-1], μ_seq)
ax[1].plot(T_seq[:-1], π_seq, label=r'$\pi_t$')
ax[1].plot(T_seq, Eπ_seq, label=r'$\pi^{*}_{t}$')
ax[2].plot(T_seq, m_seq - p_seq)
ax[3].plot(T_seq, m_seq)
ax[4].plot(T_seq, p_seq)

y_labs = [r'$\mu$', r'$\pi$', r'$m - p$', r'$m$', r'$p$']

for i in range(5):
ax[i].set_xlabel(r'$t$')
ax[i].set_ylabel(y_labs[i])

ax[1].legend()
plt.tight_layout()
plt.show()

return π_seq, Eπ_seq, m_seq, p_seq


13.6 Technical condition for stability

In constructing our examples, we shall assume that (𝜆, 𝛼) satisfy

$$
\left| \frac{\lambda - \alpha(1-\lambda)}{1 - \alpha(1-\lambda)} \right| < 1 \tag{13.11}
$$

The source of this condition is the following string of deductions:



$$
\begin{aligned}
\pi_t & = \mu_t + \alpha \pi_{t+1}^* - \alpha \pi_t^* \\
\pi_{t+1}^* & = \lambda \pi_t^* + (1-\lambda) \pi_t \\
\pi_t & = \frac{\mu_t}{1 - \alpha(1-\lambda)} - \frac{\alpha(1-\lambda)}{1 - \alpha(1-\lambda)} \pi_t^* \\
\implies \pi_t^* & = \frac{1}{\alpha(1-\lambda)} \mu_t - \frac{1 - \alpha(1-\lambda)}{\alpha(1-\lambda)} \pi_t \\
\pi_{t+1} & = \frac{\mu_{t+1}}{1 - \alpha(1-\lambda)} - \frac{\alpha(1-\lambda)}{1 - \alpha(1-\lambda)} \left( \lambda \pi_t^* + (1-\lambda) \pi_t \right) \\
& = \frac{\mu_{t+1}}{1 - \alpha(1-\lambda)} - \frac{\lambda}{1 - \alpha(1-\lambda)} \mu_t + \frac{\lambda - \alpha(1-\lambda)}{1 - \alpha(1-\lambda)} \pi_t
\end{aligned}
$$

By assuring that the coefficient on 𝜋𝑡 is less than one in absolute value, condition (13.11) assures stability of the dynamics
of {𝜋𝑡 } described by the last line of our string of deductions.
The reader is free to study outcomes in examples that violate condition (13.11).

print(np.abs((λ - α*(1-λ))/(1 - α*(1-λ))))

0.8

print(λ - α*(1-λ))

0.40000000000000013

Now we’ll turn to some experiments.

13.6.1 Experiment 1

We’ll study a situation in which the rate of growth of the money supply is 𝜇0 from 𝑡 = 0 to 𝑡 = 𝑇1 and then permanently
falls to 𝜇∗ at 𝑡 = 𝑇1 .
Thus, let 𝑇1 ∈ (0, 𝑇 ).
So where 𝜇0 > 𝜇∗ , we assume that

$$
\mu_{t+1} = \begin{cases} \mu_0, & t = 0, \ldots, T_1 - 1 \\ \mu^*, & t \geq T_1 \end{cases}
$$

Notice that we studied exactly this experiment in a rational expectations version of the model in the lecture on the fiscal theory of the price level.
So by comparing outcomes across the two lectures, we can learn about consequences of assuming adaptive expectations,
as we do here, instead of rational expectations as we assumed in that other lecture.


μ_seq_1 = np.append(μ0*np.ones(T1), μ_star*np.ones(T+1-T1))

# solve and plot


π_seq_1, Eπ_seq_1, m_seq_1, p_seq_1 = solve_and_plot(md, μ_seq_1)


We invite the reader to compare outcomes with those under rational expectations studied in the lecture on the fiscal theory of the price level.
Please note how the actual inflation rate 𝜋𝑡 “overshoots” its ultimate steady-state value at the time of the sudden reduction
in the rate of growth of the money supply at time 𝑇1 .
We invite you to explain to yourself the source of this overshooting and why it does not occur in the rational expectations
version of the model.

13.6.2 Experiment 2

Now we'll do a different experiment, namely, a gradual stabilization in which the rate of growth of the money supply smoothly declines from a high value to a persistently low value.
While price level inflation eventually falls, it falls more slowly than the driving force that ultimately causes it to fall, namely,
the falling rate of growth of the money supply.
The sluggish fall in inflation is explained by how anticipated inflation 𝜋𝑡∗ persistently exceeds actual inflation 𝜋𝑡 during the
transition from a high inflation to a low inflation situation.

# parameters
ϕ = 0.9
μ_seq_2 = np.array([ϕ**t * μ0 + (1-ϕ**t)*μ_star for t in range(T)])
μ_seq_2 = np.append(μ_seq_2, μ_star)

# solve and plot


π_seq_2, Eπ_seq_2, m_seq_2, p_seq_2 = solve_and_plot(md, μ_seq_2)

CHAPTER

FOURTEEN

GEOMETRIC SERIES FOR ELEMENTARY ECONOMICS

Contents

• Geometric Series for Elementary Economics


– Overview
– Key formulas
– Example: The Money Multiplier in Fractional Reserve Banking
– Example: The Keynesian Multiplier
– Example: Interest Rates and Present Values
– Back to the Keynesian multiplier

14.1 Overview

The lecture describes important ideas in economics that use the mathematics of geometric series.
Among these are
• the Keynesian multiplier
• the money multiplier that prevails in fractional reserve banking systems
• interest rates and present values of streams of payouts from assets
(As we shall see below, the term multiplier comes down to meaning the sum of a convergent geometric series.)
These and other applications prove the truth of the wisecrack that
“in economics, a little knowledge of geometric series goes a long way.”
Below we’ll use the following imports:

%matplotlib inline
import matplotlib.pyplot as plt
plt.rcParams["figure.figsize"] = (11, 5) #set default figure size
import numpy as np
import sympy as sym
from sympy import init_printing, latex
from matplotlib import cm
from mpl_toolkits.mplot3d import Axes3D


14.2 Key formulas

To start, let 𝑐 be a real number that lies strictly between −1 and 1.


• We often write this as 𝑐 ∈ (−1, 1).
• Here (−1, 1) denotes the collection of all real numbers that are strictly less than 1 and strictly greater than −1.
• The symbol ∈ means in or belongs to the set after the symbol.
We want to evaluate geometric series of two types – infinite and finite.

14.2.1 Infinite geometric series

The first type of geometric series that interests us is the infinite series

1 + 𝑐 + 𝑐2 + 𝑐3 + ⋯

Where ⋯ means that the series continues without end.


The key formula is
$$
1 + c + c^2 + c^3 + \cdots = \frac{1}{1 - c} \tag{14.1}
$$
To prove key formula (14.1), multiply both sides by (1 − 𝑐) and verify that if 𝑐 ∈ (−1, 1), then the outcome is the
equation 1 = 1.

14.2.2 Finite geometric series

The second series that interests us is the finite geometric series

1 + 𝑐 + 𝑐2 + 𝑐3 + ⋯ + 𝑐𝑇

where 𝑇 is a positive integer.


The key formula here is

$$
1 + c + c^2 + c^3 + \cdots + c^T = \frac{1 - c^{T+1}}{1 - c}
$$
Remark: The above formula works for any value of the scalar 𝑐 other than 𝑐 = 1. We don’t have to restrict 𝑐 to be in the set (−1, 1).
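As a quick numerical sanity check of both key formulas, we can compare sums computed term by term with the closed-form expressions (a minimal sketch; the particular values of c and T below are arbitrary):

c, T = 0.9, 10

# Finite geometric series: term-by-term sum vs the closed form (any c other than 1)
direct_finite = sum(c**k for k in range(T + 1))
closed_finite = (1 - c**(T + 1)) / (1 - c)
print(direct_finite, closed_finite)

# Infinite geometric series: a long partial sum approaches 1/(1-c) when |c| < 1
partial_sum = sum(c**k for k in range(10_000))
print(partial_sum, 1 / (1 - c))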
We now move on to describe some famous economic applications of geometric series.

14.3 Example: The Money Multiplier in Fractional Reserve Banking

In a fractional reserve banking system, banks hold only a fraction 𝑟 ∈ (0, 1) of cash behind each deposit receipt that
they issue
• In recent times
– cash consists of pieces of paper issued by the government and called dollars or pounds or …
– a deposit is a balance in a checking or savings account that entitles the owner to ask the bank for immediate
payment in cash


• When the UK and France and the US were on either a gold or silver standard (before 1914, for example)
– cash was a gold or silver coin
– a deposit receipt was a bank note that the bank promised to convert into gold or silver on demand; (sometimes
it was also a checking or savings account balance)
Economists and financiers often define the supply of money as an economy-wide sum of cash plus deposits.
In a fractional reserve banking system (one in which the reserve ratio 𝑟 satisfies 0 < 𝑟 < 1), banks create money by
issuing deposits backed by fractional reserves plus loans that they make to their customers.
A geometric series is a key tool for understanding how banks create money (i.e., deposits) in a fractional reserve system.
The geometric series formula (14.1) is at the heart of the classic model of the money creation process – one that leads us
to the celebrated money multiplier.

14.3.1 A simple model

There is a set of banks named 𝑖 = 0, 1, 2, ….


Bank 𝑖’s loans 𝐿𝑖 , deposits 𝐷𝑖 , and reserves 𝑅𝑖 must satisfy the balance sheet equation (because balance sheets balance):

𝐿𝑖 + 𝑅 𝑖 = 𝐷 𝑖 (14.2)

The left side of the above equation is the sum of the bank’s assets, namely, the loans 𝐿𝑖 it has outstanding plus its reserves
of cash 𝑅𝑖 .
The right side records bank 𝑖’s liabilities, namely, the deposits 𝐷𝑖 held by its depositors; these are IOU’s from the bank to
its depositors in the form of either checking accounts or savings accounts (or before 1914, bank notes issued by a bank
stating promises to redeem note for gold or silver on demand).
Each bank 𝑖 sets its reserves to satisfy the equation

𝑅𝑖 = 𝑟𝐷𝑖 (14.3)

where 𝑟 ∈ (0, 1) is its reserve-deposit ratio or reserve ratio for short


• the reserve ratio is either set by a government or chosen by banks for precautionary reasons
Next we add a theory stating that bank 𝑖 + 1’s deposits depend entirely on loans made by bank 𝑖, namely

𝐷𝑖+1 = 𝐿𝑖 (14.4)

Thus, we can think of the banks as being arranged along a line with loans from bank 𝑖 being immediately deposited in
𝑖+1
• in this way, the debtors to bank 𝑖 become creditors of bank 𝑖 + 1
Finally, we add an initial condition about an exogenous level of bank 0’s deposits

𝐷0 is given exogenously

We can think of 𝐷0 as being the amount of cash that a first depositor put into the first bank in the system, bank number
𝑖 = 0.
Now we do a little algebra.
Combining equations (14.2) and (14.3) tells us that

𝐿𝑖 = (1 − 𝑟)𝐷𝑖 (14.5)


This states that bank 𝑖 loans a fraction (1 − 𝑟) of its deposits and keeps a fraction 𝑟 as cash reserves.
Combining equation (14.5) with equation (14.4) tells us that

𝐷𝑖+1 = (1 − 𝑟)𝐷𝑖 for 𝑖 ≥ 0

which implies that

𝐷𝑖 = (1 − 𝑟)𝑖 𝐷0 for 𝑖 ≥ 0 (14.6)

Equation (14.6) expresses 𝐷𝑖 as the 𝑖 th term in the product of 𝐷0 and the geometric series

1, (1 − 𝑟), (1 − 𝑟)2 , ⋯

Therefore, the sum of all deposits in our banking system 𝑖 = 0, 1, 2, … is



$$
\sum_{i=0}^{\infty} (1-r)^i D_0 = \frac{D_0}{1 - (1-r)} = \frac{D_0}{r} \tag{14.7}
$$

14.3.2 Money multiplier

The money multiplier is a number that tells the multiplicative factor by which an exogenous injection of cash into bank
0 leads to an increase in the total deposits in the banking system.
Equation (14.7) asserts that the money multiplier is $\frac{1}{r}$.
• An initial deposit of cash of $D_0$ in bank 0 leads the banking system to create total deposits of $\frac{D_0}{r}$.
• The initial deposit $D_0$ is held as reserves, distributed throughout the banking system according to $D_0 = \sum_{i=0}^{\infty} R_i$.
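To see the multiplier at work, here is a small simulation sketch that accumulates deposits across banks i = 0, 1, 2, … and compares the total with D_0/r (the reserve ratio and initial deposit below are arbitrary illustrative values):

import numpy as np

r = 0.1          # reserve ratio
D0 = 100.0       # initial cash deposit in bank 0
num_banks = 200  # truncate the infinite chain of banks

# Deposits in bank i satisfy D_i = (1 - r)**i * D_0
deposits = D0 * (1 - r) ** np.arange(num_banks)

print(deposits.sum())  # total deposits created by the first num_banks banks
print(D0 / r)          # prediction from the money multiplier 1/r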

14.4 Example: The Keynesian Multiplier

The famous economist John Maynard Keynes and his followers created a simple model intended to determine national
income 𝑦 in circumstances in which
• there are substantial unemployed resources, in particular excess supply of labor and capital
• prices and interest rates fail to adjust to make aggregate supply equal demand (e.g., prices and interest rates are
frozen)
• national income is entirely determined by aggregate demand

14.4.1 Static version

An elementary Keynesian model of national income determination consists of three equations that describe aggregate
demand for 𝑦 and its components.
The first equation is a national income identity asserting that consumption 𝑐 plus investment 𝑖 equals national income 𝑦:

𝑐+𝑖=𝑦

The second equation is a Keynesian consumption function asserting that people consume a fraction 𝑏 ∈ (0, 1) of their
income:

𝑐 = 𝑏𝑦


The fraction 𝑏 ∈ (0, 1) is called the marginal propensity to consume.


The fraction 1 − 𝑏 ∈ (0, 1) is called the marginal propensity to save.
The third equation simply states that investment is exogenous at level 𝑖.
• exogenous means determined outside this model.
Substituting the second equation into the first gives (1 − 𝑏)𝑦 = 𝑖.
Solving this equation for 𝑦 gives
$$
y = \frac{1}{1 - b} i
$$
The quantity $\frac{1}{1-b}$ is called the investment multiplier or simply the multiplier.
Applying the formula for the sum of an infinite geometric series, we can write the above equation as

$$
y = i \sum_{t=0}^{\infty} b^t
$$

where 𝑡 is a nonnegative integer.


So we arrive at the following equivalent expressions for the multiplier:

$$
\frac{1}{1 - b} = \sum_{t=0}^{\infty} b^t
$$
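As a quick numerical check of this equivalence, we can compare a long partial sum of b^t with 1/(1 − b) (a sketch; the value of b is arbitrary):

b = 2/3
partial_sum = sum(b**t for t in range(200))  # truncated version of the infinite sum
print(partial_sum, 1 / (1 - b))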


The expression $\sum_{t=0}^{\infty} b^t$ motivates an interpretation of the multiplier as the outcome of a dynamic process that we describe
next.

14.4.2 Dynamic version

We arrive at a dynamic version by interpreting the nonnegative integer 𝑡 as indexing time and changing our specification
of the consumption function to take time into account
• we add a one-period lag in how income affects consumption
We let 𝑐𝑡 be consumption at time 𝑡 and 𝑖𝑡 be investment at time 𝑡.
We modify our consumption function to assume the form

𝑐𝑡 = 𝑏𝑦𝑡−1

so that 𝑏 is the marginal propensity to consume (now) out of last period’s income.
We begin with an initial condition stating that

𝑦−1 = 0

We also assume that

𝑖𝑡 = 𝑖 for all 𝑡 ≥ 0

so that investment is constant over time.


It follows that

𝑦0 = 𝑖 + 𝑐0 = 𝑖 + 𝑏𝑦−1 = 𝑖


and

𝑦1 = 𝑐1 + 𝑖 = 𝑏𝑦0 + 𝑖 = (1 + 𝑏)𝑖

and

𝑦2 = 𝑐2 + 𝑖 = 𝑏𝑦1 + 𝑖 = (1 + 𝑏 + 𝑏2 )𝑖

and more generally

𝑦𝑡 = 𝑏𝑦𝑡−1 + 𝑖 = (1 + 𝑏 + 𝑏2 + ⋯ + 𝑏𝑡 )𝑖

or
$$
y_t = \frac{1 - b^{t+1}}{1 - b} i
$$
Evidently, as $t \to +\infty$,
$$
y_t \to \frac{1}{1 - b} i
$$
Remark 1: The above formula is often applied to assert that an exogenous increase in investment of Δ𝑖 at time 0 ignites
a dynamic process of increases in national income by successive amounts

Δ𝑖, (1 + 𝑏)Δ𝑖, (1 + 𝑏 + 𝑏2 )Δ𝑖, ⋯

at times 0, 1, 2, ….
Remark 2 Let 𝑔𝑡 be an exogenous sequence of government expenditures.
If we generalize the model so that the national income identity becomes

𝑐𝑡 + 𝑖 𝑡 + 𝑔 𝑡 = 𝑦 𝑡
then a version of the preceding argument shows that the government expenditures multiplier is also $\frac{1}{1-b}$, so that a
permanent increase in government expenditures ultimately leads to an increase in national income equal to the multiplier
times the increase in government expenditures.

14.5 Example: Interest Rates and Present Values

We can apply our formula for geometric series to study how interest rates affect values of streams of dollar payments that
extend over time.
We work in discrete time and assume that 𝑡 = 0, 1, 2, … indexes time.
We let 𝑟 ∈ (0, 1) be a one-period net nominal interest rate
• if the nominal interest rate is 5 percent, then 𝑟 = .05
A one-period gross nominal interest rate 𝑅 is defined as

𝑅 = 1 + 𝑟 ∈ (1, 2)

• if 𝑟 = .05, then 𝑅 = 1.05


Remark: The gross nominal interest rate 𝑅 is an exchange rate or relative price of dollars between times 𝑡 and 𝑡 + 1.
The units of 𝑅 are dollars at time 𝑡 + 1 per dollar at time 𝑡.
When people borrow and lend, they trade dollars now for dollars later or dollars later for dollars now.
The price at which these exchanges occur is the gross nominal interest rate.


• If I sell 𝑥 dollars to you today, you pay me 𝑅𝑥 dollars tomorrow.


• This means that you borrowed 𝑥 dollars from me at a gross interest rate 𝑅 and a net interest rate 𝑟.
We assume that the net nominal interest rate 𝑟 is fixed over time, so that 𝑅 is the gross nominal interest rate at times
𝑡 = 0, 1, 2, ….
Two important geometric sequences are

1, 𝑅, 𝑅2 , ⋯ (14.8)

and

1, 𝑅−1 , 𝑅−2 , ⋯ (14.9)

Sequence (14.8) tells us how dollar values of an investment accumulate through time.
Sequence (14.9) tells us how to discount future dollars to get their values in terms of today’s dollars.

14.5.1 Accumulation

Geometric sequence (14.8) tells us how one dollar invested and re-invested in a project with gross one period nominal
rate of return accumulates
• here we assume that net interest payments are reinvested in the project
• thus, 1 dollar invested at time 0 pays interest 𝑟 dollars after one period, so we have 𝑟 + 1 = 𝑅 dollars at time 1
• at time 1 we reinvest 1 + 𝑟 = 𝑅 dollars and receive interest of 𝑟𝑅 dollars at time 2 plus the principal 𝑅 dollars, so
we receive 𝑟𝑅 + 𝑅 = (1 + 𝑟)𝑅 = 𝑅2 dollars at the end of period 2
• and so on
Evidently, if we invest 𝑥 dollars at time 0 and reinvest the proceeds, then the sequence

𝑥, 𝑥𝑅, 𝑥𝑅2 , ⋯

tells how our account accumulates at dates 𝑡 = 0, 1, 2, ….

14.5.2 Discounting

Geometric sequence (14.9) tells us how much future dollars are worth in terms of today’s dollars.
Remember that the units of 𝑅 are dollars at 𝑡 + 1 per dollar at 𝑡.
It follows that
• the units of 𝑅−1 are dollars at 𝑡 per dollar at 𝑡 + 1
• the units of 𝑅−2 are dollars at 𝑡 per dollar at 𝑡 + 2
• and so on; the units of 𝑅−𝑗 are dollars at 𝑡 per dollar at 𝑡 + 𝑗
So if someone has a claim on 𝑥 dollars at time 𝑡 + 𝑗, it is worth 𝑥𝑅−𝑗 dollars at time 𝑡 (e.g., today).
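For example, here is a minimal sketch that discounts a claim on x dollars received j periods from now back into today's dollars (the values of r, x and j are arbitrary):

r = 0.05          # net nominal interest rate
R = 1 + r         # gross nominal interest rate
x, j = 100.0, 10  # claim on x dollars at time t + j

present_value = x * R**(-j)  # value of the claim in time-t dollars
print(present_value)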


14.5.3 Application to asset pricing

A lease requires a payments stream of 𝑥𝑡 dollars at times 𝑡 = 0, 1, 2, … where


𝑥𝑡 = 𝐺 𝑡 𝑥0
where 𝐺 = (1 + 𝑔) and 𝑔 ∈ (0, 1).
Thus, lease payments increase at 𝑔 percent per period.
For a reason soon to be revealed, we assume that 𝐺 < 𝑅.
The present value of the lease is
$$
\begin{aligned}
p_0 & = x_0 + x_1/R + x_2/R^2 + \cdots \\
& = x_0 \left(1 + G R^{-1} + G^2 R^{-2} + \cdots\right) \\
& = x_0 \frac{1}{1 - G R^{-1}}
\end{aligned}
$$
where the last line uses the formula for an infinite geometric series.
Recall that 𝑅 = 1 + 𝑟 and 𝐺 = 1 + 𝑔 and that 𝑅 > 𝐺 and 𝑟 > 𝑔 and that 𝑟 and 𝑔 are typically small numbers, e.g., .05
or .03.
Use the Taylor series of $\frac{1}{1+r}$ about $r = 0$, namely,
$$
\frac{1}{1+r} = 1 - r + r^2 - r^3 + \cdots
$$
and the fact that $r$ is small to approximate $\frac{1}{1+r} \approx 1 - r$.
Use this approximation to write 𝑝0 as
$$
\begin{aligned}
p_0 & = x_0 \frac{1}{1 - G R^{-1}} \\
& = x_0 \frac{1}{1 - (1+g)(1-r)} \\
& = x_0 \frac{1}{1 - (1 + g - r - rg)} \\
& \approx x_0 \frac{1}{r - g}
\end{aligned}
$$
where the last step uses the approximation 𝑟𝑔 ≈ 0.
The approximation
$$
p_0 = \frac{x_0}{r - g}
$$
is known as the Gordon formula for the present value or current price of an infinite payment stream 𝑥0 𝐺𝑡 when the
nominal one-period interest rate is 𝑟 and when 𝑟 > 𝑔.
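Here is a small sketch comparing the exact present value of the infinite lease with the Gordon approximation x_0/(r − g) (the parameter values are arbitrary small numbers in the spirit of the discussion above):

x_0, r, g = 1.0, 0.05, 0.03
R, G = 1 + r, 1 + g

exact = x_0 / (1 - G / R)   # exact present value of the infinite lease
gordon = x_0 / (r - g)      # Gordon formula approximation
print(exact, gordon)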
We can also extend the asset pricing formula so that it applies to finite leases.
Let the payment stream on the lease now be 𝑥𝑡 for 𝑡 = 1, 2, … , 𝑇 , where again
𝑥𝑡 = 𝐺 𝑡 𝑥0
The present value of this lease is:
$$
\begin{aligned}
p_0 & = x_0 + x_1/R + \cdots + x_T/R^T \\
& = x_0 \left(1 + G R^{-1} + \cdots + G^T R^{-T}\right) \\
& = \frac{x_0 \left(1 - G^{T+1} R^{-(T+1)}\right)}{1 - G R^{-1}}
\end{aligned}
$$


Applying the Taylor series to 𝑅−(𝑇 +1) about 𝑟 = 0 we get:


$$
\frac{1}{(1+r)^{T+1}} = 1 - r(T+1) + \frac{1}{2} r^2 (T+1)(T+2) + \cdots \approx 1 - r(T+1)
$$

Similarly, applying the Taylor series to 𝐺𝑇 +1 about 𝑔 = 0:

$$
(1+g)^{T+1} = 1 + (T+1) g + \frac{T(T+1)}{2!} g^2 + \frac{(T-1)T(T+1)}{3!} g^3 + \cdots \approx 1 + (T+1) g
$$
Thus, we get the following approximation:

$$
p_0 = \frac{x_0 \left(1 - (1 + (T+1) g)(1 - r(T+1))\right)}{1 - (1-r)(1+g)}
$$
Expanding:
$$
\begin{aligned}
p_0 & = \frac{x_0 \left(1 - 1 + (T+1)^2 r g - r(T+1) + g(T+1)\right)}{1 - 1 + r - g + r g} \\
& = \frac{x_0 (T+1)\left((T+1) r g + r - g\right)}{r - g + r g} \\
& \approx \frac{x_0 (T+1)(r - g)}{r - g} + \frac{x_0 \, r g (T+1)}{r - g} \\
& = x_0 (T+1) + \frac{x_0 \, r g (T+1)}{r - g}
\end{aligned}
$$

We could have also approximated by removing the second term 𝑟𝑔𝑥0 (𝑇 + 1) when 𝑇 is relatively small compared to
1/(𝑟𝑔) to get 𝑥0 (𝑇 + 1) as in the finite stream approximation.
We will plot the true finite stream present-value and the two approximations, under different values of 𝑇 , and 𝑔 and 𝑟 in
Python.
First we plot the true finite stream present-value after computing it below

# True present value of a finite lease


def finite_lease_pv_true(T, g, r, x_0):
G = (1 + g)
R = (1 + r)
return (x_0 * (1 - G**(T + 1) * R**(-T - 1))) / (1 - G * R**(-1))
# First approximation for our finite lease

def finite_lease_pv_approx_1(T, g, r, x_0):


p = x_0 * (T + 1) + x_0 * r * g * (T + 1) / (r - g)
return p

# Second approximation for our finite lease


def finite_lease_pv_approx_2(T, g, r, x_0):
return (x_0 * (T + 1))

# Infinite lease
def infinite_lease(g, r, x_0):
G = (1 + g)
R = (1 + r)
return x_0 / (1 - G * R**(-1))

Now that we have defined our functions, we can plot some outcomes.
First we study the quality of our approximations


def plot_function(axes, x_vals, func, args):


axes.plot(x_vals, func(*args), label=func.__name__)

T_max = 50

T = np.arange(0, T_max+1)
g = 0.02
r = 0.03
x_0 = 1

our_args = (T, g, r, x_0)


funcs = [finite_lease_pv_true,
finite_lease_pv_approx_1,
finite_lease_pv_approx_2]
# the three functions we want to compare

fig, ax = plt.subplots()
for f in funcs:
plot_function(ax, T, f, our_args)
ax.legend()
ax.set_xlabel('$T$ Periods Ahead')
ax.set_ylabel('Present Value, $p_0$')
plt.show()

Fig. 14.1: Finite lease present value 𝑇 periods ahead

Evidently our approximations perform well for small values of 𝑇 .


However, holding 𝑔 and 𝑟 fixed, our approximations deteriorate as 𝑇 increases.
Next we compare the infinite and finite duration lease present values over different lease lengths 𝑇 .

# Convergence of infinite and finite


T_max = 1000
T = np.arange(0, T_max+1)
fig, ax = plt.subplots()
f_1 = finite_lease_pv_true(T, g, r, x_0)


f_2 = np.full(T_max+1, infinite_lease(g, r, x_0))
ax.plot(T, f_1, label='T-period lease PV')
ax.plot(T, f_2, '--', label='Infinite lease PV')
ax.set_xlabel('$T$ Periods Ahead')
ax.set_ylabel('Present Value, $p_0$')
ax.legend()
plt.show()

Fig. 14.2: Infinite and finite lease present value 𝑇 periods ahead

The graph above shows how as duration 𝑇 → +∞, the value of a lease of duration 𝑇 approaches the value of a perpetual
lease.
Now we consider two different views of what happens as 𝑟 and 𝑔 covary

# First view
# Changing r and g
fig, ax = plt.subplots()
ax.set_ylabel('Present Value, $p_0$')
ax.set_xlabel('$T$ periods ahead')
T_max = 10
T=np.arange(0, T_max+1)

rs, gs = (0.9, 0.5, 0.4001, 0.4), (0.4, 0.4, 0.4, 0.5),


comparisons = ('$\gg$', '$>$', r'$\approx$', '$<$')
for r, g, comp in zip(rs, gs, comparisons):
ax.plot(finite_lease_pv_true(T, g, r, x_0), label=f'r(={r}) {comp} g(={g})')

ax.legend()
plt.show()

This graph gives a big hint for why the condition 𝑟 > 𝑔 is necessary if a lease of length 𝑇 = +∞ is to have finite value.
For fans of 3-d graphs the same point comes through in the following graph.
If you aren’t enamored of 3-d graphs, feel free to skip the next visualization!


Fig. 14.3: Value of lease of length 𝑇

# Second view
fig = plt.figure(figsize = [16, 5])
T = 3
ax = plt.subplot(projection='3d')
r = np.arange(0.01, 0.99, 0.005)
g = np.arange(0.011, 0.991, 0.005)

rr, gg = np.meshgrid(r, g)
z = finite_lease_pv_true(T, gg, rr, x_0)

# Removes points where undefined


same = (rr == gg)
z[same] = np.nan
surf = ax.plot_surface(rr, gg, z, cmap=cm.coolwarm,
antialiased=True, clim=(0, 15))
fig.colorbar(surf, shrink=0.5, aspect=5)
ax.set_xlabel('$r$')
ax.set_ylabel('$g$')
ax.set_zlabel('Present Value, $p_0$')
ax.view_init(20, 8)
plt.show()

We can use a little calculus to study how the present value 𝑝0 of a lease varies with 𝑟 and 𝑔.
We will use a library called SymPy.
SymPy enables us to do symbolic math calculations including computing derivatives of algebraic equations.
We will illustrate how it works by creating a symbolic expression that represents our present value formula for an infinite
lease.
After that, we’ll use SymPy to compute derivatives

# Creates algebraic symbols that can be used in an algebraic expression


g, r, x0 = sym.symbols('g, r, x0')
G = (1 + g)

Fig. 14.4: Three period lease PV with varying 𝑔 and 𝑟



R = (1 + r)
p0 = x0 / (1 - G * R**(-1))
init_printing(use_latex='mathjax')
print('Our formula is:')
p0

Our formula is:

$$
\frac{x_0}{- \frac{g+1}{r+1} + 1}
$$

print('dp0 / dg is:')
dp_dg = sym.diff(p0, g)
dp_dg

dp0 / dg is:

$$
\frac{x_0}{(r+1) \left(- \frac{g+1}{r+1} + 1\right)^2}
$$


print('dp0 / dr is:')
dp_dr = sym.diff(p0, r)
dp_dr

dp0 / dr is:

$$
- \frac{x_0 (g+1)}{(r+1)^2 \left(- \frac{g+1}{r+1} + 1\right)^2}
$$

We can see that $\frac{\partial p_0}{\partial r} < 0$ as long as $r > g$, $r > 0$, $g > 0$ and $x_0$ is positive, so $\frac{\partial p_0}{\partial r}$ will always be negative.
Similarly, $\frac{\partial p_0}{\partial g} > 0$ as long as $r > g$, $r > 0$, $g > 0$ and $x_0$ is positive, so $\frac{\partial p_0}{\partial g}$ will always be positive.

14.6 Back to the Keynesian multiplier

We will now go back to the case of the Keynesian multiplier and plot the time path of 𝑦𝑡 , given that consumption is a
constant fraction of national income, and investment is fixed.

# Function that calculates a path of y


def calculate_y(i, b, g, T, y_init):
y = np.zeros(T+1)
y[0] = i + b * y_init + g
for t in range(1, T+1):
y[t] = b * y[t-1] + i + g
return y

# Initial values
i_0 = 0.3
g_0 = 0.3
# 2/3 of income goes towards consumption
b = 2/3
y_init = 0
T = 100

fig, ax = plt.subplots()
ax.set_xlabel('$t$')
ax.set_ylabel('$y_t$')
ax.plot(np.arange(0, T+1), calculate_y(i_0, b, g_0, T, y_init))
# Output predicted by geometric series
ax.hlines(i_0 / (1 - b) + g_0 / (1 - b), xmin=-1, xmax=101, linestyles='--')
plt.show()

In this model, income grows over time, until it gradually converges to the infinite geometric series sum of income.
We now examine what will happen if we vary the so-called marginal propensity to consume, i.e., the fraction of income
that is consumed

bs = (1/3, 2/3, 5/6, 0.9)


Fig. 14.5: Path of aggregate output over time



fig,ax = plt.subplots()
ax.set_ylabel('$y_t$')
ax.set_xlabel('$t$')
x = np.arange(0, T+1)
for b in bs:
y = calculate_y(i_0, b, g_0, T, y_init)
ax.plot(x, y, label=r'$b=$'+f"{b:.2f}")
ax.legend()
plt.show()

Fig. 14.6: Changing consumption as a fraction of income

Increasing the marginal propensity to consume 𝑏 increases the path of output over time.
Now we will compare the effects on output of increases in investment and government spending.


fig, (ax1, ax2) = plt.subplots(2, 1, figsize=(6, 10))


fig.subplots_adjust(hspace=0.3)

x = np.arange(0, T+1)
values = [0.3, 0.4]

for i in values:
y = calculate_y(i, b, g_0, T, y_init)
ax1.plot(x, y, label=f"i={i}")
for g in values:
y = calculate_y(i_0, b, g, T, y_init)
ax2.plot(x, y, label=f"g={g}")

axes = ax1, ax2


param_labels = "Investment", "Government Spending"
for ax, param in zip(axes, param_labels):
ax.set_title(f'An Increase in {param} on Output')
ax.legend(loc ="lower right")
ax.set_ylabel('$y_t$')
ax.set_xlabel('$t$')
plt.show()

Notice here, whether government spending increases from 0.3 to 0.4 or investment increases from 0.3 to 0.4, the shifts
in the graphs are identical.


Fig. 14.7: Different increase on output



Part V

Probability and Distributions

CHAPTER

FIFTEEN

DISTRIBUTIONS AND PROBABILITIES

Contents

• Distributions and Probabilities


– Outline
– Common distributions
– Observed distributions

15.1 Outline

In this lecture we give a quick introduction to data and probability distributions using Python.

!pip install --upgrade yfinance

import matplotlib.pyplot as plt


import pandas as pd
import numpy as np
import yfinance as yf
import scipy.stats
import seaborn as sns

15.2 Common distributions

In this section we recall the definitions of some well-known distributions and show how to manipulate them with SciPy.


15.2.1 Discrete distributions

Let’s start with discrete distributions.


A discrete distribution is defined by a set of numbers 𝑆 = {𝑥1 , … , 𝑥𝑛 } and a probability mass function (PMF) on 𝑆,
which is a function 𝑝 from 𝑆 to [0, 1] with the property
$$
\sum_{i=1}^{n} p(x_i) = 1
$$

We say that a random variable 𝑋 has distribution 𝑝 if 𝑋 takes value 𝑥𝑖 with probability 𝑝(𝑥𝑖 ).
That is,

ℙ{𝑋 = 𝑥𝑖 } = 𝑝(𝑥𝑖 ) for 𝑖 = 1, … , 𝑛

The mean or expected value of a random variable 𝑋 with distribution 𝑝 is


$$
\mathbb{E} X = \sum_{i=1}^{n} x_i p(x_i)
$$

Expectation is also called the first moment of the distribution.


We also refer to this number as the mean of the distribution (represented by) 𝑝.
The variance of 𝑋 is defined as
$$
\mathbb{V} X = \sum_{i=1}^{n} (x_i - \mathbb{E} X)^2 p(x_i)
$$

Variance is also called the second central moment of the distribution.


The cumulative distribution function (CDF) of 𝑋 is defined by
$$
F(x) = \mathbb{P}\{X \leq x\} = \sum_{i=1}^{n} \mathbb{1}\{x_i \leq x\} p(x_i)
$$

Here 𝟙{statement} = 1 if “statement” is true and zero otherwise.


Hence the second term takes all 𝑥𝑖 ≤ 𝑥 and sums their probabilities.

Uniform distribution

One simple example is the uniform distribution, where 𝑝(𝑥𝑖 ) = 1/𝑛 for all 𝑖.
We can import the uniform distribution on 𝑆 = {1, … , 𝑛} from SciPy like so:

n = 10
u = scipy.stats.randint(1, n+1)

Here’s the mean and variance

u.mean(), u.var()

(5.5, 8.25)


The formula for the mean is (𝑛 + 1)/2, and the formula for the variance is (𝑛2 − 1)/12.
Now let’s evaluate the PMF

u.pmf(1)

0.1

u.pmf(2)

0.1

Here’s a plot of the probability mass function:

fig, ax = plt.subplots()
S = np.arange(1, n+1)
ax.plot(S, u.pmf(S), linestyle='', marker='o', alpha=0.8, ms=4)
ax.vlines(S, 0, u.pmf(S), lw=0.2)
ax.set_xticks(S)
plt.show()

Here’s a plot of the CDF:

fig, ax = plt.subplots()
S = np.arange(1, n+1)
ax.step(S, u.cdf(S))
ax.vlines(S, 0, u.cdf(S), lw=0.2)
ax.set_xticks(S)
plt.show()


The CDF jumps up by 𝑝(𝑥𝑖 ) at 𝑥𝑖 .

Exercise 15.2.1
Calculate the mean and variance for this parameterization (i.e., 𝑛 = 10) directly from the PMF, using the expressions
given above.
Check that your answers agree with u.mean() and u.var().

Binomial distribution

Another useful (and more interesting) distribution is the binomial distribution on 𝑆 = {0, … , 𝑛}, which has PMF

$$
p(i) = \binom{n}{i} \theta^i (1 - \theta)^{n-i}
$$

Here 𝜃 ∈ [0, 1] is a parameter.


The interpretation of 𝑝(𝑖) is: the number of successes in 𝑛 independent trials with success probability 𝜃.
(If 𝜃 = 0.5, this is “how many heads in 𝑛 flips of a fair coin”)
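To connect this interpretation with the PMF, here is a small simulation sketch that counts heads in n flips of a fair coin many times over and compares an empirical frequency with the corresponding PMF value (the number of simulations is arbitrary):

import scipy.stats

n, θ = 10, 0.5
num_sims = 50_000

# Each row is one experiment of n coin flips; summing across a row counts heads
heads = scipy.stats.bernoulli.rvs(θ, size=(num_sims, n)).sum(axis=1)

# Empirical frequency of exactly 5 heads vs the binomial PMF at 5
print((heads == 5).mean(), scipy.stats.binom(n, θ).pmf(5))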
The mean and variance are

n = 10
θ = 0.5
u = scipy.stats.binom(n, θ)

u.mean(), u.var()


(5.0, 2.5)

The formula for the mean is 𝑛𝜃 and the formula for the variance is 𝑛𝜃(1 − 𝜃).
Here’s the PMF

u.pmf(1)

0.009765625000000002

fig, ax = plt.subplots()
S = np.arange(1, n+1)
ax.plot(S, u.pmf(S), linestyle='', marker='o', alpha=0.8, ms=4)
ax.vlines(S, 0, u.pmf(S), lw=0.2)
ax.set_xticks(S)
plt.show()

Here’s the CDF

fig, ax = plt.subplots()
S = np.arange(1, n+1)
ax.step(S, u.cdf(S))
ax.vlines(S, 0, u.cdf(S), lw=0.2)
ax.set_xticks(S)
plt.show()


Exercise 15.2.2
Using u.pmf, check that our definition of the CDF given above calculates the same function as u.cdf.

Solution to Exercise 15.2.2


Here is one solution

fig, ax = plt.subplots()
S = np.arange(1, n+1)
u_sum = np.cumsum(u.pmf(S))
ax.step(S, u_sum)
ax.vlines(S, 0, u_sum, lw=0.2)
ax.set_xticks(S)
plt.show()


We can see that the output graph is the same as the one above.

Poisson distribution

The Poisson distribution on $S = \{0, 1, \ldots\}$ with parameter $\lambda > 0$ has PMF

$$
p(i) = \frac{\lambda^i}{i!} e^{-\lambda}
$$
The interpretation of 𝑝(𝑖) is: the number of events in a fixed time interval, where the events occur at a constant rate 𝜆
and independently of each other.
The mean and variance are

λ = 2
u = scipy.stats.poisson(λ)

u.mean(), u.var()

(2.0, 2.0)

The expectation of the Poisson distribution is 𝜆 and the variance is also 𝜆.


Here’s the PMF

λ = 2
u = scipy.stats.poisson(λ)


u.pmf(1)

0.2706705664732254

fig, ax = plt.subplots()
S = np.arange(1, n+1)
ax.plot(S, u.pmf(S), linestyle='', marker='o', alpha=0.8, ms=4)
ax.vlines(S, 0, u.pmf(S), lw=0.2)
ax.set_xticks(S)
plt.show()

15.2.2 Continuous distributions

Continuous distributions are represented by a density function, which is a function $p$ over $\mathbb{R}$ (the set of all real numbers) such that $p(x) \geq 0$ for all $x$ and

$$
\int_{-\infty}^{\infty} p(x) \, dx = 1
$$

We say that random variable 𝑋 has distribution 𝑝 if


$$
\mathbb{P}\{a < X < b\} = \int_a^b p(x) \, dx
$$

for all 𝑎 ≤ 𝑏.
The definition of the mean and variance of a random variable 𝑋 with distribution 𝑝 are the same as the discrete case,
after replacing the sum with an integral.


For example, the mean of 𝑋 is



$$
\mathbb{E} X = \int_{-\infty}^{\infty} x p(x) \, dx
$$

The cumulative distribution function (CDF) of 𝑋 is defined by


$$
F(x) = \mathbb{P}\{X \leq x\} = \int_{-\infty}^{x} p(x) \, dx
$$

Normal distribution

Perhaps the most famous distribution is the normal distribution, which has density

$$
p(x) = \frac{1}{\sqrt{2\pi} \, \sigma} \exp\left(- \frac{(x - \mu)^2}{2 \sigma^2}\right)
$$

This distribution has two parameters, 𝜇 and 𝜎.


It can be shown that, for this distribution, the mean is 𝜇 and the variance is 𝜎2 .
We can obtain the moments, PDF, and CDF of the normal density as follows:

μ, σ = 0.0, 1.0
u = scipy.stats.norm(μ, σ)

u.mean(), u.var()

(0.0, 1.0)

Here’s a plot of the density — the famous “bell-shaped curve”:

μ_vals = [-1, 0, 1]
σ_vals = [0.4, 1, 1.6]
fig, ax = plt.subplots()
x_grid = np.linspace(-4, 4, 200)

for μ, σ in zip(μ_vals, σ_vals):


u = scipy.stats.norm(μ, σ)
ax.plot(x_grid, u.pdf(x_grid),
alpha=0.5, lw=2,
label=f'$\mu={μ}, \sigma={σ}$')

plt.legend()
plt.show()


Here’s a plot of the CDF:

fig, ax = plt.subplots()
for μ, σ in zip(μ_vals, σ_vals):
u = scipy.stats.norm(μ, σ)
ax.plot(x_grid, u.cdf(x_grid),
alpha=0.5, lw=2,
label=f'$\mu={μ}, \sigma={σ}$')
ax.set_ylim(0, 1)
plt.legend()
plt.show()


Lognormal distribution

The lognormal distribution is a distribution on (0, ∞) with density


$$
p(x) = \frac{1}{\sigma x \sqrt{2\pi}} \exp\left(- \frac{(\log x - \mu)^2}{2 \sigma^2}\right)
$$

This distribution has two parameters, 𝜇 and 𝜎.


It can be shown that, for this distribution, the mean is exp (𝜇 + 𝜎2 /2) and the variance is [exp (𝜎2 ) − 1] exp (2𝜇 + 𝜎2 ).
It has a nice interpretation: if 𝑋 is lognormally distributed, then log 𝑋 is normally distributed.
It is often used to model variables that are “multiplicative” in nature, such as income or asset prices.
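Here is a minimal simulation sketch of that interpretation: we draw from a lognormal distribution, take logs, and check that the sample mean and standard deviation of the logs are close to μ and σ (the parameter values and sample size are arbitrary):

import numpy as np
import scipy.stats

μ, σ = 0.5, 0.8
u = scipy.stats.lognorm(s=σ, scale=np.exp(μ))

log_draws = np.log(u.rvs(100_000))
print(log_draws.mean(), log_draws.std())  # should be close to μ and σ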
We can obtain the moments, PDF, and CDF of the lognormal density as follows:

μ, σ = 0.0, 1.0
u = scipy.stats.lognorm(s=σ, scale=np.exp(μ))

u.mean(), u.var()

(1.6487212707001282, 4.670774270471604)

μ_vals = [-1, 0, 1]
σ_vals = [0.25, 0.5, 1]
x_grid = np.linspace(0, 3, 200)

fig, ax = plt.subplots()


for μ, σ in zip(μ_vals, σ_vals):
u = scipy.stats.lognorm(σ, scale=np.exp(μ))
ax.plot(x_grid, u.pdf(x_grid),
alpha=0.5, lw=2,
label=f'$\mu={μ}, \sigma={σ}$')

plt.legend()
plt.show()

fig, ax = plt.subplots()
μ = 1
for σ in σ_vals:
    u = scipy.stats.lognorm(σ, scale=np.exp(μ))   # lognormal CDF, not normal
    ax.plot(x_grid, u.cdf(x_grid),
            alpha=0.5, lw=2,
            label=f'$\mu={μ}, \sigma={σ}$')
ax.set_ylim(0, 1)
ax.set_xlim(0, 3)
plt.legend()
plt.show()


Exponential distribution

The exponential distribution is a distribution on (0, ∞) with density

𝑝(𝑥) = 𝜆 exp (−𝜆𝑥)

This distribution has one parameter, 𝜆.


It is related to the Poisson distribution as it describes the distribution of the length of the time interval between two
consecutive events in a Poisson process.
It can be shown that, for this distribution, the mean is 1/𝜆 and the variance is 1/𝜆2 .
We can obtain the moments, PDF, and CDF of the exponential density as follows:

λ = 1.0
u = scipy.stats.expon(scale=1/λ)

u.mean(), u.var()

(1.0, 1.0)

fig, ax = plt.subplots()
λ_vals = [0.5, 1, 2]
x_grid = np.linspace(0, 6, 200)

for λ in λ_vals:
u = scipy.stats.expon(scale=1/λ)
ax.plot(x_grid, u.pdf(x_grid),


alpha=0.5, lw=2,
label=f'$\lambda={λ}$')
plt.legend()
plt.show()

fig, ax = plt.subplots()
for λ in λ_vals:
u = scipy.stats.expon(scale=1/λ)
ax.plot(x_grid, u.cdf(x_grid),
alpha=0.5, lw=2,
label=f'$\lambda={λ}$')
ax.set_ylim(0, 1)
plt.legend()
plt.show()


Beta distribution

The beta distribution is a distribution on (0, 1) with density


$$
p(x) = \frac{\Gamma(\alpha + \beta)}{\Gamma(\alpha) \Gamma(\beta)} x^{\alpha - 1} (1 - x)^{\beta - 1}
$$
where Γ is the gamma function.
(The role of the gamma function is just to normalize the density, so that it integrates to one.)
This distribution has two parameters, 𝛼 > 0 and 𝛽 > 0.
It can be shown that, for this distribution, the mean is $\alpha/(\alpha + \beta)$ and the variance is $\alpha \beta / \left[(\alpha + \beta)^2 (\alpha + \beta + 1)\right]$.
We can obtain the moments, PDF, and CDF of the beta density as follows:

α, β = 3.0, 1.0
u = scipy.stats.beta(α, β)

u.mean(), u.var()

(0.75, 0.0375)

α_vals = [0.5, 1, 5, 25, 3]


β_vals = [3, 1, 10, 20, 0.5]
x_grid = np.linspace(0, 1, 200)

fig, ax = plt.subplots()
for α, β in zip(α_vals, β_vals):


u = scipy.stats.beta(α, β)
ax.plot(x_grid, u.pdf(x_grid),
alpha=0.5, lw=2,
label=fr'$\alpha={α}, \beta={β}$')
plt.legend()
plt.show()

fig, ax = plt.subplots()
for α, β in zip(α_vals, β_vals):
u = scipy.stats.beta(α, β)
ax.plot(x_grid, u.cdf(x_grid),
alpha=0.5, lw=2,
label=fr'$\alpha={α}, \beta={β}$')
ax.set_ylim(0, 1)
plt.legend()
plt.show()


Gamma distribution

The gamma distribution is a distribution on (0, ∞) with density

$$
p(x) = \frac{\beta^\alpha}{\Gamma(\alpha)} x^{\alpha - 1} \exp(-\beta x)
$$

This distribution has two parameters, 𝛼 > 0 and 𝛽 > 0.


It can be shown that, for this distribution, the mean is 𝛼/𝛽 and the variance is 𝛼/𝛽 2 .
One interpretation is that if 𝑋 is gamma distributed and 𝛼 is an integer, then 𝑋 is the sum of 𝛼 independent exponentially
distributed random variables with mean 1/𝛽.
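Here is a small simulation sketch of that interpretation: we sum α independent exponential draws, each with mean 1/β, and compare the sample mean and variance of the sums with α/β and α/β² (the parameter values and sample size are arbitrary):

import scipy.stats

α, β = 3, 2.0
num_sims = 100_000

# Each row holds α independent exponential draws with mean 1/β
exp_draws = scipy.stats.expon(scale=1/β).rvs(size=(num_sims, α))
sums = exp_draws.sum(axis=1)

print(sums.mean(), sums.var())  # compare with the gamma moments below
print(α/β, α/β**2)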
We can obtain the moments, PDF, and CDF of the gamma density as follows:

α, β = 3.0, 2.0
u = scipy.stats.gamma(α, scale=1/β)

u.mean(), u.var()

(1.5, 0.75)

α_vals = [1, 3, 5, 10]


β_vals = [3, 5, 3, 3]
x_grid = np.linspace(0, 7, 200)

fig, ax = plt.subplots()
for α, β in zip(α_vals, β_vals):


u = scipy.stats.gamma(α, scale=1/β)
ax.plot(x_grid, u.pdf(x_grid),
alpha=0.5, lw=2,
label=fr'$\alpha={α}, \beta={β}$')
plt.legend()
plt.show()

fig, ax = plt.subplots()
for α, β in zip(α_vals, β_vals):
u = scipy.stats.gamma(α, scale=1/β)
ax.plot(x_grid, u.cdf(x_grid),
alpha=0.5, lw=2,
label=fr'$\alpha={α}, \beta={β}$')
ax.set_ylim(0, 1)
plt.legend()
plt.show()


15.3 Observed distributions

Sometimes we refer to observed data or measurements as “distributions”.


For example, let’s say we observe the income of 10 people over a year:

data = [['Hiroshi', 1200],


['Ako', 1210],
['Emi', 1400],
['Daiki', 990],
['Chiyo', 1530],
['Taka', 1210],
['Katsuhiko', 1240],
['Daisuke', 1124],
['Yoshi', 1330],
['Rie', 1340]]

df = pd.DataFrame(data, columns=['name', 'income'])


df

name income
0 Hiroshi 1200
1 Ako 1210
2 Emi 1400
3 Daiki 990
4 Chiyo 1530
5 Taka 1210
6 Katsuhiko 1240
7 Daisuke 1124


8 Yoshi 1330
9 Rie 1340

In this situation, we might refer to the set of their incomes as the “income distribution.”
The terminology is confusing because this set is not a probability distribution — it’s just a collection of numbers.
However, as we will see, there are connections between observed distributions (i.e., sets of numbers like the income
distribution above) and probability distributions.
Below we explore some observed distributions.

15.3.1 Summary statistics

Suppose we have an observed distribution with values {𝑥1 , … , 𝑥𝑛 }


The sample mean of this distribution is defined as

$$
\bar{x} = \frac{1}{n} \sum_{i=1}^{n} x_i
$$

The sample variance is defined as

$$
\frac{1}{n} \sum_{i=1}^{n} (x_i - \bar{x})^2
$$

For the income distribution given above, we can calculate these numbers via

x = np.asarray(df['income'])

x.mean(), x.var()

(1257.4, 20412.839999999997)

Exercise 15.3.1
Check that the formulas given above produce the same numbers.

15.3.2 Visualization

Let’s look at different ways that we can visualize one or more observed distributions.
We will cover
• histograms
• kernel density estimates and
• violin plots


Histograms

We can histogram the income distribution we just constructed as follows

x = df['income']
fig, ax = plt.subplots()
ax.hist(x, bins=5, density=True, histtype='bar')
plt.show()

Let’s look at a distribution from real data.


In particular, we will look at the monthly return on Amazon shares between 2000/1/1 and 2023/1/1.
The monthly return is calculated as the percent change in the share price over each month.
So we will have one observation for each month.

df = yf.download('AMZN', '2000-1-1', '2023-1-1', interval='1mo' )


prices = df['Adj Close']
data = prices.pct_change()[1:] * 100
data.head()

[*********************100%%**********************] 1 of 1 completed

Date
2000-02-01 6.679568
2000-03-01 -2.722323
2000-04-01 -17.630592
2000-05-01 -12.457531


2000-06-01 -24.838297
Name: Adj Close, dtype: float64

The first observation is the monthly return (percent change) over January 2000, which was

data[0]

6.6795679502808625

Let’s turn the return observations into an array and histogram it.

x_amazon = np.asarray(data)

fig, ax = plt.subplots()
ax.hist(x_amazon, bins=20)
plt.show()

Kernel density estimates

Kernel density estimate (KDE) is a non-parametric way to estimate and visualize the PDF of a distribution.
KDE will generate a smooth curve that approximates the PDF.

fig, ax = plt.subplots()
sns.kdeplot(x_amazon, ax=ax)
plt.show()


The smoothness of the KDE is dependent on how we choose the bandwidth.

fig, ax = plt.subplots()
sns.kdeplot(x_amazon, ax=ax, bw_adjust=0.1, alpha=0.5, label="bw=0.1")
sns.kdeplot(x_amazon, ax=ax, bw_adjust=0.5, alpha=0.5, label="bw=0.5")
sns.kdeplot(x_amazon, ax=ax, bw_adjust=1, alpha=0.5, label="bw=1")
plt.legend()
plt.show()

When we use a larger bandwidth, the KDE is smoother.


A suitable bandwidth is not too smooth (underfitting) or too wiggly (overfitting).

Violin plots

Yet another way to display an observed distribution is via a violin plot.

fig, ax = plt.subplots()
ax.violinplot(x_amazon)
plt.show()

Violin plots are particularly useful when we want to compare different distributions.
For example, let’s compare the monthly returns on Amazon shares with the monthly return on Apple shares.

df = yf.download('AAPL', '2000-1-1', '2023-1-1', interval='1mo' )


prices = df['Adj Close']
data = prices.pct_change()[1:] * 100
x_apple = np.asarray(data)

[*********************100%%**********************] 1 of 1 completed

fig, ax = plt.subplots()
ax.violinplot([x_amazon, x_apple])
plt.show()


15.3.3 Connection to probability distributions

Let’s discuss the connection between observed distributions and probability distributions.
Sometimes it’s helpful to imagine that an observed distribution is generated by a particular probability distribution.
For example, we might look at the returns from Amazon above and imagine that they were generated by a normal distri-
bution.
Even though this is not true, it might be a helpful way to think about the data.
Here we match a normal distribution to the Amazon monthly returns by setting the sample mean to the mean of the
normal distribution and the sample variance equal to the variance.
Then we plot the density and the histogram.

μ = x_amazon.mean()
σ_squared = x_amazon.var()
σ = np.sqrt(σ_squared)
u = scipy.stats.norm(μ, σ)

x_grid = np.linspace(-50, 65, 200)


fig, ax = plt.subplots()
ax.plot(x_grid, u.pdf(x_grid))
ax.hist(x_amazon, density=True, bins=40)
plt.show()


The match between the histogram and the density is not very bad but also not very good.
One reason is that the normal distribution is not really a good fit for this observed data — we will discuss this point again
when we talk about heavy tailed distributions.
Of course, if the data really is generated by the normal distribution, then the fit will be better.
Let’s see this in action
• first we generate random draws from the normal distribution
• then we histogram them and compare with the density.

μ, σ = 0, 1
u = scipy.stats.norm(μ, σ)
N = 2000 # Number of observations
x_draws = u.rvs(N)
x_grid = np.linspace(-4, 4, 200)
fig, ax = plt.subplots()
ax.plot(x_grid, u.pdf(x_grid))
ax.hist(x_draws, density=True, bins=40)
plt.show()


Note that if you keep increasing 𝑁 , which is the number of observations, the fit will get better and better.
This convergence is a version of the “law of large numbers”, which we will discuss later.



CHAPTER

SIXTEEN

LLN AND CLT

16.1 Overview

This lecture illustrates two of the most important results in probability and statistics:
1. the law of large numbers (LLN) and
2. the central limit theorem (CLT).
These beautiful theorems lie behind many of the most fundamental results in econometrics and quantitative economic
modeling.
The lecture is based around simulations that show the LLN and CLT in action.
We also demonstrate how the LLN and CLT break down when the assumptions they are based on do not hold.
This lecture will focus on the univariate case (the multivariate case is treated in a more advanced lecture).
We’ll need the following imports:

import matplotlib.pyplot as plt


import random
import numpy as np
import scipy.stats as st

16.2 The law of large numbers

We begin with the law of large numbers, which tells us when sample averages will converge to their population means.

16.2.1 The LLN in action

Let’s see an example of the LLN in action before we go further.


Consider a Bernoulli random variable 𝑋 with parameter 𝑝.
This means that 𝑋 takes values in {0, 1} and ℙ{𝑋 = 1} = 𝑝.
We can think of drawing 𝑋 as tossing a biased coin where
• the coin falls on “heads” with probability 𝑝 and
• the coin falls on “tails” with probability 1 − 𝑝


We set 𝑋 = 1 if the coin is “heads” and zero otherwise.


The (population) mean of 𝑋 is

𝔼𝑋 = 0 ⋅ ℙ{𝑋 = 0} + 1 ⋅ ℙ{𝑋 = 1} = ℙ{𝑋 = 1} = 𝑝

We can generate a draw of 𝑋 with scipy.stats (imported as st) as follows:

p = 0.8
X = st.bernoulli.rvs(p)
print(X)

In this setting, the LLN tells us if we flip the coin many times, the fraction of heads that we see will be close to the mean
𝑝.
Let’s check this:

n = 1_000_000
X_draws = st.bernoulli.rvs(p, size=n)
print(X_draws.mean()) # count the number of 1's and divide by n

0.800343

If we change 𝑝 the claim still holds:

p = 0.3
X_draws = st.bernoulli.rvs(p, size=n)
print(X_draws.mean())

0.299651

Let’s connect this to the discussion above, where we said the sample average converges to the “population mean”.
Think of 𝑋1 , … , 𝑋𝑛 as independent flips of the coin.
The population mean is the mean in an infinite sample, which equals the expectation 𝔼𝑋.
The sample mean of the draws 𝑋1 , … , 𝑋𝑛 is

$$
\bar{X}_n := \frac{1}{n} \sum_{i=1}^{n} X_i
$$

In this case, it is the fraction of draws that equal one (the number of heads divided by 𝑛).
Thus, the LLN tells us that for the Bernoulli trials above

𝑋̄ 𝑛 → 𝔼𝑋 = 𝑝 (𝑛 → ∞) (16.1)

This is exactly what we illustrated in the code.


16.2.2 Statement of the LLN

Let’s state the LLN more carefully.


Let 𝑋1 , … , 𝑋𝑛 be random variables, all of which have the same distribution.
These random variables can be continuous or discrete.
For simplicity we will
• assume they are continuous and
• let 𝑓 denote their common density function
The last statement means that for any 𝑖 in {1, … , 𝑛} and any numbers 𝑎, 𝑏,
$$
\mathbb{P}\{a \leq X_i \leq b\} = \int_a^b f(x) \, dx
$$

(For the discrete case, we need to replace densities with probability mass functions and integrals with sums.)
Let 𝜇 denote the common mean of this sample.
Thus, for each 𝑖,

$$
\mu := \mathbb{E} X_i = \int_{-\infty}^{\infty} x f(x) \, dx
$$

The sample mean is

$$
\bar{X}_n := \frac{1}{n} \sum_{i=1}^{n} X_i
$$

The next theorem is called Kolmogorov’s strong law of large numbers.

Theorem 16.2.1
If 𝑋1 , … , 𝑋𝑛 are IID and 𝔼|𝑋| is finite, then

ℙ {𝑋̄ 𝑛 → 𝜇 as 𝑛 → ∞} = 1 (16.2)

Here
• IID means independent and identically distributed and

• $\mathbb{E}|X| = \int_{-\infty}^{\infty} |x| f(x) \, dx$

16.2.3 Comments on the theorem

What does the probability one statement in the theorem mean?


Let’s think about it from a simulation perspective, imagining for a moment that our computer can generate perfect random
samples (although this isn’t strictly true).
Let’s also imagine that we can generate infinite sequences so that the statement 𝑋̄ 𝑛 → 𝜇 can be evaluated.
In this setting, (16.2) should be interpreted as meaning that the probability of the computer producing a sequence where
𝑋̄ 𝑛 → 𝜇 fails to occur is zero.


16.2.4 Illustration

Let’s illustrate the LLN using simulation.


When we illustrate it, we will use a key idea: the sample mean 𝑋̄ 𝑛 is itself a random variable.
The reason 𝑋̄ 𝑛 is a random variable is that it’s a function of the random variables 𝑋1 , … , 𝑋𝑛 .
What we are going to do now is
1. pick some fixed distribution to draw each 𝑋𝑖 from
2. set 𝑛 to some large number
and then repeat the following three instructions.
1. generate the draws 𝑋1 , … , 𝑋𝑛
2. calculate the sample mean 𝑋̄ 𝑛 and record its value in an array sample_means
3. go to step 1.
We will loop over these three steps 𝑚 times, where 𝑚 is some large integer.
The array sample_means will now contain 𝑚 draws of the random variable 𝑋̄ 𝑛 .
If we histogram these observations of 𝑋̄ 𝑛 , we should see that they are clustered around the population mean 𝔼𝑋.
Moreover, if we repeat the exercise with a larger value of 𝑛, we should see that the observations are even more tightly
clustered around the population mean.
This is, in essence, what the LLN is telling us.
To implement these steps, we will use functions.
Our first function generates a sample mean of size 𝑛 given a distribution.

def draw_means(X_distribution,  # The distribution of each X_i
               n):              # The size of the sample mean

    # Generate n draws: X_1, ..., X_n
    X_samples = X_distribution.rvs(size=n)

    # Return the sample mean
    return np.mean(X_samples)

Now we write a function to generate 𝑚 sample means and histogram them.

def generate_histogram(X_distribution, n, m):

    # Compute m sample means
    sample_means = np.empty(m)
    for j in range(m):
        sample_means[j] = draw_means(X_distribution, n)

    # Generate a histogram
    fig, ax = plt.subplots()
    ax.hist(sample_means, bins=30, alpha=0.5, density=True)
    μ = X_distribution.mean()  # Get the population mean
    σ = X_distribution.std()   # and the standard deviation
    ax.axvline(x=μ, ls="--", c="k", label=fr"$\mu = {μ}$")
    ax.set_xlim(μ - σ, μ + σ)
    ax.set_xlabel(r'$\bar X_n$', size=12)
    ax.set_ylabel('density', size=12)
    ax.legend()
    plt.show()

Now we call the function.

# Pick a distribution to draw each $X_i$ from
X_distribution = st.norm(loc=5, scale=2)

# Call the function
generate_histogram(X_distribution, n=1_000, m=1000)

We can see that the distribution of 𝑋̄ 𝑛 is clustered around 𝔼𝑋 as expected.


Let’s vary n to see how the distribution of the sample mean changes.
We will use a violin plot to show the different distributions.
Each distribution in the violin plot represents the distribution of 𝑋̄ 𝑛 for some 𝑛, calculated by simulation.

def means_violin_plot(distribution,
                      ns = [1_000, 10_000, 100_000],
                      m = 10_000):

    data = []
    for n in ns:
        sample_means = [draw_means(distribution, n) for i in range(m)]
        data.append(sample_means)

    fig, ax = plt.subplots()

    ax.violinplot(data)
    μ = distribution.mean()
    ax.axhline(y=μ, ls="--", c="k", label=fr"$\mu = {μ}$")

    labels = [fr'$n = {n}$' for n in ns]

    ax.set_xticks(np.arange(1, len(labels) + 1), labels=labels)
    ax.set_xlim(0.25, len(labels) + 0.75)

    plt.subplots_adjust(bottom=0.15, wspace=0.05)

    ax.set_ylabel('density', size=12)
    ax.legend()
    plt.show()

Let’s try with a normal distribution.

means_violin_plot(st.norm(loc=5, scale=2))

As 𝑛 gets large, more probability mass clusters around the population mean 𝜇.
Now let’s try with a Beta distribution.

means_violin_plot(st.beta(6, 6))


We get a similar result.

16.3 Breaking the LLN

We have to pay attention to the assumptions in the statement of the LLN.


If these assumptions do not hold, then the LLN might fail.

16.3.1 Infinite first moment

As indicated by the theorem, the LLN can break when 𝔼|𝑋| is not finite.
We can demonstrate this using the Cauchy distribution.
The Cauchy distribution has the following property:
If 𝑋1 , … , 𝑋𝑛 are IID and Cauchy, then so is 𝑋̄ 𝑛 .
This means that the distribution of 𝑋̄ 𝑛 does not eventually concentrate on a single number.
Hence the LLN does not hold.
The LLN fails to hold here because the assumption 𝔼|𝑋| < ∞ in the theorem is violated: for the Cauchy distribution, 𝔼|𝑋| = ∞.
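Here is a quick numerical illustration of this point (a sketch reusing st from the code above): even for very large 𝑛, independent realizations of the sample mean remain widely dispersed.

# Sample means of Cauchy draws do not settle down as n grows (illustrative sketch)
n = 1_000_000
sample_means = [st.cauchy.rvs(size=n).mean() for _ in range(5)]
print(sample_means)  # five independent estimates, typically far apart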


16.3.2 Failure of the IID condition

The LLN can also fail to hold when the IID assumption is violated.
For example, suppose that

𝑋0 ∼ 𝑁 (0, 1) and 𝑋𝑖 = 𝑋𝑖−1 for 𝑖 = 1, ..., 𝑛

In this case,

$\bar X_n = \frac{1}{n} \sum_{i=1}^n X_i = X_0 \sim N(0, 1)$

Therefore, the distribution of 𝑋̄ 𝑛 is 𝑁 (0, 1) for all 𝑛!


Does this contradict the LLN, which says that the distribution of 𝑋̄ 𝑛 collapses to the single point 𝜇?
No, the LLN is correct — the issue is that its assumptions are not satisfied.
In particular, the sequence 𝑋1 , … , 𝑋𝑛 is not independent.

Note: Although in this case the violation of IID breaks the LLN, there are situations where IID fails but the LLN still
holds.
We will show an example in the exercise.

16.4 Central limit theorem

Next, we turn to the central limit theorem (CLT), which tells us about the distribution of the deviation between sample
averages and population means.

16.4.1 Statement of the theorem

The central limit theorem is one of the most remarkable results in all of mathematics.
In the IID setting, it tells us the following:

Theorem 16.4.1
If 𝑋1 , … , 𝑋𝑛 is IID with common mean 𝜇 and common variance 𝜎2 ∈ (0, ∞), then
$\sqrt{n}\,(\bar X_n - \mu) \stackrel{d}{\to} N(0, \sigma^2) \quad \text{as } n \to \infty$   (16.3)

Here $\stackrel{d}{\to} N(0, \sigma^2)$ indicates convergence in distribution to a centered (i.e., zero mean) normal with standard deviation $\sigma$.
The striking implication of the CLT is that for any distribution with finite second moment, the simple operation of adding
independent copies always leads to a Gaussian curve.


16.4.2 Simulation 1

Since the CLT seems almost magical, running simulations that verify its implications is one good way to build understanding.
To this end, we now perform the following simulation
1. Choose an arbitrary distribution 𝐹 for the underlying observations 𝑋𝑖 .

2. Generate independent draws of 𝑌𝑛 ∶= √𝑛(𝑋̄ 𝑛 − 𝜇).
3. Use these draws to compute some measure of their distribution — such as a histogram.
4. Compare the latter to 𝑁 (0, 𝜎2 ).
Here’s some code that does exactly this for the exponential distribution 𝐹 (𝑥) = 1 − 𝑒−𝜆𝑥 .
(Please experiment with other choices of 𝐹 , but remember that, to conform with the conditions of the CLT, the distribution
must have a finite second moment.)

# Set parameters
n = 250 # Choice of n
k = 1_000_000 # Number of draws of Y_n
distribution = st.expon(scale=2)  # Exponential distribution with rate λ = 1/2 (scale = 1/λ)
μ, σ = distribution.mean(), distribution.std()

# Draw underlying RVs. Each row contains a draw of X_1,..,X_n


data = distribution.rvs((k, n))
# Compute mean of each row, producing k draws of \bar X_n
sample_means = data.mean(axis=1)
# Generate observations of Y_n
Y = np.sqrt(n) * (sample_means - μ)

# Plot
fig, ax = plt.subplots(figsize=(10, 6))
xmin, xmax = -3 * σ, 3 * σ
ax.set_xlim(xmin, xmax)
ax.hist(Y, bins=60, alpha=0.4, density=True)
xgrid = np.linspace(xmin, xmax, 200)
ax.plot(xgrid, st.norm.pdf(xgrid, scale=σ),
'k-', lw=2, label='$N(0, \sigma^2)$')
ax.set_xlabel(r"$Y_n$", size=12)
ax.set_ylabel(r"$density$", size=12)

ax.legend()

plt.show()


(Notice the absence of for loops — every operation is vectorized, meaning that the major calculations are all shifted to
fast C code.)
The fit to the normal density is already tight and can be further improved by increasing n.

16.5 Exercises

Exercise 16.5.1
Repeat the simulation above with the Beta distribution.
You can choose any 𝛼 > 0 and 𝛽 > 0.

Solution to Exercise 16.5.1

# Set parameters
n = 250 # Choice of n
k = 1_000_000 # Number of draws of Y_n
distribution = st.beta(2,2) # We chose Beta(2, 2) as an example
μ, σ = distribution.mean(), distribution.std()

# Draw underlying RVs. Each row contains a draw of X_1,..,X_n


data = distribution.rvs((k, n))
# Compute mean of each row, producing k draws of \bar X_n
sample_means = data.mean(axis=1)
# Generate observations of Y_n
Y = np.sqrt(n) * (sample_means - μ)

# Plot
fig, ax = plt.subplots(figsize=(10, 6))
xmin, xmax = -3 * σ, 3 * σ
ax.set_xlim(xmin, xmax)
ax.hist(Y, bins=60, alpha=0.4, density=True)
ax.set_xlabel(r"$Y_n$", size=12)
ax.set_ylabel(r"$density$", size=12)
xgrid = np.linspace(xmin, xmax, 200)
ax.plot(xgrid, st.norm.pdf(xgrid, scale=σ), 'k-', lw=2, label='$N(0, \sigma^2)$')
ax.legend()

plt.show()

Exercise 16.5.2
At the start of this lecture we discussed Bernoulli random variables.
NumPy doesn’t provide a bernoulli function that we can sample from.
However, we can generate a draw of Bernoulli 𝑋 using NumPy via

U = np.random.rand()
X = 1 if U < p else 0
print(X)

Explain why this provides a random variable 𝑋 with the right distribution.

Solution to Exercise 16.5.2


We can write 𝑋 as 𝑋 = 1{𝑈 < 𝑝} where 1 is the indicator function (i.e., 1 if the statement is true and zero otherwise).
Here we generated a uniform draw 𝑈 on [0, 1] and then used the fact that

ℙ{0 ≤ 𝑈 < 𝑝} = 𝑝 − 0 = 𝑝

This means that 𝑋 = 1{𝑈 < 𝑝} has the right distribution.
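A quick simulation supports this claim (a sketch reusing np and p from above):

# Check that the inverse-transform draw has mean close to p (illustrative sketch)
p = 0.3
U = np.random.rand(1_000_000)
X = (U < p).astype(int)
print(X.mean())  # should be close to p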


Exercise 16.5.3
We mentioned above that the LLN can sometimes still hold when the IID assumption is violated.
Let’s investigate this claim further.
Consider the AR(1) process

𝑋𝑡+1 = 𝛼 + 𝛽𝑋𝑡 + 𝜎𝜖𝑡+1

where 𝛼, 𝛽, 𝜎 are constants and 𝜖1 , 𝜖2 , … are IID and standard normal.


Suppose that
$X_0 \sim N\left( \frac{\alpha}{1-\beta}, \, \frac{\sigma^2}{1-\beta^2} \right)$
This process violates the independence assumption of the LLN (since 𝑋𝑡+1 depends on the value of 𝑋𝑡 ).
However, this exercise teaches us that LLN-type convergence of the sample mean to the population mean still occurs.
1. Prove that the sequence 𝑋1 , 𝑋2 , … is identically distributed.
2. Show that LLN convergence holds using simulations with 𝛼 = 0.8, 𝛽 = 0.2.

Solution to Exercise 16.5.3


Q1 Solution
Regarding part 1, we claim that 𝑋𝑡 has the same distribution as 𝑋0 for all 𝑡.
To construct a proof, we suppose that the claim is true for 𝑋𝑡 .
Now we claim it is also true for 𝑋𝑡+1 .
Observe that we have the correct mean:
$\mathbb{E}X_{t+1} = \alpha + \beta \mathbb{E}X_t = \alpha + \beta \frac{\alpha}{1-\beta} = \frac{\alpha}{1-\beta}$
We also have the correct variance:
$\mathrm{Var}(X_{t+1}) = \beta^2 \mathrm{Var}(X_t) + \sigma^2 = \frac{\beta^2 \sigma^2}{1-\beta^2} + \sigma^2 = \frac{\sigma^2}{1-\beta^2}$
Finally, since both 𝑋𝑡 and 𝜖𝑡+1 are normally distributed and independent from each other, any linear combination of these two variables is also normally distributed.
We have now shown that
$X_{t+1} \sim N\left( \frac{\alpha}{1-\beta}, \, \frac{\sigma^2}{1-\beta^2} \right)$
We can conclude this AR(1) process violates the independence assumption but is identically distributed.
Q2 Solution


σ = 10
α = 0.8
β = 0.2
n = 100_000

fig, ax = plt.subplots(figsize=(10, 6))


x = np.ones(n)
# Draw X_0 from its stationary distribution N(α/(1-β), σ²/(1-β²))
x[0] = st.norm.rvs(α/(1-β), scale=σ/np.sqrt(1-β**2))
ϵ = st.norm.rvs(size=n+1)
means = np.ones(n)
means[0] = x[0]
for t in range(n-1):
    x[t+1] = α + β * x[t] + σ * ϵ[t+1]
    means[t+1] = np.mean(x[:t+2])

ax.scatter(range(100, n), means[100:n], s=10, alpha=0.5)

ax.set_xlabel(r"$n$", size=12)
ax.set_ylabel(r"$\bar X_n$", size=12)
yabs_max = max(ax.get_ylim(), key=abs)
ax.axhline(y=α/(1-β), ls="--", lw=3,
label=r"$\mu = \frac{\alpha}{1-\beta}$",
color = 'black')

plt.legend()
plt.show()

We see that 𝑋̄ 𝑛 converges to 𝜇 even though the independence assumption is violated.



CHAPTER SEVENTEEN

MONTE CARLO AND OPTION PRICING

17.1 Overview

Simple probability calculations can be done either


• with pencil and paper, or
• by looking up facts about well known probability distributions, or
• in our heads.
For example, we can easily work out
• the probability of three heads in five flips of a fair coin
• the expected value of a random variable that equals −10 with probability 1/2 and 100 with probability 1/2.
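Both of these can be verified in a line or two. Here is a quick sketch (scipy.stats is not part of this lecture's imports, so we bring it in here purely for the check):

# Quick checks of the two examples above (illustrative sketch)
from scipy.stats import binom

print(binom.pmf(3, 5, 0.5))     # P(3 heads in 5 fair flips) = 10/32 = 0.3125
print(0.5 * (-10) + 0.5 * 100)  # expected value in the second example = 45.0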
But some probability calculations are very complex.
Complex calculations concerning probabilities and expectations occur in many economic and financial problems.
Perhaps the most important tools for handling complicated probability calculations are Monte Carlo methods.
In this lecture we introduce Monte Carlo methods for computing expectations, with some applications in finance.
We will use the following imports.

import numpy as np
import matplotlib.pyplot as plt
from numpy.random import randn

17.2 An introduction to Monte Carlo

In this section we describe how Monte Carlo can be used to compute expectations.


17.2.1 Share price with known distribution

Suppose that we are considering buying a share in some company.


Our plan is either to
1. buy the share now, hold it for one year and then sell it, or
2. do something else with our money.
We start by thinking of the share price in one year as a random variable 𝑆.
Before deciding whether or not to buy the share, we need to know some features of the distribution of 𝑆.
For example, suppose the mean of 𝑆 is high relative to the price of buying the share.
This suggests we have a good chance of selling at a relatively high price.
Suppose, however, that the variance of 𝑆 is also high.
This suggests that buying the share is risky, so perhaps we should refrain.
Either way, this discussion shows the importance of understanding the distribution of 𝑆.
Suppose that, after analyzing the data, we guess that 𝑆 is well represented by a lognormal distribution with parameters
𝜇, 𝜎 .
• 𝑆 has the same distribution as exp(𝜇 + 𝜎𝑍) where 𝑍 is standard normal.
• we write this statement as 𝑆 ∼ 𝐿𝑁 (𝜇, 𝜎).
Any good reference on statistics (such as Wikipedia) will tell us that the mean and variance are

$\mathbb{E}S = \exp\left( \mu + \frac{\sigma^2}{2} \right)$

and

$\mathrm{Var}\, S = \left[ \exp(\sigma^2) - 1 \right] \exp(2\mu + \sigma^2)$

So far we have no need for a computer.
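(If we did want a quick computer check of the mean formula, a small Monte Carlo experiment suffices; the parameter values below are arbitrary and chosen only for this sketch.)

# Monte Carlo check of E S = exp(μ + σ²/2) at illustrative parameter values
μ, σ = 0.5, 0.2
S_draws = np.exp(μ + σ * randn(1_000_000))
print(S_draws.mean(), np.exp(μ + σ**2 / 2))  # the two numbers should be close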

17.2.2 Share price with unknown distribution

But now suppose that we study the distribution of 𝑆 more carefully.


We decide that the share price depends on three variables, 𝑋1 , 𝑋2 , and 𝑋3 (e.g., sales, inflation, and interest rates).
In particular, our study suggests that

𝑆 = (𝑋1 + 𝑋2 + 𝑋3 )𝑝

where
• 𝑝 is a positive number, which is known to us (i.e., has been estimated),
• 𝑋𝑖 ∼ 𝐿𝑁 (𝜇𝑖 , 𝜎𝑖 ) for 𝑖 = 1, 2, 3,
• the values 𝜇𝑖 , 𝜎𝑖 are also known, and
• the random variables 𝑋1 , 𝑋2 and 𝑋3 are independent.


How should we compute the mean of 𝑆?


To do this with pencil and paper is hard (unless, say, 𝑝 = 1).
But fortunately there’s an easy way to do this, at least approximately.
This is the Monte Carlo method, which runs as follows:
1. Generate 𝑛 independent draws of 𝑋1 , 𝑋2 and 𝑋3 on a computer,
2. Use these draws to generate 𝑛 independent draws of 𝑆, and
3. Take the average value of these draws of 𝑆.
This average will be close to the true mean when 𝑛 is large.
This is due to the law of large numbers, which we discussed in another lecture.
We use the following values for 𝑝 and each 𝜇𝑖 and 𝜎𝑖 .

n = 1_000_000
p = 0.5
μ_1, μ_2, μ_3 = 0.2, 0.8, 0.4
σ_1, σ_2, σ_3 = 0.1, 0.05, 0.2

A routine using loops in Python

Here’s a routine using native Python loops to calculate the desired mean

$\frac{1}{n} \sum_{i=1}^n S_i \approx \mathbb{E}S$

%%time

S = 0.0
for i in range(n):
    X_1 = np.exp(μ_1 + σ_1 * randn())
    X_2 = np.exp(μ_2 + σ_2 * randn())
    X_3 = np.exp(μ_3 + σ_3 * randn())
    S += (X_1 + X_2 + X_3)**p
S / n

CPU times: user 4.7 s, sys: 7.65 ms, total: 4.71 s


Wall time: 4.71 s

2.2297684653600913

We can also construct a function that contains these operations:

def compute_mean(n=1_000_000):
    S = 0.0
    for i in range(n):
        X_1 = np.exp(μ_1 + σ_1 * randn())
        X_2 = np.exp(μ_2 + σ_2 * randn())
        X_3 = np.exp(μ_3 + σ_3 * randn())
        S += (X_1 + X_2 + X_3)**p
    return (S / n)


Now let’s call it.

compute_mean()

2.2296764329201237

17.2.3 A vectorized routine

If we want a more accurate estimate we should increase 𝑛.


But the code above runs quite slowly.
To make it faster, let’s implement a vectorized routine using NumPy.

def compute_mean_vectorized(n=1_000_000):
    X_1 = np.exp(μ_1 + σ_1 * randn(n))
    X_2 = np.exp(μ_2 + σ_2 * randn(n))
    X_3 = np.exp(μ_3 + σ_3 * randn(n))
    S = (X_1 + X_2 + X_3)**p
    return S.mean()

%%time

compute_mean_vectorized()

CPU times: user 87.5 ms, sys: 12 ms, total: 99.5 ms


Wall time: 99.4 ms

2.229649736607342

Notice that this routine is much faster.


We can increase 𝑛 to get more accuracy and still have reasonable speed:

%%time

compute_mean_vectorized(n=10_000_000)

CPU times: user 926 ms, sys: 104 ms, total: 1.03 s
Wall time: 1.03 s

2.2297604134260776


17.3 Pricing a European call option under risk neutrality

Next we are going to price a European call option under risk neutrality.
Let’s first discuss risk neutrality and then consider European options.

17.3.1 Risk-Neutral Pricing

When we use risk-neutral pricing, we determine the price of a given asset according to its expected payoff:

cost = expected benefit

For example, suppose someone promises to pay you


• 1,000,000 dollars if “heads” is the outcome of a fair coin flip
• 0 dollars if “tails” is the outcome
Let’s denote the payoff as 𝐺, so that
$\mathbb{P}\{G = 10^6\} = \mathbb{P}\{G = 0\} = \frac{1}{2}$
Suppose in addition that you can sell this promise to anyone who wants it.
• First they pay you 𝑃 , the price at which you sell it
• Then they get 𝐺, which could be either 1,000,000 or 0.
What’s a fair price for this asset (this promise)?
The definition of “fair” is ambiguous, but we can say that the risk-neutral price is 500,000 dollars.
This is because the risk-neutral price is just the expected payoff of the asset, which is
$\mathbb{E}G = \frac{1}{2} \times 10^6 + \frac{1}{2} \times 0 = 5 \times 10^5$
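We can confirm this with a tiny Monte Carlo experiment (a sketch using the np import above):

# Monte Carlo check that E G = 500,000 (illustrative sketch)
G = np.where(np.random.rand(1_000_000) < 0.5, 1_000_000, 0)
print(G.mean())  # approximately 5 * 10**5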

17.3.2 A comment on risk

As suggested by the name, the risk-neutral price ignores risk.


To understand this, consider whether you would pay 500,000 dollars for such a promise.
Would you prefer to receive 500,000 for sure or 1,000,000 dollars with 50% probability and nothing with 50% probability?
At least some readers will strictly prefer the first option — although some might prefer the second.
Thinking about this makes us realize that 500,000 is not necessarily the “right” price — or the price that we would see if
there was a market for these promises.
Nonetheless, the risk-neutral price is an important benchmark, which economists and financial market participants try to
calculate every day.


17.3.3 Discounting

Another thing we ignored in the previous discussion was time.


In general, receiving 𝑥 dollars now is preferable to receiving 𝑥 dollars in 𝑛 periods (e.g., 10 years).
After all, if we receive 𝑥 dollars now, we could put it in the bank at interest rate 𝑟 > 0 and receive (1 + 𝑟)𝑛 𝑥 in 𝑛 periods.
Hence future payments need to be discounted when we consider their present value.
We will implement discounting by
• multiplying a payment in one period by 𝛽 < 1
• multiplying a payment in 𝑛 periods by 𝛽 𝑛 , etc.
The same adjustment needs to be applied to our risk-neutral price for the promise described above.
Thus, if 𝐺 is realized in 𝑛 periods, then the risk-neutral price is

$P = \beta^n \mathbb{E}G = \beta^n \times 5 \times 10^5$
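For example, with an illustrative discount factor β = 0.95 and a payment n = 10 periods away (values assumed here just for this sketch), the discounted risk-neutral price is

# Discounted risk-neutral price of the promise (illustrative values)
β, n = 0.95, 10
print(β**n * 5 * 10**5)  # roughly 299,000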

17.3.4 European call options

Now let’s price a European call option.


The option is described by three things:
1. 𝑛, the expiry date,
2. 𝐾, the strike price, and
3. 𝑆𝑛 , the price of the underlying asset at date 𝑛.
For example, suppose that the underlying is one share in Amazon.
The owner of this option has the right to buy one share in Amazon at price 𝐾 after 𝑛 days.
If 𝑆𝑛 > 𝐾, then the owner will exercise the option, buy at 𝐾, sell at 𝑆𝑛 , and make profit 𝑆𝑛 − 𝐾.
If 𝑆𝑛 ≤ 𝐾, then the owner will not exercise the option and the payoff is zero.
Thus, the payoff is max{𝑆𝑛 − 𝐾, 0}.
Under the assumption of risk neutrality, the price of the option is the expected discounted payoff:

𝑃 = 𝛽 𝑛 𝔼 max{𝑆𝑛 − 𝐾, 0}

Now all we need to do is specify the distribution of 𝑆𝑛 , so the expectation can be calculated.
Suppose we know that 𝑆𝑛 ∼ 𝐿𝑁 (𝜇, 𝜎) and 𝜇 and 𝜎 are known.
If 𝑆𝑛1 , … , 𝑆𝑛𝑀 are independent draws from this lognormal distribution then, by the law of large numbers,

$\mathbb{E} \max\{S_n - K, 0\} \approx \frac{1}{M} \sum_{m=1}^M \max\{S_n^m - K, 0\}$

We suppose that

μ = 1.0
σ = 0.1
K = 1
n = 10
β = 0.95


We set the simulation size to

M = 10_000_000

Here is our code

S = np.exp(μ + σ * np.random.randn(M))
return_draws = np.maximum(S - K, 0)
P = β**n * np.mean(return_draws)
print(f"The Monte Carlo option price is approximately {P:3f}")

The Monte Carlo option price is approximately 1.037017

17.4 Pricing via a dynamic model

In this exercise we investigate a more realistic model for the share price 𝑆𝑛 .
This comes from specifying the underlying dynamics of the share price.
First we specify the dynamics.
Then we’ll compute the price of the option using Monte Carlo.

17.4.1 Simple dynamics

One simple model for {𝑆𝑡 } is

$\ln \frac{S_{t+1}}{S_t} = \mu + \sigma \xi_{t+1}$

where
• 𝑆0 is normally distributed and
• {𝜉𝑡 } is IID and standard normal.
Under the stated assumptions, 𝑆𝑛 is lognormally distributed.
To see why, observe that, with 𝑠𝑡 ∶= ln 𝑆𝑡 , the price dynamics become

𝑠𝑡+1 = 𝑠𝑡 + 𝜇 + 𝜎𝜉𝑡+1 (17.1)

Since 𝑠0 is normal and 𝜉1 is normal and IID, we see that 𝑠1 is normally distributed.
Continuing in this way shows that 𝑠𝑛 is normally distributed.
Hence 𝑆𝑛 = exp(𝑠𝑛 ) is lognormal.
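As a quick numerical check of this claim (a sketch with arbitrary illustrative parameters, not the defaults used below), we can simulate 𝑠𝑛 directly and compare its histogram with the implied normal density:

# Check that s_n is normally distributed under the simple dynamics (illustrative sketch)
μ_s, σ_s = 0.01, 0.1                     # illustrative drift and volatility
s0_mean, s0_std, n_steps = 0.0, 0.2, 20
num_paths = 100_000

s = s0_mean + s0_std * randn(num_paths)
for t in range(n_steps):
    s = s + μ_s + σ_s * randn(num_paths)

fig, ax = plt.subplots()
ax.hist(s, bins=60, density=True, alpha=0.4, label='simulated $s_n$')
grid = np.linspace(s.min(), s.max(), 200)
mean = s0_mean + n_steps * μ_s
std = np.sqrt(s0_std**2 + n_steps * σ_s**2)
ax.plot(grid, np.exp(-(grid - mean)**2 / (2 * std**2)) / (std * np.sqrt(2 * np.pi)),
        'k-', label='implied normal density')
ax.legend()
plt.show()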


17.4.2 Problems with simple dynamics

The simple dynamic model we studied above is convenient, since we can work out the distribution of 𝑆𝑛 .
However, its predictions are counterfactual because, in the real world, volatility (measured by 𝜎) is not stationary.
Instead it changes over time, sometimes high (like during the GFC) and sometimes low.
In terms of our model above, this means that 𝜎 should not be constant.

17.4.3 More realistic dynamics

This leads us to study the improved version:

$\ln \frac{S_{t+1}}{S_t} = \mu + \sigma_t \xi_{t+1}$

where

𝜎𝑡 = exp(ℎ𝑡 ), ℎ𝑡+1 = 𝜌ℎ𝑡 + 𝜈𝜂𝑡+1

Here {𝜂𝑡 } is also IID and standard normal.

17.4.4 Default parameters

For the dynamic model, we adopt the following parameter values.

μ = 0.0001
ρ = 0.1
ν = 0.001
S0 = 10
h0 = 0

(Here S0 is 𝑆0 and h0 is ℎ0 .)
For the option we use the following defaults.

K = 100
n = 10
β = 0.95

17.4.5 Visualizations

With 𝑠𝑡 ∶= ln 𝑆𝑡 , the price dynamics become

𝑠𝑡+1 = 𝑠𝑡 + 𝜇 + exp(ℎ𝑡 )𝜉𝑡+1

Here is a function to simulate a path using this equation:

def simulate_asset_price_path(μ=μ, S0=S0, h0=h0, n=n, ρ=ρ, ν=ν):

    s = np.empty(n+1)
    s[0] = np.log(S0)

    h = h0
    for t in range(n):
        s[t+1] = s[t] + μ + np.exp(h) * randn()
        h = ρ * h + ν * randn()

    return np.exp(s)

Here we plot the paths and the log of the paths.

fig, axes = plt.subplots(2, 1)

titles = 'log paths', 'paths'
transforms = np.log, lambda x: x

for ax, transform, title in zip(axes, transforms, titles):
    for i in range(50):
        path = simulate_asset_price_path()
        ax.plot(transform(path))
    ax.set_title(title)

fig.tight_layout()
plt.show()


17.4.6 Computing the price

Now that our model is more complicated, we cannot easily determine the distribution of 𝑆𝑛 .
So to compute the price 𝑃 of the option, we use Monte Carlo.
We average over realizations 𝑆𝑛1 , … , 𝑆𝑛𝑀 of 𝑆𝑛 and appeal to the law of large numbers:

$\mathbb{E} \max\{S_n - K, 0\} \approx \frac{1}{M} \sum_{m=1}^M \max\{S_n^m - K, 0\}$

Here’s a version using Python loops.

def compute_call_price(β=β,
                       μ=μ,
                       S0=S0,
                       h0=h0,
                       K=K,
                       n=n,
                       ρ=ρ,
                       ν=ν,
                       M=10_000):
    current_sum = 0.0
    # For each sample path
    for m in range(M):
        s = np.log(S0)
        h = h0
        # Simulate forward in time
        for t in range(n):
            s = s + μ + np.exp(h) * randn()
            h = ρ * h + ν * randn()
        # And add the value max{S_n - K, 0} to current_sum
        current_sum += np.maximum(np.exp(s) - K, 0)

    return β**n * current_sum / M

%%time
compute_call_price()

CPU times: user 248 ms, sys: 0 ns, total: 248 ms


Wall time: 248 ms

869.8728500140572

17.5 Exercises

Exercise 17.5.1
We would like to increase 𝑀 in the code above to make the calculation more accurate.
But this is problematic because Python loops are slow.
Your task is to write a faster version of this code using NumPy.


Solution to Exercise 17.5.1

def compute_call_price(β=β,
                       μ=μ,
                       S0=S0,
                       h0=h0,
                       K=K,
                       n=n,
                       ρ=ρ,
                       ν=ν,
                       M=10_000):

    s = np.full(M, np.log(S0))
    h = np.full(M, h0)
    for t in range(n):
        Z = np.random.randn(2, M)
        s = s + μ + np.exp(h) * Z[0, :]
        h = ρ * h + ν * Z[1, :]
    expectation = np.mean(np.maximum(np.exp(s) - K, 0))

    return β**n * expectation

%%time
compute_call_price()

CPU times: user 6.33 ms, sys: 58 µs, total: 6.39 ms


Wall time: 6.19 ms

1148.8594136849188

Notice that this version is faster than the one using a Python loop.
Now let’s try with larger 𝑀 to get a more accurate calculation.

%%time
compute_call_price(M=10_000_000)

CPU times: user 6.23 s, sys: 952 ms, total: 7.18 s


Wall time: 7.18 s

897.2171827573278

Exercise 17.5.2
Consider that a European call option may be written on an underlying with spot price of $100 and a knockout barrier of
$120.
This option behaves in every way like a vanilla European call, except if the spot price ever moves above $120, the option
“knocks out” and the contract is null and void.
Note that the option does not reactivate if the spot price falls below $120 again.


Use the dynamics defined in (17.1) to price the European call option.

Solution to Exercise 17.5.2

μ = 0.0001
ρ = 0.1
ν = 0.001
S0 = 10
h0 = 0
K = 100
n = 10
β = 0.95
bp = 120

def compute_call_price_with_barrier(β=β,
                                    μ=μ,
                                    S0=S0,
                                    h0=h0,
                                    K=K,
                                    n=n,
                                    ρ=ρ,
                                    ν=ν,
                                    bp=bp,
                                    M=50_000):
    current_sum = 0.0
    # For each sample path
    for m in range(M):
        s = np.log(S0)
        h = h0
        payoff = 0
        option_is_null = False
        # Simulate forward in time
        for t in range(n):
            s = s + μ + np.exp(h) * randn()
            h = ρ * h + ν * randn()
            if np.exp(s) > bp:
                payoff = 0
                option_is_null = True
                break

        if not option_is_null:
            payoff = np.maximum(np.exp(s) - K, 0)
        # And add the payoff to current_sum
        current_sum += payoff

    return β**n * current_sum / M

%time compute_call_price_with_barrier()

CPU times: user 1.43 s, sys: 90 µs, total: 1.43 s


Wall time: 1.43 s

0.044558291514763024


Let’s look at the vectorized version which is faster than using Python loops.

def compute_call_price_with_barrier_vector(β=β,
                                           μ=μ,
                                           S0=S0,
                                           h0=h0,
                                           K=K,
                                           n=n,
                                           ρ=ρ,
                                           ν=ν,
                                           bp=bp,
                                           M=50_000):
    s = np.full(M, np.log(S0))
    h = np.full(M, h0)
    option_is_null = np.full(M, False)
    for t in range(n):
        Z = np.random.randn(2, M)
        s = s + μ + np.exp(h) * Z[0, :]
        h = ρ * h + ν * Z[1, :]
        # Mark all the options null where S_n > barrier price
        option_is_null = np.where(np.exp(s) > bp, True, option_is_null)

    # Mark payoff as 0 in the indices where options are null
    payoff = np.where(option_is_null, 0, np.maximum(np.exp(s) - K, 0))
    expectation = np.mean(payoff)
    return β**n * expectation

%time compute_call_price_with_barrier_vector()

CPU times: user 32.5 ms, sys: 0 ns, total: 32.5 ms


Wall time: 32.2 ms

0.03892917960975804



CHAPTER EIGHTEEN

HEAVY-TAILED DISTRIBUTIONS

Contents

• Heavy-Tailed Distributions
– Overview
– Visual comparisons
– Heavy tails in economic cross-sections
– Failure of the LLN
– Why do heavy tails matter?
– Classifying tail properties
– Further reading
– Exercises

In addition to what’s in Anaconda, this lecture will need the following libraries:

!pip install --upgrade yfinance quantecon pandas_datareader interpolation

We use the following imports.

import matplotlib.pyplot as plt


import numpy as np
import quantecon as qe
import yfinance as yf
import pandas as pd
import pandas_datareader.data as web
import statsmodels.api as sm

from interpolation import interp


from pandas_datareader import wb
from scipy.stats import norm, cauchy
from pandas.plotting import register_matplotlib_converters
register_matplotlib_converters()


18.1 Overview

In this section we give some motivation for the lecture.

18.1.1 Introduction: light tails

Most commonly used probability distributions in classical statistics and the natural sciences have “light tails.”
To explain this concept, let’s look first at examples.
The classic example is the normal distribution, which has density

$f(x) = \frac{1}{\sqrt{2\pi}\,\sigma} \exp\left( -\frac{(x-\mu)^2}{2\sigma^2} \right) \qquad (-\infty < x < \infty)$

The two parameters 𝜇 and 𝜎 are the mean and standard deviation respectively.
As 𝑥 deviates from 𝜇, the value of 𝑓(𝑥) goes to zero extremely quickly.
We can see this when we plot the density and show a histogram of observations, as with the following code (which assumes
𝜇 = 0 and 𝜎 = 1).

fig, ax = plt.subplots()
X = norm.rvs(size=1_000_000)
ax.hist(X, bins=40, alpha=0.4, label='histogram', density=True)
x_grid = np.linspace(-4, 4, 400)
ax.plot(x_grid, norm.pdf(x_grid), label='density')
ax.legend()
plt.show()

Notice how
• the density’s tails converge quickly to zero in both directions and


• even with 1,000,000 draws, we get no very large or very small observations.
We can see the last point more clearly by executing

X.min(), X.max()

(-4.878995200091626, 5.777536110294029)

Here’s another view of draws from the same distribution:

n = 2000
fig, ax = plt.subplots()
data = norm.rvs(size=n)
ax.plot(list(range(n)), data, linestyle='', marker='o', alpha=0.5, ms=4)
ax.vlines(list(range(n)), 0, data, lw=0.2)
ax.set_ylim(-15, 15)
ax.set_xlabel('$i$')
ax.set_ylabel('$X_i$', rotation=0)
plt.show()

We have plotted each individual draw 𝑋𝑖 against 𝑖.


None are very large or very small.
In other words, extreme observations are rare and draws tend not to deviate too much from the mean.
Putting this another way, light-tailed distributions are those that rarely generate extreme values.
(A more formal definition is given below.)
Many statisticians and econometricians use rules of thumb such as “outcomes more than four or five standard deviations
from the mean can safely be ignored.”
But this is only true when distributions have light tails.


18.1.2 When are light tails valid?

In probability theory and in the real world, many distributions are light-tailed.
For example, human height is light-tailed.
Yes, it’s true that we see some very tall people.
• For example, basketballer Sun Mingming is 2.32 meters tall
But have you ever heard of someone who is 20 meters tall? Or 200? Or 2000?
Have you ever wondered why not?
After all, there are 8 billion people in the world!
In essence, the reason we don’t see such draws is that the distribution of human height has very light tails.
In fact the distribution of human height obeys a bell-shaped curve similar to the normal distribution.

18.1.3 Returns on assets

But what about economic data?


Let’s look at some financial data first.
Our aim is to plot the daily change in the price of Amazon (AMZN) stock for the period from 1st January 2015 to 1st
July 2022.
This equates to daily returns if we set dividends aside.
The code below produces the desired plot using Yahoo financial data via the yfinance library.

s = yf.download('AMZN', '2015-1-1', '2022-7-1')['Adj Close']


r = s.pct_change()

fig, ax = plt.subplots()

ax.plot(r, linestyle='', marker='o', alpha=0.5, ms=4)


ax.vlines(r.index, 0, r.values, lw=0.2)
ax.set_ylabel('returns', fontsize=12)
ax.set_xlabel('date', fontsize=12)

plt.show()

[*********************100%%**********************] 1 of 1 completed


This data looks different to the draws from the normal distribution we saw above.
Several of the observations are quite extreme.
We get a similar picture if we look at other assets, such as Bitcoin

s = yf.download('BTC-USD', '2015-1-1', '2022-7-1')['Adj Close']


r = s.pct_change()

fig, ax = plt.subplots()

ax.plot(r, linestyle='', marker='o', alpha=0.5, ms=4)


ax.vlines(r.index, 0, r.values, lw=0.2)
ax.set_ylabel('returns', fontsize=12)
ax.set_xlabel('date', fontsize=12)

plt.show()

[*********************100%%**********************] 1 of 1 completed


The histogram also looks different to the histogram of the normal distribution:

fig, ax = plt.subplots()
ax.hist(r, bins=60, alpha=0.4, label='bitcoin returns', density=True)
ax.set_xlabel('returns', fontsize=12)
plt.show()


If we look at higher frequency returns data (e.g., tick-by-tick), we often see even more extreme observations.
See, for example, [Man63] or [Rac03].

18.1.4 Other data

The data we have just seen is said to be “heavy-tailed”.


With heavy-tailed distributions, extreme outcomes occur relatively frequently.
Importantly, there are many examples of heavy-tailed distributions observed in economic and financial settings!
For example, the income and the wealth distributions are heavy-tailed
• You can imagine this: most people have low or modest wealth but some people are extremely rich.
The firm size distribution is also heavy-tailed
• You can imagine this too: most firms are small but some firms are enormous.
The distribution of town and city sizes is heavy-tailed
• Most towns and cities are small but some are very large.
Later in this lecture, we examine heavy tails in these distributions.

18.1.5 Why should we care?

Heavy tails are common in economic data but does that mean they are important?
The answer to this question is affirmative!
When distributions are heavy-tailed, we need to think carefully about issues like
• diversification and risk
• forecasting
• taxation (across a heavy-tailed income distribution), etc.
We return to these points below.

18.2 Visual comparisons

Later we will provide a mathematical definition of the difference between light and heavy tails.
But for now let’s do some visual comparisons to help us build intuition on the difference between these two types of
distributions.


18.2.1 Simulations

The figure below shows a simulation.


The top two subfigures each show 120 independent draws from the normal distribution, which is light-tailed.
The bottom subfigure shows 120 independent draws from the Cauchy distribution, which is heavy-tailed.

n = 120
np.random.seed(11)

fig, axes = plt.subplots(3, 1, figsize=(6, 12))

for ax in axes:
    ax.set_ylim((-120, 120))

s_vals = 2, 12

for ax, s in zip(axes[:2], s_vals):
    data = np.random.randn(n) * s
    ax.plot(list(range(n)), data, linestyle='', marker='o', alpha=0.5, ms=4)
    ax.vlines(list(range(n)), 0, data, lw=0.2)
    ax.set_title(f"draws from $N(0, \sigma^2)$ with $\sigma = {s}$", fontsize=11)

ax = axes[2]
distribution = cauchy()
data = distribution.rvs(n)
ax.plot(list(range(n)), data, linestyle='', marker='o', alpha=0.5, ms=4)
ax.vlines(list(range(n)), 0, data, lw=0.2)
ax.set_title(f"draws from the Cauchy distribution", fontsize=11)

plt.subplots_adjust(hspace=0.25)

plt.show()


In the top subfigure, the standard deviation of the normal distribution is 2, and the draws are clustered around the mean.
In the middle subfigure, the standard deviation is increased to 12 and, as expected, the amount of dispersion rises.
The bottom subfigure, with the Cauchy draws, shows a different pattern: tight clustering around the mean for the great
majority of observations, combined with a few sudden large deviations from the mean.
This is typical of a heavy-tailed distribution.

18.2.2 Nonnegative distributions

Let’s compare some distributions that only take nonnegative values.


One is the exponential distribution, which we discussed in our lecture on probability and distributions.
The exponential distribution is a light-tailed distribution.
Here are some draws from the exponential distribution.

n = 120
np.random.seed(11)

fig, ax = plt.subplots()
ax.set_ylim((0, 50))

data = np.random.exponential(size=n)
ax.plot(list(range(n)), data, linestyle='', marker='o', alpha=0.5, ms=4)
ax.vlines(list(range(n)), 0, data, lw=0.2)

plt.show()

Another nonnegative distribution is the Pareto distribution.


If 𝑋 has the Pareto distribution, then there are positive constants 𝑥̄ and 𝛼 such that
$\mathbb{P}\{X > x\} = \begin{cases} (\bar{x}/x)^{\alpha} & \text{if } x \geq \bar{x} \\ 1 & \text{if } x < \bar{x} \end{cases}$   (18.1)

The parameter 𝛼 is called the tail index and 𝑥̄ is called the minimum.
The Pareto distribution is a heavy-tailed distribution.
One way that the Pareto distribution arises is as the exponential of an exponential random variable.
In particular, if 𝑋 is exponentially distributed with rate parameter 𝛼, then

𝑌 = 𝑥̄ exp(𝑋)

is Pareto-distributed with minimum 𝑥̄ and tail index 𝛼.


Here are some draws from the Pareto distribution with tail index 1 and minimum 1.

n = 120
np.random.seed(11)

fig, ax = plt.subplots()
ax.set_ylim((0, 80))
exponential_data = np.random.exponential(size=n)
pareto_data = np.exp(exponential_data)
ax.plot(list(range(n)), pareto_data, linestyle='', marker='o', alpha=0.5, ms=4)
ax.vlines(list(range(n)), 0, pareto_data, lw=0.2)

plt.show()

Notice how extreme outcomes are more common.


18.2.3 Counter CDFs

For nonnegative random variables, one way to visualize the difference between light and heavy tails is to look at the
counter CDF (CCDF).
For a random variable 𝑋 with CDF 𝐹 , the CCDF is the function

𝐺(𝑥) ∶= 1 − 𝐹 (𝑥) = ℙ{𝑋 > 𝑥}

(Some authors call 𝐺 the “survival” function.)


The CCDF shows how fast the upper tail goes to zero as 𝑥 → ∞.
If 𝑋 is exponentially distributed with rate parameter 𝛼, then the CCDF is

𝐺𝐸 (𝑥) = exp(−𝛼𝑥)

This function goes to zero relatively quickly as 𝑥 gets large.


The standard Pareto distribution, where 𝑥̄ = 1, has CCDF

𝐺𝑃 (𝑥) = 𝑥−𝛼

This function goes to zero as 𝑥 → ∞, but much slower than 𝐺𝐸 .

Exercise 18.2.1
Show how the CCDF of the standard Pareto distribution can be derived from the CCDF of the exponential distribution.

Solution to Exercise 18.2.1


Letting 𝐺𝐸 and 𝐺𝑃 be defined as above, letting 𝑋 be exponentially distributed with rate parameter 𝛼, and letting 𝑌 =
exp(𝑋), we have

𝐺𝑃 (𝑦) = ℙ{𝑌 > 𝑦}


= ℙ{exp(𝑋) > 𝑦}
= ℙ{𝑋 > ln 𝑦}
= 𝐺𝐸 (ln 𝑦)
= exp(−𝛼 ln 𝑦)
= 𝑦−𝛼

Here’s a plot that illustrates how 𝐺𝐸 goes to zero faster than 𝐺𝑃 .

x = np.linspace(1.5, 100, 1000)


fig, ax = plt.subplots()
alpha = 1.0
ax.plot(x, np.exp(- alpha * x), label='exponential', alpha=0.8)
ax.plot(x, x**(- alpha), label='Pareto', alpha=0.8)
ax.legend()
plt.show()


Here’s a log-log plot of the same functions, which makes visual comparison easier.

fig, ax = plt.subplots()
alpha = 1.0
ax.loglog(x, np.exp(- alpha * x), label='exponential', alpha=0.8)
ax.loglog(x, x**(- alpha), label='Pareto', alpha=0.8)
ax.legend()
plt.show()


In the log-log plot, the Pareto CCDF is linear, while the exponential one is concave.
This idea is often used to separate light- and heavy-tailed distributions in visualisations — we return to this point below.

18.2.4 Empirical CCDFs

The sample counterpart of the CCDF function is the empirical CCDF.


Given a sample 𝑥1 , … , 𝑥𝑛 , the empirical CCDF is given by

$\hat{G}(x) = \frac{1}{n} \sum_{i=1}^n \mathbb{1}\{x_i > x\}$

Thus, $\hat{G}(x)$ shows the fraction of the sample that exceeds $x$.
Here’s a figure containing some empirical CCDFs from simulated data.

def eccdf(x, data):
    "Simple empirical CCDF function."
    return np.mean(data > x)

x_grid = np.linspace(1, 1000, 1000)
sample_size = 1000
np.random.seed(13)
z = np.random.randn(sample_size)

data_1 = np.random.exponential(size=sample_size)
data_2 = np.exp(z)
data_3 = np.exp(np.random.exponential(size=sample_size))

data_list = [data_1, data_2, data_3]

fig, axes = plt.subplots(3, 1, figsize=(6, 8))
axes = axes.flatten()
labels = ['exponential', 'lognormal', 'Pareto']

for data, label, ax in zip(data_list, labels, axes):
    ax.loglog(x_grid, [eccdf(x, data) for x in x_grid],
              'o', markersize=3.0, alpha=0.5, label=label)
    ax.set_xlabel("log rank")
    ax.set_ylabel("log size")
    ax.legend()

fig.subplots_adjust(hspace=0.4)
plt.show()


As with the CCDF, the empirical CCDF from the Pareto distributions is approximately linear in a log-log plot.
We will use this idea below when we look at real data.

18.2.5 Power laws

One specific class of heavy-tailed distributions has been found repeatedly in economic and social phenomena: the class
of so-called power laws.
A random variable 𝑋 is said to have a power law if, for some 𝛼 > 0,

ℙ{𝑋 > 𝑥} ≈ 𝑥−𝛼 when 𝑥 is large

We can write this more mathematically as

$\lim_{x \to \infty} x^{\alpha} \, \mathbb{P}\{X > x\} = c \quad \text{for some } c > 0$   (18.2)


It is also common to say that a random variable 𝑋 with this property has a Pareto tail with tail index 𝛼.
Notice that every Pareto distribution with tail index 𝛼 has a Pareto tail with tail index 𝛼.
We can think of power laws as a generalization of Pareto distributions.
They are distributions that resemble Pareto distributions in their upper right tail.
Another way to think of power laws is a set of distributions with a specific kind of (very) heavy tail.
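As a rough illustration of the log-log idea (a minimal sketch using np and simulated Pareto draws; the regression below is a naive estimator, used only for illustration), we can back out a tail index from the slope of the empirical CCDF:

# Naive tail-index estimate from the slope of a log-log empirical CCDF (sketch)
np.random.seed(42)
α_true = 1.5
data = np.exp(np.random.exponential(scale=1/α_true, size=100_000))  # Pareto draws, minimum 1

x_sorted = np.sort(data)
ccdf = 1 - np.arange(1, len(x_sorted) + 1) / len(x_sorted)

keep = slice(0, int(0.99 * len(x_sorted)))  # drop the noisy far tail (and the final zero)
slope, intercept = np.polyfit(np.log(x_sorted[keep]), np.log(ccdf[keep]), 1)
print(f"estimated tail index ≈ {-slope:.2f}")  # should be close to 1.5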

18.3 Heavy tails in economic cross-sections

As mentioned above, heavy tails are pervasive in economic data.


In fact power laws seem to be very common as well.
We now illustrate this by showing the empirical CCDF of heavy tails.
All plots are in log-log, so that a power law shows up as a linear log-log plot, at least in the upper tail.
We hide the code that generates the figures, which is somewhat complex, but readers are of course welcome to explore
the code (perhaps after examining the figures).

18.3.1 Firm size

Here is a plot of the firm size distribution for the largest 500 firms in 2020 taken from Forbes Global 2000.


18.3.2 City size

Here are plots of the city size distribution for the US and Brazil in 2023 from world population review.
The size is measured by population.

18.3.3 Wealth

Here is a plot of the upper tail (top 500) of the wealth distribution.
The data is from the Forbes Billionaires list in 2020.


18.3.4 GDP

Of course, not all cross-sectional distributions are heavy-tailed.


Here we show cross-country per capita GDP.

The plot is concave rather than linear, so the distribution has light tails.
One reason is that this is data on an aggregate variable, which involves some averaging in its definition.
Averaging tends to eliminate extreme outcomes.

18.4 Failure of the LLN

One impact of heavy tails is that sample averages can be poor estimators of the underlying mean of the distribution.
To understand this point better, recall our earlier discussion of the Law of Large Numbers, which considered IID
𝑋1 , … , 𝑋𝑛 with common distribution 𝐹
If $\mathbb{E}|X_i|$ is finite, then the sample mean $\bar X_n := \frac{1}{n} \sum_{i=1}^n X_i$ satisfies

ℙ {𝑋̄ 𝑛 → 𝜇 as 𝑛 → ∞} = 1 (18.3)

where 𝜇 ∶= 𝔼𝑋𝑖 = ∫ 𝑥𝐹 (𝑑𝑥) is the common mean of the sample.


The condition 𝔼|𝑋𝑖 | = ∫ |𝑥|𝐹 (𝑑𝑥) < ∞ holds in most cases but can fail if the distribution 𝐹 is very heavy tailed.
For example, it fails for the Cauchy distribution.
Let’s have a look at the behavior of the sample mean in this case, and see whether or not the LLN is still valid.

from scipy.stats import cauchy

np.random.seed(1234)
N = 1_000

distribution = cauchy()

fig, ax = plt.subplots()
data = distribution.rvs(N)

# Compute sample mean at each n
sample_mean = np.empty(N)
for n in range(N):
    sample_mean[n] = np.mean(data[:n+1])

# Plot
ax.plot(range(N), sample_mean, alpha=0.6, label='$\\bar{X}_n$')

ax.plot(range(N), np.zeros(N), 'k--', lw=0.5)


ax.legend()

plt.show()

The sequence shows no sign of converging.


We return to this point in the exercises.

18.5 Why do heavy tails matter?

We have now seen that


1. heavy tails are frequent in economics and
2. the Law of Large Numbers fails when tails are very heavy.
But what about in the real world? Do heavy tails matter?
Let’s briefly discuss why they do.


18.5.1 Diversification

One of the most important ideas in investing is using diversification to reduce risk.
This is a very old idea — consider, for example, the expression “don’t put all your eggs in one basket”.
To illustrate, consider an investor with one dollar of wealth and a choice over 𝑛 assets with payoffs 𝑋1 , … , 𝑋𝑛 .
Suppose that returns on distinct assets are independent and each return has mean 𝜇 and variance 𝜎2 .
If the investor puts all wealth in one asset, say, then the expected payoff of the portfolio is 𝜇 and the variance is 𝜎2 .
If instead the investor puts share 1/𝑛 of her wealth in each asset, then the portfolio payoff is
$Y_n = \sum_{i=1}^n \frac{X_i}{n} = \frac{1}{n} \sum_{i=1}^n X_i.$

Try computing the mean and variance.


You will find that
• The mean is unchanged at 𝜇, while
• the variance of the portfolio has fallen to 𝜎2 /𝑛.
Diversification reduces risk, as expected.
But there is a hidden assumption here: the variance of returns is finite.
If the distribution is heavy-tailed and the variance is infinite, then this logic is incorrect.
For example, we saw above that if every 𝑋𝑖 is Cauchy, then so is 𝑌𝑛 .
This means that diversification doesn’t help at all!
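Here is a small simulation of this point (a sketch using the norm and cauchy objects imported at the top of the lecture): the spread of the portfolio payoff shrinks with n for light-tailed returns but not for Cauchy returns.

# Diversification with light- vs heavy-tailed returns (illustrative sketch)
np.random.seed(0)
num_reps = 10_000

def iqr(y):
    "Interquartile range, a robust measure of dispersion."
    return np.percentile(y, 75) - np.percentile(y, 25)

for n in (1, 10, 100):
    Y_normal = norm(loc=1.0, scale=1.0).rvs((num_reps, n)).mean(axis=1)
    Y_cauchy = cauchy().rvs((num_reps, n)).mean(axis=1)
    print(f"n = {n:4d}: IQR normal = {iqr(Y_normal):.3f}, IQR Cauchy = {iqr(Y_cauchy):.3f}")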

18.5.2 Fiscal policy

The heaviness of the tail in the wealth distribution matters for taxation and redistribution policies.
The same is true for the income distribution.
For example, the heaviness of the tail of the income distribution helps determine how much revenue a given tax policy
will raise.

18.6 Classifying tail properties

Up until now we have discussed light and heavy tails without any mathematical definitions.
Let’s now rectify this.
We will focus our attention on the right hand tails of nonnegative random variables and their distributions.
The definitions for left hand tails are very similar and we omit them to simplify the exposition.


18.6.1 Light and heavy tails

A distribution 𝐹 with density 𝑓 on ℝ+ is called heavy-tailed if



$\int_0^{\infty} \exp(tx) f(x)\, dx = \infty \quad \text{for all } t > 0.$   (18.4)

We say that a nonnegative random variable 𝑋 is heavy-tailed if its density is heavy-tailed.


This is equivalent to stating that its moment generating function 𝑚(𝑡) ∶= 𝔼 exp(𝑡𝑋) is infinite for all 𝑡 > 0.
For example, the log-normal distribution is heavy-tailed because its moment generating function is infinite everywhere on
(0, ∞).
The Pareto distribution is also heavy-tailed.
A distribution 𝐹 on ℝ+ is called light-tailed if it is not heavy-tailed.
A nonnegative random variable 𝑋 is light-tailed if its distribution 𝐹 is light-tailed.
For example, every random variable with bounded support is light-tailed. (Why?)
As another example, if 𝑋 has the exponential distribution, with cdf 𝐹 (𝑥) = 1 − exp(−𝜆𝑥) for some 𝜆 > 0, then its
moment generating function is

$m(t) = \frac{\lambda}{\lambda - t} \quad \text{when } t < \lambda$
In particular, 𝑚(𝑡) is finite whenever 𝑡 < 𝜆, so 𝑋 is light-tailed.
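As a quick numerical check of this formula (a sketch; scipy.integrate is an extra import, not part of this lecture's setup), we can compare a direct numerical integration of the MGF with λ/(λ − t):

# Numerical check of the exponential MGF formula (illustrative sketch)
from scipy.integrate import quad

λ, t = 1.0, 0.5  # any t < λ works
integral, _ = quad(lambda x: np.exp(t * x) * λ * np.exp(-λ * x), 0, np.inf)
print(integral, λ / (λ - t))  # both should be close to 2.0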
One can show that if 𝑋 is light-tailed, then all of its moments are finite.
Conversely, if some moment is infinite, then 𝑋 is heavy-tailed.
The latter condition is not necessary, however.
For example, the lognormal distribution is heavy-tailed but every moment is finite.

18.7 Further reading

For more on heavy tails in the wealth distribution, see e.g., [Vil96] and [BB18].
For more on heavy tails in the firm size distribution, see e.g., [Axt01], [Gab16].
For more on heavy tails in the city size distribution, see e.g., [RRGM11], [Gab16].
There are other important implications of heavy tails, aside from those discussed above.
For example, heavy tails in income and wealth affect productivity growth, business cycles, and political economy.
For further reading, see, for example, [AR02], [GSS03], [BEGS18] or [AKM+18].


18.8 Exercises

Exercise 18.8.1
Prove: If 𝑋 has a Pareto tail with tail index 𝛼, then 𝔼[𝑋 𝑟 ] = ∞ for all 𝑟 ≥ 𝛼.

Solution to Exercise 18.8.1


Let 𝑋 have a Pareto tail with tail index 𝛼 and let 𝐹 be its cdf.
Fix 𝑟 ≥ 𝛼.
In view of (18.2), we can take positive constants 𝑏 and 𝑥̄ such that

ℙ{𝑋 > 𝑥} ≥ 𝑏𝑥−𝛼 whenever 𝑥 ≥ 𝑥̄

But then
$\mathbb{E}X^r = r \int_0^{\infty} x^{r-1} \mathbb{P}\{X > x\}\, dx \geq r \int_0^{\bar{x}} x^{r-1} \mathbb{P}\{X > x\}\, dx + r \int_{\bar{x}}^{\infty} x^{r-1} b x^{-\alpha}\, dx.$

We know that $\int_{\bar{x}}^{\infty} x^{r-\alpha-1}\, dx = \infty$ whenever $r - \alpha - 1 \geq -1$.
Since 𝑟 ≥ 𝛼, we have 𝔼𝑋 𝑟 = ∞.

Exercise 18.8.2
Replicate the simulation figure shown earlier (the one with two normal panels and one Cauchy panel), but replace the three distributions with three Pareto distributions using different choices of 𝛼.
For 𝛼, try 1.15, 1.5 and 1.75.
Use np.random.seed(11) to set the seed.

Solution to Exercise 18.8.2

from scipy.stats import pareto

np.random.seed(11)

n = 120
alphas = [1.15, 1.50, 1.75]

fig, axes = plt.subplots(3, 1, figsize=(6, 8))

for (a, ax) in zip(alphas, axes):
    ax.set_ylim((-5, 50))
    data = pareto.rvs(size=n, scale=1, b=a)
    ax.plot(list(range(n)), data, linestyle='', marker='o', alpha=0.5, ms=4)
    ax.vlines(list(range(n)), 0, data, lw=0.2)
    ax.set_title(f"Pareto draws with $\\alpha = {a}$", fontsize=11)

plt.subplots_adjust(hspace=0.4)
plt.show()

Exercise 18.8.3
There is an ongoing argument about whether the firm size distribution should be modeled as a Pareto distribution or a
lognormal distribution (see, e.g., [FDGA+04], [KLS18] or [ST19]).
This sounds esoteric but has real implications for a variety of economic phenomena.
To illustrate this fact in a simple way, let us consider an economy with 100,000 firms, an interest rate of r = 0.05 and


a corporate tax rate of 15%.


Your task is to estimate the present discounted value of projected corporate tax revenue over the next 10 years.
Because we are forecasting, we need a model.
We will suppose that
1. the number of firms and the firm size distribution (measured in profits) remain fixed and
2. the firm size distribution is either lognormal or Pareto.
Present discounted value of tax revenue will be estimated by
1. generating 100,000 draws of firm profit from the firm size distribution,
2. multiplying by the tax rate, and
3. summing the results with discounting to obtain present value.
The Pareto distribution is assumed to take the form (18.1) with 𝑥̄ = 1 and 𝛼 = 1.05.
(This value of the tail index 𝛼 is plausible given the data [Gab16].)
To make the lognormal option as similar as possible to the Pareto option, choose its parameters such that the mean and
median of both distributions are the same.
Note that, for each distribution, your estimate of tax revenue will be random because it is based on a finite number of
draws.
To take this into account, generate 100 replications (evaluations of tax revenue) for each of the two distributions and
compare the two samples by
• producing a violin plot visualizing the two samples side-by-side and
• printing the mean and standard deviation of both samples.
For the seed use np.random.seed(1234).
What differences do you observe?
(Note: a better approach to this problem would be to model firm dynamics and try to track individual firms given the
current distribution. We will discuss firm dynamics in later lectures.)

Solution to Exercise 18.8.3


To do the exercise, we need to choose the parameters 𝜇 and 𝜎 of the lognormal distribution to match the mean and median
of the Pareto distribution.
Here we understand the lognormal distribution as that of the random variable exp(𝜇 + 𝜎𝑍) when 𝑍 is standard normal.
The mean and median of the Pareto distribution (18.1) with 𝑥̄ = 1 are
$\text{mean} = \frac{\alpha}{\alpha - 1} \quad \text{and} \quad \text{median} = 2^{1/\alpha}$
Using the corresponding expressions for the lognormal distribution leads us to the equations
$\frac{\alpha}{\alpha - 1} = \exp(\mu + \sigma^2/2) \quad \text{and} \quad 2^{1/\alpha} = \exp(\mu)$
which we solve for 𝜇 and 𝜎 given 𝛼 = 1.05.
Here is code that generates the two samples, produces the violin plot and prints the mean and standard deviation of the
two samples.


num_firms = 100_000
num_years = 10
tax_rate = 0.15
r = 0.05

β = 1 / (1 + r) # discount factor

x_bar = 1.0
α = 1.05

def pareto_rvs(n):
    "Uses a standard method to generate Pareto draws."
    u = np.random.uniform(size=n)
    y = x_bar / (u**(1/α))
    return y

Let’s compute the lognormal parameters:

μ = np.log(2) / α
σ_sq = 2 * (np.log(α/(α - 1)) - np.log(2)/α)
σ = np.sqrt(σ_sq)

Here’s a function to compute a single estimate of tax revenue for a particular choice of distribution dist.

def tax_rev(dist):
    tax_raised = 0
    for t in range(num_years):
        if dist == 'pareto':
            π = pareto_rvs(num_firms)
        else:
            π = np.exp(μ + σ * np.random.randn(num_firms))
        tax_raised += β**t * np.sum(π * tax_rate)
    return tax_raised

Now let’s generate the violin plot.

num_reps = 100
np.random.seed(1234)

tax_rev_lognorm = np.empty(num_reps)
tax_rev_pareto = np.empty(num_reps)

for i in range(num_reps):
    tax_rev_pareto[i] = tax_rev('pareto')
    tax_rev_lognorm[i] = tax_rev('lognorm')

fig, ax = plt.subplots()

data = tax_rev_pareto, tax_rev_lognorm

ax.violinplot(data)

plt.show()


Finally, let’s print the means and standard deviations.

tax_rev_pareto.mean(), tax_rev_pareto.std()

(1458729.0546623734, 406089.3613661567)

tax_rev_lognorm.mean(), tax_rev_lognorm.std()

(2556174.8615230713, 25586.444565139616)

Looking at the output of the code, our main conclusion is that the Pareto assumption leads to a lower mean and greater
dispersion.

Exercise 18.8.4
The characteristic function of the Cauchy distribution is

$\phi(t) = \mathbb{E} e^{itX} = \int e^{itx} f(x)\, dx = e^{-|t|}$   (18.5)

Prove that the sample mean 𝑋̄ 𝑛 of 𝑛 independent draws 𝑋1 , … , 𝑋𝑛 from the Cauchy distribution has the same characteristic function as 𝑋1 .
(This means that the sample mean never converges.)

Solution to Exercise 18.8.4


By independence, the characteristic function of the sample mean becomes

$\mathbb{E} e^{it \bar{X}_n} = \mathbb{E} \exp\left\{ i \frac{t}{n} \sum_{j=1}^n X_j \right\} = \mathbb{E} \prod_{j=1}^n \exp\left\{ i \frac{t}{n} X_j \right\} = \prod_{j=1}^n \mathbb{E} \exp\left\{ i \frac{t}{n} X_j \right\} = [\phi(t/n)]^n$

In view of (18.5), this is just 𝑒−|𝑡| .


Thus, in the case of the Cauchy distribution, the sample mean itself has the very same Cauchy distribution, regardless of
𝑛!
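A quick simulation illustrates this (a sketch using the cauchy object imported above): the quartiles of the sample mean of 𝑛 Cauchy draws match those of a single draw, whatever the value of 𝑛.

# The sample mean of Cauchy draws is again Cauchy (illustrative sketch)
np.random.seed(0)
num_reps = 10_000
print("n = 1:   ", np.percentile(cauchy().rvs(num_reps), [25, 50, 75]))
for n in (10, 1_000):
    means = cauchy().rvs((num_reps, n)).mean(axis=1)
    print(f"n = {n}:", np.percentile(means, [25, 50, 75]))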



CHAPTER NINETEEN

RACIAL SEGREGATION

Contents

• Racial Segregation
– Outline
– The model
– Results
– Exercises

19.1 Outline

In 1969, Thomas C. Schelling developed a simple but striking model of racial segregation [Sch69].
His model studies the dynamics of racially mixed neighborhoods.
Like much of Schelling’s work, the model shows how local interactions can lead to surprising aggregate outcomes.
It studies a setting where agents (think of households) have a relatively mild preference for neighbors of the same race.
For example, these agents might be comfortable with a mixed race neighborhood but uncomfortable when they feel
“surrounded” by people from a different race.
Schelling illustrated the following surprising result: in such a setting, mixed race neighborhoods are likely to be unstable, tending to collapse over time.
In fact the model predicts strongly divided neighborhoods, with high levels of segregation.
In other words, extreme segregation outcomes arise even though people’s preferences are not particularly extreme.
These extreme outcomes happen because of interactions between agents in the model (e.g., households in a city) that drive
self-reinforcing dynamics in the model.
These ideas will become clearer as the lecture unfolds.
In recognition of his work on segregation and other research, Schelling was awarded the 2005 Nobel Prize in Economic
Sciences (joint with Robert Aumann).
Let’s start with some imports:


%matplotlib inline
import matplotlib.pyplot as plt
from random import uniform, seed
from math import sqrt
import numpy as np

19.2 The model

In this section we will build a version of Schelling’s model.

19.2.1 Set-Up

We will cover a variation of Schelling’s model that is different from the original but also easy to program and, at the same
time, captures his main idea.
Suppose we have two types of people: orange people and green people.
Assume there are 𝑛 of each type.
These agents all live on a single unit square.
Thus, the location (e.g., address) of an agent is just a point (𝑥, 𝑦), where 0 < 𝑥, 𝑦 < 1.
• The set of all points (𝑥, 𝑦) satisfying 0 < 𝑥, 𝑦 < 1 is called the unit square
• Below we denote the unit square by 𝑆

19.2.2 Preferences

We will say that an agent is happy if 5 or more of her 10 nearest neighbors are of the same type.
An agent who is not happy is called unhappy.
For example,
• if an agent is orange and 5 of her 10 nearest neighbors are orange, then she is happy.
• if an agent is green and 8 of her 10 nearest neighbors are orange, then she is unhappy.
‘Nearest’ is in terms of Euclidean distance.
An important point to note is that agents are not averse to living in mixed areas.
They are perfectly happy if half of their neighbors are of the other color.

19.2.3 Behavior

Initially, agents are mixed together (integrated).


In particular, we assume that the initial location of each agent is an independent draw from a bivariate uniform distribution
on the unit square 𝑆.
• First their 𝑥 coordinate is drawn from a uniform distribution on (0, 1)
• Then, independently, their 𝑦 coordinate is drawn from the same distribution.


Now, cycling through the set of all agents, each agent is given the chance to stay or move.
Each agent stays if they are happy and moves if they are unhappy.
The algorithm for moving is as follows

Algorithm 19.2.1 (Jump Chain Algorithm)


1. Draw a random location in 𝑆
2. If happy at new location, move there
3. Otherwise, go to step 1

We cycle continuously through the agents, each time allowing an unhappy agent to move.
We continue to cycle until no one wishes to move.

19.3 Results

Let’s now implement and run this simulation.


In what follows, agents are modeled as objects.
Here’s an indication of their structure:

* Data:

    * type (green or orange)
    * location

* Methods:

    * determine whether happy or not given locations of other agents
    * If not happy, move
        * find a new location where happy

Let’s build them.

class Agent:

    def __init__(self, type):
        self.type = type
        self.draw_location()

    def draw_location(self):
        self.location = uniform(0, 1), uniform(0, 1)

    def get_distance(self, other):
        "Computes the euclidean distance between self and other agent."
        a = (self.location[0] - other.location[0])**2
        b = (self.location[1] - other.location[1])**2
        return sqrt(a + b)

    def happy(self,
              agents,                 # List of other agents
              num_neighbors=10,       # No. of agents viewed as neighbors
              require_same_type=5):   # How many neighbors must be same type
        """
        True if a sufficient number of nearest neighbors are of the same
        type.
        """
        distances = []

        # Distances is a list of pairs (d, agent), where d is distance from
        # agent to self
        for agent in agents:
            if self != agent:
                distance = self.get_distance(agent)
                distances.append((distance, agent))

        # Sort from smallest to largest, according to distance
        distances.sort()

        # Extract the neighboring agents
        neighbors = [agent for d, agent in distances[:num_neighbors]]

        # Count how many neighbors have the same type as self
        num_same_type = sum(self.type == agent.type for agent in neighbors)
        return num_same_type >= require_same_type

    def update(self, agents):
        "If not happy, then randomly choose new locations until happy."
        while not self.happy(agents):
            self.draw_location()

Here’s some code that takes a list of agents and produces a plot showing their locations on the unit square.
Orange agents are represented by orange dots and green ones are represented by green dots.

def plot_distribution(agents, cycle_num):
    "Plot the distribution of agents after cycle_num rounds of the loop."
    x_values_0, y_values_0 = [], []
    x_values_1, y_values_1 = [], []
    # == Obtain locations of each type == #
    for agent in agents:
        x, y = agent.location
        if agent.type == 0:
            x_values_0.append(x)
            y_values_0.append(y)
        else:
            x_values_1.append(x)
            y_values_1.append(y)
    fig, ax = plt.subplots()
    plot_args = {'markersize': 8, 'alpha': 0.8}
    ax.set_facecolor('azure')
    ax.plot(x_values_0, y_values_0,
            'o', markerfacecolor='orange', **plot_args)
    ax.plot(x_values_1, y_values_1,
            'o', markerfacecolor='green', **plot_args)
    ax.set_title(f'Cycle {cycle_num-1}')
    plt.show()


And here’s some pseudocode for the main loop, where we cycle through the agents until no one wishes to move.
The pseudocode is

plot the distribution

while agents are still moving
    for agent in agents
        give agent the opportunity to move

plot the distribution

The real code is below

def run_simulation(num_of_type_0=600,
                   num_of_type_1=600,
                   max_iter=100_000,   # Maximum number of iterations
                   set_seed=1234):

    # Set the seed for reproducibility
    seed(set_seed)

    # Create a list of agents of type 0
    agents = [Agent(0) for i in range(num_of_type_0)]
    # Append a list of agents of type 1
    agents.extend(Agent(1) for i in range(num_of_type_1))

    # Initialize a counter
    count = 1

    # Plot the initial distribution
    plot_distribution(agents, count)

    # Loop until no agent wishes to move
    while count < max_iter:
        print('Entering loop ', count)
        count += 1
        no_one_moved = True
        for agent in agents:
            old_location = agent.location
            agent.update(agents)
            if agent.location != old_location:
                no_one_moved = False
        if no_one_moved:
            break

    # Plot final distribution
    plot_distribution(agents, count)

    if count < max_iter:
        print(f'Converged after {count} iterations.')
    else:
        print('Hit iteration bound and terminated.')

Let’s have a look at the results.

run_simulation()


Entering loop 1

Entering loop 2

Entering loop 3

Entering loop 4

Entering loop 5

Entering loop 6

Entering loop 7


Converged after 8 iterations.

As discussed above, agents are initially mixed randomly together.


But after several cycles, they become segregated into distinct regions.
In this instance, the program terminated after a small number of cycles through the set of agents, indicating that all agents
had reached a state of happiness.
What is striking about the pictures is how rapidly racial integration breaks down.
This is despite the fact that people in the model don’t actually mind living mixed with the other type.
Even with these preferences, the outcome is a high degree of segregation.

19.4 Exercises

Exercise 19.4.1
The object oriented style that we used for coding above is neat but harder to optimize than procedural code (i.e., code
based around functions rather than objects and methods).
Try writing a new version of the model that stores
• the locations of all agents as a 2D NumPy array of floats.
• the types of all agents as a flat NumPy array of integers.
Write functions that act on this data to update the model using the logic similar to that described above.
However, implement the following two changes:


1. Agents are offered a move at random (i.e., selected randomly and given the opportunity to move).
2. After an agent has moved, flip their type with probability 0.01
The second change introduces extra randomness into the model.
(We can imagine that, every so often, an agent moves to a different city and, with small probability, is replaced by an
agent of the other type.)

Solution to Exercise 19.4.1



from numpy.random import uniform, randint

n = 1000 # number of agents (agents = 0, ..., n-1)


k = 10 # number of agents regarded as neighbors
require_same_type = 5 # want >= require_same_type neighbors of the same type

def initialize_state():
locations = uniform(size=(n, 2))
types = randint(0, high=2, size=n) # label zero or one
return locations, types

def compute_distances_from_loc(loc, locations):


""" Compute distance from location loc to all other points. """
return np.linalg.norm(loc - locations, axis=1)

def get_neighbors(loc, locations):


" Get all neighbors of a given location. "
all_distances = compute_distances_from_loc(loc, locations)
indices = np.argsort(all_distances) # sort agents by distance to loc
neighbors = indices[:k] # keep the k closest ones
return neighbors

def is_happy(i, locations, types):


happy = True
agent_loc = locations[i, :]
agent_type = types[i]
neighbors = get_neighbors(agent_loc, locations)
neighbor_types = types[neighbors]
if sum(neighbor_types == agent_type) < require_same_type:
happy = False
return happy

def count_happy(locations, types):


" Count the number of happy agents. "
happy_sum = 0
for i in range(n):
happy_sum += is_happy(i, locations, types)
return happy_sum

def update_agent(i, locations, types):


" Move agent if unhappy. "
moved = False
while not is_happy(i, locations, types):
moved = True
locations[i, :] = uniform(), uniform()
return moved

def plot_distribution(locations, types, title, savepdf=False):


" Plot the distribution of agents after cycle_num rounds of the loop."
fig, ax = plt.subplots()
colors = 'orange', 'green'
for agent_type, color in zip((0, 1), colors):
idx = (types == agent_type)
ax.plot(locations[idx, 0],
locations[idx, 1],
'o',
markersize=8,
markerfacecolor=color,
alpha=0.8)
ax.set_title(title)
plt.show()

def sim_random_select(max_iter=100_000, flip_prob=0.01, test_freq=10_000):


"""
Simulate by randomly selecting one household at each update.

Flip the color of the household with probability `flip_prob`.

"""

locations, types = initialize_state()


current_iter = 0

while current_iter <= max_iter:

# Choose a random agent and update them


i = randint(0, n)
moved = update_agent(i, locations, types)

if flip_prob > 0:
# flip agent i's type with probability epsilon
U = uniform()
if U < flip_prob:
current_type = types[i]
types[i] = 0 if current_type == 1 else 1

# Every so many updates, plot and test for convergence


if current_iter % test_freq == 0:
cycle = current_iter / n
plot_distribution(locations, types, f'iteration {current_iter}')
if count_happy(locations, types) == n:
print(f"Converged at iteration {current_iter}")
break

current_iter += 1

if current_iter > max_iter:


print(f"Terminating at iteration {current_iter}")


When we run this we again find that mixed neighborhoods break down and segregation emerges.
Here’s a sample run.

sim_random_select(max_iter=50_000, flip_prob=0.01, test_freq=10_000)


Terminating at iteration 50001



Part VI

Nonlinear Dynamics

CHAPTER

TWENTY

THE SOLOW-SWAN GROWTH MODEL

In this lecture we review a famous model due to Robert Solow (1924–2023) and Trevor Swan (1918–1989).
The model is used to study growth over the long run.
Although the model is simple, it contains some interesting lessons.
We will use the following imports

import matplotlib.pyplot as plt


import numpy as np

20.1 The model

In a Solow–Swan economy, agents save a fixed fraction of their current incomes.


Savings sustain or increase the stock of capital.
Capital is combined with labor to produce output, which in turn is paid out to workers and owners of capital.
To keep things simple, we ignore population and productivity growth.
For each integer 𝑡 ≥ 0, output 𝑌𝑡 in period 𝑡 is given by 𝑌𝑡 = 𝐹 (𝐾𝑡 , 𝐿𝑡 ), where 𝐾𝑡 is capital, 𝐿𝑡 is labor and 𝐹 is an
aggregate production function.
The function 𝐹 is assumed to be nonnegative and homogeneous of degree one, meaning that

𝐹 (𝜆𝐾, 𝜆𝐿) = 𝜆𝐹 (𝐾, 𝐿) for all 𝜆 ≥ 0

Production functions with this property include


• the Cobb-Douglas function $F(K, L) = AK^{\alpha} L^{1-\alpha}$ with $0 \leq \alpha \leq 1$ and
• the CES function $F(K, L) = \{aK^{\rho} + bL^{\rho}\}^{1/\rho}$ with $a, b, \rho > 0$.
We assume a closed economy, so domestic investment equals aggregate domestic saving.
The saving rate is a constant 𝑠 satisfying 0 ≤ 𝑠 ≤ 1, so that aggregate investment and saving both equal 𝑠𝑌𝑡 .
Capital depreciates: without replenishing through investment, one unit of capital today becomes 1 − 𝛿 units tomorrow.
Thus,

𝐾𝑡+1 = 𝑠𝐹 (𝐾𝑡 , 𝐿𝑡 ) + (1 − 𝛿)𝐾𝑡

Without population growth, 𝐿𝑡 equals some constant 𝐿.


Setting 𝑘𝑡 ∶= 𝐾𝑡 /𝐿 and using homogeneity of degree one now yields

$$k_{t+1} = s \frac{F(K_t, L)}{L} + (1 - \delta)k_t = s F(k_t, 1) + (1 - \delta)k_t$$
With 𝑓(𝑘) ∶= 𝐹 (𝑘, 1), the final expression for capital dynamics is

𝑘𝑡+1 = 𝑔(𝑘𝑡 ) where 𝑔(𝑘) ∶= 𝑠𝑓(𝑘) + (1 − 𝛿)𝑘 (20.1)

Our aim is to learn about the evolution of 𝑘𝑡 over time, given an exogenous initial capital stock 𝑘0 .
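
Before turning to graphs, here is a minimal sketch (our addition, not in the original text) of what iterating (20.1) looks like in code, using a Cobb-Douglas 𝑓(𝑘) = 𝐴𝑘^𝛼 and the same parameter values adopted in the next section.

# Minimal illustration (our addition): iterate k_{t+1} = s f(k_t) + (1 - δ) k_t
# with a Cobb-Douglas f(k) = A k**α and the parameters used below.
A, s, alpha, delta = 2.0, 0.3, 0.3, 0.4

def f(k):
    return A * k**alpha

k = 0.25   # an arbitrary initial capital stock per worker
for t in range(5):
    print(f"k_{t} = {k:.4f}")
    k = s * f(k) + (1 - delta) * k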

20.2 A graphical perspective

To understand the dynamics of the sequence (𝑘𝑡 )𝑡≥0 we use a 45 degree diagram.
To do so, we first need to specify the functional form for 𝑓 and assign values to the parameters.
We choose the Cobb–Douglas specification 𝑓(𝑘) = 𝐴𝑘𝛼 and set 𝐴 = 2.0, 𝛼 = 0.3, 𝑠 = 0.3 and 𝛿 = 0.4.
The function 𝑔 from (20.1) is then plotted, along with the 45 degree line.
Let’s define the constants.

A, s, alpha, delta = 2, 0.3, 0.3, 0.4


x0 = 0.25
xmin, xmax = 0, 3

Now, we define the function 𝑔

def g(A, s, alpha, delta, k):
    return A * s * k**alpha + (1 - delta) * k

Let’s plot the 45 degree diagram of 𝑔

def plot45(kstar=None):
xgrid = np.linspace(xmin, xmax, 12000)

fig, ax = plt.subplots()

ax.set_xlim(xmin, xmax)

g_values = g(A, s, alpha, delta, xgrid)

ymin, ymax = np.min(g_values), np.max(g_values)


ax.set_ylim(ymin, ymax)

lb = r'$g(k) = sAk^{\alpha} + (1 - \delta)k$'


ax.plot(xgrid, g_values, lw=2, alpha=0.6, label=lb)
ax.plot(xgrid, xgrid, 'k-', lw=1, alpha=0.7, label='45')

if kstar:
fps = (kstar,)

ax.plot(fps, fps, 'go', ms=10, alpha=0.6)

ax.annotate(r'$k^* = (sA / \delta)^{(1/(1-\alpha))}$',


xy=(kstar, kstar),
xycoords='data',
xytext=(-40, -60),
textcoords='offset points',
fontsize=14,
arrowprops=dict(arrowstyle="->"))

ax.legend(loc='upper left', frameon=False, fontsize=12)

ax.set_xticks((0, 1, 2, 3))
ax.set_yticks((0, 1, 2, 3))

ax.set_xlabel('$k_t$', fontsize=12)
ax.set_ylabel('$k_{t+1}$', fontsize=12)

plt.show()

plot45()

Suppose, at some 𝑘𝑡 , the value 𝑔(𝑘𝑡 ) lies strictly above the 45 degree line.
Then we have 𝑘𝑡+1 = 𝑔(𝑘𝑡 ) > 𝑘𝑡 and capital per worker rises.
If 𝑔(𝑘𝑡 ) < 𝑘𝑡 then capital per worker falls.
If 𝑔(𝑘𝑡 ) = 𝑘𝑡 , then we are at a steady state and 𝑘𝑡 remains constant.
(A steady state of the model is a fixed point of the mapping 𝑔.)
From the shape of the function 𝑔 in the figure, we see that there is a unique steady state in (0, ∞).


It solves $k = sAk^{\alpha} + (1 - \delta)k$ and hence is given by

$$k^* := \left( \frac{sA}{\delta} \right)^{1/(1-\alpha)} \tag{20.2}$$
If initial capital is below 𝑘∗ , then capital increases over time.
If initial capital is above this level, then the reverse is true.
Let’s plot the 45 degree diagram to show the 𝑘∗ in the plot

kstar = ((s * A) / delta)**(1/(1 - alpha))


plot45(kstar)

From our graphical analysis, it appears that (𝑘𝑡 ) converges to 𝑘∗ , regardless of initial capital 𝑘0 .
This is a form of global stability.
The next figure shows three time paths for capital, from three distinct initial conditions, under the parameterization listed
above.
At this parameterization, 𝑘∗ ≈ 1.78.
Let’s define the constants and three distinct initial conditions

A, s, alpha, delta = 2, 0.3, 0.3, 0.4


x0 = np.array([.25, 1.25, 3.25])

ts_length = 20
xmin, xmax = 0, ts_length
ymin, ymax = 0, 3.5


def simulate_ts(x0_values, ts_length):

k_star = (s * A / delta)**(1/(1-alpha))
fig, ax = plt.subplots(figsize=[11, 5])
ax.set_xlim(xmin, xmax)
ax.set_ylim(ymin, ymax)

ts = np.zeros(ts_length)

# simulate and plot time series


for x_init in x0_values:
ts[0] = x_init
for t in range(1, ts_length):
ts[t] = g(A, s, alpha, delta, ts[t-1])
ax.plot(np.arange(ts_length), ts, '-o', ms=4, alpha=0.6,
label=r'$k_0=%g$' %x_init)
ax.plot(np.arange(ts_length), np.full(ts_length,k_star),
alpha=0.6, color='red', label=r'$k^*$')
ax.legend(fontsize=10)

ax.set_xlabel(r'$t$', fontsize=14)
ax.set_ylabel(r'$k_t$', fontsize=14)

plt.show()

simulate_ts(x0, ts_length)

As expected, the time paths in the figure all converge to this value.


20.3 Growth in continuous time

In this section we investigate a continuous time version of the Solow–Swan growth model.
We will see how the smoothing provided by continuous time can simplify analysis.
Recall that the discrete time dynamics for capital are given by 𝑘𝑡+1 = 𝑠𝑓(𝑘𝑡 ) + (1 − 𝛿)𝑘𝑡 .
A simple rearrangement gives the rate of change per unit of time:

Δ𝑘𝑡 = 𝑠𝑓(𝑘𝑡 ) − 𝛿𝑘𝑡 where Δ𝑘𝑡 ∶= 𝑘𝑡+1 − 𝑘𝑡

Taking the time step to zero gives the continuous time limit

$$k'_t = s f(k_t) - \delta k_t \qquad \text{with} \quad k'_t := \frac{d}{dt} k_t \tag{20.3}$$
Our aim is to learn about the evolution of 𝑘𝑡 over time, given initial stock 𝑘0 .
A steady state for (20.3) is a value 𝑘∗ at which capital is unchanging, meaning 𝑘𝑡′ = 0 or, equivalently, 𝑠𝑓(𝑘∗ ) = 𝛿𝑘∗ .
We assume 𝑓(𝑘) = 𝐴𝑘𝛼 , so 𝑘∗ solves 𝑠𝐴𝑘𝛼 = 𝛿𝑘.
The solution is the same as the discrete time case—see (20.2).
The dynamics are represented in the next figure, maintaining the parameterization we used above.
Writing 𝑘𝑡′ = 𝑔(𝑘𝑡 ) with 𝑔(𝑘) = 𝑠𝐴𝑘𝛼 − 𝛿𝑘, values of 𝑘 with 𝑔(𝑘) > 0 imply that 𝑘𝑡′ > 0, so capital is increasing.
When 𝑔(𝑘) < 0, the opposite occurs. Once again, high marginal returns to savings at low levels of capital and low returns at high levels of capital combine to yield global stability.
To see this in a figure, let’s define the constants

A, s, alpha, delta = 2, 0.3, 0.3, 0.4

Next we define the function 𝑔 for growth in continuous time

def g_con(A, s, alpha, delta, k):
    return A * s * k**alpha - delta * k

def plot_gcon(kstar=None):

k_grid = np.linspace(0, 2.8, 10000)

fig, ax = plt.subplots(figsize=[11, 5])


ax.plot(k_grid, g_con(A, s, alpha, delta, k_grid), label='$g(k)$')
ax.plot(k_grid, 0 * k_grid, label="$k'=0$")

if kstar:
fps = (kstar,)

ax.plot(fps, 0, 'go', ms=10, alpha=0.6)

ax.annotate(r'$k^* = (sA / \delta)^{(1/(1-\alpha))}$',


xy=(kstar, 0),
xycoords='data',
xytext=(0, 60),
textcoords='offset points',
fontsize=12,
arrowprops=dict(arrowstyle="->"))

ax.legend(loc='lower left', fontsize=12)

ax.set_xlabel("$k$",fontsize=10)
ax.set_ylabel("$k'$", fontsize=10)

ax.set_xticks((0, 1, 2, 3))
ax.set_yticks((-0.3, 0, 0.3))

plt.show()

kstar = ((s * A) / delta)**(1/(1 - alpha))


plot_gcon(kstar)

This shows global stability heuristically for a fixed parameterization, but how would we show the same thing formally for
a continuum of plausible parameters?
In the discrete time case, a neat expression for 𝑘𝑡 is hard to obtain.
In continuous time the process is easier: we can obtain a relatively simple expression for 𝑘𝑡 that specifies the entire path.
The first step is to set $x_t := k_t^{1-\alpha}$, so that $x'_t = (1-\alpha) k_t^{-\alpha} k'_t$.
Substituting into $k'_t = sAk_t^{\alpha} - \delta k_t$ leads to the linear differential equation

$$x'_t = (1-\alpha)(sA - \delta x_t) \tag{20.4}$$

This equation has the exact solution

$$x_t = \left( k_0^{1-\alpha} - \frac{sA}{\delta} \right) \mathrm{e}^{-\delta(1-\alpha)t} + \frac{sA}{\delta}$$

(You can confirm that this function $x_t$ satisfies (20.4) by differentiating it with respect to $t$.)

Converting back to $k_t$ yields

$$k_t = \left[ \left( k_0^{1-\alpha} - \frac{sA}{\delta} \right) \mathrm{e}^{-\delta(1-\alpha)t} + \frac{sA}{\delta} \right]^{1/(1-\alpha)} \tag{20.5}$$


Since 𝛿 > 0 and 𝛼 ∈ (0, 1), we see immediately that 𝑘𝑡 → 𝑘∗ as 𝑡 → ∞ independent of 𝑘0 .


Thus, global stability holds.
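
As a check on the algebra (our addition, not in the original lecture), we can compare the closed-form path (20.5) with a numerical solution of the differential equation obtained via scipy.integrate.solve_ivp.

# Check (our addition): the exact path (20.5) agrees with a numerical
# solution of k' = s A k**α - δ k computed by SciPy's solve_ivp.
from scipy.integrate import solve_ivp

A, s, alpha, delta = 2, 0.3, 0.3, 0.4
k0 = 0.25

def k_exact(t):
    return ((k0**(1 - alpha) - s * A / delta) * np.exp(-delta * (1 - alpha) * t)
            + s * A / delta)**(1 / (1 - alpha))

sol = solve_ivp(lambda t, k: s * A * k**alpha - delta * k,
                t_span=(0, 10), y0=[k0], t_eval=np.linspace(0, 10, 50))

print(np.max(np.abs(sol.y[0] - k_exact(sol.t))))   # small numerical error only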

20.4 Exercises

Exercise 20.4.1
Plot per capita consumption 𝑐 at the steady state, as a function of the savings rate 𝑠, where 0 ≤ 𝑠 ≤ 1.
Use the Cobb–Douglas specification 𝑓(𝑘) = 𝐴𝑘𝛼 .
Set 𝐴 = 2.0, 𝛼 = 0.3, and 𝛿 = 0.5
Also, find the approximate value of 𝑠 that maximizes the 𝑐∗ (𝑠) and show it in the plot.

Solution to Exercise 20.4.1


Steady state consumption at savings rate 𝑠 is given by

𝑐∗ (𝑠) = (1 − 𝑠)𝑓(𝑘∗ ) = (1 − 𝑠)𝐴(𝑘∗ )𝛼

A = 2.0
alpha = 0.3
delta = 0.5

s_grid = np.linspace(0, 1, 1000)


k_star = ((s_grid * A) / delta)**(1/(1 - alpha))
c_star = (1 - s_grid) * A * k_star ** alpha

Let’s find the value of 𝑠 that maximizes 𝑐∗ using scipy.optimize.minimize_scalar. We will use −𝑐∗(𝑠) since minimize_scalar finds the minimum value.

from scipy.optimize import minimize_scalar

def calc_c_star(s):
    k = ((s * A) / delta)**(1/(1 - alpha))
    return - (1 - s) * A * k ** alpha

return_values = minimize_scalar(calc_c_star, bounds=(0, 1))


s_star_max = return_values.x
c_star_max = -return_values.fun
print(f"Function is maximized at s = {round(s_star_max, 4)}")

Function is maximized at s = 0.3

x_s_max = np.array([s_star_max, s_star_max])


y_s_max = np.array([0, c_star_max])

fig, ax = plt.subplots(figsize=[11, 5])


fps = (c_star_max,)

# Highlight the maximum point with a marker


ax.plot((s_star_max, ), (c_star_max,), 'go', ms=8, alpha=0.6)

ax.annotate(r'$s^*$',
xy=(s_star_max, c_star_max),
xycoords='data',
xytext=(20, -50),
textcoords='offset points',
fontsize=12,
arrowprops=dict(arrowstyle="->"))
ax.plot(s_grid, c_star, label=r'$c*(s)$')
ax.plot(x_s_max, y_s_max, alpha=0.5, ls='dotted')
ax.set_xlabel(r'$s$')
ax.set_ylabel(r'$c^*(s)$')
ax.legend()

plt.show()

One can also try to solve this mathematically by differentiating $c^*(s)$ and solving for $\frac{d}{ds} c^*(s) = 0$ using sympy.

from sympy import solve, Symbol

s_symbol = Symbol('s', real=True)


k = ((s_symbol * A) / delta)**(1/(1 - alpha))
c = (1 - s_symbol) * A * k ** alpha

Let’s differentiate 𝑐 and solve using sympy.solve

# Solve using sympy


s_star = solve(c.diff())[0]
print(f"s_star = {s_star}")

s_star = 0.300000000000000

Incidentally, the rate of savings which maximizes the steady state level of per capita consumption is called the Golden Rule savings rate.

Exercise 20.4.2
Stochastic Productivity
To bring the Solow–Swan model closer to data, we need to think about handling random fluctuations in aggregate quantities.
Among other things, this will eliminate the unrealistic prediction that per-capita output 𝑦𝑡 = 𝐴𝑘𝑡𝛼 converges to a constant
𝑦∗ ∶= 𝐴(𝑘∗ )𝛼 .
We shift to discrete time for the following discussion.
One approach is to replace constant productivity with some stochastic sequence (𝐴𝑡 )𝑡≥1 .
Dynamics are now

𝑘𝑡+1 = 𝑠𝐴𝑡+1 𝑓(𝑘𝑡 ) + (1 − 𝛿)𝑘𝑡 (20.6)

We suppose 𝑓 is Cobb–Douglas and (𝐴𝑡 ) is IID and lognormal.


Now the long run convergence obtained in the deterministic case breaks down, since the system is hit with new shocks at
each point in time.
Consider 𝐴 = 2.0, 𝑠 = 0.6, 𝛼 = 0.3, and 𝛿 = 0.5
Generate and plot the time series 𝑘𝑡 .

Solution to Exercise 20.4.2


Let’s define the constants for lognormal distribution and initial values used for simulation

# Define the constants


sig = 0.2
mu = np.log(2) - sig**2 / 2
A = 2.0
s = 0.6
alpha = 0.3
delta = 0.5
x0 = [.25, 3.25] # list of initial values used for simulation

Let’s define the function k_next to find the next value of 𝑘

def lgnorm():
    return np.exp(mu + sig * np.random.randn())

def k_next(s, alpha, delta, k):
    return lgnorm() * s * k**alpha + (1 - delta) * k

def ts_plot(x_values, ts_length):


fig, ax = plt.subplots(figsize=[11, 5])
ts = np.zeros(ts_length)

# simulate and plot time series


for x_init in x_values:
ts[0] = x_init
for t in range(1, ts_length):
ts[t] = k_next(s, alpha, delta, ts[t-1])
ax.plot(np.arange(ts_length), ts, '-o', ms=4,
alpha=0.6, label=r'$k_0=%g$' %x_init)

ax.legend(loc='best', fontsize=10)

ax.set_xlabel(r'$t$', fontsize=12)
ax.set_ylabel(r'$k_t$', fontsize=12)

plt.show()

ts_plot(x0, 50)



CHAPTER

TWENTYONE

DYNAMICS IN ONE DIMENSION

Contents

• Dynamics in One Dimension


– Overview
– Some definitions
– Stability
– Graphical analysis
– Exercises

21.1 Overview

In this lecture we give a quick introduction to discrete time dynamics in one dimension.
• In one-dimensional models, the state of the system is described by a single variable.
• The variable is a number (that is, a point in ℝ).
While most quantitative models have two or more state variables, the one-dimensional setting is a good place to learn the
foundations of dynamics and understand key concepts.
Let’s start with some standard imports:

%matplotlib inline
import matplotlib.pyplot as plt
import numpy as np

21.2 Some definitions

This section sets out the objects of interest and the kinds of properties we study.


21.2.1 Composition of functions

For this lecture you should know the following.


If
• 𝑔 is a function from 𝐴 to 𝐵 and
• 𝑓 is a function from 𝐵 to 𝐶,
then the composition 𝑓 ∘ 𝑔 of 𝑓 and 𝑔 is defined by

(𝑓 ∘ 𝑔)(𝑥) = 𝑓(𝑔(𝑥))

For example, if
• 𝐴 = 𝐵 = 𝐶 = ℝ, the set of real numbers,
• $g(x) = x^2$ and $f(x) = \sqrt{x}$, then $(f \circ g)(x) = \sqrt{x^2} = |x|$.

If $f$ is a function from $A$ to itself, then $f^2$ is the composition of $f$ with itself.

For example, if $A = (0, \infty)$, the set of positive numbers, and $f(x) = \sqrt{x}$, then

$$f^2(x) = \sqrt{\sqrt{x}} = x^{1/4}$$

Similarly, if $n$ is an integer, then $f^n$ is $n$ compositions of $f$ with itself.

In the example above, $f^n(x) = x^{1/(2^n)}$.
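
As a small numerical illustration (added here), we can check this formula in Python:

# Small check (our addition): composing f(x) = sqrt(x) with itself n times
# gives x**(1 / 2**n).
import numpy as np

def compose_n(f, n, x):
    "Apply f to x a total of n times."
    for _ in range(n):
        x = f(x)
    return x

x, n = 3.0, 4
print(compose_n(np.sqrt, n, x), x**(1 / 2**n))   # the two numbers agree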

21.2.2 Dynamic systems

A (discrete time) dynamic system is a set 𝑆 and a function 𝑔 that sends 𝑆 back into itself.
Examples of dynamic systems include

• 𝑆 = (0, 1) and 𝑔(𝑥) = 𝑥
• 𝑆 = (0, 1) and 𝑔(𝑥) = 𝑥2
• 𝑆 = ℤ (the integers) and 𝑔(𝑥) = 2𝑥
On the other hand, if 𝑆 = (−1, 1) and 𝑔(𝑥) = 𝑥 + 1, then 𝑆 and 𝑔 do not form a dynamic system, since, for example, 𝑔(0.5) = 1.5, which is not in 𝑆.
• 𝑔 does not always send points in 𝑆 back into 𝑆.

21.2.3 Dynamics

We care about dynamic systems because we can use them to study dynamics!
Given a dynamic system consisting of set 𝑆 and function 𝑔, we can create a sequence {𝑥𝑡 } of points in 𝑆 by setting

𝑥𝑡+1 = 𝑔(𝑥𝑡 ) with 𝑥0 given. (21.1)

This means that we choose some number 𝑥0 in 𝑆 and then take

𝑥0 , 𝑥1 = 𝑔(𝑥0 ), 𝑥2 = 𝑔(𝑥1 ) = 𝑔(𝑔(𝑥0 )), etc. (21.2)

This sequence {𝑥𝑡 } is called the trajectory of 𝑥0 under 𝑔.


In this setting, 𝑆 is called the state space and 𝑥𝑡 is called the state variable.


Recalling that 𝑔𝑛 is the 𝑛 compositions of 𝑔 with itself, we can write the trajectory more simply as

𝑥𝑡 = 𝑔𝑡 (𝑥0 ) for 𝑡 ≥ 0.

In all of what follows, we are going to assume that 𝑆 is a subset of ℝ, the real numbers.
Equation (21.1) is sometimes called a first order difference equation
• first order means dependence on only one lag (i.e., earlier states such as 𝑥𝑡−1 do not enter into (21.1)).
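
To make these definitions concrete, here is a small helper (our addition) that computes a trajectory by applying 𝑔 repeatedly:

# Helper sketch (our addition): compute the trajectory x_0, x_1, ..., x_n
# of x0 under the map g.
def trajectory(g, x0, n):
    x = x0
    path = [x]
    for _ in range(n):
        x = g(x)
        path.append(x)
    return path

# Example: the map g(x) = x**2 on S = (0, 1)
print(trajectory(lambda x: x**2, 0.9, 5))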

21.2.4 Example: A Linear Model

One simple example of a dynamic system is when 𝑆 = ℝ and 𝑔(𝑥) = 𝑎𝑥 + 𝑏, where 𝑎, 𝑏 are fixed constants.
This leads to the linear difference equation

𝑥𝑡+1 = 𝑎𝑥𝑡 + 𝑏 with 𝑥0 given.

The trajectory of 𝑥0 is

𝑥0 , 𝑎𝑥0 + 𝑏, 𝑎2 𝑥0 + 𝑎𝑏 + 𝑏, etc. (21.3)

Continuing in this way, and using our knowledge of geometric series, we find that, for any 𝑡 ≥ 0,

$$x_t = a^t x_0 + b \frac{1 - a^t}{1 - a} \tag{21.4}$$

We have an exact expression for $x_t$ for all $t$ and hence a full understanding of the dynamics.

Notice in particular that if $|a| < 1$, then, by (21.4), we have

$$x_t \to \frac{b}{1 - a} \quad \text{as } t \to \infty \tag{21.5}$$

regardless of $x_0$.
This is an example of what is called global stability, a topic we return to below.
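
We can confirm (21.4) and (21.5) numerically (an illustration added here) by comparing direct iteration with the closed-form expression:

# Check (our addition): direct iteration of x_{t+1} = a x_t + b matches the
# closed-form expression (21.4) and approaches b / (1 - a) when |a| < 1.
a, b, x0, T = 0.8, 1.0, 2.0, 30

x = x0
for t in range(T):
    x = a * x + b

closed_form = a**T * x0 + b * (1 - a**T) / (1 - a)
print(x, closed_form, b / (1 - a))   # first two agree; both are near 5.0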

21.2.5 Example: A Nonlinear Model

In the linear example above, we obtained an exact analytical expression for 𝑥𝑡 in terms of arbitrary 𝑡 and 𝑥0 .
This made analysis of dynamics very easy.
When models are nonlinear, however, the situation can be quite different.
For example, recall how we previously studied the law of motion for the Solow growth model, a simplified version of
which is

𝑘𝑡+1 = 𝑠𝑧𝑘𝑡𝛼 + (1 − 𝛿)𝑘𝑡 (21.6)

Here 𝑘 is capital stock and 𝑠, 𝑧, 𝛼, 𝛿 are positive parameters with 0 < 𝛼, 𝛿 < 1.
If you try to iterate like we did in (21.3), you will find that the algebra gets messy quickly.
Analyzing the dynamics of this model requires a different method (see below).


21.3 Stability

Consider a fixed dynamic system consisting of set 𝑆 ⊂ ℝ and 𝑔 mapping 𝑆 to 𝑆.

21.3.1 Steady states

A steady state of this system is a point 𝑥∗ in 𝑆 such that 𝑥∗ = 𝑔(𝑥∗ ).


In other words, 𝑥∗ is a fixed point of the function 𝑔 in 𝑆.
For example, for the linear model 𝑥𝑡+1 = 𝑎𝑥𝑡 + 𝑏, you can use the definition to check that
• 𝑥∗ ∶= 𝑏/(1 − 𝑎) is a steady state whenever 𝑎 ≠ 1.
• if 𝑎 = 1 and 𝑏 = 0, then every 𝑥 ∈ ℝ is a steady state.
• if 𝑎 = 1 and 𝑏 ≠ 0, then the linear model has no steady state in ℝ.

21.3.2 Global stability

A steady state 𝑥∗ of the dynamic system is called globally stable if, for all 𝑥0 ∈ 𝑆,

𝑥𝑡 = 𝑔𝑡 (𝑥0 ) → 𝑥∗ as 𝑡 → ∞

For example, in the linear model 𝑥𝑡+1 = 𝑎𝑥𝑡 + 𝑏 with 𝑎 ≠ 1, the steady state 𝑥∗
• is globally stable if |𝑎| < 1 and
• fails to be globally stable otherwise.
This follows directly from (21.4).

21.3.3 Local stability

A steady state 𝑥∗ of the dynamic system is called locally stable if there exists an 𝜖 > 0 such that

|𝑥0 − 𝑥∗ | < 𝜖 ⟹ 𝑥𝑡 = 𝑔𝑡 (𝑥0 ) → 𝑥∗ as 𝑡 → ∞

Obviously every globally stable steady state is also locally stable.


We will see examples below where the converse is not true.

21.4 Graphical analysis

As we saw above, analyzing the dynamics for nonlinear models is nontrivial.


There is no single way to tackle all nonlinear models.
However, there is one technique for one-dimensional models that provides a great deal of intuition.
This is a graphical approach based on 45 degree diagrams.
Let’s look at an example: the Solow model with dynamics given in (21.6).
We begin with some plotting code that you can ignore at first reading.
The function of the code is to produce 45 degree diagrams and time series plots.
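
The helper code itself is collapsed in this version of the text. For completeness, here is a minimal sketch (ours, which may well differ from the lecture's hidden implementation) of functions with the call signatures plot45(g, xmin, xmax, x0, num_arrows, var) and ts_plot(g, xmin, xmax, x0, ts_length, var) used below.

# Minimal sketch (our addition) of the hidden plotting helpers used below.
def plot45(g, xmin, xmax, x0, num_arrows=6, var='x'):
    "45 degree diagram for x_{t+1} = g(x_t), with a cobweb path from x0."
    xgrid = np.linspace(xmin, xmax, 200)
    fig, ax = plt.subplots()
    ax.plot(xgrid, g(xgrid), lw=2, alpha=0.6, label='$g$')
    ax.plot(xgrid, xgrid, 'k-', lw=1, alpha=0.7, label=r'$45^{\circ}$')

    # Trace the first num_arrows steps of the trajectory as a cobweb path
    x = x0
    for _ in range(num_arrows):
        x_next = g(x)
        ax.plot([x, x], [x, x_next], 'g--', lw=1)             # vertical step
        ax.plot([x, x_next], [x_next, x_next], 'g--', lw=1)   # horizontal step
        x = x_next

    ax.set_xlabel(f'${var}_t$')
    ax.set_ylabel(f'${var}_{{t+1}}$')
    ax.legend(frameon=False)
    plt.show()

def ts_plot(g, xmin, xmax, x0, ts_length=10, var='x'):
    "Time series plot of the trajectory of x0 under g."
    x = np.empty(ts_length)
    x[0] = x0
    for t in range(1, ts_length):
        x[t] = g(x[t-1])
    fig, ax = plt.subplots()
    ax.plot(range(ts_length), x, 'o-', alpha=0.6)
    ax.set_xlabel('$t$')
    ax.set_ylabel(f'${var}_t$')
    plt.show()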


Let’s create a 45 degree diagram for the Solow model with a fixed set of parameters

A, s, alpha, delta = 2, 0.3, 0.3, 0.4

Here’s the update function corresponding to the model.

def g(k):
    return A * s * k**alpha + (1 - delta) * k

Here is the 45 degree plot.

xmin, xmax = 0, 4 # Suitable plotting region.

plot45(g, xmin, xmax, 0, num_arrows=0)

The plot shows the function 𝑔 and the 45 degree line.


Think of 𝑘𝑡 as a value on the horizontal axis.
To calculate 𝑘𝑡+1 , we can use the graph of 𝑔 to see its value on the vertical axis.
Clearly,
• If 𝑔 lies above the 45 degree line at this point, then we have 𝑘𝑡+1 > 𝑘𝑡 .
• If 𝑔 lies below the 45 degree line at this point, then we have 𝑘𝑡+1 < 𝑘𝑡 .
• If 𝑔 hits the 45 degree line at this point, then we have 𝑘𝑡+1 = 𝑘𝑡 , so 𝑘𝑡 is a steady state.
For the Solow model, there are two steady states when 𝑆 = ℝ+ = [0, ∞).
• the origin 𝑘 = 0


• the unique positive number such that 𝑘 = 𝑠𝑧𝑘𝛼 + (1 − 𝛿)𝑘.


By using some algebra, we can show that in the second case, the steady state is

$$k^* = \left( \frac{sz}{\delta} \right)^{1/(1-\alpha)}$$
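
For the parameter values used above (with 𝐴 playing the role of 𝑧), a quick computation added here confirms that this expression is indeed a fixed point of 𝑔:

# Check (our addition): the positive steady state is a fixed point of g.
k_star = (s * A / delta)**(1 / (1 - alpha))
print(k_star, g(k_star))   # both ≈ 1.78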

21.4.1 Trajectories

By the preceding discussion, in regions where 𝑔 lies above the 45 degree line, we know that the trajectory is increasing.
The next figure traces out a trajectory in such a region so we can see this more clearly.
The initial condition is 𝑘0 = 0.25.

k0 = 0.25

plot45(g, xmin, xmax, k0, num_arrows=5, var='k')

We can plot the time series of capital corresponding to the figure above as follows:

ts_plot(g, xmin, xmax, k0, var='k')


Here’s a somewhat longer view:

ts_plot(g, xmin, xmax, k0, ts_length=20, var='k')

When capital stock is higher than the unique positive steady state, we see that it declines:


k0 = 2.95

plot45(g, xmin, xmax, k0, num_arrows=5, var='k')

Here is the time series:

ts_plot(g, xmin, xmax, k0, var='k')


21.4.2 Complex dynamics

The Solow model is nonlinear but still generates very regular dynamics.
One model that generates irregular dynamics is the quadratic map

𝑔(𝑥) = 4𝑥(1 − 𝑥), 𝑥 ∈ [0, 1]

Let’s have a look at the 45 degree diagram.

xmin, xmax = 0, 1
g = lambda x: 4 * x * (1 - x)

x0 = 0.3
plot45(g, xmin, xmax, x0, num_arrows=0)


Now let’s look at a typical trajectory.

plot45(g, xmin, xmax, x0, num_arrows=6)


Notice how irregular it is.


Here is the corresponding time series plot.

ts_plot(g, xmin, xmax, x0, ts_length=6)


The irregularity is even clearer over a longer time horizon:

ts_plot(g, xmin, xmax, x0, ts_length=20)


21.5 Exercises

Exercise 21.5.1
Consider again the linear model 𝑥𝑡+1 = 𝑎𝑥𝑡 + 𝑏 with 𝑎 ≠ 1.
The unique steady state is 𝑏/(1 − 𝑎).
The steady state is globally stable if |𝑎| < 1.
Try to illustrate this graphically by looking at a range of initial conditions.
What differences do you notice in the cases 𝑎 ∈ (−1, 0) and 𝑎 ∈ (0, 1)?
Use 𝑎 = 0.5 and then 𝑎 = −0.5 and study the trajectories.
Set 𝑏 = 1 throughout.

Solution to Exercise 21.5.1


We will start with the case 𝑎 = 0.5.
Let’s set up the model and plotting region:

a, b = 0.5, 1
xmin, xmax = -1, 3
g = lambda x: a * x + b

Now let’s plot a trajectory:

x0 = -0.5
plot45(g, xmin, xmax, x0, num_arrows=5)


Here is the corresponding time series, which converges towards the steady state.

ts_plot(g, xmin, xmax, x0, ts_length=10)

Now let’s try 𝑎 = −0.5 and see what differences we observe.


Let’s set up the model and plotting region:


a, b = -0.5, 1
xmin, xmax = -1, 3
g = lambda x: a * x + b

Now let’s plot a trajectory:

x0 = -0.5
plot45(g, xmin, xmax, x0, num_arrows=5)

Here is the corresponding time series, which converges towards the steady state.

ts_plot(g, xmin, xmax, x0, ts_length=10)


Once again, we have convergence to the steady state but the nature of convergence differs.
In particular, the time series jumps from above the steady state to below it and back again.
In the current context, the series is said to exhibit damped oscillations.



CHAPTER

TWENTYTWO

THE COBWEB MODEL

The cobweb model is a model of prices and quantities in a given market, and how they evolve over time.

22.1 Overview

The cobweb model dates back to the 1930s and, while simple, it remains significant because it shows the fundamental
importance of expectations.
To give some idea of how the model operates, and why expectations matter, imagine the following scenario.
There is a market for soy beans, say, where prices and traded quantities depend on the choices of buyers and sellers.
The buyers are represented by a demand curve — they buy more at low prices and less at high prices.
The sellers have a supply curve — they wish to sell more at high prices and less at low prices.
However, the sellers (who are farmers) need time to grow their crops.
Suppose now that the price is currently high.
Seeing this high price, and perhaps expecting that the high price will remain for some time, the farmers plant many fields
with soy beans.
Next period the resulting high supply floods the market, causing the price to drop.
Seeing this low price, the farmers now shift out of soy beans, restricting supply and causing the price to climb again.
You can imagine how these dynamics could cause cycles in prices and quantities that persist over time.
The cobweb model puts these ideas into equations so we can try to quantify them, and to study conditions under which
cycles persist (or disappear).
In this lecture, we investigate and simulate the basic model under different assumptions regarding the way that producers form expectations.
Our discussion and simulations draw on high quality lectures by Cars Hommes.
We will use the following imports.

import numpy as np
import matplotlib.pyplot as plt


22.2 History

Early papers on the cobweb cycle include [Wau64] and [Har60].


The paper [Har60] uses the cobweb theorem to explain hog prices in the US over 1920–1950.
The next plot replicates part of Figure 2 from that paper, which plots the price of hogs at yearly frequency.
Notice the cyclical price dynamics, which match the kind of cyclical soybean price dynamics discussed above.

hog_prices = [55, 57, 80, 70, 60, 65, 72, 65, 51, 49, 45, 80, 85,
78, 80, 68, 52, 65, 83, 78, 60, 62, 80, 87, 81, 70,
69, 65, 62, 85, 87, 65, 63, 75, 80, 62]
years = np.arange(1924, 1960)
fig, ax = plt.subplots()
ax.plot(years, hog_prices, '-o', ms=4, label='hog price')
ax.set_xlabel('year')
ax.set_ylabel('dollars')
ax.legend()
ax.grid()
plt.show()


22.3 The model

Let’s return to our discussion of a hypothetical soy bean market, where price is determined by supply and demand.
We suppose that demand for soy beans is given by

𝐷(𝑝𝑡 ) = 𝑎 − 𝑏𝑝𝑡

where 𝑎, 𝑏 are nonnegative constants and 𝑝𝑡 is the spot (i.e., current market) price at time 𝑡.
(𝐷(𝑝𝑡 ) is the quantity demanded in some fixed unit, such as thousands of tons.)
Because the crop of soy beans for time $t$ is planted at $t-1$, supply of soy beans at time $t$ depends on expected prices at time $t$, which we denote $p^e_{t-1}$.
We suppose that supply is nonlinear in expected prices, and takes the form

$$S(p^e_{t-1}) = \tanh(\lambda(p^e_{t-1} - c)) + d$$

where 𝜆 is a positive constant and 𝑐, 𝑑 ≥ 0.


Let’s make a plot of supply and demand for particular choices of the parameter values.
First we store the parameters in a class and define the functions above as methods.

class Market:

    def __init__(self,
                 a=8,       # demand parameter
                 b=1,       # demand parameter
                 c=6,       # supply parameter
                 d=1,       # supply parameter
                 λ=2.0):    # supply parameter
        self.a, self.b, self.c, self.d = a, b, c, d
        self.λ = λ

    def demand(self, p):
        a, b = self.a, self.b
        return a - b * p

    def supply(self, p):
        c, d, λ = self.c, self.d, self.λ
        return np.tanh(λ * (p - c)) + d

Now let’s plot.

p_grid = np.linspace(5, 8, 200)


m = Market()
fig, ax = plt.subplots()

ax.plot(p_grid, m.demand(p_grid), label="$D$")


ax.plot(p_grid, m.supply(p_grid), label="S")
ax.set_xlabel("price")
ax.set_ylabel("quantity")
ax.legend()

plt.show()


Market equilibrium requires that supply equals demand, or


$$a - b p_t = S(p^e_{t-1})$$

Rewriting in terms of $p_t$ gives

$$p_t = -\frac{1}{b} \left[ S(p^e_{t-1}) - a \right]$$
Finally, to complete the model, we need to describe how price expectations are formed.
We will assume that expected prices at time 𝑡 depend on past prices.
In particular, we suppose that
$$p^e_{t-1} = f(p_{t-1}, p_{t-2}) \tag{22.1}$$

where 𝑓 is some function.


Thus, we are assuming that producers expect the time-𝑡 price to be some function of lagged prices, up to 2 lags.
(We could of course add additional lags and readers are encouraged to experiment with such cases.)
Combining the last two equations gives the dynamics for prices:
$$p_t = -\frac{1}{b} \left[ S(f(p_{t-1}, p_{t-2})) - a \right] \tag{22.2}$$
The price dynamics depend on the parameter values and also on the function 𝑓 that determines how producers form
expectations.


22.4 Naive expectations

To go further in our analysis we need to specify the function 𝑓; that is, how expectations are formed.
Let’s start with naive expectations, which refers to the case where producers expect the next period spot price to be
whatever the price is in the current period.
In other words,
$$p^e_{t-1} = p_{t-1}$$

Using (22.2), we then have


$$p_t = -\frac{1}{b} \left[ S(p_{t-1}) - a \right]$$
We can write this as

𝑝𝑡 = 𝑔(𝑝𝑡−1 )

where 𝑔 is the function defined by


$$g(p) = -\frac{1}{b} \left[ S(p) - a \right] \tag{22.3}$$
Here we represent the function 𝑔

def g(model, current_price):
    """
    Function to find the next price given the current price
    and Market model
    """
    a, b = model.a, model.b
    next_price = - (model.supply(current_price) - a) / b
    return next_price

Let’s try to understand how prices will evolve using a 45 degree diagram, which is a tool for studying one-dimensional
dynamics.
The function plot45 defined below helps us draw the 45 degree diagram.
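
The plot45 helper is collapsed in this version of the text; here is a minimal sketch (ours, which may differ from the original implementation) matching the call plot45(model, pmin, pmax, p0, num_arrows) used below.

# Minimal sketch (our addition) of the hidden plot45 helper for this chapter.
def plot45(model, pmin, pmax, p0, num_arrows=5):
    "45 degree diagram for p_{t+1} = g(model, p_t), with a cobweb path from p0."
    p_grid = np.linspace(pmin, pmax, 200)
    fig, ax = plt.subplots()
    ax.plot(p_grid, g(model, p_grid), lw=2, alpha=0.6, label='$g$')
    ax.plot(p_grid, p_grid, 'k-', lw=1, alpha=0.7, label=r'$45^{\circ}$')

    p = p0
    for _ in range(num_arrows):
        p_next = g(model, p)
        ax.plot([p, p], [p, p_next], 'g--', lw=1)
        ax.plot([p, p_next], [p_next, p_next], 'g--', lw=1)
        p = p_next

    ax.set_xlabel('$p_t$')
    ax.set_ylabel('$p_{t+1}$')
    ax.legend(frameon=False)
    plt.show()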
Now we can set up a market and plot the 45 degree diagram.

m = Market()

plot45(m, 0, 9, 2, num_arrows=3)


The plot shows the function 𝑔 defined in (22.3) and the 45 degree line.
Think of 𝑝𝑡 as a value on the horizontal axis.
Since 𝑝𝑡+1 = 𝑔(𝑝𝑡 ), we use the graph of 𝑔 to see 𝑝𝑡+1 on the vertical axis.
Clearly,
• If 𝑔 lies above the 45 degree line at 𝑝𝑡 , then we have 𝑝𝑡+1 > 𝑝𝑡 .
• If 𝑔 lies below the 45 degree line at 𝑝𝑡 , then we have 𝑝𝑡+1 < 𝑝𝑡 .
• If 𝑔 hits the 45 degree line at 𝑝𝑡 , then we have 𝑝𝑡+1 = 𝑝𝑡 , so 𝑝𝑡 is a steady state.
Consider the sequence of prices starting at 𝑝0 , as shown in the figure.
We find 𝑝1 on the vertical axis and then shift it to the horizontal axis using the 45 degree line (where values on the two
axes are equal).
Then from 𝑝1 we obtain 𝑝2 and continue.
We can see the start of a cycle.
To confirm this, let’s plot a time series.

def ts_plot_price(model, # Market model


p0, # Initial price
y_a=3, y_b= 12, # Controls y-axis
ts_length=10): # Length of time series
"""
Function to simulate and plot the time series of price.

"""
fig, ax = plt.subplots()
ax.set_xlabel(r'$t$', fontsize=12)
ax.set_ylabel(r'$p_t$', fontsize=12)
p = np.empty(ts_length)
p[0] = p0
for t in range(1, ts_length):
p[t] = g(model, p[t-1])
ax.plot(np.arange(ts_length),
p,
'bo-',
alpha=0.6,
lw=2,
label=r'$p_t$')
ax.legend(loc='best', fontsize=10)
ax.set_ylim(y_a, y_b)
ax.set_xticks(np.arange(ts_length))
plt.show()

ts_plot_price(m, 4, ts_length=15)

We see that a cycle has formed and the cycle is persistent.


(You can confirm this by plotting over a longer time horizon.)
The cycle is “stable”, in the sense that prices converge to it from most starting conditions.
For example,

ts_plot_price(m, 10, ts_length=15)


22.5 Adaptive expectations

Naive expectations are quite simple and also important in driving the cycle that we found.
What if expectations are formed in a different way?
Next we consider adaptive expectations.
This refers to the case where producers form expectations for the next period price as a weighted average of their last
guess and the current spot price.
That is,
$$p^e_{t-1} = \alpha p_{t-1} + (1-\alpha) p^e_{t-2} \qquad (0 \leq \alpha \leq 1) \tag{22.4}$$

Another way to write this is


$$p^e_{t-1} = p^e_{t-2} + \alpha (p_{t-1} - p^e_{t-2}) \tag{22.5}$$

This equation helps to show that expectations shift


1. up when prices last period were above expectations
2. down when prices last period were below expectations
Using (22.4), we obtain the dynamics
$$p_t = -\frac{1}{b} \left[ S(\alpha p_{t-1} + (1-\alpha) p^e_{t-2}) - a \right]$$
Let’s try to simulate the price and observe the dynamics using different values of 𝛼.


def find_next_price_adaptive(model, curr_price_exp):
    """
    Function to find the next price given the current price expectation
    and Market model
    """
    return - (model.supply(curr_price_exp) - model.a) / model.b

The function below plots price dynamics under adaptive expectations for different values of 𝛼.

def ts_price_plot_adaptive(model, p0, ts_length=10, α=[1.0, 0.9, 0.75]):


fig, axs = plt.subplots(1, len(α), figsize=(12, 5))
for i_plot, a in enumerate(α):
pe_last = p0
p_values = np.empty(ts_length)
p_values[0] = p0
for i in range(1, ts_length):
p_values[i] = find_next_price_adaptive(model, pe_last)
pe_last = a*p_values[i] + (1 - a)*pe_last

axs[i_plot].plot(np.arange(ts_length), p_values)
axs[i_plot].set_title(r'$\alpha={}$'.format(a))
axs[i_plot].set_xlabel('t')
axs[i_plot].set_ylabel('price')
plt.show()

Let’s call the function with prices starting at 𝑝0 = 5.



ts_price_plot_adaptive(m, 5, ts_length=30)

Note that if 𝛼 = 1, then adaptive expectations are just naive expectations.


Decreasing the value of 𝛼 shifts more weight to the previous expectations, which stabilizes expected prices.
This increased stability can be seen in the figures.


22.6 Exercises

Exercise 22.6.1
Using the default Market class and naive expectations, plot a time series simulation of supply (rather than the price).
Show, in particular, that supply also cycles.

Solution to Exercise 22.6.1

def ts_plot_supply(model, p0, ts_length=10):


"""
Function to simulate and plot the supply function
given the initial price.
"""
pe_last = p0
s_values = np.empty(ts_length)
for i in range(ts_length):
# store quantity
s_values[i] = model.supply(pe_last)
# update price
pe_last = - (s_values[i] - model.a) / model.b

fig, ax = plt.subplots()
ax.plot(np.arange(ts_length),
s_values,
'bo-',
alpha=0.6,
lw=2,
label=r'supply')

ax.legend(loc='best', fontsize=10)
ax.set_xticks(np.arange(ts_length))
ax.set_xlabel("time")
ax.set_ylabel("quantity")
plt.show()

m = Market()
ts_plot_supply(m, 5, 15)


Exercise 22.6.2
Backward looking average expectations
Backward looking average expectations refers to the case where producers form expectations for the next period price as a linear combination of the last price and the price before that.
That is,
$$p^e_{t-1} = \alpha p_{t-1} + (1-\alpha) p_{t-2} \tag{22.6}$$

Simulate and plot the price dynamics for 𝛼 ∈ {0.1, 0.3, 0.5, 0.8} where 𝑝0 = 1 and 𝑝1 = 2.5.

Solution to Exercise 22.6.2

def find_next_price_blae(model, curr_price_exp):
    """
    Function to find the next price given the current price expectation
    and Market model
    """
    return - (model.supply(curr_price_exp) - model.a) / model.b

def ts_plot_price_blae(model, p0, p1, alphas, ts_length=15):


"""
Function to simulate and plot the time series of price
using backward looking average expectations.
"""
fig, axes = plt.subplots(len(alphas), 1, figsize=(8, 16))

for ax, a in zip(axes.flatten(), alphas):
p = np.empty(ts_length)
p[0] = p0
p[1] = p1
for t in range(2, ts_length):
pe = a*p[t-1] + (1 - a)*p[t-2]
p[t] = -(model.supply(pe) - model.a) / model.b
ax.plot(np.arange(ts_length),
p,
'o-',
alpha=0.6,
label=r'$\alpha={}$'.format(a))
ax.legend(loc='best', fontsize=10)
ax.set_xlabel(r'$t$', fontsize=12)
ax.set_ylabel(r'$p_t$', fontsize=12)
plt.show()

m = Market()
ts_plot_price_blae(m,
p0=5,
p1=6,
alphas=[0.1, 0.3, 0.5, 0.8],
ts_length=20)



CHAPTER

TWENTYTHREE

THE OVERLAPPING GENERATIONS MODEL

In this lecture we study the famous overlapping generations (OLG) model, which is used by policy makers and researchers
to examine
• fiscal policy
• monetary policy
• long run growth
and many other topics.
The first rigorous version of the OLG model was developed by Paul Samuelson [Sam58].
Our aim is to gain a good understanding of a simple version of the OLG model.

23.1 Overview

The dynamics of the OLG model are quite similar to those of the Solow-Swan growth model.
At the same time, the OLG model adds an important new feature: the choice of how much to save is endogenous.
To see why this is important, suppose, for example, that we are interested in predicting the effect of a new tax on long-run
growth.
We could add a tax to the Solow-Swan model and look at the change in the steady state.
But this ignores the fact that households will change their savings and consumption behavior when they face the new tax
rate.
Such changes can substantially alter the predictions of the model.
Hence, if we care about accurate predictions, we should model the decision problems of the agents.
In particular, households in the model should decide how much to save and how much to consume, given the environment
that they face (technology, taxes, prices, etc.)
The OLG model takes up this challenge.
We will present a simple version of the OLG model that clarifies the decision problem of households and studies the
implications for long run growth.
Let’s start with some imports.

import numpy as np
from scipy import optimize
from collections import namedtuple
from functools import partial
import matplotlib.pyplot as plt

23.2 Environment

We assume that time is discrete, so that 𝑡 = 0, 1, …,


An individual born at time 𝑡 lives for two periods, 𝑡 and 𝑡 + 1.
We call an agent
• “young” during the first period of their lives and
• “old” during the second period of their lives
Young agents work, supplying labor and earning labor income.
They also decide how much to save.
Old agents do not work, so all income is financial.
Their financial income comes from interest on the savings they accumulated out of wage income when young; these savings are supplied to firms as capital and combined with the labor of the new young generation at 𝑡 + 1.
The wage and interest rates are determined in equilibrium by supply and demand.
To make the algebra slightly easier, we are going to assume a constant population size.
We normalize the constant population size in each period to 1.
We also suppose that each agent supplies one “unit” of labor hours, so total labor supply is 1.

23.3 Supply of capital

First let’s consider the household side.

23.3.1 Consumer’s problem

Suppose that utility for individuals born at time 𝑡 takes the form

𝑈𝑡 = 𝑢(𝑐𝑡 ) + 𝛽𝑢(𝑐𝑡+1 ) (23.1)

Here
• 𝑢 ∶ ℝ+ → ℝ is called the “flow” utility function
• 𝛽 ∈ (0, 1) is the discount factor
• 𝑐𝑡 is time 𝑡 consumption of the individual born at time 𝑡
• 𝑐𝑡+1 is time 𝑡 + 1 consumption of the same individual
We assume that 𝑢 is strictly increasing.
Savings behavior is determined by the optimization problem

$$\max_{c_t, c_{t+1}} \; \{ u(c_t) + \beta u(c_{t+1}) \} \tag{23.2}$$


subject to

𝑐𝑡 + 𝑠 𝑡 ≤ 𝑤 𝑡 and 𝑐𝑡+1 ≤ 𝑅𝑡+1 𝑠𝑡

Here
• 𝑠𝑡 is savings by an individual born at time 𝑡
• 𝑤𝑡 is the wage rate at time 𝑡
• 𝑅𝑡+1 is the interest rate on savings invested at time 𝑡, paid at time 𝑡 + 1
Since 𝑢 is strictly increasing, both of these constraints will hold as equalities at the maximum.
Using this fact and substituting 𝑠𝑡 from the first constraint into the second we get 𝑐𝑡+1 = 𝑅𝑡+1 (𝑤𝑡 − 𝑐𝑡 ).
The first-order condition for a maximum can be obtained by plugging 𝑐𝑡+1 into the objective function, taking the derivative
with respect to 𝑐𝑡 , and setting it to zero.
This leads to the Euler equation of the OLG model, which is

𝑢′ (𝑐𝑡 ) = 𝛽𝑅𝑡+1 𝑢′ (𝑅𝑡+1 (𝑤𝑡 − 𝑐𝑡 )) (23.3)

From the first constraint we get 𝑐𝑡 = 𝑤𝑡 − 𝑠𝑡 , so the Euler equation can also be expressed as

𝑢′ (𝑤𝑡 − 𝑠𝑡 ) = 𝛽𝑅𝑡+1 𝑢′ (𝑅𝑡+1 𝑠𝑡 ) (23.4)

Suppose that, for each 𝑤𝑡 and 𝑅𝑡+1 , there is exactly one 𝑠𝑡 that solves (23.4).
Then savings can be written as a fixed function of 𝑤𝑡 and 𝑅𝑡+1 .
We write this as

𝑠𝑡 = 𝑠(𝑤𝑡 , 𝑅𝑡+1 ) (23.5)

The precise form of the function 𝑠 will depend on the choice of flow utility function 𝑢.
Together, 𝑤𝑡 and 𝑅𝑡+1 represent the prices in the economy (price of labor and rental rate of capital).
Thus, (23.5) states the quantity of savings given prices.

23.3.2 Example: log preferences

In the special case 𝑢(𝑐) = log 𝑐, the Euler equation simplifies to 𝑠𝑡 = 𝛽(𝑤𝑡 − 𝑠𝑡 ).
Solving for saving, we get
$$s_t = s(w_t, R_{t+1}) = \frac{\beta}{1+\beta} w_t \tag{23.6}$$
In this special case, savings does not depend on the interest rate.
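
We can verify this numerically (a check added here, with 𝛽 and 𝑤 set to the values used later in this lecture): solving the Euler equation (23.4) with log utility for several interest rates returns the same savings level 𝛽𝑤/(1 + 𝛽).

# Check (our addition): with u(c) = log(c), the savings level solving the
# Euler equation u'(w - s) = β R u'(R s) does not depend on R.
from scipy.optimize import brentq

β, w = 0.9, 2.0

def euler_gap(s, R):
    return 1 / (w - s) - β * R * (1 / (R * s))

for R in (0.5, 1.0, 2.0):
    s_solution = brentq(euler_gap, 1e-8, w - 1e-8, args=(R,))
    print(R, s_solution, β * w / (1 + β))   # last two columns coincide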

23.3.3 Savings and investment

Since the population size is normalized to 1, 𝑠𝑡 is also total savings in the economy at time 𝑡.
In our closed economy, there is no foreign investment, so net savings equals total investment, which can be understood as
supply of capital to firms.
In the next section we investigate demand for capital.
Equating supply and demand will allow us to determine equilibrium in the OLG economy.


23.4 Demand for capital

First we describe the firm problem and then we write down an equation describing demand for capital given prices.

23.4.1 Firm’s problem

For each integer 𝑡 ≥ 0, output 𝑦𝑡 in period 𝑡 is given by the Cobb-Douglas production function

$$y_t = k_t^{\alpha} \ell_t^{1-\alpha} \tag{23.7}$$

Here 𝑘𝑡 is capital, ℓ𝑡 is labor, and 𝛼 is a parameter (sometimes called the “output elasticity of capital”).
The profit maximization problem of the firm is

$$\max_{k_t, \ell_t} \{ k_t^{\alpha} \ell_t^{1-\alpha} - R_t k_t - \ell_t w_t \} \tag{23.8}$$

The first-order conditions are obtained by taking the derivative of the objective function with respect to capital and labor
respectively and setting them to zero:

(1 − 𝛼)(𝑘𝑡 /ℓ𝑡 )𝛼 = 𝑤𝑡 and 𝛼(𝑘𝑡 /ℓ𝑡 )𝛼−1 = 𝑅𝑡

23.4.2 Demand

Using our assumption ℓ𝑡 = 1 allows us to write

𝑤𝑡 = (1 − 𝛼)𝑘𝑡𝛼 (23.9)

and

𝑅𝑡 = 𝛼𝑘𝑡𝛼−1 (23.10)

Rearranging (23.10) gives the aggregate demand for capital at time 𝑡 + 1

$$k^d(R_{t+1}) := \left( \frac{\alpha}{R_{t+1}} \right)^{1/(1-\alpha)} \tag{23.11}$$

In Python code this is

def capital_demand(R, α):
    return (α/R)**(1/(1-α))

The next figure plots the supply of capital, as in (23.6), as well as the demand for capital, as in (23.11), as functions of the interest rate 𝑅𝑡+1 .
(For the special case of log utility, supply does not depend on the interest rate, so we have a constant function.)

R_vals = np.linspace(0.3, 1)
α, β = 0.5, 0.9
w = 2.0

fig, ax = plt.subplots()

ax.plot(R_vals, capital_demand(R_vals, α),


label="aggregate demand")
ax.plot(R_vals, np.ones_like(R_vals) * (β / (1 + β)) * w,
label="aggregate supply")

ax.set_xlabel("$R_{t+1}$")
ax.set_ylabel("$k_{t+1}$")
ax.legend()
plt.show()

23.5 Equilibrium

In this section we derive equilibrium conditions and investigate an example.

23.5.1 Equilibrium conditions

In equilibrium, savings at time 𝑡 equals investment at time 𝑡, which equals capital supply at time 𝑡 + 1.
Equilibrium is computed by equating these quantities, setting
𝑠(𝑤𝑡 , 𝑅𝑡+1 ) = 𝑘^𝑑 (𝑅𝑡+1 ) = (𝛼/𝑅𝑡+1 )^{1/(1−𝛼)}    (23.12)

In principle, we can now solve for the equilibrium price 𝑅𝑡+1 given 𝑤𝑡 .
(In practice, we first need to specify the function 𝑢 and hence 𝑠.)
When we solve this equation, which concerns time 𝑡 + 1 outcomes, time 𝑡 quantities are already determined, so we can
treat 𝑤𝑡 as a constant.
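Here is a sketch of how this could be done numerically for a general savings function; the root-finding bracket and the parameter values below are illustrative assumptions. For log utility, the result should agree with the closed-form solution derived below.

from scipy.optimize import brentq

def equilibrium_R_numeric(s, w, α, R_min=1e-4, R_max=10.0):
    # Find R_{t+1} at which the supply of capital equals the demand for capital
    excess_supply = lambda R: s(w, R) - capital_demand(R, α)
    return brentq(excess_supply, R_min, R_max)

α, β = 0.5, 0.9
s_log = lambda w, R: β / (1 + β) * w   # log-utility savings from (23.6)
equilibrium_R_numeric(s_log, w=2.0, α=α)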


From equilibrium 𝑅𝑡+1 and (23.4.5), we can obtain the equilibrium quantity 𝑘𝑡+1 .

23.5.2 Example: log utility

In the case of log utility, we can use (23.5.1) and (23.3.6) to obtain
(𝛽/(1 + 𝛽)) 𝑤𝑡 = (𝛼/𝑅𝑡+1 )^{1/(1−𝛼)}    (23.13)
Solving for the equilibrium interest rate gives
𝑅𝑡+1 = 𝛼 ((𝛽/(1 + 𝛽)) 𝑤𝑡 )^{𝛼−1}    (23.14)
In Python we can compute this via

def equilibrium_R_log_utility(α, β, w):


R = α * ( (β * w) / (1 + β))**(α - 1)
return R

In the case of log utility, since capital supply does not depend on the interest rate, the equilibrium quantity is fixed by
supply.
That is,
𝑘𝑡+1 = 𝑠(𝑤𝑡 , 𝑅𝑡+1 ) = (𝛽/(1 + 𝛽)) 𝑤𝑡    (23.15)
Let’s redo our plot above but now inserting the equilibrium quantity and price.

R_vals = np.linspace(0.3, 1)
α, β = 0.5, 0.9
w = 2.0

fig, ax = plt.subplots()

ax.plot(R_vals, capital_demand(R_vals, α),
        label="aggregate demand")
ax.plot(R_vals, np.ones_like(R_vals) * (β / (1 + β)) * w,
label="aggregate supply")

R_e = equilibrium_R_log_utility(α, β, w)
k_e = (β / (1 + β)) * w

ax.plot(R_e, k_e, 'go', ms=6, alpha=0.6)

ax.annotate(r'equilibrium',
xy=(R_e, k_e),
xycoords='data',
xytext=(0, 60),
textcoords='offset points',
fontsize=12,
arrowprops=dict(arrowstyle="->"))

ax.set_xlabel("$R_{t+1}$")
ax.set_ylabel("$k_{t+1}$")
ax.legend()
plt.show()


23.6 Dynamics

In this section we discuss dynamics.


For now we will focus on the case of log utility, so that the equilibrium is determined by (23.5.4).

23.6.1 Evolution of capital

The discussion above shows how equilibrium 𝑘𝑡+1 is obtained given 𝑤𝑡 .


From (23.4.3) we can translate this into 𝑘𝑡+1 as a function of 𝑘𝑡 .
In particular, since 𝑤𝑡 = (1 − 𝛼)𝑘𝑡^𝛼 , we have

𝑘𝑡+1 = (𝛽/(1 + 𝛽)) (1 − 𝛼)(𝑘𝑡 )^𝛼    (23.16)
If we iterate on this equation, we get a sequence for capital stock.
Let’s plot the 45 degree diagram of these dynamics, which we write as

𝑘𝑡+1 = 𝑔(𝑘𝑡 )    where    𝑔(𝑘) ∶= (𝛽/(1 + 𝛽)) (1 − 𝛼) 𝑘^𝛼

def k_update(k, α, β):


return β * (1 - α) * k**α / (1 + β)


α, β = 0.5, 0.9
kmin, kmax = 0, 0.1
x = 1000
k_grid = np.linspace(kmin, kmax, x)
k_grid_next = np.empty_like(k_grid)

for i in range(x):
k_grid_next[i] = k_update(k_grid[i], α, β)

fig, ax = plt.subplots(figsize=(6, 6))

ymin, ymax = np.min(k_grid_next), np.max(k_grid_next)

ax.plot(k_grid, k_grid_next, lw=2, alpha=0.6, label='$g$')


ax.plot(k_grid, k_grid, 'k-', lw=1, alpha=0.7, label='$45^{\circ}$')

ax.legend(loc='upper left', frameon=False, fontsize=12)


ax.set_xlabel('$k_t$', fontsize=12)
ax.set_ylabel('$k_{t+1}$', fontsize=12)

plt.show()


23.6.2 Steady state (log case)

The diagram shows that the model has a unique positive steady state, which we denote by 𝑘∗ .
We can solve for 𝑘∗ by setting 𝑘∗ = 𝑔(𝑘∗ ), or

𝑘∗ = 𝛽(1 − 𝛼)(𝑘∗ )^𝛼 / (1 + 𝛽)    (23.17)

Solving this equation yields


𝑘∗ = (𝛽(1 − 𝛼)/(1 + 𝛽))^{1/(1−𝛼)}    (23.18)

We can get the steady state interest rate from (23.4.4), which yields

𝑅∗ = 𝛼(𝑘∗ )^{𝛼−1} = (𝛼/(1 − 𝛼)) ((1 + 𝛽)/𝛽)
In Python we have

k_star = ((β * (1 - α))/(1 + β))**(1/(1-α))


R_star = (α/(1 - α)) * ((1 + β) / β)

23.6.3 Time series

The 45 degree diagram above shows that time series of capital with positive initial conditions converge to this steady state.
Let’s plot some time series that visualize this.

ts_length = 25
k_series = np.empty(ts_length)
k_series[0] = 0.02
for t in range(ts_length - 1):
k_series[t+1] = k_update(k_series[t], α, β)

fig, ax = plt.subplots()
ax.plot(k_series, label="capital series")
ax.plot(range(ts_length), np.full(ts_length, k_star), 'k--', label="$k^*$")
ax.set_ylim(0, 0.1)
ax.set_ylabel("capital")
ax.set_xlabel("$t$")
ax.legend()
plt.show()


If you experiment with different positive initial conditions, you will see that the series always converges to 𝑘∗ .
Below we also plot the gross interest rate over time.

R_series = α * k_series**(α - 1)

fig, ax = plt.subplots()
ax.plot(R_series, label="gross interest rate")
ax.plot(range(ts_length), np.full(ts_length, R_star), 'k--', label="$R^*$")
ax.set_ylim(0, 4)
ax.set_xlabel("$t$")
ax.legend()
plt.show()

366 Chapter 23. The Overlapping Generations Model


A First Course in Quantitative Economics with Python

The interest rate reflects the marginal product of capital, which is high when capital stock is low.

23.7 CRRA preferences

Previously, in our examples, we looked at the case of log utility.


Log utility is a rather special case.
In this section, we are going to assume that 𝑢(𝑐) = (𝑐^{1−𝛾} − 1)/(1 − 𝛾), where 𝛾 > 0, 𝛾 ≠ 1.
This function is called the CRRA utility function.
In other respects, the model is the same.
Below we define the utility function in Python and construct a namedtuple to store the parameters.

def crra(c, γ):


return c**(1 - γ) / (1 - γ)

Model = namedtuple('Model', ['α', # Cobb-Douglas parameter


'β', # discount factor
'γ'] # parameter in CRRA utility
)

def create_olg_model(α=0.4, β=0.9, γ=0.5):


return Model(α=α, β=β, γ=γ)

Let’s also redefine the capital demand function to work with this namedtuple.

def capital_demand(R, model):
    return (model.α/R)**(1/(1-model.α))


23.7.1 Supply

For households, the Euler equation becomes


(𝑤𝑡 − 𝑠𝑡 )^{−𝛾} = 𝛽 𝑅𝑡+1^{1−𝛾} (𝑠𝑡 )^{−𝛾}    (23.19)

Solving for savings, we have


𝑠𝑡 = 𝑠(𝑤𝑡 , 𝑅𝑡+1 ) = 𝑤𝑡 [1 + 𝛽^{−1/𝛾} 𝑅𝑡+1^{(𝛾−1)/𝛾} ]^{−1}    (23.20)

Notice how, unlike the log case, savings now depends on the interest rate.

def savings_crra(w, R, model):


α, β, γ = model
return w / (1 + β**(-1/γ) * R**((γ-1)/γ))

R_vals = np.linspace(0.3, 1)
model = create_olg_model()
w = 2.0

fig, ax = plt.subplots()

ax.plot(R_vals, capital_demand(R_vals, model),
        label="aggregate demand")
ax.plot(R_vals, savings_crra(w, R_vals, model),
label="aggregate supply")

ax.set_xlabel("$R_{t+1}$")
ax.set_ylabel("$k_{t+1}$")
ax.legend()
plt.show()


23.7.2 Equilibrium

Equating aggregate demand for capital (see (23.4.5)) with our new aggregate supply function yields equilibrium capital.
Thus, we set
𝑤𝑡 [1 + 𝛽^{−1/𝛾} 𝑅𝑡+1^{(𝛾−1)/𝛾} ]^{−1} = (𝑅𝑡+1 /𝛼)^{1/(𝛼−1)}    (23.21)
This expression is quite complex and we cannot solve for 𝑅𝑡+1 analytically.
Combining (23.4.4) and (23.7.3) yields
𝑘𝑡+1 = [1 + 𝛽^{−1/𝛾} (𝛼 𝑘𝑡+1^{𝛼−1} )^{(𝛾−1)/𝛾} ]^{−1} (1 − 𝛼)(𝑘𝑡 )^𝛼    (23.22)

Again, with this equation and 𝑘𝑡 as given, we cannot solve for 𝑘𝑡+1 by pencil and paper.
In the exercise below, you will be asked to solve these equations numerically.

23.8 Exercises

Exercise 23.8.1
Solve for the dynamics of equilibrium capital stock in the CRRA case numerically using (23.7.4).
Visualize the dynamics using a 45 degree diagram.

Solution to Exercise 23.8.1


To solve for 𝑘𝑡+1 given 𝑘𝑡 we use Newton’s method.
Let
𝑓(𝑘𝑡+1 , 𝑘𝑡 ) = 𝑘𝑡+1 [1 + 𝛽^{−1/𝛾} (𝛼 𝑘𝑡+1^{𝛼−1} )^{(𝛾−1)/𝛾} ] − (1 − 𝛼)𝑘𝑡^𝛼 = 0    (23.23)

If 𝑘𝑡 is given then 𝑓 is a function of unknown 𝑘𝑡+1 .


Then we can use scipy.optimize.newton to solve 𝑓(𝑘𝑡+1 , 𝑘𝑡 ) = 0 for 𝑘𝑡+1 .
First let’s define 𝑓.

def f(k_prime, k, model):


α, β, γ = model.α, model.β, model.γ
z = (1 - α) * k**α
a = α**(1-1/γ)
b = k_prime**((α * γ - α + 1) / γ)
p = k_prime + k_prime * β**(-1/γ) * a * b
return p - z

Now let’s define a function that finds the value of 𝑘𝑡+1 .

def k_update(k, model):


return optimize.newton(lambda k_prime: f(k_prime, k, model), 0.1)

Finally, here is the 45 degree diagram.


kmin, kmax = 0, 0.5


x = 1000
k_grid = np.linspace(kmin, kmax, x)
k_grid_next = np.empty_like(k_grid)

for i in range(x):
k_grid_next[i] = k_update(k_grid[i], model)

fig, ax = plt.subplots(figsize=(6, 6))

ymin, ymax = np.min(k_grid_next), np.max(k_grid_next)

ax.plot(k_grid, k_grid_next, lw=2, alpha=0.6, label='$g$')


ax.plot(k_grid, k_grid, 'k-', lw=1, alpha=0.7, label='$45^{\circ}$')

ax.legend(loc='upper left', frameon=False, fontsize=12)


ax.set_xlabel('$k_t$', fontsize=12)
ax.set_ylabel('$k_{t+1}$', fontsize=12)

plt.show()

Exercise 23.8.2


The 45 degree diagram from the last exercise shows that there is a unique positive steady state.
The positive steady state can be obtained by setting 𝑘𝑡+1 = 𝑘𝑡 = 𝑘∗ in (23.7.4), which yields

𝑘∗ = (1 − 𝛼)(𝑘∗ )^𝛼 / [1 + 𝛽^{−1/𝛾} (𝛼(𝑘∗ )^{𝛼−1} )^{(𝛾−1)/𝛾} ]
Unlike the log preference case, the CRRA utility steady state 𝑘∗ cannot be obtained analytically.
Instead, we solve for 𝑘∗ using Newton’s method.

Solution to Exercise 23.8.2


We introduce a function ℎ such that the positive steady state is the root of ℎ.

ℎ(𝑘∗ ) = 𝑘∗ [1 + 𝛽^{−1/𝛾} (𝛼(𝑘∗ )^{𝛼−1} )^{(𝛾−1)/𝛾} ] − (1 − 𝛼)(𝑘∗ )^𝛼    (23.24)

Here it is in Python

def h(k_star, model):


α, β, γ = model.α, model.β, model.γ
z = (1 - α) * k_star**α
R1 = α ** (1-1/γ)
R2 = k_star**((α * γ - α + 1) / γ)
p = k_star + k_star * β**(-1/γ) * R1 * R2
return p - z

Let’s apply Newton’s method to find the root:

k_star = optimize.newton(h, 0.2, args=(model,))


print(f"k_star = {k_star}")

k_star = 0.25788950250843484

Exercise 23.8.3
Generate three time paths for capital, from three distinct initial conditions, under the parameterization listed above.
Use initial conditions for 𝑘0 of 0.001, 1.2, 2.6 and time series length 10.

Solution to Exercise 23.8.3


Let’s define the constants and three distinct initial conditions

ts_length = 10
k0 = np.array([0.001, 1.2, 2.6])

def simulate_ts(model, k0_values, ts_length):

fig, ax = plt.subplots()

ts = np.zeros(ts_length)
# simulate and plot time series


for k_init in k0_values:
ts[0] = k_init
for t in range(1, ts_length):
ts[t] = k_update(ts[t-1], model)
ax.plot(np.arange(ts_length), ts, '-o', ms=4, alpha=0.6,
label=r'$k_0=%g$' %k_init)
ax.plot(np.arange(ts_length), np.full(ts_length, k_star),
alpha=0.6, color='red', label=r'$k^*$')
ax.legend(fontsize=10)

ax.set_xlabel(r'$t$', fontsize=14)
ax.set_ylabel(r'$k_t$', fontsize=14)

plt.show()

simulate_ts(model, k0, ts_length)



CHAPTER

TWENTYFOUR

COMMODITY PRICES

24.1 Outline

For more than half of all countries around the globe, commodities account for the majority of total exports.
Examples of commodities include copper, diamonds, iron ore, lithium, cotton and coffee beans.
In this lecture we give an introduction to the theory of commodity prices.
The lecture is quite advanced relative to other lectures in this series.
We need to compute an equilibrium, and that equilibrium is described by a price function.
We will solve an equation where the price function is the unknown.
This is harder than solving an equation for an unknown number, or vector.
The lecture will discuss one way to solve a “functional equation” for an unknown function.
For this lecture we need the yfinance library.

!pip install yfinance

We will use the following imports

import numpy as np
import yfinance as yf
import matplotlib.pyplot as plt
from scipy.interpolate import interp1d
from scipy.optimize import minimize_scalar, brentq
from scipy.stats import beta

24.2 Data

The figure below shows the price of cotton in USD since the start of 2016.
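(The code cell that generates this figure is not reproduced here; a minimal sketch along the following lines could produce a similar plot. The ticker 'CT=F' for cotton futures, the date range and the use of the 'Close' column are assumptions.)

s = yf.download('CT=F', '2016-1-1', '2023-4-1')['Close']

fig, ax = plt.subplots()
ax.plot(s, marker='o', alpha=0.5, ms=1)
ax.set_ylabel('cotton price in USD', fontsize=12)
ax.set_xlabel('date', fontsize=12)
plt.show()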


The figure shows surprisingly large movements in the price of cotton.


What causes these movements?
In general, prices depend on the choices and actions of
1. suppliers,
2. consumers, and
3. speculators.
Our focus will be on the interaction between these parties.
We will connect them together in a dynamic model of supply and demand, called the competitive storage model.
This model was developed by [Sam71], [WW82], [SS83], [DL92], [DL96], and [CB96].

24.3 The competitive storage model

In the competitive storage model, commodities are assets that


1. can be traded by speculators and
2. have intrinsic value to consumers.
Total demand is the sum of consumer demand and demand by speculators.
Supply is exogenous, depending on “harvests”.

Note: These days, goods such as basic computer chips and integrated circuits are often treated as commodities in financial
markets, being highly standardized, and, for these kinds of commodities, the word “harvest” is not appropriate.
Nonetheless, we maintain it for simplicity.


The equilibrium price is determined competitively.


It is a function of the current state (which determines current harvests and predicts future harvests).

24.4 The model

Consider a market for a single commodity, whose price is given at 𝑡 by 𝑝𝑡 .


The harvest of the commodity at time 𝑡 is 𝑍𝑡 .
We assume that the sequence {𝑍𝑡 }𝑡≥1 is IID with common density function 𝜙.
Speculators can store the commodity between periods, with 𝐼𝑡 units purchased in the current period yielding 𝛼𝐼𝑡 units in
the next.
Here 𝛼 ∈ (0, 1) is a depreciation rate for the commodity.
For simplicity, the risk free interest rate is taken to be zero, so expected profit on purchasing 𝐼𝑡 units is

𝔼𝑡 𝑝𝑡+1 ⋅ 𝛼𝐼𝑡 − 𝑝𝑡 𝐼𝑡 = (𝛼𝔼𝑡 𝑝𝑡+1 − 𝑝𝑡 )𝐼𝑡

Here 𝔼𝑡 𝑝𝑡+1 is the expectation of 𝑝𝑡+1 taken at time 𝑡.

24.5 Equilibrium

In this section we define the equilibrium and discuss how to compute it.

24.5.1 Equilibrium conditions

Speculators are assumed to be risk neutral, which means that they buy the commodity whenever expected profits are
positive.
As a consequence, if expected profits are positive, then the market is not in equilibrium.
Hence, to be in equilibrium, prices must satisfy the “no-arbitrage” condition

𝛼𝔼𝑡 𝑝𝑡+1 − 𝑝𝑡 ≤ 0 (24.1)

Profit maximization gives the additional condition

𝛼𝔼𝑡 𝑝𝑡+1 − 𝑝𝑡 < 0 implies 𝐼𝑡 = 0 (24.2)

We also require that the market clears in each period.


We assume that consumers generate demand quantity 𝐷(𝑝) corresponding to price 𝑝.
Let 𝑃 ∶= 𝐷−1 be the inverse demand function.
Regarding quantities,
• supply is the sum of carryover by speculators and the current harvest
• demand is the sum of purchases by consumers and purchases by speculators.
Mathematically,
• supply = 𝑋𝑡 = 𝛼𝐼𝑡−1 + 𝑍𝑡 , which takes values in 𝑆 ∶= ℝ+ , while


• demand = 𝐷(𝑝𝑡 ) + 𝐼𝑡
Thus, the market equilibrium condition is

𝛼𝐼𝑡−1 + 𝑍𝑡 = 𝐷(𝑝𝑡 ) + 𝐼𝑡 (24.3)

The initial condition 𝑋0 ∈ 𝑆 is treated as given.

24.5.2 An equilibrium function

How can we find an equilibrium?


Our path of attack will be to seek a system of prices that depend only on the current state.
In other words, we take a function 𝑝 on 𝑆 and set 𝑝𝑡 = 𝑝(𝑋𝑡 ) for every 𝑡.
Prices and quantities then follow

𝑝𝑡 = 𝑝(𝑋𝑡 ), 𝐼𝑡 = 𝑋𝑡 − 𝐷(𝑝𝑡 ), 𝑋𝑡+1 = 𝛼𝐼𝑡 + 𝑍𝑡+1 (24.4)

We choose 𝑝 so that these prices and quantities satisfy the equilibrium conditions above.
More precisely, we seek a 𝑝 such that (24.5.1) and (24.5.2) hold for the corresponding system (24.5.4).
To this end, suppose that there exists a function 𝑝∗ on 𝑆 satisfying

𝑝∗ (𝑥) = max {𝛼 ∫_0^∞ 𝑝∗ (𝛼𝐼(𝑥) + 𝑧)𝜙(𝑧)𝑑𝑧, 𝑃 (𝑥)}    (𝑥 ∈ 𝑆)    (24.5)

where

𝐼(𝑥) ∶= 𝑥 − 𝐷(𝑝∗ (𝑥)) (𝑥 ∈ 𝑆) (24.6)

It turns out that such a 𝑝∗ will suffice, in the sense that (24.5.1) and (24.5.2) hold for the corresponding system (24.5.4).
To see this, observe first that

𝔼𝑡 𝑝𝑡+1 = 𝔼𝑡 𝑝∗ (𝑋𝑡+1 ) = 𝔼𝑡 𝑝∗ (𝛼𝐼(𝑋𝑡 ) + 𝑍𝑡+1 ) = ∫_0^∞ 𝑝∗ (𝛼𝐼(𝑋𝑡 ) + 𝑧)𝜙(𝑧)𝑑𝑧

Thus (24.5.1) requires that



𝛼 ∫_0^∞ 𝑝∗ (𝛼𝐼(𝑋𝑡 ) + 𝑧)𝜙(𝑧)𝑑𝑧 ≤ 𝑝∗ (𝑋𝑡 )

This inequality is immediate from (24.5.5).


Second, regarding (24.5.2), suppose that

𝛼 ∫_0^∞ 𝑝∗ (𝛼𝐼(𝑋𝑡 ) + 𝑧)𝜙(𝑧)𝑑𝑧 < 𝑝∗ (𝑋𝑡 )

Then by (24.5.5) we have 𝑝∗ (𝑋𝑡 ) = 𝑃 (𝑋𝑡 )


But then 𝐷(𝑝∗ (𝑋𝑡 )) = 𝑋𝑡 and 𝐼𝑡 = 𝐼(𝑋𝑡 ) = 0.
As a consequence, both (24.5.1) and (24.5.2) hold.
We have found an equilibrium.


24.5.3 Computing the equilibrium

We now know that an equilibrium can be obtained by finding a function 𝑝∗ that satisfies (24.5.5).
It can be shown that, under mild conditions, there is exactly one function on 𝑆 satisfying (24.5.5).
Moreover, we can compute this function using successive approximation.
This means that we start with a guess of the function and then update it using (24.5.5).
This generates a sequence of functions 𝑝1 , 𝑝2 , …
We continue until this process converges, in the sense that 𝑝𝑘 and 𝑝𝑘+1 are very close together.
Then we take the final 𝑝𝑘 that we computed as our approximation of 𝑝∗ .
To implement our update step, it is helpful if we put (24.5.5) and (24.5.6) together.
This leads us to the update rule

𝑝𝑘+1 (𝑥) = max {𝛼 ∫_0^∞ 𝑝𝑘 (𝛼(𝑥 − 𝐷(𝑝𝑘+1 (𝑥))) + 𝑧)𝜙(𝑧)𝑑𝑧, 𝑃 (𝑥)}    (24.7)

In other words, we take 𝑝𝑘 as given and, at each 𝑥, solve for 𝑞 in



𝑞 = max {𝛼 ∫_0^∞ 𝑝𝑘 (𝛼(𝑥 − 𝐷(𝑞)) + 𝑧)𝜙(𝑧)𝑑𝑧, 𝑃 (𝑥)}    (24.8)

Actually we can’t do this at every 𝑥, so instead we do it on a grid of points 𝑥1 , … , 𝑥𝑛 .


Then we get the corresponding values 𝑞1 , … , 𝑞𝑛 .
Then we compute 𝑝𝑘+1 as the linear interpolation of the values 𝑞1 , … , 𝑞𝑛 over the grid 𝑥1 , … , 𝑥𝑛 .
Then we repeat, seeking convergence.

24.6 Code

The code below implements this iterative process, starting from 𝑝0 = 𝑃 .


The distribution 𝜙 is set to a shifted Beta distribution (although many other choices are possible).
The integral in (24.5.8) is computed via Monte Carlo.

α, a, c = 0.8, 1.0, 2.0


beta_a, beta_b = 5, 5
mc_draw_size = 250
gridsize = 150
grid_max = 35
grid = np.linspace(a, grid_max, gridsize)

beta_dist = beta(5, 5)
Z = a + beta_dist.rvs(mc_draw_size) * c # Shock observations
D = P = lambda x: 1.0 / x
tol = 1e-4

def T(p_array):

new_p = np.empty_like(p_array)
# Interpolate to obtain p as a function.


p = interp1d(grid,
p_array,
fill_value=(p_array[0], p_array[-1]),
bounds_error=False)

# Update
for i, x in enumerate(grid):

h = lambda q: q - max(α * np.mean(p(α * (x - D(q)) + Z)), P(x))


new_p[i] = brentq(h, 1e-8, 100)

return new_p

fig, ax = plt.subplots()

price = P(grid)
ax.plot(grid, price, alpha=0.5, lw=1, label="inverse demand curve")
error = tol + 1
while error > tol:
new_price = T(price)
error = max(np.abs(new_price - price))
price = new_price

ax.plot(grid, price, 'k-', alpha=0.5, lw=2, label=r'$p^*$')


ax.legend()
ax.set_xlabel('$x$', fontsize=12)

plt.show()


The figure above shows the inverse demand curve 𝑃 , which is also 𝑝0 , as well as our approximation of 𝑝∗ .
Once we have an approximation of 𝑝∗ , we can simulate a time series of prices.

# Turn the price array into a price function


p_star = interp1d(grid,
price,
fill_value=(price[0], price[-1]),
bounds_error=False)

def carry_over(x):
return α * (x - D(p_star(x)))

def generate_cp_ts(init=1, n=50):


X = np.empty(n)
X[0] = init
for t in range(n-1):
Z = a + c * beta_dist.rvs()
X[t+1] = carry_over(X[t]) + Z
return p_star(X)

fig, ax = plt.subplots()
ax.plot(generate_cp_ts(), label="price")
ax.set_xlabel("time")
ax.legend()
plt.show()



Part VII

Stochastic Dynamics

CHAPTER

TWENTYFIVE

MARKOV CHAINS: BASIC CONCEPTS

Contents

• Markov Chains: Basic Concepts


– Overview
– Definitions and examples
– Simulation
– Distributions over time
– Stationary distributions
– Computing expectations

In addition to what’s in Anaconda, this lecture will need the following libraries:

!pip install quantecon

25.1 Overview

Markov chains are a standard way to model time series with some dependence between observations.
For example,
• inflation next year depends on inflation this year
• unemployment next month depends on unemployment this month
Markov chains are one of the workhorse models of economics and finance.
The theory of Markov chains is beautiful and provides many insights into probability and dynamics.
In this introductory lecture, we will
• review some of the key ideas from the theory of Markov chains and
• show how Markov chains appear in some economic applications.
Let’s start with some standard imports:


import matplotlib.pyplot as plt


import quantecon as qe
import numpy as np
import networkx as nx
from matplotlib import cm
import matplotlib as mpl
from itertools import cycle

25.2 Definitions and examples

In this section we provide the basic definitions and some elementary examples.

25.2.1 Stochastic matrices

Recall that a probability mass function over 𝑛 possible outcomes is a nonnegative 𝑛-vector 𝑝 that sums to one.
For example, 𝑝 = (0.2, 0.2, 0.6) is a probability mass function over 3 outcomes.
A stochastic matrix (or Markov matrix) is an 𝑛 × 𝑛 square matrix 𝑃 such that each row of 𝑃 is a probability mass
function over 𝑛 outcomes.
In other words,
1. each element of 𝑃 is nonnegative, and
2. each row of 𝑃 sums to one
If 𝑃 is a stochastic matrix, then so is the 𝑘-th power 𝑃 𝑘 for all 𝑘 ∈ ℕ.
You will be asked to check this in Exercise 25.6.3 below.
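As a quick numerical illustration (not a proof), here is a check with an arbitrary 2 × 2 stochastic matrix chosen for this example:

P = np.array([[0.1, 0.9],
              [0.4, 0.6]])

for k in range(1, 4):
    print(np.linalg.matrix_power(P, k).sum(axis=1))   # each row sums to one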

25.2.2 Markov chains

Now we can introduce Markov chains.


First we will give some examples and then we will define them more carefully.
At that time, the connection between stochastic matrices and Markov chains will become clear.

Example 1

From US unemployment data, Hamilton [Ham05] estimated the following dynamics.

Here there are three states


• “ng” represents normal growth


• “mr” represents mild recession
• “sr” represents severe recession
The arrows represent transition probabilities over one month.
For example, the arrow from mild recession to normal growth has 0.145 next to it.
This tells us that, according to past data, there is a 14.5% probability of transitioning from mild recession to normal
growth in one month.
The arrow from normal growth back to normal growth tells us that there is a 97% probability of transitioning from normal
growth to normal growth (staying in the same state).
Note that these are conditional probabilities — the probability of transitioning from one state to another (or staying at the
same one) conditional on the current state.
To make the problem easier to work with numerically, let’s convert states to numbers.
In particular, we agree that
• state 0 represents normal growth
• state 1 represents mild recession
• state 2 represents severe recession
Let 𝑋𝑡 record the value of the state at time 𝑡.
Now we can write the statement “there is a 14.5% probability of transitioning from mild recession to normal growth in
one month” as

ℙ{𝑋𝑡+1 = 0 | 𝑋𝑡 = 1} = 0.145

We can collect all of these conditional probabilities into a matrix, as follows

𝑃 = ⎡0.971  0.029  0    ⎤
    ⎢0.145  0.778  0.077⎥
    ⎣0      0.508  0.492⎦

Notice that 𝑃 is a stochastic matrix.


Now we have the following relationship

𝑃 (𝑖, 𝑗) = ℙ{𝑋𝑡+1 = 𝑗 | 𝑋𝑡 = 𝑖}

This holds for any 𝑖, 𝑗 between 0 and 2.


In particular, 𝑃 (𝑖, 𝑗) is the probability of transitioning from state 𝑖 to state 𝑗 in one month.
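For reference, here is the same matrix as a NumPy array; indexing it reproduces the 14.5% probability quoted above (this small cell is an addition for illustration):

P = np.array([[0.971, 0.029, 0.000],
              [0.145, 0.778, 0.077],
              [0.000, 0.508, 0.492]])

P[1, 0]   # probability of moving from mild recession to normal growth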

Example 2

Consider a worker who, at any given time 𝑡, is either unemployed (state 0) or employed (state 1).
Suppose that, over a one-month period,
1. the unemployed worker finds a job with probability 𝛼 ∈ (0, 1).
2. the employed worker loses her job and becomes unemployed with probability 𝛽 ∈ (0, 1).


Given the above information, we can write out the transition probabilities in matrix form as

𝑃 = ⎡1 − 𝛼     𝛼   ⎤    (25.1)
    ⎣  𝛽     1 − 𝛽 ⎦

For example,

𝑃 (0, 1) = probability of transitioning from state 0 to state 1 in one month
         = probability of finding a job next month
         = 𝛼

Suppose we can estimate the values 𝛼 and 𝛽.


Then we can address a range of questions, such as
• What is the average duration of unemployment?
• Over the long-run, what fraction of the time does a worker find herself unemployed?
• Conditional on employment, what is the probability of becoming unemployed at least once over the next 12 months?
We’ll cover some of these applications below.
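As a preview, here is a rough simulation sketch of the first question; the values α = 0.3 and β = 0.1 are illustrative assumptions, not estimates.

α, β = 0.3, 0.1
P = [[1 - α, α],
     [β, 1 - β]]
X = qe.MarkovChain(P).simulate(ts_length=1_000_000, init=0)

spell_time = np.sum(X == 0)                        # total months unemployed
spell_ends = np.sum((X[:-1] == 0) & (X[1:] == 1))  # completed unemployment spells
print(spell_time / spell_ends, 1 / α)              # average duration ≈ 1/α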

Example 3

Imam and Temple [IT23] categorize political institutions into three types: democracy (D), autocracy (A), and an inter-
mediate state called anocracy (N).
Each institution can have two potential development regimes: collapse (C) and growth (G). This results in six possible
states: DG, DC, NG, NC, AG and AC.
Imam and Temple [IT23] estimate the following transition probabilities:

𝑃 ∶= ⎡0.86  0.11  0.03  0.00  0.00  0.00⎤
     ⎢0.52  0.33  0.13  0.02  0.00  0.00⎥
     ⎢0.12  0.03  0.70  0.11  0.03  0.01⎥
     ⎢0.13  0.02  0.35  0.36  0.10  0.04⎥
     ⎢0.00  0.00  0.09  0.11  0.55  0.25⎥
     ⎣0.00  0.00  0.09  0.15  0.26  0.50⎦

nodes = ['DG', 'DC', 'NG', 'NC', 'AG', 'AC']


P = [[0.86, 0.11, 0.03, 0.00, 0.00, 0.00],
[0.52, 0.33, 0.13, 0.02, 0.00, 0.00],
[0.12, 0.03, 0.70, 0.11, 0.03, 0.01],
[0.13, 0.02, 0.35, 0.36, 0.10, 0.04],
[0.00, 0.00, 0.09, 0.11, 0.55, 0.25],
[0.00, 0.00, 0.09, 0.15, 0.26, 0.50]]

Here is a visualization, with darker colors indicating higher probability.


Looking at the data, we see that democracies tend to have longer-lasting growth regimes compared to autocracies (as
indicated by the lower probability of transitioning from growth to growth in autocracies).
We can also see a higher probability of moving from collapse to growth in democratic regimes.

25.2.3 Defining Markov chains

So far we’ve given examples of Markov chains but now let’s define them more carefully.
To begin, let 𝑆 be a finite set {𝑥1 , … , 𝑥𝑛 } with 𝑛 elements.
The set 𝑆 is called the state space and 𝑥1 , … , 𝑥𝑛 are the state values.
A distribution 𝜓 on 𝑆 is a probability mass function of length 𝑛, where 𝜓(𝑖) is the amount of probability allocated to
state 𝑥𝑖 .
A Markov chain {𝑋𝑡 } on 𝑆 is a sequence of random variables taking values in 𝑆 that have the Markov property.
This means that, for any date 𝑡 and any state 𝑦 ∈ 𝑆,

ℙ{𝑋𝑡+1 = 𝑦 | 𝑋𝑡 } = ℙ{𝑋𝑡+1 = 𝑦 | 𝑋𝑡 , 𝑋𝑡−1 , …} (25.2)

In other words, knowing the current state is enough to know probabilities for the future states.
In particular, the dynamics of a Markov chain are fully determined by the set of values

𝑃 (𝑥, 𝑦) ∶= ℙ{𝑋𝑡+1 = 𝑦 | 𝑋𝑡 = 𝑥} (𝑥, 𝑦 ∈ 𝑆) (25.3)

By construction,
• 𝑃 (𝑥, 𝑦) is the probability of going from 𝑥 to 𝑦 in one unit of time (one step)
• 𝑃 (𝑥, ⋅) is the conditional distribution of 𝑋𝑡+1 given 𝑋𝑡 = 𝑥


We can view 𝑃 as a stochastic matrix where

𝑃𝑖𝑗 = 𝑃 (𝑥𝑖 , 𝑥𝑗 ) 1 ≤ 𝑖, 𝑗 ≤ 𝑛

Going the other way, if we take a stochastic matrix 𝑃 , we can generate a Markov chain {𝑋𝑡 } as follows:
• draw 𝑋0 from a distribution 𝜓0 on 𝑆
• for each 𝑡 = 0, 1, …, draw 𝑋𝑡+1 from 𝑃 (𝑋𝑡 , ⋅)
By construction, the resulting process satisfies (25.3).

25.3 Simulation

One natural way to answer questions about Markov chains is to simulate them.
Let’s start by doing this ourselves and then look at libraries that can help us.
In these exercises, we’ll take the state space to be 𝑆 = 0, … , 𝑛 − 1.
(We start at 0 because Python arrays are indexed from 0.)

25.3.1 Writing our own simulation code

To simulate a Markov chain, we need


1. a stochastic matrix 𝑃 and
2. a probability mass function 𝜓0 of length 𝑛 from which to draw an initial realization of 𝑋0 .
The Markov chain is then constructed as follows:
1. At time 𝑡 = 0, draw a realization of 𝑋0 from the distribution 𝜓0 .
2. At each subsequent time 𝑡, draw a realization of the new state 𝑋𝑡+1 from 𝑃 (𝑋𝑡 , ⋅).
(That is, draw from row 𝑋𝑡 of 𝑃 .)
To implement this simulation procedure, we need a method for generating draws from a discrete distribution.
For this task, we’ll use random.draw from QuantEcon.py.
To use random.draw, we first need to convert the probability mass function to a cumulative distribution

ψ_0 = (0.3, 0.7) # probabilities over {0, 1}


cdf = np.cumsum(ψ_0) # convert into cumulative distribution
qe.random.draw(cdf, 5) # generate 5 independent draws from ψ

array([1, 0, 1, 0, 1])

We’ll write our code as a function that accepts the following three arguments
• A stochastic matrix P.
• An initial distribution ψ_0.
• A positive integer ts_length representing the length of the time series the function should return.


def mc_sample_path(P, ψ_0=None, ts_length=1_000):

# set up
P = np.asarray(P)
X = np.empty(ts_length, dtype=int)

# Convert each row of P into a cdf


n = len(P)
P_dist = np.cumsum(P, axis=1) # Convert rows into cdfs

# draw initial state, defaulting to 0


if ψ_0 is not None:
X_0 = qe.random.draw(np.cumsum(ψ_0))
else:
X_0 = 0

# simulate
X[0] = X_0
for t in range(ts_length - 1):
X[t+1] = qe.random.draw(P_dist[X[t], :])

return X

Let’s see how it works using the small matrix

P = [[0.4, 0.6],
[0.2, 0.8]]

Here’s a short time series.

mc_sample_path(P, ψ_0=[1.0, 0.0], ts_length=10)

array([0, 1, 1, 1, 1, 1, 1, 1, 0, 1])

It can be shown that for a long series drawn from P, the fraction of the sample that takes value 0 will be about 0.25.
(We will explain why later.)
Moreover, this is true regardless of the initial distribution from which 𝑋0 is drawn.
The following code illustrates this

X = mc_sample_path(P, ψ_0=[0.1, 0.9], ts_length=1_000_000)


np.mean(X == 0)

0.250019

You can try changing the initial distribution to confirm that the output is always close to 0.25 (for the P matrix above).


25.3.2 Using QuantEcon’s routines

QuantEcon.py has routines for handling Markov chains, including simulation.


Here’s an illustration using the same 𝑃 as the preceding example

mc = qe.MarkovChain(P)
X = mc.simulate(ts_length=1_000_000)
np.mean(X == 0)

0.249481

The simulate routine is faster (because it is JIT compiled).

%time mc_sample_path(P, ts_length=1_000_000) # Our homemade code version

CPU times: user 1.58 s, sys: 3.62 ms, total: 1.59 s


Wall time: 1.59 s

array([0, 0, 1, ..., 1, 1, 1])

%time mc.simulate(ts_length=1_000_000) # qe code version

CPU times: user 18.3 ms, sys: 4.12 ms, total: 22.4 ms
Wall time: 22 ms

array([0, 1, 0, ..., 0, 0, 1])

Adding state values and initial conditions

If we wish to, we can provide a specification of state values to MarkovChain.


These state values can be integers, floats, or even strings.
The following code illustrates

mc = qe.MarkovChain(P, state_values=('unemployed', 'employed'))


mc.simulate(ts_length=4, init='employed')

array(['employed', 'employed', 'unemployed', 'employed'], dtype='<U10')

mc.simulate(ts_length=4, init='unemployed')

array(['unemployed', 'employed', 'employed', 'unemployed'], dtype='<U10')

mc.simulate(ts_length=4) # Start at randomly chosen initial state


array(['employed', 'employed', 'employed', 'employed'], dtype='<U10')

If we want to see indices rather than state values as outputs, we can use

mc.simulate_indices(ts_length=4)

array([1, 0, 1, 1])

25.4 Distributions over time

We learned that
1. {𝑋𝑡 } is a Markov chain with stochastic matrix 𝑃
2. the distribution of 𝑋𝑡 is known to be 𝜓𝑡
What then is the distribution of 𝑋𝑡+1 , or, more generally, of 𝑋𝑡+𝑚 ?
To answer this, we let 𝜓𝑡 be the distribution of 𝑋𝑡 for 𝑡 = 0, 1, 2, ….
Our first aim is to find 𝜓𝑡+1 given 𝜓𝑡 and 𝑃 .
To begin, pick any 𝑦 ∈ 𝑆.
To get the probability of being at 𝑦 tomorrow (at 𝑡+1), we account for all ways this can happen and sum their probabilities.
This leads to
ℙ{𝑋𝑡+1 = 𝑦} = ∑_{𝑥∈𝑆} ℙ{𝑋𝑡+1 = 𝑦 | 𝑋𝑡 = 𝑥} ⋅ ℙ{𝑋𝑡 = 𝑥}

(We are using the law of total probability.)


Rewriting this statement in terms of marginal and conditional probabilities gives
𝜓𝑡+1 (𝑦) = ∑_{𝑥∈𝑆} 𝑃 (𝑥, 𝑦)𝜓𝑡 (𝑥)

There are 𝑛 such equations, one for each 𝑦 ∈ 𝑆.


If we think of 𝜓𝑡+1 and 𝜓𝑡 as row vectors, these 𝑛 equations are summarized by the matrix expression

𝜓𝑡+1 = 𝜓𝑡 𝑃 (25.4)

Thus, we postmultiply by 𝑃 to move a distribution forward one unit of time.


By postmultiplying 𝑚 times, we move a distribution forward 𝑚 steps into the future.
Hence, iterating on (25.4), the expression 𝜓𝑡+𝑚 = 𝜓𝑡 𝑃 𝑚 is also valid — here 𝑃 𝑚 is the 𝑚-th power of 𝑃 .
As a special case, we see that if 𝜓0 is the initial distribution from which 𝑋0 is drawn, then 𝜓0 𝑃 𝑚 is the distribution of
𝑋𝑚 .
This is very important, so let’s repeat it

𝑋0 ∼ 𝜓0 ⟹ 𝑋𝑚 ∼ 𝜓 0 𝑃 𝑚 (25.5)

The general rule is that post-multiplying a distribution by 𝑃 𝑚 shifts it forward 𝑚 units of time.
Hence the following is also valid.

𝑋𝑡 ∼ 𝜓𝑡 ⟹ 𝑋𝑡+𝑚 ∼ 𝜓𝑡 𝑃 𝑚 (25.6)
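Here is a small numerical illustration of these rules, using Hamilton's matrix and an arbitrary initial distribution (both chosen here just for the example):

P = np.array([[0.971, 0.029, 0.000],
              [0.145, 0.778, 0.077],
              [0.000, 0.508, 0.492]])
ψ_0 = np.array([0.0, 0.2, 0.8])

print(ψ_0 @ P)                              # distribution of X_1
print(ψ_0 @ np.linalg.matrix_power(P, 3))   # distribution of X_3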


25.4.1 Multiple step transition probabilities

We know that the probability of transitioning from 𝑥 to 𝑦 in one step is 𝑃 (𝑥, 𝑦).
It turns out that the probability of transitioning from 𝑥 to 𝑦 in 𝑚 steps is 𝑃 𝑚 (𝑥, 𝑦), the (𝑥, 𝑦)-th element of the 𝑚-th
power of 𝑃 .
To see why, consider again (25.6), but now with a 𝜓𝑡 that puts all probability on state 𝑥.
Then 𝜓𝑡 is a vector with 1 in position 𝑥 and zero elsewhere.
Inserting this into (25.6), we see that, conditional on 𝑋𝑡 = 𝑥, the distribution of 𝑋𝑡+𝑚 is the 𝑥-th row of 𝑃 𝑚 .
In particular

ℙ{𝑋𝑡+𝑚 = 𝑦 | 𝑋𝑡 = 𝑥} = 𝑃 𝑚 (𝑥, 𝑦) = (𝑥, 𝑦)-th element of 𝑃 𝑚

25.4.2 Example: probability of recession

Recall the stochastic matrix 𝑃 for recession and growth considered above.
Suppose that the current state is unknown — perhaps statistics are available only at the end of the current month.
We guess that the probability that the economy is in state 𝑥 is 𝜓𝑡 (𝑥) at time t.
The probability of being in recession (either mild or severe) in 6 months time is given by

(𝜓𝑡 𝑃 6 )(1) + (𝜓𝑡 𝑃 6 )(2)
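Here is this calculation in code for an illustrative guess ψ_t (the numbers in ψ_t are assumptions chosen only for this example):

P = np.array([[0.971, 0.029, 0.000],
              [0.145, 0.778, 0.077],
              [0.000, 0.508, 0.492]])
ψ_t = np.array([0.6, 0.3, 0.1])

ψ_6 = ψ_t @ np.linalg.matrix_power(P, 6)
ψ_6[1] + ψ_6[2]   # probability of mild or severe recession in 6 months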

25.4.3 Example 2: Cross-sectional distributions

The distributions we have been studying can be viewed either


1. as probabilities or
2. as cross-sectional frequencies that the Law of Large Numbers leads us to anticipate for large samples.
To illustrate, recall our model of employment/unemployment dynamics for a given worker discussed above.
Consider a large population of workers, each of whose lifetime experience is described by the specified dynamics, with
each worker’s outcomes being realizations of processes that are statistically independent of all other workers’ processes.
Let 𝜓𝑡 be the current cross-sectional distribution over {0, 1}.
The cross-sectional distribution records fractions of workers employed and unemployed at a given moment t.
• For example, 𝜓𝑡 (0) is the unemployment rate.
What will the cross-sectional distribution be in 10 periods hence?
The answer is 𝜓𝑡 𝑃 10 , where 𝑃 is the stochastic matrix in (25.1).
This is because each worker’s state evolves according to 𝑃 , so 𝜓𝑡 𝑃 10 is a marginal distribution for a single randomly
selected worker.
But when the sample is large, outcomes and probabilities are roughly equal (by an application of the Law of Large
Numbers).
So for a very large (tending to infinite) population, 𝜓𝑡 𝑃 10 also represents fractions of workers in each state.
This is exactly the cross-sectional distribution.


25.5 Stationary distributions

As seen in (25.4), we can shift a distribution forward one unit of time via postmultiplication by 𝑃 .
Some distributions are invariant under this updating process — for example,

P = np.array([[0.4, 0.6],
[0.2, 0.8]])
ψ = (0.25, 0.75)
ψ @ P

array([0.25, 0.75])

Notice that ψ @ P is the same as ψ.


Such distributions are called stationary or invariant.
Formally, a distribution 𝜓∗ on 𝑆 is called stationary for 𝑃 if 𝜓∗ 𝑃 = 𝜓∗ .
Notice that, post-multiplying by 𝑃 , we have 𝜓∗ 𝑃 2 = 𝜓∗ 𝑃 = 𝜓∗ .
Continuing in the same way leads to 𝜓∗ = 𝜓∗ 𝑃 𝑡 for all 𝑡.
This tells us an important fact: if the distribution of 𝑋0 is a stationary distribution, then 𝑋𝑡 will have this same distribution for all 𝑡.
The following theorem is proved in Chapter 4 of [SS23] and numerous other sources.

Theorem 25.5.1
Every stochastic matrix 𝑃 has at least one stationary distribution.

Note that there can be many stationary distributions corresponding to a given stochastic matrix 𝑃 .
• For example, if 𝑃 is the identity matrix, then all distributions on 𝑆 are stationary.
To get uniqueness, we need the Markov chain to “mix around,” so that the state doesn’t get stuck in some part of the state
space.
This gives some intuition for the following theorem.

Theorem 25.5.2
If 𝑃 is everywhere positive, then 𝑃 has exactly one stationary distribution.

We will come back to this when we introduce irreducibility in the next lecture on Markov chains.


25.5.1 Example

Recall our model of the employment/unemployment dynamics of a particular worker discussed above.
If 𝛼 ∈ (0, 1) and 𝛽 ∈ (0, 1), then the transition matrix is everywhere positive.
Let 𝜓∗ = (𝑝, 1 − 𝑝) be the stationary distribution, so that 𝑝 corresponds to unemployment (state 0).
Using 𝜓∗ = 𝜓∗ 𝑃 and a bit of algebra yields

𝑝 = 𝛽/(𝛼 + 𝛽)
This is, in some sense, a steady state probability of unemployment.
Not surprisingly it tends to zero as 𝛽 → 0, and to one as 𝛼 → 0.
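A quick numerical check, with illustrative values α = 0.3 and β = 0.1 (these numbers are assumptions), confirms that (𝑝, 1 − 𝑝) is stationary:

α, β = 0.3, 0.1
P = np.array([[1 - α, α],
              [β, 1 - β]])
ψ_star = np.array([β / (α + β), α / (α + β)])

print(ψ_star)
print(ψ_star @ P)   # equals ψ_star, so ψ_star is stationary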

25.5.2 Calculating stationary distributions

A stable algorithm for computing stationary distributions is implemented in QuantEcon.py.


Here’s an example

P = [[0.4, 0.6],
[0.2, 0.8]]

mc = qe.MarkovChain(P)
mc.stationary_distributions # Show all stationary distributions

array([[0.25, 0.75]])

25.5.3 Asymptotic stationarity

Consider an everywhere positive stochastic matrix with unique stationary distribution 𝜓∗ .


Sometimes the distribution 𝜓𝑡 = 𝜓0 𝑃 𝑡 of 𝑋𝑡 converges to 𝜓∗ regardless of 𝜓0 .
For example, we have the following result

Theorem 25.5.3
If there exists an integer 𝑚 such that all entries of 𝑃^𝑚 are strictly positive, then 𝑃 has a unique stationary distribution 𝜓∗ and

𝜓0 𝑃^𝑡 → 𝜓∗    as 𝑡 → ∞    for any initial distribution 𝜓0

See, for example, [SS23] Chapter 4.


Example: Hamilton’s chain

Hamilton’s chain satisfies the conditions of the theorem because 𝑃 2 is everywhere positive:

P = np.array([[0.971, 0.029, 0.000],


[0.145, 0.778, 0.077],
[0.000, 0.508, 0.492]])
P @ P

array([[0.947046, 0.050721, 0.002233],


[0.253605, 0.648605, 0.09779 ],
[0.07366 , 0.64516 , 0.28118 ]])

Let’s pick an initial distribution 𝜓0 and trace out the sequence of distributions 𝜓0 𝑃 𝑡 for 𝑡 = 0, 1, 2, …
First, we write a function to iterate the sequence of distributions for ts_length periods

def iterate_ψ(ψ_0, P, ts_length):


n = len(P)
ψ_t = np.empty((ts_length, n))
ψ = ψ_0
for t in range(ts_length):
ψ_t[t] = ψ
ψ = ψ @ P
return np.array(ψ_t)

Now we plot the sequence

ψ_0 = (0.0, 0.2, 0.8) # Initial condition

fig = plt.figure()
ax = fig.add_subplot(111, projection='3d')

ax.set(xlim=(0, 1), ylim=(0, 1), zlim=(0, 1),


xticks=(0.25, 0.5, 0.75),
yticks=(0.25, 0.5, 0.75),
zticks=(0.25, 0.5, 0.75))

ψ_t = iterate_ψ(ψ_0, P, 20)

ax.scatter(ψ_t[:,0], ψ_t[:,1], ψ_t[:,2], c='r', s=60)


ax.view_init(30, 210)

mc = qe.MarkovChain(P)
ψ_star = mc.stationary_distributions[0]
ax.scatter(ψ_star[0], ψ_star[1], ψ_star[2], c='k', s=60)

plt.show()


Here
• 𝑃 is the stochastic matrix for recession and growth considered above.
• The highest red dot is an arbitrarily chosen initial marginal probability distribution 𝜓0 , represented as a vector in
ℝ3 .
• The other red dots are the marginal distributions 𝜓0 𝑃 𝑡 for 𝑡 = 1, 2, ….
• The black dot is 𝜓∗ .
You might like to try experimenting with different initial conditions.

An alternative illustration

We can show this in a slightly different way by focusing on the probability that 𝜓𝑡 puts on each state.
First, we write a function to draw initial distributions 𝜓0 of size num_distributions

def generate_initial_values(num_distributions):
n = len(P)
ψ_0s = np.empty((num_distributions, n))

for i in range(num_distributions):
draws = np.random.randint(1, 10_000_000, size=n)

# Scale them so that they add up into 1


ψ_0s[i,:] = np.array(draws/sum(draws))

return ψ_0s

We then write a function to plot the dynamics of (𝜓0 𝑃 𝑡 )(𝑖) as 𝑡 gets large, for each state 𝑖 with different initial distributions

def plot_distribution(P, ts_length, num_distributions):

# Get parameters of transition matrix


n = len(P)
mc = qe.MarkovChain(P)
ψ_star = mc.stationary_distributions[0]

## Draw the plot


fig, axes = plt.subplots(nrows=1, ncols=n, figsize=[11, 5])
plt.subplots_adjust(wspace=0.35)

ψ_0s = generate_initial_values(num_distributions)

# Get the path for each starting value


for ψ_0 in ψ_0s:
ψ_t = iterate_ψ(ψ_0, P, ts_length)

# Obtain and plot distributions at each state


for i in range(n):
axes[i].plot(range(0, ts_length), ψ_t[:,i], alpha=0.3)

# Add labels
for i in range(n):
axes[i].axhline(ψ_star[i], linestyle='dashed', lw=2, color = 'black',
label = fr'$\psi^*({i})$')
axes[i].set_xlabel('t')
axes[i].set_ylabel(fr'$\psi_t({i})$')
axes[i].legend()

plt.show()

The following figure shows these dynamics for Hamilton's matrix.

# Define the number of iterations


# and initial distributions
ts_length = 50
num_distributions = 25

P = np.array([[0.971, 0.029, 0.000],


[0.145, 0.778, 0.077],
[0.000, 0.508, 0.492]])

plot_distribution(P, ts_length, num_distributions)


The convergence to 𝜓∗ holds for different initial distributions.

Example: Failure of convergence

In the case of a periodic chain, with

𝑃 = ⎡0  1⎤
    ⎣1  0⎦

we find the distribution oscillates

P = np.array([[0, 1],
[1, 0]])

ts_length = 20
num_distributions = 30

plot_distribution(P, ts_length, num_distributions)

Indeed, this 𝑃 fails our asymptotic stationarity condition, since, as you can verify, 𝑃 𝑡 is not everywhere positive for any
𝑡.


25.6 Computing expectations

We sometimes want to compute mathematical expectations of functions of 𝑋𝑡 of the form

𝔼[ℎ(𝑋𝑡 )] (25.7)

and conditional expectations such as

𝔼[ℎ(𝑋𝑡+𝑘 ) ∣ 𝑋𝑡 = 𝑥] (25.8)

where
• {𝑋𝑡 } is a Markov chain generated by 𝑛 × 𝑛 stochastic matrix 𝑃 .
• ℎ is a given function, which, in terms of matrix algebra, we’ll think of as the column vector
ℎ = ⎡ ℎ(𝑥1 ) ⎤
    ⎢   ⋮    ⎥
    ⎣ ℎ(𝑥𝑛 ) ⎦
Computing the unconditional expectation (25.7) is easy.
We just sum over the marginal distribution of 𝑋𝑡 to get

𝔼[ℎ(𝑋𝑡 )] = ∑_{𝑥∈𝑆} (𝜓𝑃^𝑡 )(𝑥)ℎ(𝑥)

Here 𝜓 is the distribution of 𝑋0 .


Since 𝜓 and hence 𝜓𝑃 𝑡 are row vectors, we can also write this as

𝔼[ℎ(𝑋𝑡 )] = 𝜓𝑃 𝑡 ℎ

For the conditional expectation (25.8), we need to sum over the conditional distribution of 𝑋𝑡+𝑘 given 𝑋𝑡 = 𝑥.
We already know that this is 𝑃 𝑘 (𝑥, ⋅), so

𝔼[ℎ(𝑋𝑡+𝑘 ) ∣ 𝑋𝑡 = 𝑥] = (𝑃 𝑘 ℎ)(𝑥) (25.9)

25.6.1 Expectations of geometric sums

Sometimes we want to compute the mathematical expectation of a geometric sum, such as ∑𝑡 𝛽 𝑡 ℎ(𝑋𝑡 ).
In view of the preceding discussion, this is

𝔼 [∑_{𝑗=0}^{∞} 𝛽^𝑗 ℎ(𝑋𝑡+𝑗 ) ∣ 𝑋𝑡 = 𝑥] = ℎ(𝑥) + 𝛽(𝑃 ℎ)(𝑥) + 𝛽^2 (𝑃^2 ℎ)(𝑥) + ⋯

By the Neumann series lemma, this sum can be calculated using

𝐼 + 𝛽𝑃 + 𝛽 2 𝑃 2 + ⋯ = (𝐼 − 𝛽𝑃 )−1

The vector 𝑃 𝑘 ℎ stores the conditional expectation 𝔼[ℎ(𝑋𝑡+𝑘 ) ∣ 𝑋𝑡 = 𝑥] over all 𝑥.
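As an illustration, the following sketch computes this expected discounted sum for Hamilton's matrix; the particular payoff vector h and discount factor β are assumptions chosen for the example.

P = np.array([[0.971, 0.029, 0.000],
              [0.145, 0.778, 0.077],
              [0.000, 0.508, 0.492]])
h = np.array([1.0, 0.0, -1.0])   # payoff in each state
β = 0.96

# v solves v = h + β P v, i.e. (I - β P) v = h
v = np.linalg.solve(np.eye(3) - β * P, h)
print(v)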

Exercise 25.6.1


Imam and Temple [IT23] used a three-state transition matrix to describe the transition of three states of a regime: growth,
stagnation, and collapse

𝑃 ∶= ⎡0.68  0.12  0.20⎤
     ⎢0.50  0.24  0.26⎥
     ⎣0.36  0.18  0.46⎦

where rows, from top to bottom, correspond to growth, stagnation, and collapse.
In this exercise,
1. visualize the transition matrix and show this process is asymptotically stationary
2. calculate the stationary distribution using simulations
3. visualize the dynamics of (𝜓0 𝑃 𝑡 )(𝑖) where 𝑡 ∈ 0, ..., 25 and compare the convergent path with the previous
transition matrix
Compare your solution to the paper.

Solution to Exercise 25.6.1


Solution 1:

Since the matrix is everywhere positive, there is a unique stationary distribution.


Solution 2:
One simple way to calculate the stationary distribution is to take a large power of the transition matrix, as we have shown before

P = np.array([[0.68, 0.12, 0.20],


[0.50, 0.24, 0.26],
[0.36, 0.18, 0.46]])
P_power = np.linalg.matrix_power(P, 20)
P_power

array([[0.56145769, 0.15565164, 0.28289067],


[0.56145769, 0.15565164, 0.28289067],
[0.56145769, 0.15565164, 0.28289067]])

Note that rows of the transition matrix converge to the stationary distribution.


ψ_star_p = P_power[0]
ψ_star_p

array([0.56145769, 0.15565164, 0.28289067])

mc = qe.MarkovChain(P)
ψ_star = mc.stationary_distributions[0]
ψ_star

array([0.56145769, 0.15565164, 0.28289067])

Solution 3:
We find that the distribution 𝜓 converges to the stationary distribution more quickly than in Hamilton's chain.

ts_length = 10
num_distributions = 25
plot_distribution(P, ts_length, num_distributions)

In fact, the rate of convergence is governed by eigenvalues [SS23].

P_eigenvals = np.linalg.eigvals(P)
P_eigenvals

array([1. , 0.28219544, 0.09780456])

P_hamilton = np.array([[0.971, 0.029, 0.000],


[0.145, 0.778, 0.077],
[0.000, 0.508, 0.492]])

hamilton_eigenvals = np.linalg.eigvals(P_hamilton)
hamilton_eigenvals

array([1. , 0.85157412, 0.38942588])

More specifically, it is governed by the spectral gap, the difference between the largest and the second largest eigenvalue.


# Spectral gap: the largest eigenvalue minus the second largest (in modulus)
sp_gap_P = np.max(np.abs(P_eigenvals)) - np.sort(np.abs(P_eigenvals))[-2]
sp_gap_hamilton = np.max(np.abs(hamilton_eigenvals)) - np.sort(np.abs(hamilton_eigenvals))[-2]

sp_gap_P > sp_gap_hamilton

True

We will come back to this when we discuss spectral theory.

Exercise 25.6.2
We discussed the six-state transition matrix estimated by Imam & Temple [IT23] before.

nodes = ['DG', 'DC', 'NG', 'NC', 'AG', 'AC']


P = [[0.86, 0.11, 0.03, 0.00, 0.00, 0.00],
[0.52, 0.33, 0.13, 0.02, 0.00, 0.00],
[0.12, 0.03, 0.70, 0.11, 0.03, 0.01],
[0.13, 0.02, 0.35, 0.36, 0.10, 0.04],
[0.00, 0.00, 0.09, 0.11, 0.55, 0.25],
[0.00, 0.00, 0.09, 0.15, 0.26, 0.50]]

In this exercise,
1. show this process is asymptotically stationary without simulation
2. simulate and visualize the dynamics starting with a uniform distribution across states (each state will have a prob-
ability of 1/6)
3. change the initial distribution to P(DG) = 1, while all other states have a probability of 0

Solution to Exercise 25.6.2


Solution 1:
Although 𝑃 is not everywhere positive, 𝑃^𝑚 is everywhere positive when 𝑚 = 3.

P = np.array([[0.86, 0.11, 0.03, 0.00, 0.00, 0.00],


[0.52, 0.33, 0.13, 0.02, 0.00, 0.00],
[0.12, 0.03, 0.70, 0.11, 0.03, 0.01],
[0.13, 0.02, 0.35, 0.36, 0.10, 0.04],
[0.00, 0.00, 0.09, 0.11, 0.55, 0.25],
[0.00, 0.00, 0.09, 0.15, 0.26, 0.50]])

np.linalg.matrix_power(P,3)

array([[0.764927, 0.133481, 0.085949, 0.011481, 0.002956, 0.001206],


[0.658861, 0.131559, 0.161367, 0.031703, 0.011296, 0.005214],
[0.291394, 0.057788, 0.439702, 0.113408, 0.062707, 0.035001],
[0.272459, 0.051361, 0.365075, 0.132207, 0.108152, 0.070746],
[0.064129, 0.012533, 0.232875, 0.154385, 0.299243, 0.236835],
[0.072865, 0.014081, 0.244139, 0.160905, 0.265846, 0.242164]])

So it satisfies the requirement.


Solution 2:
We find that the distribution 𝜓 converges to the stationary distribution quickly, regardless of the initial distribution.

ts_length = 30
num_distributions = 20
nodes = ['DG', 'DC', 'NG', 'NC', 'AG', 'AC']

# Get parameters of transition matrix


n = len(P)
mc = qe.MarkovChain(P)
ψ_star = mc.stationary_distributions[0]
ψ_0 = np.array([[1/6 for i in range(6)],
[0 if i != 0 else 1 for i in range(6)]])
## Draw the plot
fig, axes = plt.subplots(ncols=2)
plt.subplots_adjust(wspace=0.35)
for idx in range(2):
ψ_t = iterate_ψ(ψ_0[idx], P, ts_length)
for i in range(n):
axes[idx].plot(ψ_t[:, i] - ψ_star[i], alpha=0.5, label=fr'$\psi_t({i+1})$')
axes[idx].set_ylim([-0.3, 0.3])
axes[idx].set_xlabel('t')
axes[idx].set_ylabel(fr'$\psi_t$')
axes[idx].legend()
axes[idx].axhline(0, linestyle='dashed', lw=1, color = 'black')

plt.show()

Exercise 25.6.3


Prove the following: If 𝑃 is a stochastic matrix, then so is the 𝑘-th power 𝑃 𝑘 for all 𝑘 ∈ ℕ.

Solution to Exercise 25.6.3


Suppose that 𝑃 is stochastic and, moreover, that 𝑃 𝑘 is stochastic for some integer 𝑘.
We will prove that 𝑃 𝑘+1 = 𝑃 𝑃 𝑘 is also stochastic.
(We are doing proof by induction — we assume the claim is true at 𝑘 and now prove it is true at 𝑘 + 1.)
To see this, observe that, since 𝑃 𝑘 is stochastic and the product of nonnegative matrices is nonnegative, 𝑃 𝑘+1 = 𝑃 𝑃 𝑘 is
nonnegative.
Also, if 1 is a column vector of ones, then, since 𝑃 𝑘 is stochastic we have 𝑃 𝑘 1 = 1 (rows sum to one).
Therefore 𝑃 𝑘+1 1 = 𝑃 𝑃 𝑘 1 = 𝑃 1 = 1
The proof is done.



CHAPTER

TWENTYSIX

MARKOV CHAINS: IRREDUCIBILITY AND ERGODICITY

Contents

• Markov Chains: Irreducibility and Ergodicity


– Overview
– Irreducibility
– Ergodicity
– Exercises

In addition to what’s in Anaconda, this lecture will need the following libraries:

!pip install quantecon

26.1 Overview

This lecture continues on from our earlier lecture on Markov chains.


Specifically, we will introduce the concepts of irreducibility and ergodicity, and see how they connect to stationarity.
Irreducibility describes the ability of a Markov chain to move between any two states in the system.
Ergodicity is a sample path property that describes the behavior of the system over long periods of time.
As we will see,
• an irreducible Markov chain guarantees the existence of a unique stationary distribution, while
• an ergodic Markov chain generates time series that satisfy a version of the law of large numbers.
Together, these concepts provide a foundation for understanding the long-term behavior of Markov chains.
Let’s start with some standard imports:

import matplotlib.pyplot as plt


plt.rcParams["figure.figsize"] = (11, 5) # set default figure size
import quantecon as qe
import numpy as np
import networkx as nx
from matplotlib import cm
import matplotlib as mpl


26.2 Irreducibility

To explain irreducibility, let’s take 𝑃 to be a fixed stochastic matrix.


Two states 𝑥 and 𝑦 are said to communicate with each other if there exist positive integers 𝑗 and 𝑘 such that

𝑃 𝑗 (𝑥, 𝑦) > 0 and 𝑃 𝑘 (𝑦, 𝑥) > 0

In view of our discussion above, this means precisely that


• state 𝑥 can eventually be reached from state 𝑦, and
• state 𝑦 can eventually be reached from state 𝑥
The stochastic matrix 𝑃 is called irreducible if all states communicate; that is, if 𝑥 and 𝑦 communicate for all (𝑥, 𝑦) in
𝑆 × 𝑆.
For example, consider the following transition probabilities for wealth of a fictitious set of households

We can translate this into a stochastic matrix, putting zeros where there’s no edge between nodes

𝑃 ∶= ⎡0.9  0.1  0  ⎤
     ⎢0.4  0.4  0.2⎥
     ⎣0.1  0.1  0.8⎦

It’s clear from the graph that this stochastic matrix is irreducible: we can eventually reach any state from any other state.
We can also test this using QuantEcon.py’s MarkovChain class

P = [[0.9, 0.1, 0.0],


[0.4, 0.4, 0.2],
[0.1, 0.1, 0.8]]

mc = qe.MarkovChain(P, ('poor', 'middle', 'rich'))


mc.is_irreducible

True

Here’s a more pessimistic scenario in which poor people remain poor forever


This stochastic matrix is not irreducible since, for example, rich is not accessible from poor.
Let’s confirm this

P = [[1.0, 0.0, 0.0],


[0.1, 0.8, 0.1],
[0.0, 0.2, 0.8]]

mc = qe.MarkovChain(P, ('poor', 'middle', 'rich'))


mc.is_irreducible

False

It might be clear to you already that irreducibility is going to be important in terms of long-run outcomes.
For example, poverty is a life sentence in the second graph but not the first.
We’ll come back to this a bit later.

26.2.1 Irreducibility and stationarity

In the previous lecture, our uniqueness result for the stationary distribution required the transition matrix to be everywhere positive.
In fact, irreducibility is enough for uniqueness of the stationary distribution to hold, provided a stationary distribution exists.
We can revise the theorem into the following fundamental theorem:

Theorem 26.2.1
If 𝑃 is irreducible, then 𝑃 has exactly one stationary distribution.

For proof, see Chapter 4 of [SS23] or Theorem 5.2 of [Haggstrom02].
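As a quick illustration (not a proof), we can count stationary distributions with QuantEcon.py: the irreducible wealth chain above has exactly one, while a reducible chain such as the identity matrix has several (the identity matrix is chosen here just as an example).

P_irreducible = [[0.9, 0.1, 0.0],
                 [0.4, 0.4, 0.2],
                 [0.1, 0.1, 0.8]]
P_reducible = [[1.0, 0.0],
               [0.0, 1.0]]

print(qe.MarkovChain(P_irreducible).stationary_distributions)
print(qe.MarkovChain(P_reducible).stationary_distributions)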


26.3 Ergodicity

Please note that we use 𝟙 for a vector of ones in this lecture.


Under irreducibility, yet another important result obtains:

Theorem 26.3.1
If 𝑃 is irreducible and 𝜓∗ is the unique stationary distribution, then, for all 𝑥 ∈ 𝑆,

(1/𝑚) ∑_{𝑡=1}^{𝑚} 𝟙{𝑋𝑡 = 𝑥} → 𝜓∗ (𝑥)    as 𝑚 → ∞    (26.1)

Here
• {𝑋𝑡 } is a Markov chain with stochastic matrix 𝑃 and initial distribution 𝜓0
• 𝟙{𝑋𝑡 = 𝑥} = 1 if 𝑋𝑡 = 𝑥 and zero otherwise.
The result in (26.1) is sometimes called ergodicity.
The theorem tells us that the fraction of time the chain spends at state 𝑥 converges to 𝜓∗ (𝑥) as time goes to infinity.
This gives us another way to interpret the stationary distribution (provided irreducibility holds).
Importantly, the result is valid for any choice of 𝜓0 .
The theorem is related to the Law of Large Numbers.
It tells us that, in some settings, the law of large numbers sometimes holds even when the sequence of random variables
is not IID.

26.3.1 Example 1

Recall our cross-sectional interpretation of the employment/unemployment model discussed above.


Assume that 𝛼 ∈ (0, 1) and 𝛽 ∈ (0, 1), so that irreducibility holds.
We saw that the stationary distribution is (𝑝, 1 − 𝑝), where

𝑝 = 𝛽/(𝛼 + 𝛽)
In the cross-sectional interpretation, this is the fraction of people unemployed.
In view of our latest (ergodicity) result, it is also the fraction of time that a single worker can expect to spend unemployed.
Thus, in the long run, cross-sectional averages for a population and time-series averages for a given person coincide.
This is one aspect of the concept of ergodicity.
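A short simulation sketch illustrates this, using the illustrative values α = 0.3 and β = 0.1 (these numbers are assumptions): the fraction of time spent unemployed is close to β/(α + β).

α, β = 0.3, 0.1
P = [[1 - α, α],
     [β, 1 - β]]
X = qe.MarkovChain(P).simulate(ts_length=1_000_000)

print(np.mean(X == 0), β / (α + β))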


26.3.2 Example 2

Another example is the Hamilton dynamics we discussed before.


The graph of the Markov chain shows that it is irreducible.
Therefore, the sample path averages for each state (the fraction of time spent in each state) converge to the stationary distribution regardless of the starting state.
Let’s denote the fraction of time spent in state 𝑥 at time 𝑡 in our sample path as 𝑝𝑡̂ (𝑥) where

𝑝𝑡̂ (𝑥) ∶= (1/𝑡) ∑_{𝑠=1}^{𝑡} 𝟙{𝑋𝑠 = 𝑥}

Here we compare 𝑝𝑡̂ (𝑥) with the stationary distribution 𝜓∗ (𝑥) for different starting points 𝑥0 .

P = np.array([[0.971, 0.029, 0.000],
              [0.145, 0.778, 0.077],
              [0.000, 0.508, 0.492]])
ts_length = 10_000
mc = qe.MarkovChain(P)
n = len(P)
fig, axes = plt.subplots(nrows=1, ncols=n, figsize=(15, 6))
ψ_star = mc.stationary_distributions[0]
plt.subplots_adjust(wspace=0.35)

for i in range(n):
    axes[i].axhline(ψ_star[i], linestyle='dashed', lw=2, color='black',
                    label=fr'$\psi^*({i})$')
    axes[i].set_xlabel('t')
    axes[i].set_ylabel(fr'$\hat p_t({i})$')

    # Compute the fraction of time spent, starting from different x_0s
    for x0, col in ((0, 'blue'), (1, 'green'), (2, 'red')):
        # Generate time series that starts at different x0
        X = mc.simulate(ts_length, init=x0)
        p_hat = (X == i).cumsum() / (1 + np.arange(ts_length, dtype=float))
        axes[i].plot(p_hat, color=col, label=fr'$x_0 = \, {x0} $')
    axes[i].legend()

plt.show()

Note the convergence to the stationary distribution regardless of the starting point 𝑥0 .


26.3.3 Example 3

Let’s look at one more example with six states discussed before.

$$
P :=
\begin{bmatrix}
0.86 & 0.11 & 0.03 & 0.00 & 0.00 & 0.00 \\
0.52 & 0.33 & 0.13 & 0.02 & 0.00 & 0.00 \\
0.12 & 0.03 & 0.70 & 0.11 & 0.03 & 0.01 \\
0.13 & 0.02 & 0.35 & 0.36 & 0.10 & 0.04 \\
0.00 & 0.00 & 0.09 & 0.11 & 0.55 & 0.25 \\
0.00 & 0.00 & 0.09 & 0.15 & 0.26 & 0.50
\end{bmatrix}
$$

The graph for the chain shows all states are reachable, indicating that this chain is irreducible.
Here we visualize the difference between 𝑝𝑡̂ (𝑥) and the stationary distribution 𝜓∗ (𝑥) for each state 𝑥

P = [[0.86, 0.11, 0.03, 0.00, 0.00, 0.00],
     [0.52, 0.33, 0.13, 0.02, 0.00, 0.00],
     [0.12, 0.03, 0.70, 0.11, 0.03, 0.01],
     [0.13, 0.02, 0.35, 0.36, 0.10, 0.04],
     [0.00, 0.00, 0.09, 0.11, 0.55, 0.25],
     [0.00, 0.00, 0.09, 0.15, 0.26, 0.50]]

ts_length = 10_000
mc = qe.MarkovChain(P)
ψ_star = mc.stationary_distributions[0]
fig, ax = plt.subplots(figsize=(9, 6))
X = mc.simulate(ts_length)
# Center the plot at 0
ax.set_ylim(-0.25, 0.25)
ax.axhline(0, linestyle='dashed', lw=2, color='black', alpha=0.4)

for x0 in range(6):
    # Calculate the fraction of time spent in each state
    p_hat = (X == x0).cumsum() / (1 + np.arange(ts_length, dtype=float))
    ax.plot(p_hat - ψ_star[x0], label=f'$x = {x0+1} $')
    ax.set_xlabel('t')
    ax.set_ylabel(r'$\hat p_t(x) - \psi^* (x)$')

ax.legend()
plt.show()


Similar to previous examples, the sample path averages for each state converge to the stationary distribution.

26.3.4 Example 4

Let’s look at another example with two states: 0 and 1.

$$
P :=
\begin{bmatrix}
0 & 1 \\
1 & 0
\end{bmatrix}
$$

The diagram of the Markov chain shows that it is irreducible.

In fact it has a periodic cycle — the state cycles between the two states in a regular way.
This is called periodicity.
It is still irreducible so ergodicity holds.

P = np.array([[0, 1],
              [1, 0]])
ts_length = 10_000
mc = qe.MarkovChain(P)
n = len(P)
fig, axes = plt.subplots(nrows=1, ncols=n)
ψ_star = mc.stationary_distributions[0]

for i in range(n):
    axes[i].set_ylim(0.45, 0.55)
    axes[i].axhline(ψ_star[i], linestyle='dashed', lw=2, color='black',
                    label=fr'$\psi^*({i})$')
    axes[i].set_xlabel('t')
    axes[i].set_ylabel(fr'$\hat p_t({i})$')

    # Compute the fraction of time spent, for each x
    for x0 in range(n):
        # Generate time series starting at different x_0
        X = mc.simulate(ts_length, init=x0)
        p_hat = (X == i).cumsum() / (1 + np.arange(ts_length, dtype=float))
        axes[i].plot(p_hat, label=fr'$x_0 = \, {x0} $')

    axes[i].legend()

plt.show()

This example helps to emphasize that asymptotic stationarity is about the distribution, while ergodicity is about the sample
path.
The proportion of time spent in each state can still converge to the stationary distribution even for periodic chains.
However, the sequence of marginal distributions of 𝑋𝑡 does not converge in this case.
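The quantecon MarkovChain class can confirm the periodicity directly; a quick check (a sketch reusing the matrix above):

P = np.array([[0, 1],
              [1, 0]])
mc = qe.MarkovChain(P)
print(mc.is_irreducible)   # True  -- both states communicate
print(mc.is_aperiodic)     # False -- the chain is periodic
print(mc.period)           # 2     -- it cycles with period 2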

26.3.5 Expectations of geometric sums

Sometimes we want to compute the mathematical expectation of a geometric sum, such as ∑𝑡 𝛽 𝑡 ℎ(𝑋𝑡 ).
In view of the preceding discussion, this is

$$
\mathbb{E}\left[\sum_{j \geq 0} \beta^j h(X_{t+j}) \,\Big|\, X_t = x\right] = h(x) + \beta(Ph)(x) + \beta^2(P^2h)(x) + \cdots
$$

By the Neumann series lemma, this sum can be calculated using

$$
I + \beta P + \beta^2 P^2 + \cdots = (I - \beta P)^{-1}
$$
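As a small illustration (a sketch that reuses the Hamilton matrix from above together with a hypothetical payoff function h), the expected discounted sum can be computed in one line with np.linalg.solve:

P = np.array([[0.971, 0.029, 0.000],
              [0.145, 0.778, 0.077],
              [0.000, 0.508, 0.492]])
h = np.array([1.0, 0.0, -1.0])   # hypothetical payoff attached to each state
β = 0.96

# v(x) = E[∑_{j≥0} β^j h(X_{t+j}) | X_t = x], i.e. v = (I - βP)^{-1} h
v = np.linalg.solve(np.eye(3) - β * P, h)
print(v)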


26.4 Exercises

Exercise 26.4.1
Benhabib et al. [BBL19] estimated the following transition matrix for social mobility:

$$
P :=
\begin{bmatrix}
0.222 & 0.222 & 0.215 & 0.187 & 0.081 & 0.038 & 0.029 & 0.006 \\
0.221 & 0.22 & 0.215 & 0.188 & 0.082 & 0.039 & 0.029 & 0.006 \\
0.207 & 0.209 & 0.21 & 0.194 & 0.09 & 0.046 & 0.036 & 0.008 \\
0.198 & 0.201 & 0.207 & 0.198 & 0.095 & 0.052 & 0.04 & 0.009 \\
0.175 & 0.178 & 0.197 & 0.207 & 0.11 & 0.067 & 0.054 & 0.012 \\
0.182 & 0.184 & 0.2 & 0.205 & 0.106 & 0.062 & 0.05 & 0.011 \\
0.123 & 0.125 & 0.166 & 0.216 & 0.141 & 0.114 & 0.094 & 0.021 \\
0.084 & 0.084 & 0.142 & 0.228 & 0.17 & 0.143 & 0.121 & 0.028
\end{bmatrix}
$$

where each state 1 to 8 corresponds to a percentile of wealth shares

0 − 20%, 20 − 40%, 40 − 60%, 60 − 80%, 80 − 90%, 90 − 95%, 95 − 99%, 99 − 100%

The matrix is recorded as P below

P = [
[0.222, 0.222, 0.215, 0.187, 0.081, 0.038, 0.029, 0.006],
[0.221, 0.22, 0.215, 0.188, 0.082, 0.039, 0.029, 0.006],
[0.207, 0.209, 0.21, 0.194, 0.09, 0.046, 0.036, 0.008],
[0.198, 0.201, 0.207, 0.198, 0.095, 0.052, 0.04, 0.009],
[0.175, 0.178, 0.197, 0.207, 0.11, 0.067, 0.054, 0.012],
[0.182, 0.184, 0.2, 0.205, 0.106, 0.062, 0.05, 0.011],
[0.123, 0.125, 0.166, 0.216, 0.141, 0.114, 0.094, 0.021],
[0.084, 0.084, 0.142, 0.228, 0.17, 0.143, 0.121, 0.028]
]

P = np.array(P)
codes_B = ('1','2','3','4','5','6','7','8')

In this exercise,
1. show this process is asymptotically stationary and calculate the stationary distribution using simulations.
2. use simulations to demonstrate ergodicity of this process.

Solution to Exercise 26.4.1


Solution 1:
Using the technique we learned before, we can take powers of the transition matrix:

P = [[0.222, 0.222, 0.215, 0.187, 0.081, 0.038, 0.029, 0.006],
     [0.221, 0.22, 0.215, 0.188, 0.082, 0.039, 0.029, 0.006],
     [0.207, 0.209, 0.21, 0.194, 0.09, 0.046, 0.036, 0.008],
     [0.198, 0.201, 0.207, 0.198, 0.095, 0.052, 0.04, 0.009],
     [0.175, 0.178, 0.197, 0.207, 0.11, 0.067, 0.054, 0.012],
     [0.182, 0.184, 0.2, 0.205, 0.106, 0.062, 0.05, 0.011],
     [0.123, 0.125, 0.166, 0.216, 0.141, 0.114, 0.094, 0.021],
     [0.084, 0.084, 0.142, 0.228, 0.17, 0.143, 0.121, 0.028]]

P = np.array(P)
codes_B = ('1','2','3','4','5','6','7','8')

np.linalg.matrix_power(P, 10)

array([[0.20254451, 0.20379879, 0.20742102, 0.19505842, 0.09287832,


0.0503871 , 0.03932382, 0.00858802],
[0.20254451, 0.20379879, 0.20742102, 0.19505842, 0.09287832,
0.0503871 , 0.03932382, 0.00858802],
[0.20254451, 0.20379879, 0.20742102, 0.19505842, 0.09287832,
0.0503871 , 0.03932382, 0.00858802],
[0.20254451, 0.20379879, 0.20742102, 0.19505842, 0.09287832,
0.0503871 , 0.03932382, 0.00858802],
[0.20254451, 0.20379879, 0.20742102, 0.19505842, 0.09287832,
0.0503871 , 0.03932382, 0.00858802],
[0.20254451, 0.20379879, 0.20742102, 0.19505842, 0.09287832,
0.0503871 , 0.03932382, 0.00858802],
[0.20254451, 0.20379879, 0.20742102, 0.19505842, 0.09287832,
0.0503871 , 0.03932382, 0.00858802],
[0.20254451, 0.20379879, 0.20742102, 0.19505842, 0.09287832,
0.0503871 , 0.03932382, 0.00858802]])

We find again that rows of the transition matrix converge to the stationary distribution

mc = qe.MarkovChain(P)
ψ_star = mc.stationary_distributions[0]
ψ_star

array([0.20254451, 0.20379879, 0.20742102, 0.19505842, 0.09287832,


0.0503871 , 0.03932382, 0.00858802])

Solution 2:

ts_length = 1000
mc = qe.MarkovChain(P)
fig, ax = plt.subplots(figsize=(9, 6))
X = mc.simulate(ts_length)
ax.set_ylim(-0.25, 0.25)
ax.axhline(0, linestyle='dashed', lw=2, color='black', alpha=0.4)

for x0 in range(8):
    # Calculate the fraction of time spent in each state
    p_hat = (X == x0).cumsum() / (1 + np.arange(ts_length, dtype=float))
    ax.plot(p_hat - ψ_star[x0], label=f'$x = {x0+1} $')
    ax.set_xlabel('t')
    ax.set_ylabel(r'$\hat p_t(x) - \psi^* (x)$')

ax.legend()
plt.show()


Note that the fraction of time spent at each state quickly converges to the probability assigned to that state by the stationary
distribution.

Exercise 26.4.2
According to the discussion above, if a worker’s employment dynamics obey the stochastic matrix

$$
P :=
\begin{bmatrix}
1 - \alpha & \alpha \\
\beta & 1 - \beta
\end{bmatrix}
$$

with 𝛼 ∈ (0, 1) and 𝛽 ∈ (0, 1), then, in the long run, the fraction of time spent unemployed will be

$$
p := \frac{\beta}{\alpha + \beta}
$$

In other words, if {𝑋𝑡 } represents the Markov chain for employment, then 𝑋̄ 𝑚 → 𝑝 as 𝑚 → ∞, where

$$
\bar X_m := \frac{1}{m} \sum_{t=1}^m \mathbb{1}\{X_t = 0\}
$$

This exercise asks you to illustrate convergence by computing 𝑋̄ 𝑚 for large 𝑚 and checking that it is close to 𝑝.
You will see that this statement is true regardless of the choice of initial condition or the values of 𝛼, 𝛽, provided both lie
in (0, 1).
The result should be similar to the plot shown here.

Solution to Exercise 26.4.2


We will address this exercise graphically.
The plots show the time series of 𝑋̄ 𝑚 − 𝑝 for two initial conditions.
As 𝑚 gets large, both series converge to zero.


α = β = 0.1
ts_length = 10000
p = β / (α + β)

P = ((1 - α, α),       # Careful: P and p are distinct
     (β, 1 - β))
mc = qe.MarkovChain(P)

fig, ax = plt.subplots(figsize=(9, 6))
ax.set_ylim(-0.25, 0.25)
ax.axhline(0, linestyle='dashed', lw=2, color='black', alpha=0.4)

for x0, col in ((0, 'blue'), (1, 'green')):
    # Generate time series for worker that starts at x0
    X = mc.simulate(ts_length, init=x0)
    # Compute fraction of time spent unemployed, for each n
    X_bar = (X == 0).cumsum() / (1 + np.arange(ts_length, dtype=float))
    # Plot
    ax.fill_between(range(ts_length), np.zeros(ts_length), X_bar - p,
                    color=col, alpha=0.1)
    ax.plot(X_bar - p, color=col, label=fr'$x_0 = \, {x0} $')
    # Overlay in black--make lines clearer
    ax.plot(X_bar - p, 'k-', alpha=0.6)

ax.set_xlabel('t')
ax.set_ylabel(r'$\bar X_m - \psi^* (x)$')

ax.legend(loc='upper right')
plt.show()

Exercise 26.4.3
In the quantecon library, irreducibility is tested by checking whether the chain forms a strongly connected component.
However, another way to verify irreducibility is to check whether 𝐴 satisfies the following statement:


Assume that 𝐴 is an 𝑛 × 𝑛 stochastic matrix. Then 𝐴 is irreducible if and only if $\sum_{k=0}^{n-1} A^k$ is a positive matrix.
(see more: [Zha12] and here)
Based on this claim, write a function to test irreducibility.

Solution to Exercise 26.4.3

def is_irreducible(P):
    n = len(P)
    result = np.zeros((n, n))
    for i in range(n):
        result += np.linalg.matrix_power(P, i)
    return np.all(result > 0)

P1 = np.array([[0, 1],
               [1, 0]])
P2 = np.array([[1.0, 0.0, 0.0],
               [0.1, 0.8, 0.1],
               [0.0, 0.2, 0.8]])
P3 = np.array([[0.971, 0.029, 0.000],
               [0.145, 0.778, 0.077],
               [0.000, 0.508, 0.492]])

for P in (P1, P2, P3):
    result = lambda P: 'irreducible' if is_irreducible(P) else 'reducible'
    print(f'{P}: {result(P)}')

[[0 1]
[1 0]]: irreducible
[[1. 0. 0. ]
[0.1 0.8 0.1]
[0. 0.2 0.8]]: reducible
[[0.971 0.029 0. ]
[0.145 0.778 0.077]
[0. 0.508 0.492]]: irreducible
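As a cross-check (a sketch), the same three matrices can be fed to quantecon's built-in test, which should agree with the function above:

for P in (P1, P2, P3):
    print(qe.MarkovChain(P).is_irreducible)   # expect True, False, True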



CHAPTER

TWENTYSEVEN

UNIVARIATE TIME SERIES WITH MATRIX ALGEBRA

Contents

• Univariate Time Series with Matrix Algebra


– Overview
– Samuelson’s model
– Adding a random term
– Computing population moments
– Moving average representation
– A forward looking model

27.1 Overview

This lecture uses matrices to solve some linear difference equations.


As a running example, we’ll study a second-order linear difference equation that was the key technical tool in Paul
Samuelson’s 1939 article [Sam39] that introduced the multiplier-accelerator model.
This model became the workhorse that powered early econometric versions of Keynesian macroeconomic models in the
United States.
You can read about the details of that model in this QuantEcon lecture.
(That lecture also describes some technicalities about second-order linear difference equations.)
In this lecture, we’ll also learn about an autoregressive representation and a moving average representation of a non-
stationary univariate time series {𝑦𝑡 }𝑇𝑡=0 .
We’ll also study a “perfect foresight” model of stock prices that involves solving a “forward-looking” linear difference
equation.
We will use the following imports:

import numpy as np
%matplotlib inline
import matplotlib.pyplot as plt
from matplotlib import cm
plt.rcParams["figure.figsize"] = (11, 5) #set default figure size


27.2 Samuelson’s model

Let 𝑡 = 0, ±1, ±2, … index time.


For 𝑡 = 1, 2, 3, … , 𝑇 suppose that

𝑦𝑡 = 𝛼0 + 𝛼1 𝑦𝑡−1 + 𝛼2 𝑦𝑡−2 (27.1)

where we assume that 𝑦0 and 𝑦−1 are given numbers that we take as initial conditions.
In Samuelson’s model, 𝑦𝑡 stood for national income or perhaps a different measure of aggregate activity called gross
domestic product (GDP) at time 𝑡.
Equation (27.1) is called a second-order linear difference equation.
But actually, it is a collection of 𝑇 simultaneous linear equations in the 𝑇 variables 𝑦1 , 𝑦2 , … , 𝑦𝑇 .

Note: To be able to solve a second-order linear difference equation, we require two boundary conditions that can take
the form either of two initial conditions or two terminal conditions or possibly one of each.

Let’s write our equations as a stacked system


$$
\underbrace{\begin{bmatrix}
1 & 0 & 0 & 0 & \cdots & 0 & 0 & 0 \\
-\alpha_1 & 1 & 0 & 0 & \cdots & 0 & 0 & 0 \\
-\alpha_2 & -\alpha_1 & 1 & 0 & \cdots & 0 & 0 & 0 \\
0 & -\alpha_2 & -\alpha_1 & 1 & \cdots & 0 & 0 & 0 \\
\vdots & \vdots & \vdots & \vdots & \cdots & \vdots & \vdots & \vdots \\
0 & 0 & 0 & 0 & \cdots & -\alpha_2 & -\alpha_1 & 1
\end{bmatrix}}_{\equiv A}
\begin{bmatrix} y_1 \\ y_2 \\ y_3 \\ y_4 \\ \vdots \\ y_T \end{bmatrix}
=
\underbrace{\begin{bmatrix}
\alpha_0 + \alpha_1 y_0 + \alpha_2 y_{-1} \\
\alpha_0 + \alpha_2 y_0 \\
\alpha_0 \\
\alpha_0 \\
\vdots \\
\alpha_0
\end{bmatrix}}_{\equiv b}
$$
or

𝐴𝑦 = 𝑏

where
$$
y = \begin{bmatrix} y_1 \\ y_2 \\ \vdots \\ y_T \end{bmatrix}
$$
Evidently 𝑦 can be computed from

𝑦 = 𝐴−1 𝑏

The vector 𝑦 is a complete time path {𝑦𝑡 }𝑇𝑡=1 .


Let’s put Python to work on an example that captures the flavor of Samuelson’s multiplier-accelerator model.
We’ll set parameters equal to the same values we used in this QuantEcon lecture.

T = 80

# parameters
α0 = 10.0
α1 = 1.53
α2 = -.9

y_1 = 28.  # y_{-1}
y0 = 24.


Now we construct 𝐴 and 𝑏.

A = np.identity(T)  # The T x T identity matrix

for i in range(T):

    if i-1 >= 0:
        A[i, i-1] = -α1

    if i-2 >= 0:
        A[i, i-2] = -α2

b = np.full(T, α0)
b[0] = α0 + α1 * y0 + α2 * y_1
b[1] = α0 + α2 * y0

Let’s look at the matrix 𝐴 and the vector 𝑏 for our example.

A, b

(array([[ 1. , 0. , 0. , ..., 0. , 0. , 0. ],
[-1.53, 1. , 0. , ..., 0. , 0. , 0. ],
[ 0.9 , -1.53, 1. , ..., 0. , 0. , 0. ],
...,
[ 0. , 0. , 0. , ..., 1. , 0. , 0. ],
[ 0. , 0. , 0. , ..., -1.53, 1. , 0. ],
[ 0. , 0. , 0. , ..., 0.9 , -1.53, 1. ]]),
array([ 21.52, -11.6 , 10. , 10. , 10. , 10. , 10. , 10. ,
10. , 10. , 10. , 10. , 10. , 10. , 10. , 10. ,
10. , 10. , 10. , 10. , 10. , 10. , 10. , 10. ,
10. , 10. , 10. , 10. , 10. , 10. , 10. , 10. ,
10. , 10. , 10. , 10. , 10. , 10. , 10. , 10. ,
10. , 10. , 10. , 10. , 10. , 10. , 10. , 10. ,
10. , 10. , 10. , 10. , 10. , 10. , 10. , 10. ,
10. , 10. , 10. , 10. , 10. , 10. , 10. , 10. ,
10. , 10. , 10. , 10. , 10. , 10. , 10. , 10. ,
10. , 10. , 10. , 10. , 10. , 10. , 10. , 10. ]))

Now let’s solve for the path of 𝑦.


If 𝑦𝑡 is GNP at time 𝑡, then we have a version of Samuelson’s model of the dynamics for GNP.
To solve 𝑦 = 𝐴−1 𝑏 we can either invert 𝐴 directly, as in

A_inv = np.linalg.inv(A)

y = A_inv @ b

or we can use np.linalg.solve:

y_second_method = np.linalg.solve(A, b)

Here make sure the two methods give the same result, at least up to floating point precision:

np.allclose(y, y_second_method)


True

Note: In general, np.linalg.solve is more numerically stable than using np.linalg.inv directly. However,
stability is not an issue for this small example. Moreover, we will repeatedly use A_inv in what follows, so there is added
value in computing it directly.

Now we can plot.

plt.plot(np.arange(T)+1, y)
plt.xlabel('t')
plt.ylabel('y')

plt.show()

The steady state value 𝑦∗ of 𝑦𝑡 is obtained by setting 𝑦𝑡 = 𝑦𝑡−1 = 𝑦𝑡−2 = 𝑦∗ in (27.1), which yields

$$
y^* = \frac{\alpha_0}{1 - \alpha_1 - \alpha_2}
$$

If we set the initial values to 𝑦0 = 𝑦−1 = 𝑦∗ , then 𝑦𝑡 will be constant:

y_star = α0 / (1 - α1 - α2)
y_1_steady = y_star  # y_{-1}
y0_steady = y_star

b_steady = np.full(T, α0)
b_steady[0] = α0 + α1 * y0_steady + α2 * y_1_steady
b_steady[1] = α0 + α2 * y0_steady

y_steady = A_inv @ b_steady

plt.plot(np.arange(T)+1, y_steady)
plt.xlabel('t')
plt.ylabel('y')

plt.show()


27.3 Adding a random term

To generate some excitement, we’ll follow in the spirit of the great economists Eugen Slutsky and Ragnar Frisch and
replace our original second-order difference equation with the following second-order stochastic linear difference
equation:

𝑦𝑡 = 𝛼0 + 𝛼1 𝑦𝑡−1 + 𝛼2 𝑦𝑡−2 + 𝑢𝑡 (27.2)

where 𝑢𝑡 ∼ 𝑁 (0, 𝜎𝑢2 ) and is IID, meaning independent and identically distributed.
We’ll stack these 𝑇 equations into a system cast in terms of matrix algebra.
Let’s define the random vector
$$
u = \begin{bmatrix} u_1 \\ u_2 \\ \vdots \\ u_T \end{bmatrix}
$$
With 𝐴, 𝑏, 𝑦 defined as above, we now assume that 𝑦 is governed by the system

𝐴𝑦 = 𝑏 + 𝑢 (27.3)

The solution for 𝑦 becomes

𝑦 = 𝐴−1 (𝑏 + 𝑢) (27.4)

Let’s try it out in Python.

σu = 2.

u = np.random.normal(0, σu, size=T)
y = A_inv @ (b + u)

plt.plot(np.arange(T)+1, y)
plt.xlabel('t')
plt.ylabel('y')

plt.show()


The above time series looks a lot like (detrended) GDP series for a number of advanced countries in recent decades.
We can simulate 𝑁 paths.

N = 100

for i in range(N):
    col = cm.viridis(np.random.rand())  # Choose a random color from viridis
    u = np.random.normal(0, σu, size=T)
    y = A_inv @ (b + u)
    plt.plot(np.arange(T)+1, y, lw=0.5, color=col)

plt.xlabel('t')
plt.ylabel('y')

plt.show()

Also consider the case when 𝑦0 and 𝑦−1 are at steady state.

N = 100

for i in range(N):
    col = cm.viridis(np.random.rand())  # Choose a random color from viridis
    u = np.random.normal(0, σu, size=T)
    y_steady = A_inv @ (b_steady + u)
    plt.plot(np.arange(T)+1, y_steady, lw=0.5, color=col)

plt.xlabel('t')
plt.ylabel('y')

plt.show()

27.4 Computing population moments

We can apply standard formulas for multivariate normal distributions to compute the mean vector and covariance matrix
for our time series model

𝑦 = 𝐴−1 (𝑏 + 𝑢).

You can read about multivariate normal distributions in this lecture Multivariate Normal Distribution.
Let’s write our model as

$$
y = \tilde{A}(b + u)
$$

where $\tilde{A} = A^{-1}$.
Because linear combinations of normal random variables are normal, we know that

𝑦 ∼ 𝒩(𝜇𝑦 , Σ𝑦 )

where

$$
\mu_y = \tilde{A} b
$$

and

$$
\Sigma_y = \tilde{A} (\sigma_u^2 I_{T \times T}) \tilde{A}^T
$$

Let’s write a Python class that computes the mean vector 𝜇𝑦 and covariance matrix Σ𝑦 .


class population_moments:
    """
    Compute population moments mu_y, Sigma_y.
    ---------
    Parameters:
    alpha0, alpha1, alpha2, T, y_1, y0
    """
    def __init__(self, alpha0, alpha1, alpha2, T, y_1, y0, sigma_u):

        # compute A
        A = np.identity(T)

        for i in range(T):
            if i-1 >= 0:
                A[i, i-1] = -alpha1

            if i-2 >= 0:
                A[i, i-2] = -alpha2

        # compute b
        b = np.full(T, alpha0)
        b[0] = alpha0 + alpha1 * y0 + alpha2 * y_1
        b[1] = alpha0 + alpha2 * y0

        # compute A inverse
        A_inv = np.linalg.inv(A)

        self.A, self.b, self.A_inv, self.sigma_u, self.T = A, b, A_inv, sigma_u, T

    def sample_y(self, n):
        """
        Give a sample of size n of y.
        """
        A_inv, sigma_u, b, T = self.A_inv, self.sigma_u, self.b, self.T
        us = np.random.normal(0, sigma_u, size=[n, T])
        ys = np.vstack([A_inv @ (b + u) for u in us])

        return ys

    def get_moments(self):
        """
        Compute the population moments of y.
        """
        A_inv, sigma_u, b = self.A_inv, self.sigma_u, self.b

        # compute mu_y
        self.mu_y = A_inv @ b
        self.Sigma_y = sigma_u**2 * (A_inv @ A_inv.T)

        return self.mu_y, self.Sigma_y


my_process = population_moments(
    alpha0=10.0, alpha1=1.53, alpha2=-.9, T=80, y_1=28., y0=24., sigma_u=1)

mu_y, Sigma_y = my_process.get_moments()

A_inv = my_process.A_inv
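Before plotting, a quick Monte Carlo sanity check (a sketch): sample moments computed from sample_y should approach the population moments returned by get_moments as the number of draws grows.

ys = my_process.sample_y(10_000)

print(np.max(np.abs(ys.mean(axis=0) - mu_y)))    # sample mean vs mu_y
print(np.max(np.abs(np.cov(ys.T) - Sigma_y)))    # sample covariance vs Sigma_y

Both gaps shrink as the sample size increases.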


It is enlightening to study the 𝜇𝑦 , Σ𝑦 ’s implied by various parameter values.


Among other things, we can use the class to exhibit how statistical stationarity of 𝑦 prevails only for very special initial
conditions.
Let’s begin by generating 𝑁 time realizations of 𝑦 and plotting them together with the population mean 𝜇𝑦 .

# plot mean
N = 100

for i in range(N):
    col = cm.viridis(np.random.rand())  # Choose a random color from viridis
    ys = my_process.sample_y(N)
    plt.plot(ys[i,:], lw=0.5, color=col)

plt.plot(mu_y, color='red')

plt.xlabel('t')
plt.ylabel('y')

plt.show()

Visually, notice how the variance across realizations of 𝑦𝑡 increases as 𝑡 increases.


Let’s plot the population variance of 𝑦𝑡 against 𝑡.

# plot variance
plt.plot(Sigma_y.diagonal())
plt.show()


Notice how the population variance increases and asymptotes


Let’s print out the covariance matrix Σ𝑦 for a time series 𝑦

my_process = population_moments(alpha0=0, alpha1=.8, alpha2=0, T=6,
                                y_1=0., y0=0., sigma_u=1)

mu_y, Sigma_y = my_process.get_moments()


print("mu_y = ",mu_y)
print("Sigma_y = ", Sigma_y)

mu_y = [0. 0. 0. 0. 0. 0.]


Sigma_y = [[1. 0.8 0.64 0.512 0.4096 0.32768 ]
[0.8 1.64 1.312 1.0496 0.83968 0.671744 ]
[0.64 1.312 2.0496 1.63968 1.311744 1.0493952 ]
[0.512 1.0496 1.63968 2.311744 1.8493952 1.47951616]
[0.4096 0.83968 1.311744 1.8493952 2.47951616 1.98361293]
[0.32768 0.671744 1.0493952 1.47951616 1.98361293 2.58689034]]

Notice that the covariances between 𝑦𝑡 and 𝑦𝑡−1 – the elements on the superdiagonal – are not identical.
This is an indication that the time series represented by our 𝑦 vector is not stationary.
To make it stationary, we’d have to alter our system so that our initial conditions (𝑦−1 , 𝑦0 ) are not fixed numbers but
instead a jointly normally distributed random vector with a particular mean and covariance matrix.
We describe how to do that in the lecture Linear State Space Models.
But just to set the stage for that analysis, let’s print out the bottom right corner of Σ𝑦 .

# Re-create the original T=80 process from above so that the 72:80 block is non-empty
my_process = population_moments(
    alpha0=10.0, alpha1=1.53, alpha2=-.9, T=80, y_1=28., y0=24., sigma_u=1)
mu_y, Sigma_y = my_process.get_moments()

print("bottom right corner of Sigma_y = \n", Sigma_y[72:,72:])

Please notice how the sub diagonal and super diagonal elements seem to have converged.
This is an indication that our process is asymptotically stationary.
You can read about stationarity of more general linear time series models in this lecture Linear State Space Models.


There is a lot to be learned about the process by staring at the off diagonal elements of Σ𝑦 corresponding to different time
periods 𝑡, but we resist the temptation to do so here.

27.5 Moving average representation

Let’s print out 𝐴−1 and stare at its structure


• is it triangular or almost triangular or … ?
To study the structure of 𝐴−1 , we shall print just up to 3 decimals.
Let’s begin by printing out just the upper left hand corner of 𝐴−1

with np.printoptions(precision=3, suppress=True):
    print(A_inv[0:7,0:7])

[[ 1. -0. 0. -0. -0. 0. -0. ]


[ 1.53 1. 0. -0. -0. 0. -0. ]
[ 1.441 1.53 1. 0. 0. 0. 0. ]
[ 0.828 1.441 1.53 1. 0. 0. 0. ]
[-0.031 0.828 1.441 1.53 1. 0. -0. ]
[-0.792 -0.031 0.828 1.441 1.53 1. 0. ]
[-1.184 -0.792 -0.031 0.828 1.441 1.53 1. ]]

Evidently, 𝐴−1 is a lower triangular matrix.


Let’s print out the lower right hand corner of 𝐴−1 and stare at it.

with np.printoptions(precision=3, suppress=True):
    print(A_inv[72:,72:])

[[ 1. 0. 0. 0. 0. 0. 0. 0. ]
[ 1.53 1. 0. -0. -0. 0. -0. -0. ]
[ 1.441 1.53 1. 0. 0. 0. 0. 0. ]
[ 0.828 1.441 1.53 1. 0. 0. 0. 0. ]
[-0.031 0.828 1.441 1.53 1. 0. -0. -0. ]
[-0.792 -0.031 0.828 1.441 1.53 1. 0. 0. ]
[-1.184 -0.792 -0.031 0.828 1.441 1.53 1. 0. ]
[-1.099 -1.184 -0.792 -0.031 0.828 1.441 1.53 1. ]]

Notice how every row ends with the previous row’s pre-diagonal entries.
Since 𝐴−1 is lower triangular, each row represents 𝑦𝑡 for a particular 𝑡 as the sum of
• a time-dependent function 𝐴−1 𝑏 of the initial conditions incorporated in 𝑏, and
• a weighted sum of current and past values of the IID shocks {𝑢𝑡 }
Thus, let 𝐴 ̃ = 𝐴−1 .
Evidently, for 𝑡 ≥ 0,
$$
y_{t+1} = \sum_{i=1}^{t+1} \tilde{A}_{t+1,i} b_i + \sum_{i=1}^{t} \tilde{A}_{t+1,i} u_i + u_{t+1}
$$

This is a moving average representation with time-varying coefficients.
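Concretely, the moving average weights for any particular date can be read off a row of A_inv. The short sketch below uses an arbitrary date and reverses the row so the weights appear in the order of the current shock, then the first lag, and so on:

t = 9                               # an arbitrary date (0-based index)
ma_weights = A_inv[t, :t+1][::-1]   # weights on the current and lagged shocks
print(ma_weights[:5])               # leading weights: 1, 1.53, 1.441, 0.828, ... (cf. the printout above)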


Just as system (27.4) constitutes a moving average representation for 𝑦, system (27.3) constitutes an autoregressive
representation for 𝑦.

27.6 A forward looking model

Samuelson’s model is backwards looking in the sense that we give it initial conditions and let it run.
Let’s now turn to a model that is forward looking.
We apply similar linear algebra machinery to study a perfect foresight model widely used as a benchmark in macroeco-
nomics and finance.
As an example, we suppose that 𝑝𝑡 is the price of a stock and that 𝑦𝑡 is its dividend.
We assume that 𝑦𝑡 is determined by the second-order difference equation that we analyzed just above, so that

𝑦 = 𝐴−1 (𝑏 + 𝑢)

Our perfect foresight model of stock prices is


$$
p_t = \sum_{j=0}^{T-t} \beta^j y_{t+j}, \quad \beta \in (0, 1)
$$

where 𝛽 is a discount factor.


The model asserts that the price of the stock at 𝑡 equals the discounted present values of the (perfectly foreseen) future
dividends.
Form
$$
\underbrace{\begin{bmatrix} p_1 \\ p_2 \\ p_3 \\ \vdots \\ p_T \end{bmatrix}}_{\equiv p}
=
\underbrace{\begin{bmatrix}
1 & \beta & \beta^2 & \cdots & \beta^{T-1} \\
0 & 1 & \beta & \cdots & \beta^{T-2} \\
0 & 0 & 1 & \cdots & \beta^{T-3} \\
\vdots & \vdots & \vdots & \vdots & \vdots \\
0 & 0 & 0 & \cdots & 1
\end{bmatrix}}_{\equiv B}
\begin{bmatrix} y_1 \\ y_2 \\ y_3 \\ \vdots \\ y_T \end{bmatrix}
$$

β = .96

# construct B
B = np.zeros((T, T))

for i in range(T):
    B[i, i:] = β ** np.arange(0, T-i)

B

array([[1.        , 0.96      , 0.9216    , ..., 0.04314048, 0.04141486,
        0.03975826],
       [0.        , 1.        , 0.96      , ..., 0.044938  , 0.04314048,
        0.04141486],
       [0.        , 0.        , 1.        , ..., 0.04681041, 0.044938  ,
        0.04314048],
       ...,
       [0.        , 0.        , 0.        , ..., 1.        , 0.96      ,
        0.9216    ],
       [0.        , 0.        , 0.        , ..., 0.        , 1.        ,
        0.96      ],
       [0.        , 0.        , 0.        , ..., 0.        , 0.        ,
        1.        ]])

σu = 0.
u = np.random.normal(0, σu, size=T)
y = A_inv @ (b + u)
y_steady = A_inv @ (b_steady + u)

p = B @ y

plt.plot(np.arange(0, T)+1, y, label='y')


plt.plot(np.arange(0, T)+1, p, label='p')
plt.xlabel('t')
plt.ylabel('y/p')
plt.legend()

plt.show()
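As a check on the matrix formulation (a sketch), the same prices can be computed by brute force from the definition of p_t and compared with the product B @ y:

# p[t] = ∑_{j=0}^{T-1-t} β^j y[t+j]  (0-based indexing)
p_loop = np.array([sum(β**j * y[t+j] for j in range(T - t)) for t in range(T)])
print(np.allclose(p_loop, p))   # True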

Can you explain why the trend of the price is downward over time?
Also consider the case when 𝑦0 and 𝑦−1 are at the steady state.

p_steady = B @ y_steady

plt.plot(np.arange(0, T)+1, y_steady, label='y')


plt.plot(np.arange(0, T)+1, p_steady, label='p')
plt.xlabel('t')
plt.ylabel('y/p')
plt.legend()

plt.show()



Part VIII

Optimization


In this lecture, we will need the following library. Install ortools using pip.

!pip install ortools

CHAPTER

TWENTYEIGHT

LINEAR PROGRAMMING

28.1 Overview

Linear programming problems either maximize or minimize a linear objective function subject to a set of linear equality
and/or inequality constraints.
Linear programs come in pairs:
• an original primal problem, and
• an associated dual problem.
If a primal problem involves maximization, the dual problem involves minimization.
If a primal problem involves minimization, the dual problem involves maximization.
We provide a standard form of a linear program and methods to transform other forms of linear programming problems
into a standard form.
We show how to solve a linear programming problem using SciPy and Google OR-Tools.
We describe the important concept of complementary slackness and how it relates to the dual problem.
Let’s start with some standard imports.

import numpy as np
from ortools.linear_solver import pywraplp
from scipy.optimize import linprog
import matplotlib.pyplot as plt
from matplotlib.patches import Polygon
%matplotlib inline

Let’s start with some examples of linear programming problems.

28.2 Example 1: Production Problem

This example was created by [Ber97]


Suppose that a factory can produce two goods called Product 1 and Product 2.
To produce each product requires both material and labor.
Selling each product generates revenue.
Required per unit material and labor inputs and revenues are shown in table below:


            Product 1    Product 2
Material        2            5
Labor           4            2
Revenue         3            4

30 units of material and 20 units of labor are available.


A firm’s problem is to construct a production plan that uses its 30 units of materials and 20 units of labor to maximize its
revenue.
Let 𝑥𝑖 denote the quantity of Product 𝑖 that the firm produces.
This problem can be formulated as:

$$
\begin{aligned}
\max_{x_1, x_2} \quad & z = 3x_1 + 4x_2 \\
\text{subject to} \quad & 2x_1 + 5x_2 \le 30 \\
& 4x_1 + 2x_2 \le 20 \\
& x_1, x_2 \ge 0
\end{aligned}
$$

The following graph illustrates the firm’s constraints and iso-revenue lines.

The blue region is the feasible set within which all constraints are satisfied.
Parallel orange lines are iso-revenue lines.
The firm’s objective is to find the parallel orange lines to the upper boundary of the feasible set.
The intersection of the feasible set and the highest orange line delineates the optimal set.
In this example, the optimal set is the point (2.5, 5).
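Since the optimum lies where both resource constraints bind, we can confirm this point by hand; a quick sketch using NumPy to solve the two binding constraints as equalities:

import numpy as np

A_binding = np.array([[2, 5],
                      [4, 2]])
b_binding = np.array([30, 20])
x1, x2 = np.linalg.solve(A_binding, b_binding)
print(x1, x2)              # 2.5 5.0
print(3 * x1 + 4 * x2)     # revenue at the optimum: 27.5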


28.2.1 Computation: Using OR-Tools

Let’s try to solve the same problem using the package ortools.linear_solver
The following cell instantiates a solver and creates two variables specifying the range of values that they can have.

# Instantiate a GLOP(Google Linear Optimization Package) solver


solver = pywraplp.Solver.CreateSolver('GLOP')

Let us create two variables 𝑥1 and 𝑥2 such that they can only take nonnegative values.

# Create the two variables and let them take on any non-negative value.
x1 = solver.NumVar(0, solver.infinity(), 'x1')
x2 = solver.NumVar(0, solver.infinity(), 'x2')

Add the constraints to the problem.

# Constraint 1: 2x_1 + 5x_2 <= 30.0


solver.Add(2 * x1 + 5 * x2 <= 30.0)

# Constraint 2: 4x_1 + 2x_2 <= 20.0


solver.Add(4 * x1 + 2 * x2 <= 20.0)

<ortools.linear_solver.pywraplp.Constraint; proxy of <Swig Object of type


↪'operations_research::MPConstraint *' at 0x7f4930e9a970> >

Let’s specify the objective function. We use solver.Maximize method in the case when we want to maximize the
objective function and in the case of minimization we can use solver.Minimize.

# Objective function: 3x_1 + 4x_2


solver.Maximize(3 * x1 + 4 * x2)

Once we solve the problem, we can check whether the solver was successful in solving the problem using its status. If it is
successful, then the status will be equal to pywraplp.Solver.OPTIMAL.

# Solve the system.


status = solver.Solve()

if status == pywraplp.Solver.OPTIMAL:
print('Objective value =', solver.Objective().Value())
x1_sol = round(x1.solution_value(), 2)
x2_sol = round(x2.solution_value(), 2)
print(f'(x1, x2): ({x1_sol}, {x2_sol})')
else:
print('The problem does not have an optimal solution.')

Objective value = 27.5


(x1, x2): (2.5, 5.0)


28.3 Example 2: Investment Problem

We now consider a problem posed and solved by [Hu18].


A mutual fund has $100, 000 to be invested over a three year horizon.
Three investment options are available:
1. Annuity: the fund can pay the same amount of new capital at the beginning of each of three years and receive a
payoff of 130% of total capital invested at the end of the third year. Once the mutual fund decides to invest in
this annuity, it has to keep investing in all subsequent years in the three year horizon.
2. Bank account: the fund can deposit any amount into a bank at the beginning of each year and receive its capital
plus 6% interest at the end of that year. In addition, the mutual fund is permitted to borrow no more than $20,000
at the beginning of each year and is asked to pay back the amount borrowed plus 6% interest at the end of the year.
The mutual fund can choose whether to deposit or borrow at the beginning of each year.
3. Corporate bond: At the beginning of the second year, a corporate bond becomes available. The fund can buy an
amount that is no more than $50,000 of this bond at the beginning of the second year and at the end of the third
year receive a payout of 130% of the amount invested in the bond.
The mutual fund’s objective is to maximize total payout that it owns at the end of the third year.
We can formulate this as a linear programming problem.
Let 𝑥1 be the amount put in the annuity, 𝑥2 , 𝑥3 , 𝑥4 be bank deposit balances at the beginning of the three years, and
𝑥5 be the amount invested in the corporate bond.
When 𝑥2 , 𝑥3 , 𝑥4 are negative, it means that the mutual fund has borrowed from bank.
The table below shows the mutual fund’s decision variables together with the timing protocol described above:

                    Year 1    Year 2    Year 3
Annuity               𝑥1        𝑥1        𝑥1
Bank account          𝑥2        𝑥3        𝑥4
Corporate bond         0        𝑥5         0

The mutual fund’s decision making proceeds according to the following timing protocol:
1. At the beginning of the first year, the mutual fund decides how much to invest in the annuity and how much to
deposit in the bank. This decision is subject to the constraint:

𝑥1 + 𝑥2 = 100, 000

2. At the beginning of the second year, the mutual fund has a bank balance of 1.06𝑥2 . It must keep 𝑥1 in the annuity.
It can choose to put 𝑥5 into the corporate bond, and put 𝑥3 in the bank. These decisions are restricted by

𝑥1 + 𝑥5 = 1.06𝑥2 − 𝑥3

3. At the beginning of the third year, the mutual fund has a bank account balance equal to 1.06𝑥3 . It must again
invest 𝑥1 in the annuity, leaving it with a bank account balance equal to 𝑥4 . This situation is summarized by the
restriction:

𝑥1 = 1.06𝑥3 − 𝑥4


The mutual fund’s objective function, i.e., its wealth at the end of the third year is:

1.30 ⋅ 3𝑥1 + 1.06𝑥4 + 1.30𝑥5

Thus, the mutual fund confronts the linear program:

$$
\begin{aligned}
\max_{x} \quad & 1.30 \cdot 3x_1 + 1.06 x_4 + 1.30 x_5 \\
\text{subject to} \quad & x_1 + x_2 = 100{,}000 \\
& x_1 - 1.06 x_2 + x_3 + x_5 = 0 \\
& x_1 - 1.06 x_3 + x_4 = 0 \\
& x_2 \ge -20{,}000 \\
& x_3 \ge -20{,}000 \\
& x_4 \ge -20{,}000 \\
& x_5 \le 50{,}000 \\
& x_j \ge 0, \quad j = 1, 5 \\
& x_j \ \text{unrestricted}, \quad j = 2, 3, 4
\end{aligned}
$$

28.3.1 Computation: Using OR-Tools

Let’s try to solve the above problem using the package ortools.linear_solver.
The following cell instantiates a solver and creates two variables specifying the range of values that they can have.

# Instantiate a GLOP(Google Linear Optimization Package) solver


solver = pywraplp.Solver.CreateSolver('GLOP')

Let us create five variables 𝑥1 , 𝑥2 , 𝑥3 , 𝑥4 , and 𝑥5 such that they can only take the values defined in the above constraints.

# Create the variables using the ranges available from constraints


x1 = solver.NumVar(0, solver.infinity(), 'x1')
x2 = solver.NumVar(-20_000, solver.infinity(), 'x2')
x3 = solver.NumVar(-20_000, solver.infinity(), 'x3')
x4 = solver.NumVar(-20_000, solver.infinity(), 'x4')
x5 = solver.NumVar(0, 50_000, 'x5')

Add the constraints to the problem.

# Constraint 1: x_1 + x_2 = 100,000


solver.Add(x1 + x2 == 100_000.0)

# Constraint 2: x_1 - 1.06 * x_2 + x_3 + x_5 = 0


solver.Add(x1 - 1.06 * x2 + x3 + x5 == 0.0)

# Constraint 3: x_1 - 1.06 * x_3 + x_4 = 0


solver.Add(x1 - 1.06 * x3 + x4 == 0.0)

<ortools.linear_solver.pywraplp.Constraint; proxy of <Swig Object of type


↪'operations_research::MPConstraint *' at 0x7f4930e99050> >

Let’s specify the objective function.


# Objective function: 1.30 * 3 * x_1 + 1.06 * x_4 + 1.30 * x_5


solver.Maximize(1.30 * 3 * x1 + 1.06 * x4 + 1.30 * x5)

Let’s solve the problem and check the status using pywraplp.Solver.OPTIMAL.

# Solve the system.


status = solver.Solve()

if status == pywraplp.Solver.OPTIMAL:
print('Objective value =', solver.Objective().Value())
x1_sol = round(x1.solution_value(), 3)
x2_sol = round(x2.solution_value(), 3)
x3_sol = round(x1.solution_value(), 3)
x4_sol = round(x2.solution_value(), 3)
x5_sol = round(x1.solution_value(), 3)
print(f'(x1, x2, x3, x4, x5): ({x1_sol}, {x2_sol}, {x3_sol}, {x4_sol}, {x5_sol})')
else:
print('The problem does not have an optimal solution.')

Objective value = 141018.24349792692


(x1, x2, x3, x4, x5): (24927.755, 75072.245, 24927.755, 75072.245, 24927.755)

OR-Tools tells us that the best investment strategy is:


1. At the beginning of the first year, the mutual fund should buy $24,927.755 of the annuity. Its bank account balance
should be $75,072.245.
2. At the beginning of the second year, the mutual fund should buy $24,927.755 of the corporate bond and keep
investing in the annuity. Its bank balance should be $24,927.755.
3. At the beginning of the third year, the bank balance should be $75,072.245.
4. At the end of the third year, the mutual fund will get payouts from the annuity and corporate bond and repay its
loan from the bank. At the end it will own $141,018.24, so that its total net rate of return over the three periods is
41.02%.

28.4 Standard form

For purposes of
• unifying linear programs that are initially stated in superficially different forms, and
• having a form that is convenient to put into black-box software packages,
it is useful to devote some effort to describe a standard form.
Our standard form is:
$$
\begin{aligned}
\min_{x} \quad & c_1 x_1 + c_2 x_2 + \cdots + c_n x_n \\
\text{subject to} \quad & a_{11} x_1 + a_{12} x_2 + \cdots + a_{1n} x_n = b_1 \\
& a_{21} x_1 + a_{22} x_2 + \cdots + a_{2n} x_n = b_2 \\
& \qquad\qquad \vdots \\
& a_{m1} x_1 + a_{m2} x_2 + \cdots + a_{mn} x_n = b_m \\
& x_1, x_2, \dots, x_n \ge 0
\end{aligned}
$$


Let
$$
A = \begin{bmatrix} a_{11} & a_{12} & \dots & a_{1n} \\ a_{21} & a_{22} & \dots & a_{2n} \\ \vdots & & & \vdots \\ a_{m1} & a_{m2} & \dots & a_{mn} \end{bmatrix}, \quad
b = \begin{bmatrix} b_1 \\ b_2 \\ \vdots \\ b_m \end{bmatrix}, \quad
c = \begin{bmatrix} c_1 \\ c_2 \\ \vdots \\ c_n \end{bmatrix}, \quad
x = \begin{bmatrix} x_1 \\ x_2 \\ \vdots \\ x_n \end{bmatrix}.
$$

The standard form LP problem can be expressed concisely as:

$$
\begin{aligned}
\min_{x} \quad & c'x \\
\text{subject to} \quad & Ax = b \\
& x \ge 0
\end{aligned}
\tag{28.1}
$$

Here, 𝐴𝑥 = 𝑏 means that the 𝑖-th entry of 𝐴𝑥 equals the 𝑖-th entry of 𝑏 for every 𝑖.
Similarly, 𝑥 ≥ 0 means that 𝑥𝑗 is greater than or equal to 0 for every 𝑗.

28.4.1 Useful transformations

It is useful to know how to transform a problem that initially is not stated in the standard form into one that is.
By deploying the following steps, any linear programming problem can be transformed into an equivalent standard form
linear programming problem.
1. Objective Function: If a problem is originally a constrained maximization problem, we can construct a new
objective function that is the additive inverse of the original objective function. The transformed problem is then a
minimization problem.
2. Decision Variables: Given a variable 𝑥𝑗 satisfying 𝑥𝑗 ≤ 0, we can introduce a new variable 𝑥′𝑗 = −𝑥𝑗 and
substitute it into the original problem. Given a free variable 𝑥𝑗 with no restriction on its sign, we can introduce two
new variables $x_j^+$ and $x_j^-$ satisfying $x_j^+, x_j^- \ge 0$ and replace $x_j$ by $x_j^+ - x_j^-$.
3. Inequality constraints: Given an inequality constraint $\sum_{j=1}^n a_{ij} x_j \le b_i$, we can introduce a new variable $s_i$,
called a slack variable, that satisfies $s_i \ge 0$ and replace the original constraint by $\sum_{j=1}^n a_{ij} x_j + s_i = b_i$.
Let’s apply the above steps to the two examples described above.

28.4.2 Example 1: Production Problem

The original problem is:

$$
\begin{aligned}
\max_{x_1, x_2} \quad & 3x_1 + 4x_2 \\
\text{subject to} \quad & 2x_1 + 5x_2 \le 30 \\
& 4x_1 + 2x_2 \le 20 \\
& x_1, x_2 \ge 0
\end{aligned}
$$

This problem is equivalent to the following problem with a standard form:

$$
\begin{aligned}
\min_{x_1, x_2} \quad & -(3x_1 + 4x_2) \\
\text{subject to} \quad & 2x_1 + 5x_2 + s_1 = 30 \\
& 4x_1 + 2x_2 + s_2 = 20 \\
& x_1, x_2, s_1, s_2 \ge 0
\end{aligned}
$$
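To confirm that the transformation changes nothing essential, the standard form above can be handed directly to a solver; a sketch using scipy.optimize.linprog with equality constraints and the slack variables s1, s2 appended to the decision vector:

import numpy as np
from scipy.optimize import linprog

# Decision vector: (x1, x2, s1, s2); minimize -(3x1 + 4x2)
c = np.array([-3, -4, 0, 0])
A_eq = np.array([[2, 5, 1, 0],
                 [4, 2, 0, 1]])
b_eq = np.array([30, 20])

res = linprog(c, A_eq=A_eq, b_eq=b_eq)   # default bounds are x >= 0
print(-res.fun)      # 27.5
print(res.x[:2])     # [2.5 5. ]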


28.4.3 Computation: Using SciPy

The package scipy.optimize provides a function linprog to solve linear programming problems with a form below:

$$
\begin{aligned}
\min_{x} \quad & c'x \\
\text{subject to} \quad & A_{ub} x \le b_{ub} \\
& A_{eq} x = b_{eq} \\
& l \le x \le u
\end{aligned}
$$

Note: By default 𝑙 = 0 and 𝑢 = None unless explicitly specified with the argument ‘bounds’.

Let’s now try to solve the Problem 1 using SciPy.

# Construct parameters
c_ex1 = np.array([3, 4])

# Inequality constraints
A_ex1 = np.array([[2, 5],
[4, 2]])
b_ex1 = np.array([30,20])

Once we solve the problem, we can check whether the solver was successful in solving the problem using the boolean
attribute success. If it’s successful, then the success attribute is set to True.

# Solve the problem


# we put a negative sign on the objective as linprog does minimization
res_ex1 = linprog(-c_ex1, A_ub=A_ex1, b_ub=b_ex1)

if res_ex1.success:
# We use negative sign to get the optimal value (maximized value)
print('Optimal Value:', -res_ex1.fun)
print(f'(x1, x2): {res_ex1.x[0], res_ex1.x[1]}')
else:
print('The problem does not have an optimal solution.')

Optimal Value: 27.5


(x1, x2): (2.5, 5.0)

The optimal plan tells the factory to produce 2.5 units of Product 1 and 5 units of Product 2; that generates a maximizing
value of revenue of 27.5.
We are using the linprog function as a black box.
Inside it, Python first transforms the problem into standard form.
To do that, for each inequality constraint it generates one slack variable.
Here the vector of slack variables is a two-dimensional NumPy array that equals 𝑏𝑢𝑏 − 𝐴𝑢𝑏 𝑥.
See the official documentation for more details.
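The slack values are also reported on the result object; a quick check (a sketch) that they match the hand computation of b_ub − A_ub x:

print(res_ex1.slack)                 # slack in each inequality constraint
print(b_ex1 - A_ex1 @ res_ex1.x)     # the same values, computed by hand

Both constraints bind at the optimum, so both slack values are (numerically) zero.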

Note: This problem is to maximize the objective, so that we need to put a minus sign in front of parameter vector c.


28.4.4 Example 2: Investment Problem

The original problem is:

$$
\begin{aligned}
\max_{x} \quad & 1.30 \cdot 3x_1 + 1.06 x_4 + 1.30 x_5 \\
\text{subject to} \quad & x_1 + x_2 = 100{,}000 \\
& x_1 - 1.06 x_2 + x_3 + x_5 = 0 \\
& x_1 - 1.06 x_3 + x_4 = 0 \\
& x_2 \ge -20{,}000 \\
& x_3 \ge -20{,}000 \\
& x_4 \ge -20{,}000 \\
& x_5 \le 50{,}000 \\
& x_j \ge 0, \quad j = 1, 5 \\
& x_j \ \text{unrestricted}, \quad j = 2, 3, 4
\end{aligned}
$$

This problem is equivalent to the following problem with a standard form:

$$
\begin{aligned}
\min_{x} \quad & -(1.30 \cdot 3x_1 + 1.06 x_4^+ - 1.06 x_4^- + 1.30 x_5) \\
\text{subject to} \quad & x_1 + x_2^+ - x_2^- = 100{,}000 \\
& x_1 - 1.06 (x_2^+ - x_2^-) + x_3^+ - x_3^- + x_5 = 0 \\
& x_1 - 1.06 (x_3^+ - x_3^-) + x_4^+ - x_4^- = 0 \\
& x_2^- - x_2^+ + s_1 = 20{,}000 \\
& x_3^- - x_3^+ + s_2 = 20{,}000 \\
& x_4^- - x_4^+ + s_3 = 20{,}000 \\
& x_5 + s_4 = 50{,}000 \\
& x_j \ge 0, \quad j = 1, 5 \\
& x_j^+, x_j^- \ge 0, \quad j = 2, 3, 4 \\
& s_j \ge 0, \quad j = 1, 2, 3, 4
\end{aligned}
$$

# Construct parameters
rate = 1.06

# Objective function parameters


c_ex2 = np.array([1.30*3, 0, 0, 1.06, 1.30])

# Inequality constraints
A_ex2 = np.array([[1, 1, 0, 0, 0],
[1, -rate, 1, 0, 1],
[1, 0, -rate, 1, 0]])
b_ex2 = np.array([100000, 0, 0])

# Bounds on decision variables


bounds_ex2 = [( 0, None),
(-20000, None),
(-20000, None),
(-20000, None),
( 0, 50000)]

Let’s solve the problem and check the status using success attribute.


# Solve the problem


res_ex2 = linprog(-c_ex2, A_eq=A_ex2, b_eq=b_ex2,
bounds=bounds_ex2)

if res_ex2.success:
# We use negative sign to get the optimal value (maximized value)
print('Optimal Value:', -res_ex2.fun)
x1_sol = round(res_ex2.x[0], 3)
x2_sol = round(res_ex2.x[1], 3)
x3_sol = round(res_ex2.x[2], 3)
x4_sol = round(res_ex2.x[3], 3)
x5_sol = round(res_ex2.x[4], 3)
print(f'(x1, x2, x3, x4, x5): {x1_sol, x2_sol, x3_sol, x4_sol, x5_sol}')
else:
print('The problem does not have an optimal solution.')

Optimal Value: 141018.24349792697


(x1, x2, x3, x4, x5): (24927.755, 75072.245, 4648.825, -20000.0, 50000.0)

SciPy tells us that the best investment strategy is:


1. At the beginning of the first year, the mutual fund should buy $24,927.75 of the annuity. Its bank account balance
should be $75,072.25.
2. At the beginning of the second year, the mutual fund should buy $50,000 of the corporate bond and keep investing
in the annuity. Its bank account balance should be $4,648.83.
3. At the beginning of the third year, the mutual fund should borrow $20,000 from the bank and invest in the annuity.
4. At the end of the third year, the mutual fund will get payouts from the annuity and corporate bond and repay its
loan from the bank. At the end it will own $141,018.24, so that its total net rate of return over the three periods is
41.02%.

Note: You might notice that the optimal solutions from OR-Tools and SciPy differ, but the optimal value is the same.
This is because the same problem can have many optimal solutions.

28.5 Exercises

Exercise 28.5.1
Implement a new extended solution for Problem 1 where the factory owner decides that the number of units of Product
1 should not be less than the number of units of Product 2.

Solution to Exercise 28.5.1


So we can reformulate the problem as:

$$
\begin{aligned}
\max_{x_1, x_2} \quad & z = 3x_1 + 4x_2 \\
\text{subject to} \quad & 2x_1 + 5x_2 \le 30 \\
& 4x_1 + 2x_2 \le 20 \\
& x_1 \ge x_2 \\
& x_1, x_2 \ge 0
\end{aligned}
$$

# Instantiate a GLOP(Google Linear Optimization Package) solver


solver = pywraplp.Solver.CreateSolver('GLOP')

# Create the two variables and let them take on any non-negative value.
x1 = solver.NumVar(0, solver.infinity(), 'x1')
x2 = solver.NumVar(0, solver.infinity(), 'x2')

# Constraint 1: 2x_1 + 5x_2 <= 30.0


solver.Add(2 * x1 + 5 * x2 <= 30.0)

# Constraint 2: 4x_1 + 2x_2 <= 20.0


solver.Add(4 * x1 + 2 * x2 <= 20.0)

# Constraint 3: x_1 >= x_2


solver.Add(x1 >= x2)

<ortools.linear_solver.pywraplp.Constraint; proxy of <Swig Object of type


↪'operations_research::MPConstraint *' at 0x7f4930e9aa90> >

# Objective function: 3x_1 + 4x_2


solver.Maximize(3 * x1 + 4 * x2)

# Solve the system.


status = solver.Solve()

if status == pywraplp.Solver.OPTIMAL:
print('Objective value =', solver.Objective().Value())
x1_sol = round(x1.solution_value(), 2)
x2_sol = round(x2.solution_value(), 2)
print(f'(x1, x2): ({x1_sol}, {x2_sol})')
else:
print('The problem does not have an optimal solution.')

Objective value = 23.333333333333336


(x1, x2): (3.33, 3.33)

Exercise 28.5.2
A carpenter manufactures 2 products - 𝐴 and 𝐵.
Product 𝐴 generates a profit of 23 and product 𝐵 generates a profit of 10.
It takes 2 hours for the carpenter to produce 𝐴 and 0.8 hours to produce 𝐵.


Moreover, he can’t spend more than 25 hours per week and the total number of units of 𝐴 and 𝐵 should not be greater
than 20.
Find the number of units of 𝐴 and product 𝐵 that he should manufacture in order to maximise his profit.

Solution to Exercise 28.5.2


Let us assume the carpenter produces 𝑥 units of 𝐴 and 𝑦 units of 𝐵.
So we can formulate the problem as:

$$
\begin{aligned}
\max_{x, y} \quad & z = 23x + 10y \\
\text{subject to} \quad & x + y \le 20 \\
& 2x + 0.8y \le 25
\end{aligned}
$$

# Instantiate a GLOP(Google Linear Optimization Package) solver


solver = pywraplp.Solver.CreateSolver('GLOP')

Let us create two variables 𝑥 and 𝑦 such that they can only take nonnegative values.

# Create the two variables and let them take on any non-negative value.
x = solver.NumVar(0, solver.infinity(), 'x')
y = solver.NumVar(0, solver.infinity(), 'y')

# Constraint 1: x + y <= 20.0


solver.Add(x + y <= 20.0)

# Constraint 2: 2x + 0.8y <= 25.0


solver.Add(2 * x + 0.8 * y <= 25.0)

<ortools.linear_solver.pywraplp.Constraint; proxy of <Swig Object of type


↪'operations_research::MPConstraint *' at 0x7f4930e98c30> >

# Objective function: 23x + 10y


solver.Maximize(23 * x + 10 * y)

# Solve the system.


status = solver.Solve()

if status == pywraplp.Solver.OPTIMAL:
print('Maximum Profit =', solver.Objective().Value())
x_sol = round(x.solution_value(), 3)
y_sol = round(y.solution_value(), 3)
print(f'(x, y): ({x_sol}, {y_sol})')
else:
print('The problem does not have an optimal solution.')

Maximum Profit = 297.5


(x, y): (7.5, 12.5)



CHAPTER

TWENTYNINE

SHORTEST PATHS

Contents

• Shortest Paths
– Overview
– Outline of the problem
– Finding least-cost paths
– Solving for minimum cost-to-go
– Exercises

29.1 Overview

The shortest path problem is a classic problem in mathematics and computer science with applications in
• Economics (sequential decision making, analysis of social networks, etc.)
• Operations research and transportation
• Robotics and artificial intelligence
• Telecommunication network design and routing
• etc., etc.
Variations of the methods we discuss in this lecture are used millions of times every day, in applications such as
• Google Maps
• routing packets on the internet
For us, the shortest path problem also provides a nice introduction to the logic of dynamic programming.
Dynamic programming is an extremely powerful optimization technique that we apply in many lectures on this site.
The only scientific library we’ll need in what follows is NumPy:

import numpy as np


29.2 Outline of the problem

The shortest path problem is one of finding how to traverse a graph from one specified node to another at minimum cost.
Consider the following graph

We wish to travel from node (vertex) A to node G at minimum cost


• Arrows (edges) indicate the movements we can take.
• Numbers on edges indicate the cost of traveling that edge.
(Graphs such as the one above are called weighted directed graphs.)
Possible interpretations of the graph include
• Minimum cost for supplier to reach a destination.
• Routing of packets on the internet (minimize time).
• Etc., etc.
For this simple graph, a quick scan of the edges shows that the optimal paths are
• A, C, F, G at cost 8
• A, D, F, G at cost 8


29.3 Finding least-cost paths

For large graphs, we need a systematic solution.


Let 𝐽 (𝑣) denote the minimum cost-to-go from node 𝑣, understood as the total cost from 𝑣 if we take the best route.
Suppose that we know 𝐽 (𝑣) for each node 𝑣, as shown below for the graph from the preceding example

Note that 𝐽 (𝐺) = 0.


The best path can now be found as follows
1. Start at node 𝑣 = 𝐴
2. From current node 𝑣, move to any node that solves
$$
\min_{w \in F_v} \{c(v, w) + J(w)\} \tag{29.1}
$$

where
• 𝐹𝑣 is the set of nodes that can be reached from 𝑣 in one step.
• 𝑐(𝑣, 𝑤) is the cost of traveling from 𝑣 to 𝑤.
Hence, if we know the function 𝐽 , then finding the best path is almost trivial.
But how can we find the cost-to-go function 𝐽 ?


Some thought will convince you that, for every node 𝑣, the function 𝐽 satisfies

$$
J(v) = \min_{w \in F_v} \{c(v, w) + J(w)\} \tag{29.2}
$$

This is known as the Bellman equation, after the mathematician Richard Bellman.
The Bellman equation can be thought of as a restriction that 𝐽 must satisfy.
What we want to do now is use this restriction to compute 𝐽 .

29.4 Solving for minimum cost-to-go

Let’s look at an algorithm for computing 𝐽 and then think about how to implement it.

29.4.1 The algorithm

The standard algorithm for finding 𝐽 is to start an initial guess and then iterate.
This is a standard approach to solving nonlinear equations, often called the method of successive approximations.
Our initial guess will be

𝐽0 (𝑣) = 0 for all 𝑣 (29.3)

Now
1. Set 𝑛 = 0
2. Set 𝐽𝑛+1 (𝑣) = min𝑤∈𝐹𝑣 {𝑐(𝑣, 𝑤) + 𝐽𝑛 (𝑤)} for all 𝑣
3. If 𝐽𝑛+1 and 𝐽𝑛 are not equal then increment 𝑛, go to 2
This sequence converges to 𝐽 .
Although we omit the proof, we’ll prove similar claims in our other lectures on dynamic programming.

29.4.2 Implementation

Having an algorithm is a good start, but we also need to think about how to implement it on a computer.
First, for the cost function 𝑐, we’ll implement it as a matrix 𝑄, where a typical element is

$$
Q(v, w) = \begin{cases} c(v, w) & \text{if } w \in F_v \\ +\infty & \text{otherwise} \end{cases}
$$

In this context 𝑄 is usually called the distance matrix.


We’re also numbering the nodes now, with 𝐴 = 0, so, for example

𝑄(1, 2) = the cost of traveling from B to C

For example, for the simple graph above, we set


from numpy import inf

Q = np.array([[inf, 1,   5,   3,   inf, inf, inf],
              [inf, inf, inf, 9,   6,   inf, inf],
              [inf, inf, inf, inf, inf, 2,   inf],
              [inf, inf, inf, inf, inf, 4,   8],
              [inf, inf, inf, inf, inf, inf, 4],
              [inf, inf, inf, inf, inf, inf, 1],
              [inf, inf, inf, inf, inf, inf, 0]])

Notice that the cost of staying still (on the principal diagonal) is set to
• np.inf for non-destination nodes — moving on is required.
• 0 for the destination node — here is where we stop.
For the sequence of approximations {𝐽𝑛 } of the cost-to-go functions, we can use NumPy arrays.
Let’s try with this example and see how we go:

nodes = range(7) # Nodes = 0, 1, ..., 6


J = np.zeros_like(nodes, dtype=int) # Initial guess
next_J = np.empty_like(nodes, dtype=int) # Stores updated guess

max_iter = 500
i = 0

while i < max_iter:
    for v in nodes:
        # minimize Q[v, w] + J[w] over all choices of w
        lowest_cost = inf
        for w in nodes:
            cost = Q[v, w] + J[w]
            if cost < lowest_cost:
                lowest_cost = cost
        next_J[v] = lowest_cost
    if np.equal(next_J, J).all():
        break
    else:
        J[:] = next_J  # Copy contents of next_J to J
        i += 1

print("The cost-to-go function is", J)

The cost-to-go function is [ 8 10 3 5 4 1 0]

This matches with the numbers we obtained by inspection above.


But, importantly, we now have a methodology for tackling large graphs.
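The same iteration can also be written more compactly with NumPy broadcasting, since (Q + J)[v, w] equals c(v, w) + J(w); a sketch:

def compute_cost_to_go(Q, max_iter=500):
    J = np.zeros(Q.shape[0])
    for _ in range(max_iter):
        next_J = np.min(Q + J, axis=1)   # minimize over w for every v at once
        if np.allclose(next_J, J):
            break
        J = next_J
    return J

print(compute_cost_to_go(Q))   # [ 8. 10.  3.  5.  4.  1.  0.]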


29.5 Exercises

Exercise 29.5.1
The text below describes a weighted directed graph.
The line node0, node1 0.04, node8 11.11, node14 72.21 means that from node0 we can go to
• node1 at cost 0.04
• node8 at cost 11.11
• node14 at cost 72.21
No other nodes can be reached directly from node0.
Other lines have a similar interpretation.
Your task is to use the algorithm given above to find the optimal path and its cost.

Note: You will be dealing with floating point numbers now, rather than integers, so consider replacing np.equal()
with np.allclose().

%%file graph.txt
node0, node1 0.04, node8 11.11, node14 72.21
node1, node46 1247.25, node6 20.59, node13 64.94
node2, node66 54.18, node31 166.80, node45 1561.45
node3, node20 133.65, node6 2.06, node11 42.43
node4, node75 3706.67, node5 0.73, node7 1.02
node5, node45 1382.97, node7 3.33, node11 34.54
node6, node31 63.17, node9 0.72, node10 13.10
node7, node50 478.14, node9 3.15, node10 5.85
node8, node69 577.91, node11 7.45, node12 3.18
node9, node70 2454.28, node13 4.42, node20 16.53
node10, node89 5352.79, node12 1.87, node16 25.16
node11, node94 4961.32, node18 37.55, node20 65.08
node12, node84 3914.62, node24 34.32, node28 170.04
node13, node60 2135.95, node38 236.33, node40 475.33
node14, node67 1878.96, node16 2.70, node24 38.65
node15, node91 3597.11, node17 1.01, node18 2.57
node16, node36 392.92, node19 3.49, node38 278.71
node17, node76 783.29, node22 24.78, node23 26.45
node18, node91 3363.17, node23 16.23, node28 55.84
node19, node26 20.09, node20 0.24, node28 70.54
node20, node98 3523.33, node24 9.81, node33 145.80
node21, node56 626.04, node28 36.65, node31 27.06
node22, node72 1447.22, node39 136.32, node40 124.22
node23, node52 336.73, node26 2.66, node33 22.37
node24, node66 875.19, node26 1.80, node28 14.25
node25, node70 1343.63, node32 36.58, node35 45.55
node26, node47 135.78, node27 0.01, node42 122.00
node27, node65 480.55, node35 48.10, node43 246.24
node28, node82 2538.18, node34 21.79, node36 15.52
node29, node64 635.52, node32 4.22, node33 12.61
node30, node98 2616.03, node33 5.61, node35 13.95
node31, node98 3350.98, node36 20.44, node44 125.88
node32, node97 2613.92, node34 3.33, node35 1.46
node33, node81 1854.73, node41 3.23, node47 111.54
node34, node73 1075.38, node42 51.52, node48 129.45
node35, node52 17.57, node41 2.09, node50 78.81
node36, node71 1171.60, node54 101.08, node57 260.46
node37, node75 269.97, node38 0.36, node46 80.49
node38, node93 2767.85, node40 1.79, node42 8.78
node39, node50 39.88, node40 0.95, node41 1.34
node40, node75 548.68, node47 28.57, node54 53.46
node41, node53 18.23, node46 0.28, node54 162.24
node42, node59 141.86, node47 10.08, node72 437.49
node43, node98 2984.83, node54 95.06, node60 116.23
node44, node91 807.39, node46 1.56, node47 2.14
node45, node58 79.93, node47 3.68, node49 15.51
node46, node52 22.68, node57 27.50, node67 65.48
node47, node50 2.82, node56 49.31, node61 172.64
node48, node99 2564.12, node59 34.52, node60 66.44
node49, node78 53.79, node50 0.51, node56 10.89
node50, node85 251.76, node53 1.38, node55 20.10
node51, node98 2110.67, node59 23.67, node60 73.79
node52, node94 1471.80, node64 102.41, node66 123.03
node53, node72 22.85, node56 4.33, node67 88.35
node54, node88 967.59, node59 24.30, node73 238.61
node55, node84 86.09, node57 2.13, node64 60.80
node56, node76 197.03, node57 0.02, node61 11.06
node57, node86 701.09, node58 0.46, node60 7.01
node58, node83 556.70, node64 29.85, node65 34.32
node59, node90 820.66, node60 0.72, node71 0.67
node60, node76 48.03, node65 4.76, node67 1.63
node61, node98 1057.59, node63 0.95, node64 4.88
node62, node91 132.23, node64 2.94, node76 38.43
node63, node66 4.43, node72 70.08, node75 56.34
node64, node80 47.73, node65 0.30, node76 11.98
node65, node94 594.93, node66 0.64, node73 33.23
node66, node98 395.63, node68 2.66, node73 37.53
node67, node82 153.53, node68 0.09, node70 0.98
node68, node94 232.10, node70 3.35, node71 1.66
node69, node99 247.80, node70 0.06, node73 8.99
node70, node76 27.18, node72 1.50, node73 8.37
node71, node89 104.50, node74 8.86, node91 284.64
node72, node76 15.32, node84 102.77, node92 133.06
node73, node83 52.22, node76 1.40, node90 243.00
node74, node81 1.07, node76 0.52, node78 8.08
node75, node92 68.53, node76 0.81, node77 1.19
node76, node85 13.18, node77 0.45, node78 2.36
node77, node80 8.94, node78 0.98, node86 64.32
node78, node98 355.90, node81 2.59
node79, node81 0.09, node85 1.45, node91 22.35
node80, node92 121.87, node88 28.78, node98 264.34
node81, node94 99.78, node89 39.52, node92 99.89
node82, node91 47.44, node88 28.05, node93 11.99
node83, node94 114.95, node86 8.75, node88 5.78
node84, node89 19.14, node94 30.41, node98 121.05
node85, node97 94.51, node87 2.66, node89 4.90
node86, node97 85.09
node87, node88 0.21, node91 11.14, node92 21.23
node88, node93 1.31, node91 6.83, node98 6.12

node89, node97 36.97, node99 82.12
node90, node96 23.53, node94 10.47, node99 50.99
node91, node97 22.17
node92, node96 10.83, node97 11.24, node99 34.68
node93, node94 0.19, node97 6.71, node99 32.77
node94, node98 5.91, node96 2.03
node95, node98 6.17, node99 0.27
node96, node98 3.32, node97 0.43, node99 5.87
node97, node98 0.30
node98, node99 0.33
node99,

Overwriting graph.txt

Solution to Exercise 29.5.1


First let’s write a function that reads in the graph data above and builds a distance matrix.

num_nodes = 100
destination_node = 99

def map_graph_to_distance_matrix(in_file):

    # First let's set up the distance matrix Q with inf everywhere
    Q = np.full((num_nodes, num_nodes), np.inf)

    # Now we read in the data and modify Q
    with open(in_file) as infile:
        for line in infile:
            elements = line.split(',')
            node = elements.pop(0)
            node = int(node[4:])                 # convert node description to integer
            if node != destination_node:
                for element in elements:
                    destination, cost = element.split()
                    destination = int(destination[4:])
                    Q[node, destination] = float(cost)
    Q[destination_node, destination_node] = 0
    return Q

In addition, let’s write


1. a “Bellman operator” function that takes a distance matrix and current guess of J and returns an updated guess of
J, and
2. a function that takes a distance matrix and returns a cost-to-go function.
We’ll use the algorithm described above.
The minimization step is vectorized to make it faster.

def bellman(J, Q):

    num_nodes = Q.shape[0]
    next_J = np.empty_like(J)
    for v in range(num_nodes):
        next_J[v] = np.min(Q[v, :] + J)
    return next_J


def compute_cost_to_go(Q):
    num_nodes = Q.shape[0]
    J = np.zeros(num_nodes)      # Initial guess
    max_iter = 500
    i = 0

    while i < max_iter:
        next_J = bellman(J, Q)
        if np.allclose(next_J, J):
            break
        else:
            J[:] = next_J        # Copy contents of next_J to J
            i += 1

    return J

We used np.allclose() rather than testing exact equality because we are dealing with floating point numbers now.
Finally, here’s a function that uses the cost-to-go function to obtain the optimal path (and its cost).

def print_best_path(J, Q):

    sum_costs = 0
    current_node = 0
    while current_node != destination_node:
        print(current_node)
        # Move to the next node and increment costs
        next_node = np.argmin(Q[current_node, :] + J)
        sum_costs += Q[current_node, next_node]
        current_node = next_node

    print(destination_node)
    print('Cost: ', sum_costs)

Okay, now we have the necessary functions, let’s call them to do the job we were assigned.

Q = map_graph_to_distance_matrix('graph.txt')
J = compute_cost_to_go(Q)
print_best_path(J, Q)

0
8
11
18
23
33
41
53
56
57
60
67
70
73
76
85
87
88
93
94
96
97
98
99
Cost: 160.55000000000007

The total cost of the path should agree with 𝐽 [0] so let’s check this.

J[0]

160.55



Part IX

Modeling in Higher Dimensions

CHAPTER

THIRTY

THE PERRON-FROBENIUS THEOREM

Contents

• The Perron-Frobenius Theorem


– Nonnegative matrices
– Exercises

In addition to what’s in Anaconda, this lecture will need the following libraries:

!pip install quantecon

In this lecture we will begin with the foundational concepts in spectral theory.
Then we will explore the Perron-Frobenius Theorem and connect it to applications in Markov chains and networks.
We will use the following imports:

import matplotlib.pyplot as plt


import numpy as np
from numpy.linalg import eig
import scipy as sp
import quantecon as qe

30.1 Nonnegative matrices

Often, in economics, the matrix that we are dealing with is nonnegative.


Nonnegative matrices have several special and useful properties.
In this section we will discuss some of them — in particular, the connection between nonnegativity and eigenvalues.
An 𝑛 × 𝑚 matrix 𝐴 is called nonnegative if every element of 𝐴 is nonnegative, i.e., 𝑎𝑖𝑗 ≥ 0 for every 𝑖, 𝑗.
We denote this as 𝐴 ≥ 0.


30.1.1 Irreducible matrices

We introduced irreducible matrices in the Markov chain lecture.


Here we generalize this concept:
Let $a^k_{ij}$ be element $(i,j)$ of $A^k$.
An $n \times n$ nonnegative matrix $A$ is called irreducible if $A + A^2 + A^3 + \cdots \gg 0$, where $\gg 0$ indicates that every element of the sum is strictly positive.
In other words, for each $i, j$ with $1 \le i, j \le n$, there exists a $k \ge 0$ such that $a^k_{ij} > 0$.
Here are some examples to illustrate this further:

$$A = \begin{bmatrix} 0.5 & 0.1 \\ 0.2 & 0.2 \end{bmatrix}$$

$A$ is irreducible since $a_{ij} > 0$ for all $(i,j)$.

$$B = \begin{bmatrix} 0 & 1 \\ 1 & 0 \end{bmatrix}, \qquad B^2 = \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix}$$

$B$ is irreducible since $B + B^2$ is a matrix of ones.

$$C = \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix}$$

$C$ is not irreducible since $C^k = C$ for all $k \ge 0$ and thus $c^k_{12}, c^k_{21} = 0$ for all $k \ge 0$.
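These examples can also be checked numerically. The helper below is a minimal sketch (an addition to the lecture text) that tests irreducibility directly from the definition by summing matrix powers; the function name is ours.

import numpy as np

def is_irreducible(A):
    "Test irreducibility of a nonnegative square matrix by summing its powers."
    n = A.shape[0]
    S = np.zeros((n, n))
    A_k = np.eye(n)
    for _ in range(n):          # powers A^1, ..., A^n are enough for this check
        A_k = A_k @ A
        S += A_k
    return bool(np.all(S > 0))

A = np.array([[0.5, 0.1],
              [0.2, 0.2]])
B = np.array([[0, 1],
              [1, 0]])
C = np.eye(2)

print(is_irreducible(A), is_irreducible(B), is_irreducible(C))   # True True False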

30.1.2 Left eigenvectors

Recall that we previously discussed eigenvectors in Eigenvalues and Eigenvectors.


In particular, 𝜆 is an eigenvalue of 𝐴 and 𝑣 is an eigenvector of 𝐴 if 𝑣 is nonzero and satisfies

𝐴𝑣 = 𝜆𝑣.

In this section we introduce left eigenvectors.


To avoid confusion, what we previously referred to as “eigenvectors” will be called “right eigenvectors”.
Left eigenvectors will play important roles in what follows, including that of stochastic steady states for dynamic models
under a Markov assumption.
A vector 𝑤 is called a left eigenvector of 𝐴 if 𝑤 is a right eigenvector of 𝐴⊤.
In other words, if 𝑤 is a left eigenvector of matrix 𝐴, then 𝐴⊤𝑤 = 𝜆𝑤, where 𝜆 is the eigenvalue associated with the left eigenvector 𝑤.
This hints at how to compute left eigenvectors:

A = np.array([[3, 2],
[1, 4]])

# Compute eigenvalues and right eigenvectors


λ, v = eig(A)

# Compute eigenvalues and left eigenvectors


λ, w = eig(A.T)

# Keep 5 decimals
np.set_printoptions(precision=5)

print(f"The eigenvalues of A are:\n {λ}\n")


print(f"The corresponding right eigenvectors are: \n {v[:,0]} and {-v[:,1]}\n")
print(f"The corresponding left eigenvectors are: \n {w[:,0]} and {-w[:,1]}\n")

The eigenvalues of A are:


[2. 5.]

The corresponding right eigenvectors are:


[-0.89443 0.44721] and [0.70711 0.70711]

The corresponding left eigenvectors are:


[-0.70711 0.70711] and [0.44721 0.89443]

We can also use scipy.linalg.eig with argument left=True to find left eigenvectors directly

eigenvals, ε, e = sp.linalg.eig(A, left=True)

print(f"The eigenvalues of A are:\n {eigenvals.real}\n")


print(f"The corresponding right eigenvectors are: \n {e[:,0]} and {-e[:,1]}\n")
print(f"The corresponding left eigenvectors are: \n {ε[:,0]} and {-ε[:,1]}\n")

The eigenvalues of A are:


[2. 5.]

The corresponding right eigenvectors are:


[-0.89443 0.44721] and [0.70711 0.70711]

The corresponding left eigenvectors are:


[-0.70711 0.70711] and [0.44721 0.89443]

The eigenvalues are the same while the eigenvectors themselves are different.
(Also note that we are taking the nonnegative value of the eigenvector associated with the dominant eigenvalue; this is because eig automatically normalizes the eigenvectors.)
We can then take the transpose of 𝐴⊤𝑤 = 𝜆𝑤 to obtain 𝑤⊤𝐴 = 𝜆𝑤⊤.
This more common expression is where the name left eigenvector originates.
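As a quick sanity check (an addition to the text, reusing the same matrix 𝐴 as above), we can confirm numerically that each left eigenvector satisfies 𝑤⊤𝐴 = 𝜆𝑤⊤:

import numpy as np
from numpy.linalg import eig

A = np.array([[3, 2],
              [1, 4]])

λ, w = eig(A.T)        # left eigenvectors of A are right eigenvectors of A.T

for i in range(len(λ)):
    lhs = w[:, i] @ A              # w^T A
    rhs = λ[i] * w[:, i]           # λ w^T
    print(np.allclose(lhs, rhs))   # True for each eigenpair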

30.1.3 The Perron-Frobenius theorem

For a square nonnegative matrix 𝐴, the behavior of 𝐴𝑘 as 𝑘 → ∞ is controlled by the eigenvalue with the largest absolute
value, often called the dominant eigenvalue.
For any such matrix 𝐴, the Perron-Frobenius Theorem characterizes certain properties of the dominant eigenvalue and
its corresponding eigenvector.

Theorem (Perron-Frobenius Theorem)


If a matrix 𝐴 ≥ 0 then,
1. the dominant eigenvalue of 𝐴, 𝑟(𝐴), is real-valued and nonnegative.
2. for any other eigenvalue (possibly complex) 𝜆 of 𝐴, |𝜆| ≤ 𝑟(𝐴).
3. we can find a nonnegative and nonzero eigenvector 𝑣 such that 𝐴𝑣 = 𝑟(𝐴)𝑣.
Moreover if 𝐴 is also irreducible then,
4. the eigenvector 𝑣 associated with the eigenvalue 𝑟(𝐴) is strictly positive.
5. there exists no other positive eigenvector 𝑣 (except scalar multiples of 𝑣) associated with 𝑟(𝐴).
(A further part of the Perron-Frobenius theorem, concerning primitive matrices, is introduced below.)

(This is a relatively simple version of the theorem — for more details see here).
We will see applications of the theorem below.
Let’s build our intuition for the theorem using a simple example we have seen before.
Now let’s consider examples for each case.

Example: Irreducible matrix

Consider the following irreducible matrix 𝐴:

A = np.array([[0, 1, 0],
[.5, 0, .5],
[0, 1, 0]])

We can compute the dominant eigenvalue and the corresponding eigenvector

eig(A)

(array([-1.00000e+00, -3.30139e-18, 1.00000e+00]),


array([[ 5.77350e-01, 7.07107e-01, 5.77350e-01],
[-5.77350e-01, 3.16979e-18, 5.77350e-01],
[ 5.77350e-01, -7.07107e-01, 5.77350e-01]]))

Now we can see that the claims of the Perron-Frobenius Theorem hold for the irreducible matrix 𝐴 (a numerical check follows the list):
1. The dominant eigenvalue is real-valued and non-negative.
2. All other eigenvalues have absolute values less than or equal to the dominant eigenvalue.
3. A non-negative and nonzero eigenvector is associated with the dominant eigenvalue.
4. As the matrix is irreducible, the eigenvector associated with the dominant eigenvalue is strictly positive.
5. There exists no other positive eigenvector associated with the dominant eigenvalue.
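The snippet below is a small numerical check of claims 1–4 for this particular 𝐴 (added here as a sketch; it relies only on NumPy):

import numpy as np
from numpy.linalg import eig

A = np.array([[0, 1, 0],
              [.5, 0, .5],
              [0, 1, 0]])

eigvals, eigvecs = eig(A)
i = np.argmax(eigvals.real)        # the Perron root has the largest real part
r = eigvals[i].real
v = eigvecs[:, i].real
v = v if v.sum() > 0 else -v       # eig fixes the sign arbitrarily; flip if needed

print("r(A) is real and nonnegative:", r >= 0)
print("|λ| <= r(A) for all λ:       ", bool(np.all(np.abs(eigvals) <= r + 1e-12)))
print("dominant eigenvector:        ", v)
print("strictly positive:           ", bool(np.all(v > 0)))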


30.1.4 Primitive matrices

We know that in real world situations it's hard for a matrix to be everywhere positive (although such matrices have nice properties).
Primitive matrices, however, still give us helpful properties under a looser requirement.
Let 𝐴 be a square nonnegative matrix and let 𝐴^𝑘 be the 𝑘-th power of 𝐴.
A matrix is called primitive if there exists a 𝑘 ∈ ℕ such that 𝐴^𝑘 is everywhere positive.
Recall the examples given in the section on irreducible matrices:

$$A = \begin{bmatrix} 0.5 & 0.1 \\ 0.2 & 0.2 \end{bmatrix}$$

$A$ here is also a primitive matrix since $A^k$ is everywhere positive for every $k \in \mathbb{N}$.

$$B = \begin{bmatrix} 0 & 1 \\ 1 & 0 \end{bmatrix}, \qquad B^2 = \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix}$$

$B$ is irreducible but not primitive since there are always zeros in either the principal diagonal or the secondary diagonal.

If a matrix is primitive, then it is irreducible, but not vice versa (a numerical check is sketched below).
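The helper below is a sketch of such a check (an addition to the text). It tests whether some power 𝐴^𝑘 is everywhere positive; the cutoff (𝑛−1)²+1 is Wielandt's bound, which we take here as a sufficient number of powers to examine.

import numpy as np

def is_primitive(A):
    "Check primitivity by testing whether some power A^k is everywhere positive."
    n = A.shape[0]
    k_max = (n - 1)**2 + 1          # Wielandt's bound (assumed sufficient here)
    A_k = np.eye(n)
    for _ in range(k_max):
        A_k = A_k @ A
        if np.all(A_k > 0):
            return True
    return False

A = np.array([[0.5, 0.1],
              [0.2, 0.2]])
B = np.array([[0, 1],
              [1, 0]])

print(is_primitive(A))   # True:  A itself is everywhere positive
print(is_primitive(B))   # False: powers of B alternate between B and I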
Now let’s step back to the primitive matrices part of the Perron-Frobenius Theorem

Theorem (Continuation of the Perron-Frobenius Theorem)


If 𝐴 is primitive then,
6. the inequality |𝜆| ≤ 𝑟(𝐴) is strict for all eigenvalues 𝜆 of 𝐴 distinct from 𝑟(𝐴), and
7. with $v$ and $w$ normalized so that the inner product of $w$ and $v$ equals 1, we have $r(A)^{-m} A^m \to v w^\top$ as $m \to \infty$. The matrix $v w^\top$ is called the Perron projection of $A$.

Example 1: Primitive matrix

Consider the following primitive matrix 𝐵:

B = np.array([[0, 1, 1],
[1, 0, 1],
[1, 1, 0]])

np.linalg.matrix_power(B, 2)

array([[2, 1, 1],
[1, 2, 1],
[1, 1, 2]])

We compute the dominant eigenvalue and the corresponding eigenvector

eig(B)

(array([-1., 2., -1.]),


array([[-0.8165 , 0.57735, -0.09266],
[ 0.40825, 0.57735, -0.65621],
[ 0.40825, 0.57735, 0.74887]]))


Now let’s give some examples to see if the claims of the Perron-Frobenius Theorem hold for the primitive matrix 𝐵:
1. The dominant eigenvalue is real-valued and non-negative.
2. All other eigenvalues have absolute values strictly less than the dominant eigenvalue.
3. A non-negative and nonzero eigenvector is associated with the dominant eigenvalue.
4. The eigenvector associated with the dominant eigenvalue is strictly positive.
5. There exists no other positive eigenvector associated with the dominant eigenvalue.
6. The inequality |𝜆| < 𝑟(𝐵) holds for all eigenvalues 𝜆 of 𝐵 distinct from the dominant eigenvalue.
Furthermore, we can verify the convergence property (7) of the theorem on the following examples:

def compute_perron_projection(M):

    eigval, v = eig(M)
    eigval, w = eig(M.T)

    r = np.max(eigval)

    # Find the index of the dominant (Perron) eigenvalue
    i = np.argmax(eigval)

    # Get the Perron eigenvectors
    v_P = v[:, i].reshape(-1, 1)
    w_P = w[:, i].reshape(-1, 1)

    # Normalize the left and right eigenvectors
    norm_factor = w_P.T @ v_P
    v_norm = v_P / norm_factor

    # Compute the Perron projection matrix
    P = v_norm @ w_P.T
    return P, r


def check_convergence(M):
    P, r = compute_perron_projection(M)
    print("Perron projection:")
    print(P)

    # Define a list of values for n
    n_list = [1, 10, 100, 1000, 10000]

    for n in n_list:

        # Compute (A/r)^n
        M_n = np.linalg.matrix_power(M/r, n)

        # Compute the difference between A^n / r^n and the Perron projection
        diff = np.abs(M_n - P)

        # Calculate the norm of the difference matrix
        diff_norm = np.linalg.norm(diff, 'fro')
        print(f"n = {n}, error = {diff_norm:.10f}")

A1 = np.array([[1, 2],
[1, 4]])

A2 = np.array([[0, 1, 1],
[1, 0, 1],
[1, 1, 0]])

A3 = np.array([[0.971, 0.029, 0.1, 1],


[0.145, 0.778, 0.077, 0.59],
[0.1, 0.508, 0.492, 1.12],
[0.2, 0.8, 0.71, 0.95]])

for M in A1, A2, A3:
    print("Matrix:")
    print(M)
    check_convergence(M)
    print()
    print("-"*36)
    print()

Matrix:
[[1 2]
[1 4]]
Perron projection:
[[0.1362 0.48507]
[0.24254 0.8638 ]]
n = 1, error = 0.0989045731
n = 10, error = 0.0000000001
n = 100, error = 0.0000000000
n = 1000, error = 0.0000000000
n = 10000, error = 0.0000000000

------------------------------------

Matrix:
[[0 1 1]
[1 0 1]
[1 1 0]]
Perron projection:
[[0.33333 0.33333 0.33333]
[0.33333 0.33333 0.33333]
[0.33333 0.33333 0.33333]]
n = 1, error = 0.7071067812
n = 10, error = 0.0013810679
n = 100, error = 0.0000000000
n = 1000, error = 0.0000000000
n = 10000, error = 0.0000000000

------------------------------------

Matrix:
[[0.971 0.029 0.1 1. ]
[0.145 0.778 0.077 0.59 ]
[0.1 0.508 0.492 1.12 ]
[0.2 0.8 0.71 0.95 ]]
Perron projection:
[[0.12506 0.31949 0.20233 0.43341]
[0.07714 0.19707 0.1248 0.26735]
[0.12158 0.31058 0.19669 0.42133]
[0.13885 0.3547 0.22463 0.48118]]
n = 1, error = 0.5361031549
n = 10, error = 0.0000434043
n = 100, error = 0.0000000000
n = 1000, error = 0.0000000000
n = 10000, error = 0.0000000000

------------------------------------

The convergence is not observed in cases of non-primitive matrices.


Let’s go through an example

B = np.array([[0, 1, 1],
[1, 0, 0],
[1, 0, 0]])

# This shows that the matrix is not primitive


print("Matrix:")
print(B)
print("100th power of matrix B:")
print(np.linalg.matrix_power(B, 100))

check_convergence(B)

Matrix:
[[0 1 1]
[1 0 0]
[1 0 0]]
100th power of matrix B:
[[1125899906842624 0 0]
[ 0 562949953421312 562949953421312]
[ 0 562949953421312 562949953421312]]
Perron projection:
[[0.5 0.35355 0.35355]
[0.35355 0.25 0.25 ]
[0.35355 0.25 0.25 ]]
n = 1, error = 1.0000000000
n = 10, error = 1.0000000000
n = 100, error = 1.0000000000
n = 1000, error = 1.0000000000
n = 10000, error = 1.0000000000

The result shows that the matrix is not primitive, since its powers retain zero entries rather than becoming everywhere positive.
These examples show how the Perron-Frobenius Theorem relates to the eigenvalues and eigenvectors of positive matrices
and the convergence of the power of matrices.
In fact we have already seen the theorem in action before in the Markov chain lecture.


Example 2: Connection to Markov chains

We are now prepared to bridge the languages spoken in the two lectures.
A primitive matrix is both irreducible and aperiodic.
So the Perron-Frobenius Theorem explains why both the Imam and Temple matrix and the Hamilton matrix converge to a stationary distribution, which is the Perron projection of each matrix

P = np.array([[0.68, 0.12, 0.20],


[0.50, 0.24, 0.26],
[0.36, 0.18, 0.46]])

print(compute_perron_projection(P)[0])

[[0.56146 0.15565 0.28289]


[0.56146 0.15565 0.28289]
[0.56146 0.15565 0.28289]]

mc = qe.MarkovChain(P)
ψ_star = mc.stationary_distributions[0]
ψ_star

array([0.56146, 0.15565, 0.28289])

P_hamilton = np.array([[0.971, 0.029, 0.000],


[0.145, 0.778, 0.077],
[0.000, 0.508, 0.492]])

print(compute_perron_projection(P_hamilton)[0])

[[0.8128 0.16256 0.02464]


[0.8128 0.16256 0.02464]
[0.8128 0.16256 0.02464]]

mc = qe.MarkovChain(P_hamilton)
ψ_star = mc.stationary_distributions[0]
ψ_star

array([0.8128 , 0.16256, 0.02464])

We can also verify other properties hinted by Perron-Frobenius in these stochastic matrices.
Another example is the relationship between convergence gap and convergence rate.
In the exercise, we stated that the convergence rate is determined by the spectral gap, the difference between the largest
and the second largest eigenvalue.
This can be proven using what we have learned here.
Please note that we use 𝟙 for a vector of ones in this lecture.
With a Markov model 𝑀 with state space 𝑆 and transition matrix 𝑃, we can write 𝑃^𝑡 as

$$P^t = \sum_{i=1}^{n-1} \lambda_i^t v_i w_i^\top + \mathbb{1} \psi^*,$$


This is proven in [SS23] and a nice discussion can be found here.


In this formula 𝜆𝑖 is an eigenvalue of 𝑃 with corresponding right and left eigenvectors 𝑣𝑖 and 𝑤𝑖 .
Premultiplying 𝑃^𝑡 by arbitrary 𝜓 ∈ 𝒟(𝑆) and rearranging now gives

$$\psi P^t - \psi^* = \sum_{i=1}^{n-1} \lambda_i^t \psi v_i w_i^\top$$

Recall that eigenvalues are ordered from smallest to largest from 𝑖 = 1...𝑛.
As we have seen, the largest eigenvalue for a primitive stochastic matrix is one.
This can be proven using Gershgorin Circle Theorem, but it is out of the scope of this lecture.
So by statement (6) of the Perron-Frobenius Theorem, $|\lambda_i| < 1$ for all $i < n$, and $\lambda_n = 1$ when 𝑃 is primitive.
Hence, after taking the Euclidean norm deviation, we obtain

$$\| \psi P^t - \psi^* \| = O(\eta^t) \qquad \text{where } \eta := |\lambda_{n-1}| < 1$$

Thus, the rate of convergence is governed by the modulus of the second largest eigenvalue.
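As a rough numerical illustration of this result (a sketch added here, reusing the Hamilton matrix from above), we can compare the deviation ‖𝜓𝑃^𝑡 − 𝜓∗‖ with 𝜂^𝑡; the initial distribution 𝜓 below is arbitrary.

import numpy as np
import quantecon as qe

P = np.array([[0.971, 0.029, 0.000],
              [0.145, 0.778, 0.077],
              [0.000, 0.508, 0.492]])

η = sorted(np.abs(np.linalg.eigvals(P)))[-2]     # modulus of the second largest eigenvalue
ψ_star = qe.MarkovChain(P).stationary_distributions[0]
ψ = np.array([1.0, 0.0, 0.0])                    # an arbitrary initial distribution

for t in (1, 5, 10, 25, 50):
    dev = np.linalg.norm(ψ @ np.linalg.matrix_power(P, t) - ψ_star)
    print(f"t = {t:3d}: deviation = {dev:.2e},  η^t = {η**t:.2e}")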

30.2 Exercises

Exercise 30.2.1 (Leontief’s Input-Output Model)


Wassily Leontief developed a model of an economy with 𝑛 sectors producing 𝑛 different commodities representing the
interdependencies of different sectors of an economy.
Under this model some of the output is consumed internally by the industries and the rest is consumed by external con-
sumers.
We define a simple model with 3 sectors - agriculture, industry, and service.
The following table describes how output is distributed within the economy:

                 Total output   Agriculture   Industry   Service   Consumer
  Agriculture    𝑥1             0.3𝑥1         0.2𝑥2      0.3𝑥3     4
  Industry       𝑥2             0.2𝑥1         0.4𝑥2      0.3𝑥3     5
  Service        𝑥3             0.2𝑥1         0.5𝑥2      0.1𝑥3     12

The first row depicts how agriculture’s total output 𝑥1 is distributed


• 0.3𝑥1 is used as inputs within agriculture itself,
• 0.2𝑥2 is used as inputs by the industry sector to produce 𝑥2 units,
• 0.3𝑥3 is used as inputs by the service sector to produce 𝑥3 units and
• 4 units is the external demand by consumers.
We can transform this into a system of linear equations for the 3 sectors as given below:

𝑥1 = 0.3𝑥1 + 0.2𝑥2 + 0.3𝑥3 + 4


𝑥2 = 0.2𝑥1 + 0.4𝑥2 + 0.3𝑥3 + 5
𝑥3 = 0.2𝑥1 + 0.5𝑥2 + 0.1𝑥3 + 12


This can be transformed into the matrix equation 𝑥 = 𝐴𝑥 + 𝑑 where

$$x = \begin{bmatrix} x_1 \\ x_2 \\ x_3 \end{bmatrix}, \qquad A = \begin{bmatrix} 0.3 & 0.2 & 0.3 \\ 0.2 & 0.4 & 0.3 \\ 0.2 & 0.5 & 0.1 \end{bmatrix} \qquad \text{and} \qquad d = \begin{bmatrix} 4 \\ 5 \\ 12 \end{bmatrix}$$

The solution 𝑥∗ is given by the equation 𝑥∗ = (𝐼 − 𝐴)−1 𝑑


1. Since 𝐴 is a nonnegative irreducible matrix, find the Perron-Frobenius eigenvalue of 𝐴.
2. Use the Neumann Series Lemma to find the solution 𝑥∗ if it exists.

Solution to Exercise 30.2.1 (Leontief’s Input-Output Model)

A = np.array([[0.3, 0.2, 0.3],


[0.2, 0.4, 0.3],
[0.2, 0.5, 0.1]])

evals, evecs = eig(A)

r = max(abs(λ) for λ in evals) #dominant eigenvalue/spectral radius


print(r)

0.8444086477164563

Since we have 𝑟(𝐴) < 1 we can thus find the solution using the Neumann Series Lemma.

I = np.identity(3)
B = I - A

d = np.array([4, 5, 12])
d.shape = (3,1)

B_inv = np.linalg.inv(B)
x_star = B_inv @ d
print(x_star)

[[38.30189]
[44.33962]
[46.47799]]



CHAPTER

THIRTYONE

INPUT-OUTPUT MODELS

31.1 Overview

This lecture requires the following imports and installs before we proceed.

!pip install quantecon_book_networks


!pip install quantecon
!pip install pandas-datareader

import numpy as np
import pandas as pd
import networkx as nx
import matplotlib.pyplot as plt
import quantecon_book_networks
import quantecon_book_networks.input_output as qbn_io
import quantecon_book_networks.plotting as qbn_plt
import quantecon_book_networks.data as qbn_data
import matplotlib as mpl
from matplotlib.patches import Polygon

quantecon_book_networks.config("matplotlib")
mpl.rcParams.update(mpl.rcParamsDefault)

The following figure illustrates a network of linkages among 15 sectors obtained from the US Bureau of Economic Anal-
ysis’s 2021 Input-Output Accounts Data.

Label Sector Label Sector Label Sector


ag Agriculture wh Wholesale pr Professional Services
mi Mining re Retail ed Education & Health
ut Utilities tr Transportation ar Arts & Entertainment
co Construction in Information ot Other Services (exc govt)
ma Manufacturing fi Finance go Government

An arrow from 𝑖 to 𝑗 means that some of sector 𝑖’s output serves as an input to production of sector 𝑗.
Economies are characterised by many such links.
A basic framework for their analysis is Leontief’s input-output model.
After introducing the input-output model, we describe some of its connections to linear programming lecture.


Fig. 31.1: US 15 sector production network


31.2 Input output analysis

Let
• 𝑥0 be the amount of a single exogenous input to production, say labor
• 𝑥𝑗 , 𝑗 = 1, … 𝑛 be the gross output of final good 𝑗
• 𝑑𝑗 , 𝑗 = 1, … 𝑛 be the net output of final good 𝑗 that is available for final consumption
• 𝑧𝑖𝑗 be the quantity of good 𝑖 allocated to be an input to producing good 𝑗 for 𝑖 = 1, … 𝑛, 𝑗 = 1, … 𝑛
• 𝑧0𝑗 be the quantity of labor allocated to producing good 𝑗.
• 𝑎𝑖𝑗 be the number of units of good 𝑖 required to produce one unit of good 𝑗, 𝑖 = 0, … , 𝑛, 𝑗 = 1, … 𝑛.
• 𝑤 > 0 be an exogenous wage of labor, denominated in dollars per unit of labor
• 𝑝 be an 𝑛 × 1 vector of prices of produced goods 𝑖 = 1, … , 𝑛.
The technology for producing good 𝑗 ∈ {1, … , 𝑛} is described by the Leontief function

$$x_j = \min_{i \in \{0, \dots, n\}} \left( \frac{z_{ij}}{a_{ij}} \right)$$
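For example, the following sketch (with made-up numbers for 𝑎 and 𝑧, added here for illustration) shows how this technology turns a given allocation of inputs into output:

import numpy as np

# Hypothetical unit requirements a_ij (row i = 0 is labor) and allocations z_ij
a = np.array([[0.4, 0.6],     # labor needed per unit of goods 1 and 2
              [0.1, 0.3],     # good 1 needed per unit of goods 1 and 2
              [0.2, 0.1]])    # good 2 needed per unit of goods 1 and 2
z = np.array([[8.0, 12.0],
              [2.0,  6.0],
              [4.0,  2.0]])

x = np.min(z / a, axis=0)     # x_j = min_i z_ij / a_ij
print(x)                      # output of each good under this allocation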

31.2.1 Two goods

To illustrate, we begin by setting 𝑛 = 2 and formulating the following network.

Feasible allocations must satisfy

(1 − 𝑎11 )𝑥1 − 𝑎12 𝑥2 ≥ 𝑑1


−𝑎21 𝑥1 + (1 − 𝑎22 )𝑥2 ≥ 𝑑2
𝑎01 𝑥1 + 𝑎02 𝑥2 ≤ 𝑥0


This can be graphically represented as follows.

More generally, constraints on production are

$$(I - A)x \ge d, \qquad a_0^\top x \le x_0 \tag{31.1}$$

where $A$ is the $n \times n$ matrix with typical element $a_{ij}$ and $a_0^\top = \begin{bmatrix} a_{01} & \cdots & a_{0n} \end{bmatrix}$.
If we solve the first block of equations of (31.1) for gross output 𝑥 we get

$$x = (I - A)^{-1} d \equiv L d \tag{31.2}$$

where the matrix $L = (I - A)^{-1}$ is sometimes called a Leontief Inverse.

To ensure that the solution 𝑥 of (31.2) is a positive vector, the following Hawkins-Simon conditions suffice:

$$\det(I - A) > 0 \quad \text{and} \quad (I - A)_{ij} > 0 \text{ for all } i = j$$

For example, a two-good economy is described by

$$A = \begin{bmatrix} 0.1 & 40 \\ 0.01 & 0 \end{bmatrix} \qquad \text{and} \qquad d = \begin{bmatrix} 50 \\ 2 \end{bmatrix} \tag{31.3}$$

A = np.array([[0.1, 40],
[0.01, 0]])
d = np.array([50, 2]).reshape((2, 1))

I = np.identity(2)
B = I - A
B


array([[ 9.e-01, -4.e+01],


[-1.e-02, 1.e+00]])

Let’s check the Hawkins-Simon conditions

np.linalg.det(B) > 0 # checking Hawkins-Simon conditions

True

Now, let’s compute the Leontief inverse matrix

L = np.linalg.inv(B) # obtaining Leontief inverse matrix


L

array([[2.0e+00, 8.0e+01],
[2.0e-02, 1.8e+00]])

x = L @ d # solving for gross output


x

array([[260. ],
[ 4.6]])

31.3 Production possibility frontier

The second equation of (31.1) can be written

$$a_0^\top x = x_0$$

or

$$A_0^\top d = x_0 \tag{31.4}$$

where

$$A_0^\top = a_0^\top (I - A)^{-1}$$

For $i \in \{1, \dots, n\}$, the $i$-th component of $A_0$ is the amount of labor that is required to produce one unit of final output of good $i$.

Equation (31.4) sweeps out a production possibility frontier of final consumption bundles $d$ that can be produced with exogenous labor input $x_0$.

Consider the example in (31.3).

Suppose we are now given

$$a_0^\top = \begin{bmatrix} 4 & 100 \end{bmatrix}$$

Then we can find $A_0^\top$ by


a0 = np.array([4, 100])
A0 = a0 @ L
A0

array([ 10., 500.])

Thus, the production possibility frontier for this economy is

10𝑑1 + 500𝑑2 = 𝑥0

31.4 Prices

[DSS58] argue that relative prices of the 𝑛 produced goods must satisfy

𝑝1 = 𝑎11 𝑝1 + 𝑎21 𝑝2 + 𝑎01 𝑤


𝑝2 = 𝑎12 𝑝1 + 𝑎22 𝑝2 + 𝑎02 𝑤

More generally,

𝑝 = 𝐴⊤ 𝑝 + 𝑎0 𝑤

which states that the price of each final good equals the total cost of production, which consists of costs of intermediate
inputs 𝐴⊤ 𝑝 plus costs of labor 𝑎0 𝑤.
This equation can be written as

(𝐼 − 𝐴⊤ )𝑝 = 𝑎0 𝑤 (31.5)

which implies

𝑝 = (𝐼 − 𝐴⊤ )−1 𝑎0 𝑤
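As a sketch (not part of the original text), we can evaluate this formula for the two-good example in (31.3), using the labor requirements 𝑎0 from the previous section and taking the wage 𝑤 = 1 purely for illustration:

import numpy as np

A = np.array([[0.1, 40],
              [0.01, 0]])
a0 = np.array([4, 100])
w = 1                                    # wage, set to 1 for illustration

p = np.linalg.inv(np.identity(2) - A.T) @ (a0 * w)
print(p)                                 # [ 10. 500.]

With 𝑤 = 1 the computed prices coincide with the total labor requirements 𝐴0 found above, consistent with the cost interpretation of the price equation.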

Notice how (31.5) with (31.1) forms a conjugate pair through the appearance of operators that are transposes of one
another.
This connection surfaces again in a classic linear program and its dual.

31.5 Linear programs

A primal problem is

$$\min_{x} \; w a_0^\top x$$

subject to

$$(I - A)x \ge d$$

The associated dual problem is

$$\max_{p} \; p^\top d$$


subject to

(𝐼 − 𝐴)⊤ 𝑝 ≤ 𝑎0 𝑤

The primal problem chooses a feasible production plan to minimize costs for delivering a pre-assigned vector of final
goods consumption 𝑑.
The dual problem chooses prices to maximize the value of a pre-assigned vector of final goods 𝑑 subject to prices covering
costs of production.
By the strong duality theorem, the optimal values of the primal and dual problems coincide:

$$w a_0^\top x^* = p^{*\top} d$$

where the ∗'s denote optimal choices for the primal and dual problems.
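As a sketch of how this can be checked numerically (an addition to the text), we solve the primal problem for the example economy in (31.3) with scipy.optimize.linprog, again taking 𝑤 = 1 and the 𝑎0 from above purely for illustration:

import numpy as np
from scipy.optimize import linprog

A = np.array([[0.1, 40],
              [0.01, 0]])
a0 = np.array([4, 100])
d = np.array([50, 2])
w = 1                                          # wage, for illustration

# Primal: min_x w * a0 @ x  subject to (I - A) x >= d and x >= 0
res = linprog(c=w * a0,
              A_ub=-(np.identity(2) - A),      # linprog expects <= constraints
              b_ub=-d,
              bounds=[(0, None), (0, None)])
print("x* =", res.x, "  primal value =", res.fun)

# The prices p = (I - A^T)^{-1} a0 w give the same value for p @ d,
# as strong duality predicts.
p = np.linalg.inv(np.identity(2) - A.T) @ (a0 * w)
print("dual value p @ d =", p @ d)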
The dual problem can be graphically represented as follows.

31.6 Leontief inverse

We have discussed that gross output 𝑥 is given by (31.2), where 𝐿 is called the Leontief Inverse.
Recall the Neumann Series Lemma which states that 𝐿 exists if the spectral radius 𝑟(𝐴) < 1.
In fact

$$L = \sum_{i=0}^{\infty} A^i$$
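A quick numerical check of this expansion for the example in (31.3) (a sketch added here):

import numpy as np

A = np.array([[0.1, 40],
              [0.01, 0]])
L = np.linalg.inv(np.identity(2) - A)

S = np.zeros((2, 2))                 # truncated series I + A + A^2 + ...
A_i = np.identity(2)
for _ in range(50):
    S += A_i
    A_i = A_i @ A

print(np.max(np.abs(S - L)))         # small, since r(A) < 1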


31.6.1 Demand shocks

Consider the impact of a demand shock Δ𝑑 which shifts demand from 𝑑0 to 𝑑1 = 𝑑0 + Δ𝑑.
Gross output shifts from 𝑥0 = 𝐿𝑑0 to 𝑥1 = 𝐿𝑑1 .
If 𝑟(𝐴) < 1 then a solution exists and

Δ𝑥 = 𝐿Δ𝑑 = Δ𝑑 + 𝐴(Δ𝑑) + 𝐴2 (Δ𝑑) + ⋯

This illustrates that an element 𝑙𝑖𝑗 of 𝐿 shows the total impact on sector 𝑖 of a unit change in demand of good 𝑗.
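For instance, in the two-good example of (31.3), a hypothetical unit increase in demand for good 1 propagates as follows (a sketch added here):

import numpy as np

A = np.array([[0.1, 40],
              [0.01, 0]])
L = np.linalg.inv(np.identity(2) - A)

Δd = np.array([1, 0])       # unit demand shock to good 1 (illustrative)
print(L @ Δd)               # total impact on each sector: the first column of L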

31.7 Applications of graph theory

We can further study input output networks through applications of graph theory.
An input output network can be represented by a weighted directed graph induced by the adjacency matrix 𝐴.
The set of nodes 𝑉 = [𝑛] is the list of sectors and the set of edges is given by

𝐸 = {(𝑖, 𝑗) ∈ 𝑉 × 𝑉 ∶ 𝑎𝑖𝑗 > 0}

In Fig. 31.1 weights are indicated by the widths of the arrows, which are proportional to the corresponding input-output
coefficients.
We can now use centrality measures to rank sectors and discuss their importance relative to the other sectors.

31.7.1 Eigenvector centrality

Eigenvector centrality of a node $i$ is measured by

$$e_i = \frac{1}{r(A)} \sum_{1 \le j \le n} a_{ij} e_j$$
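The figure below is built from the full 15-sector data set; as a small self-contained sketch of the computation (added here), the helper below returns the centrality vector for an arbitrary nonnegative adjacency matrix. The 3 × 3 matrix is hypothetical.

import numpy as np

def eigenvector_centrality(A):
    "Solve e = (1/r(A)) A e: the right eigenvector for the dominant eigenvalue."
    eigvals, eigvecs = np.linalg.eig(A)
    i = np.argmax(eigvals.real)          # Perron root has the largest real part
    e = np.abs(eigvecs[:, i].real)
    return e / e.sum()                   # normalize to sum to one

A = np.array([[0.0, 0.5, 0.3],
              [0.2, 0.0, 0.4],
              [0.1, 0.6, 0.0]])          # hypothetical input-output weights

print(eigenvector_centrality(A))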

We plot a bar graph of hub-based eigenvector centrality for the sectors represented in Fig. 31.1.


A higher measure indicates higher importance as a supplier.


As a result demand shocks in most sectors will significantly impact activity in sectors with high eigenvector centrality.
The above figure indicates that manufacturing is the most dominant sector in the US economy.

31.7.2 Output multipliers

Another way to rank sectors in input output networks is via output multipliers.
The output multiplier of sector $j$, denoted by $\mu_j$, is usually defined as the total sector-wide impact of a unit change of demand in sector $j$.
Earlier when discussing demand shocks we concluded that for $L = (l_{ij})$ the element $l_{ij}$ represents the impact on sector $i$ of a unit change in demand in sector $j$.
Thus,

$$\mu_j = \sum_{i=1}^{n} l_{ij}$$

This can be written as $\mu^\top = \mathbb{1}^\top L$ or

$$\mu^\top = \mathbb{1}^\top (I - A)^{-1}$$

Please note that here we use 𝟙 to represent a vector of ones.
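As a small sketch of the computation (added here, using the three-sector coefficient matrix from the Leontief exercise of the previous chapter purely as an illustration):

import numpy as np

A = np.array([[0.3, 0.2, 0.3],
              [0.2, 0.4, 0.3],
              [0.2, 0.5, 0.1]])

L = np.linalg.inv(np.identity(3) - A)
μ = np.ones(3) @ L                    # μ^T = 1^T (I - A)^{-1}
print(μ)                              # output multiplier of each sector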


High ranking sectors within this measure are important buyers of intermediate goods.
A demand shock in such sectors will cause a large impact on the whole production network.
The following figure displays the output multipliers for the sectors represented in Fig. 31.1.


We observe that manufacturing and agriculture are highest ranking sectors.

31.8 Exercises

Exercise 31.8.1
[DSS58] Chapter 9 discusses an example with the following parameter settings:

$$A = \begin{bmatrix} 0.1 & 1.46 \\ 0.16 & 0.17 \end{bmatrix} \qquad \text{and} \qquad a_0 = \begin{bmatrix} 0.04 & 0.33 \end{bmatrix}$$

$$x = \begin{bmatrix} 250 \\ 120 \end{bmatrix} \qquad \text{and} \qquad x_0 = 50$$

$$d = \begin{bmatrix} 50 \\ 60 \end{bmatrix}$$

Describe how they infer the input-output coefficients in $A$ and $a_0$ from the following hypothetical underlying "data" on agricultural and manufacturing industries:

$$z = \begin{bmatrix} 25 & 175 \\ 40 & 20 \end{bmatrix} \qquad \text{and} \qquad z_0 = \begin{bmatrix} 10 & 40 \end{bmatrix}$$

where 𝑧0 is a vector of labor services used in each industry.

Solution to Exercise 31.8.1


For each $i = 0, 1, 2$ and $j = 1, 2$

$$a_{ij} = \frac{z_{ij}}{x_j}$$


Exercise 31.8.2
Derive the production possibility frontier for the economy characterized in the previous exercise.

Solution to Exercise 31.8.2

A = np.array([[0.1, 1.46],
[0.16, 0.17]])
a_0 = np.array([0.04, 0.33])

I = np.identity(2)
B = I - A
L = np.linalg.inv(B)

A_0 = a_0 @ L
A_0

array([0.16751071, 0.69224776])

Thus the production possibility frontier is given by

0.17𝑑1 + 0.69𝑑2 = 50



CHAPTER

THIRTYTWO

A LAKE MODEL OF EMPLOYMENT

32.1 Outline

In addition to what’s in Anaconda, this lecture will need the following libraries:

import numpy as np
import matplotlib.pyplot as plt

32.2 The Lake model

This model is sometimes called the lake model because there are two pools of workers:
1. those who are currently employed.
2. those who are currently unemployed but are seeking employment.
The “flows” between the two lakes are as follows:
1. workers exit the labor market at rate 𝑑.
2. new workers enter the labor market at rate 𝑏.
3. employed workers separate from their jobs at rate 𝛼.
4. unemployed workers find jobs at rate 𝜆.
The below graph illustrates the lake model.

32.3 Dynamics

Let 𝑒𝑡 and 𝑢𝑡 be the number of employed and unemployed workers at time 𝑡 respectively.
The total population of workers is 𝑛𝑡 = 𝑒𝑡 + 𝑢𝑡 .
The number of unemployed and employed workers thus evolves according to:

𝑢𝑡+1 = (1 − 𝑑)(1 − 𝜆)𝑢𝑡 + 𝛼(1 − 𝑑)𝑒𝑡 + 𝑏𝑛𝑡


= ((1 − 𝑑)(1 − 𝜆) + 𝑏)𝑢𝑡 + (𝛼(1 − 𝑑) + 𝑏)𝑒𝑡 (32.1)
𝑒𝑡+1 = (1 − 𝑑)𝜆𝑢𝑡 + (1 − 𝛼)(1 − 𝑑)𝑒𝑡


Fig. 32.1: An illustration of the lake model

We can arrange (32.1) as a linear system of equations in matrix form $x_{t+1} = A x_t$ where

$$x_{t+1} = \begin{bmatrix} u_{t+1} \\ e_{t+1} \end{bmatrix}, \qquad A = \begin{bmatrix} (1-d)(1-\lambda) + b & \alpha(1-d) + b \\ (1-d)\lambda & (1-\alpha)(1-d) \end{bmatrix} \qquad \text{and} \qquad x_t = \begin{bmatrix} u_t \\ e_t \end{bmatrix}.$$

Suppose at $t = 0$ we have $x_0 = \begin{bmatrix} u_0 & e_0 \end{bmatrix}^\top$.
Then, 𝑥1 = 𝐴𝑥0 , 𝑥2 = 𝐴𝑥1 = 𝐴2 𝑥0 and thus 𝑥𝑡 = 𝐴𝑡 𝑥0 .
Thus the long-run outcomes of this system may depend on the initial condition 𝑥0 and the matrix 𝐴.
We are interested in how 𝑢𝑡 and 𝑒𝑡 evolve over time.
What long-run unemployment rate and employment rate should we expect?
Do long-run outcomes depend on the initial values $(u_0, e_0)$?

32.3.1 Visualising the long-run outcomes

Let us first plot the time series of unemployment 𝑢𝑡 , employment 𝑒𝑡 , and labor force 𝑛𝑡 .

class LakeModel:
    """
    Solves the lake model and computes dynamics of the unemployment stocks and
    rates.

    Parameters:
    ------------
    λ : scalar
        The job finding rate for currently unemployed workers
    α : scalar
        The dismissal rate for currently employed workers
    b : scalar
        Entry rate into the labor force
    d : scalar
        Exit rate from the labor force

    """
    def __init__(self, λ=0.1, α=0.013, b=0.0124, d=0.00822):
        self.λ, self.α, self.b, self.d = λ, α, b, d

        λ, α, b, d = self.λ, self.α, self.b, self.d
        self.g = b - d
        g = self.g

        self.A = np.array([[(1-d)*(1-λ) + b,   α*(1-d) + b],
                           [        (1-d)*λ,   (1-α)*(1-d)]])

        self.ū = (1 + g - (1 - d) * (1 - α)) / (1 + g - (1 - d) * (1 - α) + (1 - d) * λ)
        self.ē = 1 - self.ū

    def simulate_path(self, x0, T=1000):
        """
        Simulates the sequence of employment and unemployment

        Parameters
        ----------
        x0 : array
            Contains initial values (u0,e0)
        T : int
            Number of periods to simulate

        Returns
        ----------
        x : iterator
            Contains sequence of employment and unemployment rates

        """
        x0 = np.atleast_1d(x0)    # Recast as array just in case
        x_ts = np.zeros((2, T))
        x_ts[:, 0] = x0
        for t in range(1, T):
            x_ts[:, t] = self.A @ x_ts[:, t-1]
        return x_ts

lm = LakeModel()
e_0 = 0.92 # Initial employment
u_0 = 1 - e_0 # Initial unemployment, given initial n_0 = 1

lm = LakeModel()
T = 100 # Simulation length

x_0 = (u_0, e_0)


x_path = lm.simulate_path(x_0, T)

fig, axes = plt.subplots(3, 1, figsize=(10, 8))

axes[0].plot(x_path[0, :], lw=2)


axes[0].set_title('Unemployment')

axes[1].plot(x_path[1, :], lw=2)
axes[1].set_title('Employment')

axes[2].plot(x_path.sum(0), lw=2)
axes[2].set_title('Labor force')

for ax in axes:
    ax.grid()

plt.tight_layout()
plt.show()

Not surprisingly, we observe that the labor force 𝑛𝑡 increases at a constant rate.
This is consistent with the fact that there is only one inflow source (the pool of new entrants) into the unemployment and employment pools, and that the inflow to and outflow from the labor market are governed by the constant entry and exit rates in the long run.
In detail, let 𝟙 = [1, 1]⊤ be a vector of ones.
Observe that
𝑛𝑡+1 = 𝑢𝑡+1 + 𝑒𝑡+1
= 𝟙⊤ 𝑥𝑡+1
= 𝟙⊤ 𝐴𝑥𝑡
= (1 + 𝑏 − 𝑑)(𝑢𝑡 + 𝑒𝑡 )
= (1 + 𝑏 − 𝑑)𝑛𝑡 .
Hence, the growth rate of 𝑛𝑡 is fixed at 1 + 𝑏 − 𝑑.
Moreover, the time series of unemployment and employment seem to grow at stable rates in the long run.
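A quick numerical check of the constant-growth claim, reusing the lm instance and x_path simulated above (a sketch added here):

# The growth factor of the simulated labor force should be constant at 1 + b - d
n_path = x_path.sum(0)
growth_factors = n_path[1:] / n_path[:-1]
print(np.allclose(growth_factors, 1 + lm.b - lm.d))   # True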


32.3.2 The application of Perron-Frobenius theorem

Intuitively, if we view the unemployment pool and the employment pool as a closed system, their growth should be similar to that of the labor force.
We next ask whether the long-run growth rates of 𝑒𝑡 and 𝑢𝑡 are also dominated by 1 + 𝑏 − 𝑑, as is the labor force.
The answer will be clearer if we appeal to the Perron-Frobenius theorem.
The importance of the Perron-Frobenius theorem stems from the fact that firstly in the real world most matrices we
encounter are nonnegative matrices.
Secondly, many important models are simply linear iterative models that begin with an initial condition 𝑥0 and then evolve
recursively by the rule 𝑥𝑡+1 = 𝐴𝑥𝑡 or in short 𝑥𝑡 = 𝐴𝑡 𝑥0 .
This theorem helps characterise the dominant eigenvalue 𝑟(𝐴) which determines the behavior of this iterative process.

Dominant eigenvector

We now illustrate the power of the Perron-Frobenius theorem by showing how it helps us to analyze the lake model.
Since 𝐴 is a nonnegative and irreducible matrix, the Perron-Frobenius theorem implies that:
• the spectral radius $r(A)$ is an eigenvalue of $A$, where

  $$r(A) := \max\{|\lambda| : \lambda \text{ is an eigenvalue of } A\}$$

• any other eigenvalue $\lambda$ in absolute value is strictly smaller than $r(A)$: $|\lambda| < r(A)$,
• there exist unique and everywhere positive right eigenvector $\phi$ (column vector) and left eigenvector $\psi$ (row vector):

  $$A\phi = r(A)\phi, \qquad \psi A = r(A)\psi$$

• if further $A$ is positive, then with $\langle \psi, \phi \rangle = \psi\phi = 1$ we have

  $$r(A)^{-t} A^t \to \phi \psi$$
The last statement implies that the magnitude of 𝐴𝑡 is identical to the magnitude of 𝑟(𝐴)𝑡 in the long run, where 𝑟(𝐴)
can be considered as the dominant eigenvalue in this lecture.
Therefore, the magnitude 𝑥𝑡 = 𝐴𝑡 𝑥0 is also dominated by 𝑟(𝐴)𝑡 in the long run.
Recall that the spectral radius is bounded by column sums: for $A \ge 0$, we have

$$\min_j \text{colsum}_j(A) \le r(A) \le \max_j \text{colsum}_j(A) \tag{32.2}$$

Note that $\text{colsum}_j(A) = 1 + b - d$ for $j = 1, 2$ and by (32.2) we can thus conclude that the dominant eigenvalue is $r(A) = 1 + b - d$.

Denote $g = b - d$ as the overall growth rate of the total labor force, so that $r(A) = 1 + g$.

The Perron-Frobenius theorem implies that there is a unique positive eigenvector $\bar{x} = \begin{bmatrix} \bar{u} \\ \bar{e} \end{bmatrix}$ such that $A\bar{x} = r(A)\bar{x}$ and $\begin{bmatrix} 1 & 1 \end{bmatrix} \bar{x} = 1$:

$$\bar{u} = \frac{b + \alpha(1-d)}{b + (\alpha + \lambda)(1-d)}, \qquad \bar{e} = \frac{\lambda(1-d)}{b + (\alpha + \lambda)(1-d)} \tag{32.3}$$

Since 𝑥̄ is the eigenvector corresponding to the dominant eigenvalue 𝑟(𝐴), we call 𝑥̄ the dominant eigenvector.
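Here is a small numerical check (added as a sketch, reusing the LakeModel class defined above) that 𝑟(𝐴) = 1 + 𝑔 and that the normalized dominant eigenvector matches (𝑢̄, 𝑒̄):

lm = LakeModel(α=0.01, λ=0.1, d=0.02, b=0.025)

eigvals, eigvecs = np.linalg.eig(lm.A)
i = np.argmax(eigvals.real)              # dominant (Perron) eigenvalue
x_bar = eigvecs[:, i].real
x_bar = x_bar / x_bar.sum()              # normalize so that [1 1] x̄ = 1

print(np.isclose(eigvals[i].real, 1 + lm.g))      # r(A) = 1 + g
print(np.allclose(x_bar, (lm.ū, lm.ē)))           # x̄ = (ū, ē)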
This dominant eigenvector plays an important role in determining long-run outcomes as illustrated below.


def plot_time_paths(lm, x0=None, T=1000, ax=None):
    """
    Plots the simulated time series.

    Parameters
    ----------
    lm : class
        Lake Model
    x0 : array
        Contains some different initial values.
    T : int
        Number of periods to simulate

    """

    if x0 is None:
        x0 = np.array([[5.0, 0.1]])

    ū, ē = lm.ū, lm.ē

    x0 = np.atleast_2d(x0)

    if ax is None:
        fig, ax = plt.subplots(figsize=(10, 8))
        # Plot line D
        s = 10
        ax.plot([0, s * ū], [0, s * ē], "k--", lw=1, label='set $D$')

    # Set the axes through the origin
    for spine in ["left", "bottom"]:
        ax.spines[spine].set_position("zero")
    for spine in ["right", "top"]:
        ax.spines[spine].set_color("none")

    ax.set_xlim(-2, 6)
    ax.set_ylim(-2, 6)
    ax.set_xlabel("unemployed workforce")
    ax.set_ylabel("employed workforce")
    ax.set_xticks((0, 6))
    ax.set_yticks((0, 6))

    # Plot time series
    for x in x0:
        x_ts = lm.simulate_path(x0=x)

        ax.scatter(x_ts[0, :], x_ts[1, :], s=4)

        u0, e0 = x
        ax.plot([u0], [e0], "ko", ms=2, alpha=0.6)
        ax.annotate(f'$x_0 = ({u0},{e0})$',
                    xy=(u0, e0),
                    xycoords="data",
                    xytext=(0, 20),
                    textcoords="offset points",
                    arrowprops=dict(arrowstyle="->"))

    ax.plot([ū], [ē], "ko", ms=4, alpha=0.6)
    ax.annotate(r'$\bar{x}$',
                xy=(ū, ē),
                xycoords="data",
                xytext=(20, -20),
                textcoords="offset points",
                arrowprops=dict(arrowstyle="->"))

    if ax is None:
        plt.show()

lm = LakeModel(α=0.01, λ=0.1, d=0.02, b=0.025)


x0 = ((5.0, 0.1), (0.1, 4.0), (2.0, 1.0))
plot_time_paths(lm, x0=x0)

Since 𝑥̄ is an eigenvector corresponding to the eigenvalue 𝑟(𝐴), all the vectors in the set 𝐷 ∶= {𝑥 ∈ ℝ2 ∶ 𝑥 =
𝛼𝑥̄ for some 𝛼 > 0} are also eigenvectors corresponding to 𝑟(𝐴).
This set 𝐷 is represented by a dashed line in the above figure.
The graph illustrates that for two distinct initial conditions 𝑥0 the sequences of iterates (𝐴𝑡 𝑥0 )𝑡≥0 move towards 𝐷 over
time.
This suggests that all such sequences share strong similarities in the long run, determined by the dominant eigenvector 𝑥.̄


Negative growth rate

In the example illustrated above we considered parameters such that overall growth rate of the labor force 𝑔 > 0.
Suppose now we are faced with a situation where the 𝑔 < 0, i.e., negative growth in the labor force.
This means that 𝑏 − 𝑑 < 0, i.e., workers exit the market faster than they enter.
What would the behavior of the iterative sequence 𝑥𝑡+1 = 𝐴𝑥𝑡 be now?
This is visualised below.

lm = LakeModel(α=0.01, λ=0.1, d=0.025, b=0.02)


plot_time_paths(lm, x0=x0)

Thus, while the sequence of iterates still moves towards the dominant eigenvector 𝑥,̄ in this case they converge to the
origin.
This is a result of the fact that 𝑟(𝐴) < 1, which ensures that the iterative sequence (𝐴𝑡 𝑥0 )𝑡≥0 will converge to some
point, in this case to (0, 0).
This leads us to the next result.


32.3.3 Properties

Since the column sums of 𝐴 equal $r(A) = 1 + g$, the left eigenvector is $\mathbb{1}^\top = [1, 1]$.
Perron-Frobenius theory implies that

$$r(A)^{-t} A^t \approx \bar{x} \mathbb{1}^\top = \begin{bmatrix} \bar{u} & \bar{u} \\ \bar{e} & \bar{e} \end{bmatrix}.$$

As a result, for any $x_0 = (u_0, e_0)^\top$, we have

$$x_t = A^t x_0 \approx r(A)^t \begin{bmatrix} \bar{u} & \bar{u} \\ \bar{e} & \bar{e} \end{bmatrix} \begin{bmatrix} u_0 \\ e_0 \end{bmatrix} = (1+g)^t (u_0 + e_0) \begin{bmatrix} \bar{u} \\ \bar{e} \end{bmatrix} = (1+g)^t n_0 \bar{x} = n_t \bar{x}$$

when $t$ is large enough.
We see that the growth of $u_t$ and $e_t$ is also dominated by $r(A) = 1 + g$ in the long run: $x_t$ grows along $D$ if $r(A) > 1$ and converges to $(0, 0)$ if $r(A) < 1$.
Moreover, the long-run unemployment and employment are steady fractions of $n_t$.
The latter implies that $\bar{u}$ and $\bar{e}$ are the long-run unemployment rate and employment rate, respectively.
In detail, we have the unemployment and employment rates: $x_t / n_t = A^t x_0 / n_t \to \bar{x}$ as $t \to \infty$.
To illustrate the dynamics of the rates, let $\hat{A} := A/(1+g)$ be the transition matrix of $r_t := x_t / n_t$.
The dynamics of the rates follow

$$r_{t+1} = \frac{x_{t+1}}{n_{t+1}} = \frac{x_{t+1}}{(1+g) n_t} = \frac{A x_t}{(1+g) n_t} = \hat{A} \frac{x_t}{n_t} = \hat{A} r_t.$$

Observe that the column sums of $\hat{A}$ are all one, so that $r(\hat{A}) = 1$.
One can check that $\bar{x}$ is also the right eigenvector of $\hat{A}$ corresponding to $r(\hat{A})$, that is, $\bar{x} = \hat{A}\bar{x}$.

Moreover, $\hat{A}^t r_0 \to \bar{x}$ as $t \to \infty$ for any $r_0 = x_0 / n_0$, since the above discussion implies

$$r_t = \hat{A}^t r_0 = (1+g)^{-t} A^t r_0 = r(A)^{-t} A^t r_0 \to \begin{bmatrix} \bar{u} & \bar{u} \\ \bar{e} & \bar{e} \end{bmatrix} r_0 = \begin{bmatrix} \bar{u} \\ \bar{e} \end{bmatrix}.$$

This is illustrated below.

lm = LakeModel()
e_0 = 0.92 # Initial employment
u_0 = 1 - e_0 # Initial unemployment, given initial n_0 = 1

lm = LakeModel()
T = 100 # Simulation length

x_0 = (u_0, e_0)

x_path = lm.simulate_path(x_0, T)

rate_path = x_path / x_path.sum(0)


fig, axes = plt.subplots(2, 1, figsize=(10, 8))

# Plot steady ū and ē


axes[0].hlines(lm.ū, 0, T, 'r', '--', lw=2, label='ū')
axes[1].hlines(lm.ē, 0, T, 'r', '--', lw=2, label='ē')

titles = ['Unemployment rate', 'Employment rate']


locations = ['lower right', 'upper right']

# Plot unemployment rate and employment rate


for i, ax in enumerate(axes):
    ax.plot(rate_path[i, :], lw=2, alpha=0.6)
    ax.set_title(titles[i])
    ax.grid()
    ax.legend(loc=locations[i])

plt.tight_layout()
plt.show()

To provide more intuition for convergence, we further explain the convergence below without the Perron-Frobenius the-
orem.
Suppose that 𝐴 ̂ = 𝑃 𝐷𝑃 −1 is diagonalizable, where 𝑃 = [𝑣1 , 𝑣2 ] consists of eigenvectors 𝑣1 and 𝑣2 of 𝐴 ̂ corresponding
to eigenvalues 𝛾1 and 𝛾2 respectively, and 𝐷 = diag(𝛾1 , 𝛾2 ).
Let 𝛾1 = 𝑟(𝐴)̂ = 1 and |𝛾2 | < 𝛾1 , so that the spectral radius is a dominant eigenvalue.
The dynamics of the rates follow 𝑟𝑡+1 = 𝐴𝑟̂ 𝑡 , where 𝑟0 is a probability vector: ∑𝑗 𝑟0,𝑗 = 1.


Consider $z_t = P^{-1} r_t$.
Then, we have $z_{t+1} = P^{-1} r_{t+1} = P^{-1} \hat{A} r_t = P^{-1} \hat{A} P z_t = D z_t$.

Hence, we obtain $z_t = D^t z_0$, and for some $z_0 = (c_1, c_2)^\top$ we have

$$r_t = P z_t = \begin{bmatrix} v_1 & v_2 \end{bmatrix} \begin{bmatrix} \gamma_1^t & 0 \\ 0 & \gamma_2^t \end{bmatrix} \begin{bmatrix} c_1 \\ c_2 \end{bmatrix} = c_1 \gamma_1^t v_1 + c_2 \gamma_2^t v_2.$$

Since |𝛾2 | < |𝛾1 | = 1, the second term in the right hand side converges to zero.
Therefore, the convergence follows 𝑟𝑡 → 𝑐1 𝑣1 .
Since the column sums of 𝐴 ̂ are one and 𝑟0 is a probability vector, 𝑟𝑡 must be a probability vector.
In this case, 𝑐1 𝑣1 must be a normalized eigenvector, so 𝑐1 𝑣1 = 𝑥̄ and then 𝑟𝑡 → 𝑥.̄

32.4 Exercise

Exercise 32.4.1 (Evolution of unemployment and employment rate)


How do the long-run unemployment rate and employment rate evolve if there is an increase in the separation rate 𝛼 or a
decrease in job finding rate 𝜆?
Is the result compatible with your intuition?
Plot the graph to illustrate how the line 𝐷 ∶= {𝑥 ∈ ℝ2 ∶ 𝑥 = 𝛼𝑥̄ for some 𝛼 > 0} shifts in the unemployment-
employment space.

Solution to Exercise 32.4.1 (Evolution of unemployment and employment rate)


Eq. (32.3) implies that the long-run unemployment rate will increase, and the employment rate will decrease if 𝛼 increases
or 𝜆 decreases.
Suppose first that 𝛼 = 0.01, 𝜆 = 0.1, 𝑑 = 0.02, 𝑏 = 0.025. Assume that 𝛼 increases to 0.04.
The below graph illustrates that the line 𝐷 shifts clockwise downward, which indicates that the fraction of unemployment
rises as the separation rate increases.

fig, ax = plt.subplots(figsize=(10, 8))

lm = LakeModel(α=0.01, λ=0.1, d=0.02, b=0.025)


plot_time_paths(lm, ax=ax)
s=10
ax.plot([0, s * lm.ū], [0, s * lm.ē], "k--", lw=1, label='set $D$, α=0.01')

lm = LakeModel(α=0.04, λ=0.1, d=0.02, b=0.025)


plot_time_paths(lm, ax=ax)
ax.plot([0, s * lm.ū], [0, s * lm.ē], "r--", lw=1, label='set $D$, α=0.04')

ax.legend(loc='best')
plt.show()



CHAPTER

THIRTYTHREE

NETWORKS

!pip install quantecon-book-networks pandas-datareader

33.1 Outline

In recent years there has been rapid growth in a field called network science.
Network science studies relationships between groups of objects.
One important example is the world wide web, where web pages are connected by hyperlinks.
Another is the human brain: studies of brain function emphasize the network of connections between nerve cells (neurons).
Artificial neural networks are based on this idea, using data to build intricate connections between simple processing units.
Epidemiologists studying transmission of diseases like COVID-19 analyze interactions between groups of human hosts.
In operations research, network analysis is used to study fundamental problems such as minimum cost flow, the traveling salesman problem, shortest paths, and assignment.
This lecture gives an introduction to economic and financial networks.
Some parts of this lecture are drawn from the text https://networks.quantecon.org/ but the level of this lecture is more
introductory.
We will need the following imports.

import numpy as np
import networkx as nx
import matplotlib.pyplot as plt
import pandas as pd
import quantecon as qe

import matplotlib.cm as cm
import quantecon_book_networks
import quantecon_book_networks.input_output as qbn_io
import quantecon_book_networks.data as qbn_data

import matplotlib.patches as mpatches


33.2 Economic and financial networks

Within economics, important examples of networks include


• financial networks
• production networks
• trade networks
• transport networks and
• social networks
Social networks affect trends in market sentiment and consumer decisions.
The structure of financial networks helps to determine relative fragility of the financial system.
The structure of production networks affects trade, innovation and the propagation of local shocks.
To better understand such networks, let’s look at some examples in more depth.

33.2.1 Example: Aircraft Exports

The following figure shows international trade in large commercial aircraft in 2019 based on International Trade Data
SITC Revision 2.

The circles in the figure are called nodes or vertices – in this case they represent countries.
The arrows in the figure are called edges or links.
Node size is proportional to total exports and edge width is proportional to exports to the target country.
(The data is for trade in commercial aircraft weighing at least 15,000kg and was sourced from CID Dataverse.)
The figure shows that the US, France and Germany are major export hubs.
In the discussion below, we learn to quantify such ideas.

33.2.2 Example: A Markov Chain

Recall that, in our lecture on Markov chains we studied a dynamic model of business cycles where the states are
• “ng” = “normal growth”
• “mr” = “mild recession”
• “sr” = “severe recession”
Let’s examine the following figure


Fig. 33.1: Commercial Aircraft Network


This is an example of a network, where the set of nodes 𝑉 equals the states:

𝑉 = {"ng", "mr", "sr"}

The edges between the nodes show the one month transition probabilities.

33.3 An introduction to graph theory

Now we’ve looked at some examples, let’s move on to theory.


This theory will allow us to better organize our thoughts.
The theoretical part of network science is constructed using a major branch of mathematics called graph theory.
Graph theory can be complicated and we will cover only the basics.
However, these concepts will already be enough for us to discuss interesting and important ideas on economic and financial
networks.
We focus on “directed” graphs, where connections are, in general, asymmetric (arrows typically point one way, not both
ways).
E.g.,
• bank 𝐴 lends money to bank 𝐵
• firm 𝐴 supplies goods to firm 𝐵
• individual 𝐴 “follows” individual 𝐵 on a given social network
(“Undirected” graphs, where connections are symmetric, are a special case of directed graphs — we just need to insist
that each arrow pointing from 𝐴 to 𝐵 is paired with another arrow pointing from 𝐵 to 𝐴.)

33.3.1 Key definitions

A directed graph consists of two things:


1. a finite set 𝑉 and
2. a collection of pairs (𝑢, 𝑣) where 𝑢 and 𝑣 are elements of 𝑉 .
The elements of 𝑉 are called the vertices or nodes of the graph.
The pairs (𝑢, 𝑣) are called the edges of the graph and the set of all edges will usually be denoted by 𝐸
Intuitively and visually, an edge (𝑢, 𝑣) is understood as an arrow from node 𝑢 to node 𝑣.
(A neat way to represent an arrow is to record the location of the tail and head of the arrow, and that’s exactly what an
edge does.)
In the aircraft export example shown in Fig. 33.1
• 𝑉 is all countries included in the data set.
• 𝐸 is all the arrows in the figure, each indicating some positive amount of aircraft exports from one country to
another.
Let’s look at more examples.
Two graphs are shown below, each with three nodes.
We now construct a graph with the same nodes but different edges.


Fig. 33.2: Poverty Trap

Fig. 33.3: Poverty Trap


For these graphs, the arrows (edges) can be thought of as representing positive transition probabilities over a given unit
of time.
In general, if an edge (𝑢, 𝑣) exists, then the node 𝑢 is called a direct predecessor of 𝑣 and 𝑣 is called a direct successor
of 𝑢.
Also, for 𝑣 ∈ 𝑉 ,
• the in-degree is 𝑖𝑑 (𝑣) = the number of direct predecessors of 𝑣 and
• the out-degree is 𝑜𝑑 (𝑣) = the number of direct successors of 𝑣.

33.3.2 Digraphs in Networkx

The Python package Networkx provides a convenient data structure for representing directed graphs and implements
many common routines for analyzing them.
As an example, let us recreate Fig. 33.3 using Networkx.
To do so, we first create an empty DiGraph object:

G_p = nx.DiGraph()

Next we populate it with nodes and edges.


To do this we write down a list of all edges, with poor represented by p and so on:

edge_list = [('p', 'p'),


('m', 'p'), ('m', 'm'), ('m', 'r'),
('r', 'p'), ('r', 'm'), ('r', 'r')]

Finally, we add the edges to our DiGraph object:

for e in edge_list:
    u, v = e
    G_p.add_edge(u, v)

Alternatively, we can use the method add_edges_from.

G_p.add_edges_from(edge_list)

Adding the edges automatically adds the nodes, so G_p is now a correct representation of our graph.
We can verify this by plotting the graph via Networkx with the following code:

fig, ax = plt.subplots()
nx.draw_spring(G_p, ax=ax, node_size=500, with_labels=True,
font_weight='bold', arrows=True, alpha=0.8,
connectionstyle='arc3,rad=0.25', arrowsize=20)
plt.show()


The figure obtained above matches the original directed graph in Fig. 33.3.
DiGraph objects have methods that calculate in-degree and out-degree of nodes.
For example,

G_p.in_degree('p')

33.3.3 Communication

Next, we study communication and connectedness, which have important implications for economic networks.
Node 𝑣 is called accessible from node 𝑢 if either 𝑢 = 𝑣 or there exists a sequence of edges that lead from 𝑢 to 𝑣.
• in this case, we write 𝑢 → 𝑣
(Visually, there is a sequence of arrows leading from 𝑢 to 𝑣.)
For example, suppose we have a directed graph representing a production network, where
• elements of 𝑉 are industrial sectors and
• existence of an edge (𝑖, 𝑗) means that 𝑖 supplies products or services to 𝑗.
Then 𝑚 → ℓ means that sector 𝑚 is an upstream supplier of sector ℓ.
Two nodes 𝑢 and 𝑣 are said to communicate if both 𝑢 → 𝑣 and 𝑣 → 𝑢.
A graph is called strongly connected if all nodes communicate.
For example, Fig. 33.2 is strongly connected; however, in Fig. 33.3, rich is not accessible from poor, so that graph is not strongly connected.


We can verify this by first constructing the graphs using Networkx and then using nx.is_strongly_connected.

fig, ax = plt.subplots()
G1 = nx.DiGraph()

G1.add_edges_from([('p', 'p'),('p','m'),('p','r'),
('m', 'p'), ('m', 'm'), ('m', 'r'),
('r', 'p'), ('r', 'm'), ('r', 'r')])

nx.draw_networkx(G1, with_labels = True)

nx.is_strongly_connected(G1) #checking if above graph is strongly connected

True

fig, ax = plt.subplots()
G2 = nx.DiGraph()

G2.add_edges_from([('p', 'p'),
('m', 'p'), ('m', 'm'), ('m', 'r'),
('r', 'p'), ('r', 'm'), ('r', 'r')])

nx.draw_networkx(G2, with_labels = True)


nx.is_strongly_connected(G2) #checking if above graph is strongly connected

False

33.4 Weighted graphs

We now introduce weighted graphs, where weights (numbers) are attached to each edge.

33.4.1 International private credit flows by country

To motivate the idea, consider the following figure which shows flows of funds (i.e., loans) between private banks, grouped
by country of origin.

The country codes are given in the following table

Code  Country          Code  Country    Code  Country        Code  Country
AU    Australia        DE    Germany    CL    Chile          ES    Spain
PT    Portugal         FR    France     TR    Turkey         GB    United Kingdom
US    United States    IE    Ireland    AT    Austria        IT    Italy
BE    Belgium          JP    Japan      SW    Switzerland    SE    Sweden

An arrow from Japan to the US indicates aggregate claims held by Japanese banks on all US-registered banks, as collected
by the Bank of International Settlements (BIS).
The size of each node in the figure is increasing in the total foreign claims of all other nodes on this node.


Fig. 33.4: International Credit Network


The widths of the arrows are proportional to the foreign claims they represent.
Notice that, in this network, an edge (𝑢, 𝑣) exists for almost every choice of 𝑢 and 𝑣 (i.e., almost every country in the
network).
(In fact, there are even more small arrows, which we have dropped for clarity.)
Hence the existence of an edge from one node to another is not particularly informative.
To understand the network, we need to record not just the existence or absence of a credit flow, but also the size of the
flow.
The correct data structure for recording this information is a “weighted directed graph”.

33.4.2 Definitions

A weighted directed graph is a directed graph to which we have added a weight function 𝑤 that assigns a positive
number to each edge.
The figure above shows one weighted directed graph, where the weights are the size of fund flows.
The following figure shows a weighted directed graph, with arrows representing edges of the induced directed graph.

Fig. 33.5: Weighted Poverty Trap

The numbers next to the edges are the weights.


In this case, you can think of the numbers on the arrows as transition probabilities for a household over, say, one year.
We see that a rich household has a 10% chance of becoming poor in one year.

33.5 Adjacency matrices

Another way that we can represent weights, which turns out to be very convenient for numerical work, is via a matrix.
The adjacency matrix of a weighted directed graph with nodes {𝑣1 , … , 𝑣𝑛 }, edges 𝐸 and weight function 𝑤 is the matrix

$$A = (a_{ij})_{1 \le i,j \le n} \quad \text{with} \quad a_{ij} = \begin{cases} w(v_i, v_j) & \text{if } (v_i, v_j) \in E \\ 0 & \text{otherwise.} \end{cases}$$

Once the nodes in 𝑉 are enumerated, the weight function and adjacency matrix provide essentially the same information.


For example, with {poor, middle, rich} mapped to {1, 2, 3} respectively, the adjacency matrix corresponding to the
weighted directed graph in Fig. 33.5 is

$$\begin{pmatrix} 0.9 & 0.1 & 0 \\ 0.4 & 0.4 & 0.2 \\ 0.1 & 0.1 & 0.8 \end{pmatrix}.$$

In QuantEcon’s DiGraph implementation, weights are recorded via the keyword weighted:

A = ((0.9, 0.1, 0.0),
     (0.4, 0.4, 0.2),
     (0.1, 0.1, 0.8))
A = np.array(A)
G = qe.DiGraph(A, weighted=True) # store weights

One of the key points to remember about adjacency matrices is that taking the transpose reverses all the arrows in the
associated directed graph.
For example, the following directed graph can be interpreted as a stylized version of a financial network, with nodes as
banks and edges showing the flow of funds.

G4 = nx.DiGraph()

G4.add_edges_from([('1','2'),
('2','1'),('2','3'),
('3','4'),
('4','2'),('4','5'),
('5','1'),('5','3'),('5','4')])
pos = nx.circular_layout(G4)

edge_labels={('1','2'): '100',
('2','1'): '50', ('2','3'): '200',
('3','4'): '100',
('4','2'): '500', ('4','5'): '50',
('5','1'): '150',('5','3'): '250', ('5','4'): '300'}

nx.draw_networkx(G4, pos, node_color = 'none',node_size = 500)


nx.draw_networkx_edge_labels(G4, pos, edge_labels=edge_labels)
nx.draw_networkx_nodes(G4, pos, linewidths= 0.5, edgecolors = 'black',
node_color = 'none',node_size = 500)

plt.show()


We see that bank 2 extends a loan of size 200 to bank 3.


The corresponding adjacency matrix is

$$A = \begin{pmatrix} 0 & 100 & 0 & 0 & 0 \\ 50 & 0 & 200 & 0 & 0 \\ 0 & 0 & 0 & 100 & 0 \\ 0 & 500 & 0 & 0 & 50 \\ 150 & 0 & 250 & 300 & 0 \end{pmatrix}.$$
The transpose is

$$A^\top = \begin{pmatrix} 0 & 50 & 0 & 0 & 150 \\ 100 & 0 & 0 & 500 & 0 \\ 0 & 200 & 0 & 0 & 250 \\ 0 & 0 & 100 & 0 & 300 \\ 0 & 0 & 0 & 50 & 0 \end{pmatrix}.$$
The corresponding network is visualized in the following figure which shows the network of liabilities after the loans have
been granted.
Both of these networks (original and transpose) are useful for analyzing financial markets.

G5 = nx.DiGraph()

G5.add_edges_from([('1','2'),('1','5'),
('2','1'),('2','4'),
('3','2'),('3','5'),
('4','3'),('4','5'),
('5','4')])

edge_labels = {('1','2'): '50', ('1','5'): '150',
               ('2','1'): '100', ('2','4'): '500',
               ('3','2'): '200', ('3','5'): '250',
               ('4','3'): '100', ('4','5'): '300',
               ('5','4'): '50'}

nx.draw_networkx(G5, pos, node_color = 'none',node_size = 500)


nx.draw_networkx_edge_labels(G5, pos, edge_labels=edge_labels)
nx.draw_networkx_nodes(G5, pos, linewidths= 0.5, edgecolors = 'black',
node_color = 'none',node_size = 500)

plt.show()

In general, every nonnegative 𝑛 × 𝑛 matrix 𝐴 = (𝑎𝑖𝑗 ) can be viewed as the adjacency matrix of a weighted directed
graph.
To build the graph we set 𝑉 = 1, … , 𝑛 and take the edge set 𝐸 to be all (𝑖, 𝑗) such that 𝑎𝑖𝑗 > 0.
For the weight function we set 𝑤(𝑖, 𝑗) = 𝑎𝑖𝑗 for all edges (𝑖, 𝑗).
We call this graph the weighted directed graph induced by 𝐴.
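As a quick illustration, Networkx can carry out this construction directly via nx.from_numpy_array; the snippet below is our own addition (not part of the original lecture code) and reuses the adjacency matrix of the weighted poverty trap from above:

A = np.array([[0.9, 0.1, 0.0],
              [0.4, 0.4, 0.2],
              [0.1, 0.1, 0.8]])

# edges are exactly the pairs (i, j) with A[i, j] > 0;
# each weight a_ij is stored as the edge attribute 'weight'
G_induced = nx.from_numpy_array(A, create_using=nx.DiGraph)
print(G_induced.edges(data=True))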

33.6 Properties

Consider a weighted directed graph with adjacency matrix 𝐴.


Let $a^k_{ij}$ be element 𝑖, 𝑗 of $A^k$, the 𝑘-th power of 𝐴.
The following result is useful in many applications:

Theorem 33.6.1


For distinct nodes 𝑖, 𝑗 in 𝑉 and any integer 𝑘 ≥ 1, we have $a^k_{ij} > 0$ if and only if there is a directed walk of length 𝑘 from 𝑖 to 𝑗.

In particular, 𝑗 is accessible from 𝑖 if and only if $a^k_{ij} > 0$ for some 𝑘.

The above result is obvious when 𝑘 = 1 and a proof of the general case can be found in [SS22].
Now recall from the eigenvalues lecture that a nonnegative matrix 𝐴 is called irreducible if for each (𝑖, 𝑗) there is an integer 𝑘 ≥ 0 such that $a^k_{ij} > 0$.
From the preceding theorem, it is not too difficult (see [SS22] for details) to get the next result.

Theorem 33.6.2
For a weighted directed graph the following statements are equivalent:
1. The directed graph is strongly connected.
2. The adjacency matrix of the graph is irreducible.

We illustrate the above theorem with a simple example.


Consider the following weighted directed graph.

We first create the above network as a Networkx DiGraph object.

G6 = nx.DiGraph()

G6.add_edges_from([('1','2'),('1','3'),
('2','1'),
('3','1'),('3','2')])

Then we construct the associated adjacency matrix A.

A = np.array([[0, 0.7, 0.3],   # adjacency matrix A
              [1, 0, 0],
              [0.4, 0.6, 0]])
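
The function is_irreducible used below comes from the eigenvalues lecture; if you are reading this section on its own, a minimal sketch based on the definition above (our own helper, not the original) is:

def is_irreducible(A):
    """
    Check irreducibility of a nonnegative matrix A by testing whether
    I + A + A^2 + ... + A^(n-1) has all entries strictly positive.
    """
    n = len(A)
    S = sum(np.linalg.matrix_power(A, k) for k in range(n))
    return np.all(S > 0)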

is_irreducible(A) # check irreducibility of A

True


nx.is_strongly_connected(G6) # check connectedness of graph

True

33.7 Network centrality

When studying networks of all varieties, a recurring topic is the relative “centrality” or “importance” of different nodes.
Examples include
• ranking of web pages by search engines
• determining the most important bank in a financial network (which one a central bank should rescue if there is a
financial crisis)
• determining the most important industrial sector in an economy.
In what follows, a centrality measure associates to each weighted directed graph a vector 𝑚, where $m_i$ is interpreted as the centrality (or rank) of node $v_i$.

33.7.1 Degree centrality

Two elementary measures of “importance” of a node in a given directed graph are its in-degree and out-degree.
Both of these provide a centrality measure.
In-degree centrality is a vector containing the in-degree of each node in the graph.
Consider the following simple example.

G7 = nx.DiGraph()

G7.add_nodes_from(['1','2','3','4','5','6','7'])

G7.add_edges_from([('1','2'),('1','6'),
('2','1'),('2','4'),
('3','2'),
('4','2'),
('5','3'),('5','4'),
('6','1'),
('7','4'),('7','6')])
pos = nx.planar_layout(G7)

nx.draw_networkx(G7, pos, node_color='none', node_size=500)


nx.draw_networkx_nodes(G7, pos, linewidths=0.5, edgecolors='black',
node_color='none',node_size=500)

plt.show()

The following code displays the in-degree centrality of all nodes.


Fig. 33.6: Sample Graph

iG7 = [G7.in_degree(v) for v in G7.nodes()] # computing in-degree centrality

for i, d in enumerate(iG7):
    print(i+1, d)

1 2
2 3
3 1
4 3
5 0
6 2
7 0

Consider the international credit network displayed in Fig. 33.4.


The following plot displays the in-degree centrality of each country.

D = qbn_io.build_unweighted_matrix(Z)
indegree = D.sum(axis=0)

def centrality_plot_data(countries, centrality_measures):
    df = pd.DataFrame({'code': countries,
                       'centrality': centrality_measures,
                       'color': qbn_io.colorise_weights(centrality_measures).tolist()
                       })
    return df.sort_values('centrality')


fig, ax = plt.subplots()

df = centrality_plot_data(countries, indegree)

ax.bar('code', 'centrality', data=df, color=df["color"], alpha=0.6)

patch = mpatches.Patch(color=None, label='in degree', visible=False)


ax.legend(handles=[patch], fontsize=12, loc="upper left", handlelength=0,
          frameon=False)

ax.set_ylim((0,20))

plt.show()

Unfortunately, while in-degree and out-degree centrality are simple to calculate, they are not always informative.
In Fig. 33.4, an edge exists between almost every pair of nodes, so a ranking based on in-degree or out-degree centrality fails to effectively separate the countries.
This can be seen in the above graph as well.
Another example is the task of a web search engine, which ranks pages by relevance whenever a user enters a search.
Suppose web page A has twice as many inbound links as page B.
In-degree centrality tells us that page A deserves a higher rank.
But in fact, page A might be less important than page B.
To see why, suppose that the links to A are from pages that receive almost no traffic, while the links to B are from pages
that receive very heavy traffic.
In this case, page B probably receives more visitors, which in turn suggests that page B contains more valuable (or enter-
taining) content.


Thinking about this point suggests that importance might be recursive.


This means that the importance of a given node depends on the importance of other nodes that link to it.
As another example, we can imagine a production network where the importance of a given sector depends on the im-
portance of the sectors that it supplies.
This reverses the order of the previous example: now the importance of a given node depends on the importance of other
nodes that it links to.
The next centrality measures will have these recursive features.

33.7.2 Eigenvector centrality

Suppose we have a weighted directed graph with adjacency matrix 𝐴.


For simplicity, we will suppose that the nodes 𝑉 of the graph are just the integers 1, … , 𝑛.
Let 𝑟(𝐴) denote the spectral radius of 𝐴.
The eigenvector centrality of the graph is defined as the 𝑛-vector 𝑒 that solves
$$e = \frac{1}{r(A)} A e. \qquad (33.1)$$
In other words, 𝑒 is the dominant eigenvector of 𝐴 (the eigenvector of the largest eigenvalue — see the discussion of the Perron-Frobenius theorem in the eigenvalue lecture).
To better understand (33.1), we write out the full expression for some element $e_i$:

$$e_i = \frac{1}{r(A)} \sum_{1 \le j \le n} a_{ij} e_j \qquad (33.2)$$

Note the recursive nature of the definition: the centrality obtained by node 𝑖 is proportional to a sum of the centrality of
all nodes, weighted by the rates of flow from 𝑖 into these nodes.
A node 𝑖 is highly ranked if
1. there are many edges leaving 𝑖,
2. these edges have large weights, and
3. the edges point to other highly ranked nodes.
Later, when we study demand shocks in production networks, there will be a more concrete interpretation of eigenvector
centrality.
We will see that, in production networks, sectors with high eigenvector centrality are important suppliers.
In particular, they are activated by a wide array of demand shocks once orders flow backwards through the network.
To compute eigenvector centrality we can use the following function.

def eigenvector_centrality(A, k=40, authority=False):
    """
    Computes the dominant eigenvector of A. Assumes A is
    primitive and uses the power method.
    """
    A_temp = A.T if authority else A
    n = len(A_temp)
    r = np.max(np.abs(np.linalg.eigvals(A_temp)))
    e = r**(-k) * (np.linalg.matrix_power(A_temp, k) @ np.ones(n))
    return e / np.sum(e)

Let’s compute eigenvector centrality for the graph generated in Fig. 33.6.

A = nx.to_numpy_array(G7) # compute adjacency matrix of graph

e = eigenvector_centrality(A)
n = len(e)

for i in range(n):
    print(i+1, e[i])

1 0.18580570704268035
2 0.18580570704268035
3 0.11483424225608216
4 0.11483424225608216
5 0.14194292957319637
6 0.11483424225608216
7 0.14194292957319637

While nodes 2 and 4 had the highest in-degree centrality, we can see that nodes 1 and 2 have the highest eigenvector
centrality.
Let’s revisit the international credit network in Fig. 33.4.

eig_central = eigenvector_centrality(Z)

fig, ax = plt.subplots()

df = centrality_plot_data(countries, eig_central)

ax.bar('code', 'centrality', data=df, color=df["color"], alpha=0.6)

patch = mpatches.Patch(color=None, visible=False)


ax.legend(handles=[patch], fontsize=12, loc="upper left", handlelength=0,
          frameon=False)

plt.show()

Countries that are rated highly according to this rank tend to be important players in terms of supply of credit.
Japan takes the highest rank according to this measure, although countries with large financial sectors such as Great Britain
and France are not far behind.
The advantage of eigenvector centrality is that it measures a node’s importance while considering the importance of its
neighbours.
A variant of eigenvector centrality is at the core of Google’s PageRank algorithm, which is used to rank web pages.
The main principle is that links from important nodes (as measured by degree centrality) are worth more than links from
unimportant nodes.


Fig. 33.7: Eigenvector centrality

33.7.3 Katz centrality

One problem with eigenvector centrality is that 𝑟(𝐴) might be zero, in which case 1/𝑟(𝐴) is not defined.
For this and other reasons, some researchers prefer another measure of centrality for networks called Katz centrality.
Fixing 𝛽 in (0, 1/𝑟(𝐴)), the Katz centrality of a weighted directed graph with adjacency matrix 𝐴 is defined as the
vector 𝜅 that solves
$$\kappa_i = \beta \sum_{1 \le j \le n} a_{ij} \kappa_j + 1 \quad \text{for all } i \in \{1, \ldots, n\}. \qquad (33.3)$$

Here 𝛽 is a parameter that we can choose.


In vector form we can write

$$\kappa = \mathbf{1} + \beta A \kappa \qquad (33.4)$$

where $\mathbf{1}$ is a column vector of ones.


The intuition behind this centrality measure is similar to that provided for eigenvector centrality: high centrality is con-
ferred on 𝑖 when it is linked to by nodes that themselves have high centrality.
Provided that 0 < 𝛽 < 1/𝑟(𝐴), Katz centrality is always finite and well-defined because then 𝑟(𝛽𝐴) < 1.
This means that (33.4) has the unique solution

$$\kappa = (I - \beta A)^{-1} \mathbf{1}$$

This follows from the Neumann series theorem.


The parameter 𝛽 is used to ensure that 𝜅 is finite.
When 𝑟(𝐴) < 1, we use 𝛽 = 1 as the default for Katz centrality computations.
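No Katz centrality code appears in this lecture; a minimal sketch based on the closed form above is given below (the function name, the default b=1, and the illustrative choice β = 0.5 are ours):

def katz_centrality(A, b=1):
    """
    Katz centrality κ = (I - b A)^{-1} 1 of the weighted directed graph
    with adjacency matrix A; requires b < 1 / r(A).
    """
    n = len(A)
    I = np.identity(n)
    return np.linalg.solve(I - b * A, np.ones(n))

# weighted poverty trap from Fig. 33.5; the matrix is row-stochastic, so r(A) = 1
A = np.array([[0.9, 0.1, 0.0],
              [0.4, 0.4, 0.2],
              [0.1, 0.1, 0.8]])
print(katz_centrality(A, b=0.5))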


33.7.4 Authorities vs hubs

Search engine designers recognize that web pages can be important in two different ways.
Some pages have high hub centrality, meaning that they link to valuable sources of information (e.g., news aggregation
sites).
Other pages have high authority centrality, meaning that they contain valuable information, as indicated by the number
and significance of incoming links (e.g., websites of respected news organizations).
Similar ideas can and have been applied to economic networks (often using different terminology).
The eigenvector centrality and Katz centrality measures we discussed above measure hub centrality.
(Nodes have high centrality if they point to other nodes with high centrality.)
If we care more about authority centrality, we can use the same definitions except that we take the transpose of the
adjacency matrix.
This works because taking the transpose reverses the direction of the arrows.
(Now nodes will have high centrality if they receive links from other nodes with high centrality.)
For example, the authority-based eigenvector centrality of a weighted directed graph with adjacency matrix 𝐴 is the
vector 𝑒 solving
$$e = \frac{1}{r(A)} A^\top e. \qquad (33.5)$$

The only difference from the original definition is that 𝐴 is replaced by its transpose.
(Transposes do not affect the spectral radius of a matrix so we wrote 𝑟(𝐴) instead of 𝑟(𝐴⊤ ).)
Element-by-element, this is given by
$$e_j = \frac{1}{r(A)} \sum_{1 \le i \le n} a_{ij} e_i \qquad (33.6)$$

We see that $e_j$ will be high if many nodes with high authority rankings link to 𝑗.
The following figure shows the authority-based eigenvector centrality ranking for the international credit network shown in Fig. 33.4.

ecentral_authority = eigenvector_centrality(Z, authority=True)

fig, ax = plt.subplots()

df = centrality_plot_data(countries, ecentral_authority)

ax.bar('code', 'centrality', data=df, color=df["color"], alpha=0.6)

patch = mpatches.Patch(color=None, visible=False)


ax.legend(handles=[patch], fontsize=12, loc="upper left", handlelength=0,
          frameon=False)

plt.show()

Highly ranked countries are those that attract large inflows of credit, or credit inflows from other major players.
In this case the US clearly dominates the rankings as a target of interbank credit.


Fig. 33.8: Eigenvector authority

33.8 Further reading

We apply the ideas discussed in this lecture when we study production networks later in the book.


Textbooks on economic and social networks include [Jac10], [EK+10], [BEJ18], [SS22] and [Goy23].
Within the realm of network science, the texts by [New18], [MFD20] and [Cos21] are excellent.

33.9 Exercises

Exercise 33.9.1
Here is a mathematical exercise for those who like proofs.
Let (𝑉 , 𝐸) be a directed graph and write 𝑢 ∼ 𝑣 if 𝑢 and 𝑣 communicate.
Show that ∼ is an equivalence relation on 𝑉 .

Solution to Exercise 33.9.1


Reflexivity:
Trivially, 𝑢 = 𝑣 ⇒ 𝑢 → 𝑣.
Thus, 𝑢 ∼ 𝑢.
Symmetry: Suppose 𝑢 ∼ 𝑣


⇒ 𝑢 → 𝑣 and 𝑣 → 𝑢.
By definition, this implies 𝑣 ∼ 𝑢.
Transitivity:
Suppose 𝑢 ∼ 𝑣 and 𝑣 ∼ 𝑤.
This implies 𝑢 → 𝑣 and 𝑣 → 𝑢, and also 𝑣 → 𝑤 and 𝑤 → 𝑣.
Thus we can conclude 𝑢 → 𝑣 → 𝑤 and 𝑤 → 𝑣 → 𝑢, which means 𝑢 ∼ 𝑤.

Exercise 33.9.2
Consider a directed graph 𝐺 with the set of nodes

𝑉 = {0, 1, 2, 3, 4, 5, 6, 7}

and the set of edges

𝐸 = {(0, 1), (0, 3), (1, 0), (2, 4), (3, 2), (3, 4), (3, 7), (4, 3), (5, 4), (5, 6), (6, 3), (6, 5), (7, 0)}

1. Use Networkx to draw graph 𝐺.


2. Find the associated adjacency matrix 𝐴 for 𝐺.
3. Use the functions defined above to compute in-degree centrality, out-degree centrality and eigenvector centrality
of G.

Solution to Exercise 33.9.2

# First, let's plot the given graph

G = nx.DiGraph()

G.add_nodes_from(np.arange(8)) # adding nodes

G.add_edges_from([(0,1),(0,3), # adding edges
                  (1,0),
                  (2,4),
                  (3,2),(3,4),(3,7),
                  (4,3),
                  (5,4),(5,6),
                  (6,3),(6,5),
                  (7,0)])

nx.draw_networkx(G, pos=nx.circular_layout(G), node_color='gray',
                 node_size=500, with_labels=True)

plt.show()


A = nx.to_numpy_array(G) #find adjacency matrix associated with G

array([[0., 1., 0., 1., 0., 0., 0., 0.],
       [1., 0., 0., 0., 0., 0., 0., 0.],
       [0., 0., 0., 0., 1., 0., 0., 0.],
       [0., 0., 1., 0., 1., 0., 0., 1.],
       [0., 0., 0., 1., 0., 0., 0., 0.],
       [0., 0., 0., 0., 1., 0., 1., 0.],
       [0., 0., 0., 1., 0., 1., 0., 0.],
       [1., 0., 0., 0., 0., 0., 0., 0.]])

oG = [G.out_degree(v) for v in G.nodes()] # computing out-degree centrality

for i, d in enumerate(oG):
    print(i, d)

0 2
1 1
2 1
3 3
4 1
5 2
6 2
7 1

e = eigenvector_centrality(A) # computing eigenvector centrality
n = len(e)

for i in range(n):
    print(i+1, e[i])

1 0.1458980838002507
2 0.0901698980074874
3 0.05572805602479352
4 0.14589810100962305
5 0.0901699482402499
6 0.1803397955498566
7 0.20162621936025152
8 0.0901698980074874

Exercise 33.9.3
Consider a graph 𝐺 with 𝑛 nodes and 𝑛 × 𝑛 adjacency matrix 𝐴.
Let $S = \sum_{k=0}^{n-1} A^k$.
We can say for any two nodes 𝑖 and 𝑗, 𝑗 is accessible from 𝑖 if and only if 𝑆𝑖𝑗 > 0.
Devise a function is_accessible that checks if any two nodes of a given graph are accessible.
Consider the graph in Exercise 33.9.2 and use this function to check if
1. 1 is accessible from 2
2. 6 is accessible from 3

Solution to Exercise 33.9.3

def is_accessible(G, i, j):
    """
    Check whether node j is accessible from node i in the directed graph G.
    """
    A = nx.to_numpy_array(G)
    n = len(A)
    result = np.zeros((n, n))
    for k in range(n):   # use k, not i, so the argument i is not overwritten
        result += np.linalg.matrix_power(A, k)
    return result[i, j] > 0

G = nx.DiGraph()

G.add_nodes_from(np.arange(8)) # adding nodes

G.add_edges_from([(0,1),(0,3), # adding edges
                  (1,0),
                  (2,4),
                  (3,2),(3,4),(3,7),
                  (4,3),
                  (5,4),(5,6),
                  (6,3),(6,5),
                  (7,0)])

is_accessible(G, 2, 1)

True

is_accessible(G, 3, 6)

False



Part X

Markets and Competitive Equilibrium

CHAPTER

THIRTYFOUR

SUPPLY AND DEMAND WITH MANY GOODS

34.1 Overview

In a previous lecture we studied supply, demand and welfare in a market with a single consumption good.
In this lecture, we study a setting with 𝑛 goods and 𝑛 corresponding prices.
Key infrastructure concepts that we’ll encounter in this lecture are
• inverse demand curves
• marginal utilities of wealth
• inverse supply curves
• consumer surplus
• producer surplus
• social welfare as a sum of consumer and producer surpluses
• competitive equilibrium
We will provide a version of the first fundamental welfare theorem, which was formulated by
• Leon Walras
• Francis Ysidro Edgeworth
• Vilfredo Pareto
Important extensions to the key ideas were obtained by
• Abba Lerner
• Harold Hotelling
• Paul Samuelson
• Kenneth Arrow
• Gerard Debreu
We shall describe two classic welfare theorems:
• first welfare theorem: for a given distribution of wealth among consumers, a competitive equilibrium allocation
of goods solves a social planning problem.
• second welfare theorem: An allocation of goods to consumers that solves a social planning problem can be
supported by a competitive equilibrium with an appropriate initial distribution of wealth.
As usual, we start by importing some Python modules.


# import some packages


import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
from scipy.linalg import inv

34.2 Formulas from linear algebra

We shall apply formulas from linear algebra that


• differentiate an inner product with respect to each vector
• differentiate a product of a matrix and a vector with respect to the vector
• differentiate a quadratic form in a vector with respect to the vector
Where 𝑎 is an 𝑛 × 1 vector, 𝐴 is an 𝑛 × 𝑛 matrix, and 𝑥 is an 𝑛 × 1 vector:

$$\frac{\partial a^\top x}{\partial x} = \frac{\partial x^\top a}{\partial x} = a$$

$$\frac{\partial A x}{\partial x} = A$$

$$\frac{\partial x^\top A x}{\partial x} = (A + A^\top) x$$
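
These formulas can be checked numerically with finite differences; the helper below and the random test vectors are our own, not part of the lecture:

def numerical_gradient(f, x, eps=1e-6):
    """Central finite-difference gradient of a scalar function f at x."""
    grad = np.zeros_like(x)
    for i in range(len(x)):
        h = np.zeros_like(x)
        h[i] = eps
        grad[i] = (f(x + h) - f(x - h)) / (2 * eps)
    return grad

rng = np.random.default_rng(0)
a, x = rng.random(3), rng.random(3)
A = rng.random((3, 3))

print(np.allclose(numerical_gradient(lambda x: a @ x, x), a))                  # ∂(a'x)/∂x = a
print(np.allclose(numerical_gradient(lambda x: x @ A @ x, x), (A + A.T) @ x))  # ∂(x'Ax)/∂x = (A + A')x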

34.3 From utility function to demand curve

Our study of consumers will use the following primitives


• Π be an 𝑚 × 𝑛 matrix,
• 𝑏 be an 𝑚 × 1 vector of bliss points,
• 𝑒 be an 𝑛 × 1 vector of endowments, and
We will analyze endogenous objects 𝑐 and 𝑝, where
• 𝑐 is an 𝑛 × 1 vector of consumptions of various goods,
• 𝑝 is an 𝑛 × 1 vector of prices
The matrix Π describes a consumer’s willingness to substitute one good for every other good.
We assume that Π has linearly independent columns, which implies that Π⊤ Π is a positive definite matrix.
• it follows that Π⊤ Π has an inverse.
We shall see below that (Π⊤ Π)−1 is a matrix of slopes of (compensated) demand curves for 𝑐 with respect to a vector of
prices:

$$\frac{\partial c}{\partial p} = (\Pi^\top \Pi)^{-1}$$
A consumer faces 𝑝 as a price taker and chooses 𝑐 to maximize the utility function
$$-\frac{1}{2}(\Pi c - b)^\top (\Pi c - b) \qquad (34.1)$$


subject to the budget constraint

𝑝⊤ (𝑐 − 𝑒) = 0 (34.2)

We shall specify examples in which Π and 𝑏 are such that it typically happens that

Π𝑐 ≪ 𝑏 (34.3)

This means that the consumer has much less of each good than he wants.
The deviation in (34.3) will ultimately assure us that competitive equilibrium prices are positive.

34.3.1 Demand curve implied by constrained utility maximization

For now, we assume that the budget constraint is (34.2).


So we’ll be deriving what is known as a Marshallian demand curve.
Our aim is to maximize (34.1) subject to (34.2).
Form a Lagrangian
$$L = -\frac{1}{2}(\Pi c - b)^\top(\Pi c - b) + \mu[p^\top(e - c)]$$
where 𝜇 is a Lagrange multiplier that is often called a marginal utility of wealth.
The consumer chooses 𝑐 to maximize 𝐿 and 𝜇 to minimize it.
First-order conditions for 𝑐 are
$$\frac{\partial L}{\partial c} = -\Pi^\top \Pi c + \Pi^\top b - \mu p = 0$$
so that, given 𝜇, the consumer chooses

$$c = (\Pi^\top \Pi)^{-1}(\Pi^\top b - \mu p) \qquad (34.4)$$

Substituting (34.4) into budget constraint (34.2) and solving for 𝜇 gives

$$\mu(p, e) = \frac{p^\top (\Pi^\top \Pi)^{-1} \Pi^\top b - p^\top e}{p^\top (\Pi^\top \Pi)^{-1} p}. \qquad (34.5)$$
Equation (34.5) tells how marginal utility of wealth depends on the endowment vector 𝑒 and the price vector 𝑝.

Note: Equation (34.5) is a consequence of imposing that 𝑝⊤ (𝑐 − 𝑒) = 0.


We could instead take 𝜇 as a parameter and use (34.4) and the budget constraint (34.6) to solve for wealth.
Which way we proceed determines whether we are constructing a Marshallian or Hicksian demand curve.
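To make formulas (34.4) and (34.5) concrete, here is a small numerical sketch of our own (the particular Π, 𝑏, 𝑒 and 𝑝 are illustrative only):

Π = np.array([[1.0, 0.0],
              [0.0, 1.0]])
b = np.array([5.0, 5.0])
e = np.array([1.0, 1.0])
p = np.array([1.0, 1.0])     # an arbitrary price vector faced by the consumer

slope = inv(Π.T @ Π)

# marginal utility of wealth, equation (34.5)
μ = (p @ slope @ Π.T @ b - p @ e) / (p @ slope @ p)

# Marshallian demand, equation (34.4)
c = slope @ (Π.T @ b - μ * p)

print('μ =', μ)
print('c =', c)
print('budget check p @ (c - e) =', p @ (c - e))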

34.4 Endowment economy

We now study a pure-exchange economy, or what is sometimes called an endowment economy.


Consider a single-consumer, multiple-goods economy without production.
The only source of goods is the single consumer’s endowment vector 𝑒.


A competitive equilibrium price vector induces the consumer to choose 𝑐 = 𝑒.


This implies that the equilibrium price vector satisfies

$$p = \mu^{-1}(\Pi^\top b - \Pi^\top \Pi e)$$

In the present case where we have imposed budget constraint in the form (34.2), we are free to normalize the price vector
by setting the marginal utility of wealth 𝜇 = 1 (or any other value for that matter).
This amounts to choosing a common unit (or numeraire) in which prices of all goods are expressed.
(Doubling all prices will affect neither quantities nor relative prices.)
We’ll set 𝜇 = 1.

Exercise 34.4.1
Verify that setting 𝜇 = 1 in (34.4) implies that formula (34.5) is satisfied.

Exercise 34.4.2
Verify that setting 𝜇 = 2 in (34.4) also implies that formula (34.5) is satisfied.

Here is a class that computes competitive equilibria for our economy.

class ExchangeEconomy:

def __init__(self,
Π,
b,
e,
thres=1.5):
"""
Set up the environment for an exchange economy

Args:
Π (np.array): shared matrix of substitution
b (list): the consumer's bliss point
e (list): the consumer's endowment
thres (float): a threshold to check the b >> Π e condition
"""

# check non-satiation
if np.min(b / np.max(Π @ e)) <= thres:
    raise Exception('set bliss points further away')

self.Π, self.b, self.e = Π, b, e

def competitive_equilibrium(self):
"""
Compute the competitive equilibrium prices and allocation
"""
Π, b, e = self.Π, self.b, self.e

# compute price vector with μ=1
p = Π.T @ b - Π.T @ Π @ e

# compute consumption vector
slope_dc = inv(Π.T @ Π)
Π_inv = inv(Π)
c = Π_inv @ b - slope_dc @ p

if any(c < 0):
    print('allocation: ', c)
    raise Exception('negative allocation: equilibrium does not exist')

return p, c

34.5 Digression: Marshallian and Hicksian demand curves

Sometimes we’ll use budget constraint (34.2) in situations in which a consumer’s endowment vector 𝑒 is his only source
of income.
Other times we’ll instead assume that the consumer has another source of income (positive or negative) and write his
budget constraint as

𝑝⊤ (𝑐 − 𝑒) = 𝑤 (34.6)

where 𝑤 is measured in “dollars” (or some other numeraire) and component 𝑝𝑖 of the price vector is measured in dollars
per unit of good 𝑖.
Whether the consumer’s budget constraint is (34.2) or (34.6) and whether we take 𝑤 as a free parameter or instead as an
endogenous variable will affect the consumer’s marginal utility of wealth.
Consequently, how we set 𝜇 determines whether we are constructing
• a Marshallian demand curve, as when we use (34.2) and solve for 𝜇 using equation (34.5) above, or
• a Hicksian demand curve, as when we treat 𝜇 as a fixed parameter and solve for 𝑤 from (34.6).
Marshallian and Hicksian demand curves contemplate different mental experiments:
For a Marshallian demand curve, hypothetical changes in a price vector have both substitution and income effects
• income effects are consequences of changes in 𝑝⊤ 𝑒 associated with the change in the price vector
For a Hicksian demand curve, hypothetical price vector changes have only substitution effects
• changes in the price vector leave the 𝑝⊤ 𝑒 + 𝑤 unaltered because we freeze 𝜇 and solve for 𝑤
Sometimes a Hicksian demand curve is called a compensated demand curve in order to emphasize that, to disarm the
income (or wealth) effect associated with a price change, the consumer’s wealth 𝑤 is adjusted.
We’ll discuss these distinct demand curves more below.


34.6 Dynamics and risk as special cases

Special cases of our 𝑛-good pure exchange model can be created to represent
• dynamics — by putting different dates on different commodities
• risk — by interpreting delivery of goods as being contingent on states of the world whose realizations are described
by a known probability distribution
Let’s illustrate how.

34.6.1 Dynamics

Suppose that we want to represent a utility function


$$-\frac{1}{2}\left[(c_1 - b_1)^2 + \beta(c_2 - b_2)^2\right]$$
where 𝛽 ∈ (0, 1) is a discount factor, 𝑐1 is consumption at time 1 and 𝑐2 is consumption at time 2.
To capture this with our quadratic utility function (34.1), set

$$\Pi = \begin{bmatrix} 1 & 0 \\ 0 & \sqrt{\beta} \end{bmatrix}, \qquad e = \begin{bmatrix} e_1 \\ e_2 \end{bmatrix} \qquad \text{and} \qquad b = \begin{bmatrix} b_1 \\ \sqrt{\beta}\, b_2 \end{bmatrix}$$

The budget constraint (34.2) becomes

$$p_1 c_1 + p_2 c_2 = p_1 e_1 + p_2 e_2$$

The left side is the discounted present value of consumption.


The right side is the discounted present value of the consumer’s endowment.
The relative price $p_1 / p_2$ has units of time-2 goods per unit of time-1 goods.

Consequently,

$$(1 + r) := R := \frac{p_1}{p_2}$$

is the gross interest rate and 𝑟 is the net interest rate.


Here is an example.

beta = 0.95

Π = np.array([[1, 0],
[0, np.sqrt(beta)]])

b = np.array([5, np.sqrt(beta) * 5])

e = np.array([1, 1])

dynamics = ExchangeEconomy(Π, b, e)
p, c = dynamics.competitive_equilibrium()

print('Competitive equilibrium price vector:', p)


print('Competitive equilibrium allocation:', c)

Competitive equilibrium price vector: [4. 3.8]


Competitive equilibrium allocation: [1. 1.]
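
Consistent with the discussion of interest rates above, the gross interest rate can be read off these equilibrium prices (a one-line check of our own):

R = p[0] / p[1]        # gross interest rate R = p1 / p2
print(R, 1 / beta)     # both equal 1/β ≈ 1.0526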

34.6.2 Risk and state-contingent claims

We study risk in the context of a static environment, meaning that there is only one period.
By risk we mean that an outcome is not known in advance, but that it is governed by a known probability distribution.
As an example, saying that our consumer confronts risk means in particular that
• there are two states of nature, 1 and 2.
• the consumer knows that the probability that state 1 occurs is 𝜆.
• the consumer knows that the probability that state 2 occurs is (1 − 𝜆).
Before the outcome is realized, the consumer’s expected utility is
$$-\frac{1}{2}\left[\lambda(c_1 - b_1)^2 + (1-\lambda)(c_2 - b_2)^2\right]$$
where
• 𝑐1 is consumption in state 1
• 𝑐2 is consumption in state 2
To capture these preferences we set

$$\Pi = \begin{bmatrix} \sqrt{\lambda} & 0 \\ 0 & \sqrt{1-\lambda} \end{bmatrix}, \qquad e = \begin{bmatrix} e_1 \\ e_2 \end{bmatrix}, \qquad b = \begin{bmatrix} \sqrt{\lambda}\, b_1 \\ \sqrt{1-\lambda}\, b_2 \end{bmatrix}$$

A consumer's consumption vector is

$$c = \begin{bmatrix} c_1 \\ c_2 \end{bmatrix}$$

A price vector is

$$p = \begin{bmatrix} p_1 \\ p_2 \end{bmatrix}$$

where 𝑝𝑖 is the price of one unit of consumption in state 𝑖 ∈ {1, 2}.


The state-contingent goods being traded are often called Arrow securities.
Before the random state of the world 𝑖 is realized, the consumer sells his/her state-contingent endowment bundle and
purchases a state-contingent consumption bundle.
Trading such state-contingent goods is one way economists often model insurance.
We use the tricks described above to interpret 𝑐1 , 𝑐2 as “Arrow securities” that are state-contingent claims to consumption
goods.
Here is an instance of the risk economy:

prob = 0.2

Π = np.array([[np.sqrt(prob), 0],
[0, np.sqrt(1 - prob)]])

b = np.array([np.sqrt(prob) * 5, np.sqrt(1 - prob) * 5])

e = np.array([1, 1])

risk = ExchangeEconomy(Π, b, e)
p, c = risk.competitive_equilibrium()

print('Competitive equilibrium price vector:', p)


print('Competitive equilibrium allocation:', c)

Competitive equilibrium price vector: [0.8 3.2]


Competitive equilibrium allocation: [1. 1.]

Exercise 34.6.1
Consider the instance above.
Please numerically study how each of the following cases affects the equilibrium prices and allocations:
• the consumer gets poorer,
• they like the first good more, or
• the probability that state 1 occurs is higher.
Hints. For each case choose some parameter 𝑒, 𝑏, or 𝜆 different from the instance.

Solution to Exercise 34.6.1


First consider when the consumer is poorer.
Here we just decrease the endowment.

risk.e = np.array([0.5, 0.5])

p, c = risk.competitive_equilibrium()

print('Competitive equilibrium price vector:', p)


print('Competitive equilibrium allocation:', c)


Competitive equilibrium price vector: [0.9 3.6]


Competitive equilibrium allocation: [0.5 0.5]

If the consumer likes the first (or second) good more, then we can set a larger bliss value for good 1.

risk.b = np.array([np.sqrt(prob) * 6, np.sqrt(1 - prob) * 5])


p, c = risk.competitive_equilibrium()

print('Competitive equilibrium price vector:', p)


print('Competitive equilibrium allocation:', c)

Competitive equilibrium price vector: [1.1 3.6]


Competitive equilibrium allocation: [0.5 0.5]

Increase the probability that state 1 occurs.

prob = 0.8

Π = np.array([[np.sqrt(prob), 0],
[0, np.sqrt(1 - prob)]])

b = np.array([np.sqrt(prob) * 5, np.sqrt(1 - prob) * 5])

e = np.array([1, 1])

risk = ExchangeEconomy(Π, b, e)
p, c = risk.competitive_equilibrium()

print('Competitive equilibrium price vector:', p)


print('Competitive equilibrium allocation:', c)

Competitive equilibrium price vector: [3.2 0.8]


Competitive equilibrium allocation: [1. 1.]

34.7 Economies with endogenous supplies of goods

Up to now we have described a pure exchange economy in which endowments of goods are exogenous, meaning that they
are taken as given from outside the model.

34.7.1 Supply curve of a competitive firm

A competitive firm that can produce goods takes a price vector 𝑝 as given and chooses a quantity 𝑞 to maximize total
revenue minus total costs.
The firm’s total revenue equals 𝑝⊤ 𝑞 and its total cost equals 𝐶(𝑞) where 𝐶(𝑞) is a total cost function

$$C(q) = h^\top q + \frac{1}{2} q^\top J q$$
and 𝐽 is a positive definite matrix.


So the firm’s profits are

𝑝⊤ 𝑞 − 𝐶(𝑞) (34.7)

An 𝑛 × 1 vector of marginal costs is

$$\frac{\partial C(q)}{\partial q} = h + Hq$$

where

$$H = \frac{1}{2}(J + J^\top)$$

The firm maximizes total profits by setting marginal revenue to marginal costs.

An 𝑛 × 1 vector of marginal revenues for the price-taking firm is $\frac{\partial p^\top q}{\partial q} = p$.
So price equals marginal revenue for our price-taking competitive firm.
This leads to the following inverse supply curve for the competitive firm:

𝑝 = ℎ + 𝐻𝑞
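
As a small check of our own (using the same ℎ and 𝐽 as the single-good example later in this lecture), the supply price at a given quantity is:

h = np.array([0.5])
J = np.array([[1.0]])
H = 0.5 * (J + J.T)

q = np.array([4.75])
print('supply price h + H q =', h + H @ q)   # 5.25, matching the equilibrium price found below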

34.7.2 Competitive equilibrium

To compute a competitive equilibrium for a production economy where demand curve is pinned down by the marginal
utility of wealth 𝜇, we first compute an allocation by solving a planning problem.
Then we compute the equilibrium price vector using the inverse demand or supply curve.

𝜇 = 1 warmup

As a special case, let’s pin down a demand curve by setting the marginal utility of wealth 𝜇 = 1.
Equating supply price to demand price and letting 𝑞 = 𝑐 we get

𝑝 = ℎ + 𝐻𝑐 = Π⊤ 𝑏 − Π⊤ Π𝑐,

which implies the equilibrium quantity vector

$$c = (\Pi^\top \Pi + H)^{-1}(\Pi^\top b - h) \qquad (34.8)$$

This equation is the counterpart of equilibrium quantity (7.3) for the scalar 𝑛 = 1 model with which we began.

General 𝜇 ≠ 1 case

Now let’s extend the preceding analysis to a more general case by allowing 𝜇 ≠ 1.
Then the inverse demand curve is

$$p = \mu^{-1}[\Pi^\top b - \Pi^\top \Pi c] \qquad (34.9)$$

Equating this to the inverse supply curve, letting 𝑞 = 𝑐 and solving for 𝑐 gives

$$c = [\Pi^\top \Pi + \mu H]^{-1}[\Pi^\top b - \mu h] \qquad (34.10)$$


34.7.3 Implementation

A Production Economy will consist of


• a single person that we’ll interpret as a representative consumer
• a single set of production costs
• a multiplier 𝜇 that weights “consumers” versus “producers” in a planner’s welfare function, as described above in
the main text
• an 𝑛 × 1 vector 𝑝 of competitive equilibrium prices
• an 𝑛 × 1 vector 𝑐 of competitive equilibrium quantities
• consumer surplus
• producer surplus
Here we define a class ProductionEconomy.

class ProductionEconomy:

def __init__(self,
Π,
b,
h,
J,
μ):
"""
Set up the environment for a production economy

Args:
Π (np.ndarray): matrix of substitution
b (np.array): bliss points
h (np.array): h in cost func
J (np.ndarray): J in cost func
μ (float): welfare weight of the corresponding planning problem
"""
self.n = len(b)
self.Π, self.b, self.h, self.J, self.μ = Π, b, h, J, μ

def competitive_equilibrium(self):
"""
Compute a competitive equilibrium of the production economy
"""
Π, b, h, μ, J = self.Π, self.b, self.h, self.μ, self.J
H = .5 * (J + J.T)

# allocation
c = inv(Π.T @ Π + μ * H) @ (Π.T @ b - μ * h)

# price
p = 1 / μ * (Π.T @ b - Π.T @ Π @ c)

# check non-satiation
if any(Π @ c - b >= 0):
    raise Exception('invalid result: set bliss points further away')

return c, p

def compute_surplus(self):
"""
Compute consumer and producer surplus for single good case
"""
if self.n != 1:
    raise Exception('not single good')
h, J, Π, b, μ = self.h.item(), self.J.item(), self.Π.item(), self.b.item(), self.μ

H = J

# supply/demand curve coefficients
s0, s1 = h, H
d0, d1 = 1 / μ * Π * b, 1 / μ * Π**2

# competitive equilibrium
c, p = self.competitive_equilibrium()

# calculate surplus
c_surplus = d0 * c - .5 * d1 * c**2 - p * c
p_surplus = p * c - s0 * c - .5 * s1 * c**2

return c_surplus, p_surplus

Then define a function that plots demand and supply curves and labels surpluses and equilibrium.
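The plotting helper itself is not reproduced in this chapter; a minimal single-good sketch consistent with the supply and demand coefficients used in compute_surplus is given below (the styling and the function body are our own; only the name plot_competitive_equilibrium is taken from the calls that follow):

def plot_competitive_equilibrium(PE):
    """
    Plot inverse demand and inverse supply for the single-good case,
    shade consumer and producer surplus, and mark the equilibrium.
    """
    h, J, Π, b, μ = PE.h.item(), PE.J.item(), PE.Π.item(), PE.b.item(), PE.μ
    s0, s1 = h, J                       # inverse supply:  p = s0 + s1 q
    d0, d1 = Π * b / μ, Π**2 / μ        # inverse demand:  p = d0 - d1 c

    c, p = PE.competitive_equilibrium()
    c, p = c.item(), p.item()

    grid = np.linspace(0, 2 * c, 100)
    demand = d0 - d1 * grid
    supply = s0 + s1 * grid

    fig, ax = plt.subplots()
    ax.plot(grid, demand, label='inverse demand')
    ax.plot(grid, supply, label='inverse supply')
    ax.fill_between(grid[grid <= c], demand[grid <= c], p, alpha=0.3,
                    label='consumer surplus')
    ax.fill_between(grid[grid <= c], p, supply[grid <= c], alpha=0.3,
                    label='producer surplus')
    ax.vlines(c, 0, p, linestyle='dashed', color='black', alpha=0.7)
    ax.hlines(p, 0, c, linestyle='dashed', color='black', alpha=0.7)
    ax.scatter(c, p, zorder=10, color='black')
    ax.set_xlabel('quantity')
    ax.set_ylabel('price')
    ax.legend()
    plt.show()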

Example: single agent with one good and production

Now let’s construct an example of a production economy with one good.


To do this we
• specify a single person and a cost curve in a way that let’s us replicate the simple single-good supply demand
example with which we started
• compute equilibrium 𝑝 and 𝑐 and consumer and producer surpluses
• draw graphs of both surpluses
• do experiments in which we shift 𝑏 and watch what happens to 𝑝, 𝑐.

Π = np.array([[1]]) # the matrix now is a singleton


b = np.array([10])
h = np.array([0.5])
J = np.array([[1]])
μ = 1

PE = ProductionEconomy(Π, b, h, J, μ)
c, p = PE.competitive_equilibrium()

print('Competitive equilibrium price:', p.item())


print('Competitive equilibrium allocation:', c.item())

# plot
plot_competitive_equilibrium(PE)


Competitive equilibrium price: 5.25


Competitive equilibrium allocation: 4.75

c_surplus, p_surplus = PE.compute_surplus()

print('Consumer surplus:', c_surplus.item())


print('Producer surplus:', p_surplus.item())

Consumer surplus: 11.28125


Producer surplus: 11.28125

Let’s give the consumer a lower welfare weight by raising 𝜇.

PE.μ = 2
c, p = PE.competitive_equilibrium()

print('Competitive equilibrium price:', p.item())


print('Competitive equilibrium allocation:', c.item())

# plot
plot_competitive_equilibrium(PE)

Competitive equilibrium price: 3.5


Competitive equilibrium allocation: 3.0


c_surplus, p_surplus = PE.compute_surplus()

print('Consumer surplus:', c_surplus.item())


print('Producer surplus:', p_surplus.item())

Consumer surplus: 2.25


Producer surplus: 4.5

Now we change the bliss point so that the consumer derives more utility from consumption.

PE.μ = 1
PE.b = PE.b * 1.5
c, p = PE.competitive_equilibrium()

print('Competitive equilibrium price:', p.item())


print('Competitive equilibrium allocation:', c.item())

# plot
plot_competitive_equilibrium(PE)

Competitive equilibrium price: 7.75


Competitive equilibrium allocation: 7.25


This raises both the equilibrium price and quantity.

Example: single agent two-good economy with production

• we’ll do some experiments like those above


• we can do experiments with a diagonal Π and also with a non-diagonal Π matrices to study how cross-slopes
affect responses of 𝑝 and 𝑐 to various shifts in 𝑏 (TODO)

Π = np.array([[1, 0],
[0, 1]])

b = np.array([10, 10])

h = np.array([0.5, 0.5])

J = np.array([[1, 0.5],
[0.5, 1]])
μ = 1

PE = ProductionEconomy(Π, b, h, J, μ)
c, p = PE.competitive_equilibrium()

print('Competitive equilibrium price:', p)


print('Competitive equilibrium allocation:', c)

Competitive equilibrium price: [6.2 6.2]


Competitive equilibrium allocation: [3.8 3.8]


PE.b = np.array([12, 10])

c, p = PE.competitive_equilibrium()

print('Competitive equilibrium price:', p)


print('Competitive equilibrium allocation:', c)

Competitive equilibrium price: [7.13333333 6.46666667]


Competitive equilibrium allocation: [4.86666667 3.53333333]

PE.Π = np.array([[1, 0.5],


[0.5, 1]])

PE.b = np.array([10, 10])

c, p = PE.competitive_equilibrium()

print('Competitive equilibrium price:', p)


print('Competitive equilibrium allocation:', c)

Competitive equilibrium price: [6.3 6.3]


Competitive equilibrium allocation: [3.86666667 3.86666667]

PE.b = np.array([12, 10])


c, p = PE.competitive_equilibrium()

print('Competitive equilibrium price:', p)


print('Competitive equilibrium allocation:', c)

Competitive equilibrium price: [7.23333333 6.56666667]


Competitive equilibrium allocation: [4.93333333 3.6 ]

34.7.4 Digression: a supplier who is a monopolist

A competitive firm is a price-taker who regards the price and therefore its marginal revenue as being beyond its control.
A monopolist knows that it has no competition and can influence the price and its marginal revenue by setting quantity.
A monopolist takes a demand curve and not the price as beyond its control.
Thus, instead of being a price-taker, a monopolist sets prices to maximize profits subject to the inverse demand curve
(34.9).
So the monopolist’s total profits as a function of its output 𝑞 is
$$[\mu^{-1}\Pi^\top(b - \Pi q)]^\top q - h^\top q - \frac{1}{2} q^\top J q \qquad (34.11)$$
After finding first-order necessary conditions for maximizing monopoly profits with respect to 𝑞 and solving them for 𝑞,
we find that the monopolist sets

$$q = (H + 2\mu^{-1}\Pi^\top \Pi)^{-1}(\mu^{-1}\Pi^\top b - h) \qquad (34.12)$$

We’ll soon see that a monopolist sets a lower output 𝑞 than does either
• a planner who chooses 𝑞 to maximize social welfare, or
• a competitive equilibrium.

Exercise 34.7.1
Please verify the monopolist’s supply curve (34.12).

34.7.5 A monopolist

Let’s consider a monopolist supplier.


We have included a method in our ProductionEconomy class to compute an equilibrium price and allocation when
the supplier is a monopolist.
Since the supplier now has the price-setting power
• we first compute the optimal quantity that solves the monopolist’s profit maximization problem.
• Then we back out an equilibrium price from the consumer’s inverse demand curve.
Next, we use a graph for the single good case to illustrate the difference between a competitive equilibrium and an
equilibrium with a monopolist supplier.
Recall that in a competitive equilibrium, a price-taking supplier equates marginal revenue 𝑝 to marginal cost ℎ + 𝐻𝑞.
This yields a competitive producer’s inverse supply curve.
A monopolist’s marginal revenue is not constant but instead is a non-trivial function of the quantity it sets.
The monopolist’s marginal revenue is

$$MR(q) = -2\mu^{-1}\Pi^\top \Pi q + \mu^{-1}\Pi^\top b,$$

which the monopolist equates to its marginal cost.


The plot indicates that the monopolist sets output lower than the competitive equilibrium quantity.
In a single good case, this equilibrium is associated with a higher price of the good.

class Monopoly(ProductionEconomy):

def __init__(self,
Π,
b,
h,
J,
μ):
"""
Inherit all properties and methods from class ProductionEconomy
"""
super().__init__(Π, b, h, J, μ)

def equilibrium_with_monopoly(self):
"""
Compute the equilibrium price and allocation when there is a monopolist␣
↪supplier

"""
(continues on next page)

34.7. Economies with endogenous supplies of goods 545


A First Course in Quantitative Economics with Python

(continued from previous page)


Π, b, h, μ, J = self.Π, self.b, self.h, self.μ, self.J
H = .5 * (J + J.T)

# allocation
q = inv(μ * H + 2 * Π.T @ Π) @ (Π.T @ b - μ * h)

# price
p = 1 / μ * (Π.T @ b - Π.T @ Π @ q)

if any(Π @ q - b >= 0):
    raise Exception('invalid result: set bliss points further away')

return q, p

Define a function that plots the demand, marginal cost and marginal revenue curves with surpluses and equilibrium la-
belled.
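
That helper is likewise not reproduced; a minimal single-good sketch based on the marginal revenue formula above is given below (surplus shading is omitted for brevity, and only the name plot_monopoly comes from the call that follows):

def plot_monopoly(M):
    """
    Plot inverse demand, marginal cost and marginal revenue for the
    single-good case, marking both the competitive and monopoly outcomes.
    """
    h, J, Π, b, μ = M.h.item(), M.J.item(), M.Π.item(), M.b.item(), M.μ
    H = J

    c, p = M.competitive_equilibrium()
    q, pm = M.equilibrium_with_monopoly()
    c, p, q, pm = c.item(), p.item(), q.item(), pm.item()

    grid = np.linspace(0, 2 * c, 100)
    demand = 1 / μ * (Π * b - Π**2 * grid)      # inverse demand
    mc = h + H * grid                           # marginal cost
    mr = 1 / μ * (Π * b - 2 * Π**2 * grid)      # marginal revenue

    fig, ax = plt.subplots()
    ax.plot(grid, demand, label='inverse demand')
    ax.plot(grid, mc, label='marginal cost')
    ax.plot(grid, mr, label='marginal revenue')
    ax.scatter([c, q], [p, pm], color='black', zorder=10)
    ax.annotate('competitive', (c, p))
    ax.annotate('monopoly', (q, pm))
    ax.set_xlabel('quantity')
    ax.set_ylabel('price')
    ax.legend()
    plt.show()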

A multiple good example

Let’s compare competitive equilibrium and monopoly outcomes in a multiple goods economy.

Π = np.array([[1, 0],
[0, 1.2]])

b = np.array([10, 10])

h = np.array([0.5, 0.5])

J = np.array([[1, 0.5],
[0.5, 1]])
μ = 1

M = Monopoly(Π, b, h, J, μ)
c, p = M.competitive_equilibrium()
q, pm = M.equilibrium_with_monopoly()

print('Competitive equilibrium price:', p)


print('Competitive equilibrium allocation:', c)

print('Equilibrium with monopolist supplier price:', pm)


print('Equilibrium with monopolist supplier allocation:', q)

Competitive equilibrium price: [6.23542117 6.32397408]


Competitive equilibrium allocation: [3.76457883 3.94168467]
Equilibrium with monopolist supplier price: [7.26865672 8.23880597]
Equilibrium with monopolist supplier allocation: [2.73134328 2.6119403 ]


A single-good example

Π = np.array([[1]]) # the matrix now is a singleton


b = np.array([10])
h = np.array([0.5])
J = np.array([[1]])
μ = 1

M = Monopoly(Π, b, h, J, μ)
c, p = M.competitive_equilibrium()
q, pm = M.equilibrium_with_monopoly()

print('Competitive equilibrium price:', p.item())


print('Competitive equilibrium allocation:', c.item())

print('Equilibrium with monopolist supplier price:', pm.item())


print('Equilibrium with monopolist supplier allocation:', q.item())

# plot
plot_monopoly(M)

Competitive equilibrium price: 5.25


Competitive equilibrium allocation: 4.75
Equilibrium with monopolist supplier price: 6.833333333333334
Equilibrium with monopolist supplier allocation: 3.1666666666666665


34.8 Multi-good welfare maximization problem

Our welfare maximization problem – also sometimes called a social planning problem – is to choose 𝑐 to maximize
$$-\frac{1}{2}\mu^{-1}(\Pi c - b)^\top(\Pi c - b)$$

minus the area under the inverse supply curve, namely,

$$h^\top c + \frac{1}{2} c^\top J c$$

So the welfare criterion is

$$-\frac{1}{2}\mu^{-1}(\Pi c - b)^\top(\Pi c - b) - h^\top c - \frac{1}{2} c^\top J c$$
In this formulation, 𝜇 is a parameter that describes how the planner weighs interests of outside suppliers and our repre-
sentative consumer.
The first-order condition with respect to 𝑐 is

$$-\mu^{-1}\Pi^\top \Pi c + \mu^{-1}\Pi^\top b - h - Hc = 0$$

which implies (34.10).


Thus, as for the single-good case, with multiple goods a competitive equilibrium quantity vector solves a planning problem.
(This is another version of the first welfare theorem.)
We can deduce a competitive equilibrium price vector from either
• the inverse demand curve, or
• the inverse supply curve


CHAPTER

THIRTYFIVE

MARKET EQUILIBRIUM WITH HETEROGENEITY

Contents

• Market Equilibrium with Heterogeneity


– Overview
– A simple example
– Pure exchange economy
– Implementation
– Deducing a representative consumer

35.1 Overview

In the previous lecture, we studied competitive equilibria in an economy with many goods.
While the results of the study were informative, we used a strong simplifying assumption: all of the agents in the economy
are identical.
In the real world, households, firms and other economic agents differ from one another along many dimensions.
In this lecture, we introduce heterogeneity across consumers by allowing their preferences and endowments to differ.
We will examine competitive equilibrium in this setting.
We will also show how a “representative consumer” can be constructed.
Here are some imports:

import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
from scipy.linalg import inv


35.2 A simple example

Let’s study a simple example of pure exchange economy without production.


There are two consumers who differ in their endowment vectors 𝑒𝑖 and their bliss-point vectors 𝑏𝑖 for 𝑖 = 1, 2.
The total endowment is 𝑒1 + 𝑒2 .
A competitive equilibrium requires that

$$c_1 + c_2 = e_1 + e_2$$

Assume the demand curves

𝑐𝑖 = (Π⊤ Π)−1 (Π⊤ 𝑏𝑖 − 𝜇𝑖 𝑝)

Competitive equilibrium then requires that

𝑒1 + 𝑒2 = (Π⊤ Π)−1 (Π⊤ (𝑏1 + 𝑏2 ) − (𝜇1 + 𝜇2 )𝑝)

which, after a line or two of linear algebra, implies that

(𝜇1 + 𝜇2 )𝑝 = Π⊤ (𝑏1 + 𝑏2 ) − Π⊤ Π(𝑒1 + 𝑒2 ) (35.1)

We can normalize prices by setting 𝜇1 + 𝜇2 = 1 and then solving

$$\mu_i(p, e) = \frac{p^\top(\Pi^{-1} b_i - e_i)}{p^\top(\Pi^\top \Pi)^{-1} p} \qquad (35.2)$$

for 𝜇𝑖 , 𝑖 = 1, 2.

Exercise 35.2.1
Show that, up to normalization by a positive scalar, the same competitive equilibrium price vector that you computed
in the preceding two-consumer economy would prevail in a single-consumer economy in which a single representative
consumer has utility function

−.5(Π𝑐 − 𝑏)⊤ (Π𝑐 − 𝑏)

and endowment vector 𝑒, where

𝑏 = 𝑏 1 + 𝑏2

and

𝑒 = 𝑒 1 + 𝑒2 .

35.3 Pure exchange economy

Let’s further explore a pure exchange economy with 𝑛 goods and 𝑚 people.


35.3.1 Competitive equilibrium

We’ll compute a competitive equilibrium.


To compute a competitive equilibrium of a pure exchange economy, we use the fact that
• Relative prices in a competitive equilibrium are the same as those in a special single person or representative
consumer economy with preference Π and 𝑏 = ∑𝑖 𝑏𝑖 , and endowment 𝑒 = ∑𝑖 𝑒𝑖 .
We can use the following steps to compute a competitive equilibrium:
• First we solve the single representative consumer economy by normalizing 𝜇 = 1. Then, we renormalize the price
vector by using the first consumption good as a numeraire.
• Next we use the competitive equilibrium prices to compute each consumer’s marginal utility of wealth:
$$\mu_i = \frac{-W_i + p^\top(\Pi^{-1} b_i - e_i)}{p^\top(\Pi^\top \Pi)^{-1} p}$$

• Finally we compute a competitive equilibrium allocation by using the demand curves:

$$c_i = \Pi^{-1} b_i - (\Pi^\top \Pi)^{-1}\mu_i p$$

35.3.2 Designing some Python code

Below we shall construct a Python class with the following attributes:


• Preferences in the form of
– an 𝑛 × 𝑛 positive definite matrix Π
– an 𝑛 × 1 vector of bliss points 𝑏
• Endowments in the form of
– an 𝑛 × 1 vector 𝑒
– a scalar “wealth” 𝑊 with default value 0
The class will include a test to make sure that 𝑏 ≫ Π𝑒 and raise an exception if it is violated (at some threshold level we’d
have to specify).
• A Person in the form of a pair that consists of
– Preferences and Endowments
• A Pure Exchange Economy will consist of
– a collection of 𝑚 persons
∗ 𝑚 = 1 for our single-agent economy
∗ 𝑚 = 2 for our illustrations of a pure exchange economy
– an equilibrium price vector 𝑝 (normalized somehow)
– an equilibrium allocation 𝑐1 , 𝑐2 , … , 𝑐𝑚 – a collection of 𝑚 vectors of dimension 𝑛 × 1
Now let’s proceed to code.

class ExchangeEconomy:
def __init__(self,
Π,
bs,
es,
Ws=None,
thres=1.5):
"""
Set up the environment for an exchange economy

Args:
Π (np.array): shared matrix of substitution
bs (list): all consumers' bliss points
es (list): all consumers' endowments
Ws (list): all consumers' wealth
thres (float): a threshold set to test b >> Pi e violated
"""
n, m = Π.shape[0], len(bs)

# check non-satiation
for b, e in zip(bs, es):
    if np.min(b / np.max(Π @ e)) <= thres:
        raise Exception('set bliss points further away')

if Ws is None:
    Ws = np.zeros(m)
else:
    if sum(Ws) != 0:
        raise Exception('invalid wealth distribution')

self.Π, self.bs, self.es, self.Ws, self.n, self.m = Π, bs, es, Ws, n, m

def competitive_equilibrium(self):
"""
Compute the competitive equilibrium prices and allocation
"""
Π, bs, es, Ws = self.Π, self.bs, self.es, self.Ws
n, m = self.n, self.m
slope_dc = inv(Π.T @ Π)
Π_inv = inv(Π)

# aggregate
b = sum(bs)
e = sum(es)

# compute price vector with mu=1 and renormalize
p = Π.T @ b - Π.T @ Π @ e
p = p / p[0]

# compute marginal utility of wealth
μ_s = []
c_s = []
A = p.T @ slope_dc @ p

for i in range(m):
    μ_i = (-Ws[i] + p.T @ (Π_inv @ bs[i] - es[i])) / A
    c_i = Π_inv @ bs[i] - μ_i * slope_dc @ p
    μ_s.append(μ_i)
    c_s.append(c_i)

for c_i in c_s:
    if any(c_i < 0):
        print('allocation: ', c_s)
        raise Exception('negative allocation: equilibrium does not exist')

return p, c_s, μ_s

35.4 Implementation

Next we use the class ExchangeEconomy defined above to study


• a two-person economy without production,
• a dynamic economy, and
• an economy with risk and arrow securities.

35.4.1 Two-person economy without production

Here we study how competitive equilibrium 𝑝, 𝑐1 , 𝑐2 respond to different 𝑏𝑖 and 𝑒𝑖 , 𝑖 ∈ {1, 2}.

Π = np.array([[1, 0],
[0, 1]])

bs = [np.array([5, 5]), # first consumer's bliss points


np.array([5, 5])] # second consumer's bliss points

es = [np.array([0, 2]), # first consumer's endowment


np.array([2, 0])] # second consumer's endowment

EE = ExchangeEconomy(Π, bs, es)


p, c_s, μ_s = EE.competitive_equilibrium()

print('Competitive equilibrium price vector:', p)


print('Competitive equilibrium allocation:', c_s)

Competitive equilibrium price vector: [1. 1.]


Competitive equilibrium allocation: [array([1., 1.]), array([1., 1.])]

What happens if the first consumer likes the first good more and the second consumer likes the second good more?

EE.bs = [np.array([6, 5]),    # first consumer's bliss points
         np.array([5, 6])]    # second consumer's bliss points

p, c_s, μ_s = EE.competitive_equilibrium()

print('Competitive equilibrium price vector:', p)
print('Competitive equilibrium allocation:', c_s)

Competitive equilibrium price vector: [1. 1.]


Competitive equilibrium allocation: [array([1.5, 0.5]), array([0.5, 1.5])]


Let the first consumer be poorer.

EE.es = [np.array([0.5, 0.5]),    # first consumer's endowment
         np.array([1, 1])]        # second consumer's endowment

p, c_s, μ_s = EE.competitive_equilibrium()

print('Competitive equilibrium price vector:', p)
print('Competitive equilibrium allocation:', c_s)

Competitive equilibrium price vector: [1. 1.]


Competitive equilibrium allocation: [array([1., 0.]), array([0.5, 1.5])]

Now let’s construct an autarky (i.e., no-trade) equilibrium.

EE.bs = [np.array([4, 6]),    # first consumer's bliss points
         np.array([6, 4])]    # second consumer's bliss points

EE.es = [np.array([0, 2]),    # first consumer's endowment
         np.array([2, 0])]    # second consumer's endowment

p, c_s, μ_s = EE.competitive_equilibrium()

print('Competitive equilibrium price vector:', p)
print('Competitive equilibrium allocation:', c_s)

Competitive equilibrium price vector: [1. 1.]


Competitive equilibrium allocation: [array([0., 2.]), array([2., 0.])]
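As a quick check (a small aside, not needed for what follows), the equilibrium allocation coincides with the endowments, confirming that no trade takes place:

print([np.allclose(c, e) for c, e in zip(c_s, EE.es)])   # [True, True]: autarky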

Now let’s redistribute endowments before trade.

bs = [np.array([5, 5]),    # first consumer's bliss points
      np.array([5, 5])]    # second consumer's bliss points

es = [np.array([1, 1]),    # first consumer's endowment
      np.array([1, 1])]    # second consumer's endowment

Ws = [0.5, -0.5]
EE_new = ExchangeEconomy(Π, bs, es, Ws)
p, c_s, μ_s = EE_new.competitive_equilibrium()

print('Competitive equilibrium price vector:', p)
print('Competitive equilibrium allocation:', c_s)

Competitive equilibrium price vector: [1. 1.]


Competitive equilibrium allocation: [array([1.25, 1.25]), array([0.75, 0.75])]


35.4.2 A dynamic economy

Now let’s use the tricks described above to study a dynamic economy, one with two periods.

beta = 0.95

Π = np.array([[1, 0],
              [0, np.sqrt(beta)]])

bs = [np.array([5, np.sqrt(beta) * 5])]

es = [np.array([1, 1])]

EE_DE = ExchangeEconomy(Π, bs, es)
p, c_s, μ_s = EE_DE.competitive_equilibrium()

print('Competitive equilibrium price vector:', p)
print('Competitive equilibrium allocation:', c_s)

Competitive equilibrium price vector: [1. 0.95]


Competitive equilibrium allocation: [array([1., 1.])]
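As a sanity check (a sketch reusing the objects just computed, not part of the lecture's own code), the relative price of the second-period good should equal the discount factor:

print(p[1] / p[0], beta)   # both equal 0.95 (up to floating point)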

35.4.3 Risk economy with Arrow securities

We use the tricks described above to interpret 𝑐1 , 𝑐2 as “Arrow securities” that are state-contingent claims to consumption
goods.

prob = 0.7

Π = np.array([[np.sqrt(prob), 0],
              [0, np.sqrt(1 - prob)]])

bs = [np.array([np.sqrt(prob) * 5, np.sqrt(1 - prob) * 5]),
      np.array([np.sqrt(prob) * 5, np.sqrt(1 - prob) * 5])]

es = [np.array([1, 0]),
      np.array([0, 1])]

EE_AS = ExchangeEconomy(Π, bs, es)
p, c_s, μ_s = EE_AS.competitive_equilibrium()

print('Competitive equilibrium price vector:', p)
print('Competitive equilibrium allocation:', c_s)

Competitive equilibrium price vector: [1. 0.42857143]


Competitive equilibrium allocation: [array([0.7, 0.7]), array([0.3, 0.3])]
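As a quick check (again a small aside), in this example the relative price of the bad-state claim equals the probability ratio:

print(p[1] / p[0], (1 - prob) / prob)   # both ≈ 0.4286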


35.5 Deducing a representative consumer

In the class of multiple consumer economies that we are studying here, it turns out that there exists a single representative
consumer whose preferences and endowments can be deduced from lists of preferences and endowments for separate
individual consumers.
Consider a multiple consumer economy with an initial distribution of wealth $W_i$ satisfying $\sum_i W_i = 0$.

We allow an initial redistribution of wealth.

We have the following objects:

• The demand curve:

$$c_i = \Pi^{-1} b_i - (\Pi^\top \Pi)^{-1} \mu_i p$$

• The marginal utility of wealth:

$$\mu_i = \frac{-W_i + p^\top (\Pi^{-1} b_i - e_i)}{p^\top (\Pi^\top \Pi)^{-1} p}$$

• Market clearing:

$$\sum_i c_i = \sum_i e_i$$

Denote aggregate consumption $\sum_i c_i = c$ and $\sum_i \mu_i = \mu$.

Market clearing requires

$$\Pi^{-1} \left(\sum_i b_i\right) - (\Pi^\top \Pi)^{-1} p \left(\sum_i \mu_i\right) = \sum_i e_i$$

which, after a few steps, leads to

$$p = \mu^{-1} (\Pi^\top b - \Pi^\top \Pi e)$$

where

$$\mu = \sum_i \mu_i = \frac{0 + p^\top (\Pi^{-1} b - e)}{p^\top (\Pi^\top \Pi)^{-1} p}.$$

Now consider the representative consumer economy specified above.

Denote the marginal utility of wealth of the representative consumer by $\tilde{\mu}$.

The demand function is

$$c = \Pi^{-1} b - (\Pi^\top \Pi)^{-1} \tilde{\mu} p$$

Substituting this into the budget constraint gives

$$\tilde{\mu} = \frac{p^\top (\Pi^{-1} b - e)}{p^\top (\Pi^\top \Pi)^{-1} p}$$

In an equilibrium $c = e$, so

$$p = \tilde{\mu}^{-1} (\Pi^\top b - \Pi^\top \Pi e)$$

Thus, we have verified that, up to the choice of a numeraire in which to express absolute prices, the price vector in our
representative consumer economy is the same as that in an underlying economy with multiple consumers.
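We can also illustrate this numerically with the ExchangeEconomy class defined above; the bliss points and endowments below are example values chosen only for this check (they are not used elsewhere in the lecture):

Π_check = np.eye(2)
bs_check = [np.array([6, 4]), np.array([5, 4])]   # example bliss points
es_check = [np.array([0, 2]), np.array([2, 0])]   # example endowments

# price vector of the two-consumer economy
p_multi, _, _ = ExchangeEconomy(Π_check, bs_check, es_check).competitive_equilibrium()

# price vector of the representative consumer economy with aggregated b and e
p_rep, _, _ = ExchangeEconomy(Π_check, [sum(bs_check)], [sum(es_check)]).competitive_equilibrium()

print(p_multi, p_rep)   # the two (normalized) price vectors coincide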



Part XI

Estimation

CHAPTER

THIRTYSIX

SIMPLE LINEAR REGRESSION MODEL

import numpy as np
import pandas as pd
import matplotlib.pyplot as plt

The simple regression model estimates the relationship between two variables $x_i$ and $y_i$

$$y_i = \alpha + \beta x_i + \epsilon_i, \qquad i = 1, 2, \ldots, N$$

where 𝜖𝑖 represents the error between the line of best fit and the sample values for 𝑦𝑖 given 𝑥𝑖 .
Our goal is to choose values for 𝛼 and 𝛽 to build a line of “best” fit for some data that is available for variables 𝑥𝑖 and 𝑦𝑖 .
Let us consider a simple dataset of 10 observations for variables 𝑥𝑖 and 𝑦𝑖 :

          𝑦𝑖     𝑥𝑖
  1     2000     32
  2     1000     21
  3     1500     24
  4     2500     35
  5      500     10
  6      900     11
  7     1100     22
  8     1500     21
  9     1800     27
 10      250      2

Let us think about 𝑦𝑖 as sales for an ice-cream cart, while 𝑥𝑖 is a variable that records the day’s temperature in Celsius.

x = [32, 21, 24, 35, 10, 11, 22, 21, 27, 2]
y = [2000, 1000, 1500, 2500, 500, 900, 1100, 1500, 1800, 250]
df = pd.DataFrame([x, y]).T
df.columns = ['X', 'Y']
df

X Y
0 32 2000
1 21 1000
2 24 1500
3 35 2500
4 10 500
5 11 900
6 22 1100
7 21 1500
8 27 1800
9 2 250

We can use a scatter plot of the data to see the relationship between 𝑦𝑖 (ice-cream sales in dollars ($’s)) and 𝑥𝑖 (degrees
Celsius).

ax = df.plot(
    x='X',
    y='Y',
    kind='scatter',
    ylabel='Ice-Cream Sales ($\'s)',
    xlabel='Degrees Celsius'
)

As you can see, the data suggest that more ice-cream is typically sold on hotter days.
To build a linear model of the data we need to choose values for 𝛼 and 𝛽 that represent a line of “best” fit such that

$$\hat{y}_i = \hat{\alpha} + \hat{\beta} x_i$$

Let’s start with 𝛼 = 5 and 𝛽 = 10

α = 5
β = 10
df['Y_hat'] = α + β * df['X']


fig, ax = plt.subplots()
df.plot(x='X',y='Y', kind='scatter', ax=ax)
df.plot(x='X',y='Y_hat', kind='line', ax=ax)

<Axes: xlabel='X', ylabel='Y'>

We can see that this model does a poor job of estimating the relationship.
We can continue to guess and iterate towards a line of “best” fit by adjusting the parameters

β = 100
df['Y_hat'] = α + β * df['X']

fig, ax = plt.subplots()
df.plot(x='X',y='Y', kind='scatter', ax=ax)
df.plot(x='X',y='Y_hat', kind='line', ax=ax)

<Axes: xlabel='X', ylabel='Y'>


β = 65
df['Y_hat'] = α + β * df['X']

fig, ax = plt.subplots()
df.plot(x='X',y='Y', kind='scatter', ax=ax)
df.plot(x='X',y='Y_hat', kind='line', ax=ax, color='g')

<Axes: xlabel='X', ylabel='Y'>


However, we need to formalise this guessing process by treating the problem as an optimization problem.
Let's consider the error 𝜖𝑖 and define the difference between the observed values 𝑦𝑖 and the estimated values 𝑦𝑖̂ , which we
will call the residuals:

$$\hat{e}_i = y_i - \hat{y}_i = y_i - \hat{\alpha} - \hat{\beta} x_i$$

df['error'] = df['Y_hat'] - df['Y']

df

X Y Y_hat error
0 32 2000 2085 85
1 21 1000 1370 370
2 24 1500 1565 65
3 35 2500 2280 -220
4 10 500 655 155
5 11 900 720 -180
6 22 1100 1435 335
7 21 1500 1370 -130
8 27 1800 1760 -40
9 2 250 135 -115

fig, ax = plt.subplots()
df.plot(x='X',y='Y', kind='scatter', ax=ax)
df.plot(x='X',y='Y_hat', kind='line', ax=ax, color='g')
plt.vlines(df['X'], df['Y_hat'], df['Y'], color='r');


The Ordinary Least Squares (OLS) method, as the name suggests, chooses 𝛼 and 𝛽 so as to minimise the
Sum of the Squared Residuals (SSR).

$$\min_{\alpha,\beta} \sum_{i=1}^{N} \hat{e}_i^2 = \min_{\alpha,\beta} \sum_{i=1}^{N} (y_i - \alpha - \beta x_i)^2$$

Let's call this a cost function

$$C = \sum_{i=1}^{N} (y_i - \alpha - \beta x_i)^2$$

that we would like to minimise with parameters 𝛼 and 𝛽.
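As a small illustration (a sketch, not drawn from the lecture itself), we can code this cost function directly and evaluate it at the most recent guess above:

def C(α, β, x, y):
    "Sum of squared residuals for a given α and β."
    return ((y - α - β * x)**2).sum()

print(C(5, 65, df['X'], df['Y']))   # cost at the guess α=5, β=65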

36.1 How does error change with respect to 𝛼 and 𝛽

Let us first look at how the total error changes with respect to 𝛽 (holding the intercept 𝛼 constant).
We know from the next section that the optimal values for 𝛼 and 𝛽 are:

β_optimal = 64.38
α_optimal = -14.72

We can then calculate the error for a range of 𝛽 values

errors = {}
for β in np.arange(20, 100, 0.5):
    errors[β] = abs((α_optimal + β * df['X']) - df['Y']).sum()

Plotting the error


ax = pd.Series(errors).plot(xlabel='β', ylabel='error')
plt.axvline(β_optimal, color='r');

Now let us vary 𝛼 (holding 𝛽 constant)

errors = {}
for α in np.arange(-500, 500, 5):
    errors[α] = abs((α + β_optimal * df['X']) - df['Y']).sum()

Plotting the error

ax = pd.Series(errors).plot(xlabel='α', ylabel='error')
plt.axvline(α_optimal, color='r');


36.2 Calculating optimal values

Now let us use calculus to solve the optimization problem and compute the optimal values for 𝛼 and 𝛽 to find the ordinary
least squares solution.
First taking the partial derivative with respect to 𝛼

$$\frac{\partial C}{\partial \alpha}\left[\sum_{i=1}^{N}(y_i - \alpha - \beta x_i)^2\right]$$

and setting it equal to 0

$$0 = \sum_{i=1}^{N} -2(y_i - \alpha - \beta x_i)$$

we can remove the constant $-2$ from the summation by dividing both sides by $-2$

$$0 = \sum_{i=1}^{N} (y_i - \alpha - \beta x_i)$$

Now we can split this equation up into the components

$$0 = \sum_{i=1}^{N} y_i - \sum_{i=1}^{N} \alpha - \beta \sum_{i=1}^{N} x_i$$

The middle term is a straightforward sum of the constant $\alpha$ over $i = 1, \ldots, N$

$$0 = \sum_{i=1}^{N} y_i - N\alpha - \beta \sum_{i=1}^{N} x_i$$

and rearranging terms

$$\alpha = \frac{\sum_{i=1}^{N} y_i - \beta \sum_{i=1}^{N} x_i}{N}$$

We observe that both fractions resolve to the sample means $\bar{y}$ and $\bar{x}$

$$\alpha = \bar{y} - \beta \bar{x} \qquad (36.1)$$

Now let's take the partial derivative of the cost function $C$ with respect to 𝛽

$$\frac{\partial C}{\partial \beta}\left[\sum_{i=1}^{N}(y_i - \alpha - \beta x_i)^2\right]$$

and setting it equal to 0

$$0 = \sum_{i=1}^{N} -2 x_i (y_i - \alpha - \beta x_i)$$

we can again take the constant outside of the summation and divide both sides by $-2$

$$0 = \sum_{i=1}^{N} x_i (y_i - \alpha - \beta x_i)$$

which becomes

$$0 = \sum_{i=1}^{N} (x_i y_i - \alpha x_i - \beta x_i^2)$$

now substituting for $\alpha$

$$0 = \sum_{i=1}^{N} (x_i y_i - (\bar{y} - \beta \bar{x}) x_i - \beta x_i^2)$$

and rearranging terms

$$0 = \sum_{i=1}^{N} (x_i y_i - \bar{y} x_i + \beta \bar{x} x_i - \beta x_i^2)$$

This can be split into two summations

$$0 = \sum_{i=1}^{N} (x_i y_i - \bar{y} x_i) + \beta \sum_{i=1}^{N} (\bar{x} x_i - x_i^2)$$

and solving for $\beta$ yields

$$\beta = \frac{\sum_{i=1}^{N} (x_i y_i - \bar{y} x_i)}{\sum_{i=1}^{N} (x_i^2 - \bar{x} x_i)} \qquad (36.2)$$
We can now use (36.1) and (36.2) to calculate the optimal values for 𝛼 and 𝛽
Calculating 𝛽

df = df[['X','Y']].copy() # Original Data

# Calculate the sample means


x_bar = df['X'].mean()
y_bar = df['Y'].mean()

Now computing across the 10 observations and then summing the numerator and denominator


# Compute the Sums


df['num'] = df['X'] * df['Y'] - y_bar * df['X']
df['den'] = pow(df['X'],2) - x_bar * df['X']
β = df['num'].sum() / df['den'].sum()
print(β)

64.37665782493369

Calculating 𝛼

α = y_bar - β * x_bar
print(α)

-14.72148541114052

Now we can plot the OLS solution

df['Y_hat'] = α + β * df['X']
df['error'] = df['Y_hat'] - df['Y']

fig, ax = plt.subplots()
df.plot(x='X',y='Y', kind='scatter', ax=ax)
df.plot(x='X',y='Y_hat', kind='line', ax=ax, color='g')
plt.vlines(df['X'], df['Y_hat'], df['Y'], color='r');

Why use OLS?


TODO
1. Discuss mathematical properties for why we have chosen OLS


Exercise 36.2.1
Now that you know the equations that solve the simple linear regression model using OLS you can now run your own
regressions to build a model between 𝑦 and 𝑥.
Let’s consider two economic variables GDP per capita and Life Expectancy.
1. What do you think their relationship would be?
2. Gather some data from our world in data
3. Use pandas to import the csv formated data and plot a few different countries of interest
4. Use (36.1) and (36.2) to compute optimal values for 𝛼 and 𝛽
5. Plot the line of best fit found using OLS
6. Interpret the coefficients and write a summary sentence of the relationship between GDP per capita and Life Ex-
pectancy

Solution to Exercise 36.2.1


Q2: Gather some data from Our World in Data
You can download a copy of the data here if you get stuck.
Q3: Use pandas to import the CSV formatted data and plot a few different countries of interest

fl = "_static/lecture_specific/simple_linear_regression/life-expectancy-vs-gdp-per-
↪capita.csv" # TODO: Replace with GitHub link
df = pd.read_csv(fl, nrows=10)

df

Entity Code Year Life expectancy at birth (historical) \


0 Abkhazia OWID_ABK 2015 NaN
1 Afghanistan AFG 1950 27.7
2 Afghanistan AFG 1951 28.0
3 Afghanistan AFG 1952 28.4
4 Afghanistan AFG 1953 28.9
5 Afghanistan AFG 1954 29.2
6 Afghanistan AFG 1955 29.9
7 Afghanistan AFG 1956 30.4
8 Afghanistan AFG 1957 30.9
9 Afghanistan AFG 1958 31.5

GDP per capita 417485-annotations Population (historical estimates) \


0 NaN NaN NaN
1 1156.0 NaN 7480464.0
2 1170.0 NaN 7571542.0
3 1189.0 NaN 7667534.0
4 1240.0 NaN 7764549.0
5 1245.0 NaN 7864289.0
6 1246.0 NaN 7971933.0
7 1278.0 NaN 8087730.0
8 1253.0 NaN 8210207.0
9 1298.0 NaN 8333827.0

Continent
0 Asia
1 NaN
2 NaN
3 NaN
4 NaN
5 NaN
6 NaN
7 NaN
8 NaN
9 NaN

You can see that the data downloaded from Our World in Data provides a global set of countries with GDP per
capita and life expectancy data.
It is often a good idea to first import a few lines of data from a CSV to understand its structure so that you can then
choose the columns that you want to read into your DataFrame.
You can observe that there are a number of columns we won't need to import, such as Continent.
So let's build a list of the columns we want to import

cols = ['Code', 'Year', 'Life expectancy at birth (historical)', 'GDP per capita']
df = pd.read_csv(fl, usecols=cols)
df

Code Year Life expectancy at birth (historical) GDP per capita


0 OWID_ABK 2015 NaN NaN
1 AFG 1950 27.7 1156.0
2 AFG 1951 28.0 1170.0
3 AFG 1952 28.4 1189.0
4 AFG 1953 28.9 1240.0
... ... ... ... ...
62151 ZWE 1946 NaN NaN
62152 ZWE 1947 NaN NaN
62153 ZWE 1948 NaN NaN
62154 ZWE 1949 NaN NaN
62155 ALA 2015 NaN NaN

[62156 rows x 4 columns]

Sometimes it can be useful to rename your columns to make them easier to work with in the DataFrame

df.columns = ["cntry", "year", "life_expectancy", "gdppc"]


df

cntry year life_expectancy gdppc


0 OWID_ABK 2015 NaN NaN
1 AFG 1950 27.7 1156.0
2 AFG 1951 28.0 1170.0
3 AFG 1952 28.4 1189.0
4 AFG 1953 28.9 1240.0
... ... ... ... ...
62151 ZWE 1946 NaN NaN
62152 ZWE 1947 NaN NaN
62153 ZWE 1948 NaN NaN
62154 ZWE 1949 NaN NaN
62155 ALA 2015 NaN NaN

[62156 rows x 4 columns]

We can see there are NaN values, which represent missing data, so let us go ahead and drop those

df.dropna(inplace=True)

df

cntry year life_expectancy gdppc


1 AFG 1950 27.7 1156.0000
2 AFG 1951 28.0 1170.0000
3 AFG 1952 28.4 1189.0000
4 AFG 1953 28.9 1240.0000
5 AFG 1954 29.2 1245.0000
... ... ... ... ...
61960 ZWE 2014 58.8 1594.0000
61961 ZWE 2015 59.6 1560.0000
61962 ZWE 2016 60.3 1534.0000
61963 ZWE 2017 60.7 1582.3662
61964 ZWE 2018 61.4 1611.4052

[12445 rows x 4 columns]

Dropping the missing observations has reduced the number of rows in our DataFrame from 62156 to 12445.
Now we have a dataset containing life expectancy and GDP per capita for a range of years.
It is always a good idea to spend a bit of time understanding what data you actually have.
For example, you may want to explore this data to see if there is consistent reporting for all countries across years.
Let’s first look at the Life Expectancy Data

le_years = df[['cntry', 'year', 'life_expectancy']].set_index(['cntry', 'year']).unstack()['life_expectancy']

le_years

year 1543 1548 1553 1558 1563 1568 1573 1578 1583 1588 ... 2009 \
cntry ...
AFG NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN ... 60.4
AGO NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN ... 55.8
ALB NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN ... 77.8
ARE NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN ... 78.0
ARG NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN ... 75.9
... ... ... ... ... ... ... ... ... ... ... ... ...
VNM NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN ... 73.5
YEM NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN ... 67.2
ZAF NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN ... 57.4
ZMB NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN ... 55.3
ZWE NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN ... 48.1

year 2010 2011 2012 2013 2014 2015 2016 2017 2018
cntry
AFG 60.9 61.4 61.9 62.4 62.5 62.7 63.1 63.0 63.1
AGO 56.7 57.6 58.6 59.3 60.0 60.7 61.1 61.7 62.1
ALB 77.9 78.1 78.1 78.1 78.4 78.6 78.9 79.0 79.2
ARE 78.3 78.5 78.7 78.9 79.0 79.2 79.3 79.5 79.6
ARG 75.7 76.1 76.5 76.5 76.8 76.8 76.3 76.8 77.0
... ... ... ... ... ... ... ... ... ...
VNM 73.5 73.7 73.7 73.8 73.9 73.9 73.9 74.0 74.0
YEM 67.3 67.4 67.3 67.5 67.4 65.9 66.1 66.0 64.6
ZAF 58.9 60.7 61.8 62.5 63.4 63.9 64.7 65.4 65.7
ZMB 56.8 57.8 58.9 59.9 60.7 61.2 61.8 62.1 62.3
ZWE 50.7 53.3 55.6 57.5 58.8 59.6 60.3 60.7 61.4

[166 rows x 310 columns]

As you can see there are a lot of countries where data is not available for the Year 1543!
Which country does report this data?

le_years[~le_years[1543].isna()]

year 1543 1548 1553 1558 1563 1568 1573 1578 1583 1588 \
cntry
GBR 33.94 38.82 39.59 22.38 36.66 39.67 41.06 41.56 42.7 37.05

year ... 2009 2010 2011 2012 2013 2014 2015 2016 2017 2018
cntry ...
GBR ... 80.2 80.4 80.8 80.9 80.9 81.2 80.9 81.1 81.2 81.1

[1 rows x 310 columns]

You can see that Great Britain (GBR) is the only country with data available for that year.
You can also take a closer look at the time series to find that it is non-continuous, even for GBR.

le_years.loc['GBR'].plot()

<Axes: xlabel='year'>


In fact we can use pandas to quickly check how many countries are captured in each year

le_years.stack().unstack(level=0).count(axis=1).plot(xlabel="Year", ylabel="Number of countries");

So it is clear that if you are doing cross-sectional comparisons then more recent data will include a wider set of countries


Now let us consider the most recent year in the dataset 2018

df = df[df.year == 2018].reset_index(drop=True).copy()

df.plot(x='gdppc', y='life_expectancy', kind='scatter',
        xlabel="GDP per capita", ylabel="Life Expectancy (Years)");

This data shows a couple of interesting relationships.


1. There are a number of countries with similar GDP per capita levels but a wide range in life expectancy.
2. There appears to be a positive relationship between GDP per capita and life expectancy. Countries with higher GDP
per capita tend to have higher life expectancy outcomes.

Even though OLS fits a linear equation, one option we have is to transform the variables, for example through a log
transform, and then use OLS to estimate the transformed variables.

Tip: regressing ln 𝑦 on ln 𝑥 (a log-log specification) yields coefficients that can be interpreted as elasticities.

By specifying logx you can plot the GDP per Capita data on a log scale

df.plot(x='gdppc', y='life_expectancy', kind='scatter',
        xlabel="GDP per capita", ylabel="Life Expectancy (Years)", logx=True);


As you can see from this transformation, a linear model fits the shape of the data more closely.

df['log_gdppc'] = df['gdppc'].apply(np.log10)

df

cntry year life_expectancy gdppc log_gdppc


0 AFG 2018 63.1 1934.5550 3.286581
1 ALB 2018 79.2 11104.1660 4.045486
2 DZA 2018 76.1 14228.0250 4.153145
3 AGO 2018 62.1 7771.4420 3.890502
4 ARG 2018 77.0 18556.3830 4.268493
.. ... ... ... ... ...
161 VNM 2018 74.0 6814.1420 3.833411
162 OWID_WRL 2018 72.6 15212.4150 4.182198
163 YEM 2018 64.6 2284.8900 3.358865
164 ZMB 2018 62.3 3534.0337 3.548271
165 ZWE 2018 61.4 1611.4052 3.207205

[166 rows x 5 columns]

Q4: Use (36.1) and (36.2) to compute optimal values for 𝛼 and 𝛽

data = df[['log_gdppc', 'life_expectancy']].copy() # Get Data from DataFrame

# Calculate the sample means


x_bar = data['log_gdppc'].mean()
y_bar = data['life_expectancy'].mean()


data

log_gdppc life_expectancy
0 3.286581 63.1
1 4.045486 79.2
2 4.153145 76.1
3 3.890502 62.1
4 4.268493 77.0
.. ... ...
161 3.833411 74.0
162 4.182198 72.6
163 3.358865 64.6
164 3.548271 62.3
165 3.207205 61.4

[166 rows x 2 columns]

# Compute the Sums


data['num'] = data['log_gdppc'] * data['life_expectancy'] - y_bar * data['log_gdppc']
data['den'] = pow(data['log_gdppc'],2) - x_bar * data['log_gdppc']
β = data['num'].sum() / data['den'].sum()
print(β)

12.643730292819699

α = y_bar - β * x_bar
print(α)

21.702096701389074

Q5: Plot the line of best fit found using OLS

data['life_expectancy_hat'] = α + β * df['log_gdppc']
data['error'] = data['life_expectancy_hat'] - data['life_expectancy']

fig, ax = plt.subplots()
data.plot(x='log_gdppc',y='life_expectancy', kind='scatter', ax=ax)
data.plot(x='log_gdppc',y='life_expectancy_hat', kind='line', ax=ax, color='g')
plt.vlines(data['log_gdppc'], data['life_expectancy_hat'], data['life_expectancy'], color='r')

<matplotlib.collections.LineCollection at 0x7f4b67f3a9e0>


Exercise 36.2.2
Minimising the sum of squares is not the only way to generate the line of best fit.
For example, we could also consider minimising the sum of the absolute values of the residuals, which would give less weight to outliers.
Solve for 𝛼 and 𝛽 using the least absolute values.
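One possible approach is sketched below (it is only a sketch, not an official solution); it minimises the sum of absolute residuals numerically using scipy.optimize, which is not loaded elsewhere in this lecture, and re-enters the ice-cream data explicitly:

from scipy.optimize import minimize

x_arr = np.array([32, 21, 24, 35, 10, 11, 22, 21, 27, 2])
y_arr = np.array([2000, 1000, 1500, 2500, 500, 900, 1100, 1500, 1800, 250])

def sum_abs_residuals(params):
    α, β = params
    return np.abs(y_arr - (α + β * x_arr)).sum()

# start the search from the OLS estimates computed earlier
res = minimize(sum_abs_residuals, x0=(-14.72, 64.38), method='Nelder-Mead')
print(res.x)   # least absolute deviations estimates of (α, β)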



CHAPTER

THIRTYSEVEN

MAXIMUM LIKELIHOOD ESTIMATION

from scipy.stats import lognorm, pareto, expon, norm


import numpy as np
from scipy.integrate import quad
import matplotlib.pyplot as plt
import pandas as pd
from math import exp, log

37.1 Introduction

Consider a situation where a policymaker is trying to estimate how much revenue a proposed wealth tax will raise.
The proposed tax is

$$h(w) = \begin{cases} a w & \text{if } w \leq \bar{w} \\ a \bar{w} + b (w - \bar{w}) & \text{if } w > \bar{w} \end{cases}$$

where 𝑤 is wealth.
For example, if 𝑎 = 0.05, 𝑏 = 0.1, and 𝑤̄ = 2.5, this means
• a 5% tax on wealth up to 2.5 and
• a 10% tax on wealth in excess of 2.5.
The unit is 100,000, so 𝑤 = 2.5 means 250,000 dollars.
Let’s go ahead and define ℎ:

def h(w, a=0.05, b=0.1, w_bar=2.5):
    if w <= w_bar:
        return a * w
    else:
        return a * w_bar + b * (w - w_bar)
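For instance (a quick check, not part of the lecture's own code), a household with wealth 2 (i.e. 200,000 dollars) pays 0.1, while one with wealth 10 pays 0.875, both in units of 100,000 dollars:

print(h(2.0), h(10.0))   # 0.1 and 0.875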

For a population of size $N$, where individual $i$ has wealth $w_i$, total revenue raised by the tax will be

$$T = \sum_{i=1}^{N} h(w_i)$$

We wish to calculate this quantity.


The problem we face is that, in most countries, wealth is not observed for all individuals.


Collecting and maintaining accurate wealth data for all individuals or households in a country is just too hard.
So let’s suppose instead that we obtain a sample 𝑤1 , 𝑤2 , ⋯ , 𝑤𝑛 telling us the wealth of 𝑛 randomly selected individuals.
For our exercise we are going to use a sample of 𝑛 = 10,000 observations from wealth data in the US in 2016.

n = 10_000

The data is derived from the Survey of Consumer Finances (SCF).


The following code imports this data and reads it into an array called sample.
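That cell is not reproduced here; the following is only a sketch of such a step, using a placeholder file path and column name rather than the lecture's actual data source:

# placeholder path and column name -- the real data source may differ
sample = pd.read_csv("data/scf_wealth_sample.csv")["wealth"].to_numpy()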
Let’s histogram this sample.

fig, ax = plt.subplots()
ax.set_xlim(-1,20)
ax.hist(sample, density=True, bins=5_000, histtype='stepfilled', alpha=0.8)
plt.show()

The histogram shows that many people have very low wealth and a few people have very high wealth.
We will take the full population size to be

N = 100_000_000

How can we estimate total revenue from the full population using only the sample data?
Our plan is to assume that wealth of each individual is a draw from a distribution with density 𝑓.
If we obtain an estimate of $f$ we can then approximate $T$ as follows:

$$T = \sum_{i=1}^{N} h(w_i) = N \frac{1}{N} \sum_{i=1}^{N} h(w_i) \approx N \int_{0}^{\infty} h(w) f(w) dw \qquad (37.1)$$

(The sample mean should be close to the mean by the law of large numbers.)


The problem now is: how do we estimate 𝑓?

37.2 Maximum likelihood estimation

Maximum likelihood estimation is a method of estimating an unknown distribution.


Maximum likelihood estimation has two steps:
1. Guess what the underlying distribution is (e.g., normal with mean 𝜇 and standard deviation 𝜎).
2. Estimate the parameter values (e.g., estimate 𝜇 and 𝜎 for the normal distribution)
One possible assumption for the wealth is that each 𝑤𝑖 is log-normally distributed, with parameters 𝜇 ∈ (−∞, ∞) and
𝜎 ∈ (0, ∞).
(This means that ln 𝑤𝑖 is normally distributed with mean 𝜇 and standard deviation 𝜎.)
You can see that this assumption is not completely unreasonable because, if we histogram log wealth instead of wealth,
the picture starts to look something like a bell-shaped curve.

ln_sample = np.log(sample)
fig, ax = plt.subplots()
ax.hist(ln_sample, density=True, bins=200, histtype='stepfilled', alpha=0.8)
plt.show()

Now our job is to obtain the maximum likelihood estimates of 𝜇 and 𝜎, which we denote by $\hat{\mu}$ and $\hat{\sigma}$.
These estimates can be found by maximizing the likelihood function given the data.
The pdf of a lognormally distributed random variable $X$ is given by:

$$f(x, \mu, \sigma) = \frac{1}{x \sigma \sqrt{2\pi}} \exp\left( \frac{-1}{2} \left(\frac{\ln x - \mu}{\sigma}\right)^2 \right)$$


For our sample $w_1, w_2, \cdots, w_n$, the likelihood function is given by

$$L(\mu, \sigma | w_i) = \prod_{i=1}^{n} f(w_i, \mu, \sigma)$$

The likelihood function can be viewed as both


• the joint distribution of the sample (which is assumed to be IID) and
• the “likelihood” of parameters (𝜇, 𝜎) given the data.
Taking logs on both sides gives us the log likelihood function, which is

$$\ell(\mu, \sigma | w_i) = \ln \left[ \prod_{i=1}^{n} f(w_i, \mu, \sigma) \right] = - \sum_{i=1}^{n} \ln w_i - \frac{n}{2} \ln(2\pi) - \frac{n}{2} \ln \sigma^2 - \frac{1}{2\sigma^2} \sum_{i=1}^{n} (\ln w_i - \mu)^2$$

To find where this function is maximised we find its partial derivatives with respect to 𝜇 and 𝜎² and equate them to 0.

Let's first find the maximum likelihood estimate (MLE) of 𝜇

$$\frac{\delta \ell}{\delta \mu} = - \frac{1}{2\sigma^2} \times 2 \sum_{i=1}^{n} (\ln w_i - \mu) = 0$$

$$\implies \sum_{i=1}^{n} \ln w_i - n\mu = 0$$

$$\implies \hat{\mu} = \frac{\sum_{i=1}^{n} \ln w_i}{n}$$

Now let's find the MLE of 𝜎

$$\frac{\delta \ell}{\delta \sigma^2} = - \frac{n}{2\sigma^2} + \frac{1}{2\sigma^4} \sum_{i=1}^{n} (\ln w_i - \mu)^2 = 0$$

$$\implies \frac{n}{2\sigma^2} = \frac{1}{2\sigma^4} \sum_{i=1}^{n} (\ln w_i - \mu)^2$$

$$\implies \hat{\sigma} = \left( \frac{\sum_{i=1}^{n} (\ln w_i - \hat{\mu})^2}{n} \right)^{1/2}$$

Now that we have derived the expressions for $\hat{\mu}$ and $\hat{\sigma}$, let's compute them for our wealth sample.

μ_hat = np.mean(ln_sample)
μ_hat

0.0634375526654064

num = (ln_sample - μ_hat)**2


σ_hat = (np.mean(num))**(1/2)
σ_hat

2.1507346258433424
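Equivalently, since the MLE divides by 𝑛 rather than 𝑛 − 1, this estimate is just np.std of log wealth with its default ddof=0; a one-line check (an aside, not part of the lecture):

np.std(ln_sample)   # same value as σ_hat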

Let’s plot the log-normal pdf using the estimated parameters against our sample data.


dist_lognorm = lognorm(σ_hat, scale = exp(μ_hat))


x = np.linspace(0,50,10000)

fig, ax = plt.subplots()
ax.set_xlim(-1,20)

ax.hist(sample, density=True, bins=5_000, histtype='stepfilled', alpha=0.5)


ax.plot(x, dist_lognorm.pdf(x), 'k-', lw=0.5, label='lognormal pdf')
ax.legend()
plt.show()

Our estimated lognormal distribution appears to be a reasonable fit for the overall data.
We now use (37.1) to calculate total revenue.
We will compute the integral using numerical integration via SciPy’s quad function

def total_revenue(dist):
    integral, _ = quad(lambda x: h(x) * dist.pdf(x), 0, 100_000)
    T = N * integral
    return T

tr_lognorm = total_revenue(dist_lognorm)
tr_lognorm

101105326.82814863

(Our unit was 100,000 dollars, so this means that actual revenue is 100,000 times as large.)


37.3 Pareto distribution

We mentioned above that using maximum likelihood estimation requires us to make a prior assumption of the underlying
distribution.
Previously we assumed that the distribution is lognormal.
Suppose instead we assume that 𝑤𝑖 are drawn from the Pareto Distribution with parameters 𝑏 and 𝑥𝑚 .
In this case, the maximum likelihood estimates are known to be

$$\hat{b} = \frac{n}{\sum_{i=1}^{n} \ln (w_i / \hat{x}_m)} \quad \text{and} \quad \hat{x}_m = \min_i w_i$$

Let’s calculate them.

xm_hat = min(sample)
xm_hat

0.0001

den = np.log(sample/xm_hat)
b_hat = 1/np.mean(den)
b_hat

0.10783091940803055

Now let’s recompute total revenue.

dist_pareto = pareto(b = b_hat, scale = xm_hat)


tr_pareto = total_revenue(dist_pareto)
tr_pareto

12933168365.762571

The number is very different!

tr_pareto / tr_lognorm

127.91777418162562

We see that choosing the right distribution is extremely important.


Let’s compare the fitted Pareto distribution to the histogram:

fig, ax = plt.subplots()
ax.set_xlim(-1, 20)
ax.set_ylim(0,1.75)

ax.hist(sample, density=True, bins=5_000, histtype='stepfilled', alpha=0.5)


ax.plot(x, dist_pareto.pdf(x), 'k-', lw=0.5, label='Pareto pdf')
ax.legend()

plt.show()


We observe that in this case the fit for the Pareto distribution is not very good, so we can probably reject it.

37.4 What is the best distribution?

There is no “best” distribution — every choice we make is an assumption.


All we can do is try to pick a distribution that fits the data well.
The plots above suggested that the lognormal distribution is optimal.
However when we inspect the upper tail (the richest people), the Pareto distribution may be a better fit.
To see this, let’s now set a minimum threshold of net worth in our dataset.
We set an arbitrary threshold of $500,000 and read the data into sample_tail.
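The cell constructing sample_tail is not reproduced here; a minimal sketch of one way to build it, recalling that wealth is measured in units of 100,000 dollars (so 500,000 dollars corresponds to 5):

sample_tail = sample[sample >= 5.0]   # keep observations of at least $500,000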
Let’s plot this data.

fig, ax = plt.subplots()
ax.set_xlim(0,50)
ax.hist(sample_tail, density=True, bins=500, histtype='stepfilled', alpha=0.8)
plt.show()


Now let’s try fitting some distributions to this data.

37.4.1 Lognormal distribution for the right hand tail

Let’s start with the lognormal distribution


We estimate the parameters again and plot the density against our data.

ln_sample_tail = np.log(sample_tail)
μ_hat_tail = np.mean(ln_sample_tail)
num_tail = (ln_sample_tail - μ_hat_tail)**2
σ_hat_tail = (np.mean(num_tail))**(1/2)
dist_lognorm_tail = lognorm(σ_hat_tail, scale = exp(μ_hat_tail))

fig, ax = plt.subplots()
ax.set_xlim(0,50)
ax.hist(sample_tail, density=True, bins=500, histtype='stepfilled', alpha=0.5)
ax.plot(x, dist_lognorm_tail.pdf(x), 'k-', lw=0.5, label='lognormal pdf')
ax.legend()
plt.show()


While the lognormal distribution was a good fit for the entire dataset, it is not a good fit for the right hand tail.

37.4.2 Pareto distribution for the right hand tail

Let’s now assume the truncated dataset has a Pareto distribution.


We estimate the parameters again and plot the density against our data.

xm_hat_tail = min(sample_tail)
den_tail = np.log(sample_tail/xm_hat_tail)
b_hat_tail = 1/np.mean(den_tail)
dist_pareto_tail = pareto(b = b_hat_tail, scale = xm_hat_tail)

fig, ax = plt.subplots()
ax.set_xlim(0, 50)
ax.set_ylim(0,0.65)
ax.hist(sample_tail, density=True, bins= 500, histtype='stepfilled', alpha=0.5)
ax.plot(x, dist_pareto_tail.pdf(x), 'k-', lw=0.5, label='pareto pdf')
plt.show()


The Pareto distribution is a better fit for the right hand tail of our dataset.

37.4.3 So what is the best distribution?

As we said above, there is no “best” distribution — each choice is an assumption.


We just have to test what we think are reasonable distributions.
One test is to plot the data against the fitted distribution, as we did.
There are other more rigorous tests, such as the Kolmogorov-Smirnov test.
We omit such advanced topics (but encourage readers to study them once they have completed these lectures).

37.5 Exercises

Exercise 37.5.1
Suppose we assume wealth is exponentially distributed with parameter 𝜆 > 0.
The maximum likelihood estimate of 𝜆 is given by

$$\hat{\lambda} = \frac{n}{\sum_{i=1}^{n} w_i}$$

1. Compute 𝜆̂ for our initial sample.


2. Use 𝜆̂ to find the total revenue

Solution to Exercise 37.5.1


λ_hat = 1/np.mean(sample)
λ_hat

0.15234120963403971

dist_exp = expon(scale = 1/λ_hat)


tr_expo = total_revenue(dist_exp)
tr_expo

55246978.53427645

Exercise 37.5.2
Plot the exponential distribution against the sample and check if it is a good fit or not.

Solution to Exercise 37.5.2

fig, ax = plt.subplots()
ax.set_xlim(-1, 20)

ax.hist(sample, density=True, bins=5000, histtype='stepfilled', alpha=0.5)


ax.plot(x, dist_exp.pdf(x), 'k-', lw=0.5, label='exponential pdf')
ax.legend()

plt.show()


Clearly, this distribution is not a good fit for our data.



Part XII

Other

CHAPTER

THIRTYEIGHT

TROUBLESHOOTING

Contents

• Troubleshooting
– Fixing your local environment
– Reporting an issue

This page is for readers experiencing errors when running the code from the lectures.

38.1 Fixing your local environment

The basic assumption of the lectures is that code in a lecture should execute whenever
1. it is executed in a Jupyter notebook and
2. the notebook is running on a machine with the latest version of Anaconda Python.
You have installed Anaconda, haven’t you, following the instructions in this lecture?
Assuming that you have, the most common source of problems for our readers is that their Anaconda distribution is not
up to date.
Here’s a useful article on how to update Anaconda.
Another option is to simply remove Anaconda and reinstall.
You also need to keep the external code libraries, such as QuantEcon.py up to date.
For this task you can either
• use conda install -y quantecon on the command line, or
• execute !conda install -y quantecon within a Jupyter notebook.
If your local environment is still not working you can do two things.
First, you can use a remote machine instead, by clicking on the Launch Notebook icon available for each lecture


Second, you can report an issue, so we can try to fix your local set up.
We like getting feedback on the lectures so please don’t hesitate to get in touch.

38.2 Reporting an issue

One way to give feedback is to raise an issue through our issue tracker.
Please be as specific as possible. Tell us where the problem is and as much detail about your local set up as you can
provide.
Another feedback option is to use our discourse forum.
Finally, you can provide direct feedback to [email protected]



CHAPTER

THIRTYNINE

REFERENCES



CHAPTER

FORTY

EXECUTION STATISTICS

This table contains the latest execution statistics.

Document Modified Method Run Time (s) Status


business_cycle 2023-09-25 01:19 cache 16.35 ✅
cagan_adaptive 2023-09-25 01:19 cache 3.3 ✅
cagan_ree 2023-09-25 01:19 cache 4.76 ✅
cobweb 2023-09-25 01:19 cache 3.64 ✅
commod_price 2023-09-25 01:20 cache 21.82 ✅
cons_smooth 2023-09-25 01:20 cache 2.46 ✅
eigen_I 2023-09-25 01:20 cache 6.02 ✅
eigen_II 2023-09-25 01:20 cache 8.67 ✅
equalizing_difference 2023-09-25 01:20 cache 3.26 ✅
geom_series 2023-09-25 01:20 cache 3.71 ✅
heavy_tails 2023-09-25 01:21 cache 20.72 ✅
inequality 2023-09-25 01:27 cache 402.69 ✅
inflation_history 2023-09-25 01:28 cache 11.36 ✅
input_output 2023-09-25 01:28 cache 16.68 ✅
intro 2023-09-25 01:28 cache 1.21 ✅
intro_supply_demand 2023-09-25 01:28 cache 2.54 ✅
lake_model 2023-09-25 01:28 cache 3.4 ✅
linear_equations 2023-09-25 01:28 cache 2.56 ✅
lln_clt 2023-09-25 01:31 cache 202.86 ✅
long_run_growth 2023-09-25 01:32 cache 10.45 ✅
lp_intro 2023-09-25 01:32 cache 7.71 ✅
markov_chains_I 2023-09-25 01:32 cache 12.81 ✅
markov_chains_II 2023-09-25 01:32 cache 8.64 ✅
mle 2023-09-25 01:32 cache 8.46 ✅
monte_carlo 2023-09-25 01:33 cache 21.8 ✅
networks 2023-09-25 01:33 cache 11.62 ✅
olg 2023-09-25 01:33 cache 3.43 ✅
prob_dist 2023-09-25 01:33 cache 10.51 ✅
pv 2023-09-25 01:33 cache 2.11 ✅
scalar_dynam 2023-09-25 01:33 cache 3.95 ✅
schelling 2023-09-25 01:33 cache 17.21 ✅
short_path 2023-09-25 01:33 cache 1.25 ✅
simple_linear_regression 2023-09-25 01:34 cache 4.59 ✅
solow 2023-09-25 01:34 cache 5.41 ✅
status 2023-09-25 01:28 cache 1.21 ✅
supply_demand_heterogeneity 2023-09-25 01:34 cache 2.0 ✅
supply_demand_multiple_goods 2023-09-25 01:34 cache 2.84 ✅
time_series_with_matrices 2023-09-25 01:34 cache 3.79 ✅
troubleshooting 2023-09-25 01:28 cache 1.21 ✅
zreferences 2023-09-25 01:28 cache 1.21 ✅

These lectures are built on Linux instances through GitHub Actions and Amazon Web Services (AWS) to enable access to a GPU.
The build runs on a p3.2xlarge instance with 8 vCPUs, a V100 NVIDIA Tesla GPU, and 61 GB of memory.



BIBLIOGRAPHY

[AR02] Daron Acemoglu and James A. Robinson. The political economy of the Kuznets curve. Review of Development
Economics, 6(2):183–203, 2002.
[AKM+18] SeHyoun Ahn, Greg Kaplan, Benjamin Moll, Thomas Winberry, and Christian Wolf. When inequality mat-
ters for macro and macro matters for inequality. NBER Macroeconomics Annual, 32(1):1–75, 2018.
[Axt01] Robert L Axtell. Zipf distribution of us firm sizes. science, 293(5536):1818–1820, 2001.
[BB18] Jess Benhabib and Alberto Bisin. Skewed wealth distributions: theory and empirics. Journal of Economic
Literature, 56(4):1261–91, 2018.
[BBL19] Jess Benhabib, Alberto Bisin, and Mi Luo. Wealth Distribution and Social Mobility in the US: A Quantitative
Approach. American Economic Review, 109(5):1623–1647, May 2019.
[Ber97] J. N. Bertsimas, D. & Tsitsiklis. Introduction to linear optimization. Athena Scientific, 1997.
[BEGS18] Anmol Bhandari, David Evans, Mikhail Golosov, and Thomas J Sargent. Inequality, business cycles, and
monetary-fiscal policy. Technical Report, National Bureau of Economic Research, 2018.
[BEJ18] Stephen P Borgatti, Martin G Everett, and Jeffrey C Johnson. Analyzing social networks. Sage, 2018.
[Cag56] Philip Cagan. The monetary dynamics of hyperinflation. In Milton Friedman, editor, Studies in the Quantity
Theory of Money, pages 25–117. University of Chicago Press, Chicago, 1956.
[CB96] Marcus J Chambers and Roy E Bailey. A theory of commodity price fluctuations. Journal of Political Econ-
omy, 104(5):924–957, 1996.
[Coc23] John H Cochrane. The Fiscal Theory of the Price Level. Princeton University Press, Princeton, New Jersey,
2023.
[Cos21] Michele Coscia. The atlas for the aspiring network scientist. arXiv preprint arXiv:2101.00863, 2021.
[DL92] Angus Deaton and Guy Laroque. On the behavior of commodity prices. The Review of Economic Studies,
59:1–23, 1992.
[DL96] Angus Deaton and Guy Laroque. Competitive storage and commodity price dynamics. Journal of Political
Economy, 104(5):896–923, 1996.
[DSS58] Robert Dorfman, Paul A. Samuelson, and Robert M. Solow. Linear Programming and Economic Analysis:
Revised Edition. McGraw Hill, New York, 1958.
[EK+10] David Easley, Jon Kleinberg, and others. Networks, crowds, and markets. Volume 8. Cambridge university
press Cambridge, 2010.
[Fri56] M. Friedman. A Theory of the Consumption Function. Princeton University Press, 1956.


[FDGA+04] Yoshi Fujiwara, Corrado Di Guilmi, Hideaki Aoyama, Mauro Gallegati, and Wataru Souma. Do pareto–
zipf and gibrat laws hold true? an analysis with european firms. Physica A: Statistical Mechanics and its
Applications, 335(1-2):197–216, 2004.
[Gab16] Xavier Gabaix. Power laws in economics: an introduction. Journal of Economic Perspectives, 30(1):185–206,
2016.
[GSS03] Edward Glaeser, Jose Scheinkman, and Andrei Shleifer. The injustice of inequality. Journal of Monetary
Economics, 50(1):199–222, 2003.
[Goy23] Sanjeev Goyal. Networks: An economics approach. MIT Press, 2023.
[Hal78] Robert E Hall. Stochastic Implications of the Life Cycle-Permanent Income Hypothesis: Theory and Evi-
dence. Journal of Political Economy, 86(6):971–987, 1978.
[Ham05] James D Hamilton. What's real about the business cycle? Federal Reserve Bank of St. Louis Review, pages
435–452, 2005.
[Har60] Arthur A. Harlow. The hog cycle and the cobweb theorem. American Journal of Agricultural Economics,
42(4):842–853, 1960. doi:https://doi.org/10.2307/1235116.
[Hu18] Y. Hu, Y. & Guo. Operations research. Tsinghua University Press, 5th edition, 2018.
[Haggstrom02] Olle Häggström. Finite Markov chains and algorithmic applications. Volume 52. Cambridge University
Press, 2002.
[IT23] Patrick Imam and Jonathan RW Temple. Political institutions and output collapses. IMF Working Paper, 2023.
[Jac10] Matthew O Jackson. Social and economic networks. Princeton university press, 2010.
[KLS18] Illenin Kondo, Logan T Lewis, and Andrea Stella. On the us firm and establishment size distributions. Tech-
nical Report, SSRN, 2018.
[Man63] Benoit Mandelbrot. The variation of certain speculative prices. The Journal of Business, 36(4):394–419, 1963.
[MFD20] Filippo Menczer, Santo Fortunato, and Clayton A Davis. A first course in network science. Cambridge Uni-
versity Press, 2020.
[New18] Mark Newman. Networks. Oxford university press, 2018.
[Rac03] Svetlozar Todorov Rachev. Handbook of heavy tailed distributions in finance: Handbooks in finance. Vol-
ume 1. Elsevier, 2003.
[RRGM11] Hernán D Rozenfeld, Diego Rybski, Xavier Gabaix, and Hernán A Makse. The area and population of cities:
new insights from a different perspective on cities. American Economic Review, 101(5):2205–25, 2011.
[Sam58] Paul A Samuelson. An exact consumption-loan model of interest with or without the social contrivance of
money. Journal of political economy, 66(6):467–482, 1958.
[Sam71] Paul A Samuelson. Stochastic speculative price. Proceedings of the National Academy of Sciences, 68(2):335–
337, 1971.
[Sam39] Paul A. Samuelson. Interactions between the multiplier analysis and the principle of acceleration. Review of
Economic Studies, 21(2):75–78, 1939.
[Sar82] Thomas J Sargent. The ends of four big inflations. In Robert E Hall, editor, Inflation: Causes and effects, pages
41–98. University of Chicago Press, 1982.
[Sar13] Thomas J Sargent. Rational Expectations and Inflation. Princeton University Press, Princeton, New Jersey,
2013.
[SS22] Thomas J Sargent and John Stachurski. Economic networks: theory and computation. arXiv preprint
arXiv:2203.11972, 2022.


[SS23] Thomas J Sargent and John Stachurski. Economic networks: theory and computation. arXiv preprint
arXiv:2203.11972, 2023.
[SV02] Thomas J Sargent and François R Velde. The Big Problem of Small Change. Princeton University Press,
Princeton, New Jersey, 2002.
[SS83] Jose A Scheinkman and Jack Schechtman. A simple competitive model with production and storage. The
Review of Economic Studies, 50(3):427–441, 1983.
[Sch69] Thomas C Schelling. Models of Segregation. American Economic Review, 59(2):488–493, 1969.
[ST19] Christian Schluter and Mark Trede. Size distributions reconsidered. Econometric Reviews, 38(6):695–710,
2019.
[Smi10] Adam Smith. The Wealth of Nations: An inquiry into the nature and causes of the Wealth of Nations. Harriman
House Limited, 2010.
[Too14] Adam Tooze. The deluge: the great war, america and the remaking of the global order, 1916–1931. 2014.
[Vil96] Pareto Vilfredo. Cours d'économie politique. Rouge, Lausanne, 1896.
[Wau64] Frederick V. Waugh. Cobweb models. Journal of Farm Economics, 46(4):732–750, 1964.
[WW82] Brian D Wright and Jeffrey C Williams. The economic role of commodity storage. The Economic Journal,
92(367):596–614, 1982.
[Zha12] Dongmei Zhao. Power Distribution and Performance Analysis for Wireless Communication Networks.
SpringerBriefs in Computer Science. Springer US, Boston, MA, 2012. ISBN 978-1-4614-3283-8, 978-1-4614-3284-5.
URL: https://link.springer.com/10.1007/978-1-4614-3284-5 (visited on 2023-02-03), doi:10.1007/978-1-4614-3284-5.

PROOF INDEX

con-perron-frobenius
con-perron-frobenius (eigen_II), ??

graph_theory_property1
graph_theory_property1 (networks), 512

graph_theory_property2
graph_theory_property2 (networks), 513

mc_conv_thm
mc_conv_thm (markov_chains_II), 407

mc_po_conv_thm
mc_po_conv_thm (markov_chains_I), 393

move_algo
move_algo (schelling), 303

neumann_series_lemma
neumann_series_lemma (eigen_I), 92

perron-frobenius
perron-frobenius (eigen_II), ??

statement_clt
statement_clt (lln_clt), 252

stationary
stationary (markov_chains_II), 408

theorem-0
theorem-0 (lln_clt), 247

theorem-2
theorem-2 (markov_chains_I), 394

unique_stat
unique_stat (markov_chains_I), 393



INDEX

C S
Central Limit Theorem, 252 Schelling Segregation Model, 301

D T
Distributions and Probabilities, 217 The Perron-Frobenius Theorem, 463
Dynamic Programming
Shortest Paths, 449 V
Dynamics in One Dimension, 327 Vectors, 60
Inner Product, 63
E Norm, 63
Eigenvalues and Eigenvectors, 77 Operations, 61

L
Law of Large Numbers, 245
Illustration, 248
Linear Algebra
Eigenvalues, 88
SciPy, 70
Vectors, 60
Linear Equations and Matrix Algebra, 59

M
Markov Chains
Forecasting Future Values, 399
Future Probabilities, 392
Simulation, 388
Markov Chains: Basic Concepts and Sta-
tionarity, 383
Markov Chains: Irreducibility and Er-
godicity, 405
Matrix
Numpy, 66
Operations, 64
Solving Systems of Equations, 68
Models
Schelling's Segregation Model, 301

N
Neumann's Lemma, 91

P
python, 196
