JEREMY GREENWOOD, University of Pennsylvania

RICARDO MARTO, University of Pennsylvania

NUMERICAL METHODS FOR MACROECONOMISTS
With Julia and MATLAB Codes
Contents

1 Introduction

2 Nonlinear Equations
2.1 Introduction
2.2 Labor-Leisure Choice
2.3 Government Spending and Taxation
2.4 General Equilibrium
2.5 Equivalent and Compensating Variations
2.6 Solving Nonlinear Equations Numerically
2.7 Heterogeneous Agents
2.8 MATLAB: A Worked-Out Example

3 Maximization (and Minimization)
3.1 Introduction
3.2 Golden-Section Search
3.3 General Equilibrium, A Slight Return
3.4 Discrete Maximization
3.5 Particle Swarm Optimization
3.6 Calibration
3.7 MATLAB: A Worked-Out Example

4 Why do Americans Work so Much More than Europeans?
4.1 Introduction
4.2 The Model
4.3 National Income and Product Accounts
4.4 Mapping the Model into the Data
4.5 Actual and Predicted Labor Supplies
4.6 Conclusions
4.7 MATLAB: A Worked-Out Example

5 Graphing
5.1 Introduction
5.2 William Playfair
5.3 Some Basic Principles
5.4 MATLAB: Worked-Out Examples

6 Deterministic Dynamics
6.1 Introduction
6.2 The World of Robinson Crusoe
6.3 The Euler Equation
6.4 The Steady State
6.5 Dynamic Programming Formulation
6.6 Consumption Smoothing
6.7 Dynamics
6.8 The Value Function: A More Formal Analysis
6.9 A Linear-Quadratic Optimization Problem
6.10 Adding a Labor-Leisure Choice
6.11 The Taxation of Capital and Labor Income
6.12 The Extended Path and Multiple Shooting Algorithms
6.13 MATLAB: A Worked-Out Example

7 Malthus to Solow
7.1 Introduction
7.2 A Graphical Exposition of Malthusian Theory
7.3 Facts
7.4 The Model
7.5 Malthus versus Solow
7.6 The Malthusian Steady State
7.7 Calibration
7.8 Results

8 Numerical Approximations
8.1 Introduction
8.2 Numerical Differentiation
8.3 The Impact of Technological Innovation in Contraception on Non-Marital Births
8.4 Classical Numerical Integration
8.5 Measuring the Welfare Gain from Personal Computers
8.6 Random Number Generators
8.7 Eugen Slutsky and Random Causes as the Source of Cyclic Processes
8.8 Monte Carlo Integration
8.9 Robert E. Lucas, Jr., and the Cost of Business Cycles
8.10 Markov Chains
8.11 Unemployment: Calibrating a Markov Chain
8.12 The Equity Premium: A Puzzle
8.13 Approximating an AR1 by a Markov Chain
8.14 Interpolation
8.15 The Hodrick-Prescott Filter
8.16 MATLAB: Worked-Out Examples

9 Stochastic Dynamics
9.1 Introduction
9.2 Robinson Crusoe
9.3 Business Cycle Modeling
9.4 Discrete-State-Space Dynamic Programming
9.5 Linearization
9.6 Coleman’s Policy-Function Algorithm
9.7 Carroll’s Endogenous Grid Method
9.8 Parameterized Expectations Algorithm
9.9 MATLAB: Worked-Out Examples

10 The Aiyagari Model
10.1 Introduction
10.2 The Setup
10.3 A Person’s Choice Problem
10.4 Heterogeneity and Aggregation
10.5 Calibration
10.6 Results
10.7 Aggregate Uncertainty

A Mathematical Appendix
A.1 Notation
A.2 Maximizing a Function
A.3 Total Differentials
A.4 Intermediate Value Theorem
A.5 The Implicit Function Theorem
A.6 First- and Second-Order Taylor Expansions
A.7 Unimodal Function
A.8 The Golden Ratio
A.9 Euler’s Theorem
A.10 Eigenvalues and Eigenvectors
A.11 Descriptive Statistics
A.12 The Uniform, Normal, and Weibull Distributions
A.13 The Strong Law of Large Numbers

B Introduction to MATLAB
B.1 Getting Started
B.2 Basic Commands
B.3 Flow Control
B.4 Defining Functions
B.5 Graphing
B.6 Solving Nonlinear Equations
B.7 Minimization (or Maximization)
B.8 Roots of Polynomials
B.9 Eigenvector Decomposition
B.10 Interpolation
B.11 Random Number Generation
B.12 Complex Numbers
B.13 Descriptive Statistics
B.14 Writing Results to a Table

C Introduction to Julia
C.1 Choosing an IDE
C.2 Installing and Using Packages
C.3 Basic Commands

References

This primer will cover some of the numerical methods that are used
in modern macroeconomics. You will learn how to:
1. Solve nonlinear equations via bisection and Newton’s method;
2. Compute maximization problems by golden section search, dis-
cretization, and the particle swarm algorithm;
3. Simulate difference equations using the extended path and multiple
shooting algorithms;
4. Differentiate and integrate functions numerically;

5. Conduct Monte Carlo simulations by drawing random variables;

6. Construct Markov chains;
7. Interpolate functions and smooth data;
8. Compute dynamic programming problems;

9. Solve for policy functions using the Coleman, endogenous grid, and
parameterized expectation algorithms;

10. Solve the Aiyagari heterogeneous agent model with and without
aggregate uncertainty.
This will be done while studying economic problems, such as the de-
termination of labor supply, economic growth, and business cycle anal-
ysis. Calculus is an integral part of the primer and some elementary
probability theory will be drawn upon. The MATLAB programming
language will be used. It is time to move into the modern age and
learn these techniques. Besides, using computers to solve economic
models is fun. The primer is self-contained, so little prior knowledge is
required.

Copyright ©2022 Jeremy Greenwood. All Rights Reserved. This book
contains material protected under International and Federal Copyright Laws
and Treaties. Any unauthorized reprint or use of this material is prohibited.
No part of this book may be reproduced or transmitted in any form or by any
means, electronic or mechanical, including photocopying, recording, or by any
information storage and retrieval system without express written permission
from the author.

If you would like to use this book for teaching or other purposes
please contact Jeremy Greenwood:
Telephone, (215) 898-1505; Fax, (215) 746-2947;
Email, [email protected]—this email address
is reserved exclusively for matters concerning the book.

Comments and suggestions are welcome! This is a work in progress.

Reports of errors, no matter how small, are greatly appreciated.
1 Introduction

There cannot be a language more universal and more simple, more free
from errors and obscurities, ... more worthy to express the invariable
relations of natural things than mathematics. It interprets all phenomena
by the same language, as if to attest the unity and simplicity of the plan
of the universe, and to make still more evident that unchangeable order
which presides over all natural causes.
Joseph Fourier, Analytical Theory of Heat, 1822

Many people have a passionate hatred of abstraction, chiefly, I think
because of its intellectual difficulty; but as they do not wish to give this
reason they invent all sorts of others that sound grand. They say that
all reality is concrete, and that in making abstractions we are leaving
out the essential. They say that all abstraction is falsification, and that
as soon as you have left out any aspect of something actual you have
exposed yourself to the risk of fallacy in arguing from its remaining
aspects alone. Those who argue in this way are in fact concerned with
matters quite other than those that concern science.
Bertrand Russell, The Scientific Outlook, 1931.

Modern macroeconomics usually proceeds along the following path:

1. Specifying people’s tastes for goods and leisure. This involves
defining a utility function.

2. Spelling out the technologies that individuals, firms, and governments
employ to produce goods. This could be a production function
for firms, say using capital and labor. Sometimes a household
production function is specified for a family that relates how much
home goods are produced for a given amount of household labor
and capital. Once in a while governments are assumed to produce
goods as well, which also require capital and labor.

3. Stipulating the structure of institutions and markets in the economy.
For example, do firms produce competitively or are they monopolistic
in nature; what types of financial markets do households and
firms have access to (say bonds, equities, and insurance); and what
is the set of spending and tax instruments available to the government
(for example, consumption, capital, and labor income taxes)?

4. Solving the maximization problems for households and firms. Usually
households are assumed to maximize their utility and firms are
taken to maximize their profits.

5. Imposing any equilibrium conditions. For example, this might involve
spelling out the capital, labor, and goods markets clearing
conditions.

6. Solving the government’s maximization problem, if there is one.
Sometimes the government is taken as picking spending and taxes
to maximize a social welfare function of some sort. Here the government
is taken as a dominant player, whereby it knowingly influences
the equilibrium under study. This explains its position in the
steps. Other times the government’s actions (spending and taxes)
are just taken as given or are exogenous, as is assumed here.

7. Studying the resulting economy.

Of course, not all models have all of these ingredients. Some mod-
els don’t have consumers, others don’t have firms, and yet others ex-
clude governments; there can be various combinations of these factors.
Economies can be studied using old-fashioned pencil-and-paper tech-
niques and/or modern numerical methods. Pencil-and-paper tech-
niques are useful for developing propositions and theorems about
economies. As model economies become more sophisticated it be-
comes increasingly difficult to develop propositions and theorems.
Computers can be used to develop properties about economies, just
as they are used in aerospace engineering to develop properties about
aircraft and spacecraft. Additionally, they allow for concrete quantitative
predictions that are useful for policymakers, which general
mathematical analyses cannot provide.
Static economies can often be characterized as either solutions to
nonlinear equation systems or as solutions to maximization problems
in conjunction with an algorithm mimicking a Walrasian auctioneer.
Chapter 2 studies labor supply in a static setting. This is done with
and without government spending and taxation. It shows how this
problem can be set up as the solution to a nonlinear equation. Various
properties of the labor supply problem are established using pencil-
and-paper techniques. The chapter then turns to discussing how this
problem can be solved numerically using a nonlinear equation solver
employing either the bisection algorithm or Newton’s method. At the
end of the chapter, a MATLAB program is presented that solves a
monopolist’s pricing problem in a static setting. Later on, in Chapters 6
and 9, this example is made dynamic and then both dynamic and stochastic.
In Chapter 3 the problem is recast as the solution to a maximization
problem in conjunction with a Walrasian auctioneer. Three techniques
are presented for maximizing a function, viz., golden-section search,
discrete maximization, and particle swarm optimization. The chapter
also discusses the concept of calibration. This involves choosing the
parameter values for a model to maximize its fit with respect to a set
of data targets. Two examples of calibrating models are presented.
The first example focuses on the decrease in hours worked by males
over the course of the 20th century. The second example discusses the
rise in premarital sexual activity over the last century.
As an example of studying taxation in a static economy, Prescott’s
(2004) study on “Why Do Americans Work So Much More Than
Europeans?” is discussed in Chapter 4. The chapter also illustrates how
a parameter value can be chosen to maximize the fit of a model; this
is an exercise in calibration. Since macro models are often calibrated
to the national income and product accounts, as Prescott does, a brief
discussion of these accounts is presented. The national income and product
accounts are essential knowledge for macroeconomists.
The discussion then moves on in Chapter 6 to the solution of de-
terministic dynamic models. This is cast within the context of the
neoclassical growth model. To begin with, the deterministic dynam-
ics of the model are completely characterized using pencil-and-paper
techniques. It is shown how the solution to this model can be charac-
terized as a nonlinear difference equation. While doing this, the no-
tions of dynamic programming and the value function are introduced.
Properties of the value function for the neoclassical growth model are
derived. All of this is done in a heuristic (non-rigorous) way so as
not to obscure the beauty of dynamic programming by intimidating
readers.
Next, three techniques for numerically solving this nonlinear differ-
ence equation are presented; viz, the extended path method, lineariza-
tion, and multiple shooting. Multiple shooting picks one of the initial
conditions for the nonlinear difference equation so that the economy
ends up in its steady state after some extended period of time. The ex-
tended path method conjectures a path for expectations about how the
economy will evolve. It then solves the economy given this conjectured
path and computes a revised path for the expectations. The procedure
is then repeated. When the path for expectations and the path for the
actual economy converge, a perfect foresight path has been found. Linearization
refers to a method where the nonlinear difference equation
describing an economy’s dynamics is approximated by a linear differ-
ence equation. The chapter illustrates how this is done. Finally, the
monopolist’s pricing problem, introduced in Chapter 2, is returned to.
The problem is now made dynamic. A MATLAB program is provided
that solves this problem using the three solution techniques discussed.
As an example of deterministic dynamics, Hansen and Prescott’s (2002)
“Malthus to Solow” is presented in Chapter 7.

Chapter 9 deals with stochastic dynamics. The discussion is cen-
tered around the stochastic growth model, which is widely used in
business cycle analysis. Three numerical techniques are presented
for solving dynamic stochastic economies; namely, dynamic program-
ming, linearization, and policy-function iteration. Three methods for
policy-function iteration are presented: the Coleman (1991) algorithm,
the Carroll (2006) endogenous grid method, and den Haan and Marcet’s
(1990) parameterized expectations. As an illustration of these techniques,
the monopolist’s dynamic pricing problem presented in Chapter 6
is now made stochastic. A MATLAB program illustrating how to
solve this model is presented.
The famous Aiyagari (1994) model is presented in Chapter 10. This
was the first paper to extend the standard representative agent model
to a world with heterogeneous agents. In the original Aiyagari model
there was no aggregate uncertainty. Individuals faced idiosyncratic randomness
in their labor income. They also faced a borrowing constraint
that limited their ability to insure against the risk in their labor in-
come. Aiyagari showed how a distribution of wealth emerges across
people. Boppart et al. (2018) illustrate how the Aiyagari model can be
extended to incorporate aggregate uncertainty. Their methodology for
doing this is discussed.
Some numerical approximations that are useful for solving macroeconomic
models numerically, especially stochastic ones, are covered in
Chapter 8. The chapter starts off discussing numerical derivatives.
Two methods are covered here: the standard method and complex
step differentiation. Next, the chapter turns to the classical method for
numerical integration. As an example of this technique, the consumer
surplus for computers is calculated. The chapter then moves on to random
number generation. This topic is illustrated using Slutsky’s (1937)
model of business cycle fluctuations. Random number generation
leads naturally to the subject of Monte Carlo integration. As an example
of this, the chapter visits Lucas’s (1987) welfare cost of business
cycles. The concept of a Markov chain is also presented. The usefulness
of Markov chains is illustrated using two examples. The first
illustration constructs a Markov chain for unemployment and uses this
to estimate job finding and separation rates. The second illustration is
the Mehra and Prescott (1985) study of the equity premium. A method
for approximating an AR1 process by a Markov chain is discussed.
The last topic in the chapter is the approximation of functions. Often
in macroeconomics one wants to compute some functions for which
there are no known analytical solutions, such as policy functions or
value functions. Three methods are discussed for approximating functions:
piecewise linear interpolation, cubic spline interpolation, and
radial basis function interpolation. Cubic spline interpolation is a very
flexible technique. This is shown by mimicking an artist’s sketch of a
face using cubic splines. This is also a natural point to introduce the
Hodrick-Prescott filter.
Graphing data and the results from models is an important part
of macroeconomics. Statistical graphing was introduced in the 18th
century by an economist, William Playfair. Chapter 5 discusses some
basic principles for graphing. Some of Playfair’s beautiful graphs are
reproduced. MATLAB programs for three Playfair-style graphs are
provided.
The book is self-contained. Appendix B provides an introduction to
MATLAB. The elementary mathematics used in the book is reviewed
in Appendix A. A legend for some of the notation used in the book is
also presented here. It is important for economists in the modern era
to be able to move fluidly between economics, computing, and math-
ematics. Mathematics forces clarity of thought and is necessary for
setting up economic models on computers. Computers are needed for
solving complicated economic models and for providing concrete pre-
dictions. Writing computer code also fosters a better understanding
of mathematical concepts since the math has to be operationalized in
a practical sense. Most important of all is a firm understanding of
economics. First, one needs to know what an interesting economic
problem is. Second, it’s crucial for understanding the economic intuition
that is embedded in mathematical formulations; the math should
speak to you. Third, an adage in computer science is “garbage in,
garbage out.” Having a sense of what to input into an economic model
and what to expect out is very important.
Last, this genre of economics is often called Quantitative Theory. It
both complements and overlaps with conventional econometrics. Non-
structural econometrics formulates statistical models. There may or
may not be an economic model that gives rise to the functional forms
estimated in nonstructural econometrics. Statistical models are very
useful for characterizing facts in the data. Still, one should remember
Tjalling C. Koopmans’s warning about measurement without
theory.[1] The interaction between measurement and theory is bidirectional.
Empirical findings motivate theory and theory sheds light on
what ideas to test and guides empirical formulations. Results from
nonstructural econometric models (e.g., regression coefficients) can be
used to calibrate simulated models. This is called indirect inference.
Structural econometric models are a close cousin to calibrated ones.
On the one hand, they use formal statistical analysis to evaluate the
model. On the other hand, due to the added difficulties of estimation,
they are often partial equilibrium in nature. Also, sometimes minimum
distance estimation procedures are used to calibrate simulated
models so that quantitative theory and structural econometrics overlap.

[1] He critiqued empirical work in Koopmans (1947) saying: ’The various choices
as to what to “look for,” what economic phenomena to observe, and what measure
to define and compute, are made with a minimum of assistance from theoretical
conceptions or hypothesis regarding the nature of the economic processes
by which the variables studied are generated (p. 161).’
2 Nonlinear Equations

2.1 Introduction

In modern macroeconomics an economy can often be described as the
solution for h to a nonlinear equation of the following form:

Z(h) = 0. (2.1.1)

Here h could be a single variable or a vector of variables and likewise
Z could be a single or vector valued function. The function Z
should have the same dimension as the variable h; i.e., you need the
same number of equations as unknowns. A solution to this equation is
called a zero of Z or the root of Z (h) = 0. Solving such equations is the
subject of this chapter. Embodied in Z may be the tastes and technol-
ogy of the economy, the tax and spending policies of the government,
the upshot of individuals’ and firms’ choice problems, and market-
clearing conditions. In fact, modern macroeconomics tries to specify
the economy at the granular level needed to address the question of
interest.
Two methods are presented in Section 2.6 for solving the above
equation, namely the bisection method and Newton’s method. The
chapter starts out in Section 2.2 with a labor-leisure choice problem.
The discussion follows the path outlined in Chapter 1 for modern
macroeconomics. The impact of shifts in government spending, taxes,
and wages is broken down into income and substitution effects along
the lines of Sir John R. Hicks (1904-1989) and Eugen Slutsky (1880-1948).
The income effect associated with a government fiscal policy
depends crucially on how the tax revenue raised is used. In particular,
it depends on whether the revenue is used for transfer payments
or government spending on goods and services. Furthermore, if the
revenue is used to provide goods and services, it matters whether this
government spending substitutes for private expenditure. The analysis is
then cast into a general equilibrium setting where the wage rate and
rental rate on capital are determined endogenously. To measure the
change in economic welfare associated with a shift in government policy,
Hicks’s (1941) notions of compensating and equivalent variations are
introduced. These notions are illustrated using Lucas’s (1987) calculation
of the welfare benefits/costs from changing an economy’s growth rate.
It is shown how the general equilibrium solution to the labor-leisure
choice problem in an economy with taxes and government spending can
be set up as a nonlinear equation that has the form of equation (2.1.1).
A MATLAB program illustrating the two techniques for solving non-
linear equations is presented in Section 2.8 for a monopolist’s pricing
problem.
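As a minimal sketch of the two methods named above, the following Python fragment (Python is used here for illustration; the book's own programs are in MATLAB) applies bisection and Newton's method to a scalar equation of the form Z(h) = 0. The particular Z is a hypothetical labor-supply first-order condition, w/(wh + a) − θ/(1 − h) = 0, with wage w, nonlabor income a, and leisure weight θ. The functional form, the parameter values, and all function names are assumptions made for this example and are not taken from the text. This Z has the closed-form root h = (w − θa)/(w(1 + θ)), which gives a check on both solvers.

```python
def Z(h, w=1.0, a=0.2, theta=1.5):
    """Hypothetical first-order condition for hours worked, h in (0, 1)."""
    return w / (w * h + a) - theta / (1.0 - h)


def Z_prime(h, w=1.0, a=0.2, theta=1.5):
    """Analytical derivative of Z with respect to h (needed by Newton)."""
    return -w**2 / (w * h + a) ** 2 - theta / (1.0 - h) ** 2


def bisection(f, lo, hi, tol=1e-10, max_iter=200):
    """Halve a bracketing interval [lo, hi] until it is shorter than 2*tol."""
    f_lo, f_hi = f(lo), f(hi)
    if f_lo * f_hi > 0:
        raise ValueError("f(lo) and f(hi) must have opposite signs")
    for _ in range(max_iter):
        mid = 0.5 * (lo + hi)
        f_mid = f(mid)
        if f_mid == 0.0 or 0.5 * (hi - lo) < tol:
            return mid
        if f_lo * f_mid < 0:
            hi = mid                  # root lies in the lower half
        else:
            lo, f_lo = mid, f_mid     # root lies in the upper half
    return 0.5 * (lo + hi)


def newton(f, f_prime, x0, tol=1e-12, max_iter=50):
    """Iterate x <- x - f(x)/f'(x) until the step is smaller than tol."""
    x = x0
    for _ in range(max_iter):
        step = f(x) / f_prime(x)
        x -= step
        if abs(step) < tol:
            return x
    raise RuntimeError("Newton's method did not converge")


# Closed-form root for the default parameters: (1 - 1.5*0.2) / (1*(1 + 1.5)).
h_closed_form = (1.0 - 1.5 * 0.2) / (1.0 * (1.0 + 1.5))
h_bisect = bisection(Z, 1e-6, 1.0 - 1e-6)
h_newton = newton(Z, Z_prime, x0=0.5)
print(h_closed_form, h_bisect, h_newton)
```

Bisection needs only a sign change over the bracket and so is robust but slow. Newton's method needs a derivative and a reasonable starting guess, and it typically converges in far fewer iterations.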

2.2 Labor-Leisure Choice

2.2.1 Utility Functions

Tastes are specified by a utility function. Let

u = U (c)

represent the utility function for consumption. It gives the level of happiness,
u, that a person realizes if they consume the amount, c. Utility is
an ordinal concept, not a cardinal one. It specifies how different consumption
levels are ranked. The precise numbers assigned to particular
consumption levels are meaningless. It’s important to remember
this when conducting welfare experiments, as will be discussed in Sec-
tion 2.5. Some typical properties imposed jointly on a utility function,
or on the ordinal ranking, are:

1. $U: \mathbb{R}_+ \to \mathbb{R}$ (so that a utility function maps the positive reals into
the reals). Consumption must always be nonnegative, but utility
can be negative.

2. $U$ is strictly increasing so that $U_1 \equiv dU/dc > 0$. More of a good
is better than less of it. Marginal utility is positive. Even if utility
is negative it will be increasing in consumption, because marginal
utility is positive. Now suppose one added a constant to the utility
function. The utility value connected with different levels of consumption
would change by the added constant. But, utility would
still be strictly increasing with exactly the same first derivative and
the original ranking across different levels of consumption is preserved.
This illustrates the ordinal nature of utility.

3. $U$ is strictly concave so that $U_{11} \equiv d^2U/dc^2 < 0$. Marginal utility
decreases as consumption increases. Each extra increment of
consumption is worth less to the consumer. This is called diminishing
marginal utility. Again, adding a constant to the utility function
does not change the second derivative. So the exact values
assigned to utility arising from a particular level of consumption
are not meaningful.
Example 1. (Common utility functions) Here are some utility functions
that are commonly used in macroeconomics. They satisfy the above
properties.

$U(c) = \ln c$ (logarithmic);
$U(c) = c^{1-\rho}/(1-\rho) - 1/(1-\rho)$, with $\rho > 0$ and $\rho \neq 1$ (isoelastic);
$U(c) = -e^{-\gamma c}$, with $\gamma > 0$ (exponential);
$U(c) = \alpha c - \beta c^2/2$, with $\alpha, \beta > 0$ and for $c < \alpha/\beta$ (quadratic).

These utility functions are illustrated in Figure 2.2.1. The logarithmic
utility returns a negative value for $c < 1$ and a positive one for $c > 1$.
Observe that $U_1 = d \ln c/dc = 1/c > 0$ and $U_{11} \equiv d^2 \ln c/dc^2 = -1/c^2 < 0$.
Utility can also be positive or negative with an isoelastic
utility. Here $U_1 = c^{-\rho} > 0$ and $U_{11} = -\rho c^{-\rho-1} < 0$. The isoelastic
utility function is also called a constant-relative-risk-aversion (CRRA)
utility function. Here 1/ρ represents the elasticity of intertemporal
substitution, which is defined in Chapter 6. The elasticity of intertem-
poral substitution controls the responsiveness of consumption to in-
terest rate changes. With this utility function ρ also represents the
coefficient of relative risk aversion, as will be discussed in Chapter 8.
This governs an individual’s willingness to invest in risky assets. So,
ρ plays two roles, which may cause problems in some applications.
Often the constant term, $-1/(1-\rho)$, is dropped. When this term is included,
the isoelastic utility function converges to the logarithmic one as ρ
approaches 1.
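The convergence to logarithmic utility can be verified with L'Hôpital's rule, holding $c > 0$ fixed and differentiating the numerator and denominator with respect to $\rho$. This is a standard argument, sketched here as a supplement rather than taken from the text:

```latex
\lim_{\rho \to 1} \left[ \frac{c^{1-\rho}}{1-\rho} - \frac{1}{1-\rho} \right]
  = \lim_{\rho \to 1} \frac{c^{1-\rho} - 1}{1-\rho}
  = \lim_{\rho \to 1} \frac{-c^{1-\rho} \ln c}{-1}
  = \ln c .
```

Without the constant term the numerator would not vanish at $\rho = 1$, and the expression would diverge instead of converging.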
Utility is always negative with the exponential utility function. So,
the sign of utility is not important. The important thing is that as c
rises so does utility and this increase in utility decreases with the level
of c. The quadratic utility function is ∩ shaped. As can be seen, $U_1 =
\alpha - \beta c \gtreqless 0$ depending on whether $c \lesseqgtr \alpha/\beta$. Therefore, this function
rises or falls depending on the value of c. Only the upward portion of
the ∩ is valid. The peak of the ∩ occurs at c = α/β. This explains the
restriction imposed on c. The quadratic utility function is still strictly
concave because U11 = − β < 0. Sometimes quadratic utility functions
are used in numerical work to approximate nonquadratic ones, as is
done in Chapter 6. The MATLAB code for making a version of Figure
2.2.1 is in Section 2.8.1.
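
The properties claimed in Example 1 can be checked numerically. The following is a minimal Python sketch (not the MATLAB code of Section 2.8.1) that evaluates the four utility functions at the parameter values quoted for Figure 2.2.1. The function names are invented for this illustration.

```python
import math

def u_log(c):
    return math.log(c)

def u_crra(c, rho):
    # Isoelastic utility with the constant term -1/(1 - rho) included.
    return (c ** (1.0 - rho) - 1.0) / (1.0 - rho)

def u_exp(c, gamma):
    return -math.exp(-gamma * c)

def u_quad(c, alpha, beta):
    return alpha * c - beta * c ** 2 / 2.0

# The isoelastic function approaches ln c as rho approaches 1.
c = 2.0
gaps = [abs(u_crra(c, rho) - u_log(c)) for rho in (1.5, 1.1, 1.01, 1.001)]

# The quadratic function peaks at c = alpha/beta.
alpha, beta = 0.5, 0.2
peak = alpha / beta

if __name__ == "__main__":
    print("gap between isoelastic and log utility:", gaps)
    print("quadratic utility around its peak:",
          [u_quad(x, alpha, beta) for x in (peak - 0.5, peak, peak + 0.5)])
    print("exponential utility at c = 0, 1, 4:",
          [u_exp(x, 1.0) for x in (0.0, 1.0, 4.0)])
```

The list `gaps` shrinks as ρ moves toward 1, which is the convergence property that motivates keeping the constant term.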

Likewise, let
v = V (1 − h )

represent the utility function for leisure. Here it is presumed that an
individual has one unit of time that can be split between working and
leisure. This utility function returns the level of happiness, v, that a
worker realizes if he spends the proportion of his time h working so
that he enjoys the fraction 1 − h in leisure. Leisure can be thought of
as a good, just as consumption is, so the properties for V are the same
as those imposed on U.

Figure 2.2.1: Utility functions: logarithmic (ln); crra (ρ = 1.5);
exponential (γ = 1); quadratic (α = 0.5, β = 0.2). [Axes: utility,
u = U(c), against consumption, c.] The exponential utility function
always returns a negative value for utility, u. The crra and ln utility
functions can yield both negative and positive values for utility. When
ρ > 1 the crra utility function is more concave than the ln one and less
so when ρ < 1. The quadratic utility function declines when c > α/β.
Hence, it is only good for c < α/β. The peak of the quadratic utility
function occurs at c = α/β = 2.5.

2.2.2 A Static Consumer/Worker’s Decision Problem

Suppose that a person works on a spot market for labor. They have one
unit of time that they can split between working, h, and leisure, 1 − h.
The wage rate for a unit of labor is w. Let a denote the worker’s level
of assets, which is exogenously specified for the moment. This will be
used to calculate the effect of wealth, a, on labor supply, h. Later in this
chapter a will be connected with the rental income that accrues from a
fixed amount of capital or land. Additionally, in Chapter 6 the rental
income that a person will earn from their ownership of capital will be
the outcome of a consumer/worker’s consumption/savings problem.
The worker’s static maximization problem is

\[ \max_{c,h}\{U(c) + V(1-h)\}, \]

subject to the budget constraint

c = wh + a.

The lefthand side is the person’s expenditure on consumption, c. The
righthand side represents the resources at their disposal, which derive
from their labor income, wh, and their wealth, a. By substituting the
budget constraint into the objective function to solve out for
consumption, c, a maximization problem in just hours worked, h, can be
obtained. The revised maximization problem is

\[ \max_{h}\{U(wh+a) + V(1-h)\}. \]

The objective function for this problem is shown in Figure 2.2.2.

The slope of the objective function is zero at a maximum. (See the
Mathematical Appendix in Chapter A for an elementary exposition of
maximization problems.) This corresponds to setting

U1 (wh + a)w − V1 (1 − h) = 0,

which is the first-order condition to the single-variable maximization

problem. When the objective function is strictly concave in h, this
first-order condition is both a necessary and sufficient condition for a
maximum in h to attain. This equation has the form of (2.1.1), which
can be seen by making the following definition for Z (h):

Z (h) ≡ U1 (wh + a)w − V1 (1 − h) = 0.

The above describes one equation in the one unknown endogenous

variable, h, where a and w are exogenous variables. By the implicit
function theorem, the solution for h to this equation can be written as
h = H ( a, w). The function H is the person’s decision rule. It gives their
optimal choice for labor effort, h, as a function of the exogenous vari-
ables a and w. (The implicit function theorem is presented in Chapter
A.)
The above first-order condition can be rewritten as

\[
\underbrace{U_1(wh+a)w}_{\text{Marginal benefit from working}} = \underbrace{V_1(1-h)}_{\text{Marginal cost of working}}. \tag{2.2.1}
\]

The solution is portrayed in Figure 2.2.3. The righthand side is in-

creasing in h, because V1 is decreasing in 1 − h. This results from the
assumption that the utility function for leisure is strictly concave or
that leisure exhibits diminishing marginal utility; i.e., V11 = dV1 (1 −
h)/d(1 − h) < 0. The righthand side represents the marginal cost of
working; therefore, the marginal cost of working is increasing in hours
worked, h. The lefthand side is decreasing in h, because U1 is decreas-
ing in c, and hence wh + a, due to the fact that utility is strictly concave
in consumption (U11 = dU1 (c)/dc < 0). The lefthand side portrays the
marginal benefit of working; hence, the marginal benefit of working
decreases in labor effort, h. This arises because the utility function for
consumption is strictly concave in h or because consumption displays
diminishing marginal utility. Your consumption increases as you work
more, but the value of the extra consumption is subject to diminishing
marginal utility.

Figure 2.2.2: The representative consumer/worker’s objective function.
[Axes: utility, U(wh + a) + V(1 − h), against hours worked, h.] The
level of hours worked, h, that maximizes utility occurs where the slope
of the utility function is zero, or where U1 (wh + a)w − V1 (1 − h) = 0.

In principle a corner solution can occur; the mathematics of corner
solutions is discussed in Chapter A. For example, the person would
not want to work (h = 0) when

\[
\underbrace{U_1(a)w}_{\text{Marginal benefit from working}} < \underbrace{V_1(1)}_{\text{Marginal cost of working}}.
\]

Here the marginal benefit of working (at h = 0) is less than its marginal
cost. It’s likely that there exists a large enough value for a such that
the corner solution holds for sure. For example, think about the situation
where $\lim_{a\to\infty} U_1(a) = 0$. The person is so wealthy that an extra
unit of consumption is worthless to them. In this case (not shown)
the marginal cost curve in Figure 2.2.3 would lie above the marginal
benefit curve at h = 0, which is flat at zero for all values of h.
The impact that shifts in wealth, a, and wages, w, have on labor
supply, h, are now analyzed. The notions of income and substitution
effects, as advanced by Hicks (1939) and Slutsky (1915), come into play
here.

Sir John Hicks (1904-1989) was a British economist. Hicks won the
Nobel Prize in Economics in 1972. He brought many important ideas
into economics: income and substitution effects, compensating and
equivalent variations, and the IS-LM model.

Figure 2.2.3: Labor-leisure choice. [Axes: MB (the lefthand side of
(2.2.1)) and MC (the righthand side) against hours worked.] The optimal
level of hours worked, h, occurs where the marginal benefit from
working, MB, equals its marginal cost, MC; compare with equation
(2.2.1). An increase in wealth, a, causes the MB curve to shift down,
which results in a decline in hours worked from h to h′. This
illustrates the wealth effect on labor supply.

The impact of an increase in wealth, a

Suppose the worker is wealthier; i.e., increase a. By examining the
righthand side of equation (2.2.1), it is clear that the marginal cost
curve in Figure 2.2.3 is not a function of a. From the lefthand side of
(2.2.1) it can be seen that the marginal benefit curve is. In particular,
for any given level of h, an increase in a will cause c = wh + a to
rise. Hence U1 (wh + a) falls, due to diminishing marginal utility or
the fact that the utility function is strictly concave. Thus, the marginal
benefit curve shifts down. This results in a fall in labor supply, h–
see Figure 2.2.3. When the person gets wealthier, they would like to
spread the windfall across both consumption and leisure. They won’t
use all of the windfall for consumption because the marginal utility
of consumption is declining so that each extra unit of consumption
is worth less and less. Hence some of the gain in wealth should be
directed toward leisure.
To obtain the mathematical transliteration of the graphical analysis
take the total differential of (2.2.1) with respect to a and h. This gives

\[ U_{11}(wh+a)w^2\,dh + U_{11}(wh+a)w\,da = -V_{11}(1-h)\,dh, \]

which yields

\[
\frac{dh}{da} = \frac{-U_{11}(wh+a)w}{U_{11}(wh+a)w^2 + V_{11}(1-h)} < 0. \tag{2.2.2}
\]
The concepts of total differentials and total derivatives are reviewed
in Chapter A. The sign of the above expression results from the fact
that U11 (wh + a) and V11 (1 − h) are both negative because the utility
functions for goods and leisure are assumed to be strictly concave.
As will be seen, the term on the righthand side of (2.2.2) is closely
connected with the income effect from a rise in wages.

The impact of a rise in wages, w

Suppose that wages rise. It’s hard to tell if the marginal benefit curve
will shift to the right or left. There are two opposing forces, as can be

seen by examining the lefthand side of (2.2.1). First, holding marginal

utility constant, or U1 (wh + a), an increase in w raises the marginal
benefit from working. In response, hours worked will rise. This is the
substitution effect from the rise in wages. Second, holding h fixed, an
increase in w operates to reduce the marginal utility of consumption.
Hence, on this account, the marginal benefit from working will drop.
Hours worked will fall and hence leisure rise. This represents the
income effect from an increase in wages. Mathematically one finds, by
taking the total differential of (2.2.1) with respect to h and w, that

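\[ U_{11}(wh+a)w^2\,dh + U_{11}(wh+a)hw\,dw + U_1(wh+a)\,dw = -V_{11}(1-h)\,dh, \]

so that

\[
\begin{aligned}
\frac{dh}{dw} &= \underbrace{\frac{-U_{11}(wh+a)wh}{U_{11}(wh+a)w^2 + V_{11}(1-h)}}_{\text{Income Effect, } <0} + \underbrace{\frac{-U_1(wh+a)}{U_{11}(wh+a)w^2 + V_{11}(1-h)}}_{\text{Substitution Effect, } >0} \gtreqless 0 \\
&= h\frac{dh}{da} + \frac{-U_1(wh+a)}{U_{11}(wh+a)w^2 + V_{11}(1-h)} \gtreqless 0, \text{ using (2.2.2).}
\end{aligned} \tag{2.2.3}
\]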

Thus, the effect is ambiguous depending on the relative size of the

income and substitution effects. The income effect is given by the first
term. The size of the income effect is proportional to the amount of work
that the person does, h. If he did little work (h ≃ 0), then he would
not gain much in income from an increase in the wage rate.
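
The decomposition in (2.2.3) can be checked numerically. The sketch below is only an illustration under assumed functional forms that are not from the text: marginal utilities $U_1(c) = c^{-\rho}$ with ρ = 2 and $V_1(1-h) = 1/(1-h)$, solved with a hand-written bisection routine. It compares the analytical expressions (2.2.2) and (2.2.3) with finite-difference derivatives of the solved labor supply.

```python
def bisect(f, lo, hi, tol=1e-13, max_iter=200):
    """Plain bisection; assumes f(lo) and f(hi) have opposite signs."""
    f_lo = f(lo)
    if f_lo * f(hi) > 0:
        raise ValueError("root is not bracketed")
    for _ in range(max_iter):
        mid = 0.5 * (lo + hi)
        f_mid = f(mid)
        if f_lo * f_mid <= 0:
            hi = mid
        else:
            lo, f_lo = mid, f_mid
        if hi - lo < tol:
            break
    return 0.5 * (lo + hi)

RHO = 2.0  # assumed curvature of U (illustrative)

def U1(c):  return c ** (-RHO)
def U11(c): return -RHO * c ** (-RHO - 1.0)
def V1(l):  return 1.0 / l
def V11(l): return -1.0 / l ** 2

def hours(w, a):
    # Root of Z(h) = U1(wh + a) w - V1(1 - h), the first-order condition (2.2.1).
    return bisect(lambda h: U1(w * h + a) * w - V1(1.0 - h), 1e-9, 1.0 - 1e-9)

w, a = 1.0, 0.2
h = hours(w, a)
c = w * h + a
denom = U11(c) * w ** 2 + V11(1.0 - h)

dh_da = -U11(c) * w / denom          # equation (2.2.2)
income = -U11(c) * w * h / denom     # first term of (2.2.3)
substitution = -U1(c) / denom        # second term of (2.2.3)
dh_dw = income + substitution

# Central finite differences of the numerically solved labor supply.
eps = 1e-5
fd_da = (hours(w, a + eps) - hours(w, a - eps)) / (2 * eps)
fd_dw = (hours(w + eps, a) - hours(w - eps, a)) / (2 * eps)

if __name__ == "__main__":
    print("h =", h)
    print("dh/da: formula", dh_da, "finite difference", fd_da)
    print("dh/dw: formula", dh_dw, "finite difference", fd_dw)
    print("income effect", income, "substitution effect", substitution)
```

With these particular parameter values the income effect outweighs the substitution effect, so hours fall when the wage rises; other choices of ρ and a can reverse the sign.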

Example 2. (Logarithmic utility) Let U (c) = θ ln c and V (1 − h) =

(1 − θ ) ln(1 − h) where 0 < θ < 1. This is the most commonly used
functional form for utility in macroeconomics. Then, the above first-
order condition appears as

\[
\underbrace{\theta\,\frac{1}{wh+a}\times w}_{MB} = \underbrace{(1-\theta)\,\frac{1}{1-h}}_{MC}.
\]

Cross multiplying and solving for h yields:

\[
h = \frac{\theta w - (1-\theta)a}{w} = \theta - (1-\theta)\frac{a}{w},
\]
at least when there is an interior solution. Note that it is possible for
h = 0, which occurs when the above equation returns a negative solu-
tion for h. A negative value for hours worked, h, is invalid; the lowest
it can be is zero. There are three cases to consider.
(i) Labor supply, h, is increasing in wages, w, when a > 0. Hence,
the substitution effect is larger than the income effect. As an illus-
tration of this case, think of a married household where the husband
works a fixed work week and a is his income. Here, w represents the

wife’s wage. Hence, in a married household a rise in the wife’s wage,

w, would cause her to work more. A married woman may not work
(h = 0), if her husband earns enough. This occurs when the marginal
benefit from working, MB, is less than the marginal cost, MC, when
evaluating (2.2.1) at h = 0.
(ii) If a = 0, then a change in wages will have no impact on labor sup-
ply, because the income and substitution effects from a rise in wages
would exactly cancel out. Also, observe that if assets were propor-
tional to wages, so that a = ψw for any ψ ≥ 0, then a rise in wages, w,
would have no impact on hours worked, h; again, the income and sub-
stitution effects exactly cancel out. This property is used in balanced
growth models, where wealth rises in tandem with wages.
(iii) When a < 0 then an increase in wages will reduce hours worked.
Here the income effect dominates the substitution effect.
Taking stock of things, the strength of the income effect is decreasing
in assets, a. This makes sense. An extra dollar must be worth more to
a poor person vis à vis a rich one.
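
The three cases of Example 2 are easy to reproduce. The following Python sketch codes the closed-form solution, truncated at the corner h = 0, and checks it against the first-order condition. The parameter values (θ = 0.4 and the wage and asset levels) are assumptions chosen for illustration.

```python
def hours_log(w, a, theta):
    # Closed form from Example 2, with the corner solution h = 0 imposed
    # whenever the formula returns a negative number.
    return max(0.0, theta - (1.0 - theta) * a / w)

def foc_residual(h, w, a, theta):
    # Marginal benefit minus marginal cost of working under log utility.
    return theta * w / (w * h + a) - (1.0 - theta) / (1.0 - h)

theta = 0.4

# Interior solution: the closed form should set MB = MC.
h_interior = hours_log(1.0, 0.2, theta)
resid = foc_residual(h_interior, 1.0, 0.2, theta)

# Cases (i)-(iii): double the wage and compare hours.
case_i = (hours_log(1.0, 0.2, theta), hours_log(2.0, 0.2, theta))      # a > 0
case_ii = (hours_log(1.0, 0.0, theta), hours_log(2.0, 0.0, theta))     # a = 0
case_iii = (hours_log(1.0, -0.2, theta), hours_log(2.0, -0.2, theta))  # a < 0

# Corner solution: assets are large enough that MB < MC at h = 0.
h_corner = hours_log(1.0, 1.0, theta)

if __name__ == "__main__":
    print("interior h:", h_interior, "FOC residual:", resid)
    print("a > 0:", case_i, " a = 0:", case_ii, " a < 0:", case_iii)
    print("corner h:", h_corner)
```

In the corner case the marginal benefit at h = 0 is θw/a = 0.4, which is below the marginal cost 1 − θ = 0.6, in line with the inequality for a corner solution given earlier.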

Example 3. (Subsistence level of consumption) Rewrite the utility
function as $U(c) = \theta\ln(c - \bar{c})$ and give the person the budget constraint
c = wh; i.e., assume that the person has no assets, a. Here $\bar{c}$ can be
interpreted as a subsistence level of consumption. The solution to this
case can be obtained from the previous example by letting $\bar{c} = -a > 0$
(so that the solution will operate in the same manner as assuming that
a < 0). Therefore, in this illustration the solution for h, associated with
the consumer/worker’s maximization problem, has the same form as before:
$h = \theta + (1-\theta)\bar{c}/w$. The higher the subsistence level of consumption,
$\bar{c}$, is, the harder the person will work. As wages rise the person will
work less.

Example 4. (Zero-income effect utility function) Let utility be given by

$W(c,h) = \ln\big(c - h^{1+\eta}/(1+\eta)\big)$. The worker’s problem is

\[ \max_{h} W(\underbrace{wh+a}_{c},\, h). \]

The first-order condition for labor reads

\[ \underbrace{W_1(c,h)\times w}_{MB} = \underbrace{-W_2(c,h)}_{MC}. \]

Plugging in the above functional form gives

\[
\frac{1}{c - h^{1+\eta}/(1+\eta)}\times w = \frac{h^{\eta}}{c - h^{1+\eta}/(1+\eta)},
\]

so that

\[ h = w^{1/\eta}. \]

Labor supply, h, is increasing in wages, w. Here 1/η gives the elasticity

of labor supply with respect to wages. Using the above formula it is
easy to calculate that
\[ \frac{w}{h}\frac{dh}{dw} = \frac{1}{\eta}, \]
where the lefthand side gives the percentage change in hours worked
in response to a percentage change in wages. This utility function has
no income effect. Note the absence of a in the solution for h. This
utility function is often used in business cycle modeling because of its
simple solution for h. It will be returned to in Chapter 6.
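
A short numerical check of Example 4 follows. It is a sketch with assumed parameter values (η = 2, w = 0.8, a = 0.1) and made-up function names. It confirms that h = w^(1/η) maximizes the objective, that assets play no role, and that the wage elasticity of hours equals 1/η.

```python
import math

def hours_no_income_effect(w, eta):
    # Closed-form solution from Example 4; assets, a, do not appear.
    return w ** (1.0 / eta)

def objective(h, w, a, eta):
    # W(wh + a, h) = ln(c - h**(1 + eta)/(1 + eta)) with c = wh + a.
    return math.log(w * h + a - h ** (1.0 + eta) / (1.0 + eta))

eta, w, a = 2.0, 0.8, 0.1
h_star = hours_no_income_effect(w, eta)

# The objective should be lower on either side of h_star.
neighbors = (objective(h_star - 0.01, w, a, eta),
             objective(h_star, w, a, eta),
             objective(h_star + 0.01, w, a, eta))

# Elasticity of hours with respect to the wage, by log differences.
dw = 1e-6
elasticity = ((math.log(hours_no_income_effect(w + dw, eta))
               - math.log(hours_no_income_effect(w - dw, eta)))
              / (math.log(w + dw) - math.log(w - dw)))

if __name__ == "__main__":
    print("h* =", h_star)
    print("objective at h* - 0.01, h*, h* + 0.01:", neighbors)
    print("elasticity:", elasticity, "vs 1/eta =", 1.0 / eta)
```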

2.3 Government Spending and Taxation

How do government spending and labor income taxation affect hours

worked? This will be explored now. It will be discovered that the effect
depends upon how the revenue from the taxation is used. This gov-
erns the income effect associated with a government spending-cum-
tax plan. Specifically, the impact of taxation will differ according to
whether the government uses the revenue for:

1. Lump-sum transfer payments. Lump-sum transfer payments imply

that taxation has a zero income effect because all revenue is rebated
back to the consumer/worker.

2. Government spending. There are two cases to consider here:

(a) Government spending does not substitute for private consump-

tion spending. Here there will be a negative income effect asso-
ciated with the government spending because the government is
drawing resources out of the economy that will reduce private
consumption spending.
(b) Government spending substitutes for private consumption spend-
ing. The negative income effect will be mitigated to the extent
that the spending substitutes for reduced private consumption.

Suppose that there is a government in the economy that taxes labor

income at the rate τ. It uses the revenue raised from labor income
taxes to finance government spending on goods and services, g, and to
provide lump-sum transfer payments, λ. The strength of the impact of
taxation on labor supply depends crucially on how the revenue raised
is used.
The government’s budget constraint appears as

\[
\underbrace{g+\lambda}_{\text{disbursements}} = \underbrace{\tau wh}_{\text{receipts}}. \tag{2.3.1}
\]

To calculate the effect of taxation on labor supply follow the path out-
lined in Chapter 1:
1. Solve the individual’s labor-leisure choice with labor taxation and
transfer payments to obtain the individual’s first-order condition.

2. Impose any equilibrium conditions and the government’s budget

constraint on the first-order condition.
Incorporating any equilibrium conditions and/or government’s bud-
get constraint into the individual’s labor-leisure choice before solving
the person’s problem leads to a fundamental error in economics, which
is discussed later on.

2.3.1 Step 1, The worker’s problem with taxation

Start with the case where the government taxes labor income at the rate
τ, provides lump-sum transfers in the amount, λ, and spends g. The
consumer/worker does not value government spending here. Valued
government spending will be discussed later. The worker’s problem is
now
\[ \max_{c,h}\{U(c) + V(1-h)\}, \]
subject to his budget constraint
c = (1 − τ )wh + a + λ.
The worker’s after-tax labor income is (1 − τ )wh when he works the
amount h. The level of lump-sum transfer payments, λ, is unrelated
to the individual’s work effort. This provides an additional source of
funds for consumption spending, c. After using the budget constraint
to solve out for c in the utility function, it is easy to deduce that the
first-order condition for h is
Z (h) ≡ U1 ((1 − τ )wh + a + λ) × (1 − τ )w − V1 (1 − h) = 0,
which can be rewritten as
\[
\underbrace{U_1((1-\tau)wh + a + \lambda)}_{IE} \times \underbrace{(1-\tau)w}_{SE} = V_1(1-h). \tag{2.3.2}
\]
Again, observe that this is one equation in one unknown. A value
of h is sought that sets Z (h) = 0. Represent the solution by h =
H (w, τ, λ, a). Will income taxation raise or reduce work effort? On the
one hand, a unit of labor now earns the after-tax wage rate, (1 − τ )w.
This creates a disincentive to work, at least when consumption c =
(1 − τ )wh + a + λ is held fixed. This is the substitution effect, SE,
from taxation. On the other hand, the person’s income, (1 − τ )wh +
a + λ, and hence consumption will be lower, for a given level of hours
worked, h. This operates to make the person work harder, other things
equal. This is the income effect, IE.

2.3.2 Step 2, Impose the government’s budget constraint

Proceed now to Step 2. By using the government’s budget constraint
(2.3.1), to solve out for lump-sum transfers, λ, in the above first-order
condition (2.3.2), one obtains

U1 (wh + a − g)(1 − τ )w = V1 (1 − h). (2.3.3)

It is easy to see intuitively how the government can affect the worker.
First, the presence of taxation distorts the person’s labor supply deci-
sion, as shown by the (1 − τ ) term. This works as a negative substitu-
tion effect. Second, the government takes away some of the economy’s
resources, as reflected by the g term. This operates as a negative income
effect; the situation is portrayed in Figure 2.3.1. If there is no
government spending, such as when all revenue is rebated back as
transfer payments, there will be no income effect associated with the
taxation. The case of transfers is discussed next.

Figure 2.3.1: The size of the

economy is wh + a, or the area
of the circle. Out of this pie
the government takes the slice g
leaving c for private consump-
tion. The size of the pie is en-
dogenous, depending on labor
supply, h. Labor supply is a
function of the level of govern-
ment spending, g, and the rate
of labor income taxation, τ.

2.3.3 All taxes rebated back as lump-sum transfers

Here,
τwh = λ and g = 0.

The worker’s consumption will be

c = (1 − τ )wh + a + λ = wh + a.

Plug this into the first-order condition (Step 2) to get

\[
U_1(wh+a)\underbrace{(1-\tau)w}_{SE} = V_1(1-h), \tag{2.3.4}
\]

or equivalently just set g = 0 in (2.3.3). The way taxes enter the above
expression suggests that only the substitution effect will be opera-
tional.
Now an increase in taxes reduces the marginal benefit from work-
ing. Hence, labor supply falls. Parroting the above exercises one gets

\[
\frac{dh}{d\tau} = \frac{U_1(wh+a)w}{U_{11}(wh+a)(1-\tau)w^2 + V_{11}(1-h)} < 0.
\]
This is just the substitution effect–compare with the second term in
(2.2.3). In terms of Figure 2.2.3 an increase in taxes will shift the MB
curve down for any given value of h. This transpires because the after-
tax wage rate, (1 − τ )w, drops. The MC curve remains fixed.
Remark 5. (The meaning of representative agent models) The construct
of a representative agent is a stand-in device for millions of identical
individuals each maximizing their own welfare while taking the ac-
tions of other parties in the economy as given. A huge blunder for
a macroeconomist to make is to substitute the government’s budget
constraint into the consumer/worker’s one before the maximization
is done. If this is done, then (2.2.1) will reappear instead of (2.3.2).
Hence, there would be no apparent effect of taxes on labor supply. To
understand the mistake suppose that there are n identical agents in
the economy. Let the representative agent choose labor supply in the
amount h. Suppose that the other n − 1 people pick $\bar{h}$. Of course in
equilibrium $h = \bar{h}$, because they are all the same. The government’s
budget constraint can be written as

\[ \tau wh + (n-1)\tau w\bar{h} = n\lambda, \]

or

\[ \lambda = \tau wh/n + (n-1)\tau w\bar{h}/n. \]

The person knows if he works one unit of time more, and no one else
does, his transfers will increase by τw/n. The person’s maximization
problem is now

\[ \max_{h}\{U((1-\tau)wh + a + \tau wh/n + (n-1)\tau w\bar{h}/n) + V(1-h)\}, \]

where λ has been solved out for in his budget constraint. The individual
cannot tell his neighbor what to do, so he must take $\bar{h}$ as given in
this maximization problem. The agent’s first-order condition is

\[ U_1((1-\tau)wh + a + \tau wh/n + (n-1)\tau w\bar{h}/n)(1-\tau+\tau/n)w = V_1(1-h). \]

Now set $h = \bar{h}$ because all individuals will be the same in competitive
equilibrium. The above condition then appears as

\[ U_1(wh+a)(1-\tau+\tau/n)w = V_1(1-h). \]


Clearly, as n → ∞ this converges to (2.3.4). When n = 1 it looks like

(2.2.1). Thus, the mistake is treating the representative agent as being
the single person in the economy instead of as representing millions of
identical individuals, each acting on their own, while taking as given
other people’s actions.
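
The role of n in Remark 5 can be made concrete with a special case. The sketch below assumes logarithmic utility as in Example 2 with a = 0, an assumption made here purely for illustration. In that case the symmetric-equilibrium first-order condition, θ(1 − τ + τ/n)/h = (1 − θ)/(1 − h), has a closed-form solution. The parameter values θ = 0.4 and τ = 0.3 are hypothetical.

```python
THETA, TAU = 0.4, 0.3  # assumed illustrative values

def kappa(n):
    # Share of the wage kept at the margin when there are n agents:
    # 1 - tau + tau/n.
    return 1.0 - TAU + TAU / n

def hours_symmetric(n):
    # Solves theta*kappa/h = (1 - theta)/(1 - h) for h.
    return THETA * kappa(n) / (THETA * kappa(n) + 1.0 - THETA)

def foc_residual(h, n, w=1.0):
    # Symmetric equilibrium with a = 0, so consumption equals w*h.
    return THETA * kappa(n) * w / (w * h) - (1.0 - THETA) / (1.0 - h)

sizes = (1, 2, 10, 1000, 10 ** 9)
hours_by_n = {n: hours_symmetric(n) for n in sizes}

# Limit implied by (2.3.4) under the same log-utility assumption.
h_limit = THETA * (1.0 - TAU) / (1.0 - THETA * TAU)

if __name__ == "__main__":
    for n in sizes:
        print(n, hours_by_n[n], foc_residual(hours_by_n[n], n))
    print("limit as n grows:", h_limit)
```

With n = 1 the tax has no effect on hours (h = θ), which is the mistaken conclusion described in the remark. As n grows, hours fall toward the value implied by (2.3.4).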

2.3.4 Non-Valued Government Spending

Consider the case where all tax revenue is used to finance government
spending on goods and services, g. Since the consumer/worker does
not value the government spending, g does not enter his utility func-
tion, in contrast to the case of valued government spending discussed
below. Since there are no lump-sum transfers, λ = 0 so that

τwh = g.

Plugging this revised government budget constraint into the first-order

condition (2.3.3) yields

\[
\underbrace{U_1((1-\tau)wh + a)}_{IE}\,\underbrace{(1-\tau)w}_{SE} = V_1(1-h). \tag{2.3.5}
\]

Now an increase in taxes has an ambiguous impact on the marginal

benefit from working, since a rise in taxation has both an income and
substitution effect. Parroting the above exercises yields

\[
\frac{dh}{d\tau} = \underbrace{\frac{U_{11}((1-\tau)wh+a)(1-\tau)w^2h}{U_{11}((1-\tau)wh+a)(1-\tau)^2w^2 + V_{11}(1-h)}}_{\text{income effect, } >0}
+ \underbrace{\frac{U_1((1-\tau)wh+a)w}{U_{11}((1-\tau)wh+a)(1-\tau)^2w^2 + V_{11}(1-h)}}_{\text{substitution effect, } <0} \lesseqgtr 0.
\]

Whether hours worked will fall or rise depends on whether the sub-
stitution or income effect dominates.
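
To compare the two uses of tax revenue, the sketch below solves (2.3.4) and (2.3.5) in closed form under an assumed logarithmic specification, U(c) = θ ln c and V(1 − h) = (1 − θ) ln(1 − h), as in Example 2. The closed forms and parameter values are derived and chosen here for illustration; they do not appear in the text. Each solution is verified against its first-order condition.

```python
THETA, W, A = 0.4, 1.0, 0.1  # assumed illustrative values

def hours_rebated(tau):
    # Solves (2.3.4): theta(1-tau)w/(wh + a) = (1-theta)/(1-h).
    return (THETA * (1.0 - tau) * W - (1.0 - THETA) * A) / (W * (1.0 - THETA * tau))

def hours_spent(tau):
    # Solves (2.3.5): theta(1-tau)w/((1-tau)wh + a) = (1-theta)/(1-h).
    return THETA - (1.0 - THETA) * A / ((1.0 - tau) * W)

def residual_rebated(h, tau):
    return THETA * (1.0 - tau) * W / (W * h + A) - (1.0 - THETA) / (1.0 - h)

def residual_spent(h, tau):
    return (THETA * (1.0 - tau) * W / ((1.0 - tau) * W * h + A)
            - (1.0 - THETA) / (1.0 - h))

taxes = (0.0, 0.1, 0.2, 0.3)
table = [(tau, hours_rebated(tau), hours_spent(tau)) for tau in taxes]

if __name__ == "__main__":
    print("tau   h (revenue rebated)   h (revenue spent on g)")
    for tau, h_r, h_s in table:
        print(tau, h_r, h_s)
```

Hours fall faster with the tax rate when revenue is rebated, because only the substitution effect operates. When revenue is spent on non-valued goods the income effect pushes the other way; in this specification, setting `A = 0` makes `hours_spent` equal to θ at every tax rate, so the two effects cancel exactly.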

2.3.5 Valued Government Spending

The case where government spending is valued is now analyzed. There

are two cases to consider. In the first case government spending di-
rectly substitutes for private consumption, while in the second case it
does not. Once again presume that all tax revenue is used to finance
government spending; i.e., λ = 0.
Government spending is valued in the same way as private consumption
It is easy to allow for government spending to be valued. For instance,
one could write the consumer/worker’s utility function as

U (c + ωg) + V (1 − h),

where ω is a constant specifying the value that an agent derives from

government spending. Equation (2.3.3) now appears as

U1 ([1 − (1 − ω )τ ]wh + a)(1 − τ )w = V1 (1 − h),

because ωg = ωτwh so that −(1 − ω ) g = −(1 − ω )τwh. What hap-

pens if ω = 0 or ω = 1? It is easy to see that when ω = 0 the
above first-order condition reduces to (2.3.5), where there is an in-
come effect connected with taxation. Alternatively, when ω = 1 it is
the same as (2.3.4), so that there is no income effect associated with
taxation. In general, when 0 < ω < 1 the size of the income effect
associated with taxation depends upon how valuable or substitutable
government spending is in terms of private consumption. This is gov-
erned by the size of ω. The drain on private spending due to taxation
is offset by the portion of government spending, ωτwh, that is substi-
tutable for private spending. When government spending on goods
and services is not very substitutable in terms of private consump-
tion the consumer/worker will feel a bigger loss in terms of private
consumption than when it is substitutable.
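
As a worked special case (an assumption made here for illustration, not taken from the text), suppose utility is logarithmic as in Example 2 and a = 0. The first-order condition with valued government spending then solves in closed form:

\[
\frac{\theta(1-\tau)w}{[1-(1-\omega)\tau]wh} = \frac{1-\theta}{1-h}
\quad\Longrightarrow\quad
h = \frac{\theta(1-\tau)}{\theta(1-\tau) + (1-\theta)[1-(1-\omega)\tau]}.
\]

When ω = 0 this gives h = θ, so the income and substitution effects of the tax cancel exactly. When ω = 1 it gives h = θ(1 − τ)/(1 − θτ) < θ, which is the pure substitution effect. Because the denominator is increasing in ω, hours fall monotonically as ω rises from 0 to 1.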

Government spending is valued in a different way than private consumption
Suppose alternatively that government spending is valued according
to the concave utility function G ( g) so that the individual’s utility func-
tion can be written as

U ( c ) + G ( g ) + V (1 − h ).

Equation (2.3.5) now reads

U1 ((1 − τ )wh + a)(1 − τ )w = V1 (1 − h).

Thus, surprisingly, this case can be analyzed in the same fashion as

the situation where government spending is a deadweight loss! This
makes clear that the income effect from government spending derives
from the fact that it reduces private consumption. The loss depends on
the degree to which government spending is substitutable for private
spending.

2.3.6 Progressive Income Taxation

Let the consumer/worker face a progressive tax schedule where his
income tax is given by
T (wh),
with T (0) = 0, T1 , T11 > 0. The tax schedule is displayed in Figure
2.3.2. It is convex due to the assumption that T11 > 0. This implies
that taxation is progressive since the tax rate that the person pays on
the last dollar earned, or T1 (wh), increases with labor income, wh. The
consumer/worker’s choice problem for h can now be formulated as

\[ \max_{h}\{U(wh - T(wh) + a + \lambda) + V(1-h)\}. \]

It is straightforward to calculate that his first-order condition will now

read

\[
U_1(\underbrace{wh - T(wh) + a + \lambda}_{c})\,[1 - \underbrace{T_1(wh)}_{\text{marginal tax rate}}]\,w = V_1(1-h).
\]

Again, observe that this is one equation in one unknown. Note that the
marginal tax rate, T1 (wh), is higher than the average one, T (wh)/wh,
because the tax function is convex. When considering the disincentive
effect of distortional taxation it is important to use the marginal tax
rate and not the average one. It is the marginal tax rate that governs
the substitution effect, not the average one. The difference between
average and marginal tax rates will be touched upon in Chapter 4,
which discusses why Americans work more than Europeans.
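
The importance of the marginal rate can be illustrated numerically. The sketch below assumes a quadratic tax schedule, T(y) = κy², together with logarithmic utility; both are choices made here for illustration, as are the parameter values. It solves the first-order condition by a hand-written bisection, then re-solves the problem under a proportional tax set at the average rate the progressive schedule produced.

```python
def bisect(f, lo, hi, tol=1e-13, max_iter=200):
    """Plain bisection; assumes f(lo) and f(hi) have opposite signs."""
    f_lo = f(lo)
    if f_lo * f(hi) > 0:
        raise ValueError("root is not bracketed")
    for _ in range(max_iter):
        mid = 0.5 * (lo + hi)
        f_mid = f(mid)
        if f_lo * f_mid <= 0:
            hi = mid
        else:
            lo, f_lo = mid, f_mid
        if hi - lo < tol:
            break
    return 0.5 * (lo + hi)

THETA, W, A, LAM = 0.4, 1.0, 0.1, 0.0  # assumed illustrative values
KAPPA = 0.2                            # assumed tax parameter

def T(y):  return KAPPA * y ** 2       # T(0) = 0 and T11 = 2*KAPPA > 0
def T1(y): return 2.0 * KAPPA * y      # marginal tax rate

def Z(h):
    # First-order condition under the progressive schedule, log utility.
    y = W * h
    c = y - T(y) + A + LAM
    return THETA * (1.0 - T1(y)) * W / c - (1.0 - THETA) / (1.0 - h)

h_prog = bisect(Z, 1e-9, 1.0 - 1e-9)
y_prog = W * h_prog
marginal_rate = T1(y_prog)
average_rate = T(y_prog) / y_prog

def Z_flat(h):
    # Same problem with a proportional tax fixed at the average rate above.
    c = (1.0 - average_rate) * W * h + A + LAM
    return THETA * (1.0 - average_rate) * W / c - (1.0 - THETA) / (1.0 - h)

h_flat = bisect(Z_flat, 1e-9, 1.0 - 1e-9)

if __name__ == "__main__":
    print("hours under progressive schedule:", h_prog)
    print("marginal rate:", marginal_rate, "average rate:", average_rate)
    print("hours under flat tax at the average rate:", h_flat)
```

Using the average rate in place of the marginal one overstates hours worked, because it understates the wedge on the last unit of labor.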

Figure 2.3.2: Relationship between average and marginal tax rates.
[Axes: taxes, T(wh), against labor income, wh.] The marginal tax rate at
the income level wh is given by the slope of the T(wh) curve at the
point wh, as shown by the slope of the tangent line. The average rate is
illustrated by the slope of the straight line from the origin through
the point (wh, T(wh)), or [T(wh) − 0]/(wh − 0) = T(wh)/(wh). With
progressive taxation, the marginal tax rate, T1 (wh), exceeds the
average tax rate, T(wh)/wh.

2.4 General Equilibrium

Production will now be introduced into the above setting, which brings
into the analysis the notion of a production function. Output, o, will
now be produced using capital, k, in addition to labor, h. Capital
is assumed to be in fixed supply. This assumption is abandoned in
Chapter 6, where the supply of capital is endogenously determined.

Leon Walras (1834-1910) was a French economist at the University
of Lausanne. He is best known for his book Éléments d’économie
politique pure. This book founded general equilibrium theory. He
derived Walras’s law that states the sum of excess demands across
markets must sum to zero. This implies that any given market must
be in equilibrium, if all other markets are in equilibrium.

2.4.1 Production Functions

Assume that output, o, can be produced with capital, k, and labor, h, in
line with the following constant-returns-to-scale production function:

o = F (k, h).

Generally, it is assumed that inputs and outputs are all positive. So,
some typical properties imposed on a production function are:

1. $F: \mathbb{R}^2_+ \to \mathbb{R}_+$ (so that a production function maps the positive

reals into the positive reals). Inputs and outputs must always be
nonnegative.

2. F is strictly increasing in each of its arguments so F1 ≡ ∂F/∂k > 0

and F2 ≡ ∂F/∂h > 0. Marginal products are positive.

3. F is strictly concave in each of its arguments, so $F_{11} \equiv \partial^2F/\partial k^2 < 0$ and

$F_{22} \equiv \partial^2F/\partial h^2 < 0$. This implies that the marginal product for an
input decreases with its usage. This is called the law of diminishing
marginal returns.

4. F (θk, θh) = θF (k, h) for θ > 0–definition of constant returns to scale.

This assumption implies that the world can be represented by a
single aggregate production function. I.e., it doesn’t matter whether
the economy has θ firms each producing F (k, h) units of output or
one large firm producing F (θk, θh) units of output. The aggregate
levels of inputs and output will be exactly the same.

Example 6. (Common production functions) Here are some produc-

tion functions that are commonly used in macroeconomics.

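\[
\begin{aligned}
F(k,h) &= k^{\alpha}h^{1-\alpha}, \text{ with } 0 \le \alpha \le 1 && \text{(Cobb-Douglas)};\\
F(k,h) &= \alpha k + \beta h, \text{ with } 0 \le \alpha, \beta && \text{(linear)};\\
F(k,h) &= \alpha k + \beta h - \psi k^2/2 + \delta kh - \varepsilon h^2/2, \text{ with } 0 \le \alpha, \beta, \psi, \delta, \varepsilon,\ \Delta \equiv \psi\varepsilon - \delta^2 > 0,\\
&\quad k \le (\alpha\varepsilon + \beta\delta)/\Delta, \text{ and } h \le (\psi\beta + \delta\alpha)/\Delta && \text{(quadratic)}.
\end{aligned}
\]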

The linear production function is not strictly concave, just concave. Also, the
quadratic production function is not always increasing in k and h so
the restriction on the bottom line is needed. It does not satisfy the
constant-returns-to-scale assumption, either.
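
The scale properties of these functional forms can be verified directly. The sketch below uses assumed parameter values and hypothetical function names. It checks that the Cobb-Douglas form satisfies F(θk, θh) = θF(k, h) and the adding-up property F1 k + F2 h = F, and that the quadratic form does not have constant returns to scale.

```python
def F(k, h, alpha):
    # Cobb-Douglas production function.
    return k ** alpha * h ** (1.0 - alpha)

def F1(k, h, alpha):
    # Marginal product of capital.
    return alpha * k ** (alpha - 1.0) * h ** (1.0 - alpha)

def F2(k, h, alpha):
    # Marginal product of labor.
    return (1.0 - alpha) * k ** alpha * h ** (-alpha)

def F_quad(k, h, al, be, psi, de, ep):
    # Quadratic production function from Example 6.
    return al * k + be * h - psi * k ** 2 / 2 + de * k * h - ep * h ** 2 / 2

alpha, k, h, scale = 0.3, 2.0, 0.5, 3.0

# Constant returns to scale: scaling both inputs scales output.
crs_gap = F(scale * k, scale * h, alpha) - scale * F(k, h, alpha)

# Factor payments at marginal products exhaust output.
euler_gap = F1(k, h, alpha) * k + F2(k, h, alpha) * h - F(k, h, alpha)

# The quadratic form fails the same scaling test.
q = (1.0, 1.0, 0.1, 0.05, 0.1)  # al, be, psi, de, ep (assumed values)
quad_gap = F_quad(scale * k, scale * h, *q) - scale * F_quad(k, h, *q)

if __name__ == "__main__":
    print("Cobb-Douglas scaling gap:", crs_gap)
    print("Cobb-Douglas adding-up gap:", euler_gap)
    print("quadratic scaling gap:", quad_gap)
```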

2.4.2 The Firm’s Decision Problem

A firm hires labor to maximize its profits. Let the capital stock, k, be
fixed. Denote the rental price of capital by r. A firm’s profits are given
by its sales, F (k, h), minus what it pays out to the factors of production
that it hires, wh + rk. The firm’s profit maximization problem is

\[ \max_{h,k}\{\underbrace{F(k,h) - wh - rk}_{\text{profits}}\}. \]

The first-order condition for labor is

\[
\underbrace{F_2(k,h)}_{\text{marginal product of labor}} = \underbrace{w}_{\text{marginal cost}}. \tag{2.4.1}
\]

The situation is portrayed in Figure 2.4.1. The firm can hire as much
labor as it desires at the wage w. The function F2 implicitly defines
the demand for labor. It shows the marginal product of the last unit of
labor hired. Due to diminishing returns, or the fact that F22 < 0, this
schedule is downward sloping. The firm hires up to the point where
the marginal product of labor equals the wage rate.

The first-order condition for capital has a similar form

\[
\underbrace{F_1(k,h)}_{\text{marginal product of capital}} = \underbrace{r}_{\text{marginal cost}}, \tag{2.4.2}
\]

where again r is the rental rate for capital.

Figure 2.4.1: The firm’s hiring decision. [Axes: the lefthand and
righthand sides of (2.4.1) against labor, h.] The labor demand function
is given by the marginal product for labor curve, F2 (k, h). The firm
can hire as much labor as it wants at the wage rate w. So, the labor
supply function is given by the horizontal line at w. The level of
employment, h*, is determined by the point of intersection between
the labor demand and supply curves.

2.4.3 Equilibrium
To characterize the determination of hours worked in the economy
(with taxes) solve out for w using (2.4.1) in the consumer/worker’s
first-order condition for labor (2.3.2) to get

$$U_1((1-\tau)\underbrace{F_2(k,h)}_{w}h + a + \lambda)(1-\tau)\underbrace{F_2(k,h)}_{w} = V_1(1-h).$$

One should solve out for w after solving the representative
consumer/worker’s problem. Next, substitute out for λ using the
government’s budget constraint (2.3.1) to obtain

$$U_1(F_2(k,h)h + a - g)(1-\tau)F_2(k,h) = V_1(1-h).$$

One might think that the individual owns the economy’s fixed capital
stock. Each period capital, k, will earn its marginal product, F1(k, h).
Thus, it is reasonable to let a = rk = F1(k, h)k. That is, the worker earns
rental on capital accruing from the operation of the firm. This will be
discussed in more detail later on. If one solves out for a in this fashion,
the result will still be one equation in one unknown:

$$U_1(\underbrace{F_2(k,h)h + F_1(k,h)k - g}_{c})(1-\tau)F_2(k,h) = V_1(1-h),$$

so that

$$U_1(F(k,h) - g)(1-\tau)F_2(k,h) = V_1(1-h), \qquad (2.4.3)$$

because by the constant-returns-to-scale assumption¹

$$F_2(k,h)h + F_1(k,h)k = F(k,h) \quad \text{(Euler's Theorem)}.$$

¹ See Chapter A for a proof of Euler’s theorem.

With constant returns to scale, payments to the factors of production,
F2(k, h)h + F1(k, h)k = wh + rk, completely exhaust output, F(k, h), so
that no economic profits are earned; i.e., F(k, h) − wh − rk = 0. Now,
equation (2.4.3) represents one equation in one unknown, h. By the
implicit function theorem, a solution of the form h = H(k, τ, g) exists.
It’s easy to see that (2.4.3) fits the form of (2.1.1); just write

$$Z(h) \equiv U_1(F(k,h) - g)(1-\tau)F_2(k,h) - V_1(1-h) = 0. \qquad (2.4.4)$$

So embedded in this single equation are the outcome of the
consumer/worker’s labor-leisure choice problem, the upshot of the firm’s
profit maximization problem, the government’s budget constraint, and
market-clearing conditions.
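
To see equation (2.4.4) at work, the following MATLAB sketch solves Z(h) = 0 with the built-in solver fzero and checks Euler’s theorem at the solution. The functional forms (logarithmic utility from consumption, V(1 − h) = θ ln(1 − h), and a Cobb-Douglas technology) and every parameter value are illustrative assumptions, not taken from the text.

```matlab
% Sketch: solve Z(h) = 0 from (2.4.4) under ASSUMED functional forms.
% Assumptions (not from the text): U(c) = ln(c), V(1-h) = theta*ln(1-h),
% F(k,h) = k^alpha*h^(1-alpha), and the parameter values below.
alpha = 0.3; theta = 1.0; k = 1.0; tau = 0.2; g = 0.1;

F  = @(k,h) k.^alpha .* h.^(1-alpha);               % production function
F1 = @(k,h) alpha .* k.^(alpha-1) .* h.^(1-alpha);  % marginal product of capital
F2 = @(k,h) (1-alpha) .* k.^alpha .* h.^(-alpha);   % marginal product of labor
U1 = @(c) 1 ./ c;                                   % marginal utility of consumption
V1 = @(l) theta ./ l;                               % marginal utility of leisure

% Equation (2.4.4): one equation in the single unknown h
Z = @(h) U1(F(k,h) - g) .* (1-tau) .* F2(k,h) - V1(1-h);

% Z > 0 at h = 0.05 and Z < 0 at h = 0.99, so this interval brackets the root
hstar = fzero(Z, [0.05 0.99]);

% Euler's theorem: factor payments exhaust output under constant returns
euler_gap = F2(k,hstar)*hstar + F1(k,hstar)*k - F(k,hstar);

fprintf('h* = %.6f, Z(h*) = %.2e, Euler gap = %.2e\n', hstar, Z(hstar), euler_gap);
```

The bracket [0.05, 0.99] is chosen so that consumption, F(k, h) − g, stays positive over the whole search interval; a different g would call for a different lower end.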

2.5 Equivalent and Compensating Variations

To compute the welfare costs of taxation, consider a switch from tax
policy regime A to tax policy regime B. Suppose that the representative
agent’s welfare under policy regime A is given by

$$W^A \equiv U(c^A) + V(1-h^A),$$

where $c^A$ and $h^A$ are his consumption and work effort under this
regime. Now, similarly define the person’s welfare under policy regime
B by

$$W^B \equiv U(c^B) + V(1-h^B).$$

Since utility is an ordinal measure, neither $(W^B - W^A)/W^A$ nor
$W^B/W^A$ gives a meaningful measure of the welfare gain or loss of moving
from policy regime A to policy regime B. To see this, imagine adding a
constant term, m, to the representative agent’s utility function. This
would not affect any of the person’s choices. Yet, by making m very
large, $(W^B - W^A)/W^A$ can be made arbitrarily small while $W^B/W^A$
would approach one. To get around the ordinal property of utility,
Hicks (1941) invented the concepts of compensating and equivalent
variations, which measure a person’s willingness to pay to make the
switch.

2.5.1 Equivalent Variation

Now, how much would a person be willing to pay, measured as a
fraction of regime A’s consumption, to move from A to B? The fraction
e solves the equation below

$$U(c^A(1+e)) + V(1-h^A) = W^B,$$

so that

$$1 + e = \frac{U^{-1}(W^B - V(1-h^A))}{c^A},$$

where $U^{-1}$ is the inverse of the function U. The fraction e is called the
equivalent variation (EV). Chapter 8 uses the concept of an equivalent
variation to compute the welfare cost of business cycles.

Example 7. (EV with logarithmic Utility) Let $U(c) = \ln(c)$. Then,
$U(c(1+e)) = \ln(1+e) + \ln c = \ln(1+e) + U(c)$. For this utility
function

$$U(c^A(1+e)) + V(1-h^A) = \ln(1+e) + U(c^A) + V(1-h^A) = W^B,$$

implying

$$\ln(1+e) = W^B - U(c^A) - V(1-h^A) = W^B - W^A,$$

so that

$$e = \exp(W^B - W^A) - 1.$$

2.5.2 Excess Burden of Taxation using the Equivalent Variation

The excess burden of taxation is defined as the welfare cost per unit of
extra revenue raised by some proposed form of taxation. Imagine raising
the labor income tax by some tiny amount away from tax regime
A to a new regime B. The excess burden of taxation is given by

$$\frac{\text{welfare cost of raising taxes}}{\text{change in tax revenue}} = \frac{e\,c^A}{\tau^B w^B h^B - \tau^A w^A h^A}.$$

The numerator gives the excess burden of the taxation in terms of
consumption units while the denominator gives the amount of new
revenue raised (again in consumption units).

2.5.3 Compensating Variation

The notion of a compensating variation (CV) is very similar. Here the
question is: How much would a person be willing to pay, measured
as a fraction of regime B’s consumption, to move from A to B? The
fraction ψ solves the equation below

$$U(c^B(1+\psi)) + V(1-h^B) = W^A.$$

Example 8. (CV with logarithmic Utility) Again let $U(c) = \ln(c)$.
Retracing the steps of the previous example while making the appropriate
adjustments leads to

$$\ln(1+\psi) = W^A - U(c^B) - V(1-h^B) = W^A - W^B,$$

so that

$$\psi = \exp(W^A - W^B) - 1.$$
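
The logarithmic formulas in Examples 7 and 8 can be checked numerically. The MATLAB sketch below computes e and ψ for two made-up allocations; the consumption and hours figures, and the leisure utility V(1 − h) = θ ln(1 − h), are hypothetical choices made only for illustration.

```matlab
% Sketch: equivalent and compensating variations with U(c) = ln(c).
% ASSUMPTIONS (not from the text): V(1-h) = theta*ln(1-h) and the
% hypothetical allocations (cA,hA) and (cB,hB) below.
theta = 1.0;
U = @(c) log(c);
V = @(l) theta*log(l);

cA = 1.00; hA = 0.30;      % regime A allocation (hypothetical)
cB = 0.90; hB = 0.25;      % regime B allocation (hypothetical)

WA = U(cA) + V(1-hA);      % welfare under regime A
WB = U(cB) + V(1-hB);      % welfare under regime B

ev = exp(WB - WA) - 1;     % equivalent variation, Example 7
cv = exp(WA - WB) - 1;     % compensating variation, Example 8

% Residuals of the defining equations; both should be (numerically) zero
ev_resid = U(cA*(1+ev)) + V(1-hA) - WB;
cv_resid = U(cB*(1+cv)) + V(1-hB) - WA;

fprintf('EV = %.4f, CV = %.4f\n', ev, cv);
```

With logarithmic utility the two measures are tied together by (1 + e)(1 + ψ) = 1, which follows directly from the two closed-form expressions.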

Example 9. (Importance of economic growth, Lucas (1987, Table 1,
p. 25)). Consider the impact of changing the rate of growth for an
economy. Let a consumer have the lifetime utility function given by

$$W = \sum_{t=1}^{\infty} \beta^{t-1}\ln(c_t), \quad \text{with } 0 < \beta < 1,$$

where $c_t$ is consumption in year t. Year-t utility, $\ln(c_t)$, is discounted
by the factor $0 < \beta^{t-1} \le 1$. The further off in the future a utility is, the
more it is discounted because $\beta^{t-1}$ is decreasing in t. The initial level
of consumption is $c_1$. Now suppose that consumption grows at some
constant gross rate µ.² This implies that $c_2 = \mu c_1$, $c_3 = \mu c_2 = \mu^2 c_1$,
$c_4 = \mu c_3 = \mu^3 c_1$, and $c_t = \mu^{t-1} c_1$. Using $c_t = \mu^{t-1} c_1$ in the above
lifetime utility function yields

$$\begin{aligned}
W &= \sum_{t=1}^{\infty} \beta^{t-1}\ln(c_t) = \sum_{t=1}^{\infty} \beta^{t-1}\ln(\mu^{t-1}c_1) \\
  &= \sum_{t=1}^{\infty} \beta^{t-1}(t-1)\ln(\mu) + \frac{1}{1-\beta}\ln c_1 \\
  &= \frac{\beta}{(1-\beta)^2}\ln(\mu) + \frac{1}{1-\beta}\ln c_1.
\end{aligned}$$

Suppose that in regime A consumption grows at the gross rate $\mu_A$
while under regime B the growth rate is $\mu_B$. What is the compensating
variation associated with a move from A to B, ψ, computed as a
fraction of the initial consumption in regime B, $c_1^B$? A change in the
initial level of consumption moves up or down the entire consumption
stream. The compensating variation is

$$\psi = \exp[(1-\beta)(W^A - W^B)] - 1 = \exp\Big\{\frac{\beta}{1-\beta}[\ln(\mu_A) - \ln(\mu_B)]\Big\} - 1.$$

² The gross growth rate is one plus the net growth rate. So, if the
economy is growing at 3 percent per year, the gross growth rate is 1.03.

Margin note: Robert E. Lucas, Jr (1937-) is one of the most important
macroeconomists of the 20th century, along with John Maynard Keynes and
Edward C. Prescott. He has written seminal papers in both business cycle
theory and endogenous growth theory. Lucas introduced the idea of
rational expectations and dynamic general equilibrium modeling into
macroeconomics. He also stressed the role of human capital formation
for economic growth. In 1995 he was awarded the Nobel Prize in
Economics “for having developed and applied the hypothesis of rational
expectations, and thereby having transformed macroeconomic analysis and
deepened our understanding of economic policy.” Like Albert Einstein,
Lucas received the prize well after his work was widely recognized as
being pathbreaking. Revolutions in ideas don’t come easily.

To make this more concrete, let the annual discount factor, β, be 0.95
and the annual growth rate in regime A, µ A , be 3 percent. The table
below reports the compensating variation for various growth rates in
regime B.

Shifts in Growth

Growth rate in Regime B, %     CV, %
(µ_B − 1) × 100%               ψ × 100%
1.0                             45
2.0                             20
3.0                              0
4.0                            -17
5.0                            -31
6.0                            -42

So, a person would be willing to give up 17 percent of his regime B
consumption stream to increase growth from 3 to 4 percent and would
have to be compensated by 45 percent to move to a situation where
the economy grows at only 1 percent. These are large numbers. Lucas’s
conclusion is that growth effects are important. The welfare cost
of business cycles is discussed in Chapter 8. The welfare costs of
business cycles turn out to be much smaller, so many economists feel that
studying economic growth is more important than studying business
cycles.
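
The entries in the table can be reproduced in a few lines of MATLAB. The sketch below uses only the compensating-variation formula from Example 9 together with β = 0.95 and a 3 percent growth rate in regime A, as stated above; the variable names are arbitrary.

```matlab
% Sketch: reproduce the "Shifts in Growth" table from Example 9.
beta = 0.95;                 % annual discount factor (from the text)
muA  = 1.03;                 % gross growth rate in regime A (from the text)
muB  = 1 + (1:6)/100;        % gross growth rates in regime B: 1.01,...,1.06

% Compensating variation, psi = exp{beta/(1-beta)*[ln(muA) - ln(muB)]} - 1
psi = exp(beta/(1-beta) * (log(muA) - log(muB))) - 1;

% Columns: net growth rate in regime B (%), CV (%) rounded to the nearest integer
cvtable = [(muB' - 1)*100, round(psi'*100)];
disp(cvtable)
```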

2.6 Solving Nonlinear Equations Numerically

In all of the above labor supply problems the difference between the
left-hand and right-hand sides of the first-order conditions is a
decreasing function in h. This occurs because as h increases the left-hand
side drops, while the right-hand side, which is being subtracted off,
moves up. For example, recall equation (2.4.4) which stated

$$Z(h) \equiv U_1(F(k,h) - g)(1-\tau)F_2(k,h) - V_1(1-h). \qquad (2.6.1)$$

The function Z(h) is a decreasing function in h because U1(F(k, h) − g),
F2(k, h), and −V1(1 − h) are all decreasing in h due to strict concavity
assumptions. Again this single equation represents the upshot of a
consumer/worker’s labor-leisure choice problem, a firm’s profit
maximization decision, a government’s spending and tax program, and
market-clearing conditions. The solution to this model’s general
equilibrium with taxes and spending then amounts to solving (2.6.1) for the
single endogenous variable h. Now assume that Z(h) is a continuous
function of a variable h. Suppose that one wants to find numerically
the value of h that solves the nonlinear equation

$$Z(h) = 0.$$

This is called finding the zero or root for the function Z. Before
proceeding, some properties will be imposed on the function Z, largely
for heuristic purposes.


1. Z : R → R. The solution lies in the space of real numbers.

2. Z(h) = 0 for some h called h∗. This assumption implies that a
   solution exists.

3. Z1(h) < 0 for all h. This assumption is imposed for illustration
   purposes only. The case where Z(h) is an increasing function can be
   handled by analyzing −Z(h) = 0, where −Z(h) will be decreasing
   in h. The solution for h isn’t affected by changing the sign of Z(h).

The above properties imply that

$$Z(h) \text{ is } \begin{cases} > 0, & \text{if } h < h^*; \\ = 0, & \text{if } h = h^*; \\ < 0, & \text{if } h > h^*. \end{cases}$$

The function Z (h) is shown in Figure 2.6.1.

Two methods for numerically solving nonlinear equations are pre-
sented here: the bisection algorithm and Newton’s method. Pseudo
code is presented for each method. Pseudo code follows the conven-
tions of a normal structured programming language. It is intended as
a heuristic device, since actual computer code can be difficult to read.
Pseudo code does not follow the syntax of any particular program-
ming language.

2.6.1 Bisection Method

The bisection method brackets the solution for h, or h∗ , between lower
and upper bounds, hl and hu , so that hl < h∗ < hu . On each iteration of
the algorithm one of these bounds is moved toward h∗ , which shrinks
the bracket. The algorithm is constructed so that h∗ always remains
within the bounds on each iteration. Eventually h∗ is captured within
a very tiny bracket, implying that a solution has been found.

The algorithm-pseudo code

Set a desired tolerance for the solution, denoted by ε > 0.

1. Enter iteration j with lower and upper bounds for h∗ denoted by
   $h_{l,j}$ and $h_{u,j}$, such that $Z(h_{l,j}) > 0$ and $Z(h_{u,j}) < 0$.

2. Construct a guess for the solution,

   $$h_j = (h_{l,j} + h_{u,j})/2.$$

   This motivates the name bisection.

3. Check for convergence.

   (a) If $|Z(h_j)| < \varepsilon$, then stop. The desired solution has been found.

   (b) Else $|Z(h_j)| \ge \varepsilon$. Go to Step 4.

4. Construct new lower and upper bounds.

   (a) If $Z(h_j) < 0$, then the solution for h, or h∗, must be smaller than
       $h_j$, because Z is decreasing in h. Hence, the upper bound, $h_{u,j}$,
       must be too high. So, set $h_{l,j+1} = h_{l,j}$ and $h_{u,j+1} = h_j$; i.e.,
       reset the upper bound. Return to Step 1.

   (b) If $Z(h_j) > 0$, then set $h_{l,j+1} = h_j$ and $h_{u,j+1} = h_{u,j}$; i.e.,
       reset the lower bound. Return to Step 1.

Figure 2.6.1: Illustration of the Bisection Method. Iteration j starts
with the lower and upper bounds, $h_{l,j}$ and $h_{u,j}$, on the solution for h.
The solution, h∗, is trapped between these bounds, by the Intermediate
Value Theorem, because $Z(h_{l,j}) > 0$ and $Z(h_{u,j}) < 0$. (See Chapter A
for the Intermediate Value Theorem.) A guess, $h_j$, is made that bisects
these bounds; that is, $h_j = (h_{l,j} + h_{u,j})/2$. In the situation shown,
$Z(h_j) < 0$ so that the true solution must lie below $h_j$, given the
assumption that Z(h) is decreasing. Hence, the upper bound on iteration
j + 1 is lowered by setting $h_{u,j+1} = h_j$. [Horizontal axis: h. Vertical
axis: Z(h).]

The process is shown in Figure 2.6.1.
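
The pseudo code translates almost line for line into MATLAB. The sketch below applies it to Z(h) from (2.6.1); the functional forms (logarithmic utility, V(1 − h) = θ ln(1 − h), Cobb-Douglas production), the parameter values, and the starting bracket are all illustrative assumptions rather than anything specified in the text.

```matlab
% Sketch: bisection on Z(h) from (2.6.1) under ASSUMED functional forms.
% Assumptions (not from the text): U(c) = ln(c), V(1-h) = theta*ln(1-h),
% F(k,h) = k^alpha*h^(1-alpha), and the parameter values below.
alpha = 0.3; theta = 1.0; k = 1.0; tau = 0.2; g = 0.1;
Z = @(h) (1-tau)*(1-alpha)*k^alpha*h.^(-alpha) ./ (k^alpha*h.^(1-alpha) - g) ...
         - theta./(1-h);

tol   = 1e-8;            % tolerance, epsilon
maxit = 200;             % safeguard against an endless loop
hl = 0.05; hu = 0.99;    % initial bracket (assumed)
assert(Z(hl) > 0 && Z(hu) < 0, 'Initial bounds do not bracket the root');

for j = 1:maxit
    hj = (hl + hu)/2;    % Step 2: bisect the bracket
    Zj = Z(hj);
    if abs(Zj) < tol     % Step 3: convergence check
        break
    end
    if Zj < 0            % Step 4(a): root lies below hj, lower the upper bound
        hu = hj;
    else                 % Step 4(b): root lies above hj, raise the lower bound
        hl = hj;
    end
end

fprintf('h* = %.8f after %d iterations, Z(h*) = %.2e\n', hj, j, Z(hj));
```

Each pass halves the bracket, so the number of iterations grows only with the logarithm of the desired accuracy; the maxit cap is there purely as a guard in case the tolerance is set below what floating-point arithmetic can deliver.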


2.6.2 Newton’s Method

Nature and nature’s laws lay hid in night; God said "Let Newton be"
and all was light. (Alexander Pope)

Sir Isaac Newton (1642-1727) is thought by many to be the greatest
scientist of all time. He made seminal contributions to astronomy,
mathematics, and physics. His method for solving nonlinear equations
is the dominant one in numerical analysis even today.
Newton’s method has two advantages over the bisection method.
First, it is faster, because it uses knowledge about both Z and Z1 to
compute revised guesses. Second, it generalizes easily to systems of
nonlinear equations. Its disadvantage is that it can be unstable.

The algorithm-pseudo code

Set a desired tolerance for the solution, denoted by ε > 0.

1. Enter iteration j with a guess for h∗ denoted by h j .

2. Check for convergence.

   (a) If $|Z(h_j)| < \varepsilon$, then stop. A solution has been found.

   (b) Else $|Z(h_j)| \ge \varepsilon$, and go to Step 3.

3. Update guess using

   $$h_{j+1} = h_j - \frac{Z(h_j)}{Z_1(h_j)}.$$

   Go back to Step 1.

The process is shown in Figure 2.6.2. At the guess $h_j$ one follows the
tangent line down to the axis to get the revised guess $h_{j+1}$. Note that
the equation for the tangent line is

$$y = a_j + b_j h.$$

So, the revised guess, $h_{j+1}$, must solve

$$0 = a_j + b_j h_{j+1},$$

implying

$$h_{j+1} = -\frac{a_j}{b_j}.$$

All that is needed is the coefficients $a_j$ and $b_j$. From the formula for a
straight line, $b_j = Z_1(h_j)$. To find $a_j$ note that the tangent line passes
through the point $(h_j, Z(h_j))$, so $Z(h_j) = a_j + b_j h_j$ and
$a_j = Z(h_j) - b_j h_j = Z(h_j) - Z_1(h_j)h_j$. Therefore,

$$h_{j+1} = \frac{\overbrace{-Z(h_j) + Z_1(h_j)h_j}^{-a_j}}{\underbrace{Z_1(h_j)}_{b_j}} = h_j - \frac{Z(h_j)}{Z_1(h_j)}.$$

Newton’s method requires knowledge about the derivative $Z_1(h_j)$.
Sometimes this can be computed analytically and the formula for the
derivative inputted into the nonlinear equation solver. Other times
it must be done numerically. Numerical derivatives are discussed in
Chapter 9.
Newton’s method can be unstable since it is prone to overshooting,
as is shown by Figure 2.6.3. Here the algorithm returns a negative
revised guess for h for use on iteration j + 1. This isn’t sensible here,
as the function Z is not defined when h is negative. Imagine that
$Z(h) = U_1(F(k,h) - g)(1-\tau)F_2(k,h) - V_1(1-h)$, as given by (2.4.4).
The terms F(k, h) and F2(k, h) cannot be evaluated at negative values
for h. For example, suppose that $F(k,h) = k^{\alpha}h^{1-\alpha}$ and
$F_2(k,h) = (1-\alpha)k^{\alpha}h^{-\alpha}$. Then a negative value for h would generate
complex numbers for these quantities. Newton’s algorithm may go awry at
this point. This type of problem is often easy to avoid though. To prevent
the type of overshooting shown in Figure 2.6.3, sometimes it pays to
put a line in the computer code for the function Z(h) stating that
h = max{1.0E−8, h}. This line bounds h from below by the small positive
number 1.0E−8, and hence keeps it above zero and the algorithm out of
the troublesome region. It works, providing that the answer for h is
greater than 1.0E−8. This idea is also shown in Figure 2.6.3.
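
A minimal MATLAB sketch of Newton’s method with this kind of lower-bound guard follows. It is applied to the simple labor-leisure first-order condition, Z(h) = U1(wh + a)w − V1(1 − h), under assumed logarithmic forms for U and V and made-up parameter values, so that the derivative Z1 is available analytically and the answer can be compared with a closed form. Here the guard is applied to each revised guess rather than inside the function Z, which serves the same purpose.

```matlab
% Sketch: Newton's method with a lower-bound guard on the guess.
% ASSUMPTIONS (not from the text): U(c) = ln(c), V(1-h) = theta*ln(1-h),
% so Z(h) = w/(w*h + a) - theta/(1-h); parameter values are hypothetical.
w = 1.0; a = 0.2; theta = 1.0;
Z  = @(h)  w./(w*h + a) - theta./(1-h);            % first-order condition
Z1 = @(h) -w^2./(w*h + a).^2 - theta./(1-h).^2;    % its analytical derivative

tol   = 1e-10;     % tolerance, epsilon
maxit = 50;        % safeguard on the number of iterations
hmin  = 1.0e-8;    % lower bound that keeps h out of the region h <= 0
hj    = 0.5;       % initial guess (assumed)

for j = 1:maxit
    if abs(Z(hj)) < tol              % Step 2: convergence check
        break
    end
    hj = hj - Z(hj)/Z1(hj);          % Step 3: Newton update
    hj = max(hmin, hj);              % guard against overshooting below zero
end

hclosed = (w - theta*a)/(w*(1 + theta));   % closed form under these assumptions
fprintf('Newton: %.10f, closed form: %.10f, iterations: %d\n', hj, hclosed, j);
```

Under these assumptions the first-order condition can be solved by hand, h = (w − θa)/(w(1 + θ)), which gives a convenient benchmark for the numerical answer.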

2.6.3 Corner Solutions

Return to the simple labor-leisure choice problem posed at the
beginning. Suppose that the worker may desire to devote all of his time
to the labor force or none of it. That is, perhaps the worker would
desire to set h = 1 or h = 0. Now, the worker’s maximization can be
expressed as

$$\max_{0 \le h \le 1} \{U(wh + a) + V(1-h)\}.$$

The solution to this problem will have the following form:

$$\begin{aligned}
U_1(wh + a)w - V_1(1-h) &= 0, \quad \text{if } 0 < h < 1, \\
U_1(wh + a)w - V_1(1-h) &< 0, \quad \text{if } h = 0, \\
U_1(wh + a)w - V_1(1-h) &> 0, \quad \text{if } h = 1.
\end{aligned}$$

Figure 2.6.2: Illustration of Newton’s Method. The guess for the
solution on iteration j is $h_j$. Newton’s method uses the line that is
tangent to Z(h) at the point $h_j$ to compute the revised guess $h_{j+1}$.
The revised guess is given by the point where the tangent line hits the
horizontal axis. [Horizontal axis: h. Vertical axis: Z(h).]

Figure 2.6.3: Example of overshooting when using Newton’s method. Here
Z : R+ → R, yet on the jth iteration Newton’s method returns a negative
new guess for h denoted by $h_{j+1}$. The function Z cannot be evaluated
when $h_{j+1} < 0$, since it is not defined for h < 0. To prevent
overshooting a lower bound on h can be placed on the problem. This
ensures that $h_{j+1} > 0$. [Horizontal axis: h. Vertical axis: Z(h),
defined for h > 0.]


The above algorithms can easily be extended to cover this situation.
Specifically,

1. Set h = 0. Check whether

   $$U_1(a)w - V_1(1) < 0.$$

   (a) If so, a solution has been found.

   (b) If not, proceed to Step 2.

2. Set h = 1. Check whether

   $$U_1(w + a)w - V_1(0) > 0.$$

   (a) If so, a solution has been found.

   (b) If not, proceed to Step 3.

3. Find the zero to the equation

   $$U_1(wh + a)w - V_1(1-h) = 0,$$

   using either the bisection or Newton’s method.

Remark 10. Let Z (h) = U1 (wh + a)w − V1 (1 − h). One could solve the
following equation for h:

Z (h)h(1 − h) + h min{0, Z (h)} + (1 − h) max{0, Z (h)} = 0.

Note that this equation will return a zero for the true solution.
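
The three-step check for corner solutions can be sketched in MATLAB as follows. To make both corners reachable, the sketch assumes logarithmic utility from consumption and linear utility from leisure, V(1 − h) = θ(1 − h); these forms and the three parameter cases are hypothetical. The residual of the single equation in Remark 10 is also evaluated at each computed solution.

```matlab
% Sketch: corner-solution check for the labor-leisure problem.
% ASSUMPTIONS (not from the text): U(c) = ln(c), V(1-h) = theta*(1-h),
% so Z(h) = w/(w*h + a) - theta; the parameter cases are hypothetical.
w = 1.0;
cases = [0.2 1.5;     % [a, theta]: interior solution expected
         1.0 1.5;     % corner at h = 0 expected
         0.2 0.5];    % corner at h = 1 expected
ncases = size(cases, 1);
hsol  = zeros(ncases, 1);
resid = zeros(ncases, 1);

for n = 1:ncases
    a = cases(n,1); theta = cases(n,2);
    Z = @(h) w./(w*h + a) - theta;
    if Z(0) < 0                 % Step 1: not worth working the first hour
        hsol(n) = 0;
    elseif Z(1) > 0             % Step 2: still worth working at h = 1
        hsol(n) = 1;
    else                        % Step 3: interior solution, Z(h) = 0
        hsol(n) = fzero(Z, [0 1]);
    end
    % Single-equation formulation from Remark 10, evaluated at the solution
    h = hsol(n);
    resid(n) = Z(h)*h*(1-h) + h*min(0, Z(h)) + (1-h)*max(0, Z(h));
end

disp([cases hsol resid])   % columns: a, theta, h, Remark 10 residual
```

In the interior case the first-order condition can be solved by hand under these assumptions, h = 1/θ − a/w, which provides a check on the call to fzero.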

2.6.4 Nonlinear Systems of Equations

Newton’s method generalizes easily to systems of nonlinear equations,
unlike the bisection method. Consider the nonlinear system of equations

$$\underset{n\times 1}{Z(h)} = \underset{n\times 1}{0}, \qquad (2.6.2)$$

where $Z : R^n \to R^n$; that is, Z is a system of n nonlinear equations,
stacked vertically, in the n unknowns, $h \equiv (h_1, \cdots, h_n)'$. Take a
first-order Taylor expansion of the above function around the point $h_j$,
while dropping the remainder term, to get

$$\underset{n\times 1}{Z(h)} = \underset{n\times 1}{Z(h_j)} + \underset{n\times n}{J(h_j)}\,\underset{n\times 1}{(h - h_j)},$$

where the Jacobian, $J(h_j)$, is an n × n matrix containing the partial
derivatives of Z.³ The Jacobian is defined by

$$J(h_j) \equiv \begin{bmatrix} Z_{1,1}(h_j) & \cdots & Z_{1,n}(h_j) \\ \vdots & & \vdots \\ Z_{n,1}(h_j) & \cdots & Z_{n,n}(h_j) \end{bmatrix},$$

where $Z_{i,j}$ refers to the derivative of the i-th row of Z with respect to
its j-th argument. Now, at the h that solves (2.6.2) it transpires that

$$0 = Z(h_j) + J(h_j)(h - h_j),$$

which implies that in a neighborhood around the solution

$$h = h_j - J(h_j)^{-1}Z(h_j).$$

This motivates using the updating equation

$$\underset{n\times 1}{h_{j+1}} = \underset{n\times 1}{h_j} - \underset{n\times n}{J(h_j)^{-1}}\,\underset{n\times 1}{Z(h_j)}.$$

³ This is just a multivariate generalization of the bivariate Taylor
expansion reviewed in Chapter A.
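
The updating equation can be sketched in MATLAB for a small system. The two-equation system below is a made-up test problem (a unit circle intersected with a 45-degree line), chosen only because its Jacobian is easy to write down and its solution is known; it is not a model from the text. The Newton step is computed with the backslash operator rather than by forming the inverse of the Jacobian explicitly.

```matlab
% Sketch: Newton's method for a system of n = 2 nonlinear equations.
% The test system is HYPOTHETICAL (not from the text):
%   Z1(h) = h1^2 + h2^2 - 1 = 0,   Z2(h) = h1 - h2 = 0,
% with solution h1 = h2 = sqrt(1/2) in the positive quadrant.
Z = @(h) [h(1)^2 + h(2)^2 - 1;
          h(1) - h(2)];
J = @(h) [2*h(1), 2*h(2);      % row i holds the partial derivatives of Zi
          1,      -1    ];

tol   = 1e-10;       % tolerance, epsilon
maxit = 50;          % safeguard on the number of iterations
hj    = [1; 0.5];    % initial guess (assumed)

for j = 1:maxit
    if norm(Z(hj), inf) < tol          % convergence check on the largest residual
        break
    end
    hj = hj - J(hj) \ Z(hj);           % h_{j+1} = h_j - J(h_j)^(-1) Z(h_j)
end

fprintf('h = (%.8f, %.8f) after %d iterations\n', hj(1), hj(2), j);
```

Solving the linear system J(h_j)x = Z(h_j) with the backslash operator yields the same step as multiplying by the inverse, and is the usual way to do this in MATLAB.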

2.7 Heterogenous Agents

Suppose that there are $I$ types of individuals in the economy, namely
$i = 1, \cdots, I$. The population of type-$i$ agents is of size $\mu_i$, for
$i = 1, \cdots, I$. For convenience, set $\sum_{i=1}^{I} \mu_i = 1$. A person has
tastes of the following form

$$U(c_i) + V(1-h_i), \quad \text{for } i = 1, \cdots, I,$$

where $c_i$ is the consumption enjoyed by a type-$i$ individual and $h_i$ is
his work effort. The productivity of person $i$ on the labor market is
$\pi_i$. Assume that $\pi_I > \cdots > \pi_i > \cdots > \pi_1$, so that a person with a
higher index has a higher level of productivity. The wage rate for a
raw unit of labor is $w$. A person of type $i$ will earn the amount $w\pi_i h_i$
in labor income when he works the amount $h_i$. Suppose that type-$i$
individuals are taxed on their labor income at the rate $\tau_i$. Progressive
income taxation implies that $\tau_I > \cdots > \tau_i > \cdots > \tau_1$. There is one
unit of capital in the economy. People also earn income from the share
of this fixed capital stock that they own. Assume that a type-$i$ agent
owns $k_i$, with $k_I > \cdots > k_i > \cdots > k_1$. Capital earns the rental,
$r$. Production in the economy is undertaken in accordance with the
constant-returns-to-scale production function

$$o = F(k, h),$$

where $k$ is the aggregate capital stock and $h$ is the aggregate input of
labor. Last, the government in the economy uses the tax revenue that
it collects to distribute lump-sum transfer payments to the populace in
the amount $\lambda$.

2.7.1 Type-i’s optimization problem

The optimization problem for a type-i agent is given by

$$\max_{c_i, h_i} [U(c_i) + V(1-h_i)],$$

subject to

$$c_i = (1-\tau_i)w\pi_i h_i + r k_i + \lambda.$$

The upshot of this maximization problem is

$$U_1(\underbrace{(1-\tau_i)w\pi_i h_i + r k_i + \lambda}_{c_i})(1-\tau_i)w\pi_i = V_1(1-h_i), \quad \text{for } i = 1, \cdots, I. \qquad (2.7.1)$$

2.7.2 The Firm’s Problem

The firm’s problem is

$$\max_{k,h} [F(k,h) - rk - wh].$$

This results in

$$F_1(k,h) = r, \qquad (2.7.2)$$

and

$$F_2(k,h) = w. \qquad (2.7.3)$$

2.7.3 General Equilibrium

The government’s budget constraint is

$$\lambda = \mu_1\tau_1 w\pi_1 h_1 + \cdots + \mu_i\tau_i w\pi_i h_i + \cdots + \mu_I\tau_I w\pi_I h_I = \sum_{i=1}^{I} \mu_i\tau_i w\pi_i h_i. \qquad (2.7.4)$$

Since the capital market must clear

$$k = \sum_{i=1}^{I} \mu_i k_i = 1. \qquad (2.7.5)$$

Likewise, market clearing in the labor market gives

$$h = \sum_{i=1}^{I} \mu_i\pi_i h_i. \qquad (2.7.6)$$

To characterize the model’s general equilibrium use (2.7.3), (2.7.2)
in (2.7.1) to get

$$\begin{aligned}
U_1(\underbrace{(1-\tau_1)F_2(1,h)\pi_1 h_1 + F_1(1,h)k_1 + \lambda}_{c_1})(1-\tau_1)F_2(1,h)\pi_1 &= V_1(1-h_1) \\
&\;\;\vdots \\
U_1(\underbrace{(1-\tau_i)F_2(1,h)\pi_i h_i + F_1(1,h)k_i + \lambda}_{c_i})(1-\tau_i)F_2(1,h)\pi_i &= V_1(1-h_i) \\
&\;\;\vdots \\
U_1(\underbrace{(1-\tau_I)F_2(1,h)\pi_I h_I + F_1(1,h)k_I + \lambda}_{c_I})(1-\tau_I)F_2(1,h)\pi_I &= V_1(1-h_I).
\end{aligned}$$


Now, note that (2.7.4) and (2.7.6) could be used to solve out for λ
and h. The result would be a system of I equations in I unknowns,
h1 , h2 , · · · , h I . Solving this system of equations on the computer may
not be an easy business, depending on how large I is. Instead, consider
the following algorithm which involves just solving one equation in
one unknown at a time.

2.7.4 Algorithm (Walras)

Set a tolerance for the algorithm denoted by ε.

1. Enter iteration j with a guess for the wage rate, w, and transfer
   payments, λ, denoted by $w_j$ and $\lambda_j$. Note that a guess for w amounts
   to a guess for r: from the equation $w = F_2(1, h)$ one can solve for h,
   and hence for r using the relationship $r = F_1(1, h)$.

2. Solve the optimization problems for agents $i = 1, \cdots, I$ using the
   guesses $w_j$ and $\lambda_j$ to get a solution for the $h_i$’s. This will involve a
   FOR or DO loop in the computer program.

3. Calculate what wages and transfer payments are at the solution for
   the $h_i$’s:

   $$w = F_2(1, \underbrace{\sum_{i=1}^{I} \mu_i\pi_i h_i}_{h}),$$

   and

   $$\lambda = \sum_{i=1}^{I} \mu_i\tau_i w\pi_i h_i.$$

   Compute revised guesses for wages and transfer payments using the
   formulae

   $$w_{j+1} = (w + w_j)/2,$$

   and

   $$\lambda_{j+1} = (\lambda + \lambda_j)/2.$$

4. Check for convergence.

   (a) If

       $$|w_{j+1} - w_j|/2 + |\lambda_{j+1} - \lambda_j|/2 < \varepsilon,$$

       then stop.

   (b) Otherwise, return to Step 1 with the new guesses.

The topic of heterogenous agents will be returned to in Chapter 10

when the Aiyagari (1994) model is discussed.
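
The algorithm in Section 2.7.4 can be sketched in MATLAB as below. Everything specific in the sketch is an assumption made for illustration: logarithmic utility from consumption, V(1 − h) = θ ln(1 − h), a Cobb-Douglas technology, three types of agents, and all parameter values and starting guesses. Each agent’s first-order condition (2.7.1) is solved one at a time with fzero, after a simple check for a corner at zero hours.

```matlab
% Sketch: the iterative algorithm of Section 2.7.4 with I = 3 types.
% ASSUMPTIONS (not from the text): U(c) = ln(c), V(1-h) = theta*ln(1-h),
% F(k,h) = k^alpha*h^(1-alpha) with k = 1, and all numbers below.
alpha = 0.3; theta = 1.0;
mu  = [0.5 0.3 0.2];     % population shares, sum to one
ppi = [0.5 1.0 2.0];     % productivities, pi_i
tau = [0.1 0.2 0.3];     % progressive tax rates
kk  = [0.2 1.0 3.0];     % capital holdings, sum(mu.*kk) = 1
I   = numel(mu);
F1  = @(h) alpha*h.^(1-alpha);        % F1(1,h), rental rate
F2  = @(h) (1-alpha)*h.^(-alpha);     % F2(1,h), wage rate

tol = 1e-8; maxit = 500;
w = 1.0; lambda = 0.05;               % Step 1: initial guesses (assumed)
h = zeros(1, I);

for j = 1:maxit
    hguess = ((1-alpha)/w)^(1/alpha); % invert w = F2(1,h) for aggregate labor
    r = F1(hguess);                   % implied guess for the rental rate
    for i = 1:I                       % Step 2: one equation at a time
        wi = (1-tau(i))*w*ppi(i);     % after-tax wage per hour for type i
        Zi = @(x) wi./(wi*x + r*kk(i) + lambda) - theta./(1-x);
        if Zi(0) <= 0
            h(i) = 0;                 % corner: type i chooses not to work
        else
            h(i) = fzero(Zi, [0, 1-1e-10]);
        end
    end
    hagg   = sum(mu.*ppi.*h);         % Step 3: implied wage and transfers
    wnew   = F2(hagg);
    lamnew = sum(mu.*tau.*wnew.*ppi.*h);
    wnext   = (wnew + w)/2;           % revised guesses
    lamnext = (lamnew + lambda)/2;
    dist = abs(wnext - w)/2 + abs(lamnext - lambda)/2;   % Step 4
    w = wnext; lambda = lamnext;
    if dist < tol
        break
    end
end

fprintf('w = %.6f, lambda = %.6f, iterations = %d\n', w, lambda, j);
disp(h)
```

Averaging the old and new guesses damps the updates; for the parameter values assumed here the iteration settles down, but convergence is not guaranteed in general and a heavier weight on the old guess may be needed in other settings.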

2.8 MATLAB: A Worked-Out Example

An introduction to MATLAB is presented in Chapter B. Some MATLAB
examples are presented below.

2.8.1 Plotting Utility Functions–Introduction to Graphs

Here is the MATLAB code used to construct Figure 2.2.1. The file
utilfuncs.m is a script file that executes a list of commands. The .m
extension denotes a MATLAB file. To run the file just type utilfuncs
in the MATLAB command window. Make sure that the current directory
in MATLAB is the directory on your computer where you stored the
script file. The % symbol is used to denote a comment. The text following
a % symbol is not executed in the program. An important part of a
program is making comments to tell others what you are doing. The
main program starts off clearing all results from previous runs, using
the clear all command. A semicolon at the end of a line means that
it will run silently so that the results of the line will not show up on
the screen. Remove the semicolon and the results of the command will
show up on the screen.
% utilfuncs.m
% Plot utility functions
clear all % Clear everything from memory
% Generate a grid for consumption going from .1 to 3 by increments of .05
% This will be a vector of 59 points
cons = .1:.05:3;

% Logarithmic utility function
figure(1) % Command to open figure window 1
lnutil = log(cons); % Generate vector of utils
plot(cons, lnutil) % Plot cons and utils
title('Log Utility') % Make title
ylabel('Utils, u') % Make label for vertical axis
xlabel('Consumption, c') % Make label for horizontal axis

% Exponential utility function
figure(2)
gamma = 1.0; % Parameter for exponential
exputil = -exp(-gamma*cons);
plot(cons, exputil)
title('Exponential Utility')
ylabel('Utils, u')
xlabel('Consumption, c')

% CRRA utility function
figure(3)
rho = 1.5; % Parameter for crra
crrautil = cons.^(1-rho)/(1-rho) - 1/(1-rho);
plot(cons, crrautil)
title('Isoelastic Utility')
ylabel('Utils, u')
xlabel('Consumption, c')

% Quadratic utility function
figure(4)
alpha = .5; % Coefficients for quadratic terms
beta = .2;
quadutil = alpha*cons - beta*cons.^2/2;
% Create vertical line where quadratic utility function peaks
maxpty = linspace(0, .7, 59); % Generate 59 vertical y points from 0 to .7
% Generate x vector with the constant term alpha/beta
maxptx = ones(1,59)*alpha/beta;
% Plot utility function plus vertical line
plot(cons, quadutil, maxptx, maxpty)
title('Quadratic Utility')
ylabel('Utils, u')
xlabel('Consumption, c')

% Plot all utility functions
figure(5)
plot(cons', lnutil', cons', exputil', cons', crrautil', cons', quadutil')
title('Utility Functions')
ylabel('Utils, u')
xlabel('Consumption, c')
% Make a legend in the southeast corner of the graph.
legend('log', 'exp', 'crra', 'quadratic', 'location', 'southeast')

2.8.2 A Monopoly Problem

Consider the problem of a monopolist who faces the linear demand
function

$$p = \alpha - \frac{\beta}{2}o,$$

where p is the price of the product and o is the monopolist’s output.
Demand, o, is decreasing in price, p; i.e., o = (2/β)(α − p). The
monopolist produces according to the quadratic cost function

$$c = \frac{\gamma}{2}o^2,$$

where c is total cost. Marginal cost, γo, is increasing in output, o. In
other words, the cost function is strictly convex.
The monopolist’s revenue, po, is

$$po = \alpha o - \frac{\beta}{2}o^2.$$

This implies that his profits, π, read

$$\pi = \underbrace{\alpha o - \frac{\beta}{2}o^2}_{\text{revenue}} - \underbrace{\frac{\gamma}{2}o^2}_{\text{costs}}.$$

Therefore, the monopolist’s maximization problem is to pick his output
to maximize profits. The mathematical transliteration of this
maximization problem is

$$\max_{o} \Big\{\alpha o - \frac{\beta}{2}o^2 - \frac{\gamma}{2}o^2\Big\}.$$


The first-order condition connected with this maximization is

$$\underbrace{\alpha - \beta o}_{MR} = \underbrace{\gamma o}_{MC},$$

which sets marginal revenue, MR, equal to marginal cost, MC. It is
easy to calculate that the solutions for output, o, and prices, p, are

$$o = \frac{\alpha}{\gamma + \beta},$$

and

$$p = \alpha - \frac{\beta}{2}\,\frac{\alpha}{\gamma + \beta},$$

where the solution for o has been substituted into the demand curve
to obtain an answer for p.
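
As a quick check on these closed-form expressions, the MATLAB sketch below solves the first-order condition numerically with fzero and compares the result with the formulas. The parameter values α = 1, β = 0.5, and γ = 0.5 are illustrative, and the anonymous functions mr, mc, and foc are defined in place for this sketch only.

```matlab
% Sketch: check the closed-form monopoly solution against fzero.
% The parameter values are illustrative; mr, mc, and foc are anonymous
% functions defined here for this sketch only.
alpha = 1;  beta = 0.5;  gamma = 0.5;

mr  = @(o) alpha - beta*o;     % marginal revenue
mc  = @(o) gamma*o;            % marginal cost
foc = @(o) mr(o) - mc(o);      % first-order condition, MR - MC = 0

o_num = fzero(foc, 0.5);       % numerical solution, starting guess 0.5

o_cf = alpha/(gamma + beta);   % closed-form output
p_cf = alpha - (beta/2)*o_cf;  % closed-form price

fprintf('output: %.6f (numerical), %.6f (closed form); price: %.6f\n', ...
        o_num, o_cf, p_cf);
```

Because anonymous functions capture the parameter values at the moment they are created, this way of writing the problem does not require any global variables.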

2.8.3 The MATLAB code for the Monopoly Example

Here is a MATLAB program that solves the above monopolist’s prob-
lem. Given the simple nature of the above problem, there is really no
need to solve the model numerically. More complicated problems have
the same structure though.

MATLAB, Main Program-main.m

This is the main m file used to solve the monopoly problem. It calls
other m files. The % symbol is used to denote a comment. The text
following a % symbol is not executed in the program. An important
part of a program is making comments to tell others what you are
doing. The main program starts off clearing all results from previous
runs, using the clear all command. The screen in the command
window is also cleared (clc). The global command specifies variables
that are common in different parts of the program. If the value of one
of these variables is changed in any of the common parts, then it will
be changed in the rest of the parts that share this variable. So, the
global command must be used with some caution. The program then
sets the parameter values for the demand and cost functions. These
functions are plotted on a grid for output, o. The grid runs from 0 to
α/β in increments of α/(β ∗ 100). MR and MC are evaluated at each
of these grid points. The MR curve hits the horizontal axis at α/β.
You should give graphs a title and label their axes. The graph gives
an idea about the solution to the model. Sometimes this is useful for
troubleshooting or debugging a program. Debugging programs can
often be a painful process.
The model is then solved using the nonlinear equation solver fzero.
This is a built-in MATLAB function. You can get help for any built-in
MATLAB function by typing help followed by the function’s name in
the command window; i.e., in this case type help fzero. This nonlinear
equation solver makes a call to the user-supplied function foc,
which sets out the first-order condition for the model. Observe that the
solution from MATLAB is checked for accuracy. This is done using an
if statement that will display an error if the first-order condition is not
close enough to zero at the computed solution for o. All if statements
must be followed by an end statement. Some output for the model is
then displayed. To run the program just type main in the command
window.

% main.m
% Monopoly Problem - Main Program
clear all % Clear all numbers from previous runs
clc % Clear screen
global alpha beta gamma

% Set parameters for model
% Demand curve
alpha = 1; % constant
beta = 0.5; % slope
% Cost function
gamma = 0.5; % quadratic term

% Plot marginal revenue and marginal cost
% Construct grid of output points
ogrid = 0:alpha/(beta*100):alpha/beta;
% ogrid runs from 0 to alpha/beta in
% increments of alpha/(beta*100)

% Plot marginal revenue and cost curves
figure(1)
plot(ogrid, mr(ogrid), ogrid, mc(ogrid))
title('Marginal Revenue and Marginal Cost')
xlabel('Output')
ylabel('MR and MC')

% Solve for output and prices
output = fzero(@foc, 1); % Call up nonlinear equation solver
% Check solution gives a zero
if abs(foc(output)) >= .000001
    disp('solution not found')
end
price = alpha - beta*output/2;
cost = gamma*output^2/2;
profits = price*output - cost;
markup = price/mc(output);

% Display results
display('Results for the monopoly model')
display('output, price, profits, and markup')
display([output price profits markup])

A Function Specifying Marginal Revenue-mr.m

This is a function setting out the formula for marginal revenue. It
is used in the function for the first-order condition, foc. Note that
the variables alpha and beta are passed into this function using the
global command. These variables must be specified as global in the
main program. The global command passes variables to be used in
functions without inputting them in directly as arguments of the func-
tion. If you change the value of a global variable inside a function, it
will change the value of this variable everywhere else in the program
that has access to this global variable. So, use global variables wisely.
function [out] = mr(output)
% mr.m
% This is the monopolist's marginal revenue curve
% Gives marginal revenue as a function of output
% This function is used in foc.m
global alpha beta
out = alpha - beta*output; % Marginal Revenue
out = max(0, out); % MR must be positive
end

A Function Specifying Marginal Cost-mc.m.

This function sets out the formula for marginal cost which is used in
the first-order condition. It is similar in construction to mr.
function [out] = mc(output)
% mc.m
% This is the monopolist's marginal cost curve
% Gives marginal cost as a function of output
% This function is used in foc.m
global gamma
out = gamma*output; % Marginal cost
end

A Function Specifying the First-Order Condition to be Solved-foc.m

Observe that this function calls two other functions, mr and mc. So,
functions can call functions. It sets out the first-order condition that
should be set to zero. The nonlinear equation solver, fzero, will try
out different values for output in an attempt to get foc(output) close
to zero.
function [zero] = foc(output)
% foc.m
% This is the foc for profit maximization
% Gives the solution of the foc as a function of output
zero = mr(output) - mc(output); % first-order condition
end

Output from Program

The program generates the graph shown in Figure 2.8.1. It also gives the following output.

% Results for the monopoly model
output, price, profits, and markup
ans =
    1.0000 0.7500 0.5000 1.5000

Figure 2.8.1: Graph generated from the MATLAB program for the monopoly problem (title: Marginal Revenue and Marginal Cost; horizontal axis: Output; vertical axis: MR and MC). Graphs should always be given titles and the axes labeled. You may have to increase the font size to get the labels to look nice.

3 Maximization (and Minimization)

3.1 Introduction

In economics maximization problems are everywhere. The generic maximization problem might take the form

\[ \max_{h} F(h), \tag{3.1.1} \]

subject to

\[ G(h) = 0. \]

Here F is an objective function, h is a vector of variables, and G is a function representing a constraint on the choice variables. For example, in a consumer problem F would represent a utility function and G would be the consumer's budget constraint. Note that this problem can be rewritten as

\[ \min_{h} \{-F(h)\}, \]

subject to

\[ G(h) = 0. \]

So, it is easy to convert a maximization problem into a minimization problem and vice versa.
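The conversion can be sketched in a few lines of MATLAB. This is only an illustration: the objective F(h) = sqrt(h) - h is a hypothetical example (not from the text), the constraint G is dropped in favor of simple bounds, and fminbnd is MATLAB's bounded scalar minimizer.

```matlab
% Maximizing F by minimizing -F: a minimal sketch with an assumed objective.
F = @(h) sqrt(h) - h;          % hypothetical objective, maximized at h = 0.25
negF = @(h) -F(h);             % minimizing -F is equivalent to maximizing F
hstar = fminbnd(negF, 0, 1);   % bounded scalar minimizer applied to -F
Fmax = F(hstar);               % value of the original objective at the maximizer
fprintf('h* = %.4f, F(h*) = %.4f\n', hstar, Fmax);
```

Note that the value returned by the minimizer for -F has to be multiplied by minus one to recover the maximal value of F.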
Often the parameter values for a model are chosen to maximize
the model’s fit with respect to some data. This is usually done by
selecting the parameter values to minimize some objective function
containing the model’s prediction errors. So, this is another place in
macroeconomics where maximization (or equivalently minimization)
is important. In macroeconomics this is often done by calibrating a
model to match, as close as possible, a set of stylized facts.
Three methods are presented for maximizing functions: golden-
section search, discrete maximization, and particle swarm optimiza-
tion. In the discussion below the constraint on the maximization prob-
lems is dropped. There is a wide variety of numerical algorithms,
however, that allow for constraints. The discussion then turns to the
calibration of economic models. On this, two examples are presented: first, the decline in hours worked by males over the last century and, second, the rise in premarital sexual activity by young women over the same period.

3.2 Golden-Section Search

The golden-section search algorithm is reminiscent of the bisection algorithm discussed in Chapter 2. It was invented by the statistician Jack Kiefer in 1953. The algorithm assumes that the objective function $F(h)$ is unimodal–the concept of a unimodal function is defined in Chapter A. This implies that the function F will rise in h until it hits its maximal value and then decline. A unimodal function does not have to be strictly concave because $F_{11}(h)$ does not have to be negative everywhere. Denote the value of h that attains the maximum by $h^*$. The algorithm starts by imposing a bracket on iteration 1, $[h_{l,1}, h_{u,1}]$, that is known to contain $h^*$ so that $h^* \in [h_{l,1}, h_{u,1}]$. The bracket is successively shrunk until $h^*$ is trapped within some tiny range, at which point the solution has been effectively found, where on each iteration j it transpires that $h^* \in [h_{l,j}, h_{u,j}]$. The rate at which the bracket shrinks is $\alpha = 1/\psi$, where $\psi$ is the golden ratio or the irrational number 1.61803398874···. The golden ratio turns up in architecture, the arts, sciences, and, according to some cosmetic surgeons, perceptions of beauty–see Figure 3.2.1.

Figure 3.2.1: Beauty and the golden ratio. Is your concept of beauty governed by certain divine facial proportions that are related to the golden ratio? The golden ratio is denoted by the irrational number $\psi$ that has the value 1.61803398874···. In the so-called "ideal" facial structure the ratio of the distance from the eyes to the mouth divided by the distance from the mouth to the chin should be $\psi$. Likewise, $\psi$ should also be the ratio of the distance from the hairline to the bottom of the nose over the distance from the bottom of the nose to the bottom of the chin. George Clooney and Bella Hadid score high when golden ratio formulae are used. Junk science pushed by journalists? Probably.

3.2.1 The algorithm–pseudo code

The pseudo code for the algorithm is as follows. To start with, set the desired tolerance level for the solution denoted by $\varepsilon > 0$.

1. Enter iteration j with the brackets, $h_{l,j}$ and $h_{u,j}$, around the maximal value such that $h_{l,j} \le h^* \le h_{u,j}$.

2. Construct two test points, denoted by $p_{l,j}$ and $p_{u,j}$, which are given by

\[ p_{l,j} = h_{l,j} + (1-\alpha)(h_{u,j} - h_{l,j}), \tag{3.2.1} \]

and

\[ p_{u,j} = h_{u,j} - (1-\alpha)(h_{u,j} - h_{l,j}) = h_{l,j} + \alpha(h_{u,j} - h_{l,j}), \tag{3.2.2} \]

where $\alpha = 1/\psi$ and $\psi$ is the golden ratio–see Chapter A. The golden ratio is given by $\psi = (\sqrt{5}+1)/2$ so that $\alpha = 0.61803398874\cdots$. Why this magic number is chosen is discussed below. Note that $p_{l,j} \le p_{u,j}$, but it does not have to be the case that $h^* \in [p_{l,j}, p_{u,j}]$.

3. Update the brackets.

(a) If

\[ F(p_{l,j}) > F(p_{u,j}), \]

then, since the function is unimodal, $h^* < p_{u,j}$. So the upper bracket, $h_{u,j+1}$, should be reset on the next iteration. In particular,

\[ h_{l,j+1} = h_{l,j} \text{ and } h_{u,j+1} = p_{u,j}. \tag{3.2.3} \]

(b) Else

\[ F(p_{l,j}) < F(p_{u,j}), \]

so that instead the lower bracket is reset on the next iteration, implying

\[ h_{l,j+1} = p_{l,j} \text{ and } h_{u,j+1} = h_{u,j}. \]

4. Check for convergence or that

\[ |h_{u,j+1} - h_{l,j+1}| < \varepsilon. \]

If the answer is yes, then the solution, $h^*$, is trapped inside this narrow interval. The algorithm then stops. If the answer is no, return to Step 1.
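The steps above can be sketched in MATLAB as follows. This is a minimal sketch rather than a definitive implementation: the objective is assumed to be the monopolist's profit function from Chapter 2 with the parameter values used there (so profits are o - 0.5o^2), and the variable names (a, hl, hu, pl, pu) are illustrative. A tie between the two test points is simply handled by the else branch.

```matlab
% Golden-section search: a minimal sketch (names are illustrative).
F = @(o) (1 - 0.25*o).*o - 0.25*o.^2;  % assumed unimodal objective (monopoly profits)
hl = 0; hu = 2;                        % initial bracket known to contain the maximizer
tol = 1e-6;                            % tolerance epsilon
a = (sqrt(5) - 1)/2;                   % alpha = 1/psi

pl = hl + (1 - a)*(hu - hl);           % lower test point, eq. (3.2.1)
pu = hl + a*(hu - hl);                 % upper test point, eq. (3.2.2)
Fl = F(pl); Fu = F(pu);
iter = 0;
while (hu - hl) >= tol
    iter = iter + 1;
    if Fl > Fu                         % case 3(a): reset the upper bracket
        hu = pu;
        pu = pl; Fu = Fl;              % old lower test point becomes the new upper one
        pl = hl + (1 - a)*(hu - hl);
        Fl = F(pl);
    else                               % case 3(b): reset the lower bracket
        hl = pl;
        pl = pu; Fl = Fu;              % old upper test point becomes the new lower one
        pu = hl + a*(hu - hl);
        Fu = F(pu);
    end
end
hstar = (hl + hu)/2;                   % midpoint of the final bracket
fprintf('maximizer = %.6f after %d iterations\n', hstar, iter);
```

Because one test point is reused on every pass, only one new function evaluation is needed per iteration. The tolerance is deliberately not set much tighter: near a smooth maximum the differences in F fall below machine precision, so the comparison of the two test points becomes unreliable.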

The golden-section search algorithm is illustrated in Figure 3.2.2, for the case in point 3(a). The test point $p_{l,j}$ is determined so that the ratio of the distance from $p_{l,j}$ to its nearest bound, here $h_{l,j}$, to the distance from $p_{l,j}$ to its farthest bound, $h_{u,j}$, is $\alpha$. That is, the ratio of a's length to b's length is $\alpha$. (Or equivalently the ratio of b to a is $1/\alpha = \psi$, the golden ratio.) The same is true if one instead takes the ratio of the distance from $p_{u,j}$ to its nearest bound to the distance from $p_{u,j}$ to its farthest bound. Now, since $F(p_{l,j}) > F(p_{u,j})$ the algorithm resets the upper bound so that $h_{u,j+1} = p_{u,j}$ with the new test point being $p_{u,j+1} = p_{l,j}$. By design this keeps the ratio of the distance from $p_{u,j+1}$ to its nearest bound, now $h_{u,j+1}$, to the distance from $p_{u,j+1}$ to its farthest bound, now $h_{l,j+1} = h_{l,j}$, at $\alpha$. In other words the ratio of c's length to a's length is $\alpha$.

Figure 3.2.2: The golden-section search algorithm. The algorithm starts off in iteration j with the optimal point, $h^*$, being bracketed in the interval $[h_{l,j}, h_{u,j}]$. Two test points, $p_{l,j}$ and $p_{u,j}$, are then chosen. Because $F(p_{l,j}) > F(p_{u,j})$ it is clear that $h^*$ cannot lie to the right of $p_{u,j}$, because F is unimodal. Hence, on the next iteration, j + 1, the upper bound can be reset to $h_{u,j+1} = p_{u,j}$. The algorithm is designed so that in iteration j + 1 the new test point for the upper bound is equal to the old test point for the lower bound; i.e., $p_{u,j+1} = p_{l,j}$. This restriction implies that the line segment ratios, a/b and c/a, are both (the inverses of) golden-section ratios and hence equal to each other. Note that in the situation portrayed in the diagram, $h^* \notin [p_{l,j+1}, p_{u,j+1}]$ because $p_{u,j+1} < h^*$.

The Determination of α

Exactly how is the constant $\alpha$ determined to achieve this? Observe that on iteration j + 1 the bracket will be

\[ (h_{l,j+1}, h_{u,j+1}) = \begin{cases} (h_{l,j}, p_{u,j}), & \text{if } F(p_{l,j}) > F(p_{u,j}); \\ (p_{l,j}, h_{u,j}), & \text{if } F(p_{l,j}) < F(p_{u,j}). \end{cases} \]

Take the case in point 3(a) where the new bracket for iteration j + 1 is $(h_{l,j+1}, h_{u,j+1}) = (h_{l,j}, p_{u,j})$; i.e., the upper bound is being adjusted–the second case can be analyzed in a similar manner. Now, suppose the restriction below is imposed where

\[ p_{u,j+1} = p_{l,j}; \tag{3.2.4} \]

i.e., the old lower test point becomes the new upper test point. As will be seen, this condition implies that the ratio of the distance between the test point and its closest bound to the distance between the test point and its farthest bound is kept constant across iterations. Then, it must happen that

\[ \begin{aligned} p_{l,j} &= p_{u,j+1} \\ &= h_{l,j+1} + \alpha(h_{u,j+1} - h_{l,j+1}) && [\text{by updating (3.2.2)}] \\ &= h_{l,j} + \alpha(p_{u,j} - h_{l,j}) && [\text{by (3.2.3)}]. \end{aligned} \tag{3.2.5} \]

Next, using (3.2.1) and (3.2.2) to substitute out for $p_{l,j}$ and $p_{u,j}$ on the left and right, this can be rewritten as

\[ \underbrace{h_{l,j} + (1-\alpha)(h_{u,j} - h_{l,j})}_{=p_{l,j}} = h_{l,j} + \alpha[\underbrace{h_{l,j} + \alpha(h_{u,j} - h_{l,j})}_{=p_{u,j}} - h_{l,j}], \]

which by cancelling out the $(h_{u,j} - h_{l,j})$'s reduces to

\[ \alpha^2 + \alpha - 1 = 0. \tag{3.2.6} \]

The positive root of this quadratic is 0.61803398874···, which is $1/\psi$, where $\psi$ is the golden ratio–again, see Chapter A for more detail. Observe using (3.2.1) and (3.2.2) that

\[ \underbrace{\frac{p_{l,j} - h_{l,j}}{h_{u,j} - p_{l,j}}}_{(1-\alpha)/\alpha=\alpha} = \underbrace{\frac{p_{u,j} - p_{l,j}}{p_{l,j} - h_{l,j}}}_{(2\alpha-1)/(1-\alpha)=\alpha} = \frac{h_{u,j+1} - p_{u,j+1}}{p_{u,j+1} - h_{l,j+1}} \quad [\text{using (3.2.3) and (3.2.4)}]. \]

The formula implies that the ratio of the length of the line segment from a test point to its nearest bracket over the length of the line segment from the same test point to the other bracket is preserved across iterations. So certain ratios are held constant in line with the golden ratio. The above formula motivates the choice of $p_{u,j+1} = p_{l,j}$. To derive the first line in the formula, note that the numerator on the lefthand side is $p_{l,j} - h_{l,j} = (1-\alpha)(h_{u,j} - h_{l,j})$ by (3.2.1). Turn to the denominator on the left. By summing (3.2.1) and the first equality in (3.2.2) it can be seen that $h_{u,j} - p_{l,j} = p_{u,j} - h_{l,j}$. But, (3.2.2) states that $p_{u,j} - h_{l,j} = \alpha(h_{u,j} - h_{l,j})$. Therefore, the ratio on the lefthand side is $(1-\alpha)/\alpha$. The formula for the golden ratio (3.2.6) implies that this is just $\alpha$. Now turn to the righthand side of the top line. By subtracting (3.2.1) from (3.2.2) it can be seen that $p_{u,j} - p_{l,j} = (2\alpha - 1)(h_{u,j} - h_{l,j})$. Therefore, the ratio on the righthand side is $(2\alpha - 1)/(1-\alpha)$. Again, this is just $\alpha$ from the golden ratio formula (3.2.6). Thus, the lefthand and righthand sides are equal at the golden ratio.1

1 This equation implies that $(1-\alpha)/\alpha = (2\alpha - 1)/(1-\alpha)$, so as before $\alpha$ must solve $\alpha^2 + \alpha - 1 = 0$.

Speed of Convergence
Also, note that

\[ \begin{aligned} h_{u,j+1} - h_{l,j+1} &= p_{u,j} - h_{l,j} && [\text{using (3.2.3)}] \\ &= h_{l,j} + \alpha(h_{u,j} - h_{l,j}) - h_{l,j} && [\text{using (3.2.2)}] \\ &= \alpha(h_{u,j} - h_{l,j}), \end{aligned} \]

so that the bracket is shrinking across iterations by a factor of $\alpha$. All of this is also true for the case in point 3(b).
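Since the bracket shrinks by the factor α on every iteration, the number of iterations needed to reach a given tolerance can be computed in advance. The short MATLAB sketch below does this for an assumed initial bracket width of 2 and an assumed tolerance of 1e-6 (both values are illustrative, not from the text), and also recovers α as the positive root of (3.2.6).

```matlab
% Iterations needed by golden-section search: a sketch with assumed inputs.
a = (sqrt(5) - 1)/2;                % shrink factor alpha = 1/psi
width0 = 2;                         % assumed initial bracket width, hu - hl
tol = 1e-6;                         % assumed tolerance epsilon
n = ceil(log(tol/width0)/log(a));   % smallest n with a^n*width0 <= tol
rts = roots([1 1 -1]);              % roots of alpha^2 + alpha - 1 = 0, eq. (3.2.6)
aroot = max(rts);                   % the positive root
fprintf('alpha = %.11f, iterations needed = %d\n', aroot, n);
```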

3.3 General Equilibrium, A Slight Return

Return to the general equilibrium labor supply problem, that was cast
in Section 2.4 of Chapter 2. Now, the solution for h cannot be simply
found in a single shot by solving one equation in one unknown.

3.3.1 Algorithm (Walras)

Set a tolerance for the algorithm denoted by ε. Suppose the true solu-
tion to the problem is h∗ . Start with a guess for h∗ on the first iteration
denoted by h1 .

1. Enter iteration j with a guess for the solution for labor supply denoted by $h_j$. A guess for h amounts to guesses for r, w, and λ using the equations $r_j = F_1(k, h_j)$, $w_j = F_2(k, h_j)$, and $\lambda_j = \tau F_2(k, h_j)h_j - g$.

2. Solve the maximization problem for the representative agent, taking as given $r_j$, $w_j$, and $\lambda_j$. That is, solve the problem

\[ \max_{h} \{U((1-\tau)w_j h + r_j k + \lambda_j) + V(1-h)\}. \]

Denote the solution to this problem by h. Now, for a revised guess set

\[ h_{j+1} = (h + h_j)/2. \]

The reasoning for this is straightforward. Suppose that the guess, $h_j$, is too high; i.e., $h_j > h^*$. Then, the representative agent will be receiving too much in transfer payments. He will then work less than he would in equilibrium due to the income effect. So, the solution h will be less than the true solution $h^*$. Therefore, the true solution must lie between h and $h_j$. One could think about h as being the supply of hours worked and $h_j$ as the demand for them at the conjectured prices. The true solution should be somewhere in between.

3. Check for convergence.

(a) If $|h_{j+1} - h_j| < \varepsilon$, then stop.
(b) Otherwise, return to step 1 with the new guess.

If the algorithm converges, then by construction an equilibrium will prevail. At the prevailing level of prices, r and w, and transfer payments, λ, the representative agent maximizes his utility. In equilibrium he will earn a wage rate of $w = F_2(k, h)$, which is the marginal product of his labor. He will also earn a rental rate on capital of $r = F_1(k, h)$, which is the marginal product of capital. Last, the person's transfer payments are the difference between what the government collects in taxes, τwh, and spends on goods and services, g.
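The iteration can be sketched in MATLAB as follows. Everything specific here is an assumption made for illustration and is not taken from the text: a Cobb-Douglas production function F(k,h) = k^a h^(1-a), utility U(c) = ln(c) and V(1-h) = θ ln(1-h), and the parameter values. The agent's problem is solved with fminbnd applied to the negative of utility.

```matlab
% Walras-style iteration on the labor-supply guess: a sketch under assumed forms.
% Assumed (not from the text): F(k,h) = k^a*h^(1-a), U(c) = log(c),
% V(1-h) = theta*log(1-h), and the parameter values below.
a = 0.3; theta = 1; k = 1; tau = 0.2; g = 0.05;
tol = 1e-6;                              % tolerance epsilon
opts = optimset('TolX', 1e-10);          % tight tolerance for the agent's problem
hj = 0.5;                                % initial guess h_1
for j = 1:500
    r = a*k^(a-1)*hj^(1-a);              % r_j = F_1(k,h_j)
    w = (1-a)*k^a*hj^(-a);               % w_j = F_2(k,h_j)
    lam = tau*w*hj - g;                  % lambda_j = tau*F_2(k,h_j)*h_j - g
    % Agent's problem: maximize utility by minimizing its negative
    negU = @(h) -(log((1-tau)*w*h + r*k + lam) + theta*log(1-h));
    h = fminbnd(negU, 1e-6, 1-1e-6, opts);
    hnew = (h + hj)/2;                   % revised guess h_{j+1}
    done = abs(hnew - hj) < tol;         % convergence check
    hj = hnew;
    if done, break, end
end
% Residual of the equilibrium first-order condition, with c = F(k,h) - g
w = (1-a)*k^a*hj^(-a);
c = k^a*hj^(1-a) - g;
focres = (1-tau)*w/c - theta/(1-hj);
fprintf('h = %.4f, FOC residual = %.2e\n', hj, focres);
```

The final lines check that, at the converged guess, the agent's first-order condition holds with consumption equal to output net of government spending, which is what an equilibrium requires.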

3.4 Discrete Maximization

This is the simplest technique, but the least accurate. Here the domain of the objective function is discretized and the function is evaluated at each point in this discrete set. In particular, suppose that h must lie in the discrete set H. The maximization problem (3.1.1) thus appears as

\[ \max_{h \in H} F(h). \]

3.4.1 The algorithm–pseudo code

The operationalization of discrete maximization is easy.

1. Discretize the domain of the objective function to get $H \equiv (h_1, h_2, \cdots, h_n)$.

2. Construct the vector $\mathcal{F} \equiv (F(h_1), F(h_2), \cdots, F(h_n))$, which represents the associated range of the objective function.

3. Pick the largest element in $\mathcal{F}$, or find $F(h_j)$ such that

\[ F(h_j) \ge F(h_i), \text{ for all } i \ne j. \]

This amounts to just searching a list of numbers and finding the maximal value, something computers can do quickly. If the grid of points in H is fine enough, then $h_j$ should be reasonably close to the solution, $h^*$, that obtains from the maximization problem where h is allowed to vary continuously. The situation is shown in Figure 3.4.1. Discrete maximization can handle constraints fairly easily. For example, lower and upper bounds on h can be imposed by restricting elements in the set H to lie within the range imposed by the bounds. Discrete maximization is often used to solve dynamic programming problems and is returned to in Chapter 9.
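The three steps translate almost line for line into MATLAB. In the sketch below the objective, ln(wh) + 2 ln(1-h) with w = 1, is a hypothetical labor-leisure example chosen only because its continuous maximizer, h = 1/3, is known; the grid bounds and size are likewise assumptions.

```matlab
% Discrete maximization: a minimal sketch with an assumed objective.
w = 1;                              % hypothetical wage rate
F = @(h) log(w*h) + 2*log(1 - h);   % hypothetical objective, maximized at h = 1/3
H = linspace(0.01, 0.99, 981);      % step 1: discretized domain (spacing of 0.001)
Fvec = F(H);                        % step 2: objective evaluated at every grid point
[Fmax, jstar] = max(Fvec);          % step 3: largest element and its index
hstar = H(jstar);                   % grid point attaining the maximum
fprintf('h = %.3f, F(h) = %.4f\n', hstar, Fmax);
```

The bounds on h are imposed simply through the endpoints of the grid, and the answer can be no more accurate than the grid spacing.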

Figure 3.4.1: Discrete Maximization. In the figure the domain for h is converted to a six-point set $H \equiv (h_1, h_2, \cdots, h_6)$. The function obtains its maximum at $h = h_3$. How close this point is to the optimal solution, $h^*$, depends on the number and spacing of points in the set. By adding more points to the set, the solution can be made more accurate. The points can also be made more dense around where the presumed solution lies.

3.5 Particle Swarm Optimization

Imagine unleashing a group of bots to find a target, here the global maximum of a function. Each bot starts off from a different position; to wit, a different value of the control variable. A bot modifies its search for the maximum based on two principles. First, it changes its current position in a random manner based on its own personal past best, which is the position in its search history that yielded the highest value of the objective function. Second, it also modifies its current position in a random way based upon the best position that all bots have found in their past searches. So, there is both individual and social learning going on. The injection of randomness operates to ensure that various parts of the function are sampled. As the bots continue their searches they will start to swarm toward the global maximum. The algorithm is an application of machine learning. It is said to resemble the behavior of a flock of birds searching and then homing in on a food source. The algorithm was developed by James Kennedy and Russell Eberhart in 1995.
Just two equations are central to the particle swarm algorithm. Suppose that there are I bots. The first equation describes how bot i changes its position between iteration j and j + 1:

\[ h_i^{j+1} = h_i^{j} + s_i^{j+1}, \text{ for } i = 1, \cdots, I, \tag{3.5.1} \]

where $s_i^{j+1}$ is the step size that the bot will take for iteration j + 1. The second equation regulates bot i's step size for iteration j + 1, or $s_i^{j+1}$, and reads

\[ s_i^{j+1} = \underbrace{\alpha \times s_i^{j}}_{\equiv b} + \underbrace{\beta \times \zeta_i^{j} \times (h_i^{j*} - h_i^{j})}_{\equiv a} + \underbrace{\gamma \times \xi_i^{j} \times (h^{j*} - h_i^{j})}_{\equiv c}, \text{ for } i = 1, \cdots, I. \tag{3.5.2} \]

In the above step-size equation, α, β, and γ are just constant terms that are fixed across all bots and iterations. The first term, $\alpha \times s_i^{j}$, is called the inertial component and operates to keep the bot moving in the same direction. The second component, $\beta \times \zeta_i^{j} \times (h_i^{j*} - h_i^{j})$, reflects individual learning. Here $h_i^{j*}$ is the best position that bot i has personally experienced in the past up to and including iteration j. This term causes the bot to return to the location where it did the best. The coefficient $\zeta_i^{j} \in [0, 1]$ is a random number that bot i draws on iteration j. This encourages the bots to search new regions of the space where the control variable lies. The last term, $\gamma \times \xi_i^{j} \times (h^{j*} - h_i^{j})$, is the social learning component. The variable $h^{j*}$ is the best position that any bot has found in the past. This entices the bot to move to regions that the swarm has found to be productive. Again, there is some randomness in this move since $\xi_i^{j} \in [0, 1]$ is a randomly drawn number. Figure 3.5.1 portrays the situation.

Figure 3.5.1: Particle Swarm Optimization. The original position of bot i on iteration j is shown by $h_i^{j}$. The bot moves its position to $h_i^{j+1}$ for iteration j + 1. The size of the step is b + c − a–see equation (3.5.2) for the definitions of a, b, and c. The bot's personal best was $h_i^{j*}$. In accordance with (3.5.2), this induces it to step back by the distance $a = \beta \times \zeta_i^{j} \times (h_i^{j*} - h_i^{j})$. The best position that all of the bots have found is $h^{j*}$, which is closer to the global maximum. This, in conjunction with the inertial component, will cause the bot to move forward by $b + c = \alpha \times s_i^{j} + \gamma \times \xi_i^{j} \times (h^{j*} - h_i^{j})$. The current position of some other bot, l, is shown by $h_l^{j}$.

3.5.1 The algorithm–pseudo code

The following steps describe how to operationalize the particle swarm
algorithm.

1. Initialize I bots.

(a) Randomly assign an initial position, $h_i^{0}$, in the control space for each of the $i = 1, \cdots, I$ bots.
(b) Likewise, for each bot i randomly pick an initial step size, $s_i^{0}$.

2. On a generic iteration j compute $F(h_i^{j})$ for all $i = 1, \cdots, I$.

(a) Update personal best. If $F(h_i^{j}) > F(h_i^{j*})$, then set $h_i^{j+1*} = h_i^{j}$.
(b) Update the swarm's best. To do this, find the bot that is doing the best on iteration j. I.e., find

\[ m = \arg\max_{i} \{F(h_1^{j}), \cdots, F(h_I^{j})\}. \]

If $F(h_m^{j}) > F(h^{j*})$, then set $h^{j+1*} = h_m^{j}$.

3. Decide whether to stop or move onto iteration j + 1.

(a) If $F(h^{j*})$ hasn't improved for J iterations, then stop.

(b) Otherwise, update each bot i's position according to equations (3.5.1) and (3.5.2).
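The steps above can be sketched in MATLAB for a one-dimensional problem. The settings are assumptions made for illustration: the objective is the monopolist's profit function o - 0.5o^2 (maximized at o = 1), there are 20 bots, the constants are α = 0.7 and β = γ = 1.5, the initial positions and step sizes are drawn from arbitrary ranges, and a cap on the number of iterations is added as a safety device that is not part of the pseudo code.

```matlab
% Particle swarm optimization in one dimension: a sketch under assumed settings.
rng(1);                                 % fix the random seed for reproducibility
F = @(h) h - 0.5*h.^2;                  % assumed objective, maximized at h = 1
I = 20;                                 % number of bots
a = 0.7; b = 1.5; c = 1.5;              % alpha, beta, gamma in eq. (3.5.2)
J = 50; maxit = 5000;                   % stopping rule and safety cap
h = 4*rand(I,1) - 1;                    % step 1(a): initial positions in [-1, 3]
s = 0.2*(rand(I,1) - 0.5);              % step 1(b): initial step sizes
hp = h; Fp = F(h);                      % personal bests start at initial positions
[Fg, m] = max(Fp); hg = hp(m);          % swarm's best so far
stall = 0;                              % iterations without improvement
for j = 1:maxit
    Fh = F(h);                          % step 2: evaluate the objective
    better = Fh > Fp;                   % step 2(a): update personal bests
    hp(better) = h(better); Fp(better) = Fh(better);
    [Fm, m] = max(Fh);                  % step 2(b): update the swarm's best
    if Fm > Fg
        Fg = Fm; hg = h(m); stall = 0;
    else
        stall = stall + 1;
    end
    if stall >= J, break, end           % step 3(a): stop after J idle iterations
    s = a*s + b*rand(I,1).*(hp - h) + c*rand(I,1).*(hg - h);  % eq. (3.5.2)
    h = h + s;                          % step 3(b): eq. (3.5.1)
end
fprintf('swarm best = %.4f, F = %.4f\n', hg, Fg);
```

All bots are updated at once through vectorized operations, which is the sense in which the algorithm lends itself to parallel computation.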

The particle swarm algorithm is good for global optimization. Its coding is simple and it is perfect for parallel computing, where the problems for many bots can be solved simultaneously. Its main disadvantage is that during the final stages it is slow to home in on the optimal solution. But, for these stages the algorithm could switch to speedier local maximizers, such as golden-section search.2

2 The parameter α should lie between 0 and 1 for convergence reasons. Normally values between 1 and 3 are selected for β and γ.

3.6 Calibration

Economic theory comes alive when confronted with data. One method
of matching economic models with data is calibration. This method-
ology was introduced into economics by Kydland and Prescott (1982)
in a now famous paper. An elementary introduction to calibration is
contained in Prescott and Chandler (2008). Calibration often refers
to adjusting an instrument, scientific or otherwise, so that it matches
some known benchmarks. For example, a guitar can be tuned so that
the A string has the Stuttgart pitch of 440Hz. After the guitar has been
tuned (calibrated) it can be used to play songs in key with others. In
economics the model is treated as an instrument and its parameters
can be adjusted so that it matches certain features in the data. After an
economic model has been calibrated (tuned) it can be used to conduct
policy analysis or thought experiments.
Consider the following economic model

o = M ( p ),

where o is an n × 1 vector of output and p is an m × 1 vector of parameters. How should p be chosen? Two criteria are used for selecting parameter values. First, the values for some parameters can be assigned from a priori information. On this, appropriate values for some of the parameters may be available in the economics literature. Also, some parameters might have direct counterparts in the data. Denote the vector of parameters that can be assigned from a priori information by u ⊆ p.
Second, the remaining parameter values are chosen so that the model matches, as well as possible, a set of data targets. Let the vector of these parameters be represented by v ⊆ p. Hence, p = (u, v). The j × 1 vector of data targets is given by d. Typically, data targets are a set of means, correlations, and variances. But, they may also include regression coefficients. The model's output vector, o, must include counterparts for each of the j data targets in the vector d. The model's output may include simulated regression coefficients. The parameters v are picked to minimize the prediction errors of the model. Therefore, v solves a minimization problem such as

\[ \min_{v} \sum_{i=1}^{j} [d_i - M_i(\underbrace{u, v}_{=p})]^2, \]

where $d_i$ is the ith data target and $M_i(u, v)$ is the model's prediction for this target. Different criteria could be used for the minimization problem, or for the objective function. Sometimes the data targets can be hit exactly. Often this is done using a nonlinear equation solver instead of a minimization routine, as the examples below show. When this can be done for the data targets, the model's predictions will exactly match the targets so that $d_i = M_i(u, v)$. So, this procedure can also be thought of as solving the above problem. Calibration is a very close cousin of econometrics.

3.6.1 Selecting parameters values by backsolving

Sometimes parameter values can be obtained by selecting them so that the model's first-order conditions hold exactly at the observed values in the data. As an example, consider the following consumer/worker problem

\[ \max_{h} \left\{ \theta \frac{(wh)^{1-\rho}}{1-\rho} + (1-\theta)\ln(1-h) \right\}, \text{ with } 0 < \theta < 1 \text{ and } \rho \ge 0, \]

which has the first-order condition

\[ \theta \underbrace{(wh)^{-\rho}}_{IE} \times \underbrace{w}_{SE} = \frac{1-\theta}{1-h}. \]

Now, in 1900 the average male worked 63 hours a week. This dropped to only 44 hours in 2018–see Figure 3.6.1. Over this time period real wages rose by a factor of 7.7. Is the above model consistent with these facts? There are both income and substitution effects associated with rising real wages–income and substitution effects are discussed in Chapter 2. An increase in real wages implies that more consumption, c = wh, can be purchased for a given level of work effort. The person will work less on account of the income effect, IE, because he would like to use some of the increase in his standard of living to enjoy more leisure. A climb in the real wage implies that leisure has become more expensive. The substitution effect, SE, states that on this account the person will work more. To get hours worked to fall over time the income effect must dominate the substitution effect.

Figure 3.6.1: The plot shows the decline in weekly hours worked by males over the course of the 20th century in the United States. For hours worked to decline the income effect from a rise in real wages must dominate the substitution effect. Source: Greenwood et al. (2021b).

There are two observations to be targeted; viz, hours worked in 1900 and 2018. There are also two parameter values that need to be selected; namely, θ and ρ. If there are 112 non-sleeping hours in a week, then the desired h's are 0.56 ≈ 63/112 and 0.39 ≈ 44/112. If the wage rate for 1900 is normalized to one, the wage rate for 2018 is 7.7. Thus, θ and ρ must solve

\[ \theta (1.0 \times 0.56)^{-\rho} \times 1.0 = \frac{1-\theta}{1-0.56}, \]

and

\[ \theta (7.7 \times 0.39)^{-\rho} \times 7.7 = \frac{1-\theta}{1-0.39}. \]

This represents a system of two equations in two unknowns. Computing the solution using a nonlinear equation solver yields θ = 0.50 and ρ = 1.41. These parameter values satisfy the restrictions imposed on them, so that the computed solution is legitimate. Often one can think about the exponent on a function as regulating the change in a variable over time, while the constant term pins down the level in some year given the exponent.3

3 Dividing the second equation by the first and rearranging gives $[(7.7 \times 0.39)/(1.0 \times 0.56)]^{\rho} = 7.7 \times (1 - 0.39)/(1 - 0.56)$, so the change in hours worked from 0.56 to 0.39 is governed by ρ, given the change in the wage rate from 1.0 to 7.7. Then, the first equation could be used to solve for θ.
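The backsolving step can be sketched in MATLAB. Rather than calling a nonlinear equation solver, this sketch exploits the fact that the ratio of the two first-order conditions eliminates θ, so ρ can be computed directly and θ recovered from the 1900 condition; this shortcut, and the variable names, are choices made for the illustration.

```matlab
% Backsolving for theta and rho: a sketch using the two first-order conditions.
h1 = 0.56; w1 = 1.0;       % 1900: fraction of time worked and normalized wage
h2 = 0.39; w2 = 7.7;       % 2018: fraction of time worked and wage
% The ratio of the two conditions eliminates theta and pins down rho:
% [(w2*h2)/(w1*h1)]^rho = (w2/w1)*(1-h2)/(1-h1)
rho = log((w2/w1)*(1-h2)/(1-h1)) / log((w2*h2)/(w1*h1));
% The 1900 condition then gives theta/(1-theta) = x
x = (w1*h1)^rho / (w1*(1-h1));
theta = x/(1 + x);
% Residuals of both first-order conditions (should be close to zero)
foc = @(w,h) theta*(w*h)^(-rho)*w - (1-theta)/(1-h);
res = [foc(w1,h1), foc(w2,h2)];
fprintf('theta = %.2f, rho = %.2f\n', theta, rho);
```

With the rounded inputs used in the text, the computed values round to θ = 0.50 and ρ = 1.41.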

3.6.2 Selecting parameters values by maximizing goodness of fit

Only 18 percent of 20-year-old women born in 1900 had experienced premarital sex. This rose to 48 percent for women born in 1938 and to 76 percent for those in the 1978 cohort. These facts are shown in Figure 3.6.2. What caused this? Technological progress in contraception is the most likely candidate. The failure rate for contraception in 1900 was 72 percent. This gives the odds of becoming pregnant if a young woman engaged in premarital sex for a year using the available contraception practices at the time. This dropped to 59 percent by 1960 and to 30 percent in 2000.4

4 The failure rates are reported roughly 20 years after the 1938 and 1978 cohorts were born. The number reported for 1900 in Greenwood et al. (2021a) is actually based on data from the 1920's and 30's so it is appropriate to use for the 1900 cohort.

Figure 3.6.2: The chart shows how premarital sexual activity by young women in the United States increased with technological innovation in contraception. Source: Greenwood et al. (2021a).

To model this, suppose that the joy a young woman gets from a sexual relationship is given by $\tilde{j}$, which is distributed across women according to a Weibull distribution:

\[ \Pr[\tilde{j} \le j] = 1 - \exp[-(j/\eta)^{\beta}], \text{ with } \beta, \eta > 0. \]

(The Weibull distribution is discussed in Chapter A.) Let the cost of an out-of-wedlock birth be represented by O and the failure rate be denoted by φ. A young woman's decision to be sexually active is summarized as follows:

\[ \begin{cases} j > \phi O, & \text{sexually active}; \\ j \le \phi O, & \text{abstinent}. \end{cases} \]

So, a young woman is sexually active when the joy of sex, j, exceeds the expected cost of a pregnancy, φO. The threshold level of joy, $j^*$, at which a woman is indifferent between having sex or not is given by

\[ \underbrace{j^*}_{\text{benefit}} = \underbrace{\phi O}_{\text{cost}}. \]

All women with a level of joy, $\tilde{j}$, above the threshold, $j^*$, will participate in premarital sexual activity. At the threshold the joy of sex, $j^*$, is equal to its expected cost, φO. The fraction of women with premarital sexual experience then reads

\[ \Pr[\tilde{j} \ge j^*] = \exp[-(j^*/\eta)^{\beta}] = \exp[-(\phi O/\eta)^{\beta}]. \]

To calibrate this to the U.S. data note that there are 3 parameters,
namely β, η, and O. Observations are at hand for the levels of pre-
marital sexual activity and the failure rates for three years spanning
the 20th century. In this case it is impossible to get a perfect fit by
solving a system of 3 equations in 3 unknowns. If instead the param-
eter values are chosen to minimize the model’s prediction errors for
premarital sex for these three years, then β, η, and O must solve

\[ \min_{\beta, \eta, O} \{ [0.18 - \exp[-(0.72 \times O/\eta)^{\beta}]]^2 + [0.48 - \exp[-(0.59 \times O/\eta)^{\beta}]]^2 + [0.76 - \exp[-(0.30 \times O/\eta)^{\beta}]]^2 \}. \]

The solution to this problem yields β = 2.30, η = 2.06, and O = 1.34. The circles on the figure show the model's prediction for premarital sex.
Often calibration involves a mixture of these two strategies–an example is Greenwood et al. (2021b).
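A least-squares calibration of this kind can be sketched in MATLAB with the derivative-free minimizer fminsearch. Two assumptions of the sketch should be flagged. First, the objective function above depends on O and η only through their ratio, so the search here is over β and κ ≡ O/η rather than over three separate parameters. Second, the starting values are hypothetical, and the sketch is not claimed to reproduce the parameter values reported above.

```matlab
% Least-squares calibration of the premarital-sex model: a sketch.
% Assumption: since O and eta enter only as a ratio, search over
% beta and kappa = O/eta.
phi = [0.72 0.59 0.30];                        % contraception failure rates
d   = [0.18 0.48 0.76];                        % data targets: fraction sexually active
model = @(bet, kap) exp(-(phi*kap).^bet);      % Pr[joy >= threshold] for each cohort
% Search over logs so that beta and kappa stay positive
sse = @(z) sum((d - model(exp(z(1)), exp(z(2)))).^2);
z0 = log([2 1.5]);                             % hypothetical starting values
zhat = fminsearch(sse, z0);                    % minimize the sum of squared errors
bet = exp(zhat(1)); kap = exp(zhat(2));
pred = model(bet, kap);                        % model predictions at the solution
fprintf('beta = %.2f, O/eta = %.2f, SSE = %.4f\n', bet, kap, sse(zhat));
```

With three targets and effectively two free parameters, the fit will generally not be exact, which is why a minimization routine is used instead of an equation solver.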

3.7 MATLAB: A Worked-Out Example

The monopolist’s problem discussed in Chapter 2 is revisited. There

are just two key differences. First, the monopolist’s objective function
is plotted instead of the marginal revenue and cost curves. Second,
the monopolist’s profits are directly maximized as opposed to solving
the first-order condition arising from the maximization problem by us-
ing a nonlinear equation solver. The required profit maximization is
done two ways; viz, by using a minimization program and by discrete maximization. When using a minimization routine to solve a maximization problem a minus sign must be placed before the objective function.

3.7.1 Matlab, Main Program–main.m

This program parrots the previous one. It now uses the minimization
routine fminbnd to minimize the objective function contained in the
m file nprofits.m.
% main.m
% Monopoly Problem - Main Program
clear all % Clear all numbers from previous runs
clc % Clear screen
global alpha beta gamma

% Set parameters for model
% Demand curve
alpha = 1; % constant
beta = 0.5; % slope
% Cost function
gamma = 0.5; % quadratic term

% Plot the Monopolist's Objective Function
% Construct grid of output points
ogrid = 0:alpha/(beta*100):alpha/beta;
% ogrid runs from 0 to alpha/beta in
% increments of alpha/(beta*100)
figure(1)
plot(ogrid, -nprofits(ogrid))
title('Objective Function')
xlabel('Output')
ylabel('Profits')

% Call up MATLAB minimizer to solve for output
output = fminbnd(@nprofits, 0, 2); % nprofits is the objective function

price = alpha - beta*output/2;
cost = gamma*output^2/2;
profits = price*output - cost;
markup = price/(gamma*output);

% Display results
display('Results for the monopoly model')
display('output, price, profits, and markup')
display([output price profits markup])

% Discrete maximization

oset = 0:2/99:2; % Discretize domain for maximization
[maxvalue, opoint] = max(-nprofits(oset)); % Find optimal point
% maxvalue = maximal value of objective function
% opoint = point number of the optimal value of output
% oset(opoint) = optimal level of output

price = alpha - beta*oset(opoint)/2;
cost = gamma*oset(opoint)^2/2;
profits = price*oset(opoint) - cost;
markup = price/(gamma*oset(opoint));

% Display results
display('Results for the monopoly model - discrete maximization')
display('output, price, profits, and markup')
display([oset(opoint) price profits markup])

3.7.2 A Function Specifying the Objective Function–nprofits.m

This file spells out the objective function for the minimizer, fminbnd. The objective is the monopolist’s profits. Because the program minimizes instead of maximizes, a negative sign must be placed before profits.
function [out] = nprofits(output)
% nprofits.m
% This specifies the objective function for the minimizer, used in fminbnd
% This is the negative of the monopolist's profits
global alpha beta gamma
out = alpha*output - beta*output.^2/2 - gamma*output.^2/2; % profits
out = -out; % Switch sign for minimizing
end
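
As an aside, the same maximization can be sketched without global variables by wrapping the parameters in an anonymous function. This is only an illustrative variant, not the program used in the text; the handle name negprofits is a made-up label, while the parameter values are the ones set in main.m.

```matlab
% Sketch: maximize monopoly profits with fminbnd, no global variables.
% negprofits is a hypothetical name chosen for this illustration.
alpha = 1;    % demand curve constant
beta  = 0.5;  % demand curve slope
gamma = 0.5;  % quadratic cost term

% Negative of profits, because fminbnd minimizes
negprofits = @(o) -(alpha*o - beta*o.^2/2 - gamma*o.^2/2);

[output, fval] = fminbnd(negprofits, 0, 2);
profits = -fval;   % switch the sign back to recover maximal profits
disp([output profits])
```

Capturing the parameters in the handle keeps the objective self-contained, at the cost of redefining the handle whenever a parameter changes.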

3.7.3 Output from Program

The program generates the graph shown in Figure 3.7.1. The following
output is the same as before.
Results for the monopoly model
output, price, profits, and markup
ans =
    1.0000    0.7500    0.5000    1.5000

Results for the monopoly model - discrete maximization
output, price, profits, and markup
ans =
    0.9899    0.7525    0.4999    1.5204

Figure 3.7.1: Graph generated from the MATLAB program for the monopolist’s objective function.
4 Why do Americans Work so Much More than Europeans?

In general, the art of government consists in taking as much money as possible from one party of the citizens to give to the other. (Voltaire)

4.1 Introduction

During the period 1993-96, Americans put in about 50 percent more work than did the French or Italians. Other members of the G7 worked significantly less too. This wasn’t always the case. Europeans worked more than Americans over the 1970-74 period. U.S. output per capita is about 40 percent higher than its European counterparts. This is not due to higher productivity, but to higher labor effort. So a question is: Why do Americans work so much more than Europeans?

The answer provided by Prescott (2004) in a classic paper is: Europeans’ labor income is taxed at a much higher rate. Prescott calibrates his model to the national income and product accounts for the G7 countries. A short detour through the national income and product accounts is taken since they are an important source of information for macroeconomists. Besides examining the impact of distortionary labor income taxation, Prescott’s paper raises two other interesting points. First, a consumption tax works in much the same way as a labor income tax does. Second, financing old-age retirement using government-mandated private-saving accounts is more efficient than a tax-financed social security program with lump-sum benefits.

Edward C. Prescott (1940-) is the father of quantitative theory. Prescott is the inspiration behind the numerical methods in this book. Before earning his Ph.D. in economics from Carnegie Mellon University, Prescott received a bachelor’s degree in mathematics from Swarthmore and a master’s degree in operations research from Case-Western University. With this training he was well suited to bring dynamic stochastic general equilibrium analysis and numerical methods into macroeconomics. Along with his coauthor and former student Finn E. Kydland, Prescott was granted the Nobel Prize in Economics in 2004. He is one of the foremost macroeconomists of the 20th century.

4.2 The Model

To answer this question, let consumer/workers in a country have tastes given by

$$\ln c + \alpha \ln(100 - h).$$

Here each worker is assumed to have 100 hours of non-sleeping time per week.

Table 4.1.1: Output, hours worked and productivity in advanced economies (output = hours worked × productivity; United States = 100).

Output, Labor Supply, and Productivity

Period    Country           Output   Hours Worked   Productivity
1993–96   Germany             74          75             99
          France              74          68            110
          Italy               57          64             90
          Canada              79          88             89
          United Kingdom      67          88             76
          Japan               78         104             74
          United States      100         100            100

1970–74   Germany             75         105             72
          France              77         105             74
          Italy               53          82             65
          Canada              86          94             91
          United Kingdom      68         110             62
          Japan               62         127             49
          United States      100         100            100

Output in a country is produced according to

$$o = z k^{\theta} h^{1-\theta}, \tag{4.2.1}$$

where z is a country-specific level of total factor productivity. Take the capital stock in each country to be some fixed number. Suppose that it depreciates at rate δ, with the depreciated portion being made up by investment, i = δk. As will be seen, capital doesn’t play much of a role in the analysis.
Each country has a government. It spends the amount g on government-produced goods and services. It provides transfer payments in the lump-sum amount λ. It taxes labor income at the rate $\tau_h$ and consumption at the rate $\tau_c$. The government’s budget constraint is

$$\underbrace{\lambda + g}_{\text{expenditure}} = \underbrace{\tau_c c + \tau_h w h}_{\text{revenue}},$$

where w is the wage rate.

Last, there is a resource constraint for a country. It states that

c + g + δk = o,

or that consumption, c, plus government spending on goods and services, g, plus investment, δk, equals output, o.

4.2.1 Worker’s Problem

The representative worker’s utility maximization problem is

$$\max_{c,h}\,[\ln c + \alpha \ln(100 - h)],$$

subject to their budget constraint

$$(1 + \tau_c)c = (1 - \tau_h)wh + rk + \lambda,$$

where w is the wage rate and r is the rental rate. Observe that the sales tax, $\tau_c$, increases the price of consumption, c.
The first-order condition for labor is

$$\underbrace{\frac{(1 - \tau_h)}{(1 + \tau_c)}}_{\equiv (1 - \tau)} \frac{w}{c} = \alpha \frac{1}{100 - h},$$

or

$$\frac{(1 - \tau) w}{c} = \alpha \frac{1}{100 - h},$$

where

$$\tau \equiv \frac{\tau_c + \tau_h}{1 + \tau_c},$$

is the effective tax on labor. The consumption tax, $\tau_c$, creates a disincentive to work, just as the labor income tax, $\tau_h$, does. This makes sense. When deciding how much to work the individual considers the relative price of leisure in terms of consumption goods; i.e., he looks at the forgone consumption that a marginal increase in leisure will cost. Raising the price of consumption, via a consumption tax, reduces the relative price of leisure in a manner similar to increasing the labor income tax.
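
A small numerical illustration of the effective tax rate may help. The sketch below uses two made-up tax rates (they are assumptions, not figures from Prescott’s paper) and checks that one minus the effective rate equals the wedge (1 − τh)/(1 + τc) that appears in the first-order condition.

```matlab
% Sketch: effective tax rate on labor implied by tauc and tauh.
% The two tax rates below are hypothetical numbers for illustration.
tauc = 0.10;   % consumption tax rate (assumed)
tauh = 0.30;   % labor income tax rate (assumed)

tau   = (tauc + tauh)/(1 + tauc);   % effective tax on labor
wedge = (1 - tauh)/(1 + tauc);      % share of the wage kept, equal to 1 - tau

disp([tau wedge])
```

With these numbers the effective rate lies above the labor income tax rate alone, which is the sense in which the consumption tax adds to the disincentive to work.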

4.2.2 Firm’s Problem

The firm’s profit maximization problem is

$$\max_{k,h}\,[z k^{\theta} h^{1-\theta} - rk - wh].$$

Profit maximization for the firm implies

$$w = (1 - \theta) z k^{\theta} h^{-\theta},$$

which can be written, using the production function (4.2.1), as

$$w = (1 - \theta) o/h.$$

The above equation also implies

$$1 - \theta = \frac{wh}{o}.$$

Therefore, 1 − θ is labor’s share of income.

4.2.3 Equilibrium–well, almost

Solving out for the wage rate, w, in the first-order condition for labor yields

$$\frac{(1 - \tau)(1 - \theta) o/h}{c} = \alpha \frac{1}{100 - h},$$

or

$$(100 - h)(1 - \tau)(1 - \theta) o = \alpha c h,$$

so that

$$h = \frac{100(1 - \theta)}{\alpha (c/o)/(1 - \tau) + (1 - \theta)}. \tag{4.2.2}$$
Hours worked, h, is a decreasing function in the effective tax rate,
τ, and a decreasing function in the consumption/output ratio, c/o.
Loosely speaking the term τ is capturing the substitution effect asso-
ciated with taxation while the term c/o is tied to the income effect.
Anything that increases consumption (relative to output) causes the
worker to cut back on his effort since the marginal benefit of working
falls. The effect of taxation on c/o will be modest when the revenue
from taxation is rebated back as lump-sum transfer payments. In this
situation the negative income effect associated with taxation will be
minimized.
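
To get a feel for equation (4.2.2), the following sketch evaluates it once. The parameter values (α = 1.54, θ = 0.32) and the U.S. 1993-96 observations (τ = 0.40, c/o = 0.81) are the ones reported later in the chapter; the helper name hoursfun and the 60 percent comparison rate are assumptions made for this illustration.

```matlab
% Sketch: evaluate equation (4.2.2) for one country.
alpha = 1.54;  % weight on leisure (value used later in the chapter)
theta = 0.32;  % capital's share of income
tau   = 0.40;  % effective tax rate, United States 1993-96
ctoo  = 0.81;  % consumption/output ratio, United States 1993-96

% hoursfun is a hypothetical helper name for this illustration
hoursfun = @(tau, ctoo) 100*(1 - theta)./(alpha*ctoo./(1 - tau) + (1 - theta));

h     = hoursfun(tau, ctoo);    % predicted hours per week
hhigh = hoursfun(0.60, ctoo);   % same c/o, but a 60 percent tax rate
disp([h hhigh])
```

Holding c/o fixed, the higher tax rate lowers predicted hours, which is the substitution effect described above.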
It is easy to calculate that

$$(1 - \tau)\frac{d \ln h}{d\tau} = -\frac{1}{\alpha (c/o)/(1 - \tau) + (1 - \theta)}\,\alpha\,\frac{(c/o)}{1 - \tau} < 0.$$

This is the elasticity of labor with respect to a tax change. (Actually, it is the elasticity of labor with respect to 1 − τ, where it should be noted that d(1 − τ)/dτ = −1.) Therefore, the impact of a tax change on labor supply is bigger the larger is c/o and the smaller is 1 − τ (or the bigger is τ). The consumption/output ratio, c/o, will be higher when the revenue from taxation is used for lump-sum transfer payments as opposed to wasteful government spending on goods and services. For use in Section 4.7, note that

$$\frac{dh}{d\alpha} = -\frac{100(1 - \theta)}{[\alpha (c/o)/(1 - \tau) + (1 - \theta)]^2}\,\frac{(c/o)}{1 - \tau} < 0. \tag{4.2.3}$$
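
One way to guard against algebra slips in a formula such as (4.2.3) is to compare it with a finite-difference approximation. The sketch below does this at illustrative parameter values; the function names hfun and dhfun and the step size are assumptions of this example.

```matlab
% Sketch: check the derivative in (4.2.3) against a central finite difference.
theta = 0.32; tau = 0.40; ctoo = 0.81;   % illustrative values
alpha = 1.54;

hfun  = @(a) 100*(1 - theta)./(a*ctoo/(1 - tau) + (1 - theta));     % (4.2.2)
dhfun = @(a) -100*(1 - theta)./(a*ctoo/(1 - tau) + (1 - theta)).^2 ...
             *ctoo/(1 - tau);                                        % (4.2.3)

step    = 1e-6;
numeric = (hfun(alpha + step) - hfun(alpha - step))/(2*step);  % central difference
disp([dhfun(alpha) numeric])
```

The two numbers should agree to several decimal places; a visible gap would signal an error in the analytical derivative.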

The heart of Prescott’s (2004) quantitative analysis is equation (4.2.2), which is used to predict hours worked for each country. The following two observations are made about this equation:

1. Given values for the taste and production parameters, α and θ, which will be the same across countries, and observations on the consumption-output ratio, c/o, and the effective tax rate, τ, which will differ across countries, one can make a prediction about hours worked, h, for each country.

2. Strictly speaking the consumption-output ratio, c/o, is an endogenous variable and should ideally be solved out for.

Some examples will now be presented, which echo the theory of labor
income taxation presented in Chapter 2. In these examples, let labor be
the only factor of production. Therefore, assume a linear production
function of the form o = wh. Thus, set θ = 0.

Example 11. (All government spending is transfer payments) Suppose that the government rebates back tax revenue via lump-sum transfers (g = 0). In this case, c = wh = o via the resource constraint (since g = 0). Then, formula (4.2.2) appears as

$$h = \frac{100}{\alpha/(1 - \tau) + 1}.$$

(Recall that θ = 0.) Hours worked will decline when taxes rise. Only the substitution effect from taxation is present here. There is no negative income effect because all of the tax revenue is rebated back.

(All government spending is a deadweight loss) Alternatively, assume that the government spends all tax revenue. Here, c = (1 − τ)wh = (1 − τ)o, using the worker’s budget constraint. In this situation formula (4.2.2) reads

$$h = \frac{100}{\alpha + 1}.$$

Taxation has no impact on hours worked because the income and substitution effects exactly cancel out.

(Valued government spending) Now let utility from consumption be written as ln(c + ξg), with 0 ≤ ξ ≤ 1. Assume that the government spends all tax revenue. By parroting the above steps, one now gets

$$(100 - h)(1 - \tau) o = \alpha (c + \xi g) h.$$

In the above equation use the facts that c = (1 − τ)wh and g = τwh. This gives

$$(100 - h)(1 - \tau) o = \alpha [1 - (1 - \xi)\tau] w h^2.$$

Therefore, since o = wh,

$$h = \frac{100}{\alpha [1 - (1 - \xi)\tau]/(1 - \tau) + 1}.$$

If ξ = 0, then the result for the deadweight-loss case obtains, and when ξ = 1, the result for the transfer-payments case occurs. So, this case is just a hybrid of the previous two cases.
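
The three cases can be compared numerically. The sketch below is only an illustration: α = 1.54 is the value used later in the chapter, the tax rate is an arbitrary choice, and the names hxi, htransfer, and hwaste are made up for this example.

```matlab
% Sketch: hours worked in the three special cases with theta = 0.
alpha = 1.54;   % weight on leisure (the chapter's later calibration)
tau   = 0.40;   % illustrative effective tax rate

% Valued government spending; xi is the weight on g in utility
hxi = @(xi) 100./(alpha*(1 - (1 - xi)*tau)/(1 - tau) + 1);

htransfer = 100/(alpha/(1 - tau) + 1);   % all revenue rebated lump sum
hwaste    = 100/(alpha + 1);             % all revenue spent, no utility value

disp([htransfer hxi(1); hwaste hxi(0)])  % each row should show equal numbers
```

Hours are lower in the transfer case than in the deadweight-loss case, since only the former leaves the substitution effect unopposed by an income effect.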

4.3 National Income and Product Accounts

The national income and product accounts (NIPA) are a key source of data for macroeconomists. While national income accounting is one of the most boring aspects of macroeconomics, it is also one of the most valuable. NIPA follows the standard practice of double-entry bookkeeping utilized in accounting. On the lefthand side of the accounts is the expenditure on final goods and services produced in the economy. On the righthand side is the income earned by consumers, firms, and government. Entries made on the lefthand side must be balanced off with entries on the righthand side, and vice versa.

Richard Stone (1913-1991) was instrumental in developing the national income accounts; see Stone and Stone (1966). He introduced the technique of double-entry accounting into the accounts. For all entries on the lefthand (expenditure) side of the ledger there must be corresponding entries on the righthand (income) side. For this he was honored with the Nobel prize in 1984. He earned an economics degree from Cambridge University in 1935 and after World War II served as a professor there until his retirement in 1980.

The logic underlying NIPA can be gleaned by thinking about the circular flow of income. Imagine a static setting where final goods are produced just using labor, so income in the economy is just labor income. This labor income is then expended by consumers on final goods. The situation is portrayed in Figure 4.3.1. Let a CAPITAL letter denote a variable in the National Income and Product Accounts (NIPA). This gives the NIPA identity where consumption, C, equals labor income, WL:

C = WL.

In the above example, suppose that there are also profits, Π, on production. Then, the accounts read

C = WL + Π.

For some businesses, say sole proprietorships, it is impossible to break down income into labor income, WL, and profits, Π. For these types of businesses only proprietors’ income, PI, is recorded so that now

C = WL + Π + PI.

In reality all income, WL + Π + PI, is not used for domestic consumption, C. Some of it is saved. Suppose that people can also use their income for another domestically produced final good, investment, I. There are two concepts for investment, gross and net. Gross investment includes spending to replace the depreciation, D, on the existing capital stock. Firms deduct depreciation from their profits, Π. So, if I represents gross investment it will include depreciation. On the righthand side of the national income identity depreciation must be added back, because it is deducted when calculating profits, yielding

C + I = WL + Π + PI + D.

Another source of final expenditures is the government, G. It raises the money from taxes, both direct and indirect. Labor income, WL, profits, Π, and proprietors’ income, PI, are recorded before the direct taxes levied on earned income. Hence, this revenue is already incorporated into the righthand side of the national income identity. Sales taxes and property taxes are not. These are called indirect taxes, IT. Indirect taxes are included in the expenditure on the lefthand side and hence they only need to be added to the righthand side to obtain

C + I + G = WL + Π + PI + D + IT.

Figure 4.3.1: Circular Flow of Income. Output is produced using solely labor. The income generated from labor is used for expenditure on consumption. (Diagram labels: Consumers, Final Goods, Expenditure, Income, Labor, Firms.)

Additionally, part of expenditure on final goods derives from exports, E. Also, some of expenditure is not on domestically produced goods but on imports, M. Thus, net exports, E − M, should be added to the lefthand side:

GDP ≡ C + I + G + E − M = WL + Π + PI + D + IT.

The above equation gives the national income identity for gross domestic product, GDP. For net domestic product, NDP, depreciation is subtracted off of both sides to obtain

NDP ≡ GDP − D = C + I − D + G + E − M = WL + Π + PI + IT,

where I − D is net investment. National income, NI, is defined as net domestic product, NDP, less indirect taxes, IT, or

NI ≡ NDP − IT = C + I − D + G + E − M − IT = WL + Π + PI.
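
The chain of identities can be checked with a toy set of accounts. Every number in the sketch below is invented for illustration (none is actual NIPA data), and PROFIT stands in for Π.

```matlab
% Sketch: NIPA identities with made-up numbers (not actual data).
C = 650; I = 180; G = 200; E = 110; M = 140;          % expenditure side
WL = 560; PROFIT = 150; PI = 80; D = 120; IT = 90;    % income side

GDPexp = C + I + G + E - M;           % expenditure measure of GDP
GDPinc = WL + PROFIT + PI + D + IT;   % income measure of GDP
NDP = GDPexp - D;                     % net domestic product
NI  = NDP - IT;                       % national income

disp([GDPexp GDPinc NDP NI])
```

The numbers were chosen so that both sides of the ledger balance; national income then equals the sum of labor income, profits, and proprietors’ income.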

4.4 Mapping the Model into the Data

Associating model quantities with their analogues in NIPA can be a bit tricky at times. Also, how should parameter values be assigned? The model’s resource constraint is c + g + i = o. Adding indirect taxes on both sides gives c + IT + g + i = o + IT. Consumption and investment spending in the data, C + I, include indirect taxes; i.e., $C = c + IT_c$ and $I = i + IT_i$, where $IT_c$ and $IT_i$ are the indirect taxes on consumption and investment, with $IT_c + IT_i = IT$. Therefore, C + G + I = GDP. Note that $GDP - IT = o = z k^{\theta} h^{1-\theta}$.

Assigning Parameter Values

Parameter values will now be assigned for $\tau_c$, $\tau_h$, α, and θ. The notions of consumption and output in the model relative to the data are also discussed.

• Consumption tax, $\tau_c$. Indirect taxes on consumption, $IT_c$, are estimated from total indirect taxes, IT, as follows:

$$IT_c = \left(\frac{2}{3} + \frac{1}{3} \times \frac{C}{C + I}\right) IT.$$

The consumption tax rate, $\tau_c$, is then computed as

$$\tau_c = \frac{IT_c}{C - IT_c}.$$

Note that consumption is approximately 2/3 of private spending. So, Prescott (2004) is assuming that 2/3 of indirect taxes fall directly on consumption. Think of this as representing the sales tax on consumption. The remaining indirect taxes fall on business, which will be partially passed on to consumers. This represents things such as gasoline or property taxes, which will be reflected in higher prices for goods and services. So, the remaining 1/3 is split between consumption and investment according to consumption’s share of this spending. Consumption in NIPA is measured as $C = c + IT_c$. That is, measured consumption includes the indirect taxes on consumption, $IT_c$. So, consumption in the model (abstracting from substitutable government spending, which is discussed below), c, is given by $c = C - IT_c$.

• Labor income tax, $\tau_h$. The labor income tax is comprised of two taxes, viz a social security tax, $\tau_{ss}$, and an income tax, $\tau_{inc}$. The social security tax rate is estimated to be

$$\tau_{ss} = \frac{\text{Social Security Taxes}}{(1 - \theta)(GDP - IT)}.$$

The denominator in the above expression is just labor income. The income tax rate is

$$\tau_{inc} = \frac{\text{Direct Taxes}}{NI}.$$

Direct Taxes includes taxes on interest income. When this measure is used it is impossible to disentangle labor income taxes from taxes on interest income. This explains why total income is in the denominator and not just labor income. Assume that

$$\tau_h = \tau_{ss} + 1.6 \times \tau_{inc}.$$

The number 1.6 translates the average income tax rate into a higher marginal one. Recall from Chapter 2 that with progressive taxation the marginal rate must lie above the average one, as is illustrated there by Figure 2.3.2. The translation factor is based on the 40% marginal tax rate estimated in Feenberg and Coutts (1993). They take a representative sample of households and see by how much tax revenue will increase if household income rises by 1%. They then estimate the marginal tax rate as (change in tax revenue)/(change in labor income). Assuming that this markup from average to marginal tax rates is valid for all countries is a bit heroic.

• Mapping consumption and output in the model to the data, c and o. Effective consumption, c, and output, o, in the model are assumed to have the following relationship with consumption, C, and GDP in the data:

$$c = C + G - G_{MILITARY} - IT_c,$$

$$o = GDP - IT.$$

From the above examples, taxes will have a bigger effect on labor supply the smaller is the income effect. Assuming that nonmilitary government spending, $G - G_{MILITARY}$, is substitutable for private consumption in a one-to-one fashion amplifies the negative impact that taxation has on labor supply.

• The parameters α and θ. In standard fashion capital’s share of income is set so that θ = 0.32. Labor’s share of income, 1 − θ, is often measured as

$$1 - \theta = \frac{WL}{NI - PI}.$$

How to treat the income from proprietorships is tricky. Some of it will be labor income and some of it will be capital income. The above formula factors out proprietors’ income, PI, from national income, NI, to adjust for this. Capital’s share of income is assumed to be the same for all countries. The weight on leisure in the utility function, or α, is picked to match average labor supply for the countries. This implies that α = 1.54. Prescott (2004) does this in an informal manner, but a more formal fitting procedure, following the discussion on calibration in Chapter 3, yields much the same results, as is shown in Section 4.7.
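
The mapping from the accounts to the model’s inputs can be sketched in a few lines. All of the NIPA numbers below are made up for illustration (they are not data for any country), the economy is taken to be closed for simplicity, and the variable names are assumptions of this example; only θ = 0.32 and the factor 1.6 come from the text.

```matlab
% Sketch: map NIPA aggregates into the model's tax rates and c/o ratio.
% Every NIPA number below is made up for illustration; none is actual data.
C = 650; I = 180; G = 200; GMIL = 40;   % consumption, investment, government, military
IT = 90; D = 120;                        % indirect taxes, depreciation
SSTAX = 70; DIRTAX = 100;                % social security taxes, direct taxes
theta = 0.32;                            % capital's share of income

GDP = C + I + G;      % closed economy for simplicity (assumption)
NI  = GDP - D - IT;   % national income

ITc    = (2/3 + (1/3)*C/(C + I))*IT;      % indirect taxes on consumption
tauc   = ITc/(C - ITc);                   % consumption tax rate
tauss  = SSTAX/((1 - theta)*(GDP - IT));  % social security tax rate
tauinc = DIRTAX/NI;                       % average income tax rate
tauh   = tauss + 1.6*tauinc;              % marginal labor income tax rate
tau    = (tauc + tauh)/(1 + tauc);        % effective tax rate on labor

c = C + G - GMIL - ITc;   % model consumption
o = GDP - IT;             % model output
disp([tauc tauh tau c/o])
```

The resulting τ and c/o are exactly the two country-specific inputs that equation (4.2.2) requires.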

4.5 Actual and Predicted Labor Supplies

4.5.1 Historical Episodes

Under the 1986 U.S. tax reform the marginal tax rate on the second earner in a household dropped. This led to a 10% increase in labor supply between the 1970s and 1990s; recall Example 2. Most of this was due to an increase in female labor supply. There was a similar reform in Spain in 1998. Labor supply went up by 12%, with a slight increase in revenues.

Example 12. (Joint taxation of married household income) Consider the problem of a married couple. Assume that the husband works a fixed 40 hour week, denoted by $\bar{h}$. He earns the wage w and is taxed at the rate $\tau_1$. Let the wife’s working hours, h, be flexible. If the woman works, she earns the wage rate φw, where φ is the gender gap or the ratio of female to male earnings. Suppose that if the woman works, then any family income above the level b will be taxed at the higher marginal rate $\tau_2 > \tau_1$; see Figure 4.5.1. That is, when the wife works the family may be pushed into a higher bracket. The household’s budget constraint under joint taxation appears as

$$c = \begin{cases} (1 - \tau_1) w \bar{h} + (1 - \tau_1) \phi w h + \lambda, & \text{if } w \bar{h} + \phi w h < b; \\ (1 - \tau_1) b + (1 - \tau_2)(w \bar{h} + \phi w h - b) + \lambda, & \text{if } w \bar{h} + \phi w h \geq b, \end{cases}$$

while if they are taxed separately it reads

$$c = (1 - \tau_1) w \bar{h} + (1 - \tau_1) \phi w h + \lambda.$$

Clearly, joint taxation creates a larger disincentive effect for the second worker, here the wife, than taxing each person separately.
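
The disincentive can be made concrete with a small calculation. The sketch below uses entirely hypothetical wages, hours, tax rates, and bracket threshold, and the function names are made up for this illustration; it compares the extra consumption the household gets from the wife’s work under the two tax regimes.

```matlab
% Sketch: the wife's after-tax return to work under joint vs. separate taxation.
% All numbers are hypothetical.
w = 20; hbar = 40; phi = 0.8;   % husband's wage and hours, gender gap
tau1 = 0.15; tau2 = 0.30;       % lower and higher marginal tax rates
b = 1000; lambda = 0;           % bracket threshold and transfers

income = @(h) w*hbar + phi*w*h;   % pre-tax family income
cjoint = @(h) (income(h) <  b).*((1 - tau1)*income(h) + lambda) + ...
              (income(h) >= b).*((1 - tau1)*b + (1 - tau2)*(income(h) - b) + lambda);
csep   = @(h) (1 - tau1)*income(h) + lambda;

h = 30;   % wife's weekly hours
gainjoint = cjoint(h) - cjoint(0);   % extra consumption from her work, joint taxation
gainsep   = csep(h)   - csep(0);     % extra consumption from her work, separate taxation
disp([gainjoint gainsep])
```

With these numbers the wife’s earnings push family income past b, so part of her pay is taxed at the higher rate and her work adds less to consumption under joint taxation.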

Figure 4.5.1: Joint Taxation. When the wife works the amount h the household is pushed into a higher tax bracket. Any household income above the amount b is taxed at the higher marginal rate $\tau_2 > \tau_1$.

4.5.2 Results

The analysis proceeds by using equation (4.2.2) to predict hours worked, h, for each country as a function of their effective tax rate on labor, τ, and their consumption/output ratio, c/o. The two parameters α and θ are common across countries and are chosen in the manner discussed above. Table 4.5.1 below displays the results.

The model predicts hours worked in the G7 well for the period 1993-96, and less so for the period 1970-74. This can be seen better from Figure 4.7.1 presented later on. If France reduced its effective tax rate from 60 percent to the U.S. level of 40 percent, it would increase its welfare by 19 percent, measured in terms of consumption. Leisure drops from 81.2 hours to 75.8 hours. There is no reduction in tax revenue. If the United States reduced its income tax rate down from 40 percent to 30 percent welfare would rise by 7 percent in terms of consumption.

Table 4.5.1: Labor supply.

Actual and Predicted Labor Supply

                                   Labor Supply                    Prediction Factors
Period    Country           Actual   Predicted   Difference          τ       c/o
1993–96   Germany            19.3      19.5          0.2            .59      .74
          France             17.5      19.5          2.0            .59      .74
          Italy              16.5      18.8          2.3            .64      .69
          Canada             22.9      21.3         -1.6            .52      .77
          United Kingdom     22.8      22.8          0              .44      .83
          Japan              27.0      29.0          2.0            .37      .68
          United States      25.9      24.6         -1.3            .40      .81

1970–74   Germany            24.6      24.6          0              .52      .66
          France             24.4      25.4          1.0            .49      .66
          Italy              19.2      28.3          9.1            .41      .66
          Canada             22.2      25.6          3.4            .44      .72
          United Kingdom     25.9      24.0         -1.9            .45      .77
          Japan              29.8      35.8          6.0            .25      .60
          United States      23.5      26.4          2.9            .40      .74

4.5.3 Financing Social Security

Suppose that the United States switches to a system of private accounts for social security. In particular, let each worker choose between putting 8.7 percent of income into a private retirement account or staying in the present system. The first option reduces the effective income tax rate for a worker to 31.3 percent from 40 percent. This is because, unlike the current system, a worker realizes that he will get these contributions back when he retires; i.e., his retirement payments will now be directly tied to his own work effort. The welfare gain to a 21 year old is estimated to be worth 4 percent of his lifetime consumption. The benefit would be larger still if Prescott had allowed the retirement age to adjust. That is, in a private system there is more incentive to work longer because this allows you to build up your retirement account.

Example 13. (Private-savings accounts) Imagine an individual who lives for two periods. In the first period he works, while in the second period he is retired. Suppose that both the individual and the government face the common interest rate r. Under the current system, the person faces a social security tax on their labor income at the rate $\tau_{ss}$ and will receive a lump-sum benefit (essentially unrelated to the work effort) in the amount λ. The person’s budget constraint will appear as

$$c_1 + \frac{c_2}{1 + r} = (1 - \tau_{ss}) w h + \frac{\lambda}{1 + r},$$

where $c_1$ and $c_2$ are consumption in the first and second periods. Clearly, with this budget constraint, the individual’s labor-leisure choice will be distorted by the presence of the tax. Under private accounts the person will recognize that $\lambda = (1 + r)\tau_{ss} w h$; i.e., he will internalize the fact that he will get back with interest any money that goes into his private account. Substituting out for λ in the above budget constraint then gives

$$c_1 + \frac{c_2}{1 + r} = w h.$$

Hence, a system of private accounts does not distort the person’s labor-leisure choice.
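
The point of the example can be seen by computing the return to an extra hour of work under each system. The sketch below uses hypothetical values for the wage, hours, interest rate, tax rate, and lump-sum benefit, and the function names are made up for this illustration.

```matlab
% Sketch: present value of lifetime resources under the two systems.
% All numbers are hypothetical.
w = 20; h = 40; r = 0.05; tauss = 0.124;

% Current system: the benefit lambda is a fixed lump sum, unrelated to h
lambda    = 100;
pvcurrent = @(h) (1 - tauss)*w*h + lambda/(1 + r);

% Private accounts: contributions come back with interest
pvprivate = @(h) (1 - tauss)*w*h + (1 + r)*tauss*w*h/(1 + r);

% Return to one more hour of work in each case
dcurrent = pvcurrent(h + 1) - pvcurrent(h);   % equals (1 - tauss)*w
dprivate = pvprivate(h + 1) - pvprivate(h);   % equals w
disp([dcurrent dprivate])
```

Under the current system the worker keeps only the after-tax wage at the margin, while under private accounts the full wage is recovered in present-value terms.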

4.6 Conclusions

The results are sensitive to the implied elasticity of labor supply with respect to taxes. This elasticity depends on how the revenue from taxation is used because this governs the relative strength of the income and substitution effects from a shift in taxation. A high elasticity implies
that hours worked will be very responsive to changes in taxation. A
high elasticity occurs when the revenue from taxation is either rebated
back in the form of lump-sum transfer payments or when government
spending on goods and services is highly substitutable with private
spending. In these two cases the income effect associated with taxa-
tion is mitigated so that just the substitution effect remains. The dead-
weight loss from labor income taxation will be high. A high elasticity
of labor supply implies that reforms to the social security system, such
as private-saving accounts, will significantly improve welfare. Such
accounts encourage people to work longer.

4.7 MATLAB: A Worked-Out Example

Here is some MATLAB code that computes the solution for hours worked in Prescott’s model for the 7 countries involved. The code has three parts.

4.7.1 Taking α = 1.54.

The first part of the program uses Prescott’s formula for hours worked, equation (4.2.2). It takes Prescott’s values for α and θ. Given these values, hours worked can be computed for each country using its tax rate, τ, and consumption-to-output ratio, c/o. The program loads Prescott’s data for τ and c/o, which are given in the paper. It then computes h for each country using equation (4.2.2). The results are then plotted against the data.

4.7.2 Picking α Optimally

The second and third parts of the program use the value for α that maximizes the model’s goodness of fit for hours over the seven countries and both periods; i.e., that minimizes the model’s prediction error. They take Prescott’s value for θ as given. In particular, α solves the problem

$$\min_{\alpha}\Big\{\sum_{i=1}^{7}\big[h_i^{data93-96} - h_i^{model93-96}(\alpha)\big]^2 + \sum_{i=1}^{7}\big[h_i^{data70-74} - h_i^{model70-74}(\alpha)\big]^2\Big\},$$

where $h_i^{data93-96}$ and $h_i^{data70-74}$ are the data for hours worked in country i for 1993-1996 and 1970-1974, respectively, and $h_i^{model93-96}$ and $h_i^{model70-74}$ are the model’s predictions for this variable. Note that the model’s prediction is a function of α, as can be seen from equation (4.2.2).
This problem can be solved numerically two ways. First, one could solve the first-order condition associated with this minimization problem using the techniques outlined in Chapter 2 for solving a nonlinear equation. This first-order condition is

$$\sum_{i=1}^{7}\big[h_i^{data93-96} - h_i^{model93-96}(\alpha)\big]\frac{d h_i^{model93-96}(\alpha)}{d\alpha} + \sum_{i=1}^{7}\big[h_i^{data70-74} - h_i^{model70-74}(\alpha)\big]\frac{d h_i^{model70-74}(\alpha)}{d\alpha} = 0.$$

The second part of the program picks α to solve this equation. Equation (4.2.3) gives the formula for $d h^{model}(\alpha)/d\alpha$. The program then just repeats what was done in the first part using this new value for α. Second, one could minimize the sum of squares directly: the third part of the program employs a minimization algorithm to compute α, following the procedure discussed in Chapter 3.
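
Before turning to the full program, the two solution strategies can be sketched compactly with anonymous functions instead of separate function files and global variables. This is an illustrative variant only, not the book’s code: the names x, hfun, dhfun, ssq, and foc are choices made for this sketch, and the search interval passed to fminbnd is an assumption. The data are those of Table 4.5.1, with the two periods stacked into single vectors.

```matlab
% Sketch: choose alpha by least squares in two ways, without global variables.
theta = 0.32;
tau   = [.59 .59 .64 .52 .44 .37 .40  .52 .49 .41 .44 .45 .25 .40];
ctoo  = [.74 .74 .69 .77 .83 .68 .81  .66 .66 .66 .72 .77 .60 .74];
hdata = [19.3 17.5 16.5 22.9 22.8 27.0 25.9  24.6 24.4 19.2 22.2 25.9 29.8 23.5];

x     = ctoo./(1 - tau);                                   % (c/o)/(1 - tau)
hfun  = @(a) 100*(1 - theta)./(a*x + (1 - theta));         % equation (4.2.2)
dhfun = @(a) -100*(1 - theta)*x./(a*x + (1 - theta)).^2;   % equation (4.2.3)

ssq = @(a) sum((hdata - hfun(a)).^2);          % sum of squared prediction errors
foc = @(a) sum((hdata - hfun(a)).*dhfun(a));   % first-order condition

alphafoc = fzero(foc, 1.54);        % way 1: solve the first-order condition
alphamin = fminbnd(ssq, 0.5, 3);    % way 2: minimize directly on an assumed interval
disp([alphafoc alphamin])
```

The two routes should return essentially the same α, up to the tolerances of the solvers.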

4.7.3 The MATLAB Code

MATLAB Main Program–prescott.m

The main m file is prescott.m. It calls five different function files, namely derivh.m, foc.m, hours.m, sumofsquares.m, and makefigures.m. To run the program just type prescott. Note the use of vectors to load in the data and save the model results. Furthermore, note the use of the .* and ./ multiplication and division operators, which multiply and divide one vector by another in a component-by-component fashion. Last, inserting ... allows an instruction to flow over to the next line. norm is a MATLAB command that computes the norm of a vector. sum is a MATLAB command that sums the components of a vector.
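
For readers new to these operators, here is a minimal sketch on small made-up vectors; the variable names are arbitrary and the numbers have no economic meaning.

```matlab
% Sketch: the vector operations used in prescott.m, on small made-up vectors.
a = [1, 2, 3];
b = [4, 5, 6];

prod_ab = a.*b;    % component-by-component product: [4 10 18]
quot_ab = a./b;    % component-by-component quotient: [0.25 0.4 0.5]

total = sum(a) + ...   % the three dots continue the instruction
        sum(b);        % on the next line; total is 21

len = norm([3, 4]);    % Euclidean norm: sqrt(3^2 + 4^2) = 5
disp(prod_ab), disp(quot_ab), disp([total len])
```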
% Prescott 2004
clear all
clc
global alpha theta
global tau9396 ctoo9396 hdata9396 tau7074 ctoo7074 hdata7074

% Data 1993-96
tau9396 = [.59, .59, .64, .52, .44, .37, .40]; % Effective tax rate
ctoo9396 = [.74, .74, .69, .77, .83, .68, .81]; % Cons/output ratio
hdata9396 = [19.3, 17.5, 16.5, 22.9, 22.8, 27.0, 25.9]; % Hours data

% Data 1970-74
tau7074 = [.52, .49, .41, .44, .45, .25, .40];
ctoo7074 = [.66, .66, .66, .72, .77, .60, .74];
hdata7074 = [24.6, 24.4, 19.2, 22.2, 25.9, 29.8, 23.5];


% Parameter values
alpha = 1.54; % Weight on leisure
theta = 0.32; % Capital's share

% Calculate Model's Predictions
% 1993-96
hmodel9396 = hours(tau9396, ctoo9396);

% Function for model's solution
error9396 = hdata9396 - hmodel9396; % Prediction error
disp('Results using Prescotts alpha')
disp('alpha')
disp(alpha)
disp('Results 1993 to 96')
disp('Data Model Error')
disp([hdata9396', hmodel9396', error9396'])

% 1970-74
disp('Results 1970 to 74')
disp('Data Model Error')
hmodel7074 = hours(tau7074, ctoo7074);
error7074 = hdata7074 - hmodel7074;
disp([hdata7074', hmodel7074', error7074'])

% Compute and display euclidean norm for the prediction error
% over both periods
disp('norm')
disp(norm([error9396 error7074]))
% norm is MATLAB command for the euclidean norm
% Observe how the prediction errors have been patched together
% into a single vector.

% Make graphs plotting the data versus model for the 7 countries.
% 1993-96
% Figure 1
heading = 'Fit of Model - Prescotts alpha, 1993 to 1996';
% Call function to make the figure
makefigures(1, hdata9396, hmodel9396, heading)

% 1970-74
% Figure 2
heading = 'Fit of Model - Prescotts alpha, 1970 to 1974';
makefigures(2, hdata7074, hmodel7074, heading)

% Calculate the optimal value for alpha
% Call nonlinear equation solver to compute least squares problem
alpha = fzero(@foc, alpha);
if abs(foc(alpha)) >= 0.0000001 % Check foc is close to zero
    disp('Solution not found')
end

% Calculate Model's Predictions new alpha
% 1993-96
hmodel9396 = hours(tau9396, ctoo9396);
error9396 = hdata9396 - hmodel9396;
disp('Results using the alpha that fits the best')
disp('alpha')
disp(alpha)
disp('Results 1993 to 96')
disp('Data Model Error')
disp([hdata9396', hmodel9396', error9396'])

% 1970-74
disp('Results 1970 to 74')
disp('Data Model Error')
hmodel7074 = hours(tau7074, ctoo7074);
error7074 = hdata7074 - hmodel7074;
disp([hdata7074', hmodel7074', error7074'])
disp('norm')
disp(norm([error9396 error7074]))

% Make graphs to show fit
% 1993-96
% Figure 3
heading = 'Fit of Model--fitted alpha, 1993 to 1996';
% Call function to make the figure
makefigures(3, hdata9396, hmodel9396, heading)

% 1970-74
% Figure 4
heading = 'Fit of Model--fitted alpha, 1970 to 1974';
makefigures(4, hdata7074, hmodel7074, heading)

% Calculate Model's Predictions by minimizing the sum of the squares
alpha = fminbnd(@sumofsquares, 1, 2); % Call up minimization routine
disp('alpha by minimizing the sum of the squares')
disp(alpha)

A Function Specifying Hours Worked-hours.m

function [h] = hours(tau, ctoo)
% This function computes hours worked using Prescott's formula.
% It takes the vector of tax rates, tau, and the vector of
% consumption/output ratios, ctoo, as
% inputs and gives a vector for hours, h, as the output.
global alpha theta
h = 100*(1 - theta)./(alpha*ctoo./(1 - tau) + (1 - theta));
% Note that ./ divides one vector by another
% component by component.
end

A Function Specifying dh/dα-derivh.m

function [derivative] = derivh(alpha, tau, ctoo)
% This function gives a vector for the derivative of hours worked
% with respect to alpha, given the vectors of taxes, tau,
% and consumption to output ratios, ctoo.
global theta
derivative = -100*(1 - theta)./((alpha*ctoo./(1 - tau) + (1 - theta))).^2 ...
    .*(ctoo./(1 - tau));
% Note that ./ divides one vector by another component
% by component. Also, note that the use of ... allows
% an executable statement to flow over two lines.
end
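A quick way to gain confidence in a hand-coded derivative such as the one in derivh.m is to compare it with a finite-difference approximation. The sketch below does this with anonymous functions rather than the global variables used in the text. The names hoursFun, derivFun, and step are introduced here for illustration only, and the three tax rates and consumption/output ratios are simply the first three 1993-96 entries from the main program.

```matlab
% Check the analytical derivative dh/dalpha against a central difference.
theta = 0.32;                  % capital's share, as in the main program
tau   = [0.59, 0.59, 0.64];    % first three 1993-96 tax rates
ctoo  = [0.74, 0.74, 0.69];    % first three 1993-96 cons/output ratios

% Same formulas as hours.m and derivh.m, written as anonymous functions
hoursFun = @(a) 100*(1 - theta)./(a*ctoo./(1 - tau) + (1 - theta));
derivFun = @(a) -100*(1 - theta)./(a*ctoo./(1 - tau) + (1 - theta)).^2 ...
                .*(ctoo./(1 - tau));

alpha = 1.54;                  % Prescott's value
step  = 1e-6;                  % step size for the central difference
numerical = (hoursFun(alpha + step) - hoursFun(alpha - step))/(2*step);

% Largest discrepancy across the three countries
gap = max(abs(numerical - derivFun(alpha)));
disp(gap)
```

If the analytical formula were mistyped, for instance with a missing minus sign, the gap would be large rather than close to zero.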
A Function Specifying the Foc for the Optimal Choice of α-foc.m

function [zero] = foc(alpha)
% This function specifies the first-order condition for the
% least-squares optimization problem.
global theta
global tau9396 ctoo9396 hdata9396 tau7074 ctoo7074 hdata7074
% Calculate the model's solution for hours worked for a given alpha
hmodel9396 = 100*(1 - theta)./(alpha*ctoo9396./(1 - tau9396) + (1 - theta));
hmodel7074 = 100*(1 - theta)./(alpha*ctoo7074./(1 - tau7074) + (1 - theta));
% Write out the first-order condition for the minimization problem
% involving the choice of alpha
zero = -(hdata9396 - hmodel9396).*derivh(alpha, tau9396, ctoo9396) ...
    - (hdata7074 - hmodel7074).*derivh(alpha, tau7074, ctoo7074);
% Note that .* multiplies one vector by another component
% by component. Also, note that the use of ... allows the
% executable statement to flow over two lines.
zero = 2*sum(zero);
% sum is a MATLAB command that adds across the components of a vector.
end
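For reference, the first-order condition coded in foc.m can be written out as below. This is a sketch read off from the code listings, not a formula quoted from the text: the index $i$ runs over the 14 country-period observations, and the symbols $x_i$, $h_i^{\text{data}}$, and $S(\alpha)$ are notation introduced here for convenience.

```latex
\begin{align*}
h_i(\alpha) &= \frac{100(1-\theta)}{\alpha x_i + (1-\theta)},
  \qquad x_i \equiv \frac{(c/o)_i}{1-\tau_i}, \\
S(\alpha) &= \sum_i \left[ h_i^{\text{data}} - h_i(\alpha) \right]^2, \\
\frac{dh_i}{d\alpha} &= -\frac{100(1-\theta)\, x_i}{\left[ \alpha x_i + (1-\theta) \right]^2}, \\
\frac{dS}{d\alpha} &= -2 \sum_i \left[ h_i^{\text{data}} - h_i(\alpha) \right] \frac{dh_i}{d\alpha} = 0 .
\end{align*}
```

The second line is the objective in the least-squares problem, the third line is what derivh.m returns, and the last line is the expression that foc.m sets to zero.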

A Function Specifying the Sum of the Squares for the Optimal Choice
of α-sumofsquares.m

function [sos] = sumofsquares(alpha)
% This function specifies the objective function for the least-squares
% optimization problem.
global theta
global tau9396 ctoo9396 hdata9396 tau7074 ctoo7074 hdata7074
% Calculate the model's solution for hours worked for a given alpha
hmodel9396 = 100*(1 - theta)./(alpha*ctoo9396./(1 - tau9396) + (1 - theta));
hmodel7074 = 100*(1 - theta)./(alpha*ctoo7074./(1 - tau7074) + (1 - theta));
% Write out the sum of the squares for the minimization problem
% involving the choice of alpha
sos = (hdata9396 - hmodel9396).^2 ...
    + (hdata7074 - hmodel7074).^2;
% Note that .^ raises each component of a vector to a power.
% Also, note that the use of ... allows our executable statement to flow
% over two lines.
sos = sum(sos); % Sum
% sum is a MATLAB command that adds across the components of a vector
end
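The two routes to the best-fitting α, solving the first-order condition with fzero and minimizing the sum of squares directly with fminbnd, should deliver the same answer. The sketch below checks this in a single self-contained script. It pools the two data periods into one vector and uses anonymous functions in place of the global variables and separate function files; the names x, hmodel, dhda, sos, focfun, alphaFoc, and alphaMin are introduced here for illustration.

```matlab
% Data copied from the main program, with 1993-96 followed by 1970-74
tau   = [.59, .59, .64, .52, .44, .37, .40, .52, .49, .41, .44, .45, .25, .40];
ctoo  = [.74, .74, .69, .77, .83, .68, .81, .66, .66, .66, .72, .77, .60, .74];
hdata = [19.3, 17.5, 16.5, 22.9, 22.8, 27.0, 25.9, ...
         24.6, 24.4, 19.2, 22.2, 25.9, 29.8, 23.5];
theta = 0.32;                      % capital's share
x = ctoo./(1 - tau);               % shorthand for ctoo/(1 - tau)

hmodel = @(a) 100*(1 - theta)./(a*x + (1 - theta));        % hours formula
dhda   = @(a) -100*(1 - theta)*x./(a*x + (1 - theta)).^2;  % dh/dalpha
sos    = @(a) sum((hdata - hmodel(a)).^2);                 % objective
focfun = @(a) -2*sum((hdata - hmodel(a)).*dhda(a));        % its derivative

alphaFoc = fzero(focfun, 1.54);    % root of the first-order condition
alphaMin = fminbnd(sos, 1, 2);     % direct minimization on [1, 2]
disp([alphaFoc, alphaMin])
```

Anonymous functions capture theta, x, and hdata when they are defined, which avoids global variables. The cost is that the functions must be redefined if the data change.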
A Function for Generating the Figures-makefigures.m

This function saves on coding so you don’t have to write all the com-
mands for generating the figures over and over again.
function [] = makefigures(num, hdata, hmodel, heading)
% This function makes a graph plotting the data versus model
% for the 7 countries. num = the figure number. hdata is
% the data vector of hours worked for the countries.
% hmodel is the same thing for the model.
% heading is a character string giving the title for
% the graph. Note that output for the function is empty, or [].
figure(num)
plot(hdata, hmodel, 'rd', hdata, hdata, 'b-')
% The plot command asks MATLAB to plot hdata versus hmodel
% using a red symbol. There is no line for the first plot.
% The second plot shows hdata versus hdata, or the 45 degree line,
% using a blue straight line.
title(heading)
xlabel('data')
ylabel('model')
% The next lines are used to label the 7 points on the graph
x1 = hdata(1); % Germany
y1 = hmodel(1);
txt1 = 'Ger';
text(x1, y1, txt1)
x2 = hdata(2); % France
y2 = hmodel(2);
txt2 = 'Fra';
text(x2, y2, txt2)
x3 = hdata(3); % Italy
y3 = hmodel(3);
txt3 = 'Ita';
text(x3, y3, txt3)
x4 = hdata(4); % Canada
y4 = hmodel(4);
txt4 = 'Can';
text(x4, y4, txt4)
x5 = hdata(5); % UK
y5 = hmodel(5);
txt5 = 'UK';
text(x5, y5, txt5)
x6 = hdata(6); % Japan
y6 = hmodel(6);
txt6 = 'Jap';
text(x6, y6, txt6, 'HorizontalAlignment', 'right')
x7 = hdata(7); % USA
y7 = hmodel(7);
txt7 = 'USA';
text(x7, y7, txt7)
end

Output from Program

Some output from the program is shown. To save on space the matrices
showing the numerical results are omitted. These are better displayed
in the diagrams, which are shown. Prescott’s model works well for the
later period, but not as well for the earlier one. Whether Prescott’s α
is used or the optimally fitted one does not appear to matter much.
% Results using Prescotts alpha
alpha
1.5400
norm
12.5548

Figure 4.7.1: Goodness of fit using Prescott’s α, 1993-96 and 1970-74. [Two scatter plots, titled "Fit of Model--Prescotts alpha, 1993 to 1996" and "Fit of Model--Prescotts alpha, 1970 to 1974", with data on the horizontal axis and model on the vertical axis. The points are labeled Ger, Fra, Ita, Can, UK, Jap, and USA.]

% Results using the alpha that fits the best: from first-order condition
alpha
1.7105
norm
10.3763

Figure 4.7.2: Goodness of fit using least squares’ α, 1993-96 and 1970-74. [Two scatter plots, titled "Fit of Model--fitted alpha, 1993 to 1996" and "Fit of Model--fitted alpha, 1970 to 1974", with data on the horizontal axis and model on the vertical axis. The points are labeled Ger, Fra, Ita, Can, UK, Jap, and USA.]

% Results using the alpha that fits the best: minimization problem
alpha
1.7105

5 Graphing

A picture is worth a thousand words.

5.1 Introduction

Graphs can make a paper or presentation come alive. Tufte (2001) says
that Charles Joseph Minard’s (1781-1870) graph portraying the fate of
Napoleon’s army during its invasion of Russia in 1812 “may well be
the best statistical graph ever drawn.” The graph combines a map with
a time-series showing the size of Napoleon’s army as it travelled from
the Polish-Russian border to Moscow and back. The width of the thick
red line illustrates the size of the army as it advances towards Moscow.
The width of the black line shows the size of it as it retreats. The brutal
temperatures facing the army are shown in the bottom panel, which is
in sync with the upper one.

5.2 William Playfair

The father of statistical graphing was William Playfair (1759–1823).
He is credited with inventing time-series plots, bar graphs, and pie
charts. One of Playfair’s time-series plots is displayed in Figure 5.2.1. It
shows the trade balance over time between England, on the one hand,
and Denmark and Norway, on the other. Figure 5.2.2 is a bar graph
illustrating Scottish exports to and imports from various countries for
the year 1781. One of Playfair’s pie charts exhibiting the fractions of
the Turkish empire (before 1789) located in Africa, Asia, and Europe is
presented in Figure 5.2.3.

Figure 5.1.1: Napoleon’s march. A rendition in English of Charles Joseph Minard’s famous 1869 chart.

Figure 5.2.1: Source: Playfair, William. Commercial and Political Atlas, 1786.

Figure 5.2.2: Source: Playfair, William. Commercial and Political Atlas, 1786.

5.3 Some Basic Principles

Some basic principles for graphing are:

1. Truthfulness. Graphs should truthfully display the data. While it’s
okay to pitch an idea with enthusiasm to an audience or readers,
do not change the makeup of a graph to unduly influence people.
For example, dips and spikes in time-series plots can be exaggerated
by changing the ratio of the vertical to horizontal axes. This
is the type of trick that journalists use to flog a story to readers. If
there is a choice of data to present, then show what is representative.
Of course, your source for the data or graph should be given
somewhere. Replication is an important principle in science.

Figure 5.2.3: Source: Playfair, William. Statistical Breviary, 1801.

2. Purposefulness. A graph should be informative, not just a picture.
Whatever is being shown should be relevant for the story being
told. Graphs should be used to illustrate evidence, an idea, or a
hypothesis in a paper or presentation that would be communicated
less clearly or forcefully without a graph. As such, they should
always be clearly explained in the main text of a paper and/or verbally
in a presentation.

3. Clarity. Graphs should be clear and easy to follow.

(a) Captions, colors, fonts, labels, and lines. Axes should be labeled,
in fonts large enough to read, and graphs should have titles in
captions. Table 5.3.1 shows how fonts can be used to emphasize
something. The caption should also explain the graphical
construct, if needed. On a time series plot one can distinguish
between the lines by using different line colors and styles. The
same is true for the bars on a histogram portraying different data
objects. They can be distinguished using different colors and fill
patterns. Additionally, make sure that different colors reproduce
well in black and white, if this is a requirement.

A Couple of Axes

The mathematician               The mathematician
Plotting his past relations     Plotting his past relations
"ex" and "why" axis             "ex" and "why" axis

Table 5.3.1: The most commonly used consonant in English is the letter t. By using a bold font the mind can more quickly see this on the version of the haiku shown on the right. This is a take on an example in Schwabish (2014).
(b) Data-ink maximization. Tufte (2001) advocates the principle of
data-ink maximization. He defines the data-ink ratio as

$$\text{Data-ink ratio} = \frac{\text{data ink}}{\text{total ink used to produce the graph}}.$$

The idea is that most of the ink on a graph should be used to provide
information about the data. As such, he recommends deleting
boxes around graphs and grids on graphs, since these introduce
unnecessary ink and detract from the information shown.
Likewise, legends can often be avoided by labeling things directly,
such as lines. Does a legend really need a box? Figure 5.3.1
illustrates the idea with two versions of the same graph. The
graphs show the rise in U.S. female labor-force participation
from 7 percent in 1860 to 74 percent in 2018. The
graph on the left uses an antiquated grid, puts the plot in a frame,
and includes a legend. The one on the right is much cleaner.
Additionally, the plot lines have been thickened to emphasize them
and the fonts have been enlarged to make them more readable. The
range of the x axis has been adjusted to more relevant years.
(c) Multipanel graphs. Multipanel graphs containing a large number
of subgraphs should be avoided. The subgraphs tend to be small
and hard to read.

4. Aestheticism. While beauty is in the eye of the beholder, try to
make your graph as appealing as possible for the intended audience,
without sacrificing the principles of clarity and truthfulness.
For example, pleasing coloring schemes can be used. Or colors can
be used to represent things such as national colors or female and
male.
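The clarity principles above, in particular data-ink maximization, translate into a handful of MATLAB commands. The following is a minimal sketch, not the code used for the figures in this chapter. It plots two made-up curves, k^0.3 and k^0.6, chosen purely for illustration, removes the box and grid, thickens the lines, enlarges the fonts, and labels the lines directly instead of using a legend.

```matlab
% Two illustrative curves (hypothetical; not data from the text)
k  = linspace(0, 2, 101);
y1 = k.^0.3;
y2 = k.^0.6;

fig = figure('Visible', 'off');   % remove 'Visible','off' to see the plot
hold on
plot(k, y1, 'LineWidth', 3)       % thick lines emphasize the data
plot(k, y2, 'LineWidth', 3)
box off                           % no frame around the plot
grid off                          % no grid lines
set(gca, 'FontSize', 14)          % fonts large enough to read

% Label the lines directly at their right-hand ends instead of a legend
text(k(end), y1(end), ' k^{0.3}', 'FontSize', 14)
text(k(end), y2(end), ' k^{0.6}', 'FontSize', 14)
xlabel('k')
ylabel('output')

% Count the graphics objects on the axes: two lines and two labels
nObjects = numel(get(gca, 'Children'));
```

Every object left on the axes is either data (the two lines) or a label identifying the data, which is the spirit of a high data-ink ratio.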

Figure 5.3.1: The graphs show the rise in U.S. female labor-force participation over the 20th century. The left panel presents a graph with a low data-ink ratio. The right panel illustrates the same graph with a high data-ink ratio. Source: Greenwood et al. (2021b). [Both panels plot Labor-Force Participation, % Ages 20-64, against Year, with one line for Women and one for Men.]
5.4 MATLAB: Worked-Out Examples

5.4.1 Time Series Plot

Here is some MATLAB code for a version of Playfair’s time series
export-import plot but applied to the United States instead of England.
The code generates Figure 5.4.1.
% Reproduce a version of Playfair's time series export-import plot
% for the United States.
% Code by Artem Kuriksha
clear all;
close all;

% Importing data from comma separated values file
df = importdata('EXPGSCA.csv', ',', 1);
exp = log(df.data).'; % get export data column and then log
df = importdata('IMPGSCA.csv', ',', 1);
imp = log(df.data).'; % get import data column and then log
years = 1929:2018; % generate vector of years

% Plot
figure;
hold on;
set(gca, 'FontSize', 14) % Font size for type on graph
plot(years, exp, 'LineWidth', 4, 'Color', 'r');
plot(years, imp, 'LineWidth', 4, 'Color', 'g');
% Command to fill a polygon between the import and export vectors
f1 = fill([years fliplr(years)], [exp fliplr(max(exp, imp))], 'r');
f2 = fill([years fliplr(years)], [exp fliplr(min(exp, imp))], 'g');
alpha(f1, .2); % Pick transparency of fill
alpha(f2, .4);
% Keep the legend handle so that its font size can be set
lgd = legend({'Exports', 'Imports'}, 'Location', 'southeast');
lgd.FontSize = 14; % Font size for legend
xlabel('Year');
title('Real Exports and Imports of Goods and Services');
ylabel('Log of billions of dollars');
xlim([years(1) years(end)]); % Pick where to start and end x axis
annotation('textarrow', [0.77 0.77], [0.52 0.71], 'String', ...
    'Balance against US', 'FontSize', 11);
annotation('textarrow', [0.286 0.286], [0.42 0.29], 'String', ...
    'Balance in favor of US', 'FontSize', 11);
box on;
saveas(gcf, 'timeseries.png'); % Save graph as png file
Figure 5.4.1: The version of Play-

fair’s time series plot for the
United States that was gener-
ated from the MATLAB pro-
gram. The program was written
by Artem Kuriksha.

5.4.2 Bar Chart

Playfair’s export-import bar chart is generated here but applied to
the United States instead of England. The code generates Figure 5.4.2.
% Reproduce a version of Playfair's bar chart for the United States.
% To do this, take the United States' 15 largest trading partners.
% Code by Giorgio Lo

%% Housekeeping
clear
close all
%% Reproduce a version of Playfair's bar chart
x = [1:15];
% Load export and import data.
y = [120.3 539.5; ...
     298.7 318.5; ...
     265.0 346.5; ...
      75.0 142.6; ...
      57.7 125.9; ...
      56.3  74.3; ...
      66.2  60.8; ...
      36.3  52.5; ...
      33.1  54.4; ...
      23.2  54.7; ...
      30.2  45.8; ...
      49.4  24.6; ...
      39.5  31.2; ...
      10.7  57.5; ...
      22.2  41.1];
figure;

% Create bar chart with two axes.
% Subsequent commands target right axis.
yyaxis right;
% Make horizontal bar chart: x vertical, y horizontal
H = barh(x, y, 'grouped'); % bars are grouped by row
H(1).FaceColor = 'k'; % set first bar to black
% Set object properties for chart
set(gca, 'yticklabel', {'China', 'Canada', 'Mexico', 'Japan', 'Germany', ...
    'Korea, South', 'United Kingdom', 'France', 'India', 'Italy', ...
    'Taiwan', 'Netherlands', 'Brazil', 'Ireland', 'Switzerland'});
title({'\fontsize{16}U.S. Imports & Exports in 2018'; ...
    '\fontsize{10}Billions of dollars--goods only. (Source: U.S. Census Bureau)'})
% Add text to bottom of chart
text(300, 0, 'Red (top) bars are Imports. Black (bottom) bars are Exports.', ...
    'VerticalAlignment', 'bottom', 'HorizontalAlignment', 'center');

Figure 5.4.2: The version of Playfair’s bar chart for U.S. exports and imports in 2018. The MATLAB code was written by Giorgio Lo. [Horizontal bar chart titled "U.S. Imports & Exports in 2018", with the subtitle "Billions of dollars--goods only. (Source: U.S. Census Bureau)". The bars are grouped by country, from China at the bottom to Switzerland at the top. Red (top) bars are Imports. Black (bottom) bars are Exports.]

5.4.3 A Pie Chart

Here is some MATLAB code for a version of Playfair’s pie chart but
applied to North America instead of the Turkish Empire. The code
generates Figure 5.4.3.
% Reproduce a version of Playfair's pie diagram for North America.
% To do this, divide up the area for North America between
% Canada, Mexico, and the United States.
% Code by Giorgio Lo
X = [9984670 1964375 9629091]; % Vector with areas
labels = {'Canada', 'Mexico', 'US'}; % Country labels
figure;
p = pie(X); % Call command to make pie chart
title({'\fontsize{18}North America', ...
    '\fontsize{8}Source: United Nations Statistics Division. 2008. Retrieved 14 October 2010.'})
% Keep the legend handle so that its font size can be set
lgd = legend(labels, 'Location', 'southoutside', 'Orientation', 'horizontal');
lgd.FontSize = 14; % Set font size for legend
text(0, 0, '21578136 KM2', 'VerticalAlignment', 'bottom', ...
    'HorizontalAlignment', 'center', 'Color', 'white', 'FontSize', 14);
text(-0.5, 0.2, 'Canada', 'VerticalAlignment', 'bottom', ...
    'HorizontalAlignment', 'center', 'Color', 'white', 'FontSize', 16);
text(0.05, -0.7, 'Mexico', 'VerticalAlignment', 'bottom', ...
    'HorizontalAlignment', 'center', 'Color', 'white', 'FontSize', 16);
text(0.5, 0.2, 'US', 'VerticalAlignment', 'bottom', ...
    'HorizontalAlignment', 'center', 'Color', 'white', 'FontSize', 16);
saveas(gcf, 'PieChart.bmp') % Save graphics file in bmp format

Figure 5.4.3: The version of Playfair’s pie chart for North America that was generated from the MATLAB program. The program was written by Giorgio Lo.
6 Deterministic Dynamics

6.1 Introduction

Often macro models specify that the state of the economy, $k_t$, evolves
according to a second-order nonlinear difference equation of the following
form:

$$k_t = D(k_{t-1}, k_{t-2}), \quad \text{for } t = 3, 4, \cdots. \tag{6.1.1}$$

The function $D$ may represent the upshot of individuals’ and firms’
dynamic choice problems, government policies, and market-clearing
conditions. By specifying the economy at an elemental level it is hoped
the function $D$ will capture the behavioral changes of individuals to
the state of the economy.
To start this difference equation off at time $t = 3$, one would need
to know both $k_1$ and $k_2$. In period 1 it is reasonable to assume that
the state of the economy has been predetermined, say at $k_{\text{initial}}$. So, one
can employ the starting condition

$$k_1 = k_{\text{initial}}.$$

Determining an appropriate value for $k_2$ is not as transparent. Most
often one would like the above difference equation to converge to a
steady state. Hence, one desires that

$$\lim_{t \to \infty} k_t = k_{\text{steady state}},$$

where $k_{\text{steady state}}$ is the long-run steady state. Thus, the goal is to solve
the above difference equation subject to two boundary conditions, one
at the beginning of time and the other at the end. This falls into a class
of problems known as two-point boundary value problems.
Three solution methods for solving this problem are presented. The
classic way of solving such problems is multiple shooting. If one knew
$k_1$ and $k_2$, then the solution for the time path $\{k_t\}_{t=1}^{\infty}$ could be
computed by just iterating on (6.1.1). As mentioned, in economics generally
only $k_1$ is known. Multiple shooting selects a value for $k_2$ such
that the economy converges over time to the long-run steady state,
$k_{\text{steady state}}$. Another algorithm is the extended-path method. This
algorithm turns the difference equation (6.1.1) on its head. Updating
equation (6.1.1) by two periods gives $k_{t+2} = D(k_{t+1}, k_t)$.$^{1}$ Rewrite this
equation as $k_{t+1} = \widetilde{D}(k_t, k_{t+2})$. For each time period $t$, the extended-path
method solves for next period’s capital stock, $k_{t+1}$, given its current
value, $k_t$, and an expectation about its future value two periods
down the road, $k_{t+2}$. The algorithm is constructed in such a way so
that upon convergence the expectation about the path for the $k_{t+2}$’s
coincides with the actual path for the $k_{t+2}$’s and also so that the economy
converges to a long-run steady state. The extended-path method
and multiple shooting are discussed in Section 6.12. The last solution
method treats (6.1.1) as a second-order linear difference equation that
comes out of a linear-quadratic optimization problem. This method is
explained in Section 6.9.

$^{1}$ That is, just add 2 to the subscripts in this equation. The updated equation holds for all $t \geq 1$.

The discussion in the chapter will be centered around the neoclassical
growth model, which is the workhorse of modern macroeconomics.
The model has its roots in work by Frank P. Ramsey (1903-1930). The
transitional dynamics for the neoclassical growth model are fully
characterized using pencil-and-paper techniques. While doing this,
Bellman’s (1957) concept of dynamic programming and the value function is
presented. Properties of the value function for the neoclassical growth
model are derived. The contraction mapping principle underlying
much of dynamic programming is discussed. This is done in an intuitive
way, at the sacrifice of some rigor. The numerical techniques
introduced are illustrated in Section 6.13 using a dynamic version of
the monopolist’s pricing problem, first introduced in Chapter 2.

6.2 The World of Robinson Crusoe

Imagine an economy inhabited by millions of people, all the same.
This will be related here in terms of a representative agent, named
Robinson Crusoe. Robinson Crusoe’s period-$t$ lifetime utility is
given by

$$\sum_{j=0}^{\infty} \beta^j U(c_{t+j}), \quad \text{with } 0 < \beta < 1,$$

where $c_{t+j}$ is the person’s consumption in period $t+j$. Utility in period
$t+j$, $U(c_{t+j})$, is discounted at the rate $\beta^j$. Since $\beta < 1$, $\beta^j$ is decreasing
in $j$ so the further off a utility is in the future the less Robinson cares
about it. Note that $\beta^j \to 0$, as $j \to \infty$.

Robinson Crusoe is the name of a famous book written by Daniel Defoe that was first published in 1719. The inscription on the title page read “The Life and Strange Surprizing Adventures of Robinson Crusoe, of York, Mariner: Who lived Eight and Twenty Years all alone in an un-inhabited Island on the Coast of America, near the Mouth of the Great River of Oroonoque; Having been cast on shore by Shipwreck, where-in all the Men perished but himself.” This is believed to be the first English novel. Interestingly, historians suggest that the book is based on the true story of the buccaneer Alexander Selkirk. After a dispute with his ship’s captain, Selkirk was left alone in 1704 on one of the Juan Fernandez Islands for four and a half years. The seaman who went ashore in 1709 to retrieve him said he found “a man clothed in goat-skins, who looked wilder than the first owners of them.”

Output in period $t+j$, or $o_{t+j}$, is produced in line with the following
constant-returns-to-scale production function

$$o_{t+j} = \widetilde{F}(k_{t+j}, h_{t+j}),$$

which uses the period-$(t+j)$ inputs, capital, $k_{t+j}$, and labor, $h_{t+j}$. To
begin with, suppose that Robinson Crusoe supplies just one fixed unit
of labor. This restriction will be relaxed in Section 6.10. In light of
this restriction, define $F(k_{t+j})$ by $F(k_{t+j}) \equiv \widetilde{F}(k_{t+j}, 1)$. The economy’s
capital stock is owned by its inhabitants. This capital depreciates at
rate $\delta$. Suppose that an individual starts off period $t+j$ owning $k_{t+j}$
units of capital. By investing the amount $i_{t+j}$, he can augment the
capital stock to $k_{t+j+1}$ in accordance with the law of motion

$$k_{t+j+1} = (1-\delta)k_{t+j} + i_{t+j}.$$

This is a version of Ramsey’s (1928) growth model.

Frank P. Ramsey (1903-1930) was a British economist and mathematician. He died at the very young age of 26. In economics he is known for his work on the growth model, optimal taxation, and subjective probability. In mathematics he started a branch of combinatorics which is now known as Ramsey theory. He was named the Senior Wrangler or the top undergraduate in mathematics at Cambridge.

Robinson Crusoe’s goal in life is to maximize his lifetime utility by
picking optimally his consumption and investment in each period. His
period-$t$ problem can be written as

$$\max_{\{c_{t+j},\, i_{t+j}\}_{j=0}^{\infty}} \sum_{j=0}^{\infty} \beta^j U(c_{t+j}),$$

subject to the economy’s resource constraint,

$$c_{t+j} + i_{t+j} = \widetilde{F}(k_{t+j}, 1) = F(k_{t+j}),$$

the law of motion for capital,

$$k_{t+j+1} = (1-\delta)k_{t+j} + i_{t+j},$$

and the initial condition, $k_t$. Robinson Crusoe’s problem has been cast
as starting in some arbitrary period, $t$. Often the first period is taken
as $t = 1$.$^{2}$ Substitute out for $c_{t+j}$ and $i_{t+j}$ in the utility functions using
the resource constraint and the law of motion for capital. The problem
now appears as

$$\underbrace{V(k_t)}_{\text{Value Function}} \equiv \max_{\{k_{t+j+1}\}_{j=0}^{\infty}} \sum_{j=0}^{\infty} \beta^j U(\underbrace{F(k_{t+j}) + (1-\delta)k_{t+j} - k_{t+j+1}}_{c_{t+j}}). \tag{6.2.1}$$

The function $V(k_t)$ gives the maximal level of lifetime utility that Robinson
Crusoe will realize if he enters period $t$ with the capital stock, $k_t$.
This is called the value function. It plays an important role in modern
macroeconomics.

$^{2}$ His period-1 problem can be written as

$$\max_{\{c_t,\, i_t\}_{t=1}^{\infty}} \sum_{t=1}^{\infty} \beta^{t-1} U(c_t),$$

subject to $c_t + i_t = F(k_t)$ and $k_{t+1} = (1-\delta)k_t + i_t$. To see this more formally, set $t = 1$ in the problem in the main text to get

$$\max_{\{c_{1+j},\, i_{1+j}\}_{j=0}^{\infty}} \sum_{j=0}^{\infty} \beta^j U(c_{1+j}),$$

subject to $c_{1+j} + i_{1+j} = F(k_{1+j})$ and $k_{1+j+1} = (1-\delta)k_{1+j} + i_{1+j}$. Now, do a change of variable by setting $t = 1 + j$. Note that if $j$ starts at 0 then $t$ must start at 1. The period-1 problem then obtains.

6.3 The Euler Equation

Attention is now directed toward obtaining a solution to Robinson
Crusoe’s problem (6.2.1). Note that $k_{t+j+1}$ appears exactly twice in the
maximand of problem (6.2.1), at time $t+j$ and $t+j+1$. Specifically, it
appears in the two terms shown below

$$\cdots + \beta^j U(F(k_{t+j}) + (1-\delta)k_{t+j} - k_{t+j+1}) + \beta^{j+1} U(F(k_{t+j+1}) + (1-\delta)k_{t+j+1} - k_{t+j+2}) + \cdots$$

By maximizing with respect to $k_{t+j+1}$, the following set of first-order
conditions can be obtained

$$U_1(F(k_{t+j}) + (1-\delta)k_{t+j} - k_{t+j+1}) = [F_1(k_{t+j+1}) + (1-\delta)] \times \beta U_1(F(k_{t+j+1}) + (1-\delta)k_{t+j+1} - k_{t+j+2}), \quad \text{for } j = 0, 1, \cdots. \tag{6.3.1}$$

This is an infinite dimensional system of equations. The above formula is Robinson Crusoe’s Euler equation. The Euler equation plays a central role in modern macroeconomics. It characterizes the consumption/investment decision. The lefthand side represents the cost of investing in an extra unit of capital. Robinson Crusoe must give up one unit of consumption to do this, which has a period $t+j$ utility cost of $U_1(F(k_{t+j}) + (1-\delta)k_{t+j} - k_{t+j+1})$. The righthand side gives the benefit from investing in an extra unit of capital. Output will increase in period $t+j+1$ by the amount $F_1(k_{t+j+1})$. Plus, Crusoe will still have $1-\delta$ units of capital left over after depreciation. An extra unit of period $t+j+1$ consumption is worth $\beta U_1(F(k_{t+j+1}) + (1-\delta)k_{t+j+1} - k_{t+j+2})$ in utility terms. Therefore, investing in an extra unit of capital in period $t+j$ has a utility benefit of $[F_1(k_{t+j+1}) + (1-\delta)] \times \beta U_1(F(k_{t+j+1}) + (1-\delta)k_{t+j+1} - k_{t+j+2})$.

Margin note: Leonhard Euler (1707-1783) is a famous mathematician of Swiss descent. He is known for many things, but his work on fluid dynamics led him to study ordinary and partial differential equations.
Now,

$$\frac{U_1(\overbrace{F(k_{t+j}) + (1-\delta)k_{t+j} - k_{t+j+1}}^{c_{t+j}})}{U_1(\underbrace{F(k_{t+j+1}) + (1-\delta)k_{t+j+1} - k_{t+j+2}}_{c_{t+j+1}})} \gtreqless 1 \quad \text{as} \quad \beta[F_1(k_{t+j+1}) + (1-\delta)] \gtreqless 1.$$

So, when the return on capital accumulation, $F_1(k_{t+j+1}) + (1-\delta)$, exceeds (falls short of) the representative agent’s gross rate of subjective time preference, or $1/\beta$, consumption must be growing (dropping) over time. (The net rate of time preference is defined later on.) This occurs because $U_1(c_{t+j})/U_1(c_{t+j+1}) > 1$, when $\beta[F_1(k_{t+j+1}) + (1-\delta)] > 1$, which implies that $c_{t+j+1} > c_{t+j}$ due to the assumption of diminishing marginal utility. The above equation brings up the notion of the elasticity of intertemporal substitution, or how willing a person is to substitute consumption across time in response to changes in the interest rate.

Definition 14. (Elasticity of intertemporal substitution) Let $U(c) = c^{1-\rho}/(1-\rho)$, with $\rho \ge 0$, and define the gross real interest rate, $r$, by $r \equiv F_1(k) + (1-\delta)$. The Euler equation (6.3.1) implies

$$\left(\frac{c_{t+j+1}}{c_{t+j}}\right)^{\rho} = \beta r_{t+j}.$$

Thus, the gross growth rate in consumption, $c_{t+j+1}/c_{t+j}$, rises with the real interest rate, $r_{t+j}$. It is easy to calculate that

$$\frac{r_{t+j}}{c_{t+j+1}/c_{t+j}} \frac{d(c_{t+j+1}/c_{t+j})}{dr_{t+j}} = \frac{1}{\rho},$$

where $1/\rho$ is the elasticity of intertemporal substitution. The bigger $1/\rho$ is, the larger the impact that a change in the interest rate, $r_{t+j}$, has on the growth rate of consumption, $c_{t+j+1}/c_{t+j}$. The elasticity of intertemporal substitution will be returned to in Chapter 8.
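The elasticity formula can be checked numerically. The following is a minimal Python sketch, not code from the book: the function names and the parameter values (`beta = 0.95`, `r = 1.05`) are illustrative assumptions. It computes consumption growth from $(c_{t+j+1}/c_{t+j})^{\rho} = \beta r_{t+j}$ and approximates the elasticity with a central finite difference.

```python
def consumption_growth(r, beta=0.95, rho=2.0):
    """Gross consumption growth implied by (c'/c)**rho = beta * r.

    beta and rho are illustrative values, not calibrated ones.
    """
    return (beta * r) ** (1.0 / rho)


def eis_numeric(r, beta=0.95, rho=2.0, h=1e-6):
    """Elasticity of consumption growth with respect to r (central difference)."""
    growth = consumption_growth(r, beta, rho)
    slope = (consumption_growth(r + h, beta, rho)
             - consumption_growth(r - h, beta, rho)) / (2.0 * h)
    return r / growth * slope


# Compare the numerical elasticity with 1/rho for a few curvature values.
for rho in (0.5, 2.0, 5.0):
    print(rho, eis_numeric(1.05, rho=rho), 1.0 / rho)
```

For each value of $\rho$ the two printed numbers should agree up to finite-difference error, and the result should not depend on the interest rate at which the elasticity is evaluated.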

It’s interesting to compare equation (6.3.1) with what Ramsey (1928)

derived–see Figure 6.3.1.

Figure 6.3.1: This is the Ramsey (1928) Euler equation. In Ramsey’s notation x(t) is period-t consumption and u(x(t)) is its marginal utility. The marginal product of capital is given by ∂f/∂c, where f is the production function and c is capital. Ramsey (1928) did not discount the future and capital did not depreciate, which explains the difference between his Euler equation and the modern version (6.3.1). So, in Ramsey’s model, the marginal utility of consumption had to decline over time, as stated by his equation (3).

The Euler equation also represents a 2nd-order nonlinear difference equation, which can be represented implicitly as

$$k_{t+j+2} = D(k_{t+j+1}, k_{t+j}), \quad \text{for all } j \ge 0.$$

If one knew $k_t$ and $k_{t+1}$ then the above difference equation could be used to solve for $k_{t+2}$ (by setting $j = 0$). One now has $k_{t+2}$ and $k_{t+1}$ in hand that can be used to get $k_{t+3}$. By iterating in this fashion, the entire time path for the $k_{t+j}$’s can be uncovered. That is, one could proceed as follows:

$$\begin{aligned}
k_{t+2} &= D(k_{t+1}, k_t), \\
k_{t+3} &= D(k_{t+2}, k_{t+1}) = D(\underbrace{D(k_{t+1}, k_t)}_{k_{t+2}}, k_{t+1}), \\
k_{t+4} &= D(k_{t+3}, k_{t+2}) = D(\underbrace{D(D(k_{t+1}, k_t), k_{t+1})}_{k_{t+3}}, \underbrace{D(k_{t+1}, k_t)}_{k_{t+2}}), \\
&\;\;\vdots
\end{aligned}$$

This procedure generates a sequence for the $k_{t+j+2}$’s that is a function of the starting values $k_t$ and $k_{t+1}$. Observe that on the righthand side of the above expression things can always be written just in terms of $k_t$ and $k_{t+1}$, even though the resulting expressions are ugly in form. Normally, one knows the starting value for capital, $k_t$. The trouble is that a condition needs to be found to pin down $k_{t+1}$. Suppose that the capital stock converges to the unique steady-state value, $k^*$. How the steady-state value for the capital stock is determined is discussed in the next section. The steady-state value for the capital stock can be used to tie down $k_{t+1}$. Specifically, one needs to find the value of $k_{t+1}$ such that $\lim_{j\to\infty} k_{t+j+2} = k^*$. This is called a two-point boundary value problem. The time path is pinned down by an initial condition, $k_t$, and a terminal condition, $k^*$. This idea forms the basis of the multiple shooting algorithm discussed in Section 6.12.2.
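The boundary-value idea can be illustrated with a simple (single) shooting sketch in Python. This is not the multiple shooting algorithm of Section 6.12.2, only a hedged illustration of pinning down $k_{t+1}$ by bisection. It assumes log utility, the production function $F(k) = k^{\alpha}$, full depreciation ($\delta = 1$), and a starting capital stock below $k^*$; all function names and parameter values are hypothetical. Under exactly these assumptions the optimal choice is known in closed form, $k_{t+1} = \alpha\beta k_t^{\alpha}$, which provides a check.

```python
ALPHA, BETA, DELTA = 0.3, 0.95, 1.0  # assumed parameter values (full depreciation)


def resources(k):
    """Output plus undepreciated capital, F(k) + (1 - delta) * k."""
    return k ** ALPHA + (1.0 - DELTA) * k


def steady_state():
    """k* solving F1(k*) = 1/beta - 1 + delta for F(k) = k**alpha."""
    return (ALPHA / (1.0 / BETA - 1.0 + DELTA)) ** (1.0 / (1.0 - ALPHA))


def D(k_next, k_now):
    """k_{t+2} = D(k_{t+1}, k_t) from the Euler equation with U(c) = ln(c)."""
    c_now = resources(k_now) - k_next
    gross_return = ALPHA * k_next ** (ALPHA - 1.0) + 1.0 - DELTA
    c_next = BETA * gross_return * c_now  # since U1(c) = 1/c
    return resources(k_next) - c_next


def classify(k0, k1, k_star, horizon=200):
    """Return +1 if the path overshoots k*, -1 if capital stops rising, 0 otherwise."""
    k_now, k_next = k0, k1
    for _ in range(horizon):
        k_after = D(k_next, k_now)
        if k_after <= k_next:   # too much consumption: the guess for k1 was too low
            return -1
        if k_after > k_star:    # too much saving: the guess for k1 was too high
            return 1
        k_now, k_next = k_next, k_after
    return 0


def shoot(k0, tol=1e-12):
    """Bisect on k1 so that the implied path heads to k* (assumes k0 < k*)."""
    k_star = steady_state()
    lo, hi = k0, min(k_star, resources(k0))
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        verdict = classify(k0, mid, k_star)
        if verdict > 0:
            hi = mid
        elif verdict < 0:
            lo = mid
        else:
            return mid
    return 0.5 * (lo + hi)


k0 = 0.05
print("shooting k1:   ", shoot(k0))
print("closed-form k1:", ALPHA * BETA * k0 ** ALPHA)
```

Single shooting of this kind is numerically fragile over long horizons because small errors in the guess are amplified each period, which is one motivation for more robust schemes.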

6.4 The Steady State

In a steady state the capital stock will be constant at some level denoted
by k∗ . Therefore, in a steady state k t+ j = k t+ j+1 = k t+ j+2 = · · · = k∗ .
This implies that ct+ j = ct+ j+1 = ct+ j+2 = · · · and hence U1 (ct+ j ) =
U1 (ct+ j+1 ) = U1 (ct+ j+2 ) = · · · . So, in this situation the above Euler
equation reduces to

β[ F1 (k∗ ) + (1 − δ)] = 1.

Therefore, k∗ is determined by

F1 (k∗ ) = 1/β − 1 + δ.

Figure 6.4.1 illustrates the situation. Express the discount factor β by

β = 1/(1 + ι), where 1 + ι is the gross rate of time preference and ι is
the net rate. Then, in a steady state

F1 (k∗ ) = ι + δ,

the marginal product of capital is equal to the (net) rate of time pref-
erence, ι, plus the depreciation rate, δ. Since F1 is a strictly decreasing
function of k, this (nontrivial) steady state is unique.
Example 15. (Steady-state capital stock with a Cobb-Douglas production function) Let production be described by the Cobb-Douglas production function $o = k^{\alpha}$. Then, $\alpha (k^*)^{\alpha-1} = \iota + \delta$. The steady-state stock of capital, $k^*$, is then given by $k^* = [\alpha/(\iota + \delta)]^{1/(1-\alpha)}$. The steady-state capital stock declines with the cost of capital, $\iota + \delta$.
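As a quick check of the formula, here is a minimal Python sketch (the function names and the parameter values are illustrative assumptions). It computes $k^*$ from the closed form and, separately, by bisection on $F_1(k) = \iota + \delta$, which only uses the fact that $F_1$ is strictly decreasing.

```python
def marginal_product(k, alpha=0.3):
    """F1(k) for the Cobb-Douglas production function k**alpha."""
    return alpha * k ** (alpha - 1.0)


def steady_state_closed_form(alpha=0.3, beta=0.95, delta=0.1):
    """k* = [alpha / (iota + delta)]**(1 / (1 - alpha)), with iota = 1/beta - 1."""
    iota = 1.0 / beta - 1.0
    return (alpha / (iota + delta)) ** (1.0 / (1.0 - alpha))


def steady_state_bisection(alpha=0.3, beta=0.95, delta=0.1, lo=1e-8, hi=1e6):
    """Solve F1(k) = 1/beta - 1 + delta by bisection on the bracket [lo, hi]."""
    target = 1.0 / beta - 1.0 + delta
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if marginal_product(mid, alpha) > target:
            lo = mid  # F1 is still above the cost of capital, so k* lies to the right
        else:
            hi = mid
    return 0.5 * (lo + hi)


print(steady_state_closed_form(), steady_state_bisection())
# A higher depreciation rate raises the cost of capital and lowers k*.
print(steady_state_closed_form(delta=0.2))
```

The bisection version is useful when the production function has no closed-form inverse for its marginal product.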

Figure 6.4.1: The diagram illustrates how the steady-state stock of capital, $k^*$, is determined. The function $F_1(k)$, which specifies the marginal product of capital, is strictly decreasing because the production function, $F(k)$, is strictly concave in the capital stock, $k$; i.e., there are diminishing marginal returns. This schedule defines the demand for capital. Often the conditions $\lim_{k\to 0} F_1(k) = \infty$ and $\lim_{k\to\infty} F_1(k) = 0$ are imposed. These conditions guarantee that a solution will exist, because $F_1(k)$ must start off above the horizontal $1/\beta - 1 + \delta$ line and end up below. When a solution does exist, it is unique because $F_1(k)$ is downward sloping and can only cross the horizontal $1/\beta - 1 + \delta$ line once. Note that $F_1(k_{t+j}) \gtreqless 1/\beta - 1 + \delta$ as $k_{t+j} \lesseqgtr k^*$.

6.5 Dynamic Programming Formulation

The above optimization problem can be formulated as a dynamic programming problem. In problem (6.2.1) there are an infinite number of choice variables, $\{k_{t+j+1}\}_{j=0}^{\infty}$. Bellman (1957) noted that large problems, such as this, suffer from “the curse of dimensionality.” His solution was to break down such gigantic problems into a set of smaller simpler problems. In line with this idea, problem (6.2.1) can be recast in terms of a smaller problem for each period $t+j$, which has just one choice variable, $k_{t+j+1}$. There are effectively an infinite number of these small problems, one for each $t+j$, but they are often easy to compute.

Margin note: Dynamic programming was introduced in 1953 by the famous applied mathematician Richard E. Bellman (1920-1984) while he was working at the Rand Corporation. It is an important tool in both economics and engineering.
To see this, update problem (6.2.1) by one period (by shifting $t$ to $t+1$) to get Robinson Crusoe’s problem at time $t+1$:

$$\begin{aligned}
V(k_{t+1}) &\equiv \max_{\{k_{t+j+2}\}_{j=0}^{\infty}} \sum_{j=0}^{\infty} \beta^{j} U(F(k_{t+j+1}) + (1-\delta)k_{t+j+1} - k_{t+j+2}) \\
&= \max_{\{k_{t+j+1}\}_{j=1}^{\infty}} \sum_{j=1}^{\infty} \beta^{j-1} U(F(k_{t+j}) + (1-\delta)k_{t+j} - k_{t+j+1}).
\end{aligned} \tag{6.5.1}$$

Now observe that Robinson Crusoe’s time-$t$ problem (6.2.1) can be written as

$$\begin{aligned}
V(k_t) &\equiv \max_{\{k_{t+j+1}\}_{j=0}^{\infty}} \sum_{j=0}^{\infty} \beta^{j} U(F(k_{t+j}) + (1-\delta)k_{t+j} - k_{t+j+1}) \\
&= \max_{k_{t+1}} \Big\{ U(F(k_t) + (1-\delta)k_t - k_{t+1}) + \max_{\{k_{t+j+1}\}_{j=1}^{\infty}} \Big\{ \beta \sum_{j=1}^{\infty} \beta^{j-1} U(F(k_{t+j}) + (1-\delta)k_{t+j} - k_{t+j+1}) \Big\} \Big\} \\
&= \max_{k_{t+1}} \{ U(F(k_t) + (1-\delta)k_t - k_{t+1}) + \beta V(k_{t+1}) \} \quad \text{[using (6.5.1)].}
\end{aligned} \tag{6.5.2}$$

Note how $k_{t+1}$ can be separated from the inner maximization problem. This can only be done since the return function $U(F(k_t) + (1-\delta)k_t - k_{t+1})$ doesn’t involve future values of the control variable, here $\{k_{t+j+1}\}_{j=1}^{\infty}$. This allows the maximization to be separated into two maximization operations, with the max operator in the outer problem cascading over the one in the inner problem.
The agent’s period-$t$ dynamic programming problem is

$$V(k_t) = \max_{k_{t+1}} \{ U(F(k_t) + (1-\delta)k_t - k_{t+1}) + \beta V(k_{t+1}) \}.$$

This is called the Bellman equation. (To get the period $t+j$ problem just rewrite $t$ as $t+j$.) Effectively, this is just a two-period problem; viz., today and the future. The future is encapsulated in the function $V(k_{t+1})$. This function gives the maximal level of lifetime utility that can be obtained in period $t+1$ contingent on Robinson having the capital stock $k_{t+1}$.
The first-order condition for optimality in the period-$t$ problem is

$$U_1(F(k_t) + (1-\delta)k_t - k_{t+1}) = \beta V_1(k_{t+1}). \tag{6.5.3}$$

This will determine his consumption-savings decision. The lefthand side of this equation is the marginal cost associated with doing an extra unit of investment in period $t$. An extra unit of investment reduces consumption by one, which in turn reduces period-$t$ utility by $U_1(F(k_t) + (1-\delta)k_t - k_{t+1})$. The righthand side is the marginal benefit. In particular, $V_1(k_{t+1})$ is the increase in lifetime utility from having an extra unit of capital in period $t+1$. Since this gain occurs one period down the road, it should be discounted by $\beta$. Equation (6.5.3) is one equation in one unknown, $k_{t+1}$. The solution to the above first-order condition will have the form

$$k_{t+1} = K(k_t).$$

The function $K$ is known as a decision rule. This is a first-order difference equation. Some properties of this decision rule are discussed further below. The steady-state capital stock, $k^*$, must satisfy the condition

$$k^* = K(k^*).$$
The discussion now turns to establishing two properties of the value

function, V (k t ). First, it will be shown that V (k t ) is strictly increasing
in k t . Second, it will be demonstrated that V (k t ) is strictly concave in
kt .

6.5.1 Differentiation of the Value Function using the Envelope Theorem

Next, differentiate both sides of (6.5.2) with respect to $k_t$ to get

$$\begin{aligned}
V_1(k_t) &= U_1(F(k_t) + (1-\delta)k_t - k_{t+1})[F_1(k_t) + (1-\delta)] \\
&\quad + \underbrace{[-U_1(F(k_t) + (1-\delta)k_t - k_{t+1}) + \beta V_1(k_{t+1})]}_{=0} \times dk_{t+1}/dk_t \\
&= U_1(F(k_t) + (1-\delta)k_t - k_{t+1})[F_1(k_t) + (1-\delta)] > 0.
\end{aligned} \tag{6.5.4}$$

The bracketed term multiplying $dk_{t+1}/dk_t$ vanishes because of the first-order condition (6.5.3) for the maximization. This is called the envelope theorem. (For a simple explanation of the envelope theorem, see Chapter A.) The result implies that $V(k_t)$ is increasing in the capital stock. So, not surprisingly, Robinson Crusoe is better off in period $t$ the more capital, $k_t$, he has. This result is stated now as a lemma.

Lemma 16. (Value function is strictly increasing) V (k t ) is strictly increas-

ing.

The above expression can be updated to period $t+1$ by changing the time subscripts on the variables. Doing this gives

$$V_1(k_{t+1}) = U_1(F(k_{t+1}) + (1-\delta)k_{t+1} - k_{t+2})[F_1(k_{t+1}) + (1-\delta)].$$

Using this on the righthand side of (6.5.3) yields

$$U_1(F(k_t) + (1-\delta)k_t - k_{t+1}) = \beta U_1(F(k_{t+1}) + (1-\delta)k_{t+1} - k_{t+2})[F_1(k_{t+1}) + (1-\delta)],$$

or equivalently, by rewriting $t$ as $t+j$,

$$U_1(F(k_{t+j}) + (1-\delta)k_{t+j} - k_{t+j+1}) = \beta U_1(F(k_{t+j+1}) + (1-\delta)k_{t+j+1} - k_{t+j+2})[F_1(k_{t+j+1}) + (1-\delta)].$$

This is the Euler equation (6.3.1) again. So, the two approaches yield the same solution to the optimization problem.

6.5.2 Concavity of the Value Function

Last, the function $V(k_t)$ is strictly concave.

Definition 17. (Strict concavity) A function $V : K^n \to R$ is strictly concave if

$$V(\theta k_1 + (1-\theta)k_2) > \theta V(k_1) + (1-\theta)V(k_2),$$

for all $k_1, k_2 \in K^n$ such that $k_1 \neq k_2$ and $\theta \in (0,1)$. A function $V : K^n \to R$ is concave if $V(\theta k_1 + (1-\theta)k_2) \ge \theta V(k_1) + (1-\theta)V(k_2)$, for all $k_1, k_2 \in K^n$ such that $k_1 \neq k_2$ and $\theta \in (0,1)$. Note in general that $k$ can be an $n$-dimensional vector, even though this case is not analyzed here. Figure 6.5.1 illustrates the definition for the situation where $n = 1$.

Figure 6.5.1: The figure illustrates how $V(k_\theta) > \theta V(k_1) + (1-\theta)V(k_2)$ when $V(k)$ is strictly concave, where $k_\theta = \theta k_1 + (1-\theta)k_2$. Let the equation for the dashed straight line through the points $(k_1, V(k_1))$ and $(k_2, V(k_2))$ be $a + bk$. It is easy to see that $\theta V(k_1) = \theta a + \theta b k_1$ and $(1-\theta)V(k_2) = (1-\theta)a + (1-\theta)b k_2$. Hence, $\theta V(k_1) + (1-\theta)V(k_2) = a + b k_\theta$, as shown. Although irrelevant, $a = [V(k_1)k_2 - V(k_2)k_1]/(k_2 - k_1)$ and $b = [V(k_2) - V(k_1)]/(k_2 - k_1)$.

Lemma 18. (Value function is strictly concave) $V(k_t)$ is strictly concave.

Proof. Consider two points, viz. $k_t^1$ and $k_t^2$. Take a convex combination of these two points; let $k_t^\theta = \theta k_t^1 + (1-\theta)k_t^2$, for $\theta \in (0,1)$. It needs to be shown that $\theta V(k_t^1) + (1-\theta)V(k_t^2) < V(k_t^\theta)$. Now, let $k_{t+j}^\theta = \theta k_{t+j}^{*1} + (1-\theta)k_{t+j}^{*2}$, where $k_{t+j}^{*1}$ and $k_{t+j}^{*2}$ are the optimal solutions for $k_{t+j}$ in (6.2.1) starting off from the initial conditions $k_t = k_t^1$ and $k_t = k_t^2$, respectively. (So, to be clear, in this proof an asterisk does not refer to the steady-state value for capital.) It will be shown that the $k_{t+j}^\theta$’s are feasible solutions when starting off from $k_t^\theta$. Then,

$$\begin{aligned}
\theta V(k_t^1) + (1-\theta)V(k_t^2) &= \theta \max_{\{k_{t+j+1}^1\}_{j=0}^{\infty}} \sum_{j=0}^{\infty} \beta^j U(F(k_{t+j}^1) + (1-\delta)k_{t+j}^1 - k_{t+j+1}^1) \\
&\quad + (1-\theta) \max_{\{k_{t+j+1}^2\}_{j=0}^{\infty}} \sum_{j=0}^{\infty} \beta^j U(F(k_{t+j}^2) + (1-\delta)k_{t+j}^2 - k_{t+j+1}^2) \\
&= \theta \sum_{j=0}^{\infty} \beta^j U(F(k_{t+j}^{*1}) + (1-\delta)k_{t+j}^{*1} - k_{t+j+1}^{*1}) \\
&\quad + (1-\theta) \sum_{j=0}^{\infty} \beta^j U(F(k_{t+j}^{*2}) + (1-\delta)k_{t+j}^{*2} - k_{t+j+1}^{*2}).
\end{aligned}$$

Now concavity of the utility and production functions on the righthand side, in addition to assuming feasibility (shown below), allows this to be rewritten as

$$\theta V(k_t^1) + (1-\theta)V(k_t^2) < \sum_{j=0}^{\infty} \beta^j U(F(k_{t+j}^\theta) + (1-\delta)k_{t+j}^\theta - k_{t+j+1}^\theta).$$

Last, note that on the righthand side of the above expression $k_{t+j}^\theta$ and $k_{t+j+1}^\theta$ need not be optimal, so that

$$\theta V(k_t^1) + (1-\theta)V(k_t^2) < \sum_{j=0}^{\infty} \beta^j U(F(k_{t+j}^{*\theta}) + (1-\delta)k_{t+j}^{*\theta} - k_{t+j+1}^{*\theta}) = V(k_t^\theta),$$

where the $k_{t+j}^{*\theta}$’s denote the optimal solutions when starting off from $k_t^\theta$.

It needs to be shown that $k_{t+j}^\theta$ is a feasible solution. Note that if $0 < k_{t+j+1}^{*1} < F(k_{t+j}^{*1}) + (1-\delta)k_{t+j}^{*1}$ and $0 < k_{t+j+1}^{*2} < F(k_{t+j}^{*2}) + (1-\delta)k_{t+j}^{*2}$, then $0 < k_{t+j+1}^\theta < F(k_{t+j}^\theta) + (1-\delta)k_{t+j}^\theta$, by the concavity of $F$. Therefore, $\{k_{t+j+1}^\theta\}_{j=0}^{\infty}$ is a feasible solution for the problem associated with $V(k_t^\theta)$, even though it may not be optimal.

So, before proceeding on to analyzing the transitional dynamics for the

neoclassical growth model, it has been shown that the value function,
V (k t ), is strictly increasing, and strictly concave in k t .

6.6 Consumption Smoothing

Imagine that Robinson Crusoe gets a tiny bit more capital in period t.
This will increase his period-t output by F1 (k t ) + (1 − δ). One would
expect that he will consume some of this windfall increase in output
and that he will save the rest. That is, one might expect that

0 < K1 (k t ) < F1 (k t ) + (1 − δ). (6.6.1)

I.e., the fact that $K_1(k_t) > 0$ implies that Robinson must be saving some of the windfall increase in capital, while the condition $K_1(k_t) < F_1(k_t) + (1-\delta)$ means that he must be consuming part of it.

Property (6.6.1) can be derived in two ways. The first approach uses the first-order condition connected with the dynamic programming problem (6.5.2). Update (6.5.3) to get

$$U_1(F(k_{t+j}) + (1-\delta)k_{t+j} - k_{t+j+1}) = \beta V_1(k_{t+j+1}). \tag{6.6.2}$$

By totally differentiating this equation it can be seen that

$$0 < \frac{dk_{t+j+1}}{dk_{t+j}} = \underbrace{\frac{U_{11}(\cdot_{t+j})}{U_{11}(\cdot_{t+j}) + \beta V_{11}(\cdot_{t+j+1})}}_{<1} [F_1(\cdot_{t+j}) + 1 - \delta] < F_1(\cdot_{t+j}) + 1 - \delta. \tag{6.6.3}$$

The notation $\cdot_{t+j}$ signifies that the arguments of the function are being evaluated at time $t+j$. Note that the above derivation assumes that the value function is twice continuously differentiable; there may be some points where this is not the case. Equation (6.6.3) implies that period-$(t+j)$ consumption, $c_{t+j}$, will rise because $dk_{t+j+1}/dk_{t+j} < F_1(\cdot_{t+j}) + 1 - \delta$. So, the representative agent is both consuming and saving (since $0 < dk_{t+j+1}/dk_{t+j}$) some of the income resulting from an increase in the period-$(t+j)$ capital stock. Now, by updating the above formula it follows that $0 < dk_{t+j+2}/dk_{t+j+1} < F_1(\cdot_{t+j+1}) + 1 - \delta$. Therefore, the increase in $k_{t+j}$ will also cause both $c_{t+j+1}$ and $k_{t+j+2}$ to rise. And, the increase in $k_{t+j+2}$ will induce $c_{t+j+2}$ and $k_{t+j+3}$ to move up and so on. So, the effect of an increase in the period-$(t+j)$ capital stock will propagate throughout the entire future, increasing consumption and the capital stock in every period.
The second approach employs the Euler equation (6.3.1).³ To show that property (6.6.1) is consistent with the Euler equation (6.3.1), substitute out for $k_{t+j+2}$ using the updated relationship $k_{t+j+2} = K^i(k_{t+j+1})$ to get

$$U_1(F(k_{t+j}) + (1-\delta)k_{t+j} - k_{t+j+1}) = \beta[F_1(k_{t+j+1}) + (1-\delta)] \times U_1(F(k_{t+j+1}) + (1-\delta)k_{t+j+1} - K^i(k_{t+j+1})).$$

Here $K^i(k_{t+j+1})$ is some guess for $K(k_{t+j+1})$. This is one equation in one unknown, $k_{t+j+1}$. As an induction hypothesis at stage $i$, suppose that

$$0 \le K_1^i \le F_1 + 1 - \delta. \tag{6.6.4}$$

It will now be shown that this implies

$$0 < \frac{dk_{t+j+1}}{dk_{t+j}} < F_1 + 1 - \delta.$$

Totally differentiate the above Euler equation to get

$$\frac{dk_{t+j+1}}{dk_{t+j}} = \frac{U_{11}(\cdot_{t+j})[F_1(\cdot_{t+j}) + 1 - \delta]}{\Delta} = \frac{U_{11}(\cdot_{t+j})}{\Delta}[F_1(\cdot_{t+j}) + 1 - \delta],$$

where

$$\Delta \equiv U_{11}(\cdot_{t+j}) + \beta F_{11}(\cdot_{t+j+1})U_1(\cdot_{t+j+1}) + \beta[F_1(\cdot_{t+j+1}) + (1-\delta)]U_{11}(\cdot_{t+j+1})[F_1(\cdot_{t+j+1}) + 1 - \delta - K_1^i(\cdot_{t+j+1})].$$

Then, it is easy to see that

$$0 < \frac{dk_{t+j+1}}{dk_{t+j}} = K_1^{i+1} < F_1 + 1 - \delta,$$

because $0 < U_{11}(\cdot_{t+j})/\Delta < 1$. Hence, the property is self-fulfilling. Now, $\lim_{i\to\infty} K_1^i = K_1$. Thus,

$$0 < \frac{dk_{t+j+1}}{dk_{t+j}} = K_1 < F_1 + 1 - \delta.$$

³ The second approach can be skipped for those not interested in formalities.

6.7 Dynamics

The dynamics of the neoclassical growth model are developed now.

The analysis starts off with Lemma 19 that states if one starts off in
the current period t + j with a capital stock, k t+ j , that lies below the
steady-state capital stock, k∗ , then next period’s capital stock, k t+ j+1 ,
will be bigger than the current one. This will be useful for showing
monotonic convergence toward the steady-state capital stock.

Lemma 19. (If below the steady state, then rise, while if above, then fall.) If $k_{t+j} < k^*$, then $k_{t+j} < k_{t+j+1}$, and if $k_{t+j} > k^*$, then $k_{t+j} > k_{t+j+1}$.

Proof. Recall that the value function is strictly concave so that $V_1(k_{t+j})$ is strictly decreasing in $k_{t+j}$. Therefore, $V_1(k_{t+j}) \gtreqless V_1(k_{t+j+1})$ as $k_{t+j} \lesseqgtr k_{t+j+1}$ (see Figure 6.7.1). Hence,

$$[V_1(k_{t+j}) - V_1(k_{t+j+1})](k_{t+j} - k_{t+j+1}) < 0.$$

By using the envelope theorem, it was shown that $V_1(k_{t+j}) = U_1(F(k_{t+j}) + (1-\delta)k_{t+j} - k_{t+j+1})[F_1(k_{t+j}) + (1-\delta)]$; this is an updated version of (6.5.4). The first-order condition for $k_{t+j+1}$, or equation (6.6.2), also implied that $V_1(k_{t+j+1}) = U_1(F(k_{t+j}) + (1-\delta)k_{t+j} - k_{t+j+1})/\beta$. Plugging these two expressions into the above condition, and dividing by $U_1 > 0$, gives

$$[F_1(k_{t+j}) + (1-\delta) - 1/\beta](k_{t+j} - k_{t+j+1}) < 0.$$

Suppose $k_{t+j} < k^*$. Then, $F_1(k_{t+j}) + (1-\delta) - 1/\beta > 0$ (see Figure 6.4.1). But, this implies $k_{t+j} - k_{t+j+1} < 0$ or $k_{t+j+1} > k_{t+j}$. Conversely, if $k_{t+j} > k^*$ then $F_1(k_{t+j}) + (1-\delta) - 1/\beta < 0$, so that $k_{t+j} > k_{t+j+1}$.

The dynamics for the Ramsey growth model can now be characterized, with Lemma 19 in hand. They are portrayed in Figure 6.7.2.

Figure 6.7.1: The graph illustrates that if $k_{t+j} < k_{t+j+1}$, then $V_1(k_{t+j}) > V_1(k_{t+j+1})$. This transpires because the value function, $V(k)$, is strictly concave in the capital stock, $k$, implying that $V_{11}(k) < 0$.

From (6.6.3) the decision rule for capital, $k_{t+j+1} = K(k_{t+j})$, is strictly increasing. Along the 45 degree line $k_{t+j+1} = k_{t+j}$. Hence, a steady state is located at points where the $K$ function crosses this line. There can only be one nontrivial steady state, as was discussed above. When $k_{t+j}$ is below $k^*$ the function $K(k_{t+j})$ lies above the 45 degree line, by Lemma 19. Note that in this situation, the function $K(k_{t+j})$ cannot return a value for $k_{t+j+1}$ greater than $k^*$. If it did, then the function $K$ would have to turn down to attain the steady state, which can’t happen because $K$ is strictly increasing. When $k_{t+j}$ is above $k^*$ the function $K(k_{t+j})$ lies below the 45 degree line. It must cut the 45 degree line from above due to Lemma 19. This implies that at the steady state

$$0 < \frac{dk_{t+j+1}}{dk_{t+j}} < 1, \tag{6.7.1}$$

because the slope of the 45 degree line is one. A local solution for $dk_{t+j+1}/dk_{t+j}$ around the nontrivial steady state is given in Section 9.5. This is obtained by linearizing the Euler equation (6.3.1) around the (unique nontrivial) steady state. It will be reaffirmed then that (6.7.1) holds by examining the linear difference equation that arises from the linearized Euler equation.
To conclude, the model’s transitional dynamics are as displayed by Figure 6.7.2. When starting off below the steady state the capital stock monotonically increases until it converges to its steady-state value. Along the transition path toward the steady state, the interest rate steadily falls. To see this, note that the period-$(t+j)$ gross interest rate is given by

$$\frac{U_1(F(k_{t+j}) + (1-\delta)k_{t+j} - k_{t+j+1})}{\beta U_1(F(k_{t+j+1}) + (1-\delta)k_{t+j+1} - k_{t+j+2})} = F_1(k_{t+j+1}) + 1 - \delta.$$

The term on the left is the amount of period-$(t+j+1)$ consumption that the person must receive in order to sacrifice a unit of period-$(t+j)$ consumption. This is set equal to the gross return from investing in capital in period $t+j$, or the term on the right. Clearly, the term on the right decreases over time as the capital stock increases. The story is reversed when starting off from above the steady state.

Figure 6.7.2: Transitional Dynamics. The economy starts off in period 1 with the capital stock $k_1$, displayed on the horizontal axis. Using the decision rule, $k_{t+j+1} = K(k_{t+j})$, this implies that the capital stock in period 2 will be $k_2$, as shown on the vertical axis. To move forward in time to period 2, reflect the period-2 capital stock, $k_2$, onto the horizontal axis using the 45° line. It can then be seen, by using the decision rule again, that the period-3 capital stock will be $k_3$, as given on the vertical axis. The capital stock will keep rising in a monotone fashion until the steady-state value of capital, $k^*$, is reached. (The diagram plots $k_{t+j+1}$ against $k_{t+j}$, showing the 45° line and the decision rule $k_{t+j+1} = K(k_{t+j})$.)

Situations such as those shown in Figure 6.7.3 are ruled out by the uniqueness property. If a second (non-trivial) steady state did exist (which it does not), then it would have to be unstable. At the second steady state the policy function cuts the 45 degree line from below, implying $K_1 > 1$. The above lemma established that this can’t happen. To see why, observe from Figure 6.7.3 that in a neighborhood around the unstable $k^*$ if $k_{t+j} > k^*$ then $k_{t+j+1} > k_{t+j}$, which would contradict the lemma. Note that the trivial steady state in Figure 6.7.2 is unstable. If the system is started from a value for $k_{t+j}$ that is close to zero it will always converge to $k^*$ and not zero.
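The monotone transition can be traced out numerically in one special case where the decision rule is known in closed form. The Python sketch below assumes log utility, $F(k) = k^{\alpha}$, and full depreciation ($\delta = 1$), for which $K(k) = \alpha\beta k^{\alpha}$. These assumptions, the function names, and the parameter values are illustrative and are not taken from the text.

```python
ALPHA, BETA = 0.3, 0.95  # assumed parameters; delta = 1 (full depreciation)


def K(k):
    """Decision rule under log utility, F(k) = k**alpha, and delta = 1."""
    return ALPHA * BETA * k ** ALPHA


def simulate(k_start, periods=20):
    """Iterate k_{t+1} = K(k_t) forward, returning the whole path."""
    path = [k_start]
    for _ in range(periods):
        path.append(K(path[-1]))
    return path


def gross_return(k):
    """F1(k) + 1 - delta, which equals alpha * k**(alpha - 1) when delta = 1."""
    return ALPHA * k ** (ALPHA - 1.0)


k_star = (ALPHA * BETA) ** (1.0 / (1.0 - ALPHA))  # solves k* = K(k*)
from_below = simulate(0.02)
from_above = simulate(0.50)
print("k*          :", k_star)
print("from below  :", [round(k, 4) for k in from_below[:6]])
print("from above  :", [round(k, 4) for k in from_above[:6]])
print("gross return:", [round(gross_return(k), 3) for k in from_below[1:6]])
```

Starting below $k^*$ the path rises toward the steady state and the gross return falls along the way; starting above $k^*$ the path falls. This mirrors the cobweb construction in Figure 6.7.2.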
Figure 6.7.3: Unstable equilibria. If at any steady state $dk_{t+j+1}/dk_{t+j} < 1$, then an unstable nontrivial steady state cannot exist. (The diagram plots $k_{t+j+1}$ against $k_{t+j}$, showing the 45° line, the decision rule $k_{t+j+1} = K(k_{t+j})$, and a steady state $k^*$.)

6.8 The Value Function: A More Formal Analysis

The above analysis suggests that the neoclassical growth model can be
written as

$$V(k) \equiv \max_{k'} \{ U(F(k) + (1-\delta)k - k') + \beta V(k') \}. \tag{P(1)}$$

The messy time subscripts have been eliminated in the above dynamic
programming problem where next period’s capital stock has a prime
symbol attached to it. This can be done because there is no notion of
time in Robinson Crusoe’s problem. All that matters is the amount of
capital that he enters a period with. The goal is to answer the following
questions concerning the value function, V:
1. Will V exist?

2. Is V unique?

3. Is V continuous?

4. Is V continuously differentiable?

5. Is V increasing in k?

6. Is V concave in k?

6.8.1 Method of Successive Approximation

The idea here is to approximate the value function $V$ by a sequence of successively better guesses, denoted by $V^j$ at stage $j$. Consider the following algorithm to do this:

1. Make an initial guess for $V$. Call it $V^0$.

2. Construct a revised guess for $V$, denoted by $V^1$:

$$V^1(k) \equiv \max_{k'} \{ U(F(k) + (1-\delta)k - k') + \beta V^0(k') \}.$$

3. Enter iteration $n+1$ with a solution for $V$ from the previous iteration, $V^n$. Compute $V^{n+1}$, given $V^n$, as follows

$$V^{n+1}(k) \equiv \max_{k'} \{ U(F(k) + (1-\delta)k - k') + \beta V^n(k') \}. \tag{P(2)}$$

This procedure can be represented much more compactly using operator notation.

$$V^{n+1} = TV^n.$$
The operator T is shorthand notation for the list of operations, de-
scribed by P(2), which are performed on the function V n to transform
it into the new one V n+1 . Often the operator T maps some set of func-
tions, say F , into itself. That is, T : F → F . The hope is that as n gets
large it will transpire that V n → V, where V = TV. To know if V n
is close to V requires some sort of metric or a standard for measuring
distance. This brings up the notion of a metric space.
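The successive approximation scheme P(2) can be sketched on a discrete grid for capital. The Python code below is a minimal illustration under stated assumptions, not the book's implementation: it uses log utility, $F(k) = k^{\alpha}$, full depreciation ($\delta = 1$), a coarse grid, and the maximum absolute difference between successive guesses as the stopping rule. All names, the grid bounds, and the tolerance are hypothetical choices.

```python
import math

ALPHA, BETA, DELTA = 0.3, 0.95, 1.0  # assumed parameters (full depreciation)


def make_grid(lo=0.05, hi=0.30, n=61):
    """Evenly spaced grid for the capital stock (bounds are an assumption)."""
    step = (hi - lo) / (n - 1)
    return [lo + i * step for i in range(n)]


def bellman_operator(V, grid):
    """Apply T once: (TV)(k) = max over k' on the grid of ln(c) + beta * V(k')."""
    TV, policy = [], []
    for k in grid:
        available = k ** ALPHA + (1.0 - DELTA) * k
        best_value, best_choice = -math.inf, grid[0]
        for k_next, v_next in zip(grid, V):
            c = available - k_next
            if c <= 0.0:
                break  # the grid is increasing, so larger k' are infeasible too
            value = math.log(c) + BETA * v_next
            if value > best_value:
                best_value, best_choice = value, k_next
        TV.append(best_value)
        policy.append(best_choice)
    return TV, policy


def value_function_iteration(grid, tol=1e-5, max_iter=1000):
    """Iterate V^{n+1} = T V^n from V^0 = 0 until successive guesses are close."""
    V = [0.0] * len(grid)
    policy = list(grid)
    for iteration in range(1, max_iter + 1):
        TV, policy = bellman_operator(V, grid)
        distance = max(abs(a - b) for a, b in zip(TV, V))
        V = TV
        if distance < tol:
            break
    return V, policy, iteration


grid = make_grid()
V, policy, n_iter = value_function_iteration(grid)
print("iterations:", n_iter)
# Under these assumptions the exact decision rule is k' = alpha * beta * k**alpha.
for k, k_next in list(zip(grid, policy))[::15]:
    print(round(k, 4), round(k_next, 4), round(ALPHA * BETA * k ** ALPHA, 4))
```

The grid-based decision rule can only be accurate up to the grid spacing, so a finer grid trades running time for precision.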

6.8.2 Metric Spaces: A Detour through Real Analysis

Definition 20. (Metric Space) A metric space is a set S , together with
a metric ρ : S × S → R+ , such that for all x, y, z ∈ S (see Figure 6.8.1):

1. ρ( x, y) ≥ 0, with ρ( x, y) = 0 if and only if x = y,

2. ρ( x, y) = ρ(y, x ),

3. ρ( x, z) ≤ ρ( x, y) + ρ(y, z).

The points in a metric space can actually be functions. On this, think about a continuous function $x(n)$ as just being an infinite dimensional vector; i.e., the infinite dimensional analogue of the point $x = \{x_1, \cdots, x_j, \cdots, x_n\}$ in $R^n$ where now $j$ can vary continuously. How can the distance between two continuous functions be measured?

Definition 21. (Uniform Metric) Consider the space of continuous functions $\mathcal{C} : [a, b] \to R$. A useful metric for this space is

$$\rho(x, y) = \max_{t \in [a,b]} |x(t) - y(t)|,$$

where $x(t)$ and $y(t)$ are two functions in $\mathcal{C}$. This is called the uniform metric.

Figure 6.8.1: Distances Between Cities. The distance between Rochester and Vancouver is nonnegative. The miles from Rochester to Vancouver are the same as those from Vancouver to Rochester. Taking a detour through L.A. increases the miles covered. (The diagram shows a triangle with vertices Vancouver, $x$, L.A., $y$, and Rochester, $z$, and sides $\rho(x,y)$, $\rho(y,z)$, and $\rho(x,z)$.)

Example 22. (Distance between two functions–uniform metric) Figure

6.8.2 plots the two continuous functions x (t) = 1 and y(t) = 1 + t − t2
on the space [0, 1]. When using the uniform metric the functions are
farthest apart at the point t = 0.5.

Figure 6.8.2: The Uniform Metric: The maximal distance between the two functions $x(t)$ and $y(t)$ occurs at the point $t = 0.5$. (The plot shows $x(t) = 1$ and $y(t) = 1 + t - t^2$ for $t$ between 0 and 1, with $\rho(x, y) = 0.25$.)
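Example 22 can be reproduced with a few lines of Python. The sketch below approximates the uniform metric by evaluating the two functions on a fine, evenly spaced grid; the function name and the grid size are illustrative assumptions, and a grid search only approximates the true maximum in general.

```python
def uniform_metric(x, y, a, b, n=10001):
    """Approximate max over t in [a, b] of |x(t) - y(t)| on n evenly spaced points.

    Returns the largest gap found and the point at which it occurs.
    """
    best_gap, best_t = -1.0, a
    for i in range(n):
        t = a + (b - a) * i / (n - 1)
        gap = abs(x(t) - y(t))
        if gap > best_gap:
            best_gap, best_t = gap, t
    return best_gap, best_t


distance, location = uniform_metric(lambda t: 1.0, lambda t: 1.0 + t - t * t, 0.0, 1.0)
print(distance, location)
```

For the two functions in Example 22 the grid happens to contain the maximizer, so the search recovers the distance and its location exactly.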

The iterative scheme P(2) generates a sequence of functions $\{V^n\}_{n=0}^{\infty}$. Will this sequence converge to something? What does convergence mean?

Definition 23. (Convergence of a Sequence) A sequence $\{x_n\}_{n=0}^{\infty}$ in $\mathcal{S}$ converges to $x \in \mathcal{S}$, if for each $\varepsilon > 0$ there exists a $N_\varepsilon$ such that

$$\rho(x_n, x) < \varepsilon, \quad \text{for all } n \ge N_\varepsilon.$$

In general $N_\varepsilon$ will depend on $\varepsilon$. From a computational viewpoint this notion of convergence isn’t very appealing. Suppose one is trying to find a numerical solution, $x \in \mathcal{S}$, to some problem. To get the solution a computer algorithm is employed. The algorithm generates a sequence $x_0, x_1, x_2, \ldots$. Each point in the sequence is hopefully getting closer and closer to the answer $x$. When should one stop? The above criterion is not very useful because it requires knowing the answer $x$, and this is what is being sought.

Definition 24. (Cauchy Sequence) A sequence $\{x_n\}_{n=0}^{\infty}$ in $\mathcal{S}$ is a Cauchy sequence if for each $\varepsilon > 0$ there exists a $N_\varepsilon$ such that

$$\rho(x_m, x_n) < \varepsilon, \quad \text{for all } m, n \ge N_\varepsilon.$$

From the computational viewpoint the Cauchy criterion for convergence looks more appealing. Basically, it would say keep iterating until the answers being generated aren’t changing much. Of course, it would be impossible to check whether or not $\rho(x_m, x_n) < \varepsilon$ for all $m, n \ge N_\varepsilon$. Also, will a Cauchy sequence generated by some algorithm converge to an answer, $x \in \mathcal{S}$? The answer in general is no.
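In practice the "keep iterating until the answers aren't changing much" idea is usually implemented by comparing successive iterates only. The Python sketch below is a hedged illustration: the function name, the tolerance, and the example map $x \mapsto \cos x$ are assumptions chosen for concreteness. Note that a small change between two successive iterates is a practical stopping rule, not a verification of the full Cauchy condition.

```python
import math


def iterate_until_stable(step, x0, eps=1e-10, max_iter=10000):
    """Iterate x_{n+1} = step(x_n) until |x_{n+1} - x_n| < eps.

    Only successive iterates are compared; this is a practical stand-in for
    the Cauchy condition, which would require checking all pairs m, n >= N.
    """
    x = x0
    for n in range(1, max_iter + 1):
        x_new = step(x)
        if abs(x_new - x) < eps:
            return x_new, n
        x = x_new
    raise RuntimeError("no convergence within max_iter iterations")


# Hypothetical example: the fixed point of cos(x), reached without knowing it in advance.
x_fixed, n_steps = iterate_until_stable(math.cos, 1.0)
print(x_fixed, n_steps)
```

The stopping rule never uses the unknown limit, which is exactly what makes it usable on a computer.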
Remark 25. A Cauchy sequence in S may not converge to a point in S .

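Example 26. (A Cauchy sequence in $\mathcal{S}$ that converges to a point outside of $\mathcal{S}$) Let $\mathcal{S} = (0, 1]$, $\rho(x, y) = |x - y|$, and $\{x_n\}_{n=1}^{\infty} = \{1/n\}_{n=1}^{\infty}$. Clearly, $x_n \to 0 \notin (0, 1]$. This sequence satisfies the Cauchy criterion, though, because

$$\rho(x_n, x_m) = \left|\frac{1}{m} - \frac{1}{n}\right| \le \frac{1}{m} + \frac{1}{n} < \varepsilon, \quad \text{if } m, n > \frac{2}{\varepsilon}.$$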
Given this it is often useful to focus attention on those metric spaces
(S , ρ) where all Cauchy sequences are guaranteed to converge to a
point in the space.

Definition 27. (Complete Metric Space) A metric space (S , ρ) is com-

plete if every Cauchy sequence in S converges to a point in S .

Theorem 28. Let $X \subseteq R^l$ and $\mathcal{C}(X)$ be the set of bounded continuous functions $V : X \to R$ with the uniform metric $\rho(V, W) = \sup_{x \in X} |V(x) - W(x)|$. Then $\mathcal{C}(X)$ is a complete metric space.

Proof. See Bryant (1985, Theorem 3.9).

Remark 29. Pointwise convergence of a sequence of continuous func-

tions does not imply that the limiting function is continuous.

Example 30. (Pointwise convergence of a sequence of continuous functions to a discontinuous function) Let $\{V^n\}_{n=1}^{\infty}$ in $C[0, 1]$ be defined by $V^n(t) = t^n$. As $n \to \infty$ it transpires that: (i) $V^n(t) \to 0$ for $t \in [0, 1)$ and (ii) $V^n(t) \to 1$ for $t = 1$. Thus, the pointwise limit is
\[ V(t) = \begin{cases} 0, & \text{for } t \in [0, 1), \\ 1, & \text{for } t = 1. \end{cases} \]
Hence $V(t)$ is a discontinuous function. See Figure 6.8.3. Clearly, by the above theorem $\{V^n\}_{n=1}^{\infty}$ cannot describe a Cauchy sequence under the uniform metric. This can be shown directly too, however. In particular, for any given $N_\varepsilon$ it is always possible to pick $m, n \ge N_\varepsilon$ and $t \in [0, 1)$ so that $|t^n - t^m| \ge 1/2$. To see this, pick $n = N_\varepsilon$ and a $t \in (0, 1)$ so that $t^{N_\varepsilon} \ge 3/4$; i.e., choose $t \ge (3/4)^{1/N_\varepsilon}$. Next, pick $m$ large enough such that, for the $t$ chosen earlier, $t^m < 1/4$, or $m > (\ln 1/4)/(\ln t)$. Then $|t^n - t^m| > 3/4 - 1/4 = 1/2$, and the desired result obtains.
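The construction in Example 30 can be checked numerically. The following is a minimal sketch in Python (used here purely for illustration; the function names and the grid size are assumptions, not part of the text). It builds the pair (n, m) and the point t described in the example and approximates the uniform distance between t^n and t^m on a grid.

```python
def sup_distance(n, m, points=20001):
    """Approximate sup over t in [0, 1] of |t**n - t**m| on an evenly spaced grid."""
    grid = (i / (points - 1) for i in range(points))
    return max(abs(t**n - t**m) for t in grid)


def far_apart_pair(N):
    """Follow Example 30: return (n, m, t) with n, m >= N and |t**n - t**m| >= 1/2."""
    n = N
    t = 0.75 ** (1.0 / N)      # chosen so that t**N equals 3/4
    m = n
    while t**m >= 0.25:        # raise m until t**m drops below 1/4
        m += 1
    return n, m, t
```

However large N is taken, the pair returned by `far_apart_pair` stays at least 1/2 apart in the uniform metric, which is why the sequence cannot be Cauchy.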

Figure 6.8.3: Pointwise Convergence of Continuous Functions to a Discontinuous Function. As n increases the continuous functions $V^n(t) = t^n$ bend more. Eventually the stress is too much and the limiting function, $V(t)$, breaks at the point $t = 1$. (The plot shows $t$, $t^2$, $t^3$, and $t^{10}$ over the unit interval.)

Remark 31. The space of strictly increasing functions is not complete

since the limiting function may just be nondecreasing. Likewise, the
space of strictly concave functions is not complete since the limiting
function may just be concave.

Example 32. Consider the Cauchy sequence of strictly increasing, strictly concave functions $\{y = x^{1 - 1/(n+1)}\}_{n=1}^{\infty}$ on the domain $[0, 1]$. This sequence converges to the increasing, concave function $y = x$, which is not strictly concave. The situation is shown in Figure 6.8.4.
Figure 6.8.4: The Spaces of Strictly Concave and Strictly Increasing Functions are Not Complete. The sequence of strictly concave, strictly increasing functions shown converges to a straight line, which is just a concave, increasing function. (The plot shows $y = x^{1 - 1/(n+1)}$ for the exponents 0.5, 0.67, 0.75, and 0.83, together with the limit $y = x$, on $[0, 1]$.)

6.8.3 The Contraction Mapping Theorem

Establishing, both computationally and theoretically, properties of map-
pings such as P(2) involves the idea of a contraction mapping.

Definition 33. (Contraction Mapping) Let $(S, \rho)$ be a metric space and $T : S \to S$ be a function mapping $S$ into itself. $T$ is a contraction mapping (with modulus $\beta$) if for some $\beta \in (0, 1)$,
\[ \rho(Tx, Ty) \le \beta \rho(x, y), \quad \text{for all } x, y \in S. \tag{6.8.1} \]
As the name implies, after applying the operator $T$ the distance between functions contracts; i.e., the distance between $Tx$ and $Ty$ is smaller than that between $x$ and $y$.

Theorem 34. (Contraction Mapping Theorem or Banach Fixed Point Theorem) If $(S, \rho)$ is a complete metric space and $T : S \to S$ is a contraction mapping with modulus $\beta$, then

1. $T$ has exactly one fixed point $V \in S$ such that $V = TV$,

2. for any $V^0 \in S$, $\rho(T^n V^0, V) \le \beta^n \rho(V^0, V)$, $n = 0, 1, 2, \ldots$.

Proof. See Bryant (1985, Theorem 4.1).

The theorem implies that from a computational standpoint contraction mappings are great. Consider the mapping $V^{n+1} = TV^n$, where the operator $T$ is a contraction. Part 1 of the theorem states that there is only one fixed point to the operator. Part 2 says that you can get to this unique fixed point by employing the iterative scheme $V^{n+1} = TV^n$ starting from any initial guess $V^0$ (in the space $S$).
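Both parts of the theorem can be seen at work in a one-dimensional toy case. The sketch below is in Python and is only an illustration: the scalar map T(x) = βx + 1 on the real line, its modulus β = 0.5, and the helper name `iterate_contraction` are all assumptions chosen for simplicity. The map has the unique fixed point 1/(1 − β) = 2.

```python
def iterate_contraction(T, v0, tol=1e-12, max_iter=10_000):
    """Iterate v_{n+1} = T(v_n) from v0 until successive iterates differ by less than tol.

    Returns the approximate fixed point and the list of iterates visited."""
    v = v0
    path = [v0]
    for _ in range(max_iter):
        v_new = T(v)
        path.append(v_new)
        if abs(v_new - v) < tol:
            return v_new, path
        v = v_new
    raise RuntimeError("no convergence within max_iter iterations")


BETA = 0.5                      # modulus of the toy contraction (assumed)


def T(x):
    """A scalar contraction on the real line with fixed point 1 / (1 - BETA) = 2."""
    return BETA * x + 1.0
```

Starting the iteration from very different initial guesses leads to the same limit, and the distance to the fixed point after n steps is bounded by β^n times the initial distance, as Part 2 states.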

Corollary 35. Let $(S, \rho)$ be a complete metric space and let $T : S \to S$ be a contraction mapping with fixed point $V \in S$. If $S'$ is a closed subset of $S$ and $T(S') \subseteq S'$ then $V \in S'$. If in addition $T(S') \subseteq S'' \subseteq S'$, then $V \in S''$.

Proof. Choose $V^0 \in S'$ and note that $\{T^n V^0\}$ is a sequence in $S'$ converging to $V$. Since $S'$ is closed, it follows that $V \in S'$. If $T(S') \subseteq S''$, it then follows that $V = TV \in S''$.

To check whether a particular mapping is a contraction using (6.8.1) can be cumbersome. So, for dynamic programming problems in economics it is often much easier to use the sufficient conditions presented below.

Margin note: David Blackwell (1919-2010) was an American mathematician and statistician. He made important contributions to game theory, information theory, probability theory, and statistics. He was the first African American to become a tenured professor at Berkeley and the first to be inducted into the National Academy of Sciences.

Theorem 36. (Blackwell's Sufficiency Condition) Let $X \subseteq \mathbb{R}^l$ and $B(X)$ be the space of bounded functions $V : X \to \mathbb{R}$ with the uniform metric. Let $T : B(X) \to B(X)$ be an operator satisfying

1. (Monotonicity) $V, W \in B(X)$. If $V \le W$ [i.e., $V(x) \le W(x)$ for all $x$] then $TV \le TW$.

2. (Discounting) There exists some constant $\beta \in (0, 1)$ such that $T(V + a) \le TV + \beta a$, for all $V \in B(X)$ and $a \ge 0$.

Then $T$ is a contraction with modulus $\beta$.

Proof. For every V, W ∈ B(X ),

V ≤ W + ρ(V, W ).

Thus, (1) and (2) imply
\[ \underbrace{TV \le T(W + \rho(V, W))}_{\text{Monotonicity}} \underbrace{\le TW + \beta \rho(V, W)}_{\text{Discounting}}. \]

Thus,
TV − TW ≤ βρ(V, W ).

By permuting the functions it is easy to show that

TW − TV ≤ βρ(V, W ).

Consequently,
| TV − TW | ≤ βρ(V, W ),

so that
ρ( TV, TW ) ≤ βρ(V, W ).

Therefore T is a contraction.

6.8.4 Back to the Neoclassical Growth Model

The above machinery will now be applied to the neoclassical growth
model. To this end, consider the mapping

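\[ (TV)(k) = \max_{0 \le k' \le F(k) + (1 - \delta)k} \{U(F(k) + (1 - \delta)k - k') + \beta V(k')\}. \tag{P(3)} \]

Is $T$ a contraction?

1. Monotonicity. Suppose $V(k) \le W(k)$ for all $k$. It will be shown that $(TV)(k) \le (TW)(k)$. Write
\[ (TV)(k) = U(F(k) + (1 - \delta)k - k'^{*}) + \beta V(k'^{*}), \]
where $k'^{*}$ maximizes P(3). Clearly,
\begin{align*}
(TV)(k) &\le U(F(k) + (1 - \delta)k - k'^{*}) + \beta W(k'^{*}) \\
&\le \max_{0 \le k' \le F(k) + (1 - \delta)k} \{U(F(k) + (1 - \delta)k - k') + \beta W(k')\} \\
&= (TW)(k).
\end{align*}

2. Discounting.
\begin{align*}
T(V + a)(k) &= \max_{0 \le k' \le F(k) + (1 - \delta)k} \{U(F(k) + (1 - \delta)k - k') + \beta [V(k') + a]\} \\
&= \max_{0 \le k' \le F(k) + (1 - \delta)k} \{U(F(k) + (1 - \delta)k - k') + \beta V(k')\} + \beta a \\
&= (TV)(k) + \beta a.
\end{align*}

Both of Blackwell's conditions hold, so by Theorem 36 the operator $T$ is a contraction with modulus $\beta$.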
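Blackwell's two conditions, and the contraction property they imply, can also be verified numerically for a discretized version of P(3). The following Python sketch is illustrative only: the log utility function, the production function F(k) = k^0.36, the parameter values, and the capital grid are all assumptions, and restricting k' to a grid is a simplification of the continuous problem.

```python
import math

ALPHA, BETA, DELTA = 0.36, 0.95, 0.1          # illustrative parameters (assumed)
GRID = [0.5 + 0.1 * i for i in range(60)]     # ascending capital grid (assumed)


def bellman(V):
    """Apply the operator T in P(3) on the grid; V holds one value per grid point."""
    TV = []
    for k in GRID:
        resources = k**ALPHA + (1.0 - DELTA) * k
        best = -math.inf
        for k_next, v_next in zip(GRID, V):
            c = resources - k_next
            if c <= 0.0:           # the grid is ascending, so later points are infeasible too
                break
            best = max(best, math.log(c) + BETA * v_next)
        TV.append(best)
    return TV


def sup_norm(V, W):
    """Uniform distance between two functions stored on the grid."""
    return max(abs(a - b) for a, b in zip(V, W))
```

Applying `bellman` to any two grid functions V and W shrinks their uniform distance by at least the factor β, which is the property that makes iterating on the value function converge.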

Theorem 37. $V$ is a continuous, strictly increasing, strictly concave function in $k$.[4]

[4] A more formal proof is in Stokey and Lucas (1986).

Proof. (Heuristic) It will be shown that the operator described by P(3) maps increasing, concave $C^2$ functions into strictly increasing, strictly concave $C^2$ functions. Suppose that $V^n$ is a continuous, strictly increasing, strictly concave $C^2$ function. The decision rule for $k'$ is determined from the first-order condition
\[ U_1(F(k) + (1 - \delta)k - k') = \beta V_1^n(k'). \]
This determines $k'$ as a continuously differentiable function of $k$ by the implicit function theorem. Therefore, $V^{n+1}(k)$ is a strictly increasing $C^2$ function, since $V_1^{n+1}(k) = U_1(F(k) + (1 - \delta)k - k')[F_1(k) + (1 - \delta)] > 0$. The limit of such a sequence must be a continuous function, because each $V^n$ is a continuous function and the sequence converges uniformly. (It does not have to be a $C^2$ function.) The limiting function is also strictly increasing because P(3) maps increasing functions into strictly increasing ones.[5]

[5] In terms of Corollary 35 think about $S'$ as being the space of increasing functions and $S''$ the space of strictly increasing functions.

To see this, let $k_1 < k_2$. Then,

\begin{align*}
V^{n+1}(k_1) &= U(F(k_1) + (1 - \delta)k_1 - k_1'^{*}) + \beta V^n(k_1'^{*}) \\
&< U(F(k_2) + (1 - \delta)k_2 - k_1'^{*}) + \beta V^n(k_1'^{*}) \\
&\le U(F(k_2) + (1 - \delta)k_2 - k_2'^{*}) + \beta V^n(k_2'^{*}) \\
&= V^{n+1}(k_2).
\end{align*}

By employing a proof similar to that used in Lemma 18 it can be shown that the operator described by P(3) maps concave functions into strictly concave ones.[6] Thus, the limiting function must be strictly concave.

[6] Again, in terms of Corollary 35 think about $S'$ as being the space of concave functions and $S''$ the space of strictly concave functions.
Differentiability
The last question concerns whether or not the value function for the
neoclassical growth model is differentiable.

Lemma 38. Let $X \subseteq \mathbb{R}^l$ be a convex set and $V : X \to \mathbb{R}$ be a concave function. Pick an $x_0 \in \text{int}(X)$ and let $D$ be a neighborhood of $x_0$. If there is a concave, differentiable function $W : D \to \mathbb{R}$ with $W(x_0) = V(x_0)$ and $W(x) \le V(x)$ for all $x \in D$ then $V$ is differentiable at $x_0$ and
\[ V_i(x_0) = W_i(x_0), \quad \text{for } i = 1, 2, \ldots, l. \]

Proof. (Heuristic) Figure 6.8.5 tells it all. If V is not differentiable at x0 ,

then it would have to have a kink in it at this point. But, if this was the
case it would be impossible to have a smooth function W lying always
below V that just touches V at x0 . Try to derive a contradiction by
drawing different scenarios.

Figure 6.8.5: Differentiability of V. The function $V$ cannot have a kink (or a break in its derivative) at the point $x_0$. If it did, then it would be impossible to insert the concave, differentiable function $W$ under the function $V$ while touching at $x_0$. (The plot shows $V$ and $W$ against $x$, with the two curves touching at $x_0$.)

Theorem 39. (Benveniste and Scheinkman) Suppose that $K$ is a convex set and that $U$ and $F$ are strictly concave $C^1$ functions. Let $V : K \to \mathbb{R}$ be defined in line with P(3) and denote the decision rule associated with this problem by $k' = G(k)$. Pick $k_0 \in \text{int}(K)$ and assume that $0 < G(k_0) < F(k_0) + (1 - \delta)k_0$. Then $V(k)$ is continuously differentiable at $k_0$ with its derivative given by
\[ V_1(k_0) = U_1(F(k_0) + (1 - \delta)k_0 - G(k_0))[F_1(k_0) + (1 - \delta)]. \]

Proof. Clearly, there exists some neighborhood $D$ of $k_0$ such that $0 < G(k_0) < F(k) + (1 - \delta)k$ for all $k \in D$. Define $W$ on $D$ by
\[ W(k) = U(F(k) + (1 - \delta)k - G(k_0)) + \beta V(G(k_0)). \]
Now, $W$ is concave and differentiable since $U$ and $F$ are. Furthermore, it follows that
\[ W(k) \le \max_{k'} \{U(F(k) + (1 - \delta)k - k') + \beta V(k')\} = V(k), \]
with this expression holding with strict equality at $k = k_0$. The results then follow immediately from the above lemma.

6.9 A Linear-Quadratic Optimization Problem

Robinson Crusoe’s problem is now recast as a linear-quadratic opti-

mization problem. This class of optimization problems is characterized
by a quadratic objective function and linear constraints. They yield
linear first-order conditions and are a close cousin of the linearization
technique discussed in Chapter 9. The solution to the linear-quadratic
optimization problem will shed further light on the local dynamics of
the neoclassical growth model.

6.9.1 Taking a Quadratic Approximation to the Utility Function

Substitute the resource constraint into the momentary utility function
to obtain
U ( F ( k ) + (1 − δ ) k − k 0 ).

Take a second-order Taylor expansion of this to get
\begin{align*}
U(F(k) + (1 - \delta)k - k') ={}& U(*) + \underbrace{U_1(*)[F_1(*) + (1 - \delta)]}_{\alpha}(k - k^*) - \underbrace{U_1(*)}_{\lambda}(k' - k^*) \\
&+ \frac{1}{2}\underbrace{[U_{11}(*)[F_1(*) + (1 - \delta)]^2 + U_1(*)F_{11}(*)]}_{-\psi}(k - k^*)^2 \\
&- \underbrace{U_{11}(*)[F_1(*) + (1 - \delta)]}_{-\rho}(k - k^*)(k' - k^*) + \frac{1}{2}\underbrace{U_{11}(*)}_{-\phi}(k' - k^*)^2.
\end{align*}

(See Chapter A for the concept of a second-order Taylor expansion.) In the above equation the $*$ notation signifies that the argument of the function is being evaluated at its steady-state value. Define deviations in capital stock from the steady state by $\hat{k} \equiv k - k^*$ and $\hat{k}' \equiv k' - k^*$. The momentary utility function can then be expressed as
\[ U(\hat{k}, \hat{k}') = \tau + \alpha \hat{k} - \lambda \hat{k}' - \frac{\psi}{2}\hat{k}^2 + \rho \hat{k}\hat{k}' - \frac{\phi}{2}\hat{k}'^2, \]
where $\tau \equiv U(*)$. Note the slight abuse of notation in redefining $U$ to now be a function of $\hat{k}$ and $\hat{k}'$. The derivatives in the above Taylor expansion could be computed numerically using the formulae for numerical first- and second-derivatives presented in Chapter 8.

6.9.2 Robinson’s Linear-Quadratic Optimization Problem

Robinson Crusoe's optimization problem is now given by
\[ \max_{\{k_{t+j+1}\}_{j=0}^{\infty}} \sum_{j=0}^{\infty} \beta^j U(k_{t+j}, k_{t+j+1}), \quad \text{with } 0 < \beta < 1. \]
As before, $k_{t+j+1}$ will show up twice in the objective function:
\[ \cdots + \beta^j U(k_{t+j}, k_{t+j+1}) + \beta^{j+1} U(k_{t+j+1}, k_{t+j+2}) + \cdots \]
Maximizing then gives
\[ U_2(k_{t+j}, k_{t+j+1}) = -\beta U_1(k_{t+j+1}, k_{t+j+2}), \quad \text{for } j = 0, 1, \cdots. \]
The Euler equation for capital accumulation is then given by
\[ \underbrace{\overbrace{\lambda - \rho \hat{k}_{t+j} + \phi \hat{k}_{t+j+1}}^{-U_2(k_{t+j}, k_{t+j+1})}}_{\text{MC of investment}} = \underbrace{\beta \overbrace{(\alpha + \rho \hat{k}_{t+j+2} - \psi \hat{k}_{t+j+1})}^{U_1(k_{t+j+1}, k_{t+j+2})}}_{\text{MB of investment}}, \]
or
\[ \lambda - \rho \hat{k} + \phi \hat{k}' = \beta(\alpha + \rho \hat{k}'' - \psi \hat{k}'). \]
This is a linear second-order difference equation.

Now, in a steady state $\hat{k}^* = \hat{k}^{*\prime} = \hat{k}^{*\prime\prime} = 0$, so that the following parameter restriction must apply:
\[ \lambda = \beta \alpha. \]
Therefore,
\[ -\rho \hat{k} + \phi \hat{k}' = \beta(\rho \hat{k}'' - \psi \hat{k}'). \]
Conjecture a solution of the form
\[ \hat{k}' = \eta \hat{k}, \]
so that
\[ \hat{k}'' = \eta \hat{k}'. \]
There is no constant term since the decision rule has been defined in
terms of deviations from the steady state. Note that if η was negative,
then the capital stock would oscillate around the steady state, which
Figure 6.7.2 rules out. So, one should expect that η > 0. Now, if
0 < η < 1, then the difference equation is stable and convergence to
the steady state will be monotone.
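Using the conjectured decision rule in the above Euler equation gives
\[ -\rho \hat{k} + \phi \hat{k}' = \beta(\rho \eta \hat{k}' - \psi \hat{k}'), \]
which can be rewritten as
\[ \hat{k}' = \frac{\rho}{\phi + \beta \psi - \beta \rho \eta}\hat{k}. \]
Hence, $\eta$ must solve the quadratic
\[ \eta = \frac{\rho}{\phi + \beta \psi - \beta \rho \eta}. \]
Cross multiplying gives
\[ -\beta \rho \eta^2 + (\phi + \beta \psi)\eta - \rho = 0. \]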

This equation will have two roots. One will lie between 0 and 1 and the other will be greater than 1. The stable root corresponds to the situation where the decision rule crosses the 45-degree line in Figure 6.7.2.
To prove this formally, observe that when $\eta = 0$ the lefthand side of this equation is negative. When $\eta = 1$ the lefthand side is positive because, using the steady-state condition $\beta[F_1(*) + (1 - \delta)] = 1$,
\begin{align*}
\beta \psi &= -U_{11}(*)[F_1(*) + (1 - \delta)] - \beta U_1(*)F_{11}(*), \\
\beta \rho &= -U_{11}(*), \\
\phi &= -U_{11}(*), \\
\rho &= -U_{11}(*)[F_1(*) + (1 - \delta)],
\end{align*}
so that
\begin{align*}
-\beta \rho + (\phi + \beta \psi) - \rho ={}& U_{11}(*) - U_{11}(*) - U_{11}(*)[F_1(*) + (1 - \delta)] \\
&- \beta U_1(*)F_{11}(*) + U_{11}(*)[F_1(*) + (1 - \delta)] \\
={}& -\beta U_1(*)F_{11}(*) > 0.
\end{align*}
Therefore, a root must lie between 0 and 1. As $\eta$ becomes large the first term in the quadratic equation will dominate and the expression turns negative again. Figure 6.9.1 portrays the situation.
Figure 6.9.1: Roots to the quadratic equation for $\eta$. The unstable root can be thrown away, implying $0 < \eta < 1$. Thus, the difference equation $\hat{k}' = \eta \hat{k}$ will be stable and exhibit monotone dynamics.
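The coefficients ψ, ρ, and φ and the two roots of the quadratic for η are easy to compute once functional forms are chosen. The Python sketch below is an illustration under stated assumptions: log utility, the production function F(k) = k^a (the capital share is called `a` here to avoid a clash with the Taylor coefficient α), and the parameter values shown. None of these choices comes from the text.

```python
import math


def lq_roots(a=0.36, beta=0.96, delta=0.1):
    """Return (stable root, unstable root) of -beta*rho*eta^2 + (phi + beta*psi)*eta - rho = 0."""
    R = 1.0 / beta                                   # steady-state F_1 + (1 - delta)
    k = (a / (R - 1.0 + delta)) ** (1.0 / (1.0 - a)) # steady-state capital
    c = k**a - delta * k                             # steady-state consumption
    U1, U11 = 1.0 / c, -1.0 / c**2                   # log utility derivatives
    F11 = a * (a - 1.0) * k ** (a - 2.0)
    psi = -(U11 * R**2 + U1 * F11)
    rho = -U11 * R
    phi = -U11
    A, B, C = -beta * rho, phi + beta * psi, -rho    # quadratic coefficients
    disc = math.sqrt(B * B - 4.0 * A * C)
    roots = ((-B + disc) / (2.0 * A), (-B - disc) / (2.0 * A))
    return min(roots), max(roots)
```

With full depreciation (`delta=1.0`) and log utility the nonlinear model is known to have the decision rule k' = aβk^a, whose slope at the steady state is `a`, so the stable root returned should coincide with the capital share in that case. This gives a convenient check on the linear-quadratic approximation.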

6.10 Adding a Labor-Leisure Choice

It is easy to add a labor-leisure choice to the above analysis. To do this,

rewrite the period-$(t + j)$ momentary utility function as
\[ U(c_{t+j} - G(h_{t+j})), \]
where $h_{t+j}$ is the amount of work effort expended. This utility function was introduced in Chapter 2. The function $G : \mathbb{R}_+ \to \mathbb{R}_+$ gives the disutility of work effort, measured in consumption units; assume that it is strictly convex. Express the production function as
\[ o_{t+j} = F(k_{t+j}, h_{t+j}). \]
Robinson Crusoe's problem will now appear as
\[ \max_{\{h_{t+j}, k_{t+j+1}\}_{j=0}^{\infty}} \sum_{j=0}^{\infty} \beta^j U(F(k_{t+j}, h_{t+j}) + (1 - \delta)k_{t+j} - k_{t+j+1} - G(h_{t+j})). \]

There will be two first-order conditions; viz, one for $h_{t+j}$ and the other for $k_{t+j+1}$. The first-order condition governing period-$(t + j)$ labor effort, or $h_{t+j}$, is
\begin{align*}
& U_1(F(k_{t+j}, h_{t+j}) + (1 - \delta)k_{t+j} - k_{t+j+1} - G(h_{t+j}))F_2(k_{t+j}, h_{t+j}) \\
&\quad = U_1(F(k_{t+j}, h_{t+j}) + (1 - \delta)k_{t+j} - k_{t+j+1} - G(h_{t+j}))G_1(h_{t+j}),
\end{align*}
which simplifies to
\[ F_2(k_{t+j}, h_{t+j}) = G_1(h_{t+j}). \]
This equation specifies $h_{t+j}$ as a function of $k_{t+j}$. Write this solution as
\[ h_{t+j} = H(k_{t+j}). \]

The first-order condition for the period-$(t + j + 1)$ capital stock, $k_{t+j+1}$, is
\begin{align*}
& U_1(F(k_{t+j}, h_{t+j}) + (1 - \delta)k_{t+j} - k_{t+j+1} - G(h_{t+j})) \\
&\quad = [F_1(k_{t+j+1}, h_{t+j+1}) + (1 - \delta)] \\
&\qquad \times \beta U_1(F(k_{t+j+1}, h_{t+j+1}) + (1 - \delta)k_{t+j+1} - k_{t+j+2} - G(h_{t+j+1})),
\end{align*}
for $j = 0, 1, \cdots$. Plugging in the decision rule for labor then yields
\begin{align*}
& U_1(F(k_{t+j}, H(k_{t+j})) + (1 - \delta)k_{t+j} - k_{t+j+1} - G(H(k_{t+j}))) \\
&\quad = [F_1(k_{t+j+1}, H(k_{t+j+1})) + (1 - \delta)] \\
&\qquad \times \beta U_1(F(k_{t+j+1}, H(k_{t+j+1})) + (1 - \delta)k_{t+j+1} - k_{t+j+2} - G(H(k_{t+j+1}))),
\end{align*}
for $j = 0, 1, \cdots$. The problem has been reduced to the earlier one; i.e., the solution for the model is once again represented by a second-order nonlinear difference equation for the capital stock.

Example 40. (Hours worked with a zero-income effect utility function and a Cobb-Douglas production function) Let $G(h) = h^{1+\theta}/(1 + \theta)$ and $F(k, h) = k^{\alpha}h^{1-\alpha}$. Then, the first-order condition for labor can be written as
\[ \underbrace{(1 - \alpha)k^{\alpha}h^{-\alpha}}_{F_2} = \underbrace{h^{\theta}}_{G_1}. \]
It's trivial to calculate that
\[ h = [(1 - \alpha)k^{\alpha}]^{1/(\theta + \alpha)}. \]
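The closed-form solution for hours in Example 40 can be coded directly and checked against the first-order condition it came from. This is a small Python sketch for illustration; the function names and the default parameter values are assumptions.

```python
def hours(k, alpha=0.36, theta=1.0):
    """Hours worked from Example 40: h = [(1 - alpha) * k**alpha] ** (1 / (theta + alpha))."""
    return ((1.0 - alpha) * k**alpha) ** (1.0 / (theta + alpha))


def labor_foc_gap(k, alpha=0.36, theta=1.0):
    """Marginal product of labor minus marginal disutility of work at h = hours(k)."""
    h = hours(k, alpha, theta)
    marginal_product = (1.0 - alpha) * k**alpha * h ** (-alpha)   # F_2
    marginal_disutility = h**theta                                # G_1
    return marginal_product - marginal_disutility
```

Because hours depend on capital alone, a line computing `hours(k)` can simply be placed ahead of the Euler equation when the model is coded up.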

6.11 The Taxation of Capital and Labor Income

Adding income taxation into the above framework with a labor choice
is straightforward. This can be done along the lines discussed in Chap-
ter 2. To see how, let labor income be taxed at the rate τh . Likewise,
let the tax rate on capital income, net of the cost of depreciation, be
τk . For simplicity assume that all tax revenue is rebated back to the
consumer/worker in the form of lump-sum transfer payments, λ. Sup-
pose that consumer/workers own all the capital in the economy, which
they rent out to firms at the rate r. Likewise, they supply labor at the
wage rate w. Assume that tastes and technology have the forms given
in the previous section.
The representative consumer/worker's period-$(t + j)$ budget constraint reads
\begin{align*}
c_{t+j} + k_{t+j+1} &= (1 - \tau_h)w_{t+j}h_{t+j} + r_{t+j}k_{t+j} - \tau_k(r_{t+j} - \delta)k_{t+j} + (1 - \delta)k_{t+j} + \lambda_{t+j} \\
&= (1 - \tau_h)w_{t+j}h_{t+j} + [1 + (1 - \tau_k)(r_{t+j} - \delta)]k_{t+j} + \lambda_{t+j}.
\end{align*}

As before, this equation can be used to substitute out for $c_{t+j}$ in the consumer/worker's utility function. The government's budget constraint will appear as
\[ \tau_h w_{t+j}h_{t+j} + \tau_k(r_{t+j} - \delta)k_{t+j} = \lambda_{t+j}. \]
When formulating the general equilibrium for this economy, do the following steps in order:

1. Solve the consumer/worker's and firm's problems. This will lead to an Euler equation for capital accumulation of the form
\[ U_1(c_{t+j} - G(h_{t+j})) = \beta[1 + (1 - \tau_k)(r_{t+j+1} - \delta)] \times U_1(c_{t+j+1} - G(h_{t+j+1})), \]
where $c_{t+j}$ and $c_{t+j+1}$ are given by the consumer/worker's constraint. The first-order condition for labor is
\[ (1 - \tau_h)w_{t+j} = G_1(h_{t+j}). \]

2. Eliminate $\lambda_{t+j}$ in the consumer/worker's budget constraint using the government's budget constraint. This will result in
\[ c_{t+j} + k_{t+j+1} = w_{t+j}h_{t+j} + r_{t+j}k_{t+j} + (1 - \delta)k_{t+j}. \]

3. Use the firm's first-order conditions to solve out for the wage and rental rates in the consumer/worker's formula for consumption. After employing Euler's theorem, this will result in[7]
\[ c_{t+j} = F(k_{t+j}, h_{t+j}) + (1 - \delta)k_{t+j} - k_{t+j+1}. \]

[7] If government spending is added into the mix, then the government's budget constraint will read
\[ \tau_h w_{t+j}h_{t+j} + \tau_k(r_{t+j} - \delta)k_{t+j} = \lambda_{t+j} + g_{t+j}. \]
This will result in the equation for consumption appearing as
\[ c_{t+j} = F(k_{t+j}, h_{t+j}) + (1 - \delta)k_{t+j} - k_{t+j+1} - g_{t+j}. \]

A set of equations will arise that only involve the $k$'s and $h$'s. To make the system computer readable, the line for $c_{t+j}$ in Step 3 could be inserted before the Euler equation presented in Step 1. Similarly, when $h_{t+j}$ has an analytical solution, it could be placed before the line for $c_{t+j}$. The expressions for $c_{t+j+1}$ and $h_{t+j+1}$ can be obtained by just updating the formulas for $c_{t+j}$ and $h_{t+j}$.
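The ordering described above (hours first, then consumption, then the Euler equation) can be sketched as follows. The Python code is only an illustration: it assumes log utility for U, the functional forms of Example 40 for G and F, competitive factor prices w = F_2 and r = F_1, and the parameter and tax values shown. The helper names are hypothetical.

```python
ALPHA, BETA, DELTA, THETA = 0.36, 0.96, 0.1, 1.0   # illustrative parameters (assumed)
TAU_H, TAU_K = 0.20, 0.25                          # illustrative tax rates (assumed)


def hours(k):
    """Labor from (1 - tau_h) * w = G_1(h) with w = F_2(k, h)."""
    return ((1.0 - TAU_H) * (1.0 - ALPHA) * k**ALPHA) ** (1.0 / (THETA + ALPHA))


def disutility(h):
    return h ** (1.0 + THETA) / (1.0 + THETA)


def consumption(k, k_next):
    """Step 3: c = F(k, h) + (1 - delta) * k - k_next, transfers already netted out."""
    return k**ALPHA * hours(k) ** (1.0 - ALPHA) + (1.0 - DELTA) * k - k_next


def rental_rate(k):
    """r = F_1(k, h) evaluated at h = hours(k)."""
    return ALPHA * k ** (ALPHA - 1.0) * hours(k) ** (1.0 - ALPHA)


def euler_residual(k, k_next, k_next2):
    """Lefthand side minus righthand side of the Euler equation in Step 1 (log utility)."""
    m_now = 1.0 / (consumption(k, k_next) - disutility(hours(k)))
    m_next = 1.0 / (consumption(k_next, k_next2) - disutility(hours(k_next)))
    return m_now - BETA * (1.0 + (1.0 - TAU_K) * (rental_rate(k_next) - DELTA)) * m_next


def steady_state_capital(lo=1e-6, hi=1e3, tol=1e-12):
    """Bisect on 1 = beta * [1 + (1 - tau_k) * (r(k) - delta)]; r(k) is decreasing in k."""
    target = DELTA + (1.0 / BETA - 1.0) / (1.0 - TAU_K)
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if rental_rate(mid) > target:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)
```

The residual function involves only capital stocks, so it can be handed to whatever solver is used for the transition path. Evaluated at the steady state it should be zero.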
Return to the concept of the equivalent variation that was introduced in Chapter 2 and consider a switch from some tax regime A to tax regime B. How much would a person be willing to pay, as a fraction of each period's consumption under regime A, to move from A to B? The fraction $e$ solves the equation
\[ \sum_{j=0}^{\infty} \beta^j U(c^A_{t+j}(1 + e) - G(h^A_{t+j})) = W^B \equiv \sum_{j=0}^{\infty} \beta^j U(c^B_{t+j} - G(h^B_{t+j})). \]
Things are a little more complicated now, but this is just one equation in the unknown variable $e$.
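Since the welfare comparison is one equation in one unknown, a bracketing method suffices. The Python sketch below is illustrative and rests on several assumptions not made in the text: log utility, G(h) = h^(1+θ)/(1+θ), and infinite sums truncated to the length of the supplied sequences. The function names are hypothetical.

```python
import math


def disutility(h, theta):
    return h ** (1.0 + theta) / (1.0 + theta)


def lifetime_utility(c, h, beta, theta, e=0.0):
    """Discounted sum of log(c_j * (1 + e) - G(h_j)) over the supplied (finite) sequences."""
    return sum(beta**j * math.log(cj * (1.0 + e) - disutility(hj, theta))
               for j, (cj, hj) in enumerate(zip(c, h)))


def equivalent_variation(cA, hA, cB, hB, beta=0.96, theta=1.0, tol=1e-12):
    """Solve for the fraction e that makes regime A as good as regime B, by bisection."""
    target = lifetime_utility(cB, hB, beta, theta)

    def gap(e):
        return lifetime_utility(cA, hA, beta, theta, e) - target

    # Lower bound keeps c * (1 + e) - G(h) strictly positive in every period.
    lo = max(disutility(h, theta) / c for c, h in zip(cA, hA)) - 1.0 + 1e-9
    hi = max(lo + 1.0, 1.0)
    while gap(hi) < 0.0:           # expand the upper bound until the root is bracketed
        hi = 2.0 * hi + 1.0
    if gap(lo) > 0.0:
        raise ValueError("could not bracket the root from below")
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if gap(mid) > 0.0:
            hi = mid
        else:
            lo = mid
    return 0.5 * (lo + hi)
```

In practice the consumption and hours sequences would come from simulating the model under each tax regime, with the horizon chosen long enough that the truncated tail is negligible.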

6.12 The Extended Path and Multiple Shooting Algorithms

Unlike the quadratic case, in general the solution to the neoclassical growth model must be computed numerically. The above framework will be modified to allow for some predetermined known sequence of technology shocks, $\{z_t\}_{t=1}^{\infty}$. In particular, assume that Robinson Crusoe's time-1 choice problem is now given by
\[ \max_{\{c_t, k_{t+1}\}_{t=1}^{\infty}} \sum_{t=1}^{\infty} \beta^{t-1}U(c_t), \]
subject to
\[ c_t + k_{t+1} = F(k_t, z_t) + (1 - \delta)k_t, \]
and the initial condition, $k_1$. Note the presence of the technology shock in the production function. The Euler equation for this model is
\[ U_1(F(k_t, z_t) + (1 - \delta)k_t - k_{t+1}) = \beta U_1(F(k_{t+1}, z_{t+1}) + (1 - \delta)k_{t+1} - k_{t+2})[F_1(k_{t+1}, z_{t+1}) + (1 - \delta)]. \tag{6.12.1} \]
Assume that the economy converges to the steady-state level of the capital stock by period $T$. Let $z^*$ denote the steady-state level for the technological shock. The steady-state level of capital, $k^*$, will then be given by
\[ 1/\beta = F_1(k^*, z^*) + 1 - \delta. \tag{6.12.2} \]
So, a time path for capital is being sought that goes from $k_1$ to $k_T = k^*$. Thus, essentially a solution is being sought for $T - 2$ capital stocks, or for $k_2, k_3, \cdots, k_{T-1}$.

6.12.1 The Extended Path Algorithm

The extended path algorithm was proposed by Fair and Taylor (1983).
Observe that if k t+2 was known then (6.12.1) could be used to solve for
k t+1 , given k t . This observation suggests the following algorithm:

1. Enter iteration $j$ with a guess for the sequence $\{k_t\}_{t=1}^{T}$, denoted by $\{k_t^j\}_{t=1}^{T}$. For each period $t$ (for $t = 1, 2, \cdots, T - 2$) solve for $k_{t+1}$ using the equation
\[ U_1(\underbrace{F(k_t, z_t) + (1 - \delta)k_t - k_{t+1}}_{c_t}) = \beta U_1(\underbrace{F(k_{t+1}, z_{t+1}) + (1 - \delta)k_{t+1} - k_{t+2}^j}_{c_{t+1}}) \times [F_1(k_{t+1}, z_{t+1}) + (1 - \delta)]. \tag{6.12.3} \]
Note that $k_t$ was determined in the previous period and that $k_{t+2}^j$ is given by the guess. This will generate a sequence $\{k_t\}_{t=1}^{T}$. Specifically, using a "for" or "do" loop:

(a) Start off in period 1 with the predetermined starting condition, $k_1$. Use (6.12.3) to determine $k_2$, given $k_1$ and the guess $k_3^j$.
(b) Move to period 2. Here the goal is to calculate $k_3$, given the solution just obtained for $k_2$ together with the guess $k_4^j$.
(c) Then compute $k_4$, given $k_3$ and $k_5^j$.
(d) Proceed down the time path in the above fashion to period $T - 2$. Here the starting capital stock is $k_{T-2}$. The goal is to compute $k_{T-1}$, given the guess $k_T^j = k_T = k^*$.

2. Check whether $\sum_{t=1}^{T} |k_t - k_t^j| < \varepsilon$.

(a) If so, exit the algorithm since a solution has been found.
(b) If not, set $\{k_t^{j+1}\}_{t=1}^{T} = \{k_t\}_{t=1}^{T}$. Repeat step one using this new guess.

Extended Path Algorithm

Time, t    Predetermined    Computed     Future Guess
1          k_1              k_2          k_3^j
2          k_2              k_3          k_4^j
3          k_3              k_4          k_5^j
...        ...              ...          ...
T-2        k_{T-2}          k_{T-1}      k_T^j = k*

Remark 41. (Restricting consumption to be positive) It pays sometimes to impose a lower bound on consumption to avoid the overshooting that often occurs with Newton's method. In particular, one could add the lines $c_t = F(k_t, z_t) + (1 - \delta)k_t - k_{t+1}$, $c_t = \max\{1.0\text{E-}5, c_t\}$, $c_{t+1} = F(k_{t+1}, z_{t+1}) + (1 - \delta)k_{t+1} - k_{t+2}$, and $c_{t+1} = \max\{1.0\text{E-}5, c_{t+1}\}$ just before the Euler equation is presented and write the marginal utilities as $U_1(c_t)$ and $U_1(c_{t+1})$.
(Speeding up Newton's Method) Newton's method can be sped up by using the solution for $k_{t+1}$ on iteration $j - 1$, or $k_{t+1}^j$, as the starting guess in the nonlinear equation on iteration $j$. The same is true if a version of the bisection method is used in which a guess for the solution is provided, instead of upper and lower bounds.
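The extended path algorithm can be sketched in a few dozen lines. The Python code below is an illustration, not the book's implementation: it assumes log utility, the production function F(k, z) = z k^α, illustrative parameter values, a linear initial guess for the path, and bisection (rather than Newton's method) for the inner one-equation problem. All function names are hypothetical. Nonpositive consumption is handled by returning signed infinities so that the bisection stays in the feasible region.

```python
import math

ALPHA, BETA, DELTA = 0.36, 0.96, 0.1       # illustrative parameters (assumed)


def F(k, z):
    return z * k**ALPHA


def F1(k, z):
    return ALPHA * z * k ** (ALPHA - 1.0)


def U1(c):
    return 1.0 / c                          # log utility (assumed)


def steady_state_capital(z_star=1.0):
    """Solve (6.12.2): 1 / beta = F1(k*, z*) + 1 - delta."""
    return (ALPHA * z_star / (1.0 / BETA - 1.0 + DELTA)) ** (1.0 / (1.0 - ALPHA))


def euler_residual(k0, k1, k2, z0, z1):
    """Lefthand side minus righthand side of (6.12.1); it is increasing in k1."""
    c0 = F(k0, z0) + (1.0 - DELTA) * k0 - k1
    c1 = F(k1, z1) + (1.0 - DELTA) * k1 - k2
    if c0 <= 0.0:
        return math.inf                     # k1 too high: nothing left to consume today
    if c1 <= 0.0:
        return -math.inf                    # k1 too low: cannot reach k2 tomorrow
    return U1(c0) - BETA * U1(c1) * (F1(k1, z1) + 1.0 - DELTA)


def solve_next(k0, k2, z0, z1, tol=1e-12):
    """Solve (6.12.3) for k_{t+1} by bisection, given k_t and the guess for k_{t+2}."""
    lo, hi = 1e-10, F(k0, z0) + (1.0 - DELTA) * k0
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if euler_residual(k0, mid, k2, z0, z1) > 0.0:
            hi = mid
        else:
            lo = mid
    return 0.5 * (lo + hi)


def extended_path(k_init, k_star, z, eps=1e-8, max_iter=5000):
    """Return the path k_1, ..., k_T with k_1 = k_init and k_T = k_star imposed."""
    T = len(z)
    guess = [k_init + (k_star - k_init) * t / (T - 1) for t in range(T)]
    for _ in range(max_iter):
        path = guess[:]                     # k_1 and k_T stay fixed
        for t in range(T - 2):              # periods 1, ..., T - 2 (zero-based index t)
            path[t + 1] = solve_next(path[t], guess[t + 2], z[t], z[t + 1])
        if sum(abs(a - b) for a, b in zip(path, guess)) < eps:
            return path
        guess = path
    raise RuntimeError("extended path iteration did not converge")
```

A call such as `extended_path(0.5 * steady_state_capital(), steady_state_capital(), [1.0] * 30)` traces out the transition from half the steady-state capital stock. The outer iteration converges only linearly, so several hundred passes may be needed for a tight tolerance.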

6.12.2 Multiple Shooting

Again suppose that the economy converges to the steady-state level of the capital stock, $k^*$, by period $T$. Let $z^*$ denote the steady-state level for the technological shock. Once again $k^*$ will be defined by (6.12.2). Recall that the Euler equation (6.12.1) is a second-order difference equation in the capital stock. Thus, two starting conditions are needed, $k_1$ and $k_2$. Now, $k_1$ is predetermined. So, the idea of the algorithm is to pick $k_2$ so that the capital stock will be $k^*$ at time $T$. This can be expressed in terms of finding the solution to one nonlinear equation in one unknown variable, $k_2$.

1. At the heart of the algorithm is constructing a function that returns a value for the terminal capital stock, $k_T$, given the two starting values, $k_1$ and $k_2$, and the sequence of technology shocks, $\{z_t\}_{t=1}^{\infty}$. Denote this function by $k_T = K_T(k_1, k_2; z_1, \cdots, z_T)$. Suppose one has a guess for the capital stock in period 2, denoted by $k_2^j$. The period-1 stock of capital, $k_1$, is a predetermined variable. Solve for the sequence of capital stocks $\{k_t^j\}_{t=3}^{T}$ using the Euler equation for capital accumulation. That is, given a guess for $k_t$ and $k_{t+1}$, denoted by $k_t^j$ and $k_{t+1}^j$, one can solve recursively for $k_{t+2}^j, k_{t+3}^j, \cdots, k_T^j$, using the second-order nonlinear difference equation
\[ U_1(F(k_t^j, z_t) + (1 - \delta)k_t^j - k_{t+1}^j) = \beta U_1(F(k_{t+1}^j, z_{t+1}) + (1 - \delta)k_{t+1}^j - k_{t+2}^j) \times [F_1(k_{t+1}^j, z_{t+1}) + (1 - \delta)]. \tag{6.12.4} \]
The above difference equation implicitly generates a sequence for the capital stocks described by the following:
\begin{align*}
k_3^j &= D(k_2^j, k_1^j, z_2, z_1) \equiv K_3(k_1^j, k_2^j; z_1, z_2), \\
k_4^j &= D(k_3^j, k_2^j, z_3, z_2) = D(D(k_2^j, k_1^j, z_2, z_1), k_2^j, z_3, z_2) \equiv K_4(k_1^j, k_2^j; z_1, z_2, z_3), \\
&\;\;\vdots \\
k_T^j &= K_T(k_1^j, k_2^j; z_1, \cdots, z_T).
\end{align*}
This ultimately gives a value for $k_T^j$ that is effectively based on $k_1^j$ and $k_2^j$. Since the first capital stock is predetermined, so that $k_1^j = k_1$, this solution can be represented by
\[ k_T^j = K_T(k_1, k_2^j; z_1, \cdots, z_T). \]
On the computer, the function $K_T$ will involve writing a "for" or "do" loop as follows:

(a) Start off in period 3. Given the starting value $k_1$ and the guess $k_2^j$ one gets $k_3^j$.
(b) Move on to period 4. Given $k_2^j$ and $k_3^j$ one can obtain $k_4^j$, and so on.
(c) Proceed down the time path in the above manner until one gets to period $T$. Here one will enter the period with $k_{T-2}^j$ and $k_{T-1}^j$. The goal is to solve for the final capital stock, $k_T^j$.
2. At time $T$ it is desired that $k_T^j \simeq k^*$, the steady-state capital stock. So, the algorithm amounts to solving the following nonlinear equation for $k_2$,
\[ K_T(k_1, k_2; z_1, \cdots, z_T) - k^* = 0, \tag{6.12.5} \]
where the function $K_T(k_1, k_2; z_1, \cdots, z_T)$ is characterized in Step 1. This nonlinear equation can be solved by either bisection or Newton's method. Either method essentially involves iterating on the starting condition, $k_2^j$, until $|k_T^j - k^*| < \varepsilon$, where the nonlinear equation is given by (6.12.5).

(a) When a value for $k_2^j$ is found that sets $|k_T^j - k^*| < \varepsilon$, the nonlinear equation solver will terminate since a solution has been found.
(b) If not, the nonlinear equation solver will try a new guess for $k_2$, denoted by $k_2^{j+1}$. It will repeat step one using this new guess.

Multiple Shooting Algorithm

Find the Zero of K_T(k_1, k_2; z_1, ..., z_T) - k* = 0

Predetermined    Computed    Target
k_1              k_2         K_T(k_1, k_2; z_1, ..., z_T) = k*

Loop inside Function, K_T(k_1, k_2; z_1, ..., z_T)

Time, t    Predetermined         Computed
3          k_1        k_2        k_3
4          k_2        k_3        k_4
5          k_3        k_4        k_5
...        ...        ...        ...
T          k_{T-2}    k_{T-1}    k_T
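The two layers of the algorithm, the loop inside K_T and the outer search over k_2, can be sketched as follows. The Python code is an illustration under stated assumptions: log utility (so the Euler equation can be solved for next-period consumption in closed form through the inverse marginal utility), the production function F(k, z) = z k^α, illustrative parameter values, and bisection for the outer problem. All function names are hypothetical. A path on which capital would turn nonpositive is treated as a guess for k_2 that was too low.

```python
import math

ALPHA, BETA, DELTA = 0.36, 0.96, 0.1       # illustrative parameters (assumed)


def F(k, z):
    return z * k**ALPHA


def F1(k, z):
    return ALPHA * z * k ** (ALPHA - 1.0)


def U1(c):
    return 1.0 / c                          # log utility (assumed)


def U1_inv(m):
    return 1.0 / m                          # inverse of the marginal utility function


def steady_state_capital(z_star=1.0):
    """Solve (6.12.2): 1 / beta = F1(k*, z*) + 1 - delta."""
    return (ALPHA * z_star / (1.0 / BETA - 1.0 + DELTA)) ** (1.0 / (1.0 - ALPHA))


def forward_path(k1, k2, z):
    """Iterate (6.12.4) forward from (k1, k2); return None if capital hits zero."""
    path = [k1, k2]
    for t in range(len(z) - 2):
        k_prev, k_curr = path[t], path[t + 1]
        c_prev = F(k_prev, z[t]) + (1.0 - DELTA) * k_prev - k_curr
        gross_return = F1(k_curr, z[t + 1]) + 1.0 - DELTA
        c_curr = U1_inv(U1(c_prev) / (BETA * gross_return))
        k_next = F(k_curr, z[t + 1]) + (1.0 - DELTA) * k_curr - c_curr
        if k_next <= 0.0:
            return None
        path.append(k_next)
    return path


def shooting_gap(k1, k2, k_star, z):
    """K_T(k1, k2; z) - k*, with -inf signalling that the guess for k2 was too low."""
    path = forward_path(k1, k2, z)
    return -math.inf if path is None else path[-1] - k_star


def multiple_shooting(k1, k_star, z, tol=1e-13):
    """Bisect on k2 so that the terminal capital stock equals k_star; return the path."""
    lo, hi = 1e-10, F(k1, z[0]) + (1.0 - DELTA) * k1
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if shooting_gap(k1, mid, k_star, z) > 0.0:
            hi = mid
        else:
            lo = mid
    return forward_path(k1, 0.5 * (lo + hi), z)
```

Because errors in k_2 are magnified along the path, the tolerance on k_2 has to be much tighter than the accuracy wanted for k_T, and very long horizons can exhaust floating-point precision.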

The idea of the algorithm is to pick the starting value for the period-2 capital stock so that the second-order difference equation $k_{t+2} = D(k_{t+1}, k_t, z_{t+1}, z_t)$ converges to the steady state at the end of $T$ periods. Now, if $k_2^j \ne k_2$ the computed sequence of $k$'s will tend to diverge away in an explosive manner from the true solution.
To see this, suppose that the economy starts off from a value for $k_1$ that lies in some small neighborhood of the final steady state. Now, make a guess for $k_2$ that lies above the true value. By using (6.12.4) when $t = 1$, it is easy to calculate that
\begin{align*}
\frac{dk_3}{dk_2} &= \frac{U_{11}(\cdot_1) + \beta[F_1(\cdot_2) + (1 - \delta)]^2 U_{11}(\cdot_2) + \beta F_{11}(\cdot_2)U_1(\cdot_2)}{\beta[F_1(\cdot_2) + (1 - \delta)]U_{11}(\cdot_2)} \\
&\simeq 1 + F_1(\cdot_2) + (1 - \delta) + \beta F_{11}(\cdot_2)U_1(\cdot_2)/U_{11}(\cdot_2) > 1.
\end{align*}
The inequality follows from the fact that in a vicinity of the steady state $F_1(\cdot_t) + (1 - \delta) \simeq 1/\beta$, $U_{11}(\cdot_t) = U_{11}(\cdot_{t+1})$, etc. This says that the impact of choosing a value for $k_2$ that is too large will be magnified on $k_3$. Employing (6.12.4) again for when $t = 2$ then gives
\begin{align*}
\frac{dk_4}{dk_2} ={}& \frac{U_{11}(\cdot_2)[dk_3/dk_2 - F_1(\cdot_2) - (1 - \delta)]}{\beta[F_1(\cdot_3) + (1 - \delta)]U_{11}(\cdot_3)} \\
&+ \frac{\{\beta[F_1(\cdot_3) + (1 - \delta)]^2 U_{11}(\cdot_3) + \beta F_{11}(\cdot_3)U_1(\cdot_3)\}\,dk_3/dk_2}{\beta[F_1(\cdot_3) + (1 - \delta)]U_{11}(\cdot_3)} \\
\simeq{}& 1 + \beta F_{11}(\cdot_2)U_1(\cdot_2)/U_{11}(\cdot_2) \\
&+ \{F_1(\cdot_3) + (1 - \delta) + \beta F_{11}(\cdot_3)U_1(\cdot_3)/U_{11}(\cdot_3)\}\,\frac{dk_3}{dk_2} \\
>{}& \frac{dk_3}{dk_2}.
\end{align*}
Hence, the error in the startup value for the difference equation will cascade into the future in an explosive manner. The situation is portrayed in Figure 6.12.1.

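Figure 6.12.1: Multiple Shooting. (The plot shows capital against time $t = 1, 2, 3, \ldots, T$, starting from $k_1$: the true path through $k_2$ and an explosive path generated by the guess $k_2^j$.)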

Reverse Shooting
The difference equation (6.12.1) could also be run backwards. Specifically, note that one could use (6.12.1) to solve for a value of $k_t^j$ given values for $k_{t+1}^j$ and $k_{t+2}^j$. Again, this describes one equation in one unknown. Represent the solution by
\[ k_t = \overleftarrow{D}(k_{t+1}, k_{t+2}; z_{t+1}, z_t). \]
Start the system off at time $T$ from the terminal condition $k_T = k^*$ and $z_T = z^*$. Given a value for $k_{T-1}$ one could run the above difference equation backwards in time to get $k_{T-2}, k_{T-3}, \cdots, k_1$. This iterative scheme can be thought of as yielding a solution for $k_1$ as a function of $k_{T-1}$ and $k^*$. Represent this function by $K_1(k_{T-1}, k^*; z_T, z_{T-1}, \cdots, z_1)$. Clearly, $k_{T-1}$ should be chosen so that
\[ K_1(k_{T-1}, k^*; z_T, z_{T-1}, \cdots, z_1) - k_1 = 0; \]
that is, when the difference equation is run backward it should go through the initial condition $k_1$ at time 1.

Reverse Shooting Algorithm

Find the Zero of K_1(k_{T-1}, k*; z_T, z_{T-1}, ..., z_1) - k_1 = 0

Predetermined    Computed    Target
k_T = k*         k_{T-1}     K_1(k_{T-1}, k*; z_T, z_{T-1}, ..., z_1) = k_1

Loop inside Function, K_1(k_{T-1}, k*; z_T, z_{T-1}, ..., z_1)

Time, t    Predetermined         Computed
T-2        k_T        k_{T-1}    k_{T-2}
T-3        k_{T-1}    k_{T-2}    k_{T-3}
...        ...        ...        ...
1          k_3        k_2        k_1

6.13 MATLAB: A Worked-Out Example

6.13.1 A Dynamic Monopoly Problem

The monopoly problem presented in Chapter 2 is now made dynamic. In each period $t$ the monopolist faces the linear demand function
\[ p_t = \alpha - \frac{\beta}{2}o_t, \]
where $p_t$ is the period-$t$ price of the product and $o_t$ is the monopolist's output in this period. Demand is decreasing in price, $p_t$. The monopolist now produces according to the quadratic cost function
\[ c_t = \frac{\gamma}{2}(o_t - \kappa o_{t-1})^2, \]
where $c_t$ is period-$t$ total cost and $o_{t-1}$ is the monopolist's level of output in period $t - 1$. This cost function introduces a dynamic element into the analysis. By producing more in period $t - 1$ the monopolist can reduce his costs in period $t$. Think about this as adding learning by doing into the analysis.[8]

[8] For the learning-by-doing interpretation to make sense, assume that $o_t > \kappa o_{t-1}$. This will be the case in the setting discussed here.

The monopolist's period-$t$ revenue is
\[ p_t o_t = \alpha o_t - \frac{\beta}{2}o_t^2. \]
This implies that his period-$t$ profits, $\pi_t$, read
\[ \alpha o_t - \frac{\beta}{2}o_t^2 - \frac{\gamma}{2}(o_t - \kappa o_{t-1})^2. \]
Therefore, the monopolist's maximization problem is to pick his output in each period to maximize the present value of his profits. Suppose that the monopolist's discount factor is $\delta$. The mathematical transliteration of this problem is
\[ \max_{\{o_t\}_{t=1}^{\infty}} \Big\{ \sum_{t=1}^{\infty} \delta^{t-1}\Big[\alpha o_t - \frac{\beta}{2}o_t^2 - \frac{\gamma}{2}(o_t - \kappa o_{t-1})^2\Big] \Big\}. \]
Note that $o_t$ appears in exactly two periods in this optimization problem: in periods $t$ and $t + 1$. To see this, write out the objective function as
\[ \cdots + \delta^{t-1}\Big[\alpha o_t - \frac{\beta}{2}o_t^2 - \frac{\gamma}{2}(o_t - \kappa o_{t-1})^2\Big] + \delta^{t}\Big[\alpha o_{t+1} - \frac{\beta}{2}o_{t+1}^2 - \frac{\gamma}{2}(o_{t+1} - \kappa o_t)^2\Big] + \cdots. \]
The first-order condition associated with this maximization problem is
\[ \underbrace{\alpha - \beta o_t}_{MR} = \underbrace{\gamma(o_t - \kappa o_{t-1}) - \delta \gamma \kappa(o_{t+1} - \kappa o_t)}_{MC}, \]
which sets marginal revenue, MR, equal to marginal cost, MC. Observe that when the monopolist increases his output in period $t$ he will reduce his cost in period $t + 1$ (at least when $o_{t+1} > \kappa o_t$). The above first-order condition represents a linear second-order difference equation in output.

Steady-State Output
Let o ∗ denote the steady-state level of output. It must solve the equa-
tion
α − βo ∗ = γ(o ∗ − κo ∗ ) − δγκ (o ∗ − κo ∗ )
α − βo ∗ = γ(1 − κ )o ∗ − δγκ (1 − κ )o ∗ ,
which implies
α α
o∗ = = .
β + γ(1 − κ ) − δγκ (1 − κ ) β + γ(1 − κ )(1 − δκ )
Observe that the steady-state level of output is increasing in κ. So,
adding learning in the model raises the steady-state level of output.
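
The comparative static in $\kappa$ can be checked numerically. What follows is a minimal MATLAB sketch, not part of the original program; the grid for $\kappa$ and the names ostarfun, kappagrid, ostargrid and isincreasing are assumptions made here for illustration. The parameter values match those used in the program later in this section.

```matlab
% Steady-state output as a function of kappa (illustrative values)
alpha = 1; beta = 0.5; gamma = 0.5; delta = 0.96;
ostarfun = @(kappa) alpha./(beta + gamma*(1-kappa).*(1-delta*kappa));
kappagrid = 0:0.1:0.9;             % assumed grid, with kappa < 1
ostargrid = ostarfun(kappagrid);   % steady-state output on the grid
disp([kappagrid' ostargrid'])      % column 1: kappa, column 2: ostar
% Logical flag: does ostar rise with kappa on this grid?
isincreasing = all(diff(ostargrid) > 0);
```

With $\kappa = 0$ the formula collapses to $\alpha/(\beta+\gamma)$, the output level of a monopolist without learning by doing.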

The Decision Rule Approach

Conjecture that the monopolist’s decision rule has the following linear
form:
$$o_t = \eta + \psi o_{t-1}. \qquad (6.13.1)$$
In a steady state,
$$o^* = \frac{\eta}{1-\psi}.$$
Hence, the constant $\eta$ must solve
$$\eta = \frac{(1-\psi)\alpha}{\beta + \gamma(1-\kappa) - \delta\gamma\kappa(1-\kappa)}. \qquad (6.13.2)$$

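If this is the case, then $o_{t+1} = \eta + \psi o_t$ and one can rewrite the first-order condition as
$$\alpha - \beta o_t = \gamma(o_t - \kappa o_{t-1}) - \delta\gamma\kappa(\eta + \psi o_t - \kappa o_t).$$
Therefore,
$$(\gamma + \beta - \delta\gamma\kappa\psi + \delta\gamma\kappa^2)o_t = \alpha + \delta\gamma\kappa\eta + \gamma\kappa o_{t-1},$$
so that
$$o_t = \frac{\alpha + \delta\gamma\kappa\eta}{\gamma + \beta - \delta\gamma\kappa\psi + \delta\gamma\kappa^2} + \frac{\gamma\kappa}{\gamma + \beta - \delta\gamma\kappa\psi + \delta\gamma\kappa^2}\, o_{t-1}.$$
This implies that
$$\psi = \frac{\gamma\kappa}{\gamma + \beta - \delta\gamma\kappa\psi + \delta\gamma\kappa^2}.$$
Therefore, the solution for $\psi$ solves the quadratic equation
$$-\delta\gamma\kappa\psi^2 + (\gamma + \beta + \delta\gamma\kappa^2)\psi - \gamma\kappa = 0.$$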

Since this is a quadratic equation there will be two roots. Observe that
the lefthand side of the above equation is negative when ψ = 0 and
that its derivative is positive at this point. The expression is positive
when ψ = 1 (since the steady-state level of output must be positive).
The lefthand side eventually becomes negative as ψ becomes large. It
is easy to solve this equation for ψ on the computer. Given the solution
for ψ one can recover the solution for η using (6.13.2). The time path
for output is then obtained by iterating on (6.13.1) starting from o0 = 0.
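
The steps of the decision-rule approach can be sketched compactly. The block below is a hedged illustration rather than the program listed later: the horizon T and the names o and res are assumptions made here, and the vector o is shifted by one position so that o(1) holds period-0 output. The final lines evaluate the first-order condition along the simulated path as a check.

```matlab
% Decision-rule approach: a sketch with illustrative parameter values
alpha = 1; beta = 0.5; gamma = 0.5; kappa = 0.9; delta = 0.96;
ostar = alpha/(beta + gamma*(1-kappa)*(1-delta*kappa)); % steady state
% Coefficients of the quadratic in psi: squared, linear, constant
p = [-delta*gamma*kappa, gamma + beta + delta*gamma*kappa^2, -gamma*kappa];
psi = min(roots(p));       % smaller root, which lies between 0 and 1
eta = (1 - psi)*ostar;     % constant term, from (6.13.2)
T = 10;                    % assumed horizon
o = zeros(T+1,1);          % o(1) is period 0, so o(t+1) is period t
for t = 1:T
    o(t+1) = eta + psi*o(t);   % iterate on (6.13.1)
end
% Residual of the first-order condition in periods 1,...,T-1
res = alpha - beta*o(2:T) - gamma*(o(2:T) - kappa*o(1:T-1)) ...
      + delta*gamma*kappa*(o(3:T+1) - kappa*o(2:T));
disp(max(abs(res)))        % should be zero up to rounding error
```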

Multiple Shooting
Now, the second-order difference equation for output, associated with
the first-order condition arising from the above maximization problem,
can be rewritten as
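$$o_{t+1} = -\frac{\alpha}{\delta\gamma\kappa} + \frac{\beta + \gamma + \delta\gamma\kappa^2}{\delta\gamma\kappa}\, o_t - \frac{\gamma\kappa}{\delta\gamma\kappa}\, o_{t-1},$$
or
$$o_{t+1} = d + e o_t + f o_{t-1},$$
with $d \equiv -\alpha/(\delta\gamma\kappa)$, $e \equiv (\beta + \gamma + \delta\gamma\kappa^2)/(\delta\gamma\kappa)$, and $f \equiv -\gamma\kappa/(\delta\gamma\kappa)$.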

This second-order difference equation can be solved using multiple
shooting. Given two starting conditions, o0 and o1 , the above equation
can be iterated forward in time to get o2 , o3 , o4 ,· · · . The monopo-
list only starts producing in period 1 so o0 = 0. The idea underlying
multiple shooting is to select o1 so that limt→∞ ot = o ∗ ; i.e., so that
as the end of time approaches the time path of output converges to
the steady-state level of output. This is operationalized by running the
model over some finite time horizon. In particular, say T periods. Out-
put in the first period, o1 , is then chosen, using a nonlinear equation
solver, so that output in the last period, T, is equal to the steady-state
level of output, or o T = o ∗ . It is a good idea not to set T too large
because the time path for output will often have an explosive behavior
when trying out guesses for o1 .
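
The shooting idea can be sketched in a few self-contained lines. This is an illustration under stated assumptions, not the program given later (which uses a separate m file and global variables): here the forward iteration is written as a matrix power so that terminal output is an anonymous function of the guess for $o_1$. The names A, oT, o1 and oms are assumptions made for this sketch.

```matlab
% Multiple shooting: a sketch with illustrative parameter values
alpha = 1; beta = 0.5; gamma = 0.5; kappa = 0.9; delta = 0.96;
ostar = alpha/(beta + gamma*(1-kappa)*(1-delta*kappa));
d = -alpha/(delta*gamma*kappa);
e = (beta + gamma + delta*gamma*kappa^2)/(delta*gamma*kappa);
f = -gamma*kappa/(delta*gamma*kappa);
T = 10;                         % keep the horizon short (explosive root)
% Stack the state as s_t = [o_t; o_{t-1}; 1], so that s_{t+1} = A*s_t
A = [e f d; 1 0 0; 0 0 1];
% Terminal output o_T implied by a guess for o_1, given o_0 = 0
oT = @(o1) [1 0 0]*(A^(T-1))*[o1; 0; 1];
% Choose o_1 so that o_T equals the steady-state level
o1 = fzero(@(x) oT(x) - ostar, 0.5*ostar);
% Recover the whole path; oms(1) is period 0, oms(t+1) is period t
oms = zeros(T+1,1);
oms(2) = o1;
for t = 2:T
    oms(t+1) = d + e*oms(t) + f*oms(t-1);
end
disp(abs(oms(T+1) - ostar))     % terminal error
```

Because the difference equation is linear, $o_T$ is an affine function of $o_1$, so the solver has an easy job here; in a nonlinear model the same structure applies but the forward iteration would have to be done in a loop.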

Extended Path Method

Here the second-order difference equation appears as
$$o_t = \frac{\alpha}{\beta + \gamma + \delta\gamma\kappa^2} + \frac{\gamma\kappa}{\beta + \gamma + \delta\gamma\kappa^2}\, o_{t-1} + \frac{\delta\gamma\kappa}{\beta + \gamma + \delta\gamma\kappa^2}\, o_{t+1}.$$
This can be expressed as
$$o_t = g + h o_{t-1} + i o_{t+1},$$
where $g \equiv \alpha/(\beta + \gamma + \delta\gamma\kappa^2)$, $h \equiv (\gamma\kappa)/(\beta + \gamma + \delta\gamma\kappa^2)$, and $i \equiv (\delta\gamma\kappa)/(\beta + \gamma + \delta\gamma\kappa^2)$. Given values for $o_{t-1}$ and $o_{t+1}$, this difference equation will give a solution for $o_t$. In any period $t$, the past level of output, $o_{t-1}$, will be known. The future value for output, $o_{t+1}$, is read off of a guess path, which for iteration $j$ is denoted by $\{o_t^j\}_{t=0}^{T}$. Note that the guess for the final period is set so that $o_T^j = o^*$, because the output in period $T$ is set equal to its steady-state value. The guess for the initial period is zero so that $o_0^j = 0$. So, in iteration $j$ the above difference equation can be started off from $o_0 = 0$ to get a solution for $o_1$, while setting $o_2 = o_2^j$. One can then move to period 2. Here, one solves for $o_2$, using the previous answer for $o_1$, while setting $o_3 = o_3^j$. One proceeds down the path for all $t \le T-1$. This gives a time path $\{o_t\}_{t=0}^{T}$, where again by construction $o_0 = 0$ and $o_T = o^*$. Next, check whether the difference between $\{o_t\}_{t=0}^{T}$ and $\{o_t^j\}_{t=0}^{T}$ is sufficiently small. If not, set $\{o_t^{j+1}\}_{t=0}^{T} = \{o_t\}_{t=0}^{T}$ and proceed on to iteration $j+1$.
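
The iteration just described can be sketched as follows. This is a self-contained illustration, not the m file given later; the linear initial guess, the tolerance, the iteration cap, and the names opath, guess and icoef (used in place of $i$) are assumptions made here.

```matlab
% Extended path: a sketch with illustrative parameter values
alpha = 1; beta = 0.5; gamma = 0.5; kappa = 0.9; delta = 0.96;
ostar = alpha/(beta + gamma*(1-kappa)*(1-delta*kappa));
denom = beta + gamma + delta*gamma*kappa^2;
g = alpha/denom;                 % constant term
h = gamma*kappa/denom;           % coefficient on lagged output
icoef = delta*gamma*kappa/denom; % coefficient on future output
T = 10;
guess = linspace(0, ostar, T+1)';  % assumed guess path, periods 0,...,T
tol = 1e-9; maxit = 500; err = 1; it = 0;
while it < maxit && err >= tol
    opath = guess;               % keeps o_0 = 0 and o_T = ostar
    for t = 2:T                  % index t corresponds to period t-1
        % past value from the revised path, future value from the guess
        opath(t) = g + h*opath(t-1) + icoef*guess(t+1);
    end
    err = norm(opath - guess);   % distance between revised and guess paths
    guess = opath;               % the revised path is the next guess
    it = it + 1;
end
disp(it)                         % number of iterations used
```

Using the already-revised value of $o_{t-1}$ within a sweep, rather than the guess for it, is what distinguishes this scheme from updating every period off the old path; here $h + i < 1$ and the iteration settles down quickly.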

6.13.2 The MATLAB code

Below is a MATLAB program that solves the dynamic monopolist's problem using three methods. First, the model is solved taking the decision-rule approach. Second, the solution to the model is computed using multiple shooting. Third, the extended path method is used.

MATLAB, Main Program-main.m

This is the main m file to solve the monopoly problem. The decision-rule approach involves finding the roots of a polynomial. This is done using the built-in MATLAB function roots. To solve the model using multiple shooting, a starting value for second-period output is found using the nonlinear equation solver fzero. This calls the m file multshooting, which sets up the difference equation that is being solved. The global statement is used to pass some variables back and forth between the main program and the m file multshooting. Last, the extended path method makes a guess path for the evolution of output. Using this guess path, a revised path for output is computed. This involves solving a difference equation at each point in time. This difference equation is contained in the m file extendpath.
% main.m
% Dynamic Monopoly Problem--Main Program
clear all % Clear all numbers from previous runs
clc % Clear screen
% Declare some global variables that will be passed into the functions
% multshooting (multiple shooting) and extendpath (extended path)
global d e f g h i ostar ovecms T

% Set parameters for model

% Demand curve
alpha = 1; % constant
beta = 0.5; % slope

% Cost function
gamma = 0.5; % quadratic term
kappa = 0.9; % cost reduction term

% Discount factor
delta = 0.96;

% Time horizon for simulation
T = 10; % Number of periods

% Compute steady-state level of output
ostar = alpha/(beta + gamma*(1-kappa)*(1-delta*kappa));

% METHOD 1: Solve model taking the Decision-Rule Approach

% Set up the Quadratic Formulae for Psi
a = -delta*gamma*kappa; % Coefficient on squared term
b = gamma + beta + delta*gamma*kappa^2; % Linear term
c = -gamma*kappa; % Constant term
p = [a b c]; % Coefficients on quadratic

% Plot the quadratic equation for psi
% Construct grid to plot psi over going from 0 to 1
% by increments of 0.1
psigrid = 0:.1:1; % Grid for psi for plotting
quad4psi = a*psigrid.^2 + b*psigrid + c; % The quadratic
figure(1)
plot(psigrid, quad4psi)
title('The quadratic for psi')
xlabel('Psi')
ylabel('Quadratic')

% Find roots of quadratic and take the minimum one.
psi = min(roots(p)); % Slope term on law of motion for output
eta = (1-psi)*ostar; % Constant term on law of motion

% Iterate on difference equation for output
% Set up vector to store outputs for decision-rule approach
ovec = zeros(T,1);
time = (1:T)'; % Set up vector for time
% Note that the first element in these vectors, ovec(1,1) and time(1,1),
% actually corresponds to period zero
for t = 2:T
    ovec(t,1) = eta + psi*ovec(t-1,1);
    % Starting value for loop is ovec(1,1) = 0
end
pvec = alpha - beta*ovec/2; % Compute prices over time

% Plot results
% Here a graph with 6 panels is constructed in the form of a 3 by 2 matrix
% The last number refers to the position of the plot
% in the matrix, running 1 to 6
figure(2)
subplot(3,2,1) % Multipanel graph
plot(time, ovec)
title('Output--decision rule approach')
xlabel('Time')
ylabel('Output')
subplot(3,2,2) % Multipanel graph
plot(time, pvec)
title('Price--decision rule approach')
xlabel('Time')
ylabel('Price')

% METHOD 2: Solve model using multiple shooting

% Set up terms for second-order difference equation
% These are global variables
% Passed into multshooting.m
d = -alpha/(-a); % Constant term
e = b/(-a); % Coefficient on lagged output
f = c/(-a); % Coef on lagged output, two periods ago
% Create vector to store outputs using multiple shooting
ovecms = zeros(T,1); % A global variable
o2guess = .5*ovec(2,1);
o2 = fzero(@multshooting, o2guess);

% Check solution
errorms = abs(multshooting(o2)); % absolute value of error in solution
if errorms >= 1.E-9
    disp('Solution has not been found')
    disp(errorms) % error
end
% Note that the vector ovecms is passed back from the function
% multshooting using the global statement.
% The statement multshooting(o2)
% ensures that this vector is computed
% using MATLAB's final solution for o2
pvecms = alpha - beta*ovecms/2; % Compute prices over time

% Plot results
subplot(3,2,3)
plot(time, ovecms)
title('Output--multiple shooting')
xlabel('Time')
ylabel('Output')
subplot(3,2,4)
plot(time, pvecms)
title('Price--multiple shooting')
xlabel('Time')
ylabel('Price')

% METHOD 3: Solve model using extended-path method

% Set up terms for second-order difference equation
% These are global variables
% Passed into extendpath.m
g = alpha/b; % Constant term
h = gamma*kappa/b; % Coefficient on lagged output
i = -a/b; % Coefficient on future output
% Alternative starting guesses (not used):
% guesspath = zeros(T,1);
% guesspath = ovecms;
guesspath = 0:ostar/(T-1):ostar; % Linear guess path from 0 to ostar
guesspath = guesspath';
errorep = 1;
iterationep = 1;

while iterationep <= 100 && errorep >= 1.E-9
    ovecep = extendpath(guesspath);
    errorep = norm(ovecep-guesspath);
    guesspath = ovecep;
    iterationep = iterationep + 1;
end
disp('Number of iterations for extended path method')
disp(iterationep-1)
pvecep = alpha - beta*ovecep/2; % Compute prices over time

% Plot results
subplot(3,2,5)
plot(time, ovecep) % Extended-path output
title('Output--extended path')
xlabel('Time')
ylabel('Output')
subplot(3,2,6)
plot(time, pvecep) % Extended-path prices
title('Price--extended path')
xlabel('Time')
ylabel('Price')

A Function Specifying the Second-Order Difference to be Solved: multshooting.m

This is a function setting out the second-order difference equation to be solved for using multiple shooting. It solves for terminal output, $o_T$, as a function of the guess for the second starting value of output. This is $o_1$ in the notation of the text and the argument o2 in the code, because the first element of the vector ovecms holds the starting level of output, $o_0 = 0$. The idea is to find the guess that closes the distance between $o_T$ and the steady-state level of output, $o^*$. The vector ovecms, the steady-state level of output, and the time horizon are passed into the function using the global command. The coefficients for the second-order difference equation are also passed in using the global command.
function [zero] = multshooting(o2)
% multshooting.m
% This m file solves for the terminal level of output by solving
% the 2nd order difference equation by making a guess for the
% second starting value for output

% The global statement is used to pass some information into the
% function from the main program. The variable ovecms, which
% contains the time path for output is also passed
% back into the main program

global d e f ostar ovecms T

ovecms(2,1) = o2; % Guess for second period output

% Iterate on difference equation starting from period 3
% using starting values for the first two periods for output
for t = 3:T
    ovecms(t,1) = d + e*ovecms(t-1,1) + f*ovecms(t-2,1);
    % Note that the starting value for output, ovecms(1,1) = 0
    % Likewise, the second starting value, ovecms(2,1), is equal
    % to the guess
end

% Examine the difference between the terminal level of output
% and the steady-state level of output
zero = ostar - ovecms(T,1);
end

A Function Specifying the Second-Order Difference to be Solved: extendpath.m

This is a function setting out the second-order difference equation to be solved for using the extended path method. Given a guess path for output, $\{o_t^j\}_{t=1}^{T}$, it computes a revised path for output, $\{o_t\}_{t=1}^{T}$. The level for $o_1$ is contained in the first element of the returned vector, which the main program stores in ovecep. The coefficients for the second-order difference equation, the steady-state level of output, and the time horizon are passed in using the global command.
function [revisedpath] = extendpath(guesspath)
% extendpath.m
% This m file iterates on the 2nd-order difference equation for output
% using a guess path to insert a value for the future level of output. The
% guess path, called guesspath, is inserted in as an input into
% the function.
% The global statement is used to pass some information into the function
% from the main program. The function outputs the variable revisedpath,
% which contains the revised time path for output.

global g h i ostar T

revisedpath = zeros(T,1); % Starting level of output is zero.
revisedpath(T,1) = ostar; % Terminal output is the steady-state level.
% Iterate on difference equation starting from period 2 using starting
% value for the first period. There is no need to solve for the last
% period output level since this is taken to be the steady-state level of
% output.

for t = 2:T-1
    revisedpath(t,1) = g + h*revisedpath(t-1,1) + i*guesspath(t+1,1);
    % Note that the starting value for output or revisedpath(1,1) = 0
    % Likewise, the terminal value, or revisedpath(T,1), is equal to
    % ostar.
end

end

Output from the Program

The program generates the graphs shown in Figures 6.13.1 and 6.13.2. Observe that the three techniques give the same solution for the time path of output and prices.
Figure 6.13.1: The quadratic equation for ψ. Note that there is only one root between 0 and 1. [Plot titled "The quadratic for psi": the quadratic plotted against Psi over the interval from 0 to 1.]

Figure 6.13.2: This is the output from the program for the dynamic monopoly model. Observe that the three techniques give the same solution for the time path of output and prices. [Six panels plotting output and price against time for the decision rule approach, multiple shooting, and the extended path method.]

7 Malthus to Solow

“In October 1838, that is, fifteen months after I had begun my system-
atic inquiry, I happened to read for amusement Malthus on Population,
and being well prepared to appreciate the struggle for existence which
everywhere goes on from long-continued observation of the habits of
animals and plants, it at once struck me that under these circumstances
favourable variations would tend to be preserved, and unfavourable
ones to be destroyed. The results of this would be the formation of a
new species. Here, then I had at last got a theory by which to work.”
Charles Darwin (1876)

7.1 Introduction

The goal here is to model the transition from a world where living conditions were stagnant over a long period of time to a world with rising living standards. [Footnote 1: This section is based on Hansen and Prescott (2002).] The analysis presumes the existence of two technologies: Malthus and Solow. The preindustrial era uses a land-intensive technology. This constant-returns-to-scale technology also employs capital and labor. Productivity for this technology grows at a very slow rate. Land is in fixed supply. The necessity to use land places a drag on growth. This technology is dubbed the Malthus technology. The modern era uses a constant-returns-to-scale technology employing just capital and labor. Productivity for this technology grows at a faster clip than the preindustrial one, although its productivity starts off from a very low level. This technology is labeled the Solow technology. Both technologies are always available. At low levels of development it pays only to use the Malthus technology. As the economy develops it becomes profitable also to use the Solow technology. The Malthus technology fades away asymptotically.

[Margin note: Thomas R. Malthus (1766-1834) was an English cleric and economist. He wrote the famous book An Essay on the Principle of Population, which postulated that the size of a population is limited by the productive capacity of land. Charles Darwin credits Malthus's work as being instrumental in forming his theory of evolution.]

[Margin note: Robert M. Solow (1923-) is an American economist who is best known for his work on economic growth. He broke down the sources of economic growth into changes in the labor supply, increases in the capital stock, and technological progress. He won the Nobel Prize in 1987.]

7.2 A Graphical Exposition of Malthusian Theory

Malthusian theory states the size of the population will be regulated by the productive capacity of the economy. Figure 7.2.1 portrays the situation. Start with the lower panel. Income per worker, $y$, is negatively related to the number of workers, $n$, as the curve in the lower panel of the diagram portrays. This occurs because land is fixed in supply.

Turn to the upper panel. Fertility is increasing in income because par-
ents can better support larger families. Likewise, the mortality rate
declines in income, since the diseases related to poverty fall. The per-
worker level of income associated with a stable population size, n∗ , is
given by y∗ . If income per worker was at some higher level, say y0 ,
then population size would increase. This would occur since fertility
would exceed mortality. This expansion in population would lead to a
decline in income per worker until it converges to y∗ .

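Figure 7.2.1: Malthusian equilibrium. [Upper panel: the fertility rate and the mortality rate plotted against income per person, $y$, with $y^*$ marked. Lower panel: the number of people, $n$, plotted against income per person, with $n^*$ marked.]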

7.3 Facts

7.3.1 England, 1275-1800–Malthus Era

Real wages were roughly constant for a long period of time in prein-
dustrial England. When population fell, in the Black Death, real wages
rose. This is in accord with Malthusian theory. Here wages adjust
to limit the size of the population–see Figure 7.3.1. Malthusian the-
ory predicts that population and land rents will rise and fall together.
They did over this period–see Figure 7.3.2.
Figure 7.3.1: Population and real wages: England, 1279-1800. Source: Hansen and Prescott (2002).

Figure 7.3.2: Population and real land rents: England, 1279-1800. Source: Hansen and Prescott (2002).

7.3.2 England 1800-1989–Solow Era

By contrast, in the industrial era population growth did not lead to falling real wages, as Malthusian theory would predict (see Figure 7.3.3). It is hard to see a relationship between population growth and labor productivity, and the Solow model doesn't predict one (see Figure 7.3.4). The value of farmland relative to GDP fell (see Figure 7.3.5).

7.4 The Model

The vehicle for analysis is a two-period overlapping generations model.

At any point of time, there are two generations of adults, young and
old. People only work while young. They supply one unit of labor
which earns a wage. Old people are retired. They live off of the sav-
ings they undertook when young. The young can save in the form
of capital or land. Capital depreciates fully across periods. Capital is
reproducible. Land lasts forever. It is fixed in supply at one.
There are two technologies, a primitive one and an advanced one.
Either or both can be operated in a period. The primitive technology is
labeled as the Malthus technology. Here output, $y_m$, is produced using capital, $k_m$, labor, $n_m$, and land, $l_m$, according to
$$y_m = a_m k_m^{\phi} n_m^{\mu} l_m^{1-\phi-\mu}.$$

The key factor in the Malthus technology is its use of land, lm . The ad-
vanced technology is dubbed the Solow technology. Under the Solow
technology production is governed by

$$y_s = a_s k_s^{\theta} n_s^{1-\theta},$$
where $y_s$ is the level of output produced by the Solow technology, and $k_s$ and $n_s$ are the inputs of capital and labor. The Solow technology does not use land. It is assumed that $a_s$ grows at a faster rate over time than $a_m$. Let the gross growth rate of technological progress in the Solow sector be represented by $\gamma_s \equiv a_s'/a_s$ and that for the Malthus sector by $\gamma_m \equiv a_m'/a_m$. (Here the prime, $'$, denotes the value of the variable next period.) Since the Solow technology has a faster rate of technological progress, $\gamma_s > \gamma_m$. So, on this account, one would expect growth to be faster with the Solow technology. The fact that land is not a reproducible factor slows down growth further in the Malthus technology.
Last, to complete the setup, the economy's resource constraint is presented
$$\mathbf{c} + \mathbf{k}' = y_m + y_s.$$
Here $\mathbf{c}$ is aggregate consumption and $\mathbf{k}'$ is aggregate investment. Aggregate consumption is the sum of consumption over the old and young generations, which will be of different sizes. Aggregate investment is just done by the young. Therefore, aggregate demand, $\mathbf{c} + \mathbf{k}'$, must equal aggregate supply, $y_m + y_s$.

Figure 7.3.3: The relationship between population and productivity, 1700-1989. [Plot titled "Productivity vs Population size": population in millions of people and productivity in 1985 $US, by year.]

Figure 7.3.4: The relationship between population growth and productivity growth. [Plot titled "Productivity growth rate vs. Population growth rate": growth rates of GDP and population, by year.]

Figure 7.3.5: Value of farm land relative to GDP, U.S., 1870-1990. [Plot in percentage terms, by year.]

7.4.1 Firms’ Problems

Firms hire capital, labor and land to maximize profits. They solve the
following maximization problems:
$$\max_{k_m, n_m, l_m} \{ a_m k_m^{\phi} n_m^{\mu} l_m^{1-\phi-\mu} - w n_m - r_k k_m - r_l l_m \}, \qquad (7.4.1)$$
and
$$\max_{k_s, n_s} \{ a_s k_s^{\theta} n_s^{1-\theta} - w n_s - r_k k_s \}, \qquad (7.4.2)$$

where w is the wage rate and rk and rl are the rental rates on capital
and land.

7.4.2 Household’s Problem

Each household solves the following maximization problem

$$\max_{c_1, c_2', k', l'} \{ \ln c_1 + \beta \ln c_2' \}, \qquad (7.4.3)$$
subject to their first- and second-period budget constraints
$$c_1 + k' + q l' = w,$$
and
$$c_2' = r_k' k' + (r_l' + q') l'.$$
Here $q$ is the current price for a unit of land. The first-period budget constraint states that, when young, consumption, $c_1$, plus savings in capital and land, $k' + q l'$, equals labor income, $w$. The second-period budget constraint says that when old the person's consumption, $c_2'$, will be limited by the income he earns from his ownership of capital and land, $r_k' k' + (r_l' + q') l'$. When old the person will sell his land (to a young person in the next generation) at the unit price $q'$.
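
The solution to the household's problem is used repeatedly below, so a short derivation may help. The following is a sketch of the standard steps; it assumes an interior solution in which both capital and land are held.

```latex
% Substitute the budget constraints into (7.4.3) and differentiate.
\begin{align*}
k':&\quad \frac{1}{c_1} = \beta \frac{r_k'}{c_2'}, \\
l':&\quad \frac{q}{c_1} = \beta \frac{r_l' + q'}{c_2'}.
\end{align*}
% Dividing the second condition by the first gives a no-arbitrage condition:
\[
r_k' = \frac{r_l' + q'}{q},
\]
% so that c_2' = r_k'(k' + q l') = r_k'(w - c_1). The first condition becomes
\[
\frac{1}{c_1} = \frac{\beta}{w - c_1}
\quad\Longrightarrow\quad
c_1 = \frac{w}{1+\beta},
\qquad
k' + q l' = \frac{\beta}{1+\beta}\, w .
\]
```

With logarithmic utility the young save a constant fraction of the wage, independently of the returns on capital and land.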

7.4.3 Demographics
Population growth is simply given by

$$n' = G(c_1) n. \qquad (7.4.4)$$

The function $G(c_1)$ is constructed to match the demographic transition. The demographic transition refers to the fact that fertility has had a ∩ shape over time (or equivalently has risen and then fallen with living standards).

7.4.4 Equilibrium
Let $n$ represent the size of today's young population and $n_{-1}$ the size of the old population. An equilibrium for the above economy must satisfy the following conditions.

1. Firms maximize profits or solve problems (7.4.1) and (7.4.2).

2. The households maximize utility or solve problem (7.4.3).

3. All markets clear implying

(a) Physical Capital
$$k_m + k_s = n_{-1} k,$$

(b) Labor
$$n_m + n_s = n,$$

(d) Goods
$$n c_1 + n_{-1} c_2 + n k' = y_m + y_s,$$
where aggregate consumption and investment are given by $\mathbf{c} = n c_1 + n_{-1} c_2$ and $\mathbf{k}' = n k'$.

7.5 Malthus versus Solow

The cost function for the Solow sector is

$$C_s(w, r_k, y_s) = \min_{k_s, n_s} \{ r_k k_s + w n_s : y_s = a_s k_s^{\theta} n_s^{1-\theta} \} = a_s^{-1} \theta^{-\theta} (1-\theta)^{-(1-\theta)} r_k^{\theta} w^{1-\theta} y_s.$$
This cost function has the standard properties. Cost is increasing and concave in both prices, $r_k$ and $w$, separately. It is homogeneous of degree one in both prices together. That is, if both prices are increased by a factor $\lambda$, then costs will rise by this factor too. Cost also rises with the level of output, $y_s$, produced. Here marginal cost is
$$a_s^{-1} \theta^{-\theta} (1-\theta)^{-(1-\theta)} r_k^{\theta} w^{1-\theta}.$$
Hence, marginal cost is constant.

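The cost function for the Malthus sector (holding land fixed at unity) is
$$C_m(w, r_k, y_m) = \min_{k_m, n_m} \{ r_k k_m + w n_m : y_m = a_m k_m^{\phi} n_m^{\mu} l_m^{1-\phi-\mu} \text{ and } l_m = 1 \}$$
$$= a_m^{-1/(\phi+\mu)} \Big[ \Big(\frac{\phi}{\mu}\Big)^{\mu/(\phi+\mu)} + \Big(\frac{\phi}{\mu}\Big)^{-\phi/(\phi+\mu)} \Big] r_k^{\phi/(\phi+\mu)} w^{\mu/(\phi+\mu)} y_m^{1/(\phi+\mu)},$$
so that marginal cost will be
$$\frac{1}{\phi+\mu}\, a_m^{-1/(\phi+\mu)} \Big[ \Big(\frac{\phi}{\mu}\Big)^{\mu/(\phi+\mu)} + \Big(\frac{\phi}{\mu}\Big)^{-\phi/(\phi+\mu)} \Big] r_k^{\phi/(\phi+\mu)} w^{\mu/(\phi+\mu)} y_m^{1/(\phi+\mu)-1}.$$
Here, marginal cost is increasing and convex. Observe that marginal cost goes to zero as output goes to zero.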
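The Solow sector will not operate when
$$a_s^{-1} \theta^{-\theta} (1-\theta)^{-(1-\theta)} r_k^{\theta} w^{1-\theta} > \frac{1}{\phi+\mu}\, a_m^{-1/(\phi+\mu)} \Big[ \Big(\frac{\phi}{\mu}\Big)^{\mu/(\phi+\mu)} + \Big(\frac{\phi}{\mu}\Big)^{-\phi/(\phi+\mu)} \Big] r_k^{\phi/(\phi+\mu)} w^{\mu/(\phi+\mu)} y_m^{1/(\phi+\mu)-1}.$$
That is, the Solow sector will not operate at any aggregate output levels, $y_s$, where the Solow sector has higher marginal cost. Both sectors will operate only when
$$a_s^{-1} \theta^{-\theta} (1-\theta)^{-(1-\theta)} r_k^{\theta} w^{1-\theta} = \frac{1}{\phi+\mu}\, a_m^{-1/(\phi+\mu)} \Big[ \Big(\frac{\phi}{\mu}\Big)^{\mu/(\phi+\mu)} + \Big(\frac{\phi}{\mu}\Big)^{-\phi/(\phi+\mu)} \Big] r_k^{\phi/(\phi+\mu)} w^{\mu/(\phi+\mu)} y_m^{1/(\phi+\mu)-1}.$$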

The Malthus sector will always operate since, as was mentioned, its
marginal cost goes to zero as output goes to zero. Figure 7.5.1 shows
the adoption point, at a given set of factor prices.
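
The adoption point can be computed directly from the two marginal cost expressions. The MATLAB sketch below is only an illustration: the parameter values and factor prices are assumptions chosen here, not a calibration from the text, and the names mcs, B, mcm and ymadopt are hypothetical.

```matlab
% Adoption point: Malthus output at which the two marginal costs are equal
phi = 0.1; mu = 0.6; theta = 0.4;   % assumed technology parameters
am = 1; as = 0.5;                   % assumed productivity levels
rk = 2; w = 1;                      % assumed factor prices
s = phi + mu;                       % returns to capital and labor in Malthus
% Constant marginal cost in the Solow sector
mcs = (1/as)*theta^(-theta)*(1-theta)^(-(1-theta))*rk^theta*w^(1-theta);
% Malthus marginal cost is B*ym^(1/s - 1)
B = (1/s)*am^(-1/s)*((phi/mu)^(mu/s) + (phi/mu)^(-phi/s)) ...
    *rk^(phi/s)*w^(mu/s);
mcm = @(ym) B*ym.^(1/s - 1);
% Solve mcm(ym) = mcs for ym
ymadopt = (mcs/B)^(s/(1-s));
disp(ymadopt)
```

Below ymadopt the Malthus sector has the lower marginal cost, so only it operates at these factor prices; at ymadopt the two marginal costs coincide.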
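Figure 7.5.1: The Solow Adoption Point. [Marginal cost curves $MC_m$ and $MC_s$ plotted against output, $y_m, y_s$. Only the Malthus technology is used to the left of the adoption point; both the Malthus and Solow technologies are used to the right.]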

Lemma 42. The Solow technology is not used if
$$a_s < \Big(\frac{r_k}{\theta}\Big)^{\theta} \Big(\frac{w}{1-\theta}\Big)^{1-\theta}.$$
Proof. (Gokce Uysal, Richard Suen and Vikram Manjunath) Imagine one is in a world where only the Malthus technology is used and the factor prices are given by $r_k$ and $w$. It will be shown that when the above condition holds it is not optimal to use the Solow technology. First, it will be shown that profits will be negative if the above condition holds. In the Solow sector $\theta = r_k k_s / y_s$ and $1 - \theta = w n_s / y_s$. Therefore, profits can be written as
$$a_s \Big(\frac{\theta y_s}{r_k}\Big)^{\theta} \Big(\frac{(1-\theta) y_s}{w}\Big)^{1-\theta} - \frac{\theta y_s}{k_s} k_s - \frac{(1-\theta) y_s}{n_s} n_s.$$
For any $y_s > 0$ profits will be negative whenever
$$a_s < \Big(\frac{r_k}{\theta}\Big)^{\theta} \Big(\frac{w}{1-\theta}\Big)^{1-\theta}.$$
An alternative proof can be constructed that demonstrates that the first-order conditions for a firm using the Solow technology will be violated if the above inequality holds. Suppose that the statement in the lemma holds and that the Solow sector operates. The first-order conditions to (7.4.2) imply that
$$\frac{n_s}{k_s} = \Big(\frac{r_k}{\theta}\Big) \Big/ \Big(\frac{w}{1-\theta}\Big).$$
Therefore, from the first-order condition for capital
$$a_s \theta \Big(\frac{k_s}{n_s}\Big)^{\theta-1} = a_s \theta \Big(\frac{r_k}{\theta}\Big)^{1-\theta} \Big(\frac{w}{1-\theta}\Big)^{\theta-1} = r_k.$$
This implies that
$$a_s = \Big(\frac{r_k}{\theta}\Big)^{\theta} \Big(\frac{w}{1-\theta}\Big)^{1-\theta},$$
a contradiction.
Yet, a third proof follows from the fact that the marginal cost of producing in the Solow sector cannot exceed the price of output, which is one, so that $a_s^{-1} \theta^{-\theta} (1-\theta)^{-(1-\theta)} r_k^{\theta} w^{1-\theta} \le 1$.
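
The lemma is easy to verify numerically using the third proof: profit per unit of Solow output is one minus unit cost. The sketch below uses assumed values for $\theta$, $r_k$ and $w$, and the names asbar, unitcost, profitlow and profithigh are hypothetical.

```matlab
% Check of the adoption threshold for Solow productivity (assumed values)
theta = 0.4; rk = 2; w = 1;
% Threshold level of as from the lemma
asbar = (rk/theta)^theta*(w/(1-theta))^(1-theta);
% Unit (= marginal) cost in the Solow sector as a function of as
unitcost = @(as) (1./as)*theta^(-theta)*(1-theta)^(-(1-theta)) ...
           *rk^theta*w^(1-theta);
% Profit per unit of output, with the output price normalized to one
profitlow  = 1 - unitcost(0.9*asbar);  % as below the threshold
profithigh = 1 - unitcost(1.1*asbar);  % as above the threshold
disp([unitcost(asbar) profitlow profithigh])
```

Unit cost equals one exactly at the threshold, so profits are negative below it and positive above it at the given factor prices.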

7.6 The Malthusian Steady State

Before proceeding on further, the Malthusian steady state will be formulated. First, it is easy to define the rate of population growth in the Malthusian steady state. In a Malthus-only economy population grows at the same rate as output, $\gamma_{y_m}$. Now,
$$\gamma_{y_m} \equiv \frac{y_m'}{y_m} = \frac{a_m' (k_m')^{\phi} (n_m')^{\mu} (l_m')^{1-\phi-\mu}}{a_m k_m^{\phi} n_m^{\mu} l_m^{1-\phi-\mu}}.$$
Recall that $a_m'/a_m = \gamma_m$. Land is fixed in supply so $l_m' = l_m$. Since population is growing at the same rate as output $n_m'/n_m = \gamma_{y_m}$. Last, capital will grow at the same rate as output, too, so that $k_m'/k_m = \gamma_{y_m}$. Using these facts in the above equation leads to the result $\gamma_{y_m} = \gamma_m^{1/(1-\mu-\phi)}$.
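
For completeness, the substitution behind this result can be written out in one line, using only the growth factors just listed:

```latex
\[
\gamma_{y_m}
= \gamma_m \, \gamma_{y_m}^{\phi} \, \gamma_{y_m}^{\mu} \, 1^{\,1-\phi-\mu}
\quad\Longrightarrow\quad
\gamma_{y_m}^{\,1-\phi-\mu} = \gamma_m
\quad\Longrightarrow\quad
\gamma_{y_m} = \gamma_m^{1/(1-\mu-\phi)} .
\]
```

Since $1-\mu-\phi$ is land's share, a small land share translates slow technological progress into faster growth in output and population.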
Second, the level of consumption for young adults is now immedi-
ate. Since population is growing at the same rate as output, it happens
that per-capita output, and consumption for the young, c1 , are con-
stant. The level of c1 can be determined by the steady-state for the
Malthus-only equilibrium. To characterize the Malthus-only steady
state, note from (7.4.4) that $n_{t+1} = G(c_{1,t}) n_t$ so
$$c_1 = G^{-1}(\gamma_{y_m}).$$
Third, the level of wages, $w$, can now be computed. From the consumer's problem it can be calculated that
$$c_1 = w/(1+\beta).$$
Therefore, in a Malthusian steady state
$$w = (1+\beta) c_1 = (1+\beta) G^{-1}(\gamma_{y_m}). \qquad (7.6.1)$$

The steady-state wage rate has been pinned down.

Last, the rental prices for capital and land, $r_k$ and $r_l$, and the price of land, $q$, can be determined. In a steady state the price of land will be given by $q = (r_l + q)\gamma_{y_m}/i$, where $i$ is the gross interest rate (note that $r_l' = \gamma_{y_m} r_l$ and $q' = \gamma_{y_m} q$). Therefore,
$$q = r_l (\gamma_{y_m}/i)/(1 - \gamma_{y_m}/i); \qquad (7.6.2)$$
that is, the price of land is the discounted value of the (growth-adjusted) rents that it will earn. It is immediate that $r_k = i$ since the gross return on capital must equal the gross interest rate.

7.6.1 Savings Equals Investment

So, what determines the interest rate $i$? In a steady state aggregate saving by the young, $n(w - c_1)$, must equal aggregate investment in capital and land, $\gamma_{y_m} k_m + q$. (Here, $k_m' = \gamma_{y_m} k_m = n k'$.) Therefore,
$$n(w - c_1) = n \frac{\beta}{1+\beta} w = \gamma_{y_m} k_m + q,$$
or
$$(w - c_1) = \frac{\beta}{1+\beta} w = \gamma_{y_m} k_m / n + q/n. \qquad (7.6.3)$$
Think about (7.6.3) as determining the interest rate i. Observe that the
lefthand side is a constant, since w is just a function of γym . It will now
be shown that k m /n and q/n on the righthand side can be expressed
as functions of i. Given this, equation (7.6.3) determines i. [Condition
(7.6.3) is actually the same as the goods market-clearing condition. Try
to convince yourself that this is true.] Thus, in a steady state for an
overlapping generations model the interest rate, i, is not pinned down
in simple fashion by the discount factor, as in the representative agent
model.
By taking the ratio of the first-order conditions for labor and capital
in the firm’s optimization problem for the Malthus sector, it can be
shown that

$$\frac{k_m}{n_m} = (w/i)(\phi/\mu).$$

In a Malthus-only equilibrium $n_m$ is equal to the size of the young
population, $n$. Thus,

$$k_m/n = (w/i)(\phi/\mu).$$

Therefore, $k_m/n$ is a function of $i$, as claimed. The first-order
conditions for land and labor in the Malthus sector imply

$$r_l = \frac{1-\phi-\mu}{\mu}wn.$$

This implies that $q/n$ will be a function of $i$ from (7.6.2). Hence, (7.6.3)
determines $i = r_k$.
Next, the equilibrium size of the population will be uncovered. Using
the formula for the marginal product of labor in the Malthus sector

$$w = \mu a_m k_m^{\phi} n^{\mu-1} = \mu a_m [n(w/i)(\phi/\mu)]^{\phi} n^{\mu-1}. \tag{7.6.4}$$

It’s clear from (7.6.4) that $n$, or the size of the young population, can
be written as a simple function of $i$; recall that $w$ is pinned down by
(7.6.1). Specifically,

$$n = \left(\frac{1}{a_m}\right)^{1/(\phi+\mu-1)} \mu^{(\phi-1)/(\phi+\mu-1)} w^{(1-\phi)/(\phi+\mu-1)} \left(\frac{i}{\phi}\right)^{\phi/(\phi+\mu-1)}.$$

7.7 Calibration

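7.7.1 Demographics–in accord with Lucas (1998)

The population growth function is specified as

$$
G(c_1) =
\begin{cases}
\gamma_m^{1/(1-\mu-\phi)}\left(2 - \dfrac{c_1}{c_{1m}}\right) + 2\left(\dfrac{c_1}{c_{1m}} - 1\right), & \text{for } c_1 < 2c_{1m} \text{ (rising segment)},\\[2ex]
2 - \dfrac{c_1 - 2c_{1m}}{16c_{1m}}, & \text{for } 2c_{1m} \le c_1 \le 18c_{1m} \text{ (falling segment)},\\[2ex]
1, & \text{for } c_1 > 18c_{1m} \text{ (flat segment)}.
\end{cases}
$$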


This function is plotted in Figure 7.7.1 below. The figure has three
interesting features. First, at the Malthusian level of living standards,
$c_1 = c_{1m}$, population grows at the same pace as output, $\gamma_m^{1/(1-\mu-\phi)}$,
where $\gamma_m$ is the rate of growth in $a_m$. Second, as living standards
double from the Malthusian level the population growth rate rises to
a point where the population doubles every thirty-five years (the period
length). Third, from this point to the point where living standards are
18 times the Malthusian one ($2c_{1m} \le c_1 \le 18c_{1m}$) the rate of population
growth declines until a stationary level is attained.
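The three segments of the demographic function can be checked with a few lines of code. What follows is a minimal Python sketch, not the authors' code: the function name `population_growth` and the normalization $c_{1m} = 1$ are assumptions made for illustration, while $\gamma_m = 1.032$, $\mu = 0.6$ and $\phi = 0.1$ are the values from the parameter table in Section 7.7.2.

```python
def population_growth(c1, c1m, gamma_m, mu, phi):
    """Gross population growth factor G(c1) over one 35-year period."""
    # Growth of output (and population) in the Malthusian steady state.
    malthus_growth = gamma_m ** (1.0 / (1.0 - mu - phi))
    ratio = c1 / c1m
    if ratio < 2.0:
        # Rising segment: equals malthus_growth at c1 = c1m and 2 at c1 = 2*c1m.
        return malthus_growth * (2.0 - ratio) + 2.0 * (ratio - 1.0)
    if ratio <= 18.0:
        # Falling segment: declines linearly from 2 to 1.
        return 2.0 - (ratio - 2.0) / 16.0
    # Flat segment: stationary population.
    return 1.0


# Illustrative normalization (an assumption): c1m = 1.
C1M, GAMMA_M, MU, PHI = 1.0, 1.032, 0.6, 0.1
growth_at_malthus = population_growth(1.0, C1M, GAMMA_M, MU, PHI)
growth_at_peak = population_growth(2.0, C1M, GAMMA_M, MU, PHI)
growth_when_rich = population_growth(20.0, C1M, GAMMA_M, MU, PHI)
```

At $c_1 = c_{1m}$ the function returns $\gamma_m^{1/(1-\mu-\phi)}$, at $c_1 = 2c_{1m}$ it returns 2, and beyond $18c_{1m}$ it returns 1, so the segments join continuously.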

Figure 7.7.1: Demographics.

Source: Hansen and Prescott
(2002).

7.7.2 Parameter Values

The parameter values chosen are shown in the table below. Technological
progress is faster in the Solow era than in the Malthus one; the growth
factors are picked to mimic growth in each era. Capital’s share of income
is much smaller in the Malthus economy than in the Solow one. This is
because land is also used in the Malthus economy. The need to use
land in the Malthus era slows down growth, as will be discussed below.
Another point to note is that the discount factor is set to one. Still,
the model gives reasonable values for the interest rate in the Malthus
and Solow eras. This is because the interest rate is not pinned down
by the discount factor alone in an overlapping generations model.
Remark 43. If both land and labor were fixed, then in an economy using
only the Malthus technology output would grow at rate $\gamma_m^{1/(1-\phi)}$,
while in a Solow one the growth rate would be $\gamma_s^{1/(1-\theta)}$. Thus, when
$\gamma_s \geq \gamma_m$ the Solow economy grows faster because $\theta > \phi$; i.e., the
reproducible factor, capital, has a larger share in the Solow economy.

Parameter   Value   Comment

γm          1.032   Growth in Malthus era, period 35 yrs (0.1% a year);
                    consistent with pop. doubling every 230 yrs
γs          1.518   Postwar GDP growth (1.2% a year)
φ           0.1     Capital’s share of income, Malthus
µ           0.6     Labor’s share, both technologies
θ           0.4     Capital’s share, Solow
β           1.0     Discount factor; annual return of 2% in Malthus era,
                    4-4.5% in Solow era

7.8 Results

As the economy develops, the share of inputs devoted to the Malthus

sector declines over time. This is shown in Figure 7.8.1. Figure 7.8.2
shows how wages rise and the population grows as the economy
moves to the Solow epoch. Last, land isn’t used by the Solow tech-
nology. So, its value declines over time as the Malthus sector dies out.
This is shown in Figure 7.8.3.

Figure 7.8.1: The vanishing of

the Malthusian sector. Source:
Hansen and Prescott (2002).

Figure 7.8.2: The Solow era.

Source: Hansen and Prescott
(2002).

Figure 7.8.3: The declining value

of land. Source: Hansen and
Prescott (2002).
8 Numerical Approximations

8.1 Introduction

Solving macroeconomic models, especially stochastic ones, on the computer
often involves employing numerical approximations for things
such as derivatives, functions, and probability distributions. Some of
these numerical approximations are discussed here. The discussion
starts off with numerical differentiation. This is often used to
log-linearize dynamic stochastic models, as will be discussed in Chapter
9. The concept of numerical differentiation is illustrated by examining
the impact of technological progress in contraception on non-marital
births by young women. An enigma in the U.S. data is that as
contraception improved non-marital births jumped up. By comparing
numerical derivatives at two different points in time, this fact can be
rationalized. The chapter then moves on to the topic of numerical
integration. This is illustrated by an example that calculates the
consumer surplus for computers, which is the roughly triangular shaped
area trapped below the demand curve for computers and above the
price line for computers.
Numerical integration is then followed by the topic of random num-
ber generation. Random numbers are used in Monte Carlo simulation
of business cycle models, again discussed in Chapter 9. The concept of
random number generation is illustrated using an example from Eu-
gen Slutsky (1937)’s classic work on “Random Causes as the Source of
Cyclic Processes.” Next, the concept of Monte Carlo integration using
random number generation is discussed. This concept is illustrated by
calculating the welfare cost of business cycles à la Robert E. Lucas, Jr.
The chapter also introduces the idea of a Markov chain. As an
example a Markov chain for the evolution of employment/unemployment
is presented. When calibrated this Markov chain can be used
to back out the job finding and separation rates. Mehra and Prescott
(1985)’s well-known study on the equity premium is also used to
illustrate the notion of a Markov chain. It is shown how an AR(1) process
can be approximated by a Markov chain. Last, the topic of
approximating a function is discussed; e.g., some policy function or value
function as in Chapter 9. Three methods are discussed: piecewise
linear interpolation, cubic spline interpolation, and radial basis
function interpolation. Cubic spline interpolation is very flexible. To show
this, a facsimile of an artist’s sketch of a face is generated using cubic
splines. The Hodrick-Prescott filter, which is based on cubic splines, is
also presented.

8.2 Numerical Differentiation

8.2.1 The Standard Method

Computing analytically the derivatives of a function, say F ( x ), can
take some effort. Often it is simpler to approximate these derivatives
numerically. The derivative of the function F ( x ) at the point x can be
computed numerically by using the formula

$$F_1(x) = \frac{F(x+h) - F(x-h)}{2h},$$
where h is some small number. Computing numerical derivatives is a
bit trickier than this formula suggests. Mathematically speaking, one
would like to make h as small as possible. But, note that the difference
between F ( x + h) and F ( x − h) will be rounding error if h is made too
small. This occurs because the numbers F ( x + h) and F ( x − h) will
be computed with small errors at the nth decimal where n is some
integer, say 10. If h is made too small the difference will just be this
error. Dividing this through by a very small h then blows this up.
Hence, there is a trade-off. Making h small improves mathematical
precision but increases numerical error. So, how small should h be? A
good lower bound on a PC would be about 1.0e-5.
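The trade-off between mathematical precision and rounding error is easy to see on a function whose derivative is known. The following is a minimal Python sketch under stated assumptions: the test function $F(x) = e^x$, the evaluation point $x = 1$, and the three step sizes are choices made purely for illustration.

```python
import math


def central_difference(f, x, h):
    """Two-sided finite-difference approximation to f'(x)."""
    return (f(x + h) - f(x - h)) / (2.0 * h)


# The derivative of exp at x = 1 is exp(1), so the error is known exactly.
true_value = math.exp(1.0)
errors = {
    h: abs(central_difference(math.exp, 1.0, h) - true_value)
    for h in (1e-1, 1e-5, 1e-15)
}
# A large h leaves a visible approximation error, a moderate h does well,
# and a tiny h is swamped by rounding error in F(x + h) - F(x - h).
```

In double precision the middle step size should give by far the smallest error of the three, which is the pattern the text describes.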
To see the problem formally, take a first-order Taylor expansion of
the function $F$ around the point $x$. (Chapter A reviews the concept of
a first-order Taylor expansion.) One gets

$$F(x+h) = F(x) + F_1(x)h + F_{11}(\zeta)h^2/2 \quad (\text{for } x \le \zeta \le x+h).$$

In a similar fashion, one can write

$$F(x-h) = F(x) - F_1(x)h + F_{11}(\xi)h^2/2 \quad (\text{for } x-h \le \xi \le x).$$

Subtracting the second equation from the first, while applying the
intermediate value theorem, gives

$$F_1(x) = \frac{F(x+h) - F(x-h)}{2h} + \frac{[F_{11}(\xi) - F_{11}(\zeta)]h}{4}.$$
Now, on the computer the function $F$ will be computed with error, $\varepsilon$.
That is, the computer will compute $F(x+h)$ as $F(x+h) + \varepsilon_{+h}$ and
$F(x-h)$ as $F(x-h) + \varepsilon_{-h}$. Therefore,

$$F_1(x) = \frac{F(x+h) - F(x-h)}{2h} + \underbrace{\frac{[F_{11}(\xi) - F_{11}(\zeta)]h}{4}}_{\text{approx error}} + \underbrace{(\varepsilon_{+h} - \varepsilon_{-h})/(2h)}_{\text{machine error}}.$$

Figure 8.2.1: Finite-difference approximation, $[F(x+h) - F(x-h)]/2h$,
vis-à-vis the true derivative, $F_1(x)$. Mathematically speaking, the
error associated with the finite-difference approximation will
shrink with $h$. This ignores the machine error associated
with computing $F(x+h)$ and $F(x-h)$.
Thus, there are two types of error on the righthand side of the equation:
the mathematical approximation error given by $[F_{11}(\xi) - F_{11}(\zeta)]h/4$
and the machine error shown by $(\varepsilon_{+h} - \varepsilon_{-h})/(2h)$. The first gets
smaller when $h$ is reduced while the latter becomes bigger. Note that
by computing the derivative in both the forward and backward directions
(or centering the derivative around the point $x$) the mathematical
approximation error is reduced.
The second derivative is just the difference between two first derivatives

$$F_{11}(x) = \frac{[F(x+h) - F(x)]/h - [F(x) - F(x-h)]/h}{h} = \frac{F(x+h) - 2F(x) + F(x-h)}{h^2}.$$

This second derivative is automatically centered around the point $x$.
Last, suppose that the function is $F(x,y)$. From the above it is easy
to deduce that

$$F_2(x+h,y) = \frac{F(x+h,y+h) - F(x+h,y-h)}{2h}$$

and

$$F_2(x-h,y) = \frac{F(x-h,y+h) - F(x-h,y-h)}{2h}.$$

So, the centered cross derivative is¹

$$F_{12}(x,y) = \frac{[F(x+h,y+h) - F(x+h,y-h)]/2h}{2h} - \frac{[F(x-h,y+h) - F(x-h,y-h)]/2h}{2h}$$
$$= \frac{F(x+h,y+h) - F(x+h,y-h) - F(x-h,y+h) + F(x-h,y-h)}{4h^2}.$$

¹ It should be immediate that
$F_{11}(x,y) = [F(x+h,y) - 2F(x,y) + F(x-h,y)]/h^2$
and
$F_{22}(x,y) = [F(x,y+h) - 2F(x,y) + F(x,y-h)]/h^2$.
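The second-derivative and cross-derivative formulas translate directly into code. Below is a minimal Python sketch; the function names, the default step $h = 10^{-4}$, and the polynomial test functions are assumptions chosen so that the exact answers are known.

```python
def second_derivative(f, x, h=1e-4):
    """Centered second derivative: [F(x+h) - 2F(x) + F(x-h)] / h^2."""
    return (f(x + h) - 2.0 * f(x) + f(x - h)) / h**2


def cross_derivative(f, x, y, h=1e-4):
    """Centered cross derivative F12(x, y) from the four corner points."""
    return (
        f(x + h, y + h) - f(x + h, y - h) - f(x - h, y + h) + f(x - h, y - h)
    ) / (4.0 * h**2)


# Test functions with known derivatives (illustrative choices):
# d2/dx2 of x^4 is 12 x^2, and d2/dxdy of x^2 y^3 is 6 x y^2.
approx_f11 = second_derivative(lambda x: x**4, 1.0)
approx_f12 = cross_derivative(lambda x, y: x**2 * y**3, 1.5, 2.0)
```

Because both formulas divide by $h^2$ rather than $h$, rounding error is magnified more than for a first derivative, so a somewhat larger step than the one used for first derivatives is a reasonable starting point.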

8.2.2 Complex Step Differentiation

This is a more modern method. It is an accurate method for simple
functions, say one-line formulas. Here the function $F$ is expanded
around the complex part of the point $x$. Specifically,

$$F(x + ih) \simeq F(x) + F_1(x)ih,$$

where $i \equiv \sqrt{-1}$. Take the imaginary part of both sides and divide by
$h$ to get

$$F_1(x) = \frac{\mathrm{Im}(F(x + ih))}{h}.$$

This is a remarkably simple formula and is easy to implement. It
turns out to be more accurate than the standard method. Now, $h$ can
be smaller, say 1.0e-8.
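A minimal Python sketch of complex-step differentiation follows. The function name and the test function $F(x) = e^x x^2$, with derivative $e^x(x^2 + 2x)$, are assumptions chosen for illustration. The one practical requirement is that $F$ be written with operations that accept complex arguments, which is why `cmath.exp` is used rather than `math.exp`.

```python
import cmath
import math


def complex_step_derivative(f, x, h=1e-8):
    """Approximate f'(x) as Im(f(x + ih)) / h; f must accept complex input."""
    return f(complex(x, h)).imag / h


# Illustrative test function: F(x) = exp(x) * x^2, so F'(1) = 3e.
approx = complex_step_derivative(lambda z: cmath.exp(z) * z**2, 1.0)
exact = 3.0 * math.e
```

No difference of two nearly equal numbers is taken, so the rounding problem that limits the standard method does not arise in the same way.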

8.3 The Impact of Technological Innovation in Contraception on
Non-Marital Births

A puzzling feature of the U.S. data is that as contraception improved
the number of non-marital births increased; see Figure 8.3.1. Can this
be explained? To do this, return to the model of premarital sexual
activity presented in Section 3.6 of Chapter 3. Recall that the fraction
of 20-year-old women with premarital sexual experience, $p$, is given
by

$$p = \exp[-(\phi O/\eta)^{\beta}],$$

where $\phi$ is the failure rate of contraception, $O$ is the cost of a
non-marital birth, and $\beta$ and $\eta$ are the Weibull distribution’s shape and
scale parameters. When this relationship is calibrated to the U.S. data
it transpires that $\beta = 2.30$, $\eta = 2.06$, and $O = 1.34$; again, recall Section
3.6 of Chapter 3. The fraction of women with a non-marital birth, $b$, is
given by

$$b = \phi p = \phi\exp[-(\phi O/\eta)^{\beta}].$$

To calculate non-marital births just multiply the fraction of sexually
active single young women, $p$, by the odds of becoming pregnant, $\phi$.
To uncover the impact of the efficacy of contraception on non-marital
births, the following derivative and elasticity are computed using
complex-step differentiation:

$$\frac{db}{d\phi} \quad \text{and} \quad \frac{\phi}{b}\frac{db}{d\phi}.$$

Recall that the failure rates for 1900 and 2000 are 72 and 30 percent.
The results are shown in the table below. Interestingly, a drop in the
failure rate for 1900 leads to a hike in non-marital births while for 2000
it causes a fall. I.e., a one percentage point drop in the failure rate in 1900
leads to a 0.47 percentage point increase in the fraction of young women
having a non-marital birth, while in 2000 it causes a 0.37 percentage
point decrease. In elasticity terms a one percent drop in $\phi$ in 1900 leads
to a 2.18 percent increase in $b$ while for 2000 it causes a 0.48 percent
decrease.

Contraception and Non-Marital Births
Year    db/dφ    (φ/b)(db/dφ)
1900    -0.47    -2.18
2000     0.37     0.48

Figure 8.3.1: Non-marital births as a percentage of all births
in the United States, 1920-2017. Non-marital births rose despite
the fact that contraception became more efficient. Source:
Greenwood et al. (2021a).

What explains this? A drop in the failure rate makes sex safer but
it also entices more women to engage in premarital sex. For 1900
the impact on $b = \phi p$ from a rise in $p$ is bigger than the effect from
a decline in $\phi$. Since not many women were having premarital sex at
that time, the direct effect of a reduction in the failure rate on
non-marital births is modest. For 2000 exactly the opposite is true since
a large fraction of unmarried young women were sexually active. As the
failure rate approaches 0 so will the number of non-marital births to
young women.
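The sign switch can be seen analytically. Writing $A \equiv (\phi O/\eta)^{\beta}$, differentiating $b = \phi\exp(-A)$ gives $db/d\phi = \exp(-A)(1 - \beta A)$, so the elasticity is $1 - \beta A$: positive when the failure rate is low and negative when it is high. The Python sketch below checks complex-step differentiation against this expression. The function names are hypothetical, and the parameter values are placeholders chosen only to exercise the code; they are not the Section 3.6 calibration, and the numbers in the table above are not reproduced here.

```python
import cmath
import math


def births(phi, cost, scale, shape):
    """b = phi * exp(-(phi * O / eta)^beta); accepts a complex phi."""
    return phi * cmath.exp(-((phi * cost / scale) ** shape))


def births_derivative(phi, cost, scale, shape, h=1e-8):
    """Complex-step approximation to db/dphi."""
    return births(complex(phi, h), cost, scale, shape).imag / h


def births_derivative_exact(phi, cost, scale, shape):
    """Analytic derivative exp(-A) * (1 - beta * A), A = (phi*O/eta)^beta."""
    a = (phi * cost / scale) ** shape
    return math.exp(-a) * (1.0 - shape * a)


# Placeholder parameters (assumptions, not the book's calibration).
O, ETA, BETA = 1.0, 1.0, 2.0
low_phi_slope = births_derivative(0.5, O, ETA, BETA)   # positive
high_phi_slope = births_derivative(0.9, O, ETA, BETA)  # negative
```

With these placeholder values the derivative changes sign where $\beta A = 1$, which is the mechanism behind the different responses at a high and a low failure rate.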

8.4 Classical Numerical Integration

Suppose one wants to compute the integral

$$I = \int_{x_0}^{x_n} F(x)dx.$$

That is, the task is to compute the area under the function $F$ on the
domain $[x_0, x_n]$, as shown in Figure 8.4.1. The classical way to do this
is to break up the distance between $x_0$ and $x_n$ into a grid of $n+1$ equally
spaced points $\{x_0, x_1, \cdots, x_{j-1}, x_j, \cdots, x_n\}$ where $x_j - x_{j-1} = h$ for all
$j = 1, \cdots, n$. Now, take any two adjacent points, say $x_{j-1}$ and $x_j$.
Over the interval $[x_{j-1}, x_j]$ the function $F(x)$ will be approximated by
a trapezoid, $T_{j-1}(x)$. Specifically,

$$T_{j-1}(x) = (1-\mu)y_{j-1} + \mu y_j, \text{ for } x \in [x_{j-1}, x_j],$$

where

$$y_{j-1} \equiv F(x_{j-1}) \text{ and } y_j \equiv F(x_j),$$

and

$$\mu = (x - x_{j-1})/(x_j - x_{j-1}).$$

Observe that the function $T_{j-1}(x)$ is simply a weighted average of the
points $y_{j-1} \equiv F(x_{j-1})$ and $y_j \equiv F(x_j)$ where the weight depends on
how close the point $x$ is to $x_{j-1}$. The further $x$ is away from the point $x_{j-1}$
the higher is the weight that is attached to the point $y_j \equiv F(x_j)$ and
consequently the lower is the weight assigned to $y_{j-1} \equiv F(x_{j-1})$. (This
is related to the concept of piecewise linear interpolation discussed in
Section 8.14.)
Now, the integral for the area under the trapezoid $T_{j-1}(x)$ is

$$\int_{x_{j-1}}^{x_j} T_{j-1}(x)dx = y_{j-1}\int_{x_{j-1}}^{x_j}\left(1 - \frac{x - x_{j-1}}{x_j - x_{j-1}}\right)dx + y_j\int_{x_{j-1}}^{x_j}\frac{x - x_{j-1}}{x_j - x_{j-1}}dx$$
$$= y_{j-1}\int_{x_{j-1}}^{x_j}\frac{x_j - x}{x_j - x_{j-1}}dx + y_j\int_{x_{j-1}}^{x_j}\frac{x - x_{j-1}}{x_j - x_{j-1}}dx$$
$$= \frac{1}{2}y_{j-1}h + \frac{1}{2}y_j h = \frac{y_{j-1} + y_j}{2}h.$$

So, the area of the trapezoid on the interval $[x_{j-1}, x_j]$ is just the average of
the two points $y_{j-1}$ and $y_j$ multiplied by the length of the interval, or $h$.
Summing over all of the trapezoids on the entire domain $[x_0, x_n]$ gives

$$\int_{x_0}^{x_n} F(x)dx \simeq h\sum_{j=1}^{n}\frac{y_{j-1} + y_j}{2} = h\Big(y_0/2 + \sum_{j=1}^{n-1}y_j + y_n/2\Big).$$

The accuracy of this approximation will depend on how fine the grid,
$\{x_0, x_1, \cdots, x_{j-1}, x_j, \cdots, x_n\}$, is. The more points, or the smaller $h$ is,
the more accurate will be the approximation. Of course, the function $F(x)$
doesn’t have to be approximated by a series of trapezoids. Other
shapes, which fit the curve better, could be used.
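The summation formula takes only a few lines to implement. The following is a minimal Python sketch; the function name `trapezoid` and the test integrand $F(x) = x^2$ on $[0, 1]$ are assumptions made for illustration.

```python
def trapezoid(f, x0, xn, n):
    """Trapezoid rule on n equal intervals: h * (y0/2 + sum(y_j) + yn/2)."""
    h = (xn - x0) / n
    total = 0.5 * (f(x0) + f(xn))   # the two end points get weight 1/2
    for j in range(1, n):           # interior points get weight 1
        total += f(x0 + j * h)
    return h * total


# Illustrative check: the integral of x^2 on [0, 1] is 1/3.
coarse = trapezoid(lambda x: x**2, 0.0, 1.0, 10)
fine = trapezoid(lambda x: x**2, 0.0, 1.0, 100)
```

For this integrand the error works out to $1/(6n^2)$, so a tenfold increase in the number of intervals cuts the error by a factor of one hundred.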

Figure 8.4.1: The area under the curve $F(x)$ between the points
$x_0$ and $x_n$ is approximated by summing the area for a series of
$n$ trapezoids.

8.5 Measuring the Welfare Gain from Personal Computers

The first PC to be successfully mass produced was the Apple II, which
was introduced in 1977. The computer sold for roughly $1,200. Its
microprocessor ran at 1 MHz and the PC had 4 KB of random-access
memory (RAM). There was no hard disk. An audio cassette was used
for program loading and data storage. Now, the speed of PCs is
measured in gigahertz, RAM in gigabytes, and hard drive storage
in terabytes. Peripherals such as monitors and speakers are also much
better. The quality-adjusted price of computers dropped at 25 percent
per year between 1977 and 2004. Over that time period, spending
on computers and peripherals rose from 0 to 0.6 percent of personal
consumption spending.
Greenwood and Kopecky (2013) estimated a nonlinear demand curve
for computers of the following form for the period 1977 to 2004:

$$c = D(p, y) = \frac{y + p\upsilon}{p + \theta p^{\rho}} - \upsilon, \text{ with } \upsilon, \theta, \rho > 0,$$

where $c$ denotes the demand for computers at price $p$ and income
$y$. They find $\upsilon = 4.1491 \times 10^{-5}$, $\theta = 0.0056$, and $\rho = 1.4844$. For
a given level of income, $y$, the demand curve for computers has the
form shown in Figure 8.5.1. The price $p_h$ where the demand curve hits
the vertical axis is known as Hicks’s virtual price. It solves the equation,

$$\frac{y + p_h\upsilon}{p_h + \theta p_h^{\rho}} - \upsilon = 0.$$

Figure 8.5.1: The demand curve for computers, $c = D(p, y)$. The
curve hits the vertical axis at Hicks’s virtual price, $p_h$. The area
under the demand curve above the price $p$ measures the consumer’s
surplus. This is the area that needs to be numerically integrated.

The consumer surplus from computers at price $p$ and income $y$ can
be estimated by computing the area under the demand curve above
the price $p$. That is,

$$\text{consumer surplus} = \int_{p}^{p_h}\left[\frac{y + \tilde{p}\upsilon}{\tilde{p} + \theta\tilde{p}^{\rho}} - \upsilon\right]d\tilde{p}.$$

This is an exercise in numerical integration. The consumer’s surplus

at the 2004 indices for price (p = 0.0013) and income (y = 0.2165)
amounts to 2.17 percent of personal consumption expenditure. While
consumer’s surplus measured in this way is not an exact measure of
the worth of computers to consumers, the estimate obtained here is
surprisingly close to the compensating and equivalent variations of
2.14 and 2.19 percent reported in Greenwood and Kopecky (2013).
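The integral can be set up in a few steps. Multiplying the virtual-price equation through by $p_h + \theta p_h^{\rho}$ gives $y = \upsilon\theta p_h^{\rho}$, so here $p_h = [y/(\upsilon\theta)]^{1/\rho}$ in closed form. The Python sketch below uses this and then applies the trapezoid rule. Two choices in it are assumptions rather than the authors' procedure: the function names, and the change of variable $s = \ln\tilde{p}$, which is used because the demand curve is very steep near the low price $p$ and a uniform grid in levels would resolve it poorly. The result is in the units of the price and income indices; expressing it as a percentage of consumption expenditure depends on normalizations that are not given in this section, so no figure is claimed here.

```python
import math

# Estimates reported in the text.
UPSILON, THETA, RHO = 4.1491e-5, 0.0056, 1.4844


def demand(p, y):
    """Demand curve c = D(p, y)."""
    return (y + p * UPSILON) / (p + THETA * p**RHO) - UPSILON


def virtual_price(y):
    """Price at which demand is zero: y = upsilon * theta * p_h^rho."""
    return (y / (UPSILON * THETA)) ** (1.0 / RHO)


def consumer_surplus(p, y, n=20000):
    """Area under demand between p and p_h, trapezoid rule in s = ln(price)."""
    a, b = math.log(p), math.log(virtual_price(y))
    h = (b - a) / n

    def integrand(s):
        # With price = exp(s), d(price) = exp(s) ds.
        price = math.exp(s)
        return demand(price, y) * price

    total = 0.5 * (integrand(a) + integrand(b))
    for j in range(1, n):
        total += integrand(a + j * h)
    return h * total


# The 2004 indices for price and income quoted in the text.
surplus_2004 = consumer_surplus(0.0013, 0.2165)
```

Doubling `n` and confirming that the answer barely moves is a simple way to check that the grid is fine enough.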

8.6 Random Number Generators

There is nothing random about generating random numbers on a computer.
Somewhat surprisingly, these numbers are created using a
deterministic algorithm.

8.6.1 The Modulo Operator

As a prelude to the discussion, imagine dividing some natural number
$x$ through by another natural number $m > 0$ and calculating the
remainder, $r$. (Define the natural numbers as $0, 1, 2, \cdots$.) The
remainder, also a natural number, will be given by the formula

$$r = x - \text{floor}\Big(\frac{x}{m}\Big)m,$$

where $\text{floor}(x/m)$ is the largest natural number less than or equal to $x/m$.
This operation defining the remainder from a division is abbreviated
as $x$ modulo $m$ or as

$$x \bmod m.$$

Observe that²

$$0 \le r = x \bmod m \le m - 1.$$

Example 44. (Remainder from 22 ÷ 7) $22 \bmod 7 = 22 - \text{floor}(22/7) \times 7 = 22 - \text{floor}(3.1429) \times 7 = 22 - 3 \times 7 = 1$.

Example 45. (Remainder from 11 ÷ 7) $11 \bmod 7 = 11 - \text{floor}(11/7) \times 7 = 11 - \text{floor}(1.5714) \times 7 = 11 - 1 \times 7 = 4$.

Example 46. (Remainder from 44 ÷ 7) $44 \bmod 7 = 44 - \text{floor}(44/7) \times 7 = 44 - \text{floor}(6.2857) \times 7 = 44 - 6 \times 7 = 2$.

² To understand the upper bound, observe that for any $x$ the largest value
for the remainder must occur when $\text{floor}(x/m) = 0$, which gives $r = x$.
But, this requires $m > x$. The largest value of $x$ compatible with this
inequality, $m > x$, is $x = m - 1$. The lower bound is easy to deduce.
Clearly, $r$ cannot be negative because $x \ge \text{floor}(x/m)m$. But,
$\text{floor}(x/m)m = x$ whenever $m = x$. So $0 = x - \text{floor}(x/m)m$ is an
admissible value for the remainder when $m = x$.

8.6.2 The Linear Congruential Generator

Pick some natural number, $m$. This is called the modulus of the random
number generator. Let $x$ be a natural number that is created using
the following iterative scheme

$$x_{j+1} = (ax_j + b) \bmod m, \tag{8.6.1}$$

where the constant $a$, a natural number, is called the multiplier and
$b$, another natural number, is known as the offset coefficient. Given
some starting value for $x$, $x_0$, this equation will give a sequence of
natural numbers. The starting value $x_0$ is called the seed of the random
number generator. The natural number $x_{j+1}$ lies in the interval

$$0 \le x_{j+1} \le m - 1.$$

If $x_{j+1}$ ever returns a number that was generated before then the
algorithm will repeat itself. Since there are only $m$ possible values of $x$
the generator must cycle within at most $m$ periods. Define the
pseudo-random number, $u_{j+1}$, that is associated with this sequence by

$$u_{j+1} = \frac{x_{j+1}}{m}.$$

The pseudo-random numbers will resemble those drawn from a uniform
distribution; the uniform distribution is discussed in Chapter A.

Example 47. (A 3 period cycle) Let $a = 11$, $b = 0$ and $m = 7$. Set $x_0 = 2$.
Then, the algorithm proceeds as follows: $x_1 = (11 \times 2) \bmod 7 = 22
\bmod 7 = 1$ so that $u_1 = 1/7 = 0.1429$; $x_2 = (11 \times 1) \bmod 7 = 4$ so
that $u_2 = 4/7 = 0.5714$; $x_3 = (11 \times 4) \bmod 7 = 2$ so that $u_3 = 2/7 =
0.2857$, $\cdots$. Since $x_3 = x_0 = 2$ the random number generator then
repeats the cycle over again.

Example 48. (A 30 period cycle) Let $a = 13$, $b = 0$ and $m = 31$.
Set $x_0 = 3$. Then, the algorithm proceeds as follows: $x_1 = (13 \times 3)
\bmod 31 = 39 \bmod 31 = 8$ so that $u_1 = 8/31 = 0.2581$; $x_2 = (13 \times
8) \bmod 31 = 11$ so that $u_2 = 11/31 = 0.3548$; $\cdots$; $x_{30} = (13 \times 5)
\bmod 31 = 3$ so that $u_{30} = 3/31 = 0.0968$, $\cdots$. Since $x_{30} = x_0 = 3$ the
cycle begins over again.

As can be seen from the above examples, for small values of $a$ and
$m$ such a random number generator has very poor qualities. But, it
works reasonably well for large values. An earlier version of MATLAB
set $m = 2^{31} - 1$ (a Mersenne prime number) and $a = 7^5 = 16{,}807$. This
takes $2^{31} - 2$ periods before it cycles.
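The iterative scheme (8.6.1) and the two small examples can be reproduced with a short program. This is a minimal Python sketch; the function names `lcg` and `cycle_length` are hypothetical, and the cycle is detected by recording when each value was first seen.

```python
def lcg(a, b, m, seed, count):
    """Return the first `count` values x_1, x_2, ... of x_{j+1} = (a x_j + b) mod m."""
    values = []
    x = seed
    for _ in range(count):
        x = (a * x + b) % m
        values.append(x)
    return values


def cycle_length(a, b, m, seed):
    """Number of steps between two visits to the first value that repeats."""
    first_seen = {seed: 0}
    x = seed
    step = 0
    while True:
        x = (a * x + b) % m
        step += 1
        if x in first_seen:
            return step - first_seen[x]
        first_seen[x] = step


# The two small generators from the examples in the text.
example_a = lcg(11, 0, 7, 2, 3)              # x_1, x_2, x_3
uniforms_a = [x / 7 for x in example_a]      # u_j = x_j / m
period_a = cycle_length(11, 0, 7, 2)
period_b = cycle_length(13, 0, 31, 3)
```

Running `cycle_length` on the large modulus $2^{31} - 1$ is not advisable with this dictionary-based approach, since it stores every value visited.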

8.7 Eugen Slutsky and Random Causes as the Source of Cyclic Processes

(I)s it possible that a definite structure of a connection between
random fluctuations could form them into a system of more or less regular
waves? Many laws of physics and biology are based on chance, among
them such laws as the second law of thermodynamics and Mendel’s
laws. But heretofore we have known how regularities could be derived
from a chaos of disconnected elements because of the very
disconnectedness. In our case we wish to consider the rise of regularity from series
of chaotically-random elements because of certain connections imposed
upon them. Eugen Slutsky (1937, p. 106)

Eugen Slutsky (1880-1948) was a Russian economist and statistician.
He wrote two path-breaking papers in economics. In the first paper
he derived what is now known as the Slutsky equation. This is one of
the cornerstones of consumer theory. Because the paper was
published in Italian in 1915 it lay in obscurity for a while. Hicks
independently rediscovered the notion in 1939. His second paper,
the one discussed here, was on “The Summation of Random
Causes as a Source of Cyclical Processes.” This paper was
originally published in Russian in 1927. The English version was
published ten years later. The notion of randomness is now
everywhere in economics.

In 1937 Eugen Slutsky put forward the following probabilistic model
of the business cycle:

$$o_t = x_t + x_{t-1} + \cdots + x_{t-9} + 5,$$
$$o_{t-1} = x_{t-1} + \cdots + x_{t-9} + x_{t-10} + 5,$$

where $o_t$ is an index of the business cycle for period $t$ and the $x_t$’s
are independently and identically distributed random variables. To
generate the $x_t$’s, Slutsky took a sample of winning numbers for a
lottery for loans from the People’s Commissariat of Finance. He just
used the last digit of winning numbers for the $x_t$’s. He noted that each
of two adjacent values for the business cycle index $o$, say $o_t$ and $o_{t-1}$,
would have one random cause unique to itself, here $x_t$ and $x_{t-10}$, and 9
random causes in common, or $x_{t-1}$ to $x_{t-9}$. Because adjacent values of the
business cycle index have events in common there appears to be a
correlation between them even though the series of causes are
independent. Figure 8.7.1 shows the upshot of Slutsky’s Monte Carlo
simulation. It exhibits the same pattern of fluctuations as displayed in
the business cycle data for nineteenth century England. Think about how
much effort and time it would have taken Slutsky to do this diagram. This
is a simple task today: it involves using a random number generator, a
loop for the $o_t$’s, and making a graph from the output.

Figure 8.7.1: The dashed line shows an English business cycle index
for the years 1855 to 1877. The solid line is a portion of the sample
path from Slutsky’s Model 1, which corresponds to the model shown here.
Slutsky cut out a portion of his randomly generated series and
aligned it with the business cycle data. Source: Slutsky (1937,
Figure 3).
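The random number generator and the loop for the $o_t$’s might look as follows in Python. This is a sketch under stated assumptions: pseudo-random digits from Python’s `random` module stand in for Slutsky’s lottery digits, the seed and the sample length of 1,000 are arbitrary, and the graphing step is left out.

```python
import random


def slutsky_index(digits):
    """o_t = x_t + x_{t-1} + ... + x_{t-9} + 5 for every t with ten digits available."""
    return [sum(digits[t - 9 : t + 1]) + 5 for t in range(9, len(digits))]


rng = random.Random(1937)                        # arbitrary seed, for reproducibility
digits = [rng.randrange(10) for _ in range(1000)]  # i.i.d. digits 0, ..., 9
index = slutsky_index(digits)
```

Adjacent values of `index` differ only by the digit that enters the ten-period window minus the digit that leaves it, which is why the series moves in smooth waves even though the digits themselves are independent.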

8.8 Monte Carlo Integration

Again, suppose one wants to compute the integral

$$I = \int_a^b F(x)dx.$$

An easy way to do this is by Monte Carlo integration. Now, rewrite
the above formula as

$$I = (b - a)\Big[\int_a^b F(x)\frac{1}{b-a}dx\Big].$$
Think about 1/(b − a) as representing the density function for a uni-
form distribution. Thus, the term in brackets is the expected value
of the function F ( x ), while the (b − a) term multiplying this expected
value is the length of the line segment going from a to b.
The idea underlying Monte Carlo integration is to compute the
expectation term $\int_a^b F(x)/(b-a)dx$ by drawing a random sample of $x$’s
on $[a, b]$ from the uniform distribution. Represent this draw of $n$
random numbers by $\{x_1, x_2, \cdots, x_n\}$. The expectation in question is
computed using the following formula

$$\int_a^b F(x)/(b-a)dx \simeq \frac{1}{n}\sum_{i=1}^{n}F(x_i).$$

This implies that

$$I = \int_a^b F(x)dx \simeq \frac{(b-a)}{n}\sum_{i=1}^{n}F(x_i),$$

which is the formula used in Monte Carlo integration. Figure 8.8.1
illustrates the situation.
Now, the strong law of large numbers implies that

$$\lim_{n\to\infty}\frac{1}{n}\sum_{i=1}^{n}F(x_i) = \int_a^b F(x)/(b-a)dx,$$

where the righthand side is the expected value of $F(x)$; a statement of
the strong law of large numbers is provided in Chapter A. The variance
of the sample mean for $F(x)$, or the variance of $\sum_{i=1}^{n}F(x_i)/n$, is given
by

$$\sigma_n^2 = E\Big[\Big\{\frac{1}{n}\sum_{i=1}^{n}F(x_i) - E[F(x)]\Big\}^2\Big] = \frac{1}{n}E[\{F(x) - E[F(x)]\}^2] = \frac{\sigma^2}{n},$$

since all the $x$’s are independently and identically distributed. This
implies that the standard deviation of the sample mean, $\sigma_n$, will
decline with $\sqrt{n}$. Hence, to reduce the standard error by half, the
sample size needs to be quadrupled.
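A minimal Python sketch of the Monte Carlo formula follows. The function name, the test integrand $\sin x$ on $[0, \pi]$ (whose integral is 2), the seed, and the sample size are all assumptions chosen for illustration.

```python
import math
import random


def monte_carlo_integral(f, a, b, n, rng):
    """Estimate the integral of f on [a, b] as (b - a) times the sample mean of f."""
    total = 0.0
    for _ in range(n):
        total += f(rng.uniform(a, b))   # draw x from the uniform distribution
    return (b - a) * total / n


rng = random.Random(0)                  # arbitrary seed, for reproducibility
estimate = monte_carlo_integral(math.sin, 0.0, math.pi, 100_000, rng)
```

With 100,000 draws the standard error of this estimate is roughly 0.003, so the answer should be close to 2 but will not match it digit for digit.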

Figure 8.8.1: Monte Carlo In-

tegration. The solid line plots
the function F ( x ). The circles
show the ( xn , F ( xn )) combina-
tions that arise when the x’s are
sampled from a uniform distri-
bution and F ( x ) is evaluated at
each of the n sample points. The
F ( xn ) points will be averaged to
compute E[ F ( x )]. Clearly, the
larger the sample is the more ac-
curate will be the integration.

8.9 Robert E. Lucas, Jr., and the Cost of Business Cycles

What is the welfare cost of random fluctuations in business
activity? This question was posed by Lucas (1987). Surprisingly, in a
representative agent framework it is very small. To see this, denote the
long-run (or stationary) distribution over consumption, $c$, and hours
worked, $h$, by $D(c, h)$. In the long run the distribution over
consumption and hours worked is the same in each period.³ Therefore,
expected momentary utility in each period is the same. The
representative agent’s expected lifetime utility is

$$(1 + \beta + \beta^2 + \cdots)E[U(c) + V(1-h)] = \frac{E[U(c) + V(1-h)]}{1-\beta} = \frac{\iint[U(c) + V(1-h)]dD(c,h)}{1-\beta},$$

³ The notion of a long-run distribution is presented later in Section 8.10
on Markov chains. It is also discussed in Chapter 9.

where U (c) is the utility function over consumption, V (1 − h) is the
one over leisure, and 0 < β < 1 is the discount factor.
So, what is the welfare cost of business cycles? This involves
computing either a compensating or equivalent variation. These concepts
are discussed in Chapter 2. Imagine that one could stabilize
consumption and hours worked at their respective means, $\bar{c}$ and $\bar{h}$. How much
compensation would the representative agent have to be given to make
him as well off in the world with business cycle fluctuations as in the
world without them? The equivalent variation, $e$, solves the equation

$$\iint[U(c(1+e)) + V(1-h)]dD(c,h) = U(\bar{c}) + V(1-\bar{h}).$$

The righthand side of the above expression gives the utility
that the representative agent realizes when consumption and hours are
stabilized at their mean levels. So, the question being posed is what
fraction of consumption in each and every state would the person have
to be given to make him as well off as in a world without fluctuations.
Observe that there is no $\beta$ in the formula because it will cancel out of
both sides of the equation. Last, despite being complicated looking,
this is only one equation in one unknown, $e$.
To come up with an estimate of the welfare cost of business cycles,
let utility be given by

$$U(c) = \theta\ln c \text{ and } V(1-h) = (1-\theta)\ln(1-h),$$

where $\theta = 0.33$. Given the logarithmic form of the utility function, it
is easy to see that

$$\theta\ln(1+e) = \theta\ln(\bar{c}) + (1-\theta)\ln(1-\bar{h}) - \iint[\theta\ln(c) + (1-\theta)\ln(1-h)]dD(c,h),$$

so that

$$e = \exp\Big(\Big\{\theta\ln(\bar{c}) + (1-\theta)\ln(1-\bar{h}) - \iint[\theta\ln(c) + (1-\theta)\ln(1-h)]dD(c,h)\Big\}/\theta\Big) - 1.$$

To use this formula, a distribution over $c$ and $h$ is needed and then a
double integration will have to be performed. This will be done using
Monte Carlo integration. To obtain the properties of $D(c, h)$ in the U.S.
data, time series for consumption, $c$, and hours, $h$, are logged and
then Hodrick-Prescott (H-P) filtered. The H-P filter is discussed later
in this chapter. The standard deviations for consumption and hours
are 0.0217 and 0.0257; see Chapter A for a discussion of descriptive
statistics. The correlation between these two variables is 0.6160. Figure
8.9.1 shows the bivariate distribution obtained from the U.S. data for
these two variables.

Figure 8.9.1: The bivariate distribution for consumption and
hours obtained from annual U.S. data, 1949 to 2017. The data was
first logged and then detrended using the Hodrick-Prescott filter.
When consumption is high there is a tendency for hours to also
be high, and vice versa; that is, the mass from the bars accumulates
along a diagonal in the $(c, h)$ plane from front to back.

For the analysis assume that $\ln c$ and $\ln h$ follow the bivariate normal distribution $N(\mu_{\ln c}, \mu_{\ln h}, \sigma^2_{\ln c}, \sigma^2_{\ln h}, \sigma_{\ln c,\ln h})$; the bivariate normal distribution is presented in Chapter A. As in the U.S. data, set the variances and covariance as follows: $\sigma^2_{\ln c} = 0.0217^2$, $\sigma^2_{\ln h} = 0.0257^2$, and $\sigma_{\ln c,\ln h} = 0.6160 \times (0.0217 \times 0.0257)$. Without loss of generality, pick $\mu_{\ln c} = -0.0002$ and $\mu_{\ln h} = -1.0989$, which imply that the means of c and h are 1.0 and 0.3333. Call up 100,000 random draws of (c, h), denoted by $(c_i, h_i)$, from this distribution and then calculate

$$\frac{1}{100{,}000}\sum_{i=1}^{100{,}000} [\theta \ln(c_i) + (1-\theta)\ln(1-h_i)],$$

which is a Monte Carlo approximation to the expectation $\iint [\theta \ln c + (1-\theta)\ln(1-h)]\,dD(c,h)$. Using the expectation in the formula for e results in 100 × e = 0.040. Thus, the representative agent needs to be given an extra 0.04 percent in terms of consumption to make him as well off as in a world without business cycles! By comparison, the estimate of the cost of consumption fluctuations in Lucas (1987, Table 2, p. 26) was 0.072 percent, at least when the utility function was logarithmic and consumption had a standard deviation of 0.039. Lucas ignored variability in labor supply. Hours worked tend to be low when consumption is low. Hence, leisure tends to be high when consumption is low, which partially offsets the loss from a fall in consumption. So, adding variability in labor supply works to reduce the cost of business cycles. Lucas's conclusion was that the welfare cost of business cycles is very low, especially in comparison with the large welfare effects associated with changes in an economy's growth rate; recall the discussion in Chapter 2.
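The Monte Carlo calculation described above can be sketched in a few lines of Python with NumPy. This is only an illustration under stated assumptions: the function name `welfare_cost` is hypothetical, the utility weight `theta = 1/3` is a placeholder because its value is not given in this section, and the compensation is taken to be e = exp{[u(c̄, h̄) − E u(c, h)]/θ} − 1, with c̄ and h̄ set to the means of the lognormal draws.

```python
import numpy as np

def welfare_cost(theta=1/3, n_draws=100_000, seed=0):
    """Monte Carlo estimate of the consumption compensation e (a fraction)."""
    # Moments of (ln c, ln h) reported in the text.
    sd_c, sd_h, corr = 0.0217, 0.0257, 0.6160
    mu = np.array([-0.0002, -1.0989])
    cov_ch = corr * sd_c * sd_h
    cov = np.array([[sd_c**2, cov_ch], [cov_ch, sd_h**2]])

    # Draw (ln c, ln h) from the bivariate normal and convert to levels.
    rng = np.random.default_rng(seed)
    draws = rng.multivariate_normal(mu, cov, size=n_draws)
    c, h = np.exp(draws[:, 0]), np.exp(draws[:, 1])

    # Sample average approximates the double integral over D(c, h).
    expected_u = np.mean(theta * np.log(c) + (1 - theta) * np.log(1 - h))

    # Utility in a world without cycles: c and h fixed at their lognormal means.
    c_bar = np.exp(mu[0] + sd_c**2 / 2)
    h_bar = np.exp(mu[1] + sd_h**2 / 2)
    u_bar = theta * np.log(c_bar) + (1 - theta) * np.log(1 - h_bar)

    return np.exp((u_bar - expected_u) / theta) - 1

print(100 * welfare_cost())  # welfare cost in percent of consumption
```

With only 100,000 draws the estimate is noisy relative to its size, so different seeds move the second decimal place of the percentage; increasing `n_draws` tightens it.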

8.10 Markov Chains

A Markov chain is a probability model where the system jumps from one state to another in a random manner. The odds of how the next jump will occur depend only on the current state of the system. Suppose the system can take one of n values at each point in time, denoted by $z \in Z \equiv \{z_1, z_2, \cdots, z_n\}$, where the set Z is time invariant. If the system is currently at state $z_i$, then the chance of it hopping next period to state $z_j$ is given by

$$\pi_{ij} = \Pr[z' = z_j \mid z = z_i], \text{ for all } i, j = 1, \cdots, n.$$

The $\pi_{ij}$'s are called transition probabilities. Since the odds of how the system moves depend solely on where the system is currently, the system is called memoryless. Now,

$$\sum_{j=1}^{n} \pi_{ij} = 1, \text{ for all } i,$$

because if the system is currently at $z_i$ it must either stay next period at $z_i$ or move to some $z_j$ with $j \neq i$.

Andrey A. Markov (1856-1922) was a Russian mathematician. He is best known for his work on stochastic processes.

Example 49. (Two-State Markov Chain) Let z have two values, a low value (state 1) and a high one (state 2), represented by $\underline{z}$ and $\overline{z}$. Suppose that the odds of going from the low value (state 1) to the high value (state 2) are given by $\pi_{12}$ and from the high value to the low value are $\pi_{21}$. Note that $\pi_{11} = 1 - \pi_{12}$ and $\pi_{22} = 1 - \pi_{21}$. Figure 8.10.1 illustrates the situation. Let $\pi_{12} = \pi_{21}$. This implies $\pi_{11} = \pi_{22}$. Hence, the chain is symmetric, a common assumption.

Now, load the transition probabilities, or the $\pi_{ij}$'s, into an n × n matrix, T, as follows

$$T = \begin{bmatrix} \pi_{11} & \cdots & \pi_{1n} \\ \vdots & \ddots & \vdots \\ \pi_{n1} & \cdots & \pi_{nn} \end{bmatrix}.$$

Figure 8.10.1: Two-State Markov Chain.

Each row of the matrix sums to one, since $\sum_{j=1}^{n} \pi_{ij} = 1$, for all i. Suppose that one is given some initial probability distribution, $\rho^0$, over the position of the states, where $\rho^0$ is the 1 × n vector

$$\rho^0 = (\rho^0_1, \rho^0_2, \cdots, \rho^0_n).$$

Next period's probability distribution over the states, or $\rho^1$, is given by

$$\underbrace{\rho^1}_{1\times n} = \underbrace{\rho^0}_{1\times n} \times \underbrace{T}_{n\times n}.$$

If one knew with certainty that the time-0 state of the system is $z_i$, then just set $\rho^0_i = 1$ and $\rho^0_j = 0$ for all $j \neq i$. So, writing the time-0 state of the system in the above manner is without loss of generality.

By expanding the above formula, it can be seen that

$$(\rho^1_1, ..., \rho^1_n) = (\rho^0_1, ..., \rho^0_n) \begin{bmatrix} \pi_{11} & \cdots & \pi_{1n} \\ \vdots & \ddots & \vdots \\ \pi_{n1} & \cdots & \pi_{nn} \end{bmatrix} = \Big(\sum_{i=1}^{n} \rho^0_i \pi_{i1}, \cdots, \sum_{i=1}^{n} \rho^0_i \pi_{in}\Big).$$

Take the probability of being in state j next period, or $\rho^1_j$. This will be given by $\rho^1_j = \sum_{i=1}^{n} \rho^0_i \pi_{ij}$. In theory one can get to state j in period 1 by starting out from any state i in period 0. The odds of initially being in state i are given by $\rho^0_i$ while the odds of then transiting from i to j are $\pi_{ij}$. So, $\rho^0_i \pi_{ij}$ represents the probability of starting out in i and then moving to j.
8.10.1 Stationary Distribution

It is easy to see that the m-period-ahead probability distribution over states is given by

$$\rho^m = \rho^{m-1} T = \rho^{m-2} T^2 = \cdots = \rho^0 T^m.$$

A question of interest is whether or not $\rho^m$ converges to something as m gets large. The long-run or stationary distribution, $\rho^*$, will be given by the fixed point to this operation or

$$\rho^* = \rho^* T. \qquad (8.10.1)$$

There are four ways to compute the stationary distribution.

1. The easiest way to compute $\rho^*$ is to iterate on the mapping

$$\rho^{j+1} = \rho^j T, \qquad (8.10.2)$$

until $|\rho^{j+1} - \rho^j|$ becomes sufficiently small. At each stage, the vector $\rho^j$ is just postmultiplied by the matrix T. This is an easy thing to do.

2. Alternatively, one could solve the following system for $\rho^*$:

$$\underbrace{(\rho_1, ..., \rho_{n-1}, \rho_n)}_{\rho^*} = (\rho_1, ..., \rho_{n-1}, \rho_n) \underbrace{\begin{bmatrix} \pi_{11} & \cdots & \pi_{1(n-1)} & -1 \\ \vdots & \ddots & \vdots & \vdots \\ \pi_{(n-1)1} & \cdots & \pi_{(n-1)(n-1)} & -1 \\ \pi_{n1} & \cdots & \pi_{n(n-1)} & 0 \end{bmatrix}}_{\hat{T}} + \underbrace{(0, ..., 0, 1)}_{b},$$

or

$$\rho^* = b\,[I - \hat{T}]^{-1}. \qquad (8.10.3)$$

Since the matrix I − T does not have full rank, the last equation in the system (the one for $\rho_n$) has been replaced by the equation $\sum_i \rho_i = 1$, which can be written as $\rho_n = 1 - \sum_{i \neq n} \rho_i$. (To see that I − T does not have full rank, replace each $\pi_{ii}$ with $1 - \sum_{j \neq i} \pi_{ij}$. The sum of the last n − 1 columns in I − T then equals the negative of the first one.) This is resolved in the above system of equations by writing the last equation as $\rho_n = -\rho_1 - \cdots - \rho_{n-1} + 1$.

3. Yet another way is to compute the eigenvalues and (left) eigenvectors associated with the transition matrix T. An eigenvalue/eigenvector pair must solve the equation

$$eT = \varepsilon e,$$

where e is a 1 × n eigenvector and ε its associated eigenvalue, which is a scalar; see Chapter A. Now, the stationary distribution will solve this equation when ε = 1 and $e = \rho^*$ (see equation (8.10.1) above). So, one just needs to find the eigenvector associated with an eigenvalue of one. This can be achieved by factorizing the transpose of T as $T' = E \Lambda E^{-1}$, where each column of E is the transpose of a left eigenvector e of T. Through this eigendecomposition, the diagonal elements of the diagonal matrix Λ are the eigenvalues ε associated with each eigenvector. Note that if e solves $eT = \varepsilon e$, then so will ξe, since $(\xi e)T = \varepsilon(\xi e)$, where ξ is a scalar. Choose ξ to normalize the eigenvector e so that $\sum_{i=1}^{n} e_i = 1$.

4. Last, one could conduct a Monte Carlo simulation to find the stationary distribution. To do this, the Markov chain is simulated for a long time series of independently and identically distributed random shocks drawn from a uniform distribution on [0, 1]. Suppose in some time period that the Markov chain is in state i. The Markov chain will move to state j next period if the shock for this period lies in the interval $\big(\sum_{l=1}^{j-1} \pi_{il}, \sum_{l=1}^{j} \pi_{il}\big]$, where the lower bound is zero when j = 1. The stationary distribution is then approximated by the fraction of periods that the simulated chain spends in each state.
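The four approaches can be compared in a short Python sketch using NumPy. The function names, the tolerance, the starting guess, and the simulation length are illustrative choices, not part of the text. The eigenvector routine works with the transpose of T so that the eigenvectors returned by `numpy.linalg.eig` are left eigenvectors of T.

```python
import numpy as np

def stationary_by_iteration(T, tol=1e-12, max_iter=100_000):
    """Method 1: iterate rho <- rho T until successive iterates agree."""
    rho = np.full(T.shape[0], 1.0 / T.shape[0])
    for _ in range(max_iter):
        rho_next = rho @ T
        if np.max(np.abs(rho_next - rho)) < tol:
            return rho_next
        rho = rho_next
    raise RuntimeError("iteration did not converge")

def stationary_by_linear_system(T):
    """Method 2: rho = b (I - T_hat)^{-1}, with the last column of T replaced."""
    n = T.shape[0]
    T_hat = T.astype(float).copy()
    T_hat[:, -1] = -1.0
    T_hat[-1, -1] = 0.0
    b = np.zeros(n)
    b[-1] = 1.0
    # rho (I - T_hat) = b is solved as (I - T_hat)' rho' = b'.
    return np.linalg.solve((np.eye(n) - T_hat).T, b)

def stationary_by_eigenvector(T):
    """Method 3: left eigenvector of T for the eigenvalue closest to one."""
    eigvals, eigvecs = np.linalg.eig(T.T)
    k = np.argmin(np.abs(eigvals - 1.0))
    e = np.real(eigvecs[:, k])
    return e / e.sum()  # normalize so the elements sum to one

def stationary_by_simulation(T, periods=200_000, seed=0):
    """Method 4: simulate the chain and record the time spent in each state."""
    n = T.shape[0]
    rng = np.random.default_rng(seed)
    cum = np.cumsum(T, axis=1)          # cumulative transition probabilities
    counts = np.zeros(n)
    state = 0
    for shock in rng.uniform(size=periods):
        # Move to the first j whose cumulative probability is >= the shock.
        state = min(int(np.searchsorted(cum[state], shock, side="left")), n - 1)
        counts[state] += 1
    return counts / periods

T = np.array([[0.9, 0.1], [0.3, 0.7]])
for method in (stationary_by_iteration, stationary_by_linear_system,
               stationary_by_eigenvector, stationary_by_simulation):
    print(method.__name__, method(T))
```

The first three methods agree to numerical precision when the chain has a unique stationary distribution, while the simulation carries sampling error that shrinks as `periods` grows.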
One might guess that the existence of a unique, invariant long-run distribution might be related to whether or not the operator T is a contraction mapping. Let $\mathcal{P}^n$ represent the space of n-dimensional probability vectors. Think about the transition matrix as defining an operator $T: \mathcal{P}^n \to \mathcal{P}^n$.

Lemma 50. (Convergence to a unique, invariant long-run distribution) $\lim_{m\to\infty} \rho^0 T^m = \rho^*$ for all $\rho^0 \in \mathcal{P}^n$ if and only if for some m it occurs that $T^m$ defines a contraction on $\mathcal{P}^n$.

Proof. See Stokey and Lucas (1986, chp 11).

A necessary and sufficient condition for this to occur is given next; again see Stokey and Lucas (1986, chp 11). In more conventional fashion, now let $T = [\pi_{ij}]$ be the n × n transition matrix and let $\pi^m_{ij}$ denote the (i, j) element of $T^m$.

Condition 51. (Mixing) For some j there exists an m such that $\min_i \pi^m_{ij} > 0$.

That is, there exists a column of $T^m$ in which every entry is strictly positive. This condition implies that at iteration m it is possible to get into state j from any other state. In other words, the system can't get stuck with probability one in any other state $i \neq j$.
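The mixing condition can be checked numerically by raising T to successive powers and looking for a column with no zero entries. The Python sketch below is only a heuristic: the function name `mixing_power` and the cap `max_power` are assumptions, and failing to find a positive column within the cap does not by itself prove that none exists at higher powers.

```python
import numpy as np

def mixing_power(T, max_power=50):
    """Smallest m <= max_power such that T^m has a strictly positive column.

    Returns None if no such m is found within the cap.
    """
    power = np.eye(T.shape[0])
    for m in range(1, max_power + 1):
        power = power @ T
        # min over rows (axis=0) gives the smallest entry in each column.
        if np.any(np.min(power, axis=0) > 0):
            return m
    return None

print(mixing_power(np.array([[0.9, 0.1], [0.3, 0.7]])))   # positive column at m = 1
print(mixing_power(np.array([[0.0, 1.0], [1.0, 0.0]])))   # None: chain alternates
print(mixing_power(np.eye(2)))                            # None: chain never moves
```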
Some examples for two-state Markov chains are presented now.

Example 52. (Long-run distribution for a two-state Markov chain.) Let

$$T = \begin{bmatrix} \pi_{11} & 1 - \pi_{11} \\ 1 - \pi_{22} & \pi_{22} \end{bmatrix}.$$

The stationary distribution, if it exists, must solve

$$(\rho_1, \rho_2) = (\rho_1, \rho_2) \begin{bmatrix} 1 - \pi_{12} & \pi_{12} \\ \pi_{21} & 1 - \pi_{21} \end{bmatrix}.$$

It's easy to check that

$$\rho^*_1 = \pi_{21}/(\pi_{12} + \pi_{21}) \text{ and } \rho^*_2 = \pi_{12}/(\pi_{12} + \pi_{21})$$

satisfy this equation. Or one could use the equations $\rho^*_1 = \rho^*_1 (1 - \pi_{12}) + \rho^*_2 \pi_{21}$ and $\rho^*_1 + \rho^*_2 = 1$, which is really just a restatement of (8.10.3) for the 2 × 2 case.

It may be the case that a unique invariant long-run distribution does not exist, as Examples 54 and 55 below show. Hence, the chains in those two examples can't satisfy the above mixing condition.

Example 53. (Long-run variance and autocorrelation for a symmetric two-state Markov chain) Let $z \in \{-\delta, \delta\}$. Define π by $\pi \equiv \pi_{11} = \pi_{22}$. From the previous example it is clear that $\rho^*_1 = \rho^*_2 = 0.5$. The long-run mean of the shock is $E[z] = -\delta/2 + \delta/2 = 0$. It is easy to see that $E[z^2] = \delta^2/2 + \delta^2/2 = \delta^2$. Additionally, $(E[z])^2 = 0$. Thus, the variance and standard deviation are given by $\delta^2$ and δ. Finally, $E[z'z] = (\pi_{11} - \pi_{12} - \pi_{21} + \pi_{22})\delta^2/2 = (2\pi - 1)\delta^2$. This implies that the long-run coefficient of autocorrelation is $(2\pi - 1)\delta^2/\delta^2 = 2\pi - 1$.

Example 54. (A long-run distribution does not exist) Let

$$T = \begin{bmatrix} 0 & 1 \\ 1 & 0 \end{bmatrix}.$$

Check that with this transition matrix, if you start out in state 1, you'll switch to state 2 and vice versa. Here, it is easy to deduce that $\lim_{m\to\infty} \rho^0 T^m$ doesn't exist for certain $\rho^0$; try $\rho^0 = [1, 0]$. It is easy to see that $T^m = T$ for all odd m and $T^m = I$ for all even m. Hence, the mixing condition will never obtain.

Example 55. (The long-run distribution is not invariant) Let

$$T = \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix}.$$

Check that with this transition matrix, if you start out in state 1 you'll stay there, while the same is true for state 2. Pick any conjectured $\rho^*$. The above fact implies that $\lim_{m\to\infty} \rho^0 T^m = \rho^0 \neq \rho^*$ for every $\rho^0 \in \mathcal{P}^2$ other than $\rho^*$ itself. Here there are different long-run distributions associated with different starting values, $\rho^0$; try $\rho^0 = [1, 0]$ and $\rho^0 = [0, 1]$. Finally, observe that $T^m = I$ for all m, so the mixing condition cannot ever be satisfied.

8.11 Unemployment: Calibrating a Markov Chain

Workers move between different jobs throughout their careers either because they find better opportunities or else are forced to leave their current positions. The goal here is to use a two-state Markov chain over employment and unemployment to estimate what percentage of workers are separated from their jobs every month and what percentage of the unemployed find a job. To do this, data will be used on the average rate of unemployment and the average duration of unemployment. This data will then be used to back out the job-finding and separation probabilities using the stationary distribution from the Markov chain.
Let the transition between employment, e, and unemployment, u, be governed by the job-finding probability φ (how likely an unemployed worker is to find a job) and the separation probability σ (how likely an employed worker is to become unemployed). Given this, the aggregate rates of employment and unemployment will evolve according to

$$e_{t+1} = (1 - \sigma) e_t + \phi u_t$$

and

$$u_{t+1} = \sigma e_t + (1 - \phi) u_t.$$

These equations are easy to explain. Take the one for unemployment. In period t the employment rate is $e_t$. Out of this component of the labor force the fraction σ will lose their job and enter unemployment in period t + 1. Likewise, in period t the unemployment rate is $u_t$. From the pool of unemployed the fraction 1 − φ will not find a job and therefore will remain unemployed in period t + 1. An interesting statistic to analyze is the average duration of unemployment, d. This is given by⁴

$$\begin{aligned} d &= 1\phi + 2(1-\phi)\phi + 3(1-\phi)^2\phi + \cdots + n(1-\phi)^{n-1}\phi + \cdots \\ &= \phi\,[1 + 2(1-\phi) + 3(1-\phi)^2 + \cdots + n(1-\phi)^{n-1} + \cdots] \\ &= 1/\phi. \end{aligned}$$

⁴ To go from the penultimate to the last line, recall the formula for a geometric series, or $1/(1-x) = 1 + x + x^2 + x^3 + \cdots$, where 0 < x < 1. Differentiate both sides with respect to x to get $1/(1-x)^2 = 1 + 2x + 3x^2 + \cdots$. Now, set x = 1 − φ so that 1 − x = φ.

To understand this formula, note that to be unemployed for exactly n periods, you must have not found a job for n − 1 consecutive periods, which occurs with probability $(1-\phi)^{n-1}$, and then found a job at the end of the nth period, which happens with odds φ.
The stationary distribution for employment and unemployment must solve the following version of (8.10.3),

$$(e, u) = (e, u) \begin{bmatrix} 1 - \sigma & -1 \\ \phi & 0 \end{bmatrix} + (0, 1),$$

from which it is easy to calculate that⁵

$$e = \frac{\phi}{\sigma + \phi} \text{ and } u = \frac{\sigma}{\sigma + \phi}.$$

⁵ Again, this amounts to using the two equations e = e(1 − σ) + uφ and e + u = 1.
To calibrate the above Markov chain two facts will be used. First, the long-run rate of unemployment is set to 5.77 percent, the average monthly unemployment rate since 1948, implying that u = 0.0577. Second, the average unemployment duration is taken to be 16.2254 weeks so that d = 16.2254/4 = 4.0564 months. Both series are shown in Figure 8.11.1. The job-finding probability is then given by φ = 1/4.0564 = 0.2465, so that 24.65 percent of the unemployed find a job every month. From the formula for the long-run unemployment rate, it follows that the job separation rate is σ = uφ/(1 − u) = 0.0151. Thus, 1.51 percent of workers are separated monthly from their jobs.
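The calibration amounts to two lines of arithmetic, sketched here in Python. The function name `calibrate_two_state` and the argument `weeks_per_month` are illustrative; the four-weeks-per-month conversion follows the text. The loop afterwards simply iterates the laws of motion for employment and unemployment to confirm that they settle at the targeted unemployment rate.

```python
def calibrate_two_state(u_rate, duration_weeks, weeks_per_month=4.0):
    """Back out the job-finding (phi) and separation (sigma) probabilities."""
    d = duration_weeks / weeks_per_month   # average duration in months
    phi = 1.0 / d                          # from d = 1 / phi
    sigma = u_rate * phi / (1.0 - u_rate)  # from u = sigma / (sigma + phi)
    return phi, sigma

phi, sigma = calibrate_two_state(u_rate=0.0577, duration_weeks=16.2254)
print(round(phi, 4), round(sigma, 4))

# Iterate e' = (1 - sigma) e + phi u and u' = sigma e + (1 - phi) u.
e, u = 0.5, 0.5
for _ in range(500):
    e, u = (1 - sigma) * e + phi * u, sigma * e + (1 - phi) * u
print(round(u, 4))  # long-run unemployment rate implied by the calibration
```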

Figure 8.11.1: The unemployment rate and average unemployment duration over the last 60 years.

8.12 The Equity Premium: A Puzzle

From 1889 to 1978 the average return on equity from the Standard and Poor's 500 index was 7 percent. In contrast, the average yield on short-term debt was less than 1 percent. Can such a differential be explained within the neoclassical growth model? The puzzle is that to get a low risk-free interest rate in a growing economy you need a high elasticity of intertemporal substitution. When future income is higher than current income individuals would like to borrow. This operates to drive up the interest rate. To mitigate this, people must be very willing to postpone consumption in response to interest rate rises; in other words, a high elasticity of intertemporal substitution is required. To get a large equity premium, people must dislike risk. This requires a high coefficient of relative risk aversion. But in a standard macro model, with isoelastic preferences for consumption, the coefficient of relative risk aversion (discussed below) is the reciprocal of the elasticity of intertemporal substitution (discussed in Chapter 6). In a classic paper, Mehra and Prescott (1985) show that the standard macro model can only generate a return on equity that is 0.4 percentage points higher than the return on short-term debt.

8.12.1 The Setup

Consider a representative agent economy where the individual has tastes of the following form

$$U(c, \alpha) = \frac{c^{1-\alpha} - 1}{1 - \alpha}, \quad 0 < \alpha < \infty,$$

where c is consumption. Now, 1/α is the elasticity of intertemporal substitution; see Chapter 6 for a discussion of this elasticity. With this utility function α also represents the coefficient of relative risk aversion, as is discussed below. So, α does double duty. This is at the heart of the equity premium puzzle. The person discounts the future using the discount factor β. Let the person's income, y, evolve according to

$$y' = zy,$$

where $z \in \{z_1, z_2\}$ follows a two-state Markov chain with $\pi_{ij} \equiv \Pr[z' = z_j \mid z = z_i]$. Now, assume that

$$z_1 = 1 + \mu + \delta \text{ and } z_2 = 1 + \mu - \delta.$$

Here µ governs how consumption grows while δ controls its volatility. Additionally, assume that

$$\pi_{11} = \pi_{22} \equiv \pi,$$

which implies that $\pi_{12} = \pi_{21} = 1 - \pi$. Thus, the Markov chain is symmetric.

Definition 56. (Coefficient of relative risk aversion) The coefficient of relative risk aversion is defined as $\theta = -c[U_{11}(c)/U_1(c)]$. With the utility function $U(c, \alpha) = (c^{1-\alpha} - 1)/(1 - \alpha)$ it is clear that α = θ. To understand this concept, consider a static setting where an individual may invest a fraction, φ, of his wealth, c, in a risky asset that will pay off either γ + ε or γ − ε with probability 1/2, where γ > 1 and ε > 0. So, the expected payoff on a unit of investment in this risky asset is γ > 1. At the time of the payoff the person will consume all of his wealth. So, the individual's problem is

max{U ((1 − φ)c + φc(γ + ε)) + U ((1 − φ)c + φc(γ − ε))}/2.

The first-order condition associated is

U1 ((1 − φ)c + φc(γ + ε)) × (γ + ε − 1) + U1 ((1 − φ)c + φc(γ − ε)) × (γ − ε − 1) = 0.

Now, take a first-order Taylor expansion of the above two marginal

utility terms around c to get

U1 (c) × (γ + ε − 1) + U11 (c)φc(γ + ε − 1)2 /2 + U1 (c) × (γ − ε − 1) + U11 (c)φc(γ − ε − 1)2 /2 = 0.

Next, let ε → 0, which amounts to assuming that the risk is small. Then, it transpires that

$$\phi = \frac{1}{\theta} = \frac{1}{\alpha} = -\frac{1}{c}\frac{U_1(c)}{U_{11}(c)}.$$

So, the reciprocal of the coefficient of relative risk aversion can be thought of as measuring the fraction of wealth that an individual will invest in a risky asset. The bigger the coefficient of relative risk aversion, θ, the smaller will be the amount invested in the risky asset.

8.12.2 Pricing Equities and Bonds

The prices for equities and bonds will now be characterized. Since
this is a representative agent model, there will never be any trades in
either equities or bonds. That is, the person will always consume his
endowment in a period. Equities and bonds are priced so that there
will always be zero stock and bond trades in equilibrium.

Equity
Suppose that an equity is a claim on the flow of income, y. Let the price of a share be denoted by p. This price will be a function of the state of the economy. So, let $p = P(y, z_i)$ represent the current price of equity and $p' = P(y', z_j) = P(z_j y, z_j)$ denote next period's price. If the individual buys a share in the current period, his consumption will be reduced by $P(y, z_i)$. The marginal utility of current consumption is $y^{-\alpha}$, so his utility will be reduced by $y^{-\alpha} P(y, z_i)$. Next period the person gets a dividend in the random amount y′ and will be able to sell the share at the price $P(z_j y, z_j)$. Thus, utility next period will be increased by the random amount $\beta y'^{-\alpha}[y' + P(y', z_j)] = \beta (z_j y)^{-\alpha}[z_j y + P(z_j y, z_j)]$. This event occurs with chance $\pi_{ij}$. So, the person's Euler equation is

$$y^{-\alpha} P(y, z_i) = \pi_{i1} \beta (z_1 y)^{-\alpha}[z_1 y + P(z_1 y, z_1)] + \pi_{i2} \beta (z_2 y)^{-\alpha}[z_2 y + P(z_2 y, z_2)],$$

or, after dividing through by $y^{-\alpha}$,

$$P(y, z_i) = \pi_{i1} \beta z_1^{-\alpha}[z_1 y + P(z_1 y, z_1)] + \pi_{i2} \beta z_2^{-\alpha}[z_2 y + P(z_2 y, z_2)].$$

Now, conjecture that the pricing function is given by $P(y, z_i) = w_i y$, implying that $P(z_j y, z_j) = w_j z_j y$. This guess looks reasonable because the above equation is linear in y and any functional dependence on $z_i$ can be captured by the constant $w_i$; i.e., think about writing $w_i = Z(z_i)$ where Z is some function. If so, then

$$w_i = \beta \pi_{i1} z_1^{1-\alpha}(1 + w_1) + \beta \pi_{i2} z_2^{1-\alpha}(1 + w_2), \text{ for } i = 1, 2.$$

This is a system of two equations in two unknowns and can be represented in matrix notation by

$$w = \beta \Lambda w + \gamma,$$

where

$$w = \begin{bmatrix} w_1 \\ w_2 \end{bmatrix}, \quad \Lambda = \begin{bmatrix} \pi_{11} z_1^{1-\alpha} & \pi_{12} z_2^{1-\alpha} \\ \pi_{21} z_1^{1-\alpha} & \pi_{22} z_2^{1-\alpha} \end{bmatrix}, \quad \gamma = \begin{bmatrix} \beta(\pi_{11} z_1^{1-\alpha} + \pi_{12} z_2^{1-\alpha}) \\ \beta(\pi_{21} z_1^{1-\alpha} + \pi_{22} z_2^{1-\alpha}) \end{bmatrix}.$$

Thus,

$$w = [I - \beta \Lambda]^{-1} \gamma,$$

assuming that $|I - \beta \Lambda| \neq 0$.

What is the expected return from holding equity? The realized return, $r_{ij}$, when moving from state $(y, z_i)$ to $(z_j y, z_j)$ is

$$r_{ij} = \frac{P(z_j y, z_j) + z_j y - P(y, z_i)}{P(y, z_i)} = \frac{z_j (w_j + 1)}{w_i} - 1.$$

The expected return on equity, conditional on the current state being i, is

$$R_i = \pi_{i1} r_{i1} + \pi_{i2} r_{i2}.$$

Thus, the long-run return on equity is

$$R^e = \rho^*_1 R_1 + \rho^*_2 R_2.$$

Bonds
Next consider the price of a one-period discount bond in state i, or $p^f_i = P^f(y, z_i)$. Such a bond will pay off one unit of consumption next period with certainty. Even so, the marginal utility of next period's consumption is a random variable dependent on the individual's income. The Euler equation for the one-period discount bond reads

$$y^{-\alpha} P^f(y, z_i) = \pi_{i1} \beta (z_1 y)^{-\alpha} + \pi_{i2} \beta (z_2 y)^{-\alpha},$$

so that

$$P^f(y, z_i) = \frac{\pi_{i1} \beta (z_1 y)^{-\alpha} + \pi_{i2} \beta (z_2 y)^{-\alpha}}{y^{-\alpha}} = \beta(\pi_{i1} z_1^{-\alpha} + \pi_{i2} z_2^{-\alpha}).$$

The expected return on this risk-free asset, conditional on the current state being i, is

$$R^f_i = 1/P^f(y, z_i) - 1,$$

which implies that the long-run return will be

$$R^f = \rho^*_1 R^f_1 + \rho^*_2 R^f_2.$$

8.12.3 Findings

For the U.S. economy the mean annual growth rate in consumption was 0.018. Its standard deviation and autocorrelation were 0.036 and -0.14. Matching these facts necessitated setting µ = 0.018, δ = 0.036, and π = 0.43. Now, clearly the discount factor, β, should lie between 0 and 1. Mehra and Prescott (1985) suggest that the coefficient of relative risk aversion, α, is bounded between 0 and 10. So, they computed the risk-free rate and the equity premium for values of α and β that lie within these ranges subject to the condition that a solution for the model exists or that $|I - \beta \Lambda| \neq 0$. In other words, think about the risk-free rate and the equity premium as being defined by two functions $R^f = R(\alpha, \beta)$ and $R^e - R^f = P(\alpha, \beta)$. They compute the values of these two functions for parameter values that lie in the set $\mathcal{X}$ where

$$\mathcal{X} = \{(\alpha, \beta) : 0 < \beta < 1,\ 0 < \alpha < 10, \text{ and } |I - \beta \Lambda| \neq 0\}.$$

As can be seen from Figure 8.12.1, the model can't simultaneously generate an equity premium of 6.98 percent and risk-free return of 0.8 percent. So, within the context of a frictionless Arrow-Debreu-McKenzie world it is difficult to rationalize why the average return on equity was so high while the risk-free return was so low.
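The pricing formulas from Section 8.12.2 can be evaluated directly for any pair (α, β). The Python sketch below is an illustration rather than a replication of Mehra and Prescott's exact procedure: the function name `mehra_prescott_returns` is hypothetical, the default parameters are the calibrated values quoted above, and the stationary distribution is taken to be (0.5, 0.5) because the chain is symmetric.

```python
import numpy as np

def mehra_prescott_returns(alpha, beta, mu=0.018, delta=0.036, pi=0.43):
    """Return the long-run risk-free rate and the equity premium."""
    z = np.array([1 + mu + delta, 1 + mu - delta])   # growth states z1, z2
    T = np.array([[pi, 1 - pi], [1 - pi, pi]])       # symmetric transition matrix
    rho_star = np.array([0.5, 0.5])                  # stationary distribution

    # Equity: w = [I - beta * Lambda]^{-1} gamma, Lambda[i, j] = pi_ij z_j^(1-alpha).
    Lam = T * z**(1 - alpha)
    gamma = beta * Lam.sum(axis=1)
    w = np.linalg.solve(np.eye(2) - beta * Lam, gamma)

    # Realized returns r[i, j] = z_j (w_j + 1) / w_i - 1, then average.
    r = z * (w + 1) / w[:, None] - 1
    R_equity = rho_star @ (T * r).sum(axis=1)

    # Bonds: P_f[i] = beta * sum_j pi_ij z_j^(-alpha).
    P_bond = beta * (T * z**(-alpha)).sum(axis=1)
    R_free = rho_star @ (1 / P_bond - 1)

    return R_free, R_equity - R_free

for alpha in (1.0, 2.0, 5.0, 10.0):
    rf, premium = mehra_prescott_returns(alpha, beta=0.99)
    print(alpha, round(100 * rf, 2), round(100 * premium, 2))  # percent
```

Looping over a grid of (α, β) values and plotting the premium against the risk-free rate would trace out a region like the one summarized in Figure 8.12.1; the sketch does not check that the implied prices are positive, only that the linear system can be solved.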

Figure 8.12.1: Equity premium and risk-free rate combinations for various values of the coefficient of relative risk aversion, α, and the discount factor, β, lying in the set $\mathcal{X}$. Source: Mehra and Prescott (1985, p. 155).

Kenneth J. Arrow (1921-2017), Gerard Debreu (1921-2004), and Lionel W. McKenzie (1919-2010) are considered to be the fathers of modern general equilibrium theory. Arrow and Debreu won Nobels in 1972 and 1983, respectively. In 1995 McKenzie was awarded The Order of the Rising Sun in Japan.

8.13 Approximating an AR1 by a Markov Chain

AR1 processes are commonly used in macroeconomics. The generic AR1 process has the form

$$z_{t+1} = \rho z_t + \varepsilon_{t+1}, \text{ where } \varepsilon_{t+1} \sim N(0, \sigma^2).$$

This has an autocorrelation coefficient of ρ, where 0 < ρ < 1, a conditional standard deviation of σ, and an unconditional (long-run) standard deviation of $\sigma/\sqrt{1 - \rho^2}$. Often the AR1 process is specified in logs. This ensures that $z_{t+1}$ will always be nonnegative since now $z_{t+1} = z_t^{\rho} \exp(\varepsilon_{t+1})$. How can such AR1 processes be approximated by an N-state Markov chain, where N ≥ 2?

8.13.1 Algorithm: Rouwenhorst (1995)

1. Constrain the variable z to always lie in a time-invariant grid of n equally spaced points centered around 0, so that $z \in \{z_1, \cdots, z_n\}$ with $-z_1 = z_n = \psi > 0$, where $\psi = \sigma\sqrt{n-1}/\sqrt{1-\rho^2}$. A transition matrix, $T^n$, is sought that has the form

$$T^n = \begin{bmatrix} \pi_{11} & \cdots & \pi_{1n} \\ \vdots & \ddots & \vdots \\ \pi_{n1} & \cdots & \pi_{nn} \end{bmatrix},$$

where $\pi_{kl}$ are the odds of going from state k to state l. The summation across any row equals 1; i.e., $\sum_{l=1}^{n} \pi_{kl} = 1$ for all k.

2. The transition matrix $T^j$ is generated recursively for j = 3, ..., n as follows:

(a) Compute the j × j matrix

$$T^j = p \begin{bmatrix} T^{j-1} & 0 \\ 0' & 0 \end{bmatrix} + (1-p) \begin{bmatrix} 0 & T^{j-1} \\ 0 & 0' \end{bmatrix} + (1-p) \begin{bmatrix} 0' & 0 \\ T^{j-1} & 0 \end{bmatrix} + p \begin{bmatrix} 0 & 0' \\ 0 & T^{j-1} \end{bmatrix},$$

where

$$T^2 = \begin{bmatrix} p & 1-p \\ 1-p & p \end{bmatrix},$$

0 is a (j − 1) × 1 column vector of zeros (or the scalar zero in the 1 × 1 corner blocks), 0′ is its 1 × (j − 1) transpose, and p = (1 + ρ)/2.

(b) At the end of each iteration, all but the first and last rows of $T^j$ should be divided by 2.

The idea is that if you are in the upper left cell on iteration j − 1 then you will stay there with probability p on iteration j or move to the upper right cell with the complementary probability 1 − p. Note that it is possible to get into the rows 2, ..., n − 1 of $T^j$ from either the upper or lower cells, which explains the division by 2; i.e., without the division by 2, $\sum_l \pi_{kl} = 2$ for k = 2, ..., n − 1. By setting p = (1 + ρ)/2, the Markov chain will have the same long-run variance and first-order autocorrelation as the AR1.
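The recursion translates almost line by line into Python. In the sketch below the function name `rouwenhorst` and the example parameters are illustrative; the four slices of `T_new` correspond to the four padded matrices in step 2(a), and the interior rows are halved as in step 2(b).

```python
import numpy as np

def rouwenhorst(n, rho, sigma):
    """Grid and transition matrix approximating z' = rho z + eps, eps ~ N(0, sigma^2)."""
    p = (1 + rho) / 2
    T = np.array([[p, 1 - p], [1 - p, p]])          # T^2
    for j in range(3, n + 1):
        T_new = np.zeros((j, j))
        T_new[:-1, :-1] += p * T                    # T^{j-1} in the upper-left corner
        T_new[:-1, 1:] += (1 - p) * T               # upper-right corner
        T_new[1:, :-1] += (1 - p) * T               # lower-left corner
        T_new[1:, 1:] += p * T                      # lower-right corner
        T_new[1:-1, :] /= 2                         # interior rows were counted twice
        T = T_new
    psi = sigma * np.sqrt(n - 1) / np.sqrt(1 - rho**2)
    grid = np.linspace(-psi, psi, n)                # n equally spaced points
    return grid, T

grid, T = rouwenhorst(n=5, rho=0.9, sigma=0.1)
rho_star = np.linalg.matrix_power(T, 5000)[0]       # long-run distribution
variance = rho_star @ grid**2
autocorr = (rho_star * grid) @ (T @ grid) / variance
print(T.sum(axis=1))                                # each row sums to one
print(variance, 0.1**2 / (1 - 0.9**2))              # chain vs. AR1 variance
print(autocorr)                                     # chain autocorrelation vs. rho
```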

8.14 Interpolation

Suppose that one wants to represent a set of n points, $(x_1, y_1), (x_2, y_2), \cdots, (x_n, y_n)$, by a continuous function y = F(x), for which an analytical expression is not available. That is, assume that $y_i$ represents the value of F when evaluated at the point $x_i$, for i = 1, ..., n. So, the problem is to construct a continuous function from the set of given data points, $(x_1, y_1), (x_2, y_2), \cdots, (x_n, y_n)$, that will specify a value for y given any value for x within some specified range, say for example $[x_1, x_n]$. The constructed function will have the property that $y_i = F(x_i)$ for all i, so that it fits all of the specified data points.

8.14.1 Piecewise Linear Interpolation

Piecewise linear interpolation places a continuous curve through the n points, $(x_1, y_1), (x_2, y_2), \cdots, (x_n, y_n)$. The segment between each pair of adjacent points is linear. So, the curve is made up of a series of linear segments. Denote this piecewise linear function by L. The piecewise linear function L satisfies the following criteria:

1. L(x) is a linear function, denoted by $L_h$, on each of the subintervals $[x_h, x_{h+1}]$ for each h = 1, ..., n − 1. Denote this linear function by $L_h(x) = \alpha_h + \beta_h (x - x_h)$, for $x_h \le x \le x_{h+1}$.

2. $L(x) = y_h$ when $x = x_h$ for each h = 1, ..., n. So, the piecewise linear function passes through each interpolation point.

It is easy to see that the following solution for $L_h(x)$ works:

$$L_h(x) = (1 - \mu) y_h + \mu y_{h+1},$$

where $\mu = (x - x_h)/(x_{h+1} - x_h)$, for $x_h \le x \le x_{h+1}$. This implies

$$\beta_h = \frac{y_{h+1} - y_h}{x_{h+1} - x_h},$$

and

$$\alpha_h = y_h.$$

Figure 8.14.1 illustrates the situation, where the function y = F(x) is approximated by the piecewise linear function L(x).
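A direct implementation of the formula $L_h(x) = (1 - \mu) y_h + \mu y_{h+1}$ is sketched below in Python. The function name `piecewise_linear` is illustrative, the nodes are assumed to be sorted in increasing order, and evaluation points are assumed to lie in $[x_1, x_n]$. NumPy's built-in `np.interp` performs the same calculation and is used as a cross-check.

```python
import numpy as np

def piecewise_linear(x_nodes, y_nodes, x):
    """Evaluate the piecewise linear interpolant L at the points x."""
    x_nodes = np.asarray(x_nodes, dtype=float)
    y_nodes = np.asarray(y_nodes, dtype=float)
    x = np.asarray(x, dtype=float)
    # Locate the subinterval [x_h, x_{h+1}] containing each x (0-based h).
    h = np.clip(np.searchsorted(x_nodes, x, side="right") - 1, 0, len(x_nodes) - 2)
    mu = (x - x_nodes[h]) / (x_nodes[h + 1] - x_nodes[h])
    return (1 - mu) * y_nodes[h] + mu * y_nodes[h + 1]

x_nodes = [0.0, 1.0, 3.0]
y_nodes = [0.0, 2.0, 1.0]
x_eval = np.array([0.0, 0.5, 2.0, 3.0])
print(piecewise_linear(x_nodes, y_nodes, x_eval))   # [0.  1.  1.5 1. ]
print(np.interp(x_eval, x_nodes, y_nodes))          # same values
```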
Figure 8.14.1: The function y = F(x) is approximated by the piecewise linear function L(x). At each interpolation point, $x_i$, the piecewise linear function, $L(x_i)$, takes the same value as $y_i = F(x_i)$. The values of L(x) and F(x) differ when not at an interpolation point.

8.14.2 Cubic Spline Interpolation

Cubic spline interpolation fits a flexible, $C^2$ curve through the n points, $(x_1, y_1), (x_2, y_2), \cdots, (x_n, y_n)$. Denote this spline by S(x). The spline function S satisfies the following criteria:

1. S(x) is made up of cubic polynomials, denoted by $S^h$, on each of the subintervals $[x_h, x_{h+1}]$ for each h = 1, ..., n − 1. Denote this cubic by

$$S^h(x) = \alpha_h + \beta_h (x - x_h) + \psi_h (x - x_h)^2 + \delta_h (x - x_h)^3, \text{ for } x_h \le x \le x_{h+1}.$$

A prototypical cubic is shown in Figure 8.14.2.

2. $S(x) = y_h$ when $x = x_h$ for each h = 1, ..., n. Therefore, it goes through all of the interpolation points.

3. $S^{h+1}(x_{h+1}) = S^h(x_{h+1})$ for each h = 1, ..., n − 2. The cubics over each interval are connected.

4. $S^{h+1}_1(x_{h+1}) = S^h_1(x_{h+1})$ for each h = 1, ..., n − 2. The connections are smooth in the sense that the first derivatives are the same where one cubic ends and the other starts.

5. $S^{h+1}_{11}(x_{h+1}) = S^h_{11}(x_{h+1})$ for each h = 1, ..., n − 2. The function is very smooth in the sense that the second derivatives are the same at connection points.

6. One of the following boundary conditions is satisfied:

(a) $S_{11}(x_1) = S_{11}(x_n) = 0$ (free or natural),

(b) $S_1(x_1) = F_1(x_1)$ and $S_1(x_n) = F_1(x_n)$ (clamped), where F(x) is some function that is being approximated.

Figure 8.14.2: A Cubic Equation: $y = x^3$.

Observe that there are n − 1 intervals. Hence, there are 4(n − 1) parameters that a solution is needed for: the $\alpha_h$'s, $\beta_h$'s, $\psi_h$'s and $\delta_h$'s. Properties 2 to 6 imply that there will be exactly 4(n − 1) linear restrictions. The Hodrick-Prescott filter, discussed later, is a close cousin of cubic spline interpolation.
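Because the restrictions are linear, the spline coefficients can be found by solving a linear system. The Python sketch below handles the natural boundary condition only and uses a standard shortcut that is an assumption of this example rather than something stated in the text: it first solves a tridiagonal system for the second derivatives at the nodes (called `M` here) and then recovers $\alpha_h$, $\beta_h$, $\psi_h$ and $\delta_h$ from them. The function name `natural_cubic_spline` is illustrative and the nodes are assumed to be sorted.

```python
import numpy as np

def natural_cubic_spline(x_nodes, y_nodes):
    """Return a function S(x) for the natural cubic spline through the nodes."""
    x = np.asarray(x_nodes, dtype=float)
    y = np.asarray(y_nodes, dtype=float)
    n = len(x)
    d = np.diff(x)                      # interval widths x_{h+1} - x_h
    slope = np.diff(y) / d              # secant slopes on each interval

    # Linear system for the second derivatives M at the nodes.
    A = np.zeros((n, n))
    rhs = np.zeros(n)
    A[0, 0] = A[-1, -1] = 1.0           # natural boundary: M_1 = M_n = 0
    for i in range(1, n - 1):           # continuity of the first derivative
        A[i, i - 1] = d[i - 1]
        A[i, i] = 2 * (d[i - 1] + d[i])
        A[i, i + 1] = d[i]
        rhs[i] = 6 * (slope[i] - slope[i - 1])
    M = np.linalg.solve(A, rhs)

    # Coefficients of S^h(x) = alpha + beta t + psi t^2 + delta t^3, t = x - x_h.
    alpha = y[:-1]
    beta = slope - d * (2 * M[:-1] + M[1:]) / 6
    psi = M[:-1] / 2
    delta = (M[1:] - M[:-1]) / (6 * d)

    def S(x_eval):
        x_eval = np.asarray(x_eval, dtype=float)
        h = np.clip(np.searchsorted(x, x_eval, side="right") - 1, 0, n - 2)
        t = x_eval - x[h]
        return alpha[h] + beta[h] * t + psi[h] * t**2 + delta[h] * t**3

    return S

nodes = np.linspace(0.0, 2.0 * np.pi, 9)
S = natural_cubic_spline(nodes, np.sin(nodes))
print(np.max(np.abs(S(nodes) - np.sin(nodes))))     # zero at the nodes
print(S(1.0), np.sin(1.0))                          # close between the nodes
```

A dense solver is used for brevity; with many nodes a dedicated tridiagonal routine would be the more economical choice.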

8.14.3 Spline Art

Cubic spline functions are very flexible and can be used to approximate many things. An artist's ink drawing of a face is shown in the upper panel of Figure 8.14.3. The lower panel shows a computer-generated facsimile of the artist's sketch using cubic spline functions. To do this, the artist's sketch was broken up into 4 regions: the left eyebrow, the left eyeball, the profile, and the right eyelid. The pixel coordinates for the lines in each of the regions were read off by placing the cursor from a mouse on the parts of the lines in each region. The pixel coordinates were then translated into (x, y) coordinates. There are many programs that can read off pixel coordinates from a graph, such as Windows Paint. The more points that are read off, the more accurate will be the computer-generated rendition. Finally, 4 cubic splines were fit to the (x, y) coordinates in each of the 4 regions. The cubic spline interpolation did a great job replicating the artist's sketch. This demonstrates the utility of cubic splines.
Figure 8.14.3: Spline Art. The top panel presents a sketch of a face done by an artist. The left plot in the bottom panel shows a set of points mimicking the sketch. The right plot in the bottom panel is a computer-generated approximation of the face using shape-preserving cubic spline functions.

8.14.4 Radial Basis Functions

A modern approach to interpolation is radial basis functions. An advantage of this approach is that it easily extends to multivariate functions. The idea underlying radial basis function interpolation is easy to understand. The interpolating function is a linear combination of radial basis functions each centered around one of the interpolation points, $x_i$, in the domain. The generic radial basis function has the form $\phi(|x - x_i|)$, where $|x - x_i|$ is the radial distance of the point x from the interpolation point $x_i$. Most often the Euclidean norm is used for the distance measure. Note that even though the x's could be vectors, φ is a function of only one argument, the radial distance $r \equiv |x - x_i|$. An equisized step away from $x_i$ in any direction has the same influence on φ. The value of the interpolating function at the point x is given by

$$R(x) = \sum_{i=1}^{n} \omega_i \phi(|x - x_i|),$$

where $\omega_i$ is the weight attached to the radial basis function that is centered at the point $x_i$. The value of the interpolating function at the point x is a function of all of the interpolation points, or the $x_i$'s. To compute the weights, or the $\omega_i$'s, the function R(x) is forced to have the value $y_i$ when evaluated at $x_i$. Thus, the weights can be recovered by solving the following system of linear equations

$$\begin{aligned} y_1 &= R(x_1) = \sum_{i=1}^{n} \omega_i \phi(|x_1 - x_i|) \\ &\;\;\vdots \\ y_n &= R(x_n) = \sum_{i=1}^{n} \omega_i \phi(|x_n - x_i|). \end{aligned}$$

Some examples of radial basis functions are shown below (where ε is some constant):

Various Radial Basis Functions

$$\begin{array}{ll} \phi(r) = e^{-\varepsilon r^2}, & \text{Gaussian} \\ \phi(r) = \sqrt{1 + (\varepsilon r)^2}, & \text{Multiquadric} \\ \phi(r) = 1/\sqrt{1 + (\varepsilon r)^2}, & \text{Inverse multiquadric} \\ \phi(r) = r^k, \text{ for } k = 1, 3, 5, \cdots & \text{Polyharmonic spline, odd} \\ \phi(r) = r^{k-1} \ln(r^r), \text{ for } k = 2, 4, 6, \cdots & \text{Polyharmonic spline, even.} \end{array}$$

Figure 8.14.4 plots the Gaussian radial basis function for several values of ε. The value of the function φ declines as one moves away from the center, $r = |x - x_i| = 0$. Therefore, less weight will be attached to points in the domain, x, further away from the point, $x_i$, that is being interpolated around. The shape parameter, ε, controls the speed of the decay. The bigger the value of ε, the less weight will be assigned to distant points.
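Solving for the weights is a single linear solve once the matrix of pairwise distances has been built. The Python sketch below uses the Gaussian basis function in the form $\phi(r) = e^{-\varepsilon r^2}$ with the Euclidean norm; the function name `rbf_interpolant`, the default `eps = 1.0`, and the example data are illustrative assumptions. Nodes may be scalars or vectors, which is where the multivariate convenience shows up.

```python
import numpy as np

def rbf_interpolant(x_nodes, y_nodes, eps=1.0):
    """Return R(x) = sum_i w_i * phi(|x - x_i|) with a Gaussian phi."""
    X = np.asarray(x_nodes, dtype=float).reshape(len(x_nodes), -1)  # n x k nodes
    y = np.asarray(y_nodes, dtype=float)

    def phi(r):
        return np.exp(-eps * r**2)

    def distances(A, B):
        # Euclidean distance between every row of A and every row of B.
        return np.linalg.norm(A[:, None, :] - B[None, :, :], axis=2)

    # Impose R(x_i) = y_i at every node and solve for the weights.
    weights = np.linalg.solve(phi(distances(X, X)), y)

    def R(x_eval):
        Xe = np.asarray(x_eval, dtype=float).reshape(-1, X.shape[1])
        return phi(distances(Xe, X)) @ weights

    return R

nodes = np.array([0.0, 1.0, 2.0, 3.0, 4.0])
R = rbf_interpolant(nodes, np.sin(nodes), eps=1.0)
print(np.max(np.abs(R(nodes) - np.sin(nodes))))   # interpolation error at the nodes
print(R([2.5]), np.sin(2.5))                      # value between the nodes
```

The system can become badly conditioned when nodes are close together relative to the width of the basis function, so the choice of ε matters in practice.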

Figure 8.14.4: A Gaussian radial basis function, φ(r), plotted against r for three values of the shape parameter, ε (ε = 1, 2, and 5).

8.15 The Hodrick-Prescott Filter

The Hodrick and Prescott (1997) filter is often used in macroeconomics
to detrend economic time series. This filter draws a smooth curve
(a cubic spline) through an economic time series. While controversial
when introduced in 1981, it is actually a variant of the Whittaker (1923)
cubic smoothing spline, which has a long and distinguished history in
the statistics literature.6

6 While widely accepted now, the controversy explains the long delay in publication.

To see how it works, let $\{y_t\}_{t=1}^{T}$ represent some time series of interest.
Usually, this time series has been logged. The filter fits a trend through
the series, denoted by $\{\tau_t\}_{t=1}^{T}$. This trend solves the following
minimization problem:

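$$\begin{aligned}
&\min_{\{\tau_t\}_{t=1}^{T}} \Big\{ \sum_{t=1}^{T} (y_t - \tau_t)^2 + \lambda \sum_{t=2}^{T-1} [(\tau_{t+1} - \tau_t) - (\tau_t - \tau_{t-1})]^2 \Big\} \\
&= \min_{\{\tau_t\}_{t=1}^{T}} \Big\{ \sum_{t=1}^{T} (y_t - \tau_t)^2 + \lambda \sum_{t=2}^{T-1} (\tau_{t+1} - 2\tau_t + \tau_{t-1})^2 \Big\},
\end{aligned}$$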

where λ is a constant that governs the degree of smoothness in the
trend. As can be seen, changes in the first-differences of the trend are
penalized. That is, “roughness” in the resulting curve is penalized.
How much depends on the size of λ, the smoothing parameter.

1. When λ = 0 the minimization routine sets yt = τt , since no movements
in the trend are penalized. One could think about fitting a cubic
interpolating spline through the points $\{y_t\}_{t=1}^{T}$. This would set
$\sum_{t=1}^{T} (y_t - \tau_t)^2 = 0$, by property 2 of the cubic interpolating spline.
Thus, setting λ = 0 returns a cubic interpolating spline.

2. When λ → ∞ the trend becomes linear since this will set the last
term to zero. To see this, suppose that τt = a + bt. Then, τt+1 − τt =
b and τt − τt−1 = b.

3. For 0 < λ < ∞ solve the above problem to get $\{\tau_t\}_{t=1}^{T}$. Observe that
$\{\tau_t\}_{t=1}^{T} \neq \{y_t\}_{t=1}^{T}$, because roughness in the resulting curve is being
penalized. The HP trend is obtained by fitting a cubic interpolating
spline to the points $\{\tau_t\}_{t=1}^{T}$. One can think about this procedure as
fitting a cubic interpolating spline, S(t), to the data points $\{y_t\}_{t=1}^{T}$
while dropping the T restrictions that S(t) = yt . These restrictions
are replaced by the T first-order conditions to the above problem.

Figure 8.15.1: Quarterly real GDP and its HP trend, 1947-2017. Real GDP has been logged. The HP trend is shown for two values of the smoothing parameter, λ = 1,600 and λ = 100,000. As the smoothing parameter is increased the HP trend becomes less flexible. The right panel is merely a blow up of the left one. (Left panel: Logged Real US GDP and HP trends; right panel: Blow Up of Left Panel; both plot GDP against Year.)

Often for quarterly data λ is set to 1,600. For annual data a value
of 6.25 has been suggested (although values of 100 and 400 are also
used). Figure 8.15.1 plots postwar quarterly real GDP together with its
H-P trend. H-P detrended GDP is illustrated in Figure 8.15.2.
The generic first-order condition connected to the above minimization
problem is

$$-(y_t - \tau_t) - 2\lambda(\tau_{t+1} - 2\tau_t + \tau_{t-1}) + \lambda(\tau_t - 2\tau_{t-1} + \tau_{t-2}) + \lambda(\tau_{t+2} - 2\tau_{t+1} + \tau_t) = 0, \tag{8.15.1}$$

for t = 3, · · · , T − 2. This can be rewritten as

$$\lambda\tau_{t-2} - 4\lambda\tau_{t-1} + (6\lambda + 1)\tau_t - 4\lambda\tau_{t+1} + \lambda\tau_{t+2} = y_t, \quad \text{for } t = 3, \cdots, T - 2.$$

This first-order condition takes a more restricted form for t = 1, 2
and t = T − 1, T. For example, it is easy to see that the terms in the
objective function involving $\tau_1$ are $(y_1 - \tau_1)^2 + \lambda(\tau_3 - 2\tau_2 + \tau_1)^2$. So
the first-order condition for $\tau_1$ is $-(y_1 - \tau_1) + \lambda(\tau_3 - 2\tau_2 + \tau_1) = 0$.
This can be expressed as

$$(\lambda + 1)\tau_1 - 2\lambda\tau_2 + \lambda\tau_3 = y_1.$$

Doing the same thing for $\tau_2$, $\tau_{T-1}$, and $\tau_T$ gives

$$-2\lambda\tau_1 + (5\lambda + 1)\tau_2 - 4\lambda\tau_3 + \lambda\tau_4 = y_2,$$

$$\lambda\tau_{T-3} - 4\lambda\tau_{T-2} + (5\lambda + 1)\tau_{T-1} - 2\lambda\tau_T = y_{T-1},$$

and

$$\lambda\tau_{T-2} - 2\lambda\tau_{T-1} + (\lambda + 1)\tau_T = y_T.$$

Figure 8.15.2: H-P detrended quarterly real GDP, 1947-2017. Real GDP was logged before it was detrended. The figure shows detrended GDP for two values of the smoothing parameter, namely λ = 1,600 and λ = 100,000. Observe how the fluctuations increase with the size of λ.

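Construct the matrices shown below

$$\mathbf{T} \equiv \begin{pmatrix}
\lambda+1 & -2\lambda & \lambda & 0 & 0 & \cdots & 0 & 0 & 0 & 0 & 0 \\
-2\lambda & 5\lambda+1 & -4\lambda & \lambda & 0 & \cdots & 0 & 0 & 0 & 0 & 0 \\
\lambda & -4\lambda & 6\lambda+1 & -4\lambda & \lambda & \cdots & 0 & 0 & 0 & 0 & 0 \\
\vdots & & \ddots & \ddots & \ddots & \ddots & \ddots & & & & \vdots \\
0 & 0 & 0 & 0 & 0 & \cdots & \lambda & -4\lambda & 6\lambda+1 & -4\lambda & \lambda \\
0 & 0 & 0 & 0 & 0 & \cdots & 0 & \lambda & -4\lambda & 5\lambda+1 & -2\lambda \\
0 & 0 & 0 & 0 & 0 & \cdots & 0 & 0 & \lambda & -2\lambda & \lambda+1
\end{pmatrix},$$

$$\tau = \begin{pmatrix} \tau_1 \\ \vdots \\ \tau_T \end{pmatrix}, \text{ and } y = \begin{pmatrix} y_1 \\ \vdots \\ y_T \end{pmatrix}.$$

Using these matrices the solution for the Hodrick-Prescott trend reads
$\mathbf{T}\tau = y$ or

$$\tau = \mathbf{T}^{-1} y.$$

The detrended series is simply $y - \tau$ or $(I - \mathbf{T}^{-1})y$.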
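A minimal MATLAB sketch of this calculation follows. It relies on the observation that the pentadiagonal matrix above equals I + λD′D, where D is the (T − 2) × T second-difference matrix; the synthetic series y and all variable names are assumptions made for the illustration.

```matlab
% Hodrick-Prescott trend from the linear system T*tau = y (sketch).
% Assumptions: y is a made-up logged series; lambda = 1600 (quarterly data).
T_len  = 120;                              % sample length T
t      = (1:T_len)';
y      = 0.005*t + 0.02*sin(2*pi*t/20);    % synthetic trend plus cycle
lambda = 1600;

% Row s of D picks out tau(s) - 2*tau(s+1) + tau(s+2)
D = diff(eye(T_len), 2);

% I + lambda*D'*D reproduces the matrix T shown in the text
Tmat  = eye(T_len) + lambda*(D'*D);
tau   = Tmat \ y;                          % HP trend
cycle = y - tau;                           % detrended series

% A purely linear series passes through the filter unchanged
ylin    = 2 + 0.005*t;
err_lin = max(abs(Tmat \ ylin - ylin));
```

Solving with the backslash operator avoids forming the inverse explicitly. For long series a sparse version of D would be the natural refinement.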


8.16 MATLAB: Worked-Out Examples

8.16.1 Numerical Differentiation

Suppose that a person has the utility function

U ( x, y) = θ ln x + (1 − θ ) ln y, with 0 < θ < 1,

where x and y are two goods. The marginal rate of substitution of x
for y is given by

$$MRS_{xy} = \frac{U_1(x, y)}{U_2(x, y)}.$$

The task is to compute the marginal rate of substitution numerically
using both the standard method and complex step differentiation.

8.16.2 MATLAB code–Numerical Differentiation

The program has two parts. The main program, MRS.m, calls the
function, utility.m, that needs to be differentiated. It numerically
differentiates this function in two ways, computes the marginal rate of
substitution, and then compares the numerical derivatives with the
analytical one.

MATLAB, Main Program-MRS.m

Here is the main program.
% This script calculates the MRS between two goods both analytically and
% numerically for the utility function theta*ln x + (1-theta)*ln y. Two
% numerical methods are used: standard numerical differentiation and
% complex step differentiation.
global theta
theta = 0.4;
x = 1.5;
y = 8.3;
% The formula for the marginal rate of substitution of x for y
% is given by MUx/MUy.
MRSanalytical = (theta/(1-theta))*(y/x); % MUx/MUy

% Calculate this formula using standard numerical differentiation
h = 0.00001; % Step size for derivative
MRSsnd = (utility(x+h,y) - utility(x-h,y)) ...
    /(utility(x,y+h) - utility(x,y-h));
disp('MRS analytical vs MRS standard numerical differentiation')
disp([MRSanalytical, MRSsnd])
if norm(MRSanalytical-MRSsnd) >= .0000001
    disp('Derivatives are not matching')
end

% Calculate this formula numerically using complex step differentiation
h = 0.00000001; % Step size for derivative
MRSnumerical = imag(utility(x+1i*h,y))/imag(utility(x,y+1i*h));
disp([MRSanalytical, MRSnumerical])
if norm(MRSanalytical-MRSnumerical) >= .0000001
    disp('Derivatives are not matching')
end

MATLAB, Function-utility.m

This is the function for utility.
function [utils] = utility(x,y)
% Utility function over x and y
% The parameter theta is a global variable
global theta
utils = theta*log(x) + (1-theta)*log(y);
end

Output from the Program

The result of the program is shown now.
MRS
3.6889 3.6889

8.16.3 Random Number Generation–Slutsky’s Business Cycle

% Slutsky.m
% This program generates Slutsky's business cycle
clear all % Clear memory
clc % Clear screen
rng(1) % Seed the random number generator
xvec = randi(10,200,1); % Call up 200 uniform random integers, 1 <= x <= 10
ovec = zeros(190,1); % The Business Cycle Index
for t = 1:190
    ovec(t) = sum(xvec(t:t+9)) + 5; % Slutsky's formula
end

% Plot Slutsky's Business Cycle
figure(1)
plot(ovec)
title('Slutskys Business Cycle')
xlabel('Period')
ylabel('Business Cycle Index')

8.16.4 Output from the Slutsky Program

Figure 8.16.1: Slutsky’s business cycle à la MATLAB. (Plot titled “Slutskys Business Cycle”: Business Cycle Index against Period.)

9 Stochastic Dynamics

“Our task as I see it ... is to write a Fortran program that will accept
specific economic policy rules as ‘input’ and will generate as ‘output’
statistics describing the operating characteristics of time series we care
about, which are predicted to result from these policies ... It must be
taken for granted that simply attempting various policies that may be
proposed on actual economies and watching the outcome must not be
taken as a serious solution method: Social Experiments on the grand
scale may be instructive and admirable, but they are best admired at a
distance.” (Robert E. Lucas Jr, “Methods and Problems in Business Cycle
Theory,” Journal of Money, Credit and Banking, 1980).

9.1 Introduction

The macroeconomy is full of randomness. For example, no one knows
what the state of technology will be in the future. Think about how the
information age is affecting the economy: robots in factories, online
shopping, etc. Likewise, governments come and go, with different
views about deficits, education, the environment, health care, income
equality, and international trade. So, future spending and tax policies
are unknown too. Acts of God, such as Covid-19, earthquakes, and
hurricanes, have effects too.
To capture this, let the state of the economy, defined by the vector
$(k_t, z_t)$, evolve according to

$$k_{t+1} = K(k_t, z_t),$$

where $z_t$ is a random variable which is distributed according to the
cumulative distribution function

$$z_{t+1} \sim G(z_{t+1}|z_t) = \Pr[\tilde{z}_{t+1} \le z_{t+1} \mid \tilde{z}_t = z_t].$$

The associated density function is represented by $g(z_{t+1}|z_t) \equiv G_1(z_{t+1}|z_t)$.

Observe that the randomness in the z’s will imply randomness in the
k’s. Suppose that in period t one knows $k_t$ and $z_t$. One will not know
what $z_{t+1}$ will be in period t + 1, because this is random. This implies
that $k_{t+2} = K(k_{t+1}, z_{t+1})$ will be unknown because it is a function of
$k_{t+1}$ and $z_{t+1}$. A long time ago, Eugen Slutsky (1937) discussed how
the business cycle could be the result of random causes; this was
discussed in Chapter 8.
The function K is modelled here as the outcome of a dynamic stochastic
optimization problem in conjunction with equilibrium conditions
and the government budget constraint. This dynamic stochastic
optimization problem is formulated as a dynamic programming problem
à la Bellman (1957). Dynamic programming problems can be solved
numerically in various ways. Three methods are covered here:
discrete-state-space dynamic programming, linearization, and policy-function
iteration. Three different algorithms for policy-function iteration are
discussed: the Coleman (1991) algorithm, the endogenous grid method
by Carroll (2006), and parameterized expectations introduced by den
Haan and Marcet (1990).
The above economy will never settle down to a deterministic steady
state because zt is always fluctuating. The best one can hope for is
that in the long run fluctuations in (k t , zt ) will be described by some
stationary probability distribution, S(k t , zt ). The above two equations
imply a joint probability distribution for (k t+1 , zt+1 ) as a function of
(k t , zt ). Write this as
$$T(k_{t+1}, z_{t+1}|k_t, z_t) \equiv \Pr[\tilde{k}_{t+1} \le k_{t+1}, \tilde{z}_{t+1} \le z_{t+1} \mid \tilde{k}_t = k_t, \tilde{z}_t = z_t], \tag{9.1.1}$$

where T is often referred to as the transition operator. The long-run
distribution must solve the equation

$$S(k_{t+1}, z_{t+1}) = \int\!\!\int T(k_{t+1}, z_{t+1}|k_t, z_t)\,dS(k_t, z_t).$$

Note that the right-hand side of the above equation is counting the
ways you can move into a situation in period t + 1 where $\tilde{k}_{t+1} \le k_{t+1}$
and $\tilde{z}_{t+1} \le z_{t+1}$ from any of the possible $(k_t, z_t)$ combinations in
period t. In stochastic models one is interested in the statistical properties
of $(k_t, z_t)$, as opposed to characterizing a deterministic time path. This
will be done two ways here. First, by using a Markov chain for T and,
second, by simulating T using Monte Carlo techniques.

9.2 Robinson Crusoe

Consider the problem of Robinson Crusoe who lives on an island.
Robinson must decide how much to consume and save in the form
of capital each period. He does this to maximize the expected value of
his lifetime utility

$$E\Big[\sum_{t=1}^{\infty} \beta^{t-1} U(c_t)\Big].$$

He produces according to the following production function

$$y_t = z_t F(k_t),$$

where $z_t$ is a random technological shock in period t. In any given
period t, output, $y_t$, is split between consumption, $c_t$, and gross
investment, $k_{t+1} - (1 - \delta)k_t$, where δ is the rate of depreciation on capital.
Robinson makes his consumption and investment decision after he
sees $z_t$. Let $z_t$ follow a stochastic process of the following form

$$z_{t+1} \sim G(z_{t+1}|z_t) = \Pr[\tilde{z}_{t+1} \le z_{t+1} \mid \tilde{z}_t = z_t].$$

Here G is the cumulative distribution function for $z_{t+1}$ conditioned
on $z_t$, with associated density function $g(z_{t+1}|z_t) \equiv G_1(z_{t+1}|z_t)$. The
above setting is a simplified version of the famous Brock and Mirman
(1972) stochastic growth model. As mentioned in Chapter 8, Slutsky
(1937) viewed the business cycle as resulting from random perturba-
tion to the economy. His analysis had more to do with statistical me-
chanics than with economics, however. The stochastic process for zt is
often operationalized using one of two forms. First, zt can be assumed
to follow an AR(1) in logs. Specifying the process in logs ensures that
the values drawn for zt are always positive. Second, zt can be taken
to be governed by a Markov chain. Both of these forms for stochastic
processes are discussed in Chapter 8. It’s easy to add labor into the
stochastic growth model along the lines mentioned in Chapter 6.
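As a small illustration of the first form, the sketch below simulates a technology shock that follows an AR(1) in logs. The persistence parameter, the innovation standard deviation, and the sample length are assumed values chosen only for the example.

```matlab
% Simulate a technology shock that follows an AR(1) in logs (sketch).
% Assumptions: rho_z, sigma_e, and nper are illustrative values.
rng(1);                          % fix the seed so the draws are reproducible
rho_z   = 0.90;                  % autocorrelation of ln(z)
sigma_e = 0.01;                  % standard deviation of the innovation
nper    = 500;                   % number of periods

lnz = zeros(nper, 1);            % start at ln(z) = 0, i.e. z = 1
for t = 1:nper-1
    lnz(t+1) = rho_z*lnz(t) + sigma_e*randn;
end
z = exp(lnz);                    % exponentiating keeps every z(t) positive
```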
In any particular period, Robinson’s world will be determined by the
capital stock he has, k, and the value of the technology shock, z. The
pair (k, z) is called the ‘state of the world.’ Robinson will make all
decisions based upon the state of his world. Let V (k, z) denote the
maximal expected lifetime utility that Robinson will realize if today’s
state of the world is (k, z). Robinson lives in a stationary world in the
sense that every day is the same as another except that he may have
a different level of capital, k, and realize a different level of the
technology shock, z. In this stationary environment, Robinson’s decision
problem is described by
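$$\begin{aligned}
V(k, z) &= \max_{k'} \Big\{ U(\underbrace{zF(k) + (1 - \delta)k - k'}_{c}) + \beta \int V(k', z')\,dG(z'|z) \Big\} \\
&= \max_{k'} \Big\{ U(zF(k) + (1 - \delta)k - k') + \beta \int V(k', z') \underbrace{g(z'|z)}_{\text{density for } G}\,dz' \Big\}.
\end{aligned} \tag{9.2.1}$$
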
Problem (9.2.1) is a dynamic programming problem. It is presented
using recursive notation, where the time subscripts have been dropped.
In this problem effectively there is only today and tomorrow, where a
prime (′) is attached to a variable to denote its value tomorrow. Today’s utility
is given by $U(zF(k) + (1 - \delta)k - k')$, whereas expected discounted
lifetime utility from tomorrow on is represented by $\beta \int V(k', z') g(z'|z)\,dz'$.
As with the deterministic version of the neoclassical growth model, it
can be shown that the value function, V, exists and is unique, is
increasing and strictly concave in k, and is continuously differentiable
in k–the analysis proceeds along the lines of Section 6.8 in Chapter 6.1

1 The only additional assumption needed is that $\int V(k', z')\,dG(z'|z)$ is continuous in k′ if V(k′, z′) is. This is known as the Feller property. See Stokey and Lucas (1986) for the formalities.

Let the solution for k′ that arises out of this problem be represented
by

$$k' = K(k, z). \tag{9.2.2}$$

This is called Robinson’s ‘decision rule.’ It gives Robinson’s optimal
action for capital accumulation should he find himself in the state of
the world (k, z). It is defined for every (k, z) pair that could occur on
the island. Solving the above problem leads to the following first-order
condition:

$$\underbrace{U_1(zF(k) + (1 - \delta)k - k')}_{\text{MC of invest}} = \underbrace{\beta \int V_1(k', z')\,dG(z'|z)}_{\text{MB of invest}} = \beta \int V_1(k', z') g(z'|z)\,dz'. \tag{9.2.3}$$

The lefthand side of the above expression is the marginal cost of
investing an extra unit of capital today. The righthand side is the discounted
expected benefit. To understand this, note that V1 (k′, z′) is the
expected benefit next period from having an extra unit of capital should
the technology shock be z′. But, the value of next period’s technology
shock is unknown currently so Crusoe must take the expected value
of this benefit. Observe that the current value of the technology, z, is
useful for forecasting the value of z′ through the function G (z′|z) [or
equivalently through g(z′|z)]. Last, since this expected benefit occurs
next period it must be discounted by β. This first-order condition
represents one equation in the one unknown, k′, where k and z are known
exogenous variables in the current period. Hence, the solution for k′
will have the form shown by (9.2.2).
If the technology shock is a discrete random variable taking n values,
so that $z \in \mathcal{Z} = \{z_1, z_2, \cdots, z_n\}$, then the dynamic programming
problem (9.2.1) appears as

$$V(k, z_r) = \max_{k'} \Big\{ U(z_r F(k) + (1 - \delta)k - k') + \beta \sum_{s=1}^{n} \pi_{rs} V(k', z_s) \Big\}, \tag{9.2.4}$$

where $\pi_{rs}$ represents the odds of z traveling from $z_r$ today to $z_s$
tomorrow. Here the technology shock is described by an n-state Markov
chain–see Chapter 8.

9.2.1 The Envelope Theorem, Again

This first-order condition involves the derivative of the unknown function
V. It can be gotten rid of in the same way as in Chapter 6. To do
this, let $\tilde{k}'$ denote the optimal level of investment. Then, (9.2.1) can be
written as

$$V(k, z) = U(zF(k) + (1 - \delta)k - \tilde{k}') + \beta \int V(\tilde{k}', z')\,dG(z'|z).$$

To eliminate V1 (k′, z′) in (9.2.3), differentiate both sides of the above
equation with respect to k. One gets

$$\begin{aligned}
V_1(k, z) = {} & U_1(zF(k) + (1 - \delta)k - \tilde{k}')[zF_1(k) + 1 - \delta] \\
& \underbrace{{} - U_1(zF(k) + (1 - \delta)k - \tilde{k}') \frac{d\tilde{k}'}{dk} + \beta \int V_1(\tilde{k}', z')\,dG(z'|z) \frac{d\tilde{k}'}{dk}}_{=0},
\end{aligned}$$

where the term on the second line is zero from (9.2.3). The fact that
perturbations in the choice variable, k′, cancel out in the objective function
when evaluated at the optimal solution is called the envelope theorem.

9.2.2 The Stochastic Euler Equation

Updating the above result gives

$$V_1(k', z') = U_1(z'F(k') + (1 - \delta)k' - k'')[z'F_1(k') + 1 - \delta].$$

Using this in the first-order condition (9.2.3) then leads to the following
stochastic Euler equation:

$$\begin{aligned}
U_1(zF(k) + (1 - \delta)k - k') &= \beta \int U_1(z'F(k') + (1 - \delta)k' - k'')[z'F_1(k') + 1 - \delta]\,dG(z'|z) \\
&= \beta E\big[ U_1(z'F(k') + (1 - \delta)k' - k'') \times [z'F_1(k') + 1 - \delta] \,\big|\, k, z \big].
\end{aligned} \tag{9.2.5}$$
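The stochastic Euler equation can be checked numerically in one special case where the decision rule is known in closed form. The sketch below assumes U(c) = ln c, F(k) = k^α, and δ = 1, for which k′ = αβzk^α; it also assumes a two-state Markov chain for z. None of these functional forms or numbers come from the text.

```matlab
% Check the stochastic Euler equation (9.2.5) in a special case (sketch).
% Assumptions: U(c) = ln(c), F(k) = k^alpha, delta = 1, two-state chain for z.
% Under these assumptions the decision rule is k' = alpha*beta*z*k^alpha.
alpha = 0.36; beta = 0.96; delta = 1;
zgrid = [0.95; 1.05];                      % values of the technology shock
Pi    = [0.9 0.1; 0.1 0.9];                % Pi(r,s) = Pr[z' = z_s | z = z_r]

K = @(k, z) alpha*beta*z.*k.^alpha;                 % decision rule for k'
C = @(k, z) z.*k.^alpha + (1-delta)*k - K(k, z);    % implied consumption

k  = 0.2;  r = 1;  z = zgrid(r);           % an arbitrary current state
kp = K(k, z);                              % capital chosen for next period

lhs = 1/C(k, z);                           % U_1 today
% beta*E[U_1 tomorrow * (z'*F_1(k') + 1 - delta) | k, z]
rhs = beta*Pi(r,:)*((1./C(kp, zgrid)).*(zgrid*alpha*kp^(alpha-1) + 1 - delta));
resid = lhs - rhs;                         % Euler residual
```

The residual is zero up to rounding error for any (k, z) pair, because in this case consumption is a constant fraction, 1 − αβ, of output.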

9.2.3 Sequence Space Formulation

Robinson Crusoe’s problem can also be cast in sequence space. Let the
technology shock, z, follow an N-state Markov chain, with the
transition probability between states i and j being denoted by $\pi^{ij}$, where
the states are now represented by superscripts. Suppose that the
technology shock starts off in period 1 from the known state i, so that
$z_1 = z_1^i$, where a superscript represents a state and a subscript the
time period. Let $z^t = (z_t, z_{t-1}, \cdots, z_1)$ signify some realized sequence
of technology shocks between periods 1 and t. There are $N^{t-1}$ possible
sequences, with the set of these sequences notated by $\mathcal{Z}^t$. Denote the
odds of the sequence $z^t$ occurring by $\rho(z^t)$. The probabilities $\rho(z^t)$ are
given recursively by

$$\begin{aligned}
\rho(z_1^i) &= 1, \\
\rho[z^2 = (z_2, z_1^i)] &= \Pr(z_2 \mid z_1)\rho(z_1^i), \\
\rho[z^3 = (z_3, z^2)] &= \Pr(z_3 \mid z_2)\rho(z^2), \\
&\;\;\vdots \\
\rho[z^t = (z_t, z^{t-1})] &= \Pr(z_t \mid z_{t-1})\rho(z^{t-1}).
\end{aligned}$$

Think about Robinson Crusoe choosing a capital stock for every
state-time combination that can possibly occur. That is, for each period
t Robinson Crusoe chooses the capital stock for the next period, $k_{t+1}$,
contingent upon the sequence of technology shocks that have occurred
up to that period, or $z^t$. This choice variable is designated by $k_{t+1}(z^t)$.
Note the capital stock chosen for period t + 1 will depend on the initial
capital stock, $k_1$, and the sequence of shocks that transpires between
period 1 and period t, or $(z_t, z_{t-1}, \cdots, z_1)$.
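The recursion for the sequence probabilities is easy to trace by hand or in code. The following sketch computes the odds of one particular history for a two-state chain; the transition matrix and the history itself are made-up illustrations.

```matlab
% Probability of one realized shock history, built up recursively (sketch).
% Assumptions: a two-state chain with a made-up transition matrix and path.
Pi   = [0.9 0.1; 0.2 0.8];     % Pi(i,j) = Pr[z' in state j | z in state i]
path = [1 1 2 2 1];            % state indices for z_1, ..., z_5; z_1 is known

rho = 1;                       % rho(z^1) = 1, since the initial state is given
for t = 2:numel(path)
    % rho(z^t) = Pr(z_t | z_{t-1}) * rho(z^{t-1})
    rho = Pi(path(t-1), path(t))*rho;
end
```

Here the history has probability 0.9 × 0.1 × 0.8 × 0.2 = 0.0144.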
Robinson Crusoe’s maximization problem now appears as

$$\max_{\{k_{t+1}(z^t)\}_{t=1}^{\infty}} \Big\{ \sum_{t=1}^{\infty} \sum_{z^t \in \mathcal{Z}^t} \beta^{t-1} \rho(z^t)\, U\big(z_t F(k_t(z^{t-1})) + (1 - \delta)k_t(z^{t-1}) - k_{t+1}(z^t)\big) \Big\}.$$

Note that for each value of $z^t$, the choice variable $k_{t+1}(z^t)$ appears in
just two periods in the maximand, namely time t and t + 1. Specifically,
it shows up in the terms

$$\begin{aligned}
\cdots &+ \beta^{t-1} \rho(z^t)\, U\big(z_t F(k_t(z^{t-1})) + (1 - \delta)k_t(z^{t-1}) - k_{t+1}(z^t)\big) \\
&+ \beta^{t} \sum_{j=1}^{N} \rho(z_{t+1}^j, z^t)\, U\big(z_{t+1}^j F(k_{t+1}(z^t)) + (1 - \delta)k_{t+1}(z^t) - k_{t+2}(z^{t+1})\big) + \cdots.
\end{aligned}$$

By writing $\rho(z_{t+1}^j, z^t) = \Pr(z_{t+1}^j \mid z_t)\rho(z^t)$ the above can be
expressed as

$$\begin{aligned}
\cdots &+ \beta^{t-1} \rho(z^t)\, U\big(z_t F(k_t(z^{t-1})) + (1 - \delta)k_t(z^{t-1}) - k_{t+1}(z^t)\big) \\
&+ \beta^{t} \rho(z^t) \sum_{j=1}^{N} \Pr(z_{t+1}^j \mid z_t)\, U\big(z_{t+1}^j F(k_{t+1}(z^t)) + (1 - \delta)k_{t+1}(z^t) - k_{t+2}(z^{t+1})\big) + \cdots.
\end{aligned}$$

Now, suppose that the current value of the technology shock in period
t is $z_t^i$ so that $z^t = (z_t^i, z^{t-1})$. Then, $\Pr(z_{t+1}^j \mid z_t^i) = \pi^{ij}$ and the generic
first-order condition reads

$$\begin{aligned}
& U_1\big(z_t^i F(k_t(z^{t-1})) + (1 - \delta)k_t(z^{t-1}) - k_{t+1}(z^t)\big) \\
&= \beta \sum_{j=1}^{N} \pi^{ij}\, U_1\big(z_{t+1}^j F(k_{t+1}(z^t)) + (1 - \delta)k_{t+1}(z^t) - k_{t+2}(z^{t+1})\big) \times [z_{t+1}^j F_1(k_{t+1}(z^t)) + (1 - \delta)],
\end{aligned}$$

for i = 1, 2, · · · , N. With a minor switch to recursive notation, this can
be rewritten as

$$U_1(zF(k) + (1 - \delta)k - k') = \beta E\big[ U_1(z'F(k') + (1 - \delta)k' - k'') \times [z'F_1(k') + (1 - \delta)] \,\big|\, k, z \big].$$

This is the same stochastic Euler equation that was obtained from the
dynamic programming formulation.

9.3 Business Cycle Modeling

The goal of a business cycle model is to reproduce a set of stylized
facts characterizing business cycles. This research program was laid
out by Kydland and Prescott (1982) in one of the 20th century’s most
influential papers in macroeconomics. To characterize the business
cycle, the U.S. time series for variables that one cares about, say GDP,
consumption, investment and hours worked, are first logged and then
detrended using some filtering technique. Volatility of the logged and
detrended series is measured by its standard deviation. (See Chapter
A for a discussion of standard deviations, correlations, and
autocorrelations.) To determine the cyclicality of a series its correlation with
output is computed. A series is called procyclical when the
correlation is positive. Last, persistence is judged by a series’ autocorrelation.
Kydland and Prescott (1982) matched up the predictions from their
business cycle model with such a set of stylized facts for the U.S.
economy. The original Kydland and Prescott (1982) paper picked the
standard deviation of the technological shock, σ, and its autocorrelation,
ρ, so that the model could match the standard deviation of output and
its autocorrelation.

Margin note: Finn E. Kydland (1943-) is a Norwegian macroeconomist. He
studied under Edward C. Prescott at Carnegie Mellon University.
Along with Prescott, he won the Nobel Prize in 2004. Kydland
relays that as an undergraduate at the Norwegian School of
Economics and Business Administration he took a course
that covered Ronald A. Howard’s book on Dynamic Programming and
Markov Processes. He wrote his first computer program doing dynamic
programming in FORTRAN.

9.3.1 Stylized Facts

1. Volatility: Volatility is measured by the standard deviation of the
detrended logged variable. Investment is much more volatile than
output, consumption less.

2. Correlations: Here the correlation between the detrended logged
variable and detrended logged output is computed. Hours has
the highest correlation with output, but the other variables,
particularly consumption, come fairly close.

3. Persistence: Now, the correlation between the detrended logged
variable and its own lagged value is computed. In the data,
consumption and productivity have the highest autocorrelations, and
investment the lowest.

Business Cycle Statistics – Annual U.S. Data

Variable (logged)    Standard Deviation    Correlation with Output    Autocorrelation
Output                      3.5                    1.00                    0.66
Consumption                 2.2                    0.74                    0.72
Investment                 10.5                    0.68                    0.25
Hours Worked                2.1                    0.81                    0.39
Productivity                2.2                    0.82                    0.77
Source: Greenwood et al. (1988)

9.4 Discrete-State-Space Dynamic Programming

This method has two key steps. In the first step, a dynamic
programming problem is solved assuming that values for the capital stock
and the technology shock both lie in discrete sets. In the second step
the solution to this dynamic programming problem is represented as
a Markov chain. Using this Markov chain representation, statistics for
any variable of interest can then be readily computed.
The capital stock in each period is constrained to be an element of
the finite time-invariant set, $\mathcal{K}$. Thus,

$$k \in \mathcal{K} = \{k_1, \cdots, k_n\}.$$

For simplicity let the technology shock follow a two-state Markov
chain so that

$$z \in \mathcal{Z} = \{z_1, z_2\},$$

with the transition probabilities

$$\pi_{rs} = \Pr[z' = z_s \mid z = z_r].$$

9.4.1 Representative Agent’s Dynamic Programming Problem

Given the structure outlined above, the representative agent’s dynamic
programming problem can be written as

$$V(k_i, z_r) = \max_{c > 0,\, k' \in \mathcal{K}} \Big\{ U(z_r F(k_i) + (1 - \delta)k_i - k') + \beta \sum_{s=1}^{2} \pi_{rs} V(k', z_s) \Big\}. \tag{P(1)}$$

Observe that $V : \mathcal{K} \times \mathcal{Z} \to \mathbb{R}$ is merely a list of 2n values, one for each
$(k_i, z_r) \in \mathcal{K} \times \mathcal{Z}$. The choice for k′ is restricted to lie in the discrete set
$\mathcal{K}$. Discrete maximization was introduced in Chapter 3.
So, how can a solution V be obtained? Here’s an algorithm.

1. Make an initial guess for V, denoted by $V^0$. This guess is merely a
list of 2n values. Go to Step 2.

2. Enter iteration j + 1 with a guess for the V on the righthand side of
P(1). Call this guess $V^j$; it’s merely a list of 2n values. Next, compute
the solution to the righthand side of P(1). Denote this solution by
$V^{j+1}$. Now, let

$$M^j(k_i, z_r, k') = U(z_r F(k_i) + (1 - \delta)k_i - k') + \beta \sum_{s=1}^{2} \pi_{rs} V^j(k', z_s).$$

This is the value of the objective function in state $(k_i, z_r)$ assuming that
the capital stock $k' \in \mathcal{K}$ is chosen. In general, it will not be optimal
to choose k′. It’s easy to see that $V^{j+1}$ is given by

$$\begin{aligned}
V^{j+1}(k_1, z_1) &= \max\{\overbrace{M^j(k_1, z_1, k_1), M^j(k_1, z_1, k_2), \cdots, M^j(k_1, z_1, k_n)}^{n \text{ elements in set}}\}, \\
V^{j+1}(k_2, z_1) &= \max\{M^j(k_2, z_1, k_1), M^j(k_2, z_1, k_2), \cdots, M^j(k_2, z_1, k_n)\}, \\
&\;\;\vdots \\
V^{j+1}(k_n, z_1) &= \max\{M^j(k_n, z_1, k_1), M^j(k_n, z_1, k_2), \cdots, M^j(k_n, z_1, k_n)\}, \\
V^{j+1}(k_1, z_2) &= \max\{M^j(k_1, z_2, k_1), M^j(k_1, z_2, k_2), \cdots, M^j(k_1, z_2, k_n)\}, \\
&\;\;\vdots \\
V^{j+1}(k_n, z_2) &= \max\{M^j(k_n, z_2, k_1), M^j(k_n, z_2, k_2), \cdots, M^j(k_n, z_2, k_n)\}.
\end{aligned}$$

This constitutes a revised guess for V. For each possible current
state $(k_i, z_r)$ the value for the future capital stock, k′, that
maximizes the objective function, say $k_l$, is found; that is, $M^j(k_i, z_r, k_l) \ge M^j(k_i, z_r, k_m)$,
for all $m \neq l$. Note that max is a built-in operation in
MATLAB. For example, [value, location] = max($[M^j(k_1, z_1, k_1), M^j(k_1, z_1, k_2), \cdots, M^j(k_1, z_1, k_n)]$)
returns the maximum element and its location
in the vector $[M^j(k_1, z_1, k_1), M^j(k_1, z_1, k_2), \cdots, M^j(k_1, z_1, k_n)]$. The
location specifies which capital stock maximizes the vector of $M^j$’s.
You can also apply this operator on matrices. For instructions on
how to do this look at the help menu in MATLAB. Go to Step 3.

3. Check whether $|V^{j+1} - V^j|$ is sufficiently small. If so, stop. If not,
go back to Step 2. Essentially, equation P(1) defines an operator
T such that $V^{j+1} = TV^j$. The contraction mapping theorem–see
Chapter 6–states that the initial guess, $V^0$, is irrelevant, the solution
for V is unique, and the algorithm will converge from $V^0$ to V. You
could set $V^0 = 0$.
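The three steps above can be sketched in MATLAB as follows. This is a minimal illustration, not the authors' program: the functional forms (log utility, Cobb-Douglas technology), the parameter values, the grid bounds, and all variable names are assumptions.

```matlab
% Discrete-state-space value function iteration for problem P(1) (sketch).
% Assumptions: U(c) = ln(c), F(k) = k^alpha, and all numbers below.
alpha = 0.36; beta = 0.96; delta = 0.10;
U = @(c) log(c);   F = @(k) k.^alpha;
zgrid = [0.97; 1.03];                       % two-state technology shock
Pi    = [0.9 0.1; 0.1 0.9];                 % Pi(r,s) = Pr[z' = z_s | z = z_r]
kss   = (alpha/(1/beta - 1 + delta))^(1/(1-alpha));  % deterministic steady state
n     = 201;
kgrid = linspace(0.7*kss, 1.3*kss, n)';     % the set K

V = zeros(n, 2);  Kidx = zeros(n, 2);  tol = 1e-6;   % Step 1: V^0 = 0
for iter = 1:2000
    EV   = V*Pi';                           % EV(l,r) = sum_s Pi(r,s)*V(k_l,z_s)
    Vnew = zeros(n, 2);
    for r = 1:2
        % c(i,l): consumption in state (k_i,z_r) when k' = k_l is chosen
        c = zgrid(r)*F(kgrid) + (1-delta)*kgrid - kgrid';
        M = repmat(beta*EV(:,r)', n, 1);    % continuation value for each k_l
        M(c > 0)  = U(c(c > 0)) + M(c > 0); % Step 2: build M^j(k_i,z_r,k_l)
        M(c <= 0) = -Inf;                   % rule out c <= 0
        [Vnew(:,r), Kidx(:,r)] = max(M, [], 2);   % maximize over k'
    end
    dist = max(abs(Vnew(:) - V(:)));        % Step 3: check convergence
    V = Vnew;
    if dist < tol, break; end
end
Kpolicy = kgrid(Kidx);                      % decision rule k' = K(k_i,z_r)
```

Applying max along the second dimension of the n × n matrix M handles all n current capital stocks at once, which is the matrix use of max mentioned in Step 2.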

Decision Rule for Capital

The decision rule for capital, $K : \mathcal{K} \times \mathcal{Z} \to \mathcal{K}$, is

$$k' = K(k_i, z_r) \in \mathcal{K}.$$

This gives an investment plan for all 2n contingencies in the state
space. All of the model’s variables, such as consumption, gross
investment, and output, can be written as a function of the current state
of the world, $(k_i, z_r)$:

$$o = O(k_i, z_r) = z_r F(k_i),$$

$$i = I(k_i, z_r) = K(k_i, z_r) - (1 - \delta)k_i,$$

and

$$c = C(k_i, z_r) = z_r F(k_i) + (1 - \delta)k_i - K(k_i, z_r).$$

It is easy to add labor, h, into the stochastic growth model along the
lines presented in Chapter 6. Now the production function would
read y = zF (k, h). The decision rule for labor would have the form
$h = H(k_i, z_r)$. Therefore, output in the current period can be expressed
as $o = O(k_i, z_r) = z_r F(k_i, H(k_i, z_r))$.

9.4.2 Casting the Model’s Solution as a Markov Chain

The solution to the above model will now be cast as a Markov chain.
The concept of a Markov chain was introduced in Chapter 8. The
decision rule for capital can be rewritten in probabilistic form as

$$\Pr[k' = k_j \mid k = k_i, z = z_r] = \begin{cases} 1, & \text{for some } j, \\ 0, & \text{for the rest.} \end{cases}$$

Trivially, then

$$\sum_{j=1}^{n} \Pr[k' = k_j \mid k = k_i, z = z_r] = 1 \text{ for all } (k, z) \in \mathcal{K} \times \mathcal{Z}.$$

Define the transition probability between (k, z) pairs by

$$p_{ir,js} = \Pr[k' = k_j, z' = z_s \mid k = k_i, z = z_r] = \Pr[k' = k_j \mid k = k_i, z = z_r]\,\pi_{rs}. \tag{9.4.1}$$

Now, load these transition probabilities into a matrix:

$$\underbrace{T}_{2n \times 2n} = [p_{ir,js}].$$

The matrix T is the Markov chain analogue to the transition operator
described by equation (9.1.1). Observe that $\sum_{j,s} p_{ir,js} = 1$ for all i, r;
i.e., from any state (i, r ) you must go somewhere. Thus, each row in T
sums to one.
Given some initial 1 × 2n probability distribution $\rho_0$ over the state
space $\mathcal{K} \times \mathcal{Z}$, next period’s probability distribution is given by

$$\underbrace{\rho_1}_{1 \times 2n} = \underbrace{\rho_0}_{1 \times 2n} \times \underbrace{T}_{2n \times 2n}.$$

The m-period-ahead probability distribution over state space $\mathcal{K} \times \mathcal{Z}$
reads

$$\rho_m = \rho_{m-1} T = \rho_{m-2} T^2 = \cdots = \rho_0 T^m.$$

The long-run or stationary distribution, $\rho^*$, solves

$$\rho^* = \rho^* T. \tag{9.4.2}$$

The stationary distribution can be computed using one of the methods
discussed in Chapter 8.
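The construction of the transition matrix and its stationary distribution can be sketched as follows. To keep the example self-contained, the decision rule is not taken from a solved dynamic program; instead it uses the closed form k′ = αβzk^α (which holds under log utility, Cobb-Douglas technology, and full depreciation) snapped to the nearest grid point. That shortcut, the ordering of states, and all parameter values are assumptions.

```matlab
% Markov-chain representation of a decision rule, equation (9.4.1) (sketch).
% Assumptions: closed-form rule k' = alpha*beta*z*k^alpha snapped to the grid;
% state (k_i,z_r) is stored at position i + (r-1)*n.
alpha = 0.36; beta = 0.96;
zgrid = [0.95; 1.05];
Pi    = [0.9 0.1; 0.1 0.9];                 % Pi(r,s) = Pr[z' = z_s | z = z_r]
n     = 101;
kbar  = (alpha*beta)^(1/(1-alpha));         % steady state when z = 1
kgrid = linspace(0.8*kbar, 1.2*kbar, n)';

Kidx = zeros(n, 2);                         % Kidx(i,r): grid index of k'
for r = 1:2
    kprime = alpha*beta*zgrid(r)*kgrid.^alpha;
    [~, Kidx(:,r)] = min(abs(kprime - kgrid'), [], 2);  % nearest grid point
end

Tmat = zeros(2*n, 2*n);                     % transition matrix T
for r = 1:2
    for i = 1:n
        for s = 1:2
            % p_{ir,js} = Pi(r,s) when k_j = K(k_i,z_r), and zero otherwise
            Tmat(i + (r-1)*n, Kidx(i,r) + (s-1)*n) = Pi(r,s);
        end
    end
end

rho = ones(1, 2*n)/(2*n);                   % initial distribution rho_0
for m = 1:5000                              % iterate rho_m = rho_{m-1}*T
    rho_new = rho*Tmat;
    if max(abs(rho_new - rho)) < 1e-12, break; end
    rho = rho_new;
end
rho   = rho_new;                            % approximate rho*
rho_k = rho(1:n) + rho(n+1:2*n);            % marginal distribution of capital
```

Iterating on the distribution is only one way to obtain ρ∗. For large n a sparse matrix would be a natural choice, since each row of T has just two nonzero entries here.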

Computation of Moments
Once the long-run distribution, $\rho^*$, has been obtained, it is easy to
compute any moment of interest.

$$E[\ln o] = \sum_{r=1}^{2} \sum_{i=1}^{n} \rho^*_{ir} \ln O(k_i, z_r), \tag{9.4.3}$$

$$E[\ln c \ln o] = \sum_{r=1}^{2} \sum_{i=1}^{n} \rho^*_{ir} \ln C(k_i, z_r) \ln O(k_i, z_r), \tag{9.4.4}$$

$$E[\ln o' \ln o] = \sum_{s=1}^{2} \sum_{j=1}^{n} \sum_{r=1}^{2} \sum_{i=1}^{n} p_{ir,js}\, \rho^*_{ir} \ln O(k'_j, z'_s) \ln O(k_i, z_r). \tag{9.4.5}$$

To compute the percentage standard deviation of output use the
formula

$$\sigma_{\ln o} = \sqrt{E[(\ln o)^2] - E[\ln o]^2}. \tag{9.4.6}$$

Similarly, the correlation between consumption and output is

$$\rho_{\ln c, \ln o} = \frac{E[\ln c \ln o] - E[\ln c] E[\ln o]}{\sigma_{\ln c}\, \sigma_{\ln o}}. \tag{9.4.7}$$

Last, the autocorrelation of output can be written as

$$\rho_{\ln o', \ln o} = \frac{E[\ln o' \ln o] - E[\ln o]^2}{(\sigma_{\ln o})^2}. \tag{9.4.8}$$
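These formulas translate directly into matrix operations. The sketch below applies them to a made-up three-state Markov chain, with states stacked in a single index, rather than to a solved model; the transition matrix and the values of ln o and ln c in each state are purely illustrative.

```matlab
% Moments from a stationary distribution, formulas (9.4.3)-(9.4.8) (sketch).
% Assumptions: a made-up 3-state chain; lno and lnc are illustrative values.
Tmat = [0.8 0.2 0.0; 0.1 0.8 0.1; 0.0 0.2 0.8];   % transition matrix
lno  = [-0.05; 0.00; 0.05];                       % ln(output) by state
lnc  = [-0.02; 0.00; 0.03];                       % ln(consumption) by state

rho = ones(1, 3)/3;                               % iterate to rho* = rho*T
for m = 1:10000
    rho = rho*Tmat;
end

Eo   = rho*lno;                                   % E[ln o]
Ec   = rho*lnc;                                   % E[ln c]
sd_o = sqrt(rho*lno.^2 - Eo^2);                   % standard deviation of ln o
sd_c = sqrt(rho*lnc.^2 - Ec^2);                   % standard deviation of ln c
corr_co = (rho*(lnc.*lno) - Ec*Eo)/(sd_c*sd_o);   % correlation of ln c, ln o

% E[ln o' ln o] = sum_i sum_j rho_i * p_ij * lno_i * lno_j
Eoo     = (rho'.*lno)'*Tmat*lno;
acorr_o = (Eoo - Eo^2)/sd_o^2;                    % autocorrelation of ln o
```

For this chain the stationary distribution is (0.25, 0.5, 0.25), so E[ln o] = 0 and the autocorrelation of ln o works out to 0.8.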

Choosing the Grid for the Capital Stock

How should the grid for the capital stock be chosen? One way of
doing this is to plot the marginal distribution for the capital stock. The
marginal distribution for capital is given by $(\rho^*_{11} + \rho^*_{12},\ \rho^*_{21} + \rho^*_{22},\ \cdots,\ \rho^*_{n1} + \rho^*_{n2})$. That is, the marginal distribution for capital is constructed by
taking the joint distribution over capital and technology shocks and
summing (or integrating) over the technology shock. It should be centered around the deterministic steady-state level of the capital stock,
$k^*$. When the grid is picked correctly this distribution will resemble a
choppy normal distribution. The odds of getting a very low or high
capital stock (as given by the tails of the distribution) will be small.

9.4.3 Algorithm: Discrete-State-Space Dynamic Programming

To summarize, the key steps in discrete-state-space dynamic program-
ming are:

1. Solve the discrete-state-space dynamic programming problem P(1) to obtain the nonlinear decision rule for capital.

2. Use the decision rule for capital and the Markov process for the technology shock to specify the Markov transition matrix (9.4.1).

3. Compute the stationary distribution for the capital stock and technology shock defined by equation (9.4.2) using one of the methods discussed in Chapter 8.

4. Compute various business cycle statistics using formulas such as (9.4.3) to (9.4.8).
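The four steps above can be sketched in MATLAB as follows. This is only an illustration under stated assumptions: the parameter values, the capital grid, and the stand-in decision rule (the grid point nearest to $\alpha\beta z k^{\alpha}$, used here in place of actually solving problem P(1)) are all hypothetical, and the stationary distribution is found by simple iteration on (9.4.2).

```matlab
% Hypothetical inputs (assumptions for illustration only)
alpha = 0.3;  beta = 0.95;
n     = 50;                           % number of capital grid points
kgrid = linspace(0.10, 0.25, n)';     % capital grid, n x 1
z     = [0.95 1.05];                  % two technology states, 1 x 2
Pi    = [0.9 0.1; 0.1 0.9];           % Pi(r,s) = Pr[z' = z_s | z = z_r]

% Step 1 (stand-in): jstar(i,r) is the grid index of k' chosen in state (k_i, z_r)
jstar = zeros(n, 2);
for r = 1:2
    for i = 1:n
        target = alpha*beta*z(r)*kgrid(i)^alpha;
        [~, jstar(i,r)] = min(abs(kgrid - target));
    end
end

% Step 2: transition matrix (9.4.1); state (i,r) is stored in row (r-1)*n + i
N = 2*n;
T = zeros(N, N);
for r = 1:2
    for i = 1:n
        row = (r-1)*n + i;
        for s = 1:2
            col = (s-1)*n + jstar(i,r);
            T(row, col) = Pi(r,s);
        end
    end
end

% Step 3: stationary distribution (9.4.2) by iterating rho <- rho*T
rho = ones(1, N)/N;
for it = 1:100000
    rho_new = rho*T;
    gap = max(abs(rho_new - rho));
    rho = rho_new;
    if gap < 1e-13, break; end
end

% Step 4: moments (9.4.3), (9.4.5), (9.4.6), (9.4.8) for output o = z*k^alpha
lnO = log(kgrid.^alpha * z);          % n x 2, entry (i,r) is ln O(k_i, z_r)
lnO = lnO(:)';                        % 1 x 2n, same ordering as the rows of T
Elno   = rho*lnO';                    % E[ln o]
sd_lno = sqrt(rho*(lnO.^2)' - Elno^2);
Ecross = (rho.*lnO)*T*lnO';           % E[ln o' ln o]
ac_lno = (Ecross - Elno^2)/sd_lno^2;  % autocorrelation of ln o
```

Stacking the state as (r-1)*n + i is one convenient choice; any ordering works as long as the moment vectors use the same one as the rows of T.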

One could alternatively use the Monte Carlo based algorithm dis-
cussed below. This is less desirable in general. But in some circum-
stances one might want to filter the business cycle data coming from
the model or perhaps the Markov transition matrix (9.4.1) is too big to
handle.

1. Solve the discrete-state-space dynamic programming problem P(1) to obtain the nonlinear decision rule for capital.

2. Draw a sample of $T$ random variables, $\{\varepsilon_t\}_{t=1}^{T}$ (random number generation is discussed in Chapter 8). In MATLAB a sample of uniformly distributed random variables on the [0, 1] interval can be drawn using the rand command. Make sure that the seed is fixed for the random number generator; this is done with the rng(seed) command, where seed is some natural number. This line should be inserted just before the call for the random numbers.

3. Enter period $t$ with a level of capital, $k_t = k_i \in \mathcal{K}$, and some past value for the technology shock, $z_{t-1} = z_r \in \mathcal{Z}$. The current technology shock could remain at its past value, $z_r$, or switch to a new value, $z_s$. Now, take $\varepsilon_t$ from the sample of random variables. Compute the current technology shock, $z_t$, as follows:

$$z_t = \begin{cases} z_r, & \text{if } \varepsilon_t < \pi_{rr}, \\ z_s, & \text{if } \varepsilon_t \geq \pi_{rr}. \end{cases}$$

Compute next period’s capital stock, $k_{t+1} \in \mathcal{K}$, using the decision rule for capital

$$k_{t+1} = K(k_t, z_t).$$

4. For the starting value of capital just take $k_1 = k^*$, where $k^*$ is the level of capital in the deterministic steady state. For $z_0$ take either value for the technology shock.

5. Given a sample path of capital stocks and technology shocks, $\{k_t\}_{t=1}^{T+1}$ and $\{z_t\}_{t=1}^{T+1}$, data for all other variables of interest in the model, say consumption, investment and GDP, can be calculated. Sometimes researchers throw away some numbers at the beginning, say $\{k_t\}_{t=1}^{n}$. This way the sample is not influenced by the starting values for $k_1$ and $z_0$. From these variables one can then calculate a set of business cycle statistics. When doing this take the logarithm of each variable. In MATLAB standard deviations can be computed using the std command. Likewise, correlations can be calculated using the corrcoef command.
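The Monte Carlo steps above might look as follows in MATLAB. This is a hedged sketch: the parameter values and the closed-form stand-in for the decision rule (Kfun) are assumptions, the stand-in rule is not restricted to a grid, and full depreciation is assumed so that consumption is simply output minus next period's capital.

```matlab
rng(1);                                   % fix the seed just before drawing
T    = 5000;  burn = 500;                 % sample length and discarded initial periods
e    = rand(T, 1);                        % uniform draws on [0,1]
alpha = 0.3;  beta = 0.95;                % hypothetical parameters
z    = [0.95 1.05];                       % two technology states
Pi   = [0.9 0.1; 0.1 0.9];                % Pi(r,s) = Pr[z' = z_s | z = z_r]
Kfun = @(k, zz) alpha*beta*zz*k^alpha;    % stand-in decision rule (assumption)

k = zeros(T+1, 1);  zpath = zeros(T, 1);
k(1) = (alpha*beta)^(1/(1-alpha));        % deterministic steady state for this rule
r = 1;                                    % index of the past shock z_0
for t = 1:T
    if e(t) >= Pi(r,r)                    % stay if e < pi_rr, otherwise switch
        r = 3 - r;
    end
    zpath(t) = z(r);
    k(t+1)   = Kfun(k(t), zpath(t));
end

o   = zpath.*k(1:T).^alpha;               % output
c   = o - k(2:T+1);                       % consumption under full depreciation
lno = log(o(burn+1:end));                 % drop the burn-in, then take logs
lnc = log(c(burn+1:end));
sd_lno = std(lno);                        % percentage standard deviation of output
R = corrcoef(lnc, lno);                   % R(1,2) is corr(ln c, ln o)
```

With this particular stand-in rule consumption is a constant share of output, so the correlation is one; a rule computed from the dynamic program would generally give a value below one.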

9.5 Linearization

This method involves taking a log-linear approximation of the Euler equation (9.2.5). The algorithm involves four steps: (i) conjecturing a log-linear law of motion for capital accumulation; (ii) log-linearizing the Euler equation, while utilizing this guess, and solving for the resulting log-linear law of motion for capital while imposing a consistency requirement between the conjectured decision rule and the log-linearized solution; (iii) undertaking a Monte Carlo simulation of the computed decision rule to obtain sample paths for the variables of interest; and (iv) then computing the business cycle facts that result from these sample paths.

9.5.1 Conjecturing a Decision Rule

Suppose that the technology shock follows an AR(1) process specified by

$$\ln z' = \rho \ln z + \varepsilon', \quad \text{where } \varepsilon' \sim N(0, \sigma^2). \tag{9.5.1}$$

Conjecture a log-linear decision rule for capital of the following form

$$\ln k' = a + b \ln k + f \ln z. \tag{9.5.2}$$

One would expect that $a > 0$, $0 < b < 1$, and $f > 0$. To solve for $a$, $b$, and $f$, the above stochastic Euler equation (9.2.5) will be linearized in $\ln k$ and $\ln z$. The resulting log-linear solution must be consistent with the assumed one (9.5.2). This consistency requirement provides a solution for the constants $a$, $b$, and $f$.
If one knew the constants $a$, $b$, and $f$, then one could use (9.5.1) and
(9.5.2) to compute sample paths for $k$ and $z$ denoted by $\{k_{t+1}\}_{t=1}^{T}$ and
$\{z_{t+1}\}_{t=1}^{T}$, given a starting condition, $k_1$ and $z_1$, and a sample path

for the error terms $\{\varepsilon_{t+1}\}_{t=1}^{T}$. The sample path for the error terms,
$\{\varepsilon_{t+1}\}_{t=1}^{T}$, can be drawn from a random number generator. Furthermore, if one has a sample path for $k$ and $z$ then it is easy to construct
ones for other variables, such as output, $o$, or consumption, $c$.
Therefore, the hardest part of the problem is computing a, b, and
f . These coefficients will be uncovered by linearizing the stochastic
Euler equation (9.2.5). Toward this end, represent the values for k and
z that would occur in a deterministic steady state by k∗ and z∗ . In the
absence of uncertainty, the above decision rule should converge to this
steady state implying

$$\ln k^* = a + b \ln k^* + f \ln z^*.$$

Thus, one can write

$$\ln k' - \ln k^* = b(\ln k - \ln k^*) + f(\ln z - \ln z^*),$$

where $\ln k - \ln k^*$ and $\ln z - \ln z^*$ denote the proportionate deviations of the capital stock and technology shock away from their deterministic steady-state values, $k^*$ and $z^*$. Let

$$\hat{k} \equiv \ln k - \ln k^* \quad \text{and} \quad \hat{z} \equiv \ln z - \ln z^*,$$

which allows the decision rule for capital to be rewritten as

$$\hat{k}' = b\hat{k} + f\hat{z}. \tag{9.5.3}$$

Similarly,

$$\hat{z}' = \rho\hat{z} + \varepsilon'.$$

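Last, the unconditional (long-run) expectations of $\hat{k}$ and $\hat{z}$ are given by

$$E[\hat{k}] = E[\hat{z}] = 0,$$

which imply that $E[\ln k] = \ln k^*$ and $E[\ln z] = \ln z^*$. To see this, note

$$\hat{z}_{t+j} = \rho^{j}\hat{z}_t + \varepsilon_{t+j} + \rho\,\varepsilon_{t+j-1} + \cdots + \rho^{j-1}\varepsilon_{t+1},$$

so that

$$E[\hat{z}_{t+j} \mid \hat{z}_t] = \rho^{j}\hat{z}_t.$$

Clearly,

$$\lim_{j \to \infty} E[\hat{z}_{t+j} \mid \hat{z}_t] = 0,$$

because $0 < \rho < 1$. Likewise,

$$\begin{aligned} \hat{k}_{t+j} &= b^{j}\hat{k}_t + f(\hat{z}_{t+j-1} + b\hat{z}_{t+j-2} + \cdots + b^{j-1}\hat{z}_t) \\ &= b^{j}\hat{k}_t + f(\rho^{j-1}\hat{z}_t + b\rho^{j-2}\hat{z}_t + \cdots + b^{j-1}\hat{z}_t) + \varepsilon \text{ terms}. \end{aligned}$$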

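Thus,

$$E[\hat{k}_{t+j} \mid \hat{k}_t, \hat{z}_t] = b^{j}\hat{k}_t + f \sum_{i=1}^{j} b^{i-1}\rho^{j-i}\hat{z}_t.$$

Therefore,

$$\lim_{j \to \infty} E[\hat{k}_{t+j} \mid \hat{k}_t, \hat{z}_t] = 0,$$

because $0 < b, \rho < 1$: each of the $j$ terms in the sum satisfies $b^{i-1}\rho^{j-i} \leq \max(b, \rho)^{j-1}$, so the sum is bounded by $j \max(b, \rho)^{j-1}$, which vanishes as $j \to \infty$.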

9.5.2 Log Linearizing the Euler Equation

Taking antilogs of (9.5.1) yields

$$z' = z^{\rho} \exp(\varepsilon'). \tag{9.5.4}$$

Define the function $\Lambda(k, k', k'', z, \varepsilon')$ by

$$\begin{aligned} \Lambda(k, k', k'', z, \varepsilon') ={}& U_1(zF(k) + (1-\delta)k - k') \\ &- \beta U_1(z^{\rho}\exp(\varepsilon')F(k') + (1-\delta)k' - k'')[z^{\rho}\exp(\varepsilon')F_1(k') + 1 - \delta]. \end{aligned}$$

This allows the stochastic Euler equation (9.2.5) to be rewritten as

$$E[\Lambda(k, k', k'', z, \varepsilon')] = 0.$$

Note that

$$x = e^{\ln x} = \exp(\ln x), \tag{9.5.5}$$

which implies

$$\frac{dx}{d \ln x} = e^{\ln x} = x. \tag{9.5.6}$$

Hence, one can write

$$E[\tilde{\Lambda}(\ln k, \ln k', \ln k'', \ln z, \varepsilon')] = E[\Lambda(\underbrace{\exp(\ln k)}_{k}, \underbrace{\exp(\ln k')}_{k'}, \underbrace{\exp(\ln k'')}_{k''}, \underbrace{\exp(\ln z)}_{z}, \varepsilon')] = 0.$$

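Furthermore, using (9.5.6) it happens that

$$\tilde{\Lambda}_1 = \Lambda_1 k, \quad \tilde{\Lambda}_2 = \Lambda_2 k', \quad \tilde{\Lambda}_3 = \Lambda_3 k'', \quad \tilde{\Lambda}_4 = \Lambda_4 z, \quad \text{and} \quad \tilde{\Lambda}_5 = \Lambda_5. \tag{9.5.7}$$

This relationship will allow the subsequent analysis to compute the derivatives of $\Lambda$ instead of $\tilde{\Lambda}$.
The conjectured decision rule (9.5.2) implies that

$$\begin{aligned} \ln k'' &= a + b \ln k' + f \ln z' \\ &= a + b \ln k' + f(\rho \ln z + \varepsilon'). \end{aligned}$$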

This allows the function $\tilde{\Lambda}$ to be expressed as

$$\tilde{\Lambda}(\ln k, \ln k', \underbrace{a + b \ln k' + f\rho \ln z + f\varepsilon'}_{\ln k''}, \ln z, \varepsilon').$$

Take a first-order Taylor expansion of the function $\tilde{\Lambda}$ in the variables $\ln k$, $\ln k'$, $\ln z$, and $\varepsilon'$ around the deterministic steady state where $\ln k = \ln k^*$, $\ln k' = \ln k^*$, $\ln z = \ln z^* = 0$, and $\varepsilon' = \varepsilon^* = 0$. (Again, the concept of a first-order Taylor expansion is presented in Chapter A.) This is called linearizing the function $\tilde{\Lambda}$. Let $\Lambda(*)$ denote that the arguments in the function $\Lambda$ are being evaluated at their values in a deterministic steady state. The above Euler equation can then be rewritten as

$$\begin{aligned} E[&\Lambda(*) + \Lambda_1(*)k^*(\ln k - \ln k^*) \\ &+ \Lambda_2(*)k^*(\ln k' - \ln k^*) + \Lambda_3(*)k^* b(\ln k' - \ln k^*) \\ &+ \Lambda_3(*)k^* f\rho(\ln z - \ln z^*) + \Lambda_3(*)k^* f(\varepsilon' - \varepsilon^*) \\ &+ \Lambda_4(*)z^*(\ln z - \ln z^*) + \Lambda_5(*)(\varepsilon' - \varepsilon^*)] = 0, \end{aligned}$$

where (9.5.7) has been used. The derivatives $\Lambda_1(*)$, $\Lambda_2(*)$, $\Lambda_3(*)$, $\Lambda_4(*)$, and $\Lambda_5(*)$ are all just constant terms. In a deterministic steady state $\Lambda(*) = 0$, because

$$U_1(z^*F(k^*) + (1-\delta)k^* - k^*) - \beta U_1(z^*F(k^*) + (1-\delta)k^* - k^*)[z^*F_1(k^*) + 1 - \delta] = 0.$$

Furthermore, $E[\varepsilon' - \varepsilon^*] = 0$. Thus, the first, sixth, and eighth terms will disappear in the expression for $E[\tilde{\Lambda}(\ln k, \ln k', \ln k'', \ln z, \varepsilon')]$. Hence,

$$\Lambda_1(*)k^*\hat{k} + [\Lambda_2(*) + b\Lambda_3(*)]k^*\hat{k}' + [\Lambda_3(*)k^* f\rho + \Lambda_4(*)]\hat{z} = 0.$$

In the above equation use has also been made of the fact that $z^* = 1$. Note that the expectation operator $E$ has disappeared from the above equation, because all terms are fully known. It is as if there is no uncertainty in the economy. Thus, linearization imposes a certainty equivalence property. Rewrite the above equation as

$$\hat{k}' = \underbrace{\frac{\Lambda_1(*)}{-[\Lambda_2(*) + b\Lambda_3(*)]}}_{=b}\hat{k} + \underbrace{\frac{\Lambda_3(*) f\rho + \Lambda_4(*)/k^*}{-[\Lambda_2(*) + b\Lambda_3(*)]}}_{=f}\hat{z}.$$

9.5.3 Solving for the Decision Rule

The last step is to solve for the coefficients on the conjectured decision rule or for $a$, $b$, and $f$. This is done in a manner similar to Chapter 6.6.9. By comparing the above equation with (9.5.3), it is obvious that

$$b = \frac{\Lambda_1(*)}{-[\Lambda_2(*) + b\Lambda_3(*)]}, \tag{9.5.8}$$

and

$$f = \frac{\Lambda_3(*) f\rho + \Lambda_4(*)/k^*}{-[\Lambda_2(*) + b\Lambda_3(*)]}. \tag{9.5.9}$$

Solving for $b$ involves computing the solution to the quadratic equation

$$\Lambda_3(*)b^2 + \Lambda_2(*)b + \Lambda_1(*) = 0. \tag{9.5.10}$$

This equation has two roots. As will be shown below, both will be positive in value. One will be bigger than one, the other smaller. Pick the smaller one. Note that by solving (9.5.9) for $f$ it transpires that

$$f = \frac{\Lambda_4(*)/k^*}{-[\Lambda_2(*) + b\Lambda_3(*) + \Lambda_3(*)\rho]}. \tag{9.5.11}$$

Note these expressions involve the derivatives of $\Lambda$ and not $\tilde{\Lambda}$. Last, $a$ will be given by

$$a = (1-b)\ln k^* - f\ln z^* = (1-b)\ln k^*. \tag{9.5.12}$$

9.5.4 Numerical Characterization

To compute a numerical solution for the model, all one needs is the
derivatives Λ1 (∗), Λ2 (∗), Λ3 (∗), and Λ4 (∗). These can be computed
numerically, as discussed in Chapter 8. Then, the constants a, b, and f
can be determined using (9.5.12), (9.5.10), and (9.5.11).
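As a rough illustration of this numerical route, the sketch below differentiates $\Lambda$ by central differences and then applies (9.5.10), (9.5.11), and (9.5.12). The functional forms (log utility, $F(k) = k^{\alpha}$), the parameter values, and the step size h are assumptions chosen for the example. Full depreciation is used because that special case has the known closed-form rule $k' = \alpha\beta z k^{\alpha}$, which implies $b = \alpha$, $f = 1$, and $a = \ln(\alpha\beta)$ and so gives something to check against.

```matlab
% Hypothetical functional forms and parameters (assumptions)
alpha = 0.3;  beta = 0.95;  delta = 1;  rho = 0.9;
U1 = @(c) 1./c;                          % marginal utility for U(c) = ln c
F  = @(k) k.^alpha;                      % production function
F1 = @(k) alpha*k.^(alpha-1);            % marginal product of capital

% Lambda as a function of x = [k, k', k'', z, eps']
Lam = @(x) U1(x(4)*F(x(1)) + (1-delta)*x(1) - x(2)) - ...
      beta*U1(x(4)^rho*exp(x(5))*F(x(2)) + (1-delta)*x(2) - x(3)) * ...
      (x(4)^rho*exp(x(5))*F1(x(2)) + 1 - delta);

% Deterministic steady state: beta*[F1(k*) + 1 - delta] = 1 with z* = 1
kstar = ((1/beta - 1 + delta)/alpha)^(1/(alpha-1));
xs    = [kstar, kstar, kstar, 1, 0];

% Central-difference derivatives Lambda_1(*), ..., Lambda_4(*)
h    = 1e-6;
dLam = zeros(1, 4);
for i = 1:4
    step = zeros(1, 5);  step(i) = h;
    dLam(i) = (Lam(xs + step) - Lam(xs - step))/(2*h);
end

% Quadratic (9.5.10): keep the smaller (stable) root
r = roots([dLam(3), dLam(2), dLam(1)]);
b = min(r);

% Coefficients from (9.5.11) and (9.5.12)
f = (dLam(4)/kstar)/(-(dLam(2) + b*dLam(3) + rho*dLam(3)));
a = (1 - b)*log(kstar);
```

For other depreciation rates or utility functions only the first block needs to change; the closed-form check is then no longer available.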

9.5.5 Theoretical Characterization

It will now be shown that 0 < b < 1 and that f > 0.

Computing the Derivatives for $\Lambda(k, k', k'', z, \varepsilon')$

To characterize the solution theoretically, the constants $\Lambda_1(*)$, $\Lambda_2(*)$, $\Lambda_3(*)$, and $\Lambda_4(*)$ need to be calculated. By inspecting (9.2.5) and (9.5.4), it is apparent that

$$\begin{aligned} \Lambda(k, k', k'', z, \varepsilon') ={}& U_1(\underbrace{zF(k) + (1-\delta)k - k'}_{c}) \\ &- \beta U_1(\underbrace{z^{\rho}\exp(\varepsilon')F(k') + (1-\delta)k' - k''}_{c'})[\underbrace{z^{\rho}\exp(\varepsilon')}_{z'}F_1(k') + 1 - \delta]. \end{aligned}$$

In a deterministic steady state $\beta[z^*F_1(*) + 1 - \delta] = 1$, $\varepsilon^* = 0$, $\exp(\varepsilon^*) = 1$, and $z^* = z^{*\rho} = 1$. Therefore,

$$\Lambda_1(*) = U_{11}(*)[z^*F_1(*) + 1 - \delta] = U_{11}(*)[F_1(*) + 1 - \delta] < 0,$$

$$\begin{aligned} \Lambda_2(*) &= -U_{11}(*) - \beta U_{11}(*)[z^{*\rho}\exp(\varepsilon^*)F_1(*) + 1 - \delta][z^{*\rho}\exp(\varepsilon^*)F_1(*) + 1 - \delta] - \beta U_1(*)z^{*\rho}\exp(\varepsilon^*)F_{11}(*) \\ &= -U_{11}(*) - U_{11}(*)[F_1(*) + 1 - \delta] - \beta U_1(*)F_{11}(*) > 0, \end{aligned}$$

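$$\Lambda_3(*) = \beta U_{11}(*)[z^{*\rho}\exp(\varepsilon^*)F_1(*) + 1 - \delta] = U_{11}(*) < 0,$$

$$\begin{aligned} \Lambda_4(*) &= U_{11}(*)F(*) - \beta U_{11}(*)[z^{*\rho}\exp(\varepsilon^*)F_1(*) + 1 - \delta]F(*)\rho z^{*\rho-1}\exp(\varepsilon^*) - \beta U_1(*)F_1(*)\rho z^{*\rho-1}\exp(\varepsilon^*) \\ &= U_{11}(*)F(*)(1-\rho) - \beta U_1(*)F_1(*)\rho < 0, \end{aligned}$$

and

$$\begin{aligned} \Lambda_5(*) &= -\beta U_{11}(*)[z^{*\rho}\exp(\varepsilon^*)F_1(*) + 1 - \delta]z^{*\rho}\exp(\varepsilon^*)F(*) - \beta U_1(*)z^{*\rho}F_1(*)\exp(\varepsilon^*) \\ &= -U_{11}(*)F(*) - \beta U_1(*)F_1(*). \end{aligned}$$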

The Solution for b and f

Focus on the solution for $b$, as given by (9.5.8). It implies that

$$\begin{aligned} b &= \frac{U_{11}(*)[F_1(*) + 1 - \delta]}{U_{11}(*) + U_{11}(*)[F_1(*) + 1 - \delta] + \beta U_1(*)F_{11}(*) - bU_{11}(*)} \\ &= \frac{F_1(*) + 1 - \delta}{1 - b + F_1(*) + 1 - \delta + \beta U_1(*)F_{11}(*)/U_{11}(*)}. \end{aligned}$$

This can be expressed as a quadratic equation in $b$:

$$\{1 + F_1(*) + 1 - \delta + \beta U_1(*)F_{11}(*)/U_{11}(*)\}b - b^2 - F_1(*) - 1 + \delta = 0.$$

A quadratic equation has two roots. At $b = 0$ the lefthand side of the above formula is negative, because $-F_1(*) - 1 + \delta = -1/\beta < 0$. At $b = 1$ the lefthand side of the above equation is positive, since it reduces to $\beta U_1(*)F_{11}(*)/U_{11}(*) > 0$. The lefthand side becomes negative as $b$ becomes large. Hence, there exists a value of $b < 1$ that solves the above equation and a value of $b > 1$ that does also. Clearly, when $b > 1$ the system would be unstable. So, throw this root away. Figure 9.5.1 portrays the situation. Observe from (9.5.11) that $f > 0$, because

$$-[\Lambda_2(*) + b\Lambda_3(*) + \rho\Lambda_3(*)] = (1 + 1/\beta)U_{11}(*) + \beta U_1(*)F_{11}(*) - (b + \rho)U_{11}(*) < 0$$

and $\Lambda_4(*) < 0$. Last, note from (9.5.12) that $a > 0$ when $\ln k^* > 0$ and $a < 0$ when $\ln k^* < 0$. Thus, the model’s local transitional dynamics around its deterministic steady state have been characterized.

9.5.6 Stability of the Deterministic Neoclassical Growth Model

In the baseline version of the deterministic neoclassical growth model $z$ is a constant. Hence, the decision rule for capital accumulation reduces to

$$\ln k' - \ln k^* = b(\ln k - \ln k^*).$$

Figure 9.5.1: The two roots to the quadratic equation for b. Both roots will be positive. One will be smaller than 1, the other larger. The unstable root (the bigger one) can be discarded. (Vertical axis: value of the quadratic; horizontal axis: b, with the stable root to the left of 1 and the unstable root to the right.)

Additionally, around the steady state $k' - k^* \simeq k^*(\ln k' - \ln k^*)$. Therefore, one can write

$$k' - k^* = b(k - k^*).$$

Now, it has just been shown that $0 < b < 1$ so that there exists a steady state where $0 < dk_{t+1}/dk_t < 1$. The situation shown in Figure 6.7.2 therefore is apropos.

9.5.7 Algorithm: Log-Linearized Model

1. Compute the deterministic steady state for the model.

2. Conjecture a log-linear decision rule for capital of the form (9.5.2). Set up the Euler equation for the model and log-linearize it. This can be done by differentiating the Euler equation with respect to the logs of the variables and solving for the implied coefficients. The derivatives can be calculated either analytically or numerically (Chapter 8 covers numerical differentiation). This gives the decision rule (9.5.2). The roots for the quadratic equation for $b$ can be computed in MATLAB using the roots command. One root will lie between 0 and 1, the other will be bigger than 1. Discard the root bigger than 1.

3. Draw a sample of $T$ random variables, $\{\varepsilon_{t+1}\}_{t=1}^{T}$ (again, random number generation is discussed in Chapter 8). A sample of normally distributed random variables can be obtained using the normrnd command from the Statistics and Machine Learning Toolbox, or by scaling standard normal draws from the built-in randn command by $\sigma$. Make sure that the seed is fixed for the random number generator; this is also done with the rng command, where an integer is picked for the seed.

4. Pick some initial period-0 starting value for the capital stock and

technology shock, $k_0$ and $z_0$. This could be their steady-state levels so that $k_0 = k^*$ and $\ln z_0 = \ln z^* = 0$.

5. Enter period $t$ with a level of capital, $k_t$, and a technology shock, $z_t$. Compute the capital stock for next period as follows:

$$\ln k_{t+1} = a + b \ln k_t + f \ln z_t.$$

The technology shock for next period is computed using the relationship

$$\ln z_{t+1} = \rho \ln z_t + \varepsilon_{t+1}.$$

6. Given a sample path of capital stocks and technology shocks, $\{k_t\}_{t=1}^{T+1}$ and $\{z_t\}_{t=1}^{T+1}$, data for all other variables of interest in the model, say consumption, investment and GDP, can be calculated. From these variables one can then calculate a set of business cycle statistics. When doing this take the logarithm of each variable. In MATLAB standard deviations can be computed using the std command. Likewise, correlations can be calculated using the corrcoef command.
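Steps 3 to 6 of this algorithm can be sketched as below. The coefficient values, the shock variance, and the full-depreciation resource constraint used to back out consumption are assumptions made for illustration; in practice $a$, $b$, and $f$ would come from step 2.

```matlab
rng(2);                                    % fix the seed before drawing
T = 10000;
alpha = 0.3;  rho = 0.9;  sigma = 0.01;    % hypothetical parameters
a = log(0.285);  b = 0.3;  f = 1;          % hypothetical decision-rule coefficients
e = sigma*randn(T, 1);                     % e(t) plays the role of eps_{t+1}

lnk = zeros(T+1, 1);  lnz = zeros(T+1, 1);
lnk(1) = a/(1 - b);                        % start at ln k*, from (9.5.12)
lnz(1) = 0;                                % start at ln z* = 0
for t = 1:T
    lnk(t+1) = a + b*lnk(t) + f*lnz(t);    % decision rule (9.5.2)
    lnz(t+1) = rho*lnz(t) + e(t);          % shock process (9.5.1)
end

lno = lnz(1:T) + alpha*lnk(1:T);           % ln of output o = z*k^alpha
c   = exp(lno) - exp(lnk(2:T+1));          % consumption, assuming full depreciation
lnc = log(c);

sd_lno = std(lno);                         % percentage standard deviation of output
Rco = corrcoef(lnc, lno);                  % Rco(1,2): corr(ln c, ln o)
Roo = corrcoef(lno(2:T), lno(1:T-1));      % Roo(1,2): autocorrelation of ln o
```

Dropping an initial stretch of the sample before computing the statistics, as suggested for the discrete-state-space simulation, would work the same way here.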

9.6 Coleman’s Policy-Function Algorithm

Another way to proceed is to solve the Euler equation (9.2.5) directly for the policy function (9.2.2). To do this, update the decision rule $k' = K(k, z)$ to get $k'' = K(k', z')$. Equation (9.2.5) can be rewritten as

$$U_1(zF(k) + (1-\delta)k - k') = \beta \int U_1(z'F(k') + (1-\delta)k' - K(k', z'))[F_1(k', z') + 1 - \delta]\, dG(z'|z).$$

(Margin note: Wilbur John Coleman II developed the algorithm as part of his Ph.D. thesis at the University of Chicago in 1987. It is published in Coleman (1991).)

The idea for policy-function iteration is to make a guess for the function $K$. Denote the guess to be used at stage $j + 1$ by $k' = K^{j}(k, z)$. One then solves the equation shown below to obtain a revised guess, $k' = K^{j+1}(k, z)$.

$$U_1(zF(k) + (1-\delta)k - k') = \beta \int U_1(z'F(k') + (1-\delta)k' - K^{j}(k', z'))[F_1(k', z') + 1 - \delta]\, dG(z'|z).$$

To this end, let the shock process follow an $m$-state Markov chain where $z \in \mathcal{Z} = \{z_1, z_2, ..., z_m\}$. Denote the odds of transition from state $i$ to state $l$ by $\pi_{il} \equiv \Pr[z' = z_l \mid z = z_i]$.

9.6.1 Algorithm: Policy-Function Iteration

Suppose that the policy function $k'' = K(k', z')$ can be approximated by
some class of functions constructed over some grid $\mathcal{K} \equiv \{k_1, k_2, ..., k_n\}$

spanning the interval $[0, \bar{K}]$. Let $K^{j}(k_h, z_i)$ be a guess, within this class of functions, to be used on iteration $j + 1$ for the decision rule $K(k, z)$ at the point $(k_h, z_i) \in \mathcal{K} \times \mathcal{Z}$. There are $n \times m$ such guesses. Unlike discrete-state-space dynamic programming, the values for the model’s capital stock are not restricted to be an element of the set $\mathcal{K}$. Therefore, values for $K^{j}(k, z_i)$ when $k \notin \mathcal{K}$ are also required. How are values for $K^{j}(k, z_i)$ determined when $k$ is not a grid point? This is done by employing an interpolation scheme. Specifically, imagine that there is some interpolation scheme where a continuous function can be fitted through the $n$ points $(k_h, K^{j}(k_h, z_i))$. With an abuse of notation, denote this interpolated function by $K^{j}(k, z_i)$. There are many ways to construct such an interpolated function, as was discussed in Chapter 8. One could use piecewise linear interpolation, cubic spline interpolation, or interpolation with radial basis functions.

The Algorithm-Coleman (1991)

1. Enter iteration $j + 1$ with a guess for $K(k, z_i)$ denoted by $K^{j}(k, z_i)$. The task is to compute a revised guess for $K(k, z_i)$, denoted by $K^{j+1}(k, z_i)$. To this end, at each point $(k_h, z_i) \in \mathcal{K} \times \mathcal{Z}$ a value for $K^{j+1}(k_h, z_i)$ can be computed by solving the equation below for $k'$.

$$U_1(F(k_h, z_i) + (1-\delta)k_h - k') = \beta \sum_{l=1}^{m} U_1(F(k', z_l) + (1-\delta)k' - K^{j}(k', z_l))[F_1(k', z_l) + 1 - \delta]\pi_{il},$$

where $\pi_{il} = \Pr[z' = z_l \mid z = z_i]$. In general solving for $k'$ involves computing the solution to a nonlinear equation. Note that in general $k' \notin \mathcal{K}$; i.e., $k'$ is not restricted to be a grid point. The continuity of $K^{j}$ is important for solving this nonlinear equation numerically. The interpolation schemes in Chapter 8 will result in a continuous function for $K^{j}$. For the initial guess for the policy function, Coleman set investment to zero; i.e., $K^{0}(k, z_i) = 0$. This corresponds to assuming that Robinson Crusoe consumes all of his resources in the final period of life. The idea here is that iteration 1 corresponds to the final period of life, iteration 2 is the penultimate period of life, iteration 3 the third-to-last period, and iteration $j$ the $j$th-to-last period, etc. That is, the iteration procedure can be thought of as solving the Euler equation backwards in time.

2. Compute $\rho(K^{j}, K^{j+1})$. If $\rho(K^{j}, K^{j+1}) < \varepsilon$ then stop, as convergence has been obtained. Otherwise, return to step 1 using the revised guess. The situation is shown in Figure 9.6.1 for the case of linear interpolation, which is discussed in Chapter 8.

3. Draw a sample of $T$ uniform random variables, $\{\varepsilon_t\}_{t=1}^{T}$. Random


number generation is discussed in Chapter 8. In MATLAB a sample of uniformly distributed random variables on the [0, 1] interval can be drawn using the rand command. Make sure that the seed is fixed for the random number generator; this is done with the rng(seed) command, where seed is some natural number. This line should be inserted just before the call for the random numbers.

4. Enter period $t$ with a level of capital, $k_t$, and some past value for the technology shock, $z_{t-1} = z_r \in \mathcal{Z}$. The technology shock may randomly transit to another value $z_t = z_s$ in the current period $t$. Now, take $\varepsilon_t$ from the sample of random variables. Compute the current technology shock, $z_t$, as follows:

$$z_t = z_s, \quad \text{if } \varepsilon_t \in \Big[\sum_{u=0}^{s-1} \pi_{r,u},\ \sum_{u=0}^{s} \pi_{r,u}\Big], \quad \text{for } s = 1, \cdots, m,$$

where $\pi_{r,0} \equiv 0$. This is portrayed in Figure 9.6.2. Compute next period’s capital stock, $k_{t+1}$, using the interpolated decision rule for capital

$$k_{t+1} = K(k_t, z_t).$$

5. For the starting value of capital just take $k_1 = k^*$, where $k^*$ is the level of capital in the deterministic steady state. For $z_0$ take any value for the technology shock.

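6. Given a sample path of capital stocks and technology shocks, $\{k_t\}_{t=1}^{T+1}$ and $\{z_t\}_{t=1}^{T+1}$, data for all other variables of interest in the model, say consumption, investment and GDP, can be calculated.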

Figure 9.6.1: Coleman (1991) Algorithm with Linear Interpolation. Note that for the initial guess, $K^{0}(k, z_i) = 0$, which implies Robinson Crusoe is consuming all of the resources at his disposal. (The plot shows the successive guesses $K^{0}(k, z_i)$, $K^{1}(k, z_i)$, $K^{2}(k, z_i)$, and $K^{j}(k, z_i)$ against $k$, with grid points $k_1, k_2, k_3, k_4$.)

Figure 9.6.2: Simulation of the shocks, $z_t$, for a 3-State Markov chain. Observe that the length of interval $u = 1, 2, 3$ is equal to $\pi_{r,u}$. The total length over all intervals is 1. If the value of $\varepsilon_t \in [0, 1]$ lies in interval $u$, then the technology shock transits between periods $t - 1$ and $t$ from $z_r$ to $z_u$. (The unit interval is cut at $\pi_{r,0} = 0$, $\pi_{r,1}$, $\pi_{r,1} + \pi_{r,2}$, and $\pi_{r,1} + \pi_{r,2} + \pi_{r,3} = 1$, with the three segments labeled $z_r \to z_1$, $z_r \to z_2$, and $z_r \to z_3$.)
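Steps 1 and 2 of the Coleman iteration can be sketched as follows, using piecewise linear interpolation (interp1) and a bracketing root finder (fzero). Everything specific here is an assumption for illustration: log utility, $F(k, z) = zk^{\alpha}$, full depreciation, the two-state chain, the grid, and the sup norm as the distance $\rho(K^{j}, K^{j+1})$. The bracket passed to fzero is valid for this example but is not guaranteed to be for other specifications. This special case has the known solution $K(k, z) = \alpha\beta z k^{\alpha}$, which the iteration should approach.

```matlab
% Hypothetical specification (assumptions for illustration)
alpha = 0.3;  beta = 0.95;  delta = 1;
z  = [0.9 1.1];  Pi = [0.8 0.2; 0.2 0.8];  m = numel(z);
n  = 60;  kgrid = linspace(0.05, 0.4, n)';
U1  = @(c) 1./c;                              % marginal utility, U(c) = ln c
Fk  = @(k, zz) zz.*k.^alpha;                  % F(k,z)
F1k = @(k, zz) alpha*zz.*k.^(alpha-1);        % F_1(k,z)

K   = zeros(n, m);                            % initial guess K^0(k,z_i) = 0
tol = 1e-8;
for iter = 1:500
    Knew = zeros(n, m);
    for i = 1:m
        for h = 1:n
            y = Fk(kgrid(h), z(i)) + (1-delta)*kgrid(h);   % resources on hand
            % Euler residual in k'; interp1 evaluates K^j(k', z_l) for all l at once
            resid = @(kp) U1(y - kp) - beta*sum(Pi(i,:) .* ...
                U1(Fk(kp, z) + (1-delta)*kp - interp1(kgrid, K, kp)) .* ...
                (F1k(kp, z) + 1 - delta));
            % search for k' on the grid's range, keeping consumption positive
            Knew(h,i) = fzero(resid, [kgrid(1), min(y - 1e-8, kgrid(end))]);
        end
    end
    dist = max(abs(Knew(:) - K(:)));          % sup-norm distance between guesses
    K = Knew;
    if dist < tol, break; end
end
Kexact = alpha*beta*(kgrid.^alpha)*z;         % closed form for this special case
```

Restricting the root search to the grid's range avoids extrapolating the interpolated policy; a wider search would need an interpolation scheme that extrapolates sensibly.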

9.7 Carroll’s Endogenous Grid Method

The Coleman method defines an exogenous grid for the current stock of capital $k$, which is then used to compute an interpolated decision rule $k' = K(k, z)$ by calling a nonlinear equation solver to solve the Euler equation. The endogenous grid method proposed by Carroll (2006) runs the process in reverse. A fixed grid, $\mathcal{K}'$, for next period’s capital stock $k'$ is imposed. The values for $k' \in \mathcal{K}'$ are used to construct an interpolated consumption function, $C(x, z)$, where $x$ is the income at the agent’s disposal. The levels of disposable income, $x$, that justify the choice of $k'$ at each of these grid points are computed using the Euler equation. This is the sense in which Coleman’s process is run in reverse. At each stage in an iteration of the algorithm, disposable income and technology shocks are used to construct an interpolated consumption function, $c = C(x, z)$. In sum, the current period shock, next period’s capital stock, and an interpolated consumption function are taken as given, and then a level of disposable income in the current period is solved for that ensures the Euler equation holds. Depending on the context, this may avoid the costly use of a nonlinear equation solver by making disposable income $x$ vary endogenously or at least make the solution quicker to compute. One assumption is needed to make the algorithm practicable: the marginal utility of consumption must be invertible.

9.7.1 Algorithm: Endogenous Grid Method

As above, define a grid for next period’s capital stock $\mathcal{K}' \equiv \{k'_1, k'_2, ..., k'_n\}$ spanning the interval $[0, \bar{K}]$. Let the shock process follow a first-order $m$-state Markov chain, where $z \in \mathcal{Z} = \{z_1, z_2, ..., z_m\}$ and $\pi_{il} \equiv \Pr[z' = z_l \mid z = z_i]$ corresponds to the probability of transitioning from state $i$ to state $l$. The following algorithm uses the interpolation schemes discussed above, but in some circumstances avoids the use of a nonlinear equation solver in each iteration.

1. Enter iteration $j + 1$ with a guess for next period’s consumption function, $c' = C^{j}(x_{k_h,z_l}, z_l)$, where $x_{k_h,z_l} \equiv F(k_h, z_l) + (1-\delta)k_h$ is next period’s disposable income and $z_l$ is next period’s shock. Given a value for $z_l$, there will be a unique one-to-one relationship between $x_{k_h,z_l}$ and $k_h$.

2. For each pair of next period’s capital stock and current technology shock, $(k_h, z_i) \in \mathcal{K}' \times \mathcal{Z}$, a value for current disposable income $x_{k,z_i}$ given current capital stock $k$ and technology shock $z_i$ is recovered

by solving the Euler equation

$$U_1(x_{k,z_i} - k_h) = \beta \sum_{l=1}^{m} U_1(C^{j}(x_{k_h,z_l}, z_l))[F_1(k_h, z_l) + 1 - \delta]\pi_{il}.$$

Note that, for given values of $k_h$ and $z_l$, the level of next period’s disposable income, $x_{k_h,z_l}$, is known from the formula $x_{k_h,z_l} \equiv F(k_h, z_l) + (1-\delta)k_h$. The advantage of defining the Euler equation in this fashion is that often it can be solved analytically as long as the marginal utility of consumption is invertible. Specifically,

$$x_{k,z_i} = k_h + U_1^{-1}\Big(\beta \sum_{l=1}^{m} U_1(C^{j}(x_{k_h,z_l}, z_l))[F_1(k_h, z_l) + 1 - \delta]\pi_{il}\Big),$$

where $U_1^{-1}(\cdot)$ is the inverse of the marginal utility of consumption. In certain contexts it may be possible to solve this equation quickly without the use of a nonlinear equation solver. At the end of this step, there will be $n \times m$ solutions for $x$, one for each value of next period’s capital stock, $k'$, and the current technology shock, $z$. This set of solutions will vary by iteration. Connected with each value of $x$ will be a value for current consumption $c = x - k'$.

3. Now use this set of $n \times m$ solutions for $c$, which are functions of current disposable income, $x$, and the current technology shock, $z$, to fit a new interpolated consumption function, $c = C^{j+1}(x, z)$.

4. Compute $\rho(C^{j}, C^{j+1})$. If $\rho(C^{j}, C^{j+1}) < \varepsilon$, then stop, as convergence has been obtained. Otherwise, return to step 1 using the revised guess for the decision rule.
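A compact sketch of these four steps is given below. The specification (log utility, so that $U_1^{-1}(u) = 1/u$; $F(k, z) = zk^{\alpha}$; full depreciation; a two-state chain) and the way the distance between consumption functions is measured are assumptions made for the example. In this special case the consumption function is known to be $C(x, z) = (1 - \alpha\beta)x$, so the converged ratio of consumption to disposable income can be checked.

```matlab
% Hypothetical specification (assumptions for illustration)
alpha = 0.3;  beta = 0.95;  delta = 1;
z  = [0.9 1.1];  Pi = [0.8 0.2; 0.2 0.8];  m = numel(z);
n  = 40;  kpgrid = linspace(0.05, 0.4, n)';          % fixed grid for k'
U1    = @(c) 1./c;                                   % marginal utility, U(c) = ln c
U1inv = @(u) 1./u;                                   % its inverse

% Next period's disposable income and gross return at each (k'_h, z_l)
xnext = kpgrid.^alpha * z + (1-delta)*kpgrid*ones(1, m);   % n x m
Rnext = alpha*kpgrid.^(alpha-1) * z + 1 - delta;           % n x m

% Consumption function stored as points (X(:,l), C(:,l)); initial guess C^0(x,z) = x
X = xnext;  C = xnext;
for iter = 1:1000
    % Step 1: next period's consumption from the current guess
    Cnext = zeros(n, m);
    for l = 1:m
        Cnext(:,l) = interp1(X(:,l), C(:,l), xnext(:,l), 'linear', 'extrap');
    end
    % Step 2: invert the Euler equation; column i sums over l with weights pi_il
    rhs  = beta*(U1(Cnext).*Rnext)*Pi';
    Cnew = U1inv(rhs);
    Xnew = kpgrid*ones(1, m) + Cnew;                 % endogenous grid x = k' + c
    % Steps 3-4: compare old and new interpolated functions on the new grid
    dist = 0;
    for i = 1:m
        Cold = interp1(X(:,i), C(:,i), Xnew(:,i), 'linear', 'extrap');
        dist = max(dist, max(abs(Cnew(:,i) - Cold)));
    end
    X = Xnew;  C = Cnew;
    if dist < 1e-10, break; end
end
mpc = C./X;                                          % consumption share of x
```

No nonlinear equation solver is called inside the loop, which is the practical appeal of the method in a case like this one.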

Remark 57. A Monte Carlo simulation can be undertaken to construct statistics of interest, just as in the Coleman algorithm. In a nutshell the procedure is this. One will enter an arbitrary period $t$ with known levels of the capital stock, $k_t$, and technology shock, $z_t$. Given the current state, $(k_t, z_t)$, one can compute disposable income in period $t$, or $x_t = F(k_t, z_t) + (1-\delta)k_t$. This gives the current level of consumption, $c_t = C(x_t, z_t)$, and next period’s capital stock, $k_{t+1} = x_t - C(x_t, z_t)$. Then, next period’s technology shock, $z_{t+1}$, is drawn via the Monte Carlo procedure described in the discussion of the Coleman algorithm. After this the period-$(t+1)$ state, $(k_{t+1}, z_{t+1})$, is known and the procedure repeats itself.
Remark 58. It is easy to add labor supply into the above formulation. Let the utility function be $U(c - G(l))$. Hours worked, $l$, can be written as a function of $k$ and $z$, as discussed in Chapter 6. Define an augmented consumption function by $c - G(l) = \tilde{C}(x, z)$.

9.8 Parameterized Expectations Algorithm

Yet another method to solve dynamic stochastic models is to parameterize conditional expectation functions, typically the consumption Euler equation or the household’s value function, by an ordinary polynomial function. This method was introduced by den Haan and Marcet (1990). As in the Coleman method discussed above, the goal is to find the policy function for next period’s capital, $k' = K(k, z)$. Now, the policy function is approximated by a flexible polynomial function given a vector of coefficients $\phi$, so that

$$K(k, z) \simeq \mathcal{P}(k, z; \phi)$$

and

$$\mathcal{P}(k, z; \phi) = \sum_{i=0}^{n} \phi_i\, p_i(k, z),$$

where $p_i(k, z)$ is some set of basis functions such as $k$, $z$, $k^2$, $z^2$, $kz$, etc. Given the current capital stock and productivity level, $(k_h, z_i)$, rewrite the Euler equation

$$U_1(c(k_h, z_i)) = \beta \sum_{l=1}^{m} U_1(c(k', z_l))[F_1(k', z_l) + 1 - \delta]\pi_{il} \tag{9.8.1}$$

as a function of $k'$ according to

$$k' = \underbrace{\beta \sum_{l=1}^{m} \frac{U_1(c(k', z_l))}{U_1(c(k_h, z_i))}[F_1(k', z_l) + 1 - \delta]\,k'\,\pi_{il}}_{\simeq\, \mathcal{P}(k, z; \phi)},$$

where the righthand side of the equation is the conditional expectation to be approximated as a function of $k$ and $z$. The algorithm below uses a stochastic simulation method to update the polynomial, $\mathcal{P}(k, z; \phi)$, on the righthand side. The two-stage procedure draws on Judd et al. (2011).

9.8.1 Algorithm: Parameterized Expectations Method

Initialize the algorithm by providing an initial guess for the vector of polynomial coefficients $\phi^{1}$, the initial state $(k_0, z_0)$ needed for simulations, and a sequence of productivity realizations $\{z_t\}_{t=1,...,T}$, where $T$ is the simulation length. The first stage proceeds as follows:

1. Enter iteration $j$ with coefficients $\phi^{j}$ and state $(k_t, z_t)$, and simulate the model for $T$ periods using

$$\begin{aligned} k'(k_t, z_t) &= \mathcal{P}(k_t, z_t; \phi^{j}), \\ c(k_t, z_t) &= F(k_t, z_t) - k'(k_t, z_t) + (1-\delta)k_t. \end{aligned}$$

2. For periods $t = 0, ..., T - 1$, compute the conditional expectation in (9.8.1), where

$$\begin{aligned} k''(k', z_l) &= \mathcal{P}(\mathcal{P}(k_t, z_t; \phi^{j}), z_l; \phi^{j}), \\ c'(k', z_l) &= F(k', z_l) - k''(k', z_l) + (1-\delta)k'. \end{aligned}$$

3. Find $\hat{\phi}$ that minimizes the prediction error $\varepsilon'$

$$\varepsilon' = \beta \sum_{l=1}^{m} \frac{U_1(c(k', z_l))}{U_1(c(k_t, z_t))}[F_1(k', z_l) + 1 - \delta]\,k'\,\pi_{tl} - \mathcal{P}(k_t, z_t; \phi).$$

Here $\hat{\phi}$ can be estimated using ordinary least-squares, least-squares using a singular value decomposition, least-absolute deviations, or principal component regressions.

4. Check for convergence of the decision rule according to

$$\frac{1}{T} \sum_{t=1}^{T} \left| \frac{(k')^{j} - (k')^{j-1}}{(k')^{j}} \right| < \rho.$$

5. If convergence is not reached, update the vector of polynomial coefficients using fixed-point iteration with damping parameter $\gamma \in (0, 1]$, or

$$\phi^{j+1} = (1-\gamma)\phi^{j} + \gamma\hat{\phi},$$

and return to step 1.
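The first stage can be sketched as below. This is an illustration under several assumptions rather than a reference implementation: log utility, $F(k, z) = zk^{\alpha}$, full depreciation, a two-state chain, the basis $\{1, k, z, kz, k^2\}$ (the $z^2$ term is left out because it is collinear with $1$ and $z$ when $z$ takes only two values), ordinary least squares via the backslash operator, and hypothetical values for the initial guess, damping parameter, and tolerance. The special case is chosen because its exact rule, $k' = \alpha\beta z k^{\alpha}$, gives a benchmark for the fitted polynomial.

```matlab
rng(3);                                       % fix the seed before drawing
alpha = 0.3;  beta = 0.95;  delta = 1;        % hypothetical parameters
z = [0.9 1.1];  Pi = [0.8 0.2; 0.2 0.8];  m = numel(z);
T = 2000;  gamma = 0.5;  tol = 1e-8;

% Productivity realizations, drawn once and held fixed across iterations
u = rand(T, 1);  zi = zeros(T, 1);  r = 1;
for t = 1:T
    if u(t) >= Pi(r,r), r = 3 - r; end        % switch state with prob. 1 - pi_rr
    zi(t) = r;
end
zt = reshape(z(zi), [], 1);                   % T x 1 path of shocks

basis = @(k, zz) [ones(size(k)), k, zz, k.*zz, k.^2];   % p_i(k,z), one per column
kstar = (alpha*beta)^(1/(1-alpha));
phi   = [0.5*kstar - 0.1; 0.5; 0.1; 0; 0];    % initial guess (assumption)
kp_old = zeros(T, 1);

for iter = 1:500
    % Step 1: simulate capital with the current polynomial
    k = zeros(T+1, 1);  k(1) = kstar;
    for t = 1:T
        k(t+1) = phi(1) + phi(2)*k(t) + phi(3)*zt(t) + phi(4)*k(t)*zt(t) + phi(5)*k(t)^2;
    end
    kt = k(1:T);  kp = k(2:T+1);
    c  = zt.*kt.^alpha + (1-delta)*kt - kp;
    % Step 2: conditional expectation; with log utility U1(c')/U1(c) = c/c'
    target = zeros(T, 1);
    for l = 1:m
        zl  = z(l)*ones(T, 1);
        kpp = basis(kp, zl)*phi;                          % k'' from the polynomial
        cp  = zl.*kp.^alpha + (1-delta)*kp - kpp;         % c' in state z_l
        target = target + beta*Pi(zi,l).*(c./cp).*(alpha*zl.*kp.^(alpha-1) + 1 - delta).*kp;
    end
    % Step 3: least-squares fit of the expectation on the basis
    phi_hat = basis(kt, zt) \ target;
    % Step 4: mean relative change in the simulated decision rule
    crit = mean(abs((kp - kp_old)./kp));
    kp_old = kp;
    if crit < tol, break; end
    % Step 5: damped fixed-point update
    phi = (1-gamma)*phi + gamma*phi_hat;
end
policy_gap = max(abs(kp - alpha*beta*zt.*kt.^alpha));     % distance from the exact rule
```

A smaller damping parameter trades speed for stability; with other specifications the iteration may need it.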

The second stage of the algorithm computes the approximation errors in the Euler equation. If the approximation is accurate, the candidate vector of polynomial coefficients $\phi^{\star}$ is accepted. If instead the approximation is not sufficiently accurate, the first stage can be amended by using a different approximating function $\mathcal{P}$, increasing the simulation length $T$, and/or choosing a different norm when estimating the coefficients $\hat{\phi}$. To compute the Euler equation errors proceed as follows:

1. Draw a new set of productivity realizations to be used as test points, $\mathcal{T} \equiv \{\tilde{z}_0, \tilde{z}_1, \cdots, \tilde{z}_T\}$. Given the converged vector of coefficients $\phi^{\star}$, compute $\tilde{k}'(\tilde{k}_t, \tilde{z}_t) = \mathcal{P}(\tilde{k}_t, \tilde{z}_t; \phi^{\star})$ for $t = 1, 2, \cdots, T$.

2. Compute the Euler equation errors at each point $(\tilde{k}_\tau, \tilde{z}_\tau)$

$$\mathcal{E}(\tilde{k}_\tau, \tilde{z}_\tau) = \beta \sum_{l=1}^{m} \frac{U_1(c(\tilde{k}', \tilde{z}_l))}{U_1(c(\tilde{k}_\tau, \tilde{z}_\tau))}[F_1(\tilde{k}', \tilde{z}_l) + 1 - \delta]\pi_{\tau l} - 1.$$

If the mean of the errors is sufficiently small, the candidate $\phi^{\star}$ is accepted.
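The accuracy check can be sketched as follows. Instead of a fitted polynomial, the candidate policy here is the exact rule for a special case (log utility, $F(k, z) = zk^{\alpha}$, full depreciation), so the Euler equation errors should be zero up to rounding; with an approximate policy the same code would report its errors. The names Kfun and Cfun and all parameter values are hypothetical.

```matlab
% Hypothetical specification and candidate policy (assumptions)
alpha = 0.3;  beta = 0.95;  delta = 1;
z = [0.9 1.1];  Pi = [0.8 0.2; 0.2 0.8];  m = numel(z);
Kfun = @(k, zz) alpha*beta*zz.*k.^alpha;                   % candidate rule k' = K(k,z)
Cfun = @(k, zz) zz.*k.^alpha + (1-delta)*k - Kfun(k, zz);  % implied consumption
U1   = @(c) 1./c;                                          % marginal utility, U(c) = ln c

% Step 1: fresh shocks and the implied test points (k_t, z_t)
rng(4);
T = 500;  u = rand(T, 1);
zi = zeros(T, 1);  kt = zeros(T, 1);
r = 1;  k = (alpha*beta)^(1/(1-alpha));
for t = 1:T
    if u(t) >= Pi(r,r), r = 3 - r; end     % switch state with prob. 1 - pi_rr
    zi(t) = r;  kt(t) = k;
    k = Kfun(k, z(r));
end
zt = reshape(z(zi), [], 1);
kp = Kfun(kt, zt);                         % next period's capital at each test point

% Step 2: Euler equation error at each test point
err = -ones(T, 1);
for l = 1:m
    err = err + beta*Pi(zi,l).*U1(Cfun(kp, z(l)))./U1(Cfun(kt, zt)) ...
              .*(alpha*z(l)*kp.^(alpha-1) + 1 - delta);
end
mean_abs_err = mean(abs(err));
```

Reporting the maximum absolute error alongside the mean is a common complement, since a small mean can hide large errors at a few points.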

9.9 MATLAB: Worked-Out Examples

9.9.1 A Stochastic Dynamic Monopoly Problem

The dynamic monopolist’s problem will now be reformulated so that he faces a random linear demand function. In particular, demand is given by

$$p_t = \alpha_t - \frac{\beta}{2} o_t,$$

where $p_t$ is the period-$t$ price of the product, $o_t$ is the monopolist’s output in this period, and now $\alpha_t$ is a stochastic demand shifter that follows the AR(1) process

$$\alpha_t = \rho\alpha_{t-1} + \varepsilon_t,$$

with

$$\varepsilon_t \sim N((1-\rho)\bar{\alpha}, \sigma),$$

where $\sigma$ is the standard deviation of the innovation. Demand is decreasing in price, $p_t$. The monopolist produces according to the quadratic cost function

$$c_t = \frac{\gamma}{2}(o_t - \kappa o_{t-1})^2,$$

where $c_t$ is period-$t$ total cost and $o_{t-1}$ is the monopolist’s level of output in period $t-1$. In this random world the above cost function implies that the monopolist would like to smooth out fluctuations in his output. Under the above formulation, the long-run or unconditional expected level of the demand shifter is

$$E[\alpha] = \bar{\alpha}.$$

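This can be seen by noting that

$$\alpha_t = \rho^{t-1}\alpha_1 + \varepsilon_t + \rho\varepsilon_{t-1} + \cdots + \rho^{t-2}\varepsilon_2,$$

which implies that

$$E[\alpha_t|\alpha_1] = \rho^{t-1}\alpha_1 + \frac{1-\rho^{t-1}}{1-\rho}E[\varepsilon] = \rho^{t-1}\alpha_1 + (1-\rho^{t-1})\bar{\alpha}.$$

Clearly, as $t \to \infty$, this converges to $\bar{\alpha}$.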
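As a quick numerical check, the conditional mean can be iterated forward one period at a time, using E[α_t | α_1] = ρ E[α_{t−1} | α_1] + (1 − ρ)ᾱ, and compared with the closed form ρ^{t−1}α_1 + (1 − ρ^{t−1})ᾱ. The sketch below does this; the starting value alpha1 and the horizon T are arbitrary illustrative choices.

```matlab
% Conditional mean of the AR(1) demand shifter: recursion versus closed form.
rho = 0.5; abar = 1;        % persistence and long-run mean
alpha1 = 2; T = 25;         % arbitrary starting value and horizon (assumptions)
m = zeros(T,1);
m(1) = alpha1;
for t = 2:T
    m(t) = rho*m(t-1) + (1-rho)*abar;   % E[alpha_t | alpha_1], one step at a time
end
tt = (1:T)';
closed = rho.^(tt-1)*alpha1 + (1 - rho.^(tt-1))*abar;  % closed-form expression
gap = max(abs(m - closed)); % zero up to rounding if the two agree
```

The last element of m is already very close to abar, illustrating the convergence to the unconditional mean.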

9.9.2 The Monopolist’s Dynamic Programming Problem

The monopolist’s state of the world in period $t$ is $(o_{t-1}, \alpha_t)$; that is, he knows the past level of his output, $o_{t-1}$, and the current state of demand, $\alpha_t$. The monopolist’s objective is to maximize the expected present value of his profits. The mathematical transliteration of this problem is the following dynamic programming problem:

$$V(o_{-1}, \alpha) = \max_{o}\{\alpha o - \frac{\beta}{2}o^2 - \frac{\gamma}{2}(o - \kappa o_{-1})^2 + \delta E[V(o, \alpha')]\},$$


where $o_{-1}$ is last period’s output and $\alpha'$ is next period’s state of demand. The first-order condition associated with this maximization problem is

$$\underbrace{\alpha - \beta o}_{MR} = \underbrace{\gamma(o - \kappa o_{-1}) - \delta E[V_1(o, \alpha')]}_{MC},$$

which sets marginal revenue, MR, equal to expected marginal cost, MC. By differentiating both sides of the above dynamic programming problem, while applying the envelope theorem, it is easy to deduce that

$$V_1(o_{-1}, \alpha) = \gamma\kappa(o - \kappa o_{-1}) > 0.$$

An increase in the past level of output, $o_{-1}$, is beneficial to the monopolist because it reduces his current costs. By updating the equation, one obtains

$$V_1(o, \alpha') = \gamma\kappa(o' - \kappa o).$$

Using this in the first-order condition for the above dynamic programming problem gives

$$\alpha - \beta o = \gamma(o - \kappa o_{-1}) - \delta\gamma\kappa(E[o'|o_{-1}, \alpha] - \kappa o). \qquad (9.9.1)$$

9.9.3 Solving the Model via the Decision Rule Approach

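Conjecture that the monopolist’s decision rule has the following linear form:

$$o = \eta + \lambda\alpha + \psi o_{-1}. \qquad (9.9.2)$$

Solving the model amounts to calculating the solution for the three coefficients $\eta$, $\lambda$ and $\psi$. The long-run (or unconditional) expected level of output is given by

$$E[o] = \frac{\eta + \lambda\bar{\alpha}}{1-\psi}.$$

This can be seen by noting that

$$o_t = \psi^{t-1}o_1 + (\eta + \lambda\alpha_t) + \psi(\eta + \lambda\alpha_{t-1}) + \cdots + \psi^{t-2}(\eta + \lambda\alpha_2),$$

so that

$$E[o_t|o_1, \alpha_1] = \psi^{t-1}o_1 + (\eta + \lambda E[\alpha_t|\alpha_1]) + \psi(\eta + \lambda E[\alpha_{t-1}|\alpha_1]) + \cdots + \psi^{t-2}(\eta + \lambda E[\alpha_2|\alpha_1]).$$

As $t$ becomes large, $E[\alpha_t|\alpha_1] \to \bar{\alpha}$ and, provided $|\psi| < 1$, the above result obtains. Now, if $o_{-1} = o = E[o]$ and $\alpha = \bar{\alpha}$, then $E[o'] = E[o]$.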
Next, $E[o]$ will be calculated using the first-order condition. This allows the constant $\eta$ to be determined. Toward this end, using the above results in the first-order condition for the above dynamic programming problem (9.9.1) gives

$$\bar{\alpha} - \beta E[o] = \gamma(E[o] - \kappa E[o]) - \delta\gamma\kappa(E[o] - \kappa E[o]),$$

$$\bar{\alpha} - \beta E[o] = \gamma(1-\kappa)E[o] - \delta\gamma\kappa(1-\kappa)E[o],$$

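which implies

$$E[o] = \frac{1}{\beta + \gamma(1-\kappa) - \delta\gamma\kappa(1-\kappa)}\bar{\alpha} = \frac{1}{\beta + \gamma(1-\kappa)(1-\delta\kappa)}\bar{\alpha}.$$

Hence, the constant $\eta$ must solve

$$\eta = (1-\psi)E[o] - \lambda\bar{\alpha}. \qquad (9.9.3)$$

Making use of the conjectured decision rule in the first-order condition for the dynamic programming problem yields

$$\alpha - \beta o_t = \gamma(o_t - \kappa o_{t-1}) - \delta\gamma\kappa(\eta + \lambda E[\alpha'|\alpha] + \psi o_t - \kappa o_t).$$

Therefore, since $E[\alpha'|\alpha] = \rho\alpha + (1-\rho)\bar{\alpha}$,

$$[\gamma + \beta - \delta\gamma\kappa\psi + \delta\gamma\kappa^2]o_t = \alpha + \delta\gamma\kappa\eta + \delta\gamma\kappa\lambda(1-\rho)\bar{\alpha} + \delta\gamma\kappa\lambda\rho\alpha + \gamma\kappa o_{t-1}.$$

Matching the coefficients on $o_{t-1}$ and $\alpha$ with those in the conjectured rule (9.9.2) implies that

$$\psi = \frac{\gamma\kappa}{\gamma + \beta - \delta\gamma\kappa\psi + \delta\gamma\kappa^2},$$

and

$$\lambda = \frac{1 + \delta\gamma\kappa\lambda\rho}{\gamma + \beta - \delta\gamma\kappa\psi + \delta\gamma\kappa^2}.$$

Therefore, the solution for $\psi$ solves the quadratic equation

$$-\delta\gamma\kappa\psi^2 + (\gamma + \beta + \delta\gamma\kappa^2)\psi - \gamma\kappa = 0.$$

This is the same quadratic equation as for the dynamic monopoly problem. It has two roots, a stable and an unstable one. Take the stable root for $\psi$. The solution for $\lambda$ is

$$\lambda = \frac{1}{\gamma + \beta - \delta\gamma\kappa\psi + \delta\gamma\kappa^2 - \delta\gamma\kappa\rho}.$$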
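The formulas of this section can be checked numerically. The sketch below computes ψ, λ and η exactly as written above, using the parameter values from the program in Section 9.9.4, and then verifies that the resulting linear rule satisfies the first-order condition (9.9.1) at a set of arbitrary states. The test states olag and alph are hypothetical; the conditional expectation uses E[α'|α] = ρα + (1 − ρ)ᾱ.

```matlab
% Decision-rule coefficients from the formulas of Section 9.9.3 (sketch).
malpha = 1; beta = 0.5; rho = 0.5;     % demand parameters
gamma = 0.5; kappa = 0.9; delta = 0.96;% cost parameters and discount factor

a = -delta*gamma*kappa;                     % coefficient on psi^2
b = gamma + beta + delta*gamma*kappa^2;     % coefficient on psi
c = -gamma*kappa;                           % constant term
r = roots([a b c]);
psi = r(abs(r) < 1);                        % stable root
lam = 1/(b + a*psi + a*rho);                % lambda from the last formula above
ostar = malpha/(beta + gamma*(1-kappa)*(1-delta*kappa));  % E[o]
eta = (1 - psi)*ostar - lam*malpha;         % equation (9.9.3)

% Check the first-order condition (9.9.1) at arbitrary test states.
rng(1);
olag = 1 + rand(50,1);                      % hypothetical values of o_{-1}
alph = 0.5 + rand(50,1);                    % hypothetical values of alpha
o = eta + lam*alph + psi*olag;              % current output from the rule
Eonext = eta + lam*(rho*alph + (1-rho)*malpha) + psi*o;   % E[o' | o_{-1}, alpha]
resid = alph - beta*o - gamma*(o - kappa*olag) ...
        + delta*gamma*kappa*(Eonext - kappa*o);
maxresid = max(abs(resid));                 % zero up to rounding
```

A residual of zero at every test state confirms that the three coefficients jointly solve the first-order condition, not just its steady-state version.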

9.9.4 The MATLAB code

MATLAB, Main Program: main.m

Below is a MATLAB program that solves the stochastic dynamic monopolist’s problem using the decision-rule approach. The decision-rule approach involves finding the roots of a polynomial. This is done using the built-in MATLAB function roots. Then the decision rule is simulated via a Monte Carlo. This involves calling up a sample of normally distributed random errors using the MATLAB function normrnd. Before doing this you should set a seed for the random number generator using the rng command. It is important to do this or else your sample of random numbers will change every time you use the normrnd command.
Next, the decision rule is simulated using a simple for loop. The last step is to present some output. Some means and standard deviations are computed using the built-in MATLAB functions mean and std. The probability distribution for output is plotted using the histogram command.

% main.m
% Stochastic Monopoly Problem: Main Program
clear all % Clear all variables from previous runs
clc % Clear screen

% Set parameters for model

% Demand curve
malpha = 1; % mean of intercept
beta = 0.5; % slope
rho = 0.5; % autocorrelation
sdshock = .10; % standard deviation

% Cost function
gamma = 0.5; % quadratic term
kappa = 0.9; % cost reduction term

% Discount factor
delta = 0.96;

% Time horizon for simulation
T = 100000; % Number of periods

% Compute steady-state level of output
ostar = malpha/(beta + gamma*(1 - kappa)*(1 - delta*kappa));

% Solve model taking the Decision-Rule Approach
% Set up the Quadratic Formula for Psi
a = -delta*gamma*kappa; % Coefficient on squared term
b = gamma + beta + delta*gamma*kappa^2; % Linear term
c = -gamma*kappa; % Constant term
p = [a b c]; % Coefficients on quadratic

% Find roots of quadratic and take the minimum one.
psi = min(roots(p)); % Slope term on output in decision rule
% Note: the formula in Section 9.9.3 corresponds to lam = 1/(b + a*psi + a*rho).
% The expression below is the one that generates the output reported in the text.
lam = 1/(b - a*rho); % Slope term on shock in decision rule
eta = (1 - psi)*ostar - lam*malpha; % Constant term in decision rule
% Display coefficients for decision rule
disp('Coefficients for Decision Rule')
disp('eta psi lam')
disp([eta, psi, lam])

% Draw normal random numbers for shock
rng(1) % Set seed for random numbers
shocks = normrnd((1 - rho)*malpha, sdshock, T, 1); % Draw shocks
% Old MATLAB syntax
% randn('seed', 0) % Old MATLAB syntax for seed
% shocks = (1 - rho)*malpha + sdshock.*randn(T, 1); % OLD

% Iterate on difference equation for output
ovec = zeros(T, 1); % Set up vector to store outputs
avec = zeros(T, 1); % Set up vector to store alphas
time = (0:T-1)'; % Set up vector for time
alphapast = malpha; % Start the shifter at its mean
for t = 2:T
    alpha = rho*alphapast + shocks(t, 1); % AR(1) law of motion
    avec(t, 1) = alpha;
    ovec(t, 1) = eta + lam*alpha + psi*ovec(t-1, 1); % Decision rule
    alphapast = alpha; % Update alpha
end
pvec = avec - beta*ovec/2; % Compute prices over time

% Plot results
figure(1)
subplot(2, 2, 1)
plot(time(1:30), ovec(1:30))
title('Start of Time')
xlabel('Time')
ylabel('Output')
subplot(2, 2, 2)
plot(time(1:30), pvec(1:30))
title('Start of Time')
xlabel('Time')
ylabel('Price')
subplot(2, 2, 3)
plot(time(150:300), ovec(150:300))
title('Stationary Dist')
xlabel('Time')
ylabel('Quantity')
axis([140, 300, min(ovec(150:300)), max(ovec(150:300))])
subplot(2, 2, 4)
plot(time(150:300), pvec(150:300))
title('Stationary Dist')
xlabel('Time')
ylabel('Price')
axis([140, 300, min(pvec(150:300)), max(pvec(150:300))])
figure(2)
histogram(ovec(150:T), 'Normalization', 'probability')
% Old MATLAB syntax, histogram with 20 bins
% hist(ovec(150:T), 20)
title('Histogram of Outputs')
xlabel('Output')
ylabel('Frequency')

% Compute some descriptive statistics (first 149 periods dropped as burn-in)
disp('Descriptive Statistics')
disp('Mean Level of Output')
disp(mean(ovec(150:T, 1)))
disp('Mean Level of Shock, Alpha')
disp(mean(avec(150:T, 1)))
disp('Standard Deviation Ln Output')
disp(std(log(ovec(150:T, 1))))
disp('Standard Deviation Ln Price')
disp(std(log(pvec(150:T, 1))))
disp('Correlation between Ln Output and Ln Prices')
x = corrcoef(log(ovec(150:T, 1)), log(pvec(150:T, 1)));
disp(x(2, 1))
disp('Autocorrelation for Ln Output')
x = corrcoef(log(ovec(151:T, 1)), log(ovec(150:T-1, 1)));
disp(x(2, 1))


Output from the Program

The program gives the following output. The upper panel of Figure 9.9.1 shows the time paths of output and prices for the monopolist at the beginning of time, before a stochastic steady state has been reached. The lower panel shows the same thing once the stochastic steady state has been reached. Note the absence of any time trend here. Figure 9.9.2 plots a histogram for output. Observe how it resembles a normal distribution.
% Coefficients for Decision Rule
eta psi lam
0.6287 0.3656 0.6231

% Descriptive Statistics
% Mean Level of Output
1.9734
% Mean Level of Shock, Alpha
1.0002
% Standard Deviation Ln Output
0.0471
% Standard Deviation Ln Price
0.1942
% Correlation between Ln Output and Ln Prices
0.9118
% Autocorrelation for Ln Output
0.7285

Figure 9.9.1: The figure plots some random sample paths for output and prices that come from the MATLAB programs.

Figure 9.9.2: This is the probability distribution for output that results from the MATLAB program. It resembles a normal distribution.
10 The Aiyagari Model

10.1 Introduction

The Aiyagari (1994) model is a landmark in macroeconomics. It set out a basic heterogeneous agent model that has become the starting point for studying incomplete markets and heterogeneity among people and firms more generally. It was one of the first papers to abandon the representative agent model.
Aiyagari builds a version of the Brock and Mirman (1972) growth model that allows for a large number of individuals, subject to idiosyncratic risk, who cannot insure perfectly due to incomplete markets. People can only insure themselves against risk by borrowing or saving using one-period bonds with a safe return. There is a limit on how much an individual can borrow. Since risk is idiosyncratic in nature it washes out at the aggregate level, due to a law of large numbers, so that a deterministic steady-state equilibrium obtains. Aiyagari’s analysis can be extended to include aggregate risk along the lines proposed by Boppart et al. (2018). All of this is discussed below.

Margin note: S. Rao Aiyagari (1951-1997) died at the relatively young age of 45 from a heart attack while playing tennis, one of his beloved activities. He never saw the impact that his model would have. Aiyagari was a brilliant person. Before obtaining a Ph.D. in economics he published a paper (with M.N. Mahanta) in the Journal of Mathematical Physics titled “On the Equivalence of the Einstein-Mayer and Einstein-Cartan Theories for Describing a Spinning Medium.”

Aiyagari (1994)’s analysis had two purposes.

1. To study the impact of aggregation. He showed how the savings decisions of many heterogeneous agents can be aggregated to obtain a deterministic steady-state wealth distribution. Unlike the neoclassical growth model, the real interest rate is not equal to the rate of time preference. In particular, it is always smaller.

2. To quantify the importance of idiosyncratic risk for savings. Many re-

searchers have conjectured that precautionary savings may account
for a significant fraction of aggregate savings. The extent of such
saving will depend on how risk averse a person is and on how
volatile the idiosyncratic shocks are.

The upshot of his analysis is:

1. Contribution of idiosyncratic risk to aggregate savings is modest.

The aggregate savings rate increases by no more than 3 percentage
points.

2. Access to asset markets is quite important in smoothing out earn-

ings fluctuations. Asset markets allow an individual to cut their
consumption variability by half and enjoy a welfare gain worth
about 8% of GNP.

3. The model is consistent with certain features of the income and

wealth distribution. Particularly, the distributions are positively
skewed (median < mean). Wealth distributions are more unequal
than income distributions.

10.2 The Setup

In the Aiyagari model there is a distribution of consumer/workers of unit mass, each characterized by a different level of resources that they can access. Each person seeks to maximize their expected lifetime utility as given by

$$E[\sum_{t=0}^{\infty}\beta^t U(c_t)].$$

An individual’s period-$t$ labor supply, $l_t \in [l_{min}, l_{max}]$, is an independently and identically distributed random variable drawn from the cumulative distribution function $L(l_t)$ with $E[l_t] = 1$. A unit of labor is paid the wage rate $w$. To insure against the randomness in labor income a person can borrow or lend at the interest rate $r$. An individual’s assets in period $t$ are denoted by $a_t$. This is negative when the person is in debt. The maximum level of debt that a person can incur is $\phi$. People will have different levels of asset holdings because they will have different histories of labor supply. An individual’s period-$t$ budget constraint reads

$$c_t + a_{t+1} = wl_t + (1+r)a_t.$$

The person will also face the borrowing constraint

$$a_{t+1} \geq -\phi.$$

Production in the economy is given by the constant-returns-to-scale production function

$$y_t = F(k_t, 1) \equiv Y(k_t),$$

where $k_t$ is the period-$t$ aggregate per-capita capital stock and where the aggregate supply of labor is one. The aggregate capital stock, $k_t$, evolves according to

$$k_{t+1} = (1-\delta)k_t + i_t,$$

where $i_t$ is the period-$t$ aggregate per-capita level of savings in the economy.

10.3 A Person’s Choice Problem

Following Aiyagari, the model is analyzed in a transformed form. Define the variable $\hat{a}_{t+1}$ by

$$\hat{a}_{t+1} = a_{t+1} + \phi. \qquad (10.3.1)$$

The variable $\hat{a}_{t+1}$ represents the amount of cash that a person can draw on either through savings or borrowing. Likewise, let $z_{t+1}$ be given by

$$z_{t+1} = wl_{t+1} + (1+r)\hat{a}_{t+1} - r\phi. \qquad (10.3.2)$$

This represents the total amount of resources inclusive of labor income at the individual’s disposal. With these two changes in variables the budget and borrowing constraints can be rewritten as

$$c_t + \hat{a}_{t+1} = z_t,$$

and

$$\hat{a}_{t+1} \geq 0.$$
An individual’s dynamic programming problem can be cast as

$$V(z_t, \phi, w, r) = \max_{\hat{a}_{t+1} \geq 0}\{U(z_t - \hat{a}_{t+1}) + \beta\int V(z_{t+1}, \phi, w, r)dL(l_{t+1})\},$$

subject to (10.3.1) and (10.3.2). The Euler equation connected with the problem can have both an interior and a corner solution:

$$U_1(z_t - \hat{a}_{t+1}) = \beta(1+r)\int U_1(z_{t+1} - \hat{a}_{t+2})dZ(z_{t+1}|z_t), \text{ if } \hat{a}_{t+1} > 0,$$

and

$$U_1(z_t - \hat{a}_{t+1}) \geq \beta(1+r)\int U_1(z_{t+1} - \hat{a}_{t+2})dZ(z_{t+1}|z_t), \text{ if } \hat{a}_{t+1} = 0.$$

In the case of a corner solution, or when $\hat{a}_{t+1} = 0$, the individual would like to borrow more but they can’t since they have hit the borrowing constraint. Thus, the marginal benefit of current borrowing, $U_1(z_t - \hat{a}_{t+1})$, exceeds the expected future cost, $\beta(1+r)\int U_1(z_{t+1} - \hat{a}_{t+2})dZ(z_{t+1}|z_t)$.
The above programming problem will lead to a decision rule of the form

$$\hat{a}_{t+1} = A(z_t, \phi, w, r). \qquad (10.3.3)$$

The law of motion for resources then reads

$$z_{t+1} = wl_{t+1} + (1+r)A(z_t, \phi, w, r) - r\phi. \qquad (10.3.4)$$

Figure 10.3.1 plots the typical shapes for these functions, assuming that the interest rate $r$ lies below the rate of time preference, $\lambda \equiv 1/\beta - 1$. This assumption is verified later. To begin with, focus on (10.3.3), which is shown in the lefthand panel of Figure 10.3.1. There will exist some threshold level of resources, $\hat{z}$, below which the individual will hit the borrowing constraint, implying that $\hat{a}_{t+1} = 0$. This fact is proved below in Proposition 59. Above this point any increase in resources will be used both for consumption and either to write off debt or to save. Below this point, all resources are used for consumption and debt service, so an increase in resources goes entirely into consumption. The righthand panel shows the associated function for $z_{t+1}$ when evaluated at the two labor supply points $l_{t+1} = l_{min}$ and $l_{t+1} = l_{max}$. The function evaluated at other values of $l_{t+1}$ will lie between these lines. The long-run value for $z_{t+1}$, or $E[z]$, will be trapped between $z_{min}$ and $z_{max}$.

Figure 10.3.1: Figures Ia and Ib reproduced from Aiyagari (1994).

Proposition 59. Assume that $\beta(1+r) < 1$. Suppose that either $U_1(0) < \infty$ or $z_{min} \equiv wl_{min} - r\phi > 0$. Then there is a $\hat{z} > z_{min}$ such that for all $z_t < \hat{z}$, $c_t = z_t$ and $\hat{a}_{t+1} = 0$.

Proof. First, assume that $U_1(z_{min})$ is finite. This implies that $V_1(z_{min}) = U_1(z_{min} - \hat{a}_{t+1})$ is finite also, a fact that will be established later. The proof now proceeds by contradiction. Suppose to the contrary that the borrowing constraint is not binding. Then

$$V_1(z_t) = U_1(z_t - \hat{a}_{t+1}) = \underbrace{\beta(1+r)}_{<1}E[V_1(z_{t+1})] < V_1(z_{min}).$$

As $z_t \to z_{min}$ this results in a contradiction. So, there must be some neighborhood around $z_{min}$ for which the proposition is true.
Second, it needs to be proved that $V_1(z_{min})$ is finite. First, if $U_1(0)$ is finite then so will be $V_1(z_{min}) = U_1(z_{min} - \hat{a}_{t+1}) \leq U_1(0)$. Second, suppose alternatively that $z_{min} \equiv wl_{min} - r\phi > 0$. Then, expected lifetime utility is bounded below by $U(wl_{min} - r\phi)/(1-\beta)$. Now, $V_1(z_{min}) = \infty$ if and only if $\hat{a}_{t+1} = z_{min}$. For this to be true, it must transpire that $U_1(0) \leq \beta(1+r)E[V_1(wl_{t+1} + (1+r)z_{min} - r\phi)]$. Because $V$ is a concave function this requires that $V_1 = \infty$ over some measurable interval, say $[z_{min}, y]$. Now, $V(y) = V(z_{min}) + \int_{z_{min}}^{y}V_1(\omega)d\omega$, with $V(z_{min}) > U(wl_{min} - r\phi)/(1-\beta)$. If $V_1(\omega) = \infty$ over $[z_{min}, y]$ for some $y$ then $V(y) = \infty$, a contradiction since $V$ is bounded from above, an assumption imposed on dynamic programming problems.

10.4 Heterogeneity and Aggregation

Let $E[a_w]$ denote the long-run level of assets for the economy. Using (10.3.1) and (10.3.3) this is given by

$$E[a_w] = E[A(z, \phi, w, r)] - \phi.$$

Some features of this function are shown in the lefthand panel of Figure 10.4.1. Here as the rate of interest approaches the rate of time preference, $\lambda = 1/\beta - 1$, the economy’s holdings of assets grow without bound. Suppose that $r = \lambda$. Then, in a world without uncertainty it would be costless for a person to hang on to assets. In the world with uncertainty there is a positive probability that the individual could get a string of bad shocks. To insure against this, the person holds an infinitely large amount of assets.
In equilibrium the marginal product of capital will be equal to its user cost. Thus,

$$Y_1(k) = r + \delta.$$

The capital-to-labor ratio is given by

$$K(r) = Y_1^{-1}(r + \delta).$$

Since the aggregate supply of labor is one this is also the per-capita demand for capital. Given the constant-returns-to-scale assumption, the wage rate, $w$, is output less payments to capital, so that

$$w = Y(K(r)) - (r+\delta)K(r) \equiv W(r).$$

The economy’s equilibrium is portrayed in the righthand panel of Fig-

ure 10.4.1. From the diagram it is apparent that r < 1/β − 1. This is
established formally in Proposition 60.

Proposition 60. [Huggett (1997)] In a stationary equilibrium, the interest rate lies below the rate of time preference (i.e., $r < 1/\beta - 1$), provided that a measurable set of agents is borrowing constrained.

Proof. Let $C(z)$ denote an agent’s decision rule for $c$. In the presence of borrowing constraints

$$U_1(C(z)) > \beta(1+r)\int U_1(C(z'))dZ(z'|z). \qquad (10.4.1)$$

The above equation will hold with equality for an individual who isn’t borrowing constrained. The stationary distribution for $z$, $Z(z')$, is defined by

$$Z(z') = \int Z(z'|z)dZ(z).$$

Now, assume that a positive mass of agents is borrowing constrained. Integrating both sides of (10.4.1) with respect to the stationary distribution gives

$$\int U_1(C(z))dZ(z) > \beta(1+r)\int\int U_1(C(z'))dZ(z'|z)dZ(z) = \beta(1+r)\int U_1(C(z'))dZ(z').$$

This can only be true if $\beta(1+r) < 1$, or $(1+r) < 1/\beta$.

Figure 10.4.1: Figures IIa and IIb from Aiyagari (1994).

10.4.1 Algorithm for Computation

Computing a solution to this model is remarkably simple. Just follow
these steps:

1. Enter each iteration $j$ with a guess for the interest rate, say $r_j < 1/\beta - 1$.

2. Compute the solution to an individual’s dynamic programming problem assuming this guess for the interest rate.

3. Compute $E[a_w]$ by a Monte Carlo simulation of the optimal decision rule over some large number of periods. Monte Carlo simulation is covered in Chapter 8. Aiyagari (1994) used an interpolated version of the discrete decision rule; see Chapters 8 and 9. In a stationary equilibrium, the sample path for the time series of $a_t$ will resemble the cross-section of $a_t$ at a point in time.

4. Check excess demand in the capital market, or $K(r_j) - E[a_w]$.

(a) Stop if $|K(r_j) - E[a_w]| <$ tolerance.

(b) If excess demand is positive, then raise the interest rate. If it is negative, then lower it. This can be done using a bisection routine; recall Section 2.6 in Chapter 2. Call the new guess $r_{j+1}$. Go back to step one with the new guess, $r_{j+1}$.

• Bisection routine. Enter each iteration $j$ with a guess for the interest rate, $r_j$, and a lower and upper bound, $\underline{r}$ and $\bar{r}$, such that $\underline{r} < r_j < \bar{r}$. Calculate excess demand $K(r_j) - E[a_w]$. If $|K(r_j) - E[a_w]| <$ tolerance, then stop. Otherwise, if excess demand is positive, then reset the lower bound so that $\underline{r} = r_j$, while if it is negative pick $\bar{r} = r_j$. Next, revise the guess for the interest rate by letting $r_{j+1} = (\underline{r} + \bar{r})/2$. [In the righthand panel of Figure 10.4.1 think about $r_1 = \underline{r}$ and $r_2 = \bar{r}$. By happenstance in the diagram, the true solution is $r^* = (r_1 + r_2)/2$.]

10.5 Calibration

It’s time to pick functional forms for tastes, technology, and the stochastic process for labor supply. The period length is taken to be one year, so the discount factor $\beta$ is set at 0.96. The model is simulated for various configurations of parameter values for these functions. Let momentary utility be represented by

$$U(c) = \frac{c^{1-\mu} - 1}{1-\mu}, \text{ where } \mu \in \{1, 3, 5\},$$

and the production technology specified as

$$Y(k) = k^{\alpha}, \text{ with } \alpha = 0.36.$$

In the analysis the model is computed for different values for the coefficient of relative risk aversion, $\mu$. Suppose that labor income has the following first-order autoregressive representation:

$$\ln(l_t) = \rho\ln(l_{t-1}) + \sigma\sqrt{1-\rho^2}\,\varepsilon_t, \text{ with } \varepsilon_t \sim N(0, 1),$$

and

$$\underbrace{\sigma \in \{0.2, 0.4\}}_{\text{coef. var.}} \text{ and } \underbrace{\rho \in \{0, 0.3, 0.6, 0.9\}}_{\text{auto corr.}}.$$

Studies indicate that a reasonable value of the coefficient of variation lies between 0.2 and 0.4. The autocorrelation coefficient probably lies below 0.6. Last, it is assumed that people can’t borrow, which is equivalent to setting $\phi = 0$.
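The scaling factor √(1 − ρ²) in the labor process means that the unconditional standard deviation of ln(l_t) equals σ whatever the value of ρ. The sketch below simulates the process for one parameter pair from the calibration and checks this. The sample length, the burn-in of 1000 periods, and the seed are arbitrary illustrative choices.

```matlab
% Simulate the AR(1) process for log labor and check its moments (sketch).
rng(1);                              % fix the seed for reproducibility
rho = 0.6; sigma = 0.2;              % one configuration from the calibration
T = 200000;                          % arbitrary sample length
e = randn(T,1);                      % standard normal innovations
lnl = zeros(T,1);                    % start at the unconditional mean of zero
for t = 2:T
    lnl(t) = rho*lnl(t-1) + sigma*sqrt(1 - rho^2)*e(t);
end
sd_hat = std(lnl(1001:T));           % should be close to sigma
c = corrcoef(lnl(1002:T), lnl(1001:T-1));
rho_hat = c(2,1);                    % should be close to rho
```

This uses randn from base MATLAB, so no toolbox is required.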

10.6 Results

10.6.1 Aggregate Savings

The impact of idiosyncratic risk on aggregate savings is moderate, at
least for reasonable values of σ, ρ, and µ.

1. Full insurance: In the full insurance version of the model individuals are perfectly insured against shocks to their labor supply. This is the neoclassical growth model, with $r = 1/\beta - 1$. The aggregate saving rate is just $\delta k/Y(k) = \delta kY_1(k)/[Y(k)Y_1(k)] = \alpha\delta/Y_1(k) = \alpha\delta/(r+\delta)$. Thus, the aggregate savings rate is not a function of $\mu$, $\sigma$, and $\rho$. The full insurance baseline gives $r = 4.17\%$ and a savings rate of 23.67%.

2. Moderate risk: σ = 0.4, ρ = 0.6, and µ = 3. Here a moderate value

for the coefficient of relative risk aversion is selected. Labor shocks
are not so persistent, which makes the shocks less risky because
they will not persist for that long. Aggregate savings increases by
3 percentage points. Aiyagari takes this case to be at the upper end
of what is reasonable.

3. High risk: σ = 0.4, ρ = 0.9, and µ = 5. Now, the individual is

quite risk averse. The labor shocks are quite persistent so that a bad
state will endure for some time. Now there is an increase in the
aggregate savings rate of about 14 percentage points.
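The full-insurance benchmark in item 1 is a two-line calculation. The sketch below uses β = 0.96 and α = 0.36 from the calibration. The depreciation rate δ = 0.08 is an assumption here, since the calibration section does not list it; it is the value under which the formula reproduces the savings rate quoted above.

```matlab
% Full-insurance benchmark: interest rate and aggregate savings rate (sketch).
beta = 0.96; alpha = 0.36;
delta = 0.08;                        % assumed depreciation rate
r = 1/beta - 1;                      % interest rate equals rate of time preference
srate = alpha*delta/(r + delta);     % savings rate = delta*k/Y(k)
fprintf('r = %.2f%%, savings rate = %.2f%%\n', 100*r, 100*srate);
```

The savings rates reported for the moderate-risk and high-risk cases are then measured as percentage-point increases over this benchmark.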

10.6.2 Importance of Asset Trading

Loss due to consumption variability (expressed as a fraction of consumption) is

$$\mu\sigma_c^2/2,$$

where $\sigma_c$ is the coefficient of variation in consumption. This is easy to show. To do so, take a second-order Taylor expansion of the utility function $U(c)$ around the mean level of consumption, $\bar{c}$. This yields

$$U(c) \simeq U(\bar{c}) + U_1(\bar{c})(c - \bar{c}) + U_{11}(\bar{c})(c - \bar{c})^2/2,$$

so that (for a small amount of risk)

$$E[U(c)] \simeq U(\bar{c}) + U_{11}(\bar{c})\bar{c}^2\sigma_c^2/2.$$

Therefore, the marginal loss from variability, expressed in units of consumption, is

$$-\frac{1}{\bar{c}U_1(\bar{c})}\frac{dE[U(c)]}{d\sigma_c^2} = \frac{1}{2}\underbrace{\left(-\bar{c}\frac{U_{11}(\bar{c})}{U_1(\bar{c})}\right)}_{\text{coef. rel. risk aver.}} = \frac{1}{2}\mu.$$

Consumption variability falls from 0.35 to 0.17 (when $\mu = 3$, $\sigma = 0.17$, and $\rho = 0.6$) when one moves from a world where agents’ assets are fixed at their per-capita amount to the current setting where assets may be optimally accumulated and depleted. This leads to an increase in welfare, measured in terms of consumption, of $3 \times (0.35^2 - 0.17^2)/2 \simeq 0.14$, which is about 8% of GNP.
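The welfare arithmetic, and the quality of the second-order approximation behind it, can be checked with a few lines of MATLAB. The first part reproduces the welfare gain from the numbers quoted above. The second part compares the approximate loss µσ_c²/2 with the exact utility loss for a simple two-point consumption lottery around c̄ = 1; the lottery and its size s are illustrative assumptions only.

```matlab
% Welfare gain from asset trading and a check of the Taylor approximation (sketch).
mu = 3;                               % coefficient of relative risk aversion
cv_fixed = 0.35; cv_opt = 0.17;       % consumption variability, from the text
gain = mu*(cv_fixed^2 - cv_opt^2)/2;  % gain as a fraction of consumption

% Two-point lottery: c = 1 + s or 1 - s with equal probability, so cbar = 1
% and the coefficient of variation is s. With cbar = 1, cbar*U1(cbar) = 1.
U = @(c) (c.^(1-mu) - 1)/(1-mu);      % CRRA momentary utility
s = 0.05;                             % small amount of risk (assumption)
exact  = U(1) - 0.5*(U(1+s) + U(1-s));% exact utility loss from the lottery
approx = mu*s^2/2;                    % second-order approximation
relerr = abs(exact - approx)/approx;  % small when risk is small
```

The approximation is tight for small risk and deteriorates as s grows, which is why the text qualifies the formula with "for a small amount of risk".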

10.6.3 Income Distribution

The distribution of income in the United States is unequal, as in most countries. This is reflected in the income distribution being skewed toward the right. A simple measure of this is the ratio of the median to the mean. When the distribution is skewed to the right the median level of income will be less than the mean. This transpires because the rich (or people in the upper portion of the income distribution, to the right of the median person) earn a lot more than the poor, which operates to pull the mean up. So, the lower is this measure, the greater is the level of income inequality.
The income distribution can be shown by a Lorenz curve. A Lorenz curve plots the percentage of income earned by all of the population below some percentile against that percentile. Two Lorenz curves for the United States are plotted in Figure 10.6.1, one for 2009 and the other for 2019. If the distribution of income is equal, then the Lorenz curve would lie on the 45-degree line. The further away the Lorenz curve is from the 45-degree line the higher is the degree of income inequality. For example, in the figure the population lying below the 50th percentile accounts for far less than 50 percent of income in the United States. According to the two Lorenz curves that are plotted income inequality was worse in 2009 than in 2019.

Margin note: Max O. Lorenz (1876-1959) was an American economist. He developed the Lorenz curve in an undergraduate essay! He published a paper on this in the Publications of the American Statistical Association while a graduate student in 1905.

The Gini coefficient is twice the area between the Lorenz curve and the 45-degree line. The bigger this area, the more unequal is the income distribution. Suppose that one is given a sample of incomes, $\{i_j\}_{j=1}^{n}$. Then, the Gini coefficient is given by $\sum_{j=1}^{n}\sum_{k=1}^{n}|i_j - i_k|/(2n\sum_{j=1}^{n}i_j)$. As can be seen, it measures the differences in incomes, or the $i$’s. Normally, the Gini is thought of as having a value of 0 if all incomes are equal, and a value of 1 if one person has all of the income.
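A minimal MATLAB sketch of the sample Gini coefficient follows. It sums the absolute differences over all ordered pairs of incomes and normalizes by 2n times total income (equivalently, 2n² times mean income). The income vector is a made-up example, not data.

```matlab
% Gini coefficient from a sample of incomes (sketch with made-up data).
inc = [10; 20; 30; 40];                  % hypothetical incomes
n = numel(inc);
D = abs(bsxfun(@minus, inc, inc'));      % n-by-n matrix of |i_j - i_k|
gini = sum(D(:))/(2*n*sum(inc));         % sample Gini coefficient

% Two limiting cases for comparison
eq = ones(n,1);                          % everyone has the same income
gini_eq = sum(sum(abs(bsxfun(@minus, eq, eq'))))/(2*n*sum(eq));
one = [zeros(n-1,1); 1];                 % one person has all the income
gini_one = sum(sum(abs(bsxfun(@minus, one, one'))))/(2*n*sum(one));
```

In a finite sample the case where one person holds everything gives (n − 1)/n, which approaches 1 as n grows. The pairwise matrix needs n² storage, so for large simulated samples a sort-based formula is usually preferred.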
The model yields positively skewed income and wealth distribu-
tions, but falls short of the amount displayed in the data.

1. Data:

(a) Median income is about 80 percent of mean income.

(b) Gini coefficient for income and wealth are 0.40 and 0.80.

2. Model:

(a) Median income is over 90 percent of mean income for all pa-
rameterizations.
(b) When σ = 0.2, ρ = 0.6, and µ = 5, the Gini coefficients for
income and wealth are 0.12 and 0.32.

Figure 10.6.1: Lorenz curves for the U.S. (legend: line of equality, 2009 Lorenz curve, 2019 Lorenz curve).

10.7 Aggregate Uncertainty

Can the model be computed with aggregate uncertainty? The answer is yes. To do so, let the aggregate production function include total factor productivity $z_t$ so that $y_t = z_tY(k_t)$ and assume productivity follows a first-order autoregressive process (AR(1)) in logs with $\rho$ as the serial correlation parameter. Thus, $\ln z_t = \rho\ln z_{t-1} + \varepsilon_t$, where $\varepsilon_t$ is white noise. The issue here is that the entire distribution of wealth is a state variable off steady state. It will change as the aggregate economy is shocked.
An algorithm for computing the Aiyagari (1994) model with aggregate uncertainty is outlined in Boppart et al. (2018). Their algorithm involves two key steps. In the first step, deterministic dynamics are computed for the model. The impulse response functions that arise from the deterministic dynamics are calculated for variables of interest, say the aggregate capital stock, aggregate consumption, the Gini coefficient, etc. These impulse response functions are only computed once. In turn, the second step involves undertaking a Monte Carlo simulation of these impulse response functions to obtain statistics of interest for the aggregate economy with uncertainty.

Margin note: Timo Boppart, Per Krusell, and Kurt Mitman are economists at the Institute for International Economic Studies in Stockholm, Sweden.

10.7.1 Deterministic Dynamics

To compute deterministic dynamics for the Aiyagari model a version of the extended path algorithm discussed in Chapter 6 can be used. Start the economy off from the steady-state wealth distribution in the Aiyagari model. This is done using discrete-state-space dynamic programming. Here it is important to create a grid of asset holdings with more points at the lower end since the distribution will feature a large mass of individuals close to the borrowing constraint.1 The steady-state wealth distribution is computed as the invariant distribution of the Markov chain connected with the dynamic programming problem; see Chapters 8 and 9 for a discussion of Markov chains. Then, do a one-time unforeseen shock to the innovation at the start of time. In particular, let $\varepsilon_1 = \sigma$, where $\sigma$ is one standard deviation. Hence, the sequence of innovations is just $\{\varepsilon_1 = \sigma, 0, 0, 0, \cdots\}$. Thus, $z$ will jump upon impact and then return to its steady-state level (i.e., 1). In particular, $\ln z_t$ will follow the time path $\ln z_t = \rho^{t-1}\varepsilon_1$. Assume that the economy will converge back to its initial steady-state level of capital by time $T$.

Footnote 1: This can be achieved by defining a point in the asset grid as follows: $a_j = \underline{a} + (\bar{a} - \underline{a})\left(\frac{j-1}{n-1}\right)^{\alpha}$, where $\underline{a}$ is the minimum level of asset holdings, $\bar{a}$ is the maximum level of asset holdings, $n$ is the total number of grid points, and $\alpha$ is a curvature parameter (usually set to 7).
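The two ingredients described above, a curved asset grid and the deterministic path for productivity after a one-time innovation, can be sketched in MATLAB as follows. The grid bounds, the number of points, and the values of ρ, σ and T are hypothetical choices made for illustration; the curvature of 7 follows the footnote.

```matlab
% Curved asset grid and the productivity path after a one-time shock (sketch).
amin = 0; amax = 50;                 % hypothetical bounds on asset holdings
n = 11;                              % hypothetical number of grid points
curv = 7;                            % curvature parameter from the footnote
j = (1:n)';
agrid = amin + (amax - amin)*((j-1)/(n-1)).^curv;  % points bunch near amin

rho = 0.9; sigma = 0.01;             % hypothetical persistence and shock size
T = 50;                              % hypothetical horizon
t = (1:T)';
lnz = rho.^(t-1)*sigma;              % ln z_t = rho^(t-1)*eps_1 with eps_1 = sigma
z = exp(lnz);                        % productivity path, decaying back to 1
```

With a curvature above one the spacing between neighboring grid points widens as assets rise, which places most of the points near the borrowing constraint where the decision rule bends the most.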

1. Enter an iteration with a guess for the time path of the aggregate
capital stock $\{k_t\}_{t=1}^{T}$, denoted by $\{k_t^{j}\}_{t=1}^{T}$. Note that a guess for the
time path for the aggregate capital stock will imply a guess for the
time paths for the interest and wage rates.

2. Given this guess path for aggregate capital stock, compute the rep-
resentative agent’s value functions and decision rules starting at pe-
riod T − 1 and then work backwards to period 1.

3. Now, use the obtained decision rule for savings and the idiosyncratic
transition probabilities to simulate forward in time the evolution
of the distribution of wealth. So, for example, in period 1 use
the decision rule for savings to compute the wealth distribution
for period 2, and likewise in period t use the decision rule for savings
to compute the wealth distribution for period t + 1. This is done
using the transition matrix connected with the dynamic programming
problem.

4. The wealth distribution computed for each period t implies an
aggregate capital stock for that period. Thus, a sequence for the
aggregate capital stock will obtain, $\{k_t\}_{t=1}^{T}$. Check
$\sum_{t=1}^{T} |k_t - k_t^{j}| <$ tolerance.

(a) If so, exit the algorithm since a solution has been found.
(b) If not, set $\{k_t^{j+1}\}_{t=1}^{T} = \{k_t\}_{t=1}^{T}$. Repeat step one using this new
guess.

5. Upon convergence save the time paths (or the impulse response
functions) for the variables of interest. The impulse response
functions never need to be computed again.
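The outer loop of the algorithm can be sketched as a fixed-point iteration on the capital path. In the sketch below the function handle impliedPath is hypothetical: it stands in for steps 2 to 4 (backward induction, forward simulation of the distribution, and aggregation) and is just a toy contraction, not the Aiyagari model. The steady-state value and the other numbers are likewise assumed.

```matlab
kss     = 4;       % steady-state capital stock (assumed)
T       = 50;      % horizon
tol     = 1e-8;    % convergence tolerance
maxIter = 500;     % cap on the number of iterations
t       = (1:T)';

% Hypothetical stand-in for steps 2-4. It maps a guessed path into an
% implied path and is a toy contraction chosen only for illustration.
impliedPath = @(kGuess) kss + 0.5*(kGuess - kss) + 0.05*0.9.^(t - 1);

kGuess = kss*ones(T, 1);            % step 1: start from the steady state
for iter = 1:maxIter
    kNew = impliedPath(kGuess);     % steps 2-4: path implied by the guess
    gap  = sum(abs(kNew - kGuess)); % sum over t of |k_t - k_t^j|
    if gap < tol                    % step 4(a): a solution has been found
        break
    end
    kGuess = kNew;                  % step 4(b): use the new path as the guess
end
```

In applied work the update in step 4(b) is often damped by taking a weighted average of the old and new paths. That is a common practice rather than something stated in the text.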

10.7.2 Simulating the Impulse Response Functions

Let X(j) represent the baseline impulse response function for some
generic variable, x, that was obtained in the previous step. The
variable x is measured in logarithms as the deviation from the logarithm of
its steady-state value, $x^{*}$. That is, 100 × X(j) gives the percentage
deviation of the variable of interest relative to its steady-state level in the
j-th period following the one-shot innovation. The impulse response
function shifts up or down in proportion to the size of the unfore-
seen innovation. Hence, if the one-shot innovation was λε 1 instead of
ε 1 > 0, with λ > 1, then the impulse response function shifts up by
the factor λ.

Figure 10.7.1: Impulse Response Functions: The solid green line
shows the baseline response of x to an unexpected positive shock
in period 1 to ε of size σ. The response is measured in terms of
the gap between the logged variable and the log of its steady-state
value. The dashed blue line shows what would happen if instead a
larger shock, $\varepsilon_1 > \sigma$, occurred. The baseline impulse response
just needs to be scaled up by the factor $\varepsilon_1/\sigma$.

The response of $x_t$ to a sequence of innovations $\{\varepsilon_t, \varepsilon_{t-1}, \varepsilon_{t-2}, \cdots\}$ is
given by a scaled moving average of the past shocks multiplied by the
baseline impulse response function obtained from the deterministic
simulation, or

$$x_t = \sum_{j=1}^{J} \frac{\varepsilon_{t+1-j}}{\sigma} X(j).$$

In the above formula, the baseline impulse response function X ( j)

gives the impact on xt of an innovation of size σ that happened j peri-
ods ago. If the innovation was instead of size ε t+1− j , then things need
to be rescaled by ε t+1− j /σ.
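A Monte Carlo simulation is then used to generate the time-series
process for $x_t$. Specifically, draw a random sequence for the $\varepsilon_t$'s,
$\{\varepsilon_t\}_{t=1}^{N}$, where N is a large number. Next, simulate the changes in the $x_t$'s
as follows

$$\begin{aligned}
x_J &= \frac{\varepsilon_J}{\sigma} X(1) + \frac{\varepsilon_{J-1}}{\sigma} X(2) + \cdots + \frac{\varepsilon_1}{\sigma} X(J), \\
x_{J+1} &= \frac{\varepsilon_{J+1}}{\sigma} X(1) + \frac{\varepsilon_J}{\sigma} X(2) + \cdots + \frac{\varepsilon_2}{\sigma} X(J), \\
&\;\;\vdots \\
x_t &= \frac{\varepsilon_t}{\sigma} X(1) + \frac{\varepsilon_{t-1}}{\sigma} X(2) + \cdots + \frac{\varepsilon_{t+1-J}}{\sigma} X(J).
\end{aligned}$$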
Finally, compute the desired descriptive statistics from the obtained
(logged) deviations from the steady state.
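A minimal sketch of this simulation step follows. The stored impulse response X used here is a placeholder (a geometrically decaying sequence), and the values of J, N, and σ are assumed. The moving average is computed with MATLAB's conv function, which is one convenient way to form the sums above.

```matlab
rng(1);                          % fix the seed so the run is reproducible
J     = 40;                      % length of the stored impulse response
N     = 10000;                   % number of simulated periods
sigma = 0.01;                    % size of the baseline innovation (assumed)
X     = 0.5*0.9.^((1:J)' - 1);   % placeholder impulse response X(1),...,X(J)
epsl  = sigma*randn(N, 1);       % innovations eps_1,...,eps_N

% x_t = sum_{j=1}^{J} (eps_{t+1-j}/sigma)*X(j), for t = J,...,N.
xFull = conv(epsl/sigma, X);     % entry t sums over the lags available at t
x     = xFull(J:N);              % keep the periods with all J lags present

% Descriptive statistics of the (logged) deviations from the steady state.
stdX  = std(x, 1);               % standard deviation, normalized by N
C     = corrcoef(x(2:end), x(1:end-1));
rho1  = C(1, 2);                 % first-order autocorrelation
```

The first element of x corresponds to period J, the first date at which all J lagged innovations exist.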
A Mathematical Appendix

Some of the basic mathematics used in the book is reviewed here. This
should make the book self-contained for those rusty or unfamiliar with
the mathematics used. The presentation is cookbook in style and is ori-
ented toward discussing the uses of mathematics in the main text. An
excellent gentle and gradual introduction to the mathematics used in
economics is contained in Chiang (2011). A good introduction to prob-
ability and statistics is DeGroot (1975). Last, Bryant (1985) provides a
heuristic approach to real analysis.

A.1 Notation

• Upper case Roman letters usually denote a function.

• Lower case Roman letters usually represent a variable.

• Calligraphic letters are often used to represent sets or spaces.

• Greek letters are usually parameters.

• ≡ is a symbol meaning equal to by definition.

• ≃ denotes approximately equal to.

• ∈ is shorthand for contained in.

• | x | is either the absolute value of the scalar x or the norm of a vector x.

• R represents the real numbers and R+ is the positive reals.

• F : X → Y . The function F maps the space X (the domain) into the

space Y (the range).

• F ( x, y) denotes a function of two variables, x and y.

• F1, F2, F11, F12, and F22. These denote various derivatives of the function
F. Specifically, F1 is the partial derivative of F with respect to
its first argument, x. Thus, F1 ≡ dF/dx. Likewise, F2 ≡ dF/dy. Next, F11
is the derivative with respect to x of the first derivative F1 so that
F11 ≡ d²F/dx². Finally, F12 ≡ d²F/(dxdy) and F22 ≡ d²F/dy².

• e or exp is Euler's number or 2.7182···.

• ln is the natural logarithm. I.e., the logarithm using the base e.

• Pr is shorthand for probability.

• mod is the modulo operator. Used to characterize the remainder

from a division.

• floor is notation for round down to the nearest integer.

• i is the square root of −1, which is an imaginary number.

• E is the expectations operator. So, E[ x ] is the expected value of x.

• ′ (a prime) is used to signify the value of a variable one period down the road.

A.2 Maximizing a Function

Maximization is at the heart of economics. Economic actors try to do

the best for themselves. So, people maximize their utility and firms
maximize their profits. Mathematically speaking this corresponds to
maximizing a function. Functions are everywhere in economics: cost
functions, production functions, utility functions to name a few.
Consider the function
y = F ( x ),
which maps the real-valued variable x into a real value for the variable
y. By definition, a function associates each value of x with a unique
value for y. Take x to be a nonnegative number, so x ∈ R+ , and y to
be some real number, implying y ∈ R. Thus, F : R+ → R. Assume
that F is continuously twice differentiable. Denote the first and second
derivatives of F by

$$F_1(x) \equiv \frac{dF(x)}{dx} \quad \text{and} \quad F_{11}(x) \equiv \frac{d^2F(x)}{dx^2} = \frac{dF_1(x)}{dx}.$$
The first derivative gives the impact that a small change in x will have
on y. The second derivative specifies how the first derivative changes
in response to a small shift in x. In other words, it says how the
change in y in response to a tiny shift in x, itself, changes with a small
movement in x.
Now, consider the unconstrained maximization problem

$$\max_{x} F(x). \tag{A.2.1}$$

Here the value of x is sought that maximizes the function F ( x ). At a

maximum, the following first-order condition must obtain

F1 ( x ) = 0.

This condition is necessary for a local maximum. Suppose to the con-

trary that at a maximum F1 ( x ) > 0. Then, a small shift up in x would
increase F ( x ), a contradiction. The above first-order condition repre-
sents one equation in one unknown, x. The first-order condition spec-
ifies a local maximum, instead of a local minimum (or an inflection
point), if the second-order condition shown below holds

F11 ( x ) < 0.

Let x∗ denote the value of x that maximizes the function, F(x).

When the second-order condition holds, a small increase in x must
cause the function F ( x ), when evaluated at x ∗ , to decrease, because
F1 ( x ) becomes negative. Likewise, a small decrease in x induces F1 ( x )
to become positive, implying that the reduction in x also results in a
decline in F ( x ). Therefore, x ∗ must maximize F ( x ), at least locally.

A.2.1 Strict Concavity (Convexity) and the Second-Order Condition for a

Maximum (Minimum)
Now, a function with a negative second derivative is strictly concave. That
is, if F11(x) < 0 for all x, then F is strictly concave. In this
situation, the second-order condition for a maximum will automatically
hold; hence, for such functions the first-order condition is
both necessary and sufficient for characterizing a maximum. By
contrast, a function with a positive second derivative, so that
F11(x) > 0 for all x, is strictly convex. In this case, the first-order
condition is both necessary and sufficient for a minimum to hold.
This situation is portrayed by Figure A.2.1 for a typical case in eco-
nomics. At the peak of the function the slope or the first-derivative
is zero. Note that the second-derivative is negative. That is, the first-
derivative declines as you move from left to right. This occurs because
the objective function is strictly concave in x.

A.2.2 Envelope Theorem

Rewrite problem (A.2.1) as

V (α) = max F ( x; α),

where α is an exogenous parameter. The function V(α) gives the optimized
value of F for a given value of α. One can ask how this optimized
value of F changes with the parameter α. Now, let x ∗ be the opti-
mal value for x. This value for x must solve the first-order condition
attached to the above problem. That is, x ∗ must solve the equation

F1 ( x ∗ ; α) = 0.

Figure A.2.1: Finding an unconstrained maximum.

By definition,
V ( α ) = F ( x ∗ ; α ),
because x ∗ is the value of x that maximizes F ( x ). Differentiate both
sides of the above expression with respect to α to get

$$\frac{dV(\alpha)}{d\alpha} = \underbrace{F_1(x^{*}; \alpha)}_{=0}\,\frac{dx^{*}}{d\alpha} + F_2(x^{*}; \alpha).$$

By the first-order condition, F1 ( x ∗ ; α) = 0, any induced variation in x ∗

caused by a change in α washes out implying

$$\frac{dV(\alpha)}{d\alpha} = F_2(x^{*}; \alpha).$$
This occurs because at the maximum a small change in x will have no
impact on the objective function. As can be seen from Figure A.2.1, at
the top of the function an infinitesimal step left or right won’t change
the value of the function. The envelope theorem is used in Chapters 6
and 9.
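The envelope theorem can be checked numerically. The sketch below uses an objective chosen only for illustration, F(x; α) = α ln x − x, for which the maximizer is x∗ = α and F2(x∗; α) = ln x∗. The search interval and step size are assumptions.

```matlab
F     = @(x, a) a*log(x) - x;          % illustrative objective, not from the text
opts  = optimset('TolX', 1e-10);       % tight tolerance for the maximizer
xStar = @(a) fminbnd(@(x) -F(x, a), 1e-6, 20, opts);   % maximizer given alpha
V     = @(a) F(xStar(a), a);           % optimized value, V(alpha)

alpha = 2;
h     = 1e-3;                          % step for the numerical derivative
dVnum = (V(alpha + h) - V(alpha - h))/(2*h);   % numerical dV/dalpha
F2    = log(xStar(alpha));             % F_2(x*; alpha) = ln(x*) for this F
```

Both dVnum and F2 should be close to ln 2, even though x∗ itself moves with α when V is differentiated numerically.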

A.2.3 Corner Solutions

In economics corner solutions to maximization problems often occur.
For example, perhaps a person wants to set their hours worked in the
labor force to be zero, or likewise, they do not want to acquire any skill
by attaining a post-secondary education. Now, suppose that there is a
lower bound on x, denoted by xl , so that the constraint x ≥ xl must
hold. The maximization problem above now appears as

$$\max_{x \ge x_l} F(x).$$

One of two solutions may obtain to this constrained maximization

problem; viz., an interior solution or a corner solution. The interior
solution is described as before by the first-order condition

F1 ( x ) = 0.

The corner solution occurs when

F1 ( xl ) < 0.

This is shown by the right-hand panel of Figure A.2.2. Here, the peak
of the function cannot be attained because the lower bound on x has
been hit. The slope is negative at x = xl . Because F1 ( x ) < 0 at x = xl , a
small reduction in x would increase the value of the objective function,
F ( x ). This cannot be done due to the presence of the lower bound, xl .
Alternatively, x could be constrained by an upper bound, xu , requiring
that x ≤ xu . Now the corner solution happens when

F1 ( xu ) > 0.

A small increase in x from xu would raise the value of objective func-

tion, but this isn’t feasible, because the upper bound, xu , has been hit.
The left-hand side of Figure A.2.2 illustrates this situation.
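The test for a corner at a lower bound can be sketched as follows. The objective is an illustrative strictly concave function, and the two lower bounds are assumed values chosen so that one case gives a corner and the other an interior solution.

```matlab
F  = @(x) -(x - 1).^2;       % illustrative strictly concave objective
F1 = @(x) -2*(x - 1);        % its first derivative

bounds = [2, 0];             % two lower bounds to try (assumed values)
xOpt   = zeros(size(bounds));
for i = 1:numel(bounds)
    xl = bounds(i);
    if F1(xl) < 0
        xOpt(i) = xl;                         % corner: slope negative at xl
    else
        xOpt(i) = fzero(F1, [xl, xl + 10]);   % interior: solve F1(x) = 0
    end
end
```

With the bound at 2 the slope is already negative, so the solution sits at the corner. With the bound at 0 the first-order condition picks out the interior peak at x = 1.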

Figure A.2.2: Constrained maximization. The right-hand side
panel shows the situation when a corner solution is hit at a lower
bound, while the left-hand side illustrates things for an upper
bound.

A.2.4 Constrained Maximization

Optimization in economics often involves maximization subject to con-
straints. For example, a consumer maximizes utility subject to a bud-
get constraint or a firm minimizes costs subject to a production func-
tion. Consider maximizing the function F ( x, y), with respect to the
decision variables x and y, subject to a constraint given by the func-
tion y = G ( x ). There are two ways to proceed here. First, one could
just replace y in the objective function with the function G ( x ) and then
maximize with respect to the single variable, x. That is, one could
solve the problem
$$\max_{x} F(x, G(x)).$$

By defining the new function $\tilde{F}(x) = F(x, G(x))$, it should be clear that

this reduces to the form of the maximization problem discussed above.
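As a small illustration of the substitution approach, suppose (purely as an example) that F(x, y) = ln x + ln y and that the constraint is y = G(x) = 1 − x. The problem then collapses to a one-variable search, here done with fminbnd on the negative of the objective.

```matlab
F      = @(x, y) log(x) + log(y);   % illustrative objective
G      = @(x) 1 - x;                % illustrative constraint, y = G(x)
Ftilde = @(x) F(x, G(x));           % objective after substituting out y

opts   = optimset('TolX', 1e-10);
xStar  = fminbnd(@(x) -Ftilde(x), 1e-6, 1 - 1e-6, opts);
yStar  = G(xStar);                  % implied value of y
```

For this example the analytical answer is x = y = 1/2, which the numerical search should reproduce.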

A.3 Total Differentials

Consider the function

z = F ( x, y).
What would happen to z if both x and y are changed by some arbitrary
small amounts? Denote the small changes in x and y by dx and dy,
respectively. These are called differentials. Likewise, the induced total
change in z is represented by dz. The total change in z is given by

dz = F1 ( x, y)dx + F2 ( x, y)dy, (A.3.1)

where F1 ( x, y) ≡ dF ( x, y)/dx and F2 ( x, y) ≡ dF ( x, y)/dy. The above

expression decomposes the change in z into two factors. The first term
on the right-hand side is the change in z that results from the shift in
x. The shift in x is represented by dx. To get the induced shift in z,
this is multiplied by the (partial) derivative F1 ( x, y), which translates a
shift in x into a shift in z. The second term does the same thing for y.

A.3.1 The Total Derivative

The differentials dz, dx, and dy can be manipulated to obtain deriva-
tives. For example, one could divide the above equation through by
dx to obtain
$$\frac{dz}{dx} = F_1(x, y) + F_2(x, y)\frac{dy}{dx}.$$
The term dz/dx is the total derivative of z with respect to x. The
change in x has both a direct and indirect effect on z, as shown by
the first and second terms on the right-hand side. The indirect effect
occurs because the change in x may induce a change in y, as given by
dy/dx, which in turn will affect z via F2 ( x, y). To compute the indirect

effect more information would be needed. To illustrate, perhaps y is

given by the function y = G ( x ). Then, dy/dx = G1 ( x ), implying
dz/dx = F1(x, y) + F2(x, y)G1(x). Alternatively, perhaps y is not a
function of x. Then, dy/dx = 0, implying dz/dx = F1(x, y). As another example, perhaps x
and y are functions of another variable t, which changed by the small
amount, dt. Then, dividing both sides of (A.3.1) by dt gives

$$\frac{dz}{dt} = F_1(x, y)\frac{dx}{dt} + F_2(x, y)\frac{dy}{dt}.$$
Here, the derivatives dx/dt and dy/dt depend on the specified func-
tional dependencies of x and y on t.
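The total derivative formula can be verified with a finite difference. The functions below are illustrative choices (z = x²y with y = sin x), not ones used in the text.

```matlab
F  = @(x, y) x.^2.*y;        % illustrative function z = F(x, y)
F1 = @(x, y) 2*x.*y;         % partial derivative with respect to x
F2 = @(x, y) x.^2;           % partial derivative with respect to y
G  = @(x) sin(x);            % illustrative dependence y = G(x)
G1 = @(x) cos(x);            % dy/dx

x0 = 1.3;                    % evaluation point (assumed)
h  = 1e-6;                   % step for the central difference

% Direct effect plus indirect effect through y.
analytic = F1(x0, G(x0)) + F2(x0, G(x0))*G1(x0);

% Numerical derivative of z = F(x, G(x)) with respect to x.
numeric  = (F(x0 + h, G(x0 + h)) - F(x0 - h, G(x0 - h)))/(2*h);
```

The two numbers should agree up to the error of the finite-difference approximation.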

A.4 Intermediate Value Theorem

Let F ( x ) be a continuous function whose domain contains the interval

[ a, b]. The function F ( x ) takes on every value between F ( a) and F (b)
as x traverses the interval [ a, b].
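One practical use of this theorem is root finding: if F(a) and F(b) have opposite signs, then F must equal zero somewhere in [a, b], and the interval can be halved repeatedly to locate the root. A minimal bisection sketch, with an illustrative function and an assumed stopping width, is:

```matlab
F = @(x) x.^3 - 2;           % illustrative continuous function
a = 0;                       % F(a) < 0
b = 2;                       % F(b) > 0, so a root lies in [a, b]

while (b - a) > 1e-10
    m = (a + b)/2;           % midpoint of the current interval
    if sign(F(m)) == sign(F(a))
        a = m;               % the sign change, and so the root, is in [m, b]
    else
        b = m;               % the sign change, and so the root, is in [a, m]
    end
end
root = (a + b)/2;
```

Each pass keeps the half of the interval over which the function changes sign, so the bracket always contains a root.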

A.5 The Implicit Function Theorem

Consider an equation of the form

F ( x, y) = 0.

Here F is a function. Think about x as being an endogenous variable

and y as being an exogenous one. Then, the above expression repre-
sents one equation in one unknown variable, x. Does a solution exist
where one could write
x = G ( y ), (A.5.1)
where G is some function? Now, let F : R2 → R be a continuously
differentiable function. The implicit function theorem states that,
provided $F_1(x, y) \neq 0$ at the solution point, there will indeed be a
continuously differentiable solution of the form (A.5.1) where

F ( G (y), y) = 0.

A.6 First- and Second-Order Taylor Expansions

First- and second-order Taylor expansions, without the remainder terms,

are illustrated for the case of a bivariate function. Let F ( x, y) be a twice
differentiable function of two variables, x and y. The function F ( x, y)
can be approximated around the point ( x ∗ , y∗ ) by using either a first-
or second-order Taylor expansion as follows:

$$F(x, y) \simeq F(x^{*}, y^{*}) + F_1(x^{*}, y^{*})(x - x^{*}) + F_2(x^{*}, y^{*})(y - y^{*}), \quad \text{(first order)}$$

and

$$\begin{aligned}
F(x, y) \simeq{} & F(x^{*}, y^{*}) + F_1(x^{*}, y^{*})(x - x^{*}) + F_2(x^{*}, y^{*})(y - y^{*}) \\
& + \frac{1}{2}F_{11}(x^{*}, y^{*})(x - x^{*})^2 + F_{21}(x^{*}, y^{*})(x - x^{*})(y - y^{*}) \\
& + \frac{1}{2}F_{22}(x^{*}, y^{*})(y - y^{*})^2, \quad \text{(second order)}.
\end{aligned}$$
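To see the two approximations at work, the sketch below applies them to an illustrative function, F(x, y) = exp(x) ln(y), expanded around the point (0, 1). The function, the expansion point, and the evaluation point are all assumptions made for the example.

```matlab
F   = @(x, y) exp(x).*log(y);    % illustrative function
xs  = 0;  ys = 1;                % expansion point (x*, y*)

% Derivatives of F evaluated at (x*, y*), computed by hand for this F.
F0  = F(xs, ys);
F1  = exp(xs)*log(ys);           % dF/dx
F2  = exp(xs)/ys;                % dF/dy
F11 = exp(xs)*log(ys);           % d2F/dx2
F21 = exp(xs)/ys;                % d2F/(dxdy)
F22 = -exp(xs)/ys^2;             % d2F/dy2

x  = 0.1;  y = 1.1;              % point at which to approximate F
dx = x - xs;
dy = y - ys;

first  = F0 + F1*dx + F2*dy;
second = first + 0.5*F11*dx^2 + F21*dx*dy + 0.5*F22*dy^2;

err1 = abs(F(x, y) - first);     % first-order approximation error
err2 = abs(F(x, y) - second);    % second-order approximation error
```

Near the expansion point the second-order error is noticeably smaller than the first-order one.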

A.7 Unimodal Function

Definition 61. (Unimodal function) A function F ( x ) is unimodal if for

some value x ∗ , it is monotonically increasing (decreasing) for x ≤ x ∗
and monotonically decreasing (increasing) for x ≥ x ∗ . Clearly, the
only maximum (minimum) for F ( x ) is F ( x ∗ ).

A.8 The Golden Ratio

The golden ratio, often denoted by ψ, is the positive solution to the
polynomial $\psi^2 - \psi - 1 = 0$. By using the quadratic formula, it is
easy to calculate that $\psi = (1 + \sqrt{5})/2 = 1.61803398874\cdots$.
Interestingly, $1/\psi = 0.61803398874\cdots$, because the above polynomial can
be restated as $\psi - 1 = 1/\psi$.
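These properties are easy to confirm numerically. In the short check below the variable names are arbitrary.

```matlab
psi      = (1 + sqrt(5))/2;        % golden ratio from the quadratic formula
residual = psi^2 - psi - 1;        % should be zero up to rounding
gap      = (psi - 1) - 1/psi;      % psi - 1 and 1/psi should coincide

r        = roots([1 -1 -1]);       % both roots of the polynomial
psiRoot  = max(r);                 % the positive root
```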

A.9 Euler’s Theorem

Lemma 62. (Euler’s theorem) Consider a function, F (k, h), which is homoge-
nous of degree one in k and h; i.e., exhibits constant returns to scale in k and
h. Then,
F (k, h) = F1 (k, h)k + F2 (k, h)h.

Proof. Since F is homogenous of degree one in k and h,

λF (k, h) = F (λk, λh).

Differentiating both sides with respect to λ, and then evaluating at λ = 1, gives

F (k, h) = F1 (k, h)k + F2 (k, h)h.

Remark 63. In the main text, F (k, h) is a constant-returns-to-scale pro-

duction function and F1 (k, h) and F2 (k, h) are the marginal products
of capital and labor. In competitive equilibrium F1 (k, h) and F2 (k, h)
will be equal to the rental rate on capital, r, and the wage rate, w.
Therefore, F (k, h) = rk + wh.
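Euler's theorem can be illustrated with a Cobb-Douglas production function, a standard constant-returns-to-scale example. The capital share and the input levels below are assumed values.

```matlab
theta = 0.36;                                   % capital share (assumed)
F  = @(k, h) k.^theta.*h.^(1 - theta);          % Cobb-Douglas production
F1 = @(k, h) theta*k.^(theta - 1).*h.^(1 - theta);   % marginal product of capital
F2 = @(k, h) (1 - theta)*k.^theta.*h.^(-theta);      % marginal product of labor

k = 3.5;                                        % capital (assumed)
h = 0.3;                                        % labor (assumed)

output   = F(k, h);
payments = F1(k, h)*k + F2(k, h)*h;             % rk + wh with r = F1, w = F2
scaled   = F(2*k, 2*h);                         % doubling both inputs
```

Factor payments exhaust output, and doubling both inputs doubles output.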

A.10 Eigenvalues and Eigenvectors

Let T be an n × n matrix. An eigenvalue/eigenvector pair satisfies the

equation
eT = εe,

where e is the 1 × n (left) eigenvector and the scalar ε is the associated

eigenvalue. Transposing both sides, this can also be expressed as

T′e′ = εe′,

where e′, the transpose of e, is an n × 1 (right) eigenvector of the
transposed matrix T′, again with the scalar ε being the associated
eigenvalue. When T is symmetric, e′ is also a right eigenvector of T itself.
The eigenvalues of the matrix T solve the equation

det(T − εI) = 0,

where I is the n × n identity matrix. This equation yields a polynomial
of degree n that may have up to n distinct (potentially complex) roots
or eigenvalues. The eigenvalues are the n values of ε that solve the
characteristic polynomial

det(T − εI) = (v1 − ε)(v2 − ε) · · · (vn − ε) = 0.

These values may be repeated and complex.
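In MATLAB, eig returns right eigenvectors. A left eigenvector of T can therefore be obtained as a right eigenvector of the transpose of T. The sketch below does this for an illustrative 2-by-2 transition matrix and picks out the eigenvector associated with the unit eigenvalue.

```matlab
T = [0.9 0.1; 0.2 0.8];           % illustrative transition matrix (rows sum to 1)

[W, D]   = eig(T');               % columns of W are right eigenvectors of T'
lambda   = diag(D);               % the eigenvalues of T' (and so of T)
[~, idx] = min(abs(lambda - 1));  % locate the unit eigenvalue

e = W(:, idx)';                   % 1-by-n left eigenvector of T
e = e/sum(e);                     % normalize so the elements sum to one

check = det(T - lambda(1)*eye(2));   % characteristic equation, should be ~0
```

After normalization e is the invariant distribution of the chain, since it satisfies eT = e.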

A.11 Descriptive Statistics

A.11.1 Mean and Median

The mean is just the average value of the data in a set. When the data is
ordered from the lowest to the highest value, the median is the middle
value.

Definition 64. (Mean) Let $\{x_i\}_{i=1}^{N}$ be a data series. The mean for the
series, $\mu_x$, is defined by
$$\mu_x = \frac{1}{N}\sum_{i=1}^{N} x_i.$$

Definition 65. (Median) Let $\{x_i\}_{i=1}^{N}$ be a data series ordered without
loss of generality such that $x_i \ge x_{i-1}$ for $2 \le i \le N$. The median for
the series, $\text{median}_x$, is defined by
$$\text{median}_x = \begin{cases} x_{(N+1)/2}, & \text{for } N \text{ odd;} \\ (x_{N/2} + x_{N/2+1})/2, & \text{for } N \text{ even.} \end{cases}$$

A.11.2 Standard Deviation

The standard deviation measures the amount of dispersion or varia-
tion in a data series. The bigger the number is the higher is the amount
of dispersion around the series’s mean. In business cycle analysis the
standard deviation of a series measures its fluctuations over time.

Definition 66. (Standard Deviation) Let $\{x_i\}_{i=1}^{N}$ be a data series. The
standard deviation for the series, $\sigma_x$, is defined by
$$\sigma_x = \sqrt{\frac{1}{N}\sum_{i=1}^{N}(x_i - \mu_x)^2} = \sqrt{\frac{1}{N}\sum_{i=1}^{N} x_i^2 - \mu_x^2},$$
where the mean, $\mu_x$, is
$$\mu_x = \frac{1}{N}\sum_{i=1}^{N} x_i.$$

A.11.3 Pearson Correlation Coefficient

Correlation coefficients measure the association between two data
series, say $\{x_i\}_{i=1}^{n}$ and $\{y_i\}_{i=1}^{n}$. A correlation coefficient takes a value
between −1 and 1, where a positive value indicates that two series
tend to move together while a negative one shows that they have a
proclivity to move oppositely to one another. The higher the correla-
tion coefficient is in absolute value the stronger is the association. A
value of 0 shows no association. The Pearson correlation coefficient
measures the degree of linear association between two series. In busi-
ness cycle analysis often one is interested in how a variable moves with
GDP. When a variable has a positive correlation (negative correlation)
with GDP it is called procyclical (countercyclical).

Definition 67. (Pearson correlation coefficient) The Pearson correlation
coefficient is a measure of the linear dependence (or correlation)
between the two data series, $\{x_i\}_{i=1}^{n}$ and $\{y_i\}_{i=1}^{n}$. It is defined by the
formula
$$\rho = \frac{\sum_{i}^{n}(x_i - \overline{x})(y_i - \overline{y})}{\sqrt{\sum_{i}^{n}(x_i - \overline{x})^2}\,\sqrt{\sum_{i}^{n}(y_i - \overline{y})^2}},$$
where the sample means, $\overline{x}$ and $\overline{y}$, are given by $\overline{x} = (\sum_{i}^{n} x_i)/n$ and
$\overline{y} = (\sum_{i}^{n} y_i)/n$. If there is a tendency when x rises above its mean for y
to do so as well, then the numerator will likely be positive and hence
so will be ρ. The opposite will be true, if there is a penchant for y to
fall below its mean when x rises above its own. If y and x have a strictly
positive (negative) linear relationship then ρ will be 1 (−1).

Figure A.11.1 illustrates the Pearson correlation coefficient between

x and y for some randomly generated series. The two series for x and
y are always positively associated. As can be seen, as ρ increases so
does the strength of the positive association.

Figure A.11.1: Pearson correlation coefficient. As one moves
from left to right, the degree of positive association between
the series for x and y increases. This is reflected in higher values
for the Pearson correlation coefficient, ρ.

A.11.4 Coefficient of Autocorrelation

The autocorrelation of a time series measures the correlation of the

series with a delayed facsimile of itself. In business cycle analysis it is
used to measure the degree of persistence in a time series.

Definition 68. (Autocorrelation) The autocorrelation coefficient is just

the correlation coefficient between the current and lagged value of a
variable.

The autocorrelation coefficient, since it is just a correlation coefficient,

has a value between −1 and 1.
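The descriptive statistics defined in this section can be computed directly from their formulas and compared with MATLAB's built-in functions. The two short data series below are made up for the illustration. Note that std(x, 1) divides by N, as in the definition above, whereas the default std(x) divides by N − 1.

```matlab
x = [2; 4; 4; 5; 7; 9; 11];        % made-up data series
y = [1; 3; 2; 5; 6; 8; 12];        % a second made-up series
N = numel(x);

mu = sum(x)/N;                     % mean
xs = sort(x);                      % order the data for the median
if mod(N, 2) == 1
    med = xs((N + 1)/2);           % N odd: the middle value
else
    med = (xs(N/2) + xs(N/2 + 1))/2;   % N even: average of the middle two
end

sd    = sqrt(sum((x - mu).^2)/N);      % standard deviation, first formula
sdAlt = sqrt(sum(x.^2)/N - mu^2);      % standard deviation, second formula

% Pearson correlation coefficient from the definition.
dx  = x - mean(x);
dy  = y - mean(y);
rho = sum(dx.*dy)/(sqrt(sum(dx.^2))*sqrt(sum(dy.^2)));

% Autocorrelation: correlation between current and lagged values.
C       = corrcoef(x(2:end), x(1:end-1));
rhoAuto = C(1, 2);
```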

A.12 The Uniform, Normal, and Weibull Distributions

Definition 69. (Uniform Distribution) A random variable $\tilde{x}$ is
distributed according to a uniform distribution $U : [\underline{x}, \overline{x}] \to [0, 1]$ if

$$\Pr[\tilde{x} \le x] = U(x) = \frac{x - \underline{x}}{\overline{x} - \underline{x}},$$

for $\underline{x} \le x \le \overline{x}$. The uniform distribution U(x) is just a straight line
that starts at 0 when $x = \underline{x}$ and ends at 1 when $x = \overline{x}$. The probability
density function connected with the uniform distribution is

$$U_1(x) = \frac{1}{\overline{x} - \underline{x}}.$$

The mean and variance of $\tilde{x}$ are given by $(\underline{x} + \overline{x})/2$ and $(\overline{x} - \underline{x})^2/12$.

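Definition 70. (Normal Distribution) A random variable $\tilde{x}$ is distributed
according to a normal distribution $N(\mu, \sigma^2) : (-\infty, \infty) \to [0, 1]$, with
mean µ and variance $\sigma^2$, if

$$\Pr[\tilde{x} \le x] = \int_{-\infty}^{x} \frac{1}{\sigma(2\pi)^{1/2}} \exp\left[\frac{-(\tilde{x} - \mu)^2}{2\sigma^2}\right] d\tilde{x},$$

where
$$\frac{1}{\sigma(2\pi)^{1/2}} \exp\left[\frac{-(\tilde{x} - \mu)^2}{2\sigma^2}\right],$$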
is the probability density function for a normal distribution. When the
logarithm of a variable is normally distributed the variable is said to
follow a log-normal distribution.

Definition 71. (Bivariate Normal Distribution) Two random variables
$\tilde{x}$ and $\tilde{y}$ are distributed according to a bivariate normal distribution N
with means $\mu_x$ and $\mu_y$, variances $\sigma_x^2$ and $\sigma_y^2$, and correlation ρ if

$$\frac{1}{2\pi\sigma_x\sigma_y\sqrt{1 - \rho^2}} \exp\left\{-\frac{1}{2(1 - \rho^2)}\left[\frac{(\tilde{x} - \mu_x)^2}{\sigma_x^2} - 2\rho\frac{(\tilde{x} - \mu_x)(\tilde{y} - \mu_y)}{\sigma_x\sigma_y} + \frac{(\tilde{y} - \mu_y)^2}{\sigma_y^2}\right]\right\},$$

is their joint probability density function.

Definition 72. (Weibull Distribution) A random variable $\tilde{x}$ is distributed
according to a Weibull distribution $W : [0, \infty) \to [0, 1]$ if

$$\Pr[\tilde{x} \le x] = W(x) = 1 - \exp[-(x/\eta)^{\beta}].$$

Here η > 0 is called the scale parameter and β > 0 is referred to as the
shape parameter. Depending on parameter values, the density function
for the Weibull distribution can fall or rise and then fall. The Weibull
distribution has an easy formula for the median of the distribution:

$$\eta[\ln(2)]^{1/\beta}.$$

The formulae for the mean and variance are somewhat more complicated:
$$\eta\Gamma((1 + \beta)/\beta)$$
and
$$\eta^2\Gamma((2 + \beta)/\beta) - [\eta\Gamma((1 + \beta)/\beta)]^2,$$
where Γ is the gamma function. The gamma function is built into most
numerical programming languages, such as MATLAB.
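As a sketch of how these formulas can be used, the code below evaluates the median, mean, and variance of a Weibull distribution with MATLAB's gamma function and compares them with a large random sample. The parameter values are assumed, and the draws are generated by inverting W(x), which avoids relying on any toolbox.

```matlab
eta  = 2;                          % scale parameter (assumed)
beta = 1.5;                        % shape parameter (assumed)

W   = @(x) 1 - exp(-(x/eta).^beta);            % Weibull distribution function
med = eta*log(2)^(1/beta);                     % median
mu  = eta*gamma((1 + beta)/beta);              % mean
v   = eta^2*gamma((2 + beta)/beta) - mu^2;     % variance

% Draw a sample by inverting W: if u is uniform on [0, 1], then
% x = eta*(-ln(1 - u))^(1/beta) has the distribution function W.
rng(1);                            % fix the seed for reproducibility
u     = rand(1e6, 1);
draws = eta*(-log(1 - u)).^(1/beta);
```

The sample mean and variance of draws should be close to mu and v, and W(med) should equal one half.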

A.13 The Strong Law of Large Numbers

Let $\{x_i\}_{i=1}^{n}$ be a sample of independently and identically distributed
random numbers drawn from some distribution with mean µ. Let the
mean of the random sample be denoted by
$$\overline{x}_n = \frac{1}{n}\sum_{i=1}^{n} x_i.$$

The strong law of large numbers states that

$$\Pr[\lim_{n \to \infty} \overline{x}_n = \mu] = 1.$$

In other words, as the sample size increases the mean of the sample
will approach the mean of the distribution with probability one.
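The law can be illustrated by tracking the running mean of a long sequence of random draws. The sketch below uses uniform draws on [0, 1], whose mean is 0.5. The sample size and the seed are arbitrary choices.

```matlab
rng(1);                        % fix the seed for reproducibility
n    = 1e6;                    % number of draws
mu   = 0.5;                    % mean of the uniform distribution on [0, 1]
x    = rand(n, 1);             % i.i.d. uniform draws
xbar = cumsum(x)./(1:n)';      % running sample mean after 1, 2, ..., n draws

earlyGap = abs(xbar(10) - mu); % gap after only ten draws
lateGap  = abs(xbar(end) - mu);% gap after all n draws
```

Plotting xbar against the number of draws shows the sample mean settling down around 0.5 as n grows.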
B Introduction to MATLAB

by Pengfei Han

This programming tutorial is an introduction to MATLAB, tailored for
Professor Jeremy Greenwood's course "Numerical Methods for Macroeconomists".
The programming language chosen for this course is MATLAB, a powerful and
popular language for solving numerical problems, including integration,
maximization, simulation, and numerical optimization. To embark
on this semester's programming adventure, this tutorial covers five basic
topics, as outlined on the next page: we will start with the elementary
building blocks of basic commands and flow control, and proceed to
defining functions and graphing techniques. As the last ingredient, we
will study how to solve a nonlinear equation.
As with learning any programming language, the engine of progress in
programming skill is, as always, "learning by doing". From
this perspective, the following sections of this tutorial only serve
as the key that opens the door to MATLAB, beyond which there is a vast
universe for you to explore.

B.1 Getting Started

Interface of MATLAB

Once you click on the MATLAB icon on your computer, five windows
will pop up on the screen:
Current Directory:
This is where your current MATLAB files are stored and managed.
Command Window:
This is where you type in your commands to MATLAB.
Command History:
This window tracks a sequence of your recent commands.
Workspace:
This is where the variables in your current program are kept.
Variable Editor:

The contents of the variables can be examined and edited in this sec-
tion.

Housekeeping

Prior to the main body of the coding work, there are three basic
commands commonly used for housekeeping:
clear all: removes all variables from the current workspace.
close all: deletes all figures whose handles are not hidden.
clc: clears all input and output from the command window, giving
you a "clean screen".

Loading Data

To import data into MATLAB, we can simply use the import wiz-
ard:
“HOME" → “Import Data", and choose the folder/file you wish to im-
port.
Alternatively, we can load the data by using the commands associated
with the type of file to be imported. For instance:
to import files with format "xls", use the command: xlsread('FileName.xls');
to import files with format "csv", use the command: csvread('FileName.csv');
A detailed example is given in the M-file “Ex_load_data.m".

Stopwatch Timer

To monitor the performance of our code, we can use a stopwatch
timer to measure the execution time. The stopwatch timer is started with
the command tic at the beginning of the code. To display the elapsed time,
simply use the command toc.

Help

Whenever you get lost in your programming endeavor, always re-

call the most powerful command in MATLAB: help. Help can be
obtained two ways. First, you can access the help menu. Just click on
the ? on the upper righthand side of the screen. Then navigate your
way through the index until you see what you want. Second, if you
know the name of the command or function that you are interested in,
then in the command window you can just type help name. This is
good for seeing the syntax associated with executing a command or
the options that are available.

A Line in a MATLAB File

A line in a MATLAB file is usually an executable statement. Usu-

ally it ends with a semicolon or ;. This tells MATLAB to run the line
silently; i.e., not to print the result of the executable statement to the
screen. If you want to see the output, then omit the semicolon. Often

your executable statement will run over one line. Then, you must put
a continuation statement at the end of line in the Editor before you go
onto the next line. This way MATLAB knows that the next line is part
of the same executable statement. The continuation statement is just
three dots, i.e., ...

B.2 Basic Commands

As the "Mat" in MATLAB suggests, this language is particularly
optimized for processing matrices. So, as a first programming
suggestion, it is a weakly dominant strategy to write
your code in matrix form (in contrast to using a loop) whenever you can.
The codes of the examples in this section can be found in the M-file
"Ex_basic_command.m".

B.2.1 Creation and Concatenation of Matrices

Creation of Matrices

Creating A General Matrix

The matrix is the essential element of MATLAB, and in general we can
simply create a matrix by enumerating its elements as follows:
M = [1 2 5; -1 20 7; 8 -9 3]
This creates a 3-by-3 matrix: M.
Identity, Zero, and One Matrices
As you will soon see in this class, we very frequently need to create
identity matrices, matrices of zeros, and matrices of ones. MATLAB has
handy functions for all of these tasks:
I = eye(5, 5)
This creates a 5-by-5 identity matrix: I.
O = zeros(2, 3)
This creates a 2-by-3 matrix with elements of zero: O.
Y = ones(3, 4)
This creates a 3-by-4 matrix with elements of one: Y.
Creating a Vector
Also frequently used in MATLAB are vectors, which can be created
this way:
V = 0 : 0.5 : 10;
This creates a row vector V with elements ranging from 0 to 10 with
increments of 0.5.

Concatenation of Matrices
In MATLAB the concatenation operator is “[ ]".
For example, we can join two matrices side by side as follows:
X = zeros(2, 2);
Y = ones(2, 2);
join = [X Y]
0 0 1 1
0 0 1 1
To stack the two matrices, we can use the command "[ ; ]":
stack = [X; Y]
0 0
0 0
1 1
1 1

B.2.2 Operators and Relational Operators

Operators

There are four basic operators in MATLAB:

+ : Addition
- : Subtraction
* : Multiplication (applicable for both scalars and matrices)
^ : Matrix Power
In addition, we have three element-by-element operators:
.* : Multiplication, operated element by element
./ : Division, operated element by element
.^ : Power, operated element by element
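The difference between the matrix operators and their element-by-element counterparts is easiest to see on a small example. The two matrices below are arbitrary.

```matlab
A = [1 2; 3 4];
B = [0 1; 1 0];

matProd  = A*B;      % matrix product:            [2 1; 4 3]
elemProd = A.*B;     % element-by-element product: [0 2; 3 0]
matPow   = A^2;      % matrix power, A*A:          [7 10; 15 22]
elemPow  = A.^2;     % each element squared:       [1 4; 9 16]
elemDiv  = A./A;     % element-by-element division: a matrix of ones
```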

Moreover, some frequently used functions are listed below for
your reference:
exp(X): the e-exponential of the elements of the matrix X
log(X): the natural logarithm of the elements of the matrix X
sqrt(X): the square root of the elements of the matrix X
abs(X): the absolute value of the elements of the matrix X
round(X): rounds the elements of X to the nearest integers
floor(X): rounds the elements of X to the nearest integers towards
minus infinity
ceil(X): rounds the elements of X to the nearest integers towards
infinity
diag(X): the diagonal elements of X
det(X): the determinant of X
rank(X): the rank of X
fix(X): rounds the elements of X to the nearest integers towards zero

Relational Operators

There are six basic relational operators and two logical operators in MATLAB:

< : less than

> : greater than

<= : less than or equal to
>= : greater than or equal to
== : equal to
∼= : not equal
&& : and
||: or
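
As a small illustration of these operators, the sketch below applies them to a vector. The data and variable names are hypothetical. Note that && and || expect scalar conditions, so the element-wise operator & is used when comparing whole arrays.

```matlab
% Relational operators return logical arrays that can be used for indexing.
x = [-2 0 3 7];

is_positive = x > 0;                 % logical vector, one entry per element
in_range    = (x >= 0) & (x <= 5);   % element-wise "and" for arrays
positives   = x(is_positive);        % keep only the positive entries

% && and || combine scalar conditions (typically inside if statements)
if any(x < 0) && numel(x) == 4
    has_negative = true;
else
    has_negative = false;
end
```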

B.2.3 Properties of Matrices

Max, Min, Mean, Sum, Prod

max(X): returns the maximum values of matrix X along its columns.

For instance: X = [1 2 3; 4 5 6; 7 8 9]
1 2 3
4 5 6
7 8 9
max(X)
7 8 9
To obtain the maximum along the rows, use the command: max(X, [ ],
2). The “2 " indicates that we are examining along the second dimen-
sion of the matrix. For instance:
max(X, [ ], 2)
3
6
9
Analogously, we can apply the functions min to find the minimum, mean
to obtain the mean, sum to calculate the sum, and prod to calculate
the product.
For instance: X = [1 2 3; 4 5 6; 7 8 9]
1 2 3
4 5 6
7 8 9
mean(X)
4 5 6
mean(X, 2)
2
5
8
sum(X)
12 15 18
sum(X, 2)
6
15
24
prod(X)
28 80 162
prod(X, 2)
6
120
504
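norm(X) returns the Euclidean norm (the square root of the sum of the
squares) when X is a vector. When X is a matrix, norm(X) returns the
2-norm (the largest singular value of X), while norm(X, 'fro') returns
the square root of the sum of the squared elements.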

Obtain Dimensions of A Matrix

[M, N] = size(X): returns the number of rows (M) and columns (N)
of the matrix X as separate output variables.
size(X, 1): returns the number of rows (M)
size(X, 2): returns the number of columns (N)
length(X) : returns the length of the vector X

B.3 Flow Control

Structural commands, also called flow control, govern the flow of
information in a program and are thus essential in any programming
language. In particular, there are three structural commands in MATLAB:
if, for, and while. The codes associated with this section are in the
M-file "Ex_flow_control.m".

B.3.1 If
The "if" statement, matched with the command "end", executes a
group of statements when a logical expression evaluates to true.
The general form is:
if logical expression A
statements to be executed if A is true
end
In addition, we can enrich the logical evaluation process by adding the
command “else":
if logical expression A
statements to be executed if A is true
else
statements to be executed if A is false
end
Moreover, we can further augment the if-statement by adding a series
of "elseif" commands:
if logical expression A1
statements to be executed if A1 is true
elseif logical expression A2
statements to be executed if A2 is true
else
statements to be executed if all logical expressions enumerated above
are false
end

Application: How Much Do You Pay for Federal Income Tax?

How much does an average American family pay in federal income tax?
Given the tax rate schedule, the if command can deliver the federal
income tax for any level of income:
% an income close to the U.S. median household income
% ($53,657 in current dollars in 2014)
income = 54462;
if income <= 12950 % taxable income is assumed to be nonnegative
    tax = income * 10 * .01;
elseif income > 12950 && income <= 49400
    tax = 1295.00 + (income - 12950) * 15 * .01;
elseif income > 49400 && income <= 127550
    tax = 6762.50 + (income - 49400) * 25 * .01;
elseif income > 127550 && income <= 206600
    tax = 26300.00 + (income - 127550) * 28 * .01;
elseif income > 206600 && income <= 405100
    tax = 48434.00 + (income - 206600) * 33 * .01;
elseif income > 405100 && income <= 432200
    tax = 113939.00 + (income - 405100) * 35 * .01;
else % i.e., income > 432200
    tax = 123424.00 + (income - 432200) * 39.6 * .01;
end

Rate Schedule for the Federal Income Tax (2014)

Schedule Z (Applies to the Head of Household)

If Taxable Income Is Over   But Not Over   The Tax Is            Of the Amount Over
0                           12,950         10%                   0
12,950                      49,400         1,295.00 + 15%        12,950
49,400                      127,550        6,762.50 + 25%        49,400
127,550                     206,600        26,300.00 + 28%       127,550
206,600                     405,100        48,434.00 + 33%       206,600
405,100                     432,200        113,939.00 + 35%      405,100
432,200                     ∞              123,424.00 + 39.6%    432,200
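
The same schedule can also be stored as vectors, so that the bracket is found by a lookup rather than by a chain of elseif statements. This is only a sketch of an alternative, not the approach used in the accompanying M-file; the names threshold, base_tax, rate, and b are hypothetical, and taxable income is assumed to be nonnegative.

```matlab
% Schedule Z (2014) stored as vectors: lower thresholds, tax owed at the
% threshold, and marginal rate within each bracket.
threshold = [0 12950 49400 127550 206600 405100 432200];
base_tax  = [0 1295.00 6762.50 26300.00 48434.00 113939.00 123424.00];
rate      = [0.10 0.15 0.25 0.28 0.33 0.35 0.396];

income = 54462;   % the income level used in the example above

% Number of thresholds strictly below income gives the bracket index;
% max(1, ...) handles the case income == 0.
b = max(1, sum(income > threshold));

tax = base_tax(b) + rate(b) * (income - threshold(b));
average_rate = tax / income;

fprintf('Bracket %d, tax = %.2f, average rate = %.4f\n', b, tax, average_rate);
```

Because the comparison is strict, an income exactly at a threshold (say 12,950) is assigned to the lower bracket, matching the "But Not Over" column of the schedule.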
B.3.2 For
The for loop repeats a group of statements a fixed, predetermined
number of times. The general form of the for loop is:
for n = 1 : N
statements to be repeated
end
In this loop, the variable "n", which begins at 1 and ends at N, serves
as a counter, and the variable "N" controls the number of repetitions.

Application: How Fast Is the Federal Deficit Growing?

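How fast is the deficit of the federal government growing?
Given the time series of the federal deficit we obtained in section 1.3
("loading data"), this task is straightforward to tackle with a for loop:
% "deficit" is assumed here to be the name of the federal deficit series
% loaded earlier; n_year is the number of years in the sample
g_def = zeros(n_year - 1, 1); % preallocate the growth rates
for id = 1 : (n_year - 1)
    g_def(id) = deficit(id + 1) / deficit(id) - 1;
end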
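
The growth-rate calculation can be tried on made-up numbers, which also shows that the loop and a vectorized expression give the same answer. The series below is hypothetical and serves only as an illustration; it is not the data set used in the book.

```matlab
% Growth rates of a series: for loop versus vectorized calculation.
series = [100 110 121 133.1];    % hypothetical data, for illustration only
n_obs  = length(series);

g_loop = zeros(1, n_obs - 1);    % preallocate before filling in the loop
for id = 1:(n_obs - 1)
    g_loop(id) = series(id + 1) / series(id) - 1;
end

% Equivalent vectorized form using element-by-element division
g_vec = series(2:end) ./ series(1:end-1) - 1;

disp(g_loop);
disp(g_vec);
```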

B.3.3 While
The while loop repeats a group of statements an indefinite number
of times, under the control of a logical condition.
The general form of the while loop is:
while logical expression
statements to be repeated
end

Application: How to Find the Steady State In Solow Growth Model

Consider a Solow growth model in which capital accumulates according
to:
$k_{t+1} = s k_t^{\alpha} + (1 - \delta) k_t$
What is the steady-state level of capital in this economy?
This problem can be readily solved with the while command:
% Parameterization
s = 0.0550; % savings rate
aalpha = 1/3; % capital income share
ddelta = 0.145; % depreciation rate of capital

% Initialization
diff = 1; % specify the initial distance from convergence
criterion = 1e-8; % the criterion to determine convergence
k_new = 1; % initial guess for the steady-state capital

while diff > criterion
    % characterize the evolution of capital with exogenous savings rate
    k_old = k_new;
    k_new = s * (k_old)^(aalpha) + (1 - ddelta) * k_old;
    % evaluate the distance from converging to the steady state
    diff = abs(k_new - k_old);
end
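
For this particular law of motion the steady state can also be found by hand, which gives a useful check on the loop. Setting $k_{t+1} = k_t = k^*$ gives $\delta k^* = s (k^*)^{\alpha}$, so $k^* = (s/\delta)^{1/(1-\alpha)}$. The sketch below repeats the iteration and compares it with this closed form; the names k_star, dist, and n_iter are illustrative and do not appear in the original code.

```matlab
% Compare the iterated steady state with the closed-form solution.
s = 0.055; aalpha = 1/3; ddelta = 0.145;

% Closed form: ddelta * k = s * k^aalpha  =>  k = (s/ddelta)^(1/(1-aalpha))
k_star = (s / ddelta)^(1 / (1 - aalpha));

k_new = 1;          % initial guess
dist = 1;           % distance between successive iterates
criterion = 1e-8;   % convergence criterion
n_iter = 0;         % iteration counter

while dist > criterion
    k_old = k_new;
    k_new = s * k_old^aalpha + (1 - ddelta) * k_old;
    dist = abs(k_new - k_old);
    n_iter = n_iter + 1;
end

fprintf('Iterated: %.6f, closed form: %.6f, iterations: %d\n', ...
        k_new, k_star, n_iter);
```

The iteration converges slowly here because each step only closes a fraction $\delta(1-\alpha)$ of the remaining gap, so the stopping criterion on successive iterates is looser than the true distance to $k^*$.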

B.4 Defining Functions

B.4.1 Anonymous Function

In this course, we will be working intensively with functions. The
most basic form of a function is an anonymous function, which can be
created with the @() syntax in MATLAB.
For instance: cubic = @(x) x^3;
This creates a function "cubic" that raises its input to the third
power.
To call this function, simply type, say, "cubic(3)".
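
Two details of anonymous functions are worth illustrating: element-by-element operators let the function accept vectors, and any parameter used in the body is captured at the moment the function is created. The sketch below uses hypothetical names (cubic_elementwise, g, vals).

```matlab
% Anonymous functions: vector inputs and captured parameters.
cubic_elementwise = @(x) x.^3;        % .^ applies the power entry by entry
vals = cubic_elementwise([1 2 3]);    % cube each element of the vector

c = 1.0;
g = @(x) 1 + exp(-c * x);             % the value c = 1.0 is stored inside g

c = 2.0;                              % changing c afterwards does not alter g
g_at_one = g(1);                      % still evaluated with c = 1.0

disp(vals);
disp(g_at_one);
```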

B.4.2 M-File Function

When a function becomes too complex to be specified in the main
code, we can create it in a separate M-file (a file with the extension
".m"). For instance:
function [y] = objective(x, c)
y = 1 + exp(-c * x) - log(x);
end
In the main code, we can simply call this function by typing, say,
"objective(3, 1)".
This is particularly helpful when we need to estimate parameters,
which can involve complex objective functions.

B.5 Graphing

B.5.1 Plot
In MATLAB, a two-dimensional graph can be created with the command
plot. For instance, to plot a vector y against a vector x, we can simply
use the command plot(x, y).
To add further features to the figure, consider graphing two functions:
$F(x) = \log(x)$ and $g(x) = 1 + e^{-cx}$.
% Create the Grid for Graphing
x_lb = 2; % lower bound of the grid
x_ub = 4; % upper bound of the grid
nx = 101; % the number of points in the grid for x

% create a grid for x on [x_lb, x_ub] with the number of points: nx
x_grid = linspace(x_lb, x_ub, nx);
c = 1.0; % parameter controlling the curvature of: 1 + exp(-c x)

% Graphing: Separately For Each Individual Functions
figure(1); % creates a new figure window
plot(x_grid, log(x_grid), x_grid, 1 + exp(-c * x_grid)); % plot the two functions
title('Graphing For Each Individual Functions'); % make a title for this figure
xlabel('x'); % label for x axis
ylabel('log(x) and 1+exp(-c*x)'); % label for y axis
legend('log(x)', '1+exp(-c*x)'); % legends for each function

The default fonts on MATLAB graphs are small. These can be changed
using the FontSize property. So, in the above example, to change the
font size for the title to 18 just write:
title('Graphing For Each Individual Functions', 'FontSize', 18);
One can do the same thing for the axis labels.
More detailed graphing features of the command plot can be found
in the M-file: Ex_graphing.m.

B.5.2 Tool Kit of Graphing Commands

In addition, MATLAB offers a large set of graphing options beyond
plot for us to explore. Some handy tools are listed below for your
reference:
hist: creates a histogram;
surf: three-dimensional surface graphing;
waterfall: the waterfall plot;
plotyy: plot with two vertical axes

B.6 Solving Nonlinear Equations

B.6.1 Solving Nonlinear Equations

To set the context: in this section, our objective is to find the
solution $x^*$ to a nonlinear equation $F(x^*) = 0$. In particular,
we will discuss two scenarios and two methods: a smooth objective
function solved by Newton's method, and a nonsmooth objective
function solved by the bisection method.

B.6.2 fsolve and fzero

In MATLAB, there is a straightforward function for solving nonlinear
equations: "fsolve" (part of the Optimization Toolbox). In this
section, we will outline the algorithm underlying this function and
illustrate how the function is implemented.
fsolve: Algorithm
To be brief, the function fsolve is based on Newton's method,
i.e., we start from an initial guess $x_n$ and update our guess to
$x_{n+1}$ recursively by the following rule:

$F(x_n) + F_1(x_n)(x_{n+1} - x_n) = 0 \implies x_{n+1} = x_n - [F_1(x_n)]^{-1} F(x_n)$,

where $F_1$ denotes the derivative (Jacobian) of $F$. In general,
fsolve works well when the function $F(\cdot)$ is smooth. In addition,
as the iteration rule above suggests, the initial guess is critical for
both convergence and computational efficiency. The unknown in fsolve
can be of any (finite) dimension. For a single-variable nonlinear
equation, an alternative is the command fzero, which is implemented
analogously to fsolve. fzero can only solve one equation in one
unknown.
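
To make the updating rule concrete, the sketch below codes a bare-bones Newton iteration by hand for the scalar equation $\ln(x) = 1 + e^{-cx}$, written as $F(x) = \ln(x) - 1 - e^{-cx} = 0$ with derivative $F_1(x) = 1/x + c e^{-cx}$. It illustrates the rule only and is not how fsolve is implemented internally; the starting value, tolerance, and variable names are assumptions made for the example.

```matlab
% Hand-coded Newton iteration for F(x) = log(x) - 1 - exp(-c*x) = 0.
c  = 1.0;
F  = @(x) log(x) - 1 - exp(-c * x);   % the equation to solve
F1 = @(x) 1 ./ x + c * exp(-c * x);   % its analytical derivative

x = 3;            % initial guess
tol = 1e-10;      % stop when successive guesses are this close
max_iter = 50;    % safeguard against non-convergence

for it = 1:max_iter
    x_next = x - F(x) / F1(x);        % Newton update
    step = abs(x_next - x);
    x = x_next;
    if step < tol
        break;
    end
end

fprintf('Newton solution: x = %.8f after %d iterations\n', x, it);
```

With a starting value close to the root the iteration settles within a handful of steps, which is the sensitivity to the initial guess mentioned above.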
fsolve: Implementation
The general form to implement fsolve is:
x_star = fsolve(@(x) objective(x, c), x_initial)
There are five ingredients to implement fsolve:
objective(x, c): the nonlinear equation $F(x; c) = 0$ under consideration.
c: the exogenous parameter in this nonlinear equation.
x: the endogenous unknown to obtain.
x_star: the solution to this nonlinear equation, i.e., $F(x_{star}; c) = 0$.
x_initial: our initial guess for the solution x_star.
Of course, we can add a variety of options to the function fsolve.
For example, to control the convergence criterion, we can use
the command optimset:
opt = optimset('TolFun', 1e-8);
x_star = fsolve(@(x) objective(x, c), x_initial, opt)

As a cookbook illustration, let's consider the following nonlinear
equation:
$\ln(x) = 1 + e^{-cx}$
The procedure to solve this equation is delineated in the M-file:
Ex_fsolve.m.
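
Since this equation has a single unknown, it can also be solved with fzero, which follows the same calling pattern and does not need the Optimization Toolbox. The lines below are a minimal sketch and not a reproduction of Ex_fsolve.m; the objective is written as an anonymous function so that the snippet is self-contained, and the initial guess of 3 is an assumption.

```matlab
% Solve ln(x) = 1 + exp(-c*x) with fzero.
c = 1.0;                                        % exogenous parameter
objective = @(x, c) 1 + exp(-c * x) - log(x);   % F(x; c), zero at the solution

x_initial = 3;                                  % initial guess
opt = optimset('TolX', 1e-10);                  % tolerance on the solution x

x_star = fzero(@(x) objective(x, c), x_initial, opt);

fprintf('x_star = %.8f, residual = %.2e\n', x_star, objective(x_star, c));
```

Replacing fzero by fsolve in the call (and 'TolX' by 'TolFun') gives the pattern described in the text.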

B.6.3 Bisection
Unfortunately, Newton's method does not perform well when the
objective function $F(\cdot)$ is not smooth. A potential alternative
is the bisection method.
The bisection method applies when our objective function $F(\cdot)$ is
continuous on an interval $[\underline{x}, \bar{x}]$ where
$F(\underline{x}) \cdot F(\bar{x}) < 0$, i.e., the value of the
objective function has opposite signs at the two boundaries of the
interval. To apply the bisection method, we start from initial guesses
for the minimum and maximum values of the solution, evaluate the
function at the midpoint of this interval, and keep the half of the
interval over which the function changes sign. The procedure is
repeated until the interval is sufficiently small.
To demonstrate how bisection works, let's consider the same
nonlinear equation:
$\ln(x) = 1 + e^{-cx}$

A detailed implementation can be found in the M-file: Ex_bisection.m.
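
The steps of the bisection method can be sketched in a few lines. This is an illustrative version written for this equation, not the contents of Ex_bisection.m; the bracket [2, 4], the tolerance, and the variable names are assumptions.

```matlab
% Bisection for F(x) = log(x) - 1 - exp(-c*x) = 0 on the bracket [2, 4].
c = 1.0;
F = @(x) log(x) - 1 - exp(-c * x);

x_lo = 2;       % lower end of the bracket, where F < 0
x_hi = 4;       % upper end of the bracket, where F > 0
tol  = 1e-8;    % stop when the bracket is narrower than this

if F(x_lo) * F(x_hi) >= 0
    error('The root is not bracketed: F must change sign on [x_lo, x_hi].');
end

while (x_hi - x_lo) > tol
    x_mid = 0.5 * (x_lo + x_hi);      % midpoint of the current bracket
    if F(x_lo) * F(x_mid) <= 0
        x_hi = x_mid;                 % sign change is in the lower half
    else
        x_lo = x_mid;                 % sign change is in the upper half
    end
end

x_star = 0.5 * (x_lo + x_hi);
fprintf('Bisection solution: x = %.8f\n', x_star);
```

Each pass halves the bracket, so the number of iterations depends only on the initial width and the tolerance, not on the smoothness of F.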

B.7 Minimization (or Maximization)

Suppose that a solution is sought to a problem of the following form

$\min_{\underline{x} \le x \le \bar{x}} \{F(x)\}$.

Here $x$ is constrained to lie in the interval $[\underline{x}, \bar{x}]$,
where $\underline{x}$ and $\bar{x}$ are the lower and upper bounds on
the minimization problem. The function can be minimized in MATLAB by
calling the function fminbnd. The syntax to use fminbnd is

x = fminbnd(@function, lower bound, upper bound),

where x is the solution from the minimization routine. Suppose that
one wants to maximize $F(x)$ on the domain $[\underline{x}, \bar{x}]$.
This maximization problem can be transformed into a minimization
problem as follows

$\max_{\underline{x} \le x \le \bar{x}} \{F(x)\} = -\min_{\underline{x} \le x \le \bar{x}} \{-F(x)\}$.

So, one would just need to put a negative sign in front of the objective
function. The maximizer is the x returned by fminbnd, and the maximum
is the negative of the minimized value.
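
As a small illustration of the sign flip, the sketch below maximizes $F(x) = \ln(x) - x$ on $[0.1, 3]$, whose maximum is at $x = 1$ with value $-1$. The function and bounds are chosen for the example only and do not come from the text.

```matlab
% Maximize F(x) = log(x) - x on [0.1, 3] by minimizing -F(x).
F = @(x) log(x) - x;

x_lower = 0.1;   % lower bound of the search interval
x_upper = 3;     % upper bound of the search interval

% fminbnd returns the minimizer and the minimized value of -F
[x_max, neg_F_max] = fminbnd(@(x) -F(x), x_lower, x_upper);

F_max = -neg_F_max;   % flip the sign back to recover the maximum of F

fprintf('Maximizer: %.5f, maximum: %.5f\n', x_max, F_max);
```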

B.8 Roots of Polynomials

Suppose that one wants to find the roots of an nth-order polynomial.
This involves finding the n values of x that solve the equation

$a_n x^n + a_{n-1} x^{n-1} + \cdots + a_1 x + a_0 = 0$.

This can be done in MATLAB using the roots command. To use this,
just specify the vector $a = [a_n, a_{n-1}, \cdots, a_0]$, where the
coefficients are in descending order, and type the command roots(a).
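
For example, the cubic $x^3 - 6x^2 + 11x - 6 = (x-1)(x-2)(x-3)$ has roots 1, 2, and 3. The short sketch below recovers them; the polynomial is an illustrative choice.

```matlab
% Roots of x^3 - 6x^2 + 11x - 6 = 0.
a = [1 -6 11 -6];        % coefficients in descending order of the power
r = roots(a);            % column vector containing the three roots

residual = polyval(a, r);   % evaluate the polynomial at each root

disp(sort(r));
```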
B.9 Eigenvector decomposition

To find the eigenvalues and their corresponding eigenvectors of a square
matrix $A$, the command [e, ε] = eig(A) returns the diagonal matrix
$\varepsilon = \mathrm{diag}(\varepsilon_1, \ldots, \varepsilon_n)$ of
eigenvalues $\varepsilon_i$ and the matrix
$e = [\, e_1 \; \cdots \; e_n \,]$ whose columns are the corresponding
right eigenvectors, so that $Ae = e\varepsilon$.
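
The relation $Ae = e\varepsilon$ can be checked numerically. In the sketch below the matrix A is an illustrative choice with eigenvalues 1 and 3, and the name epsilon stands in for $\varepsilon$, since MATLAB variable names must use plain letters.

```matlab
% Eigenvalue decomposition of a small symmetric matrix.
A = [2 1; 1 2];

[e, epsilon] = eig(A);        % columns of e: eigenvectors; epsilon: diagonal

eigenvalues = diag(epsilon);  % extract the eigenvalues as a vector

% Check the defining relation A*e = e*epsilon
decomposition_error = norm(A * e - e * epsilon);

disp(eigenvalues);
disp(decomposition_error);
```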

B.10 Interpolation

Matlab offers interpolation routines to add new data points within the
range of a set of known data points, for one- to N-dimensional gridded
data, through the commands interp1, interp2, interp3, and interpn.
Focus on the two-dimensional case, where $y = f(x_1, x_2)$ is the
function evaluated at each sample point $(x_1, x_2)$ and $y$, $x_1$,
and $x_2$ are $n \times n$ matrices. The command
y_query = interp2(x1, x2, y, x1_query, x2_query, method)
returns interpolated values of y at the query points
$(x_1^{query}, x_2^{query})$, using the interpolation method defined by
method = 'linear', 'nearest', 'cubic', 'makima', or 'spline'.
Here $y^{query}$, $x_1^{query}$, and $x_2^{query}$ are
$n^{query} \times n^{query}$ matrices.
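
A minimal sketch of the two-dimensional case follows. The sample function $f(x_1, x_2) = x_1 + 2x_2$ and the grids are illustrative assumptions; because this function is linear, linear interpolation reproduces it exactly, which makes the result easy to check.

```matlab
% Two-dimensional linear interpolation with interp2.
[x1, x2] = meshgrid(0:2, 0:2);      % 3-by-3 grids of sample points
y = x1 + 2 * x2;                    % function values at the sample points

% 2-by-2 grids of query points inside the sampled region
[x1_query, x2_query] = meshgrid([0.5 1.5], [0.25 1.75]);

y_query = interp2(x1, x2, y, x1_query, x2_query, 'linear');

% Exact values of the function at the query points, for comparison
y_exact = x1_query + 2 * x2_query;

disp(y_query);
```

The grids are built with meshgrid, which is the layout interp2 expects for its first two arguments.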

B.11 Random Number Generation

To generate a 1 × n vector of normal random numbers with mean mu and
standard deviation sigma, just type normrnd(mu, sigma, 1, n). To
seed the random number generator, use the statement rng(int), where
int is some nonnegative integer, just before you call normrnd. If you
don't do this, your random numbers will change every time you run your
program. You can also generate uniformly distributed random numbers.
rand(n, m) will yield an n × m matrix of uniformly distributed random
numbers on the interval (0, 1). randi([a, b], 1, n) will return a
1 × n vector of integers that are uniformly distributed on the interval
[a, b]. Again, you should seed these random number generators.
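
The effect of seeding can be seen by drawing the same numbers twice. The sketch below uses randn, which is available without any toolbox; mu + sigma * randn(1, n) is a standard way to obtain normal draws with mean mu and standard deviation sigma. The seed 1234 and the variable names are arbitrary choices.

```matlab
% Seeding the generator makes random draws reproducible.
mu = 2; sigma = 0.5; n = 5;

rng(1234);                         % seed the generator
u_first = rand(2, 3);              % uniform draws on (0, 1)
z_first = mu + sigma * randn(1, n);% normal draws, mean mu, std sigma

rng(1234);                         % reset to the same seed
u_second = rand(2, 3);             % identical to u_first
z_second = mu + sigma * randn(1, n);

dice = randi([1, 6], 1, 10);       % ten integers between 1 and 6

disp(isequal(u_first, u_second));
```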

B.12 Complex Numbers

Complex numbers are easy in MATLAB. A complex number has the
form a + bi, where a and b are real numbers and i is the imaginary
unit, $i = \sqrt{-1}$. In MATLAB a complex number is written in
exactly this way, where i is reserved for $\sqrt{-1}$. To recover the
coefficient b of a complex number x = a + bi, just type imag(x).
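
A few of the related commands are sketched below on the number 3 + 4i, an illustrative choice.

```matlab
% Basic operations on a complex number.
z = 3 + 4i;

a = real(z);       % real part
b = imag(z);       % coefficient on i
modulus = abs(z);  % distance from the origin, sqrt(a^2 + b^2)
z_bar = conj(z);   % complex conjugate, a - bi

root_of_minus_one = sqrt(-1);   % MATLAB returns the imaginary unit

disp([a b modulus]);
```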
B.13 Descriptive Statistics

Suppose one has two vectors of equal size, x and y, containing data.
To compute the standard deviation of x use the command std(x). To
compute the correlation between x and y just write corrcoef(x,y), which
returns a 2 × 2 matrix whose off-diagonal elements are the correlation
coefficient. To plot a histogram for x use the command histogram(x).
This command can be used in conjunction with title, xlabel, and ylabel
to generate a title and axis labels for the histogram.
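
The following sketch applies these commands to two short made-up vectors; the data are hypothetical and chosen so that the results are easy to verify by hand.

```matlab
% Standard deviation and correlation for two small data vectors.
x = [1 2 3 4 5]';     % hypothetical data
y = [2 4 5 4 5]';     % hypothetical data

sd_x = std(x);        % sample standard deviation (divides by n - 1)

R = corrcoef(x, y);   % 2-by-2 correlation matrix
rho = R(1, 2);        % correlation between x and y

fprintf('std(x) = %.4f, corr(x, y) = %.4f\n', sd_x, rho);
```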

B.14 Writing Results to a Table

It's easy to write results to a table in MATLAB using the table
command. Suppose one has two n × 1 vectors x and y. These can be
written to a table called T using the following command:

T = table(x, y);

This will create an n × 2 table with the headings x and y for the
columns. This table can be exported to an Excel file called Data using
the writetable command. Specifically, write

filename = 'Data.xlsx';

and
writetable(T, filename);
C Introduction to Julia

Julia is a high-level programming language well suited for numerical
analysis. One of its main advantages is its speed, as Julia compiles
code to machine code before running it. This brief introduction
highlights the steps needed to get up to speed with the code provided
in the book.

C.1 Choosing an IDE

The first step before using Julia is to choose an Integrated
Development Environment (IDE), a platform to help you write and run
code efficiently. There are free options available; the most popular
ones at the moment are Juno for Atom and Julia for VS Code. These
editors offer an interface that is similar to Matlab, including a
Workspace displaying packages in use, structures and variables created,
an Editor to write code, and Julia's command line REPL (standing for
Read-Eval-Print Loop), where the code's output is printed.

C.2 Installing and Using Packages

Julia has a built-in package manager that allows the user to install
packages and to call them when needed to run code. Packages are
installed by typing "]" in the REPL to enter the package mode, and then
"add package_name" and enter. Alternatively, you can type "using Pkg"
and then enter to load the package manager, followed by
Pkg.add("package_name") and enter.

Key packages

For data analysis: DataFrames.jl offers tools for working with
tabular data. CSV.jl is a package for reading csv data.
For differentiation: ForwardDiff.jl uses forward-mode automatic
differentiation to evaluate the derivatives of functions and compute
gradients, Jacobian and Hessian matrices. FiniteDifferences.jl
offers an alternative method to estimate derivatives with finite
differences.
For solving nonlinear problems: Roots.jl offers algorithms to
find the roots of continuous scalar functions of a single real variable
(e.g. using the bisection method, Brent's method, and derivative-free
methods). NLsolve.jl provides algorithms to solve systems of nonlinear
equations, including mixed complementarity problems.
For solving optimization problems: Optim.jl offers algorithms
to solve univariate and multivariate optimization problems with box
constraints and includes methods such as simulated annealing and
particle swarm. BlackBoxOptim.jl provides global optimization
algorithms that do not require the objective function to be
differentiable. NLopt.jl is an interface offering a suite of different
optimization algorithms. JuMP.jl is another optimization interface for
a number of open-source and commercial solvers targeted at constrained
problems.
For plotting: Plots.jl is a data visualization interface that can be
combined with backends such as PGFPlotsX.jl, PlotlyJS.jl, and
PyPlot.jl.

C.3 Basic Commands

Housekeeping

Prior to the main body of the coding work, there are two basic commands
commonly used for housekeeping:
exit(): quits Julia, which removes all packages, structures, and
variables from the current workspace.
clearconsole(): clears all input and output from the REPL.

Loading data

To import data

Stopwatch timer

To monitor the

Lines in a file

Julia does not require the use of three dots ("...") to continue
statements across lines.
Table C.3.1: Matlab-Julia cheatsheet

Vectors and matrices

Task                     Matlab                             Julia                                   Packages
Row vector               A = [1 2 3]                        A = [1 2 3]
Column vector            A = [1 2 3]'                       A = [1; 2; 3]
Matrix                   A = [1 2; 3 4]                     A = [1 2; 3 4]
Matrix of zeros          A = zeros(2, 2)                    A = zeros(2, 2)
Matrix of ones           A = ones(2, 2)                     A = ones(2, 2)
Identity matrix          A = eye(2, 2)                      A = I                                   LinearAlgebra
Diagonal matrix          A = diag([1 2 3])                  A = Diagonal([1, 2, 3])                 LinearAlgebra
Linearly spaced vector   A = linspace(x_ini, x_final, n)    A = range(x_ini, x_final, length = n)

Interpolations

Distributions