Solow Discrete Time
One of the long-run economic facts presented in Chapter 2 is the stability of the capital-
output ratio Kt /Yt (where Yt represents aggregate output at period t and Kt represents
aggregate capital at period t) over time: capital is roughly three times annual GDP. Earlier
contributions of growth theory, such as the so-called Harrod-Domar model, considered this
stability as representing a technological property and incorporated it as one of the assump-
tions in the model. The capital stock, traditionally divided into structures and equipment
but nowadays also containing some intangible components (e.g., software), is one of the im-
portant production inputs, but it is of course not the only one. It is possible to produce
products in a very capital-intensive way, but clearly there is a choice and using labor—
different people’s time, skills, and effort—is the most obvious alternative, or complement.
Given the many ways in which production processes can be set up, it is therefore not
obvious why, at the macro level, Kt /Yt is almost constant over time. As argued in Chapter
2, this was Solow’s starting point. He resolved the tension between the stable
aggregate ratio and the intuitive notion that capital and labor are quite substitutable through a
sequence of insights that led both to a framework for studying macroeconomic dynamics
and to a method for measuring technological change. The purpose of the present
chapter is thus to detail the Solow model: the basic assumptions underlying it and their
implications.
A central element in the Solow model is the aggregate production function. In the
aggregate production function, there are three economic variables that can affect the growth
of GDP: technology At , capital Kt , and labor Lt . The Solow model focuses on the endogenous
accumulation of capital Kt . We will see that Kt not only reacts to the saving rate but also to
At and Lt . After solving the model, which will deliver a stable capital-output ratio, we will
focus on two main takeaways: (i) the fundamental source of long-run growth in per capita
income is the growth in At ; (ii) if all parameter values are common, different economies
converge to the same (both in terms of level and growth rate) income per capita in the long
run.
3.1 The basic model
We start our exposition using the simplest version of the model, where there is neither
technological progress nor any growth in the size of the population or the skills of workers.
The centerpiece of the Solow model is the aggregate production function
Yt = F (Kt , Lt ).
Note that we can interpret Yt as the GDP in this economy. We make the following assump-
tions for the function F (K, L):
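1. F (K, L) is strictly increasing in K and L: F1 (K, L) > 0 and F2 (K, L) > 0.
2. F (K, L) has decreasing returns to capital: F11 (K, L) < 0.
3. F (K, L) exhibits constant returns to scale in (K, L): when K and L change to cK and
cL, with any c > 0, F (K, L) becomes cF (K, L).
4. F (0, L) = 0.
5. limK→0 F1 (K, L) = ∞.
6. limK→∞ F1 (K, L) = 0.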
Assumptions 5 and 6 are often called Inada conditions and are stronger than we need but
these assumptions simplify the exposition.1
In the basic model, we assume that the population is constant and that hours worked
per worker is constant, so that Lt is constant. We normalize both population and hours
per capita to 1; therefore, the only variable input for production is Kt , and because of this
normalization, we can write F (Kt , 1) = F (kt , 1), where we remind the reader that lower-case
letters are per-capita measures. Let us use f (kt ) to denote F (kt , 1). Then the production
function can be expressed
yt = f (kt ). (3.1)
The second important piece of the Solow model is the equation that describes the evolution
of the capital stock:
kt+1 = it + (1 − δ)kt , (3.2)
where it is investment in period t. The existing capital stock loses value, δkt , while being
used, where δ ∈ (0, 1) is capital’s depreciation rate.
These two centerpieces are connected through individual behavior. First, the goods
supply yt is equal to the demand for goods, ct + it :
yt = ct + it . (3.3)
Footnote 1: We also assume that F is twice continuously differentiable. This means that first-order conditions to
maximization problems involving F can be differentiated and then generate continuous functions.
Here, ct is consumption and it is investment (both per capita). Note that here we implicitly
assume (as in most of the following chapters) that goods are homogeneous and can
be used for both consumption and investment. Because output yt is also the total income
for consumers, it is either consumed or saved. Therefore, we know that total saving has to
equal total investment. In an open economy—where there is trade—this does not necessarily
hold. Finally, in this chapter, we can interpret both c and i as including government
consumption and investment, respectively.
In the Solow model, instead of explicitly modeling the consumption-saving decisions of
consumers, the consumers are assumed to mechanically save a constant fraction of their
income, so that investment is given by
it = syt , (3.4)
where s ∈ (0, 1) is the constant saving rate. This behavioral assumption is relaxed and
replaced by consumers’ optimizing consumption-saving behavior in Chapter 4.
Inserting equation (3.4) into equation (3.2) and using equation (3.1) yields
kt+1 = (1 − δ)kt + sf (kt ). (3.5)
This difference equation expresses the dynamics of the capital stock kt over time. This is
the fundamental equation of the Solow model. Note that the only endogenous variable on
the right-hand side of the fundamental equation is kt . Therefore, the next period’s stock of
capital kt+1 can be determined only with the knowledge of the current capital stock kt , given
values of exogenous objects: the scalars δ and s and the function f . Note also that, starting
from a given k0 , once we obtain the series $\{k_{t+1}\}_{t=0}^{\infty}$ from the fundamental equation (3.5),
the time series yt = f (kt ), ct = (1 − s)yt , and it = syt can readily be obtained.
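A steady state is a level of capital k̄ such that kt = k̄ implies kt+1 = k̄. From the
fundamental equation (3.5), a steady state satisfies
k̄ = (1 − δ)k̄ + sf (k̄).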
This equation implies δ k̄ = sf (k̄). It is straightforward to verify that, under the assumptions
for the aggregate production function in the previous section, a strictly positive value of k̄
that solves this equation always exists and is unique. Graphically, plot the left- and right-
hand sides of the equation δ k̄ = sf (k̄); the left-hand side is a straight line through the
origin with a positive slope and the right-hand side, which also starts in the origin, is strictly
increasing and strictly concave, with a slope of infinity at 0 and one that approaches 0 as
k̄ → ∞. Clearly, we see that an intersection exists and is unique. The Inada conditions are
used here to guarantee the existence of k̄. In the context of the basic model and as pointed
out above, the Inada conditions are stronger than necessary: they can be replaced by weaker
versions limk→0 F1 (k, 1) > δ/s and limk→∞ F1 (k, 1) < δ/s.
A particularly useful production function is the Cobb-Douglas production function:
$F(K, L) = K^{\alpha} L^{1-\alpha}$, so that $f(k) = k^{\alpha}$, where α ∈ (0, 1). This production function
satisfies all assumptions we need, including the Inada conditions. With Cobb-Douglas production, k̄ can be
solved for analytically:
\[ \bar{k} = \left(\frac{s}{\delta}\right)^{\frac{1}{1-\alpha}}. \qquad (3.6) \]
From this expression, we can see that the capital stock in the steady state is increasing in the
saving rate s and decreasing in the depreciation rate δ. Because aggregate output (GDP) is
ȳ = f (k̄), ȳ is also increasing in s and decreasing in δ.
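As a quick numerical check of (3.6), the following minimal sketch compares the closed-form steady state with a bisection solution of δk̄ = sf (k̄). The parameter values and the function names are illustrative assumptions, not taken from the text.

```python
def steady_state_cobb_douglas(s, delta, alpha):
    """Closed-form steady state (3.6): k_bar = (s / delta) ** (1 / (1 - alpha))."""
    return (s / delta) ** (1.0 / (1.0 - alpha))


def steady_state_bisection(f, s, delta, lo=1e-9, hi=1e6, tol=1e-12):
    """Solve delta * k = s * f(k) for k > 0 by bisection.

    Assumes s * f(k) - delta * k is positive at `lo` and negative at `hi`,
    which the Inada conditions deliver for lo small and hi large.
    """
    def g(k):
        return s * f(k) - delta * k

    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if g(mid) > 0:
            lo = mid          # root lies to the right of mid
        else:
            hi = mid          # root lies to the left of mid
        if hi - lo < tol * max(1.0, mid):
            break
    return 0.5 * (lo + hi)


# Illustrative parameter values (assumptions, not from the text).
s, delta, alpha = 0.2, 0.1, 1.0 / 3.0
k_closed = steady_state_cobb_douglas(s, delta, alpha)
k_numeric = steady_state_bisection(lambda k: k ** alpha, s, delta)
print(k_closed, k_numeric)
```

The bisection routine works for any f satisfying the assumptions above, so it can also be used when no closed form is available.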
Now, let us use a diagram to analyze the dynamics of kt when k0 > 0 is not at the steady-
state level. Figure 3.1 plots the equation (3.5) with the 45-degree line (that is, representing
kt+1 = kt ). In the figure, the intersection of (3.5) and the 45-degree line represents the
steady-state k̄.
In the figure, when we start from a given k0 on the horizontal axis, we can obtain k1 on
the vertical axis by using the (3.5) curve. By placing this k1 back on the horizontal axis and
using the (3.5) curve again, we obtain k2 , and so on. This procedure yields the full time
series of the capital stock: $\{k_{t+1}\}_{t=0}^{\infty}$. One can easily verify that the dynamics of kt exhibit a
global and monotonic convergence to k̄, regardless of the initial value k0 . That is, whatever
the starting point is, the time series of kt gradually approaches k̄ over time.
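The graphical iteration can be mimicked numerically. The sketch below iterates the fundamental equation with a Cobb-Douglas f from one starting point below and one above the steady state; parameter values are illustrative assumptions.

```python
def simulate_solow(k0, s, delta, alpha, periods):
    """Iterate k_{t+1} = (1 - delta) * k_t + s * k_t**alpha and return the path."""
    path = [k0]
    for _ in range(periods):
        k = path[-1]
        path.append((1.0 - delta) * k + s * k ** alpha)
    return path


# Illustrative parameter values (assumptions, not from the text).
s, delta, alpha = 0.2, 0.1, 1.0 / 3.0
k_bar = (s / delta) ** (1.0 / (1.0 - alpha))

from_below = simulate_solow(0.5 * k_bar, s, delta, alpha, 300)
from_above = simulate_solow(2.0 * k_bar, s, delta, alpha, 300)

print(from_below[:3], from_below[-1])
print(from_above[:3], from_above[-1])
```

Both paths move monotonically toward the same k̄, one rising and one falling.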
The figure works very well as a graphical argument, but how would a mathematical proof
be put together? Suppose that 0 < kt < k̄. It is then straightforward to show that (i) kt+1 > kt
(because sf (kt ) > δkt when kt < k̄) and also that (ii) kt+1 < k̄ (because the right-hand side
of (3.5) is strictly increasing in kt and equals k̄ at kt = k̄). Repeating
this argument, we can see that the sequence {kt , kt+1 , kt+2 , ...} is monotonically increasing and contained in [kt , k̄].
By the Monotone Convergence Theorem, the sequence has a limit. Because the right-hand side of (3.5)
is continuous, the limit must be a steady state, and k̄ is the only strictly positive steady state;
the limit is therefore k̄. The case kt > k̄ is symmetric.
Intuitively, the convergence occurs because the aggregate production function f (kt ) has
decreasing returns to capital (Assumption 2 above). Equation (3.2), rewritten in terms of
the change in the capital stock,
kt+1 − kt = it − δkt ,
reveals two forces that go in opposite directions: (gross) investment and depreciation. When
the total capital stock is small, output per unit of capital is large, and the constant saving
rate then implies that a large (gross) investment is made relative to the existing capital
stock. This process enables the aggregate capital stock to increase. As kt increases, output
per unit of capital becomes smaller due to the decreasing returns property, and when kt
is large enough, gross investment cannot cover total depreciation, δkt . Thus, the
investment force is stronger when kt is small and the depreciation force is stronger when kt
is large. This relationship is perhaps even clearer if we write the above equation in terms of
the growth rate:
\[ \frac{k_{t+1} - k_t}{k_t} = \frac{s f(k_t)}{k_t} - \delta, \]
where we have replaced it = sf (kt ). The assumptions f ′′(kt ) < 0 and f (0) = 0 imply that
f (kt )/kt is decreasing in kt , generating the negative relationship between the investment
force of pushing up the capital stock and the level of the capital stock. In fact, when saving
behavior (as captured by s here) is modeled explicitly as a choice, s can counteract this force
toward convergence, but it turns out not to be strong enough to overturn the convergence
result. This issue will be discussed in detail in the next chapter.
Let us go back to our motivating fact: the stability of kt /yt over time. In the basic model
here, kt /yt is of course constant in steady state, as are all the variables. Below, however,
we will see that, even in a growing economy where kt and yt keep increasing over time, the
economy settles to a situation where kt /yt is constant over time.
In the Cobb-Douglas case above, the steady-state k/y ratio can be solved out as
\[ \frac{\bar{k}}{\bar{y}} = \frac{s}{\delta}. \]
Clearly, in the long run kt /yt is larger when s is larger and when δ is smaller.
Endogenous growth Consider the situation where F1 (k, 1) is uniformly above δ/s. Figure
3.2 below draws such an example. In this case, the steady state with k̄ > 0 does not exist,
and kt keeps growing larger over time. That is, there is unbounded growth “by itself”:
growth is endogenous. This concept will be discussed more in Chapter 11 below but in a
richer model where other production inputs can also be accumulated. Here, given that one
expects decreasing returns to each input—such as capital—it is hard to take this case very
seriously.
In the special (and illustrative) case where F1 is a constant—as illustrated in the graph—
we can think of output as linear in capital: yt = Akt , with no role for labor (make α = 1
in the Cobb-Douglas setting).2 Given that labor commands about two thirds of the income
from production, this setup does not seem empirically plausible. In a setup with endogenous
growth such as this, two identical countries starting out with different capital stocks will be
forever different; the gap between them, in percentage terms, will stay constant.
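The claim about a constant percentage gap can be illustrated with a minimal sketch of the linear case yt = Akt , for which the fundamental equation becomes kt+1 = (1 − δ + sA)kt . The parameter values are illustrative assumptions chosen so that A > δ/s.

```python
def simulate_ak(k0, s, delta, A, periods):
    """Iterate k_{t+1} = (1 - delta + s * A) * k_t, the linear (AK) case."""
    path = [k0]
    for _ in range(periods):
        path.append((1.0 - delta + s * A) * path[-1])
    return path


# Illustrative values (assumptions); s * A > delta gives unbounded growth.
s, delta, A = 0.2, 0.1, 0.75
poor = simulate_ak(1.0, s, delta, A, 50)
rich = simulate_ak(4.0, s, delta, A, 50)

growth_rate = poor[1] / poor[0] - 1.0          # equals s * A - delta
ratios = [r / p for r, p in zip(rich, poor)]   # relative gap, period by period
print(growth_rate, ratios[0], ratios[-1])
```

Both economies grow at the same rate sA − δ, so the ratio of their capital stocks never changes.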
Footnote 2: One can imagine a role for labor if Yt = AKt + BL, which is CRS, but here labor would not matter
asymptotically if A > δ/s.
Poverty traps
Suppose that the production function is not globally concave in k: it has a middle section
that is convex. This could be true if there are some regions of k with increasing
returns, say, as a result of large infrastructure investments such as the building of
transportation networks. In such a case, the right-hand side of (3.5) will not be concave, and it
may cross the 45-degree line multiple times, as illustrated in Figure 3.3 below.
Clearly, in this case there are multiple steady states and at least one of the steady
states will then not be “stable”: kt will not converge to that steady state even when k0
starts very close to it (a small perturbation away from the steady state leads further
away from it). When there are multiple steady states, an economy can get stuck in
the steady state with a low k̄ (and thus a low GDP) when it starts from a low k0 .
This situation is often called the poverty trap. The gap between a poor country with a
small k0 and a rich country with a large k0 may never close in this setting. In Figure
3.3, there are three steady states, k̄1 , k̄2 , and k̄3 . Of these three, k̄1 and k̄3 are stable
steady states. When the economy starts from a very low k0 , the economy converges to
k̄1 and gets trapped in it. To escape from the trap, the capital stock would have to
be pushed up to a larger level than k̄2 , from which it would converge to k̄3 . One way
to achieve this movement is to (temporarily) encourage very high saving. If the saving
rate is raised permanently so that the (1 − δ)kt + sf (kt ) curve moves up sufficiently,
the steady states k̄1 and k̄2 will disappear and the economy converges globally to k̄3 .
The growth/development literature does not appear to have identified sufficiently large
increasing returns leading to results of the kind described here, but it is an interesting
possibility.
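To make the poverty-trap logic concrete, here is a minimal sketch with a hypothetical production function that has a convex middle section. The functional form (a concave term plus a logistic step) and all parameter values are assumptions chosen only to generate three steady states; they are not from the text.

```python
import math


def f(k):
    """Hypothetical production function with a convex middle section.

    A concave term k**0.3 plus a logistic step centered at k = 4.
    This form is an illustrative assumption, not an empirical claim.
    """
    return k ** 0.3 + 3.0 / (1.0 + math.exp(-4.0 * (k - 4.0)))


def long_run_capital(k0, s=0.2, delta=0.1, periods=2000):
    """Iterate k_{t+1} = (1 - delta) * k_t + s * f(k_t) and return the last value."""
    k = k0
    for _ in range(periods):
        k = (1.0 - delta) * k + s * f(k)
    return k


low = long_run_capital(1.0)    # starts below the middle (unstable) steady state
high = long_run_capital(5.0)   # starts above it
print(low, high)
```

Under these assumptions the two economies share all parameters but settle at very different capital stocks, depending only on where they start.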
Non-monotonic dynamics and chaos An even more radical departure from the
neoclassical setting is if f (k) declines in k at high levels of k. Conceptually, if a bakery
has no ovens, ovens have high marginal productivity, and as more ovens are added, the
marginal productivity declines, and it will become negative once there are so many ovens
in the bakery that there is neither space for bakers nor for the dough. This possibility
is more esoteric in a macroeconomic context but let us nevertheless study it briefly. So
when f (k) decreases steeply enough, the right-hand side of (3.5) will become decreasing
in kt . Illustrating this graphically, we will see that convergence, if convergence is at
all possible, will not be monotonic.a In fact, kt can exhibit forever oscillating (or even
chaotic) dynamics.b An example is drawn in Figure 3.4.
Footnote a: Locally stable dynamics will occur if the slope at the steady state is less than 1 in absolute value.
Footnote b: Chaos is a mathematical term; it involves great sensitivity to initial conditions and forever
non-monotone behavior that never settles down to a repeated pattern (a repeated pattern could be a
two-cycle): it looks “random.”
We now extend the basic model to allow for growth. The aggregate production function becomes
Yt = F (Kt , At Lt ).
There are two changes from the basic model: first, we allow the labor input (population
times hours worked per person) Lt to grow over time. Second, and more importantly, we
allow for technological progress. In this production function, the variable representing the
technology level, At , is multiplied by labor input Lt . Technological progress thus takes a
form of improving the labor input, and At Lt is often referred to as the total number of
efficiency units of labor (or effective labor). This form of technological progress is
labor-augmenting; it was introduced in the previous chapter.3 As was also asserted there, Uzawa
(1961) proved that labor-augmenting technical change is the only form of technical progress
that is consistent with exact balanced growth, that is, the growth path where aggregate
variables such as output and capital grow at a constant rate. Uzawa’s theorem is formally
stated and proved in Appendix 3.A.
We assume that the (net) growth rate of At is γ and that the growth rate of Lt is n.4
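The same manipulations of equations as in the basic model, now dividing by At Lt , yield
(1 + γ)(1 + n)k̃t+1 = (1 − δ)k̃t + sf (k̃t ), (3.7)
where k̃t ≡ Kt /(At Lt ) is capital per efficiency unit of labor and f (k̃t ) ≡ F (k̃t , 1). The
left-hand side follows from Kt+1 /(At Lt ) = (1 + γ)(1 + n)k̃t+1 . In a steady state,
k̃t+1 = k̃t = $\bar{\tilde{k}}$, so that [(1 + γ)(1 + n) + δ − 1] $\bar{\tilde{k}}$ = $s f(\bar{\tilde{k}})$.
Footnote 3: This form of technical progress is sometimes also called Harrod-neutral.
Footnote 4: n has two origins: a growing population and changes in hours worked per person, which we saw from
Chapter 2 is best characterized by a decline in the longer run.
With a Cobb-Douglas production function we can obtain a closed-form solution:
\[ \bar{\tilde{k}} = \left(\frac{s}{(1+\gamma)(1+n)+\delta-1}\right)^{\frac{1}{1-\alpha}}. \qquad (3.8) \]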
The dynamic property of the model can be analyzed similarly to the basic model. As in the
basic model, starting from any k̃0 , the sequence $\{\tilde{k}_{t+1}\}_{t=0}^{\infty}$ converges monotonically to $\bar{\tilde{k}}$. In (3.8), the rate
of technological progress γ and the population growth rate n work similarly to depreciation:
maintaining a level of k̃t = Kt /(At Lt ) is harder as A and L grow faster; i.e., each unit of
untransformed capital needs to grow faster, as if making up for depreciation.
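A minimal sketch of the closed form (3.8): it checks that the formula is a fixed point of the growth-model law of motion and that faster growth in A and L lowers steady-state capital per efficiency unit. Function names and parameter values are illustrative assumptions.

```python
def k_tilde_bar(s, delta, alpha, gamma, n):
    """Steady-state capital per efficiency unit with f(k) = k**alpha, as in (3.8)."""
    return (s / ((1.0 + gamma) * (1.0 + n) + delta - 1.0)) ** (1.0 / (1.0 - alpha))


def next_k_tilde(k, s, delta, alpha, gamma, n):
    """One step of (1 + gamma)(1 + n) k' = (1 - delta) k + s k**alpha."""
    return ((1.0 - delta) * k + s * k ** alpha) / ((1.0 + gamma) * (1.0 + n))


# Illustrative parameter values (assumptions).
s, delta, alpha, gamma, n = 0.2, 0.05, 1.0 / 3.0, 0.02, 0.01
kb = k_tilde_bar(s, delta, alpha, gamma, n)
kb_no_growth = k_tilde_bar(s, delta, alpha, 0.0, 0.0)

print(kb, next_k_tilde(kb, s, delta, alpha, gamma, n))  # the two should coincide
print(kb_no_growth)                                      # larger: less "effective depreciation"
```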
In this framework, we can analyze how various economic variables grow over time. For
example, suppose Lt is simply population size (assuming that all citizens work one unit)
and then consider income per capita, defined as yt ≡ Yt /Lt . Because Yt /(At Lt ) = f (k̃t ), in
the long run, Yt /(At Lt ) converges to $f(\bar{\tilde{k}})$. Therefore, in the long run, income per capita
converges to
\[ y_t = f(\bar{\tilde{k}})\, A_t , \]
and its growth rate is
\[ \frac{y_{t+1} - y_t}{y_t} = \frac{f(\bar{\tilde{k}})A_{t+1} - f(\bar{\tilde{k}})A_t}{f(\bar{\tilde{k}})A_t} = \frac{A_{t+1} - A_t}{A_t} = \gamma. \]
The growth in technology At is essential in sustaining long-run growth in per capita income.
Surprisingly, no other parameters affect the long-run growth of per capita income. For
example, encouraging saving (an increase in s) does not affect the long-run growth rate of
per capita income in the economy. Note that this result does not mean that the change in
s does not have any effect on economic outcome: it has an effect on the level of per capita
income, rather than the growth rate. It also has an effect on the growth rate in the short
run (when the economy is not yet on the balanced-growth path).
In the short run, k̃t changes over time and its movement has an effect on the economic
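outcome. For example, the growth rate of per capita income is now
\[ \frac{y_{t+1} - y_t}{y_t} = \frac{f(\tilde{k}_{t+1})A_{t+1}}{f(\tilde{k}_t)A_t} - 1 = (1+\gamma)\frac{f(\tilde{k}_{t+1})}{f(\tilde{k}_t)} - 1. \]
From the fundamental equation (3.7), we know that when $\tilde{k}_t < \bar{\tilde{k}}$, k̃t increases over time,
that is, k̃t+1 > k̃t . Therefore, in this case, f (k̃t+1 )/f (k̃t ) > 1 and the growth rate of yt in the
short run is larger than γ. Similarly, when $\tilde{k}_t > \bar{\tilde{k}}$, the growth rate of yt is smaller than γ. In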
other words, the Solow model predicts that income per capita of a poor country grows faster
than at rate γ and that income per capita of a rich country grows slower than at rate γ in
the short run. This difference in growth rate is another representation of the convergence
prediction of the Solow model.
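The short-run prediction can be traced numerically. The sketch below computes per capita income growth along the transition for an economy that starts below and one that starts above the steady state, assuming Cobb-Douglas production and illustrative parameter values.

```python
def per_capita_growth_path(k0, s, delta, alpha, gamma, n, periods):
    """Growth rates of y_t = A_t * f(k_tilde_t) along the transition.

    Uses (1 + gamma)(1 + n) k_{t+1} = (1 - delta) k_t + s k_t**alpha and
    y_{t+1} / y_t = (1 + gamma) * f(k_{t+1}) / f(k_t) with f(k) = k**alpha.
    """
    rates = []
    k = k0
    for _ in range(periods):
        k_next = ((1.0 - delta) * k + s * k ** alpha) / ((1.0 + gamma) * (1.0 + n))
        rates.append((1.0 + gamma) * (k_next / k) ** alpha - 1.0)
        k = k_next
    return rates


# Illustrative parameter values (assumptions).
s, delta, alpha, gamma, n = 0.2, 0.05, 1.0 / 3.0, 0.02, 0.01
k_bar = (s / ((1.0 + gamma) * (1.0 + n) + delta - 1.0)) ** (1.0 / (1.0 - alpha))

poor = per_capita_growth_path(0.25 * k_bar, s, delta, alpha, gamma, n, 400)
rich = per_capita_growth_path(2.0 * k_bar, s, delta, alpha, gamma, n, 400)
print(poor[0], poor[10], poor[-1])
print(rich[0], rich[10], rich[-1])
```

The poor economy's growth rate starts above γ and falls toward it; the rich economy's starts below γ and rises toward it.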
3.3 Stylized facts and the Solow model
The model with growth, presented above, can match various stylized facts of economic
growth. First, going back to our motivating fact on Kt /Yt , because Yt /(At Lt ) = f (k̃t ) and
Kt /(At Lt ) = k̃t are both constant along the balanced growth path, Kt /Yt = k̃t /f (k̃t ) is also
constant in the long run. Once again, the Solow model can replicate the constant Kt /Yt in
the data.
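The constancy of Kt /Yt in a growing economy can be seen in a small simulation in levels. This is a sketch under Cobb-Douglas production with illustrative parameter and initial values; under these assumptions the long-run ratio equals s/((1 + γ)(1 + n) + δ − 1).

```python
def capital_output_ratio_path(K0, A0, L0, s, delta, alpha, gamma, n, periods):
    """Simulate levels K_t, A_t, L_t and return the path of K_t / Y_t."""
    K, A, L = K0, A0, L0
    ratios = []
    for _ in range(periods):
        Y = K ** alpha * (A * L) ** (1.0 - alpha)
        ratios.append(K / Y)
        K = s * Y + (1.0 - delta) * K   # capital accumulation with I = s * Y
        A *= 1.0 + gamma                # technology grows at rate gamma
        L *= 1.0 + n                    # labor input grows at rate n
    return ratios


# Illustrative parameter and initial values (assumptions).
s, delta, alpha, gamma, n = 0.2, 0.05, 1.0 / 3.0, 0.02, 0.01
ratios = capital_output_ratio_path(1.0, 1.0, 1.0, s, delta, alpha, gamma, n, 500)
long_run = s / ((1.0 + gamma) * (1.0 + n) + delta - 1.0)
print(ratios[0], ratios[-1], long_run)
```

Capital and output both keep growing, yet their ratio settles at a constant.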
The first fact in Chapter 2 was the steady growth of the GDP per capita. As we have
seen above, the GDP per capita grows at the rate γ along the balanced growth path (towards
which the economy converges from any starting point). This fact, therefore, is consistent
with the Solow model with technological progress.
Another stylized fact is that the return to physical capital has been nearly constant.
Here, we need to first compute the return to physical capital in the model. Suppose that
firms maximize profit under competitive markets:
\[ \max_{K_t,\, A_t L_t} \; F(K_t, A_t L_t) - r_t K_t - w_t A_t L_t. \qquad (3.9) \]
Here, output is taken as the numéraire and the price is set at one. Therefore, F (Kt , At Lt )
is the revenue and rt Kt + wt At Lt is the cost. Let us assume, for simplicity, that the capital
stock is owned by the consumers and rented to the firms with the rental rate rt . Thus, rt
represents the return to capital. The other component of the cost is the wage payment: wt
is the wage per efficiency unit of labor. From the first-order condition for Kt , the return to
physical capital is equal to the marginal product of capital:
rt = F1 (Kt , At Lt ).
Differentiating both sides of the equation f (Kt /(At Lt )) = F (Kt , At Lt )/(At Lt ) with respect to
Kt , we obtain that
rt = f ′(k̃t ).
Along the balanced growth path, therefore, rt is constant because the right-hand side is
constant at $f'(\bar{\tilde{k}})$.
Another prominent fact is the stability of the labor share and the capital share. The
capital share is equal to rt Kt /Yt , and it can readily be seen that it is constant, because we
have already seen that both rt and Kt /Yt are constant along the balanced-growth path. The
labor share is wt At Lt /Yt . From the first-order condition of (3.9), the wage is equal to the
marginal product of labor:
wt = F2 (Kt , At Lt ).
When the production function has constant returns to scale, it is homogeneous of degree
one.5 It then follows that output can be written as
Yt = Kt F1 (Kt , At Lt ) + At Lt F2 (Kt , At Lt ).
Footnote 5: Recall that a function f (x) is homogeneous of degree r (is H(r)) when $f(sx) = s^r f(x)$ for all (s, x); here
x is a vector and r and s are scalars. If r = 1, it then follows, using differentiation with respect to s and
each element of x, that $f(x) = \sum_i (\partial f/\partial x_i)\, x_i$ for all x.
Dividing both sides by Yt , we obtain that the labor share is one minus the capital share.
Therefore, the labor share is also constant when the capital share is constant.
We can also confirm the stability of the labor share by direct calculation. As for the case
of rt , it can be shown, by differentiating f (Kt /(At Lt )) = F (Kt , At Lt )/(At Lt ) with respect to
At Lt , that
wt = f (k̃t ) − k̃t f ′(k̃t ).
It can readily be seen that wt is constant when k̃t is constant at $\bar{\tilde{k}}$. Because At Lt /Yt =
1/f (k̃t ), it is also constant along the balanced growth path. Therefore, wt At Lt /Yt is also
constant. Note that, for a Cobb-Douglas production function $Y_t = K_t^{\alpha}(A_t L_t)^{1-\alpha}$, the capital
share is α and the labor share is 1 − α regardless of the values of Kt and At Lt , and therefore
the factor shares are constant even outside the balanced-growth path.
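The factor-price formulas can be checked numerically for the Cobb-Douglas case. This minimal sketch evaluates rt = f ′(k̃t ) and wt = f (k̃t ) − k̃t f ′(k̃t ) at several arbitrary values of k̃ (illustrative assumptions) and computes the implied shares.

```python
def factor_prices(k_tilde, alpha):
    """Return (r, w) for f(k) = k**alpha: r = f'(k), w = f(k) - k * f'(k)."""
    f = k_tilde ** alpha
    f_prime = alpha * k_tilde ** (alpha - 1.0)
    return f_prime, f - k_tilde * f_prime


alpha = 1.0 / 3.0                       # illustrative value (assumption)
for k_tilde in (0.5, 2.0, 8.0):         # on or off the balanced growth path
    r, w = factor_prices(k_tilde, alpha)
    output = k_tilde ** alpha           # Y / (A L)
    capital_share = r * k_tilde / output
    labor_share = w / output            # w * A L / Y
    print(k_tilde, round(capital_share, 6), round(labor_share, 6))
```

The shares come out as α and 1 − α at every k̃, while r and w themselves vary with k̃.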
3.4 Convergence
We have already seen that, in the Solow model, the economy monotonically converges to the
steady state (or balanced growth path). Here, we look at this convergence property in more
detail and take a quick look at the data.
In the basic model, the fundamental equation (3.5) can be approximated around the steady state by
∆kt+1 = (1 − δ + sf ′(k̄))∆kt , (3.10)
where ∆kt represents the deviation of kt from its steady-state value, that is, kt − k̄, when
the deviation is small.6 When the production function is in the Cobb-Douglas form, using
the steady-state solution (3.6),
∆kt+1 = (1 − (1 − α)δ)∆kt
holds. Replacing ∆kt by kt − k̄ and dividing both sides by k̄, (3.10) can be expressed as
\[ \frac{k_{t+1} - \bar{k}}{\bar{k}} = (1-\lambda)\frac{k_t - \bar{k}}{\bar{k}}, \]
where λ ≡ δ − sf ′(k̄) represents the convergence speed. A large value of λ implies that ∆kt+1
becomes smaller (in absolute value) more quickly, implying a faster convergence. This is
illustrated in Figure 3.5 below: a higher λ represents a flatter slope at the steady state and
fewer “steps until you reach steady state.”
Figure 3.5: Slow and fast convergence
Footnote 6: This is obtained from a first-order Taylor approximation of the expression around k̄.
In the Cobb-Douglas case, using (3.6),
λ = δ(1 − α) (3.11)
holds, and thus the convergence is faster when α is small and δ is large. Note that s does not
affect the convergence speed in this case. The convergence speed, in general, is affected by
how kt affects kt+1 . In an extreme case, if kt has no effect on kt+1 (a flat line), convergence
is immediate. The parameter s affects this mechanism through two opposing forces. For a given kt ,
a larger s implies a larger impact of kt on kt+1 . However, the steady-state value of capital is
larger when s is larger, and thus the marginal product of capital at the steady state, f ′(k̄),
is smaller when s is larger, implying a smaller impact of kt on kt+1 . These two opposing
forces exactly offset each other when the production function is in the Cobb-Douglas form.7
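The offsetting of the two forces can be verified directly. The sketch below computes λ = δ − sf ′(k̄) for several saving rates under Cobb-Douglas production and compares it with δ(1 − α); the parameter values are illustrative assumptions.

```python
def convergence_speed(s, delta, alpha):
    """lambda = delta - s * f'(k_bar), with f(k) = k**alpha and k_bar from (3.6)."""
    k_bar = (s / delta) ** (1.0 / (1.0 - alpha))
    f_prime = alpha * k_bar ** (alpha - 1.0)
    return delta - s * f_prime


delta, alpha = 0.1, 1.0 / 3.0           # illustrative values (assumptions)
for s in (0.1, 0.2, 0.4):
    print(s, convergence_speed(s, delta, alpha), delta * (1.0 - alpha))
```

A higher s raises s directly but lowers f ′(k̄) in exact proportion, so the product sf ′(k̄) = αδ does not depend on s.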
In the case with growth, the fundamental equation (3.7) can be approximated by
\[ (1+\gamma)(1+n)\,\Delta\tilde{k}_{t+1} = \left(1 - \delta + s f'(\bar{\tilde{k}})\right)\Delta\tilde{k}_t . \qquad (3.12) \]
When the production function is in the Cobb-Douglas form, using the steady-state solution
(3.8),
\[ \Delta\tilde{k}_{t+1} = \left(\alpha + \frac{(1-\alpha)(1-\delta)}{(1+\gamma)(1+n)}\right)\Delta\tilde{k}_t \]
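holds. The equation (3.12) can be rewritten as
\[ \frac{\tilde{k}_{t+1} - \bar{\tilde{k}}}{\bar{\tilde{k}}} = (1-\lambda)\frac{\tilde{k}_t - \bar{\tilde{k}}}{\bar{\tilde{k}}}, \]
where, in the Cobb-Douglas case,
\[ \lambda = (1-\alpha)\left(1 - \frac{1-\delta}{(1+\gamma)(1+n)}\right). \qquad (3.13) \]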
3.4.2 Cross-country data
Is convergence observed in the data? Recall that the convergence prediction implies that
a country that starts with a smaller per-capita GDP experiences faster subsequent growth.
Figure 3.6 plots this relationship across countries. The data is taken from the Penn World
Table 10.0 (https://www.rug.nl/ggdc/productivity/pwt/). The horizontal axis is per-capita
real GDP in 1960, which we take as the starting point. The vertical axis is the subsequent
growth rate (annualized using geometric averages) in per-capita real GDP from 1960 to 2019.
[Figure 3.6: scatter plot of countries, labeled by country code. Vertical axis: average annual growth rate, 1960-2019. Horizontal axis: per-capita real GDP in 1960.]
Source: Penn World Tables 10.0. The GDP variable used is RGDPNA.
One can immediately see that there is no systematic tendency for initially poor countries
to grow faster. The fact that there is no tendency for countries to converge, however, does
not imply a rejection of the Solow model. In fact, the Solow model does not predict that
different countries will always converge to the same balanced growth path (a phenomenon
called “unconditional convergence” or “absolute convergence”). Rather, it predicts that
countries converge if they share the same parameter values (“conditional convergence”). We
know, for example, that saving rates differ widely across countries and, at least over shorter
time horizons, it is reasonable to think that the growth rates of At also differ.
To examine conditional convergence, a useful exercise is to look at the same kind of
graph restricted to a smaller, and more similar, set of countries. Figure 3.7 thus plots the
same data as Figure 3.6, but only for the original members of the Organisation for Economic
Co-operation and Development (OECD). OECD was formed by high-income countries that
share relatively similar economic and political institutions, and we can expect the underlying
parameters in the Solow model to be relatively similar among these countries.
[Figure 3.7: scatter plot of the original OECD members, labeled by country code. Vertical axis: average annual growth rate, 1960-2019. Horizontal axis: per capita GDP 1960 (in 2017 US$), in units of 10^4.]
Source: Penn World Tables 10.0. The GDP variable used is RGDPNA.
For this set of countries, we observe a clear tendency for convergence: poor countries
in 1960 on average experience faster subsequent growth. Barro and Sala-i-Martin (2004,
Chapter 12) conduct a similar exercise across regions within countries, treating each region
as a different “country.” For U.S. states, Japanese prefectures, and European regions, they
find a clear tendency of convergence, supporting the prediction of the Solow model.
In a recent paper, Kremer et al. (2020) show that in recent years, the data actually show
a tendency for unconditional convergence. Figure 3.8 below repeats the same exercise as in
Figure 3.6 for the same set of countries, but setting the initial date to 2000. We can see
that some convergence (negative correlation) is observed in the recent years. The authors
argue that this tendency arose because some of the underlying factors that affect growth (the
factors that likely affect the growth rate of At ), such as policies, institutions, and human
capital have become more similar across countries in recent years. In Chapter 11, we will
discuss this and many related issues in greater detail.
[Figure 3.8: All countries, 2000-2019. Scatter plot of countries, labeled by country code. Vertical axis: average annual growth rate, 2000-2019. Horizontal axis: per capita GDP 2000 (in 2017 US$), in units of 10^4.]
Source: Penn World Tables 10.0. The GDP variable used is RGDPNA.
If we are interested in the model’s quantitative predictions for the speed of convergence,
one way to proceed would be to simply see if it is possible to choose parameters so as to
hit the “observed λ” (e.g., as measured by the slope in Figure 3.7). More generally, one
could specify a full stochastic model, say, with explicit shocks to variables (such as At ) and
estimate the resulting structure against the data we just looked at. Clearly, we could generate
a good fit in this case if we are free to choose parameters; for example, given any production
function, we could match the λ by an appropriate choice of δ. This choice, however, may not
be consistent with what we know about depreciation rates from microeconomic data. More
generally, we would like our model’s different components (functional forms and parameter
values) to be selected to be in line with microeconomic studies (and perhaps aggregate data
too). This way, the quantitative evaluation is disciplined. As briefly discussed in Chapter
1 above, a way forward is calibration. The procedure consists of two distinct steps, each
guided by data. First, we assign parameterized functional forms to the unknown functions
in the model. In our case, the production function F corresponds to the unknown function.
Second, we assign specific values to the parameters. Because we use various moments (such
as means and variances) in the data to assign parameter values, this second step bears close
resemblance to estimation by the method of moments.
Before starting the calibration, we have to decide on the length of the time period. Here,
we are chiefly interested in movements in aggregates that occur over the medium run and
thus set one period to be one year. For the first step, we choose the Cobb-Douglas form,
used above, for the production function: $F(K_t, A_t L_t) = K_t^{\alpha}(A_t L_t)^{1-\alpha}$. As we have seen, this
functional form yields constant factor shares, which is consistent with the rather striking
data on an absence of major movements in the shares. A Cobb-Douglas function is special,
however, in that it has the property that the substitution elasticity between inputs (capital
and labor) is equal to one, so we need to make sure that it is consistent with studies of
production functions, at least at a high level of aggregation.8 These studies rarely suggest
major departures from 1.
In the second step, we need to assign values to the parameters α, δ, γ, n, and s. Here,
we have implicitly assumed that the production function has constant returns to scale, i.e.,
that the sum of the exponents on capital and labor is one, so that it suffices to choose the α.
Before discussing the parameter selections, note that for the convergence speed λ in (3.13),
the information on s is not necessary. Below we will therefore skip assigning a value to s.
Calibration usually draws on multiple data sources. Overall, there are two methods of
assigning the parameter values based on data. First, if particular parameters have been con-
sidered as important objects for investigation elsewhere and we know plausible parameter
values from past studies, it is convenient to directly assign the parameter values accordingly.
Second, we can choose data moments that involve several parameters and assign parameter
values such that these moments, when generated by our theory, line up with observed mo-
ments. The first method can be considered a special case of the second method, because the
parameter values from past studies have to come from certain data moments used in these
studies. From this perspective, an alternative interpretation of the calibration procedure is
an implementation of the method of moments with multiple data sets.
We set the value of α from national income accounting. As we have seen from the previous
section, α corresponds to the capital share. From Figure 2.12 in Chapter 2, α = 1/3 is a
good approximation. We can set the value of γ from the long-run growth rate of per capita
income in advanced countries. The population growth rate n can be measured directly from
the data. Barro and Sala-i-Martin (2004, p. 58) use γ = 0.02 and n = 0.01 as a benchmark
at the annual frequency. The fundamental equation (3.7) can be rewritten as
$$(1 + \gamma)(1 + n) = 1 - \delta + \frac{I_t}{K_t},$$
where we used $s f(\tilde{k}_t)/\tilde{k}_t = I_t/K_t$. The investment-capital ratio in the U.S. economy is
about 0.076 (Cooley and Prescott, 1995) at an annual frequency. Given the above values of
γ and n, this equation implies δ = 0.046.[9] Clearly, here, one could alternatively have used
depreciation rates directly from data on depreciation (by capital type, or from aggregate data
on depreciation rates) and then this equation would have implied a value for the average
size of I/K on a balanced growth path. Given the accounting practices and the fact that
capital remains at roughly three times annual GDP as time passes, investment must on average
make up exactly for depreciation (taking population and technology growth
into account), so either way the number for δ ends up around 0.05.

[8] The substitution elasticity is given by the percentage change in the ratio of the inputs when the ratio of
their prices changes by one percent, i.e., −d log(K/L)/d log(r/w). Using the firm’s first-order conditions, we
can see that this expression must equal 1 for a Cobb-Douglas production function.
[9] Given that I/K = (I/Y)(Y/K), we could alternatively have measured I/Y (the saving rate) and
Y/K, which we know is about 1/3. Conversely, we can now obtain the implied I/Y as 0.076/0.33, which
approximately equals 0.23.

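The arithmetic behind this calibration step can be checked in a few lines. The following is a minimal Python sketch, not part of the original text: it solves the rewritten fundamental equation for δ given γ, n, and I/K, and then uses K/Y ≈ 3 to back out the implied I/Y. The function and variable names are illustrative assumptions.

```python
def implied_depreciation(gamma, n, inv_capital_ratio):
    """Solve (1 + gamma)(1 + n) = 1 - delta + I/K for delta."""
    return 1.0 + inv_capital_ratio - (1.0 + gamma) * (1.0 + n)

# Benchmark values quoted in the text.
gamma, n = 0.02, 0.01
inv_capital_ratio = 0.076       # I/K
capital_output_ratio = 3.0      # K/Y, "roughly three times annual GDP"

delta = implied_depreciation(gamma, n, inv_capital_ratio)
inv_output_ratio = inv_capital_ratio * capital_output_ratio  # I/Y = (I/K)(K/Y)

print(f"implied delta = {delta:.4f}")
print(f"implied I/Y   = {inv_output_ratio:.3f}")
```

Running the same function in the other direction (fixing δ from depreciation data and solving for I/K) is the alternative route mentioned in the paragraph.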
With these values, we find that λ = 0.049. The empirical counterpart of λ is about 0.015
to 0.03 (Barro and Sala-i-Martin 2004, p. 59), and therefore the Solow model over-predicts
the convergence speed. The model value of λ is about 0.02 if α is raised to 0.73. An argument
for a larger value of α is that a part of labor income is the return to human capital, which
can be accumulated in a similar manner as physical capital, and thus some part of labor
income should be included in the capital share. Relatedly, one can view At as an accumulable
factor—after all, technological development is often part of conscious investments into R&D,
and hence it too is a capital stock. We return to these issues in Chapter 11.
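The sensitivity of λ to α can be explored numerically. Equation (3.13) is not reproduced in this passage, so the Python sketch below rests on an assumption: it uses the expression obtained by linearizing the law of motion of normalized capital around the balanced growth path, λ = (1 − α)[1 − (1 − δ)/((1 + γ)(1 + n))]. The function names are hypothetical.

```python
def convergence_speed(alpha, delta, gamma, n):
    """Assumed form of the convergence speed (not copied from (3.13)):
    lambda = (1 - alpha) * [1 - (1 - delta) / ((1 + gamma)(1 + n))]."""
    growth = (1.0 + gamma) * (1.0 + n)
    return (1.0 - alpha) * (1.0 - (1.0 - delta) / growth)

def alpha_for_speed(target_speed, delta, gamma, n):
    """Invert the assumed formula: the alpha that delivers a given lambda."""
    growth = (1.0 + gamma) * (1.0 + n)
    return 1.0 - target_speed / (1.0 - (1.0 - delta) / growth)

delta, gamma, n = 0.046, 0.02, 0.01   # calibrated values from the text

lam_baseline = convergence_speed(1.0 / 3.0, delta, gamma, n)
alpha_needed = alpha_for_speed(0.02, delta, gamma, n)

print(f"lambda at alpha = 1/3: {lam_baseline:.3f}")
print(f"alpha needed for lambda = 0.02: {alpha_needed:.2f}")
```

Note that s does not enter either function, in line with the remark that the saving rate is not needed for the convergence speed.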
Once the model has been assigned specific functional forms and parameter values, we
can also conduct quantitative experiments. Suppose, for example, the saving rate s equals
0.1. How would increasing s to 0.2 affect the normalized level of output along the balanced-growth
path, $f(\bar{\tilde{k}})$? With a Cobb-Douglas production function, we have already obtained
the solution for $\bar{\tilde{k}}$ in (3.8). On the balanced growth path, $s f(\bar{\tilde{k}})/\bar{\tilde{k}} = (1 + \gamma)(1 + n) - (1 - \delta)$, so that
$\bar{\tilde{k}} = \left[ s / ((1 + \gamma)(1 + n) - (1 - \delta)) \right]^{1/(1-\alpha)}$. Inserting our calibrated parameter values, along with s = 0.1,
we obtain $\bar{\tilde{k}} = 1.50$ and $f(\bar{\tilde{k}}) = \bar{\tilde{k}}^{\alpha} = 1.15$. When s goes up to 0.2, $\bar{\tilde{k}}$ rises to 4.25 and
$f(\bar{\tilde{k}}) = 1.62$. Therefore, doubling the saving rate from 10% to 20% increases the normalized
level of output by 41%, because 1.62/1.15 ≈ 1.41. In addition, we could compute the
quantitative predictions for how fast output would rise to eventually reach a 41% higher
value.
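This experiment is easy to replicate. The Python sketch below is illustrative only: it assumes the balanced-growth condition s f(k̃)/k̃ = (1 + γ)(1 + n) − (1 − δ) with f(k̃) = k̃^α, and the function names are hypothetical. The printed values can be compared with the figures quoted above.

```python
def steady_state_k(s, alpha, delta, gamma, n):
    """Normalized capital solving s * k**(alpha - 1) = (1+gamma)(1+n) - (1-delta)."""
    break_even = (1.0 + gamma) * (1.0 + n) - (1.0 - delta)
    return (s / break_even) ** (1.0 / (1.0 - alpha))

def normalized_output(k, alpha):
    """Cobb-Douglas output per effective worker, f(k) = k**alpha."""
    return k ** alpha

alpha, delta, gamma, n = 1.0 / 3.0, 0.046, 0.02, 0.01   # calibrated values

results = {}
for s in (0.1, 0.2):
    k = steady_state_k(s, alpha, delta, gamma, n)
    results[s] = (k, normalized_output(k, alpha))
    print(f"s = {s:.1f}: k = {results[s][0]:.2f}, f(k) = {results[s][1]:.2f}")

output_ratio = results[0.2][1] / results[0.1][1]
print(f"output ratio = {output_ratio:.2f}")
```

With Cobb-Douglas technology the output ratio depends only on α: doubling s multiplies balanced-growth output by 2^(α/(1−α)), whatever the values of δ, γ, and n.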
business cycle models. Throughout we conduct the analysis in per-capita terms and, hence,
use lower-case letters.
The first business cycle model is the so-called real business cycle (RBC) model. As
will be explained in Chapter 12, the RBC model provides a simple mechanism whereby
macroeconomic variables comove, as is clear in the data. The prototypical RBC model
considers the following aggregate production function
$$y_t = A_t F(k_t, \ell_t),$$
where $\ell_t$ is variable labor input, and considers a shock to $A_t$ (often called the “neutral
technology shock”). That is, it assumes that At changes over time and the movement
of At is the source of the business cycle. For example, consider a model where At
switches around between two values, AH and AL , where AH > AL . When At moves
to AH , the economy starts moving towards the corresponding steady state. Then At
switches to AL , and the economy now moves towards a different steady-state value. We
can interpret these movements as business cycles: movements around some average.
Augmented with steady growth we would have movements around the balanced path.
Note that the production function in this section differs from the Harrod-neutral
form $F(k_t, A_t \ell_t)$ used earlier. The distinction between Harrod-neutral and
Hicks-neutral technological progress is not essential, however, when the
production function is in the Cobb-Douglas form: the Harrod-neutral production function
$k_t^{\alpha} (A_t \ell_t)^{1-\alpha}$ can be rewritten as $A_t^{1-\alpha} k_t^{\alpha} \ell_t^{1-\alpha}$, and by defining $\tilde{A}_t \equiv A_t^{1-\alpha}$, the same
production function can be interpreted as the Hicks-neutral production function $\tilde{A}_t k_t^{\alpha} \ell_t^{1-\alpha}$.
If we maintain the assumptions of the basic model, we have $\ell_t$ constant and $i_t/y_t = s$
(and $c_t/y_t = 1 - s$) constant even with the shocks to $A_t$. These features are at odds
with the business cycle data. To accommodate the business cycle facts that (i) $\ell_t$
comoves positively with the business cycle, (ii) $i_t$ is more volatile than $y_t$, and (iii) $c_t$
is less volatile than $y_t$, the basic equation would have to allow for the saving rate and
$\ell_t$ to react to $A_t$ (and possibly to $k_t$). The economy evolves, therefore, following the
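difference equation
$$k_{t+1} = s_t A_t F(k_t, \ell_t) + (1 - \delta) k_t,$$
where $s_t = s(A_t, k_t)$ and $\ell_t = \ell(A_t, k_t)$ denote the saving rate and the labor input that respond to the shock.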
This is a modified form of the fundamental equation (3.5) of the basic Solow model.
Next, we consider a different kind of shock. Suppose that the final goods market
clearing condition (3.3) is modified to
$$y_t = c_t + i_t/\nu_t,$$
where $\nu_t$ moves over time (its movement is often called “investment-specific technological
progress”): when $\nu_t$ is high, it is cheaper to produce investment goods. One can
think of this equation as reflecting the two-sector structure of the economy: $y_t$ and $c_t$
are measured in consumption goods, and investment goods have a production process
that can create $i_t$ units of investment goods by using $i_t/\nu_t$ units of consumption goods.
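With $c_t = (1 - s) y_t$ as in the basic model, investment is $i_t = s \nu_t y_t$, and the fundamental equation (3.5) can now be modified to
$$k_{t+1} = s \nu_t F(k_t, \ell) + (1 - \delta) k_t;$$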
this equation can also be extended to include endogenous $s$ and $\ell$ as in the case of the
neutral technology shock.
In the third example, we consider a very different model structure: one with “demand
shocks.” First, assume that $c_t$ is exogenous and that it moves around over time. The
movement of $c_t$ serves as the (demand) shock. Suppose, further, that we maintain the
assumption that $i_t/c_t = s/(1 - s)$. Because of this assumption, since $s$ is constant, $i_t$
is proportional to $c_t$ and moves along with it. Therefore, the total demand for goods
is a function of $c_t$:
$$c_t + i_t = \frac{1}{1 - s}\, c_t.$$
When $c_t$ is not sufficiently large, $c_t + i_t$ would be less than the full capacity output
$F(k_t, \ell)$ (here we assume again that $\ell$ is given by labor-force participation: those who
want to work). Assume, then, that when there are demand shortages, total output $y_t$
is determined by the demand side, so that a fraction $u_t$ of the labor force becomes
unemployed (therefore, $u_t$ is the unemployment rate):
$$y_t = \frac{1}{1 - s}\, c_t = F(k_t, \ell(1 - u_t)).$$
From the second equality, $u_t$ can be represented as a function of $c_t$ and $k_t$: $u(c_t, k_t)$.
The fundamental equation (3.5) can therefore be modified as
$$k_{t+1} = s F(k_t, \ell(1 - u(c_t, k_t))) + (1 - \delta) k_t.$$
This framework is very Keynesian in spirit but clearly raises the question of how output
can end up below full capacity, and hence be driven by demand. In a well-functioning
market, this phenomenon could not occur. This book contains several chapters on fric-
tions that could lead to something like the setting just described, and it then becomes
central for policymakers to understand the exact nature of these frictions.
In all three cases above, we can represent kt+1 as a function of kt and a shock (At , νt , or ct ).
This representation allows us to characterize the dynamics of kt (and other macroeconomic
variables, such as yt , ct , and it ) in response to these shocks.
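The three laws of motion described above can be put side by side as functions mapping the current capital stock and the current shock into next period's capital stock. The Python sketch below is a hedged illustration, not the book's code: it assumes a Cobb-Douglas F, a constant saving rate and labor force, and illustrative parameter values; all function names are hypothetical.

```python
S, DELTA, ALPHA, ELL = 0.2, 0.046, 1.0 / 3.0, 1.0   # illustrative values

def output(k, labor, A=1.0):
    """Assumed Cobb-Douglas form: A * k**alpha * labor**(1 - alpha)."""
    return A * k ** ALPHA * labor ** (1.0 - ALPHA)

def next_k_neutral(k, A):
    """Neutral technology shock: investment is s * A * F(k, ell)."""
    return S * output(k, ELL, A) + (1.0 - DELTA) * k

def next_k_investment(k, nu):
    """Investment-specific shock: y = c + i/nu with c = (1 - s) y gives i = s * nu * y."""
    return S * nu * output(k, ELL) + (1.0 - DELTA) * k

def unemployment(c, k):
    """Solve c / (1 - s) = F(k, ell * (1 - u)) for the unemployment rate u."""
    employed = (c / ((1.0 - S) * k ** ALPHA)) ** (1.0 / (1.0 - ALPHA))
    return 1.0 - employed / ELL

def next_k_demand(k, c):
    """Demand shock: output is demand-determined, y = c / (1 - s)."""
    u = unemployment(c, k)
    return S * output(k, ELL * (1.0 - u)) + (1.0 - DELTA) * k

# Steady state without shocks, and consumption at full capacity.
k_bar = ELL * (S / DELTA) ** (1.0 / (1.0 - ALPHA))
c_bar = (1.0 - S) * output(k_bar, ELL)

print("no shock:        ", next_k_neutral(k_bar, 1.0))
print("A up by 1%:      ", next_k_neutral(k_bar, 1.01))
print("nu up by 1%:     ", next_k_investment(k_bar, 1.01))
print("c down by 1%:    ", next_k_demand(k_bar, 0.99 * c_bar))
print("unemployment at c down by 1%:", unemployment(0.99 * c_bar, k_bar))
```

In each case capital stays at its steady-state value when the shock is at its baseline and moves away from it otherwise, which is the sense in which each model delivers k at t+1 as a function of k at t and a shock.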
0, the value of At is constant at Ā. After a sufficiently long time, the value of kt settles close
to the corresponding steady-state value $\bar{k}$. Then suppose that at time 0, $A_0$ is (ε × 100)%
higher, that is, $A_0 = (1 + \varepsilon)\bar{A}$. For t = 1, 2, 3, . . . , the value of $A_t$ is $A_t = (1 + \rho^t \varepsilon)\bar{A}$, where
ρ ∈ (0, 1).
We can then generate the resulting time-path of kt , starting from k0 = k̄, with
$$k_{t+1} = s A_t k_t^{\alpha} \ell^{1-\alpha} + (1 - \delta) k_t, \tag{3.14}$$
for t = 0, 1, 2, . . . . This time path is called the impulse-response function. The time-path
of At is drawn in Figure 3.9—the impulse—along with the response of kt . For the impulse-
response function for kt we use s = 0.2, δ = 0.046, and α = 1/3. The starting value of A
and the value of $\ell$ are normalized to 1. The initial value of the shock to A, ε, is 1%, and the
persistence ρ = 0.9.
[Figure 3.9: the impulse, $A_t$ (left panel), and the response of $k_t$ (right panel), each plotted against the period from 0 to 200.]
In the figure, kt increases from its steady-state value of 9.066 to a maximum value of 9.089
and then gradually goes back to the steady-state value: a hump shape. Three properties of
the graph are worth emphasizing. First, the movement of kt is much slower than
the movement of At . The deviation of At becomes very close to zero around period 50,
whereas the response of kt is more persistent. Second, the magnitude of the response of kt is
significantly smaller than the impulse: the maximum deviation as a fraction of the steady-state
level is 9.089/9.066 − 1 = 0.0025, that is, 0.25%. This magnitude is much smaller than
the initial At deviation of 1%. Third, kt eventually comes back to the steady-state value.
The convergence force is always at work when we consider the response to recurrent shocks,
as in the business-cycle examples.
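The impulse-response experiment can be simulated directly by iterating (3.14). The following Python sketch is an illustration under the parameter values stated above (s = 0.2, δ = 0.046, α = 1/3, ε = 0.01, ρ = 0.9); the function name and the choice of 200 periods are assumptions. The exact peak it reports depends on the dating convention for when the shock first enters, so it need not match the figure to the last digit.

```python
def impulse_response(eps=0.01, rho=0.9, s=0.2, delta=0.046, alpha=1.0 / 3.0,
                     A_bar=1.0, ell=1.0, periods=200):
    """Iterate k_{t+1} = s A_t k_t^alpha ell^(1-alpha) + (1-delta) k_t from k_0 = k_bar."""
    # Steady state with A constant at A_bar: delta * k = s * A_bar * k^alpha * ell^(1-alpha).
    k_bar = ell * (s * A_bar / delta) ** (1.0 / (1.0 - alpha))
    # Impulse: A_t = (1 + rho^t * eps) * A_bar for t = 0, 1, 2, ...
    A_path = [A_bar * (1.0 + rho ** t * eps) for t in range(periods + 1)]
    k_path = [k_bar]
    for t in range(periods):
        k_t = k_path[t]
        k_path.append(s * A_path[t] * k_t ** alpha * ell ** (1.0 - alpha)
                      + (1.0 - delta) * k_t)
    return A_path, k_path, k_bar

A_path, k_path, k_bar = impulse_response()
peak_period = max(range(len(k_path)), key=k_path.__getitem__)

print(f"steady state k      = {k_bar:.3f}")
print(f"peak of k           = {k_path[peak_period]:.3f} in period {peak_period}")
print(f"peak deviation of k = {100 * (k_path[peak_period] / k_bar - 1):.2f}%")
print(f"k after 200 periods = {k_path[-1]:.3f}")
```

The three properties discussed in the text show up in the output: the peak of capital arrives well after the shock, its percentage deviation is a fraction of the 1% impulse, and capital ends up back near its steady-state value.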
the logarithms of variables from their corresponding steady-state values. An arbitrary
variable xt can be expressed as
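$$x_t = \bar{x} e^{\hat{x}_t}, \tag{3.15}$$
where
$$\hat{x}_t \equiv \log \frac{x_t}{\bar{x}}$$
is the log deviation from the steady-state value $\bar{x}$. Note that because
$$\hat{x}_t \equiv \log \frac{x_t}{\bar{x}} \approx \frac{x_t - \bar{x}}{\bar{x}},$$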
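where the approximation is the first-order Taylor expansion around $\bar{x}$, $\hat{x}_t$ can be interpreted
as the percent deviation from the steady state. Also note that
$$e^{\hat{x}_t} \approx 1 + \hat{x}_t \tag{3.16}$$
for small $\hat{x}_t$.
Consider the impulse-response experiment of the system (3.14). Using the transformation
(3.15),
$$\bar{k} e^{\hat{k}_{t+1}} = s \bar{A} e^{\hat{A}_t} (\bar{k} e^{\hat{k}_t})^{\alpha} \ell^{1-\alpha} + (1 - \delta) \bar{k} e^{\hat{k}_t}$$
holds. Using (3.16) and the fact that
$$\bar{k} = s \bar{A} \bar{k}^{\alpha} \ell^{1-\alpha} + (1 - \delta) \bar{k} \tag{3.17}$$
holds in the steady state, we obtain
$$\bar{k} \hat{k}_{t+1} = s \bar{A} \bar{k}^{\alpha} \ell^{1-\alpha} (\hat{A}_t + \alpha \hat{k}_t) + (1 - \delta) \bar{k} \hat{k}_t.$$
Because equation (3.17) implies $\delta \bar{k} = s \bar{A} \bar{k}^{\alpha} \ell^{1-\alpha}$, this equation can be rewritten as
$$\hat{k}_{t+1} = [1 - \delta(1 - \alpha)] \hat{k}_t + \delta \hat{A}_t. \tag{3.18}$$
Let us again use $\lambda \equiv \delta(1 - \alpha)$, which is the notation for the convergence speed in (3.11), so that (3.18) reads $\hat{k}_{t+1} = (1 - \lambda) \hat{k}_t + \delta \hat{A}_t$.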
In fact, the log-linearization procedure here is essentially the same as the procedure for
obtaining the percentage deviation in the convergence section, in the sense that both are
applying the first-order Taylor approximation to (3.14).
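The above impulse-response experiment corresponds to setting $\hat{A}_0 = \varepsilon$ and $\hat{A}_t = \rho^t \varepsilon$ for
t = 1, 2, 3, . . . . Solving (3.18) with this specification yields
$$\hat{k}_{t+1} = \sum_{\tau=0}^{t} \rho^{t-\tau} (1 - \lambda)^{\tau} \delta\varepsilon = \rho^{t}\, \frac{1 - \left(\frac{1-\lambda}{\rho}\right)^{t+1}}{1 - \frac{1-\lambda}{\rho}}\, \delta\varepsilon.$$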
Quantitatively, the log-linear approximation performs well in our calibrated model. The
maximum error (in the units of Kt ) of the approximation is about 0.00001. If we were to
plot the log-linear solution and the nonlinear solution in the same figure, the difference would
not be visible.
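The accuracy claim can be checked by running the nonlinear recursion and the log-linear recursion side by side. The Python sketch below is a minimal illustration under stated assumptions: it uses the log-linear law of motion with λ = δ(1 − α) and coefficient δ on the shock, sets the log deviation of A to ρ^t ε, converts the log deviation back to levels as k̄ exp(k̂), and normalizes Ā and ℓ to 1. The reported size of the error depends on these conventions.

```python
import math

def compare_solutions(eps=0.01, rho=0.9, s=0.2, delta=0.046,
                      alpha=1.0 / 3.0, periods=200):
    lam = delta * (1.0 - alpha)
    k_bar = (s / delta) ** (1.0 / (1.0 - alpha))
    ratio = (1.0 - lam) / rho

    k = k_bar          # nonlinear solution, in levels
    k_hat = 0.0        # log-linear solution, in log deviations
    max_level_error = 0.0
    max_formula_gap = 0.0
    for t in range(periods):
        A_hat = rho ** t * eps
        # One step of each recursion, taking k_t to k_{t+1}.
        k = s * (1.0 + A_hat) * k ** alpha + (1.0 - delta) * k
        k_hat = (1.0 - lam) * k_hat + delta * A_hat
        # Closed-form solution for k_hat at date t + 1.
        closed = rho ** t * (1.0 - ratio ** (t + 1)) / (1.0 - ratio) * delta * eps
        max_formula_gap = max(max_formula_gap, abs(closed - k_hat))
        max_level_error = max(max_level_error, abs(k_bar * math.exp(k_hat) - k))
    return max_level_error, max_formula_gap

max_level_error, max_formula_gap = compare_solutions()
print(f"max |log-linear - nonlinear| in units of k: {max_level_error:.2e}")
print(f"max |closed form - recursion| for k_hat:    {max_formula_gap:.2e}")
```

The second number confirms that the closed-form expression solves the linear recursion; the first measures how far the approximation is from the nonlinear path.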
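Applying the above log-linearization procedure to the production function $y_t = A_t k_t^{\alpha} \ell^{1-\alpha}$,
the log-deviation of output is
$$\hat{y}_t = \hat{A}_t + \alpha \hat{k}_t = \rho^{t} \left[ 1 + \frac{\alpha\delta}{\rho}\, \frac{1 - \left(\frac{1-\lambda}{\rho}\right)^{t}}{1 - \frac{1-\lambda}{\rho}} \right] \varepsilon$$
for t = 1, 2, . . . (for t = 0, $\hat{y}_0 = \hat{A}_0 = \varepsilon$). This equation explicitly describes how $y_t$ moves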
over the cycle, in response to the realization of the shock ε. Log-linearization allows us to
derive this explicit expression.
An alternative to the log-linear approximation is a linear approximation in levels. Let
the production function be $A_t F(k_t, \ell_t)$. Assuming that $\ell_t = 1$ for any t and defining $f(k_t) \equiv F(k_t, 1)$,
the fundamental equation (3.5) (with the modification in the production function)
can be written as
$$k_{t+1} = g(k_t, A_t),$$
where $g(k_t, A_t) \equiv (1 - \delta) k_t + s A_t f(k_t)$. Using the notation we employed earlier, the first-order Taylor approximation around the steady state is
$$\Delta k_{t+1} \approx g_k \Delta k_t + g_A \Delta A_t,$$
where $g_k$ and $g_A$ are the partial derivatives of $g(k, A)$ with respect to $k$ and $A$, respectively
(evaluated at the steady state), and $\Delta k_t = k_t - \bar{k}$ and $\Delta A_t = A_t - \bar{A}$. When $f(k_t) = k_t^{\alpha}$, we
obtain $g_k = 1 - \delta(1 - \alpha)$ and $g_A = s^{\frac{1}{1-\alpha}} \delta^{-\frac{\alpha}{1-\alpha}} \bar{A}^{\frac{\alpha}{1-\alpha}}$.
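The two partial derivatives can be verified numerically with central finite differences. The Python sketch below is an illustration assuming f(k) = k^α and the parameter values used in the impulse-response experiment; the step size h and the function names are arbitrary choices, not from the text.

```python
def g(k, A, s=0.2, delta=0.046, alpha=1.0 / 3.0):
    """Law of motion in levels: g(k, A) = (1 - delta) k + s A k^alpha."""
    return (1.0 - delta) * k + s * A * k ** alpha

s, delta, alpha, A_bar = 0.2, 0.046, 1.0 / 3.0, 1.0
k_bar = (s * A_bar / delta) ** (1.0 / (1.0 - alpha))   # steady state of g

# Central finite differences at the steady state.
h = 1e-6
g_k_numeric = (g(k_bar + h, A_bar) - g(k_bar - h, A_bar)) / (2.0 * h)
g_A_numeric = (g(k_bar, A_bar + h) - g(k_bar, A_bar - h)) / (2.0 * h)

# Closed-form expressions from the text.
g_k_formula = 1.0 - delta * (1.0 - alpha)
g_A_formula = (s ** (1.0 / (1.0 - alpha))
               * delta ** (-alpha / (1.0 - alpha))
               * A_bar ** (alpha / (1.0 - alpha)))

print(f"g_k: numeric {g_k_numeric:.6f}, formula {g_k_formula:.6f}")
print(f"g_A: numeric {g_A_numeric:.6f}, formula {g_A_formula:.6f}")
```

The coefficient on capital equals 1 − λ, the same persistence as in the log-linear recursion, which is why the two approximations imply the same convergence speed.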