Firm Heterogeneity, Endogenous Markups, and Factor Endowments
1 Introduction
We have seen in the previous lecture how to add firm heterogeneity to
the classical Krugman model of trade. We have seen in the Melitz paper
how taking firm heterogeneity into account may explain why exposure to
trade induces an increase in aggregate productivity, even without
a change in the actual technology of production, through a more
efficient reallocation of factors of production. We have seen in Chaney that
taking firm heterogeneity into account substantially changes some
predictions for the patterns of international trade. Among others, firm
heterogeneity introduces a new margin of adjustment to trade barriers,
the extensive margin, and this margin behaves differently from the intensive
margin traditionally studied. One prediction is that a higher elasticity of
substitution between goods no longer increases the sensitivity of trade
flows to trade barriers; it may actually dampen it.
In this lecture, we will try to refine our approach to international
trade with heterogeneous firms to account for some stylized facts that
these models were missing. The first important caveat of models based
on CES preferences and monopolistic competition is that mark-ups are
constant. We will see in Melitz and Ottaviano (2003), as well as in
Bernard, Eaton, Jensen and Kortum (2003) how more elaborate models
can account for the endogenous determination of mark-ups. These
models will give us a better understanding of the adjustments that take
place when a country is exposed to trade. In Bernard, Redding and
Schott (2004), we will see how the Melitz model can be augmented to
include different factors of production, and how we can then relate this
model with heterogeneous firms to the Ricardian comparative advantage
model.
2 Melitz and Ottaviano (2005)
Melitz and Ottaviano keep the monopolistic competition assumption,
but they move away from CES preferences. This will allow them to
generate endogenous mark-ups, and derive interesting properties related
to market size. Mark-ups respond to the "toughness" of competition
(which we will define precisely). In larger markets, competition will be
tougher, so that firms charge lower mark-ups, and aggregate productivity
is higher. Integration through costly trade will not entirely kill this effect,
so that larger countries, even if they are open to trade at some cost, will
still be characterized by tougher competition than others. They are
also able to describe the effect of some stylized trade policies. Trade
liberalization increases import competition and therefore reduces
mark-ups, and increases aggregate productivity (as in Melitz).
Main assumptions:
– $\gamma$ (> 0) indexes the degree of product differentiation between
varieties. In the limit of $\gamma = 0$, goods become perfectly
homogenous, so that consumers only care about the total
amount of differentiated goods they consume, not which
specific variety. As $\gamma$ increases, consumers care more and more
about the distribution of consumption over all varieties, so
that goods become more and more differentiated.
– $\alpha$ and $\eta$ (> 0) index the substitution between the
differentiated varieties and the homogenous good. A higher $\alpha$
or a lower $\eta$ shifts out the demand for differentiated varieties
relative to the homogenous good.
Note that the marginal utility of all goods is bounded from above,
so that consumers may not consume all goods, even in the absence
of fixed costs. We will use this condition to derive in equilibrium
the set of goods that are produced, and define the extensive margin
of trade. We assume though that the income is always large enough
so that consumers have a positive demand for the numeraire.
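Since there is a "choke price" for each variety (given the bound on
the marginal utility of each variety), we can define the set of goods
actually consumed, $\Omega^*$, as the largest subset of $\Omega$ such that,
$$ p(\omega) \le \frac{\gamma\alpha + \eta N \bar{p}}{\eta N + \gamma} \tag{4} $$
where $N$ is the number of varieties consumed and $\bar{p}$ is their average price.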
We can easily see that any price $p(\omega)$ must be below $\alpha$, the upper
bound on the marginal utility of a variety measured in units of the
numeraire good (which we have assumed is consumed), so that $\bar{p} \le \alpha$.
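Monopolistic competition means that firms maximize profits choosing
price or quantity, taking as given the residual demand for their
good (i.e. the prices set by their competitors, $\bar{p}$ and $N$). Using the
expression for demand in Eq. (3), total profits earned by a firm
with cost $c$ selling quantity $q$ is
$$
\begin{aligned}
\pi(q) &= \left(p(q) - c\right) q \\
&= \left(\frac{\gamma\alpha}{\eta N + \gamma} + \frac{\eta N}{\eta N + \gamma}\,\bar{p} - \frac{\gamma}{L}\,q - c\right) q
\end{aligned}
$$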
Firms with too high a cost, i.e. a cost $c$ above the threshold
$$ c_D = \frac{\gamma\alpha + \eta N \bar{p}}{\eta N + \gamma}, $$
have zero demand, and exit immediately. A firm with a
cost $c_D$ is exactly indifferent between staying in business or exiting,
$p(c_D) = c_D$. We assume that the upper bound on cost, $c_M$, is
always high enough so that in equilibrium there are some firms in
the differentiated sector ($c_D < c_M$). The threshold $c_D$ summarizes
all the information that is needed to describe the behavior of the
firms that stay in business.
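$$
\begin{aligned}
p(c) &= \frac{1}{2}\,(c_D + c) && \text{(price)} \\
\mu(c) &= \frac{1}{2}\,(c_D - c) && \text{(mark-up)} \\
q(c) &= \frac{L}{2\gamma}\,(c_D - c) && \text{(quantity)} \\
r(c) &= \frac{L}{4\gamma}\,\left(c_D^2 - c^2\right) && \text{(revenue)} \\
\pi(c) &= \frac{L}{4\gamma}\,(c_D - c)^2 && \text{(profits)}
\end{aligned}
$$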
We get the following nice properties:
– Lower cost firms earn higher revenues and profits.
– Lower cost firms set higher mark-ups. So unlike in the CES case,
productivity gains are not entirely passed on to consumers: part
of them is retained as higher mark-ups.
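As a quick numerical check of these performance measures, the sketch below evaluates them for one firm and compares the chosen quantity with nearby alternatives. The parameter values are purely illustrative assumptions, and the inverse demand $p = c_D - (\gamma/L)\,q$ is the residual demand written with the threshold $c_D$ as its intercept.

```python
# Illustrative parameter values (assumptions, not taken from the notes).
GAMMA = 2.0   # product differentiation parameter gamma
SIZE = 100.0  # market size L
C_D = 1.0     # cost cutoff c_D


def performance(c, c_d=C_D, gamma=GAMMA, size=SIZE):
    """Price, mark-up, quantity, revenue and profit of a firm with cost c."""
    price = 0.5 * (c_d + c)
    markup = 0.5 * (c_d - c)
    quantity = size / (2.0 * gamma) * (c_d - c)
    revenue = size / (4.0 * gamma) * (c_d**2 - c**2)
    profit = size / (4.0 * gamma) * (c_d - c) ** 2
    return price, markup, quantity, revenue, profit


def profit_at(q, c, c_d=C_D, gamma=GAMMA, size=SIZE):
    """Profit from selling q units when inverse demand is p = c_d - (gamma/size) * q."""
    return (c_d - gamma / size * q - c) * q


if __name__ == "__main__":
    for cost in (0.2, 0.5, 0.8):
        print(cost, performance(cost))
```

Revenue equals price times quantity and profit equals mark-up times quantity, so the five expressions are mutually consistent; lowering `c` raises the mark-up, as stated in the second property.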
This is exactly the same distribution as in Chaney (2006). There,
the labor productivity, $\varphi = 1/c$, was drawn from a Pareto over
$[1, +\infty)$ with a shape parameter $\gamma$, so it's exactly the same as
here replacing $k = \gamma$ and $c_M = 1$. As in Chaney (2006), $k$ is an
inverse measure of the dispersion of labor productivity, or labor
cost. A high $k$ means that most of the cost draws are concentrated
around $c_M$, whereas $k = 1$ corresponds to a uniform distribution
over $[0, c_M]$. With this specific functional form, we get the simple
closed form solutions,
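$$ c_D = \left[\frac{2(k+1)(k+2)\,\gamma\, c_M^{k}\, f_E}{L}\right]^{\frac{1}{k+2}} \tag{FE} $$
$$ N = \frac{2(k+1)\,\gamma}{\eta}\,\frac{\alpha - c_D}{c_D} \tag{ZCP} $$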
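The free entry cutoff can be checked numerically. The sketch below assumes the cost distribution $G(c) = (c/c_M)^k$ on $[0, c_M]$ and the profit function $\pi(c) = \frac{L}{4\gamma}(c_D - c)^2$, integrates expected profits with a simple midpoint rule, and compares the result with the entry cost $f_E$. All parameter values and function names are illustrative assumptions.

```python
def pareto_cost_pdf(c, k, c_m):
    """Density of the assumed cost distribution G(c) = (c / c_m)**k on [0, c_m]."""
    return k * c ** (k - 1) / c_m**k


def cutoff(k, gamma, c_m, f_e, size):
    """Closed-form cost cutoff implied by free entry (FE)."""
    return (2.0 * (k + 1) * (k + 2) * gamma * c_m**k * f_e / size) ** (1.0 / (k + 2))


def n_firms(k, gamma, eta, alpha, c_d):
    """Closed-form number of surviving firms (ZCP)."""
    return 2.0 * (k + 1) * gamma / eta * (alpha - c_d) / c_d


def expected_profit(c_d, k, gamma, c_m, size, steps=100_000):
    """Midpoint-rule approximation of the integral of pi(c) dG(c) over [0, c_d]."""
    h = c_d / steps
    total = 0.0
    for j in range(steps):
        c = (j + 0.5) * h
        profit = size / (4.0 * gamma) * (c_d - c) ** 2
        total += profit * pareto_cost_pdf(c, k, c_m) * h
    return total


if __name__ == "__main__":
    c_d = cutoff(k=2.0, gamma=2.0, c_m=1.0, f_e=1.0, size=100.0)
    print("cutoff:", c_d)
    print("expected profit:", expected_profit(c_d, 2.0, 2.0, 1.0, 100.0))
    print("number of firms:", n_firms(2.0, 2.0, 1.0, 2.0, c_d))
```

At the closed-form cutoff, expected profits should match the entry cost up to integration error, and doubling `size` lowers the cutoff, which is the market size effect discussed next.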
Size matters:
These results are consistent with stylized facts from the IO
literature, mainly Campbell and Hopenhayn (2002) and Syverson
(2004, 2005). Campbell and Hopenhayn look at the retail sector,
and find that larger markets have higher average size (measured
in sales or employment), as well as more dispersed sizes. Syverson
looks at sectors where real output (quantities) is measurable
(cement and concrete), so that he can recover unit prices (this is a
unique example of reliable unit price data at the firm level!). He
finds larger plants, higher productivity, and tougher competition
(less dispersed productivity as well as a higher lower bound for
productivity) in larger markets.
discounted sum of future profits. Those profits include both profits
earned from domestic sales, and potentially profits earned from
foreign sales. The free entry condition imposes that these two are
equalized,
$$ \int_0^{c_D} \pi_D(c)\, dG(c) + \int_0^{c_X} \pi_X(c)\, dG(c) = f_E \tag{10} $$
The impact of trade on prices, mark-ups, sizes, welfare:
We also see that if trade is costly ($\tau > 1$), trade does not entirely
integrate markets. This is obvious from the fact that size still
matters:
– the larger country has a lower cost cutoff, higher average
productivity and product variety, and lower mark-ups and prices
(consumers benefit from all these combined effects).
– with that specific functional form, the size of one's trading
partner does not affect domestic variables. Even though a
larger trading partner represents increased export opportunities,
this is offset by increased competition. Similarly, even
though a larger trading partner represents increased import
competition, exit in the long run reduces the number of
entrants and offsets the competition effect.
Aggregate exports:
$$ X = \frac{c_M^{-k}}{2\gamma\,(k+2)}\, N_E\, L\, c_D^{\,k+2}\, \tau^{-k} $$
parameter driving the substitutability between varieties ($\gamma$ here)
does not affect the sensitivity of aggregate exports to trade barriers.
Only the distribution of productivity shocks (indexed by $k$)
matters.
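The role of $k$ can be illustrated numerically. The sketch below assumes that an exporter with cost $c$ faces a delivered cost $\tau c$ in the destination market, earns revenue $\frac{L}{4\gamma}\left(c_D^2 - (\tau c)^2\right)$ there, and that costs are drawn from $G(c) = (c/c_M)^k$; the destination cutoff $c_D$ is held fixed. Function names and parameter values are illustrative assumptions.

```python
import math


def export_sales_numeric(n_e, tau, c_d, k, gamma, c_m, size, steps=100_000):
    """Aggregate export sales by midpoint integration over exporters' costs."""
    c_x = c_d / tau  # export cutoff: the delivered cost equals the destination cutoff
    h = c_x / steps
    total = 0.0
    for j in range(steps):
        c = (j + 0.5) * h
        revenue = size / (4.0 * gamma) * (c_d**2 - (tau * c) ** 2)
        density = k * c ** (k - 1) / c_m**k
        total += revenue * density * h
    return n_e * total


def export_sales_closed(n_e, tau, c_d, k, gamma, c_m, size):
    """Closed form obtained by integrating the expression above analytically."""
    return n_e * size * c_d ** (k + 2) * tau ** (-k) / (2.0 * gamma * (k + 2) * c_m**k)


def trade_elasticity(tau_low, tau_high, **params):
    """Log change in exports per log change in the trade barrier."""
    low = export_sales_closed(tau=tau_low, **params)
    high = export_sales_closed(tau=tau_high, **params)
    return math.log(high / low) / math.log(tau_high / tau_low)


if __name__ == "__main__":
    params = dict(n_e=10.0, c_d=0.8, k=2.0, gamma=2.0, c_m=1.0, size=100.0)
    print(export_sales_numeric(tau=1.3, **params))
    print(export_sales_closed(tau=1.3, **params))
    print(trade_elasticity(1.1, 1.2, **params))
```

The elasticity returned by `trade_elasticity` equals $-k$ whatever value is chosen for `gamma`, which is the point made in the text.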
Set-up
Preferences:
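a quantity $q_n(\omega)$ of each variety $\omega$, and derive a utility,
$$ U_n = \left[\int_0^1 q_n(\omega)^{\frac{\sigma-1}{\sigma}}\, d\omega\right]^{\frac{\sigma}{\sigma-1}} \tag{11} $$
with the price index
$$ P_n = \left[\int_0^1 p_n(\omega)^{1-\sigma}\, d\omega\right]^{\frac{1}{1-\sigma}} $$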
Fréchet distributions:
In each country, there are many different firms. Each of these firms
gets a random productivity draw. In each country, only the best
technology will be used, so that only the minimal cost is used. The
highest productivity for producing good $\omega$ in country $i$,
$z_i^1(\omega)$ (note that the superscript 1 denotes the best draw), which
delivers the lowest cost, is drawn from a Fréchet distribution, $F_i$:
then the distribution of the highest productivity is exactly as given
in Eq. (13). This proposition can be proven by looking at the
different possible orders of $z_i^1(\omega)$, $z_i^2(\omega)$, $z_1$ and $z_2$, and the
respective probabilities of these different orders. Note that setting
$z_1 = z_2$ returns the initial Fréchet distribution from Eq. (13).
$T_i$ scales up the technology of all goods in country $i$. It is a measure
of the absolute advantage of country $i$. The parameter $\theta$, which
we assume is the same in all countries, is an inverse measure of
the heterogeneity in productivity between different sectors. It will
index the strength of comparative advantages between countries.
The cheapest version of good $\omega$ in country $n$, looking at all
potential source countries, is
From this, we can derive the probability that country $i$ is the
cheapest provider of a given good $\omega$ in country $n$. Given that we have
a continuum of goods, the law of large numbers holds, and this
probability is exactly the share of goods that are imported from $i$
to $n$,
$$ \pi_{in} = \frac{T_i\,(\tau_{in} w_i)^{-\theta}}{\Phi_n}, \qquad \Phi_n = \sum_k T_k\,(\tau_{kn} w_k)^{-\theta} $$
This share of goods imported from $i$ (and we'll see later that this
is also the share of imports in nominal terms) depends on the
productivity of country $i$ (scaled by the trade barriers between $i$
and $n$), relative to the productivity of all other trading partners of
$n$.
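A small Monte Carlo experiment illustrates this law-of-large-numbers argument. The sketch assumes that the best efficiency in each country is Fréchet distributed with $F_i(z) = \exp(-T_i z^{-\theta})$ and that the cost of delivering a good from $i$ to $n$ is $\tau_{in} w_i / z$. The three-country parameter values and all function names are hypothetical.

```python
import math
import random


def frechet_draw(t, theta, rng):
    """Inverse-CDF draw from the assumed form F(z) = exp(-t * z**(-theta))."""
    u = rng.random()
    while u == 0.0:  # avoid log(0)
        u = rng.random()
    return (-math.log(u) / t) ** (-1.0 / theta)


def simulated_shares(tech, wages, tau_to_n, theta, n_goods, seed=0):
    """Share of goods for which each source country is the cheapest supplier to n."""
    rng = random.Random(seed)
    wins = [0] * len(tech)
    for _ in range(n_goods):
        costs = [
            tau_to_n[i] * wages[i] / frechet_draw(tech[i], theta, rng)
            for i in range(len(tech))
        ]
        wins[costs.index(min(costs))] += 1
    return [w / n_goods for w in wins]


def predicted_shares(tech, wages, tau_to_n, theta):
    """Closed-form shares T_i (tau_in w_i)**(-theta) / Phi_n."""
    terms = [t * (d * w) ** (-theta) for t, w, d in zip(tech, wages, tau_to_n)]
    phi_n = sum(terms)
    return [x / phi_n for x in terms]


if __name__ == "__main__":
    tech, wages, tau_to_n = [1.0, 2.0, 0.5], [1.0, 1.2, 0.8], [1.0, 1.3, 1.5]
    print(simulated_shares(tech, wages, tau_to_n, theta=4.0, n_goods=200_000))
    print(predicted_shares(tech, wages, tau_to_n, theta=4.0))
```

With a large number of goods, the simulated frequencies should be close to the closed-form shares.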
Bertrand competition:
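$$ p_n(\omega) = \mu(\omega)\, c_n^1(\omega) = \min\left\{ c_n^2(\omega),\ \bar{m}\, c_n^1(\omega) \right\} \tag{20} $$
$$ \text{with } \bar{m} = \begin{cases} \dfrac{\sigma}{\sigma-1}, & \sigma > 1 \\ \infty, & \sigma \le 1 \end{cases} $$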
and the second draw are sufficiently far apart. Otherwise, the
mark-up depends on the realization of the first and second draw
of productivity (or cost).
If country $i$ is the cheapest provider of good $\omega$ in country $n$, it
means it has the lowest cost, and therefore we know that the lowest
cost for that good $\omega$ in country $n$ must be such that,
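From this result and the distribution of the highest and the second
highest productivities in Eq. (14), we get the joint distribution of
the lowest and the second lowest costs from country $i$ in country
$n$. It is convenient to work with the complementary distributions
for a moment,
$$
\begin{aligned}
G_{in}^c(c_1, c_2) &= \Pr\left[c_{in}^1(\omega) \ge c_1,\ c_{in}^2(\omega) \ge c_2\right] \\
&= \Pr\left[z_i^1(\omega) \le \frac{\tau_{in} w_i}{c_1},\ z_i^2(\omega) \le \frac{\tau_{in} w_i}{c_2}\right] \\
&= \left[1 + T_i\,(\tau_{in} w_i)^{-\theta}\left(c_2^{\theta} - c_1^{\theta}\right)\right] \exp\left[-T_i\,(\tau_{in} w_i)^{-\theta}\, c_2^{\theta}\right]
\end{aligned}
\tag{22}
$$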
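From this, and from Eq. (21), we get the complementary joint
distribution of the lowest and the second lowest cost in country $n$,
unconditional on the origin country,
$$
\begin{aligned}
G_n^c(c_1, c_2) &= \Pr\left[c_n^1(\omega) \ge c_1,\ c_n^2(\omega) \ge c_2\right] \\
&= \Pr\left[\begin{array}{l}
\text{"both the lowest and the second lowest draws are above } c_2\text{"} \\
\text{or} \\
\text{"the lowest cost is in } [c_1, c_2] \text{ and the second lowest above } c_2\text{"}
\end{array}\right] \\
&= \Pr\left[\begin{array}{l}
\text{"} c_{in}^1(\omega) \ge c_2 \text{ and } c_{in}^2(\omega) \ge c_2 \text{ in all } i\text{'s"} \\
\text{or, for some } i, \\
\text{"} c_{kn}^1(\omega) \ge c_2 \text{ and } c_{kn}^2(\omega) \ge c_2 \text{ in all } k \ne i\text{"} \\
\text{and "} c_1 \le c_{in}^1(\omega) < c_2 \text{ and } c_{in}^2(\omega) \ge c_2 \text{ in } i\text{"}
\end{array}\right] \\
&= \prod_i G_{in}^c(c_2, c_2) + \sum_i \left\{ \left[G_{in}^c(c_1, c_2) - G_{in}^c(c_2, c_2)\right] \prod_{k \ne i} G_{kn}^c(c_2, c_2) \right\} \\
&= \prod_i e^{-T_i (\tau_{in} w_i)^{-\theta} c_2^{\theta}} + \sum_i \left\{ T_i\,(\tau_{in} w_i)^{-\theta}\left(c_2^{\theta} - c_1^{\theta}\right) e^{-T_i (\tau_{in} w_i)^{-\theta} c_2^{\theta}} \prod_{k \ne i} e^{-T_k (\tau_{kn} w_k)^{-\theta} c_2^{\theta}} \right\} \\
&= \exp\left(-\Phi_n c_2^{\theta}\right) + \Phi_n \left(c_2^{\theta} - c_1^{\theta}\right) \exp\left(-\Phi_n c_2^{\theta}\right)
\end{aligned}
\tag{23}
$$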
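And finally, we recover the joint distribution of the lowest and
second lowest costs, unconditional on the country of origin,
$$
\begin{aligned}
G_n(c_1, c_2) &= \Pr\left[c_n^1(\omega) \le c_1,\ c_n^2(\omega) \le c_2\right] \\
&= 1 - G_n^c(0, c_2) - G_n^c(c_1, c_1) + G_n^c(c_1, c_2) \\
&= 1 - \exp\left(-\Phi_n c_1^{\theta}\right) - \Phi_n c_1^{\theta} \exp\left(-\Phi_n c_2^{\theta}\right)
\end{aligned}
\tag{24}
$$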
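As a worked step (a sketch derived from the joint distribution just stated, written as $G_n(c_1, c_2) = 1 - e^{-\Phi_n c_1^{\theta}} - \Phi_n c_1^{\theta} e^{-\Phi_n c_2^{\theta}}$ for $0 \le c_1 \le c_2$), differentiating once in each argument gives the joint density of the two lowest costs:

```latex
\begin{aligned}
\frac{\partial G_n}{\partial c_2}
  &= \theta\,\Phi_n^{2}\, c_1^{\theta}\, c_2^{\theta-1}\, e^{-\Phi_n c_2^{\theta}}, \\
\frac{\partial^{2} G_n}{\partial c_1\,\partial c_2}
  &= \theta^{2}\,\Phi_n^{2}\, c_1^{\theta-1}\, c_2^{\theta-1}\, e^{-\Phi_n c_2^{\theta}} .
\end{aligned}
```

For a given $c_2$, this density is proportional to $c_1^{\theta-1}$, so any probability for $c_1$ conditional on $c_2$ reduces to a ratio of $c_1^{\theta}$ terms.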
Distribution of mark-ups:
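Now that we have the joint distribution of the lowest and second
lowest costs in country $n$, we can describe the distribution of
mark-ups in country $n$. For all $\mu$'s such that $1 \le \mu < \bar{m}$,
$$
\begin{aligned}
\Pr\left[\mu_n(\omega) \le \mu \mid c_n^2(\omega) = c_2\right]
&= \Pr\left[\frac{c_2}{\mu} \le c_n^1(\omega) \le c_2 \,\Big|\, c_n^2(\omega) = c_2\right] \\
&= \frac{\int_{c_2/\mu}^{c_2} \left.\frac{\partial^2 G_n}{\partial c_1 \partial c_2}\right|_{c_1, c_2} dc_1}{\int_0^{c_2} \left.\frac{\partial^2 G_n}{\partial c_1 \partial c_2}\right|_{c_1, c_2} dc_1} \\
&= \frac{c_2^{\theta} - (c_2/\mu)^{\theta}}{c_2^{\theta}} \\
&= 1 - \mu^{-\theta}
\end{aligned}
\tag{25}
$$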
So the distribution of the mark-ups in country $n$, conditional on
the second lowest cost being $c_2$, is a Pareto distribution that does
not depend on $c_2$. This property is specific to the functional form
we assumed for the distribution of costs, and it is quite convenient.
The unconditional distribution will therefore be the same (we just
integrate that probability over all realizations of $c_2$, which amounts
to integrating a constant over the support of $c_2$, and we get exactly that
same constant). This was the distribution of mark-ups in country
$n$ conditional on the mark-up being below the Dixit-Stiglitz mark-up.
The unconditional distribution of mark-ups in country $n$ is
then this Pareto distribution, truncated from above by $\bar{m}$,
$$ H(\mu) = \Pr\left[\mu_n(\omega) \le \mu\right] = \begin{cases} 1 - \mu^{-\theta}, & 1 \le \mu < \bar{m} \\ 1, & \mu \ge \bar{m} = \dfrac{\sigma}{\sigma-1} \end{cases} \tag{26} $$
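The truncated Pareto form can be checked by simulation. The sketch below relies on a sampling device that is our own assumption, not something taken from the notes: if $u_1 < u_2$ are the first two arrival times of a unit-rate Poisson process, then $c^j = (u_j/\Phi_n)^{1/\theta}$ has the joint complementary distribution $[1 + \Phi_n(c_2^\theta - c_1^\theta)]\exp(-\Phi_n c_2^\theta)$. The ratio $c^2/c^1$ does not depend on $\Phi_n$, so $\Phi_n$ is left out. Parameter values are illustrative.

```python
import math
import random


def simulate_markups(theta, sigma, n_goods, seed=0):
    """Simulate mark-ups min(c2 / c1, m_bar) across goods."""
    rng = random.Random(seed)
    m_bar = sigma / (sigma - 1.0) if sigma > 1.0 else math.inf
    markups = []
    for _ in range(n_goods):
        u1 = 0.0
        while u1 == 0.0:  # guard against a zero first arrival time
            u1 = rng.expovariate(1.0)
        u2 = u1 + rng.expovariate(1.0)
        cost_ratio = (u2 / u1) ** (1.0 / theta)  # second lowest over lowest cost
        markups.append(min(cost_ratio, m_bar))
    return markups


def markup_cdf(mu, theta, m_bar):
    """Truncated Pareto distribution of mark-ups."""
    if mu < 1.0:
        return 0.0
    if mu >= m_bar:
        return 1.0
    return 1.0 - mu ** (-theta)


if __name__ == "__main__":
    theta, sigma = 3.6, 3.79
    draws = simulate_markups(theta, sigma, 200_000)
    m_bar = sigma / (sigma - 1.0)
    for mu in (1.1, 1.2, 1.3):
        empirical = sum(m <= mu for m in draws) / len(draws)
        print(mu, empirical, markup_cdf(mu, theta, m_bar))
```

The fraction of goods sold at exactly $\bar{m}$ should be close to $\bar{m}^{-\theta}$, the mass that the truncation moves to the Dixit-Stiglitz mark-up.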
– The reason is that reducing trade barriers increases the number
of potential competitors in sector $\omega$ and therefore lowers
mark-ups, but this is exactly offset by the exit of domestic
firms who used to charge the lowest mark-ups.
– Note also that the distribution of mark-ups only depends on
the heterogeneity parameters, (inverse) heterogeneity in
preferences, $\sigma$, and (inverse) heterogeneity in productivity
between firms, $\theta$. A higher heterogeneity in productivity
between firms, lower $\theta$, will increase the probability of high
mark-ups, as there is relatively more dispersion between
firms (and therefore more distance between the lowest and
the second lowest cost draw, on average). If agents see goods
as more differentiated, lower $\sigma$, firms are more likely to charge
a high mark-up, as mark-ups are truncated at a higher point.
Measured productivity:
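Dixit-Stiglitz mark-up:
$$
\begin{aligned}
H_{in}(\mu \mid z_1) &= \Pr\left[\mu_{in}(\omega) \le \mu \mid z_i^1(\omega) = z_1\right] \\
&= \Pr\left[\frac{c_n^2(\omega)}{c_n^1(\omega)} \le \mu \,\Big|\, c_{in}^1(\omega) = c_1 = \frac{\tau_{in} w_i}{z_1}\right] \\
&= \Pr\left[c_1 \le c_n^2(\omega) \le \mu c_1 \mid c_{in}^1 = c_1\right] \\
&= \frac{\int_{c_1}^{\mu c_1} \frac{\partial^2 G(c_1, c_2)}{\partial c_1 \partial c_2}\, dc_2}{\int_{c_1}^{\infty} \frac{\partial^2 G(c_1, c_2)}{\partial c_1 \partial c_2}\, dc_2} \\
&= \frac{\exp\left[-\Phi_n c_1^{\theta}\right] - \exp\left[-\Phi_n (\mu c_1)^{\theta}\right]}{\exp\left[-\Phi_n c_1^{\theta}\right]} \\
&= 1 - \exp\left[-\Phi_n \left(\mu^{\theta} - 1\right) c_1^{\theta}\right] \\
&= 1 - \exp\left[-\Phi_n \left(\mu^{\theta} - 1\right) (\tau_{in} w_i)^{\theta}\, z_1^{-\theta}\right]
\end{aligned}
\tag{27}
$$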
– The two measures of heterogeneity (heterogeneity in tastes, $\sigma$,
and heterogeneity in productivity, $\theta$) affect the distribution
of the mark-up of a single firm in the same way that they
affect the whole distribution of mark-ups: more heterogeneity
(lower $\sigma$ or $\theta$) implies higher mark-ups, on average.
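To see how a single firm's mark-up distribution moves with its efficiency, the sketch below codes the conditional distribution $1 - \exp[-\Phi_n(\mu^\theta - 1)(\tau_{in} w_i)^\theta z_1^{-\theta}]$ for $1 \le \mu < \bar{m}$, with all mass above $\bar{m}$ collapsed onto the Dixit-Stiglitz mark-up. The function name, argument names and parameter values are illustrative assumptions.

```python
import math


def conditional_markup_cdf(mu, z1, phi_n, delivered_wage, theta, sigma):
    """Pr[mark-up <= mu | best efficiency z1].

    delivered_wage stands for tau_in * w_i, so the firm's cost is delivered_wage / z1.
    """
    m_bar = sigma / (sigma - 1.0) if sigma > 1.0 else math.inf
    if mu < 1.0:
        return 0.0
    if mu >= m_bar:
        return 1.0
    c1 = delivered_wage / z1
    return 1.0 - math.exp(-phi_n * (mu**theta - 1.0) * c1**theta)


if __name__ == "__main__":
    for z1 in (0.5, 1.0, 2.0):
        print(z1, conditional_markup_cdf(1.2, z1, 1.0, 1.0, 3.6, 3.79))
```

A higher `z1` lowers the probability that the mark-up stays below any given `mu`, so more efficient firms tend to charge higher mark-ups, up to the cap $\bar{m}$.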
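It is easy to see that the model predicts that exporters will typically
be more productive than non-exporters. The model also predicts
that all exporters will also sell on their domestic market, whereas
only a fraction of domestic firms are also exporters.
To sell domestically, a domestic producer of good $\omega$ must be more
efficient than any of its foreign competitors,
$$ z_i^1(\omega) \ge \frac{w_i}{\tau_{ki} w_k}\, z_k^1(\omega), \quad \forall k \ne i $$
Exporting to $n$ is more demanding: provided trade barriers satisfy the
triangle inequality ($\tau_{kn} \le \tau_{ki}\tau_{in}$), the efficiency needed to
beat competitor $k$ in market $n$ is at least as high as the one needed at home,
$$ \frac{w_i}{\tau_{ki} w_k}\, z_k^1(\omega) \le \frac{\tau_{in} w_i}{\tau_{kn} w_k}\, z_k^1(\omega) $$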
Exporters also have larger domestic sales than non-exporters because
exporters tend to be more productive, and more productive
firms tend to be larger: more productive firms charge lower prices,
and therefore have larger market shares.
Aggregate exports:
As in Chaney (2006), or Eaton and Kortum (2002), this share only
depends on the relative trade barriers between $i$ and $n$ (scaled by
$i$'s productivity) and the trade barriers between all countries and $n$.
As in Eaton and Kortum (2002) and Chaney (2006), the sensitivity
of trade flows to trade barriers does not depend on the elasticity
of substitution between goods, $\sigma$, but only on the measure of
firm heterogeneity, $\theta$ (equivalent to the shape parameter $\gamma$ of the Pareto
distribution in Chaney). The reason why $\sigma$ drops out of the
exports expression is somewhat similar to the mechanism described
in Chaney (2006). When trade barriers between $i$ and $n$ ($\tau_{in}$)
increase, several things happen.
First, the extensive margin of trade adjusts. Because the trade
barriers are higher, there are some goods for which country $i$ is not
the cheapest producer anymore. How many of these goods there
are does not depend on $\sigma$; it only depends on the distribution of
productivity shocks, governed by $\theta$. How much market share each
of these exporters had prior to losing its edge does depend on $\sigma$,
but not on average.
Second, there are some goods for which $i$ is still the cheapest
provider. However, because the cost for $i$'s exporters has increased,
the difference between the lowest cost (from $i$) and the second
lowest (from some other country, unaffected by this change in $\tau_{in}$) has
shrunk. For some fraction of goods for which the second lowest cost
was not binding, and therefore for which the exporter from $i$ was
charging the Dixit-Stiglitz mark-up, the second lowest cost is still
not binding, so that $i$'s exporter is still charging the Dixit-Stiglitz
mark-up, and hence increases its price. How much market share
$i$'s exporter loses to other sectors depends on the elasticity
of substitution $\sigma$, as consumers substitute towards other sectors:
the bigger $\sigma$, the larger the loss in market share. For some other
exporters from $i$, the second lowest cost becomes binding, so that the firm
switches from the Dixit-Stiglitz mark-up to the constrained mark-up
(and therefore reduces its mark-up). These exporters
increase their price in $n$, but less so than the previous category
of exporters. How much they increase their price depends on $\sigma$
(which determines the Dixit-Stiglitz mark-up). How much market
share they lose to other sectors depends on $\sigma$ as well. Finally, for
some of $i$'s exporters, the second lowest cost was binding and is
still binding, so that their price is unchanged.
Third, there are goods for which the best producer in $i$ either
remains the (potential) second cheapest provider, or was the second
cheapest provider, but no longer is. Because the cost has changed
for those (potential) exporters, the price charged by the cheapest
provider (no matter where it comes from) will change. How
much market share gain that price change induces depends on the
elasticity of substitution $\sigma$ (the bigger $\sigma$, the bigger the gains in
market share of these cheapest suppliers).
However, because we have a continuum of goods, and because we
know from the distribution of mark-ups in Eq. (26) that the
distribution of prices in country $n$ is independent of any trade barriers,
those impacts of changing $\tau_{in}$ on the intensive margin of trade
exactly cancel out. The only margin of adjustment, on average, is the
extensive margin. This margin only depends on the distribution
of productivity shocks, driven by $\theta$.
Welfare:
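$$ \gamma^{-1} = \left[\frac{1 + \theta - \sigma + (\sigma - 1)\,\bar{m}^{-\theta}}{1 + \theta - \sigma}\;\Gamma\!\left(\frac{1 + 2\theta - \sigma}{\theta}\right)\right]^{1/(\sigma - 1)}. $$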
Calibration and empirical exercises
BEJK go on to calibrate their model on actual firm level data (US firms),
to test some of the predictions of their model, and to do some simulation
exercises.
Parameters:
From Eq. (29), using bilateral trade flows and aggregate output
data between the US and 47 other countries, BEJK infer trade
barrier measures, the $T_i\,(\tau_{in} w_i)^{-\theta} / \Phi_n$'s for all $i, n$'s,
$$ \frac{T_i\,(\tau_{in} w_i)^{-\theta}}{\Phi_n} = \frac{X_{in}}{X_n} $$
The data on bilateral trade flows come from Feenstra, Lipsey and
Bowen (1997). Data on aggregate output come from UNIDO (1999),
completed with data from the World Bank.
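The mapping from data to these measures is a simple ratio. The sketch below uses a made-up three-country matrix of sales and assumes that total spending in $n$ is the sum of sales from all sources, including domestic sales; neither the numbers nor the function name come from BEJK.

```python
def import_shares(bilateral):
    """Return shares[i][n] = X_in / X_n from a matrix of bilateral sales.

    bilateral[i][n] is the value sold by country i in country n, with domestic
    sales on the diagonal. X_n is taken to be the column sum (an assumption).
    """
    n_countries = len(bilateral)
    spending = [
        sum(bilateral[i][n] for i in range(n_countries)) for n in range(n_countries)
    ]
    return [
        [bilateral[i][n] / spending[n] for n in range(n_countries)]
        for i in range(n_countries)
    ]


if __name__ == "__main__":
    # Hypothetical sales matrix: rows are sources i, columns are destinations n.
    sales = [
        [80.0, 10.0, 5.0],
        [15.0, 60.0, 10.0],
        [5.0, 30.0, 85.0],
    ]
    for row in import_shares(sales):
        print(row)
```

Each column of the resulting matrix sums to one, so these ratios can be read directly as the model's $T_i(\tau_{in} w_i)^{-\theta}/\Phi_n$ terms without separately identifying technologies, wages or trade barriers.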
They take 500,000 draws from joint Fréchet distributions (for the
highest and second highest productivity) for each of the 47 countries (we
have 500,000 sectors). From these productivity draws, they know
which firm sells in which country, what price it charges, what its
total size and measured productivity are...
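One way to generate such joint draws is sketched below. It is not necessarily the procedure BEJK use: it assumes the joint distribution $\Pr[z^1 \le a, z^2 \le b] = [1 + T(b^{-\theta} - a^{-\theta})]\exp(-T b^{-\theta})$ for $a \ge b$, and obtains it by transforming the first two arrival times $u_1 < u_2$ of a unit-rate Poisson process into $z^j = (u_j/T)^{-1/\theta}$. Names and parameter values are illustrative.

```python
import math
import random


def draw_top_two(t, theta, rng):
    """Joint draw of the highest and second highest efficiency in one country."""
    u1 = 0.0
    while u1 == 0.0:  # guard against a zero arrival time
        u1 = rng.expovariate(1.0)
    u2 = u1 + rng.expovariate(1.0)
    z1 = (u1 / t) ** (-1.0 / theta)  # best draw
    z2 = (u2 / t) ** (-1.0 / theta)  # second best draw, never above z1
    return z1, z2


def simulate_sectors(t, theta, n_sectors, seed=0):
    """One (z1, z2) pair per sector for a single country."""
    rng = random.Random(seed)
    return [draw_top_two(t, theta, rng) for _ in range(n_sectors)]


if __name__ == "__main__":
    t, theta = 2.0, 3.6
    draws = simulate_sectors(t, theta, 100_000)
    share_best = sum(z1 <= 1.5 for z1, _ in draws) / len(draws)
    print(share_best, math.exp(-t * 1.5 ** (-theta)))
```

The marginal distribution of the best draw is the Fréchet form $\exp(-T z^{-\theta})$, which the last two printed numbers compare. Repeating the draw for each country and each sector, then comparing delivered costs, gives the supplier, price and mark-up of every good in every market.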
They get about right the fraction of revenue from exports:
most exporters export only a small fraction of their output. They
underestimate the fraction of export-oriented firms when they take
large ’s.
Simulations: