1-Linear SSIV

Uploaded by

Jhorland Ayala

Roadmap

Introductions
Me and This Course
(Linear) SSIV

Shock Exogeneity
Motivation
Borusyak et al. (2022)

Share Exogeneity
Motivation
Goldsmith-Pinkham et al. (2020)

Choosing an Appropriate Framework


Who Am I?

A Professor of Economics at Brown University

A big fan of instrumental variable methods:
• Lottery- and non-lottery IVs in studies of educational quality
(Angrist et al. 2016, 2017, 2021, 2022; Abdulkadiroğlu et al. 2016)
• Quasi-experimental evaluations of healthcare quality
(Hull 2020; Abaluck et al. 2021, 2022)
• IV-based analyses of discrimination and bias
(Arnold et al. 2020, 2021, 2022; Hull 2021; Bohren et al. 2022; Baron et al. 2023)
• Shift-share instruments (SSIV) and related designs
(Borusyak et al. 2022; Borusyak and Hull 2021, 2022; Goldsmith-Pinkham et al. 2022)
What is This Course?

A two-day intensive on SSIV, focusing on recent practical advances
• Highlighting key points on identification, estimation, and inference
• Emphasis on the practical: IV is meant to be used, not just studied!

Four one-hour lectures
• Please ask questions in the Discord chat!

One 70-minute coding lab
• 40 min: you, seeing how far you can get on your own (or with your classmates’ help)
• 30 min: me, live-coding solutions in Stata (we will also post R code)
Schedule
What is a (Linear) SSIV?

A weighted sum of a common set of shocks, with weights reflecting heterogeneous exposure shares: $z_\ell = \sum_n s_{\ell n} g_n$
• The shocks vary at a different “level” $n = 1, \dots, N$ than the shares $\ell = 1, \dots, L$, where we also observe an outcome $y_\ell$ & treatment $x_\ell$

We want to use $z_\ell$ to estimate parameter $\beta$ of the model $y_\ell = \beta x_\ell + \varepsilon_\ell$
• Could be a “structural” equation or a potential outcomes model
• Could be misspecified, with heterogeneous treatment effects $\beta_\ell$
• Could be a “reduced form” analysis, with $x_\ell = z_\ell$
• Could have other included controls $w_\ell$
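
As a concrete illustration of the shift-share formula, the sketch below builds the instrument for a toy example with three locations and four industries. The array names and all numbers are made up for illustration; they do not come from any application discussed in this course.

```python
import numpy as np

# Hypothetical toy data: L = 3 locations (rows), N = 4 industries (columns).
# Row l holds location l's exposure shares s_ln; here each row sums to one.
shares = np.array([
    [0.50, 0.30, 0.10, 0.10],
    [0.10, 0.10, 0.40, 0.40],
    [0.25, 0.25, 0.25, 0.25],
])

# One common shock g_n per industry (e.g. a national industry growth rate).
shocks = np.array([0.02, -0.01, 0.05, 0.00])

# z_l = sum_n s_ln * g_n: a single instrument value per location.
z = shares @ shocks
print(z)
```

The matrix product makes the structure explicit: all locations face the same shock vector, and the only source of cross-location variation in the instrument is the row of shares.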

Key question: under what assumptions does this SSIV strategy “work”?
SSIV Examples

Instrument $z_\ell = \sum_n s_{\ell n} g_n$ (shares $s_{\ell n}$, shocks $g_n$) for model $y_\ell = \beta x_\ell + \gamma' w_\ell + \varepsilon_\ell$

Bartik (1991); Blanchard and Katz (1992):
• $\beta$ = inverse local labor supply elasticity
• $x_\ell$ and $y_\ell$ = employment and wage growth in region $\ell$
• Need a labor demand shifter as an IV
• $g_n$ = national growth of industry $n$
• $s_{\ell n}$ = lagged employment shares (of industry $n$ in region $\ell$)
• $z_\ell$ = predicted employment growth due to national industry trends


SSIV Examples

Instrument $z_\ell = \sum_n s_{\ell n} g_n$ (shares $s_{\ell n}$, shocks $g_n$) for model $y_\ell = \beta x_\ell + \gamma' w_\ell + \varepsilon_\ell$

Autor, Dorn, and Hanson (2013, ADH):
• $x_\ell$ = growth of import competition in region $\ell$
• $y_\ell$ = growth of manuf. employment, unemployment, etc.
• $g_n$ = growth of China exports in manufacturing industry $n$ to 8 other (i.e. non-U.S.) countries
• $s_{\ell n}$ = 10-year lagged employment shares (over total employment)
• $z_\ell$ = predicted growth of import competition


SSIV Examples

Instrument $z_\ell = \sum_n s_{\ell n} g_n$ (shares $s_{\ell n}$, shocks $g_n$) for model $y_\ell = \beta x_\ell + \gamma' w_\ell + \varepsilon_\ell$

“Enclave instrument”, e.g. Card (2009):
• $\beta$ = inverse elasticity of substitution between native and immigrant labor of some skill level (need a relative labor supply instrument)
• $x_\ell$ and $y_\ell$ = relative employment and wage in region $\ell$
• $g_n$ = national immigration growth from origin country $n$
• $s_{\ell n}$ = lagged shares of migrants from origin $n$ in region $\ell$
• $z_\ell$ = share of migrants predicted from enclaves & recent growth


SSIV Examples

Instrument $z_\ell = \sum_n s_{\ell n} g_n$ (shares $s_{\ell n}$, shocks $g_n$) for model $y_\ell = \beta x_\ell + \gamma' w_\ell + \varepsilon_\ell$

Hummels et al. (2014) on offshoring:
• $\beta$ = effect of imports on wages
• $x_\ell$ = imports by Danish firm $\ell$, $y_\ell$ = wages
• $g_n$ = changes in transport costs by $n$ = (product, country)
• $s_{\ell n}$ = lagged import shares
• $z_\ell$ = predicted change in firm inputs via transport costs


What Do We Do With This?

Of course, we can always run IV with such $z_\ell$ ... but what does the corresponding estimand identify?

Recall IV validity condition: $E\left[\frac{1}{L}\sum_\ell z_\ell \varepsilon_\ell\right] = 0$ for model residual $\varepsilon_\ell$
• Looks a little different than normal because we’re not assuming i.i.d. sampling (i.e. that $E\left[\frac{1}{L}\sum_\ell z_\ell \varepsilon_\ell\right] = E[z_\ell \varepsilon_\ell]$; you’ll see why soon!)

What properties of shocks and shares make this condition hold?
• Is SSIV like a natural experiment? A diff-in-diff? Something new?
• Since $z_\ell$ combines multiple sources of variation, it can be difficult to think about it being randomly assigned across $\ell$ (unlike a lottery IV)
Shock Exogeneity


Exogenous Shocks in Industry-Level Regressions

Acemoglu-Autor-Dorn-Hanson-Price (AADHP, 2016) look at the effects of import competition with China on US manufacturing industries:

$\Delta \log Emp_{nt} = \alpha + \beta \Delta IP_{nt} + \varepsilon_{nt}$,

where $\Delta IP_{nt}$ measures growth in import penetration from China in industry $n$, and $\varepsilon_{nt}$ captures industry demand/productivity shocks

Two key problems with OLS estimation:
1. Endogeneity of $\Delta IP_{nt}$: OLS is not consistent for $\beta$
2. GE spillovers: $\beta$ does not capture aggregate effects


Problem 1: Endogeneity of $\Delta IP_{nt}$

$\Delta \log Emp_{nt} = \alpha + \beta \Delta IP_{nt} + \varepsilon_{nt}$

$\Delta IP_{nt}$ is driven by productivity shocks in China, but also potentially by productivity and demand shocks in the US
• $\varepsilon_{nt}$ captures productivity and demand shocks in the US

AADHP instrument $\Delta IP_{nt}$ with $\Delta IPO_{nt}$, measuring average Chinese import penetration growth in 8 non-US countries
• Relevance: both $\Delta IP_{nt}$ and $\Delta IPO_{nt}$ are driven by the same Chinese productivity shocks
• Validity: local productivity/demand shocks in the US are uncorrelated with those of other countries (entering $\Delta IPO_{nt}$)
Identification from a Natural Experiment

Suppose $\Delta IPO_{nt}$ is as-good-as-randomly assigned, as in an RCT:

$E[\Delta IPO_{nt} \mid \mathcal{I}] = \mu$ for all $n, t$

where $\mathcal{I} = \{\varepsilon_{nt}, \text{pre-trends}, \text{balance variables}, \dots\}$

Consistent IV estimation then follows from many observations of $nt$, with sufficiently independent variation in $\Delta IPO_{nt}$
Identification from a Natural Experiment

Can relax to add observables capturing systematic variation:

$E[\Delta IPO_{nt} \mid \mathcal{I}] = q_{nt}'\mu$ for all $n, t$

where $q_{nt}$ may include:
• period FE, isolating within-period variation in the shocks
• FE of 10 broad sectors, isolating within-sector variation, etc.

We would then just want to control for $q_{nt}$ in the industry-level IV


Problem 2: GE Spillovers

Spillovers across different industries are likely important:
• When employment shrinks in industry $n$ after a negative shock, aggregate employment may or may not respond
• In a flexible labor market, comparing wages of similar workers across industries does not make sense
Problem 2: GE Spillovers

ADH Solution: specify the outcome equation for local labor markets
• Works if local economies are isolated “islands” (simple model in Adão-Kolesár-Morales 2019; richer structure of spatial spillovers in Adão-Arkolakis-Esposito 2020)

But correct specification is not the same as identification!
• Key point: the same industry-level natural experiment can be used to estimate a regional specification, via SSIV
Borusyak, Hull, and Jaravel (BHJ; 2022)

Consider the SSIV estimator of $y_\ell = \beta x_\ell + \gamma' w_\ell + \varepsilon_\ell$ instrumented by $z_\ell = \sum_n s_{\ell n} g_n$ and, for now, $\sum_n s_{\ell n} = 1$ for all $\ell$
• Reduced-form allowed: $x_\ell = z_\ell$
• Only the shift-share structure of $z_\ell$ matters; $x_\ell$ can be anything
• Note: view $g_n$ as stochastic, so can’t assume $z_\ell$ is iid

E.g. $g_n = \Delta IPO_n$ aggregated w/mfg employment shares $s_{\ell n}$
• Can we leverage a natural experiment in $g_n$, as before?


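Leveraging $g_n$
Shift-Share Estimand

Consider the SSIV estimator of $y_\ell = \beta x_\ell + \gamma' w_\ell + \varepsilon_\ell$ instrumented by $z_\ell = \sum_n s_{\ell n} g_n$ and, for now, $\sum_n s_{\ell n} = 1$ for all $\ell$

First step: note that by the FWL thm., the estimator can be written

$$\hat\beta = \frac{\sum_\ell z_\ell y_\ell^\perp}{\sum_\ell z_\ell x_\ell^\perp} = \frac{\sum_\ell \sum_n s_{\ell n} g_n y_\ell^\perp}{\sum_\ell \sum_n s_{\ell n} g_n x_\ell^\perp}$$

where $v_\ell^\perp$ denotes sample residuals from regressing $v_\ell$ on $w_\ell$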


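Leveraging $g_n$
BHJ Numerical Equivalence

BHJ show $\hat\beta$ can be obtained from a shock-level IV procedure that uses $g_n$ to instrument for a shock-level “aggregate” of the treatment:

$$\hat\beta = \frac{\frac{1}{L}\sum_\ell \sum_n s_{\ell n} g_n y_\ell^\perp}{\frac{1}{L}\sum_\ell \sum_n s_{\ell n} g_n x_\ell^\perp} = \frac{\sum_n g_n \sum_\ell \frac{1}{L} s_{\ell n} y_\ell^\perp}{\sum_n g_n \sum_\ell \frac{1}{L} s_{\ell n} x_\ell^\perp} = \frac{\sum_n s_n g_n \bar y_n^\perp}{\sum_n s_n g_n \bar x_n^\perp},$$

where $s_n = \frac{1}{L}\sum_\ell s_{\ell n}$ are weights capturing the average importance of shock $n$, and $\bar v_n = \frac{\sum_\ell s_{\ell n} v_\ell}{\sum_\ell s_{\ell n}}$ is an exposure-weighted average of $v_\ell$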
Leveraging $g_n$
BHJ Numerical Equivalence

$$\hat\beta = \frac{\sum_n s_n g_n \bar y_n^\perp}{\sum_n s_n g_n \bar x_n^\perp}$$

The IV estimate from the original observation-level IV procedure is equivalent to an “industry-level” IV regression with model $\bar y_n^\perp = \alpha + \bar x_n^\perp \beta + \bar\varepsilon_n$ instrumented by $g_n$ with weights $s_n$.

The residual $\bar\varepsilon_n$ of this shock-level IV procedure is the average residual of observations with a high share of $n$
• E.g. in ADH, the average unobserved determinants of regional employment in regions most specialized in industry $n$
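
The equivalence can be checked numerically. The sketch below simulates data and computes the estimate both ways; the data-generating process, sample sizes, seed, and variable names are assumptions made purely for illustration, and the only control is a constant.

```python
import numpy as np

rng = np.random.default_rng(0)
L, N = 200, 30                                   # locations, shocks (assumed sizes)

shares = rng.dirichlet(np.ones(N), size=L)       # s_ln, each row sums to one
shocks = rng.normal(size=N)                      # g_n
z = shares @ shocks                              # z_l = sum_n s_ln g_n

# Hypothetical data-generating process with beta = 2 and an endogenous x.
eps = rng.normal(size=L)
x = z + 0.5 * eps + rng.normal(size=L)
y = 2.0 * x + eps

def residualize(v, W):
    """Residuals from an OLS regression of v on the columns of W."""
    coef, *_ = np.linalg.lstsq(W, v, rcond=None)
    return v - W @ coef

W = np.ones((L, 1))                              # controls w_l: just a constant
y_perp, x_perp = residualize(y, W), residualize(x, W)

# Observation-level SSIV estimate in its FWL form.
beta_location = (z @ y_perp) / (z @ x_perp)

# Shock-level aggregates.
s_n = shares.mean(axis=0)                        # s_n = (1/L) sum_l s_ln
y_bar = (shares.T @ y_perp) / shares.sum(axis=0) # exposure-weighted averages
x_bar = (shares.T @ x_perp) / shares.sum(axis=0)

# Shock-level IV written as the ratio of s_n-weighted sums.
beta_shock = np.sum(s_n * shocks * y_bar) / np.sum(s_n * shocks * x_bar)

# Shock-level IV with a constant: demean g_n using the weights s_n.
g_dm = shocks - np.sum(s_n * shocks)
beta_shock_const = np.sum(s_n * g_dm * y_bar) / np.sum(s_n * g_dm * x_bar)

print(beta_location, beta_shock, beta_shock_const)
```

All three numbers agree up to floating-point error. The constant at the shock level is innocuous here because, with complete shares and a constant among the observation-level controls, the $s_n$-weighted means of $\bar y_n^\perp$ and $\bar x_n^\perp$ are zero.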
It follows that β̂ is consistent iff this shock-level IV procedure is...
BHJ Baseline Assumptions
A1 (Quasi-random shock assignment): $E[g_n \mid \bar\varepsilon, s] = \mu$, for all $n$
• Each shock has the same expected value, conditional on the shock-level unobservables $\bar\varepsilon_n$ and average exposure $s_n$
• Implies SSIV exogeneity, as $z_\ell = \mu + \sum_n s_{\ell n}(g_n - \mu) = \mu$ + “noise”
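
One way to spell out why A1 delivers exogeneity is sketched below. It relies on two assumptions: that shares sum to one for every observation (as assumed so far) and that the residual has mean zero, $E\big[\frac{1}{L}\sum_\ell \varepsilon_\ell\big] = 0$ (for instance because the controls include a constant). The steps only use the definitions of $s_n$ and $\bar\varepsilon_n$ from the numerical equivalence.

```latex
\begin{aligned}
E\Big[\tfrac{1}{L}\sum_\ell z_\ell \varepsilon_\ell\Big]
 &= E\Big[\sum_n s_n g_n \bar\varepsilon_n\Big]
 && \text{(regroup by shock: } s_n \bar\varepsilon_n = \tfrac{1}{L}\sum_\ell s_{\ell n}\varepsilon_\ell\text{)} \\
 &= E\Big[\sum_n s_n \bar\varepsilon_n \, E[g_n \mid \bar\varepsilon, s]\Big]
 && \text{(iterated expectations)} \\
 &= \mu \, E\Big[\sum_n s_n \bar\varepsilon_n\Big]
 && \text{(A1)} \\
 &= \mu \, E\Big[\tfrac{1}{L}\sum_\ell \varepsilon_\ell \sum_n s_{\ell n}\Big]
  = \mu \, E\Big[\tfrac{1}{L}\sum_\ell \varepsilon_\ell\Big] = 0
 && \text{(complete shares, mean-zero residual)}
\end{aligned}
```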
BHJ Baseline Assumptions

A2 (Many uncorrelated shocks):
• $E\left[\sum_n s_n^2\right] \to 0$: expected Herfindahl index of average shock exposure converges to zero (implies $N \to \infty$)
• $Cov(g_n, g_{n'} \mid \bar\varepsilon, s) = 0$ for all $n' \neq n$: shocks are mutually uncorrelated given the unobservables
• Imply a shock-level law of large numbers: $\sum_n s_n g_n \bar\varepsilon_n \xrightarrow{p} 0$

Both assumptions, while novel for SSIV, would be standard for a shock-level IV regression with weights $s_n$ and instrument $g_n$
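
The Herfindahl condition in A2 is easy to inspect in a given dataset. The sketch below computes the Herfindahl index of the average exposure weights and its inverse, which can be read as an effective number of shocks. The function name is hypothetical, and rescaling the weights to sum to one (relevant when shares are incomplete) is an assumption of this sketch rather than something stated on the slides.

```python
import numpy as np

def effective_shock_count(shares):
    """Return (HHI, 1/HHI) for the average exposure weights s_n.

    `shares` is an L x N array of exposure shares s_ln.
    """
    s_n = shares.mean(axis=0)        # s_n = (1/L) sum_l s_ln
    s_n = s_n / s_n.sum()            # assumption: rescale weights to sum to one
    hhi = np.sum(s_n ** 2)
    return hhi, 1.0 / hhi

# Exposure spread evenly over 50 shocks: effective count equals 50.
even = np.full((10, 50), 1 / 50)
print(effective_shock_count(even))

# Exposure concentrated in a single shock: effective count is close to 1.
concentrated = np.full((10, 50), 0.1 / 49)
concentrated[:, 0] = 0.9
print(effective_shock_count(concentrated))
```

A small effective count signals that a handful of shocks dominate the instrument, in which case the shock-level law of large numbers is unlikely to be a good approximation.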
BHJ Extensions
Conditional Quasi-Random Assignment: $E[g_n \mid \bar\varepsilon, q, s] = q_n'\mu$ for some observed shock-level variables $q_n$
• Consistency follows when $w_\ell = \sum_n s_{\ell n} q_n$ is controlled for in the IV

Weakly Mutually Correlated Shocks: $g_n \mid (\bar\varepsilon, q, s)$ are clustered or otherwise mutually dependent
• Consistency follows when mutual correlation is not too strong

Estimated Shocks: $g_n = \sum_\ell \omega_{\ell n} g_{\ell n}$ proxies for an infeasible $g_n^*$
• Consistency may require a “leave-out” adjustment: $z_\ell = \sum_n s_{\ell n} \tilde g_{\ell n}$ for $\tilde g_{\ell n} = \sum_{\ell' \neq \ell} \omega_{\ell' n} g_{\ell' n}$ (akin to JIVE solution to many-IV bias)
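
The leave-out adjustment can be sketched in a few lines. The example below assumes equal weights across locations, so the estimated shock is a simple mean and the leave-out version is the mean over all other locations; with unequal weights the same idea applies after renormalizing the remaining weights. All data are simulated and the names are illustrative.

```python
import numpy as np

rng = np.random.default_rng(1)
L, N = 100, 8                                    # assumed sizes

shares = rng.dirichlet(np.ones(N), size=L)       # s_ln
g_local = rng.normal(size=(L, N))                # g_ln: location-by-industry growth

# Estimated shock with equal weights: the plain average over locations.
g_full = g_local.mean(axis=0)                    # shape (N,)

# Leave-one-out version: for each location l, average over l' != l.
g_loo = (g_local.sum(axis=0) - g_local) / (L - 1)   # shape (L, N)

# Instruments with and without the leave-out adjustment.
z_full = shares @ g_full                         # uses location l's own growth
z_loo = np.sum(shares * g_loo, axis=1)           # excludes location l's own growth
```

The two instruments differ by $\sum_n s_{\ell n}(g_{\ell n} - g_n)/(L-1)$, which is the mechanical contribution of a location's own growth to its own instrument; the difference shrinks as the number of locations grows.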
BHJ Extensions (cont.)
Panel Data: Have $(y_{\ell t}, x_{\ell t}, s_{\ell nt}, g_{nt})$ across $\ell = 1, \dots, L$, $t = 1, \dots, T$
• Consistency can follow from either $N \to \infty$ or $T \to \infty$
• Unit fixed effects “de-mean” the shocks, if $s_{\ell nt}$ are time-invariant

Heterogeneous Effects: LATE theorem logic goes through
• Under a first-stage monotonicity condition, SSIV identifies a convex weighted average of heterogeneous treatment effects
Practical Consideration 1: Incomplete Shares
The Problem
So far we have assumed a constant sum-of-shares: $S_\ell \equiv \sum_n s_{\ell n} = 1$
• But in some settings, $S_\ell$ varies across $\ell$
• E.g. in ADH, $S_\ell$ is region $\ell$’s share of manufacturing emp.

BHJ show that A1/A2 are not enough for validity of $z_\ell$ in this case
• Now $z_\ell = \sum_n s_{\ell n}(\mu + (g_n - \mu)) = \mu S_\ell + \sum_n s_{\ell n}(g_n - \mu)$
• So $z_\ell$ is mechanically correlated with $S_\ell$, which may be endogenous

E.g. in ADH, comparing locations with larger and smaller $z_\ell$ could be comparing places with larger vs. smaller manufacturing employment (e.g. Midwest vs. South)
Practical Consideration 1: Incomplete Shares
The Solution

$$z_\ell = \sum_n s_{\ell n}(\mu + (g_n - \mu)) = \mu S_\ell + \underbrace{\sum_n s_{\ell n}(g_n - \mu)}_{\text{Clean Shock Variation}}$$

Controlling for the sum-of-shares $S_\ell$ isolates clean shock variation
• Further controls are needed when A1 only holds conditional on $q_n$; e.g. in panels, $S_\ell$ should be interacted with time FE:

$$z_{\ell t} = \sum_n s_{\ell n}(\mu_t + (g_{nt} - \mu_t)) = \mu_t S_\ell + \underbrace{\sum_n s_{\ell n}(g_{nt} - \mu_t)}_{\text{Clean Shock Variation}}$$
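
A small simulation makes the mechanical correlation visible. In the sketch below the shock mean, the range of the sum-of-shares, and the sample sizes are all assumed values chosen for illustration; the code checks the decomposition and then compares the correlation between the instrument and $S_\ell$ before and after partialling out $S_\ell$.

```python
import numpy as np

rng = np.random.default_rng(2)
L, N = 500, 40                                   # assumed sizes
mu = 3.0                                         # assumed common shock mean

# Incomplete shares: row l sums to S_l, which varies across locations.
S = rng.uniform(0.1, 0.6, size=L)
shares = rng.dirichlet(np.ones(N), size=L) * S[:, None]
shocks = mu + rng.normal(size=N)                 # g_n with E[g_n] = mu

z = shares @ shocks
clean = shares @ (shocks - mu)                   # sum_n s_ln (g_n - mu)
decomposition_holds = np.allclose(z, mu * S + clean)

# Raw instrument: strongly correlated with the sum-of-shares.
corr_raw = np.corrcoef(z, S)[0, 1]

# Partial out a constant and S_l, as a regression control for S_l would.
X = np.column_stack([np.ones(L), S])
coef, *_ = np.linalg.lstsq(X, z, rcond=None)
z_resid = z - X @ coef
corr_resid = np.corrcoef(z_resid, S)[0, 1]

print(decomposition_holds, corr_raw, corr_resid)
```

With a nonzero shock mean the raw instrument largely tracks $S_\ell$, while the residualized instrument is orthogonal to it by construction, which is what including $S_\ell$ as a control achieves in the IV regression.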
Practical Consideration 2: Exposure Clustering
The Problem

Adão, Kolesár, and Morales (2019) study a novel inference challenge when SSIV identification leverages quasi-random shocks
• Observations with similar shares $s_{\ell 1}, \dots, s_{\ell N}$ are likely to have correlated $z_\ell$, even when observations are not “clustered” in conventional ways (e.g. by distance)
• When $\varepsilon_\ell$ is similarly clustered (e.g. when $\varepsilon_\ell = \sum_n s_{\ell n}\nu_n + \tilde\varepsilon_\ell$), the large-sample distribution of $\hat\beta$ may not be well-approximated by standard central limit theorems (CLTs)

They then derive a new CLT + SEs to address “exposure clustering”
• “Design-based”: leverage iidness of shocks, not observations


Practical Consideration 2: Exposure Clustering
The Solution

BHJ use similar logic to show robust/clustered SEs can be valid when $\hat\beta$ is given by estimating the ‘industry-level’ regression

$$\bar y_n^\perp = \alpha + \beta \bar x_n^\perp + q_n'\tau + \bar\varepsilon_n^\perp,$$

instrumenting $\bar x_n^\perp$ by $g_n$ and weighting by $s_n$
• Numerically identical IV estimate, when controls include $\sum_n s_{\ell n} q_n$
• Clustering logic: valid SEs are obtained when estimating the IV at the level of identifying variation (here, shocks)
Same logic applies to performing valid balance/pre-trend tests and evaluating first-stage strength of the instrument
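
To make the shock-level procedure concrete, the sketch below runs a weighted, just-identified IV regression at the shock level with a constant and no further controls $q_n$, and computes a heteroskedasticity-robust standard error from the usual sandwich formula. The function name is hypothetical, the simulated inputs are illustrative only, and no finite-sample correction is applied, so packaged routines may report slightly different numbers.

```python
import numpy as np

def shock_level_iv(y_bar, x_bar, g, s_n):
    """Weighted IV of y_bar on x_bar and a constant, instrumenting x_bar with g.

    Returns (beta, se), where se is an HC0-style robust standard error.
    """
    w = s_n / s_n.sum()                          # normalized regression weights
    g_dm = g - np.sum(w * g)                     # partial the constant out of g
    denom = np.sum(w * g_dm * x_bar)             # weighted first-stage covariance
    beta = np.sum(w * g_dm * y_bar) / denom
    alpha = np.sum(w * (y_bar - beta * x_bar))   # intercept
    resid = y_bar - alpha - beta * x_bar
    se = np.sqrt(np.sum((w * g_dm * resid) ** 2)) / abs(denom)
    return beta, se

# Hypothetical shock-level data for N = 50 industries.
rng = np.random.default_rng(4)
N = 50
s_n = rng.dirichlet(np.ones(N))                  # average exposure weights
g = rng.normal(size=N)                           # shocks
x_bar = 0.8 * g + rng.normal(scale=0.5, size=N)
y_bar = 1.0 + 2.0 * x_bar + rng.normal(scale=0.3, size=N)

beta_hat, se_hat = shock_level_iv(y_bar, x_bar, g, s_n)
print(f"beta = {beta_hat:.3f}, robust SE = {se_hat:.3f}")
```

In practice the inputs would be the shock-level aggregates $\bar y_n^\perp$, $\bar x_n^\perp$ and weights $s_n$ built from the observation-level data, and clustering the shocks (e.g. by broader sector) would replace the squared terms with sums within clusters.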
SSIV with ssaggregate

Stata package ssaggregate leverages the BHJ equivalence result: it translates data to the shock level, after which researchers can proceed with familiar estimation commands (install w/ ssc install ssaggregate)
SSIV with ssaggregate...in R!
Thanks to our own Kyle Butts, ssaggregate is now available in R too!

Download at https://github.com/kylebutts/ssaggregate
Application: “The China Shock”

ADH study the effects of rising Chinese import competition on US commuting zones, 1991-2000 and 2000-2007
• Treatment $x_\ell$: local growth of Chinese imports in thousands of dollars per worker (slightly different from AADHP and ADHS)
• Main outcome $y_\ell$: local change in manufacturing emp. share

To address the endogeneity challenge, use an SSIV $z_{\ell t} = \sum_n s_{\ell nt} g_{nt}$
• $n$: 397 SIC4 manufacturing industries (× 2 periods)
• $g_{nt}$: growth of Chinese imports in non-US economies per US worker
• $s_{\ell nt}$: lagged share of mfg. industry $n$ in total emp. of location $\ell$


ADH Revisited

BHJ show how ADH can be seen as leveraging quasi-random shocks
• Ex ante plausible: imagine random industry productivity shocks in China affecting imports in U.S. & elsewhere
ADH Revisited
Plausibility of A1/A2

Evaluate A1 by regional and industry-level balance tests
• Industry shocks are uncorrelated with observables

Check sensitivity to adjusting for potential industry-level confounders:
• Control for $w_{\ell t} = \sum_n s_{\ell nt} q_{nt}$, where $q_{nt}$ include period FE, sector FE, the Acemoglu et al. (2016) observables, ...

Evaluate A2 by studying variation across industries
• Effective sample size (1/HHI of $s_n$ weights): 58-192
• Shocks appear mutually uncorrelated across SIC3 sectors


BHJ do ADH: Shock-Level Balance

No significant correlations between shocks and industry observables, controlling for year fixed effects
BHJ do ADH: Manufacturing Employment
Share Exogeneity


The Mariel Boatlift as a Basic SSIV

Card (1990) leverages a big migration “push” of low-skilled workers from Cuba to Miami, a Cuban enclave. Imagine instrumenting immigrant inflows by the lagged share of Cuban workers $s_{\ell,\mathrm{Cuba}}$ in a diff-in-diff setup
• Need parallel trends: regions with more/fewer Cuban workers on similar employment trends
This can be viewed as a simple shift-share instrument:
$$s_{\ell,\mathrm{Cuba}} \equiv s_{\ell,\mathrm{Cuba}} \cdot 1 + \sum_{n \neq \mathrm{Cuba}} s_{\ell n} \cdot 0$$
If several migration origins had a push shock, we can pool them together with a more traditional SSIV...
Goldsmith-Pinkham, Sorkin, and Swift (GPSS; 2020)

GPSS view the set of $n$ and values of $g_n$ as fixed, so $z_\ell = \sum_n s_{\ell n} g_n$ is a linear combination of shares

They then also establish a numerical equivalence: $\hat\beta$ can be obtained from an overidentified IV procedure that uses $N$ share instruments $s_{\ell n}$ and a weight matrix based on the shocks $g_n$
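
This equivalence can also be verified numerically. The sketch below treats the $N$ share columns as instruments in a linear GMM estimator and takes the weight matrix to be the outer product of the shock vector with itself, one simple choice consistent with "a weight matrix based on the shocks". The simulated data and all names are assumptions for illustration, and demeaning stands in for partialling out a constant.

```python
import numpy as np

rng = np.random.default_rng(3)
L, N = 300, 10                                   # assumed sizes

shares = rng.dirichlet(np.ones(N), size=L)       # N share instruments s_ln
shocks = rng.normal(size=N)                      # g_n, treated as fixed
z = shares @ shocks                              # shift-share instrument

# Hypothetical data-generating process with beta = 2.
eps = rng.normal(size=L)
x = z + 0.5 * eps + rng.normal(size=L)
y = 2.0 * x + eps
x_perp, y_perp = x - x.mean(), y - y.mean()      # partial out a constant

# Just-identified SSIV estimate using z_l as the single instrument.
beta_ssiv = (z @ y_perp) / (z @ x_perp)

# Overidentified GMM with the N shares as instruments and W = g g'.
W = np.outer(shocks, shocks)
Sx = shares.T @ x_perp                           # sum_l s_ln x_l^perp, one per n
Sy = shares.T @ y_perp                           # sum_l s_ln y_l^perp, one per n
beta_gmm = (Sx @ W @ Sy) / (Sx @ W @ Sx)

print(beta_ssiv, beta_gmm)
```

The two estimates coincide because the weight matrix has rank one: the quadratic forms collapse to products of $\sum_n g_n \sum_\ell s_{\ell n} x_\ell^\perp$ with the corresponding sums in $y$ and $x$, and the common factor cancels.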
Goldsmith-Pinkham, Sorkin, and Swift (GPSS; 2020)

Sufficient identifying assumption: shares $s_{\ell n}$ are exogenous for each $n$ (like parallel trends when $\varepsilon_\ell$ are unobserved trends)

$$E[\varepsilon_\ell \mid s_{\ell n}] = 0,\ \forall n \implies E\Big[\sum_\ell z_\ell \varepsilon_\ell\Big] = \sum_\ell \sum_n g_n E\big[s_{\ell n} E[\varepsilon_\ell \mid s_{\ell n}]\big] = 0$$

This is $N$ moment conditions at the level of observations, e.g. 38 for Card and 397 for ADH (vs. just 1 in BHJ, at the level of industries)

In other words, GPSS show that the SSIV estimator can be seen as pooling many Boatlift-style diff-in-diff IVs, one for each industry
Rotemberg Weights
How does SSIV pool different diff-in-diffs?
• GPSS propose “opening the black box” of overidentified IV by deriving the weights SSIV implicitly puts on each share instrument
• Builds on Rotemberg (1983), so they call these “Rotemberg weights”

$$\hat\beta = \sum_n \hat\alpha_n \hat\beta_n, \quad \text{where } \underbrace{\hat\beta_n = \frac{\sum_\ell s_{\ell n} y_\ell^\perp}{\sum_\ell s_{\ell n} x_\ell^\perp}}_{n\text{-specific IV estimate}} \text{ and } \underbrace{\hat\alpha_n = \frac{g_n \sum_\ell s_{\ell n} x_\ell^\perp}{\sum_{n'} g_{n'} \sum_\ell s_{\ell n'} x_\ell^\perp}}_{\text{Rotemberg weight}}$$

Intuitively, more weight is given to share instruments with more extreme shocks $g_n$ and larger first stages $\sum_\ell s_{\ell n} x_\ell^\perp$
• Weights can be negative (potential issue w/heterogeneous effects)
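
The decomposition is straightforward to compute. The sketch below is a direct transcription of the two formulas into NumPy, applied to simulated data; the function name, the data-generating process, and the sizes are assumptions for illustration, and demeaning again stands in for partialling out a constant.

```python
import numpy as np

def rotemberg_decomposition(shares, shocks, x_perp, y_perp):
    """Return (alpha_n, beta_n): Rotemberg weights and share-specific IV estimates."""
    first_stage = shares.T @ x_perp              # sum_l s_ln x_l^perp, one per n
    beta_n = (shares.T @ y_perp) / first_stage   # just-identified IV using share n
    alpha_n = shocks * first_stage / np.sum(shocks * first_stage)
    return alpha_n, beta_n

# Hypothetical simulated data (all numbers are illustrative).
rng = np.random.default_rng(5)
L, N = 300, 6
shares = rng.dirichlet(np.ones(N), size=L)
shocks = rng.normal(size=N)
z = shares @ shocks
eps = rng.normal(size=L)
x = z + 0.5 * eps + rng.normal(size=L)
y = 2.0 * x + eps
x_perp, y_perp = x - x.mean(), y - y.mean()

alpha_n, beta_n = rotemberg_decomposition(shares, shocks, x_perp, y_perp)
beta_ssiv = (z @ y_perp) / (z @ x_perp)          # pooled SSIV estimate
beta_pooled = np.sum(alpha_n * beta_n)           # weighted sum of share-specific IVs

print(alpha_n, beta_ssiv, beta_pooled)
```

The weights sum to one and reproduce the SSIV estimate, but individual weights can be negative, and a share with a near-zero first stage can have an erratic $\hat\beta_n$ even though its contribution $\hat\alpha_n \hat\beta_n$ stays well behaved.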


Rotemberg Weights in Card (2009)
Is Share Exogeneity Plausible?

Share exogeneity assumption is not that “shares don’t causally respond to the residual” (they can’t: shares are pre-determined)
• It’s: “all unobservables are uncorrelated with anything about the local share distribution”
Is Share Exogeneity Plausible?

This sufficient condition is typically violated when there are any unobserved shocks $\nu_n$ that affect $\varepsilon_\ell$ via the same or correlated shares
• I.e. if $\varepsilon_\ell = \sum_n s_{\ell n}\nu_n + \tilde\varepsilon_\ell$, then $s_{\ell n}$ and $\varepsilon_\ell$ cannot be uncorrelated in large samples, even if $\nu_n$ are uncorrelated with $g_n$
• E.g. in ADH, unobserved technology shocks across industries affect labor markets via lagged emp. shares, along with observed $g_n$
• Problem arises when shares are “generic”, i.e. when they predict many things


Card and ADH Revisited

When share exogeneity is ex ante plausible, can test its assumptions ex post (focusing on high Rotemberg weight $n$):
• Balance/pre-trend tests
• Overidentification tests (under constant effects)
• Straightforward to implement; no different than any other IV

GPSS find that balance/overidentification tests broadly pass for Card ... but fail badly for ADH, consistent with ex ante implausibility
Choosing an Appropriate Framework


A Taxonomy of SSIV Settings
Case 1: the IV is based on a set of shocks which can be thought of as an instrument (i.e. many, plausibly quasi-randomly assigned)
• BHJ show how this identifying variation can be mapped to estimate effects at a different “level” (i.e. industries → local labor markets)

Case 2: the researcher does not directly observe many quasi-random shocks, but can estimate them in-sample
• Canonical setting of Bartik (1991), where $g_n$ are average industry growth rates (thought to proxy for latent demand shocks)
• See also Card (2009), where national immigration rates are estimated

Case 3: the $g_n$ cannot be naturally viewed as an instrument
• Either too few or implausibly exogenous, even given some $q_n$
• Identification may (or may not) instead follow from share exogeneity
Ex Ante vs. Ex Post Validity
BHJ emphasize that the decision to pursue a “shocks” vs. “shares” identification strategy must be made ex ante
• Undesirable to base identifying assumptions on ex post tests, though balance/pre-trend tests can be used to falsify assumptions
• The two identification strategies have different economic content

They suggest thinking about whether shares are “tailored” to the economic question/treatment, or are “generic”
• Generic shares (e.g. ADH): unobserved $\nu_n$ are likely to enter $\varepsilon_\ell$ via the same or similar shares, violating share exogeneity
• Tailored shares have a diff-in-diff feel; don’t even need the shocks, except to possibly improve power or avoid many-IV bias
