1-Linear SSIV

Uploaded by

Jhorland Ayala


Roadmap

Introductions
Me and This Course
(Linear) SSIV

Shock Exogeneity
Motivation
Borusyak et al. (2022)

Share Exogeneity
Motivation
Goldsmith-Pinkham et al. (2020)

Choosing an Appropriate Framework

Who Am I?

A Professor of Economics at Brown University

A big fan of instrumental variable methods:
• Lottery- and non-lottery IVs in studies of educational quality
(Angrist et al. 2016, 2017, 2021, 2022; Abdulkadiroğlu et al. 2016)

• Quasi-experimental evaluations of healthcare quality

(Hull 2020; Abaluck et al. 2021, 2022)

• IV-based analyses of discrimination and bias

(Arnold et al. 2020, 2021, 2022; Hull 2021; Bohren et al. 2022; Baron et al. 2023)

• Shift-share instruments (SSIV) and related designs

(Borusyak et al. 2022; Borusyak and Hull 2021, 2022; Goldsmith-Pinkham et al. 2022)
What is This Course?

A two-day intensive on SSIV, focusing on recent practical advances

• Highlighting key points on identification, estimation, and inference

• Emphasis on practical: IV is meant to be used, not just studied!

Four one-hour lectures

• Please ask questions in the Discord chat!

One 70-minute coding lab

• 40 min: you, seeing how far you can get on your own (or with your classmate’s help)
• 30 min: me, live-coding solutions in Stata (we will also post R code)
Schedule
What is a (Linear) SSIV?

A weighted sum of a common set of shocks, with weights reflecting heterogeneous exposure shares: $z_\ell = \sum_n s_{\ell n} g_n$

• The shocks vary at a different “level” $n = 1, \dots, N$ than the shares $\ell = 1, \dots, L$, where we also observe an outcome $y_\ell$ and treatment $x_\ell$

We want to use $z_\ell$ to estimate parameter $\beta$ of the model $y_\ell = \beta x_\ell + \varepsilon_\ell$

• Could be a “structural” equation or a potential outcomes model
• Could be misspecified, with heterogeneous treatment effects $\beta_\ell$
• Could be a “reduced form” analysis, with $x_\ell = z_\ell$
• Could have other included controls $w_\ell$

Key question: under what assumptions does this SSIV strategy “work”?
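
To make the definition concrete, here is a minimal sketch of building a shift-share instrument as a matrix-vector product. The numbers, and the names `shares`, `shocks`, and `z`, are made up for illustration and do not come from any of the papers discussed.

```python
import numpy as np

# Hypothetical toy data: 4 locations (rows) and 3 industries (columns).
# Each row holds one location's exposure shares s_{ln}; rows sum to one.
shares = np.array([
    [0.5, 0.3, 0.2],
    [0.1, 0.6, 0.3],
    [0.2, 0.2, 0.6],
    [0.4, 0.4, 0.2],
])

# One common shock g_n per industry (e.g. national industry growth).
shocks = np.array([0.10, -0.05, 0.02])

# z_l = sum_n s_{ln} * g_n: every location sees the same shocks,
# but weights them by its own shares.
z = shares @ shocks
print(z)
```

Locations differ in `z` only because their shares differ; the shocks themselves are common to all rows.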
SSIV Examples

Instrument $z_\ell = \sum_n s_{\ell n} g_n$ (shares $s_{\ell n}$, shocks $g_n$) for model $y_\ell = \beta x_\ell + \gamma' w_\ell + \varepsilon_\ell$

Bartik (1991); Blanchard and Katz (1992):

• $\beta$ = inverse local labor supply elasticity
• $x_\ell$ and $y_\ell$ = employment and wage growth in region $\ell$
• Need a labor demand shifter as an IV
• $g_n$ = national growth of industry $n$
• $s_{\ell n}$ = lagged employment shares (of industry $n$ in region $\ell$)
• $z_\ell$ = predicted employment growth due to national industry trends

Autor, Dorn, and Hanson (2013, ADH):

• $x_\ell$ = growth of import competition in region $\ell$
• $y_\ell$ = growth of manuf. employment, unemployment, etc.
• $g_n$ = growth of China exports in manufacturing industry $n$ to 8 other (i.e. non-U.S.) countries
• $s_{\ell n}$ = 10-year lagged employment shares (over total employment)
• $z_\ell$ = predicted growth of import competition

“Enclave instrument”, e.g. Card (2009):

• $\beta$ = inverse elasticity of substitution between native and immigrant labor of some skill level (need a relative labor supply instrument)
• $x_\ell$ and $y_\ell$ = relative employment and wage in region $\ell$
• $g_n$ = national immigration growth from origin country $n$
• $s_{\ell n}$ = lagged shares of migrants from origin $n$ in region $\ell$
• $z_\ell$ = share of migrants predicted from enclaves & recent growth

Hummels et al. (2014) on offshoring:

• $\beta$ = effect of imports on wages
• $x_\ell$ = imports by Danish firm $\ell$, $y_\ell$ = wages
• $g_n$ = changes in transport costs by $n$ = (product, country)
• $s_{\ell n}$ = lagged import shares
• $z_\ell$ = predicted change in firm inputs via transport costs

What Do We Do With This?

Of course, we can always run IV with such $z_\ell$ ... but what does the corresponding estimand identify?

Recall the IV validity condition: $E\left[\frac{1}{L}\sum_\ell z_\ell \varepsilon_\ell\right] = 0$ for model residual $\varepsilon_\ell$

• Looks a little different than normal because we’re not assuming i.i.d. sampling, under which $E\left[\frac{1}{L}\sum_\ell z_\ell \varepsilon_\ell\right] = E[z_\ell \varepsilon_\ell]$ (you’ll see why soon!)

What properties of shocks and shares make this condition hold?

• Is SSIV like a natural experiment? A diff-in-diff? Something new?

• Since $z_\ell$ combines multiple sources of variation, it can be difficult to think about it being randomly assigned across $\ell$ (unlike a lottery IV)

Exogenous Shocks in Industry-Level Regressions

Acemoglu-Autor-Dorn-Hanson-Price (AADHP, 2016) look at the effects of import competition with China on US manufacturing industries:

\[ \Delta \log Emp_{nt} = \alpha + \beta \Delta IP_{nt} + \varepsilon_{nt}, \]

where $\Delta IP_{nt}$ measures growth in import penetration from China in industry $n$, and $\varepsilon_{nt}$ captures industry demand/productivity shocks

Two Key Problems with OLS estimation:

1. Endogeneity of $\Delta IP_{nt}$: OLS is not consistent for $\beta$

2. GE spillovers: $\beta$ does not capture aggregate effects

Problem 1: Endogeneity of $\Delta IP_{nt}$

\[ \Delta \log Emp_{nt} = \alpha + \beta \Delta IP_{nt} + \varepsilon_{nt} \]

$\Delta IP_{nt}$ is driven by productivity shocks in China, but also potentially by productivity and demand shocks in the US

• $\varepsilon_{nt}$ captures productivity and demand shocks in the US

AADHP instrument $\Delta IP_{nt}$ with $\Delta IPO_{nt}$, measuring average Chinese import penetration growth in 8 non-US countries

• Relevance: both $\Delta IP_{nt}$ and $\Delta IPO_{nt}$ are driven by the same Chinese productivity shocks
• Validity: local productivity/demand shocks in the US are uncorrelated with those of other countries (entering $\Delta IPO_{nt}$)
Identification from a Natural Experiment

Suppose $\Delta IPO_{nt}$ is as-good-as-randomly assigned, as in an RCT:

\[ E[\Delta IPO_{nt} \mid \mathcal{I}] = \mu \quad \text{for all } n, t \]

where $\mathcal{I} = \{\varepsilon_{nt}, \text{pre-trends}, \text{balance variables}, \dots\}$

Consistent IV estimation then follows from many observations of $nt$, with sufficiently independent variation in $\Delta IPO_{nt}$
Identification from a Natural Experiment

Can relax to add observables capturing systematic variation:

\[ E[\Delta IPO_{nt} \mid \mathcal{I}] = q_{nt}'\mu \quad \text{for all } n, t \]

where $q_{nt}$ may include:

• period FE, isolating within-period variation in the shocks

• FE of 10 broad sectors, isolating within-sector variation, etc.

We would then just want to control for $q_{nt}$ in the industry-level IV

Problem 2: GE Spillovers

Spillovers across different industries are likely important:

• When employment shrinks in industry $n$ after a negative shock, aggregate employment may or may not respond

• In a flexible labor market, comparing wages of similar workers across industries does not make sense

ADH Solution: specify the outcome equation for local labor markets

• Works if local economies are isolated “islands” (simple model in Adao-Kolesar-Morales 2019; richer structure of spatial spillovers in Adao-Arkolakis-Esposito 2020)

But correct specification is not the same as identification!

• Key point: the same industry-level natural experiment can be used to estimate a regional specification, via SSIV
Borusyak, Hull, and Jaravel (BHJ; 2022)

Consider the SSIV estimator of $y_\ell = \beta x_\ell + \gamma' w_\ell + \varepsilon_\ell$ instrumented by $z_\ell = \sum_n s_{\ell n} g_n$ and, for now, $\sum_n s_{\ell n} = 1$ for all $\ell$

• Reduced-form allowed: $x_\ell = z_\ell$

• Only the shift-share structure of $z_\ell$ matters; $x_\ell$ can be anything

• Note: view $g_n$ as stochastic, so can’t assume $z_\ell$ is iid

E.g. $g_n = \Delta IPO_n$ aggregated w/ mfg employment shares $s_{\ell n}$

• Can we leverage a natural experiment in $g_n$, as before?

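Leveraging $g_n$
Shift-Share Estimand

Consider the SSIV estimator of $y_\ell = \beta x_\ell + \gamma' w_\ell + \varepsilon_\ell$ instrumented by $z_\ell = \sum_n s_{\ell n} g_n$ and, for now, $\sum_n s_{\ell n} = 1$ for all $\ell$

First step: note that by the FWL theorem, the estimator can be written

\[
\hat\beta = \frac{\sum_\ell z_\ell y_\ell^\perp}{\sum_\ell z_\ell x_\ell^\perp}
= \frac{\sum_\ell \sum_n s_{\ell n} g_n y_\ell^\perp}{\sum_\ell \sum_n s_{\ell n} g_n x_\ell^\perp}
\]

where $v_\ell^\perp$ denotes sample residuals from regressing $v_\ell$ on $w_\ell$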
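
The FWL step can be checked numerically. The sketch below simulates data (all values and variable names are illustrative assumptions, not taken from BHJ) and compares the ratio of residualized sums with a textbook just-identified IV estimate that includes the controls directly.

```python
import numpy as np

def residualize(v, W):
    """Residuals from an OLS regression of v on the columns of W."""
    coef, *_ = np.linalg.lstsq(W, v, rcond=None)
    return v - W @ coef

rng = np.random.default_rng(0)
L, N = 200, 10
shares = rng.dirichlet(np.ones(N), size=L)   # each row sums to one
shocks = rng.normal(size=N)
z = shares @ shocks                          # shift-share instrument

W = np.column_stack([np.ones(L), rng.normal(size=L)])  # constant + one control
eps = rng.normal(size=L)
x = 2.0 * z + W[:, 1] + 0.5 * eps + rng.normal(size=L)  # endogenous treatment
y = 1.5 * x + 0.7 * W[:, 1] + eps

# Ratio form: residualize y and x on the controls, then use z directly.
beta_fwl = (z @ residualize(y, W)) / (z @ residualize(x, W))

# Textbook just-identified IV: (Z'X)^{-1} Z'y with Z = [z, W], X = [x, W].
Z = np.column_stack([z, W])
X = np.column_stack([x, W])
beta_iv = np.linalg.solve(Z.T @ X, Z.T @ y)[0]

print(beta_fwl, beta_iv)
```

The two numbers agree up to floating-point error, since residualizing $y_\ell$ and $x_\ell$ on $w_\ell$ is enough to partial the controls out of the instrument as well.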

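Leveraging $g_n$
BHJ Numerical Equivalence

BHJ show $\hat\beta$ can be obtained from a shock-level IV procedure that uses $g_n$ to instrument for a shock-level “aggregate” of the treatment:

\[
\hat\beta
= \frac{\frac{1}{L}\sum_\ell \sum_n s_{\ell n} g_n y_\ell^\perp}{\frac{1}{L}\sum_\ell \sum_n s_{\ell n} g_n x_\ell^\perp}
= \frac{\sum_n g_n \sum_\ell \frac{1}{L} s_{\ell n} y_\ell^\perp}{\sum_n g_n \sum_\ell \frac{1}{L} s_{\ell n} x_\ell^\perp}
= \frac{\sum_n s_n g_n \bar{y}_n^\perp}{\sum_n s_n g_n \bar{x}_n^\perp},
\]

where $s_n = \frac{1}{L}\sum_\ell s_{\ell n}$ are weights capturing the average importance of shock $n$, and $\bar{v}_n = \frac{\sum_\ell s_{\ell n} v_\ell}{\sum_\ell s_{\ell n}}$ is an exposure-weighted average of $v_\ell$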
Leveraging $g_n$
BHJ Numerical Equivalence

\[
\hat\beta = \frac{\sum_n s_n g_n \bar{y}_n^\perp}{\sum_n s_n g_n \bar{x}_n^\perp}
\]

The IV estimate from the original observation-level IV procedure is equivalent to an “industry-level” IV regression with model $\bar{y}_n^\perp = \alpha + \bar{x}_n^\perp \beta + \bar\varepsilon_n$, instrumented by $g_n$ with weights $s_n$.

The residual $\bar\varepsilon_n$ of this shock-level IV procedure is the average residual of observations with a high share of $n$
• E.g. in ADH, the average unobserved determinants of regional employment in regions most specialized in industry $n$
It follows that $\hat\beta$ is consistent iff this shock-level IV procedure is...
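
The equivalence is purely algebraic, so it can be verified on simulated data. This is a minimal sketch with made-up data and hypothetical variable names; it assumes complete shares (rows sum to one) and a constant as the only control, and it uses the ratio form of the shock-level estimator rather than a regression routine.

```python
import numpy as np

rng = np.random.default_rng(1)
L, N = 300, 20
shares = rng.dirichlet(np.ones(N), size=L)   # s_{ln}, rows sum to one
shocks = rng.normal(size=N)                  # g_n
z = shares @ shocks

x = z + rng.normal(size=L)
y = 2.0 * x + rng.normal(size=L)

# Residualize on a constant (the only control in this toy example).
y_perp = y - y.mean()
x_perp = x - x.mean()

# Location-level SSIV estimate.
beta_location = (z @ y_perp) / (z @ x_perp)

# Shock-level aggregates: s_n = (1/L) sum_l s_{ln}, and exposure-weighted
# averages of the residualized outcome and treatment.
s_n = shares.mean(axis=0)
y_bar = (shares.T @ y_perp) / shares.sum(axis=0)
x_bar = (shares.T @ x_perp) / shares.sum(axis=0)

# Shock-level estimate with g_n as the instrument and s_n as weights.
beta_shock = np.sum(s_n * shocks * y_bar) / np.sum(s_n * shocks * x_bar)

print(beta_location, beta_shock)
```

Both estimates coincide up to rounding, which is what lets the analysis move from $L$ locations to $N$ shocks without changing the point estimate.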
BHJ Baseline Assumptions
A1 (Quasi-random shock assignment): $E[g_n \mid \bar\varepsilon, s] = \mu$ for all $n$
• Each shock has the same expected value, conditional on the shock-level unobservables $\bar\varepsilon_n$ and average exposure $s_n$
• Implies SSIV exogeneity, as $z_\ell = \mu + \sum_n s_{\ell n}(g_n - \mu) = \mu + \text{“noise”}$
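
For readers who want the intermediate steps, here is one way to spell out why A1 delivers the validity condition. It is a sketch that assumes complete shares ($\sum_n s_{\ell n} = 1$) and that the residual has mean zero, e.g. because the controls include a constant; neither step is quoted from BHJ.

```latex
\begin{aligned}
E\Big[\tfrac{1}{L}\sum_\ell z_\ell \varepsilon_\ell\Big]
&= E\Big[\sum_n s_n g_n \bar\varepsilon_n\Big]
  && \text{(swap sums; } s_n \bar\varepsilon_n = \tfrac{1}{L}\textstyle\sum_\ell s_{\ell n}\varepsilon_\ell\text{)} \\
&= E\Big[\sum_n s_n \bar\varepsilon_n \, E[g_n \mid \bar\varepsilon, s]\Big]
  && \text{(iterated expectations)} \\
&= \mu \, E\Big[\sum_n s_n \bar\varepsilon_n\Big]
  && \text{(A1)} \\
&= \mu \, E\Big[\tfrac{1}{L}\sum_\ell \varepsilon_\ell\Big] = 0
  && \text{(complete shares; mean-zero residual, assumed)}
\end{aligned}
```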
BHJ Baseline Assumptions

A2 (Many uncorrelated shocks):

• $E\left[\sum_n s_n^2\right] \to 0$: expected Herfindahl index of average shock exposure converges to zero (implies $N \to \infty$)
• $Cov(g_n, g_{n'} \mid \bar\varepsilon, s) = 0$ for all $n' \neq n$: shocks are mutually uncorrelated given the unobservables
• Imply a shock-level law of large numbers: $\sum_n s_n g_n \bar\varepsilon_n \xrightarrow{p} 0$
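
The Herfindahl condition is easy to inspect in data. The sketch below computes the index of the average exposure weights $s_n$ and its inverse, which is sometimes read as an effective number of shocks. The function name and the toy share matrices are hypothetical, and complete shares are assumed so that the $s_n$ sum to one.

```python
import numpy as np

def effective_sample_size(shares):
    """Inverse Herfindahl index of the average exposure weights s_n."""
    s_n = shares.mean(axis=0)      # s_n = (1/L) sum_l s_{ln}
    hhi = np.sum(s_n ** 2)
    return 1.0 / hhi

# Exposure spread evenly over 4 shocks: the inverse HHI equals 4.
even = np.full((5, 4), 0.25)

# Exposure concentrated in the first shock: far fewer "effective" shocks.
concentrated = np.tile([0.7, 0.1, 0.1, 0.1], (5, 1))

ess_even = effective_sample_size(even)
ess_concentrated = effective_sample_size(concentrated)
print(ess_even, ess_concentrated)
```

A small inverse HHI signals that a handful of shocks drive the instrument, in which case the shock-level law of large numbers is a poor approximation.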

Both assumptions, while novel for SSIV, would be standard for a shock-level IV regression with weights $s_n$ and instrument $g_n$
BHJ Extensions
Conditional Quasi-Random Assignment: $E[g_n \mid \bar\varepsilon, q, s] = q_n'\mu$ for some observed shock-level variables $q_n$

• Consistency follows when $w_\ell = \sum_n s_{\ell n} q_n$ is controlled for in the IV

Weakly Mutually Correlated Shocks: $g_n \mid (\bar\varepsilon, q, s)$ are clustered or otherwise mutually dependent

• Consistency follows when mutual correlation is not too strong

Estimated Shocks: $g_n = \sum_\ell \omega_{\ell n} g_{\ell n}$ proxies for an infeasible $g_n^*$

• Consistency may require a “leave-out” adjustment: $z_\ell = \sum_n s_{\ell n} \tilde{g}_{\ell n}$ for $\tilde{g}_{\ell n} = \sum_{\ell' \neq \ell} \omega_{\ell' n} g_{\ell' n}$ (akin to JIVE solution to many-IV bias)
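
The leave-out adjustment can be sketched in a few lines. The arrays `local_growth` (for $g_{\ell n}$) and `omega` (for $\omega_{\ell n}$) are simulated placeholders, and the code follows the formula exactly as written above, without renormalizing the remaining weights; some applications do renormalize, and that variant is not shown.

```python
import numpy as np

rng = np.random.default_rng(2)
L, N = 6, 3
shares = rng.dirichlet(np.ones(N), size=L)     # s_{ln}
local_growth = rng.normal(size=(L, N))         # g_{ln}: location-by-industry growth
omega = rng.dirichlet(np.ones(L), size=N).T    # omega_{ln}; each column sums to one

# Full-sample estimated shock: g_n = sum_l omega_{ln} g_{ln}.
g_full = (omega * local_growth).sum(axis=0)

# Leave-out shock for location l: drop l's own term from the sum,
# so g_tilde[l, n] = sum over l' != l of omega_{l'n} g_{l'n}.
g_leave_out = g_full[None, :] - omega * local_growth   # shape (L, N)

# Instruments with and without the leave-out adjustment.
z_full = shares @ g_full
z_leave_out = (shares * g_leave_out).sum(axis=1)
print(z_full, z_leave_out)
```

Subtracting the own-location term avoids a loop over locations and keeps location $\ell$'s own outcome-related variation out of its instrument.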
BHJ Extensions (cont.)
Panel Data: Have $(y_{\ell t}, x_{\ell t}, s_{\ell n t}, g_{nt})$ across $\ell = 1, \dots, L$, $t = 1, \dots, T$

• Consistency can follow from either $N \to \infty$ or $T \to \infty$

• Unit fixed effects “de-mean” the shocks, if $s_{\ell n t}$ are time-invariant

Heterogeneous Effects: LATE theorem logic goes through

• Under a first-stage monotonicity condition, SSIV identifies a convex weighted average of heterogeneous treatment effects
Practical Consideration 1: Incomplete Shares
The Problem
So far we have assumed a constant sum-of-shares: $S_\ell \equiv \sum_n s_{\ell n} = 1$

• But in some settings, $S_\ell$ varies across $\ell$

• E.g. in ADH, $S_\ell$ is region $\ell$’s share of manufacturing emp., since the shares cover manufacturing industries only

BHJ show that A1/A2 are not enough for validity of $z_\ell$ in this case

• Now $z_\ell = \sum_n s_{\ell n}(\mu + (g_n - \mu)) = \mu S_\ell + \sum_n s_{\ell n}(g_n - \mu)$

• So $z_\ell$ is mechanically correlated with $S_\ell$, which may be endogenous

E.g. in ADH, comparing locations with larger and smaller $z_\ell$ could be comparing places with larger vs. smaller manufacturing employment (e.g. Midwest vs. South)
Practical Consideration 1: Incomplete Shares
The Solution

\[
z_\ell = \sum_n s_{\ell n}(\mu + (g_n - \mu)) = \mu S_\ell + \underbrace{\sum_n s_{\ell n}(g_n - \mu)}_{\text{Clean Shock Variation}}
\]

Controlling for the sum-of-shares $S_\ell$ isolates clean shock variation

• Further controls are needed when A1 only holds conditional on $q_n$; e.g. in panels, $S_\ell$ should be interacted with time FE:

\[
z_{\ell t} = \sum_n s_{\ell n}(\mu_t + (g_{nt} - \mu_t)) = \mu_t S_\ell + \underbrace{\sum_n s_{\ell n}(g_{nt} - \mu_t)}_{\text{Clean Shock Variation}}
\]
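
A small simulation illustrates why the sum-of-shares control matters. Everything here is an assumption made for illustration: the data-generating process, the parameter values, and the helper `iv_ratio`. Unobservables are made to load on $S_\ell$, shocks have a nonzero mean, and the true effect is 1.

```python
import numpy as np

def iv_ratio(y, x, z, W):
    """Just-identified IV of y on x with instrument z, partialling out W."""
    def resid(v):
        coef, *_ = np.linalg.lstsq(W, v, rcond=None)
        return v - W @ coef
    return (z @ resid(y)) / (z @ resid(x))

rng = np.random.default_rng(3)
L, N, beta, mu = 5000, 50, 1.0, 2.0

S = rng.uniform(0.2, 0.8, size=L)                          # sum of shares, varies by location
shares = S[:, None] * rng.dirichlet(np.ones(N), size=L)   # incomplete shares
shocks = mu + rng.normal(size=N)                           # quasi-random shocks with mean mu
z = shares @ shocks

eps = 3.0 * S + 0.2 * rng.normal(size=L)   # unobservables load on S (assumed)
x = z + 0.2 * rng.normal(size=L)
y = beta * x + eps

const = np.ones((L, 1))
beta_naive = iv_ratio(y, x, z, const)                          # constant only
beta_ctrl = iv_ratio(y, x, z, np.column_stack([const, S]))     # adds S as a control
print(beta_naive, beta_ctrl)
```

In this design the estimate without the control is pulled away from 1 by the $\mu S_\ell$ term, while the estimate that controls for $S_\ell$ relies only on the clean shock variation.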
Practical Consideration 2: Exposure Clustering
The Problem

Adão, Kolesar, and Morales (2019) study a novel inference challenge when SSIV identification leverages quasi-random shocks

• Observations with similar shares $s_{\ell 1}, \dots, s_{\ell N}$ are likely to have correlated $z_\ell$, even when observations are not “clustered” in conventional ways (e.g. by distance)

• When $\varepsilon_\ell$ is similarly clustered (e.g. when $\varepsilon_\ell = \sum_n s_{\ell n}\nu_n + \tilde\varepsilon_\ell$), the large-sample distribution of $\hat\beta$ may not be well-approximated by standard central limit theorems (CLTs)

They then derive a new CLT + SEs to address “exposure clustering”

• “Design-based”: leverage iidness of shocks, not observations

Practical Consideration 2: Exposure Clustering
The Solution

BHJ use similar logic to show robust/clustered SEs can be valid when $\hat\beta$ is given by estimating the ‘industry-level’ regression

\[ \bar{y}_n^\perp = \alpha + \beta \bar{x}_n^\perp + q_n'\tau + \bar\varepsilon_n^\perp, \]

instrumenting $\bar{x}_n^\perp$ by $g_n$ and weighting by $s_n$

• Numerically identical IV estimate, when controls include $\sum_n s_{\ell n} q_n$

• Clustering logic: valid SEs are obtained when estimating the IV at the level of identifying variation (here, shocks)
Same logic applies to performing valid balance/pre-trend tests and evaluating first-stage strength of the instrument
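
As a rough illustration of estimating at the shock level, the sketch below runs a weighted just-identified IV on shock-level aggregates and attaches a heteroskedasticity-robust standard error. The HC0-style sandwich formula, the function names, and the simulated data are my own choices for illustration; in practice one would aggregate with ssaggregate and rely on a standard IV command for inference.

```python
import numpy as np

def weighted_resid(v, Q, w):
    """Residuals from a weighted least-squares regression of v on the columns of Q."""
    sw = np.sqrt(w)
    coef, *_ = np.linalg.lstsq(Q * sw[:, None], v * sw, rcond=None)
    return v - Q @ coef

def shock_level_iv(y_bar, x_bar, g, s_n, Q):
    """Weighted IV of y_bar on x_bar, instrument g, controls Q, weights s_n."""
    g_t = weighted_resid(g, Q, s_n)        # instrument net of shock-level controls
    y_t = weighted_resid(y_bar, Q, s_n)
    x_t = weighted_resid(x_bar, Q, s_n)
    denom = np.sum(s_n * g_t * x_t)
    beta = np.sum(s_n * g_t * y_t) / denom
    resid = y_t - beta * x_t
    # HC0-style sandwich for a weighted just-identified IV (no small-sample correction).
    se = np.sqrt(np.sum((s_n * g_t * resid) ** 2)) / abs(denom)
    return beta, se

rng = np.random.default_rng(5)
L, N = 500, 40
shares = rng.dirichlet(np.ones(N), size=L)   # complete shares
shocks = rng.normal(size=N)
z = shares @ shocks
x = z + rng.normal(size=L)
y = 1.0 * x + rng.normal(size=L)

# Residualize on a constant, then aggregate to the shock level.
y_perp, x_perp = y - y.mean(), x - x.mean()
s_n = shares.mean(axis=0)
y_bar = (shares.T @ y_perp) / shares.sum(axis=0)
x_bar = (shares.T @ x_perp) / shares.sum(axis=0)
Q = np.ones((N, 1))                          # shock-level controls: constant only

beta_shock, se_shock = shock_level_iv(y_bar, x_bar, shocks, s_n, Q)
beta_location = (z @ y_perp) / (z @ x_perp)
print(beta_shock, se_shock, beta_location)
```

The point estimate matches the location-level one, while the standard error is computed from variation across the $N$ shocks rather than the $L$ locations.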
SSIV with ssaggregate

Stata package ssaggregate leverages the BHJ equivalence result: it

translates data to the shock level, after which researchers can proceed
with familiar estimation commands (install w/ ssc install ssaggregate)
SSIV with ssaggregate...in R!
Thanks to our own Kyle Butts, ssaggregate is now available in R too!

Download at https://github.com/kylebutts/ssaggregate
Application: “The China Shock”

ADH study the effects of rising Chinese import competition on US commuting zones, 1991-2000 and 2000-2007

• Treatment $x_\ell$: local growth of Chinese imports in $1,000/worker (slightly different from AADHP and ADHS)

• Main outcome $y_\ell$: local change in manufacturing emp. share

To address the endogeneity challenge, use an SSIV $z_{\ell t} = \sum_n s_{\ell n t} g_{nt}$

• $n$: 397 SIC4 manufacturing industries (× 2 periods)

• $g_{nt}$: growth of Chinese imports in non-US economies per US worker

• $s_{\ell n t}$: lagged share of mfg. industry $n$ in total emp. of location $\ell$

ADH Revisited

BHJ show how ADH can be seen as leveraging quasi-random shocks

• Ex ante plausible: imagine random industry productivity shocks in

China affecting imports in U.S. & elsewhere
ADH Revisited
Plausibility of A1/A2

Evaluate A1 by regional and industry-level balance tests

• Industry shocks are uncorrelated with observables

Check sensitivity to adjusting for potential industry-level confounders:

• Control for $w_{\ell t} = \sum_n s_{\ell n t} q_{nt}$, where $q_{nt}$ include period FE, sector FE, the Acemoglu et al. (2016) observables, ...

Evaluate A2 by studying variation across industries

• Effective sample size (1/HHI of $s_n$ weights): 58-192

• Shocks appear mutually uncorrelated across SIC3 sectors

BHJ do ADH: Shock-Level Balance

No significant correlations between shocks and industry observables, controlling for year fixed effects
BHJ do ADH: Manufacturing Employment

The Mariel Boatlift as a Basic SSIV

Card (1990) leverages a big migration “push” of low-skilled workers from Cuba to Miami, a Cuban enclave. Imagine instrumenting immigrant inflows by the lagged share of Cuban workers $s_{\ell,\text{Cuba}}$ in a diff-in-diff setup

• Need parallel trends: regions with more/fewer Cuban workers on similar employment trends

This can be viewed as a simple shift-share instrument:

\[ s_{\ell,\text{Cuba}} \equiv s_{\ell,\text{Cuba}} \cdot 1 + \sum_{n \neq \text{Cuba}} s_{\ell n} \cdot 0 \]
Goldsmith-Pinkham, Sorkin, and Swift (GPSS; 2020)

GPSS view the set of $n$ and values of $g_n$ as fixed, so $z_\ell = \sum_n s_{\ell n} g_n$ is a linear combination of shares

They then also establish a numerical equivalence: $\hat\beta$ can be obtained from an overidentified IV procedure that uses $N$ share instruments $s_{\ell n}$ and a weight matrix based on the shocks $g_n$
Goldsmith-Pinkham, Sorkin, and Swift (GPSS; 2020)

Sufficient identifying assumption: shares $s_{\ell n}$ are exogenous for each $n$ (like parallel trends when $\varepsilon_\ell$ are unobserved trends)

\[
E[\varepsilon_\ell \mid s_{\ell n}] = 0, \ \forall n
\implies
E\Big[\sum_\ell z_\ell \varepsilon_\ell\Big] = \sum_\ell \sum_n g_n E\big[s_{\ell n} E[\varepsilon_\ell \mid s_{\ell n}]\big] = 0
\]

This is $N$ moment conditions at the level of observations, e.g. 38 for Card and 397 for ADH (vs. just 1 in BHJ, at the level of industries)

In other words, GPSS show that the SSIV estimator can be seen as pooling many Boatlift-style diff-in-diff IVs, one for each industry
Rotemberg Weights
How does SSIV pool different diff-in-diffs?

• GPSS propose “opening the black box” of overidentified IV by deriving the weights SSIV implicitly puts on each share instrument

• Builds on Rotemberg (1983), so they call these “Rotemberg weights”

\[
\hat\beta = \sum_n \hat\alpha_n \hat\beta_n,
\quad \text{where} \quad
\underbrace{\hat\beta_n = \frac{\sum_\ell s_{\ell n} y_\ell^\perp}{\sum_\ell s_{\ell n} x_\ell^\perp}}_{n\text{-specific IV estimate}}
\quad \text{and} \quad
\underbrace{\hat\alpha_n = \frac{g_n \sum_\ell s_{\ell n} x_\ell^\perp}{\sum_{n'} g_{n'} \sum_\ell s_{\ell n'} x_\ell^\perp}}_{\text{Rotemberg weight}}
\]

Intuitively, more weight is given to share instruments with more extreme shocks $g_n$ and larger first stages $\sum_\ell s_{\ell n} x_\ell^\perp$

• Weights can be negative (potential issue w/ heterogeneous effects)
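
The decomposition can be reproduced directly from its definition. The following is a minimal sketch on simulated data (all names and values are illustrative, with a constant as the only control) that computes the share-specific IV estimates and Rotemberg weights and checks that they add up to the SSIV estimate.

```python
import numpy as np

rng = np.random.default_rng(4)
L, N = 400, 5
shares = rng.dirichlet(np.ones(N), size=L)   # s_{ln}
shocks = rng.normal(size=N)                  # g_n, treated as fixed
z = shares @ shocks

x = z + rng.normal(size=L)
y = 0.8 * x + rng.normal(size=L)

# Residualize on a constant.
x_perp = x - x.mean()
y_perp = y - y.mean()

# First stage of each share instrument: sum_l s_{ln} x_l^perp.
first_stage = shares.T @ x_perp

# n-specific just-identified IV estimates and Rotemberg weights.
beta_n = (shares.T @ y_perp) / first_stage
alpha_n = shocks * first_stage / np.sum(shocks * first_stage)

# Pooled SSIV estimate for comparison.
beta_ssiv = (z @ y_perp) / (z @ x_perp)

print(alpha_n, alpha_n.sum())
print(alpha_n @ beta_n, beta_ssiv)
```

The weights sum to one by construction but are not restricted to be positive, so inspecting `alpha_n` shows which share instruments dominate the pooled estimate and whether any enter negatively.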

Rotemberg Weights in Card (2009)
Is Share Exogeneity Plausible?

Share exogeneity assumption is not that “shares don’t causally respond to the residual” (they can’t: shares are pre-determined)
• It’s: “all unobservables are uncorrelated with anything about the local share distribution”

This sufficient condition is typically violated when there are any unobserved shocks $\nu_n$ that affect $\varepsilon_\ell$ via the same or correlated shares

• I.e. if $\varepsilon_\ell = \sum_n s_{\ell n}\nu_n + \tilde\varepsilon_\ell$, then $s_{\ell n}$ and $\varepsilon_\ell$ cannot be uncorrelated in large samples, even if $\nu_n$ are uncorrelated with $g_n$

• E.g. in ADH, unobserved technology shocks across industries affect labor markets via lagged emp. shares, along with observed $g_n$

• Problem arises when shares are “generic”, i.e. predicting many things

Card and ADH Revisited

When share exogeneity is ex ante plausible, can test its assumptions ex post (focusing on high Rotemberg weight $n$):

• Balance/pre-trend tests

• Overidentification tests (under constant effects)

• Straightforward to implement; no different than any other IV

GPSS find that balance/overidentification tests broadly pass for Card ... but fail badly for ADH, consistent with ex ante implausibility

A Taxonomy of SSIV Settings
Case 1: the IV is based on a set of shocks which can be thought of as an instrument (i.e. many, plausibly quasi-randomly assigned)

• BHJ show how this identifying variation can be mapped to estimate effects at a different “level” (i.e. industries → local labor markets)

Case 2: the researcher does not directly observe many quasi-random shocks, but can estimate them in-sample

• Canonical setting of Bartik (1991), where $g_n$ are average industry growth rates (thought to proxy for latent demand shocks)

• See also Card (2009), where national immigration rates are estimated

Case 3: the $g_n$ cannot be naturally viewed as an instrument

• Either too few or not plausibly exogenous, even given some $q_n$

• Identification may (or may not) instead follow from share exogeneity
They suggest thinking about whether shares are “tailored” to the economic question/treatment, or are “generic”
• Generic shares (e.g. ADH): unobserved ν_n are likely to enter ε_ℓ via
the same or similar shares, violating share exogeneity

• Tailored shares have a diff-in-diff feel; don’t even need the shocks,
except to possibly improve power or avoid many-IV bias
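
To illustrate the last point, the sketch below compares two estimators on simulated data in which shares are exogenous by construction: 2SLS using the shares themselves as instruments (no shocks needed), and just-identified IV using the single shift-share combination. The helper `tsls`, the data-generating process, and all parameter values are hypothetical and chosen only for illustration.

```python
import numpy as np

rng = np.random.default_rng(1)
L, N = 200, 6  # hypothetical numbers of locations and industries

s = rng.dirichlet(np.ones(N), size=L)   # shares s_ln, rows sum to one
g = rng.normal(size=N)                  # shocks g_n
z = s @ g                               # shift-share instrument

# Simulated data: the treatment loads on shares through g, and the
# confounder u makes OLS inconsistent while leaving shares exogenous.
u = rng.normal(size=L)
x = z + u + rng.normal(size=L)
y = 1.5 * x + u + rng.normal(size=L)

def tsls(y, x, Z):
    """2SLS of y on x and a constant, using the columns of Z as instruments."""
    const = np.ones((len(y), 1))
    W = np.hstack([const, Z])                          # instruments incl. constant
    X = np.hstack([const, x[:, None]])                 # regressors incl. constant
    X_hat = W @ np.linalg.lstsq(W, X, rcond=None)[0]   # first-stage fitted values
    return np.linalg.lstsq(X_hat, y, rcond=None)[0][1]

# Shares as instruments: drop one share, since the rows sum to one
# and the full set would be collinear with the constant.
beta_shares = tsls(y, x, s[:, :-1])

# Single shift-share instrument: collapses N - 1 instruments into one.
beta_ss = tsls(y, x, z[:, None])

# With one instrument, 2SLS reduces to a ratio of covariances.
beta_ratio = np.cov(z, y)[0, 1] / np.cov(z, x)[0, 1]
```

With only a handful of industries the two approaches use similar variation; with many industries, `s[:, :-1]` becomes a many-instrument first stage, which is where collapsing to the single instrument `z` can help.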
