DiD: Functional Form
Pedro H. C. Sant’Anna
Emory University
January 2025
Introduction
■ A natural question is the extent to which the validity of DiD depends on
functional form restrictions.
■ Following Athey and Imbens (2006), we will say parallel trends is insensitive to
functional form if when it holds for potential outcomes Y(∞), it also holds for
potential outcomes s(Y(∞)) for any strictly monotonic s.
■ Intuitively, this says that parallel trends holds regardless of the units in which one
measures the outcome.
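A minimal numerical sketch of why this matters, using made-up values for the untreated potential outcome Y(∞) (each group-period cell is degenerate at a single value, purely for illustration). Parallel trends holds exactly in levels but fails in logs:

```python
import math

# Hypothetical values of Y(inf); each (group, period) cell is a point mass.
y_inf = {
    ("treated", 1): 10.0, ("treated", 2): 20.0,
    ("comparison", 1): 100.0, ("comparison", 2): 110.0,
}

def trend_gap(s):
    """Treated-minus-comparison difference in the change of s(Y(inf)).

    A value of zero means parallel trends holds for the transformation s.
    """
    treated = s(y_inf[("treated", 2)]) - s(y_inf[("treated", 1)])
    comparison = s(y_inf[("comparison", 2)]) - s(y_inf[("comparison", 1)])
    return treated - comparison

gap_levels = trend_gap(lambda y: y)  # both groups rise by 10 units
gap_logs = trend_gap(math.log)       # ln(2) for treated vs. ln(1.1) for comparison
print(gap_levels, gap_logs)
```

Both groups gain 10 units, but that is a doubling for one group and a 10% increase for the other, so the log trends diverge.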
Why study sensitivity to functional form
■ Studying sensitivity to functional form helps clarify the different ways that a
researcher can justify the validity of a DiD design:
▶ Can verify conditions that ensure PT holds for all functional forms.
■ It often may not be clear from subject-specific knowledge what is the “right”
transformation for PT to hold.
■ Example: different labor market studies have measured earnings in levels, logs, or
percentiles relative to national wage distribution.
■ The choice of transformation may be motivated by which ATT is “most relevant”, but
it is not always obvious that the policy variation will generate PT for that same
transformation.
■ Moreover, we might want to use the same policy variation to study the ATT for
multiple transformations of the same outcome.
■ We will use Meyer, Viscusi and Durbin (1995) as a running example in the next slides:
interested in studying whether changes in weekly benefit amounts affected the
duration of time out of work in Michigan and Kentucky.
Parallel Trends in logs
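$$
\begin{aligned}
E\left[\ln Y_{i,t=2}(\infty) - \ln Y_{i,t=1}(\infty) \mid G_i = 2\right] &= E\left[\ln Y_{i,t=2}(\infty) - \ln Y_{i,t=1}(\infty) \mid G_i = \infty\right] \\
E\left[\ln \frac{Y_{i,t=2}(\infty)}{Y_{i,t=1}(\infty)} \,\middle|\, G_i = 2\right] &= E\left[\ln \frac{Y_{i,t=2}(\infty)}{Y_{i,t=1}(\infty)} \,\middle|\, G_i = \infty\right]
\end{aligned}
$$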
■ Under parallel trends (in logs), the ATT takes the form
$$
ATT = E\left[\ln Y_{i,t=2}(2) - \ln Y_{i,t=2}(\infty) \mid G_i = 2\right] = E\left[\ln \frac{Y_{i,t=2}(2)}{Y_{i,t=2}(\infty)} \,\middle|\, G_i = 2\right].
$$
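As a hedged illustration, the sketch below computes the standard 2x2 DiD contrast on a transformed outcome. The function name `did_estimate` and the toy data are assumptions made for this example, not part of the lecture; the estimate has an ATT interpretation only if parallel trends (and no anticipation) holds for the chosen transformation.

```python
import numpy as np

def did_estimate(y, treated, post, transform=lambda v: v):
    """2x2 DiD on transform(y):
    (treated post - treated pre) - (comparison post - comparison pre)."""
    y = transform(np.asarray(y, dtype=float))
    treated = np.asarray(treated, dtype=bool)
    post = np.asarray(post, dtype=bool)

    def cell_mean(g, t):
        # Mean of the transformed outcome in one group-period cell.
        return y[(treated == g) & (post == t)].mean()

    return (cell_mean(True, True) - cell_mean(True, False)) - (
        cell_mean(False, True) - cell_mean(False, False))

# Toy data (made up): outcomes for treated/comparison units in periods 1 and 2.
y       = [4.0, 6.0, 8.0, 12.0, 5.0, 5.0, 6.0, 6.0]
treated = [1, 1, 1, 1, 0, 0, 0, 0]
post    = [0, 0, 1, 1, 0, 0, 1, 1]

att_levels = did_estimate(y, treated, post)                   # DiD in levels
att_logs = did_estimate(y, treated, post, transform=np.log)   # DiD in logs
print(att_levels, att_logs)
```

The same data deliver two different estimands; which one is credible depends on the transformation for which parallel trends is plausible.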
The rest of the lecture will build on
different notation.
Setup
Model setup
■ Potential outcomes: Y_{i,t}(2), Y_{i,t}(∞). We observe Y_{i,t} = 1{G_i = 2} Y_{i,t}(2) + 1{G_i = ∞} Y_{i,t}(∞).
More general models
■ More recent papers have considered settings with multiple periods and staggered
adoption.
▶ Typically impose a version of the 2-group, 2-period parallel trends assumption for many
periods/groups
(de Chaisemartin and D’Haultfœuille, 2020; Callaway and Sant’Anna, 2021; Sun and Abraham, 2021;
Borusyak, Jaravel and Spiess, 2024; Wooldridge, 2021).
▶ Thus, 2x2 results have immediate implications for the generalized PT assumption in the
staggered case.
■ The following results remain valid if all probability statements are implicitly
conditional on X, as when one assumes conditional parallel trends
(Heckman, Ichimura and Todd, 1997; Abadie, 2005; Sant’Anna and Zhao, 2020; Callaway and Sant’Anna,
2021).
Parallel Trends for all transformations of Y(∞)
Parallel Trends and Insensitivity to Functional Form
■ Following the definition in Athey and Imbens (2006), we say parallel trends is
insensitive to functional form (a.k.a. invariant to transformations) if
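$$
E\left[s(Y_{i,t=2}(\infty)) \mid G_i = 2\right] - E\left[s(Y_{i,t=1}(\infty)) \mid G_i = 2\right]
=
E\left[s(Y_{i,t=2}(\infty)) \mid G_i = \infty\right] - E\left[s(Y_{i,t=1}(\infty)) \mid G_i = \infty\right]
$$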
for all strictly monotonic s such that the expectations exist and are finite.
Insensitivity of Parallel Trends
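Roth and Sant’Anna (2023) establish the following characterization relating PT and
functional form: parallel trends is insensitive to functional form if and only if
$$
\underbrace{F_{Y_{i,t=2}(\infty) \mid G_i=2}(y) - F_{Y_{i,t=1}(\infty) \mid G_i=2}(y)}_{\text{Change in CDF for treated group}}
= \underbrace{F_{Y_{i,t=2}(\infty) \mid G_i=\infty}(y) - F_{Y_{i,t=1}(\infty) \mid G_i=\infty}(y)}_{\text{Change in CDF for comparison group}}
\quad \text{for all } y \in \mathbb{R}. \tag{1}
$$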
Note that if Y(∞) is continuous (discrete), this is equivalent to parallel trends of PDFs (PMFs).
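A small sketch of condition (1) for a discrete outcome, using hypothetical PMFs on the support {0, 1, 2}. The helper names `cdf_trend_gap` and `mean_trend_gap` are illustrative. When the CDF changes coincide across groups, the mean of s(Y(∞)) also has parallel trends for each transformation tried:

```python
import numpy as np

support = np.array([0.0, 1.0, 2.0])

# Hypothetical PMFs of Y(inf) by (group, period) on the common support.
pmf = {
    ("treated", 1): np.array([0.5, 0.3, 0.2]),
    ("treated", 2): np.array([0.4, 0.3, 0.3]),
    ("comparison", 1): np.array([0.3, 0.4, 0.3]),
    ("comparison", 2): np.array([0.2, 0.4, 0.4]),
}

def cdf_trend_gap(pmf):
    """Difference between groups in the change of the CDF, at each support point."""
    cdf = {k: np.cumsum(v) for k, v in pmf.items()}
    return (cdf[("treated", 2)] - cdf[("treated", 1)]) - (
        cdf[("comparison", 2)] - cdf[("comparison", 1)])

def mean_trend_gap(s):
    """Difference between groups in the change of E[s(Y(inf))]."""
    m = {k: float(np.dot(s(support), v)) for k, v in pmf.items()}
    return (m[("treated", 2)] - m[("treated", 1)]) - (
        m[("comparison", 2)] - m[("comparison", 1)])

cdf_gap = cdf_trend_gap(pmf)
# Three strictly monotonic transformations: identity, exp, cube.
mean_gaps = [mean_trend_gap(s) for s in (lambda y: y, np.exp, lambda y: y ** 3)]
print(cdf_gap, mean_gaps)
```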
What Generates PT of CDFs?
■ Under minor regularity conditions, Roth and Sant’Anna (2023) show that parallel
trends of CDFs holds if and only if
$$
F_{Y_{i,t}(\infty) \mid G_i=g}(y) = \theta J_t(y) + (1-\theta) H_g(y) \quad \text{for all } y \in \mathbb{R} \text{ and } (g,t) \in \{2,\infty\} \times \{1,2\}, \tag{2}
$$
for some θ ∈ [0, 1] and CDFs J_t(y) and H_g(y) depending only on time and group,
respectively.
■ This says that the distribution of Y(∞) for group g in period t is a mixture of a
time-dependent distribution (not depending on g) and a group-dependent
distribution (not depending on t).
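To see why the mixture structure in (2) is sufficient for parallel trends of CDFs, a short derivation: for either group g, the group-specific component cancels when differencing over time, so the change in the CDF is the same for both groups.

```latex
\begin{aligned}
F_{Y_{i,t=2}(\infty) \mid G_i=g}(y) - F_{Y_{i,t=1}(\infty) \mid G_i=g}(y)
  &= \bigl[\theta J_2(y) + (1-\theta) H_g(y)\bigr] - \bigl[\theta J_1(y) + (1-\theta) H_g(y)\bigr] \\
  &= \theta \bigl[J_2(y) - J_1(y)\bigr],
\end{aligned}
```

which does not depend on g, so (1) holds. This covers only the "if" direction; the converse is the part that requires the regularity conditions.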
Cases
This implies that PT is insensitive to functional form if and only if one of the following three cases holds:
■ Case 1: (As-If) Randomized Treatment (θ = 1). The distribution of Yi,t (∞)|G = g is
the same for both groups (g = 2, ∞).
■ Case 2: Stationary Y(∞) (θ = 0). For each group, the distribution of Yi,t (∞)|G = g
doesn’t depend on t.
■ Case 3: A mixture of Cases 1 and 2 (θ ∈ (0, 1)).
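The cases can be illustrated by building the four group-period distributions from the mixture form with hypothetical components J_t and H_g (all numbers below are made up). With θ = 1 the groups share the same distribution, with θ = 0 each group's distribution is constant over time, and an interior θ has neither property, yet parallel trends of CDFs holds in all three:

```python
import numpy as np

# Hypothetical time-specific (J) and group-specific (H) PMFs on a 3-point support.
J = {1: np.array([0.6, 0.3, 0.1]), 2: np.array([0.2, 0.3, 0.5])}
H = {"treated": np.array([0.1, 0.1, 0.8]), "comparison": np.array([0.7, 0.2, 0.1])}

def mixture_pmfs(theta):
    """PMF of Y(inf) for each (group, period): theta * J_t + (1 - theta) * H_g."""
    return {(g, t): theta * J[t] + (1 - theta) * H[g] for g in H for t in J}

def cdf_pt_holds(pmf):
    """True if the change in the CDF over time is the same in both groups."""
    cdf = {k: np.cumsum(v) for k, v in pmf.items()}
    d_treat = cdf[("treated", 2)] - cdf[("treated", 1)]
    d_comp = cdf[("comparison", 2)] - cdf[("comparison", 1)]
    return bool(np.allclose(d_treat, d_comp))

def same_across_groups(pmf):
    return all(np.allclose(pmf[("treated", t)], pmf[("comparison", t)]) for t in J)

def same_across_time(pmf):
    return all(np.allclose(pmf[(g, 1)], pmf[(g, 2)]) for g in H)

randomized = mixture_pmfs(1.0)  # theta = 1: as-if randomized treatment
stationary = mixture_pmfs(0.0)  # theta = 0: stationary Y(inf)
hybrid = mixture_pmfs(0.4)      # interior theta: mixture of the two

for name, pmf in [("theta=1", randomized), ("theta=0", stationary), ("theta=0.4", hybrid)]:
    print(name, cdf_pt_holds(pmf), same_across_groups(pmf), same_across_time(pmf))
```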
Can we test PT in CDFs?
Testable Implications
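$$
\underbrace{F_{Y_{i,t=2}(\infty) \mid G_i=2}(y)}_{\text{Counterfactual}}
= \underbrace{F_{Y_{i,t=1}(\infty) \mid G_i=2}(y) + F_{Y_{i,t=2}(\infty) \mid G_i=\infty}(y) - F_{Y_{i,t=1}(\infty) \mid G_i=\infty}(y)}_{\text{Identified}}
\quad \text{for all } y \in \mathbb{R} \tag{3}
$$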
■ Roth and Sant’Anna (2023) show that we can use this to test for cases where it is
clear from the data that we need to justify the particular choice of functional form.
Testing in Practice
■ With discrete support, testing that the implied CDF is non-decreasing is equivalent
to testing that the implied mass is non-negative at all support points, i.e.
$$
f_{Y_{i,t=1} \mid G_i=2}(y) + f_{Y_{i,t=2} \mid G_i=\infty}(y) - f_{Y_{i,t=1} \mid G_i=\infty}(y) \ge 0 \quad \text{for all } y,
$$
where f_{Y_{i,t} | G_i = g}(y) is the probability mass function of Y_{i,t} | G_i = g.
■ To test, we can merely replace the mass functions with sample analogs and apply
tools from the moment inequality literature to test that
$$
E\left[\hat f_{Y_{i,t=1} \mid G_i=2}(y) + \hat f_{Y_{i,t=2} \mid G_i=\infty}(y) - \hat f_{Y_{i,t=1} \mid G_i=\infty}(y)\right] \ge 0 \quad \text{for all } y.
$$
■ With continuous support, one can likewise use methods for testing a continuum of
inequalities (e.g. Andrews and Shi (2013)).
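A rough sketch of the point-estimate step only, for a discrete outcome: it plugs empirical PMFs into the implied counterfactual mass from the right-hand side of (3), f(t=1 | G=2) + f(t=2 | G=∞) − f(t=1 | G=∞), and flags support points where the implied mass is negative. The function name `implied_pmf` and the data are hypothetical. This ignores sampling uncertainty, so it is not a substitute for a formal moment-inequality test and is not meant to represent how any package implements the test.

```python
import numpy as np

def implied_pmf(y_treat_pre, y_comp_pre, y_comp_post, support):
    """Sample analog of the implied period-2 counterfactual PMF for the treated group:
    f(t=1 | treated) + f(t=2 | comparison) - f(t=1 | comparison)."""
    def pmf(sample):
        sample = np.asarray(sample)
        # Empirical frequency of each support point.
        return np.array([(sample == v).mean() for v in support])
    return pmf(y_treat_pre) + pmf(y_comp_post) - pmf(y_comp_pre)

support = [0, 1, 2]
y_treat_pre = [0, 0, 0, 1]   # empirical PMF (0.75, 0.25, 0.00)
y_comp_pre  = [0, 1, 2, 2]   # empirical PMF (0.25, 0.25, 0.50)
y_comp_post = [0, 0, 1, 1]   # empirical PMF (0.50, 0.50, 0.00)

f_hat = implied_pmf(y_treat_pre, y_comp_pre, y_comp_post, support)
violations = [v for v, f in zip(support, f_hat) if f < 0]
print(f_hat, violations)
```

In this toy example the comparison group loses more mass at y = 2 than the treated group had there to begin with, so the implied counterfactual mass at that point is negative.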
Caveats
■ These tests may be useful for detecting when parallel trends is sensitive to
functional form.
■ But failure to reject does not mean that we don’t need to worry about functional
form!
■ Like tests of pre-trends, such pre-tests may be underpowered, and relying on them
can introduce distortions from pre-testing (Roth, 2022).
Empirical Illustration
■ Set-up:
▶ The pre-period is either 2007 or 2010. The post-period is 2015.
■ Panel data from Cengiz, Dube, Lindner and Zipperer (2019) with state-level minimum
wage (MW) changes and state-level employment-to-population ratios for 25-cent wage
bins (in 2016 dollars).
Results: Pre = 2007, Post = 2015
■ Intuitively, employment declines in control states are larger than initial levels in
treatment states (likely because of differential effects of the change in the federal MW).
Results: Pre = 2010, Post = 2015
R package
■ Jon Roth and I have prepared the R package didFF to help you use these tests.
References
Abadie, Alberto, “Semiparametric Difference-in-Differences Estimators,” The Review of
Economic Studies, 2005, 72 (1), 1–19.
Andrews, Donald W. K. and Xiaoxia Shi, “Inference Based on Conditional Moment
Inequalities,” Econometrica, 2013, 81 (2), 609–666.
Athey, Susan and Guido Imbens, “Identification and Inference in Nonlinear
Difference-in-Differences Models,” Econometrica, 2006, 74 (2), 431–497.
Borusyak, Kirill, Xavier Jaravel, and Jann Spiess, “Revisiting Event Study Designs: Robust
and Efficient Estimation,” Review of Economic Studies, 2024, Forthcoming.
Callaway, Brantly and Pedro H. C. Sant’Anna, “Difference-in-Differences with Multiple
Time Periods,” Journal of Econometrics, 2021, 225 (2), 200–230.
Cengiz, Doruk, Arindrajit Dube, Attila Lindner, and Ben Zipperer, “The Effect of Minimum
Wages on Low-Wage Jobs,” The Quarterly Journal of Economics, August 2019, 134 (3),
1405–1454.
de Chaisemartin, Clément and Xavier D’Haultfœuille, “Two-Way Fixed Effects Estimators
with Heterogeneous Treatment Effects,” American Economic Review, 2020, 110 (9),
2964–2996.
Heckman, James J., Hidehiko Ichimura, and Petra E. Todd, “Matching As An Econometric
Evaluation Estimator: Evidence from Evaluating a Job Training Programme,” The Review
of Economic Studies, October 1997, 64 (4), 605–654.
Meyer, Bruce D., W. Kip Viscusi, and David L. Durbin, “Workers’ Compensation and Injury
Duration: Evidence from a Natural Experiment,” The American Economic Review, 1995,
85 (3), 322–340.
Roth, Jonathan, “Pre-test with Caution: Event-study Estimates After Testing for Parallel
Trends,” American Economic Review: Insights, 2022, Forthcoming.
Roth, Jonathan and Pedro H. C. Sant’Anna, “When Is Parallel Trends Sensitive to Functional Form?,”
Econometrica, 2023, 91 (2), 737–747.
Sant’Anna, Pedro H. C. and Jun Zhao, “Doubly robust difference-in-differences estimators,”
Journal of Econometrics, November 2020, 219 (1), 101–122.
Sun, Liyang and Sarah Abraham, “Estimating Dynamic Treatment Effects in Event Studies
with Heterogeneous Treatment Effects,” Journal of Econometrics, 2021, 225 (2).
Wooldridge, Jeffrey M, “Two-Way Fixed Effects, the Two-Way Mundlak Regression, and
Difference-in-Differences Estimators,” Working Paper, 2021, pp. 1–89.