Sending email: wk-08-slides
Ellen Stuart
School of Economics
The University of Sydney
21 September 2023
Reminder
Reminders:
Our goal: to empirically estimate the revenue-maximizing linear income tax rate
• Wk1: Used a simple model to tell us what we need out of the data (e)
• Wk2: Considered the pros and cons of different sources of data
• Wk3-5: Initial/exploratory data analysis (data cleaning, data visualization,
summary statistics)
• Wk6-7: Prediction/inference (linear regression, instrumental variables)
• More inference: difference in differences & our 2nd paper estimating the ETI
1/91
Where we left off
Recall our linear regression model. The true population values are:
$Y_i = \beta_0 + \beta_1 X_i + u_i$
What we estimate:
$Y_i = \hat{\beta}_0 + \hat{\beta}_1 X_i + \hat{u}_i$
2/91
Where we left off
Reminder
3/91
Where we left off
Reminder
We might worry that one (or more) of the following omitted variables are
correlated with hours of study:
• Inherent ability
• Class attendance
• Quality of instruction
• Difficulty of other courses taken concurrently
Because these are all in u, we would have X correlated with u and therefore we
would be concerned about endogeneity.
Where we left off
Reminder
Last week we discussed one strategy empirical economists use when faced with
endogeneity: instrumental variables
Because the two groups are “the same”, we can compare (what happens to the
treatment group) to (what happens in the control group) and assume the
difference is due to the treatment
6/91
Diff-in-diff: Visual/example
Population of
interest
7/91
Diff-in-diff: Visual/example
A bolt of lightning (or maybe some policy intervention) strikes the population,
randomly breaking it into two groups:
Population of
interest
8/91
Diff-in-diff: Visual/example
Because the policy intervention was random, the mean value of Y is ≈ 10 for both
the treatment group and the control group
[Figure: pre-intervention means of Y. Treatment group: 10.2; control group: 10.1.]
The policy intervention has some effect on the value of Y so that the mean value
of Y is different between the treatment and control groups after the intervention
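[Figure: means of Y by group. Treatment group: 10.2 pre-intervention, 15.6 post-intervention. Control group: 10.1 pre-intervention, 11.7 post-intervention.]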
The difference in differences estimator first takes the difference between the post-
and pre-intervention for both groups:
The difference in differences estimator then takes the difference in those differences
(thus the name):
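The two steps can be sketched with the four group means from the example figure (10.2 and 15.6 for the treatment group, 10.1 and 11.7 for the control group). This is only an illustration of the arithmetic; the dictionary layout and variable names are assumptions made for this sketch.

```python
# Group means of Y taken from the example figure in these slides.
means = {
    ("treatment", "pre"): 10.2,
    ("treatment", "post"): 15.6,
    ("control", "pre"): 10.1,
    ("control", "post"): 11.7,
}

# First differences: post minus pre, within each group.
diff_treatment = means[("treatment", "post")] - means[("treatment", "pre")]
diff_control = means[("control", "post")] - means[("control", "pre")]

# Second difference: the difference in those differences.
did = diff_treatment - diff_control

print(f"Change in treatment group: {diff_treatment:.1f}")
print(f"Change in control group:   {diff_control:.1f}")
print(f"Diff-in-diff estimate:     {did:.1f}")
```

With these means the treatment group changes by 5.4 and the control group by 1.6, so the diff-in-diff estimate is 3.8.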
13/91
Diff-in-diff: Intuition
Why take both differences? Why not compare the treated group before and after?
• We want to know how the intervention affected the mean of Y
• We might worry that some other thing also impacted Y at the same time as
the treatment
• If we just looked at the treated group before and after, the effect of that other
thing would be incorrectly attributed to our policy intervention
• Looking at the control group before and after the intervention allows us to
estimate the average impact of that other thing
• Because the two groups are otherwise “the same” apart from the treatment,
we might expect the impact of that other thing to be the same for the
control group and the treatment group
• We can then remove the impact of that other thing from our estimate of
the impact of the intervention
Diff-in-diff: Estimation
Y : outcome variable
Treatment: variable that = 1 if the row in the data is from the treatment group
Post: variable that = 1 if the row in the data is from the post-intervention period
Treatment x Post: variable that = 1 if the row in the data is BOTH from the
treatment group and from the post-intervention period
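The regression these variables enter can be written as below. The equation itself did not survive in this text, so this is a reconstruction: the coefficient ordering (β1 on Treatment, β2 on Post, β3 on the interaction) is inferred from the decomposition used later in these slides, where the control group's post-intervention mean equals β0 + β2.

```latex
Y_i = \beta_0 + \beta_1\,\text{Treatment}_i + \beta_2\,\text{Post}_i
      + \beta_3\,(\text{Treatment}_i \times \text{Post}_i) + u_i
```

Under this labelling, β3 is the diff-in-diff estimate.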
15/91
Diff-in-diff: Estimation
Our data might look something like this: Outcome is our “Y” variable
16/91
Diff-in-diff: Estimation
[Figure: pre- and post-intervention means of Y for the treatment and control groups, with β0 labelled on the plot.]
Diff-in-diff: Estimation
Each coefficient represents either (1) a group mean or (2) a difference between
means:
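A minimal sketch of this point, assuming a regression of Y on Treatment, Post, and Treatment x Post. The toy data below are hypothetical: two observations per cell, constructed so that the four cell means match the example figure. The code uses plain least squares from NumPy rather than any particular econometrics package.

```python
import numpy as np

# Hypothetical toy data: two observations per cell, built so that the four
# cell means match the example figure (10.1, 10.2, 11.7, 15.6).
cells = [
    # (treatment, post, cell mean)
    (0, 0, 10.1),
    (1, 0, 10.2),
    (0, 1, 11.7),
    (1, 1, 15.6),
]
rows = []
for treatment, post, mean in cells:
    for noise in (-0.5, 0.5):  # symmetric noise keeps the cell mean fixed
        rows.append((treatment, post, mean + noise))

data = np.array(rows)
treatment, post, y = data[:, 0], data[:, 1], data[:, 2]

# Design matrix: intercept, Treatment, Post, Treatment x Post.
X = np.column_stack([np.ones_like(y), treatment, post, treatment * post])

# Ordinary least squares via NumPy's least-squares solver.
beta, *_ = np.linalg.lstsq(X, y, rcond=None)
b0, b1, b2, b3 = beta

print(f"b0 (control group mean, pre-intervention):  {b0:.1f}")
print(f"b1 (treatment minus control, pre):          {b1:.1f}")
print(f"b2 (control group change, post minus pre):  {b2:.1f}")
print(f"b3 (difference in differences):             {b3:.1f}")
```

Because the model has one parameter per group-by-period cell, the fitted coefficients reproduce the cell means and their differences exactly: 10.1, 0.1, 1.6, and 3.8.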
21/91
Diff-in-diff: Estimation
What is the counterfactual? I.e., what would have happened to the treatment group if
the policy intervention had not occurred?
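Post-intervention, treatment group (observed): 15.6 = β0 + β1 + β2 + β3
Post-intervention, treatment group (counterfactual): 11.8 = β0 + β1 + β2
Post-intervention, control group: 11.7 = β0 + β2
Pre-intervention, treatment group: 10.2 = β0 + β1
Pre-intervention, control group: 10.1 = β0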
Diff-in-diff: Assumptions
If the intervention had not occurred (i.e., in the absence of the treatment), the
control and treatment groups would have had a similar trend over time
This assumption allows us to use β2 to net out changes in the outcome that are
not due to the treatment
Don’t trust a study that uses diff-in-diff but doesn’t graphically show these trends!
23/91
Diff-in-diff: Assumptions
The allocation of the policy intervention was not determined by the outcome of
interest
The composition of the control and treatment groups is stable (true for panel data;
must be checked for repeated cross sections)
No spillover effects (i.e., the effect on the treatment group does not indirectly
affect the control group)
25/91
Diff-in-diff: Seven more things
[Figures: pre- and post-intervention means of Y for the treatment and control groups.]
Second thing.
We saw an example of this in Week 2 when discussing the use of field experiments
in economics
30/91
Diff-in-diff: Seven more things
Third thing.
31/91
Diff-in-diff: Seven more things
Fourth thing.
⇒ Diff-in-diff requires data from the pre- and post-periods (like panel data or
repeated cross sections); a single cross section won’t work
32/91
Diff-in-diff: Seven more things
Fifth thing.
The coefficients of the regression model presented earlier represent means and
differences in means; there is no slope term included
We could instead apply the diff-in-diff design to coefficient estimates from some
other regression (rather than to variables)
Broadly, we would run a regression for the pre-periods and a regression for the
post-periods
This allows us to examine the trend in post-treatment coefficients (i.e., are they
increasing? Decreasing?)
33/91
Diff-in-diff: Seven more things
Sixth thing.
Note, when using many years of data, standard errors likely need to be adjusted for
autocorrelation (correlation across time)
34/91
Diff-in-diff: Seven more things
This specification is called two-way fixed effects (TWFE) estimation because of the
group and time fixed effects
35/91
Diff-in-diff: Seven more things
Last thing.
The new diff-in-diff methods provide estimation techniques that “correct” for the
biases in TWFE
Three critical papers in this space:
37/91
Diff-in-diff: Seven more things
In short:
38/91
Diff-in-diff: More examples
39/91
Diff-in-diff: More examples
40/91
Diff-in-diff: More examples
41/91
Diff-in-diff: More examples
43/91
Diff-in-diff: More examples
44/91
Paper discussion
Feldstein (1995)
Martin Feldstein (1995). “The Effect of Marginal Tax Rates on Taxable Income: A
Panel Study of the 1986 Tax Reform Act.” Journal of Political Economy, 103(3):
551-572.
Abstract: This paper uses a Treasury Department panel of more than 4,000
taxpayers to estimate the sensitivity of taxable income to changes in tax rates on
the basis of a comparison of the tax returns of the same individual taxpayers
before and after the 1986 reform. The analysis emphasizes that the response of
taxable income involves much more than a change in the traditional measures of
labor supply. The evidence shows an elasticity of taxable income with respect to
the marginal net-of-tax rate that is at least one and could be substantially higher.
The implications for recent tax rate changes are discussed.
45/91
Feldstein (1995): Context
46/91
Feldstein (1995): Context
Additional context (not provided by the paper): marginal tax rates (MTR) faced by
married-filing-jointly returns in 1985 versus 1988.

Bracket   1985: AGI over   1985: MTR   1988: AGI over   1988: MTR
1         $0               0.0%        $0               11%
2         $3,670           11.0%       $3,000           15%
3         $5,940           12.0%       $30,950          28%
4         $8,200           14.0%       $45,000          35%
5         $12,840          16.0%       $90,000          39%
6         $17,270          18.0%
7         $21,800          22.0%
8         $26,550          25.0%
9         $32,270          28.0%
10        $37,980          33.0%
11        $49,420          38.0%
12        $64,750          42.0%
13        $92,370          45.0%
14        $118,050         49.0%
15        $175,250         50.0%

Source: https://taxfoundation.org/data/all/federal/historical-income-tax-rates-brackets/
Feldstein (1995): Context
“The Tax Reform Act of 1986 combined sharp reductions in high marginal tax
rates with base-broadening changes in tax rules. The combination was
designed to be approximately revenue neutral and distributionally neutral if
there were no behavioral response to the tax changes.”
“To increase the political appeal of the tax proposal, the tax changes were
actually structured so that tax revenue would decline in each broad income
class (assuming no behavioral response) and so that the resulting revenue
shortfall would be made up by an increase in the corporate income
tax.”
48/91
Feldstein (1995): Context
49/91
Feldstein (1995): Data
“This is the first time in which panel data have been used to estimate the
sensitivity of taxable income to marginal tax rates.”
50/91
Feldstein (1995): Data
Focus on “the largest marital status subgroup, those taxpayers who were married
and filed a joint return in both 1985 and 1988.”
“Since retirement also causes a substantial change in income, the analysis excludes
taxpayers who were over age 65 in 1988.”
51/91
Feldstein (1995): Data
Paper does not tell us how many observations there were before restricting to the
final analysis sample. For example:
• We don’t know what impact, e.g., dropping all taxpayers who adopted
subchapter S corporation between 1985 and 1988 had on the final sample size
(more on that in a minute)
• We don’t know by what margin married filing jointly is “the largest marital
status subgroup”
Analysis sample includes:
• 3,538 medium-income taxpayers (1985 MTR between 22-38%)
• 197 high-income taxpayers (1985 MTR between 42-45%)
• 57 highest-income taxpayers (1985 MTR 49% or 50%)
52/91
Feldstein (1995): Data
“The analysis excludes taxpayers with 1985 marginal tax rates below 22 percent for
two reasons.”
53/91
Feldstein (1995): Data
54/91
Feldstein (1995): Data
55/91
Feldstein (1995): Empirical strategy
The paper doesn’t show us the equation, but this is what is estimated:
• First difference: (post-pre) for each group
– Percent change in net of tax rate between 1985-1988
– Percent change in taxable income between 1985-1988
• Second difference: difference of differences for both net of tax rate and
taxable income for each group:
– High-income minus medium-income
– Highest-income minus high-income
– Highest-income minus medium-income
56/91
Feldstein (1995): Empirical strategy
A: The difference in the percent change in taxable income between 1985 and 1988
B: The difference in the percent change in net-of-tax rate between 1985 and 1988
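Putting A and B together, the elasticity estimate is their ratio. The notation below is an assumption introduced for this sketch (the slides note that the paper does not show the equation): z is taxable income, τ is the marginal tax rate so that 1 − τ is the net-of-tax rate, and T and C denote the two income groups being compared.

```latex
\hat{e} \;=\; \frac{A}{B}
\;=\; \frac{\%\Delta z^{T} - \%\Delta z^{C}}
           {\%\Delta (1-\tau)^{T} - \%\Delta (1-\tau)^{C}}
```

All percent changes are measured between 1985 and 1988.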
57/91
Feldstein (1995): Empirical strategy
58/91
Feldstein (1995): Empirical strategy
“The changes in the tax rules that accompanied the tax rate reductions mean that
precautions must be taken in comparing incomes in 1985 and 1988. Four such
changes are noteworthy.”
1. Adjusted gross income (AGI) in 1985 excluded 60% of realized capital gains;
exclusion was eliminated by TRA86
Strategy: Paper shows comparisons for both all of AGI and AGI excluding capital
gains
59/91
Feldstein (1995): Empirical strategy
Strategy: Paper has no way to obtain 1985 subchapter C incomes; drops all
taxpayers who adopted subchapter S corporation status between 1985 and 1988
60/91
Feldstein (1995): Empirical strategy
• Assume reduction entirely due to lower MTR (no adjustment for losses)
• Assume reduction entirely due to new offsetting rules (add losses to taxable
income in both 1985 and 1988)
61/91
Feldstein (1995): Empirical strategy
62/91
Feldstein (1995): Empirical strategy
4. Changes to personal exemptions, the effective zero bracket amount, and the
definition of taxable income (specifically, that taxable income was defined net of
the zero bracket amount and personal exemptions)
63/91
Feldstein (1995): Empirical strategy
“One final adjustment is necessary to make modified taxable income for 1985
comparable to the taxable income that the taxpayer would report in 1988 if the
taxpayer did not change his behavior.”
Intuition: even without the tax reform, nominal wages would have increased, on
average, as a result of inflation, promotions, etc.
64/91
Feldstein (1995): Empirical strategy
65/91
Feldstein (1995): Empirical strategy
66/91
Feldstein (1995): Empirical strategy
“There are of course some additional small changes in tax rules that have not been
taken into account. Three deserve special mention.”
• Rules about who was eligible for a tax-benefited retirement savings account
−→ would increase taxable income more for lowest-income group in study; bias
estimates downward
• Increase in Social Security tax rates and tax base
Small increase that is somewhat offset by future benefits
• Changes to the “alternative minimum tax”
−→ Some people experienced smaller reductions in tax rate, and so may not have had as
large an income response as they would have otherwise
67/91
Feldstein (1995): Empirical strategy
Even if we believe the author’s claim that income in 1985 has been sufficiently
adjusted so that it is comparable to income in 1988, we need to think about our
critical assumption for diff-in-diff.
If the intervention had not occurred (i.e., in the absence of the treatment),
the control and treatment groups would have had a similar trend over time
68/91
Feldstein (1995): Empirical strategy
The numerator of our elasticity is “the difference in the percent change in taxable
income between 1985 and 1988”
For parallel trends to hold in this context, that would mean that if the tax reform
had not occurred, the percent change in taxable income between 1985 and 1988
would be the same in the treatment and control groups
69/91
Feldstein (1995): Empirical strategy
70/91
Feldstein (1995): Empirical strategy
However:
Source: Figure 16 in Emmanuel Saez (2017). “Income and Wealth Inequality: Evidence and
Policy Implications.” Contemporary Economic Policy, 35(1): 7-25.
Feldstein (1995): Results
where:
A: The difference in the percent change in taxable income between 1985 and 1988
B: The difference in the percent change in net-of-tax rate between 1985 and 1988
72/91
Feldstein (1995): Results
73/91
Feldstein (1995): Results
We know that income is very skewed. What is the maximum income in the sample?
The paper notes “Because the sample sizes are relatively small for the top tax rate
groups, calculations are presented in the lower part of the table that combine
several individual 1985 marginal tax rate groups with the appropriate sample
weights.”
74/91
Feldstein (1995): Results
“Because the Tax Reform Act of 1986 did not reduce marginal tax rates on capital
gains in the same way that it did for other income, to study the effect of lowering
marginal tax rates it is appropriate to focus on income excluding capital gains.”
75/91
Feldstein (1995): Results
76/91
Feldstein (1995): Results
77/91
Feldstein (1995): Results
78/91
Feldstein (1995): Results
79/91
Feldstein (1995): Discussion
Feldstein (1995) uses a nonstratified panel of individual U.S. tax returns from 1985 and
1988 to estimate the elasticity of taxable income using a difference in differences approach
The paper was one of the first (ever) in empirical economics to use panel data
The identifying tax variation comes from a single tax reform, which significantly decreased
top marginal tax rates and broadened the tax base
They find:
“If the long-run response to a change in marginal tax rates is greater than the short-run
response...this analysis...may understate the long-run sensitivity of taxable income to
changes in tax rates.”
Feldstein (1995): Discussion
Emmanuel Saez, Joel Slemrod, and Seth Giertz* make the following important
observation:
• “[N]ote that if the control group faces a tax change, difference-in-differences
estimates will be consistent only if the elasticities are the same for the two
groups.”
In all of the comparisons in Feldstein (1995), the control group also faced tax
changes, which means the estimates are only consistent if we assume the two
groups have the same elasticity
*Emmanuel Saez, Joel Slemrod, and Seth H. Giertz (2012). “The Elasticity of Taxable Income with Respect to Marginal Tax Rates: A Critical
Review.” Journal of Economic Literature, 50(1): 3-50.
81/91
Feldstein (1995): Discussion
To see this, remember that the estimate of the ETI with this empirical strategy is:
82/91
Feldstein (1995): Discussion
Then:
• eT = (% ∆ in taxable income, TG)/(% ∆ in net-of-tax rate, TG)
• % ∆ in taxable income, CG = 0
−→ the estimated elasticity is twice the size of the true elasticity for the treatment
group
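A small numeric check of this claim. The numbers are hypothetical, and the setup is an assumption: the premise of the example did not survive in this text, so the sketch supposes that the control group has zero elasticity (so its taxable income does not change) and that its net-of-tax rate changes by half as much as the treatment group's. The function name and arguments are illustrative only.

```python
def did_elasticity(e_t, e_c, dtau_t, dtau_c):
    """Diff-in-diff elasticity estimate, given each group's true elasticity
    and its percent change in the net-of-tax rate."""
    dz_t = e_t * dtau_t  # % change in taxable income, treatment group
    dz_c = e_c * dtau_c  # % change in taxable income, control group
    return (dz_t - dz_c) / (dtau_t - dtau_c)

true_e_t = 0.5  # hypothetical true elasticity of the treatment group

# Assumed case: control group has zero elasticity and faces a net-of-tax
# rate change half as large as the treatment group's.
biased = did_elasticity(e_t=true_e_t, e_c=0.0, dtau_t=0.20, dtau_c=0.10)

# Same tax changes, but both groups share the same elasticity.
consistent = did_elasticity(e_t=true_e_t, e_c=true_e_t, dtau_t=0.20, dtau_c=0.10)

print(f"True treatment-group elasticity:       {true_e_t:.2f}")
print(f"Estimate with zero control elasticity: {biased:.2f}")
print(f"Estimate with equal elasticities:      {consistent:.2f}")
```

Under these assumptions the estimate is 1.0, twice the true value of 0.5. When both groups have the same elasticity, the estimate recovers 0.5 even though the control group also faced a tax change.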
83/91
Feldstein (1995): Discussion
All of the groups considered in Feldstein (1995) faced tax rate changes
⇒ for the elasticity estimates to be consistent, we need to assume that all of the
income groups have the same elasticity
Feldstein (1995): Discussion
Today:
Key take-aways:
89/91
Wrapping up
90/91
Wrapping up
Consider the first two papers we’ve read that try to estimate the elasticity of
taxable income (e): Kleven and Schultz (2014) and Feldstein (1995).
For each paper, outline the following in about 150-175 words:
Conclude with the following: note which of the two papers you find more credible
and, in 4-5 sentences, explain the main reason(s) why.
Media assignment
Find a data visualization in a news article (or other media source). In 3 minutes, discuss:
• Where is the visualization from? Was it created by the news source, or did they copy
a visualization from somewhere else?
• What is the visualization trying to show? Is it successful?
• Does the surrounding article discuss the underlying data? Is it mentioned in any notes
of the figure?
– If yes, what is the data source? How do you feel about that data?
– If no, does it change the way you feel about the figure to not know where the
underlying data is from? How so?
• Is there anything that is misleading about the figure?
Your recording should be structured so that both you and the visualization are visible. One
way to do this would be to put your visualization on a slide and share your screen for the
recording.