Introduction To Treatment Effects Handout
Alberto Abadie
MIT
Treatment
Di : Indicator of treatment intake for unit i
Di = 1 if unit i received the treatment, and Di = 0 otherwise.
Outcome
Yi : Observed outcome variable of interest for unit i
Potential Outcomes
Y0i and Y1i : Potential outcomes for unit i
Treatment Effect
The treatment effect or causal effect of the treatment on the
outcome for unit i is the difference between its two potential
outcomes:
Y1i − Y0i
Observed Outcomes
Observed outcomes are realized as

Yi = Y1i·Di + Y0i·(1 − Di),

that is, Yi = Y1i if Di = 1 and Yi = Y0i if Di = 0.
ATE
Average treatment effect is:
αATE = E [Y1 − Y0 ]
ATET
Average treatment effect on the treated is:
αATET = E [Y1 − Y0 |D = 1]
Selection bias
Problem
Comparisons of earnings for the treated and the untreated do not
usually give the right answer:

E[Y|D = 1] − E[Y|D = 0] = E[Y1|D = 1] − E[Y0|D = 0]
= E[Y1 − Y0|D = 1] + {E[Y0|D = 1] − E[Y0|D = 0]}

The first term on the right is the ATET; the term in braces is the selection bias.
Assignment mechanism
Assignment mechanism is the procedure that determines which
units are selected for treatment intake. Examples include:
random assignment
selection on observables
selection on unobservables
Typically, treatment effects models attain identification by
restricting the assignment mechanism in some way.
Key ideas
Under random assignment, treatment intake D is independent of the
potential outcomes (Y0, Y1). Then E[Y0|D = 1] = E[Y0|D = 0], so the
selection bias term vanishes and

αATET = E[Y1 − Y0|D = 1] = E[Y|D = 1] − E[Y|D = 0]

As a result, the difference in means identifies both parameters:

E[Y|D = 1] − E[Y|D = 0] = αATE = αATET
Identification in randomized experiments
The identification result extends beyond average treatment effects.
Let Qθ(Y) be the θ-th quantile of the distribution of Y:
Pr(Y ≤ Qθ(Y)) = θ.

Under random assignment, the marginal distributions of the potential
outcomes are identified:

Y0 ∼ Y0|D = 0 ∼ Y|D = 0 and Y1 ∼ Y1|D = 1 ∼ Y|D = 1,

so quantiles (and any other features) of the distributions of Y0 and
Y1 are identified as well.
α̂ = Ȳ1 − Ȳ0,

where

Ȳ1 = (Σi Yi·Di)/(Σi Di) = (1/N1) Σ{i: Di=1} Yi;
Ȳ0 = (Σi Yi·(1 − Di))/(Σi (1 − Di)) = (1/N0) Σ{i: Di=0} Yi,

with N1 = Σi Di and N0 = N − N1.

α̂ is an unbiased and consistent estimator of αATE.
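As an illustration (not part of the handout), the difference-in-means estimator can be computed in a few lines; the data are the eight-unit example used later for randomization inference:

```python
def diff_in_means(Y, D):
    """Difference-in-means estimator: alpha_hat = Ybar_1 - Ybar_0."""
    treated = [y for y, d in zip(Y, D) if d == 1]
    control = [y for y, d in zip(Y, D) if d == 0]
    return sum(treated) / len(treated) - sum(control) / len(control)

# Eight-unit example: four treated, four untreated.
Y = [12, 4, 6, 10, 6, 0, 1, 1]
D = [1, 1, 1, 1, 0, 0, 0, 0]
print(diff_in_means(Y, D))  # 6.0
```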
Testing in large samples: Two-sample t-test
Notice that:

(α̂ − αATE) / √(σ̂1²/N1 + σ̂0²/N0) →d N(0, 1),

where

σ̂1² = (1/(N1 − 1)) Σ{i: Di=1} (Yi − Ȳ1)²,

and σ̂0² is defined analogously over the untreated. Under
H0: αATE = 0, the two-sample t-statistic is

t = α̂ / √(σ̂1²/N1 + σ̂0²/N0).
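A minimal sketch of the t-statistic above, using only the Python standard library (function name is my own):

```python
from math import sqrt

def t_stat(Y, D):
    """Large-sample two-sample t-statistic for H0: alpha_ATE = 0,
    using the unequal-variance standard error from the text."""
    y1 = [y for y, d in zip(Y, D) if d == 1]
    y0 = [y for y, d in zip(Y, D) if d == 0]
    n1, n0 = len(y1), len(y0)
    m1, m0 = sum(y1) / n1, sum(y0) / n0
    v1 = sum((y - m1) ** 2 for y in y1) / (n1 - 1)  # sigma_hat_1 squared
    v0 = sum((y - m0) ** 2 for y in y0) / (n0 - 1)  # sigma_hat_0 squared
    return (m1 - m0) / sqrt(v1 / n1 + v0 / n0)

t = t_stat([12, 4, 6, 10, 6, 0, 1, 1], [1, 1, 1, 1, 0, 0, 0, 0])
print(round(t, 2))  # 2.64
```

With only four units per arm the normal approximation is poor, which is one motivation for the exact randomization test that follows.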
H0: Y1i = Y0i for all i, against H1: Y1i ≠ Y0i for some i (sharp null)

Yi : 12  4  6  10  6  0  1  1
Di :  1  1  1   1  0  0  0  0        α̂ = 6

There are 70 possible assignments ω of 4 treated units among the 8;
α̂(ω) is the difference in means recomputed under assignment ω:

ω = 1  :  1 1 1 1 0 0 0 0    α̂(ω) = 6
ω = 2  :  1 1 1 0 1 0 0 0    α̂(ω) = 4
ω = 3  :  1 1 1 0 0 1 0 0    α̂(ω) = 1
ω = 4  :  1 1 1 0 0 0 1 0    α̂(ω) = 1.5
···
ω = 70 :  0 0 0 0 1 1 1 1    α̂(ω) = −6

The randomization distribution of α̂ (under the sharp null
hypothesis) is

Pr(α̂ ≤ z) = (1/70) Σ{ω∈Ω} 1{α̂(ω) ≤ z}

Now, find z̄ = inf{z : Pr(|α̂| > z) ≤ 0.05}.

Reject the null hypothesis, H0: Y1i − Y0i = 0 for all i, against the
alternative hypothesis, H1: Y1i − Y0i ≠ 0 for some i, at the 5%
significance level if |α̂| > z̄.
[Figure: randomization distribution of the difference in means α̂(ω), ranging from −8 to 8]

Pr(|α̂(ω)| ≥ 6) = 0.0857
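The randomization test above can be replicated by brute-force enumeration; a sketch in Python:

```python
from itertools import combinations

Y = [12, 4, 6, 10, 6, 0, 1, 1]  # observed outcomes from the example
obs = 6                          # observed difference in means

# Under the sharp null Y1i = Y0i, the outcomes are fixed and only the
# assignment varies.  Enumerate all C(8, 4) = 70 assignments.
diffs = []
for treated in combinations(range(8), 4):
    y1 = [Y[i] for i in treated]
    y0 = [Y[i] for i in range(8) if i not in treated]
    diffs.append(sum(y1) / 4 - sum(y0) / 4)

p_value = sum(abs(d) >= obs for d in diffs) / len(diffs)
print(len(diffs), round(p_value, 4))  # 70 0.0857
```

Since 0.0857 > 0.05, the sharp null is not rejected at the 5% level in this tiny sample.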
Checks and caveats for randomized experiments:
covariate balance
failure of randomization
attrition
non-representative sample
non-representative program
scale effects
Hawthorne effects
Appendix:
Experimental Design

With a fraction p of N units assigned to treatment,

var(Ȳ1 − Ȳ0) = σ1²/(pN) + σ0²/((1 − p)N).

Minimizing this variance over p gives the first-order condition

−σ1²/(p*² N) + σ0²/((1 − p*)² N) = 0.

Therefore:

(1 − p*)/p* = σ0/σ1,

and

p* = σ1/(σ1 + σ0) = 1/(1 + σ0/σ1).
A “rule of thumb” for the case σ1 ≈ σ0 is p* = 0.5. For practical
reasons it is sometimes better to choose unequal sample sizes (even
if σ1 ≈ σ0).
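A small sketch of the optimal-allocation formula (function names are my own):

```python
def var_diff_in_means(p, sigma1, sigma0, N):
    """var(Ybar_1 - Ybar_0) when a fraction p of N units is treated."""
    return sigma1**2 / (p * N) + sigma0**2 / ((1 - p) * N)

def optimal_p(sigma1, sigma0):
    """Variance-minimizing treatment fraction p* = sigma1 / (sigma1 + sigma0)."""
    return sigma1 / (sigma1 + sigma0)

# If the treated outcome is twice as noisy, treat two thirds of the sample.
p_star = optimal_p(2.0, 1.0)
print(round(p_star, 3))  # 0.667
```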
In large samples,

(α̂ − α)/s.e.(α̂) ∼ N(0, 1).

[Figure: standard normal density with rejection regions beyond ±1.96; probability of rejection if µ1 − µ0 = 0]

[Figure: the same density centered at α/s.e.(α̂); probability of rejection if µ1 − µ0 = α]
[Figure: power curves against α/σ for N = 25 and N = 50]
Pr(reject µ1 − µ0 = 0 | µ1 − µ0 = α)
= Φ(−1.96 − α/√(σ1²/(pN) + σ0²/((1 − p)N)))
+ 1 − Φ(1.96 − α/√(σ1²/(pN) + σ0²/((1 − p)N))).
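The power formula above can be evaluated directly; a sketch using only the standard library (Φ via math.erf; function names are my own):

```python
from math import erf, sqrt

def norm_cdf(x):
    """Standard normal CDF, Phi(x)."""
    return 0.5 * (1.0 + erf(x / sqrt(2.0)))

def power(alpha, sigma1, sigma0, N, p=0.5):
    """Pr(reject mu1 - mu0 = 0 | mu1 - mu0 = alpha), 5% two-sided test."""
    se = sqrt(sigma1**2 / (p * N) + sigma0**2 / ((1 - p) * N))
    return norm_cdf(-1.96 - alpha / se) + 1.0 - norm_cdf(1.96 - alpha / se)

# At alpha = 0 the power equals the 5% size; it rises with |alpha| and with N.
print(round(power(0.0, 1.0, 1.0, 50), 3))  # 0.05
```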