Problem Set III: Part 1a: Theoretical Exercise

The problem set covers regression discontinuity (RD) methods for causal inference. Part 1a is a theoretical exercise on RD identification and its assumptions. Part 1b studies Dell (2010) on the persistent effects of Peru's mining mita. Part 2 studies Angrist and Lavy (1999), which uses Maimonides' rule to estimate the effect of class size on test scores in an RD design.

Uploaded by

manuzipeixoto
Copyright
© All Rights Reserved
Problem Set III

Causal Inference, Master in Economics, PUC-Rio 2023

Ursula Mello

[email protected]

Part 1a: Theoretical Exercise


1. Consider the following model that generates the outcome Y for each individual i:

$$Y_i = \beta_0 + \beta_1 D_i + \beta_2 D_i X_i + f(X_i) + u_i \tag{1}$$

where $D_i$ is a treatment indicator that takes value 1 if $i$ participates in the program we want to
evaluate, $X_i$ is an observable continuous variable, $f(x)$ is a differentiable function that does not depend
on any other variable or parameter of the model, and $u_i$ is unobservable. While $u_i \mid X$ follows an
unknown distribution, we know that $E[u_i \mid X]$ and $E[u_i \mid D, X]$ depend continuously on $X$. Furthermore,
participation in the program is determined according to the following rule:

$$D_i = \mathbf{1}\{\gamma_0 + \gamma_1 \mathbf{1}\{X_i \ge x\} + \gamma_2 \varepsilon_i \ge 0\} \tag{2}$$


where $\gamma_0 = -1$, $\gamma_1 = 2$, $\gamma_2 = 1$, $x$ is a fixed threshold, $\varepsilon_i$ is a random shock that follows an iid $N(0, 1)$
distribution independently of everything else, and $\mathbf{1}\{\cdot\}$ is the indicator function that takes value 1 if
the condition inside the braces is satisfied, and zero otherwise.

(a) Derive the effect of the program for $i$, $\tau_i$, as a function of the terms and parameters in (1). Using
$\tau_i$, write the expression for $\tau_{ATT}$ as a function of the parameters of the model.

(b) State and explain the discontinuity condition on $D$ and the local continuity assumption on $Y$
that are needed to identify a treatment effect with the regression discontinuity method.
Are those conditions satisfied in the model described in this exercise? Would
they be satisfied if $\gamma_1$ were equal to 0?

(c) Calculate the effect of the program identified by the RD estimator, $\tau_{RD}$, as a function of $\beta_1$, $\beta_2$ and
$x$. Calculate it also for the case $\gamma_2 = 0$. You may, if you want, use the variable $Z_i = \mathbf{1}\{X_i \ge x\}$ to define
$\tau_{RD}$. And if $\beta_2 = 0$, what effect is identified?

(d) Explain a way to estimate $\tau_{RD}$ when the assumptions needed for RD are satisfied but
the functional forms in (1) and (2) are unknown.
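
As an illustration of the setup in equations (1) and (2), the following Python sketch simulates the model and computes a ratio of local-linear jumps at the threshold. It is only a sketch under stated assumptions: the values of the beta parameters, the threshold (`cutoff=1.0`), the choice f(x) = sin(x), the uniform distribution of X, and a standard normal u drawn independently of everything else are all hypothetical and not given in the exercise. Only the gamma values come from the text.

```python
import numpy as np


def simulate(n, cutoff=1.0, beta0=1.0, beta1=2.0, beta2=0.5, seed=0):
    """Draw (x, d, y) from equations (1) and (2) under illustrative assumptions."""
    rng = np.random.default_rng(seed)
    x = rng.uniform(cutoff - 2.0, cutoff + 2.0, size=n)  # assumed distribution of X
    eps = rng.standard_normal(n)                         # iid N(0, 1) shock in (2)
    u = rng.standard_normal(n)                           # assumed: u independent of X and D
    z = (x >= cutoff).astype(float)
    # Rule (2) with gamma0 = -1, gamma1 = 2, gamma2 = 1, as stated in the exercise.
    d = (-1.0 + 2.0 * z + 1.0 * eps >= 0).astype(float)
    # Equation (1) with the assumed smooth function f(x) = sin(x).
    y = beta0 + beta1 * d + beta2 * d * x + np.sin(x) + u
    return x, d, y


def boundary_value(x, v, cutoff, h, above):
    """Local-linear estimate of E[v | X] at the cutoff from one side (uniform kernel)."""
    if above:
        mask = (x >= cutoff) & (x < cutoff + h)
    else:
        mask = (x < cutoff) & (x >= cutoff - h)
    # polyfit returns coefficients from highest degree down: [slope, intercept].
    slope, intercept = np.polyfit(x[mask] - cutoff, v[mask], 1)
    return intercept


def rd_wald(x, d, y, cutoff, h):
    """Jump in the outcome divided by the jump in treatment probability."""
    jump_y = boundary_value(x, y, cutoff, h, True) - boundary_value(x, y, cutoff, h, False)
    jump_d = boundary_value(x, d, cutoff, h, True) - boundary_value(x, d, cutoff, h, False)
    return jump_y / jump_d


if __name__ == "__main__":
    x, d, y = simulate(200_000)
    print("treated share below cutoff:", d[x < 1.0].mean())
    print("treated share at or above cutoff:", d[x >= 1.0].mean())
    print("ratio of jumps:", rd_wald(x, d, y, cutoff=1.0, h=0.5))
```

With these assumed values the treated share jumps at the threshold but does not go from 0 to 1, and the ratio of jumps should land near beta1 + beta2 * cutoff. The bandwidth `h` and the uniform kernel are arbitrary choices made for the sketch.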

Part 1b: Dell, 2010. The persistent effects of Peru’s mining mita. Econometrica.
1. Explain what the mita system was and how it creates a discontinuity in the paper’s context. Is that a
sharp or fuzzy RD design, and why? Explain what sharpness and fuzziness mean in this context.

2. Explain the role of Table 1 in the paper. For which of the RDD's crucial assumptions does it
provide evidence? Why is that condition crucial for interpreting RD estimates as causal?

3. Suppose we are working in a homogeneous-effects, sharp RD framework, so that $Y_i = \alpha D_i + Y_{0i}$. Use
the fact that $\lim_{z \to z_0^+} E[Y_{ij} \mid Z_i = z] = \lim_{z \to z_0^-} E[Y_{ij} \mid Z_i = z]$ to demonstrate formally that $\alpha$ can be
identified as $\alpha = \lim_{z \to z_0^+} E[Y_i \mid Z_i = z] - \lim_{z \to z_0^-} E[Y_i \mid Z_i = z]$.

4. Consider now equation (1) of the paper.


(a) What does $f(\text{geographic location}_d)$ stand for, and why is it so important for the estimation of $\alpha_{RD}$?
(b) Table 2 reports the estimation results for that equation. What is the cutoff $z_0$ in the
specification of Panel C? How would you define $D_i$ as a function of this cutoff?
(c) Use delldata_consumption.dta and delldata_childstunt.dta to replicate the results of Panel C.
Make sure you specify $f(\text{geographic location}_d)$ and $D_i$ as discussed in (4a) and (4b), and that
you check the table’s notes to understand which specification is used. [Hint: interpret, for
this application, the Euclidean distance as its absolute value!]
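
For item 3 above, the argument can be sketched in a few lines. This sketch assumes the sharp assignment rule $D_i = \mathbf{1}\{Z_i \ge z_0\}$ and reads the stated continuity condition as applying to the untreated potential outcome $Y_{0i}$. Both are interpretations of the exercise, not statements taken from it.

```latex
\begin{align*}
E[Y_i \mid Z_i = z] &= \alpha\, E[D_i \mid Z_i = z] + E[Y_{0i} \mid Z_i = z] \\
\lim_{z \to z_0^+} E[Y_i \mid Z_i = z] &= \alpha + \lim_{z \to z_0^+} E[Y_{0i} \mid Z_i = z]
  && \text{(since } D_i = 1 \text{ for } z \ge z_0\text{)} \\
\lim_{z \to z_0^-} E[Y_i \mid Z_i = z] &= \lim_{z \to z_0^-} E[Y_{0i} \mid Z_i = z]
  && \text{(since } D_i = 0 \text{ for } z < z_0\text{)} \\
\lim_{z \to z_0^+} E[Y_i \mid Z_i = z] - \lim_{z \to z_0^-} E[Y_i \mid Z_i = z] &= \alpha
  && \text{(the two limits of } E[Y_{0i} \mid Z_i = z] \text{ coincide)}
\end{align*}
```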

Part 2: Angrist, J.D. and Lavy, V., 1999. Using Maimonides’ rule to estimate the effect of class size on scholastic achievement. The Quarterly Journal of Economics.
1. Explain what Maimonides’ rule is and how it creates a discontinuity in the assignment of class size to
children. Is that a sharp or fuzzy RD, and why?
2. The data file angrist1999.dta contains the data used in that paper. Use it for the following points:

(a) Reproduce Figure 2, Panel A, but using math scores instead of reading scores. What preliminary
conclusions can you draw about the relationship between class size and school performance?
(b) Now run an OLS regression of average math score on average class size, controlling for the percentage
of disadvantaged kids and enrollment. Is the estimate for average class size causal? Is its
sign consistent with your discussion in item (a)?
(c) Implement an IV estimation of average math test scores on average class size, instrumenting
class size with Maimonides’ rule. Control for the percentage of disadvantaged kids and enrollment.
Comment on the estimate for class size: is it causal, and how does it compare to what was found in
item (b)? If causal, which assumptions do we need to make for that to hold?
3. Let us now give a LATE-type interpretation to our problem. Disregard enrollment cohorts with
more than 50 students. Suppose we are interested in the local effect in a neighbourhood
of 5 students around the class size cutoff (i.e. $z_0 = 40$, $h = 5$).
(a) What extra assumption is needed for this estimation, and how is it formally stated? How
would it be interpreted in the current context?
(b) Estimate a LATE-like $\alpha_{RD}$ for math scores as the outcome variable. That is, estimate¹

$$\hat{\alpha}_{RD} = \frac{E[Y_i \mid W_i = 1, S_i = 1] - E[Y_i \mid W_i = 0, S_i = 1]}{E[D_i \mid W_i = 1, S_i = 1] - E[D_i \mid W_i = 0, S_i = 1]} \tag{3}$$

¹ You will be looking at the cohorts with $z_0 \pm h$ pupils. $S_i$ is a dummy for belonging to such a group, and $W_i$ a dummy for belonging
to an enrollment cohort above the class size cutoff.
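
The two ingredients of Part 2 can be sketched in Python as follows. The first function implements the class-size prediction from Angrist and Lavy (1999), enrollment divided by int((enrollment - 1)/40) + 1. The second computes the sample analogue of equation (3). The function names, the array-based interface, and the reading of $W_i$ as "enrollment strictly above the cutoff" and $S_i$ as "enrollment within $h$ of the cutoff" are assumptions made for illustration. No variable names from angrist1999.dta are used, since they are not given here.

```python
import numpy as np


def maimonides_class_size(enrollment, cap=40):
    """Predicted class size when a cohort is split whenever it exceeds `cap` pupils."""
    e = np.asarray(enrollment, dtype=float)
    n_classes = np.floor((e - 1) / cap) + 1  # number of classes implied by the rule
    return e / n_classes


def wald_rd(y, d, enrollment, cutoff=40, h=5):
    """Sample analogue of equation (3): ratio of mean differences near the cutoff."""
    y, d, e = (np.asarray(a, dtype=float) for a in (y, d, enrollment))
    s = np.abs(e - cutoff) <= h   # assumed reading of S_i: cohort within z0 +/- h
    w = e > cutoff                # assumed reading of W_i: cohort above the cutoff
    num = y[s & w].mean() - y[s & ~w].mean()
    den = d[s & w].mean() - d[s & ~w].mean()
    return num / den


if __name__ == "__main__":
    # Toy, made-up cohorts (not the paper's data): scores fall by 0.5 per extra pupil.
    enrollment = np.array([36, 38, 40, 42, 44, 60])
    class_size = maimonides_class_size(enrollment)
    score = 70.0 - 0.5 * class_size
    print(class_size)
    print(wald_rd(score, class_size, enrollment))
```

In the toy data the outcome is an exact linear function of class size, so the ratio recovers the assumed slope. With real data, the means would be taken over the observations in each group, and the choice of whether a cohort of exactly 40 counts as below the cutoff should follow the problem set's definition of $W_i$.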
