Assignment
In these exercises you are going to replicate results from the lecture and estimate the effect of following a
gifted and talented program (gt) on students’ math and language scores in secondary education.
The data are drawn from Booij et al. (2016). In Part 1 you will try and figure out if you could
apply a Regression Control (RC) design, as you did in previous CLabs 3.1 and 3.2.
1. Open the data and familiarize yourself using desc and sum.
Contains data from C:\Users\Minza Mangi\Downloads\gta_data.dta
  obs:         3,057
 vars:             9
 size:       220,104

variable name   storage type   display format   variable label
id              double         %12.0g
cohort          double         %12.0g           cohort
male            double         %12.0g
age             double         %12.0g
ist_norm        double         %12.0g           IST pre-test distance to cutoff
cito            double         %12.0g           End of primary school exam CITO score
resmath         double         %12.0g           Average math score grade 2 to 6
reslang         double         %12.0g           Average language score grade 2 to 6
__gt            double         %12.0g           Treated child (GT)
. sum
Visualize the data using hist XXX, frac name(hist_XXX) where XXX is the name of the
variable that you are looking at. Check out all variables (except id, of course). Which
variables do you think are realized prior to the GT program, and which are realized after it?
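The histograms for all variables can be drawn in one loop instead of one command at a time. This is a minimal sketch, assuming the data are already loaded and that the variables carry the names used in the assignment's commands (in particular gt for the treatment indicator, which the describe listing shows as __gt). The replace suboption is an addition so that the loop can be rerun.

```stata
* Draw a fraction histogram for every variable except id.
* Assumption: the treatment indicator is named gt, as in the assignment's commands.
foreach v of varlist cohort male age ist_norm cito resmath reslang gt {
    histogram `v', fraction name(hist_`v', replace)
}
```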
[Figures: histograms (fraction) of Average math score grade 2 to 6, Average language score grade 2 to 6, male, age, IST pre-test distance to cutoff, and Treated child (GT).]
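The variables age, male, cohort, ist_norm and cito are realized before the GT program, whereas the
average math score (resmath) and the average language score (reslang) are realized after it. The
treatment indicator gt records participation in the program itself.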
2. To estimate the effect of the gt we will ultimately have to compare individuals that get
the program (gt=1) to those that don’t (gt=0). Type
Summary statistics: mean
by categories of: __gt (Treated child (GT))

 __gt         id    cohort   male     age   ist_norm     cito   resmath   reslang   __gt
Total    1575.15   2003.97   0.54   12.16     -13.00   547.49      6.61      6.77   0.21
The summary statistics show that individuals who get the program have higher average
math and language scores than those who do not.
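The header of the output above ("Summary statistics: mean", "by categories of") is what tabstat prints, so a command along the following lines would produce such a table. The command actually used is not shown in the text, so the variable list and options here are assumptions.

```stata
* Group means by treatment status (rows for gt=0, gt=1 and Total).
* Assumption: the treatment indicator is named gt; the original command is not shown.
tabstat cohort male age ist_norm cito resmath reslang, by(gt) statistics(mean) format(%9.2f)
```

Comparing the gt=0 and gt=1 rows for the prior variables (male, age, ist_norm, cito) is what reveals imbalance.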
3. Observing imbalance (prior differences between the groups) is not a good sign, but not
necessarily a problem. Imbalance with respect to gender, for example, only poses a big problem
if it predicts math and/or language scores. A simple way to check this is to do a regression using
prior variables as predictors. For simplicity we will neglect cohort and only look at math:
. reg resmath male age ist cito, r
Robust
resmath Coef. Std. Err. t P>|t| [95% Conf. Interval]
Is gender predictive?
The regression results show that gender is not a significant predictor of math scores. Its
p-value is 0.671, which is greater than the significance level of 0.05.
So, is imbalance with respect to gender a problem? It is not always easy to interpret the
coefficients from a regression. The coefficient of age, for example, is $\hat{\beta}_{age} = -.1639968$ and
significant ($t\text{-stat} = -3.89$). What does it mean? It means that when a student’s age is 1
year higher, her predicted grade is 0.16 lower. That seems a lot, but is it? That depends on how
big a 1 year change is.
Recall from q1 that $SD(age) = 0.48$. A 1 unit change in age is more than $2 \times SD(age)$.
If $\pm 2 \times SD(age)$ around the mean contains about 95% of the data, $2 \times SD(age)$
must contain about 47.5%. So if your age is 1 year higher, you “overtake” about
47.5% of students and go from being one of the youngest to one of the oldest. A big change.
Also, you can look at the histogram of age to see if you think a 1 year age difference is large or
not given this sample.
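The arithmetic behind this argument can be checked directly in Stata. The sketch below only reuses the two numbers quoted in the text (the age coefficient and SD(age) = 0.48); nothing is re-estimated.

```stata
* How many standard deviations is a 1-year change in age?
display 1/0.48

* Predicted change in resmath for a 1-SD (0.48 year) change in age
display -.1639968*0.48
```

The first expression is about 2.08 and the second about -0.079, so a one-SD change in age moves the predicted math grade by less than a tenth of a point.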
The histogram below shows that a 1-year age difference is not large for this sample. The
mean age is 12.15 with a standard deviation of 0.89, which is also visible in the histogram.
[Figure: histogram (density) of age.]
Robust
resmath Coef. Std. Err. t P>|t| Beta
What does this reveal? Is the coefficient of age really that big, or are some of the others bigger?
No. The standardized (beta) coefficients show that the coefficient of age is not that large: its
beta is only 6.98% of a standard deviation in average math scores, whereas the beta of the IST
pre-test distance to cutoff is 32.7%.
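The table header above ends in a Beta column, which is what regress reports when the beta option is given. A sketch of the corresponding command, assuming it is the q3 regression with that option added (the original command is not shown):

```stata
* Same regression as in q3, but reporting standardized (beta) coefficients
* in place of confidence intervals. ist abbreviates ist_norm, as in q3.
reg resmath male age ist cito, r beta
```

Each beta is the predicted change in resmath, in standard deviations, for a one-standard-deviation change in the predictor, which makes the coefficients comparable across variables measured on different scales.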
5. The philosophy of doing a controlled comparison is to find subgroups of units that are
comparable in terms of other observable characteristics (cohort, male, age, ist, cito) and compare
gt=0 and gt=1 people within that subgroup, rather than all units. This requires “overlap”: within
each subgroup we need treated (gt=1) and control (gt=0) units. From q4 we know that imbalance
with respect to ist_norm and cito would be most problematic because these are very predictive of
math grades. Let’s check if we have overlap with respect to these variables:
[Figure: scatter plot of End of primary school exam CITO score against IST pre-test distance to cutoff.]
Do we have overlap? Can we use the RC design here? What would the scatter have looked like if
gt had been randomly assigned?
Yes, there is an overlap problem. In the scatter we cannot tell for a given point
whether it belongs to the control group or to the treatment group.
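Overlap in ist_norm can also be checked numerically rather than by eye. This is a minimal sketch, assuming the treatment indicator is named gt; the helper variable above is a hypothetical name introduced only for this check.

```stata
* Range of ist_norm within each treatment group.
* If the [min, max] intervals do not intersect, there is no overlap.
tabstat ist_norm, by(gt) statistics(min max n)

* Cross-check: count control and treated units on each side of the cutoff.
generate byte above = ist_norm >= 0 if !missing(ist_norm)
tabulate above gt
```

Under random assignment of gt, both cells of each row in the cross-tabulation would be populated.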
Part 2 – Regression Discontinuity design: Comparing kids around the cutoff
In part 2 of these exercises you are going to replicate results from the lecture and use a regression discontinuity
design to tackle the problem of “no overlap”. The previous question showed that there are no
control units with ist_norm≥0, and no treated units with ist_norm<0.
[Figure: scatter plot of Average math score grade 2 to 6 against IST pre-test distance to cutoff.]
By giving different colors and shapes to the scatter points in the sample gt==0 and gt==1, we can
clearly see the overlap problem. We can also see, however, that there are many units around
ist_norm=0, the cutoff value at which treatment status discretely changes from 0 to 1.
7. From the scatter in q6 it is hard – if not impossible – to see if students score higher after the cutoff. To
see this we can add regression lines to the plot:
[Figure: scatter plot of average math score against IST pre-test distance to cutoff, with fitted regression lines on each side of the cutoff.]
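A plot of this kind can be built with twoway by overlaying group-specific scatters and linear fits. The command used in the assignment is not shown, so the marker choices, the vertical line at the cutoff and the graph name rd_lines below are assumptions for illustration.

```stata
* Scatter of math scores against the running variable, by treatment status,
* with a separate linear fit on each side of the cutoff.
* The /// line continuations require running this from a do-file.
twoway (scatter resmath ist_norm if gt==0, msymbol(oh) mcolor(navy))   ///
       (scatter resmath ist_norm if gt==1, msymbol(dh) mcolor(maroon)) ///
       (lfit resmath ist_norm if gt==0)                                ///
       (lfit resmath ist_norm if gt==1),                               ///
       xline(0) legend(order(1 "gt=0" 2 "gt=1")) name(rd_lines, replace)
```

The vertical gap between the two fitted lines at ist_norm=0 is the visual counterpart of the jump estimated in the RD regression.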
The graph clearly shows a discontinuity in average math scores at ist_norm=0. Because
treatment is assigned at the cutoff, the treatment and control groups do not overlap in
ist_norm. Regression-control comparisons are unreliable in estimating the program's impact
under these circumstances, but RD analysis provides reliable estimates when there is no
overlap in assignment. This limited use of RD analysis still holds promise for numerous
therapeutic applications.
8. The lines in the plot show a jump (discontinuity) at point ist_norm=0. From the graph it is
difficult to see exactly how large the jump is, and we cannot tell if it is significant. For that we
need a regression. First we quietly run a regression without controls for future reference and
store it as a; then we do the RD regression and store it as b:
Linear regression Number of obs = 3057
F( 3, 3053) = 176.29
Prob > F = 0.0000
R-squared = 0.1567
Root MSE = 1.0545
Robust
resmath Coef. Std. Err. t P>|t| [95% Conf. Interval]
The regression shows that the discontinuity at ist_norm=0 is significant and that the jump,
which is measured by the coefficient of gt, is not large. The coefficient of ist_norm is 0.0202:
below the cutoff, a one-point increase in ist_norm raises the predicted math score by about 0.02 points.
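The two stored regressions can be sketched as follows. The commands themselves are not shown in the text; the specification of b is inferred from the three regressors reported in the F statistic and from the later command for c, and the specification of a (gt only) is an assumption. eststo and esttab come from the community-contributed estout package.

```stata
* One-time install of the estout package, if needed:
* ssc install estout

* (a) no controls: raw difference in mean math scores, run quietly
eststo a: quietly reg resmath gt, r

* (b) RD regression: jump at the cutoff (gt) plus separate slopes in ist_norm
eststo b: reg resmath gt ist_norm c.gt#c.ist_norm, r
```

In b, the coefficient of gt is the estimated jump at ist_norm=0, while ist_norm and the interaction describe the slopes on either side of the cutoff.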
9. The graph from q7 does not clearly show how the data fits the line(s). To see that we have to plot
the data using a smaller number of bins. The program grrd does that (check out the code in
progs.do if you want to learn about it):
. grrd resmath
gt#c.ist_norm
1 .0115797 .0057695 2.01 0.045 .0002672 .0228921
The analysis shows that ist_norm is a significant predictor of math scores below the cutoff,
for students without the GT program (b1 = 0.0202418, p-value = 0.000), and that the slope is
significantly different above the cutoff, for students in the GT program (interaction
b2 = 0.0115797, p-value = 0.045).
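Written out, the RD regression has the form below. The symbols $\alpha$, $\tau$, $\beta_1$ and $\beta_2$ are notation introduced here for illustration, with $\beta_1$ and $\beta_2$ corresponding to b1 and b2 above, and the sketch assumes both estimates come from the same regression.

```latex
\begin{aligned}
resmath_i &= \alpha + \tau\, gt_i + \beta_1\, ist\_norm_i
             + \beta_2\,(gt_i \times ist\_norm_i) + \varepsilon_i \\
\text{slope for } gt_i = 0 &: \quad \beta_1 = 0.0202418 \\
\text{slope for } gt_i = 1 &: \quad \beta_1 + \beta_2 = 0.0202418 + 0.0115797 = 0.0318215 \\
\text{jump at } ist\_norm_i = 0 &: \quad \tau
\end{aligned}
```

The interaction therefore measures a change in slope, not the treatment effect; the effect of the program at the cutoff is $\tau$, the coefficient of gt.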
10. The graph makes it visually clear that there is a discontinuity in average math scores at point
ist_norm=0. If it is reasonable to assume that students around the cutoff are similar in all aspects
other than gt assignment, we can conclude that the difference that we see is due to the program.
We have some additional prior variables (cohort, male, age, cito) that we can include to see if the
result changes:
. eststo c: reg resmath gt ist_norm c.gt#c.ist_norm age male cito i.cohort, r
                              Robust
     resmath        Coef.   Std. Err.       t    P>|t|    [95% Conf. Interval]
      cohort
        1999    -.1774812    .0916062   -1.94    0.053    -.3570976    .0021353
        2000     .0991205    .0885403    1.12    0.263    -.0744844    .2727254
        2001     .0050158    .0985941    0.05    0.959     -.188302    .1983337
        2002    -.2836387    .0975594   -2.91    0.004    -.4749278   -.0923496
        2003    -.0853821    .0977454   -0.87    0.382     -.277036    .1062717
        2004     .0211138    .0902265    0.23    0.815    -.1557974     .198025
        2005     .0432867    .0944868    0.46    0.647    -.1419779    .2285513
        2006    -.1448801    .0971173   -1.49    0.136    -.3353024    .0455422
        2007     .1553589    .0911375    1.70    0.088    -.0233384    .3340563
        2008     .6665464    .0937586    7.11    0.000     .4827098    .8503831
        2009     .4678772    .1015936    4.61    0.000     .2686781    .6670764
        2010     .4537919    .0889739    5.10    0.000     .2793367    .6282471
The regression results show that students around the cutoff are not similar after
incorporating age, male, cito and cohort as control variables, because the coefficient of
c.ist_norm (0.1143) has a p-value of 0.57, which is insignificant at the 5% significance level.
Look at columns (1) – (3) and explain the differences and similarities. Do you believe it is
credible to assume that students around ist_norm=0 are similar?
. esttab a b c, b(a2) se nogap star(* 0.10 ** 0.05 *** 0.01)
gt#c.ist_norm
1 -.0001287 .0042516 -0.03 0.976 -.0084651 .0082076
The analysis shows that ist_norm is also a significant predictor of language scores below the
cutoff, for students without the GT program (b1 = 0.0079921, p-value = 0.000). The interaction
with gt, which captures the change in slope for students in the GT program, is insignificant
(b2 = -0.0001287, p-value = 0.976).
The slope below the cutoff is thus significant for both math and language scores, whereas
the change in slope for GT students is significant for math and insignificant for
language scores.
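The esttab command that follows refers to a fourth stored estimate, d, for the language scores. The specification behind d is not shown in the text, so reusing the specification of column c with reslang as the outcome is an assumption.

```stata
* Assumption: column d is the language-score analogue of column c.
eststo d: reg reslang gt ist_norm c.gt#c.ist_norm age male cito i.cohort, r
```

The coefficient of gt in d then answers whether there is a jump in language scores at the cutoff, and how large it is.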
. esttab a b c d, b(a2) se nogap star(* 0.10 ** 0.05 *** 0.01)
Do we see an effect on language skills as well? How large?