Causal Inference in Statistics and Biostatistics, Homework 4
1. Use the NSW data to compare the following four models in Lecture 6.
The order of the variables from left to right is: treatment indicator (1 if treated, 0 if not
treated), age, education, Black (1 if black, 0 otherwise), Hispanic (1 if Hispanic, 0
otherwise), married (1 if married, 0 otherwise), no degree (1 if no degree, 0 otherwise),
RE74 (earnings in 1974), RE75 (earnings in 1975), and RE78 (earnings in 1978). The
last variable is the outcome; other variables are pre-treatment. You should divide RE74,
RE75 and RE78 by 1000.
Compare your results with Tables 8.6 and 8.7.
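The column order and rescaling described above can be sketched as a small loader. This is a minimal sketch, not a prescribed solution: it assumes the NSW file is whitespace-delimited with one unit per line and no header row, and the names `COLUMNS`, `EARNINGS`, and `load_nsw` are hypothetical labels introduced only for illustration.

```python
# Column names are illustrative labels for the ten variables, in the stated order.
COLUMNS = [
    "treat", "age", "education", "black", "hispanic",
    "married", "nodegree", "re74", "re75", "re78",
]
EARNINGS = ("re74", "re75", "re78")


def load_nsw(lines):
    """Parse whitespace-delimited NSW rows into dicts, with earnings in thousands.

    `lines` is any iterable of strings, for example an open file handle.
    Assumes no header row; blank lines are skipped.
    """
    rows = []
    for line in lines:
        fields = line.split()
        if not fields:
            continue
        if len(fields) != len(COLUMNS):
            raise ValueError(
                f"expected {len(COLUMNS)} fields, got {len(fields)}"
            )
        row = dict(zip(COLUMNS, (float(v) for v in fields)))
        # Divide RE74, RE75 and RE78 by 1000, as the problem requires.
        for name in EARNINGS:
            row[name] /= 1000.0
        rows.append(row)
    return rows
```

With an open file handle `f` for the data file, `load_nsw(f)` would return one dict per unit; `re78` is the outcome and the remaining non-treatment columns are the pre-treatment covariates.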
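Model 3: $Y_i(0)$ and $Y_i(1)$ are independent with covariates,
$$
\begin{pmatrix} Y_i(0) \\ Y_i(1) \end{pmatrix} \,\Big|\, X_i, \theta \sim N\left( \begin{pmatrix} X_i \beta_c \\ X_i \beta_t \end{pmatrix}, \begin{pmatrix} \sigma_c^2 & 0 \\ 0 & \sigma_t^2 \end{pmatrix} \right).
$$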
In the models above, the prior distribution for $\mu$ (or $\mu_c$, $\mu_t$) is $N(0, 100^2)$, and the
prior distribution for $\sigma^2$ (or $\sigma_c^2$, $\sigma_t^2$) is an inverse gamma distribution with shape 1 and
rate 0.01.
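To make the prior parameterization concrete, here is a minimal sketch of drawing one mean parameter and one variance parameter from the stated priors. It rests on an assumption that should be checked against the lecture notes: "inverse gamma with shape 1 and rate 0.01" is read as meaning that the precision $1/\sigma^2$ follows a Gamma distribution with shape 1 and rate 0.01. The function name `draw_prior` and the constants are hypothetical.

```python
import random

PRIOR_MEAN = 0.0
PRIOR_SD = 100.0   # N(0, 100^2) prior on the mean parameter
IG_SHAPE = 1.0     # inverse gamma shape
IG_RATE = 0.01     # ASSUMPTION: rate of the Gamma distribution on 1/sigma^2


def draw_prior(rng):
    """Draw (mu, sigma2) from the priors under the assumption stated above."""
    mu = rng.gauss(PRIOR_MEAN, PRIOR_SD)
    # random.gammavariate(alpha, beta) treats beta as a scale, so scale = 1 / rate.
    precision = rng.gammavariate(IG_SHAPE, 1.0 / IG_RATE)
    sigma2 = 1.0 / precision
    return mu, sigma2


rng = random.Random(0)  # fixed seed so the draws are reproducible
mu, sigma2 = draw_prior(rng)
```

For models with separate control and treatment parameters, the same function could be called once per arm, assuming the priors are independent across arms.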
2. Suppose there is a single binary covariate, with unknown marginal distribution in the
super-population, $X_i \in \{f, m\}$, with $\Pr(X_i = f) = p$ unknown. Suppose that the
propensity score for each $X_i \in \{f, m\}$ is $1/2$. The potential outcomes satisfy
$Y(0) \mid (X_i = f) \sim N(0, \sigma_1^2)$, $Y(1) \mid (X_i = f) \sim N(\mu_1, \sigma_1^2)$;
$Y(0) \mid (X_i = m) \sim N(0, \sigma_2^2)$, $Y(1) \mid (X_i = m) \sim N(\mu_2, \sigma_2^2)$.
Under unconfoundedness, calculate the efficiency bounds $\mathbb{V}^{\text{eff}}_{\text{cond}}$ and $\mathbb{V}^{\text{eff}}_{\text{sp}}$ for estimating the
conditional (average) treatment effect.
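As a starting point, the general form of the two bounds is recalled here under the standard definitions; confirm it against the lecture notes before relying on it. The symbols $e(x)$ (propensity score), $\sigma_c^2(x)$ and $\sigma_t^2(x)$ (conditional variances of $Y(0)$ and $Y(1)$ given $X_i = x$), $\tau(x)$ (conditional average treatment effect), and $\tau_{\text{sp}}$ (super-population average treatment effect) are notation assumed for this sketch and do not appear in the problem statement.

```latex
\mathbb{V}^{\text{eff}}_{\text{cond}}
  = \mathbb{E}\left[ \frac{\sigma_t^2(X_i)}{e(X_i)} + \frac{\sigma_c^2(X_i)}{1 - e(X_i)} \right],
\qquad
\mathbb{V}^{\text{eff}}_{\text{sp}}
  = \mathbb{E}\left[ \frac{\sigma_t^2(X_i)}{e(X_i)} + \frac{\sigma_c^2(X_i)}{1 - e(X_i)}
      + \bigl(\tau(X_i) - \tau_{\text{sp}}\bigr)^2 \right].
```

Both expectations are taken over the super-population distribution of $X_i$, which in this problem places probability $p$ on $f$ and $1 - p$ on $m$.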