Causal Inference in Statistics and Biostatistics, Homework 4

This document provides instructions for Homework 4 on causal inference, which includes three problems: 1. Compare four models (perfectly correlated outcomes, independent outcomes, independent outcomes with covariates, two-part model) using the NSW data and report results. 2. Calculate efficiency bounds for estimating conditional average treatment effects given a binary covariate with unknown distribution and specified potential outcomes. 3. Show that an estimator that uses inverse propensity score weighting and outcome regression enjoys the "doubly robust" property, where it is consistent if either the propensity scores or outcome regressions are correctly specified.

Uploaded by

何沛予
Due date: 24:00 May 12, 2021


Submit to the TA's email: [email protected]
with the title "Name_StudentID_HW4" (姓名_学号_HW4), in PDF format

1. Use the NSW data to compare the following four models from Lecture 6.
The variables, in order from left to right, are: treatment indicator (1 if treated, 0 if not
treated), age, education, Black (1 if Black, 0 otherwise), Hispanic (1 if Hispanic, 0
otherwise), married (1 if married, 0 otherwise), no degree (1 if no degree, 0 otherwise),
RE74 (earnings in 1974), RE75 (earnings in 1975), and RE78 (earnings in 1978). The
last variable is the outcome; the other variables are pre-treatment covariates. Divide RE74,
RE75 and RE78 by 1000 before fitting the models.
Compare your results with Tables 8.6 and 8.7.
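
The column order described above can be turned into a small loader. The following is only a sketch: it assumes the data file is whitespace-delimited numeric text with no header row, and the column names and the function name `load_nsw` are hypothetical labels chosen here, not names from the assignment.

```python
import numpy as np

# Hypothetical column labels, following the left-to-right order in the problem statement.
COLUMNS = ["treat", "age", "education", "black", "hispanic",
           "married", "nodegree", "re74", "re75", "re78"]

def load_nsw(source):
    """Read the NSW table from a path or file-like object into a dict of arrays.

    Assumes whitespace-delimited numeric text without a header row.
    """
    raw = np.loadtxt(source, ndmin=2)
    if raw.shape[1] != len(COLUMNS):
        raise ValueError(f"expected {len(COLUMNS)} columns, got {raw.shape[1]}")
    data = {name: raw[:, j] for j, name in enumerate(COLUMNS)}
    # Rescale the three earnings variables to thousands, as the problem requires.
    for name in ("re74", "re75", "re78"):
        data[name] = data[name] / 1000.0
    return data

# Usage (the file name is a placeholder):
# data = load_nsw("nsw.txt")
# y, w = data["re78"], data["treat"]
```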

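Model 1: $Y_i(0)$ and $Y_i(1)$ are perfectly correlated,

$$
\begin{pmatrix} Y_i(0) \\ Y_i(1) \end{pmatrix} \Big|\, X_i, \theta \sim N\left( \begin{pmatrix} \mu_c \\ \mu_t \end{pmatrix}, \begin{pmatrix} \sigma^2 & \sigma^2 \\ \sigma^2 & \sigma^2 \end{pmatrix} \right).
$$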

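Model 2: $Y_i(0)$ and $Y_i(1)$ are independent,

$$
\begin{pmatrix} Y_i(0) \\ Y_i(1) \end{pmatrix} \Big|\, X_i, \theta \sim N\left( \begin{pmatrix} \mu_c \\ \mu_t \end{pmatrix}, \begin{pmatrix} \sigma_c^2 & 0 \\ 0 & \sigma_t^2 \end{pmatrix} \right).
$$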

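Model 3: $Y_i(0)$ and $Y_i(1)$ are independent with covariates,

$$
\begin{pmatrix} Y_i(0) \\ Y_i(1) \end{pmatrix} \Big|\, X_i, \theta \sim N\left( \begin{pmatrix} X_i \beta_c \\ X_i \beta_t \end{pmatrix}, \begin{pmatrix} \sigma_c^2 & 0 \\ 0 & \sigma_t^2 \end{pmatrix} \right).
$$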

Model 4: two-part model,

$$
\begin{aligned}
\Pr(Y_i(0) > 0 \mid X_i, W_i, \theta) &= \frac{\exp(X_i \gamma_c)}{1 + \exp(X_i \gamma_c)}, \\
\Pr(Y_i(1) > 0 \mid X_i, W_i, \theta) &= \frac{\exp(X_i \gamma_t)}{1 + \exp(X_i \gamma_t)}, \\
\ln(Y_i(0)) \mid Y_i(0) > 0, X_i, W_i, \theta &\sim N(X_i \beta_c, \sigma_c^2), \\
\ln(Y_i(1)) \mid Y_i(1) > 0, X_i, W_i, \theta &\sim N(X_i \beta_t, \sigma_t^2).
\end{aligned}
$$

In the models above, the prior distribution for $\mu$ (or $\mu_c$, $\mu_t$) is $N(0, 100^2)$, and the
prior distribution for $\sigma^2$ (or $\sigma_c^2$, $\sigma_t^2$) is an inverse-gamma distribution with shape 1 and
rate 0.01.
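
As an illustration of how these priors enter the computation, here is a minimal Gibbs sampler sketch for Model 2 only. It rests on several assumptions that should be checked against Lecture 6: "shape 1 and rate 0.01" is read as $1/\sigma^2 \sim \text{Gamma}(\text{shape}=1, \text{rate}=0.01)$, each arm is sampled separately because the two potential outcomes are independent, and the target shown is the super-population difference $\mu_t - \mu_c$ (imputing missing potential outcomes for a finite-sample effect is left out). The outcomes below are synthetic placeholders, not the NSW data, and the function name `gibbs_normal` is hypothetical.

```python
import numpy as np

def gibbs_normal(y, n_draws=2000, burn_in=500, prior_sd=100.0, a=1.0, b=0.01, seed=0):
    """Gibbs sampler for y_i ~ N(mu, sigma2) with
    mu ~ N(0, prior_sd^2) and 1/sigma2 ~ Gamma(shape=a, rate=b) (assumed reading)."""
    rng = np.random.default_rng(seed)
    n = y.size
    mu, sigma2 = y.mean(), y.var()          # starting values
    mu_draws = np.empty(n_draws)
    sigma2_draws = np.empty(n_draws)
    for it in range(burn_in + n_draws):
        # mu | sigma2, y: normal, combining the N(0, prior_sd^2) prior with the likelihood
        post_var = 1.0 / (n / sigma2 + 1.0 / prior_sd**2)
        post_mean = post_var * y.sum() / sigma2
        mu = rng.normal(post_mean, np.sqrt(post_var))
        # sigma2 | mu, y: inverse gamma, drawn as the reciprocal of a gamma variate
        shape = a + n / 2.0
        rate = b + 0.5 * np.sum((y - mu) ** 2)
        sigma2 = 1.0 / rng.gamma(shape, 1.0 / rate)
        if it >= burn_in:
            mu_draws[it - burn_in] = mu
            sigma2_draws[it - burn_in] = sigma2
    return mu_draws, sigma2_draws

# Synthetic outcomes for illustration only (NOT the NSW data).
rng = np.random.default_rng(1)
y_c = rng.normal(1.0, 2.0, size=300)   # stand-in for observed control outcomes
y_t = rng.normal(3.0, 2.0, size=300)   # stand-in for observed treated outcomes

mu_c_draws, _ = gibbs_normal(y_c, seed=2)
mu_t_draws, _ = gibbs_normal(y_t, seed=3)
tau_draws = mu_t_draws - mu_c_draws    # posterior draws of mu_t - mu_c
print(tau_draws.mean(), np.quantile(tau_draws, [0.025, 0.975]))
```

Because the prior on the mean is very diffuse, the posterior mean of `tau_draws` should sit close to the difference in sample means; Models 3 and 4 would need additional steps for the regression coefficients, whose priors are not specified here.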
2. Suppose there is a single binary covariate $X_i \in \{f, m\}$ whose marginal distribution in the
super-population is unknown, with $\Pr(X_i = f) = p$ unknown. Suppose that the
propensity score for each $X_i \in \{f, m\}$ is $1/2$. The potential outcomes satisfy

$$
\begin{aligned}
Y(0) \mid (X_i = f) &\sim N(0, \sigma_1^2), & Y(1) \mid (X_i = f) &\sim N(\mu_1, \sigma_1^2); \\
Y(0) \mid (X_i = m) &\sim N(0, \sigma_2^2), & Y(1) \mid (X_i = m) &\sim N(\mu_2, \sigma_2^2).
\end{aligned}
$$

Under unconfoundedness, calculate the efficiency bounds $\mathbb{V}^{\mathrm{eff}}_{\mathrm{cond}}$ and $\mathbb{V}^{\mathrm{eff}}_{\mathrm{sp}}$ for estimating the
conditional (average) treatment effect.
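
For orientation, the two bounds are usually stated in the following general form in the semiparametric efficiency literature. The notation $\sigma_c^2(x)$, $\sigma_t^2(x)$ (conditional variances of $Y(0)$ and $Y(1)$ given $X_i = x$), $\tau(x)$ (conditional average treatment effect) and $\tau_{\mathrm{sp}}$ (super-population average effect) is an assumption here and should be checked against the definitions used in the lectures.

```latex
\mathbb{V}^{\mathrm{eff}}_{\mathrm{cond}}
  = \mathbb{E}\left[ \frac{\sigma_c^2(X_i)}{1 - e(X_i)} + \frac{\sigma_t^2(X_i)}{e(X_i)} \right],
\qquad
\mathbb{V}^{\mathrm{eff}}_{\mathrm{sp}}
  = \mathbb{V}^{\mathrm{eff}}_{\mathrm{cond}}
  + \mathbb{E}\left[ \left( \tau(X_i) - \tau_{\mathrm{sp}} \right)^2 \right].
```

With a binary covariate, each expectation reduces to a two-term sum weighted by $p$ and $1 - p$.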

3. Consider the following estimator:

$$
\begin{aligned}
\hat{\mu}_t &= \frac{W_i}{e(X_i)} Y_i + \left( 1 - \frac{W_i}{e(X_i)} \right) \hat{m}_t(X_i), \\
\hat{\mu}_c &= \frac{1 - W_i}{1 - e(X_i)} Y_i + \left( 1 - \frac{1 - W_i}{1 - e(X_i)} \right) \hat{m}_c(X_i),
\end{aligned}
$$

where $\hat{m}_t(X_i)$ and $\hat{m}_c(X_i)$ are regression estimators of $Y_i$ in the treatment group and
the control group. Show that the average treatment effect estimator $\hat{\tau} = \hat{\mu}_t - \hat{\mu}_c$ enjoys
the doubly robust property, in that this estimator is consistent if either the propensity
score $e(X_i)$ is correct or the outcome regression $(\hat{m}_t(X_i), \hat{m}_c(X_i))$ is correct.
[Hint: You need to prove that $\mathbb{E}\hat{\mu}_t = \mathbb{E}Y(1)$ and $\mathbb{E}\hat{\mu}_c = \mathbb{E}Y(0)$ in the case where
$e(X_i)$ is correct or $(\hat{m}_t(X_i), \hat{m}_c(X_i))$ is correct.]
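
A small simulation can make the doubly robust property concrete before attempting the proof. The sketch below is not part of the assignment: the data-generating process is invented for illustration, the per-unit expressions above are averaged over the sample (an assumption about how the estimator is meant to be used), and the "wrong" models are deliberately crude (a constant propensity score of 0.5, outcome regressions identically zero).

```python
import numpy as np

def aipw_ate(y, w, e, m_t, m_c):
    """Sample average of the per-unit expressions for mu_t and mu_c, then their difference."""
    mu_t = np.mean(w / e * y + (1 - w / e) * m_t)
    mu_c = np.mean((1 - w) / (1 - e) * y + (1 - (1 - w) / (1 - e)) * m_c)
    return mu_t - mu_c

# Invented data-generating process (illustration only).
rng = np.random.default_rng(0)
n = 200_000
x = rng.normal(size=n)
e_true = 1 / (1 + np.exp(-x))            # true propensity score
w = rng.binomial(1, e_true)
y0 = x + rng.normal(size=n)              # true m_c(x) = x
y1 = 2 + 3 * x + rng.normal(size=n)      # true m_t(x) = 2 + 3x, so the true ATE is 2
y = np.where(w == 1, y1, y0)

e_wrong = np.full(n, 0.5)                # misspecified propensity score
m_wrong = np.zeros(n)                    # misspecified outcome regressions

# Correct propensity score, wrong outcome regressions.
tau_ps_only = aipw_ate(y, w, e_true, m_wrong, m_wrong)
# Wrong propensity score, correct outcome regressions.
tau_reg_only = aipw_ate(y, w, e_wrong, 2 + 3 * x, x)
# Both wrong: no consistency guarantee.
tau_both_wrong = aipw_ate(y, w, e_wrong, m_wrong, m_wrong)

print(tau_ps_only, tau_reg_only, tau_both_wrong)
```

In this setup the first two estimates should land near the true effect of 2, while the third is visibly biased, which matches the two cases the hint asks you to handle.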
