0% found this document useful (0 votes)

19 views10 pages

Causal Inference in Python

The document discusses the framework of Rubin's potential outcome model and introduces the Causalinference Python package for causal analysis, highlighting its features such as propensity score estimation and treatment effect estimation. It provides a step-by-step guide on using the package, including data simulation, statistical summaries, and methods for improving covariate balance. The document emphasizes the importance of understanding treatment effects and the role of propensity scores in causal analysis.

Uploaded by

deeplearning.nsz

We take content rights seriously. If you suspect this is your content, claim it here.

Available Formats

Download as PDF, TXT or read online on Scribd

0% found this document useful (0 votes)

19 views10 pages

Causal Inference in Python

Uploaded by

deeplearning.nsz

We take content rights seriously. If you suspect this is your content, claim it here.

Available Formats

Download as PDF, TXT or read online on Scribd

You are on page 1/ 10

1 Setting and Notation

As is standard in the literature, we work within the framework of Rubin’s potential outcome
model (Rubin, 1974). Let Y (0) denote the potential outcome of a subject in the absence of
treatment, and let Y (1) denote the unit’s potential outcome when it is treated. Let D denote
treatment status, with D = 1 indicating treatment and D = 0 indicating control, and let X be a K-
column vector of covariates or individual characteristics. For unit i, i = 1, 2, . . . , N, the observed
outcome can be written as Y i = (1 − Di )Yi (0) + Di Yi (1) . The set of observables (Y , D , X
i i i

), i = 1, 2, . . . , N, forms the basic input data set for Causalinference . Causalinference is

appropriate for settings in which treatment can be said to be strongly ignorable, as defined in
Rosenbaum and Rubin (1983). That is, for all x in the support of X, we have

(i) Unconfoundedness: D is independent of Y(0) Y(1) conditional on X=x,

(ii) Overlap: c < P(D = 1|X = x) < 1 − c, for some c > 0.

In the following, we illustrate the typical flow of a causal analysis using the tools of
Causalinference and a simulated data set. In simulating the data, we specified a constant
treatment effect of 10 for simplicity, and incorporated systematic overlap issues and
nonlinearities to highlight a number of tools in the package. We focus mostly on illustrating the
use of Causalinference; for details on methodology please refer to Imbens and Rubin (2015).

2. Causalinference
Causalinference is a Python package that provides various statistical methods for causal
analysis. It is a simple package that was used for basic causal analysis learning. The main
features of these packages include:

Propensity score estimation and subclassification

Improvement of covariate balance through trimming
Estimation of treatment effects
Assessment of overlap in covariate distributions We can find the explanation on their web
page for a longer explanation regarding each term.

Let’s try out the Causalinference package. For starters, we need to install the package.

In [1]: pip install causalinference

Defaulting to user installation because normal site-packages is not writeable

Collecting causalinference
Downloading CausalInference-0.1.3-py3-none-any.whl (51 kB)
-------------------------------------- 51.1/51.1 kB 869.9 kB/s eta 0:00:00
Installing collected packages: causalinference
Successfully installed causalinference-0.1.3
Note: you may need to restart the kernel to use updated packages.
In [5]: print(dir(causal))

['class', 'delattr', 'dict', 'dir', 'doc', 'eq', '__format_

_', '__ge__', '__getattribute__', '__gt__', '__hash__', '__init__', '__init_subclass_
_', '__le__', '__lt__', '__module__', '__ne__', '__new__', '__reduce__', '__reduce_ex
__', '__repr__', '__setattr__', '__sizeof__', '__str__', '__subclasshook__', '__weakr
ef__', '_post_pscore_init', 'blocks', 'cutoff', 'est_propensity', 'est_propensity_s',
'est_via_blocking', 'est_via_matching', 'est_via_ols', 'est_via_weighting', 'estimate
s', 'old_data', 'propensity', 'raw_data', 'reset', 'strata', 'stratify', 'stratify_
s', 'summary_stats', 'trim', 'trim_s']

After the installation finishes, we will try to implement a causal model for causal analysis. We
would use the random data that came from the causalinference package.

In [66]: import faker

faker = faker.Faker()
faker.seed_instance(7121)
from causalinference import CausalModel
from causalinference.utils import random_data
#Y is the outcome, D is treatment status, and X is the independent variable
Y, D, X = random_data()
causal = CausalModel(Y, D, X)

The CausalModel class would analyze the data. We would need to do a few more steps to
acquire important information from the model.

First, let’s get the statistical summary.

In [67]: print(causal.summary_stats)

Summary Statistics

Controls (N_c=2503) Treated (N_t=2497)

Variable Mean S.d. Mean S.d. Raw-diff
--------------------------------------------------------------------------------
Y -0.926 1.774 4.986 3.014 5.912

Controls (N_c=2503) Treated (N_t=2497)

Variable Mean S.d. Mean S.d. Nor-diff
--------------------------------------------------------------------------------
X0 -0.337 0.935 0.330 0.949 0.708
X1 -0.320 0.974 0.313 0.935 0.664
X2 -0.327 0.936 0.356 0.946 0.726

In [68]: causal.summary_stats.keys()

dict_keys(['N', 'K', 'N_c', 'N_t', 'Y_c_mean', 'Y_t_mean', 'Y_c_sd', 'Y_t_sd', 'rdif

Out[68]:
f', 'X_c_mean', 'X_t_mean', 'X_c_sd', 'X_t_sd', 'ndiff'])

In [69]: causal.summary_stats['X_t_mean']

array([0.32965893, 0.3130337 , 0.35620761])

Out[69]:

In [70]: causal.summary_stats['ndiff']
array([0.70765718, 0.66358536, 0.7261009 ])
Out[70]:

In [71]: causal.summary_stats['Y_t_mean']

4.986076842982941
Out[71]:

Here rdiff refers to the difference in average observed outcomes between treatment and
control groups. ndiff , on the other hand, refers to the normalized differences in average
covariates, defined as

¯
¯¯ ¯
¯¯
x k,t −x k,t

2 2
s +s
k,t k,c
√
2

where ¯x
¯¯
k,t
and sk,t are the sample mean and sample standard deviation of the kth covariate of
the treatment group, and ¯x
¯¯
k,c and sk,c are the analogous statistics for the control group.

The normalized differences in average covariates provide a way to measure the covariate
balance between the treatment and the control groups. Unlike the t-statistic, its absolute
magnitude does not increase (in expectation) as the sample size increases.

By using the summary_stats attribute, we would acquire all the basic information of the
dataset.

The main part of causal analysis is acquiring the treatment effect information. The simplest one
to do is by using the Ordinary Least Square method.

3 Least Squares Estimation

One of the simplest treatment effect estimators is the ordinary least squares (OLS) estimator.
Causalinference provides several common regression specifications. By default, the method est
via ols will run the following regression:

¯
¯¯¯¯ ¯
¯¯¯¯
Yi = α + β(Xi − X ) + δYi (Xi − X ) + ϵi

To inspect any treatment effect estimates produced, we can simply invoke print on the attribute
estimates, as in below:

In [72]: causal.est_via_ols()
print(causal.estimates)
Treatment Effect Estimates: OLS

Est. S.e. z P>|z| [95% Conf. int.]

--------------------------------------------------------------------------------
ATE 2.934 0.034 85.182 0.000 2.866 3.001
ATC 1.958 0.040 49.551 0.000 1.881 2.035
ATT 3.911 0.040 97.372 0.000 3.833 3.990

C:\Users\moham\AppData\Roaming\Python\Python310\site-packages\causalinference\estimat
ors\ols.py:21: FutureWarning: `rcond` parameter will change to the default of machine
precision times ``max(M, N)`` where M and N are the input matrix dimensions.
To use the future default and silence this warning we advise to pass `rcond=None`, to
keep using the old, explicitly pass `rcond=-1`.
olscoef = np.linalg.lstsq(Z, Y)[0]

ATE, ATC, and ATT stand for Average Treatment Effect, Average Treatment Effect for Control
and Average Treatment Effect for Treated, respectively. Using this information, we could assess
whether the treatment has an effect compared to the control.

Including interaction terms between the treatment indicator D and covariates X implies that
treatment effects can differ across individuals. In some instances we may want to assume a
constant treatment effect, and only run

¯
¯¯¯¯ ¯
¯¯¯¯
Yi = α + β(Xi − X ) + δYi (Xi − X ) + ϵi

This can be achieved by supplying a value of 1 in est via ols to the optional parameter adj (its
default value is 2). To compute the raw difference in average outcomes between treatment and
control groups, we can set adj=0 . In this example, the least squares estimates are radically
different from the true treatment effect of 10. This is the result of the nonlinearity and non-
overlap issues intentionally introduced into the data simulation process. As we shall see, several
other tools exist in Causalinference that can better deal with a lack of overlap and that will
allow us to obtain estimates that are less sensitive to functional form assumptions.

4 Propensity Score Estimation

The probability of getting treatment conditional on the covariates, p(X ) = p(D = 1|X ), also
i i i

known as the propensity score, plays a central role in much of what follows. Two methods, est
propensity and est propensity s, are provided for propensity score estimation. Both involve
running a logistic regression of the treatment indicator D on functions of the covariates.
est_propensity allows the user to specify the covariates to include linearly and/or
quadratically, while est_propensity_s will make this choice automatically based on a
sequence of likelihood ratio tests. In the following, we run est_propensity_s and display the
estimation results. In this example, the specification selection algorithm decided to include both
covariates and all the interaction and quadratic terms.

Using the propensity score method, we could also get information regarding the probability of
treatment conditional on the independent variables.
In [74]: causal.est_propensity_s()
print(causal.propensity)

Estimated Parameters of Propensity Score

Coef. S.e. z P>|z| [95% Conf. int.]

--------------------------------------------------------------------------------
Intercept 0.044 0.042 1.057 0.291 -0.038 0.127
X2 1.017 0.042 24.311 0.000 0.935 1.099
X0 1.032 0.042 24.712 0.000 0.950 1.114
X1 0.986 0.041 23.890 0.000 0.905 1.067
X1*X1 -0.061 0.028 -2.153 0.031 -0.116 -0.005

Using the propensity score method, we could assess the probability of the treatment given the
independent variables.

There are still many methods you could explore and learn from. I suggest you visit the
causalinference web page and learn further.

The propensity attribute is again another dictionary-like container of results. The dictionary keys
of propensity can be found by running:

In [75]: causal.propensity.keys()

dict_keys(['lin', 'qua', 'coef', 'loglike', 'fitted', 'se'])

Out[75]:

In [76]: causal.propensity['lin']

[2, 0, 1]
Out[76]:

In [77]: causal.propensity['qua']

[(1, 1)]
Out[77]:

In [78]: causal.propensity['coef']

array([ 0.04446577, 1.0166564 , 1.03201622, 0.98577117, -0.06090652])

Out[78]:

5 Improving Covariate Balance

When there is indication of covariate imbalance, we may wish to construct a sample where the
treatment and control groups are more similar than the original full sample. One way of doing
so is by dropping units with extreme values of propensity score. For these subjects, their
covariate values are such that the probability of being in the treatment (or control) group is so
overwhelmingly high that we cannot reliably find comparable units in the opposite group. We
may wish to forego estimating treatment effects for such units since nothing much can be
credibly said about them.
A good rule-of-thumb is to drop units whose estimated propensity score is less than α = 0.1 or
greater than 1−α = 0.9. By default, once the propensity score has been estimated by running
either est propensity or est propensity s, a value of 0.1 will be set for the attribute cutoff:

In [79]: causal.cutoff

0.1
Out[79]:

Calling causal.trim() at this point will drop every unit that has propensity score outside of the [α,
1 − α] interval. Alternatively, a procedure exists that will estimate the optimal cutoff that
minimizes the asymptotic sampling variance of the trimmed sample. The method trim s will
perform this calculation, set the cutoff to the optimal α, and then invoke trim to construct the
subsample. For our example, the optimal α was estimated to be slightly less than 0.1:

In [80]: causal.trim_s()

In [81]: causal.cutoff

0.10095500234207272
Out[81]:

The complexity of this cutoff selection algorithm is only O(N log N), so in practice there is very
little reason to not employ it.

6 Stratifying the Sample

With the propensity score estimated, one may wish to stratify the sample into blocks that have
units that are more similar in terms of their covariates. This makes the treatment and control
groups within each propensity bin more comparable, and therefore treatment effect estimates
more credible. Causalinference provides two methods for subclassification based on
propensity score. The first, stratify, splits the sample based on what is specified in the attribute
blocks. The default value of blocks is set to 5, which means that stratify will split the sample into
5 equal-sized bins. In contrast, the second method, stratify_s, will use a data-driven procedure
for selecting both the number of blocks and their boundaries, with the expectation that the
number of blocks should increase with the sample size. Operationally this method is a divide-
and-conquer algorithm that recursively divides the sample into two until there is no significant
advantage of doing so. This algorithm also runs in O(N log N) time, so costs relatively little to
use. To inspect the results of the stratification, we can invoke print on the attribute strata to
display some summary statistics, as follows:

In [82]: causal.stratify_s()

In [83]: print(causal.strata)
Stratification Summary

Propensity Score Sample Size Ave. Propensity Outcome

Stratum Min. Max. Controls Treated Controls Treated Raw-diff
--------------------------------------------------------------------------------
1 0.101 0.152 218 27 0.126 0.127 1.151
2 0.152 0.197 201 42 0.175 0.178 1.430
3 0.197 0.297 364 123 0.247 0.246 1.992
4 0.297 0.392 311 176 0.342 0.346 2.417
5 0.393 0.506 267 220 0.446 0.450 2.660
6 0.506 0.548 121 123 0.527 0.526 3.052
7 0.548 0.602 94 149 0.574 0.576 3.136
8 0.602 0.653 106 138 0.627 0.629 3.445
9 0.653 0.706 76 167 0.681 0.679 3.746
10 0.706 0.802 110 377 0.753 0.755 4.001
11 0.802 0.850 44 200 0.824 0.826 4.489
12 0.851 0.899 34 209 0.871 0.876 4.672

Under the hood, the attribute strata is actually a list-like object that contains, as each of its
elements, a full instance of the class CausalModel, with the input data being those that
correspond to the units that are in the propensity bin. We can thus, for example, access each
stratum and inspect its summary_stats attribute, or as the following illustrates, loop through
strata and estimate within-bin treatment effects using least squares.

In [97]: for stratum in causal.strata:

stratum.est_via_ols(adj=2)

In [100… [stratum.estimates['ols']['att']for stratum in causal.strata]

[1.1010525059277556,
Out[100]:
1.3936541440372463,
2.004643746290025,
2.4002774533086817,
2.662716013620451,
3.0475818205042002,
3.122994358139151,
3.424260986897368,
3.7506272041228654,
3.9854869677920286,
4.443325713714909,
4.67815605365512]

In [102… stratum.estimates['ols']['att']

4.67815605365512
Out[102]:

Note that these estimates are much more stable and closer to the true value of 10 than the
within-bin raw differences in average outcomes that were reported in the stratification summary
table, highlighting the virtue of further controlling for covariates even within blocks. Taking the
sample-weighted average of the above within-bin least squares estimates results in a
propensity score matching estimator that is commonly known as the subclassification estimator
or blocking estimator. However, instead of manually looping through the strata attribute,
estimating within-bin treatment effects, and then averaging appropriately to arrive at an overall
estimate, we can also simply call est via blocking, which will perform these operations and
collect the results in the attribute estimates. We will report these estimates in the next section
along with estimates obtained from other, alternative estimators.

7 Treatment Effect Estimation

In addition to least squares and the blocking estimator described in the last section,
Causalinference provides two alternative treatment effect estimators. The first is the nearest
neighborhood matching estimator of Abadie and Imbens (2006). Instead of relying on the
propensity score, this estimator pairs treatment and control units by matching directly on the
covariate vectors themselves. More specifically, each unit i in the sample is matched with a unit
m(i) in the opposite group, where

m(i) = argmin ∥Xi − Xj ∥

j:Dj ≠Di

and ∥X i − Xj ∥ is some measure of distance between the covariate vectors X and X . The
j i

method est_via_matching implements this estimator, as well as several extensions that can be
invoked through optional arguments.

The last estimator is a version of the Horvitz-Thompson weighting estimator, modified to

further adjust for covariates. Mechanically, this involves running the following weight least
squares regression:

′
Yi = α + βDi + γ Xi + ϵi

where the weight for unit i is 1/p^(X) if i is in the treatment group, and 1/(1 − p^(X)) if i is in
the control group. This estimator is also sometimes called the doubly-robust estimator,
referring to the fact that this estimator is consistent if either the specification of the propensity
score is correct, or the specification of the regression function is correct. We can invoke it by
calling est via weighting. Note that under this specification the treatment effect does not differ
across units, so the ATC and the ATT are both equal to the overall ATE

In the following we invoke each of the four estimators (including least squares, since the input
data has changed now that the sample has been trimmed), and print out the resulting
estimates.

In [103… causal.est_via_ols()

In [104… causal.est_via_weighting()
C:\Users\moham\AppData\Roaming\Python\Python310\site-packages\causalinference\estimat
ors\weighting.py:23: FutureWarning: `rcond` parameter will change to the default of m
achine precision times ``max(M, N)`` where M and N are the input matrix dimensions.
To use the future default and silence this warning we advise to pass `rcond=None`, to
keep using the old, explicitly pass `rcond=-1`.
wlscoef = np.linalg.lstsq(Z_w, Y_w)[0]

In [105… causal.est_via_blocking()

In [106… causal.est_via_matching(bias_adj=True)

C:\Users\moham\AppData\Roaming\Python\Python310\site-packages\causalinference\estimat
ors\matching.py:100: FutureWarning: `rcond` parameter will change to the default of m
achine precision times ``max(M, N)`` where M and N are the input matrix dimensions.
To use the future default and silence this warning we advise to pass `rcond=None`, to
keep using the old, explicitly pass `rcond=-1`.
return np.linalg.lstsq(X, Y)[0][1:] # don't need intercept coef

In [107… print(causal.estimates)

Treatment Effect Estimates: OLS

Est. S.e. z P>|z| [95% Conf. int.]

--------------------------------------------------------------------------------
ATE 2.938 0.036 82.237 0.000 2.868 3.008
ATC 2.463 0.039 62.482 0.000 2.385 2.540
ATT 3.412 0.039 86.655 0.000 3.334 3.489

Treatment Effect Estimates: Weighting

Est. S.e. z P>|z| [95% Conf. int.]

--------------------------------------------------------------------------------
ATE 2.932 0.041 71.746 0.000 2.852 3.012

Treatment Effect Estimates: Blocking

Est. S.e. z P>|z| [95% Conf. int.]

--------------------------------------------------------------------------------
ATE 2.932 0.037 79.554 0.000 2.860 3.004
ATC 2.465 0.042 58.037 0.000 2.382 2.549
ATT 3.398 0.040 84.079 0.000 3.319 3.477

Treatment Effect Estimates: Matching

Est. S.e. z P>|z| [95% Conf. int.]

--------------------------------------------------------------------------------
ATE 2.911 0.069 42.171 0.000 2.775 3.046
ATC 2.427 0.082 29.428 0.000 2.265 2.588
ATT 3.394 0.081 42.106 0.000 3.236 3.552

As we can see above, despite the trimming the least squares estimates are still severely biased,
as is the weighting estimator (since neither the propensity score or the regression function is
correctly specified). The blocking and matching estimators, on the other hand, are less sensitive
to specification assumptions, and thus result in estimates that are closer to the true average
treatment effects.
References
Abadie, A., & Imbens, G. (2006). Large sample properties of matching estimators for
average treatment effects. Econometrica, 74 , 235-267.
Crump, R., Hotz, V. J., Imbens, G., & Mitnik, O. (2009). Dealing with limited overlap in
estimation of average treatment effects. Biometrika, 96 , 187-199.
Imbens, G. W., & Rubin, D. B. (2015). Causal inference in statistics, social, and biomedical
sciences: An introduction. Cambridge University Press.
Rosenbaum, P. R., & Rubin, D. B. (1983). The central role of the propensity score in
observational studies for causal effects. Biometrika, 70 , 41-55.
Rubin, D. B. (1974). Estimating Causal Effects of Treatments in Randomized and
Nonrandomized Studies. Journal of Educational Psychology, 66 , 688-701

In [ ]:

CausalML Book 2022
No ratings yet
CausalML Book 2022
500 pages
2025 - Applied Causal Inference Powered by ML and AI
No ratings yet
2025 - Applied Causal Inference Powered by ML and AI
518 pages
1 Optimization & Anti-Optimization of Structures Under Uncertainty - Isaac Elishakoff PDF
No ratings yet
1 Optimization & Anti-Optimization of Structures Under Uncertainty - Isaac Elishakoff PDF
425 pages
Two Way Fixed Effect Models
No ratings yet
Two Way Fixed Effect Models
116 pages
Statistical Causal Inferences and Their Applications in Public Health Research-Springer International Publishing (2016)
100% (2)
Statistical Causal Inferences and Their Applications in Public Health Research-Springer International Publishing (2016)
324 pages
Te
No ratings yet
Te
167 pages
Casual Tutorial Slides
No ratings yet
Casual Tutorial Slides
254 pages
Causal Inference - A Statistical Learning Approach
No ratings yet
Causal Inference - A Statistical Learning Approach
247 pages
Buhlman 2020 PPT
No ratings yet
Buhlman 2020 PPT
88 pages
AAAI-2023 教程用于因果推断的机器学习
No ratings yet
AAAI-2023 教程用于因果推断的机器学习
145 pages
w4 Propensity Scores Intro Perraillon 0
No ratings yet
w4 Propensity Scores Intro Perraillon 0
45 pages
Statistical Approaches To Causal Analysis, 1st Edition EPUB DOCX PDF Download
100% (11)
Statistical Approaches To Causal Analysis, 1st Edition EPUB DOCX PDF Download
14 pages
CIML2023
No ratings yet
CIML2023
87 pages
An Introduction To Causal Modelling: Gauranga Kumar Baishya and M. R. Srinivasan Chennai Mathematical Institute (CMI)
No ratings yet
An Introduction To Causal Modelling: Gauranga Kumar Baishya and M. R. Srinivasan Chennai Mathematical Institute (CMI)
52 pages
Causality
No ratings yet
Causality
89 pages
Bayesian Causal Tutorial Ohiostate June2019
No ratings yet
Bayesian Causal Tutorial Ohiostate June2019
56 pages
Causes of Effects Via A Bayesian Model S
No ratings yet
Causes of Effects Via A Bayesian Model S
16 pages
RPP Akuntansi Dasar Dalam Bahasa Inggris
No ratings yet
RPP Akuntansi Dasar Dalam Bahasa Inggris
18 pages
Math 6 - Q1 - W3.docx Edited
No ratings yet
Math 6 - Q1 - W3.docx Edited
11 pages
Lec 3EFCFull
No ratings yet
Lec 3EFCFull
50 pages
CausalML Book
No ratings yet
CausalML Book
496 pages
Diff in Diff
No ratings yet
Diff in Diff
23 pages
A2 Causality
No ratings yet
A2 Causality
28 pages
Соц Эффект По Закону2
No ratings yet
Соц Эффект По Закону2
29 pages
05 - The Unreasonable Effectiveness of Linear Regression - Causal Inference For The Brave and True
No ratings yet
05 - The Unreasonable Effectiveness of Linear Regression - Causal Inference For The Brave and True
10 pages
Chicago19 Masten
No ratings yet
Chicago19 Masten
35 pages
Generalization Bounds and Representation Learning For Estimation of Potential Outcomes and Causal Effects
No ratings yet
Generalization Bounds and Representation Learning For Estimation of Potential Outcomes and Causal Effects
50 pages
PSM Inès
No ratings yet
PSM Inès
71 pages
Causal-Inference Emsley
No ratings yet
Causal-Inference Emsley
54 pages
Lunceford Davidian 2004
No ratings yet
Lunceford Davidian 2004
24 pages
The International Journal of Biostatistics: An Introduction To Causal Inference
No ratings yet
The International Journal of Biostatistics: An Introduction To Causal Inference
62 pages
Mindful Sport Performance Enhancement: Mental Training Athletes Coaches
No ratings yet
Mindful Sport Performance Enhancement: Mental Training Athletes Coaches
321 pages
Athey
No ratings yet
Athey
31 pages
14 - 382 - Pset - 5 (1) - Merged
No ratings yet
14 - 382 - Pset - 5 (1) - Merged
9 pages
Slides 1 Match7!30!07
No ratings yet
Slides 1 Match7!30!07
40 pages
1-Introduction To Applied Econometrics
No ratings yet
1-Introduction To Applied Econometrics
33 pages
Introduction To Treatment Effects Handout
No ratings yet
Introduction To Treatment Effects Handout
18 pages
Causal Notes
No ratings yet
Causal Notes
17 pages
Causal Inference in Statistics: An Overview
100% (1)
Causal Inference in Statistics: An Overview
51 pages
Causal Inference: 1.1 Two Types of Causal Questions
No ratings yet
Causal Inference: 1.1 Two Types of Causal Questions
19 pages
Module 2.2 Randomized Assignment
No ratings yet
Module 2.2 Randomized Assignment
10 pages
Causal Review
No ratings yet
Causal Review
25 pages
Asz 048
No ratings yet
Asz 048
14 pages
Lecture Notes
No ratings yet
Lecture Notes
10 pages
M Api
No ratings yet
M Api
17 pages
Best Linear Predictor
No ratings yet
Best Linear Predictor
15 pages
Causality (Slides 2009 Video Link) - Philip Dawid
No ratings yet
Causality (Slides 2009 Video Link) - Philip Dawid
34 pages
Package CausalImpact - CausalImpact
No ratings yet
Package CausalImpact - CausalImpact
8 pages
Annurev Statistics 033121 114601
No ratings yet
Annurev Statistics 033121 114601
30 pages
Lecture 21
No ratings yet
Lecture 21
8 pages
Formulating Causal Questions and Principled Statistical Answers
No ratings yet
Formulating Causal Questions and Principled Statistical Answers
26 pages
Causal Inference and Stable Learning: Peng Cui Tong Zhang
No ratings yet
Causal Inference and Stable Learning: Peng Cui Tong Zhang
95 pages
Intro Stat
No ratings yet
Intro Stat
17 pages
Hrs RDD Slides F
No ratings yet
Hrs RDD Slides F
40 pages
Causal Inference: 1.1 Two Types of Causal Questions
No ratings yet
Causal Inference: 1.1 Two Types of Causal Questions
8 pages
Causal Inference in The Social Sciences
No ratings yet
Causal Inference in The Social Sciences
30 pages
Moderation Meditation PDF
No ratings yet
Moderation Meditation PDF
11 pages
Slides 1
No ratings yet
Slides 1
8 pages
Experiments and Causality
No ratings yet
Experiments and Causality
21 pages
Properties of Fibres
No ratings yet
Properties of Fibres
7 pages
Class 10 Multilevel Models
No ratings yet
Class 10 Multilevel Models
42 pages
Causal Inference in Statistics: An Overview
100% (2)
Causal Inference in Statistics: An Overview
51 pages
Diseases of Potato
No ratings yet
Diseases of Potato
8 pages
Literature Per Urat PDF
No ratings yet
Literature Per Urat PDF
6 pages
How To Create A Local Mirror of The Latest Update For Red Hat Enterprise Linux 5, 6, 7, 8 Without Using Satellite Server?
No ratings yet
How To Create A Local Mirror of The Latest Update For Red Hat Enterprise Linux 5, 6, 7, 8 Without Using Satellite Server?
28 pages
Neuroscience: Science of The Brain
90% (80)
Neuroscience: Science of The Brain
60 pages
Practical Research 2: Quarter 1 - Module 3
No ratings yet
Practical Research 2: Quarter 1 - Module 3
11 pages
Online Guest Room Booking System
No ratings yet
Online Guest Room Booking System
19 pages
RAMET Main Brochure
No ratings yet
RAMET Main Brochure
60 pages
How To Set Up A Firewall
No ratings yet
How To Set Up A Firewall
2 pages
Teenager Problems
No ratings yet
Teenager Problems
4 pages
Theory of Elasticity and Plasticity. (CVL 622) M.Tech. CE Term-2 (2017-18)
No ratings yet
Theory of Elasticity and Plasticity. (CVL 622) M.Tech. CE Term-2 (2017-18)
2 pages
Installation Manual
No ratings yet
Installation Manual
41 pages
Discursive - Boxing Passage
No ratings yet
Discursive - Boxing Passage
4 pages
Carens Brochure 2025 Mobile
No ratings yet
Carens Brochure 2025 Mobile
21 pages
CEE 209 Course Objective and Outcome Form - October 2018
No ratings yet
CEE 209 Course Objective and Outcome Form - October 2018
2 pages
Construction Services PDF
No ratings yet
Construction Services PDF
2 pages
The Role of Educational Psychology in Today'S Technology Enriched Classroom
No ratings yet
The Role of Educational Psychology in Today'S Technology Enriched Classroom
7 pages
Name ... : Grade 5: Al Andalus International School
No ratings yet
Name ... : Grade 5: Al Andalus International School
26 pages
E Portfolio BSED 2
No ratings yet
E Portfolio BSED 2
30 pages
Brandon Jones 2017
No ratings yet
Brandon Jones 2017
6 pages
LCF Paper High Strength Steel-2024
No ratings yet
LCF Paper High Strength Steel-2024
12 pages
Coseismic Displacement and Recurrence Interval of The 1973 Ragay Gulf Earthquake, Southern Luzon, Philippines
No ratings yet
Coseismic Displacement and Recurrence Interval of The 1973 Ragay Gulf Earthquake, Southern Luzon, Philippines
9 pages
04.DNS Protection Advanced Profiles
No ratings yet
04.DNS Protection Advanced Profiles
3 pages
24 Generic Toolbar Component 169163
No ratings yet
24 Generic Toolbar Component 169163
8 pages
Contoh Soal
No ratings yet
Contoh Soal
6 pages
Hurl 170425
No ratings yet
Hurl 170425
9 pages
Co-Clustering: Models, Algorithms and Applications
From Everand
Co-Clustering: Models, Algorithms and Applications
Gérard Govaert
No ratings yet
Learn Statistics Fast: A Simplified Detailed Version for Students
From Everand
Learn Statistics Fast: A Simplified Detailed Version for Students
Hesbon R.M
No ratings yet
MCS-011: Problem Solving and Programming
From Everand
MCS-011: Problem Solving and Programming
Dr. DK Sukhani
No ratings yet

Causal Inference in Python

Uploaded by

Causal Inference in Python

Uploaded by

1 Setting and Notation

), i = 1, 2, . . . , N, forms the basic input data set for Causalinference . Causalinference is

(i) Unconfoundedness: D is independent of Y(0) Y(1) conditional on X=x,

Propensity score estimation and subclassification

In [1]: pip install causalinference

Defaulting to user installation because normal site-packages is not writeable

['__class__', '__delattr__', '__dict__', '__dir__', '__doc__', '__eq__', '__format_

In [66]: import faker

First, let’s get the statistical summary.

Controls (N_c=2503) Treated (N_t=2497)

Controls (N_c=2503) Treated (N_t=2497)

dict_keys(['N', 'K', 'N_c', 'N_t', 'Y_c_mean', 'Y_t_mean', 'Y_c_sd', 'Y_t_sd', 'rdif

array([0.32965893, 0.3130337 , 0.35620761])

3 Least Squares Estimation

Est. S.e. z P>|z| [95% Conf. int.]

4 Propensity Score Estimation

Estimated Parameters of Propensity Score

Coef. S.e. z P>|z| [95% Conf. int.]

dict_keys(['lin', 'qua', 'coef', 'loglike', 'fitted', 'se'])

array([ 0.04446577, 1.0166564 , 1.03201622, 0.98577117, -0.06090652])

5 Improving Covariate Balance

6 Stratifying the Sample

Propensity Score Sample Size Ave. Propensity Outcome

In [97]: for stratum in causal.strata:

In [100… [stratum.estimates['ols']['att']for stratum in causal.strata]

7 Treatment Effect Estimation

m(i) = argmin ∥Xi − Xj ∥

The last estimator is a version of the Horvitz-Thompson weighting estimator, modified to

Treatment Effect Estimates: OLS

Est. S.e. z P>|z| [95% Conf. int.]

Treatment Effect Estimates: Weighting

Est. S.e. z P>|z| [95% Conf. int.]

Treatment Effect Estimates: Blocking

Est. S.e. z P>|z| [95% Conf. int.]

Treatment Effect Estimates: Matching

Est. S.e. z P>|z| [95% Conf. int.]

You might also like

['class', 'delattr', 'dict', 'dir', 'doc', 'eq', '__format_