Bühlmann 2020 PPT
Peter Bühlmann
in genomics:
if we were to make an intervention at a single gene, what would be its effect on a phenotype of interest?
e.g.:
“Pritzker Consortium on Early Childhood Development
identifies when and how child intervention programs can be
most influential”
Genomics
p = 5360 genes
phenotype of interest: Y = expression of first gene
“covariates” X = gene expressions from all other genes
and then
phenotype of interest: Y = expression of second gene
“covariates” X = gene expressions from all other genes
and so on
infer/predict the effects of a single gene knock-down on all
other genes
⇝ consider the framework of intervention calculus when the DAG is absent (IDA)
[Figure: true positives vs. false positives for predicting intervention effects with IDA, Lasso, Elastic-net, and random guessing; n = 63]
A bit more specifically
• univariate response Y
• p-dimensional covariate X
question: what is the effect of setting the jth component of X to a certain value x:
do(X^{(j)} = x)
[DAG figure over the nodes X^{(2)}, X^{(3)}, X^{(4)} and Y]
for simplicity: just consider DAGs (Directed Acyclic Graphs)
[with hidden variables (Spirtes, Glymour & Scheines, 1993; Colombo et al., 2012) it becomes much more complicated and has not been validated with real data]
random variables are represented as nodes in the DAG
[Figure: the observational DAG over X^{(1)}, X^{(2)}, Y (left) and the intervention DAG for do(X^{(2)} = x) (right), where the edges into X^{(2)} are removed]
observational factorization:
P(Y, X^{(1)}, X^{(2)}, X^{(3)}, X^{(4)}) = P(Y | X^{(1)}, X^{(3)}) × P(X^{(1)} | X^{(2)}) × P(X^{(2)} | X^{(3)}, X^{(4)}) × P(X^{(3)}) × P(X^{(4)})

truncated factorization for do(X^{(2)} = x):
P(Y, X^{(1)}, X^{(3)}, X^{(4)} | do(X^{(2)} = x)) = P(Y | X^{(1)}, X^{(3)}) × P(X^{(1)} | X^{(2)} = x) × P(X^{(3)}) × P(X^{(4)})
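In general, intervening on X^{(j)} removes the factor for X^{(j)} from the DAG factorization and fixes X^{(j)} = x in the remaining factors; the display above is an instance of Pearl's truncated factorization (g-formula):

```latex
% Pearl's truncated factorization (g-formula) for do(X^{(j)} = x):
% every factor keeps its parental conditioning, except that the factor
% for X^{(j)} is dropped and X^{(j)} is fixed at x in the remaining ones.
P\bigl(x^{(1)},\dots,x^{(p)} \mid \mathrm{do}(X^{(j)} = x)\bigr)
  \;=\; \prod_{k \neq j} P\bigl(x^{(k)} \mid x^{(\mathrm{pa}(k))}\bigr)\Big|_{x^{(j)} = x}
```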
intervention effect:
E[Y | do(X^{(2)} = x)] = ∫ y P(y | do(X^{(2)} = x)) dy

intervention effect at x_0: (∂/∂x) E[Y | do(X^{(2)} = x)] |_{x = x_0}

in the Gaussian case, where the effect is linear in x:
(∂/∂x) E[Y | do(X^{(2)} = x)] ≡ θ_2 for all x
The backdoor criterion (Pearl, 1993)
when there is no unmeasured confounder (variable), the parental set pa(j) might not be the minimal adjustment set, but it always suffices for Y ∉ pa(j)
[DAG figure over the nodes X^{(2)}, X^{(3)}, X^{(4)} and Y]
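Written out, the backdoor adjustment with the parental set reads (for Y ∉ pa(j), no unmeasured confounders):

```latex
% backdoor adjustment with the parental set pa(j):
E\bigl[Y \mid \mathrm{do}(X^{(j)} = x)\bigr]
  \;=\; \int E\bigl[Y \mid X^{(j)} = x,\, X^{(\mathrm{pa}(j))} = z\bigr]
        \, dP_{X^{(\mathrm{pa}(j))}}(z)
```

In the linear Gaussian case this integral is linear in x, and θ_j is the coefficient of X^{(j)} in the linear regression of Y on X^{(j)} and X^{(pa(j))}.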
recap:
causal effect = effect from a randomized trial
(but we want to infer it without a randomized study...
because often we cannot do it, or it is too expensive)
Example:
X → Y (X causes Y)    versus    X ← Y (Y causes X)
D ∼ D′ ⇔ M(D) = M(D′)
for
P = {Gaussian distributions} or
P = {nonparametric distributions}
it holds:
E(D) = {D′; D′ ∼ D}
Equivalence class of DAGs
• All DAGs in an equivalence class have the same skeleton and the
same v-structures
• An equivalence class can be uniquely represented by a completed
partially directed acyclic graph (CPDAG)
we cannot estimate causal/intervention effects from the observational distribution alone

conceptual "procedure":
• probability distribution P from a DAG, generating the data
  ⇝ true underlying equivalence class of DAGs (CPDAG)
• find all DAG-members of the true equivalence class (CPDAG): D_1, ..., D_m
• for every DAG-member D_r and every variable X^{(j)}: single intervention effect θ_{r,j}

summarize them by
Θ = {θ_{r,j}; r = 1, ..., m; j = 1, ..., p}   (identifiable parameter)
IDA (oracle version)
PC-algorithm: CPDAG ⇝ DAG 1, DAG 2, ..., DAG m
do-calculus: DAG r ⇝ effect r (r = 1, ..., m)
If you want a single number for every variable ...
summarize Θ = {θ_{r,j}; r = 1, ..., m; j = 1, ..., p}, e.g. by the minimal absolute effect per variable, α_j = min_r |θ_{r,j}|
[Diagram: effect 1, effect 2, ..., effect q mapped to a single summary]
Estimation from finite samples
P ⇒ CPDAG
[Figure: example CPDAG over the nodes 1, 2, 4, 5, 7, 10]
two main approaches:
• multiple testing of conditional dependencies: PC-algorithm as prime example
• score-based methods: MLE as prime example
Faithfulness assumption
(necessary for conditional dependence testing approaches)

X^{(1)} ← ε^{(1)},
X^{(2)} ← α X^{(1)} + ε^{(2)},
X^{(3)} ← β X^{(1)} + γ X^{(2)} + ε^{(3)},
ε^{(1)}, ε^{(2)}, ε^{(3)} i.i.d. ∼ N(0, 1)
[DAG figure over the nodes 1, 2, 3]

unfaithful distributions arise due to exact cancellation (see the computation below)
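To see the exact cancellation, substitute the equation for X^{(2)} into the one for X^{(3)}:

```latex
% substituting X^{(2)} = \alpha X^{(1)} + \varepsilon^{(2)} into the
% structural equation for X^{(3)}:
X^{(3)} \;=\; (\beta + \gamma\alpha)\,X^{(1)} + \gamma\,\varepsilon^{(2)} + \varepsilon^{(3)},
\qquad
\mathrm{Cov}\bigl(X^{(1)}, X^{(3)}\bigr) \;=\; \beta + \gamma\alpha
```

Hence if β + γα = 0, then X^{(1)} and X^{(3)} are marginally independent although the DAG contains the edge 1 → 3: the distribution is unfaithful to the DAG.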
The PC-algorithm (Spirtes & Glymour, 1991)
• crucial assumption: the distribution P is faithful to the true underlying DAG
• less crucial but convenient: Gaussian assumption for Y, X^{(1)}, ..., X^{(p)} ⇝ can work with partial correlations
• input: Σ̂_MLE, but we only need to consider many small sub-matrices of it (assuming sparsity of the graph)
• output: estimated CPDAG, based on a clever data-dependent (random) sequence of multiple tests
PC-algorithm: a rough outline
for estimating the skeleton of the underlying DAG:
1. start with the full graph
2. correlation screening: remove edge i − j if the estimated (marginal) correlation between X^{(i)} and X^{(j)} is small
3. partial correlations of order 1: remove edge i − j if P̂arcor(X^{(i)}, X^{(j)} | X^{(k)}) is small for some k in the current neighborhood of i or j (thanks to faithfulness)
4. move up to partial correlations of order 2: remove edge i − j if the partial correlation P̂arcor(X^{(i)}, X^{(j)} | X^{(k)}, X^{(ℓ)}) is small for some k, ℓ in the current neighborhood of i or j; and so on for higher orders
[Figure: sequence of estimated graphs, from the full graph through correlation screening to partial correlation order 1, where the algorithm stopped; shown for expected neighborhood sizes E[N] = 2 and E[N] = 8]
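As a concrete illustration of the test in step 3, here is a minimal sketch (my illustration, not the pcalg internals) of an order-1 partial correlation test with the usual Fisher z-transform:

```r
# Test X_i _||_ X_j given X_k from a correlation matrix R of n observations.
# The edge i - j survives this round only if the test rejects independence.
parcor1_test <- function(R, i, j, k, n, alpha = 0.01) {
  # order-1 partial correlation from marginal correlations
  r <- (R[i, j] - R[i, k] * R[j, k]) /
       sqrt((1 - R[i, k]^2) * (1 - R[j, k]^2))
  # Fisher z-transform; the conditioning set has size |S| = 1
  z <- 0.5 * log((1 + r) / (1 - r))
  reject <- sqrt(n - 1 - 3) * abs(z) > qnorm(1 - alpha / 2)
  list(parcor = r, reject = reject)
}

# toy usage: X3 is a common effect of X1 and X2 (a collider), so X1 and X2
# are dependent given X3 although they are marginally independent
set.seed(1)
x <- matrix(rnorm(200 * 3), 200, 3)
x[, 3] <- x[, 1] + x[, 2] + rnorm(200)
parcor1_test(cor(x), i = 1, j = 2, k = 3, n = 200)
```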
1. PC-algorithm ⇝ estimated CPDAG
2. local algorithm ⇝ Θ̂_local
3. lower bounds for absolute causal effects ⇝ α̂_j
R-package: pcalg
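A sketch of this three-step pipeline with pcalg, on the package's bundled toy data (the tuning parameters here are illustrative, not the settings used for the real-data results):

```r
library(pcalg)
data(gmG)                          # loads the simulated Gaussian data gmG8
X <- gmG8$x
suffStat <- list(C = cor(X), n = nrow(X))

# 1. PC-algorithm -> estimated CPDAG
pc.fit <- pc(suffStat, indepTest = gaussCItest, alpha = 0.01,
             labels = colnames(X))

# 2. local IDA: multiset of possible causal effects of X^{(2)} on X^{(5)},
#    one value per DAG member of the estimated equivalence class
effects <- ida(2, 5, cov(X), pc.fit@graph, method = "local")

# 3. lower bound for the absolute causal effect
alpha.hat <- min(abs(effects))
```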
X → Y (X causes Y) versus X ← Y (Y causes X): the same situation arises with a full graph with more than 2 nodes
⇝ [Photo: R.A. Fisher]
Gaussian DAG is a Gaussian linear structural equation model:
X^{(1)} ← ε^{(1)}
X^{(2)} ← β_{21} X^{(1)} + ε^{(2)}
X^{(3)} ← β_{31} X^{(1)} + β_{32} X^{(2)} + ε^{(3)}
[DAG figure over the nodes 1, 2, 3]

in general:
X^{(j)} ← Σ_{k=1}^{p} β_{jk} X^{(k)} + ε^{(j)}  (j = 1, ..., p),   β_{jk} ≠ 0 ⇔ edge k → j
in matrix notation: X = BX + ε,  ε ∼ N_p(0, diag(σ_1^2, ..., σ_p^2))
⇝ reparameterization
(Σ̂, D̂) = argmin_{Σ; D a DAG} −ℓ(Σ, D; data) + λ|D|
        = argmin_{B, {σ_j^2}} −ℓ(B, {σ_j^2}; data) + λ‖B‖_0,   where ‖B‖_0 = Σ_{i,j} 1(B_{ij} ≠ 0)
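The reparameterization is the usual one for linear Gaussian SEMs: solving X = BX + ε expresses the covariance matrix through (B, {σ_j^2}), while the ℓ0-norm of B counts the edges of D:

```latex
% solving the structural equations X = BX + \varepsilon for X:
X = (I - B)^{-1}\varepsilon,
\qquad
\Sigma(B, \{\sigma_j^2\})
  \;=\; (I - B)^{-1}\,\mathrm{diag}(\sigma_1^2,\dots,\sigma_p^2)\,(I - B)^{-\top}
```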
[Figure: (beta1, beta2) parameter space with the origin (0,0) marked, for the two-node graph X1 — X2]
[Figure: true positives vs. false positives for IDA, Lasso, Elastic-net, and random guessing; n = 63 observational data samples]
Arabidopsis thaliana (Stekhoven, Moraes, Sveinbjörnsson, Hennig,
Maathuis & PB, 2012)
recap:
if P = {Gaussian distributions}, then:
the cardinality of the Markov equivalence class of a DAG D is often > 1: |E(D)| > 1
Xj ← fj(X_{pa(j)}, εj) (j = 1, ..., p)
e.g.
Xj ← Σ_{k∈pa(j)} fjk(Xk) + εj (j = 1, ..., p)

three types of SEMs where the true DAG is identifiable from the observational distribution:
• linear SEMs with εj's non-Gaussian: as with independent component analysis (ICA) ⇝ identifying Gaussian components is the hardest
• linear Gaussian SEMs with equal error variances (Peters & PB, 2014):
  Xj ← Σ_{k∈pa(j)} Bjk Xk + εj (j = 1, ..., p)
• nonlinear SEMs with additive noise:
  Xj ← fj(X_{pa(j)}) + εj (j = 1, ..., p)
[Figure: scatter plots illustrating identifiability with additive noise. Left: data from the truth Y = … + ε, with the fitted regression of Y on X; the residuals ε are independent of X. Right: regression in the reverse direction of X on Y; the residuals η are not independent of Y.]
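This asymmetry can be turned into a simple direction test. The sketch below (my illustration, not code from the talk) fits both directions nonparametrically and compares the dependence between regressor and residuals with a plug-in HSIC statistic; a real procedure would calibrate the statistic, e.g. by a permutation test:

```r
# plug-in Hilbert-Schmidt independence criterion (Gaussian kernels,
# bandwidth fixed at 1 for simplicity): larger value = more dependence
hsic <- function(a, b, sigma = 1) {
  n <- length(a)
  K <- exp(-as.matrix(dist(a))^2 / (2 * sigma^2))
  L <- exp(-as.matrix(dist(b))^2 / (2 * sigma^2))
  H <- diag(n) - matrix(1 / n, n, n)   # centering matrix
  sum((H %*% K %*% H) * L) / n^2
}

set.seed(1)
n <- 300
x <- runif(n, 0, 3)
y <- x^2 + rnorm(n)              # truth: y = f(x) + eps with nonlinear f

res_fwd <- resid(loess(y ~ x))   # causal direction: residuals ~ indep. of x
res_bwd <- resid(loess(x ~ y))   # anticausal: residuals depend on y

c(forward = hsic(x, res_fwd), backward = hsic(y, res_bwd))
# the direction with the smaller residual dependence (here x -> y) wins
```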
strategy to infer structure and model parameters (for identifiable models):
Xj ← Σ_{k∈pa(j)} fjk(Xk) + εj (j = 1, ..., p),   εj ∼ N(0, σj^2)
[Figure: graphs over the nodes 1, 3, 4, 5 illustrating a "superset skeleton" and the estimated graphs after steps 1, 2 and 3]
Empirical results to illustrate what can be achieved with CAM
p = 100, n = 200; true model is CAM (additive SEM) with Gaussian error
[Figure: boxplots of structural errors (p = 100, two panels) for CAM, RESIT, LiNGAM, PC (lower/upper), CPC (lower/upper) and GES (lower/upper)]
RESIT (Mooij et al., 2009) cannot be used for p = 100.
The CAM method is impressive where the true functions are non-monotone and nonlinear (sampled from a Gaussian process); for monotone functions it is still good, but with less impressive gains.
Gene expressions from isoprenoid pathways in Arabidopsis thaliana (Wille et al., 2004)
p = 39, n = 118
[Figure: two versions of the isoprenoid pathway diagram (including a Mitochondrion compartment), over the genes DXR, MCT, CMK, MECPS, HDS, HDR, HMGS, HMGR1, HMGR2, MK, MPDC1, MPDC2, IPPI1, IPPI2, FPPS1, FPPS2, UPPS1, GPPS, DPPS1, DPPS2, DPPS3, GGPPS (multiple isoforms), PPDS1, PPDS2]
call it ord-additive modeling/fitting:
very robust against model misspecification!
• rank directed gene pairs (X causes Y) by ∫ (Ê[Y | do(X = x)] − Ê[Y])² dx
• take the top 10 scoring directed gene pairs and check their stability
⇝ stability selection: E[false positives] ≤ 1
[Figure: stable directed connections among the genes DXR, MCT, CMK, MECPS, HDS, HDR, IPPI1]
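The bound E[false positives] ≤ 1 invoked here is an instance of the stability selection error control of Meinshausen & Bühlmann (2010): with p candidate pairs, q the expected number of pairs selected per subsample, and π_thr ∈ (1/2, 1] the selection-frequency threshold,

```latex
% stability selection bound (Meinshausen & Buhlmann, 2010):
% V = number of falsely selected pairs
E[V] \;\le\; \frac{1}{2\pi_{\mathrm{thr}} - 1}\cdot\frac{q^2}{p}
```

choosing q and π_thr so that the right-hand side is at most 1 gives the stated control.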
[Figure: squared error of the Partial Path and Parent Adjustment estimators versus SHD to the true DAG (percentage of correct edges), two panels]
1. Beware of over-interpretation!
R-package: pcalg
(Kalisch, Mächler, Colombo, Maathuis & PB, 2012)
References:
• Ernest, J. and Bühlmann, P. (2014). On the role of additive regression for (high-dimensional) causal inference. Preprint arXiv:1405.1868.
• Bühlmann, P., Peters, J. and Ernest, J. (2013). CAM: Causal Additive Models, high-dimensional order search and penalized regression. Preprint arXiv:1310.1533.
• Peters, J. and Bühlmann, P. (2014). Identifiability of Gaussian structural equation models with equal error variances. Biometrika 101, 219-228.
• Uhler, C., Raskutti, G., Bühlmann, P. and Yu, B. (2013). Geometry of faithfulness assumption in causal inference. Annals of Statistics 41, 436-463.
• van de Geer, S. and Bühlmann, P. (2013). ℓ0-penalized maximum likelihood for sparse directed acyclic graphs. Annals of Statistics 41, 536-567.
• Kalisch, M., Mächler, M., Colombo, D., Maathuis, M.H. and Bühlmann, P. (2012). Causal inference using graphical models with the R package pcalg. Journal of Statistical Software 47(11), 1-26.
• Stekhoven, D.J., Moraes, I., Sveinbjörnsson, G., Hennig, L., Maathuis, M.H. and Bühlmann, P. (2012). Causal stability ranking. Bioinformatics 28, 2819-2823.
• Hauser, A. and Bühlmann, P. (2012). Characterization and greedy learning of interventional Markov equivalence classes of directed acyclic graphs. Journal of Machine Learning Research 13, 2409-2464.
• Maathuis, M.H., Colombo, D., Kalisch, M. and Bühlmann, P. (2010). Predicting causal effects in large-scale systems from observational data. Nature Methods 7, 247-248.
• Maathuis, M.H., Kalisch, M. and Bühlmann, P. (2009). Estimating high-dimensional intervention effects from observational data. Annals of Statistics 37, 3133-3164.