
Causal Fairness Analysis

Drago Plečko [email protected]


Seminar für Statistik
ETH Zürich
Zürich, 8092, Switzerland
Elias Bareinboim [email protected]
Department of Computer Science
Columbia University

New York, 10027, United States

Abstract
Decision-making systems based on AI and machine learning have been used throughout a
wide range of real-world scenarios, including healthcare, law enforcement, education, and
finance. It is no longer far-fetched to envision a future where autonomous systems will be
driving entire business decisions and, more broadly, supporting large-scale decision-making
infrastructure to solve society’s most challenging problems. Issues of unfairness and dis-
crimination are pervasive when decisions are being made by humans, and remain (or are
potentially amplified) when decisions are made using machines with little transparency,
accountability, and fairness. In this paper, we introduce a framework for causal fairness
analysis with the intent of filling in this gap, i.e., understanding, modeling, and possibly
solving issues of fairness in decision-making settings. The main insight of our approach will
be to link the quantification of the disparities present on the observed data with the under-
lying, and often unobserved, collection of causal mechanisms that generate the disparity
in the first place, a challenge we call the Fundamental Problem of Causal Fairness Analysis
(FPCFA). In order to solve the FPCFA, we study the problem of decomposing variations
and empirical measures of fairness that attribute such variations to structural mechanisms
and different units of the population. Our effort culminates in the Fairness Map, which
is the first systematic attempt to organize and explain the relationship between different
criteria found in the literature. Finally, we study which causal assumptions are minimally
needed for performing causal fairness analysis and propose a Fairness Cookbook, which
allows data scientists to assess the existence of disparate impact and disparate treatment.
Keywords: Fairness in machine learning, Causal Inference, Graphical models, Counter-
factual fairness, Fair predictions.

1. Introduction
As society transitions to an AI-based economy, an increasing number of decisions that were
once made by humans are now delegated to automated systems, and this trend will likely
accelerate in the coming years. Automated systems may exhibit discrimination based on
gender, race, religion, or other sensitive attributes, and so considerations about fairness
in AI are an emergent discussion across the globe. Even though it might seem that the
issue of unfairness in AI is a recent development, the origins of the problem can be traced
back to long before the advent of AI and the prominence these systems have reached in the
last years. Among others, one prominent example is Martin Luther King Jr., who spoke of

©2022 Plečko and Bareinboim.


License: CC-BY 4.0, see https://creativecommons.org/licenses/by/4.0/.

having a dream that his children “will one day live in a nation where they will not be judged
by the color of their skin, but by the content of their character”. So little could he have
anticipated that machine algorithms would one day use race for making decisions, and that
the issues of unfairness in AI would be legislated under Title VII of the Civil Rights Act of
1964 (Act, 1964), which he advocated and fought for (Oppenheimer, 1994; Kotz, 2005).
The critical challenge underlying fairness in AI systems lies in the fact that biases in
decision-making exist in the real world from which various datasets are collected. Perhaps
not surprisingly, a dataset collected from a biased reality will contain aspects of these biases
as an imprint. In this context, algorithms are tools that may replicate or potentially even
amplify the biases that exist in reality in the first place. As automated systems are a priori
oblivious to ethical considerations, using them blindly could lead to the perpetuation of
unfairness in the future. More pessimistic analysts take this observation as a prelude to
doomsday, which, in their opinion, suggests that we should be extremely wary and defensive
against any AI. We believe a degree of caution is necessary, of course, but take a more
positive perspective, and consider this transition to a more AI-based society as a unique
opportunity to improve the current state of affairs.
While many human decision-makers are hard to change, even when aware of their own
biases, AI systems may be less brittle and more flexible. Still, one of the requirements
to realize AI potential is a new mathematical framework that allows the description and
assessment of legal notions of discrimination in a formal way. Based on this framework,
some of the tasks of fair ML will be to detect and quantify undesired discrimination based on
society’s current ethical standards, and to then design learning methods capable of removing
such unfairness from future predictions and decisions. This situation is somewhat unique
in the context of AI because a new definition of “ground truth” is required. The decision-
making system cannot rely purely on learning from the data, which is contaminated with
unwanted bias. It is currently unclear how to formulate the ideal inferential target1 , which
would help bring about a fair world when deployed. This degree of flexibility in deciding
the new ground truth also emphasizes the importance of normative work in this context. 2
In this paper, we build on two legal systems applied to large bodies of cases throughout
the US and the EU that are known as disparate treatment and disparate impact (Barocas
and Selbst, 2016). One of our key goals will be to develop a framework for causal fairness
analysis grounded in these systems and translate them into exact mathematical language.
The disparate treatment doctrine enforces the equality of treatment of different groups,
prohibiting the use of the protected attribute (e.g., race) in the decision process. One of
the legal formulations for proving disparate treatment is that “a similarly situated person
who is not a member of the protected class would not have suffered the same fate” (Barocas

1. We believe this explains the vast number of fairness criteria described in the literature, which we will
detail later on in the paper.
2. One way of seeing this point a bit more formally goes as follows. We first consider the current version of
the world, say π, and note that it generates a probability distribution P. Training the machine learning
algorithm with data from this distribution (D ∼ P) is replicating patterns from this reality, π. What
we would want is to have an alternative, counterfactual reality π′, which induces a different distribution
P′ without the past biases. The challenge here is that thinking about and defining P′ relies on going
beyond P, or the corresponding dataset, which is non-trivial, and yet one of our main goals.


and Selbst, 2016)3 . On the other hand, the disparate impact doctrine focuses on outcome
fairness, namely, the equality of outcomes among protected groups. Disparate impact
discrimination occurs if a facially neutral practice has an adverse impact on members of the
protected group. Under this doctrine most commonly fall the cases in which discrimination
is unintended or implicit. The analysis can become somewhat intricate when variables are
correlated with the protected attribute and may act as a proxy, while the law may not
necessarily prohibit their usage due to their relevance to the business itself; this is known
as “business necessity” or “job-relatedness”. Taking business necessity into account is the
essence of disparate impact (Barocas and Selbst, 2016). Through causal machinery, our
framework will allow the data scientist to explain how much of the observed disparity can
be attributed to each underlying causal mechanism. This, in turn, allows the data scientist
to quantify the disparity explained by mechanisms that do not fall under business necessity
and are considered discriminatory, thereby providing a formal way of assessing disparate
impact while accommodating business necessity requirements.

Current state of affairs & challenges


The behavior of AI/ML-based decision-making systems is an emergent property following a
complex combination of past (possibly biased) data and interactions with the environment.
Predicting or explaining this behavior and its impact on the real world can be a difficult task,
even for the system designer who has the knowledge of how the system is built. Ensuring
fairness of such decision-making systems, therefore, critically relies on contributions from
two groups, namely:

a. the AI and ML engineers who develop methods to detect bias and ensure adherence
of ML systems to fairness measures, and

b. the domain experts, social scientists, economists, policymakers, and legal experts,
who study the origins of these biases and can provide the societal interpretations of
fairness measures and their expectations in terms of norms and standards.

Currently, these groups do not share a common starting point, making it extremely difficult
for them to understand each other and work together towards developing a fair specification
of such complex systems, aligned with the many stakeholders involved. In this work, we
argue that the language of structural causality can provide this common starting point
and facilitate the discussion and exchange of ideas, goals, and expectations between these
groups. In some sense, the connection with causal inference might be seen as natural in
this context as the legal frameworks of anti-discrimination laws (for example, Title VII in
the US) often require that to establish a prima facie case of discrimination, the plaintiff
must demonstrate “a strong causal connection” between the alleged discriminatory practice
and the observed statistical disparity (Texas Dept. of Housing and Community Affairs
v. Inclusive Communities Project, Inc., 576 U.S. 519 (2015)). Therefore, as discussed in
subsequent sections, one of the requirements of our framework will be the ability to represent
3. This formulation is related to a condition known as ceteris paribus, which represents the effect of the
protected attribute on the outcome of interest while keeping everything else constant. From a causal
perspective, this suggests that the disparate treatment doctrine is concerned with direct discrimination,
a connection we draw formally later on in the manuscript.


causal mechanisms underlying a given decision-making setting as well as to distinguish between notions of discrimination that would otherwise be statistically indistinguishable.
Consider for instance the Berkeley Admission example, in which admission results of students applying to UC Berkeley were collected and analyzed (Bickel et al., 1975). The analysis showed that male students are 14% more likely to be admitted than their female counterparts, which raised concerns about the possibility of gender discrimination. The discussion of this example is often less focused on the accuracy and appropriateness of the statistical measures used, and more on the plausible justification of the disparity based on the mechanism underlying it. A visual representation of the dynamics in this setting is shown in Fig. 1.

[Figure 1: A partial causal model for the Berkeley Admission example, with nodes X (Gender), D (Department), and Y (Admission).]

In words, each student chooses a department of application. The
department choice and student’s gender might, in turn, influence the admission decision.
In this example, there is a clear need for determining how much of the observed statistical
disparity can be attributed to the direct causal path from gender to admission decision
vs. the indirect mechanism4 going through the department choice variable. Looking directly
at gender for determining university admission would certainly be disallowed, whereas using
department choice, which may be influenced by gender, might be deemed acceptable. The
need to explain an observed statistical disparity, say in this case the 14% difference in
admission rates, through the underlying causal mechanisms – direct and indirect – is a
recurring theme when assessing discrimination, even though it is sometimes considered
only implicitly.
In fact, when AI tools are deployed in the real world, a similar pattern of questions
emerges. Examples include (but are not limited to) the debate over the origins and inter-
pretation of discrimination in criminal justice (COMPAS, Angwin et al., 2016), the contri-
bution of data vs. algorithms in the observed bias in face detection (e.g., Harwell, 2019;
Buolamwini and Gebru, 2018), and the business necessity vs. risk of digital redlining in
targeted advertising (Detrixhe and Merrill, 2019). Intuitively, through these types of ques-
tions, society wants to draw a line between what is seen as discriminatory on the one hand,
and what is seen as acceptable or justified by economic principles on the other.
Considering the above, a practitioner interested in implementing a fair decision-making
system based on AI will face two challenges. The first stems from the fact that the cur-
rent literature is abundant with different fairness measures, some of which are mutually
incompatible (Corbett-Davies and Goel, 2018), and choosing among these measures, even
for the system designer, is usually a non-trivial task. This challenge is compounded with
the second challenge, which arises from the statistical nature of such fairness measures. As
we will show both formally and empirically later on in the text, statistical measures alone
cannot distinguish between different causal mechanisms that transmit change and gener-
ate disparity in the real world, even if an unlimited amount of data is available. Despite
this apparent shortcoming of purely statistical measures, much of the literature focuses on
casting fair prediction as an optimization problem subject to fairness constraints based on
such measures (Pedreschi et al., 2008, 2009; Luong et al., 2011; Ruggieri et al., 2011; Hajian
and Domingo-Ferrer, 2012; Kamiran and Calders, 2009; Calders and Verwer, 2010; Kamiran et al., 2010; Zliobaite et al., 2011; Kamiran and Calders, 2012; Kamiran et al., 2012; Zemel et al., 2013; Mancuhan and Clifton, 2014; Romei and Ruggieri, 2014; Dwork et al., 2012; Friedler et al., 2016; Chouldechova, 2017; Pleiss et al., 2017), to cite a few. In fact, these methods may be insufficient for removing bias and may perhaps even lead to unintended consequences and bias amplification, as will become clear later on.

4. As discussed later on, even among indirect paths, one may need to distinguish between mediated causal paths and confounded non-causal paths, or, more generally, among a specific subset of these paths.

[Figure 2: A mental map of the Causal Fairness Analysis pipeline, connecting the SCM M (a), the data D (b), and the causal diagram G (c) (Sections 2 & 3: Foundations) with the structural criteria {Qk}k=1:s, the fairness measures {µi}i=1:m (Section 4: Fairness Measures), the empirical measures {ηi}i=1:m, the doctrines of disparate impact and disparate treatment, and Tasks 1-3 (Section 5: Fairness Tasks).]
The above observations highlight the importance of considering causal aspects when
designing fair systems. Obtaining rich enough causal models of unobserved or partially
observed reality is not always trivial in practice, yet it is crucial in the context of fair ML.
Causal models must be built using inputs from domain experts, social scientists, and policy-
makers, and a formal language is needed to express and scrutinize them. In this work, we
lay down the foundations of interpreting legal doctrines of discrimination through causal
reasoning, which we view as an essential step towards the development of a new generation
of more ethical and transparent AI systems.

1.1 Contributions
To overcome the challenges described above, we will study fairness analysis through causal
lenses and develop a framework for understanding, modeling, and potentially controlling
for the biases present in the data. Fig. 2 contains the key elements involved in Causal
Fairness Analysis as well as a roadmap of how this paper is organized. Specifically, in
Sec. 2, we cover the basic notions of causal inference, including structural causal models,
causal diagrams, and data collection. In Sec. 3, we introduce the essential elements of our
theoretical framework. In particular, we define the notions of structural fairness that will
serve as a baseline, ground truth for determining the presence or absence of discrimination


under the disparate impact and disparate treatment doctrines. In Sec. 4, we introduce
causal measures of fairness that can be computed from data in practice. We further draw
the connection between such measures and the aforementioned legal doctrines. In Sec. 5,
we introduce the tasks of Causal Fairness Analysis – bias detection and quantification, fair
prediction, and fair decision-making – and show how they can be solved by building on the
tools developed earlier. More specifically, our contributions are as follows:

1. We develop a general and coherent framework of Causal Fairness Analysis (Fig. 2).
This framework provides a common language to connect computer scientists and
statisticians on the one hand, and legal and ethical experts on the other to tackle
challenges of fairness in automated decision-making. Further, this new framework
grounds the legal doctrines of disparate impact and disparate treatment through the
semantics of structural causal models.

2. We formulate the Fundamental Problem of Causal Fairness Analysis (FPCFA), which outlines some critical properties that empirical measures of fairness should exhibit. In
particular, we discuss which properties allow us to relate fairness measures with the
specific causal mechanisms that generate the disparity observed in the data, thereby
providing empirical basis for reasoning about structural causality.

3. We formalize the problem of decomposing variations between a protected attribute X and an outcome variable Y . In particular, we show how the total variation (TV) can
be decomposed based on different causal mechanisms and across different groups of
units. These developments lead to the construction of the explainability plane (Fig. 7).

4. We introduce the TV family of measures (Table 1) and construct the first version of
the Fairness Map (Thm. 7 and Fig. 12). The Map brings well-known fairness measures
under the same theoretical umbrella and uncovers the structure that connects them.

5. We elicit the assumptions under which different causal fairness criteria can be evalu-
ated. Specifically, we introduce the Standard Fairness Model (SFM), which is a generic
and simplified way of encoding causal assumptions and constructing the causal dia-
gram. One desirable feature of the SFM is that it strikes a balance between simplicity
of construction and informativeness for causal analysis (Def. 7 and Thm. 9).

6. We develop the Fairness Cookbook that represents a practical solution that allows
data scientists to assess the presence of disparate treatment and disparate impact.
Furthermore, we provide an R-package for performing this task called faircause.

7. We study the implications of Causal Fairness Analysis on the fair prediction problem.
In particular, we prove the Fair Prediction Theorem (Thm. 10) which shows that
making TV equal to zero during the training stage is almost never sufficient to
ensure that causal measures of fairness are well-behaved.

Readers familiar with causal inference may want to move straight to Sec. 3, even though
the next section’s examples are used to motivate the problem of fairness.


2. Foundations of Causal Inference


In this section, we introduce three fundamental building blocks that will allow us to formalize
the challenges of fairness described above through a causal lens. First, we will define in
Sec. 2.1 a general class of data-generating models known as structural causal models (shown
in Fig. 2a). The key observation here is that the collection of mechanisms underpinning any
decision-making scenario are causal, and therefore should be modeled through proper and
formal causal semantics. Second, we will discuss in Sec. 2.2 qualitatively different probability
distributions that are induced by the causal generative process, and which will lead to the
observed data and counterfactuals (Fig. 2b). Third, we will introduce in Sec. 2.3 an object
known as a causal diagram (Fig. 2c), which will allow the data scientist to articulate non-
parametric assumptions over the space of generative models. These assumptions can be
shown to be necessary for the analysis, in a broader sense. Finally, we will define the standard
fairness model (SFM), which is a special class of diagrams that act as a template, allowing
one to generically express entire classes of structural models. The SFM class, in particular,
requires fewer modelling assumptions than the more commonly used causal diagrams.

2.1 Structural Causal Models


The basic semantical framework of our analysis rests on the notion of structural causal
model (SCM, for short), which is one of the most flexible classes of generative models known
to date (Pearl, 2000). The section will follow the presentation in (Bareinboim et al., 2022),
which contains more detailed discussions and proofs. First, we introduce and exemplify
SCMs through the following definition:

Definition 1 (Structural Causal Model (SCM) (Pearl, 2000)) A structural causal model
(SCM) is a 4-tuple ⟨V, U, F, P(u)⟩, where

1. U is a set of exogenous variables, also called background variables, that are determined
by factors outside the model;

2. V = {V1, ..., Vn} is a set of endogenous (observed) variables, that are determined by variables in the model (i.e., by the variables in U ∪ V);

3. F = {f1, ..., fn} is the set of structural functions determining V, vi ← fi(pa(vi), ui), where pa(Vi) ⊆ V \ Vi and Ui ⊆ U are the functional arguments of fi;

4. P(u) is a distribution over the exogenous variables U.

In words, each structural causal model can be seen as partitioning the variables involved in
the phenomenon into sets of exogenous (unobserved) and endogenous (observed) variables,
respectively, U and V . The exogenous variables are determined “outside” of the model and
their associated probability distribution, P (U ), represents a summary of the world external
to the phenomenon that is under investigation. In our setting, these variables will represent
the units involved in the phenomenon, which correspond to elements of the population under
study, for instance, patients, students, customers. Naturally, their randomness (encoded in
P (U )) induces variations in the endogenous set V .


Inside the model, the value of each endogenous variable Vi is determined by a causal
process, Vi ← fi (pa(vi ), ui ), that maps the exogenous factors Ui and a set of endogenous
variables P ai (so called parents) to Vi . These causal processes – or mechanisms – are
assumed to be invariant unless explicitly intervened on (as defined later in the section).
Together with the background factors, they represent the data-generating process according
to which the values of the endogenous variables are determined. For concreteness and
grounding of the definition, we revisit the Berkeley admission example through the lens of
SCMs.

Example 1 (Berkeley Admission (Bickel et al., 1975)) During the application pro-
cess for admissions to UC Berkeley, potential students choose a department to which they
apply, which is labelled as D (binary with D = 0 for arts & humanities, D = 1 for sciences).
The admission decision is labelled as Y (y1 accepted, y0 rejected) and the student’s gender
is labelled as X (x0 female, x1 male)5 .
The SCM M is the 4-tuple ⟨V = {X, D, Y}, U = {UX, UD, UY}, F, P(U)⟩, where
UX , UY , UD represent the exogenous variables, outside of the model, that affect X, Y, D,
respectively. Also, the causal mechanisms F are given as follows 6 :

X ← 1(UX < 0.5) (4)


D ← 1(UD < 0.5 + λX) (5)
Y ← 1(UY < 0.1 + αX + βD), (6)

and P (UX , UD , UY ) is such that UX , UD , UY are independent Unif(0, 1) random variables.


In words, the population is partitioned into males and females, with equal probability (the
exogenous UX represents the population’s biological randomness). Each applicant chooses a
department D, and this decision depends on UD and gender X. The exogenous variable UD
represents the individual’s natural inclination towards studying science. Whenever λ > 0 in
Eq. 5, the threshold for applying to a science department is higher for female individuals,
which is a result of various societal pressures. Finally, the admission decision Y possibly
depends on gender (if α ≠ 0 in Eq. 6) and/or department of choice (if β ≠ 0 in Eq. 6).
The exogenous variable UY in this case represents the impression the applicant left during
an admission interview. Notice that female students and arts & humanities students may
need to leave a better interview impression in order to be admitted (depending on Eq. 6). 
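
To ground the definition computationally, the following short Python sketch (not part of the original example; the helper name sample_scm and the use of NumPy are illustrative assumptions) draws units from the SCM M in Eqs. 4-6:

import numpy as np

def sample_scm(n, alpha, beta, lam, seed=0):
    # Sketch: sample n units from the SCM of Eqs. 4-6, with Unif(0,1) exogenous noise.
    rng = np.random.default_rng(seed)
    u_x, u_d, u_y = rng.uniform(size=(3, n))            # exogenous U_X, U_D, U_Y
    x = (u_x < 0.5).astype(int)                         # X <- 1(U_X < 0.5)
    d = (u_d < 0.5 + lam * x).astype(int)               # D <- 1(U_D < 0.5 + lambda X)
    y = (u_y < 0.1 + alpha * x + beta * d).astype(int)  # Y <- 1(U_Y < 0.1 + alpha X + beta D)
    return x, d, y

For instance, sample_scm(10**6, alpha=0.0, beta=0.7, lam=0.2) produces data consistent with the coefficients used for this example later in the manuscript (footnote 7).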

Another important notion for our discussion is that of a submodel, which is defined next:

5. In the manuscript, gender is discussed as a binary variable, which is a simplification of reality, used to
keep the presentation of the concepts simple. In general, one might be interested in analyses of gender
discrimination with gender taking non-binary values.
6. The given SCM can also be written as

X ← Bernoulli(0.5) (1)
D ← Bernoulli(0.5 + λX) (2)
Y ← Bernoulli(0.1 + αX + βD). (3)


Definition 2 (Submodel (Pearl, 2000)) Let M be a structural causal model, X a set of variables in V, and x a particular value of X. A submodel Mx (of M) is a 4-tuple:

Mx = ⟨V, U, Fx, P(u)⟩    (7)

where

Fx = {fi : Vi ∉ X} ∪ {X ← x},    (8)

and all other components are preserved from M.

In words, the SCM Mx is obtained from M by replacing all equations in F related to variables X by equations that set X to a specific value x. In the context of Causal Fairness
Analysis, we might be interested in submodels in which the protected attribute X is set to
a fixed value x. Building on submodels, we introduce next the notion of potential response:

Definition 3 (Potential Response (Pearl, 2000)) Let X and Y be two sets of variables in V and u ∈ U be a unit. The potential response Yx(u) is defined as the solution for Y of
the set of equations Fx with respect to SCM M. That is, Yx (u) denotes the solution of Y
in the submodel Mx of M.

In words, Yx (u) is the value variable Y would take if (possibly contrary to observed facts)
X is set to x, for a specific unit u. In the Admission example, Yx (u) would denote the
admission outcome for the specific unit u, had their gender X been set to value x by
intervention (e.g., possibly contrary to their actual gender). Potential responses are also
called potential outcomes in the literature.
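
As an illustration of Defs. 2 and 3, a submodel Mx and the corresponding potential response Yx(u) can be mimicked in code by holding X at a constant while re-evaluating the remaining mechanisms on the same exogenous draw; this is a sketch under the same conventions as the sampling helper above, not a construct from the original text:

def potential_response_y(u_d, u_y, x_value, alpha, beta, lam):
    # Sketch: evaluate Y_x(u) in the submodel M_x, where X <- x_value replaces f_X.
    d = (u_d < 0.5 + lam * x_value).astype(int)               # D_x(u)
    y = (u_y < 0.1 + alpha * x_value + beta * d).astype(int)  # Y_x(u)
    return y

Note that the same exogenous values (u_d, u_y) are reused across different submodels, which is precisely what allows counterfactual quantities to be evaluated jointly later on.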

2.2 Observational & Counterfactual Distributions


Each SCM M induces different types of probability distributions, which represent different
data collection modes and will play a key role in fairness analysis. We start with the
observational distribution that represents a state of the underlying decision-making system
in which the fairness analysts just collect data, without interfering in the decision-making
process, as defined next.

Definition 4 (Observational Distribution (Bareinboim et al., 2022)) An SCM M = ⟨V, U, F, P(u)⟩ induces a joint probability distribution P(V) such that for each Y ⊆ V,

P^M(y) = Σu 1[Y(u) = y] P(u),    (9)

where Y (u) is the solution for Y after evaluating F with U = u.

In words, the procedure can be described as follows:

1. for each unit U = u, the structural functions F are evaluated following a valid topo-
logical order, and

2. the probability mass P(U = u) is accumulated for each instantiation U = u consistent with the event Y = y.


Throughout this manuscript, all the sums should be replaced by the corresponding
integrals whenever suitable. To ground the discussion about this definition, we continue
with the example above and see how the corresponding observational distribution is induced.
Example 2 (College Admission’s Observational Distribution) Consider the SCM M
in Eq. 4-6. The total variation (TV for short; also called demographic parity) generated by
M depends on the structural mechanisms F and the distribution of exogenous variables
P (UX , UD , UY ). The total variation can be written as:
P(y | x1) − P(y | x0) = P(y, x1)/P(x1) − P(y, x0)/P(x0).    (10)
Therefore, we compute the terms P (y, x1 ), P (x1 ), P (y, x0 ), P (x0 ) based on the true, under-
lying SCM. Using Def. 4 and Eq. 4, we can see that:
P(x1) = P(UX < 0.5) = 1/2 = P(UX > 0.5) = P(x0).    (11)
Using the fact that UX , UD , and UY are independent in the SCM, P (y, x1 ) can be computed
in the following way (Def. 4):
P(y, x1) = Σu 1(Y(u) = 1, X(u) = 1) P(u)    (12)
         = P(UX < 0.5)[P(UD > 0.5 + λ) P(UY < 0.1 + α) + P(UD < 0.5 + λ) P(UY < 0.1 + α + β)]    (13)
         = (1/2)[(1/2 − λ)(0.1 + α) + (1/2 + λ)(0.1 + α + β)] = (1/2)(0.1 + α + (1/2 + λ)β).    (14)
The computation above can be described as follows. Firstly, X(u) = 1 is equivalent with
UX < 0.5 (Eq. 4). Secondly, when X(u) = 1, there are two possibilities for the variable D
based on UD (see Eq. 5). Whenever UD > 0.5 + λ, then D(u) = 0, and to have Y (u) = 1,
we need UY < 0.1 + α (see Eq. 6). If UD < 0.5 + λ, then D(u) = 1, and to have Y (u) = 1,
we need UY < 0.1 + α + β (see Eq. 6). An analogous computation yields that:
P(y, x0) = Σu 1(Y(u) = 1, X(u) = 0) P(u)    (15)
         = (1/2)[(1/2)(0.1) + (1/2)(0.1 + β)] = (1/2)(0.1 + β/2).    (16)
Putting the results together in Eq. 10, the TV equals
P(y | x1) − P(y | x0) = [(1/2)(0.1 + α + (1/2 + λ)β)]/(1/2) − [(1/2)(0.1 + β/2)]/(1/2)    (17)
                      = α + λβ.    (18)
In fact, after analyzing the admission dataset from UC Berkeley, a data scientist computes
the observed disparity to be7
P (y | x1 ) − P (y | x0 ) = 14%. (19)
7. The number below was evaluated from the actual dataset, which is compatible with structural coefficients α = 0, β = 7/10, and λ = 2/10.


In words, male candidates are 14% more likely to be admitted than female candidates. The
data scientist (who does not have access to the SCM M described above) might wonder if
this disparity (14%) means that female applicants are discriminated against. Also, she/he
might wonder how the observed disparity relates to the SCM M given in Eq. 4-6. Our goal
in this manuscript is to address these questions from first principles. 
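
The closed-form result in Eq. 18 can be checked by simulation; the snippet below is an illustrative verification that assumes the sample_scm sketch from Example 1 and the coefficients α = 0, β = 7/10, λ = 2/10 of footnote 7:

x, d, y = sample_scm(n=2_000_000, alpha=0.0, beta=0.7, lam=0.2)
tv_hat = y[x == 1].mean() - y[x == 0].mean()   # estimate of P(y | x1) - P(y | x0)
print(round(tv_hat, 3))                        # approx. 0.14 = alpha + lam * beta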

Next, we define another important family of distributions over possible counterfactual out-
comes, which will be used throughout this manuscript:

Definition 5 (Counterfactual Distributions (Bareinboim et al., 2022)) An SCM M = ⟨V, U, F, P(u)⟩ induces a family of joint distributions over counterfactual events Yx, . . . , Zw, for any Y, Z, . . . , X, W ⊆ V:

P^M(yx, . . . , zw) = Σu 1[Yx(u) = y, . . . , Zw(u) = z] P(u).    (20)

The LHS in Eq. 20 contains variables with different subscripts, which syntactically represent
different potential responses (Def. 3), or counterfactual worlds. In words, the equation can
be interpreted as follows:

1. For each set of subscripts and variables (X, . . . , W and Y, . . . , Z), replace the corre-
sponding mechanism with appropriate constants to generate Fx , . . . , Fw and create
submodels Mx , . . . , Mw ,

2. For each unit U = u, evaluate the modified mechanisms Fx , ..., Fw to obtain the
potential response of the observables,

3. The probability mass P(U = u) is accumulated for each instance U = u that is consistent with the events over the counterfactual variables, that is, Yx = y, . . . , Zw = z, that is, Y = y in Mx, . . . , Z = z in Mw.

Example 3 (College Admission Counterfactual Distribution) Consider the SCM in Eq. 4-6 and the following joint counterfactual distribution:

P (yx1 , yx0 ). (21)

In the submodel Mx0 (where X = 0 is set by intervention), we have that Dx0 (u) = 1 is
equivalent with UD < 0.5. When Dx0 (u) = 1, Yx0 (u) = 1 if and only if UY < 0.1 + β.
Similarly, when Dx0 (u) = 0, Yx0 (u) = 1 if and only if UY < 0.1. Therefore, we have that

Yx0 (u) = 1 ⇐⇒ ((UD < 0.5) ∧ (UY < 0.1 + β)) ∨ ((UD > 0.5) ∧ (UY < 0.1)). (22)

In the submodel Mx1 , we have

Yx1 (u) = 1 ⇐⇒ ((UD < 0.5 + λ) ∧ (UY < 0.1 + α + β))∨ (23)
((UD > 0.5 + λ) ∧ (UY < 0.1 + α)).


Based on this, the expression in Eq. 21 can be evaluated using Def. 5, which leads to
P(yx1, yx0) = Σu 1(Yx1(u) = 1, Yx0(u) = 1) P(u)    (24)
            = P(UD < 0.5) P(UY < 0.1 + β) + P(UD > 0.5) P(UY < 0.1)
            = 0.1 + β/2.    (25)
Interestingly, this distribution is never obtainable from observational data, since it involves
both potential responses Yx0 , Yx1 , which can never be observed simultaneously. 
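
Since the SCM gives access to the exogenous variables, the joint counterfactual probability in Eq. 25 can also be approximated by evaluating both submodels on the same units; the sketch below assumes the potential_response_y helper introduced after Def. 3 and the coefficients of footnote 7 (such a computation is available only to someone holding the SCM, not to a data scientist with observational data alone):

import numpy as np

rng = np.random.default_rng(0)
u_d, u_y = rng.uniform(size=(2, 2_000_000))      # shared exogenous draws for both submodels
alpha, beta, lam = 0.0, 0.7, 0.2
y_x0 = potential_response_y(u_d, u_y, 0, alpha, beta, lam)   # Y_{x0}(u)
y_x1 = potential_response_y(u_d, u_y, 1, alpha, beta, lam)   # Y_{x1}(u)
print(round(np.mean((y_x0 == 1) & (y_x1 == 1)), 3))          # approx. 0.1 + beta/2 = 0.45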
In most fairness analysis settings, the data scientist will only have data D in the form of
samples collected from the observational distribution. One significant result in this context
is known as the causal hierarchy theorem (CHT, for short), which says that it is almost
never possible (in an information theoretic sense) to recover the counterfactual distribution
from the observational distribution alone (Bareinboim et al., 2022, Thm. 1). Given this
impossibility result and the unavailability of the SCM in most settings, the data scientist
needs to resort to some sort of assumptions in order to possibly make claims about these
underlying mechanisms, which is discussed in the next section.

2.3 Encoding Structural Assumptions through Causal Diagrams


Despite the fact that SCMs are well defined and provide the semantics to different families
of probability distributions, and are important for fairness analysis, one critical observation
is that they are usually not observable by the data scientist. A common way of encoding
assumptions about the underlying SCM is through an object called a causal diagram. We
describe below the constructive procedure that allows one to articulate a diagram from a
coarse understanding of the SCM.
Definition 6 (Causal Diagram (Pearl, 2000; Bareinboim et al., 2022)) Let M = ⟨V, U, F, P(u)⟩ be an SCM. A graph G is said to be a causal diagram (of M) if:
1. there is a vertex for every endogenous variable Vi ∈ V,
2. there is an edge Vi → Vj if Vi appears as an argument of fj ∈ F,
3. there is a bidirected edge Vi ↔ Vj if the corresponding Ui, Uj ⊂ U are correlated or the corresponding functions fi, fj share some Uij ∈ U as an argument.
In words, there is an edge from endogenous variables Vi to Vj whenever Vj “listens to” Vi
for determining its value8 . Similarly, the existence of a bidirected edge between Vi and Vj
indicates there is some shared, unobserved information affecting how both Vi and Vj obtain
their values. Note that while the SCM contains explicit information about all structural
mechanisms (F) and exogenous variables (P (u)), the causal diagram, on the other hand,
encodes information only about which functional arguments were possibly used as inputs
to the functions in F. That is, the diagram abstracts out the specifics of the functions F
and retains information about their possible arguments.
8. This construction lies at the heart of the type of knowledge causal models represent, as suggested in
(Pearl and Mackenzie, 2018, pp. 129): “This listening metaphor encapsulates the entire knowledge that
a causal network conveys; the rest can be derived, sometimes by leveraging data.”


Furthermore, the existence of a directed arrow, e.g., Vi → Vj, encodes the possibility of the mechanism of Vj to listen to variable Vi, but not the necessity. In words, the edges are in this sense non-committal; for instance, fj may decide not to take the value of Vi into account. On the other hand, the assumptions are not really encoded in the arrows present in the diagram, but in the missing arrows; each missing arrow ascertains that one variable is certainly not the argument of the other. The data scientist, in general, should try to specify as much knowledge as possible of this type. For concreteness, consider the following example.

[Figure 3: A partial causal model for the Admissions example, with nodes X (Gender), D (Department), and Y (Admission).]

Example 4 (Admission’s Causal Diagram) Consider again the SCM M in Ex. 1, which
is unknown by the data scientist trying to analyze the existence of discrimination in the ad-
mission process. To apply the graphical construction dictated by Def. 6, the data scientist
starts the modeling process by examining each of the endogenous variables and the potential
arguments of their corresponding mechanisms. For example, the mechanism
D ← fD (X, UD ) (26)
suggests that each applicant's department choice (D) is, possibly, a function of their gender
X, regardless of the specific form of how this happens in reality. If that is the case, then
the causal diagram G will contain the arrow X → D. Again, an arrow in G does not commit
to how the variables X and D interact, which is significantly less informative than the true
mechanism given by Eq. 5. Continuing the causal modelling process, the data scientist may
think about the admissions process, and consider that
Y ← fY (X, D, UY ), (27)
which represents that how admission decisions come about may be influenced by gender and
department choice. If that is the case, the causal diagram G will also contain the arrows
X → Y and D → Y , respectively. Again, this stands in sharp contrast with how detailed
the knowledge is presented in the true SCM M, and, for instance, as delineated in Eq. 6.
Interestingly enough, an entirely different functional form than that in Eq. 6, say

Y ← 1(UY < 0.1 + βXD),    (28)
is also compatible with the causal diagram in Fig. 1.
Lastly, if the coefficient α is equal to 0 in the mechanism described by Eq. 6 (i.e.,
Y ← 1(UY < 0.1 + αX + βD)), this would still be compatible with the causal diagram G.
Again, the arrow allows for the possibility of functional dependence, but does not necessitate
it. 
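
For readers who prefer code, the qualitative content of a causal diagram such as Fig. 3 can be captured by listing, for each endogenous variable, its possible functional arguments; the dictionary below is a hypothetical encoding for illustration, not an object defined in the paper:

# Parent sets of the Admissions diagram (Fig. 3); functional forms are deliberately absent.
parents = {
    "X": [],           # Gender has no endogenous parents
    "D": ["X"],        # X -> D: department choice may listen to gender
    "Y": ["X", "D"],   # X -> Y and D -> Y: admission may listen to both
}

A missing entry (e.g., "Y" not appearing among the parents of "D") encodes the assumption that the corresponding mechanism certainly does not use that variable as an argument.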

2.3.1 Standard Fairness Model


Specifying the relationship among all pairs of variables, as required by the definition of a
causal diagram, is possibly non-trivial in many practical settings. In this section, we will
introduce the Standard Fairness Model, which is a template-like model that represents a
collection of causal diagrams and aims to alleviate the modeling requirements.


Definition 7 (Standard Fairness Model (SFM)) The standard fairness model (SFM) is the causal diagram GSFM over endogenous variables {X, Z, W, Y} and given by

[the SFM causal diagram GSFM over the nodes Z, X, W, and Y]
where the nodes represent:
• the protected attribute, labelled X (e.g., gender, race, religion),
• the set of confounding variables Z, which are not causally influenced by the attribute
X (e.g., demographic information, zip code),
• the set of mediator variables W that are possibly causally influenced by the attribute
(e.g., educational level, or other job related information),
• the outcome variable Y (e.g., admissions, hiring, salary).
Nodes Z and W are possibly multi-dimensional or empty. Furthermore, for a causal diagram
G, the projection of G onto the SFM is defined as the mapping of the endogenous variables
V appearing in G into four groups X, Z, W, Y , as described above. The projection is denoted
by ΠSFM (G) and is constructed by choosing the protected attribute, the outcome of interest,
and grouping the confounders Z and mediators W .
For simplicity, we assume X to be binary (whereas Z, W , and Y could be either discrete
or continuous). For instance, by setting Z = ∅ and W = {D}, the causal diagram of the
Admissions example can be represented by GSFM . To ground the definition further, consider
the following well-known example.
Example 5 (COMPAS (Larson et al., 2016)) The courts at Broward County, Florida,
use machine learning to predict whether individuals released on parole are at high risk of
re-offending within 2 years (Y ). The algorithm is based on the demographic information Z
(Z1 for gender, Z2 for age), race X (x0 denoting White, x1 Non-White), juvenile offense
counts J, prior offense count P , and degree of charge D. The causal diagram for this setting
is shown in Fig. 4a. The bidirected arrows between X and Z1 , Z2 indicate that the exogenous
variable UX possibly shares information with exogenous variables UZ1 , UZ2 . This diagram
can be standardized (projected on the SFM) by grouping the mediators W = {J, P, D} and
confounders Z = {Z1 , Z2 }. Formally, the SFM projection can be written as
ΠSFM(G) = ⟨X = {X}, Z = {Z1, Z2}, W = {J, P, D}, Y = {Y}⟩.    (29)
The projection is shown in Fig. 4b. Notice that the full diagram G is not needed for de-
termining the SFM projection. The data scientist only needs to group the confounders and
mediators, and determine whether there is latent confounding between any of the groups.
Going back to Florida, after a period of using the algorithm, it is observed that Non-
White individuals are 9% more likely to be classified as high-risk, i.e.,
P (y | x1 ) − P (y | x0 ) = 9%. (30)


[Figure 4: (a) Causal diagram of the COMPAS dataset; (b) the diagram projected onto the SFM, with the Z-set {Z1, Z2} and the W-set {J, P, D}.]

The reader might wonder if the disparity of 9% means that racial minorities are discriminated against by the legal justice system in Broward County. An important consideration here is
how much of the disparity can be explained by the spurious association of race with age
or gender (which potentially influence the recidivism prediction), the effect of race on the
prediction mediated by juvenile and prior offense counts, or the direct effect of race on the
prediction. 
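
For illustration, the SFM projection of Eq. 29 can be written as a simple grouping of the COMPAS variables; the field names below are taken from Example 5, while the data structure itself (and the exact variable names) are hypothetical:

# Sketch of the SFM projection Pi_SFM(G) for the COMPAS setting (Eq. 29).
sfm_projection = {
    "X": ["race"],                                              # protected attribute
    "Z": ["gender", "age"],                                     # confounders Z1, Z2
    "W": ["juvenile_count", "prior_count", "charge_degree"],    # mediators J, P, D
    "Y": ["two_year_recidivism"],                               # outcome
}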

As noted in the example, the SFM does not explicitly assume the causal structure within
the possibly multi-dimensional sets Z, W . In causal language, the SFM can be seen as an
equivalence class of causal diagrams9 . For instance, under the SFM, if Z = {Z1 , Z2 }, the
relationship between Z1 and Z2 is not fully specified, and it may be the case that Z1 → Z2 ,
Z2 → Z1 , or of another type. Secondly, the SFM encodes assumptions about lack of hidden
confounding, which is reflected through the absence of bidirected arrows between variable
groups. We discuss in Appendix B how the lack of confounding assumptions can be relaxed.

3. Foundations of Causal Fairness Analysis


In this section, we will introduce two main results that will allow us to understand and
possibly solve the problem of fairness using causal tools. First, we will introduce in Sec. 3.1 a
structural definition of fairness, which leads to a natural way of expressing legal requirements
based on the doctrines of disparate treatment and impact. In particular, we will define the
notion of fairness measure and two key properties called admissibility and decomposability.
Armed with these new notions, we will then be able to formally state the fundamental
problem of causal fairness analysis. In words, these results suggest that reasoning about
fairness requires an understanding of how to explain variations, in particular, how the
outcome variable Y can be explained in terms of the structural measures following variations
of the protected attribute X. In Sec. 3.2, we formalize the notion of a contrast, which allows
us to understand the aforementioned variations from a factual-counterfactual perspective.
We then prove how to decompose contrasts and re-express them in terms of the structural
basis, which lead to the explainability plane and the decomposition of arbitrary types of
9. A more detailed study on the properties of clustered diagrams can be found in (Anand et al., 2021).


contrast. The discussion is somewhat theoretical and we will provide examples to ground
and make the main points more concrete.

Example 6 (College’s admissions, inspired by (Bickel et al., 1975)) During the pro-
cess of application to undergraduate studies, prospective students choose a department to
which they want to join (D), report their gender X (x0 female, x1 male), and after a
certain period they receive the admission decisions Y (y1 accepted, y0 rejected).
In reality, how applicants pick their department (fD) and how the university decides on who to admit (fY) is represented by the SCM M∗ = ⟨V = {X, D, Y}, U = {UX, UD, UY}, F∗, P∗(U)⟩, where the pair ⟨F∗, P∗(U)⟩ is such that

F∗, P∗(U) :   X ← Bernoulli(0.5)    (31)
              D ← Bernoulli(0.5 + (2/10) X)    (32)
              Y ← Bernoulli(0.1 + 0 · X + (7/10) D).    (33)

Based on data that it made available from the previous admissions’ cycle, the school is sued
by a group of applicants who allege gender discrimination. In particular, they share with
the court the following statistics:

P (y | x1 ) − P (y | x0 ) = 14%, (34)

which seems a devastating piece of evidence against the university. In words, it seems that
male candidates are 14% more likely to be admitted than their female counterparts. The
natural question that arises is what could explain such a disparity in the observed data?
Would this be a textbook case of direct, gender-discrimination?
Despite the fact that the court does not have access to the true M∗ , in reality, there is
no direct discrimination at all since fY (Eq. 33) does not take gender into account (note the
zero coefficient multiplying X). In fact, female applicants are more likely to apply to arts
& humanities departments, which have lower admission rates, in turn causing a disparity
in the overall admission rates.
The plaintiffs hire a team of (evil) data scientists that conduct their own study. After
some time, the team comes back and claims to have understood the university decision-
making process after a series of interviews and research, which is given by SCM M′ = ⟨V = {X, D, Y}, U = {UX, UD, UY}, F′, P′(U)⟩, where ⟨F′, P′(U)⟩ are such that

F′, P′(U) :   X ← Bernoulli(0.5)    (35)
              D ← Bernoulli(0.5 + (2/10) X)    (36)
              Y ← Bernoulli(0.1 + (14/100) · X + 0 · D).    (37)

The only difference between M∗ (the true set of mechanisms) and M′ (the hypothesized one)
is fY . Interestingly enough, the hypothesized fY (Eq. 37) takes gender (X) into account
while discarding any information about applicants’ department choices (D). Clearly, if this
was indeed the true decision-making process by which the university selects students, the jury
should condemn the university, since that would be a blatant case of direct discrimination.



Interestingly, both SCMs M∗ and M′ generate the same total variation of 14%. Still, M∗, which is the true generating model, does not suggest any type of gender discrimination, while M′, which is false, suggests that the university's admissions decisions are purely based on gender. In summary, SCMs M∗ and M′ are qualitatively different (in the sense that the disparity is transmitted along different causal mechanisms), but they are indistinguishable based on TV. We next formalize this issue in more generality.
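
The indistinguishability of M∗ and M′ at the level of the TV can also be seen numerically; the snippet below is an illustrative sketch that reuses the sample_scm helper from Example 1, with (α, β, λ) = (0, 7/10, 2/10) for M∗ and (14/100, 0, 2/10) for M′:

# Both parameterizations induce (approximately) the same observed disparity of 14%.
for alpha, beta in [(0.0, 0.7), (0.14, 0.0)]:                    # M* vs. M'
    x, d, y = sample_scm(n=2_000_000, alpha=alpha, beta=beta, lam=0.2)
    print(round(y[x == 1].mean() - y[x == 0].mean(), 3))         # ~0.14 in both cases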

3.1 Structural Fairness Criteria


To understand the issue discussed in the previous section, we start by noting that qualitative
distinctions – such as differentiating direct and indirect discrimination – lie at the heart of
some of the most important legal doctrines on discrimination. In particular, the doctrine
of disparate treatment asks the question on whether a different decision would have been
reached for an individual, had she/he been of a different race or gender, while keeping all
other attributes the same (Barocas and Selbst, 2016). In causal terminology, the question
is about disparities transmitted along the direct causal mechanism between the attribute
X and the outcome Y . On the other hand, the doctrine of disparate impact considers
situations in which a facially neutral policy (that does not use race or gender explicitly)
results in very different outcomes for racial or gender groups (Rutherglen, 1987). In this
case, the concern is also with disparities transmitted along indirect and spurious causal
mechanisms. Motivated by these legal doctrines, we can mathematically define qualitative
assessments about discrimination based on an SCM:

Definition 8 (Structural Fairness Criterion) Let Ω be a space of SCMs. A structural criterion Q is a binary operator on the space Ω, that is, a map Q : Ω → {0, 1} that determines whether a set of causal mechanisms between X and Y exists or not, in a given SCM M ∈ Ω.

For most of the manuscript, we wish to focus on structural criteria that capture direct,
indirect, and spurious discrimination. We consider these criteria as elementary. More refined and detailed structural notions are discussed in Sec. 6. We now formally define the
three elementary structural fairness criteria, based on the functional relationships between
X and Y encoded in an SCM:
Definition 9 (Elementary Structural Fairness Criteria) Let pa(Vi) and an(Vi) be the parents and ancestors of Vi in the causal diagram G, respectively. For an SCM M, define the
following three structural criteria:
(i) Structural direct criterion:

Str-DEX (Y ) = 1(X ∈ pa(Y )).

(ii) Structural indirect criterion:

Str-IEX (Y ) = 1(X ∈ an(pa(Y ))).

(iii) Structural spurious criterion:

Str-SEX(Y) = 1[(UX ∩ an(Y) ≠ ∅) ∨ (an(X) ∩ an(Y) ≠ ∅)].


For Str-DEX(Y) = 0, Str-IEX(Y) = 0, and Str-SEX(Y) = 0, we write DE-fairX(Y), IE-fairX(Y), and SE-fairX(Y), respectively.
In words, the structural direct criterion verifies whether the attribute X is an argument of the mechanism fY, that is, whether Y is a function of X. The structural indirect criterion verifies
whether there exist mediating variables, which are affected by X, that in turn influence
Y . These two criteria are defined in terms of the functional relationships within M, or F.
This means that they convey causal information about the relationship among endogenous
variables. Finally, the structural spurious criterion verifies whether there exist variables that
both causally affect the attribute X and the outcome Y . Different than the previous ones,
this criterion relies on the relationships among the exogenous variables U , which relates to
the confounding relation among the observables.
We revisit the Admissions example to ground such notions:
Example 7 (Admissions continued) In the SCM M defined in Eq. 1-3, the structural
direct and indirect effects can be analyzed as follows:
(i) Y is fair w.r.t. X in terms of direct effect if and only if:
α = 0 in {Y ← Bernoulli(0.1 + αX + βD)}. (38)

(ii) Y is fair w.r.t. X in terms of indirect effect if and only if:


λ = 0 in {D ← Bernoulli(0.5 + λX)}, or
β = 0 in {Y ← Bernoulli(0.1 + αX + βD)}.    (39)

For the SCM M∗ in Eq. 31-33, we can see that direct discrimination does not exist, since α = 0, and therefore X ∉ pa(Y) (see Def. 9(i)). However, indirect discrimination is present, since λ = 2/10 and β = 7/10, and therefore X ∈ an(pa(Y)) (see Def. 9(ii)). In contrast to this, for the SCM M′ in Eq. 35-37, direct discrimination is present, since α = 14/100 and thus X ∈ pa(Y), but indirect discrimination is not, since β = 0 and thus X ∉ an(pa(Y)).
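
Given only the parent sets of a diagram (e.g., the hypothetical parents dictionary sketched after Example 4), the two causal criteria of Def. 9 can be checked mechanically at the level of the diagram; the functions below are an illustrative sketch (the spurious criterion additionally requires the bidirected/exogenous structure, which this sketch omits):

def ancestors(node, parents):
    # Strict endogenous ancestors of a node, given a dict mapping each node to its parents.
    seen, stack = set(), list(parents.get(node, []))
    while stack:
        p = stack.pop()
        if p not in seen:
            seen.add(p)
            stack.extend(parents.get(p, []))
    return seen

def str_de(parents, x="X", y="Y"):
    return int(x in parents.get(y, []))                                       # X in pa(Y)

def str_ie(parents, x="X", y="Y"):
    return int(any(x in ancestors(p, parents) for p in parents.get(y, [])))   # X in an(pa(Y))

For the Admissions diagram, both str_de(parents) and str_ie(parents) return 1, since the paths X → Y and X → D → Y are present in the diagram; whether the corresponding mechanisms are actually active in the true SCM depends on α, λ, and β, as discussed in Example 7.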
Other meaningful structural fairness criteria could be defined using different logical combi-
nations of these three elementary criteria. For instance, Y can be called totally fair with
respect to X (FairX (Y )) if and only if direct, indirect, and spurious fairness are simulta-
neously true (i.e., FairX (Y ) = DE-fairX (Y ) ∧ IE-fairX (Y ) ∧ SE-fairX (Y )). Alternatively,
causal fairness could be defined as Causal-fairX (Y ) = DE-fairX (Y ) ∧ IE-fairX (Y ), which
encodes the non-existence of active causal influence from X to Y (neither direct nor medi-
ated).
These definitions of structural fairness represent idealized and intuitive criteria that can
be evaluated whenever the true underlying mechanisms are known, i.e., the fully specified
SCM M. The importance of these measures, encoded through the structural mechanisms
(Def. 9), stems from the fact that they underpin existing legal and societal notions of
fairness. Therefore, they will be used as a benchmark to understand under what conditions,
and how close other measures, which might be estimable from data, approximate these
idealized and intuitive notions.
One central question is whether there exist quantitative measures of discrimination that
can help us assess whether a structural criterion is satisfied or not. Firstly, we define a
general fairness measure that can be computed from the SCM:


Definition 10 (Fairness Measure) Let Ω be a space of SCMs. A fairness measure µ is a functional on the space Ω, that is, a map µ : Ω → R, which quantifies the association of
X and Y through any subset of causal mechanisms, in a given SCM M ∈ Ω.

Here, the definition of a fairness measure µ is kept quite general. In Sec. 3.2, we will
restrict our attention to a specific class of measures µ and explain their importance in the
context of Causal Fairness Analysis. In the sequel, we introduce a notion that represents
when a fairness measure µ is suitable for assessing a structural criterion Q:

Definition 11 (Admissibility) Let Ω be a class of SCMs on which a structural criterion Q and a measure µ are defined. A measure µ is said to be admissible w.r.t. the structural criterion Q within the class of models Ω, or (Q, Ω)-admissible, if:

∀M ∈ Ω : Q(M) = 0 =⇒ µ(M) = 0. (40)

For simplicity, we will use admissibility instead of (Q, Ω)-admissibility whenever the context
is clear. The importance of having an admissible measure µ stems from the contrapositive
of Eq. 40, namely, if µ(M) can be measured or evaluated and µ(M) ≠ 0, this means that
the structural measure must be true, i.e., Q(M) = 1. In other words, the measure µ will act
as a link between the well-defined but unobservable structural measure and the observable
and estimable world. For concreteness, consider the following result that formalizes the
issue found in Example 6:

Lemma 1 (TV is not admissible w.r.t. Str-DE, IE, SE) Let Ω be the space of Semi-
Markovian SCMs which contain variables X and Y . Let µ be the total variation measure
TVx0 ,x1 (y). Then µ is not admissible with respect to structural direct, indirect, or spurious
criteria. That is,

(Str-DE(M) = 0) ⇏ (TVx0 ,x1 (y) = 0),    (41)
(Str-IE(M) = 0) ⇏ (TVx0 ,x1 (y) = 0),    (42)
(Str-SE(M) = 0) ⇏ (TVx0 ,x1 (y) = 0).    (43)

In fact, the reason why the TV measure is not admissible with respect to structural direct,
indirect, and spurious criteria is because it captures the three types of variations together.
To formalize this idea, we introduce the notion of decomposability of a measure µ, i.e.:

Definition 12 (Decomposability) Let Ω be a class of SCMs and µ be a measure defined


over it. µ is said to be Ω-decomposable if there exist measures

µ1 , . . . , µk such that µ = f (µ1 , . . . , µk ), (44)

and where f is a non-trivial function vanishing at the origin, i.e., f (0, . . . , 0) = 0.

In words, decomposability states that a measure µ can be written as a function of measures


µ1 , . . . , µk , and that if all measures µ1 , . . . , µk are equal to 0 for an SCM M, then the measure µ
must be 0 as well. For concreteness, consider the following examples.
Example 8 (Covariance decomposition, after (Zhang and Bareinboim, 2018c))


Let µ be the covariance measure between random variables X and Y ,

Cov(X, Y ) = E[XY ] − E[X]E[Y ], (45)

which plays a role somewhat analogous to TV (and, more broadly, the observational distri-
bution) whenever the system F and P (U ) are linear and Gaussian. Further, let the causal
covariance be defined as

Covcx (X, Y ) = Cov(X, Y − Yx ). (46)

Furthermore, let the spurious covariance be defined as

Covsx (X, Y ) = Cov(X, Yx ). (47)

Then, we can write

Cov(X, Y ) = f (Covcx (X, Y ), Covsx (X, Y )),    (48)

with the function f (a, b) = a + b, which satisfies f (0, 0) = 0. 
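
To make this concrete, the following minimal numerical sketch (not part of the original text; the linear Gaussian SCM, its coefficients, and the sample size are illustrative assumptions) checks Eq. 48 by simulating a model in which Z confounds X and Y and computing Yx directly from the mechanisms:

    import numpy as np

    rng = np.random.default_rng(0)
    n = 1_000_000

    # Illustrative linear Gaussian SCM: Z is a common cause of X and Y.
    z = rng.normal(size=n)
    x = 0.8 * z + rng.normal(size=n)          # X <- f_X(Z, U_X)
    u_y = rng.normal(size=n)
    y = 0.5 * x + 0.7 * z + u_y               # Y <- f_Y(X, Z, U_Y)

    # Counterfactual Y_x: set X to a fixed value x0 for every unit,
    # keeping Z and U_Y (the unit's identity) unchanged.
    x0 = 0.0
    y_x0 = 0.5 * x0 + 0.7 * z + u_y

    cov_total = np.cov(x, y)[0, 1]              # Cov(X, Y)
    cov_causal = np.cov(x, y - y_x0)[0, 1]      # Cov^c_x(X, Y) = Cov(X, Y - Y_x)
    cov_spurious = np.cov(x, y_x0)[0, 1]        # Cov^s_x(X, Y) = Cov(X, Y_x)

    print(cov_total, cov_causal + cov_spurious)  # should agree up to sampling noise

Both printed numbers should coincide up to Monte Carlo error, illustrating how the causal and spurious covariances jointly account for the observational covariance.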

Armed with the definitions of admissibility and decomposability, we are ready to formally
define the first version of the problem studied here.

Definition 13 (Fundamental Problem of Causal Fairness Analysis (preliminary))


Consider a class of SCMs Ω, and let

• Q1 , Q2 , ..., Qk be a collection of structural fairness criteria, and

• µ be a measure,

both defined over Ω. The Fundamental Problem of Causal Fairness Analysis is to find a
collection of measures µ1 , . . . , µk such that the following properties are satisfied:

(1) µ is decomposable w.r.t. µ1 , . . . , µk ;

(2) µ1 , . . . , µk are admissible w.r.t. the structural fairness criteria Q1 , Q2 , ..., Qk .

In other words, find measures

µ1 , . . . , µk that are admissible w.r.t. Q1 , . . . , Qk , (49)

respectively, and such that

µ = f (µ1 , . . . , µk ), (50)

where f is a non-trivial function vanishing at the origin, i.e., f (0, . . . , 0) = 0. 


For grounding this discussion, we will consider that the measure µ is given by the TV10 and
the structural measures will be Str-{DE, IE, SE}. We refer to this instance of the problem by
FPCFA(Str-{DE,IE,SE}, TVx0 ,x1 (y)). Fig. 5 provides a visual summary of the FPCFA, where
TV is shown on the top and the structural measures Str-{DE, IE, SE} on the bottom. As just
seen in Lem. 1, TV is not admissible relative to each of these structural measures.
The FPCFA asks for the existence of a set of measures (µDE , µIE , µSE ) that could act as a
bridge between TV and the more meaningful, albeit unobservable, structural measures
Str-{DE, IE, SE}. In fact, the FPCFA is solved whenever TV can be expressed in terms of
(µDE , µIE , µSE ), and each of these measures is admissible w.r.t. the corresponding structural
measure. If that is the case, the measures (µDE , µIE , µSE ) can be seen as explaining the
variations of TV in terms of the most elementary, structural components. Interestingly, this is
both a quantitative and a qualitative exercise. From TV's perspective, the measures
µ1 , . . . , µk should account for all its variations, which is naturally a quantitative exercise. From
the structural measures' perspective, we would like to enforce soundness, namely, that
discrimination is indeed readable from the corresponding µ1 , . . . , µk , which is a qualitative
exercise.

Figure 5: Fundamental Problem of Fairness Analysis (TV version): the TV (top) should be
decomposable into measures µSE , µDE , µIE , each admissible w.r.t. the corresponding structural
criterion Str-SE, Str-DE, Str-IE of the SCM M∗ (bottom).

3.2 Explaining Factual & Counterfactual Variations


In this section, the main task is studying how the variations in outcome Y can be explained
by changes of the protected attribute X. The result of this study is what we call the
population-mechanism plane, which we also refer to as the explainability plane (Fig. 7).
The methodology introduced by the plane will allow us to re-express different measures of
fairness in a unified manner, which will facilitate their comparison in terms of admissibility,
decomposability, and possibly other desirable properties.
We start by introducing a quite general type of measure encoding the idea of contrast.

Definition 14 (Contrast) Given a SCM M, a contrast C is any quantity of the form

C(C0 , C1 , E0 , E1 ) = E[yC1 | E1 ] − E[yC0 | E0 ], (51)

where E0 , E1 are observed (factual) clauses and C0 , C1 are counterfactual clauses to which
the outcome Y responds. Furthermore, whenever

(a) E0 = E1 , the contrast C is said to be counterfactual;

(b) C0 = C1 , the contrast C is said to be factual.

10. Naturally, other types of contrasts can be used as measures instead of TV, such as the covariance (Zhang
and Bareinboim, 2018c) or equality of odds (Hardt et al., 2016; Zhang and Bareinboim, 2018a).


For simplicity11 , we will focus on the binary case, in which a contrast can be written as

P (yC1 | E1 ) − P (yC0 | E0 ). (52)

The purpose of a contrast is to compare the outcome of individuals who coincide with the
observed event E1 in the factual world and whose values were intervened on (possibly coun-
terfactually) as defined by C1 , against individuals who coincide with the observed event
E0 in the factual world and whose values were intervened on (possibly counterfactually)
as defined by C0 . The definition also distinguishes two special cases of contrasts. A coun-
terfactual contrast captures only the difference in outcome induced by the difference in
interventions C0 , C1 (since E0 = E1 ). Complementary to this, a factual contrast captures
only the difference induced by the observed events E0 , E1 (since C0 = C1 ). We now show
why contrasts are useful for explaining variations:

Theorem 1 (Contrast’s Decomposition & Structural Basis Expansion) Given an SCM
M, let C be a contrast P (yC1 | E1 ) − P (yC0 | E0 ). C can be decomposed into its counterfactual
and factual variations, namely:

[P (yC1 | E1 ) − P (yC0 | E1 )] + [P (yC0 | E1 ) − P (yC0 | E0 )],    (53)

where the first bracketed term is a counterfactual contrast and the second a factual contrast.
Furthermore, the corresponding counterfactual and factual contrasts admit the following
structural basis expansions, respectively:

(a) Counterfactual contrast (Cctf ), where E0 = E1 = E, can be expanded as

P (yC1 | E) − P (yC0 | E) = Σu [yC1 (u) − yC0 (u)] P (u | E),    (54)

where yC1 (u) − yC0 (u) is a unit-level difference and P (u | E) is the posterior over units;

(b) Factual contrast (Cfactual ), where C0 = C1 = C, can be expanded as

P (yC | E1 ) − P (yC | E0 ) = Σu yC (u) [P (u | E1 ) − P (u | E0 )],    (55)

where yC (u) is a unit outcome and P (u | E1 ) − P (u | E0 ) is the posterior difference.

The decomposition and structural basis expansion of contrasts presented in this theorem
entail a fundamental connection of causal fairness measures with structural causal mod-
els. In particular, the decomposition given in Eq. 53 allows us to disentangle factual and
counterfactual variations within any contrast.
We note that Eqs. 54 and 55 re-express the variations within the target quantity in
terms of the underlying units and activated mechanisms, as referenced by the SCM. We
would like to understand these qualitatively different types of variations separately.
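
The sampling-evaluation reading of Def. 14 and Thm. 1 can be sketched in code. Everything below, including the toy SFM-style SCM over {X, Z, W, Y}, its coefficients, and the helper names evaluate and contrast, is an illustrative assumption rather than part of the paper: a contrast is estimated by drawing units u, evaluating the mechanisms under the counterfactual clauses C0, C1, and restricting attention to units satisfying the factual events E0, E1.

    import numpy as np

    rng = np.random.default_rng(0)
    n = 2_000_000

    # Step 1 (sampling): draw the exogenous variables U for n units.
    u_z, u_x, u_w, u_y = rng.random((4, n))

    def evaluate(x_for_w=None, x_for_y=None):
        """Step 2 (evaluation): push every unit through the mechanisms F.
        x_for_w / x_for_y override the value of X seen by W / by Y, which
        allows nested counterfactuals such as y_{x1, W_{x0}}."""
        z = (u_z < 0.5).astype(int)
        x = (u_x < 0.3 + 0.4 * z).astype(int)
        xw = x if x_for_w is None else x_for_w
        w = (u_w < 0.2 + 0.5 * xw).astype(int)
        xy = x if x_for_y is None else x_for_y
        y = (u_y < 0.1 + 0.3 * xy + 0.4 * w + 0.1 * z).astype(int)
        return x, z, w, y

    # Factual run, used to evaluate the observed events E0, E1.
    x_f, z_f, w_f, y_f = evaluate()

    def contrast(c1, c0, e1, e0):
        """Estimate P(y_{C1} | E1) - P(y_{C0} | E0); c = (x_for_w, x_for_y),
        e = boolean mask over units encoding the factual event."""
        y1 = evaluate(*c1)[3]
        y0 = evaluate(*c0)[3]
        return y1[e1].mean() - y0[e0].mean()

    # Example: the x-specific total effect x-TE_{x0,x1}(y | x0) = C(x0, x1, x0, x0).
    ett = contrast(c1=(1, 1), c0=(0, 0), e1=(x_f == 0), e0=(x_f == 0))
    print("x-TE (ETT) estimate:", round(ett, 3))

Nested clauses such as {x1 , Wx0 } correspond to letting W and Y see different values of X, e.g., evaluate(x_for_w=0, x_for_y=1) in this sketch.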

11. The results in this section hold for any real-valued random variable Y .


First, we will take a generative interpretation of how the targeted variations are realized in
terms of the SCM M = ⟨V, U, F, P (U )⟩. Fig. 6 illustrates the two-step generative process,
which goes as follows:

(1) Sampling: A unit U = u is sampled from the population distributed according to P (U );

(2) Evaluation: This unit u passes through the sequence of mechanisms F, in causal order,
until the values of the endogenous variables V are realized.

Figure 6: Two-step generative process that includes sampling of a unit from the population
(left), and evaluating it against the corresponding structural mechanisms (right).

The l.h.s. of the figure shows the sampling process while the r.h.s. represents the evaluation
process. As discussed in Sec. 2.2, if the system is not submitted to an intervention, this leads
to the observational distribution. On the other hand, if the values of certain variables are
fixed through intervention, this leads to the corresponding counterfactual distribution.
Considering this two-step generative process, we re-examine the variations encoded in
the structural basis expansion of Thm. 1. For convenience, we reproduce the equation
relative to the counterfactual variations in the sequel (Eq. 54):
P (yC1 | E) − P (yC0 | E) = Σu [yC1 (u) − yC0 (u)] P (u | E),

where the first factor under the sum is the unit-level difference and the second is the posterior.

First, we consider the second factor in the r.h.s. of the expression. Note that P (u | E = e)
represents the first step in the generative process in which units who naturally arise to
value E = e are drawn from the population. In fact, depending on the granularity of the
evidence E, a different fraction of the population (or types of individuals) will be selected.
For instance, if E = {}, the (posterior) distribution P (u) is somewhat uninformative, and
represents an average when units are drawn at random from the underlying population,
regardless of their predispositions and characteristics. On the other hand, if E = {X = x},
the posterior distribution P (u | x) would be more informative since it now includes units
that naturally would have X = x. Naturally, this is less informative compared to more
specific events such as E = {X = x, Z = z} or E = {X = x, Z = z, W = w, Y = y}. In
fact, the l.h.s. of the figure illustrates this increasingly more refined and informative set
of events E, i.e., starting from picking individuals at random from the general population,
P (u), to a single individual δu , where δu is the Dirac delta function. Second, we note that
once the unit U = u is selected, all randomness vanishes, and the unit will go through
the set of mechanisms F. The first factor of the expression, yC1 (u) − yC0 (u), describes the
difference in response y between conditions C1 and C0 for a fixed realization of exogenous
variables u. As realizations of exogenous variables U are indices for the different identities
of units in the population, the quantity yC1 (u) − yC0 (u) will be a unit-level quantity.
In the context of fairness discussed here, consider the case when C1 = x1 and C0 = x0 ,
which could represent the protected attribute, for instance, males and females, or White
and African-American. The quantity yx1 (u) − yx0 (u) measures what the change in outcome


Y would be when changing the attribute X from x0 to x1 , for a specific unit u. For this
particular choice of C0 , C1 , the quantity captures what is known as the total causal effect of
X on Y , that is it includes all the variations from X to Y translated across causal pathways.
In summary, any counterfactual contrast Cctf can be decomposed into two parts:

1. A unit-level difference comparing the counterfactual worlds C1 vs. C0 for a specific
unit U = u. This quantity is determined by the causal mechanisms F of the SCM,
and does not depend on the distribution P (u).

2. A posterior distribution P (u | E = e) that indicates the probability mass assigned


to unit u whenever the event E = e. By changing the granularity of the event
E, the space of included units is restricted, making the measure more specific to a
subpopulation (see Fig. 6 (l.h.s.)).

Given that the selection of units is fixed (second factor), and the only thing that varies
is the selection of the mechanisms (first factor) through the choices of the counterfactual
conditions C1 and C0 , this will generate variations downstream, so they will be inherently
“causal”. In fact, the specific instantiation of C1 and C0 and E = {} (i.e., P (U )) matches
the very definition of the average causal effect, P (y|do(x1 )) − P (y|do(x0 )).
We now re-examine the factual variations encoded in the structural basis expansion of
Thm. 1. For convenience, we reproduce the corresponding equation (Eq. 55):
P (yC | E1 ) − P (yC | E0 ) = Σu yC (u) [P (u | E1 ) − P (u | E0 )],

where yC (u) is the unit outcome and P (u | E1 ) − P (u | E0 ) is the posterior difference.

In words, a factual contrast can be expanded as a sum of differences in the posteriors
P (u | E1 ) − P (u | E0 ), weighted by unit-level outcomes yC (u). We note that the difference
in posteriors represents the first step in the generative process in which two sets of
units who naturally arise to values E1 and E0 are drawn from the population, respectively.
Similarly to the previous discussion, different sub-populations will be selected depending
on the granularity of the evidence E1 , E0 . The scope of these events is the same but their
instantiations are different.
This can be seen as complementary when compared to the counterfactual contrasts.
Given that the mechanisms are fixed (first factor), the component that generates variations
is relative to the choice of units based on the factual conditions E0 and E1 . We suggest
this will generate upstream variations, which will be somewhat “non-causal” (also called
spurious), as described in more detail later in the manuscript. Still, for
instance, we are mostly interested in setting C1 = C0 = x, so that X = x along all causal
pathways. The contrast then will capture the difference in probability mass assigned to u in
events E1 and E0 . By definition, spurious effects are generated by variations that causally
precede X, so these cannot be captured by intervening on X. For this reason, we need
to compare events E1 and E0 , which have resulted in a different instantiation of the value
of X. This factorization also suggests mathematically how causal and spurious effects are
inherently different from each other.


Explainability plane. By decomposing variations via factual and counterfactual contrasts,
and expanding them using the structural basis, we can give the essential structure of the
measures used in Causal Fairness Analysis. The approach used for decomposing the total
variation is shown in Fig. 7, which we call the explainability plane.
As the figure illustrates, there are two separate axes of the decomposition. On the
mechanism axis, we are decomposing the TV into its direct, indirect, and spurious variations.
On the population axis, we are considering increasingly precise subsets of the space of units
U, which correspond to different posterior distributions. As we will see later, moving along
the population axis will correspond to constructing increasingly more powerful fairness
measures.

Figure 7: In the population axis, contrasts are restricted to smaller subsets of units u in the
domain U. At the same time, along the mechanism axis, we distinguish between direct,
indirect, and spurious variations.

4. TV family
In this section, we introduce a family of measures that populate the explainability plane in
Fig. 7. Since all the measures describe variations included within the TV measure, we refer
to them as the TV family (part e of Fig. 2). In particular, this section aims to explicitly
solve the FPCFA(Str-{DE,IE,SE}, TVx0 ,x1 (y)) discussed in Sec. 3.

4.1 Solving the Fundamental Problem of Causal Fairness Analysis


The measures in the TV-family are introduced in order. We start with measures that
quantify discrimination in the entire population of units u (corresponding to posterior P (u)),
and reach measures that quantify discrimination for a single unit u (corresponding to the
posterior δu , where δ is the Dirac delta function).

4.1.1 Population level Contrasts - P (u)


We first recall that the TV measure itself is not admissible with respect to structural
criteria Str-{DE,IE,SE}, as shown in Lem. 1. Specifically, the reason for this is that the TV
captures variations between groups generated by any mechanism of association, both causal
and non-causal, and does not distinguish them. Our first step is therefore to disentangle
these variations – the causal and non-causal (or spurious) – within the TV.

Definition 15 (Total and spurious effects) Let the total effect and experimental spuri-
ous effect be defined as follows:

TEx0 ,x1 (y) = P (yx1 ) − P (yx0 ) (56)


Exp-SEx (y) = P (yx ) − P (y | x) (57)


Further, we write TE-fairX (Y ) whenever TEx0 ,x1 (y) = 0, or simply TE-fair when X and
Y are clear from the context. Exp-SE-fair is defined analogously.
In words, TE measures the difference in the outcome Y when setting X = x1 , compared
to setting X = x0 . The measure can be visualized graphically as shown in Fig. 8a. In this
case, Y responds to the change in X from x0 to x1 through two mechanisms. In fact, Y
variations in response to change in X are realized through (i) the direct link, X → Y , and
(ii) through the indirect link via W , X → W → Y . In the context of the COMPAS dataset
in Ex. 5, the total effect would be the average difference in recidivism prediction had an
individual’s race been White compared to had it been Non-White. Since the covariates
Z vary naturally in both counterfactual worlds (both sides of the expression), those are
cancelled out and Y variations can be explained in terms of the downstream variations in
response to the change purely on X 12 .
In a complementary manner, the experimental spurious effect measures the average
difference in outcome Y when X = x by intervention, counterfactually speaking, compared
to simply observing that X = x. As shown graphically in Fig. 8b, note that since from
Y ’s perspective X has the same value x in both factors, the Y variations can be explained
in terms of the upstream effect in response to how X naturally affected Z versus how Z
varies free from the influence of X. In the COMPAS dataset, this would mean the average
difference in recidivism prediction for individuals for whom the race is set to White by
intervention, compared to simply observing the race to be White.
Syntactically, following the discussion in Sec. 3.2, we can write these quantities in terms
of contrasts (Def. 14), namely:

TEx0 ,x1 (y) = C(x0 , x1 , ∅, ∅) (58)


Exp-SEx (y) = C(∅, x, x, ∅) (59)

Based on these two notions, the TV can be decomposed into two distinct sources of
variation, which correspond precisely to its causal and non-causal mechanisms:
Lemma 2 (TV decomposition I) The total variation measure can be decomposed as

TVx0 ,x1 (y) = TEx0 ,x1 (y) + (Exp-SEx0 (y) − Exp-SEx1 (y)). (60)

Lem. 2 shows that the TV measure equals the total effect on Y when X transitions from
x0 to x1 , plus the difference between the experimental spurious effects of X = x0 and X = x1
13 . In other words, TV accounts for the sum of the directed (causal) and confounding paths

12. The TE measure is also called causal effect and sometimes written in do-notation, P (y | do(x1 )) − P (y |
do(x0 )). Obviously, this quantity has well-defined semantics given a SCM, despite the fact that no one
intends or believes to set any of the protected attributes literally by intervention. Still, through the
formal language of causality, one can contemplate these distinct counterfactual realities. In particular,
one can disentangle and explain the sources of Y variations in response to changes in X, including the
ones through the causal pathways versus the non-causal ones, along the spurious paths.
13. An alternative way of interpreting this relation is by flipping TV and TE in the equation, namely:

TEx0 ,x1 (y) = TVx0 ,x1 (y) − (Exp-SEx0 (y) − Exp-SEx1 (y)). (61)

This means that the total effect of transitioning X from x0 to x1 on Y is equal to the corresponding
total variation of Y minus the the difference in spurious effects of the baseline X = x0 versus X = x1 .


Figure 8: Graphical representations of measures used in the TVx0 ,x1 (y) decomposition:
(a) total effect TEx0 ,x1 (y); (b) experimental spurious effect Exp-SEx (y); (c) natural direct
effect NDEx0 ,x1 (y); (d) natural indirect effect NIEx1 ,x0 (y).

from X to Y . More formally, the lemma shows that the TV satisfies decomposability with
respect to TE and Exp-SE.
Interestingly, the TE itself is still not admissible w.r.t. Str-{DE,IE}, as it captures all
causal influences of X on Y , including the direct (through the direct link X → Y ) and
indirect ones (i.e., paths via W ).

Lemma 3 (TE inadmissibility) The total effect measure TEx0 ,x1 (y) is not admissible
with respect to structural criteria Str-DE and Str-IE.

To solve FPCFA(Str-{DE,IE,SE}, TVx0 ,x1 (y)), therefore, we will further need to disentangle
the relationships within TE. In particular, we will need to determine the Y variations that
are a direct consequence of the protected attribute, and the ones that are mediated by other
variables. In the literature, the total effect was shown to be decomposable into the measures
known as the natural direct and indirect effects (Pearl, 2001).

Definition 16 (Natural direct and indirect effects) The natural direct and indirect
effects are defined, respectively, as follows:

NDEx0 ,x1 (y) = P (yx1 ,Wx0 ) − P (yx0 ) (62)


NIEx1 ,x0 (y) = P (yx1 ,Wx0 ) − P (yx1 ). (63)

Further, we write NDE-fairX (Y ) for NDEx0 ,x1 (y) = 0, or simply NDE-fair when the at-
tribute/outcome are clear from the context. The condition NIE-fair is defined analogously.

Several observations are worth making about these definitions. First, in terms of seman-
tics, the NDE captures the difference in Eq. 62, namely, how the outcome Y changes when
setting X = x1 , but keeping the mediators W at whatever value it would have taken had
X been x0 , compared to setting X = x0 by intervention. This counterfactual statement is


shown graphically in Fig. 8c. Note that Y “perceives” X through the direct link (marked
in blue) as if it is equal to x1 , written in counterfactual language as yx1 , while W perceives
X as if it is x0 , formally, Wx0 . Putting these two together leads to the first factor in Eq. 62,
i.e., yx1 ,Wx0 . The second factor in the contrast is yx0 , which can be written equivalently as
yx0 ,Wx0 , due to the consistency axiom. It represents the fact that both Y and W perceive
X at the same level, x0 14 . Whenever we subtract one from the other, in some sense, the
variations coming from X to Y through W are the same (since W perceives X at the baseline
level x0 ), and what remains are the variations transmitted through the direct arrows, hence
the name direct effect. The qualification natural is because W attains its value naturally,
depending on the value of X, but not by interventions.
Second, in the context of our COMPAS example, the NDE would measure how much
the predicted probability of recidivism would have changed for an individual whose race
was set by intervention to White, had their race been set to Non-White, but their juvenile
and prior offense counts took a value they would have attained naturally (that is, a value
naturally attained by White subjects). The contrast represented by the NDE (in Eq. 62)
is known as a nested counterfactual, since X has distinct values when considering different
variables. Albeit not realizable in the real world, it encodes significant types of variations
that can be evaluated from a collection of mechanisms and a fully specified SCM, and which
are sometimes computable from data, as discussed in more detail in Sec. B.1.
Third, the definition of NIE follows a similar logic while flipping the sources of variations,
as illustrated in Eq. 63 and Fig. 8d. More specifically, the outcome Y responds to X as
being x1 through the direct link in both factors of the contrast (yx1 ), which means that
no direct influence from X to Y is “active”. On the other hand, W responds to X when
varying from levels X = x1 to x0 , formally written as Wx1 versus Wx0 ; this, in turn, affects
Y , which formally is written as counterfactuals yx1 ,Wx1 versus yx1 ,Wx0 . 15 The NIE is also
a nested counterfactual. For the COMPAS example, the NIE would measure how much the
predicted probability of recidivism would have changed for an individual whose race was
White, had their race been Non-White along the indirect causal pathway influencing the
values of juvenile and prior offense counts.
Syntactically, and following the discussion in Sec. 3.2, we can put these observations
together and write the NDE and NIE as counterfactual contrasts (Eq. 54), namely: 16

14. For further discussion on counterfactuals, see (Pearl, 2000, Sec. 7.2) and (Bareinboim et al., 2022).
15. The first term yx1 ,Wx1 is equivalently written as yx1 , which follows from the consistency axiom (Pearl,
2000, Sec. 7.2).
16. Following prior discussion and reversing the usual simplification back, based on the application of the
consistency axiom, these contrasts can more explicitly be written as:

NDEx0 ,x1 (y) = C({x0 , Wx0 }, {x1 , Wx0 }, ∅, ∅) (64)


NIEx1 ,x0 (y) = C({x1 , Wx1 }, {x1 , Wx0 }, ∅, ∅). (65)

It’s evident when considering the NDE that the variations through the mediator W , Wx0 , coincide in
both sides of the contrast and end up cancelling out, which means that all remaining variations are due
to the direct change of X from x0 to x1 in the first component of the pair. On the other hand, the direct
variations in the NIE are both equal to X = x1 , which cancel out, and Y changes are in response to the
change in W , which varies differently depending on whether X = x1 and X = x0 , or Wx1 versus Wx0 .


NDEx0 ,x1 (y) = C(x0 , {x1 , Wx0 }, ∅, ∅) (66)


NIEx1 ,x0 (y) = C(x1 , {x1 , Wx0 }, ∅, ∅). (67)

The notions of NDE and NIE, together with Exp-SE, in fact provide the first solution
to the FPCFA(Str-{DE,IE,SE}, TVx0 ,x1 (y)), as shown in the next result.
Theorem 2 (FPCFA(Str-{DE,IE,SE}, TVx0 ,x1 (y)) solution (preliminary)) The to-
tal variation measure can be decomposed as

TVx0 ,x1 (y) = NDEx0 ,x1 (y) − NIEx1 ,x0 (y) + (Exp-SEx0 (y) − Exp-SEx1 (y)). (68)

Furthermore, the measures NDE, NIE, and Exp-SE are admissible with respect to Str-DE,
Str-IE, and Str-SE, respectively. More formally, we write

Str-DE-fair =⇒ NDE-fair (69)


Str-IE-fair =⇒ NIE-fair (70)
Str-SE-fair =⇒ Exp-SE-fair. (71)

Therefore, the measures (µDE , µIE , µSE ) = (NDEx0 ,x1 (y), NIEx1 ,x0 (y), Exp-SEx (y)) solve
the FPCFA(Str-{DE,IE,SE}, TVx0 ,x1 (y)).
After showing a solution to FPCFA(Str-{DE,IE,SE}, TVx0 ,x1 (y)), we make two important
remarks. Firstly, the measures discussed so far admit a structural basis expansion (Thm. 1)
and can be expanded as follows:
TVx0 ,x1 (y) = Σu y(u) [P (u | x1 ) − P (u | x0 )]    (72)
TEx0 ,x1 (y) = Σu [yx1 (u) − yx0 (u)] P (u)    (73)
Exp-SEx (y) = Σu yx (u) [P (u) − P (u | x)]    (74)
NDEx0 ,x1 (y) = Σu [yx1 ,Wx0 (u) − yx0 (u)] P (u)    (75)
NIEx1 ,x0 (y) = Σu [yx1 ,Wx0 (u) − yx1 (u)] P (u).    (76)

The factorization in the display above connects the measures to the sampling-evaluation
process discussed in Sec. 3.2, explaining the observed contrasts in terms of unit-level quanti-
ties. We revisit this point shortly. Secondly, one of the significant and practical implications
of Thm. 2 appears through the Eq. 69’s contrapositive (and Eqs. 70, 71), i.e.:

(NDEx0 ,x1 (y) ≠ 0) =⇒ ¬Str-DE-fair. (77)

Based on this, we have now a principled way of testing the following hypothesis:

H0 : NDEx0 ,x1 (y) = 0. (78)


Figure 9: Placing the total, experimental spurious, natural direct, and natural indirect
effects along the population and mechanism axes that were first introduced in Fig. 7.

If the H0 hypothesis is rejected, the fairness analyst can conclude that direct discrimination
is present in the dataset. In contrast, any statistic or hypothesis test based on the TV alone
is insufficient to test for the existence of a direct effect.
We display in Fig. 9 the measures TE, NDE, NIE, and Exp-SE along the population
and mechanism axes of the explainability plane (Fig. 7). One may be tempted to surmise
that the FPCFA is fully solved based on the results discussed so far. This is unfortunately
not always the case, as illustrated next.
Example 9 (Limitation of the NDE) A startup company is currently in hiring season.
The hiring decision (Y ∈ {0, 1} indicates whether the candidate is hired) is based on gender
(X ∈ {0, 1} represents females and males, respectively), age (Z ∈ {0, 1}, indicating younger
and older applicants, respectively), and education level (W ∈ {0, 1} indicating whether the
applicant has a PhD). The true SCM M, unknown to the fairness analyst, is given by:

U ← N (0, 1) (79)
X ← Bernoulli(expit(U )) (80)
Z ← Bernoulli(expit(U )) (81)
W ← Bernoulli(0.3) (82)
Y ← Bernoulli((X + Z − 2XZ)/5 + W/6),    (83)

where expit(x) = ex /(1 + ex ). In this case, the NDE can be computed as:

NDEx0 ,x1 (y) = P (yx1 ,Wx0 ) − P (yx0 )    (84)
= P (Bernoulli((1 − Z)/5 + W/6) = 1) − P (Bernoulli(Z/5 + W/6) = 1)    (85)
= Σz∈{0,1} Σw∈{0,1} P (z, w) [(1 − z)/5 + w/6 − z/5 − w/6]    (86)
= Σz∈{0,1} Σw∈{0,1} P (z)P (w) [(1 − 2z)/5]    since P (z, w) = P (z)P (w)    (87)
= Σz∈{0,1} P (z) [(1 − 2z)/5] = (1/2) × (1/5) + (1/2) × (−1/5) = 0.    (88)


In other words, the NDEx0 ,x1 (y) is equal to zero. Still, perhaps surprisingly, the structural
direct effect is present in this case, that is Str-DE-fair does not hold, since the outcome Y
is a function of gender X, as evident from the structural Eq. 83. 

This example illustrates that even though the NDE is admissible with respect to structural
direct effect, it may still be equal to 0 while structural direct effect exists. One can see
through Eq. 88 that the NDE is an aggregate measure over two distinct sub-populations.
Specifically, when considering junior applicants, females are 20% less likely to be hired (units
with (Z = 0, X = 0)), whereas for senior applicants, males are 20% less likely to be hired
(units with (Z = 1, X = 1)). Mixing these two groups together results in the cancellation
of the two effects and the NDE equating to 0, in turn, making it impossible for the analyst
to detect discrimination using only the NDE. 17
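
This cancellation can be verified numerically. Below is a Monte Carlo sketch of the SCM in Eqs. 79-83 (the code itself is illustrative and not from the original text); since W does not depend on X, we have Wx0 = W, and the counterfactual probabilities in Eq. 62 reduce to averaging the corresponding success probabilities of Y:

    import numpy as np

    rng = np.random.default_rng(0)
    n = 2_000_000
    expit = lambda t: 1 / (1 + np.exp(-t))

    # SCM of Ex. 9 (Eqs. 79-83); W does not depend on X, so W_{x0} = W.
    u = rng.normal(size=n)
    z = rng.binomial(1, expit(u))
    w = rng.binomial(1, 0.3, size=n)

    p_y_x1_wx0 = (1 - z) / 5 + w / 6    # success prob. of Y under X = x1, W at W_{x0}
    p_y_x0     = z / 5 + w / 6          # success prob. of Y under X = x0

    diff = p_y_x1_wx0 - p_y_x0
    print("NDE         :", round(diff.mean(), 3))            # approximately 0
    print("within Z = 0:", round(diff[z == 0].mean(), 3))    # approximately +0.2
    print("within Z = 1:", round(diff[z == 1].mean(), 3))    # approximately -0.2

The population-level NDE averages to roughly zero, while the same direct contrast computed within each age group reveals the two opposing disparities described above.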
Another interesting way of understanding this phenomenon is through the structural
basis expansion of the NDE. In Eq. 75, the posterior weighting term is P (u), which means
that both younger and older applicants are included in the contrast. The fact that this
contrast mixes somewhat heterogeneous units of the population, regarding the decision-
making procedure to decide Y (fy ), motivates another important notion in fairness analysis:

Definition 17 (Power) Let Ω be a space of SCMs. Let Q be a structural criterion and


µ1 , µ2 fairness measures defined on Ω. Suppose that µ1 , µ2 are (Q, Ω)-admissible. We say
that µ2 is more powerful than µ1 if

∀M ∈ Ω : µ2 (M) = 0 =⇒ µ1 (M) = 0. (89)

The notion of power can be useful in the following context. Suppose there is an SCM M
in the space Ω for which discrimination is present, Q(M) = 1, while the measure µ1 is
admissible but unable to capture it, i.e., µ1 (M) = 0. Still, another measure may exist such
that µ2 (M) ≠ 0. If this is the case, we would say that discrimination qualitatively described
by criterion Q can be detected using measure µ2 , but not using µ1 . We would then say
that µ2 is more powerful than µ1 . Putting it differently, what Ex. 9 showed was that the
measure

NDEx0 ,x1 (y) = C(x0 , {x1 , Wx0 }, ∅, ∅) (90)

was not powerful enough. The reason in this case is that for the NDE, the conditioning
events are E0 = E1 = ∅, which is not refined enough to capture the discrimination in the
aforementioned example. Next, we re-write the definition of FPCFA to account for the
measures’ power:

Definition 18 (FPCFA continued with power) The Fundamental Problem of Causal


Fairness Analysis is to find a collection of measures µ1 , . . . , µk such that the following prop-
erties are satisfied:

(1) µ is decomposable w.r.t. µ1 , . . . , µk ;


17. This observation is structural, and despite of the number of samples available. In practice, depending
on the sample size, some level of tolerance regarding the difference between these two groups may be
present and still be undetectable through any statistical hypothesis testing.


(2) µ1 , . . . , µk are admissible w.r.t. the structural fairness criteria Q1 , Q2 , . . . , Qk .

(3) µ1 , . . . , µk are as powerful as possible.

We provide in Fig. 10 an updated, visual representation of the FPCFA that accounts for the
power relation across measures. In some sense, picking (NDEx0 ,x1 (y), NIEx1 ,x0 (y),
Exp-SEx (y)) as the measures (µkDE , µkIE , µkSE ) helped to solve the original problem, but the
gap between TV and the structural measures is so substantive that certain critical instances
were left undetected. In the updated definition, the requirement is to find measures that are
as powerful as possible, or in other words, the closest possible to the corresponding structural
ones, Str-{DE,IE,SE}. In the sequel, we discuss how to construct increasingly more powerful
measures by using more specific events E.

Figure 10: FPCFA with power relations.

4.1.2 X-specific Contrasts - P (u | x)

We will quantify the level of discrimination for a specific subgroup of the population for
which X(u) = x (for example, females) by considering contrasts with the conditioning event
E = {X = x}. In fact, we are moving inwards in the population axis in Fig. 7,
following the discussion in Sec. 3.2, and the sub-population we are focusing on is more spe-
cific. More formally, this can be seen through the structural basis expansion (Eq. 54) and
the fact that the posterior after using the new E becomes P (u | X = x), which generates a
family of x-specific measures:
Definition 19 (x-specific TE, DE, IE, and SE) The x-{total, -direct, -indirect, -spurious}
effects are defined as follows

x-TEx0 ,x1 (y | x) = P (yx1 | x) − P (yx0 | x) (91)


x-DEx0 ,x1 (y | x) = P (yx1 ,Wx0 | x) − P (yx0 | x) (92)
x-IEx1 ,x0 (y | x) = P (yx1 ,Wx0 | x) − P (yx1 | x) (93)
x-SEx0 ,x1 (y) = P (yx0 | x1 ) − P (yx0 | x0 ). (94)

The x-TE is a well-known quantity, usually called the effect of treatment on the treated
(ETT, for short) in the literature, which appeared in (Heckman et al., 1998), while the x-specific
DE, IE, and SE are more recent quantities introduced in (Zhang and Bareinboim,
2018b). 18 Some observations ensue from these definitions. Firstly, these measures can be
18. In fact, Zhang and Bareinboim (2018b) originally named these quantities the counterfactual DE, IE, and
SE, but we highlight here that they are the x-specific counterparts of their marginal effects. This is for
clarity of the discussion here since, from now on in this paper, all quantities will be “counterfactual”, in
the sense of layer 3 of the Pearl Causal Hierarchy (Bareinboim et al., 2022).


Figure 11: Graphical representations of some x-specific causal fairness measures:
(a) ETTx0 ,x1 (y | x); (b) Ctf-SEx0 ,x1 (y). The blue and red colors highlight where the
contrast between the quantities lies.

written as their structural basis and unit-level factorization (Eqs. 54 and 55), that is
x-TEx0 ,x1 (y | x) = Σu [yx1 (u) − yx0 (u)] P (u | x)    (95)
x-DEx0 ,x1 (y | x) = Σu [yx1 ,Wx0 (u) − yx0 (u)] P (u | x)    (96)
x-IEx1 ,x0 (y | x) = Σu [yx1 ,Wx0 (u) − yx1 (u)] P (u | x)    (97)
x-SEx0 ,x1 (y) = Σu yx0 (u) [P (u | x1 ) − P (u | x0 )].    (98)

To simplify the notation and the comparison with the measures discussed earlier, we re-write
them as factual and counterfactual contrasts, namely:

x-TEx0 ,x1 (y | x) = C(x0 , x1 , x, x) (99)


x-DEx0 ,x1 (y | x) = C(x0 , {x1 , Wx0 }, x, x) (100)
x-IEx1 ,x0 (y | x) = C(x1 , {x1 , Wx0 }, x, x) (101)
x-SEx0 ,x1 (y) = C(x0 , x0 , x1 , x0 ). (102)

Secondly, we will consider each of the measures individually. Starting with the x-TE, we
note that it is simply a conditional version of the total effect (TE) for the subset of units
U in which X(u) = x. This can be easily seen by comparing the contrast representation of
the TE (Eq. 58) versus the x-TE (Eq. 99), namely:

x-TEx0 ,x1 (y | x) = C(x0 , x1 , x, x)


TEx0 ,x1 (y) = C(x0 , x1 , ∅, ∅),

which makes it obvious that the x-TE has E0 = E1 = x, whereas the TE has E0 = E1 = ∅.
Both measures, however, use the same counterfactual clauses C0 = x0 and C1 = x1 .
In terms of the sampling-evaluation process discussed earlier, even though these measures
evaluate each unit in the same way (due to the same counterfactual clauses), the TE draws
units at random from the population, while the x-TE filters them out based on X’s par-
ticular instantiation. The graphical visualization of the ETT is shown in Fig. 11a and can
be compared with that of TE in Fig. 8a, for grounding the intuition. In words, note that


the downstream effect of X on Y is the same, but now Z is no longer disconnected from
X, but varies in accordance with the event X = x. As we will show later on, in the startup
hiring example (Ex. 9), gender will lead to an additional source of information about
age, which can be used in the measure.
Thirdly, the counterfactual measures of direct and indirect effects, x-DE and x-IE, are
conditional versions of the NDE and NIE, respectively. These observations are also reflected
in Eqs. 96-97, in which the only difference compared to the general population measures is
in the posterior weighting term P (u | x), while for the NDE and NIE the weighting term
is simply P (u) (Eqs. 75-76). One difference relative to the natural DE and IE is that here
a reference value, X = x, needs to be picked such that the baseline population can be
selected. For instance, in the context of comparing the direct effect on Y from transitioning
X from x0 to x1 , one could more naturally set the baseline population to X = x0 .
Fourthly, we consider the x-SE and its graphical representation, as shown in Fig. 11b.
This quantity also generalizes that of Exp-SEx (y) shown in Fig. 8b. The difference between
these two quantities is in the weighting term, where P (u) − P (u | x) in Exp-SEx (y) is
replaced by P (u | x1 ) − P (u | x0 ) in x-SEx0 ,x1 (y). Despite its innocent appearance, this a
substantive difference since the Exp-SE entails a comparison between the observational and
interventional distributions, while x-SE is a purely counterfactual measure. 19
We can now state the main result of this section, namely, that the quantities
x-{DE, IE, SE} solve the FPCFA(Str-{DE,IE,SE}, TVx0 ,x1 (y)).

Theorem 3 (x-specific FPCFA(Str-{DE,IE,SE}, TVx0 ,x1 (y)) solution) The total vari-
ation measure can be decomposed as

TVx0 ,x1 (y) = x-DEx0 ,x1 (y | x0 ) − x-IEx1 ,x0 (y | x0 ) − x-SEx1 ,x0 (y). (103)

Further, the measures x-{DE, IE, SE} are admissible w.r.t. Str-DE, Str-IE, Str-SE, respec-
tively. Moreover, the counterfactual family is more powerful than NDE, NIE, and Exp-SE,
respectively. More formally, the admissibility relations can be written as:

Str-DE-fair =⇒ x-DE-fair (104)


Str-IE-fair =⇒ x-IE-fair (105)
Str-SE-fair =⇒ x-SE-fair, (106)

and the power relations as:

x-DE-fair ◦−→ NDE-fair, (107)


x-IE-fair ◦−→ NIE-fair, (108)
x-SE-fair ◦−→ Exp-SE-fair. (109)

Therefore, the measures (µDE , µIE , µSE ) = (x-DEx0 ,x1 (y), x-IEx1 ,x0 (y), x-SEx0 ,x1 (y)) solve
the FPCFA(Str-{DE,IE,SE}, TVx0 ,x1 (y)).
19. In terms of the Pearl Causal Hierarchy (PCH, for short), the quantity Exp-SE entails assumptions only
relative to associational and experimental quantities (PCH’s layers 1 and 2), while the x-SE requires
substantively stronger assumptions regarding counterfactuals (layer 3). For a more detailed discussion
on that matter, refer to (Bareinboim et al., 2022).


Similarly to the discussion of the general-population measures (i.e., P (u)), the significance
and practical implications of Thm. 3 appear through Eq. 104’s contrapositive
(and Eqs. 105, 106), i.e.:

(x-DEx0 ,x1 (y) ≠ 0) =⇒ ¬Str-DE-fair. (110)

Based on this, we have now a principled way of testing the following hypothesis:

H0 : x-DEx0 ,x1 (y) = 0. (111)

If the H0 hypothesis is rejected, the fairness analyst can conclude that direct discrimination
is present in the dataset. Naturally, similar tests can be performed regarding the indirect
and spurious structural measures.

Example 10 (Revisiting the Startup’s hiring & NDE lack of power) Consider the
SCM M given in Eq. 79-83. For X = x0 we compute the x-specific direct effects as:

x-DEx0 ,x1 (y | x0 ) = P (yx1 ,Wx0 | x0 ) − P (yx0 | x0 )    (112)
= P (Bernoulli((1 − Z)/5 + W/6) = 1 | x0 )    (113)
− P (Bernoulli(Z/5 + W/6) = 1 | x0 )    (114)
= Σz∈{0,1} Σw∈{0,1} P (w)P (z | x0 ) [(1 − 2z)/5 + w/6 − w/6]    (115)
= Σz∈{0,1} (1 − 2z)/5 · P (z | x0 ) = 0.036.    (116)

In words, when considering female applicants (X = x0 ), they are 3.6% less likely to be
hired than they would have been, had they been male. In other words, direct discrimination is
certainly present in the hiring process of the startup company. 
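
The same simulation strategy recovers this x-specific quantity; the only change (an illustrative sketch, not part of the original text) is that the average in Eq. 115 is taken over the units observed to have X = x0:

    import numpy as np

    rng = np.random.default_rng(0)
    n = 2_000_000
    expit = lambda t: 1 / (1 + np.exp(-t))

    # SCM of Ex. 9 (Eqs. 79-83), now conditioning on the factual event X = x0 = 0.
    u = rng.normal(size=n)
    x = rng.binomial(1, expit(u))
    z = rng.binomial(1, expit(u))

    # Unit-level direct contrast of Eq. 115; the W/6 term cancels out.
    direct = (1 - z) / 5 - z / 5
    print("x-DE(y | x0):", round(direct[x == 0].mean(), 3))   # compare with Eq. 116
    print("NDE         :", round(direct.mean(), 3))           # approximately 0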

4.1.3 Z-specific Contrasts - P (u | z)


One might also be interested in capturing discrimination for a specific subset of U for which
Z(u) = z, similarly to the x-specific measures. Here, we will consider two possibilities
in terms of sub-population selection, first when event Z(u) = z and then when Z(u) =
z, X(u) = x. Before introducing the corresponding z- and (x, z)-specific quantities, we
clarify one major difference compared to the general and x-specific case, namely in the
spurious effects. As noted in Sec. 3, spurious effects are captured by factual contrasts of
the form
P (yx | E1 ) − P (yx | E0 ) = Σu yx (u) [P (u | E1 ) − P (u | E0 )],    (117)

which rely on comparing different units corresponding to events E0 , E1 . These spurious


effects represent variations that causally precede X and Y . Interestingly enough, under the
assumptions of the SFM (Sec. 2.3.1), conditioning on Z(u) = z closes all backdoor paths


between X and Y . In other words, fixing Z also fixes the possible spurious variations, and
therefore on a z- or (x, z)-specific level spurious effects are always equal to zero20 . Therefore,
we can consider the following measures:

Definition 20 (z- and (x, z)-specific TE, DE, and IE) The z-specific total, direct and
indirect effects are defined as

z-TEx0 ,x1 (y | z) = P (yx1 | z) − P (yx0 | z) (120)


z-DEx0 ,x1 (y | z) = P (yx1 ,Wx0 | z) − P (yx0 | z) (121)
z-IEx1 ,x0 (y | z) = P (yx1 ,Wx0 | z) − P (yx1 | z) (122)
(x, z)-TEx0 ,x1 (y | z) = P (yx1 | x, z) − P (yx0 | x, z) (123)
(x, z)-DEx0 ,x1 (y | z) = P (yx1 ,Wx0 | x, z) − P (yx0 | x, z) (124)
(x, z)-IEx1 ,x0 (y | z) = P (yx1 ,Wx0 | x, z) − P (yx1 | x, z). (125)

As before, the measures can be factorized using the corresponding unit-level outcomes:
z-TEx0 ,x1 (y | z) = Σu [yx1 (u) − yx0 (u)] P (u | z)    (126)
z-DEx0 ,x1 (y | z) = Σu [yx1 ,Wx0 (u) − yx0 (u)] P (u | z)    (127)
z-IEx1 ,x0 (y | z) = Σu [yx1 ,Wx0 (u) − yx1 (u)] P (u | z)    (128)
(x, z)-TEx0 ,x1 (y | z) = Σu [yx1 (u) − yx0 (u)] P (u | x, z)    (129)
(x, z)-DEx0 ,x1 (y | z) = Σu [yx1 ,Wx0 (u) − yx0 (u)] P (u | x, z)    (130)
(x, z)-IEx1 ,x0 (y | z) = Σu [yx1 ,Wx0 (u) − yx1 (u)] P (u | x, z).    (131)

These quantities can also be represented more explicitly as contrasts:

z-TEx0 ,x1 (y | z) = C(x0 , x1 , z, z) (132)


z-DEx0 ,x1 (y | z) = C(x0 , {x1 , Wx0 }, z, z) (133)
z-IEx1 ,x0 (y | z) = C(x1 , {x1 , Wx0 }, z, z) (134)

20. Experienced readers might notice, in the presence of unobserved confounders (UCs), we could have more
explicitly defined the corresponding, z-, (x, z)-specific notions

z-SEx (y) = P (y | x, z) − P (yx | z), (118)


(x, z)-SEx0 ,x1 (y) = P (yx | x1 , z) − P (yx | x0 , z). (119)

Naturally, this would account for the spurious variations brought about by the UCs. For a more com-
prehensive treatment of these issues, we refer readers to Sec. 6.


(x, z)-TEx0 ,x1 (y | z) = C(x0 , x1 , {x, z}, {x, z}) (135)


(x, z)-DEx0 ,x1 (y | x, z) = C(x0 , {x1 , Wx0 }, {x, z}, {x, z}) (136)
(x, z)-IEx1 ,x0 (y | x, z) = C(x1 , {x1 , Wx0 }, {x, z}, {x, z}). (137)

The z-TE, z-DE, and z-IE (and similarly the (x, z)- counterparts) are simply conditional
versions of TE, NDE, and NIE respectively, restricted to the subpopulation of U such that
Z(u) = z (or Z(u) = z, X(u) = x), which is reflected in the posterior weighting term which
becomes P (u | z) (or P (u | x, z)).
Several important remarks are due. Using the sampling of units analogy from before,
we notice that z-specific effects filter on units which have Z(u) = z, which means they
provide us with a more refined lens for detecting discrimination than the general population
measures. Similarly, the (x, z)-specific measures can be seen as additionally filtering the
units on Z(u) = z, after they were filtered based on X(u) = x, which is precisely what x-
specific measures have done. Therefore, (x, z)-specific measures can be seen as more refined
than x- and z- specific ones. The only uncertainty left in terms of power is about comparing
x-specific and z-specific measures.
Interestingly, under the SFM, the (x, z)-specific measures are equal to the z-specific mea-
sures. This result cannot be deduced from the structural basis expansions above (Eq. 127-
131), but requires the assumptions encoded in the SFM (namely the absence of backdoor
paths from X to Y conditional on Z). This equivalence of z- and (x, z)-specific measures
under the SFM shows that z-specific measures are in fact more powerful than the x-specific
ones, although this need not be the case in general. Following this discussion, we are ready
to present the main result regarding the measures introduced above (while, as discussed
earlier, for the spurious effects we rely on the general and x-specific notions):

Theorem 4 (z-specific FPCFA(Str-{DE,IE,SE}, TVx0 ,x1 (y)) solution) The total vari-
ation measure can be decomposed as
TVx0 ,x1 (y) = Σz z-DEx0 ,x1 (y | z)P (z) − Σz z-IEx1 ,x0 (y | z)P (z)
− (Exp-SEx0 (y) − Exp-SEx1 (y))    (138)
= Σz (x, z)-DEx0 ,x1 (y | x, z)P (z | x) − Σz (x, z)-IEx1 ,x0 (y | x, z)P (z | x)
− x-SEx1 ,x0 (y).    (139)

Further, the measures z-DE and (x, z)-DE are admissible w.r.t. Str-DE, whereas z-IE and
(x, z)-IE are admissible w.r.t. Str-IE. Moreover, the following power relations hold:

(x, z)-DE-fair ◦−→ z-DE-fair ◦−→ NDE-fair, (140)


(x, z)-IE-fair ◦−→ z-IE-fair ◦−→ NIE-fair, (141)

and also

(x, z)-DE-fair ◦−→ x-DE-fair, (142)


(x, z)-IE-fair ◦−→ x-IE-fair. (143)


Additionally, under the SFM, we can say that:

z-DE-fair ◦−→ x-DE-fair, (144)


z-IE-fair ◦−→ x-IE-fair. (145)

Therefore, under the SFM, the measures (µDE , µIE , µSE ) = (z-DEx0 ,x1 (y), z-IEx1 ,x0 (y),
x-SEx0 ,x1 (y)) give a more powerful solution to FPCFA(Str-{DE,IE,SE}, TVx0 ,x1 (y)) than
the x-specific ones.

With z-specific measures in hand, we revisit Ex. 9, which showed that the NDE can equal
0 even though direct discrimination exists:

Example 11 (Revisiting the Startup’s hiring & NDE lack of power) Consider the
SCM M given in Eq. 79-83. For Z = 0 we compute the z-specific direct effects as:

z-DE(y | Z = 0) = P (yx1 ,Wx0 | Z = 0) − P (yx0 | Z = 0)    (146)
= P (Bernoulli((1 − Z)/5 + W/6) = 1 | Z = 0)    (147)
− P (Bernoulli(Z/5 + W/6) = 1 | Z = 0)
= Σw∈{0,1} P (w) [1/5 + w/6 − w/6] = 1/5.    (148)

In words, when considering younger applicants (Z = 0), females are 20% less likely to be
hired than their male counterparts. 

Interestingly enough, note that the z-specific DE is able to detect discrimination in the above
example, and finds an even larger disparity transmitted through the direct mechanism
compared to the x-specific DE measure in Ex. 10.

4.1.4 More informative contrasts (V′ ⊆ V -specific).


In case even more detailed measures of fairness are needed, we can consider specific subsets
of the observed variables, V′ ⊆ V . For example, we might be interested in quantifying
discrimination for specific units u that correspond to Z(u) = z, W (u) = w (for example
quantifying discrimination for a specific age group with a specific level of education). Other
choices of V′ than {Z, W } are possible, but due to a large number of possibilities, we do
not cover all of them here. Instead, we define generic v′-specific measures for an arbitrary
choice of v′:

Definition 21 (V′ ⊆ V -specific TE, DE and IE) Let V′ ⊆ V be a subset of the observables
V . For any fixed value of V′ = v′, we define the v′-specific total, direct, and indirect
effects as:

v′-TEx0 ,x1 (y | v′) = P (yx1 | v′) − P (yx0 | v′)    (149)
v′-DEx0 ,x1 (y | v′) = P (yx1 ,Wx0 | v′) − P (yx0 | v′)    (150)
v′-IEx1 ,x0 (y | v′) = P (yx1 ,Wx0 | v′) − P (yx1 | v′).    (151)


Once more, these measures admit a structural basis expansion, which we write together with
the corresponding contrasts:

v′-DEx0 ,x1 (y | v′) = Σu [yx1 ,Wx0 (u) − yx0 (u)] P (u | v′) = C(x0 , {x1 , Wx0 }, v′, v′)    (152)
v′-IEx1 ,x0 (y | v′) = Σu [yx1 ,Wx0 (u) − yx1 (u)] P (u | v′) = C(x1 , {x1 , Wx0 }, v′, v′).    (153)

Similarly to the z-specific case, the notion of a spurious effect is lacking whenever Z ⊆ V′,
so once again we rely on previously developed notions of spurious effects. Importantly, the
v′-specific measures give an even stronger solution to FPCFA than the z- or (x, z)-specific
measures:
Theorem 5 (v′-specific FPCFA(Str-{DE,IE,SE}, TVx0 ,x1 (y)) solution) Suppose V′ ⊆
V is a subset of the observables that contains both X and Z. The total variation measure
can be decomposed as
TVx0 ,x1 (y) = Σv′ v′-DEx0 ,x1 (y | v′)P (v′ | x) − Σv′ v′-IEx1 ,x0 (y | v′)P (v′ | x) − x-SEx1 ,x0 (y).
(154)

Further, the measures v′-{DE, IE} are admissible w.r.t. Str-DE, Str-IE, respectively. Moreover,
the v′-specific family is more powerful than the (x, z)-specific, namely:

v′-DE-fair ◦−→ (x, z)-DE-fair,    (155)
v′-IE-fair ◦−→ (x, z)-IE-fair.    (156)

Therefore, the measures (µDE , µIE , µSE ) = (v′-DEx0 ,x1 (y), v′-IEx1 ,x0 (y), x-SEx0 ,x1 (y)) give
a more powerful solution to FPCFA(Str-{DE,IE,SE}, TVx0 ,x1 (y)) than the z- or (x, z)-specific
ones.
The next example illustrates why having more flexible, v′-specific measures can be informative,
and therefore useful in some practical settings.
Example 12 (Startup hiring - Version II) A startup company is hiring employees. Let
X ∈ {x0 , x1 } denote female and male applicants respectively. The employment decision
Y ∈ {0, 1} is based on gender and education level W . The SCM M is given by:

X ← Bernoulli(0.5) (157)
W ← N (14, 4) (158)
Y ← Bernoulli(0.1 + W/50 + 0.1 · X · 1(W < 20)).    (159)
Since there are no confounders (Z = ∅), general, x-specific and z-specific effects are all
equal:

NDEx0 ,x1 (y) = x-DEx0 ,x1 (y | x) = z-DEx0 ,x1 (y | z) = 9.2%. (160)

Therefore, there is clearly direct discrimination against female employees by the company.


The company argues in the legal proceedings that, in the high-tech industry, they are
mostly concerned with highly educated individuals. In words, they argue they should be asked
whether they discriminate against highly educated female applicants, which is represented
through the quantity w-DEx0 ,x1 (y | w > 20). This number can be computed as follows:

w-DEx0 ,x1 (y | w > 20) = 0%, (161)

In words, the company’s claim was accurate since highly educated individuals were not
discriminated against. 

What the example shows is that v′-specific measures can sometimes capture aspects of
discrimination that otherwise cannot be quantified using general, x-specific, or z-specific
measures.
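
A quick numerical check of the claims in Ex. 12 is sketched below (illustrative code; reading N(14, 4) as mean 14 and standard deviation 4 is an assumption about the notation, and the printed population-level NDE may therefore differ slightly from the 9.2% quoted above, while the w-specific conclusions do not depend on this choice):

    import numpy as np

    rng = np.random.default_rng(0)
    n = 1_000_000

    # SCM of Ex. 12 (Eqs. 157-159); N(14, 4) read as mean 14, standard deviation 4.
    w = rng.normal(14, 4, size=n)

    # Switching X from x0 to x1 changes Y's success probability by 0.1 * 1(W < 20);
    # all remaining terms of Eq. 159 cancel in the direct contrast.
    unit_de = 0.1 * (w < 20)

    print("NDE              :", round(unit_de.mean(), 3))            # roughly 0.09
    print("w-DE for W > 20  :", round(unit_de[w > 20].mean(), 3))    # exactly 0
    print("w-DE for W < 20  :", round(unit_de[w < 20].mean(), 3))    # 0.1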

Probabilities of causation. Remarkably, the v′-specific measures carry a fundamental
connection to what is known in the literature as probabilities of causation (Pearl, 2000,
Ch. 9). For example, by picking the event v′ = {x0 , y0 }, the measure v′-TE becomes

(x, y)-TEx0 ,x1 (y | x0 , y0 ) = P (yx1 | x0 , y0 ) − P (yx0 | x0 , y0 ),    (162)

where Y = y is a shortcut to Y = 1. First, note that P (yx0 | x0 , y0 ) = P (y | x0 , y0 ), since


by the consistency axiom Y = Yx0 whenever X = x0 . Obviously, P (y | x0 , y0 ) = 0 since
y0 ≠ 1. Putting these together, the r.h.s. of Eq. 162 can be re-written as

(x, y)-TEx0 ,x1 (y | x0 , y0 ) = P (yx1 | x0 , y0 ), (163)

which is known as the probability of sufficiency (Pearl, 2000, Def. 9.2.2). The measure
computes the probability that a change in attribute from X = x0 to X = x1 produces a
change in outcome from Y = y0 to Y = y1 , or, in words, how much X’s value is “sufficient”
to produce y1 . Along similar lines, v′-TE for the event v′ = {x1 , y1 } can be written as

(x, y)-TEx0 ,x1 (y | x1 , y1 ) = P (yx1 | x1 , y1 ) − P (yx0 | x1 , y1 ) (164)


= 1 − P (yx0 | x1 , y1 ) (165)
= P (yx0 = 0 | x1 , y1 ), (166)

which is known as the probability of necessity (Pearl, 2000, Def. 9.2.1). The second line
of the derivation follows since, by the consistency axiom, Yx1 = Y , and also from the fact that
Y = 1 in the factual world. The measure computes the probability that a change in
attribute from X = x1 to X = x0 produces a change in outcome from Y = y1 to Y = y0 , or
how X’s value is “necessary” to produce y1 . These two types of variations usually appear
together and may be modeled through what is known as the probability of necessity and
sufficiency (PNS). We refer readers to (Pearl, 2000, Ch. 9) for further discussion.

4.1.5 Unit-level Contrasts - δu


Finally, the most powerful measures to consider are unit-level measures, as defined next:


Definition 22 (Unit-level TE, DE, and IE) Given a unit U = u, the unit-level total,
direct, and indirect effects are given by

u-TEx0 ,x1 (y(u)) = yx1 (u) − yx0 (u) = C(x0 , x1 , u, u) (167)


u-DEx0 ,x1 (y(u)) = yx1 ,Wx0 (u) − yx0 (u) = C(x0 , {x1 , Wx0 }, u, u) (168)
u-IEx1 ,x0 (y(u)) = yx1 ,Wx0 (u) − yx1 (u) = C(x1 , {x1 , Wx0 }, u, u). (169)

For unit-level measures the posterior distribution that is used as a weighting term is δu ,
where δ is the Dirac delta function. The unit-level measures can be seen as the canonical
basis under which all other measures are expanded. They also give the strongest theoretical
solution to the FPCFA, once again, with the help of the x-specific spurious effect developed
earlier:

Theorem 6 (unit-level FPCFA(Str-{DE,IE,SE}, TVx0 ,x1 (y)) solution) The total vari-
ation measure can be decomposed as
TVx0 ,x1 (y) = Σu u-DEx0 ,x1 (y(u))P (u | x) − Σu u-IEx1 ,x0 (y(u))P (u | x) − x-SEx1 ,x0 (y).
(170)

Further, the measures u-{DE, IE} are admissible w.r.t. Str-DE, Str-IE, respectively. Moreover,
the u-specific family is more powerful than the v′-specific, namely:

u-DE-fair =⇒ v′-DE-fair,    (171)
u-IE-fair =⇒ v′-IE-fair.    (172)

Therefore, the measures (µDE , µIE , µSE ) = (u-DEx0 ,x1 (y), u-IEx1 ,x0 (y), x-SEx0 ,x1 (y)) give
the most powerful solution to FPCFA(Str-{DE,IE,SE}, TVx0 ,x1 (y)).
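
To ground Def. 22, the sketch below evaluates the unit-level direct effect of Eq. 168 in the SCM of Ex. 9 (Eqs. 79-83). Making every unit fully deterministic requires representing the Bernoulli draws of W and Y through explicit uniform exogenous variables (u_w, u_y); this representation, like the specific units chosen, is an illustrative assumption rather than part of the original text:

    # Unit-level direct effect y_{x1, W_{x0}}(u) - y_{x0}(u) in the SCM of Ex. 9.
    def u_de(z, u_w, u_y):
        w = int(u_w < 0.3)                          # W does not depend on X
        y_x1_wx0 = int(u_y < (1 - z) / 5 + w / 6)   # Y under X = x1, with W at W_{x0}
        y_x0 = int(u_y < z / 5 + w / 6)             # Y under X = x0
        return y_x1_wx0 - y_x0

    # A young applicant (Z = 0) whose hiring draw is close to the threshold:
    print(u_de(z=0, u_w=0.9, u_y=0.15))   # 1: the switch from x0 to x1 alone flips the decision
    # An older applicant (Z = 1) with the same exogenous draws:
    print(u_de(z=1, u_w=0.9, u_y=0.15))   # -1: the unit-level direct effect flips sign

The first unit has a direct effect of +1 and the second of −1, which is exactly the unit-level heterogeneity that the population-level NDE averaged away in Ex. 9.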

The unit-level measures represent the most refined level at which discrimination can
be described. In fact, introducing these measures also brings us to the final level of the
population axis of the explainability plane (Fig. 7). Recall, the population axis ranges from
the general population measures (with a posterior P (u)), all the way to the deterministic
measures which consider a single unit (with a posterior δu ), eliciting a range of measures
which may be useful for fairness analysis. We next move on to giving a systematic overview
of the TV-family of measures that was introduced in this section.

4.2 Summary of the TV-family & the Fairness Map


To facilitate comparison and understanding after introducing the measures of the TV-family,
we show next how they can be more explicitly written as contrasts:

Lemma 4 (TV family as contrasts) The TV-family of causal fairness measures is a


collection of contrasts C(C0 , C1 , E0 , E1 ) (Def. 14) that follow the specific instantiations of
counterfactual and factual clauses, C0 , C1 , E0 , E1 , as described in Table 1.


A few things are worth noting relative to this taxonomy. First, the measures are grouped in five categories, based on the granularity of the events E0, E1. For each of the contrasts, we define a criterion based on the resulting measure. Namely, we say Y is fair with respect to X in the x-TE measure if x-TEx0,x1(y | x) = 0 ∀x. We write x-TE-fairX(Y) for this condition, or x-TE-fair, for short.

Further note that Table 1 has a distinct structure. In fact, the contrasts corresponding to TE, DE, and IE measures have repeating (equal) counterfactual clauses C0 and C1, whereas the conditioning event E changes. Mathematically, the measures in the table, except for the spurious effects, can be written more succinctly as

E-TEx0,x1(y | E) = C(x0, x1, E, E)
E-DEx0,x1(y | E) = C(x0, {x1, Wx0}, E, E)                                    (173)
E-IEx0,x1(y | E) = C(x0, {x0, Wx1}, E, E)

for E ∈ {∅, x, z, v 0, u}.

             Measure           C0     C1          E0     E1
general      TVx0,x1           ∅      ∅           x0     x1
             Exp-SEx           x      x           ∅      x
             TEx0,x1           x0     x1          ∅      ∅
             NDEx0,x1          x0     x1, Wx0     ∅      ∅
             NIEx0,x1          x0     x0, Wx1     ∅      ∅
X = x        x-TEx0,x1         x0     x1          x      x
             x-SEx0,x1         x0     x0          x0     x1
             x-DEx0,x1         x0     x1, Wx0     x      x
             x-IEx0,x1         x0     x0, Wx1     x      x
Z = z        z-TEx0,x1         x0     x1          z      z
             z-DEx0,x1         x0     x1, Wx0     z      z
             z-IEx0,x1         x0     x0, Wx1     z      z
V 0 ⊆ V      v 0-TEx0,x1       x0     x1          v 0    v 0
             v 0-DEx0,x1       x0     x1, Wx0     v 0    v 0
             v 0-IEx0,x1       x0     x0, Wx1     v 0    v 0
unit         u-TEx0,x1         x0     x1          u      u
             u-DEx0,x1         x0     x1, Wx0     u      u
             u-IEx0,x1         x0     x0, Wx1     u      u

Table 1: Measures of fairness in the TV-family. TE stands for total effect, Exp for experimental, SE for spurious, N for natural, DE for direct, IE for indirect, and v 0 for an event V 0 = v 0, where V 0 ⊆ V.

Apart from the overarching structure underlying the measures, as described in Table 1, there is more structure across them, as delineated in the next result, which comes under the rubric of the Fairness Map.

Theorem 7 (Fairness Map) The total variation
(TV) family of causal measures of fairness admits
a number of relations of decomposability, admissibility, and power, which are represented in
what we call the Fairness Map, as shown in Fig. 12.

In words, the measures of the TV family satisfy an entire hierarchy of relations in terms
of the properties discussed so far, namely, admissibility, decomposability, and power. This
hierarchy is one of the main results of this manuscript. There are several observations worth
making at this point. First, each arrow in Fig. 12 corresponds to an implication, and the
full and more syntactic version of the map is provided in the Appendix A.1, including the
proofs. There are different ways of reading the map, and perhaps the most natural one is
to navigate along the two axes, mechanisms and population, which match the dimensions
discussed earlier in the explainability plane (Fig. 7/Sec. 3.2).
Note that the mechanisms axis is partitioned into two parts. First, there are the elementary structural fairness criteria (Def. 9), each of which represents a different type of mechanism. Second, there are the composite measures, which are the ones that are usually readable from the data. More prominently, the causal effects, also known as the total effects, are marked in gray, and the total variation is shown in the top-left corner.


[Figure 12 about here: a grid whose columns follow the mechanisms axis (TV, Causal, Spurious, Direct, Indirect) and whose rows follow the population axis (general, X = x, Z = z, V 0 ⊆ V, unit, and the structural criteria Str-TE, Str-SE, Str-DE, Str-IE), with arrows encoding the relations described in the caption below.]

Figure 12: Fairness Map for the TV family of measures. The x-axis represents the mechanisms (causal, spurious, direct, and indirect), and the y-axis the events that capture increasingly more granular sub-populations, from general (P(u)) to unit-level, and structural. The arrow =⇒ indicates relations of admissibility, ◦−→ of power, and the dashed arrow 99K of decomposability.

In a complementary way, the population axis can also be partitioned. First, there are the structural measures (below the blue dotted line), which are computable from the true SCM M and therefore almost always unavailable in practice. On the other side, there are the "empirical" measures (above the blue line), which are possibly computable depending on the combination of data and assumptions about the underlying generative processes. This is a rough characterization; we now navigate through each of the two axes separately and in more detail.
Population axis (vertical) – Admissibility & Power relations.
When reading the map vertically, from bottom to top, one can find all power and ad-
missibility relations from Thm. 2 to Thm. 6. For example, the last column of the map
(“indirect”) shows that
Str-IE =⇒ u-IE ◦−→ v 0 -IE ◦−→ z-IE ◦−→ x-IE ◦−→ NIE. (174)
In words, this says that
(i) unit IE is admissible w.r.t. structural IE;


(ii) unit IE is more powerful than v 0 -IE, which is more powerful than z-IE, which is more
powerful than x-IE, which is more powerful than NIE;
(iii) by transitivity of the admissibility and power relations, it follows that every measure
in the column is admissible w.r.t. structural IE.
The other columns of the map can be interpreted in a similar fashion.
Mechanisms axis (horizontal) – Decomposability relations.
When reading the map horizontally, from right to left, the decomposability relations are encoded. For example, consider the first row of the map ("general"): it shows that

TE 99K NDE ∧ NIE (175)


TV 99K TE ∧ Exp-SE, (176)

In words, this says that


(i) the total variation (TV) can be decomposed into the total (TE) and experimental
spurious effects (Exp-SE);
(ii) the total effect (TE) can further be decomposed into natural direct effect (NDE) and
natural indirect effect (NIE).
(iii) More explicitly, these relations can be combined and written as:

TV 99K NDE ∧ NIE ∧ Exp-SE. (177)

More strongly, this can be stated for every level of the population axis (i.e., the TE is
decomposed into DE and IE at every level), as shown next:

Corollary 1 (Extended Mediation Formula) The total effect admits a decomposition


into its direct and indirect parts, at every level of granularity of event E in the Fairness
Map in Fig. 12. Formally, we can say that

TEx0 ,x1 (y) = NDEx0 ,x1 (y) − NIEx1 ,x0 (y) (178)
x-TEx0 ,x1 (y | x) = x-DEx0 ,x1 (y | x) − x-IEx1 ,x0 (y | x) (179)
z-TEx0 ,x1 (y | z) = z-DEx0 ,x1 (y | z) − z-IEx1 ,x0 (y | z) (180)
v 0 -TEx0,x1(y | v 0) = v 0 -DEx0,x1(y | v 0) − v 0 -IEx1,x0(y | v 0)    (181)
u-TEx0 ,x1 (y(u)) = u-DEx0 ,x1 (y(u)) − u-IEx1 ,x0 (y(u)). (182)

Furthermore, the TV measure admits different expansions into DE, IE, and SE measures
(as shown in Thm. 2-6). The importance of these decompositions was already stated earlier,
as they played a crucial role in solving the decomposability part of the FPCFA.
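As a sanity check of the first identity in Corol. 1, the following minimal Monte Carlo sketch (Python) evaluates TEx0,x1(y) and NDEx0,x1(y) − NIEx1,x0(y) in a small hypothetical SCM with one mediator W; the model is purely illustrative and not taken from the text.

import numpy as np

rng = np.random.default_rng(1)
n = 2_000_000
u_w, u_y = rng.uniform(size=n), rng.uniform(size=n)

f_w = lambda x: (u_w < 0.2 + 0.5 * x).astype(int)                 # W <- f_W(X, U_W)
f_y = lambda x, w: (u_y < 0.1 + 0.3 * x + 0.4 * w).astype(int)    # Y <- f_Y(X, W, U_Y)

w_x0, w_x1 = f_w(0), f_w(1)                 # W_{x0}(u), W_{x1}(u)
y_x0, y_x1 = f_y(0, w_x0), f_y(1, w_x1)     # Y_{x0}(u), Y_{x1}(u)
y_x1_wx0 = f_y(1, w_x0)                     # Y_{x1, W_{x0}}(u)

te  = y_x1.mean() - y_x0.mean()             # TE_{x0,x1}(y)
nde = y_x1_wx0.mean() - y_x0.mean()         # NDE_{x0,x1}(y)
nie = y_x1_wx0.mean() - y_x1.mean()         # NIE_{x1,x0}(y)
print(f"TE = {te:.4f}, NDE - NIE = {nde - nie:.4f}")   # equal up to sampling error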
In summary, the Fairness Map represents a general, theoretical solution to the FPCFA,
and shows how the gap between the observed (TV in the top left of the map) and the
structural (bottom of the map) can be bridged from first principles. The map therefore,
in principle, closes the problem that is pervasive throughout the literature, as formalized earlier in this manuscript.


4.3 The Identification Problem & the FPCFA in practice


The Fairness Map introduced in Thm. 7 contains various measures that are admissible w.r.t. different structural mechanisms. All these measures are well-defined and computable from the underlying data-generating model, the true SCM M. However, M is not available in practice, which was the very motivation for engaging in the discussions so far and for finding proxies for the structural measures. One key consideration that follows is which of these measures can be computed in practice, given (1) a set of assumptions A about the underlying M and (2) data from past decisions generated by M. This question can be seen as a problem of identifiability (Pearl, 2000, Sec. 3.2.4). We formalize this notion in the context of this discussion.

Definition 23 (Identifiability) Let M = ⟨V, U, P(U), F⟩ be the true, generative SCM, let A be a set of assumptions about it, and let P(v) be the observational distribution generated by it. Let ΩA be the space of all SCMs compatible with A. Let φ be a query that can be computed from M. The quantity φ is said to be identifiable from ΩA and the observational distribution P(V) if

∀M1, M2 ∈ ΩA : A^{M1} = A^{M2} and    (183)
P^{M1}(V) = P^{M2}(V) =⇒ φ(M1) = φ(M2).    (184)

In words, if any two SCMs agree with the set of assumptions (A) and also generate the same observational distribution (P(v)), then they should agree on the answer to the query φ.

A query φ is identifiable if it can be uniquely computed from the combination of qualitative assumptions and empirical data. In fact, the lack of identifiability means that one cannot compute the value of φ from the observational data and the set of assumptions, i.e., the gap between the true generative process, M, and the feature that we are trying to obtain from it, φ, is too large, and cannot be bridged through the pair ⟨A, P(v)⟩. In practice, one common way of articulating assumptions about M is through the use of causal diagrams. Whenever the causal diagram is known, we can then write the following:

ΩG = {M : M compatible with G}, (185)

where compatibility is related to sharing the same causal diagram, which encodes qualitative assumptions, following the construction in Def. 6 (see footnote 21).

Example 13 ((Non-)Identifiability of measures) Let ΩG be the space of SCMs that


are compatible with the causal diagram G over {X, W, Y} with edges X → W, X → Y, W → Y, and a bidirected edge W ↔ Y.

When considering the quantities TEx0 ,x1 (y) and NIEx0 ,x1 (y) in this context, we can say
that:
21. For a more formal account of this notion, see the discussion on CBNs in (Bareinboim et al., 2022, Sec. 1.3).


(i) quantity TEx0 ,x1 (y) is identifiable over ΩG ,

(ii) quantity NIEx0 ,x1 (y) is not identifiable over ΩG .

In fact, for any SCM in ΩG , we have that TEx0 ,x1 (y) is equal to

P (y | x1 ) − P (y | x0 ). (186)

To show that NIEx0,x1(y) is not identifiable, consider the following two SCMs:

M1 :=  X ← UX    (187)
       W ← 1(UD < 0.2 + 0.4X + 0.4UWY)    (188)
       Y ← 1(UY < 0.1X + 0.7W + 0.1UWY),    (189)

M2 :=  X ← UX    (190)
       W ← 1(UD < 0.2 + 0.4X + 0.4UWY)    (191)
       Y ← 1(UY < 0.2X + 0.1W + 0.7UWY),    (192)

where UX, UD, UWY, and UY are independent, exogenous variables, with UX, UWY binary with P(UX = 1) = P(UWY = 1) = 1/2, and UD, UY distributed uniformly Unif[0, 1]. Both M1, M2 are compatible with G and hence are in ΩG. The reader can verify that the two SCMs generate the same observational distribution. However, computing that

NIE^{M1}_{x0,x1}(y) = 28% ≠ NIE^{M2}_{x0,x1}(y) = 4%    (193)

shows lack of identifiability in the given context.

Following the discussion in Sec. 2.3, we noted that an SCM M induces a particular causal diagram G. Still, specifying the precise G may be non-trivial in practice, and we hence introduced the standard fairness model (SFM). In this case, we will be particularly interested in the set of SCMs defined by the SFM projection of the causal diagram, which is called ΩSFM. Reasoning within the ΩSFM space has two interesting consequences. First, identification is in principle more challenging, since this context is generally larger, containing more SCMs than the true ΩG. Given that more SCMs imply a greater possibility of finding an alternative SCM that agrees with the assumptions and P(v) but disagrees on the query, identifiability will in general be less frequent. Second, however, since the SFM projection encodes fewer assumptions than the specific causal diagram G, it will in general be easier, from the fairness analyst's perspective, to elicit the knowledge needed to construct such a diagram. This situation is illustrated in Fig. 13.
We now extend the FPCFA to account for the identifiability issues discussed above:

Definition 24 (FPCFA continued with Identifiability) [Ω, Q as before] Let M = ⟨V, U, P(U), F⟩ be the true, unobserved generative SCM, let A be a set of assumptions, and let P(v) be the observational distribution generated by it. Let ΩA be the space of all SCMs compatible with A. The Fundamental Problem of Causal Fairness Analysis is to find a collection of measures µ1, . . . , µk such that the following properties are satisfied:

(1) µ is decomposable w.r.t. µ1 , . . . , µk ;


(2) µ1 , . . . , µk are admissible w.r.t. the structural fairness criteria Q1 , Q2 , . . . , Qk .

(3) µ1 , . . . , µk are as powerful as possible.

(4) µ1 , . . . , µk are identifiable from the observational distribution P (v) and class ΩA .
The first question we ask is about solving Step (4) of the FPCFA when the full causal graph G is available. To this end, we state the following theorem:

Theorem 8 (Identifiability over ΩG) Let G be a causal diagram compatible with the SFM and let ΩG be the context defined based on G. Then,

(i) TE, NDE, NIE, and Exp-SE are identifiable,

(ii) x-TE, x-DE, x-IE, and x-SE are identifiable,

(iii) z-TE, z-DE, and z-IE are identifiable,

(iv) if {W, Y} ∩ V 0 ≠ ∅, then v 0 -TE, v 0 -DE, and v 0 -IE are not identifiable except in degenerate cases,

(v) u-TE, u-DE, and u-IE are not identifiable except in degenerate cases.

By degenerate cases we refer to instances in which a measure is equal to 0 and identifiable from the absence of pathways.

Figure 13: Spaces of SCMs (left) and Causal Diagrams (right). (Left) Each point corresponds to a fully instantiated SCM. The SCMs compatible with the diagram G are shown in light blue, and the ones compatible with the SFM in dark blue. (Right) Each point corresponds to a causal diagram. The lightest green dot corresponds to the true diagram G, while the ones in the light green area correspond to different diagrams compatible with the SFM assumption.
For example, v 0 -DE or u-DE could be identifiable (and equal to 0) if the causal diagram G
does not contain the arrow X → Y (this is a case we call degenerate in the above theorem).
In summary, we can claim that general, x-specific, and z-specific measures are identifiable over ΩG whenever G is compatible with the SFM. However, v 0 -specific or unit-level measures are in general not identifiable without additional assumptions.
The next important question we ask is whether there is a gap in solving the FPCFA under the context ΩSFM compared to ΩG. In the first instance, as shown in the following theorem, the answer is negative, which formally shows why our definition of the SFM is indeed sensible in the context of FPCFA(Str-{DE,IE,SE}, TVx0,x1(y)):
Theorem 9 (Identifiability over ΩSFM & Soundness of SFM) Under the Standard Fairness Model (SFM), the orientation of edges within the possibly multidimensional variable sets Z and W does not change any of the general, x-specific, or z-specific measures. That is, if two diagrams G1 and G2 have the same projection to the Standard Fairness Model, i.e.,

ΠSFM (G1 ) = ΠSFM (G2 ) (194)

then any measure µ(P (v), G) will satisfy

µ(P (v), G1 ) = µ(P (v), G2 ) = µ(P (v), GSFM ). (195)


That is, if measures µ1 , . . . , µk in Step (4) of FPCFA in Def. 24 are identifiable over the class
of SCMs ΩG corresponding to a causal diagram G, then they are also identifiable over the
class of SCMs ΩSF M corresponding to the diagram’s SFM projection GSFM . The notation
µ(P (v), G) indicates the measures are computed based on the observational distribution P (v)
and the causal diagram G (as opposed to being computed based on the SCM M as before).

The proofs of Thm. 8 and 9 are given in Appendix A.2, together with a discussion on
relaxing the assumptions of the SFM, and a discussion on the estimation of measures. The
theorem shows that the SFM projection of a diagram GSFM is equally useful as the fully
specified diagram G for computing any of the general, x-specific or z-specific measures in
Lem. 4. That is, specifying more precisely the causal structure contained in multivariate
nodes Z and W would not change the values the different measures. The SFM projection
GSFM can be understood as a coarsening of the equivalence class of SCMs compatible with
the graph G. Perhaps surprisingly, this coarsening does not hurt the identifiability of some
of the most interesting measures. Moreover, for computing the v 0 -specific and unit-level
measures, additional assumptions would be necessary, even if the full diagram G was avail-
able (see Appendix A.2 for more details). The key observation is that v 0 -specific measures
require the identification of the joint counterfactual distribution P (vx0 0 , vx0 1 ), and these two
potential outcomes are never observed simultaneously. Therefore, unless we are interested
in v 0 -specific or unit-level measures, we can simply focus on constructing the GSFM and not
worry about full details of the diagram G. The formulation of FPCFA with identifiability
uncovers an interesting interplay of power and identifiability, in which increasingly strong
assumptions are needed to identify more powerful measures.

4.4 Other relations with the literature


Equipped with the Fairness Map, which was the culmination of understanding the relation-
ship between a multitude of measures, we can now analyze the connection of Causal Fairness
Analysis with some influential previous works that articulated other measures in the liter-
ature. In particular, we will discuss the criteria of counterfactual fairness in Sec. 4.4.1 and
individual fairness in Sec. 4.4.2.

4.4.1 Criterion 1. Counterfactual fairness


One criterion that has received considerable attention in the literature is called "counterfactual fairness" (Kusner et al., 2017). In terms of terminology, it is worth noting that the name "counterfactual fairness" is somewhat of a misnomer, and potentially misleading, as there are various measures that are counterfactual in nature and could be employed to reason about fairness, following the previous discussion and the Fairness Map (Fig. 12). Regardless of the name, the criterion has important limitations that we elaborate on in this section.
To begin with, the definition of the proposed criterion is somewhat ambiguous in regard to whether it represents a unit-level quantity or a probabilistic type of counterfactual (see footnote 22). To understand the issue, we list in the sequel three possible definitions compatible with the original paper, and then discuss their interpretations:

22. For various reasons, probabilistic measures tend to be discussed in the literature.


(i) Counterfactual Fairness – Unit-level (Ctf^(u)_fair):

    yx(u) − yx0(u) = 0, ∀x, x0, u ∈ U.    (196)

(ii) Counterfactual Fairness – Unit-level/probabilistic version (Ctf^(up)_fair):

    P(yx(u) | X = x, W = w) = P(yx0(u) | X = x, W = w), ∀x, x0, w.    (197)

(iii) Counterfactual Fairness – Population-level (Ctf^(p)_fair):

    P(yx | X = x, W = w) = P(yx0 | X = x, W = w), ∀x, x0, w.    (198)

In fact, the paper uses the unit-level/probabilistic version (Ctf^(up)_fair) as its core definition (Kusner et al., 2017, Def. 5), which we state here as a direct translation into our notation so as to make the context and comparisons more transparent (see footnote 23). The authors "emphasize that counterfactual fairness is an individual-level definition, which is substantially different from comparing different individuals that happen to share the same "treatment" X = x and coincide on the values of W = w" (Kusner et al., 2017, Sec. 3). Interestingly, this seems a deliberate choice and suggests a unit-level definition of fairness. Importantly, the probabilistic unit-level definition (Ctf^(up)_fair) and the unit-level definition (Ctf^(u)_fair) are equivalent, as shown next:
Proposition 1 (Ctf^(up)_fair ⇐⇒ Ctf^(u)_fair) The unit-level counterfactual fairness criterion (Eq. 196) and the unit-level/probabilistic counterfactual fairness criterion (Eq. 197) are equivalent.

This proposition suggests that the notation used in the original definition of the counterfactual fairness criterion, Ctf^(up)_fair, entails some confusion. In words, once the unit U = u is specified, as originally stated in the criterion, Yx(u) is fully determined. Considering or conditioning on the event X = x, W = w is therefore redundant, as it is implied by the choice of the unit u.
However, the authors also state that "the distribution over possible predictions for an individual should remain unchanged in a world where an individual's protected attributes had been different" (Kusner et al., 2017, Sec. 1). As explained above, if the unit U = u is known, there are no probabilities involved, and the statements are deterministic. Therefore, under the alternative description the authors provide, a different formulation of the criterion is needed. In fact, if the goal is to have a probabilistic counterpart of Eq. 196, as the above statement might lead one to think, then the unit U = u should be removed altogether, which leads more explicitly to the Ctf^(p)_fair definition, as displayed in Eq. 198. Interestingly, using the structural basis expansion from Thm. 1, we can show the relation between the unit-level and the probabilistic definitions:
Proposition 2 (Ctf^(p)_fair is a probabilistic average of Ctf^(u)_fair) Consider the following measure:

(x, w)-TEx,x0(y | x, w) = P(yx | X = x, W = w) − P(yx0 | X = x, W = w).    (199)


23. In particular, the original paper uses A for the protected attribute, where we use X, and it uses X for
the remaining attributes where we use W .


Then, the Ctf^(p)_fair criterion is equivalent to (x, w)-TEx,x0(y | x, w) = 0, ∀x, x0, w. Furthermore, the measure underlying the Ctf^(p)_fair criterion can be written as

(x, w)-TEx,x0(y | x, w) = ∑u [yx(u) − yx0(u)] P(u | x, w).    (200)

In words, Prop. 2 shows that the probabilistic counterfactual fairness criterion takes an average of the unit-level differences yx(u) − yx0(u), weighted by the posterior P(u | x, w), and requires this average to be equal to 0. Note the difference between this definition and the unit-level definition, which requires every unit-level difference yx(u) − yx0(u) to be 0.
After explaining the difference between the two possible and qualitatively different interpretations of counterfactual fairness, and clearing up the notational confusion with respect to fixing a unit U = u, we now discuss somewhat more serious issues regarding the criterion, from conceptual, technical, and practical viewpoints. In fact, the issues listed below apply to both the Ctf^(u)_fair and Ctf^(p)_fair interpretations of counterfactual fairness, with the three major points being:

1. inadmissibility of Ctf^(u)_fair and Ctf^(p)_fair with respect to Str-{DE,IE,SE},

2. lack of accounting for spurious effects, and

3. hardness/impossibility of identifiability.

Issue 1. Inadmissibility w.r.t. structural direct, indirect, and spurious effects

In the context of the discussion that led to the conclusions in Sec. 4.2, it is somewhat natural to expect that the counterfactual fairness measure is inadmissible w.r.t. any of the structural criteria, as shown more formally in the sequel.

Proposition 3 (Unit-TE, (x, w)-TE not admissible) The unit-level total effect (unit-TEx0,x1(y)) and the (x, w)-specific total effect ((x, w)-TEx0,x1(y | x, w)) are both not admissible w.r.t. the structural direct, indirect, and spurious criteria. Formally, we write

Str-DE-fair ⇏ unit-TE-fair,    Str-DE-fair ⇏ (x, w)-TE-fair    (201)
Str-IE-fair ⇏ unit-TE-fair,    Str-IE-fair ⇏ (x, w)-TE-fair    (202)
Str-SE-fair ⇏ unit-TE-fair,    Str-SE-fair ⇏ (x, w)-TE-fair.    (203)

The importance of this result stems from the fact that even if one is able to ascertain

yx1(u) − yx0(u) = 0 ∀u,  or
P(yx1 | X = x, W = w) − P(yx0 | X = x, W = w) = 0 ∀x, w,

it could still be the case that neither the direct nor the indirect (nor the spurious) effects are equal to 0. The broader discussion around the Fairness Map, and the idea of decomposing measures into admissible ones, were introduced precisely to avoid such situations. The next example highlights this issue more vividly.


Example 14 (Startup Hiring Continued - Salaries) The startup company from Ex. 9
has closed the hiring season. In the hiring process, the company achieved demographic parity,
which means in this context that 50% of new hires were female. Now, the company needs to
decide on each employee’s salary. In order to be “fair”, each employee is evaluated on how
well they perform their tasks. The salary Y is then determined based on this information,
but, due to a subconscious bias of the executive determining the salaries, gender also affects
how salaries are determined. The SCM M∗ corresponding to this process is:

F∗, P∗(U) :   X ← UX    (204)
              W ← −X + UW    (205)
              Y ← X + W + UY,    (206)
              UX ∈ {0, 1}, P(UX = 1) = 0.5,    (207)
              UW, UY ∼ N(0, 1).    (208)
For any unit u = (ux , uw , uy ), we can compute that

yx1(u) − yx0(u) = (1 + (−1 + uw) + uy) − (0 + (−0 + uw) + uy) = 0,    (209)

showing that the unit-level total effect is 0. Furthermore, for each choice of X = x, W = w, it is also true that

P (yx1 | X = x, W = w) − P (yx0 | X = x, W = w) = 0. (210)

Therefore, both interpretations of the counterfactual fairness criterion are satisfied. How-
ever, direct discrimination against female employees still exists since the fy mechanism in
Eq. 206 assigns a higher salary to male employees. On the other hand, the mechanism fw
in Eq. 205 shows that female employees are better at performing their tasks, and should
therefore be paid more. Nevertheless, the superior performance of female employees in per-
forming their tasks is cancelled out by the direct discrimination favoring males (as witnessed
by Eq. 209). In effect, they are paid the same as they would be had they been male. 
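A minimal Monte Carlo sketch (Python) of this point: in the SCM of Eqs. 204-208, the unit-level total effect is identically zero, yet a direct effect remains (measured here on the E[Y] scale, since Y is continuous). The code below only illustrates the contrast and is not part of the original example.

import numpy as np

rng = np.random.default_rng(2)
n = 1_000_000
u_w, u_y = rng.normal(size=n), rng.normal(size=n)

f_w = lambda x: -x + u_w           # W <- -X + U_W    (Eq. 205)
f_y = lambda x, w: x + w + u_y     # Y <- X + W + U_Y  (Eq. 206)

w_x0, w_x1 = f_w(0.0), f_w(1.0)
y_x0, y_x1 = f_y(0.0, w_x0), f_y(1.0, w_x1)

# Unit-level TE: y_{x1}(u) - y_{x0}(u) vanishes for every unit.
print("max |u-TE| =", np.abs(y_x1 - y_x0).max())
# Direct effect (expected-value analogue of the NDE): E[Y_{x1, W_{x0}} - Y_{x0}] = 1.
print("NDE (E[Y] scale) =", round((f_y(1.0, w_x0) - y_x0).mean(), 3))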

The inability of the total effect to detect direct and indirect effects stems from the fact that the total effect is decomposable (see Corol. 1). The example above illustrates the first critical shortcoming of the criterion proposed by Kusner et al. (2017): as with any other composite measure, an optimization procedure based on it (i.e., one that zeroes the Ctf_fair measure) may lead to unintended side effects and discrimination if implemented in the real world.

Issue 2. Ancestral closure & Spurious effects


The purported criterion rules out, by construction, the possibility of any spurious types of variations. In particular, the argument relies on the notion introduced in the
paper called ancestral closure (AC, for short) w.r.t. the protected attribute set. The AC
requires that all protected attributes and their parents, and all their ancestors, should be
measured and included in the set of endogenous variables. This is obviously a very stringent


requirement, which is hard to ascertain in practice. The paper then argues that “the fault
should be at the postulated set of protected attributes rather than with the definition of
counterfactual fairness, and that typically we should expect set X to be closed under ances-
tral relationships given by the causal graph. For instance, if Race is a protected attribute,
and Mother’s race is a parent of Race, then it should also be in X”.
Conceptually speaking, we contrast this constraint over the space of models with the
very existence of dashed-bidirected arrows in causal diagrams, as discussed earlier. These
arrows in particular allow for the possibility that there are variations between X and Z that
can be left unexplained in the model, or unmeasured confounders may exist. Practically
speaking, assuming that no bidirected arrows exist is a strong assumption that does not hold in many settings. For instance, consider the widely recognized phenomenon in the
fairness literature known as redlining (Zenou and Boccard, 2000; Hernandez, 2009). In
some practical settings, the location where loan applicants live may correlate with their
race. Applications might be rejected based on the zip code, disproportionately affecting
certain minority groups in the real world.
It has been reported in the literature that correlations between gender and location, or religion and location, may exist, and should therefore be acknowledged through
modeling. For instance, the one-child policy affecting mainly urban areas in China had
visible effects in terms of shifting the gender ratio towards males (Hesketh et al., 2005;
Ding and Hesketh, 2006). Beyond race or gender, religious segregation is also a recognized
phenomenon in some urban areas (Brimicombe, 2007). Again, while we make no claim
that location affects race (or religion), or vice-versa, the bidirected arrows give a degree of
modeling flexibility that allows for the encoding of such co-variations. Still, this is done without making any commitment to whatever historical processes and other complex dynamics took place and generated such imbalance in the first place. To corroborate this point,
consider the following example:

Example 15 (Spurious associations in COMPAS & Adult datasets) A data scien-


tist is trying to understand the correlation between the features in the COMPAS dataset.
The protected attribute X is race, and the demographic variables Z1 , Z2 are age and sex.
The data scientist tests two hypotheses, namely:
H0^(1) : X ⊥⊥ Z1,    (211)
H0^(2) : X ⊥⊥ Z2.    (212)

The associations of X with Z1 and Z2 are shown graphically in the bottom row of Fig. 14. Both
of the hypotheses are rejected (p-values < 0.001). However, possible confounders of this
relationship are not measured in the corresponding dataset.
Similarly, the same data scientist is now trying to understand the correlation of the
features in the Adult dataset. The protected attribute X is gender, and the demographic
variables Z1 , Z2 are age and race. The data scientist tests the independence of sex and age
(X ⊥⊥ Z1), and sex and race (X ⊥⊥ Z2), and both hypotheses are rejected (p-values < 0.001,
see Fig. 14 top row). Again, possible confounders of this relationship are not measured
in the corresponding dataset, meaning that the attribute X cannot be separated from the
confounders Z1 , Z2 using any of the observed variables. 
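A hedged sketch (Python) of how such independence tests could be run with a chi-squared test is given below. The file path and the column names ("race", "age_cat", "sex") are hypothetical placeholders for the actual dataset at hand, and continuous confounders such as age would first need to be discretized (or tested with a different procedure).

import pandas as pd
from scipy.stats import chi2_contingency

df = pd.read_csv("compas.csv")    # hypothetical path to a local copy of the dataset

def independence_pvalue(data, a, b):
    # Chi-squared test of H0: a independent of b, based on the contingency table.
    table = pd.crosstab(data[a], data[b])
    _, p_value, _, _ = chi2_contingency(table)
    return p_value

print("X (race) vs Z1 (age):", independence_pvalue(df, "race", "age_cat"))
print("X (race) vs Z2 (sex):", independence_pvalue(df, "race", "sex"))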


Figure 14: Testing for independence of the protected attribute (X) and the confounders
(Z) on the Adult and COMPAS datasets.

As this example illustrates, from both a conceptual and a practical standpoint, disallowing the possibility of non-causal relationships and confounding induced by some historical or societal context, and the associated spurious effects, can be a major limitation to any type of fairness analysis.

Issue 3. Lack of identifiability


An important practical property of any fairness measure is its identifiability under different
sets of causal assumptions. We introduced the notion of identifiability in Sec. 4.3 to better
understand when a fairness measure can be used in practice. We then discussed some
necessary assumptions for measures in the Fairness Map to be identifiable. A significant
implication of this prior discussion in the context of counterfactual fairness is highlighted
by the following result:

Proposition 4 (Unit-TE, (x, w)-TE not identifiable) Suppose that M is a Markovian


model and that G is the associated causal diagram. Assume that the set of mediators between
X and Y is non-empty, W 6= ∅. Then, the measures unit-TEx0 ,x1 (y) and (x, w)-TEx0 ,x1 (y |
x, w) are not identifiable from observational data, even if the fully specified diagram G is
known.

The proposition shows that the measures on which counterfactual fairness is based are
never computable from observational data and the causal diagram, even for models in
which Markovianity is assumed to hold, a strong assumption. The main issue with these


quantities is that they require knowledge of the joint distribution of the counterfactual outcomes Yx1, Yx0, which are never observed at the same time (see footnote 24).
The issue discussed above obviously curtails the generality of the proposed method,
since the underlying measures are not identifiable immediately, as illustrated next.
Example 16 (Non-ID of Ctf^(u)_fair, Ctf^(p)_fair – Startup Salaries Continued) Consider the SCM M∗ of the Startup Salaries example (Ex. 14) given in Eqs. 204-206. In M∗ we showed
that

(x, w)-TEx0 ,x1 (y | x, w) = 0. (213)

Consider now an alternative SCM M0 given by:

F0, P0(U) :   X ← UX    (214)
              W ← −X + (−1)^X UW    (215)
              Y ← X + W + UY,    (216)

with the same distribution P(u) over the units as for M∗. It can be verified that M0 generates the same observational distribution as M∗ and has the same causal diagram G. However, notice that for u = (1, uw, uy), we have

u-TEx0,x1(y) = yx1(u) − yx0(u) = −2uw ≠ 0 whenever uw ≠ 0.    (217)

Furthermore, we have that

(x, w)-TEx0,x1(y | x, w > 0) ≠ 0.    (218)

Therefore, M∗ and M0 generate the same observational distribution and have the same
causal diagram, but differ substantially with respect to counterfactual fairness. 
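The following minimal simulation (Python) illustrates the pair M∗ and M0 above: the two models produce the same observational behaviour (checked here via the conditional means of W and Y given X), yet their unit-level total effects differ. This is only a sketch of the non-identifiability argument, not a proof.

import numpy as np

rng = np.random.default_rng(3)
n = 1_000_000
u_x = rng.binomial(1, 0.5, n).astype(float)
u_w, u_y = rng.normal(size=n), rng.normal(size=n)

def sample(model, x):
    # W mechanism differs between the two models; Y <- X + W + U_Y in both.
    w = -x + u_w if model == "M*" else -x + ((-1.0) ** x) * u_w
    return w, x + w + u_y

for model in ("M*", "M0"):
    w, y = sample(model, u_x)                       # observational regime
    print(model, "E[W | X=1] =", round(w[u_x == 1].mean(), 3),
                 " E[Y | X=1] =", round(y[u_x == 1].mean(), 3))

for model in ("M*", "M0"):
    _, y_x1 = sample(model, np.ones(n))             # counterfactual regimes
    _, y_x0 = sample(model, np.zeros(n))
    print(model, "max |u-TE| =", round(np.abs(y_x1 - y_x0).max(), 3))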

The example constructed above is not atypical, but stems from the general non-identifiability
result in Prop. 4. These results raise the question as to whether counterfactual fairness criteria – either Ctf^(u)_fair or Ctf^(p)_fair – can be used for the purpose of bias detection in any practical
setting. In fact, to circumvent the identifiability issue discussed above, the proposal of the
paper is that “the model M∗ must be provided” (Kusner et al., 2017, Sec. 4.2). This means
that the fully specified causal model M∗ is needed to assess the existence of discrimination.
The assumptions put forward in our manuscript are concerned with constructing the causal
diagram G, or the simplified version of the diagram in the form of an SFM. In stark contrast,
the assumptions needed to provide the model M∗ are orders of magnitude stronger than
those needed for constructing the causal diagram or the SFM. This level of knowledge re-
quires reading the intentions and minds of decision-makers, or having access to the internal
systems and strategic secrets of companies, which are usually not accessible to outsiders.
On the more mathematical side, as alluded to earlier, inducing such a structural model from
observational data alone is almost never possible (Bareinboim et al., 2022, Thm. 1).
24. Such quantities can be identified under additional, stronger assumptions, such as monotonicity (Tian
and Pearl, 2000; Plečko and Meinshausen, 2020).


4.4.2 Criterion 2. Individual fairness


In this section, we discuss a prominent measure introduced by Dwork et al. (2012) called
individual fairness (IF, for short). One of the most natural intuitions behind fairness is
that if we constraint the population in a way that the units are the same but for the
protected attribute, this would allow us to make claims about the impact of variations of
this attribute. In fact, since nothing else remains to explain the observed disparities, the
differences in outcome would be attributable to the change in the protected attribute.
To ground this intuition, we introduced in Sec. 3 the explainability plane (Fig. 7) that
spans the population and the mechanisms axes. In terms of the population axis, we noted
that as the event E = e is enlarged, the corresponding measure of fairness became more and
more individualized. Formally, the restriction on the observed information translates into
a more precise subpopulation of the space of unobservable units U. The analysis discussed
earlier relied on three observations that will be key to compare other causal measures
with the IF measure, and try to understand its causal implications. First, the plane is
contingent on the assumptions encoded in the SFM. As we will show formally, assumptions
about the underlying causal structure are also relevant in the framework of IF. Secondly,
the explainability plane considers admissibility and power of different measures, and we
use these notions to place and understand the IF condition in the context of the Fairness
Map. Thirdly, as highlighted by our analysis of the FPCFA, optimizing based on a specific
composite criterion may in fact fail to remove bias that could be in principle detected when a
more fine-grained analysis of the causal mechanisms generating the disparity is undertaken.
We discuss conditions under which the IF framework is optimizing based on composite
measures, with practical examples in which this may lead to unintended and potentially
harmful side effects. We start with the definition of individual fairness:

Definition 25 (Individual Fairness) Let d be a fairness metric on X × Z × W. An


outcome Y is said to satisfy individual fairness if

|P (y | x, z, w) − P (y | x0 , z 0 , w0 )| ≤ d((x, z, w), (x0 , z 0 , w0 )), (219)

∀ x, x0 , w, w0 , z, z 0 .

The framework of IF assumes the existence of a fairness metric d that computes the dis-
tance between two individuals described by attributes (x, z, w) and (x0 , z 0 , w0 ), while the
outcome y is not taken into account. In words, IF requires that individuals who are similar
with respect to metric d need to have a similar outcome. This requirement is represented
by a Lipschitz property in Eq. 219. If the distance between two values of the covariates, d((x, z, w), (x0, z0, w0)), is smaller than ε, then the criterion in Eq. 219 implies that individuals who coincide with these covariate values must have a similar probability of a positive outcome, that is

|P(y | x, z, w) − P(y | x0, z0, w0)| ≤ ε.    (220)

We now look at the implications of the IF criterion, and observe some possible shortcomings
that can result from ignoring the causal structure.
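Before turning to the examples, the following small sketch (Python) shows one way the Lipschitz condition in Eq. 219 could be audited empirically: sample pairs of covariate vectors and check the inequality for a given predictor. Both predict_proba and the metric d are hypothetical inputs supplied by the analyst; this is an illustration, not a prescribed procedure.

import numpy as np

def audit_if(predict_proba, d, covariates, n_pairs=10_000, seed=0):
    # Returns the fraction of sampled pairs violating
    # |P(y | v) - P(y | v')| <= d(v, v'), cf. Eq. 219.
    rng = np.random.default_rng(seed)
    i = rng.integers(0, len(covariates), size=n_pairs)
    j = rng.integers(0, len(covariates), size=n_pairs)
    gaps = np.abs(predict_proba(covariates[i]) - predict_proba(covariates[j]))
    dists = np.array([d(covariates[a], covariates[b]) for a, b in zip(i, j)])
    return float(np.mean(gaps > dists + 1e-12))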


Example 17A (Company A):
    SCM M, F :   X ← UXY    (221)
                 Z ← UZ    (222)
                 Y ← X − UXY + Z + UY    (223)
    P(u) :       UXY ∼ Bernoulli(0.5), UZ, UY ∼ N(0, 1)
    diagram G :  X → Y, Z → Y, with a bidirected edge X ↔ Y

Example 17B (Company B):
    SCM M, F :   X ← UXZ    (224)
                 Z ← UXZ + UZY    (225)
                 Y ← UZY + UY    (226)
    P(u) :       UXZ ∼ Bernoulli(0.5), UZY, UY ∼ N(0, 1)
    diagram G :  bidirected edges X ↔ Z and Z ↔ Y

Table 2: An example of two situations in which the IF criterion has different meanings.

Issue 1. IF is oblivious to causal structure

The IF definition in Eq. 219 is agnostic with respect to the underlying causal structure that
generated the data. We start with two examples of a hiring process that are on the surface
similar, but differ with respect to the underlying causal structure. As we will see, this will
show that the implications of the IF criterion can be quite different, which will highlight
the fact that the causal structure cannot be dismissed when using this criterion.

Example 17 (Startup Hiring III) Suppose that two startup companies, A and B, are
hiring employees. Let X (sex) represent the protected attribute, Z the candidate's performance on an aptitude test, and Y the overall score for job hiring. The set of mediators
W is in this case empty. The hiring process is similar, yet there is a difference between the
two companies. In both instances, we assume age is a latent, unobserved factor, which has
shared information with gender. In company A, age affects the salary directly, whereas in
company B, age affects the aptitude test result. Additionally, in company B the aptitude test
result has shared information with the salary, represented by the unobserved variable which
measures how much the candidate prepared for the interview day. The respective SCMs and
causal diagrams are shown in Table 2. Suppose that the fairness metric d is in both cases is

d((x, z), (x0 , z 0 )) = |z − z 0 |. (227)

Then, the IF criterion can be written as

|E[y | x, z] − E[y | x0, z0]| ≤ d((x, z), (x0, z0)) = |z − z0|, ∀x, x0, z, z0.    (228)


Notice that in company A, we can compute that

E^{MA}[y | x, z] = E^{MA}[X − UXY + Z + UY | x, z]    (229)
                 = E^{MA}[X − UXY | x, z] + E^{MA}[Z | x, z] + E^{MA}[UY | x, z]    (230)
                 = z,    (231)

where the first term is 0 since X = UXY, and the last term is 0 since UY ∼ N(0, 1) with UY ⊥⊥ Z, X. Therefore, we can conclude that

|E^{MA}[y | x1, z] − E^{MA}[y | x0, z0]| = |z − z0|.    (232)

In company B, however, we can compute:

E^{MB}[y | x, z] = E^{MB}[UZY + UY | x, z]    (233)
                 = E^{MB}[Z − UXZ | x, z] + E^{MB}[UY | x, z]    (234)
                 = E^{MB}[Z − X | x, z] = z − x,    (235)

where the second term is 0 since UY ∼ N(0, 1) with UY ⊥⊥ Z, X. Therefore, the IF criterion is not satisfied, which can be shown by computing:

|E^{MB}[y | x1, z] − E^{MB}[y | x0, z0]| = |z − 1 − z0|.    (236)

When assessing direct discrimination on a structural level, in company A, the mechanism


fy in Eq. 223 shows the presence of direct discrimination. In company B, however, the
mechanism fy in Eq. 226 shows no direct discrimination. We could pick a more empirical
measure of DE, such as the NDE (Def. 16). Evaluating the NDE using the generated data:

NDE^{MA}_{x0,x1}(y) = 1,    (237)
NDE^{MB}_{x0,x1}(y) = 0,    (238)

which is consistent with the observed discrimination at the structural level. 
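As a quick illustration (Python), one can simulate the two SCMs from Table 2 and recover the values in Eqs. 237-238 by Monte Carlo; since the mediator set W is empty here, the NDE coincides with the TE, and the code below is only a sketch of that computation.

import numpy as np

rng = np.random.default_rng(4)
n = 1_000_000

# Company A (Eqs. 221-223): Y <- X - U_XY + Z + U_Y, with X <- U_XY (confounded).
u_xy = rng.binomial(1, 0.5, n).astype(float)
z_a, u_y_a = rng.normal(size=n), rng.normal(size=n)
y_a = lambda x: x - u_xy + z_a + u_y_a
nde_a = (y_a(1.0) - y_a(0.0)).mean()

# Company B (Eqs. 224-226): Y <- U_ZY + U_Y does not depend on X at all.
u_zy, u_y_b = rng.normal(size=n), rng.normal(size=n)
y_b = lambda x: u_zy + u_y_b          # f_y ignores its X argument by construction
nde_b = (y_b(1.0) - y_b(0.0)).mean()

print(f"NDE in company A ~ {nde_a:.3f} (Eq. 237), company B ~ {nde_b:.3f} (Eq. 238)")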

Somewhat paradoxically, the example illustrates that in company A direct discrimination


exists, yet the IF criterion is satisfied, whereas in company B the criterion is not fulfilled,
but there is no direct discrimination. This example, even though perhaps surprising at
first, is reflective of the fact that IF does not take the causal structure into account. Our
conclusion is that without the causal diagram, the consequences of using IF might be un-
clear. Therefore, from this point forward, we assume the SFM structure, and look at the
IF framework in this fixed context.

Issue 2. IF captures the direct effect only under the SFM


We next show that under the assumptions of the standard fairness model, the IF condition
given in Eq. 219 has causal implications. In other words, we investigate where the IF
condition can be placed in the Fairness Map in Fig. 12. An initial difficulty arises from the


fact that the IF criterion is not written in the form of a contrastive measure (contrasts were studied in Sec. 3). Therefore, instead of using the exact IF criterion, we look at a criterion
that is implied by the IF criterion, but is itself a contrastive measure. This criterion is
based on the measure known as the observational direct effect:

Definition 26 (Observational direct effect) The observational direct effect (Obs-DE,


for short) is defined as

Obs-DEx0 ,x1 (y | z, w) = P (y | x1 , z, w) − P (y | x0 , z, w). (239)

Based on this measure, we define the Obs-DE-fair criterion as:

Obs-DE-fair ⇐⇒ Obs-DEx0 ,x1 (y | z, w) = 0 ∀z, w. (240)

The Obs-DE-fair criterion is implied by IF whenever the fairness metric d satisfies

d((x1 , z, w), (x0 , z, w)) = 0 ∀z, w, (241)

that is, when the metric d does not depend on the protected attribute X. The Obs-
DE condition can then be obtained from Eq. 219 by setting (x, z, w) = (x1 , z, w) and
(x0 , z 0 , w0 ) = (x0 , z, w). The Obs-DE criterion, which is implied by the IF condition under
certain assumptions, is admissible with respect to the structural direct criterion:

Proposition 5 (Admissibility of Obs-DE w.r.t. Str-DE and IF) Suppose that the met-
ric d does not depend on the X variable, that is

d((x, z, w), (x0 , z 0 , w0 )) = d((z, w), (z 0 , w0 )). (242)

Then, the IF criterion in Eq. 219 implies the Obs-DE-fair criterion in Eq. 240. Fur-
thermore, under the assumptions of the standard fairness model the Obs-DE measure is
admissible with respect to Str-DE, that is

Str-DE-fair =⇒ Obs-DE-fair. (243)

A further positive result shows that the Obs-DE criterion is in fact powerful in the context
of detecting direct discrimination (again under suitable assumptions):

Proposition 6 (Power of IF w.r.t. Str-DE) Suppose that the Obs-DE-fair criterion in


Eq. 240 holds. Under the assumptions of the standard fairness model, the Obs-DE measure
is more powerful than z-DE, x-DE and NDE:

Obs-DE-fair ◦−→ z-DE-fair ◦−→ x-DE-fair ◦−→ NDE-fair. (244)

Under the SFM (see footnote 25), P(y | x1, z, w) − P(y | x0, z, w) equals what is known as the controlled


direct effect

CDEx0 ,x1 := P (yx1 ,z,w ) − P (yx0 ,z,w ). (245)


25. The exact assumption needed here can be written as Yx,z,w ⊥⊥ X, Z, W. This assumption is encoded in the SFM.


Therefore, under certain assumptions, the constraint implied by IF in fact precludes the
existence of a direct effect and has a valid causal interpretation. Importantly, the assump-
tions that are needed are of a causal nature, and ignoring the causal diagram of the data
generating model can lead to undesired consequences when using the IF condition (see
Ex. 17).
To continue the discussion, we consider two distinct cases when choosing the fairness
metric d, on which much of the IF framework relies:

(i) metric d is sparse, meaning that it does not depend on all variables in the sets Z, W ,

(ii) metric d is complete, meaning that it depends on all variables in the sets Z, W .

We now consider these two cases separately, and point out their possible drawbacks. We
emphasize that our goal is not to pick a metric but to shed light on the fundamental
interplay between the arguments/properties of the fairness metric d and the underlying
causal mechanisms, which describe how the decision-making process takes place in the real world and where the data is collected from.

Issue 3. Sparse metrics d lead to lack of admissibility


From individual to global. Suppose that the IF condition in Eq. 219 holds. Under
suitable causal assumptions, the condition precludes the existence of direct discrimination,
as was shown above. However, even if the IF condition holds, the disparity between the
groups corresponding to X = x0 and X = x1 (measured by the TV) could still be large, if
the conditional distributions

Z, W | X = x0 and Z, W | X = x1

differ. This observation leads to the second step of the framework of Dwork et al. (2012).
The authors provide the following significant result:

Proposition 7 (Optimal Transport bound on TV (Dwork et al., 2012)) Let d be


a fairness metric, and suppose that the individual fairness condition in Eq. 219 holds. Let
the optimal transport cost between Z, W | X = x1 and Z, W | X = x0 be denoted by

OTCdx0 ,x1 ((Z, W )). (246)

Then, the TV measure between the groups is bounded by the optimal transport cost up to a
constant Cd dependent on the metric d only, namely

|TVx0,x1(y)| ≤ Cd · OTC^d_{x0,x1}((Z, W)).    (247)

In words, if the optimal transport (OT) distance between distributions

Z, W | X = x1 and Z, W | X = x0 ,

with the metric d measuring the transport cost, is small, the TV measure is consequently
small as well. Here, however, there is an important nuance, stemming from the decompos-
ability of the TV measure, as shown in the following proposition:


Proposition 8 (Inadmissibility of OTC) The optimal transport cost OTCdx0 ,x1 ((Z, W ))
is not admissible with respect to structural indirect and structural spurious criteria. For-
mally, we write that:
Str-IE-fair ⇏ (OTC^d_{x0,x1}((Z, W)) = 0),    (248)
Str-SE-fair ⇏ (OTC^d_{x0,x1}((Z, W)) = 0).    (249)
To see the relevance of the proposition above, we proceed by means of an example, in
which the above optimal transport distance is small and the TV is minimized, but in which
indirect and spurious discrimination still exist.
Example 18 (Startup Hiring IV) Suppose that a startup company is hiring accoun-
tants. Let X (sex) be the protected attribute, Z be the age of the candidate and W their
performance on an accountancy test, upon which the job decision Y is based. The following
SCM M∗ describes the situation:

F∗, P∗(U) :   X ← UXZ    (250)
              Z ← −UXZ + UZ    (251)
              W ← X + Z + UW    (252)
              Y ← 1(UY < expit(W)),    (253)
              UXZ ∈ {0, 1}, P(UXZ = 1) = 0.5,    (254)
              UZ, UW, UY ∼ Unif[0, 1],    (255)

where expit(x) = e^x / (1 + e^x). The fw mechanism in Eq. 252 shows that older candidates per-
form better at the test, and that women perform better than men, given equal age. However,
due to latent confounding, arising from a specific historical context, women tend to leave
the profession at an earlier age (mechanisms fx , fz in Eq. 250 and 251 show that lower age
is correlated with being female, through the UXZ variable). The causal graph representing
this situation is given by the nodes X, Z, W, Y with edges X → W, Z → W, W → Y, and a bidirected edge X ↔ Z.

Importantly, the marginal distributions W | X = x0 and W | X = x1 are equal in M∗ . An


outside authority, which certifies whether discrimination is present, decides that the metric
d is given by:
d((x, z, w), (x0 , z 0 , w0 )) = |w − w0 |. (256)
In this case, we have that
|P(y | x, z, w) − P(y | x0, z0, w0)| = |expit(w) − expit(w0)|    (257)
                                     ≤ (1/4) |w − w0|,    (258)


where the last inequality follows from an application of the mean value theorem. Fur-
thermore, the optimal transport cost is 0, because the marginal distributions of W are
matching between the groups. There is no direct discrimination, since Y is not a func-
tion of X (Eq. 253). Therefore, the IF criterion is satisfied and the TV measure equals
0. However, when applying the decomposition of TV found in the x-specific solution to
FPCFA(Str-{DE,IE,SE}, TVx0 ,x1 (y)) in Thm. 3, we have that

TVx0,x1(y) = x-DEx0,x1(y | x0) − x-IEx1,x0(y | x0) − x-SEx1,x0(y)    (259)
           = (0%) − (14%) − (−14%),    (260)

with the three terms corresponding to the direct, indirect, and spurious effects, respectively, which indicates that even though the TV equals 0, the spurious and indirect effects exist.

Notice the following about the example. Women, who are naturally better at their jobs,
are interviewed at a younger age. If the source of the confounding comes from the fact that
women (willingly) advance to a different profession in later stages of their career, then the
cancellation of spurious and indirect effects in Eq. 260 might be acceptable. If, however, the
spurious effect stems from a confounding mechanism in which women abandon their careers
for certain adverse reasons, then the situation could reasonably be deemed unfair. Without
causal considerations, these two cases are indistinguishable. This example is inspired by an
example of the original IF paper, which says that “the imposition of a metric already occurs
in many classification processes, including credit scores for loan applications” (Dwork et al.,
2012, Sec. 6.1.1). Notice that such a metric is based on a single mediator W , similar to the
metric in Ex. 18.
A possible objection to Ex. 18 is that the metric d does not include all confounders and
mediators Z, W, which introduces a different set of issues, as discussed next.

Issue 4. Complete metrics d do not allow for business necessity


We now suppose that the fairness metric d includes all variables in Z, W . If this is the case,
then the optimal transport condition implies the independence of X and the Z, W variables,
as shown in the following proposition:

Proposition 9 (OTC =⇒ X ⊥⊥ Z, W) Suppose that the metric d is of the following form

d((x, z, w), (x0, z0, w0)) = ∥z − z0∥ + ∥w − w0∥,    (261)

where ∥ · ∥ is any norm on Rd. Then, we have that the optimal transport condition implies the independence of X and {Z, W}, namely:

OTC^d_{x0,x1}((Z, W)) = 0 =⇒ X ⊥⊥ Z, W.    (262)

Furthermore, if the metric d does not consider X, then the IF condition implies the independence of X and Y conditional on Z, W.

Proposition 10 (IF =⇒ X ⊥⊥ Y | Z, W) Suppose that d is a fairness metric and suppose that the IF condition in Eq. 219 holds. Then, for a binary outcome Y, X ⊥⊥ Y | Z, W.


Finally, putting the above two propositions together implies that the variable X is inde-
pendent from all other observables in V , as shown next:
Proposition 11 (OTC ∧ IF =⇒ X⊥⊥V \ {X}) Suppose that the metric d is of the form
d((x, z, w), (x0, z0, w0)) = ∥z − z0∥ + ∥w − w0∥, where ∥ · ∥ is any norm on Rd. Suppose also that OTC^d_{x0,x1}((Z, W)) = 0 and the IF condition in Eq. 219 holds. Then we have that

X⊥⊥Z, W, Y. (263)
The proposition shows that if (i) the metric d includes all variables in Z, W ; (ii) the IF
condition holds; (iii) the optimal transport distance is small, then the protected attribute
X is independent from all other endogenous variables in the system. As we will discuss
later in Sec. 5, this can be a very strong requirement in practice, which requires completely
removing the influence of X, and is not compatible with considerations about business
necessity under the disparate impact doctrine.

5. Fairness tasks
The main goal of this section is to equip the reader with the tools for solving fairness
problems in practice, building on the foundations introduced in previous sections. We
classify fairness problems into three tasks, in increasing order of difficulty:
Task 1. Bias detection and quantification: the first and most basic task of fair ML. We
may consider operating with a dataset D of past decisions, or in infinite samples with
an observed distribution P (V ) over variables V . The task is to define a mapping
M : P → R,
where P is the set of possible distributions P (V ). M is viewed as a fairness measure
and it is often constructed so that M (P (V )) = 0 would suggest the absence of some
form of discrimination.
Task 2. Fair prediction: The task of fair prediction usually relies on a certain measure of fairness. The task is to learn a distribution P∗(V) while maximizing a utility U(P(V)) and satisfying
    |M(P∗(V))| ≤ ε,
where M is a measure of fairness as discussed in Task 1. Fair classification and fair regression problems fall into this category (see footnote 26).
Task 3. Fair decision-making: In fair decision-making, the well-being of certain groups over
time is considered. Notions of affirmative action also fall into this category. We might
be interested in designing a policy π, which at every time step affects the observed
distribution Pt (V ) (which now changes over time steps) so that we have
Pt+1 (V ) = π(Pt (V )),
and we are, perhaps, interested in controlling how M (Pt (V )) changes with t.
26. Different categories of fair prediction methods exist, namely pre-processing, in-processing, and post-
processing methods. These will be discussed separately in Sec. 5.2.


Note that these three tasks form a certain hierarchy, and are introduced in order of difficulty.
Fair prediction often relies on a specific fairness measure; fair decision-making often relies
on both a fairness measure and fair predictions. The first two tasks are discussed in Sec. 5.1
and Sec. 5.2, respectively, while the last task (fair decision-making) is left for future work.

5.1 Task 1: Bias Detection & Quantification


In the context of Task 1, we distinguish two different, but closely related subtasks. These
subtasks are referred to as bias detection and bias quantification. In bias detection, we are
interested in providing a binary decision rule ψ which determines whether discrimination is
present or not. In bias quantification, we are interested in how strong the discrimination is,
and therefore provide a real-valued number, instead of a binary decision. In what follows,
we give the mathematical formulation of the two subtasks, together with an approach for
how to solve them.

Definition 27 (Bias Detection under SFM) Let Ω be a space of SCMs. Let Q be a


structural fairness criterion, Q : Ω → {0, 1}, determining whether a causal mechanism
within the SCM M ∈ Ω is active (Q(M) = 0 if mechanism not active, Q(M) = 1 if
active). The task of bias detection is to test the hypothesis

H0 : Q(M) = 0, (264)

that is, constructing a mapping ψ(GSFM , D) into {0, 1}, which provides a decision rule for
testing H0 , based on the standard fairness model GSFM and the data D.

In words, we are interested in whether direct, indirect, or spurious discrimination exists (cor-
responding to Q ∈ Str-{DE,IE,SE}, see Def. 9). The null hypothesis H0 assumes that
discrimination is not present, and the decision rule ψ determines whether H0 should be
rejected based on the SFM and the available data. Notice, crucially, that ψ is a function
of GSFM and D. This stems from the fact that the SCM M is never available to the data
scientist. Therefore, we cannot directly reason about Q(M), but instead need to find an
admissible measure µ that satisfies

Q(M) = 0 =⇒ µ(M) = 0, (265)

where µ(M) can be computed in practice. Recall the result from Prop. 1 which shows that
the TV measure is not admissible with respect to Str-{DE,IE,SE} and therefore should
not be used for bias detection when one is interested in direct, indirect, and spurious
effects. Moreover, we note that solving the bias detection task depends on solving the
FPCFA(Str-{DE,IE,SE}, TVx0 ,x1 (y)), which we now restate in a form more suitable for
Task 1:

Definition 28 (FPCFA continued for Task 1) [Ω, Q as before] Let M = ⟨V, U, P (U ), F ⟩ be the
true, unobserved generative SCM, let A be a set of assumptions, and let P (v)
be the observational distribution generated by M. Let ΩA be the space of all SCMs compatible
with A. The Fundamental Problem of Causal Fairness Analysis is to find a collection of
measures µ1 , . . . , µk such that the following properties are satisfied:


(1) µ is decomposable w.r.t. µ1 , . . . , µk ;

(2) µ1 , . . . , µk are admissible w.r.t. the structural fairness criteria Q1 , Q2 , . . . , Qk .

(3) µ1 , . . . , µk are as powerful as possible.

(4) µ1 , . . . , µk are identifiable from the observational distribution P (v) and class ΩA .

The final step of FPCFA for Task 1 is

(5) estimate µ1 , . . . , µk and their (1 − α) confidence intervals from the observational data
and the SFM projection of the causal diagram.

Upon solving FPCFA(Str-{DE,IE,SE}, TVx0 ,x1 (y)) for Task 1, we obtain measures µi based
on which the decision rule ψ can be constructed. In particular, the decision rule ψ will be
constructed by computing the (1 − α) confidence interval for µi using bootstrap. If the
interval excludes 0, the H0 hypothesis is rejected.
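For concreteness, a minimal sketch of this bootstrap-based decision rule is given below. The function estimate_measure is a hypothetical placeholder for any estimator of a measure µi (e.g., a plug-in estimator of x-DE^sym discussed next), and the data object is assumed to support NumPy-style integer indexing of its rows.

```python
import numpy as np

def bootstrap_ci(data, estimate_measure, alpha=0.05, n_boot=1000, seed=0):
    """(1 - alpha) bootstrap confidence interval for a fairness measure mu_i."""
    rng = np.random.default_rng(seed)
    n = len(data)
    estimates = np.empty(n_boot)
    for b in range(n_boot):
        idx = rng.integers(0, n, size=n)       # resample units with replacement
        estimates[b] = estimate_measure(data[idx])
    return np.quantile(estimates, [alpha / 2, 1 - alpha / 2])

def decision_rule_psi(data, estimate_measure, alpha=0.05):
    """Reject H0: mu_i = 0 iff the bootstrap interval excludes zero."""
    lo, hi = bootstrap_ci(data, estimate_measure, alpha)
    return int(not (lo <= 0.0 <= hi))          # 1 = H0 rejected
```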
The derived measures µi obtained from solving FPCFA(Str-{DE,IE,SE}, TVx0 ,x1 (y)) for
Task 1 can also be used for the related task of bias quantification:

Definition 29 (Bias Quantification under SFM) Let Ω be a space of SCMs and let
(Qi )i=1:3 = Str-{DE,IE,SE}. The task of bias quantification is concerned with finding a
mapping φ : Ω → R3 where the i-th component φi is admissible with respect to Qi .

In words, the amount of discrimination is summarized using a 3-dimensional statistic. Each


component of the statistic corresponds to one of the direct, indirect, or spurious effects.
The measures µi obtained from FPCFA(Str-{DE,IE,SE}, TVx0 ,x1 (y)) can be used to solve
the task of bias quantification, by setting

φ(M) = (µDE (M), µIE (M), µSE (M)). (266)

We can now discuss a specific proposal for the measures µi .

Measures µi for Task 1. Following the x-specific solution of FPCFA(Str-{DE,IE,SE}, TVx0 ,x1 (y))
from Thm. 3, we use the following measures:

µDE is given by x-DEx0 ,x1 (y | x0 ) = P (yx1 ,Wx0 | x0 ) − P (yx0 | x0 ) (267)


µIE is given by x-IEx1 ,x0 (y | x0 ) = P (yx1 ,Wx0 | x0 ) − P (yx1 | x0 ) (268)
µSE is given by x-SEx1 ,x0 (y) = P (yx1 | x0 ) − P (yx1 | x1 ). (269)

Moreover, the solution also showed that the TV can be decomposed as:

TVx0 ,x1 (y) = x-DE_{x0,x1}(y | x0) − x-IE_{x1,x0}(y | x0) − x-SE_{x1,x0}(y), (270)
where the three terms on the right-hand side correspond to µDE, µIE, and µSE, respectively.

In words, the TV equals the x-specific direct effect with a transition x0 → x1 , minus the
x-specific indirect effect with the opposite transition x1 → x0 and minus the x-specific


spurious effect with the transition x1 → x0 . One critical point to note is that such a
decomposition is not unique, since the TV can also be decomposed as:

TVx0 ,x1 (y) = −x-DE_{x1,x0}(y | x0) + x-IE_{x0,x1}(y | x0) − x-SE_{x1,x0}(y). (271)

To achieve symmetry and avoid picking a specific order, we propose using the average of
the two decompositions. In particular, define the symmetric x-specific direct and indirect
effects as:

Definition 30 (Symmetric x-specific direct and indirect effect) The symmetric x-specific direct and indirect effects are defined as:

x-DE^sym_x(y | x) = 1/2 ( x-DE_{x0,x1}(y | x) − x-DE_{x1,x0}(y | x) )    (272)
x-IE^sym_x(y | x) = 1/2 ( x-IE_{x0,x1}(y | x) − x-IE_{x1,x0}(y | x) ).    (273)
Therefore, we propose to use x-DE^sym_x(y | x0) and x-IE^sym_x(y | x0) instead of the single-transition
measures x-DE_{x0,x1}(y | x0) and x-IE_{x1,x0}(y | x0) for Task 1. The benefit of these alternative measures is that no
single transition x0 → x1 has to be chosen for computing the direct/indirect effect, but
both x0 → x1 and x1 → x0 transitions are considered, by taking the average of the two.
Such an approach offers measures of direct and indirect effect which are symmetric with
respect to the change in the protected attribute, unlike their counterparts that consider a
single transition.

5.1.1 Legal Doctrines - A Formal Approach


Equipped with specific measures that can be used to perform bias detection and quantifi-
cation, we offer a formal approach for assessing the legal doctrines of disparate treatment
and disparate impact. Our operational approach is described in Algorithm 1, and is one of the
highlights of the manuscript. The algorithm takes as inputs the dataset D, the SFM projection
ΠSFM (G) of the causal diagram, and the Business Necessity Set (BN-set). When using
the SFM, the allowed BN-sets are ∅, {Z}, {W }, and {Z, W }27 . We next apply the Fairness
Cookbook in practice.

5.1.2 Empirical Evaluation


The practical usefulness of the Fairness Cookbook for Task 1 is demonstrated on two exam-
ples. First, we apply the cookbook to the task of bias detection on the US Census 2018
dataset. After that, we apply it to the task of temporal bias quantification on a College
Admissions dataset.

Example 19 (US Government Census 2018) The United States Census of 2018 col-
lected broad information about US government employees, including demographic infor-
mation Z (Z1 for age, Z2 for race, Z3 for nationality), gender X (x0 female, x1 male), mar-
ital and family status M , education information L, and work-related information R. In an
27. Handling more involved BN-sets is discussed in detail in Sec. 6.


Algorithm 1 Fairness Cookbook for Task 1

• Inputs: Dataset D, SFM projection ΠSFM (G), Business Necessity Set BN-set.
1: Obtain the dataset D.
2: Determine the Standard Fairness Model projection ΠSFM (G) of the causal diagram G corresponding to the SCM M. Note that the full diagram G need not be specified for this.
   Additionally: are there known bidirected edges between the X, Z, W , and Y groups? If yes, go to Appendix B and consider the estimation in the presence of bidirected edges. Otherwise continue to the next step.
3: Consider the existence of Disparate Treatment:
   • compute the measure x-DE^sym_x(y | x0) and its 95% confidence interval (for bias quantification, return this result and skip to the next step)
   • test the hypothesis
        H0^(x-DE) : x-DE^sym_x(y | x0) = 0.    (274)
     – if H0^(x-DE) not rejected =⇒ no evidence of disparate treatment
     – if H0^(x-DE) rejected =⇒ evidence of disparate treatment
   • Additionally: if there is no evidence of disparate treatment in the overall population, for Z = z test the hypothesis H0^(z-DE) : z-DE^sym_x(y | z) = 0.
4: Consider the existence of Disparate Impact:
   • compute the measures x-IE^sym_x(y | x0) and x-SE_{x1,x0}(y) and their 95% confidence intervals (for bias quantification, return this result and terminate the algorithm)
   • if W ∉ BN-set, test the hypothesis
        H0^(x-IE) : x-IE^sym_x(y | x0) = 0.    (275)
     – if H0^(x-IE) not rejected =⇒ no evidence of disparate impact
     – if H0^(x-IE) rejected =⇒ evidence of disparate impact
     – Additionally: if there is no evidence of disparate impact in the overall population, for Z = z test the hypothesis H0^(z-IE) : z-IE^sym_x(y | z) = 0.
   • if Z ∉ BN-set, test the hypothesis
        H0^(x-SE) : x-SE_{x1,x0}(y) = 0.    (276)
     – if H0^(x-SE) not rejected =⇒ no evidence of disparate impact
     – if H0^(x-SE) rejected =⇒ evidence of disparate impact
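As a rough illustration of the control flow of Algorithm 1, the following sketch chains the bootstrap decision rule from above with three hypothetical estimator functions (est_de_sym, est_ie_sym, est_se) for x-DE^sym, x-IE^sym, and x-SE; it is a simplified driver, not a full implementation (the z-specific follow-up tests and the bidirected-edge check are omitted).

```python
def fairness_cookbook(data, bn_set=frozenset(), est_de_sym=None,
                      est_ie_sym=None, est_se=None, alpha=0.05):
    """Simplified decision logic of the Fairness Cookbook (Algorithm 1)."""
    report = {}
    # Step 3: disparate treatment is assessed via the (symmetric) direct effect.
    report["disparate_treatment"] = bool(decision_rule_psi(data, est_de_sym, alpha))
    # Step 4: disparate impact is assessed via indirect and spurious effects,
    # skipping whichever variables belong to the business necessity set.
    impact_tests = []
    if "W" not in bn_set:
        impact_tests.append(bool(decision_rule_psi(data, est_ie_sym, alpha)))
    if "Z" not in bn_set:
        impact_tests.append(bool(decision_rule_psi(data, est_se, alpha)))
    report["disparate_impact"] = any(impact_tests)
    return report
```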


Figure 15: Measures obtained when applying the Fairness Cookbook for Task 1 on the
Government Census 2018 dataset.

initial analysis, a data scientist observed that male employees on average earn $14000/year
more than female employees, that is

E[y | x1] − E[y | x0] = $14000. (277)

Following the Fairness Cookbook, the data scientist does the following:
SFM projection: the SFM projection of the causal diagram G of this dataset is given by

ΠSFM (G) = ⟨X = {X}, Z = {Z1 , Z2 , Z3 }, W = {M, L, R}, Y = {Y }⟩. (278)

Disparate treatment: when considering disparate treatment, she computes x-DE^sym_x(y | x0)
and its 95% confidence interval to be

x-DE^sym_x(y | x0) = $9980 ± $1049. (279)
The hypothesis H0^(x-DE) is thus rejected, providing evidence of disparate treatment of females.
Disparate impact: when considering disparate impact, she computes x-IE^sym_x(y | x0) and
x-SE_{x1,x0}(y) and their respective 95% confidence intervals:

x-IE^sym_x(y | x0) = $5126 ± $778, (280)
x-SE_{x1,x0}(y) = −$1675 ± $955. (281)

The data scientist decides that the differences in salary explained by the spurious correlation
of gender with age, race, and nationality are not considered discriminatory. Therefore, she
tests the hypothesis

H0^(x-IE) : x-IE^sym_x(y | x0) = 0,

which is rejected, indicating evidence of disparate impact on the female employees of the
government. Measures computed in the example are visualized in Fig. 15. □


Example 20 (Bias Quantification in College Admissions) A university in the United


States admits applicants every year. The university is interested in quantifying discrimi-
nation in the admission process and tracking it over time, between 2010 and 2020. The data
generating process changes over time, and can be described as follows. Let X denote gender
(x0 female, x1 male). Let Z be the age at time of application (Z = 0 under 20 years, Z = 1
over 20 years) and let W denote the department of application (W = 0 for arts&humanities,
W = 1 for sciences). Finally, let Y denote the admission decision (Y = 0 rejection, Y = 1
acceptance). The application process changes over time and is given by


 X ← 1(UX < 0.5 + 0.1UXZ ) (282)

Z ← 1(UZ < 0.5 + κ(t)UXZ ) (283)





 W ← 1(UW < 0.5 + λ(t)UXZ ) (284)



F(t), P (U ) : Y ← 1(UY < 0.1 + α(t)X + β(t)W + 0.1Z). (285)






UXZ ∈ {0, 1}, P (UXZ = 1) = 0.5, (286)





UX , UZ , UW , UY ∼ Unif[0, 1]. (287)

The coefficients κ(t), λ(t), α(t), β(t) change every year, and obey the following dynamics:

κ(t + 1) = 0.9κ(t) (288)


λ(t + 1) = λ(t)(1 − β(t)) (289)
β(t + 1) = β(t)(1 − λ(t))f (t), f (t) ∼ Unif[0.8, 1.2] (290)
α(t + 1) = 0.8α(t). (291)

The equations can be interpreted as follows. The coefficient κ(t) decreases over time, mean-
ing that the overall age gap between the groups decreases. The coefficient λ(t) decreases
compared to the previous year, by an amount dependent on β(t). In words, the rate of ap-
plication to arts&humanities departments decreases if these departments have lower overall
admission rates (i.e., students are less likely to apply to departments that are hard to get
into). Further, α(t), which represents gender bias, decreases over time. Finally, β(t) represents
the increase in the probability of admission when applying to a science department. Its
value depends on the value from the previous year, multiplied by (1 − λ(t)) and the random
variable f (t). Multiplication by the former factor describes the mechanism in which the
benefit of applying to a science department decreases if a larger proportion of students apply
for it. The latter factor captures random variation over time in how well (in relative terms)
the science departments are funded, and can be seen as depending on research and market
dynamics in the sciences.
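A short simulation of this data-generating process, implementing Eqs. (282)–(291) exactly as stated, might look as follows; the initial values of κ, λ, α, β and the sample size are not specified in the example and are chosen arbitrarily here.

```python
import numpy as np

def simulate_year(n, kappa, lam, alpha, beta, rng):
    """One year of admissions data from the SCM in Eqs. (282)-(285)."""
    u_xz = rng.binomial(1, 0.5, n)                       # shared exogenous U_XZ
    u_x, u_z, u_w, u_y = rng.uniform(size=(4, n))
    x = (u_x < 0.5 + 0.1 * u_xz).astype(int)             # Eq. (282)
    z = (u_z < 0.5 + kappa * u_xz).astype(int)           # Eq. (283)
    w = (u_w < 0.5 + lam * u_xz).astype(int)             # Eq. (284)
    y = (u_y < 0.1 + alpha * x + beta * w + 0.1 * z).astype(int)  # Eq. (285)
    return x, z, w, y

def simulate_2010_2020(n=10_000, kappa=0.2, lam=0.3, alpha=0.2, beta=0.3, seed=0):
    """Evolve the coefficients with Eqs. (288)-(291) and simulate each year."""
    rng = np.random.default_rng(seed)
    data = {}
    for t in range(2010, 2021):
        data[t] = simulate_year(n, kappa, lam, alpha, beta, rng)
        f_t = rng.uniform(0.8, 1.2)                      # funding shock f(t)
        kappa, lam, alpha, beta = (0.9 * kappa,          # Eq. (288)
                                   lam * (1 - beta),     # Eq. (289)
                                   0.8 * alpha,          # Eq. (291)
                                   beta * (1 - lam) * f_t)  # Eq. (290)
    return data
```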
The head data scientist at the university decides to use the Fairness Cookbook for per-
forming bias quantification. The SFM projection of the causal diagram G of the dataset is
given by
ΠSFM (G) = ⟨X = {X}, Z = {Z}, W = {W }, Y = {Y }⟩. (292)
After that, the analyst estimates the quantities

x-DE^sym_x(y | x0), x-IE^sym_x(y | x0), and x-SE_{x1,x0}(y), ∀t ∈ {2010, . . . , 2020}. (293)


Figure 16: Tracking bias over time in the synthetic College Admissions dataset from Ex. 20,
between years 2010 and 2020. Both the estimated values from simulated samples (solid line)
and the true population values (dashed lines) are shown, for direct (red), indirect (green),
and spurious (blue) effects.

The temporal dynamics of the estimated measures of discrimination (together with the
ground truth values obtained from the SCM Mt ) are shown graphically in Fig. 16. 

5.2 Task 2: Fair Prediction


We are now ready to discuss Task 2, which builds on similar foundations as the previous
task. The section is organized as follows.

(i) We first discuss previous literature on (fair) prediction; in particular, we discuss post-
processing, in-processing, and pre-processing methods.

(ii) We formalize the FPCFA(Str-{DE,IE,SE}, TVx0 ,x1 (y)) for Task 2, which is the prob-
lem that needs to be solved s.t. causally meaningful fair predictions can be obtained.

(iii) We introduce the Fair Prediction Theorem (Thm. 10) that explains why standard
methods for fair prediction, agnostic to the causal structure, fail in solving FPCFA.

(iv) We develop two alternative formulations of the fair prediction optimization problem
capable of remedying the shortcomings of methods found in the literature.

5.2.1 Prediction
In the context of prediction, one is generally interested in constructing a predictor Yb of Y ,
which is a function of X, Z and W . More precisely, from a causal inference point of view, this
process can be conceptualized as constructing an additional mechanism Yb ← fYb (x, z, w)
in the SCM, which is under our control, as shown in Fig. 17. A typical choice of fYb in


Figure 17: Standard Fairness Model (SFM) extended with a blue node Yb , for the task of
(fair) prediction.

the context of regression is the estimate of E[Y | X = x, Z = z, W = w], whereas for


classification a rounded version of such an estimate is often considered.
When constructing fair predictions, one is additionally interested in ensuring that the
constructed Yb also satisfies a fairness constraint. In the fairness literature, there are three
broad categories for achieving this. These approaches are referred to as post-processing,
in-processing, and pre-processing. We now cover them in order. Although there are many
possible target measures of fairness which the predictor Yb could satisfy, in this manuscript
we focus on methods that aim to achieve the condition TVx0 ,x1 (b y ) = 0.
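In the short sketches accompanying this section, the TV measure of a predictor is estimated simply as the difference of group means of the predictions; a small helper (assuming the protected attribute is coded 0/1) reads:

```python
import numpy as np

def tv(y_hat, x):
    """Plug-in estimate of TV_{x0,x1}(y_hat) = E[y_hat | x = 1] - E[y_hat | x = 0]."""
    y_hat, x = np.asarray(y_hat, float), np.asarray(x)
    return y_hat[x == 1].mean() - y_hat[x == 0].mean()
```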

5.2.2 Post-processing
Post-processing methods are the simplest and most easily described. First, one constructs a
predictor fYb without applying fairness constraints. The output of fYb (x, z, w) is then taken
and transformed using a transformation T , such that the constructed predictor
Yb ← T (fYb (x, z, w)), (294)
satisfies the condition TVx0 ,x1 (b
y ) = 0. We illustrate the post-processing methods with an
example. The reject-option classification of Kamiran et al. (2012) starts by estimating the
probabilities of belonging to the positive class, P (y) (denote these estimates by fYb (x, z, w)).
The classifier Yb is then constructed such that
Yb (x, z, w) = 1(fYb (x, z, w) > θx ),

where θx0 , θx1 are group-specific thresholds chosen so that Yb satisfies TVx0 ,x1 (b
y ) = 0, and
also that θx0 , θx1 are as close as possible to 0.5 (to minimize the loss in accuracy). An
important question we discuss shortly is whether the Yb constructed in such a way also
behaves well from a causal perspective.
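A minimal sketch of this group-specific thresholding idea is given below; it is a simplified stand-in for, not a faithful implementation of, the reject-option method, and it simply grid-searches thresholds that drive the TV of the thresholded predictor towards zero while staying close to 0.5.

```python
import numpy as np

def group_thresholds(scores, x, grid=None):
    """Find (theta_x0, theta_x1) so that 1(score > theta_x) has small |TV|,
    preferring thresholds close to 0.5. `scores` are estimated class probabilities."""
    grid = np.linspace(0.05, 0.95, 91) if grid is None else grid
    s0, s1 = scores[x == 0], scores[x == 1]
    best, best_cost = (0.5, 0.5), np.inf
    for t0 in grid:
        for t1 in grid:
            gap = abs((s1 > t1).mean() - (s0 > t0).mean())        # |TV| of thresholded Yb
            cost = gap + 0.01 * (abs(t0 - 0.5) + abs(t1 - 0.5))   # stay close to 0.5
            if cost < best_cost:
                best, best_cost = (t0, t1), cost
    return best

# usage (hypothetical): t0, t1 = group_thresholds(scores, x)
# y_hat = (scores > np.where(x == 1, t1, t0)).astype(int)
```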

5.2.3 In-processing
In-processing methods take a different route. Instead of massaging unconstrained predic-
tions, they attempt to incorporate a fairness constraint into the learning process. This in


effect means that the mechanism fYb is no longer unconstrained, but is required to lie within
a class of functions which satisfy the TV constraint. Broadly speaking, this is achieved by
formulating an optimization problem of the form

arg min_{fYb}  L(Y, fYb (x, w, z))    (295)
subject to   TVx0 ,x1 (ŷ) ≤ ε,    (296)
             ||fYb (x, w, z) − fYb (x′ , w′ , z′ )|| ≤ τ ((x, w, z), (x′ , w′ , z′ )),    (297)

where L is a suitable loss function28 and τ is a metric on the covariates V \ Y . In the


language of Dwork et al. (2012), the TV minimization constraint in Line 296 ensures group
fairness, whereas the constraint in Line 297 ensures covariate-specific fairness 29 , meaning
that predictions for individuals with similar covariates x, z, w should be similar. Exactly
formulating and efficiently solving problems as in Lines 295-297 constitutes an entire field of
research. Due to space limitations, we do not go into full detail on how this can be achieved,
but rather name a few well-known examples. Zemel et al. (2013) use a clustering-based
approach, whereas Zhang et al. (2018) use an adversarial network approach. Kamishima
et al. (2012) add a mutual information constraint to control the TV in parametric settings.
Agarwal et al. (2018) formulate a saddle-point problem with moment-based constraints to
achieve the desired minimization of the TV. The mentioned methods differ in many practical
details, but all attempt to satisfy the constraint TVx0 ,x1 (b
y ) = 0 by constraining the learner
fYb . Again, the question arises as to whether constructing the mechanism fYb so that TV
equals 0 can provide guarantees about the causal behavior of the predictor.
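To make the constraint in Line 296 concrete, here is a minimal in-processing sketch: a logistic regression trained by gradient descent on the log-loss plus a soft penalty on the squared TV of its predictions. It is a generic stand-in, not a re-implementation of any of the specific methods cited above; `features` is assumed to be a NumPy design matrix built from (x, z, w).

```python
import numpy as np

def sigmoid(t):
    return 1.0 / (1.0 + np.exp(-t))

def tv_penalized_logreg(features, y, x, lam=10.0, lr=0.1, n_iter=2000):
    """Minimize log-loss + lam * TV(p)^2, where TV(p) = mean(p | x=1) - mean(p | x=0)."""
    n, d = features.shape
    w = np.zeros(d)
    m1, m0 = (x == 1), (x == 0)
    for _ in range(n_iter):
        p = sigmoid(features @ w)
        grad_loss = features.T @ (p - y) / n                       # gradient of log-loss
        dp = p * (1 - p)                                           # d p_i / d (w^T phi_i)
        grad_tv = (features[m1].T @ dp[m1] / m1.sum()
                   - features[m0].T @ dp[m0] / m0.sum())           # gradient of TV(p)
        tv_val = p[m1].mean() - p[m0].mean()
        w -= lr * (grad_loss + lam * 2 * tv_val * grad_tv)
    return w
```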

5.2.4 Pre-processing
The last category of methods is the pre-processing methods. Here, the aim is to start from
a distribution P (x, w, z, y) and find its “fair version”, labeled Pe(x, w, z, y). Sometimes an
explicit mapping τ : V → V is constructed30 , and τ can even be stochastic. In that
case, the transformed distribution Pe is defined as:

Pe(v) = Eτ [(P ◦ τ )(v)]. (298)

The fair pre-processing methods formulate an optimization problem that attempts to find
the optimal Pe(V ), where optimality is defined as minimizing some notion of distance to
the original distribution P (V ). There are two different approaches here, that have different
causal implications:

(A) the protected attribute X should be independent of the rest of the observables V \ X
in the fair distribution Pe(V ), written X⊥⊥V \ X,

(B) the protected attribute X should be independent of the outcome Y in the fair
distribution Pe(V ), written X⊥⊥Y .

28. A common choice here is the squared loss E[(Y − fYb (x, w, z))2 ].
29. This notion corresponds to individual fairness in the work of Dwork et al. (2012). Causally speaking,
this would be seen as a covariate-specific fairness constraint, as the term individual is overloaded.
30. V here denotes the domain in which the observables V take values.


Figure 18: A schematic summary of the post-processing (red arrows), in-processing (blue),
and pre-processing (yellow) fair prediction methods, compared to a typical ML workflow
(black).

The first approach requires that the effect of the attribute X is entirely erased from the
data. The second, less stringent option requires the independence X⊥⊥Y in Pe(V ), which
is equivalent to having TVx0 ,x1 (ŷ) = 0. These two cases will be discussed separately in
the remainder of the section. In Fig. 18 we give a schematic representation of the three
categories of fair prediction methods, and in particular how they relate to a typical machine
learning workflow. We next move onto formulating FPCFA(Str-{DE,IE,SE}, TVx0 ,x1 (y))
for Task 2.

5.2.5 FPCFA(Str-{DE,IE,SE}, TVx0 ,x1 (y)) for Task 2.


Building on the previous definition of FPCFA(Str-{DE,IE,SE}, TVx0 ,x1 (y)), we can now
state its version in the context of fair prediction:
Definition 31 (FPCFA continued for Task 2) [Ω, Q as before] Let M = ⟨V, U, P (U ), F ⟩ be the
true, unobserved generative SCM, let A be a set of assumptions, and let P (v)
be the observational distribution generated by M. Let ΩA be the space of all SCMs compatible
with A. The Fundamental Problem of Causal Fairness Analysis is to find a collection of
measures µ1 , . . . , µk such that the following properties are satisfied:
(1) µ is decomposable w.r.t. µ1 , . . . , µk ;
(2) µ1 , . . . , µk are admissible w.r.t. the structural fairness criteria Q1 , Q2 , . . . , Qk .
(3) µ1 , . . . , µk are as powerful as possible.
(4) µ1 , . . . , µk are identifiable from the observational distribution P (v) and class ΩA .
The final step of FPCFA for Task 2 is to construct an alternative SCM M0 such that
(5) the measures µ1 , . . . , µk satisfy that
µ1 (M0 ) = · · · = µk (M0 ) = 0. (299)

To make matters explicit, in the formulation of FPCFA(Str-{DE,IE,SE}, TVx0 ,x1 (y)) for
Task 2, we want to ensure that the constructed predictor Yb satisfies
x-DE^sym_x(ŷ | x0) = x-IE^sym_x(ŷ | x0) = x-SE_{x1,x0}(ŷ) = 0, (300)


instead of just requiring that TVx0 ,x1 (ŷ) = 0. The question we address formally next is
whether the condition in Eq. 300 can be achieved by methods that focus on minimizing TV.
For this purpose, we prove the Fair Prediction Theorem that is formulated for in-processing
methods in the linear case:

Theorem 10 (Fair Prediction Theorem) Let SFM(nZ , nW ) be the standard fairness
model with |Z| = nZ and |W | = nW . Let E denote the set of edges of SFM(nZ , nW ).
Further, let S^linear_{nZ ,nW} be the space of linear structural causal models (with the exception of
the X variable, which is Bernoulli) compatible with the SFM(nZ , nW ) and whose structural
coefficients are drawn uniformly from [−1, 1]^{|E|}. An SCM M ∈ S^linear_{nZ ,nW} is said to be
ε-TV-compliant if

fb_fair = arg min_{f linear} E[Y − f (X, Z, W )]^2    (301)
          subject to TVx0 ,x1 (f ) = 0    (302)

also satisfies

|x-DE_{x0,x1}(fb_fair | x0 )| ≤ ε,    (303)
|x-IE_{x0,x1}(fb_fair | x0 )| ≤ ε,    (304)
|x-SE_{x0,x1}(fb_fair )| ≤ ε.    (305)

Under the Lebesgue measure over [−1, 1]^{|E|}, the set of 0-TV-compliant SCMs in SFM(nZ , nW )
has measure 0. Furthermore, for any nZ , nW , there exists an ε = ε(nZ , nW ) such that

P(M is ε-TV-compliant) ≤ 1/4. (306)

The proof is given in Appendix A.3. The theorem states that, for a random linear SCM,
the optimal fair predictor with TV measure equal to 0 will almost never have the x-specific
fairness measures equal to 0. The remarkable implication of the theorem is that minimizing
the TV measure provides no guarantees that the direct, indirect and spurious effects are
also minimized. That is, the resulting fair classifier might not be causally meaningful.
The Fair Prediction Theorem considers the linear case for in-processing methods, but
we conjecture that it has implications for more complex settings too (see also empirical
evidence on real data below). For example, note that in the optimization problem in
Lines 301-302 we are searching over linear functions f of X, Z, and W . For pre-processing
methods that achieve X⊥ ⊥Yb , the space of allowed functions f would be even more flexible,
but the underlying optimization problem would remain similar. Even though formal results
are difficult to provide, our observations raise a serious concern about whether any of
the fair prediction methods in the literature provide predictors that are well-behaved in a
causal sense. We now exemplify this point empirically, by applying several well-known fair
prediction methods on the COMPAS dataset.

5.2.6 Empirical evaluation of the Fair Prediction Theorem


Consider the following example based on the COMPAS dataset.


Example 21 (COMPAS continued for Fair Prediction) A team of data scientists from
ProPublica have shown that the COMPAS dataset from Broward County contains a strong
racial bias against minorities. They are now interested in producing fair predictions Yb on
the dataset, to replace the biased predictions. To this end they implement:

(i) baseline: a random forest classifier trained without any fairness constraints,

(ii) pre-processing: a logistic regression classifier trained with the reweighing method
(Kamiran and Calders, 2012),

(iii) in-processing: fair reductions approach of Agarwal et al. (2018) with a logistic re-
gression base classifier,

(iv) post-processing: a random forest classifier trained without fairness constraints, with
reject-option post-processing applied (Kamiran et al., 2012).

The fair prediction algorithms (ii), (iii), and (iv) are intended to set the TV measure to 0.
After constructing these predictors, the team make use of the Fairness Cookbook in Algo-
rithm 1. Following the steps of the Fairness Cookbook, the team computes the TV measure,
together with the appropriate measures of direct, indirect, and spurious discrimination.
The obtained decompositions of the TV measures are shown in Figures 19(ii), 19(iii),
and 19(iv). The ProPublica team notes that all methods substantially reduce the TVx0 ,x1 (b
y ),
however, the measures of direct, indirect, and, spurious effects are not necessarily reduced
to 0, consistent with the Fair Prediction Theorem. 

One class of fair prediction methods that is not addressed by the discussion above is the
pre-processing methods that achieve independence of the protected attribute from
all the observables, namely X⊥⊥V \ {X}.

5.2.7 Pre-processing methods that achieve X⊥⊥V \ {X}


A pre-processing method that achieves attribute independence (X⊥⊥V \{X}) is the proposal
of Dwork et al. (2012), in which, as a pre-processing step, the distribution

V \ {X} | X = x0 is transported onto V \ {X} | X = x1 .

However, as witnessed by the following example, such an approach does not guarantee that
causal measures of fairness vanish:

Example 22 (Failure of Optimal Transport methods) A company is hiring prospec-


tive applicants for a new job position. Let X denote gender (x0 for male, x1 for female), W
denote a score on a test (taking two values, ±ε), and Y the outcome of the application (Y = 0
for no job offer, Y = 1 for job offer). The following SCM M describes the data generating


Figure 19: Causal Fairness Analysis applied to a standard prediction method (random
forest, subfigure (i)) and three different fair prediction algorithms (reweighing (Kamiran
and Calders, 2012) in subfigure (ii), reductions (Agarwal et al., 2018) in subfigure (iii), and
reject-option (Kamiran et al., 2012) in subfigure (iv)). All of the fair predictions methods
reduce the TV measure, but fail to nullify the causal measures of fairness. Confidence
intervals of the measures, obtained using bootstrap, are shown as vertical bars.


process:

F, P (U ) :
    X ← UX    (307)
    W ← ε(2UW − 1)    (308)
    Y ← UY ∨ 1(W > 0) if X = x0 ,   UY ∨ 1(W < 0) if X = x1    (309)
    UX , UW , UY ∼ Bernoulli(0.5).    (310)

After the first part of the selection process, the company realized that they are achieving demo-
graphic parity

TVx0 ,x1 (y) = 0, (311)

but they are uncertain whether they are causally fair, with respect to the direct and indirect
effects. For this reason, they choose to optimally transport the conditional distributions,
namely
(W, Y ) | x1  ↦τ  (W, Y ) | x0 , (312)

where τ denotes the optimal transport map between the two distributions. By doing so, the
company aims to make sure that both the direct and the indirect effect are equal to 0.
The obtained optimal transport map τ can be described as follows:

τ (w, y) = (−ε, 0)                  if (w, y) = (ε, 0)
           (ε, 1)                   if (w, y) = (ε, 1)            (313)
           (±ε, 1) w.p. 1/2 each    if (w, y) = (−ε, 1).
(±, 1) w.p. 2 if (w, y) = (−, 1)

The conditional distributions W, Y | x0 and W, Y | x1 are shown in Fig. 20b, together


with the optimal transport map. Denote by W f , Ye the transformed values of W, Y . After
the transformation, we compute the indirect effect, comparing the potential outcomes Yex0
and Yex0 ,W
fx , where the latter describes the potential outcome where X = x0 along the direct
1
pathway, and W behaves like W f under the intervention X = x1 . We compute the as follows:
X
P (e
yx0 ,W
fx ) = P (e
yx0 ,w , W
fx = w)
1 (314)
1
w

= P (yx0 , , W fx = −),
fx = ) + P (yx ,− , W (315)
1 0 1

fx = ) equals 1 , corresponding to (UW , UY ) = (1, {0, 1})


where the first term P (yx0 , , W 1 2
(weighted w.p. 1) and (UW , UY ) = (0, {0, 1}) (weighted w.p. 21 ). The second term,
fx = −) equals 1 , corresponding to (UW , UY ) = (0, 1) (weighted w.p. 1 ).
P (yx0 ,− , W 1 8 2
5
fx ) = 8 . The term P (e
Thus, we have that P (yx0 ,W yx0 ) = P (yx0 ) = P (y | x0 ) = 34 . Putting
1
together, we have that
5 3 1
NIEx0 ,x1 (e
y ) = P (e fx ) − P (e
yx0 ,W y x0 ) = − =− , (316)
1 8 4 8
showing that the indirect effect after the optimal transport step is non-zero. 


Figure 20: (a) Causal graph corresponding to Example 22. (b) Conditional distributions
W, Y | x0 (blue), W, Y | x1 (red), and the optimal transport map τ (green) from Example 22.

The reader might wonder about the underlying reason why all of the discussed methods
from the previous literature fail. We next move on to explaining the shortcomings of these
methods in more detail, and give two possible formulations that can help when constructing
causally meaningful predictors.

5.2.8 Towards the solution


Our next goal is to remedy the pitfalls of the fair prediction methods discussed so far.
In particular, we outline a strategy for ensuring that direct, indirect, and spurious effects
vanish (or a subset of them, in case of business necessity). There are two conditions that
are needed to guarantee causal behaviour of our predictor:

(I) the causal structure of the SFM is preserved for the predictor Yb ,

(II) The identification expressions of x-DE, x-IE, and x-SE equal 0 in the new SCM M0 .

We first show formally that the two conditions provide guarantees for the constructed
classifier Yb :

Proposition 12 (Fair Predictor Causal Conditions) Let M be an SCM compatible


with the SFM and let Yb be a predictor of the outcome Y satisfying:

(a) X, Z, W and Yb are compatible with the SFM,

(b) the identification expressions for x-DE^sym_x(y | x0 ), x-IE^sym_x(y | x0 ), and x-SE_{x1,x0}(y)
equal 0, namely
Σ_{z,w} [P (y | x1 , z, w) − P (y | x0 , z, w)] P (w | x1 , z) P (z | x0 ) = 0    (317)
Σ_{z,w} [P (y | x1 , z, w) − P (y | x0 , z, w)] P (w | x0 , z) P (z | x0 ) = 0    (318)
Σ_{z,w} P (y | x0 , z, w) [P (w | x1 , z) − P (w | x0 , z)] P (z | x) = 0    (319)
Σ_{z,w} P (y | x1 , z, w) [P (w | x1 , z) − P (w | x0 , z)] P (z | x) = 0    (320)
Σ_{z} P (y | x1 , z) [P (z | x1 ) − P (z | x0 )] = 0.    (321)

Then, the predictor Yb satisfies:

x-DE^sym_x(ŷ | x0 ) = x-IE^sym_x(ŷ | x0 ) = x-SE_{x1,x0}(ŷ) = 0. (322)
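As a concrete illustration, the following sketch computes plug-in estimates of the left-hand sides of Eqs. (317)–(321) from a dataset with a single discrete confounder column `z`, a single discrete mediator column `w`, a binary protected attribute `x` (0 for x0, 1 for x1), and a prediction column `yhat`; the column names and the single-variable Z and W are simplifying assumptions, and the conditioning value x in Eqs. (319)–(320) is instantiated as x0 here.

```python
import numpy as np
import pandas as pd

def _cmean(df, col, **cond):
    """Empirical E[col | cond]; returns 0 for empty strata (a crude convention)."""
    m = np.ones(len(df), dtype=bool)
    for k, v in cond.items():
        m &= (df[k] == v).to_numpy()
    return df.loc[m, col].mean() if m.any() else 0.0

def _cprob(df, col, val, **cond):
    """Empirical P(col = val | cond)."""
    m = np.ones(len(df), dtype=bool)
    for k, v in cond.items():
        m &= (df[k] == v).to_numpy()
    return (df.loc[m, col] == val).mean() if m.any() else 0.0

def id_expressions(df, yhat="yhat"):
    """Plug-in estimates of the five expressions in Eqs. (317)-(321)."""
    zs, ws = df.z.unique(), df.w.unique()
    e317 = e318 = e319 = e320 = 0.0
    for z in zs:
        pz0 = _cprob(df, "z", z, x=0)                          # P(z | x0)
        for w in ws:
            dy = _cmean(df, yhat, x=1, z=z, w=w) - _cmean(df, yhat, x=0, z=z, w=w)
            pw1 = _cprob(df, "w", w, x=1, z=z)                 # P(w | x1, z)
            pw0 = _cprob(df, "w", w, x=0, z=z)                 # P(w | x0, z)
            e317 += dy * pw1 * pz0
            e318 += dy * pw0 * pz0
            e319 += _cmean(df, yhat, x=0, z=z, w=w) * (pw1 - pw0) * pz0
            e320 += _cmean(df, yhat, x=1, z=z, w=w) * (pw1 - pw0) * pz0
    e321 = sum(_cmean(df, yhat, x=1, z=z)
               * (_cprob(df, "z", z, x=1) - _cprob(df, "z", z, x=0)) for z in zs)
    return e317, e318, e319, e320, e321
```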

Based on the proposition, we can offer two solutions that give causally meaningful fair
predictors, which are discussed next.

5.2.9 Causally aware in-processing


The first option for constructing fair predictions that obey causal constraints is via in-
processing. The simple idea is to replace the constraint TVx0 ,x1 (b
y ) = 0 with a number of
constraints that represent the identification expressions of the important causal quantities
that we wish to minimize. After that, we can use the fact that the causal structure of the
SFM is inherited for a predictor Yb constructed with in-processing. The formal statement
of the in-processing approach is given in the following theorem:

Theorem 11 (In-processing with causal constraints) Let M be an SCM compatible


with the SFM. Let Yb be constructed as the optimal solution to

Yb = arg min_f E[Y − f (X, Z, W )]^2    (323)
subject to
    x-DE^ID_{x0,x1}(ŷ | x0 ) = 0    (324)
    x-DE^ID_{x1,x0}(ŷ | x0 ) = 0    (325)
    x-IE^ID_{x0,x1}(ŷ | x0 ) = 0    (326)
    x-IE^ID_{x1,x0}(ŷ | x0 ) = 0    (327)
    x-SE^ID_{x1,x0}(ŷ) = 0    (328)

where x-DEID , x-IEID , and x-SEID represent the identification expressions of the corre-
sponding measures (as shown in Prop. 12). Then the predictor Yb satisfies

x-DE^sym_x(ŷ | x0 ) = x-IE^sym_x(ŷ | x0 ) = x-SE_{x1,x0}(ŷ) = 0. (329)

The following remark shows that the result of the theorem holds even more broadly than
just for the standard fairness model:

Remark 1 (Robustness of in-processing with causal constraints) Thm. 11 is stated


for an SCM that is compatible with the SFM. However, such an assumption can be relaxed.
In particular, the result of the theorem remains true even if the bidirected edges X L9999K Y ,
Z L9999K Y , and W L9999K Y are present in the model.


Algorithm 2 Causal Individual Fairness (Causal IF)

• Inputs: Dataset D, SFM projection ΠSFM (G), Business Necessity Set BN-set.
for V ′ ∈ {Z, W, Y } do
    if V ′ ∉ BN-set then
        transport V ′ | x0 , pa(V ′ ) onto V ′ | x1 , τ^{pa(V ′)}(pa(V ′ ))
        let τ^{V ′} denote the transport map
    else if V ′ ∈ BN-set then
        transport V ′ | x, pa(V ′ ) onto V ′ | x, τ^{pa(V ′)}(pa(V ′ )) for x ∈ {x0 , x1 }
        let τ^{V ′} denote the transport map
    end if
end for

5.2.10 Causally aware pre-processing


After discussing a suitable in-processing approach, we can offer an approach based on pre-
processing, inspired by the optimal transport approach of Dwork et al. (2012):

Definition 32 (Causal Individual Fairness (Causal IF)) Let M be an SCM compat-


ible with the SFM. Let the business necessity set be denoted as BN-set, taking values

BN-set ∈ ∅, {Z}, {W }, {Z, W } . (330)

The Causal Individual Fairness algorithm performs sequential optimal transport of the dis-
tributions of Z, W, and Y (in this fixed topological ordering) conditional on the values of the
parental set. The procedure is described formally in Algorithm 2.

To be even more precise, Causal IF starts by optimally transporting

Z | X = x0   onto   Z | X = x1 ,

unless Z is in the business necessity set. Let τ^Z denote the optimal transport map. Then,
in the next step, the distribution of W is transported, namely,

W | X = x0 , Z = z   onto   W | X = x1 , Z = τ^Z (z),   ∀z.

In the final step, the distribution of Y is transported:

Y | X = x0 , Z = z, W = w   onto   Y | X = x1 , Z = τ^Z (z), W = τ^W (w),   ∀z, w.
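A minimal sketch of this sequential transport for the case BN-set = ∅ is given below, assuming a single discrete confounder column `z`, a single discrete mediator column `w`, an outcome column `y`, and `x` coded 0/1 (column names are illustrative); in one dimension, the optimal transport map reduces to quantile matching.

```python
import numpy as np
import pandas as pd

def quantile_transport(source, target):
    """1-D optimal transport: send the k-th smallest source value to the value at the
    corresponding quantile of the target sample (comonotone coupling)."""
    source, target = np.asarray(source, float), np.asarray(target, float)
    ranks = np.argsort(np.argsort(source))                        # 0, ..., n-1
    idx = ((ranks + 0.5) / len(source) * len(target)).astype(int)
    return np.sort(target)[np.clip(idx, 0, len(target) - 1)]

def causal_if(df):
    """Sketch of Causal IF (Algorithm 2) with BN-set = {}: transport the x0 group
    onto the x1 group, variable by variable, conditioning on (already transported)
    parent values."""
    out = df.copy()
    src, tgt = df.x == 0, df.x == 1
    # Step 1: Z | x0  ->  Z | x1.
    out.loc[src, "z"] = quantile_transport(df.z[src], df.z[tgt])
    # Step 2: W | x0, z  ->  W | x1, tau_Z(z), stratified on the transported z.
    for z in np.unique(out.z[src]):
        s, t = src & (out.z == z), tgt & (df.z == z)
        if s.any() and t.any():
            out.loc[s, "w"] = quantile_transport(df.w[s], df.w[t])
    # Step 3: Y | x0, z, w  ->  Y | x1, tau_Z(z), tau_W(w).
    for z, w in out.loc[src, ["z", "w"]].drop_duplicates().itertuples(index=False):
        s = src & (out.z == z) & (out.w == w)
        t = tgt & (df.z == z) & (df.w == w)
        if s.any() and t.any():
            out.loc[s, "y"] = quantile_transport(df.y[s], df.y[t])
    return out
```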

Theorem 12 (Soundness Causal Individual Fairness) Let M be an SCM compatible


with the SFM. Let τ Y be the optimal transport map obtained when applying Causal IF.
Define an additional mechanism of the SCM M such that

Ỹ ← τ^Y (Y ; X, Z, W ). (331)

For the transformed outcome Ỹ, we can then claim:

if Z ∉ BN-set   =⇒   x-SE_{x1,x0}(ỹ) = 0,    (332)
if W ∉ BN-set   =⇒   x-IE^sym_x(ỹ | x0 ) = 0.    (333)


Furthermore, the transformed outcome Ye also satisfies

x-DE^sym_x(ỹ | x0 ) = 0. (334)

The full proof of the theorem can be found in Appendix A.4. After showing that the Causal
IF procedure provides certain guarantees for the causal measures of fairness, we go back to
Ex. 22 to see exactly why the method of joint optimal transport fails to produce a causally
meaningful predictor:
Remark 2 (Why joint Optimal Transport fails) The first term in the indirect effect
in Ex. 22 after the joint transport map was applied was expanded as:
P(ỹ_{x0,W̃_{x1}}) = Σ_w P(ỹ_{x0,w}, W̃_{x1} = w).    (335)

Typically, whenever such an effect is identifiable, one would expect the independence relation

Ỹ_{x0,w} ⊥⊥ W̃_{x1}    (336)

to hold. However, the joint optimal transport map τ (w, y), which determines the value of
W̃_{x1}, also depends on the exogenous variable UY . For this reason, W̃_{x1} is a function
of UY , but so is Ỹ_{x0,w}. In this example, the joint optimal transport step introduced spurious
shared information between the exogenous variables UW and UY , which resulted in

P(ỹ_{x0,w}, W̃_{x1} = w) ≠ P(ỹ_{x0,w}) P(W̃_{x1} = w),    (337)

which prevents us from providing guarantees that the indirect effect vanishes. The Causal
IF transport method, on the other hand, circumvents this problem and guarantees that
Ỹ_{x0,w} ⊥⊥ W̃_{x1}.

6. Disparate Impact and Business Necessity


In this section, we generalize the analysis introduced earlier, including the Fairness Cook-
book (Alg. 1), to consider more refined settings described by an arbitrary causal diagram.
The main motivation for doing so comes from the observation that when analyzing disparate
impact, quantities such as Ctf-DEx0 ,x1 (y | x0 ), Ctf-IEx0 ,x1 (y | x0 ), and Ctf-SEx0 ,x1 (y) are in-
sufficient to account for certain business necessity requirements. For concreteness, consider
the following example.

Example 23 (COMPAS continued) The courts at Broward County, Florida, were us-
ing machine learning to predict whether individuals released on parole are at high risk of
re-offending within 2 years. The algorithm is based on the demographic information Z (Z1
for age, Z2 for gender), race X (x0 denoting White, x1 Non-White), juvenile offense counts
J, prior offense count P , and degree of charge D.
A causal analysis using the Fairness Cookbook revealed that:

Ctf-IEx1 ,x0 (y | x1 ) = −5.3% ± 0.4%, (338)


Ctf-SEx1 ,x0 (y) = −4.3% ± 0.9%, (339)


potentially indicating presence of disparate impact. Based on this information, a legal team
of ProPublica filed a lawsuit to the district court, claiming discrimination w.r.t. the Non-
White subpopulation based on the doctrine of disparate impact. After the court hearing, the
judge rules that using the attributes age (Z1 ), prior count (P ), and charge degree (D) is
not discriminatory, but using the attributes juvenile count (J) and gender (Z2 ) is discrim-
inatory. Data scientists at ProPublica need to consider how to proceed in the light of this
new requirement for discounting the allowable attributes in the quantiative analysis.

The difficulty in this example is that the quantity Ctf-SEx1 ,x0 (y) measures the spurious
discrimination between the attribute X and outcome Y as generated by both confounders
Z1 and Z2 . Since using the confounder Z1 is not considered discriminatory, but using the
confounder Z2 is, the quantity Ctf-SEx1 ,x0 (y) needs to be refined such that the spurious
variations based on the different confounders are disentangled. A similar challenge is pre-
sented while computing the Ctf-IEx0 ,x1 (y | x0 ) measure. In fact, a more refined analysis is
needed to disentangle the indirect and spurious variations that come from the confounding
set Z or mediating set W such that they can be explained separately.

6.1 Refining spurious discrimination


6.1.1 Markovian case
6.1.2 Semi-Markovian models
6.1.3 Identification of set-specific spurious effects
6.2 Refining indirect effects
6.2.1 Identification of set-specific indirect effects
6.3 Extended Fairness Cookbook
6.4 Extended Fairness Map
7. Conclusions
Modern automated decision-making systems based on AI are fueled by data, which encodes
many complex historical processes and past, potentially discriminatory practices. Such
data, imprinted with undesired biases, cannot by itself be used and expected to produce
fair systems, regardless of the level of statistical sophistication of the methods used or the
amount of data available. In light of this limitation in which more data or clever methods are
not the solution, the AI designer is left to search for a new notion of what a fair reality should
look like. By and large, the literature on fair machine learning attempts to address this
question by formulating (and then optimizing) statistical notions about how fairness should
be measured. Still, as many of the examples in this manuscript demonstrated, statistical
notions fall short of providing a satisfactory answer for what a fair reality should entail.
Using decision systems that arise when only considering statistical notions of fairness may
be causally meaningless, or even have unintended and possibly catastrophic consequences.
We combined in this manuscript two ingredients to address this challenge, (i) the lan-
guage of causality and (ii) legal doctrines of discrimination so as to provide a sound basis
for imagining what a fair reality should look like and represent society’s norms and expec-


tations. This formalization of the fairness problem will facilitate communication between
the key stakeholders involved in developing such systems in practice, including computer
scientists, statisticians, and data scientists on the one hand and social, legal, and ethical
experts on the other. A key observation is that mapping social, ethical, and legal norms
onto statistical measures is a challenging task. A formulation we propose explicitly in
the form of the Fundamental Problem of Causal Fairness Analysis is to map such social
norms onto the underlying causal mechanisms and causal measures of fairness associated
with these particular mechanisms. We believe such an approach can help data scientists to
be more transparent when measuring discrimination and can also help social scientists to
ground their principles and ideas in a formal mathematical language that is amenable to
implementation.
The final important distinction introduced in this manuscript is between the different
fairness tasks, namely (i) bias detection and quantification, (ii) fair prediction, and (iii) fair
decision-making. The first task helps us to understand how much (and if any) bias exists
in our data. The task of fair prediction allows us to correct for (part or the entirety) of this
bias and envisage a more fair world in which such bias is removed. We leave as future work
the precise formulation of the third task based on the principles developed here. Achieving
fairness in the real world requires interventions, such as affirmative action. However, such
interventions can have various complex consequences and implications, and interfacing the
principles introduced in this manuscript with key ideas in economics and econometrics is
an essential next step in designing fair systems.

8. Acknowledgements
We thank Kai-Zhan Lee for the feedback and help in improving this manuscript. This work
was done in part while Drago Plecko was visiting the Causal AI lab at Columbia University.
This research was supported in part by the NSF, ONR, AFOSR, DoE, Amazon, JP Morgan,
and The Alfred P. Sloan Foundation.


Group              Implication                                              Proof
power              Unit-TE =⇒ v′-TE =⇒ z-TE =⇒ ETT =⇒ TE                    Lem. 5
power              Unit-DE =⇒ v′-DE =⇒ z-DE =⇒ Ctf-DE =⇒ NDE                Lem. 5
power              Unit-IE =⇒ v′-IE =⇒ z-IE =⇒ Ctf-IE =⇒ NIE                Lem. 5
power              Exp-SE ⇐⇒ Ctf-SE                                         Lem. 6
admissibility      S-SE =⇒ Ctf-SE                                           Lem. 9
admissibility      S-DE =⇒ unit-DE                                          Lem. 7
admissibility      S-IE =⇒ unit-IE                                          Lem. 8
decomposability    NDE ∧ NIE =⇒ TE                                          Lem. 10
decomposability    Ctf-DE ∧ Ctf-IE =⇒ ETT                                   Lem. 10
decomposability    z-DE ∧ z-IE =⇒ z-TE                                      Lem. 10
decomposability    v′-DE ∧ v′-IE =⇒ v′-TE                                   Lem. 10
decomposability    unit-DE ∧ unit-IE =⇒ unit-TE                             Lem. 10
decomposability    TE ∧ Exp-SE =⇒ TV                                        Lem. 11
decomposability    ETT ∧ Ctf-SE =⇒ TV                                       Lem. 11
Table 3: List of implications in the Fairness Map in Fig. 12.

Appendix A. Proofs

In this section, we provide the proofs of the main theorems presented in the manuscript.
In particular, we give the proof for the Fairness Map theorem (Thm. 7), soundness of the
SFM theorem (Thm. 9), the Fair Prediction theorem (Thm. 10), and the soundness of the
Causal Individual Fairness procedure (Thm. 12).

A.1 Proof of Thm. 7

The proof of Thm. 7 is organized as follows. The full list of implications contained in the
Fairness Map in Fig. 12 is given in in Tab. 3. For each implication, we indicate the lemma
in which the implication proof is given.


Lemma 5 (Power relations of causal effects) The total, direct, and indirect effects ad-
mit the following relations of power:

unit-TEx0 ,x1 (y(u)) = 0 ∀u =⇒ v 0 -TEx0 ,x1 (y | v 0 ) = 0 ∀v 0 =⇒ z-TEx0 ,x1 (y | z) = 0 ∀z


(340)
=⇒ ETTx0 ,x1 (y | x) = 0 ∀x =⇒ TEx0 ,x1 (y) = 0, (341)
0 0 0
unit-DEx0 ,x1 (y(u)) = 0 ∀u =⇒ v -DEx0 ,x1 (y | v ) = 0 ∀v =⇒ z-DEx0 ,x1 (y | z) = 0 ∀z
(342)
=⇒ Ctf-DEx0 ,x1 (y | x) = 0 ∀x =⇒ NDEx0 ,x1 (y) = 0, (343)
0 0 0
unit-IEx0 ,x1 (y(u)) = 0 ∀u =⇒ v -IEx0 ,x1 (y | v ) = 0 ∀v =⇒ z-IEx0 ,x1 (y | z) = 0 ∀z
(344)
=⇒ Ctf-IEx0 ,x1 (y | x) = 0 ∀x =⇒ NIEx0 ,x1 (y) = 0. (345)

Proof We prove the statement for total effects (the proof for direct and indirect is analo-
gous). We start by showing that ETT is more powerful than TE.

TEx0 ,x1 (y) =P (yx1 ) − P (yx0 )


X 
= P (yx1 | x) − P (yx0 | x) P (x)
x
X
= ETTx0 ,x1 (y | x)P (x).
x

Therefore, if ETTx0 ,x1 (y | x) = 0 ∀x then TEx0 ,x1 (y) = 0. Next, we can write

ETTx0 ,x1 (y | x) = P (yx1 | x) − P (yx0 | x)


X 
= P (yx1 | x, z) − P (yx0 | x, z) P (z | x)
z
X 
= P (yx1 | z) − P (yx0 | z) P (z | x) Yx ⊥⊥X | Z in SFM
z
X
= z-TEx0 ,x1 (y | z)P (z | x).
z

Therefore, if z-TEx0 ,x1 (y | z) = 0 ∀z then ETTx0 ,x1 (y | x) = 0 ∀x. Next, for a set V 0 ⊆ V
such that Z ⊆ V 0 , we can write

z-TEx0 ,x1 (y) = P (yx1 | z) − P (yx0 | z)


X
= P (yx1 | z, v 0 \ z) − P (yx0 | z, v 0 \ z)P (v 0 \ z | z)
v 0 \z
X
= v 0 -TEx0 ,x1 (y | v 0 )P (v 0 \ z | z).
v 0 \z


Therefore, if v 0 -TEx0 ,x1 (y | v 0 ) = 0 ∀v 0 then z-TEx0 ,x1 (y | z) = 0 ∀z. Next, notice that

v 0 -TEx0 ,x1 (y) = P (yx1 | v 0 ) − P (yx0 | v 0 )


X
yx1 (u) − yx0 (u) P (u | v 0 )

=
u
X
= unit-TEx0 ,x1 (y(u))P (u | v 0 ).
u

Therefore, if unit-TEx0 ,x1 (y(u)) = 0 ∀u then v 0 -TEx0 ,x1 (y | v 0 ) = 0 ∀v 0 .

Lemma 6 (Power relations of spurious effects) The criteria based on Ctf-SE and Exp-
SE are equivalent. Formally,

Exp-SEx (y) = 0 ∀x ⇐⇒ Ctf-SEx,x0 (y) = 0 ∀x 6= x0 . (346)

Proof

Exp-SEx (y) = P (y | x) − P (yx )


= P (y | x) − P (yx | x)P (x) − P (yx | x0 )P (x0 )
= P (y | x)[1 − P (x)] − P (yx | x0 )P (x0 )
= P (y | x)P (x0 ) − P (yx | x0 )P (x0 )
= −P (x0 )Ctf-SEx0 ,x (y).

Assuming P (x0 ) > 0, the claim follows.

Lemma 7 (Admissibility w.r.t. structural direct) The structural direct effect crite-
rion (X ∈
/ pa(Y )) implies the absence of unit-level direct effect. Formally:

S-DE =⇒ unit-DEx0 ,x1 (y(u)) = 0 ∀u. (347)

Proof Suppose that X ∈


/ pa(Y ). Note that:

unit-DEx0 ,x1 (y(u)) = yx1 ,Wx0 (u) − yx0 (u)


= fY (x1 , Wx0 (u), Z(u), uY ) − fY (x0 , Wx0 (u), Z(u), uY )
= fY (Wx0 (u), Z(u), uY ) − fY (Wx0 (u), Z(u), uY ) X∈
/ pa(Y )
= 0.

Lemma 8 (Admissibility w.r.t. structural indirect) The structural indirect effect cri-
terion (de(X) ∈
/ pa(Y )) implies the absence of unit-level indirect effect. Formally:

S-IE =⇒ unit-IEx1 ,x0 (y(u)) = 0 ∀u. (348)


Proof Suppose that de(X) ∈/ pa(Y ). Let Wde ⊆ W be the subset of mediators W which
are in de(X), and let W^C_de be its complement in W . Then, by assumption, Wde ∈/ pa(Y ).
We can write:

unit-IEx1 ,x0 (y(u)) = yx1 ,Wx0 (u) − yx1 (u)
    = fY (x1 , (W^C_de)x0 (u), Z(u), uY ) − fY (x1 , (W^C_de)x1 (u), Z(u), uY )
    = fY (x1 , W^C_de(u), Z(u), uY ) − fY (x1 , W^C_de(u), Z(u), uY )        (W^C_de ∈/ de(X))
    = 0.

Lemma 9 (Admissibility w.r.t. structural spurious) The structural spurious effect


criterion (UX ∩ an(Y ) = ∅ ∧ an(X) ∩ an(Y ) = ∅) implies counterfactual spurious effect
is 0. Formally:

S-SE =⇒ Ctf-SEx0 ,x1 (y) = 0 ∀u. (349)

Proof Note that S-SE implies there is no open backdoor path between X and Y . As a
consequence, we know that

Yx ⊥⊥X.

Furthermore, the absence of backdoor paths also implies we can use the 2nd rule of do-
calculus (Action/Observation Exchange). Therefore, we can write:

Ctf-SEx0 ,x1 (y) = P (yx0 | x1 ) − P (y | x0 )


= P (yx0 ) − P (y | x0 ) since Yx ⊥⊥X
= P (yx0 ) − P (yx0 ) Action/Observation Exchange
= 0.

Lemma 10 (Extended Mediation Formula) The total effect can be decomposed into its
direct and indirect contributions on every level of the population axes in the explainability
plane. Formally, we write:

TEx0 ,x1 (y) = NDEx0 ,x1 (y) − NIEx1 ,x0 (y) (350)
ETTx0 ,x1 (y | x) = Ctf-DEx0 ,x1 (y | x) − Ctf-IEx1 ,x0 (y | x) (351)
z-TEx0 ,x1 (y | z) = z-DEx0 ,x1 (y | z) − z-IEx1 ,x0 (y | z) (352)
0 0 0 0 0 0
v -TEx0 ,x1 (y | v ) = v -DEx0 ,x1 (y | v ) − v -IEx1 ,x0 (y | v ) (353)
unit-TEx0 ,x1 (y(u)) = unit-DEx0 ,x1 (y(u)) − unit-IEx1 ,x0 (y(u)). (354)


Proof The proof follows from the structural basis expansion from Eq. (54). In particular,
note that

E-TEx1 ,x0 (y | E) = P (yx1 | E) − P (yx0 | E) (355)


= P (yx1 | E) − P (yx1 ,Wx0 | E) + P (yx1 ,Wx0 | E) − P (yx0 | E) (356)
= −E-IEx1 ,x0 (y | E) + E-DEx1 ,x0 (y | E). (357)

By using different events E the claim follows.

Lemma 11 (TV Decompositions) The total variation (TV) measure admits the follow-
ing two decompositions

TVx0 ,x1 (y) = Exp-SEx1 (y) + TEx0 ,x1 (y) − Exp-SEx0 (y) (358)
= ETTx0 ,x1 (y | x0 ) − Ctf-SEx1 ,x0 . (359)

Proof We write

TVx0 ,x1 (y) = P (y | x1 ) − P (y | x0 )


= P (y | x1 ) − P (yx1 ) + P (yx1 ) − P (yx0 ) + P (yx0 ) − P (y | x0 )
= Exp-SEx1 (y) + TEx0 ,x1 (y) − Exp-SEx0 (y).

Alternatively, we can write

TVx0 ,x1 (y) = P (y | x1 ) − P (y | x0 )


= P (y | x1 ) − P (yx1 | x0 ) + P (yx1 | x0 ) − P (y | x0 )
= ETTx0 ,x1 (y | x0 ) − Ctf-SEx1 ,x0 (y),

which completes the proof.

A.2 Soundness of the SFM: Proof of Theorem 9


Proof The proof consists of two parts. In the first part, we show that the quantities where
the event E is either of ∅, {x}, {z} (corresponding to the first three rows of the fairness map)
are identifiable under the assumptions of the Standard Fairness Model. We in particular
show that TEx0 ,x1 (y), Exp-SEx (y), TEx0 ,x1 (y | z), ETTx0 ,x1 (y | x) and Ctf-DEx0 ,x1 (y |
x) are identifiable (it follows from very similar arguments that all other quantities are
also identifiable. From this, it follows that for any graph G compatible with GSFM , the
quantities of interest are (i) identifiable; (ii) their identification expression is the same.
This in turn shows that using GSFM instead of the full G does not hurt identifiability of
these quantities. In the second part of the proof, we show that any contrast defined by
an event E which contains either W = w or Y = y is not identifiable under some very
mild conditions (namely the existence of a path X → Wi1 → ... → Wik → Y ). This
part of the proof, complementary to the first part, shows that for contrasts with event E


containing post-treatment observations, even having the full graph G would not make the
expression identifiable. All of the proofs here need to be derived from first principles, since
the graph GSFM contains “groups” of variables Z and W , making the standard identification
machinery (Pearl, 2000) not directly applicable.
Part I: Note that for identifying TEx0 ,x1 (y) we need to identify P (yx ). We can write

P (yx ) = P (y | do(x))
X
= P (y | do(x), z)P (z | do(x)) Law of Total Probability
z
X
= P (y | x, z)P (z) (Y ⊥⊥X | Z)GX , (X⊥⊥Z)GX
z
from which it follows that TEx0 ,x1 (y) = Σz [P (y | x1 , z) − P (y | x0 , z)]P (z). Note that the
identifiability of TEx0 ,x1 (y | z) also follows from the above derivation, namely
TEx0 ,x1 (y | z) = P (y | x1 , z) − P (y | x0 , z), and so does
Exp-SEx (y) = Σz P (y | x, z)[P (z) − P (z | x)].
identifiable. These are Layer 3, counterfactual quantities and therefore rules of do-calculus
will not suffice. To be able to use independence statements of counterfactual variables, we
will make use of the make-cg algorithm of Shpitser and Pearl (2007) for construction of
counterfactual graphs, which extends the twin-network approach of Balke and Pearl (1994).
Therefore, when using for an expression of the form Yx = y, X = x0 , we obtain the following
counterfactual graph

X Yx

Wx

from which we can see that Yx ⊥


⊥X | Z. Therefore,

ETTx0 ,x1 (y) = P (yx1 | x) − P (yx0 | x)


X
= [P (yx1 | x, z) − P (yx0 | x, z)]P (z | x) Law of Total Probability
z
X
= [P (y | x1 , z) − P (y | x0 , z)]P (z | x) Yx ⊥⊥X | Z.
z

Finally, for identifying Ctf-DEx0 ,x1 (y | x) we use make-cg applied to GSFM and yx1 ,w , wx0 , x, z
to obtain

X Yx1 ,w

W x0


⊥(Wx0 , X) | Z. Therefore, we can write


from which we can say that Yx1 ,w ⊥
Ctf-DEx0 ,x1 (y | x) = P (yx1 ,Wx0 | x) − P (yx0 ,Wx0 | x)
X
= [P (yx1 ,Wx0 | x, z) − P (yx0 ,Wx0 | x, z)]P (z | x) Law of Total Probability
z
X
= [P (yx1 ,w , wx0 | x, z) − P (yx0 ,w , wx0 | x, z)]P (z | x) Counterfactual unnesting
z,w
X
= [P (yx1 ,w | x, z) − P (yx0 ,w | x, z)]P (wx0 | z)P (z | x) Yx1 ,w ⊥⊥Wx0 | Z
z,w
X
= [P (y | x1 , z, w) − P (yx0 | x0 , z, w)]P (w | x0 , z)P (z | x) Yx ⊥⊥X | Z, Wx0 ⊥
⊥X | Z.
z,w

Part II: We next need to show that any contrast with either W = w or Y = y in the
event E is not identifiable, even if using the full graph G. We show this for the quantity
P (yx1 | x0 , w), since other similar quantities work analogously. Assume for simplicity that
(i) variable Z = ∅; (ii) there are no bidirected edges between the W variables. The latter
assumption clearly makes the identifiability task easier, since adding bidirected edges can
never help identification of quantities. Before we continue, we give an example of a graph
G compatible with GSFM for which P (yx1 | x0 , w) is identifiable. Consider the graph
X Y

W1 W2

and notice that


P (yx1 | x0 , w1 , w2 ) = P (yx1 | x0 , w2 ) Yx1 ⊥⊥W1
= P (yx1 | w2 ) Yx1 ⊥⊥X
= P (y | x1 , w2 ) (Y ⊥⊥X)GX .
However, this example is somewhat pathological, since there is no indirect path between
X and Y mediated by W . In this case, considering the set W is arguably not relevant.
Therefore, assume instead that a path X → Wi1 → ... → Wik → Y exists. Then, when
applying make-cg to G and yx1 , x0 , w the resulting counterfactual graph will contain
X W i1 W i2 ... W ik

W i 1 x1 W i 2 x1 ... W i k x1 Yx1

as a subgraph and therefore when applying the ID∗ algorithm of Shpitser and Pearl (2007),
we will encounter a C-component {Wi , Wix1 } which will result in non-identifiability of the
overall expression. Therefore, even having access to the full G will not help us identify
contrasts that include observations of post-treatment variables, completing the proof.


A.3 Proof of Theorem 10


Proof Considering the following SFM

U Z

X Y

we can write the linear structural equation model as follows:

U ← N (0, 1)    (360)
X ← Bernoulli(expit(U ))    (361)
Z ← a_{UZ} U + a_{ZZ} Z + ε_Z    (362)
W ← a_{XW} X + a_{ZW} Z + a_{WW} W + ε_W    (363)
Y ← a_{XY} X + a_{ZY} Z + a_{WY} W + ε_Y    (364)

where matrices a_{ZZ} , a_{WW} are upper diagonal, making the above SCM valid, in the sense
that no variable is a functional argument of itself. For simplicity, we assume ε_Z ∼ N (0, I_{nZ}),
ε_W ∼ N (0, I_{nW}), and ε_Y ∼ N (0, 1). The coefficients a of the above model are assumed to be
drawn uniformly from [−1, 1]^{|E|}, where |E| is the number of edges with a linear coefficient.
By expanding out, the outcome Y can be written as

Y = Σ_{Vi ∈ X,Z,W} a_{Vi Y} Vi + ε_Y ,

and the linear predictor of Y , labeled f , can be written as

f (X, Z, W ) = Σ_{Vi ∈ X,Z,W} ã_{Vi Y} Vi .

The objective of the optimization can then be written as

E[Y − f (X, Z, W )]^2 = E[( Σ_{Vi ∈ X,Z,W} (a_{Vi Y} − ã_{Vi Y}) Vi + ε_Y )^2]
    = E[ε_Y^2] + E[ Σ_{Vi ,Vj ∈ X,Z,W} (a_{Vi Y} − ã_{Vi Y})(a_{Vj Y} − ã_{Vj Y}) Vi Vj ]
    = 1 + (a_{V Y} − ã_{V Y})^T E[V V^T] (a_{V Y} − ã_{V Y}),

when written as a quadratic form with the characteristic matrix E[V V^T]. Here, (with slight
abuse of notation) the set V includes X, Z, W . Further, the constraint TVx0 ,x1 (f ) = 0 is in
fact a linear constraint on the coefficients ã_{V Y}, since we have that

TVx0 ,x1 (f ) = (E[V | x1 ] − E[V | x0 ])^T ã_{V Y} .


We write

c = E[V | x1 ] − E[V | x0 ],    (365)
Σ = E[V V^T],    (366)

and note that our optimization problem can be written as

arg min_{ã_{V Y}} (a_{V Y} − ã_{V Y})^T Σ (a_{V Y} − ã_{V Y})    (367)
subject to c^T ã_{V Y} = 0.    (368)

The objective is a quadratic form centered at a_{V Y}. Geometrically, the solution to the opti-
mization problem is the meeting point of an ellipsoid centered at a_{V Y} with the characteristic
matrix Σ and the hyperplane through the origin with the normal vector c. The solution is
given explicitly as

â_{V Y} = a_{V Y} − (c^T a_{V Y})/(c^T Σ^{-1} c) · Σ^{-1} c.

We next analyze the constraints

Ctf-DEx0 ,x1 (f̂_fair | x0 ) = Ctf-IEx0 ,x1 (f̂_fair | x0 ) = Ctf-SEx0 ,x1 (f̂_fair ) = 0.

The first constraint Ctf-DEx0 ,x1 (f̂_fair | x0 ) can simply be written as â_{XY}(x1 − x0 ) = 0, and
since x1 − x0 ≠ 0, the constraint can be written as c_1^T â_{V Y} = 0 where c_1 = (1 0 . . . 0)^T .
Similarly, but more involved, the Ctf-IE constraint can be written as c_2^T â_{V Y} = 0 where the
entries of c_2 corresponding to the Wi variables are

E[Wi | x1 ] − E[W_{i,x0} | x1 ],

and 0 everywhere else. Finally, the Ctf-SE constraint can be written as c_3^T â_{V Y} = 0 where the
entries of c_3 corresponding to the Wi variables are

E[W_{i,x0} | x1 ] − E[Wi | x0 ],

and the entries corresponding to the Zi variables are

E[Zi | x1 ] − E[Zi | x0 ].


Notice also that c1 + c2 + c3 = c. We note that

E[Z | x1 ] − E[Z | x0 ] = (I − a_{ZZ})^{-1} a_{UZ} δ_{u01} ,

where δ_{u01} = E[U | x1 ] − E[U | x0 ] is a constant. Similarly,

E[W_{x0} | x1 ] − E[W | x0 ] = (I − a_{WW})^{-1} a_{ZW} (I − a_{ZZ})^{-1} a_{UZ} δ_{u01} .

Furthermore, for the indirect effect, we have that

E[Wi | x1 ] − E[W_{i,x0} | x1 ] = (x1 − x0 ) Σ_{paths X→Wi} Π_{edges Vk→Vl} a_{Vk Vl} .


Therefore, we can now see how the three constraints can be expressed in terms of the
structural coefficients in a. What remains is understanding the entries of the Σ matrix.
Note that E[Vi Vj ] can be computed by considering all treks from Vi to Vj . A trek is a path
that first goes backwards from Vi until a certain node, and then forwards to Vj . The slight
complication comes from the treks with the turning point at U that pass through X, as the
SCM is not linear at X. Nonetheless, in this case the contribution to the covariance of Vi
and Vj equals the product of the coefficients on the trek multiplied by E[XU ]. Therefore,
we note that

E[ViVj ] =
X Y
λ(Ts ) aVk Vl
treks Ts edges Vk →Vl
from Vi to Vj ∈Ts

where the weighting factor λ(T_s) is either 1 or E[XU], depending on the trek T_s. To conclude
the argument, notice the following. The entries of the Σ matrix are polynomial functions
of the structural coefficients a. The same therefore also holds for Σ^{−1}. Furthermore, the
vector c is also a polynomial function of the coefficients in a. Therefore, the condition
c_1^T â_{VY} = 0 can be written as

    c_1^T ( a_{VY} − (c^T a_{VY}) Σ^{−1} c / (c^T Σ^{−1} c) ) = 0,                      (369)
where the left-hand side is a polynomial expression in the coefficients of a (after clearing the
denominator). Therefore, the above expression defines an algebraic hypersurface. Any such
hypersurface has measure 0 in the space [−1, 1]^{|E|}, proving that the set of 0-TV-compliant
SCMs is in fact of measure 0. Intuitively, the result says that only for a measure-0 set of
coefficient values does the meeting point of the ellipsoid centered at a_{VY} (with characteristic
matrix Σ) and the hyperplane through the origin with normal vector c also lie on the
hyperplane through the origin with normal vector c_1.
To extend the result for an ε > 0, we proceed as follows. Let H(ε) be the set of ε-TV-
compliant SCMs. Let H_DE(ε) be the set of SCMs for which the direct effect of f̂ is bounded
by ε. Let H_IE(ε), H_SE(ε) be defined analogously for the indirect and spurious
effects. We then analyze the degrees of the terms appearing in Eq. 369, which defines the
surface H_DE(0). In particular, notice that

    deg( c_1^T ( a_{VY} − (c^T a_{VY}) Σ^{−1} c / (c^T Σ^{−1} c) ) ) ≤ deg(c_1) + deg(a_{VY}) + deg( (c^T a_{VY}) Σ^{−1} c / (c^T Σ^{−1} c) )     (370)

and also that

    deg( (c^T a_{VY}) Σ^{−1} c / (c^T Σ^{−1} c) ) ≤ deg(c^T a_{VY} Σ^{−1} c) + deg(c^T Σ^{−1} c)                                                  (371)
                                                 ≤ 2 deg(c) + deg(a_{VY}) + deg(Σ^{−1}) + 2 deg(c) + deg(Σ^{−1}).                                (372)

Now, one can observe the following bounds, where p = |V|:

    deg(c) ≤ p                                           from Eq. 365,                 (373)
    deg(a_{VY}) = 1                                      by definition,                (374)
    deg(Σ^{−1}) ≤ p^2 · max_{i,j} deg(Σ_{ij}) = p^4      from Eq. 366,                 (375)


from which it follows that the degree of the surface of 0-TV-compliant SCMs, labeled H(0),
is bounded by 2 + 4p + 2p^4. Therefore, by an application of the Łojasiewicz inequality (Ji
et al., 1992), there exist constants k_1, k_2 such that:

    vol(H_DE(ε)) = vol{ a ∈ [−1, 1]^{|E|} : | c_1^T ( a_{VY} − (c^T a_{VY}) Σ^{−1} c / (c^T Σ^{−1} c) ) | ≤ ε }     (376)
                 ≤ vol{ a ∈ [−1, 1]^{|E|} : d(a, H_DE(0)) ≤ k_1 ε^{k_2} },                                          (377)

where the volume in Eq. 377 can be bounded above by an application of Crofton's
inequality (Adler et al., 2007, p. 45), to obtain that

    vol(H_DE(ε)) ≤ k_1 ε^{k_2} 2^{|E|/2} C(|E|, deg(H_DE(0))) deg(H_DE(0)),                                          (378)

where C(|E|) is a constant coming from Crofton's inequality. Finally, we can write that,
for a random M sampled from S^linear_{n_Z, n_W}, we have that

    P(M ∈ H_DE(ε)) = vol(H_DE(ε)) / 2^{|E|}.                                            (379)

By noting that |E| = p(p + 1) and setting

    ε = ( 2^{p^2/4} / (4 C(|E|) [2 + 4p + 2p^4] k_1) )^{1/k_2},                          (380)

we obtain that P(M ∈ H_DE(ε)) ≤ 1/4. Since we know that

    H(ε) = H_DE(ε) ∩ H_IE(ε) ∩ H_SE(ε)  =⇒  P(M ∈ H(ε)) ≤ P(M ∈ H_DE(ε))                (381)
                                        =⇒  P(M ∈ H(ε)) ≤ 1/4,                          (382)

for such an ε. Intuitively, any SCM in H(ε) must also be in H_DE(ε). Any SCM in H_DE(ε)
must be close to H_DE(0). The maximal deviation of an SCM in H_DE(ε) from H_DE(0) can
be bounded by the Łojasiewicz inequality, whereas the surface area of H_DE(0) can be
bounded above by Crofton's inequality. Putting these together, we get a bound on the measure
of ε-TV-compliant SCMs.

The behaviour of the ε term given in Eq. 380 cannot be theoretically analyzed further,
since the constants arising from the Łojasiewicz inequality are dimension dependent. To
this end, for n_Z = n_W = 5 we empirically estimate

    P(M ∈ H_DE(ε))                                                                       (383)

for a range of ε values, and obtain the plot in Fig. 21.
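For readers who wish to reproduce a plot of this kind, the following is a minimal simulation sketch (an illustrative stand-in, not the authors' code, which accompanies the faircause R-package): it draws random linear coefficients for a simplified version of the model above, forms the TV-constrained least-squares predictor in closed form, and records how often its direct-effect coefficient stays below ε. The data-generating process, dimensions, sample sizes, and the choice ε = 0.05 are all assumptions of the example.

    import numpy as np

    rng = np.random.default_rng(1)

    def random_scm(n_z=5, n_w=5):
        # Random linear coefficients for a simplified SFM-like model (illustrative).
        return {
            "a_uz": rng.uniform(-1, 1, n_z),            # U -> Z
            "a_xw": rng.uniform(-1, 1, n_w),            # X -> W
            "a_zw": rng.uniform(-1, 1, (n_z, n_w)),     # Z -> W
            "a_vy": rng.uniform(-1, 1, 1 + n_z + n_w),  # (X, Z, W) -> Y
        }

    def simulate(scm, n=10_000):
        # Observational data; X is made binary through a shared latent U.
        n_z, n_w = len(scm["a_uz"]), len(scm["a_xw"])
        u = rng.normal(size=n)
        x = (u + rng.normal(size=n) > 0).astype(float)
        z = np.outer(u, scm["a_uz"]) + rng.normal(size=(n, n_z))
        w = np.outer(x, scm["a_xw"]) + z @ scm["a_zw"] + rng.normal(size=(n, n_w))
        v = np.column_stack([x, z, w])
        y = v @ scm["a_vy"] + rng.normal(size=n)
        return x, v, y

    def fair_direct_effect(x, v, y):
        # TV-constrained least squares via the closed form; returns |direct-effect coefficient|.
        Sigma = v.T @ v / len(y)
        a_ols = np.linalg.solve(Sigma, v.T @ y / len(y))        # empirical stand-in for a_VY
        c = v[x == 1].mean(axis=0) - v[x == 0].mean(axis=0)     # E[V | x1] - E[V | x0]
        Sigma_inv_c = np.linalg.solve(Sigma, c)
        a_fair = a_ols - (c @ a_ols) * Sigma_inv_c / (c @ Sigma_inv_c)
        return abs(a_fair[0])

    eps = 0.05
    hits = [fair_direct_effect(*simulate(random_scm())) <= eps for _ in range(200)]
    print("estimated P(M in H_DE(eps)):", np.mean(hits))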


Figure 21: Estimating empirically the probability that a random SCM in S^linear_{n_Z, n_W}, for n_Z =
n_W = 5, has a direct effect smaller than ε after ensuring that TV equals 0.

A.4 Proof of Thm. 12


Proof We prove the result for the case BN-set = ∅ (the other cases of BN-sets follow
analogously), at the population level. Based on the standard fairness model, we are
starting with an SCM M given by:

X ← fX (ux , uz ) (384)
Z ← fZ (ux , uz ) (385)
W ← fW (X, Z, uw ) (386)
Y ← fY (X, Z, W, uy ). (387)

The noise variables ux , uz are not independent, but the variables uw , uy are mutually inde-
pendent, and also independent from ux , uz .
We now explain how the sequential optimal transport steps extend the original SCM
M (to which we do not have access). Firstly, the conditional distribution Z | X = x1 is
transported onto Z | X = x0 . Write τ Z for the transport map. On the level of the SCM,
this corresponds to extending the equations by an additional mechanism
    Z̃ ← { fZ(u_x, u_z),          if fX(u_x, u_z) = x_0
        { fZ(π^Z(u_x, u_z)),     if fX(u_x, u_z) = x_1.                                 (388)

Here, there is an implicit (possibly stochastic) mapping π^Z that we cannot observe. For
simplicity, we assume that the variable Z is continuous and that π^Z is deterministic. We
can give an optimization problem to which π^Z is the solution, namely:

    π^Z := arg min_π ∫_{U_X × U_Z} || fZ(π(u_z, u_x)) − fZ(u_z, u_x) ||^2 du_{xz}^{X=x_1}          (389)
           s.t.  fZ(π(u_z, u_x)), with (u_x, u_z) ∼ P(U_X, U_Z | X = x_1), is equal in distribution
                 to fZ(u_z, u_x), with (u_x, u_z) ∼ P(U_X, U_Z | X = x_0).


The measure du_{xz}^{X=x_1} in the objective is the probability measure associated with the distri-
bution P(u_x, u_z | X = x_1). The constraint ensures that, after the transport, Z̃ | X = x_1 is
equal in distribution to Z̃ | X = x_0. In the second step of the procedure, we are transporting
the distribution of W. This results in adding the mechanism:
    W̃ ← { fW(x_0, Z̃, u_w),          if X = x_0
        { fW(x_0, Z̃, π^W(u_w)),     if X = x_1.                                         (390)

Similarly to before, π^W is a (possibly stochastic) mapping solving the following optimization
problem:

    π^W := arg min_π ∫_{U_W} || fW(x_0, z̃, π(u_w)) − fW(x_1, z̃, u_w) ||^2 du_w         (391)
           s.t.  fW(x_0, z̃, π(u_w))  is equal in distribution to  fW(x_0, z̃, u_w).
The above optimization problem is thought of as being solved separately for each value
Z̃ = z̃. Finally, in the last step, we construct the additional mechanism:

    Ỹ ← { fY(x_0, Z̃, W̃, u_y),          if X = x_0
        { fY(x_0, Z̃, W̃, π^Y(u_y)),     if X = x_1.                                      (392)

Again, the implicit mapping π^Y is constructed so that it is the solution to

    π^Y := arg min_π ∫_{U_Y} || fY(x_0, z̃, w̃, π(u_y)) − fY(x_1, z̃, w̃, u_y) ||^2 du_y   (393)
           s.t.  fY(x_0, z̃, w̃, π(u_y))  is equal in distribution to  fY(x_0, z̃, w̃, u_y),

where the problem is solved separately for each fixed choice of parents Z̃ = z̃, W̃ = w̃.
After constructing the additional mechanisms Z̃, W̃, and Ỹ, we draw the explicit causal
diagram corresponding to the new variables, which includes the unobservables U_X, U_Z, U_W,
and U_Y (marked in red), given as follows:

    [Causal diagram omitted: it contains the exogenous nodes U_X, U_Z, U_W, U_Y (in red) together with the nodes X, Z̃, W̃, Ỹ.]

Note that by marginalizing out the unobserved variables U_X, U_Z, U_W, U_Y, we obtain the new
causal diagram, which is given by the standard fairness model over the variables X, Z̃, W̃, Ỹ.


Therefore, it follows that the identification expressions for the spurious, indirect, and direct
effects are known, and given by:

    x-DE_{x_0,x_1}(ỹ | x_0) = Σ_{z̃,w̃} [P(ỹ | x_1, z̃, w̃) − P(ỹ | x_0, z̃, w̃)] P(w̃ | x_0, z̃) P(z̃ | x)     (394)
    x-IE_{x_0,x_1}(ỹ | x_0) = Σ_{z̃,w̃} P(ỹ | x_0, z̃, w̃) [P(w̃ | x_1, z̃) − P(w̃ | x_0, z̃)] P(z̃ | x)         (395)
    x-SE_{x_0,x_1}(ỹ) = Σ_{z̃} P(ỹ | x_0, z̃) [P(z̃ | x_0) − P(z̃ | x_1)].                                      (396)

To finish the proof, notice that by construction (the matching of distributions via optimal
transport), we have that

    P(ỹ | x_1, z̃, w̃) = P(ỹ | x_0, z̃, w̃)                                               (397)
    P(w̃ | x_1, z̃) = P(w̃ | x_0, z̃)                                                       (398)
    P(z̃ | x_0) = P(z̃ | x_1),                                                             (399)

implying that all three effects in Eqs. 394-396 are equal to 0.
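The construction in the proof operates on the (unobserved) SCM. As an illustration only, here is a sketch that mimics the sequential adaptation on simulated data under strong simplifying assumptions that are not part of the theorem: scalar Z, W, Y, additive-noise mechanisms, a marginal quantile matching for Z, and residual carrying (valid under additive noise) in place of the general conditional transport maps π^W and π^Y. The data-generating process and all names below are hypothetical, not the authors' implementation.

    import numpy as np

    rng = np.random.default_rng(2)

    # Illustrative observational data from a simple SCM with binary X and scalar Z, W, Y.
    n = 5000
    u = rng.normal(size=n)
    x = (u + rng.normal(size=n) > 0).astype(int)
    z = 0.8 * u + rng.normal(size=n)
    w = 1.0 * x + 0.5 * z + rng.normal(size=n)
    y = 0.7 * x + 0.4 * z + 0.6 * w + rng.normal(size=n)

    def fit_linear(features, target):
        # Least-squares fit with an intercept; returns a prediction function.
        A = np.column_stack([np.ones(len(target)), features])
        coef, *_ = np.linalg.lstsq(A, target, rcond=None)
        return lambda f: np.column_stack([np.ones(len(f)), f]) @ coef

    i1 = x == 1  # units whose attribute is changed from x1 = 1 to the baseline x0 = 0

    # Step 1: transport Z | X = x1 onto Z | X = x0 via quantile matching (1D optimal transport).
    z_tilde = z.copy()
    ranks = np.argsort(np.argsort(z[i1])) / (i1.sum() - 1)
    z_tilde[i1] = np.quantile(z[~i1], ranks)

    # Step 2: adapt W given the new Z, carrying over each unit's residual
    # (a valid transport map under the additive-noise assumption on f_W).
    f_w = fit_linear(np.column_stack([x, z]), w)
    w_tilde = w.copy()
    resid_w = w[i1] - f_w(np.column_stack([x[i1], z[i1]]))
    w_tilde[i1] = f_w(np.column_stack([np.zeros(i1.sum()), z_tilde[i1]])) + resid_w

    # Step 3: adapt Y given the new (Z, W), again by carrying residuals.
    f_y = fit_linear(np.column_stack([x, z, w]), y)
    y_tilde = y.copy()
    resid_y = y[i1] - f_y(np.column_stack([x[i1], z[i1], w[i1]]))
    y_tilde[i1] = f_y(np.column_stack([np.zeros(i1.sum()), z_tilde[i1], w_tilde[i1]])) + resid_y

    # After the adaptation, the two groups should be (approximately) indistinguishable.
    print("difference in means of Y before:", y[i1].mean() - y[~i1].mean())
    print("difference in means of Y after: ", y_tilde[i1].mean() - y_tilde[~i1].mean())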

Appendix B. Practical aspects of fairness measures


B.1 Identification of measures
The structure of the measures used in Causal Fairness Analysis was given by the Fairness
Map from Thm. 7 (see also Fig. 12). Moreover, in Theorem 9 in Appendix A.2 we have
shown that many of the measures in the map are identifiable from observational data in the
standard fairness model (SFM) and we provided explicit expressions for their identification.
The natural question is whether these measures remain identifiable when some assump-
tions of the SFM are relaxed. To answer this question, we consider what happens to
identifiability of different measures when we add bidirected edges to the GSFM .

B.1.1 Identification under Extended Fairness Model


There are five possible bidirected edges that could be added to the G_SFM (the bidirected
edge X ↔ Z is assumed to be present already). These five possibilities are W ↔ Y
(mediator-outcome), Z ↔ Y (confounder-outcome), X ↔ W (attribute-mediator), Z ↔ W
(confounder-mediator), and X ↔ Y (attribute-outcome). We analyze these cases in the
respective order.
Bidirected edge Z ↔ Y. Consider the case of confounder-outcome confounding,
represented by the Z ↔ Y edge. An example of such a model is given in the LHS
of Table 5. In this case, without expanding the Z set, none of the fairness measures are
identifiable (due to the set Z not satisfying the back-door criterion with respect to the variables
X and Y). However, this does not necessarily mean there is no hope for identifying our
fairness measures. What we do next is refine the Z set, in the hope that the additional
assumptions obtained in this process might help us identify our quantities of interest.


Measure                   ID expression

general:
  TE_{x0,x1}(y)           Σ_z [P(y | x_1, z) − P(y | x_0, z)] P(z)
  Exp-SE_x(y)             Σ_z P(y | x, z) [P(z) − P(z | x)]
  NDE_{x0,x1}(y)          Σ_{z,w} [P(y | x_1, z, w) − P(y | x_0, z, w)] P(w | x_0, z) P(z)
  NIE_{x0,x1}(y)          Σ_{z,w} P(y | x_0, z, w) [P(w | x_1, z) − P(w | x_0, z)] P(z)

x-specific:
  ETT_{x0,x1}(y | x)      Σ_z [P(y | x_1, z) − P(y | x_0, z)] P(z | x)
  Ctf-SE_{x0,x1}(y)       Σ_z P(y | x_0, z) [P(z | x_0) − P(z | x_1)]
  Ctf-DE_{x0,x1}(y | x)   Σ_{z,w} [P(y | x_1, z, w) − P(y | x_0, z, w)] P(w | x_0, z) P(z | x)
  Ctf-IE_{x0,x1}(y | x)   Σ_{z,w} P(y | x_0, z, w) [P(w | x_1, z) − P(w | x_0, z)] P(z | x)

z-specific:
  z-TE_{x0,x1}(y | x)     P(y | x_1, z) − P(y | x_0, z)
  z-DE_{x0,x1}(y | x)     Σ_w [P(y | x_1, z, w) − P(y | x_0, z, w)] P(w | x_0, z)
  z-IE_{x0,x1}(y | x)     Σ_w P(y | x_0, z, w) [P(w | x_1, z) − P(w | x_0, z)]

Table 4: Population level and x-specific causal measures of fairness in the TV-family, and
their identification expressions under the standard fairness model G_SFM.

Table 5: An example of the extended fairness model with a bidirected Z ↔ Y edge
(left side), in which refining the set of variables Z yields a graph (right side) in which all
fairness measures are identifiable.

    [Graphs omitted: the cluster model is over {X, Z, Y} (left) and the refined model is over {X, Z1, Z2, Y} (right).]

In some sense, the assumptions encoded in the clustered diagram are not sufficient for
identification. It might turn out, however, that by spelling out all the variables in the cluster,
some additional assumptions might help with identification. Consider the example on the
RHS of Table 5, where the full causal graph is given after refining the previously clustered
Z set. Interestingly, in this case the set {Z_1, Z_2} can be shown to be back-door admissible
for the effect of X on Y. Furthermore, the identification expression for all the quantities
remains the same as in the standard fairness model, given by the expressions in Table 4.

Bidirected edge W ↔ Y. Next, consider the case where there is a bidirected edge
between the group of variables W and the outcome Y. Firstly, we note that the identification
of the causal (TE/ETT) and spurious (Exp-SE/Ctf-SE) measures is unaffected by the W ↔ Y
edge, and that these quantities are identified by the same expressions as in Table 4. The
quantities measuring direct and indirect effects are not identifiable, at least not without
further refining the W set. Consider the example given in Table 6. In the LHS of the table,
we have a model in which W is clustered and the NDE and NIE quantities are not identifiable.


Table 6: An example of the extended fairness model with a bidirected W ↔ Y edge
(left side), in which refining the set of variables W yields a graph (right side) in which all
fairness measures are identifiable.

    [Graphs omitted: the cluster model is over {X, W, Y} (left) and the refined model is over {X, W1, W2, Y} (right).]

On the RHS, after expanding the previously clustered W set, the natural direct (and indirect)
effects can be identified by virtue of the front-door criterion (Pearl, 2000). However,
note that in this case the identification expression for the natural direct effect differs from
the one obtained in the standard fairness model. Whenever front-door identification is used,
we expect the expression to change, compared to the baseline SFM case.
Bidirected edge X ↔ W. The case of the X ↔ W edge is similar to that of
W ↔ Y, yet slightly different. None of the measures discussed are identifiable in this
case before refining the W set. However, similarly to the W ↔ Y example in Table
6, when refining the W set we might find that the effect of X on Y is in fact identifiable
via the front-door. Again, the identification expression in this case will change. For the
sake of brevity, we skip an explicit example.
Bidirected edge Z ↔ W. In the case of the Z ↔ W edge, none of the measures
are identifiable. However, refining the Z and W sets might help. To see an example, consider
the following graph:

    [Graph omitted: an example over {X, Z1, W1, W2, Y}.]

In this case, all of the measures of fairness in Table 4 are identifiable, but again with different
expressions than those presented in the table.
Bidirected edge X ↔ Y. The attribute-outcome confounding represented by the
X ↔ Y edge is the most difficult case. When this edge is present, none of the fairness
quantities can be identified. The reason this case is hard is that the X ↔ Y edge places a
bidirected edge between X and its child Y, which makes the effect of X on Y non-identifiable
(Tian and Pearl, 2002). For more general identification strategies for
when a combination of observational and experimental data is available, refer to (Lee et al.,
2019) and (Correa et al., 2021), and for partial identification ones, see (Zhang et al., 2022,
in press).
The summary of the discussion of the five cases of bidirected edges in the extended
fairness model, and what can be done in their presence, is given in Table 7.


Table 7: Identification of causal fairness measures under latent confounding.

  Bidirected edge   Causal (TE/ETT)   Spurious (Exp-SE/Ctf-SE)   Direct (DE)    Indirect (IE)
  W ↔ Y             ✓                 ✓                          Refine W       Refine W
  Z ↔ Y             Refine Z          Refine Z                   Refine Z       Refine Z
  X ↔ W             Refine W          Refine W                   Refine W       Refine W
  Z ↔ W             Refine Z, W       Refine Z, W                Refine Z, W    Refine Z, W
  X ↔ Y             ✗                 ✗                          ✗              ✗

Identification checks and suggestions about when to refine the Z- or W-sets are included in the faircause
R-package. We end with an example that fits the extended fairness model with all bidirected
edges apart from X ↔ Y, but in which all the fairness measures in Table 4 are identifiable
(albeit not with the same expressions as in the table), showing that refining the Z and W sets
can sometimes help:

    [Graph omitted: an example over {X, Z1, Z2, W1, W2, Y}.]

B.2 Estimation of measures


Suppose we found that a target causal measure of fairness is identifiable from observational
data (after possibly refining the SFM). The next question is then how to estimate the
causal measure in practice. There is a large body of literature on the estimation of causal
quantities, based on which our own implementation is built. We focus on describing how
to estimate E(yx ) and E(yx1 ,Wx0 ). Most fairness measures can then be derived from taking
(conditional) differences of these two estimands.

B.2.1 Doubly Robust Estimation


In the SFM, a standard way of computing the quantity E(yx ) would be using inverse
propensity weighting. The mediator W can be marginalized out and the estimator
    (1/n) Σ_{i=1}^{n} 1(X_i = x) Y_i / p̂(X_i | Z_i),                                    (400)


where p̂(X_i | Z_i) is the estimate of the conditional probability P(X_i = x | Z_i), can be used.
The additional assumption necessary for such an approach is the positivity assumption:

Definition 33 (Positivity assumption) The positivity assumption holds if, ∀ x, z, P(X =
x | Z = z) is bounded away from 0 and 1, that is,

    δ < P(X = x | Z = z) < 1 − δ,

for some δ > 0.

Such an assumption is needed for the estimation of causal quantities we discuss (together
with the assumptions encoded in the SFM that are used for identification).
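As a concrete illustration of the estimator in Eq. 400 (a sketch added for exposition, not the faircause implementation, which is written in R), consider the following snippet; the simulated data-generating process, the logistic-regression propensity model, and the clipping threshold (a practical guard related to Def. 33) are assumptions of the example.

    import numpy as np
    from sklearn.linear_model import LogisticRegression

    rng = np.random.default_rng(3)

    # Illustrative data consistent with the SFM: Z confounds X and Y, W mediates X -> Y.
    n = 20_000
    z = rng.normal(size=(n, 2))
    x = rng.binomial(1, 1 / (1 + np.exp(-(z[:, 0] - 0.5 * z[:, 1]))))
    w = 0.8 * x + z[:, 0] + rng.normal(size=n)
    y = 1.0 * x + 0.5 * w + z @ np.array([0.7, -0.3]) + rng.normal(size=n)

    # IPW estimator of E[Y_x] for x = 1 (Eq. 400); the mediator W is marginalized out.
    p_hat = LogisticRegression().fit(z, x).predict_proba(z)[:, 1]   # estimate of P(X = 1 | Z)
    p_hat = np.clip(p_hat, 0.01, 0.99)                              # relies on positivity (Def. 33)
    ipw_estimate = np.mean((x == 1) * y / p_hat)
    print("IPW estimate of E[Y_{x=1}]:", ipw_estimate)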
However, more powerful estimation techniques have been developed and applied very broadly.
In particular, doubly robust estimators have been proposed for the estimation of causal
quantities (Heckman et al., 1998; Bang and Robins, 2005). In the context of the estimator in
Equation (400), a doubly robust estimator would be

    (1/n) Σ_{i=1}^{n} [ 1(X_i = x)(Y_i − μ̂(Y_i | Z_i, X_i)) / p̂(X_i | Z_i) + μ̂(Y_i | Z_i, X_i) ],      (401)

where μ̂ denotes the estimator of the conditional mean E[Y | Z = z, X = x]. In fact, only
one of the two estimators μ̂(Y_i | Z_i, X_i) and p̂(X_i | Z_i) needs to be consistent for the entire
estimator in Equation (401) to be consistent. Such robustness to model misspecification is
a rather desirable property.
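Continuing the illustrative example from the IPW sketch, a doubly robust estimator in the spirit of Eq. 401 can be assembled as follows; note that, as is standard for the AIPW form, the outcome-model term is evaluated at the intervention value X = x, and the simple linear/logistic nuisance models are again assumptions of the example.

    import numpy as np
    from sklearn.linear_model import LinearRegression, LogisticRegression

    rng = np.random.default_rng(4)

    # Same illustrative data-generating process as in the IPW sketch.
    n = 20_000
    z = rng.normal(size=(n, 2))
    x = rng.binomial(1, 1 / (1 + np.exp(-(z[:, 0] - 0.5 * z[:, 1]))))
    w = 0.8 * x + z[:, 0] + rng.normal(size=n)
    y = 1.0 * x + 0.5 * w + z @ np.array([0.7, -0.3]) + rng.normal(size=n)

    def aipw(x_val):
        # Doubly robust (AIPW) estimate of E[Y_x], combining propensity and outcome models.
        p_hat = np.clip(LogisticRegression().fit(z, x).predict_proba(z)[:, x_val], 0.01, 0.99)
        mu_model = LinearRegression().fit(np.column_stack([x, z]), y)   # models E[Y | X, Z]
        mu_hat = mu_model.predict(np.column_stack([np.full(n, x_val), z]))
        return np.mean((x == x_val) * (y - mu_hat) / p_hat + mu_hat)

    print("AIPW estimate of E[Y_{x=1}] - E[Y_{x=0}]:", aipw(1) - aipw(0))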
Estimating E(yx1 ,Wx0 ) in a doubly robust fashion is somewhat more involved. This
problem has been studied under the rubric of causal mediation analysis (Robins and Green-
land, 1992; Pearl, 2001; Robins, 2003). Tchetgen and Shpitser (2012) proposed a doubly
robust estimator of the expected potential outcome E[Yx1 ,Wx0 ] defined via:

    φ_{x_0,x_1}(X, W, Z) = [1(X = x_1) f(W | x_0, Z) / (p_{x_1}(Z) f(W | x_1, Z))] [Y − μ(x_1, W, Z)]
                         + [1(X = x_0) / p_{x_0}(Z)] [μ(x_1, W, Z) − ∫_W μ(x_1, w, Z) f(w | x_0, Z) dw]      (402)
                         + ∫_W μ(x_1, w, Z) f(w | x_0, Z) dw.

The estimator is given by (1/n) Σ_{i=1}^{n} φ̂_{x_0,x_1}(X_i, W_i, Z_i), where in φ̂ the quantities p_x(Z), μ(X, W, Z),
and f(W | X, Z) are replaced by their respective estimates. Such an estimator is multiply robust
(one of the three models can be misspecified). However, the estimator also requires the
estimation of the conditional density f(W | X, Z). In the case of continuous or high-dimensional
W, estimating the conditional density can be very hard, and the estimator could therefore
suffer in performance. We revisit the estimation of E[y_{x_1, W_{x_0}}] shortly.
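To make Eq. 402 concrete in a setting where the density f(W | X, Z) is easy to handle, the following sketch evaluates φ on simulated data with a binary mediator W, so that the integrals over W reduce to sums of two terms. The simulated model and the logistic/linear nuisance estimators are illustrative assumptions, not a recommended specification.

    import numpy as np
    from sklearn.linear_model import LinearRegression, LogisticRegression

    rng = np.random.default_rng(7)

    # Illustrative data with a binary mediator W, so the integrals over W become sums.
    n = 20_000
    z = rng.normal(size=(n, 2))
    x = rng.binomial(1, 1 / (1 + np.exp(-(z[:, 0] - 0.5 * z[:, 1]))))
    w = rng.binomial(1, 1 / (1 + np.exp(-(0.8 * x + z[:, 0]))))
    y = 1.0 * x + 0.5 * w + z @ np.array([0.7, -0.3]) + rng.normal(size=n)

    x0, x1 = 0, 1
    # Nuisance models: propensity P(X | Z), mediator model f(W | X, Z), outcome mean mu(X, W, Z).
    p_x = LogisticRegression().fit(z, x)
    f_w = LogisticRegression().fit(np.column_stack([x, z]), w)
    mu = LinearRegression().fit(np.column_stack([x, w, z]), y)

    px = np.clip(p_x.predict_proba(z), 0.01, 0.99)                    # columns: P(X=0|Z), P(X=1|Z)
    fw_x0 = np.clip(f_w.predict_proba(np.column_stack([np.full(n, x0), z])), 0.01, 0.99)
    fw_x1 = np.clip(f_w.predict_proba(np.column_stack([np.full(n, x1), z])), 0.01, 0.99)
    mu_x1_w = mu.predict(np.column_stack([np.full(n, x1), w, z]))     # mu(x1, W_i, Z_i)

    # Inner integral of Eq. 402: sum over w of mu(x1, w, Z) f(w | x0, Z) (two terms for binary W).
    mu_bar = sum(mu.predict(np.column_stack([np.full(n, x1), np.full(n, wv), z])) * fw_x0[:, wv]
                 for wv in (0, 1))

    phi = ((x == x1) * fw_x0[np.arange(n), w] / (px[:, x1] * fw_x1[np.arange(n), w]) * (y - mu_x1_w)
           + (x == x0) / px[:, x0] * (mu_x1_w - mu_bar)
           + mu_bar)
    print("multiply robust estimate of E[Y_{x1, W_{x0}}]:", phi.mean())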

B.2.2 Double Machine Learning


Doubly (and multiply) robust estimation allows for model misspecification of one of the
models, while retaining consistency of the estimator. However, we have not discussed the


convergence rates of these estimators yet. In some cases, fast O(n^{−1/2}) rates are attainable for
doubly robust estimators, under certain conditions. For example, one such condition is that
p_x(Z), μ(X, W, Z) and their estimates belong to the Donsker class of functions (Benkeser
et al., 2017). For a review, refer to (Kennedy, 2016). However, estimators based on modern
ML methods typically do not belong to the Donsker class.
In a recent advance, Chernozhukov et al. (2018) showed that the Donsker class condition
can, in many cases (including modern ML methods), be relaxed by using a cross-fitting
approach. This method was named double machine learning (DML). For estimating E[Yx ]
we make use of the estimator in Equation (401) and proceed as follows:

1. Split the data D into K disjoint folds D_1, D_2, ..., D_K,

2. Using the complement of fold D_k (labeled D_k^C), compute the estimates p̂_x^{−(k)}(Z),
   μ̂^{−(k)}(X, Z) of P(X = x | Z = z) and E[Y | X = x, Z = z],

3. Compute

       1(X_i = x)(Y_i − μ̂(Y_i | Z_i, X_i)) / p̂(X_i | Z_i) + μ̂(Y_i | Z_i, X_i),            (403)

   for each observation (X_i, Z_i, Y_i) in D_k by plugging in the estimators p̂_x^{−(k)}(Z), μ̂^{−(k)}(X, Z)
   obtained on the complement D_k^C,

4. Take the mean of the terms in Equation (403) across all observations; a sketch of this
   cross-fitting procedure is given below.
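A minimal illustrative sketch of Steps 1-4 (with K = 5 folds, gradient-boosting learners standing in for arbitrary ML methods, and the same simulated data-generating process used in the earlier sketches; none of these choices come from the original text) could look as follows.

    import numpy as np
    from sklearn.ensemble import GradientBoostingClassifier, GradientBoostingRegressor
    from sklearn.model_selection import KFold

    rng = np.random.default_rng(5)

    # Illustrative data consistent with the SFM (Z confounder, W mediator).
    n = 10_000
    z = rng.normal(size=(n, 2))
    x = rng.binomial(1, 1 / (1 + np.exp(-(z[:, 0] - 0.5 * z[:, 1]))))
    w = 0.8 * x + z[:, 0] + rng.normal(size=n)
    y = 1.0 * x + 0.5 * w + z @ np.array([0.7, -0.3]) + rng.normal(size=n)

    def dml_mean_potential_outcome(x_val, n_folds=5):
        # Cross-fitted (DML) estimate of E[Y_x], following Steps 1-4.
        scores = np.zeros(n)
        xz = np.column_stack([x, z])
        for train_idx, test_idx in KFold(n_splits=n_folds, shuffle=True, random_state=0).split(z):
            # Step 2: nuisance models fitted on the complement D_k^C.
            prop = GradientBoostingClassifier().fit(z[train_idx], x[train_idx])
            outc = GradientBoostingRegressor().fit(xz[train_idx], y[train_idx])
            # Step 3: plug-in scores for the held-out fold D_k.
            p_hat = np.clip(prop.predict_proba(z[test_idx])[:, x_val], 0.01, 0.99)
            mu_hat = outc.predict(np.column_stack([np.full(len(test_idx), x_val), z[test_idx]]))
            ind = (x[test_idx] == x_val)
            scores[test_idx] = ind * (y[test_idx] - mu_hat) / p_hat + mu_hat
        # Step 4: average the scores across all observations.
        return scores.mean()

    print("DML estimate of E[Y_{x=1}]:", dml_mean_potential_outcome(1))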

For estimating E[yx1 ,Wx0 ] we follow the approach of Farbmacher et al. (2020). The authors
propose a slightly different estimator than that based on Equation (402), where they replace
φx0 ,x1 (X, W, Z) by

    ψ_{x_0,x_1}(X, W, Z) = [1(X = x_1) p_{x_0}(Z, W) / (p_{x_1}(Z, W) p_{x_0}(Z))] [Y − μ(x_1, W, Z)]
                         + [1(X = x_0) / p_{x_0}(Z)] [μ(x_1, W, Z) − E[μ(x_1, W, Z) | X = x_0, Z]]           (404)
                         + E[μ(x_1, W, Z) | X = x_0, Z],

which avoids the computation of densities in a possibly high-dimensional case. The terms
ψ_{x_0,x_1}(X, W, Z) are estimated in a cross-fitting procedure as described above, with the slight
extension that in Step 2 we need to further split the complement D_k^C into two parts, to
estimate the conditional mean μ(X, W, Z) and the nested conditional mean E[μ(x_1, W, Z) |
X = x_0, Z] on disjoint subsets of the data. This approach is used in the faircause
R-package.
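For completeness, here is an illustrative sketch of this nested cross-fitting scheme for E[Y_{x_1, W_{x_0}}] based on the scores ψ in Eq. 404 (simulated data, simple linear/logistic learners, and K = 5 folds are assumptions of the example; the faircause R-package remains the reference implementation).

    import numpy as np
    from sklearn.linear_model import LinearRegression, LogisticRegression
    from sklearn.model_selection import KFold

    rng = np.random.default_rng(6)

    # Illustrative data consistent with the SFM (Z confounder, W mediator).
    n = 20_000
    z = rng.normal(size=(n, 2))
    x = rng.binomial(1, 1 / (1 + np.exp(-(z[:, 0] - 0.5 * z[:, 1]))))
    w = 0.8 * x + z[:, 0] + rng.normal(size=n)
    y = 1.0 * x + 0.5 * w + z @ np.array([0.7, -0.3]) + rng.normal(size=n)

    def nested_dml_cross_world(x0=0, x1=1, n_folds=5):
        # Cross-fitted estimate of E[Y_{x1, W_{x0}}] based on the scores psi in Eq. 404.
        scores = np.zeros(n)
        zw = np.column_stack([z, w])
        xzw = np.column_stack([x, z, w])
        for train_idx, test_idx in KFold(n_splits=n_folds, shuffle=True, random_state=0).split(z):
            half_a, half_b = np.array_split(train_idx, 2)     # nested split of the complement D_k^C
            # Nuisance models: propensities given Z and (Z, W); outcome mean mu fitted on half A.
            p_z = LogisticRegression().fit(z[train_idx], x[train_idx])
            p_zw = LogisticRegression().fit(zw[train_idx], x[train_idx])
            mu = LinearRegression().fit(xzw[half_a], y[half_a])
            # Nested regression E[mu(x1, W, Z) | X = x0, Z], fitted on half B.
            b0 = half_b[x[half_b] == x0]
            mu_b0 = mu.predict(np.column_stack([np.full(len(b0), x1), z[b0], w[b0]]))
            nu = LinearRegression().fit(z[b0], mu_b0)
            # Evaluate the score psi on the held-out fold D_k.
            pz = np.clip(p_z.predict_proba(z[test_idx]), 0.01, 0.99)
            pzw = np.clip(p_zw.predict_proba(zw[test_idx]), 0.01, 0.99)
            mu_t = mu.predict(np.column_stack([np.full(len(test_idx), x1), z[test_idx], w[test_idx]]))
            nu_t = nu.predict(z[test_idx])
            xt, yt = x[test_idx], y[test_idx]
            scores[test_idx] = (
                (xt == x1) * pzw[:, x0] / (pzw[:, x1] * pz[:, x0]) * (yt - mu_t)
                + (xt == x0) / pz[:, x0] * (mu_t - nu_t)
                + nu_t
            )
        return scores.mean()

    print("estimate of E[Y_{x1, W_{x0}}]:", nested_dml_cross_world())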

References
Civil Rights Act. Civil rights act of 1964. Title VII, Equal Employment Opportunities,
1964.

Robert J Adler, Jonathan E Taylor, et al. Random fields and geometry, volume 80. Springer,
2007.


Alekh Agarwal, Alina Beygelzimer, Miroslav Dudík, John Langford, and Hanna Wallach.
A reductions approach to fair classification. In International Conference on Machine
Learning, pages 60–69. PMLR, 2018.

Tara Anand, Adele Ribeiro, Jin Tian, and Elias Bareinboim. Effect identification in causal
diagrams with clustered variables. 2021. TR-77, Causal Artificial Intelligence Lab,
Columbia University, https://causalai.net/r77.pdf.

Julia Angwin, Jeff Larson, Surya Mattu, and Lauren Kirchner. Machine bias: There’s
software used across the country to predict future criminals. and it’s biased against
blacks. ProPublica, May 23 2016. URL https://www.propublica.org/article/
machine-bias-risk-assessments-in-criminal-sentencing.

Alexander Balke and Judea Pearl. Counterfactual probabilities: Computational methods,


bounds and applications. In Uncertainty Proceedings 1994, pages 46–54. Elsevier, 1994.

Heejung Bang and James M Robins. Doubly robust estimation in missing data and causal
inference models. Biometrics, 61(4):962–973, 2005.

Elias Bareinboim, Juan D. Correa, Duligur Ibeling, and Thomas Icard. On pearl’s hierarchy
and the foundations of causal inference. In Probabilistic and Causal Inference: The Works
of Judea Pearl, page 507–556. Association for Computing Machinery, New York, NY,
USA, 1st edition, 2022.

Solon Barocas and Andrew D Selbst. Big data’s disparate impact. Calif. L. Rev., 104:671,
2016.

David Benkeser, Marco Carone, MJ Van Der Laan, and PB Gilbert. Doubly robust non-
parametric inference on the average treatment effect. Biometrika, 104(4):863–880, 2017.

Peter J Bickel, Eugene A Hammel, and J William O’Connell. Sex bias in graduate admis-
sions: Data from berkeley. Science, 187(4175):398–404, 1975.

Allan J Brimicombe. Ethnicity, religion, and residential segregation in london: evidence


from a computational typology of minority communities. Environment and Planning B:
Planning and Design, 34(5):884–904, 2007.

Joy Buolamwini and Timnit Gebru. Gender shades: Intersectional accuracy disparities in
commercial gender classification. In Sorelle A. Friedler and Christo Wilson, editors, Pro-
ceedings of the 1st Conference on Fairness, Accountability and Transparency, volume 81
of Proceedings of Machine Learning Research, pages 77–91, NY, USA, 2018.

Toon Calders and Sicco Verwer. Three naive bayes approaches for discrimination-free clas-
sification. Data Mining journal, 2010.

Victor Chernozhukov, Denis Chetverikov, Mert Demirer, Esther Duflo, Christian Hansen,
Whitney Newey, and James Robins. Double/debiased machine learning for treatment
and structural parameters, 2018.


A. Chouldechova. Fair prediction with disparate impact: A study of bias in recidivism


prediction instruments. Technical Report arXiv:1703.00056, arXiv.org, 2017.

Sam Corbett-Davies and Sharad Goel. The measure and mismeasure of fairness: A critical
review of fair machine learning. arXiv preprint arXiv:1808.00023, 2018.

Juan Correa, Sanghack Lee, and Elias Bareinboim. Nested counterfactual identification
from arbitrary surrogate experiments. In Advances in Neural Information Processing
Systems, volume 34, 2021.

John Detrixhe and Jeremy B. Merrill. The fight against financial advertisers using facebook
for digital redlining, November 1 2019.

Qu Jian Ding and Therese Hesketh. Family size, fertility preferences, and sex ratio in
china in the era of the one child family policy: results from national family planning and
reproductive health survey, 2006.

Cynthia Dwork, Moritz Hardt, Toniann Pitassi, Omer Reingold, and Richard Zemel. Fair-
ness through awareness. In Proceedings of the 3rd innovations in theoretical computer
science conference, pages 214–226, 2012.

Helmut Farbmacher, Martin Huber, Lukáš Lafférs, Henrika Langen, and Martin
Spindler. Causal mediation analysis with double machine learning. arXiv preprint
arXiv:2002.12710, 2020.

Sorelle A. Friedler, Carlos Scheidegger, and Suresh Venkatasubramanian. On the


(im)possibility of fairness. Technical Report 1609.07236, arxiv.org, September 23 2016.
URL http://arxiv.org/abs/1609.07236.

Sara Hajian and Josep Domingo-Ferrer. A study on the impact of data anonymization on
anti-discrimination. In Toon Calders and Indre Zliobaite, editors, ICDM International
Workshop on Discrimination and Privacy-Aware Data Mining. IEEE, December 10 2012.

Moritz Hardt, Eric Price, and Nati Srebro. Equality of opportunity in supervised learning.
Advances in neural information processing systems, 29:3315–3323, 2016.

Drew Harwell. Federal study confirms racial bias of many


facial-recognition systems, casts doubt on their expanding use.
https://www.washingtonpost.com/technology/2019/12/19/federal-study-confirms-
racial-bias-many-facial-recognition-systems-casts-doubt-their-expanding-use/, December
19, 2019.

James J Heckman, Hidehiko Ichimura, and Petra Todd. Matching as an econometric eval-
uation estimator. The review of economic studies, 65(2):261–294, 1998.

Jesus Hernandez. Redlining revisited: mortgage lending patterns in sacramento 1930–2004.


International Journal of Urban and Regional Research, 33(2):291–313, 2009.

Therese Hesketh, Li Lu, and Zhu Wei Xing. The effect of china’s one-child family policy
after 25 years, 2005.


Shanyu Ji, János Kollár, and Bernard Shiffman. A global lojasiewicz inequality for algebraic
varieties. Transactions of the American Mathematical Society, 329(2):813–818, 1992.

Faisal Kamiran and Toon Calders. Classifying without discriminating. In Proc. IC4 09.
IEEE, 2009.

Faisal Kamiran and Toon Calders. Data preprocessing techniques for classification without
discrimination. Knowledge and Information Systems, 33(1):1–33, 2012.

Faisal Kamiran, Toon Calders, and Mykola Pechenizkiy. Discrimination aware decision tree
learning. In International Conference on Data Mining. IEEE, 2010.

Faisal Kamiran, Asim Karim, and Xiangliang Zhang. Decision theory for discrimination-
aware classification. In 2012 IEEE 12th International Conference on Data Mining, pages
924–929. IEEE, 2012.

Toshihiro Kamishima, Shotaro Akaho, Hideki Asoh, and Jun Sakuma. Fairness-aware clas-
sifier with prejudice remover regularizer. In Joint European Conference on Machine
Learning and Knowledge Discovery in Databases, pages 35–50. Springer, 2012.

Edward H Kennedy. Semiparametric theory and empirical processes in causal inference. In


Statistical causal inferences and their applications in public health research, pages 141–
167. Springer, 2016.

Nick Kotz. Judgment Days: Lyndon Baines Johnson, Martin Luther King, Jr., and the
Laws That Changed America. HMH, 2005.

Matt J Kusner, Joshua Loftus, Chris Russell, and Ricardo Silva. Counterfactual fairness.
Advances in neural information processing systems, 30, 2017.

Jeff Larson, Surya Mattu, Lauren Kirchner, and Julia Angwin. How we analyzed the compas
recidivism algorithm. ProPublica (5 2016), 9, 2016.

Sanghack Lee, Juan Correa, and Elias Bareinboim. General identifiability with arbitrary
surrogate experiments. In Proceedings of the 35th Conference on Uncertainty in Artificial
Intelligence, Tel Aviv, Israel, 2019. AUAI Press.

B. T. Luong, S. Ruggieri, and F. Turini. k-nn as an implementation of situation testing


for discrimination discovery and prevention. In 17th ACM International Conference on
Knowledge Discovery and Data Mining (KDD 2011). ACM, 2011.

Koray Mancuhan and Chris Clifton. Decision tree classification on outsourced data. In
Workshop on Data Ethics held in conjunction with KDD 2014, New York, NY, 2014.

David Benjamin Oppenheimer. Kennedy, king, shuttlesworth and walker: The events lead-
ing to the introduction of the civil rights act of 1964. USFL Rev., 29:645, 1994.

J. Pearl. Causality: Models, Reasoning, and Inference. Cambridge University Press, New
York, 2000. 2nd edition, 2009.


Judea Pearl. Direct and indirect effects. In Proceedings of the Seventeenth Conference
on Uncertainty in Artificial Intelligence, page 411–420, San Francisco, CA, USA, 2001.
Morgan Kaufmann Publishers Inc.

Judea Pearl and Dana Mackenzie. The Book of Why: The New Science of Cause and Effect.
Basic Books, Inc., New York, NY, USA, 1st edition, 2018.

Dino Pedreschi, Salvatore Ruggieri, and Franco Turini. Discrimination-aware data mining.
In 14th ACM International Conference on Knowledge Discovery and Data Mining (KDD
2008). ACM, 2008.

Dino Pedreschi, Salvatore Ruggieri, and Franco Turini. Measuring discrimination in socially-
sensitive decision records. In 9th SIAM Conference on Data Mining (SDM 2009), pages
581–592, 2009.

Drago Plečko and Nicolai Meinshausen. Fair data adaptation with quantile preservation.
Journal of Machine Learning Research, 21:242, 2020.

Geoff Pleiss, Manish Raghavan, Felix Wu, Jon Kleinberg, and Kilian Q. Weinberger. On
fairness and calibration. In NIPS, 2017. URL https://arxiv.org/abs/1709.02012.

James M Robins. Semantics of causal dag models and the identification of direct and
indirect effects. Oxford Statistical Science Series, pages 70–82, 2003.

James M Robins and Sander Greenland. Identifiability and exchangeability for direct and
indirect effects. Epidemiology, pages 143–155, 1992.

Andrea Romei and Salvatore Ruggieri. A multidisciplinary survey on discrimination anal-


ysis. The Knowledge Engineering Review, 29(5):582–638, 2014.

Salvatore Ruggieri, Dino Pedreschi, and Franco Turini. Dcube: Discrimination discovery
in databases. In 17th ACM International Conference on Knowledge Discovery and Data
Mining (KDD 2011). ACM, 2011.

George Rutherglen. Disparate impact under title vii: an objective theory of discrimination.
Va. L. Rev., 73:1297, 1987.

Ilya Shpitser and Judea Pearl. What counterfactuals can be tested. In Proceedings of the
Twenty-third Conference on Uncertainty in Artificial Intelligence, page 352–359, 2007.

Eric J Tchetgen Tchetgen and Ilya Shpitser. Semiparametric theory for causal media-
tion analysis: efficiency bounds, multiple robustness, and sensitivity analysis. Annals of
statistics, 40(3):1816, 2012.

Jin Tian and Judea Pearl. Probabilities of causation: Bounds and identification. Annals of
Mathematics and Artificial Intelligence, 28(1):287–313, 2000.

Jin Tian and Judea Pearl. A general identification condition for causal effects. In Aaai/iaai,
pages 567–573, 2002.


Rich Zemel, Yu Wu, Kevin Swersky, Toniann Pitassi, and Cynthia Dwork. Learning fair
representations. In S. Dasgupta and D. Mcallester, editors, Proceedings of the 30th In-
ternational Conference on Machine Learning, volume 28, pages 325–333, May 2013.

Yves Zenou and Nicolas Boccard. Racial discrimination and redlining in cities. Journal of
Urban economics, 48(2):260–285, 2000.

Brian Hu Zhang, Blake Lemoine, and Margaret Mitchell. Mitigating unwanted biases with
adversarial learning. In Proceedings of the 2018 AAAI/ACM Conference on AI, Ethics,
and Society, pages 335–340, 2018.

Junzhe Zhang and Elias Bareinboim. Equality of opportunity in classification: A causal


approach. In S. Bengio, H. Wallach, H. Larochelle, K. Grauman, N. Cesa-Bianchi, and
R. Garnett, editors, Advances in Neural Information Processing Systems 31, pages 3671–
3681, Montreal, Canada, 2018a. Curran Associates, Inc.

Junzhe Zhang and Elias Bareinboim. Fairness in decision-making—the causal explanation


formula. In Proceedings of the AAAI Conference on Artificial Intelligence, volume 32,
2018b.

Junzhe Zhang and Elias Bareinboim. Non-parametric path analysis in structural causal
models. In Proceedings of the 34th Conference on Uncertainty in Artificial Intelligence,
2018c.

Junzhe Zhang, Jin Tian, and Elias Bareinboim. Partial counterfactual identification from
observational and experimental data. In Proceedings of the 39th International Conference
on Machine Learning, 2022, in press.

Indre Zliobaite, Faisal Kamiran, and Toon Calders. Handling conditional discrimination.
In International Conference on Data Mining. IEEE, 2011.

