0% found this document useful (0 votes)

143 views23 pages

Counterfactual Explanations and Algorithmic Recourses For Machine Learning: A Review

This document reviews research on counterfactual explanations for machine learning models. Counterfactual explanations show how a model's output may have changed if the input values were different. The review categorizes counterfactual explanation algorithms and evaluates their properties. It finds that counterfactual explanations can increase trust in models by helping people understand decisions and check for bias. The review also identifies gaps in research and promising directions.

Uploaded by

Gurmehak kaur

We take content rights seriously. If you suspect this is your content, claim it here.

Available Formats

Download as PDF, TXT or read online on Scribd

0% found this document useful (0 votes)

143 views23 pages

Counterfactual Explanations and Algorithmic Recourses For Machine Learning: A Review

Uploaded by

Gurmehak kaur

We take content rights seriously. If you suspect this is your content, claim it here.

Available Formats

Download as PDF, TXT or read online on Scribd

You are on page 1/ 23

Counterfactual Explanations and Algorithmic Recourses for

Machine Learning: A Review

Sahil Verma Varich Boonsanong Minh Hoang
[email protected] [email protected] [email protected]
University of Washington University of Washington University of Washington
Seattle, WA, USA Seattle, WA, USA Seattle, WA, USA

Keegan E. Hines John P. Dickerson Chirag Shah

[email protected] [email protected] [email protected]
arXiv:2010.10596v3 [cs.LG] 15 Nov 2022

Arthur AI Arthur AI University of Washington

Washington D.C., USA Washington D.C., USA Seattle, WA, USA

ABSTRACT • An explanation can be beneficial to the applicant whose life

Machine learning plays a role in many deployed decision systems, is impacted by the decision. For example, it helps an applicant
often in ways that are difficult or impossible to understand by hu- understand which of their attributes were strong drivers in de-
man stakeholders. Explaining, in a human-understandable way, the termining a decision.
relationship between the input and output of machine learning • Various forms of explanations can serve as a proxy for trans-
models is essential to the development of trustworthy machine parency in the system, which could increase its trustworthiness.
learning based systems. A burgeoning body of research seeks to • Further, it can help an applicant challenge a decision if they feel
define the goals and methods of explainability in machine learn- an unfair treatment has been meted out, e.g., if one’s race was
ing. In this paper, we seek to review and categorize research on crucial in determining the outcome. This can also be useful for
counterfactual explanations, a specific class of explanation that pro- organizations to check for bias in their algorithms.
vides a link between what could have happened had input to a • In some instances, an explanation provides the applicant with
model been changed in a particular way. Modern approaches to feedback that they can act upon to receive the desired outcome
counterfactual explainability in machine learning draw connec- at a future time.
tions to the established legal doctrine in many countries, making • Explanations can help the machine learning model developers
them appealing to fielded systems in high-impact areas such as identify, detect, and fix bugs and other performance issues.
finance and healthcare. Thus, we design a rubric with desirable • Explanations help adhere to laws surrounding machine-produced
properties of counterfactual explanation algorithms and compre- decisions, e.g., GDPR [62].
hensively evaluate all currently proposed algorithms against that
Explainability in machine learning is broadly about using inher-
rubric. Our rubric provides easy comparison and comprehension
ently interpretable and transparent models or generating post-hoc
of the advantages and disadvantages of different approaches and
explanations for opaque models. Examples of the former include
serves as an introduction to major research themes in this field. We
linear/logistic regression, decision trees, rule sets, etc. Examples of
also identify gaps and discuss promising research directions in the
the latter include random forests, support vector machines (SVMs),
space of counterfactual explainability.
and neural networks. Post-hoc explanation approaches can either
be model-specific or model-agnostic. Explanations by feature im-
1 INTRODUCTION portance and model simplification are two broad kinds of model-
specific approaches. Model-agnostic approaches can be categorized
Machine learning is increasingly accepted as an effective tool to
into visual explanations, local explanations, feature importance,
enable large-scale automation in many domains. In lieu of hand-
and model simplification.
designed rules, algorithms are able to learn from data to discover
patterns and support decisions. Those decisions can, and do, di- Feature importance finds the most influential features contributing
rectly or indirectly impact humans; high-profile cases include appli- to the model’s overall accuracy or for a particular decision, e.g.,
cations in credit lending [281], talent sourcing [275], parole [295], SHAP [205], QII [70]. Model simplification finds an interpretable
and medical treatment [93]. The nascent Fairness, Accountability, model that imitates the opaque model closely. Dependency plots
Transparency, and Ethics (FATE) in machine learning community are a popular kind of visual explanation, e.g., Partial Dependence
has emerged as a multi-disciplinary group of researchers and indus- Plots [106], Accumulated Local Effects Plot [16], Individual Con-
try practitioners interested in developing techniques to detect bias ditional Expectation [118]. They plot the change in the model’s
in machine learning models, develop algorithms to counteract that prediction as one or multiple features are changed. Local expla-
bias, generate human-comprehensible explanations for the machine nations differ from other methods because they only explain a
decisions, hold organizations responsible for unfair decisions, etc. single prediction. Local explanations can be further categorized
Human-understandable explanations for machine-produced deci- into approximation and example-based approaches. Approximation
sions are advantageous in several ways. For example, focusing on a approaches sample new datapoints in the vicinity of the datapoint
use case of applicants applying for loans, the benefits would include: whose prediction from the model needs to be explained (hereafter
1
,, Sahil Verma, Varich Boonsanong, Minh Hoang, Keegan E. Hines, John P. Dickerson, and Chirag Shah

called the explainee datapoint), and then fit a linear model (e.g., laws make it illegal to use sensitive features as the basis of any deci-
LIME [261]) or extracts a rule set from them (e.g., Anchors [262]). sion (see Appendix C). Biased decisions can also attract widespread
Example-based approaches seek to find datapoints in the vicinity criticism and are therefore crucial to avoid [123, 177]. Fairness has
of the explainee datapoint. They either offer explanations in the been captured in several notions based on a demographic grouping
form of datapoints that have the same prediction as the explainee or individual capacity. Verma and Rubin [317] have enumerated
datapoint or the datapoints whose prediction differs from the ex- and intuitively explained many fairness definitions using a uni-
plainee datapoint. Note that the latter kind of datapoints are still fying dataset. Dunkelau and Leuschel [88] provide an extensive
close to the explainee datapoint and are termed as “counterfactual overview of the major categorization of research efforts in ensuring
explanations” (CFE). fair machine learning and enlists important works in all categories.
Explainable machine learning has also seen interest from other
Recall the use case of applicants applying for a loan. For an individ-
communities, specifically healthcare [300], having huge social im-
ual whose loan request has been denied, counterfactual explana-
plications. Several works have summarized and reviewed other
tions provide them with actionable feedback that could help them
research in explainable machine learning [3, 51, 127].
make changes to their features in order to transition to the desirable
side of the decision boundary, i.e., get the loan. This feedback is
termed as an algorithmic recourse. Unlike several other explainabil- 2.2 Explainability in Machine Learning
ity techniques, CFEs (or recourses) do not explicitly answer the
This section gives some concrete examples that emphasize the im-
“why” the model made a prediction; instead, they provide sugges-
portance of explainability and give further details of the research
tions to achieve the desired outcome. CFEs are also applicable to
in this area. In a real-world example, the US military trained a clas-
black-box models (when only the predict function of the model is
sifier to distinguish enemy tanks from friendly tanks. Although
accessible), and therefore place no restrictions on model complexity
the classifier performed well on the training and test dataset, its
and do not require model disclosure. They also do not necessarily
performance was abysmal on the battlefield. Later, it was found
approximate the underlying model, producing accurate feedback.
that the photos of friendly tanks were taken on sunny days, while
Owing to their intuitive nature, CFEs are also amenable to legal
for enemy tanks, photos clicked only on overcast days were avail-
frameworks (see appendix C).
able [127]. The classifier found it much easier to use the difference
In this work, we collect, review and categorize more than 350 re-
between the background as the distinguishing feature. In a simi-
cent papers that propose algorithms to generate counterfactual
lar case, a husky was classified as a wolf because of the presence
explanations for machine learning models. Many of these methods
of snow in the background, which the classifier had learned as a
have focused on datasets that are either tabular or image-based.
feature associated with wolves [261]. The use of an explainability
We describe our methodology for collecting papers for this survey
technique helped discover these issues.
in appendix B. We describe recent research themes in this field and
The explainability problem can be divided into model explanation
categorize the collected papers among a fixed set of desiderata for
and outcome explanation problems [127].
effective counterfactual explanations (see table 1).
Model explanation searches for an interpretable and transparent
The contributions of this review paper are:
global explanation of the original model. Various papers have de-
(1) We examine a set of more than 350 recent papers on the same set veloped techniques to explain neural networks and tree ensem-
of parameters to allow for an easy comparison of the techniques bles using single decision tree [65, 83, 184] and rule sets [14, 76].
these papers propose and the assumptions they work under. Some approaches are model-agnostic, such as Golden Eye and
(2) The categorization of the papers achieved by this evaluation PALM [139, 185, 357].
helps a researcher or a developer choose the most appropriate Outcome explanation needs to provide an explanation for a specific
algorithm given the set of assumptions they have and the speed prediction from the model. This explanation need not be a global ex-
and quality of the generation they want to achieve. planation or explain the internal logic of the model. Model-specific
(3) Comprehensive and lucid introduction for beginners in the area approaches for deep neural networks (CAM, Grad-CAM [274, 355]),
of counterfactual explanations for machine learning. and model agnostic approaches (LIME, MES [261, 307]) have been
proposed. These are either feature attribution or model simplifi-
2 BACKGROUND cation methods. Example-based approaches are another kind of
This section gives the background about the social implications of explainability technique used to explain a particular outcome. This
machine learning, explainability research in machine learning, and work focuses on counterfactual explanations (CFEs), which is an
some prior studies about counterfactual explanations. example-based approach.
By definition, CFEs are applicable to supervised machine learning
2.1 Social Implications of Machine Learning setups where the desired prediction has not been obtained for a
Establishing fairness and making an automated tool’s decision ex- datapoint. The majority of research in this area has applied CFEs to
plainable are two broad ways in which we can ensure equitable classification settings, which consists of several labeled datapoints
social implications of machine learning. Fairness research aims at that are given as input to the model, and the goal is to learn a
developing algorithms that can ensure that the decisions produced function mapping from the input datapoints (with, say, m features)
by the system are not biased against a particular demographic group to labels. In classification, the labels are discrete values. X𝑚 is used
of individuals, which are defined with respect to sensitive or pro- to denote the input space of the features, and Y is used to denote
tected features, such as race, sex, and religion. Anti-discrimination the output space of the labels. The learned function is the mapping
2
Counterfactual Explanations and Algorithmic Recourses for Machine Learning: A Review ,,

𝑓 : X𝑚 → Y, which is used to predict labels for unseen datapoints

in the future.

2.3 History of Counterfactual Explanations

Counterfactual explanations have a long history in other fields like
philosophy, psychology, and the social sciences. Philosophers like
David Lewis published articles on the ideas of counterfactuals back
in 1973 [196]. Woodward [339] said that a satisfactory explanation
must follow patterns of counterfactual dependence. Psychologists
have demonstrated that counterfactuals elicit causal reasoning in
humans [45, 46, 163]. Philosophers have also validated the concept Figure 1: Two possible paths for a datapoint (shown in blue),
of causal thinking due to counterfactuals [30, 339]. originally classified in the negative class, to cross the de-
Studies have compared the likeability of CFEs with other explana- cision boundary. The endpoints of both the paths (shown
tion approaches. Binns et al. [33] and Dodge et al. [81] performed in red and green) are valid counterfactuals for the original
user studies that showed that users prefer CFEs over case-based point. Note that the red path is the shortest, whereas the
reasoning, which is another example-based approach. The work green path adheres closely to the manifold of the training
by Fernández-Loría et al. [98] provides three interesting examples data, but is longer.
where the feature importance explanation methods fail to capture
the underlying model, whereas CFEs do. Asher et al. [23] argue
that the partiality and locality of CFEs make them epistemically
should quantify a relatively small change, which will lead to the de-
accessible and an adequate form of explanations.
sired alternative outcome. Alice might need to increase her income
by $10K to get approved for a loan, and even though an increase
3 COUNTERFACTUAL EXPLANATIONS of $50K would do the job, it is most pragmatic for her if she can
This section illustrates counterfactual explanations by giving an make the smallest possible change. Additionally, Alice might care
example and then outlines the major aspects of the problem. about a simpler explanation – it is easier for her to focus on chang-
ing a few things (such as only Income) instead of trying to change
3.1 An Example many features. Alice certainly also cares that the counterfactual
Suppose Alice walks into a bank and seeks a home mortgage loan. she receives is giving her advice, which is realistic and actionable.
The decision is impacted in large part by a machine learning clas- It would be of little use if the recommendation were to decrease
sifier that considers Alice’s feature vector of {Income, CreditScore, her age by ten years.
Education, Age}. Unfortunately, Alice is denied the loan she seeks These desiderata, among others, have set the stage for recent devel-
and is left wondering (1) why the loan was denied? and (2) what can opments in the field of counterfactual explainability. As we describe
she do differently so that the loan will be approved in the future? in this section, major themes of research have sought to incorpo-
The former question might be answered with explanations like: rate increasingly complex constraints on counterfactuals, all in the
“CreditScore was too low”, and is similar to the majority of tradi- spirit of ensuring the resulting explanation is truly actionable and
tional explainability methods. The latter question forms the basis helpful. Development in this field has focused on addressing these
of a counterfactual explanation: what small changes could be made desiderata in a way that is generalizable across algorithms and is
to Alice’s feature vector in order to end up on the other side of the computationally efficient.
classifier’s decision boundary? Let us suppose the bank provides (1) Validity: Wachter et al. [324] first proposed counterfactual ex-
Alice with exactly this advice (through a CFE) of what she might planations in 2017. They posed CFE as an optimization prob-
change in order to be approved next time. A possible counterfactual lem. Equation (1) states the optimization objective, which is
recommended by the system might be to increase her Income by to minimize the distance between the counterfactual (𝑥 ′ ) and
$10K or get a new master’s degree or a combination of both. The the original datapoint (𝑥) subject to the constraint that the out-
answer to the former question does not tell Alice what action to put of the classifier on the counterfactual is the desired label
take, while the CFE explicitly helps her. Figure 1 illustrates how (𝑦 ′ ∈ Y). Converting the objective into a differentiable, un-
the datapoint representing an individual, which originally got clas- constrained form yields two terms (see Equation (2)). The first
sified in the negative class, can take two paths to cross the decision term encourages the output of the classifier on the counter-
boundary into the positive class region. factual to be close to the desired class, and the second term
The assumption in a CFE is that the underlying classifier would not forces the counterfactual to be close to the original datapoint.
change when the applicant applies in the future. And if the assump- A metric 𝑑 is used to measure the distance between two data-
tion holds, the counterfactual guarantees the desired outcome in points 𝑥, 𝑥 ′ ∈ X, which can be the L1/L2 distance, or quadratic
the future time. distance, or distance functions which take as input the CDF
of the features [310], or pairwise feature costs as perceived by
3.2 Desiderata and Major Themes of Research users [258]. Thus, this original definition already emphasized
The previous example alludes to many desirable properties of an that an effective counterfactual must be small change relative
effective counterfactual explanation. For Alice, the counterfactual to the starting point.
3
,, Sahil Verma, Varich Boonsanong, Minh Hoang, Keegan E. Hines, John P. Dickerson, and Chirag Shah

(5) Causality: Features in a dataset are rarely independent, there-

arg min
′
𝑑 (𝑥, 𝑥 ′ ) subject to 𝑓 (𝑥 ′ ) = 𝑦 ′ (1) fore, changing one feature in the real world affects other fea-
𝑥 tures. For example, getting a new educational degree necessi-
tates increasing the individual’s age by at least some amount.
arg min
′
max 𝜆(𝑓 (𝑥 ′ ) − 𝑦 ′ ) 2 + 𝑑 (𝑥, 𝑥 ′ ) (2) In order to be realistic and actionable, a counterfactual should
𝑥 𝜆
maintain any known causal relations between features. Gen-
A counterfactual that indeed is classified in the desired class is
erally, our loss function now accounts for (1) counterfactual
a valid counterfactual. As illustrated in fig. 1, the points shown
validity, (2) sparsity in feature vector (and actionability of fea-
in red and green are valid counterfactuals, as they are in the
tures); (3) similarity to the training data; and (4) causal relations.
positive class region. The distance to the red counterfactual is
smaller than the distance to the green counterfactual. The following research themes are not added as terms in the opti-
(2) Actionability: An important consideration while making a rec- mization objective; they are properties of the algorithm generating
ommendation is about which features are mutable (e.g., income, the CFEs.
age) and which are not (e.g., race, country of origin). A rec- (6) Amortized inference: Generating a counterfactual is expensive,
ommended counterfactual should never change the immutable which involves solving an optimization process for each data-
features. In fact, if a change to a legally sensitive feature pro- point. Mahajan et al. [210] proposed generative technique for
duces a change in prediction, it shows inherent bias in the “amortized inference” of CFEs. Learning to predict a CFE allows
model. Several papers have also mentioned that an applicant the algorithm to quickly compute a counterfactual (or several)
might have a preference order amongst the mutable features for any new input 𝑥, without requiring to solve an optimization
(which can also be hidden.) The optimization problem is modi- problem. Verma et al. [316] proposed another approach that
fied to take this into account. We might call the set of actionable uses RL to generate amortized CFEs.
features A, and update our loss function to be, (7) Black-box access: If a CFE generating approach can work with
the black-box access to an ML model, i.e., with only accessing
arg min
′
max 𝜆(𝑓 (𝑥 ′ ) − 𝑦 ′ ) 2 + 𝑑 (𝑥, 𝑥 ′ ) (3) its ‘predict’ function, it can then be used in settings where the
𝑥 ∈A 𝜆
access to the ML model cannot be given due to proprietary or
(3) Sparsity: There can be a trade-off between the number of fea-
legal reasons. Dandl et al. [67] propose a genetic algorithm and
tures changed and the total amount of change made to obtain
Verma et al. [316] propose a RL-based algorithm to this end.
the counterfactual. A counterfactual ideally should change a
(8) Model Agnosticity: A closely linked concept is model agnosticity.
smaller number of features in order to be the most effective. It
An approach that is model agnostic can work with different
has been argued that people find it easier to understand shorter
kinds of ML models and hence is more desirable than a model-
explanations [218, 227], making sparsity an important consider-
specific approach. An approach that requires black-box access
ation. We update our loss function to include a penalty function
to the model is model-agnostic by definition.
that encourages sparsity in the difference between the modified
and the original datapoint, 𝑔(𝑥 ′ − 𝑥), e.g., L0/L1 norm.
3.3 Relationship to other related terms
arg min
′
max 𝜆(𝑓 (𝑥 ′ ) − 𝑦 ′ ) 2 + 𝑑 (𝑥, 𝑥 ′ ) + 𝑔(𝑥 ′ − 𝑥) (4) Out of the papers collected, different terminology often captures
𝑥 ∈A 𝜆
the basic idea of counterfactual explanations, although subtle differ-
(4) Data Manifold closeness: It would be hard to trust a counterfac- ences exist between the terms. Several terms worth noting include:
tual if it resulted in a combination of features that were utterly
• Algorithmic Recourse: Ustun et al. [310] point out that counterfac-
unlike any observations the classifier has seen before. In this
tuals do not take into account the actionability of the prescribed
sense, the counterfactual would be “unrealistic", not easy to re-
changes, which recourse does. Works taking a causal view of
alize, and anomalous to the training datapoints [40]. Therefore,
the problem further fortify this claim [168, 169]. Recent papers
a generated counterfactual should be realistic in the sense that
in counterfactual generation take actionability and feasibility
it is near the training data and adheres to observed correlations
of the prescribed changes, and therefore the difference with re-
among the features. Many papers have proposed various ways
course has blurred. In this work, we use the term counterfactual
of quantifying this. We might update our loss function to in-
explanation, its abbreviation CFE, and recourse interchangeably.
clude a penalty for adhering to the data manifold defined by
• Inverse classification: Inverse classification aims to perturb an
the training set X, denoted by 𝑙 (𝑥 ′ ; X)
input in a meaningful way in order to classify it into its desired
arg min
′
max 𝜆(𝑓 (𝑥 ′ ) − 𝑦 ′ ) 2 + 𝑑 (𝑥, 𝑥 ′ ) + 𝑔(𝑥 ′ − 𝑥) + 𝑙 (𝑥 ′ ; X) (5) class [4, 189]. Such an approach prescribes the actions to be
𝑥 ∈A 𝜆
taken in order to get the desired classification. Therefore inverse
In fig. 1, the region between the dashed lines shows the data classification has the same goals as CFEs.
manifold. There are two possible paths to cross the decision • Contrastive explanation: Contrastive explanations generate expla-
boundary for the blue datapoint. The shorter, red path takes it nations of the form “an input x is classified as y because features
to a counterfactual that is outside the data manifold, whereas 𝑓1, 𝑓2, . . . , 𝑓𝑘 are present and 𝑓𝑛 , . . . , 𝑓𝑟 are absent”. The features
a bit longer, the green path takes it to a counterfactual that that are minimally sufficient for a classification are called perti-
follows the data manifold. Adding the data manifold loss term nent positives, and the features whose absence is necessary for
encourages the algorithm to choose the green path over the red the final classification are termed pertinent negatives. To gener-
path, even if it is slightly longer. ate both pertinent positives and pertinent negatives, one needs
4
Counterfactual Explanations and Algorithmic Recourses for Machine Learning: A Review ,,

to solve the optimization problem to find the minimum pertur- [324], but this is restricted to differentiable models only. Black-
bations needed to maintain the same class label or change it, box approaches use gradient-free optimization algorithms such
respectively. Therefore contrastive explanations (specifically per- as Nelder-Mead [124], growing spheres [191], FISTA [79, 311]
tinent negatives) are related to CFEs. ASP [32], or genetic algorithms [67, 189, 278] to solve the op-
• Adversarial learning: Adversarial learning is closely related, but timization problem. Finally, some approaches do not cast the
the terms are not interchangeable. Adversarial learning aims to goal into an optimization problem and solve it using heuris-
generate the least amount of change in a given input to classify tics [126, 173, 254, 334]. Poyiadzi et al. [247] propose FACE,
it differently, often with the goal of far-exceeding the decision which uses Dijkstra’s algorithm [80] to find the shortest path
boundary and resulting in a highly-confident misclassification. between existing training datapoints to find counterfactual for
While the optimization problem is similar to the one posed in a a given input. Hence, this method does not generate new data-
counterfactual generation, the desiderata are different. For exam- points. Fraunhofer IOSB et al. [104] and Blanchart [35] divide
ple, in adversarial learning (often applied to images), the goal is the feature space into ‘pure’ regions where all datapoints (by
an imperceptible change in the input image. This is often at odds sampling) belong to one class and then use graph traversing
with the CFE’s goal of sparsity and parsimony (though single- techniques to find the closest CFEs.
pixel attacks are an exception). Further, notions of data manifold Distinct from the three levels of model access, there exist ap-
and actionability/causality are rarely considerations in adversar- proaches that propose new training routines. Ross et al. [265]
ial learning. A few works point to the similarity and synergy propose adding adversarial loss during training of the ML model
between the two domains: Pawelczyk et al. [239] explore the con- to have a higher probability of having a recourse for the training
nection between the optimization objectives and results of the ad- datapoints. (After training, any CFE generating method can be
versarial and CFE generating techniques. Freiesleben [105] state used.) Guo et al. [130] propose CounterNet, a novel architecture
that the differences in the desired class label and distance from the that predicts the class and generates the CFE of a datapoint
original datapoint distinguish CFEs from adversarial examples. when trained from scratch. [277] train a sum-product network
Elliott et al. [91] propose generating semantically meaningful ad- that acts as both a classifier and density estimator and uses that
versarial perturbations to generate CFEs for images. Browne and to generate CFEs.
Swift [41] point out that the constraint of producing plausible (2) Model agnostic: This column describes the domain of mod-
datapoints distinguishes CFEs from adversarial examples. els a given algorithm can operate on. For example, gradient-
based algorithms can only handle differentiable models, and
4 ASSESSMENT OF THE APPROACHES ON the algorithms based on solvers require linear or piece-wise
linear models [164, 167, 168, 267, 310], some algorithms are
COUNTERFACTUAL PROPERTIES model-specific and only work for those models like tree ensem-
For easy comprehension and comparison, we identify several prop- bles [97, 164, 203, 302]. Black-box methods have no restriction
erties that are important for a counterfactual generation algorithm. on the underlying model and are, therefore, model-agnostic.
For all the collected papers which propose an algorithm to generate (3) Optimization amortization: Among the collected papers, the pro-
counterfactual explanations, we assess the algorithm they propose posed algorithm mostly returned a single counterfactual for a
against these properties. The results are presented in table 1. For given input datapoint. Therefore these algorithms require solv-
papers that do not propose new algorithms and discuss related ing the optimization problem for each counterfactual that was
aspects of counterfactual explanations or modifications to previous generated, that too, for every input datapoint. A smaller number
methods are mentioned in section 5.3. The methodology we used of the methods are able to generate multiple counterfactuals
to collect the papers is given in appendix B. (generally diverse by some metric of diversity) for a single input
datapoint; therefore, they require to be run once per input to get
4.1 Properties of counterfactual algorithms several counterfactuals [48, 67, 97, 126, 167, 210, 224, 267, 278].
Mahajan et al. [210]’s approach learns the mapping of dat-
This section expounds on the key properties of a counterfactual
apoints to counterfactuals using a variational auto-encoder
explanation generation algorithm. The properties form the columns
(VAE) [82]. Therefore, once the VAE is trained, it can gener-
of table 1.
ate multiple counterfactuals for all input datapoints, without
(1) Model access: The counterfactual generation algorithms require solving the optimization problem separately and is thus very
different levels of access to the underlying model for which fast. Verma et al. [316] and Samoilescu et al. [270] train a re-
they generate counterfactuals. We identify three distinct ac- inforcement learning model to learn the actions that need to
cess levels – access to complete model internals, access to be taken to generate CFEs for a data distribution. Hence, these
gradients, and access to only the prediction function (black- approaches are also amortized. [344] trains a CGAN to synthe-
box). Access to the complete model internals is required when size CFEs with umbrella sampling; hence, their approach is also
the algorithm uses a solver-based method like, mixed integer amortized. Van Looveren et al. [312] also train a GAN-based
programming [164, 167, 168, 267, 310] or if they operate on model that is amortized. Schleich et al. [272] partially evaluate
decision trees [48, 97, 203, 221, 302] which requires access to (amortize) the classifier for the static features, hence speeding
all internal nodes of the tree. A majority of the methods use up the CFE generation. We report two aspects of optimization
a gradient-based algorithm to solve the optimization objec- amortization in the table.
tive, modifying the loss function proposed by Wachter et al.
5
,, Sahil Verma, Varich Boonsanong, Minh Hoang, Keegan E. Hines, John P. Dickerson, and Chirag Shah

• Amortized Inference: This column is marked Yes if the algo- immutable, mutable, and actionable types. Actionable features
rithm can generate counterfactuals for multiple input data- are a subset of mutable features. They point out that certain fea-
points without optimizing separately for them; otherwise, it tures are mutable but not directly actionable by the individual,
is marked No. e.g., CreditScore cannot be directly changed; it changes as an
• Multiple counterfactuals (CF): This column is marked Yes if effect of changes in other features like income, credit amount.
the algorithm can generate multiple counterfactuals for a Mahajan et al. [210] uses an oracle to learn the user preferences
single input datapoint; otherwise, it is marked No. for changing features (among mutable features) and can also
(4) Counterfactual (CF) attributes: These columns evaluate algo- learn hidden preferences.
rithms on sparsity, data manifold adherence, and causality. Most tabular datasets have both continuous and categorical
Among the collected papers, methods using solvers explic- features. Performing arithmetic over continuous features is
itly constrain sparsity [167, 310], black-box methods constrain natural, but handling categorical variables in gradient-based
L0 norm of counterfactual and the input datapoint [67, 191]. algorithms can be complicated. Some algorithms cannot handle
Gradient-based methods typically use the L1 norm of counter- categorical variables and filter them out [191, 203]. Wachter et al.
factual and the input datapoint. Some of the methods change [324] proposed clamping all categorical features to each of their
only a fixed number of features [173, 334], change features iter- values, thus spawning many processes (one for each value of
atively [160, 193, 273, 316], or flip the minimum possible split each categorical feature), leading to scalability issues. Some ap-
nodes in the decision tree [126] to induce sparsity. Some meth- proaches convert categorical features to one-hot encoding and
ods also induce sparsity post-hoc [191, 224]. This is done by then treat them as numerical features. In this case, maintaining
sorting the features in ascending order of relative change and one-hotness can be challenging. Some use a different distance
greedily restoring their values to match the values in the input function for categorical features, which is generally an indicator
datapoint until the prediction for the CFE is still different from function (1 if a different value, else 0). [109] use Markov chain
the input datapoint. transitions to encode categorical distances. Yang et al. [344]
Adherence to the data manifold has been addressed using sev- use Gaussian mixture models to normalize the continuous fea-
eral different approaches, like training VAEs on the data dis- tures and Gumbel-Softmax to relax categorical features into
tribution [78, 159, 210, 311], constraining the distance of a continuous ones. Genetic algorithms, evolutionary algorithms,
counterfactual from the 𝑘 nearest training datapoints [67, 89, and SMT solvers can naturally handle categorical features. We
164], directly sampling points from the latent space of a VAE report these properties in the table.
trained on the data, and then passing the points through the • Feature preference: This column is marked Yes if the algorithm
decoder [243], using an ensemble of model to capture the pre- considers feature actionability, otherwise marked No.
dictive entropy [273], using an Kernel Density Estimator (KDE) • Categorical distance function: This column is marked - if
to estimate PDF of underlying data manifold [109], using cycle the algorithm does not use a separate distance function for
consistency loss in GAN [312], mapping back to the data do- categorical variables, else it specifies the distance function.
main [193], using a combination of existing datapoints [173],
using Gaussian Mixture Models to approximate the probability 5 EVALUATION OF COUNTERFACTUAL
of in-distributionness [19], or by using feature correlations [20],
or by simply not generating any new datapoint [247].
GENERATION ALGORITHMS
The relation between different features is represented by a This section lists the common datasets used to evaluate counter-
directed graph between them, which is termed as a causal factual generation algorithms and the metrics on which they are
graph [244]. Out of the papers that have addressed this concern, typically evaluated and compared.
most require access to the complete causal graph [168, 169]
(which is rarely available in the real world), while Duong et al. 5.1 Commonly used datasets for evaluation
[89], Mahajan et al. [210], Verma et al. [316], Yang et al. [344] The datasets used in the evaluation in the papers we review can
can work with partial causal graphs. be categorized into tabular and image datasets. Not all methods
These three properties are reported in the table. support image datasets. Some of the papers also used synthetic
• Sparsity: This column is marked No if the algorithm does not datasets for evaluating their algorithms, but we skip those in this
consider sparsity, else it specifies the sparsity constraint. review since they were generated for a specific paper and also might
• Data manifold: This column is marked Yes if the algorithm not be available. Common datasets in the literature include:
forces the generated counterfactuals to be close to the data
• Image: MNIST [194], EMNIST [60], CelebA [200], CheXpert [152],
manifold by some mechanism; otherwise, it is marked No.
ImageNet [77], ISIC Skin Lesion [59], ADNI [225], ChestX-ray8 [326].
• Causal relation: This column is marked Yes if the algorithm
considers the causal relations between features when gener-
1 It considers global and local feature importance, not preference.
ating counterfactuals; otherwise, it is marked No. 2 All features are converted to polytope type.
(5) Counterfactual (CF) optimization (opt.) problem attributes: These 3 Does not generate new datapoints
are a few attributes of the optimization problem. 4 The distance is calculated in latent space.
5 It considers feature importance not user preference.
Out of the papers that consider feature actionability, most clas-
sify the features into immutable and mutable types. Karimi 6 Maybe partially as it uses cycle consistency loss
et al. [168] and Lash et al. [189] categorize the features into
6
Counterfactual Explanations and Algorithmic Recourses for Machine Learning: A Review ,,

Table 1: Assessment of the collected papers on the key properties, which are important for readily comparing and comprehend-
ing the differences and limitations of different counterfactual algorithms. Papers are sorted chronologically. Details about the
full table is given in appendix A.
Assumptions Optimization amortization CF attributes CF opt. problem attributes

Amortized Multiple Data Feature Categorical dist.

Year Paper Model access Model domain Sparsity Causal relation
Inference CFEs manifold preference func



 [189] Black-box Agnostic No No Iteratively No No Yes -



2017 [324] Gradients Differentiable No No L1 No No No -



 [302]
 Complete Tree ensemble No No No No No No -



 [191] Black-box Agnostic No No L0 and post-hoc No No No -




 Flips min. split
 [126] Black-box Agnostic No Yes No No No Indicator


 nodes
2018


 [78]
 Gradients Differentiable No No L1 Yes No No -




 [124]
 Black-box Agnostic No No No No No No1 -




 [267] Complete Linear No Yes L1 No No No N.A.2



 [310]

 Complete Linear No No Hard constraint No No Yes -





 [278] Black-box Agnostic No Yes No No No Yes Indicator




 Black-box or

 [79] Differentiable No No L1 Yes No No -



 gradient
2019

 [254] Black-box Agnostic No No No No No No -





 [159] Gradients Differentiable No No No Yes No No -



 [250]

 Gradients Differentiable No No No No No No -





 [334, Changes one fea-

 Black-box Agnostic No No No No No -
 335] ture





 [224] Gradients Differentiable No Yes L1 and post-hoc No No No Indicator





 [247] Black-box Agnostic No No No Yes3 No No -




 Black-box or
 [311]
 Differentiable No No L1 Yes No No Embedding


 gradient



 [210]

 Gradients Differentiable Yes Yes No Yes Yes Yes -





 [167] Complete Linear No Yes Hard constraint No No Yes Indicator


N.A.4



 [243] Gradients Differentiable No No No Yes No Yes



 [173]

 Black-box Agnostic No No Yes Yes No No -




 Linear and causal
2020 [168] Complete No No L1 No Yes Yes -

 graph




 [169]
 Gradients Differentiable No No No No Yes Yes -






 [193] Gradients Differentiable No No Iteratively Yes No No5 -



 [67] Black-box Agnostic No Yes L0 Yes No Yes Indicator






 Linear and tree en-

 [164] Complete No No No Yes No Yes -


 semble





 [97] Complete Random Forest No Yes L1 No No No -




 [202,

 Complete Tree ensemble No No L1 No No No -
 203]



• Tabular: Adult income, German credit, Student Performance, Mac [206], UK unsecured personal loans [43], insurance dataset
Breast cancer, Default of credit, Shopping, Iris, Wine, Spam- [179], BPIC2017 [145].
bee, Covertype, ICU [87], LendingClub [294], Give Me Some
Credit [162], COMPAS [155], LSAT [36], Pima diabetes [283], HE-
LOC/FICO [100], Fannie Mae [208], Portuguese Bank [223], San- 5.2 Metrics for evaluation of counterfactual
giovese [209], Bail dataset [158], Simple-BN [210], AllState [150], generation algorithms
WiDS Datathon [149], Home Credit Default Risk [125], German Most of the counterfactual generation algorithms are evaluated
Housing [102], HospitalTriage [142], MIMIC-IV [157], Freddie on the desirable properties of counterfactuals. Counterfactuals are
7
,, Sahil Verma, Varich Boonsanong, Minh Hoang, Keegan E. Hines, John P. Dickerson, and Chirag Shah

Table 2: Continued from Table 1

Assumptions Optimization amortization CF attributes CF opt. problem attributes

Amortized Multiple Data Feature Categorical dist.

Year Paper Model access Model domain Sparsity Causal relation
Inference CFEs manifold preference func

No6



 [312] Gradient Differentiable Yes No L1 No No -




 [48,

 Complete Decision Tree No Yes L1 No No Yes -


 134]





 [166] Complete Linear No Yes Iteratively No Yes No -





 [273] Gradients Differentiable No No Iteratively Yes No Yes -





 [227] Black-box Agnostic No Yes Gower No Yes Yes Gower






 [42] Black-box Agnostic No No Yes Yes No No Indicator



 [89] Black-box Agnostic No No No No Yes No Latent space





 [228] Complete Linear No Yes Hard constraint Yes No Yes -







 [20] Complete Linear No No No Yes No No -




 Black-box or Agnostic if black-

 [272] No Yes L0/L1 No Yes Yes Indicator



 complete box
2021
 Black-box or Agnostic if black-

 [230] Yes No L1 Yes No Yes -


 gradient box





 [35] Complete Tree ensemble Yes No Yes No No Yes -





 [270] Black-box Agnostic Yes Yes L0/L1 Yes No Yes Indicator






 [316] Black-box Agnostic Yes Yes Iteratively Yes Yes Yes -



 [238] Complete Tree ensemble No No L0/L1 Yes No Yes Gower





 [221] Complete Linear No Yes Hard constraint No No Yes Indicator







 [104] Black-box Agnostic Yes Yes No No No No -





 [344] Black-box Agnostic Yes Yes No Yes Yes No Not sure





 [160] Gradient Differentiable No No No No No No -





 [109] Black-box Agnostic No No L1 Yes No No Markov Chains




 [259] Black-box Agnostic Partially Yes Hard constraint No No Yes Gower

 Training

 [130] Differentiable Yes No No No No No -


 from scratch






 [340] Gradient Differentiable No No No Yes Yes No -


2022 [343] Black-box Agnostic No Might Yes No No Yes -






 [258] Black-box Agnostic Yes Might Yes No No Yes Indicator





 [277] Training
Differentiable No No No Yes No Yes -
 from scratch

considered actionable feedback to individuals who have received features, some papers standardize them in pre-processing or
undesirable outcomes from automated decision-makers, and there- divide L1 norm by median absolute deviation of respective
fore, a user study can be considered a gold standard. The ease of features [224, 267, 324], or divide L1 norm by the range of the
acting on a recommended counterfactual is thus measured by using respective features [67, 167, 168]. Some papers term proximity
quantifiable proxies: as the average distance of the generated counterfactuals from
the input. Lower values of average distance are preferable.
(1) Validity: Validity measures the ratio of the counterfactuals that (3) Sparsity: Shorter explanations are more comprehensible to hu-
actually have the desired class label to the total number of mans [218], therefore, counterfactuals ideally should prescribe
counterfactuals generated. Higher validity is preferable. Most a change in a small number of features. Although a consensus
papers report it. on a hard cap on the number of modified features has not been
(2) Proximity: Proximity measures the distance of a counterfactual reached, Keane and Smyth [173] cap a sparse counterfactual to
from the input datapoint. For counterfactuals to be easy to act at most two feature changes.
upon, they should be close to the input datapoint. Distance (4) Counterfactual generation time: Intuitively, this measures the
metrics like the L1 norm, L2 norm, Mahalanobis distance are time required to generate counterfactuals. This metric can be
common. To handle the variability of range among different averaged over the generation of a counterfactual for a batch of
8
Counterfactual Explanations and Algorithmic Recourses for Machine Learning: A Review ,,

input datapoints or for the generation of multiple counterfactu- (9) Label Variation Score and Oracle Score: Hvilshøj et al. [147]
als for a single input datapoint. point out that the previous metrics are unable to detect out-of-
(5) Diversity: Some algorithms support the generation of multiple distribution CFEs (especially for high dimensional datasets) and
counterfactuals for a single input datapoint. The purpose of propose two new metrics. Label Variation Score applies when
providing multiple counterfactuals is to increase the ease for each datapoint has multiple labels, and the intuition is that CFE
applicants to reach at least one counterfactual state. Therefore, for a particular label should not affect the predictions for other
the recommended counterfactuals should be diverse, allowing labels (unless they are highly correlated).
applicants to choose the easiest one. If an algorithm is strongly ∑︁
enforcing sparsity, there could be many different sparse subsets 𝐿𝑉 𝑆 = 𝑑𝑑𝑖𝑣 [𝑝𝑙 (𝑥), 𝑝𝑙 (𝐶𝐹 𝐸 (𝑥))] (8)
of the features that could be changed. Therefore, having a di- 𝑙 ∈𝐿
verse set of counterfactuals is useful. Diversity is encouraged by where L is the total number of labels for a datapoint and 𝑝𝑙 is the
maximizing the distance between the multiple counterfactuals predicted probability for the specific label 𝑙, and 𝑑𝑑𝑖𝑣 measures
by adding it as a term in the optimization objective [67, 224] the divergence between the predicted probability of label 𝑙 for
or as a hard constraint [167, 221, 310], or by minimizing the the original datapoint 𝑥 and its CFE.
mutual information between all pairs of modified features [193]. Oracle Score is similar to validity, however, with an additional
Mothilal et al. [224] reported diversity as the feature-wise dis- classifier trained on the same dataset as the original classifier.
tance between each pair of counterfactuals. A higher value of The intuition is that if a CFE is more like an adversarial example
diversity is preferable. for a classifier, the CFE would not be classified in the desired
(6) Closeness to the training data: Recent papers have considered class by the other classifier, and hence we use the prediction
the actionability and realisticness of the modified features by from the additional classifier as the ground truth validity.
grounding them in the training data distribution. This has been
captured by measuring the average distance to the k-nearest Some of the reviewed papers did not evaluate their algorithm on
datapoints [67], or measuring the local outlier factor [164], or any of the above metrics. They only showed a couple of example
measuring the reconstruction error from a VAE trained on the inputs and respective CFEs, details about which are available in the
training data [210, 311], or measuring the PDF of such dat- full table (see appendix A).
apoints using KDE [109], or measuring the maximum mean
discrepancy (MMD) between the original and counterfactual 5.3 Other works
points [312]. A lower value of the distance and reconstruction This section enlists works that talk about the desirable properties
error is preferable. of counterfactuals or point to their issues. We also talk about works
(7) Causal constraint satisfaction (feasibility): This metric captures that propose minor modifications to previous similar approaches.
how realistic the modifications in the counterfactual are by
measuring if they satisfy the causal relation between features. Works exploring desirable CFE properties: Sokol and Flach
Mahajan et al. [210] evaluated their algorithm on this metric. [286] list several desirable properties of counterfactuals inspired
(8) IM1 and IM2: Van Looveren and Klaise [311] proposed two from Miller [218] and state how the method of flipping logical
interpretability metrics specifically for algorithms that use auto- conditions in a decision tree satisfies most of them. Laugel et al.
encoders. Let the counterfactual class be 𝑡, and the original [190] enlist proximity, connectedness, and stability as three desirable
class be 𝑜. 𝐴𝐸𝑡 is the auto-encoder trained on training instances properties of a CFE and propose the metrics to measure them.
of class 𝑡, and 𝐴𝐸𝑜 is the auto-encoder trained on training in- Works pointing to issues with CFEs: Laugel et al. [192] says
stances of class 𝑜. Let 𝐴𝐸 be the auto-encoder trained on the that if the explanation is not based on training data, but the ar-
full training dataset (all classes). tifacts of non-robustness of the classifier, it is unjustified. They
define justified explanations to be connected to training data by a
∥𝑥𝑐 𝑓 − 𝐴𝐸𝑡 (𝑥𝑐 𝑓 )∥ 22 continuous set of datapoints, termed E-chainability. Barocas et al.
𝐼 𝑀1 = (6)
∥𝑥𝑐 𝑓 − 𝐴𝐸𝑜 (𝑥𝑐 𝑓 )∥ 22 + 𝜖 [28] state five reasons that have led to the success of counterfac-
tual explanations and also point out the overlooked assumptions.
They mention the unavoidable conflicts which arise due to the
∥𝐴𝐸𝑡 (𝑥𝑐 𝑓 ) − 𝐴𝐸 (𝑥𝑐 𝑓 )∥ 22 need for privacy invasion in order to generate helpful explanations.
𝐼 𝑀2 = (7) Kasirzadeh and Smart [171] provide philosophical insight into the
𝑥𝑐 𝑓 + 𝜖
1
implicit assumptions and choices made when generating CFEs.
A lower value of IM1 implies that the counterfactual (𝑥𝑐 𝑓 ) can be Causal CFEs: Downs et al. [86] propose using conditional sub-
better reconstructed by the auto-encoder trained on the coun- space VAEs (CSVAE), a variant of VAEs, to generate CFEs that obey
terfactual class (𝐴𝐸𝑡 ) compared to the auto-encoder trained on correlations between features, causal relations between features,
the original class (𝐴𝐸𝑜 ). Thus implying that the counterfactual and personal preferences. This method builds a probabilistic data
is closer to the data manifold of the counterfactual class. A model of the training data using a CSVAE and uses it to generate
lower value of IM2 implies that the reconstruction from the CFEs. However, these CFEs are not with respect to a specific ML
auto-encoder trained on the counterfactual class and the auto- model. Crupi. et al. [66] propose a technique that can be used with
encoder trained on all classes is similar. Therefore, a lower value any counterfactual generation approach to generate causality abid-
of IM1 and IM2 means a more interpretable counterfactual. ing CFEs. von Kügelgen et al. [321] extend Karimi et al. [169]’s work
9
,, Sahil Verma, Varich Boonsanong, Minh Hoang, Keegan E. Hines, John P. Dickerson, and Chirag Shah

to the setting where unobserved confounders may be present in the have developed extensible toolboxes where several CFE approaches
causal setting. de Lara et al. [71] show that optimal transport-based can be plugged in and compared on specific datasets.
methods are an approximation of Pearl’s CFEs and hence can be Various uncategorized works: State [288] talk about generating
used to generate causal CFEs. Beckers [31] delve further into the CFEs with real-world constraints on features and adaptability with
integration of causality, actual causation, and CFEs. updating ML models using constraint logic programming. Tahoun
CFE for specific models: Albini et al. [11] propose a CFE genera- and Kassis [291] propose to disentangle actions from feature modi-
tion approach targeted for Bayesian network classifiers. Artelt and fications to address the lack of intervention data and appropriate
Hammer [18, 19] enlists the counterfactual optimization problem action costs. The users should already describe the actions they
formulation for several model-specific cases, like generalized linear are willing to take, and a model should just choose the minimum
model, gaussian naive bayes, and mention the general algorithm cost action that generates the CFE. Lucic et al. [201] propose a
to solve them. Koopman and Renooij [180] propose a BFS-based CFE approach to provide a lower and upper bound for the feature
technique for generating CFEs for Bayesian networks. values that get a low prediction error from the ML model for a
Works considering multi-agent scenarios of CFEs: Tsirtsis datapoint that originally had a high prediction error. Korikov and
and Gomez-Rodriguez [306] cast the counterfactual generation Beck [181], Korikov et al. [182] show how CFEs can be generated
problem as a Stackelberg game between the decision maker and the by using the generalization of inverse combinatorial optimization
person receiving the prediction. Given a ground set of CFEs, the and solve it under two objectives. Pawelczyk et al. [241] provide
proposed algorithm returns the top-k CFEs, which maximizes the a general upper bound on the cost of counterfactual explanations
utility of both the involved parties. Bordt et al. [37] point out that under the phenomenon of predictive multiplicity, wherein more
the interests of the provider and receiver of model explanations than one trained models have the same test accuracy and there is no
might be in conflict, and the ambiguous post-hoc explanations clear winner among them. Fdez-Sánchez et al. [95] propose a hierar-
might be unsuitable for achieving the purpose of transparency as chical decompositions-based method to obtain CFEs for multi-class
desired in GDPR. This also relates to fairwashing (see RC14). classification problems. Bertossi [32] and Medeiros Raimundo et al.
Global CFEs: Rawal and Lakkaraju [258] propose AReS to gener- [215] propose brute force approaches to generate CFEs.
ate rule lists that act as global CFEs. Ley et al. [197] and Kanamori
et al. [165] propose computationally more efficient implementation
6 COUNTERFACTUAL EXPLANATIONS FOR
of Rawal and Lakkaraju [258]’s work. Carrizosa et al. [49] propose
a mixed integer quadratic model to generate CFEs for a group of OTHER DATA MODALITIES
datapoints. Koo et al. [179] propose generating CFEs for a set of Since we restrict this survey to the papers that generate CFEs for
datapoints using lagrangian and subgradient methods. Pedapati tabular data, in this section we point the readers to the papers that
et al. [245] propose a technique to train a globally interpretable propose algorithms targeted towards other data modalities:
model (for a black-box model) such that this model is consistent
(1) Image data: [1, 8, 9, 12, 13, 27, 69, 91, 96, 101, 115, 122, 129, 133,
with the pertinent positives and pertinent negatives [78] of the
138, 146, 148, 153, 154, 174, 175, 188, 198, 199, 217, 235, 236, 246,
training datapoints used to train the original model.
264, 271, 284, 299, 312, 313, 318, 325, 336, 345, 347, 353].
Works proposing modifications to previous approaches: Chen
(2) Text data: [38, 54, 160, 207, 251, 255, 263, 301, 345–347].
et al. [57] and De Toni et al. [72] use RL to generate CFE as was
(3) Speech data: [351].
also proposed by Verma et al. [316]. Rasouli and Chieh Yu [252]
(4) Time-series data: [24, 74, 144, 170, 290, 305, 312, 329, 330].
propose a genetic algorithm to generate CFEs as was also proposed
(5) Graph data for graph neural networks: [2, 25, 26, 92, 204, 232,
by Dandl et al. [67]. Hashemi and Fathi [137] propose to use ge-
332]. A survey for CFE on graph neural networks: [248].
netic algorithm for CFE generation similar to Dandl et al. [67]’s
(6) Agent action (e.g. Reinforcement Learning or Planning): [39, 237,
work. Monteiro and Reynoso-Meza [222] propose extending Dandl
289].
et al. [67]’s approach using U-NSGA-III evolutionary algorithm.
(7) Recommender systems: [73, 116, 117, 161, 276, 293, 303, 341, 354,
Barr et al. [29] extend Mahajan et al. [210]’s work by interpolating
356].
between the input and CFE datapoint to generate CFEs closer to the
(8) Functional data: [50, 183] and Behavioral data: [251].
input datapoint. Sajja et al. [269] propose using a semi-supervised
autoencoder instead of the traditional unsupervised autoencoder
to generate CFEs close to the training data manifold. Huang et al. 7 OTHER APPLICATIONS OF
[145] propose LORELEY that extends LORE [126] to generate CFEs COUNTERFACTUAL EXPLANATIONS
for multi-class classification problems and account for flow con- Here we refer the readers to other applications where counterfactual
straints. Wijekoon et al. [337] use feature importances provided by explanations are being used apart from explaining ML models:
LIME to assist the case-based reasoning approach to generate CFEs.
Delaney et al. [75] propose using trust scores to measure the out-of- (1) Anomaly and data-drift detection: Hinder and Hammer [140]
distributionness of the CFEs. Guidotti and Ruggieri [128] propose propose to use CFEs to explain data drift. Sulem et al. [290] pro-
using an ensemble of base CFE explainers to generate diverse CFEs. pose to use CFEs to explain anomalies in time-series datasets.
Benchmark and dataset curation: Mazzine and Martens [214] Ravi et al. [256] wrote a survey on the explainability techniques
quantitatively compare 10 CFE generating approaches using 22 for convolutional auto-encoders for anomaly detection of im-
datasets and nine metrics. Pawelczyk et al. [240] and Artelt [17] ages. Haldar et al. [135] propose to use CFEs to explain anom-
aly detection when using autoencoders. Antoran et al. [15] use
10
Counterfactual Explanations and Algorithmic Recourses for Machine Learning: A Review ,,

CFEs to find changes in a datapoint that would help a classifier pertaining to CFEs [315]. In this version, we supplement this section
have a higher confidence in its prediction. with the research progress made towards solving them and new
(2) Training dataset debugging: Yousefzadeh and O’Leary [349] research challenges.
propose to use CFEs to debug ML models by diagnosing the be- Research Challenge 1. Unify counterfactual explanations with
havior and using synthetic data to alter the decision boundaries.
traditional “explainable AI.”
Qi and Chelmis [249] propose to use CFEs to debug potentially
mislabeled datasets. Gan et al. [111] propose to use CFEs to Although counterfactual explanations have been credited to elic-
detect bugs in financial models. Han and Ghosh [136] propose iting causal thinking and providing actionable feedback to users,
finding a minimal subset of training datapoints that are respon- they do not tell which feature(s) was the principal reason for the
sible for a particular prediction and hence can be used to debug original decision and why. It would be nice if, along with giving
training datasets. actionable feedback, counterfactual explanations also gave the rea-
(3) Data augmentation: Yuan et al. [350] propose to use CFEs to son for the original decision, which can help applicants understand
augment training data that is used to predict market volatility the model’s logic. This is addressed by traditional “explainable AI”
based on earning calls. Temraz and Keane [296] propose using methods like LIME [261], Anchors [262], Grad-CAM [274].
CFEs to augment training data to tackle the class imbalance Progress: Guidotti et al. [126] have attempted this unification, as
problem. Mehedi Hasan and Talbert [216], Rasouli and Yu [253] they first learn a local decision tree and then interpret the inversion
propose using CFEs for data augmentation of tabular datasets of decision nodes of the tree as counterfactual explanations. How-
for increased robustness. Temraz et al. [297] propose using CFEs ever, they do not show the CFEs they generate, and their technique
to generate data points that can be used to train ML models also misses other desiderata of counterfactuals (see section 3.2).
that predict crop growth (afflicted by climate change). Kommiya Mothilal et al. [178] propose necessity and sufficiency as
(4) Drug designing: Nguyen et al. [231] use CFEs to find changes the two important properties of an explanation. Feature attribu-
in a drug and protein molecule that will increase their affinity tion explanations find the feature values that are sufficient for a
for each other. They use multi-agent RL to this end. prediction, while CFEs find the feature values that are necessary
(5) ML model bias detection: [94, 226, 310]. for a prediction. They propose methods to find the necessity and
(6) Various applications: Mazzine et al. [213] propose to use CFEs sufficiency of any feature subset and discuss how that aligns with
in employment services to help job seekers get personalized finding CFEs. Galhotra et al. [110] propose Lewis that also em-
advice for increasing their propensity for getting recommended phasizes the necessity and sufficiency scores of a feature subset in
for a job and to help the ML developers to detect potential bias finding its global importance and in generating a CFE for local
and other issues in their ML model. Sadler et al. [268] propose explainability. Jia et al. [156] propose to use DeepLIFT to assign
to use CFEs for community detection in social networks. Fuji- contribution scores to the features that changed in a counterfactual
wara et al. [108] propose to use CFEs to understand interactive datapoint. Ramon et al. [251] rank the feature importances using
dimensionality reduction. Tsiakmaki and Ragos [304] propose LIME and SHAP, and then remove the features in decreasing order
to use CFEs for providing actionable suggestions to improve of importance until a CFE is found. Wiratunga et al. [338] propose
student performance in a university course. Cong et al. [63] to use methods like LIME and SHAP to find feature importances and
propose a CFE approach to explain why a test set fails the then replace the features in decreasing order of importance with
Kolmogorov-Smirnov test. Marchezini et al. [211] propose to the values borrowed from the nearest unlike neighbor (case-based
use CFE for altering both observational and latent variables to reasoning approach). Albini et al. [10] propose to change the back-
reason about mental health. Yao et al. [348] propose to use coun- ground distribution used to compute the Shapley values to make
terfactuals for evaluating the explanations for recommender the feature attribution amount to the counterfactual-ability of the
systems. Gupta et al. [131] use CFEs to propose changes to con- features, i.e., changing a feature with higher attribution would have
straint satisfaction problems that have no solutions. Teofili et al. a higher probability of changing the prediction. Wang and Vascon-
[298] propose using CFEs to explain entity resolution models. celos [325] propose to use the discriminant attribution explanations
Artelt et al. [21] use CFEs to explain the differences between as a way to produce CFEs for images. Wijekoon et al. [337] use
the learning of a pair of models. Frohberg and Binder [107] LIME to assist case-based reasoning techniques to generate CFEs.
propose a new dataset, CRASS, to test reasoning and natural Ge et al. [114] propose using counterfactual-ability of features as a
language understanding of LLMs. metric for their feature importance.
There has been one case of real-world deployment of CFEs in a Research Challenge 2. Provide counterfactual explanations as
hiring platform, Hired. Nemirovsky et al. [229] use a GAN-based discrete and sequential steps of actions.
approach [230] to suggest changes in features like expected salary,
years of experience, and skills to candidates in order to get them Most counterfactual generation approaches return the modified dat-
approved by the Hired Marketplace ML model. apoint, which would receive the desired classification. The modified
datapoint (state) reflects the idea of instantaneous and continuous
actions, but in the real world, actions are discrete and often sequen-
8 OPEN QUESTIONS AND RESEARCH tial. Therefore the counterfactual generation process must take the
PROGRESS FOR SOLVING THEM discreteness of actions into account and provide a series of actions
In the first version of this survey paper, we delineated the open that would take the individual from the current state to the modified
questions and challenges yet to be tackled by the existing works state, which has the desired class label.
11
,, Sahil Verma, Varich Boonsanong, Minh Hoang, Keegan E. Hines, John P. Dickerson, and Chirag Shah

Progress: Naumann and Ntoutsi [227] argue that to help an in- Research Challenge 7. Scalability and throughput of counterfac-
dividual achieve the desired goal, CFEs should be provided as a tual explanations generation.
sequential step of actions instead of just providing the final goal.
As we see in table 1, most approaches need to solve an optimiza-
Singh et al. [280] conduct a user study to show the high prefer-
tion problem to generate one counterfactual explanation. Some
ence for a sequential step of actions steps over a single-step goal.
papers generate multiple counterfactuals while optimizing once,
Ramakrishnan et al. [250] propose a program synthesis based tech-
but they still need to optimize separately for different input data-
nique to generate such sequences. Kanamori et al. [166] propose a
points. However, for industrial deployment, the generation should
mixed-integer based programming method and Verma et al. [316]
be more scalable.
propose an RL-based method that generates ordered sequences of
actions as a CFE. Progress: Mahajan et al. [210] learn a VAE which can generate
multiple CFEs for any given input datapoint after training. There-
Research Challenge 3. Extend counterfactual explanations beyond
fore, their approach is highly scalable and is termed as “amortized
classification.
inference”. Verma et al. [316] proposed an RL-based technique,
Progress: Recent work has been extending counterfactual expla- FastAR, that also generates amortized CFEs. Van Looveren et al.
nations to different tasks and model architectures. Spooner et al. [312], Samoilescu et al. [270], [344], Rawal and Lakkaraju [258],
[287] propose a Bayesian optimization-based technique for gener- and Nemirovsky et al. [230] also propose approaches to this end.
ating CFEs for regression problems. Numeroso and Bacciu [232]
Research Challenge 8. Counterfactual explanations should ac-
propose an RL-based approach for generating CFEs for graph neural
count for bias in the classifier.
networks, which are used to predict chemical molecule properties.
Delaney et al. [74] propose a case-based reasoning approach to gen- Counterfactuals potentially capture and reflect the bias in the mod-
erate CFEs for a time-series classifier. See Section 6 and Section 7 els. To underscore this as a possibility, Ustun et al. [310] experi-
for a list of all the approaches. mented on the difference in the difficulty of attaining a counter-
Research Challenge 4. Counterfactual explanations as an interac- factual state across genders, which clearly showed a significant
tive service to the applicants. difference. More work must be done to find how equally easy
counterfactual explanations can be provided across different de-
Counterfactual explanations should be provided as an interactive mographic groups, or how adjustments should be made to the
interface, where an individual can come at regular intervals, inform prescribed changes to account for the bias.
the system of the modified state, and get updated instructions to
achieve the counterfactual state. This can help when the individual Progress: Rawal and Lakkaraju [258] generate recourse rules for a
could not precisely follow the earlier advice for various reasons. subgroup that they use to detect model biases. Gupta et al. [132] pro-
pose adding a regularizer while training a classifier that encourages
Progress: Hohman et al. [141] developed an interactive user- the classifier to maintain a similar distance of the decision bound-
interface for providing explanations to data scientists. They found ary from different demographic groups, thereby facilitating the
out that data scientists used interactivity as the primary mecha- opportunity of equal recourse across demographic groups (which
nism for exploring, comparing, and explaining predictions. Sokol is their definition of fairness). von Kügelgen et al. [322] extend this
and Flach [285] propose to enhance ML explanations with a voice- fairness notion when the distance between the recourse is mea-
assisted interactive service. Akula et al. [9] propose an approach sured in a causal manner. Galhotra et al. [110] propose LEWIS that
that explains an ML model using an interactive sequence of CFEs. uses CFEs to identify racial bias in COMPAS and gender in Adult
Wang et al. [327] propose refining the CFEs for different feature datasets. Dash et al. [69] propose using CFEs to detect bias in image
change costs based on user interactions. classifiers and counterfactual regularizer to counteract that bias.
Research Challenge 5. The ability of counterfactual explanations
Research Challenge 9. Generate robust counterfactual explana-
to work with incomplete—or missing—causal graphs.
tions [99, 219].
Incorporating causality in the counterfactual generation is essential
for the CFEs to be grounded in reality. Complete causal graphs and Counterfactual explanation optimization problems force the modi-
structural equations are rarely available in the real world, and fied datapoint to obtain the desired class label. However, the modi-
therefore the algorithm should be able to work with incomplete fied datapoint could be labeled either in a robust manner or due to
causal graphs. the classifier’s non-robustness, e.g., an overfitted classifier. Laugel
et al. [190] term this as the stability property of a counterfactual.
Progress: Mahajan et al. [210]’s approach was the first to be com- There are three kinds of robustness needs: 1) robustness to model
patible with incomplete causal graphs. Now other works like Gal- changes when models are retrained, for example, 2) robustness to
hotra et al. [110], Verma et al. [316], Schleich et al. [272], Yang et al. the input datapoint (two individuals with a slight change in features
[344] can also work with partial causal graphs. should be given similar CFEs), and 3) robustness to small changes
Research Challenge 6. The ability of counterfactual explanations in the attained CFE (a CFE with minor changes to the originally
to work with missing feature values. suggested CFE should also be accepted).
Along the lines of an incomplete causal graph, counterfactual ex- Progress: Slack et al. [282] underscore this challenge by show-
planation algorithms should also be able to handle missing feature ing that small perturbations in the input datapoints can result in
values, which often happens in the real world [112]. drastically different CFEs. Rawal et al. [257] further emphasize
12
Counterfactual Explanations and Algorithmic Recourses for Machine Learning: A Review ,,

this challenge by empirically demonstrating the invalidation of Research Challenge 12. Counterfactual explanations should also
already prescribed recourses when the ML model gets retrained inform the applicants about what must not change
on datasets with temporal or geospatial distribution shifts. Artelt
Suppose a CFE advises someone to increase their income but does
et al. [22] evaluate the robustness of closest CFEs when contrasted
not tell that their length of last employment should not decrease.
with CFEs generated with the data manifold constraint. Bueff et al.
To increase their income, the applicant who switches to a higher-
[43] propose the framework to measure the robustness of models
paying job may find themselves in a worse position than earlier.
by purposing generated CFEs as adversarial attack datasets. Vir-
Thus by failing to disclose what must not change, an explanation
golin and Fracaros [320] empirically show that non-robust CFEs
may lead the applicant to an unsuccessful state [28]. This corrobo-
encounter a higher cost of change when adverse perturbations are
rates RC4, whereby an applicant might be able to interact with a
applied to the datapoint, thus concluding that robustness in CFEs
platform to see the effect of a potential real-world action they are
should be considered.
considering taking to achieve the counterfactual state.
Upadhyay et al. [309] propose a technique named ROAR that uses
adversarial training to generate recourses robust to changes in an Research Challenge 13. Preserving model privacy.
ML model that is retrained on a distributionally shifted training Privacy attacks on ML models can come in two major forms: mem-
dataset. Dominguez-Olmedo et al. [84] show that the CFEs that ber inference and model extraction. Both of these privacy attacks
just cross the decision boundary are usually non-robust and for- can be enhanced due to the provision of CFEs. Aïvodji et al. [7] em-
mulate an optimization problem that generates robust recourse for pirically demonstrate that adversaries can train a surrogate model
linear models and neural networks. Pawelczyk et al. [242] propose with very high fidelity to the original model (i.e., model extraction
a technique named PROBE that generates robust CFEs while letting attack) with as few as 1,000 queries to the model (which is required
the users decide the trade-off between the CFE invalidation risk during CFE generation). The problem is further aggravated when
and its cost. Black et al. [34] argue that robust CFEs should have diverse CFEs are provided. Shokri et al. [279] have demonstrated
high confidence neighborhoods with small Lipschitz constants, and that gradient-based explanations methods leak a lot of information
propose a Stable Neighbor Search algorithm to that end. Bui et al. and make the models vulnerable to membership inference attacks.
[44] propose an algorithm to generate robust CFEs by considering Miura et al. [220] propose MEGEX, a data-free model extraction
a distribution over the parameters of the model if retrained. Dutta attack that learns a surrogate model without access to its training
et al. [90] propose counterfactual stability (the lower bound of the data by training a generative model. Wang et al. [328] propose us-
predicted class probability for the sampled datapoints in the neigh- ing the CFE of a CFE to train a surrogate model and show that it is
borhood of a given CFE) as a metric for filtering robust CFEs. Bajaj more efficient in model extraction when compared to [7].
et al. [26] propose a technique to generate robust CFEs for graph
neural networks. Research Challenge 14. Guarding against fairwashing.
Aivodji et al. [5] and Aïvodji et al. [6] have pointed out the risk
Research Challenge 10. Counterfactual explanations should han-
of an adversary using model explanations to rationalize a model’s
dle dynamics (data drift, classifier update, applicant’s utility function
decisions and obscure its bias. It remains to be seen if the fair
changing, etc.)
recourse approaches can guard against fairwashing.
All counterfactual explanation papers we review assume that the Research Challenge 15. CFE interpretability with engineered fea-
underlying black box is monotonic and does not change over time. tures [272].
However, this might not be true; credit card companies and banks
update their models as frequently as 12-18 months [113]. Therefore Most current CFE approaches assume that the features they change
counterfactual explanation algorithms should take data drift, the are directly input to the ML model. This might not be the case –
dynamism and non-monotonicity of the classifier into account. it is known that model developers use highly engineered features
for training the ML models. In this light, approaches need to be
Research Challenge 11. Counterfactual explanations should cap- developed that take feature engineering into account (potentially
ture the applicant’s preferences. a non-differentiable step). Approaches that work with black-box
access will naturally be able to work in this setting.
Along with the distinction between mutable and immutable fea-
tures (finely classified into actionable, mutable, and immutable), Research Challenge 16. Handling of categorical features in coun-
counterfactual explanations should also capture preferences spe- terfactual explanations
cific to an applicant. This is important because the ease of changing Different papers have come up with various methods to handle
different features can differ across applicants. categorical features, like converting them to one-hot encoding and
Progress: Mahajan et al. [210] captures the applicant’s preferences then enforcing the sum of those columns to be 1 using regulariza-
using an oracle, but that is expensive and is still a challenge. Rawal tion or a hard constraint, or clamping an optimization problem to
and Lakkaraju [258] use the Bradley-Terry model to learn the pair- a specific categorical value, or leaving them to be automatically
wise cost for each feature pair and hence the preference among handled by genetic approaches and SMT solvers. Measuring dis-
them. Yadav et al. [343] argue that assuming each user’s cost of tance in categorical features is also not obvious. Some papers use
changing different features is the same is unrealistic. They propose an indicator function, which equates to 1 for unequal values and
asking for the user’s cost function or computing the expectation 0 if the same; other papers convert to one-hot encoding and use
by sampling cost functions from a distribution. standard distance metrics like L1/L2 norm, or use the distance in
13
,, Sahil Verma, Varich Boonsanong, Minh Hoang, Keegan E. Hines, John P. Dickerson, and Chirag Shah

Markov chains [102]. Therefore none of the methods developed Research Challenge 19. Generating optimal recourses when con-
to handle categorical features are obvious; future research must sidering a multi-agent scenario.
consider this and develop appropriate methods.
Research Challenge 17. Evaluate counterfactual explanations us- O’Brien and Kim [233] demonstrate the non-optimality of recourses
ing a user study. generated when a single agent’s interest is considered in a multi-
agent scenario like the prisoner’s dilemma. In the real world, an
The evaluation for counterfactual explanations must be done using agent’s actions affect other agents, hence generating recourses that
a user study because evaluation proxies (see section 5) might not consider the interests of multiple agents would be useful.
be able to precisely capture the psychological and other intricacies
of human cognition on the ease of actionability of a counterfactual. Research Challenge 20. Incentivize users to improve features in
Keane et al. [172] emphasize the importance of user studies in the non-manipulative ways.
context of CFEs. Progress: Förster et al. [103] conduct a user study
with 144 participants to understand the format of explanation they An approach that provides a recourse to users might want to pre-
prefer. They conclude that users prefer concrete, consistent, rele- vent the “gamification” of the model (when users manipulate simple
vant explanations, and lengthy explanations if they are concrete. features like the purpose of a loan to get approved). This also pro-
Förster et al. [102] conduct a user study with 46 participants who tects the ML models from adversarial robustness attacks.
were asked to rate the realisticness of the CFEs generated by theirs Progress: Chen et al. [56] propose the optimization objective for
and a baseline approach. Using statistical tests, they concluded that linear classification models when the goal is to develop an accurate
the CFEs generated by their approach were perceived to be more model that encourages actual feature improvement for users. They
real and typical. Rawal and Lakkaraju [258] conduct a user study categorize features into three categories: improvement, manipu-
with 21 participants who were asked to detect a bias in the recourse lative, and immutable. Users should be encouraged to change the
summaries for demographic groups. Kanamori et al. [165] conduct improvement features, not the manipulative ones when optimizing
a user study with 35 participants to compare their global CFE gen- for recourse. König et al. [187] suggest using causality to generate
erating technique with that of Rawal and Lakkaraju [258]. Singh meaningful recourses and prevent gamification of the model.
et al. [280] conduct a user study with 54 participants and found that
most users prefer specific directives over generic and non-directive Research Challenge 21. Strengthen the ties between machine
explanations. Warren et al. [331] conduct a user study with 127 par- learning and regulatory communities.
ticipants and found that counterfactual explanations elicited higher
trust and satisfaction than causal explanations. Yacoby et al. [342] A joint statement between the machine learning community and
conduct a user study with 8 U.S. state court judges to understand regulatory community (OCC, Federal Reserve, FTC, CFPB) acknowl-
their response to CFEs from pretrial risk assessment instruments edging successes and limitations of where counterfactual explana-
(PRAI). They conclude that judges ignored the CFEs and focused on tions will be adequate for legal and consumer-facing needs and
the factual features of the defendant. Kuhl et al. [186] conduct a user would improve the adoption and use of counterfactual explanations
study with 74 users in an interactive game setting and found that in critical software.
users benefit less from receiving computationally plausible CFEs
Progress: Reed et al. [260] talk about how regulation and policies
than the closest CFEs (measured using feature distance). Zhang.
need to adapt to how ML models can explain their decisions.
et al. [352] conduct a user study with 200 users to check their un-
derstanding of global, local, and CF explanations. Cai et al. [47]
conduct a user study on 1070 participants to understand how users 9 CONCLUSIONS
perceive explanations when provided examples from the desired In this paper, we collected and reviewed more than 350 papers
class vs. when provided examples from all other classes. which proposed various algorithmic solutions to finding counter-
Research Challenge 18. Counterfactual explanations should be factual explanations for the decisions produced by automated sys-
integrated with data visualization interfaces. tems, specifically automated by machine learning. Evaluating all
the papers on the same rubric helps in quickly understanding the
Counterfactual explanations will directly interact with consumers peculiarities of different approaches and the advantages, and disad-
with varying technical knowledge levels; therefore, counterfactual vantages of each of them, which can also help organizations choose
generation algorithms should be integrated with visualization in- the algorithm best suited to their application constraints. This has
terfaces. We already know that visualization can influence human also helped us readily identify the gaps, which will be beneficial to
behavior [64], and a collaboration between machine learning and researchers scouring for open problems in this space and quickly
HCI communities could help address this challenge. sifting the large body of literature. We hope this paper can also be
Progress: Cheng et al. [58], Gomez et al. [119, 120], Leung et al. the starting point for people wanting to get an introduction to the
[195], Wexler et al. [333] have developed interactive graphical user broad area of counterfactual explanations and guide them to proper
interfaces for displaying CFEs. DECE [58] also summarizes CFEs for resources for things they might be interested in.
subgroups that can help detect model biases, if any. Tamagnini et al.
[292] develop a visualization tool for CFEs for text classification Acknowledgments. We thank Jason Wittenbach, Aditya Kusupati,
models. Hohman et al. [141] also build a visual interactive user Divyat Mahajan, Jessica Dai, Soumye Singhal, Harsh Vardhan, and
interface for providing model explanations. Jesse Michel for helpful comments.
14
Counterfactual Explanations and Algorithmic Recourses for Machine Learning: A Review ,,

REFERENCES [21] André Artelt, Fabian Hinder, Valerie Vaquet, Robert Feldhans, and Barbara
[1] Abubakar Abid, Mert Yuksekgonul, and James Zou. 2022. Meaningfully debug- Hammer. 2021. Contrastive Explanations for Explaining Model Adaptations.
ging model mistakes using conceptual counterfactual explanations. In Proceed- In Advances in Computational Intelligence. Springer International Publishing,
ings of the 39th International Conference on Machine Learning (Proceedings of Ma- Cham, 101–112.
chine Learning Research), Vol. 162. PMLR, 66–88. https://proceedings.mlr.press/ [22] André Artelt, Valerie Vaquet, Riza Velioglu, Fabian Hinder, Johannes Brinkrolf,
v162/abid22a.html Malte Schilling, and Barbara Hammer. 2021. Evaluating Robustness of Counter-
[2] Carlo Abrate and Francesco Bonchi. 2021. Counterfactual Graphs for Explain- factual Explanations. 2021 IEEE Symposium Series on Computational Intelligence
able Classification of Brain Networks. In Proceedings of the 27th ACM SIGKDD (SSCI) (2021), 01–09.
Conference on Knowledge Discovery & Data Mining (KDD ’21). Association [23] Nicholas Asher, Lucas De Lara, Soumya Paul, and Chris Russell. 2022. Coun-
for Computing Machinery, New York, NY, USA, 10. https://doi.org/10.1145/ terfactual Models for Fair and Adequate Explanations. Machine Learning and
3447548.3467154 Knowledge Extraction 4, 2 (2022), 316–349. https://doi.org/10.3390/make4020014
[3] Amina Adadi and Mohammed Berrada. 2018. Peeking inside the black-box: A [24] Emre Ates, Burak Aksar, Vitus J. Leung, and Ayse K. Coskun. 2021. Counter-
survey on Explainable Artificial Intelligence (XAI). IEEE Access PP (09 2018), factual Explanations for Multivariate Time Series. In 2021 International Confer-
1–1. https://doi.org/10.1109/ACCESS.2018.2870052 ence on Applied Artificial Intelligence (ICAPAI). 1–8. https://doi.org/10.1109/
[4] Charu C. Aggarwal, Chen Chen, and Jiawei Han. 2010. The Inverse Classification ICAPAI49758.2021.9462056
Problem. J. Comput. Sci. Technol. 25, 3 (May 2010), 458–468. https://doi.org/ [25] Davide Bacciu and Danilo Numeroso. 2022. Explaining Deep Graph Networks
10.1007/s11390-010-9337-x via Input Perturbation. IEEE Transactions on Neural Networks and Learning
[5] Ulrich Aivodji, Hiromi Arai, Olivier Fortineau, Sébastien Gambs, Satoshi Hara, Systems (2022). https://doi.org/10.1109/TNNLS.2022.3165618
and Alain Tapp. 2019. Fairwashing: the risk of rationalization. In Proceedings of [26] Mohit Bajaj, Lingyang Chu, Zi Yu Xue, Jian Pei, Lanjun Wang, Peter Cho-Ho
the 36th International Conference on Machine Learning (Proceedings of Machine Lam, and Yong Zhang. 2021. Robust Counterfactual Explanations on Graph
Learning Research), Vol. 97. PMLR, 161–170. https://proceedings.mlr.press/v97/ Neural Networks. https://doi.org/10.48550/ARXIV.2107.04086
aivodji19a.html [27] Rachana Balasubramanian, Samuel Sharpe, Brian Barr, Jason Wittenbach, and
[6] Ulrich Aïvodji, Hiromi Arai, Sébastien Gambs, and Satoshi Hara. 2021. C. Bayan Bruss. 2020. Latent-CF: A Simple Baseline for Reverse Counterfactual
Characterizing the risk of fairwashing. In Advances in Neural Information Explanations. https://doi.org/10.48550/ARXIV.2012.09301
Processing Systems, Vol. 34. Curran Associates, Inc., 14822–14834. https: [28] Solon Barocas, Andrew D. Selbst, and Manish Raghavan. 2020. The Hidden
//proceedings.neurips.cc/paper/2021/file/7caf5e22ea3eb8175ab518429c8589a4- Assumptions behind Counterfactual Explanations and Principal Reasons. In Pro-
Paper.pdf ceedings of the Conference on Fairness, Accountability, and Transparency (FAccT)
[7] Ulrich Aïvodji, Alexandre Bolot, and Sébastien Gambs. 2020. Model extraction (FAT* ’20). Association for Computing Machinery, New York, NY, USA, 10.
from counterfactual explanations. arXiv preprint arXiv:2009.01884 (2020). https://doi.org/10.1145/3351095.3372830
[8] Arjun Akula, Shuai Wang, and Song-Chun Zhu. 2020. CoCoX: Generating [29] Brian Barr, Matthew R. Harrington, Samuel Sharpe, and C. Bayan Bruss. 2021.
Conceptual and Counterfactual Explanations via Fault-Lines. Proceedings of Counterfactual Explanations via Latent Space Projection and Interpolation.
the AAAI Conference on Artificial Intelligence 34, 03 (Apr. 2020), 2594–2601. https://doi.org/10.48550/ARXIV.2112.00890
https://doi.org/10.1609/aaai.v34i03.5643 [30] C. Van Fraassen Bas. 1980. The Scientific Image. Oxford University Press.
[9] Arjun R. Akula, Keze Wang, Changsong Liu, Sari Saba-Sadiya, Hongjing Lu, [31] Sander Beckers. 2022. Causal Explanations and XAI. https://doi.org/10.48550/
Sinisa Todorovic, Joyce Chai, and Song-Chun Zhu. 2022. CX-ToM: Counter- ARXIV.2201.13169
factual explanations with theory-of-mind for enhancing human trust in image [32] Leopoldo E. Bertossi. 2020. Declarative Approaches to Counterfactual Explana-
recognition models. iScience 25, 1 (2022), 103581. https://doi.org/10.1016/ tions for Classification.
j.isci.2021.103581 [33] Reuben Binns, Max Van Kleek, Michael Veale, Ulrik Lyngs, Jun Zhao, and Nigel
[10] Emanuele Albini, Jason Long, Danial Dervovic, and Daniele Magazzeni. 2022. Shadbolt. 2018. ’It’s Reducing a Human Being to a Percentage’: Perceptions
Counterfactual Shapley Additive Explanations. In 2022 ACM Conference on Fair- of Justice in Algorithmic Decisions. In Proceedings of the 2018 CHI Conference
ness, Accountability, and Transparency (FAccT ’22). Association for Computing on Human Factors in Computing Systems (CHI ’18). Association for Computing
Machinery, New York, NY, USA, 17. https://doi.org/10.1145/3531146.3533168 Machinery, New York, NY, USA, 14. https://doi.org/10.1145/3173574.3173951
[11] Emanuele Albini, Antonio Rago, Pietro Baroni, and Francesca Toni. 2021. [34] Emily Black, Zifan Wang, and Matt Fredrikson. 2022. Consistent Counterfactuals
Influence-Driven Explanations for Bayesian Network Classifiers. In PRICAI for Deep Models. In International Conference on Learning Representations. https:
2021. Springer-Verlag, Berlin, Heidelberg, 13. https://doi.org/10.1007/978-3- //arxiv.org/abs/2110.03109
030-89188-6_7 [35] Pierre Blanchart. 2021. An exact counterfactual-example-based approach to tree-
[12] Gohar Ali, Feras Al-Obeidat, Abdallah Tubaishat, Tehseen Zia, Muhammad ensemble models interpretability. https://doi.org/10.48550/ARXIV.2105.14820
Ilyas, and Alvaro Rocha. 2021. Counterfactual explanation of Bayesian model [36] R. D. Boch and M. Lieberman. 1970. Fitting a response model for n dichotomously
uncertainty. Neural Computing and Applications (Sept. 2021). https://doi.org/ scored items. Psychometrika 35 (1970), 179–97.
10.1007/s00521-021-06528-z [37] Sebastian Bordt, Michèle Finck, Eric Raidl, and Ulrike von Luxburg. 2022. Post-
[13] Kamran Alipour, Arijit Ray, Xiao Lin, Michael Cogswell, Jurgen P. Schulze, Hoc Explanations Fail to Achieve their Purpose in Adversarial Contexts. https:
Yi Yao, and Giedrius T. Burachas. 2021. Improving users’ mental model with //arxiv.org/abs/2201.10295
attention-directed counterfactual edits. Applied AI Letters 2, 4 (2021). https: [38] Zeyd Boukhers, Timo Hartmann, and Jan Jürjens. 2022. COIN: Counterfac-
//doi.org/10.1002/ail2.47 tual Image Generation for VQA Interpretation. https://doi.org/10.48550/
[14] Robert Andrews, Joachim Diederich, and Alan B. Tickle. 1995. Survey and ARXIV.2201.03342
Critique of Techniques for Extracting Rules from Trained Artificial Neural [39] Martim Brandão, Gerard Canal, Senka Krivić, Paul Luff, and Amanda Coles.
Networks. Know.-Based Syst. 8, 6 (1995), 17. https://doi.org/10.1016/0950- 2021. How experts explain motion planner output: a preliminary user-study
7051(96) 81920-4 to inform the design of explainable planners. In 2021 30th IEEE International
[15] Javier Antoran, Umang Bhatt, Tameem Adel, Adrian Weller, and José Miguel Conference on Robot & Human Interactive Communication (RO-MAN). 299–306.
Hernández-Lobato. 2021. Getting a {CLUE}: A Method for Explaining Un- https://doi.org/10.1109/RO-MAN50785.2021.9515407
certainty Estimates. In International Conference on Learning Representations. [40] Katherine Elizabeth Brown, Doug Talbert, and Steve Talbert. 2021. The Uncer-
https://openreview.net/forum?id=XSLF1XFq5h tainty of Counterfactuals in Deep Learning. The International FLAIRS Conference
[16] Daniel Apley and Jingyu Zhu. 2020. Visualizing the effects of predictor variables Proceedings 34 (2021). https://doi.org/10.32473/flairs.v34i1.128795
in black box supervised learning models. Journal of the Royal Statistical Society: [41] Kieran Browne and Ben Swift. 2020. Semantics and explanation: why coun-
Series B (Statistical Methodology) 82(4) (06 2020), 1059–1086. https://doi.org/ terfactual explanations produce adversarial examples in deep neural networks.
10.1111/rssb.12377 https://doi.org/10.48550/ARXIV.2012.10076
[17] André Artelt. 2019 - 2021. CEML: Counterfactuals for Explaining Machine [42] Dieter Brughmans and David Martens. 2021. NICE: An Algorithm for
Learning models - A Python toolbox. https://www.github.com/andreArtelt/ Nearest Instance Counterfactual Explanations. https://doi.org/10.48550/
ceml. ARXIV.2104.07411
[18] André Artelt and Barbara Hammer. 2019. On the computation of counterfactual [43] Andreas C. Bueff, Mateusz Cytryński, Raffaella Calabrese, Matthew Jones, John
explanations – A survey. http://arxiv.org/abs/1911.07749 Roberts, Jonathon Moore, and Iain Brown. 2022. Machine learning interpretabil-
[19] André Artelt and Barbara Hammer. 2020. Efficient computation of contrastive ity for a stress scenario generation in credit scoring based on counterfactu-
explanations. https://doi.org/10.48550/ARXIV.2010.02647 als. Expert Systems with Applications 202 (2022). https://doi.org/10.1016/
[20] André Artelt and Barbara Hammer. 2021. Convex optimization for action- j.eswa.2022.117271
able & plausible counterfactual explanations. https://doi.org/10.48550/ [44] Ngoc Bui, Duy Nguyen, and Viet Anh Nguyen. 2022. Counterfactual Plans
ARXIV.2105.07630 under Distributional Ambiguity. https://doi.org/10.48550/ARXIV.2201.12487
[45] Ruth Byrne. 2008. The Rational Imagination: How People Create Alternatives
to Reality. The Behavioral and brain sciences 30 (12 2008), 439–53; discussion
15
,, Sahil Verma, Varich Boonsanong, Minh Hoang, Keegan E. Hines, John P. Dickerson, and Chirag Shah

453. https://doi.org/10.1017/S0140525X07002579 Explanations and Feasible Recommendations to End Users. In Proceedings of

[46] Ruth M. J. Byrne. 2019. Counterfactuals in Explainable Artificial Intelligence the 14th International Conference on Agents and Artificial Intelligence - Volume 2:
(XAI): Evidence from Human Reasoning. In Proceedings of the Twenty-Eighth ICAART,. SciTePress, 24–32. https://doi.org/10.5220/0010761500003116
International Joint Conference on Artificial Intelligence, IJCAI-19. International [67] Susanne Dandl, Christoph Molnar, Martin Binder, and Bernd Bischl. 2020. Multi-
Joint Conferences on Artificial Intelligence Organization, California, USA, 6276– Objective Counterfactual Explanations. In Parallel Problem Solving from Nature
6282. https://doi.org/10.24963/ijcai.2019/876 – PPSN XVI. Springer International Publishing, Cham, 448–469.
[47] Carrie J. Cai, Jonas Jongejan, and Jess Holbrook. 2019. The Effects of Example- [68] DARPA. [n. d.]. Broad Agency Announcement: Explainable Artificial Intel-
Based Explanations in a Machine Learning Interface. In Proceedings of the 24th ligence (XAI). https://www.darpa.mil/attachments/DARPA-BAA-16-53.pdf.
International Conference on Intelligent User Interfaces (IUI ’19). Association for Accessed: 2020-10-15.
Computing Machinery, New York, NY, USA, 258–262. https://doi.org/10.1145/ [69] Saloni Dash, Vineeth N Balasubramanian, and Amit Sharma. 2022. Evaluating
3301275.3302289 and Mitigating Bias in Image Classifiers: A Causal Perspective Using Counter-
[48] Miguel Á. Carreira-Perpiñán and Suryabhan Singh Hada. 2021. Counterfactual factuals. In Proceedings of the IEEE/CVF Winter Conference on Applications of
Explanations for Oblique Decision Trees: Exact, Efficient Algorithms. Proceed- Computer Vision (WACV). 915–924.
ings of the AAAI Conference on Artificial Intelligence 35 (May 2021), 6903–6911. [70] A. Datta, S. Sen, and Y. Zick. 2016. Algorithmic Transparency via Quantitative
https://doi.org/10.1609/aaai.v35i8.16851 Input Influence: Theory and Experiments with Learning Systems. In 2016 IEEE
[49] Emilio Carrizosa, Jasone Ramirez-Ayerbe, and Dolores Romero Morales. 2021. Symposium on Security and Privacy (SP). IEEE, New York, USA, 598–617.
Generating Collective Counterfactual Explanations in Score-Based Classification [71] Lucas de Lara, Alberto González-Sanz, Nicholas Asher, and Jean-Michel Loubes.
via Mathematical Optimization. https://doi.org/10.13140/RG.2.2.22996.12168/1 2021. Transport-based Counterfactual Models. https://doi.org/10.48550/
[50] Emilio Carrizosa, Jasone Ramírez-Ayerbe, and Dolores Romero Morales. 2022. ARXIV.2108.13025
Counterfactual Explanations for Functional Data: A Mathematical Optimization [72] Giovanni De Toni, Bruno Lepri, and Andrea Passerini. 2022. Synthesizing
Approach. https://doi.org/10.13140/RG.2.2.25682.68801 explainable counterfactual policies for algorithmic recourse with program syn-
[51] Diogo V Carvalho, Eduardo M Pereira, and Jaime S Cardoso. 2019. Machine thesis. https://doi.org/10.48550/ARXIV.2201.07135
learning interpretability: A survey on methods and metrics. Electronics 8, 8 [73] Sarah Dean, Sarah Rich, and Benjamin Recht. 2020. Recommendations and User
(2019), 832. Agency: The Reachability of Collaboratively-Filtered Information. In Proceedings
[52] CFPB. [n. d.]. Adverse Action Notice Requirements Under the ECOA and the of the 2020 Conference on Fairness, Accountability, and Transparency (FAT* ’20).
FCRA. https://consumercomplianceoutlook.org/2013/second-quarter/adverse- Association for Computing Machinery, New York, NY, USA, 10. https://doi.org/
action-notice-requirements-under-ecoa-fcra/. Accessed: 2020-10-15. 10.1145/3351095.3372866
[53] CFPB. [n. d.]. Notification of action taken, ECOA notice, and statement of specific [74] Eoin Delaney, Derek Greene, and Mark T Keane. 2021. Instance-based counter-
reasons. https://www.consumerfinance.gov/policy-compliance/rulemaking/ factual explanations for time series classification. In International Conference on
regulations/1002/9/. Accessed: 2020-10-15. Case-Based Reasoning. Springer, 32–47.
[54] Qianglong Chen, Feng Ji, Xiangji Zeng, Feng-Lin Li, Ji Zhang, Haiqing Chen, and [75] Eoin Delaney, Derek Greene, and Mark T. Keane. 2021. Uncertainty Estimation
Yin Zhang. 2021. KACE: Generating Knowledge Aware Contrastive Explanations and Out-of-Distribution Detection for Counterfactual Explanations: Pitfalls and
for Natural Language Inference. In Proceedings of the 59th Annual Meeting of Solutions.
the Association for Computational Linguistics and the 11th International Joint [76] Houtao Deng. 2014. Interpreting Tree Ensembles with inTrees. arXiv:1408.5456
Conference on Natural Language Processing (Volume 1: Long Papers). Association (08 2014). https://doi.org/10.1007/s41060-018-0144-8
for Computational Linguistics, Online, 2516–2527. https://doi.org/10.18653/v1/ [77] Jia Deng, Wei Dong, Richard Socher, Li-Jia Li, Kai Li, and Li Fei-Fei. 2009.
2021.acl-long.196 ImageNet: A large-scale hierarchical image database. In 2009 IEEE Conference
[55] Tsong Yueh Chen, Fei-Ching Kuo, Huai Liu, Pak-Lok Poon, Dave Towey, T. H. on Computer Vision and Pattern Recognition. 248–255.
Tse, and Zhi Quan Zhou. 2018. Metamorphic Testing: A Review of Challenges [78] Amit Dhurandhar, Pin-Yu Chen, Ronny Luss, Chun-Chen Tu, Paishun Ting,
and Opportunities. ACM Comput. Surv. 51, 1 (2018), 27. https://doi.org/10.1145/ Karthikeyan Shanmugam, and Payel Das. 2018. Explanations Based on the Miss-
3143561 ing: Towards Contrastive Explanations with Pertinent Negatives. In Proceedings
[56] Yatong Chen, Jialu Wang, and Yang Liu. 2020. Strategic Recourse in Linear of the 32nd International Conference on Neural Information Processing Systems
Classification. (NIPS’18). Curran Associates Inc., Red Hook, NY, USA, 590–601.
[57] Ziheng Chen, Fabrizio Silvestri, Jia Wang, He Zhu, Hongshik Ahn, and Gabriele [79] Amit Dhurandhar, Tejaswini Pedapati, Avinash Balakrishnan, Pin-Yu Chen,
Tolomei. 2021. ReLAX: Reinforcement Learning Agent eXplainer for Arbitrary Karthikeyan Shanmugam, and Ruchir Puri. 2019. Model Agnostic Contrastive
Predictive Models. https://doi.org/10.48550/ARXIV.2110.11960 Explanations for Structured Data. http://arxiv.org/abs/1906.00117
[58] Furui Cheng, Yao Ming, and Huamin Qu. 2020. DECE: Decision Ex- [80] Edsger W Dijkstra. 1959. A note on two problems in connexion with graphs.
plorer with Counterfactual Explanations for Machine Learning Models. Numerische mathematik 1, 1 (1959), 269–271.
arXiv:cs.LG/2008.08353 [81] Jonathan Dodge, Q. Vera Liao, Yunfeng Zhang, Rachel K. E. Bellamy, and Casey
[59] Noel Codella, Veronica Rotemberg, Philipp Tschandl, M. Emre Celebi, Stephen Dugan. 2019. Explaining Models: An Empirical Study of How Explanations
Dusza, David Gutman, Brian Helba, Aadi Kalloo, Konstantinos Liopyris, Michael Impact Fairness Judgment. In Proceedings of the 24th International Conference on
Marchetti, Harald Kittler, and Allan Halpern. 2019. Skin Lesion Analysis To- Intelligent User Interfaces (IUI ’19). Association for Computing Machinery, New
ward Melanoma Detection 2018: A Challenge Hosted by the International Skin York, NY, USA, 11. https://doi.org/10.1145/3301275.3302310
Imaging Collaboration (ISIC). https://doi.org/10.48550/ARXIV.1902.03368 [82] Carl Doersch. 2016. Tutorial on Variational Autoencoders.
[60] Gregory Cohen, Saeed Afshar, Jonathan C. Tapson, and André van Schaik. 2017. arXiv:stat.ML/1606.05908
EMNIST: Extending MNIST to handwritten letters. 2017 International Joint [83] Pedro Domingos. 1998. Knowledge Discovery Via Multiple Models. Intell. Data
Conference on Neural Networks (IJCNN) (2017), 2921–2926. Anal. 2, 3 (May 1998), 187–202.
[61] European Commission. [n. d.]. Artificial Intelligence. https://ec.europa.eu/info/ [84] Ricardo Dominguez-Olmedo, Amir H Karimi, and Bernhard Schölkopf. 2022. On
funding-tenders/opportunities/portal/screen/opportunities/topic-details/ict- the Adversarial Robustness of Causal Algorithmic Recourse. In Proceedings of
26-2018-2020. Accessed: 2020-10-15. the 39th International Conference on Machine Learning (Proceedings of Machine
[62] European Commission. [n. d.]. REGULATION (EU) 2016/679 OF THE EURO- Learning Research). PMLR, 5324–5342. https://proceedings.mlr.press/v162/
PEAN PARLIAMENT AND OF THE COUNCIL of 27 April 2016 on the protection dominguez-olmedo22a.html
of natural persons with regard to the processing of personal data and on the [85] Finale Doshi-Velez, Mason Kortz, Ryan Budish, Chris Bavitz, Sam Gershman, D.
free movement of such data, and repealing Directive 95/46/EC (General Data O’Brien, Stuart Schieber, J. Waldo, D. Weinberger, and Alexandra Wood. 2017.
Protection Regulation). https://eur-lex.europa.eu/eli/reg/2016/679/oj. Accessed: Accountability of AI Under the Law: The Role of Explanation.
2020-10-15. [86] Michael Downs, Jonathan Chu, Yaniv Yacoby, Finale Doshi-Velez, and
[63] Zicun Cong, Lingyang Chu, Yu Yang, and Jian Pei. 2021. Comprehensible Weiwei. Pan. 2020. CRUDS: Counterfactual Recourse Using Disen-
Counterfactual Explanation on Kolmogorov-Smirnov Test. Proc. VLDB Endow. tangled Subspaces. In Workshop on Human Interpretability in Machine
14, 9 (2021), 1583–1596. https://doi.org/10.14778/3461535.3461546 Learning (WHI). https://finale.seas.harvard.edu/files/finale/files/cruds-
[64] Michael Correll. 2019. Ethical Dimensions of Visualization Research. In Pro- _counterfactual_recourse_using_disentangled_subspaces.pdf
ceedings of the 2019 CHI Conference on Human Factors in Computing Systems [87] Dheeru Dua and Casey Graff. 2017. UCI Machine Learning Repository - Adult
(CHI ’19). Association for Computing Machinery, New York, NY, USA, 1–13. Income. http://archive.ics.uci.edu/ml/datasets/Adult
https://doi.org/10.1145/3290605.3300418 [88] Jannik Dunkelau and Michael Leuschel. 2019. Fairness-Aware Machine Learn-
[65] Mark W. Craven and Jude W. Shavlik. 1995. Extracting Tree-Structured Repre- ing. , 60 pages. https://www.phil-fak.uni-duesseldorf .de/fileadmin/Redaktion/
sentations of Trained Networks. In Conference on Neural Information Processing Institute/Sozialwissenschaften/Kommunikations-_und_Medienwissenschaft/
Systems (NeurIPS) (NIPS’95). MIT Press, Cambridge, MA, USA, 24–30. KMW_I/Working_Paper/Dunkelau___Leuschel__2019__Fairness-
[66] Riccardo Crupi., Beatriz San Miguel González., Alessandro Castelnovo., and Aware_Machine_Learning.pdf
Daniele Regoli. 2022. Leveraging Causal Relations to Provide Counterfactual
16
Counterfactual Explanations and Algorithmic Recourses for Machine Learning: A Review ,,

[89] Tri Dung Duong, Qian Li, and Guandong Xu. 2021. Prototype-based Coun- [111] Jingwei Gan, Shinan Zhang, Chi Zhang, and Andy Li. 2021. Automated Coun-
terfactual Explanation for Causal Classification. https://doi.org/10.48550/ terfactual Generation in Financial Model Risk Management. In 2021 IEEE Inter-
ARXIV.2105.00703 national Conference on Big Data (Big Data). 4064–4068. https://doi.org/10.1109/
[90] Sanghamitra Dutta, Jason Long, Saumitra Mishra, Cecilia Tilli, and Daniele BigData52589.2021.9671561
Magazzeni. 2022. Robust Counterfactual Explanations for Tree-Based Ensem- [112] P. J. García-Laencina, J. Sancho-Gómez, and A. R. Figueiras-Vidal. 2009. Pattern
bles. In Proceedings of the 39th International Conference on Machine Learn- classification with missing data: a review. Neural Computing and Applications
ing (Proceedings of Machine Learning Research), Vol. 162. PMLR, 5742–5756. 19 (2009), 263–282.
https://proceedings.mlr.press/v162/dutta22a.html [113] Gordon Garisch. [n. d.]. MODEL LIFECYCLE TRANSFORMA-
[91] Andrew Elliott, Stephen Law, and Chris Russell. 2021. Explaining Classi- TION: HOW BANKS ARE UNLOCKING EFFICIENCIES. https:
fiers using Adversarial Perturbations on the Perceptual Ball. In Conference //financialservicesblog.accenture.com/model-lifecycle-transformation-
on Computer Vision and Pattern Recognition (CVPR). https://doi.org/10.48550/ how-banks-are-unlocking-efficiencies. Accessed: 2022-10-15.
ARXIV.1912.09405 [114] Yingqiang Ge, Shuchang Liu, Zelong Li, Shuyuan Xu, Shijie Geng, Yunqi Li,
[92] Lukas Faber, Amin K. Moghaddam, and Roger Wattenhofer. 2020. Con- Juntao Tan, Fei Sun, and Yongfeng Zhang. 2021. Counterfactual Evaluation for
trastive Graph Neural Network Explanation. https://doi.org/10.48550/ Explainable AI. https://doi.org/10.48550/ARXIV.2109.01962
ARXIV.2010.13663 [115] Asma Ghandeharioun, Been Kim, Chun-Liang Li, Brendan Jou, Brian Eoff, and
[93] Daniel Faggella. 2020. Machine Learning for Medical Diagnostics – 4 Cur- Rosalind Picard. 2022. DISSECT: Disentangled Simultaneous Explanations via
rent Applications. https://emerj.com/ai-sector-overviews/machine-learning- Concept Traversals. In International Conference on Learning Representations.
medical-diagnostics-4-current-applications/. Accessed: 2020-10-15. https://openreview.net/forum?id=qY79G8jGsep
[94] Jake Fawkes, Robin Evans, and Dino Sejdinovic. 2022. Selection, Ignorability and [116] Azin Ghazimatin, Oana Balalau, Rishiraj Saha Roy, and Gerhard Weikum. 2020.
Challenges With Causal Fairness. https://doi.org/10.48550/ARXIV.2202.13774 PRINCE: Provider-Side Interpretability with Counterfactual Explanations in
[95] J.A. Fdez-Sánchez, J.D. Pascual-Triana, A. Fernández, and F. Herrera. 2021. Learn- Recommender Systems. In Proceedings of the 13th International Conference on
ing interpretable multi-class models by means of hierarchical decomposition: Web Search and Data Mining (WSDM ’20). Association for Computing Machinery,
Threshold Control for Nested Dichotomies. Neurocomputing 463 (2021), 514–524. New York, NY, USA, 9. https://doi.org/10.1145/3336191.3371824
https://doi.org/10.1016/j.neucom.2021.07.097 [117] Giorgos Giannopoulos, George Papastefanatos, Dimitris Sacharidis, and Kostas
[96] Amir H. Feghahati, Christian R. Shelton, Michael J. Pazzani, and Kevin Tang. Stefanidis. 2021. Interactivity, Fairness and Explanations in Recommendations.
2020. CDeepEx: Contrastive Deep Explanations. In ECAI. Association for Computing Machinery, New York, NY, USA. https://doi.org/
[97] Rubén R. Fernández, Isaac Martín de Diego, Víctor Aceña, Alberto Fernández- 10.1145/3450614.3462238
Isabel, and Javier M. Moguerza. 2020. Random forest explainability using coun- [118] Alex Goldstein, Adam Kapelner, Justin Bleich, and Emil Pitkin. 2013. Peeking
terfactual sets. Information Fusion 63 (2020), 196–207. https://doi.org/10.1016/ Inside the Black Box: Visualizing Statistical Learning With Plots of Individual
j.inffus.2020.07.001 Conditional Expectation. Journal of Computational and Graphical Statistics 24
[98] Carlos Fernández-Loría, Foster Provost, and Xintian Han. 2020. Explaining (09 2013). https://doi.org/10.1080/10618600.2014.907095
Data-Driven Decisions made by AI Systems: The Counterfactual Approach. [119] Oscar Gomez, Steffen Holter, Jun Yuan, and Enrico Bertini. 2020. ViCE: Visual
http://arxiv.org/abs/2001.07417 Counterfactual Explanations for Machine Learning Models. In Proceedings of
[99] Andrea Ferrario and Michele Loi. 2020. A Series of Unfortunate Counterfactual the 25th International Conference on Intelligent User Interfaces (IUI ’20). 5. https:
Events: the Role of Time in Counterfactual Explanations. https://doi.org/ //doi.org/10.1145/3377325.3377536
10.48550/ARXIV.2010.04687 [120] Oscar Gomez, Steffen Holter, Jun Yuan, and Enrico Bertini. 2021. AdViCE:
[100] FICO. 2018. FICO (HELOC) dataset. https://community.fico.com/s/explainable- Aggregated Visual Counterfactual Explanations for Machine Learning Model
machine-learning-challenge?tabset-3158a=2 Validation. https://doi.org/10.48550/ARXIV.2109.05629
[101] Giorgos Filandrianos, Konstantinos Thomas, Edmund Dervakos, and Giorgos [121] Bryce Goodman and S. Flaxman. 2016. EU regulations on algorithmic decision-
Stamou. 2022. Conceptual Edits as Counterfactual Explanations. In Proceedings making and a "right to explanation". ArXiv abs/1606.08813 (2016).
of the AAAI 2022 Spring Symposium on Machine Learning and Knowledge En- [122] Yash Goyal, Ziyan Wu, Jan Ernst, Dhruv Batra, Devi Parikh, and Stefan Lee.
gineering for Hybrid Intelligence (AAAI-MAKE 2022), Stanford University, Palo 2019. Counterfactual Visual Explanations. In Proceedings of the 36th International
Alto, California, USA, March 21-23, 2022 (CEUR Workshop Proceedings), Vol. 3121. Conference on Machine Learning (Proceedings of Machine Learning Research),
CEUR-WS.org. http://ceur-ws.org/Vol-3121/paper6.pdf Vol. 97. PMLR, 2376–2384. https://proceedings.mlr.press/v97/goyal19a.html
[102] Maximilian Förster, Philipp Hühn, Mathias Klier, and Kilian Kluge. 2021. Cap- [123] Preston Gralla. 2016. Amazon Prime and the racist algorithms.
turing Users’ Reality: A Novel Approach to Generate Coherent Counterfactual https://www.computerworld.com/article/3068622/amazon-prime-and-
Explanations. https://doi.org/10.24251/HICSS.2021.155 the-racist-algorithms.html
[103] Maximilian Förster, Mathias Klier, Kilian Kluge, and Irina Sigler. 2020. Evaluating [124] Rory Mc Grath, Luca Costabello, Chan Le Van, Paul Sweeney, Farbod Kamiab,
explainable Artifical intelligence–What users really appreciate. (2020). https: Zhao Shen, and Freddy Lecue. 2018. Interpretable Credit Application Predictions
//aisel.aisnet.org/ecis2020_rp/195 With Counterfactual Explanations. http://arxiv.org/abs/1811.05245
[104] Fraunhofer IOSB, Maximilian Becker, Nadia Burkart, Pascal Birnstill, and [125] Home Credit Group. 2018. Home Credit Default Risk. https://www.kaggle.com/
Jürgen Beyerer. 2021. A Step Towards Global Counterfactual Explanations: c/home-credit-default-risk/data
Approximating the Feature Space Through Hierarchical Division and Graph [126] Riccardo Guidotti, Anna Monreale, Salvatore Ruggieri, Dino Pedreschi, Franco
Search. Advances in Artificial Intelligence and Machine Learning (2021), 90–110. Turini, and Fosca Giannotti. 2018. Local Rule-Based Explanations of Black Box
https://doi.org/10.54364/aaiml.2021.1107 Decision Systems. http://arxiv.org/abs/1805.10820
[105] Timo Freiesleben. 2022. The intriguing relation between counterfactual expla- [127] Riccardo Guidotti, Anna Monreale, Salvatore Ruggieri, Franco Turini, Fosca
nations and adversarial examples. Minds Mach. (Dordr.) 32, 1 (2022), 77–109. Giannotti, and Dino Pedreschi. 2018. A Survey of Methods for Explaining
[106] Jerome H. Friedman. 2001. Greedy Function Approximation: A Gradient Black Box Models. ACM Comput. Surv. 51, 5, Article 93 (Aug. 2018), 42 pages.
Boosting Machine. The Annals of Statistics 29, 5 (2001), 1189–1232. http: https://doi.org/10.1145/3236009
//www.jstor.org/stable/2699986 [128] Riccardo Guidotti and Salvatore Ruggieri. 2021. Ensemble of Counterfactual
[107] Jörg Frohberg and Frank Binder. 2022. CRASS: A Novel Data Set and Benchmark Explainers. Springer-Verlag, Berlin, Heidelberg, 11. https://doi.org/10.1007/978-
to Test Counterfactual Reasoning of Large Language Models. In Proceedings 3-030-88942-5_28
of the Language Resources and Evaluation Conference. European Language Re- [129] Sadaf Gulshad and Arnold Smeulders. 2021. Counterfactual attribute-based
sources Association, Marseille, France, 2126–2140. https://aclanthology.org/ visual explanations for classification. International Journal of Multimedia Infor-
2022.lrec-1.229 mation Retrieval (2021), 127–140. https://doi.org/10.1007/s13735-021-00208-3
[108] Takanori Fujiwara, Xinhai Wei, Jian Zhao, and Kwan-Liu Ma. 2022. Interactive [130] Hangzhi Guo, Thanh Hong Nguyen, and Amulya Yadav. 2021. CounterNet:
Dimensionality Reduction for Comparative Analysis. IEEE Transactions on End-to-End Training of Counterfactual Aware Predictions. https://doi.org/
Visualization and Computer Graphics (2022), 758–768. https://doi.org/10.1109/ 10.48550/ARXIV.2109.07557
tvcg.2021.3114807 [131] Sharmi Dev Gupta, Begum Genc, and Barry O’Sullivan. 2022. Finding Counter-
[109] Maximilian Förster, Philipp Hühn, Mathias Klier, and Kilian Kluge. 2021. Cap- factual Explanations through Constraint Relaxations. https://doi.org/10.48550/
turing Users’ Reality: A Novel Approach to Generate Coherent Counterfactual ARXIV.2204.03429
Explanations. https://doi.org/10.24251/HICSS.2021.155 [132] Vivek Gupta, Pegah Nokhiz, Chitradeep Dutta Roy, and Suresh Venkatasubra-
[110] Sainyam Galhotra, Romila Pradhan, and Babak Salimi. 2021. Explaining Black- manian. 2019. Equalizing Recourse across Groups.
Box Algorithms Using Probabilistic Contrastive Counterfactuals. In SIGMOD [133] Victor Guyomard, Françoise Fessant, Tassadit Bouadi, and Thomas Guyet. 2021.
’21: International Conference on Management of Data, Virtual Event, China, June Post-hoc counterfactual generation with supervised autoencoder.
20-25, 2021. ACM. https://doi.org/10.1145/3448016.3458455

17
,, Sahil Verma, Varich Boonsanong, Minh Hoang, Keegan E. Hines, John P. Dickerson, and Chirag Shah

[134] Suryabhan Singh Hada and Miguel Á. Carreira-Perpiñán. 2021. Exploring [159] Shalmali Joshi, Oluwasanmi Koyejo, Warut Vijitbenjaronk, Been Kim, and Joy-
Counterfactual Explanations for Classification and Regression Trees. In Ma- deep Ghosh. 2019. Towards Realistic Individual Recourse and Actionable Expla-
chine Learning and Principles and Practice of Knowledge Discovery in Databases. nations in Black-Box Decision Making Systems. http://arxiv.org/abs/1907.09615
Springer International Publishing, Cham, 489–504. [160] Hong-Gyu Jung, Sin-Han Kang, Hee-Dong Kim, Dong-Ok Won, and Seong-
[135] Swastik Haldar, Philips George John, and Diptikalyan Saha. 2021. Reliable Whan Lee. 2020. Counterfactual Explanation Based on Gradual Construction
Counterfactual Explanations for Autoencoder Based Anomalies. In 8th ACM for Deep Networks. https://doi.org/10.48550/ARXIV.2008.01897
IKDD CODS and 26th COMAD. Association for Computing Machinery, New [161] Vassilis Kaffes, Dimitris Sacharidis, and Giorgos Giannopoulos. 2021. Model-
York, NY, USA, 83–91. https://doi.org/10.1145/3430984.3431015 Agnostic Counterfactual Explanations of Recommendations. In Proceedings of
[136] Xing Han and Joydeep Ghosh. 2021. Model-Agnostic Explanations using Mini- the 29th ACM Conference on User Modeling, Adaptation and Personalization
mal Forcing Subsets. In 2021 International Joint Conference on Neural Networks (UMAP ’21). Association for Computing Machinery, New York, NY, USA, 6.
(IJCNN). 1–8. https://doi.org/10.1109/IJCNN52387.2021.9533992 https://doi.org/10.1145/3450613.3456846
[137] Masoud Hashemi and Ali Fathi. 2020. PermuteAttack: Counterfactual Expla- [162] Kaggle. 2012. Give Me Some Credit. https://www.kaggle.com/c/
nation of Machine Learning Credit Scorecards. https://doi.org/10.48550/ GiveMeSomeCredit
ARXIV.2008.10138 [163] D. Kahneman and D. Miller. 1986. Norm Theory: Comparing Reality to Its
[138] Lisa Anne Hendricks, Ronghang Hu, Trevor Darrell, and Zeynep Akata. 2018. Alternatives. Psychological Review 93 (1986), 136–153.
Generating Counterfactual Explanations with Natural Language. https:// [164] Kentaro Kanamori, Takuya Takagi, Ken Kobayashi, and Hiroki Arimura. 2020.
doi.org/10.48550/ARXIV.1806.09809 DACE: Distribution-Aware Counterfactual Explanation by Mixed-Integer Linear
[139] Andreas Henelius, Kai Puolamäki, Henrik Boström, Lars Asker, and Pana- Optimization. In International Joint Conference on Artificial Intelligence (IJCAI).
giotis Papapetrou. 2014. A Peek into the Black Box: Exploring Classifiers California, USA. https://doi.org/10.24963/ijcai.2020/395
by Randomization. Data Min. Knowl. Discov. 28, 5-6 (2014), 27. https: [165] Kentaro Kanamori, Takuya Takagi, Ken Kobayashi, and Yuichi Ike. 2022. Coun-
//doi.org/10.1007/s10618-014-0368-8 terfactual Explanation Trees: Transparent and Consistent Actionable Recourse
[140] Fabian Hinder and Barbara Hammer. 2020. Counterfactual Explanations of with Decision Trees. In Proceedings of The 25th International Conference on Artifi-
Concept Drift. https://doi.org/10.48550/ARXIV.2006.12822 cial Intelligence and Statistics (Proceedings of Machine Learning Research). PMLR,
[141] Fred Hohman, Andrew Head, Rich Caruana, Robert DeLine, and Steven Mark 1846–1870.
Drucker. 2019. Gamut: A Design Probe to Understand How Data Scientists [166] Kentaro Kanamori, Takuya Takagi, Ken Kobayashi, Yuichi Ike, Kento Uemura,
Understand Machine Learning Models. Proceedings of the 2019 CHI Conference and Hiroki Arimura. 2021. Ordered Counterfactual Explanation by Mixed-
on Human Factors in Computing Systems (2019). Integer Linear Optimization. Proceedings of the AAAI Conference on Artificial
[142] Woo Suk Hong, Adrian Daniel Haimovich, and R. Andrew Taylor. 2018. Predict- Intelligence 35, 13 (2021), 11. https://doi.org/10.1609/aaai.v35i13.17376
ing hospital admission at emergency department triage using machine learning. [167] A.-H. Karimi, G. Barthe, B. Balle, and I. Valera. 2020. Model-Agnostic Counterfac-
PLOS ONE 13, 7 (2018). https://doi.org/10.1371/journal.pone.0201016 tual Explanations for Consequential Decisions. http://arxiv.org/abs/1905.11190
[143] The US White House. 2022. Blueprint for an AI bill of rights. https: [168] Amir-Hossein Karimi, Bernhard Schölkopf, and Isabel Valera. 2021. Algorithmic
//www.whitehouse.gov/ostp/ai-bill-of-rights/#discrimination Recourse: From Counterfactual Explanations to Interventions. In Proceedings of
[144] Chihcheng Hsieh, Catarina Moreira, and Chun Ouyang. 2021. DiCE4EL: In- the 2021 ACM Conference on Fairness, Accountability, and Transparency (FAccT
terpreting Process Predictions using a Milestone-Aware Counterfactual Ap- ’21). Association for Computing Machinery, New York, NY, USA, 10. https:
proach. In 2021 3rd International Conference on Process Mining (ICPM). 88–95. //doi.org/10.1145/3442188.3445899
https://doi.org/10.1109/ICPM53251.2021.9576881 [169] Amir-Hossein Karimi, Julius von Kügelgen, Bernhard Schölkopf, and Isabel
[145] Tsung-Hao Huang, Andreas Metzger, and Klaus Pohl. 2022. Counterfactual Valera. 2020. Algorithmic recourse under imperfect causal knowledge: a proba-
Explanations for Predictive Business Process Monitoring. Springer International bilistic approach. http://arxiv.org/abs/2006.06831
Publishing, Cham, 399–413. [170] Isak Karlsson, Jonathan Rebane, Panagiotis Papapetrou, and Aristides Gionis.
[146] Frederik Hvilshøj, Alexandros Iosifidis, and Ira Assent. 2021. ECINN: Efficient 2020. Locally and Globally Explainable Time Series Tweaking. Knowl. Inf. Syst.
Counterfactuals from Invertible Neural Networks. https://doi.org/10.48550/ (2020), 30. https://doi.org/10.1007/s10115-019-01389-4
ARXIV.2103.13701 [171] Atoosa Kasirzadeh and Andrew Smart. 2021. The Use and Misuse of Counter-
[147] Frederik Hvilshøj, Alexandros Iosifidis, and Ira Assent. 2021. On Quantitative factuals in Ethical Machine Learning. In Proceedings of the 2021 ACM Conference
Evaluations of Counterfactuals. https://doi.org/10.48550/ARXIV.2111.00177 on Fairness, Accountability, and Transparency. Association for Computing Ma-
[148] Benedikt Höltgen, Lisa Schut, Jan M. Brauner, and Yarin Gal. 2021. DeDUCE: chinery, New York, NY, USA, 9. https://doi.org/10.1145/3442188.3445886
Generating Counterfactual Explanations Efficiently. https://doi.org/10.48550/ [172] Mark T. Keane, Eoin M. Kenny, Eoin Delaney, and Barry Smyth. 2021. If Only
ARXIV.2111.15639 We Had Better Counterfactual Explanations: Five Key Deficits to Rectify in the
[149] Global Women in Data Science Conference The Global Open Source Severity of Evaluation of Counterfactual XAI Techniques. CoRR (2021). https://arxiv.org/
Illness Score Consortium. 2020. WiDS Datathon 2020. https://www.kaggle.com/ abs/2103.01035
c/widsdatathon2020 [173] Mark T. Keane and Barry Smyth. 2020. Good Counterfactuals and Where to Find
[150] Allstate Insurance. 2011. Allstate Claim Prediction Challenge. https: Them: A Case-Based Technique for Generating Counterfactuals for Explainable
//www.kaggle.com/c/ClaimPredictionChallenge AI (XAI). arXiv:cs.AI/2005.13997
[151] France intelligence artificielle. [n. d.]. RAPPORT DE SYNTHESE FRANCE [174] Eoin M. Kenny and Mark T Keane. 2021. On Generating Plausible Counterfactual
INTELLIGENCE ARTIFICIELLE. https://www.economie.gouv.fr/files/files/PDF/ and Semi-Factual Explanations for Deep Learning. Proceedings of the AAAI
2017/Rapport_synthese_France_IA_.pdf. Accessed: 2020-10-15. Conference on Artificial Intelligence 35 (May 2021), 11. https://ojs.aaai.org/
[152] Jeremy Irvin, Pranav Rajpurkar, Michael Ko, Yifan Yu, Silviana Ciurea-Ilcus, index.php/AAAI/article/view/17377
Chris Chute, Henrik Marklund, Behzad Haghgoo, Robyn Ball, Katie Shpan- [175] Saeed Khorram and Li Fuxin. 2022. Cycle-Consistent Counterfactuals by Latent
skaya, Jayne Seekins, David A. Mong, Safwan S. Halabi, Jesse K. Sandberg, Transformations. In Proceedings of the IEEE/CVF Conference on Computer Vision
Ricky Jones, David B. Larson, Curtis P. Langlotz, Bhavik N. Patel, Matthew P. and Pattern Recognition (CVPR). 10.
Lungren, and Andrew Y. Ng. 2019. CheXpert: A Large Chest Radiograph Dataset [176] Boris Kment. 2006. Counterfactuals and Explanation. Mind 115 (04 2006).
with Uncertainty Labels and Expert Comparison. https://doi.org/10.48550/ https://doi.org/10.1093/mind/fzl261
ARXIV.1901.07031 [177] Will Knight. 2019. The Apple Card Didn’t ’See’ Gender—and That’s the Prob-
[153] Paul Jacob, Éloi Zablocki, Hédi Ben-Younes, Mickaël Chen, Patrick Pérez, and lem. https://www.wired.com/story/the-apple-card-didnt-see-genderand-
Matthieu Cord. [n. d.]. STEEX: Steering Counterfactual Explanations with thats-the-problem/
Semantics. https://doi.org/10.48550/ARXIV.2111.09094 [178] Ramaravind Kommiya Mothilal, Divyat Mahajan, Chenhao Tan, and Amit
[154] Guillaume Jeanneret, Loïc Simon, and Frédéric Jurie. 2022. Diffusion Models for Sharma. 2021. Towards Unifying Feature Attribution and Counterfactual Explana-
Counterfactual Explanations. https://doi.org/10.48550/ARXIV.2203.15636 tions: Different Means to the Same End. Association for Computing Machinery,
[155] Lauren Kirchner Jeff Larson, Surya Mattu and Julia Angwin. 2016. UCI Machine New York, NY, USA.
Learning Repository. https://github.com/propublica/compas-analysis/ [179] Jaehoon Koo, Diego Klabjan, and Jean Utke. 2020. Inverse Classification with
[156] Yan Jia, John McDermid, and Ibrahim Habli. 2021. Enhancing the Value of Limited Budget and Maximum Number of Perturbed Samples. https://doi.org/
Counterfactual Explanations for Deep Learning. In Artificial Intelligence in 10.48550/ARXIV.2009.14111
Medicine. Springer International Publishing, Cham, 389–394. [180] Tara Koopman and Silja Renooij. 2021. Persuasive Contrastive Explanations for
[157] Alistair Johnson, Lucas Bulgarelli, Tom Pollard, Steven Horng, Leo Anthony Bayesian Networks. In Symbolic and Quantitative Approaches to Reasoning with
Celi, and Roger Mark. 2021. MIMIC-IV. https://doi.org/10.13026/S6N6-XD98 Uncertainty. Springer International Publishing, Cham, 229–242.
[158] Kareem L. Jordan and Tina L. Freiburger. 2015. The Effect of Race/Ethnicity [181] Anton Korikov and J. Christopher Beck. 2021. Counterfactual Explana-
on Sentencing: Examining Sentence Type, Jail Length, and Prison Length. tions via Inverse Constraint Programming. In 27th International Conference
Journal of Ethnicity in Criminal Justice 13, 3 (2015). https://doi.org/10.1080/ on Principles and Practice of Constraint Programming (CP 2021) (Leibniz In-
15377938.2014.984045 ternational Proceedings in Informatics (LIPIcs)), Vol. 210. Schloss Dagstuhl
18
Counterfactual Explanations and Algorithmic Recourses for Machine Learning: A Review ,,

– Leibniz-Zentrum für Informatik, Dagstuhl, Germany, 35:1–35:16. https: [205] Scott M Lundberg and Su-In Lee. 2017. A Unified Approach to Interpreting
//doi.org/10.4230/LIPIcs.CP.2021.35 Model Predictions. In Advances in Neural Information Processing Systems 30.
[182] Anton Korikov, Alexander Shleyfman, and J. Christopher Beck. 2021. Counter- Curran Associates, Inc., 4765–4774.
factual Explanations for Optimization-Based Decisions in the Context of the [206] Freddie Mac. 2019. Single family loan-level dataset. https://
GDPR. In Proceedings of the Thirtieth International Joint Conference on Artificial www.freddiemac.com/research/datasets/sf-loanlevel-dataset
Intelligence, IJCAI-21. 4097–4103. https://doi.org/10.24963/ijcai.2021/564 [207] Nishtha Madaan, Inkit Padhi, Naveen Panwar, and Diptikalyan Saha. 2021.
[183] Maxim Kovalev, Lev Utkin, Frank Coolen, and Andrei Konstantinov. 2021. Coun- Generate Your Counterfactuals: Towards Controlled Counterfactual Generation
terfactual Explanation of Machine Learning Survival Models. Informatica 32, 4 for Text. Proceedings of the AAAI Conference on Artificial Intelligence 35 (May
(jan 2021), 817–847. https://doi.org/10.15388/21-INFOR468 2021), 13516–13524. https://ojs.aaai.org/index.php/AAAI/article/view/17594
[184] R. Krishnan, G. Sivakumar, and P. Bhattacharya. 1999. Extracting decision trees [208] Fannie Mae. 2020. Fannie Mae dataset. https://www.fanniemae.com/portal/
from trained neural networks. Pattern Recognition 32, 12 (1999), 1999 – 2009. funding-the-market/data/loan-performance-data.html
https://doi.org/10.1016/S0031-3203(98) 00181-2 [209] Alessandro Magrini, Stefano di Blasi, and Federico Stefanini. 2017. A conditional
[185] Sanjay Krishnan and Eugene Wu. 2017. PALM: Machine Learning Explanations linear Gaussian network to assess the impact of several agronomic settings on
For Iterative Debugging. In Proceedings of the 2nd Workshop on Human-In-the- the quality of Tuscan Sangiovese grapes. Biometrical Letters 54 (06 2017), 25–42.
Loop Data Analytics (HILDA’17). Association for Computing Machinery, New https://doi.org/10.1515/bile-2017-0002
York, NY, USA, Article 4, 6 pages. https://doi.org/10.1145/3077257.3077271 [210] Divyat Mahajan, Chenhao Tan, and Amit Sharma. 2020. Preserving Causal
[186] Ulrike Kuhl, André Artelt, and Barbara Hammer. 2022. Keep Your Friends Constraints in Counterfactual Explanations for Machine Learning Classifiers.
Close and Your Counterfactuals Closer: Improved Learning From Closest Rather http://arxiv.org/abs/1912.03277
Than Plausible Counterfactual Explanations in an Abstract Setting. ArXiv [211] Guilherme F Marchezini, Anisio M Lacerda, Gisele L Pappa, Wagner Meira, Jr,
abs/2205.05515 (2022). Debora Miranda, Marco A Romano-Silva, Danielle S Costa, and Leandro Malloy
[187] Gunnar König, Timo Freiesleben, and Moritz Grosse-Wentrup. 2021. A Causal Diniz. 2022. Counterfactual inference with latent variable and its application in
Perspective on Meaningful and Robust Algorithmic Recourse. https://doi.org/ mental health care. Data Min. Knowl. Discov. 36, 2 (Jan. 2022), 811–840.
10.48550/ARXIV.2107.07853 [212] David Martens and Foster J. Provost. 2014. Explaining Data-Driven Document
[188] Jokin Labaien, Ekhi Zugasti, and Xabier De Carlos. 2021. DA-DGCEx: Ensuring Classifications. MIS Q. 38 (2014), 73–99.
Validity of Deep Guided Counterfactual Explanations With Distribution-Aware [213] Raphael Mazzine, Sofie Goethals, Dieter Brughmans, and David Martens. 2021.
Autoencoder Loss. https://doi.org/10.48550/ARXIV.2104.09062 Counterfactual Explanations for Employment Services. In International workshop
[189] Michael T. Lash, Qihang Lin, William Nick Street, Jennifer G. Robinson, and on Fair, Effective And Sustainable Talent management using data science. 1–7.
Jeffrey W. Ohlmann. 2017. Generalized Inverse Classification. In SDM. Society [214] Raphael Mazzine and David Martens. 2021. A Framework and Benchmarking
for Industrial and Applied Mathematics, Philadelphia, PA, USA, 162–170. Study for Counterfactual Generating Methods on Tabular Data. https://doi.org/
[190] Thibault Laugel, Marie-Jeanne Lesot, Christophe Marsala, and Marcin De- 10.48550/ARXIV.2107.04680
tyniecki. 2019. Issues with post-hoc counterfactual explanations: a discussion. [215] Marcos Medeiros Raimundo, Luis Nonato, and Jorge Poco. 2021. Mining
arXiv:1906.04774 Pareto-Optimal Counterfactual Antecedents With A Branch-And-Bound Model-
[191] Thibault Laugel, Marie-Jeanne Lesot, Christophe Marsala, Xavier Renard, and Agnostic Algorithm. https://doi.org/10.21203/rs.3.rs-551661/v1
Marcin Detyniecki. 2018. Comparison-Based Inverse Classification for Inter- [216] Md Golam Moula Mehedi Hasan and Douglas Talbert. 2022. Data Augmentation
pretability in Machine Learning. In Information Processing and Management using Counterfactuals: Proximity vs Diversity. The International FLAIRS Confer-
of Uncertainty in Knowledge-Based Systems, Theory and Foundations (IPMU). ence Proceedings 35 (May 2022). https://doi.org/10.32473/flairs.v35i.130705
Springer International Publishing. https://doi.org/10.1007/978-3-319-91473-2_9 [217] Silvan Mertes, Tobias Huber, Katharina Weitz, Alexander Heimerl, and Elisabeth
[192] Thibault Laugel, Marie-Jeanne Lesot, Christophe Marsala, Xavier Renard, and André. 2022. GANterfactual—Counterfactual Explanations for Medical Non-
Marcin Detyniecki. 2019. The Dangers of Post-hoc Interpretability: Unjustified experts Using Generative Adversarial Learning. Frontiers in Artificial Intelligence
Counterfactual Explanations. http://arxiv.org/abs/1907.09294 5 (2022). https://doi.org/10.3389/frai.2022.825565
[193] Thai Le, Suhang Wang, and Dongwon Lee. 2019. GRACE: Generating Con- [218] Tim Miller. 2019. Explanation in artificial intelligence: Insights from the so-
cise and Informative Contrastive Sample to Explain Neural Network Model’s cial sciences. Artificial Intelligence (Feb 2019), 1–38. https://doi.org/10.1016/
Prediction. arXiv:cs.LG/1911.02042 j.artint.2018.07.007
[194] Yann LeCun and Corinna Cortes. 2010. MNIST handwritten digit database. [219] Saumitra Mishra, Sanghamitra Dutta, Jason Long, and Daniele Magazzeni. 2021.
http://yann.lecun.com/exdb/mnist/. (2010). http://yann.lecun.com/exdb/mnist/ A Survey on the Robustness of Feature Importance and Counterfactual Expla-
[195] Carson K. Leung, Adam G.M. Pazdor, and Joglas Souza. 2021. Explainable nations. https://doi.org/10.48550/ARXIV.2111.00358
Artificial Intelligence for Data Science on Customer Churn. In 2021 IEEE 8th [220] Takayuki Miura, Satoshi Hasegawa, and Toshiki Shibahara. 2021. MEGEX: Data-
International Conference on Data Science and Advanced Analytics (DSAA). 1–10. Free Model Extraction Attack against Gradient-Based Explainable AI. ArXiv
https://doi.org/10.1109/DSAA53316.2021.9564166 abs/2107.08909 (2021).
[196] David Lewis. 1973. Counterfactuals. Blackwell Publishers, Oxford. [221] Kiarash Mohammadi, Amir-Hossein Karimi, Gilles Barthe, and Isabel Valera.
[197] Dan Ley, Saumitra Mishra, and Daniele Magazzeni. 2022. Global Counterfac- 2021. Scaling Guarantees for Nearest Counterfactual Explanations. In Pro-
tual Explanations: Investigations, Implementations and Improvements. In ICLR ceedings of the 2021 AAAI/ACM Conference on AI, Ethics, and Society. Asso-
Workshop on Privacy, Accountability, Interpretability, Robustness, Reasoning on ciation for Computing Machinery, New York, NY, USA, 177–187. https:
Structured Data. //doi.org/10.1145/3461702.3462514
[198] Yan Li, Shasha Liu, Chunwei Wu, Xidong Xi, Guitao Cao, and Wenming Cao. [222] Wellington Rodrigo Monteiro and Gilberto Reynoso-Meza. 2022. Counterfactual
2021. DCFG: Discovering Directional CounterFactual Generation for Chest Generation Through Multi-objective Constrained Optimisation. (2022), 23.
X-rays. In 2021 IEEE International Conference on Bioinformatics and Biomedicine https://doi.org/10.21203/rs.3.rs-1325730/v1
(BIBM). 972–979. https://doi.org/10.1109/BIBM52615.2021.9669770 [223] Sérgio Moro, Paulo Cortez, and Paulo Rita. 2014. A data-driven approach to
[199] Shusen Liu, Bhavya Kailkhura, Donald Loveland, and Yong Han. 2019. Gen- predict the success of bank telemarketing. Decision Support Systems 62 (2014),
erative Counterfactual Introspection for Explainable Deep Learning. In 2019 22–31. https://doi.org/10.1016/j.dss.2014.03.001
IEEE Global Conference on Signal and Information Processing (GlobalSIP). 1–5. [224] Ramaravind K. Mothilal, Amit Sharma, and Chenhao Tan. 2020. Explaining
https://doi.org/10.1109/GlobalSIP45357.2019.8969491 Machine Learning Classifiers through Diverse Counterfactual Explanations.
[200] Ziwei Liu, Ping Luo, Xiaogang Wang, and Xiaoou Tang. 2014. Deep Learning In Proceedings of the Conference on Fairness, Accountability, and Transparency
Face Attributes in the Wild. (11 2014). https://doi.org/10.1109/ICCV.2015.425 (FAccT) (FAT* ’20). Association for Computing Machinery, New York, NY, USA.
[201] Ana Lucic, Hinda Haned, and Maarten de Rijke. 2020. Why Does My Model https://doi.org/10.1145/3351095.3372850
Fail? Contrastive Local Explanations for Retail Forecasting. In Proceedings of [225] Susanne G. Mueller, Michael W. Weiner, Leon J. Thal, Ronald C. Petersen, Clifford
the 2020 Conference on Fairness, Accountability, and Transparency (FAT* ’20). Jack, William Jagust, John Q. Trojanowski, Arthur W. Toga, and Laurel Beckett.
Association for Computing Machinery, New York, NY, USA, 9. https://doi.org/ 2008. Alzheimer’s Disease Neuroimaging Initiative. In Advances in Alzheimer’s
10.1145/3351095.3372824 and Parkinson’s Disease. Springer US, Boston, MA, 183–189.
[202] Ana Lucic, Harrie Oosterhuis, Hinda Haned, and Maarten de Rijke. 2019. FOCUS: [226] Chelsea M. Myers, Evan Freed, Luis Fernando Laris Pardo, Anushay Furqan,
Flexible Optimizable Counterfactual Explanations for Tree Ensembles. https: Sebastian Risi, and Jichen Zhu. 2020. Revealing Neural Network Bias to
//doi.org/10.48550/ARXIV.1911.12199 Non-Experts Through Interactive Counterfactual Examples. https://doi.org/
[203] Ana Lucic, Harrie Oosterhuis, Hinda Haned, and Maarten de Rijke. 2020. Ac- 10.48550/ARXIV.2001.02271
tionable Interpretability through Optimizable Counterfactual Explanations for [227] Philip Naumann and Eirini Ntoutsi. 2021. Consequence-aware Sequential Coun-
Tree Ensembles. http://arxiv.org/abs/1911.12199 terfactual Generation. arXiv:cs.LG/2104.05592
[204] Ana Lucic, Maartje ter Hoeve, Gabriele Tolomei, Maarten de Rijke, and Fab- [228] Guillermo Navas-Palencia. 2021. Optimal Counterfactual Explanations for
rizio Silvestri. 2021. CF-GNNExplainer: Counterfactual Explanations for Graph Scorecard modelling. https://arxiv.org/abs/2104.08619
Neural Networks. arXiv:cs.LG/2102.03322
19
,, Sahil Verma, Varich Boonsanong, Minh Hoang, Keegan E. Hines, John P. Dickerson, and Chirag Shah

[229] Daniel Nemirovsky, Nicolas Thiebaut, Ye Xu, and Abhishek Gupta. 2021. Provid- https://doi.org/10.1007/s11634-020-00418-3
ing Actionable Feedback in Hiring Marketplaces Using Generative Adversarial [252] Peyman Rasouli and Ingrid Chieh Yu. 2022. CARE: Coherent actionable recourse
Networks. In Proceedings of the 14th ACM International Conference on Web Search based on sound counterfactual explanations. International Journal of Data Science
and Data Mining. Association for Computing Machinery, New York, NY, USA, 4. and Analytics (2022), 1–26.
https://doi.org/10.1145/3437963.3441705 [253] Peyman Rasouli and Ingrid Chieh Yu. 2021. Analyzing and Improving the Ro-
[230] Daniel Nemirovsky, Nicolas Thiebaut, Ye Xu, and Abhishek Gupta. 2022. Coun- bustness of Tabular Classifiers using Counterfactual Explanations. In 2021 20th
teRGAN: Generating counterfactuals for real-time recourse and interpretability IEEE International Conference on Machine Learning and Applications (ICMLA).
using residual GANs. In Proceedings of the Thirty-Eighth Conference on Uncer- 1286–1293. https://doi.org/10.1109/ICMLA52953.2021.00209
tainty in Artificial Intelligence (Proceedings of Machine Learning Research). PMLR, [254] Shubham Rathi. 2019. Generating Counterfactual and Contrastive Explanations
1488–1497. https://proceedings.mlr.press/v180/nemirovsky22a.html using SHAP. http://arxiv.org/abs/1906.09293 arXiv: 1906.09293.
[231] Tri Minh Nguyen, Thomas P Quinn, Thin Nguyen, and Truyen Tran. 2021. [255] Shauli Ravfogel, Grusha Prasad, Tal Linzen, and Yoav Goldberg. 2021. Counter-
Counterfactual Explanation with Multi-Agent Reinforcement Learning for Drug factual Interventions Reveal the Causal Effect of Relative Clause Representations
Target Prediction. arXiv:cs.AI/2103.12983 on Agreement Prediction. In Proceedings of the 25th Conference on Computational
[232] Danilo Numeroso and Davide Bacciu. 2021. MEG: Generating Molecular Coun- Natural Language Learning. Association for Computational Linguistics, 194–209.
terfactual Explanations for Deep Graph Networks. https://doi.org/10.18653/v1/2021.conll-1.15
[233] Andrew O’Brien and Edward Kim. 2021. Multi-Agent Algorithmic Recourse. [256] Ambareesh Ravi, Xiaozhuo Yu, Iara Santelices, Fakhri Karray, and Baris Fidan.
https://doi.org/10.48550/ARXIV.2110.00673 2021. General Frameworks for Anomaly Detection Explainability: Comparative
[234] House of Commons. [n. d.]. Algorithms in decision making. https:// Study. In 2021 IEEE International Conference on Autonomous Systems (ICAS). 1–5.
publications.parliament.uk/pa/cm201719/cmselect/cmsctech/351/351.pdf. Ac- https://doi.org/10.1109/ICAS49788.2021.9551129
cessed: 2020-10-15. [257] Kaivalya Rawal, Ece Kamar, and Himabindu Lakkaraju. 2021. Algorithmic
[235] Kwanseok Oh, Jee Seok Yoon, and Heung-Il Suk. 2020. Born Identity Network: Recourse in the Wild: Understanding the Impact of Data and Model Shifts.
Multi-way Counterfactual Map Generation to Explain a Classifier’s Decision. arXiv:cs.LG/2012.11788
https://doi.org/10.48550/ARXIV.2011.10381 [258] Kaivalya Rawal and Himabindu Lakkaraju. 2020. Beyond Individualized
[236] Kwanseok Oh, Jee Seok Yoon, and Heung-Il Suk. 2021. Learn-Explain-Reinforce: Recourse: Interpretable and Interactive Summaries of Actionable Recourses.
Counterfactual Reasoning and Its Guidance to Reinforce an Alzheimer’s Disease In Advances in Neural Information Processing Systems, Vol. 33. Curran As-
Diagnosis Model. https://doi.org/10.48550/ARXIV.2108.09451 sociates, Inc., 12187–12198. https://proceedings.neurips.cc/paper/2020/file/
[237] Matthew L. Olson, Roli Khanna, Lawrence Neal, Fuxin Li, and Weng-Keen 8ee7730e97c67473a424ccfeff49ab20-Paper.pdf
Wong. 2021. Counterfactual state explanations for reinforcement learning [259] Annabelle Redelmeier, Martin Jullum, Kjersti Aas, and Anders Løland. 2021.
agents via generative deep learning. Artificial Intelligence 295 (2021), 103455. MCCE: Monte Carlo sampling of realistic counterfactual explanations. https:
https://doi.org/10.1016/j.artint.2021.103455 //doi.org/10.48550/ARXIV.2111.09790
[238] Axel Parmentier and Thibaut Vidal. 2021. Optimal Counterfactual Explanations [260] Chris Reed, Keri Grieman, and Joseph Early. 2021. Non-Asimov Explanations
in Tree Ensembles. https://arxiv.org/abs/2106.06631 Regulating AI Through Transparency. In Queen Mary Law Research Paper No.
[239] Martin Pawelczyk, Chirag Agarwal, Shalmali Joshi, Sohini Upadhyay, and 370/2021. https://ssrn.com/abstract=3970518
Himabindu Lakkaraju. 2022. Exploring Counterfactual Explanations Through [261] Marco Tulio Ribeiro, Sameer Singh, and Carlos Guestrin. 2016. "Why Should I
the Lens of Adversarial Examples: A Theoretical and Empirical Analysis. In Trust You?": Explaining the Predictions of Any Classifier. In Proceedings of the
Proceedings of The 25th International Conference on Artificial Intelligence and 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data
Statistics (Proceedings of Machine Learning Research), Vol. 151. PMLR, 4574–4594. Mining (KDD ’16). Association for Computing Machinery, New York, NY, USA,
https://proceedings.mlr.press/v151/pawelczyk22a.html 10. https://doi.org/10.1145/2939672.2939778
[240] Martin Pawelczyk, Sascha Bielawski, Johannes van den Heuvel, Tobias Richter, [262] Marco Tulio Ribeiro, Sameer Singh, and Carlos Guestrin. 2018. Anchors: High-
and Gjergji Kasneci. 2021. CARLA: A Python Library to Benchmark Algorithmic Precision Model-Agnostic Explanations. In Conference on Artificial Intelligence
Recourse and Counterfactual Explanation Algorithms. arXiv:cs.LG/2108.00783 (AAAI). AAAI press, California, USA, 9. https://www.aaai.org/ocs/index.php/
[241] Martin Pawelczyk, Klaus Broelemann, and Gjergji. Kasneci. 2020. On Coun- AAAI/AAAI18/paper/view/16982
terfactual Explanations under Predictive Multiplicity. In Proceedings of Ma- [263] Marcel Robeer, Floris Bex, and Ad Feelders. 2021. Generating Realistic Natural
chine Learning Research. PMLR, Virtual, 9. http://proceedings.mlr.press/v124/ Language Counterfactuals. In Findings of the Association for Computational
pawelczyk20a.html Linguistics: EMNLP 2021. Association for Computational Linguistics, Punta Cana,
[242] Martin Pawelczyk, Teresa Datta, Johannes van-den Heuvel, Gjergji Kasneci, Dominican Republic, 3611–3625. https://doi.org/10.18653/v1/2021.findings-
and Himabindu Lakkaraju. 2022. Probabilistically Robust Recourse: Navigating emnlp.306
the Trade-offs between Costs and Robustness in Algorithmic Recourse. https: [264] Pau Rodriguez, Massimo Caccia, Alexandre Lacoste, Lee Zamparo, Issam Laradji,
//doi.org/10.48550/ARXIV.2203.06768 Laurent Charlin, and David Vazquez. 2021. Beyond Trivial Counterfactual
[243] Martin Pawelczyk, Johannes Haug, Klaus Broelemann, and Gjergji Kasneci. Explanations with Diverse Valuable Explanations. https://doi.org/10.48550/
2020. Learning Model-Agnostic Counterfactual Explanations for Tabular Data. , ARXIV.2103.10226
3126–3132 pages. https://doi.org/10.1145/3366423.3380087 [265] Alexis Ross, Himabindu Lakkaraju, and Osbert Bastani. 2021. Learning Models
[244] Judea Pearl. 2000. Causality: Models, Reasoning, and Inference. Cambridge for Actionable Recourse. In Advances in Neural Information Processing Systems,
University Press, USA. Vol. 34. Curran Associates, Inc., 18734–18746. https://proceedings.neurips.cc/
[245] Tejaswini Pedapati, Avinash Balakrishnan, Karthikeyan Shanmugan, and Amit paper/2021/file/9b82909c30456ac902e14526e63081d4-Paper.pdf
Dhurandhar. 2020. Learning Global Transparent Models Consistent with Local [266] David-Hillel Ruben. 1992. Counterfactuals. Routledge Publishers. https://
Contrastive Explanations. In Proceedings of the 34th International Conference on philarchive.org/archive/RUBEE-3
Neural Information Processing Systems (NIPS’20). Curran Associates Inc., Red [267] Chris Russell. 2019. Efficient Search for Diverse Coherent Explanations. In Pro-
Hook, NY, USA, 11. ceedings of the Conference on Fairness, Accountability, and Transparency (FAccT)
[246] Oana-Iuliana Popescu, Maha Shadaydeh, and Joachim Denzler. 2021. Counterfac- (FAT* ’19). Association for Computing Machinery, New York, NY, USA, 20–28.
tual Generation with Knockoffs. https://doi.org/10.48550/ARXIV.2102.00951 https://doi.org/10.1145/3287560.3287569
[247] Rafael Poyiadzi, Kacper Sokol, Raul Santos-Rodriguez, Tijl De Bie, and Peter [268] Sophie Sadler, Derek Greene, and Daniel W. Archambault. 2021. A Study of
Flach. 2020. FACE: Feasible and Actionable Counterfactual Explanations. , Explainable Community-Level Features. In GEM: Graph Embedding and Mining
344–350 pages. https://doi.org/10.1145/3375627.3375850 arXiv: 1909.09369. ECML-PKDD 2021 Workshop+Tutorial.
[248] Mario Alfonso Prado-Romero, Bardh Prenkaj, Giovanni Stilo, and Fosca Gi- [269] Surya Shravan Kumar Sajja, Sumanta Mukherjee, Satyam Dwivedi, and Vikas C.
annotti. 2022. A Survey on Graph Counterfactual Explanations: Definitions, Raykar. 2021. Semi-supervised counterfactual explanations. https://
Methods, Evaluation. https://doi.org/10.48550/ARXIV.2210.12089 openreview.net/forum?id=o6ndFLB1DST
[249] Wenting Qi and Charalampos Chelmis. 2021. Improving Algorithmic Deci- [270] Robert-Florian Samoilescu, Arnaud Van Looveren, and Janis Klaise. 2021. Model-
sion–Making in the Presence of Untrustworthy Training Data. In 2021 IEEE agnostic and Scalable Counterfactual Explanations via Reinforcement Learning.
International Conference on Big Data (Big Data). 1102–1108. https://doi.org/ https://doi.org/10.48550/ARXIV.2106.02597
10.1109/BigData52589.2021.9671677 [271] Pedro Sanchez and Sotirios A. Tsaftaris. 2022. Diffusion Causal Models for
[250] Goutham Ramakrishnan, Y. C. Lee, and Aws Albarghouthi. 2020. Synthesizing Counterfactual Estimation. https://doi.org/10.48550/ARXIV.2202.10166
Action Sequences for Modifying Model Decisions. In Conference on Artificial [272] Maximilian Schleich, Zixuan Geng, Yihong Zhang, and Dan Suciu. 2021. GeCo:
Intelligence (AAAI). AAAI press, California, USA, 16. http://arxiv.org/abs/ Quality Counterfactual Explanations in Real Time. arXiv:cs.LG/2101.01292
1910.00057 [273] Lisa Schut, Oscar Key, Rory McGrath, Luca Costabello, Bogdan Sacaleanu, Medb
[251] Yanou Ramon, David Martens, Foster Provost, and Theodoros Evgeniou. 2020. Corcoran, and Yarin Gal. 2021. Generating Interpretable Counterfactual Ex-
A Comparison of Instance-Level Counterfactual Explanation Algorithms for planations By Implicit Minimisation of Epistemic and Aleatoric Uncertainties.
Behavioral and Textual Data: SEDC, LIME-C and SHAP-C. 14, 4 (2020), 801–819. https://doi.org/10.48550/ARXIV.2103.08951
20
Counterfactual Explanations and Algorithmic Recourses for Machine Learning: A Review ,,

[274] R. R. Selvaraju, M. Cogswell, A. Das, R. Vedantam, D. Parikh, and D. Batra. [295] Jason Tashea. 2017. Courts Are Using AI to Sentence Criminals. That Must Stop
2017. Grad-CAM: Visual Explanations from Deep Networks via Gradient-Based Now. https://www.wired.com/2017/04/courts-using-ai-sentence-criminals-
Localization. In IEEE International Conference on Computer Vision. 618–626. must-stop-now/. Accessed: 2020-10-15.
[275] Kumba Sennaar. 2019. Machine Learning for Recruiting and Hiring – 6 Current [296] Mohammed Temraz and Mark T. Keane. 2021. Solving the Class Imbalance
Applications. https://emerj.com/ai-sector-overviews/machine-learning-for- Problem Using a Counterfactual Method for Data Augmentation. https://
recruiting-and-hiring/. Accessed: 2020-10-15. doi.org/10.48550/ARXIV.2111.03516
[276] Ruoxi Shang, K. J. Kevin Feng, and Chirag Shah. 2022. Why Am I Not Seeing [297] Mohammed Temraz, Eoin M. Kenny, Elodie Ruelle, Laurence Shalloo, Barry
It? Understanding Users’ Needs for Counterfactual Explanations in Everyday Smyth, and Mark T. Keane. 2021. Handling Climate Change Using Counterfactu-
Recommendations. In 2022 ACM Conference on Fairness, Accountability, and als: Using Counterfactuals in Data Augmentation to Predict Crop Growth in an
Transparency (FAccT ’22). Association for Computing Machinery, New York, NY, Uncertain Climate Future. In Case-Based Reasoning Research and Development.
USA, 11. https://doi.org/10.1145/3531146.3533189 Springer International Publishing, Cham, 216–231.
[277] Xiaoting Shao and Kristian Kersting. 2022. Gradient-based Counterfactual [298] T. Teofili, D. Firmani, N. Koudas, V. Martello, P. Merialdo, and D. Srivastava.
Explanations using Tractable Probabilistic Models. https://doi.org/10.48550/ 2022. Effective Explanations for Entity Resolution Models. In 2022 IEEE 38th
ARXIV.2205.07774 International Conference on Data Engineering (ICDE). IEEE Computer Society, Los
[278] Shubham Sharma, Jette Henderson, and Joydeep Ghosh. 2019. CERTIFAI: Coun- Alamitos, CA, USA, 2709–2721. https://doi.org/10.1109/ICDE53745.2022.00248
terfactual Explanations for Robustness, Transparency, Interpretability, and Fair- [299] Jayaraman Thiagarajan, Vivek Sivaraman Narayanaswamy, Deepta Ra-
ness of Artificial Intelligence models. http://arxiv.org/abs/1905.07857 jan, Jia Liang, Akshay Chaudhari, and Andreas Spanias. 2021. De-
[279] Reza Shokri, Martin Strobel, and Yair Zick. 2021. On the Privacy Risks of Model signing Counterfactual Generators using Deep Model Inversion. In Ad-
Explanations. In Proceedings of the 2021 AAAI/ACM Conference on AI, Ethics, vances in Neural Information Processing Systems, Vol. 34. Curran Asso-
and Society. Association for Computing Machinery, New York, NY, USA, 11. ciates, Inc., 16873–16884. https://proceedings.neurips.cc/paper/2021/file/
https://doi.org/10.1145/3461702.3462533 8ca01ea920679a0fe3728441494041b9-Paper.pdf
[280] Ronal Rajneshwar Singh, Paul Dourish, Piers Howe, Tim Miller, Liz Sonenberg, [300] Erico Tjoa and Cuntai Guan. 2019. A Survey on Explainable Artificial Intelligence
Eduardo Velloso, and Frank Vetere. 2021. Directive Explanations for Actionable (XAI): Towards Medical XAI. arXiv:cs.LG/1907.07374
Explainability in Machine Learning Applications. [301] George Tolkachev, Stephen Mell, Stephan Zdancewic, and Osbert Bastani. 2022.
[281] Saurav Singla. 2020. Machine Learning to Predict Credit Risk in Lending In- Counterfactual Explanations for Natural Language Interfaces. In Proceedings of
dustry. https://www.aitimejournal.com/@saurav.singla/machine-learning-to- the 60th Annual Meeting of the Association for Computational Linguistics (Volume
predict-credit-risk-in-lending-industry. Accessed: 2020-10-15. 2: Short Papers). Association for Computational Linguistics, Dublin, Ireland,
[282] Dylan Slack, Sophie Hilgard, Himabindu Lakkaraju, and Sameer Singh. 2021. 113–118. https://aclanthology.org/2022.acl-short.14
Counterfactual Explanations Can Be Manipulated. arXiv:cs.LG/2106.02666 [302] Gabriele Tolomei, Fabrizio Silvestri, Andrew Haines, and Mounia Lalmas. 2017.
[283] J. W. Smith, J. Everhart, W. C. Dickson, W. Knowler, and R. Johannes. 1988. Interpretable Predictions of Tree-Based Ensembles via Actionable Feature
Using the ADAP Learning Algorithm to Forecast the Onset of Diabetes Mellitus. Tweaking. In International Conference on Knowledge Discovery and Data Mining
In Proceedings of the Annual Symposium on Computer Application in Medical (KDD) (KDD ’17). Association for Computing Machinery, New York, NY, USA,
Care. American Medical Informatics Association, Washington,D.C., 261–265. 10. https://doi.org/10.1145/3097983.3098039
[284] Simón C. Smith and Subramanian Ramamoorthy. 2020. Counterfactual Ex- [303] Khanh Hiep Tran, Azin Ghazimatin, and Rishiraj Saha Roy. 2021. Counterfactual
planation and Causal Inference In Service of Robustness in Robot Control. Explanations for Neural Recommenders. Association for Computing Machinery,
In 2020 Joint IEEE 10th International Conference on Development and Learn- New York, NY, USA, 1627–1631. https://doi.org/10.1145/3404835.3463005
ing and Epigenetic Robotics (ICDL-EpiRob). 1–8. https://doi.org/10.1109/ICDL- [304] Maria Tsiakmaki and Omiros Ragos. 2021. A Case Study of Interpretable Counter-
EpiRob48136.2020.9278061 factual Explanations for the Task of Predicting Student Academic Performance.
[285] Kacper Sokol and Peter Flach. 2018. Glass-Box: Explaining AI Decisions with In 2021 25th International Conference on Circuits, Systems, Communications and
Counterfactual Statements through Conversation with a Voice-Enabled Virtual Computers (CSCC). https://doi.org/10.1109/CSCC53858.2021.00029
Assistant. In Proceedings of the 27th International Joint Conference on Artificial [305] Stratis Tsirtsis, Abir De, and Manuel Rodriguez. 2021. Counterfactual
Intelligence (IJCAI’18). AAAI Press, 5868–5870. Explanations in Sequential Decision Making Under Uncertainty. In Ad-
[286] Kacper Sokol and Peter Flach. 2019. Desiderata for Interpretability: Explaining vances in Neural Information Processing Systems, Vol. 34. Curran Asso-
Decision Tree Predictions with Counterfactuals. Conference on Artificial Intelli- ciates, Inc., 30127–30139. https://proceedings.neurips.cc/paper/2021/file/
gence (AAAI) 33 (July 2019). https://doi.org/10.1609/aaai.v33i01.330110035 fd0a5a5e367a0955d81278062ef37429-Paper.pdf
[287] Thomas Spooner, Danial Dervovic, Jason Long, Jon Shepard, Jiahao Chen, and [306] Stratis Tsirtsis and Manuel Gomez-Rodriguez. 2020. Decisions, Counterfactual
Daniele Magazzeni. 2021. Counterfactual Explanations for Arbitrary Regression Explanations and Strategic Behavior. arXiv:cs.LG/2002.04333
Models. [307] Ryan Turner. 2016. A Model Explanation System: Latest Updates and Extensions.
[288] Laura State. 2021. Logic Programming for XAI: A Technical Perspective. In ArXiv abs/1606.09517 (2016).
Proceedings of the International Conference on Logic Programming 2021 Workshops [308] AALTO UNIVERSITY. [n. d.]. The European Commission offers significant
(ICLP 2021), Vol. 2970. http://ceur-ws.org/Vol-2970/meepaper1.pdf support to Europe’s AI excellence. https://www.eurekalert.org/pub_releases/
[289] Gregory Stein. 2021. Generating High-Quality Explanations for Navigation 2020-03/au-tec031820.php. Accessed: 2020-10-15.
in Partially-Revealed Environments. In Advances in Neural Information [309] Sohini Upadhyay, Shalmali Joshi, and Himabindu Lakkaraju. 2021. Towards
Processing Systems, Vol. 34. Curran Associates, Inc., 17493–17506. https: Robust and Reliable Algorithmic Recourse. arXiv:cs.LG/2102.13620
//proceedings.neurips.cc/paper/2021/file/926ec030f29f83ce5318754fdb631a33- [310] Berk Ustun, Alexander Spangher, and Yang Liu. 2019. Actionable Recourse in
Paper.pdf Linear Classification. In Proceedings of the Conference on Fairness, Accountability,
[290] Deborah Sulem, Michele Donini, Muhammad Bilal Zafar, Francois-Xavier Aubet, and Transparency (FAccT) (FAT* ’19). Association for Computing Machinery,
Jan Gasthaus, Tim Januschowski, Sanjiv Das, Krishnaram Kenthapadi, and New York, NY, USA, 10. https://doi.org/10.1145/3287560.3287566
Cedric Archambeau. 2022. Diverse Counterfactual Explanations for Anomaly [311] Arnaud Van Looveren and Janis Klaise. 2020. Interpretable Counterfactual
Detection in Time Series. https://doi.org/10.48550/ARXIV.2203.11103 Explanations Guided by Prototypes. http://arxiv.org/abs/1907.02584
[291] Ezzeldin Tahoun and Andre Kassis. 2020. Beyond Explanations: Recourse [312] Arnaud Van Looveren, Janis Klaise, Giovanni Vacanti, and Oliver Cobb. 2021.
via Actionable Interpretability - Extended. https://doi.org/10.13140/ Conditional Generative Models for Counterfactual Explanations. https://
RG.2.2.19076.14729 doi.org/10.48550/ARXIV.2101.10123
[292] Paolo Tamagnini, Josua Krause, Aritra Dasgupta, and Enrico Bertini. 2017. In- [313] Simon Vandenhende, Dhruv Mahajan, Filip Radenovic, and Deepti Ghadiyaram.
terpreting Black-Box Classifiers Using Instance-Level Visual Explanations. In 2022. Making Heads or Tails: Towards Semantically Consistent Visual Counter-
Proceedings of the 2nd Workshop on Human-In-the-Loop Data Analytics. Associa- factuals. In ECCV 2022.
tion for Computing Machinery, New York, NY, USA, 6. https://doi.org/10.1145/ [314] Sahil Verma, John Dickerson, and Keegan Hines. 2020. Counterfactual Ex-
3077257.3077260 planations for Machine Learning: A Review. https://doi.org/10.48550/
[293] Juntao Tan, Shuyuan Xu, Yingqiang Ge, Yunqi Li, Xu Chen, and Yongfeng ARXIV.2010.10596
Zhang. 2021. Counterfactual Explainable Recommendation. In Proceedings of [315] Sahil Verma, John Dickerson, and Keegan Hines. 2021. Counterfactual Explana-
the 30th ACM International Conference on Information & Knowledge Management. tions for Machine Learning: Challenges Revisited. https://doi.org/10.48550/
Association for Computing Machinery, New York, NY, USA, 10. ARXIV.2106.07756
[294] Sarah Tan, Rich Caruana, Giles Hooker, and Yin Lou. 2018. Distill-and- [316] Sahil Verma, Keegan Hines, and John P. Dickerson. 2021. Amortized Gen-
Compare: Auditing Black-Box Models Using Transparent Model Distillation. eration of Sequential Counterfactual Explanations for Black-box Models.
In Proceedings of the 2018 AAAI/ACM Conference on AI, Ethics, and Society arXiv:cs.LG/2106.03962
(AIES ’18). Association for Computing Machinery, New York, NY, USA, 8. [317] Sahil Verma and Julia Rubin. 2018. Fairness Definitions Explained. In Proceedings
https://doi.org/10.1145/3278721.3278725 of the International Workshop on Software Fairness (FairWare ’18). Association
for Computing Machinery, New York, NY, USA, 1–7. https://doi.org/10.1145/
21
,, Sahil Verma, Varich Boonsanong, Minh Hoang, Keegan E. Hines, John P. Dickerson, and Chirag Shah

3194770.3194776 [340] Xintao Xiang and Artem Lenskiy. 2022. Realistic Counterfactual Explanations
[318] Tom Vermeire, Dieter Brughmans, Sofie Goethals, Raphael Mazzine Barbossa by Learned Relations.
de Oliveira, and David Martens. [n. d.]. Explainable Image Classification with [341] Shuyuan Xu, Yunqi Li, Shuchang Liu, Zuohui Fu, and Yongfeng Zhang. 2020.
Evidence Counterfactual. Pattern Anal. Appl. 25, 2 ([n. d.]), 21. https://doi.org/ Learning Post-Hoc Causal Explanations for Recommendation.
10.1007/s10044-021-01055-y [342] Yaniv Yacoby, Ben Green, Christopher L. Griffin, and Finale Doshi Velez. 2022.
[319] CÉDRIC VILLANI. [n. d.]. FOR A MEANINGFUL ARTIFICIAL INTELLIGENCE. "If it didn’t happen, why would I change my decision?": How Judges Respond
https://www.aiforhumanity.fr/pdfs/MissionVillani_Report_ENG-VF.pdf. Ac- to Counterfactual Explanations for the Public Safety Assessment. https:
cessed: 2020-10-15. //doi.org/10.48550/ARXIV.2205.05424
[320] Marco Virgolin and Saverio Fracaros. 2022. On the Robustness of Sparse Coun- [343] Prateek Yadav, Peter Hase, and Mohit Bansal. 2021. Low-Cost Algorithmic
terfactual Explanations to Adverse Perturbations. https://doi.org/10.48550/ Recourse for Users With Uncertain Cost Functions. https://doi.org/10.48550/
ARXIV.2201.09051 ARXIV.2111.01235
[321] J. von Kügelgen, N. Agarwal, J. Zeitler, A. Mastouri, and B. Schölkopf. 2021. [344] Fan Yang, Sahan Suresh Alva, Jiahao Chen, and Xia Hu. 2021. Model-Based
Algorithmic recourse in partially and fully confounded settings through bound- Counterfactual Synthesizer for Interpretation. In Proceedings of the 27th ACM
ing counterfactual effects. In ICML 2021 Workshop on Algorithmic Recourse. SIGKDD Conference on Knowledge Discovery & Data Mining (KDD ’21). Asso-
https://sites.google.com/view/recourse21/home ciation for Computing Machinery, New York, NY, USA, 1964–1974. https:
[322] Julius von Kügelgen, Umang Bhatt, Amir-Hossein Karimi, Isabel Valera, Adrian //doi.org/10.1145/3447548.3467333
Weller, and Bernhard Scholkopf. 2020. On the Fairness of Causal Algorithmic [345] Fan Yang, Ninghao Liu, Mengnan Du, and Xia Hu. 2021. Generative Counterfac-
Recourse. tuals for Neural Networks via Attribute-Informed Perturbation. SIGKDD Explor.
[323] Sandra Wachter, Brent Mittelstadt, and Luciano Floridi. 2017. Why a Right Newsl. 23 (may 2021), 10. https://doi.org/10.1145/3468507.3468517
to Explanation of Automated Decision-Making Does Not Exist in the General [346] Linyi Yang, Eoin Kenny, Tin Lok James Ng, Yi Yang, Barry Smyth, and Rui-
Data Protection Regulation. International Data Privacy Law 7, 2 (06 2017). hai Dong. 2020. Generating Plausible Counterfactual Explanations for Deep
https://doi.org/10.1093/idpl/ipx005 Transformers in Financial Text Classification. In Proceedings of the 28th In-
[324] Sandra Wachter, Brent Mittelstadt, and Chris Russell. 2017. Counterfactual ternational Conference on Computational Linguistics. International Committee
Explanations Without Opening the Black Box: Automated Decisions and the on Computational Linguistics, Barcelona, Spain (Online), 6150–6160. https:
GDPR. SSRN Electronic Journal 31, 2 (2017). https://doi.org/10.2139/ssrn.3063289 //doi.org/10.18653/v1/2020.coling-main.541
[325] Pei Wang and Nuno Vasconcelos. 2020. SCOUT: Self-Aware Discriminant [347] Nakyeong Yang, Taegwan Kang, and Kyomin Jung. 2022. Deriving Explainable
Counterfactual Explanations. In The IEEE/CVF Conference on Computer Vision Discriminative Attributes Using Confusion About Counterfactual Class. In
and Pattern Recognition (CVPR). ICASSP 2022. 1730–1734. https://doi.org/10.1109/ICASSP43922.2022.9747693
[326] Xiaosong Wang, Yifan Peng, Le Lu, Zhiyong Lu, Mohammadhadi Bagheri, and [348] Yuanshun Yao, Chong Wang, and Hang Li. 2022. Counterfactually Evalu-
Ronald M. Summers. 2017. ChestX-ray8: Hospital-Scale Chest X-Ray Database ating Explanations in Recommender Systems. https://doi.org/10.48550/
and Benchmarks on Weakly-Supervised Classification and Localization of Com- ARXIV.2203.01310
mon Thorax Diseases. In Proceedings of the IEEE Conference on Computer Vision [349] Roozbeh Yousefzadeh and Dianne P. O’Leary. 2019. DEBUGGING TRAINED
and Pattern Recognition (CVPR). MACHINE LEARNING MODELS USING FLIP POINTS. https://debug-ml-
[327] Yongjie Wang, Qinxu Ding, Ke Wang, Yue Liu, Xingyu Wu, Jinglong Wang, Yong iclr2019.github.io/cameraready/DebugML-19_paper_11.pdf
Liu, and Chunyan Miao. 2021. The Skyline of Counterfactual Explanations for [350] Zixuan Yuan, Yada Zhu, Wei Zhang, Ziming Huang, Guangnan Ye, and Hui
Machine Learning Decision Models. In Proceedings of the 30th ACM International Xiong. 2021. Multi-Domain Transformer-Based Counterfactual Augmentation
Conference on Information & Knowledge Management. Association for Computing for Earnings Call Analysis. https://doi.org/10.48550/ARXIV.2112.00963
Machinery, New York, NY, USA, 10. https://doi.org/10.1145/3459637.3482397 [351] Wencan Zhang and Brian Y Lim. 2022. Towards Relatable Explainable AI with
[328] Yongjie Wang, Hangwei Qian, and Chunyan Miao. 2022. DualCF: Efficient the Perceptual Process. ACM. https://doi.org/10.1145/3491102.3501826
Model Extraction Attack from Counterfactual Explanations. In 2022 ACM Con- [352] Yuhao Zhang., Kevin McAreavey., and Weiru Liu. 2022. Developing and Experi-
ference on Fairness, Accountability, and Transparency (FAccT ’22). Association menting on Approaches to Explainability in AI Systems. In Proceedings of the 14th
for Computing Machinery, New York, NY, USA, 12. https://doi.org/10.1145/ International Conference on Agents and Artificial Intelligence - Volume 2: ICAART,.
3531146.3533188 INSTICC, SciTePress, 518–527. https://doi.org/10.5220/0010900300003116
[329] Zhendong Wang, Isak Samsten, Rami Mochaourab, and Panagiotis Papapetrou. [353] Yunxia Zhao. 2020. Fast Real-time Counterfactual Explanations. https://doi.org/
2021. Learning Time Series Counterfactuals via Latent Space Representations. 10.48550/ARXIV.2007.05684
In Discovery Science. Springer International Publishing, Cham, 369–384. [354] Jinfeng Zhong and Elsa Negre. 2022. Shap-Enhanced Counterfactual Explana-
[330] Zhendong Wang, Isak Samsten, and Panagiotis Papapetrou. 2021. Counterfactual tions for Recommendations. In Proceedings of the 37th ACM/SIGAPP Symposium
Explanations for Survival Prediction of Cardiovascular ICU Patients. In Artificial on Applied Computing. Association for Computing Machinery, New York, NY,
Intelligence in Medicine. Springer International Publishing, Cham, 338–348. USA, 1365–1372. https://doi.org/10.1145/3477314.3507029
[331] Greta Warren, Mark T Keane, and Ruth M J Byrne. 2022. Features of Explainabil- [355] B. Zhou, A. Khosla, A. Lapedriza, A. Oliva, and A. Torralba. 2016. Learning
ity: How users understand counterfactual and causal explanations for categorical Deep Features for Discriminative Localization. In CVPR. IEEE, New York, USA,
and continuous features in XAI. https://doi.org/10.48550/ARXIV.2204.10152 2921–2929.
[332] Geemi P. Wellawatte, Aditi Seshadri, and Andrew D. White. 2022. Model agnostic [356] Yao Zhou, Haonan Wang, Jingrui He, and Haixun Wang. 2021. From Intrinsic to
generation of counterfactual explanations for molecules. Chem. Sci. 13 (2022), Counterfactual: On the Explainability of Contextualized Recommender Systems.
3697–3705. https://doi.org/10.1039/D1SC05259D https://doi.org/10.48550/ARXIV.2110.14844
[333] J. Wexler, M. Pushkarna, T. Bolukbasi, M. Wattenberg, F. Viégas, and J. Wilson. [357] Alexander Zien, Nicole Krämer, Sören Sonnenburg, and Gunnar Rätsch. 2009.
2020. The What-If Tool: Interactive Probing of Machine Learning Models. IEEE The Feature Importance Ranking Measure. In Machine Learning and Knowledge
Transactions on Visualization and Computer Graphics 26, 1 (2020), 56–65. Discovery in Databases, Vol. 5782. Springer Berlin Heidelberg, Berlin, Heidelberg.
[334] Adam White and Artur d’Avila Garcez. 2019. Measurable Counterfactual Local https://doi.org/10.1007/978-3-642-04174-7_45
Explanations for Any Classifier. http://arxiv.org/abs/1908.03020
[335] Adam White and Artur d’Avila Garcez. 2021. Counterfactual Instances Explain
Little. https://doi.org/10.48550/ARXIV.2109.09809
[336] Adam White, Kwun Ho Ngan, James Phelan, Saman Sadeghi Afgeh, Kevin Ryan,
Constantino Carlos Reyes-Aldasoro, and Artur d’Avila Garcez. 2021. Contrastive
Counterfactual Visual Explanations With Overdetermination. https://doi.org/
10.48550/ARXIV.2106.14556
[337] Anjana Wijekoon, Nirmalie Wiratunga, Ikechukwu Nkisi-Orji, Kyle Martin,
Chamath Palihawadana, and David Corsar. 2021. Counterfactual explanations
for student outcome prediction with Moodle footprints. CEUR Workshop Pro-
ceedings, 1–8. https://rgu-repository.worktribe.com/output/1395861
[338] Nirmalie Wiratunga, Anjana Wijekoon, Ikechukwu Nkisi-Orji, Kyle Martin,
Chamath Palihawadana, and David Corsar. 2021. DisCERN: Discovering Coun-
terfactual Explanations using Relevance Features from Neighbourhoods. In 2021
IEEE 33rd International Conference on Tools with Artificial Intelligence (ICTAI).
1466–1473. https://doi.org/10.1109/ICTAI52525.2021.00233
[339] James Woodward. 2003. Making Things Happen: A Theory of Causal Explanation.
Oxford University Press.

22
Counterfactual Explanations and Algorithmic Recourses for Machine Learning: A Review ,,

A FULL TABLE this law to “right to explanation” is debatable and ambiguous [323],
Initially, we categorized the set of papers with more columns and the official interpretation by Working Party for Article 29 has con-
in a much larger table. We selected the most critical columns and cluded that the GDPR requires explanations of specific decisions,
put them in table 1. The full table is available here. and therefore counterfactual explanations are apt. In the US, the
Equal Credit Opportunity Act (ECOA) and the Fair Credit Reporting
B METHODOLOGY Act (FCRA) require the creditor to inform the reasons for an adverse
action, such as rejection of a loan request [52, 53]. They generally
B.1 How we collected the paper to review? compare the applicant’s feature to the average value in the popula-
We collected a set of more than 350 papers. This section provides tion to arrive at the principal reasons. Government reports from the
the exact procedure used to arrive at this set of papers. For the United Kingdom [234] and France [151, 319] also touched on the
first version of this survey paper, we had started from a seed set issue of explainability in AI systems. In the US, Defense Advanced
of papers recommended by other people [210, 224, 250, 310, 324], Research Projects Agency (DARPA) launched the Explainable AI
followed by snowballing their references. For this updated (second) (XAI) program in 2016 to encourage research into designing ex-
version of the paper, we collected papers that cited the first paper plainable models, understanding the psychological requirements
that proposed CFEs for ML, i.e., Wachter et al. [324] and the first of explanations, and the design of explanation interfaces [68]. The
version of this CFE survey paper [314]. European Union has taken similar initiatives as well [61, 308]. The
For an even complete search, we searched for “counterfactual ex- US White House recently put forward the Blueprint for an AI Bill
planations”, “recourse”, and “inverse classification” on two popular of Rights [143] to modulate decisions from automated systems. The
search engines for scholarly articles, Semantic Scholar and Google Bill outlines five principles for operating such systems: 1) safe and
scholar. We looked for papers published in the last five years on effective systems, 2) algorithmic discrimination protections, 3) data
both search engines. This is a reasonable time frame since the pa- privacy, 4) explanations for decisions made using such systems, and
per that started the discussion of counterfactual explanations in 5) discussion about human alternatives. While many techniques
the context of machine learning (specifically for tabular data) was have been proposed for explainable machine learning, it is yet un-
published in 2017 [324]. We collect papers that were published clear if and how these specific techniques can help address the letter
before 31st May 2022. The papers we collected were published at of the law. Future collaboration between AI researchers, regulators,
conferences like KDD, IJCAI, FAccT, AAAI, WWW, NeurIPS, WHI, the legal community, and consumer watchdog groups will help
or uploaded to Arxiv. ensure the development of trustworthy AI.

B.2 Scope of the review

Even though the first paper we reviewed was published online in
2017, and most other papers we review cite it [324] as the seminal pa-
per that started the discussion around counterfactual explanations,
we do not claim that this is an entirely new idea. Communities from
data mining [98, 212], causal inference [244], and even software
engineering [55] have explored similar ideas to identify the princi-
pal cause of a prediction, an effect, and a bug, respectively. Even
before the emergence of counterfactual explanations in applied
fields, they have been the topic of discussion in fields like social
sciences [218], philosophy [176, 196, 266], psychology [45, 46, 163].
In this review paper, we restrict our discussion to recent papers that
discuss counterfactual explanations in machine learning, specifi-
cally classification settings. These papers have been inspired by the
emerging trend of FATE and the legal requirements pertaining to
explainability in tasks automated by machine learning algorithms.

C BURGEONING LEGAL FRAMEWORKS

AROUND EXPLANATIONS IN AI
To increase the accountability of automated decision systems—
specifically, AI systems—laws and regulations regarding the de-
cisions produced by such systems have been proposed and im-
plemented across the globe [85]. The most recent version of the
European Union’s General Data Protection Regulation (GDPR), en-
forced starting on May 25, 2018, offered a right to information
about the existence, logic, and envisaged consequences of such a
system [121]. This also includes the right to not be a subject of
an automated decision-making system. Although the closeness of
23

Linnik, Iosif v. Ostrovskii - Decomposition of Random Variables and Vectors-Amer Mathematical Society (1977)
No ratings yet
Linnik, Iosif v. Ostrovskii - Decomposition of Random Variables and Vectors-Amer Mathematical Society (1977)
394 pages
Development of One'S Self As A Product of Enculturation: Marilyn B. Encarnacion
100% (1)
Development of One'S Self As A Product of Enculturation: Marilyn B. Encarnacion
18 pages
DSA by Shradha Didi & Aman Bhaiya - Bonus DSA Questions
No ratings yet
DSA by Shradha Didi & Aman Bhaiya - Bonus DSA Questions
2 pages
(Graduate Studies in Mathematics 047) A. Yu. Kitaev, A. H. Shen, M. N. Vyalyi - Classical and Quantum Computation-Amer Mathematical Society (2002)
No ratings yet
(Graduate Studies in Mathematics 047) A. Yu. Kitaev, A. H. Shen, M. N. Vyalyi - Classical and Quantum Computation-Amer Mathematical Society (2002)
274 pages
Solution 2
0% (1)
Solution 2
4 pages
Children First School: Special Education Program Form 1: Teacher Nomination Directions
No ratings yet
Children First School: Special Education Program Form 1: Teacher Nomination Directions
18 pages
1st Week Aug
No ratings yet
1st Week Aug
5 pages
Dynamic Programming and Optimal Control: Third Edition Dimitri P. Bertsekas
0% (1)
Dynamic Programming and Optimal Control: Third Edition Dimitri P. Bertsekas
54 pages
Art Curriculum Overview
100% (1)
Art Curriculum Overview
13 pages
A Primer of Infinitesimal Analysis - Portada
0% (1)
A Primer of Infinitesimal Analysis - Portada
4 pages
Nonlinear Systems Analysis
No ratings yet
Nonlinear Systems Analysis
510 pages
Critical Analysis of Judy Blumes Are You There God Its Me Margaret
No ratings yet
Critical Analysis of Judy Blumes Are You There God Its Me Margaret
11 pages
Dilations of Irreversible Evolutions in Algebraic Quantum Theory
100% (2)
Dilations of Irreversible Evolutions in Algebraic Quantum Theory
81 pages
Magnus Matrix Differentials Presentation
100% (1)
Magnus Matrix Differentials Presentation
119 pages
Fronczak, Reniers - Model-Based Systems Engineering - 4TC00 Dictaat 2014-2015 PDF
No ratings yet
Fronczak, Reniers - Model-Based Systems Engineering - 4TC00 Dictaat 2014-2015 PDF
105 pages
The Science of Deep Learning
0% (1)
The Science of Deep Learning
2 pages
Complexity of Algorithms
No ratings yet
Complexity of Algorithms
180 pages
Unit 2
No ratings yet
Unit 2
38 pages
A Brief Survey of Deep Reinforcement Learning
No ratings yet
A Brief Survey of Deep Reinforcement Learning
16 pages
MATH 115: Lecture IV Notes
No ratings yet
MATH 115: Lecture IV Notes
5 pages
VetterliKovacevic95 Manuscript
No ratings yet
VetterliKovacevic95 Manuscript
521 pages
Binomial Distribution
No ratings yet
Binomial Distribution
16 pages
Decidability and Reductions
No ratings yet
Decidability and Reductions
178 pages
An Introduction To The Laplace Transform and The Z Transform - Anthony C.grove (PH 1991 138s)
No ratings yet
An Introduction To The Laplace Transform and The Z Transform - Anthony C.grove (PH 1991 138s)
140 pages
Model Predictive Control Using YALMIP Getting Started
No ratings yet
Model Predictive Control Using YALMIP Getting Started
5 pages
TSLB Linguistic (Academic Writing Real)
No ratings yet
TSLB Linguistic (Academic Writing Real)
6 pages
Orthogonal Polynomials (In Matlab) : Walter Gautschi
No ratings yet
Orthogonal Polynomials (In Matlab) : Walter Gautschi
23 pages
03 Numeric Basics
No ratings yet
03 Numeric Basics
68 pages
Guc 2 61 38781 2023-11-25T16 29 04
No ratings yet
Guc 2 61 38781 2023-11-25T16 29 04
3 pages
HW5e Beg Video Worksheet Unit 02
100% (2)
HW5e Beg Video Worksheet Unit 02
2 pages
Introduction To Network Programming Using C/C++
No ratings yet
Introduction To Network Programming Using C/C++
27 pages
Davis Analysis of Active RC Networks by Decomposition
No ratings yet
Davis Analysis of Active RC Networks by Decomposition
3 pages
Reinforcement Learning
No ratings yet
Reinforcement Learning
7 pages
Dynamic Programming - LeetCode
No ratings yet
Dynamic Programming - LeetCode
12 pages
PrinciplesOfConcurrent PDF
No ratings yet
PrinciplesOfConcurrent PDF
16 pages
Abstract Algebra Notes - by Me-1
No ratings yet
Abstract Algebra Notes - by Me-1
282 pages
Neal Koblitz P-Adic... 2020
No ratings yet
Neal Koblitz P-Adic... 2020
166 pages
263 Homework
No ratings yet
263 Homework
153 pages
0262072629.MIT Press - Peter D. Grunwald, in Jae Myung, Mark A. P Advances in Minimum Description Length & Applications - Apr.2009
No ratings yet
0262072629.MIT Press - Peter D. Grunwald, in Jae Myung, Mark A. P Advances in Minimum Description Length & Applications - Apr.2009
455 pages
Matlab
No ratings yet
Matlab
45 pages
Unit 2 Eating Healthy Listening
No ratings yet
Unit 2 Eating Healthy Listening
4 pages
E The Master of All
No ratings yet
E The Master of All
12 pages
CMPT 305: Computer Simulation/Modelling Spring 2022 Course Introduction Modelling Pitfalls & Performance Metrics Alaa Alameldeen (Alaa@sfu - Ca)
No ratings yet
CMPT 305: Computer Simulation/Modelling Spring 2022 Course Introduction Modelling Pitfalls & Performance Metrics Alaa Alameldeen (Alaa@sfu - Ca)
43 pages
Decision Uncertainty
No ratings yet
Decision Uncertainty
269 pages
Interview Questions - Answers
No ratings yet
Interview Questions - Answers
10 pages
Semigroup Theory With Applica Ns To Systems An o N Tro L
No ratings yet
Semigroup Theory With Applica Ns To Systems An o N Tro L
300 pages
DoyleSnell Random Walks and Electrict Networks
No ratings yet
DoyleSnell Random Walks and Electrict Networks
118 pages
Matroid
No ratings yet
Matroid
18 pages
Counterfactual Explanations For Machine Learning A Review
No ratings yet
Counterfactual Explanations For Machine Learning A Review
13 pages
04 Feedback and Optimal Sensitivity
No ratings yet
04 Feedback and Optimal Sensitivity
20 pages
Overview ML Interpretability
No ratings yet
Overview ML Interpretability
10 pages
Reflections On Practice: The Difference of Two Squares
No ratings yet
Reflections On Practice: The Difference of Two Squares
13 pages
RidleyWorks15 02
No ratings yet
RidleyWorks15 02
112 pages
Sample Robotics - Lab - Proposal
No ratings yet
Sample Robotics - Lab - Proposal
15 pages
Teaching PowerPoint Slides - Chapter 3
100% (1)
Teaching PowerPoint Slides - Chapter 3
17 pages
Getting Started With MATLAB: Notes
100% (4)
Getting Started With MATLAB: Notes
27 pages
Advances in Interdisciplinary Applied Discrete Mathematics
No ratings yet
Advances in Interdisciplinary Applied Discrete Mathematics
273 pages
Graphblas
No ratings yet
Graphblas
8 pages
Motivation Letter Sample 3
No ratings yet
Motivation Letter Sample 3
2 pages
On Cellular Automata Representation of Submicroscopic Physics: From Static Space To Zuse's Calculating Space Hypothesis
No ratings yet
On Cellular Automata Representation of Submicroscopic Physics: From Static Space To Zuse's Calculating Space Hypothesis
11 pages
Generalized Functions
No ratings yet
Generalized Functions
56 pages
The Design and Implementation of The Wolfram Language Compiler
No ratings yet
The Design and Implementation of The Wolfram Language Compiler
17 pages
07 Bab 6
No ratings yet
07 Bab 6
32 pages
An Introductory Course in Elementary Number Theory
No ratings yet
An Introductory Course in Elementary Number Theory
172 pages
STAT 650 - Foundations of Data Science Syllabus
No ratings yet
STAT 650 - Foundations of Data Science Syllabus
13 pages
Company Bankruptcy Detection PDF
No ratings yet
Company Bankruptcy Detection PDF
34 pages
A Tutorial On Convex Optimization (Haitham Hindi)
100% (1)
A Tutorial On Convex Optimization (Haitham Hindi)
14 pages
High Quality Online Texts and Notes:: Sition To Advanced Mathematics, 2nd Ed., Addison Wesley, Reading, MA
No ratings yet
High Quality Online Texts and Notes:: Sition To Advanced Mathematics, 2nd Ed., Addison Wesley, Reading, MA
8 pages
RMS
No ratings yet
RMS
16 pages
Unconstrained Optimization (Contd.) Constrained Optimization
No ratings yet
Unconstrained Optimization (Contd.) Constrained Optimization
19 pages
TOC Tut
No ratings yet
TOC Tut
3 pages
(This Serves As An Invitation) : Maka-Diyos Makatao Makakalikasan Makabansa
No ratings yet
(This Serves As An Invitation) : Maka-Diyos Makatao Makakalikasan Makabansa
2 pages
DBMSEnd Sem Winter 2017 Solution
No ratings yet
DBMSEnd Sem Winter 2017 Solution
7 pages
Why We Should Not Have Homework On The Weekends
100% (1)
Why We Should Not Have Homework On The Weekends
5 pages
Causal Interpretability For Machine Learning
No ratings yet
Causal Interpretability For Machine Learning
16 pages
Viral Pandey Bankruptcy Prediction
No ratings yet
Viral Pandey Bankruptcy Prediction
7 pages
AA V1 I1 PCG Lanczos Eigensolver
No ratings yet
AA V1 I1 PCG Lanczos Eigensolver
2 pages
Explaining Explanations - An Overview of Interpretability of Machine Learning
No ratings yet
Explaining Explanations - An Overview of Interpretability of Machine Learning
10 pages
Pioneers of Modern Teaching
No ratings yet
Pioneers of Modern Teaching
48 pages
Cognitive Psychology - Module 1
No ratings yet
Cognitive Psychology - Module 1
72 pages
Thermal Degradation of PE and PTFE During Vacuum Evaporation
No ratings yet
Thermal Degradation of PE and PTFE During Vacuum Evaporation
4 pages
Tutorial 5
No ratings yet
Tutorial 5
4 pages
Mrs. Ofelia Melchora M. Sumalin Principal Holy Spirit Academy
No ratings yet
Mrs. Ofelia Melchora M. Sumalin Principal Holy Spirit Academy
7 pages
Python JD
No ratings yet
Python JD
2 pages
Resume JONATHAN MWITA 02 - 15 - 2023 7 - 15 - 50 AM
No ratings yet
Resume JONATHAN MWITA 02 - 15 - 2023 7 - 15 - 50 AM
2 pages
2024 Example of CV in English Endri Nderjaku
No ratings yet
2024 Example of CV in English Endri Nderjaku
6 pages
The Limitation of Conflict - A Theory of Bargaining and - Rangarajan, L - N - New York, New York State, 1985 - Palgrave Macmillan - 9780312486754 - Anna's Archive
No ratings yet
The Limitation of Conflict - A Theory of Bargaining and - Rangarajan, L - N - New York, New York State, 1985 - Palgrave Macmillan - 9780312486754 - Anna's Archive
360 pages
IsiXhosa HL P1 May-June 2023
No ratings yet
IsiXhosa HL P1 May-June 2023
13 pages
Language Translator Python
No ratings yet
Language Translator Python
25 pages
Non Text Magic Studio Magic Design For Presentations L&P
No ratings yet
Non Text Magic Studio Magic Design For Presentations L&P
6 pages
B Des Curriculum 2024 KTU
No ratings yet
B Des Curriculum 2024 KTU
11 pages
Local Heritage Themes
No ratings yet
Local Heritage Themes
3 pages
Edad Moderna y Contemporanea
No ratings yet
Edad Moderna y Contemporanea
10 pages
1 s2.0 S266682702400001X Main
No ratings yet
1 s2.0 S266682702400001X Main
8 pages
DAILY-LESSON-LOG-TLE10-Week 2 2024-2025
No ratings yet
DAILY-LESSON-LOG-TLE10-Week 2 2024-2025
5 pages
Lectures on the Coupling Method
From Everand
Lectures on the Coupling Method
Torgny Lindvall
No ratings yet
From Prognostics and Health Systems Management to Predictive Maintenance 1: Monitoring and Prognostics
From Everand
From Prognostics and Health Systems Management to Predictive Maintenance 1: Monitoring and Prognostics
Rafael Gouriveau
No ratings yet
Graphs and Tables of the Mathieu Functions and Their First Derivatives
From Everand
Graphs and Tables of the Mathieu Functions and Their First Derivatives
James C. Wiltse
No ratings yet

Counterfactual Explanations and Algorithmic Recourses For Machine Learning: A Review

Uploaded by

Counterfactual Explanations and Algorithmic Recourses For Machine Learning: A Review

Uploaded by

Counterfactual Explanations and Algorithmic Recourses for

Machine Learning: A Review

Keegan E. Hines John P. Dickerson Chirag Shah

Arthur AI Arthur AI University of Washington

ABSTRACT • An explanation can be beneficial to the applicant whose life

𝑓 : X𝑚 → Y, which is used to predict labels for unseen datapoints

2.3 History of Counterfactual Explanations

(5) Causality: Features in a dataset are rarely independent, there-

Amortized Multiple Data Feature Categorical dist.

Table 2: Continued from Table 1

Amortized Multiple Data Feature Categorical dist.

453. https://doi.org/10.1017/S0140525X07002579 Explanations and Feasible Recommendations to End Users. In Proceedings of

B.2 Scope of the review

C BURGEONING LEGAL FRAMEWORKS

You might also like