
Explainable Artificial Intelligence: A Review and Case Study on Model-Agnostic Methods

2023 14th International Conference on Intelligent Systems: Theories and Applications (SITA) | 979-8-3503-0821-1/23/$31.00 ©2023 IEEE | DOI: 10.1109/SITA60746.2023.10373722

Khadija LETRACHE
Informatics Department, LIM Laboratory
Faculty of Sciences and Techniques of Mohammedia, University Hassan II, Casablanca, Morocco
Email: [email protected]

Mohammed RAMDANI
Informatics Department, LIM Laboratory
Faculty of Sciences and Techniques of Mohammedia, University Hassan II, Casablanca, Morocco
Email: [email protected]

Abstract—Explainable Artificial Intelligence (XAI) has emerged as an essential aspect of artificial intelligence (AI), aiming to impart transparency and interpretability to AI black-box models. With the recent rapid expansion of AI applications across diverse sectors, the need to explain and understand their outcomes becomes crucial, especially in critical domains. In this paper, we provide a comprehensive review of XAI techniques, emphasizing their methodologies, strengths, and potential limitations. Furthermore, we present a case study employing six model-agnostic XAI techniques, offering a comparative analysis of their effectiveness in explaining a black-box model related to a healthcare scenario. Our experiments not only showcase the applicability and distinctiveness of each technique but also provide insights to researchers and practitioners seeking suitable XAI methodologies for their projects. We conclude with a discussion on future research perspectives in the field of explainable AI.

Index Terms—Explainable Artificial Intelligence, XAI, Model-Agnostic, Black-box, Interpretability.

I. INTRODUCTION

Over the past few decades, artificial intelligence (AI) has experienced exponential growth, driven by enhanced computing capabilities, the availability of massive data, and improvements in machine learning (ML) algorithms. This has led to significant advancements across all AI fields. Meanwhile, as AI models have evolved, they have become more complex and their decisions increasingly obscure, especially for end-users. Such models, often referred to as "black-boxes" by the AI community [1], may deliver highly accurate outcomes, but their complexity and lack of transparency make them challenging to understand and to explain. As a result, the eXplainable Artificial Intelligence (XAI) field has emerged, dedicated to demystifying AI decisions for data professionals and domain experts. Indeed, the need for interpretability is imperative, especially in critical sectors such as healthcare, finance and justice, where clear understanding and trust in AI systems are not just desirable but essential.

Therefore, over time, many explainable AI techniques have been proposed, each tailored to explain a specific aspect of AI systems.

In this paper, we selected six XAI methods to analyze a healthcare case study with the purpose of evaluating and comparing these techniques. The chosen techniques are all model-agnostic, aiming to explain any AI black-box model. Among them, three are global techniques, providing a global explanation of a model, while the remaining three are local techniques, aiming to explain the decision related to a specific instance. Finally, we provide a synthesis of the techniques employed, evaluate their outcomes, and delve into potential directions for future research.

The remainder of the paper is organized as follows: Section 2 provides a comprehensive overview of XAI and the state of the art related to this field. In Section 3, we present our case study and the experimental outcomes of the chosen XAI techniques. Finally, we conclude this paper by addressing our research perspectives related to XAI.

II. XAI OVERVIEW

A. XAI Background

Though the term may seem new, questions about the explainability of AI systems trace back to the very inception of AI. In fact, as far back as 1975, one of the three design goals of the expert system MYCIN [2] [3] was to be able to explain its decisions when asked to do so. In general, expert systems are considered the first explainable AI systems [4] [5] [6] [7]. In 1986, Ginsberg wondered in his paper [8] about the potential applications of counterfactual explanations within the AI realm.

Thus, as AI evolved and grew, so did the field of XAI. However, the recent exponential evolution of AI across various domains has outpaced advancements in XAI, resulting in a lack of techniques to explain complex AI systems.

In AI, and more precisely in machine learning (ML), we differentiate between interpretable models and black-box models. Interpretable models produce decisions that are easily understandable by humans, with examples including decision trees, linear regression, KNN, K-means, and Bayesian networks.
Conversely, models such as neural networks, random forests, support vector machines (SVM), and gradient boosting machines (GBM) fall under the category of black-box models, as their decisions are often more challenging to explain or interpret.

One might question whether the high accuracy of these models alone is enough to gain users' trust, even if they don't understand the decision-making process. In essence, why is explainability in AI so crucial? Explainability is crucial in AI systems for several reasons. Foremost, it fosters user trust in AI outcomes [9], especially in critical areas such as medical diagnosis, where practitioners require comprehensive explanations to confidently rely on AI systems. A further advantage of explainability is the opportunity for users to gain deep knowledge from AI [10]. By understanding the implicit, data-driven rules formulated by an AI model, users can discover new insights thanks to artificial intelligence. Moreover, explainability aids in evaluating an AI system. In fact, relying only on performance metrics such as accuracy or precision is often insufficient to guarantee an AI system's reliability [11]. It is imperative to probe the underlying rules of the system to ensure that its decisions are grounded in logic, rather than mere coincidence. This paves the way for corrective measures and enhancements.

B. XAI Taxonomy

1) XAI Taxonomy by Scope

a) Global vs Local
• Global XAI Methods: These aim to explain the overall behavior of the black-box model. They are performed once across all the dataset instances. Some techniques, such as PFI and surrogate models, deal with all the model's features, while others, such as PDP and ALE [12], focus on individual features.
• Local XAI Methods: These techniques are designed to explain a specific decision or instance. Notably, explanations can differ from one instance to another, especially with complex models. In this category we find methods such as LIME and SHAP, which provide the rule related to a specific instance.

b) Agnostic vs Specific
• Agnostic XAI Methods: These techniques can explain any AI black-box model. They analyze the relationship between the model's input and output, irrespective of the black-box's internal structure.
• Specific XAI Methods: These techniques are designed to explain specific AI models. They can either be integrated during the model's design phase or applied post-hoc after the model has been trained. It is worth noting that many of these techniques are specific to neural networks.

2) XAI Taxonomy by Output

a) Visualization
Visualization is considered one of the most powerful explanation tools [13]. It offers users a summarized, comprehensible and visual output of both AI models and XAI techniques. Beyond common visual tools like the confusion matrix and heatmaps, most XAI techniques come with libraries that offer a graphical representation of their outputs.

b) Rule Extraction
Another approach in XAI involves extracting the rules used by a black-box model, either on a global or local scale. Ideally, pinpointing the exact rule a black-box model uses for decision-making would provide the perfect explanation. However, in practice, this is often not possible. Rule extraction typically yields an approximation of the model's exact rule. Techniques like TREPAN for neural networks [14] or decision trees used as surrogate models fall under this category.

c) Feature Importance
Interpreting a black-box model can also be achieved by explaining the importance of its features and their impact on the model decisions. Techniques in this category often rank features based on assigned coefficients that indicate their importance, such as LIME [11] and PFI [15]. Other methods display how predictions change in response to variations in specific feature values, as in the ALE, PDP, and ICE techniques. Notably, this category encompasses both global and local XAI methods.

d) Counterfactual
The core concept of counterfactual explanation is to propose alternative scenarios to identify how a decision might change when one or multiple features are altered [16]. Counterfactuals typically follow the "if a then y" format, where "a" is presumed to be false in the given context [8]. For example, if a predictive model classifies a client as a "subprime borrower", a counterfactual explanation might address what changes are necessary for the client to be classified differently.

e) Prototypes and Criticisms
Example-based explanations offer an intuitive approach to explain black-box models to users [17]. Consider, for instance, providing a physician with cancer images, as examples, from the dataset to explain the classification related to a specific patient; this will certainly help the physician understand the model's reasoning. However, to be sure that the given examples align with the instance of interest or with the broader dataset, it is crucial to meticulously select a representative example, a prototype, for explanation. Yet, relying only on prototypes may risk over-generalizing the model's behavior [18], especially for complex models. Therefore, prototypes must be analyzed alongside criticisms, which refer to instances that are not well explained by prototypes [17]. By examining prototypes and criticisms, developers and experts can not only understand the model's decision-making process but also identify its limitations, paving the way for potential corrections and enhancements.

f) Fuzzy Rule-Based Explanation
Given the human mental model, fuzzy reasoning can be viewed as an intuitive method of explanation in artificial intelligence. Instead of seeking binary rules when explaining a black-box model, we can extract fuzzy rules. This approach aligns with the idea that XAI techniques only offer approximations of the black-box models. Therefore, fuzzy logic has been widely utilized in XAI [19] for both post-hoc and intrinsic explanations, especially in interpreting neural network models, dating back to 1997 [1] [20].
III. CASE STUDY

In this section, we provide a comprehensive review of six XAI techniques, illustrated through a case study. These techniques are used to explain a random forest model trained on a breast cancer dataset.

A. Dataset

For our case study, we employed a dataset pertaining to breast cancer. This dataset was sourced from the November 2017 update of the SEER (Surveillance, Epidemiology, and End Results) Program of the National Cancer Institute (NCI). Specifically, the dataset centers on female patients diagnosed with infiltrating duct and lobular carcinoma breast cancer, two prevalent types of breast cancer, between 2006 and 2010 [21]. The dataset comprises 4024 entries across 16 columns. A detailed description of these columns is provided in Table I.

TABLE I
DATASET DESCRIPTION

Age (Integer): The age of the patient at the time of diagnosis.
Race (String): The racial background of the patient. Values: White, Black, Other.
Marital Status (String): Patient's marital status. Values: Married, Divorced, Single, Widowed, Separated.
T Stage (String): Indicates the size and extent of the primary tumor. Values: T1, T2, T3, T4.
N Stage (String): Indicates whether the cancer has spread to nearby (regional) lymph nodes. Values: N1, N2, N3.
6th Stage (String): Refers to the breast cancer staging system as defined by the AJCC (The American Joint Committee on Cancer) in its 6th edition. Values: IIA, IIIA, IIIC, IIB, IIIB.
A Stage (String): Indicates the extent of cancer spread. Values: Regional, Distant.
Differentiate (String): Refers to how differentiated the tumor cells are. Values: Poorly differentiated, Moderately differentiated, Well differentiated, Undifferentiated.
Grade (String): Tumor grade is an indication of how aggressive a tumor is and how it might behave. Values: 1, 2, 3, anaplastic; Grade IV.
Tumor Size (Integer): Indicates the size of the primary tumor.
Estrogen Status (String): Indicates whether the breast cancer cells have receptors for the hormone estrogen. Values: Positive, Negative.
Progesterone Status (String): Indicates whether the breast cancer cells have receptors for the hormone progesterone. Values: Positive, Negative.
Regional Node Examined (Integer): Number of nearby lymph nodes that were removed and examined under a microscope.
Reginol Node Positive (Integer): Represents the number of examined lymph nodes that were found to have cancer.
Survival Months (Integer): Number of months the patient survived post-diagnosis.
Status (String): Patient's current status. Values: Alive, Dead.

We utilized the Random Forest algorithm to classify patients as either "Alive" or "Dead" based on their medical data. The accuracy of our trained model reached 91%. Subsequently, we employed the following XAI techniques to interpret the decisions of our black-box model: Global Surrogate model, PDP, PFI, ICE, LIME, and SHAP. For evaluating local XAI techniques, we considered the patient described in [Fig 1], classified as "Alive" by the model. This instance will be called A in the remainder of this paper.

Fig. 1. Description of the instance used for local explanation.
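As a concrete illustration of this setup, the following minimal sketch (not the authors' code) shows how such a black-box classifier could be trained with scikit-learn. The file name Breast_Cancer.csv, the label encoding of categorical columns, and the train/test split parameters are assumptions made for the example.

# Minimal sketch of the case-study setup described above (not the authors' code).
# Assumptions: the SEER extract is available locally as "Breast_Cancer.csv" with the
# columns of Table I, and categorical columns are label-encoded for simplicity.
import pandas as pd
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import LabelEncoder
from sklearn.metrics import accuracy_score

df = pd.read_csv("Breast_Cancer.csv")            # hypothetical file name
X = df.drop(columns=["Status"])
y = (df["Status"] == "Alive").astype(int)        # 1 = Alive, 0 = Dead

# Encode string columns so the random forest can consume them.
for col in X.select_dtypes(include="object").columns:
    X[col] = LabelEncoder().fit_transform(X[col])

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42, stratify=y
)

model = RandomForestClassifier(n_estimators=100, random_state=42)
model.fit(X_train, y_train)
print("Accuracy:", accuracy_score(y_test, model.predict(X_test)))  # ~0.91 reported in the paper

The later sketches in this section reuse the model, X_test, and y_test objects defined here.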
B. Global Model-Agnostic Methods

1) Permutation Feature Importance

a) Method Overview
Permutation Feature Importance (PFI) [15] is a widely used XAI technique. While it shares similarities with the PDP approach, PFI serves a distinct purpose. Specifically, PFI focuses on gauging the impact of individual features on the black-box model's performance. This impact is assessed by measuring the change in the model's performance when the values of a feature are randomly shuffled. A significant increase in the model error implies a higher importance of that feature. Consequently, the method yields a ranked list of features based on their importance. The PFI algorithm can be summarized in the following steps (a code sketch is given below):
• Train the model and measure its performance.
• For each feature:
  – Shuffle its values in the test dataset.
  – Measure the model's performance on the new dataset.
  – Calculate the feature importance based on the difference in performance (accuracy or R-squared) before and after permutation.
• Rank the features by their computed importance.
Nevertheless, PFI can yield biased outcomes when dealing with correlated features. Such scenarios may lead the model to underestimate the significance of certain features, even though they might play a crucial role [17] [22].
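The listed steps can be reproduced in a few lines of Python. The sketch below is a minimal illustration that reuses the model, X_test, and y_test objects from the setup sketch; in practice the shuffle is repeated several times per feature and the differences are averaged (this repetition count is the "number of permutations per feature" hyperparameter discussed later), and scikit-learn's permutation_importance offers an equivalent, more robust implementation.

# Minimal permutation-feature-importance sketch following the listed steps.
# Assumes `model`, `X_test`, `y_test` from the previous setup sketch.
import numpy as np
from sklearn.metrics import accuracy_score

rng = np.random.default_rng(0)
baseline = accuracy_score(y_test, model.predict(X_test))

importances = {}
for col in X_test.columns:
    X_shuffled = X_test.copy()
    # Shuffle the values of a single feature, breaking its link with the target.
    X_shuffled[col] = rng.permutation(X_shuffled[col].values)
    permuted = accuracy_score(y_test, model.predict(X_shuffled))
    importances[col] = baseline - permuted      # positive = feature matters

# Rank the features by their computed importance (negative values suggest noise).
for col, imp in sorted(importances.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{col:25s} {imp:+.4f}")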
b) Method Application
When we first applied PFI to our case study, we obtained the feature importance plot shown in [Fig 2]. We can notice that some features have negative values. These negative values indicate that the respective features not only lack importance but also introduce noise into the model [22]. Indeed, after eliminating these features, the model's accuracy improved. After eliminating these features and re-executing the PFI, the feature ranking changed as depicted in [Fig 3]. The most crucial features in the model are now Survival Months, followed by Regional Node Positive, Grade, Age, Marital Status, and Progesterone Status. Conversely, the remaining features have a non-significant impact on predictions.

Fig. 2. Feature importance plot showing the ranking of features based on their computed importance. Note that there are features with negative values.

Fig. 3. Feature importance plot showing the ranking of features based on their computed importance after eliminating the features with negative values.

2) Partial Dependence Plot

a) Method Overview
The Partial Dependence Plot (PDP) aims to examine the influence of one or two features on a black-box model's predictions [13]. Through visualization, the PDP illustrates how the model's predictions change when varying a feature's values. To do so, the PDP method involves the following steps (a code sketch follows below):
• For each unique value v of the feature in question, assign this value to all instances in the dataset, i.e., override the original value of the feature with v. The values of the other features remain unchanged.
• Use the trained black-box model to obtain predictions for the new dataset.
• Calculate the average prediction corresponding to the value v.
• Repeat the above steps for all unique values of the feature.
• Plot the average prediction against the unique values of the feature to visualize its impact on the model's output.
However, a fundamental assumption of the PDP is that the selected features are independent of the others. Violating this assumption can lead the PDP to provide misleading interpretations. This is because correlated features could influence the predictions, and also because artificially perturbing the dataset (step 1) could generate unrealistic instances (e.g., a combination of "summer" for the season and "-10" for the temperature), leading to inaccurate model predictions.
Note also that for large datasets or features with high cardinality, the PDP process can be computationally intensive. In such cases, we can employ the Monte Carlo method to sample instances rather than using the entire dataset, or other optimization techniques [23].
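A minimal sketch of these steps for a single feature is shown below, reusing the objects from the setup sketch; the feature name is an assumption, and ready-made implementations such as scikit-learn's PartialDependenceDisplay exist as well.

# Minimal partial-dependence sketch for one feature, following the listed steps.
# Assumes `model` and `X_test` from the earlier setup sketch.
import numpy as np
import matplotlib.pyplot as plt

feature = "Survival Months"                     # assumed column name from Table I
grid = np.sort(X_test[feature].unique())

avg_predictions = []
for v in grid:
    X_mod = X_test.copy()
    X_mod[feature] = v                          # override the feature for all instances
    proba = model.predict_proba(X_mod)[:, 1]    # probability of the "Alive" class
    avg_predictions.append(proba.mean())        # average prediction for value v

plt.plot(grid, avg_predictions)
plt.xlabel(feature)
plt.ylabel("Average predicted probability of 'Alive'")
plt.title("Partial dependence (sketch)")
plt.show()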
b) Method Application
In our case study, we utilized the PDP technique to individually assess the impact of each feature on the model's predictions. The visual results are consolidated and presented in [Fig 4]. We observed significant variations in the predictions related to the features Survival Months, Regional Node Positive, Grade, and Marital Status, highlighting their influence on the model's outputs. Conversely, the predictions showed minimal variation for the features Regional Node Examined, Race, Estrogen Status, A Stage and Progesterone Status.

Fig. 4. PDP applied to each feature, sorted by curve variation, and grouped in this figure. The y-axis displays the prediction average. Curve variance reflects the dependence between the prediction and the feature values. Almost constant curves indicate a non-significant feature importance.

3) Global Surrogate Model

a) Method Overview
The surrogate model approach is a straightforward and simple method for interpreting complex machine learning models. Essentially, it involves utilizing an interpretable model, such as a decision tree or a linear model, to approximate the behavior of a more complex black-box model. This surrogate is trained using the predictions of the original, complex model. However, it is worth noting that identifying an appropriate surrogate model isn't always feasible. Furthermore, due to potential divergence between the original and surrogate models, the latter might not capture all nuances of the former. If the surrogate could perfectly replicate the black-box model, one might argue for the use of the interpretable model outright.

Therefore, it is essential to quantify the degree to which the surrogate model approximates the original. Metrics like the R-squared value can be employed for this purpose [17], as shown below:

R^2 = 1 - \frac{\sum_{i=1}^{n} (y'_i - y_i)^2}{\sum_{i=1}^{n} (y_i - \bar{y})^2}

where n is the number of instances, y'_i is the prediction given by the surrogate model, y_i is the prediction of the black-box model, and \bar{y} is the mean of the black-box model predictions.

The R-squared value should ideally approach 1 to effectively consider the interpretable model as a surrogate for the black-box model. However, it is crucial to remember that such interpretations serve as approximations and may not necessarily capture the exact logic used by the black-box model.
This technique has been widely employed in explaining various complex machine learning models, including neural networks [14] [24] and random forests [18]. Notably, decision trees are often favored, as they provide comprehensible rules that explain the decisions made by the original model.
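The sketch below illustrates this approach under the assumptions of the earlier setup sketch: a shallow decision tree is fitted to the black-box predictions, its fidelity is measured, and its rules are printed. The depth of 6 mirrors the configuration reported in the method application that follows.

# Minimal global-surrogate sketch: fit a shallow decision tree to the black-box
# predictions and measure how faithfully it mimics them.
# Assumes `model` and `X_test` from the earlier setup sketch.
from sklearn.tree import DecisionTreeClassifier, export_text
from sklearn.metrics import r2_score, accuracy_score

# Labels for the surrogate are the black-box predictions, not the ground truth.
bb_predictions = model.predict(X_test)

surrogate = DecisionTreeClassifier(max_depth=6, random_state=42)  # 6 layers, as in the paper
surrogate.fit(X_test, bb_predictions)

surrogate_predictions = surrogate.predict(X_test)
print("Fidelity (agreement):", accuracy_score(bb_predictions, surrogate_predictions))
print("R-squared vs black-box:", r2_score(bb_predictions, surrogate_predictions))

# Human-readable rules extracted from the surrogate tree.
print(export_text(surrogate, feature_names=list(X_test.columns)))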
b) Method Application
In the context of our case study, we employed a decision tree as a surrogate model to provide insights into our Random Forest classifier. The divergence between the two models is 0.55, indicating a less than optimal alignment. It is worth noting that the decision tree was configured with 6 layers; increasing the depth results in a completely dissimilar model. However, despite the divergence between the two models, we utilized the surrogate model to explain the decision related to the instance A. After ensuring a match in predictions between the Random Forest and the decision tree for the instance, the rule extracted from the decision tree is:

SurvivalMonths > 47.50 and SurvivalMonths <= 82.50 and ReginolNodePositive <= 8.50 and Grade <= 0.82 and Grade > 0.69 and Age > 30.50

C. Local Model-Agnostic Methods

1) Individual Conditional Expectation

a) Method Overview
The Individual Conditional Expectation (ICE) plot [25] is a variant of the Partial Dependence Plot (PDP) that aims to examine the relationship between specific features and their respective model predictions. Although ICE fundamentally operates on a local level by evaluating individual instances, it offers a global view of a feature's importance. Contrary to PDP, which averages the predictions over all instances and renders a single point for each feature value, ICE retains and visualizes these predictions across the range of a feature's values, rendering a distinctive curve for each instance. This distinction allows for a more granular understanding of how altering a specific attribute for a given feature might affect the model's prediction for individual instances.
Just like PDP, the ICE method can sometimes yield misleading interpretations when the feature under consideration is correlated with others. Moreover, in large datasets, ICE can produce overlapping curves, potentially complicating visual interpretation. Nevertheless, ICE excels in discovering the heterogeneous responses of individual instances to the feature of interest, a detail that might be hidden when focusing on the average provided by PDP.
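A minimal ICE sketch is given below, again reusing the objects from the setup sketch; the feature name and the sub-sample size are assumptions chosen to keep the number of curves readable.

# Minimal ICE sketch: one curve per instance instead of the PDP average.
# Assumes `model` and `X_test` from the earlier setup sketch.
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt

feature = "Progesterone Status"                 # assumed column name (label-encoded)
grid = np.sort(X_test[feature].unique())
sample = X_test.sample(50, random_state=0)      # subsample to limit curve overlap

for _, row in sample.iterrows():
    block = pd.DataFrame([row] * len(grid)).reset_index(drop=True)
    block[feature] = grid                       # vary only the feature of interest
    curve = model.predict_proba(block)[:, 1]
    plt.plot(grid, curve, color="steelblue", alpha=0.3)

plt.xlabel(feature)
plt.ylabel("Predicted probability of 'Alive'")
plt.title("ICE curves (sketch)")
plt.show()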
b) Method Application
From the results obtained using the PDP and PFI methods, we observed a discrepancy in the interpretation of the "Progesterone Status" feature. While PFI considered it important, PDP did not. To further analyze the impact of this feature at an individual instance level, we referred to the ICE plot, as illustrated in [Fig 5]. It is evident that, in general, the feature does not significantly influence predictions. However, there are specific instances where its positive or negative impact is notably pronounced. We also observed the issue of overlapping curves, which complicates the interpretation of the plot.

Fig. 5. ICE plot illustrating the individual impact of the "Progesterone Status" feature on model predictions. While the feature generally has limited influence, pronounced effects are observed for specific instances.

2) LIME

a) Method Overview
LIME, standing for Local Interpretable Model-agnostic Explanations [11], aims to approximate a black-box model locally using a more interpretable model, often a linear one. While it shares similarities with the concept of a global surrogate, LIME focuses on emulating the decisions of the complex model at a local level. The process to explain the decision related to a specific instance x involves the following steps (a code sketch follows below):
• Perturb the features of x to generate new instances Z around x.
• For each generated instance z, assign a weight π based on its proximity to x. This weight acts as a measure to filter out noise; the closer z is to x, the greater the weight π is.
• Use the original black-box model to make predictions on the newly generated instances in Z, denoted as f(z).
• Train an interpretable model g using the perturbed instances, their assigned weights, and the predictions obtained from the black-box model. The goal is for the interpretable model to closely match the predictions provided by f by minimizing the following loss function:

L(f, g, \pi) = \sum_{z \in Z} \pi_x(z) \, (f(z) - g(z))^2

• Explain the decision related to x using the interpretable model. The coefficients of this model (in the case of a linear model) represent the impact of the features on the decision.
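The following sketch is a simplified, illustrative implementation of these steps rather than the official lime library (which handles categorical features, discretization, and kernel width more carefully). It reuses the objects from the setup sketch, and the choice of the first test row as instance A is a placeholder.

# Minimal LIME-style sketch following the listed steps (not the official lime library).
# Assumes `model` and `X_test` from the earlier setup sketch.
import numpy as np
import pandas as pd
from sklearn.linear_model import Ridge

rng = np.random.default_rng(0)
instance_a = X_test.iloc[0]                      # placeholder for instance A
x = instance_a.values.astype(float)

# 1) Perturb x to generate neighbours Z (Gaussian noise scaled per feature).
scales = X_test.std().values + 1e-6
Z = x + rng.normal(scale=scales, size=(1000, len(x)))

# 2) Weight each neighbour by its proximity to x (RBF kernel on scaled distance).
distances = np.sqrt((((Z - x) / scales) ** 2).sum(axis=1))
weights = np.exp(-(distances ** 2) / 2.0)

# 3) Query the black-box model on the neighbours.
f_z = model.predict_proba(pd.DataFrame(Z, columns=X_test.columns))[:, 1]

# 4) Fit a weighted interpretable (linear) model g to mimic f locally.
g = Ridge(alpha=1.0)
g.fit(Z, f_z, sample_weight=weights)

# 5) The coefficients of g indicate each feature's local impact on the decision.
for name, coef in sorted(zip(X_test.columns, g.coef_), key=lambda t: -abs(t[1])):
    print(f"{name:25s} {coef:+.4f}")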

b) Method Application
In our case study, the explanation provided by LIME for the instance A is depicted in [Fig 6]. It suggests a 91% probability of survival. This high probability is attributed to the positive impact of the following features, in order: Survival Months, Regional Node Positive, Regional Node Examined, Age, Marital Status and Race. Conversely, the Grade feature has a slight negative impact on the prediction.

Fig. 6. LIME interpretation for instance A. The figure depicts the contributions of individual features to the predicted outcome. Features in orange positively influence the prediction towards 1, signifying "Alive." Conversely, the "Grade" feature, shown in blue, negatively impacts the prediction.
3) SHAP

a) Method Overview
SHAP (SHapley Additive exPlanations), introduced by Lundberg [26], provides local explanations by quantifying the contribution of each feature to a model's decision for a given instance. Inspired by the Shapley value [27] from game theory, SHAP considers each feature as a player and the increase in prediction (i.e., the difference between a given prediction and the average prediction) as the payout. The Shapley value evaluates the impact of a feature by assessing its contribution across all possible subsets of features, known as coalitions. Specifically, it computes how the prediction changes when a feature is known versus when it is unknown (approximated by a random or average value). Here are the main steps of the standard Shapley value:
Given:
• f, the prediction function of the black-box model
• N = {a_1, a_2, ..., a_M}, the model's features
• x, the instance to be explained
For each feature a_i:
For each coalition S of N not containing a_i:
• Construct x' such that features in S remain unchanged from x, while features not in S and not a_i are marginalized (typically using an average or expected value).
• Calculate the prediction of x', denoted f_x(S) (the prediction of x without a_i).
• Construct x'' such that features in S and a_i remain unchanged from x, while features not in S are marginalized.
• Compute the prediction of x'', denoted f_x(S ∪ i) (the prediction of x with a_i).
• Calculate the difference between f_x(S) and f_x(S ∪ i) (the contribution of a_i when added to S).
• Obtain the Shapley value for a_i by computing the average of all differences (over all coalitions S), weighted as follows:

\phi_i = \sum_{S \subseteq N \setminus \{i\}} \frac{|S|! \, (M - |S| - 1)!}{M!} \left[ f_x(S \cup i) - f_x(S) \right]

The SHAP method includes three specialized variants:
KernelSHAP: a model-agnostic approach optimized by considering a subset of coalitions rather than the exhaustive set.
TreeSHAP: specifically tailored for tree-based models, it offers optimizations that capitalize on the structural properties of trees.
DeepSHAP: designed for deep learning models, it combines the Shapley value concept with the DeepLIFT method [26].
A prominent challenge with the Shapley value is its exponential computation time. For a dataset with M features, the technique necessitates the examination of 2^M coalitions. Consequently, several approximation strategies have been proposed [28]. Furthermore, it is notable that the SHAP technique is not exclusively local; it also provides global explanations by averaging feature importance across the entire dataset [29].
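The sketch below shows how TreeSHAP explanations of this kind could be obtained with the shap library, assuming it is installed and reusing the objects from the earlier sketches; depending on the shap version, the classifier output is either a per-class list or a single three-dimensional array, so both cases are handled.

# Minimal TreeSHAP usage sketch with the shap library (assumed installed).
# Assumes `model` and `X_test` from the earlier setup sketch.
import numpy as np
import shap

explainer = shap.TreeExplainer(model)
shap_values = explainer.shap_values(X_test)

# Select the values for class 1 ("Alive"), whatever the shap version returns.
if isinstance(shap_values, list):
    sv_alive = shap_values[1]
else:
    sv_alive = shap_values[..., 1] if shap_values.ndim == 3 else shap_values

# Local explanation for one instance (row 0 used as a placeholder for instance A).
local = dict(zip(X_test.columns, sv_alive[0]))
for name, value in sorted(local.items(), key=lambda t: -abs(t[1])):
    print(f"{name:25s} {value:+.4f}")

# Simple global view: mean absolute SHAP value per feature across the test set.
global_importance = np.abs(sv_alive).mean(axis=0)
for name, value in sorted(zip(X_test.columns, global_importance), key=lambda t: -t[1]):
    print(f"{name:25s} {value:.4f}")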
b) Method Application
In our case study, we utilized the SHAP method, specifically TreeSHAP, to investigate the contributions of each feature to the model decision related to the instance A. The results from this technique are depicted in [Fig 7]. Similar to the LIME technique, SHAP reveals features that contribute positively to the "survival" prediction of instance A. These features include Survival Months, Regional Node Positive, Regional Node Examined, Marital Status, and Age, which offset the negative impact (Status=0) from the Grade feature. TreeSHAP also gives us a global view of the feature importance related to our black-box model, as shown in [Fig 8].

Fig. 7. Feature importance for instance A using TreeSHAP. The plot displays individual feature contributions towards the prediction. Features pushing the prediction towards "survival" are shown in red, while features in blue suggest a contrary effect. The magnitude of each bar represents the strength of the feature's influence.

Fig. 8. Global feature importance using SHAP.
D. Result Discussion

In this study, we employed six XAI methods to interpret our Random Forest model. The techniques' outcomes generally converge. The main conclusions are:
• Across the six techniques, both globally and locally, the most significant features for predicting the status of a breast cancer patient are Survival Months and Regional Node Positive. Conversely, the least significant are Estrogen Status and A Stage.
• For a global explanation, based on the PDP, PFI, and SHAP global analyses, the most critical features for predicting the status of a breast cancer patient are Survival Months, Regional Node Positive, Grade, and Age. Discrepancies are evident for the other features.
• For local explanation, the outcomes of LIME and SHAP largely aligned.
• Though the ICE technique is a local method, it provides a global view of the model.
• The divergence between our model and the surrogate model was measured at 0.55, suggesting that the surrogate might not accurately reflect the original model. On a local level, though, the surrogate model's explanations were generally consistent with those provided by both SHAP and LIME.

In Table II, we summarize the XAI techniques used. We also added the execution time and stability for each technique when repeatedly applied to the same dataset. Notably, random data perturbations lead to minor instabilities in both PFI and LIME, especially for less significant features. Furthermore, PFI's outcomes are sensitive to hyperparameter adjustments. For instance, changing the number of permutations per feature from the default 30 to 40 causes a change in the order of "Age" and "Grade". Although XAI techniques are designed to boost user confidence in black-box models, variations in their explanations may introduce confusion about both the model and the explanation technique. This highlights the need for a balance in XAI methods, ensuring both stability and convergence in explanations, and also calls for efficient evaluation metrics for XAI.

IV. CONCLUSION

In this paper, we have conducted a comprehensive review of XAI techniques, with a particular emphasis on model-agnostic approaches. We enriched our review with case study experiments, using a healthcare dataset, which was classified using a Random Forest model. Our experimental results highlighted the benefits of the six evaluated techniques in revealing different interpretability facets of the black-box model in question. Furthermore, our findings also indicated potential disparities in the outcomes of these techniques, particularly regarding less impactful features, in both local and global explanation contexts. Such divergence underscores the importance of aligning XAI techniques that target the same explanatory dimension to avoid misunderstandings and confusion among users.

In our future work, we aim to delve deeper into model-agnostic XAI techniques, with a particular focus on rule-based methodologies as well as abductive and fuzzy logic. Additionally, we aspire to collaborate with experts and practitioners in the healthcare domain to develop useful and interpretable models.
TABLE II
XAI TECHNIQUES SUMMARY

Global Surrogate [14]. Principle: emulates the behavior of a complex model using an interpretable model. Output: an interpretable model. Scope: global. View: global / local instance. Correlated features: +. Stability: +. Execution time: 22.24.
PDP [13]. Principle: visualizes the variation in the model's prediction when varying the values of a specific feature. Output: plot showing the relationship between a feature's values and the model predictions. Scope: global. View: global per feature (all instances). Correlated features: -. Stability: +. Execution time: 12.90.
PFI [15]. Principle: measures the decrease in a model's performance when the values of a specific feature are randomly shuffled. Output: ranking indicating the influence of each feature on the predictions. Scope: global. View: global (all features and all instances). Correlated features: -. Stability: -. Execution time: 32.09.
ICE [25]. Principle: visualizes the prediction's variation for each instance across a particular feature's values. Output: curve of the prediction's variation per instance (all instances in one plot). Scope: local. View: global per feature (all instances individually). Correlated features: -. Stability: +. Execution time: 5.12.
LIME [11]. Principle: approximates a black-box model locally using an interpretable model. Output: feature contributions to a decision. Scope: local. View: local per instance. Correlated features: +. Stability: -. Execution time: 0.83.
SHAP [26]. Principle: calculates the feature importance locally or globally using the Shapley value. Output: feature contributions to a decision. Scope: local and global. View: local per instance and global (all features and all instances). Correlated features: +. Stability: +. Execution time: 0.83.

REFERENCES

[1] Benítez, J. M., Castro, J. L., and Requena, I. (1997). Are artificial neural networks black boxes? IEEE Transactions on Neural Networks, 8(5), 1156-1164.
[2] Shortliffe, E. H., Davis, R., Axline, S. G., Buchanan, B. G., Green, C. C., and Cohen, S. N. (1975). Computer-based consultations in clinical therapeutics: explanation and rule acquisition capabilities of the MYCIN system. Computers and Biomedical Research, 8(4), 303-320.
[3] Shortliffe, E. H. (1977). MYCIN: A knowledge-based computer program applied to infectious diseases. In Proceedings of the Annual Symposium on Computer Application in Medical Care (p. 66). American Medical Informatics Association.
[4] Feigenbaum, E. A., Buchanan, B. G., and Lederberg, J. (1971). On Generality and Problem Solving: A Case Study Using the DENDRAL Program. Machine Intelligence 5.
[5] Feigenbaum, E. A. (1979). Themes and case studies of knowledge engineering. Expert Systems in the Micro-Electronic Age, 3-25.
[6] Confalonieri, R., Coba, L., Wagner, B., and Besold, T. R. (2021). A historical perspective of explainable Artificial Intelligence. Wiley Interdisciplinary Reviews: Data Mining and Knowledge Discovery, 11(1), e1391.
[7] DARPA XAI Literature Review.
[8] Ginsberg, M. L. (1986). Counterfactuals. Artificial Intelligence, 30(1), 35-79.
[9] Samek, W., Montavon, G., Vedaldi, A., Hansen, L. K., and Müller, K. R. (Eds.). (2019). Explainable AI: Interpreting, Explaining and Visualizing Deep Learning (Vol. 11700). Springer Nature.
[10] Meske, C., Bunde, E., Schneider, J., and Gersch, M. (2022). Explainable artificial intelligence: objectives, stakeholders, and future research opportunities. Information Systems Management, 39(1), 53-63.
[11] Ribeiro, M. T., Singh, S., and Guestrin, C. (2016). "Why should I trust you?" Explaining the predictions of any classifier. In Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (pp. 1135-1144).
[12] Apley, D. W., and Zhu, J. (2020). Visualizing the effects of predictor variables in black box supervised learning models. Journal of the Royal Statistical Society Series B: Statistical Methodology, 82(4), 1059-1086.
[13] Friedman, J. H. (2001). Greedy function approximation: a gradient boosting machine. Annals of Statistics, 1189-1232.
[14] Craven, M., and Shavlik, J. (1995). Extracting tree-structured representations of trained networks. Advances in Neural Information Processing Systems, 8.
[15] Breiman, L. (2001). Random forests. Machine Learning, 45, 5-32.
[16] Stepin, I., Alonso, J. M., Catala, A., and Pereira-Fariña, M. (2021). A survey of contrastive and counterfactual explanation generation methods for explainable artificial intelligence. IEEE Access, 9, 11974-12001.
[17] Molnar, C. (2023). Interpretable Machine Learning: A Guide for Making Black Box Models Explainable. Second edition.
[18] Kim, B., Khanna, R., and Koyejo, O. O. (2016). Examples are not enough, learn to criticize! Criticism for interpretability. Advances in Neural Information Processing Systems, 29.
[19] Alonso, J. M., Castiello, C., and Mencar, C. (2018). A bibliometric analysis of the explainable artificial intelligence research field. In International Conference on Information Processing and Management of Uncertainty in Knowledge-Based Systems (pp. 3-15). Cham: Springer International Publishing.
[20] Castro, J. L., Mantas, C. J., and Benítez, J. M. (2002). Interpretation of artificial neural networks by means of fuzzy rules. IEEE Transactions on Neural Networks, 13(1), 101-116.
[21] https://seer.cancer.gov/data/
[22] https://scikit-learn.org/stable/modules/permutation_importance.html
[23] Greenwell, B. M., Boehmke, B. C., and McCarthy, A. J. (2018). A simple and effective model-based variable importance measure. arXiv preprint arXiv:1805.04755.
[24] Zhou, Z. H., and Jiang, Y. (2003). Medical diagnosis with C4.5 rule preceded by artificial neural network ensemble. IEEE Transactions on Information Technology in Biomedicine, 7(1), 37-42.
[25] Goldstein, A., Kapelner, A., Bleich, J., and Pitkin, E. (2015). Peeking inside the black box: Visualizing statistical learning with plots of individual conditional expectation. Journal of Computational and Graphical Statistics, 24(1), 44-65.
[26] Lundberg, S. M., and Lee, S. I. (2017). A unified approach to interpreting model predictions. Advances in Neural Information Processing Systems, 30.
[27] Shapley, L. S. (1953). A value for n-person games. Contributions to the Theory of Games, 2(28), 307-317.
[28] Štrumbelj, E., and Kononenko, I. (2014). Explaining prediction models and individual predictions with feature contributions. Knowledge and Information Systems, 41, 647-665.
[29] Lundberg, S. M., Erion, G., Chen, H., DeGrave, A., Prutkin, J. M., Nair, B., ... and Lee, S. I. (2020). From local explanations to global understanding with explainable AI for trees. Nature Machine Intelligence, 2(1), 56-67.
