Explainable AI for Software Engineering
Abstract—The success of software engineering projects largely depends on complex decision-making. For example, which tasks should a developer do first, who should perform this task, is the software of high quality, is a software system reliable and resilient enough to deploy, etc. However, erroneous decision-making for these complex questions is costly in terms of money and reputation. Thus, Artificial Intelligence/Machine Learning (AI/ML) techniques have been widely used in software engineering for developing software analytics tools and techniques to improve decision-making, developer productivity, and software quality. However, the predictions of such AI/ML models for software engineering are still not practical (i.e., coarse-grained), not explainable, and not actionable. These concerns often hinder the adoption of AI/ML models in software engineering practices. In addition, many recent studies still focus on improving accuracy, while few focus on improving explainability. Are we moving in the right direction? How can we better serve the SE community (both research and education)? In this tutorial, we first provide a concise yet essential introduction to the most important aspects of Explainable AI and a hands-on tutorial of Explainable AI tools and techniques. Then, we introduce the fundamental knowledge of defect prediction (an example application of AI for Software Engineering). Finally, we demonstrate three successful case studies on how Explainable AI techniques can be used to address the aforementioned challenges by making the predictions of software defect prediction models more practical, explainable, and actionable. The materials are available at https://xai4se.github.io.

Index Terms—Explainable AI, Software Engineering

I. TUTORIAL OUTLINE

Our 1.5-hour tutorial consists of three parts:

• Part 1 - Explainable AI: We first provide a concise yet essential introduction to the most important aspects of Explainable AI and a hands-on tutorial of Explainable AI tools and techniques.
• Part 2 - Defect Prediction Models: We introduce the fundamental knowledge of defect prediction (an example application of AI for Software Engineering).
• Part 3 - Explainable AI for Software Engineering: We demonstrate three successful case studies on how Explainable AI techniques can be used to address the aforementioned challenges by making the predictions of software defect prediction models more practical, explainable, and actionable.

Part 1: Introduction to Explainable AI

We first provide a concise yet essential introduction to the most important aspects of Explainable AI. At the beginning of the tutorial, we will introduce the motivation, definitions, and concept of Explainable AI. We will also discuss a theory of explanations (e.g., what are the explainability goals, what are the explanations, what are the types of intelligibility questions, what are the scopes of explanations, what are the types of explanations). We will also discuss the model-specific and model-agnostic tools and techniques that are extensively used in the Explainable AI community (e.g., LIME [9], SHAP [5]) for explaining the predictions of AI/ML models with a hands-on tutorial.

Part 2: Defect Prediction Models

In this part, we will first introduce the fundamental knowledge of defect prediction technologies. This includes five key steps: (1) Data Collection, (2) Data Preprocessing, (3) Model Construction, (4) Model Evaluation, and (5) Model Ranking, along with the following empirically-grounded guidelines [10]. We describe the details below.

(Step 1) Data Collection. Poor-quality or noisy defect datasets could lead to inaccurate predictions and insights. We found that techniques for generating ground-truth data are often not accurate, impacting the quality of defect datasets. Thus, we will discuss how to: consider affected releases (i.e., the actual software releases that are affected by a defect) to label whether a file is considered defective or clean, instead of assuming a post-release window period (i.e., counting any defects that are fixed within 6 months after release) [17], while not being concerned about issue report misclassification [12].

(Step 2) Data Analysis. Defect datasets are highly imbalanced, with a defective ratio of <10%. Defect models trained on class-imbalanced datasets often produce inaccurate predictions. Our TSE'19 paper suggested considering an optimised SMOTE to improve predictive accuracy, i.e., handling the class imbalance of the training datasets prior to training the defect models. Thus, we will discuss how to:

• Handle collinearity and multicollinearity when interpreting defect models (i.e., understanding what the most important variables are) [2].
• Use AutoSpearman for handling collinearity and multicollinearity [3], and avoid using existing automated feature selection techniques (e.g., Stepwise Regression) if the goal is to interpret defect prediction models, as they fail to mitigate correlated features and are dependent on random seeds [4].

(Step 3) Model Construction. There exists a large number of off-the-shelf classification techniques, each with a large number of possible combinations of hyperparameter settings that can be configured. Sadly, practitioners often ask which techniques and which settings should be used. Thus, we will discuss how to:
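As a taste of the hands-on portion, the local-surrogate idea behind model-agnostic explainers such as LIME can be sketched without the library itself. This is a minimal illustration of the concept only: the synthetic data, kernel width, perturbation scale, and Ridge surrogate below are our illustrative choices, not LIME's actual defaults.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import Ridge

# Train a "black-box" classifier on synthetic data standing in for defect data.
X, y = make_classification(n_samples=500, n_features=5, random_state=0)
model = RandomForestClassifier(random_state=0).fit(X, y)

def lime_style_explanation(instance, n_samples=1000, width=1.0):
    """Fit a locally-weighted linear surrogate around one instance."""
    rng = np.random.default_rng(0)
    # 1) Perturb the instance to probe the model's local behaviour.
    perturbed = instance + rng.normal(scale=0.5, size=(n_samples, len(instance)))
    probs = model.predict_proba(perturbed)[:, 1]
    # 2) Weight perturbations by proximity to the instance (RBF kernel).
    dists = np.linalg.norm(perturbed - instance, axis=1)
    weights = np.exp(-(dists ** 2) / (width ** 2))
    # 3) The surrogate's coefficients are the local feature importances.
    surrogate = Ridge(alpha=1.0).fit(perturbed, probs, sample_weight=weights)
    return surrogate.coef_

coefs = lime_style_explanation(X[0])
print(len(coefs))  # one local importance score per feature
```

In the tutorial itself, the LIME and SHAP libraries provide this kind of explanation (and much more) out of the box.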
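The affected-release labelling rule from Step 1 can be illustrated with a toy sketch. The defect reports, file names, and release identifiers below are hypothetical, not from a real defect dataset:

```python
import pandas as pd

# Hypothetical defect reports with the releases they actually affect.
defects = pd.DataFrame({
    "defect_id": [1, 2],
    "file": ["A.java", "B.java"],
    "affected_releases": [["1.0", "1.1"], ["1.1"]],
})
files = pd.DataFrame({"file": ["A.java", "B.java", "C.java"]})

def label_release(files, defects, release):
    """Label a file defective iff some defect's affected releases include
    this release (instead of a fixed post-release window heuristic)."""
    buggy = set(
        defects.loc[
            defects["affected_releases"].apply(lambda r: release in r), "file"
        ]
    )
    out = files.copy()
    out["defective"] = out["file"].isin(buggy)
    return out

labels = label_release(files, defects, "1.0")
print(labels["defective"].tolist())  # [True, False, False]
```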
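The "optimised SMOTE" recommended in Step 2 refers to automatically tuning SMOTE's parameters; the core interpolation step of plain SMOTE looks roughly as follows. This is a bare-bones sketch for intuition (in practice one would use a library implementation such as imbalanced-learn's SMOTE):

```python
import numpy as np

def smote_oversample(X_min, n_new, k=5, seed=0):
    """Minimal SMOTE: synthesise minority-class samples by interpolating
    between a minority sample and one of its k nearest minority neighbours."""
    rng = np.random.default_rng(seed)
    new_rows = []
    for _ in range(n_new):
        i = rng.integers(len(X_min))
        # Distances from sample i to all other minority samples.
        d = np.linalg.norm(X_min - X_min[i], axis=1)
        neighbours = np.argsort(d)[1:k + 1]        # skip the sample itself
        j = rng.choice(neighbours)
        gap = rng.random()                          # interpolation factor in [0, 1)
        new_rows.append(X_min[i] + gap * (X_min[j] - X_min[i]))
    return np.array(new_rows)

# e.g., 20 defective files vs. 180 clean ones (~10% defective ratio)
X_min = np.random.default_rng(1).normal(size=(20, 4))
X_new = smote_oversample(X_min, n_new=160)
print(X_new.shape)  # (160, 4): appending these balances the two classes
```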
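A simplified sketch of correlation-based feature filtering in the spirit of AutoSpearman [3]: greedily drop one feature from each highly-correlated pair. Note this is only the collinearity half of the idea; AutoSpearman proper also applies a VIF step for multicollinearity, and the 0.7 threshold and metric names here are illustrative choices, not the paper's exact settings.

```python
import numpy as np
import pandas as pd

def drop_collinear(df, threshold=0.7):
    """Keep a subset of features with pairwise |Spearman rho| <= threshold."""
    rho = df.corr(method="spearman").abs()
    keep = list(df.columns)
    for a in df.columns:
        for b in df.columns:
            if a in keep and b in keep and a != b and rho.loc[a, b] > threshold:
                keep.remove(b)  # greedily drop the later of the correlated pair
    return df[keep]

# Hypothetical software metrics: loc_log is a monotone transform of loc,
# so their Spearman correlation is exactly 1 and one of them is dropped.
rng = np.random.default_rng(0)
loc = rng.random(200)
df = pd.DataFrame({"loc": loc, "loc_log": np.log1p(loc), "churn": rng.random(200)})
filtered = drop_collinear(df)
print(list(filtered.columns))  # ['loc', 'churn']
```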
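The technique-and-settings question raised in Step 3 is commonly explored with an off-the-shelf hyperparameter search. Below is a generic scikit-learn sketch; the classifier, the small grid, and AUC scoring are illustrative, not the tutorial's recommended configuration:

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import GridSearchCV

# Synthetic imbalanced data (~10% minority), mimicking a defect dataset.
X, y = make_classification(n_samples=300, n_features=8,
                           weights=[0.9], random_state=0)

# Exhaustive search over a small grid of hyperparameter settings,
# evaluated with cross-validated AUC (a common defect-model metric).
search = GridSearchCV(
    RandomForestClassifier(random_state=0),
    param_grid={"n_estimators": [50, 100], "max_depth": [3, None]},
    scoring="roc_auc",
    cv=5,
)
search.fit(X, y)
print(search.best_params_)  # the best-performing setting on this data
```

In practice, randomized or model-based search scales better than an exhaustive grid as the number of hyperparameters grows.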