Pradeepta Mishra
Explainable AI Recipes
Implement Solutions to Model Explainability and
Interpretability with Python
Pradeepta Mishra
Bangalore, Karnataka, India
Apress Standard
The publisher, the authors, and the editors are safe to assume that the
advice and information in this book are believed to be true and accurate
at the date of publication. Neither the publisher nor the authors or the
editors give a warranty, expressed or implied, with respect to the
material contained herein or for any errors or omissions that may have
been made. The publisher remains neutral with regard to jurisdictional
claims in published maps and institutional affiliations.
Industries in which artificial intelligence has been applied include banking, financial services, insurance, healthcare, manufacturing, retail, and pharmaceuticals. Some of these industries have regulatory requirements that mandate model explainability. Artificial intelligence involves tasks such as classifying objects, recognizing objects to detect fraud, and so forth. Every learning system requires three things: input data, processing, and an output. If the performance of a learning system improves over time by learning from new examples or data, it is called a machine learning system. When the number of features or the volume of data for a machine learning task increases, applying classical machine learning techniques becomes time-consuming. That's when deep learning techniques are used.
Figure 1-1 represents the relationships between artificial intelligence, machine learning, and deep
learning.
After preprocessing and feature creation, you can end up with hundreds of thousands of features that need to be computed to produce the output. If we train a supervised machine learning model, it will take significant time to produce the model object. To achieve scalability in this task, we need deep learning algorithms, such as a recurrent neural network. This is how artificial intelligence is connected to deep learning and machine learning.
In the classical predictive modeling scenario, a function is identified first, and the input data is fit to that predetermined function to produce the output. In the modern predictive modeling scenario, the input data and the output are both shown to a group of functions, and the machine identifies the function that best approximates the output for a given set of inputs. There is a need to explain the output of machine learning and deep learning models when performing regression- and classification-related tasks. These are the reasons why explainability is required:
Trust: To gain users’ trust on the predicted output
Reliability: To make the user rely on the predicted output
Regulatory: To meet regulatory and compliance requirements
Adoption: To increase AI adoption among the users
Fairness: To remove any kind of discrimination in prediction
Accountability: To establish ownership of the predictions
There are various ways that explainability can be achieved: using statistical properties, probabilistic properties and associations, and causality among the features. Broadly, model explanations can be classified into two categories, global explanations and local explanations. The objective of a local explanation is to understand the inference generated for one sample at a time by comparing it with the nearest possible data points; a global explanation provides an idea of the overall model behavior.
The goal of this chapter is to introduce how to install various explainability libraries and interpret the
results generated by those explainability libraries.
Solution
The solution to this problem is to use the simple pip or conda option.
How It Works
Let's take a look at the following script examples. The SHAP Python library is based on a game-theoretic approach and can produce both local and global explanations.
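You can use either of the following commands:

pip install shap

or, with Anaconda:

conda install -c conda-forge shap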
Solution
You can install the LIME library using pip or conda.
How It Works
Let’s take a look at the following example script:
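pip install lime

or, with Anaconda:

conda install -c conda-forge lime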
Solution
If you want to use a combination of functions from both the LIME library and the SHAP library, then you can
use the SHAPASH library. You just have to install it, which is simple.
How It Works
Let’s take a look at the following code to install SHAPASH. This is not available on the Anaconda distribution;
the only way to install it is by using pip.
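pip install shapash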
Solution
Since this is a Python library, you can use pip.
How It Works
Let’s take a look at the following script:
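Assuming this recipe installs the ELI5 library (the library name is not preserved here; ELI5 is the next library in this book's sequence), the command would be:

pip install eli5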
Solution
Skater is an open-source framework to enable model interpretation for various kinds of machine learning
models. The Python-based Skater library provides both global and local interpretations and can be installed
using pip.
How It Works
Let’s take a look at the following script:
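pip install skater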
Solution
Skope-rules offers a trade-off between the interpretability of a decision tree and the modeling power of a
random forest model. The solution is simple; you use the pip command.
How It Works
Let’s take a look at the following code:
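pip install skope-rules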
Solution
The choice of explainability method depends on who consumes the model output. If the consumer is the business or senior management, the explanation should be simple and in plain English, without any mathematical formulas; if the consumers are data scientists and machine learning engineers, the explanations may include the mathematical formulas.
How It Works
The levels of transparency of the machine learning models can be categorized into three buckets, as shown
in Figure 1-2.
Textual explanations require explaining the mathematical formula in plain English, which can help
business users or senior management. The interpretations can be designed based on model type and model
variant and can draw inferences from the model outcome. A template to draw inferences can be designed
and mapped to the model types, and then the templates can be filled in using some natural language
processing methods.
A visual explainability method can be used to generate charts, graphs such as dendrograms, or any other type of graph that best explains the relationships. Tree-based methods use if-else conditions on the back end; hence, it is simple to show the causality and the relationships.
Using common examples and business scenarios from day-to-day operations and drawing parallels
between them can also be useful.
Which method you should choose depends on the problem that needs to be solved and the consumer of
the solution where the machine learning model is being used.
Conclusion
In various AI projects and initiatives, machine learning models generate predictions. Usually, a detailed explanation is required to trust a model's outcomes. Since many people are not comfortable explaining machine learning model outcomes, they cannot reason about the decisions of a model, and thereby AI adoption is restricted. Explainability is required from a regulatory standpoint as well as from an auditing and compliance point of view. In high-risk use cases, such as medical imaging, object detection or pattern recognition, and financial prediction and fraud detection, explainability is required to explain the decisions of the machine learning model.
In this chapter, we set up the environment by installing various explainable AI libraries. Machine
learning model interpretability and explainability are the key focuses of this book. We are going to use
Python-based libraries, frameworks, methods, classes, and functions to explain the models.
In the next chapter, we are going to look at the linear models.
© The Author(s), under exclusive license to APress Media, LLC, part of Springer Nature 2023
P. Mishra, Explainable AI Recipes
https://doi.org/10.1007/978-1-4842-9029-3_2
A supervised learning model is a model that is used to train an algorithm to map input data to output data. A
supervised learning model can be of two types: regression or classification. In a regression scenario, the
output variable is numerical, whereas with classification, the output variable is binary or multinomial. A
binary output variable has two outcomes, such as true and false, accept and reject, yes and no, etc. In the
case of multinomial output variables, there can be more than two outcomes, such as high, medium, and low. In this chapter, we are going to use explainability libraries to explain a regression model and a classification model while training linear models.
In the classical predictive modeling scenario, a function has been identified, and the input data is usually
fit to the function to produce the output, where the function is usually predetermined. In a modern
predictive modeling scenario, the input data and output are both shown to a group of functions, and the
machine identifies the best function that approximates well to the output given a particular set of input.
There is a need to explain the output of machine learning and deep learning models when performing
regression and classification tasks. Linear regression and linear classification models are simpler to explain.
The goal of this chapter is to introduce various explainability techniques for linear models, such as feature importance, partial dependency plots, and local interpretation.
Recipe 2-1. SHAP Values for a Regression Model on All Numerical Input
Variables
Problem
You want to explain a regression model built on all the numeric features of a dataset.
Solution
A regression model is trained on all the numeric features, and then the trained model is passed through SHAP to generate global and local explanations.
How It Works
Let's take a look at the following script. The Shapley value, in this context, is called the SHAP value. It is used to explain the model: drawing on cooperative game theory, it fairly attributes the model's prediction to the input features. The input features from the dataset are considered the players in the game, and the model's prediction function is considered the rules of the game. The Shapley value of a feature is computed based on the following steps:
1. SHAP requires retraining the model on all feature subsets; hence, it usually takes time when the explanation has to be generated for larger datasets.
2. Identify a feature subset from the list of features (say there are 15 features; we can select a subset with 5 features).
3. For any particular feature, two models are created using the subset of features, one with the feature and another without it.
4. The difference between the two models' predictions is then computed.
5. These prediction differences are computed for all possible subsets of features.
6. The weighted average of all possible differences is used to populate the feature importance, as formalized below.
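Formally, the weighted average in step 6 is the Shapley value $\phi_i$ of feature $i$; for the full feature set $N$ and model value function $f$,

$$\phi_i = \sum_{S \subseteq N \setminus \{i\}} \frac{|S|!\,(|N| - |S| - 1)!}{|N|!}\,\bigl[f(S \cup \{i\}) - f(S)\bigr]$$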
If the weight of a feature is 0.000, then we can conclude that the feature is not important and does not contribute to the model. If it is not equal to 0.000, then the feature has a role to play in the prediction process.
We are going to use a dataset from the UCI machine learning repository. The URL to access the dataset is
as follows:
https://archive.ics.uci.edu/ml/datasets/Appliances+energy+prediction
The objective is to predict the appliances’ energy use in Wh, using the features from sensors. There are
27 features in the dataset, and here we are trying to understand what features are important in predicting
the energy usage. See Table 2-1.
import pandas as pd
import shap
from sklearn.linear_model import LinearRegression

# Assumed setup: load the UCI appliances energy data (energydata_complete.csv),
# with y = appliance energy use (Wh) and X = the 27 sensor features.
df = pd.read_csv("energydata_complete.csv")
X, y = df.drop(columns=["date", "Appliances"]), df["Appliances"]
model = LinearRegression().fit(X, y)

print("Model coefficients:\n")
for i in range(X.shape[1]):
    print(X.columns[i], "=", model.coef_[i].round(5))
Model coefficients:
lights = 1.98971
T1 = -0.60374
RH_1 = 15.15362
T2 = -17.70602
RH_2 = -13.48062
T3 = 25.4064
RH_3 = 4.92457
T4 = -3.46525
RH_4 = -0.17891
T5 = -0.02784
RH_5 = 0.14096
T6 = 7.12616
RH_6 = 0.28795
T7 = 1.79463
RH_7 = -1.54968
T8 = 8.14656
RH_8 = -4.66968
T9 = -15.87243
RH_9 = -0.90102
T_out = -10.22819
Press_mm_hg = 0.13986
RH_out = -1.06375
Windspeed = 1.70364
Visibility = 0.15368
Tdewpoint = 5.0488
rv1 = -0.02078
rv2 = -0.02078
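A minimal sketch of computing the shap_values inspected next, assuming the model and X defined earlier:

import numpy as np

# With a callable, shap.Explainer falls back to a model-agnostic
# (permutation) algorithm to compute the SHAP values.
explainer = shap.Explainer(model.predict, X)
shap_values = explainer(X)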
pd.DataFrame(np.round(shap_values.values,3)).head(3)
pd.DataFrame(np.round(shap_values.data,3)).head(3)
Solution
The solution to this problem is to use the partial dependency method (partial_dependence_plot) from the SHAP library.
How It Works
Let’s take a look at the following example. There are two ways to get the partial dependency plot, one with a
particular data point superimposed and the other without any reference to the data point. See Figure 2-1.
# make a standard partial dependence plot for lights on the predicted output
# for row number 20 from the training dataset
sample_ind = 20
shap.partial_dependence_plot(
    "lights", model.predict, X, model_expected_value=True,
    feature_expected_value=True, ice=False,
    shap_values=shap_values[sample_ind:sample_ind+1,:]
)
Figure 2-1 Correlation between feature light and predicted output of the model
The partial dependency plot is a way to explain individual predictions and generate local interpretations for a sample selected from the dataset; in this case, the 20th record is selected from the training dataset. Figure 2-1 shows the partial dependency superimposed with the 20th record in red.
shap.partial_dependence_plot(
    "lights", model.predict, X, ice=False,
    model_expected_value=True, feature_expected_value=True
)
Figure 2-2 Partial dependency plot between lights and predicted outcome from the model
The local interpretation for record number 20 from the training dataset is displayed in Figure 2-3. The predicted output for the 20th record is 140 Wh. The most influential features impacting the 20th record are RH_1, the humidity in the kitchen area in percentage, and RH_2, the humidity in the living room area. At the bottom of Figure 2-3 are 14 features that are not very important for the 20th record's predicted value.
X[20:21]
model.predict(X[20:21])
array([140.26911466])
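Figure 2-3 can be produced with a waterfall plot; a minimal sketch, assuming the shap_values computed earlier:

# Local explanation for the 20th record; less influential features
# are collapsed at the bottom of the plot.
shap.plots.waterfall(shap_values[sample_ind])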
Recipe 2-3. SHAP Feature Importance for Regression Model with All
Numerical Input Variables
Problem
You want to calculate the feature importance using the SHAP values.
Solution
The solution to this problem is to use the absolute SHAP values from the model, averaged per feature.
How It Works
Let's take a look at the following example. SHAP values can be used to show the global importance of features. Important features are those that have a larger influence on predicting the output.
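One way to construct the shap_importance frame printed next, assuming mean absolute SHAP values as the importance score:

import numpy as np

# Rank features by the mean of |SHAP value| across all rows.
vals = np.abs(shap_values.values).mean(axis=0)
shap_importance = pd.DataFrame(
    {"col_name": X.columns, "feature_importance_vals": vals}
).sort_values(by="feature_importance_vals", ascending=False)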
print(shap_importance)
col_name feature_importance_vals
2 RH_1 49.530061
19 T_out 43.828847
4 RH_2 42.911069
5 T3 41.671587
11 T6 34.653893
3 T2 31.097282
17 T9 26.607721
16 RH_8 19.920029
24 Tdewpoint 17.443688
21 RH_out 13.044643
6 RH_3 13.042064
15 T8 12.803450
0 lights 11.907603
12 RH_6 7.806188
14 RH_7 6.578015
7 T4 5.866801
22 Windspeed 3.361895
13 T7 3.182072
18 RH_9 3.041144
23 Visibility 1.385616
10 RH_5 0.855398
20 Press_mm_hg 0.823456
1 T1 0.765753
8 RH_4 0.642723
25 rv1 0.260885
26 rv2 0.260885
9 T5 0.041905
The feature importance values are not scaled; hence, the values across all features will not total 100.
The beeswarm chart in Figure 2-4 shows the impact of SHAP values on the model output. A blue dot shows a low feature value, and a red dot shows a high feature value. Each dot indicates one data point from the dataset; the beeswarm plot thus shows the distribution of feature values against the SHAP values.
shap.plots.beeswarm(shap_values)
Recipe 2-4. SHAP Values for a Regression Model on All Mixed Input Variables
Problem
How do you estimate SHAP values when you introduce categorical variables along with the numerical variables, that is, a mixed set of input features?
Solution
The solution is that mixed input variables, with numeric features as well as categorical or binary features, can be modeled together. As the number of features increases, the time to compute all the permutations also increases.
How It Works
We are going to use an automobile public dataset with some modifications. The objective is to predict the
price of a vehicle given the features such as make, location, age, etc. It is a regression problem that we are
going to solve using a mix of numeric and categorical features.
df = pd.read_csv('https://raw.githubusercontent.com/pradmishra1/PublicDatasets/main
df.head(3)
df.columns
Index(['Price', 'Make', 'Location', 'Age', 'Odometer', 'FuelType', 'Transmission',
       'OwnerType', 'Mileage', 'EngineCC', 'PowerBhp'], dtype='object')
We cannot use string-based or categorical features in the model directly, as matrix multiplication is not possible on string features; hence, the string-based features need to be transformed into dummy variables, that is, binary features with 0 and 1 flags. The transformation step is skipped here because many data scientists already know how to do this data transformation (a sketch follows). We are importing another, already transformed dataset directly.
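For reference, one plausible version of the skipped step, assuming pandas get_dummies and the column names above:

# Hypothetical sketch: expand the categorical columns into 0/1 dummies
# ('Make' appears to have been dropped in the transformed dataset).
df_t = pd.get_dummies(
    df.drop(columns=["Make"]),
    columns=["Location", "FuelType", "Transmission", "OwnerType"],
    drop_first=True,
)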
df_t = pd.read_csv('https://raw.githubusercontent.com/pradmishra1/PublicDatasets/main
del df_t['Unnamed: 0']
df_t.head(3)
df_t.columns
Index(['Price', 'Age', 'Odometer', 'mileage', 'engineCC', 'powerBhp',
       'Location_Bangalore', 'Location_Chennai', 'Location_Coimbatore',
       'Location_Delhi', 'Location_Hyderabad', 'Location_Jaipur',
       'Location_Kochi', 'Location_Kolkata', 'Location_Mumbai', 'Location_Pune',
       'FuelType_Diesel', 'FuelType_Electric', 'FuelType_LPG', 'FuelType_Petrol',
       'Transmission_Manual', 'OwnerType_Fourth & Above', 'OwnerType_Second',
       'OwnerType_Third'], dtype='object')
import pandas as pd
import shap
from sklearn.linear_model import LinearRegression

# Assumed setup: Price is the target; all remaining dummy-encoded
# columns of df_t are the features.
X, y = df_t.drop(columns=["Price"]), df_t["Price"]
model = LinearRegression().fit(X, y)

print("Model coefficients:\n")
for i in range(X.shape[1]):
    print(X.columns[i], "=", model.coef_[i].round(5))
Model coefficients:
Age = -0.92281
Odometer = 0.0
mileage = -0.07923
engineCC = -4e-05
powerBhp = 0.1356
Location_Bangalore = 2.00658
Location_Chennai = 0.94944
Location_Coimbatore = 2.23592
Location_Delhi = -0.29837
Location_Hyderabad = 1.8771
Location_Jaipur = 0.8738
Location_Kochi = 0.03311
Location_Kolkata = -0.86024
Location_Mumbai = -0.81593
Location_Pune = 0.33843
FuelType_Diesel = -1.2545
FuelType_Electric = 7.03139
FuelType_LPG = 0.79077
FuelType_Petrol = -2.8691
Transmission_Manual = -2.92415
OwnerType_Fourth & Above = 1.7104
OwnerType_Second = -0.55923
OwnerType_Third = 0.76687
To compute the SHAP values, we can use the explainer function with the training dataset X and model
predict function. The SHAP value calculation happens using a permutation approach; it took 5 minutes.
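A minimal sketch of that call, assuming the model and X defined above:

# Permutation-based SHAP values; this can take several minutes here.
explainer = shap.Explainer(model.predict, X)
shap_values = explainer(X)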
import numpy as np
pd.DataFrame(np.round(shap_values.values,3)).head(3)
0
0 11.933
1 11.933
2 11.933
pd.DataFrame(np.round(shap_values.data,3)).head(3)
Recipe 2-5. SHAP Partial Dependency Plot for Regression Model for Mixed
Input
Problem
You want to plot the partial dependency plot and interpret the graph for numeric and categorical dummy
variables.
Solution
The partial dependency plot shows the relationship between a feature and the predicted output of the target variable. There are two ways we can showcase the results: one plots the feature against the expected value of the prediction function, and the other superimposes a data point on the partial dependency plot.
How It Works
Let’s take a look at the following example (see Figure 2-5):
shap.partial_dependence_plot(
    "powerBhp", model.predict, X, ice=False,
    model_expected_value=True, feature_expected_value=True
)