
Visualizing Model Performance, Evidence and Probabilities
Ranking Instead of Classifying

Evaluating classifiers plays an important role in understanding and applying data science concepts in the business world. A classifier is a tool used in machine learning to group items or data into categories based on features learned from training data.
Class confusion is a situation where the classifier incorrectly identifies a class. For example, when classifying email as "spam" or "non-spam", class confusion occurs when emails that are not actually spam are misidentified as spam, and vice versa.
A confusion matrix is a table that describes the performance of a classifier by separating correct and incorrect decisions.

An n × n matrix with columns labeled with actual classes and rows labeled with
predicted classes.
Different symbols are usually used to distinguish the actual class from the class predicted by the model.
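As a minimal illustration (assuming Python with scikit-learn; the email labels below are made up), a confusion matrix can be computed from actual and predicted classes. Note that scikit-learn places actual classes on the rows and predicted classes on the columns, the transpose of the convention described above:

```python
from sklearn.metrics import confusion_matrix

# Hypothetical actual and predicted labels for a spam classifier
y_actual    = ["spam", "spam", "non-spam", "non-spam", "spam", "non-spam"]
y_predicted = ["spam", "non-spam", "non-spam", "spam", "spam", "non-spam"]

# Rows are actual classes, columns are predicted classes (scikit-learn convention)
cm = confusion_matrix(y_actual, y_predicted, labels=["spam", "non-spam"])
print(cm)
# [[2 1]   <- actually spam: 2 predicted spam, 1 predicted non-spam
#  [1 2]]  <- actually non-spam: 1 predicted spam, 2 predicted non-spam
```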
Unbalanced classes often occur in business applications where classifiers are used to discover unusual entities in large populations. Examples include looking for defrauded customers, checking for defective parts on an assembly line, or targeting consumers who will actually respond to an offer.

The problem arises when a rare class becomes highly unbalanced in the
distribution, for example, only appearing in 1 out of every 1000
examples. In situations like this, the use of accuracy as an evaluation
metric becomes irrelevant. In fact, if we always chose the most common
class (in this case, appearing 999 out of 1000 examples), we would get
an accuracy rate of 99.9%, but this is not useful because the model will
not successfully identify rare classes.
Accuracy is the simplest and most popular measure for
evaluating classifier performance. It measures the proportion
of correct predictions out of all predictions made. Even though
it is easy to understand, accuracy has shortcomings, especially
in unbalanced data (imbalanced datasets) where the number of
examples from one class is much greater than from other
classes.
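In terms of the confusion-matrix counts, this can be written as:

accuracy = (TP + TN) / (TP + TN + FP + FN)

where TP, TN, FP, and FN are the numbers of true positives, true negatives, false positives, and false negatives.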
The use of accuracy as a simple classification evaluation metric
does not differentiate between false positive errors and false
negative errors. Both types of errors are calculated together,
with the implicit assumption that both errors have the same
level of importance. In real-world domains, false positive and false negative errors have different consequences and often have different impacts.

Ideally, we should evaluate the costs or benefits of each decision made by the classifier. By combining all these costs and benefits, we can estimate the expected gain or loss from using the classifier.
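One common way of writing this (a sketch of the usual expected-value framework; b(·) denotes the benefit or cost attached to each cell of the confusion matrix) is:

Expected profit = p(Y,p)·b(Y,p) + p(N,p)·b(N,p) + p(N,n)·b(N,n) + p(Y,n)·b(Y,n)

where p(Y,p) is the estimated probability that a case falls into the (predicted Y, actually p) cell, and so on for the other cells.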
Profit Curves
ROC Graphs and Curves
When the threshold is lowered, cases move from row N to row Y of the confusion matrix: a case previously classified as negative is now classified as positive, so the cell counts change.

Which count changes depends on the original class of the example:
If the case is actually positive (in the "p" column), it moves up and becomes a true positive (Y,p).
If the case is actually negative (n), it becomes a false positive (Y,n).
Technically, each different threshold produces a different classifier, represented by its own confusion matrix.
ROC Graphs and Curves
Generating a ROC curve:
Algorithm
• Sort the test set by the model predictions
• Start with cutoff = max(prediction)
• Decrease the cutoff; after each step, count the number of true positives TP (positives with prediction above the cutoff) and false positives FP (negatives above the cutoff)
• Calculate the TP rate (TP/P) and the FP rate (FP/N)
• Plot the current TP/P as a function of the current FP/N (a sketch of this procedure follows below)
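A minimal Python sketch of this algorithm (the data and variable names are illustrative, not from the session; with real data one would typically use sklearn.metrics.roc_curve):

```python
import numpy as np

def roc_points(labels, scores):
    """Return (FP rate, TP rate) pairs obtained by sweeping the cutoff from high to low."""
    labels, scores = np.asarray(labels), np.asarray(scores)
    P = int((labels == 1).sum())      # total positives
    N = int((labels == 0).sum())      # total negatives

    order = np.argsort(-scores)       # sort the test set by the model predictions
    points = [(0.0, 0.0)]             # cutoff above max(prediction): nothing classified positive
    tp = fp = 0
    for i in order:
        # Lowering the cutoff just below this score moves the case from row N to row Y
        if labels[i] == 1:
            tp += 1                   # a positive moves up: true positive (Y,p)
        else:
            fp += 1                   # a negative moves up: false positive (Y,n)
        points.append((fp / N, tp / P))
    return points

# Illustrative usage with made-up scores
print(roc_points([1, 0, 1, 0, 0], [0.9, 0.8, 0.6, 0.4, 0.2]))
```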


ROC Graphs and Curves
• ROC graphs decouple classifier performance from the conditions under which the classifiers will be used
• ROC graphs are independent of the class proportions as well as of the costs and benefits
• Not the most intuitive visualization for many business stakeholders

Area Under the ROC Curve (AUC)
• The area under a classifier's curve, expressed as a fraction of the unit square
• Its value ranges from zero to one
• The AUC is useful when a single number is needed to summarize performance, or when nothing is known about the operating conditions
• A ROC curve provides more information than its area
• Equivalent to the Mann-Whitney-Wilcoxon measure
• Also equivalent to the Gini coefficient (with a minor algebraic transformation)
• Both are equivalent to the probability that a randomly chosen positive instance will be ranked ahead of a randomly chosen negative instance
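As an illustration of that last equivalence, the AUC can be computed directly from the ranking interpretation (a sketch with made-up scores; with real data one would typically call sklearn.metrics.roc_auc_score):

```python
import itertools

def auc_by_ranking(labels, scores):
    """AUC as the fraction of (positive, negative) pairs ranked correctly.
    Ties count as one half, following the Mann-Whitney-Wilcoxon statistic."""
    pos = [s for l, s in zip(labels, scores) if l == 1]
    neg = [s for l, s in zip(labels, scores) if l == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p, n in itertools.product(pos, neg))
    return wins / (len(pos) * len(neg))

print(auc_by_ranking([1, 0, 1, 0, 0], [0.9, 0.8, 0.6, 0.4, 0.2]))  # 5/6 ≈ 0.833
```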
Cumulative Response Curve
Lift Curve
Let's focus back in on actually mining the data…

Which model should TelCo select in order to target customers with a special offer, prior to contract expiration?
Performance Evaluation

Training Set:
Model                   Accuracy
Classification Tree     95%
Logistic Regression     93%
𝑘-Nearest Neighbors     100%
Naïve Bayes             76%

Test Set:
Model                   Accuracy       AUC
Classification Tree     91.8% ± 0.0    0.614 ± 0.014
Logistic Regression     93.0% ± 0.1    0.574 ± 0.023
𝑘-Nearest Neighbors     93.0% ± 0.0    0.537 ± 0.015
Naïve Bayes             76.5% ± 0.6    0.632 ± 0.019
Performance Evaluation

Naïve Bayes confusion matrix:
        p            n
Y    127 (3%)     848 (18%)
N    200 (4%)     3518 (75%)

𝑘-Nearest Neighbors confusion matrix:
        p            n
Y      3 (0%)       15 (0%)
N    324 (7%)     4351 (93%)
ROC Curve
Lift Curve
Profit Curves
Agenda 2
• Introduction
• Bayes' Rule
• Applying Bayes' rule to data science
• Naive Bayes
• Advantages and Disadvantages of Naive Bayes
• Example
Introductory example
• So far: using data to draw conclusions about some unknown quantity of a data instance
• Now: analyse data instances as evidence for or against different values of the target
• Example: target online display ads to consumers based on the webpages they have visited in the past
• Run a targeted campaign for, e.g., a luxury hotel
• Target variable: will the consumer book a hotel room within one week after having seen the advertisement?
• Cookies allow for observing which consumers book rooms
Introductory example
 (…)
 A consumer is characterized by the set of websites we
have observed her to have visited previously (cookies!)
 We assume that some of these websites are more likely to be visited by good
prospects for the luxury hotel
 Problem: we do not have the resources to estimate the evidence potential for
each site manually
 Idea: use historical data to estimate both the direction and the
strength of the evidence
 Combine the evidence to estimate the
resulting likelihood of class membership
 Similar problem: spam detection
Combining evidence probabilistically
• What is the probability that, if you show an ad to a customer, that customer will book a room, given some evidence 𝐸 (such as the websites visited by that particular customer)? → 𝑃(𝐶|𝐸)
• Problem: for any particular collection of evidence 𝐸, we may not have seen enough cases, or may not have seen it at all!
• Idea: consider the different pieces of evidence separately, and then combine the evidence
Reminder: statistical (in)dependence
• If the events A and B are statistically independent, then we can compute the probability that both A and B occur as

𝑝(𝐴𝐵) = 𝑝(𝐴) ∙ 𝑝(𝐵)

• Example: rolling a fair die
• The general formula for combining probabilities, which takes care of dependencies between events, is

𝑝(𝐴𝐵) = 𝑝(𝐴) ∙ 𝑝(𝐵|𝐴)

• Given that you know A, what is the probability of B?
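For example, two rolls of a fair die are independent: p(first roll is a 6 and second roll is a 6) = 1/6 ∙ 1/6 = 1/36. Knowing the first roll tells us nothing about the second, so p(B|A) = p(B) and the general formula reduces to the independent case.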


Agenda
• Introduction
• Bayes' Rule
• Applying Bayes' rule to data science
• Naive Bayes
• Advantages and Disadvantages of Naive Bayes
• Example
Bayes' rule (2/2)
• Bayes' rule says
• that we can compute the probability of our hypothesis 𝐻 given some evidence 𝐸 by instead looking at the probability of the evidence given the hypothesis, as well as the unconditional probability of the hypothesis and the evidence.
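Written out, Bayes' rule is:

𝑝(𝐻|𝐸) = 𝑝(𝐸|𝐻) ∙ 𝑝(𝐻) / 𝑝(𝐸)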
• Example: medical diagnosis
• Hypothesis 𝐻 = measles, Evidence 𝐸 = red spots
• In order to directly estimate 𝑝(𝑚𝑒𝑎𝑠𝑙𝑒𝑠|𝑟𝑒𝑑 𝑠𝑝𝑜𝑡𝑠) we would need to think through all the different reasons a person might exhibit red spots and what proportion of them would be measles.
• Instead: 𝑝(𝐸|𝐻) is the probability that one has red spots given that one has measles, 𝑝(𝐻) is simply the probability that someone has measles, and 𝑝(𝐸) the probability that someone has red spots.

Applying Bayes' rule to data science (1/2)
• Many data mining methods are based on Bayes' rule
• Bayes' rule for classification gives the probability that the target variable 𝐶 takes on the class of interest 𝑐 after taking the evidence 𝐸 (the feature values) into account:
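In symbols (the standard classification form of Bayes' rule):

𝑝(𝐶 = 𝑐|𝐸) = 𝑝(𝐸|𝐶 = 𝑐) ∙ 𝑝(𝐶 = 𝑐) / 𝑝(𝐸)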

• 𝑝(𝐶 = 𝑐) is the 'prior' probability of the class, i.e., the probability we would assign to the class before seeing any evidence [e.g., the prevalence of c in the population = the percentage of all examples that are of class c].
• 𝑝(𝐸|𝐶 = 𝑐) is the likelihood of seeing the evidence 𝐸 given the class [the percentage of examples of class c that have 𝐸].
• 𝑝(𝐸) is the likelihood of the evidence [the occurrence of 𝐸].
Applying Bayes' rule to data science (2/2)

• (…)
• Having estimated these values, we could use 𝑝(𝐶 = 𝑐|𝑬) as an estimate of the class probability
• Alternatively, we could use the value as a score to rank instances
• Drawback: if 𝐸 is the usual vector of attribute values, we would need to know the full joint probability of the example
• This is difficult to measure
• We may never see a specific example in the training data that matches a given 𝐸 in our test data
• Make a particular assumption of independence!


Naive Bayes (1/2)
• Conditional independence: use the class of the example as the condition
• This allows for easy combination of probabilities:

𝑝(𝐴𝐵|𝐶) = 𝑝(𝐴|𝐶) ∙ 𝑝(𝐵|𝐶)

In other words: we assume that the attributes are conditionally independent given the class, i.e.
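𝑝(𝐸|𝑐) = 𝑝(𝑒1|𝑐) ∙ 𝑝(𝑒2|𝑐) ∙ ⋯ ∙ 𝑝(𝑒k|𝑐), where 𝐸 = ⟨𝑒1, 𝑒2, …, 𝑒k⟩ is the vector of feature values.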

Each of the terms p(ei|c) can be computed directly from the data (count the proportion of the time we see ei among the examples of class c).
Naive Bayes (2/2)
• Naive Bayes classifies a new example by estimating the probability that the example belongs to each class and reports the class with the highest probability
• Note that the denominator P(E) never actually has to be calculated
• We can focus on the numerator when comparing the different classes c, because the denominator is always the same
• If we need probability estimates, the probabilities will add up to one, so we can derive it from the other quantities
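A minimal sketch of this decision rule for categorical features (a hypothetical hand-rolled implementation with Laplace smoothing; in practice one would typically use a library implementation such as scikit-learn's naive_bayes module):

```python
from collections import Counter, defaultdict

def train_nb(examples, labels, alpha=1.0):
    """Estimate p(c) and the counts needed for p(e_i = v | c), with Laplace smoothing."""
    class_counts = Counter(labels)
    feat_counts = defaultdict(Counter)        # (feature index, class) -> value counts
    for x, c in zip(examples, labels):
        for i, v in enumerate(x):
            feat_counts[(i, c)][v] += 1
    return class_counts, feat_counts, alpha

def predict_nb(model, x):
    """Return the class maximizing p(c) * prod_i p(e_i | c); p(E) is never needed."""
    class_counts, feat_counts, alpha = model
    total = sum(class_counts.values())
    best_class, best_score = None, -1.0
    for c, n_c in class_counts.items():
        score = n_c / total                   # prior p(c)
        for i, v in enumerate(x):
            counts = feat_counts[(i, c)]
            # smoothed estimate of p(e_i = v | c)
            score *= (counts[v] + alpha) / (n_c + alpha * (len(counts) + 1))
        if score > best_score:
            best_class, best_score = c, score
    return best_class

# Illustrative usage: features are (visited luxury-travel site?, visited travel blog?)
X = [("yes", "yes"), ("yes", "no"), ("no", "no"), ("no", "yes"), ("no", "no")]
y = ["book", "book", "no-book", "no-book", "no-book"]
model = train_nb(X, y)
print(predict_nb(model, ("yes", "yes")))      # -> "book"
```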
(Dis)Advantages of Naive Bayes
• Naive Bayes
• is a simple classifier, although it takes all the feature evidence into account
• is very efficient in terms of storage space and computation time
• performs surprisingly well for classification
• is an "incremental learner"
• Note that the independence assumption does not hurt classification performance very much
• To some extent, the evidence gets double-counted when features are correlated
• This tends to make the probability estimates more extreme, but in the correct direction
• Don't use the probability estimates themselves!
• But ranking is ok!
Example: Naive Bayes classifier (1/5)
References

❑ Provost, F.; Fawcett, T.: Data Science for Business: Fundamental Principles of Data Mining and Data-Analytic Thinking. O'Reilly, 2013.
❑ Sharda, R.; Delen, D.; Turban, E.: Business Intelligence, Analytics, and Data Science: A Managerial Perspective, 4th Edition. Pearson, 2018.



Thank You
