
March 25, 2014 · MACHINE LEARNING

Simple guide to confusion matrix terminology

A confusion matrix is a table that is often used to describe the performance of a classification model (or "classifier") on a set of test data for which the true values are known. The confusion matrix itself is relatively simple to understand, but the related terminology can be confusing.

I wanted to create a "quick reference guide" for confusion matrix terminology because I couldn't find an existing resource that suited my requirements: compact in presentation, using numbers instead of arbitrary variables, and explained both in terms of formulas and sentences.

Let's start with an example confusion matrix for a binary classifier (though it can easily be extended to the case of more than two classes):
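n = 165        Predicted: NO   Predicted: YES
Actual: NO          50              10
Actual: YES          5             100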

What can we learn from this matrix?

- There are two possible predicted classes: "yes" and "no". If we were predicting the presence of a disease, for example, "yes" would mean they have the disease, and "no" would mean they don't have the disease.
- The classifier made a total of 165 predictions (e.g., 165 patients were being tested for the presence of that disease).
- Out of those 165 cases, the classifier predicted "yes" 110 times, and "no" 55 times.
- In reality, 105 patients in the sample have the disease, and 60 patients do not.
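These totals can be read straight off the matrix as row and column sums; here's a minimal sketch with NumPy (the array layout is an assumption, matching the table above):

    import numpy as np

    # Rows are actual classes (no, yes); columns are predicted classes (no, yes),
    # matching the table above.
    cm = np.array([[50, 10],
                   [5, 100]])

    print(cm.sum())        # 165 total predictions
    print(cm.sum(axis=0))  # [ 55 110] -> predicted "no" 55 times, "yes" 110 times
    print(cm.sum(axis=1))  # [ 60 105] -> 60 actual "no" cases, 105 actual "yes" cases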

Let's now define the most basic terms, which are whole numbers (not rates):

- true positives (TP): These are cases in which we predicted yes (they have the disease), and they do have the disease.
- true negatives (TN): We predicted no, and they don't have the disease.
- false positives (FP): We predicted yes, but they don't actually have the disease. (Also known as a "Type I error.")
- false negatives (FN): We predicted no, but they actually do have the disease. (Also known as a "Type II error.")
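If you build the matrix with scikit-learn, these four counts can be unpacked in one line; here's a minimal sketch (the label lists are synthetic, constructed only to reproduce the example's counts):

    from sklearn.metrics import confusion_matrix

    # Synthetic labels that reproduce the example: 100 TP, 50 TN, 10 FP, 5 FN.
    y_true = [1] * 100 + [0] * 50 + [0] * 10 + [1] * 5
    y_pred = [1] * 100 + [0] * 50 + [1] * 10 + [0] * 5

    # With labels=[0, 1], ravel() returns the counts in the order TN, FP, FN, TP.
    tn, fp, fn, tp = confusion_matrix(y_true, y_pred, labels=[0, 1]).ravel()
    print(tp, tn, fp, fn)  # 100 50 10 5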

I've added these terms to the confusion matrix, and also added the row and column totals:
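n = 165        Predicted: NO    Predicted: YES    Total
Actual: NO        TN = 50          FP = 10          60
Actual: YES       FN = 5           TP = 100        105
Total               55              110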


This is a list of rates that are often computed from a confusion matrix for a binary classifier:

- Accuracy: Overall, how often is the classifier correct?
  (TP+TN)/total = (100+50)/165 = 0.91
- Misclassification Rate: Overall, how often is it wrong?
  (FP+FN)/total = (10+5)/165 = 0.09
  Equivalent to 1 minus Accuracy. Also known as "Error Rate".
- True Positive Rate: When it's actually yes, how often does it predict yes?
  TP/actual yes = 100/105 = 0.95
  Also known as "Sensitivity" or "Recall".
- False Positive Rate: When it's actually no, how often does it predict yes?
  FP/actual no = 10/60 = 0.17
- True Negative Rate: When it's actually no, how often does it predict no?
  TN/actual no = 50/60 = 0.83
  Equivalent to 1 minus False Positive Rate. Also known as "Specificity".
- Precision: When it predicts yes, how often is it correct?
  TP/predicted yes = 100/110 = 0.91
- Prevalence: How often does the yes condition actually occur in our sample?
  actual yes/total = 105/165 = 0.64
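All of these rates follow directly from the four counts; here is the same arithmetic as a minimal Python sketch:

    tp, tn, fp, fn = 100, 50, 10, 5
    total = tp + tn + fp + fn       # 165
    actual_yes = tp + fn            # 105
    actual_no = tn + fp             # 60
    predicted_yes = tp + fp         # 110

    accuracy = (tp + tn) / total                # 0.91
    misclassification_rate = (fp + fn) / total  # 0.09, i.e. 1 - accuracy
    true_positive_rate = tp / actual_yes        # 0.95 (sensitivity, recall)
    false_positive_rate = fp / actual_no        # 0.17
    true_negative_rate = tn / actual_no         # 0.83 (specificity)
    precision = tp / predicted_yes              # 0.91
    prevalence = actual_yes / total             # 0.64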

A couple other terms are also worth mentioning:

- Null Error Rate: This is how often you would be wrong if you always predicted the majority class. (In our example, the null error rate would be 60/165 = 0.36, because if you always predicted yes, you would only be wrong for the 60 "no" cases.) This can be a useful baseline metric to compare your classifier against. However, the best classifier for a particular application will sometimes have a higher error rate than the null error rate, as demonstrated by the Accuracy Paradox.
- Cohen's Kappa: This is essentially a measure of how well the classifier performed as compared to how well it would have performed simply by chance. In other words, a model will have a high Kappa score if there is a big difference between the accuracy and the null error rate. (More details about Cohen's Kappa.)
- F Score: This is a weighted harmonic mean of the true positive rate (recall) and precision. (More details about the F Score.)
- ROC Curve: This is a commonly used graph that summarizes the performance of a classifier over all possible thresholds. It is generated by plotting the True Positive Rate (y-axis) against the False Positive Rate (x-axis) as you vary the threshold for assigning observations to a given class. (More details about ROC Curves.)
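scikit-learn has ready-made functions for Kappa and the F score; here's a minimal sketch, reusing the synthetic labels from the earlier snippet (an ROC curve is omitted because it requires predicted probabilities rather than hard class labels):

    from sklearn.metrics import cohen_kappa_score, f1_score

    # Same synthetic labels as before: 100 TP, 50 TN, 10 FP, 5 FN.
    y_true = [1] * 100 + [0] * 50 + [0] * 10 + [1] * 5
    y_pred = [1] * 100 + [0] * 50 + [1] * 10 + [0] * 5

    print(cohen_kappa_score(y_true, y_pred))  # 0.80: agreement beyond chance
    print(f1_score(y_true, y_pred))           # 0.93: combines precision and recall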

And finally, for those of you from the world of Bayesian statistics, here's a quick summary of these terms from Applied Predictive Modeling:

    "In relation to Bayesian statistics, the sensitivity and specificity are the conditional probabilities, the prevalence is the prior, and the positive/negative predicted values are the posterior probabilities."
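As a concrete check of that correspondence (a worked example of my own, not from the book): applying Bayes' theorem to the sensitivity, specificity, and prevalence above recovers the precision, i.e., the positive predictive value:

    sensitivity = 100 / 105  # P(predict yes | actual yes)
    specificity = 50 / 60    # P(predict no | actual no)
    prevalence = 105 / 165   # P(actual yes), the prior

    # Bayes' theorem gives the posterior P(actual yes | predict yes):
    ppv = (sensitivity * prevalence) / (
        sensitivity * prevalence + (1 - specificity) * (1 - prevalence))
    print(round(ppv, 2))  # 0.91 -- matches Precision = TP/predicted yes = 100/110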

Want to learn more?

In my new 35-minute video, Making sense of the confusion matrix, I explain these concepts in more depth and cover more advanced topics:

- How to calculate precision and recall for multi-class problems
- How to analyze a 10-class confusion matrix
- How to choose the right evaluation metric for your problem
- Why accuracy is often a misleading metric
