Confusion Matrix, Accuracy, Precision, Recall, F1 Score
Binary Classification Metrics
How to evaluate the performance of a machine learning model?
Let us consider the task of classifying whether a person is pregnant or not
pregnant. If the test for pregnancy is positive (+ve), then the person is
pregnant. On the other hand, if the test for pregnancy is negative (-ve), then
the person is not pregnant.
Now consider the above classification (pregnant or not pregnant) carried
out by a machine learning algorithm. The output of the machine learning
algorithm can be mapped to one of the following categories.
1. A person who is actually pregnant (positive) and classified as pregnant
(positive). This is called TRUE POSITIVE (TP).
Figure 1: True Positive.
2. A person who is actually not pregnant (negative) and classified as not
pregnant (negative). This is called TRUE NEGATIVE (TN).
Figure 2: True Negative.
3. A person who is actually not pregnant (negative) and classified as
pregnant (positive). This is called FALSE POSITIVE (FP).
Figure 3: False Positive.
4. A person who is actually pregnant (positive) and classified as not pregnant
(negative). This is called FALSE NEGATIVE (FN).
Figure 4: False Negative.
What we desire are TRUE POSITIVES and TRUE NEGATIVES, but due to
misclassifications we may also end up with FALSE POSITIVES and FALSE
NEGATIVES. So there is confusion in classifying whether a person is
pregnant or not, because no machine learning algorithm is perfect.
Soon we will describe this confusion in classifying the data with a matrix called
the confusion matrix.
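As a quick illustration (not part of the original example), the four counts can be tallied directly from a list of true and predicted labels. The labels below are made up purely to show the bookkeeping:

```python
# Tally TP, TN, FP, FN from true and predicted labels.
# 1 = pregnant (positive), 0 = not pregnant (negative); the labels are made up.
y_true = [1, 0, 1, 1, 0, 0, 1, 0]
y_pred = [1, 0, 0, 1, 0, 1, 1, 0]

TP = sum(t == 1 and p == 1 for t, p in zip(y_true, y_pred))
TN = sum(t == 0 and p == 0 for t, p in zip(y_true, y_pred))
FP = sum(t == 0 and p == 1 for t, p in zip(y_true, y_pred))
FN = sum(t == 1 and p == 0 for t, p in zip(y_true, y_pred))

print(TP, TN, FP, FN)  # 3 3 1 1
```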
Now, we select 100 people, which include pregnant women, women who are not
pregnant, and men with a fat belly. Let us assume that out of these 100 people,
40 are pregnant and the remaining 60 are not pregnant women and
men with a fat belly. We now use a machine learning algorithm to predict the
outcome. The predicted outcome (pregnancy +ve or -ve) from the machine
learning algorithm is termed the predicted label, and the true outcome (which in
this case we know from the doctor's/expert's records) is termed the true
label.
Now we will introduce the confusion matrix, which is required to compute the
accuracy of the machine learning algorithm in classifying the data into its
corresponding labels.
The following diagram illustrates the confusion matrix for a binary
classification problem.
Figure 5: Confusion Matrix.
We will now go back to the earlier example of classifying 100 people (40
pregnant women and 60 people who are either not pregnant women or men with a
fat belly) as pregnant or not pregnant. Out of the 40 pregnant
women, 30 are classified correctly and the remaining 10
are classified as not pregnant by the machine learning
algorithm. On the other hand, out of the 60 people in the not pregnant category,
55 are classified as not pregnant and the remaining 5 are classified as
pregnant.
In this case, TN = 55, FP = 5, FN = 10, TP = 30. The confusion matrix is as
follows.
Figure 6: Confusion matrix for the pregnant vs not pregnant classification.
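As a sketch of how this matrix could be produced in code, scikit-learn's confusion_matrix (the library is an assumption here; the article itself does not use one) can be applied to label arrays constructed to match the stated counts:

```python
import numpy as np
from sklearn.metrics import confusion_matrix

# Synthetic labels matching the stated counts: TN = 55, FP = 5, FN = 10, TP = 30.
# 0 = not pregnant, 1 = pregnant.
y_true = np.array([0] * 60 + [1] * 40)
y_pred = np.array([0] * 55 + [1] * 5 + [0] * 10 + [1] * 30)

print(confusion_matrix(y_true, y_pred))
# [[55  5]
#  [10 30]]
```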
What is the accuracy of the machine learning model for this classification
task?
Accuracy represents the number of correctly classified data instances over
the total number of data instances:
Accuracy = (TP + TN) / (TP + TN + FP + FN)
In this example, Accuracy = (55 + 30)/(55 + 5 + 30 + 10) = 0.85, and as a
percentage the accuracy is 85%.
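A minimal check of this arithmetic, using only the four counts from the confusion matrix above:

```python
# Accuracy from the confusion-matrix counts of the pregnancy example.
TP, TN, FP, FN = 30, 55, 5, 10
accuracy = (TP + TN) / (TP + TN + FP + FN)
print(accuracy)  # 0.85
```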
Is accuracy the best measure?
Accuracy may not be a good measure if the dataset is not balanced (the
negative and positive classes have different numbers of data instances). We
will explain this with an example.
Consider the following scenario: there are 90 people who are healthy
(negative) and 10 people who have some disease (positive). Now let's say our
machine learning model correctly classified the 90 healthy people, but it
also classified the 10 unhealthy people as healthy. What happens in this
scenario? Let us build the confusion matrix and find out the accuracy.
In this example, TN = 90, FP = 0, FN = 10 and TP = 0. The confusion matrix is
as follows.
Figure 7: Confusion matrix for healthy vs unhealthy people classification task.
Accuracy in this case will be (90 + 0)/(100) = 0.9, and as a percentage the
accuracy is 90%.
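A short sketch of this scenario (labels again constructed synthetically from the stated counts) shows how a model that predicts "healthy" for everyone still reaches 90% accuracy:

```python
import numpy as np
from sklearn.metrics import accuracy_score, confusion_matrix

# 90 healthy people (0) and 10 with the disease (1); the model predicts "healthy" for all.
y_true = np.array([0] * 90 + [1] * 10)
y_pred = np.zeros(100, dtype=int)

print(confusion_matrix(y_true, y_pred, labels=[0, 1]))
# [[90  0]
#  [10  0]]
print(accuracy_score(y_true, y_pred))  # 0.9, even though every diseased person is missed
```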
Is there anything fishy?
The accuracy in this case is 90%, but the model is very poor because all 10
people who are unhealthy are classified as healthy. What this example shows is
that accuracy is not a good metric when the dataset is
unbalanced. Using accuracy in such scenarios can result in a misleading
interpretation of results.
So now we move on to another metric for classification. Again
we go back to the pregnancy classification example.
Now we will find the precision (positive predictive value) in classifying the
data instances. Precision is defined as follows:
Precision = TP / (TP + FP)
What does precision mean?
Precision should ideally be 1 (high) for a good classifier. Precision becomes 1
only when the numerator and denominator are equal, i.e., TP = TP + FP, which
also means FP is zero. As FP increases, the denominator becomes
greater than the numerator and the precision value decreases (which we don't
want).
So in the pregnancy example, precision = 30/(30 + 5) = 0.857.
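In code, this is just the ratio of the two counts (a minimal sketch using the numbers above):

```python
# Precision for the pregnancy example: TP / (TP + FP).
TP, FP = 30, 5
precision = TP / (TP + FP)
print(round(precision, 3))  # 0.857
```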
Now we will introduce another important metric called recall. Recall is also
known as sensitivity or the true positive rate and is defined as follows:
Recall = TP / (TP + FN)
Recall should ideally be 1 (high) for a good classifier. Recall becomes 1 only
when the numerator and denominator are equal, i.e., TP = TP + FN, which also
means FN is zero. As FN increases, the denominator becomes greater
than the numerator and the recall value decreases (which we don't want).
So in the pregnancy example, let us see what the recall will be.
Recall = 30/(30 + 10) = 0.75
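And the corresponding sketch for recall, using the same counts:

```python
# Recall for the pregnancy example: TP / (TP + FN).
TP, FN = 30, 10
recall = TP / (TP + FN)
print(recall)  # 0.75
```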
So, ideally, in a good classifier we want both precision and recall to be one,
which also means FP and FN are zero. Therefore, we need a metric that takes
into account both precision and recall. The F1 score is a metric which takes
into account both precision and recall and is defined as follows:
F1 Score = 2 * (Precision * Recall) / (Precision + Recall)
The F1 score becomes 1 only when precision and recall are both 1. The F1 score
becomes high only when both precision and recall are high. The F1 score is the
harmonic mean of precision and recall and is a better measure than accuracy
when the dataset is unbalanced.
In the pregnancy example, F1 Score = 2 * (0.857 * 0.75)/(0.857 + 0.75) ≈ 0.799.
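As a final sketch, the same result can be reproduced with scikit-learn on the synthetic labels used earlier (computing F1 from the unrounded precision and recall gives exactly 0.8; the 0.799 above comes from rounding precision to 0.857 first):

```python
import numpy as np
from sklearn.metrics import f1_score, precision_score, recall_score

# Same synthetic labels as before: TN = 55, FP = 5, FN = 10, TP = 30.
y_true = np.array([0] * 60 + [1] * 40)
y_pred = np.array([0] * 55 + [1] * 5 + [0] * 10 + [1] * 30)

precision = precision_score(y_true, y_pred)   # 30 / 35 ≈ 0.857
recall = recall_score(y_true, y_pred)         # 30 / 40 = 0.75
f1 = 2 * (precision * recall) / (precision + recall)

print(round(f1, 3))                        # 0.8
print(round(f1_score(y_true, y_pred), 3))  # 0.8, same value via scikit-learn
```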