Comprehensive Guide On Confusion Matrix
             0 (Predicted)   1 (Predicted)
0 (Actual)   2               3
1 (Actual)   1               4
Here, the algorithm has made a total of 10 predictions, and this confusion matrix describes whether these
predictions are correct or not. To summarise this table in words:
there were 2 cases in which the algorithm predicted a 0 and the actual label was 0 (correct).
there were 3 cases in which the algorithm predicted a 1 and the actual label was 0 (incorrect).
there was 1 case in which the algorithm predicted a 0 and the actual label was 1 (incorrect).
there were 4 cases in which the algorithm predicted a 1 and the actual label was 1 (correct).
In terms of true negatives (TN), false positives (FP), false negatives (FN) and true positives (TP), the same table looks like this:
             0 (Predicted)   1 (Predicted)
0 (Actual)   TN = 2          FP = 3
1 (Actual)   FN = 1          TP = 4
Here, you can interpret 0 as being negative and 1 as being positive. Using these values, we can compute the
following key metrics:
1. Accuracy
2. Misclassification rate
3. Precision
4. Recall (or Sensitivity)
5. F-Score
6. Specificity
Classification metrics
Accuracy
Accuracy is simply the proportion of correct classifications:
$$\text{Accuracy} = \frac{TN + TP}{TN + FP + FN + TP}$$
The denominator here represents the total predictions made. For our example, the accuracy is as follows:
$$\text{Accuracy} = \frac{2 + 4}{2 + 3 + 1 + 4} = 0.6$$
Of course, we want the accuracy to be as high as possible. However, as we shall see later, a classifier with a high accuracy does not necessarily make for a good classifier.
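As a quick sanity check, the same value can be reproduced with sklearn's accuracy_score(~). The labels below are hypothetical - they are not part of the original example, but are chosen so that they yield the confusion matrix above (TN = 2, FP = 3, FN = 1, TP = 4):

from sklearn.metrics import accuracy_score

# Hypothetical labels that reproduce the confusion matrix above
y_true = [0, 0, 0, 0, 0, 1, 1, 1, 1, 1]   # actual labels
y_pred = [0, 0, 1, 1, 1, 0, 1, 1, 1, 1]   # predicted labels

print(accuracy_score(y_true, y_pred))   # 0.6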
Misclassification rate
Misclassification rate is the proportion of incorrect classifications:

$$\text{Misclassification Rate} = \frac{FP + FN}{TN + FP + FN + TP}$$
Note that this is basically one minus the accuracy. For our example, the misclassification rate is as follows:
$$\text{Misclassification Rate} = \frac{3 + 1}{2 + 3 + 1 + 4} = 0.4$$
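Since the misclassification rate is just one minus the accuracy, it can be reproduced from the same hypothetical labels used in the accuracy sketch above:

from sklearn.metrics import accuracy_score

# Hypothetical labels that reproduce the confusion matrix above
y_true = [0, 0, 0, 0, 0, 1, 1, 1, 1, 1]
y_pred = [0, 0, 1, 1, 1, 0, 1, 1, 1, 1]

# Misclassification rate = 1 - accuracy
print(1 - accuracy_score(y_true, y_pred))   # 0.4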
Precision
Precision is the proportion of correct predictions given the prediction was positive:
$$\text{Precision} = \frac{TP}{FP + TP}$$
For our example, the precision is as follows:

$$\text{Precision} = \frac{4}{3 + 4} = \frac{4}{7}$$
The metric of precision is important in cases when you want to minimise false positives and maximise true
positives. For instance, consider a binary classifier that predicts whether an e-mail is legitimate (0) or spam
(1). From the users' perspective, they do not want any legitimate e-mail to be identified as spam since e-mails
identified as spam usually do not end up in the regular inbox. Therefore, in this case, we want to avoid
predicting spam for legitimate e-mails (false positive), and obviously aim to predict spam for actual spam e-
mails (true positive).
As a numerical example, consider two binary classifiers - the 1st classifier has a precision of 0.2, while the 2nd
classifier has a precision of 0.9. In words, for the 1st classifier, out of all e-mails predicted to be spam, only
20% of these e-mails were actually spam. This means that 80% of all e-mails identified as spam were actually
legitimate. On the other hand, for the 2nd classifier, out of all e-mails predicted to be spam, 90% of them were
actually spam. This means that only 10% of all e-mails identified as spam were legitimate. Therefore, we would
opt to use the 2nd classifier in this case.
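In sklearn, this quantity is exposed as precision_score(~). Sticking with the hypothetical labels that reproduce the guide's confusion matrix (TP = 4, FP = 3):

from sklearn.metrics import precision_score

# Hypothetical labels that reproduce the confusion matrix above
y_true = [0, 0, 0, 0, 0, 1, 1, 1, 1, 1]
y_pred = [0, 0, 1, 1, 1, 0, 1, 1, 1, 1]

print(precision_score(y_true, y_pred))   # 0.5714... (4/7)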
Recall
Recall is the proportion of correct predictions given the actual label was positive:

$$\text{Recall} = \frac{TP}{FN + TP}$$

For our example, the recall is as follows:

$$\text{Recall} = \frac{4}{1 + 4} = \frac{4}{5}$$
The metric of recall is important in cases when you want to minimise false negatives and maximise true
positives. For instance, consider a binary classifier that predicts whether a transaction is legitimate (0) or
fraudulent (1). From the bank's perspective, they want to be able to correctly identify all fraudulent transactions
since missing fraudulent transactions would be extremely costly for the bank. This would mean that we want to
minimise false negatives (i.e. missing actual fraudulent transactions), and maximise true positives (i.e.
correctly identifying fraudulent transactions).
As a numerical example, let's compare 2 binary classifiers where the 1st classifier has a recall of 0.2, while the
2nd has a recall of 0.9. In words, the 1st classifier is only able to correctly detect 20% of all actual fraudulent
cases, whereas the 2nd classifier is capable of correctly detecting 90% of all actual fraudulent cases. In this
case, the bank would therefore opt for the 2nd classifier.
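Similarly, recall is exposed as sklearn's recall_score(~). Using the same hypothetical labels as before (TP = 4, FN = 1):

from sklearn.metrics import recall_score

# Hypothetical labels that reproduce the confusion matrix above
y_true = [0, 0, 0, 0, 0, 1, 1, 1, 1, 1]
y_pred = [0, 0, 1, 1, 1, 0, 1, 1, 1, 1]

print(recall_score(y_true, y_pred))   # 0.8 (4/5)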
F-score
F-score is a metric that combines both the precision and recall into a single value:
$$F = \frac{2 \cdot \text{Precision} \cdot \text{Recall}}{\text{Precision} + \text{Recall}}$$
A low precision or recall value will reduce the F-score. For instance, consider the following confusion matrix:

             0 (Predicted)   1 (Predicted)
0 (Actual)   10000           10
1 (Actual)   900             1000

$$\text{Accuracy} = \frac{10000 + 1000}{10000 + 10 + 900 + 1000} \approx 0.92$$

$$\text{Precision} = \frac{1000}{10 + 1000} \approx 0.99$$

$$\text{Recall} = \frac{1000}{900 + 1000} \approx 0.53$$

$$F = \frac{2 \cdot 0.99 \cdot 0.53}{0.99 + 0.53} \approx 0.69$$
As we can see, the accuracy here is very high (≈0.92). However, the F-score is much lower (≈0.69) since the number of false negatives is quite high, which reduces the recall. This demonstrates how looking only at the accuracy would mislead you into thinking the model is excellent, when in fact there may be some abnormality in the prediction errors it makes.
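To make the arithmetic above concrete, here is a short sketch that recomputes these metrics directly from the four cell counts of the imbalanced confusion matrix (TN = 10000, FP = 10, FN = 900, TP = 1000):

# Cell counts of the imbalanced confusion matrix above
TN, FP, FN, TP = 10000, 10, 900, 1000

accuracy = (TN + TP) / (TN + FP + FN + TP)
precision = TP / (FP + TP)
recall = TP / (FN + TP)
f_score = 2 * precision * recall / (precision + recall)

print(round(accuracy, 2), round(precision, 2), round(recall, 2), round(f_score, 2))
# 0.92 0.99 0.53 0.69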
WARNING
It is bad practice to quote only the accuracy for the classification performance of an algorithm. As demonstrated here, quoting the recall, precision and F-score is also extremely important depending on the context. For instance, for a binary classifier detecting fraudulent transactions, the recall (or F-score) is more relevant than the accuracy.
Specificity
Specificity represents the proportion of correct predictions for negative labels:
$$\text{Specificity} = \frac{TN}{TN + FP}$$
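Using the confusion matrix from the start of this guide (TN = 2, FP = 3), the specificity works out as follows:

$$\text{Specificity} = \frac{2}{2 + 3} = 0.4$$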
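The snippet that produced the output below is not included here, but a minimal sketch using sklearn's confusion_matrix(~) would look like the following; the labels are hypothetical ones chosen to reproduce the same output:

from sklearn.metrics import confusion_matrix

# Hypothetical labels - any data with 2 TN, 3 FP, 1 FN and 5 TP gives this result
y_true = [0, 0, 0, 0, 0, 1, 1, 1, 1, 1, 1]   # actual labels
y_pred = [0, 0, 1, 1, 1, 0, 1, 1, 1, 1, 1]   # predicted labels

confusion_matrix(y_true, y_pred)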
array([[2, 3],
       [1, 5]])
Here, many people get confused as to what the numbers represent. This confusion matrix corresponds to the
following:
             0 (Predicted)   1 (Predicted)
0 (Actual)   2               3
1 (Actual)   1               5
Computing classification metrics
To compute the classification metrics, use sklearn's classification_report(~) method like so:
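The original snippet and its printed report are not included here; a minimal sketch, reusing the hypothetical labels from the confusion_matrix(~) example above, would be:

from sklearn.metrics import classification_report

# Hypothetical labels that reproduce the confusion matrix computed above
y_true = [0, 0, 0, 0, 0, 1, 1, 1, 1, 1, 1]
y_pred = [0, 0, 1, 1, 1, 0, 1, 1, 1, 1, 1]

# Prints precision, recall, f1-score and support for each label (0 and 1)
print(classification_report(y_true, y_pred))

The report contains one row per label, each showing its precision, recall, f1-score and support (the number of times that label actually occurs).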
Here, the 0th support represents the number of actual negative (0) labels.