ML-Lecture-12 (Evaluation Metrics For Classification)

This document discusses various evaluation metrics for classification models in machine learning, including the confusion matrix, accuracy, precision, recall, F1 score, ROC curve, and AUC score. It provides definitions and formulas for calculating each metric. An example illustrates how to calculate values from a confusion matrix and why accuracy alone is not sufficient for imbalanced classification problems. The document concludes by instructing students to use various metrics to evaluate models for predicting heart disease on a publicly available dataset, in order to compare performance.


Machine Learning

Lecture 12: Evaluation Metrics for Classification


COURSE CODE: CSE451
2021
Course Teacher
Dr. Mrinal Kanti Baowaly
Associate Professor
Department of Computer Science and Engineering,
Bangabandhu Sheikh Mujibur Rahman Science and Technology University, Bangladesh.

Email: [email protected]
Common Evaluation Metrics for Classification
1. Confusion Matrix
2. Accuracy
3. Precision
4. Recall / Sensitivity
5. Specificity
6. F1 Score
7. ROC (Receiver Operating Characteristics) Curve
8. AUC (Area Under the ROC curve) Score
Confusion Matrix
 A confusion matrix is a table that describes the performance of a classification model on the test data.
 It is an N × N matrix, where N is the number of classes being predicted.
 Each row of the matrix represents the instances in a predicted class, while each column represents the instances in an actual class (or vice versa, depending on the convention used).
Terms associated with Confusion matrix
 True Positives: the cases in which the model predicted 1 (True) and the actual output was also 1 (True).
 True Negatives: the cases in which the model predicted 0 (False) and the actual output was also 0 (False).
 False Positives: the cases in which the model predicted 1 (True) and the actual output was 0 (False).
 False Negatives: the cases in which the model predicted 0 (False) and the actual output was 1 (True).
Accuracy
 It is the ratio of the number of correct predictions to the total number of input samples (predictions):

$\text{Accuracy} = \frac{\text{No. of correct predictions}}{\text{Total no. of predictions}} = \frac{TP + TN}{TP + FP + FN + TN}$

 It is the most commonly used metric to judge a model and is a good measure when the target variable classes in the data are nearly balanced.
 It should NEVER be used as a measure when the target classes are imbalanced.
[Slide figure: an example confusion matrix with Accuracy = 93% and Error = 7%.]
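As a quick worked check (a sketch only: the slide's confusion matrix is not reproduced here, so the TN value below is inferred from the 93% figure together with the precision and recall examples on the following slides, and should be treated as an assumption): with TP = 55, TN = 38, FP = 2, FN = 5,

$\text{Accuracy} = \frac{55 + 38}{55 + 2 + 5 + 38} = \frac{93}{100} = 0.93, \qquad \text{Error} = 1 - 0.93 = 0.07$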
Precision
 Out of all the positive classes we have predicted, how many are actually positive?

$\text{Precision} = \frac{TP}{TP + FP} = \frac{55}{57} = 0.9649$
Recall
 Out of all the positive classes, how many are predicted correctly?

$\text{Recall} = \frac{TP}{TP + FN} = \frac{55}{60} = 0.9166$
F1 Score
 Harmonic mean of the Precision and Recall:

$F1 = \frac{2 \times \text{Precision} \times \text{Recall}}{\text{Precision} + \text{Recall}} = \frac{2 \times 0.9649 \times 0.9166}{0.9649 + 0.9166} = 0.94$

 It strikes a balance between Precision and Recall.
 Rather than measuring precision and recall every time, it is easier to use a single F1 score.
 It is a better choice when the target classes are imbalanced.
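Below is a minimal scikit-learn sketch of how these three metrics can be computed (the labels here are made up for illustration and are not the slide's 100-sample example):

# import the metric functions
from sklearn.metrics import precision_score, recall_score, f1_score
# hypothetical true and predicted labels (TP=4, FN=1, FP=1, TN=4)
y_true = [1, 1, 1, 0, 0, 0, 1, 0, 1, 0]
y_pred = [1, 0, 1, 0, 0, 1, 1, 0, 1, 0]
print('Precision:', precision_score(y_true, y_pred))  # TP / (TP + FP) = 0.8
print('Recall   :', recall_score(y_true, y_pred))     # TP / (TP + FN) = 0.8
print('F1 Score :', f1_score(y_true, y_pred))         # harmonic mean = 0.8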
HW: Why is classification accuracy not enough?
Hints:
 Suppose you have the problem of detecting cancer, with two classes:
1. Having cancer, the positive class, denoted by 1
2. No cancer, the negative class, denoted by 0
Let's assume that you have 1000 patient records. The confusion matrix of a predictive model is shown on the slide (not reproduced here); it gives Accuracy = 0.994, Error = 0.006 and F1 Score = 0.249.
The model yields very high accuracy (99.4%) but fails to detect the patients with cancer. The F1 score can be a proper metric in this case of imbalanced target classes.
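The effect can be reproduced with a small sketch (the numbers below are hypothetical and chosen for illustration, not the matrix on the slide): a classifier that never predicts cancer still scores 99% accuracy on a 1000-record set with 10 cancer patients, while its F1 score collapses to 0.

from sklearn.metrics import accuracy_score, f1_score
# hypothetical imbalanced data: 10 cancer patients out of 1000 records
y_true = [1] * 10 + [0] * 990
# a useless model that predicts "no cancer" for everyone
y_pred = [0] * 1000
print('Accuracy:', accuracy_score(y_true, y_pred))  # 0.99 -> looks excellent
print('F1 Score:', f1_score(y_true, y_pred))        # 0.0 -> exposes the failure (sklearn may warn about zero division)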
ROC (Receiver Operating Characteristics) Curve
 A ROC curve is a graphical plot that is used as a performance measurement for classification problems.
 The ROC curve is created by plotting the true positive rate (TPR) against the false positive rate (FPR) at various threshold settings.

$TPR = \text{Recall} = \text{Sensitivity} = \frac{TP}{TP + FN}$

$FPR = 1 - \text{Specificity} = 1 - \frac{TN}{FP + TN} = \frac{FP}{FP + TN}$

 It tells how capable the model is of distinguishing between classes (i.e. its separability/discrimination capacity).
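A small sketch of how the curve's points are obtained with scikit-learn's roc_curve, which sweeps the decision threshold over the prediction scores (the labels and scores here are illustrative assumptions, not the lecture's own example):

from sklearn.metrics import roc_curve
# hypothetical true labels and predicted probabilities of the positive class
y_true = [0, 0, 1, 1, 0, 1, 0, 1]
y_score = [0.1, 0.4, 0.35, 0.8, 0.2, 0.7, 0.6, 0.9]
fpr, tpr, thresholds = roc_curve(y_true, y_score)
print('Thresholds:', thresholds)  # each threshold yields one (FPR, TPR) point
print('TPR:', tpr)
print('FPR:', fpr)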
AUC (Area Under the ROC curve) Score
 The AUC is the area under the ROC curve.
 This score gives us a good idea of how well the model performs.
 The AUC score ranges from 0 to 1.
 An ideal model has an AUC near 1, which means it has excellent discrimination capacity.
 A poor model has an AUC near 0.5, which means it has no discrimination capacity.
 When the AUC is approximately 0, the model is actually reversing the classes: it predicts the negative class as positive and vice versa (the worst model).
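Continuing the sketch above (same made-up labels and scores, used here only as an assumption for illustration), the AUC score and the ROC plot can be obtained as follows:

import matplotlib.pyplot as plt
from sklearn.metrics import roc_curve, roc_auc_score
y_true = [0, 0, 1, 1, 0, 1, 0, 1]
y_score = [0.1, 0.4, 0.35, 0.8, 0.2, 0.7, 0.6, 0.9]
print('AUC Score:', roc_auc_score(y_true, y_score))
fpr, tpr, _ = roc_curve(y_true, y_score)
plt.plot(fpr, tpr, label='model')
plt.plot([0, 1], [0, 1], linestyle='--', label='no discrimination (AUC = 0.5)')
plt.xlabel('False Positive Rate (1 - Specificity)')
plt.ylabel('True Positive Rate (Sensitivity)')
plt.legend()
plt.show()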
Example: Confusion Matrix
# import the confusion matrix function
from sklearn.metrics import confusion_matrix
# actual (true) values
actual = [1, 0, 0, 1, 0, 0, 1, 0, 1, 1]
# predicted values
predicted = [1, 0, 0, 1, 0, 0, 1, 1, 0, 0]
# confusion matrix; labels=[1, 0] puts the positive class in the first row/column
matrix = confusion_matrix(actual, predicted, labels=[1, 0])
print('Confusion matrix : \n', matrix)
# with this label order the matrix is laid out as [[TP, FN], [FP, TN]]
TP, FN, FP, TN = matrix.reshape(-1)
print('Outcome values : \n', TP, FN, FP, TN)
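For these two lists the printed matrix is [[3 2] [1 4]], i.e. TP = 3, FN = 2, FP = 1 and TN = 4.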
LAB: How to Use Various Metrics in Classification Problems?
1. Let us investigate the Cleveland Heart Disease Dataset (processed.cleveland.data) from here: https://archive.ics.uci.edu/ml/datasets/heart+Disease
2. There are 303 items (patients), six of which have a missing value. There are 13 predictor variables (age, sex, cholesterol, etc.). The variable to predict is encoded as 0 to 4, where 0 means no heart disease and 1-4 means presence of heart disease.
3. Build a model to predict heart disease for the patients. Estimate and compare Accuracy, Precision, Recall, F1 Score and AUC Score to evaluate the performance of the model, and also plot the ROC curve. A sketch of one possible workflow is given below.
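A minimal sketch of one possible workflow, assuming the file has been downloaded locally as processed.cleveland.data and using the attribute names from the UCI documentation (the choice of logistic regression, the train/test split, and the column names are assumptions, not requirements of the lab):

import pandas as pd
import matplotlib.pyplot as plt
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import (accuracy_score, precision_score, recall_score,
                             f1_score, roc_auc_score, roc_curve)

# column names per the UCI dataset description; missing values appear as '?'
cols = ['age', 'sex', 'cp', 'trestbps', 'chol', 'fbs', 'restecg', 'thalach',
        'exang', 'oldpeak', 'slope', 'ca', 'thal', 'num']
df = pd.read_csv('processed.cleveland.data', names=cols, na_values='?').dropna()

X = df.drop(columns='num')
y = (df['num'] > 0).astype(int)  # 0 = no disease, 1-4 = disease -> binary target

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=42)
model = LogisticRegression(max_iter=1000).fit(X_train, y_train)

y_pred = model.predict(X_test)
y_prob = model.predict_proba(X_test)[:, 1]  # scores for the ROC curve and AUC
print('Accuracy :', accuracy_score(y_test, y_pred))
print('Precision:', precision_score(y_test, y_pred))
print('Recall   :', recall_score(y_test, y_pred))
print('F1 Score :', f1_score(y_test, y_pred))
print('AUC Score:', roc_auc_score(y_test, y_prob))

fpr, tpr, _ = roc_curve(y_test, y_prob)
plt.plot(fpr, tpr, label='logistic regression')
plt.plot([0, 1], [0, 1], linestyle='--', label='chance')
plt.xlabel('False Positive Rate')
plt.ylabel('True Positive Rate')
plt.title('ROC Curve - Heart Disease Prediction')
plt.legend()
plt.show()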
Some Learning Materials
AnalyticsVidhya: How to Choose Evaluation Metrics for Classification
Models
RitchieNg: Evaluating a Classification Model
TowardsDatascience: Various ways to evaluate a machine learning
model’s performance
