ML-Lecture-12 (Evaluation Metrics For Classification)

This document discusses various evaluation metrics for classification models in machine learning, including the confusion matrix, accuracy, precision, recall, F1 score, ROC curve, and AUC score. It provides definitions and formulas for calculating each metric. An example illustrates how to calculate values from a confusion matrix and why accuracy alone is not sufficient for imbalanced classification problems. The document concludes by instructing students to use various metrics to evaluate models for predicting heart disease on a publicly available dataset, in order to compare performance.


Machine Learning

Lecture 12: Evaluation Metrics for Classification


COURSE CODE: CSE451
2021
Course Teacher
Dr. Mrinal Kanti Baowaly
Associate Professor
Department of Computer Science and Engineering,
Bangabandhu Sheikh Mujibur Rahman Science and Technology University, Bangladesh.

Email: [email protected]
Common Evaluation Metrics for Classification
1. Confusion Matrix
2. Accuracy
3. Precision
4. Recall / Sensitivity
5. Specificity
6. F1 Score
7. ROC (Receiver Operating Characteristics) Curve
8. AUC (Area Under the ROC curve) Score
Confusion Matrix
 A confusion matrix is a table that describes the performance of a classification model on the test data.
 It is an N × N matrix, where N is the number of classes being predicted.
 Each row of the matrix represents the instances in a predicted class, while each column represents the instances in an actual class (or vice versa, depending on the convention used).
Terms associated with Confusion matrix
 True Positives: the cases in which the model predicted 1 (True) and the actual output was also 1 (True).
 True Negatives: the cases in which the model predicted 0 (False) and the actual output was also 0 (False).
 False Positives: the cases in which the model predicted 1 (True) and the actual output was 0 (False).
 False Negatives: the cases in which the model predicted 0 (False) and the actual output was 1 (True).
Accuracy
 It is the ratio of the number of correct predictions to the total number of input samples (predictions):

$\text{Accuracy} = \frac{\text{No. of correct predictions}}{\text{Total no. of predictions}} = \frac{TP + TN}{TP + FP + FN + TN}$

 It is the most commonly used metric to judge a model and is a good measure when the target variable classes in the data are nearly balanced.
 It should NEVER be used as a measure when the target classes are imbalanced.
[Slide figure: an example confusion matrix with Accuracy = 93% and Error = 7%.]
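As a quick worked check (a sketch only: the slide's confusion matrix is not reproduced here, so the TN value below is inferred from the 93% figure together with the precision and recall examples on the following slides, and should be treated as an assumption): with TP = 55, TN = 38, FP = 2, FN = 5,

$\text{Accuracy} = \frac{55 + 38}{55 + 2 + 5 + 38} = \frac{93}{100} = 0.93, \qquad \text{Error} = 1 - 0.93 = 0.07$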
Precision
 Out of all the positive classes we have predicted, how many are actually positive?

$\text{Precision} = \frac{TP}{TP + FP} = \frac{55}{57} = 0.9649$
Recall
 Out of all the positive classes, how many are predicted correctly?

$\text{Recall} = \frac{TP}{TP + FN} = \frac{55}{60} = 0.9166$
F1 Score
 Harmonic mean of the Precision and Recall:

$F1 = \frac{2 \times \text{Precision} \times \text{Recall}}{\text{Precision} + \text{Recall}} = \frac{2 \times 0.9649 \times 0.9166}{0.9649 + 0.9166} = 0.94$

 It strikes a balance between Precision and Recall.
 Rather than measuring precision and recall every time, it is easier to use a single F1 score.
 It is a better choice when the target classes are imbalanced.
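Below is a minimal scikit-learn sketch of how these three metrics can be computed (the labels here are made up for illustration and are not the slide's 100-sample example):

# import the metric functions
from sklearn.metrics import precision_score, recall_score, f1_score
# hypothetical true and predicted labels (TP=4, FN=1, FP=1, TN=4)
y_true = [1, 1, 1, 0, 0, 0, 1, 0, 1, 0]
y_pred = [1, 0, 1, 0, 0, 1, 1, 0, 1, 0]
print('Precision:', precision_score(y_true, y_pred))  # TP / (TP + FP) = 0.8
print('Recall   :', recall_score(y_true, y_pred))     # TP / (TP + FN) = 0.8
print('F1 Score :', f1_score(y_true, y_pred))         # harmonic mean = 0.8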
HW: Why is classification accuracy not enough?
Hints:
 Suppose you have the problem of detecting cancer, with two classes:
1. Having cancer, the positive class, denoted by 1
2. No cancer, the negative class, denoted by 0
Let's assume that you have 1000 patient records. The confusion matrix of a predictive model is shown on the slide (not reproduced here); it gives Accuracy = 0.994, Error = 0.006 and F1 Score = 0.249.
The model yields very high accuracy (99.4%) but fails to detect the patients with cancer. The F1 score can be a proper metric in this case of imbalanced target classes.
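The effect can be reproduced with a small sketch (the numbers below are hypothetical and chosen for illustration, not the matrix on the slide): a classifier that never predicts cancer still scores 99% accuracy on a 1000-record set with 10 cancer patients, while its F1 score collapses to 0.

from sklearn.metrics import accuracy_score, f1_score
# hypothetical imbalanced data: 10 cancer patients out of 1000 records
y_true = [1] * 10 + [0] * 990
# a useless model that predicts "no cancer" for everyone
y_pred = [0] * 1000
print('Accuracy:', accuracy_score(y_true, y_pred))  # 0.99 -> looks excellent
print('F1 Score:', f1_score(y_true, y_pred))        # 0.0 -> exposes the failure (sklearn may warn about zero division)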
ROC (Receiver Operating Characteristics) Curve
 A ROC curve is a graphical plot that is used as a performance measurement for classification problems.
 The ROC curve is created by plotting the true positive rate (TPR) against the false positive rate (FPR) at various threshold settings.

$TPR = \text{Recall} = \text{Sensitivity} = \frac{TP}{TP + FN}$

$FPR = 1 - \text{Specificity} = 1 - \frac{TN}{FP + TN} = \frac{FP}{FP + TN}$

 It tells how capable the model is of distinguishing between classes (i.e. its separability/discrimination capacity).
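A small sketch of how the curve's points are obtained with scikit-learn's roc_curve, which sweeps the decision threshold over the prediction scores (the labels and scores here are illustrative assumptions, not the lecture's own example):

from sklearn.metrics import roc_curve
# hypothetical true labels and predicted probabilities of the positive class
y_true = [0, 0, 1, 1, 0, 1, 0, 1]
y_score = [0.1, 0.4, 0.35, 0.8, 0.2, 0.7, 0.6, 0.9]
fpr, tpr, thresholds = roc_curve(y_true, y_score)
print('Thresholds:', thresholds)  # each threshold yields one (FPR, TPR) point
print('TPR:', tpr)
print('FPR:', fpr)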
AUC (Area Under the ROC curve) Score
 The AUC is the area under the ROC curve.
 This score gives us a good idea of how well the model performs.
 The AUC score ranges from 0 to 1.
 An ideal model has an AUC near 1, which means it has excellent discrimination capacity.
 A poor model has an AUC near 0.5, which means it has no discrimination capacity.
 When the AUC is approximately 0, the model is actually reversing the classes: it predicts the negative class as positive and vice versa (the worst model).
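Continuing the sketch above (same made-up labels and scores, used here only as an assumption for illustration), the AUC score and the ROC plot can be obtained as follows:

import matplotlib.pyplot as plt
from sklearn.metrics import roc_curve, roc_auc_score
y_true = [0, 0, 1, 1, 0, 1, 0, 1]
y_score = [0.1, 0.4, 0.35, 0.8, 0.2, 0.7, 0.6, 0.9]
print('AUC Score:', roc_auc_score(y_true, y_score))
fpr, tpr, _ = roc_curve(y_true, y_score)
plt.plot(fpr, tpr, label='model')
plt.plot([0, 1], [0, 1], linestyle='--', label='no discrimination (AUC = 0.5)')
plt.xlabel('False Positive Rate (1 - Specificity)')
plt.ylabel('True Positive Rate (Sensitivity)')
plt.legend()
plt.show()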
Example: Confusion Matrix
# import the confusion matrix function
from sklearn.metrics import confusion_matrix
# actual (true) values
actual = [1, 0, 0, 1, 0, 0, 1, 0, 1, 1]
# predicted values
predicted = [1, 0, 0, 1, 0, 0, 1, 1, 0, 0]
# confusion matrix; labels=[1, 0] puts the positive class in the first row/column
matrix = confusion_matrix(actual, predicted, labels=[1, 0])
print('Confusion matrix : \n', matrix)
# with this label order the matrix is laid out as [[TP, FN], [FP, TN]]
TP, FN, FP, TN = matrix.reshape(-1)
print('Outcome values : \n', TP, FN, FP, TN)
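For these two lists the printed matrix is [[3 2] [1 4]], i.e. TP = 3, FN = 2, FP = 1 and TN = 4.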
LAB: How to Use Various Metrics in Classification Problems?
1. Let us investigate the Cleveland Heart Disease Dataset (processed.cleveland.data) from here: https://archive.ics.uci.edu/ml/datasets/heart+Disease
2. There are 303 items (patients), six of which have a missing value. There are 13 predictor variables (age, sex, cholesterol, etc.). The variable to predict is encoded as 0 to 4, where 0 means no heart disease and 1-4 means presence of heart disease.
3. Build a model to predict heart disease for the patients. Estimate and compare Accuracy, Precision, Recall, F1 Score and AUC Score to evaluate the performance of the model, and also plot the ROC curve. A sketch of one possible workflow is given below.
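A minimal sketch of one possible workflow, assuming the file has been downloaded locally as processed.cleveland.data and using the attribute names from the UCI documentation (the choice of logistic regression, the train/test split, and the column names are assumptions, not requirements of the lab):

import pandas as pd
import matplotlib.pyplot as plt
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import (accuracy_score, precision_score, recall_score,
                             f1_score, roc_auc_score, roc_curve)

# column names per the UCI dataset description; missing values appear as '?'
cols = ['age', 'sex', 'cp', 'trestbps', 'chol', 'fbs', 'restecg', 'thalach',
        'exang', 'oldpeak', 'slope', 'ca', 'thal', 'num']
df = pd.read_csv('processed.cleveland.data', names=cols, na_values='?').dropna()

X = df.drop(columns='num')
y = (df['num'] > 0).astype(int)  # 0 = no disease, 1-4 = disease -> binary target

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=42)
model = LogisticRegression(max_iter=1000).fit(X_train, y_train)

y_pred = model.predict(X_test)
y_prob = model.predict_proba(X_test)[:, 1]  # scores for the ROC curve and AUC
print('Accuracy :', accuracy_score(y_test, y_pred))
print('Precision:', precision_score(y_test, y_pred))
print('Recall   :', recall_score(y_test, y_pred))
print('F1 Score :', f1_score(y_test, y_pred))
print('AUC Score:', roc_auc_score(y_test, y_prob))

fpr, tpr, _ = roc_curve(y_test, y_prob)
plt.plot(fpr, tpr, label='logistic regression')
plt.plot([0, 1], [0, 1], linestyle='--', label='chance')
plt.xlabel('False Positive Rate')
plt.ylabel('True Positive Rate')
plt.title('ROC Curve - Heart Disease Prediction')
plt.legend()
plt.show()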
Some Learning Materials
AnalyticsVidhya: How to Choose Evaluation Metrics for Classification
Models
RitchieNg: Evaluating a Classification Model
TowardsDatascience: Various ways to evaluate a machine learning
model’s performance
