CE880_Lecture6_slides
Haider Raza
Tuesday, 21st February 2023
About Myself
What we will be covering in this lecture
- Evaluation Metrics
- Confusion Matrix
- F1 Score, Recall, Precision
- Hypothesis testing
- p-value
- Types of Hypothesis testing
Classification/Evaluation Metrics
[Figure: overview of classification/evaluation metrics. Source: https://towardsdatascience.com]
Prerequisites
- Condition positive (P): the number of real positive cases in the data
- Condition negative (N): the number of real negative cases in the data
- True positive (TP): sick people correctly identified as sick
- False positive (FP): healthy people incorrectly identified as sick
- True negative (TN): healthy people correctly identified as healthy
- False negative (FN): sick people incorrectly identified as healthy
Classification Accuracy
- Classification accuracy = correct predictions / total predictions, e.g. if 80 out of 100 samples are correctly classified, then 80/100 = 0.8.
- It is often presented as a percentage (%) by multiplying the result by 100.
Example (accuracy in %)
classification accuracy = correct predictions / total predictions * 100
classification accuracy = 80/100 * 100 = 80%
Example (error rate)
error rate = incorrect predictions / total predictions * 100
error rate = 20/100 * 100 = 20%
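A minimal Python sketch of the accuracy and error-rate calculation above (the 80-out-of-100 counts come from the example; the variable names are illustrative):

    # Accuracy and error rate from prediction counts (80 correct out of 100)
    correct_predictions = 80
    total_predictions = 100

    accuracy = correct_predictions / total_predictions * 100                              # 80.0 %
    error_rate = (total_predictions - correct_predictions) / total_predictions * 100      # 20.0 %

    print(f"accuracy = {accuracy:.1f}%, error rate = {error_rate:.1f}%")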
Problem with classification accuracy?
- When your data has more than 2 classes: with 3 or more classes you may get a classification accuracy of 80%, but you do not know whether that is because all classes are being predicted equally well or whether one or two classes are being neglected by the model.
- When your data does not have an even distribution of classes (data imbalance): you may achieve an accuracy of 90% or more, but this is not a good score if 90 records out of every 100 belong to one class, because you can achieve it by always predicting the most common class value.
Classification accuracy can hide the detail you need to diagnose the performance of
your model. But thankfully we can tease apart this detail by using a confusion matrix.
Confusion Matrix
Accuracy and Confusion Matrix
Example: Covid vs Non-Covid
False Positive Rate: Type I error rate
FPR = FP / (FP + TN) = 2 / (2 + 3) = 0.4
False Negative Rate: Type II error rate
FNR = FN / (TP + FN) = 1 / (4 + 1) = 0.2
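A small Python sketch (assuming scikit-learn is available) that reproduces the Covid vs Non-Covid counts used above (TP = 4, FN = 1, FP = 2, TN = 3) and derives the FPR and FNR; the label vectors are constructed here purely for illustration:

    from sklearn.metrics import confusion_matrix

    # 1 = Covid, 0 = Non-Covid; labels chosen to match the slide's counts
    y_true = [1, 1, 1, 1, 1, 0, 0, 0, 0, 0]
    y_pred = [1, 1, 1, 1, 0, 0, 0, 1, 1, 0]

    # sklearn orders the matrix as [[TN, FP], [FN, TP]] for labels [0, 1]
    tn, fp, fn, tp = confusion_matrix(y_true, y_pred).ravel()

    fpr = fp / (fp + tn)   # 2 / (2 + 3) = 0.4  (Type I error rate)
    fnr = fn / (tp + fn)   # 1 / (4 + 1) = 0.2  (Type II error rate)

    print(f"TP={tp}, FP={fp}, TN={tn}, FN={fn}, FPR={fpr:.1f}, FNR={fnr:.1f}")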
Let's consider an Imbalanced Dataset
Suppose we have 10K records, where 9K belong to label A and 1K to label B. If we calculate accuracy, it is obvious that we will get around 90% accuracy from a model that simply predicts most records as label A. Clearly, accuracy is not a good way of measuring the performance of a model when the dataset is not balanced. In such situations we use Recall, Precision and F-beta as the classification metrics.
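A quick illustrative sketch of this pitfall (the 9K/1K split is taken from the example above; the always-predict-A "model" is assumed purely for illustration, and scikit-learn is assumed to be installed):

    from sklearn.metrics import accuracy_score, recall_score

    # 9,000 records of class A (encoded 0) and 1,000 of class B (encoded 1)
    y_true = [0] * 9000 + [1] * 1000
    y_pred = [0] * 10000          # a "model" that always predicts the majority class

    print(accuracy_score(y_true, y_pred))              # 0.9  -> looks good
    print(recall_score(y_true, y_pred, pos_label=1))   # 0.0  -> class B is never detected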
Recall: Sensitivity, Hit Rate, or True Positive Rate (TPR)
Recall measures, out of the total actual positive values, how many positives we were able to predict correctly:
Recall = TP / (TP + FN)
Example: If we have 100 Covid-positive cases, how many of those 100 did we correctly predict as positive?
Note: in the case of Recall we are concerned with 'False Negatives'.
Precision: Positive Predictive Value (PPV)
Precision measures, out of the total predicted positive results, how many were actually positive:
Precision = TP / (TP + FP)
Note: in the case of Precision we are concerned with 'False Positives'.
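Using the TP/FP/FN counts from the Covid example above (TP = 4, FP = 2, FN = 1), a quick Python check of both metrics:

    # Counts taken from the Covid vs Non-Covid example
    tp, fp, fn = 4, 2, 1

    precision = tp / (tp + fp)   # 4 / 6 ≈ 0.67
    recall    = tp / (tp + fn)   # 4 / 5 = 0.80

    print(f"precision = {precision:.2f}, recall = {recall:.2f}")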
Precision Vs Recall: Example 1: SPAM Detection
In this case, we mostly have to consider Precision. Suppose we receive an email that is not actually spam, but the model flags it as spam, which is a False Positive. As a result, the user misses an email that may be of high importance.
Note: in such cases, where a False Positive is costly, our main focus should always be to reduce False Positives to a minimum.
Precision Vs Recall: Example 2: Cancer Detection
In this case, we mostly have to consider Recall. In Cancer vs Not Cancer: suppose the model predicts Not Cancer while the patient actually has Cancer; this is a False Negative, and it could be a serious blunder by the model.
In such cases a False Positive is not a very big issue, because even if a person who does not have Cancer is predicted as having Cancer, he or she can go for another test to verify the result. But if a person has Cancer and is predicted as negative (a False Negative), then chances are they might not go for another test, which could turn out to be a disaster. Therefore, it is important to use Recall in such situations.
NOTE: Our goal should always be to minimise both False Positives and False Negatives (i.e. maximise Precision and Recall); however:
- Whenever the False Positive is of more importance with respect to the problem statement, use Precision.
- If the False Negative has greater importance with respect to the problem statement, use Recall.
F-Beta
Sometimes both the False Positive and the False Negative play an important role in an imbalanced dataset. In such cases, we have to consider both Recall and Precision.
If the β value is 1, then F-beta becomes the F1-Score. The β value can also be set to, for example, 0.5 or 2.
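For reference, the standard definition of the F-beta score (the formula itself is not shown in the extracted slide text), which reduces to the F1-Score when β = 1:

    F_\beta = (1 + \beta^2) \cdot \frac{\text{Precision} \cdot \text{Recall}}{\beta^2 \cdot \text{Precision} + \text{Recall}},
    \qquad
    F_1 = 2 \cdot \frac{\text{Precision} \cdot \text{Recall}}{\text{Precision} + \text{Recall}}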
F1 Score
Selecting beta value
Beta = 1: If both False Positives and False Negatives are equally important, then we select Beta = 1 (the F1-Score).
Beta < 1 (close to 0): If the False Positive has more impact than the False Negative, then we reduce the Beta value by selecting something between 0 and 1. Example: SPAM Detection.
Beta > 1: If the False Negative impact is high, which is essentially what Recall captures, then we increase the Beta value above 1. Example: Cancer Detection.
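A short Python sketch (assuming scikit-learn) comparing these choices of beta on the same illustrative Covid predictions used earlier:

    from sklearn.metrics import fbeta_score, f1_score

    # Same labels/predictions as the Covid vs Non-Covid example
    y_true = [1, 1, 1, 1, 1, 0, 0, 0, 0, 0]
    y_pred = [1, 1, 1, 1, 0, 0, 0, 1, 1, 0]

    print(fbeta_score(y_true, y_pred, beta=0.5))  # weights Precision more (e.g. SPAM detection)
    print(fbeta_score(y_true, y_pred, beta=1.0))  # same as the F1-Score
    print(f1_score(y_true, y_pred))               # F1-Score computed directly
    print(fbeta_score(y_true, y_pred, beta=2.0))  # weights Recall more (e.g. Cancer detection)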
Hypothesis Testing
Hypothesis Testing: Decision Rule (p-value) based
- If p-value (p) > level of significance (α), we fail to reject the Null Hypothesis.
- If p-value (p) ≤ level of significance (α), we reject the Null Hypothesis.
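A minimal Python sketch of this decision rule (assuming SciPy; the sample data and α = 0.05 are made up for illustration), using a one-sample t-test as an example:

    from scipy import stats

    # H0: the population mean equals 5.0 (sample values are illustrative only)
    sample = [4.9, 5.1, 5.3, 4.8, 5.2, 5.0, 5.4, 4.7]
    alpha = 0.05

    t_stat, p_value = stats.ttest_1samp(sample, popmean=5.0)

    if p_value <= alpha:
        print(f"p = {p_value:.3f} <= {alpha}: reject the Null Hypothesis")
    else:
        print(f"p = {p_value:.3f} > {alpha}: fail to reject the Null Hypothesis")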
Selecting a Hypothesis Test