
Session 1

Model Evaluation

PRATHEESH KUMAR N

 The most important task in building any machine learning model is to evaluate its performance. So the question arises: how would one measure the success of a machine learning model? How would we know when to stop training and evaluating the model and call it good?

 Evaluation metrics are tied to machine learning tasks. There are different metrics for classification and regression tasks, and some metrics, like precision and recall, are useful for multiple tasks. Classification and regression are examples of supervised learning, which constitutes a majority of machine learning applications. Using different metrics for performance evaluation, we should be able to improve our model’s overall predictive power before we roll it out for production on unseen data. Relying only on accuracy, without a proper evaluation of the machine learning model using different evaluation metrics, can lead to problems when the model is deployed on unseen data and may end in poor predictions.
Classification Metrics in Machine Learning

 Classification is about predicting the class labels given input data. In binary classification, there are only two possible output classes (i.e., a dichotomy). In multiclass classification, more than two possible classes can be present.

 A very common example of binary classification is spam detection, where the input data could include the email text and metadata (sender, sending time), and the output label is either “spam” or “not spam” (see figure). Sometimes people also use other names for the two classes: “positive” and “negative,” or “class 1” and “class 0.”
Email spam detection is a binary classification problem

Classification Metrics

 There are many ways of measuring classification performance. Accuracy, confusion matrix, log-loss, and AUC-ROC are some of the most popular metrics. Precision and recall are also widely used metrics for classification problems.
Accuracy

 Accuracy simply measures how often the classifier predicts correctly. We can define accuracy as the ratio of the number of correct predictions to the total number of predictions.
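A minimal sketch in Python of how accuracy is computed; the labels and predictions below are hypothetical, and scikit-learn is assumed to be available.

```python
from sklearn.metrics import accuracy_score

y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]   # correct labels (hypothetical)
y_pred = [1, 1, 1, 0, 1, 1, 0, 0, 0, 0]   # model predictions (hypothetical)

# Accuracy = number of correct predictions / total number of predictions
manual = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)
print(manual)                              # 0.7
print(accuracy_score(y_true, y_pred))      # 0.7 (same value via scikit-learn)
```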

 When a model gives an accuracy rate of 99%, you might think that the model is performing very well, but this is not always true and can be misleading in some situations. Let us explain this with the help of an example.
 Consider a binary classification problem, where a model can achieve only two results: it gives either a correct or an incorrect prediction. Now imagine we have a classification task to predict whether an image is of a dog or a cat. In a supervised learning algorithm, we first fit / train a model on training data, then test the model on testing data. Once we have the model’s predictions from the X test data, we compare them to the true y values (the correct labels).


 We feed the image of a dog into the trained model. Suppose the model predicts that this is a dog; we compare the prediction to the correct label, and it is correct. If the model predicts that this image is a cat, we again compare it to the correct label, and it would be incorrect.

We repeat this process for all images in the X test data. Eventually, we’ll have a count of correct and incorrect matches. But in reality, it is very rare that all correct or incorrect matches hold equal value. Therefore, one metric won’t tell the entire story.

 Accuracy is useful when the target classes are well balanced, but it is not a good choice for unbalanced classes. Imagine a scenario where we had 99 images of dogs and only 1 image of a cat in our training data. A model that always predicts “dog” would then get 99% accuracy. In reality, data is often imbalanced, for example in spam email detection, credit card fraud, and medical diagnosis. Hence, if we want to do a better model evaluation and have a full picture of the model’s performance, other metrics such as recall and precision should also be considered.
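A small hypothetical illustration of this point: with 99 dog images and 1 cat image, a model that always predicts “dog” reaches 99% accuracy while never detecting the cat.

```python
# Hypothetical imbalanced dataset: label 1 = dog, label 0 = cat.
y_true = [1] * 99 + [0]
y_pred = [1] * 100                      # a model that always predicts "dog"

accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)
cat_recall = sum(t == p == 0 for t, p in zip(y_true, y_pred)) / y_true.count(0)

print(accuracy)     # 0.99 -> looks impressive
print(cat_recall)   # 0.0  -> the single cat is never detected
```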
Confusion Matrix

A confusion matrix is a performance measurement for machine learning classification problems where the output can be two or more classes. It is a table with combinations of predicted and actual values.

A confusion matrix is defined as a table that is often used to describe the performance of a classification model on a set of test data for which the true values are known.
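A minimal sketch of building a confusion matrix with scikit-learn (the labels are hypothetical; scikit-learn is assumed to be installed).

```python
from sklearn.metrics import confusion_matrix

y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 1, 1, 0, 0, 0, 0]

# Rows are actual classes, columns are predicted classes (class order: 0, 1).
print(confusion_matrix(y_true, y_pred))
# [[4 2]    -> TN = 4, FP = 2
#  [1 3]]   -> FN = 1, TP = 3
```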



Link for confusion matrix

 https://towardsdatascience.com/understanding-confusion-matrix-a9ad42dcfd62

 https://www.guru99.com/confusion-matrix-machine-learning-example.html
Let’s try to understand TP, FP, FN, and TN with a pregnancy analogy


 True Positive: We predicted positive and it’s true. In the
image, we predicted that a woman is pregnant and she
actually is.
 True Negative: We predicted negative and it’s true. In the
image, we predicted that a man is not pregnant and he
actually is not.
 False Positive (Type 1 Error)- We predicted positive and it’s
false. In the image, we predicted that a man is pregnant but
he actually is not.
 False Negative (Type 2 Error)- We predicted negative and it’s
false. In the image, we predicted that a woman is not
pregnant but she actually is.
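Below is a small, purely illustrative Python helper (hypothetical, not from the original slides) that names the outcome of a single prediction, mirroring the four cases above.

```python
def outcome(predicted_positive: bool, actually_positive: bool) -> str:
    """Name the confusion-matrix cell for one prediction."""
    if predicted_positive and actually_positive:
        return "True Positive"
    if not predicted_positive and not actually_positive:
        return "True Negative"
    if predicted_positive and not actually_positive:
        return "False Positive (Type 1 Error)"
    return "False Negative (Type 2 Error)"

print(outcome(predicted_positive=True, actually_positive=False))
# False Positive (Type 1 Error)
```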
1. Precision —

Precision explains how many of the cases predicted as positive actually turned out to be positive. Precision is useful in cases where False Positives are a higher concern than False Negatives. Precision is important in music or video recommendation systems, e-commerce websites, etc., where wrong results could lead to customer churn, and this could be harmful to the business.

Precision for a label is defined as the number of true positives divided by the number of predicted positives.
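A minimal sketch using the same hypothetical labels as before: with TP = 3 and FP = 2, precision is 3 / (3 + 2) = 0.6.

```python
from sklearn.metrics import precision_score

y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 1, 1, 0, 0, 0, 0]

# precision = TP / (TP + FP) = 3 / (3 + 2)
print(precision_score(y_true, y_pred))   # 0.6
```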


2. Recall (Sensitivity)

 Recall explains how many of the actual positive cases we were able to predict correctly with our model. It is a useful metric in cases where a False Negative is of higher concern than a False Positive. It is important in medical cases, where it doesn’t matter whether we raise a false alarm, but the actual positive cases should not go undetected!
Recall for a label is defined as the number of
true positives divided by the total number of
actual positives.
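A minimal sketch using the same hypothetical labels: with TP = 3 and FN = 1, recall is 3 / (3 + 1) = 0.75.

```python
from sklearn.metrics import recall_score

y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 1, 1, 0, 0, 0, 0]

# recall = TP / (TP + FN) = 3 / (3 + 1)
print(recall_score(y_true, y_pred))   # 0.75
```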

3. F1 Score —

 It gives a combined idea about
Precision and Recall metrics. It is
maximum when Precision is equal to
Recall.
F1 Score is the harmonic mean
of precision and recall.
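A minimal sketch using the same hypothetical labels: with precision = 0.6 and recall = 0.75, the harmonic mean gives F1 ≈ 0.667.

```python
from sklearn.metrics import f1_score

y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 1, 1, 0, 0, 0, 0]

# F1 = 2 * (precision * recall) / (precision + recall) = 2 * (0.6 * 0.75) / (0.6 + 0.75)
print(round(f1_score(y_true, y_pred), 3))   # 0.667
```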

Q1. What are the classification
metrics?

 Classification metrics are evaluation measures used to
assess the performance of a classification model. Common
metrics include accuracy (proportion of correct
predictions), precision (true positives over total predicted
positives), recall (true positives over total actual
positives), F1 score (harmonic mean of precision and
recall), and area under the receiver operating
characteristic curve (AUC-ROC).
Q2. What are the 4 metrics for evaluating
classifier performance?

 The four commonly used metrics for evaluating classifier
performance are:
1. Accuracy: The proportion of correct predictions out of the
total predictions.
2. Precision: The proportion of true positive predictions out of
the total positive predictions (precision = true positives / (true
positives + false positives)).
3. Recall (Sensitivity or True Positive Rate): The proportion of
true positive predictions out of the total actual positive instances
(recall = true positives / (true positives + false negatives)).
4. F1 Score: The harmonic mean of precision and recall,
providing a balance between the two metrics (F1 score = 2 *
((precision * recall) / (precision + recall))).
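A short sketch implementing the four formulas above directly from confusion-matrix counts; the counts (TP = 3, FP = 2, FN = 1, TN = 4) are hypothetical.

```python
tp, fp, fn, tn = 3, 2, 1, 4   # hypothetical counts

accuracy  = (tp + tn) / (tp + tn + fp + fn)                    # 0.7
precision = tp / (tp + fp)                                     # 0.6
recall    = tp / (tp + fn)                                     # 0.75
f1        = 2 * ((precision * recall) / (precision + recall))  # ~0.667

print(accuracy, precision, recall, round(f1, 3))
```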
EVALUATING AN AI MODEL

Aim of the evaluation model:
To measure the performance of AI through some evaluation metrics.
To calculate a performance score, on the basis of which the efficiency of an AI model is determined.
Purpose of Evaluation Metrics

Provide a mathematical estimate of how far we are from making correct predictions.

 If the model performs well on unseen new real-life data, the deployment stage is started, where it is put to use in real-life applications.
Model Evaluation
Metrics

Standard way of measurement to
assess something for accuracy and
performance.
Types of metrics

Classification metrics
Regression metrics
Deep learning related metrics
These metrics calculate some score which
indicates how correct the AI model’s prediction
is
The higher the score, the better our model is.
Classification metrics

Used for evaluating classification-based AI models.
Classification means identifying the class of an input value.
Classification problems are based on non-continuous (discrete) data.
What are the classification
models in AI?

Classification models include:
Logistic regression
Decision tree
Random forest
Gradient-boosted tree
Multilayer perceptron
One-vs-rest
Naive Bayes
What are types of classification
models in machine learning?

 There are perhaps four main types of classification
tasks that you may encounter; they are:

Binary Classification.
Multi-Class Classification.
Multi-Label Classification.
Imbalanced Classification.
What is an example of a classifier model?

 For example, a classification model might
be trained on a dataset of images labeled as
either dogs or cats and then used to predict
the class of new, unseen images of dogs or
cats based on their features such as color,
texture, and shape.
Why use a classification
model?

It helps in categorizing data into different classes and has a broad array of applications, such as email spam detection, medical diagnostic tests, fraud detection, image classification, and speech recognition, among others.
How many types of classification
algorithms are there?

Classification algorithms are used to categorize data into a class or category. Classification can be performed on both structured and unstructured data, and it can be of three types: binary classification, multiclass classification, and multilabel classification.

Regression metrics

 Used for evaluating Regression based AI models.
Deep learning related
metrics

 Used for evaluating deep-learning-based AI models.
Why are classification metrics mostly used in AI evaluation?

Simple and easy to evaluate.
Classification Metrics

Confusion Matrix
Accuracy
Precision
Recall
F1 Score
What is a confusion matrix?

A technique using a chart or table that summarizes the performance of a classification-based AI model by listing the predicted values of the AI model and the actual / correct outcome values in a confusion table.

The actual value (True / False) represents the real observed or measured outcome (the ground truth).
The predicted value (Positive / Negative) is the outcome produced by the AI model on the basis of its algorithm and learning.
TRUE POSITIVE

An instance for which
both Predicted value of
the AI model and actual
value are positive.
TRUE NEGATIVE

An instance for which both
Predicted value of the AI
model and actual value are
Negative
FALSE POSITIVE

An instance for which
Predicted value of the AI
model is Positive but actual
value is Negative
FALSE NEGATIVE

An instance for which
Predicted value of the AI
model is Negative but actual
value is Positive
Classification Matrix Format

                            PREDICTED VALUES
                            POSITIVE              NEGATIVE
ACTUAL    POSITIVE (1)      No. of TRUE           No. of FALSE
VALUES                      POSITIVES (TP)        NEGATIVES (FN)
          NEGATIVE (0)      No. of FALSE          No. of TRUE
                            POSITIVES (FP)        NEGATIVES (TN)
Link for Classification Metrics

https://www.analyticsvidhya.com/blog/2021/07/metrics-to-evaluate-your-classification-model-to-take-the-right-decisions/
How can we evaluate AI model
using confusion matrices?

By calculating the following values:-

 ACCURACY RATE
 PRECISION RATE
 RECALL
 F1 SCORE
ACCURACY RATE


Percentage of predictions, out of all the observations, that are correct.

Accuracy = Number of correct predictions (TP + TN) / Total number of predictions made (TP + TN + FP + FN) x 100%
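For example, with hypothetical counts TP = 40, TN = 45, FP = 5, FN = 10:
Accuracy = (40 + 45) / (40 + 45 + 5 + 10) x 100% = 85 / 100 x 100% = 85%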
Precision rate

Rate at which the positive predictions turn out to be correct (True Positives out of all predicted positives).

Precision Rate = TP / (TP + FP)

In %, Precision Rate = TP / (TP + FP) x 100%
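With the same hypothetical counts (TP = 40, FP = 5):
Precision Rate = 40 / (40 + 5) = 40 / 45 ≈ 0.889, i.e. about 88.9%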


RECALL

Rate of correct positive predictions to
the overall number of positive instances
in the dataset.
Recall = Correctly predicted positives / Actual positive values in the dataset

Recall = TP / (TP + FN)

In %, Recall = TP / (TP + FN) x 100 %
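With the same hypothetical counts (TP = 40, FN = 10):
Recall = 40 / (40 + 10) = 40 / 50 = 0.80, i.e. 80%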


F1 score

A measure of the balance between Precision and Recall. It is computed as per the following formula:

F1 = 2 x (Precision x Recall) / (Precision + Recall)
F1 score (alternative form)

The same F1 score can also be computed directly from the confusion matrix counts, as per the following formula:

F1 = TP / (TP + ½ (FP + FN))
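With the same hypothetical counts (TP = 40, FP = 5, FN = 10), both forms of the formula agree:
F1 = 2 x (0.889 x 0.80) / (0.889 + 0.80) ≈ 0.842
F1 = 40 / (40 + ½ (5 + 10)) = 40 / 47.5 ≈ 0.842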
