Lecture - 3

Performance Metrics in Machine Learning

Prof. Chaitali Mhatre
Assistant Professor

🞂 Evaluating the performance of a machine learning model is one of the important steps in building an effective ML model.
🞂 To evaluate the performance or quality of the model, different metrics are used; these are known as performance metrics or evaluation metrics.
🞂 Performance metrics are the easiest way to measure the performance of a classification problem where the output can belong to two or more classes.

🞂 Not all metrics can be used for all types of problems; hence, it is important to know and understand which metrics should be used.
🞂 To evaluate the performance of a classification model, different metrics are used, some of which are as follows:
🞂 Accuracy
🞂 Confusion Matrix
🞂 Precision
🞂 Recall
🞂 F1-Score
🞂 AUC (Area Under the Curve)-ROC
🞂 A confusion matrix is a table with two dimensions, "Actual" and "Predicted"; between them, the dimensions cover the "True Positives (TP)", "True Negatives (TN)", "False Positives (FP)" and "False Negatives (FN)" cases, as shown below:

                  Predicted: 1             Predicted: 0
Actual: 1         True Positive (TP)       False Negative (FN)
Actual: 0         False Positive (FP)      True Negative (TN)
🞂 True Positive (TP) − the case when both the actual class and the predicted class of a data point are 1.
🞂 True Negative (TN) − the case when both the actual class and the predicted class of a data point are 0.
🞂 False Positive (FP) − the case when the actual class of a data point is 0 but the predicted class is 1.
🞂 False Negative (FN) − the case when the actual class of a data point is 1 but the predicted class is 0.
🞂 The sklearn.metrics module implements several loss, score, and utility functions to measure classification performance. Some metrics may require probability estimates of the positive class, confidence values, or binary decision values.

🞂 We can use the confusion_matrix function of sklearn.metrics to compute the confusion matrix of our classification model, as sketched below.
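
🞂 A minimal sketch of this call; the label lists below are made up purely for illustration. For 0/1 labels, sklearn lays the matrix out with actual classes as rows and predicted classes as columns, i.e. [[TN, FP], [FN, TP]]:

    # Hypothetical actual and predicted labels, for illustration only
    from sklearn.metrics import confusion_matrix

    y_actual    = [1, 1, 1, 1, 0, 0, 0, 0]
    y_predicted = [1, 1, 1, 0, 1, 1, 0, 0]

    # Rows are actual classes, columns are predicted classes:
    # [[TN, FP],
    #  [FN, TP]]
    print(confusion_matrix(y_actual, y_predicted))
    # [[2 2]
    #  [1 3]]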
🞂 Consider an example: the total number of predictions is 165, of which the model predicted YES 110 times and NO 55 times.
🞂 In reality, there are 60 cases in which the patients don't have the disease and 105 cases in which they do.

Total cases: 165      Actual (YES)    Actual (NO)
Predicted (YES)       100             10
Predicted (NO)        5               50
Accuracy is the most common performance metric for classification algorithms. It may be defined as the number of correct predictions made as a ratio of all predictions made.

We can easily calculate it from the confusion matrix with the help of the following formula:

Accuracy = (TP + TN) / (TP + FP + FN + TN)
         = (100 + 50) / 165
         ≈ 91%
🞂 We can use the accuracy_score function of sklearn.metrics to compute the accuracy of our classification model, as sketched below.

🞂 The sklearn.metrics functions let you assess the quality of your predictions.
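
🞂 A minimal sketch of accuracy_score on the same hypothetical labels used above; the result is simply the fraction of predictions that match the actual labels:

    from sklearn.metrics import accuracy_score

    y_actual    = [1, 1, 1, 1, 0, 0, 0, 0]
    y_predicted = [1, 1, 1, 0, 1, 1, 0, 0]

    # (TP + TN) / total = (3 + 2) / 8
    print(accuracy_score(y_actual, y_predicted))   # 0.625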
🞂 Precision, often used in document retrieval, may be defined as the proportion of documents returned by our ML model that are actually correct.
🞂 We can easily calculate it from the confusion matrix with the help of the following formula:

Precision = TP / (TP + FP)
          = 100 / (100 + 10)
          ≈ 91%
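
🞂 A minimal sketch of the corresponding sklearn call, precision_score, on the same hypothetical labels as before:

    from sklearn.metrics import precision_score

    y_actual    = [1, 1, 1, 1, 0, 0, 0, 0]
    y_predicted = [1, 1, 1, 0, 1, 1, 0, 0]

    # TP / (TP + FP) = 3 / (3 + 2)
    print(precision_score(y_actual, y_predicted))   # 0.6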
🞂 Recall may be defined as the proportion of actual positives that are correctly returned by our ML model. We can easily calculate it from the confusion matrix with the help of the following formula:

Recall = TP / (TP + FN)
       = 100 / (100 + 5)
       ≈ 95%
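
🞂 A minimal sketch of the corresponding sklearn call, recall_score, on the same hypothetical labels:

    from sklearn.metrics import recall_score

    y_actual    = [1, 1, 1, 1, 0, 0, 0, 0]
    y_predicted = [1, 1, 1, 0, 1, 1, 0, 0]

    # TP / (TP + FN) = 3 / (3 + 1)
    print(recall_score(y_actual, y_predicted))   # 0.75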
🞂 Specificity, in contrast to recall, may be defined as the proportion of actual negatives that are correctly identified by our ML model. We can easily calculate it from the confusion matrix with the help of the following formula:

Specificity = TN / (TN + FP)
            = 50 / (50 + 10)
            ≈ 83%
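
🞂 sklearn.metrics has no dedicated specificity function; a common sketch (assuming binary 0/1 labels) is to derive it from the confusion matrix:

    from sklearn.metrics import confusion_matrix

    y_actual    = [1, 1, 1, 1, 0, 0, 0, 0]
    y_predicted = [1, 1, 1, 0, 1, 1, 0, 0]

    # unpack the 2x2 matrix and apply TN / (TN + FP)
    tn, fp, fn, tp = confusion_matrix(y_actual, y_predicted).ravel()
    print(tn / (tn + fp))   # 2 / (2 + 2) = 0.5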
🞂 The F1 score is the harmonic mean of precision and recall. The best possible value of F1 is 1 and the worst is 0. We can calculate it with the help of the following formula:

🞂 F1 = 2 × (precision × recall) / (precision + recall)
      = 2 × (0.91 × 0.95) / (0.91 + 0.95)
      ≈ 0.93, which is close to the ideal value of 1.
🞂 The F1 score gives equal relative weight to precision and recall.
🞂 We can use the classification_report function of sklearn.metrics to get the classification report of our classification model, as sketched below.
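
🞂 A minimal sketch of both calls on the same hypothetical labels; f1_score returns the single F1 value, while classification_report prints precision, recall, F1 and support for every class:

    from sklearn.metrics import f1_score, classification_report

    y_actual    = [1, 1, 1, 1, 0, 0, 0, 0]
    y_predicted = [1, 1, 1, 0, 1, 1, 0, 0]

    # 2 * (precision * recall) / (precision + recall) = 2 * (0.6 * 0.75) / 1.35
    print(f1_score(y_actual, y_predicted))                 # ~0.67
    print(classification_report(y_actual, y_predicted))    # per-class summary table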
🞂 AUC (Area Under Curve)-ROC (Receiver Operating Characteristic) is a performance metric for classification problems, based on varying threshold values.
🞂 As the name suggests, the ROC is a probability curve and the AUC measures separability.
🞂 In simple words, the AUC-ROC metric tells us how capable the model is of distinguishing between the classes. The higher the AUC, the better the model.
🞂 Mathematically, the ROC curve is created by plotting the TPR (True Positive Rate, i.e. sensitivity or recall) against the FPR (False Positive Rate, i.e. 1 − specificity) at various threshold values, with TPR on the y-axis and FPR on the x-axis.
🞂 An ROC curve (receiver operating
characteristic curve) is a graph showing the
performance of a classification model at all
classification thresholds. This curve plots two
parameters:
🞂 True Positive Rate

🞂 False Positive Rate


🞂 True Positive Rate (TPR) is a synonym for
recall and is therefore defined as follows:

🞂 TPR = TP / (TP + FN)
      = 100 / 105
      ≈ 0.95
🞂 False Positive Rate (FPR) is equal to 1 − specificity and is defined as follows:

🞂 FPR = FP / (FP + TN)
      = 10 / 60
      ≈ 0.17
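
🞂 A minimal sketch of computing the ROC curve and AUC with sklearn; the y_scores below are hypothetical predicted probabilities of the positive class (e.g. from a model's predict_proba):

    from sklearn.metrics import roc_curve, roc_auc_score

    y_actual = [0, 0, 1, 1]
    y_scores = [0.1, 0.4, 0.35, 0.8]   # hypothetical probabilities of class 1

    # FPR and TPR at each candidate threshold; plotting tpr vs. fpr gives the ROC curve
    fpr, tpr, thresholds = roc_curve(y_actual, y_scores)
    print(roc_auc_score(y_actual, y_scores))   # area under the curve, here 0.75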
