Lecture - 3
Machine Learning
Accuracy = (TP + TN) / (TP + FP + FN + TN)
🞂 We can use the accuracy_score function of sklearn.metrics to compute the accuracy of our classification model.
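🞂 A minimal sketch of that call, assuming hypothetical y_true and y_pred label arrays (1 = positive, 0 = negative):

    from sklearn.metrics import accuracy_score

    # Hypothetical ground-truth and predicted labels
    y_true = [1, 0, 1, 1, 0, 1, 0, 0]
    y_pred = [1, 0, 1, 0, 0, 1, 1, 0]

    # Accuracy = (TP + TN) / (TP + FP + FN + TN)
    print(accuracy_score(y_true, y_pred))  # 0.75 -> 6 of 8 labels match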
Precision = TP / (TP + FP) = 100 / (100 + 10) ≈ 91 %
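🞂 A small sketch checking this arithmetic; the counts TP = 100 and FP = 10 come from the example confusion matrix, and precision_score is the sklearn.metrics equivalent for hypothetical label arrays:

    from sklearn.metrics import precision_score

    # Counts from the example confusion matrix
    TP, FP = 100, 10
    print(TP / (TP + FP))  # 0.909... ~ 91 %

    # Equivalent sklearn call on hypothetical labels
    y_true = [1, 1, 0, 0, 1]
    y_pred = [1, 0, 0, 1, 1]
    print(precision_score(y_true, y_pred))  # TP = 2, FP = 1 -> 0.666...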
🞂 Recall may be defined as the proportion of actual positives that are correctly identified by our ML model. We can easily calculate it from the confusion matrix with the help of the following formula −
🞂 Recall = TP / (TP + FN) = 100 / (100 + 5) ≈ 95 %
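🞂 The same kind of sketch for recall, using the example counts TP = 100 and FN = 5 and the sklearn.metrics recall_score function:

    from sklearn.metrics import recall_score

    # Counts from the example confusion matrix
    TP, FN = 100, 5
    print(TP / (TP + FN))  # 0.952... ~ 95 %

    # Equivalent sklearn call on hypothetical labels
    y_true = [1, 1, 0, 0, 1]
    y_pred = [1, 0, 0, 1, 1]
    print(recall_score(y_true, y_pred))  # TP = 2, FN = 1 -> 0.666...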
🞂 Specificity, in contrast to recall, may be defined as the proportion of actual negatives that are correctly identified by our ML model. We can easily calculate it from the confusion matrix with the help of the following formula −
🞂 Specificity = TN / (TN + FP) = 50 / (50 + 10) ≈ 83 %
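🞂 sklearn.metrics has no dedicated specificity function; one common sketch, assuming hypothetical binary labels, derives it from confusion_matrix:

    from sklearn.metrics import confusion_matrix

    # Hypothetical binary labels
    y_true = [0, 0, 0, 1, 1, 1]
    y_pred = [0, 0, 1, 1, 1, 0]

    # For a 2x2 confusion matrix, ravel() returns TN, FP, FN, TP
    tn, fp, fn, tp = confusion_matrix(y_true, y_pred).ravel()
    print(tn / (tn + fp))  # specificity = 2 / (2 + 1) = 0.666...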
🞂 This score gives us the harmonic mean of precision and recall. Mathematically, the F1 score combines precision and recall into a single value to which both contribute equally. The best value of F1 would be 1 and the worst would be 0. We can calculate the F1 score with the help of the following formula −
🞂 𝑭𝟏 = 𝟐 ∗ (𝒑𝒓𝒆𝒄𝒊𝒔𝒊𝒐𝒏 ∗ 𝒓𝒆𝒄𝒂𝒍𝒍) / (𝒑𝒓𝒆𝒄𝒊𝒔𝒊𝒐𝒏 + 𝒓𝒆𝒄𝒂𝒍𝒍)
🞂 = 2 × (91 × 95) / (91 + 95)
🞂 ≈ 92.96 %
🞂 ≈ 0.93, which is close to the best value of 1
🞂 In the F1 score, precision and recall make equal relative contributions.
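🞂 A quick sketch that checks the harmonic-mean arithmetic with the precision and recall from the example, and shows the equivalent f1_score call on hypothetical labels:

    from sklearn.metrics import f1_score

    # Precision and recall from the example, kept as exact fractions
    precision, recall = 100 / 110, 100 / 105
    print(2 * precision * recall / (precision + recall))  # ~0.93

    # sklearn computes the same harmonic mean from label arrays
    y_true = [1, 1, 0, 0, 1]
    y_pred = [1, 0, 0, 1, 1]
    print(f1_score(y_true, y_pred))  # precision = recall = 2/3 -> F1 = 0.666...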
🞂 We can use the classification_report function of sklearn.metrics to get the classification report of our classification model.
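🞂 A minimal sketch, again assuming hypothetical label arrays; the report lists precision, recall, F1-score and support for each class:

    from sklearn.metrics import classification_report

    y_true = [1, 0, 1, 1, 0, 1, 0, 0]
    y_pred = [1, 0, 1, 0, 0, 1, 1, 0]
    print(classification_report(y_true, y_pred))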
🞂 AUC-ROC (Area Under Curve - Receiver Operating Characteristic) is a performance metric for classification problems, based on varying threshold values.
🞂 As the name suggests, ROC is a probability curve and AUC measures the degree of separability.
🞂 In simple words, the AUC-ROC metric tells us how capable the model is of distinguishing between the classes. The higher the AUC, the better the model.
🞂 Mathematically, the curve is created by plotting TPR (True Positive Rate), i.e. sensitivity or recall, against FPR (False Positive Rate), i.e. 1 − specificity, at various threshold values.
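🞂 A minimal sketch, assuming hypothetical predicted probabilities for the positive class; roc_curve returns one (FPR, TPR) point per threshold and roc_auc_score gives the area under that curve:

    from sklearn.metrics import roc_curve, roc_auc_score

    # Hypothetical true labels and predicted probabilities for the positive class
    y_true  = [0, 0, 1, 1, 0, 1]
    y_score = [0.1, 0.4, 0.35, 0.8, 0.2, 0.7]

    fpr, tpr, thresholds = roc_curve(y_true, y_score)
    print(list(zip(fpr, tpr)))             # the points of the ROC curve
    print(roc_auc_score(y_true, y_score))  # area under the curve, ~0.89 here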
🞂 TPR = TP / (TP + FN) = 100 / 105 ≈ 0.95
🞂 False Positive Rate (FPR), i.e. 1 − specificity, is the proportion of actual negatives that are incorrectly classified as positive. It is defined as follows −
🞂 FPR = FP / (FP + TN) = 10 / 60 ≈ 0.17
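🞂 A short check of these two values, using the same example counts (TP = 100, FN = 5, FP = 10, TN = 50):

    # Counts from the example confusion matrix
    TP, FN, FP, TN = 100, 5, 10, 50

    tpr = TP / (TP + FN)  # sensitivity / recall
    fpr = FP / (FP + TN)  # 1 - specificity
    print(round(tpr, 2), round(fpr, 2))  # 0.95 0.17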