Confusion Matrix & Evaluation Metrics in Machine Learning
Evaluation Metrics in Machine Learning
Metrics to Evaluate Machine Learning Classification Algorithms
• Now that we have an idea of the different types of
classification models, it is crucial to choose the right
evaluation metrics for those models.
• We will cover the most commonly used metrics: accuracy, precision, recall, F1 score, and the ROC (Receiver Operating Characteristic) curve with its AUC (Area Under the Curve), as previewed in the sketch below.
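As a quick preview, the following minimal sketch shows one way to compute these metrics in Python with scikit-learn (assumed to be installed); the labels and scores are made up purely for illustration.

# Preview: computing the metrics listed above with scikit-learn.
from sklearn.metrics import (accuracy_score, precision_score,
                             recall_score, f1_score, roc_auc_score)

y_true  = [1, 0, 1, 1, 0, 1, 0, 0, 1, 0]   # actual classes (made up)
y_pred  = [1, 0, 1, 0, 0, 1, 1, 0, 1, 0]   # hard predictions (made up)
y_score = [0.9, 0.2, 0.8, 0.4, 0.1, 0.7,
           0.6, 0.3, 0.95, 0.05]           # predicted probabilities (made up)

print("Accuracy :", accuracy_score(y_true, y_pred))
print("Precision:", precision_score(y_true, y_pred))
print("Recall   :", recall_score(y_true, y_pred))
print("F1 score :", f1_score(y_true, y_pred))
print("ROC AUC  :", roc_auc_score(y_true, y_score))  # AUC uses scores, not hard labels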
Understanding the Confusion Matrix in Machine Learning
• Machine learning models are increasingly used in
various applications to classify data into different
categories.
• However, evaluating the performance of these models
is crucial to ensure their accuracy and reliability.
• One essential tool in this evaluation process is the
confusion matrix.
What is a Confusion Matrix?
• A confusion matrix is a simple table that shows how well a
classification model is performing by comparing its
predictions to the actual results.
• It breaks down the predictions into four categories:
• Correct predictions for both classes (true positives and true
negatives) and
• Incorrect predictions (false positives and false negatives).
• This helps you understand where the model is making
mistakes, so you can improve it.
A 2×2 Confusion Matrix
• The matrix displays the number of instances produced by the
model on the test data.
• True Positive (TP): The model correctly predicted a positive
outcome (the actual outcome was positive).
• True Negative (TN): The model correctly predicted a negative
outcome (the actual outcome was negative).
• False Positive (FP): The model incorrectly predicted a positive
outcome (the actual outcome was negative). Also known as a
Type I error.
• False Negative (FN): The model incorrectly predicted a
negative outcome (the actual outcome was positive). Also
known as a Type II error (see the sketch after this list).
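The following minimal sketch, assuming scikit-learn is available, shows how these four counts can be read off a 2×2 confusion matrix; the labels are made up for illustration.

from sklearn.metrics import confusion_matrix

y_true = [1, 0, 1, 1, 0, 1, 0, 0]   # actual classes (made up)
y_pred = [1, 0, 0, 1, 0, 1, 1, 0]   # model predictions (made up)

# For binary labels, confusion_matrix returns rows = actual, columns = predicted:
# [[TN, FP],
#  [FN, TP]]
tn, fp, fn, tp = confusion_matrix(y_true, y_pred).ravel()
print("TP:", tp, "TN:", tn, "FP:", fp, "FN:", fn)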
Example - Confusion Matrix for Dog Image Recognition with Numbers
• [Figure: 2×2 confusion matrix for the dog image recognition example; actual dog count = 6.]
ROC Curve and AUC
• The ROC curve plots the true positive rate (TPR) against the false positive rate (FPR) across classification thresholds, and the AUC (Area Under the Curve) is the area under that curve.
• It measures the overall performance of a binary classification model.
• Because both TPR and FPR range between 0 and 1, the area always lies between 0 and 1, and a greater AUC denotes better model performance.
• Our main goal is to maximize this area so that the model achieves a high TPR and a low FPR across thresholds.
• The AUC equals the probability that the model assigns a randomly chosen positive instance a higher predicted probability than a randomly chosen negative instance (see the sketch below).
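The following minimal sketch, assuming scikit-learn is available, computes the ROC curve and its AUC; the labels and scores are made up for illustration.

from sklearn.metrics import roc_curve, roc_auc_score

y_true  = [0, 0, 1, 1, 0, 1, 1, 0]                   # actual classes (made up)
y_score = [0.1, 0.4, 0.35, 0.8, 0.2, 0.9, 0.6, 0.3]  # predicted probabilities (made up)

fpr, tpr, thresholds = roc_curve(y_true, y_score)  # TPR and FPR at each threshold
auc = roc_auc_score(y_true, y_score)               # area under that curve

print("FPR:", fpr)
print("TPR:", tpr)
print("AUC:", auc)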
Type 1 and Type 2 Errors
Type 1 Error
• A Type 1 Error occurs when the model incorrectly predicts a positive instance, but the actual instance is negative.
• This is also known as a false positive. Type 1 Errors affect the precision of a model, which measures the accuracy of positive predictions (see the sketch after the example below).
Example:
• This occurs when a diagnostic test predicts that a patient has the disease (positive result), but the patient is actually healthy (negative case).
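The following minimal sketch uses made-up counts to show how false positives (Type 1 errors) pull precision down, taking precision = TP / (TP + FP).

tp, fp = 40, 10                       # hypothetical counts: 10 healthy patients wrongly flagged
print("Precision:", tp / (tp + fp))   # 40 / 50 = 0.8

fp = 40                               # more Type 1 errors, same true positives
print("Precision:", tp / (tp + fp))   # 40 / 80 = 0.5, precision drops as FP grows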
Type 2 Error
• A Type 2 Error occurs when the model fails to predict a positive instance,
even though it is actually positive.
• This is also known as a false negative.
• Type 2 Errors impact the recall of a model, which measures how well the model identifies all actual positive cases (see the sketch after the example below).
Example:
Scenario: A diagnostic test is used to detect a particular disease in patients.
• This occurs when the test predicts that a patient is healthy (negative result), but the patient actually has the disease (positive case).
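Similarly, the following minimal sketch uses made-up counts to show how false negatives (Type 2 errors) pull recall down, taking recall = TP / (TP + FN).

tp, fn = 40, 10                    # hypothetical counts: 10 diseased patients the test missed
print("Recall:", tp / (tp + fn))   # 40 / 50 = 0.8

fn = 40                            # more Type 2 errors, same true positives
print("Recall:", tp / (tp + fn))   # 40 / 80 = 0.5, recall drops as FN grows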