Confusion Matrix
A confusion matrix is an N x N matrix, where N is the number of classes being predicted. It is a summary of the prediction results on a classification problem: the matrix describes the complete performance of the model, and the correct predictions fall on its diagonal. The confusion matrix is not a performance measure in itself, but almost all performance metrics are built from the four numbers inside it, as sketched in the code below.

Four important terms in a confusion matrix:
•True Positive (TP): the predicted value matches the actual value; the actual class was positive (YES) and the model predicted positive.
•True Negative (TN): the predicted value matches the actual value; the actual class was negative (NO) and the model predicted negative.
•False Positive (FP), also known as a Type I error: the actual class was negative but the model predicted positive.
•False Negative (FN), also known as a Type II error: the actual class was positive but the model predicted negative.

Type I and Type II error (hypothesis-testing view):
•Reject H0 when H0 is false (actual positive, predicted positive): no error; this is a True Positive. The proportion of actual positives identified this way is the sensitivity (true positive rate), also called the power of the test.
•Accept H0 when H0 is false (actual positive, predicted negative): Type II error (β); this is a False Negative.
•Reject H0 when H0 is true (actual negative, predicted positive): Type I error; this is a False Positive.
•Accept H0 when H0 is true (actual negative, predicted negative): no error; this is a True Negative.
•Precision (positive predictive value) is calculated down the "predicted positive" column, the negative predictive value down the "predicted negative" column, and accuracy over the whole table.
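A minimal sketch of how these four cells are usually obtained in code, assuming scikit-learn is available; the small label vectors below are invented purely for illustration and are not data from this lecture:

```python
from sklearn.metrics import confusion_matrix

y_true = [1, 1, 1, 0, 0, 0, 1, 0]   # actual classes (1 = positive, 0 = negative)
y_pred = [1, 0, 1, 0, 1, 0, 1, 0]   # classes predicted by the model

# With labels=[0, 1], the matrix has actual classes as rows and predicted
# classes as columns, so ravel() returns the cells in the order TN, FP, FN, TP.
tn, fp, fn, tp = confusion_matrix(y_true, y_pred, labels=[0, 1]).ravel()
print(f"TP={tp}  TN={tn}  FP={fp}  FN={fn}")
```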
Diagnostic test
                 Predicted (p = 0.5)
Observed         1 (positive)      0 (negative)      % correct
1 (positive)     17 (TP)           0 (FN)            100  (sensitivity)
0 (negative)     3 (FP)            4 (TN)            57.1 (specificity)
Overall %        85 (precision)    100 (NPV)         87.5 (accuracy)
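The percentages quoted in the table above follow directly from its four cell counts. A short sketch of the arithmetic (plain Python, no external data assumed):

```python
# Re-deriving the p = 0.5 table from its cell counts alone.
tp, fn, fp, tn = 17, 0, 3, 4

sensitivity = tp / (tp + fn)                    # recall / true positive rate
specificity = tn / (tn + fp)
precision   = tp / (tp + fp)                    # positive predictive value
accuracy    = (tp + tn) / (tp + tn + fp + fn)

print(f"sensitivity = {sensitivity:.1%}")   # 100.0%
print(f"specificity = {specificity:.1%}")   # 57.1%
print(f"precision   = {precision:.1%}")     # 85.0%
print(f"accuracy    = {accuracy:.1%}")      # 87.5%
```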
Diagnostic test
                 Predicted (p = 0.2)
Observed         1 (positive)      0 (negative)      % correct
1 (positive)     9 (TP)            8 (FN)            52.9 (sensitivity)
0 (negative)     1 (FP)            6 (TN)            85.7 (specificity)
Overall %        90 (precision)                      62.5 (accuracy)
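The two tables above appear to score the same diagnostic test at two different probability cut-offs (p = 0.5 and p = 0.2). The sketch below shows only the thresholding step that precedes the confusion matrix; the probability values, labels, and variable names are assumptions for illustration, not the lecture's data:

```python
import numpy as np
from sklearn.metrics import confusion_matrix

# Hypothetical true labels and predicted probabilities, purely for illustration.
y_true = np.array([1, 1, 1, 0, 0, 1, 0, 0])
p_hat  = np.array([0.9, 0.6, 0.3, 0.4, 0.1, 0.55, 0.25, 0.05])

for threshold in (0.5, 0.2):
    y_pred = (p_hat >= threshold).astype(int)   # classify as positive above the cut-off
    tn, fp, fn, tp = confusion_matrix(y_true, y_pred, labels=[0, 1]).ravel()
    print(f"threshold={threshold}: TP={tp} FN={fn} FP={fp} TN={tn}")
```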
Diagnostic test
                 Predicted
Observed         1 (positive)      0 (negative)      % correct
1 (positive)     90 (TP)           10 (FN)           90 (sensitivity)
0 (negative)     90 (FP)           810 (TN)          90 (specificity)
Overall %        50 (precision)                      90 (accuracy)
Precision vs. Recall

Precision is the more useful metric when False Positives are a bigger concern than False Negatives. Precision matters in music or video recommendation systems, e-commerce websites, and similar applications, where wrong results can lead to customer churn and harm the business.

Recall is the more useful metric when a False Negative is worse than a False Positive. Recall matters in medical diagnosis, where it does not matter much if we raise a false alarm, but the actual positive cases must not go undetected.
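Once predictions exist, both metrics are one-line calls in scikit-learn; a hedged sketch (the label vectors are again invented for illustration):

```python
from sklearn.metrics import precision_score, recall_score

y_true = [1, 0, 1, 1, 0, 0, 1, 0]   # hypothetical actual labels
y_pred = [1, 1, 1, 0, 0, 0, 1, 0]   # hypothetical model predictions

# precision = TP / (TP + FP): of everything flagged positive, how much was right?
# recall    = TP / (TP + FN): of all real positives, how many did we catch?
print("precision:", precision_score(y_true, y_pred))
print("recall:   ", recall_score(y_true, y_pred))
```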
Diagnostic test
                 Predicted
Observed         1 (positive)      0 (negative)      % correct
1 (positive)     30 (TP)           10 (FN)           75 (sensitivity)
0 (negative)     30 (FP)           930 (TN)          96.9 (specificity)
Overall %        50 (precision)                      96 (accuracy)
The total outcome values are: TP = 30, TN = 930, FP = 30, FN = 10. So the accuracy of our model works out to:

Accuracy = (TP + TN) / (TP + TN + FP + FN) = (30 + 930) / 1000 = 0.96, i.e. 96%.

Our model is effectively saying "I can predict sick people 96% of the time". However, it is doing the opposite: it predicts the people who will NOT get sick with 96% accuracy, while the sick keep spreading the virus. Do you think accuracy is the correct metric for this model, given the seriousness of the issue? Shouldn't we instead measure how many of the actual positive cases we predict correctly, in order to arrest the spread of the contagious virus? Or, out of the cases predicted positive, how many really are positive, to check the reliability of the model? This is where the dual concepts of Precision and Recall come in: only 50% of the cases predicted positive turned out to be actual positives (precision = 30 / 60), whereas 75% of the actual positives were successfully predicted by our model (recall = 30 / 40).
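A quick arithmetic check of this closing example (plain Python, using only the counts stated above):

```python
# Closing example: TP=30, TN=930, FP=30, FN=10.
tp, tn, fp, fn = 30, 930, 30, 10

accuracy  = (tp + tn) / (tp + tn + fp + fn)   # 0.96 -> the misleading 96%
precision = tp / (tp + fp)                    # 0.50 -> only half the alarms are real
recall    = tp / (tp + fn)                    # 0.75 -> a quarter of the sick are missed

print(f"accuracy={accuracy:.0%}  precision={precision:.0%}  recall={recall:.0%}")
```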