UNIT4 Confusion Matrix
Confusion Matrix
The confusion matrix is a table used to evaluate the performance of a classification model on a given set of test data. It can be computed only when the true labels of the test data are known. The matrix itself is easy to read, but the related terminology can be confusing. Because it presents the model's errors in matrix form, it is also known as an error matrix.
Some features of the confusion matrix are:
1. For a classifier with 2 prediction classes, the matrix is a 2*2 table; for 3 classes, it is a 3*3 table, and so on (see the short sketch after the table below).
2. The matrix is divided into two dimensions, predicted values and actual values, along with the total number of predictions.
3. Predicted values are the values predicted by the model, and actual values are the true values for the given observations.
For a binary classifier, it looks like the table below:

                        Actual Positive        Actual Negative
Predicted Positive      True Positive (TP)     False Positive (FP)
Predicted Negative      False Negative (FN)    True Negative (TN)
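
As a quick illustration of point 1 above, here is a minimal sketch (using scikit-learn, with made-up labels for a hypothetical three-class problem) showing that the matrix grows to 3*3 when there are three classes:

# a three-class problem gives a 3*3 confusion matrix
from sklearn.metrics import confusion_matrix

# made-up actual and predicted labels for classes 0, 1, 2
actual = [0, 1, 2, 2, 0, 1, 1, 2, 0]
predicted = [0, 2, 2, 2, 0, 1, 0, 2, 1]

matrix = confusion_matrix(actual, predicted, labels=[0, 1, 2])
print(matrix)  # 3*3 table: rows are actual classes, columns are predicted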
F-measure:
If one model has low precision and high recall, or vice versa, it is difficult to compare the two models. For this purpose, we can use the F-score.
This score helps us evaluate recall and precision at the same time. The F-score is maximum when recall equals precision. It can be calculated using the formula below:

F-score = (2 * Recall * Precision) / (Recall + Precision)
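
As a small worked example (the tp, fp and fn counts below are made up for illustration), the F-score can be computed directly from precision and recall:

# a minimal sketch: F-score from precision and recall
# (tp, fp, fn are made-up example counts)
tp, fp, fn = 7, 2, 3

precision = tp / (tp + fp)  # 7 / 9, about 0.778
recall = tp / (tp + fn)  # 7 / 10 = 0.700
f_score = 2 * precision * recall / (precision + recall)
print(round(f_score, 3))  # about 0.737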
Confusion Matrix using scikit-learn in Python
# confusion matrix in sklearn
from sklearn.metrics import confusion_matrix
from sklearn.metrics import classification_report

# actual values
actual = [1, 0, 0, 1, 0, 0, 1, 0, 0, 1]
# predicted values
predicted = [1, 0, 0, 1, 0, 0, 0, 1, 0, 0]

# confusion matrix (labels=[1, 0] lists the positive class first)
matrix = confusion_matrix(actual, predicted, labels=[1, 0])
print('Confusion matrix : \n', matrix)  # [[2 2] [1 5]]

# outcome values order in sklearn: with labels=[1, 0] the flattened
# matrix reads tp, fn, fp, tn
tp, fn, fp, tn = confusion_matrix(actual, predicted, labels=[1, 0]).reshape(-1)
print('Outcome values : \n', tp, fn, fp, tn)  # 2 2 1 5

# classification report for precision, recall, f1-score and accuracy
report = classification_report(actual, predicted, labels=[1, 0])
print('Classification report : \n', report)
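
To connect the report back to the raw counts, the short continuation below (a sketch, reusing tp, fn and fp from the script above) recomputes the positive-class metrics by hand and checks them against scikit-learn's helper functions:

# cross-check: positive-class metrics by hand vs. sklearn helpers
from sklearn.metrics import precision_score, recall_score, f1_score

precision = tp / (tp + fp)  # 2 / 3, about 0.667
recall = tp / (tp + fn)  # 2 / 4 = 0.500
f1 = 2 * precision * recall / (precision + recall)  # about 0.571

print(precision, precision_score(actual, predicted))  # both about 0.667
print(recall, recall_score(actual, predicted))  # both 0.5
print(f1, f1_score(actual, predicted))  # both about 0.571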