Confusion Matrix
Confusion Matrix
me/index.html)
In a typical data science project we try several models like (logistic regression, SVM, tree-classifiers
etc) on our data.
Then we measure the predicting performance of these models to find the best performing one.
Finally we decide to implement the best performing model.
In this notebook we talk about one of the classification model evaluation tools: Confusion matrix.
They can help us to see deeper how much reliable our models are.
We are going to look at the confusion matrices of a variety of Scikit-Learn models and compare them
using visual diagnostic tools from Yellowbrick in order to select the best model for our data.
import pandas as pd
import category_encoders as ce
import warnings
warnings.filterwarnings("ignore")
%matplotlib inline
Confusion Matrix
Since we know the labels of the test set we can measure how succesfull are the predictions of the
model by comparing the the actual labels and predictions
We can see if our classifier identifies the samples succesfully, or it is "CONFUSED" with another label?
Confusion matrix shows the amount of confusion.
We use confusion matrices to understand which classes are most easily confused
There are two sets of labels in a confusion matrix of binary (2 class) classification:
{POSITIVE, NEGATIVE}- first, the model makes the prediction. It returns the labels 1 (POSITIVE)
or 0 (NEGATIVE).
{TRUE, FALSE}- then the model's prediction is evaluated if the prediction is made correctly
(TRUE) or incorrectly (FALSE) based on the actual known labels.
Tip: If you have difficulty to remember these terms because of the similarity of the terms just insert
the word "PREDICTED" in the middle.
For instance if you are confused with the meaning of "false positive" read it like
"false(ly) PREDICTED positive"
In multi-class classification ie if the class labels are more than 2 (not 1 or 0, positive or negative), the
confusion matrix looks like something like this
We do not use the terms like "true positive" with the confusion matrix with classes more than 2
The size of the confusion matrix is nxn where n is the number of classes
Different references may use a different convention for axes ie actual and predicted classes can take
place on different axes
url='https://raw.githubusercontent.com/rebeccabilbro/rebeccabilbro.github.io/master/data/a
garicus-lepiota.txt'
mushrooms.head(3)
Out[5]:
class cap-shape cap-surface cap-color
In [6]: mushrooms.info()
<class 'pandas.core.frame.DataFrame'>
dtypes: object(4)
mushrooms.nunique()
Out[7]: class 2
cap-shape 6
cap-surface 4
cap-color 10
dtype: int64
We see that target and feature columns contain different categorical values.
We need to encode them into numerical types in order to fit Sklearn models.
For this purpose, we will utilize Category Encoders (http://contrib.scikit-learn.org/categorical-
encoding/index.html)
library which provides scikit-learn-compatible categorical variable encoders.
All the transformers of Category Encoders can be used in Sklearn pipelines.
Later in a separate post we will analyse the encoders
In [8]: # Create the features dataset (X) and target dataset (y)
target = 'class'
X = mushrooms[features]
y = mushrooms[target]
Classifiers Dictionary
Now, tet's create a dictionary which contains the classifiers we want to use for our classification task
Here we create the dictionary with instantiates of Sklearn estimators without hyperparameter
tuning.
In reality we need to evaluate the performance of tuned classifiers.
In [10]: # Estimators dictionary
confusion_matrices function
Let's define a function to get the confusion matrices of a given dictionary of models (like in the upper
cell) easily without repetion.
plt.rcParams['figure.figsize'] = (6, 4)
plt.rcParams['font.size'] = 15
"""
Takes X, y datasets and an estimator dictionary -> returns confusion matrices of the c
lassifiers
"""
print(estimator)
('estimator', estimator_dict[estimator])])
model.fit(X_train, y_train)
cm.score(X_test, y_test)
cm.poof()
confusion_matrices(X, y, estimators_dct)
Logistic Legression
Linear SVC
Random Forest
SGD Classifier
Conclusion
Even though we can get deeper insight on predictions of the classifiers by confusion matrices, still it is
not very practical to compare several models performance with each other
Since confusion matrices provides tables of actual and prediction comparision we still need some
more metrics to interpret the result more directly to choose the best model
So we will continue to work on classification metrics like precision, recall, roc, auc etc in the next
posts
Sources:
http://www.scikit-yb.org/en/latest/api/classifier/confusion_matrix.html (http://www.scikit-
yb.org/en/latest/api/classifier/confusion_matrix.html)
http://contrib.scikit-learn.org/categorical-encoding/index.html (http://contrib.scikit-
learn.org/categorical-encoding/index.html)
Comments
Contents © 2019 Harun (mailto:[email protected]) - Powered by Nikola (https://getnikola.com)