Machine Learning Model Evaluation
In this Python example, we import the Iris dataset, whose features are the lengths and widths of sepals and petals and whose target classes are Iris setosa, Iris virginica, and Iris versicolor. After importing the dataset, we split it into train and test sets in an 80:20 ratio, train a Decision Tree classifier, make predictions on the test set, and calculate the accuracy score, precision, recall, and F1 score. We also plot the confusion matrix.
Importing Libraries and Dataset
Now let’s load the Iris toy dataset from sklearn.datasets and then split it into training and testing parts (for model evaluation) in an 80:20 ratio.
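The original import and data-loading code did not survive extraction, so here is a minimal sketch, assuming the standard scikit-learn helpers used throughout this article:

from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier
from sklearn.metrics import (accuracy_score, precision_score, recall_score,
                             f1_score, confusion_matrix, ConfusionMatrixDisplay)

# Load the Iris features (sepal/petal length and width) and class labels
iris = load_iris()
X = iris.data
y = iris.target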
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.20, random_state=20)
Now, let’s train a Decision Tree Classifier model on the training data, and
then we will move on to the evaluation part of the model using different
metrics.
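The training code itself was not preserved; here is a minimal sketch, assuming a DecisionTreeClassifier with default settings (the variable names are placeholders):

# Fit a decision tree on the training split and predict the test split
tree_clf = DecisionTreeClassifier()
tree_clf.fit(X_train, y_train)
y_pred = tree_clf.predict(X_test)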
1. Accuracy
Accuracy is the fraction of test samples for which the predicted class matches the true class.
print("Accuracy:", accuracy_score(y_test, y_pred))
Output:
Accuracy: 0.9333333333333333
2. Precision and Recall
print("Precision:", precision_score(y_test,
average="weighted"))
print('Recall:', recall_score(y_test,
average="weighted"))
Output:
Precision: 0.9435897435897436
Recall: 0.9333333333333333
3. F1 score
The F1 score is the harmonic mean of precision and recall. Because of the precision-recall trade-off, increasing precision tends to decrease recall and vice versa; the F1 score combines the two into a single number:

F1 Score = (2 * Precision * Recall) / (Precision + Recall)
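The corresponding code was not preserved; here is a sketch using the weighted average, consistent with the precision and recall calls above:

print("F1 score:", f1_score(y_test, y_pred, average="weighted"))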
Output:
F1 score: 0.9327777777777778
4. Confusion Matrix
The confusion matrix summarizes, for each true class, how many test samples received each predicted label. It can be computed and plotted as follows.
cm_display = ConfusionMatrixDisplay(confusion_matrix(y_test, y_pred), display_labels=iris.target_names)
cm_display.plot()
Output:
Confusion matrix plot for the model's test predictions (0 = Setosa, 1 = Versicolor, 2 = Virginica)
From the confusion matrix, we see that all 8 Setosa test samples and all 11 Versicolor test samples were predicted correctly, while 2 of the 11 Virginica test samples were misclassified and the remaining 9 were predicted correctly.
5. AUC-ROC Curve
AUC (Area Under the Curve) is an evaluation metric used to analyze a classification model at different threshold values. The Receiver Operating Characteristic (ROC) curve plots the model's performance across those thresholds using two parameters:
TPR: the True Positive Rate, which follows the same formula as recall.
FPR: the False Positive Rate, defined as the ratio of false positives to the sum of false positives and true negatives.
This curve is useful because it shows the model's capacity to distinguish between classes. Let us illustrate this with a simple Python example.
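The original example code was lost in extraction. Below is a small sketch with illustrative binary labels and scores (made up here, not the article's original data); these particular values give an AUC of 0.75, in line with the output shown:

from sklearn.metrics import roc_auc_score

# Illustrative ground-truth labels and predicted scores (hypothetical values)
y_true = [0, 0, 1, 1]
y_scores = [0.1, 0.8, 0.4, 0.9]

print("Auc", roc_auc_score(y_true, y_scores))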
Output:
Auc 0.75
The remaining metrics evaluate regression models, so we first import the regression error functions:
from sklearn.metrics import mean_absolute_error, mean_squared_error, mean_absolute_percentage_error
Now let’s load the data into a pandas DataFrame and then split it into training and testing parts (for model evaluation) in an 80:20 ratio.
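The article does not preserve which dataset is used here; as a placeholder, assume a pandas DataFrame with numeric feature columns and a numeric target (the file and column names below are hypothetical):

import pandas as pd
from sklearn.model_selection import train_test_split

# Hypothetical file and column names; the original data source was not preserved
df = pd.read_csv("data.csv")
X = df.drop(columns=["target"])
Y = df["target"]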
X_train, X_test, Y_train, Y_test = train_test_split(X, Y, test_size=0.20, random_state=0)
Now, let’s train a simple linear regression model on the training data, and then we will move on to the evaluation part of the model using different metrics.
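The regression training code was also lost; here is a minimal sketch assuming scikit-learn's LinearRegression and the split from above:

from sklearn.linear_model import LinearRegression

# Fit ordinary least squares on the training split and predict the test split
reg = LinearRegression()
reg.fit(X_train, Y_train)
Y_pred = reg.predict(X_test)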
Mean Absolute Error (MAE)
This is the simplest metric used to analyze the loss over the whole dataset. The error is the difference between a predicted value and the corresponding actual value, and MAE is the average of the absolute errors: we take the modulus of each error, sum them, and divide by the total number of data points. It is always a non-negative value. The formula of MAE is given by
MAE = (1/N) * Σ |y_pred − y_actual|, where the sum runs over all N test samples.
print("Mean Absolute Error", mean_absolute_error(y_true=Y_test, y_pred=Y_pred))
Output:
Mean Squared Error (MSE)
MSE is the average of the squared differences between the predicted and actual values. The formula of MSE is given by

MSE = (1/N) * Σ (y_pred − y_actual)^2
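The MSE code did not survive extraction; here is a sketch mirroring the MAE call above:

print("Mean Square Error", mean_squared_error(y_true=Y_test, y_pred=Y_pred))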
Output:
Root Mean Squared Error (RMSE)
RMSE is the square root of the MSE, which puts the error back in the same units as the target variable. The formula of RMSE is given by

RMSE = sqrt((1/N) * Σ (y_pred − y_actual)^2)
print("Root Mean Square Error", mean_squared_error(y_true=Y_test, y_pred=Y_pred, squared=False))
Output:
Root Mean Square Error 1.9951956560725306