Confusion Matrix & Box Plot
A confusion matrix is a table that shows how well a machine learning algorithm performs a
classification task, by comparing its predicted labels against the actual labels.
0 – Negative, 1 – Positive
Actual_data : [1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1]
Predicted_data : [1, 1, 0, 0, 1, 0, 1, 1, 0, 0, 1, 0, 0, 1, 1]
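As a quick sketch (the variable names are illustrative), the four cells of the confusion matrix (TP, TN, FP, FN) can be tallied from these two lists in Python:

actual    = [1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1]
predicted = [1, 1, 0, 0, 1, 0, 1, 1, 0, 0, 1, 0, 0, 1, 1]

# Count each cell of the 2x2 confusion matrix by comparing the lists pairwise
tp = sum(1 for a, p in zip(actual, predicted) if a == 1 and p == 1)  # true positives
tn = sum(1 for a, p in zip(actual, predicted) if a == 0 and p == 0)  # true negatives
fp = sum(1 for a, p in zip(actual, predicted) if a == 0 and p == 1)  # false positives
fn = sum(1 for a, p in zip(actual, predicted) if a == 1 and p == 0)  # false negatives

print(tp, tn, fp, fn)  # 5 4 3 3 for the lists above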
Accuracy:
Accuracy is the proportion of correct classifications out of all classifications: the total number of
correct predictions divided by the total number of predictions.
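Continuing the sketch above, accuracy in terms of the confusion-matrix cells:

# Accuracy = (TP + TN) / (TP + TN + FP + FN)
accuracy = (tp + tn) / (tp + tn + fp + fn)
print(accuracy)  # (5 + 4) / 15 = 0.6 for the example lists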
Precision:
In a confusion matrix, precision measures the quality of a model's positive predictions: the proportion
of true positives among all predicted positives.
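Continuing the same sketch, precision uses only the predicted-positive cells:

# Precision = TP / (TP + FP)
precision = tp / (tp + fp)
print(precision)  # 5 / 8 = 0.625 for the example lists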
Recall:
In a confusion matrix, recall measures how well a model identifies the actual positives: the proportion
of actual positive cases that were predicted correctly.
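Continuing the sketch, recall uses only the actual-positive cells:

# Recall = TP / (TP + FN)
recall = tp / (tp + fn)
print(recall)  # 5 / 8 = 0.625 for the example lists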
F1 Score:
The F1 score evaluates a classifier's performance by combining the precision and recall scores into a
single value (their harmonic mean). Like the other metrics above, it can be calculated from the confusion
matrix, which summarizes a model's predictive performance on a binary classification task (positive and
negative classes).
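Continuing the sketch, the F1 score combines the precision and recall computed above:

# F1 = 2 * (Precision * Recall) / (Precision + Recall)
f1 = 2 * precision * recall / (precision + recall)
print(f1)  # 0.625 for the example lists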
1. Median (Q2): The median divides the data into two halves.
o Q2 = 14
2. First Quartile (Q1): This is the median of the lower half (first 7 values).
o Q1 = 6
3. Third Quartile (Q3): This is the median of the upper half (last 7 values).
o Q3 = 24
IQR = Q3 − Q1 = 24 − 6 = 18
Summary of Results (a numpy cross-check is sketched after this list)
Q1: 6
Q2 (Median): 14
Q3: 24
IQR: 18
Upper Bound (Q3 + 1.5 × IQR = 24 + 27): 51
Outliers: 70
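As a rough cross-check of these values, the quartiles, IQR, and fences could be computed with numpy; data below is a placeholder for the original list of values (not reproduced here), and numpy's default percentile interpolation may differ slightly from the median-of-halves method used above:

import numpy as np

def five_number_check(data):
    # Quartiles, IQR, and the 1.5 * IQR fences used to flag outliers
    q1, q2, q3 = np.percentile(data, [25, 50, 75])
    iqr = q3 - q1
    lower_fence = q1 - 1.5 * iqr
    upper_fence = q3 + 1.5 * iqr   # 24 + 1.5 * 18 = 51 for the values above
    outliers = [x for x in data if x < lower_fence or x > upper_fence]
    return q1, q2, q3, iqr, lower_fence, upper_fence, outliers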
To create the box plot for this data, Python can be used to visualize it. In the resulting plot, the
outlier (70) is clearly shown beyond the whiskers; this value exceeds the upper bound of 51, confirming
it as an outlier.
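A minimal matplotlib sketch of that box plot; data is again a placeholder for the original list of values, so the call is left commented out:

import matplotlib.pyplot as plt

def plot_box(data):
    # Points beyond Q3 + 1.5 * IQR (51 here) are drawn as individual outlier markers
    plt.boxplot(data)
    plt.ylabel("Value")
    plt.title("Box plot with outlier beyond the upper whisker")
    plt.show()

# plot_box(data)  # 'data' is the original list of values, not reproduced here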