05-09-2024

TOD 533
Classification Performance:
Validation and metrics
Amit Das
TODS / AMSOM / AU
[email protected]

Model validation: Holdout sample


• Training set: data for training the model (finding optimum values of its parameters)
• Validation set: data withheld from training, used to assess performance
• Opportunity to set / refine some model (hyper)parameters
• Test set: expose the model to the data of interest for prediction (a sketch of the split follows below)

• Avoid overfitting – customizing the model to quirks of the training data that are absent in other (particularly, target) data
• Prefer simpler models (Occam’s razor)
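A minimal Python sketch of such a three-way split, assuming scikit-learn; the synthetic data is a hypothetical stand-in for a real dataset:

# Three-way holdout: training / validation / test.
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split

X, y = make_classification(n_samples=768, n_features=8, random_state=42)

# Carve off 20% as the test set, then split the remainder into
# training (60% of all data) and validation (20% of all data).
X_rest, X_test, y_rest, y_test = train_test_split(X, y, test_size=0.20, random_state=42)
X_train, X_val, y_train, y_val = train_test_split(X_rest, y_rest, test_size=0.25, random_state=42)

# Fit on the training set, tune (hyper)parameters against the
# validation set, and touch the test set only once at the end.

The 60/20/20 proportions are illustrative; any split that leaves enough data for training can be used.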


k-fold Cross-validation
• Divide the training data into k equally-sized subsets (folds)
• Randomize the order of records first, if necessary
• Train the model on subsets 2, 3, …, k
• Hold out subset 1 for testing the model
• Repeat with subsets 2, 3, …, k serving as the test set in turn
• Stratified k-fold cross-validation preserves the class proportions within each fold
• Average performance over the k runs (accuracy, …) – see the sketch below
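A minimal sketch of stratified 10-fold cross-validation, assuming scikit-learn (the synthetic data and the choice of logistic regression are illustrative):

from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import StratifiedKFold, cross_val_score

X, y = make_classification(n_samples=768, n_features=8, random_state=42)

# Stratification preserves the class proportions in every fold.
cv = StratifiedKFold(n_splits=10, shuffle=True, random_state=42)
scores = cross_val_score(LogisticRegression(max_iter=1000), X, y, cv=cv, scoring="accuracy")

# Average performance over the k runs.
print(scores.mean(), scores.std())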

Comparing predicted to actual

Confusion Matrix (also called a Classification Table)


Performance: accuracy
Accuracy = (TP + TN) / (TP + TN + FP + FN): the fraction of all cases classified correctly

Performance: precision
Precision = TP / (TP + FP): the fraction of predicted positives that are actually positive


Performance: sensitivity (recall)
Sensitivity (Recall) = TP / (TP + FN): the fraction of actual positives correctly identified

Performance: specificity
Specificity = TN / (TN + FP): the fraction of actual negatives correctly identified


Accuracy, precision, sensitivity and specificity


                             Actual
                   Positive               Negative
Predicted
  Positive    True Positive (TP)     False Positive (FP)
  Negative    False Negative (FN)    True Negative (TN)

Accuracy              (TP + TN) / (TP + TN + FP + FN)
Precision             TP / (TP + FP)
Sensitivity (Recall)  TP / (TP + FN)
Specificity           TN / (TN + FP)

In the Diabetes context


                              Predicted
                    Diabetic                  Healthy
Actual
  Diabetic    True Positive (TP) 153    False Negative (FN) 115
  Healthy     False Positive (FP) 60    True Negative (TN) 440

Accuracy              (TP + TN) / (TP + TN + FP + FN) = 593 / 768 = 0.772
Precision             TP / (TP + FP) = 153 / 213 = 0.718
Sensitivity (Recall)  TP / (TP + FN) = 153 / 268 = 0.571
Specificity           TN / (TN + FP) = 440 / 500 = 0.880
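These values can be reproduced directly from the counts in the table; a short Python check:

# Counts from the diabetes confusion matrix above.
TP, FN, FP, TN = 153, 115, 60, 440

accuracy    = (TP + TN) / (TP + TN + FP + FN)  # 0.772
precision   = TP / (TP + FP)                   # 0.718
sensitivity = TP / (TP + FN)                   # 0.571
specificity = TN / (TN + FP)                   # 0.880

print(accuracy, precision, sensitivity, specificity)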


Jamovi output: Classification table


Results
Classification Table – …
                     Predicted
Observed           tested_negative   tested_positive   % Correct
tested_negative    445               55                89.0
tested_positive    112               156               58.2
Note. The cut-off value is set to 0.5

Results
Predictive Measures
Accuracy   Specificity   Sensitivity
0.783      0.890         0.582
Note. The cut-off value is set to 0.5
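Jamovi's table comes from thresholding predicted probabilities at the stated cut-off; a sketch of the same computation with scikit-learn (data and model are illustrative stand-ins, so the counts will differ from Jamovi's):

from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import confusion_matrix

X, y = make_classification(n_samples=768, n_features=8, random_state=42)
model = LogisticRegression(max_iter=1000).fit(X, y)

# Classify as positive whenever P(y = 1) >= 0.5 (the cut-off).
p = model.predict_proba(X)[:, 1]
pred = (p >= 0.5).astype(int)

tn, fp, fn, tp = confusion_matrix(y, pred).ravel()
print(tn, fp, 100 * tn / (tn + fp))  # observed-negative row, % correct
print(fn, tp, 100 * tp / (fn + tp))  # observed-positive row, % correct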

Accuracy of classification: Logistic Regression

[Tool screenshot: classifier output with the Accuracy figure highlighted]


Confusion Matrix: Logistic Regression

[Tool screenshot: confusion matrix with Specificity, Precision, and Sensitivity highlighted]

F-measure
• Harmonic mean of precision and recall:
  F1 = 2 × Precision × Recall / (Precision + Recall)
• More generally,
  Fb = (1 + b²) × Precision × Recall / (b² × Precision + Recall)
• b < 1 focuses on precision, while b > 1 emphasizes recall (see the sketch below)
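Both forms are available in scikit-learn; a small sketch with illustrative labels:

from sklearn.metrics import f1_score, fbeta_score

y_true = [1, 1, 1, 0, 0, 0, 1, 0]
y_pred = [1, 0, 1, 0, 1, 0, 1, 0]

print(f1_score(y_true, y_pred))               # harmonic mean of precision and recall
print(fbeta_score(y_true, y_pred, beta=0.5))  # b < 1: focuses on precision
print(fbeta_score(y_true, y_pred, beta=2.0))  # b > 1: emphasizes recall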


MCC (Matthews correlation coefficient)

• It can be calculated from the confusion matrix as:
  MCC = (TP × TN − FP × FN) / sqrt((TP + FP)(TP + FN)(TN + FP)(TN + FN))
• MCC ranges from −1 to +1: +1 is perfect prediction, 0 is random guessing, −1 is total disagreement (see the sketch below)
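scikit-learn provides this directly; a small sketch with illustrative labels:

from sklearn.metrics import matthews_corrcoef

y_true = [1, 1, 1, 0, 0, 0, 1, 0]
y_pred = [1, 0, 1, 0, 1, 0, 1, 0]

# +1 = perfect prediction, 0 = random guessing, -1 = total disagreement.
print(matthews_corrcoef(y_true, y_pred))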

ROC Curves
• ROC is an abbreviation of Receiver Operating Characteristic, a term from signal detection theory developed during World War II (for the analysis of radar images).
• In the context of classifiers, a ROC plot is a useful tool for studying
• the behavior of a single classifier, or
• comparing two or more classifiers.

• A ROC plot is a two-dimensional graph where the x-axis represents the FP rate (FPR) and the y-axis represents the TP rate (TPR). A sketch of how to draw one follows below.
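A minimal sketch of drawing a ROC curve, assuming scikit-learn and matplotlib (data and model are illustrative):

import matplotlib.pyplot as plt
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_curve
from sklearn.model_selection import train_test_split

X, y = make_classification(n_samples=768, n_features=8, random_state=42)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=42)

p = LogisticRegression(max_iter=1000).fit(X_tr, y_tr).predict_proba(X_te)[:, 1]
fpr, tpr, _ = roc_curve(y_te, p)  # one (FPR, TPR) point per threshold

plt.plot(fpr, tpr)
plt.plot([0, 1], [0, 1], linestyle="--")  # random-guessing diagonal
plt.xlabel("FP rate (FPR)")
plt.ylabel("TP rate (TPR)")
plt.show()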


Comparing classifiers using ROC Plot


• We can use the concept of the “area under the curve” (AUC) as a method to compare two or more classifiers
• If a model is perfect, then its AUC = 1
• If a model simply performs random guessing, then its AUC = 0.5
• A model that is strictly better than another has a larger value of AUC than the other

• [Figure: ROC curves for three classifiers C1, C2, C3] Here, C3 is best, and C2 is better than C1, as AUC(C3) > AUC(C2) > AUC(C1)

Comparison of Area under the ROC curve (AUC)


Classifier   Logistic   Discriminant   KNN-5   Naïve Bayes   Decision Tree   Decision Rules
AUC          0.832      0.832          0.766   0.819         0.751           0.739
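A sketch of how such a comparison could be produced, scoring several classifiers by AUC on a common test split (the models and synthetic data are illustrative, so the numbers will differ from the table):

from sklearn.datasets import make_classification
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import GaussianNB
from sklearn.neighbors import KNeighborsClassifier
from sklearn.tree import DecisionTreeClassifier

X, y = make_classification(n_samples=768, n_features=8, random_state=42)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=42)

models = {
    "Logistic": LogisticRegression(max_iter=1000),
    "Discriminant": LinearDiscriminantAnalysis(),
    "KNN-5": KNeighborsClassifier(n_neighbors=5),
    "Naive Bayes": GaussianNB(),
    "Decision Tree": DecisionTreeClassifier(random_state=42),
}
for name, m in models.items():
    p = m.fit(X_tr, y_tr).predict_proba(X_te)[:, 1]  # P(positive class)
    print(name, round(roc_auc_score(y_te, p), 3))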

Amit’s Grades
AUC > 0.9       Excellent
AUC 0.8 to 0.9  Very Good
AUC 0.7 to 0.8  Good
AUC 0.6 to 0.7  Needs Improvement
AUC 0.5 to 0.6  Hopeless


Multiway Classification: The Iris dataset

SepalLength  SepalWidth  PetalLength  PetalWidth  Species
5.1          3.5         1.4          0.2         Iris-setosa
4.9          3.0         1.4          0.2         Iris-setosa
4.7          3.2         1.3          0.2         Iris-setosa
4.6          3.1         1.5          0.2         Iris-setosa
5.0          3.6         1.4          0.2         Iris-setosa
7.0          3.2         4.7          1.4         Iris-versicolor
6.4          3.2         4.5          1.5         Iris-versicolor
6.9          3.1         4.9          1.5         Iris-versicolor
5.5          2.3         4.0          1.3         Iris-versicolor
6.5          2.8         4.6          1.5         Iris-versicolor
6.3          3.3         6.0          2.5         Iris-virginica
5.8          2.7         5.1          1.9         Iris-virginica
7.1          3.0         5.9          2.1         Iris-virginica
6.3          2.9         5.6          1.8         Iris-virginica
6.5          3.0         5.8          2.2         Iris-virginica

Multinomial Logistic Regression


Model Coefficients - Species
Species                         Predictor    Estimate   SE      Z         p      Odds ratio
Iris-versicolor - Iris-setosa   Intercept    18.68      30.3    0.6165    0.538  1.30e+8
                                PetalWidth   -3.09      39.7    -0.0779   0.938  0.04535
                                PetalLength  13.95      52.6    0.2655    0.791  1.15e+6
                                SepalWidth   -8.65      134.2   -0.0645   0.949  1.75e-4
                                SepalLength  -5.32      76.7    -0.0694   0.945  0.00488
Iris-virginica - Iris-setosa    Intercept    -23.70     31.2    -0.7594   0.448  5.10e-11
                                PetalWidth   15.10      40.2    0.3756    0.707  3.61e+6
                                PetalLength  23.34      52.9    0.4415    0.659  1.37e+10
                                SepalWidth   -15.31     134.2   -0.1140   0.909  2.25e-7
                                SepalLength  -7.78      76.7    -0.1015   0.919  4.17e-4


Multiway classification (Weka)

Logistic Regression with ridge parameter of 1.0E-8


Coefficients...
                  Class
Variable      Iris-setosa   Iris-versicolor
===========================================
SepalLength   21.8065       2.4652
SepalWidth    4.5648        6.6809
PetalLength   -26.3083      -9.4293
PetalWidth    -43.887       -18.2859
Intercept     8.1743        42.637

=== Confusion Matrix ===
  a   b   c   <-- classified as
 50   0   0 | a = Iris-setosa
  0  46   4 | b = Iris-versicolor
  0   2  48 | c = Iris-virginica
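A minimal sketch of a comparable multiclass fit and confusion matrix with scikit-learn, evaluated on the training data as in the Weka output (scikit-learn's default L2 penalty plays a role similar to Weka's ridge parameter):

from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import confusion_matrix

X, y = load_iris(return_X_y=True)

# Multinomial logistic regression on all four predictors.
model = LogisticRegression(max_iter=1000).fit(X, y)

# Rows = actual species, columns = predicted species.
print(confusion_matrix(y, model.predict(X)))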

Separability of classes
• Iris-setosa is linearly separable from the other two species, so the maximum-likelihood estimates of the logistic model diverge – hence the very large coefficients, standard errors, and p-values near 1 in the outputs above; Weka's small ridge parameter regularizes the fit to keep the coefficients finite

