Chap3 Part1 Classification
ML Team
Outline:
1- Introduction
2- Classification:
- Basic concepts
- Process
3- Classification Methods Evaluation:
- Confusion Matrix
- Accuracy, Recall, Precision and F1 Score
- ROC Curve
- AUC
Introduction:
What is Classification?
1) Accuracy rate
2) The recall
3) The precision
4) F1 Score
Example:
• We have a database of customers who have subscribed
to a service.
• Customers who are still subscribers.
• Customers who have canceled the service.
Example:
• We build a churn score: for each customer, we predict whether
they will cancel or keep their subscription the following month.
• How well does this score perform?
• How much can we trust it to predict future cancellations?
Confusion Matrix:
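To make the confusion matrix concrete for the churn example, here is a minimal sketch that counts its four cells for a binary problem. The function name `confusion_matrix`, the label encoding (1 = canceled, 0 = stayed) and the sample labels are illustrative assumptions, not data from the course.

```python
def confusion_matrix(y_true, y_pred, positive=1):
    """Count TP, FP, FN and TN for a binary classification problem."""
    tp = fp = fn = tn = 0
    for actual, predicted in zip(y_true, y_pred):
        if predicted == positive:
            if actual == positive:
                tp += 1  # predicted churn, customer really canceled
            else:
                fp += 1  # predicted churn, customer actually stayed
        else:
            if actual == positive:
                fn += 1  # predicted stay, customer actually canceled
            else:
                tn += 1  # predicted stay, customer really stayed
    return {"TP": tp, "FP": fp, "FN": fn, "TN": tn}


# Hypothetical labels: 1 = customer canceled, 0 = customer stayed.
y_true = [1, 0, 1, 1, 0, 0, 1, 0]
y_pred = [1, 0, 0, 1, 0, 1, 1, 0]
counts = confusion_matrix(y_true, y_pred)
print(counts)
```

The four counts are the raw material for the evaluation measures listed in the outline.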
Accuracy:
Precision:
Precision and recall measures
• Then we have
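The usual textbook definitions of accuracy, precision, recall and F1 can be sketched from the four confusion-matrix counts as follows. The function name and the example counts are assumptions chosen for illustration. Returning 0.0 when a denominator is zero is one possible convention, not the only one.

```python
def classification_metrics(tp, fp, fn, tn):
    """Compute accuracy, precision, recall and F1 from confusion-matrix counts."""
    total = tp + fp + fn + tn
    # Accuracy: share of all predictions that are correct.
    accuracy = (tp + tn) / total if total else 0.0
    # Precision: share of predicted positives that are truly positive.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    # Recall: share of actual positives that were found.
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    # F1: harmonic mean of precision and recall.
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}


# Hypothetical counts for a churn score evaluated on 8 customers.
metrics = classification_metrics(tp=3, fp=1, fn=1, tn=3)
print(metrics)
```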
Receiver Operating Characteristic (ROC) curve
• The ROC curve is a tool for evaluating and comparing
models:
- Independent of the misclassification cost matrix:
it shows whether a model M1 is better than a model M2
regardless of the costs assigned to each type of error.
- Usable even with highly imbalanced class
distributions: it avoids the distortions that affect the
confusion matrix, which requires committing to a hard
class assignment.
- A graphical tool that visualizes performance: a single
glance should show which model best suits our
needs.
Example ROC curves
Drawing an ROC curve
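One common way to draw an ROC curve is to sort the examples by decreasing score and lower the decision threshold step by step, recording the false positive rate (FPR) and true positive rate (TPR) each time. The sketch below follows that procedure. The function name, labels and scores are illustrative assumptions, and the code assumes both classes are present in the data.

```python
def roc_points(y_true, scores):
    """Return the (FPR, TPR) points obtained by sweeping the threshold."""
    n_pos = sum(1 for y in y_true if y == 1)
    n_neg = len(y_true) - n_pos
    # Rank examples from the highest score to the lowest.
    ranked = sorted(zip(scores, y_true), key=lambda pair: pair[0], reverse=True)
    points = [(0.0, 0.0)]  # threshold above every score: nothing predicted positive
    tp = fp = 0
    i = 0
    while i < len(ranked):
        threshold = ranked[i][0]
        # Handle tied scores together so they produce a single point.
        while i < len(ranked) and ranked[i][0] == threshold:
            if ranked[i][1] == 1:
                tp += 1
            else:
                fp += 1
            i += 1
        points.append((fp / n_neg, tp / n_pos))
    return points


# Hypothetical churn scores: 1 = canceled, 0 = stayed.
y_true = [1, 1, 0, 1, 0, 0]
scores = [0.9, 0.8, 0.7, 0.6, 0.4, 0.2]
points = roc_points(y_true, scores)
for fpr, tpr in points:
    print(f"FPR={fpr:.2f}  TPR={tpr:.2f}")
```

Plotting TPR against FPR for these points gives the curve. It always starts at (0, 0) and ends at (1, 1).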
Area under the curve (AUC)
• Which classifier is better, C1 or C2?
– It depends on which region you talk about.
• Can we have one measure?
– Yes, we compute the area under the curve (AUC)
• If the AUC of Ci is greater than that of Cj, Ci is said to be
better than Cj.
– A perfect classifier has an AUC of 1.
– A classifier that guesses at random has an AUC of
0.5.
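As a rough illustration, the area under a ROC curve can be approximated with the trapezoidal rule applied to its (FPR, TPR) points. The function name and the two sample curves below are assumptions chosen to reproduce the two reference values, 1 for a perfect classifier and 0.5 for random guessing.

```python
def auc_trapezoid(points):
    """Approximate the area under a ROC curve given as (FPR, TPR) points."""
    pts = sorted(points)  # order by FPR, then by TPR
    area = 0.0
    for (x0, y0), (x1, y1) in zip(pts, pts[1:]):
        # Area of the trapezoid between two consecutive points.
        area += (x1 - x0) * (y0 + y1) / 2.0
    return area


# Idealized curves, used only as reference cases.
perfect = [(0.0, 0.0), (0.0, 1.0), (1.0, 1.0)]
random_guess = [(0.0, 0.0), (1.0, 1.0)]
print(auc_trapezoid(perfect))
print(auc_trapezoid(random_guess))
```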
ROC curve Comparison
AUC Evaluation
Perfect Case:
• The curve of M1 is always above
that of M2:
there is no situation in which
M2 would be the better
classification model.
AUC Evaluation
• Mann-Whitney Measure:
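The AUC is known to equal the normalized Mann-Whitney U statistic: the probability that a randomly chosen positive example receives a higher score than a randomly chosen negative one, with ties counted as one half. A minimal sketch of this pairwise computation follows. The function name and the data are illustrative assumptions, and the quadratic loop is meant for clarity rather than efficiency.

```python
def auc_mann_whitney(y_true, scores):
    """AUC as the share of (positive, negative) pairs ranked correctly."""
    pos = [s for y, s in zip(y_true, scores) if y == 1]
    neg = [s for y, s in zip(y_true, scores) if y == 0]
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0   # positive ranked above negative
            elif p == n:
                wins += 0.5   # a tie counts as half a win
    return wins / (len(pos) * len(neg))


# Hypothetical churn scores: 1 = canceled, 0 = stayed.
y_true = [1, 1, 0, 1, 0, 0]
scores = [0.9, 0.8, 0.7, 0.6, 0.4, 0.2]
auc = auc_mann_whitney(y_true, scores)
print(round(auc, 4))
```

Here 8 of the 9 positive-negative pairs are ordered correctly, so the AUC is 8/9.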
Classification Algorithms
• k-Nearest Neighbour (k-NN)
• Decision Tree
• Support Vector Machine (SVM)
• Naive Bayes
• Logistic Regression