Lesson 6 Analytics Methods
Model Evaluation
• Evaluation metrics: How can we measure accuracy? Other metrics to
consider?
• Use a test set of class-labeled tuples, rather than the training set, when
assessing accuracy
• Methods for estimating a classifier’s accuracy:
• Holdout method, random subsampling
• Cross-validation
• Bootstrap
• Comparing classifiers:
• Confidence intervals
• Cost-benefit analysis and ROC Curves
Classifier Evaluation Metrics: Accuracy & Error Rate

Confusion matrix:
• true positives (TP): cases in which we predicted yes (they have the disease), and they do have the disease
• true negatives (TN): we predicted no, and they don't have the disease
• false positives (FP): we predicted yes, but they don't actually have the disease (also known as a "Type I error")
• false negatives (FN): we predicted no, but they actually do have the disease (also known as a "Type II error")

Classifier accuracy, or recognition rate: the percentage of test set tuples that are correctly classified,
Accuracy = (TP + TN) / (TP + TN + FP + FN)
Error rate: 1 − accuracy, or
Error rate = (FP + FN) / (TP + TN + FP + FN)
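As a quick check of these two formulas, here is a minimal Python sketch; the function name and the example counts are illustrative assumptions, not part of the lesson.

# Minimal sketch: accuracy and error rate from confusion-matrix counts.
# The function name and the example counts are hypothetical.
def accuracy_and_error(tp, tn, fp, fn):
    total = tp + tn + fp + fn        # all test set tuples
    accuracy = (tp + tn) / total     # fraction classified correctly
    return accuracy, 1 - accuracy    # error rate = 1 - accuracy = (fp + fn) / total

acc, err = accuracy_and_error(tp=100, tn=50, fp=10, fn=5)
print(f"accuracy = {acc:.2f}, error rate = {err:.2f}")  # accuracy = 0.91, error rate = 0.09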
Classifier Evaluation Metrics: Example Confusion Matrix

             Predicted: No   Predicted: Yes   Total
Actual: No   TN = 50         FP = 10          60
Actual: Yes  FN = 5          TP = 100         105
Total        55              110              165

Accuracy: Overall, how often is the classifier correct?
• (TP + TN)/total = (100 + 50)/165 = 0.91
Misclassification Rate: Overall, how often is it wrong?
• (FP + FN)/total = (10 + 5)/165 = 0.09
• equivalent to 1 minus Accuracy
• also known as "Error Rate"
True Positive Rate: When it's actually yes, how often does it predict yes?
• TP/actual yes = 100/105 = 0.95
• also known as "Sensitivity" or "Recall"
False Positive Rate: When it's actually no, how often does it predict yes?
• FP/actual no = 10/60 = 0.17
Specificity: When it's actually no, how often does it predict no?
• TN/actual no = 50/60 = 0.83
• equivalent to 1 minus False Positive Rate
Precision: When it predicts yes, how often is it correct?
• TP/predicted yes = 100/110 = 0.91
Prevalence: How often does the yes condition actually occur in our sample?
• actual yes/total = 105/165 = 0.64
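The arithmetic above can be reproduced directly; the following minimal Python sketch recomputes each metric from the four counts in the example (the variable names are assumptions for illustration).

# Recompute the example's metrics from its confusion-matrix counts.
tp, tn, fp, fn = 100, 50, 10, 5
total = tp + tn + fp + fn              # 165 test set tuples
actual_yes = tp + fn                   # 105
actual_no = tn + fp                    # 60
predicted_yes = tp + fp                # 110

print(f"accuracy      = {(tp + tn) / total:.2f}")   # 0.91
print(f"error rate    = {(fp + fn) / total:.2f}")   # 0.09
print(f"TPR (recall)  = {tp / actual_yes:.2f}")     # 0.95
print(f"FPR           = {fp / actual_no:.2f}")      # 0.17
print(f"specificity   = {tn / actual_no:.2f}")      # 0.83
print(f"precision     = {tp / predicted_yes:.2f}")  # 0.91
print(f"prevalence    = {actual_yes / total:.2f}")  # 0.64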
[Figure: sensitivity shown in yellow, specificity in red]
[Figure: precision shown in red, recall in yellow]
Equations
• sensitivity = recall = TP / P = TP / (TP + FN), where P is the number of actual positives
• specificity = TN / N = TN / (TN + FP), where N is the number of actual negatives
• precision = TP / P′ = TP / (TP + FP), where P′ is the number of predicted positives
Equations explanation
• Sensitivity/recall – how good a test is at detecting the positives. A test
can cheat and maximize this by always returning “positive”.
• Specificity – how good a test is at avoiding false alarms. A test can
cheat and maximize this by always returning “negative”.
• Precision – how many of the positively classified were relevant. A test
can cheat and maximize this by only returning positive on one result
it’s most confident in.
• Such cheating is exposed by looking at both relevant metrics instead of
just one: e.g., the cheating test with 100% sensitivity that always says
“positive” has 0% specificity.
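A tiny Python sketch makes the first cheat concrete; the ground-truth labels below are invented for illustration.

# An "always positive" classifier: sensitivity is maximal, specificity collapses.
labels = [1, 1, 0, 1, 0, 0, 0, 1]       # hypothetical ground truth (1 = positive)
predictions = [1] * len(labels)          # the cheating test: always say "positive"

tp = sum(1 for y, p in zip(labels, predictions) if y == 1 and p == 1)
fn = sum(1 for y, p in zip(labels, predictions) if y == 1 and p == 0)
tn = sum(1 for y, p in zip(labels, predictions) if y == 0 and p == 0)
fp = sum(1 for y, p in zip(labels, predictions) if y == 0 and p == 1)

print(f"sensitivity = {tp / (tp + fn):.0%}")  # 100%: every positive is caught
print(f"specificity = {tn / (tn + fp):.0%}")  # 0%: every negative is a false alarm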
Classifier Evaluation Metrics:
Sensitivity and Specificity
• Class Imbalance Problem:
• one class may be rare, e.g. in fraud detection or medical data
• a significant majority of tuples belong to the negative class
and a minority to the positive class
• Sensitivity: true positive recognition rate, TP / P = TP / (TP + FN)
• Specificity: true negative recognition rate, TN / N = TN / (TN + FP)
Classifier Evaluation Metrics: Example
[Figure: example confusion matrix showing predicted class versus actual class]
[Fig 2: Cross-Validation Method]
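Since only the figure caption survives, here is a minimal sketch of the k-fold idea the figure depicts: each tuple serves once as test data and k − 1 times as training data. The fold count and data size are assumptions.

# Minimal k-fold cross-validation sketch; k and n are hypothetical.
def k_fold_indices(n, k):
    # Split indices 0..n-1 into k roughly equal folds.
    fold_sizes = [n // k + (1 if i < n % k else 0) for i in range(k)]
    folds, start = [], 0
    for size in fold_sizes:
        folds.append(list(range(start, start + size)))
        start += size
    return folds

n, k = 10, 5
folds = k_fold_indices(n, k)
for i, test_idx in enumerate(folds):
    # Train on every fold except the current test fold.
    train_idx = [j for f in folds if f is not test_idx for j in f]
    print(f"fold {i}: test={test_idx}, train={train_idx}")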
Model Selection: ROC Curves

• ROC (Receiver Operating Characteristic) curves: for visual comparison
of classification models
• Originated from signal detection theory
• Shows the trade-off between the true positive rate and the false
positive rate
• The area under the ROC curve is a measure of the accuracy of the model
• Rank the test tuples in decreasing order: the one that is most likely
to belong to the positive class appears at the top of the list
• The closer the curve is to the diagonal line (i.e., the closer the area
is to 0.5), the less accurate the model

[Figure: ROC curves for Model 1 and Model 2, with a diagonal reference
line. The vertical axis represents the true positive rate; the horizontal
axis represents the false positive rate. A model with perfect accuracy
will have an area of 1.0. Model 1 is better than Model 2. Why?]
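To make the ranking idea concrete, here is a minimal Python sketch that sweeps down the test tuples ranked by decreasing score, traces the (FPR, TPR) points of the ROC curve, and estimates the area under it with the trapezoidal rule; the scores and labels are invented for illustration.

def roc_points(labels, scores):
    # Sort tuples by decreasing score; each step down the ranking
    # lowers the threshold and adds one more predicted positive.
    ranked = sorted(zip(scores, labels), reverse=True)
    pos = sum(labels)            # actual positives
    neg = len(labels) - pos      # actual negatives
    tp = fp = 0
    points = [(0.0, 0.0)]
    for _score, label in ranked:
        if label == 1:
            tp += 1
        else:
            fp += 1
        points.append((fp / neg, tp / pos))  # (FPR, TPR)
    return points

def auc(points):
    # Trapezoidal-rule estimate of the area under the curve.
    return sum((x1 - x0) * (y0 + y1) / 2
               for (x0, y0), (x1, y1) in zip(points, points[1:]))

labels = [1, 1, 0, 1, 0, 0, 1, 0]                          # hypothetical ground truth
scores = [0.95, 0.85, 0.70, 0.60, 0.55, 0.40, 0.30, 0.20]  # model confidences
print(f"AUC = {auc(roc_points(labels, scores)):.2f}")      # AUC = 0.75

A perfect ranking would place all positives ahead of all negatives, giving an area of 1.0, while a random ranking hugs the diagonal at about 0.5.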