6 Model Evaluation

The document discusses model evaluation techniques in classification, focusing on metrics such as accuracy, precision, recall, and F1 score, along with their limitations. It also explains cross-validation methods, including Holdout Validation, Leave-One-Out Cross Validation, Stratified Cross-Validation, and K-Fold Cross Validation, emphasizing their importance in preventing overfitting and ensuring robust model performance. Each method has its advantages and drawbacks, particularly in relation to dataset size and class distribution.


Classification -- Model evaluations

Confusion Matrix
• TP – True Positive ; FP – False Positive
• FN – False Negative; TN – True Negative
                       Predicted Class
                       Class = Yes    Class = No
Actual   Class = Yes   a (TP)         b (FN)
Class    Class = No    c (FP)         d (TN)

a d TP 
Accuracy 
 a b c d TP  TN  FP 
FN

Classification -- Model evaluations
• Given a set of records containing positive and negative results, the computer classifies each record as positive or negative.

• Positive: the computer classifies the result as positive
• Negative: the computer classifies the result as negative
• True: the computer’s classification is correct
• False: the computer’s classification is incorrect

Classification -- Model evaluations
• Limitation of Accuracy
• Consider a 2-class problem
• Number of Class 0 examples = 9990
• Number of Class 1 examples = 10
• If a “stupid” model predicts everything to be class 0, its accuracy is 9990/10000 = 99.9%

• The accuracy is misleading because the model does not detect a single example in class 1
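To make this concrete, here is a minimal plain-Python sketch (the labels are synthetic, mirroring the counts above) showing how a model that always predicts class 0 reaches 99.9% accuracy while detecting nothing in class 1:

```python
# Hypothetical imbalanced dataset: 9990 examples of class 0, 10 of class 1.
y_true = [0] * 9990 + [1] * 10

# A "stupid" model that predicts class 0 for everything.
y_pred = [0] * len(y_true)

correct = sum(t == p for t, p in zip(y_true, y_pred))
print(f"Accuracy: {correct / len(y_true):.1%}")  # 99.9%

# Yet not a single class-1 example is detected.
detected = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
print(f"Class-1 examples detected: {detected}")  # 0
```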

Classification -- Model evaluations
• Cost-sensitive measures
                       Predicted Class
                       Class = Yes    Class = No
Actual   Class = Yes   a (TP)         b (FN)
Class    Class = No    c (FP)         d (TN)

Precision (p) = TP / (TP + FP) = a / (a + c)

Recall (r) = TP / (TP + FN) = a / (a + b)

F-measure (F) = 2rp / (r + p) = 2a / (2a + b + c)

F is the harmonic mean of Precision and Recall (why not just the average?)
How to understand
• Accuracy
• Accuracy = (TP+TN)/(TP+FP+FN+TN)
• How many people did we correctly label out of all the people?

• Precision
• Precision = TP/(TP+FP)
• How many of those who we labeled as diabetic are actually diabetic?

• Recall (sensitivity)
• Recall = TP/(TP+FN)
• Of all the people who are diabetic, how many did we correctly predict?

• F1 Score = 2 * (Recall * Precision) / (Recall + Precision)

• The harmonic mean of precision and recall
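As an illustration, here is a minimal Python sketch of the four formulas, using the confusion-matrix cells a = TP, b = FN, c = FP, d = TN from the earlier slides (the example counts are placeholders, not from the original text):

```python
def classification_metrics(tp, fn, fp, tn):
    """Compute accuracy, precision, recall, and F1 from confusion-matrix cells."""
    accuracy = (tp + tn) / (tp + fp + fn + tn)
    precision = tp / (tp + fp)   # undefined when the model never predicts positive
    recall = tp / (tp + fn)      # a.k.a. sensitivity
    f1 = 2 * precision * recall / (precision + recall)  # harmonic mean of p and r
    return accuracy, precision, recall, f1

# Placeholder counts for illustration: 80 TP, 20 FN, 10 FP, 90 TN.
print(classification_metrics(tp=80, fn=20, fp=10, tn=90))
```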
Which to choose
• Accuracy
• A great measure
• But only when you have symmetric datasets (FN & FP counts are close)
• Also, only when FN & FP have similar costs
• F1 score
• Use it if the costs of FP and FN are different
• F1 is best if you have an uneven class distribution
• Recall
• If a FP is far preferable to a FN, or if FNs are unacceptable/intolerable
• We’d accept some extra FPs (false alarms) to avoid missing FNs
• E.g. diabetes: we’d rather have some healthy people labeled diabetic than leave a diabetic person labeled healthy
• Precision
• Want to be more confident of your TPs
• E.g. spam email: we’d rather have some spam emails in the inbox than some regular emails in the spam box

Example
• Given 30 human photographs, a computer predicts 19 to be male, 11
to be female. Among the 19 male predictions, 3 predictions are not
correct. Among the 11 female predictions, 1 prediction is not
correct.

                   Predicted Class
                   Male           Female
Actual   Male      a = TP = 16    b = FN = 1
Class    Female    c = FP = 3     d = TN = 10

Example
                   Predicted Class
                   Male           Female
Actual   Male      a = TP = 16    b = FN = 1
Class    Female    c = FP = 3     d = TN = 10

• Accuracy = (16 + 10) / (16 + 3 + 1 + 10) = 0.867

• Precision = 16 / (16 + 3) = 0.842
• Recall = 16 / (16 + 1) = 0.941
• F-measure = 2 (0.842)(0.941) / (0.842 + 0.941) = 0.889
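The same numbers can be reproduced with scikit-learn’s metric functions (assuming scikit-learn is available; any equivalent library would do). Labels are encoded as 1 = male (the positive class) and 0 = female:

```python
from sklearn.metrics import accuracy_score, precision_score, recall_score, f1_score

# Reconstruct the 30 photographs from the confusion matrix above.
y_true = [1] * 16 + [1] * 1 + [0] * 3 + [0] * 10   # TP, FN, FP, TN groups
y_pred = [1] * 16 + [0] * 1 + [1] * 3 + [0] * 10

print(f"Accuracy:  {accuracy_score(y_true, y_pred):.3f}")   # 0.867
print(f"Precision: {precision_score(y_true, y_pred):.3f}")  # 0.842
print(f"Recall:    {recall_score(y_true, y_pred):.3f}")     # 0.941
print(f"F-measure: {f1_score(y_true, y_pred):.3f}")         # 0.889
```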
Discussion
• “In a specific case, precision cannot be computed.” Is the statement true? Why?
• If the statement is true, can F-measure be computed in that case?
a b c Classified as
a TP FN FN a: positive
b FP TN TN b: negative
c FP TN TN c: negative

• How about if b is positive and a and c are negative, or if c is positive and a and b are negative?

Cross Validation in Machine Learning
In machine learning, we cannot simply fit a model to the training data and assume it will work accurately on real data. We must ensure that the model has picked up the correct patterns from the data and is not fitting too much noise. For this purpose, we use the cross-validation technique. In this article, we’ll delve into the process of cross-validation in machine learning.
What is Cross-Validation?
Cross-validation is a technique used in machine learning to evaluate the performance of a model on unseen data.
It involves dividing the available data into multiple folds or subsets, using one of these folds as a validation set, and training the model on the remaining folds. This process is repeated multiple times, each time using a different fold as the validation set. Finally, the results from each validation step are averaged to produce a more robust estimate of the model’s performance. Cross-validation is an important step in the machine learning process and helps to ensure that the model generalizes well to new data.
What is cross-validation used for?
The main purpose of cross-validation is to detect overfitting, which occurs when a model fits the training data too closely and performs poorly on new, unseen data. By evaluating the model on multiple validation sets, cross-validation provides a more realistic estimate of the model’s generalization performance, i.e., its ability to perform well on new, unseen data.
Types of Cross-Validation
There are several types of cross-validation techniques, including k-fold cross-validation, leave-one-out cross-validation, holdout validation, and stratified cross-validation. The choice of technique depends on the size and nature of the data, as well as the specific requirements of the problem.
1. Holdout Validation
In holdout validation, we train on 50% of the given dataset and use the remaining 50% for testing. It’s a simple and quick way to evaluate a model. The major drawback is that when we train on only 50% of the dataset, the remaining 50% may contain important information that the model never sees during training, which can lead to higher bias.
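A minimal holdout sketch using scikit-learn’s train_test_split; the iris dataset and logistic regression are placeholder choices for illustration, not part of the original text:

```python
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

X, y = load_iris(return_X_y=True)

# Holdout validation: train on 50% of the data, test on the remaining 50%.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.5, random_state=42)

model = LogisticRegression(max_iter=1000).fit(X_train, y_train)
print(f"Holdout accuracy: {model.score(X_test, y_test):.3f}")
```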
2. LOOCV (Leave One Out Cross Validation)
In this method, we train on the whole dataset except for a single data point, and we iterate so that every data point is left out exactly once. In LOOCV, the model is trained on n−1 samples and tested on the one omitted sample, repeating this process for each data point in the dataset. It has advantages as well as disadvantages.
An advantage of this method is that we make use of all data points, so the bias is low.
The major drawback is that it leads to higher variance in the test estimate, because each test set is a single data point; if that point is an outlier, the variation is even higher. Another drawback is execution time: the procedure iterates as many times as there are data points.
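A minimal LOOCV sketch, again assuming scikit-learn with placeholder data and model; note that it fits the model once per data point, which is what makes it slow on large datasets:

```python
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import LeaveOneOut, cross_val_score

X, y = load_iris(return_X_y=True)

# LOOCV: n iterations, each training on n-1 samples and testing on the one left out.
scores = cross_val_score(LogisticRegression(max_iter=1000), X, y, cv=LeaveOneOut())
print(f"LOOCV accuracy over {len(scores)} folds: {scores.mean():.3f}")
```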
3. Stratified Cross-Validation
It is a technique used in machine learning to ensure that each fold of the
cross-validation process maintains the same class distribution as the
entire dataset. This is particularly important when dealing with
imbalanced datasets, where certain classes may be underrepresented. In this method:
1. The dataset is divided into k folds while maintaining the proportion of classes in each fold.
2. During each iteration, one fold is used for testing, and the remaining folds are used for training.
3. The process is repeated k times, with each fold serving as the test set exactly once.
Stratified Cross-Validation is essential when dealing with classification
problems where maintaining the balance of class distribution is crucial for
the model to generalize well to unseen data.
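A minimal stratified k-fold sketch (scikit-learn’s StratifiedKFold, with placeholder data and model); each fold keeps the same class proportions as the full dataset:

```python
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import StratifiedKFold, cross_val_score

X, y = load_iris(return_X_y=True)

# Each of the k=5 folds preserves the class proportions of the full dataset.
skf = StratifiedKFold(n_splits=5, shuffle=True, random_state=42)
scores = cross_val_score(LogisticRegression(max_iter=1000), X, y, cv=skf)
print(f"Stratified 5-fold accuracies: {scores.round(3)}")
print(f"Mean accuracy: {scores.mean():.3f}")
```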
4. K-Fold Cross Validation
In k-fold cross-validation, we split the dataset into k subsets (known as folds), then train on k−1 of the subsets and leave one subset out for evaluating the trained model. We iterate k times, reserving a different subset for testing each time.
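A minimal k-fold sketch with scikit-learn’s KFold (placeholder data and model, k = 5); each iteration trains on k−1 folds and evaluates on the held-out fold, and the fold scores are averaged:

```python
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import KFold, cross_val_score

X, y = load_iris(return_X_y=True)

# k=5 folds: each iteration trains on 4 folds (k-1) and tests on the held-out fold.
kf = KFold(n_splits=5, shuffle=True, random_state=42)
scores = cross_val_score(LogisticRegression(max_iter=1000), X, y, cv=kf)
print(f"5-fold accuracies: {scores.round(3)}")
print(f"Mean accuracy: {scores.mean():.3f}")
```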
