
Model Selection &

Evaluating Classifiers
Outline
• Model in ML
• Selecting a Model
• Training a Model (for Supervised Learning) (Holdout
method, K-fold Cross-validation method, Bootstrap
sampling, Lazy vs. Eager learner)
• Underfitting and Overfitting
• Evaluating Performance of a Model
Model in ML
The basic learning process in machine learning can be divided into three key parts:

• Data Input: This is the initial step where information or data is collected and provided to the learning system. In
machine learning, it refers to the dataset that contains examples and features used for training.
• Abstraction: After receiving the data, the system or learner abstracts relevant patterns, features, or concepts
from it. In machine learning, it involves extracting meaningful patterns and relationships from the data using
algorithms and statistical methods.
• Generalization: This is the process of applying the abstracted knowledge or patterns to make predictions or
decisions beyond the specific examples or data points that were initially provided. In machine learning, it's the
model's capacity to make accurate predictions or classifications on unseen data based on what it has learned
from the training data.

⮚ Abstraction is a significant step as it represents raw input data in a summarized and structured
format, such that a meaningful insight is obtained from the data.
⮚ This structured representation of raw input data to the meaningful pattern is called a model.
⮚ The model might take different forms: a mathematical equation, a graph or tree structure, a
computational block, etc.
Selecting a model
There are three broad categories of machine learning approaches used for resolving different types
of problems. They are:
1. Supervised
▪ Classification
▪ Regression
2. Unsupervised
▪ Clustering
▪ Association analysis
3. Reinforcement

For each of these cases, the model that has to be created/trained is different. Multiple factors
play a role when we try to select a model for solving a machine learning problem. The
most important factors are:
i. The kind of problem we want to solve using machine learning
ii. The nature of the underlying data.
Machine learning algorithms are broadly of two types:
• Models for supervised learning, which primarily focus on solving predictive problems
• Models for unsupervised learning, which solve descriptive problems
Predictive models
• Predictive models try to predict certain value using the values in an input data set
• Predictive models used for prediction of target features of categorical value are known as
classification models. The target feature is known as a class, and the categories into which the
class values are divided are called levels.
• Examples:
▪ Predicting win/loss in a cricket match
▪ Predicting whether a transaction is fraud
▪ Predicting whether a customer may move to another product
• Some of the popular classification models include: k-Nearest Neighbor (kNN), Naïve Bayes and
Decision Tree.

• Predictive models may also be used to predict numerical values of the target feature based on the
predictor features. The models which are used for prediction of the numerical value of the target feature
of a data instance are known as regression models.
• Examples:
▪ Prediction of revenue growth in the succeeding year
▪ Prediction of rainfall amount in the coming monsoon
▪ Prediction of potential flu patients and demand for flu shots next winter
• Some of the popular regression models include: Linear Regression and Logistic Regression models.
Descriptive models
• Descriptive models are used to describe a data set or gain insight from a data set.
• There is no target feature or single feature of interest in case of unsupervised
learning. Based on the value of all features, interesting patterns or insights are
derived about the data set.
• Descriptive models which group together similar data instances, i.e. data
instances having a similar value of the different features are called clustering
models.
• Examples of clustering include:
▪ Customer grouping or segmentation based on social, demographic, ethnic,
etc. factors
▪ Grouping of music based on different aspects like genre, language,
time period, etc.
▪ Grouping of commodities in an inventory
• The most popular model for clustering is k-Means
Training a model (for Supervised Learning)
• Machine learning model training is a fundamental process in AI, enabling
computers to learn and make intelligent decisions.
• It involves teaching algorithms to recognize patterns, relationships, and
trends in data to make predictions or decisions.
• Training starts with a dataset containing examples and corresponding
outcomes.
• The model learns to generalize and make predictions on new, unseen data.
• Once trained, the model can be used to make predictions, classify objects, or
offer recommendations.
• Effective model training is critical for various applications in industries like
healthcare, finance, autonomous vehicles, and natural language processing.
• There are various methods for training models, such as:
▪ Holdout method
▪ K-fold Cross-validation method
▪ Bootstrap sampling
▪ Lazy vs. Eager learner
Holdout method
• Holdout Method involves splitting the dataset into two parts, typically a training set and a test set. The
training set is used to train the model, and the test set is used to evaluate its performance.
• Training and Test Data Split: Typically, 70%–80% of labeled input data is used for training, and 20%–30% is
used for testing, but other proportions are also acceptable.
• Random Data Split: To ensure both training and test data are similar, random partitioning is done
using random numbers. (In some cases, data is divided into three parts: training, test, and validation
data; validation data is used iteratively to refine the model.)
• Problem of Imbalanced Data: Imbalanced distribution of classes in training and test data can occur
despite random sampling, particularly when certain classes have much fewer examples.
• To address the problem of imbalanced data, stratified random sampling can be used, which divides
data into homogeneous groups and selects random samples from each group to ensure balanced
proportions.

Figure 1: Holdout method
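As an illustration, here is a minimal sketch of the holdout method with a stratified random split, assuming scikit-learn is available; the breast-cancer dataset and the decision-tree classifier are stand-ins chosen only for the example:

```python
# Holdout method: split the labeled data into a training and a test set.
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

X, y = load_breast_cancer(return_X_y=True)

# 70/30 split; stratify=y performs stratified random sampling so class
# proportions stay balanced in both partitions (avoids imbalanced splits).
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, stratify=y, random_state=42
)

model = DecisionTreeClassifier(random_state=42).fit(X_train, y_train)
print("Test accuracy:", model.score(X_test, y_test))
```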


K-fold Cross-validation method
• In k-fold cross-validation, the data set is divided into k completely distinct, non-
overlapping random partitions called folds.
• The value of ‘k’ in k-fold cross-validation can be set to any number.
• There are two approaches which are extremely popular:

▪ Leave-one-out cross-validation (LOOCV)


o LOOCV is an extreme case of cross-validation where one data instance is
used as test data at a time.
o It aims to maximize the amount of data used for model training.
o The number of iterations in LOOCV equals the total number of data points in
the dataset.
o LOOCV is computationally expensive due to the large number of iterations.
o As a result, it is not commonly used in practice due to its high computational
cost.
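A minimal LOOCV sketch, assuming scikit-learn; the Iris data and the kNN classifier are stand-ins, and the number of iterations printed illustrates why the cost grows with the data set size:

```python
# Leave-one-out CV: each of the n instances is the test set exactly once,
# so n models are trained in total -- thorough but computationally expensive.
from sklearn.datasets import load_iris
from sklearn.model_selection import LeaveOneOut, cross_val_score
from sklearn.neighbors import KNeighborsClassifier

X, y = load_iris(return_X_y=True)
scores = cross_val_score(KNeighborsClassifier(), X, y, cv=LeaveOneOut())
print(f"{len(scores)} iterations, mean LOOCV accuracy = {scores.mean():.3f}")
```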
K-fold Cross-validation method (continued)
10-fold cross-validation (10-fold CV)
o is a widely used approach for assessing model
performance.
o The dataset is divided into 10 equal-sized folds,
each comprising about 10% of the data.
o Records in a fold are randomly sampled to ensure
a fair representation.
o In each of the 10 iterations, one fold is designated
as the test data, while the remaining 9 folds (90%
of the data) are used for training.
o This process is repeated 10 times, with a different
fold as the test data in each iteration.
o The average performance across all iterations is
reported to evaluate the model.

Figure 2: Overall approach for k-fold cross-validation
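A sketch of 10-fold CV under the same assumptions (scikit-learn; logistic regression as a stand-in model); StratifiedKFold additionally keeps class proportions similar across folds:

```python
# 10-fold CV: every fold serves as test data exactly once; the 10 scores
# are averaged to report the model's performance.
from sklearn.datasets import load_breast_cancer
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import StratifiedKFold, cross_val_score

X, y = load_breast_cancer(return_X_y=True)
cv = StratifiedKFold(n_splits=10, shuffle=True, random_state=42)
scores = cross_val_score(LogisticRegression(max_iter=5000), X, y, cv=cv)
print(f"mean 10-fold accuracy = {scores.mean():.3f} (+/- {scores.std():.3f})")
```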
Bootstrap Sampling method
• Bootstrap sampling is a popular method for creating training and test data sets from an input data set.
• It utilizes Simple Random Sampling with Replacement (SRSWR), a well-known technique in sampling
theory for drawing random samples.
• In contrast to k-fold cross-validation, which divides data into separate partitions for testing and training,
bootstrapping randomly selects data instances from the input data set.
• Bootstrapping allows for the possibility of the same data instance being picked multiple times during the
sampling process.
• As a result, it can create one or more training data sets with 'n' data instances, and some instances may
be repeated.
• This technique is particularly useful for input data sets of small size, i.e. those having very few
data instances.

Figure 3: Bootstrap sampling
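A minimal bootstrap-sampling sketch with NumPy (a toy data set of 10 instances, chosen only for illustration); note how SRSWR can pick the same instance more than once, while the instances never drawn can serve as test data:

```python
import numpy as np

rng = np.random.default_rng(42)
n = 10
data = np.arange(n)                              # toy data set: 10 instances

boot_idx = rng.choice(n, size=n, replace=True)   # SRSWR: repeats possible
train = data[boot_idx]                           # bootstrap training sample
oob = np.setdiff1d(data, boot_idx)               # "out-of-bag" instances

print("bootstrap sample:", train)
print("out-of-bag (test) instances:", oob)
```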


Lazy vs. Eager learner method
• Eager Learning:
▪ Follows standard machine learning principles, constructing a generalized, input-independent target
function during training.
▪ Typical machine learning steps of abstraction and generalization are involved.
▪ Results in a trained model at the end of the learning phase.
▪ Eager learners are prepared with a model for classification when test data is received.
▪ Learning phase is time-consuming.
▪ Algorithms using eager learning include Decision Trees, Support Vector Machines, Neural Networks,
etc.
• Lazy Learning:
▪ Skips the abstraction and generalization processes, essentially not 'learning' in the traditional sense.
▪ Utilizes training data as-is and employs it for classifying unlabelled test data.
▪ Relies heavily on the given training data, making it known as rote learning or instance learning.
▪ Also referred to as non-parametric learning.
▪ Training phase is quick because little learning occurs.
▪ Classification can be time-consuming as each test data point is compared to training data.
▪ k-Nearest Neighbors is a popular algorithm for lazy learning.
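To make the contrast concrete, the sketch below (assuming scikit-learn; the timings are illustrative only) compares a lazy kNN learner with an eager decision tree: for kNN the fit step mostly stores the data and prediction does the real work, while the tree spends its time in training:

```python
import time
from sklearn.datasets import load_breast_cancer
from sklearn.neighbors import KNeighborsClassifier   # lazy learner
from sklearn.tree import DecisionTreeClassifier      # eager learner

X, y = load_breast_cancer(return_X_y=True)

for name, clf in [("kNN (lazy)", KNeighborsClassifier()),
                  ("decision tree (eager)", DecisionTreeClassifier())]:
    t0 = time.perf_counter()
    clf.fit(X, y)              # lazy: essentially just stores the data
    t1 = time.perf_counter()
    clf.predict(X)             # lazy: real work happens at prediction time
    t2 = time.perf_counter()
    print(f"{name}: fit {t1 - t0:.4f}s, predict {t2 - t1:.4f}s")
```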
Underfitting and Overfitting
• If the target function is kept too simple, it may not capture the essential nuances of the
underlying data. A typical case of underfitting occurs when trying to represent non-linear
data with a linear model.
• Many times underfitting happens due to unavailability of sufficient training data. Underfitting
results in poor performance on the training data as well as poor generalization to test data.
Underfitting can be avoided by:
❑ using more training data
❑ increasing model complexity, e.g. by adding relevant features
• Overfitting refers to a situation where the model has been designed in such a way that it
emulates the training data too closely. In such a case, any specific deviation in the training
data, like noise or outliers, gets embedded in the model.
• It adversely impacts the performance of the model on the test data. Overfitting, in many
cases, occurs as a result of trying to fit an excessively complex model to closely match the
training data. Overfitting can be avoided by:
❑ using re-sampling techniques like k-fold cross-validation
❑ holding back a validation data set
❑ removing the nodes which have little or no predictive power for the given machine
learning problem
Figure 4: Underfitting and overfitting of models
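The effect can be reproduced with a small sketch (assuming scikit-learn; the noisy sine data and the degrees 1/4/15 are arbitrary choices for illustration): a degree-1 polynomial underfits, degree 15 overfits, and cross-validation exposes both:

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import PolynomialFeatures

rng = np.random.default_rng(0)
X = np.sort(rng.uniform(0, 1, 40)).reshape(-1, 1)
y = np.sin(2 * np.pi * X).ravel() + rng.normal(0, 0.2, 40)   # noisy sine

for degree in (1, 4, 15):          # too simple / reasonable / too complex
    model = make_pipeline(PolynomialFeatures(degree), LinearRegression())
    score = cross_val_score(model, X, y, cv=5).mean()   # mean 5-fold R^2
    print(f"degree {degree:2d}: mean CV score = {score:.2f}")
```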
Evaluating Performance of a Model
Supervised learning - classification

• For any classification model, model accuracy is given by the total number of correct classifications
(either as the class of interest, i.e. True Positives, or as not the class of interest, i.e. True Negatives)
divided by the total number of classifications done.
• A matrix containing correct and incorrect predictions in the form of True Positives, False Positives,
False Negatives and True Negatives is known as a confusion matrix.

Figure 5: Details of model classification
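A minimal sketch of building a confusion matrix with scikit-learn, using toy binary labels (1 = the class of interest) chosen only for illustration:

```python
from sklearn.metrics import confusion_matrix

y_actual    = [1, 0, 1, 1, 0, 1, 0, 0, 1, 0]   # toy ground-truth labels
y_predicted = [1, 0, 1, 0, 0, 1, 1, 0, 1, 0]   # toy model predictions

# For binary labels {0, 1}, ravel() returns counts in the order TN, FP, FN, TP.
tn, fp, fn, tp = confusion_matrix(y_actual, y_predicted).ravel()
print(f"TP={tp}  TN={tn}  FP={fp}  FN={fn}")   # TP=4 TN=4 FP=1 FN=1
```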


Supervised learning classification model evaluation metrics

• Key performance metrics and evaluation techniques for classification models:

⮚ Accuracy
⮚ Error rate
⮚ Sensitivity
⮚ Specificity
⮚ Precision
⮚ Recall
⮚ F-measure
⮚ Receiver operating characteristic (ROC) curves
⮚ Area under curve (AUC)
Supervised learning classification model evaluation- Accuracy
Accuracy is a measure of how many predictions a classification model got correct, expressed
as the ratio of correctly predicted instances to the total instances in the data set. It is
measured as:

Accuracy = (TP + TN) / (TP + TN + FP + FN)
Supervised learning classification model evaluation- Error Rate

The error rate is the complement of accuracy. It represents the proportion of incorrect
predictions in relation to the total instances. It is measured as:

Error rate = (FP + FN) / (TP + TN + FP + FN) = 1 − Accuracy
Supervised learning classification model evaluation- Sensitivity
The sensitivity of a model measures the proportion of positive cases (TP examples)
which were correctly classified. It is measured as:

Sensitivity = TP / (TP + FN)
Supervised learning classification model evaluation- Specificity
Specificity of a model measures the proportion of negative examples which have
been correctly classified. It is measured as:

Specificity = TN / (TN + FP)

A higher value of specificity indicates better model performance. However, a
conservative approach that reduces False Negatives might actually push up the
number of False Positives.
Supervised learning classification model evaluation- Precision

Precision, also known as Positive Predictive Value, assesses the accuracy of
positive predictions. It is the ratio of true positives to the total instances predicted as
positive:

Precision = TP / (TP + FP)
Supervised learning classification model evaluation- Recall
Recall indicates the proportion of correct predictions of positives to the total number of
actual positives. In the case of win/loss prediction in cricket, recall represents what
proportion of the total wins were predicted correctly:

Recall = TP / (TP + FN)
Supervised learning classification model evaluation- F-Measure
F-measure is another measure of model performance which combines precision
and recall. It is calculated as the harmonic mean of precision and recall:

F-measure = (2 × Precision × Recall) / (Precision + Recall)
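Putting the formulas above together, here is a short sketch that computes every metric from the confusion-matrix counts of the earlier toy example (TP=4, TN=4, FP=1, FN=1):

```python
tp, tn, fp, fn = 4, 4, 1, 1   # counts from the toy confusion matrix above

accuracy    = (tp + tn) / (tp + tn + fp + fn)               # 0.80
error_rate  = 1 - accuracy                                  # 0.20
sensitivity = tp / (tp + fn)                                # recall, 0.80
specificity = tn / (tn + fp)                                # 0.80
precision   = tp / (tp + fp)                                # 0.80
f_measure   = 2 * precision * sensitivity / (precision + sensitivity)

print(accuracy, error_rate, sensitivity, specificity, precision, f_measure)
```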
Supervised learning classification model evaluation-
Receiver operating characteristic (ROC)
• Receiver Operating Characteristic (ROC) curve helps in visualizing the performance of a
classification model. It shows the efficiency of a model in the detection of true positives
while avoiding the occurrence of false positives.
• In the ROC curve, the FP rate is plotted on the horizontal axis against the true positive rate
on the vertical axis at different classification thresholds. If we assume a lower value of the
classification threshold, the model classifies more items as positive. Hence, the counts of
both False Positives and True Positives increase.
Supervised learning classification model evaluation- Area Under Curve
• The area under curve (AUC) value, as shown in
Figure 6.a, is the area of the two-dimensional
space under the curve extending from (0, 0) to
(1, 1), where each point on the curve gives a pair
of true and false positive rates at a specific
classification threshold.
• This curve gives an indication of the predictive
quality of a model. The AUC value ranges from 0 to 1,
with an AUC of 0.5 indicating that the classifier
has no predictive ability (equivalent to random guessing).
• Figure 6.b shows the curves of two classifiers –
classifier 1 and classifier 2. Quite obviously, the
AUC of classifier 1 is more than the AUC of
classifier 2. So, we can draw the inference that
classifier 1 is better than classifier 2.
Figure 6: ROC curve
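A sketch of computing the ROC points and the AUC with scikit-learn (dataset and model are stand-ins); roc_curve returns one (FP rate, TP rate) point per threshold, and roc_auc_score gives the area under that curve:

```python
from sklearn.datasets import load_breast_cancer
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score, roc_curve
from sklearn.model_selection import train_test_split

X, y = load_breast_cancer(return_X_y=True)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, stratify=y, random_state=42)

# Scores are the predicted probabilities of the positive class.
model = LogisticRegression(max_iter=5000).fit(X_tr, y_tr)
scores = model.predict_proba(X_te)[:, 1]

fpr, tpr, thresholds = roc_curve(y_te, scores)   # one point per threshold
print("AUC:", roc_auc_score(y_te, scores))
```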