Ch6: Model Selection and Evaluating Classifiers
Evaluating Classifiers
Outline
• Model in ML
• Selecting a Model
• Training a Model (for Supervised Learning) (Holdout
method, K-fold Cross-validation method, Bootstrap
sampling, Lazy vs. Eager learner)
• Underfitting and Overfitting
• Evaluating Performance of a Model
Model in ML
The basic learning process in machine learning can be divided into three key parts:
• Data Input: This is the initial step where information or data is collected and provided to the learning system. In
machine learning, it refers to the dataset that contains examples and features used for training.
• Abstraction: After receiving the data, the system or learner abstracts relevant patterns, features, or concepts
from it. In machine learning, it involves extracting meaningful patterns and relationships from the data using
algorithms and statistical methods.
• Generalization: This is the process of applying the abstracted knowledge or patterns to make predictions or
decisions beyond the specific examples or data points that were initially provided. In machine learning, it's the
model's capacity to make accurate predictions or classifications on unseen data based on what it has learned
from the training data.
⮚ Abstraction is a significant step as it represents raw input data in a summarized and structured
format, such that a meaningful insight is obtained from the data.
⮚ This structured representation of raw input data to the meaningful pattern is called a model.
⮚ The model might have different forms. It might be a mathematical equation, a graph
or tree structure, a computational block, etc.
Selecting a model
There are three broad categories of machine learning approaches used for resolving different types
of problems. They are:
1. Supervised
▪ Classification
▪ Regression
2. Unsupervised
▪ Clustering
▪ Association analysis
3. Reinforcement
For each of these cases, the model that has to be created/trained is different. Multiple
factors play a role when we try to select the model for solving a machine learning problem. The
most important factors are:
i. The kind of problem we want to solve using machine learning
ii. The nature of the underlying data.
Machine learning algorithms are broadly of two types:
• Models for supervised learning, which primarily focus on solving predictive problems
• Models for unsupervised learning, which solve descriptive problems
Predictive models
• Predictive models try to predict a target value using the values of other features in an input data set.
• Predictive models used for the prediction of a categorical target feature are known as
classification models. The target feature is known as the class, and the categories into which the
class is divided are called levels.
• Examples:
▪ Predicting win/loss in a cricket match
▪ Predicting whether a transaction is fraud
▪ Predicting whether a customer may move to another product
• Some of the popular classification models include: k-Nearest Neighbor (kNN), Naïve Bayes and
Decision Tree.
• Predictive models may also be used to predict numerical values of the target feature based on the
predictor features. The models which are used for prediction of the numerical value of the target feature
of a data instance are known as regression models.
• Examples:
▪ Prediction of revenue growth in the succeeding year
▪ Prediction of rainfall amount in the coming monsoon
▪ Prediction of potential flu patients and demand for flu shots next winter
• Some of the popular regression models include Linear Regression and Logistic Regression (note that Logistic Regression, despite its name, is most often used for classification).
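The classification idea above can be made concrete with a tiny sketch of k-Nearest Neighbor (kNN), one of the models just listed. This is a minimal plain-Python illustration; the toy data and the `knn_predict` helper are assumptions for illustration, not from the source:

```python
from collections import Counter
import math

def knn_predict(train_X, train_y, x, k=3):
    """Classify x by majority vote among its k nearest training points."""
    dists = sorted(
        (math.dist(p, x), label) for p, label in zip(train_X, train_y)
    )
    votes = Counter(label for _, label in dists[:k])
    return votes.most_common(1)[0][0]

# Toy data: two well-separated groups labelled "loss" and "win"
train_X = [(1, 1), (1, 2), (2, 1), (8, 8), (8, 9), (9, 8)]
train_y = ["loss", "loss", "loss", "win", "win", "win"]

print(knn_predict(train_X, train_y, (8.5, 8.5)))  # a point near the "win" group
```

The key design point is that kNN stores the training data as-is and defers all computation to prediction time, which is why it is later described as a lazy learner.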
Descriptive models
• Descriptive models are used to describe a data set or gain insight from a data set.
• There is no target feature or single feature of interest in case of unsupervised
learning. Based on the value of all features, interesting patterns or insights are
derived about the data set.
• Descriptive models that group together similar data instances, i.e. data
instances having similar values of the different features, are called clustering
models.
• Examples of clustering include:
▪ Customer grouping or segmentation based on social, demographic, ethnic,
etc. factors
▪ Grouping of music based on different aspects like genre, language,
time period, etc.
▪ Grouping of commodities in an inventory
• The most popular model for clustering is k-Means
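The k-Means idea can be sketched in a few lines of plain Python. This is a simplified, deterministic illustration of the underlying iterative algorithm on one-dimensional data; the `kmeans_1d` helper and its initialisation strategy are assumptions for illustration (real implementations usually initialise centroids randomly):

```python
def kmeans_1d(points, k, iters=10):
    """Cluster 1-D points: assign each point to its nearest centroid,
    then move each centroid to the mean of its assigned points."""
    # Deterministic initialisation for illustration: first k distinct values.
    centroids = sorted(set(points))[:k]
    for _ in range(iters):
        clusters = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda i: abs(p - centroids[i]))
            clusters[nearest].append(p)
        centroids = [sum(c) / len(c) if c else centroids[i]
                     for i, c in enumerate(clusters)]
    return centroids, clusters

points = [1.0, 1.2, 0.8, 9.0, 9.5, 8.7]
centroids, clusters = kmeans_1d(points, k=2)
print(sorted(round(c, 2) for c in centroids))  # two centroids, one per group
```

Note that no target feature is involved: the grouping emerges purely from the feature values, which is what makes this a descriptive (unsupervised) model.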
Training a model (for Supervised Learning)
• Machine learning model training is a fundamental process in AI, enabling
computers to learn and make intelligent decisions.
• It involves teaching algorithms to recognize patterns, relationships, and
trends in data to make predictions or decisions.
• Training starts with a dataset containing examples and corresponding
outcomes.
• The model learns to generalize and make predictions on new, unseen data.
• Once trained, the model can be used to make predictions, classify objects, or
offer recommendations.
• Effective model training is critical for various applications in industries like
healthcare, finance, autonomous vehicles, and natural language processing.
• There are various methods for training and validating models, such as:
▪ Holdout method
▪ K-fold Cross-validation method
▪ Bootstrap sampling
▪ Lazy vs. Eager learner
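The k-fold cross-validation method from the list above can be sketched as follows: the data is split into k folds, and each fold serves as test data exactly once while the remaining folds are used for training. This is a minimal plain-Python illustration; the `k_fold_indices` helper is hypothetical, not a library API:

```python
def k_fold_indices(n, k):
    """Yield (train_indices, test_indices) pairs for k-fold cross-validation."""
    # Distribute n examples over k folds as evenly as possible.
    fold_sizes = [n // k + (1 if i < n % k else 0) for i in range(k)]
    start = 0
    for size in fold_sizes:
        test = list(range(start, start + size))
        train = list(range(0, start)) + list(range(start + size, n))
        yield train, test
        start += size

# With 6 examples and 3 folds, each example appears in exactly one test fold.
for train, test in k_fold_indices(6, 3):
    print(train, test)
```

The model would be trained and evaluated once per fold, and the k scores averaged, which gives a more stable performance estimate than a single holdout split.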
Holdout method
• Holdout Method involves splitting the dataset into two parts, typically a training set and a test set. The
training set is used to train the model, and the test set is used to evaluate its performance.
• Training and Test Data Split: Typically, 70%–80% of labeled input data is used for training, and 20%–30% is
used for testing, but other proportions are also acceptable.
• Random Data Split: To ensure the training and test data are similar in distribution, the partition is
made randomly. (In some cases, data is divided into three parts: training, test, and validation data.
Validation data is used iteratively to refine the model.)
• Problem of Imbalanced Data: Imbalanced distribution of classes in training and test data can occur
despite random sampling, particularly when certain classes have much fewer examples.
• To address the problem of imbalanced data, stratified random sampling can be used, which divides
data into homogeneous groups and selects random samples from each group to ensure balanced
proportions.
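The holdout split with stratified random sampling described above can be sketched in plain Python. The `stratified_holdout` helper below is illustrative (in practice a library routine would be used); it shuffles the indices of each class separately so that the class proportions are preserved in both partitions, even for a heavily imbalanced data set:

```python
import random
from collections import defaultdict

def stratified_holdout(X, y, test_fraction=0.3, seed=42):
    """Split indices of (X, y) so each class keeps the same proportion
    in the training and test partitions."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for i, label in enumerate(y):
        by_class[label].append(i)
    train_idx, test_idx = [], []
    for label, idxs in by_class.items():
        rng.shuffle(idxs)                      # random sampling within each class
        cut = int(len(idxs) * test_fraction)
        test_idx += idxs[:cut]
        train_idx += idxs[cut:]
    return train_idx, test_idx

y = ["pos"] * 10 + ["neg"] * 90               # imbalanced: only 10% positive
train, test = stratified_holdout(list(range(100)), y)
print(sum(y[i] == "pos" for i in test), len(test))   # 3 of 30 are positive
```

A plain random 70/30 split could easily end up with too few (or zero) positive examples in the test set; stratification guarantees the 10% positive rate survives in both parts.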
• For any classification model, model accuracy is given by total number of correct classifications
(either as the class of interest, i.e. True Positive or as not the class of interest, i.e. True Negative)
divided by total number of classifications done.
• A matrix containing correct and incorrect predictions in the form of True Positives, False Positives,
False Negatives and True Negatives is known as a confusion matrix.
• Common performance measures derived from the confusion matrix include:
⮚ Accuracy
⮚ Error rate
⮚ Sensitivity
⮚ Specificity
⮚ Precision
⮚ Recall
⮚ F-measure
⮚ Receiver operating characteristic (ROC) curves
⮚ Area under curve (AUC)
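A confusion matrix for a binary classifier follows directly from the definitions above. The following plain-Python sketch (with toy actual/predicted labels assumed purely for illustration) counts TP, FP, FN and TN, and derives accuracy from them:

```python
def confusion_matrix(actual, predicted, positive="yes"):
    """Count TP, FP, FN, TN for a binary classification task."""
    tp = sum(a == positive and p == positive for a, p in zip(actual, predicted))
    fp = sum(a != positive and p == positive for a, p in zip(actual, predicted))
    fn = sum(a == positive and p != positive for a, p in zip(actual, predicted))
    tn = sum(a != positive and p != positive for a, p in zip(actual, predicted))
    return tp, fp, fn, tn

actual    = ["yes", "yes", "no", "no", "yes", "no"]
predicted = ["yes", "no",  "no", "yes", "yes", "no"]
tp, fp, fn, tn = confusion_matrix(actual, predicted)
print(tp, fp, fn, tn)                    # 2 1 1 2
print((tp + tn) / (tp + fp + fn + tn))   # accuracy = 4 correct out of 6
```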
Supervised learning classification model evaluation- Accuracy
Accuracy is a measure of how many predictions a classification model got correct, expressed
as the ratio of correctly predicted instances to the total instances in the dataset. It is
measured as:
Accuracy = (TP + TN) / (TP + TN + FP + FN)
Supervised learning classification model evaluation- Error Rate
The error rate is the complement of accuracy. It represents the proportion of incorrect
predictions in relation to the total instances. It is measured as:
Error rate = (FP + FN) / (TP + TN + FP + FN) = 1 − Accuracy
Supervised learning classification model evaluation- Sensitivity
The sensitivity of a model measures the proportion of positive cases
which were correctly classified. It is measured as:
Sensitivity = TP / (TP + FN)
Supervised learning classification model evaluation- Specificity
The specificity of a model measures the proportion of negative cases which were
correctly classified. It is measured as:
Specificity = TN / (TN + FP)
Supervised learning classification model evaluation- Precision
Precision, also known as Positive Predictive Value, assesses the accuracy of
positive predictions. It is the ratio of true positives to the total instances predicted as
positive:
Precision = TP / (TP + FP)
Supervised learning classification model evaluation- Recall
Recall indicates the proportion of actual positives that were correctly predicted. It is
measured as:
Recall = TP / (TP + FN)
In the win/loss prediction of a cricket match, recall reflects what proportion of the
total wins were predicted correctly. (Note that recall is the same measure as sensitivity.)
Supervised learning classification model evaluation- F-Measure
F-measure is another measure of model performance which combines precision
and recall. It is the harmonic mean of precision and recall, calculated as:
F-measure = (2 × Precision × Recall) / (Precision + Recall)
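All of the measures above can be computed together from the four confusion-matrix counts. The following sketch uses assumed counts (TP=60, FP=10, FN=20, TN=110) purely for illustration:

```python
def classification_metrics(tp, fp, fn, tn):
    """Derive the standard evaluation measures from confusion-matrix counts."""
    accuracy   = (tp + tn) / (tp + fp + fn + tn)
    error_rate = 1 - accuracy
    precision  = tp / (tp + fp)
    recall     = tp / (tp + fn)          # recall is the same as sensitivity
    f_measure  = 2 * precision * recall / (precision + recall)
    return accuracy, error_rate, precision, recall, f_measure

# Hypothetical win/loss predictor results: TP=60, FP=10, FN=20, TN=110
acc, err, prec, rec, f1 = classification_metrics(60, 10, 20, 110)
print(f"accuracy={acc:.2f} precision={prec:.3f} recall={rec:.2f} f1={f1:.2f}")
```

Notice how precision and recall can diverge: this predictor rarely raises false alarms (high precision) but misses a quarter of the actual wins (lower recall), and the F-measure balances the two.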
Supervised learning classification model evaluation-
Receiver operating characteristic (ROC)
• Receiver Operating Characteristic (ROC) curve helps in visualizing the performance of a
classification model. It shows the efficiency of a model in the detection of true positives
while avoiding the occurrence of false positives.
• In the ROC curve, the False Positive rate is plotted on the horizontal axis against the True Positive rate
on the vertical axis at different classification thresholds. If we lower the classification
threshold, the model classifies more items as positive; hence, the counts of
both False Positives and True Positives increase.
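The threshold-sweeping behaviour just described can be made concrete with a small sketch. For each candidate threshold, every instance scoring at or above it is classified as positive, which yields one (FP rate, TP rate) point on the ROC curve. The scores and labels below are toy values assumed for illustration:

```python
def roc_points(scores, labels):
    """Compute (FP rate, TP rate) pairs by sweeping the classification threshold."""
    pos = sum(labels)
    neg = len(labels) - pos
    points = [(0.0, 0.0)]                       # highest threshold: nothing positive
    for threshold in sorted(set(scores), reverse=True):
        tp = sum(s >= threshold and l == 1 for s, l in zip(scores, labels))
        fp = sum(s >= threshold and l == 0 for s, l in zip(scores, labels))
        points.append((fp / neg, tp / pos))
    return points

scores = [0.9, 0.8, 0.7, 0.6, 0.4, 0.3]   # model's positive-class scores
labels = [1,   1,   0,   1,   0,   0]     # ground truth (1 = positive)
for fpr, tpr in roc_points(scores, labels):
    print(fpr, tpr)
```

Reading the printed points top to bottom corresponds to lowering the threshold: both rates only ever grow, exactly as the text describes, ending at (1, 1) where everything is classified positive.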
Supervised learning classification model evaluation- Area Under Curve
• The area under curve (AUC) value, as shown in
figure 6.a, is the area of the two-dimensional
space under the curve extending from (0, 0) to (1,
1), where each point on the curve gives a pair of
true and false positive rates at a specific
classification threshold.
• This curve gives an indication of the predictive
quality of a model. The AUC value ranges from 0 to 1.
An AUC of 0.5 corresponds to random guessing (no
predictive ability), and an AUC below 0.5 indicates a
classifier performing worse than random.
• Figure 6.b shows the curves of two classifiers –
classifier 1 and classifier 2. Quite obviously, the
AUC of classifier 1 is more than the AUC of
classifier 2. So, we can draw the inference that
classifier 1 is better than classifier 2.
Figure 6: ROC curve
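The AUC comparison in figure 6.b can be mimicked numerically with the trapezoidal rule. The ROC points below are invented for illustration; the grounded facts are only that the diagonal has AUC 0.5 and that a larger AUC indicates a better classifier:

```python
def auc(points):
    """Trapezoidal area under a ROC curve given (FPR, TPR) points sorted by FPR."""
    area = 0.0
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        area += (x1 - x0) * (y0 + y1) / 2
    return area

# Hypothetical ROC points for two classifiers, in the spirit of figure 6.b
classifier_1 = [(0.0, 0.0), (0.1, 0.7), (0.3, 0.9), (1.0, 1.0)]
classifier_2 = [(0.0, 0.0), (0.3, 0.4), (0.6, 0.7), (1.0, 1.0)]
random_guess = [(0.0, 0.0), (1.0, 1.0)]     # the diagonal: AUC = 0.5

print(auc(random_guess))                     # 0.5, no predictive ability
print(auc(classifier_1) > auc(classifier_2)) # True: classifier 1 is better
```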