
TYCS RKT COLLEGE ULHASNAGAR ASST. PROF SHREYA TIWARI

UNIT 2: CHAP 4: MODEL EVALUATION AND SELECTION

1) Techniques for evaluating model performance

1. Accuracy: Accuracy is a measure of the overall correctness of the model. It is the ratio of correctly predicted instances to the total instances.
Accuracy = Number of Correct Predictions / Total Number of Predictions

Real-life Example: Email Spam Detection. Consider an email spam detection system. If the system correctly classifies 900 out of 1000 emails as either spam or not spam, the accuracy is 900/1000 = 0.90, or 90%.

2. Precision: Precision is a measure of the accuracy of the positive predictions. It is the ratio of correctly predicted positive observations to the total predicted positives.
Precision = TP / (TP + FP)

Real-life Example: Medical Test for a Disease. Imagine a medical test for a disease. Precision would be the proportion of patients correctly diagnosed with the disease among those predicted to have it. If the test correctly identifies 80 out of 100 patients with the disease, and 20 of the positive predictions were false alarms, precision is 80 / (80 + 20) = 0.80, or 80%.

3. Recall (Sensitivity or True Positive Rate): Recall is a measure of the ability of the model to capture all the relevant instances. It is the ratio of correctly predicted positive observations to all observations in the actual positive class.
Recall = TP / (TP + FN)


Real-life Example: Airport Security Screening. In airport security screening, recall would be the ability of the system to correctly identify all dangerous items (true positives) among all actual dangerous items, even if it means some non-dangerous items are incorrectly flagged. If the system identifies 90 out of 100 dangerous items but misses 10, recall is 90/100 = 0.90, or 90%.

4. F1-Score: The F1-score is the harmonic mean of precision and recall. It provides a balance between precision and recall, especially when they have an uneven distribution.
F1-Score = 2 × (Precision × Recall) / (Precision + Recall)

Real-life Example: Text Classification. Consider a text classification model that identifies spam messages. The F1-score would be useful when you want to balance the need to correctly identify spam messages (precision) with the need to capture all spam messages (recall).
In summary, accuracy, precision, recall, and F1-score are metrics used to
evaluate the performance of classification models. They provide insights into
different aspects of the model's performance and are chosen based on the
specific goals and requirements of the application.
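As an illustration, here is a minimal sketch (assuming scikit-learn and made-up spam/not-spam labels, not any dataset from these notes) that computes all four metrics:

from sklearn.metrics import accuracy_score, precision_score, recall_score, f1_score

# Hypothetical ground-truth labels and model predictions (1 = spam, 0 = not spam)
y_true = [1, 0, 1, 1, 0, 1, 0, 0, 1, 0]
y_pred = [1, 0, 1, 0, 0, 1, 1, 0, 1, 0]

print("Accuracy :", accuracy_score(y_true, y_pred))   # correct predictions / total predictions
print("Precision:", precision_score(y_true, y_pred))  # TP / (TP + FP)
print("Recall   :", recall_score(y_true, y_pred))     # TP / (TP + FN)
print("F1-score :", f1_score(y_true, y_pred))         # harmonic mean of precision and recall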
2) Confusion matrix
In machine learning, classification is the process of categorizing a given set of data into different categories. To measure the performance of a classification model, we use the confusion matrix.
What is a Confusion Matrix?
A confusion matrix is a matrix that summarizes the performance of a machine
learning model on a set of test data. It is a means of displaying the number of
accurate and inaccurate instances based on the model’s predictions. It is often
used to measure the performance of classification models, which aim to predict
a categorical label for each input instance.

The matrix displays the number of instances produced by the model on the test
data.
• True positives (TP): occur when the model predicts positive and the actual value is positive.
• True negatives (TN): occur when the model predicts negative and the actual value is negative.
• False positives (FP): occur when the model predicts positive but the actual value is negative.
• False negatives (FN): occur when the model predicts negative but the actual value is positive.
Why do we need a Confusion Matrix?
When assessing a classification model’s performance, a confusion matrix is
essential. It offers a thorough analysis of true positive, true negative, false
positive, and false negative predictions, facilitating a more profound
comprehension of a model’s recall, accuracy, precision, and overall
effectiveness in class distinction. When there is an uneven class distribution in a
dataset, this matrix is especially helpful in evaluating a model’s performance
beyond basic accuracy metrics.
Let's understand the confusion matrix with an example:
Confusion Matrix for binary classification
A 2x2 confusion matrix is shown below for image recognition of a Dog image vs. a Not Dog image.

                        Actual: Dog             Actual: Not Dog
Predicted: Dog          True Positive (TP)      False Positive (FP)
Predicted: Not Dog      False Negative (FN)     True Negative (TN)

• True Positive (TP): the number of instances where both the predicted and actual values are Dog.
• True Negative (TN): the number of instances where both the predicted and actual values are Not Dog.
• False Positive (FP): the number of instances where the prediction is Dog but the actual value is Not Dog.
• False Negative (FN): the number of instances where the prediction is Not Dog but the actual value is Dog.
Example for binary classification problems


Index   Actual     Predicted   Result
1       Dog        Dog         TP
2       Dog        Not Dog     FN
3       Dog        Dog         TP
4       Not Dog    Not Dog     TN
5       Dog        Dog         TP
6       Not Dog    Dog         FP
7       Dog        Dog         TP
8       Dog        Dog         TP
9       Not Dog    Not Dog     TN
10      Not Dog    Not Dog     TN
• Actual Dog Counts = 6
• Actual Not Dog Counts = 4
• True Positive Counts = 5
• False Positive Counts = 1
• True Negative Counts = 3
• False Negative Counts = 1
                        Actual: Dog               Actual: Not Dog
Predicted: Dog          True Positive (TP = 5)    False Positive (FP = 1)
Predicted: Not Dog      False Negative (FN = 1)   True Negative (TN = 3)
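These counts can also be reproduced programmatically. Below is a minimal sketch assuming scikit-learn, using the ten actual/predicted labels from the worked example above; the labels argument fixes the row and column order of the matrix:

from sklearn.metrics import confusion_matrix

# The ten actual and predicted labels from the worked example above
actual    = ["Dog", "Dog", "Dog", "Not Dog", "Dog", "Not Dog", "Dog", "Dog", "Not Dog", "Not Dog"]
predicted = ["Dog", "Not Dog", "Dog", "Not Dog", "Dog", "Dog", "Dog", "Dog", "Not Dog", "Not Dog"]

# Rows = actual class, columns = predicted class, ordered as [Dog, Not Dog]
cm = confusion_matrix(actual, predicted, labels=["Dog", "Not Dog"])
print(cm)
# [[5 1]   -> TP = 5, FN = 1
#  [1 3]]  -> FP = 1, TN = 3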

3) ROC/AUC curve

What is the AUC-ROC curve?


The AUC-ROC curve, or Area Under the Receiver Operating
Characteristic curve, is a graphical representation of the performance of a
binary classification model at various classification thresholds. It is
commonly used in machine learning to assess the ability of a model to
distinguish between two classes, typically the positive class (e.g.,
presence of a disease) and the negative class (e.g., absence of a disease).

Receiver Operating Characteristics (ROC) Curve


ROC stands for Receiver Operating Characteristics, and the ROC curve is the
graphical representation of the effectiveness of the binary classification model.
It plots the true positive rate (TPR) vs the false positive rate (FPR) at different
classification thresholds.
Area Under the Curve (AUC):
AUC stands for the Area Under the Curve and represents the area under the ROC curve. It measures the overall performance of the binary classification model. Since both TPR and FPR range from 0 to 1, the area always lies between 0 and 1, and a greater value of AUC denotes better model performance. Our main goal is to maximize this area in order to have the highest TPR and lowest FPR at a given threshold. The AUC measures the probability that the model will assign a randomly chosen positive instance a higher predicted probability than a randomly chosen negative instance.

It represents the probability with which our model can distinguish between the
two classes present in our target.

Key terms used in AUC and ROC Curve


1. TPR and FPR
The ROC curve is a graph that shows the performance of a classification model at all possible thresholds (a threshold is a particular value beyond which you say a point belongs to a particular class). The curve is plotted between two parameters:
• TPR – True Positive Rate: TPR = TP / (TP + FN)
• FPR – False Positive Rate: FPR = FP / (FP + TN)
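
A minimal sketch of computing and plotting the ROC curve and AUC with scikit-learn is given below; the synthetic dataset, logistic regression model, and plotting details are illustrative assumptions, not part of these notes:

from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.metrics import roc_curve, roc_auc_score
import matplotlib.pyplot as plt

# Synthetic binary-classification data for illustration
X, y = make_classification(n_samples=1000, n_features=10, random_state=42)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=42)

model = LogisticRegression(max_iter=1000).fit(X_train, y_train)
scores = model.predict_proba(X_test)[:, 1]         # predicted probability of the positive class

fpr, tpr, thresholds = roc_curve(y_test, scores)   # TPR and FPR at each threshold
auc = roc_auc_score(y_test, scores)                # area under the ROC curve

plt.plot(fpr, tpr, label=f"AUC = {auc:.3f}")
plt.plot([0, 1], [0, 1], linestyle="--")           # chance-level diagonal
plt.xlabel("False Positive Rate")
plt.ylabel("True Positive Rate")
plt.legend()
plt.show()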


4) Cross-validation

Cross-validation is a resampling technique used in machine learning to assess the performance of a predictive model. It involves partitioning the dataset into subsets, training the model on one subset (the training set), and evaluating it on the complementary subset (the validation or test set). This process is repeated multiple times, and the performance metrics are averaged over the iterations to obtain a more robust estimate of the model's performance.


K-Fold Cross-Validation:


Definition: K-Fold Cross-Validation is a popular technique where the original dataset is divided into k equal-sized subsets (folds). The model is trained k times, each time using k-1 folds as the training set and the remaining fold as the validation set. The performance metrics are then averaged over the k iterations.
Procedure:
1. Partitioning: Divide the dataset into k equal-sized subsets (folds).
2. Training and Validation: For each fold, train the model on the remaining k-1 folds and validate it on the current fold.
3. Performance Metrics: Compute the performance metrics (e.g., accuracy, precision, recall) on the validation set for each iteration.
4. Average Performance: Average the performance metrics over the k iterations to obtain a more reliable estimate of the model's performance.
Advantages:
• Provides a more accurate estimate of the model's performance compared to a single train-test split.
• Utilizes the entire dataset for both training and validation, reducing bias and variance.
• Helps detect overfitting by evaluating the model on multiple subsets of the data.
Disadvantages:
• Computationally expensive, especially for large datasets and complex models, as it involves training the model multiple times.
• Not suitable for time-series data or data with temporal dependencies, as it may break the temporal structure of the data.
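
A minimal K-Fold sketch using scikit-learn is given below; the Iris dataset and decision tree classifier are illustrative assumptions:

from sklearn.datasets import load_iris
from sklearn.model_selection import KFold, cross_val_score
from sklearn.tree import DecisionTreeClassifier

X, y = load_iris(return_X_y=True)

# 5 folds: train on 4 folds, validate on the remaining fold, 5 times in total
kf = KFold(n_splits=5, shuffle=True, random_state=42)
scores = cross_val_score(DecisionTreeClassifier(random_state=42), X, y, cv=kf, scoring="accuracy")

print("Per-fold accuracy:", scores)
print("Average accuracy :", scores.mean())   # averaged over the k iterations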
Stratified Cross-Validation:
Definition: Stratified Cross-Validation is a variation of k-fold cross-validation where the class distribution in the dataset is preserved in each fold. This technique is particularly useful for classification problems with imbalanced class distributions.
Procedure:
1. Stratification: Stratify the dataset based on the target variable (class labels) to ensure that each fold has a similar class distribution as the original dataset.
2. Partitioning: Divide the dataset into k equal-sized subsets (folds), ensuring that each fold maintains the class proportions of the original dataset.
3. Training and Validation: For each fold, train the model on the remaining k-1 folds and validate it on the current fold.
4. Performance Metrics: Compute the performance metrics on the validation set for each iteration.
5. Average Performance: Average the performance metrics over the k iterations to obtain a more reliable estimate of the model's performance.
Advantages:
• Ensures that each fold represents the overall class distribution of the dataset, making the evaluation more representative.
• Helps prevent bias in the performance estimate, especially for imbalanced datasets.
Disadvantages:
• May not always be feasible or necessary, especially for well-balanced datasets.
• Can be computationally expensive, especially for large datasets and complex models.
In summary, both k-fold cross-validation and stratified cross-validation are essential techniques for evaluating the performance of machine learning models. The choice between them depends on the specific characteristics of the dataset and the problem at hand, such as class distribution imbalance.
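
A minimal stratified cross-validation sketch using scikit-learn's StratifiedKFold is given below; the imbalanced synthetic dataset and logistic regression model are illustrative assumptions:

import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import StratifiedKFold, cross_val_score

# Illustrative imbalanced dataset: roughly 90% class 0, 10% class 1
X, y = make_classification(n_samples=1000, weights=[0.9, 0.1], random_state=42)

skf = StratifiedKFold(n_splits=5, shuffle=True, random_state=42)

# Each validation fold keeps roughly the same 90/10 class proportions as the full dataset
for train_idx, val_idx in skf.split(X, y):
    print("Fold class counts:", np.bincount(y[val_idx]))

scores = cross_val_score(LogisticRegression(max_iter=1000), X, y, cv=skf, scoring="f1")
print("Average F1 over folds:", scores.mean())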

5) Hyperparameter tuning and model selection

Hyperparameter tuning and model selection are crucial steps in the process of
model evaluation and selection, especially in machine learning tasks. Let's delve
into each of these concepts:
Hyperparameter Tuning:
Definition: Hyperparameter tuning involves finding the optimal values for the
hyperparameters of a machine learning model. Hyperparameters are
configuration settings that are set before the learning process begins and control
the learning process itself.
Procedure:
1. Define Hyperparameters: Identify the hyperparameters of the model
that need to be tuned. These could include parameters such as learning
rate, regularization strength, tree depth, etc.

2. Select Search Method: Choose a search method to explore the hyperparameter space. Common methods include grid search, random search, and Bayesian optimization.
3. Define Search Space: Define the range or set of possible values for each
hyperparameter to be explored during the search.
4. Evaluate Performance: Train the model with different combinations of
hyperparameters and evaluate its performance using cross-validation or a
validation set.
5. Select Best Model: Choose the combination of hyperparameters that
results in the best performance metric(s) on the validation set.
6. Test Final Model: Validate the selected model on a separate test set to
obtain an unbiased estimate of its performance.
Importance: Hyperparameter tuning is crucial because the choice of
hyperparameters can significantly impact the performance and generalization
ability of a machine learning model. By finding the optimal values for
hyperparameters, we can improve the model's accuracy and robustness.
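As a concrete sketch of this procedure (assuming scikit-learn's GridSearchCV, a random forest classifier, and a hypothetical parameter grid on an illustrative dataset):

from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import GridSearchCV, train_test_split

X, y = load_breast_cancer(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

# Steps 1 and 3: hyperparameters to tune and their search space (hypothetical grid)
param_grid = {"n_estimators": [50, 100, 200], "max_depth": [3, 5, None]}

# Steps 2, 4, 5: grid search with 5-fold cross-validation, keeping the best combination
search = GridSearchCV(RandomForestClassifier(random_state=42), param_grid, cv=5, scoring="accuracy")
search.fit(X_train, y_train)
print("Best hyperparameters:", search.best_params_)
print("Best CV accuracy    :", search.best_score_)

# Step 6: unbiased estimate on the held-out test set
print("Test accuracy       :", search.score(X_test, y_test))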
Model Selection:
Definition: Model selection involves choosing the best model architecture or
algorithm for a given machine learning task among a set of candidate models.
Procedure:
1. Define Candidate Models: Select a set of candidate models or
algorithms that are suitable for the problem at hand. This could include
various machine learning algorithms (e.g., linear regression, decision
trees, support vector machines) or different architectures of deep learning
models.
2. Train and Evaluate Models: Train each candidate model on the training
data and evaluate its performance using cross-validation or a validation
set.
3. Compare Performance: Compare the performance metrics (e.g.,
accuracy, precision, recall, F1-score) of the candidate models to
determine which one performs best on the validation set.
4. Select Best Model: Choose the model with the highest performance
metric(s) as the final model.


5. Test Final Model: Validate the selected model on a separate test set to
obtain an unbiased estimate of its performance.
Importance: Model selection is critical because different models have different
strengths and weaknesses, and the choice of model can significantly impact the
overall performance of the machine learning system. By comparing and
selecting the best-performing model, we can build a more accurate and effective
predictive model for the task at hand.
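A minimal model-selection sketch comparing a few candidate algorithms with cross-validation is given below; the candidates and dataset are illustrative assumptions:

from sklearn.datasets import load_breast_cancer
from sklearn.linear_model import LogisticRegression
from sklearn.tree import DecisionTreeClassifier
from sklearn.svm import SVC
from sklearn.model_selection import cross_val_score

X, y = load_breast_cancer(return_X_y=True)

# Step 1: candidate models
candidates = {
    "Logistic Regression": LogisticRegression(max_iter=5000),
    "Decision Tree": DecisionTreeClassifier(random_state=42),
    "SVM": SVC(),
}

# Steps 2-3: evaluate each candidate with 5-fold cross-validation and compare
results = {name: cross_val_score(model, X, y, cv=5, scoring="accuracy").mean()
           for name, model in candidates.items()}
for name, score in results.items():
    print(f"{name}: {score:.3f}")

# Step 4: select the best-performing candidate
best = max(results, key=results.get)
print("Selected model:", best)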
Here are simplified steps for hyperparameter tuning and model selection:
1. Problem Definition:
• Clearly define the problem you want to solve, such as customer churn
prediction, spam detection, or disease diagnosis.
2. Data Collection:
• Gather relevant data that includes features (input variables) and labels
(output variable) for your problem. Ensure the data is clean and properly
formatted.
3. Split Data:
• Split the dataset into training and test sets. The training set will be used to
train models, and the test set will be used for evaluation.
4. Model Selection:
• Choose candidate machine learning algorithms suitable for your problem.
Consider algorithms like Logistic Regression, Decision Trees, Random
Forest, Support Vector Machines, etc.
5. Hyperparameter Tuning:
• For each selected algorithm, identify hyperparameters to tune. These are
parameters that control the learning process, such as regularization
strength, tree depth, or learning rate.
• Use techniques like Grid Search or Random Search to explore different
combinations of hyperparameters.
• Train models using different hyperparameter configurations on the
training set and evaluate their performance using cross-validation.
6. Evaluation:


• Evaluate the performance of each model using appropriate evaluation metrics (e.g., accuracy, precision, recall, F1-score) on the validation set.
• Select the model with the best performance metrics as the final model.
7. Test:
• Validate the selected model on the test set to obtain an unbiased estimate
of its performance.
• Ensure that the model generalizes well to unseen data and performs
consistently.
8. Deployment and Monitoring:
• Deploy the selected model into production to make predictions on new
data.
• Monitor the model's performance over time and periodically re-evaluate
and re-tune hyperparameters if necessary.
By following these steps, you can effectively tune hyperparameters and select
the best-performing model for your machine learning task, ensuring accurate
predictions and successful deployment in real-world applications.
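Tying these steps together, here is a minimal end-to-end sketch; the dataset, candidate models, and hyperparameter grids are all illustrative assumptions:

from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import GridSearchCV, train_test_split

# Steps 2-3: collect data and split into training and test sets
X, y = load_breast_cancer(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

# Steps 4-5: candidate models, each with a small hypothetical hyperparameter grid
candidates = {
    "logreg": (LogisticRegression(max_iter=5000), {"C": [0.1, 1.0, 10.0]}),
    "forest": (RandomForestClassifier(random_state=42), {"n_estimators": [100, 200]}),
}

# Step 6: tune each candidate with 5-fold cross-validation and keep the best validation score
best_name, best_search = None, None
for name, (model, grid) in candidates.items():
    search = GridSearchCV(model, grid, cv=5, scoring="accuracy").fit(X_train, y_train)
    print(name, "best CV accuracy:", search.best_score_)
    if best_search is None or search.best_score_ > best_search.best_score_:
        best_name, best_search = name, search

# Step 7: unbiased estimate of the selected model on the held-out test set
print("Selected:", best_name, "| test accuracy:", best_search.score(X_test, y_test))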
