
EVALUATION METRICS

Machine Learning

Sai

Machine learning models are evaluated using metrics that measure their performance on the given task. The choice of evaluation metric depends on the type of problem (regression, classification, clustering, etc.), the dataset, and the business objectives.

Here's an in-depth look at evaluation metrics for the different types of machine learning problems.

1. Classification Metrics
Used to evaluate models where the goal is to predict categories or labels (e.g., spam
detection, image classification).

1.1 Accuracy

• Formula: $\text{Accuracy} = \dfrac{\text{Number of Correct Predictions}}{\text{Total Number of Predictions}}$
• When to Use: Datasets with a roughly balanced class distribution.
• Limitation: Misleading for imbalanced datasets (e.g., 95% accuracy when
95% of data belongs to one class).
• Example Use Case: Email spam classification with balanced classes.
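
As a quick illustration (a minimal sketch, not part of the original notes; the toy labels below are made up), accuracy can be computed with scikit-learn's accuracy_score:

```python
from sklearn.metrics import accuracy_score

# Toy ground-truth and predicted labels (hypothetical values for illustration).
y_true = [1, 0, 1, 1, 0, 1, 0, 0]
y_pred = [1, 0, 1, 0, 0, 1, 0, 1]

# 6 of 8 predictions match, so accuracy = 6 / 8 = 0.75.
print(accuracy_score(y_true, y_pred))  # 0.75
```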

1.2 Precision
• Formula: $\text{Precision} = \dfrac{\text{True Positives}}{\text{True Positives} + \text{False Positives}}$

• When to Use: When false positives are costly (e.g., predicting cancer when it
doesn't exist).
• Interpretation: High precision means fewer false alarms.
• Improvement: Reduce model's tendency to misclassify negatives as positives
(e.g., threshold adjustment).
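
A minimal sketch of precision and the threshold adjustment mentioned above, assuming synthetic data and an illustrative threshold of 0.7 (neither comes from these notes):

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import precision_score

# Synthetic binary classification data (illustrative only).
X, y = make_classification(n_samples=500, random_state=0)
model = LogisticRegression(max_iter=1000).fit(X, y)

probs = model.predict_proba(X)[:, 1]  # predicted probability of the positive class

# Default threshold of 0.5 vs. a stricter 0.7, evaluated on the training
# data just for illustration: raising the threshold typically trades
# recall for higher precision (fewer false alarms).
for threshold in (0.5, 0.7):
    y_pred = (probs >= threshold).astype(int)
    print(threshold, precision_score(y, y_pred))
```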

1.3 Recall (Sensitivity)

• Formula: $\text{Recall} = \dfrac{\text{True Positives}}{\text{True Positives} + \text{False Negatives}}$
• When to Use: When false negatives are costly (e.g., missing a cancer
diagnosis).
• Interpretation: High recall means the model captures most actual positives.
• Improvement: Train the model to reduce false negatives (e.g., adding more
positive samples).
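
Likewise, a small sketch for recall with made-up labels (4 actual positives, of which the model finds 3):

```python
from sklearn.metrics import recall_score

# Hypothetical labels: 4 actual positives, 3 detected, 1 missed.
y_true = [1, 1, 1, 1, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 0, 1, 0, 0]

# Recall = TP / (TP + FN) = 3 / (3 + 1) = 0.75.
print(recall_score(y_true, y_pred))  # 0.75
```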

1.4 F1-Score
• Formula: $F_1 = 2 \cdot \dfrac{\text{Precision} \cdot \text{Recall}}{\text{Precision} + \text{Recall}}$
• When to Use: When precision and recall are equally important.
• Interpretation: Balance between false positives and false negatives.
• Example Use Case: Fraud detection where both false alarms and missed
frauds are critical.
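
A short sketch tying precision, recall, and F1 together, reusing the same toy labels as in the recall sketch above (purely illustrative):

```python
from sklearn.metrics import f1_score, precision_score, recall_score

y_true = [1, 1, 1, 1, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 0, 1, 0, 0]

p = precision_score(y_true, y_pred)  # 3 / (3 + 1) = 0.75
r = recall_score(y_true, y_pred)     # 3 / (3 + 1) = 0.75
# F1 is the harmonic mean of precision and recall.
print(f1_score(y_true, y_pred), 2 * p * r / (p + r))  # both 0.75
```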

1.5 ROC-AUC (Receiver Operating Characteristic - Area Under Curve)

• ROC Curve: Plots True Positive Rate (TPR) vs. False Positive Rate (FPR).
• AUC: Area under the ROC curve; a value close to 1 indicates a good
classifier.
• When to Use: To compare models, especially with imbalanced datasets.
• Interpretation: High AUC means the model separates classes well.
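
A minimal sketch, assuming hypothetical labels and predicted probabilities, of how the ROC curve and AUC could be computed with scikit-learn:

```python
from sklearn.metrics import roc_auc_score, roc_curve

# Hypothetical true labels and predicted positive-class probabilities.
y_true = [0, 0, 1, 1, 0, 1]
scores = [0.1, 0.4, 0.35, 0.8, 0.2, 0.7]

fpr, tpr, thresholds = roc_curve(y_true, scores)  # points on the ROC curve
print(roc_auc_score(y_true, scores))  # closer to 1 => better separation
```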

1.6 Log Loss

• Formula: $\text{Log Loss} = -\frac{1}{N}\sum_{i=1}^{N}\left[\,y_i \log(p_i) + (1 - y_i)\log(1 - p_i)\,\right]$
• When to Use: For classifiers that output probabilities.
• Interpretation: Lower log loss indicates better probability estimates.
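
A small sketch comparing log loss for confident versus hedging probability estimates (all values are made up for illustration):

```python
from sklearn.metrics import log_loss

# Hypothetical true labels and predicted probabilities for the positive class.
y_true = [1, 0, 1, 0]
p_confident = [0.9, 0.1, 0.8, 0.2]   # well-calibrated and confident
p_uncertain = [0.6, 0.4, 0.5, 0.5]   # hedging near 0.5

# Lower log loss rewards confident, correct probability estimates.
print(log_loss(y_true, p_confident))  # ~0.16 (smaller)
print(log_loss(y_true, p_uncertain))  # ~0.60 (larger)
```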

2. Regression Metrics
Used for models that predict continuous outputs (e.g., house prices, stock prices).

2.1 Mean Absolute Error (MAE)

• Formula: $\text{MAE} = \frac{1}{N}\sum_{i=1}^{N} |y_i - \hat{y}_i|$
• When to Use: To understand the average magnitude of errors.
• Interpretation: Lower MAE means smaller average deviation from the true values.
• Improvement: Use models that capture trends more accurately.
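
A minimal sketch with hypothetical prices (in thousands of dollars) showing MAE as the average absolute deviation:

```python
from sklearn.metrics import mean_absolute_error

# Hypothetical true and predicted house prices (in $1000s).
y_true = [300, 250, 400, 150]
y_pred = [310, 240, 380, 160]

# Average absolute deviation: (10 + 10 + 20 + 10) / 4 = 12.5.
print(mean_absolute_error(y_true, y_pred))  # 12.5
```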

2.2 Mean Squared Error (MSE)


• Formula: $\text{MSE} = \frac{1}{N}\sum_{i=1}^{N} (y_i - \hat{y}_i)^2$
• When to Use: When larger errors should be penalized more heavily.
• Interpretation: Squaring the errors makes MSE sensitive to outliers.
• Improvement: Remove or down-weight outliers.

2.3 Root Mean Squared Error (RMSE)

• Formula: $\text{RMSE} = \sqrt{\text{MSE}}$
• When to Use: Similar to MSE but interpretable in the same units as the target
variable.
• Example Use Case: Predicting housing prices where large deviations are
undesirable.
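
A small sketch covering both MSE (2.2) and RMSE, reusing the hypothetical prices from the MAE example; note how RMSE comes back in the target's units:

```python
import numpy as np
from sklearn.metrics import mean_squared_error

y_true = [300, 250, 400, 150]
y_pred = [310, 240, 380, 160]

mse = mean_squared_error(y_true, y_pred)  # (100 + 100 + 400 + 100) / 4 = 175.0
rmse = np.sqrt(mse)                       # ~13.2, in $1000s like the target
print(mse, rmse)
```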

2.4 R-Squared (Coefficient of Determination)

• Formula: $R^2 = 1 - \dfrac{\text{Sum of Squared Errors (Residuals)}}{\text{Total Sum of Squares}}$
• When to Use: To explain the variance captured by the model.
• Interpretation: Close to 1 indicates a good fit.
• Limitation: Can be misleading with non-linear data.
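
A minimal sketch of R² on the same hypothetical prices:

```python
from sklearn.metrics import r2_score

y_true = [300, 250, 400, 150]
y_pred = [310, 240, 380, 160]

# Fraction of the variance in y_true explained by the predictions.
print(r2_score(y_true, y_pred))  # ~0.98, close to 1 => good fit
```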

3. Clustering Metrics
Used for unsupervised learning tasks (e.g., customer segmentation).

3.1 Silhouette Score

• Range: [-1, 1]
• When to Use: To measure how similar a point is to its own cluster compared with other clusters.
• Interpretation: Higher scores indicate well-separated clusters.
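
A minimal sketch, assuming synthetic blob data and k-means with k = 3 (both illustrative choices), of how the silhouette score could be computed:

```python
from sklearn.cluster import KMeans
from sklearn.datasets import make_blobs
from sklearn.metrics import silhouette_score

# Synthetic clustered data (illustrative only).
X, _ = make_blobs(n_samples=300, centers=3, random_state=0)
labels = KMeans(n_clusters=3, n_init=10, random_state=0).fit_predict(X)

# Scores near 1 mean points sit well inside their own cluster.
print(silhouette_score(X, labels))
```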

3.2 Davies-Bouldin Index

• When to Use: Evaluate the compactness and separation of clusters.
• Interpretation: Lower values indicate better clustering.
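
The same setup works for the Davies-Bouldin index, which scikit-learn also provides:

```python
from sklearn.cluster import KMeans
from sklearn.datasets import make_blobs
from sklearn.metrics import davies_bouldin_score

X, _ = make_blobs(n_samples=300, centers=3, random_state=0)
labels = KMeans(n_clusters=3, n_init=10, random_state=0).fit_predict(X)

# Lower values indicate compact, well-separated clusters.
print(davies_bouldin_score(X, labels))
```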

3.3 Dunn Index

• When to Use: To measure the ratio between the smallest inter-cluster distance and the largest intra-cluster distance.
• Interpretation: Higher values indicate better clustering.
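
scikit-learn has no built-in Dunn index, so here is a small NumPy sketch implementing the definition above (the toy points are made up):

```python
import numpy as np

def dunn_index(X, labels):
    """Smallest inter-cluster distance divided by largest intra-cluster diameter."""
    clusters = [X[labels == k] for k in np.unique(labels)]

    # Largest intra-cluster distance (cluster diameter) across all clusters.
    max_diameter = max(
        np.linalg.norm(c[:, None] - c[None, :], axis=-1).max() for c in clusters
    )
    # Smallest distance between points belonging to different clusters.
    min_separation = min(
        np.linalg.norm(a[:, None] - b[None, :], axis=-1).min()
        for i, a in enumerate(clusters)
        for b in clusters[i + 1:]
    )
    return min_separation / max_diameter

# Two tight, well-separated toy clusters => a high Dunn index.
X = np.array([[0, 0], [0, 1], [5, 5], [5, 6]])
labels = np.array([0, 0, 1, 1])
print(dunn_index(X, labels))  # separation ~6.4 / diameter 1.0 => ~6.4
```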

4. Use Cases and Interpretations


| Problem Type | Metric | Use Case | What It Tells Us | How to Improve |
| --- | --- | --- | --- | --- |
| Binary Classification | Accuracy, F1, AUC-ROC | Spam detection | How well the model distinguishes between classes | Balance the dataset, adjust thresholds, improve features |
| Multi-class Classification | Precision, Recall, F1 | Image classification | Balance of precision and recall across classes | Use weighted metrics, improve class-specific features |
| Regression | MAE, RMSE, R-Squared | House price prediction | Accuracy of predicted values | Remove outliers, fine-tune model, normalize data |
| Clustering | Silhouette Score | Customer segmentation | Quality of clusters | Tune cluster count (k), improve distance measures |

5. Best Practices for Model Evaluation


1. Choose Metrics Based on the Problem:
o Classification: Accuracy, Precision, Recall, F1.
o Regression: MAE, MSE, RMSE.
o Clustering: Silhouette Score, Davies-Bouldin Index.
2. Cross-Validation:
o Use k-fold cross-validation to assess model performance on different subsets of data (a minimal sketch follows this list).
3. Visualize Metrics:
o Use confusion matrices, ROC curves, and scatter plots to understand
model performance.

4. Consider Business Objectives:
o Choose metrics aligned with the problem's real-world impact (e.g., recall for medical diagnosis).
5. Analyze Failure Cases:
o Study misclassifications or large errors to identify areas for
improvement.
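
A minimal sketch of point 2, assuming synthetic data, logistic regression, and F1 as the scoring metric (all illustrative choices):

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score

# Synthetic data (illustrative); 5-fold cross-validation scored with F1.
X, y = make_classification(n_samples=500, random_state=0)
scores = cross_val_score(LogisticRegression(max_iter=1000), X, y, cv=5, scoring="f1")
print(scores.mean(), scores.std())  # average performance and its spread
```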
