
COLLEGE OF NATURAL AND COMPUTATIONAL SCIENCE

DEPARTMENT OF COMPUTER SCIENCE

COURSE TITLE: INTRODUCTION TO MACHINE LEARNING

COURSE CODE: COSC3102

GROUP 7

NAME ID
Aboma Abaya 1300049

Binyam Tagel 1300191

Geleta Assefa 1300404

Indalo Kushe 1300516

Yoseph Demeke 1302105

Debark, Ethiopia

JUNE 2025 G.C

Contents
Introduction
1. Choosing Machine Learning Metrics
1.1. Machine Learning Tasks: Classification vs. Regression
1.2. Evaluation Metrics for Classification Models
2. Why certain metrics are more appropriate for certain tasks
2.1. Imbalanced Classification and the F1-Score
2.2. Regression and R-squared (R²)
3. Deciding which metric to use for a particular problem
3.1. Understand the Problem
3.2. Analyze the Data
3.3. Consider the Metrics
3.4. Make the Decision
Conclusion

1. Different machine learning tasks (e.g., classification vs. regression) require different
evaluation metrics.
 Discuss why certain metrics are more appropriate for certain tasks, such as using
F1-score for imbalanced classification or R² for regression problems.
 How would you decide which metric to use for a particular problem?

Introduction
Machine learning is a rapidly evolving field that relies on data-driven models to solve complex
problems. The success of these models depends on their ability to generalize well to unseen data,
which is assessed using appropriate evaluation metrics. However, selecting the right metric is
crucial, as different machine learning tasks, such as classification and regression, require
different evaluation criteria. This document explores the importance of choosing appropriate
machine learning metrics, with a focus on classification and regression tasks. It highlights why
specific metrics are suitable for different scenarios, such as using the F1-score for imbalanced
classification and R-squared (R²) for regression problems. Furthermore, a systematic approach is
provided for selecting the most suitable metric based on problem characteristics and business
objectives.

1. Choosing Machine Learning Metrics
1.1. Machine Learning Tasks: Classification vs. Regression

Machine learning tasks are broadly categorized into two primary types:

Classification: This involves predicting a categorical or discrete label. The goal is to assign data points to
predefined categories. Examples include spam detection (spam/not spam), image classification
(cat/dog/bird), and medical diagnosis (disease/no disease). The output variable is a categorical variable.

Regression: This focuses on predicting a continuous numerical value. The objective is to establish the
relationship between input features and a target variable that takes on a range of values. Examples include
predicting house prices, forecasting stock prices, and estimating temperature. The output variable is a
numerical variable.

1.2. Evaluation Metrics for Classification Models

Several metrics are used to evaluate the effectiveness of classification models:

 Accuracy: The proportion of correctly classified instances out of the total number of
instances. While straightforward, accuracy can be misleading in imbalanced datasets,
where one class significantly outnumbers the others.

 Precision: Measures the proportion of correctly predicted positive instances out of all
instances predicted as positive. High precision is desirable when the cost of false
positives is high.

 Recall (Sensitivity): Measures the proportion of correctly predicted positive instances
out of all actual positive instances. High recall is crucial when the cost of false negatives
is high.

 F1-Score: The harmonic mean of precision and recall. It provides a balanced measure
that considers both false positives and false negatives. The F1-score is particularly useful
for imbalanced datasets, as it gives a more realistic picture of the model's performance on
the minority class.

 AUC-ROC (Area Under the Receiver Operating Characteristic Curve): Measures
the model's ability to distinguish between classes across different classification
thresholds. A higher AUC-ROC indicates better performance.

 Confusion Matrix: A table that summarizes the performance of a classification model by
showing the counts of true positives, true negatives, false positives, and false negatives.

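To make these metrics concrete, the short sketch below computes each of them with scikit-learn on a
small set of made-up labels and predictions (the values are purely illustrative and do not come from
any real model):

# Minimal sketch: computing common classification metrics with scikit-learn.
# The labels, predictions, and scores below are invented for illustration only.
from sklearn.metrics import (accuracy_score, precision_score, recall_score,
                             f1_score, roc_auc_score, confusion_matrix)

y_true  = [0, 0, 0, 0, 0, 0, 0, 1, 1, 1]                      # actual labels (1 = positive class)
y_pred  = [0, 0, 0, 0, 0, 1, 0, 1, 1, 0]                      # hard predictions from some model
y_score = [0.1, 0.2, 0.2, 0.3, 0.1, 0.6, 0.4, 0.9, 0.8, 0.4]  # predicted probabilities

print("Accuracy :", accuracy_score(y_true, y_pred))
print("Precision:", precision_score(y_true, y_pred))
print("Recall   :", recall_score(y_true, y_pred))
print("F1-score :", f1_score(y_true, y_pred))
print("AUC-ROC  :", roc_auc_score(y_true, y_score))
print("Confusion matrix:")
print(confusion_matrix(y_true, y_pred))

In practice, the same functions are applied to a trained model's predictions on a held-out test set
rather than to hand-written lists.
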
2. Why certain metrics are more appropriate for certain tasks
In machine learning, selecting the appropriate evaluation metric is crucial, as it directly influences how
model performance is assessed and interpreted. Different tasks and data characteristics necessitate
different metrics to provide meaningful insights.

2.1. Imbalanced Classification and the F1-Score

The Problem with Imbalanced Data: In many real-world classification problems, classes are not evenly
distributed. One class, known as the majority class, has significantly more examples than the minority
class. Examples include:

 Fraud Detection: Fraudulent transactions are rare compared to legitimate ones.
 Disease Diagnosis: The number of individuals with a specific disease is often much smaller than
the number of healthy individuals.
 Spam Detection: While the volume of spam can be high, legitimate emails typically outnumber
spam emails in most inboxes.

The Role of Precision and Recall: To address this issue, precision and recall are utilized:

 Precision: Out of all instances predicted as positive, what proportion was actually positive? It
measures how many of the positive predictions were correct. High precision means fewer false
positives (predicting positive when it's actually negative).
 Recall: Out of all actual positive instances, what proportion was correctly predicted as positive?
It measures how well the model identifies all actual positives. High recall means fewer false
negatives (predicting negative when it's actually positive).

The F1-Score: A Balanced Measure: Precision and recall often have an inverse relationship; improving
one can come at the expense of the other. The F1-score is the harmonic mean of precision and recall:

F1-score = 2 × (precision × recall) / (precision + recall)

The F1-score balances precision and recall, providing a single metric that considers both false positives
and false negatives. It's particularly useful when dealing with imbalanced datasets because it offers a
more realistic picture of the model's performance on the minority class.
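
As a hypothetical illustration (the numbers are invented), imagine 1,000 transactions of which only
50 are fraudulent. Suppose a model catches 30 of the 50 frauds (TP = 30, FN = 20) and wrongly flags
10 legitimate transactions (FP = 10, TN = 940). Its accuracy is (30 + 940) / 1,000 = 0.97, which
looks excellent, yet precision = 30 / 40 = 0.75, recall = 30 / 50 = 0.60, and
F1-score = 2 × (0.75 × 0.60) / (0.75 + 0.60) ≈ 0.67. A model that simply labels every transaction as
legitimate would still reach 0.95 accuracy while detecting no fraud at all (recall = 0, F1-score = 0),
which is why the F1-score is the more honest summary on imbalanced data.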

Choosing Between Precision, Recall, and F1-Score: The optimal metric depends on the specific problem
and the relative costs of false positives and false negatives:

 Prioritize Precision: If false positives are very costly (e.g., incorrectly flagging a legitimate
transaction as fraudulent), precision should be prioritized.
 Prioritize Recall: If false negatives are very costly (e.g., failing to diagnose a disease), recall
should be prioritized.
 Balance Both: If both false positives and false negatives are important, the F1-score provides a
good compromise.

2.2. Regression and R-squared (R²)

In regression tasks, our goal is to predict a continuous numerical value. We strive to build a model that
accurately captures the relationship between input features and the target variable. A key challenge is
quantifying how well our model fits the data. This is where R-squared (R²), also known as the coefficient
of determination, plays a crucial role.

R² measures the proportion of the variance in the dependent variable (our target) that is predictable from
the independent variables (our inputs). Simply put, it tells us how much of the variation in the target is
explained by our model.
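
In its most common form, R² compares the model's squared prediction errors with the total variation
of the target around its mean:

R² = 1 − (SS_res / SS_tot), where SS_res = Σ (yᵢ − ŷᵢ)² and SS_tot = Σ (yᵢ − ȳ)²

Here yᵢ are the actual values, ŷᵢ the model's predictions, and ȳ the mean of the actual values.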

R² values typically range from 0 to 1 (on unseen data, R² can even be negative if the model predicts
worse than simply using the mean of the target):

 R² = 1: A perfect fit! The model explains all the variance in the target. This is extremely rare in
real-world scenarios.
 R² = 0: The model explains none of the variance. It's essentially useless for prediction.
 0 < R² < 1: The typical situation. The model explains some of the variance. A higher R² generally
suggests a better fit, but it's not the only factor to consider.

R² is useful because it provides a relative measure of goodness of fit. This makes it easier to compare
models, even if they are predicting different things with different scales. For instance, we can compare the
R² of a house price prediction model to the R² of a stock price prediction model, even though house prices
and stock prices are on very different scales.

However, R² has limitations:

 Prediction Quality: A high R² doesn't guarantee good predictions. A model can explain a lot of
variance but still make inaccurate predictions.
 Overfitting: R² can be misleading when a model is overfit. An overfit model might have a high
R² on training data but perform poorly on new, unseen data (see the sketch after this list).
 Causality: R² measures correlation, not causation. Just because two variables are correlated
doesn't mean one causes the other. R² only quantifies the strength of the linear relationship. It
does not prove that changes in the independent variables cause changes in the dependent variable.
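
To make the overfitting caution concrete, the hypothetical sketch below fits a deliberately
over-flexible polynomial to a handful of randomly generated points; the training R² comes out near 1
while the R² on fresh data from the same process is far lower (all data and model choices here are
invented for illustration):

# Hypothetical sketch: a high training R² can hide overfitting.
# The data is randomly generated purely for illustration.
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.metrics import r2_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import PolynomialFeatures

rng = np.random.default_rng(0)
X_train = rng.uniform(0, 1, size=(15, 1))
y_train = 3 * X_train.ravel() + rng.normal(scale=0.3, size=15)
X_test  = rng.uniform(0, 1, size=(50, 1))
y_test  = 3 * X_test.ravel() + rng.normal(scale=0.3, size=50)

# A 12th-degree polynomial chases the noise in the 15 training points.
model = make_pipeline(PolynomialFeatures(degree=12), LinearRegression())
model.fit(X_train, y_train)

print("Train R²:", r2_score(y_train, model.predict(X_train)))  # close to 1
print("Test  R²:", r2_score(y_test, model.predict(X_test)))    # much lower, often negative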

3. Deciding which metric to use for a particular problem


Choosing the right evaluation metric is a crucial step in the machine learning process. The
following steps can help guide the decision:
3.1. Understand the Problem

 Task Type: Determine whether the problem is classification, regression, or another type
of task. Different tasks require different metrics.

 Business Objective: Consider the costs associated with different types of errors. For
example, in medical diagnosis, false negatives (failing to detect a disease) might be more
costly than false positives.

 Target Audience: Choose metrics that are easy to explain to non-technical stakeholders
if necessary.
3.2. Analyze the Data

 Data Characteristics: Examine whether the data is balanced or imbalanced (for
classification) and whether there are outliers (for regression). These characteristics can
influence the choice of metrics.

 Data Scale: Consider the scale of the data, as this can impact the interpretation of certain
metrics like RMSE.
3.3. Consider the Metrics

 Classification Metrics: Use accuracy for balanced datasets, precision when false
positives are costly, recall when false negatives are costly, and the F1-score for
imbalanced datasets.

 Regression Metrics: Use MSE or RMSE for measuring prediction errors, MAE for less
sensitivity to outliers, and R² for assessing the goodness of fit.
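
As a quick illustration of how these regression metrics relate to one another, the hypothetical
sketch below computes MSE, RMSE, MAE, and R² for the same set of invented predictions (the values
are chosen only for demonstration):

# Minimal sketch: comparing common regression metrics with NumPy and scikit-learn.
# The actual/predicted values are invented purely for illustration.
import numpy as np
from sklearn.metrics import mean_absolute_error, mean_squared_error, r2_score

y_true = np.array([250.0, 310.0, 190.0, 420.0, 280.0])  # e.g. actual house prices (in $1000s)
y_pred = np.array([265.0, 295.0, 205.0, 400.0, 290.0])  # model predictions

mse  = mean_squared_error(y_true, y_pred)
rmse = np.sqrt(mse)                        # same units as the target, easier to interpret
mae  = mean_absolute_error(y_true, y_pred)
r2   = r2_score(y_true, y_pred)            # proportion of variance explained

print(f"MSE: {mse:.1f}  RMSE: {rmse:.1f}  MAE: {mae:.1f}  R²: {r2:.3f}")

Because MSE and RMSE square the errors, a single large outlier affects them much more than it
affects MAE, which is why MAE is often preferred when outliers are present.
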
3.4. Make the Decision

 Align with Business Goals: Choose the metric that best reflects the business objectives.
For example, prioritize recall if minimizing false negatives is critical.

 Use Multiple Metrics: Often, it is beneficial to use multiple metrics to get a more
complete picture of model performance.

 Iterate and Refine: Experiment with different metrics and refine your choice as you gain
more understanding of the problem.

Conclusion
The selection and interpretation of appropriate evaluation metrics are fundamental to the success
of any machine learning project. By carefully considering the task at hand, the nature of the data,
and the specific business objectives, data scientists can choose the most relevant metrics to guide
model development and ensure that the chosen model effectively addresses the problem being
tackled. A thorough understanding of these metrics is essential for building robust and reliable
machine learning systems.

Additionally, selecting the right metrics enhances model transparency, enabling stakeholders to
trust the results and make informed decisions. The use of multiple complementary metrics can
provide a more comprehensive understanding of a model's performance, minimizing the risk of
misleading evaluations. Ultimately, well-informed metric selection leads to better model
optimization, improved decision-making, and greater impact in real-world applications.
