
Intel AI For Manufacturing

Assignment 5
Modelling and Evaluation

Dhoriwala Mahammed Tabrez
220140111002

Evaluation Metrics for Regression Algorithms
When evaluating regression models, various metrics are used to
measure how well the model predicts continuous values. Below are the
main types of evaluation metrics for regression, along with their usage
scenarios.

1. Mean Absolute Error (MAE)


• Formula: MAE = \frac{1}{n} \sum_{i=1}^{n} |y_i - \hat{y}_i|

• Description:
Measures the average absolute difference between actual and
predicted values. It is simple and easy to interpret.

• When to Use:

o When all errors are equally important.

o When outliers are not a major concern.

• Example:
Used in house price prediction, where small deviations matter
equally.
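As a quick illustration, MAE can be computed in a couple of lines with NumPy and scikit-learn. This is only a sketch: the y_true / y_pred values below are made-up house prices, not from any real dataset.

import numpy as np
from sklearn.metrics import mean_absolute_error

# Hypothetical actual vs. predicted house prices (illustrative values only)
y_true = np.array([250, 300, 180, 420, 350])
y_pred = np.array([260, 290, 200, 400, 340])

# Same as np.mean(np.abs(y_true - y_pred))
mae = mean_absolute_error(y_true, y_pred)
print(f"MAE: {mae:.2f}")   # 14.00 for these numbers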

2. Mean Squared Error (MSE)


• Formula: MSE = \frac{1}{n} \sum_{i=1}^{n} (y_i - \hat{y}_i)^2

• Description:
Measures the average squared difference between actual and
predicted values. Penalizes larger errors more than MAE.

• When to Use:

o When large errors need to be heavily penalized.

o Useful when trying to avoid large deviations.

• Example:
Used in stock market forecasting, where large errors can have
significant consequences.
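A minimal MSE sketch with scikit-learn, again on made-up numbers; the single large error in the last prediction dominates the score because errors are squared.

from sklearn.metrics import mean_squared_error

# Illustrative values; the last prediction is deliberately far off
y_true = [100, 102, 98, 105]
y_pred = [101, 103, 97, 125]

mse = mean_squared_error(y_true, y_pred)
print(f"MSE: {mse:.2f}")   # (1 + 1 + 1 + 400) / 4 = 100.75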

3. Root Mean Squared Error (RMSE)

• Formula: RMSE = \sqrt{MSE}

• Description:
Similar to MSE but returns error in the same unit as the target
variable, making it easier to interpret.

• When to Use:
o When large errors need to be penalized, but in the same unit
as the original values.

• Example:
Used in weather prediction, where temperature deviations must be
minimized.
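Since RMSE is just the square root of MSE, it is usually derived from the MSE value; a short sketch with illustrative temperature readings:

import numpy as np
from sklearn.metrics import mean_squared_error

# Made-up actual vs. predicted temperatures (°C)
y_true = [30.0, 25.5, 28.0, 31.0]
y_pred = [29.0, 26.5, 27.0, 33.0]

mse = mean_squared_error(y_true, y_pred)
rmse = np.sqrt(mse)         # back in the same unit as the target (°C)
print(f"RMSE: {rmse:.2f}")  # ≈ 1.32 here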

4. Mean Absolute Percentage Error (MAPE)


• Formula: MAPE = \frac{1}{n} \sum_{i=1}^{n} \left| \frac{y_i - \hat{y}_i}{y_i} \right| \times 100

• Description:
Measures the percentage error between actual and predicted
values.

• When to Use:

o When errors need to be expressed as a percentage.


o Not ideal when the dataset contains zero or near-zero
values.

• Example:
Used in sales forecasting, where understanding percentage
deviation is important.
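MAPE is simple to write directly from the formula; a sketch with made-up sales figures (newer scikit-learn releases also offer mean_absolute_percentage_error, which returns a fraction rather than a percentage):

import numpy as np

# Illustrative actual vs. forecast monthly sales
y_true = np.array([200.0, 150.0, 300.0, 250.0])
y_pred = np.array([210.0, 135.0, 330.0, 240.0])

# Direct implementation of the formula; y_true must not contain zeros
mape = np.mean(np.abs((y_true - y_pred) / y_true)) * 100
print(f"MAPE: {mape:.1f}%")   # about 7% for these numbers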

5. R-squared (R²)
• Formula: R^2 = 1 - \frac{\sum (y_i - \hat{y}_i)^2}{\sum (y_i - \bar{y})^2}

• Description:
Measures the proportion of variance explained by the model.
Values closer to 1 indicate better fit.

• When to Use:
o When needing to assess how well a model explains
variability in the data.

o Not useful for comparing models with different datasets.

• Example:
Used in econometrics, where understanding explanatory power is
crucial.
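A brief sketch using scikit-learn's r2_score on illustrative data; the value comes out close to 1 here because the predictions track the actual values closely.

from sklearn.metrics import r2_score

# Illustrative observations vs. model predictions
y_true = [3.0, 5.0, 7.0, 9.0, 11.0]
y_pred = [2.8, 5.1, 7.3, 8.7, 11.2]

r2 = r2_score(y_true, y_pred)
print(f"R^2: {r2:.3f}")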

6. Adjusted R-squared

• Formula: \text{Adjusted } R^2 = 1 - (1 - R^2) \frac{n - 1}{n - k - 1}, where n is the number of observations and k the number of predictors.

• Description:
Similar to R² but adjusts for the number of predictors in the model,
preventing overfitting.

• When to Use:
o When comparing models with different numbers of predictors.

• Example:
Used in medical research to determine how well multiple factors
predict disease progression.
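scikit-learn does not ship an adjusted R² function, so it is typically computed by hand from r2_score. A minimal sketch, where n is the sample size and k the number of predictors of a hypothetical model:

from sklearn.metrics import r2_score

def adjusted_r2(y_true, y_pred, k):
    """Adjusted R^2 for a model that uses k predictors."""
    n = len(y_true)
    r2 = r2_score(y_true, y_pred)
    return 1 - (1 - r2) * (n - 1) / (n - k - 1)

# Illustrative only: 8 observations, model with 3 predictors
y_true = [10, 12, 9, 14, 13, 11, 15, 10]
y_pred = [10.5, 11.5, 9.2, 13.8, 12.6, 11.4, 14.5, 10.3]
print(f"Adjusted R^2: {adjusted_r2(y_true, y_pred, k=3):.3f}")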

7. Huber Loss
• Formula: L_\delta(a) = \begin{cases} \frac{1}{2} a^2, & \text{for } |a| \le \delta \\ \delta \left( |a| - \frac{1}{2}\delta \right), & \text{for } |a| > \delta \end{cases}, where a = y_i - \hat{y}_i.

• Description:
A combination of MSE (for small errors) and MAE (for large errors),
reducing sensitivity to outliers.

• When to Use:
o When the dataset contains outliers, but they should not
dominate the model.

• Example:
Used in autonomous vehicle trajectory prediction, where sensor
noise can introduce outliers.
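The piecewise definition translates directly into NumPy; a sketch (delta is a tunable threshold, and the residuals below are invented to show one outlier being down-weighted):

import numpy as np

def huber_loss(y_true, y_pred, delta=1.0):
    """Mean Huber loss: quadratic for small residuals, linear for large ones."""
    a = np.asarray(y_true) - np.asarray(y_pred)
    quadratic = 0.5 * a ** 2
    linear = delta * (np.abs(a) - 0.5 * delta)
    return np.mean(np.where(np.abs(a) <= delta, quadratic, linear))

# Mostly small errors plus one outlier; the outlier contributes linearly, not quadratically
print(huber_loss([0, 0, 0, 0], [0.2, -0.1, 0.3, 5.0], delta=1.0))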

8. Log-Cosh Loss
• Formula: L = \sum_{i=1}^{n} \log\left( \cosh(y_i - \hat{y}_i) \right)

• Description:
Similar to Huber Loss but smoother and less sensitive to large
errors.

• When to Use:
o When handling moderate outliers with smooth loss-function
properties.

• Example:
Used in time-series forecasting to minimize impact of occasional
large fluctuations.
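Log-cosh is equally short to implement by hand; a sketch in NumPy (for small residuals it behaves like a squared error, for large ones roughly like an absolute error):

import numpy as np

def log_cosh_loss(y_true, y_pred):
    """Mean log-cosh loss over all predictions."""
    a = np.asarray(y_pred) - np.asarray(y_true)
    return np.mean(np.log(np.cosh(a)))

# Illustrative series with one larger fluctuation
print(log_cosh_loss([10, 11, 12, 13], [10.1, 10.8, 12.2, 16.0]))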

Choosing the Right Metric


Metric            When to Use

MAE               Balanced error measurement; less sensitive to outliers than MSE

MSE               Penalizes large errors; sensitive to outliers

RMSE              Like MSE but easier to interpret (same unit as the target)

MAPE              Useful for percentage-based errors

R²                Measures goodness of fit

Adjusted R²       Compares models with different numbers of features

Huber Loss        Robust to outliers

Log-Cosh Loss     Smooth loss function for moderate outliers



Confusion Matrix: Definition & Importance
What is a Confusion Matrix?
A confusion matrix is a table used to evaluate the performance of a
classification model. It provides a detailed breakdown of the model’s
predictions by comparing them to the actual values. The confusion
matrix helps in understanding where the model makes mistakes and
how well it distinguishes between different classes.

A 2x2 confusion matrix for a binary classification problem looks like this:

Actual / Predicted Positive (Predicted 1) Negative (Predicted 0)

Positive (Actual 1) True Positive (TP) False Negative (FN)

Negative (Actual 0) False Positive (FP) True Negative (TN)

Components of a Confusion Matrix

1. True Positives (TP) – Correctly predicted positive cases.

2. True Negatives (TN) – Correctly predicted negative cases.

3. False Positives (FP) (Type I Error) – Incorrectly predicted positives (the model falsely classifies a negative as positive).

4. False Negatives (FN) (Type II Error) – Incorrectly predicted negatives (the model falsely classifies a positive as negative).
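These four counts can be tallied directly from paired actual/predicted labels; a minimal sketch with illustrative binary labels (1 = positive, 0 = negative):

# Illustrative ground-truth and predicted labels
actual    = [1, 0, 1, 1, 0, 0, 1, 0]
predicted = [1, 0, 0, 1, 1, 0, 1, 0]

tp = sum(1 for a, p in zip(actual, predicted) if a == 1 and p == 1)
tn = sum(1 for a, p in zip(actual, predicted) if a == 0 and p == 0)
fp = sum(1 for a, p in zip(actual, predicted) if a == 0 and p == 1)
fn = sum(1 for a, p in zip(actual, predicted) if a == 1 and p == 0)

print(f"TP={tp}, TN={tn}, FP={fp}, FN={fn}")   # TP=3, TN=3, FP=1, FN=1 here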

Why is a Confusion Matrix Useful?


• Gives a detailed performance analysis – Unlike a simple accuracy
score, it tells what types of errors the model makes.

• Helps calculate other performance metrics – Precision, recall, F1-score, and specificity are all derived from it.
• Essential for imbalanced datasets – If one class is more frequent
than another, accuracy alone can be misleading, but a confusion
matrix shows misclassification rates.

Key Metrics Derived from a Confusion Matrix

1. Accuracy = \frac{TP + TN}{TP + TN + FP + FN}

o Measures overall correctness of the model.

o Not reliable for imbalanced datasets.

2. Precision = \frac{TP}{TP + FP}

o Measures how many predicted positives are actually correct.

o Important for minimizing false positives (e.g., spam detection).

3. Recall (Sensitivity) = \frac{TP}{TP + FN}

o Measures how many actual positives were correctly identified.

o Crucial for applications where missing positives is costly (e.g., cancer detection).

4. F1-score = 2 \times \frac{\text{Precision} \times \text{Recall}}{\text{Precision} + \text{Recall}}

o Balances precision and recall. Useful when both false positives and false negatives are critical.

5. Specificity = \frac{TN}{TN + FP}

o Measures how well the model identifies actual negatives.

o Useful in fraud detection, where false positives should be minimized.
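As a quick check of these formulas, all five metrics can be computed from the four raw counts in a few lines; the TP/TN/FP/FN values below are arbitrary illustrations.

# Arbitrary illustrative counts
tp, tn, fp, fn = 40, 45, 5, 10

accuracy    = (tp + tn) / (tp + tn + fp + fn)
precision   = tp / (tp + fp)
recall      = tp / (tp + fn)
f1          = 2 * precision * recall / (precision + recall)
specificity = tn / (tn + fp)

print(f"Accuracy={accuracy:.2f}, Precision={precision:.2f}, "
      f"Recall={recall:.2f}, F1={f1:.2f}, Specificity={specificity:.2f}")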

Example Use Case

Imagine a medical test for a disease:

• True Positive (TP): The test correctly identifies a sick patient.


• False Positive (FP): The test wrongly classifies a healthy person as
sick.

• False Negative (FN): The test fails to detect a sick patient.

• True Negative (TN): The test correctly identifies a healthy patient.


A confusion matrix helps determine whether the test is better at
detecting real cases or prone to false alarms, guiding improvements in
accuracy.

Example: AI Model for Spam Email Classification
Let's consider a spam email classifier that predicts whether an email is
Spam (1) or Not Spam (0) based on various features. We test the model
on a fictitious dataset and calculate its confusion matrix, precision, recall,
and F1 score to evaluate its performance.

Test Dataset Predictions

Email ID Actual (Ground Truth) Predicted by Model

1 1 (Spam) 1 (Spam)

2 0 (Not Spam) 0 (Not Spam)

3 1 (Spam) 0 (Not Spam)

4 1 (Spam) 1 (Spam)

5 0 (Not Spam) 1 (Spam)

6 1 (Spam) 1 (Spam)

7 0 (Not Spam) 0 (Not Spam)

8 1 (Spam) 0 (Not Spam)

9 0 (Not Spam) 0 (Not Spam)

10 0 (Not Spam) 1 (Spam)



Confusion Matrix Calculation

Actual / Predicted     Spam (Predicted 1)    Not Spam (Predicted 0)

Spam (Actual 1)        TP = 3                FN = 2

Not Spam (Actual 0)    FP = 2                TN = 3

Where:
• True Positives (TP) = 3 → Emails correctly classified as spam.

• False Positives (FP) = 2 → Non-spam emails misclassified as spam.

• False Negatives (FN) = 2 → Spam emails misclassified as non-spam.

• True Negatives (TN) = 3 → Emails correctly classified as not spam.

Performance Metrics

1. Accuracy (Overall Correct Predictions)


Accuracy = \frac{TP + TN}{TP + TN + FP + FN} = \frac{3 + 3}{3 + 3 + 2 + 2} = \frac{6}{10} = 0.6 (60%)

2. Precision (Spam Detection Accuracy)


Precision = \frac{TP}{TP + FP} = \frac{3}{3 + 2} = \frac{3}{5} = 0.6 (60%)
• Precision tells us how many of the emails predicted as spam were
actually spam.

3. Recall (Sensitivity / True Positive Rate)


Recall = \frac{TP}{TP + FN} = \frac{3}{3 + 2} = \frac{3}{5} = 0.6 (60%)
• Recall measures how many actual spam emails were correctly
identified by the model.

4. F1 Score (Harmonic Mean of Precision & Recall)



F1 = 2 \times \frac{\text{Precision} \times \text{Recall}}{\text{Precision} + \text{Recall}} = 2 \times \frac{0.6 \times 0.6}{0.6 + 0.6} = 2 \times \frac{0.36}{1.2} = 2 \times 0.3 = 0.6 (60%)
• The F1 score balances precision and recall, giving a single
performance measure.

Interpretation of Results
• Accuracy is 60%, meaning the model correctly classified 6 out of
10 emails.
• Precision is 60%, meaning 60% of emails predicted as spam were
actually spam.
• Recall is 60%, meaning the model identified 60% of all actual
spam emails, but missed 40%.

• F1 Score is 60%, balancing precision and recall.

Python Implementation:-

from sklearn.metrics import confusion_matrix, accuracy_score, precision_score, recall_score, f1_score

# Actual labels (ground truth)
y_true = [1, 0, 1, 1, 0, 1, 0, 1, 0, 0]

# Predicted labels by the model
y_pred = [1, 0, 0, 1, 1, 1, 0, 0, 0, 1]

# Compute confusion matrix
cm = confusion_matrix(y_true, y_pred)

# Compute performance metrics
accuracy = accuracy_score(y_true, y_pred)
precision = precision_score(y_true, y_pred)
recall = recall_score(y_true, y_pred)
f1 = f1_score(y_true, y_pred)

# Print results
print("Confusion Matrix:")
print(cm)
print(f"\nAccuracy: {accuracy:.2f}")
print(f"Precision: {precision:.2f}")
print(f"Recall: {recall:.2f}")
print(f"F1 Score: {f1:.2f}")
