EVALUATION - Notes
Artificial Intelligence (417)
1. What is Evaluation?
Ans : Evaluation is the process of understanding the reliability of an AI model. It is done by feeding a test dataset into the model and comparing the model's outputs with the actual answers. Its purpose is to make judgments about a program, to improve its effectiveness, and/or to inform programming decisions.
Note: The training dataset should not be used for evaluation. A model tested on the data it was trained on will give near-perfect results, which does not reflect its performance on new data. This situation is known as overfitting.
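To illustrate this point, here is a minimal Python sketch (assuming scikit-learn is installed; the iris dataset and decision-tree classifier are only placeholders) showing why a model is evaluated on a separate test dataset rather than on its own training data:

from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier
from sklearn.metrics import accuracy_score

# Load a sample dataset and hold back a portion purely for evaluation.
X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=0)

# Train on the training split only.
model = DecisionTreeClassifier(random_state=0).fit(X_train, y_train)

# The training-data score is misleadingly high (the overfitting effect noted above);
# the test-data score is the honest estimate of the model's reliability.
print("Accuracy on training data:", accuracy_score(y_train, model.predict(X_train)))
print("Accuracy on unseen test data:", accuracy_score(y_test, model.predict(X_test)))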
Ans : The prediction is the output which is given by the machine, and the reality is the real scenario about which the prediction has been made. For example:
Let’s imagine that we have an AI-based prediction model which has been
deployed to identify a Football or a soccer ball.
The objective of the model is to predict whether the given/shown figure is a football. To understand the efficiency of this model, we need to check whether the predictions it makes are correct. So we need to compare Prediction and Reality. Four cases are possible:
Case 1 :
a) Prediction = Yes
b) Reality = Yes
Here, the Prediction matches Reality and both are positive. Hence, this condition is termed as True Positive.
Case 2 :
a) Prediction = No
b) Reality = No
Here, the Prediction matches Reality and both are negative. Hence, this condition is termed as True Negative.
Case 3 :
a) Prediction = Yes
b) Reality = No
Here, the Prediction is positive and does not match Reality. Hence, this
condition is termed as False Positive.
Case 4 :
a) Prediction = No
b) Reality = Yes
Here, the Prediction is negative and does not match Reality. Hence, this
condition is termed as False Negative.
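These four cases can also be counted in code. Below is a small Python sketch (the Yes/No lists are made-up examples, one pair per case above) that tallies True Positives, True Negatives, False Positives and False Negatives:

# Made-up prediction/reality pairs, matching Cases 1-4 above.
predictions = ["Yes", "No", "Yes", "No"]
realities   = ["Yes", "No", "No",  "Yes"]

tp = tn = fp = fn = 0
for pred, real in zip(predictions, realities):
    if pred == "Yes" and real == "Yes":
        tp += 1   # Case 1: True Positive
    elif pred == "No" and real == "No":
        tn += 1   # Case 2: True Negative
    elif pred == "Yes" and real == "No":
        fp += 1   # Case 3: False Positive
    else:
        fn += 1   # Case 4: False Negative

print("TP:", tp, "TN:", tn, "FP:", fp, "FN:", fn)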
Ans : Precision is defined as the percentage of true positive cases versus all the cases where the prediction is true. That is, it takes into account the True Positives and False Positives: Precision = (TP / (TP + FP)) * 100%.
12. How do you suggest which evaluation metric is more important for any case?
Ans :
In a case like Forest Fire, a False Negative can cost us a lot and is risky
too. Imagine no alert being given even when there is a Forest Fire. The whole
forest might burn down.
On the other hand, there can be cases in which the False Positive
condition costs us more than False Negatives. One such case is Mining.
Imagine a model telling you that there exists treasure at a point and you keep
on digging there but it turns out that it is a false alarm. Here, the False
Positive case (predicting there is a treasure but there is no treasure) can be
very costly.
15. Which evaluation metric would be crucial in the following cases? Justify.
Ans :
a) Forest Fire: Recall, because a False Negative (no alert even though there is a fire) is the costliest outcome, as explained above.
b) Viral Outbreak: Recall, because missing an actually infected case (a False Negative) lets the outbreak spread further.
c) Spam: Precision, because a False Positive (an important mail wrongly marked as spam) may cause the user to miss it.
d) Mining: Precision, because a False Positive (predicting treasure where there is none) wastes digging effort and cost, as explained above.
17. Calculate Accuracy, Precision, Recall and F1 Score for the following
Confusion Matrix on Heart Attack Risk. Also suggest which metric would be a
good evaluation parameter here and why?
Where True Positive (TP) = 50, True Negative (TN) = 20, False Positive (FP) = 20 and
False Negative (FN) = 10.
Accuracy:
= ((50+20) / (50+20+20+10)) * 100%
= (70/100) * 100%
= 70%
Precision:
Precision is defined as the percentage of true positive cases versus all the cases where the prediction is true.
= (50 / (50+20)) * 100%
= (50/70) * 100%
= 0.714 * 100% = 71.4%
Recall: It is defined as the fraction of positive cases that are correctly identified.
= 50 / (50 + 10)
= 50 / 60
= 0.83
F1 Score:
= 2 * (Precision * Recall) / (Precision + Recall)
= 2 * ((0.714 * 0.83) / (0.714 + 0.83))
= 2 * (0.592 / 1.544)
= 2 * 0.383
= 0.766
Therefore, Accuracy = 70%, Precision = 71.4%, Recall = 0.83 and F1 Score = 0.766.
Here within the test there is a tradeoff, but Recall is a good evaluation metric here.
Because,
False Positive (impacts Precision): A person is predicted as high risk but does not have a heart attack.
False Negative (impacts Recall): A person is predicted as low risk but does have a heart attack.
Therefore, False Negatives miss actual heart patients; hence the Recall metric needs more improvement.
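The figures worked out above can be verified with a few lines of Python (plain arithmetic only; the small differences in the last decimal place come from the rounding used in the worked solution):

# Confusion-matrix counts from Question 17 (Heart Attack Risk).
tp, tn, fp, fn = 50, 20, 20, 10

accuracy  = (tp + tn) / (tp + tn + fp + fn)
precision = tp / (tp + fp)
recall    = tp / (tp + fn)
f1        = 2 * (precision * recall) / (precision + recall)

print(f"Accuracy : {accuracy:.2%}")    # 70.00%
print(f"Precision: {precision:.2%}")   # 71.43%
print(f"Recall   : {recall:.2%}")      # 83.33%
print(f"F1 Score : {f1:.3f}")          # about 0.769 (0.766 with the rounded values above)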
18. Calculate Accuracy, Precision, Recall and F1 Score for the following Confusion Matrix on Water Shortage in Schools. Also suggest which metric would not be a good evaluation parameter here and why?
Observations:
Where True Positive (TP) = 75, True Negative (TN) = 15, False Positive (FP) = 5 and False Negative (FN) = 5.
Accuracy:
= ((75+15) / (75+15+5+5)) * 100%
= (90/100) * 100%
= 90%
Precision:
Precision is defined as the percentage of true positive cases versus all the cases where the prediction is true.
= (75 / (75+5)) * 100%
= (75/80) * 100%
= 0.9375 * 100% = 93.75%
Recall:
= 75 / (75+5)
= 75 /80
= 0.9375
F1 Score:
= 2 * (Precision * Recall) / (Precision + Recall)
= 2 * ((0.9375 * 0.9375) / (0.9375 + 0.9375))
= 2 * (0.8789 / 1.875)
= 2 * 0.4688
= 0.9375
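As a quick cross-check, the same formulas can be wrapped in a small reusable Python helper and applied to the counts used above (TP = 75, TN = 15, FP = 5, FN = 5):

def metrics(tp, tn, fp, fn):
    # Return accuracy, precision, recall and F1 score for one confusion matrix.
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return {
        "accuracy": (tp + tn) / (tp + tn + fp + fn),
        "precision": precision,
        "recall": recall,
        "f1": 2 * precision * recall / (precision + recall),
    }

print(metrics(75, 15, 5, 5))
# {'accuracy': 0.9, 'precision': 0.9375, 'recall': 0.9375, 'f1': 0.9375}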
19. Calculate Accuracy, Precision, Recall and F1 Score for the following Confusion Matrix on SPAM FILTERING. Also suggest which metric would not be a good evaluation parameter here and why?
Observations
Where True Positive (TP) = 10, True Negative (TN) = 25, False Positive (FP) = 55 and
False Negative (FN) = 10.
Accuracy:
= ((10+25) / (10+25+55+10)) * 100%
= (35/100) * 100%
= 35%
Precision:
Precision is defined as the percentage of true positive cases versus all the cases where the prediction is true.
= (10 / (10+55)) * 100%
= (10/65) * 100%
≈ 0.15 * 100% = 15%
Recall:
= 10/(10+10)
= 10/20
= 0.5
F1 Score:
= 2 * (Precision * Recall) / (Precision + Recall)
= 2 * ((0.15 * 0.5) / (0.15 + 0.5))
= 2 * (0.075 / 0.65)
= 2 * 0.115
= 0.23
Here within the test there is a tradeoff, and the Precision obtained is not a good value here (only about 15%).
Because,
False Positive (impacts Precision): A mail is predicted as “spam” but it is not spam.
False Negative (impacts Recall): A mail is predicted as “not spam” but it is spam.
Too many False Negatives will make the Spam Filter ineffective, but False Positives send important mails to the spam folder where they may be missed. Hence, Precision is the metric that needs the most improvement here.
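Finally, the spam-filtering numbers can be cross-checked with scikit-learn (assuming it is installed). The label lists below are simply reconstructed from the counts TP = 10, TN = 25, FP = 55, FN = 10, with 1 meaning “spam” and 0 meaning “not spam”:

from sklearn.metrics import accuracy_score, precision_score, recall_score, f1_score

# Reality and prediction lists rebuilt from the confusion-matrix counts.
y_true = [1] * 10 + [0] * 25 + [0] * 55 + [1] * 10   # reality
y_pred = [1] * 10 + [0] * 25 + [1] * 55 + [0] * 10   # model's prediction

print("Accuracy :", accuracy_score(y_true, y_pred))    # 0.35
print("Precision:", precision_score(y_true, y_pred))   # about 0.15
print("Recall   :", recall_score(y_true, y_pred))      # 0.5
print("F1 Score :", f1_score(y_true, y_pred))          # about 0.23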