
Unit 7

Evaluation
Evaluation
• Evaluation is the method used to understand the reliability of an AI model on the basis of the output it provides when test data is given.
• It is a measurement of the model's reliability and performance as per the requirement.
• This is done by comparing the outputs generated by the model (Predictions) with the actual outputs (Reality).
• For evaluation, the testing data is very crucial.
Characteristics of Testing Data
• It is a completely different, unique and new set of data compared to the training data.
• The data is prepared carefully by trained professionals after exploring big data.
• It is recommended that the testing data be kept separate from the entire training data set so that the model can be evaluated without any bias.
Scenario
• A scenario is the problem area for which the model has been deployed.
• It is the source of the real data which is fed into the model for processing.

Real time scenario
• Non-stop, critical, emergent life threats or terrible situations.
• Ex: floods, earthquakes, online transactions, monitoring patients in critical care, monitoring air, road, sea and rail traffic.

Regular interval scenario
• Less critical problem area.
• No emergent threat.
• Ex: pollution monitoring, monitoring of diet & fitness for a sports person, study on cancer patients, monitoring student performance & their study habits, etc., done daily/weekly/monthly etc.
The two aspects (prediction and reality) are compared, giving one of the following two possibilities:
1. The prediction matches the reality.
2. The prediction does not match the reality.
NOTE: The predictions are labelled as 'Yes' or 'No' / 'True' or 'False' / 'Positive' or 'Negative'.
Case Study: A system that identifies if it is
raining or not.
Prediction: YES / TRUE / POSITIVE    Reality: YES / TRUE / POSITIVE    →  TRUE POSITIVE
Prediction: NO / FALSE / NEGATIVE    Reality: NO / FALSE / NEGATIVE    →  TRUE NEGATIVE
Prediction: YES / TRUE / POSITIVE    Reality: NO / FALSE / NEGATIVE    →  FALSE POSITIVE
Prediction: NO / FALSE / NEGATIVE    Reality: YES / TRUE / POSITIVE    →  FALSE NEGATIVE
Confusion Matrix / Error Matrix
• The outcome of the comparison between the prediction & reality can be recorded in a tabular form called the Confusion Matrix or Error Matrix.
• It helps in visualizing the performance of the algorithm & the model.
• It is mostly used with supervised learning models.
• It is not an evaluation metric by itself but a record which can help in evaluation.

TP • TRUE POSITIVE: Prediction and Reality match (TRUE); the prediction is True (POSITIVE).
TN • TRUE NEGATIVE: Prediction and Reality match (TRUE); the prediction is False (NEGATIVE).
FP • FALSE POSITIVE: Prediction and Reality DO NOT match (FALSE); the prediction is True (POSITIVE).
FN • FALSE NEGATIVE: Prediction and Reality DO NOT match (FALSE); the prediction is False (NEGATIVE).
The Confusion Matrix

                      REALITY: YES            REALITY: NO
PREDICTION: YES       True Positive (TP)      False Positive (FP)
PREDICTION: NO        False Negative (FN)     True Negative (TN)
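The same comparison can be tallied programmatically. Below is a minimal Python sketch (the label lists are made-up values, in the spirit of the rain case study above) that counts the four cells of the confusion matrix:

```python
# Tally the four confusion-matrix cells from paired prediction/reality labels.
# The label lists below are made-up values for illustration only.
predictions = ["YES", "NO", "YES", "YES", "NO", "NO", "YES", "NO"]
reality     = ["YES", "NO", "NO",  "YES", "YES", "NO", "YES", "NO"]

tp = tn = fp = fn = 0
for pred, real in zip(predictions, reality):
    if pred == "YES" and real == "YES":
        tp += 1   # True Positive: prediction matches reality, prediction is YES
    elif pred == "NO" and real == "NO":
        tn += 1   # True Negative: prediction matches reality, prediction is NO
    elif pred == "YES" and real == "NO":
        fp += 1   # False Positive: prediction is YES but reality is NO
    else:
        fn += 1   # False Negative: prediction is NO but reality is YES

print("TP:", tp, "TN:", tn, "FP:", fp, "FN:", fn)   # TP: 3 TN: 3 FP: 1 FN: 1
```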
Evaluation Methods
• Accuracy
• Precision
• Recall
• F1 Score
Accuracy
• Accuracy counts the total number of correct predictions made by a model.
• It is the percentage of correct predictions out of all the observations, i.e. accuracy determines how many of the model's predictions were correct.
• Correct predictions are those where the prediction matches reality, i.e. the 'True Positive' and 'True Negative' cases.
• Formula:
Accuracy = (Correct Predictions / Total Cases) X 100 %
Accuracy = ((TP + TN) / (TP + TN + FP + FN)) X 100 %
Precision:
• Precision is defined as the percentage of true positive cases out of all the cases where the prediction is positive (true).
• That is, it takes into account the True Positives and the False Positives.
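In terms of the confusion matrix, the standard Precision formula, written in the same style as the Accuracy formula above, is:
Precision = (TP / (TP + FP)) X 100 %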
Recall
• It can be described as the percentage of actually positive cases that the model correctly detected as positive.
• The scenarios where a fire actually existed in reality, whether correctly or incorrectly recognised by the machine, are the ones considered here.
• That is, it takes into account both the False Negatives (there was a forest fire but the model didn't predict it) and the True Positives (there was a forest fire in reality and the model predicted a forest fire).
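Written in the same style, the standard Recall formula is:
Recall = (TP / (TP + FN)) X 100 %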
F1 Score:
• F1 Score is a weighted average (harmonic mean) of precision and recall.
• Since precision takes false positives into account and recall takes false negatives into account, the F1 Score considers both of them.
• The F1 Score is usually more useful than accuracy, especially if you have an uneven class distribution.
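The standard F1 Score formula, in terms of Precision and Recall, is:
F1 Score = 2 X (Precision X Recall) / (Precision + Recall)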
Which Metric is Important?

• Depending on the situation in which the model has been deployed, we may need to choose between Precision and Recall.
• A False Negative can cost us a lot of money and put us in danger in a situation
like a forest fire.
• Viral Outbreak is another situation in which a False Negative might be harmful.
• Consider a scenario in which a fatal virus has begun to spread but is not being
detected by the model used to forecast viral outbreaks. The virus may infect
numerous people and spread widely.

• To conclude the argument, we must say that if we want to know if our model's performance is good, we need these two measures: Recall and Precision.
F1 Score
• Since both the measures (Precision & Recall) are important, there is a need
for a parameter that takes both Precision and Recall into account which is
called the F1 Score.
• An ideal situation would be when we have a value of 100% for both Precision and Recall. In that case, the F1 Score would also be an ideal 100%.
• In conclusion, we can say that a model has good performance if the F1 Score for that model is high.
Calculate Accuracy, Precision, Recall and F1 Score for the following Confusion
Matrix on Heart Attack Risk. Also suggest which metric would not be a good
evaluation parameter here and why?
