TensorFlow Classification

The document discusses classification as a fundamental problem in machine learning, detailing various classifiers and their evaluation metrics such as accuracy, precision, and recall. It highlights the importance of understanding these metrics, especially in skewed datasets, and emphasizes the need for careful model selection and threshold tuning to optimize performance. Additionally, it introduces concepts like confusion matrices and ROC curves to aid in model evaluation and decision-making.


Classification as a Machine Learning Problem

Overview

Classification is a canonical problem in Machine Learning

Classifiers can be measured using accuracy, precision, and recall

Traditional ML models for classification include SVM and Naive Bayes

Neural networks perform very well on classification problems


Classification and Classifiers

Machine Learning: work with a huge maze of data, find patterns, and make intelligent decisions.

Example: emails on a server. Spam or Ham? Trash or Inbox?


Types of Machine Learning Problems

Classification Regression Clustering Rule-extraction




Whales: Fish or Mammals?

Mammals: whales are members of the infraorder Cetacea.
Fish: whales look like fish, swim like fish, and move with fish.

ML-based Classifier

Training: feed in a large corpus of correctly classified data.
Prediction: use the trained model to classify new instances it has not seen before.
Training the ML-based Classifier

Corpus → ML-based Classifier → Classification

Feedback from a loss function (or cost function) improves the model parameters.
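A minimal sketch of this loop with Keras, in keeping with the deck's TensorFlow setting; the corpus, feature width, and architecture below are hypothetical stand-ins:

```python
# Corpus -> ML-based Classifier -> Classification, with the loss function
# feeding back to improve the model parameters on every training step.
import numpy as np
import tensorflow as tf

# Hypothetical corpus: 1000 correctly labeled instances with 4 features each.
X = np.random.rand(1000, 4).astype("float32")
y = np.random.randint(0, 2, size=(1000,))            # binary labels

model = tf.keras.Sequential([
    tf.keras.Input(shape=(4,)),
    tf.keras.layers.Dense(16, activation="relu"),
    tf.keras.layers.Dense(1, activation="sigmoid"),  # outputs P(positive class)
])

# binary_crossentropy is the loss (cost) function providing the feedback.
model.compile(optimizer="adam", loss="binary_crossentropy", metrics=["accuracy"])
model.fit(X, y, epochs=5, batch_size=32)

# Prediction: classify new instances the model has not seen before.
print(model.predict(np.random.rand(3, 4).astype("float32")))
```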
An algorithm might have high accuracy but still be a poor machine learning model: its predictions may be useless.


Accuracy, Precision, Recall

The "All-Is-Well" Binary Classifier

Consider a classifier for medical reports that always classifies as "normal", i.e. No Cancer. For a rare cancer, its accuracy may be 99.9999%, but…


Accuracy

Some labels may be much more common or much rarer than others.

Such a dataset is said to be skewed.

Accuracy is a poor evaluation metric here.
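As a quick sketch of the problem (the counts here are hypothetical), a majority-class classifier on a skewed dataset scores near-perfect accuracy while detecting nothing:

```python
# "All-is-well" baseline on skewed data: always predict the majority class.
# Hypothetical counts: 10 cancer cases among 1,000,000 reports.
labels = [1] * 10 + [0] * 999_990        # 1 = Cancer, 0 = No Cancer
predictions = [0] * len(labels)          # always predict "No Cancer"

accuracy = sum(p == y for p, y in zip(predictions, labels)) / len(labels)
print(f"accuracy = {accuracy:.4%}")      # 99.9990%, yet no cancer is ever found
```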


Confusion Matrix

                     Predicted: Cancer    Predicted: No Cancer
Actual: Cancer       10 (TP)              4 (FN)
Actual: No Cancer    5 (FP)               1000 (TN)

True Positive (TP): predicted Cancer, actually Cancer (actual label = predicted label).
False Positive (FP): predicted Cancer, actually No Cancer (actual label ≠ predicted label).
True Negative (TN): predicted No Cancer, actually No Cancer (actual label = predicted label).
False Negative (FN): predicted No Cancer, actually Cancer (actual label ≠ predicted label).
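A small sketch of tallying the four cells from (actual, predicted) label pairs; the lists below are constructed to reproduce the counts above:

```python
# 1 = Cancer, 0 = No Cancer; pairs arranged to match the slide's matrix.
actual    = [1]*10 + [1]*4 + [0]*5 + [0]*1000
predicted = [1]*10 + [0]*4 + [1]*5 + [0]*1000

tp = sum(a == 1 and p == 1 for a, p in zip(actual, predicted))   # 10
fn = sum(a == 1 and p == 0 for a, p in zip(actual, predicted))   # 4
fp = sum(a == 0 and p == 1 for a, p in zip(actual, predicted))   # 5
tn = sum(a == 0 and p == 0 for a, p in zip(actual, predicted))   # 1000
print(f"TP={tp} FN={fn} FP={fp} TN={tn}")
```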
Accuracy

Accuracy = (TP + TN) / Num Instances = 1010 / 1019 = 99.12%

The classifier gets it right 99.12% of the time. But…

The 5 false positives are people put on chemotherapy and radiation when not required.
The 4 false negatives are people whose cancer goes undetected, with no treatment prescribed.

Accuracy is not a good metric to evaluate whether this model performs well.
Precision

Precision is the accuracy of the classifier when it flags cancer.

Precision = TP / (TP + FP) = 10 / 15 = 66.67%

1 in 3 cancer diagnoses is incorrect.
Recall

Recall is the accuracy of the classifier when cancer is actually present.

Recall = TP / (TP + FN) = 10 / 14 = 71.43%

2 in 7 cancer cases are missed.
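The three metrics side by side, computed from the slide's confusion-matrix counts:

```python
tp, fn, fp, tn = 10, 4, 5, 1000

accuracy  = (tp + tn) / (tp + fn + fp + tn)   # 1010/1019 = 99.12%
precision = tp / (tp + fp)                    # 10/15 = 66.67%
recall    = tp / (tp + fn)                    # 10/14 = 71.43%
print(f"accuracy={accuracy:.2%} precision={precision:.2%} recall={recall:.2%}")
```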
Choosing a Machine Learning Model

ML-based Binary Classifier

Input features: breathes like a mammal, gives birth like a mammal.

A classifier trained on a corpus can output a hard label ("Mammal") or, more usefully, a probability, e.g. P(fish) = 0.45.
Applying Logistic Regression

Logistic regression outputs the probability of the animal being a fish:

Lives in water, breathes with gills, lays eggs: 95%
Lives in water, breathes with lungs, does not lay eggs: 60%
Lives on land, breathes with lungs, does not lay eggs: 5%


Choosing Decision Threshold

Choose a threshold Pthreshold, e.g. 80%, on the probability of the animal being a fish.

If probability < Pthreshold, it's a mammal.
If probability > Pthreshold, it's a fish.
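As a sketch, with the threshold and the three probabilities taken from the slides:

```python
P_THRESHOLD = 0.80   # the chosen decision threshold

p_fish = [0.95, 0.60, 0.05]   # model outputs for three animals
labels = ["fish" if p > P_THRESHOLD else "mammal" for p in p_fish]
print(labels)   # ['fish', 'mammal', 'mammal']
```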


The "Always Negative" Classifier (Pthreshold = 1)

                     Predicted: Cancer    Predicted: No Cancer
Actual: Cancer       0 (TP)               14 (FN)
Actual: No Cancer    0 (FP)               1005 (TN)

- Recall = 0%
- Precision = undefined (0/0: the classifier never flags cancer)
- Classifier too conservative

Precision vs. "Conservativeness"

[Plot: precision rises toward 1.0 as the decision threshold becomes more conservative.]
The "Always Positive" Classifier (Pthreshold = 0)

                     Predicted: Cancer    Predicted: No Cancer
Actual: Cancer       14 (TP)              0 (FN)
Actual: No Cancer    1005 (FP)            0 (TN)

- Recall = 100%
- Precision = 14 / 1019 = 1.37%
- Classifier not conservative enough

Recall vs. "Conservativeness"

[Plot: recall falls from 1.0 toward 0 as the decision threshold becomes more conservative.]

Precision-Recall Tradeoff

[Plot: as the threshold grows more conservative, precision rises while recall falls.]
Heuristics to Choose a Model

F1 Score: the harmonic mean of precision and recall.
ROC Curve: plot a curve to maximize true positives and minimize false positives.

F1 = 2 × (Precision × Recall) / (Precision + Recall)

F1 Score: the harmonic mean of precision and recall
- Closer to the lower of the two
- Favors an even tradeoff
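Computed on the slide's precision and recall as a quick check:

```python
precision, recall = 10 / 15, 10 / 14   # 66.67% and 71.43%

# Harmonic mean: sits nearer the lower of the two values (~0.69 here).
f1 = 2 * precision * recall / (precision + recall)
print(f"F1 = {f1:.4f}")
```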


Choosing Pthreshold

Tweak threshold values: run training with a different threshold value for each execution.
Calculate precision and recall for each training run.
Calculate the F1 score: each run produces a model; compute an F1 score for each.
A higher F1 score is better: choose the threshold that results in the highest F1 score (a sketch of this sweep follows below).
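A minimal sketch of that sweep. The slides describe retraining per threshold; for brevity this version re-thresholds one model's held-out probabilities, and the validation labels and probabilities are hypothetical:

```python
import numpy as np

y_true = np.array([1, 1, 1, 0, 0, 0, 0, 1])                   # hypothetical labels
y_prob = np.array([0.9, 0.7, 0.4, 0.3, 0.6, 0.1, 0.2, 0.8])   # model outputs

def f1_at(threshold):
    """F1 score of the predictions made at the given threshold."""
    y_pred = (y_prob >= threshold).astype(int)
    tp = np.sum((y_pred == 1) & (y_true == 1))
    fp = np.sum((y_pred == 1) & (y_true == 0))
    fn = np.sum((y_pred == 0) & (y_true == 1))
    if tp == 0:
        return 0.0
    precision, recall = tp / (tp + fp), tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)

# Evaluate candidate thresholds and keep the one with the highest F1.
thresholds = np.linspace(0.05, 0.95, 19)
best = max(thresholds, key=f1_at)
print(f"best Pthreshold = {best:.2f}, F1 = {f1_at(best):.3f}")
```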
Choosing Pthreshold with the ROC Curve

ROC Curve (Receiver Operating Characteristic): plot the True Positive Rate (y-axis) against the False Positive Rate (x-axis).

The True Positive Rate should be as high as possible.
The False Positive Rate should be as low as possible.

Evaluate different values of Pthreshold (hyperparameter tuning) and fit the ROC curve through the resulting points.

Pick the top-left corner point as Pthreshold. Why? It maximises the True Positive Rate and minimises the False Positive Rate.
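One way to trace the curve and pick that corner point. scikit-learn's roc_curve and the sample labels/probabilities are assumptions here; the slides name no library:

```python
import numpy as np
from sklearn.metrics import roc_curve

y_true = np.array([1, 1, 1, 0, 0, 0, 0, 1])                   # hypothetical labels
y_prob = np.array([0.9, 0.7, 0.4, 0.3, 0.6, 0.1, 0.2, 0.8])   # model outputs

# One (FPR, TPR) point per candidate threshold.
fpr, tpr, thresholds = roc_curve(y_true, y_prob)

# Nearest point to the top-left corner (FPR=0, TPR=1): maximises the true
# positive rate while minimising the false positive rate.
idx = np.argmin(fpr**2 + (1 - tpr)**2)
print(f"Pthreshold = {thresholds[idx]:.2f} (TPR={tpr[idx]:.2f}, FPR={fpr[idx]:.2f})")
```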
ROC of Perfect Classifier

TP rate = 100%, FP rate = 0%: the curve hugs the top-left corner.
ROC of Random Classifier

TP rate = FP rate: the curve is the main diagonal.
