COMPX310-19A Machine Learning Chapter 3: Classification

Uploaded by

Natch Sadindum


COMPX310-19A

Machine Learning
Chapter 3: Classification
An introduction using Python, Scikit-Learn, Keras, and TensorFlow

Unless otherwise indicated, all images are from Hands-on Machine Learning with
Scikit-Learn, Keras, and TensorFlow by Aurélien Géron, Copyright © 2019 O’Reilly Media
Housekeeping
• Outline


03/08/2021 COMPX310 2
MNIST: the “hello world” of ML
• Scikit-learn provides access to some benchmark datasets.

• In this case, the data comes from www.openml.org.

Handwritten digits

Preparing Y

Train/test, binary class
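
A minimal sketch of these preparation steps, using a tiny made-up label list rather than MNIST itself. It assumes that the labels arrive as strings, that the last rows are held out for testing, and that the binary task is "is this digit a 5?" (the target used on the precision and recall slides). The names `labels`, `n_train`, and `y_train_5` are illustrative.

```python
# Toy stand-in for the MNIST labels (assumption: labels are loaded as strings).
labels = ["5", "0", "4", "1", "9", "2", "5", "3", "5", "7"]

# Preparing y: convert the string labels to integers.
y = [int(label) for label in labels]

# Train/test split: keep the first n_train rows for training, the rest for testing.
n_train = 8  # illustrative only; not the real MNIST split size
y_train, y_test = y[:n_train], y[n_train:]

# Binary class: True for every 5, False for every other digit.
y_train_5 = [digit == 5 for digit in y_train]
y_test_5 = [digit == 5 for digit in y_test]

print(y_train_5)
print(y_test_5)
```

The feature rows would be split at the same index so that examples and labels stay aligned.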

Yet another learner: SGD

Cross-validation
• Cross-validation is an alternative to a separate train + validation split.
• The training set is split into k roughly equal-sized folds (k = 10 is a common choice; in recent scikit-learn versions, cross_val_score defaults to 5 folds).
• Use k-1 folds together as the new training set and validate on the remaining fold.
• Repeat this k times, each time holding out a different fold, which gives k results.
• Compute the mean and standard deviation of the k results.
• Optionally, repeat the whole procedure with different random seeds to reduce the variance of the result.
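
The procedure above can be sketched in plain Python. This is a minimal illustration, not scikit-learn's implementation: the helper names (`k_fold_indices`, `cross_validate`) and the toy majority-class learner are hypothetical, and the data is made up.

```python
import random
import statistics

def k_fold_indices(n_examples, k, seed=0):
    """Shuffle the indices 0..n-1 and deal them into k roughly equal folds."""
    indices = list(range(n_examples))
    random.Random(seed).shuffle(indices)
    return [indices[i::k] for i in range(k)]

def cross_validate(X, y, fit, predict, k=10, seed=0):
    """Return one accuracy per fold: train on k-1 folds, validate on the rest."""
    scores = []
    for held_out in k_fold_indices(len(X), k, seed):
        held_out_set = set(held_out)
        train_idx = [i for i in range(len(X)) if i not in held_out_set]
        model = fit([X[i] for i in train_idx], [y[i] for i in train_idx])
        correct = sum(predict(model, X[i]) == y[i] for i in held_out)
        scores.append(correct / len(held_out))
    return scores

# Hypothetical toy learner: remember the majority class of the training labels.
def fit_majority(X_train, y_train):
    return max(set(y_train), key=y_train.count)

def predict_majority(model, x):
    return model

X = [[i] for i in range(50)]
y = [i % 5 == 0 for i in range(50)]  # 10 positives, 40 negatives

scores = cross_validate(X, y, fit_majority, predict_majority, k=10)
print(len(scores), statistics.mean(scores), statistics.stdev(scores))
```

Changing `seed` gives a different shuffle, which is how the repeated runs mentioned in the last bullet would be produced.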

Cross-validation
• A workhorse in ML, and therefore directly supported in scikit-learn (e.g. by cross_val_score in sklearn.model_selection).

Are we really that good?
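
A minimal sketch of why a high accuracy can be misleading for this task. It assumes the "is it a 5?" target is skewed so that about one label in ten is positive; that ratio and the label list below are made up for illustration. A model that never predicts 5 finds no 5s at all, yet it is right on every non-5.

```python
# Made-up skewed target: 1 positive ("is a 5") in every 10 labels.
y_true = [i % 10 == 0 for i in range(1000)]

# A "classifier" that ignores its input and always answers "not a 5".
y_pred = [False] * len(y_true)

accuracy = sum(p == t for p, t in zip(y_pred, y_true)) / len(y_true)
found_fives = sum(p and t for p, t in zip(y_pred, y_true))

print(accuracy)     # 0.9
print(found_fives)  # 0
```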

Getting predictions from CV

Compare to perfection

Precision and Recall
Precision: how many of the predicted 5s are really 5s

Recall: how many of the real 5s do we actually find
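
Written in terms of counts, with TP the true positives (real 5s predicted as 5), FP the false positives (non-5s predicted as 5), and FN the false negatives (real 5s that were missed), the two definitions above are commonly expressed as:

```latex
\text{precision} = \frac{TP}{TP + FP}, \qquad \text{recall} = \frac{TP}{TP + FN}
```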

TN, TP, FN, FP and the confusion matrix
[[5, 1],
 [2, 3]]
TN=5, FP=1, FN=2, TP=3
Rows: row 0 holds info about class 0, …
Columns: column 0 holds info about examples predicted as class 0, …
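
A minimal sketch of how such a matrix is counted, with rows indexed by the actual class and columns by the predicted class. The label lists are made up so that the result reproduces the matrix above, and `confusion_matrix_2x2` is a hypothetical helper; scikit-learn's `confusion_matrix` uses the same row and column layout.

```python
def confusion_matrix_2x2(y_true, y_pred):
    """Rows are the actual class (0, 1); columns are the predicted class (0, 1)."""
    matrix = [[0, 0], [0, 0]]
    for actual, predicted in zip(y_true, y_pred):
        matrix[actual][predicted] += 1
    return matrix

# Made-up labels chosen to reproduce the matrix on the slide.
y_true = [0, 0, 0, 0, 0, 0, 1, 1, 1, 1, 1]
y_pred = [0, 0, 0, 0, 0, 1, 0, 0, 1, 1, 1]

cm = confusion_matrix_2x2(y_true, y_pred)
(tn, fp), (fn, tp) = cm
print(cm)              # [[5, 1], [2, 3]]
print(tn, fp, fn, tp)  # 5 1 2 3
```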

F1: harmonic mean of recall & precision
[[5, 1],
 [2, 3]]
TN=5, FP=1, FN=2, TP=3
Rows: row 0 holds info about class 0, …
Columns: column 0 holds info about examples predicted as class 0, …
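
As a worked example, the counts from this matrix give the following precision, recall, and F1. This is a short sketch of the arithmetic only.

```python
tn, fp, fn, tp = 5, 1, 2, 3  # counts from the matrix above

precision = tp / (tp + fp)   # 3 / 4
recall = tp / (tp + fn)      # 3 / 5
f1 = 2 * precision * recall / (precision + recall)  # harmonic mean

print(precision, recall, round(f1, 3))  # 0.75 0.6 0.667
```

The harmonic mean (about 0.667) is slightly below the arithmetic mean (0.675), because it is pulled toward the smaller of the two values.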

Some results

Thresholds: precision/recall trade-off

Classifiers return numeric scores
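
A minimal sketch of how a threshold turns numeric scores into predictions, and how moving it trades precision against recall. The scores and labels are made up, and `precision_recall_at` is a hypothetical helper; in scikit-learn, such scores typically come from a classifier's `decision_function` or `predict_proba` method.

```python
def precision_recall_at(threshold, scores, y_true):
    """Predict positive when score > threshold, then compute precision and recall."""
    y_pred = [s > threshold for s in scores]
    tp = sum(p and t for p, t in zip(y_pred, y_true))
    fp = sum(p and not t for p, t in zip(y_pred, y_true))
    fn = sum(t and not p for p, t in zip(y_pred, y_true))
    # Convention assumed here: precision is 1.0 when nothing is predicted positive.
    precision = tp / (tp + fp) if tp + fp else 1.0
    recall = tp / (tp + fn)
    return precision, recall

scores = [-4.0, -2.5, -1.0, -0.5, 0.5, 1.0, 2.0, 3.5]
y_true = [False, False, True, False, True, False, True, True]

# Raising the threshold makes the classifier more cautious about predicting positive.
for threshold in (-3.0, 0.0, 1.5):
    print(threshold, precision_recall_at(threshold, scores, y_true))
```

On this toy data, raising the threshold from -3.0 to 1.5 increases precision from 4/7 to 1.0 while recall drops from 1.0 to 0.5.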

Precision recall curves

Recall @ precision == 0.9

Precision-recall curve

Alternative: ROC curve
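Plot the true positive rate (TPR) over the false positive rate (FPR) for all possible thresholds.

The best point is (0, 1), i.e. FPR = 0 and TPR = 1. The diagonal corresponds to a random classifier.

The area under the curve (AUC) is 1.0 for the best possible classifier and 0.5 for a random classifier.

AUC is very popular: it does not need a threshold and is often used for imbalanced data (although when the positive class is rare, the precision-recall curve can be more informative).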
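
A minimal sketch of the ROC curve and its AUC in plain Python, on made-up scores. The helpers `roc_points` and `auc` are hypothetical names; the area is computed with the trapezoidal rule over the (FPR, TPR) points, one point per distinct threshold.

```python
def roc_points(scores, y_true):
    """Return (FPR, TPR) pairs, one per threshold, sweeping from high to low."""
    n_pos = sum(y_true)
    n_neg = len(y_true) - n_pos
    points = [(0.0, 0.0)]
    for threshold in sorted(set(scores), reverse=True):
        tp = sum(s >= threshold and t for s, t in zip(scores, y_true))
        fp = sum(s >= threshold and not t for s, t in zip(scores, y_true))
        points.append((fp / n_neg, tp / n_pos))
    return points

def auc(points):
    """Area under a piecewise-linear curve, by the trapezoidal rule."""
    area = 0.0
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        area += (x1 - x0) * (y0 + y1) / 2
    return area

y_true = [False, False, True, False, True, False, True, True]
scores = [-4.0, -2.5, -1.0, -0.5, 0.5, 1.0, 2.0, 3.5]

print(auc(roc_points(scores, y_true)))                            # 0.8125
print(auc(roc_points([0, 0, 1, 1], [False, False, True, True])))  # 1.0 (perfect ranking)
print(auc(roc_points([1, 1, 1, 1], [False, False, True, True])))  # 0.5 (uninformative scores)
```

The first value equals the fraction of (positive, negative) pairs in which the positive example has the higher score: 13 of 16 on this toy data.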

Compare to Random Forest

Multi-class classification

One-vs-One for Multiclass
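
A minimal sketch of the one-vs-one idea, assuming the ten digit classes 0 to 9: one binary classifier is trained per pair of classes, and at prediction time the class that wins the most pairwise duels is chosen. The `duel` function is a hypothetical stand-in for a trained pairwise classifier, operating on a single made-up numeric feature.

```python
from collections import Counter
from itertools import combinations

classes = list(range(10))
pairs = list(combinations(classes, 2))
print(len(pairs))  # 45 = 10 * 9 / 2 pairwise classifiers

def duel(a, b, x):
    """Hypothetical pairwise classifier: picks whichever class value is closer to x."""
    return a if abs(a - x) <= abs(b - x) else b

def predict_ovo(x):
    """Run every pairwise duel and return the class with the most wins."""
    votes = Counter(duel(a, b, x) for a, b in pairs)
    return votes.most_common(1)[0][0]

print(predict_ovo(6.8))  # 7
```

Each pairwise classifier only needs the training examples of its two classes, which is the usual argument for one-vs-one with learners that scale poorly with training set size.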

Random forest for multi-class

Error analysis: confusion matrix from CV

Multilabel: more than one binary target

Multilabel: cross-validation

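“Macro”: compute F1 for each label separately, then average over all labels.

“Micro”: pool the true positives, false positives, and false negatives over all labels, then compute a single F1 from the pooled counts.

Computing F1 over the labels of each example and then averaging over all examples is a third option, which scikit-learn calls “samples”.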
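
The averaging schemes can be compared on a tiny made-up multilabel problem. This minimal sketch in plain Python follows the definitions of the `average` options of scikit-learn's `f1_score`: "macro" averages the per-label F1 values, "micro" computes one F1 from counts pooled over all labels, and "samples" averages the per-example F1 values. The helper names and the data are illustrative.

```python
def counts(true_bits, pred_bits):
    """TP, FP, FN for two equal-length sequences of 0/1 values."""
    tp = sum(1 for t, p in zip(true_bits, pred_bits) if t and p)
    fp = sum(1 for t, p in zip(true_bits, pred_bits) if p and not t)
    fn = sum(1 for t, p in zip(true_bits, pred_bits) if t and not p)
    return tp, fp, fn

def f1_from_counts(tp, fp, fn):
    """F1 = 2TP / (2TP + FP + FN); taken as 0.0 when there is nothing to count."""
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0

# Rows are examples, columns are two binary labels (made-up values).
Y_true = [[1, 0], [1, 1], [0, 1], [1, 0]]
Y_pred = [[1, 0], [1, 0], [1, 1], [1, 0]]
n_labels = len(Y_true[0])

# Macro: one F1 per label (column), then the unweighted mean over labels.
per_label = [
    f1_from_counts(*counts([row[j] for row in Y_true], [row[j] for row in Y_pred]))
    for j in range(n_labels)
]
macro = sum(per_label) / n_labels

# Micro: pool every (example, label) cell, then compute a single F1.
flat_true = [v for row in Y_true for v in row]
flat_pred = [v for row in Y_pred for v in row]
micro = f1_from_counts(*counts(flat_true, flat_pred))

# Samples: one F1 per example (row), then the mean over examples.
per_example = [f1_from_counts(*counts(t, p)) for t, p in zip(Y_true, Y_pred)]
samples = sum(per_example) / len(per_example)

print(round(macro, 3), round(micro, 3), round(samples, 3))  # 0.762 0.8 0.833
```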

MultiOutput: multiple multiclass targets
• E.g.: reconstruct an image from a corrupted version (X: the corrupted image, y: the original image)

Adding noise, train & predict
