
Classification

Prof. Gheith Abandah

Reference: Hands-On Machine Learning with Scikit-Learn and
TensorFlow by Aurélien Géron (O’Reilly). Copyright 2017 Aurélien
Géron, 978-1-491-96229-9.
Introduction
• YouTube Video: Machine Learning - Supervised Learning Classification, from Cognitive Class
  https://youtu.be/Lf2bCQIktTo
Outline
1. MNIST dataset
2. Training a binary classifier
3. Performance measures
4. Multiclass classification
5. Multilabel classification
6. Exercise

1. MNIST Dataset

• MNIST is a set of 70,000 small images of handwritten digits.
• Available from mldata.org
• Scikit-Learn provides download functions.
1.1. Get the Data

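A sketch of the download step (the slide’s own code is a screenshot and is not reproduced). The book’s first edition used fetch_mldata('MNIST original') from mldata.org; since mldata.org has shut down and fetch_mldata was removed from Scikit-Learn, this sketch uses the current fetch_openml() helper:

from sklearn.datasets import fetch_openml

# Download MNIST from openml.org (70,000 labeled digit images).
mnist = fetch_openml('mnist_784', version=1, as_frame=False)
print(mnist.keys())  # includes 'data', 'target', 'DESCR', ...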
1.2. Extract Features and Labels

• There are 70,000 images, and each image has 784 features. This is because each image is 28×28 pixels, and each feature simply represents one pixel’s intensity, from 0 (white) to 255 (black).
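A minimal sketch of extracting the features and labels (the variable names X and y follow the book’s convention; they are an assumption here):

# X holds one row per image, y holds the digit labels.
X, y = mnist["data"], mnist["target"]
print(X.shape)  # (70000, 784): 70,000 images, 28 × 28 = 784 pixel features
print(y.shape)  # (70000,)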
1.3. Examine One Image

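A sketch of displaying one digit with Matplotlib. The index 36000 is the one used in the book; with fetch_openml() the row order differs from the original download, so it is illustrative only:

import matplotlib.pyplot as plt

some_digit = X[36000]                          # pick one image (784 values)
some_digit_image = some_digit.reshape(28, 28)  # back to a 28 × 28 grid

plt.imshow(some_digit_image, cmap="binary")    # 0 = white, 255 = black
plt.axis("off")
plt.show()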
1.4. Split the Data

• The MNIST dataset is actually already split into a training set (the first 60,000 images) and a test set (the last 10,000 images).
• You need to shuffle the training set to guarantee that all cross-validation folds are similar.
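A sketch of the split-and-shuffle step described above:

import numpy as np

# MNIST is already split: first 60,000 images for training, last 10,000 for test.
X_train, X_test = X[:60000], X[60000:]
y_train, y_test = y[:60000], y[60000:]

# Shuffle the training set so all cross-validation folds are similar.
shuffle_index = np.random.permutation(60000)
X_train, y_train = X_train[shuffle_index], y_train[shuffle_index]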
Outline
1. MNIST dataset
2. Training a binary classifier
3. Performance measures
4. Multiclass classification
5. Multilabel classification
6. Exercise

2. Training a Binary Classifier
• A binary classifier can classify two classes.
• For example, a classifier for the number 5 is capable of distinguishing between two classes: 5 and not-5.
• The target vector is True for all 5s and False for all other digits.
• The model trained here is the Stochastic Gradient Descent (SGD) classifier.
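A sketch of the training step on this slide, following the book. Note that fetch_openml() returns the labels as strings, hence the comparison with '5':

from sklearn.linear_model import SGDClassifier

# Target vector: True for all 5s, False for all other digits.
y_train_5 = (y_train == '5')
y_test_5 = (y_test == '5')

# Stochastic Gradient Descent (SGD) classifier.
sgd_clf = SGDClassifier(random_state=42)
sgd_clf.fit(X_train, y_train_5)
sgd_clf.predict([some_digit])  # True if the classifier thinks the digit is a 5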
2. Training a Binary Classifier
• Note that a better classifier for this problem is the Random Forest classifier.
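A sketch of swapping in the Random Forest classifier mentioned above:

from sklearn.ensemble import RandomForestClassifier

forest_clf = RandomForestClassifier(random_state=42)
forest_clf.fit(X_train, y_train_5)
forest_clf.predict([some_digit])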
Outline
1. MNIST dataset
2. Training a binary classifier
3. Performance measures
4. Multiclass classification
5. Multilabel classification
6. Exercise

3. Performance Measures

• Accuracy: ratio of correct predictions
• Confusion matrix
• Precision and recall
• Precision/recall tradeoff
3.1. Accuracy

• Example: find the accuracy using the cross_val_score() function on three cross-validation folds (see the sketch below).
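A sketch of the cross_val_score() call (the printed scores are illustrative, not the slide’s):

from sklearn.model_selection import cross_val_score

cross_val_score(sgd_clf, X_train, y_train_5, cv=3, scoring="accuracy")
# e.g. array([0.96..., 0.95..., 0.96...]); high partly because ~90% of
# the images are not-5s, so even a trivial classifier scores ~90%.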
3.2. Confusion Matrix

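The slide’s diagram is not reproduced here. In Scikit-Learn’s convention, each row of the confusion matrix is an actual class and each column a predicted class:

                   Predicted negative      Predicted positive
Actual negative    true negatives (TN)     false positives (FP)
Actual positive    false negatives (FN)    true positives (TP)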
3.2. Confusion Matrix
• Scikit-Learn has a function for finding the confusion matrix.
• The first row is for the non-5s (the negative class):
  • 53,272 correctly classified (true negatives)
  • 1,307 wrongly classified (false positives)
• The second row is for the 5s (the positive class):
  • 1,077 wrongly classified (false negatives)
  • 4,344 correctly classified (true positives)
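A sketch of the computation behind these numbers: get clean cross-validated predictions with cross_val_predict(), then compare them with the true labels:

from sklearn.model_selection import cross_val_predict
from sklearn.metrics import confusion_matrix

y_train_pred = cross_val_predict(sgd_clf, X_train, y_train_5, cv=3)
confusion_matrix(y_train_5, y_train_pred)
# array([[53272,  1307],    <- non-5s: TN, FP (the slide's numbers)
#        [ 1077,  4344]])   <- 5s:     FN, TP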
3.3. Precision and Recall

Precision = TP / (TP + FP)        Recall = TP / (TP + FN)

• The precision and recall are smaller than the accuracy. Why?
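A sketch of computing both measures from the predictions above:

from sklearn.metrics import precision_score, recall_score

precision_score(y_train_5, y_train_pred)  # 4344 / (4344 + 1307) ≈ 0.77
recall_score(y_train_5, y_train_pred)     # 4344 / (4344 + 1077) ≈ 0.80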
3.4. Precision/Recall Tradeoff
• Increase the decision threshold to improve the precision when false positives (FP) are costly.
• Decrease the decision threshold to improve the recall when it is important not to miss positives, i.e., when false negatives (FN) are costly.
3.4. Precision/Recall Tradeoff

(figure: precision and recall versus the decision threshold)
3.4. Precision/Recall Tradeoff
• The function cross_val_predict() can return decision scores instead of predictions.
• When using a larger decision threshold, we increase the precision and decrease the recall (see the sketch below).
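A sketch of both points, following the book (the threshold value is illustrative, not from the slide):

from sklearn.model_selection import cross_val_predict
from sklearn.metrics import precision_recall_curve

# Decision scores instead of predictions:
y_scores = cross_val_predict(sgd_clf, X_train, y_train_5, cv=3,
                             method="decision_function")
precisions, recalls, thresholds = precision_recall_curve(y_train_5, y_scores)

# A larger threshold raises precision and lowers recall:
threshold = 70000
y_train_pred_high = (y_scores > threshold)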
Outline
1. MNIST dataset
2. Training a binary classifier
3. Performance measures
4. Multiclass classification
5. Multilabel classification
6. Exercise

4. Multiclass Classification
• Multiclass classifiers can distinguish between more than two classes.
• Some algorithms (such as Random Forest classifiers or Naive Bayes classifiers) are capable of handling multiple classes directly.
• Others (such as Support Vector Machine classifiers or linear classifiers) are strictly binary classifiers.
• There are two main strategies to perform multiclass classification using multiple binary classifiers.
4.1. One-versus-All (OvA) Strategy

• For example, to classify the digit images into 10 classes (from 0 to 9), train 10 binary classifiers, one for each digit (a 0-detector, a 1-detector, a 2-detector, and so on).
• Then, to classify an image, get the decision score from each classifier for that image and select the class whose classifier outputs the highest score.
4.2. One-versus-One (OvO) Strategy
• Train a binary classifier for every pair of digits.
• If there are N classes, you need N × (N – 1) / 2 classifiers. For MNIST (N = 10), you need 45 classifiers.
• To classify an image, run the image through all 45 classifiers and see which class wins the most duels.
• The main advantage of OvO is that each classifier only needs to be trained on a subset of the training set (the part containing the two classes it must distinguish).
• OvO is preferred for algorithms (such as Support Vector Machines) that scale poorly with the size of the training set.
4.3. Scikit-Learn Support of Multiclass Classification
• Scikit-Learn detects when you try to use a binary classification algorithm for a multiclass classification task, and it automatically runs OvA (except for SVM classifiers, for which it uses OvO). A sketch follows.
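A sketch of the automatic OvA behavior, plus forcing OvO, following the book:

# Fitting the binary SGD classifier on all 10 classes trains 10 detectors.
sgd_clf.fit(X_train, y_train)            # y_train, not y_train_5
sgd_clf.predict([some_digit])

some_digit_scores = sgd_clf.decision_function([some_digit])
some_digit_scores.shape                  # (1, 10): one score per class; highest wins

# Forcing one-versus-one instead:
from sklearn.multiclass import OneVsOneClassifier
from sklearn.linear_model import SGDClassifier

ovo_clf = OneVsOneClassifier(SGDClassifier(random_state=42))
ovo_clf.fit(X_train, y_train)
len(ovo_clf.estimators_)                 # 45 pairwise classifiers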
4.3. Scikit-Learn Support of Multiclass Classification
• Note that the multiclass task is harder than the binary task.
• Binary task vs. multiclass task: compare the cross-validated accuracies of the two (see the sketch below).
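A sketch of the comparison (the slide’s own accuracy numbers were code screenshots and are not reproduced here):

from sklearn.model_selection import cross_val_score

# Binary task: accuracy of the 5-detector.
cross_val_score(sgd_clf, X_train, y_train_5, cv=3, scoring="accuracy")

# Multiclass task: accuracy over all 10 digits (typically lower).
cross_val_score(sgd_clf, X_train, y_train, cv=3, scoring="accuracy")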
Outline
1. MNIST dataset
2. Training a binary classifier
3. Performance measures
4. Multiclass classification
5. Multilabel classification
6. Exercise

5. Multilabel Classification
• Multilabel classifiers output multiple classes for each instance.
• A popular algorithm that supports multilabel classification is k-Nearest Neighbors (KNeighborsClassifier), sketched below.
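A sketch of multilabel classification following the book’s example: each digit gets two labels, whether it is large (7, 8, or 9) and whether it is odd. The labels are strings here, hence the integer conversion:

import numpy as np
from sklearn.neighbors import KNeighborsClassifier

y_train_large = (y_train.astype(np.uint8) >= 7)
y_train_odd = (y_train.astype(np.uint8) % 2 == 1)
y_multilabel = np.c_[y_train_large, y_train_odd]   # two labels per instance

knn_clf = KNeighborsClassifier()
knn_clf.fit(X_train, y_multilabel)
knn_clf.predict([some_digit])   # e.g. array([[False,  True]]) for a 5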
Summary
1. MNIST dataset
2. Training a binary classifier
3. Performance measures
4. Multiclass classification
5. Multilabel classification
6. Exercise

Exercise
• Try to build a classifier for the MNIST dataset that achieves over 97% accuracy on the test set. Hint: the KNeighborsClassifier works quite well for this task; you just need to find good hyperparameter values (try a grid search on the weights and n_neighbors hyperparameters).
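A possible starting point for the exercise (the parameter grid is an assumption for you to tune, and the search is slow on the full training set):

from sklearn.model_selection import GridSearchCV
from sklearn.neighbors import KNeighborsClassifier

param_grid = [{'weights': ['uniform', 'distance'],
               'n_neighbors': [3, 4, 5]}]

knn_clf = KNeighborsClassifier()
grid_search = GridSearchCV(knn_clf, param_grid, cv=5, scoring='accuracy')
grid_search.fit(X_train, y_train)

print(grid_search.best_params_)
print(grid_search.best_score_)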
