L6 Lecture: Image Classification - Fundamentals (v4)
Fundamentals
Easy Computer Vision
Xiaoyong Wei (魏驍勇)
[email protected]
New Toy
Outline
• Classification
• Supervised learning
• K nearest neighbors (k-NN)
• Bayesian classifiers
• Support vector machines (SVM)
• Rock-Paper-Scissors
How do you group them?
Feature Space
[Figure: the examples plotted in a 2-D feature space with Hair Color on the x-axis and Skin Color on the y-axis]
Clustering is unsupervised learning, which means we (humans) don't have to tell the computer what each group looks like. It's data-driven, without using human knowledge (supervision).
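For instance, here is a minimal clustering sketch in Python (an assumed example, not from the lecture; it uses scikit-learn's KMeans and the feature values from the training-example table later in this lecture):

# Clustering (unsupervised): KMeans groups the examples without any human-given labels.
import numpy as np
from sklearn.cluster import KMeans

# Feature vectors [hair color, skin color] taken from the training-example table below
X = np.array([[2.2, 0.8], [3.2, 1.9], [3.1, 2.2], [2.4, 1.3], [3.1, 2.9]])

kmeans = KMeans(n_clusters=2, n_init=10, random_state=0).fit(X)
print(kmeans.labels_)  # a cluster index for each example, discovered from the data alone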
Sounds good?
But …
Feature Space
[Figure: the same feature space with new, unlabeled examples shown as question marks]
What if we encounter new, unseen examples?
Feature Space
[Figure: the same Hair Color / Skin Color feature space]
What if the features selected are not representative enough, or not consistent with our understanding?
We can tell the computer about our understanding of the subjects by giving labels.
Training Examples (Seen)

Hair Color (H) | Skin Color (S) | Class Label (L)
2.2 | 0.8 | ?
3.2 | 1.9 | ?
3.1 | 2.2 | ?
2.4 | 1.3 | ?
3.1 | 2.9 | ?
Classification: to predict the labels of
the testing (unseen) examples based
on the knowledge learned from the
training (seen) examples
[Diagram: labeled training examples (feature vectors with labels 1 / -1) are used to learn a model; the model then predicts the labels of the examples in the validation set]
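A minimal sketch of this seen/unseen workflow in Python (an assumed example, not code from the lecture; it uses scikit-learn, made-up data, and the k-NN classifier introduced just below): fit a model on the labeled training examples, then predict the labels of the held-out validation examples.

# Sketch of the train -> model -> validation workflow (assumed example, made-up data).
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier

# Hypothetical feature vectors with labels 1 / -1
X = np.array([[0.1, 0.2], [0.2, 0.3], [0.15, 0.25], [0.3, 0.2],
              [0.8, 0.9], [0.7, 0.8], [0.9, 0.7], [0.85, 0.95]])
y = np.array([1, 1, 1, 1, -1, -1, -1, -1])

# Split into training (seen) and validation (unseen) examples
X_train, X_val, y_train, y_val = train_test_split(X, y, test_size=0.25, random_state=0)

model = KNeighborsClassifier(n_neighbors=3).fit(X_train, y_train)  # learn from the seen examples
print(model.predict(X_val))        # predicted labels for the unseen examples
print(model.score(X_val, y_val))   # validation accuracy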
[Figure: an unseen example plotted in the Hair color / Skin color feature space; with k=1, it is assigned the label of its single nearest neighbor]
kNN Classifier
>NN: nearest neighbors
>k: number of nearest neighbors
>Idea
• k=1: assign the unseen example the label of its nearest neighbor
• k>1: assign the dominant label among those of the k nearest neighbors (see the sketch after the k=3 figure below)
[Figure: k=3 in the Hair color / Skin color feature space; among the 3 nearest neighbors, #A=2 and #W=1, so #A > #W and the unseen example is labeled A]
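A minimal NumPy sketch of this voting rule (an assumed implementation, not the lecture's code; the feature values come from the training table and the A/W label assignment is hypothetical): find the k closest training examples and return the dominant label among them.

# k-NN by hand (assumed sketch): majority vote among the k nearest training examples.
import numpy as np
from collections import Counter

def knn_predict(X_train, y_train, x, k=3):
    dists = np.linalg.norm(X_train - x, axis=1)   # Euclidean distance to every training example
    nearest = np.argsort(dists)[:k]               # indices of the k nearest neighbors
    votes = Counter(y_train[nearest])             # count the labels among those neighbors
    return votes.most_common(1)[0][0]             # the dominant label wins

# Hair/skin color features with hypothetical labels 'A' and 'W'
X_train = np.array([[2.2, 0.8], [3.2, 1.9], [3.1, 2.2], [2.4, 1.3], [3.1, 2.9]])
y_train = np.array(['A', 'A', 'W', 'A', 'W'])
print(knn_predict(X_train, y_train, np.array([2.8, 1.8]), k=3))   # 'A' (two 'A' votes vs. one 'W')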
It's straightforward. But so far, we have picked the simplest case (classes are well separated) for illustration purposes.
In a more general sense, this is what we're going to have.
[Figure: two overlapping classes, A and B, with their intersection labeled "A and B"]
Bayesian Classifiers
• Classes A and B as two sets
• P(A|x): the probability that class A is observed when seeing an x
• P(B|x): the probability that class B is observed when seeing an x
Bayesian Classifiers
• Classes A and B as two sets
• P(A|x) ∝ P(x|A)P(A), since P(A|x) = P(x|A)P(A) / P(x)
• P(B|x) ∝ P(x|B)P(B), since P(B|x) = P(x|B)P(B) / P(x)
[Figure: an example x falling in the overlap of classes A and B]
https://towardsdatascience.com/naive-bayes-classifier-81d512f50a7c
Bayesian Classifiers
• Classes A and B as two sets
• P(A|x) ∝ P(x|A)P(A)
• P(B|x) ∝ P(x|B)P(B)
• Priors by counting: P(A) = #A / (#A + #B), P(B) = #B / (#A + #B)
• Likelihoods by counting: P(x|A) = #x / #A, P(x|B) = #x / #B
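A minimal Python sketch of these counting estimates (an assumed example with made-up discrete feature values): estimate the priors and likelihoods by counting, then pick the class with the larger P(class|x) ∝ P(x|class)P(class).

# Bayes rule by counting (assumed sketch for a discrete feature).
from collections import Counter

# Hypothetical training data: (feature value x, class label)
data = [('red', 'A'), ('red', 'A'), ('blue', 'A'), ('blue', 'B'), ('blue', 'B'), ('red', 'B')]

labels = [c for _, c in data]
prior = {c: n / len(data) for c, n in Counter(labels).items()}   # P(A), P(B) = #class / #all

def likelihood(x, c):
    in_class = [xv for xv, cv in data if cv == c]
    return sum(1 for xv in in_class if xv == x) / len(in_class)  # P(x|c) = #x in c / #c

def classify(x):
    # P(c|x) is proportional to P(x|c) * P(c); P(x) is the same for both classes
    scores = {c: likelihood(x, c) * prior[c] for c in prior}
    return max(scores, key=scores.get)

print(classify('red'))   # the class with the larger posterior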
[Figure: a separating hyperplane wTx + b = 0 between the two classes]
Linear Separators
>Binary classification can be viewed as the task of
separating classes in feature space:
wTx + b = 0
wTx + b > 0
wTx + b < 0
f(x) = sign(wTx + b)
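A tiny Python sketch of this decision rule (the weights w and bias b below are assumed values for illustration):

# f(x) = sign(w^T x + b) with assumed weights (illustrative only).
import numpy as np

w = np.array([1.0, -2.0])   # hypothetical normal vector of the hyperplane
b = 0.5                     # hypothetical bias

def f(x):
    return np.sign(w @ x + b)   # +1 on one side of the hyperplane, -1 on the other

print(f(np.array([3.0, 1.0])))   #  1.0: w^T x + b = 3 - 2 + 0.5 =  1.5 > 0
print(f(np.array([0.0, 2.0])))   # -1.0: w^T x + b = -4 + 0.5   = -3.5 < 0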
Linear Separators
[Figure: a linear separator with the distance r from an example to the separating hyperplane marked]
Maximum Margin Classification
> Maximizing the margin is good according to intuition and PAC (Probably Approximately Correct) theory.
> It implies that only the support vectors matter; the other training examples can be ignored.
Soft Margin Classification
>What if the training set is not linearly separable?
>Slack variables ξi can be added to allow misclassification of difficult or noisy examples; the resulting margin is called a soft margin.
[Figure: two examples on the wrong side of the margin, each marked with its slack value ξi]
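A small Python sketch of the slack values for a given separator (an assumed illustration using the standard hinge form ξi = max(0, 1 − yi(wTxi + b)); the w, b, and data are made up):

# Slack variables for a soft margin (assumed sketch): xi_i = max(0, 1 - y_i (w^T x_i + b)).
import numpy as np

w, b = np.array([1.0, 1.0]), -1.0          # hypothetical separator
X = np.array([[1.5, 1.0], [0.2, 0.1], [0.9, 0.6]])
y = np.array([1, -1, -1])                  # the third point is on the wrong (noisy) side

margins = y * (X @ w + b)                  # y_i (w^T x_i + b), should be >= 1 when satisfied
slack = np.maximum(0, 1 - margins)         # 0 for easy points, > 0 for difficult/noisy ones
print(slack)                               # [0.  0.3 1.5]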
Linear SVMs: Overview
> The classifier is a separating hyperplane
> Most “important” training points are support vectors; they
define the hyperplane.
> Quadratic optimization algorithms can identify which training
points xi are support vectors with non-zero Lagrangian
multipliers αi.
> Both in the dual formulation of the problem and in the
solution training points appear only inside inner products:
Self-study (Math)
Find α1…αN such that
Q(α) = Σαi − ½ ΣΣ αiαjyiyjxiTxj is maximized and
(1) Σαiyi = 0
(2) 0 ≤ αi ≤ C for all αi
The solution gives the classifier f(x) = ΣαiyixiTx + b
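Once the αi are known, the decision function above needs only inner products with the training points; here is a minimal Python sketch (the αi, b, and data below are assumed for illustration, not produced by an actual QP solver):

# Evaluating the dual-form decision function f(x) = sum_i alpha_i y_i x_i^T x + b.
# The alpha values are assumed for illustration; a QP solver would normally produce them.
import numpy as np

X = np.array([[1.0, 1.0], [2.0, 2.0], [0.0, 0.5]])   # training points (support vectors)
y = np.array([1, 1, -1])
alpha = np.array([0.4, 0.1, 0.5])                    # hypothetical multipliers, sum(alpha*y) = 0
b = -1.0                                             # hypothetical bias

def f(x):
    return np.sign(np.sum(alpha * y * (X @ x)) + b)  # only inner products x_i^T x appear

print(f(np.array([2.0, 1.0])))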
Non-linear SVMs
> Datasets that are linearly separable with some noise work out great:
[Figure: 1-D examples plotted along the x-axis around 0]
Non-linear SVMs: Feature spaces
>General idea: the original feature space can
always be mapped to some higher-dimensional
feature space where the training set is separable:
Φ: x → φ(x)
The “Kernel Trick”
> The linear classifier relies on inner product between vectors K(xi,xj)=xiTxj
> If every datapoint is mapped into high-dimensional space via some
transformation Φ: x → φ(x), the inner product becomes:
K(xi,xj)= φ(xi) Tφ(xj)
> A kernel function is a function that is equivalent to an inner product in
some feature space.
> Example:
2-dimensional vectors x=[x1 x2]; let K(xi,xj)=(1 + xiTxj)2,
Need to show that K(xi,xj)= φ(xi) Tφ(xj):
K(xi,xj) = (1 + xiTxj)2 = 1 + xi12xj12 + 2xi1xj1xi2xj2 + xi22xj22 + 2xi1xj1 + 2xi2xj2
= [1, xi12, √2 xi1xi2, xi22, √2 xi1, √2 xi2]T [1, xj12, √2 xj1xj2, xj22, √2 xj1, √2 xj2]
= φ(xi)Tφ(xj), where φ(x) = [1, x12, √2 x1x2, x22, √2 x1, √2 x2]
> Thus, a kernel function implicitly maps data to a high-dimensional space
(without the need to compute each φ(x) explicitly).
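A quick numerical check of this identity in Python (an assumed sketch, not from the slides): evaluate (1 + xiTxj)2 directly in the 2-D space and compare it with φ(xi)Tφ(xj) using the φ given above.

# Verify the kernel identity K(xi, xj) = (1 + xi^T xj)^2 = phi(xi)^T phi(xj) numerically.
import numpy as np

def phi(x):
    x1, x2 = x
    # phi(x) = [1, x1^2, sqrt(2) x1 x2, x2^2, sqrt(2) x1, sqrt(2) x2]
    return np.array([1, x1**2, np.sqrt(2) * x1 * x2, x2**2, np.sqrt(2) * x1, np.sqrt(2) * x2])

xi, xj = np.array([1.0, 2.0]), np.array([3.0, -1.0])
K_direct = (1 + xi @ xj) ** 2          # kernel evaluated in the original 2-D space
K_mapped = phi(xi) @ phi(xj)           # inner product in the 6-D feature space
print(K_direct, K_mapped)              # both give the same value (4.0)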
Examples of Kernel Functions
> Linear: K(xi,xj)= xiTxj
• Mapping Φ: x → φ(x), where φ(x) is x itself
> Polynomial of power p: K(xi,xj) = (1 + xiTxj)p
• Mapping Φ: x → φ(x), where φ(x) has (d+p choose p) dimensions
> Gaussian (radial-basis function): K(xi,xj) = exp(−‖xi − xj‖2 / (2σ2))
• Mapping Φ: x → φ(x), where φ(x) is infinite-dimensional:
every point is mapped to a function (a Gaussian);
combination of functions for support vectors is the
separator.
> The higher-dimensional space still has intrinsic dimensionality d (the mapping is not onto), but linear separators in it correspond to non-linear separators in the original space.
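In practice these kernels are usually chosen through a library; here is a minimal usage sketch with scikit-learn's SVC (an assumed example with made-up data, not code from the lecture):

# Non-linear SVM via the kernel trick (assumed example with scikit-learn).
import numpy as np
from sklearn.svm import SVC

# Hypothetical 2-D data that is not linearly separable (inner vs. outer points)
X = np.array([[0.1, 0.0], [0.0, 0.2], [-0.1, 0.1], [0.2, -0.1],
              [2.0, 0.0], [0.0, 2.0], [-2.0, 0.0], [0.0, -2.0]])
y = np.array([1, 1, 1, 1, -1, -1, -1, -1])

clf = SVC(kernel='rbf', C=1.0, gamma='scale').fit(X, y)   # Gaussian (RBF) kernel
print(clf.predict([[0.05, 0.05], [1.8, 0.3]]))            # an inner point vs. an outer point
print(clf.support_vectors_.shape)                         # only the support vectors define the boundary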
Classification – Supervised Learning
[Diagram: labeled training examples (feature vectors with labels 1 / -1) are used to learn a model; the model then predicts the labels of the examples in the validation set]
Thank You!