L7 Lecture Image - classification.DNN v4
Fei-Fei Li, Ranjay Krishna, Danfei Xu, Image Classification: A Core Task in Computer Vision
Challenges:
Illumination
Occlusions
Background Clutter
Intra-class Variations
Hand Gesture Recognition
ImageNet
ImageNet: 12 subtrees with 5247 synsets and 3.2 million images in total
J. Deng, W. Dong, R. Socher, L. Li, Kai Li and Li Fei-Fei, "ImageNet: A large-scale hierarchical image database," 2009 IEEE
Conference on Computer Vision and Pattern Recognition, 2009, pp. 248-255, doi: 10.1109/CVPR.2009.5206848.
Deep Learning is a popular solution to address these challenges.
Apple Space
To find the best line dividing the two groups of apples, we need to find the best parameters a and b.

The model:
z = a*x - y + b
Outputs 1 if z > 0
Outputs -1 if z <= 0
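The slide's model can be sketched in a few lines; the line parameters below (a = 2, b = 1) and the sample points are hypothetical, chosen only to show the two output cases:

```python
def classify(x, y, a, b):
    """The slide's linear model: z = a*x - y + b, thresholded at zero."""
    z = a * x - y + b
    return 1 if z > 0 else -1

# Hypothetical dividing line y = 2x + 1 (a = 2, b = 1):
print(classify(1.0, 0.0, a=2.0, b=1.0))  # z = 3 > 0   -> 1
print(classify(1.0, 5.0, a=2.0, b=1.0))  # z = -2 <= 0 -> -1
```

Points on either side of the line get opposite labels, which is exactly the two-group split in Apple Space.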
Initialization
Without knowing which line is the best at the beginning, we can pick a random one by setting a and b to random numbers a' and b'.

The model:
z0 = a'*x - y + b'
How can we evaluate how good the model (a' and b') is?
With a loss function L that sums the error over all training samples i = 1, ..., N.
With the "goodness" evaluated, we can update a' and b' by replacing them with better ones:

a' = a' − ∂L/∂a'
b' = b' − ∂L/∂b'
Gradient Descent
We can update a' and b' by pushing the gradients towards zero!

a' = a' − δ·∂L/∂a'
b' = b' − δ·∂L/∂b'

δ is the Learning Rate.

[Figure: the updated dividing line in Apple Space]
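The update rule above can be sketched end to end. The slides leave the loss L unspecified, so this sketch assumes a mean-squared-error loss of z = a*x − y + b against points lying on a hypothetical target line y = 2x + 1:

```python
# Gradient Descent sketch for the slide's update rule:
#   a' <- a' - delta * dL/da',   b' <- b' - delta * dL/db'
import random

random.seed(0)
pts = [(x, 2 * x + 1) for x in [0.0, 0.5, 1.0, 1.5, 2.0]]  # points on y = 2x + 1

a, b = random.random(), random.random()  # random initialization a', b'
delta = 0.05                             # learning rate

for _ in range(2000):
    # Gradients of L = (1/N) * sum_i (a*x_i - y_i + b)^2
    ga = sum(2 * (a * x - y + b) * x for x, y in pts) / len(pts)
    gb = sum(2 * (a * x - y + b) for x, y in pts) / len(pts)
    a -= delta * ga
    b -= delta * gb

print(round(a, 2), round(b, 2))  # converges near a = 2, b = 1
```

The assumed squared loss is only one choice; any differentiable loss would plug into the same update rule.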
Machine learning is a process to find the best set of parameters that fit a model/hypothesis. The learning is usually conducted by updating the initial parameters with a learning rate towards the optimum of a loss function. Gradient Descent is one of the most popular updating strategies.
Let’s implement the learning using
neural networks.
Neural Network Version of the Model

Inputs x and y (with weights a and −1) and a constant input 1 (with weight b) feed a single neuron:
z = a*x − y + b
Outputs 1 if z > 0
Outputs −1 if z <= 0
Neural Network Version of the Model

Hidden neurons z1, z2, z3, each computed from x and y, feed an output neuron d:
Outputs 1 if d > 0
Outputs −1 if d <= 0

[Figure: the new decision boundary in Apple Space]
How is the learning conducted with
more layers and weights?
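A forward pass of the widened network can be sketched as follows; all weights below are hypothetical, chosen only to show three linear neurons combining into one decision:

```python
# Forward pass: three hidden neurons z1, z2, z3 computed from (x, y),
# summed by weights into d, then thresholded at zero.
def step(v):
    return 1 if v > 0 else -1

def forward(x, y, W_hidden, w_out):
    # W_hidden holds one (a, c, b) triple per hidden neuron: z_i = a*x + c*y + b
    z = [a * x + c * y + b for a, c, b in W_hidden]
    d = sum(w * zi for w, zi in zip(w_out, z))
    return step(d)

# Three hypothetical dividing lines and a summing output layer:
W_hidden = [(2.0, -1.0, 1.0), (-1.0, -1.0, 4.0), (1.0, 1.0, -1.0)]
w_out = [1.0, 1.0, 1.0]
print(forward(1.0, 1.0, W_hidden, w_out))
```

With several hidden neurons, the decision boundary is no longer a single line, which is why the slide shows a new, bent boundary.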
Gradient Descent on Neural Networks

[Figure: network x, y → z1, z2, z3 → d, with a judge for the Loss]
Sobel Filter
Implement with Neural Networks

Image (5×5):
1 1 1 0 0
0 1 1 1 0
0 0 1 1 1
0 0 1 1 0
0 1 1 0 0

Convolutional Filter (3×3):
1 0 1
0 1 0
1 0 1

Image ∗ Filter = Convolved Feature (first entry: 4)

Input Layer → Convolutional Layer
Neurons are not fully connected, which results in Local Receptive Fields.
Weights of the filters can be learned by Backpropagation.
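The slide's convolution example can be computed directly (as cross-correlation with valid padding and stride 1, the usual CNN convention), reproducing the first output value of 4:

```python
# The 5x5 binary image and 3x3 filter from the slide.
image = [
    [1, 1, 1, 0, 0],
    [0, 1, 1, 1, 0],
    [0, 0, 1, 1, 1],
    [0, 0, 1, 1, 0],
    [0, 1, 1, 0, 0],
]
kernel = [
    [1, 0, 1],
    [0, 1, 0],
    [1, 0, 1],
]

def conv2d(img, ker):
    """Valid cross-correlation, stride 1: slide the kernel and sum products."""
    kh, kw = len(ker), len(ker[0])
    oh, ow = len(img) - kh + 1, len(img[0]) - kw + 1
    return [
        [
            sum(img[i + u][j + v] * ker[u][v] for u in range(kh) for v in range(kw))
            for j in range(ow)
        ]
        for i in range(oh)
    ]

feature = conv2d(image, kernel)
print(feature[0][0])  # -> 4, the value shown on the slide
```

Each output neuron only sees a 3×3 patch of the input, which is the Local Receptive Field, and the nine kernel weights are shared across all positions and learned by backpropagation.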
Pooling

[Figure: 2×2 Max Pooling and Mean Pooling examples]
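Both pooling variants can be sketched with one helper; the 4×4 input below is hypothetical, since the slide's own numbers did not survive extraction:

```python
# 2x2 pooling with stride 2: each output value summarizes one 2x2 window.
def pool2x2(img, reduce_fn):
    return [
        [
            reduce_fn([img[i][j], img[i][j + 1], img[i + 1][j], img[i + 1][j + 1]])
            for j in range(0, len(img[0]), 2)
        ]
        for i in range(0, len(img), 2)
    ]

x = [
    [0, 1, 2, 3],
    [7, 6, 5, 4],
    [8, 9, 10, 11],
    [15, 14, 13, 12],
]
print(pool2x2(x, max))                        # Max Pooling
print(pool2x2(x, lambda w: sum(w) / len(w)))  # Mean Pooling
```

Max Pooling keeps the strongest activation in each window; Mean Pooling averages them. Either way the feature map shrinks by half in each dimension.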
Activation

ReLU(x) = max(0, x)
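The formula translates directly to code; the sample inputs are arbitrary:

```python
# ReLU(x) = max(0, x): negative activations are zeroed, positives pass through.
def relu(x):
    return max(0.0, x)

print([relu(v) for v in [-2.0, -0.5, 0.0, 3.0]])  # -> [0.0, 0.0, 0.0, 3.0]
```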
Convolutional Networks
https://discuss.boardinfinity.com/t/what-do-you-mean-by-convolutional-neural-network/8533
AlexNet by Alex Krizhevsky, Ilya Sutskever, and Geoff Hinton
VGG16 by Karen Simonyan, Andrew Zisserman @ Oxford
ResNet by Kaiming He et al.
H(x) = F(x) + x

∂L/∂x = (∂L/∂H(x)) · (∂H(x)/∂x)

∂H(x)/∂x = ∂(F(x) + x)/∂x = ∂F(x)/∂x + 1
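The derivation can be checked numerically: with H(x) = F(x) + x, the gradient through the block is ∂F/∂x + 1, so the "+1" from the skip connection keeps it from vanishing. F below is a hypothetical smooth stand-in for the residual branch:

```python
import math

def F(x):
    return math.tanh(x)  # stand-in for the residual branch F(x)

def H(x):
    return F(x) + x      # skip connection adds the identity

def numgrad(f, x, eps=1e-6):
    # Central-difference numerical derivative.
    return (f(x + eps) - f(x - eps)) / (2 * eps)

x = 0.7
lhs = numgrad(H, x)          # dH/dx measured directly
rhs = numgrad(F, x) + 1.0    # dF/dx + 1 from the derivation
print(abs(lhs - rhs) < 1e-6)
```

Even if ∂F/∂x shrinks towards zero in a deep stack, ∂H/∂x stays near 1, which is why residual connections ease training of very deep networks.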
Thank You!