Fundamentals of Deep Learning
Notes adapted from Dr. César Beltrán (PUCP) and Dr. Ivan Serina (UNIBS)
Review
Scalars
● A scalar has zero dimensions; vectors, matrices, and higher-order tensors have one or more dimensions.
Review Scalar Derivative
Gradients
Chain Rule
Gradient Descent
Approximate Optimization
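Gradient descent approximates the optimum by repeatedly stepping against the gradient: w ← w − η·∇f(w). A minimal sketch on f(w) = w², whose derivative is 2w (the learning rate and starting point are illustrative assumptions, not from the slides):

```python
def gradient_descent(lr=0.1, steps=100, w=5.0):
    # minimize f(w) = w**2 by following the negative gradient
    for _ in range(steps):
        grad = 2 * w       # analytic derivative of w**2
        w = w - lr * grad  # gradient-descent update
    return w

print(gradient_descent())  # converges toward the minimum at w = 0
```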
History Review
Mark I Perceptron
Frank Rosenblatt ~1958
Mark I Perceptron
The first page of Rosenblatt's article, "The Design of an Intelligent Automaton," in Research Trends, a Cornell Aeronautical Laboratory publication, Summer 1958.
https://www.youtube.com/watch?v=IEFRtz68m-8
Neocognitron: a self-organizing neural network model for a mechanism of pattern recognition unaffected by shift in position.
Fukushima, K., 1980
https://www.youtube.com/watch?v=Qil4kmvm2Sw
Learning representations by back-propagating errors
Rumelhart et al., 1986
Sigmoid unit
Cost Function
Gradient Descent
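The sigmoid unit squashes its input into (0, 1), and its derivative σ'(x) = σ(x)(1 − σ(x)) is what backpropagation multiplies through the chain rule. A minimal sketch:

```python
import math

def sigmoid(x):
    # sigmoid activation: squashes x into (0, 1)
    return 1.0 / (1.0 + math.exp(-x))

def sigmoid_grad(x):
    # derivative used by backpropagation: sigma(x) * (1 - sigma(x))
    s = sigmoid(x)
    return s * (1 - s)

print(sigmoid(0.0))       # 0.5
print(sigmoid_grad(0.0))  # 0.25, the derivative's maximum
```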
[Krizhevsky 2012]
Detection Segmentation
[Faster R-CNN: Ren, He, Girshick, Sun 2015] [Farabet et al., 2012]
Convolutional Neural Networks (CNN)
● The main task of a CNN architecture is feature extraction through 2D or 3D convolution operations.
Lenet
What is a convolution?
Kernel:
1 0 1
0 1 0
1 0 1
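Applying that 3x3 kernel means sliding it over the image and taking a weighted sum at each position. A pure-Python sketch of "valid" 2D cross-correlation, the operation CNN layers actually compute (the 4x4 example image is an illustrative assumption):

```python
kernel = [[1, 0, 1],
          [0, 1, 0],
          [1, 0, 1]]

def conv2d(image, kernel):
    # "valid" convolution: no padding, stride 1
    kh, kw = len(kernel), len(kernel[0])
    oh = len(image) - kh + 1   # output height
    ow = len(image[0]) - kw + 1  # output width
    out = [[0] * ow for _ in range(oh)]
    for i in range(oh):
        for j in range(ow):
            out[i][j] = sum(image[i + a][j + b] * kernel[a][b]
                            for a in range(kh) for b in range(kw))
    return out

image = [[1, 1, 1, 0],
         [0, 1, 1, 1],
         [0, 0, 1, 1],
         [0, 0, 1, 1]]
print(conv2d(image, kernel))  # 4x4 input -> 2x2 output map
```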
Convolution Layer
A 32x32x3 image: 32 (height) x 32 (width) x 3 (depth).
https://setosa.io/ev/image-kernels/
Convolution Layer
Convolving the 32x32x3 image with a 5x5x3 filter produces a 28x28x1 activation map.
Convolution Layer: a second filter
Convolving the same 32x32x3 image with a second 5x5x3 filter yields a second 28x28x1 activation map.
Convolution Layer
Stacking the activation maps from six 5x5x3 filters turns the 32x32x3 input into a 28x28x6 output volume.
Convolution Layer
● Kernel size = 5
● # kernels = 6
● padding = 0
The 32x32x3 input becomes a 28x28x6 output.
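These sizes follow from the standard output-size formula O = (W − F + 2P)/S + 1, with input size W, kernel size F, padding P, and stride S. A quick check (the formula is standard; the helper name is mine):

```python
def conv_output_size(w, f, p=0, s=1):
    # O = (W - F + 2P) / S + 1
    return (w - f + 2 * p) // s + 1

print(conv_output_size(32, 5))      # 28: the 32x32 input with a 5x5 kernel
print(conv_output_size(7, 3))       # 5: a 7x7 input with a 3x3 kernel
print(conv_output_size(7, 3, p=1))  # 7: padding 1 preserves the size
```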
7x7 input, 3x3 filter
Sliding the 3x3 filter across the 7x7 input one step at a time (stride 1) visits 5 positions along each axis, giving a 5x5 output.
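These slides step a 3x3 filter across the 7x7 input. The number of filter positions per axis depends on the stride, and a stride that does not divide (W − F) evenly does not fit at all. A small sketch (the helper is an assumption, not from the slides):

```python
def positions(w, f, s):
    # number of valid filter positions along one axis,
    # or None if the stride does not tile the input evenly
    if (w - f) % s != 0:
        return None
    return (w - f) // s + 1

print(positions(7, 3, 1))  # 5
print(positions(7, 3, 2))  # 3
print(positions(7, 3, 3))  # None: stride 3 does not fit a 7x7 input
```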
Padding
Zero-padding the 7x7 input with a one-pixel border of zeros (padding = 1) and convolving with the 3x3 filter gives a 7x7 output: the spatial size is preserved.
https://ezyang.github.io/convolution-visualizer/index.html
Pooling layer
Max Pooling
Input (2x4): [[3, 2, 1, 0], [1, 2, 3, 4]]; 2x2 max pooling (stride 2) → [3, 4]
Avg Pooling
Input (2x4): [[3, 2, 1, 0], [1, 2, 3, 4]]; 2x2 average pooling (stride 2) → [2, 2]
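The max- and average-pooling examples above can be reproduced with a small pure-Python sketch (non-overlapping windows, i.e. stride equal to the window size):

```python
def pool(rows, size, op):
    # apply `op` over non-overlapping size x size windows (stride = size)
    out = []
    for i in range(0, len(rows), size):
        out_row = []
        for j in range(0, len(rows[0]), size):
            window = [rows[a][b]
                      for a in range(i, i + size)
                      for b in range(j, j + size)]
            out_row.append(op(window))
        out.append(out_row)
    return out

x = [[3, 2, 1, 0],
     [1, 2, 3, 4]]
print(pool(x, 2, max))                        # [[3, 4]]
print(pool(x, 2, lambda w: sum(w) / len(w)))  # [[2.0, 2.0]]
```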
Activation Function
Fully Connected Layer
The 32x32x3 input volume (32 x 32 x 3 = 3072 values) is flattened into a 3072-dimensional vector and fed to the fully connected layer.
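The flattening arithmetic can be checked directly; a nested 32x32x3 volume unrolls to exactly 3072 inputs for the dense layer:

```python
h, w, d = 32, 32, 3
flat = h * w * d
print(flat)  # 3072 inputs to the fully connected layer

# equivalently, flatten a nested height x width x depth volume
volume = [[[0.0] * d for _ in range(w)] for _ in range(h)]
vector = [v for row in volume for pixel in row for v in pixel]
print(len(vector))  # 3072
```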
Keras code

from tensorflow.keras import Sequential
from tensorflow.keras.layers import Dense, Conv2D, MaxPool2D, Flatten

model = Sequential([
    Conv2D(16, 3, activation='relu', input_shape=(28, 28, 1)),  # 16 3x3 kernels
    MaxPool2D(),
    Conv2D(32, 3, activation='relu'),
    MaxPool2D(),
    Flatten(),                       # flatten to a vector for the dense layer
    Dense(10, activation='softmax')  # 10-class output
])
Well-known architectures
LeNet-5
[LeCun et al., 1998]
best model: 7.3% top-5 error
GoogLeNet
● Inception module
● spatial dimension: only 56x56!
ResNet [He et al., 2015]
- Batch Normalization after every CONV layer
- Xavier/2 initialization from He et al.
- SGD + Momentum (0.9)
- Learning rate: 0.1, divided by 10 when validation error plateaus
- Mini-batch size 256
- Weight decay of 1e-4
- No dropout used
ResNet [He et al., 2015]
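The core ResNet idea is the identity shortcut: a block outputs F(x) + x, so its layers learn a residual rather than the full mapping. A minimal pure-Python sketch (the block function and values are illustrative assumptions):

```python
def relu(v):
    return [max(0.0, x) for x in v]

def residual_block(x, f):
    # f(x) is the block's learned transformation;
    # the shortcut adds the input x back before the activation
    fx = f(x)
    return relu([a + b for a, b in zip(fx, x)])

# if the residual branch outputs all zeros, the block reduces to the
# identity (plus ReLU) -- this is what makes very deep nets trainable
identity_out = residual_block([1.0, 2.0, 3.0], lambda v: [0.0] * len(v))
print(identity_out)  # [1.0, 2.0, 3.0]
```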
YOLO [Redmon et al., 2016]
SqueezeNet
[Iandola et al., 2017]
Thank you!