NN 06
Agenda
• Deep Learning
• Convolution
• Stride
• Padding
Deep Learning
Deep Learning, cont.
Deep learning is a set of algorithms that learn to represent the data. The most
popular ones are:
• Convolutional Neural Networks
• Deep Belief Networks
• Deep Auto-Encoders
• Recurrent Neural Networks (LSTM)
Fat + Short vs. Thin + Tall
With the same number of parameters, which one is better?
As a rule of thumb, deeper models perform better than shallow models. The
catch is that the deeper you go, the more data you need to avoid over-fitting.
[Figure: a shallow (fat + short) network and a deep (thin + tall) network, each taking inputs x1, x2, ..., xN]
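To make "the same number of parameters" concrete, here is a small sketch that counts the weights and biases of two fully connected networks. The layer sizes and the helper name dense_param_count are hypothetical, picked only so that the two totals land close to each other; they are not taken from the slide.

```python
def dense_param_count(layer_sizes):
    """Count weights plus biases of a fully connected network."""
    return sum(n_in * n_out + n_out
               for n_in, n_out in zip(layer_sizes, layer_sizes[1:]))

# Hypothetical layer sizes (assumption, not from the slide).
shallow = [100, 400, 10]                 # fat + short: one wide hidden layer
deep = [100, 104, 104, 104, 104, 10]     # thin + tall: four narrow hidden layers

shallow_params = dense_param_count(shallow)
deep_params = dense_param_count(deep)
print("shallow:", shallow_params)
print("deep:   ", deep_params)
```

Both networks end up with roughly 44 thousand parameters, so the comparison is about how the parameters are arranged, not how many there are.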
Convolution layer
Pooling Layer
Dropout Layer
Some guys from Deep Learning
Convolution
Convolution
• Convolution is a mathematical operation that integrates the product of
two functions (signals), with one of the signals flipped. For example, we
can convolve two signals f(t) and g(t).
• Convolution is commutative: conv(a,b)==conv(b,a)
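A minimal sketch of the discrete counterpart of this definition, assuming finite sequences in place of continuous signals (the integral becomes a sum). The function name conv1d_full and the sample values are illustrative assumptions, not from the slides.

```python
def conv1d_full(a, b):
    """Full discrete convolution: out[n] = sum over k of a[k] * b[n - k]."""
    out = [0] * (len(a) + len(b) - 1)
    for i, a_i in enumerate(a):
        for j, b_j in enumerate(b):
            out[i + j] += a_i * b_j
    return out

# Hypothetical sample signals.
f = [1, 2, 3]
g = [0, 1, 2]

fg = conv1d_full(f, g)
gf = conv1d_full(g, f)
print(fg)
print(gf)
```

Swapping the arguments gives the same sequence, which is the commutativity property stated above.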
Application of convolutions
In signal processing, convolution is used for the following cases:
• Filter signals (1D audio, 2D image processing)
• Check how much a signal is correlated to another
• Find patterns in signals
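As one illustration of the "find patterns" use case, the sketch below slides a short pattern over a signal without flipping it (cross-correlation) and reports where the match score is highest. The signal, the pattern, and the function name match_scores are made up for this example.

```python
def match_scores(signal, pattern):
    """Slide the pattern over the signal (no flip) and score each position."""
    n = len(signal) - len(pattern) + 1
    return [sum(signal[i + j] * pattern[j] for j in range(len(pattern)))
            for i in range(n)]

# Hypothetical data: the pattern occurs in the signal starting at index 2.
signal = [0, 0, 1, 3, 1, 0, 0, 2, 0]
pattern = [1, 3, 1]

scores = match_scores(signal, pattern)
best = scores.index(max(scores))
print(scores)
print("best match starts at index", best)
```

The highest score marks the position where the signal is most correlated with the pattern.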
Example
• Convolve two signals x = (0,1,2,3,4) and w = (1,-1,2).
1- First, flip W horizontally (or rotate it 180 degrees), giving (2,-1,1).
2- Next, slide the flipped W over the input X.
• Observe that on steps 3, 4, and 5 the flipped window is completely inside the input
signal.
• Where the flipped window is not fully inside the input signal (X), we can treat the
missing values as zero, or calculate only what can be calculated, e.g. on step 1 we
multiply 1 by zero, and the rest is simply ignored.
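The two steps can be sketched in plain Python as follows. Treating values outside the input as zero is one of the two options mentioned above; the variable names are illustrative.

```python
x = [0, 1, 2, 3, 4]
w = [1, -1, 2]

# Step 1: flip the kernel.
w_flipped = w[::-1]  # [2, -1, 1]

# Step 2: slide the flipped kernel over the zero-padded input.
pad = len(w) - 1
x_padded = [0] * pad + x + [0] * pad
full = []
for start in range(len(x) + len(w) - 1):
    window = x_padded[start:start + len(w)]
    full.append(sum(a * b for a, b in zip(window, w_flipped)))

# Steps 3, 4, 5: positions where the window lies fully inside the input.
valid = full[pad:len(full) - pad]

print(full)   # all 7 steps
print(valid)  # steps 3 to 5 only
```

The full result has len(x) + len(w) - 1 = 7 entries, one per step; the three middle entries correspond to steps 3, 4, and 5.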
2D Convolution
• 2D convolutions are used as image filters.
2D Convolution
Consider a 5 x 5 image and a 3 x 3 filter.
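A rough sketch of a 2D convolution for this 5 x 5 / 3 x 3 setting, keeping only positions where the filter fits entirely inside the image. The pixel and filter values are hypothetical (the slide's own values did not survive), and conv2d_valid is an illustrative name. The filter is flipped to follow the definition of convolution given earlier; note that many deep learning libraries skip the flip and compute cross-correlation instead.

```python
def conv2d_valid(image, kernel):
    """2D convolution over positions where the kernel fits inside the image."""
    k = len(kernel)
    # Rotate the kernel by 180 degrees (flip rows and columns).
    flipped = [row[::-1] for row in kernel[::-1]]
    out_h = len(image) - k + 1
    out_w = len(image[0]) - k + 1
    return [[sum(image[r + i][c + j] * flipped[i][j]
                 for i in range(k) for j in range(k))
             for c in range(out_w)]
            for r in range(out_h)]

# Hypothetical 5 x 5 image and 3 x 3 filter.
image = [
    [1, 1, 1, 0, 0],
    [0, 1, 1, 1, 0],
    [0, 0, 1, 1, 1],
    [0, 0, 1, 1, 0],
    [0, 1, 1, 0, 0],
]
kernel = [
    [1, 0, 1],
    [0, 1, 0],
    [1, 0, 1],
]

feature_map = conv2d_valid(image, kernel)
for row in feature_map:
    print(row)
```

With no padding and a stride of 1, the 5 x 5 input shrinks to a 3 x 3 output.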
Example
Convolution and stride
• We have a 5x5 input convolved with a 3x3 filter (k=3).
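To see what the stride changes, the sketch below lists the positions along one axis where the 3x3 filter can start inside the 5x5 input, assuming no padding. The helper name window_starts is an assumption made for this example.

```python
def window_starts(input_size, kernel_size, stride):
    """Start positions along one axis where the kernel fits inside the input."""
    return list(range(0, input_size - kernel_size + 1, stride))

starts_s1 = window_starts(5, 3, stride=1)
starts_s2 = window_starts(5, 3, stride=2)

print(starts_s1, "->", len(starts_s1), "x", len(starts_s1), "output")
print(starts_s2, "->", len(starts_s2), "x", len(starts_s2), "output")
```

A stride of 1 visits every position and gives a 3x3 output, while a stride of 2 skips every other position and gives a 2x2 output.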
• Example:
5x5 (WxH) input, with a conv layer with the following parameters: Stride=1, Pad=1,
F=3 (3x3 kernel).
The output size: ((W - F + 2*Pad)/Stride) + 1 = ((5 - 3 + 2*1)/1) + 1 = 5
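The same arithmetic as a small helper, for one spatial dimension. The function name is illustrative, and rounding down when the stride does not divide evenly is an assumption of this sketch rather than something stated on the slide.

```python
def conv_output_size(input_size, kernel_size, pad, stride):
    """Output size along one axis: (W - F + 2*Pad) / Stride + 1."""
    # Assumption: round down when the stride does not divide evenly.
    return (input_size - kernel_size + 2 * pad) // stride + 1

same_size = conv_output_size(5, 3, pad=1, stride=1)   # the slide's example
no_padding = conv_output_size(5, 3, pad=0, stride=1)
strided = conv_output_size(5, 3, pad=0, stride=2)
print(same_size, no_padding, strided)
```

With Pad=1 and Stride=1 a 3x3 kernel preserves the 5x5 input size, which is why this padding is a common choice.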
Convolutional Neural Networks (CNN)
Large Scale Object Recognition Challenge
CNN
• A CNN is composed of layers that filter (convolve) the inputs to extract useful
information.
• It has two main kinds of layers: convolution layers and pooling layers.
• A convolution layer has a set of filters. Its output is a set of feature maps,
each one obtained by convolving the image with one filter.
• CNNs work especially well with images.
• Common architecture for a CNN:
[CONV->ReLU->Pool->CONV->ReLU->Pool->FC->Softmax_loss(during train)]
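The pipeline can be sketched end to end in plain Python. This is a toy forward pass only, with a single CONV->ReLU->Pool block followed by FC->Softmax; a second block would repeat the same three calls. All values (image, filter, FC weights) and all function names are hypothetical, nothing is trained, and the convolution here does not flip the filter, which is an assumption that matches what CNN libraries commonly do.

```python
import math

def conv2d(image, kernel):
    """Valid cross-correlation (no flip) of a square kernel over the image."""
    k = len(kernel)
    return [[sum(image[r + i][c + j] * kernel[i][j]
                 for i in range(k) for j in range(k))
             for c in range(len(image[0]) - k + 1)]
            for r in range(len(image) - k + 1)]

def relu(fmap):
    """Element-wise max(z, 0)."""
    return [[max(v, 0) for v in row] for row in fmap]

def max_pool(fmap, size=2):
    """Non-overlapping max pooling with a size x size window."""
    return [[max(fmap[r + i][c + j] for i in range(size) for j in range(size))
             for c in range(0, len(fmap[0]) - size + 1, size)]
            for r in range(0, len(fmap) - size + 1, size)]

def fully_connected(vec, weights, biases):
    """One dense layer: weights * vec + biases."""
    return [sum(w * v for w, v in zip(row, vec)) + b
            for row, b in zip(weights, biases)]

def softmax(scores):
    """Turn scores into probabilities that sum to 1."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

# Hypothetical 6x6 image with a vertical edge, and a vertical-edge filter.
image = [[0, 0, 0, 1, 1, 1] for _ in range(6)]
edge_filter = [[-1, 0, 1] for _ in range(3)]

conv_out = conv2d(image, edge_filter)   # 4x4 feature map
pooled = max_pool(relu(conv_out))       # 2x2 after pooling
flat = [v for row in pooled for v in row]

# Hypothetical FC weights for two classes ("edge" and "no edge").
fc_weights = [[0.1, 0.1, 0.1, 0.1], [-0.1, -0.1, -0.1, -0.1]]
fc_biases = [0.0, 0.0]
probs = softmax(fully_connected(flat, fc_weights, fc_biases))
print(pooled)
print(probs)
```

During training, the softmax output would feed a loss function; that part is left out of this sketch.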
Main actor: the convolution layer
• The most important operation in a convolutional neural network is the
convolution layer.
• Each filter looks for one particular pattern across the whole image, so a single
filter can detect that pattern wherever it appears.
ReLU
• It's common to apply a linear rectification nonlinearity: y_i = max(z_i, 0)
Why might we do this?
• Convolution is a linear operation.
• Therefore, we need a nonlinearity; otherwise 2 convolution layers would be
no more powerful than 1.
• ReLU is used after every convolution operation.
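The claim that two convolution layers without a nonlinearity are no more powerful than one can be checked on a small 1D case: convolving with w1 and then w2 gives the same result as a single convolution with the combined kernel conv(w1, w2). The signal, the kernels, and the function names below are illustrative assumptions.

```python
def conv1d_full(a, b):
    """Full discrete 1D convolution."""
    out = [0] * (len(a) + len(b) - 1)
    for i, a_i in enumerate(a):
        for j, b_j in enumerate(b):
            out[i + j] += a_i * b_j
    return out

def relu(values):
    """y_i = max(z_i, 0), applied element-wise."""
    return [max(z, 0) for z in values]

# Hypothetical input and kernels.
x = [1, -2, 3, -1]
w1 = [1, -1]
w2 = [2, 1]

two_linear = conv1d_full(conv1d_full(x, w1), w2)        # two stacked layers
one_linear = conv1d_full(x, conv1d_full(w1, w2))        # one combined layer
with_relu = conv1d_full(relu(conv1d_full(x, w1)), w2)   # ReLU in between

print(two_linear)
print(one_linear)
print(with_relu)
```

The first two results are identical, so the stacked linear layers collapse into one. With ReLU in between, the result no longer matches any single convolution of x with the combined kernel.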
ReLU
• Other functions are also used to introduce nonlinearity, for example the
saturating hyperbolic tangent (tanh) and the sigmoid function.
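For comparison, the three activations can be evaluated side by side with the standard library. The sample inputs are arbitrary; the point is that tanh and sigmoid flatten out (saturate) for large inputs, while ReLU keeps growing for positive ones.

```python
import math

def relu(z):
    return max(z, 0.0)

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

# Arbitrary sample inputs.
for z in (-5.0, 0.0, 5.0):
    print(z, relu(z), round(math.tanh(z), 4), round(sigmoid(z), 4))
```

tanh maps its input into (-1, 1) and sigmoid into (0, 1).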
Pooling Layers
Summary
• Deep Learning: Shallow, Deep layer
• Convolution: 1D Convolution, 2D Convolution, Stride, Padding
• Convolutional Neural Networks (CNN): Convolution layer, ReLU, Pooling Layer
Next: More details for Convolutional Neural Networks (CNN)
Thanks