Lecture2 CNN Network Design
• Associate Professor
• Electrical and Computer Engineering
• Newark College of Engineering
• New Jersey Institute of Technology
• https://tao-han-njit.netlify.app
Slides are designed based on Prof. Hung-yi Lee’s Machine Learning courses at National Taiwan University
Network Architecture Design: Image Classification
An input image is a 3-D tensor: 100 x 100 pixels with 3 channels (100 x 100 x 3).
Fully Connected Network
[Figure: the 100 x 100 x 3 image is flattened into a vector 𝑥₁, …, 𝑥_N of 100 x 100 x 3 values; connecting it to a first layer of 1000 neurons already needs 3 x 10⁷ weights.]
So many parameters make the network flexible but easy to overfit.

Observation 1
• Identifying a critical pattern (e.g., a beak) does not require seeing the whole image; basic detectors for small patterns can be combined into advanced detectors.
• Some patterns are much smaller than the whole image.
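The parameter count quoted above is simple arithmetic, sketched here for concreteness:

```python
# First fully connected layer for a flattened 100 x 100 x 3 image
# feeding 1000 neurons: one weight per (input, neuron) pair.
inputs = 100 * 100 * 3          # 30,000 input values
neurons = 1000
weights = inputs * neurons
print(weights)                  # 30000000 = 3 x 10^7
```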
Simplification 1
A neuron does not look at the whole image, only at a small region called its receptive field (e.g., 3 x 3 x 3). It then needs only 3 x 3 x 3 = 27 weights plus a bias, instead of one weight per input value.

[Figure: a 6 x 6 image
1 0 0 0 0 1
0 1 0 0 1 0
0 0 1 1 0 0
1 0 0 0 1 0
0 1 0 0 1 0
0 0 1 0 1 0
with 3 x 3 receptive fields highlighted.]
Simplification 1 (cont.)
• Can different neurons have different sizes of receptive field?
• Can a receptive field cover only some channels?
• Can a receptive field be non-square?
All of these designs are possible.

Several neurons can attend to the same receptive field (each with its own 3 x 3 x 3 weights and bias), and different receptive fields can overlap.
Simplification 1 – Typical Setting
• Each receptive field covers all channels, so it is described by its height and width: the kernel size (e.g., 3 x 3).
• Each receptive field has a set of neurons (e.g., 64 neurons).
• The receptive field is shifted by a stride (e.g., stride = 2), so neighboring fields overlap.
• Padding (typically zeros) is added at the borders so that the receptive fields cover the whole image.
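The interaction of kernel size, stride, and padding can be sketched with the standard output-size formula (a minimal illustration; `conv_output_size` is a hypothetical helper name):

```python
def conv_output_size(n, kernel, stride, padding):
    """Number of receptive-field positions along one spatial dimension:
    floor((n + 2*padding - kernel) / stride) + 1."""
    return (n + 2 * padding - kernel) // stride + 1

# 6 x 6 image, 3 x 3 kernel, stride 1, no padding -> 4 positions per side
print(conv_output_size(6, kernel=3, stride=1, padding=0))  # 4
# stride 2 with one ring of zero padding still covers the whole image
print(conv_output_size(6, kernel=3, stride=2, padding=1))  # 3
```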
Observation 2
• The same patterns appear in different regions of the image.
[Figure: two neurons at different positions each detect a "beak" in their own receptive fields.]
Simplification 2
Neurons with different receptive fields share parameters: they use the same 3 x 3 x 3 weights and the same bias.
[Figure: two neurons at different positions on the 6 x 6 image, with identical weights and bias.]
Simplification 2 (cont.)
The two neurons compute 𝜎(𝑤₁𝑥₁ + 𝑤₂𝑥₂ + ⋯) and 𝜎(𝑤₁𝑥₁′ + 𝑤₂𝑥₂′ + ⋯) with the same weights 𝑤₁, 𝑤₂, … and the same bias, but their inputs 𝑥ᵢ and 𝑥ᵢ′ come from different receptive fields, so their outputs still differ.
Two neurons with the same receptive field would not share parameters (they would always produce identical outputs).
Simplification 2 – Typical Setting
Each receptive field has a set of neurons (e.g., 64 neurons).
[Figure: the 6 x 6 image with two receptive fields, each with its own set of neurons.]
Simplification 2 – Typical Setting (cont.)
• Each receptive field has a set of neurons (e.g., 64 neurons).
• Corresponding neurons across receptive fields use the same set of parameters; each shared set of parameters is called a filter (filter 1, filter 2, filter 3, filter 4, …).
Benefit of Convolutional Layer
• Fully connected layer: can model anything, but jack of all trades, master of none.
• Adding the receptive-field restriction and parameter sharing yields the convolutional layer.
• A convolutional layer therefore has a larger model bias (a stronger built-in assumption), which suits images well.

Convolutional Layer
A convolutional layer contains a set of filters; each filter is a 3 x 3 x channel tensor, and the values in the filters are unknown parameters learned from data.

Example: a 6 x 6 single-channel image, so each filter is 3 x 3 x 1.
6 x 6 image:
1 0 0 0 0 1
0 1 0 0 1 0
0 0 1 1 0 0
1 0 0 0 1 0
0 1 0 0 1 0
0 0 1 0 1 0

Filter 1:
 1 -1 -1
-1  1 -1
-1 -1  1

Filter 2:
-1  1 -1
-1  1 -1
-1  1 -1
Convolutional Layer
Filter 1, stride = 1: slide the 3 x 3 filter across the 6 x 6 image, computing the inner product at each position.
Result (4 x 4):
 3 -1 -3 -1
-3  1  0 -3
-3 -3  0  1
 3 -2 -2 -1
The value 3 marks positions where the filter's diagonal pattern of 1s matches the image.
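This feature map can be reproduced with a short NumPy sketch (a plain-loop, stride-1, no-padding convolution on the 6 x 6 example; `conv2d` is an illustrative helper, not a library function):

```python
import numpy as np

image = np.array([
    [1, 0, 0, 0, 0, 1],
    [0, 1, 0, 0, 1, 0],
    [0, 0, 1, 1, 0, 0],
    [1, 0, 0, 0, 1, 0],
    [0, 1, 0, 0, 1, 0],
    [0, 0, 1, 0, 1, 0],
])
filter1 = np.array([
    [ 1, -1, -1],
    [-1,  1, -1],
    [-1, -1,  1],
])

def conv2d(image, kernel, stride=1):
    """Slide the kernel over the image; each output value is the
    inner product of the kernel with one receptive field."""
    k = kernel.shape[0]
    out = (image.shape[0] - k) // stride + 1
    result = np.zeros((out, out), dtype=int)
    for i in range(out):
        for j in range(out):
            patch = image[i*stride:i*stride+k, j*stride:j*stride+k]
            result[i, j] = int((patch * kernel).sum())
    return result

print(conv2d(image, filter1))
# [[ 3 -1 -3 -1]
#  [-3  1  0 -3]
#  [-3 -3  0  1]
#  [ 3 -2 -2 -1]]
```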
Convolutional Layer
Filter 2, stride = 1: do the same process for every filter.
Result (4 x 4):
-1 -1 -1 -1
-1 -1 -2  1
-1 -1 -2  1
-1  0 -4  3
The stacked outputs of all filters form the feature map.
Convolutional Layer
Each filter produces a 4 x 4 grid of numbers. With 64 filters, the convolution outputs 64 such grids: an "image" with 64 channels, the feature map.
Multiple Convolutional Layers
• The feature map (an "image" with 64 channels) can be fed into another convolutional layer.
• Each filter in the next layer must match the channel count, so its size is 3 x 3 x 64.
• Although a second-layer filter covers only 3 x 3 positions of the feature map, each of those positions already summarizes a 3 x 3 region of the original image, so the filter effectively sees a 5 x 5 region of the original image. Stacking layers keeps enlarging this effective receptive field.
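The growth of the effective receptive field can be sketched with the standard recurrence (a minimal illustration; `effective_receptive_field` is a hypothetical helper name):

```python
def effective_receptive_field(kernels, strides):
    """Receptive field (in input pixels) after a stack of conv layers:
    each layer adds (kernel - 1) * product of earlier strides."""
    rf, jump = 1, 1
    for k, s in zip(kernels, strides):
        rf += (k - 1) * jump
        jump *= s
    return rf

# one 3 x 3, stride-1 layer sees 3 x 3 of the input ...
print(effective_receptive_field([3], [1]))        # 3
# ... but two stacked 3 x 3, stride-1 layers see 5 x 5
print(effective_receptive_field([3, 3], [1, 1]))  # 5
```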
Comparison of Two Stories
• Neuron story: each neuron considers only a receptive field, and neurons with different receptive fields share their parameters.
• Filter story: there is a set of filters, each a 3 x 3 x channel tensor, and each filter convolves over the input image.
• They are the same thing: the shared parameters of one group of neurons are exactly the values of one filter, and applying that filter at every position is the convolution. (Bias is ignored in this comparison.)
Observation 3
• Subsampling the pixels will not change the object: a subsampled bird is still a bird.
Pooling – Max Pooling
Group the values in each feature map (e.g., into 2 x 2 blocks) and keep only the maximum in each group.
Feature map of filter 1:    Feature map of filter 2:
 3 -1 -3 -1                 -1 -1 -1 -1
-3  1  0 -3                 -1 -1 -2  1
-3 -3  0  1                 -1 -1 -2  1
 3 -2 -2 -1                 -1  0 -4  3
Convolutional Layers + Pooling
After 2 x 2 max pooling, each 4 x 4 feature map shrinks to 2 x 2:
Filter 1:  3 0    Filter 2:  -1 1
           3 1                0 3
Convolution followed by pooling can be repeated several times.
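The pooled result for filter 1 can be checked with a small NumPy sketch (`max_pool` is an illustrative helper, not a library function):

```python
import numpy as np

feature_map1 = np.array([
    [ 3, -1, -3, -1],
    [-3,  1,  0, -3],
    [-3, -3,  0,  1],
    [ 3, -2, -2, -1],
])

def max_pool(fmap, size=2):
    """Keep only the maximum of each size x size block."""
    out = fmap.shape[0] // size
    return np.array([[fmap[i*size:(i+1)*size, j*size:(j+1)*size].max()
                      for j in range(out)] for i in range(out)])

print(max_pool(feature_map1))
# [[3 0]
#  [3 1]]
```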
The whole CNN
Input image → Convolution → Pooling → Convolution → Pooling (repeated) → Flatten → Fully Connected Layers → softmax → class scores (cat, dog, …)
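The whole pipeline can be traced shape-by-shape in a minimal NumPy sketch with random weights. This is only a skeleton of the architecture above, not a trainable implementation; the ReLU activation, the 2-class output, and the helper names are assumptions for illustration:

```python
import numpy as np

rng = np.random.default_rng(0)

def conv(x, filters):
    """x: (H, W, C_in); filters: (k, k, C_in, C_out).
    Stride-1, no-padding convolution followed by ReLU (assumed activation)."""
    k, c_out = filters.shape[0], filters.shape[3]
    out = x.shape[0] - k + 1
    y = np.zeros((out, out, c_out))
    for i in range(out):
        for j in range(out):
            patch = x[i:i+k, j:j+k, :]
            y[i, j] = np.tensordot(patch, filters, axes=([0, 1, 2], [0, 1, 2]))
    return np.maximum(y, 0)

def pool(x, size=2):
    """2 x 2 max pooling over each channel."""
    h = x.shape[0] // size
    return x[:h*size, :h*size, :].reshape(h, size, h, size, -1).max(axis=(1, 3))

image = rng.random((100, 100, 3))          # 100 x 100 RGB input
f1 = rng.standard_normal((3, 3, 3, 64))    # 64 filters, each 3 x 3 x 3
f2 = rng.standard_normal((3, 3, 64, 64))   # next layer: 64 filters, 3 x 3 x 64

x = pool(conv(image, f1))                  # -> (49, 49, 64)
x = pool(conv(x, f2))                      # -> (23, 23, 64)
flat = x.reshape(-1)                       # flatten for the FC layers
w = rng.standard_normal((flat.size, 2))    # 2 classes, e.g., cat vs dog
logits = flat @ w
probs = np.exp(logits - logits.max())
probs /= probs.sum()                       # softmax over class scores
print(x.shape, probs.shape)
```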
Application: Playing Go
• The network takes the current board as input and predicts the next move: a classification over 19 x 19 classes (one per board position).
• The board is represented as a 19 x 19 matrix, treated like an image: black = 1, white = -1, none = 0. (AlphaGo actually describes each position with 48 channels of features.)
• A fully-connected network (taking the 19 x 19 values as a vector) can be used, but a CNN performs much better.
Why CNN for Go playing?
• Some patterns are much smaller than the whole board (image).
More Applications
Speech/Signal Processing
https://dl.acm.org/doi/10.1109/TASLP.2014.2339736