Lecture2 CNN Network Design

The document outlines an introductory course on applied machine learning, specifically focusing on Convolutional Neural Networks (CNN) for image classification. It discusses the architecture of CNNs, including the importance of receptive fields, parameter sharing, and the simplifications that enhance the model's efficiency in detecting patterns. The course is taught by Dr. Tao Han at the New Jersey Institute of Technology, utilizing materials inspired by Prof. Hung-yi Lee’s courses.


ECE 498:

Introduction to Applied Machine Learning

• Tao Han, Ph.D.

• Associate Professor
• Electrical and Computer Engineering
• Newark College of Engineering
• New Jersey Institute of Technology

• https://tao-han-njit.netlify.app

Slides are designed based on Prof. Hung-yi Lee’s Machine Learning courses at National Taiwan University
Network Architecture Design:

Convolutional Neural Networks (CNN)


Image Classification
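[Figure: a 100 x 100 image is fed to the Model, which outputs 𝒚′ (e.g., dog 0.2, cat 0.7, tree 0.1). 𝒚′ is compared with the one-hot target ŷ (dog 0, cat 1, tree 0) using cross entropy.]

(All the images to be classified have the same size.)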
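As a rough illustration of the loss named on this slide, the cross entropy between the model output 𝒚′ and the one-hot target ŷ can be computed as follows. The numbers are the ones on the slide; the function name `cross_entropy` is a placeholder, and the natural logarithm is an assumption, since the slide does not state the base.

```python
import math

def cross_entropy(y_pred, y_true):
    """Cross entropy -sum(y_true * log(y_pred)) for a single example."""
    # Terms with y_true == 0 contribute nothing, so they are skipped.
    return -sum(t * math.log(p) for p, t in zip(y_pred, y_true) if t > 0)

# Values from the slide, in the order dog, cat, tree.
y_pred = [0.2, 0.7, 0.1]   # model output y'
y_true = [0, 1, 0]         # one-hot target: the image is a cat

loss = cross_entropy(y_pred, y_true)
print(round(loss, 4))      # 0.3567, i.e. -ln(0.7)
```

With a one-hot target only the probability assigned to the correct class matters, so the loss shrinks toward 0 as that probability approaches 1.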
Image Classification
[Figure: a 100 x 100 color image is a 3-D tensor with 3 channels, each 100 x 100. Each value represents the intensity of one channel at one pixel.]
Fully Connected Network

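[Figure: the image is flattened into 100 x 100 x 3 input values (…, 𝑥𝑖, …, 𝑥𝑗, …, 𝑥𝑘, …). Every input is connected to each of the 1000 neurons in the first layer, which takes 100 x 100 x 3 x 1000 = 3 x 10^7 weights.]

Do we really need “fully connected” in image processing?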
Observation 1
Identifying some critical patterns

[Figure: the inputs x1, x2, …, xN pass through Layer 1 and Layer 2, and the network decides: Bird?]

Perhaps humans also identify birds in a similar way … ☺
Observation 1
A neuron does not have to see the whole image.

Need to see the whole image?
[Figure: the inputs x1, x2, …, xN feed Layer 1 (basic detectors) and then Layer 2 (advanced detectors), which outputs “bird”.]

Some patterns are much smaller than the whole image.
Simplification 1
[Figure: each neuron is connected only to its own receptive field on the image, e.g., a 3 x 3 region over all 3 channels. Such a neuron has 3 x 3 x 3 weights and one bias.]
Simplification 1
• Can different neurons have different sizes of receptive field?
• Cover only some channels?
• Not square receptive field?

[Figure: several neurons, each with 3 x 3 x 3 weights, can have the same receptive field. Receptive fields can be overlapped.]
Simplification 1 – Typical Setting
Each receptive field has a set of neurons (e.g., 64 neurons).
• Each receptive field covers all channels, so it is described by its kernel size (e.g., 3 x 3).
• Neighboring receptive fields are shifted by the stride (e.g., stride = 2), so they overlap.
• Where a receptive field extends past the image border, padding fills in the missing values.
• The receptive fields cover the whole image.
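The typical setting above fixes how many receptive fields fit on an image. A minimal sketch of that arithmetic, using the standard output-size formula: the function name `num_positions` is a placeholder, and `padding=1` is an assumed value, since the slide only says that padding is used.

```python
def num_positions(image_size, kernel_size, stride, padding):
    """Number of receptive-field positions along one image dimension."""
    return (image_size + 2 * padding - kernel_size) // stride + 1

# 100 x 100 image from the earlier slides, 3 x 3 kernel, stride 2.
# padding=1 (one extra pixel on each side) is an assumption for illustration.
per_side = num_positions(image_size=100, kernel_size=3, stride=2, padding=1)
print(per_side, per_side * per_side)   # 50 2500
```

Under these assumptions there are 50 x 50 receptive fields, and with 64 neurons per receptive field the layer has 2500 x 64 neurons.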
Observation 2
• The same patterns appear in different regions.

[Figure: the pattern “beak” shows up in different regions of different images. The neuron covering each region reports: I detect “beak” in my receptive field.]

Does each receptive field need its own “beak” detector?
Simplification 2
[Figure: two neurons with different receptive fields, each with 3 x 3 x 3 weights and a bias, use the same values for them. This is parameter sharing.]
Simplification 2
[Figure: two neurons with different receptive fields share the weights $w_1, w_2, \ldots$ and the bias.]

First neuron, inputs $x_1, x_2, \ldots$: $\sigma(w_1 x_1 + w_2 x_2 + \cdots)$
Second neuron, inputs $x_1', x_2', \ldots$: $\sigma(w_1 x_1' + w_2 x_2' + \cdots)$

Two neurons with the same receptive field would not share parameters (otherwise their outputs would always be identical).
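Parameter sharing can be sketched with two neurons that use one set of weights on two different receptive fields. Everything concrete here is an assumption for illustration: the sigmoid as the activation σ, a single-channel 3 x 3 receptive field instead of 3 x 3 x 3, and the made-up weight and input values.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def neuron(inputs, weights, bias):
    """Compute sigma(w1*x1 + w2*x2 + ... + bias) for one neuron."""
    return sigmoid(sum(w * x for w, x in zip(weights, inputs)) + bias)

# One shared parameter set (made-up values): responds to a diagonal pattern.
shared_weights = [1.0, -1.0, -1.0,
                  -1.0, 1.0, -1.0,
                  -1.0, -1.0, 1.0]
shared_bias = 0.0

# Two different 3 x 3 receptive fields, flattened row by row.
field_a = [1, 0, 0, 0, 1, 0, 0, 0, 1]   # diagonal pattern present
field_b = [0, 0, 0, 0, 0, 1, 1, 1, 0]   # diagonal pattern absent

out_a = neuron(field_a, shared_weights, shared_bias)
out_b = neuron(field_b, shared_weights, shared_bias)
print(out_a > out_b)   # True: same parameters, different inputs
```

The two neurons never need separate copies of the weights, because their inputs already differ.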
Simplification 2 – Typical Setting
Each receptive field has a set of neurons (e.g., 64 neurons).
The neurons at every receptive field use the same sets of parameters.

[Figure: the shared parameter sets are called filter 1, filter 2, filter 3, filter 4, ……; the same filters are used at every receptive field.]
Benefit of Convolutional Layer
[Figure: nested sets. The Fully Connected Layer is the largest set (“Jack of all trades, master of none”). Adding the Receptive Field restriction and then Parameter Sharing gives the Convolutional Layer, which has a larger model bias (for image).]

• Some patterns are much smaller than the whole image.
• The same patterns appear in different regions.
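The benefit can be made concrete by counting parameters. This sketch compares the fully connected first layer from the earlier slide (1000 neurons on a 100 x 100 x 3 input) with a convolutional layer in the typical setting (64 filters of size 3 x 3 over all channels). Counting one bias per filter is an assumption; the variable names are placeholders.

```python
# Image and layer sizes taken from the slides.
height, width, channels = 100, 100, 3

# Fully connected first layer with 1000 neurons (weights only).
fc_weights = height * width * channels * 1000

# Convolutional layer: 64 filters, each with 3 x 3 x channels weights plus
# one bias, shared by every receptive field.
num_filters, kernel = 64, 3
conv_params = num_filters * (kernel * kernel * channels + 1)

print(fc_weights)    # 30000000
print(conv_params)   # 1792
```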
Another story based on filter ☺
Convolutional Layer
A convolutional layer has a set of filters (Filter 1, Filter 2, ……). Each filter is a 3 x 3 x channel tensor.
• channel = 3 for a color image
• channel = 1 for a black and white image

Each filter detects a small pattern (3 x 3 x channel).
Convolutional Layer
Consider channel = 1 (black and white image).

6 x 6 image:
1 0 0 0 0 1
0 1 0 0 1 0
0 0 1 1 0 0
1 0 0 0 1 0
0 1 0 0 1 0
0 0 1 0 1 0

Filter 1:
 1 -1 -1
-1  1 -1
-1 -1  1

Filter 2:
-1  1 -1
-1  1 -1
-1  1 -1

……
(The values in the filters are unknown parameters.)
Convolutional Layer
Filter 1 is moved over the 6 x 6 image with stride = 1. At each position, the output is the inner product of the filter with the 3 x 3 patch under it:

 3 -1 -3 -1
-3  1  0 -3
-3 -3  0  1
 3 -2 -2 -1
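The numbers on this slide can be reproduced with a few lines of plain Python. This is a minimal sketch that assumes a square single-channel image and no padding; the function name `convolve2d` is a placeholder, and, as is usual for CNNs, the filter is not flipped.

```python
def convolve2d(image, kernel, stride=1):
    """Slide `kernel` over `image` (no padding) and return the feature map."""
    k = len(kernel)
    out_size = (len(image) - k) // stride + 1
    feature_map = []
    for i in range(out_size):
        row = []
        for j in range(out_size):
            # Inner product between the filter and one k x k patch.
            row.append(sum(
                kernel[a][b] * image[i * stride + a][j * stride + b]
                for a in range(k)
                for b in range(k)
            ))
        feature_map.append(row)
    return feature_map

# 6 x 6 image and Filter 1 from the slides.
image = [
    [1, 0, 0, 0, 0, 1],
    [0, 1, 0, 0, 1, 0],
    [0, 0, 1, 1, 0, 0],
    [1, 0, 0, 0, 1, 0],
    [0, 1, 0, 0, 1, 0],
    [0, 0, 1, 0, 1, 0],
]
filter_1 = [
    [1, -1, -1],
    [-1, 1, -1],
    [-1, -1, 1],
]

for row in convolve2d(image, filter_1):
    print(row)
# [3, -1, -3, -1]
# [-3, 1, 0, -3]
# [-3, -3, 0, 1]
# [3, -2, -2, -1]
```

The largest outputs (3) sit at the top-left and bottom-left corners, where the image contains the diagonal pattern that Filter 1 encodes.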
Convolutional Layer
Do the same process for every filter. Filter 2 (stride = 1):

-1  1 -1
-1  1 -1
-1  1 -1

gives the output

-1 -1 -1 -1
-1 -1 -2  1
-1 -1 -2  1
-1  0 -4  3

The outputs of all the filters form the Feature Map.
Convolutional Layer
[Figure: with 64 filters, the convolution produces a feature map with 64 channels, one per filter. It can be viewed as an “Image” with 64 channels and passed to the next convolution.]
Multiple Convolutional Layers
[Figure: the 6 x 6 image passes through a convolution with 64 filters, giving an “Image” with 64 channels. The next convolution is applied to this “Image”, so each of its filters is a 3 x 3 x 64 tensor.]
Comparison of Two Stories

[Figure: the weights of a neuron attached to a receptive field and the entries of a filter (a 3 x 3 x channel tensor) are the same numbers, e.g.

 1 -1 -1
-1  1 -1
-1 -1  1

(ignore bias in this slide).]
The neurons with different receptive fields share the parameters.

[Figure: two neurons attached to different receptive fields of the image, both with the same weights and bias.]

Each filter convolves over the input image.
Convolutional Layer

Neuron Version Story
• Each neuron only considers a receptive field.
• The neurons with different receptive fields share the parameters.

Filter Version Story
• There is a set of filters detecting small patterns.
• Each filter convolves over the input image.

They are the same story.
Observation 3
• Subsampling the pixels will not change the object.

[Figure: an image of a bird and its smaller, subsampled version; both are still a bird.]
Pooling – Max Pooling
Feature map from Filter 1:
 3 -1 -3 -1
-3  1  0 -3
-3 -3  0  1
 3 -2 -2 -1

Feature map from Filter 2:
-1 -1 -1 -1
-1 -1 -2  1
-1 -1 -2  1
-1  0 -4  3

Each feature map is divided into 2 x 2 groups, and max pooling keeps only the largest value in each group.
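Max pooling on the two feature maps can be sketched as below. The 2 x 2 group size is an assumption that matches the pooled values on the following slide; the function name `max_pool2d` is a placeholder, and the input is assumed to be square.

```python
def max_pool2d(feature_map, size=2):
    """Keep the largest value in each non-overlapping size x size group."""
    n = len(feature_map)
    pooled = []
    for i in range(0, n - size + 1, size):
        row = []
        for j in range(0, n - size + 1, size):
            row.append(max(
                feature_map[i + a][j + b]
                for a in range(size)
                for b in range(size)
            ))
        pooled.append(row)
    return pooled

# Feature maps produced by Filter 1 and Filter 2 in the slides.
map_1 = [[3, -1, -3, -1], [-3, 1, 0, -3], [-3, -3, 0, 1], [3, -2, -2, -1]]
map_2 = [[-1, -1, -1, -1], [-1, -1, -2, 1], [-1, -1, -2, 1], [-1, 0, -4, 3]]

print(max_pool2d(map_1))   # [[3, 0], [3, 1]]
print(max_pool2d(map_2))   # [[-1, 1], [0, 3]]
```

Pooling has no parameters to learn; it only shrinks each channel, here from 4 x 4 to 2 x 2.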
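Convolutional Layers + Pooling
[Figure: Convolution turns the input into an “Image” with 64 channels, and Pooling then makes each channel smaller. Convolution and Pooling can be repeated.]

After pooling, the 4 x 4 feature maps of Filter 1 and Filter 2 become 2 x 2:

Filter 1:
3 0
3 1

Filter 2:
-1 1
 0 3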
The whole CNN
[Figure: image → Convolution → Pooling → Convolution → Pooling → Flatten → Fully Connected Layers → softmax → cat, dog, ……]
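To see how the stages fit together, the sketch below traces tensor shapes through the pipeline in the figure. The concrete choices are assumptions, not taken from the slides: two Convolution + Pooling blocks, 64 filters of 3 x 3 per convolution, stride 1, no padding, 2 x 2 pooling, and 10 output classes. The names `conv_output` and `trace_cnn` are placeholders.

```python
def conv_output(size, kernel=3, stride=1, padding=0):
    """Spatial size after a convolution along one dimension."""
    return (size + 2 * padding - kernel) // stride + 1

def trace_cnn(size, channels, num_filters=64, num_blocks=2, num_classes=10):
    """Print the shape after each stage and return the flattened length."""
    for block in range(1, num_blocks + 1):
        # Each filter spans all input channels: 3 x 3 x channels weights.
        print(f"conv {block}: {num_filters} filters of 3 x 3 x {channels}")
        size, channels = conv_output(size), num_filters
        print(f"  -> {size} x {size} x {channels}")
        size = size // 2   # assumed 2 x 2 max pooling
        print(f"pool {block} -> {size} x {size} x {channels}")
    flat = size * size * channels
    print(f"flatten -> vector of length {flat}")
    print(f"fully connected layers + softmax -> {num_classes} class scores")
    return flat

flat_size = trace_cnn(size=100, channels=3)
```

For a 100 x 100 x 3 image these assumptions give 98, 49, 47, and 23 as the successive spatial sizes, and a flattened vector of length 23 x 23 x 64 = 33856. Note that the second convolution uses 3 x 3 x 64 filters, because its input has 64 channels.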
Application: Playing Go

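[Figure: the board is the input to a Network, which outputs the next move.]
• Input: the board as a 19 x 19 matrix (black: 1, white: -1, none: 0). It can be treated as a 19 x 19 vector or as an image; Alpha Go uses 48 channels.
• Output: the next move (19 x 19 positions), i.e., a classification with 19 x 19 classes.
• A fully-connected network can be used, but CNN performs much better.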
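The input and output encodings described here can be sketched as follows. The dictionary format for the stones, the example position, and the row-by-row numbering of the 19 x 19 classes are assumptions for illustration; only the values 1, -1, and 0 come from the slide.

```python
BOARD_SIZE = 19

def encode_board(stones):
    """Turn {(row, col): 'black' or 'white'} into a 19 x 19 matrix
    with black = 1, white = -1, none = 0."""
    board = [[0] * BOARD_SIZE for _ in range(BOARD_SIZE)]
    for (row, col), color in stones.items():
        board[row][col] = 1 if color == "black" else -1
    return board

def move_to_class(row, col):
    """Index of a board position among the 19 x 19 output classes
    (row-by-row numbering, an assumed convention)."""
    return row * BOARD_SIZE + col

stones = {(3, 3): "black", (15, 15): "white"}   # made-up position
board = encode_board(stones)
print(board[3][3], board[15][15], board[0][0])  # 1 -1 0
print(move_to_class(18, 18))                    # 360
```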
Why CNN for Go playing?
• Some patterns are much smaller than the whole image.
  Alpha Go uses 5 x 5 filters in the first layer.
• The same patterns appear in different regions.
More Applications

Speech/Signal Processing
https://dl.acm.org/doi/10.1109/TASLP.2014.2339736
