CNN Intro

The document provides an introduction to Convolutional Neural Networks (CNNs), explaining the importance of preserving spatial relationships in images through techniques like convolution and pooling. It discusses the role of kernels, the process of convolution, and how pooling reduces the complexity of the model while retaining essential features. Additionally, it highlights the advantages of CNNs over fully connected networks, such as reduced connections and shared weights.

CNN Introduction

ANN
• We know it is good to learn a small model.
• Does this fully connected model really need all of its edges?
• Can some of them be shared?
CNN: Intuition
• Our model (ANN) attained high accuracy on the training data, but not on the testing/validation data. We are passing our image in as one long string of pixels.
• Why might that not be such a good idea?
• There is important information in how pixels are arranged relative to one another. When we flatten the picture into a single array, we lose that information.
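A minimal sketch of what flattening does to neighbouring pixels. The tiny 3 x 4 image and the variable names are made up for illustration; the point is only that two pixels that touch vertically end up a full row apart in the flattened array.

```python
# Hypothetical 3 x 4 "image"; each pixel value is simply its own index.
image = [
    [0, 1, 2, 3],
    [4, 5, 6, 7],
    [8, 9, 10, 11],
]
width = len(image[0])

# Flatten row by row, as we do before feeding a fully connected network.
flat = [pixel for row in image for pixel in row]

# Pixel (1, 1) and the pixel directly below it, (2, 1), touch in the image,
# but they sit `width` positions apart once the image is flattened.
above = flat.index(image[1][1])
below = flat.index(image[2][1])
distance_in_flat = below - above

print(flat)
print(distance_in_flat)
```

The fully connected network sees only `flat`, so nothing tells it that positions 5 and 9 were neighbours.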
What is Kernel and convolution?
• In a drawing program, the area around the point we click, within which the tool changes the image, is called the kernel.
• We analyze the influence of nearby pixels by using a filter (kernel).
• Convolution is when the kernel is multiplied, element by element, with the patch of the image beneath it and the products are summed.
• More generally, convolution combines two functions by applying one to the other.
• For each point on the image, a value is calculated from the filter using the convolution operation.
• Both can be seen as functions: the base image gives a color for each pixel, and the kernel gives a weight for each nearby pixel.
Kernel and Convolution
Kernels applied to the original image:

Blur
.06 .13 .06
.13 .25 .13
.06 .13 .06

Sharpen
 0 -1  0
-1  5 -1
 0 -1  0

Brighten
0   0   0
0  1.5  0
0   0   0

Darken
0   0   0
0  0.5  0
0   0   0
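The four kernels above can be tried out with a few lines of plain Python. This is a sketch, not a reference implementation: `apply_kernel` and the 4 x 4 `patch` are hypothetical names and data, no padding is used, and the kernel is not flipped (as is usual in CNNs; for these symmetric kernels flipping makes no difference).

```python
def apply_kernel(image, kernel):
    """Slide `kernel` over `image` (lists of rows) one pixel at a time.

    No padding is used, so the output is smaller than the input.
    """
    k = len(kernel)
    out_h = len(image) - k + 1
    out_w = len(image[0]) - k + 1
    result = []
    for top in range(out_h):
        row = []
        for left in range(out_w):
            # Multiply each kernel weight by the pixel beneath it and sum.
            total = sum(
                kernel[i][j] * image[top + i][left + j]
                for i in range(k)
                for j in range(k)
            )
            row.append(total)
        result.append(row)
    return result


BLUR = [[.06, .13, .06], [.13, .25, .13], [.06, .13, .06]]
SHARPEN = [[0, -1, 0], [-1, 5, -1], [0, -1, 0]]
BRIGHTEN = [[0, 0, 0], [0, 1.5, 0], [0, 0, 0]]
DARKEN = [[0, 0, 0], [0, 0.5, 0], [0, 0, 0]]

# Hypothetical grayscale patch: a bright square on a dark background.
patch = [
    [10, 10, 10, 10],
    [10, 90, 90, 10],
    [10, 90, 90, 10],
    [10, 10, 10, 10],
]

brightened = apply_kernel(patch, BRIGHTEN)  # centre pixel times 1.5
darkened = apply_kernel(patch, DARKEN)      # centre pixel times 0.5
sharpened = apply_kernel(patch, SHARPEN)    # centre boosted, neighbours subtracted
blurred = apply_kernel(patch, BLUR)         # weighted average of the neighbourhood
```

Brighten and darken only look at the centre pixel, while blur and sharpen mix in the neighbours, which is why they change edges rather than overall intensity.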
Convolution

6 x 6 image
1 0 0 0 0 1
0 1 0 0 1 0
0 0 1 1 0 0
1 0 0 0 1 0
0 1 0 0 1 0
0 0 1 0 1 0

Filter 1
 1 -1 -1
-1  1 -1
-1 -1  1

Filter 2
-1  1 -1
-1  1 -1
-1  1 -1

• The filter values are the network parameters to be learned.
• Each filter detects a small pattern (3 x 3).
Convolution: Filter 1, stride=1
• Place Filter 1 on the top-left 3 x 3 patch of the 6 x 6 image and take the dot product: 3.
• Move the filter one pixel to the right and take the dot product again: -1.
Convolution: Filter 1, if stride=2
• The first dot product is still 3.
• The filter then moves two pixels to the right, and the next dot product is -3.
Convolution: Filter 2, stride=1
• Repeat this for each filter. Each filter produces one feature map.

Feature map from Filter 1
 3 -1 -3 -1
-3  1  0 -3
-3 -3  0  1
 3 -2 -2 -1

Feature map from Filter 2
-1 -1 -1 -1
-1 -1 -2  1
-1 -1 -2  1
-1  0 -4  3

• The two 4 x 4 images form a 2 x 4 x 4 matrix.
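The feature maps in these slides can be reproduced with a short sketch. The 6 x 6 image and the two filters are taken from the slides; the function name `convolve` and its `stride` argument are illustrative choices, and the code assumes no padding.

```python
def convolve(image, kernel, stride=1):
    """Slide `kernel` over `image`, moving `stride` pixels at a time."""
    k = len(kernel)
    feature_map = []
    for top in range(0, len(image) - k + 1, stride):
        row = []
        for left in range(0, len(image[0]) - k + 1, stride):
            # Dot product between the kernel and the patch beneath it.
            row.append(sum(
                kernel[i][j] * image[top + i][left + j]
                for i in range(k)
                for j in range(k)
            ))
        feature_map.append(row)
    return feature_map


IMAGE = [
    [1, 0, 0, 0, 0, 1],
    [0, 1, 0, 0, 1, 0],
    [0, 0, 1, 1, 0, 0],
    [1, 0, 0, 0, 1, 0],
    [0, 1, 0, 0, 1, 0],
    [0, 0, 1, 0, 1, 0],
]
FILTER_1 = [[1, -1, -1], [-1, 1, -1], [-1, -1, 1]]
FILTER_2 = [[-1, 1, -1], [-1, 1, -1], [-1, 1, -1]]

map_1 = convolve(IMAGE, FILTER_1)            # 4 x 4
map_2 = convolve(IMAGE, FILTER_2)            # 4 x 4
strided = convolve(IMAGE, FILTER_1, stride=2)  # 2 x 2

# Stacking one map per filter gives the 2 x 4 x 4 result.
feature_maps = [map_1, map_2]
```

Filter 1 responds most strongly (value 3) where the image contains a diagonal line, and Filter 2 where it contains a vertical line, which is the sense in which each filter detects a small pattern.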
Padding
Original image (6 x 6)
1 0 1 1 0 1
0 1 0 0 1 0
0 1 1 1 1 0
0 1 1 1 1 0
1 0 1 1 0 1
1 1 0 0 1 1

Zero padding (8 x 8)
0 0 0 0 0 0 0 0
0 1 0 1 1 0 1 0
0 0 1 0 0 1 0 0
0 0 1 1 1 1 0 0
0 0 1 1 1 1 0 0
0 1 0 1 1 0 1 0
0 1 1 0 0 1 1 0
0 0 0 0 0 0 0 0
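Zero padding is simple enough to write by hand. The sketch below reproduces the padded image from this slide; `zero_pad` and its parameter `p` (the number of zero rows and columns added on each side) are names chosen here for illustration.

```python
def zero_pad(image, p=1):
    """Return a copy of `image` with `p` rows and columns of zeros on every side."""
    width = len(image[0]) + 2 * p
    padded = [[0] * width for _ in range(p)]          # top border
    for row in image:
        padded.append([0] * p + list(row) + [0] * p)  # left and right borders
    padded.extend([0] * width for _ in range(p))      # bottom border
    return padded


ORIGINAL = [
    [1, 0, 1, 1, 0, 1],
    [0, 1, 0, 0, 1, 0],
    [0, 1, 1, 1, 1, 0],
    [0, 1, 1, 1, 1, 0],
    [1, 0, 1, 1, 0, 1],
    [1, 1, 0, 0, 1, 1],
]

padded = zero_pad(ORIGINAL, p=1)  # 8 x 8
for row in padded:
    print(*row)
```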
Color image: RGB 3 channels
[Figure: Filter 1 and Filter 2, each drawn as a stack of three 3 x 3 layers, one per color channel, above a 6 x 6 color image drawn as three stacked channels.]
3D Filter
• In the 1D case, we slide a one-dimensional filter over a one-dimensional input.
• In the 2D case, we slide a two-dimensional filter over a two-dimensional input.
• What would a 3D filter look like?
• It will be 3D, and we will refer to it as a volume.
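A sketch of how a 3D filter (a volume) can be applied to a multi-channel image: the filter has one 2D layer per channel, and the products from all channels are summed into a single number per position. The function name `convolve_volume` is made up, and the 3-channel image below is a hypothetical one built by repeating the 6 x 6 image from the earlier slides in every channel.

```python
def convolve_volume(image, kernel):
    """image: channels x H x W, kernel: channels x F x F. Returns a 2D map."""
    channels = len(kernel)
    f = len(kernel[0])
    out_h = len(image[0]) - f + 1
    out_w = len(image[0][0]) - f + 1
    feature_map = []
    for top in range(out_h):
        row = []
        for left in range(out_w):
            total = 0
            # Sum over every channel as well as over the F x F window.
            for c in range(channels):
                for i in range(f):
                    for j in range(f):
                        total += kernel[c][i][j] * image[c][top + i][left + j]
            row.append(total)
        feature_map.append(row)
    return feature_map


CHANNEL = [
    [1, 0, 0, 0, 0, 1],
    [0, 1, 0, 0, 1, 0],
    [0, 0, 1, 1, 0, 0],
    [1, 0, 0, 0, 1, 0],
    [0, 1, 0, 0, 1, 0],
    [0, 0, 1, 0, 1, 0],
]
LAYER = [[1, -1, -1], [-1, 1, -1], [-1, -1, 1]]

# Assumption for illustration: R, G and B all hold the same pattern,
# and the filter volume repeats the same 3 x 3 layer three times.
rgb_image = [CHANNEL, CHANNEL, CHANNEL]
filter_volume = [LAYER, LAYER, LAYER]

response = convolve_volume(rgb_image, filter_volume)  # one 4 x 4 map
```

Note that one 3D filter still produces one 2D feature map, however many channels the input has.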
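Relation between input size, output size and filter size
• W2 = W1 - F + 1
• H2 = H1 - F + 1
• Here W1 x H1 is the input size, F is the filter size and W2 x H2 is the output size, with stride 1.
• This results in an output of smaller dimensions than the input.
• What if we want the output to be the same size as the input? Padding.
• With P rows and columns of padding on each side:
W2 = W1 - F + 2P + 1
H2 = H1 - F + 2P + 1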
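The size formulas can be checked with a one-line helper. `output_size` is a hypothetical name; with the default `stride=1` it is exactly W2 = W1 - F + 2P + 1. The `stride` argument is an addition not stated in the formulas above and uses the commonly quoted generalisation floor((W1 - F + 2P) / S) + 1.

```python
def output_size(input_size, filter_size, padding=0, stride=1):
    """Output width (or height) of a convolution along one dimension."""
    return (input_size + 2 * padding - filter_size) // stride + 1


# 6 x 6 image, 3 x 3 filter, no padding: the output shrinks to 4 x 4.
no_padding = output_size(6, 3)

# One ring of zero padding keeps the output the same size as the input.
same_size = output_size(6, 3, padding=1)

# Assumption: with stride 2 the filter fits in only 2 positions per row.
with_stride_2 = output_size(6, 3, stride=2)

print(no_padding, same_size, with_stride_2)
```

For a 3 x 3 filter, P = 1 is the padding that preserves the input size; in general, with stride 1 and an odd F, that is P = (F - 1) / 2.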
Final Version
Exercise
Convolution vs. Fully Connected
[Figure: top, convolution: the 6 x 6 image is convolved with Filter 1 and Filter 2. Bottom, fully connected: the same 6 x 6 image is flattened into 36 inputs x1, x2, ..., x36, each connected to every neuron.]
[Figure: the 6 x 6 image is flattened into inputs 1 to 36. The first output of Filter 1 (value 3) is a neuron connected only to inputs 1, 2, 3, 7, 8, 9, 13, 14 and 15, with the nine filter values as its weights.]
• Only connect to 9 inputs, not fully connected.

[Figure: the next output of Filter 1 (value -1) is a neuron connected to inputs 2, 3, 4, 8, 9, 10, 14, 15 and 16. It uses the same nine weights as the neuron that outputs 3.]
• Shared weights.
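The two ideas on these slides, sparse connections and shared weights, can be made concrete by treating each convolution output as a neuron over the 36 flattened inputs. This is a sketch under that view; `neuron_connections` and `neuron_output` are hypothetical helpers, and inputs are numbered from 1 as on the slides.

```python
IMAGE = [
    [1, 0, 0, 0, 0, 1],
    [0, 1, 0, 0, 1, 0],
    [0, 0, 1, 1, 0, 0],
    [1, 0, 0, 0, 1, 0],
    [0, 1, 0, 0, 1, 0],
    [0, 0, 1, 0, 1, 0],
]
FILTER_1 = [[1, -1, -1], [-1, 1, -1], [-1, -1, 1]]

# The 36 inputs x1 ... x36 of the equivalent fully connected view.
inputs = [pixel for row in IMAGE for pixel in row]


def neuron_connections(top, left, kernel, width=6):
    """Map input number (1-based) -> weight for the output at (top, left)."""
    f = len(kernel)
    return {
        (top + i) * width + (left + j) + 1: kernel[i][j]
        for i in range(f)
        for j in range(f)
    }


def neuron_output(connections, values):
    """Weighted sum over only the inputs this neuron is connected to."""
    return sum(w * values[n - 1] for n, w in connections.items())


first = neuron_connections(0, 0, FILTER_1)   # inputs 1, 2, 3, 7, 8, 9, 13, 14, 15
second = neuron_connections(0, 1, FILTER_1)  # inputs 2, 3, 4, 8, 9, 10, 14, 15, 16

first_value = neuron_output(first, inputs)
second_value = neuron_output(second, inputs)

# Both neurons reuse the same nine weights, in the same order.
shared = list(first.values()) == list(second.values())

# 16 outputs over 36 inputs: weights needed with and without sharing
# (biases ignored).
fully_connected_weights = 36 * 16
convolution_weights = 9
```

Each neuron touches 9 of the 36 inputs, and all 16 neurons of the feature map share one set of 9 weights, so the layer learns 9 numbers instead of 576.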

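The whole CNN
[Figure: an image passes through the network, which outputs a class (cat, dog, ...). The convolution and max pooling steps can be repeated many times.]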
Max Pooling

Feature map from Filter 1
 3 -1 -3 -1
-3  1  0 -3
-3 -3  0  1
 3 -2 -2 -1

Feature map from Filter 2
-1 -1 -1 -1
-1 -1 -2  1
-1 -1 -2  1
-1  0 -4  3

• Group each feature map into 2 x 2 blocks and keep the largest value in each block.
Why Pooling
• Subsampling the pixels will not change the object.
[Figure: an image of a bird and its subsampled version, which still shows a bird.]
• We can subsample the pixels to make the image smaller, so fewer parameters are needed to characterize the image.
A CNN compresses a fully connected network in two ways:
• Reducing the number of connections
• Sharing weights on the edges
Max pooling further reduces the complexity.
Max Pooling
[Figure: 6 x 6 image -> Conv -> Max Pooling -> 2 x 2 image]

Channel from Filter 1
3 0
3 1

Channel from Filter 2
-1 1
 0 3

• The result is a new image, but smaller.
• Each filter is a channel.
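A minimal sketch of 2 x 2 max pooling applied to the two 4 x 4 feature maps from the slides. `max_pool` is an illustrative name; the code assumes non-overlapping blocks, so the block size is also the step between blocks.

```python
def max_pool(feature_map, size=2):
    """Keep the largest value in each non-overlapping size x size block."""
    pooled = []
    for top in range(0, len(feature_map) - size + 1, size):
        row = []
        for left in range(0, len(feature_map[0]) - size + 1, size):
            row.append(max(
                feature_map[top + i][left + j]
                for i in range(size)
                for j in range(size)
            ))
        pooled.append(row)
    return pooled


MAP_1 = [[3, -1, -3, -1], [-3, 1, 0, -3], [-3, -3, 0, 1], [3, -2, -2, -1]]
MAP_2 = [[-1, -1, -1, -1], [-1, -1, -2, 1], [-1, -1, -2, 1], [-1, 0, -4, 3]]

channel_1 = max_pool(MAP_1)
channel_2 = max_pool(MAP_2)

# One pooled channel per filter: a 2 x 2 image with 2 channels.
new_image = [channel_1, channel_2]
```

Pooling has no weights of its own; it only shrinks each 4 x 4 map to 2 x 2 while keeping the strongest response in every block.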
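The whole CNN
[Figure: convolution followed by max pooling turns the input into a new image with two channels:
3 0     -1 1
3 1      0 3 ]
• The new image is smaller than the original image.
• The number of channels is the number of filters.
• Convolution and max pooling can be repeated many times.
The whole CNN
[Figure: after the last pooling step, the image is flattened and the network outputs a class (cat, dog, ...).]
Flattening
[Figure: the channels of the pooled image laid out as a single flattened vector.]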
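Flattening can be sketched in one line. The pooled values are the ones from the max pooling slide; the order used here (channel by channel, then row by row) is an assumption, since the slides do not say how the values are laid out.

```python
# Pooled 2 x 2 image with two channels, one per filter.
pooled = [
    [[3, 0], [3, 1]],    # channel from Filter 1
    [[-1, 1], [0, 3]],   # channel from Filter 2
]


def flatten(channels):
    """Lay out every value of every channel as one long vector."""
    return [value for channel in channels for row in channel for value in row]


flattened = flatten(pooled)
print(flattened)
```

The resulting vector of 8 numbers is what the final layers of the network receive in place of the original 36 pixels.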
A few parameters for convolution
1. Stride: the number of pixels the kernel moves at each step. A larger stride means we have less data to analyze, but if we increase the stride too much, we might miss important information.
2. Padding: if we want to make sure all pixels are used in convolution, or if we want the resulting image to be the same size as our input image, we can add a border around the image, which is called padding.
3. Max-Pooling: especially useful when working with large images, because it shrinks images down to a smaller size, and smaller images mean less computation.
4. Dropout: randomly shut off neurons at a rate we specify, meaning a neuron that is shut off is unable to learn for that step of training.
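Stride, padding and max pooling are all about the size of the image; dropout is different in kind, so here is a small sketch of it. The function `dropout`, the activation values and the seed are all made up for illustration. Many libraries also rescale the surviving activations during training; that step is left out here to keep the idea visible.

```python
import random


def dropout(activations, rate, rng):
    """Shut off each neuron independently with probability `rate`."""
    return [0.0 if rng.random() < rate else a for a in activations]


rng = random.Random(0)  # fixed seed so the run is repeatable
activations = [0.5, 1.2, 0.3, 0.9, 2.0, 0.7]

# On each training step a fresh random subset of neurons is silenced.
step_1 = dropout(activations, rate=0.5, rng=rng)
step_2 = dropout(activations, rate=0.5, rng=rng)

print(step_1)
print(step_2)
```

A neuron whose output is zeroed contributes nothing to the prediction on that step, which is why it does not learn from that step.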
Demo
• https://setosa.io/ev/image-kernels/
