(W − F + 2P)/S + 1: Use of Zero-Padding

The correct formula for calculating how many neurons “fit” is given by (W − F + 2P)/S + 1.

For example, for a 7x7 input and a 3x3 filter with stride 1 and pad 0, we would get a 5x5 output. With stride 2 we would get a 3x3 output.
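To make the formula concrete, here is a minimal sketch (the helper name conv_output_size is ours, not from any library) that evaluates (W − F + 2P)/S + 1 for these two cases:

def conv_output_size(W, F, S=1, P=0):
    # Number of neurons that "fit": (W - F + 2P)/S + 1
    return (W - F + 2 * P) // S + 1

print(conv_output_size(7, 3, S=1))   # 5 -> a 5x5 output
print(conv_output_size(7, 3, S=2))   # 3 -> a 3x3 output

Let's also look at one more graphical example: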

Illustration of spatial arrangement. In this example there is only one spatial dimension (x-axis), one neuron with a receptive field size of F = 3, the input size is W = 5, and there is zero padding of P = 1. Left: The neuron strides across the input with a stride of S = 1, giving an output of size (5 - 3 + 2)/1 + 1 = 5. Right: The neuron uses a stride of S = 2, giving an output of size (5 - 3 + 2)/2 + 1 = 3. Notice that stride S = 3 could not be used since it wouldn't fit neatly across the volume. In terms of the equation, this can be determined since (5 - 3 + 2) = 4 is not divisible by 3.
The neuron weights in this example are [1, 0, -1] (shown on the far right), and its bias is zero. These weights are shared across all yellow neurons (see parameter sharing below).
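The left panel can be reproduced numerically in a few lines (the input values here are made up for illustration; the weights [1, 0, -1] and zero bias are from the figure):

import numpy as np

x = np.array([2, 3, 1, 0, 4])                 # made-up input of size W = 5
x_pad = np.pad(x, 1)                          # zero padding of P = 1
w, b = np.array([1, 0, -1]), 0                # shared weights and bias
out = [int(np.sum(x_pad[i:i+3] * w) + b) for i in range(5)]   # S = 1 -> 5 outputs
print(out)                                    # [-3, 1, 3, -3, 0] for this input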

Use of zero-padding. In the example above on the left, note that the input dimension was 5 and the output dimension was equal: also 5. This worked out so because our receptive field was 3 and we used zero padding of 1. If no zero-padding were used, then the output volume would have had a spatial dimension of only 3, because that is how many neurons would have “fit” across the original input. In general, setting zero padding to P = (F − 1)/2 when the stride is S = 1 ensures that the input volume and output volume will have the same size spatially. It is very common to use zero-padding in this way, and we will discuss the full reasons when we talk more about ConvNet architectures.
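A quick numeric check of this rule, using the sizes from the example above:

W, F, S = 5, 3, 1
P = (F - 1) // 2                  # = 1 for a receptive field of 3
print((W - F + 2 * P) // S + 1)   # 5: the output size equals the input size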

Constraints on strides. Note again that the spatial arrangement hyperparameters have mutual constraints. For example, when the input has size W = 10, no zero-padding is used (P = 0), and the filter size is F = 3, then it would be impossible to use stride S = 2, since (W − F + 2P)/S + 1 = (10 − 3 + 0)/2 + 1 = 4.5, i.e. not an integer, indicating that the neurons don’t “fit” neatly and symmetrically across the input. Therefore, this setting of the hyperparameters is considered to be invalid, and a ConvNet library could throw an exception or zero pad the rest to make it fit, or crop the input to make it fit, or something. As we will see in the ConvNet architectures section, sizing the ConvNets appropriately so that all the dimensions “work out” can be a real headache, which the use of zero-padding and some design guidelines will significantly alleviate.
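A library might enforce this constraint with a check along these lines (a sketch; the function name check_conv_fit is ours):

def check_conv_fit(W, F, S, P=0):
    # Return the output size, or raise if the neurons don't tile the input exactly.
    if (W - F + 2 * P) % S != 0:
        raise ValueError(f"stride {S} does not fit: ({W} - {F} + 2*{P}) is not divisible by {S}")
    return (W - F + 2 * P) // S + 1

check_conv_fit(10, 3, 2)   # raises ValueError, matching the example above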
Real-world example. The Krizhevsky et al. architecture that won the ImageNet challenge in 2012 accepted images of size [227x227x3]. On the first Convolutional Layer, it used neurons with receptive field size F = 11, stride S = 4, and no zero padding (P = 0). Since (227 - 11)/4 + 1 = 55, and since the Conv layer had a depth of K = 96, the Conv layer output volume had size [55x55x96]. Each of the 55*55*96 neurons in this volume was connected to a region of size [11x11x3] in the input volume. Moreover, all 96 neurons in each depth column are connected to the same [11x11x3] region of the input, but of course with different weights. As a fun aside, if you read the actual paper it claims that the input images were 224x224, which is surely incorrect because (224 - 11)/4 + 1 is quite clearly not an integer. This has confused many people in the history of ConvNets and little is known about what happened. My own best guess is that Alex used zero-padding of 3 extra pixels that he does not mention in the paper.
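A quick check of the two input sizes makes the inconsistency obvious (plain arithmetic, not code from the paper):

print((227 - 11) / 4 + 1)   # 55.0  -> consistent with the [55x55x96] output
print((224 - 11) / 4 + 1)   # 54.25 -> not an integer, so 224x224 cannot be right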

Parameter Sharing. A parameter sharing scheme is used in Convolutional Layers to control the number of parameters. Using the real-world example above, we see that there are 55*55*96 = 290,400 neurons in the first Conv Layer, and each has 11*11*3 = 363 weights and 1 bias. Together, this adds up to 290400 * 364 = 105,705,600 parameters on the first layer of the ConvNet alone. Clearly, this number is very high.

It turns out that we can dramatically reduce the number of parameters by making one reasonable
assumption: That if one feature is useful to compute at some spatial position (x,y), then it should
also be useful to compute at a different position (x2,y2). In other words, denoting a single 2-
dimensional slice of depth as a depth slice (e.g. a volume of size [55x55x96] has 96 depth slices,
each of size [55x55]), we are going to constrain the neurons in each depth slice to use the same
weights and bias. With this parameter sharing scheme, the first Conv Layer in our example would
now have only 96 unique sets of weights (one for each depth slice), for a total of 96*11*11*3 =
34,848 unique weights, or 34,944 parameters (+96 biases). Alternatively, all 55*55 neurons in
each depth slice will now be using the same parameters. In practice during backpropagation,
every neuron in the volume will compute the gradient for its weights, but these gradients will be
added up across each depth slice and only update a single set of weights per slice.
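The arithmetic can be double-checked in a couple of lines (a quick sketch using the numbers above):

neurons = 55 * 55 * 96            # 290,400 neurons in the first Conv layer
per_neuron = 11 * 11 * 3 + 1      # 363 weights + 1 bias = 364
print(neurons * per_neuron)       # 105,705,600 parameters without sharing
print(96 * per_neuron)            # 34,944 parameters with parameter sharing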

Notice that if all neurons in a single depth slice are using the same weight vector, then the forward pass of the CONV layer can in each depth slice be computed as a convolution of the neuron’s weights with the input volume (hence the name: Convolutional Layer). This is why it is common to refer to the sets of weights as a filter (or a kernel) that is convolved with the input.
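In code, the forward pass for one depth slice is a sliding dot product of the shared filter with the input volume. Here is a minimal sketch (the function name conv_slice is ours; no padding is assumed):

import numpy as np

def conv_slice(X, W, b, S=1):
    # One depth slice of a Conv layer: slide the shared weights W over input X.
    # (Technically a cross-correlation, since the filter is not flipped.)
    F = W.shape[0]                              # filter extent; W has shape (F, F, depth)
    H_out = (X.shape[0] - F) // S + 1
    W_out = (X.shape[1] - F) // S + 1
    out = np.zeros((H_out, W_out))
    for i in range(H_out):
        for j in range(W_out):
            patch = X[i*S:i*S+F, j*S:j*S+F, :]  # local receptive field
            out[i, j] = np.sum(patch * W) + b   # same W, b at every position
    return out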
Example filters learned by Krizhevsky et al. Each of the 96 filters shown here is of size [11x11x3], and each
one is shared by the 55*55 neurons in one depth slice. Notice that the parameter sharing assumption is
relatively reasonable: If detecting a horizontal edge is important at some location in the image, it should
intuitively be useful at some other location as well due to the translationally-invariant structure of images.
There is therefore no need to relearn to detect a horizontal edge at every one of the 55*55 distinct locations
in the Conv layer output volume.

Note that sometimes the parameter sharing assumption may not make sense. This is especially the case when the input images to a ConvNet have some specific centered structure, where we should expect, for example, that completely different features should be learned on one side of the image than another. One practical example is when the inputs are faces that have been centered in the image. You might expect that different eye-specific or hair-specific features could (and should) be learned in different spatial locations. In that case it is common to relax the parameter sharing scheme, and instead simply call the layer a Locally-Connected Layer.

Numpy examples. To make the discussion above more concrete, let's express the same ideas in code and with a specific example. Suppose that the input volume is a numpy array X. Then:

A depth column (or a fibre) at position (x,y) would be the activations X[x,y,:].
A depth slice, or equivalently an activation map at depth d, would be the activations X[:,:,d].

Conv Layer Example. Suppose that the input volume X has shape X.shape: (11,11,4). Suppose further that we use no zero padding (P = 0), that the filter size is F = 5, and that the stride is S = 2. The output volume would therefore have spatial size (11 - 5)/2 + 1 = 4, giving a volume with width and height of 4. The activation map in the output volume (call it V) would then look as follows (only some of the elements are computed in this example):

# The 5x5x4 window slides along the first axis in steps of the stride S = 2:
V[0,0,0] = np.sum(X[:5,:5,:] * W0) + b0
V[1,0,0] = np.sum(X[2:7,:5,:] * W0) + b0
V[2,0,0] = np.sum(X[4:9,:5,:] * W0) + b0
V[3,0,0] = np.sum(X[6:11,:5,:] * W0) + b0
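For the indexing above to actually run, it suffices to define arrays with the stated shapes; a minimal setup sketch (the random values are placeholders, and only one depth slice of V is shown):

import numpy as np

X = np.random.randn(11, 11, 4)   # input volume of shape (11,11,4)
W0 = np.random.randn(5, 5, 4)    # shared weights of the first filter, F = 5
b0 = 0.0                         # shared bias of the first depth slice
V = np.zeros((4, 4, 1))          # output volume with spatial size (11-5)/2+1 = 4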
