Mod 5
A “beak” detector
The same pattern appears in different places in an image, so the detectors for it can be compressed: an “upper-left beak” detector and a “middle beak” detector do the same job. Instead of training a lot of such “small” detectors independently, we train one detector and let it “move around” the image.
Deep learning is about feature/representation learning.
Compared with traditional hand-crafted features, the representations learned by neural networks are much more powerful in computer vision tasks.
Because this feature-learning process is accomplished automatically by feeding the network large volumes of training data, the learned features also generalize better.
Deep learning models are formed by multiple layers. In the context of artificial neural networks, a multilayer perceptron (MLP) with more than 2 hidden layers is already a deep model.
As a rule of thumb, deeper models have the potential to perform better than shallow models.
The problem is that the deeper you go, the more data you need to avoid over-fitting.
Convolutional Neural Network
You can imagine how computationally intensive things would get once images reach large dimensions, say 8K (7680×4320). The role of the ConvNet is to reduce the images into a form that is easier to process, without losing the features that are critical for a good prediction. This matters when we want to design an architecture that is not only good at learning features but also scalable to massive datasets.
Beak detector: a filter (kernel)
The output of a convolutional layer is sometimes referred to as a feature map.
Every image can be represented as a matrix of pixel values. An image from a standard digital camera has three channels — red, green and blue. You can imagine those as three 2D matrices stacked on top of each other (one for each color), each with pixel values in the range 0 to 255.
Applying a convolution to an image means taking a filter of a certain dimension and sliding it over the image. At each position, the operation is an element-wise multiplication between the two matrices followed by a sum of the products. The resulting number forms a single element of the output matrix.
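As a minimal sketch of this computation (the patch and filter values are illustrative, matching the example below):

import numpy as np

patch = np.array([[1, 0, 0],
                  [0, 1, 0],
                  [0, 0, 1]])    # a 3x3 patch of the image
filt = np.array([[ 1, -1, -1],
                 [-1,  1, -1],
                 [-1, -1,  1]])  # a 3x3 filter

out_element = np.sum(patch * filt)  # element-wise multiply, then add
print(out_element)  # 3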
Convolution
The filter weights are the network parameters to be learned.

Filter 1:        Filter 2:
 1 -1 -1         -1  1 -1
-1  1 -1         -1  1 -1
-1 -1  1         -1  1 -1

6 x 6 image:
1 0 0 0 0 1
0 1 0 0 1 0
0 0 1 1 0 0
1 0 0 0 1 0
0 1 0 0 1 0
0 0 1 0 1 0

Each filter detects a small pattern (3 x 3).
Convolution with Filter 1, stride = 1: place the 3 x 3 filter on the top-left corner of the 6 x 6 image and take the dot product with the underlying patch, which gives 3. Sliding the filter one pixel to the right gives -1.
Convolution with Filter 1, stride = 2: sliding the filter two pixels at a time, the first two outputs in the top row are 3 and -3.
Convolution with Filter 1, stride = 1, over the whole 6 x 6 image gives a 4 x 4 map:

 3 -1 -3 -1
-3  1  0 -3
-3 -3  0  1
 3 -2 -2 -1
Repeat this for each filter. Filter 2 (stride = 1) gives its own 4 x 4 feature map:

-1 -1 -1 -1
-1 -1 -2  1
-1 -1 -2  1
-1  0 -4  3

The two 4 x 4 feature maps together form a 2 x 4 x 4 output.
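A small sketch to reproduce these feature maps (note that the "convolution" used in CNNs is technically cross-correlation, hence correlate2d):

import numpy as np
from scipy.signal import correlate2d

image = np.array([[1,0,0,0,0,1],
                  [0,1,0,0,1,0],
                  [0,0,1,1,0,0],
                  [1,0,0,0,1,0],
                  [0,1,0,0,1,0],
                  [0,0,1,0,1,0]])
filter1 = np.array([[ 1,-1,-1], [-1, 1,-1], [-1,-1, 1]])
filter2 = np.array([[-1, 1,-1], [-1, 1,-1], [-1, 1,-1]])

map1 = correlate2d(image, filter1, mode='valid')  # the 4 x 4 map above
map2 = correlate2d(image, filter2, mode='valid')
feature_maps = np.stack([map1, map2])             # shape (2, 4, 4)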
Color image: RGB, 3 channels
Image size = 6 x 6 x 3, filter size = 3 x 3 x 3.
A color image is a stack of three 6 x 6 matrices, one per channel. Each filter (Filter 1, Filter 2, ...) is extended to a 3 x 3 x 3 cube with weights for every channel, so one filter still produces one feature map: the products are summed across all three channels.
Convolution v.s. Fully Connected
Convolution can be seen as a restricted fully-connected layer. In the fully-connected view, the 6 x 6 image is flattened into 36 inputs x1, x2, ..., x36, and every output neuron connects to all 36 of them. Convolution keeps the same inputs but removes most of the connections.
Consider the first output of Filter 1, the value 3. Numbering the flattened pixels 1-36 (x1 = 1, x2 = 0, x3 = 0, ..., x8 = 1, x9 = 0, ..., x15 = 1, x16 = 1, ...), this output connects to only 9 inputs — positions 1, 2, 3, 7, 8, 9, 13, 14, 15 — not to all 36. Fewer parameters!
The second output of Filter 1, the value -1, connects to the next 9 inputs (positions 2, 3, 4, 8, 9, 10, 14, 15, 16) and reuses exactly the same 9 weights: shared weights. Fewer parameters, and with weight sharing even fewer parameters.
Output size: for an M x M image and an N x N filter, the output is (M - N + 1) x (M - N + 1); for example, 32 - 5 + 1 = 28.
If images are composed of three channels (R - red, G - green, B - blue), the input is a volume: a stack of three matrices whose depth is the number of channels.
If we apply only one filter, the filter is a cube of 27 parameters (3 x 3 x 3) that slides on top of the cube of the input image, producing a single 2D feature map.
Some common convolution hyperparameters: d - depth, k - number of filters, P - padding, S - stride.
Filter: F x F x d
Output size: (N - F)/S + 1 = (6 - 3)/1 + 1 = 4
Each position then computes z = W*a + b, followed by an activation g(z).
One layer of Convolutional Neural Network
The final step that takes us to a convolutional neural layer is to add the bias (1 parameter per filter, in addition to the 27 filter weights) and apply a non-linear function.
One layer of Convolutional Neural Network
The result of convolving a 6×6×3 input with two 3×3×3 filters is a volume of dimension 4×4×2.
We've gone from a 6×6×3 dimensional a[0] through one layer of a neural network to a 4×4×2 dimensional a[1]. So 6×6×3 has gone to 4×4×2, and that's one layer of a convolutional net. In this example two filters were involved, which is why we end up with a 4×4×2 output. If we had 10 filters instead of 2, we would obtain a 4×4×10 output volume: we'd be stacking up 10 of these maps instead of 2 of them, and that's how a[1] would be obtained.
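A quick sketch of this shape arithmetic in Keras (channels-last convention, so shapes print as height x width x channels):

import numpy as np
from tensorflow.keras import layers

a0 = np.zeros((1, 6, 6, 3), dtype='float32')    # a batch of one 6x6x3 input
conv = layers.Conv2D(filters=2, kernel_size=3)  # two 3x3x3 filters, no padding
a1 = conv(a0)
print(a1.shape)  # (1, 4, 4, 2): a 4x4x2 output volume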
One layer of Convolutional Neural Network
We can use these ten filters to detect features — vertical edges, horizontal edges, maybe other features — anywhere, even in a very large image, with just a very small number of parameters. This is one property of convolutional neural nets that makes them less prone to overfitting: once we've learned ten feature detectors that work, we can apply them even to very large images, and the number of parameters remains fixed and relatively small — 280 in this example.
Number of parameters in one layer
An example of a ConvNet
If layer l is a convolutional layer, we denote the filter size by f[l]: the superscript [l] signifies an f×f filter in layer l. We use p[l] to denote the amount of padding; the padding can also be specified as a valid convolution, which means no padding, or a same convolution, which means we choose the padding so that the output has the same height and width as the input. Finally, we use s[l] to denote the stride.
Output size: (n + 2p − f)/s + 1
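As a sketch, the same formula written as a small helper (with floor division for the general case):

def conv_output_size(n, f, p=0, s=1):
    # floor((n + 2p - f) / s) + 1
    return (n + 2 * p - f) // s + 1

print(conv_output_size(6, 3))        # 4
print(conv_output_size(32, 5))       # 28
print(conv_output_size(37, 5, s=2))  # 17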
An example of a ConvNet
Let's now say we have another convolutional layer that uses 5×5 filters, so in our notation f[2] = 5 at the next layer of the network; let's say we use a stride of 2 (s[2] = 2), no padding (p[2] = 0) and 20 filters.
Then the output will be another volume, this time 17×17×20. Notice that because we're now using a stride of 2 (s[2] = 2), the dimension has shrunk much faster: 37×37 has gone down in size by slightly more than a factor of 2, to 17×17. Because we're using 20 filters, the number of channels is now 20, so the activation a[2] has that dimension.
One last convolutional layer: let's say we use a 5×5 filter again and, again, a stride of 2. We end up with 7×7, and with 40 filters and no padding we finally get 7×7×40.
An example of a ConvNet
Now our 39×39×3 input image has been processed and 7×7×40 features have been computed for it. Finally, we take the 7×7×40 (= 1960) volume and flatten, or unroll, it into 1960 units. This very long vector can be fed into a softmax or a logistic regression to make the final prediction.
As we go deeper into the neural network, we typically start off with larger images (39×39); the height and width stay the same for a while and then gradually trend down, going from 39 to 37 to 17 to 7, whereas the number of channels generally increases (from 3 to 10 to 20 to 40). We can see this general trend in a lot of other convolutional neural networks as well.
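A minimal Keras sketch of this example (conv settings follow the text; the first layer's f=3, s=1 is inferred from 39 → 37, and the sigmoid output head is an assumption):

from tensorflow import keras
from tensorflow.keras import layers

model = keras.Sequential([
    keras.Input(shape=(39, 39, 3)),
    layers.Conv2D(10, 3, strides=1, padding='valid'),  # -> 37 x 37 x 10
    layers.Conv2D(20, 5, strides=2, padding='valid'),  # -> 17 x 17 x 20
    layers.Conv2D(40, 5, strides=2, padding='valid'),  # -> 7 x 7 x 40
    layers.Flatten(),                                  # -> 1960 units
    layers.Dense(1, activation='sigmoid'),             # assumed output head
])
model.summary()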
Pooling Layer and Fully Connected layer
In a typical ConvNet there are usually three types of layers: the convolutional layer (often denoted Conv), the pooling layer (Pool), and the fully connected layer (FC).
Pooling layers
Two common types of pooling are:
•Max pooling
•Average pooling
Max Pooling
Suppose we have a 4×4 input and we want to apply max pooling. We take the 4×4 input and break it into four 2×2 regions, as shown in the figure below. Then, in the 2×2 output, each element is the max of the corresponding shaded region.
Maxpooling with a 2×2 filter and a stride of 2
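A minimal NumPy sketch of 2×2 max pooling with stride 2 (the input values are illustrative):

import numpy as np

x = np.array([[1, 3, 2, 1],
              [2, 9, 1, 1],
              [1, 3, 2, 3],
              [5, 6, 1, 2]])

# Split into 2x2 blocks and take the max of each block.
pooled = x.reshape(2, 2, 2, 2).max(axis=(1, 3))
print(pooled)  # [[9 2]
               #  [6 3]]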
The intuition behind max pooling
If we think of this 4×4 region as a set of features (the activations in some layer of the neural network), then a large number means that a particular feature was detected there. Say the upper left-hand quadrant contains the feature — maybe a vertical edge, or maybe the eye of an animal (a cat-eye detector, say) — while the upper right-hand quadrant does not. The max operation preserves a feature detected anywhere within a quadrant: if the feature is detected anywhere in the filter region, a high number is kept in the output of max pooling; if it is not detected, the max stays low, and the feature most likely doesn't exist in that quadrant.
We can say that there are two main reasons that people use max pooling:
1. It's been found in a lot of experiments to work well.
2. It has no parameters to learn; there's nothing for gradient descent to adjust. Once we've fixed f and s, it's just a fixed computation that gradient descent doesn't change.
Average pooling
Instead of taking the maximum within each filter region, average pooling takes the average. In this example the average of the numbers in purple is 3.75; then 1.25, then 4, and finally 2. This is average pooling with hyperparameters f=2 and s=2; we can choose other hyperparameters as well. These days max pooling is used much more often than average pooling.
Pooling layer downsamples the volume spatially, independently in each
depth slice of the input volume.
Why Pooling
Subsampling: pooling helps make the representation approximately invariant to small translations of the input.

Applying 2 x 2 max pooling (stride 2) to the two 4 x 4 feature maps from earlier:

Filter 1 map:       Filter 2 map:
 3 -1 -3 -1         -1 -1 -1 -1
-3  1  0 -3         -1 -1 -2  1
-3 -3  0  1         -1 -1 -2  1
 3 -2 -2 -1         -1  0 -4  3

Max pooling gives a new, smaller image — each filter is a channel:

3 0                 -1 1
3 1                  0 3

The original 6 x 6 image has become a 2 x 2 image per filter.
The activations of an example ConvNet architecture: features are extracted automatically using convolutions, then a DNN outputs the class.
Illustration of two convolutional layers: the first, with 4 filters of size 5×5×3, gets as input an RGB image of size 64×64×3 and produces a tensor of feature maps. A second convolutional layer, with 5 filters of size 3×3×4, gets as input the 64×64×4 tensor from the previous layer and produces a new 64×64×5 tensor of feature maps. The circle after each filter denotes an activation function, e.g. ReLU.
VGG16 model for classification and detection
The image is passed through a stack of convolutional (conv.) layers, where the
filters were used with a very small receptive field: 3×3 (which is the smallest
size to capture the notion of left/right, up/down, center).
VGG16 model for classification and detection
It’s common that, as we go deeper into the network, the sizes (nh, nw)
decrease, while the number of channels (nc) increases.
The whole CNN
Input image → Convolution → Max Pooling → Convolution → Max Pooling (this Conv/Pool block can repeat many times) → Flattened → Fully connected feedforward network → output (cat, dog, ...)
The whole CNN
Convolution followed by Max Pooling produces a new image, smaller than the original, whose number of channels equals the number of filters:

Filter 1 channel:   Filter 2 channel:
3 0                 -1 1
3 1                  0 3

This Conv/Pool block can repeat many times. Finally, the resulting volume is flattened into a single vector and fed into a fully connected feedforward network that produces the output (cat, dog, ...).
Revisiting CNN
(Figure: convolution by a 3 x 3 filter.)
The output from each filter is stacked together, forming the depth dimension of the convolved image. Suppose we have an input image of size 32*32*3 and we apply 10 filters of size 5*5*3 with valid padding. The output then has dimensions 28*28*10.
Pooling layer
Sometimes, when the images are too large, we need to reduce the number of trainable parameters, so pooling layers are periodically introduced between subsequent convolution layers. Pooling is done for the sole purpose of reducing the spatial size of the image. It is applied independently on each depth slice, so the depth of the image remains unchanged. The most common form of pooling layer is max pooling.
As you can see, the 4*4 convolved output has become 2*2 after the max
pooling operation.
Revisiting CNN
Output dimensions
Three hyperparameters control the size of the output volume.
1. The number of filters — the depth of the output volume equals the number of filters applied. Remember how we stacked the output from each filter to form an activation map; the depth of the activation map equals the number of filters.
2. Stride — with a stride of one we move across and down a single pixel at a time. With higher stride values we skip several pixels at a time and hence produce smaller output volumes.
3. Zero padding — this helps us preserve the size of the input image: with a single ring of zero padding, a single-stride 3x3 filter retains the size of the original image.
Suppose we have an input image of size 32*32*3 and we apply 10 filters of size 3*3*3 with single stride and no zero padding. The output depth equals the number of filters applied, i.e. 10. With W=32, F=3, P=0 and S=1, the spatial size is (W − F + 2P)/S + 1 = (32 − 3 + 0)/1 + 1 = 30, so the output volume is 30*30*10.
Revisiting CNN
Output layer
After multiple layers of convolution and padding, we need the output in the form of a class. The convolution and pooling layers only extract features and reduce the number of parameters from the original images. To generate the final output, however, we need a fully connected layer that produces an output equal to the number of classes we need. It is tough to reach that number with convolution layers alone: convolution layers generate 3D activation maps, while we just need to know whether or not an image belongs to a particular class. The output layer has a loss function, such as categorical cross-entropy, to compute the error in prediction. Once the forward pass is complete, backpropagation begins to update the weights and biases to reduce the error.
One Layer of a Convolutional Network
Once we get an output after convolving over the entire image with a filter, we add a bias term to those outputs and finally apply an activation function to generate activations. This is one layer of a convolutional network. Recall that the equation for one forward pass is:

z[1] = w[1] a[0] + b[1]
a[1] = g(z[1])

In our case, the input (6 X 6 X 3) is a[0] and the filters (3 X 3 X 3) are the weights w[1]. The activations from layer 1 act as the input for layer 2, and so on. Clearly, the number of parameters in a convolutional neural network is independent of the size of the image; it essentially depends on the filter size. Suppose we have 10 filters, each of shape 3 X 3 X 3. What will be the number of parameters in that layer? Let's try to solve this:
•Number of parameters for each filter = 3*3*3 = 27
•There will be a bias term for each filter, so total parameters per filter = 28
•As there are 10 filters, the total parameters for that layer = 28*10 = 280
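As a quick sketch, the same count as a small helper function (hypothetical, just restating the arithmetic above):

def conv_layer_params(f, in_channels, n_filters):
    # f*f*in_channels weights per filter, plus one bias per filter
    return (f * f * in_channels + 1) * n_filters

print(conv_layer_params(3, 3, 10))  # (27 + 1) * 10 = 280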
Example
More Edge Detection
The type of filter we choose determines which edges we detect. For example, the standard vertical and horizontal edge-detection filters are:

1 0 -1        1  1  1
1 0 -1        0  0  0
1 0 -1       -1 -1 -1
input → Convolution → Max Pooling → ...
There are 25 3 x 3 filters in the first convolution (e.g. the Filter 1 and Filter 2 patterns shown earlier). Max pooling then takes the max of each 2 x 2 block, e.g.

 3 -1
-3  1   →  3

input_shape = (28, 28, 1)
CNN in Keras: only the network structure and the input format (vector -> 3-D array) are modified.

Input: 1 x 28 x 28
Convolution (25 filters, 3 x 3) -> 25 x 26 x 26; parameters per filter: 9
Max Pooling -> 25 x 13 x 13
Convolution (50 filters, 3 x 3) -> 50 x 11 x 11; parameters per filter: 225 = 25 x 9
Max Pooling -> 50 x 5 x 5
CNN in Keras: only the network structure and the input format (vector -> 3-D array) are modified.

Input: 1 x 28 x 28
Convolution -> 25 x 26 x 26
Max Pooling -> 25 x 13 x 13
Convolution -> 50 x 11 x 11
Max Pooling -> 50 x 5 x 5
Flattened -> 1250
Fully connected feedforward network -> Output
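A minimal Keras sketch of this architecture (the hidden Dense size and the 10-class softmax head are assumptions; Keras prints shapes channels-last, e.g. 26 x 26 x 25):

from tensorflow import keras
from tensorflow.keras import layers

model = keras.Sequential([
    keras.Input(shape=(28, 28, 1)),
    layers.Conv2D(25, 3, activation='relu'),  # 25 3x3 filters -> 26 x 26 x 25
    layers.MaxPooling2D(2),                   # -> 13 x 13 x 25
    layers.Conv2D(50, 3, activation='relu'),  # 50 3x3 filters -> 11 x 11 x 50
    layers.MaxPooling2D(2),                   # -> 5 x 5 x 50
    layers.Flatten(),                         # -> 1250
    layers.Dense(100, activation='relu'),     # assumed hidden layer
    layers.Dense(10, activation='softmax'),   # assumed 10 classes
])
model.summary()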
A spatial separable convolution simply divides a kernel into two smaller kernels. The most common case is to divide a 3x3 kernel into a 3x1 and a 1x3 kernel, like so:
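As a minimal sketch (the Sobel-style kernel here is illustrative), a 3x3 kernel that factors into a 3x1 times a 1x3 kernel can be applied as two cheaper passes with the same result:

import numpy as np
from scipy.signal import correlate2d

col = np.array([[1], [2], [1]])        # 3x1 kernel
row = np.array([[-1, 0, 1]])           # 1x3 kernel
kernel = col @ row                     # the full 3x3 kernel (outer product)

img = np.random.rand(8, 8)
direct = correlate2d(img, kernel, mode='valid')
two_pass = correlate2d(correlate2d(img, col, mode='valid'), row, mode='valid')
print(np.allclose(direct, two_pass))   # True: 6 multiplies per pixel, not 9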
Normal Convolution
Depthwise Convolution
Each 5x5x1 kernel iterates over one channel of the image (note: one channel, not all channels), taking the scalar product of every 25-pixel group and giving an 8x8x1 image. Stacking these images together creates an 8x8x3 image.
Pointwise Convolution
The pointwise convolution then uses a 1x1 kernel that spans all channels: a 1x1x3 kernel applied at every position of the 8x8x3 image produces an 8x8x1 output, and stacking k such kernels yields an 8x8xk result.
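A minimal Keras sketch of the depthwise + pointwise pair (the 12x12x3 input is an assumption consistent with the 8x8 outputs above; the 16 pointwise filters are illustrative):

from tensorflow import keras
from tensorflow.keras import layers

model = keras.Sequential([
    keras.Input(shape=(12, 12, 3)),
    layers.DepthwiseConv2D(5),  # one 5x5x1 kernel per channel -> 8 x 8 x 3
    layers.Conv2D(16, 1),       # pointwise: 16 1x1x3 kernels  -> 8 x 8 x 16
])
model.summary()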