Module 3

The document provides an overview of Convolutional Neural Networks (CNNs), detailing their structure, including input, convolutional, pooling, fully connected, and output layers, and their roles in image processing. It explains key concepts such as convolution operations, parameter sharing, and pooling as a means of reducing data size while preserving important information. Additionally, it discusses various CNN architectures, applications, and efficient convolution algorithms, highlighting the significance of CNNs in tasks like image classification and object detection.

Module – 3

Convolutional Neural Network (CNN)

Convolutional Neural Networks – convolution operation, motivation, pooling, Convolution and Pooling as an infinitely strong prior, variants of convolution functions, structured outputs, data types, efficient convolution algorithms.

Reena Thomas, Asst. Prof., CSE dept., CEMP

Convolutional Neural Network (CNN)

• It is a special type of deep learning model used to recognize patterns in images.

• It works by extracting important features like edges, shapes, and textures and
then making predictions.
• A CNN has different layers, each with a specific role in processing an image.

1. Input Layer

• The input layer is where the CNN receives the image for processing.
• What it Takes: An image in the form of a numerical matrix (grid of pixel values).
• Structure:
• A grayscale image has 1 channel (e.g., 28×28×1 for a black-and-white image).
• A color image has 3 channels (RGB: Red, Green, Blue), e.g., 32×32×3.
• Preprocessing:
• Rescaling: Pixel values are often normalized (e.g., between 0 and 1) for better
training.
• Reshaping: Images may be resized to a fixed shape (e.g., 224×224×3 for deep
CNNs).
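The rescaling and reshaping steps above can be sketched in NumPy. This is a minimal illustration, assuming a made-up 28×28 grayscale image with 8-bit pixels; the variable names are not from the slides.

```python
import numpy as np

# Hypothetical 28x28 grayscale image with 8-bit pixel values (0..255).
rng = np.random.default_rng(0)
image = rng.integers(0, 256, size=(28, 28), dtype=np.uint8)

# Rescaling: map pixel values from [0, 255] to [0.0, 1.0].
scaled = image.astype(np.float32) / 255.0

# Reshaping: add an explicit channel axis (height x width x channels).
x = scaled[..., np.newaxis]

print(x.shape)  # (28, 28, 1)
```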

2. Convolutional Layer (Feature Extraction)

• The most important layer in a CNN.

• It detects patterns in the image by applying small filters (kernels).
• A filter slides over the image and performs a mathematical operation (dot
product) to create a feature map (new representation of the image).
Feature Map = Input ∗ Kernel + Bias
• Example: A 3×3 filter detects small edges.
• Followed by a ReLU activation function (which removes negative values) to
keep only useful features.
• Think of this like looking at an image through different lenses to find
important details.

3. Pooling Layer (Downsampling)

• Reduces the size of the feature maps, making the model faster and more
efficient.
• Downsampling is the process of reducing the size of data while preserving
important information.
• Helps reduce overfitting by discarding fine spatial detail, so later layers see fewer inputs.
• Types of pooling:
• Max Pooling
• Average Pooling

4. Fully Connected (Dense) Layer

• Converts the extracted features into a single long vector (Flattening).

• Connects every neuron to every neuron in the previous layer (like a regular neural network).
• Helps the model understand high-level patterns.
• Uses activation functions (like ReLU) to introduce non-linearity.
• Think of this like a decision-making stage where all detected features combine
to classify an image.
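Flattening followed by a dense layer can be sketched as below. The feature-map shape (4×4×8), the number of units (10) and the random weights are assumptions chosen only for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical pooled feature maps: 4 x 4 spatial grid with 8 channels.
features = rng.standard_normal((4, 4, 8))

# Flattening: one long vector of 4 * 4 * 8 = 128 values.
flat = features.reshape(-1)

# Dense layer: every output unit has one weight per input value.
num_units = 10
W = rng.standard_normal((num_units, flat.size)) * 0.1
b = np.zeros(num_units)

hidden = np.maximum(0.0, W @ flat + b)  # ReLU non-linearity
```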

5. Output Layer (Prediction)

• Produces the final result based on the processed information.

• Uses an activation function based on the type of task:
• Softmax for multi-class classification (e.g., dog, cat, bird).
• Sigmoid for binary classification (e.g., cancerous vs. non-cancerous).
• Linear for continuous value predictions.
• Think of this like making the final decision based on all observations.
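The softmax and sigmoid output activations can be written in a few lines. This is a minimal sketch; the example scores are made up.

```python
import numpy as np

def softmax(z):
    """Turn a vector of scores into probabilities that sum to 1."""
    z = z - np.max(z)          # subtract the max for numerical stability
    e = np.exp(z)
    return e / e.sum()

def sigmoid(z):
    """Map a single score to a probability in (0, 1)."""
    return 1.0 / (1.0 + np.exp(-z))

scores = np.array([2.0, 1.0, 0.1])   # e.g. scores for dog, cat, bird
class_probs = softmax(scores)        # largest probability goes to the first class
binary_prob = sigmoid(0.0)           # 0.5
```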

Convolution Operation
The convolution operation in a CNN is a fundamental process that extracts spatial features from input data (such as images).

Step 1: Define the Input Image

Step 2: Define the Filter (Kernel)
Step 3: Perform Element-wise Multiplication
The filter moves over the input image, and at each position it:
i. Takes a region of the image (of the same size as the filter).
ii. Multiplies corresponding elements.
iii. Sums up the results.
iv. Places the summed value in the output matrix.
Step 4: Slide the Filter Over the Image
PROBLEM 3 : Let's take a smaller 4×4 input matrix and a 3×3 filter to perform
convolution.
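The four steps can be sketched for a 4×4 input and a 3×3 filter. The matrix values below are made up, since the slide's numbers are not reproduced in this text. As in most CNN libraries, the filter is applied without flipping (strictly a cross-correlation), which matches the "multiply corresponding elements" description.

```python
import numpy as np

def conv2d_valid(image, kernel, bias=0.0):
    """Slide the kernel over the image (stride 1, no padding)."""
    kh, kw = kernel.shape
    out_h = image.shape[0] - kh + 1
    out_w = image.shape[1] - kw + 1
    out = np.zeros((out_h, out_w))
    for i in range(out_h):
        for j in range(out_w):
            region = image[i:i + kh, j:j + kw]           # same size as the filter
            out[i, j] = np.sum(region * kernel) + bias   # multiply, then sum
    return out

# Step 1: input image (hypothetical values).
image = np.array([[1, 2, 3, 0],
                  [0, 1, 2, 3],
                  [3, 0, 1, 2],
                  [2, 3, 0, 1]], dtype=float)

# Step 2: filter (a simple vertical-edge detector).
kernel = np.array([[1, 0, -1],
                   [1, 0, -1],
                   [1, 0, -1]], dtype=float)

# Steps 3 and 4: multiply-and-sum at every position.
feature_map = conv2d_valid(image, kernel)   # [[-2, -2], [2, -2]]
activated = np.maximum(0.0, feature_map)    # ReLU: [[0, 0], [2, 0]]
```

A 4×4 input and a 3×3 filter give a 2×2 output, because the filter fits in only two positions along each axis.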

Motivation
• Convolution leverages three key ideas: sparse interactions, parameter sharing, and
equivariant representations.
• Sparse interactions occur because each output unit interacts with only a small
subset of input units, unlike traditional neural networks where all input units
influence every output unit.
• Parameter sharing allows the same kernel to be applied across different parts of the
input, reducing the number of parameters and improving efficiency.
• Equivariant representations ensure that a pattern detected in one region of the
input will also be recognized in another, making convolution useful for tasks like
image processing.
• Convolutional networks can handle variable-sized inputs, unlike traditional networks
that require fixed input dimensions.
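Equivariance to translation can be checked numerically: shifting the input and then filtering gives the same result as filtering and then shifting. The sketch below uses a 1D signal and a circular (wrap-around) shift so the equality is exact; this boundary handling is an assumption made for the demonstration.

```python
import numpy as np

def circular_correlate(x, k):
    """Apply kernel k at every position of x, wrapping around at the edges."""
    n = len(x)
    return np.array([sum(k[m] * x[(i + m) % n] for m in range(len(k)))
                     for i in range(n)])

x = np.array([0., 0., 1., 3., 1., 0., 0., 0.])
k = np.array([1., 0., -1.])

shift_then_filter = circular_correlate(np.roll(x, 2), k)
filter_then_shift = np.roll(circular_correlate(x, k), 2)

print(np.allclose(shift_then_filter, filter_then_shift))  # True
```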

• Traditional neural network layers rely on matrix multiplication, where each
output unit is connected to all input units, making connectivity dense.
• Convolutional layers achieve sparse connectivity (Sparse interactions or sparse
weights) by using small kernels, ensuring only a limited number of input units
affect each output unit.

When s is formed by convolution with a kernel of width 3, only three output units are affected by each input unit, such as x3.

In contrast, when s is formed by matrix multiplication, the connectivity is no longer sparse: every output unit is influenced by every input unit, including x3.
• We highlight one output unit, s3, and
also highlight the input units in x that
affect this unit.
• These units are known as the
receptive field of s3.
• When s is formed by convolution
with a kernel of width 3, only three
inputs affect s3.
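Sparse connectivity can be made concrete by writing a width-3 convolution over five inputs as a 5×5 weight matrix and inspecting its non-zero entries. The helper `conv_matrix`, the zero padding and the kernel values are assumptions for illustration. Indices are 0-based, so s3 and x3 are at index 2.

```python
import numpy as np

def conv_matrix(kernel, n):
    """Dense n x n matrix equivalent to a zero-padded 1D convolution layer."""
    half = len(kernel) // 2
    W = np.zeros((n, n))
    for i in range(n):                    # output unit s_i
        for m, w in enumerate(kernel):
            j = i + m - half              # input unit x_j
            if 0 <= j < n:
                W[i, j] = w
    return W

kernel = np.array([0.5, 1.0, 0.5])
W_conv = conv_matrix(kernel, 5)
W_dense = np.ones((5, 5))                 # stand-in for a fully connected layer

receptive_field_s3 = np.flatnonzero(W_conv[2])         # [1, 2, 3]
outputs_affected_by_x3 = np.flatnonzero(W_conv[:, 2])  # [1, 2, 3]
dense_inputs_of_s3 = np.count_nonzero(W_dense[2])      # 5
```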

• The receptive field of the units in the deeper layers of a convolutional network is
larger than the receptive field of the units in the shallow layers.
• This effect increases if the network includes architectural features like strided
convolution or pooling.
• This means that even though direct connections in a convolutional net are very
sparse, units in the deeper layers can be indirectly connected to all or most of the
input image.
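The growth of the receptive field with depth can be computed with a standard recurrence: each layer adds (kernel size − 1) × (product of earlier strides) input positions. The function below is a sketch for 1D stacks; its name and the example layer settings are illustrative.

```python
def receptive_field(layers):
    """layers: list of (kernel_size, stride) pairs, first layer first.

    Returns the receptive field size (in input units) after each layer.
    """
    rf, jump = 1, 1
    sizes = []
    for kernel_size, stride in layers:
        rf += (kernel_size - 1) * jump   # widen by the kernel's extra reach
        jump *= stride                   # later layers step further in the input
        sizes.append(rf)
    return sizes

stride1 = receptive_field([(3, 1), (3, 1), (3, 1)])  # [3, 5, 7]
strided = receptive_field([(3, 2), (3, 2), (3, 2)])  # [3, 7, 15]
```

With stride 1 the receptive field grows by 2 per layer; with stride 2 it roughly doubles per layer.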
Parameter Sharing
• Parameter sharing means using the same parameter for multiple functions in a
model, also known as "tied weights" because a weight applied at one input is tied
to its value elsewhere.
• Traditional Neural Networks: Each weight in the matrix is used exactly once per
computation—multiplied by a single input element and never reused.
• Convolutional Neural Networks (CNNs): Each kernel parameter is applied across all
positions of the input.
• Advantage: Instead of learning separate parameters for every location, CNNs learn
a single set of parameters, reducing the model’s complexity while improving
efficiency.
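The saving from parameter sharing can be estimated by counting weights. The sketch below compares a fully connected layer with a 3×3 convolutional layer on a 32×32×3 input, both producing an output of the same shape; the layer sizes are assumptions chosen for illustration.

```python
def dense_params(n_inputs, n_outputs):
    """One weight per (input, output) pair plus one bias per output."""
    return n_inputs * n_outputs + n_outputs

def conv_params(kernel_h, kernel_w, in_channels, out_channels):
    """One kernel per (input channel, output channel) pair plus one bias per filter."""
    return kernel_h * kernel_w * in_channels * out_channels + out_channels

n_units = 32 * 32 * 3                         # 3072 values in a 32x32 RGB image
dense_count = dense_params(n_units, n_units)  # 9,440,256
conv_count = conv_params(3, 3, 3, 3)          # 84
```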

• The black arrows indicate uses of the central element of a 3-element kernel in a convolutional model. Due to parameter sharing, this single parameter is used at all input locations.

• The single black arrow indicates the use of the central element of the weight matrix in a fully connected model. This model has no parameter sharing, so the parameter is used only once.

• Complex layer terminology: A CNN consists of a few complex layers, each made up of multiple stages. There is a direct one-to-one mapping between kernels (filters) and network layers, meaning each layer applies specific filters to detect patterns in the input.
• Simple layer terminology: A convolutional network consists of many simple layers, where each processing step is considered a separate layer. Some layers perform operations like activation or pooling without having their own parameters.
Pooling

A view of the middle of the output of a convolutional layer. The bottom row shows outputs of the nonlinearity. The top row shows the outputs of max pooling, with a stride of one pixel between pooling regions and a pooling region width of three pixels.

A view of the same network, after the input has been shifted to the right by one pixel. Every value in the bottom row has changed, but only half of the values in the top row have changed, because the max pooling units are sensitive only to the maximum value in the neighborhood, not its exact location.
Figure: Max pooling introduces invariance.
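The effect in the figure can be reproduced with a short 1D example: max pooling with width 3 and stride 1, applied to a row of detector outputs before and after a one-pixel shift to the right. The detector values, including the 0.3 that enters on the left after the shift, are made up for the demonstration.

```python
import numpy as np

def max_pool1d(x, width=3, stride=1):
    """Max over each window of `width` consecutive values."""
    n_out = (len(x) - width) // stride + 1
    return np.array([x[i * stride:i * stride + width].max()
                     for i in range(n_out)])

detector = np.array([0.0, 0.1, 1.0, 0.2, 0.1, 0.0])
shifted = np.array([0.3, 0.0, 0.1, 1.0, 0.2, 0.1])   # same pattern, one pixel right

pooled = max_pool1d(detector)          # [1.0, 1.0, 1.0, 0.2]
pooled_shifted = max_pool1d(shifted)   # [0.3, 1.0, 1.0, 1.0]

detector_changed = np.mean(detector != shifted)      # 1.0: every value changed
pooled_changed = np.mean(pooled != pooled_shifted)   # 0.5: only half changed
```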
• The use of pooling can be viewed as adding an infinitely strong prior that the function the layer learns must be invariant to small translations.
• When this assumption is correct, it can greatly improve the statistical efficiency of the network.
• Pooling over spatial regions produces invariance to translation.
• But if we pool over the outputs of separately parametrized convolutions, the features can learn which transformations to become invariant to.

Figure: Example of learned invariances.

• A pooling unit that pools over multiple features that are learned with separate
parameters can learn to be invariant to transformations of the input.
• Here we show how a set of three learned filters and a max pooling unit can learn to become invariant to rotation. All three filters are intended to detect a handwritten 5.
• Each filter attempts to match a slightly different orientation of the 5. When a 5
appears in the input, the corresponding filter will match it and cause a large activation
in a detector unit.
• The max pooling unit then has a large activation regardless of which detector unit was
activated.
• We show here how the network processes two different inputs, resulting in two
different detector units being activated.
• Max pooling over spatial positions is naturally invariant to translation; this
multichannel approach is only necessary for learning other transformations.

• Pooling summarizes the responses over a whole neighborhood.
• It is possible to use fewer pooling units than detector units, by reporting
summary statistics for pooling regions spaced k pixels apart rather than 1 pixel
apart.

Figure : Pooling with downsampling.

• Here we use max pooling with a pool width of three and a stride between
pools of two.
• This reduces the representation size by a factor of two, which reduces the
computational and statistical burden on the next layer.
• Note that the rightmost pooling region has a smaller size but must be
included if we do not want to ignore some of the detector units.
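Pooling with downsampling can be sketched as below, using a pool width of three and a stride of two, with a truncated region at the right edge. The detector values and the function name are assumptions for illustration.

```python
import numpy as np

def max_pool1d_downsample(x, width=3, stride=2):
    """Max pooling where the last region may be cut short at the right edge."""
    starts = range(0, len(x), stride)
    return np.array([x[s:s + width].max() for s in starts])

detector = np.array([0.1, 1.0, 0.2, 0.1, 0.0, 0.1])
pooled = max_pool1d_downsample(detector)   # [1.0, 0.2, 0.1]
```

Six detector units are summarized by three pooling units. The last region covers only two values (0.0 and 0.1) because the input ends there.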
• When the number of parameters in the next layer is a function of its input size, this reduction in the input size can also result in improved statistical efficiency and reduced memory requirements for storing the parameters.
• For many tasks, pooling is essential for handling inputs of varying size.

Convolution and Pooling as an
infinitely strong prior

Prior Probability Distribution

• Imagine you're guessing a fruit's weight. If you've seen similar fruits before, you can estimate the weight from past experience before measuring.
• This initial belief before seeing any data is called a prior probability distribution.
• In machine learning and statistics, we use a prior distribution to express what we
believe about the parameters of a model before looking at any data.
• Weak Prior (High Uncertainty)
• Wide and spread-out belief - We let the data decide most of the outcome.
• A weak prior allows the data to shape the model more freely.
• Eg. : A Gaussian (Normal) distribution with high variance
• Strong Prior (Low Uncertainty)
• Narrow and confident belief – It has a big influence on the final outcome.
• A strong prior influences the model more, making it less sensitive to new data.
• Eg. : A Gaussian distribution with low variance.
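The difference between a weak and a strong prior can be illustrated with the standard Gaussian case: a Gaussian prior on an unknown mean, combined with Gaussian measurement noise of known variance. The fruit weights, prior mean and variances below are made-up numbers.

```python
def posterior_mean(prior_mean, prior_var, data, noise_var):
    """Posterior mean of a Gaussian mean under a Gaussian prior.

    The result is a precision-weighted average of the prior mean
    and the sample mean.
    """
    n = len(data)
    sample_mean = sum(data) / n
    prior_precision = 1.0 / prior_var
    data_precision = n / noise_var
    return ((prior_precision * prior_mean + data_precision * sample_mean)
            / (prior_precision + data_precision))

weights = [150.0, 160.0, 155.0, 165.0]   # hypothetical measurements in grams

# Weak prior (high variance): the estimate follows the data (about 157.5 g).
weak = posterior_mean(100.0, 1e4, weights, noise_var=25.0)

# Strong prior (low variance): the estimate stays near the prior mean (100 g).
strong = posterior_mean(100.0, 0.01, weights, noise_var=25.0)
```

As the prior variance shrinks toward zero, the estimate ignores the data entirely, which is the limiting case the slides call an infinitely strong prior.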
An infinitely strong prior means certain parameter values are completely forbidden
(Zero Probability), no matter how much data supports them.

• In a neural network, this can mean forcing the weights of one hidden unit to be
identical to its neighbor.
• It can also mean setting most weights to zero, except in a small region assigned to
each unit.
• Convolutional layers apply this idea by restricting how weights are shared and used, effectively enforcing an infinitely strong prior on the model's parameters.
• This prior says that the function the layer learns should contain only local interactions and be equivariant to translation.
• Pooling acts as an infinitely strong prior, enforcing invariance to small translations.

• A convolutional net can be seen as a fully connected net with an infinitely strong
prior, helping us understand its behavior.
• Convolution and pooling may cause underfitting if the prior assumptions don’t
match the data.
• They are useful only when their assumptions hold true.

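• Some architectures apply pooling to some channels but not to others, to obtain both highly invariant features and features that will not underfit when the translation-invariance prior is wrong.

• In benchmarks of statistical learning performance, convolutional models should only be compared to other convolutional models, since models without convolution are not given the same built-in knowledge of spatial structure.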

CNN based architectures

• LeNet – LeNet-5 (named after Yann LeCun)

• AlexNet – Named after Alex Krizhevsky
• ZFNet – Zeiler and Fergus Network
• VGG – Visual Geometry Group (developed by Oxford's VGG team)
• GoogLeNet – Also known as Inception v1 (developed by Google)
• ResNet – Residual Network

Applications of CNN

• Image Classification – Identifying objects in images (e.g., ImageNet).

• Object Detection – Locating objects within images (e.g., YOLO).
• Face Recognition – Used in security systems and social media tagging.
• Medical Image Analysis – Detecting diseases in X-rays, MRIs, CT scans.
• Self-Driving Cars – Lane detection, pedestrian recognition.
• Satellite Image Processing – Land-use classification, weather prediction.
• Optical Character Recognition (OCR) – Reading handwritten and printed text.
• Gesture Recognition – Used in human-computer interaction and gaming.
• Deepfake Generation – Face swapping and synthetic media creation.
• Defect Detection in Manufacturing – Identifying flaws in products.
• Video Analysis – Action recognition, surveillance systems.
• Speech to Image Conversion – Generating images from spoken descriptions.

Variants of convolution functions
( https://youtu.be/CChuD_wD2UI?si=PzfL95olvlpZqyIb )

Structured outputs

• Convolutional networks can generate high-dimensional structured outputs, typically represented as a tensor.
• The output tensor S is emitted by a convolutional layer, where S_{i,j,k} represents the probability that pixel (j, k) belongs to class i.
• This enables pixel-wise labeling, allowing the model to create detailed
object masks that follow the exact outlines of objects.
• When used for single-object classification, the largest spatial dimension
reduction comes from pooling layers with large strides.
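Pixel-wise labeling from such a tensor can be sketched as follows. The class scores are random stand-ins for what a convolutional layer would emit, and the sizes (3 classes, 4×5 pixels) are assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)
num_classes, height, width = 3, 4, 5

# Hypothetical per-pixel class scores, indexed as [class i, row j, column k].
scores = rng.standard_normal((num_classes, height, width))

# Softmax over the class axis, so S[:, j, k] sums to 1 for every pixel.
shifted = scores - scores.max(axis=0, keepdims=True)
S = np.exp(shifted) / np.exp(shifted).sum(axis=0, keepdims=True)

# Pixel-wise labeling: pick the most probable class at each pixel.
label_map = S.argmax(axis=0)   # shape (height, width)
```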

A Recurrent Convolutional Network (RCN)
iteratively refines pixel labels. U extracts
features, V predicts labels, and W updates
predictions in later steps. The same
parameters are reused, making it recurrent.

Data types

• The data used with a convolutional network usually consists of several channels,
each channel being the observation of a different quantity at some point in
space or time.
• One advantage to convolutional networks is that they can also process inputs
with varying spatial extents.
• These kinds of input simply cannot be represented by traditional, matrix multiplication-based neural networks.
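The sketch below shows why a convolutional layer copes with inputs of different sizes: the same kernel slides over however many positions the input has, and a pooling step over the whole feature map gives a fixed-size result. The kernel, the inputs and the choice of a global max are assumptions for illustration.

```python
import numpy as np

kernel = np.array([1.0, -1.0])   # the same two weights are used for every input

def conv_then_global_max(x):
    """Filter a 1D input of any length, then pool over the whole feature map."""
    feature_map = np.correlate(x, kernel, mode="valid")   # length depends on x
    return feature_map.max()                              # always one number

short_input = np.array([1.0, 4.0, 2.0])
long_input = np.array([0.0, 5.0, 5.0, 1.0, 7.0])

out_short = conv_then_global_max(short_input)   # 2.0
out_long = conv_then_global_max(long_input)     # 4.0
```

A fixed weight matrix, by contrast, can only multiply an input of the one length it was built for.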

Efficient Convolution Algorithms
• Modern convolutional networks often contain millions of units, requiring powerful
implementations that leverage parallel computing for efficiency.
• Frequency Domain Convolution: Convolution can be done by converting both the input
and kernel to the Fourier domain, performing point-wise multiplication, and then
converting back using the inverse Fourier transform.
• This is often faster than direct convolution for larger kernels.
• Separable Kernels: A d-dimensional kernel is called separable if it can be expressed as
the outer product of d vectors.
• Separable Convolution Efficiency: Instead of using a naïve approach, separable kernels
allow convolution to be broken into multiple 1D convolutions, reducing computational
cost.
• It also reduces the number of parameters needed to store the kernel.
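Frequency-domain convolution can be checked against direct convolution with NumPy's FFT routines. This is a minimal 1D sketch with made-up input values; both signals are zero-padded to the full output length so that the result is a linear rather than circular convolution.

```python
import numpy as np

def fft_convolve(x, k):
    """Linear convolution via the Fourier domain."""
    n = len(x) + len(k) - 1        # length of the full convolution
    X = np.fft.rfft(x, n)          # transform (zero-padded to length n)
    K = np.fft.rfft(k, n)
    return np.fft.irfft(X * K, n)  # point-wise product, then inverse transform

x = np.array([1.0, 2.0, 3.0, 4.0])
k = np.array([1.0, 0.0, -1.0])

direct = np.convolve(x, k)         # [1, 2, 2, 2, -3, -4]
via_fft = fft_convolve(x, k)

print(np.allclose(direct, via_fft))  # True
```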
• If the kernel is w elements wide in each dimension, naïve multidimensional convolution requires O(w^d) runtime and parameter storage, while separable convolution reduces this to O(w × d).
• Three approaches to computing a convolution:
• Naïve approach
• Convolution with separable kernel
• Recursive filtering

1. Naïve Convolution Approach
Convolution computes the weighted sum of an input signal using a kernel.

• For each input position n, multiply the flipped kernel with the corresponding input
values and sum the results.
• This method is slow and memory-intensive, making it inefficient for large data.
• More efficient methods like separable convolution and Fourier transforms are
used to speed up computation.
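The naïve approach can be written directly from its definition, y[n] = Σ_m k[m] · x[n − m]. Indexing the input as x[n − m] is what flips the kernel. The input values are made up, and the result is compared with NumPy's built-in `np.convolve`.

```python
import numpy as np

def naive_convolve(x, k):
    """Full 1D convolution: one multiply-add per (output, kernel tap) pair."""
    n_out = len(x) + len(k) - 1
    y = np.zeros(n_out)
    for n in range(n_out):
        for m in range(len(k)):
            if 0 <= n - m < len(x):       # skip positions outside the input
                y[n] += k[m] * x[n - m]   # x[n - m] walks the kernel in reverse
    return y

x = np.array([1.0, 2.0, 3.0, 4.0])
k = np.array([1.0, 0.0, -1.0])

result = naive_convolve(x, k)
print(np.allclose(result, np.convolve(x, k)))  # True
```

The double loop performs on the order of len(x) × len(k) multiplications, which is why it becomes slow for large inputs and kernels.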

2. Separable Convolution
• A convolution with a separable kernel can be decomposed into multiple lower-
dimensional convolutions, reducing computational cost.
• A kernel is separable if its matrix has rank 1. To construct such a kernel, take one row vector u = (u1, u2, u3, . . . , um) and one column vector v with v^T = (v1, v2, v3, . . . , vn), and convolve them together:
A = v ∗ u, where A(i, j) = vi · uj (an n×m matrix of rank 1).
• A represents the convolution kernel, while u and v are lower-dimensional convolution filters, so filtering with u and then with v gives the same result as filtering with A at lower cost.
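The equivalence between one 2D pass and two 1D passes can be verified numerically. The sketch below builds a rank-1 kernel as the outer product of a column filter v and a row filter u (the values happen to give a Sobel edge filter) and applies it to a random image. The helper `conv2d_valid` and all values are assumptions; filters are applied without flipping, as is usual in CNNs.

```python
import numpy as np

def conv2d_valid(image, kernel):
    """Slide the kernel over the image (stride 1, no padding, no flip)."""
    kh, kw = kernel.shape
    out = np.zeros((image.shape[0] - kh + 1, image.shape[1] - kw + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            out[i, j] = np.sum(image[i:i + kh, j:j + kw] * kernel)
    return out

u = np.array([1.0, 0.0, -1.0])    # row filter
v = np.array([1.0, 2.0, 1.0])     # column filter
A = np.outer(v, u)                # 3x3 kernel with A[i, j] = v[i] * u[j]

rng = np.random.default_rng(0)
image = rng.standard_normal((6, 6))

one_pass = conv2d_valid(image, A)                      # 9 multiplies per output
row_pass = conv2d_valid(image, u[np.newaxis, :])       # 1x3 filter along rows
two_pass = conv2d_valid(row_pass, v[:, np.newaxis])    # 3x1 filter along columns

print(np.linalg.matrix_rank(A))          # 1
print(np.allclose(one_pass, two_pass))   # True
```

For w = 3 and d = 2, the two-pass version needs 3 + 3 = 6 multiplications per output instead of 3 × 3 = 9, and stores 6 kernel values instead of 9.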

Challenges in Recursive Filtering
• Replication – Given a slow but accurate non-recursive filter, finding an equivalent
recursive version can be complex.
• Stability – The recursive formula may cause numerical instability, leading to
incorrect results or divergence.
• Accuracy – Small computational errors can accumulate over time, reducing the
precision of the filtering process.
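The replication and stability issues can be seen on the simplest recursive filter. The first-order form y[n] = a · y[n−1] + (1 − a) · x[n] used below is an assumption chosen for illustration; it is not taken from the slides.

```python
import numpy as np

def recursive_filter(x, a):
    """y[n] = a * y[n-1] + (1 - a) * x[n], starting from y[-1] = 0."""
    y = np.zeros(len(x))
    prev = 0.0
    for n, value in enumerate(x):
        prev = a * prev + (1.0 - a) * value   # reuse the previous output
        y[n] = prev
    return y

x = np.ones(50)

# Replication: with a = 0.5 the recursion matches a non-recursive filter
# whose kernel is h[m] = (1 - a) * a**m.
a = 0.5
recursive = recursive_filter(x, a)
h = (1.0 - a) * a ** np.arange(len(x))
non_recursive = np.convolve(x, h)[:len(x)]

# Stability: with |a| > 1 the same recursion diverges.
unstable = recursive_filter(x, 1.5)
```

The recursive form needs one multiply-add pair per sample regardless of how long the equivalent kernel is, but its output stays bounded only when |a| < 1.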

Architectures for Recursive Filters

• Streaming architectures – Process data in a stream with minimal memory.
• Parallel architectures – Speed up computation using mathematical optimizations.

Previous year questions

• What is the size of the feature map if the input image is 64x64, the convolutional
kernel size is 8x8, the stride is 3, and there is symmetric padding of 2 pixels on
each side?

• What are structured outputs in the context of CNNs?

• Illustrate the role of convolutional layers in CNNs with an example.
• Explain the significance of efficient convolution algorithms in CNNs.
• List and explain the common data types used in deep learning.
• Suggest a method to make convolution algorithm more efficient. Justify Your
answer.
• What is CNN, and how is it different from a fully connected neural network?

• Give two benefits of using convolutional layers instead of fully connected ones for
visual tasks.
• Describe the motivation behind convolution neural networks.
• Sketch the diagram of Convolutional Neural Network architecture and explain
different stages in detail.
• Explain in detail the variants of convolution functions.

• Assume an input volume of dimension 64 x 64 x 3. What are the dimensions of the
resulting volume after convolving a 5 x 5 kernel with zero padding, stride of 1 and 2
filters?
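For the feature-map size questions, the standard output-size formula, floor((n + 2p − k) / s) + 1 per spatial dimension, can be sketched as a small helper. The function name is illustrative, and reading "zero padding" in the last question as a padding of 0 pixels is an assumption.

```python
def conv_output_size(n, kernel, stride=1, padding=0):
    """Output length along one spatial dimension of a convolution."""
    return (n + 2 * padding - kernel) // stride + 1

# 64x64 input, 8x8 kernel, stride 3, padding 2 on each side.
size_a = conv_output_size(64, 8, stride=3, padding=2)   # 21, so 21 x 21

# 64x64x3 input, 5x5 kernel, stride 1, padding taken as 0, 2 filters.
size_b = conv_output_size(64, 5, stride=1, padding=0)   # 60, so 60 x 60 x 2
```

The number of output channels equals the number of filters, so the depth of the second result is 2 regardless of the 3 input channels.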
