ML Lec 13 CNN

Convolutional Neural Networks

• A CNN is a network architecture for deep learning that
learns directly from data.
• CNNs are particularly useful for finding patterns in images to
recognize objects.
• They can also be quite effective for classifying non-image
data such as audio, time series, and signal data.
• CNNs are powerful tools for visual data analysis, leveraging
their layered architecture to automatically learn hierarchical
features, making them ideal for a variety of applications in
computer vision.
• The CNN consists of the following layers:
(i) Input Layer (ii) Convolutional Layer (iii) Pooling Layer (iv)
Fully Connected Layer, and (v) Output Layer
• CNNs have many applications, including:
– Image classification
– Object detection
– Image segmentation
– Facial recognition
– Medical image analysis
• Convolutional Neural Networks (CNNs) are a class of deep
learning models specifically designed for processing
structured grid data, such as images. The architecture of a
CNN is as follows:
1. Input Layer:
• Image Input: Typically, the input is a multi-dimensional
array representing an image (height, width, channels). For
example, a color image might have three channels (RGB).
2. Convolutional Layers: These perform the following tasks:
• Convolution Operation: The core building block of CNNs.
Filters (kernels) slide over the input image and compute
dot products to create feature maps.
• Activation Function: A non-linear function such as ReLU
(Rectified Linear Unit) is usually applied to the feature
maps to introduce non-linearity.
Convolutional Layer
• In a convolutional neural network, the kernel
is nothing but a filter that is used to extract
the features from the images.
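The sliding dot product described above can be sketched in a few lines of plain Python. This is an illustrative sketch, not a library implementation: the function names `conv2d_valid` and `relu`, the toy image, and the hand-picked kernel are all assumptions made for the example. In a real CNN the kernel values are learned during training.

```python
def conv2d_valid(image, kernel, stride=1):
    """Slide `kernel` over `image` (no padding) and return the feature map."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = (len(image) - kh) // stride + 1
    out_w = (len(image[0]) - kw) // stride + 1
    feature_map = []
    for i in range(out_h):
        row = []
        for j in range(out_w):
            # Dot product between the kernel and the image patch under it.
            total = 0.0
            for u in range(kh):
                for v in range(kw):
                    total += image[i * stride + u][j * stride + v] * kernel[u][v]
            row.append(total)
        feature_map.append(row)
    return feature_map


def relu(feature_map):
    """Apply ReLU element-wise: negative responses become zero."""
    return [[max(0.0, x) for x in row] for row in feature_map]


# Toy 4x4 image with a vertical edge between columns 1 and 2.
image = [
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [0, 0, 1, 1],
]
# Hand-picked 2x2 kernel that responds to a dark-to-bright step.
kernel = [
    [-1, 1],
    [-1, 1],
]
feature_map = relu(conv2d_valid(image, kernel))
print(feature_map)  # the middle column lights up where the edge is
```

The feature map is non-zero only in the column where the kernel straddles the edge, which is the sense in which a kernel "extracts" a feature.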
Usefulness of Convolutional Layer
• Its uses are manifold:
1. Feature Extraction
• Local Patterns: Convolutional layers are designed
to detect local patterns in the input data (e.g.,
edges, textures, and shapes in images) through
learned filters (kernels).
• Hierarchical Features: As the network deepens,
these layers can capture increasingly complex
features, transitioning from simple edges to more
abstract representations (like shapes and objects).
2. Parameter Sharing
• The same filter is applied across the entire input,
allowing the model to learn features that are invariant
to spatial location.
• This reduces the number of parameters compared to
fully connected layers, making the model more efficient
and easier to train.
3. Translation Invariance
• By applying convolutional filters across the input, CNNs
become more robust to the position of features within
the image. This helps in recognizing objects regardless
of their location.
4. Dimensionality Reduction
• While convolutional layers can maintain or reduce spatial
dimensions through techniques like stride and padding, they also
create feature maps that can summarize important information,
allowing subsequent layers to operate on a more compact
representation.
5. Activation Functions (Non-Linearity)
• After the convolution operation, activation functions (commonly
ReLU) are applied to introduce non-linearity. This allows the
network to learn more complex patterns and relationships.
6. Preprocessing of Input
• Learned Filtering: Early convolutional layers act as learned
preprocessing filters on the raw input, making it easier for the
later layers to learn useful representations.
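Two of the points above, the effect of stride and padding on spatial size and the saving from parameter sharing, can be made concrete with a little arithmetic. The sketch below uses the standard output-size relation floor((n + 2·pad − f) / stride) + 1; the helper names and the 32×32 example are assumptions chosen for illustration.

```python
def conv_output_size(n, f, stride=1, pad=0):
    """Spatial output size for input size n and filter size f."""
    return (n + 2 * pad - f) // stride + 1


def conv_params(f, depth, num_filters):
    """Each filter has f*f*depth weights plus one bias, reused at every position."""
    return (f * f * depth + 1) * num_filters


def dense_params(num_inputs, num_outputs):
    """A fully connected layer needs one weight per input-output pair plus one bias per output."""
    return (num_inputs + 1) * num_outputs


# Hypothetical example: 32x32 grayscale input, six 5x5 filters.
out = conv_output_size(32, 5)                 # same filter, stride 1, no padding
same = conv_output_size(32, 5, pad=2)         # padding 2 keeps the size at 32
halved = conv_output_size(32, 5, stride=2, pad=2)

conv_count = conv_params(5, 1, 6)
# A dense layer producing the same number of outputs (out * out * 6 values).
dense_count = dense_params(32 * 32, out * out * 6)

print(out, same, halved)        # 28 32 16
print(conv_count, dense_count)  # 156 4821600
```

Under these assumptions the convolutional layer uses 156 parameters where a dense layer with the same output shape would need about 4.8 million.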
Pooling
3. Pooling Layer
• Purpose: Reduce the spatial dimensions of the feature
maps, which decreases the number of parameters and
computation in the network.
• Types:
(i) Max Pooling: Takes the maximum value from a defined
window. It retains the most prominent features, making it
effective for detecting edges and textures.
(ii) Average Pooling: Takes the average value from a defined
window. It provides a smoother feature map and can be less
sensitive to noise than max pooling. It is sometimes used in cases
where we want to retain more contextual information.
(iii) Global Average Pooling

• It takes the average of the entire feature map, producing
a single output for each feature map. It is helpful for
reducing the size of feature maps and can be effective in
classification tasks. It is generally used in the final layers
of CNN architectures before the output layer.
(iv) Global Max Pooling
• It takes the maximum value from the entire feature map.
Similar to global average pooling it is helpful for reducing
the size of feature maps but it retains the most
significant feature. It is also used before the output layer,
especially when spatial resolution is less important.
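The four pooling variants above differ only in the window they look at and in how they summarize it. A minimal plain-Python sketch follows; the function names `pool2d` and `global_pool` and the 4×4 feature map are made up for the example.

```python
def pool2d(feature_map, window=2, stride=2, mode="max"):
    """Max or average pooling over window x window patches, no padding."""
    out_h = (len(feature_map) - window) // stride + 1
    out_w = (len(feature_map[0]) - window) // stride + 1
    pooled = []
    for i in range(out_h):
        row = []
        for j in range(out_w):
            patch = [
                feature_map[i * stride + u][j * stride + v]
                for u in range(window)
                for v in range(window)
            ]
            row.append(max(patch) if mode == "max" else sum(patch) / len(patch))
        pooled.append(row)
    return pooled


def global_pool(feature_map, mode="avg"):
    """Collapse a whole feature map to a single number."""
    values = [x for row in feature_map for x in row]
    return max(values) if mode == "max" else sum(values) / len(values)


fmap = [
    [1, 3, 2, 0],
    [5, 7, 4, 6],
    [0, 2, 8, 4],
    [2, 4, 0, 4],
]
max_pooled = pool2d(fmap, mode="max")   # [[7, 6], [4, 8]]
avg_pooled = pool2d(fmap, mode="avg")   # [[4.0, 3.0], [2.0, 4.0]]
gap = global_pool(fmap, mode="avg")     # 3.25
gmp = global_pool(fmap, mode="max")     # 8
```

With a 2×2 window and stride 2 the 4×4 map shrinks to 2×2, while the global variants reduce it to one value per feature map.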
• Pooling operations like max pooling and average pooling
generally reduce spatial resolution.
• However, the specific impact on spatial resolution depends
on how the pooling is configured (e.g., window size and
stride).
• Pooling Parameters
(i) Window Size: The size of the pooling filter (e.g., 2x2, 3x3).
(ii) Stride: The number of pixels the filter moves after each
operation. A larger stride reduces the output size more
significantly.
(iii) Padding: Adding extra pixels around the input feature map
before pooling. Typically, pooling layers don’t use padding,
but it can be useful in certain architectures.
• Instead of pooling, some architectures use
convolutions with strides greater than one to
achieve downsampling while learning features.
• Pooling can lead to a loss of spatial information,
which might be critical for some tasks (e.g.,
image segmentation).
• The choice between max pooling, average
pooling, or other pooling techniques often
depends on the specific task and desired
outcomes.
Convolutional Neural Networks

4. Fully Connected Layers

• After several convolutional and pooling layers,
- the pooled feature maps are flattened into a single
long vector, and
- high-level reasoning is done by fully connected layers,
where each neuron is connected to every neuron in the
previous layer.
5. Output Layer
• The output layer outputs the result, typically using a
softmax activation for classification tasks or linear
activation for regression tasks.
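The flatten, fully connected, and softmax steps can be sketched as follows. The helper names, the two tiny pooled feature maps, and the weight values are assumptions for illustration; in a trained network the weights and biases are learned.

```python
import math


def flatten(feature_maps):
    """Turn a list of 2-D feature maps into one flat vector."""
    return [x for fmap in feature_maps for row in fmap for x in row]


def dense(inputs, weights, biases):
    """Fully connected layer: every output neuron sees every input."""
    return [
        sum(w * x for w, x in zip(neuron_weights, inputs)) + b
        for neuron_weights, b in zip(weights, biases)
    ]


def softmax(logits):
    """Convert raw scores to probabilities that sum to 1."""
    shift = max(logits)  # subtracting the max improves numerical stability
    exps = [math.exp(z - shift) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


# Two hypothetical 2x2 pooled feature maps, flattened to a vector of length 8.
pooled = [[[1.0, 0.0], [0.0, 1.0]], [[0.5, 0.5], [0.5, 0.5]]]
x = flatten(pooled)

# Illustrative weights for a 3-class output layer.
weights = [[0.1] * 8, [0.2] * 8, [-0.1] * 8]
biases = [0.0, 0.0, 0.0]
probs = softmax(dense(x, weights, biases))
predicted_class = probs.index(max(probs))
```

For a regression task the softmax would simply be dropped, so that the output of `dense` is used directly (a linear activation).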
Regularization Techniques
1. Dropout:
• Randomly sets a fraction of input units to zero during training to
prevent overfitting.
• It acts as a mask that nullifies the contribution of some neurons
towards the next layer and leaves unmodified all others.
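The masking idea can be sketched as below. One assumption should be stated plainly: this sketch uses the common "inverted dropout" variant, which also rescales the surviving units by 1/(1 − rate) so that their expected value is unchanged, whereas the description above leaves the surviving units unmodified. The function name and the seed are illustrative.

```python
import random


def dropout(activations, rate, training, rng):
    """Inverted dropout: zero each unit with probability `rate` during training."""
    if not training or rate == 0.0:
        return list(activations)  # inference: all units are used unchanged
    keep = 1.0 - rate
    # Mask entry is 1 with probability `keep`, else 0.
    mask = [1.0 if rng.random() < keep else 0.0 for _ in activations]
    # Scaling kept units by 1/keep preserves the expected activation.
    return [a * m / keep for a, m in zip(activations, mask)]


rng = random.Random(0)  # fixed seed so the example is repeatable
activations = [1.0] * 1000
train_out = dropout(activations, rate=0.5, training=True, rng=rng)
eval_out = dropout(activations, rate=0.5, training=False, rng=rng)

dropped_fraction = train_out.count(0.0) / len(train_out)
print(round(dropped_fraction, 2))  # close to the dropout rate
```

A fresh mask is drawn on every call, so each training step sees a different subset of active units.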
Regularization Techniques: Dropout
• Dropout is a powerful technique used for regularization in
Convolutional Neural Networks (CNNs) and other neural
networks. Here’s how it helps:
• Prevents Overfitting: Dropout randomly removes some
neurons during training, which helps prevent the model
from depending too much on certain neurons or patterns.
This helps the model generalize better to unseen data.
• Encourages Redundancy: Since different neurons are
dropped out at each iteration, the network learns to
spread or distribute the learned representations across
many neurons rather than focusing on a few. This
redundancy can improve robustness.
• Effective Ensemble Method: During training, each mini-batch
sees a different subset of the network, effectively
training an ensemble of different models. At inference
time, all neurons are used, which can lead to better
performance.
• Stochasticity in Training: The randomness introduced by
dropout creates a form of stochastic training, which can
help escape local minima and lead to better optimization.
Conclusion: In practice, dropout is typically applied after
activation functions and before pooling layers in CNNs. By
adjusting the dropout rate (commonly between 20-50%),
we can control the balance between training speed and
generalization capability.
Regularization Techniques
2. Batch Normalization (BN): Normalizes the inputs of each
layer to improve stability and accelerate training.
(i) Internal Covariate Shift Reduction:
• Batch normalization addresses the issue of internal
covariate shift, which refers to changes in the
distribution of a layer's inputs during training. BN aims
to keep that distribution roughly the same throughout
training.
• By normalizing the inputs to each layer, BN reduces this
variability, allowing for more stable and faster training.
(ii) Faster Convergence: By stabilizing the learning process, BN
allows larger learning rates. This can lead to faster
convergence and reduced training time.
(iii) Improved Gradient Flow:
• Normalized inputs can lead to a smoother loss function,
which helps prevent gradients from becoming
too small or too large.
• This can improve gradient flow through the network
and mitigate issues like vanishing or exploding
gradients.
(iv) Regularization Effect:
• BN introduces some noise into the learning process by
normalizing over mini-batches.
• This can have a regularization effect, reducing the need
for other forms of regularization like dropout.
How Batch Normalization Works
• Batch normalization can be applied after the convolutional layer and before
the activation function (e.g., ReLU) or after the activation, which helps improve
the training process and allows for faster convergence while maintaining or
improving model performance.
• During training, BN normalizes each layer's
inputs using the mean and variance of the
current mini-batch, which helps the
model adapt to changes in the distribution of
inputs.
• But during inference, instead of recalculating
mean and variance from the mini-batch, the
model uses running averages (accumulated
statistics) that were computed during training.
• This is crucial for a few reasons:
(i) Consistency: Inference usually involves feeding
in data one sample at a time or in smaller
batches. Using the running averages ensures that
the normalization is consistent regardless of the
batch size.
(ii) Stability: If we used batch statistics during
inference, the mean and variance could fluctuate
significantly, especially with small batch sizes,
leading to inconsistent model performance.
How Running Averages Work

(i) Updating Running Averages: During training, after
calculating the mean and variance for each mini-batch,
we also maintain running averages:
• Mean Update:
running_mean = α · running_mean + (1 − α) · μ_B
• Variance Update:
running_variance = α · running_variance + (1 − α) · σ_B²
• Here, μ_B and σ_B² are the mean and variance of the current
mini-batch, and α is a hyperparameter (often set to around 0.9) that
determines how quickly the running averages update.
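The training and inference behaviour, together with the running-average updates, can be sketched for a single scalar feature. The class name, the initial running values (0 and 1), and the small constant `eps` are assumptions for the example, and the learnable scale and shift that BN normally applies after normalization are left out to keep it short.

```python
class BatchNorm1Feature:
    """Batch normalization for a single scalar feature."""

    def __init__(self, alpha=0.9, eps=1e-5):
        self.alpha = alpha            # weight kept on the old running value
        self.eps = eps                # avoids division by zero
        self.running_mean = 0.0
        self.running_variance = 1.0

    def forward(self, batch, training):
        if training:
            mean = sum(batch) / len(batch)
            var = sum((x - mean) ** 2 for x in batch) / len(batch)
            # Running-average updates, as in the formulas above.
            self.running_mean = self.alpha * self.running_mean + (1 - self.alpha) * mean
            self.running_variance = self.alpha * self.running_variance + (1 - self.alpha) * var
        else:
            # Inference: use the accumulated statistics, not the batch's own.
            mean, var = self.running_mean, self.running_variance
        return [(x - mean) / (var + self.eps) ** 0.5 for x in batch]


bn = BatchNorm1Feature(alpha=0.9)
for _ in range(200):
    # Every training batch here has mean 5 and variance 5.
    train_out = bn.forward([2.0, 4.0, 6.0, 8.0], training=True)

# At inference a batch of one is fine, because no batch statistics are needed.
single = bn.forward([5.0], training=False)
print(bn.running_mean, bn.running_variance, single)
```

After enough training steps the running statistics settle near the batch statistics, so a single sample equal to the mean is normalized to roughly zero.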
Convolutional Neural Networks
[Figure: example CNN architecture annotated with per-layer parameter counts; Conv-1: 156, Conv-2: 2416]
• Parameters in Conv-1: Each kernel of size F × F × D has F × F × D + 1
parameters, where the 1 accounts for the bias term of that filter.
With F = 5 and D = 1, this gives 5 × 5 × 1 + 1 = 26.
• Total Parameters in Conv-1: For K = 6 kernels, 26 × 6 = 156.
• Pooling layers have no parameters.

• Parameters in Conv-2: Each kernel of size F × F × D has F × F × D + 1
parameters (1 for the bias). With F = 5 and D = 6, which is the number
of feature maps produced by the previous convolutional layer, this
gives 5 × 5 × 6 + 1 = 151.
• Total Parameters in Conv-2: For K = 16 kernels, 151 × 16 = 2416.
• Pooling layers have no parameters.

• Parameters in 1st FC Layer: No. of input nodes = 5 × 5 × 16
= 400, and no. of output nodes = 120.
• Total Parameters in 1st FC Layer = (400 + 1) × 120 = 48120, where
the 1 added to 400 is the bias.

• Parameters in 2nd FC Layer: No. of input nodes = 120, and
no. of output nodes = 84.
• Total Parameters in 2nd FC Layer = (120 + 1) × 84 = 10164, where
the 1 added to 120 is the bias.

• Parameters in Output Layer: No. of input nodes = 84, and
no. of output nodes = 10.
• Total Parameters in Output Layer = (84 + 1) × 10 = 850, where
the 1 added to 84 is the bias.
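The layer-by-layer counts worked out above can be reproduced with two small helper functions. The function names are made up for the example, and the kernel size F = 5 is inferred from the per-kernel counts of 26 and 151 rather than stated explicitly on the slides.

```python
def conv_params(f, depth, num_kernels):
    """(F*F*D + 1) parameters per kernel, times K kernels."""
    return (f * f * depth + 1) * num_kernels


def fc_params(num_inputs, num_outputs):
    """(inputs + 1) parameters per output node; the +1 is the bias."""
    return (num_inputs + 1) * num_outputs


layer_params = {
    "Conv-1": conv_params(5, 1, 6),         # 156
    "Conv-2": conv_params(5, 6, 16),        # 2416
    "FC-1": fc_params(5 * 5 * 16, 120),     # 48120
    "FC-2": fc_params(120, 84),             # 10164
    "Output": fc_params(84, 10),            # 850
}
total = sum(layer_params.values())          # 61706

for name, count in layer_params.items():
    print(f"{name:7s} {count:6d}")
print(f"{'Total':7s} {total:6d}")
```

Pooling layers are omitted from the dictionary because they contribute no parameters.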

Total no. of parameters: 156 + 2416 + 48120 + 10164 + 850 = 61706

Architecture Variations
• Classic CNNs: LeNet and AlexNet, which laid
the groundwork for modern architectures.
• Modern CNNs: VGG, ResNet, and Inception,
which introduce more advanced techniques such
as deeper networks (VGG), skip connections
(ResNet), and varying kernel sizes (Inception).
AlexNet
Layer Name    Tensor Size    Weights    Biases    Parameters
Input Image   224x224x3
Conv-1        55x55x96
MaxPool-1     27x27x96
Conv-2        27x27x256
MaxPool-2     13x13x256
Conv-3        13x13x384
Conv-4        13x13x384
Conv-5        13x13x256
MaxPool-3     6x6x256
FC-1          4096x1
FC-2          4096x1
FC-3          1000x1
Output        1000x1
Total
Parameters in Different Layers in a CNN
• As we discussed earlier,
AlexNet

• By default, assume pad=0 and stride=1. There are mistakes in the
table. Correct it.

Layer Name    Tensor Size   Weights        Biases   Parameters
Input Image   224x224x3     0              0        0
Conv-1        54x54x96      34,848         96       34,944
MaxPool-1     26x26x96      0              0        0
Conv-2        26x26x256     6,14,400       256      6,14,656
MaxPool-2     13x13x256     0              0        0
Conv-3        13x13x384     8,84,736       384      8,85,120
Conv-4        13x13x384     13,27,104      384      13,27,488
Conv-5        13x13x256     8,84,736       256      8,84,992
MaxPool-3     6x6x256       0              0        0
FC-1          4096x1        3,77,48,736    4,096    3,77,52,832
FC-2          4096x1        1,67,77,216    4,096    1,67,81,312
FC-3          1000x1        40,96,000      1,000    40,97,000
Output        1000x1        0              0        0
Total                                               6,23,78,344
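The Weights, Biases, and Parameters columns can be checked with a short script. The filter sizes and filter counts below are the commonly cited AlexNet values and are an assumption here, since the table itself lists only tensor sizes and counts. The script checks parameter counts only and leaves the tensor-size exercise untouched.

```python
# (name, filter size F, input depth D, number of filters K)
# Assumed standard AlexNet values; not stated in the table itself.
conv_layers = [
    ("Conv-1", 11, 3, 96),
    ("Conv-2", 5, 96, 256),
    ("Conv-3", 3, 256, 384),
    ("Conv-4", 3, 384, 384),
    ("Conv-5", 3, 384, 256),
]
# (name, number of inputs, number of outputs)
fc_layers = [
    ("FC-1", 6 * 6 * 256, 4096),
    ("FC-2", 4096, 4096),
    ("FC-3", 4096, 1000),
]

params = {}
for name, f, d, k in conv_layers:
    weights, biases = f * f * d * k, k          # one bias per filter
    params[name] = (weights, biases, weights + biases)
for name, n_in, n_out in fc_layers:
    weights, biases = n_in * n_out, n_out       # one bias per output node
    params[name] = (weights, biases, weights + biases)

total = sum(p[2] for p in params.values())

for name, (w, b, p) in params.items():
    print(f"{name:7s} {w:>12,d} {b:>6,d} {p:>12,d}")
print(f"{'Total':7s} {total:>32,d}")
```

Note that the script prints with international digit grouping (for example 614,400), whereas the table uses Indian grouping (6,14,400); the values are the same.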
Transfer Learning
• Transfer learning is a machine learning technique
where a model developed for one task is reused as the
starting point for a model on a second task.
• This approach transfers the knowledge gained from the
first task to improve performance on the new task.
• It is particularly valuable in scenarios with limited data.
• It uses pre-trained models on new tasks to utilize
learned features and thus reduces training time and
improves performance, especially with limited data.
Key Concepts of Transfer Learning
• Pre-trained Models:
– Models are typically trained on large datasets (like
ImageNet for image classification).
– These models have learned useful features that can
be beneficial for various tasks.
• Fine-Tuning:
– After selecting a pre-trained model, we can fine-
tune it by retraining some or all of its layers on our
specific dataset.
– Fine-tuning adjusts the model's weights to better
suit the new task, improving accuracy.
Key Concepts of Transfer Learning

• Feature Extraction:
• We can use a pre-trained model as a fixed
feature extractor, where we remove the last
few layers and use the output from the
remaining layers as input for a new model.
• This approach is useful when we have limited
data for the new task, as it allows us to utilize
the learned representations without retraining
the pre-trained layers; only the new model is trained.
Steps in Transfer Learning:

(i) Select a Pre-trained Model: Choose a model
trained on a similar task or dataset.
(ii) Modify the Architecture: Adapt the model's
output layer to match the number of classes in the
new task.
(iii) Fine-Tuning: Retrain the entire model or specific
layers on the new dataset.
(iv) Train on New Data: Train the modified model on
the new dataset.
(v) Evaluate and Optimize: Assess performance and
make adjustments as necessary.
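The steps above can be illustrated with a deliberately tiny, library-free sketch of the feature-extraction variant: a frozen "backbone" whose weights are never updated, plus a new head trained on the new task. Everything here is hypothetical: `FROZEN_WEIGHTS` stands in for a real pre-trained network, the four-sample dataset is made up, and a logistic-regression head replaces the new output layer.

```python
import math

# Stand-in for a pre-trained backbone: fixed weights that are never updated.
FROZEN_WEIGHTS = [[1.0, -1.0], [0.5, 0.5]]


def extract_features(x):
    """Frozen feature extractor: linear map followed by ReLU."""
    return [max(0.0, sum(w * xi for w, xi in zip(row, x))) for row in FROZEN_WEIGHTS]


def predict(head, feats):
    """New task-specific head: logistic regression on the frozen features."""
    z = sum(w * f for w, f in zip(head["w"], feats)) + head["b"]
    return 1.0 / (1.0 + math.exp(-z))


def loss(head, data):
    """Mean binary cross-entropy of the head on (input, label) pairs."""
    total = 0.0
    for x, y in data:
        p = predict(head, extract_features(x))
        total -= math.log(p) if y == 1 else math.log(1.0 - p)
    return total / len(data)


def train_head(head, data, lr=0.5, epochs=200):
    """Gradient descent on the head only; FROZEN_WEIGHTS is left untouched."""
    for _ in range(epochs):
        for x, y in data:
            feats = extract_features(x)
            error = predict(head, feats) - y  # gradient of the loss w.r.t. z
            head["w"] = [w - lr * error * f for w, f in zip(head["w"], feats)]
            head["b"] -= lr * error
    return head


# Tiny made-up dataset for the new task: label 1 when x[0] > x[1].
data = [([2.0, 0.0], 1), ([3.0, 1.0], 1), ([0.0, 2.0], 0), ([1.0, 3.0], 0)]
head = {"w": [0.0, 0.0], "b": 0.0}

loss_before = loss(head, data)   # evaluate before training
train_head(head, data)           # train only the new head
loss_after = loss(head, data)    # evaluate again
print(loss_before, loss_after)
```

Fine-tuning would differ only in that some or all of the backbone weights would also receive gradient updates, usually with a smaller learning rate.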
Benefits:
• Reduced Training Time:
– Since the model starts with weights that are already
close to optimal, training time is significantly shortened
compared to training from scratch.
• Improved Performance:
– Transfer learning often leads to better performance on
the new task, especially when the dataset is small, as
the model benefits from previously learned features.
• Less Data Required:
– It is particularly useful in scenarios where labeled data
is scarce, as the pre-trained model has already learned
from a large amount of data.
Challenges:
• Domain Shift: If the new task is too different from
the original task, the transferred features may not
be as useful.
• Overfitting: Fine-tuning on a small dataset can lead
to overfitting, where the model learns noise instead
of general patterns.
• Overall, transfer learning is a powerful technique
that enhances efficiency and effectiveness in
various machine learning tasks, particularly in
domains with limited data.
Applications:
• Image Classification: Adapting models like VGG, ResNet,
or Inception for specific image classification tasks.
• Natural Language Processing: Using models like BERT
(Bidirectional Encoder Representations from
Transformers) or GPT (Generative Pre-trained
Transformer) for tasks such as sentiment analysis or text
summarization.
• Medical Imaging: Leveraging models trained on general
images to assist in detecting diseases from medical
scans.
Thank You
