Deep Convolutional Models: ResNet, AlexNet, InceptionNet and Others (12-09-2024)

AlexNet Solved Example Problems

AlexNet is a pioneering CNN architecture introduced by Alex Krizhevsky, Ilya Sutskever, and Geoffrey Hinton in 2012. It significantly outperformed previous models in the ImageNet Large Scale Visual Recognition Challenge (ILSVRC).

Problem 1: AlexNet Architecture

Problem Statement:
Describe the architecture of AlexNet, including the number and types of
layers, and the size of the input it accepts.

Solution:
AlexNet consists of 8 learnable layers: 5 convolutional layers followed by 3 fully connected layers.

Input: 227x227x3 RGB image

1. Conv1: 96 filters of size 11x11 with stride 4, followed by ReLU activation and max pooling
2. Conv2: 256 filters of size 5x5, followed by ReLU activation and max pooling
3. Conv3: 384 filters of size 3x3, followed by ReLU activation
4. Conv4: 384 filters of size 3x3, followed by ReLU activation
5. Conv5: 256 filters of size 3x3, followed by ReLU activation and max pooling
6. FC6: 4096 neurons with ReLU activation and dropout
7. FC7: 4096 neurons with ReLU activation and dropout
8. FC8 (output): 1000 neurons with softmax activation (for the 1000 ImageNet classes)

Explanation:
AlexNet's architecture was designed to process large-scale image
datasets. The convolutional layers extract features from the input image,
while the fully connected layers interpret these features for classification.

The use of ReLU activations and dropout was innovative at the time and helped improve training speed and reduce overfitting.
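
To make the layer stack concrete, it can be written down in a few lines of PyTorch. This is a minimal sketch, not a reference implementation: the padding values (2 for Conv2, 1 for Conv3-Conv5) and the 3x3, stride-2 max pooling are standard AlexNet hyperparameters assumed here, since the problem statement does not list them.

import torch
import torch.nn as nn

class AlexNet(nn.Module):
    def __init__(self, num_classes=1000):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(3, 96, kernel_size=11, stride=4),    # Conv1
            nn.ReLU(inplace=True),
            nn.MaxPool2d(kernel_size=3, stride=2),
            nn.Conv2d(96, 256, kernel_size=5, padding=2),  # Conv2 (padding assumed)
            nn.ReLU(inplace=True),
            nn.MaxPool2d(kernel_size=3, stride=2),
            nn.Conv2d(256, 384, kernel_size=3, padding=1), # Conv3
            nn.ReLU(inplace=True),
            nn.Conv2d(384, 384, kernel_size=3, padding=1), # Conv4
            nn.ReLU(inplace=True),
            nn.Conv2d(384, 256, kernel_size=3, padding=1), # Conv5
            nn.ReLU(inplace=True),
            nn.MaxPool2d(kernel_size=3, stride=2),
        )
        self.classifier = nn.Sequential(
            nn.Dropout(p=0.5),
            nn.Linear(256 * 6 * 6, 4096),  # FC6
            nn.ReLU(inplace=True),
            nn.Dropout(p=0.5),
            nn.Linear(4096, 4096),         # FC7
            nn.ReLU(inplace=True),
            nn.Linear(4096, num_classes),  # FC8: raw scores; softmax is typically
        )                                  # applied by the loss (CrossEntropyLoss)

    def forward(self, x):
        x = self.features(x)     # 227x227x3 -> 6x6x256
        x = torch.flatten(x, 1)  # flatten to a 9216-dim vector
        return self.classifier(x)

out = AlexNet()(torch.randn(1, 3, 227, 227))
print(out.shape)  # torch.Size([1, 1000])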

Problem 2: Calculating Output Size of Conv1 Layer

Problem Statement:
Given an input image of size 227x227x3, calculate the output size of the
first convolutional layer (Conv1) in AlexNet.
Solution:
To calculate the output size, we use the formula:
Output size = (N - F + 2P) / S + 1

Where:
N = Input size
F = Filter size
P = Padding
S = Stride

For Conv1 in AlexNet:


N = 227
F = 11
P = 0 (no padding)
S = 4

Output size = (227 - 11 + 2(0)) / 4 + 1
            = 216 / 4 + 1
            = 54 + 1
            = 55

Therefore, the output size of Conv1 is 55x55x96 (96 is the number of filters).

Explanation:
This calculation shows how the spatial dimensions are reduced in the first
convolutional layer due to the large filter size (11x11) and stride (4). The
depth becomes 96 because there are 96 filters in this layer.
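
As a quick check, the formula translates directly into Python (a minimal sketch; the helper name is ours):

def conv_output_size(n, f, p, s):
    # Output size = (N - F + 2P) / S + 1
    return (n - f + 2 * p) // s + 1

print(conv_output_size(227, 11, 0, 4))  # 55 -> Conv1 output is 55x55x96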

Problem 3: Number of Parameters in Conv1 Layer

Problem Statement:
Calculate the number of learnable parameters in the first convolutional
layer (Conv1) of AlexNet.

Solution:
To calculate the number of parameters, we need to consider both the
weights and biases:

1. Weights:

   o Each filter is 11x11x3 (3 for the RGB channels)
   o There are 96 such filters
   o Total weights = 11 * 11 * 3 * 96 = 34,848

2. Biases:

   o One bias per filter
   o Total biases = 96

Total parameters = Weights + Biases
                 = 34,848 + 96
                 = 34,944

Explanation:
Each filter in the convolutional layer has weights for each pixel in its
receptive field (11x11) for each input channel (3 for RGB). Additionally,
each filter has one bias term. The large number of parameters in this layer
contributes to AlexNet's ability to learn complex features from the input
images.
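
The same bookkeeping in a few lines of Python (a sketch with an illustrative helper name):

def conv_params(filter_size, in_channels, num_filters):
    # Weights: one per kernel cell, per input channel, per filter;
    # plus one bias per filter.
    weights = filter_size * filter_size * in_channels * num_filters
    return weights + num_filters

print(conv_params(11, 3, 96))  # 34944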

Problem 4: Receptive Field Size in Later Layers

Problem Statement:
Calculate the receptive field size of a neuron in the Conv5 layer of AlexNet
with respect to the input image.

Solution:
To calculate the receptive field, we need to work backwards from Conv5
to the input:

1. Conv5: 3x3 filter
2. Conv4: 3x3 filter
3. Conv3: 3x3 filter
4. Conv2: 5x5 filter
5. Conv1: 11x11 filter with stride 4

Calculation (applying r_in = (r_out - 1) * stride + kernel at each layer, starting from a single Conv5 output neuron):

• Conv5: (1 - 1)*1 + 3 = 3
• Conv4: (3 - 1)*1 + 3 = 5
• Conv3: (5 - 1)*1 + 3 = 7
• Conv2: (7 - 1)*1 + 5 = 11
• Conv1: (11 - 1)*4 + 11 = 51

The receptive field size is 51x51 pixels in the original input image.

Explanation:
This calculation shows how neurons in deeper layers of the network have
a larger receptive field in the original image. This allows later layers to
capture more complex and larger-scale features of the input image.
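
The same recursion can be run forwards in Python (a sketch, assuming (kernel, stride) pairs listed in forward order). Adding the two 3x3, stride-2 max-pooling layers that sit between Conv1 and Conv5 in the actual architecture shows how much the conv-only simplification understates the true receptive field:

def receptive_field(layers):
    # layers: (kernel_size, stride) pairs in forward order
    rf, jump = 1, 1
    for k, s in layers:
        rf += (k - 1) * jump  # each layer widens the field by (k-1) input steps
        jump *= s             # strides compound the step between adjacent neurons
    return rf

conv_only = [(11, 4), (5, 1), (3, 1), (3, 1), (3, 1)]
print(receptive_field(conv_only))     # 51, as computed above

with_pooling = [(11, 4), (3, 2), (5, 1), (3, 2), (3, 1), (3, 1), (3, 1)]
print(receptive_field(with_pooling))  # 163 once the pooling layers are included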

Problem 5: Impact of ReLU Activation

Problem Statement:
Explain the purpose and impact of using ReLU (Rectified Linear Unit)
activation functions in AlexNet.
Solution:
ReLU activation functions in AlexNet serve several important purposes:

1. Non-linearity: ReLU introduces non-linearity into the network, allowing it to learn complex patterns.

2. Faster training: unlike sigmoid or tanh, ReLU does not saturate for positive inputs, which greatly reduces the vanishing gradient problem and allows faster training of deep networks.

3. Sparsity: ReLU can lead to sparse activations (many neurons output zero), which can be beneficial for feature learning.

4. Computational efficiency: ReLU is simple to compute (max(0, x)), making it faster than other activation functions.

Impact:

• Improved training speed: AlexNet trained several times faster with ReLU than with tanh activation.
• Better performance: the use of ReLU contributed to AlexNet's superior performance in the ILSVRC 2012 competition.
• Deeper networks: ReLU made it practical to train deeper networks with far less trouble from vanishing gradients.

Explanation:
The introduction of ReLU in AlexNet was a key innovation that helped
overcome limitations of previous activation functions. It allowed for the
effective training of deeper networks and contributed significantly to
AlexNet's breakthrough performance in image classification tasks.
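
For reference, both ReLU and its gradient take one elementwise operation each, which is the computational-efficiency point above (a NumPy sketch):

import numpy as np

def relu(x):
    # ReLU: elementwise max(0, x)
    return np.maximum(0.0, x)

x = np.array([-2.0, -0.5, 0.0, 1.5, 3.0])
print(relu(x))                # [0.  0.  0.  1.5 3. ]
print((x > 0).astype(float))  # gradient: 1 where x > 0, else 0 (no saturation for x > 0)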
