Building Convolutional Neural Networks For Image Classification Slides

1. Convolutional neural networks (CNNs) are commonly used for image classification. They contain alternating convolutional and pooling layers, followed by fully connected layers that output class probabilities.
2. CNNs apply filters to local regions of input images to extract features; zero-padding and stride size determine the size of the output feature maps.
3. Batch normalization is applied before the activation function to help address the vanishing/exploding gradient problem and to speed up training of CNNs.


Building Convolutional Neural Networks for Image Classification

Janani Ravi
CO-FOUNDER, LOONYCORN
www.loonycorn.com
Narrow and wide convolution
Zero-padding and the feature map sizes
Overview of convolutional layers
Calculating feature map dimensions
Batch normalization of input images
Building and training a CNN for image classification
Changing model hyperparameters
Convolutional Neural Networks
Two Kinds of Layers in CNNs

Convolution: local receptive field
Pooling: subsampling of inputs
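
A minimal sketch (assuming PyTorch, with illustrative channel counts and image size) of both layer kinds in code:

import torch
import torch.nn as nn

conv = nn.Conv2d(in_channels=1, out_channels=8, kernel_size=3)  # 3x3 local receptive field
pool = nn.MaxPool2d(kernel_size=2)                              # subsample by a factor of 2

x = torch.randn(1, 1, 28, 28)   # one 28x28 single-channel image
features = conv(x)              # shape: (1, 8, 26, 26)
subsampled = pool(features)     # shape: (1, 8, 13, 13)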
Typical CNN Architecture

[Diagram: Convolutional (ReLU) -> Pooling -> Convolutional (ReLU)]

Alternating convolutional and pooling layers

Typical CNN Architecture

[Diagram: Convolutional (ReLU) -> Pooling -> Convolutional (ReLU)]

This entire set of layers is then fed into a regular, feed-forward NN
Typical CNN Architecture

[Diagram: CNN layers (convolutional with ReLU, pooling) feed into fully connected feed-forward layers, ending in a SoftMax prediction layer that emits P(Y = 0) ... P(Y = 9)]

This is the output layer, emitting probabilities
Typical CNN Architecture

[Diagram: input image -> CNN -> output probabilities P(Y = 0) ... P(Y = 9)]

Input is an image; outputs are probabilities
Feature Maps

[Diagram: a 3 x 3 filter
1 0 1
0 1 0
1 0 1
slides over the image pixels to produce the feature map]
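
A minimal sketch (assuming PyTorch; the 6 x 6 image values are random placeholders) of applying this exact filter to produce a feature map:

import torch
import torch.nn.functional as F

kernel = torch.tensor([[1., 0., 1.],
                       [0., 1., 0.],
                       [1., 0., 1.]]).reshape(1, 1, 3, 3)

image = torch.rand(1, 1, 6, 6)        # a single-channel 6x6 "image"
feature_map = F.conv2d(image, kernel) # each output value sums the pixels under the 1s
print(feature_map.shape)              # torch.Size([1, 1, 4, 4])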
Zero-padding, Stride Size
Narrow vs. Wide Convolution

[Diagram: input matrix, i.e. image, and its convolution result]

Narrow Convolution: little zero padding; output narrower than input
Wide Convolution: lots of zero padding; output wider than input
Without Zero Padding

Input matrix (6 x 6):
0    0    0    0    0    0
0.2  0.8  0    0.3  0.6  0
0.2  0.9  0    0.3  0.8  0
0.3  0.8  0.7  0.8  0.9  0
0    0    0    0.2  0.8  0
0    0    0    0.2  0.2  0

Convolution matrix (3 x 3 kernel):
1 0 1
0 1 0
1 0 1

Result: a 4 x 4 feature map, narrower than the 6 x 6 input
Zero Padding

The same 6 x 6 input matrix, padded with two rows and columns of zeros on every side, becomes 10 x 10.

Convolution matrix (3 x 3 kernel):
1 0 1
0 1 0
1 0 1

Result: an 8 x 8 feature map, wider than the original 6 x 6 input
Zero Padding

The same 6 x 6 input matrix, padded with three rows and columns of zeros on every side, becomes 12 x 12.

Convolution matrix (3 x 3 kernel):
1 0 1
0 1 0
1 0 1

Result: a 10 x 10 feature map
Zero Padding

With zero-padding, every element of the input matrix is passed into the filter
- You can decide the number of zero rows and columns to pad with
- Use it to get output larger than the input
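
A minimal sketch (assuming PyTorch) of the effect, reusing the 6 x 6 input and 3 x 3 kernel dimensions from the examples above:

import torch
import torch.nn as nn

x = torch.rand(1, 1, 6, 6)

narrow = nn.Conv2d(1, 1, kernel_size=3, padding=0)  # no zero-padding
wide = nn.Conv2d(1, 1, kernel_size=3, padding=2)    # two rings of zeros

print(narrow(x).shape)  # torch.Size([1, 1, 4, 4]) -- narrower than the input
print(wide(x).shape)    # torch.Size([1, 1, 8, 8]) -- wider than the input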


Stride Size

[Diagram: the 3 x 3 kernel slides over the 6 x 6 input matrix, advancing one column at a time (horizontal stride of 1) and one row at a time (vertical stride of 1)]

Stride size is an important hyperparameter in CNNs
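
A minimal sketch (assuming PyTorch) of how this hyperparameter changes the output, again with a 6 x 6 input and a 3 x 3 kernel:

import torch
import torch.nn as nn

x = torch.rand(1, 1, 6, 6)

stride1 = nn.Conv2d(1, 1, kernel_size=3, stride=1)  # kernel advances one pixel at a time
stride2 = nn.Conv2d(1, 1, kernel_size=3, stride=2)  # kernel skips every other position

print(stride1(x).shape)  # torch.Size([1, 1, 4, 4])
print(stride2(x).shape)  # torch.Size([1, 1, 2, 2])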
Batch Normalization
Training via Back Propagation

[Diagram: an ML-based classifier built from layers that detect pixels -> edges -> corners -> object parts; the error is fed back through an optimizer]
Vanishing and Exploding Gradients

Back propagation fails if
- gradients are vanishing
- gradients are exploding
Vanishing Gradient Problem

[Diagram: loss surface over weights W and bias b, from the initial value of loss toward the smallest value; the gradient becomes zero and stops changing before reaching the minimum]
Exploding Gradient Problem

[Diagram: the same loss surface; the gradient changes abruptly and "explodes" instead of descending to the smallest value of loss]
Coping with Vanishing/Exploding Gradients

- Proper initialization
- Non-saturating activation function
- Batch normalization
- Gradient clipping
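
As one concrete illustration, a minimal sketch (assuming PyTorch; the model and data here are placeholders) of gradient clipping inside a training step:

import torch
import torch.nn as nn

model = nn.Linear(10, 2)                       # placeholder model
optimizer = torch.optim.SGD(model.parameters(), lr=0.01)
loss_fn = nn.CrossEntropyLoss()

x = torch.randn(4, 10)                         # placeholder batch
y = torch.tensor([0, 1, 0, 1])

optimizer.zero_grad()
loss = loss_fn(model(x), y)
loss.backward()
# Rescale gradients so their overall norm never exceeds 1.0,
# preventing any single update from "exploding"
torch.nn.utils.clip_grad_norm_(model.parameters(), max_norm=1.0)
optimizer.step()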


Batch Normalization

Just before applying the activation function:
- First, "normalize" inputs
- Second, "scale and shift" inputs
Batch Normalization

"Normalize" inputs
- subtract the mean
- divide by the standard deviation

"Scale and shift" inputs
- scale = multiply by a constant
- shift = add a constant
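
A minimal sketch of these two steps written out by hand (assuming PyTorch tensors; gamma, beta, and the epsilon are illustrative names for the learned scale, the learned shift, and a small numerical-stability constant):

import torch

x = torch.randn(32, 8)                 # a batch of 32 examples, 8 features
gamma, beta = torch.ones(8), torch.zeros(8)

# Step 1: normalize -- subtract mean, divide by standard deviation
mean = x.mean(dim=0)
std = x.std(dim=0, unbiased=False)
x_hat = (x - mean) / (std + 1e-5)      # epsilon avoids division by zero

# Step 2: scale and shift -- multiply by gamma, add beta
out = gamma * x_hat + beta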
Batch Normalization

Supported in PyTorch

Many other benefits:
- allows a much larger learning rate
- reduces overfitting
- speeds up convergence of training
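
A minimal sketch of that PyTorch support: nn.BatchNorm2d placed between a convolution and its ReLU activation (channel counts are illustrative):

import torch
import torch.nn as nn

block = nn.Sequential(
    nn.Conv2d(1, 8, kernel_size=3),
    nn.BatchNorm2d(8),   # normalizes, then applies learned scale and shift
    nn.ReLU(),
)
out = block(torch.randn(16, 1, 28, 28))
print(out.shape)  # torch.Size([16, 8, 26, 26])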
Choice of Activation Function
A Neural Network

Once a neural network is trained, all edges have weights which help it make predictions
Operation of a Single Neuron

[Diagram: inputs X1 ... Xn, weighted by W1 ... Wn with bias b, pass through an affine transformation (Wx + b) and then an activation function (max(Wx + b, 0))]

Each neuron only applies two simple functions to its inputs


Operation of a Single Neuron

[Diagram: the same neuron]

The affine transformation alone can only learn linear relationships between the inputs and the output
Operation of a Single Neuron

[Diagram: the same neuron]

The affine transformation is just a weighted sum with a bias added: W1x1 + W2x2 + … + Wnxn + b

The weights and biases of individual neurons are determined during the training process
Operation of a Single Neuron

[Diagram: the same neuron]

The combination of the affine transformation and the activation function can learn any arbitrary relationship
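
A minimal sketch (assuming PyTorch; the weights and inputs are made-up numbers) of those two functions:

import torch

W = torch.tensor([0.5, -0.3, 0.8])   # weights W1 ... Wn
x = torch.tensor([1.0, 2.0, 3.0])    # inputs x1 ... xn
b = 0.1                              # bias

affine = torch.dot(W, x) + b         # the weighted sum with a bias added
output = torch.relu(affine)          # the activation: max(Wx + b, 0)
print(affine.item(), output.item())  # approximately 2.4 and 2.4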
Activation Function

ReLU, logit, tanh, step

Various choices of activation functions exist and drive the design of your neural network
Importance of Activation

The choice of activation function is crucial in determining performance
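
A minimal sketch (assuming PyTorch) evaluating the activation choices listed above on the same inputs:

import torch

z = torch.tensor([-2.0, -0.5, 0.0, 0.5, 2.0])

print(torch.relu(z))      # ReLU: max(z, 0)
print(torch.sigmoid(z))   # logit/sigmoid: squashes values into (0, 1)
print(torch.tanh(z))      # tanh: squashes values into (-1, 1)
print((z > 0).float())    # step: outputs 0 or 1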
Feature Map Size Calculations
Without Zero Padding

Input matrix (6 x 6):
0    0    0    0    0    0
0.2  0.8  0    0.3  0.6  0
0.2  0.9  0    0.3  0.8  0
0.3  0.8  0.7  0.8  0.9  0
0    0    0    0.2  0.8  0
0    0    0    0.2  0.2  0

Convolution matrix (3 x 3 kernel):
1 0 1
0 1 0
1 0 1

Result: a 4 x 4 feature map, narrower than the 6 x 6 input
Zero Padding

The same 6 x 6 input matrix, padded with two rows and columns of zeros on every side, becomes 10 x 10.

Convolution matrix (3 x 3 kernel):
1 0 1
0 1 0
1 0 1

Result: an 8 x 8 feature map, wider than the original 6 x 6 input
O = (W - K + 2P) / S + 1

Formula for dimension calculations
Handy in getting the dimensions of CNN layers right
O = (W - K + 2P) / S + 1

O = output dimension: height/width of the output
W = input dimension: height/width of the input image
K = kernel size: height/width of the kernel
P = padding (if any): may be zero
S = stride: how far the kernel advances in each step
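
A minimal sketch of this formula as a helper function (the function name is illustrative), using floor division as convolution layers do:

def output_dim(w, k, p=0, s=1):
    """Height/width of a feature map for input w, kernel k, padding p, stride s."""
    return (w - k + 2 * p) // s + 1

print(output_dim(6, 3))         # 4  -- no padding (narrow convolution)
print(output_dim(6, 3, p=2))    # 8  -- padding of 2 (wide convolution)
print(output_dim(6, 3, p=3))    # 10 -- padding of 3
print(output_dim(6, 3, s=2))    # 2  -- stride of 2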
Stride Size

[Diagram: the 3 x 3 kernel slides over the 6 x 6 input matrix, advancing one column at a time (horizontal stride of 1) and one row at a time (vertical stride of 1)]
O = (W - K + 2P) / S + 1

Formula for dimension calculations
Handy in getting the dimensions of CNN layers right
Demo

- Image classification using convolutional neural networks (CNNs)
- Hyperparameter tuning
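
A minimal sketch (assuming PyTorch; layer sizes are illustrative, not the demo's exact values) of the kind of model the demo builds: alternating convolution and pooling, batch normalization before the activation, then fully connected layers emitting class probabilities:

import torch
import torch.nn as nn

class SimpleCNN(nn.Module):
    def __init__(self, num_classes=10):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(1, 16, kernel_size=3, padding=1),
            nn.BatchNorm2d(16),
            nn.ReLU(),
            nn.MaxPool2d(2),                     # 28x28 -> 14x14
            nn.Conv2d(16, 32, kernel_size=3, padding=1),
            nn.BatchNorm2d(32),
            nn.ReLU(),
            nn.MaxPool2d(2),                     # 14x14 -> 7x7
        )
        self.classifier = nn.Sequential(
            nn.Flatten(),
            nn.Linear(32 * 7 * 7, 64),
            nn.ReLU(),
            nn.Linear(64, num_classes),          # logits for P(Y = 0) ... P(Y = 9)
        )

    def forward(self, x):
        return self.classifier(self.features(x))

model = SimpleCNN()
logits = model(torch.randn(8, 1, 28, 28))
probs = torch.softmax(logits, dim=1)             # probabilities per class
print(probs.shape)                               # torch.Size([8, 10])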
Narrow and wide convolution
Zero-padding and the feature map sizes
Summary of convolutional layers
Calculating feature map dimensions
Batch normalization of input images
Building and training a CNN for image classification
Changing model hyperparameters
