Convolutional Networks1

Uploaded by

jiriraymond65
Convolutional Networks

Computer Vision
Computer Vision Problems:
1. Image classification

The biggest challenge in computer vision is that the input can become
very large.

Even a small 64x64 image has 64 x 64 x 3 = 12288 input values, because each
pixel has 3 RGB channels.


Computer Vision
With a higher resolution, say 1000 x 1000, the image has 1 million pixels.


Computer Vision
• Considering the 3 RGB channels, the input will be 3 million dimensional.

• With 3 million input features, the input x is 3 million dimensional. If there
are 1000 hidden units in a fully connected layer, the weight matrix will be
1000 x 3 million in size.

• So the matrix will have 3 billion parameters, which is very large.
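
The arithmetic above can be checked with a short sketch. The function name dense_weight_count is a made-up helper for illustration, and biases are left out of the count:

```python
# Count the weights of a first fully connected layer for two image sizes.
# The hidden layer size of 1000 is the figure used in the slides.

def dense_weight_count(height, width, channels, hidden_units):
    """Number of weights in a fully connected layer (biases not counted)."""
    input_features = height * width * channels
    return input_features * hidden_units

small = dense_weight_count(64, 64, 3, 1000)      # 64x64 RGB image
large = dense_weight_count(1000, 1000, 3, 1000)  # 1000x1000 RGB image

print(small)  # 12288000
print(large)  # 3000000000
```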


Computer Vision
With that many parameters, it is difficult to get enough data to prevent the
network from overfitting.

The memory requirements to train a network with 3 billion parameters would
also be too high.

Computer vision still needs to work with large images.

To do that, we need convolutional networks.


Convolution
What is the Result of this Convolution?
Edge Detection
Different filters make it possible to detect horizontal and vertical edges.
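
A minimal sketch of edge detection with two filters follows. The filter values are an assumption (a common hand-picked choice), since the slide figures did not survive, and conv2d is a helper written here for illustration. It computes the sliding-window sum of products, which is what deep learning calls convolution.

```python
def conv2d(image, kernel):
    """Valid convolution (no padding, stride 1) of a 2-D list with a square kernel."""
    f = len(kernel)
    rows = len(image) - f + 1
    cols = len(image[0]) - f + 1
    out = []
    for i in range(rows):
        row = []
        for j in range(cols):
            # Sum of element-wise products between the kernel and the window.
            row.append(sum(image[i + a][j + b] * kernel[a][b]
                           for a in range(f) for b in range(f)))
        out.append(row)
    return out

# Assumed hand-picked filters (not taken from the slides).
vertical = [[1, 0, -1], [1, 0, -1], [1, 0, -1]]
horizontal = [[1, 1, 1], [0, 0, 0], [-1, -1, -1]]

# 6x6 image: bright left half (10), dark right half (0), so one vertical edge.
image = [[10, 10, 10, 0, 0, 0] for _ in range(6)]

v_out = conv2d(image, vertical)
h_out = conv2d(image, horizontal)
print(v_out[0])  # [0, 30, 30, 0]
print(h_out[0])  # [0, 0, 0, 0]
```

The vertical filter responds strongly in the middle columns, where the edge is, while the horizontal filter gives zero everywhere because this image has no horizontal edge.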
Edge Detection
We might not need to hand-pick these 9 numbers.

Let's treat the 9 numbers as parameters.

These can be learned using backpropagation.

The goal is to learn the parameters so that convolving the image with the
filter gives a good edge detector.
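
As a rough sketch of this idea, the 9 numbers can be learned by plain gradient descent on a squared-error loss. Everything here is an assumption for illustration: the training targets are produced by a known vertical-edge filter, the images are random noise, and the gradient is written out by hand rather than computed by a backpropagation library.

```python
import random

def conv2d(image, kernel):
    """Valid convolution (no padding, stride 1) of 2-D lists."""
    f = len(kernel)
    size = len(image) - f + 1
    return [[sum(image[i + a][j + b] * kernel[a][b]
                 for a in range(f) for b in range(f))
             for j in range(size)] for i in range(size)]

random.seed(0)
# Assumed "true" filter, used only to generate training targets.
target_filter = [[1, 0, -1], [1, 0, -1], [1, 0, -1]]
images = [[[random.gauss(0, 1) for _ in range(6)] for _ in range(6)]
          for _ in range(20)]
targets = [conv2d(img, target_filter) for img in images]

learned = [[0.0] * 3 for _ in range(3)]   # the 9 parameters, starting at zero
lr = 0.1                                  # learning rate (assumed value)
count = len(images) * 4 * 4               # total number of output values

for step in range(200):
    grad = [[0.0] * 3 for _ in range(3)]
    for img, tgt in zip(images, targets):
        out = conv2d(img, learned)
        for i in range(4):
            for j in range(4):
                err = out[i][j] - tgt[i][j]
                # Gradient of the mean squared error w.r.t. each filter weight.
                for a in range(3):
                    for b in range(3):
                        grad[a][b] += 2 * err * img[i + a][j + b] / count
    for a in range(3):
        for b in range(3):
            learned[a][b] -= lr * grad[a][b]

for row in learned:
    print([round(w, 2) for w in row])
```

In this toy setup the learned values end up very close to the filter that generated the targets. In a real network the targets are class labels rather than filter outputs, so the learned filters are whatever helps the final task.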
Edge Detection
The learned filter can turn out to be better at capturing the statistics of
the data:

a better edge detector,

possibly detecting edges not just vertical and horizontal but at different
orientations.
Padding
• A 6x6 image convolved with a 3x3 filter gives a 4x4 output.

• An nxn image convolved with an fxf filter gives an (n-f+1) x (n-f+1) output.

The disadvantages of this plain convolution are:

• 1. The output shrinks.
• 2. Information from the edges is thrown away, because edge pixels are covered
by fewer filter positions.
Padding

If you pad the original image with one extra row or column of pixels on every
side, meaning p = 1,

a 6x6 image becomes 8x8, and convolving it with the 3x3 filter gives a 6x6
output.
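
The 6x6 example can be sketched as follows. Padding with zeros is an assumption (the slide does not say which value is used), and pad and conv2d are helpers written for illustration.

```python
def pad(image, p):
    """Zero-pad a 2-D list with p rows/columns on every side."""
    width = len(image[0]) + 2 * p
    padded = [[0] * width for _ in range(p)]
    for row in image:
        padded.append([0] * p + list(row) + [0] * p)
    padded += [[0] * width for _ in range(p)]
    return padded

def conv2d(image, kernel):
    """Valid convolution (no padding, stride 1) of 2-D lists."""
    f = len(kernel)
    rows = len(image) - f + 1
    cols = len(image[0]) - f + 1
    return [[sum(image[i + a][j + b] * kernel[a][b]
                 for a in range(f) for b in range(f))
             for j in range(cols)] for i in range(rows)]

image = [[1] * 6 for _ in range(6)]    # 6x6 image of ones
kernel = [[1] * 3 for _ in range(3)]   # 3x3 filter of ones

padded = pad(image, 1)
out = conv2d(padded, kernel)

print(len(padded), len(padded[0]))  # 8 8
print(len(out), len(out[0]))        # 6 6
print(out[0][0], out[2][2])         # 4 9
```

The corner output is 4 because only four real pixels fall under the filter there; the rest of the window is padding.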
Padding

With padding p, the output size is (n+2p-f+1) x (n+2p-f+1)
Padding
• Padding can be of two types: Valid or Same
• Valid (no padding): nxn * fxf -> (n-f+1) x (n-f+1)
Padding
• Same: Pad so that the output size is the same as the input size
Output size: (n + 2p - f + 1) x (n + 2p - f + 1)
Setting n + 2p - f + 1 = n gives
p = (f-1)/2
What about a 5x5 filter?
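
A small sketch of the Same padding rule, which also answers the 5x5 question. The helper names same_padding and output_size are made up for illustration.

```python
def same_padding(f):
    """Padding p that keeps the output size equal to the input size (stride 1)."""
    if f % 2 == 0:
        raise ValueError("p = (f-1)/2 is only a whole number for odd f")
    return (f - 1) // 2

def output_size(n, f, p):
    """Output side length for an nxn image, fxf filter, padding p, stride 1."""
    return n + 2 * p - f + 1

for f in (3, 5, 7):
    p = same_padding(f)
    print(f, p, output_size(6, f, p))
# 3 1 6
# 5 2 6
# 7 3 6
```

So a 5x5 filter needs p = 2 to keep a 6x6 input at 6x6.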
Padding
In computer vision, f is usually odd:
• It makes Same padding simpler, since p = (f-1)/2 is a whole number
• The filter has a central pixel
Strided convolution
Padding
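For an nxn image convolved with an fxf filter, with padding p and stride s,
the output size is

floor((n + 2p - f)/s) + 1 x floor((n + 2p - f)/s) + 1

where the division is rounded down when it is not a whole number.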
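
The size formula can be wrapped in a small helper for checking layer shapes. This is a sketch: conv_output_size is a made-up name, and rounding the division down is the usual convention when the stride does not divide evenly.

```python
def conv_output_size(n, f, p=0, s=1):
    """Output side length for an nxn input, fxf filter, padding p, stride s."""
    return (n + 2 * p - f) // s + 1   # integer division rounds down

print(conv_output_size(6, 3))            # 4  (valid convolution)
print(conv_output_size(6, 3, p=1))       # 6  (same convolution)
print(conv_output_size(7, 3, s=2))       # 3  (stride 2)
print(conv_output_size(6, 3, s=2))       # 2  (3/2 rounded down, plus 1)
```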
Convolution
Convolutions are used to generate feature maps

An image + a filter = a feature map
CNN for Classification
Layers in Convolutional Neural Networks
Convolution Layer
It has a number of filters that perform the convolution operation
Every image is treated as a matrix of pixel values
Local Connectivity
Instead of the dot product of a feed-forward layer, apply convolution.

Every neuron in the hidden layer only sees a patch of the input from the
previous layer.

This is important for scaling the networks because, as you go deeper, a neuron
represents a much bigger area of the original input.
Local Connectivity
• Applying a window of weights
• Computing linear combinations
• Activating with non-linearity
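
The three steps above can be sketched for a single filter. The weights, bias and image values are made up for illustration, and ReLU is assumed as the non-linearity.

```python
def relu(x):
    """Non-linearity: negative values become zero."""
    return max(0.0, x)

def neuron_output(image, weights, bias, top, left):
    """One hidden neuron: it only sees the patch starting at (top, left)."""
    f = len(weights)
    z = bias
    for a in range(f):
        for b in range(f):
            # Linear combination of the patch with the window of weights.
            z += weights[a][b] * image[top + a][left + b]
    return relu(z)

def feature_map(image, weights, bias):
    """Slide the same window of weights over every patch of the image."""
    f = len(weights)
    rows = len(image) - f + 1
    cols = len(image[0]) - f + 1
    return [[neuron_output(image, weights, bias, i, j) for j in range(cols)]
            for i in range(rows)]

# 4x4 image with a bright stripe in the two middle columns.
image = [[0, 10, 10, 0] for _ in range(4)]
weights = [[1, 0, -1], [1, 0, -1], [1, 0, -1]]   # assumed filter values

fmap = feature_map(image, weights, bias=0.0)
print(fmap)  # [[0.0, 30.0], [0.0, 30.0]]
```

The left column of the output would be -30 before activation; the non-linearity sets it to zero.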
ReLU Layer
Once the feature maps are extracted, we introduce non-linearity by applying
ReLU, which replaces every negative value with zero
Pooling Layer
The rectified feature map now goes through a pooling layer
Pooling is a down-sampling operation that reduces the dimensionality of the
feature map
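
A minimal sketch of pooling, assuming max pooling with a 2x2 window and stride 2. The slides do not say which pooling type or window size is used, and the feature map values are made up.

```python
def max_pool(fmap, size=2, stride=2):
    """Keep the largest value in each size x size window of the feature map."""
    rows = (len(fmap) - size) // stride + 1
    cols = (len(fmap[0]) - size) // stride + 1
    return [[max(fmap[i * stride + a][j * stride + b]
                 for a in range(size) for b in range(size))
             for j in range(cols)] for i in range(rows)]

fmap = [[1, 3, 2, 1],
        [4, 6, 5, 0],
        [7, 2, 9, 8],
        [1, 0, 3, 4]]

pooled = max_pool(fmap)
print(pooled)  # [[6, 5], [7, 9]]
```

The 4x4 map is reduced to 2x2, so the next layer has a quarter as many values to process.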
Pooling Layer
Pooling is applied to the feature maps of the different filters, which identify
different parts of the image such as edges, corners, body, feathers, etc.
Flattening
Flattening is the process of converting all the resulting 2-dimensional
arrays from the pooled feature maps into a single long continuous linear
vector
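
Flattening can be sketched in a few lines. The two pooled feature maps below are made-up values for illustration.

```python
def flatten(feature_maps):
    """Concatenate every row of every 2-D feature map into one long vector."""
    return [value for fmap in feature_maps for row in fmap for value in row]

pooled_maps = [
    [[6, 5], [7, 9]],   # pooled map from a first filter
    [[1, 0], [2, 3]],   # pooled map from a second filter
]

vector = flatten(pooled_maps)
print(vector)  # [6, 5, 7, 9, 1, 0, 2, 3]
```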
Fully Connected Layer
The flattened vector from the pooling layer is fed as input to the Fully
Connected Layer to classify the image
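
A rough sketch of this last step, under several assumptions: the weights, biases and class names are hypothetical, and a softmax is used to turn the scores into probabilities, which the slides do not specify.

```python
import math

def fully_connected(x, weights, biases):
    """Each output is a dot product of the whole input vector with one weight row."""
    return [sum(w * v for w, v in zip(row, x)) + b
            for row, b in zip(weights, biases)]

def softmax(scores):
    """Turn raw scores into probabilities that sum to 1."""
    top = max(scores)                       # subtract the max for stability
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

x = [6, 5, 7, 9]                            # flattened vector (made-up values)
weights = [[0.1, 0.0, -0.1, 0.2],           # hypothetical weights, one row per class
           [-0.2, 0.1, 0.3, 0.0]]
biases = [0.0, 0.5]
class_names = ["bird", "not bird"]          # hypothetical labels

scores = fully_connected(x, weights, biases)
probs = softmax(scores)
predicted = class_names[probs.index(max(probs))]
print(predicted)
```

In a trained network the weights come from backpropagation; here they are fixed by hand only to show the data flow from vector to class.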