
Experiment 3

Uploaded by

sakshishetty149
Copyright
© All Rights Reserved

Apply and implement deep convolutional neural network (DCNN) architectures for object detection and classification.
EXPERIMENT-3
Contents
CNN
Convolution Networks
Convolution Kernel
Features of Convolution
Convolutional Layer
3 Channel
Stages in a Convolutional Layer
Pooling: Max, Avg
Padding
Convolutional networks
Convolutional networks, also known as convolutional neural networks or CNNs, are a specialized kind of
neural network for processing data that has a known, grid-like topology.
Examples include time-series data, which can be thought of as a 1D grid of samples taken at
regular time intervals, and image data, which can be thought of as a 2D grid of pixels.
The name “Convolutional Neural Network” indicates that the network employs a
mathematical operation called convolution.
Convolution is a specialized kind of linear operation.
Convolutional networks
Convolutional networks are simply neural networks that use convolution in place
of general matrix multiplication in at least one of their layers.

$$ s(t) = \int x(a)\, w(t - a)\, da $$


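When this formula is read as a weighted average of the input x:
w needs to be a valid probability density function, otherwise the output is not a weighted average.
w needs to be 0 for all negative arguments, otherwise the output would depend on future inputs.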
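For sampled data the integral becomes a sum, s[t] = sum over a of x[a] w[t - a]. The following is a minimal sketch of that discrete form in plain Python. The function name `convolve_1d` and the sample values are illustrative assumptions, not part of the original slides; the weights are chosen to sum to 1 and to start at index 0, so they satisfy both conditions listed above.

```python
def convolve_1d(x, w):
    """Discrete convolution s[t] = sum_a x[a] * w[t - a] (full-length output)."""
    n = len(x) + len(w) - 1
    s = [0.0] * n
    for t in range(n):
        for a in range(len(x)):
            k = t - a                # index into the kernel w
            if 0 <= k < len(w):      # w is treated as 0 outside its support
                s[t] += x[a] * w[k]
    return s


# Illustrative input signal and kernel (assumed values).
x = [1.0, 2.0, 3.0, 4.0]
w = [0.5, 0.25, 0.25]                # non-negative and sums to 1

s = convolve_1d(x, w)
print(s)
```

Because w sums to 1, each fully overlapped output value is a weighted average of three neighboring input samples.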
Convolutional networks
In convolutional network terminology, the first
argument to the convolution (in this example, the
function x) is often referred to as the input, and the
second argument (in this example, the function w) as the
kernel.
The output is sometimes referred to as the feature
map.
Convolutional kernel

Padding: the input volume is padded with zeros in such a way that the conv layer does not alter the spatial dimensions of the input.
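The effect of zero padding on the output size can be checked with a short sketch. It assumes a square input of width n, a square kernel of odd width k, and the usual output-size relation (n + 2p - k) / stride + 1. The helper names `conv_output_size`, `same_padding` and `zero_pad` are hypothetical.

```python
def conv_output_size(n, k, padding=0, stride=1):
    """Width of the output for input width n and kernel width k."""
    return (n + 2 * padding - k) // stride + 1


def same_padding(k):
    """Padding that keeps the size unchanged (odd k, stride 1)."""
    return (k - 1) // 2


def zero_pad(image, p):
    """Surround a 2D list with p rows and columns of zeros."""
    width = len(image[0]) + 2 * p
    padded = [[0] * width for _ in range(p)]
    for row in image:
        padded.append([0] * p + list(row) + [0] * p)
    padded.extend([0] * width for _ in range(p))
    return padded


print(conv_output_size(6, 3))                           # no padding
print(conv_output_size(6, 3, padding=same_padding(3)))  # padded
print(zero_pad([[1, 2], [3, 4]], 1))
```

With a 6 x 6 input and a 3 x 3 kernel, the unpadded output is 4 x 4, while one ring of zeros keeps it at 6 x 6.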
Features of Convolution
Convolution leverages three important ideas that can
help improve a machine learning system:
◦ 1. Sparse interactions (also called sparse connectivity or sparse
weights):
◦ We need to store fewer parameters, which reduces the memory requirements of the model,
and
◦ improves its statistical efficiency.
Features of Convolution
◦ 2. Parameter sharing: the network has tied weights
because the value of the weight applied to one input is
tied to the value of a weight applied elsewhere.
◦ The parameter sharing used by the convolution operation
means that rather than learning a separate set of parameters
for every location, we learn only one set.
◦ 3. Equivariant Representations. To say a function is
equivariant means that if the input changes, the
output changes in the same way.
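For convolution, the relevant case is equivariance to translation: shifting the input shifts the output by the same amount. The sketch below checks this on a small 1D example, assuming a sliding dot product with one shared kernel and a signal that stays away from the boundaries. The names `correlate_1d` and `shift_right` and the sample values are illustrative assumptions.

```python
def correlate_1d(x, w):
    """Slide the same kernel w over x (one shared set of weights)."""
    k = len(w)
    return [sum(x[i + j] * w[j] for j in range(k))
            for i in range(len(x) - k + 1)]


def shift_right(x, n):
    """Shift a list n steps to the right, filling with zeros."""
    return [0] * n + x[:len(x) - n]


x = [0, 0, 1, 2, 1, 0, 0, 0]   # assumed input with a small bump
w = [1, 0, -1]                 # assumed kernel (3 shared weights)

y = correlate_1d(x, w)
y_shifted = correlate_1d(shift_right(x, 2), w)

print(y)
print(y_shifted)
print(y_shifted == shift_right(y, 2))
```

The same three weights are reused at every position, which is the parameter sharing described in point 2.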
Same pattern appears in different places:
They can be compressed!
Instead of training a lot of such “small” detectors, one for each location,
a single detector can “move around” the image.
An “upper-left beak” detector and a “middle beak” detector
can be compressed to the same parameters.
A convolutional layer
A CNN is a neural network with some convolutional layers
(and some other layers). A convolutional layer has a number
of filters that perform the convolution operation.

Beak detector: a filter

Convolution
6 x 6 image:
1 0 0 0 0 1
0 1 0 0 1 0
0 0 1 1 0 0
1 0 0 0 1 0
0 1 0 0 1 0
0 0 1 0 1 0

Filter 1:
1 -1 -1
-1 1 -1
-1 -1 1

Filter 2:
-1 1 -1
-1 1 -1
-1 1 -1

……
These are the network parameters to be learned.
Each filter detects a small pattern (3 x 3).
Convolution, stride=1
Filter 1:
1 -1 -1
-1 1 -1
-1 -1 1

Place Filter 1 on the top-left 3 x 3 patch of the 6 x 6 image and take the dot product: 3
Move the filter one column to the right and take the dot product again: -1
Convolution, if stride=2
With Filter 1 on the same 6 x 6 image, the filter moves two columns at a time.
The first two dot products are: 3 -3
Convolution, stride=1
Sliding Filter 1 over the whole 6 x 6 image gives a 4 x 4 output:
3 -1 -3 -1
-3 1 0 -3
-3 -3 0 1
3 -2 -2 -1
Convolution, stride=1
Repeat this for each filter
Filter 2:
-1 1 -1
-1 1 -1
-1 1 -1

Feature map of Filter 1:
3 -1 -3 -1
-3 1 0 -3
-3 -3 0 1
3 -2 -2 -1

Feature map of Filter 2:
-1 -1 -1 -1
-1 -1 -2 1
-1 -1 -2 1
-1 0 -4 3

Two 4 x 4 images
Forming 2 x 4 x 4 matrix
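The sliding dot product from these slides can be reproduced with a short sketch in plain Python. It uses the 6 x 6 image and the two 3 x 3 filters from the slides; the function name `conv2d` is a hypothetical helper. As on the slides, the filter is not flipped, so strictly speaking this is a cross-correlation.

```python
IMAGE = [
    [1, 0, 0, 0, 0, 1],
    [0, 1, 0, 0, 1, 0],
    [0, 0, 1, 1, 0, 0],
    [1, 0, 0, 0, 1, 0],
    [0, 1, 0, 0, 1, 0],
    [0, 0, 1, 0, 1, 0],
]
FILTER_1 = [[1, -1, -1], [-1, 1, -1], [-1, -1, 1]]
FILTER_2 = [[-1, 1, -1], [-1, 1, -1], [-1, 1, -1]]


def conv2d(image, kernel, stride=1):
    """Dot product of the kernel with every patch it can fully cover."""
    k = len(kernel)
    rows = range(0, len(image) - k + 1, stride)
    cols = range(0, len(image[0]) - k + 1, stride)
    return [[sum(image[r + i][c + j] * kernel[i][j]
                 for i in range(k) for j in range(k))
             for c in cols]
            for r in rows]


map_1 = conv2d(IMAGE, FILTER_1)                 # 4 x 4 feature map
map_2 = conv2d(IMAGE, FILTER_2)                 # 4 x 4 feature map
map_1_stride_2 = conv2d(IMAGE, FILTER_1, stride=2)
feature_maps = [map_1, map_2]                   # 2 x 4 x 4

for row in map_1:
    print(row)
```

Stacking one feature map per filter gives the 2 x 4 x 4 result, and setting `stride=2` keeps only every second position in each direction.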
Color image: RGB 3 channels
[Figure: a color image drawn as three stacked 6 x 6 channels, with Filter 1 and Filter 2 each drawn as a stack of three 3 x 3 slices, one per channel.]
Convolution v.s. Fully Connected
[Figure: top, the 6 x 6 image is convolved with Filter 1 and Filter 2; bottom, the same image is flattened into inputs x1, x2, ……, x36 of a fully-connected layer.]
Filter 1:
1 -1 -1
-1 1 -1
-1 -1 1
[Figure: the 6 x 6 image is flattened into inputs numbered 1 to 36. The neuron that outputs 3 is connected only to inputs 1, 2, 3, 7, 8, 9, 13, 14 and 15, which form the top-left 3 x 3 patch.]
Only connect to 9 inputs, not fully connected: fewer parameters!


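Filter 1:
1 -1 -1
-1 1 -1
-1 -1 1
[Figure: the same flattened inputs with two output neurons. The neuron that outputs 3 is connected to inputs 1, 2, 3, 7, 8, 9, 13, 14 and 15; the neuron that outputs -1 is connected to inputs 2, 3, 4, 8, 9, 10, 14, 15 and 16. Both neurons use the same nine weights of Filter 1.]
Fewer parameters
Shared weights: even fewer parameters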
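The saving can be made concrete by counting weights for the 6 x 6 image and one 3 x 3 filter. This is a rough sketch that ignores bias terms and assumes the fully connected layer would also produce 16 outputs (the 4 x 4 feature map); the helper `receptive_field` is hypothetical and uses the 1-based input numbering from the figure.

```python
def receptive_field(out_row, out_col, width=6, k=3):
    """1-based indices of the flattened inputs seen by one output neuron."""
    return [(out_row + i) * width + (out_col + j) + 1
            for i in range(k) for j in range(k)]


n_inputs = 6 * 6       # flattened image, x1 ... x36
n_outputs = 4 * 4      # one neuron per position of the 3 x 3 filter
k = 3

fully_connected = n_inputs * n_outputs   # every input to every output
sparse_only = n_outputs * k * k          # 9 inputs per output, no sharing
shared = k * k                           # one filter reused everywhere

print(receptive_field(0, 0))
print(receptive_field(0, 1))
print(fully_connected, sparse_only, shared)
```

Sparse connectivity alone cuts the count from 576 to 144, and sharing the filter across positions cuts it to 9.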


A typical layer of a convolutional network consists
of three stages
First stage: the layer performs several
convolutions in parallel to produce a set of linear
activations.
Second stage: (Detector Stage) Each linear
activation is run through a nonlinear activation
function, such as the rectified linear activation
function.
Third stage: (Pooling Stage) A pooling function is used to modify the output
of the layer further.
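The three stages can be chained in a few lines. The following is a minimal sketch, assuming the 6 x 6 image and Filter 1 from the earlier slides, a rectified linear activation, and non-overlapping 2 x 2 max pooling; the function names are illustrative.

```python
IMAGE = [
    [1, 0, 0, 0, 0, 1],
    [0, 1, 0, 0, 1, 0],
    [0, 0, 1, 1, 0, 0],
    [1, 0, 0, 0, 1, 0],
    [0, 1, 0, 0, 1, 0],
    [0, 0, 1, 0, 1, 0],
]
FILTER_1 = [[1, -1, -1], [-1, 1, -1], [-1, -1, 1]]


def convolve(image, kernel):
    """Stage 1: linear activations from a sliding dot product."""
    k = len(kernel)
    size = len(image) - k + 1
    return [[sum(image[r + i][c + j] * kernel[i][j]
                 for i in range(k) for j in range(k))
             for c in range(size)]
            for r in range(size)]


def relu(fmap):
    """Stage 2 (detector): rectified linear activation, max(0, v)."""
    return [[max(0, v) for v in row] for row in fmap]


def max_pool(fmap, size=2):
    """Stage 3: maximum over non-overlapping size x size windows."""
    return [[max(fmap[r + i][c + j]
                 for i in range(size) for j in range(size))
             for c in range(0, len(fmap[0]) - size + 1, size)]
            for r in range(0, len(fmap) - size + 1, size)]


linear = convolve(IMAGE, FILTER_1)
detected = relu(linear)
pooled = max_pool(detected)
print(pooled)
```

Each stage takes the previous stage's output, so the layer as a whole maps the 6 x 6 image to a 2 x 2 summary for this filter.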
Pooling
A pooling function replaces the output of the net at a
certain location with a summary statistic of the nearby
outputs.
For example, the max pooling operation reports the maximum output within a rectangular neighborhood.
Other popular pooling functions include the average of a
rectangular neighborhood, L2 norm of a rectangular
neighborhood, or a weighted average based on the distance
from the central pixel.
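The pooling functions listed above differ only in the summary statistic applied to each neighborhood. The sketch below applies max, average and L2-norm pooling to the 4 x 4 output of Filter 1 from the convolution slides, assuming non-overlapping 2 x 2 windows; the helper names are hypothetical, and the distance-weighted average is left out.

```python
import math

FEATURE_MAP = [
    [3, -1, -3, -1],
    [-3, 1, 0, -3],
    [-3, -3, 0, 1],
    [3, -2, -2, -1],
]


def pool(fmap, stat, size=2):
    """Replace each size x size window with stat(window)."""
    out = []
    for r in range(0, len(fmap) - size + 1, size):
        row = []
        for c in range(0, len(fmap[0]) - size + 1, size):
            window = [fmap[r + i][c + j]
                      for i in range(size) for j in range(size)]
            row.append(stat(window))
        out.append(row)
    return out


def average(window):
    return sum(window) / len(window)


def l2_norm(window):
    return math.sqrt(sum(v * v for v in window))


max_pooled = pool(FEATURE_MAP, max)
avg_pooled = pool(FEATURE_MAP, average)
l2_pooled = pool(FEATURE_MAP, l2_norm)

print(max_pooled)
print(avg_pooled)
print(l2_pooled)
```

Passing the statistic as an argument keeps the windowing logic in one place, so other summaries can be tried without changing `pool`.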
Max Pooling
[Figure: the 6 x 6 image goes through Conv and then Max Pooling, giving a new but smaller 2 x 2 image.]
Filter 1 after max pooling:
3 0
3 1
Filter 2 after max pooling:
-1 1
0 3
Each filter is a channel
The whole CNN
[Figure: image -> Convolution -> Max Pooling -> Convolution -> Max Pooling -> Flattened -> Fully Connected Feedforward network -> cat dog ……]
The Convolution and Max Pooling steps can repeat many times.
Max Pooling
Filter 1:
1 -1 -1
-1 1 -1
-1 -1 1

Feature map of Filter 1:
3 -1 -3 -1
-3 1 0 -3
-3 -3 0 1
3 -2 -2 -1

Filter 2:
-1 1 -1
-1 1 -1
-1 1 -1

Feature map of Filter 2:
-1 -1 -1 -1
-1 -1 -2 1
-1 -1 -2 1
-1 0 -4 3
Why Pooling
Subsampling pixels will not change the object:
a subsampled image of a bird is still a bird.
We can subsample the pixels to make the image smaller,
so fewer parameters are needed to characterize the image.
A CNN compresses a fully
connected network in two ways:
Reducing number of connections
Shared weights on the edges
Max pooling further reduces the complexity
The whole CNN
[Figure: Convolution -> Max Pooling -> Convolution -> Max Pooling; can repeat many times.]
After Convolution and Max Pooling we get a new image:
3 0
3 1
and
-1 1
0 3
Smaller than the original image
The number of channels is the number of filters


The whole CNN
[Figure: Convolution -> Max Pooling (a new image) -> Convolution -> Max Pooling (a new image) -> Flattened -> Fully Connected Feedforward network -> cat dog ……]
Flattening
[Figure: the two 2 x 2 pooled maps (3 0 / 3 1) and (-1 1 / 0 3) are flattened into a single vector, which is the input of the Fully Connected Feedforward network.]
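Flattening can be sketched as follows, using the two 2 x 2 pooled maps from the max pooling example. The channel-by-channel ordering of the flattened vector is an assumption, since the slide's ordering is not recoverable, and the weights of the small fully connected layer are purely hypothetical values chosen for illustration.

```python
POOLED = [
    [[3, 0], [3, 1]],     # channel from Filter 1
    [[-1, 1], [0, 3]],    # channel from Filter 2
]


def flatten(channels):
    """Unroll channels x rows x columns into one vector (assumed order)."""
    return [v for channel in channels for row in channel for v in row]


def dense(inputs, weights, biases):
    """One fully connected layer: a weighted sum per output neuron."""
    return [sum(w * x for w, x in zip(ws, inputs)) + b
            for ws, b in zip(weights, biases)]


features = flatten(POOLED)

# Hypothetical weights for two output neurons (not learned, not from the slides).
weights = [
    [1, 1, 1, 1, 1, 1, 1, 1],
    [0, 1, 0, 1, 0, 1, 0, 1],
]
biases = [0, 0]

scores = dense(features, weights, biases)
print(features)
print(scores)
```

After flattening, the spatial layout is gone and every output neuron sees all eight values, which is what makes this part fully connected.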
CNN in Keras
Only modified the network structure and input format (vector -> 3-D tensor)
[Figure: input -> Convolution -> Max Pooling -> Convolution -> Max Pooling]
Convolution: there are 25 3x3 filters.
Input_shape = ( 28 , 28 , 1)
28 x 28 pixels; 1: black/white, 3: RGB
CNN in Keras
Only modified the network structure and input format (vector -> 3-D array)
Input: 1 x 28 x 28
Convolution: 25 x 26 x 26 (How many parameters for each filter? 9)
Max Pooling: 25 x 13 x 13
Convolution: 50 x 11 x 11 (How many parameters for each filter? 225 = 25 x 9)
Max Pooling: 50 x 5 x 5
CNN in Keras
Only modified the network structure and input format (vector -> 3-D array)
Input: 1 x 28 x 28
Convolution: 25 x 26 x 26
Max Pooling: 25 x 13 x 13
Convolution: 50 x 11 x 11
Max Pooling: 50 x 5 x 5
Flattened: 1250
Fully connected feedforward network
Output
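The shapes and parameter counts on these slides can be traced with simple arithmetic. This sketch does not call Keras; it assumes 3 x 3 filters without padding, stride 1, non-overlapping 2 x 2 max pooling that drops any leftover row or column, and it ignores bias terms. The helper names are hypothetical.

```python
def conv_shape(shape, n_filters, k=3):
    """Shape after a k x k convolution with no padding and stride 1."""
    channels, height, width = shape
    return (n_filters, height - k + 1, width - k + 1)


def pool_shape(shape, size=2):
    """Shape after non-overlapping size x size max pooling."""
    channels, height, width = shape
    return (channels, height // size, width // size)


def weights_per_filter(shape, k=3):
    """Each filter spans all input channels: channels * k * k weights."""
    return shape[0] * k * k


shape = (1, 28, 28)                  # channels x height x width
trace = [("Input", shape)]
per_filter = []

for n_filters in (25, 50):
    per_filter.append(weights_per_filter(shape))
    shape = conv_shape(shape, n_filters)
    trace.append(("Convolution", shape))
    shape = pool_shape(shape)
    trace.append(("Max Pooling", shape))

flattened = shape[0] * shape[1] * shape[2]

for name, s in trace:
    print(name, s)
print("weights per filter:", per_filter)
print("flattened:", flattened)
```

The second convolution needs 225 weights per filter because each of its filters covers all 25 channels produced by the first convolution.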
AlphaGo
[Figure: a neural network takes the board as input and outputs the next move (19 x 19 positions).]
Input: 19 x 19 matrix
Black: 1
white: -1
none: 0
A fully-connected feedforward network can be used,
but a CNN performs much better.
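The board encoding described on this slide can be sketched as a 19 x 19 list of lists. The stone positions below are made-up examples, and the helper names `encode_board` and `flatten_board` are hypothetical; this illustrates only the input format, not AlphaGo's actual implementation.

```python
BOARD_SIZE = 19
BLACK, WHITE, EMPTY = 1, -1, 0


def encode_board(black_stones, white_stones):
    """Build the 19 x 19 matrix: Black 1, white -1, none 0."""
    board = [[EMPTY] * BOARD_SIZE for _ in range(BOARD_SIZE)]
    for r, c in black_stones:
        board[r][c] = BLACK
    for r, c in white_stones:
        board[r][c] = WHITE
    return board


def flatten_board(board):
    """Vector form, as a fully-connected network would need it."""
    return [v for row in board for v in row]


# Made-up example position (row, column), 0-based.
board = encode_board(black_stones=[(3, 3), (15, 15)],
                     white_stones=[(3, 15)])
vector = flatten_board(board)

print(len(board), len(board[0]))   # matrix kept as a grid for a CNN
print(len(vector))                 # one input per board position
```

Keeping the board as a grid preserves which positions are neighbors, which is the structure a CNN can exploit and a flattened vector discards.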
AlphaGo’s policy network
The following is a quotation from their Nature article:
Note: AlphaGo does not use Max Pooling.
CNN in speech recognition
[Figure: a spectrogram, with axes Time and Frequency, is treated as an image and fed to a CNN.]
The filters move in the frequency direction.
CNN in text classification

Source of image:
http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.703.6858&rep=rep1&type=pdf
