Lecture # 4-1 Convolutional Neural Networks

National University of Computer and Emerging Sciences

Convolutional Neural Networks

AI-4009 Generative AI

Dr. Akhtar Jamil


Department of Computer Science



Goals
• Today’s Lecture
– Matrix Operations
– Convolution Operations
– Convolutional Neural Networks



Matrix Operation
• Introduction to Neural Networks

• Matrix operations are helpful when working with multidimensional inputs and outputs

$a = \sigma(Wx + b)$, e.g. with $W = \begin{bmatrix} 1 & -2 \\ -1 & 1 \end{bmatrix}$, $x = \begin{bmatrix} 1 \\ -1 \end{bmatrix}$, $b = \begin{bmatrix} 1 \\ 0 \end{bmatrix}$:

$\sigma\!\left( \begin{bmatrix} 1 & -2 \\ -1 & 1 \end{bmatrix} \begin{bmatrix} 1 \\ -1 \end{bmatrix} + \begin{bmatrix} 1 \\ 0 \end{bmatrix} \right) = \sigma\!\left( \begin{bmatrix} 4 \\ -2 \end{bmatrix} \right) = \begin{bmatrix} 0.98 \\ 0.12 \end{bmatrix}$

Slide credit: Hung-yi Lee – Deep Learning Tutorial
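As a minimal NumPy sketch of the example above (the names sigmoid, W, x, b are mine, matching the slide's values):

import numpy as np

def sigmoid(z):
    # Logistic sigmoid, applied elementwise
    return 1.0 / (1.0 + np.exp(-z))

W = np.array([[1.0, -2.0],
              [-1.0, 1.0]])
x = np.array([1.0, -1.0])
b = np.array([1.0, 0.0])

a = sigmoid(W @ x + b)
print(a)  # [0.982... 0.119...] -- the 0.98 and 0.12 on the slide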


Matrix Operation
• Introduction to Neural Networks

• Multilayer NN, function f maps inputs x to outputs y, i.e.,

[Diagram: inputs x1 … xN pass through layers (W1, b1), (W2, b2), …, (WL, bL), producing activations a1, a2, … and outputs y1 … yM]

$y = f(x) = \sigma\!\left( W^{L} \cdots \sigma\!\left( W^{2} \, \sigma\!\left( W^{1} x + b^{1} \right) + b^{2} \right) \cdots + b^{L} \right)$

Slide credit: Hung-yi Lee – Deep Learning Tutorial
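A minimal sketch of this layered composition, assuming NumPy and sigmoid activations at every layer (the function name forward is mine):

import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def forward(x, weights, biases):
    # Computes y = sigma(W_L ... sigma(W_2 sigma(W_1 x + b_1) + b_2) ... + b_L)
    a = x
    for W, b in zip(weights, biases):
        a = sigmoid(W @ a + b)
    return a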


Softmax Layer
• Introduction to Neural Networks

• The softmax layer applies softmax activations to output a probability value in the range [0, 1]
– The values z inputted to the softmax layer are referred to as logits

A Softmax Layer:

$y_i = \frac{e^{z_i}}{\sum_{j=1}^{3} e^{z_j}}$

Example: $z = (3, 1, -3)$ gives $(e^{z_1}, e^{z_2}, e^{z_3}) \approx (20, 2.7, 0.05)$, so $y \approx (0.88, 0.12, \approx 0)$.

Slide credit: Hung-yi Lee – Deep Learning Tutorial
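A minimal NumPy sketch of the softmax computation above:

import numpy as np

def softmax(z):
    # Shift by the max for numerical stability; the result is unchanged
    e = np.exp(z - np.max(z))
    return e / e.sum()

print(softmax(np.array([3.0, 1.0, -3.0])))  # ~[0.88, 0.12, 0.002], as on the slide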


Deep vs Shallow Networks
• Deep vs Shallow Networks

• Deeper networks perform better than shallow networks
– But only up to some limit: after a certain number of layers, the performance of deeper networks plateaus

[Figure: shallow NN vs. deep NN over inputs x1 … xN, differing in the number of layers and the representations they learn]

Slide credit: Hung-yi Lee – Deep Learning Tutorial
Introduction to Convolutional Neural Networks



Multiplication and Dot Product



Convolution Operation



Convolution Operation in 1D





Convolution Operation
• Convolution and Correlation

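The slides above are image-based; as a small NumPy sketch of 1-D convolution and of the convolution vs. correlation distinction (the signal and kernel values are my own example):

import numpy as np

signal = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
kernel = np.array([1.0, 0.0, -1.0])

# np.convolve flips the kernel before sliding it (true convolution);
# np.correlate slides it as-is (cross-correlation, what CNN layers compute).
print(np.convolve(signal, kernel, mode='valid'))   # [2. 2. 2.]
print(np.correlate(signal, kernel, mode='valid'))  # [-2. -2. -2.]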




What are CNNs?
• CNN = Neural Network with a convolution operation
• Convolutional neural networks (CNN, ConvNet) are a class of deep, feed-forward artificial neural networks that are applied to analyzing visual imagery.
• They use convolution in place of general matrix multiplication in at least one of their layers.
Smaller Network: CNN
• We know it is good to learn a small model.
• From this fully connected model, do we really need all the edges?
• Can some of these be shared?



Convolutional Neural Networks (CNNs)
• Convolutional Neural Networks

• Feature extraction architecture


– After 2 convolutional layers, a max-pooling layer reduces the size of the feature maps (typically by 2)
– A fully connected layer and a softmax layer are added last to perform classification

[Figure: stacked conv and max-pool blocks with growing channel counts (64, 128, 256, 512) feeding a fully connected layer and a softmax over scene classes: Living Room, Bedroom, Kitchen, Bathroom, Outdoor]

Slide credit: Param Vir Singh – Deep Learning
CNN architecture



Convolutional Neural Networks (CovNet)
• In a nutshell,
A ConvNet usually has 3 types of layers:
1) Convolutional Layer (CONV)
2) Pooling Layer (POOL)
3) Fully Connected Layer (FC)



A convolutional layer
A CNN is a neural network with some convolutional layers (and some other layers). A convolutional layer has a number of filters that perform the convolution operation.

[Figure: a filter acting as a "beak detector" on a bird image]


Fully Connected Layer



The convolution operation



The convolution operation

Local Connectivity: neurons connect to only a local region of the input volume. The spatial extent of this connectivity is a hyperparameter called the receptive field of the neuron (equivalently, this is the filter size).


Convolution
These are the network parameters to be learned.

6 x 6 image:
1 0 0 0 0 1
0 1 0 0 1 0
0 0 1 1 0 0
1 0 0 0 1 0
0 1 0 0 1 0
0 0 1 0 1 0

Filter 1:
 1 -1 -1
-1  1 -1
-1 -1  1

Filter 2:
-1  1 -1
-1  1 -1
-1  1 -1

Each filter detects a small pattern (3 x 3).
Convolution (stride = 1)
Sliding Filter 1 over the 6 x 6 image with stride 1: the dot product over the top-left 3 x 3 patch is 3; moving one column to the right gives -1.


Convolution (stride = 2)
With stride 2, Filter 1 moves two columns at a time: the first row of outputs is 3 and -3.


Convolution (stride = 1)
Sliding Filter 1 over all positions of the 6 x 6 image yields a 4 x 4 feature map:

 3 -1 -3 -1
-3  1  0 -3
-3 -3  0  1
 3 -2 -2 -1
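A minimal NumPy sketch that reproduces this feature map (the helper conv2d is mine; like CNN layers, it computes cross-correlation):

import numpy as np

image = np.array([[1, 0, 0, 0, 0, 1],
                  [0, 1, 0, 0, 1, 0],
                  [0, 0, 1, 1, 0, 0],
                  [1, 0, 0, 0, 1, 0],
                  [0, 1, 0, 0, 1, 0],
                  [0, 0, 1, 0, 1, 0]])

filter1 = np.array([[ 1, -1, -1],
                    [-1,  1, -1],
                    [-1, -1,  1]])

def conv2d(img, filt, stride=1):
    # Slide the filter over the image, taking an elementwise product and sum
    k = filt.shape[0]
    out = (img.shape[0] - k) // stride + 1
    result = np.zeros((out, out), dtype=int)
    for i in range(out):
        for j in range(out):
            patch = img[i*stride:i*stride+k, j*stride:j*stride+k]
            result[i, j] = np.sum(patch * filt)
    return result

print(conv2d(image, filter1))            # 4x4 map; first row [ 3 -1 -3 -1]
print(conv2d(image, filter1, stride=2))  # stride 2; first row [ 3 -3]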


Convolution (stride = 1), Filter 2:
-1  1 -1
-1  1 -1
-1  1 -1

Repeat this for each filter. Filter 2 yields its own 4 x 4 feature map:

-1 -1 -1 -1
-1 -1 -2  1
-1 -1 -2  1
-1  0 -4  3

Two 4 x 4 images form a 2 x 4 x 4 matrix: the feature map.
Color image: RGB, 3 channels
For a color image, each filter has 3 channels as well, one per input channel: Filter 1 and Filter 2 become 3 x 3 x 3 tensors, and the 6 x 6 image becomes a stack of three 6 x 6 channels.

[Figure: the 6 x 6 image repeated across R, G, B channels, convolved with 3-channel versions of Filter 1 and Filter 2]




Convolution Layer
32x32x3 image: 32 (height) x 32 (width) x 3 (depth)


Convolution Layer
32x32x3 image, 5x5x3 filter
Filters always extend the full depth of the input volume.
Convolve the filter with the image, i.e. "slide over the image spatially, computing dot products".


Convolution Layer
32x32x3 image, 5x5x3 filter
1 number: the result of taking a dot product between the filter and a small 5x5x3 chunk of the image (i.e. a 5*5*3 = 75-dimensional dot product + bias)


Convolution Layer
Convolving (sliding) the 5x5x3 filter over all spatial locations of the 32x32x3 image produces a 28x28x1 activation map.


Convolution Layer
Consider a second (green) filter: convolving it over all spatial locations produces a second 28x28x1 activation map.


Convolution Layer
For example, if we had 6 5x5 filters, we'd get 6 separate 28x28 activation maps.
We stack these up to get a "new image" of size 28x28x6!
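Though not stated on the slide, these spatial sizes follow the standard output-size formula; a quick sketch:

def conv_output_size(n, f, stride=1, pad=0):
    # (N + 2P - F) / stride + 1, the standard 'valid'-convolution size formula
    return (n + 2 * pad - f) // stride + 1

print(conv_output_size(32, 5))  # 28: a 5x5 filter over a 32x32 input
print(conv_output_size(28, 5))  # 24: matches the 28 -> 24 step on the next slide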


Preview: a ConvNet is a sequence of convolutional layers, interspersed with activation functions

32x32x3 → CONV, ReLU (e.g. 6 5x5x3 filters) → 28x28x6 → CONV, ReLU (e.g. 10 5x5x6 filters) → 24x24x10 → CONV, ReLU → …


Convolution vs. Fully Connected

[Figure: the 6 x 6 image convolved with Filter 1 and Filter 2, contrasted with a fully-connected layer that flattens the same image into inputs x1 … x36]
Fewer parameters!
Flatten the 6 x 6 image into a 36-dimensional vector x1 … x36. The first output of Filter 1 (the value 3) connects to only 9 inputs (x1-x3, x7-x9, x13-x15), not to all 36: far fewer parameters than a fully connected layer.
Shared weights
The second output of Filter 1 (the value -1) connects to a different set of 9 inputs (x2-x4, x8-x10, x14-x16), but it reuses the same 9 weights. Fewer parameters, and with shared weights, even fewer parameters.
Visualize the Results



Preview [Zeiler and Fergus 2013]
Visualization of VGG-16 by Lane McIntosh. VGG-16 architecture from [Simonyan and Zisserman 2014].




One filter => one activation map (example: 5x5 filters, 32 total)

We call the layer convolutional because it is related to convolution of two signals: elementwise multiplication and sum of a filter and the signal (image).

Figure copyright Andrej Karpathy.


The whole CNN
Input → Convolution → Max Pooling → Convolution → Max Pooling (this pair can repeat many times) → Flattened → Fully Connected Feedforward network → cat, dog, …
Flattening
The pooled 2 x 2 feature maps are flattened into a single vector (3, 0, 1, 3, …), which is fed into a fully connected feedforward network.
CNN in Keras
Only the network structure and the input format change (vector -> 3-D tensor).
input_shape = (28, 28, 1): 28 x 28 pixels; 1 channel for black/white, 3 for RGB.
There are 25 3x3 filters in the first convolution.
Input → Convolution → Max Pooling → Convolution → Max Pooling


CNN in Keras
Only the network structure and the input format change (vector -> 3-D array).

Input: 1 x 28 x 28
Convolution (25 3x3 filters) → 25 x 26 x 26. How many parameters for each filter? 9
Max Pooling → 25 x 13 x 13
Convolution (50 3x3 filters) → 50 x 11 x 11. How many parameters for each filter? 225 = 25 x 9, since each filter now spans all 25 input channels
Max Pooling → 50 x 5 x 5


CNN in Keras
Only the network structure and the input format change (vector -> 3-D array).

Input: 1 x 28 x 28 → Convolution → 25 x 26 x 26 → Max Pooling → 25 x 13 x 13 → Convolution → 50 x 11 x 11 → Max Pooling → 50 x 5 x 5 → Flattened (1250) → Fully connected feedforward network → Output
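A hedged Keras sketch of this exact stack (choices beyond the slide, such as no padding and default strides, are assumptions); model.summary() reproduces the shapes above in channels-last form:

import tensorflow as tf

model = tf.keras.Sequential([
    tf.keras.layers.Input(shape=(28, 28, 1)),    # 1 x 28 x 28, channels-last
    tf.keras.layers.Conv2D(25, kernel_size=3),   # -> 26 x 26 x 25 (9 weights per channel)
    tf.keras.layers.MaxPool2D(pool_size=2),      # -> 13 x 13 x 25
    tf.keras.layers.Conv2D(50, kernel_size=3),   # -> 11 x 11 x 50 (225 = 25 x 9 weights each)
    tf.keras.layers.MaxPool2D(pool_size=2),      # -> 5 x 5 x 50
    tf.keras.layers.Flatten(),                   # -> 1250
])
model.summary()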


Putting it all together

import tensorflow as tf

def generate_model():
    model = tf.keras.Sequential([
        # first convolutional layer
        tf.keras.layers.Conv2D(32, kernel_size=3, activation='relu'),
        tf.keras.layers.MaxPool2D(pool_size=2, strides=2),

        # second convolutional layer
        tf.keras.layers.Conv2D(64, kernel_size=3, activation='relu'),
        tf.keras.layers.MaxPool2D(pool_size=2, strides=2),

        # fully connected classifier
        tf.keras.layers.Flatten(),
        tf.keras.layers.Dense(1024, activation='relu'),
        tf.keras.layers.Dense(10, activation='softmax')  # 10 outputs
    ])
    return model
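A possible usage of generate_model (the MNIST-sized input shape is an assumption, not given on the slide):

model = generate_model()
model.build(input_shape=(None, 28, 28, 1))  # batch of 28x28 grayscale images (assumed)
model.summary()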





Fully Connected Layers
• Regular neural network
• This layer takes the input volume from the last layer of the CNN and outputs an N-dimensional vector
– Where N is the number of classes



Pooling Layer
• Pooling is used to progressively reduce the spatial size of the input
– To reduce the amount of parameters and computation
– It controls overfitting (regularization)
– It provides invariance to small translations of the input
• A pooling layer is commonly inserted in-between successive Conv layers (see the sketch below)
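A minimal NumPy sketch of 2x2 max pooling (the helper max_pool2d is mine), applied to the 4 x 4 feature map produced by Filter 1 earlier:

import numpy as np

def max_pool2d(x, size=2, stride=2):
    # Keep the maximum of each size x size window, shrinking the map
    h, w = x.shape
    out = np.zeros((h // stride, w // stride), dtype=x.dtype)
    for i in range(0, h - size + 1, stride):
        for j in range(0, w - size + 1, stride):
            out[i // stride, j // stride] = x[i:i+size, j:j+size].max()
    return out

fmap = np.array([[ 3, -1, -3, -1],
                 [-3,  1,  0, -3],
                 [-3, -3,  0,  1],
                 [ 3, -2, -2, -1]])
print(max_pool2d(fmap))  # [[3 0] [3 1]]: the 4x4 map shrinks to 2x2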





Batch Normalization
• Usually the input layer is scaled
– For example, when we have some features whose values range from 0 to 1 and some from 1 to 1000, we should normalize them to speed up learning
• Can we normalize the hidden layers also?
– YES



Batch Normalization
• Batch normalization layers calculate the mean μ and variance σ² of a batch of input data, and normalize the data to zero mean and unit variance
– I.e., $\hat{x} = \frac{x - \mu}{\sqrt{\sigma^2 + \epsilon}}$
• BatchNorm layers alleviate the problems of proper initialization of the parameters and hyper-parameters
– They result in faster convergence during training and allow larger learning rates
• BatchNorm layers are inserted immediately after convolutional layers or fully-connected layers, and before activation layers
– They are very common with convolutional NNs



Batch Normalization
• It increases the stability of a neural network
• It normalizes the output of a previous activation layer by subtracting the batch mean and dividing by the batch standard deviation
• Batch normalization adds two trainable parameters to each layer (see the sketch below)
– The normalized output is multiplied by a "standard deviation" parameter (gamma)
– and shifted by a "mean" parameter (beta)
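A minimal NumPy sketch of the normalization plus the gamma/beta scale-and-shift (the epsilon term is the usual numerical-stability constant, an assumption beyond the slide):

import numpy as np

def batch_norm(x, gamma, beta, eps=1e-5):
    # Normalize each feature over the batch to zero mean / unit variance,
    # then apply the trainable scale (gamma) and shift (beta)
    mu = x.mean(axis=0)
    var = x.var(axis=0)
    x_hat = (x - mu) / np.sqrt(var + eps)
    return gamma * x_hat + beta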



Dropout
• The term "dropout" refers to dropping out units (both hidden and visible) in a neural network
• It is a way of preventing overfitting
• Dropped units are not considered during a particular forward or backward pass
• A new hyperparameter is introduced that specifies the probability at which outputs of the layer are dropped out (see the sketch below)

Dropout: A Simple Way to Prevent Neural Networks from Overfitting
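A minimal NumPy sketch of (inverted) dropout; the rescaling by 1/(1 - p_drop) is a common convention, not something the slide specifies:

import numpy as np

def dropout(x, p_drop=0.5, training=True):
    # Zero out each unit with probability p_drop during training only;
    # rescale survivors so the expected activation stays the same
    if not training:
        return x
    mask = np.random.rand(*x.shape) >= p_drop
    return x * mask / (1.0 - p_drop)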



References
• https://www.deeplearningbook.org/contents/convnets.html
• https://www.deeplearningbook.org/contents/mlp.html
• Dropout: A Simple Way to Prevent Neural Networks from Overfitting
• https://arxiv.org/abs/1502.03167



Thank You 
