5-Convolutional Neural Network
Neural Network Model
• Each hidden layer is called a fully connected layer (or Dense layer)
• Each node in a hidden layer is connected to all nodes in the previous layer
Problem of Fully Connected Neural Network
• A fully connected network treats every pixel as an independent input, so the number of weights grows very quickly with the image size
Convolution in a neural network
The convolution is performed by sliding a 3x3 kernel window across the image and computing a weighted sum at each position
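Since the slides describe convolution as sliding a 3x3 window over the image, here is a minimal Python/NumPy sketch of that process (the names conv2d, image, kernel are illustrative, not from the slides):

```python
# Minimal sketch of 2D convolution by sliding a 3x3 window over a
# grayscale image (no padding, stride 1).
import numpy as np

def conv2d(image, kernel):
    kh, kw = kernel.shape
    out_h = image.shape[0] - kh + 1
    out_w = image.shape[1] - kw + 1
    out = np.zeros((out_h, out_w))
    for i in range(out_h):
        for j in range(out_w):
            # Element-wise multiply the window with the kernel, then sum
            out[i, j] = np.sum(image[i:i+kh, j:j+kw] * kernel)
    return out

image = np.arange(25, dtype=float).reshape(5, 5)   # toy 5x5 "image"
kernel = np.ones((3, 3)) / 9.0                     # 3x3 averaging kernel
print(conv2d(image, kernel).shape)                 # (3, 3)
```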
Convolution in color image
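For a color image the kernel also spans the three input channels, so each filter produces a single feature map; a hedged sketch using PyTorch (the layer and tensor names are assumptions):

```python
# Sketch: convolution on an RGB image. Each 3x3 kernel also spans the 3
# input channels, so one filter has 3*3*3 = 27 weights plus a bias.
import torch
import torch.nn as nn

rgb_image = torch.randn(1, 3, 32, 32)          # (batch, channels, H, W)
conv = nn.Conv2d(in_channels=3, out_channels=8, kernel_size=3)
feature_maps = conv(rgb_image)
print(feature_maps.shape)                      # torch.Size([1, 8, 30, 30])
```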
Padding
• Padding adds extra border values (usually zeros) around the input so that the output size can be preserved
Stride
• Stride is the step size with which the kernel window slides over the input
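A minimal sketch of how padding and stride affect the output size, using PyTorch (the sizes below are illustrative):

```python
# Padding preserves the spatial size; stride > 1 downsamples the output.
import torch
import torch.nn as nn

x = torch.randn(1, 1, 28, 28)

same_pad = nn.Conv2d(1, 4, kernel_size=3, padding=1)          # "same" padding
print(same_pad(x).shape)    # torch.Size([1, 4, 28, 28]) -- size preserved

strided = nn.Conv2d(1, 4, kernel_size=3, stride=2, padding=1) # stride 2
print(strided(x).shape)     # torch.Size([1, 4, 14, 14]) -- roughly halved
```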
First Convolutional Layer
General Convolutional Layer
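The slides present the general convolutional layer graphically; for reference, the standard output-size relation (assuming a square KxK kernel, padding P, stride S, and F filters) is:

```latex
H_{\text{out}} = \left\lfloor \frac{H_{\text{in}} + 2P - K}{S} \right\rfloor + 1,
\qquad
W_{\text{out}} = \left\lfloor \frac{W_{\text{in}} + 2P - K}{S} \right\rfloor + 1
```

With F filters, the output tensor has size H_out x W_out x F.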
Element-wise activation functions
The ReLU activation function is often used after each convolutional layer because it is efficient and does not require heavy computation
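A minimal sketch of ReLU applied element-wise to a feature map (PyTorch; the values are illustrative):

```python
# ReLU is applied element-wise, replacing negative values with zero.
import torch
import torch.nn.functional as F

feature_map = torch.tensor([[-1.0, 2.0], [0.5, -3.0]])
print(F.relu(feature_map))   # -> [[0., 2.], [0.5, 0.]]
```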
Pooling Layer
• Pooling reduces the spatial size of the feature maps; max pooling keeps the maximum value in each window (commonly 2x2), as sketched below
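A minimal sketch of 2x2 max pooling in PyTorch (tensor sizes are illustrative):

```python
# 2x2 max pooling halves the spatial resolution of the feature maps.
import torch
import torch.nn as nn

feature_maps = torch.randn(1, 8, 28, 28)
pool = nn.MaxPool2d(kernel_size=2)
print(pool(feature_maps).shape)   # torch.Size([1, 8, 14, 14])
```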
Fully Connected Layer
• The output tensor of the last layer, with size (H*W*D), is flattened into a vector of size (H*W*D, 1)
• Fully connected layers are then applied to this vector to combine the different image features learned by the convolutional layers and produce the output of the model, as sketched below
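A minimal sketch of the flatten + fully connected step in PyTorch (the feature-map size and layer widths are illustrative):

```python
# Flatten the last feature-map tensor and apply fully connected layers
# to produce class scores.
import torch
import torch.nn as nn

features = torch.randn(1, 64, 7, 7)            # H=7, W=7, D=64 from conv layers
flat = torch.flatten(features, start_dim=1)    # shape (1, 7*7*64) = (1, 3136)

fc = nn.Sequential(
    nn.Linear(64 * 7 * 7, 128),
    nn.ReLU(),
    nn.Linear(128, 10),                        # e.g. 10 output classes
)
print(fc(flat).shape)                          # torch.Size([1, 10])
```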
Softmax activation function
Softmax Function
• Softmax function formula:
softmax(z_i) = \frac{e^{z_i}}{\sum_{j=1}^{K} e^{z_j}}
In which: z_i is the raw score (logit) of class i and K is the number of classes; the outputs lie in (0, 1) and sum to 1
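A minimal sketch of the softmax function in PyTorch (the score values are illustrative):

```python
# Softmax turns a vector of raw scores into probabilities that sum to 1.
import torch

scores = torch.tensor([2.0, 1.0, 0.1])
probs = torch.softmax(scores, dim=0)
print(probs)          # tensor([0.6590, 0.2424, 0.0986])
print(probs.sum())    # tensor(1.)
```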
Classic CNN Architecture
Input
Conv blocks:
• Convolution + activation (ReLU)
• Convolution + activation (ReLU)
• …
• Maxpooling 2x2
Output
• Fully connected layers
• Softmax / Sigmoid activation function
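A hedged sketch of this classic pattern in PyTorch (the number of blocks, channel counts, and input size are illustrative, not from the slides):

```python
# Classic CNN pattern: conv blocks with ReLU + max pooling, then fully
# connected layers producing class scores.
import torch
import torch.nn as nn

classic_cnn = nn.Sequential(
    # Conv block 1
    nn.Conv2d(3, 32, kernel_size=3, padding=1), nn.ReLU(),
    nn.Conv2d(32, 32, kernel_size=3, padding=1), nn.ReLU(),
    nn.MaxPool2d(2),
    # Conv block 2
    nn.Conv2d(32, 64, kernel_size=3, padding=1), nn.ReLU(),
    nn.Conv2d(64, 64, kernel_size=3, padding=1), nn.ReLU(),
    nn.MaxPool2d(2),
    # Output: fully connected layers (softmax is usually folded into the loss)
    nn.Flatten(),
    nn.Linear(64 * 8 * 8, 128), nn.ReLU(),
    nn.Linear(128, 10),
)

x = torch.randn(1, 3, 32, 32)        # e.g. a CIFAR-10 sized image
print(classic_cnn(x).shape)          # torch.Size([1, 10])
```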
Classic CNN Architecture
Output:
• Fully connected layers
Feature extraction with CNN
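One common way to extract features with a CNN is to keep the convolutional layers of a pretrained network and drop its classifier head; a hedged sketch with torchvision (the choice of ResNet-18 and the weight name are assumptions, and the weight argument may differ between torchvision versions):

```python
# Use the convolutional part of a pretrained CNN as a feature extractor.
import torch
from torchvision import models

backbone = models.resnet18(weights="IMAGENET1K_V1")
backbone.fc = torch.nn.Identity()        # drop the classification head
backbone.eval()

with torch.no_grad():
    image = torch.randn(1, 3, 224, 224)
    features = backbone(image)
print(features.shape)                    # torch.Size([1, 512])
```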
VGG Architecture
• VGG16 architecture: 13 convolutional layers with 3x3 kernels, grouped into blocks separated by 2x2 max pooling, followed by 3 fully connected layers
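A minimal sketch of inspecting the VGG16 architecture with torchvision (the pretrained-weight argument may differ between versions):

```python
# Load VGG16 from torchvision to inspect its architecture.
from torchvision import models

vgg16 = models.vgg16(weights=None)   # e.g. weights="IMAGENET1K_V1" for pretrained
print(vgg16.features)                # convolutional + max pooling blocks
print(vgg16.classifier)              # fully connected layers
```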
Problem of classical CNN Architecture
• A 56-layer CNN gives a higher error rate than a 20-layer CNN on both the training and the test set
• The problem may be caused by vanishing or exploding gradients (the gradient becomes 0 or too large) during the backpropagation process
ResNet Architecture
A residual block: a skip (shortcut) connection adds the block's input directly to the output of its convolutional layers, as sketched below
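A minimal sketch of a basic residual block in PyTorch (channel counts and layer choices are illustrative):

```python
# Basic residual block: the skip connection adds the input x to the
# output of two conv layers.
import torch
import torch.nn as nn

class ResidualBlock(nn.Module):
    def __init__(self, channels):
        super().__init__()
        self.conv1 = nn.Conv2d(channels, channels, kernel_size=3, padding=1)
        self.bn1 = nn.BatchNorm2d(channels)
        self.conv2 = nn.Conv2d(channels, channels, kernel_size=3, padding=1)
        self.bn2 = nn.BatchNorm2d(channels)
        self.relu = nn.ReLU()

    def forward(self, x):
        out = self.relu(self.bn1(self.conv1(x)))
        out = self.bn2(self.conv2(out))
        return self.relu(out + x)     # skip connection: add the input back

block = ResidualBlock(64)
x = torch.randn(1, 64, 56, 56)
print(block(x).shape)                 # torch.Size([1, 64, 56, 56])
```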
Network Architecture of ResNet
ResNet-34 architecture
VGG19 vs. ResNet50
(Figure: side-by-side comparison of the VGG19 and ResNet50 architectures)
Tools and Framework
Solving an AI Problem with Deep Learning
1. Problem definition
2. Dataset preparation
3. Model construction
4. Loss function definition
5. Apply backpropagation and gradient descent to find the parameters (weights and biases) that optimize the loss function (note that other optimizers can also be used), as sketched below
6. Predict outputs for new data using the learned parameters and weights
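A hedged end-to-end sketch of steps 3-6 in PyTorch (the model, data, and hyperparameters are illustrative stand-ins):

```python
# Steps 3-6: build a model, define a loss, run backpropagation with
# gradient descent (SGD), then predict on new data.
import torch
import torch.nn as nn

# 3. Model construction (a tiny CNN classifier)
model = nn.Sequential(
    nn.Conv2d(1, 8, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
    nn.Flatten(), nn.Linear(8 * 14 * 14, 10),
)

# 4. Loss function definition
loss_fn = nn.CrossEntropyLoss()

# 5. Backpropagation + gradient descent (other optimizers, e.g. Adam, also work)
optimizer = torch.optim.SGD(model.parameters(), lr=0.01)
images = torch.randn(16, 1, 28, 28)            # stand-in training batch
labels = torch.randint(0, 10, (16,))
for epoch in range(5):
    optimizer.zero_grad()
    loss = loss_fn(model(images), labels)
    loss.backward()                            # backpropagation
    optimizer.step()                           # gradient descent update

# 6. Predict on new data with the learned parameters
with torch.no_grad():
    new_image = torch.randn(1, 1, 28, 28)
    prediction = model(new_image).argmax(dim=1)
print(prediction)
```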
Deep Learning Datasets
• You can visit this website for information about common datasets: https://paperswithcode.com/datasets
Exercises 3