
Fall’24 AE556 AI for Aerospace Applications

Topic 4. Convolutional Neural Networks (CNN)

Sept. 24 (T), 2024

Han-Lim Choi
The Problem with Fully-Connected Networks

• A 256x256 (RGB) image → a ~200K-dimensional input x


• A fully connected network would need a very large number of parameters and would very likely overfit the data
• A generic deep network also does not capture the "natural" invariances we expect in images (translation, scale)

2
Convolutional Neural Networks

• To create architectures that can handle large images, restrict the weights in two ways:
1. Require that activations between layers occur only in a "local" manner
2. Require that all activations share the same weights

These lead to an architecture known as a convolutional neural network

3
Convolutions

• Convolutions are a basic primitive in many computer vision and image processing algorithms
• The idea is to "slide" the weights w (called a filter) over the image to produce a new image, written y = z * w
• The network repeats this process, performing convolution operations with kernels (filters) to extract features; a small sketch follows below

[Figure: input image * kernel (= filter = weights) → output]
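A minimal NumPy sketch (not from the slides) of this sliding operation; note that deep-learning "convolution" usually skips the kernel flip (i.e. it is cross-correlation), and the 5x5 image and 3x3 filter below are arbitrary toy values:

```python
import numpy as np

def conv2d(image, kernel):
    """Slide the kernel over the image (no padding, stride 1).
    Deep-learning 'convolution' is usually cross-correlation: no kernel flip."""
    H, W = image.shape
    k, _ = kernel.shape
    out = np.zeros((H - k + 1, W - k + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            out[i, j] = np.sum(image[i:i + k, j:j + k] * kernel)
    return out

image = np.arange(25, dtype=float).reshape(5, 5)   # toy 5x5 "image"
kernel = np.array([[1., 0., -1.]] * 3)             # simple vertical-edge filter
print(conv2d(image, kernel).shape)                 # (3, 3)
```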

4
Additional Notes on Convolutions

• Pooling is a process that reduces the quantity of data, removing noise and keeping only the salient information, which makes judgment and learning easier
• When the receptive field is grown by convolution operations alone, the amount of computation grows and becomes inefficient; more significant feature maps can therefore be obtained by applying pooling to the feature maps extracted by the convolutions (see the sketch below)

[Figure: the pooled feature maps are flattened into a 1D vector before the fully-connected layers]
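A minimal NumPy sketch (not from the slides) of 2x2 max pooling followed by flattening into a 1D vector; the 4x4 feature map is an arbitrary toy example:

```python
import numpy as np

def max_pool2d(x, size=2):
    """2x2 max pooling with stride equal to the window size."""
    H, W = x.shape
    x = x[:H - H % size, :W - W % size]              # crop to a multiple of the window
    return x.reshape(H // size, size, W // size, size).max(axis=(1, 3))

fmap = np.arange(16, dtype=float).reshape(4, 4)      # toy 4x4 feature map
pooled = max_pool2d(fmap)                            # -> 2x2
flat = pooled.reshape(-1)                            # 1D vector for the FC layers
print(pooled.shape, flat.shape)                      # (2, 2) (4,)
```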

5
Additional Notes on Convolutions

• Considering the convolution operation, two questions naturally arise:

– 1) How do we adjust the interval at which the filter moves?
– 2) Repeated convolutions make the final output smaller and smaller

• The stride is the interval at which the filter moves when it is applied; adjusting it changes the filter movement interval

6
Additional Notes on Convolutions

• To maintain the output size and preserve the edge pixels, a technique called "padding" is used: additional values are placed around the edges of the image
• It is usual to "zero pad" the input image so that the output has the same size as the input

1) Zero Padding 2) Replicate Padding 3) Mirror Padding
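As an illustration of the three padding types named above, here is a small NumPy sketch (not from the slides); np.pad's "constant", "edge", and "reflect" modes correspond to zero, replicate, and mirror padding:

```python
import numpy as np

x = np.array([[1., 2.],
              [3., 4.]])

# Three common padding schemes (1-pixel border around a 2x2 array)
zero      = np.pad(x, 1, mode="constant")   # zero padding
replicate = np.pad(x, 1, mode="edge")       # repeat the border pixels
mirror    = np.pad(x, 1, mode="reflect")    # mirror the interior pixels

print(zero)
print(replicate)
print(mirror)
```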

7
Convolutions in Image Processing

• Convolutions (typically with prespecified filters) are a common operation in many computer vision applications

8
Convolutions in Image Processing

• For RGB images, the result is generated by executing a convolution operation on each channel and then adding the results (see the check below)
• These are 2D convolutions per channel, but together they work like a 3D convolution
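A quick PyTorch check (not from the slides) that a convolution over an RGB input is the sum of per-channel 2D convolutions; the tensor sizes are arbitrary:

```python
import torch
import torch.nn.functional as F

torch.manual_seed(0)
rgb = torch.randn(1, 3, 8, 8)          # (batch, channels, H, W)
kernel = torch.randn(1, 3, 3, 3)       # one 3x3 filter with depth 3

# A convolution over an RGB input: each channel is convolved with its own
# 3x3 slice of the kernel and the three results are summed.
out = F.conv2d(rgb, kernel)

per_channel = sum(
    F.conv2d(rgb[:, c:c + 1], kernel[:, c:c + 1]) for c in range(3)
)
print(torch.allclose(out, per_channel))   # True
```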

9
Number of Parameters

• Consider a convolutional network that takes as input color (RGB) 32x32 images and uses the following layers (all convolutional layers use zero-padding):
1. 5x5x64 conv
2. 2x2 maxpool
3. 3x3x128 conv
4. 2x2 maxpool
5. Fully-connected to 10-dimensional output

• How many parameters does this network have?


1.
2.
3.
4.

10
Number of Parameters

• The total number of parameters in the network is the sum of the number of parameters in each convolution layer (pooling, stride, and padding are hyperparameter choices and contribute no learnable parameters)

For a convolution layer:

$\text{params}_{\text{conv}} = N \times (K^2 \times C + 1)$

  $K$ : filter size
  $C$ : number of input channels
  $N$ : number of filters

• The depth of each kernel in a convolution layer is always equal to the number of channels in the input image
• All kernels have $K^2 \times C$ parameters (plus one bias each), and there are $N$ of them
• The number of parameters in the FC layer is as follows:

$\text{params}_{\text{FC}} = C_f \times (O^2 \times N + 1)$

  $O$ : size of the output image of the previous conv layer
  $N$ : number of filters (channels) of that output
  $C_f$ : number of neurons in the FC layer
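A small sketch (not part of the slides) applying these formulas to the network from slide 10; it assumes the +1 bias term per filter/neuron and that zero padding keeps the spatial size, so the two 2x2 maxpools take 32 → 16 → 8:

```python
def conv_params(K, C, N):
    """N filters of size K x K x C, each with one bias term."""
    return N * (K * K * C + 1)

def fc_params(O, N, Cf):
    """FC layer on a flattened O x O x N feature map, Cf output neurons."""
    return Cf * (O * O * N + 1)

# Network from slide 10 (32x32 RGB input, zero padding keeps spatial size,
# each 2x2 maxpool halves it: 32 -> 16 -> 8). Bias terms assumed included.
p1 = conv_params(K=5, C=3,  N=64)    #  4,864
p2 = conv_params(K=3, C=64, N=128)   # 73,856
p3 = fc_params(O=8, N=128, Cf=10)    # 81,930
print(p1, p2, p3, p1 + p2 + p3)      # total 160,650
```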

11
Number of Parameters

• The convolution layer's output tensor size is determined by the input image size, padding, stride, and kernel size; the number of channels in the output image is the same as the number of kernels ($N$)

$O = \left\lfloor \frac{I - K + 2P}{S} \right\rfloor + 1$

  $O$ : size of output image
  $I$ : size of input image
  $K$ : size of kernels
  $S$ : stride
  $P$ : padding size

• For the network in slide 10, what is the output size passed to the FC layer? (see the sketch below)
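A minimal check (not from the slides) of the output-size formula against an actual PyTorch layer, plus the answer to the question above under the stated zero-padding assumption; the layer sizes in the check are arbitrary:

```python
import torch
import torch.nn as nn

def out_size(I, K, S=1, P=0):
    """O = floor((I - K + 2P) / S) + 1"""
    return (I - K + 2 * P) // S + 1

# Check the formula against an actual conv layer (sizes chosen arbitrarily)
I, K, S, P = 32, 5, 2, 2
conv = nn.Conv2d(3, 8, kernel_size=K, stride=S, padding=P)
y = conv(torch.zeros(1, 3, I, I))
print(out_size(I, K, S, P), y.shape[-1])   # 16 16

# For the slide-10 network, zero padding preserves the conv output size and the
# two 2x2 maxpools take 32 -> 16 -> 8, so the FC layer sees 8*8*128 = 8192 values.
print(8 * 8 * 128)                         # 8192
```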

12
Improving model performance

• A neural network may fail to generalize to test data if it is overtrained on the training data; this is called "overfitting"
• Models with numerous parameters, or models trained with too little data, are prone to overfitting

• Dropout can help mitigate overfitting
• Part of the layer's input units are "dropped out" at random at each learning step (think of it like the way a random forest ensembles many decision trees)

[Figure: dropout turns off a few neurons at random]
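A minimal PyTorch sketch (not from the slides) showing dropout's behavior in training vs. evaluation mode; p=0.5 is an arbitrary choice:

```python
import torch
import torch.nn as nn

torch.manual_seed(0)
drop = nn.Dropout(p=0.5)          # each unit is zeroed with probability 0.5
x = torch.ones(1, 8)

drop.train()
print(drop(x))    # roughly half the units are zeroed (survivors scaled by 1/(1-p))

drop.eval()
print(drop(x))    # at test time dropout is a no-op
```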

13
Improving model performance

• Batch normalization aids in improving slow or unstable learning


• Every incoming batch is normalized by the batch normalization layer using its mean and standard deviation
• The data is then rescaled to a new scale using learnable rescale parameters; this can help prevent the model from being dominated by outliers (see the sketch below)
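A minimal PyTorch sketch (not from the slides) of a batch-normalization layer; the channel count and batch statistics are arbitrary:

```python
import torch
import torch.nn as nn

torch.manual_seed(0)
bn = nn.BatchNorm2d(num_features=16)       # one (gamma, beta) pair per channel
x = torch.randn(8, 16, 32, 32) * 5 + 3     # batch with a shifted, scaled distribution

y = bn(x)
# Each channel is normalized to ~zero mean / unit variance over the batch,
# then rescaled by the learnable parameters gamma (weight) and beta (bias).
print(y.mean().item(), y.std().item())     # approximately 0 and 1
print(bn.weight.shape, bn.bias.shape)      # torch.Size([16]) torch.Size([16])
```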

14
Learning with Convolutions

• How do we apply backpropagation to neural networks with convolutions?

$z_{i+1} = f_i(z_i * w_i + b_i)$

• Remember that for a dense layer $z_{i+1} = f_i(W_i z_i + b_i)$, the forward pass required multiplication by $W_i$ and the backward pass required multiplication by $W_i^T$

• We're going to show that convolution is a type of (highly structured) matrix multiplication, and show how to compute the multiplication by its transpose

15
Convolutions as Matrix Multiplication

• Consider initially a 1D convolution $z * w$ for $w \in \mathbb{R}^3$, $z \in \mathbb{R}^6$

• Then $z * w = W z$ for

$W = \begin{bmatrix} w_1 & w_2 & w_3 & 0 & 0 & 0 \\ 0 & w_1 & w_2 & w_3 & 0 & 0 \\ 0 & 0 & w_1 & w_2 & w_3 & 0 \\ 0 & 0 & 0 & w_1 & w_2 & w_3 \end{bmatrix}$

• So how do we multiply by $W^T$?
16
Convolutions as Matrix Multiplication

• Multiplication by the transpose is just

$W^T g = \begin{bmatrix} w_1 & 0 & 0 & 0 \\ w_2 & w_1 & 0 & 0 \\ w_3 & w_2 & w_1 & 0 \\ 0 & w_3 & w_2 & w_1 \\ 0 & 0 & w_3 & w_2 \\ 0 & 0 & 0 & w_3 \end{bmatrix} g = \tilde{g} * \tilde{w}$

• where $\tilde{w}$ is just the flipped version of $w$ (and $\tilde{g}$ is the zero-padded version of $g$)


• In other words, transpose of convolution is just (zero-padded) convolution by
flipped filter (correlations for signal processing people)

• Property holds for 2D convolutions, backprop just flips convolutions
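A small NumPy check (not from the slides) of this property for the 1D example above: it builds W explicitly, verifies z * w = Wz, and verifies that multiplying by W^T equals zero-padded correlation with the flipped filter:

```python
import numpy as np

np.random.seed(0)
w = np.random.randn(3)                      # filter, w in R^3
z = np.random.randn(6)                      # input,  z in R^6

def corr_valid(x, k):
    """'Convolution' as used in deep nets (no kernel flip), valid region only."""
    return np.array([x[i:i + len(k)] @ k for i in range(len(x) - len(k) + 1)])

# Build the 4x6 matrix W whose rows are shifted copies of w, so z * w = W z
W = np.stack([np.concatenate([np.zeros(i), w, np.zeros(6 - 3 - i)]) for i in range(4)])
assert np.allclose(W @ z, corr_valid(z, w))

# Multiplying a gradient g by W^T = zero-padding g and correlating with flipped w
g = np.random.randn(4)
g_padded = np.concatenate([np.zeros(2), g, np.zeros(2)])   # pad by len(w)-1 each side
assert np.allclose(W.T @ g, corr_valid(g_padded, w[::-1]))
print("transpose of convolution = zero-padded convolution with the flipped filter")
```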

17
Loss function

• The loss function value is used during training to determine how well the model fits the learning data; losses are usually divided into two categories, regression and classification
• "MSE" (Mean Squared Error) is commonly employed in regression problems where the NN output and target values are continuous (it is also used to measure the difference between images or masks in segmentation, and it shows how far the predictions are from the targets)

$\text{MSE} = \frac{1}{n} \sum_{i=1}^{n} (y_i - \hat{y}_i)^2$

• "Cross Entropy" (Binary or Categorical) calculates the probability of belonging to a specific class

$\text{CE} = -\frac{1}{n} \sum_{i=1}^{n} y_i \log(\hat{y}_i)$   (multiple classes, softmax)

$\text{CE} = -\frac{1}{n} \sum_{i=1}^{n} \left[ y_i \log(\hat{y}_i) + (1 - y_i)\log(1 - \hat{y}_i) \right]$   (binary classes, sigmoid)
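Minimal NumPy versions of these losses (a sketch, not from the slides); y denotes the labels/targets and p or ŷ the model outputs:

```python
import numpy as np

def mse(y, y_hat):
    return np.mean((y - y_hat) ** 2)

def categorical_ce(y_onehot, p):
    """Multi-class cross entropy with softmax outputs p (rows sum to 1)."""
    return -np.mean(np.sum(y_onehot * np.log(p), axis=1))

def binary_ce(y, p):
    """Binary cross entropy with sigmoid outputs p."""
    return -np.mean(y * np.log(p) + (1 - y) * np.log(1 - p))

print(mse(np.array([1.0, 2.0, 3.0]), np.array([1.1, 1.9, 3.3])))

y_cls = np.array([[1, 0, 0], [0, 1, 0]])                 # one-hot labels, 2 samples
p_cls = np.array([[0.7, 0.2, 0.1], [0.3, 0.6, 0.1]])     # softmax outputs
print(categorical_ce(y_cls, p_cls))

print(binary_ce(np.array([1., 0.]), np.array([0.9, 0.2])))
```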

18
Loss function

• Cross Entropy (Binary or Categorical) calculates the probability of belonging to a specific class
• The softmax function converts the final outputs to the range [0, 1], and the loss is determined using cross entropy with the one-hot label
Prediction values (ŷ):                         Real labels (one-hot, y):
          Sample #1  Sample #2  Sample #3               Sample #1  Sample #2  Sample #3
Class 1     0.3        0.3        0.4                      1          0          0
Class 2     0.1        0.5        0.1                      0          1          0
Class 3     0.6        0.2        0.5                      0          0          1

CE = -(log 0.3 + log 0.5 + log 0.5) = 2.59

With increased prediction accuracy:
          Sample #1  Sample #2  Sample #3               Sample #1  Sample #2  Sample #3
Class 1     0.4        0.2        0.1                      1          0          0
Class 2     0.1        0.7        0.1                      0          1          0
Class 3     0.5        0.1        0.8                      0          0          1

CE = -(log 0.4 + log 0.7 + log 0.8) = 1.50
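A two-line NumPy check of the numbers above, using the probabilities each sample assigns to its true class (natural log, summed over the three samples as on the slide):

```python
import numpy as np

# Probabilities assigned to the true class of each sample (from the tables above)
before = np.array([0.3, 0.5, 0.5])
after  = np.array([0.4, 0.7, 0.8])

print(-np.sum(np.log(before)))   # ~2.59
print(-np.sum(np.log(after)))    # ~1.50  (better predictions -> lower loss)
```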

19
LeNet, Digit Classification

• The network that started it all (and then stopped for ~14 years)

20
AlexNet

• AlexNet is an 8-layer CNN model that trains two identical structures in parallel, utilizing GPUs
• It replaces the previously common tanh and sigmoid functions with the "ReLU" function, which converges about six times faster than those activation functions; following this, most subsequent models use the ReLU function

Krizhevsky, Alex, Ilya Sutskever, and Geoffrey E. Hinton. "ImageNet classification with deep convolutional neural networks." Communications
of the ACM 60.6 (2012): 84-90. 21
VGGNet

• The Oxford University research team VGG developed VGGNet, which won second place in the 2014 ImageNet image recognition competition
• The VGGNet model confirmed that the deeper the network, the better the performance, and it employed stacks of numerous tiny 3x3 kernels in place of large kernels (see the comparison sketched below)

• VGG-16, VGG-19, etc. are variants of the configuration that differ in depth
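A back-of-the-envelope comparison (not from the slides) of why stacked 3x3 kernels are cheaper than a single large kernel with the same receptive field; C = 64 input/output channels is an arbitrary choice and biases are ignored:

```python
# Weight count for one large kernel vs. a stack of 3x3 kernels with the same
# receptive field, with C input and C output channels, biases ignored.
C = 64
one_5x5   = 5 * 5 * C * C            # receptive field 5x5
two_3x3   = 2 * (3 * 3 * C * C)      # also receptive field 5x5, ~28% fewer weights
one_7x7   = 7 * 7 * C * C
three_3x3 = 3 * (3 * 3 * C * C)      # receptive field 7x7, ~45% fewer weights
print(one_5x5, two_3x3, one_7x7, three_3x3)
```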

Simonyan, Karen, and Andrew Zisserman. "Very deep convolutional networks for large-scale image recognition." arXiv preprint
arXiv:1409.1556 (2014). 22
GoogLeNet (Inception module)

• The inception module was proposed, using different kernel sizes and bottleneck structures to address the computational inefficiency of VGG
• It effectively extracts features by employing parallel convolutional layers at several kernel sizes
• Global average pooling (GAP) was used in place of the last FC layer to significantly reduce the size of the model and improve accuracy and computational efficiency (see the sketch below)
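A minimal PyTorch sketch (not GoogLeNet's actual code) of global average pooling replacing a large flatten-then-FC classifier head; the channel count and class count are arbitrary:

```python
import torch
import torch.nn as nn

x = torch.randn(1, 512, 7, 7)                 # final feature maps (e.g. 512 channels)
gap = nn.AdaptiveAvgPool2d(1)                 # global average pooling
head = nn.Linear(512, 1000)                   # small classifier head

logits = head(gap(x).flatten(1))
print(logits.shape)                           # torch.Size([1, 1000])
# GAP replaces a huge flatten->FC layer (512*7*7 inputs) with a 512-dim vector
```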

Szegedy, Christian, et al. "Going deeper with convolutions." Proceedings of the IEEE conference on computer vision and pattern recognition.
(2015). 23
ResNet

• ResNet uses residual connections in a block structure to handle the gradient vanishing and degradation problems efficiently
• GoogLeNet has only 22 layers because gradient vanishing worsens as the network becomes deeper, while ResNet can create a deep network with up to 152 layers

This is similar to an open-book test where you are given material that you have already learned.

In $H(x) = F(x) + x$, we train so that $F(x) = H(x) - x$, which corresponds to the additional amount to be learned, becomes 0; $H(x) - x$ is called the "residual"

• Residual blocks utilize shortcuts that add the input values to the output values
• By simply adding the input x, the network uses "skip connections" so that each layer learns only the small residual information beyond the existing information (a minimal block is sketched below)
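A minimal PyTorch sketch of a basic residual block in this spirit (not the official ResNet implementation); the channel count is arbitrary:

```python
import torch
import torch.nn as nn

class ResidualBlock(nn.Module):
    """Basic residual block: output = ReLU(F(x) + x), so the conv path only has
    to learn the residual F(x) = H(x) - x."""
    def __init__(self, channels):
        super().__init__()
        self.conv1 = nn.Conv2d(channels, channels, 3, padding=1, bias=False)
        self.bn1 = nn.BatchNorm2d(channels)
        self.conv2 = nn.Conv2d(channels, channels, 3, padding=1, bias=False)
        self.bn2 = nn.BatchNorm2d(channels)
        self.relu = nn.ReLU(inplace=True)

    def forward(self, x):
        out = self.relu(self.bn1(self.conv1(x)))
        out = self.bn2(self.conv2(out))
        return self.relu(out + x)          # skip connection adds the input back

block = ResidualBlock(64)
print(block(torch.randn(1, 64, 32, 32)).shape)   # torch.Size([1, 64, 32, 32])
```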
He, Kaiming, et al. "Deep residual learning for image recognition." Proceedings of the IEEE conference on computer vision and pattern recognition.
(2016). 24
ResNet (bottleneck structure)

• Because such a deep network demands a significant amount of computation, a bottleneck block is used in deep models
• A 1x1 convolution is applied before the 3x3 convolution to reduce the number of channels, and another 1x1 convolution afterwards restores the channel count
• Thus, when utilizing a deep ResNet (or any other deep network), it is recommended to employ bottleneck blocks

[Figure: standard residual block vs. bottleneck block]
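A minimal PyTorch sketch of a bottleneck block in this spirit (not the official ResNet implementation); the 256 → 64 → 256 channel choice mirrors the common 4x reduction but is an assumption here:

```python
import torch
import torch.nn as nn

class BottleneckBlock(nn.Module):
    """1x1 conv reduces channels, 3x3 conv works on the reduced maps,
    and a final 1x1 conv restores the channel count before the skip add."""
    def __init__(self, channels, reduced):
        super().__init__()
        self.path = nn.Sequential(
            nn.Conv2d(channels, reduced, 1, bias=False),
            nn.BatchNorm2d(reduced), nn.ReLU(inplace=True),
            nn.Conv2d(reduced, reduced, 3, padding=1, bias=False),
            nn.BatchNorm2d(reduced), nn.ReLU(inplace=True),
            nn.Conv2d(reduced, channels, 1, bias=False),
            nn.BatchNorm2d(channels),
        )
        self.relu = nn.ReLU(inplace=True)

    def forward(self, x):
        return self.relu(self.path(x) + x)

block = BottleneckBlock(channels=256, reduced=64)
print(block(torch.randn(1, 256, 16, 16)).shape)   # torch.Size([1, 256, 16, 16])
```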

He, Kaiming, et al. "Deep residual learning for image recognition." Proceedings of the IEEE conference on computer vision and pattern recognition.
(2016). 25
Classification model performance

26
