CSE 471: MACHINE LEARNING
Modern CNN architectures
LeNet
Input: Handwritten digits (single channel)
Output: Probability over 10 possible outcomes
At a high level, LeNet (LeNet-5) consists of two parts:
A convolutional encoder consisting of two convolutional layers
A dense block consisting of three fully connected layers
LeCun, Y., Bottou, L., Bengio, Y., & Haffner, P. (1998). Gradient-based learning applied to document
recognition. Proceedings of the IEEE, 86(11), 2278–2324.
LeNet
Convolution block
Convolutional layer (5 x 5 kernel)
Sigmoid activation
Average pooling
2 x 2 (stride 2)
Spatial downsampling
Output channels
Layer 1: 6 @ 28 x 28
Layer 2: 16 @ 10 x 10
The feature map is flattened before being passed to the dense block
LeNet
Dense block
3 Fully connected layers
Layer 1: 120 neurons
Layer 2: 84 neurons
Layer 3: 10 neurons
LeNet (PyTorch)
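A minimal PyTorch sketch of the architecture as described on the preceding slides (layer sizes follow the slides; the original LeNet-5 differs in a few details, e.g. its final Gaussian connections):

```python
import torch
from torch import nn

# LeNet-5 per the slides: two conv + sigmoid + average-pool stages,
# then a dense block of 120 -> 84 -> 10 units.
net = nn.Sequential(
    nn.Conv2d(1, 6, kernel_size=5, padding=2), nn.Sigmoid(),  # 6 @ 28x28
    nn.AvgPool2d(kernel_size=2, stride=2),                    # 6 @ 14x14
    nn.Conv2d(6, 16, kernel_size=5), nn.Sigmoid(),            # 16 @ 10x10
    nn.AvgPool2d(kernel_size=2, stride=2),                    # 16 @ 5x5
    nn.Flatten(),                                             # 16*5*5 = 400
    nn.Linear(16 * 5 * 5, 120), nn.Sigmoid(),
    nn.Linear(120, 84), nn.Sigmoid(),
    nn.Linear(84, 10),
)

x = torch.randn(1, 1, 28, 28)  # a batch of one 28x28 grayscale digit
print(net(x).shape)            # torch.Size([1, 10])
```

Note how the comments track the slide's channel counts (6 @ 28x28, 16 @ 10x10) and the 400-dimensional flattened feature map feeding the dense block.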
Xavier Initialization
Let
o_i be the output of a fully connected layer (without nonlinearity)
There are n_in inputs x_j with associated weights w_ij, so o_i = Σ_j w_ij x_j
Weights are drawn independently from the same distribution, with mean 0 and variance σ²
The inputs x_j also have mean 0 and variance γ²
Independent of the weights
Independent of each other
Then E[o_i] = 0 and Var[o_i] = n_in σ² γ²
Glorot, X., & Bengio, Y. (2010). Understanding the difficulty of training deep feedforward neural networks.
Proceedings of the thirteenth international conference on artificial intelligence and statistics (pp. 249–256).
Xavier Initialization
Variance can be kept fixed if
n_in σ² = 1
Following the same reasoning during backpropagation,
the gradients' variance can be kept fixed if
n_out σ² = 1
Both conditions cannot hold simultaneously (unless n_in = n_out), therefore we try to achieve the compromise
½ (n_in + n_out) σ² = 1, i.e., σ² = 2 / (n_in + n_out)
Xavier Initialization
Sampling weights from a Gaussian N(0, σ²) with σ² = 2 / (n_in + n_out)
Sampling weights from a uniform distribution U(−a, a) with a = √(6 / (n_in + n_out)), since U(−a, a) has variance a² / 3
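The uniform variant can be sketched in a few lines of NumPy (the function name and layer sizes are illustrative):

```python
import numpy as np

def xavier_uniform(n_in, n_out, rng):
    # U(-a, a) has variance a^2 / 3; choosing a = sqrt(6 / (n_in + n_out))
    # yields the target variance sigma^2 = 2 / (n_in + n_out).
    a = np.sqrt(6.0 / (n_in + n_out))
    return rng.uniform(-a, a, size=(n_out, n_in))

rng = np.random.default_rng(0)
W = xavier_uniform(400, 120, rng)        # e.g. LeNet's first dense layer
print(W.var())                           # empirically close to 2 / (400 + 120)
```

The empirical variance of the sampled matrix should be close to 2 / (n_in + n_out) ≈ 0.0038 for this layer size.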
LeNet
AlexNet
Runs on GPU hardware
Won the ImageNet Large Scale Visual Recognition Challenge 2012 by a phenomenally large margin
Architecture
5 Convolutional layers
3 fully connected layers
ReLU activation
Krizhevsky, A., Sutskever, I., & Hinton, G. E. (2012). ImageNet classification with deep convolutional
neural networks. Advances in neural information processing systems (pp. 1097–1105).
AlexNet
Input: 224 x 224, 3-channel
11 x 11 filters in the first layer
10 times more convolution channels/filters than LeNet
Uses dropout
Image augmentation
Flipping
Cropping
Color changes
Dropout
Dropout
Drop out some neurons during training
On each iteration
Layer by layer
Different neurons get dropped in different iterations
Breaks up co-adaptation
Co-adaptation. Neural network overfitting is characterized by a state in which each layer relies on a specific pattern of activations in the previous layer.
Dropout
Need to normalize the activations of the retained nodes
Each intermediate activation h is replaced by a random variable h′:
h′ = 0 with probability p, and h′ = h / (1 − p) otherwise
The expectation remains unchanged, i.e., E[h′] = h
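This "inverted dropout" rescaling can be sketched in NumPy (the function name is illustrative):

```python
import numpy as np

def dropout(h, p, rng):
    # Zero each activation with probability p and rescale the survivors
    # by 1 / (1 - p), so that E[h'] = h elementwise.
    mask = rng.random(h.shape) >= p
    return mask * h / (1.0 - p)

rng = np.random.default_rng(0)
h = np.ones(100_000)
h_prime = dropout(h, p=0.5, rng=rng)
print(h_prime.mean())  # close to 1.0, matching E[h'] = h
```

Because the rescaling is done at training time, nothing needs to be changed at test time: the layer is simply left out.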
Learned filters (96)
AlexNet
LeNet vs. AlexNet
AlexNet (PyTorch)
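A PyTorch sketch of the AlexNet layer stack (a single-branch version; the original paper split computation across two GPUs, and exact padding choices vary between descriptions):

```python
import torch
from torch import nn

# AlexNet-style stack: 5 conv layers, 3 fully connected layers,
# ReLU activations, and dropout in the dense block.
net = nn.Sequential(
    nn.Conv2d(3, 96, kernel_size=11, stride=4, padding=1), nn.ReLU(),
    nn.MaxPool2d(kernel_size=3, stride=2),
    nn.Conv2d(96, 256, kernel_size=5, padding=2), nn.ReLU(),
    nn.MaxPool2d(kernel_size=3, stride=2),
    nn.Conv2d(256, 384, kernel_size=3, padding=1), nn.ReLU(),
    nn.Conv2d(384, 384, kernel_size=3, padding=1), nn.ReLU(),
    nn.Conv2d(384, 256, kernel_size=3, padding=1), nn.ReLU(),
    nn.MaxPool2d(kernel_size=3, stride=2),
    nn.Flatten(),                                  # 256 * 5 * 5 = 6400
    nn.Linear(256 * 5 * 5, 4096), nn.ReLU(), nn.Dropout(p=0.5),
    nn.Linear(4096, 4096), nn.ReLU(), nn.Dropout(p=0.5),
    nn.Linear(4096, 1000),                         # 1000 ImageNet classes
)

x = torch.randn(1, 3, 224, 224)
print(net(x).shape)  # torch.Size([1, 1000])
```

Compared with the LeNet sketch, note the 11 x 11 first-layer filters, the roughly tenfold increase in channels, ReLU in place of sigmoid, and max pooling in place of average pooling.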
VGG
Visual Geometry Group (VGG) at Oxford University
Neurons → Layers → Blocks
Basic VGG block
A convolution layer with padding
A nonlinearity (e.g. ReLU)
A pooling layer (e.g. max pooling)
In the original VGG paper, the authors employed convolutions with 3 x 3 kernels and padding of 1 (keeping height and width) and 2 x 2 max pooling with stride of 2 (halving the resolution after each block)
Simonyan, K., & Zisserman, A. (2014). Very deep convolutional networks for large-scale image
recognition. arXiv preprint arXiv:1409.1556.
VGG
VGG
Original VGG network
5 convolutional blocks
Block# 1, 2: 1 Conv. layer each
Block# 3, 4, 5: 2 Conv. layers each
Fully connected block
Same as AlexNet
Called VGG-11
8 Conv. Layers
3 FC layers
Uses dropout
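The block and the VGG-11 layout above can be sketched in PyTorch as follows (the `vgg_block` and `conv_arch` names are illustrative; the classifier sizes assume a 224 x 224 input, which five halvings reduce to 7 x 7):

```python
import torch
from torch import nn

def vgg_block(num_convs, in_channels, out_channels):
    # (Conv 3x3, pad 1) + ReLU repeated, then 2x2 max pooling with stride 2.
    layers = []
    for _ in range(num_convs):
        layers += [nn.Conv2d(in_channels, out_channels,
                             kernel_size=3, padding=1),
                   nn.ReLU()]
        in_channels = out_channels
    layers.append(nn.MaxPool2d(kernel_size=2, stride=2))
    return nn.Sequential(*layers)

# VGG-11: (num_convs, out_channels) per block -> 1+1+2+2+2 = 8 conv layers.
conv_arch = ((1, 64), (1, 128), (2, 256), (2, 512), (2, 512))

blocks, in_channels = [], 3
for num_convs, out_channels in conv_arch:
    blocks.append(vgg_block(num_convs, in_channels, out_channels))
    in_channels = out_channels

net = nn.Sequential(
    *blocks, nn.Flatten(),
    # AlexNet-style fully connected block with dropout.
    nn.Linear(512 * 7 * 7, 4096), nn.ReLU(), nn.Dropout(0.5),
    nn.Linear(4096, 4096), nn.ReLU(), nn.Dropout(0.5),
    nn.Linear(4096, 1000),
)

x = torch.randn(1, 3, 224, 224)
print(net(x).shape)  # torch.Size([1, 1000])
```

Deeper VGG variants (VGG-16, VGG-19) change only the `conv_arch` tuple, which is the point of the block abstraction.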
GoogLeNet
Won ImageNet challenge in 2014.
Investigated which kernel sizes work best
Employs a combination of variously sized kernels
The basic block is called the Inception block
Szegedy, C., Liu, W., Jia, Y., Sermanet, P., Reed, S., Anguelov, D., … Rabinovich, A. (2015). Going deeper with
convolutions. Proceedings of the IEEE conference on computer vision and pattern recognition (pp. 1–9).
Inception Block
4 parallel paths
Path# 1: 1 x 1 filter
Path# 2: 3 x 3 filter, pad = 1
1 x 1 filter used beforehand to reduce channels
Path# 3: 5 x 5 filter, pad = 2
1 x 1 filter used beforehand to reduce channels
Path# 4: 3 x 3 MaxPool, pad = 1
1 x 1 filter used afterwards to reduce channels
Input and output have the same height and width
Channel counts vary across the different paths; the path outputs are concatenated along the channel dimension
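The four-path block can be sketched in PyTorch as follows (interface follows a common presentation: `c2` and `c3` are (reduce, out) channel pairs for the 1 x 1 reduction plus main convolution):

```python
import torch
from torch import nn
from torch.nn import functional as F

class Inception(nn.Module):
    # c1..c4 are the output channels of the four parallel paths.
    def __init__(self, in_channels, c1, c2, c3, c4):
        super().__init__()
        self.p1_1 = nn.Conv2d(in_channels, c1, kernel_size=1)
        self.p2_1 = nn.Conv2d(in_channels, c2[0], kernel_size=1)
        self.p2_2 = nn.Conv2d(c2[0], c2[1], kernel_size=3, padding=1)
        self.p3_1 = nn.Conv2d(in_channels, c3[0], kernel_size=1)
        self.p3_2 = nn.Conv2d(c3[0], c3[1], kernel_size=5, padding=2)
        self.p4_1 = nn.MaxPool2d(kernel_size=3, stride=1, padding=1)
        self.p4_2 = nn.Conv2d(in_channels, c4, kernel_size=1)

    def forward(self, x):
        p1 = F.relu(self.p1_1(x))
        p2 = F.relu(self.p2_2(F.relu(self.p2_1(x))))
        p3 = F.relu(self.p3_2(F.relu(self.p3_1(x))))
        p4 = F.relu(self.p4_2(self.p4_1(x)))
        # Concatenate along the channel dimension; H and W are unchanged.
        return torch.cat((p1, p2, p3, p4), dim=1)

# First inception block of GoogLeNet: 64 + 128 + 32 + 32 = 256 channels.
blk = Inception(192, 64, (96, 128), (16, 32), 32)
print(blk(torch.randn(1, 192, 28, 28)).shape)  # torch.Size([1, 256, 28, 28])
```

The padding values (1 for the 3 x 3 path, 2 for the 5 x 5 path, 1 for the pooling path) are exactly what keeps all four outputs at the same height and width so that concatenation is possible.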
GoogLeNet
GoogLeNet
7x7 filter, stride=2, pad=3, 64 channels
3x3 maxpooling
1x1 filter – 64 channels
3x3 filter – 192 channels
2 inception blocks in series
Block# 1: 64 + 128 + 32 + 32 = 256 channels
Block #2: 128 + 192 + 96 + 64 = 480 channels
And so on …
Residual networks (ResNet)
In a traditional network, each layer completely replaces the representation from the preceding layer.
Whereas traditional networks must learn to propagate information, and are subject to catastrophic failure of information propagation for bad choices of the parameters, residual networks propagate information by default.
Functional classes
For non-nested function classes, a larger function class (indicated by area) does not guarantee getting closer to the "truth" function f*. With nested function classes, by contrast, enlarging the class can never move it farther from f*.
ResNet (Intuition)
For deep neural networks, if we can train the newly added layer into an identity function f(x) = x, the new model will be at least as effective as the original model.
Since the new model may find a better solution to fit the training dataset, the added layer might make it easier to reduce training errors.
Won the ImageNet Large Scale Visual Recognition Challenge in 2015.
ResNet Block
ResNet Block
Two 3x3 convolution layers
Same number of output channels
Batch normalization
ReLU activation
ResNet Block
Left: identical input/output channels. Right: non-identical input/output channels (a 1 x 1 convolution adjusts the skip connection)
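The two cases can be sketched in one PyTorch module (the `use_1x1conv` flag selects the 1 x 1 convolution path used when channel counts or resolution differ):

```python
import torch
from torch import nn
from torch.nn import functional as F

class Residual(nn.Module):
    """Two 3x3 convs with batch norm; the output is relu(f(x) + x)."""
    def __init__(self, in_channels, out_channels, use_1x1conv=False, stride=1):
        super().__init__()
        self.conv1 = nn.Conv2d(in_channels, out_channels, kernel_size=3,
                               padding=1, stride=stride)
        self.conv2 = nn.Conv2d(out_channels, out_channels,
                               kernel_size=3, padding=1)
        # Optional 1x1 conv so the skip connection matches f(x)'s shape.
        self.conv3 = (nn.Conv2d(in_channels, out_channels, kernel_size=1,
                                stride=stride) if use_1x1conv else None)
        self.bn1 = nn.BatchNorm2d(out_channels)
        self.bn2 = nn.BatchNorm2d(out_channels)

    def forward(self, x):
        y = F.relu(self.bn1(self.conv1(x)))
        y = self.bn2(self.conv2(y))
        if self.conv3 is not None:
            x = self.conv3(x)
        return F.relu(y + x)

blk = Residual(3, 3)                              # identical channels
print(blk(torch.randn(4, 3, 6, 6)).shape)         # torch.Size([4, 3, 6, 6])
blk = Residual(3, 6, use_1x1conv=True, stride=2)  # non-identical channels
print(blk(torch.randn(4, 3, 6, 6)).shape)         # torch.Size([4, 6, 3, 3])
```

If the two convolutions learn weights near zero, `forward` reduces to (approximately) the identity, which is exactly the intuition from the previous slide.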
ResNeXt block
Simplified diagram
The use of grouped convolution with g groups is g times faster than a dense convolution. It is a bottleneck residual block when the number of intermediate channels b is less than the number of output channels c.
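The g-fold saving is easy to verify in PyTorch via the `groups` argument of `nn.Conv2d` (the layer sizes here are illustrative):

```python
import torch
from torch import nn

# Dense 3x3 convolution: 64 * 64 * 3 * 3 = 36,864 weights.
dense = nn.Conv2d(64, 64, kernel_size=3, padding=1)
# Grouped version with g = 8: each group maps 8 channels to 8 channels,
# so the weight count drops by a factor of g to 4,608.
grouped = nn.Conv2d(64, 64, kernel_size=3, padding=1, groups=8)

print(dense.weight.numel() // grouped.weight.numel())  # 8

# The output shapes are identical; only the channel mixing is restricted
# (each output channel sees just its group's input channels).
x = torch.randn(1, 64, 16, 16)
print(dense(x).shape == grouped(x).shape)  # True
```

ResNeXt typically follows the grouped convolution with a 1 x 1 convolution, which restores mixing across groups.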