
Deep Learning Basics

Lecture 6: Convolutional NN
Princeton University COS 495
Instructor: Yingyu Liang
Review: convolutional layers
Convolution: two-dimensional case

Input (3x4):
a b c d
e f g h
i j k l

Kernel/filter (2x2):
w x
y z

Feature map (first two entries):
aw + bx + ey + fz    bw + cx + fy + gz
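As a concrete check of the arithmetic above, here is a minimal NumPy sketch (not from the slides) that computes the 2x3 feature map of a 3x4 input with a 2x2 kernel; the numeric arrays stand in for the letter grids a..l and w..z.

import numpy as np

# 3x4 input and 2x2 kernel, standing in for the letter grids above
inp = np.arange(12, dtype=float).reshape(3, 4)   # plays the role of a..l
ker = np.array([[1.0, 2.0],                      # plays the role of w, x
                [3.0, 4.0]])                     # plays the role of y, z

kh, kw = ker.shape
out = np.zeros((inp.shape[0] - kh + 1, inp.shape[1] - kw + 1))
for i in range(out.shape[0]):
    for j in range(out.shape[1]):
        # e.g. out[0, 0] corresponds to aw + bx + ey + fz above
        out[i, j] = np.sum(inp[i:i + kh, j:j + kw] * ker)

print(out)   # 2x3 feature map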
Convolutional layers
• The same kernel weights are shared across all output nodes
• 𝑛 input nodes, kernel size 𝑘, 𝑚 output nodes

Figure from Deep Learning, by Goodfellow, Bengio, and Courville
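To make the weight sharing concrete, here is a small 1D sketch with made-up sizes: the same 𝑘 kernel weights produce every output node, so the layer has 𝑘 parameters rather than the 𝑛 × 𝑚 of a fully connected layer.

import numpy as np

n, k = 8, 3                       # hypothetical sizes: n input nodes, kernel size k
x = np.random.randn(n)            # input
w = np.random.randn(k)            # the SAME k weights are reused for every output node

m = n - k + 1                     # number of output nodes (stride 1, no padding)
y = np.array([np.dot(w, x[i:i + k]) for i in range(m)])

print(m, y.shape)                 # m output nodes computed from only k parameters;
                                  # a fully connected layer would need n * m weights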


Terminology

Figure from Deep Learning, by Goodfellow, Bengio, and Courville
Case study: LeNet-5
LeNet-5
• Proposed in "Gradient-based learning applied to document recognition", by Yann LeCun, Leon Bottou, Yoshua Bengio and Patrick Haffner, in Proceedings of the IEEE, 1998
• Applies convolution to 2D images (MNIST) and is trained with backpropagation
• Structure: 2 convolutional layers (with pooling) + 3 fully connected layers

• Input size: 32x32x1
• Convolution kernel size: 5x5
• Pooling: 2x2
LeNet-5, layer by layer:
• Convolutional layer 1: filter 5x5, stride 1x1, #filters: 6
• Pooling layer 1: 2x2, stride 2
• Convolutional layer 2: filter 5x5x6, stride 1x1, #filters: 16
• Pooling layer 2: 2x2, stride 2
• Fully connected layer 1: weight matrix 400x120
• Fully connected layer 2: weight matrix 120x84
• Output layer: weight matrix 84x10

Figure from Gradient-based learning applied to document recognition, by Y. LeCun, L. Bottou, Y. Bengio and P. Haffner
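The layer sizes in the walkthrough above can be reproduced with a short sketch. The following tf.keras version is only an illustration, not the original 1998 implementation: it assumes valid (no-padding) convolutions, tanh units, and plain average pooling, and it omits LeNet-5's sparse connection table between S2 and C3.

import tensorflow as tf

# A minimal LeNet-5-style sketch; layer shapes match the walkthrough above.
model = tf.keras.Sequential([
    tf.keras.layers.Conv2D(6, kernel_size=5, activation='tanh',
                           input_shape=(32, 32, 1)),               # 32x32x1 -> 28x28x6
    tf.keras.layers.AveragePooling2D(pool_size=2, strides=2),      # -> 14x14x6
    tf.keras.layers.Conv2D(16, kernel_size=5, activation='tanh'),  # -> 10x10x16
    tf.keras.layers.AveragePooling2D(pool_size=2, strides=2),      # -> 5x5x16
    tf.keras.layers.Flatten(),                                     # -> 400
    tf.keras.layers.Dense(120, activation='tanh'),                 # weight matrix 400x120
    tf.keras.layers.Dense(84, activation='tanh'),                  # weight matrix 120x84
    tf.keras.layers.Dense(10, activation='softmax'),               # weight matrix 84x10
])
model.summary()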
Software platforms for CNN
Updated in April 2016; check online for more recent ones
Platform: Marvin (marvin.is)
LeNet in Marvin: convolutional layer
LeNet in Marvin: pooling layer
LeNet in Marvin: fully connected layer
Platform: Caffe (caffe.berkeleyvision.org)
LeNet in Caffe
Platform: Tensorflow (tensorflow.org)
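As an illustration of what a convolutional layer looks like in TensorFlow (a sketch with made-up tensor sizes, not the code shown on the original slides):

import tensorflow as tf

# Hypothetical batch of 32x32 grayscale images and six 5x5 filters
images  = tf.random.normal([1, 32, 32, 1])   # batch x height x width x channels
filters = tf.random.normal([5, 5, 1, 6])     # height x width x in_channels x out_channels

feature_maps = tf.nn.conv2d(images, filters, strides=1, padding='VALID')
print(feature_maps.shape)                    # (1, 28, 28, 6)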
Others
• Theano – CPU/GPU symbolic expression compiler in Python (from the MILA lab at University of Montreal)
• Torch – provides a Matlab-like environment for state-of-the-art machine learning algorithms in Lua
• Lasagne – a lightweight library to build and train neural networks in Theano

• See: http://deeplearning.net/software_links/
Optimization: momentum
Basic algorithms
• Minimize the (regularized) empirical loss
  $\hat{L}_R(\theta) = \frac{1}{n} \sum_{t=1}^{n} l(\theta, x_t, y_t) + R(\theta)$
  where the hypothesis is parametrized by $\theta$

• Gradient descent
  $\theta_{t+1} = \theta_t - \eta_t \nabla \hat{L}_R(\theta_t)$
Mini-batch stochastic gradient descent
• Instead of one data point, work with a small batch of $b$ points
  $(x_{tb+1}, y_{tb+1}), \ldots, (x_{tb+b}, y_{tb+b})$

• Update rule
  $\theta_{t+1} = \theta_t - \eta_t \nabla \left[ \frac{1}{b} \sum_{1 \le i \le b} l(\theta_t, x_{tb+i}, y_{tb+i}) + R(\theta_t) \right]$
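A minimal NumPy sketch of one mini-batch SGD step, assuming (hypothetically, just to make the gradient concrete) a squared loss on a linear model and an L2 regularizer:

import numpy as np

def grad_loss(theta, x, y):
    # hypothetical per-example gradient: squared loss for a linear model x @ theta
    return (x @ theta - y) * x

def sgd_step(theta, batch_x, batch_y, lr, reg=0.0):
    # average the per-example gradients over the mini-batch of b points
    g = np.mean([grad_loss(theta, x, y) for x, y in zip(batch_x, batch_y)], axis=0)
    g += reg * theta              # gradient of R(theta) = (reg / 2) * ||theta||^2
    return theta - lr * g         # theta_{t+1} = theta_t - eta_t * gradient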
Momentum
• Drawback of SGD: can be slow when gradient is small

• Observation: when the gradient is consistent across consecutive steps, one can take larger steps
• Metaphor: a marble rolling down a gentle slope
Momentum

Contour: loss function; Path: SGD with momentum; Arrow: stochastic gradient

Figure from Deep Learning, by Goodfellow, Bengio, and Courville

Momentum
• Work with a small batch of $b$ points $(x_{tb+1}, y_{tb+1}), \ldots, (x_{tb+b}, y_{tb+b})$
• Keep a momentum variable $v_t$, and set a decay rate $\alpha$

• Update rule
  $v_t = \alpha v_{t-1} - \eta_t \nabla \left[ \frac{1}{b} \sum_{1 \le i \le b} l(\theta_t, x_{tb+i}, y_{tb+i}) + R(\theta_t) \right]$
  $\theta_{t+1} = \theta_t + v_t$

• Practical guide: $\alpha$ is set to 0.5 until the initial learning stabilizes and then is increased to 0.9 or higher.
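A sketch of the same update with momentum, under the same hypothetical squared-loss setup as the SGD sketch above:

import numpy as np

def grad_loss(theta, x, y):
    # same hypothetical squared-loss gradient as in the SGD sketch
    return (x @ theta - y) * x

def momentum_step(theta, v, batch_x, batch_y, lr, alpha, reg=0.0):
    # mini-batch gradient of the regularized loss
    g = np.mean([grad_loss(theta, x, y) for x, y in zip(batch_x, batch_y)], axis=0)
    g += reg * theta              # regularizer R(theta) = (reg / 2) * ||theta||^2
    v = alpha * v - lr * g        # v_t = alpha * v_{t-1} - eta_t * gradient
    theta = theta + v             # theta_{t+1} = theta_t + v_t
    return theta, v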
