DL-19-CNN Sequential Model 210223

The document discusses why convolutional neural networks use convolution and parameter sharing, how they provide equivariance to translation, and their modern applications including image classification on large datasets like ImageNet. It also covers training issues with deep networks like vanishing and exploding gradients, and how techniques like residual connections help address these issues.


CSO507

Deep Learning
Why Convolution?

Koustav Rudra
21/02/2023
Why Convolution?
• Traditional NN:
– Each weight/parameter is used only once: it is multiplied by one element of the input and then never revisited

• Parameter Sharing (tied weights)
– Refers to using the same parameter for more than one function in a model
– Rather than learning a separate set of parameters for every location, we learn only one set

• Sparse connections due to small filter size
– Can detect small, meaningful features such as edges with small kernels
– Store fewer parameters: O(m × n) versus O(k × n), where m is the input size, n the output size, and k the kernel size
– Low memory requirements, faster training and inference
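A minimal PyTorch sketch (not from the slides; the sizes m = 1024 and k = 3 are illustrative) of the parameter saving: a dense layer mapping m inputs to m outputs stores a weight for every input-output pair, while a convolution reuses one small kernel of size k at every position.

import torch.nn as nn

m, k = 1024, 3
dense = nn.Linear(m, m)                            # a separate weight per input-output pair
conv = nn.Conv1d(1, 1, kernel_size=k, padding=1)   # one shared kernel, same output length

n_params = lambda mod: sum(p.numel() for p in mod.parameters())
print(n_params(dense))   # 1024*1024 + 1024 = 1,049,600 parameters
print(n_params(conv))    # 3 + 1 = 4 parameters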
Equivariance
• Equivariance: f(T(x)) = T(f(x))
• CNNs equivariance to translation:
– Shifting the object and then computing its representation gives the same result as computing the representation and then shifting it
– The form of parameter sharing used by CNNs causes each layer to be
equivariant to translation
– This property is useful when we know the same local function is useful
everywhere (e.g. edge detectors)

• Not naturally equivariant to scale or rotation of an image
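A small numpy check of translation equivariance, as a sketch: convolving a shifted signal gives the shifted convolution. Circular shifts and a hand-rolled circular convolution are used here so boundary effects do not interfere; the signal and kernel are arbitrary.

import numpy as np

rng = np.random.default_rng(0)
x = rng.standard_normal(16)        # 1-D input signal
w = np.array([1.0, -2.0, 1.0])     # a small edge-like kernel

def circ_conv(x, w):
    # circular convolution so shifts wrap around cleanly
    n, k = len(x), len(w)
    return np.array([sum(w[j] * x[(i + j) % n] for j in range(k)) for i in range(n)])

lhs = circ_conv(np.roll(x, 3), w)  # shift first, then convolve: f(T(x))
rhs = np.roll(circ_conv(x, w), 3)  # convolve first, then shift: T(f(x))
print(np.allclose(lhs, rhs))       # True: convolution commutes with translation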


CSO507
Deep Learning
CNN – Modern Applications

Koustav Rudra
21/02/2023
ImageNet

• Dataset: over 14 million images across 21,841 categories

• Classification task: produce a list of object categories present in the image (1000 categories)
• Other tasks include:
• “Top-5 error”: the rate at which the model fails to include the correct label among its top 5 predictions
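A hedged sketch of how the top-5 error can be computed from model scores; the toy logits and labels below are random stand-ins, not real predictions.

import numpy as np

def top5_error(logits, labels):
    # logits: (batch, num_classes) scores; labels: (batch,) true class ids
    top5 = np.argsort(-logits, axis=1)[:, :5]    # the 5 highest-scoring classes
    hit = (top5 == labels[:, None]).any(axis=1)  # is the true label among them?
    return 1.0 - hit.mean()                      # fraction of misses

rng = np.random.default_rng(0)
logits = rng.standard_normal((8, 1000))          # 8 images, 1000 ImageNet classes
labels = rng.integers(0, 1000, size=8)
print(top5_error(logits, labels))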
Progress using CNN

• Large datasets help


• Over-parameterized deep networks help
Progress using CNN
• But is learning as simple as stacking more and more layers ?
CNN: Training Issues
• What are the problems in training Deep Nets ?
• Vanishing and Exploding gradients
• The gradients coming from the deeper layers go through repeated matrix multiplications because of the chain rule, and as they approach the earlier layers:

• Vanishing gradients
• if the factors have small values (<1), the gradients shrink exponentially until they vanish, making it impossible for the model to learn

• Exploding gradients
• if the factors have large values (>1), the gradients grow exponentially and eventually blow up
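A tiny numeric sketch of both failure modes: multiplying 50 per-layer gradient factors below 1 drives the product toward zero, while factors above 1 blow it up. The factors 0.9 and 1.1 and the depth are arbitrary illustrations.

depth = 50
g_small = g_large = 1.0
for _ in range(depth):
    g_small *= 0.9   # per-layer factor < 1
    g_large *= 1.1   # per-layer factor > 1
print(g_small)       # ~0.005: the gradient has effectively vanished
print(g_large)       # ~117: the gradient has exploded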
CNN: Training Issues
• How can we control gradients?
• Better initialization
• Better normalization
• Gating mechanism (RNNs)
• Residual Connections
Residual Connections

• Residual network:
• a skip connection copies the input directly to the output of the transformation
• the copied input is summed with the transformed output before the final ReLU

• Very deep and successful architecture

Intuition:
• If the identity mapping were optimal, it is easy to drive the residual weights to 0
• If the optimal mapping is close to identity, it is easier to learn the small fluctuations around it
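A minimal PyTorch sketch of a residual block in the spirit described above: the input is added to the transformed output before the final ReLU. The channel count and kernel sizes are illustrative choices, not the slides' exact architecture.

import torch
import torch.nn as nn

class ResidualBlock(nn.Module):
    def __init__(self, channels):
        super().__init__()
        self.conv1 = nn.Conv2d(channels, channels, 3, padding=1)
        self.conv2 = nn.Conv2d(channels, channels, 3, padding=1)
        self.relu = nn.ReLU()

    def forward(self, x):
        out = self.conv2(self.relu(self.conv1(x)))  # the residual mapping F(x)
        return self.relu(out + x)                   # sum with the skip path, then ReLU

block = ResidualBlock(16)
x = torch.randn(1, 16, 32, 32)
print(block(x).shape)  # torch.Size([1, 16, 32, 32]): shape is preserved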
Beyond Classification
DL as Representation Learning
• We can use the representations learned from models learnt for image
classification in many other tasks involving images

• Users share their learnt representations, e.g. VGGNet, ResNet, ...

• We can use these representations (features of an image) for other supervised tasks
• Visual question answering
• Face detection
• Image localization

• Not only for images – text, images, graphs
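A hedged torchvision sketch of this reuse, assuming torchvision is available: a ResNet-18 pretrained on ImageNet is stripped of its classifier head and used as a fixed feature extractor for some other supervised task.

import torch
import torchvision.models as models

resnet = models.resnet18(weights="IMAGENET1K_V1")               # representation learnt on ImageNet
backbone = torch.nn.Sequential(*list(resnet.children())[:-1])   # drop the final classifier layer
backbone.eval()

with torch.no_grad():
    img = torch.randn(1, 3, 224, 224)    # stand-in for a preprocessed image
    feats = backbone(img).flatten(1)     # a 512-d feature vector per image
print(feats.shape)                       # torch.Size([1, 512])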


Conclusions
• Convolutional networks exploit local structure in the input

• Offer large statistical as well as computational efficiency

• ConvNets have been at the forefront of the DL revolution

• Deeper Networks help but one has to account for gradient-based problems

• Representations learnt from ConvNets can be used for multiple tasks


CSO507
Deep Learning
Sequence Modelling

Koustav Rudra
21/02/2023
What have we learnt so far?
• Linearization of input might not be the desired option where locality
matters
• Convolutional Neural Networks
– Filters that exploit parameter sharing, sparsity and translation
equivariance
– Pooling layers that reduce resolution without compromising
representation quality
• CNNs have been at the forefront of vision related tasks
– Also used in text, graphs
– Large datasets and deeper models have fuelled success
• Deeper layers are better but have to be careful about
– Vanishing and exploding gradients
– Better optimization techniques help
– Residual connections and Residual Networks for effective deeper
convnets
Modelling Sequential Data
• What is sequential data ?
– Data where the order matters – dependencies
– The man went to the ________ to withdraw money

• Language Modeling
– Predict the next word
– Which word comes next? The boy is playing football in the _______
• field, ground – Highly likely
• mall, shop – Less likely

– If you understand language well, you can generate sensible statements that are grammatically correct and capture world knowledge
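A toy sketch of language modelling as next-word prediction: a bigram model estimates P(next word | previous word) from raw counts. The two-sentence corpus is made up to echo the example above.

from collections import Counter, defaultdict

corpus = ("the boy is playing football in the field . "
          "the boy is playing football in the ground .").split()

counts = defaultdict(Counter)
for prev, nxt in zip(corpus, corpus[1:]):
    counts[prev][nxt] += 1

# Which word is most likely to follow "the"?
follows = counts["the"]
total = sum(follows.values())
for word, c in follows.most_common():
    print(word, c / total)   # boy 0.5, field 0.25, ground 0.25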
Modelling Sequential Data
• Time Series forecasting

• What are the problems with modelling sequential data ?


Problem 1: Variable Size Input
• How would you design a feed-forward network, which expects fixed-size inputs, to handle variable input sizes?

• The food was awesome

• The food was not bad, surprisingly, despite the ratings for the restaurant being bad

• Solution 1: take a moving window over the last k input items

• Solution 2: take a bag or set-based representation

• The food was not bad, great in fact

• The food was not great, bad in fact
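A minimal sketch of why the set-based representation fails here: both sentences above contain exactly the same words, so their bags are identical even though the meanings are opposite; word order, and with it any long-range dependency, is lost.

from collections import Counter

s1 = "the food was not bad , great in fact"
s2 = "the food was not great , bad in fact"

bag1, bag2 = Counter(s1.split()), Counter(s2.split())
print(bag1 == bag2)   # True: the bag representation cannot tell them apart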

• How to model long term dependencies?
