DL-19-CNN Sequential Model 210223

The document discusses why convolutional neural networks use convolution and parameter sharing, how they provide equivariance to translation, and their modern applications including image classification on large datasets like ImageNet. It also covers training issues with deep networks like vanishing and exploding gradients, and how techniques like residual connections help address these issues.


CSO507

Deep Learning
Why Convolution?

Koustav Rudra
21/02/2023
Why Convolution?
• Traditional NN:
– Each weight/parameter is used only once: it is multiplied by one element of the input and then never revisited

• Parameter Sharing (tied weights)
– Refers to using the same parameter for more than one function in a model
– Rather than learning a separate set of parameters for every location, we learn only one set

• Sparse connections due to small filter size
– Can detect small, meaningful features such as edges with small kernels
– Store fewer parameters: O(m × n) versus O(k × n), where m is the input size, n the output size, and k the kernel size
– Low memory requirements, faster training and inference
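A minimal PyTorch sketch (not from the slides; the sizes m = 1024 and k = 3 are illustrative) of the parameter saving: a dense layer mapping m inputs to m outputs stores a weight for every input-output pair, while a convolution reuses one small kernel of size k at every position.

import torch.nn as nn

m, k = 1024, 3
dense = nn.Linear(m, m)                            # a separate weight per input-output pair
conv = nn.Conv1d(1, 1, kernel_size=k, padding=1)   # one shared kernel, same output length

n_params = lambda mod: sum(p.numel() for p in mod.parameters())
print(n_params(dense))   # 1024*1024 + 1024 = 1,049,600 parameters
print(n_params(conv))    # 3 + 1 = 4 parameters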
Equivariance
• Equivariance: f(T(x)) = T(f(x))
• CNNs equivariance to translation:
– Shifting the object and then computing its representation gives the same result as computing the representation and then shifting it
– The form of parameter sharing used by CNNs causes each layer to be
equivariant to translation
– This property is useful when we know the same local function is useful
everywhere (e.g. edge detectors)

• Not naturally equivariant to scale or rotation of an image
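A small numpy check of translation equivariance, as a sketch: convolving a shifted signal gives the shifted convolution. Circular shifts and a hand-rolled circular convolution are used here so boundary effects do not interfere; the signal and kernel are arbitrary.

import numpy as np

rng = np.random.default_rng(0)
x = rng.standard_normal(16)        # 1-D input signal
w = np.array([1.0, -2.0, 1.0])     # a small edge-like kernel

def circ_conv(x, w):
    # circular convolution so shifts wrap around cleanly
    n, k = len(x), len(w)
    return np.array([sum(w[j] * x[(i + j) % n] for j in range(k)) for i in range(n)])

lhs = circ_conv(np.roll(x, 3), w)  # shift first, then convolve: f(T(x))
rhs = np.roll(circ_conv(x, w), 3)  # convolve first, then shift: T(f(x))
print(np.allclose(lhs, rhs))       # True: convolution commutes with translation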


CSO507
Deep Learning
CNN – Modern Applications

Koustav Rudra
21/02/2023
ImageNet

• Dataset: over 14 million images across 21,841 categories

• Classification task: produce a list of object categories present in the image (1000 categories)
• Other tasks include:
• “Top-5 error”: the rate at which the model fails to include the correct label among its top 5 predictions
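A hedged sketch of how the top-5 error can be computed from model scores; the toy logits and labels below are random stand-ins, not real predictions.

import numpy as np

def top5_error(logits, labels):
    # logits: (batch, num_classes) scores; labels: (batch,) true class ids
    top5 = np.argsort(-logits, axis=1)[:, :5]    # the 5 highest-scoring classes
    hit = (top5 == labels[:, None]).any(axis=1)  # is the true label among them?
    return 1.0 - hit.mean()                      # fraction of misses

rng = np.random.default_rng(0)
logits = rng.standard_normal((8, 1000))          # 8 images, 1000 ImageNet classes
labels = rng.integers(0, 1000, size=8)
print(top5_error(logits, labels))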
Progress using CNN

• Large datasets help


• Over-parameterized deep networks help
Progress using CNN
• But is learning as simple as stacking more and more layers ?
CNN: Training Issues
• What are the problems in training Deep Nets ?
• Vanishing and Exploding gradients
• The gradients coming from the deeper layers go through repeated matrix multiplications because of the chain rule, and as they approach the earlier layers:

• Vanishing gradients
• if the factors have small values (<1), the gradients shrink exponentially until they vanish, making it impossible for the model to learn

• Exploding gradients
• if the factors have large values (>1), the gradients grow exponentially and eventually blow up
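A tiny numeric sketch of both failure modes: multiplying 50 per-layer gradient factors below 1 drives the product toward zero, while factors above 1 blow it up. The factors 0.9 and 1.1 and the depth are arbitrary illustrations.

depth = 50
g_small = g_large = 1.0
for _ in range(depth):
    g_small *= 0.9   # per-layer factor < 1
    g_large *= 1.1   # per-layer factor > 1
print(g_small)       # ~0.005: the gradient has effectively vanished
print(g_large)       # ~117: the gradient has exploded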
CNN: Training Issues
• How can we control gradients?
• Better initialization
• Better normalization
• Gating mechanism (RNNs)
• Residual Connections
Residual Connections

• Residual network:
• a skip connection copies the input directly to the output of the transformation
• the copied input is summed with the transformed output before the final ReLU

• Very deep and successful architecture

Intuition:
• If the identity mapping were optimal, it is easy to drive the residual weights to 0
• If the optimal mapping is close to identity, it is easier to learn the small fluctuations around it
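A minimal PyTorch sketch of a residual block in the spirit described above: the input is added to the transformed output before the final ReLU. The channel count and kernel sizes are illustrative choices, not the slides' exact architecture.

import torch
import torch.nn as nn

class ResidualBlock(nn.Module):
    def __init__(self, channels):
        super().__init__()
        self.conv1 = nn.Conv2d(channels, channels, 3, padding=1)
        self.conv2 = nn.Conv2d(channels, channels, 3, padding=1)
        self.relu = nn.ReLU()

    def forward(self, x):
        out = self.conv2(self.relu(self.conv1(x)))  # the residual mapping F(x)
        return self.relu(out + x)                   # sum with the skip path, then ReLU

block = ResidualBlock(16)
x = torch.randn(1, 16, 32, 32)
print(block(x).shape)  # torch.Size([1, 16, 32, 32]): shape is preserved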
Beyond Classification
DL as Representation Learning
• We can use the representations learned from models learnt for image
classification in many other tasks involving images

• Users share their learnt representations, e.g. VGGNet, ResNet, ...

• We can use these representations (features of an image) for other supervised tasks
• Visual question answering
• Face detection
• Image localization

• Not only for images – text, images, graphs
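A hedged torchvision sketch of this reuse, assuming torchvision is available: a ResNet-18 pretrained on ImageNet is stripped of its classifier head and used as a fixed feature extractor for some other supervised task.

import torch
import torchvision.models as models

resnet = models.resnet18(weights="IMAGENET1K_V1")               # representation learnt on ImageNet
backbone = torch.nn.Sequential(*list(resnet.children())[:-1])   # drop the final classifier layer
backbone.eval()

with torch.no_grad():
    img = torch.randn(1, 3, 224, 224)    # stand-in for a preprocessed image
    feats = backbone(img).flatten(1)     # a 512-d feature vector per image
print(feats.shape)                       # torch.Size([1, 512])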


Conclusions
• Convolutional networks exploit local structure in the input

• Offer large statistical as well as computational efficiency

• ConvNets have been at the forefront of the DL revolution

• Deeper Networks help but one has to account for gradient-based problems

• Representations learnt from ConvNets can be used for multiple tasks


CSO507
Deep Learning
Sequence Modelling

Koustav Rudra
21/02/2023
What have we learnt so far?
• Linearization of input might not be the desired option where locality
matters
• Convolutional Neural Networks
– Filters that exploit parameter sharing, sparsity and translation
equivariance
– Pooling layers that reduce resolution without compromising
representation quality
• CNNs have been at the forefront of vision related tasks
– Also used in text, graphs
– Large datasets and deeper models have fuelled success
• Deeper layers are better but have to be careful about
– Vanishing and exploding gradients
– Better optimization techniques help
– Residual connections and Residual Networks for effective deeper
convnets
Modelling Sequential Data
• What is sequential data ?
– Data where the order matters – dependencies
– The man went to the ________ to withdraw money

• Language Modeling
– Predict the next word
– Which word comes next? The boy is playing football in the _______
• field, ground – Highly likely
• mall, shop – Less likely

– If you understand language well, you can generate sensible statements that are grammatically correct and capture world knowledge
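A toy sketch of language modelling as next-word prediction: a bigram model estimates P(next word | previous word) from raw counts. The two-sentence corpus is made up to echo the example above.

from collections import Counter, defaultdict

corpus = ("the boy is playing football in the field . "
          "the boy is playing football in the ground .").split()

counts = defaultdict(Counter)
for prev, nxt in zip(corpus, corpus[1:]):
    counts[prev][nxt] += 1

# Which word is most likely to follow "the"?
follows = counts["the"]
total = sum(follows.values())
for word, c in follows.most_common():
    print(word, c / total)   # boy 0.5, field 0.25, ground 0.25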
Modelling Sequential Data
• Time Series forecasting

• What are the problems with modelling sequential data ?


Problem 1: Variable Size Input
• How would you design a feed-forward network, which expects fixed-size inputs, to handle variable input sizes?

• The food was awesome

• The food was not bad, surprisingly, despite the ratings for the restaurant being bad

• Solution 1: take a moving window over the last k input items

• Solution 2: take a bag or set-based representation

• The food was not bad, great in fact

• The food was not great, bad in fact
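A minimal sketch of why the set-based representation fails here: both sentences above contain exactly the same words, so their bags are identical even though the meanings are opposite; word order, and with it any long-range dependency, is lost.

from collections import Counter

s1 = "the food was not bad , great in fact"
s2 = "the food was not great , bad in fact"

bag1, bag2 = Counter(s1.split()), Counter(s2.split())
print(bag1 == bag2)   # True: the bag representation cannot tell them apart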

• How to model long term dependencies?
