DL-19-CNN Sequential Model 210223
Deep Learning
Why Convolution?
Koustav Rudra
21/02/2023
Why Convolution?
• Traditional NN:
– Each weight/parameter used only once
– multiplied by one element of the input and then never revisited
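To make the contrast with convolution concrete, here is a minimal sketch that compares rough parameter counts for a fully connected layer and a convolutional layer, and shows one small kernel being reused at every input position. The layer sizes (a 32x32 RGB input, 64 outputs, a 3x3 kernel) and all function names are illustrative assumptions, not values from the lecture.

```python
import numpy as np

def dense_param_count(n_in, n_out):
    """Fully connected layer: one weight per (input, output) pair, plus biases."""
    return n_in * n_out + n_out

def conv_param_count(kernel_h, kernel_w, c_in, c_out):
    """Convolutional layer: one small kernel per (in, out) channel pair, plus biases."""
    return kernel_h * kernel_w * c_in * c_out + c_out

def conv1d_valid(x, kernel):
    """Slide the SAME kernel over x (cross-correlation, no padding)."""
    k = len(kernel)
    return np.array([np.dot(x[i:i + k], kernel) for i in range(len(x) - k + 1)])

# Hypothetical sizes chosen only for illustration.
dense_params = dense_param_count(32 * 32 * 3, 64)  # every weight touches one input element
conv_params = conv_param_count(3, 3, 3, 64)        # kernel weights are reused at every position

x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
kernel = np.array([1.0, 0.0, -1.0])
y = conv1d_valid(x, kernel)  # the three kernel weights are applied at each of the 3 positions

print(dense_params, conv_params, y)
```

In the dense layer each weight multiplies exactly one input element; in the convolution the same three weights are revisited at every position of the input.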
ImageNet
• Vanishing gradients
– if the factors multiplied together during backpropagation are small (<1),
the gradient shrinks exponentially with depth until it vanishes, making it
impossible for the model to learn
• Exploding gradients
– if the factors are large (>1), the gradient grows exponentially with
depth and eventually blows up
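The exponential behaviour can be illustrated with a deliberately simplified sketch. It assumes that every layer scales the gradient by the same constant factor; real networks multiply by layer-dependent Jacobians, and `gradient_scale` is a hypothetical helper for illustration only.

```python
def gradient_scale(factor, depth):
    """Gradient magnitude after backpropagating through `depth` layers,
    assuming each layer multiplies it by the same `factor` (a simplification)."""
    return factor ** depth

for depth in (10, 50, 100):
    shrinking = gradient_scale(0.9, depth)  # factor < 1: heads towards 0
    growing = gradient_scale(1.1, depth)    # factor > 1: grows without bound
    print(f"depth={depth:3d}  0.9^d={shrinking:.2e}  1.1^d={growing:.2e}")
```

Even factors close to 1 lead to very small or very large gradients once the network is deep enough.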
CNN: Training Issues
• How can we control gradients?
– Better initialization
– Better normalization
– Gating mechanisms (RNNs)
– Residual connections
Residual Connections
• Residual network:
– add the input directly to the output of the transformation (skip connection)
– apply the final ReLU to the sum: y = ReLU(F(x) + x)
Intuition:
• If the identity mapping is optimal, it is easy to drive the weights to 0
• If the optimal mapping is close to the identity, it is easier to learn small
perturbations around it than to learn the whole mapping from scratch
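The intuition can be checked with a minimal NumPy sketch of one residual block. The two-layer form of the transformation, the weight shapes, and the names `residual_block`, `w1` and `w2` are assumptions made for illustration; the input is chosen non-negative (as it would be after an earlier ReLU) so that zero weights reproduce it exactly.

```python
import numpy as np

def relu(z):
    return np.maximum(z, 0.0)

def residual_block(x, w1, w2):
    """y = ReLU(F(x) + x), with F(x) = w2 @ ReLU(w1 @ x) as an assumed transformation."""
    fx = w2 @ relu(w1 @ x)
    return relu(fx + x)  # skip connection: add the input, then the final ReLU

x = np.array([0.5, 1.0, 2.0])  # non-negative input

# With all weights at 0 the transformation vanishes and the block is the identity.
zeros = np.zeros((3, 3))
y_identity = residual_block(x, zeros, zeros)

# With small weights the block computes a small perturbation around the identity.
rng = np.random.default_rng(0)
small = 0.01 * rng.standard_normal((3, 3))
y_near = residual_block(x, small, small)

print(y_identity, y_near)
```

A plain (non-residual) layer with zero weights would instead output all zeros, so it would have to learn the identity explicitly.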
Beyond Classification
DL as Representation Learning
• Representations learned by image-classification models can be reused in
many other tasks involving images
• Deeper networks help, but one has to account for gradient-related problems
What have we learnt so far?
• Linearizing (flattening) the input may not be a good choice where locality
matters
• Convolutional Neural Networks
– Filters that exploit parameter sharing, sparse connectivity and translation
equivariance
– Pooling layers that reduce resolution without compromising
representation quality
• CNNs have been at the forefront of vision related tasks
– Also used in text, graphs
– Large datasets and deeper models have fuelled success
• Deeper networks tend to work better, but one has to be careful about
– Vanishing and exploding gradients
– Better optimization techniques help
– Residual connections and residual networks make deeper convnets
effective
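The filter and pooling points above can be sketched in one dimension. The kernel (a simple difference filter), the toy signal, and the helper names are assumptions for illustration. The sketch shows that shifting the input shifts the filter response by the same amount, and that max pooling halves the resolution while keeping the strongest response.

```python
import numpy as np

def conv1d_valid(x, kernel):
    """Slide one shared kernel over x (cross-correlation, no padding)."""
    k = len(kernel)
    return np.array([np.dot(x[i:i + k], kernel) for i in range(len(x) - k + 1)])

def max_pool1d(x, size):
    """Non-overlapping max pooling; a trailing partial window is dropped."""
    n = len(x) // size
    return x[:n * size].reshape(n, size).max(axis=1)

kernel = np.array([1.0, -1.0])  # hypothetical difference (edge) filter
x = np.array([0.0, 0.0, 1.0, 1.0, 0.0, 0.0, 0.0, 0.0])
x_shifted = np.roll(x, 1)       # same pattern, moved one step to the right

y = conv1d_valid(x, kernel)
y_shifted = conv1d_valid(x_shifted, kernel)

# The response to the shifted input is the shifted response.
same_response_shifted = np.array_equal(y_shifted[1:], y[:-1])

# Pooling reduces the resolution but keeps the peak response.
pooled = max_pool1d(y, 2)

print(y, y_shifted, same_response_shifted, pooled)
```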
Modelling Sequential Data
• What is sequential data?
– Data where the order matters because elements depend on one another
– The man went to the ________ to withdraw money
• Language Modeling
– Predict the next word
– Which word comes next? The boy is playing football in the _______
• field, ground – Highly likely
• mall, shop – Less likely
• The food was not bad, surprisingly, despite the ratings for the restaurant
being bad
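Next-word prediction can be sketched with a simple count-based n-gram model. This is a stand-in chosen for brevity, not a neural sequence model, and the toy corpus, the context size of two words, and the function names are all made up for illustration.

```python
from collections import Counter, defaultdict

def train_ngram(sentences, context_size=2):
    """Count which word follows each context of `context_size` words."""
    counts = defaultdict(Counter)
    for sentence in sentences:
        words = sentence.lower().split()
        for i in range(context_size, len(words)):
            context = tuple(words[i - context_size:i])
            counts[context][words[i]] += 1
    return counts

def next_word_probs(counts, context):
    """Relative frequency of each word seen after `context` (empty dict if unseen)."""
    following = counts.get(tuple(context), Counter())
    total = sum(following.values())
    return {word: c / total for word, c in following.items()} if total else {}

# Made-up toy corpus, for illustration only.
corpus = [
    "the boy is playing football in the field",
    "the girl is playing cricket in the field",
    "the boy is playing football in the ground",
    "the boy is shopping in the mall",
]

counts = train_ngram(corpus)
probs = next_word_probs(counts, ["in", "the"])
print(probs)
```

With only two words of context the model cannot tell that "playing football" makes "field" far more likely than "mall"; capturing such longer dependencies is what motivates models built for sequential data.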