CNN Part 2
VisGraph, HKUST
The pooling layer down-samples the volume spatially, operating independently on
each depth slice of the input volume.
Left: The input volume of size [224x224x64] is pooled with filter size 2, stride 2
into an output volume of size [112x112x64]. Notice that the volume depth is
preserved.
Right: The most common down-sampling operation is max, giving rise to max
pooling, here shown with a stride of 2. That is, each max is taken over 4
numbers (a small 2x2 square).
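As a minimal sketch of the max pooling described above, the snippet below pools a single depth slice with a 2x2 window and stride 2. The function name max_pool2d and the 4x4 input values are illustrative assumptions, not taken from the slides; a real network would apply the same operation to every depth slice separately, which is why the depth is preserved.

```python
def max_pool2d(x, size=2, stride=2):
    """Max-pool one depth slice given as a list of rows (hypothetical helper)."""
    h, w = len(x), len(x[0])
    out_h = (h - size) // stride + 1
    out_w = (w - size) // stride + 1
    out = []
    for i in range(out_h):
        row = []
        for j in range(out_w):
            # Gather the size x size window whose top-left corner
            # sits at (i * stride, j * stride), then keep its maximum.
            window = [
                x[i * stride + di][j * stride + dj]
                for di in range(size)
                for dj in range(size)
            ]
            row.append(max(window))
        out.append(row)
    return out


# Illustrative 4x4 depth slice (values are made up for the example).
slice_4x4 = [
    [1, 1, 2, 4],
    [5, 6, 7, 8],
    [3, 2, 1, 0],
    [1, 2, 3, 4],
]
pooled = max_pool2d(slice_4x4)
print(pooled)  # [[6, 8], [3, 4]]
```

Each output value is the maximum of one non-overlapping 2x2 square, so the 4x4 slice shrinks to 2x2.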
Relation between input size, output size and filter size
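The slide's formula did not survive extraction, so the sketch below uses the standard relation for one spatial dimension: O = floor((W - F + 2P) / S) + 1, where W is the input size, F the filter size, P the zero padding and S the stride. These symbol names and the helper output_size are assumptions introduced here for illustration. The same relation covers both convolution and pooling layers.

```python
def output_size(input_size, filter_size, stride=1, padding=0):
    """Spatial output size along one dimension (hypothetical helper).

    Implements O = floor((W - F + 2P) / S) + 1.
    """
    span = input_size + 2 * padding - filter_size
    if span < 0:
        raise ValueError("filter is larger than the padded input")
    return span // stride + 1


# Pooling example from the text: 224 wide, filter 2, stride 2.
print(output_size(224, 2, stride=2))  # 112

# A 5x5 convolution on a 32x32 input with no padding and stride 1.
print(output_size(32, 5))  # 28

# Padding of 1 keeps the size unchanged for a 3x3 filter at stride 1.
print(output_size(224, 3, stride=1, padding=1))  # 224
```

The first call reproduces the [224x224x64] to [112x112x64] pooling example: only width and height change, since the formula is applied per spatial dimension.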
Popular CNNs
• LeNet, 1998
• AlexNet, 2012
• VGGNet, 2014
• ResNet, 2015
Applications
• http://yann.lecun.com/exdb/lenet/index.html
• https://d2l.ai/chapter_convolutional-neural-networks/lenet.html
• https://www.kaggle.com/blurredmachine/lenet-architecture-a-complete-guide
• https://d2l.ai/chapter_convolutional-neural-networks/why-conv.html
• https://www.cs.toronto.edu/~lczhang/aps360_20191/lec/w03/convnet.html
Acknowledgement
• Deep Learning, Andrew Ng
• Deep Learning, Mitesh M. Khapra
• Ian Goodfellow, Yoshua Bengio, Aaron Courville, "Deep Learning",
The MIT Press, 2016
• Hands-On Mathematics for Deep Learning, Jay Dawani