CS436 CS5310 Ee513 L05 CNN2
[Figure: CNN pipeline built from Convolution, ReLU, Pooling, and Fully connected layers]
Input image (9x9, pixel values -1 or 1):
-1 -1 -1 -1 -1 -1 -1 -1 -1
-1  1 -1 -1 -1 -1 -1  1 -1
-1 -1  1 -1 -1 -1  1 -1 -1
-1 -1 -1  1 -1  1 -1 -1 -1
-1 -1 -1 -1  1 -1 -1 -1 -1
-1 -1 -1  1 -1  1 -1 -1 -1
-1 -1  1 -1 -1 -1  1 -1 -1
-1  1 -1 -1 -1 -1 -1  1 -1
-1 -1 -1 -1 -1 -1 -1 -1 -1
[Figure labels at the network output: O, .51]
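The Convolution, ReLU, and Pooling stages named on this slide can be sketched on the 9x9 image in a few lines of plain Python. This is a minimal illustration, not the lecture's implementation: the 3x3 diagonal filter `KERNEL`, the averaging over the window, and the 2x2 stride-2 pooling are all assumptions chosen for the example.

```python
# 9x9 input image from the slide: 1 marks a stroke of the "X", -1 is background.
IMAGE = [
    [-1, -1, -1, -1, -1, -1, -1, -1, -1],
    [-1,  1, -1, -1, -1, -1, -1,  1, -1],
    [-1, -1,  1, -1, -1, -1,  1, -1, -1],
    [-1, -1, -1,  1, -1,  1, -1, -1, -1],
    [-1, -1, -1, -1,  1, -1, -1, -1, -1],
    [-1, -1, -1,  1, -1,  1, -1, -1, -1],
    [-1, -1,  1, -1, -1, -1,  1, -1, -1],
    [-1,  1, -1, -1, -1, -1, -1,  1, -1],
    [-1, -1, -1, -1, -1, -1, -1, -1, -1],
]

# Hypothetical 3x3 filter matching a top-left to bottom-right stroke
# (an assumption for illustration; it does not appear on the slide).
KERNEL = [
    [ 1, -1, -1],
    [-1,  1, -1],
    [-1, -1,  1],
]


def convolve(image, kernel):
    """Slide the filter over a square image (no padding, stride 1).

    Each output value is the mean of the element-wise products, so a
    perfect match between patch and filter scores 1.0.
    """
    k = len(kernel)
    out_size = len(image) - k + 1
    out = []
    for i in range(out_size):
        row = []
        for j in range(out_size):
            total = sum(image[i + a][j + b] * kernel[a][b]
                        for a in range(k) for b in range(k))
            row.append(total / (k * k))
        out.append(row)
    return out


def relu(fmap):
    """Replace every negative activation with 0."""
    return [[max(0.0, v) for v in row] for row in fmap]


def max_pool(fmap, size=2, stride=2):
    """Keep the largest value in each window; incomplete edge windows are dropped."""
    n = (len(fmap) - size) // stride + 1
    return [[max(fmap[i * stride + a][j * stride + b]
                 for a in range(size) for b in range(size))
             for j in range(n)]
            for i in range(n)]


feature_map = convolve(IMAGE, KERNEL)   # 7x7
pooled = max_pool(relu(feature_map))    # 3x3
```

The feature map peaks where the image patch lines up exactly with the diagonal filter, and pooling keeps those peaks while shrinking the map.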
Case Studies
- Methods on ImageNet
  - LeNet-5
  - AlexNet
  - VGG-16
  - ResNet-152
  - Inception / GoogLeNet
  - EfficientNet
- Object Detection
  - YOLO
  - DETR
- Segmentation
  - Segment Anything Model (SAM)
  - CLIPSeg
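Output size = (W - F + 2*P)/S + 1, where W is the input width, F the filter size, P the padding, and S the stride.
AlexNet, first CONV layer: (227 - 11 + 2*0)/4 + 1 = 55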
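The output-size formula is easy to check numerically. The helper below is a small sketch; the function name `conv_output_size` is made up for this example, and it floors the division, which is an assumption about how a non-integer result should be handled.

```python
def conv_output_size(w, f, p, s):
    """Spatial output size of a conv or pool layer: (W - F + 2*P)/S + 1.

    w: input width, f: filter size, p: padding, s: stride.
    Integer (floor) division is used when the stride does not divide evenly.
    """
    return (w - f + 2 * p) // s + 1


# AlexNet first CONV layer from the slide: 227 input, 11x11 filter, pad 0, stride 4.
first_conv = conv_output_size(227, 11, 0, 4)

# A 3x3 CONV with p=1 and stride 1 keeps a 13x13 map at 13x13 ("same").
same_conv = conv_output_size(13, 3, 1, 1)

# A 3x3 POOL with s=2 takes 13x13 down to 6x6.
pool = conv_output_size(13, 3, 0, 2)

print(first_conv, same_conv, pool)
```

The same formula covers pooling layers, since a pool is a sliding window with its own size and stride.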
[Figure: AlexNet architecture. Recoverable labels: three 3x3 CONV layers with p=1 and f = 384, 384, 256; POOL 3x3, s=2; flattened vector of 9216, then 4096 and 4096 units; Softmax over 1000 classes]
Layer Name | No. of filters | Filter size | Stride | Pad | Output Volume | Total Params (with bias) | Total Params (without bias)
Input
[Krizhevsky et al., 2012. ImageNet classification with deep convolutional neural networks]
AlexNet
[Figure: AlexNet, later layers: 3x3 CONV (same) -> 13x13x384 -> 3x3 CONV (same) -> 13x13x384 -> 3x3 CONV (same) -> 13x13x256 -> POOL 3x3, s=2 -> 6x6x256 -> 9216 -> 4096 -> 4096 -> Softmax 1000]
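The "Total Params" columns can be worked out from the volumes in the AlexNet figure. The sketch below covers only the layers whose input and output depths are recoverable from the slide (13x13x384 -> 13x13x384 -> 13x13x256, then 9216 -> 4096 -> 4096 -> 1000). The helper names `conv_params` and `fc_params` are assumptions made for this example.

```python
def conv_params(f, in_depth, n_filters, bias=True):
    """Each filter has f*f*in_depth weights, plus one bias if enabled."""
    weights = f * f * in_depth * n_filters
    return weights + (n_filters if bias else 0)


def fc_params(n_in, n_out, bias=True):
    """A fully connected layer has n_in*n_out weights, plus n_out biases if enabled."""
    return n_in * n_out + (n_out if bias else 0)


# 3x3 CONV layers, depths read from the figure.
conv_384_to_384 = conv_params(3, 384, 384)
conv_384_to_256 = conv_params(3, 384, 256)

# The 6x6x256 volume flattens to 9216 values before the fully connected layers.
flattened = 6 * 6 * 256

fc_9216_to_4096 = fc_params(flattened, 4096)
fc_4096_to_4096 = fc_params(4096, 4096)
fc_4096_to_1000 = fc_params(4096, 1000)

for name, count in [
    ("CONV 3x3, 384 -> 384", conv_384_to_384),
    ("CONV 3x3, 384 -> 256", conv_384_to_256),
    ("FC 9216 -> 4096", fc_9216_to_4096),
    ("FC 4096 -> 4096", fc_4096_to_4096),
    ("FC 4096 -> 1000", fc_4096_to_1000),
]:
    print(f"{name}: {count:,} params (with bias)")
```

Passing `bias=False` gives the "without bias" column. The first fully connected layer alone holds far more parameters than either convolutional layer shown here.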
Table 1
Inception Motivation
Direct path: 28x28x192 -> CONV 5x5, same, f = 32 -> 28x28x32
Bottleneck path: 28x28x192 -> CONV 1x1, 16 filters (each 1x1x192) -> 28x28x16 -> CONV 5x5, 32 filters (each 5x5x16) -> 28x28x32
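One way to see why the 1x1 layer helps is to count multiplications for the two paths on this slide: a direct 5x5 convolution from 28x28x192 to 28x28x32, versus a 1x1 convolution down to 16 channels followed by the 5x5 convolution. The sketch below counts one multiplication per weight per output value; the function name `conv_multiplications` is an assumption for this example.

```python
def conv_multiplications(out_h, out_w, n_filters, f, in_depth):
    """Multiplications in a conv layer.

    Every one of the out_h*out_w*n_filters output values is a dot product
    over an f x f x in_depth window.
    """
    return out_h * out_w * n_filters * f * f * in_depth


# Direct: 28x28x192 -> CONV 5x5, 32 filters -> 28x28x32
direct = conv_multiplications(28, 28, 32, 5, 192)

# Bottleneck: 28x28x192 -> CONV 1x1, 16 filters -> 28x28x16
#             -> CONV 5x5, 32 filters -> 28x28x32
bottleneck = (conv_multiplications(28, 28, 16, 1, 192)
              + conv_multiplications(28, 28, 32, 5, 16))

print(f"direct:     {direct:,}")
print(f"bottleneck: {bottleneck:,}")
print(f"ratio:      {direct / bottleneck:.1f}x")
```

Both paths end at the same 28x28x32 volume, but the bottleneck path needs roughly a tenth of the multiplications.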
[Figure: Inception module on a 28x28x192 input with parallel branches: 1x1 (64), 3x3 (128), 5x5 (32), MAX-POOL (32); every branch output is 28x28]
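Because every branch of the module keeps the 28x28 spatial size, the branch outputs can be stacked along the channel axis. The sketch below computes the resulting volume. The branch depths (64, 128, 32, 32) are read from the figure labels and their pairing with branches is an assumption; `concat_volume` is a name made up for this example.

```python
def concat_volume(height, width, branch_depths):
    """Output volume when same-sized feature maps are concatenated along channels."""
    return (height, width, sum(branch_depths))


# Channel counts per branch as read from the slide (pairing is an assumption).
branches = {"1x1": 64, "3x3": 128, "5x5": 32, "max_pool": 32}

output_volume = concat_volume(28, 28, branches.values())
print(output_volume)
```

The network does not have to pick a single filter size per layer: all branches contribute channels to the same output volume.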
Figure source: https://www.researchgate.net/figure/Recent-ConvNets-proposed-in-ILSVRC_fig1_338797371
Next: Object Detection