Lecture 10 Basic CNN

The document provides an overview of convolutional neural networks using PyTorch. It begins by revising a fully connected neural network, then introduces convolutional networks and their basic building blocks, convolution and subsampling layers. It explains the convolution operation for a single input channel, demonstrating how a kernel is slid across an input to produce each value of the output feature map, and builds from there to multi-channel convolution, padding, stride, pooling, and a complete CNN trained on a GPU.


PyTorch Tutorial

10. Basic CNN

Lecturer: Hongpu Liu, PyTorch Tutorial @ SLAM Research Group
Revision: Fully Connected Neural Network

Shapes: the input (N, 1, 28, 28) is flattened to (N, 784), then passed through
linear layers 784 -> 512 -> 256 -> 128 -> 64 -> 10, with a ReLU after every
layer except the output layer.

import torch
import torch.nn.functional as F

class Net(torch.nn.Module):
    def __init__(self):
        super(Net, self).__init__()
        self.l1 = torch.nn.Linear(784, 512)
        self.l2 = torch.nn.Linear(512, 256)
        self.l3 = torch.nn.Linear(256, 128)
        self.l4 = torch.nn.Linear(128, 64)
        self.l5 = torch.nn.Linear(64, 10)

    def forward(self, x):
        x = x.view(-1, 784)      # (N, 1, 28, 28) -> (N, 784)
        x = F.relu(self.l1(x))   # -> (N, 512)
        x = F.relu(self.l2(x))   # -> (N, 256)
        x = F.relu(self.l3(x))   # -> (N, 128)
        x = F.relu(self.l4(x))   # -> (N, 64)
        return self.l5(x)        # -> (N, 10)

model = Net()
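As an aside (this check is an addition, not part of the slides), the model's
parameter count can be verified directly: 784*512+512 + 512*256+256 + 256*128+128
+ 128*64+64 + 64*10+10 = 575,050.

print(sum(p.numel() for p in model.parameters()))  # 575050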

Convolutional Neural Network

Input:            1 x 28 x 28
C1 feature maps:  4 x 24 x 24   (5 x 5 convolution)
S1 feature maps:  4 x 12 x 12   (2 x 2 subsampling)
C2 feature maps:  8 x 8 x 8     (5 x 5 convolution)
S2 feature maps:  8 x 4 x 4     (2 x 2 subsampling)
n1, n2:           two fully connected layers
Output:           10 classes (digits 0 to 9)

The convolution and subsampling stages perform feature extraction; the fully
connected layers at the end perform classification.
Convolution

A convolution operates on a patch of the input channels. For an image, the input
channels are Red, Green, and Blue, each with a height and a width; each patch is
mapped to one value in each output channel.
Convolution – Single Input Channel

Input                Kernel            Output

3 4 6 5 7
2 4 6 8 2            1 2 3             211 295 262
1 6 7 8 4      ⨀     4 5 6      =      259 282 214
9 7 4 6 2            7 8 9             251 253 169
3 7 5 4 1

The kernel is laid over each 3 x 3 patch of the input, and the elementwise
products are summed to produce one output value. For the top-left patch:

3∙1 + 4∙2 + 6∙3 + 2∙4 + 4∙5 + 6∙6 + 1∙7 + 6∙8 + 7∙9
  = 3 + 8 + 18 + 8 + 20 + 36 + 7 + 48 + 63 = 211

Sliding the kernel one column to the right gives the next output value:

4∙1 + 6∙2 + 5∙3 + 4∙4 + 6∙5 + 8∙6 + 6∙7 + 7∙8 + 8∙9
  = 4 + 12 + 15 + 16 + 30 + 48 + 42 + 56 + 72 = 295

Repeating this at every position, left to right and top to bottom, fills in the
whole output. A 5 x 5 input convolved with a 3 x 3 kernel, with no padding and
stride 1, therefore yields a (5 - 3 + 1) x (5 - 3 + 1) = 3 x 3 output.
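As a quick check (an addition to the slides, not part of them),
torch.nn.functional.conv2d implements exactly this sliding sum of elementwise
products, so it reproduces the table above:

import torch
import torch.nn.functional as F

# The 5x5 input and 3x3 kernel from the example, shaped (batch, channel, H, W)
x = torch.tensor([[3., 4., 6., 5., 7.],
                  [2., 4., 6., 8., 2.],
                  [1., 6., 7., 8., 4.],
                  [9., 7., 4., 6., 2.],
                  [3., 7., 5., 4., 1.]]).view(1, 1, 5, 5)
k = torch.tensor([[1., 2., 3.],
                  [4., 5., 6.],
                  [7., 8., 9.]]).view(1, 1, 3, 3)

print(F.conv2d(x, k))
# tensor([[[[211., 295., 262.],
#           [259., 282., 214.],
#           [251., 253., 169.]]]])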
Convolution – 3 Input Channels

Each input channel is convolved with its own 3 x 3 kernel, and the three
resulting 3 x 3 maps are summed elementwise into a single output channel. In this
example all three input channels contain the same 5 x 5 matrix as before:

3 4 6 5 7
2 4 6 8 2
1 6 7 8 4
9 7 4 6 2
3 7 5 4 1

Kernel for channel 1        Per-channel output

1 2 3                       211 295 262
4 5 6                       259 282 214
7 8 9                       251 253 169

Kernel for channel 2

9 8 7                       179 245 268
6 5 4                       201 278 256
3 2 1                       239 287 241

Kernel for channel 3

1 4 7                       235 297 248
2 5 8                       253 294 204
3 6 9                       255 259 169

Summing the three per-channel outputs gives the final single output channel:

211 295 262     179 245 268     235 297 248     625 837 778
259 282 214  +  201 278 256  +  253 294 204  =  713 854 674
251 253 169     239 287 241     255 259 169     745 799 579

In shapes: the input is (channel = 3, height = 5, width = 5), the kernel is a
(3, 3, 3) tensor holding one 3 x 3 kernel per input channel, and the output is
(channel = 1, height = 3, width = 3).
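The summed result can be checked the same way (again an added sanity check): one
filter over 3 input channels is a weight tensor of shape (1, 3, 3, 3).

import torch
import torch.nn.functional as F

ch = torch.tensor([[3., 4., 6., 5., 7.],
                   [2., 4., 6., 8., 2.],
                   [1., 6., 7., 8., 4.],
                   [9., 7., 4., 6., 2.],
                   [3., 7., 5., 4., 1.]])
x = ch.repeat(3, 1, 1).unsqueeze(0)   # (1, 3, 5, 5): three identical channels

# One filter = three 3x3 kernels, one per input channel
w = torch.tensor([[[1., 2., 3.], [4., 5., 6.], [7., 8., 9.]],
                  [[9., 8., 7.], [6., 5., 4.], [3., 2., 1.]],
                  [[1., 4., 7.], [2., 5., 8.], [3., 6., 9.]]]).view(1, 3, 3, 3)

print(F.conv2d(x, w))
# tensor([[[[625., 837., 778.],
#           [713., 854., 674.],
#           [745., 799., 579.]]]])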
Convolution – N Input Channels

For n input channels, every channel gets its own 3 x 3 kernel, so the kernel
tensor has shape (n, 3, 3). Convolving the (channel = n, height = 5, width = 5)
input with it and summing over the channels again yields a single output channel
of shape (channel = 1, height = 3, width = 3).
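In other words (an added illustration), convolving an n-channel input with an
(n, 3, 3) kernel tensor is the same as convolving each channel with its own
3 x 3 kernel and summing the n results:

import torch
import torch.nn.functional as F

n = 4
x = torch.randn(1, n, 5, 5)   # n-channel input
w = torch.randn(1, n, 3, 3)   # one (n, 3, 3) kernel tensor -> one output channel

full = F.conv2d(x, w)
per_channel = sum(F.conv2d(x[:, i:i+1], w[:, i:i+1]) for i in range(n))
print(torch.allclose(full, per_channel))  # True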
Convolution – N Input Channels and M Output Channels

To get m output channels, the layer applies m such filters to the n-channel
input image: filter_1 produces Feature map_1, filter_2 produces Feature map_2,
..., filter_m produces Feature map_m. Stacking the m feature maps along the
channel dimension turns an input of shape (n, width_in, height_in) into an
output of shape (m, width_out, height_out).
Convolutional Layer

This layer needs m filters, each of size

    n × kernel_size_width × kernel_size_height

so its weight is a 4-dimensional tensor of shape

    m × n × kernel_size_width × kernel_size_height

mapping (n, width_in, height_in) inputs to (m, width_out, height_out) outputs.
Convolutional Layer

import torch

in_channels, out_channels = 5, 10
width, height = 100, 100
kernel_size = 3
batch_size = 1

input = torch.randn(batch_size,
                    in_channels,
                    width,
                    height)

conv_layer = torch.nn.Conv2d(in_channels,
                             out_channels,
                             kernel_size=kernel_size)

output = conv_layer(input)

print(input.shape)              # torch.Size([1, 5, 100, 100])
print(output.shape)             # torch.Size([1, 10, 98, 98])
print(conv_layer.weight.shape)  # torch.Size([10, 5, 3, 3])
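The spatial size 98 follows from the general Conv2d output formula (noted here
for reference; it matches the PyTorch documentation):

    size_out = floor((size_in + 2*padding - dilation*(kernel_size - 1) - 1) / stride + 1)

With size_in = 100, padding = 0, dilation = 1, kernel_size = 3, and stride = 1,
this gives floor((100 - 2 - 1) / 1 + 1) = 98 in each spatial dimension.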
Convolutional Layer – padding

Without padding, the 3 x 3 kernel on the 5 x 5 input produces the 3 x 3 output
computed earlier:

Input                Kernel            Output

3 4 6 5 7
2 4 6 8 2            1 2 3             211 295 262
1 6 7 8 4      ⨀     4 5 6      =      259 282 214
9 7 4 6 2            7 8 9             251 253 169
3 7 5 4 1

With padding=1, a one-cell border of zeros is added around the input, so the
kernel can be centered on every original cell and the output stays 5 x 5:

Input                      Kernel            Output

0 0 0 0 0 0 0                                 91 168 224 215 127
0 3 4 6 5 7 0              1 2 3             114 211 295 262 149
0 2 4 6 8 2 0        ⨀     4 5 6      =      192 259 282 214 122
0 1 6 7 8 4 0              7 8 9             194 251 253 169  86
0 9 7 4 6 2 0                                 96 112 110  68  31
0 3 7 5 4 1 0
0 0 0 0 0 0 0

The original 3 x 3 result reappears in the center. For example, the new top-left
value comes from a patch that is mostly zeros:

0∙1 + 0∙2 + 0∙3 + 0∙4 + 3∙5 + 4∙6 + 0∙7 + 2∙8 + 4∙9 = 15 + 24 + 16 + 36 = 91
In code:

import torch

input = [3,4,6,5,7,
         2,4,6,8,2,
         1,6,7,8,4,
         9,7,4,6,2,
         3,7,5,4,1]
input = torch.Tensor(input).view(1, 1, 5, 5)  # (batch, channel, height, width)

conv_layer = torch.nn.Conv2d(1, 1, kernel_size=3, padding=1, bias=False)

# Use the 3x3 kernel from the example above as the layer's weight
kernel = torch.Tensor([1,2,3,4,5,6,7,8,9]).view(1, 1, 3, 3)
conv_layer.weight.data = kernel.data

output = conv_layer(input)
print(output)  # the padded 5x5 result shown above
Convolutional Layer – stride=2

With stride=2 the kernel moves two cells at a time, horizontally and vertically,
so only four patch positions fit and the output shrinks to 2 x 2:

Input                Kernel            Output

3 4 6 5 7
2 4 6 8 2            1 2 3             211 262
1 6 7 8 4      ⨀     4 5 6      =      251 169
9 7 4 6 2            7 8 9
3 7 5 4 1

These four values are the corners of the stride-1 output computed earlier.
In code:

import torch

input = [3,4,6,5,7,
         2,4,6,8,2,
         1,6,7,8,4,
         9,7,4,6,2,
         3,7,5,4,1]
input = torch.Tensor(input).view(1, 1, 5, 5)  # (batch, channel, height, width)

conv_layer = torch.nn.Conv2d(1, 1, kernel_size=3, stride=2, bias=False)

kernel = torch.Tensor([1,2,3,4,5,6,7,8,9]).view(1, 1, 3, 3)
conv_layer.weight.data = kernel.data

output = conv_layer(input)
print(output)  # [[211., 262.], [251., 169.]]
Max Pooling Layer

2 x 2 max pooling keeps the maximum of each non-overlapping 2 x 2 block, halving
the width and height:

Input                              Output

3 4 6 5
2 4 6 8      2 x 2 MaxPooling      4 8
1 6 7 8          ->                9 8
9 7 4 6
import torch

input = [3,4,6,5,
2,4,6,8,
1,6,7,8,
9,7,4,6,
]
input = torch.Tensor(input).view(1, 1, 4, 4)  # (batch, channel, height, width)

maxpooling_layer = torch.nn.MaxPool2d(kernel_size=2)  # stride defaults to kernel_size

output = maxpooling_layer(input)
print(output)  # [[4., 8.], [9., 8.]]
A Simple Convolutional Neural Network

(batch, 1, 28, 28)
  -> Conv2d Layer,  filter 5 x 5, C_in: 1,  C_out: 10  -> (batch, 10, 24, 24)
  -> Pooling Layer, filter 2 x 2                       -> (batch, 10, 12, 12)
  -> Conv2d Layer,  filter 5 x 5, C_in: 10, C_out: 20  -> (batch, 20, 8, 8)
  -> Pooling Layer, filter 2 x 2                       -> (batch, 20, 4, 4)
  -> Linear Layer,  C_in: 320,    C_out: 10            -> (batch, 10)
A Simple Convolutional Neural Network – Code

Layer by layer: Conv2d with C_in = 1, C_out = 10, kernel = 5 maps
(batch, 1, 28, 28) to (batch, 10, 24, 24); 2 x 2 max pooling halves that to
(batch, 10, 12, 12); Conv2d with C_in = 10, C_out = 20, kernel = 5 gives
(batch, 20, 8, 8); pooling again gives (batch, 20, 4, 4), which is flattened to
(batch, 320); finally a linear layer with f_in = 320, f_out = 10 produces
(batch, 10).

class Net(torch.nn.Module):
    def __init__(self):
        super(Net, self).__init__()
        self.conv1 = torch.nn.Conv2d(1, 10, kernel_size=5)
        self.conv2 = torch.nn.Conv2d(10, 20, kernel_size=5)
        self.pooling = torch.nn.MaxPool2d(2)
        self.fc = torch.nn.Linear(320, 10)  # 320 = 20 * 4 * 4

    def forward(self, x):
        batch_size = x.size(0)
        x = F.relu(self.pooling(self.conv1(x)))  # -> (batch, 10, 12, 12)
        x = F.relu(self.pooling(self.conv2(x)))  # -> (batch, 20, 4, 4)
        x = x.view(batch_size, -1)               # flatten to (batch, 320)
        x = self.fc(x)
        return x

model = Net()
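To confirm where the 320 comes from (a check added here, not in the slides),
push a dummy batch through the convolutional part and print the shapes:

x = torch.randn(1, 1, 28, 28)                 # one fake MNIST image
h1 = F.relu(model.pooling(model.conv1(x)))
h2 = F.relu(model.pooling(model.conv2(h1)))
print(h1.shape)   # torch.Size([1, 10, 12, 12])
print(h2.shape)   # torch.Size([1, 20, 4, 4]); 20 * 4 * 4 = 320 after flattening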
How to use GPU – 1. Move Model to GPU

Define the device as the first visible CUDA device if CUDA is available, then
convert the parameters and buffers of all modules to CUDA tensors by moving the
model (defined above) to that device:

model = Net()
device = torch.device("cuda:0" if torch.cuda.is_available() else "cpu")
model.to(device)
How to use GPU – 2. Move Tensors to GPU

In the training loop, send the inputs and targets to the same device as the
model at every step:

def train(epoch):
    running_loss = 0.0
    for batch_idx, data in enumerate(train_loader, 0):
        inputs, target = data
        inputs, target = inputs.to(device), target.to(device)
        optimizer.zero_grad()

        # forward + backward + update
        outputs = model(inputs)
        loss = criterion(outputs, target)
        loss.backward()
        optimizer.step()

        running_loss += loss.item()
        if batch_idx % 300 == 299:
            print('[%d, %5d] loss: %.3f' % (epoch + 1, batch_idx + 1, running_loss / 2000))
            running_loss = 0.0
The same applies in the test loop:

def test():
    correct = 0
    total = 0
    with torch.no_grad():
        for data in test_loader:
            inputs, target = data
            inputs, target = inputs.to(device), target.to(device)
            outputs = model(inputs)
            _, predicted = torch.max(outputs.data, dim=1)
            total += target.size(0)
            correct += (predicted == target).sum().item()
    print('Accuracy on test set: %d %% [%d/%d]' % (100 * correct / total, correct, total))
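These functions rely on train_loader, test_loader, criterion, and optimizer
defined elsewhere in the course. A minimal sketch of that setup, where the batch
size, normalization constants, learning rate, and momentum are illustrative
assumptions rather than the lecture's exact values:

import torch
from torchvision import datasets, transforms

# Assumed MNIST setup; hyperparameters are placeholders, not the lecture's.
transform = transforms.Compose([transforms.ToTensor(),
                                transforms.Normalize((0.1307,), (0.3081,))])
train_dataset = datasets.MNIST(root='../dataset/mnist/', train=True,
                               download=True, transform=transform)
test_dataset = datasets.MNIST(root='../dataset/mnist/', train=False,
                              download=True, transform=transform)
train_loader = torch.utils.data.DataLoader(train_dataset, batch_size=64, shuffle=True)
test_loader = torch.utils.data.DataLoader(test_dataset, batch_size=64, shuffle=False)

criterion = torch.nn.CrossEntropyLoss()
optimizer = torch.optim.SGD(model.parameters(), lr=0.01, momentum=0.5)

for epoch in range(10):
    train(epoch)
    test()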
Results

Accuracy on test set: 6 % [637/10000]


[1, 300] loss: 0.098
[1, 600] loss: 0.035
[1, 900] loss: 0.025
Accuracy on test set: 96 % [9605/10000]
[2, 300] loss: 0.021
[2, 600] loss: 0.017
[2, 900] loss: 0.015
Accuracy on test set: 97 % [9709/10000]
……
[9, 300] loss: 0.006
[9, 600] loss: 0.006
[9, 900] loss: 0.007
Accuracy on test set: 98 % [9857/10000]
[10, 300] loss: 0.006
[10, 600] loss: 0.006
[10, 900] loss: 0.006
Accuracy on test set: 98 % [9869/10000]

Exercise 10-1

• Try a more complex CNN, going from (batch, 1, 28, 28) to (batch, 10) with:
  • Conv2d Layer * 3
  • ReLU Layer * 3
  • MaxPooling Layer * 3
  • Linear Layer * 3
• Try different configurations of this CNN and compare their performance
  (one possible starting point is sketched below).
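A hedged skeleton for the exercise; the layer widths, kernel sizes, and padding
choices here are my assumptions, not a prescribed answer:

import torch
import torch.nn.functional as F

class DeeperNet(torch.nn.Module):
    def __init__(self):
        super(DeeperNet, self).__init__()
        # Three conv blocks; padding keeps sizes even before each 2x2 pool.
        self.conv1 = torch.nn.Conv2d(1, 10, kernel_size=5, padding=2)   # 28x28 -> 28x28
        self.conv2 = torch.nn.Conv2d(10, 20, kernel_size=3, padding=1)  # 14x14 -> 14x14
        self.conv3 = torch.nn.Conv2d(20, 40, kernel_size=3, padding=1)  # 7x7 -> 7x7
        self.pooling = torch.nn.MaxPool2d(2)
        self.fc1 = torch.nn.Linear(40 * 3 * 3, 120)
        self.fc2 = torch.nn.Linear(120, 60)
        self.fc3 = torch.nn.Linear(60, 10)

    def forward(self, x):
        batch_size = x.size(0)
        x = self.pooling(F.relu(self.conv1(x)))  # -> (batch, 10, 14, 14)
        x = self.pooling(F.relu(self.conv2(x)))  # -> (batch, 20, 7, 7)
        x = self.pooling(F.relu(self.conv3(x)))  # -> (batch, 40, 3, 3)
        x = x.view(batch_size, -1)               # -> (batch, 360)
        x = F.relu(self.fc1(x))
        x = F.relu(self.fc2(x))
        return self.fc3(x)

model = DeeperNet()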