CS 231N Midterm Review
Midterm Logistics
● Multiple Choice
● True/False
● Short Answer Questions
● More emphasis on topics covered earlier in the course than those discussed
more recently
Backpropagation
[Figure: computational graph with nodes z, a, m, n, p and L, annotated with numeric values; the layout and values are not recoverable from the extraction]
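As a reminder of the mechanics, here is a minimal sketch of backpropagation on a small graph. The function f(x, y, z) = (x + y) * z, the helper name `forward_backward`, and the input values are assumptions chosen for illustration; this is not the graph from the slide.

```python
def forward_backward(x, y, z):
    """Forward and backward pass for the hypothetical graph f = (x + y) * z."""
    # Forward pass: compute and keep the intermediate value q.
    q = x + y
    f = q * z

    # Backward pass: apply the chain rule from the output back to the inputs.
    df_dq = z            # multiply gate: local gradient w.r.t. q is the other input
    df_dz = q            # multiply gate: local gradient w.r.t. z is the other input
    df_dx = df_dq * 1.0  # add gate passes the upstream gradient through unchanged
    df_dy = df_dq * 1.0
    return f, (df_dx, df_dy, df_dz)


f, grads = forward_backward(-2.0, 5.0, -4.0)
print(f, grads)
```

Each gradient is the upstream gradient multiplied by the local gradient of the gate, which is the pattern to apply node by node on any graph.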
Optimizers
SGD No No
AdaGrad Yes No
RMSProp Yes No
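The update rules of the three optimizers listed above can be sketched in NumPy as follows. The function names, the default hyperparameters (`lr`, `decay`, `eps`) and the variable name `cache` are assumptions for illustration, not values taken from the slides.

```python
import numpy as np


def sgd_step(w, grad, lr=1e-2):
    """Vanilla SGD: step along the negative gradient."""
    return w - lr * grad


def adagrad_step(w, grad, cache, lr=1e-2, eps=1e-8):
    """AdaGrad: scale each coordinate by the root of the summed squared gradients."""
    cache = cache + grad ** 2  # the sum only grows, so steps shrink over time
    w = w - lr * grad / (np.sqrt(cache) + eps)
    return w, cache


def rmsprop_step(w, grad, cache, lr=1e-2, decay=0.99, eps=1e-8):
    """RMSProp: like AdaGrad, but with a decaying average of squared gradients."""
    cache = decay * cache + (1 - decay) * grad ** 2
    w = w - lr * grad / (np.sqrt(cache) + eps)
    return w, cache


w = np.array([1.0, -2.0])
grad = np.array([0.5, -0.5])
print(sgd_step(w, grad))
print(adagrad_step(w, grad, np.zeros_like(w))[0])
print(rmsprop_step(w, grad, np.zeros_like(w))[0])
```

AdaGrad and RMSProp differ only in how `cache` is accumulated: a running sum versus an exponentially decaying average.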
Video: Lily Jiang
CNNs
Output Shape:
(N, H’, W’)
W’=(W−F+2P)/S+1
H’=(H−F+2P)/S+1
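The output-size formula above can be checked with a small helper. The function name `conv_output_size` and the choice to raise an error when the stride does not divide evenly are assumptions made for this sketch.

```python
def conv_output_size(size, filter_size, padding, stride):
    """Spatial output size along one dimension: (size - F + 2P) / S + 1."""
    numerator = size - filter_size + 2 * padding
    if numerator % stride != 0:
        # Assumption for this sketch: treat a non-integer result as invalid.
        raise ValueError("filter does not tile the padded input with this stride")
    return numerator // stride + 1


# 32x32 input, 5x5 filter, padding 2, stride 1 keeps the spatial size.
print(conv_output_size(32, 5, 2, 1))
# 7x7 input, 3x3 filter, no padding, stride 2.
print(conv_output_size(7, 3, 0, 2))
```

The same function is applied once for the height and once for the width.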
BatchNorm vs LayerNorm
Input shape: (N, C, H, W)
LayerNorm: Normalizes across C*H*W (calculates the mean and variance for each image,
across all pixels in all channels)
BatchNorm: C
LayerNorm: (C*H*W)
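The difference in normalization axes can be made concrete with NumPy. This is a sketch for an input of shape (N, C, H, W); the function names are hypothetical, the BatchNorm variant shown is the usual per-channel (spatial) form for convolutional inputs, and the learnable scale and shift are omitted.

```python
import numpy as np


def batchnorm_stats(x):
    """Per-channel statistics: reduce over the batch and spatial axes (N, H, W)."""
    mean = x.mean(axis=(0, 2, 3), keepdims=True)
    var = x.var(axis=(0, 2, 3), keepdims=True)
    return mean, var


def layernorm_stats(x):
    """Per-image statistics: reduce over channel and spatial axes (C, H, W)."""
    mean = x.mean(axis=(1, 2, 3), keepdims=True)
    var = x.var(axis=(1, 2, 3), keepdims=True)
    return mean, var


def normalize(x, mean, var, eps=1e-5):
    """Shift and scale x with the given statistics (no learnable parameters)."""
    return (x - mean) / np.sqrt(var + eps)


x = np.random.default_rng(0).normal(size=(4, 3, 5, 5))
bn_mean, _ = batchnorm_stats(x)
ln_mean, _ = layernorm_stats(x)
print(bn_mean.shape)  # one mean per channel
print(ln_mean.shape)  # one mean per image
```

Only the reduction axes differ between the two functions; the normalization step itself is the same.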
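One important difference:
BatchNorm calculates the mean and variance across the batch during training and
stores running averages of them, which are used at test time. LayerNorm computes
its statistics per image, so it behaves the same at training and test time.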
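A minimal sketch of the train/test behavior of BatchNorm for an (N, C, H, W) input follows. The function name `batchnorm_forward`, the `momentum` convention (weight on the old running value) and the default values are assumptions; frameworks differ on these details, and the learnable scale and shift are left out.

```python
import numpy as np


def batchnorm_forward(x, running_mean, running_var, train, momentum=0.9, eps=1e-5):
    """Normalize per channel; use batch stats in training, running stats at test."""
    if train:
        mean = x.mean(axis=(0, 2, 3), keepdims=True)
        var = x.var(axis=(0, 2, 3), keepdims=True)
        # Update the running averages that will be used at test time.
        running_mean = momentum * running_mean + (1 - momentum) * mean
        running_var = momentum * running_var + (1 - momentum) * var
    else:
        # Test time: the current batch does not influence the statistics.
        mean, var = running_mean, running_var
    out = (x - mean) / np.sqrt(var + eps)
    return out, running_mean, running_var


rng = np.random.default_rng(0)
x_train = rng.normal(loc=2.0, size=(8, 3, 4, 4))
run_mean = np.zeros((1, 3, 1, 1))
run_var = np.ones((1, 3, 1, 1))

_, run_mean, run_var = batchnorm_forward(x_train, run_mean, run_var, train=True)
x_test = rng.normal(loc=2.0, size=(1, 3, 4, 4))
out, _, _ = batchnorm_forward(x_test, run_mean, run_var, train=False)
print(out.shape)
```

Because the test-time path uses only the stored averages, a single image can be normalized without any other examples in the batch.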