ML Visuals
By dair.ai
https://github.com/dair-ai/ml-visuals
Basic ML Visuals
Figures: Softmax layer; Convolve and Sharpen kernel operations.
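For reference, the softmax pictured above can be sketched in a few lines of NumPy (the `softmax` helper below is my own naming, not something defined in this repo):

```python
import numpy as np

def softmax(z):
    # Subtract the max before exponentiating for numerical stability,
    # then normalize so the outputs sum to 1.
    e = np.exp(z - np.max(z))
    return e / e.sum()

probs = softmax(np.array([2.0, 1.0, 0.1]))
```

The largest logit gets the largest probability, and the outputs always sum to 1.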
Figure: Transformer architecture (tokenization, input/output embeddings with positional encoding, final Linear layer).
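The positional encodings in the Transformer figure are typically the sinusoidal ones from "Attention Is All You Need"; a minimal sketch (the function name and shapes are my own, for illustration):

```python
import numpy as np

def positional_encoding(seq_len, d_model):
    # Sinusoidal positional encoding:
    #   PE[pos, 2i]   = sin(pos / 10000^(2i / d_model))
    #   PE[pos, 2i+1] = cos(pos / 10000^(2i / d_model))
    pos = np.arange(seq_len)[:, None]            # (seq_len, 1)
    i = np.arange(d_model // 2)[None, :]         # (1, d_model / 2)
    angles = pos / np.power(10000.0, 2 * i / d_model)
    pe = np.zeros((seq_len, d_model))
    pe[:, 0::2] = np.sin(angles)                 # even dimensions
    pe[:, 1::2] = np.cos(angles)                 # odd dimensions
    return pe

pe = positional_encoding(seq_len=50, d_model=16)
```

Each position gets a unique pattern of sines and cosines, which is added to the token embeddings.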
Figure: fully connected network: input X, hidden activations a[1], a[2], a[3], and output a[4] = Ŷ.
Figure: the CONV operation: an NxNx3 input convolved with MxM filters plus a bias b, followed by ReLU, mapping activations a[l-1] to a[l].
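A minimal sketch of the pictured CONV operation: convolve an NxNx3 input with one MxM filter, add the bias, and apply ReLU, i.e. a single a[l-1] to a[l] step (the helper name and the toy shapes below are my own):

```python
import numpy as np

def conv_relu(x, w, b):
    # x: (N, N, 3) input; w: (M, M, 3) filter; b: scalar bias.
    # Valid convolution (no padding, stride 1), then ReLU.
    N, M = x.shape[0], w.shape[0]
    out = np.zeros((N - M + 1, N - M + 1))
    for i in range(N - M + 1):
        for j in range(N - M + 1):
            # Element-wise product over the MxMx3 window, summed, plus bias.
            out[i, j] = np.sum(x[i:i+M, j:j+M, :] * w) + b
    return np.maximum(out, 0.0)  # ReLU

# 5x5x3 input of ones, 3x3x3 filter of ones: each window sums to 27.
a = conv_relu(np.ones((5, 5, 3)), np.ones((3, 3, 3)), b=-1.0)
```

With a 5x5 input and a 3x3 filter the output is 3x3, and each entry is 27 - 1 = 26.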
Abstract Backgrounds
Gradient Backgrounds
Community Contributions
Figure: striding in CONV (stride S=1 vs. S=2) and MaxPool.
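Striding shrinks the output: with input size n, filter size f, stride s, and padding p, the output spatial size is floor((n + 2p - f) / s) + 1. A small sketch:

```python
def conv_output_size(n, f, s, p=0):
    # Spatial output size of a convolution or pooling layer:
    # floor((n + 2p - f) / s) + 1
    return (n + 2 * p - f) // s + 1

s1 = conv_output_size(n=7, f=3, s=1)  # stride 1
s2 = conv_output_size(n=7, f=3, s=2)  # stride 2
```

For a 7-wide input and a 3-wide filter, stride 1 gives a 5-wide output while stride 2 gives a 3-wide output.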
Figure: Inception module: parallel 1x1, 3x3, and 5x5 "same" convolutions plus MaxPool (same, s=1) over an NxNx192 input, producing branches such as NxNx64, NxNx128, and NxNx32 that are concatenated.
Figures: basic neuron and "how does a NN work" (inspired by Coursera): features such as ZIP, walkability, school quality, and wealth feeding a model that predicts PRICE (ŷ) across time steps t-1, t.
Figures: logistic regression (Ŷ = 0 or Ŷ = 1) vs. linear regression of price ($) against size; ReLU(x); training.
Figure: convolutional encoder-decoder: CONV1 through CONV7 stages mapping a 128*128*1 input to a 128*128*1 output.
Figure: performance vs. amount of data: large NNs outperform medium and small NNs, which in turn outperform SVM, LR, etc. as data grows.
Figures: shallow networks: inputs x[1], x[2], x[3], hidden activations a[1]1 through a[1]4, and output a[2] = Ŷ.
Figures: decision boundaries in (x1, x2); DropOut on activations a[L] (mask value r = 1 shown); input normalization.
Figures: early stopping (training vs. dev error over iterations); cost contours J(w1, w2) before and after normalization; inputs x1, x2 with weights w[L].
Figure: understanding Precision & Recall: confusion matrix with TP, FP, FN, and TN quadrants.
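From the confusion-matrix counts in the figure, precision and recall follow directly (a minimal sketch with made-up counts):

```python
def precision_recall(tp, fp, fn):
    # Precision: of everything predicted positive, how much was right.
    # Recall: of everything actually positive, how much was found.
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return precision, recall

p, r = precision_recall(tp=8, fp=2, fn=4)
```

Here precision is 8/10 = 0.8 and recall is 8/12, showing that the two can differ for the same classifier.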
Figure: batch vs. mini-batch gradient descent: a smooth BGD path vs. a noisy SGD path on cost contours over (w1, w2).
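The contrast in the figure can be reproduced on a toy 1-D regression: batch GD takes one smooth step per epoch, while mini-batches take many noisier steps (the function name, data, and hyperparameters below are all illustrative):

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=100)
y = 3.0 * X  # noiseless data; the true weight is 3.0

def train(batch_size, lr=0.1, epochs=50):
    # Gradient descent on J(w) = mean((w*x - y)^2).
    # batch_size = len(X): batch GD, one smooth step per epoch.
    # Smaller batch_size: more frequent, noisier mini-batch updates.
    w = 0.0
    for _ in range(epochs):
        for start in range(0, len(X), batch_size):
            xb, yb = X[start:start+batch_size], y[start:start+batch_size]
            grad = 2 * np.mean((w * xb - yb) * xb)
            w -= lr * grad
    return w

w_batch = train(batch_size=len(X))  # batch gradient descent
w_mini = train(batch_size=10)       # mini-batch gradient descent
```

Both variants converge to the same minimum here; the difference the figure highlights is the shape of the path, not the destination.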
Figure: softmax prediction with 2 outputs: inputs x[1], x[2], x[3] mapped to probabilities p[1], p[2].
Miscellaneous
Figure: U-Net-style architecture: 3x3 and 1x1 convolutions, 2x2 max pooling and up sampling, copied-block skip connections, dropout rates 0.1/0.2/0.3, and channel widths 16, 32, 64, 128 with skip concatenations 16+32, 32+64, 64+128, and 128+256.
Figures: VGG-style network: Input, then stacked Conv3-32, Conv3-64, and Conv3-128 blocks separated by Max-Pool (layers 1 through 4), ending in FC-512 and Output; shown both as a layer table and as a block diagram (Input, Conv, Max-Pool, Conv, Max-Pool, FC, FC).
Figures: dimensionality reduction with 1x1 convolutions (previous layer, 1x1 convolutions, filter concatenation); convolution factorization: replacing a 1x5 conv (padding 2) and a 1x7 conv (padding 3) with stacks of 1x3 convs (padding 1).
Figure: GoogLeNet-style network with auxiliary classifiers: Input, Conv and Max-Pool stems, stacked Inception modules separated by Max-Pool, auxiliary classifiers (Avg-Pool, Conv, FC, FC, Softmax) branching from intermediate Inception outputs, and a final Avg-Pool, FC, Softmax head; additional panels show ConvTranspose2d and filter-concatenation variants (a) and (b).
Figure: DenseNet: Input, Conv, then Dense Blocks 1 through 3 separated by transition layers (Conv, Avg-Pool), ending in Avg-Pool, FC, Softmax; panels (a) and (b) show dense skip paths R1, R2, R3 over input x.
Figure: NAS-style cells (a) and (b): hidden states h(i-1) and h(i) combined through candidate operations (identity, 3x3/5x5/7x7 conv, avg/max pooling) and filter concatenation to produce h(i+1).
Figure: feature-map spatial sizes shrinking from 224*224 to 112*112, 56*56, 28*28, and 14*14.
Figure: max pooling on a 4x4 image representation, performed with a 2x2 kernel and a stride of 2:

1 1 2 4
5 6 7 8      ->    6 8
3 2 1 0            3 4
1 2 3 4

For the top-left window, Max(1, 1, 5, 6) = 6.
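The pooling in the figure can be checked directly (a minimal NumPy sketch; the helper name is my own):

```python
import numpy as np

def max_pool(x, k=2, s=2):
    # kxk max pooling with stride s (2x2, stride 2 by default, as in the figure).
    h, w = x.shape
    out = np.zeros((h // s, w // s))
    for i in range(0, h - k + 1, s):
        for j in range(0, w - k + 1, s):
            out[i // s, j // s] = x[i:i+k, j:j+k].max()
    return out

img = np.array([[1, 1, 2, 4],
                [5, 6, 7, 8],
                [3, 2, 1, 0],
                [1, 2, 3, 4]])
pooled = max_pool(img)
```

Each 2x2 window collapses to its maximum, so the 4x4 input becomes the 2x2 output shown in the figure.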