Intro4 ANN Deep CNN PDF
[Figure: network diagram with input nodes ($x_2$, $x_3$ visible) feeding the output $f_\theta(x)$]
Output:
$f_\theta(x) = w \cdot x$
Parameters: $\theta = w$
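As a minimal sketch of the linear predictor $f_\theta(x) = w \cdot x$, the score is a single inner product. The helper names (`dot`, `linear_predictor`) and the numeric values are illustrative assumptions, not part of the slides.

```python
def dot(u, v):
    """Inner product of two equal-length vectors."""
    return sum(ui * vi for ui, vi in zip(u, v))


def linear_predictor(w, x):
    """Score f_theta(x) = w . x, where the only parameters are theta = w."""
    return dot(w, x)


w = [0.5, -1.0, 2.0]  # illustrative weights (assumed)
x = [1.0, 2.0, 3.0]   # illustrative input (assumed)
score = linear_predictor(w, x)  # 0.5*1 - 1*2 + 2*3 = 4.5
```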
[Figure: network diagram with input nodes ($x_3$ visible) and hidden units ($h_2$ visible)]
Intermediate hidden units:
$h_j(x) = \sigma(v_j \cdot x)$, where $\sigma(z) = (1 + e^{-z})^{-1}$
Output:
$f_\theta(x) = w \cdot h(x)$
Parameters: $\theta = (V, w)$
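The forward pass of this one-hidden-layer network can be sketched as follows: each hidden unit applies the logistic function to $v_j \cdot x$, and the output is a linear score on the hidden vector. Here $V$ is assumed to be stored as a list of rows $v_j$; the function names and numbers are illustrative only.

```python
import math


def sigmoid(z):
    """Logistic function sigma(z) = 1 / (1 + e^{-z})."""
    return 1.0 / (1.0 + math.exp(-z))


def hidden(V, x):
    """Hidden units h_j(x) = sigma(v_j . x), one per row v_j of V."""
    return [sigmoid(sum(vji * xi for vji, xi in zip(vj, x))) for vj in V]


def predict(V, w, x):
    """Output f_theta(x) = w . h(x), with parameters theta = (V, w)."""
    h = hidden(V, x)
    return sum(wj * hj for wj, hj in zip(w, h))


V = [[1.0, -1.0, 0.0],  # v_1 (assumed values)
     [0.0, 0.0, 0.0]]   # v_2 (assumed values)
w = [2.0, 4.0]
x = [1.0, 1.0, 5.0]

# Both v_j . x are 0 here, so both hidden units equal sigma(0) = 0.5
# and the output is 2*0.5 + 4*0.5 = 3.0.
output = predict(V, w, x)
```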
Intuitions:
• Hierarchical feature representations
• Can simulate a bounded-computation logic circuit (the original motivation of McCulloch and Pitts, 1943)
• Can learn this computation (and potentially more, because networks are real-valued)
• Formal theory and understanding are still incomplete
• Some hypotheses are emerging: double descent, the lottery ticket hypothesis
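To illustrate the logic-circuit intuition, here is a small sketch of a two-layer network that computes XOR. It makes two assumptions that are not in the slides: the units use a hard threshold in place of the logistic function, and the weights are chosen by hand rather than learned.

```python
def step(z):
    """Hard threshold unit (assumed here in place of the logistic function)."""
    return 1 if z > 0 else 0


def xor_circuit(x1, x2):
    """Two hidden threshold units (OR, AND) combined into XOR = OR and not AND."""
    h_or = step(x1 + x2 - 0.5)    # fires if at least one input is 1
    h_and = step(x1 + x2 - 1.5)   # fires only if both inputs are 1
    return step(h_or - h_and - 0.5)


# Truth table over all four binary inputs.
table = {(a, b): xor_circuit(a, b) for a in (0, 1) for b in (0, 1)}
```

A single linear threshold unit cannot represent XOR, so this example needs the hidden layer.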
What’s learned?
$\min_{\theta \in \mathbb{R}^d} \text{TrainLoss}(\theta)$
For $t = 1, \dots, T$:
    For $(x, y) \in \mathcal{D}_\text{train}$:
        $\theta \leftarrow \theta - \eta_t \nabla_\theta \text{Loss}(x, y, \theta)$
• Non-convex optimization
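The stochastic gradient loop can be sketched in a few lines. To keep the gradient explicit, this sketch assumes a linear predictor with squared loss, $\text{Loss}(x, y, \theta) = (\theta \cdot x - y)^2$, a constant step size in place of $\eta_t$, and a toy dataset. All of these are illustrative choices, not taken from the slides.

```python
def sgd(train, d, T, eta):
    """Run T passes of stochastic gradient descent over the training set.

    Assumes squared loss on a linear predictor, so the gradient is
    2 * (theta . x - y) * x, and a constant step size eta.
    """
    theta = [0.0] * d
    for t in range(1, T + 1):
        for x, y in train:
            residual = sum(th * xi for th, xi in zip(theta, x)) - y
            grad = [2.0 * residual * xi for xi in x]
            theta = [th - eta * g for th, g in zip(theta, grad)]
    return theta


# Toy data generated by y = 2*x1 - 1*x2 (assumed for illustration).
train = [([1.0, 0.0], 2.0), ([0.0, 1.0], -1.0), ([1.0, 1.0], 1.0)]
theta = sgd(train, d=2, T=100, eta=0.1)
```

This toy objective is convex, so the loop recovers the generating weights. With a neural network the same update applies, but the objective is non-convex and the loop is only guaranteed to move downhill locally.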
Prior knowledge
Max-pooling
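A minimal sketch of max-pooling, assuming non-overlapping square windows (stride equal to the window size) over a 2D grid stored as a list of rows. The function name and the example grid are hypothetical.

```python
def max_pool_2d(image, size=2):
    """Take the maximum over each non-overlapping size x size window."""
    rows, cols = len(image), len(image[0])
    return [
        [
            max(image[r + dr][c + dc]
                for dr in range(size) for dc in range(size))
            for c in range(0, cols - size + 1, size)
        ]
        for r in range(0, rows - size + 1, size)
    ]


image = [
    [1, 3, 2, 0],
    [4, 2, 1, 5],
    [0, 1, 7, 2],
    [6, 2, 3, 4],
]
pooled = max_pool_2d(image)  # [[4, 5], [6, 7]]
```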
AlexNet
Residual networks
$x \mapsto \sigma(Wx) + x$
• Depth matters
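A residual layer $x \mapsto \sigma(Wx) + x$ can be sketched as below, assuming $W$ is square so that $\sigma(Wx)$ and $x$ have the same dimension, and that $\sigma$ is the logistic function applied elementwise. The names and values are illustrative.

```python
import math


def sigmoid(z):
    """Logistic function sigma(z) = 1 / (1 + e^{-z})."""
    return 1.0 / (1.0 + math.exp(-z))


def residual_layer(W, x):
    """Compute sigma(W x) + x; W must be square so the shapes match."""
    Wx = [sum(wij * xj for wij, xj in zip(row, x)) for row in W]
    return [sigmoid(z) + xi for z, xi in zip(Wx, x)]


def residual_stack(Ws, x):
    """Apply several residual layers in sequence (a deeper network)."""
    for W in Ws:
        x = residual_layer(W, x)
    return x


W = [[0.0, 0.0], [0.0, 0.0]]  # zero weights, so sigma(W x) = 0.5 everywhere
x = [1.0, -2.0]
out = residual_layer(W, x)            # [1.5, -1.5]
deep = residual_stack([W, W, W], x)   # each layer adds 0.5: [2.5, -0.5]
```

Because each layer adds its input back to its output, the input passes through the stack directly, which is one common explanation for why residual connections make deep networks easier to train.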