4 DL Deep Neural Nets
[Figure: a two-layer network. The input 𝑥 feeds three hidden units ℎ₁, ℎ₂, ℎ₃, each computed as a weight times the input plus a bias, passed through an activation function (e.g., ReLU). Each output 𝑦₁, 𝑦₂, 𝑦₃ is a weighted sum of the hidden units plus a bias. Every output is a piecewise linear function of the input.]
Two-layer network with 1 output
[Figure: hidden unit ℎ₃ feeds a second weight, bias, weighted sum, and activation (e.g., ReLU), giving ℎ′₃; the resulting outputs are piecewise linear functions.]
Variations to consider: 2 inputs? 2 outputs? 3 layers?
Deep neural networks
• Two-layer neural network
• Hyperparameters
• Notation change and general case
• Shallow vs. deep networks
How many parameters are in that network (5 inputs, 2 outputs, 20 hidden layers of 30 hidden units each)?
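One way to answer is to count weights and biases layer by layer. A minimal sketch, assuming a standard fully connected network where every layer contributes a weight matrix plus a bias vector (the slide does not spell out the layer shapes beyond this):

```python
def n_params(d_in, d_out, n_hidden_layers, width):
    """Count weights + biases of a fully connected network where every
    hidden layer has `width` units (standard dense-layer assumption)."""
    sizes = [d_in] + [width] * n_hidden_layers + [d_out]
    # Each consecutive pair (a, b) contributes an a*b weight matrix and b biases.
    return sum(a * b + b for a, b in zip(sizes[:-1], sizes[1:]))

print(n_params(5, 2, 20, 30))  # → 17912
```

The same helper reproduces the 471-parameter figure quoted later for a 1-input, 1-output network with 5 hidden layers of 10 units: `n_params(1, 1, 5, 10)` gives 471.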
Shallow vs. deep networks
Argument: one layer is enough, since a deep network could simply arrange for its remaining layers to compute the identity function.
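The identity trick can be checked numerically. A minimal sketch: a ReLU layer with two hidden units passes its input through unchanged, because relu(x) − relu(−x) = x, so extra layers built this way leave the function computed by the rest of the network untouched.

```python
import numpy as np

relu = lambda z: np.maximum(z, 0.0)

W1 = np.array([[1.0], [-1.0]])   # two hidden units compute x and -x
w2 = np.array([1.0, -1.0])       # recombine: relu(x) - relu(-x) = x

def identity_layer(x):
    """A two-unit ReLU layer that reproduces its scalar input exactly."""
    return float(w2 @ relu(W1 @ np.array([x])))

for x in [-2.0, 0.0, 3.5]:
    assert np.isclose(identity_layer(x), x)
```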
5 layers, 10 hidden units per layer: 471 parameters, 161,501 linear regions
5 layers, 50 hidden units per layer: 10,801 parameters, >10^40 linear regions
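The region counts above are maxima; a concrete network realises some number of regions up to that bound. A sketch of how one could count them empirically for a hypothetical randomly initialised network matching the first row (1-D input and output, 5 hidden layers of 10 ReLU units), by detecting changes in the pattern of active units along the input axis:

```python
import numpy as np

rng = np.random.default_rng(0)

# Random 1-D -> 1-D ReLU net: 5 hidden layers of 10 units (first table row).
# Note: 161,501 is the *maximum* region count; random weights give far fewer.
widths = [1] + [10] * 5 + [1]
layers = [(rng.standard_normal((b, a)), rng.standard_normal(b))
          for a, b in zip(widths[:-1], widths[1:])]

def activation_pattern(x):
    """Which hidden units are active at input x; each distinct pattern
    corresponds to one linear region of the piecewise linear function."""
    h = np.array([x])
    bits = []
    for W, b in layers[:-1]:          # hidden layers only; output is linear
        pre = W @ h + b
        bits.append(pre > 0)
        h = np.maximum(pre, 0.0)
    return tuple(np.concatenate(bits))

# Count regions crossed on [-5, 5]: one region per run of identical patterns.
xs = np.linspace(-5.0, 5.0, 20001)
patterns = [activation_pattern(x) for x in xs]
regions = 1 + sum(p != q for p, q in zip(patterns, patterns[1:]))
print(regions)
```

This grid-based count is approximate (regions narrower than the grid spacing are missed), but it illustrates why depth pays: each layer folds the input, multiplying the number of pieces the next layer can create.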