3b Dynamics
Deep Learning
Week 3b. Hidden Unit Dynamics
Alan Blair
School of Computer Science and Engineering
June 11, 2024
Encoder Networks
Inputs   Outputs
10000    10000
01000    01000
00100    00100
00010    00010
00001    00001
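The input/output table above can be sketched in code. This is a minimal illustration, not the lecture's implementation: the tanh hidden layer, softmax output, zero biases, random weights and all function names are assumptions made for the example. With one-hot inputs and zero hidden biases, each hidden activation is simply tanh of one row of the input-to-hidden weights, which is why each input unit can be drawn as a point in hidden unit space.

```python
import numpy as np

def one_hot_patterns(n):
    """Return the n one-hot patterns, used as both inputs and targets."""
    return np.eye(n)

def encoder_forward(x, w_ih, b_h, w_ho, b_o):
    """Forward pass of an N-H-N encoder (assumed: tanh hidden, softmax output)."""
    hidden = np.tanh(x @ w_ih + b_h)
    logits = hidden @ w_ho + b_o
    logits = logits - logits.max(axis=1, keepdims=True)  # numerical stability
    exp = np.exp(logits)
    return hidden, exp / exp.sum(axis=1, keepdims=True)

rng = np.random.default_rng(0)
n, h = 5, 2                                 # a 5-2-5 encoder, untrained
x = one_hot_patterns(n)
w_ih = rng.normal(scale=0.5, size=(n, h))   # input-to-hidden weights
b_h = np.zeros(h)
w_ho = rng.normal(scale=0.5, size=(h, n))   # hidden-to-output weights
b_o = np.zeros(n)

# Each row of `hidden` is the 2-D point for one input pattern.
hidden, output = encoder_forward(x, w_ih, b_h, w_ho, b_o)
```

Training would move these points apart until each output unit can separate "its" point from the others.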
N–2–N Encoder
Hidden Unit Space:
8–3–8 Encoder
Exercise:
➛ Draw the hidden unit space for 2-2-2, 3-2-3, 4-2-4 and 5-2-5 encoders.
➛ Represent the input-to-hidden weights for each input unit by a point, and the
hidden-to-output weights for each output unit by a line.
➛ Now consider the 8-3-8 encoder with its 3-dimensional hidden unit space.
→ What shape would be formed by the 8 points representing the
input-to-hidden weights for the 8 input units?
→ What shape would be formed by the planes representing the
hidden-to-output weights for each output unit?
Hint: think of two Platonic solids which are “dual” to each other.
Hinton Diagrams
[Figure: network with a 30x32 sensor input retina, 4 hidden units, and 30 output units ranging from Sharp Left through Straight Ahead to Sharp Right.]
Learning Face Direction
Weight Space Symmetry (8.2)
➛ swapping any pair of hidden nodes (along with their incoming and outgoing weights) leaves the overall function unchanged
➛ on any hidden node, reversing the sign of all incoming and outgoing weights also leaves the function unchanged
(assuming a transfer function that is symmetric about the origin, such as tanh)
➛ hidden nodes with identical input-to-hidden weights would, in theory, never
separate; so they all have to begin with different random weights
➛ in practice, all hidden nodes may try to do a similar job at first, then gradually
specialize.
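The first two symmetries above can be checked numerically. The following is a small sketch under assumed settings: a single tanh hidden layer with 4 hidden units, random weights, and a hypothetical helper `mlp`. Note that in this sketch the hidden bias is treated as one of the incoming weights, so it is permuted or sign-flipped along with them.

```python
import numpy as np

def mlp(x, w1, b1, w2, b2):
    """One-hidden-layer network with tanh hidden units and linear outputs."""
    return np.tanh(x @ w1 + b1) @ w2 + b2

rng = np.random.default_rng(0)
x = rng.normal(size=(10, 3))                     # 10 inputs, 3 features
w1, b1 = rng.normal(size=(3, 4)), rng.normal(size=4)
w2, b2 = rng.normal(size=(4, 2)), rng.normal(size=2)
y = mlp(x, w1, b1, w2, b2)

# Symmetry 1: swap hidden nodes 0 and 1 (columns of w1, rows of w2).
perm = [1, 0, 2, 3]
y_swapped = mlp(x, w1[:, perm], b1[perm], w2[perm, :], b2)

# Symmetry 2: flip the sign of everything into and out of hidden node 2.
# This relies on tanh(-z) = -tanh(z).
sign = np.array([1.0, 1.0, -1.0, 1.0])
y_flipped = mlp(x, w1 * sign, b1 * sign, w2 * sign[:, None], b2)

print(np.allclose(y, y_swapped), np.allclose(y, y_flipped))
```

Both comparisons should agree up to floating-point error, since each transformation produces a different point in weight space that computes the same function.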
Controlled Nonlinearity
Limitations of Two-Layer Neural Networks
Some functions are difficult for a 2-layer network to learn.
[Figure: Twin Spirals data set, with both axes running from −6 to 6.]
For example, this Twin Spirals problem is difficult to learn with a 2-layer network,
but it can be learned using a 3-layer network.
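For experimenting with this task, a Twin Spirals data set can be generated along the following lines. This is a sketch only: the number of points per spiral, the number of turns, the maximum radius of 6 (chosen to match the axis range of the plot) and the function name `twin_spirals` are all assumptions, not the exact data used in the lecture.

```python
import numpy as np

def twin_spirals(points_per_spiral=97, max_radius=6.0, turns=3.0):
    """Return (points, labels) for two interleaved spirals.

    The second spiral is the first one rotated by 180 degrees,
    so the two classes wind around each other.
    """
    # Start just off the origin so the two spirals never share a point.
    t = np.arange(1, points_per_spiral + 1) / points_per_spiral
    radius = max_radius * t
    angle = 2.0 * np.pi * turns * t
    first = np.stack([radius * np.cos(angle), radius * np.sin(angle)], axis=1)
    second = -first
    points = np.concatenate([first, second])
    labels = np.concatenate([
        np.zeros(points_per_spiral, dtype=int),
        np.ones(points_per_spiral, dtype=int),
    ])
    return points, labels

points, labels = twin_spirals()
```

The resulting `points` array can be fed to a network with one or two hidden layers to compare how easily each architecture fits the data.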
First Hidden Layer
Second Hidden Layer
Network Output
Adding Hidden Layers
Vanishing / Exploding Gradients
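The effect named in this heading can be illustrated with a deliberately simple scalar chain; the slide's own content is not reproduced here, so everything below is an assumed toy model. Each layer computes h ← f(w · h), and by the chain rule the gradient of the final value with respect to the input is a product of one factor per layer. The function name `input_gradient` and the chosen weights and depth are hypothetical.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def input_gradient(weight, depth, h0=0.5, use_sigmoid=True):
    """Gradient of the last activation w.r.t. h0 in a scalar chain
    h <- f(weight * h), accumulated layer by layer with the chain rule."""
    h, grad = h0, 1.0
    for _ in range(depth):
        z = weight * h
        if use_sigmoid:
            h = sigmoid(z)
            grad *= weight * h * (1.0 - h)   # sigmoid'(z) = h(1-h) <= 0.25
        else:
            h = z                            # linear layer
            grad *= weight
    return grad

# Sigmoid layers with weight 1: every factor is at most 0.25, so the
# gradient shrinks at least as fast as 0.25**depth.
vanishing = input_gradient(1.0, depth=20)

# Linear layers with weight 1.5: the gradient grows as 1.5**depth.
exploding = input_gradient(1.5, depth=20, use_sigmoid=False)

print(vanishing, exploding)
```

In a real network the factors are matrices rather than scalars, but the same multiplicative structure is what makes gradients shrink or grow with depth.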
Activation Functions (6.3)
[Figure: four activation function plots in a 2×2 grid, each with a horizontal axis from −4 to 4; vertical axes run from −2 to 4 in the top row and from −2 to 3 in the bottom row.]
Activation Functions