Neural Networks
Yen-Yu Lin, Professor
Computer Science, National Yang Ming Chiao Tung University
Linear model and neural networks
https://www.houseofbots.com/news-detail/1442-1-what-is-deep-learning-and-neural-network
Activations
Neural networks for regression and classification
• $\{y_k\}_{k=1}^{K}$ are the final outputs of the neural network
Two-layer neural networks
Feed-forward neural networks
Network diagram:
• Nodes: input, hidden, and output variables
• Links: weights and biases
• Arrows: direction of information propagation
Generalizations
Neural networks as universal approximators
[Figure: points show the training data; dashed curves show the outputs of the three hidden units; the solid curve shows the prediction of the neural network]
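To make the idea concrete, a minimal NumPy sketch of such a network: one input, three tanh hidden units, and one linear output, so the prediction is a weighted sum of the three hidden-unit curves. The weight values below are made up purely for illustration, not taken from the figure.

```python
import numpy as np

# Two-layer network with D=1 input, M=3 tanh hidden units, one linear output:
#   y(x) = sum_j w2[j] * tanh(w1[j] * x + b1[j]) + b2
# The three tanh curves play the role of the dashed hidden-unit outputs;
# their weighted sum gives the solid prediction curve. Weights are illustrative.
w1 = np.array([6.0, -4.0, 5.0])   # first-layer weights (hypothetical values)
b1 = np.array([-2.0, 0.5, 3.0])   # first-layer biases
w2 = np.array([1.0, -1.5, 0.8])   # second-layer weights
b2 = 0.2                          # output bias

def predict(x):
    z = np.tanh(np.outer(x, w1) + b1)  # hidden-unit activations, shape (N, 3)
    return z @ w2 + b2                 # linear combination -> prediction

x = np.linspace(-1.0, 1.0, 5)
print(predict(x))  # network output at five sample points
```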
Neural networks for classification
• 3-class classification
• A 2-layer neural network with 64 hidden units
https://www.annytab.com/neural-network-classification-in-python/
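A hedged sketch of this setup in scikit-learn (the synthetic dataset and hyperparameters below are stand-ins, not the data from the linked tutorial):

```python
import numpy as np
from sklearn.datasets import make_blobs
from sklearn.neural_network import MLPClassifier

# Synthetic 3-class data standing in for the example's dataset
X, y = make_blobs(n_samples=300, centers=3, random_state=0)

# A 2-layer network: one hidden layer of 64 units plus the output layer
clf = MLPClassifier(hidden_layer_sizes=(64,), max_iter=1000, random_state=0)
clf.fit(X, y)
print(clf.score(X, y))  # training accuracy
```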
Network training
Neural networks for 1D regression
ML solution for 1D regression
[Network diagram for 1D regression: input $\mathbf{x}_n = (x_{n0}, x_{n1}, \dots, x_{nD})$, hidden units $z_0, z_1, \dots, z_M$, first-layer weights $w^{(1)}_{ji}$, second-layer weights $w^{(2)}_{1j}$, output $y_n$, target $t_n$]
Neural networks for multi-dimensional regression
[Network diagram for multi-dimensional regression: input $\mathbf{x}_n = (x_{n0}, \dots, x_{nD})$, hidden units $z_0, \dots, z_M$, first-layer weights $w^{(1)}_{ji}$, second-layer weights $w^{(2)}_{kj}$, outputs $\mathbf{y}_n = (y_{n1}, \dots, y_{nK})$, targets $\mathbf{t}_n = (t_{n1}, \dots, t_{nK})$]

$$E_n(\mathbf{w}) = \frac{1}{2}\|\mathbf{y}_n - \mathbf{t}_n\|^2 = \frac{1}{2}\sum_{k=1}^{K}(y_{nk} - t_{nk})^2$$
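A quick numerical check that the two forms of $E_n(\mathbf{w})$ agree; the output and target vectors are arbitrary example values:

```python
import numpy as np

y_n = np.array([0.9, 0.2, 0.4])   # network outputs (example values)
t_n = np.array([1.0, 0.0, 0.5])   # targets (example values)

E_norm = 0.5 * np.linalg.norm(y_n - t_n) ** 2   # (1/2) ||y_n - t_n||^2
E_sum  = 0.5 * np.sum((y_n - t_n) ** 2)         # (1/2) sum_k (y_nk - t_nk)^2
print(E_norm, E_sum)  # both equal 0.03 (up to floating point)
```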
Neural networks for binary classification
➢ The network has a single output $y(\mathbf{x}, \mathbf{w}) = \sigma(a)$, where $\sigma$ is the logistic sigmoid
➢ $y(\mathbf{x}, \mathbf{w})$ is the conditional probability $p(\mathcal{C}_1 \mid \mathbf{x})$
➢ The conditional probability of the target is given by $p(t \mid \mathbf{x}, \mathbf{w}) = y(\mathbf{x}, \mathbf{w})^{t}\,\{1 - y(\mathbf{x}, \mathbf{w})\}^{1-t}$
ML solution for binary classification
➢ The negative log-likelihood gives the cross-entropy error
$$E(\mathbf{w}) = -\sum_{n=1}^{N}\{t_n \ln y_n + (1 - t_n)\ln(1 - y_n)\}$$
➢ where $y_n$ denotes $y(\mathbf{x}_n, \mathbf{w})$
• Optimize $\mathbf{w}$ by using gradient descent or one of its variants
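A minimal sketch of evaluating this cross-entropy error, assuming sigmoid output units; the pre-activations and targets are illustrative values:

```python
import numpy as np

def sigmoid(a):
    return 1.0 / (1.0 + np.exp(-a))

a = np.array([2.0, -1.0, 0.5])    # output pre-activations (example values)
t = np.array([1.0, 0.0, 1.0])     # binary targets
y = sigmoid(a)                    # y_n = p(t_n = 1 | x_n, w)

# E(w) = -sum_n { t_n ln y_n + (1 - t_n) ln(1 - y_n) }
E = -np.sum(t * np.log(y) + (1 - t) * np.log(1 - y))
print(E)
```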
[Network diagram for binary classification: input $\mathbf{x}_n = (x_{n0}, x_{n1}, \dots, x_{nD})$, hidden units $z_0, z_1, \dots, z_M$, weights $w^{(1)}_{ji}$ and $w^{(2)}_{1j}$, output $y_n$, target $t_n$]
Neural networks for multi-class classification
ML solution for multi-class classification
[Network diagram for multi-class classification: input $\mathbf{x}_n = (x_{n0}, \dots, x_{nD})$, hidden units $z_0, \dots, z_M$, weights $w^{(1)}_{ji}$ and $w^{(2)}_{kj}$, outputs $\mathbf{y}_n = (y_{n1}, \dots, y_{nK})$, targets $\mathbf{t}_n = (t_{n1}, \dots, t_{nK})$]
Gradient descent
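A minimal sketch of the batch update $\mathbf{w}^{(\tau+1)} = \mathbf{w}^{(\tau)} - \eta\,\nabla E(\mathbf{w}^{(\tau)})$; a simple quadratic with a known minimum stands in for the network's error function:

```python
import numpy as np

# Stand-in error function E(w) = ||w - w_star||^2 with known minimum w_star
w_star = np.array([1.0, -2.0])
grad_E = lambda w: 2.0 * (w - w_star)   # gradient of the stand-in error

w = np.zeros(2)       # initial weights
eta = 0.1             # learning rate
for _ in range(100):
    w = w - eta * grad_E(w)   # batch gradient-descent update
print(w)              # converges toward w_star
```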
Stochastic gradient descent
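In contrast to batch gradient descent, SGD updates on one randomly chosen sample at a time. A sketch on a sum-of-squares error $E(\mathbf{w}) = \sum_n E_n(\mathbf{w})$; the linear model and synthetic data are stand-ins for a network:

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 2))                 # 100 samples, 2 features
w_true = np.array([1.5, -0.5])
t = X @ w_true + 0.01 * rng.normal(size=100)  # noisy linear targets

w = np.zeros(2)
eta = 0.05
for _ in range(1000):
    n = rng.integers(100)                     # pick one sample at random
    grad_n = (X[n] @ w - t[n]) * X[n]         # gradient of E_n(w) = (1/2)(x_n.w - t_n)^2
    w = w - eta * grad_n                      # update on this sample alone
print(w)                                      # close to w_true
```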
Geometric view of gradient descent
• A point where the error is lowest only within a neighborhood is a local minimum
• A point where the error is lowest over the whole weight space is a global minimum
• Gradient descent can converge to either, depending on the initialization
Error backpropagation
Feed-forward neural networks
[Network diagram: inputs $x_0, x_1, \dots, x_D$, hidden units $z_0, z_1, \dots, z_M$, outputs $y_1, \dots, y_K$]

Hidden layer: $a_j = \sum_{i=1}^{D} w^{(1)}_{ji} x_i + w^{(1)}_{j0}, \quad j = 1, \dots, M; \qquad z_j = h(a_j)$

Output layer: $a_k = \sum_{j=1}^{M} w^{(2)}_{kj} z_j + w^{(2)}_{k0}, \quad k = 1, \dots, K; \qquad y_k = a_k$
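A minimal NumPy sketch of this forward pass, absorbing the biases $w^{(1)}_{j0}$ and $w^{(2)}_{k0}$ into the weight matrices through the fixed units $x_0 = z_0 = 1$; the sizes and the choice $h = \tanh$ are illustrative:

```python
import numpy as np

D, M, K = 3, 4, 2
rng = np.random.default_rng(0)
W1 = rng.normal(size=(M, D + 1))   # first-layer weights, column 0 holds biases
W2 = rng.normal(size=(K, M + 1))   # second-layer weights, column 0 holds biases

def forward(x, h=np.tanh):
    x = np.concatenate(([1.0], x))            # prepend x_0 = 1
    a_hidden = W1 @ x                         # a_j = sum_i w1_ji x_i + w1_j0
    z = np.concatenate(([1.0], h(a_hidden)))  # z_0 = 1, z_j = h(a_j)
    a_out = W2 @ z                            # a_k = sum_j w2_kj z_j + w2_k0
    return a_out                              # y_k = a_k (linear outputs)

print(forward(np.array([0.5, -1.0, 2.0])))
```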
Error backpropagation
• Variables/activations dependency:
$$\{x_i\} \to \{w^{(1)}_{ji}\} \to \{a_j\} \to \{z_j\} \to \{w^{(2)}_{kj}\} \to \{a_k\} \to \{y_k\} \to E$$
• In backpropagation, we also need to compute
$$\delta_k = \frac{\partial E}{\partial a_k} \quad \text{and} \quad \delta_j = \frac{\partial E}{\partial a_j}$$
Error backpropagation
• Setting: stochastic gradient descent for multi-dimensional regression

Error function: $E(\mathbf{w}) = \frac{1}{2}\sum_{k=1}^{K}(y_k - t_k)^2$

Output layer: $a_k = \sum_{j=1}^{M} w^{(2)}_{kj} z_j + w^{(2)}_{k0}$, $\; y_k = a_k$, so
$$\delta_k = \frac{\partial E}{\partial a_k} = y_k - t_k$$

Hidden layer: $a_j = \sum_{i=1}^{D} w^{(1)}_{ji} x_i + w^{(1)}_{j0}$, $\; z_j = h(a_j)$, so
$$\delta_j = \frac{\partial E}{\partial a_j} = \sum_k \frac{\partial E}{\partial a_k}\frac{\partial a_k}{\partial a_j} = h'(a_j)\sum_k w^{(2)}_{kj}\,\delta_k$$
Error backpropagation
• Variables/activations dependency:
$$\{x_i\} \to \{w^{(1)}_{ji}\} \to \{a_j\} \to \{z_j\} \to \{w^{(2)}_{kj}\} \to \{a_k\} \to \{y_k\} \to E$$
• With $\delta_k = y_k - t_k$ at the output layer, the gradient with respect to a second-layer weight is
$$\frac{\partial E}{\partial w^{(2)}_{kj}} = \frac{\partial E}{\partial a_k}\frac{\partial a_k}{\partial w^{(2)}_{kj}} = \delta_k z_j$$
Error backpropagation
• Variables/activations dependency:
$$\{x_i\} \to \{w^{(1)}_{ji}\} \to \{a_j\} \to \{z_j\} \to \{w^{(2)}_{kj}\} \to \{a_k\} \to \{y_k\} \to E$$
• With $\delta_j = h'(a_j)\sum_k w^{(2)}_{kj}\delta_k$ at the hidden layer, the gradient with respect to a first-layer weight is
$$\frac{\partial E}{\partial w^{(1)}_{ji}} = \frac{\partial E}{\partial a_j}\frac{\partial a_j}{\partial w^{(1)}_{ji}} = \delta_j x_i$$
A review of error backpropagation
[Network diagram: inputs $x_0, \dots, x_D$, hidden activations $a_j$ and $z_j$, output activations $a_k$, outputs $y_1, \dots, y_K$]

Step 1 (output deltas): $\delta_k = y_k - t_k$
Step 2 (second-layer gradients): $\dfrac{\partial E}{\partial w^{(2)}_{kj}} = \dfrac{\partial E}{\partial a_k}\dfrac{\partial a_k}{\partial w^{(2)}_{kj}} = \delta_k z_j$
Step 3 (hidden deltas): $\delta_j = h'(a_j)\sum_k w^{(2)}_{kj}\,\delta_k$
Step 4 (first-layer gradients): $\dfrac{\partial E}{\partial w^{(1)}_{ji}} = \dfrac{\partial E}{\partial a_j}\dfrac{\partial a_j}{\partial w^{(1)}_{ji}} = \delta_j x_i$
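Putting the four steps together, a self-contained NumPy sketch of backpropagation for a two-layer regression network with linear outputs; the sizes, sample data, and the tanh hidden activation are illustrative assumptions:

```python
import numpy as np

D, M, K = 3, 4, 2
rng = np.random.default_rng(1)
W1 = rng.normal(size=(M, D + 1))     # hidden weights, column 0 = biases
W2 = rng.normal(size=(K, M + 1))     # output weights, column 0 = biases

def backprop(x, t, h=np.tanh, h_prime=lambda a: 1 - np.tanh(a) ** 2):
    # Forward pass
    x1 = np.concatenate(([1.0], x))          # x_0 = 1
    a_j = W1 @ x1                            # hidden pre-activations
    z = np.concatenate(([1.0], h(a_j)))      # z_0 = 1
    y = W2 @ z                               # linear outputs, y_k = a_k
    # Step 1: output deltas
    delta_k = y - t
    # Step 2: second-layer gradients, dE/dw2_kj = delta_k * z_j
    grad_W2 = np.outer(delta_k, z)
    # Step 3: hidden deltas, delta_j = h'(a_j) * sum_k w2_kj delta_k
    delta_j = h_prime(a_j) * (W2[:, 1:].T @ delta_k)   # skip the bias column
    # Step 4: first-layer gradients, dE/dw1_ji = delta_j * x_i
    grad_W1 = np.outer(delta_j, x1)
    return grad_W1, grad_W2

g1, g2 = backprop(np.array([0.5, -1.0, 2.0]), np.array([1.0, 0.0]))
print(g1.shape, g2.shape)   # (4, 4) and (2, 5)
```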
Error backpropagation for other tasks
• Step 1: $\delta_k = \dfrac{\partial E}{\partial a_k} = \dfrac{\partial E}{\partial y_k}\dfrac{\partial y_k}{\partial a_k}$, where the error function depends on the task:
$$E(\mathbf{w}) = \frac{1}{2}\sum_{k=1}^{K}(y_k - t_k)^2 \quad \text{(regression)}$$
$$E(\mathbf{w}) = -\{t \ln y(\mathbf{x}, \mathbf{w}) + (1 - t)\ln(1 - y(\mathbf{x}, \mathbf{w}))\} \quad \text{(binary classification)}$$
$$E(\mathbf{w}) = -\sum_{k=1}^{K} t_k \ln y_k(\mathbf{x}, \mathbf{w}) \quad \text{(multi-class classification)}$$
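When each error function is paired with its canonical output activation (identity for regression, logistic sigmoid for binary classification, softmax for multi-class), Step 1 yields the same form $\delta_k = y_k - t_k$ in all three cases. A short derivation for the binary case, using $y = \sigma(a)$ and $\sigma'(a) = \sigma(a)(1 - \sigma(a))$:

```latex
\delta = \frac{\partial E}{\partial a}
       = \frac{\partial E}{\partial y}\,\frac{\partial y}{\partial a}
       = \left( -\frac{t}{y} + \frac{1-t}{1-y} \right) y\,(1-y)
       = -t\,(1-y) + (1-t)\,y
       = y - t
```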
Neural networks’ applications
• Face detection (Rowley et al.)
Convolutional neural networks
Convolutional neural networks’ applications
• Object detection
Recurrent neural networks
• Speech recognition
https://gab41.lab41.org/speech-recognition-you-down-with-ctc-8d3b558943f0
Generative adversarial networks
https://www.slideshare.net/xavigiro/deep-learning-for-computer-vision-generative-models-and-adversarial-training-upc-2016
Generative adversarial networks’ applications
References
Thank You for Your Attention!