Lecture 21: Deep Learning Part II (April 12, 2021)
Lecture 21:
Deep Learning – Part II
• Today’s Session:
• Deep Learning – Part II
• Announcements:
• Assignment 3 is due on Wednesday, April 14, by midnight
• Quiz II is on April 19
Outline
• Deep Learning Overview
• Computation Graph
• Gradient Descent
• Vectorization
The Flow of Computations in Neural
Networks
• The flow of computations in a neural network goes in two directions:
1. Left-to-right: this is referred to as forward propagation, which computes the output of the network
2. Right-to-left: this is referred to as backward propagation, which computes the gradients (or derivatives) of the parameters in the network
[Computation graph: a = 2, b = 4, c = 3; u = bc = 12; v = a + u = 14; J = 3v = 42]
Forward Propagation
• Let us assume we want to compute the function J = 3v, where v = a + u and u = bc, for the inputs a = 2, b = 4, c = 3.
• Moving left to right through the graph: u = bc = 12, then v = a + u = 14, and finally J = 3v = 42.
[Computation graph: a = 2, b = 4, c = 3; u = bc = 12; v = a + u = 14; J = 3v = 42]
Forward propagation allows computing the output of the network: here, J = 42.
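The forward pass above can be sketched in a few lines of Python (a minimal illustration, not from the lecture; the function name is assumed):

```python
# Sketch (values from the slides): the forward pass of the graph
# u = b*c, v = a + u, J = 3*v, evaluated left to right.
def forward(a, b, c):
    u = b * c      # 4 * 3 = 12
    v = a + u      # 2 + 12 = 14
    J = 3 * v      # 3 * 14 = 42
    return u, v, J

u, v, J = forward(a=2, b=4, c=3)
print(u, v, J)  # 12 14 42
```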
Backward Propagation
• Let us now compute the derivatives of the variables by moving backward (right to left) through the computation graph.
[Computation graph: a = 2, b = 4, c = 3; u = bc = 12; v = a + u = 14; J = 3v = 42]
• dJ/dv = ? If we change v a little bit, how would J change?
• Nudging v from 14 to 14.001 moves J from 42 to 42.003, so:
dJ/dv = 3
• To compute the derivative of J with respect to v, we went back to v, nudged it, and measured the corresponding increase in J.
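The nudging argument can be sketched numerically (a minimal illustration using the slide's values; the function name is assumed):

```python
# Sketch (values from the slides): estimate dJ/dv by nudging v.
def J_of_v(v):
    return 3 * v

v, eps = 14, 0.001
J0 = J_of_v(v)              # 42
J1 = J_of_v(v + eps)        # roughly 42.003, as on the slide
dJ_dv = (J1 - J0) / eps     # roughly 3
print(J0, J1, dJ_dv)
```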
Backward Propagation
• dJ/da = ? If we change a a little bit, how would J change?
• Nudging a from 2 to 2.001 moves v from 14 to 14.001, which in turn moves J from 42 to 42.003:
dJ/da = dJ/dv × dv/da = 3 × 1 = 3
• The change in a caused a change in v, and the change in v caused a change in J; multiplying the two effects together is exactly the chain rule in calculus.
• In essence, to compute the derivative of J with respect to a, we went back to a, nudged it a little bit, measured the resulting change in v, then measured the change that v induced in J, and multiplied the changes together (i.e., we applied the chain rule!).
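The chain-rule computation of dJ/da can be sketched and checked numerically (a minimal illustration; variable names are assumed):

```python
# Sketch (values from the slides): the chain rule for dJ/da
# on the graph J = 3 * (a + b * c), with b and c held fixed.
b, c = 4, 3

def J(a):
    return 3 * (a + b * c)

# Analytic pieces: dJ/dv = 3 and dv/da = 1, so dJ/da = 3 * 1 = 3.
dJ_dv, dv_da = 3, 1
dJ_da = dJ_dv * dv_da

# Numerical check: nudge a and watch J respond.
eps = 0.001
approx = (J(2 + eps) - J(2)) / eps
print(dJ_da, round(approx, 6))
```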
Backward Propagation
• dJ/du = ? If we change u a little bit, how would J change?
• Nudging u from 12 to 12.001 moves v from 14 to 14.001 and J from 42 to 42.003:
dJ/du = dJ/dv × dv/du = 3 × 1 = 3
• Same as before, we had to go back from J to v and then to u in order to compute the derivative of J with respect to u.
Backward Propagation
• dJ/db = ? If we change b a little bit, how would J change?
• Nudging b from 4 to 4.001 moves u from 12 to 12.003 (since u = bc and c = 3), v from 14 to 14.003, and J from 42 to 42.009:
dJ/db = dJ/dv × dv/du × du/db = 3 × 1 × 3 = 9
Backward Propagation
• dJ/dc = ? If we change c a little bit, how would J change?
• Nudging c from 3 to 3.001 moves u from 12 to 12.004 (since u = bc and b = 4), v from 14 to 14.004, and J from 42 to 42.012:
dJ/dc = dJ/dv × dv/du × du/dc = 3 × 1 × 4 = 12
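All of the derivatives above can be verified at once with finite differences (a minimal sketch, not from the lecture; function and variable names are assumed):

```python
# Sketch: check every derivative from the backward pass
# (dJ/da = 3, dJ/db = 9, dJ/dc = 12) with finite differences.
def J(a, b, c):
    return 3 * (a + b * c)

a, b, c = 2.0, 4.0, 3.0
eps = 1e-3
base = J(a, b, c)                                  # 42

grads = {
    "a": (J(a + eps, b, c) - base) / eps,
    "b": (J(a, b + eps, c) - base) / eps,
    "c": (J(a, b, c + eps) - base) / eps,
}
analytic = {"a": 3.0, "b": 3.0 * c, "c": 3.0 * b}  # chain-rule results
for name, g in grads.items():
    print(name, round(g, 6), analytic[name])
```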
Outline
• Deep Learning Overview
• Computation Graph
• Gradient Descent
• Vectorization
The Computation Graph of Logistic Regression
• Let us translate logistic regression (which is a neural network with only one neuron) into a computation graph:
[Computation graph: inputs x, w, b → z = wᵀx + b → a = σ(z) → L(a, y)]
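This computation graph can be evaluated left to right as follows (a minimal sketch; the input values here are assumed for illustration):

```python
# Sketch (input values assumed): the logistic regression computation
# graph evaluated left to right:
# z = w.x + b, a = sigmoid(z), L = -y*log(a) - (1-y)*log(1-a).
import numpy as np

def forward(w, x, b, y):
    z = np.dot(w, x) + b                              # linear unit
    a = 1.0 / (1.0 + np.exp(-z))                      # sigmoid activation
    L = -y * np.log(a) - (1 - y) * np.log(1 - a)      # cross-entropy loss
    return z, a, L

w = np.array([0.5, -0.25])
x = np.array([1.0, 2.0])
z, a, L = forward(w, x, b=0.1, y=1.0)
print(z, a, L)
```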
Backward Propagation
• The derivatives can be computed by moving from right to left.
• Partial derivative of L with respect to a:
∂L/∂a = ∂/∂a [ −y log(a) − (1 − y) log(1 − a) ] = −y/a + (1 − y)/(1 − a)
Backward Propagation
• Partial derivative of L with respect to z, using the chain rule and the sigmoid derivative ∂a/∂z = a(1 − a):
∂L/∂z = ∂L/∂a × ∂a/∂z = ( −y/a + (1 − y)/(1 − a) ) × a(1 − a) = a − y
Backward Propagation
• Partial derivative of L with respect to b:
∂L/∂b = ∂L/∂a × ∂a/∂z × ∂z/∂b = (a − y) × 1 = a − y
Backward Propagation
• Partial derivative of L with respect to w:
∂L/∂w = ∂L/∂a × ∂a/∂z × ∂z/∂w = (a − y) x
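These backward-pass formulas can be sketched and checked numerically (a minimal illustration; the input values are assumed):

```python
# Sketch: the backward pass derived above, dL/dz = a - y,
# dL/db = a - y, dL/dw = (a - y) * x, with a numerical check on b.
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def loss(w, x, b, y):
    a = sigmoid(np.dot(w, x) + b)
    return -y * np.log(a) - (1 - y) * np.log(1 - a)

w, x = np.array([0.5, -0.25]), np.array([1.0, 2.0])
b, y = 0.1, 1.0
a = sigmoid(np.dot(w, x) + b)

dL_db = a - y                  # chain-rule result
dL_dw = (a - y) * x

# A central finite difference on b should closely match dL/db.
eps = 1e-5
approx_db = (loss(w, x, b + eps, y) - loss(w, x, b - eps, y)) / (2 * eps)
print(dL_db, approx_db)
```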
Next Monday’s Lecture…
• Deep Learning Overview
• Computation Graph
• Gradient Descent
• Vectorization
Continued…