CHAPTER 3.4.1 - Backpropagation in Python
Examples:
$$f(x, y) = \max(x, y) \;\Rightarrow\; \frac{\partial f(x, y)}{\partial x} = \frac{\partial \max(x, y)}{\partial x} = \begin{cases} 1 & \text{if } x \ge y \\ 0 & \text{if } x < y \end{cases}$$

$$f(x, 0) = \max(x, 0) \;\Rightarrow\; \frac{\partial f(x, 0)}{\partial x} = \begin{cases} 1 & \text{if } x \ge 0 \\ 0 & \text{if } x < 0 \end{cases}$$
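As a quick sketch of these piecewise derivatives in Python (the names relu and drelu_dx are illustrative, not taken from the text):

def relu(x):
    # ReLU activation: max(x, 0)
    return max(x, 0.0)

def drelu_dx(x):
    # Piecewise derivative of max(x, 0), following the "1 if x >= 0" convention above
    return 1.0 if x >= 0 else 0.0

# The derivative is 1 where the input is non-negative and 0 where it is negative
print(relu(2.5), drelu_dx(2.5))    # 2.5 1.0
print(relu(-1.0), drelu_dx(-1.0))  # 0.0 0.0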
Gradient descent algorithm: an optimization algorithm that finds a local minimum of a
function (in the case of a NN, the loss function) by making the model's parameters converge
to optimal values.
4.1. Back Propagation
• Gradient Descent Algorithm:
Gradient descent algorithm: starting from a point that may already be close to the solution, one
applies an iterative update to gradually approach the desired point (a local minimum), i.e., the
point where the derivative converges to 0.
In order to converge to the local minimum, one has to move in the direction opposite to the
gradient vector. The update formula is as follows:
$$\text{for each } x_i \text{ in } \mathbf{x}: \quad x_i(t+1) = x_i(t) - LR \cdot \frac{\partial f}{\partial x_i}(\mathbf{x})$$
where LR is the learning rate.
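A minimal Python sketch of this update rule, assuming a simple one-dimensional function f(x) = x² whose derivative is known in closed form (the names f, df_dx and the chosen values are illustrative, not from the text):

def f(x):
    return x ** 2           # example function to minimize

def df_dx(x):
    return 2 * x            # its derivative

LR = 0.1                    # learning rate
x = 5.0                     # starting point, possibly close to the solution

for t in range(100):
    x = x - LR * df_dx(x)   # move against the gradient

print(x)  # converges towards the local minimum at x = 0, where the derivative is 0

In a neural network the same update is applied to every weight and bias, with the partial derivatives obtained by backpropagation.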
Examples:
# Forward pass (the weighted inputs xw0, xw1, xw2 and the bias b are assumed
# to have been computed in the previous step)
z = xw0 + xw1 + xw2 + b

# ReLU activation
output = max(z, 0)

# The value of the chain rule
dReLU_dz = 1.0 if z >= 0 else 0.0                  # derivative of ReLU, as defined above
dsum_dxw0 = dsum_dxw1 = dsum_dxw2 = dsum_db = 1.0  # derivative of the sum w.r.t. each term
dReLU_dxw0 = dReLU_dz * dsum_dxw0
dReLU_dxw1 = dReLU_dz * dsum_dxw1
dReLU_dxw2 = dReLU_dz * dsum_dxw2
dReLU_db = dReLU_dz * dsum_db
Therefore: