
How To Derive Errors in Neural Network With The Backpropagation Algorithm?

The document is a question posted on Cross Validated asking how to derive errors in a neural network using the backpropagation algorithm. The top answer explains that δ values represent error terms that are calculated during backpropagation. δ values allow the gradient of the cost function with respect to the weights to be calculated. This gradient is then used in gradient descent training. The key steps are: 1) Forward propagation to calculate activations, 2) Backpropagation to calculate error terms and gradients, 3) Gradient descent to update weights. Calculating δ terms involves applying the chain rule and relationships between errors at different layers.



How to derive errors in neural network with the backpropagation algorithm?

Asked 7 years, 7 months ago · Active 2 months ago · Viewed 8k times

From this video by Andrew Ng, around 5:00:

How are $\delta^{(3)}$ and $\delta^{(2)}$ derived? In fact, what does $\delta^{(3)}$ even mean? $\delta^{(4)}$ is obtained by comparing to $y$; no such comparison is possible for the output of a hidden layer, right?

machine-learning neural-networks backpropagation

asked Apr 19 '14 at 19:27 by qed · edited May 11 '20 at 13:49 by Aditya Saini

The video link is not working. Please update it, or provide a link to the course. Thanks. – MadHatter Nov 28 '17 at 3:09

Though @tmangin answered it beautifully, still, if you want a more detailed explanation, you can refer to this GitHub repo. It is well explained and contains derivations for both the Machine Learning Course and the Deep Learning Specialization. – Aditya Saini Jun 3 '20 at 12:50

1 Answer

I'm going to answer your question about the $\delta_i^{(l)}$, but remember that your question is a sub-question of a larger question, which is why:

$$\nabla_{ij}^{(l)} = \sum_k \theta_{ki}^{(l+1)} \delta_k^{(l+1)} * \left(a_i^{(l)}(1 - a_i^{(l)})\right) * a_j^{(l-1)}$$

Reminder about the steps in neural networks:

Step 1: forward propagation (calculation of the $a_i^{(l)}$)
Step 2a: backward propagation: calculation of the errors $\delta_i^{(l)}$
Step 2b: backward propagation: calculation of the gradient $\nabla_{ij}^{(l)}$ of $J(\Theta)$ using the errors $\delta_i^{(l+1)}$ and the $a_i^{(l)}$
Step 3: gradient descent: calculate the new $\theta_{ij}^{(l)}$ using the gradients $\nabla_{ij}^{(l)}$
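To make these steps concrete, here is a minimal sketch in Python/NumPy (my own illustration, not part of the original answer): a 3-layer network with sigmoid activations, a cross-entropy cost, and biases omitted for brevity. Following the answer's numbering, theta2 maps layer 1 to layer 2 and theta3 maps layer 2 to layer 3.

    import numpy as np

    def sigmoid(z):
        return 1.0 / (1.0 + np.exp(-z))

    # Tiny network: 2 inputs -> 3 hidden units -> 1 output (biases omitted).
    rng = np.random.default_rng(0)
    theta2 = rng.normal(size=(3, 2))   # maps layer 1 to layer 2
    theta3 = rng.normal(size=(1, 3))   # maps layer 2 to layer 3

    x = np.array([0.5, -0.2])          # one made-up training example
    y = np.array([1.0])
    alpha = 0.5                        # learning rate

    for _ in range(100):
        # Step 1: forward propagation (compute the a^(l)).
        a1 = x
        a2 = sigmoid(theta2 @ a1)
        a3 = sigmoid(theta3 @ a2)

        # Step 2a: backward propagation (compute the errors delta^(l)).
        # With a cross-entropy cost, the output-layer error is simply a3 - y.
        delta3 = a3 - y
        delta2 = (theta3.T @ delta3) * a2 * (1 - a2)

        # Step 2b: gradients of J(Theta): nabla^(l)_ij = delta^(l)_i * a^(l-1)_j.
        grad3 = np.outer(delta3, a2)
        grad2 = np.outer(delta2, a1)

        # Step 3: gradient descent update of the thetas.
        theta3 -= alpha * grad3
        theta2 -= alpha * grad2

Running this drives a3 toward y, which is all that gradient descent on a single example amounts to.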

First, to understand what the $\delta_i^{(l)}$ are, what they represent and why Andrew Ng is talking about them, you need to understand what Andrew is actually doing at that point and why we do all these calculations: he's calculating the gradient $\nabla_{ij}^{(l)}$ of $\theta_{ij}^{(l)}$ to be used in the gradient descent algorithm.
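For reference, that gradient feeds the standard gradient descent update rule, with $\alpha$ the learning rate:

$$\theta_{ij}^{(l)} \leftarrow \theta_{ij}^{(l)} - \alpha \, \nabla_{ij}^{(l)}$$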

The gradient is defined as:

$$\nabla_{ij}^{(l)} = \frac{\partial C}{\partial \theta_{ij}^{(l)}}$$

As we can't really solve this formula directly, we are going to modify it using TWO MAGIC TRICKS to arrive at a formula we can actually calculate. This final usable formula is:

$$\nabla_{ij}^{(l)} = \left(\theta^{(l+1)\,T} \delta^{(l+1)}\right) .* \left(a_i^{(l)}(1 - a_i^{(l)})\right) * a_j^{(l-1)}$$

Note: here the mapping from the 1st layer to the 2nd layer is denoted $\theta^{(2)}$, and so on, instead of $\theta^{(1)}$ as in Andrew Ng's Coursera course.

To arrive at this result, the FIRST MAGIC TRICK is that we can write the gradient $\nabla_{ij}^{(l)}$ of $\theta_{ij}^{(l)}$ using $\delta_i^{(l)}$:

$$\nabla_{ij}^{(l)} = \delta_i^{(l)} * a_j^{(l-1)}$$
With $\delta_i^{(L)}$ defined (for the last-layer index $L$ only) as:

$$\delta_i^{(L)} = \frac{\partial C}{\partial z_i^{(L)}}$$
And then the SECOND MAGIC TRICK uses the relation between $\delta_i^{(l)}$ and $\delta_i^{(l+1)}$ to define the deltas at the other indexes:

$$\delta_i^{(l)} = \left(\theta^{(l+1)\,T} \delta^{(l+1)}\right) .* \left(a_i^{(l)}(1 - a_i^{(l)})\right)$$

And as I said, we can finally write a formula for which we know all the terms:

$$\nabla_{ij}^{(l)} = \left(\theta^{(l+1)\,T} \delta^{(l+1)}\right) .* \left(a_i^{(l)}(1 - a_i^{(l)})\right) * a_j^{(l-1)}$$
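As a quick sanity check of the arithmetic (my own example, with made-up values), take a network with a single unit per layer, so every quantity is a scalar, and pick $\theta^{(l+1)} = 0.5$, $\delta^{(l+1)} = 0.2$, $a_i^{(l)} = 0.6$, $a_j^{(l-1)} = 0.9$. Then:

$$\delta_i^{(l)} = 0.5 \cdot 0.2 \cdot \left(0.6 \cdot (1 - 0.6)\right) = 0.024, \qquad \nabla_{ij}^{(l)} = 0.024 \cdot 0.9 = 0.0216$$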
DEMONSTRATION of the FIRST MAGIC TRICK: $\nabla_{ij}^{(l)} = \delta_i^{(l)} * a_j^{(l-1)}$

We defined:

$$\nabla_{ij}^{(l)} = \frac{\partial C}{\partial \theta_{ij}^{(l)}}$$

The Chain rule for higher dimensions (you should REALLY read this property of the Chain rule) enables us to write:

$$\nabla_{ij}^{(l)} = \sum_k \frac{\partial C}{\partial z_k^{(l)}} * \frac{\partial z_k^{(l)}}{\partial \theta_{ij}^{(l)}}$$
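For reference, the property being used is: if $C$ depends on $\theta$ only through intermediate variables $z_1, \dots, z_K$, then

$$\frac{\partial C}{\partial \theta} = \sum_{k=1}^{K} \frac{\partial C}{\partial z_k} \frac{\partial z_k}{\partial \theta}$$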

However, as:

$$z_k^{(l)} = \sum_m \theta_{km}^{(l)} * a_m^{(l-1)}$$

Here:
m --> unit in layer l - 1
k --> unit in layer l
i --> unit in layer l
j --> unit in layer l - 1

We then can write:

$$\frac{\partial z_k^{(l)}}{\partial \theta_{ij}^{(l)}} = \frac{\partial}{\partial \theta_{ij}^{(l)}} \sum_m \theta_{km}^{(l)} * a_m^{(l-1)}$$

Because of the linearity of differentiation [ (u + v)' = u' + v' ], we can write:

$$\frac{\partial z_k^{(l)}}{\partial \theta_{ij}^{(l)}} = \sum_m \frac{\partial \theta_{km}^{(l)}}{\partial \theta_{ij}^{(l)}} * a_m^{(l-1)}$$

with:

$$\text{if } (k, m) \neq (i, j): \quad \frac{\partial \theta_{km}^{(l)}}{\partial \theta_{ij}^{(l)}} * a_m^{(l-1)} = 0$$

$$\text{if } (k, m) = (i, j): \quad \frac{\partial \theta_{km}^{(l)}}{\partial \theta_{ij}^{(l)}} * a_m^{(l-1)} = \frac{\partial \theta_{ij}^{(l)}}{\partial \theta_{ij}^{(l)}} * a_j^{(l-1)} = a_j^{(l-1)}$$

Then for $k = i$ (otherwise it's clearly equal to zero):

$$\frac{\partial z_i^{(l)}}{\partial \theta_{ij}^{(l)}} = \frac{\partial \theta_{ij}^{(l)}}{\partial \theta_{ij}^{(l)}} * a_j^{(l-1)} + \sum_{m \neq j} \frac{\partial \theta_{im}^{(l)}}{\partial \theta_{ij}^{(l)}} * a_m^{(l-1)} = a_j^{(l-1)} + 0$$

Finally, for $k = i$:

$$\frac{\partial z_i^{(l)}}{\partial \theta_{ij}^{(l)}} = a_j^{(l-1)}$$
As a result, we can write our first expression of the gradient $\nabla_{ij}^{(l)}$:

$$\nabla_{ij}^{(l)} = \frac{\partial C}{\partial z_i^{(l)}} * \frac{\partial z_i^{(l)}}{\partial \theta_{ij}^{(l)}}$$

Which is equivalent to:

$$\nabla_{ij}^{(l)} = \frac{\partial C}{\partial z_i^{(l)}} * a_j^{(l-1)}$$

Or:

$$\nabla_{ij}^{(l)} = \delta_i^{(l)} * a_j^{(l-1)}$$
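This identity is easy to verify numerically. Here is a small sketch (my own check, not from the answer; it assumes sigmoid activations and a squared-error cost $C = \frac{1}{2}\lVert a^{(3)} - y \rVert^2$, for which $\delta^{(3)} = (a^{(3)} - y) .* a^{(3)}(1 - a^{(3)})$), comparing $\delta_i^{(2)} * a_j^{(1)}$ against a finite-difference estimate of $\partial C / \partial \theta_{ij}^{(2)}$:

    import numpy as np

    def sigmoid(z):
        return 1.0 / (1.0 + np.exp(-z))

    def cost(theta2, theta3, x, y):
        a3 = sigmoid(theta3 @ sigmoid(theta2 @ x))
        return 0.5 * np.sum((a3 - y) ** 2)

    rng = np.random.default_rng(1)
    theta2 = rng.normal(size=(3, 2))
    theta3 = rng.normal(size=(1, 3))
    x, y = np.array([0.5, -0.2]), np.array([1.0])

    # Analytic gradient via nabla^(2)_ij = delta^(2)_i * a^(1)_j.
    a1, a2 = x, sigmoid(theta2 @ x)
    a3 = sigmoid(theta3 @ a2)
    delta3 = (a3 - y) * a3 * (1 - a3)             # dC/dz^(3) for squared error
    delta2 = (theta3.T @ delta3) * a2 * (1 - a2)  # the second magic trick
    grad_analytic = np.outer(delta2, a1)

    # Finite-difference estimate of dC/dtheta^(2)_ij.
    eps = 1e-6
    grad_numeric = np.zeros_like(theta2)
    for i in range(theta2.shape[0]):
        for j in range(theta2.shape[1]):
            plus, minus = theta2.copy(), theta2.copy()
            plus[i, j] += eps
            minus[i, j] -= eps
            grad_numeric[i, j] = (cost(plus, theta3, x, y)
                                  - cost(minus, theta3, x, y)) / (2 * eps)

    print(np.max(np.abs(grad_analytic - grad_numeric)))  # tiny, on the order of 1e-10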

DEMONSTRATION OF THE SECOND MAGIC TRICK:

$$\delta_i^{(l)} = \left(\theta^{(l+1)\,T} \delta^{(l+1)}\right) .* \left(a_i^{(l)}(1 - a_i^{(l)})\right)$$

or:

$$\delta^{(l)} = \left(\theta^{(l+1)\,T} \delta^{(l+1)}\right) .* \left(a^{(l)}(1 - a^{(l)})\right)$$

Remember that we posed:

$$\delta^{(l)} = \frac{\partial C}{\partial z^{(l)}} \quad \text{and} \quad \delta_i^{(l)} = \frac{\partial C}{\partial z_i^{(l)}}$$

Again, the Chain rule for higher dimensions enables us to write:

$$\delta_i^{(l)} = \sum_k \frac{\partial C}{\partial z_k^{(l+1)}} \frac{\partial z_k^{(l+1)}}{\partial z_i^{(l)}}$$

Replacing $\frac{\partial C}{\partial z_k^{(l+1)}}$ by $\delta_k^{(l+1)}$, we have:

$$\delta_i^{(l)} = \sum_k \delta_k^{(l+1)} \frac{\partial z_k^{(l+1)}}{\partial z_i^{(l)}}$$

Now, let's focus on $\frac{\partial z_k^{(l+1)}}{\partial z_i^{(l)}}$. We have:

$$z_k^{(l+1)} = \sum_j \theta_{kj}^{(l+1)} * a_j^{(l)} = \sum_j \theta_{kj}^{(l+1)} * g(z_j^{(l)})$$

Then we differentiate this expression with respect to $z_i^{(l)}$:

$$\frac{\partial z_k^{(l+1)}}{\partial z_i^{(l)}} = \frac{\partial \sum_j \theta_{kj}^{(l+1)} * g(z_j^{(l)})}{\partial z_i^{(l)}}$$

Because of the linearity of differentiation, we can write:

$$\frac{\partial z_k^{(l+1)}}{\partial z_i^{(l)}} = \sum_j \theta_{kj}^{(l+1)} * \frac{\partial g(z_j^{(l)})}{\partial z_i^{(l)}}$$

If $j \neq i$, then:

$$\frac{\partial \left(\theta_{kj}^{(l+1)} * g(z_j^{(l)})\right)}{\partial z_i^{(l)}} = 0$$

As a consequence:

$$\frac{\partial z_k^{(l+1)}}{\partial z_i^{(l)}} = \theta_{ki}^{(l+1)} * \frac{\partial g(z_i^{(l)})}{\partial z_i^{(l)}}$$

And then:

$$\delta_i^{(l)} = \sum_k \delta_k^{(l+1)} \theta_{ki}^{(l+1)} \frac{\partial g(z_i^{(l)})}{\partial z_i^{(l)}}$$
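The next step uses the standard derivative of the sigmoid $g(z) = \frac{1}{1 + e^{-z}}$, which is worth spelling out once:

$$g'(z) = \frac{e^{-z}}{(1 + e^{-z})^2} = \frac{1}{1 + e^{-z}} \cdot \frac{e^{-z}}{1 + e^{-z}} = g(z)\left(1 - g(z)\right)$$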

As $g'(z) = g(z)(1 - g(z))$, we have:

$$\delta_i^{(l)} = \sum_k \delta_k^{(l+1)} \theta_{ki}^{(l+1)} * g(z_i^{(l)})\left(1 - g(z_i^{(l)})\right)$$

And as $g(z_i^{(l)}) = a_i^{(l)}$, we have:

$$\delta_i^{(l)} = \sum_k \delta_k^{(l+1)} \theta_{ki}^{(l+1)} * a_i^{(l)}\left(1 - a_i^{(l)}\right)$$

And finally, using the vectorized notation:

$$\nabla_{ij}^{(l)} = \left[\theta^{(l+1)\,T} \delta^{(l+1)} .* \left(a_i^{(l)}(1 - a_i^{(l)})\right)\right] * \left[a_j^{(l-1)}\right]$$
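In code, the vectorized formula for one layer is just two lines. A self-contained NumPy sketch with made-up shapes (again my own illustration, biases omitted):

    import numpy as np

    # Illustrative shapes: layer l-1 has 2 units, layer l has 3, layer l+1 has 1.
    rng = np.random.default_rng(2)
    theta_next = rng.normal(size=(1, 3))   # theta^(l+1)
    delta_next = rng.normal(size=(1,))     # delta^(l+1)
    a_l = rng.uniform(size=(3,))           # a^(l)
    a_prev = rng.uniform(size=(2,))        # a^(l-1)

    delta_l = (theta_next.T @ delta_next) * a_l * (1 - a_l)  # delta^(l)
    grad_l = np.outer(delta_l, a_prev)     # nabla^(l), same shape as theta^(l)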

answered May 13 '15 at 15:32 by tmangin · edited Sep 20 at 12:00 by Aditya Khedekar

Thank you for your answer. I upvoted you!! Could you please cite the sources you referred to for arriving at the answer... :) – Adithya Upadhya Jan 30 '17 at 11:09

@tmangin: Following Andrew Ng's talk, $\delta_j^{(l)}$ is the error of node $j$ in layer $l$. How did you get the definition $\delta_j^{(l)} = \frac{\partial C}{\partial z_j^{(l)}}$? – phuong Feb 9 '17 at 17:40


@phuong Actually, you're right to ask: only the $\delta_i^{(L)}$ with the highest "l" index $L$ is defined as

$$\delta_i^{(L)} = \frac{\partial C}{\partial z_i^{(L)}}$$

whereas the deltas with lower "l" indexes are defined by the following formula:

$$\delta_i^{(l)} = \left(\theta^{(l+1)\,T} \delta^{(l+1)}\right) .* \left(a_i^{(l)}(1 - a_i^{(l)})\right)$$

– tmangin Feb 28 '17 at 14:20

I highly recommend reading the backprop vectorial notation for calculating the gradients. – CKM Mar 26 '17 at 13:30

Your final usable formula is not what Andrew Ng had, which is making it really frustrating to follow your proof. He had $\nabla_{ij}^{(l)} = \theta^{(l)\,T} \delta^{(l+1)} .* \left(a_i^{(l)}(1 - a_i^{(l)})\right) * a_j^{(l-1)}$, not $\theta^{(l+1)\,T} \delta^{(l+1)}$. – azizj Aug 5 '17 at 23:19


