Lect 4

The document describes gradient descent learning and backpropagation algorithms for training neural networks. It discusses how gradient descent aims to find the minimum error by computing derivatives of the error function with respect to weights. Batch and incremental training modes are described. Backpropagation is introduced as a method for calculating error gradients for multi-layer networks using generalized delta rule. Sigmoid activation functions and their use in backpropagation are also mentioned.

Uploaded by

norain ismail sulieman


University Of Khartoum

Department Of Electronics & Electrical Engineering
Software & Control Engineering

EEE52511: NEURAL NETWORKS & FUZZY SYSTEMS
By: Dr. Hiba Hassan Sayed
Lecture 4
30/1/2023 U of K: Dr. Hiba Hassan 2

GRADIENT DESCENT LEARNING


Gradient Descent Learning in NN

• The gradient is the rate of change of f(x) at a particular value of x.

• Hence, it is the derivative of f(x) with respect to x; when f depends on
several variables, it is the vector of partial derivatives.
• This leads to gradient descent learning, whose aim is to find the
minimum error by computing the derivative of the error function
with respect to the weights and moving the weights against it. It is
sometimes called gradient descent minimization.

Finding the minimum of a function: gradient descent
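
As a minimal sketch of the idea named in this slide title, the snippet below repeatedly steps against the derivative of a simple one-variable function. The function f(x) = (x - 3)^2, the step size eta and the helper name gradient_descent are illustrative assumptions, not taken from the lecture.

```python
def gradient_descent(grad, x0, eta=0.1, steps=100):
    """Move x against the gradient for a fixed number of steps."""
    x = x0
    for _ in range(steps):
        x = x - eta * grad(x)  # step in the direction of the negative gradient
    return x


# Illustrative function (an assumption): f(x) = (x - 3)^2, minimum at x = 3.
def df(x):
    return 2.0 * (x - 3.0)


x_min = gradient_descent(df, x0=0.0)
print(x_min)
```

With this quadratic, each step shrinks the distance to the minimum by a constant factor (1 - 2·eta), so a step size that is too large (eta above 1 here) would make the iteration diverge.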


Cont.

Cont.
• For a target (t) & an actual output (o), the error is given by the
following mean square error cost function,

• Where D is the set of training examples.

• There are two types of gradient descent based on this cost function,
batch and incremental; they are described next.
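
The cost function itself did not survive in this copy of the slide. A commonly used form that matches the symbols above (t, o and D) is sketched here as an assumption; the factor 1/2 and the weight-vector symbol are conventions not confirmed by the slide, and some texts divide by the number of examples instead.

```latex
% Assumed (textbook) form of the squared-error cost over the training set D
E(\vec{w}) = \frac{1}{2} \sum_{d \in D} \left( t_d - o_d \right)^2
```

Under this form, the per-example error used by incremental training would be the single term for one example d.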

Where,

Batch Training
• Batch Training: In batch mode the weights and biases of the
network are updated only after the entire training set has been
applied to the network. The gradients calculated at each training
example are added together to determine the change in the weights
and biases.
• Batch Gradient Descent: In the batch steepest descent training
function the weights and biases are updated in the direction of the
negative gradient of the performance function.
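
The batch mode described above can be sketched for a single linear unit as follows. The gradients of all examples are summed before the weights change once. The toy data (targets generated by t = 1 + 2x), the learning rate and the function name batch_gradient_step are assumptions for illustration.

```python
import numpy as np


def batch_gradient_step(w, X, t, eta):
    """One batch update: sum the gradient over the whole training set, then move once."""
    o = X @ w                # outputs of the linear unit for every example
    grad = -(t - o) @ X      # dE/dw for E = 1/2 * sum (t - o)^2, summed over examples
    return w - eta * grad    # step along the negative gradient


# Toy data (assumed): first column is a constant bias input, targets follow t = 1 + 2x.
X = np.array([[1.0, 0.0], [1.0, 1.0], [1.0, 2.0]])
t = np.array([1.0, 3.0, 5.0])

w = np.zeros(2)
for _ in range(2000):
    w = batch_gradient_step(w, X, t, eta=0.1)
print(w)
```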
30/1/2023 U of K: Dr. Hiba Hassan 10

Batch Gradient Descent with Momentum

• This algorithm often provides faster convergence.

• Momentum allows a network to respond not only to the local
gradient, but also to recent trends in the error surface.
• Acting like a low-pass filter, momentum allows the network to ignore
small features in the error surface.
• Without momentum a network may get stuck in a shallow local
minimum, such as shown in the next figure.
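
A minimal sketch of the momentum update, assuming the common form in which the new weight change is the gradient step plus a fraction alpha of the previous change. The quadratic test function and the name momentum_descent are illustrative assumptions; this example only shows the update rule and does not demonstrate escaping a local minimum.

```python
def momentum_descent(grad, w0, eta=0.1, alpha=0.9, steps=500):
    """Gradient descent where each step also carries a fraction alpha of the previous step."""
    w, delta_w = w0, 0.0
    for _ in range(steps):
        # New change = plain gradient step + momentum from the previous change.
        delta_w = -eta * grad(w) + alpha * delta_w
        w = w + delta_w
    return w


# Illustrative error surface (an assumption): E(w) = w^2, so dE/dw = 2w.
w_final = momentum_descent(lambda w: 2.0 * w, w0=5.0)
print(w_final)
```

Because the previous change is fed back at every step, the sequence of changes behaves like a filtered version of the raw gradient, which is the low-pass effect mentioned above.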
30/1/2023 U of K: Dr. Hiba Hassan 11

Cont.
[Figures: local and global minima; effect of adding momentum]
30/1/2023 U of K: Dr. Hiba Hassan 12

Incremental Mode Gradient Descent

• When we use the gradient with respect to one training example at
a time, gradient descent becomes the Widrow-Hoff delta rule, which
is given by,

$$\Delta w_i = \eta \, (t - o) \, x_i$$
• Also called the least mean square (LMS) method.
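
The incremental mode can be sketched as below: the weights change after every single example, using the delta rule for a linear unit. The toy data, learning rate and function name delta_rule_epoch are assumptions chosen for illustration.

```python
import numpy as np


def delta_rule_epoch(w, X, t, eta):
    """One pass over the data, updating the weights after each example."""
    for x_d, t_d in zip(X, t):
        o = w @ x_d                      # unthresholded (linear) output
        w = w + eta * (t_d - o) * x_d    # delta rule: dw_i = eta * (t - o) * x_i
    return w


# Toy data (assumed): first column is a constant bias input, targets follow t = 1 + 2x.
X = np.array([[1.0, 0.0], [1.0, 1.0], [1.0, 2.0]])
t = np.array([1.0, 3.0, 5.0])

w = np.zeros(2)
for _ in range(2000):
    w = delta_rule_epoch(w, X, t, eta=0.1)
print(w)
```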
30/1/2023 U of K: Dr. Hiba Hassan 13

LMS Learning Rule

Mean Square Error:
• Like the perceptron learning rule, the least mean square (LMS) algorithm,
also known as the delta rule, is an example of supervised training, in which
the learning rule is provided with a set of examples of desired network behavior:
$$\{p_1, t_1\}, \{p_2, t_2\}, \ldots, \{p_Q, t_Q\}$$
• We want to minimize the average of the squared errors
between target and actual network output:
$$\text{mse} = \frac{1}{Q}\sum_{k=1}^{Q} e(k)^2 = \frac{1}{Q}\sum_{k=1}^{Q} \bigl(t(k) - a(k)\bigr)^2$$
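
The mean square error above can be computed directly; the short sketch below uses made-up target and output values purely for illustration.

```python
import numpy as np


def mse(t, a):
    """Average of the squared errors e(k) = t(k) - a(k) over Q examples."""
    e = np.asarray(t, dtype=float) - np.asarray(a, dtype=float)
    return np.mean(e ** 2)


# Made-up targets and network outputs (assumptions for illustration).
print(mse([1.0, 2.0, 3.0], [1.0, 2.0, 5.0]))
```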
30/1/2023 U of K: Dr. Hiba Hassan 14

LMS Algorithm/ Widrow-Hoff rule

• The LMS algorithm was presented by Widrow and Hoff, hence, it is
called Widrow-Hoff learning algorithm.
• As seen before, it is based on an approximate steepest descent
procedure.
• Widrow and Hoff decided that they could estimate the mean
square error by using the squared error at each iteration.
30/1/2023 U of K: Dr. Hiba Hassan 15

Comparing Perceptron & Delta Rules

• Perceptron rule
• Thresholded output.
• Converges after a finite number of iterations to a hypothesis that perfectly
classifies the training data, provided the training examples are linearly
separable.
• Linearly separable data.
• Delta rule
• Unthresholded output.
• Converges toward the error minimum, possibly requiring unbounded time,
but converges regardless of whether the training data are linearly
separable or not.
• Linearly non-separable data.
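
The contrast between the two rules can be sketched with two update functions that differ only in how the output is computed. The 0/1 threshold convention, the sample weights and the function names are assumptions for illustration.

```python
import numpy as np


def perceptron_update(w, x, t, eta):
    """Perceptron rule: the error uses the thresholded output (0 or 1)."""
    o = 1.0 if w @ x > 0 else 0.0
    return w + eta * (t - o) * x


def delta_update(w, x, t, eta):
    """Delta rule: the error uses the unthresholded linear output."""
    o = w @ x
    return w + eta * (t - o) * x


# One example that the current weights already classify correctly (w.x = 0.6 > 0, target 1).
w = np.array([0.2, 0.4])
x = np.array([1.0, 1.0])
t = 1.0

print(perceptron_update(w, x, t, eta=0.1))  # unchanged: thresholded output equals the target
print(delta_update(w, x, t, eta=0.1))       # still moves: linear output 0.6 differs from 1
```

The perceptron rule stops changing the weights once every example is classified correctly, while the delta rule keeps reducing the squared error of the linear output.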
30/1/2023 U of K: Dr. Hiba Hassan 16

Adaptive Linear Neuron Network Architecture (ADALINE)

• The ADALINE network is a single layer neural network with multiple
nodes, where each node accepts multiple inputs to generate one output.
• ADALINE networks are similar to the perceptron, but their transfer
function is linear rather than hard-limiting. This allows their outputs to
take on any value, whereas the perceptron output is limited to either 0 or
1.
• Both the ADALINE and the perceptron can only solve linearly separable
problems.
30/1/2023 U of K: Dr. Hiba Hassan 17

Cont.
• An adaptive linear system responds to changes in its environment
as it is operating.
• These networks are often used in error cancellation, signal
processing, and control systems. For example, they are used by
many long distance phone lines for echo cancellation.
• The pioneering work in this field was done by Widrow and Hoff,
who gave the name ADALINE to adaptive linear elements.
30/1/2023 U of K: Dr. Hiba Hassan 18

The ADALINE Neural Network

30/1/2023 U of K: Dr. Hiba Hassan 19

Cont.
• Multiple layer ADALINE is called MADALINE.
• The Widrow-Hoff rule can only train single-layer linear networks.
• This is not much of a disadvantage; single-layer linear networks are
just as capable as multilayer linear networks.
• For every multilayer linear network, there is an equivalent single-
layer linear network.
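
The equivalence stated above can be checked numerically: composing two linear layers gives another linear layer whose weight matrix is the product of the two. The layer sizes and random values below are arbitrary assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

# A two-layer linear network with arbitrary (assumed) sizes: 4 inputs -> 3 -> 2 outputs.
W1, b1 = rng.normal(size=(3, 4)), rng.normal(size=3)
W2, b2 = rng.normal(size=(2, 3)), rng.normal(size=2)
x = rng.normal(size=4)

two_layer = W2 @ (W1 @ x + b1) + b2

# Equivalent single layer: W = W2 W1, b = W2 b1 + b2.
W = W2 @ W1
b = W2 @ b1 + b2
single_layer = W @ x + b

print(np.allclose(two_layer, single_layer))
```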
30/1/2023 U of K: Dr. Hiba Hassan 20

BACKPROPAGATION ALGORITHM
30/1/2023 U of K: Dr. Hiba Hassan 21

BackPropagation Algorithm
• The backpropagation algorithm was made popular by Rumelhart,
Hinton and Williams in 1986 ["Learning Internal Representations by
Error Propagation", in Rumelhart, David E.; McClelland, James
L. (eds.), Parallel Distributed Processing: Explorations in the
Microstructure of Cognition, Vol. 1: Foundations. Cambridge: MIT
Press. ISBN 0-262-18120-7].
• The researchers used semi-linear neurons with differentiable activation
functions in the hidden neurons (logistic activation functions or
sigmoids).
30/1/2023 U of K: Dr. Hiba Hassan 22

Cont.
• The error between the target and actual output is calculated at
every iteration and is back propagated through the layers of the
ANN to adapt the weights.
• The weights are adapted such that the error is minimized.
• Once the error has reached an acceptably small value, the training
is stopped.
• Among the first applications of the BP algorithm was NETtalk, a speech
synthesis system developed by Terrence Sejnowski and Charles Rosenberg
[Sejnowski & Rosenberg, 1987, “Parallel Networks that Learn to
Pronounce English Text”, Complex Systems 1, 145-168].
30/1/2023 U of K: Dr. Hiba Hassan 23

Cont.
• The configuration for training a neural network using the BP
algorithm is shown in the figure below.
30/1/2023 U of K: Dr. Hiba Hassan 24

The Generalized Delta Rule (G.D.R.)

• In the BP algorithm, as in other learning algorithms, the goal is to find
the weight change (Δw) to apply at the next iteration; the rule that gives
it is known as the generalized delta rule (G.D.R.).
• Consider the following ANN model:
30/1/2023 U of K: Dr. Hiba Hassan 25

Cont.
• We need to obtain the following algorithm to adapt the weights
between the output (k) and hidden (j) layers:

• Where the weights are adapted as follows:

• Here t is the iteration number and $\delta_k$ is the error signal between the
output and hidden layers, given by:
30/1/2023 U of K: Dr. Hiba Hassan 26

Cont.
• Adaptation between input (i) and hidden (j) layers :

• The new weight is thus:

• and the error signal through layer j is:

• Where,

• And,
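
The equations on these two slides are missing from this copy. For reference, the generalized delta rule for sigmoid units (with c = 1) is usually written as below. This is the standard textbook form, assumed here rather than recovered from the slides; it uses η, α, the iteration number t and the layer indices i, j, k from the text, with O denoting a neuron output, t_k the target of output neuron k and δ an error signal.

```latex
\begin{align}
% Output (k) to hidden (j) weights
\Delta w_{kj}(t+1) &= \eta \, \delta_k \, O_j + \alpha \, \Delta w_{kj}(t) \\
w_{kj}(t+1) &= w_{kj}(t) + \Delta w_{kj}(t+1) \\
\delta_k &= O_k \, (1 - O_k) \, (t_k - O_k) \\
% Hidden (j) to input (i) weights
\Delta w_{ji}(t+1) &= \eta \, \delta_j \, O_i + \alpha \, \Delta w_{ji}(t) \\
w_{ji}(t+1) &= w_{ji}(t) + \Delta w_{ji}(t+1) \\
\delta_j &= O_j \, (1 - O_j) \sum_k \delta_k \, w_{kj}
\end{align}
```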
30/1/2023 U of K: Dr. Hiba Hassan 27

Backpropagation Algorithm
• The following ANN model is used to derive the backpropagation
algorithm:
30/1/2023 U of K: Dr. Hiba Hassan 28

BP (cont.)
• The backpropagation has two steps,
• Forward propagation, and
• Backward propagation.
• Our ANN model has the following assumptions:
• A two-layer multilayer NN model, i.e. with 1 set of hidden neurons.
• Neurons in layer i are fully connected to layer j and neurons in
layer j are fully connected to layer k.
• Input layer neurons have linear activation functions and hidden
and output layer neurons have logistic activation functions
(sigmoids).
30/1/2023 U of K: Dr. Hiba Hassan 29

Note: Sigmoid Function

• Sigmoids have a variable c that controls their firing angle.
30/1/2023 U of K: Dr. Hiba Hassan 30

Cont.
• When c is large, the sigmoid approaches a threshold function, and
when c is small, the sigmoid becomes more like a straight line
(linear).
• When c is large learning is much faster but a lot of information is
lost, however when c is small, learning is very slow but information
is retained.
• Since this function is differentiable, it enables the B.P. algorithm to
adapt the lower layers of weights in a multilayer neural network.
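
The sigmoid formula itself is not present in this copy of the slide. A minimal sketch, assuming the common logistic form f(x) = 1 / (1 + exp(-c·x)), shows how c changes the shape and gives the derivative that backpropagation relies on. The function names are assumptions.

```python
import math


def sigmoid(x, c=1.0):
    """Logistic sigmoid with steepness parameter c (assumed form)."""
    return 1.0 / (1.0 + math.exp(-c * x))


def sigmoid_derivative(x, c=1.0):
    """Derivative of the sigmoid: c * f(x) * (1 - f(x))."""
    s = sigmoid(x, c)
    return c * s * (1.0 - s)


# Large c: output jumps towards 0 or 1 like a threshold function.
print(sigmoid(-1.0, c=50.0), sigmoid(1.0, c=50.0))
# Small c: output stays close to the straight line 0.5 + c*x/4 near the origin.
print(sigmoid(-1.0, c=0.1), sigmoid(1.0, c=0.1))
```

For c = 1 the derivative reduces to f(x)(1 - f(x)), which is why error signals for sigmoid units contain the factor O(1 - O).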
30/1/2023 U of K: Dr. Hiba Hassan 31

Cont.
• The firing angle used here is c=1.
• Bias weights are used with bias signals of 1 for hidden (j) and output
layer (k) neurons.
• In many ANN models, bias weights (θ) with bias signals of 1 are used to
speed up the convergence process.
• The learning parameter is given by the symbol η and is usually fixed at a
value between 0 and 1; however, in many applications nowadays an
adaptive η is used.
• Usually η is set large in the initial stage of learning and reduced to a
small value at the final stage of learning.
• A momentum term α is also used in the G.D.R. to help avoid local minima.
30/1/2023 U of K: Dr. Hiba Hassan 32

Steps of BP Algorithm
• Step 1: Obtain a set of training patterns.
• Step 2: Set up neural network model: No. of Input neurons, Hidden
neurons, and Output Neurons.
• Step 3: Set learning rate η and momentum rate α
• Step 4: Initialize all connection weights Wji, Wkj and bias weights θj, θk to
random values.
• Step 5: Set minimum error, Emin
• Step 6: Start training by applying input patterns one at a time and
propagate through the layers then calculate total error.
30/1/2023 U of K: Dr. Hiba Hassan 33

Cont.
• Step 7: Backpropagate error through output and hidden layer and
adapt weights.
• Step 8: Backpropagate error through hidden and input layer and
adapt weights.
• Step 9: Check if Error < Emin.
• If not, repeat Steps 6-9. If yes, stop training.
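
Steps 1 to 9 can be sketched as follows. This is an illustrative implementation under stated assumptions, not the lecture's own code: it uses sigmoid units with c = 1, the standard error signals δk = Ok(1 - Ok)(tk - Ok) and δj = Oj(1 - Oj) Σk δk Wkj, a total error of 1/2 Σ (t - o)^2 per epoch, and a network with 2 hidden neurons and bias weights. All function and variable names are hypothetical.

```python
import numpy as np


def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))


def forward(x, W_ji, theta_j, W_kj, theta_k):
    """Step 6: propagate one input pattern through hidden (j) and output (k) layers."""
    o_j = sigmoid(W_ji @ x + theta_j)
    o_k = sigmoid(W_kj @ o_j + theta_k)
    return o_j, o_k


def train(X, T, n_hidden=2, eta=0.5, alpha=0.9, e_min=0.01, max_epochs=5000, seed=0):
    rng = np.random.default_rng(seed)
    n_in, n_out = X.shape[1], T.shape[1]
    # Step 4: random initial connection weights and bias weights.
    W_ji = rng.uniform(-1, 1, (n_hidden, n_in))
    theta_j = rng.uniform(-1, 1, n_hidden)
    W_kj = rng.uniform(-1, 1, (n_out, n_hidden))
    theta_k = rng.uniform(-1, 1, n_out)
    params = [W_ji, theta_j, W_kj, theta_k]
    prev = [np.zeros_like(p) for p in params]  # previous weight changes, for momentum
    history = []
    for _ in range(max_epochs):
        total_error = 0.0
        for x, t in zip(X, T):
            o_j, o_k = forward(x, W_ji, theta_j, W_kj, theta_k)
            total_error += 0.5 * np.sum((t - o_k) ** 2)
            # Step 7: error signal at the output layer.
            delta_k = o_k * (1 - o_k) * (t - o_k)
            # Step 8: error signal at the hidden layer (uses W_kj before it is updated).
            delta_j = o_j * (1 - o_j) * (W_kj.T @ delta_k)
            grads = [np.outer(delta_j, x), delta_j, np.outer(delta_k, o_j), delta_k]
            for p, g, d in zip(params, grads, prev):
                d[...] = eta * g + alpha * d   # weight change with momentum term
                p += d                         # adapt the weight in place
        history.append(total_error)
        # Step 9: stop once the total error is below the chosen minimum.
        if total_error < e_min:
            break
    return params, history


# Steps 1-3 and 5: XOR patterns, network size, eta, alpha and Emin (all assumed values).
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
T = np.array([[0], [1], [1], [0]], dtype=float)
params, history = train(X, T, max_epochs=2000)
print(f"error after first epoch: {history[0]:.4f}, after last epoch: {history[-1]:.4f}")
```

Whether training reaches Emin depends on the random initial weights and on η and α, so the final error should be inspected rather than assumed.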
30/1/2023 U of K: Dr. Hiba Hassan 34

Solving an XOR Problem

• In this example we use the BP algorithm to solve a 2-bit XOR problem.
• The training patterns of this ANN are the XOR examples given in the next
table.
• For simplicity, the ANN model has only 4 neurons (2 inputs, 1 hidden and
1 output) and has no bias weights.
• The input neurons have linear functions and the hidden and output
neurons have sigmoid functions.
• The weights are initialized randomly.
• We train the ANN by providing the patterns #1 to #4 through an iteration
process until the error is minimized.
30/1/2023 U of K: Dr. Hiba Hassan 35

Cont.
• The training patterns of this ANN are the XOR examples given in
the following table:

Pattern   Input 1   Input 2   Target
#1        0         0         0
#2        0         1         1
#3        1         0         1
#4        1         1         0
30/1/2023 U of K: Dr. Hiba Hassan 36

Cont.
• The ANN model and its initial weights,

• Training begins when the pattern#1 and its target are provided to the
ANN.
• 1st pattern: 0, 0 target : 0
30/1/2023 U of K: Dr. Hiba Hassan 37
30/1/2023 U of K: Dr. Hiba Hassan 38

Compute the error by comparing this value to the target,

30/1/2023 U of K: Dr. Hiba Hassan 39

Cont.
• This error is now backpropagated through the layers following the
error signal equations given as follows:
• Between output (k) and hidden (j) layer

• Thus
• Between hidden (j) and input (i) layer :

• $\delta_j = -0.0035$
30/1/2023 U of K: Dr. Hiba Hassan 40

Cont.
• Now we have calculated the error signal between layers (k) and (j)

• If we had chosen the learning rate and momentum term as follows:

• η = 0.1 and α = 0.9
• and the previous change in weight is 0 and Oj = 0.5
• Then,

$\Delta w_{kj} = -0.0064$
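
The arithmetic of this first iteration can be retraced with a short sketch. The values η = 0.1, α = 0.9, a previous weight change of 0, a hidden output of 0.5 and a target of 0 come from the slides; the initial weight w_kj = 0.11 is a hypothetical value, chosen only because it reproduces the two results quoted on the slides (the original initial weights are not present in this copy). The error-signal formulas are the standard ones for sigmoid units and are likewise assumed.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


eta, alpha = 0.1, 0.9      # from the slides
t_k = 0.0                  # target for pattern #1 (inputs 0, 0)
o_j = 0.5                  # hidden output: sigmoid(0) with zero inputs and no bias
prev_dw_kj = 0.0           # previous change in weight, from the slides
w_kj = 0.11                # HYPOTHETICAL initial weight, not taken from the slides

o_k = sigmoid(w_kj * o_j)                        # forward pass through the output neuron
delta_k = o_k * (1.0 - o_k) * (t_k - o_k)        # error signal at the output layer
dw_kj = eta * delta_k * o_j + alpha * prev_dw_kj # weight change with momentum
delta_j = o_j * (1.0 - o_j) * delta_k * w_kj     # error signal at the hidden layer

print(round(dw_kj, 4), round(delta_j, 4))
```

Because both inputs of pattern #1 are 0, the change for the input-to-hidden weights, η·δj·Oi, is 0 for this pattern when the previous change is 0.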
30/1/2023 U of K: Dr. Hiba Hassan 41

Cont.
• This is the increment of the weight after the first iteration for the
weight between layers k and j.
• Now this change in weight is added to the actual weight as follows

• and thus the weight between layers k and j has been adapted.
30/1/2023 U of K: Dr. Hiba Hassan 42

Cont.
• Similarly for the weights between layers j and i, the adaptation follows

• Now this change in weight is added to the actual weight as follows:

• and this is the adapted weight between layers j and i after pattern#1 is
seen by the ANN in the first iteration.
• The whole calculation is then repeated for the next pattern (pattern#2 =
[0, 1]) with tk=1.
• After all the 4 patterns have been completed the whole process is
repeated for pattern#1 again.
30/1/2023 U of K: Dr. Hiba Hassan 43

UNSUPERVISED LEARNING
30/1/2023 U of K: Dr. Hiba Hassan 44

Unsupervised Learning
• Unsupervised learning is the process of finding structure, patterns
or correlation in the given data.
• Many times this type of learning depends on associative learning
procedures.
• We focus on two main approaches:
  • Unsupervised Hebbian learning, used for principal component analysis
  • Unsupervised competitive learning, used for clustering
30/1/2023 U of K: Dr. Hiba Hassan 45

Types of Analysis used in Unsupervised Learning
• Correlational analysis
• Identifying the correlations among features.
• Accomplished via Hebbian learning
• Cluster analysis
• Identifying the relational structure of the data.
• Accomplished via competitive learning.

• Cluster analysis is a form of categorization, whereas
correlational analysis is a form of simplification.
30/1/2023 U of K: Dr. Hiba Hassan 46

Hebbian Learning
• An association principle was proposed by Hebb in 1949 in the
context of biological neurons.
• Hebb’s principle
When a neuron repeatedly excites another neuron, then the
threshold of the latter neuron is decreased, or the synaptic
weight between the neurons is increased, in effect increasing
the likelihood of the second neuron to be excited by the first.
30/1/2023 U of K: Dr. Hiba Hassan 47

Hebbian Learning as Correlation Learning

• Hebbian learning is a form of associative learning: it associates things that
occur together.
• Thus Hebbian learning can be thought of as learning the autocorrelation
of the input space.
• Example: a child recognizes a banana by its shape and wants to eat it.
He also smells it, and after a couple of exposures to that experience
he starts drooling as soon as he smells it, without even seeing it.
• Conclusion: the child has associated the smell with the banana and
produced a response (hunger effect) even without seeing its shape.
30/1/2023 U of K: Dr. Hiba Hassan 48

Cont.
• Brilliant idea by Hebb (1949): cells that fire together, wire
together.

[Figure: banana-smell neuron connected to hungry neuron]
30/1/2023 U of K: Dr. Hiba Hassan 49

Hebbian Learning Neural Network

[Figure: input signals, neurons i and j, output signals]
30/1/2023 U of K: Dr. Hiba Hassan 50

Banana Associator Example

30/1/2023 U of K: Dr. Hiba Hassan 51

Example (cont.)
• The inputs are defined as follows:

• If we want the network to associate the response to the shape of
the banana and not its smell, w0 is assigned a value greater than –b,
while w is assigned a value less than –b.
• Hence we choose w0 = 1 and w = 0.
• The output of the network reduces to:
a = hardlim(p0 - 0.5)
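
The associator can be sketched as below. The bias b = -0.5 follows from the reduced output a = hardlim(p0 - 0.5) with w0 = 1 and w = 0. The input definitions are missing from this copy, so the encoding used here (p0 = 1 when the shape is detected, p = 1 when the smell is detected, 0 otherwise) and the convention hardlim(n) = 1 for n >= 0 are assumptions.

```python
def hardlim(n):
    """Hard-limit transfer function: 1 if n >= 0, else 0 (assumed convention)."""
    return 1 if n >= 0 else 0


def banana_associator(p0, p, w0=1.0, w=0.0, b=-0.5):
    """p0: shape input, p: smell input (assumed 0/1 encoding)."""
    return hardlim(w0 * p0 + w * p + b)


for p0, p in [(0, 0), (0, 1), (1, 0), (1, 1)]:
    print(f"shape={p0} smell={p} -> response={banana_associator(p0, p)}")
```

With w = 0 the smell input has no effect, so the network responds to the shape alone until learning increases w.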
30/1/2023 U of K: Dr. Hiba Hassan 52

Hebbian Learning
• Hebbian learning rule: $\Delta w_{ji} = \eta \, y_j \, x_i$
• Consider the update of a single weight w,
$$w(n + 1) = w(n) + \eta \, y(n) \, x(n)$$
• For a linear activation function, y(n) = w(n)x(n), so
$$w(n + 1) = w(n)\,[1 + \eta \, x^2(n)]$$
• Weights increase without bound. If the initial weight is negative, it
grows in the negative direction; if it is positive, it grows in the
positive direction.
• Hebbian learning is naturally unstable.
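
The unbounded growth can be seen in a few iterations of the plain Hebbian update for one linear neuron. The constant input x = 1, the learning rate and the function name hebb_trajectory are illustrative assumptions.

```python
def hebb_trajectory(w0, xs, eta):
    """Apply w(n+1) = w(n) + eta * y(n) * x(n) with y(n) = w(n) * x(n)."""
    w = w0
    trajectory = [w]
    for x in xs:
        y = w * x              # linear neuron output
        w = w + eta * y * x    # plain Hebbian update
        trajectory.append(w)
    return trajectory


xs = [1.0] * 10
print(hebb_trajectory(0.5, xs, eta=0.1)[-1])    # positive weight keeps growing
print(hebb_trajectory(-0.5, xs, eta=0.1)[-1])   # negative weight grows more negative
```

With a constant input of 1, each update multiplies the weight by 1 + η, so its magnitude grows geometrically whatever its sign.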
30/1/2023 U of K: Dr. Hiba Hassan 53

Oja’s Learning Rule

• To solve the problem of the simple Hebbian rule, which causes the
weights to increase (or decrease) without bound, the weights need to be
normalized to one as follows,
$$w_{ji}(n + 1) = \frac{w_{ji}(n) + \eta \, y_j(n) \, x_i(n)}{\sqrt{\sum_i \left[ w_{ji}(n) + \eta \, y_j(n) \, x_i(n) \right]^2}}$$
• This equation effectively imposes a constraint on the weights.
• Oja approximated the normalization (for small η) as:
30/1/2023 U of K: Dr. Hiba Hassan 54

Oja’s Rule (continued)

$$w_{ji}(n + 1) = w_{ji}(n) + \eta \, y_j(n) \, [x_i(n) - y_j(n) \, w_{ji}(n)]$$

• This rule is also known as the generalized Hebbian rule.

• The second term inside the brackets, $-y_j(n) \, w_{ji}(n)$, is called a weight decay term or a ‘forgetting term’.
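
A minimal sketch of Oja's rule for a single linear neuron shows the weight vector settling at unit length instead of growing without bound. The single repeated input vector, the initial weights, the learning rate and the function name oja_train are assumptions chosen to keep the example deterministic.

```python
import numpy as np


def oja_train(w, X, eta, epochs):
    """Apply w <- w + eta * y * (x - y * w) with y = w . x for each input x."""
    w = np.array(w, dtype=float)
    for _ in range(epochs):
        for x in X:
            y = w @ x
            w += eta * y * (x - y * w)   # Hebbian term minus the forgetting term
    return w


# One repeated input direction (assumed); its unit vector is [0.6, 0.8].
X = np.array([[3.0, 4.0]])
w = oja_train([0.1, 0.1], X, eta=0.01, epochs=200)
print(w, np.linalg.norm(w))
```

Here the weights align with the input direction and their norm approaches 1; with varied inputs and a small η, the weight vector is expected to fluctuate around unit length rather than settle exactly.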
