
Artificial Intelligence, An Introductory Course

The International Congress for global Science and Technology


www.icgst.com

Instructor: Ashraf Aboshosha, Dr. rer. nat.


Engineering Dept., Atomic Energy Authority,
8th Section, Nasr City, Cairo, P.O. Box 29
E-mail: aboshosha@icgst.com,
Tel.: 012-1804952

Lecture (3): Artificial Neural Networks

Course Syllabus:

• An introduction to artificial intelligence and machine learning
• Artificial Neural Networks
• Intelligent search techniques
• Neural programming

This is free educational material


6. Network learning strategies
Under the notion of learning in a network, we will consider a process of forcing a network to yield a particular response to a specific input. A particular response may or may not be specified in order to provide external correction. Learning is necessary when the information about the inputs/outputs is unknown or incomplete a priori, so that no design of a network can be performed in advance. The majority of networks require training in a supervised or unsupervised learning mode. Some networks, however, can be designed without incremental training; they are designed by batch learning rather than stepwise training.

Batch learning takes place when the network weights are adjusted in
a single training step. In this mode of learning, the complete set of input/output
training data is needed to determine the weights, and feedback information
produced by the network itself is not involved in developing the network. This
learning technique is also called recording. Learning with feedback either
from the teacher or from the environment rather than a teacher, however, is
more typical for neural networks. Such learning is called incremental and is
usually performed in steps. The concept of feedback plays a central role in
learning. The concept is highly elusive and somewhat paradoxical. In a broad
sense it can be understood as an introduction of a pattern of relationships into
the cause-and-effect path. We will distinguish two different types of learning: 1) supervised learning and 2) unsupervised learning.

[Figure] fig. (9) Supervised Learning Rule: the network O = F(W, X) maps the input X to the output O; the learning algorithm compares O with the desired response d and produces the learning signal used to adjust the weights W.

In supervised learning, fig. (9), we assume that at each instant of time when the input is applied, the desired response d of the system is provided by the teacher. The distance between the actual and the desired response serves as an error measure and is used to correct network parameters externally. Since we assume adjustable weights, the teacher may implement a reward-and-punishment scheme to adapt the network's weight matrix W. For instance, in learning classifications of input patterns or situations with known responses, the error can be used to modify weights so that the error decreases. This mode of learning is very pervasive and is used in many situations of natural learning. A set of input and output patterns called a training set is required for this learning mode. Typically, supervised learning rewards accurate classifications or associations and punishes those which yield inaccurate responses. The teacher estimates the negative error gradient direction and reduces the error accordingly. In many situations the inputs, outputs, and the computed gradient are deterministic; however, the minimization of error proceeds over all of its random realizations. As a result, most supervised learning algorithms reduce to stochastic minimization of error in multi-dimensional weight space.

In learning without supervision, the desired response is not known; thus, explicit error information cannot be used to improve network behavior. Since no information is available as to the correctness or incorrectness of responses, learning must somehow be accomplished based on observations of responses to inputs about which we have marginal or no knowledge. For example, unsupervised learning can easily result in finding the boundary between classes of input patterns.
[Figure] Fig. (10) Unsupervised Learning Rule: the network O = F(W, X) adapts its weights W from the inputs X alone, with no teacher-supplied desired response.

Unsupervised learning algorithms, fig. (10), use patterns that are typically redundant raw data having no labels regarding their class membership or association. In this mode of learning, the network must discover for itself any possibly existing patterns, regularities, separating properties, etc. While discovering these, the network undergoes a change of its parameters, which is called self-organization. The technique of unsupervised learning is often used to perform clustering, the unsupervised classification of objects without providing information about the actual classes. This kind of learning corresponds to minimal a priori information being available. Some information about the number of clusters, or the similarity versus dissimilarity of patterns, can be helpful for this mode of learning. Unsupervised learning is sometimes called learning without a teacher. This terminology is not the most appropriate, because learning without a teacher is not possible at all: although the teacher does not have to be involved in every training step, the teacher has to set goals even in an unsupervised learning mode. We may think of the following analogy. Learning with supervision corresponds to classroom learning, with the teacher's questions answered by students and corrected, if needed, by the teacher. Learning without supervision corresponds to learning the subject from a videotape lecture covering the material but not including any other teacher involvement. Therefore, the student cannot get explanations of unclear questions, check answers, or become fully informed.

6.1. The general algorithm of learning


Neural networks come in different types, and every type has its own learning rule. All the methods of learning, however, follow the same general algorithm: the network parameters are changed according to the learning rule so as to accommodate the network's characteristics to the desired pattern. In general, for neuron i with inputs j = 1, ..., n, the weight vector w_i = [w_{i1} w_{i2} ... w_{in}]^t increases in proportion to the product of the input x and the learning signal r. The learning signal r is in general a function of w_i, x, and sometimes of the teacher's signal d_i. We thus have for the network shown in fig. (11):

r = f(w_i, x, d_i)                                      (16)

The increment of the weight vector w_i produced by the learning step at time t according to the general learning rule is

∆w_i(t) = c r[w_i(t), x(t), d_i(t)] x(t)                (17)

where c is a positive number called the learning constant that determines the rate of learning. The weight vector adapted at time t becomes, at the next instant or learning step,

w_i(t + 1) = w_i(t) + c r[w_i(t), x(t), d_i(t)] x(t)    (18)

The superscript convention will be used in this text to index the discrete-time training steps, as in eq. (18). For the k'th step we thus have from (18), using this convention,

w_i^{k+1} = w_i^k + c r(w_i^k, x^k, d_i^k) x^k          (19)

The learning in (18) and (19) assumes the form of a sequence of discrete-time weight modifications. Continuous-time learning can be expressed as

dw_i(t)/dt = c r x(t)                                   (20)

[Figure] Fig. (11) Neural network learning algorithm: the inputs x_1, ..., x_n are weighted by w_1, ..., w_n to form net; the neuron computes o = f(net) and f'(net), which, together with the desired response d, produce the learning signal r and the weight increment ∆w.
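
To make the general rule concrete, the following minimal Python sketch applies the discrete-time update of eq. (19) to a single neuron. The function name, the learning constant c = 0.1, and the epoch count are hypothetical choices for illustration; the learning signal r = f(w_i, x, d_i) is passed in as a function, since each rule discussed below defines its own.

    import numpy as np

    def train(x_samples, d_samples, learning_signal, c=0.1, epochs=10):
        """General learning rule: w_i^{k+1} = w_i^k + c * r * x^k (eq. 19)."""
        w = np.zeros(x_samples.shape[1])      # initial weight vector w_i
        for _ in range(epochs):
            for x, d in zip(x_samples, d_samples):
                r = learning_signal(w, x, d)  # scalar learning signal r
                w = w + c * r * x             # weight increment: c * r * x
        return w

For example, passing learning_signal = lambda w, x, d: d - w @ x reproduces the Widrow-Hoff rule of section 6.3.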

6.2. Delta learning rule

The delta learning rule is valid for continuous activation functions and applies in the supervised training mode. The learning signal for this rule is called delta and is defined as follows:

r = [d_i - f(w_i^t x)] f'(w_i^t x)                      (21)

The term f'(w_i^t x) is the derivative of the activation function f(net) computed for net = w_i^t x. The delta learning rule is illustrated in fig. (11). This learning rule can be readily derived from the condition of least square error between o_i and d_i, by calculating the gradient vector with respect to w_i of the square error defined as

E = 0.5 (d_i - o_i)^2                                   (22)

which is equivalent to

E = 0.5 [d_i - f(w_i^t x)]^2                            (23)

We obtain the error gradient vector

∇E = -(d_i - o_i) f'(w_i^t x) x                         (24)

The components of the gradient vector are

∂E/∂w_{ij} = -(d_i - o_i) f'(w_i^t x) x_j               (25)

Since the minimization of the error requires the weight changes to be in the negative gradient direction, we take

∆w_i = -c ∇E                                            (26)

where c is a positive constant. We then obtain from (24) and (26)

∆w_i = c (d_i - o_i) f'(net_i) x                        (27)

or, for the single weight, the adjustment becomes

∆w_{ij} = c (d_i - o_i) f'(net_i) x_j                   (28)

Note that the weight adjustment as in (28) is computed based on the minimization of the squared error. Considering the use of the general learning rule (17) and plugging in the learning signal as defined in (21), the weight adjustment becomes

∆w_i = c [d_i - f(w_i^t x)] f'(w_i^t x) x               (29)

which is identical to (27), since c has been assumed to be an arbitrary constant. The weights may be initialized at any values for this method of training. The delta rule was introduced only relatively recently for neural network training (McClelland and Rumelhart 1986). This rule parallels the discrete perceptron training rule and can also be called the continuous perceptron training rule. The delta learning rule can be generalized for multilayer networks.
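
A minimal sketch of a single delta-rule step, eq. (28), assuming a unipolar sigmoid activation; the initial weights, input pattern, desired response, and learning constant below are hypothetical values chosen for illustration.

    import numpy as np

    def sigmoid(net):
        return 1.0 / (1.0 + np.exp(-net))

    def delta_step(w, x, d, c=0.5):
        """One delta-rule update: w <- w + c * (d - o) * f'(net) * x (eq. 28)."""
        net = w @ x                  # net = w_i^t x
        o = sigmoid(net)             # actual output o_i = f(net)
        f_prime = o * (1.0 - o)      # sigmoid derivative f'(net) = o(1 - o)
        return w + c * (d - o) * f_prime * x

    w = np.array([0.1, -0.2, 0.05])  # arbitrary initial weights
    x = np.array([1.0, 0.5, -1.0])   # input pattern
    w = delta_step(w, x, d=1.0)      # desired response d_i = 1

Repeating delta_step over a training set drives the squared error of eq. (23) toward a local minimum.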

6.3. Widrow-Hoff learning rule

The Widrow-Hoff learning rule (Widrow 1962) is applicable to the supervised training of neural networks. It is independent of the activation function of the neurons used, since it minimizes the squared error between the desired output value d_i and the neuron's activation value net_i = w_i^t x. The learning signal for this rule is defined as follows:

r = d_i - w_i^t x                                       (30)

The weight vector increment under this learning rule is

∆w_i = c (d_i - w_i^t x) x                              (31)

or, for the single weight, the adjustment is

∆w_{ij} = c (d_i - w_i^t x) x_j,  for j = 1, 2, ..., n  (32)

This rule can be considered a special case of the delta learning rule. Indeed, assuming that f(net) = net, we obtain f'(net) = 1. This rule is sometimes called the LMS (least mean square) learning rule. Weights may be initialized at any values in this method.
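
Because the Widrow-Hoff rule is the delta rule with f(net) = net, a single update step reduces to the LMS form of eq. (31). The weights, pattern, and learning constant below are again hypothetical values for illustration.

    import numpy as np

    def widrow_hoff_step(w, x, d, c=0.01):
        """One LMS update: w <- w + c * (d - w^t x) * x (eq. 31)."""
        r = d - w @ x                # learning signal r = d_i - w_i^t x (eq. 30)
        return w + c * r * x

    w = np.zeros(3)                       # weights may be initialized at any values
    x = np.array([1.0, 2.0, -0.5])        # input pattern
    w = widrow_hoff_step(w, x, d=0.8)     # desired output d_i = 0.8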
