Learning Rules in ANN

DEEP LEARNING

Learning rules in ANN

A learning rule is a method or a mathematical logic that helps a neural network
learn from existing conditions and improve its performance. Learning rules update
the weights and bias levels of a network as the network is simulated in a specific
data environment. Applying a learning rule is an iterative process. The different
learning rules in neural networks are:
1. Hebbian learning rule – Identifies how to modify the weights of the nodes of a
network.
2. Perceptron learning rule – The network starts its learning by assigning a random
value to each weight.
3. Delta learning rule – The modification in the synaptic weight of a node is equal to
the multiplication of the error and the input.
4. Correlation learning rule – The correlation rule is a supervised learning rule.
5. Outstar learning rule – Used when we assume that the nodes or neurons in a
network are arranged in a layer.
Hebbian Learning Rule

The Hebbian rule was the first learning rule. Donald Hebb developed it in 1949 as a
learning algorithm for unsupervised neural networks. We can use it to identify how to
improve the weights of the nodes of a network. The Hebb learning rule assumes that if two
neighboring neurons activate and deactivate at the same time, then the weight connecting
these neurons should increase. At the start, the values of all weights are set to zero. This
learning rule can be used with both soft- and hard-limiting activation functions. Since the
desired responses of the neurons are not used in the learning procedure, this is an
unsupervised learning rule. In mathematical form, the update is dwi = η * xi * y, where η is
the learning rate, xi is the input and y is the output of the neuron. The absolute values of
the weights are usually proportional to the learning time, which is undesirable.
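As an illustration, here is a minimal sketch of the Hebbian update in Python (NumPy). The hard bipolar activation, the learning rate and the input pattern are illustrative assumptions, not part of the original rule statement:

```python
import numpy as np

def hebbian_update(w, x, eta=1.0):
    """One Hebbian step: strengthen the weights in proportion to the
    product of input and output activity (dw_i = eta * y * x_i)."""
    y = 1.0 if np.dot(w, x) >= 0 else -1.0  # hard bipolar activation (assumed)
    return w + eta * y * x

w = np.zeros(3)                  # weights start at zero, as the text states
x = np.array([1.0, -1.0, 1.0])   # one repeatedly presented input pattern
for _ in range(3):
    w = hebbian_update(w, x)
# After 3 presentations w is [3, -3, 3]: the weight magnitudes grow with
# learning time, as noted above.
```

Note that no desired output appears anywhere in the update, which is what makes the rule unsupervised.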
Perceptron Learning Rule

Each connection in a neural network has an associated weight, which changes in the course of
learning. The perceptron rule is an example of supervised learning: the network starts its learning by
assigning a random value to each weight. It then calculates the output value on the basis of a set of
records for which the expected output value is known. This set is the learning sample. The network
compares the calculated output value with the expected value and computes an error function ∈,
which can be the sum of the squares of the errors occurring for each individual in the learning sample:

∈ = Σi Σj (Eij − Oij)²

The first summation runs over the individuals of the learning set, and the second over the output
units. Eij and Oij are the expected and obtained values of the jth unit for the ith individual. The
network then adjusts the weights of the different units, checking each time whether the error function
has increased or decreased. As in a conventional regression, this amounts to solving a least-squares
problem. Since the desired outputs are supplied by the user, this is an example of supervised learning.
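The procedure above can be sketched for a single threshold unit. The AND-gate learning sample, the learning rate and the epoch limit below are illustrative assumptions:

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical learning sample: inputs with known expected outputs (AND gate).
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
t = np.array([0, 0, 0, 1], dtype=float)

w = rng.normal(size=2)   # learning starts by assigning random weights
b = rng.normal()
eta = 0.1

for epoch in range(1000):
    sq_error = 0.0
    for xi, ti in zip(X, t):
        y = 1.0 if xi @ w + b >= 0 else 0.0  # calculated output value
        e = ti - y                           # expected minus obtained value
        w += eta * e * xi                    # adjust the weights
        b += eta * e
        sq_error += e ** 2                   # error: sum of squared errors
    if sq_error == 0:                        # no error left on the sample
        break
```

Because the AND function is linearly separable, this loop reaches zero error on the learning sample and stops.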
Delta Learning Rule

• Developed by Widrow and Hoff, the delta rule is one of the most common learning rules. It is a supervised learning rule.
It states that the modification in the synaptic weight of a node is equal to the multiplication of the error and the input. In
mathematical form the delta rule is as follows:

dwij = r * ai * ej

• For a given input vector, the output vector is compared with the correct answer. If the difference is zero, no learning takes
place; otherwise, the network adjusts its weights to reduce this difference. The change in weight from ui to uj is dwij = r * ai * ej,
where r is the learning rate, ai is the activation of ui and ej is the difference between the expected output and the actual output
of uj. If the input patterns form a linearly independent set, the network can learn arbitrary associations using the delta rule. For
networks with linear activation functions and no hidden units, the graph of squared error versus the weights is a paraboloid in
n-space. A paraboloid of this kind is concave upward and therefore has a least value: its vertex is the point of minimum error,
and the weight vector corresponding to this point is the ideal weight vector. We can use the delta learning rule with both a
single output unit and several output units. When applying the delta rule we assume that the error can be directly measured.
The aim of applying the delta rule is to reduce the difference between the actual and expected output, that is, the error.
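A minimal sketch of the rule for a single linear output unit follows; the toy target function, the learning rate and the number of presentations are assumptions made for illustration:

```python
import numpy as np

def delta_update(w, x, target, r=0.05):
    """Delta rule for a linear unit: dw_ij = r * a_i * e_j."""
    y = w @ x          # actual output (linear activation, no hidden units)
    e = target - y     # error: expected output minus actual output
    return w + r * e * x

# Toy task: learn y = 2*x1 - x2. The squared-error surface is a
# paraboloid whose vertex is the ideal weight vector [2, -1].
rng = np.random.default_rng(1)
w = np.zeros(2)
for _ in range(2000):
    x = rng.uniform(-1.0, 1.0, size=2)
    w = delta_update(w, x, target=2 * x[0] - x[1])
```

Since the target is exactly linear in the inputs, repeated updates drive the weights to the vertex of the paraboloid, i.e. close to [2, -1].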
Correlation Learning Rule

• The correlation learning rule is based on a similar principle to the Hebbian learning rule. It assumes that weights between
simultaneously responding neurons should become more positive, and weights between neurons with opposite reactions
should become more negative. Contrary to the Hebbian rule, the correlation rule is a supervised learning rule: instead of the
actual response oj, the desired response dj is used for the weight-change calculation. In mathematical form the correlation
learning rule is as follows:

dwij = η * xi * dj

• where dj is the desired value of the output signal. This training algorithm usually starts with the initialization of the weights
to zero. Since the desired response is assigned by the user, the correlation learning rule is an example of supervised learning.
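A minimal sketch of one correlation update; the learning rate η and the input/desired-response pair below are illustrative assumptions:

```python
import numpy as np

def correlation_update(W, x, d, eta=0.5):
    """Correlation rule: dw_ij = eta * d_j * x_i -- Hebbian in form, but
    using the desired response d_j instead of the actual output."""
    return W + eta * np.outer(d, x)

W = np.zeros((2, 3))             # weights initialized to zero, as the text notes
x = np.array([1.0, 0.0, -1.0])   # input signals (illustrative)
d = np.array([1.0, -1.0])        # desired output signals (illustrative)
W = correlation_update(W, x, d)
```

The only difference from a Hebbian step is that `d` replaces the actual output, which is why the rule counts as supervised.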
Outstar Learning Rule

• We use the outstar learning rule when we assume that the nodes or neurons in a network are arranged in a layer. Here the
weights fanning out of a certain node should be equal to the desired outputs of the neurons connected through those
weights. The outstar rule produces the desired response t for the layer of n nodes. This type of learning is applied to all
nodes in a particular layer, and the weights are updated as in Kohonen neural networks. In mathematical form, the outstar
learning rule is as follows:

dwk = α * (dk − wk)

• This is a supervised training procedure, because the desired outputs must be known.
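Under this update form, repeated application drives the fan-out weights toward the desired response of the layer. The learning rate and the desired vector in this sketch are illustrative assumptions:

```python
import numpy as np

def outstar_update(w, d, alpha=0.2):
    """Outstar rule: move the weights fanning out to a layer toward the
    desired response d, Kohonen-style: dw_k = alpha * (d_k - w_k)."""
    return w + alpha * (d - w)

d = np.array([1.0, 0.0, -1.0])   # desired response of the n-node layer
w = np.zeros(3)
for _ in range(50):
    w = outstar_update(w, d)     # w converges toward d
```

Each step closes a fixed fraction of the remaining gap, so the weights approach the desired response geometrically.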


ADALINE
ADALINE (Adaptive Linear Neuron, later Adaptive Linear Element) is an early
single-layer artificial neural network and the name of the physical device that implemented
this network. The network uses memistors. It was developed by Professor Bernard Widrow
and his graduate student Ted Hoff at Stanford University in 1960, and it is based on the
McCulloch–Pitts neuron. It consists of weights, a bias and a summation function. The
difference between Adaline and the standard (McCulloch–Pitts) perceptron is that in the
learning phase, the weights are adjusted according to the weighted sum of the inputs (the
net). In the standard perceptron, the net is passed to the activation (transfer) function and
the function's output is used for adjusting the weights. Some important points about Adaline
are as follows:
• It uses a bipolar activation function.
• It uses the delta rule for training, to minimize the Mean Squared Error (MSE) between the
actual output and the desired/target output.
• The weights and the bias are adjustable.
Architecture of ADALINE network
• The basic structure of Adaline is similar to the perceptron, with an extra feedback loop through which the
actual output is compared with the desired/target output. After this comparison, the weights and bias are
updated on the basis of the training algorithm.
Training Algorithm of ADALINE
Step 1 − Initialize the following to start the training:
• Weights
• Bias
• Learning rate α
For easy calculation and simplicity, the weights and bias are set equal to 0 and the learning
rate is set equal to 1.
Step 2 − Continue steps 3–8 while the stopping condition is not true.
Step 3 − Continue steps 4–6 for every bipolar training pair s : t.
Step 4 − Activate each input unit as follows: xi = si (i = 1 to n)
Step 5 − Obtain the net input with the following relation:

yin = b + Σi xi * wi

Here 'b' is the bias and 'n' is the total number of input neurons.
Step 6 − Apply the bipolar activation function to obtain the final output:

y = +1 if yin ≥ 0, else y = −1

Step 7 − Adjust the weight and bias as follows:
Case 1 − if y ≠ t then wi(new) = wi(old) + α(t − yin)xi
b(new) = b(old) + α(t − yin)
Case 2 − if y = t then wi(new) = wi(old)
b(new) = b(old)
Here 'y' is the actual output and 't' is the desired/target output; (t − yin) is the computed error.
Step 8 − Test for the stopping condition, which is met when there is no change in the weights, or
when the largest weight change during an epoch is smaller than a specified tolerance.
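The steps above can be sketched in Python (NumPy). The bipolar AND training pairs are a hypothetical sample, and a learning rate of 0.1 is assumed here instead of 1 so that training settles on this sample:

```python
import numpy as np

# Step 1 - initialize weights, bias and learning rate.
# Hypothetical bipolar training pairs s : t (the AND function).
S = np.array([[1, 1], [1, -1], [-1, 1], [-1, -1]], dtype=float)
T = np.array([1, -1, -1, -1], dtype=float)
w = np.zeros(2)
b = 0.0
alpha = 0.1   # assumption: smaller than 1 for stability on this sample
tol = 1e-4

for epoch in range(100):              # Step 2: loop until stopping condition
    max_change = 0.0
    for s, t in zip(S, T):            # Step 3: every bipolar training pair
        x = s                         # Step 4: activate the input units
        y_in = b + x @ w              # Step 5: net input
        w_new = w + alpha * (t - y_in) * x   # Step 7: adjust the weights
        b_new = b + alpha * (t - y_in)       #         and the bias
        max_change = max(max_change, np.abs(w_new - w).max(), abs(b_new - b))
        w, b = w_new, b_new
    if max_change < tol:              # Step 8: largest change below tolerance
        break

# Step 6 (final output): y = +1 if y_in >= 0 else -1 for each pattern.
```

Note the update uses the net input yin, not the thresholded output y, which is exactly the Adaline/perceptron distinction described earlier.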
