Unit - II: Recurrent Neural Network
Auto Associative Memory
• In an auto-associative network, the weights on the diagonal can be set to zero.
• Retrieves the previously stored pattern that most closely resembles the current
pattern.
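The idea above can be sketched in NumPy. This is a hedged illustration (bipolar patterns and the Hebb outer-product rule are assumptions, with the diagonal zeroed as noted above), not a prescribed implementation:

```python
import numpy as np

# Sketch of an auto-associative memory trained by the Hebb
# (outer-product) rule; diagonal weights are set to zero.
def train_auto(patterns):
    n = patterns.shape[1]
    W = np.zeros((n, n))
    for s in patterns:                 # accumulate outer products
        W += np.outer(s, s)
    np.fill_diagonal(W, 0)             # no self-connections
    return W

def recall(W, x):
    # One synchronous pass: sign of the net input (bipolar activations)
    return np.where(W @ x >= 0, 1, -1)

stored = np.array([[1, 1, -1, -1]])
W = train_auto(stored)
noisy = np.array([1, -1, -1, -1])      # one bit flipped
print(recall(W, noisy))                # recovers the stored pattern
```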
Testing Algorithm
Hetero Associative Memory
• In case of a hetero-associative neural net, the training input and the
target output vectors are different.
• The weights are determined in a way that the net can store a set of pattern
associations.
• The retrieved pattern is generally different from the input pattern (not only in
content but also in type and format).
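A minimal sketch of the hetero-associative idea, assuming bipolar pairs and the Hebb rule (the function names are illustrative); note that the target has a different size and format from the input, as described above:

```python
import numpy as np

# Sketch of a hetero-associative net: weights store input->target
# pairs via the Hebb rule, W = sum over pairs of outer(s, t).
def train_hetero(S, T):
    return S.T @ T                      # (n_inputs x n_outputs) weights

def recall(W, x):
    return np.where(x @ W >= 0, 1, -1)  # bipolar output activation

S = np.array([[1, -1, 1, -1]])          # input pattern
T = np.array([[1, -1]])                 # different (smaller) target pattern
W = train_hetero(S, T)
print(recall(W, S[0]))                  # -> the associated target
```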
• Step 2: Set the activation for the input layer units equal to that of the current input vector given, xi.
Hopfield neural network
• Hopfield neural network was invented by Dr. John J. Hopfield in 1982.
• The output of each neuron is fed as input to the other neurons, but not back to
the neuron itself.
• If the network is stable, the energy function decreases whenever the state of any node
changes.
Continuous Hopfield Network
A discrete Hopfield net can be modified into a continuous model, in which time is assumed to be a
continuous variable. It can be used for associative memory problems or for optimization problems such as
the traveling salesman problem. The nodes of this network have a continuous, graded output rather than a
two-state binary output, so the energy of the network decreases continuously with time. The
continuous Hopfield network can be realized as an electronic circuit that uses non-linear amplifiers
and resistors, which helps in building the Hopfield network using analog VLSI technology.
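The graded dynamics can be sketched numerically rather than as a circuit. This is a rough illustration assuming the common form du/dt = -u + W.g(u) with g = tanh and simple Euler integration (the weight choice and step size are assumptions):

```python
import numpy as np

# Sketch of continuous Hopfield dynamics integrated with Euler steps:
# du/dt = -u + W @ g(u), with graded output g(u) = tanh(u).
rng = np.random.default_rng(0)
s = np.array([1.0, 1.0, -1.0, -1.0])         # stored pattern
W = np.outer(s, s)
np.fill_diagonal(W, 0)                        # symmetric, zero diagonal

u = rng.normal(size=4) * 0.1                  # small random internal states
dt = 0.05
for _ in range(200):
    v = np.tanh(u)                            # continuous, graded output
    u += dt * (-u + W @ v)                    # Euler update

print(np.sign(np.tanh(u)))                    # settles at the stored pattern
                                              # or its negative
```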
Hardware Model of Continuous Hopfield Network
Hopfield Network - Applications
• Pattern restoration
• Pattern Completion
• Pattern Generalization
• Pattern association
This network acts like a CAM (content-addressable memory): it is capable of recalling a pattern from
the stored memory even if a noisy or partial form of it is given to the model.
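The CAM behaviour and the energy-decrease property can be demonstrated together. A small sketch (bipolar pattern, Hebbian weights and asynchronous updates are assumptions, with energy E = -1/2 x^T W x):

```python
import numpy as np

# Sketch of discrete Hopfield recall acting as a CAM: a noisy cue is
# driven back to the stored pattern, and the energy never increases
# when a single node changes state.
s = np.array([1, -1, 1, -1, 1, -1])           # stored pattern
W = np.outer(s, s).astype(float)
np.fill_diagonal(W, 0)

def energy(x):
    return -0.5 * x @ W @ x

x = s.copy(); x[0] = -x[0]; x[3] = -x[3]      # noisy cue: 2 bits flipped
for i in range(len(x)):                       # one asynchronous sweep
    e_before = energy(x)
    x[i] = 1 if W[i] @ x >= 0 else -1         # update node i
    assert energy(x) <= e_before              # energy is non-increasing
print(np.array_equal(x, s))                   # True: stored pattern recalled
```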
Boltzmann Machine
• When the simulated annealing process is applied to the discrete Hopfield network, it
becomes a Boltzmann machine. When the Boltzmann machine is applied to a constrained
optimization problem, the weights represent the constraints of the problem and the
quantity to be optimized.
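The key change from the deterministic Hopfield update is that a unit turns on only probabilistically, with the probability sharpening as the temperature is lowered. A minimal sketch (bipolar states, the logistic acceptance rule, and the annealing schedule are assumptions for illustration):

```python
import math
import random

# Sketch of the Boltzmann machine's stochastic unit update:
# unit i turns on with probability p = 1 / (1 + exp(-net_i / T));
# lowering T (simulated annealing) makes it nearly deterministic.
def update_unit(W, x, i, T, rng):
    net = sum(W[i][j] * x[j] for j in range(len(x)) if j != i)
    p_on = 1.0 / (1.0 + math.exp(-net / T))
    x[i] = 1 if rng.random() < p_on else -1   # bipolar states assumed

rng = random.Random(0)
W = [[0, 2], [2, 0]]                          # symmetric, zero diagonal
x = [1, -1]
for T in (10.0, 1.0, 0.1):                    # annealing schedule
    for i in range(len(x)):
        update_unit(W, x, i, T, rng)
print(x)                                      # at low T the states tend
                                              # to agree (positive weight)
```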
Counter propagation network
• The applications of a counter propagation net are data compression, function approximation and pattern
association.
• This model is a three-layer neural network that performs input-output data mapping, producing an output
vector y in response to an input vector x, on the basis of competitive learning.
• The three layers in an instar-outstar model are the input layer, the hidden (competitive) layer and the output layer.
• There are two stages involved in the training process of a counter propagation net. The input vectors are clustered
in the first stage. In the second stage of training, the weights from the cluster-layer units to the output units are
tuned to obtain the desired response. There are two types of counter propagation net: full CPN and forward-only CPN.
• The full CPN works best if the inverse function exists and is continuous.
• The vectors x and y propagate through the network in a counterflow manner to yield the output vectors,
in effect constructing a look-up table.
•For each node in the input layer there is an input value xi.
•All the instars are grouped into a layer called the competitive layer. Each of the instars responds maximally to a group of
input vectors in a different region of space.
•An outstar model has all the nodes in the output layer and a single node in the competitive layer. The
outstar looks like the fan-out of a node.
Full CPN – Training Algorithm
• Step 0: Set the weights and the initial learning rate.
• Step 4: Find the winning cluster unit. If the dot product method is used: zinj = ∑xi.vij + ∑yk.wkj for j = 1 to p.
If the Euclidean distance method is used, find the cluster unit zJ whose squared distance from the input vectors is the smallest: Dj = ∑(xi - vij)^2 + ∑(yk - wkj)^2
By the winner-take-all method, if there occurs a tie in the selection of the winner unit, the unit with the smallest index is the winner. Take the winner unit index as J.
• Step 5: Update the weights for the winning cluster unit zJ.
• Step 8: Perform steps 9 to 15 when the stopping condition is false for phase II training.
• Step 9: Perform step 10 to 13 for each training input vector pair x:y. Here α and β are small constant values.
• Step 10: Make the X-input layer activations to vector x. Make the Y-input layer activations to vector y.
• Step 11: Find the winning cluster unit (Using the formula from step 4). Take the winner unit index as J.
Full CPN – Testing Algorithm
• Step 2: Set X-input layer activations to vector X. Set Y-input layer activations to vector
Y.
• Step 3: Find the cluster unit ZJ that is closest to the input pair.
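Phase I of the full-CPN training above can be sketched in NumPy. This is a hedged sketch: the array names v and w, the single learning rate alpha, and the winner update toward x and y are assumptions matching steps 4 and 5:

```python
import numpy as np

# Sketch of phase-I full-CPN training: the winner minimizes
# D_j = sum_i (x_i - v_ij)^2 + sum_k (y_k - w_kj)^2, then its
# weights move toward the presented pair (x, y).
def phase1_step(v, w, x, y, alpha):
    D = ((x[:, None] - v) ** 2).sum(0) + ((y[:, None] - w) ** 2).sum(0)
    J = int(np.argmin(D))              # ties broken by smallest index
    v[:, J] += alpha * (x - v[:, J])   # move winner's weights toward x
    w[:, J] += alpha * (y - w[:, J])   # ...and toward y
    return J

rng = np.random.default_rng(0)
v = rng.random((4, 3))                 # 3 cluster units, 4 x-inputs
w = rng.random((2, 3))                 # 2 y-inputs
x = np.array([1.0, 0.0, 1.0, 0.0])
y = np.array([1.0, 0.0])
for _ in range(50):
    J = phase1_step(v, w, x, y, alpha=0.5)
print(np.round(v[:, J], 2))            # winner's weights converge to x
```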
Forward only CPN – Training Algorithm
• Forward-only CPN uses only the x vector to form the clusters on the
Kohonen units during phase I training.
• Step 1: Perform step 2 to 7 when stopping condition for phase I training is false.
• Step 4: Compute the winning cluster unit J. If the dot product method is used, find the cluster unit zJ
with the largest net input: yj = ∑xi.vij
If Euclidean distance is used, find the cluster unit zJ the square of whose distance from the input pattern is
smallest: Dj = ∑(xi - vij)^2
If there exists a tie in the selection of the winner unit, the unit with the smallest index is chosen as the
winner.
• Step 5: Perform weight updation for unit zJ. For i=1 to n,
viJ(new)=viJ(old) + α[xi-viJ(old)]
• Step 6: Reduce learning rate α: α (t+1)=0.5α(t)
• Step 7: Test the stopping condition for phase I training.
• Step 8: Perform steps 9 to 15 when the stopping condition for phase II training is false.
• Step 9: Perform step 10 to 13 for each training input pair x:y.
• Step 10: Set X-input layer activations to vector X. Set Y-output layer activation to vector Y.
• Step 11: Find the winning cluster unit J.
• Step 12: Update the weights into unit zJ. For i=1 to n, viJ(new)=viJ(old) + α[xi-viJ(old)]
• Step 13: Update the weights from unit zJ to the output units.
For k=1 to m, wJk(new)=wJk(old) + β[yk-wJk(old)]
• Step 14: Reduce learning rate β, β(t+1)=0.5β(t)
• Step 15: Test the stopping condition for phase II training.
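The two phases above can be sketched end to end. This is a hedged NumPy sketch (function and variable names are illustrative; the winner updates follow steps 12 and 13, and for brevity both phases run inside one loop):

```python
import numpy as np

# Sketch of forward-only CPN training: phase I clusters only the
# x vectors; cluster->output weights w are tuned toward the targets y.
def train_forward_cpn(X, Y, n_clusters, alpha=0.5, beta=0.5,
                      epochs=50, seed=0):
    rng = np.random.default_rng(seed)
    v = rng.random((X.shape[1], n_clusters))   # input->cluster weights
    w = np.zeros((n_clusters, Y.shape[1]))     # cluster->output weights
    for _ in range(epochs):
        for x, y in zip(X, Y):
            J = int(np.argmin(((x[:, None] - v) ** 2).sum(0)))  # winner
            v[:, J] += alpha * (x - v[:, J])   # step 12: move toward x
            w[J] += beta * (y - w[J])          # step 13: move toward y
    return v, w

X = np.array([[0.0, 0.0], [1.0, 1.0]])
Y = np.array([[0.0], [1.0]])
v, w = train_forward_cpn(X, Y, n_clusters=2)
J = int(np.argmin(((X[1][:, None] - v) ** 2).sum(0)))
print(w[J])                                    # close to the target for x=[1,1]
```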
ART (ADAPTIVE RESONANCE THEORY) NETWORK
• The adaptive resonance theory (ART) network is an unsupervised learning network developed by Stephen Grossberg
and Gail Carpenter in 1987. Adaptive resonance was developed to solve the problem of instability
that occurs in feed-forward systems.
Fundamental Architecture: To build an adaptive resonance theory or ART network, three groups of neurons are
used. These include
1. Input processing neurons (F1 layer): The input processing neurons consist of two portions: an input portion and
an interface portion. The input portion performs some processing based on the inputs it receives. The interface
portion of the F1 layer combines the input from the input portions of the F1 and F2 layers to compare the similarity
of the input signal with the weight vector for the cluster unit that has been selected as a unit for learning.
2. Clustering units (F2 layer).
3. Control mechanism (reset unit), which controls the degree of similarity of patterns placed on the same cluster unit.
There are two types of ART network:
1. ART1
2. ART2
ART1 is designed for clustering binary vectors, and ART2 is designed to accept
continuous-valued vectors.
Adaptive Resonance Theory
• Adaptive: capable of learning new patterns
• Unsupervised learning
• Feedback is present (data in the form of processing element output reflect back and ahead
among layers)
• Reset module – acts as a controlling mechanism. It controls the degree of similarity of patterns placed on the same cluster unit. There exist two sets
of weighted interconnection paths between the F1 and F2 layers.
• The ART1 network runs autonomously; that is, it does not require any external control signals and can run stably with an infinite stream of input
patterns.
• The ART1 network is trained using the fast learning method. This network performs well with perfect binary input patterns, but it is sensitive to noise in
the input data, which should be handled carefully.
Adaptive Resonance Theory1
ART1 Architecture:
1. Computational units
2. Supplemental units:
i). Gain control unit G1
ii). Gain control unit G2
iii). Reset control unit R (controls the degree of similarity of patterns placed on the same cluster).
Adaptive Resonance Theory1 -Architecture
Computational units
The F1 layer accepts the input and performs processing, then transfers the best match to the F2 layer. The F1 and
F2 layers are connected by two sets of weighted interconnections. In the competitive layer, the unit with the largest net input
becomes the candidate to learn the input pattern, and the rest are ignored. The reset unit decides whether or not the cluster
unit is allowed to learn, depending on the top-down weight vector.
A vigilance test is conducted to take this decision: if the degree of similarity is less than the vigilance parameter, the cluster
unit is not allowed to learn.
•The supplemental units, called gain control units, comprise G1 and G2 along with the reset unit.
•When any designated layer is on, the F1(b) layer receives input from G1, F1(a) or F2.
•Units in F1 and F2 receive signals from three sources; a unit fires only if it receives signals from at least two of the three, which is called the two-thirds rule.
Adaptive Resonance Theory1 -Architecture
Supplemental units
The F1(b) units should send a signal whenever they receive input from F1(a) and no F2 node is active. After
an F2 node has been chosen in competition, it is necessary that only the F1(b) units whose input signal and top-
down signal match remain active. This is performed by the two gains G1 and G2. Whenever an F2 unit is
on, the G1 unit is inhibited. When no F2 unit is on, each F1 interface unit receives a signal from the G1 unit. In
the same way, the G2 unit controls the firing of the F2 units, obeying the two-thirds rule. Vigilance matching is
controlled by the reset control unit R. R also receives inhibitory signals from the F1 interface units that are
on. If a sufficient number of interface units is on, then unit R may be prevented from firing.
ART 1 – Training Algorithm
• Step 0: Initialize the parameters: α>1 and 0<ρ<=1
Initialize the weights: 0<bij(0)<α/(α-1+n) and tij(0)=1
• Step 1: Perform steps 2 to 13 when stopping condition is false.
• Step 2: Perform steps 3 to 12 for each of the training input.
• Step 3: Set activation of all F2 units to zero. Set the activation of F1(a) units to input vectors.
• Step 4: Calculate the norm of s: ||s||= ∑si
• Step 5: Send input signal from F1(a) layer to F1(b) layer: xi=si
• Step 6: For each F2 node that is not inhibited (yj ≠ -1), calculate:
yj = ∑bij.xi
• Step 7: Perform step 8 to 11 when reset is true.
• Step 8: Find J such that yJ >= yj for all nodes j. If yJ = -1, then all the nodes are inhibited and this pattern cannot be clustered.
• Step 9: Recalculate activation X of F1(b): xi = si.tJi
• Step 10: Calculate the norm of vector x: ||x||=∑xi
• Step 11: Test for reset condition.
If ||x||/||s|| < ρ, then inhibit node J (yJ = -1) and go to step 7 again.
Else if ||x||/||s|| >= ρ, then proceed to the next step.
• Step 12: Perform weight updation for node J:
biJ(new) = αxi / (α - 1 + ||x||)
tJi(new)=xi
• Step 13: Test for stopping condition. The following may be stopping conditions:
i).no change in weights
ii).no reset of units
iii).maximum number of epochs reached.
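Steps 0-13 above can be sketched in NumPy. This is a hedged sketch: the initial bottom-up weights are chosen inside the bound of step 0, the fast-learning updates follow step 12, and ties are broken by the smallest index; cluster labels returned here are an illustrative convenience:

```python
import numpy as np

# Sketch of ART1 fast learning (binary inputs; alpha > 1, vigilance rho).
def art1(inputs, n_clusters, alpha=2.0, rho=0.5):
    n = inputs.shape[1]
    b = np.full((n, n_clusters),
                0.5 * alpha / (alpha - 1 + n))   # within (0, alpha/(alpha-1+n))
    t = np.ones((n_clusters, n))                 # top-down weights, tij(0)=1
    labels = []
    for s in inputs:
        inhibited = np.zeros(n_clusters, dtype=bool)
        while True:
            y = np.where(inhibited, -1.0, s @ b)   # step 6
            J = int(np.argmax(y))                  # step 8 (smallest index wins ties)
            if y[J] == -1:                         # all nodes inhibited
                J = None
                break
            x = s * t[J]                           # step 9
            if x.sum() / s.sum() >= rho:           # step 11: vigilance test
                b[:, J] = alpha * x / (alpha - 1 + x.sum())  # step 12
                t[J] = x
                break
            inhibited[J] = True                    # reset: inhibit J, retry
        labels.append(J)
    return labels

data = np.array([[1, 1, 0, 0], [1, 1, 0, 0], [0, 0, 1, 1]])
print(art1(data, n_clusters=3, rho=0.7))   # first two patterns share a cluster
```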
Adaptive Resonance Theory2
• ART2 is for continuous-valued input vectors.
• The complexity of the ART2 network is higher than that of ART1 because much processing is needed in the F1
layer.
• ART2 network was designed to self-organize recognition categories for analog as well as binary input
sequences. The continuous-valued inputs presented to the ART2 network may be of two forms-the first
form is a “noisy binary” signal form and the second form of data is “truly continuous”.
Adaptive Resonance Theory2
• The major difference between the ART1 and ART2 networks is the input layer, which in ART2 is split into sublayers:
top layer: where inputs coming from the output layer are read in;
middle layer: where the top and bottom patterns are combined to form a matched pattern.
•In the ART2 architecture the F1 layer consists of six types of units (W, X, U, V, P and Q), and there are n units of each type. The
supplemental unit "N" between units W and X receives signals from all "W" units, computes the norm of the vector w, and
sends this signal to each of the X units. Similarly, there exist supplemental units between U and V, and between P and Q, performing
the same operation as that between W and X. The connections between the Pi units of the F1 layer and the Yj units of the F2 layer are the
weighted interconnections, which multiply the signals transmitted over those paths.
•The operations performed in the F2 layer are the same for both ART1 and ART2.
Adaptive Resonance Theory2 - Algorithm
• Step 0: Initialize the parameters: a, b, c, d, e, α, ρ, θ. Also specify the number of epochs of training (nep) and the number of
learning iterations (nit).
• Step 1: Perform step 2 to 12 (nep) times.
• Step 2: Perform steps 3 to 11 for each input vector s.
• Step 3: Update F1 unit activations: ui=0; wi=si; pi=0; qi=0; vi=f(xi); xi = si / (e + ||s||)
Update F1 unit activations again: ui = vi / (e + ||v||); wi = si + a.ui; pi = ui; xi = wi / (e + ||w||);
qi = pi / (e + ||p||); vi = f(xi) + b.f(qi)
In ART2 networks, norms are calculated as the square root of the sum of the squares of the respective values.
• Step 4: Calculate signals to F2 units: yj=∑bij.Pi
• Step 5: Perform steps 6 and 7 when reset is true.
• Step 6: Find the F2 unit YJ with the largest signal (J is defined such that yJ >= yj for j = 1 to m).
Adaptive Resonance Theory2 - Algorithm
• Step 7: Check for reset:
ui = vi / (e + ||v||); pi = ui + d.tJi; ri = (ui + c.pi) / (e + ||u|| + c.||p||)
If ||r|| < (ρ - e), then yJ = -1 (inhibit J). Reset is true; perform step 5.
If ||r|| >= (ρ - e), then wi = si + a.ui; xi = wi / (e + ||w||); qi = pi / (e + ||p||); vi = f(xi) + b.f(qi)
Reset is false. Proceed to step 8.
• Step 8: Perform steps 9 to 11 for specified number of learning iterations.
• Step 9: Update the weights for the winning unit J: tJi(new) = α.d.ui + [1 + α.d(d-1)].tJi(old)
biJ(new) = α.d.ui + [1 + α.d(d-1)].biJ(old)
• Step 10: Update F1 activations: ui = vi / (e + ||v||); wi = si + a.ui;
pi = ui + d.tJi; xi = wi / (e + ||w||); qi = pi / (e + ||p||); vi = f(xi) + b.f(qi)
• Step 11: Check for the stopping condition of weight updation.
• Step 12: Check for the stopping condition for number of epochs.