
SOFT COMPUTING (SWE1011)

Module 3: UNSUPERVISED NETWORKS

Dr Chiranji Lal Chowdhary


VIT Vellore



SYLLABUS

• Self-organizing maps
• LVQ network
• ART network



UNSUPERVISED LEARNING NETWORKS

• Unsupervised learning is the second major learning paradigm
• The system (environment) does not provide any feedback to indicate the desired output
• The network itself has to discover any relationships of interest (features, patterns, correlations, classifications) in the input data
• It must then translate the discovered relationships into outputs
• These types of ANNs are also called self-organizing networks


UNSUPERVISED LEARNING NETWORKS CONTD…
• It can determine how similar a new input pattern is to typical
patterns already seen
• The network gradually learns what similarity is
• It can set up several axes along which to measure similarity to
previous patterns
• The axes can be any one of the following:
– Principal Component Analysis
• the idea of PCA is simple — reduce the number of variables of a data set, while preserving
as much information as possible.
– Clustering
• "Clustering" is the process of grouping similar entities together. The goal of this unsupervised machine learning technique is to find similarities in the data points and group similar data points together.
– Adaptive Vector Quantization
• The AVQ algorithm is an ANN algorithm that lets you choose how many training instances
to hang onto and learns exactly what those instances should look like.
– Feature Mapping
• Feature Mapping is one such process of representing features along with the relevancy of
these features on a graph. This ensures that the features are visualized and their
corresponding information is visually available.



UNSUPERVISED LEARNING NETWORKS
CONTD…
• The net has added structure by means of which it is forced to
make a decision
• If there are several neurons for firing then only one of them
is selected to fire (respond)
• The process for achieving this is called a competition

• An Example:
• Suppose we have a set of students
• Let us classify them on the basis of their performance
• The scores will be calculated



UNSUPERVISED LEARNING NETWORKS
CONTD…
• The one whose score is higher than all others will be the
winner
• The same principle is followed for pattern classification in
neural networks
• Here, there may be a tie
• Some principle is followed even when there is a tie
• These nets are called competitive nets
• The extreme form of these nets are called winner-take-all
• In such a case, only one neuron in the competing group will
possess a non-zero output signal at the end of the
competition
UNSUPERVISED LEARNING NETWORKS
CONTD…
• Several ANNs exist under this category
1.Maxnet

2.Mexican hat

3.Hamming net

4.Counter propagation net

5.Kohonen self-organizing feature map

6.Learning vector quantization (LVQ)

7.Adaptive Resonance Theory (ART)



UNSUPERVISED LEARNING NETWORKS
CONTD…
• In case of these ANNs the net seeks to find patterns or
regularity in the input data by forming clusters
• ARTs are called clustering nets
• In such nets there are as many input units as there are components in the input vector
• Since each output unit represents a cluster, the number of
output units will limit the number of clusters that can be
formed
• The learning algorithm used in most of these nets is known
as Kohonen learning



KOHONEN LEARNING

• The units update their weights by forming a new weight vector, which is a linear combination of the old weight vector and the new input vector
• Learning continues for the unit whose weight vector is closest to the input vector
• The weight updation formula for output cluster unit j is given by

      w_j(new) = w_j(old) + α [x − w_j(old)]

• x: input vector
• w_j: weight vector for unit j
• α: learning rate; its value decreases monotonically as training continues
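As a concrete illustration of this update rule, here is a minimal Python sketch (my own example, not from the slides):

```python
import numpy as np

def kohonen_update(w_old, x, alpha):
    """w_j(new) = w_j(old) + alpha*(x - w_j(old)): a convex combination of old weight and input."""
    return w_old + alpha * (x - w_old)

# example: the weight vector moves a fraction alpha of the way toward the input
w = np.array([0.3, 0.5])
x = np.array([0.2, 0.4])
print(kohonen_update(w, x, alpha=0.2))   # -> [0.28, 0.48]
```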



DECISION ON WINNERS

• There are two methods to determine the winner of the network during competition
• Method 1:
• The square of the Euclidean distance between the input
vector and the weight vector is computed
• The unit whose weight vector is at the smallest distance from
the input vector is chosen as the winner
• Method 2:
• The dot product of the input vector and the weight vector is
computed
• This dot product is nothing but the net inputs calculated for
the corresponding cluster units



DECISION ON WINNERS

• With the dot product method, the weight updation is performed on the unit with the largest dot product, because the largest dot product corresponds to the smallest angle between the input vector and the weight vector when both are of unit length
• We know that for two vectors a and b, the dot product is given by the formula

      a · b = |a| |b| cos θ

• If |a| = |b| = 1 then a · b = cos θ
• Also, as θ increases, a · b decreases
• Both methods can be used if the vectors are of unit length
• The first method is normally preferred
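The two winner-selection rules can be tried out with a short Python sketch (my own illustration; the weight matrix is the arbitrary one used in the worked SOM example later in this module):

```python
import numpy as np

# Weight matrix with one column per cluster unit (2 inputs, 5 units)
W = np.array([[0.3, 0.2, 0.1, 0.8, 0.4],
              [0.5, 0.6, 0.7, 0.9, 0.2]])
x = np.array([0.2, 0.4])                   # input vector

# Method 1: squared Euclidean distance, winner = unit with smallest D(j)
D = ((W - x[:, None]) ** 2).sum(axis=0)
print(D, int(np.argmin(D)))                # winner index 0 (unit Y1)

# Method 2: dot product, winner = unit with largest net input
net = W.T @ x
print(net, int(np.argmax(net)))            # winner index 3 here

# The two rules are guaranteed to agree only when the vectors are normalised
# to unit length, which is why the first method is normally preferred.
```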
KOHONEN SELF-ORGANISING FEATURE MAPS

• Feature mapping is a process which converts patterns of arbitrary dimensionality into a response of a one- or two-dimensional array of neurons
• It converts a wide pattern space into a typical feature space
• A feature map is a network performing such a map
• It maintains the neighbourhood relations of input patterns
• It has to obtain a topology (structure) preserving map
• For such feature maps it is required to find a self-organizing neural array which consists of neurons arranged in a one-dimensional or two-dimensional array



KOHONEN SELF-ORGANISING FEATURE MAPS
CONTD…
• Developed in 1982
• By Teuvo Kohonen, a professor emeritus of the Academy of
Finland
• SOMs learn on their own through unsupervised competitive
learning
• Called “Maps” because they attempt to map their weights to
conform to the given input data
• The nodes in a SOM network attempt to become like the
inputs presented to them
• Retaining the principal features of the input data is a fundamental principle of SOMs



KOHONEN SELF-ORGANISING FEATURE MAPS
CONTD…
• SOMs provide a way of representing multidimensional data
in a much lower dimensional space – typically one or two
dimensions
• This aids visualization, as humans are more proficient at comprehending data in lower dimensions than in higher dimensions
• SOMs are a valuable tool in dealing with complex or vast
amounts of data
• These are extremely useful for the visualization and
representation of complex or large quantities of data in a
manner that is most easily understood by the human brain



KOHONEN SELF-ORGANISING FEATURE MAPS
CONTD…

[Figure: a small SOM, a 4x4 grid of map nodes fully connected to the input nodes]


KOHONEN SELF-ORGANISING FEATURE MAPS
CONTD…
Few key things to notice:
• Each map node is connected to each input node
• For this small 4x4 node network, that is 4x4x3=48 connections
• Map nodes are not connected to each other
• Nodes are organized in this manner, as a 2-D grid makes it easy
to visualize the results
• Each map node has a unique (i, j) coordinate
• This makes it easy to reference a node in the network
• Also, it is easy to calculate the distances between nodes
• Because of the connections only to the input nodes, the map
nodes are oblivious (unaware) as to what values their
neighbours have
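To make the (i, j) coordinates and node-to-node distances concrete, here is a small Python sketch (my own illustration) for the 4x4 map mentioned above:

```python
import math

rows, cols = 4, 4                        # the small 4x4 map from the slide
nodes = [(i, j) for i in range(rows) for j in range(cols)]

def grid_distance(a, b):
    """Euclidean distance between two map nodes given their (i, j) coordinates."""
    return math.hypot(a[0] - b[0], a[1] - b[1])

print(len(nodes))                        # 16 map nodes
print(grid_distance((0, 0), (3, 3)))     # distance between opposite corners, about 4.24
```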
KSO FEATURE MAPS: TRAINING ALGORITHM

• The steps involved in the training algorithm are:


• STEP 0: Initialize the weights wij (Random values are assumed)
– These can be chosen as the same as the components of the input
vector
– Set the topological neighbourhood parameters (radius of the
neighbourhood etc.)
– Initialize the learning rate
• STEP 1: Perform Steps 2 – 8 when stopping condition is false
• STEP 2: Perform Steps 3 – 5 for each input vector x
• STEP 3: Compute the square of the Euclidean distance for j =
1,2,…m.



KSO FEATURE MAPS: TRAINING ALGORITHM
• We have D(j) = Σ_i (x_i − w_ij)², the sum running over all input components i
• We can use the dot product method also
• STEP 4: Find the winning unit J, so that D(J) is minimum
• STEP 5: For all units j within a specific neighbourhood of J, and for all i, calculate the new weights as:

      w_ij(new) = w_ij(old) + α [x_i − w_ij(old)]

  or equivalently

      w_ij(new) = (1 − α) w_ij(old) + α x_i


KSO FEATURE MAPS: TRAINING ALGORITHM

• STEP 6: Update the learning rate α using the formula

      α(t + 1) = 0.5 α(t)

  (the value of α decreases as training proceeds)
• STEP 7: Reduce the radius of the topological neighbourhood at specified times
• STEP 8: Test for the stopping condition of the network
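The following Python sketch is a minimal, illustrative implementation of Steps 0-8 for a one-dimensional (linear) array of cluster units; the toy data, map size and stopping rule are my own assumptions, not part of the slides:

```python
import numpy as np

def train_som_1d(X, n_units=5, alpha=0.5, radius=1, epochs=10, seed=0):
    """Minimal 1-D Kohonen SOM following Steps 0-8 of the slide."""
    rng = np.random.default_rng(seed)
    n_features = X.shape[1]
    W = rng.random((n_units, n_features))            # Step 0: random weights

    for _ in range(epochs):                          # Step 1: stopping condition
        for x in X:                                  # Step 2: each input vector
            D = ((W - x) ** 2).sum(axis=1)           # Step 3: squared distances
            J = int(np.argmin(D))                    # Step 4: winning unit
            lo, hi = max(0, J - radius), min(n_units, J + radius + 1)
            W[lo:hi] += alpha * (x - W[lo:hi])       # Step 5: update neighbourhood
        alpha *= 0.5                                 # Step 6: decay learning rate
        radius = max(0, radius - 1)                  # Step 7: shrink neighbourhood
    return W

# toy usage with made-up 2-D inputs
X = np.array([[0.2, 0.4], [0.6, 0.6], [0.1, 0.9]])
print(train_som_1d(X))
```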



KSO FEATURE MAPS: ARCHITECTURE

• For a linear array of cluster units, the neighbourhoods of radii N_i(k_1), N_i(k_2), N_i(k_3), with k_1 > k_2 > k_3 (here k_3 = 0, k_2 = 1, k_1 = 3), are nested around the winning unit, marked [#]; the other units are designated by 'o':

      o  o  o  o  [#]  o  o  o  o

• N_i(k_3) contains only the winning unit, N_i(k_2) extends one unit on either side of it, and N_i(k_1) extends three units on either side
KSO FEATURE MAPS: ARCHITECTURE

• For a rectangular grid, the neighbourhoods are nested squares of decreasing radius centred on the winning unit (figure omitted)


KSO FEATURE MAPS: STEPS

• The self-organization process involves four major components:

• Initialization: All the connection weights are initialized with small random values

• Competition: For each input pattern, the neurons compute their respective values of a discriminant function (D), which provides the basis for competition. The particular neuron with the smallest value of the discriminant function is declared the winner.



KSO FEATURE MAPS: STEPS CONTD…

• Cooperation: The winning neuron determines the spatial location of a topological neighbourhood of excited neurons, thereby providing the basis for cooperation among neighbouring neurons.

• Adaptation: The excited neurons decrease their individual values of the discriminant function in relation to the input pattern through suitable adjustment of the associated connection weights, such that the response of the winning neuron to the subsequent application of a similar input pattern is enhanced.



EXAMPLE

• For a given Kohonen self-organising feature map with weights shown in the next slide:

• (a) Use the square of the Euclidean distance to find the cluster unit Y_j closest to the input vector (0.2, 0.4). Using a learning rate of 0.2, find the new weights for unit Y_j.

• (b) For the input vector (0.6, 0.6) with learning rate 0.1, find the winning cluster unit and its new weights.



EXAMPLE CONTD…

[Figure: a SOM with two input units X_1, X_2 fully connected to five cluster units Y_1 … Y_5; the connection weights are the entries of the matrix W on the next slide]
EXAMPLE CONTD…

(a) For the input vector (x_1, x_2) = (0.2, 0.4) and learning rate α = 0.2, the weight matrix W (chosen arbitrarily) is

      W = [ 0.3  0.2  0.1  0.8  0.4 ]    (2 x 5 dimension)
          [ 0.5  0.6  0.7  0.9  0.2 ]

Next, we find the winner unit by using the square of the Euclidean distance. The distance of the jth cluster unit's weight vector from the input vector is denoted by D(j) and is given by the formula

      D(j) = Σ_{i=1}^{2} (w_ij − x_i)² = (w_1j − x_1)² + (w_2j − x_2)²,   j = 1, 2, 3, 4, 5


EXAMPLE CONTD…

      D(1) = (0.3 − 0.2)² + (0.5 − 0.4)² = 0.01 + 0.01 = 0.02
      D(2) = (0.2 − 0.2)² + (0.6 − 0.4)² = 0.00 + 0.04 = 0.04
      D(3) = (0.1 − 0.2)² + (0.7 − 0.4)² = 0.01 + 0.09 = 0.10
      D(4) = (0.8 − 0.2)² + (0.9 − 0.4)² = 0.36 + 0.25 = 0.61
      D(5) = (0.4 − 0.2)² + (0.2 − 0.4)² = 0.04 + 0.04 = 0.08

• Since D(1) = 0.02 is the minimum value, the winner unit is J = 1.

• We now update the weights on the winner unit J = 1. The weight updation formula used is

      w_iJ(new) = w_iJ(old) + α [x_i − w_iJ(old)]


EXAMPLE CONTD…

• Since J = 1, the formula becomes

      w_i1(new) = w_i1(old) + α [x_i − w_i1(old)]

• Putting i = 1, 2 we get

      w_11(new) = w_11(old) + α [x_1 − w_11(old)] = 0.3 + 0.2 [0.2 − 0.3] = 0.28
      w_21(new) = w_21(old) + α [x_2 − w_21(old)] = 0.5 + 0.2 [0.4 − 0.5] = 0.48

• The updated matrix is now

      W = [ 0.28  0.2  0.1  0.8  0.4 ]
          [ 0.48  0.6  0.7  0.9  0.2 ]

• (b) For the input vector (x_1, x_2) = (0.6, 0.6) and α = 0.1, we take the same initial weight matrix as in part (a)
EXAMPLE CONTD…

• The square of the Euclidean distance formula is the same as in part (a)
• So, taking j = 1, …, 5, we get

      D(1) = (0.3 − 0.6)² + (0.5 − 0.6)² = 0.09 + 0.01 = 0.10
      D(2) = (0.2 − 0.6)² + (0.6 − 0.6)² = 0.16 + 0.00 = 0.16
      D(3) = (0.1 − 0.6)² + (0.7 − 0.6)² = 0.25 + 0.01 = 0.26
      D(4) = (0.8 − 0.6)² + (0.9 − 0.6)² = 0.04 + 0.09 = 0.13
      D(5) = (0.4 − 0.6)² + (0.2 − 0.6)² = 0.04 + 0.16 = 0.20

• Since D(1) = 0.10 is the minimum, the winner is unit J = 1


EXAMPLE CONTD…

• We update the weights for J = 1 with the same formula

      w_i1(new) = w_i1(old) + α [x_i − w_i1(old)]

• Taking i = 1, 2

      w_11(new) = w_11(old) + α [x_1 − w_11(old)] = 0.3 + 0.1 [0.6 − 0.3] = 0.33
      w_21(new) = w_21(old) + α [x_2 − w_21(old)] = 0.5 + 0.1 [0.6 − 0.5] = 0.51

• The new weight matrix is given by

      W = [ 0.33  0.2  0.1  0.8  0.4 ]
          [ 0.51  0.6  0.7  0.9  0.2 ]
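As a sanity check on parts (a) and (b), this small Python sketch (my own verification code, not part of the original example) reproduces the winner selection and weight updates:

```python
import numpy as np

W = np.array([[0.3, 0.2, 0.1, 0.8, 0.4],
              [0.5, 0.6, 0.7, 0.9, 0.2]])   # initial weights, one column per Y_j

def som_step(W, x, alpha):
    """One Kohonen update: find the winner by squared distance, move it toward x."""
    D = ((W - x[:, None]) ** 2).sum(axis=0)
    J = int(np.argmin(D))
    W = W.copy()
    W[:, J] += alpha * (x - W[:, J])
    return D, J, W

# part (a): x = (0.2, 0.4), alpha = 0.2 -> winner J = 0 (unit Y1), column becomes (0.28, 0.48)
print(som_step(W, np.array([0.2, 0.4]), 0.2))

# part (b): x = (0.6, 0.6), alpha = 0.1, starting again from the initial W
# -> winner J = 0 (unit Y1), column becomes (0.33, 0.51)
print(som_step(W, np.array([0.6, 0.6]), 0.1))
```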



LEARNING VECTOR QUANTIZATION
-LVQ



LVQ
• It is a process of classifying patterns, wherein each output unit represents a particular class
• The inputs are the components of a pattern, and the weight vector arriving at an output unit identifies the cluster represented by that unit. These weight vectors are called reference vectors for that output unit/cluster
• This is a special case of a competitive net
• It follows a supervised learning methodology
• During training, the output units are positioned so as to approximate the decision boundaries; that is, the final weight vectors obtained identify each particular output unit
• Example: Characteristics of animals like colour, height and size (weight vectors) determining that animal (output units or classes)
LVQ CONTD…

• An LVQ net classifies an input vector (given through its components) by assigning it to the same class as the output unit whose weight vector is closest (nearest) to the input vector
• It is a classifier paradigm that adjusts the boundaries between
categories to minimize existing misclassification
• It is used for:
• Optical character recognition
• Converting speech into phonemes



APPLICATIONS OF LVQ

• Optical character recognition (Optical character recognition, or an optical character reader, is the electronic or mechanical conversion of images of typed, handwritten or printed text into machine-encoded text, whether from a scanned document, a photo of a document, a scene photo or from subtitle text superimposed on an image)

• Converting speech into phonemes (A phoneme is a unit of sound that distinguishes one word from another in a particular language)



ARCHITECTURE OF LVQ

[Figure: n input units x_1 … x_n fully connected to m output units y_1 … y_m through weights w_ij]


LVQ ARCHITECTURE

• It has n input units and m output units. The weight from the ith input unit to the jth output unit is given by w_ij
• Each output unit is associated with a class/cluster/category
• x: training vector (x_1, x_2, …, x_n)
• T: category or class of the training vector x
• w_j: weight vector for the jth output unit (that is, the weights of the connections coming to it: w_1j, w_2j, …, w_ij, …, w_nj)
• c_j: cluster or class or category associated with the jth output unit
• The square of the Euclidean distance of the jth output unit from the input vector is

      D(j) = Σ_{i=1}^{n} (x_i − w_ij)²
TRAINING ALGORITHM
• STEP 0: Initialize the reference vectors. This can be done in one of the following ways:
• i. From the given set of training vectors, take the first "m" (number of clusters) training vectors and use them as weight vectors; the remaining vectors can then be used for training (like the centroids of clusters in a clustering algorithm, we place them in the output units)
• ii. Assign the initial weights and classifications (clusters) randomly (in practice, the first m vectors are often used for this, as mentioned above)
• iii. Use the K-means clustering method
• Set initial learning rate 𝛼
• STEP 1: Perform steps 2-6 if the stopping condition is false
• STEP 2: Perform steps 3-4 for each training input vector x



TRAINING ALGORITHM CONTD..
• STEP 3: Calculate the square of the Euclidean distance for j = 1 to m:

      D(j) = Σ_{i=1}^{n} (x_i − w_ij)²

• Find the winning unit index J, for which D(J) is minimum
• STEP 4: Update the weights on the winning unit w_J using the following conditions:
• If T = c_J then w_J(new) = w_J(old) + α [x − w_J(old)]
• If T ≠ c_J then w_J(new) = w_J(old) − α [x − w_J(old)]
• STEP 5: Reduce the learning rate α (normally we take α(t + 1) = 0.5 α(t))
• STEP 6: Test for the stopping condition of the training process.
(the stopping conditions may be fixed number of epochs or if
learning rate has reduced to a negligible value)
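A minimal Python sketch of Steps 0-6 follows (my own illustrative implementation; the initialization shortcut and stopping rule are simplified assumptions):

```python
import numpy as np

def train_lvq(X, T, n_classes, alpha=0.25, epochs=10):
    """Minimal LVQ training loop following Steps 0-6 of the slide.

    X: training vectors (N, n); T: integer class labels (N,).
    The first n_classes vectors are used as initial reference vectors (Step 0.i).
    """
    W = X[:n_classes].astype(float)              # reference vectors, one per class
    C = T[:n_classes].copy()                     # class attached to each output unit
    for _ in range(epochs):                      # Step 1
        for x, t in zip(X[n_classes:], T[n_classes:]):   # Step 2
            D = ((W - x) ** 2).sum(axis=1)       # Step 3: squared distances
            J = int(np.argmin(D))                # winning unit
            if t == C[J]:                        # Step 4: move toward / away
                W[J] += alpha * (x - W[J])
            else:
                W[J] -= alpha * (x - W[J])
        alpha *= 0.5                             # Step 5: decay learning rate
        if alpha < 1e-4:                         # Step 6: stopping condition
            break
    return W, C

# toy usage with made-up 2-D data and two classes
X = np.array([[0.1, 0.1], [0.9, 0.9], [0.2, 0.1], [0.8, 0.7], [0.15, 0.2]])
T = np.array([0, 1, 0, 1, 0])
print(train_lvq(X, T, n_classes=2))
```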
FLOW CHART

• Start
• Initialize the weight vectors and the learning rate α
• For each training input vector x:
   – Calculate the winner unit J, for which D(J) is minimum
   – Input the target class T
   – If T = c_J, update the weights using w_J(new) = w_J(old) + α [x − w_J(old)]
   – If T ≠ c_J, update the weights using w_J(new) = w_J(old) − α [x − w_J(old)]
• Reduce the learning rate: α(t + 1) = 0.5 α(t)
• If α has reduced to a negligible value, stop; otherwise continue with the next pass through the training vectors


AN EXAMPLE

• Consider an LVQ net with 2 input units x1, x2 and 4 output classes c1, c2, c3 and c4
• There exist 16 classification units, with weight vectors as indicated in the table below (rows: x2 values, columns: x1 values)

      x2 \ x1 |  0.0   0.2   0.4   0.6   0.8   1.0
      --------+------------------------------------
        0.0   |   -     -     -     -     -     -
        0.2   |   -    c1    c2    c3    c4     -
        0.4   |   -    c3    c4    c1    c2     -
        0.6   |   -    c1    c2    c3    c4     -
        0.8   |   -    c3    c4    c1    c2     -
        1.0   |   -     -     -     -     -     -


EXAMPLE CONTD…

• Use the square of the Euclidean distance to measure the changes occurring
• (a) Given an input vector of (0.25, 0.25) representing class 1, and using a learning rate of 0.25, show which classification unit moves, and where
• (b) Given the input vector of (0.4, 0.35) representing class 1, using the initial weight vectors and a learning rate of 0.25, note what happens



SOLUTION

• Let the input vector be (x1, x2). The output classes are c1, c2, c3 and c4
• The initial reference (weight) vectors for the different classes are as follows:
• Class 1:

      w1 = [ 0.2  0.2  0.6  0.6 ]    class c1,  t = 1
           [ 0.2  0.6  0.8  0.4 ]

• Class 2:

      w2 = [ 0.4  0.4  0.8  0.8 ]    class c2,  t = 2
           [ 0.2  0.6  0.8  0.4 ]

• Class 3:

      w3 = [ 0.2  0.2  0.6  0.6 ]    class c3,  t = 3
           [ 0.4  0.8  0.6  0.2 ]
SOLUTION CONTD…

• Class 4:

      w4 = [ 0.4  0.4  0.8  0.8 ]    class c4,  t = 4
           [ 0.4  0.8  0.6  0.2 ]

• Part (a):
• For the given input vector (x1, x2) = (0.25, 0.25) with α = 0.25 and t = 1, we compute the square of the Euclidean distance as follows:

      D(j) = (w_1j − x_1)² + (w_2j − x_2)²


SOLUTION CONTD…

• For j = 1 to 4:
• D(1) = 0.005, D(2) = 0.125, D(3) = 0.425 and D(4) = 0.145

• As D(1) is minimum, the winner index J = 1. Also t =1


• So, we use the formula

      w_J(new) = w_J(old) + α [x − w_J(old)]

• The updated weights on the winner unit are

      w_11(new) = w_11(old) + α [x_1 − w_11(old)] = 0.2 + 0.25 (0.25 − 0.2) = 0.2125
      w_21(new) = w_21(old) + α [x_2 − w_21(old)] = 0.2 + 0.25 (0.25 − 0.2) = 0.2125
SOLUTION CONTD…

• So, the new weight vector for class 1 is

      W1 = [ 0.2125  0.2  0.6  0.6 ]
           [ 0.2125  0.6  0.8  0.4 ]

• (b) For the given input vector (x1, x2) = (0.4, 0.35), α = 0.25 and t = 1, we calculate the squared Euclidean distance using

      D(j) = Σ_{i=1}^{2} (w_ij − x_i)² = (w_1j − x_1)² + (w_2j − x_2)²

• Then we have D(1) = 0.0625, D(2) = 0.1025, D(3) = 0.2425 and D(4) = 0.0425
• So, D(4) is minimum and hence the winner unit index is J = 4


SOLUTION CONTD…

• The fourth unit is the winner unit, i.e., the unit closest to the input vector
• Since t ≠ J, the weight updation formula to be used is:

      w_J(new) = w_J(old) − α [x − w_J(old)]

• Updating the weights on the winner unit, we obtain

      w_14(new) = w_14(old) − α [x_1 − w_14(old)] = 0.6 − 0.25 (0.4 − 0.6) = 0.65
      w_24(new) = w_24(old) − α [x_2 − w_24(old)] = 0.4 − 0.25 (0.35 − 0.4) = 0.4125


SOLUTION CONTD…

• Therefore the new weight vector is

      W1 = [ 0.2  0.2  0.6  0.65   ]
           [ 0.2  0.6  0.8  0.4125 ]

• (c) For the given input vector (x1, x2) = (0.4, 0.45), α = 0.25 and t = 1, we calculate the squared Euclidean distance using

      D(j) = Σ_{i=1}^{2} (w_ij − x_i)² = (w_1j − x_1)² + (w_2j − x_2)²

• For j = 1 to 4:
• D(1) = 0.1025, D(2) = 0.0625, D(3) = 0.1625, D(4) = 0.0425
• So, D(4) is minimum, and we have to update the weights of the 4th unit, J = 4. Given t = 1, we have t ≠ J. As a result we use the weight updation


SOLUTION CONTD…

• formula

      w_J(new) = w_J(old) − α [x − w_J(old)]

• To update the reference vector of the 4th unit:

      w_14(new) = w_14(old) − α [x_1 − w_14(old)] = 0.6 − 0.25 (0.4 − 0.6) = 0.65
      w_24(new) = w_24(old) − α [x_2 − w_24(old)] = 0.4 − 0.25 (0.45 − 0.4) = 0.3875

• So, the new weight vector is

      W1 = [ 0.2  0.2  0.6  0.65   ]
           [ 0.2  0.6  0.8  0.3875 ]



ART NETWORK



ART NETWORK

• ART stands for Adaptive Resonance Theory
• Designed by Carpenter and Grossberg in 1987
• It is designed for both binary inputs and analog-valued inputs
• The input patterns can be presented in any order
• This is an unsupervised learning model based on competition
• It finds the categories automatically and learns new categories if needed
• It was proposed to solve the problem of instability occurring in feedforward networks
• There are two versions of it: ART1 and ART2



ART NETWORK

• ART1 was developed for clustering binary vectors


• ART2 was developed to accept continuous valued vectors
• For each pattern presented to the network, an appropriate cluster unit is chosen and the weights of the cluster unit are adjusted to let the cluster unit learn the pattern
• The network controls the degree of similarity of the patterns placed on the same cluster unit
• During training:
• Each training pattern may be presented several times
• An input pattern should not be placed on different cluster units on different presentations (such oscillation would indicate instability)



THE ART NETWORK

• Stability may be achieved by reducing the learning rate
• The ability of the network to respond to a new pattern equally well at any stage of learning is called plasticity
• The ART networks are designed to possess both stability and plasticity
• However, it is difficult to handle stability and plasticity at the same time
• The ART networks are designed particularly to resolve the stability-plasticity dilemma
• This means they are stable enough to preserve significant past learning, but nevertheless remain adaptable enough to incorporate new information whenever it appears



FUNDAMENTAL ARCHITECTURE OF ART

• Three groups of neurons are used


• Input processing layer (F1 layer)
• Clustering units (F2 layer)
• Control mechanism (It controls degree of similarity of
patterns placed on the same cluster)

• The input processing layer (F1) consists of two portions: the input portion and the interface portion


FUNDAMENTAL ARCHITECTURE CONTD…

• The input portion may perform some processing based upon the inputs it receives
• This is mostly performed in ART2 compared to ART1
• The interface portion of the F1 layer combines the signals from the input portion and the F2 layer, for comparing the similarity of the input signal with the weight vector of the cluster unit that has been selected as the unit for learning
• The F1 layer input portion may be denoted as F1(a) and the interface portion as F1(b)
• There exist two sets of weighted interconnections for controlling the degree of similarity between the units in the interface portion and the cluster layer


ARCHITECTURE CONTD…

• The bottom-up weights are used for the connections from the F1(b) layer to the F2 layer and are represented by b_ij (ith F1 unit to the jth F2 unit)
• The top-down weights are used for the connections from the F2 layer to the F1(b) layer and are represented by t_ji (jth F2 unit to the ith F1 unit)
• The competitive layer in this case is the cluster layer, and the cluster unit with the largest net input is the candidate (victim) to learn the input pattern
• The activations of all other F2 units are made zero



ARCHITECTURE CONTD…

• The interface units combine the data from input and cluster
layer units
• On the basis of similarity between the top-down weight
vector and input vector, the cluster unit may be allowed to
learn the input pattern
• The decision is done by reset mechanism unit on the basis of
the signals it receives from interface portion and input portion
of the F1 layer
• When cluster unit is not allowed to learn, it is inhibited and a
new cluster unit is selected as the victim



FUNDAMENTAL OPERATING PRINCIPLE

• Presentation of one input pattern forms a learning trial


• The activations of all the units in the net are set to zero
before an input pattern is presented
• All the units in the F2 layer are inactive
• On presentation of a pattern, the input signals are sent
continuously until the learning trial is completed
• There exists a user-defined parameter called vigilance
parameter
• This parameter controls the degree of similarity of the
patterns assigned to the same cluster unit



FUNDAMENTAL OPERATING PRINCIPLE

• The function of the reset mechanism is to control the state of each node in the F2 layer
• The units in F2 layer can be in any one of the three states at
any instant of time
1. Active: Unit is ON. The activation in this case is equal to 1.
For ART1, d = 1 and for ART2, 0 < d < 1
2. Inactive: Unit is OFF. The activation here is zero and the unit
may be available to participate in competition.
3. Inhibited: Unit is OFF. The activation here is also zero but the
unit here is prevented from participating in any further
competition during the presentation of current input vector



FUNDAMENTAL OPERATING PRINCIPLE

• ART net can work in two ways:


• Fast Learning
• Slow Learning
• FAST LEARNING:
• Weight updation takes place rapidly relative to the length of
time a pattern is being presented on any particular learning
trial
• The weights reach equilibrium in each trial
• SLOW LEARNING:
• The weight changes occur slowly
• The weights do not reach equilibrium in each trial



FUNDAMENTAL OPERATING PRINCIPLE

• The patterns are binary in ART1
• The weights associated with each cluster unit stabilize in the fast learning mode
• In an ART2 network, the weights produced by fast learning continue to change each time a pattern is presented
• The net is found to stabilize only after a few presentations of each training pattern
• It is not easy to find equilibrium weights immediately for ART2
• Slow learning is not adopted in ART1
• In ART2 the weights produced by slow learning are far better than those produced by fast learning for particular types of data



FUNDAMENTAL ALGORITHM

• This algorithm discovers clusters of a set of pattern vectors


• STEP 0: Initialize the necessary parameters
• STEP 1: Perform Steps 2-9 when stopping condition is false
• STEP 2: Perform Steps 3-8 for each input vector
• STEP 3: F1 layer processing is done
• STEP 4: Perform Steps 5-7 when reset condition is true
• STEP 5: Find the victim unit to learn the current input pattern.
The victim unit is going to be the F2 unit with the largest
input
• STEP 6: F1(b) units combine their inputs from F1(a) and F2
• STEP 7: Test for reset condition



FUNDAMENTAL ALGORITHM

• If reset is true, then the current victim unit is rejected (inhibited); go back to Step 4
• If reset is false, then the current victim unit is accepted for learning; go to the next step (Step 8)
• STEP 8: Weight updation is performed
• STEP 9: Test for stopping condition
• Note: The ART network does not require that all training patterns be presented in the same order, or even that all patterns be presented in every cycle; when all patterns are presented in the same order each time, we refer to this as an epoch



ART1



ADAPTIVE RESONANCE THEORY (ART-1)

• Designed for binary inputs
• Input unit: F1, consisting of the F1(a) layer (input portion) and the F1(b) layer (interface portion)
• Output unit: F2
• Architecture of the ART1 network: It is made up of two units
• 1. Computational unit
• 2. Supplemental unit
• Computational Unit:
❖ Input units (F1 units): input portion and interface portion
❖ Cluster units (F2 units): output units
❖ Reset control unit (controls the degree of similarity of patterns placed in the same cluster)
BASIC ARCHITECTURE OF ART 1 (Computational unit)

[Figure: F1(a) input units S_1 … S_n receive the external inputs s_1 … s_n and feed the F1(b) interface units X_1 … X_n; each X_i is connected to every F2 cluster unit Y_j through bottom-up weights b_ij, and each Y_j sends top-down weights t_ji back to X_i; a reset control unit R receives signals from the F1(a) and F1(b) layers and acts on the F2 layer]
ARCHITECTURE EXPLAINED

• Each unit in the input portion of the F1 layer is connected to the respective unit in the interface portion of the F1 layer
• The reset control unit has connections from the F1(a) and F1(b) units
• Each unit in the F1(b) layer is connected through two weighted interconnection paths to each of the F2 units
• The weight of the connection from the Xi unit of the F1(b) layer to the Yj unit of the F2 layer is b_ij
• The weight of the connection from the Yj unit of the F2 layer to the Xi unit of the F1(b) layer is t_ji



ARCHITECTURE (SUPPLEMENTAL UNIT)

[Figure: supplemental units of ART1. The gain control units G1 and G2 and the reset unit R exchange excitatory (+) and inhibitory (−) signals with the F1(a) input portion, the F1(b) interface portion and the F2 cluster layer; F1(b) and F2 are also linked by the bottom-up weights b_ij and top-down weights t_ji]


SUPPLEMENTAL UNIT

• The supplemental units provide efficient neural control of the learning process
• ART1 can be practically implemented by analog circuits governed by differential equations; that is, the bottom-up and top-down weight changes are controlled by differential equations
• The network operates autonomously throughout
• It does not require any external control signals and can run stably with an infinite stream of input patterns
• The computational units are required to respond differently at different stages of the process, and a biological neuron cannot decide this on its own; the supplemental units provide this control


SUPPLEMENTAL UNIT CONTD…

• G1 and G2 are called gain control units
• There are three control units: G1, G2 and the reset control unit R
• These three units receive signals from, and send signals to, all of the units in the input layer and the cluster layer
• Excitatory weighted signals are denoted by "+" and inhibitory signals are indicated by "−"
• Whenever any unit in a designated layer is "on", a signal is sent
• An F1(b) unit or an F2 unit receives signals from three sources
• An F1(b) unit can receive a signal from the F1(a) units, the F2 units or the G1 unit



SUPPLEMENTAL UNIT CONTD…

• An F2 unit receives signals from the F1(b) units, the reset control unit R or the gain control unit G2
• An F1(b) unit or an F2 unit must receive two excitatory signals in order to be on
• Since both F1(b) and F2 units can receive signals from three possible sources, this requirement is called the two-thirds rule
• An F1(b) unit should send a signal whenever it receives input from F1(a) and no F2 unit is active



TRAINING ALGORITHM

• STEP 0: Initialize the parameters: α > 1 and 0 < ρ ≤ 1
• Initialize the weights: 0 < b_ij(0) < α / (α − 1 + n),   t_ji(0) = 1
• STEP 1: Perform Steps 2-13 when stopping condition is false
• STEP 2: Perform Steps 3-12 for each of the training inputs
• STEP 3: Set activations of all F2 units to zero
• Set the activations of the F1(a) units to the input vector s
• STEP 4: Calculate the norm of s:  ||s|| = Σ_i s_i
• STEP 5: Send the input signal from the F1(a) layer to the F1(b) layer:  x_i = s_i


TRAINING ALGORITHM CONTD…

• STEP 6: For each F2 node that is not inhibited, the following rule should hold:

      If y_j ≠ −1 then y_j = Σ_i b_ij x_i

• STEP 7: Perform Steps 8-11 when reset is true
• STEP 8: Find J such that y_J ≥ y_j for all nodes j. If y_J = −1, then all the nodes are inhibited and this pattern cannot be clustered
• STEP 9: Recalculate the activations x of F1(b):  x_i = s_i t_Ji


TRAINING ALGORITHM CONTD…

• STEP 10: Recalculate the norm of vector x:  ||x|| = Σ_i x_i
• STEP 11: Test for the reset condition:
      If ||x|| / ||s|| < ρ then inhibit node J (set y_J = −1) and go back to Step 7. Else proceed to Step 12 (next step)
• STEP 12: Perform weight updation for node J (fast learning):

      b_iJ(new) = α x_i / (α − 1 + ||x||)   and   t_Ji(new) = x_i

• STEP 13: Test for stopping condition:
• (a) No change in weights  (b) No reset of units  (c) Maximum number of epochs reached
TRAINING ALGORITHM CONTD…
• When calculating the winner unit, if there occurs a tie, the unit
with smallest index is chosen as winner
• In Step 3, all the inhibitions obtained from the previous learning
trial are removed
• When 𝑦𝐽 = −1, the node is inhibited and it will be prevented
from becoming the winner
• The weight t_ji is either 0 or 1, and once it is set to 0 during training it can never be set back to 1 (this provides a stable learning method)
• The optimal values of the initial parameters are α = 2, ρ = 0.9, b_ij = 1 / (1 + n), t_ji = 1
• The algorithm uses fast learning, which uses the fact that the
input pattern is presented for a longer period of time for
weights to reach equilibrium
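Below is a compact Python sketch of the ART1 training algorithm above, using fast learning, α = 2 and initial b_ij = 1/(1 + n); it is my own illustrative implementation, not taken from the slides:

```python
import numpy as np

def art1(patterns, n_clusters, rho=0.9, alpha=2.0, epochs=1):
    """Minimal ART1 clustering of binary vectors (fast learning)."""
    n = patterns.shape[1]
    b = np.full((n, n_clusters), 1.0 / (1.0 + n))   # bottom-up weights b_ij
    t = np.ones((n_clusters, n))                    # top-down weights t_ji

    for _ in range(epochs):
        for s in patterns:
            x = s.astype(float)                     # Steps 3-5
            y = x @ b                               # Step 6: net inputs
            inhibited = np.zeros(n_clusters, bool)
            while True:                             # Steps 7-11
                y_masked = np.where(inhibited, -1.0, y)
                J = int(np.argmax(y_masked))
                if y_masked[J] == -1:
                    break                           # no unit can learn this pattern
                x = s * t[J]                        # Step 9
                if x.sum() / s.sum() < rho:         # Step 11: reset test
                    inhibited[J] = True             # inhibit J, try another unit
                else:
                    b[:, J] = alpha * x / (alpha - 1 + x.sum())   # Step 12
                    t[J] = x
                    break
    return b, t

# toy usage with made-up binary patterns
patterns = np.array([[1, 1, 0, 0], [0, 0, 1, 1], [1, 0, 0, 0]])
print(art1(patterns, n_clusters=3))
```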
EXAMPLE ART1

• Consider an ART1 neural net with four F1 units and three F2 units. After some training, the weights are as follows:

      b_ij = [ 0.67  0     0.2 ]        t_ji = [ 1  0  0  0 ]
             [ 0     0     0.2 ]               [ 0  0  0  1 ]
             [ 0     0     0.2 ]               [ 1  1  1  1 ]
             [ 0     0.67  0.2 ]

• Determine the new weight matrices after the vector [0, 0, 1, 1] is presented if
• (a) the vigilance parameter is 0.3
• (b) the vigilance parameter is 0.7


SOLUTION

• Here, n = 4, m = 3 and α = 2. Vigilance parameter ρ = 0.3
• Bottom-up weight (initial value): b_ij(0) = 1 / (1 + n) = 1 / (1 + 4) = 0.2
• Top-down weight (initial value): t_ji(0) = 1
• Now, for s = [0 0 1 1], the norm is ||s|| = 0 + 0 + 1 + 1 = 2
• Then we compute the activations of the F1 layer: x = [0 0 1 1]
• We evaluate the net inputs y_j = Σ_{i=1}^{4} x_i b_ij:

      y_1 = 0(0.67) + 0(0) + 1(0) + 1(0) = 0
      y_2 = 0(0) + 0(0) + 1(0) + 1(0.67) = 0.67
SOLUTION CONTD…

      y_3 = 0(0.2) + 0(0.2) + 1(0.2) + 1(0.2) = 0.4

• Since y_2 is the largest, the winner unit is J = 2
• Let us compute the F1 activations again:

      x_i = s_i t_Ji = [0 0 1 1] · [0 0 0 1]  (elementwise; the second row of the t_ji matrix)
          = [0 0 0 1]

• So, ||x|| = 0 + 0 + 0 + 1 = 1
• We now test for reset: ||x|| / ||s|| = 1/2 = 0.5 ≥ 0.3 (= ρ). So we update the weights b_iJ:

      b_iJ(new) = α x_i / (α − 1 + ||x||)       (α = 2)
SOLUTION CONTD…

      b_12 = (2 × 0) / (2 − 1 + 1) = 0,   b_22 = (2 × 0) / (2 − 1 + 1) = 0
      b_32 = (2 × 0) / (2 − 1 + 1) = 0,   b_42 = (2 × 1) / (2 − 1 + 1) = 1

• Update the top-down weights: t_Ji(new) = x_i

• The new top-down weights are     t_ji = [ 1  0  0  0 ]
                                          [ 0  0  0  1 ]
                                          [ 1  1  1  1 ]

• The new bottom-up weights are    b_ij = [ 0.67  0  0.2 ]
                                          [ 0     0  0.2 ]
                                          [ 0     0  0.2 ]
                                          [ 0     1  0.2 ]


SOLUTION CONTD…

• (b) Vigilance parameter ρ = 0.7
• The input vector is s = [0 0 1 1]
• The norm of s is ||s|| = 0 + 0 + 1 + 1 = 2
• The activations of the F1 layer are x = [0 0 1 1]
• Calculating the net inputs y_j = Σ_{i=1}^{4} x_i b_ij, we obtain

      y_1 = 0(0.67) + 0(0) + 1(0) + 1(0) = 0
      y_2 = 0(0) + 0(0) + 1(0) + 1(0.67) = 0.67
      y_3 = 0(0.2) + 0(0.2) + 1(0.2) + 1(0.2) = 0.4
SOLUTION CONTD…

• As y_2 is the largest, the winner unit index is J = 2
• Recomputing the activations of the F1 layer, we get

      x_i = s_i t_Ji = [0 0 1 1] · [0 0 0 1] = [0 0 0 1]

• ||x|| = 0 + 0 + 0 + 1 = 1
• Test for the reset condition:

      ||x|| / ||s|| = 1/2 = 0.5 < 0.7 (= ρ)

• So set y_2 = −1 (inhibit node 2). Then y_1 = 0, y_2 = −1, y_3 = 0.4
• As the largest remaining value is y_3, we now have J = 3
• We now recompute the F1 layer activations


SOLUTION CONTD…

• x_i = s_i t_Ji = [0 0 1 1] · [1 1 1 1] = [0 0 1 1]
• So, ||x|| = 2
• Test for the reset condition:

      ||x|| / ||s|| = 2/2 = 1 > 0.7 (= ρ)

• Hence we update the weights
• The bottom-up weights are (x_i = [0 0 1 1], J = 3):

      b_iJ(new) = α x_i / (α − 1 + ||x||)

      b_13 = (2 × 0) / (2 − 1 + 2) = 0,      b_23 = (2 × 0) / (2 − 1 + 2) = 0
      b_33 = (2 × 1) / (2 − 1 + 2) = 0.67,   b_43 = (2 × 1) / (2 − 1 + 2) = 0.67
SOLUTION CONTD…

• The updated bottom-up weights are

      b_ij = [ 0.67  0     0    ]
             [ 0     0     0    ]
             [ 0     0     0.67 ]
             [ 0     0.67  0.67 ]

• The top-down weights are given by t_Ji(new) = x_i. So,

      t_ji = [ 1  0  0  0 ]
             [ 0  0  0  1 ]
             [ 0  0  1  1 ]
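The following short Python sketch (my own check, applying the same bottom-up/top-down update and reset rules) reproduces both cases of this example:

```python
import numpy as np

b = np.array([[0.67, 0.0,  0.2],
              [0.0,  0.0,  0.2],
              [0.0,  0.0,  0.2],
              [0.0,  0.67, 0.2]])          # bottom-up weights after some training
t = np.array([[1, 0, 0, 0],
              [0, 0, 0, 1],
              [1, 1, 1, 1]], float)        # top-down weights
s = np.array([0, 0, 1, 1], float)

def present(b, t, s, rho, alpha=2.0):
    """Present one pattern, applying the reset test until a unit is accepted."""
    b, t = b.copy(), t.copy()
    y = s @ b
    inhibited = np.zeros(b.shape[1], bool)
    while True:
        if inhibited.all():
            raise RuntimeError("no available cluster unit can learn this pattern")
        J = int(np.argmax(np.where(inhibited, -1.0, y)))
        x = s * t[J]
        if x.sum() / s.sum() >= rho:                 # accept: update weights
            b[:, J] = alpha * x / (alpha - 1 + x.sum())
            t[J] = x
            return J, b, t
        inhibited[J] = True                          # reset: inhibit J and retry

print(present(b, t, s, rho=0.3))   # case (a): unit 2 (index 1) learns the pattern
print(present(b, t, s, rho=0.7))   # case (b): unit 2 is reset, unit 3 (index 2) learns
```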

