Unsupervised Learning Networks

This document discusses unsupervised learning networks. It describes several unsupervised learning network models including Max Net, Mexican Hat networks, Kohonen self-organizing feature maps, and Hamming networks. It explains the basic concepts of competitive learning and self-organization that these networks use, including finding the winning neuron and adapting weights based on competition between neurons. Kohonen self-organizing feature maps are discussed in detail, outlining their architecture and algorithms for competition, cooperation, and adaptation during training.

CHAPTER 5

UNSUPERVISED LEARNING
NETWORKS

April 2007 1
UNSUPERVISED LEARNING
 No help from the outside.

 No training data; no information available on the desired output.

 Learning by doing.

 Used to pick out structure in the input:
• Clustering,
• Reduction of dimensionality → compression.

 Example: Kohonen’s Learning Law.
FEW UNSUPERVISED
LEARNING NETWORKS
There exist several networks under this category, such as
 Max Net,
 Mexican Hat,
 Kohonen Self-organizing Feature Maps,
 Learning Vector Quantization,
 Counterpropagation Networks,
 Hamming Network,
 Adaptive Resonance Theory.

COMPETITIVE LEARNING
 Output units compete, so that eventually only one neuron (the one with the largest input) is active in response to each input pattern.

 The total weight from the input layer to each output neuron is limited. If some connections are strengthened, others must be weakened.

 A consequence is that the winner is the output neuron whose weights best match the input pattern.
MAX NET
 Max Net is a fixed-weight competitive net.
 Max Net serves as a subnet for picking the node whose input is largest. All the nodes in this subnet are fully interconnected, and the weights on these interconnections are symmetric.
 The weights between the neurons are inhibitory and fixed.
 The architecture of this net is as shown below:
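The Max Net competition described above can be sketched in a few lines of NumPy. This is an illustrative sketch, not code from the text; the inhibition weight `epsilon` and the ramp activation f(x) = max(x, 0) follow the standard Max Net formulation, and `epsilon` is assumed to be smaller than 1/m for m competing nodes.

```python
import numpy as np

def maxnet(inputs, epsilon=0.15, max_iter=100):
    """Iterate the Max Net until at most one node stays active.

    Every node inhibits every other node with the fixed weight
    -epsilon, and activations pass through the ramp function
    f(x) = max(x, 0). The surviving node marks the largest input."""
    a = np.asarray(inputs, dtype=float)
    for _ in range(max_iter):
        # net input to node j: a_j - epsilon * (sum of the other activations)
        a = np.maximum(a - epsilon * (a.sum() - a), 0.0)
        if np.count_nonzero(a) <= 1:
            break
    return a

# The node with the largest initial input (index 3) is the only survivor.
final = maxnet([0.2, 0.4, 0.6, 0.8])
```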

MEXICAN HAT NETWORK
 Kohonen developed the Mexican Hat network, which is a more generalized contrast-enhancement network than the earlier Max Net.

 Here, in addition to the connections within a particular layer of the neural net, the neurons also receive external signals. This interconnection pattern is repeated for the other neurons in the layer.

 The architecture for the network is as shown below:

MEXICAN HAT FUNCTION OF
LATERAL CONNECTION

MEXICAN HAT NETWORK
 The lateral connections are used to create a
competition between neurons. The neuron with the
largest activation level among all neurons in the
output layer becomes the winner. This neuron is the
only neuron that produces an output signal. The
activity of all other neurons is suppressed in the
competition.

 The lateral feedback connections produce


excitatory or inhibitory effects, depending on the
distance from the winning neuron. This is achieved by
the use of a Mexican Hat function which describes
synaptic weights between neurons in the output layer.
• Has cooperative neighbours: neurons in close proximity have excitatory weights.
• Has competitive neighbours: neurons farther away have inhibitory weights.
• The size of each region depends on the magnitude of the weights and on the topology of the regions (linear, rectangular, hexagonal, etc.).
• There exist two symmetric regions around each neuron.
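The regions described above can be illustrated with a small sketch. The radii and weight values below are assumed for illustration, not taken from the text: lateral weights are excitatory (c1 > 0) within a close radius R1, inhibitory (c2 < 0) between R1 and R2, and zero beyond R2.

```python
import numpy as np

def mexican_hat_weights(max_dist, R1=2, R2=5, c1=0.6, c2=-0.2):
    """Lateral weight as a function of distance d from a given neuron.

    |d| <= R1       -> excitatory weight c1 (cooperative neighbours)
    R1 < |d| <= R2  -> inhibitory weight c2 (competitive neighbours)
    |d| > R2        -> no connection (weight 0)
    The two regions are symmetric around the neuron."""
    d = np.arange(-max_dist, max_dist + 1)
    return d, np.where(np.abs(d) <= R1, c1,
                       np.where(np.abs(d) <= R2, c2, 0.0))

distances, weights = mexican_hat_weights(6)
```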
HAMMING NETWORK
 The Hamming network selects the stored class that is at a minimum Hamming distance (H) from the noisy vector presented at the input.

 The Hamming distance between two vectors is the number of components in which the vectors differ.

 The vectors here are binary or bipolar.

 It is a maximum-likelihood classifier: it determines which exemplar vector is most similar to the input vector.

• Let x, y be two bipolar vectors with n components in total.
• Then x·y = a − d, where a is the number of components in which the vectors agree and d is the number of components in which they disagree.
• The Hamming distance is d, and n = a + d, so d = n − a.
• Since x·y = a − d
  x·y = a − (n − a)
      = 2a − n
  2a = x·y + n
  a = ½(x·y) + ½(n)
• So the weights can be set to one-half the exemplar vector, and the bias can be set to n/2 initially.
• By calculating the unit with the largest net input, the net can locate the unit closest to the exemplar vector.
ARCHITECTURE OF HAMMING NET

 The Hamming network consists of two layers.
• The first layer computes, in the feedforward path, the difference between the total number of components and the Hamming distance between the input vector x and each stored exemplar vector.

• The second layer of the Hamming network is composed of a Max Net (used as a subnet), or winner-take-all network, which is a recurrent network.
Algorithm
• Initialize the weights and bias:
  w_ij = e_i(j)/2, for i = 1…n, j = 1…m
  b_j = n/2

• Calculate the net input to each unit Y_j:
  y_inj = b_j + Σ x_i w_ij (sum over i = 1…n), j = 1…m

• Initialize the activations of the Max Net:
  y_j(0) = y_inj, j = 1…m
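A minimal sketch of the first layer in NumPy (the exemplar and input vectors below are illustrative): with w_ij = e_i(j)/2 and b_j = n/2, the net input to unit Y_j equals a, the number of components in which x agrees with exemplar j.

```python
import numpy as np

def hamming_layer1(exemplars, x):
    """Net input y_in_j = b_j + sum_i x_i * w_ij for each stored class.

    Weights are one-half the exemplar vectors and the bias is n/2,
    so the score equals n minus the Hamming distance to exemplar j."""
    E = np.asarray(exemplars, dtype=float)  # one bipolar exemplar per row
    x = np.asarray(x, dtype=float)
    W = E.T / 2.0                           # w_ij = e_i(j) / 2
    b = x.size / 2.0                        # b_j = n / 2
    return b + x @ W

# x agrees with the first exemplar in 3 components, the second in 1.
scores = hamming_layer1([[1, -1, -1, -1], [-1, -1, -1, 1]], [1, 1, -1, -1])
# A Max Net over `scores` would then pick the closest class.
```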
SELF-ORGANIZATION

 Network organization is fundamental to the brain:
• Functional structure.
• Layered structure.
• Both parallel processing and serial processing require organization of the brain.

SELF-ORGANIZING FEATURE MAP

Our brain is dominated by the cerebral cortex, a very


complex structure of billions of neurons and hundreds of
billions of synapses. The cortex includes areas that are
responsible for different human activities (motor, visual,
auditory, etc.) and associated with different sensory inputs.
One can say that each sensory input is mapped into a
corresponding area of the cerebral cortex. The cortex is a
self-organizing computational map in the human brain.

SELF-ORGANIZING NETWORKS
 Discover significant patterns or features in
the input data.

 Discovery is done without a teacher.

 Synaptic weights are changed according to


local rules.

 The changes affect a neuron’s immediate


environment until a final configuration
develops.

KOHONEN SELF-ORGANIZING
FEATURE MAP (KSOFM)
 The Kohonen model provides a topological mapping.

 It maps a fixed number of input patterns from the input layer onto a lower-dimensional output (Kohonen) layer.

 Training in the Kohonen network begins with a fairly large winner’s neighborhood. Then, as training proceeds, the neighborhood size gradually decreases.

 Kohonen SOMs result from the synergy of three basic processes:
• Competition,
• Cooperation,
• Adaptation.
ARCHITECTURE OF KSOFM

COMPETITION OF KSOFM
 Each neuron in an SOM is assigned a weight vector with the same dimensionality N as the input space.

 Any given input pattern is compared to the weight vector of each neuron, and the closest neuron is declared the winner.

 The Euclidean norm is commonly used to measure distance.
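The competition step can be sketched as follows (an illustrative sketch, not code from the text):

```python
import numpy as np

def find_winner(weights, x):
    """Return the index of the neuron whose weight vector lies closest
    to the input x in the Euclidean norm (the competition step)."""
    distances = np.linalg.norm(weights - x, axis=1)
    return int(distances.argmin())

# Neuron 1's weight vector is closest to the input, so it wins.
winner = find_winner(np.array([[0.0, 1.0], [1.0, 0.0]]), np.array([0.9, 0.1]))
```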

CO-OPERATION OF KSOFM
 The activation of the winning neuron is spread to neurons in its immediate neighborhood. This allows topologically close neurons to become sensitive to similar patterns.

 The winner’s neighborhood is determined by the lattice topology. Distance in the lattice is a function of the number of lateral connections to the winner.

 The size of the neighborhood is initially large, but shrinks over time. An initially large neighborhood promotes a topology-preserving mapping; smaller neighborhoods allow neurons to specialize in the later stages of training.
ADAPTATION OF KSOFM
During training, the winning neuron and its topological neighbors are adapted to make their weight vectors more similar to the input pattern that caused the activation.

Neurons that are closer to the winner adapt more heavily than neurons that are farther away.

The magnitude of the adaptation is controlled by a learning rate, which decays over time to ensure convergence of the SOM.
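The adaptation step can be sketched as below. The Gaussian neighborhood function and the parameter values are illustrative assumptions; in a full SOM both the learning rate and the neighborhood width sigma would decay over training.

```python
import numpy as np

def som_adapt(weights, grid, x, winner, lr=0.5, sigma=1.0):
    """Move every weight vector toward x, scaled by the learning rate
    and a Gaussian neighborhood centred on the winning neuron.

    `grid` holds each neuron's coordinates on the lattice, so neurons
    close to the winner on the lattice adapt more heavily."""
    d2 = np.sum((grid - grid[winner]) ** 2, axis=1)  # lattice distance^2
    h = np.exp(-d2 / (2.0 * sigma ** 2))             # neighborhood function
    return weights + lr * h[:, None] * (x - weights)

weights = np.array([[0.0, 0.0], [1.0, 1.0]])
grid = np.array([[0], [1]])
updated = som_adapt(weights, grid, np.array([1.0, 0.0]), winner=0)
# The winner moves halfway toward x; its lattice neighbor moves less.
```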
KSOFM ALGORITHM

EXAMPLE OF KSOFM
Find the winning neuron using the Euclidean distance:

Neuron 3 is the winner and its weight vector W3 is


updated according to the competitive learning rule:

The updated weight vector W3 at iteration (p+1) is
determined as:

The weight vector W3 of the winning neuron 3


becomes closer to the input vector X with each
iteration.

LEARNING VECTOR QUANTIZATION
(LVQ)
 This is a supervised version of vector quantization.
Classes are predefined and we have a set of labeled
data.
 The goal is to determine a set of prototypes that best
represent each class.

BASIC SCHEME OF LVQ
Step 1: Initialize prototype vectors for different classes.
Step 2: Present a single input.
Step 3: Identify the closest prototype, i.e., the so-called
winner.
Step 4: Move the winner
- closer toward the data (same class),
- away from the data (different class).
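Steps 2–4 above can be sketched as a single LVQ1 update (an illustrative sketch; the learning rate is an assumed value):

```python
import numpy as np

def lvq1_step(prototypes, proto_labels, x, label, lr=0.1):
    """One LVQ1 step: find the closest prototype (the winner) and move
    it toward x if the class labels match, away from x otherwise."""
    j = int(np.linalg.norm(prototypes - x, axis=1).argmin())
    direction = 1.0 if proto_labels[j] == label else -1.0
    prototypes[j] += direction * lr * (x - prototypes[j])
    return j

protos = np.array([[0.0, 0.0], [1.0, 1.0]])
# A sample of class 0 near prototype 0 pulls that prototype toward it.
won = lvq1_step(protos, [0, 1], np.array([0.2, 0.0]), label=0)
```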

VARIANTS OF LVQ
 LVQ 1
 LVQ 2
 LVQ 2.1
 LVQ 3
COUNTERPROPAGATION NETWORK
 Another variant of the BPN is the
counterpropagation network (CPN).

 Although this network uses linear


neurons, it can learn nonlinear functions by
means of a hidden layer of competitive
units.

 Moreover, the network is able to learn a


function and its inverse at the same time.
COUNTERPROPAGATION NETWORK
• They are multilayer networks having input, output and clustering (hidden) layers.
• Used for data compression, function approximation and pattern recognition.
• Constructed from the instar-outstar model.
COUNTERPROPAGATION
NETWORK
• Connections between the input layer and competitive layer have an instar structure.
• Connections between the competitive layer and output layer have an outstar structure.
• The competitive layer is a winner-take-all/Max Net layer with lateral feedback connections.
COUNTERPROPAGATION
NETWORK
• Input vectors are clustered in the first stage.
• The clusters are formed using the Euclidean distance or dot product method.
• In the second stage, the weights from the cluster units to the output units are tuned to obtain the desired response.
TYPES OF COUNTERPROPAGATION
NETWORK
 FULL COUNTERPROPAGATION NET
The full counterpropagation net (full CPN) efficiently represents a large number of vector pairs x:y by adaptively constructing a look-up table.
 FORWARD-ONLY COUNTERPROPAGATION NET
A simplified version of the full CPN is the forward-only CPN. The approximation of the function y = f(x), but not of x = f(y), can be performed using the forward-only CPN. In forward-only CPN, only the x-vectors are used to form the clusters on the Kohonen units.

BASIC STRUCTURE OF FULL CPN

• x* and y* are approximations to the x and y sets of vectors.
• The network is designed to approximate a continuous function.
• Vectors x and y flow in a counterflow manner to yield x* and y*.
• The number of nodes in the hidden layer of a CPN is greater than that of a BPN for a given accuracy level.
• The CPN has a greater speed of learning.
• Uses hybrid learning: unsupervised learning (instar) and supervised learning (outstar).
• The weight-updation rule on the winning cluster unit J is (Kohonen learning):
  v_iJ(new) = v_iJ(old) + α[x_i − v_iJ(old)], i = 1…n
  w_Jk(new) = w_Jk(old) + β[y_k − w_Jk(old)], k = 1…m
• The weights between the winning unit J and the output units are adjusted as (Grossberg learning):
  u_Jk(new) = u_Jk(old) + a[y_k − u_Jk(old)], k = 1…m
  t_Ji(new) = t_Ji(old) + b[x_i − t_Ji(old)], i = 1…n
FIRST PHASE OF CPN

SECOND PHASE OF CPN

CPN LEARNING PROCESS
COUNTERPROPAGATION NETWORK
 After the first phase of training, each hidden-layer neuron is associated with a subset of input vectors.

 The training process minimizes the average angle difference between the weight vectors and their associated input vectors.

 In the second phase of training, we adjust the weights in the network’s output layer so that, for any winning hidden-layer unit, the network’s output is as close as possible to the desired output for the winning unit’s associated input vectors.

 The idea is that when we later use the network to compute functions, the output of the winning hidden-layer unit is 1 and the output of all other hidden-layer units is 0.
COUNTERPROPAGATION NETWORK
 In the first training phase, if a hidden-layer unit does not win for a long period of time, its weights should be set to random values to give that unit a chance to win subsequently.

 There is no need to normalize the training output vectors.

 After training has finished, the network maps the training vectors onto output vectors that are close to the desired ones. The more hidden units, the better the mapping.

 CPN is widely used in data compression and image compression applications.

Training of CPN

Forward Only Counterpropagation
Network

• A simplified version of the full CPN.
• Approximation of the function y = f(x), but not of x = f(y), can be done.
• Only the x vectors are used to form the clusters on the Kohonen units in the first phase of training.
• First, the weights between the input and cluster layers are trained.
• This is a specific network with a known target.
• The winning cluster unit sends its signal to the output layer.
• At the output layer, the difference between w_Jk and the target y_k is calculated.
• Based on this, the weights between the cluster and output layers are updated.

Weight updation between input units and cluster units:

  v_iJ(new) = v_iJ(old) + α[x_i − v_iJ(old)]

Weight updation between cluster units and output units:

  w_Jk(new) = w_Jk(old) + a[y_k − w_Jk(old)]
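The two update rules above can be combined into one training step of the forward-only CPN (an illustrative sketch; the parameter values are assumed):

```python
import numpy as np

def cpn_step(V, W, x, y, alpha=0.2, a=0.1):
    """One forward-only CPN training step.

    The winning cluster unit J is found by Euclidean distance; the
    instar weights v_iJ then move toward x (unsupervised phase) and
    the outstar weights w_Jk move toward the target y (supervised)."""
    J = int(np.linalg.norm(V - x, axis=1).argmin())
    V[J] += alpha * (x - V[J])  # v_iJ(new) = v_iJ(old) + alpha[x_i - v_iJ(old)]
    W[J] += a * (y - W[J])      # w_Jk(new) = w_Jk(old) + a[y_k - w_Jk(old)]
    return J

V = np.array([[0.0, 0.0], [1.0, 1.0]])  # input-to-cluster weights (one row per cluster)
W = np.zeros((2, 1))                    # cluster-to-output weights (one row per cluster)
J = cpn_step(V, W, np.array([0.1, 0.0]), np.array([1.0]))
```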

ADAPTIVE RESONANCE THEORY
(ART) NETWORK
 Adaptive Resonance Theory (ART) is a family of algorithms for unsupervised learning developed by Carpenter and Grossberg.

 ART is similar to many iterative clustering algorithms, where each pattern is processed by

 - finding the "nearest" cluster (a.k.a. prototype or template) to that exemplar, and

 - updating that cluster to be "closer" to the exemplar.

ARCHITECTURES OF ART NETWORK

 ART1, designed for binary features.

 ART2, designed for continuous


(analog) features.

ART1 Network
• It was developed to solve the problem of instability of feedforward nets.
• For each pattern presented to the network, an appropriate cluster unit is chosen.
• The weights of the cluster unit are adjusted to let the cluster unit learn the pattern.
• This network controls the degree of similarity of the patterns placed on the same cluster unit.
ART1 Network
• These networks possess stability-plasticity properties.
• An input pattern should not be placed on a different cluster unit each time it is presented.
• Based on this, stability is defined as the property that a pattern is not reassigned away from its previous cluster unit. It is achieved by reducing the learning rate.
ART1 Network
• The ability of the network to respond to a new pattern equally well at any stage of learning is called plasticity.
• The stability-plasticity dilemma can be resolved by a network that combines bottom-up (input-output) competitive learning with top-down (output-input) learning.

ART1 Network
• The instability of the instar-outstar network can be resolved by gradually reducing the learning rate to zero, thereby freezing the learned categories.
• But at this point the net may lose its plasticity (its ability to react to new data).
• ART1 preserves significant past learning but nevertheless remains adaptable to new inputs.
ART1 UNITS
The ART1 network is made up of two types of units:
 Computational units
  Input unit (F1 unit: input and interface).
  Cluster unit (F2 unit: output).
  Reset control unit (controls the degree of similarity).
 Supplemental units
  One reset control unit.
  Two gain control units.
BASIC ARCHITECTURE OF ART1

FUNDAMENTAL ALGORITHM OF ART NETWORK

Supplemental unit

Training algorithm
ART2 NETWORK

BASIC ARCHITECTURE OF ART2

ART2 ALGORITHM
SUMMARY
This chapter discussed the various unsupervised learning networks:
 Max Net
 Mexican Hat
 Kohonen Self-organizing Feature Maps
 Learning Vector Quantization
 Counterpropagation Networks
 Hamming Network
 Adaptive Resonance Theory

