
Lecture-10

Networks Based on Competition


Kohonen Self Organizing Maps

Dr. Abdul Majid, DCIS

Data Clustering
 Clustering is the classification of objects into different groups
 The partitioning of a data set into subsets (clusters),
so that the data in each subset (ideally) share some
common trait - often proximity according to some
defined distance measure
 Clustering is unsupervised classification
 Input: A set of unlabeled examples (input vectors)
 Task: Cluster (group) the inputs into clusters
 Output: Cluster labels
Applications of Clustering
 Imaging and Segmentation
 3D brain imaging data segmented into three
tissue types (White Matter, Gray Matter and
Cerebrospinal Fluid)

Applications of Clustering
 Gene Clustering
 Market Surveys
 Market researchers use
cluster analysis to
partition the general
population of consumers
into market segments
and to better understand
the relationships
between different groups
of consumers/potential
customers.
Applications of Clustering
 Social Network Analysis

Applications of Clustering
 Data Mining
 Document Clustering on the
WWW
 http://websom.hut.fi/websom/
 Search Result Grouping

 Image Clustering

Applications of Clustering
 Land Use Analysis
 Signal Analysis and Disputed Territory
 Data Compression
   Compress the signal by using code books derived from Vector Quantization
 …

Requirements for Clustering Algorithms
 Scalability
 Dealing with Different Types of Attributes
 Discovery of Clusters with arbitrary shape
 Minimum requirements for domain knowledge to
determine input parameters
 Robust and Noise Tolerant
 Insensitive to order of input records
 High Dimensionality
 Interpretability and Usability

How to define Similarity or Difference?
 Clustering is essentially based upon finding the
similarity/difference between given data points
 Use of Distance Measures
 Euclidean Distance (2-Norm): $\sqrt{\sum_{i=1}^{n} (p_i - q_i)^2}$
 Manhattan Distance (1-Norm): $\sum_{i=1}^{n} |p_i - q_i|$
 Minkowski Distance: $\left( \sum_{i=1}^{n} |p_i - q_i|^m \right)^{1/m}$
 Infinity Norm: $\max_i |p_i - q_i|, \; i = 1, \ldots, n$
 Mahalanobis Distance: $\sqrt{(p - q)^T P^{-1} (p - q)}$, where $P$ is the covariance matrix
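As a minimal sketch, these measures can be computed with NumPy as follows (function names and the example vectors are illustrative, not from the lecture):

```python
import numpy as np

def euclidean(p, q):
    # 2-norm: sqrt(sum_i (p_i - q_i)^2)
    return np.sqrt(np.sum((p - q) ** 2))

def manhattan(p, q):
    # 1-norm: sum_i |p_i - q_i|
    return np.sum(np.abs(p - q))

def minkowski(p, q, m=3):
    # (sum_i |p_i - q_i|^m)^(1/m); m=2 gives Euclidean, m=1 gives Manhattan
    return np.sum(np.abs(p - q) ** m) ** (1.0 / m)

def infinity_norm(p, q):
    # max_i |p_i - q_i|
    return np.max(np.abs(p - q))

def mahalanobis(p, q, P):
    # sqrt((p - q)^T P^{-1} (p - q)), with P the covariance matrix of the data
    d = p - q
    return np.sqrt(d @ np.linalg.inv(P) @ d)

p, q = np.array([1.0, 2.0]), np.array([4.0, 6.0])
print(euclidean(p, q), manhattan(p, q), infinity_norm(p, q))  # 5.0 7.0 4.0
```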
Quality of Clustering
 A good clustering method will produce high quality clusters
with
 High intra-class similarity (Low intra class variability)
 Low inter-class similarity (High inter class variability)
 Measurement of Clustering Quality
 As the process is unsupervised, there is no error criterion specifying the quality of clustering
 Clustering Indices (Comparison of Different Clustering
Techniques)
 Davies Bouldin Index
 Dunn’s Index
 Partition Coefficients
 Classification Entropy
 Separation Index
 Fuzzy Hypervolume
 …

Davies Bouldin Index
 A similarity measure $R_{ij}$ between the clusters $C_i$ and $C_j$ is defined based on a measure of dispersion of a cluster $C_i$, denoted $s_i$, and a dissimilarity measure between two clusters, $d_{ij}$:

  $R_{ij} = \frac{s_i + s_j}{d_{ij}}$   (numerator: degree of intra-class variability; denominator: degree of inter-class variability)

  $R_j = \max_{i=1,\ldots,n_c,\; i \neq j} R_{ij}$

  $DB_{n_c} = \frac{1}{n_c} \sum_{j=1}^{n_c} R_j$

 where $s_i$ = mean square distance from the points in cluster i to the center of cluster i, $d_{ij}$ = distance between the centers of clusters i and j, and $n_c$ = total number of clusters
 A smaller DBI means good clustering
(Figure: three clusters $C_1$, $C_2$, $C_3$ with center distances $d_{12}$, $d_{13}$ and similarities $R_{12} = R_1$, $R_{13}$)
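A minimal sketch of this computation, assuming the data have already been partitioned and cluster centers are given; names are illustrative, and $s_i$ is taken here as the mean distance to the center, one common choice of dispersion measure:

```python
import numpy as np

def davies_bouldin(X, labels, centers):
    """Smaller DB index means better clustering."""
    nc = len(centers)
    # s_i: dispersion of cluster i (mean distance of its points to its center)
    s = np.array([np.mean(np.linalg.norm(X[labels == i] - centers[i], axis=1))
                  for i in range(nc)])
    R = np.zeros(nc)
    for j in range(nc):
        # R_j = max over i != j of (s_i + s_j) / d_ij
        R[j] = max((s[i] + s[j]) / np.linalg.norm(centers[i] - centers[j])
                   for i in range(nc) if i != j)
    return R.mean()
```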
Dunn's Index
 Dunn's Index is the ratio of the degree of inter-class variability (the separation between clusters) to the degree of intra-class variability (the cluster diameters $\Delta(A_k)$):

  $DI = \frac{\min_{i \neq j} d(C_i, C_j)}{\max_k \Delta(A_k)}$

 A large DI means good clustering
(Figure: three clusters $C_1$, $C_2$, $C_3$ with diameters $\Delta(A_1)$, $\Delta(A_2)$, $\Delta(A_3)$)
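A corresponding sketch of the Dunn index, using the cluster diameter for $\Delta$ and the minimum pairwise point distance between clusters for separation (one common variant; names are illustrative):

```python
import numpy as np

def dunn_index(clusters):
    """clusters: list of arrays, one (n_i, d) array per cluster. Larger DI is better."""
    # Intra-class variability: largest cluster diameter, Delta(A_k)
    diam = max(np.max(np.linalg.norm(A[:, None] - A[None, :], axis=-1))
               for A in clusters)
    # Inter-class variability: smallest distance between points of different clusters
    sep = min(np.min(np.linalg.norm(Ai[:, None] - Aj[None, :], axis=-1))
              for i, Ai in enumerate(clusters)
              for j, Aj in enumerate(clusters) if j > i)
    return sep / diam
```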
Clustering
 Using K-Means
 Most commonly used algorithm for clustering

k-means Clustering

 $b_i^t$ is 1 when the $i$-th center is the one closest to $x^t$ (and 0 otherwise)
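A minimal sketch of the standard k-means (Lloyd) iteration, with `labels` playing the role of the indicator $b_i^t$ above (function name and the initialization scheme are illustrative):

```python
import numpy as np

def kmeans(X, k, n_iter=100, seed=0):
    rng = np.random.default_rng(seed)
    # Initialize the k centers with randomly chosen samples
    centers = X[rng.choice(len(X), size=k, replace=False)]
    for _ in range(n_iter):
        # Competition: assign each x^t to its closest center (b_i^t = 1 for the winner)
        dists = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2)
        labels = dists.argmin(axis=1)
        # Update: move each center to the mean of the samples it won
        new_centers = np.array([X[labels == i].mean(axis=0) if np.any(labels == i)
                                else centers[i] for i in range(k)])
        if np.allclose(new_centers, centers):
            break
        centers = new_centers
    return centers, labels
```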


Use of ANNs for Clustering: SOM
 Effective Clustering
 Adaptability
 Is able to adapt itself to a variety of
input data distributions
 Topological Ordering
 For example, SOM can be used
to divide the web-blogs into
groups and at the same time
provide a visualization
capability along with navigation
because neighboring nodes of
the SOM would be representing
similar communities [Merelo-
Guervós et al.]
 Same can be said for document
clustering (WEBSOM)
 Neuro-Biologically Sound
Inspiration for the SOM
 Topologically ordered computational mapping in the brain transforms the input signal into a place-coded probability distribution: information can be readily accessed and utilized

 Biological neurons are hierarchically organized into clusters to carry out specific/related functions
 [4] Motor Cortex
 [6] Pre-motor Area
 [8] Frontal Eye Fields
 [3,1,2] Somatosensory Cortex
 [41,42] Auditory Cortex
 [17,18,19] Visual Cortex

Building concepts for the SOM
 Neurobiological studies indicate that different sensory
inputs are mapped onto corresponding areas of cerebral
cortex in an orderly fashion
 This form of map is known as a topographic map and has
two important properties
 At each stage of representation or processing, each
piece of information is kept in its proper
context/neighborhood
 Neurons dealing with closely related pieces of
information are kept close together so that they can
interact via short synaptic connections
 This gives rise to the principle of topographic map
formation
 The spatial location of an output neuron in a
topographic map corresponds to a particular domain or
feature drawn from the input space
Objectives of SOM
 To have artificial topographic maps that learn
through self organization in a neuro-biologically
inspired manner
 To transform an incoming signal pattern of
arbitrary dimensions into a one or two
dimensional discrete map and perform this
transformation adaptively in a topologically
ordered fashion
 See Video!

Architecture
 To classify a given set of n-
dimensional features into m
clusters
 The input neurons represent
the different components of
the input pattern
 The output units are organized in
the form of a 1D or 2D map
 Each output unit represents a
cluster
 Weight vector, wj, of a
cluster unit j serves as an
exemplar (representative)
of the input patterns
associated with that cluster
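As a concrete sketch of this architecture: one n-dimensional weight vector per cluster unit, arranged on a 2D grid (the sizes and names below are illustrative, not from the slides):

```python
import numpy as np

n = 3            # number of input neurons (dimensionality of the input pattern)
m1, m2 = 10, 10  # output map of m1 x m2 cluster units (m = 100 clusters)

# W[r, c] is the exemplar (weight vector) of the cluster unit at grid position (r, c)
rng = np.random.default_rng(0)
W = rng.uniform(-0.1, 0.1, size=(m1, m2, n))  # small random initial weights
```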

Strategy
 The neurons are placed at the nodes of a lattice
that is usually one or two dimensional. Higher
dimensional maps are also possible but not as
common. The neurons become selectively tuned
to various input patterns or classes of input
patterns in the course of a competitive learning
process
 The locations of the neurons so tuned (i.e.
winning neurons) become ordered with respect to
each other in such a way that a meaningful
coordinate system for different input features is
created over the lattice

Training
 Training:
 Train the network such that the weight vector
wj associated with Yj becomes the
representative vector of the class of input
patterns Yj is to represent.
 Processes in Training
 Initialization
 Initialize the weights of the network with small random
values
 Competition
 For each input pattern each output neuron calculates
how close it is to the input pattern using a Discriminant
or distance function and the neuron closest to the input
pattern is selected as the winner neuron
 Cooperation
 The winning neuron determines the spatial location of a
topological neighborhood (implemented through a
neighborhood function) of excited neurons thereby
providing a basis for cooperation among neighboring
neurons
 Adaptation
 Movement of neurons to better fit the data through
weight adjustment
Training…
 During the competition phase the winner neuron is
selected on the basis of minimum distance from the input
pattern
i (x) arg min x( n)  w j , j 1, 2,..., l
j

 Adaptation

w j (n  1) w j (n)   (n)h j ,i ( x ) (n) x(n)  w j ( n) 
Learning Rate Neighborhood Function

x
wj
 ( n ) h j , i ( x ) ( n ) x ( n )  w j ( n ) 
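A minimal sketch of one training step built from these two equations; the learning rate eta and the neighborhood function (defined on the next slides) are passed in as arguments, and all names are illustrative:

```python
import numpy as np

def som_step(W, x, eta, neighborhood):
    """One SOM update. W: (m1, m2, n) weight grid, x: (n,) input pattern."""
    m1, m2, _ = W.shape
    # Competition: winner i(x) = argmin_j ||x(n) - w_j||
    dists = np.linalg.norm(W - x, axis=2)                  # distance of every unit to x
    winner = np.unravel_index(np.argmin(dists), dists.shape)
    # Cooperation + adaptation: w_j <- w_j + eta * h_{j,i(x)} * (x - w_j)
    for r in range(m1):
        for c in range(m2):
            W[r, c] += eta * neighborhood((r, c), winner) * (x - W[r, c])
    return winner
```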

Neighborhood Function
 Models the definition of the neighborhood
 Symmetric about the maximum point (at the winning neuron)
 Amplitude of the neighborhood decreases with increase in distance from the winning neuron
 Hard Neighborhood: Rectangular

  $h_{j,i(x)}(n) = \begin{cases} 1 & \text{if } d_{j,i} \le r(n) \\ 0 & \text{otherwise} \end{cases}$

 Soft Neighborhood: Gaussian (better)

  $h_{j,i(x)}(n) = \exp\left(-\frac{d_{j,i}^2}{2\sigma^2(n)}\right), \qquad \sigma(n) = \sigma_0 \exp\left(-\frac{n}{\tau_1}\right)$

 Decreased over time to preserve topology
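A sketch of the two neighborhood functions and the shrinking width, with unit positions given as (row, column) grid coordinates (parameter values are illustrative):

```python
import numpy as np

def hard_neighborhood(j, winner, r):
    # 1 if the lattice distance d_{j,i} is within the radius r(n), else 0
    d = np.linalg.norm(np.subtract(j, winner))
    return 1.0 if d <= r else 0.0

def soft_neighborhood(j, winner, sigma):
    # Gaussian: exp(-d_{j,i}^2 / (2 sigma(n)^2)), maximal at the winning neuron
    d2 = float(np.sum(np.subtract(j, winner) ** 2))
    return float(np.exp(-d2 / (2.0 * sigma ** 2)))

def sigma_schedule(n, sigma0=5.0, tau1=1000.0):
    # sigma(n) = sigma0 * exp(-n / tau1): the neighborhood shrinks over time
    return sigma0 * np.exp(-n / tau1)
```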

Learning Rate
 Upper bound on the amount of update
 LR is decreased over time to preserve the
ordering and the topological structure once
established
 We can use exponential decrease or linear
decrease
 Exponential decrease: $\eta(n) = \eta_0 \exp\left(-\frac{n}{\tau_2}\right)$
 Linear decrease: $\eta(n) = \eta_0 \left(1 - \frac{n}{N}\right)$ for some horizon $N$
(Plot: decrease in learning rate with $\eta_0 = 0.1$ and $\tau_2 = 1000$; $\eta(n)$ decays exponentially from 0.1 toward 0 over $n = 0$ to $5000$)
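A sketch of both schedules, using the values from the plot (eta_0 = 0.1, tau_2 = 1000); the horizon N of the linear form is an assumed value:

```python
import numpy as np

def eta_exponential(n, eta0=0.1, tau2=1000.0):
    # eta(n) = eta0 * exp(-n / tau2)
    return eta0 * np.exp(-n / tau2)

def eta_linear(n, eta0=0.1, N=5000):
    # eta(n) = eta0 * (1 - n / N), clipped at zero after N steps
    return eta0 * max(0.0, 1.0 - n / N)
```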

Different Topologies

Phases of the Adaptive Process
 Self Organizing (Ordering)
 May take ~1000 epochs
 LR begins close to 0.1 and remains above 0.01
 Neighborhood function should include almost
all neurons in the neighborhood
 Convergence phase
 LR is kept small (~0.01)
 Neighborhood function should include only
close neighbors of the winning neuron
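As a sketch, the two phases can be expressed as a simple parameter schedule; the numbers beyond those stated on the slide are assumptions:

```python
import numpy as np

def phase_params(n, ordering_epochs=1000, sigma0=5.0):
    """Return (eta, sigma) for epoch n under an illustrative two-phase schedule."""
    if n < ordering_epochs:
        # Ordering phase: eta decays from ~0.1 but stays above 0.01,
        # neighborhood starts wide enough to cover most of the map
        eta = max(0.01, 0.1 * np.exp(-n / ordering_epochs))
        sigma = max(1.0, sigma0 * np.exp(-n / ordering_epochs))
    else:
        # Convergence phase: small constant eta, only close neighbors of the winner
        eta, sigma = 0.01, 1.0
    return eta, sigma
```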

Examples

Properties of SOMs

Property-3 Density Matching
 The feature map generated by SOM reflects the variations in the statistics of
the input distribution
 If $f_X(x)$ is the distribution of the input data and $m(x)$ is the map magnification factor, defined as the number of neurons in a small volume $dx$ of the input space, then

  $\int f_X(x)\,dx = 1$ and $\int m(x)\,dx = l$

 For the SOM to match the input density exactly

  $m(x) \propto f_X(x)$

 But SOM tends to over-represent regions of low input density and to under-represent regions of high density. Theoretically,

  $m(x) \propto f_X^{1/3}(x)$ or $m(x) \propto f_X^{2/3}(x)$
Property-4 Feature Selection
 Given data from an input space with a nonlinear
distribution, SOM is able to select a set of best features
for approximating the underlying distribution
 A discrete approximation of Principal Curves or
Principal Surfaces

