
ARTIFICIAL NEURAL NETWORKS (CSEN3011)

UNIT-V
Self-Organization Feature Maps

By
Dr. Satyabrata Dash
Assistant Professor
Department of Computer Science and Engineering, GITAM Deemed to be University
Course Objectives (ANN)
1. To understand the architecture, learning algorithms and issues of various neural networks.
2. To analyze ANN learning: error-correction learning, memory-based learning, competitive learning and Boltzmann learning.
3. To adopt gradient-descent techniques in real-time applications.
4. To provide knowledge on generalization, function approximation and various architectures for building an ANN.
5. To implement and learn the applications of Self-Organizing Maps.

SYLLABUS

UNIT 5 Self-organization Feature Map 9 hours, P - 6 hours


• Maximal Eigenvector Filtering,
• Extracting Principal Components,
• Generalized Learning Laws,
• Vector Quantization,
• Self-organization Feature Maps,
• Application of SOM.

SYLLABUS

Textbooks:
1. Neural Networks and Deep Learning - Charu C. Aggarwal, Springer International Publishing AG, part of Springer Nature, 2018 (Chapters 1, 2, 3)
2. Neural Networks: A Classroom Approach - Satish Kumar, McGraw Hill Education (India) Pvt. Ltd., Second Edition (Chapters 4, 5)

Dimensionality Reduction:

1. Dimensionality reduction is a technique used in machine learning and data analysis to reduce the number of features or variables in a dataset while preserving its important information.
2. The aim of dimensionality reduction is to simplify the dataset, reduce noise, and enhance model performance by removing redundant features.
3. There are two main types of dimensionality reduction techniques:
• Feature Selection: This involves selecting a subset of the original features that are most important for the task at hand.
• Feature Extraction: This involves transforming the original features into a new set of features that are a combination of the original ones, but with reduced dimensions.
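To make the two types above concrete, the short scikit-learn sketch below selects 2 original features (feature selection) and builds 2 new combined features (feature extraction) from the same data. The Iris dataset and the choices k=2 / n_components=2 are illustrative assumptions, not part of the original slides.

    from sklearn.datasets import load_iris
    from sklearn.feature_selection import SelectKBest, f_classif
    from sklearn.decomposition import PCA

    X, y = load_iris(return_X_y=True)

    # Feature Selection: keep the 2 original features with the highest F-score
    X_selected = SelectKBest(score_func=f_classif, k=2).fit_transform(X, y)

    # Feature Extraction: build 2 new features as linear combinations (PCA)
    X_extracted = PCA(n_components=2).fit_transform(X)

    print(X_selected.shape, X_extracted.shape)   # (150, 2) (150, 2)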



Dimensionality Reduction:

1. Dimensionality reduction can be useful in various applications, including image processing, natural language processing, recommender systems, and bioinformatics.
2. However, it is important to note that dimensionality reduction may result in loss of some
information, and it should be carefully applied and evaluated in each case.
3. Common techniques for dimensionality reduction include

• Principal Component Analysis (PCA),

• Singular Value Decomposition (SVD),

• t-Distributed Stochastic Neighbour Embedding (t-SNE),

• Linear Discriminant Analysis (LDA),

• Non-negative Matrix Factorization (NMF).



Dimensionality Reduction: The Curse of Dimensionality

1. Handling high-dimensional data is very difficult in practice; this is commonly known as the curse of dimensionality.
2. As the dimensionality of the input dataset increases, any machine learning algorithm and model becomes more complex.
3. As the number of features increases, the number of samples required also increases proportionally, and the chance of overfitting increases.
4. If a machine learning model is trained on high-dimensional data, it becomes overfitted and results in poor performance.
5. Hence, it is often required to reduce the number of features, which can be done with dimensionality reduction.



Dimensionality Reduction: Advantages & Disadvantages

Advantages of Dimensionality Reduction

1. It helps with data compression, and hence reduces the required storage space.
2. It reduces computation time.
3. It also helps remove redundant features, if any.

Disadvantages of Dimensionality Reduction

1. It may lead to some amount of data loss.
2. In the PCA technique, the number of principal components to keep is sometimes unknown; in practice, some rules of thumb are applied.



Dimensionality Reduction: Principal Component Analysis (PCA)

1. Principal Component Analysis (PCA) is a popular technique for dimensionality reduction in machine learning and data analysis.
2. PCA is a linear method that transforms a high-dimensional dataset into a lower-dimensional space by identifying the principal components that capture the most important information in the dataset.
3. The basic idea of PCA is to find a set of orthogonal vectors, called principal components, that best describe the variation in the dataset.
4. The first principal component is the direction of maximum variance in the dataset.
5. The second principal component is the direction of maximum variance orthogonal to the first component, and so on.
6. The total number of principal components equals the number of dimensions in the dataset.



Dimensionality Reduction: PCA-Algorithm
The steps involved in PCA

1. Standardize the data: PCA assumes that the data is centered at zero and has unit variance.
Therefore, the data should be standardized before applying PCA.

2. Compute the covariance matrix: The covariance matrix describes the relationships between the
features in the dataset.

3. Compute the eigenvectors and eigenvalues of the covariance matrix: The eigenvectors represent
the principal components, and the eigenvalues represent the variance of the dataset along each
principal component.

4. Sort the eigenvectors in descending order of their corresponding eigenvalues: This determines
the order of importance of the principal components.

5. Select the top k eigenvectors: The top k eigenvectors with the highest eigenvalues are chosen to
represent the dataset in the lower-dimensional space.

6. Transform the data into the lower-dimensional space: The dataset is projected onto the selected
k principal components to obtain a lower-dimensional representation of the data.
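The six steps above can be followed directly with NumPy. The function below is an illustrative sketch (the function and variable names are our own), assuming a data matrix X with samples in rows and features in columns:

    import numpy as np

    def pca_numpy(X, k):
        # 1. Standardize the data (zero mean, unit variance per feature)
        X_std = (X - X.mean(axis=0)) / X.std(axis=0)
        # 2. Compute the covariance matrix of the features
        cov = np.cov(X_std, rowvar=False)
        # 3. Eigenvectors and eigenvalues of the (symmetric) covariance matrix
        eigvals, eigvecs = np.linalg.eigh(cov)
        # 4. Sort the eigenvectors in descending order of eigenvalue
        order = np.argsort(eigvals)[::-1]
        eigvecs = eigvecs[:, order]
        # 5. Select the top k eigenvectors
        W = eigvecs[:, :k]
        # 6. Project the data onto the k principal components
        return X_std @ W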



Dimensionality Reduction: PCA Using Scikit-Learn

Scikit-learn is a popular Python library for machine learning, and it provides a simple and efficient
implementation of PCA. Here's how you can use scikit-learn to perform PCA on a dataset:

• Import the necessary modules:

    from sklearn.decomposition import PCA
    from sklearn.datasets import load_iris

• Load a dataset:

    iris = load_iris()
    X = iris.data
    y = iris.target

• Create a PCA object:

    pca = PCA(n_components=2)

  This creates a PCA object that will reduce the dimensionality of the data to 2 components.



Dimensionality Reduction: PCA Using Scikit-Learn
• Fit the PCA model to the data:

    X_pca = pca.fit_transform(X)

  This fits the PCA model to the data and transforms the data into the lower-dimensional subspace.

• Visualize the results:

    import matplotlib.pyplot as plt

    plt.scatter(X_pca[:, 0], X_pca[:, 1], c=y)
    plt.xlabel('PC1')
    plt.ylabel('PC2')
    plt.show()

  This plots the data in the lower-dimensional subspace, with each point colored according to its class label.

That's it! This is a simple example of how to use PCA in scikit-learn. There are many more options and parameters you can use to customize the PCA model, so be sure to check out the scikit-learn documentation for more information.
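One practical note, tied to the earlier point that the number of components to keep is often unknown: a fitted scikit-learn PCA object exposes the fraction of variance captured by each component. A short, optional check (using the pca object from the example above):

    # Fraction of the total variance explained by each retained component
    print(pca.explained_variance_ratio_)
    # A common rule of thumb is to keep enough components for the cumulative
    # sum to reach about 0.95 (fit PCA with more components to inspect this).
    print(pca.explained_variance_ratio_.cumsum())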





Generalized Learning Laws

• The concept of "Generalized Learning Laws" refers to the fundamental principles or rules that govern the process of learning in various domains, including psychology, neuroscience, and artificial intelligence.
• These laws provide insights into how individuals or machines acquire knowledge, adapt to their environment, and improve their performance over time. Several key generalized learning laws have been proposed and are widely recognized.

Learning and Types of Learning rules in ANN
A learning rule is a method or a mathematical logic.
• It enables a neural network to learn from the existing conditions and improve its performance.
• Applying a learning rule is an iterative process.
There are different learning rules in neural networks:

1. Hebbian learning rule: specifies how to modify the weights of the nodes of a network (see the sketch after this list).
2. Perceptron learning rule: the network starts its learning by assigning a random value to each weight.
3. Delta learning rule: the modification to the synaptic weight of a node is equal to the product of the error and the input.
4. Error-correlation learning rule: a supervised rule that uses the error to direct the training.
5. Outstar learning rule: used when the nodes or neurons in a network are assumed to be arranged in a layer.
6. Memory-based learning: uses historical user-rating data to compute the similarity between users or items.
7. Competitive learning: based on the winner-takes-all rule.
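As a concrete illustration of the Hebbian rule above, here is a minimal NumPy sketch (the vectors, learning rate and function name are our own illustrative choices, not part of the original material):

    import numpy as np

    # Hebbian rule: dw = eta * y * x  -- strengthen a weight when the input
    # and the neuron output are active together.
    def hebbian_update(w, x, eta=0.1):
        y = np.dot(w, x)           # linear neuron output
        return w + eta * y * x     # weight change proportional to x * y

    w = np.array([0.2, -0.1, 0.4])     # initial weights
    x = np.array([1.0, 0.5, -1.0])     # one input pattern
    w = hebbian_update(w, x)
    print(w)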
NN&DL by Dr.Satyabrata Dash 15
Self-Organization Feature Maps (Kohonen Self-Organizing Networks)

1. The self-organizing map is an unsupervised learning model proposed for applications in which maintaining a topology between the input and output spaces is required.
2. The Kohonen Self-Organizing Feature Map (SOM) is a neural network trained using competitive learning. Basic competitive learning implies that the competition process takes place before the cycle of learning. The competition process suggests that some criteria select a winning processing element.
3. The self-organizing map makes topologically ordered mappings between the input data and the processing elements of the map.
4. Topological ordering implies that if two inputs have similar characteristics, the most active processing elements responding to those inputs are located close to each other on the map.

Self-Organization Feature Maps (Kohonen Self-Organizing Networks)

1. The weight vectors of the processing elements are organized in ascending or descending order: Wi < Wi+1 for all values of i, or Wi > Wi+1 for all values of i (this definition is valid for a one-dimensional self-organizing map only).
2. The self-organizing map is typically represented as a two-dimensional sheet of processing elements.
3. The processing elements of the network are made competitive in a self-organizing process, and specific criteria pick the winning processing element whose weights are updated.
4. A Self-Organizing Map (SOM) differs from typical ANNs both in its architecture and algorithmic properties. Firstly, its structure comprises a single-layer linear 2D grid of neurons, instead of a series of layers.

Self-Organization Feature Maps (Kohonen Self-Organizing Networks)

1. All the nodes on this grid are connected directly to the input vector, but not to one
another, meaning the nodes do not know the values of their neighbours, and only
update the weight of their connections as a function of the given inputs.

Self-Organization Feature Maps (Kohonen Self-Organizing Networks)

1. The grid itself is the map that organises itself at each iteration as a function of the input data. As such, after clustering, each node has its own (i, j) coordinate, which allows the Euclidean distance between two nodes to be calculated by means of the Pythagorean theorem.
2. A Self-Organizing Map utilizes competitive learning instead of error-correction learning to modify its weights. This implies that only an individual node is activated at each cycle in which the features of an occurrence of the input vector are introduced to the neural network.
3. The selected node, the Best Matching Unit (BMU), is chosen according to the similarity between the current input values and all the other nodes in the network.

Self-Organization Feature Maps (Kohonen Self-Organizing Networks)

Algorithm for training

Step 1 − Initialize the weights, the learning rate α and the neighbourhood topological scheme.
Step 2 − Continue steps 3-9 while the stopping condition is not true.
Step 3 − Continue steps 4-6 for every input vector x.
Step 4 − Calculate the square of the Euclidean distance for j = 1 to m:

    D(j) = Σ (xi − wij)²,  summed over i = 1 to n

Self-Organization Feature Maps (Kohonen Self-Organizing Networks)

Algorithm for training (continued)

Step 5 − Obtain the winning unit J where D(j) is minimum.
Step 6 − Calculate the new weight of the winning unit by the following relation:

    wij(new) = wij(old) + α[xi − wij(old)]

Step 7 − Update the learning rate α by the following relation:

    α(t+1) = 0.5 α(t)

Step 8 − Reduce the radius of the topological scheme.
Step 9 − Check the stopping condition for the network.
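A compact NumPy sketch of steps 1-9 is given below. It is illustrative only: it updates just the winning unit (neighbourhood radius 0, so step 8 is omitted), and the function name, parameters and map shape are our own assumptions rather than part of the original material.

    import numpy as np

    def train_som(X, m, epochs=20, alpha=0.5, seed=0):
        # X: (num_samples, n) input vectors; m: number of map units
        rng = np.random.default_rng(seed)
        n = X.shape[1]
        W = rng.random((m, n))                    # Step 1: initialize weights
        for _ in range(epochs):                   # Step 2: stopping condition
            for x in X:                           # Step 3: every input vector x
                D = ((x - W) ** 2).sum(axis=1)    # Step 4: squared distances D(j)
                J = int(np.argmin(D))             # Step 5: winning unit J
                W[J] += alpha * (x - W[J])        # Step 6: update the winner
            alpha *= 0.5                          # Step 7: decay learning rate
        return W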

Applications of SOM

• Data Visualization
• Clustering
• Anomaly Detection
• Dimensionality Reduction
• Pattern Recognition

Learning Vector Quantization

1. Developed by Kohonen in 1989.
2. It uses supervised learning.
3. There can be several output units per class.
4. Learning vector quantization (LVQ) is an algorithm that is a type of artificial neural network and uses neural computation.
5. More broadly, it can be said to be a type of computational intelligence. This algorithm takes a competitive, winner-takes-all approach to learning and is also related to other neural network algorithms such as the Perceptron and back-propagation.
6. The LVQ algorithm allows one to choose the number of training instances to undergo and then learns what those instances look like.
7. It is a process of classifying patterns in which each output unit represents a class.
8. After completing the training process, LVQ classifies an input vector by assigning it to the same class as that of the winning output unit.

Architecture of Learning Vector Quantization

• The architecture of LVQ is quite similar to that of the Kohonen SOM (KSOM). There are "n" input units and "m" output units, and the layers are fully interconnected, with weights on the connections.

Learning Vector Quantization

Parameters Used

Following are the parameters used in the LVQ training process as well as in the flowchart:
x = training vector (x1, ..., xi, ..., xn)
T = class for training vector x
wj = weight vector for the jth output unit
Cj = class associated with the jth output unit

Learning Vector Quantization

Training Algorithm

Step 1 − Initialize the reference vectors, which can be done as follows:
• From the given set of training vectors, take the first "m" (number of clusters) training vectors and use them as weight vectors; the remaining vectors can be used for training.
• Assign the initial weights and perform the classification randomly.
• Apply the K-means clustering method.
Step 2 − Initialize the learning rate α.
Step 3 − Continue with steps 4-9 if the condition for stopping this algorithm is not met.
Step 4 − Follow steps 5-6 for every training input vector x.
Step 5 − Calculate the square of the Euclidean distance for j = 1 to m:

    D(j) = Σ (xi − wij)²,  summed over i = 1 to n

Learning Vector Quantization

Training Algorithm (continued)

Step 6 − Obtain the winning unit J where D(j) is minimum.
Step 7 − Calculate the new weight of the winning unit by the following relation:

    if T = CJ:  wJ(new) = wJ(old) + α[x − wJ(old)]
    if T ≠ CJ:  wJ(new) = wJ(old) − α[x − wJ(old)]

Step 8 − Reduce the learning rate α.
Step 9 − Test for the stopping condition. It may be as follows:
• Maximum number of epochs reached.
• Learning rate reduced to a negligible value.
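A minimal NumPy sketch of this training loop is given below. It is illustrative only: it uses the simple initialization from Step 1 (the first m labelled samples as reference vectors), and the function name, decay factor and parameters are our own assumptions.

    import numpy as np

    def train_lvq(X, T, m, epochs=20, alpha=0.1):
        # X: (num_samples, n) training vectors; T: their class labels;
        # m: number of output units (reference vectors)
        W = X[:m].astype(float)                      # Step 1: reference vectors
        C = T[:m]                                    # classes of the output units
        for _ in range(epochs):                      # Step 3: stopping condition
            for x, t in zip(X[m:], T[m:]):           # Step 4: remaining vectors
                D = ((x - W) ** 2).sum(axis=1)       # Step 5: squared distances
                J = int(np.argmin(D))                # Step 6: winning unit J
                if t == C[J]:                        # Step 7: move toward x if the
                    W[J] += alpha * (x - W[J])       # class matches, otherwise
                else:                                # move away from x
                    W[J] -= alpha * (x - W[J])
            alpha *= 0.9                             # Step 8: reduce learning rate
        return W, C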

Learning Vector Quantization (Flow Chart)

Maximal Eigenvector Filtering

Thank You

