
Unit I

Soft computing:
* Soft computing is the opposite of hard (conventional/traditional) computing.
* It deals with approximate models and gives solutions to complex real-life problems.
* It provides cost-effective solutions to complex real-life problems for which a hard-computing solution does not exist,
such as:
Robotics
Data Compression
Handwriting Recognition
Image Processing
Automotive Systems and Manufacturing
Decision-support Systems
Power Systems
Fuzzy Logic Control
Machine Learning Applications
Speech and Vision Recognition Systems
Process Control

In effect, the role model for soft computing is the human mind. Soft computing is based on techniques
such as fuzzy logic, genetic algorithms, artificial neural networks, machine learning, and expert systems.

Soft Computing Vs Hard Computing

1. Soft computing can evolve its own programs; hard computing requires a program to be written.

2. Soft computing uses fuzzy (multi-valued) logic; hard computing uses two-valued logic.

3. Soft computing can deal with noisy data; hard computing can deal only with exact data.

4. Soft computing allows parallel computation; hard computing performs sequential computation.

5. Soft computing gives approximate answers; hard computing gives exact/precise answers.

6. Soft computing aims at robustness; hard computing aims at accuracy.

7. Soft computing is also known as computational intelligence; hard computing is also known as conventional computing.
Soft Computing:
* It is a collection of techniques which help to construct a
computationally intelligent system.

Computationally intelligent system

* It is something which must possess human-like expertise.
* It must be able to adapt and learn.
* It should be able to make decisions and take actions.

* Soft computing deals with imprecision, uncertainty, partial truth,
and approximation to achieve tractability, robustness and low solution cost.
* It consists of distinct concepts and techniques which aim to overcome the difficulties
encountered in real-world problems.

These problems result from the fact that our world seems to be imprecise,
uncertain and difficult to categorize.

Soft Computing Consists of

The tools for soft computing:
Fuzzy logic models:
* Fuzzy logic takes human knowledge and
* performs decision making.

Neural networks:
* These have the capability of recognizing
patterns and adapting themselves to
changing environments.

Evolutionary Algorithms (Genetic algorithms):
* These include genetic algorithms, which are search and optimization
techniques inspired by natural evolution.

**********************
Machine Learning:
Machine learning is a branch of artificial intelligence (AI) and computer science
which focuses on the use of data and algorithms to imitate the way that humans learn, gradually
improving its accuracy.
******************
Probabilistic Reasoning:

Probabilistic reasoning is a way of knowledge representation where we apply the concept of probability
to indicate the uncertainty in knowledge.

Hybrid:
When hard computing is combined with soft computing, it is known as hybrid computing.
BNN (Biological Neural Network) Vs ANN (Artificial Neural Network)

Speed:
BNN:
Processes information at a slower rate; response time is
measured in milliseconds.
ANN:
Information is processed at a faster rate; response time is
measured in nanoseconds.
Processing:
BNN: Massively parallel processing.
ANN: Serial processing.
Size & Complexity:
BNN: An extremely intricate and dense network of linked neurons.
ANN: Size and complexity are reduced.

Fault Tolerance:
BNN: Fault tolerant. Information storage is flexible, so new information may be
added by altering the connection strengths without deleting existing information.
ANN: Intolerant of faults. In the event of a system failure, corrupt data cannot be
recovered.

Control Mechanism:
BNN:
There is no unique control mechanism external to the computational task.
ANN:
Computer activity is handled by a control unit.
Artificial Neural Network:
An Artificial Neural Network (ANN) is an information processing paradigm
that is inspired by the brain. ANNs, like people, learn by example. An ANN
is configured for a specific application, such as pattern recognition or data
classification, through a learning process. Learning largely involves
adjustments to the synaptic connections that exist between the neurons.
There are several different architectures for ANNs, each with its own
strengths and weaknesses. Some of the most common architectures
include:
Feedforward Neural Networks:
This is the simplest type of ANN architecture, where the information
flows in one direction from input to output. The layers are fully connected,
meaning each neuron in a layer is connected to all the neurons in the next
layer.
Recurrent Neural Networks (RNNs):
These networks have a “memory” component, where information can
flow in cycles through the network. This allows the network to process
sequences of data, such as time series or speech.
Convolutional Neural Networks (CNNs):
These networks are designed to process data with a grid-like topology,
such as images. The layers consist of convolutional layers, which learn to
detect specific features in the data, and pooling layers, which reduce the
spatial dimensions of the data.
Autoencoders:
These are neural networks that are used for unsupervised learning.
They consist of an encoder that maps the input data to a lower-dimensional
representation and a decoder that maps the representation back to the
original data.
Generative Adversarial Networks (GANs): These are neural networks that
are used for generative modeling. They consist of two parts: a generator
that learns to generate new data samples, and a discriminator that learns to
distinguish between real and generated data.
The model of an artificial neural network can be specified by three entities:
 Interconnections
 Activation functions
 Learning rules

Interconnections:

Interconnection can be defined as the way processing elements (neurons) in an
ANN are connected to each other. Hence, the arrangement of these
processing elements and the geometry of their interconnections are very essential in an
ANN.
These arrangements always have two layers that are common to all
network architectures, the Input layer and output layer where the input layer
buffers the input signal, and the output layer generates the output of the
network. The third layer is the Hidden layer, in which neurons are neither
kept in the input layer nor in the output layer. These neurons are hidden
from the people who are interfacing with the system and act as a black box
to them. By increasing the hidden layers with neurons, the system’s
computational and processing power can be increased but the training
phenomena of the system get more complex at the same time.
There exist five basic types of neuron connection architecture:

1. Single-layer feed-forward network
2. Multilayer feed-forward network
3. Single node with its own feedback
4. Single-layer recurrent network
5. Multilayer recurrent network

1. Single-layer feed-forward network

In this type of network, we have only two layers, the input layer and the
output layer, but the input layer is not counted because no computation is
performed in it. The output layer is formed when different weights
are applied to the input nodes and the cumulative effect per node is taken. After
this, the neurons of the output layer collectively compute the output
signals.
2. Multilayer feed-forward network
This network also has a hidden layer that is internal to the network and
has no direct contact with the external layer. The existence of one or more
hidden layers enables the network to be computationally stronger. It is a feed-
forward network because information flows forward from the inputs, through
the intermediate computations, to determine the output. There are no
feedback connections in which outputs of the model are fed back into itself.
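As an illustration, here is a minimal NumPy sketch of a forward pass through such a multilayer feed-forward network (the layer sizes, random weights, and sigmoid activation are assumptions for the example, not taken from the text):

```python
import numpy as np

def sigmoid(z):
    # squashes values into (0, 1)
    return 1.0 / (1.0 + np.exp(-z))

# assumed example sizes: 3 inputs, 4 hidden neurons, 2 outputs
rng = np.random.default_rng(0)
V = rng.normal(size=(3, 4))   # input -> hidden weights
b_h = np.zeros(4)             # hidden biases
W = rng.normal(size=(4, 2))   # hidden -> output weights
b_o = np.zeros(2)             # output biases

def forward(x):
    # information flows in one direction only: input -> hidden -> output
    hidden = sigmoid(x @ V + b_h)
    output = sigmoid(hidden @ W + b_o)
    return output

print(forward(np.array([0.5, -1.0, 2.0])))
```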

3. Single node with its own feedback


Single Node with own Feedback

When outputs can be directed back as inputs to the same layer or


preceding layer nodes, then it results in feedback networks. Recurrent
networks are feedback networks with closed loops. The above figure shows
a single recurrent network having a single neuron with feedback to itself.
4. Single-layer recurrent network

The above network is a single-layer network with a feedback connection in


which the processing element’s output can be directed back to itself or to
another processing element or both. A recurrent neural network is a class of
artificial neural networks where connections between nodes form a directed
graph along a sequence. This allows it to exhibit dynamic temporal behavior
for a time sequence. Unlike feedforward neural networks, RNNs can use
their internal state (memory) to process sequences of inputs.
5. Multilayer recurrent network

In this type of network, processing element output can be directed to the


processing element in the same layer and in the preceding layer forming a
multilayer recurrent network. They perform the same task for every element
of a sequence, with the output being dependent on the previous
computations. Inputs are not needed at each time step. The main feature of
a Recurrent Neural Network is its hidden state, which captures some
information about a sequence.
McCulloch & Pitts Model

It is very well known that the most fundamental unit of deep


neural networks is called an artificial neuron/perceptron. But the
very first step towards the perceptron we use today was taken in
1943 by McCulloch and Pitts, by mimicking the functionality of a
biological neuron.

Dendrite: Receives signals from other neurons

Soma: Processes the information

Axon: Transmits the output of this neuron


Synapse: Point of connection to other neurons

Basically, a neuron takes an input signal (dendrite), processes it


like the CPU (soma), passes the output through a cable like
structure to other connected neurons (axon to synapse to other
neuron’s dendrite). Now, this might be biologically inaccurate as
there is a lot more going on out there but on a higher level, this
is what is going on with a neuron in our brain — takes an input,
processes it, throws out an output.

The first computational model of a neuron was proposed by
Warren McCulloch (neuroscientist) and Walter Pitts (logician) in 1943.

McCulloch-Pitts Neuron
It may be divided into 2 parts. The first part, g, takes an input
(the dendrite), performs an aggregation, and based on
the aggregated value the second part, f, makes a decision.

Let's suppose that I want to predict my own decision, whether to
watch a random football game or not on TV. The inputs are all
boolean i.e., {0,1} and my output variable is also boolean {1:
Will watch it, 0: Won't watch it}.

So,
 x_1 could be isPremierLeagueOn (I like Premier League more)
 x_2 could be isItAFriendlyGame (I tend to care less about the friendlies)
 x_3 could be isNotHome (Can't watch it when I'm running errands. Can I?)
 x_4 could be isManUnitedPlaying (I am a big Man United fan. GGMU!) and so on.

These inputs can either be excitatory or inhibitory. Inhibitory


inputs are those that have maximum effect on the decision
making irrespective of other inputs i.e., if x_3 is 1 (not home)
then my output will always be 0 i.e., the neuron will never fire,
so x_3 is an inhibitory input. Excitatory inputs are NOT the ones
that will make the neuron fire on their own but they might fire it
when combined together. Formally, this is what is going on:

g(x_1, x_2, …, x_n) = x_1 + x_2 + … + x_n
y = f(g(x)) = 1, if g(x) >= theta
y = f(g(x)) = 0, otherwise

We can see that g(x) is just doing a sum of the inputs, a simple
aggregation, and theta here is called the thresholding parameter. For example, if
I always watch the game when the sum turns out to be 2 or more, then theta is
2 here. This is called Thresholding Logic.
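The thresholding logic above can be written as a tiny Python sketch (the example inputs, the inhibitory flags, and the threshold value are illustrative assumptions):

```python
def mcculloch_pitts(inputs, inhibitory, theta):
    """inputs: list of 0/1 signals; inhibitory: parallel list of flags;
    theta: thresholding parameter."""
    # any active inhibitory input forces the neuron not to fire
    if any(x == 1 and inh for x, inh in zip(inputs, inhibitory)):
        return 0
    g = sum(inputs)                # simple aggregation g(x)
    return 1 if g >= theta else 0  # f fires when the sum reaches theta

# example: watch the game when at least 2 excitatory conditions hold,
# unless x_3 (isNotHome) is on, which is inhibitory
x = [1, 0, 0, 1]                   # x_1..x_4 from the text
inhib = [False, False, True, False]
print(mcculloch_pitts(x, inhib, theta=2))   # -> 1
```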
Perceptron

Perceptron was introduced by Frank Rosenblatt in 1957. He proposed a Perceptron learning
rule based on the original MCP neuron. A Perceptron is an algorithm for supervised learning
of binary classifiers. This algorithm enables neurons to learn and process elements in the
training set one at a time.
Basic Components of Perceptron

Perceptron is a type of artificial neural network, which is a fundamental concept in machine learning.
The basic components of a perceptron are:

 Input Layer: The input layer consists of one or more input neurons, which receive input signals
from the external world or from other layers of the neural network.

 Weights: Each input neuron is associated with a weight, which represents the strength of the
connection between the input neuron and the output neuron.

 Bias: A bias term is added to the input layer to provide the perceptron with additional flexibility
in modeling complex patterns in the input data.

 Activation Function: The activation function determines the output of the perceptron based on
the weighted sum of the inputs and the bias term. Common activation functions used in perceptrons
include the step function, sigmoid function, and ReLU function.

 Output: The output of the perceptron is a single binary value, either 0 or 1, which indicates the
class or category to which the input data belongs.

 Training Algorithm: The perceptron is typically trained using a supervised learning algorithm
such as the perceptron learning algorithm or backpropagation. During training, the weights and
biases of the perceptron are adjusted to minimize the error between the predicted output and the true
output for a given set of training examples.

 Overall, the perceptron is a simple yet powerful algorithm that can be used to perform binary
classification tasks and has paved the way for more complex neural networks used in deep learning
today.

Types of Perceptron:

1. Single layer: A single-layer perceptron can learn only linearly separable patterns.

2. Multilayer: Multilayer perceptrons, with two or more layers, have greater
processing power.

The Perceptron algorithm learns the weights for the input signals in order to draw a linear decision
boundary.
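For illustration, here is a minimal sketch of the perceptron learning rule on a small assumed dataset (the AND-gate data, learning rate, and epoch count are examples, not from the text):

```python
import numpy as np

# assumed toy data: the AND gate, which is linearly separable
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]])
t = np.array([0, 0, 0, 1])

w = np.zeros(2)   # weights for the input signals
b = 0.0           # bias
lr = 0.1          # assumed learning rate

def predict(x):
    # step activation on the weighted sum
    return 1 if np.dot(w, x) + b > 0 else 0

for epoch in range(10):
    for x, target in zip(X, t):
        error = target - predict(x)
        # perceptron rule: nudge the weights in the direction that reduces the error
        w += lr * error * x
        b += lr * error

print([predict(x) for x in X])   # -> [0, 0, 0, 1]
```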
Adaline (Adaptive Linear Neuron):
 A network with a single linear unit is called Adaline (Adaptive Linear Neuron). A
unit with a linear activation function is called a linear unit. In Adaline, there is only
one output unit and output values are bipolar (+1, -1). Weights between the input
units and the output unit are adjustable. It uses the delta rule, i.e.
Δw_i = α(t − y_in)x_i, where w_i, y_in and t are the weight, predicted output,
and true value respectively.
 The learning rule is found to minimize the mean square error between
activation and target values. Adaline consists of trainable weights; it compares
the actual output with the calculated output, and based on the error a training algorithm is
applied.

First, calculate the net input to your Adaline network, then apply the activation
function to its output and compare it with the original (target) output. If both are equal,
give the output; else send an error back to the network and update the weights
according to the error, which is calculated by the delta learning rule, i.e.
Δw_i = α(t − y_in)x_i.
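A minimal sketch of Adaline training with the delta rule (the bipolar AND data, learning rate, and epoch count are assumptions for illustration):

```python
import numpy as np

# assumed toy data with bipolar targets (+1 / -1): the AND function
X = np.array([[1, 1], [1, -1], [-1, 1], [-1, -1]], dtype=float)
t = np.array([1, -1, -1, -1], dtype=float)

w = np.zeros(2)
b = 0.0
alpha = 0.1   # assumed learning rate

for epoch in range(20):
    for x, target in zip(X, t):
        y_in = np.dot(w, x) + b   # net input of the single linear unit
        error = target - y_in     # compare target with net input
        w += alpha * error * x    # delta rule: Δw_i = α(t − y_in)x_i
        b += alpha * error

# classify with the bipolar activation applied to the net input
outputs = np.where(X @ w + b >= 0, 1, -1)
print(outputs)   # -> [ 1 -1 -1 -1]
```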
Architecture:

In Adaline, all the input neurons are directly connected to the output neuron
through weighted connections. There is also a bias b whose activation is always 1.
Madaline (Multiple Adaptive Linear Neuron):

 The Madaline (supervised learning) model consists of many Adalines in parallel
with a single output unit. The Adaline layer is present between the input layer and
the Madaline (output) layer; hence the Adaline layer is a hidden layer. The weights between
the input layer and the hidden layer are adjusted, while the weights between the hidden
layer and the output layer are fixed.
 It may use the majority-vote rule, so the output is an answer of either true
or false. Adaline and Madaline layer neurons have a bias of '1' connected to them.
The use of multiple Adalines helps counter the problem of non-linear separability.

There are three types of layers present in Madaline, as sketched below. The first, the input layer, contains all the
input neurons; the second, the hidden layer, consists of an Adaline layer, and the weights
between the input and hidden layers are adjustable; the third layer is the output
layer, and the weights between the hidden and output layer are fixed, not adjustable.
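A hedged sketch of a Madaline forward pass: several Adaline units in parallel feed a fixed output unit that applies the majority-vote rule. The hidden Adaline weights shown are arbitrary example values (in practice they would be trained):

```python
import numpy as np

def adaline_unit(x, w, b):
    # bipolar activation on the net input of one Adaline
    return 1 if np.dot(w, x) + b >= 0 else -1

def madaline(x, hidden_weights, hidden_biases):
    # adjustable hidden layer: several Adalines in parallel ...
    z = [adaline_unit(x, w, b) for w, b in zip(hidden_weights, hidden_biases)]
    # ... feeding a fixed output unit that applies the majority-vote rule
    return 1 if sum(z) >= 0 else -1

# assumed example weights for three hidden Adalines over two bipolar inputs
W_hidden = [np.array([1.0, 1.0]), np.array([1.0, -1.0]), np.array([-1.0, 1.0])]
b_hidden = [-0.5, -0.5, -0.5]

print(madaline(np.array([1.0, 1.0]), W_hidden, b_hidden))
```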
UNIT II
How the Brain Works: Neurons as Simple Computing Elements

Compared to how computers traditionally work, neural networks have certain


special features:

Neural network key feature 1


For one, in a traditional computer, information is processed in a central processor
(aptly named the central processing unit, or CPU for short) which can only focus on
doing one thing at a time. The CPU can retrieve data to be processed from the
computer’s memory, and store the result in the memory. Thus, data storage and
processing are handled by two separate components of the computer: the memory
and the CPU. In neural networks, the system consists of a large number of
neurons, each of which can process information on its own so that instead of
having a CPU process each piece of information one after the other, the neurons
process vast amounts of information simultaneously.
Neural network key feature 2
The second difference is that data storage (memory) and processing isn’t
separated like in traditional computers. The neurons both store and process
information so that there is no need to retrieve data from the memory for
processing. The data can be stored short term in the neurons themselves (they
either fire or not at any given time) or for longer term storage, in the connections
between the neurons – their so called weights, which we will discuss below.
Because of these two differences, neural networks and traditional computers are
suited for somewhat different tasks. Even though it is entirely possible to simulate
neural networks in traditional computers, which was the way they were used for a
long time, their maximum capacity is achieved only when we use special hardware
(computer devices) that can process many pieces of information at the same time.
This is called parallel processing. Incidentally, graphics processors (or graphics
processing units, GPUs) have this capability and they have become a cost-effective
solution for running massive deep learning methods.
Perceptron

Perceptron was introduced by Frank Rosenblatt in 1957. He proposed a Perceptron learning
rule based on the original MCP neuron. A Perceptron is an algorithm for supervised learning
of binary classifiers. This algorithm enables neurons to learn and process elements in the
training set one at a time.

How does Perceptron work?


A Perceptron is considered a single-layer neural network that consists of four
main parameters: input values (input nodes), weights and bias, net sum,
and an activation function. The perceptron model begins with the multiplication of
all input values by their weights, then adds these values together to create the
weighted sum. This weighted sum is then applied to the activation function 'f' to
obtain the desired output. This activation function is also known as the step
function and is represented by 'f'.
This step function or activation function plays a vital role in ensuring that the output
is mapped between the required values (0,1) or (-1,1). It is important to note that the
weight of an input is indicative of the strength of a node. Similarly, an input's bias
value gives the ability to shift the activation function curve up or down.

Perceptron model works in two important steps as follows:

Step-1

In the first step, multiply all input values by their corresponding weight values
and then add them to determine the weighted sum. Mathematically, we can
calculate the weighted sum as follows:

∑wi*xi = x1*w1 + x2*w2 + … + xn*wn

Add a special term called bias 'b' to this weighted sum to improve the model's
performance:

∑wi*xi + b

Step-2

In the second step, an activation function is applied to the above-mentioned
weighted sum, which gives us an output either in binary form or as a continuous value
as follows:

Y = f(∑wi*xi + b)
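These two steps can be expressed as a short Python sketch (the example inputs, weights, and bias are assumed values):

```python
# assumed example values for a perceptron with three inputs
x = [1.0, 0.0, 1.0]           # input values
w = [0.5, -0.25, 0.25]        # weights
b = -0.5                      # bias

# Step 1: weighted sum plus bias
net = sum(wi * xi for wi, xi in zip(w, x)) + b

# Step 2: step activation f maps the net sum to a binary output
Y = 1 if net > 0 else 0
print(net, Y)   # -> 0.25 1
```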

Types of Perceptron:

1. Single layer: A single-layer perceptron can learn only linearly separable patterns.

2. Multilayer: Multilayer perceptrons, with two or more layers, have greater
processing power.

Single Layer Perceptron Model:

This is one of the simplest types of artificial neural network (ANN). A single-
layered perceptron model consists of a feed-forward network and also includes a
threshold transfer function inside the model. The main objective of the single-
layer perceptron model is to analyze linearly separable objects with binary
outcomes.

In a single-layer perceptron model, the algorithm does not contain recorded data, so
it begins with randomly allocated values for the weight parameters. Further, it sums
up all the weighted inputs. If the total sum of all inputs is more
than a pre-determined value, the model gets activated and shows the output
value as +1.

If the outcome matches the pre-determined (threshold) value, then the
performance of this model is stated as satisfied, and the weights do not
change. However, this model shows a few discrepancies when
multiple weighted input values are fed into the model. Hence, to obtain the desired output
and minimize errors, some changes to the weights are necessary.

"A single-layer perceptron can learn only linearly separable patterns."


Multi-Layered Perceptron Model:

Like a single-layer perceptron model, a multi-layer perceptron model has the
same basic structure but with a greater number of hidden layers.

The multi-layer perceptron model is trained with the backpropagation algorithm,
which executes in two stages as follows:

o Forward Stage: Activation functions start from the input layer in the forward
stage and terminate on the output layer.

o Backward Stage: In the backward stage, weight and bias values are
modified as per the model's requirement. In this stage, the error between the
actual output and the desired output is propagated backward, starting at the output layer and
ending at the input layer.

Hence, a multi-layered perceptron model can be considered as multiple artificial
neural network layers in which the activation function does not remain linear,
unlike in a single-layer perceptron model. Instead of a linear function, the
activation function can be a sigmoid, TanH, ReLU, etc.

A multi-layer perceptron model has greater processing power and can process
linear and non-linear patterns. Further, it can also implement logic gates such as
AND, OR, XOR, NAND, NOT, XNOR and NOR.
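For example, a two-input MLP with one hidden layer and step activations can compute XOR, which a single-layer perceptron cannot. The weights below are one hand-picked solution, shown only as an illustration:

```python
import numpy as np

def step(z):
    # binary step activation
    return (z > 0).astype(int)

def xor_mlp(x1, x2):
    x = np.array([x1, x2])
    # hidden layer: first unit behaves like OR, second like AND (hand-picked weights)
    h = step(np.array([x @ np.array([1, 1]) - 0.5,
                       x @ np.array([1, 1]) - 1.5]))
    # output unit: "OR and not AND", i.e. XOR
    return int(step(np.array([h @ np.array([1, -1]) - 0.5]))[0])

for a in (0, 1):
    for b in (0, 1):
        print(a, b, xor_mlp(a, b))   # prints the XOR truth table
```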

Advantages of Multi-Layer Perceptron:

o A multi-layered perceptron model can be used to solve complex non-linear


problems.

o It works well with both small and large input data.

o It helps us to obtain quick predictions after the training.

o It helps to obtain the same accuracy ratio with large as well as small data.

Disadvantages of Multi-Layer Perceptron:

o In Multi-layer perceptron, computations are difficult and time-consuming.

o In multi-layer Perceptron, it is difficult to predict how much the dependent


variable affects each independent variable.

o The model functioning depends on the quality of the training.

Perceptron Function
The perceptron function 'f(x)' is obtained by multiplying the input 'x'
with the learned weight coefficient 'w' and adding the bias 'b'.

Mathematically, we can express it as follows:

f(x) = 1 if w.x + b > 0

otherwise, f(x) = 0

o 'w' represents real-valued weights vector

o 'b' represents the bias

o 'x' represents a vector of input x values.

Characteristics of Perceptron
The perceptron model has the following characteristics.

1. Perceptron is a machine learning algorithm for supervised learning of binary


classifiers.
2. In Perceptron, the weight coefficient is automatically learned.

3. Initially, weights are multiplied with input features, and the decision is made
whether the neuron is fired or not.

4. The activation function applies a step rule to check whether the weight
function is greater than zero.

5. The linear decision boundary is drawn, enabling the distinction between the
two linearly separable classes +1 and -1.

6. If the added sum of all input values is more than the threshold value, it must
have an output signal; otherwise, no output will be shown.
Back Propagation Neural Networks
A Back Propagation Network (BPN) is a multilayer neural network consisting of an input
layer, at least one hidden layer and an output layer. As its name suggests, back
propagation takes place in this network. The error, which is calculated at the output
layer by comparing the target output and the actual output, is propagated back
towards the input layer.
Architecture
As shown in the diagram, the architecture of BPN has three interconnected layers
having weights on them. The hidden layer as well as the output layer also has bias,
whose weight is always 1, on them. As is clear from the diagram, the working of
BPN is in two phases. One phase sends the signal from the input layer to the output
layer, and the other phase back propagates the error from the output layer to the
input layer.

Training Algorithm
For training, BPN uses the binary sigmoid activation function. The
training of BPN has the following three phases.
 Phase 1 − Feed Forward Phase
 Phase 2 − Back Propagation of error
 Phase 3 − Updating of weights
All these steps are combined in the algorithm as follows.

Step 1 − Initialize the following to start the training −

1. Weights
2. Learning rate α
For easy calculation and simplicity, take some small random values.

Step 2 − Continue steps 3-11 while the stopping condition is not
true.

Step 3 − Continue steps 4-10 for every training pair.

Phase 1
Step 4 − Each input unit receives the input signal xi and sends it to
the hidden units, for all i = 1 to n.

Step 5 − Calculate the net input at the hidden unit using the
following relation −

z_inj = b0j + ∑ xi vij   (sum over i = 1 to n, for j = 1 to p)

Here b0j is the bias on the hidden unit and vij is the weight on the j-th unit of the
hidden layer coming from the i-th unit of the input layer.

Now calculate the net output by applying the following activation
function −

zj = f(z_inj)

Send these output signals of the hidden layer units to the output
layer units.

Step 6 − Calculate the net input at the output layer unit using the
following relation −

y_ink = b0k + ∑ zj wjk   (sum over j = 1 to p, for k = 1 to m)

Here b0k is the bias on the output unit and wjk is the weight on the k-th unit of
the output layer coming from the j-th unit of the hidden layer.

Calculate the net output by applying the following activation function −

yk = f(y_ink)
Phase 2
Step 7 − Compute the error-correcting term, in correspondence with
the target pattern received at each output unit, as follows −

δk = (tk − yk) f'(y_ink)

On this basis, update the weight and bias as follows −

Δwjk = α δk zj
Δb0k = α δk

Then, send δk back to the hidden layer.
Step 8 − Now each hidden unit sums its delta inputs
from the output units −

δ_inj = ∑ δk wjk   (sum over k = 1 to m)

The error term can be calculated as follows −

δj = δ_inj f'(z_inj)

On this basis, update the weight and bias as follows −

Δvij = α δj xi
Δb0j = α δj
Phase 3
Step 9 − Each output unit (yk, k = 1 to m) updates its weights and
bias as follows −

wjk(new) = wjk(old) + Δwjk
b0k(new) = b0k(old) + Δb0k

Step 10 − Each hidden unit (zj, j = 1 to p) updates its weights and
bias as follows −

vij(new) = vij(old) + Δvij
b0j(new) = b0j(old) + Δb0j

Step 11 − Check for the stopping condition, which may be either
reaching the specified number of epochs or the target output matching the
actual output.
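The three phases can be sketched in a few lines of NumPy for one hidden layer with the binary sigmoid activation (the XOR training data, layer sizes, learning rate, and epoch count are assumptions for illustration):

```python
import numpy as np

def f(x):                 # binary sigmoid activation
    return 1.0 / (1.0 + np.exp(-x))

def f_prime(out):         # derivative expressed via the activation output
    return out * (1.0 - out)

# assumed training data: the XOR problem
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
T = np.array([[0], [1], [1], [0]], dtype=float)

rng = np.random.default_rng(1)
V, b0j = rng.uniform(-0.5, 0.5, (2, 4)), np.zeros(4)   # input -> hidden
W, b0k = rng.uniform(-0.5, 0.5, (4, 1)), np.zeros(1)   # hidden -> output
alpha = 0.5                                            # learning rate

for epoch in range(5000):
    for x, t in zip(X, T):
        # Phase 1: feed forward
        z = f(x @ V + b0j)
        y = f(z @ W + b0k)
        # Phase 2: back propagation of error
        delta_k = (t - y) * f_prime(y)
        delta_j = (delta_k @ W.T) * f_prime(z)
        # Phase 3: updating of weights and biases
        W += alpha * np.outer(z, delta_k); b0k += alpha * delta_k
        V += alpha * np.outer(x, delta_j); b0j += alpha * delta_j

print(np.round(f(f(X @ V + b0j) @ W + b0k), 2))   # approaches [[0], [1], [1], [0]]
```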
ANN – Bidirectional Associative Memory (BAM)

Bidirectional Associative Memory (BAM) is a supervised learning model in
Artificial Neural Networks. It is a hetero-associative memory: for an input
pattern, it returns another pattern which is potentially of a different size. This
phenomenon is very similar to the human brain. Human memory is
necessarily associative. It uses a chain of mental associations to recover a
lost memory, like associations of faces with names or, in exams, of questions with
answers, etc.
For such memory associations of one type of object with another,
a Recurrent Neural Network (RNN) is needed to receive a pattern of one
set of neurons as an input and generate a related, but different, output
pattern of another set of neurons.
Why is BAM required?
The main objective of introducing such a network model is to store hetero-
associative pattern pairs.
It is used to retrieve a pattern given a noisy or incomplete pattern.
BAM Architecture:
When BAM accepts an input n-dimensional vector X from set A, the
model recalls the m-dimensional vector Y from set B. Similarly, when Y is
treated as input, the BAM recalls X.
 Storage (Learning): In this learning step of BAM, the weight matrix is
calculated between M pairs of patterns (fundamental memories), which are
stored in the synaptic weights of the network following the equation

W = ∑ Xm Ym^T   (sum over m = 1 to M)

 Testing: We have to check that the BAM recalls Ym perfectly for a
corresponding Xm and recalls Xm for a corresponding Ym. Using

Ym = sign(W^T Xm),  Xm = sign(W Ym),  for m = 1 to M

all pairs should be recalled accordingly.

 Retrieval: Present an unknown vector X (a corrupted or incomplete
version of a pattern from set A or B) to the BAM and retrieve a
previously stored association:

 Initialize the BAM: X(0) = X, p = 0

 Calculate the BAM output at iteration p:

Y(p) = sign(W^T X(p))

 Update the input vector X(p):

X(p+1) = sign(W Y(p))

 Repeat the iteration until convergence, when the input and
output remain unchanged.
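A minimal sketch of BAM storage and recall with bipolar patterns (the two example pattern pairs and the corrupted probe are assumed for illustration):

```python
import numpy as np

def sign(v):
    # bipolar threshold; map zeros to +1 for simplicity in this sketch
    return np.where(v >= 0, 1, -1)

# assumed fundamental memories: two (X, Y) pairs of different sizes
X_pats = np.array([[1, 1, -1, -1], [-1, -1, 1, 1]])
Y_pats = np.array([[1, -1], [-1, 1]])

# Storage: W = sum over pairs of the outer products Xm Ym^T
W = sum(np.outer(x, y) for x, y in zip(X_pats, Y_pats))

# Retrieval from a corrupted version of the first X pattern
x = np.array([1, -1, -1, -1])
for _ in range(5):                 # iterate until the pair stabilizes
    y = sign(W.T @ x)              # forward pass:  Y = sign(W^T X)
    x = sign(W @ y)                # backward pass: X = sign(W Y)

print(x, y)   # recovers the stored pair [1, 1, -1, -1] and [1, -1]
```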
Limitations of BAM:
1. Storage capacity of the BAM: In the BAM, the number of stored
associations should not exceed the number of neurons in the
smaller layer.
2. Incorrect convergence: The BAM may not always produce the
closest association.
Hopfield Neural Network

Introduction
During your daily schedule, various waves of networks propagate
through your brain. The entire day is formed from the dynamic behaviour
of these networks. Neural Networks are artificial intelligence methods
that teach computers to process the data like our brains. These networks
use interconnected nodes that resemble neurons of the human brain.
This method enables computers to make intelligent decisions by learning
from their mistakes. The Hopfield network is one such type of recurrent
neural network method.

The sequence of this section will be as follows:

We will begin with an introduction to the Hopfield neural network. Then
we will discuss the architecture, energy function, and training model
of the Hopfield network.
Introduction to Hopfield Network
A Hopfield network is a particular type of single-layered recurrent neural network.
Dr. John J. Hopfield invented it in 1982. These networks were introduced
to collect and retrieve memory and store various patterns. Also, auto-
association and optimization tasks can be performed using these
networks. In this network, each node is fully connected (recurrent) to the
other nodes. These nodes exist in only two states: ON (1) or OFF (0).
These states can be restored based on the input received from other
nodes. Unlike other neural networks, the output of the Hopfield network
is finite. Also, the input and output sizes must be the same in these
networks.

The Hopfield network consists of associative memory. This memory


allows the system to retrieve the memory using an incomplete portion.
The network can restore the closest pattern using the data captured in
associative memory. This feature of Hopfield networks makes it a good
candidate for pattern recognition.
Associative memory is a content addressable memory that establishes a
relation between the input vector and the output target vector. It enables
the reallocation of data stored in the memory based on its similarity with
the input vector.
Hopfield networks are categorized into two categories. These are:
Discrete Networks
These networks give either of two discrete outputs. Based on the
output, there are two further types:
 Binary: In this type, the output is either 0 or 1.
 Bipolar: In bipolar networks, the output is either -1 (when output <
0) or 1 (when output > 0).
Continuous Networks
Instead of a binary or bipolar output, the output value lies
between 0 and 1.
The Architecture of Hopfield Network
The architecture of the Hopfield network consists of the following
elements:
1. Individual nodes preserve their states until an update is
required.
2. The node to be updated is selected randomly.
3. Each node is connected to all other nodes except itself.
4. The state of each node is either 0/1 or 1/-1.
5. The Hopfield network structure is symmetric, i.e., Wij = Wji for
all i's and j's.

A sample architecture of a Hopfield network having three nodes
can be described as follows. In such a network, each symbol represents:
x1, x2, x3 - the inputs.
y1, y2, y3 - the output obtained from each node.
Wij - the weight associated with the connection from node i to node j.

Energy Function in Hopfield Network

In Hopfield networks, there are two different types of updation.
o Synchronous: Updating all the nodes simultaneously each time.
o Asynchronous: Updating only one node at a time. That node is
selected randomly or based on specific rules.

In asynchronous updation, each state of the Hopfield network is associated
with an energy value. The value is obtained from a function, and that
function is called an energy function. This function can only decrease or
remain unchanged during updation. The energy function of the Hopfield
network is defined as:

E = -(1/2) ∑i ∑j Wij yi yj   (i ≠ j)

A network is considered to be in a stable state if the energy function
tends to a minimum.

Training Model of the Hopfield Network

Training a Hopfield network refers to lowering the energy of each stored state.
The training model consists of a training and a testing algorithm. Let's
discuss each of them one by one.
Training Algorithm
It is based on the Hebbian principle, which Donald Hebb gave in 1949. In
the training algorithm, weights are updated using a specific rule. The rule is
different for binary and bipolar inputs. For storing a set of 'N' input patterns
[ X(n) where n = 1 to N ], the rules followed are:
o Binary Input: Wij = ∑ [2xi(n) − 1][2xj(n) − 1]  (sum over n = 1 to N), for i ≠ j, with Wii = 0.
o Bipolar Input: Wij = ∑ xi(n) xj(n)  (sum over n = 1 to N), for i ≠ j, with Wii = 0.
Testing Algorithm
The testing algorithm involves several steps. These steps are:
o Initialize the weights using the above training algorithm rules.
o Follow steps 3 to 7 for each input vector 'Xi'.
o Assign the value of the external input vector 'Xi' to the initial activations 'Yi',
for all i = 1 to N.
o Follow steps 5 to 7 for each unit 'Yi'.
o Calculate the network's net input 'Yin_i' using the rule:
Yin_i = xi + ∑j yj Wji
o Apply the activation function over the net input to obtain the output 'Yi'.
o Broadcast the obtained output 'Yi' to all other units and update the
activation vector.
o Test the network for convergence.
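A compact sketch tying these pieces together for a discrete bipolar Hopfield network: Hebbian storage, asynchronous recall, and the energy function (the stored patterns and the corrupted probe are assumed examples):

```python
import numpy as np

# assumed bipolar patterns to store (fundamental memories)
patterns = np.array([[1, -1, 1, -1, 1, -1],
                     [1, 1, 1, -1, -1, -1]])
n = patterns.shape[1]

# Hebbian training rule for bipolar input: Wij = sum_n xi(n) xj(n), Wii = 0
W = sum(np.outer(p, p) for p in patterns).astype(float)
np.fill_diagonal(W, 0)

def energy(y):
    # E = -(1/2) * sum_ij Wij yi yj  (the diagonal is zero)
    return -0.5 * y @ W @ y

def recall(x, steps=50, seed=0):
    rng = np.random.default_rng(seed)
    y = x.copy()
    for _ in range(steps):
        i = rng.integers(n)                 # asynchronous update: one random node
        y[i] = 1 if W[i] @ y >= 0 else -1   # the energy never increases
    return y

probe = np.array([1, -1, -1, -1, 1, -1])     # corrupted version of the first pattern
print(energy(probe), energy(recall(probe)))  # energy decreases toward a stored state
print(recall(probe))                         # -> [ 1 -1  1 -1  1 -1]
```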
