
CSE 412: PRINCIPLES OF SOFT COMPUTING

Dr. Kakumani K C Deepthi



UNIT I: INTRODUCTION TO SOFT COMPUTING,
ARTIFICIAL NEURAL NETWORK (ANN)

• Fundamentals of ANN, basic models of an artificial neuron, neural network architectures, learning methods, terminologies of ANN, Hebb network.
• Supervised learning networks: Perceptron, Adaline, Madaline.
• Back-propagation network: back-propagation learning, effect of tuning parameters of the back-propagation network.


Concept of Computing

Basics of computing

• y = f(x), where f is a mapping function.
• f is also called a formal method or an algorithm to solve a problem.
Important characteristics of Computing

• It should provide a precise solution.
• The control action should be unambiguous and accurate.
• It is suitable for problems that are easy to model mathematically.


Hard Computing
• In 1996, L. A. Zadeh (LAZ) introduced the term "hard computing".
• According to LAZ, we term a computing approach "hard" computing if
  • a precise result is guaranteed;
  • the control action is unambiguous;
  • the control action is formally defined (i.e., with a mathematical model or algorithm).
• Examples
  • Solving numerical problems (e.g., roots of polynomials, integration)
  • Searching and sorting techniques
  • Solving computational geometry problems (e.g., shortest tour in graph theory, finding the closest pair of points)
• It is based on precise modelling and analysis to yield accurate results.
• It works well for simple problems.
Soft Computing

• It does not require any mathematical modelling of the problem.
• It may not yield a precise solution.
• Its algorithms are adaptive (i.e., they can adjust to changes in a dynamic environment).
• It uses biologically inspired methodologies such as genetics, evolution, ant behaviour, particle swarming, the human nervous system, etc.
• Soft computing is the foundation of conceptual intelligence in machines.
• Unlike hard computing, soft computing is tolerant of imprecision, uncertainty, partial truth and approximation.
• It uses inexact methods to give useful but inexact answers to intractable problems.
• It is well suited for real-world problems where ideal models are not available.




Examples of soft computing
• Handwritten character recognition (Artificial Neural Networks)
• Money allocation problem (Evolutionary Computing)
• Robot movement (Fuzzy Logic)


Components of Soft Computing

• Components of soft computing include
  • Neural Networks (NN)
  • Fuzzy Logic (FL)
  • Evolutionary Computation (EC) - based on the origin of the species
    • Genetic Algorithms
    • Swarm Intelligence
    • Bacteria Foraging Optimization
Soft vs. Hard Computing
• Hard computing requires a precisely stated analytical model; soft computing does not require mathematical modelling of the problem.
• Hard computing guarantees a precise result; soft computing may yield only an approximate solution.
• Hard computing is intolerant of imprecision and uncertainty; soft computing is tolerant of imprecision, uncertainty, partial truth and approximation.
• Hard computing works well for simple, well-defined problems; soft computing is well suited for real-world problems where ideal models are not available.


Hybrid Computing

It is a combination of conventional hard computing and emerging soft computing techniques.

Concept of hybrid computing


Self-study
• Limitation(s) in HC and SC
• Examples of (only) Hard computing and (only) Soft computing
• Examples of Hybrid computing



ARTIFICIAL NEURAL NETWORK (ANN)



Why study Artificial Neural Networks (ANN)?
• ANNs are extremely powerful computational devices (Turing-equivalent universal computers).
• Massive parallelism makes them very efficient.
• ANNs can learn and generalize from training data, so there is no need for enormous feats of programming.
• They are particularly fault tolerant; this is equivalent to the "graceful degradation" found in biological systems.
• They are very noise tolerant, so they can cope with situations where normal symbolic systems would have difficulty.
• In principle, they can do anything a symbolic/logic system can do, and more (in practice, getting them to do it can be rather difficult...).
Applications of ANN
• Brain modeling
  • Models of human development: help children with developmental problems
  • Neuropsychological models: suggest remedial actions for brain-damaged patients
• Real-world applications
  • Financial modelling: predicting stocks, shares, currency exchange rates
  • Other time series prediction: climate, weather, airline marketing tactician
  • Computer games: intelligent agents, backgammon, first-person shooters
  • Control systems: autonomous adaptable robots, microwave controllers
  • Pattern recognition: speech recognition, handwriting recognition, sonar signals
  • Data analysis: data compression, data mining
  • Noise reduction: function approximation, ECG noise reduction
  • Bioinformatics: protein secondary structure, DNA sequencing
Biological Nervous System

• The biological nervous system is the most important part of many living things, in particular human beings.
• There is a part called the brain at the center of the human nervous system.
• In fact, any biological nervous system consists of a large number of interconnected processing units called neurons.
• Each neuron is approximately 10 μm long, and neurons can operate in parallel.
• Typically, a human brain consists of approximately 10^11 neurons communicating with each other with the help of electrical impulses.


Brain: Center of the nervous system



Biological nervous system



Neuron and its working

A neuron has different parts: dendrite, soma, axon and synapse.
• Dendrite: a bush of very thin fibres.
• Axon: a long cylindrical fibre.
• Soma: also called the cell body; it acts like the nucleus of the cell.
• Synapse: a junction where an axon makes contact with the dendrites of neighboring neurons.


Neuron and its working
• There is a chemical in each neuron called a neurotransmitter. A signal (also called a sense) is transmitted across neurons by this chemical.
• That is, all inputs from other neurons arrive at a neuron through its dendrites.
• These signals are accumulated at the synapses of the neuron and then serve as the output to be transmitted through the neuron.
• An action may produce an electrical impulse, which usually lasts for about a millisecond.
• Note that this pulse is generated due to an incoming signal, and a signal will not produce a pulse in the axon unless it crosses a threshold value.


Neuron and its working

• Note that a biological neuron receives all inputs through the dendrites, sums them and produces an output if the sum is greater than a threshold value.
• The input signals are passed on to the cell body (soma) through the synapses, which may accelerate or retard an arriving signal. It is this acceleration or retardation of the input signals that is modeled by the weights.
• An effective synapse, which transmits a stronger signal, will have a correspondingly larger weight, while a weak synapse will have a smaller weight.
• Thus, weights here are multiplicative factors of the inputs that account for the strength of the synapse.


Observations

• A neuron can receive many signals


• Signals may be modified by weights at the receiving dendrites
• A neuron sums its weighted inputs
• Under appropriate circumstances the neuron can transmit an output signal
• The output of a neuron can go to many other neurons
• Information processing is local
• Memory is distributed (short term = signals, long term = neuron synapse weights)
• The synapse strength may be modified through experience
• The neurotransmitters may be inhibitory or excitatory
Artificial Neural Network (ANN)
• The human brain incorporates nearly 10 billion neurons and 60 trillion connections (synapses) between them. By using multiple neurons simultaneously, the brain can perform its functions much faster than the fastest computers in existence today.
• Each neuron has a very simple structure, but an army of such elements constitutes tremendous processing power.
• Artificial neural networks (ANNs), or neural networks (NNs), are simplified models of the biological nervous system and have therefore been motivated by the kind of computing performed by the human brain.


Artificial Neural Network

• Our brain can be considered a highly complex, nonlinear and parallel information processing system.
• Information is stored and processed in a neural network simultaneously throughout the whole network rather than at specific locations. In other words, in neural networks both data and its processing are global rather than local.
• Learning is a fundamental and essential characteristic of biological neural networks. The ease with which they can learn led to attempts to emulate a biological neural network in a computer.


Artificial Neural Network

• ANNs have been developed as generalizations of mathematical models of human cognition or neural biology, based on the assumptions that
  • information processing occurs at many simple elements called neurons;
  • signals are passed between neurons over connection links;
  • each connection link has an associated weight, which, in a typical neural net, multiplies the signal transmitted;
  • each neuron applies an activation function to its net input to determine its output signal.


Analogy between Biological and Artificial Neural
Networks



Comparison of Brains and Traditional computers



The neuron as a simple computing element



The neuron as a simple computing element

• The neuron computes the weighted sum of the input signals and compares the result with a threshold value θ (theta).
• If the net input is less than the threshold, the neuron output is -1. But if the net input is greater than or equal to the threshold, the neuron becomes activated and its output attains the value +1.
• The neuron uses the following transfer or activation function (φ), where X is the weighted sum of the inputs:

  y = +1 if X ≥ θ;  y = -1 if X < θ

• This type of activation function is called a sign function.
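A minimal Python sketch of such a unit (the inputs, weights and threshold below are illustrative values, not taken from any slide):

```python
def sign_neuron(inputs, weights, theta):
    """Weighted-sum neuron with a sign (bipolar) activation:
    output +1 if the net input reaches the threshold, else -1."""
    net = sum(x * w for x, w in zip(inputs, weights))
    return 1 if net >= theta else -1

# Illustrative values:
print(sign_neuron([1, 0], [0.5, 0.5], theta=0.4))  # net = 0.5 >= 0.4 -> +1
print(sign_neuron([0, 0], [0.5, 0.5], theta=0.4))  # net = 0.0 <  0.4 -> -1
```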


Simple artificial neuron



Simple artificial neural network

• Neuron Y sends its signal y to both Z1 and Z2.
• However, the values received by Z1 and Z2 will be different, because each signal is scaled by the corresponding weight, v1 and v2.


Basic models of ANN

• The models of ANN can be characterized by
  1. the pattern of connections between the neurons (architecture);
  2. the method of determining the weights on the connections (training or learning algorithm);
  3. the activation function.


Self-study

• Advantages and disadvantages of ANN


• Advantages and disadvantages of BNN
• Difference between ANN and BNN
• Terminologies in ANN and BNN



ANN ARCHITECTURES



Typical Architectures

• An ANN consists of a set of highly interconnected neurons, connected through weights to other processing elements or to themselves.
• The arrangement of these processing elements and the geometry of their interconnections are important for an ANN.
• The arrangement of neurons to form layers and the connection pattern formed within and between layers is called the network architecture.


Neural network architectures
There are five basic neuron connection architectures:
1. Single-layer feed-forward network
2. Multilayer feed-forward network
3. Single node with its own feedback
4. Single-layer recurrent network
5. Multilayer recurrent network


1. Single-layer Feed-forward Network
• It consists of a single layer network in which the inputs are directly connected to the output layer, one connection per node, with a series of weights.


2. Multi-layer Feed-forward Network
• It consists of multiple layers: along with the input and output layers, there are hidden layers. There can be zero to many hidden layers.
• A hidden layer is internal to the network and has no direct contact with the environment.


3. Single Node With Own Feedback

The simplest recurrent neural network architecture: a single neuron with feedback to itself.


4. Single-layer Recurrent Network
A single-layer network with feedback connections directed back to the same processing element, to other processing elements, or both.


5. Multilayer Recurrent Network

• A recurrent network has at least one feedback loop. A processing element's output can be directed back to nodes in a previous layer.


NETWORK TRAINING/LEARNING



Learning
• Learning is an important feature of human computational ability.
• Learning may be viewed as a change in behavior acquired due to practice or experience, and it lasts for a relatively long time.
• As it occurs, the effective coupling between neurons is modified.
• In the case of artificial neural networks, learning is a process of modifying the network by updating its weights, biases and other parameters, if any.
• During learning, the parameters of the network are optimized; the process amounts to curve fitting.
• The network is then said to have passed through a learning phase.


The main property of an ANN is its capability to learn. Learning or training is a process by means of which a neural network adapts itself to a stimulus by making proper parameter adjustments, resulting in the production of the desired response. Broadly, there are two kinds of learning in ANNs:
1. Parameter learning: it updates the connecting weights in a neural net.
2. Structure learning: it focuses on changes in the network structure (which includes the number of processing elements as well as their connection types).
The above two types of learning can be performed simultaneously or separately. Apart from these two categories, learning in an ANN can generally be classified into three categories:
• supervised learning
• unsupervised learning
• reinforcement learning
1. Supervised Learning
• In supervised learning, it is assumed that the correct target output values are known for each input pattern.
• In this learning, a supervisor or teacher is needed for error minimization.
• The difference between the actual and desired output vectors is minimized using the error signal by adjusting the weights until the actual output matches the desired output.
• This type of training is called learning with the help of a teacher.


2. Unsupervised Learning
• In unsupervised learning, learning is performed without the help of a teacher or supervisor.
• In the learning process, input vectors of similar types are grouped together to form clusters.
• The desired output is not given to the network.
• The system learns on its own from the input patterns.
• This type of training is called learning without a teacher.


3. Reinforcement Learning
• In this technique, although a teacher is available, it does not tell the expected answer, but only tells whether the computed output is correct or incorrect.
• A reward is given for a correct answer and a penalty for a wrong answer. This information helps the network in its learning process.
• Note: supervised and unsupervised learning are the most popular forms of learning. Unsupervised learning is very common in biological systems.
• It is also important for artificial neural networks: training data are not always available for the intended application of the neural network.


TRANSFER/ACTIVATION FUNCTION



Transfer/Activation Function
• An activation function f is applied over the net input to calculate the output of an ANN.
• The choice of activation function depends on the type of problem to be solved by the network.
The most common functions are:
1. Identity function: a linear function, defined as f(x) = x for all x.
2. Binary step function: the function can be defined as

   f(x) = 1 if x ≥ θ;  f(x) = 0 if x < θ

   Here, θ represents the threshold value.
3. Bipolar step function: the function can be defined as

   f(x) = 1 if x ≥ θ;  f(x) = -1 if x < θ

   Here, θ represents the threshold value.

4. Sigmoidal functions: these functions are used in back-propagation nets. They are of two types.

   Binary sigmoid function: also known as the unipolar sigmoid function, defined by the equation

   f(x) = 1 / (1 + e^(-λx))

   Here, λ is the steepness parameter. The range of this sigmoid function is from 0 to 1.

   Bipolar sigmoid function: this function is defined as

   f(x) = (1 - e^(-λx)) / (1 + e^(-λx))

   Here, λ is the steepness parameter. The range of this sigmoid function is from -1 to +1.

5. Ramp function: the ramp function is defined as

   f(x) = 1 if x > 1;  f(x) = x if 0 ≤ x ≤ 1;  f(x) = 0 if x < 0
The graphical representations of all the activation functions: (A) identity function; (B) binary step function; (C) bipolar step function; (D) binary sigmoidal function; (E) bipolar sigmoidal function; (F) ramp function.
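These definitions translate directly into code; here is a minimal Python sketch of all six activation functions, assuming the forms given above (λ is the steepness parameter, θ the threshold):

```python
import math

def identity(x):
    return x

def binary_step(x, theta=0.0):
    return 1 if x >= theta else 0

def bipolar_step(x, theta=0.0):
    return 1 if x >= theta else -1

def binary_sigmoid(x, lam=1.0):
    # Unipolar sigmoid; range (0, 1)
    return 1.0 / (1.0 + math.exp(-lam * x))

def bipolar_sigmoid(x, lam=1.0):
    # Range (-1, +1); equal to 2*binary_sigmoid(x, lam) - 1
    return (1.0 - math.exp(-lam * x)) / (1.0 + math.exp(-lam * x))

def ramp(x):
    return 1.0 if x > 1 else (0.0 if x < 0 else x)
```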


Calculate the net input to the output neuron

Given inputs x1 = 0.3, x2 = 0.5, x3 = 0.6 and weights w1 = 0.2, w2 = 0.1, w3 = -0.3:
Net input yin = x1 w1 + x2 w2 + x3 w3
= 0.3 × 0.2 + 0.5 × 0.1 + 0.6 × (-0.3)
= 0.06 + 0.05 - 0.18
= 0.11 - 0.18
= -0.07


Calculate the net input to the output neuron with a bias value

Given x1 = 0.2, x2 = 0.6, bias b = 0.4 and w1 = 0.3, w2 = 0.7:

Net input yin = b + x1 w1 + x2 w2
= 0.4 + 0.2 × 0.3 + 0.6 × 0.7
= 0.4 + 0.06 + 0.42
= 0.88
Obtain the output of the neuron y of the network using the activation function as (1) binary sigmoidal and (2) bipolar sigmoidal

Given x1 = 0.8, x2 = 0.6, x3 = 0.4, weights w1 = 0.1, w2 = 0.3, w3 = -0.2 and bias = 0.35.
The net input to the output neuron is

yin = b + x1 w1 + x2 w2 + x3 w3 = 0.35 + 0.08 + 0.18 - 0.08 = 0.53

For the binary sigmoidal function:

y = f(yin) = 1 / (1 + e^(-0.53)) ≈ 0.63

For the bipolar sigmoidal function:

y = f(yin) = (1 - e^(-0.53)) / (1 + e^(-0.53)) ≈ 0.26
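These numbers can be verified with a few lines of Python (all values taken from the example above):

```python
import math

x = [0.8, 0.6, 0.4]
w = [0.1, 0.3, -0.2]
b = 0.35

y_in = b + sum(xi * wi for xi, wi in zip(x, w))
y_binary = 1 / (1 + math.exp(-y_in))                        # binary sigmoid
y_bipolar = (1 - math.exp(-y_in)) / (1 + math.exp(-y_in))   # bipolar sigmoid
print(y_in, y_binary, y_bipolar)   # 0.53, ~0.63, ~0.26
```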


Obtain the output of neuron y using the sign function (bipolar step function) with threshold θ = 1.

Calculate the net input to the output neuron.

Calculate the output of neuron y using the binary and bipolar sigmoidal activation functions.


Terminologies of ANN
1. Weights:
• A weight is a parameter which contains information about the input signal. This information is used by the net to solve a problem.
• In an ANN architecture, every neuron is connected to other neurons by means of directed communication links, and every link is associated with a weight.
• wij is the weight from processing element 'i' (source node) to processing element 'j' (destination node).
2. Bias (b):
• The bias is a constant value included in the network. Its impact is seen in calculating the net input. The bias is included by adding a component x0 = 1 to the input vector X.
• Bias can be positive or negative. A positive bias helps in increasing the net input of the network; a negative bias helps in decreasing it.
3. Threshold (θ)
The threshold is a set value used in the activation function. In an ANN, the activation functions are defined based on the threshold value, and the output is calculated accordingly.
4. Learning rate (α)
The learning rate is used to control the amount of weight adjustment at each step of training. The learning rate ranges from 0 to 1 and determines the rate of learning at each time step.




HISTORY OF ANN
• 1943: McCulloch and Pitts proposed the McCulloch-Pitts neuron model.
• 1949: Hebb published his book The Organization of Behavior, in which the Hebbian learning rule was proposed.
• 1958: Rosenblatt introduced the simple single-layer networks now called perceptrons.
• 1982: Hopfield published a series of papers on Hopfield networks.
• 1982: Kohonen developed the self-organizing maps that now bear his name.
• 1986: the back-propagation learning algorithm for multilayer perceptrons was rediscovered and the whole field took off again.
• 1990s: the subfield of radial basis function networks was developed.
• 2000s: the power of ensembles of neural networks and support vector machines became apparent.
McCulloch & Pitts Network
• McCulloch (a neuroscientist) and Pitts (a logician) proposed a highly simplified computational model of the neuron (1943).
• g aggregates the inputs, and the function f takes a decision based on this aggregation.
• The inputs can be excitatory or inhibitory.


For the network shown here, the activation function for unit Y is

f(yin) = 1 if yin ≥ θ;  f(yin) = 0 if yin < θ

• where yin is the total input signal received and θ is the threshold for Y.
• Neurons in a McCulloch-Pitts network are connected by directed, weighted paths.
• If the weight on a path is positive the path is excitatory; otherwise it is inhibitory.
• All excitatory connections into a particular neuron have the same weight, although different
weighted connections can be input to different neurons.
• Each neuron has a fixed threshold. If the net input into the neuron is greater than or equal to the
threshold, the neuron fires.
• The threshold is set such that any nonzero inhibitory input will prevent the neuron from firing.
• It takes one time step for a signal to pass over one connection.



McCulloch & Pitts Network - AND
• Consider the truth table for the AND function.
• This network has no particular training algorithm; only analysis is being performed.
• Hence, assume the weights to be w1 = 1 and w2 = 1.
• With these weights, the net input equals 2 only for input (1, 1), so a threshold of θ = 2 realizes the AND function.


McCulloch & Pitts Network - OR
• Consider the truth table for the OR function.
• This network has no particular training algorithm; only analysis is being performed.
• Hence, assume the weights to be w1 = 1 and w2 = 1.
• The threshold value is +1: the net input reaches the threshold for every input except (0, 0).
McCulloch & Pitts Network - AND NOT
• Consider the truth table for the AND NOT function.
• This network has no particular training algorithm; only analysis is being performed.
• First assume the weights to be w1 = 1 and w2 = 1; analysis shows that no single threshold separates the AND NOT outputs for this choice.
• Hence assume the weights to be w1 = 1 and w2 = -1; with threshold θ = 1, the neuron fires only for input (1, 0).
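All three gates can be checked with a short Python sketch. The weights are those analysed above; the thresholds θ = 2 for AND and θ = 1 for AND NOT are the usual choices for these weights and are assumed here:

```python
def mp_neuron(x1, x2, w1, w2, theta):
    """McCulloch-Pitts unit: fires (1) iff the net input reaches the threshold."""
    return 1 if x1 * w1 + x2 * w2 >= theta else 0

print("x1 x2 AND OR ANDNOT")
for x1 in (0, 1):
    for x2 in (0, 1):
        print(x1, x2,
              mp_neuron(x1, x2, 1, 1, 2),    # AND:     w1 = w2 = 1, theta = 2
              mp_neuron(x1, x2, 1, 1, 1),    # OR:      w1 = w2 = 1, theta = 1
              mp_neuron(x1, x2, 1, -1, 1))   # AND NOT: w1 = 1, w2 = -1, theta = 1
```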


Can you model an XOR gate using a McCulloch & Pitts network?
Implement Boolean Functions Using the McCulloch-Pitts Unit


Can any Boolean function be represented using a McCulloch-Pitts unit?


McCulloch-Pitts unit for more than two inputs

• Instead of a line, we will have a plane. For the OR function, we want a plane such that the point (0,0,0) lies on one side and the remaining 7 points lie on the other side of the plane.
• A single McCulloch-Pitts neuron can be used to represent only those Boolean functions which are linearly separable.


QUESTIONS

• What about non-Boolean (real) inputs?
• Do we always need to hand-code the threshold?
• Are all inputs equal? What if we want to assign more weight (importance) to some inputs?
• What about functions which are not linearly separable?


HEBB NETWORK

• The Hebb or Hebbian learning rule comes under artificial neural networks (ANNs): architectures of large numbers of interconnected elements called neurons, which process the received input to give the desired output.
• Hebb's law can be represented in the form of two rules:
1. If two neurons on either side of a connection are activated synchronously, then the weight of that connection is increased.
2. If two neurons on either side of a connection are activated asynchronously, then the weight of that connection is decreased.
• Hebb's law provides the basis for learning without a teacher. Learning here is a local phenomenon occurring without feedback from the environment.
• The weight update in the Hebb rule is given by wi(new) = wi(old) + xi y.
• The Hebb network is suited more for bipolar data.
• The Hebb network can be used for pattern association, pattern categorization, pattern classification and similar areas.
Hebbian Learning In A Neural Network



Hebb Network Algorithm
Step 0: Initialize all weights and the bias to zero: wi = 0 (i = 1 to n), b = 0.
Step 1: For each input training pair s : t, perform Steps 2-4.
Step 2: Set the activations of the input units: xi = si (i = 1 to n).
Step 3: Set the activation of the output unit: y = t.
Step 4: Adjust the weights and bias:
wi(new) = wi(old) + xi y,  b(new) = b(old) + y.


Design A Hebb Network To Implement Logical AND Function
• Consider the training data for the AND function (bipolar), where x1 and x2 are the two inputs, b is the bias and y is the target.
• Initially the weights and bias are set to zero, i.e., w1 = w2 = b = 0.
• First input [x1 x2 b] = [1 1 1] and target = 1 (i.e., y = 1).
• Setting the initial weights as the old weights and applying the Hebb rule wi(new) = wi(old) + Δwi, we get
Δw1 = x1 y = 1 × 1 = 1
Δw2 = x2 y = 1 × 1 = 1
Δb = y = 1 (the associated target)
w1(new) = w1(old) + Δw1 = 0 + 1 = 1
w2(new) = w2(old) + Δw2 = 0 + 1 = 1
b(new) = b(old) + Δb = 0 + 1 = 1


Second input [x1 x2 b] = [1 -1 1] and y = -1.
The weight changes here are
Δw1 = x1 y = 1 × (-1) = -1
Δw2 = x2 y = (-1) × (-1) = 1
Δb = y = -1
The new weights are
w1(new) = w1(old) + Δw1 = 1 - 1 = 0
w2(new) = w2(old) + Δw2 = 1 + 1 = 2
b(new) = b(old) + Δb = 1 - 1 = 0
Similarly, by presenting the third and fourth input patterns, the new weights can be calculated.
The final weights obtained are
w1 = 2, w2 = 2 and b = -2
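The whole training run can be reproduced in a few lines of Python (bipolar AND training set and Hebb rule as above); it ends with w1 = 2, w2 = 2, b = -2:

```python
# Bipolar AND training set: (x1, x2, target)
samples = [(1, 1, 1), (1, -1, -1), (-1, 1, -1), (-1, -1, -1)]

w1 = w2 = b = 0
for x1, x2, y in samples:
    # Hebb rule: w_i(new) = w_i(old) + x_i * y,  b(new) = b(old) + y
    w1 += x1 * y
    w2 += x2 * y
    b  += y
    print(f"after ({x1:2d},{x2:2d}): w1={w1}, w2={w2}, b={b}")
# final weights: w1 = 2, w2 = 2, b = -2
```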
The separating line equation is given by

x2 = -(w1/w2) x1 - (b/w2)

For all inputs, use the weights obtained after presenting each input to obtain the separating line. For the first input [1 1 1], with weights w1 = 1, w2 = 1, b = 1, the separating line is x2 = -x1 - 1.

Similarly, for the second input [1 -1 1], with weights w1 = 0, w2 = 2, b = 0, the separating line is x2 = 0.

For the third input [-1 1 1], with weights w1 = 1, w2 = 1, b = -1, the separating line is x2 = -x1 + 1.

For the fourth input [-1 -1 1], with weights w1 = 2, w2 = 2, b = -2, it is again x2 = -x1 + 1.
Design A Hebb Network To Implement Logical OR Function

Design A Hebb Network To Implement Logical XOR Function

Design the Hebb Network for the following 3 × 3 pattern matrix (pattern given in square form)


SUPERVISED LEARNING NETWORKS



• Supervised learning takes place under the supervision of a teacher.
• This learning process is dependent. During the training of an ANN under supervised learning, the input vector is presented to the network, which produces an output vector.
• This output vector is compared with the desired/target output vector.
• An error signal is generated if there is a difference between the actual output and the desired/target output vector.
• On the basis of this error signal, the weights are adjusted until the actual output matches the desired output.


Perceptron
• Frank Rosenblatt, an American psychologist, proposed the classical perceptron model (1958).
• Minsky and Papert (1969, 1988) studied and proposed various types of perceptrons.
• Perceptron networks are feed-forward neural networks.
• A perceptron can be single-layer or multilayer.
• A multilayer feed-forward neural network consists of an input layer, one or more hidden layers, and an output layer.
• Inputs are no longer limited to Boolean values.


Key points of the perceptron are:
1. The perceptron network consists of three units, namely, the sensory unit (input unit), the associator unit (hidden unit) and the response unit (output unit).
2. The sensory units are connected to the associator units with fixed weights having values 1, 0 or -1, which are assigned at random.
3. The binary activation function is used in the sensory unit and the associator unit.
4. The response unit has an activation of 1, 0 or -1. The binary step with a fixed threshold θ is used as the activation for the associator. The output signals that are sent from the associator unit to the response unit are only binary.
5. The output of the perceptron network is given by y = f(yin), where f(yin) is the activation function, defined as

f(yin) = 1 if yin > θ;  0 if -θ ≤ yin ≤ θ;  -1 if yin < -θ


6. The perceptron learning rule is used in the weight updation between the associator unit and the response unit. For each training input, the net calculates the response and determines whether or not an error has occurred.
7. The error calculation is based on the comparison of the values of the targets with those of the calculated outputs.
8. The weights on the connections from the units that send the nonzero signal get adjusted suitably.
9. The weights are adjusted on the basis of the learning rule if an error has occurred for a particular training pattern, i.e.,

w(new) = w(old) + α t x  if y ≠ t;  w(new) = w(old)  if y = t

Here y is the computed output and t is the desired output. The learning rate α ranges from 0 to 1.


Original Perceptron Network



Single Classification Perceptron Network
The goal of the perceptron is to classify the input pattern as a member or not a member of a particular class.

Training Algorithm
Multiple Classification Perceptron Network



Perceptron Network Testing Algorithm



Implement AND Function Using Perceptron Networks For Bipolar Inputs And Targets

The initial weights and threshold are set to zero, i.e., w1 = w2 = b = 0 and θ = 0. The learning rate α is set equal to 1.
For the first input pattern, x1 = 1, x2 = 1 and t = 1, with weights and bias w1 = 0, w2 = 0 and b = 0:

yin = b + x1 w1 + x2 w2 = 0

• Here we have taken θ = 0. Hence, with yin = 0, the output is y = f(yin) = 0.
• Check whether t = y. Here t = 1 and y = 0, so t ≠ y; hence weight updation takes place:

w1(new) = w1(old) + α t x1 = 0 + 1 × 1 × 1 = 1
w2(new) = w2(old) + α t x2 = 0 + 1 × 1 × 1 = 1
b(new) = b(old) + α t = 0 + 1 × 1 = 1

• Hence the weights w1 = 1, w2 = 1 and b = 1 are the final weights after the first input pattern is presented.
The final weights and bias after the second epoch are w1 = 1, w2 = 1, b = -1.
Since the threshold for the problem is zero, the equation of the separating line is

w1 x1 + w2 x2 + b = 0, i.e., x2 = -(w1/w2) x1 - (b/w2)

Thus, using the final weights we obtain

x1 + x2 - 1 = 0, i.e., x2 = -x1 + 1

It can easily be seen that this straight line separates the positive response region from the negative response region.
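The same training run can be sketched in Python (bipolar AND data, α = 1, θ = 0, as above); it stops after the second epoch with w1 = 1, w2 = 1, b = -1:

```python
def f(y_in, theta=0.0):
    # Perceptron activation with threshold theta
    if y_in > theta:
        return 1
    if y_in < -theta:
        return -1
    return 0

samples = [(1, 1, 1), (1, -1, -1), (-1, 1, -1), (-1, -1, -1)]  # bipolar AND
w1 = w2 = b = 0.0
alpha = 1.0

for epoch in range(10):
    changed = False
    for x1, x2, t in samples:
        y = f(b + x1 * w1 + x2 * w2)
        if y != t:                  # update only when an error occurs
            w1 += alpha * t * x1
            w2 += alpha * t * x2
            b  += alpha * t
            changed = True
    if not changed:                 # stop when an epoch makes no change
        break

print(w1, w2, b)                    # 1.0 1.0 -1.0
```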


• Implement the OR function with binary inputs and bipolar targets using the perceptron training algorithm up to 3 epochs.
• Find the weights using a perceptron network for the AND NOT function when all the inputs are presented only one time. Use bipolar inputs and targets.
• Find the weights to perform the following classification using a perceptron network. The vectors (1, 1, 1, 1) and (-1, 1, -1, -1) belong to a class having target value 1. The vectors (1, 1, 1, -1) and (1, -1, -1, 1) do not belong to the class, so they have target value -1. Assume the learning rate is 1 and the initial weights are 0.
• Classify the 2-D input patterns in the following figure using a perceptron network. The symbol "*" indicates the data to be +1 and "." indicates the data to be -1. The patterns are I-F: for pattern I the target is +1, and for F the target is -1.


Adaptive Linear Neuron (Adaline)
• Units with a linear activation function are called linear units.
• A network with a single linear unit is called an ADALINE (ADAptive LInear NEuron).
• In an Adaline, the input-output relationship is linear.
• An Adaline is a net which has only one output unit.
• Adaline networks are trained using the delta rule.
• The delta rule is also known as the Least Mean Square (LMS) rule or Widrow-Hoff rule.


• The delta rule for adjusting the weight of the ith input to a neuron is Δwi = α (t - yin) xi.
• The delta rule for adjusting the weight from the ith neuron to the jth neuron is Δwij = α (tj - yinj) xi.
• The perceptron learning rule originates from the Hebbian assumption.
• The delta rule is derived from the gradient-descent method.
• The perceptron learning rule stops after a finite number of learning steps.
• The delta rule updates the weights on the connections so as to minimize the difference between the net output and the target value.
• That is, gradient descent continues until convergence.


Training Algorithm

(Training stops when the computed error falls below a user-supplied error threshold.)


Testing Algorithm



Implement Logic OR Function Using ADALINE (Consider Bipolar Data)

Initially all the weights and the bias are assumed to be small random values, say 0.1, and the learning rate is also set to 0.1. A least-mean-square error threshold may also be set. The weights are updated until the least mean square error is obtained.
The initial weights are taken to be w1 = w2 = b = 0.1 and the learning rate α = 0.1. For the first input sample, x1 = 1, x2 = 1, t = 1, we calculate the net input as

yin = b + x1 w1 + x2 w2 = 0.1 + 0.1 + 0.1 = 0.3


Now compute (t - yin) = (1 - 0.3) = 0.7. Updating the weights we obtain

wi(new) = wi(old) + α (t - yin) xi

where α (t - yin) xi is the weight change Δwi. The new weights are obtained as

w1(new) = 0.1 + 0.1 × 0.7 × 1 = 0.17
w2(new) = 0.1 + 0.1 × 0.7 × 1 = 0.17
b(new) = 0.1 + 0.1 × 0.7 = 0.17

Calculate the error:

E = (t - yin)² = 0.7² = 0.49

The final weights after presenting the first input sample are w1 = w2 = b = 0.17, and the error is E = 0.49.
• These calculations are performed for all the input samples and the error is calculated. One epoch is completed when all the input patterns have been presented. Summing up the errors obtained for each input sample in one epoch gives the total mean square error of that epoch.
• The network training is continued until this error is minimized to a very small value.
• Example: 0.49 + 0.69 + 0.83 + 1.01 = 3.02
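One epoch of this computation as a Python sketch (bipolar OR data, α = 0.1, initial weights 0.1, as above); the per-sample squared errors come out as 0.49, 0.69, 0.83 and 1.01:

```python
# Bipolar OR training set: (x1, x2, target)
samples = [(1, 1, 1), (1, -1, 1), (-1, 1, 1), (-1, -1, -1)]

w1 = w2 = b = 0.1
alpha = 0.1
epoch_error = 0.0

for x1, x2, t in samples:
    y_in = b + x1 * w1 + x2 * w2
    err = t - y_in
    # Delta (LMS / Widrow-Hoff) rule
    w1 += alpha * err * x1
    w2 += alpha * err * x2
    b  += alpha * err
    epoch_error += err ** 2
    print(f"E = {err**2:.2f}, w1 = {w1:.3f}, w2 = {w2:.3f}, b = {b:.3f}")

print(f"total mean square error for epoch 1: {epoch_error:.2f}")  # ~3.02
```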


• Implement the logic AND function using ADALINE (consider bipolar data).
• Implement the logic AND NOT function using ADALINE (consider bipolar data).
Multiple Adaptive Linear Neuron (MADALINE)

• The multiple adaptive linear neuron (Madaline) model consists of many Adalines in parallel with a single output unit whose value is based on certain selection rules.
• The weights between the input layer and the Adaline layer are adjusted.
• The weights between the Adaline layer and the output layer are fixed.

(Architecture: input layer, Adaline layer, output layer.)


Training Algorithm


Using a Madaline network, implement the XOR function with bipolar inputs and targets. Assume the required parameters for training the network.
The Madaline Rule I (MRI) algorithm, in which the weights between the hidden layer and the output layer remain fixed, is used for training the network. Initialize the weights to small random values.
The initial weights and biases are [w11 w21 b1] = [0.05 0.2 0.3], [w12 w22 b2] = [0.1 0.2 0.15] and [v1 v2 b3] = [0.5 0.5 0.5].
For the first input sample, x1 = 1, x2 = 1, target t = -1, and learning rate α equal to 0.5:
Calculate the net inputs to the hidden units:

zin1 = b1 + x1 w11 + x2 w21 = 0.3 + 0.05 + 0.2 = 0.55
zin2 = b2 + x1 w12 + x2 w22 = 0.15 + 0.1 + 0.2 = 0.45


Calculate the outputs z1, z2 by applying the activation over the net inputs computed. The activation function is given by f(x) = 1 if x ≥ 0, else -1. Hence

z1 = f(zin1) = 1,  z2 = f(zin2) = 1

After computing the outputs of the hidden units, find the net input entering the output unit:

yin = b3 + z1 v1 + z2 v2 = 0.5 + 0.5 + 0.5 = 1.5

Applying the activation function over the net input yin gives the output y = f(1.5) = 1.

Since t ≠ y, weight updation has to be performed. Also, since t = -1, the weights are updated on z1 and z2, which have positive net inputs. Since here both net inputs zin1 and zin2 are positive, updating the weights and biases on both hidden units with wi(new) = wi(old) + α (t - zin) xi, we obtain
This gives [w11 w21 b1] = [-0.725 -0.575 -0.475] and [w12 w22 b2] = [-0.625 -0.525 -0.575].
All the weights and biases between the input layer and the hidden layer are adjusted. This completes the training for the first input sample. The same process is repeated until the weights converge; it is found that the weights converge at the end of 3 epochs.
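This first MRI update step can be checked with a short Python sketch (all values from the example above; the bipolar step activation is assumed):

```python
alpha = 0.5
x1, x2, t = 1, 1, -1                        # first XOR training sample
w11, w21, b1 = 0.05, 0.2, 0.3               # Adaline unit z1
w12, w22, b2 = 0.1, 0.2, 0.15               # Adaline unit z2
v1, v2, b3 = 0.5, 0.5, 0.5                  # fixed output-layer weights

def f(x):
    return 1 if x >= 0 else -1              # bipolar step activation

z_in1 = b1 + x1 * w11 + x2 * w21            # 0.55
z_in2 = b2 + x1 * w12 + x2 * w22            # 0.45
y = f(b3 + f(z_in1) * v1 + f(z_in2) * v2)   # f(1.5) = 1, but t = -1

if y != t and t == -1:
    # MRI: push every Adaline whose net input is positive towards -1
    if z_in1 > 0:
        w11 += alpha * (t - z_in1) * x1     # 0.05 - 0.775 = -0.725
        w21 += alpha * (t - z_in1) * x2     # -0.575
        b1  += alpha * (t - z_in1)          # -0.475
    if z_in2 > 0:
        w12 += alpha * (t - z_in2) * x1     # -0.625
        w22 += alpha * (t - z_in2) * x2     # -0.525
        b2  += alpha * (t - z_in2)          # -0.575

print(w11, w21, b1, w12, w22, b2)
```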
• Network architecture of the Madaline network with the final weights for the XOR function.


Gradient Descent

New weight: wjk(new) = wjk(old) + Δwjk, where Δwjk = -α ∂E/∂wjk.

Influence of the learning rate α: too small a value makes convergence slow; too large a value can make the updates overshoot the minimum.

Gradient descent is an optimization method used to compute the weight adjustments.
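As a one-variable illustration (the error function here is invented for the sketch, not taken from the slides), gradient descent repeatedly moves the weight against the gradient of the error, with the learning rate α controlling the step size:

```python
# Minimize an illustrative error E(w) = (w - 3)^2, with gradient dE/dw = 2(w - 3).
w = 0.0
alpha = 0.1            # try alpha = 1.1 to watch the updates overshoot and diverge

for _ in range(50):
    grad = 2 * (w - 3)
    delta_w = -alpha * grad   # delta_w = -alpha * dE/dw
    w += delta_w              # w(new) = w(old) + delta_w

print(w)                      # close to the minimum at w = 3
```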


Back Propagation Network

• A back-propagation network (BPN) is a multilayer, feed-forward neural network consisting of an input layer, hidden layers and an output layer.
• The training of the BPN is done in three stages:
  • the feed-forward of the input training pattern;
  • the calculation and back-propagation of the error;
  • the updation of the weights.
• During the back-propagation phase of learning, signals are sent in the reverse direction.
• The back-propagation learning algorithm is one of the most important developments in neural networks (Bryson and Ho, 1969; Werbos, 1974; LeCun, 1985; Parker, 1985; Rumelhart, 1986).


Architecture Of A Back-propagation Network
The inputs sent to the BPN and the output obtained from the net could be either binary (0, 1) or bipolar (-1, +1). The activation function could be any function which increases monotonically and is also differentiable.


Back Propagation Network Algorithm


Using BPN, find the new weights for the given network when it is presented with the input pattern [0, 1] and the target output is 1. Use a learning rate of α = 0.25 and the binary sigmoidal function.


The new weights are calculated based on the training algorithm. The initial weights are [v11 v21 v01] = [0.6 -0.1 0.3], [v12 v22 v02] = [-0.3 0.4 0.5] and [w1 w2 w0] = [0.4 0.1 -0.2], and the learning rate is α = 0.25. The activation function used is the binary sigmoidal activation function, given by

f(x) = 1 / (1 + e^(-x))

Given the input sample [x1, x2] = [0, 1] and target t = 1:
Find the new weights, using a back-propagation network, for the network shown in the figure. The network is presented with the input pattern [-1, 1] and the target output is +1. Use a learning rate of α = 0.25 and the bipolar sigmoidal activation function.
