SC Unit I
Soft Computing
MCA20401 (Elective-III)
Lesson Plan
Prerequisites
SYLLABUS
UNIT:1 (12 Hours)
Introduction to Soft Computing, Fundamentals of ANN, Basic Model of an
Artificial Neuron, NN Architecture, Learning Methods, Terminology of
ANN, Hebb Network, ADALINE & MADALINE, Perceptron, MLP, Back
Propagation Network (BPN): Architecture, Back Propagation Learning, Effect of
Tuning Parameters of the BPNN, Back Propagation Algorithms.
Associative Memory: Auto-correlators & Hetero-correlators, Linear
Associative Memory, Applications, Adaptive Resonance Theory (ART),
ART1, ART2 & Applications
Books
Text Books:
1) S. N. Sivanandam and S. N. Deepa, "Principles of Soft Computing", Wiley India (P) Ltd.
2) S. Rajasekaran and G. A. Vijayalakshmi Pai, "Neural Networks, Fuzzy Logic and Genetic Algorithms", PHI Private Limited, New Delhi.
Reference Books:
1) J. S. R. Jang, C. T. Sun and E. Mizutani, "Neuro-Fuzzy and Soft Computing", PHI Pvt. Ltd., New Delhi.
2) Fredric M. Ham and Ivica Kostanic, "Principles of Neurocomputing for Science and Engineering", Tata McGraw Hill.
3) S. Haykin, "Neural Networks: A Comprehensive Foundation", Pearson Education, India.
4) V. Kecman, "Learning and Soft Computing", Pearson Education, India.
5) R. C. Eberhart and Y. Shi, "Computational Intelligence: Concepts to Implementations", Morgan Kaufmann Publishers (Indian Reprint).
Unit-I

Sl. No. | Hour | Topics/Subtopics | Mode of Lecture | Date | COs & POs | Text Book Page No.
3 | 1.3 | Neural Networks: Basic Concept of Neural Networks | Online | | [CO1][PO1] | TB1: 11-13
4 | 1.4 | Models of an Artificial Neuron, Various Activation Functions | Online | | [CO1][PO1] | TB1: 13-16
5 | 1.5 | Neural Network Architecture & Characteristics, Different Learning Methods | Online | | [CO1][PO1] | TB1: 16-19
6 | 1.6 | Introduction to Early ANN Architectures (basics only): McCulloch & Pitts Model | Online | | [CO1][PO1] | TB1: 19-20
7 | 1.7 | Perceptron, ADALINE, MADALINE | Online | | [CO1][PO1] | TB1: 22-24
8 | 1.8 | Back Propagation Neural Network | Online | | [CO1][PO1] | TB1: 25-27
9 | 1.9 | Back Propagation Learning Algorithm | Online | | [CO1][PO1] | TB1: 28-30
10 | 1.10 | Example of BPN, Effect of Tuning Parameters of the BPN Network | Online | | [CO1][PO1] | TB1:
11 | 1.11 | Associative Memory: Autocorrelators, Heterocorrelators | Online | | [CO1][PO1] | TB1:
12 | 1.12 | Energy Function for BAM, Exponential BAM | Online | | [CO1][PO1] | TB1:
Unit-II

Sl. No. | Hour | Topics/Subtopics | Mode of Lecture | Date | COs & POs | Text Book Page No.
13 | 2.1 | Fuzzy Sets vs Crisp Sets | Online | | [CO2][PO1] | TB1:
14 | 2.2 | Fuzzy Sets: Properties | Online | | [CO2][PO1] | TB1:
15 | 2.3 | Fuzzy Membership Functions | Online | | [CO2][PO1] | TB1:
16 | 2.4 | Fuzzy Set Operations | Online | | [CO2][PO1] | TB1:
17 | 2.5 | Crisp and Fuzzy Relations | Online | | [CO2][PO1] | TB1:
18 | 2.6 | Fuzzy Relation Operations | Online | | [CO2][PO1] | TB1:
19 | 2.7 | Crisp Logic, Propositional Logic, Predicate Logic | Online | | [CO2][PO1] | TB1:
20 | 2.8 | Fuzzy Logic: Quantifiers, Inference | Online | | [CO2][PO1] | TB1:
21 | 2.9 | Fuzzy Rule-Based Systems | Online | | [CO2][PO1] | TB1:
22 | 2.10 | Defuzzification Methods | Online | | [CO2][PO1] | TB1:
Unit-III

Sl. No. | Hour | Topics/Subtopics | Mode of Lecture | Date | COs & POs | Text Book Page No.
23 | 3.1 | Fundamentals of Genetic Algorithms | Online | | [CO1][PO1] | TB1:
24 | 3.2 | Encoding, Fitness Functions | Online | | [CO1][PO1] | TB1:
25 | 3.3 | Reproduction | Online | | [CO1][PO1] | TB1:
26 | 3.4 | Genetic Modeling: Crossover | Online | | [CO1][PO1] | TB1:
27 | 3.5 | Different Crossovers | Online | | [CO1][PO1] | TB1:
28 | 3.6 | Inversion and Deletion | Online | | [CO1][PO1] | TB1:
29 | 3.7 | Mutation Operator | Online | | [CO1][PO1] | TB1:
30 | 3.8 | Bit-wise Operators and Their Uses in GA | Online | | [CO1][PO1] | TB1:
31 | 3.9 | Convergence of Genetic Algorithms | Online | | [CO1][PO1] | TB1:
32 | 3.10 | Applications, Real-Life Problems | Online | | [CO1][PO1] | TB1:
Unit-IV

Sl. No. | Hour | Topics/Subtopics | Mode of Lecture | Date | COs & POs | Text Book Page No.
33 | 4.1 | Hybrid Systems | Online | | [CO1][PO1] | TB1:
34 | 4.2 | Neural Network Hybrid Systems | Online | | [CO1][PO1] | TB1:
35 | 4.3 | Fuzzy Hybrid Systems | Online | | [CO1][PO1] | TB1:
36 | 4.4 | Genetic Algorithm Hybrid Systems | Online | | [CO1][PO1] | TB1:
37 | 4.5 | Genetic Algorithm Based Back Propagation Networks | Online | | [CO1][PO1] | TB1:
38 | 4.6 | GA-Based Weight Determination | Online | | [CO1][PO1] | TB1:
39 | 4.7 | Fuzzy Back Propagation Networks | Online | | [CO1][PO1] | TB1:
40 | 4.8 | Fuzzy Logic Controller | Online | | [CO1][PO1] | TB1:
Concept of Computation
Soft Computing Techniques
Hard computing
Soft computing
How soft computing?
Hard computing vs. Soft computing
Figure: Basics of computing. An antecedent (input x) is mapped by the computation of the control action y = f(x) to a consequent (output y).
Antecedent: a thing that existed before or logically precedes another.
In y = f(x), f is a mapping function; f is also called a formal method or an algorithm to solve the problem.
The control action should be unambiguous and accurate.
Consequent: following as a result or effect.
Hard computing should provide a precise solution.
It is suitable for problems that are easy to model mathematically.
Soft Computing Techniques
Figure: Soft computing techniques. Neural Networks, Fuzzy Logic and Genetic Algorithms, together with the hybrid approaches Neuro-Fuzzy, Neuro-Genetic and Fuzzy-Genetic.
Example: applying soft computing to select the bank with the maximum return.
a) Charles Darwin
b) Lotfi A. Zadeh
c) Rechenberg
d) McCulloch
Answer: b

a) Neural Network
b) Fuzzy Logic
c) Genetic Algorithm
d) Evolutionary Computing
Answer: a
Figure: An artificial neuron. Inputs x1, x2, ..., xn with weights w1, w2, ..., wn and a bias b feed a summation unit that computes the weighted sum Σ (i = 1 to n) xi wi + b, whose result is passed through an activation function to produce the output y.
The term fuzzy refers to things that are not clear or are vague. In the real world we often encounter situations in which we cannot determine whether a state is true or false; in such cases fuzzy logic provides very valuable flexibility for reasoning. In this way, we can account for the inaccuracies and uncertainties of any situation.

In a Boolean system, the truth value 1.0 represents absolute truth and 0.0 represents absolute falsity. Fuzzy logic is not restricted to these two absolute values; it also admits intermediate values that are partially true and partially false.

Fuzzy logic is a form of many-valued logic in which the truth values of variables may be any real number between 0 and 1, both inclusive.
Crisp (Boolean) logic admits only two truth values: Yes / 1 and No / 0.
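As a small illustration of truth values between 0 and 1, here is a sketch of a triangular fuzzy membership function; the set name ("warm"), the function, and the parameter values are illustrative assumptions, not taken from the text.

```python
def triangular_membership(x, a, b, c):
    """Triangular fuzzy membership: 0 at a and c, 1 at the peak b."""
    if x <= a or x >= c:
        return 0.0
    if x <= b:
        return (x - a) / (b - a)
    return (c - x) / (c - b)

# Degree to which 22 degrees C belongs to the fuzzy set "warm" (peak at 25 C).
print(triangular_membership(22, a=15, b=25, c=35))   # 0.7
```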
ANN Model
The human brain is one of the most complicated systems and, on the whole, it is still poorly understood. However, the concept of neurons as the fundamental constituents of the brain, attributed to Ramón y Cajal (1911), has made the study of its functioning comparatively easier. The figure illustrates the physical structure of the human brain.
Figure: An artificial neuron. Inputs x1, x2, ..., xn with weights w1, w2, ..., wn and a bias b feed a summation unit that computes the weighted sum Σ (i = 1 to n) xi wi + b, whose result is passed through an activation function to produce the output y.
Mathematically, the neuron output can be represented as y = f(Σ (i = 1 to n) wi xi + b), where f is the activation function.
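A minimal sketch of this neuron model in Python; the input values, weights, bias, and the binary step activation are illustrative assumptions.

```python
import numpy as np

def artificial_neuron(x, w, b, activation):
    """y = f(sum_i w_i * x_i + b) for a single artificial neuron."""
    return activation(np.dot(w, x) + b)

step = lambda s: 1 if s >= 0 else 0      # a simple binary activation

x = np.array([0.5, -1.0, 2.0])           # illustrative inputs
w = np.array([0.4,  0.3, 0.1])           # illustrative weights
b = -0.1                                 # illustrative bias
print(artificial_neuron(x, w, b, step))  # 1, since 0.2 - 0.3 + 0.2 - 0.1 = 0.0 >= 0
```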
Sigmoidal function
This function is a continuous function that varies gradually between the asymptotic values 0 and 1 (or -1 and +1) and is given by
φ(I) = 1 / (1 + e^(-I))
Advantages:
• The function is differentiable, so we can find the slope of the sigmoid curve at any point.
• Output values are bounded between 0 and 1, normalizing the output of each neuron.
Disadvantages:
• Vanishing gradient: for very high or very low values of I there is almost no change in the prediction, which causes the vanishing-gradient problem (see the sketch below).
• Because of the vanishing-gradient problem, the sigmoid has slow convergence.
• Outputs are not zero-centered.
• Computationally expensive.
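A small sketch of the sigmoid and its derivative, assuming the unit-slope form given above; the near-zero derivative at large |I| is the vanishing-gradient effect just described.

```python
import numpy as np

def sigmoid(I):
    return 1.0 / (1.0 + np.exp(-I))

def sigmoid_derivative(I):
    s = sigmoid(I)
    return s * (1.0 - s)        # maximum 0.25 at I = 0

for I in (-10.0, 0.0, 10.0):
    print(I, sigmoid(I), sigmoid_derivative(I))
# At I = +/-10 the derivative is about 4.5e-5, so weight updates almost vanish.
```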
4) Leaky ReLU
Advantage:
1. Prevents dying ReLU problem — this variation of ReLU has a
small positive slope in the negative area, so it does enable back-
propagation, even for negative input values
Disadvantages:
1. Results are not consistent: leaky ReLU does not provide consistent predictions for negative input values.
2. During training, if the learning rate is set very high, it can overshoot and kill the neuron.
The idea of leaky ReLU can be extended even further. Instead of multiplying x by a constant term, we can multiply it by a hyperparameter, which seems to work better than leaky ReLU. This extension of leaky ReLU is known as Parametric ReLU.
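A brief sketch of leaky ReLU and its parametric variant; the negative-side slopes used here (0.01 and a user-supplied a) are common choices assumed for illustration.

```python
def leaky_relu(x, slope=0.01):
    """Leaky ReLU: a small positive slope for negative inputs keeps gradients alive."""
    return x if x > 0 else slope * x

def parametric_relu(x, a):
    """Parametric ReLU (PReLU): the negative-side slope a is learned rather than fixed."""
    return x if x > 0 else a * x

print(leaky_relu(-3.0))               # -0.03
print(parametric_relu(-3.0, a=0.2))   # -0.6
```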
6) Swish
Swish is a new, self-gated activation function discovered by researchers at Google. It performs better than ReLU with a similar level of computational efficiency. In experiments on ImageNet with identical models running ReLU and Swish, the new function achieved top-1 classification accuracy 0.6-0.9% higher.
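A sketch of Swish in its self-gated form, x multiplied by sigmoid(x); the optional scaling parameter beta is an assumption (set to 1 here).

```python
import math

def swish(x, beta=1.0):
    """Swish: x * sigmoid(beta * x); smooth and non-monotonic, unlike ReLU."""
    return x / (1.0 + math.exp(-beta * x))

print([round(swish(x), 4) for x in (-2.0, 0.0, 2.0)])  # [-0.2384, 0.0, 1.7616]
```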
Figure 2: Single-layer feedforward network. Input neurons xi are connected to output neurons yj through weights wij.
Figure 3: Multilayer feedforward network (l-m-n configuration). Input neurons xi, hidden neurons yj and output neurons zk, with input-to-hidden weights vij and hidden-to-output weights wjk.
Figure 4: Recurrent neural network.
i. The NNs exhibit mapping capabilities; that is, they can map input patterns to their associated output patterns.
ii. The NNs learn by examples. Thus, NN architectures can be ‘trained’ with known examples of a problem before they are tested for their ‘inference’ capability on unknown instances of the problem. They can, therefore, identify objects on which they have not previously been trained.
iii. The NNs possess the capability to generalize. Thus, they can
predict new outcomes from past trends.
iv. The NNs are robust systems and are fault tolerant. They can,
therefore, recall full patterns from incomplete, partial or noisy
patterns.
v. The NNs can process information in parallel, at high speed,
and in a distributed manner.
Neural Network Learning Algorithms
Stochastic learning
In this method, weights are adjusted in a probabilistic fashion.
An example is simulated annealing, the learning mechanism employed by Boltzmann and Cauchy machines, which are kinds of NN systems.
Hebbian Learning
This rule was proposed by Hebb (1949) and is based on correlative
weight adjustment. This is the oldest learning mechanism inspired by
biology.
In this method, the input-output pattern pairs (Xi, Yi) are associated
by the weight matrix W, known as the correlation matrix. It is
computed as:
W = Σ (i = 1 to n) Xi Yi^T
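A minimal sketch of this correlation-matrix computation; the two bipolar pattern pairs are illustrative assumptions.

```python
import numpy as np

# Illustrative bipolar input/output pattern pairs (X_i, Y_i).
X = [np.array([ 1, -1,  1]), np.array([-1,  1,  1])]
Y = [np.array([ 1, -1]),     np.array([-1,  1])]

# Hebbian (correlation) learning: W = sum_i X_i Y_i^T
W = sum(np.outer(x, y) for x, y in zip(X, Y))
print(W)
# [[ 2 -2]
#  [-2  2]
#  [ 0  0]]
```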
Competitive learning
In this method, those neurons which respond strongly to input stimuli
have their weights updated.
When an input pattern is presented, all neurons in the layer compete
and the winning neuron undergoes weight adjustment.
Hence, it is a “Winner – takes – all” strategy.
The connections between the output neurons reflect the competition between them: the winning neuron is switched ‘ON’ while all the others remain ‘OFF’.
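A minimal winner-takes-all sketch: only the most strongly responding output neuron has its weights moved toward the input; the weights, input, and learning rate are illustrative assumptions.

```python
import numpy as np

def competitive_update(weights, x, lr=0.1):
    """Winner-takes-all: update only the neuron that responds most strongly to x."""
    winner = np.argmax(weights @ x)                 # strongest response wins
    weights[winner] += lr * (x - weights[winner])   # move the winner toward the input
    return winner

W = np.array([[0.9, 0.1], [0.2, 0.8]])   # two output neurons, two inputs
x = np.array([1.0, 0.0])
print(competitive_update(W, x))          # 0 -> neuron 0 is 'ON', neuron 1 stays 'OFF'
```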
y = b + Σi xi wi
f(y) = 1 if y >= 0; -1 if y < 0
Case 1 if y ≠ t then,
wi(new)=wi(old)+α(t−y)xi
b(new)=b(old)+α(t−y)
Case 2 if y = t then,
wi(new)=wi(old)
b(new)=b(old)
Where α is the learning rate, y is the computed output and t is the
desired/target output, xi is the input and wi is the weight.
(t−y) is the computed error.
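A sketch of one training pass with this rule, using the threshold function f above; the bipolar AND-style training data and the learning rate α = 1 are illustrative assumptions.

```python
import numpy as np

def f(y_in):
    return 1 if y_in >= 0 else -1

def train_epoch(X, T, w, b, alpha=1.0):
    """One pass of the rule: update w and b only when the computed output y != target t."""
    for x, t in zip(X, T):
        y = f(b + np.dot(w, x))
        if y != t:                          # Case 1: mismatch
            w = w + alpha * (t - y) * x
            b = b + alpha * (t - y)
        # Case 2 (y == t): weights and bias are left unchanged
    return w, b

# Bipolar AND-like training data (illustrative).
X = np.array([[1, 1], [1, -1], [-1, 1], [-1, -1]])
T = np.array([1, -1, -1, -1])
w, b = np.zeros(2), 0.0
for _ in range(10):
    w, b = train_epoch(X, T, w, b)
print(w, b)   # converges to a separating line, e.g. w = [2. 2.], b = -2.0
```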
Compute I = Σi wi xi.
2. Compute the observed output y = f(I) = 1 if I > 0; 0 if I <= 0.
In our computation, we assume that ⟨TO, TI⟩ is the training set, of size |T|.
Consider that the outputs of the neurons in the input layer are passed on unchanged as the inputs to the neurons in the hidden layer; that is,
{O}I = {I}I    (both l × 1)    [output of the input layer]
The input to the j-th neuron in the hidden layer can then be calculated as
IHj = V1j·OI1 + V2j·OI2 + ... + Vij·OIi + ... + Vlj·OIl = Σ (i = 1 to l) Vij·OIi,   for j = 1, 2, ..., m
[calculation of the input of each node in the hidden layer]
Applying the sigmoidal activation componentwise gives the output of the hidden layer,
{O}H,   with OHj = 1 / (1 + e^-(IHj + θHj))    [m × 1]
where θHj denotes the threshold (bias) of the j-th hidden neuron. Similarly, the input to the k-th output neuron is IOk = Σ (j = 1 to m) Wjk·OHj, and the output of the output layer is
{O}O,   with OOk = 1 / (1 + e^-(IOk + θOk))    [n × 1]
Backpropagation Algorithm
Thus, given a training set of size N, the error surface E can be represented as
E = Σ (i = 1 to N) ei(Ii, V, W)
where Ii is the i-th input pattern in the training set and ei(·) denotes the error computed for the i-th input.
Now, we will discuss the steepest descent method of computing the error, given changes in the V and W matrices.
Let the target output of the k-th output neuron be TOk. Then the error ek of the k-th neuron, corresponding to the input Ii, is defined as
ek = (1/2) (TOk - OOk)^2
Step-4: For the training data, present one set of inputs and outputs. Present the pattern to the input layer {I}I. By using the linear activation function, the output of the input layer may be evaluated as
{O}I = {I}I    (l × 1)
Step-5: Compute the inputs to the hidden layer by multiplying with the corresponding weights of the synapses as
{I}H = [V]^T {O}I    [(m × 1) = (m × l)(l × 1)]
Step-6: Let the hidden-layer units evaluate their outputs using the sigmoidal function as
{O}H,   with OHj = 1 / (1 + e^-IHj)    [m × 1]
Step-7 to Step-10: Compute the inputs to the output layer as {I}O = [W]^T {O}H, evaluate the outputs {O}O of the output layer using the sigmoidal function, calculate the error EP for the presented pattern from the difference between the target output and {O}O, and form the error-term vector {d} (n × 1) with dk = (TOk - OOk) OOk (1 - OOk).
Step-11: Find the [Y] matrix as
[Y] = {O}H 〈d〉    (with {O}H of size m × 1)
Find the [X] matrix as
[X] = {O}I 〈d*〉 = {I}I 〈d*〉
Example: consider a 2-2-1 BPN with input {I}I = [0.4, -0.7]^T, target output TO = 0.1, and initial weights
[V] = [ 0.1   0.4 ; -0.2   0.2 ],   [W] = [ 0.2 ; -0.5 ]

Step-1: Find {O}I = {I}I = [0.4, -0.7]^T    (l × 1 = 2 × 1)
Step-4: Find
{O}H = [ 1 / (1 + e^-0.18) ; 1 / (1 + e^-0.02) ] = [ 0.5448 ; 0.505 ]
The input to the output layer is then
{I}O = [W]^T {O}H = [ 0.2   -0.5 ] [ 0.5448 ; 0.505 ] = -0.14354
Step-6: Find
{O}O = 1 / (1 + e^0.1435) = 0.4642
With d = (TO - OO) OO (1 - OO) = -0.09058,
[Y] = {O}H 〈d〉 = [ 0.5448 ; 0.505 ] 〈-0.09058〉 = [ -0.0493 ; -0.0457 ]
Step-14: Find the updated [V] as
[V] = [ 0.1   0.4 ; -0.2   0.2 ] + [ -0.001077   0.002716 ; 0.001885   -0.004754 ]
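A NumPy sketch that reproduces the forward pass and the weight-change computation of this worked example; the learning rate η = 0.6 is an assumption (it is not stated in the extract), and the momentum term is omitted.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

I_in = np.array([0.4, -0.7])           # input pattern {I}_I
T_O  = 0.1                             # target output
V    = np.array([[0.1, 0.4],           # input-to-hidden weights [V]
                 [-0.2, 0.2]])
W    = np.array([[0.2], [-0.5]])       # hidden-to-output weights [W]
eta  = 0.6                             # assumed learning rate (not given in the extract)

# Forward pass
O_I = I_in                             # {O}_I = {I}_I
I_H = V.T @ O_I                        # [0.18, 0.02]
O_H = sigmoid(I_H)                     # [0.5448, 0.5050]
I_O = W.T @ O_H                        # -0.14354
O_O = sigmoid(I_O)                     # 0.4642

# Error terms
d      = (T_O - O_O) * O_O * (1 - O_O)        # about -0.09058
Y      = np.outer(O_H, d)                     # [Y] = {O}_H <d>
e      = W @ d                                # {e} = [W]{d}
d_star = e.flatten() * O_H * (1 - O_H)        # {d*}
X      = np.outer(O_I, d_star)                # [X] = {O}_I <d*>

# Weight changes (no momentum term in this sketch)
delta_W = eta * Y
delta_V = eta * X
print(delta_V)   # about [[-0.00108, 0.00272], [0.00189, -0.00475]]
print(delta_W)
```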
Associative Memory

Figure: An associative memory recalls a stored output pattern in response to an input pattern.
The autocorrelator stores the given bipolar patterns in the connection matrix
T = Σ (i = 1 to m) [Ai]^T [Ai]
Here, T = [tij] is a (p × p) connection matrix and Ai ∈ {-1, 1}^p, where Ai = (a1, a2, ..., ap). During recall each component is updated as ajnew = f( Σi ai tij , ajold ), where the two-parameter bipolar threshold function is
f(α, β) = 1 if α > 0;  β if α = 0;  -1 if α < 0.
For the three stored patterns of this example, the connection matrix is
T = Σ (i = 1 to 3) [Ai]^T(4×1) [Ai](1×4) =
    [  3   1   3  -3 ]
    [  1   3   1  -1 ]
    [  3   1   3  -3 ]
    [ -3  -1  -3   3 ]
Recognition of a stored pattern gives, for example,
a3new = f(3 + 1 + 3 + 3, 1) = f(10, 1) = 1
a4new = f(-3 - 1 - 3 - 3, 1) = f(-10, 1) = -1
The Hamming distance between two vectors X and Y is HD(X, Y) = Σi | xi - yi |.
Thus the HD of A' from each of the patterns in the stored set is as follows:
HD(A', A1) = 4
HD(A', A2) = 2
HD(A', A3) = 6
(a1new , a2new, a3new, a4new ) = (f(4, 1), f(4, 1), f(4, 1), f(-4, 1))
= (1, 1, 1, -1) = A2
Hence, in the case of partial vectors, an autocorrelator results in
the refinement of the pattern or removal of noise to retrieve the
closest matching stored pattern.
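A minimal autocorrelator sketch. The three stored bipolar patterns below are illustrative assumptions (the original stored set is not listed here), chosen so that they reproduce the 4 x 4 connection matrix shown above; recalling from the noisy pattern (1, 1, 1, 1) then returns the second stored pattern, mirroring the recall example.

```python
import numpy as np

# Assumed stored bipolar patterns A1, A2, A3 (illustrative).
A = np.array([[-1,  1, -1,  1],
              [ 1,  1,  1, -1],
              [-1, -1, -1,  1]])

T = sum(np.outer(a, a) for a in A)      # T = sum_i A_i^T A_i

def f(alpha, beta):
    """Two-parameter bipolar threshold: +1 if alpha>0, beta if alpha==0, -1 if alpha<0."""
    return np.where(alpha > 0, 1, np.where(alpha < 0, -1, beta))

def recall(a):
    """One synchronous update: a_j_new = f(sum_i a_i t_ij, a_j_old)."""
    return f(a @ T, a)

noisy = np.array([1, 1, 1, 1])          # a noisy/partial input pattern
print(recall(noisy))                    # -> [ 1  1  1 -1 ], the second stored pattern
```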
Consider N training pairs {(A1, B1), (A2, B2), ..., (Ai, Bi), ..., (An, Bn)}
where Ai = (ai1 , ai2 , . . . , ain) and Bi = (bi1 , bi2 , . . . , bip)
and aij, bij are either in ON or OFF state.
The correlation matrix is
M = Σ (i = 1 to N) Xi^T Yi
where Xi and Yi are the bipolar forms of Ai and Bi.
To retrieve the nearest (Ai, Bi) pattern pair, given any pair (α, β), the recall equations are as follows: starting with (α, β) as the initial condition, we determine a finite sequence (α', β'), (α'', β''), ... until an equilibrium point (αF, βF) is reached, where
β' = ϕ(αM)
α' = ϕ(β'M^T)
ϕ(F) = G = (g1, g2, ..., gn),   for F = (f1, f2, ..., fn),   with
gi = 1 if fi > 0;  gi = 0 (binary) or -1 (bipolar) if fi < 0;  gi = previous gi if fi = 0.
• Consider N = 3 pattern pairs (A1 , B1), (A2 , B2), (A3 , B3) given by:
A1 = (1 0 0 0 0 1) B1 = (1 1 0 0 0)
A2 = (0 1 1 0 0 0) B2 = (1 0 1 0 0)
A3 = (0 0 1 0 1 1) B3 = (0 1 1 1 0)
X1 = (1 -1 -1 -1 -1 1) Y1 = (1 1 -1 -1 -1)
X2 = ( -1 1 1 -1 -1 -1) Y2 = (1 -1 1 -1 -1)
X3 = ( -1 -1 1 -1 1 1) Y3 = (-1 1 1 1 -1)
The correlation matrix M is calculated as the 6 × 5 matrix
M = X1^T Y1 + X2^T Y2 + X3^T Y3 =
    [  1   1  -3  -1   1 ]
    [  1  -3   1  -1   1 ]
    [ -1  -1   3   1  -1 ]
    [ -1  -1  -1   1   3 ]
    [ -3   1   1   3   1 ]
    [ -1   3  -1   1  -1 ]
Suppose we start with α = X3, and we hope to retrieve the associated
pair Y3. The calculations for the retrieval of Y3 yield :
αM = ( -1 -1 1 -1 1 1 ) (M) = ( -6 6 6 6 -6 )
β' = ϕ (αM) = ( -1 1 1 1 -1 )
β'MT = ( -5 -5 5 -3 7 5 )
ϕ (β'MT) = ( -1 -1 1 -1 1 1) = α'
α'M = ( -1 -1 1 -1 1 1 ) (M) = ( -6 6 6 6 -6 )
ϕ(α'M) = β" = ( -1 1 1 1 -1) = β’
Here, β' is the same as Y3. Hence (αF, βF) = (X3, Y3) is the desired result.
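A sketch of this BAM recall using the bipolar patterns Xi, Yi of the example; the initial value of β before the first update is an assumption.

```python
import numpy as np

# Bipolar pattern pairs from the example above.
X = np.array([[ 1, -1, -1, -1, -1,  1],
              [-1,  1,  1, -1, -1, -1],
              [-1, -1,  1, -1,  1,  1]])
Y = np.array([[ 1,  1, -1, -1, -1],
              [ 1, -1,  1, -1, -1],
              [-1,  1,  1,  1, -1]])

M = sum(np.outer(x, y) for x, y in zip(X, Y))   # 6x5 correlation matrix

def phi(f, previous):
    """Bipolar threshold: +1 if >0, -1 if <0, keep the previous value if 0."""
    return np.where(f > 0, 1, np.where(f < 0, -1, previous))

alpha = X[2].copy()                  # start from X3
beta = np.ones(5, dtype=int)         # assumed initial beta (not specified in the text)
for _ in range(5):                   # iterate toward an equilibrium point
    beta  = phi(alpha @ M, beta)
    alpha = phi(beta @ M.T, alpha)
print(beta)                          # -> [-1  1  1  1 -1], i.e. Y3
```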
M =
    [  3  -3   1  -1  -1   1  -3  -1  -1 ]
    [  3  -3   1  -1  -1   1  -3  -1  -1 ]
    [  1  -1  -1   1   1  -1  -1  -3   1 ]
    [ -1   1   1  -1  -1   1   1   3  -1 ]
    [ -1   1   1  -1  -1   1   1   3  -1 ]
    [ -1   1  -3  -1  -1  -3   1  -1   3 ]
    [  3  -3   1  -1  -1   1  -3  -1   1 ]
    [  1  -1  -1   1   1  -1  -1  -3   1 ]
    [ -1   1   1  -1  -1   1   1   3  -1 ]
Exponential BAM (eBAM): suppose we are given N training pairs {(A1, B1), (A2, B2), ..., (AN, BN)}, where Ai = (ai1, ai2, ..., ain) and Bi = (bi1, bi2, ..., bip), and let Xi, Yi be the bipolar modes of the training pattern pairs Ai and Bi respectively, with Xi ∈ {-1, 1}^n and Yi ∈ {-1, 1}^p. Then we use the following equations in the recall process of eBAM:

yk = 1 if Σ (i = 1 to N) b^(X·Xi) yik >= 0;   yk = -1 if Σ (i = 1 to N) b^(X·Xi) yik < 0
xk = 1 if Σ (i = 1 to N) b^(Y·Yi) xik >= 0;   xk = -1 if Σ (i = 1 to N) b^(Y·Yi) xik < 0

Here, b is a positive number, b > 1, and "·" represents the inner product of X and Xi (and of Y and Yi); i.e., for X = (x1, x2, ..., xn) and Xi = (xi1, xi2, ..., xin), X·Xi = Σ (k = 1 to n) xk xik.
Applications
• Character Recognition
• Defect identification