SIC1614
Evolution of Computing - Soft Computing Constituents - From Conventional AI to Computational
Intelligence - Machine Learning Basics - Fundamentals of ANN - Biological Neurons and Their
Artificial Models - Types of ANN - Properties - Different Learning Rules - Types of Activation
Functions - Training of ANN - Hebb Learning - Perceptron Model (Both Single & Multi Layer) -
Training Algorithm - Problem Solving Using Learning Rules and Algorithms - Linear
Separability - Limitation.
Evolution of Computing
Approximation: the model features are similar to the real ones, but not the same.
Uncertainty: we are not sure that the features of the model are the same as those of the entity.
Imprecision: the model features (quantities) are not the same as the real ones, but are
close to them.
Importance of Soft Computing
Soft computing differs from hard (conventional) computing. Unlike hard computing, soft
computing is tolerant of imprecision, uncertainty, partial truth, and approximation. The guiding
principle of soft computing is to exploit this tolerance to achieve tractability, robustness and
low solution cost. In effect, the role model for soft computing is the human mind.
The four fields that constitute Soft Computing (SC) are Fuzzy Computing (FC), Evolutionary
Computing (EC), Neural Computing (NC), and Probabilistic Computing (PC), with the latter
subsuming belief networks, chaos theory and parts of learning theory. Soft computing is not a
concoction, mixture, or combination; rather, it is a partnership in which each of the partners
contributes a distinct methodology for addressing problems in its domain. In principle, the
constituent methodologies in soft computing are complementary rather than competitive. Soft
computing may be viewed as a foundation component for the emerging field of Conceptual
Intelligence.
Fuzzy Computing
In the real world there exists
mu
Fundamentals of ANN
Neural computing is an information processing paradigm, inspired by biological systems,
composed of a large number of highly interconnected processing elements (neurons) working in
unison to solve specific problems.
Artificial neural networks (ANNs), like people, learn by example. An ANN is configured
for a specific application, such as pattern recognition or data classification, through a learning
process. Learning in biological systems involves adjustments to the synaptic connections that
exist between the neurons. This is true of ANNs as well.
Dendrites are branching fibres that extend from the cell body or soma.
Soma or cell body of a neuron contains the nucleus and other structures that support chemical
processing and production of neurotransmitters.
Axon is a single fibre that carries information away from the soma to the synaptic sites of
other neurons (dendrites and somas), muscles, or glands.
Axon hillock is the site of summation for incoming information. At any moment, the
collective influence of all neurons that conduct impulses to a given neuron determines
whether or not an action potential will be initiated at the axon hillock and propagated along the
axon.
Myelin sheath consists of fat-containing cells that insulate the axon from electrical
activity. This insulation acts to increase the rate of transmission of signals. A gap exists
between each myelin sheath cell along the axon. Since fat inhibits the propagation of
electricity, the signals jump from one gap to the next.
Nodes of Ranvier are the gaps (about 1 µm) between myelin sheath cells. Since fat serves
as a good insulator, the myelin sheaths speed the rate of transmission of an electrical impulse
along the axon.
Synapse is the point of connection between two neurons or a neuron and a muscle or a
gland. Electrochemical communication between neurons takes place at these junctions.
Terminal buttons of a neuron are the small knobs at the end of an axon that release
chemicals called neurotransmitters.
Information flow in a neural cell
The input/output and the propagation of information are shown below.
- Temporal information processing.
Basic Elements:
A neuron consists of three basic components: weights, a threshold, and a single
activation function.
Types of ANN
Single Layer Feedforward NN
- A single layer network has one layer of connection weights.
- Often the units can be distinguished as input units, which receive signals from the
outside world, and output units, from which the response of the net can be read.
- In a typical single layer net the input units are fully connected to output units but are
not connected to other input units, and the output units are not connected to other output units.
Multilayer feedforward NN
- A multilayer net is a net with one or more layers of nodes between the input units
and the output units.
- Typically there is a layer of weights between two adjacent levels of units.
- Multilayer nets can solve more complicated problems than single layer nets can,
but training may be more difficult.
Competitive layer:
- A competitive layer forms a part of a large number of neural networks.
- The competitive interconnections have weights of -ɛ.
Supervised learning :
Every input pattern that is used to train the network is associated with an output pattern,
which is the target or the desired pattern.
A teacher is assumed to be present during the training process, when a comparison is
made between the network's computed output and the correct expected output to determine
the error.
The error can then be used to change network parameters, which results in an
improvement in performance.
Unsupervised learning:
In this learning method the target output is not presented to the network.
It is as if there is no teacher to present the desired patterns, and hence the system learns on its
own by discovering and adapting to structural features in the input patterns.
Reinforced learning:
In this method, a teacher, though available, does not present the expected answer but only
indicates whether the computed output is correct or incorrect.
The information provided helps the network in the learning process.
Hebbian learning:
This rule was proposed by Hebb and is based on correlative weight adjustment.
This is the oldest learning mechanism inspired by biology.
In this rule, the input-output pattern pairs $(X_i, Y_i)$ are associated by the weight matrix $W$, known as the
correlation matrix.
It is computed as
$$W = \sum_{i=1}^{n} X_i Y_i^{T}$$
Here $Y_i^{T}$ is the transpose of the associated output vector $Y_i$. Numerous
variants of the rule have been proposed.
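The correlation-matrix form of Hebbian learning can be sketched in a few lines of Python. This is only an illustration: the helper names (`outer`, `correlation_matrix`) and the two bipolar training pairs are assumptions made for the example, not part of the notes. Each pair contributes the outer product of its input vector with its output vector, and the contributions are summed.

```python
def outer(x, y):
    """Outer product x y^T of two vectors given as Python lists."""
    return [[xi * yj for yj in y] for xi in x]

def correlation_matrix(pairs):
    """Sum the outer products X_i Y_i^T over all (input, output) pairs."""
    n_in, n_out = len(pairs[0][0]), len(pairs[0][1])
    W = [[0] * n_out for _ in range(n_in)]
    for x, y in pairs:
        op = outer(x, y)
        for r in range(n_in):
            for c in range(n_out):
                W[r][c] += op[r][c]
    return W

# Two illustrative bipolar pattern pairs (hypothetical data).
pairs = [([1, -1, 1], [1, -1]),
         ([-1, 1, 1], [-1, 1])]
W = correlation_matrix(pairs)
print(W)
```

The third input component takes the same value in both pairs while the outputs differ, so its row in `W` cancels to zero: it carries no information about the output.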
Gradient descent learning:
This is based on the minimization of error E defined in terms of weights and activation
function of the network.
Also it is required that the activation function employed by the network is differentiable, as
the weight update is dependent on the gradient of the error E.
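Thus, if $\Delta w_{ij}$ is the weight update of the link connecting the $i$-th and $j$-th neurons of the two
neighbouring layers, then $\Delta w_{ij}$ is defined as
$$\Delta w_{ij} = -\eta \frac{\partial E}{\partial w_{ij}}$$
where $\eta$ is the learning rate parameter and $\partial E / \partial w_{ij}$ is the error gradient
with reference to the weight $w_{ij}$. The negative sign moves each weight in the direction that decreases $E$.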
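As a minimal sketch of gradient descent learning, the following Python fits a single linear neuron y = w*x + b by repeatedly stepping against the gradient of the squared error E = 0.5 * sum (t - y)^2. The function name, the learning rate of 0.1, the epoch count and the three training samples are all assumptions chosen for illustration.

```python
def train_linear_neuron(samples, eta=0.1, epochs=200):
    """Batch gradient descent on E = 0.5 * sum((t - y)^2) for y = w*x + b."""
    w, b = 0.0, 0.0
    for _ in range(epochs):
        grad_w = grad_b = 0.0
        for x, t in samples:
            y = w * x + b
            grad_w += -(t - y) * x   # dE/dw contribution of this sample
            grad_b += -(t - y)       # dE/db contribution of this sample
        w -= eta * grad_w            # step against the gradient
        b -= eta * grad_b
    return w, b

# Hypothetical samples generated from t = 2x + 1.
samples = [(0.0, 1.0), (1.0, 3.0), (2.0, 5.0)]
w, b = train_linear_neuron(samples)
print(round(w, 4), round(b, 4))
```

A learning rate that is too large makes the updates overshoot and diverge, which is why the weights are adjusted in small steps.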
Competitive learning:
In this method, those neurons which respond strongly to input stimuli have their weights
updated.
When an input pattern is presented, all neurons in the layer compete and the winning
neuron undergoes weight adjustment.
Hence it is a winner-takes-all strategy.
Stochastic learning:
In this method, weights are adjusted in a probabilistic fashion.
An example is simulated annealing, the learning mechanism employed by
Boltzmann and Cauchy machines, which are a kind of NN system.
Types of Activation Functions
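• Common activation functions
– Identity function
• $f(x) = x$ for all $x$
– Binary step function with threshold $\theta$ (also known as the Heaviside function or threshold
function)
• $f(x) = 1$ if $x \ge \theta$, and $f(x) = 0$ if $x < \theta$
– Binary sigmoid
• $f(x) = \dfrac{1}{1 + e^{-x}}$
– Bipolar sigmoid
• $f(x) = \dfrac{2}{1 + e^{-x}} - 1 = \dfrac{1 - e^{-x}}{1 + e^{-x}}$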
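The common activation functions listed in this section can be written directly in Python. The sketch below assumes the standard textbook definitions (binary sigmoid 1/(1 + e^-x), bipolar sigmoid 2/(1 + e^-x) - 1, both with steepness 1); the function names are chosen here for illustration.

```python
import math

def identity(x):
    return x

def binary_step(x, theta=0.0):
    """Heaviside / threshold function: 1 at or above theta, else 0."""
    return 1 if x >= theta else 0

def binary_sigmoid(x):
    """Logistic function with range (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))

def bipolar_sigmoid(x):
    """Rescaled logistic function with range (-1, 1)."""
    return 2.0 / (1.0 + math.exp(-x)) - 1.0

for x in (-2.0, 0.0, 2.0):
    print(x, identity(x), binary_step(x),
          round(binary_sigmoid(x), 4), round(bipolar_sigmoid(x), 4))
```

With this definition the bipolar sigmoid equals tanh(x/2), which is a convenient way to check an implementation.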
Training a Neural Network
• Whether our neural network is a simple Perceptron, or a much more complicated
multilayer network with special activation functions, we need to develop a systematic
procedure for determining appropriate connection weights.
• The general procedure is to have the network learn the appropriate weights from a
representative set of training data.
• In all but the simplest cases, however, direct computation of the weights is intractable.
Instead, we usually start off with random initial weights and adjust them in small steps until the
required outputs are produced.
• We shall now look at a brute force derivation of such an iterative learning algorithm
for simple Perceptrons.
Perceptron Model (Both Single & Multi Layer)
Simple Perceptron for Pattern Classification
We consider here a NN, known as the Perceptron, which is capable of performing pattern
classification into two or more categories. The perceptron is trained using the perceptron learning
rule. We will first consider classification into two categories and then the general multiclass
classification later. For classification into only two categories, all we need is a single output
neuron. Here we will use bipolar neurons. The simplest architecture that could do the job consists
of a layer of N input neurons, an output layer with a single output neuron, and no hidden layers.
This is the same architecture as we saw before for Hebb learning. However, we will use a
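different transfer function here for the output neuron:
$$f(y_{in}) = \begin{cases} 1 & \text{if } y_{in} > \theta \\ 0 & \text{if } -\theta \le y_{in} \le \theta \\ -1 & \text{if } y_{in} < -\theta \end{cases}$$
where $\theta$ is a fixed threshold.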
Perceptron Algorithm
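• Step 0: Initialize weights and bias
– For simplicity, set weights and bias to zero
– Set learning rate $\alpha$ ($0 < \alpha \le 1$)
• Step 1: While stopping condition is false do steps 2-6
• Step 2: For each training pair s:t do steps 3-5
• Step 3: Set activations of input units
$x_i = s_i$
• Step 4: Compute response of output unit:
$y_{in} = b + \sum_i x_i w_i$, then $y = f(y_{in})$, where $f$ is the transfer function of the output neuron
• Step 5: Update weights and bias if an error occurred for this pattern
if $y \ne t$
$w_i(\text{new}) = w_i(\text{old}) + \alpha\, t\, x_i$
$b(\text{new}) = b(\text{old}) + \alpha\, t$
else
$w_i(\text{new}) = w_i(\text{old})$
$b(\text{new}) = b(\text{old})$
• Step 6: Test stopping condition: if no weights changed in step 2, stop; otherwise continue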
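The perceptron algorithm above can be sketched in Python as follows. Several details are assumptions made for the example rather than statements from the notes: a three-valued output (1, 0, -1) with threshold `theta`, a learning rate of 1, stopping when an epoch produces no weight change, and the logical AND function with bipolar inputs and targets as training data.

```python
def activation(y_in, theta=0.0):
    """Three-valued output: 1 above theta, -1 below -theta, 0 in between."""
    if y_in > theta:
        return 1
    if y_in < -theta:
        return -1
    return 0

def train_perceptron(pairs, alpha=1.0, theta=0.0, max_epochs=100):
    n = len(pairs[0][0])
    w, b = [0.0] * n, 0.0                       # Step 0: zero weights and bias
    for epoch in range(1, max_epochs + 1):      # Step 1
        changed = False
        for s, t in pairs:                      # Steps 2-3: present each pair
            y_in = b + sum(xi * wi for xi, wi in zip(s, w))   # Step 4
            y = activation(y_in, theta)
            if y != t:                          # Step 5: update only on error
                w = [wi + alpha * t * xi for wi, xi in zip(w, s)]
                b += alpha * t
                changed = True
        if not changed:                         # stop when an epoch is error-free
            return w, b, epoch
    return w, b, max_epochs

# Logical AND with bipolar inputs and bipolar targets (illustrative data).
and_pairs = [((1, 1), 1), ((1, -1), -1), ((-1, 1), -1), ((-1, -1), -1)]
w, b, epochs = train_perceptron(and_pairs)
print(w, b, epochs)
```

For this data the weights change only during the first epoch; the second epoch confirms that every pattern is classified correctly, so training stops.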
Perceptron and linearly separable tasks:
Minsky and Papert (1969) showed that perceptrons are successful only on problems with a linearly separable
solution space, and cited the XOR problem as an illustration.
In particular, a perceptron cannot handle tasks that are not linearly separable.
Sets of points in two-dimensional space are linearly separable if the sets can be separated by a straight line.
Generalizing, sets of points in n-dimensional space are linearly separable if there is a hyperplane of (n-1)
dimensions that separates the sets.
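Linear separability can be illustrated with a small search. The sketch below looks for a line w1*x1 + w2*x2 + b = 0 with small integer coefficients that puts every pattern on the correct side. The grid of candidate values and the bipolar encoding are assumptions for the example; a failed grid search only illustrates, and does not prove, that XOR has no separating line (the proof is a short contradiction between the four inequalities).

```python
from itertools import product

def separable_on_grid(pairs, values=range(-3, 4)):
    """Return some (w1, w2, b) from the grid that classifies every pair, or None."""
    for w1, w2, b in product(values, repeat=3):
        if all((w1 * x1 + w2 * x2 + b) * t > 0 for (x1, x2), t in pairs):
            return (w1, w2, b)
    return None

# Bipolar encodings of AND and XOR (illustrative data).
AND = [((1, 1), 1), ((1, -1), -1), ((-1, 1), -1), ((-1, -1), -1)]
XOR = [((1, 1), -1), ((1, -1), 1), ((-1, 1), 1), ((-1, -1), -1)]

and_line = separable_on_grid(AND)
xor_line = separable_on_grid(XOR)
print("AND:", and_line)
print("XOR:", xor_line)
```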
Hebbian net:
The Hebb rule is the earliest and simplest learning rule for a neural net. Hebb proposed that learning occurs by
modification of the synapse strengths in such a manner that if two interconnected neurons are both "ON" at the
same time, then the weight between those neurons should be increased.
• Step 0: Initialize all weights
– For simplicity, set weights and bias to zero
• Step 1: For each input training vector and target output pair s:t, do steps 2-4
• Step 2: Set activations of input units
$x_i = s_i$
• Step 3: Set the activation for the output unit
$y = t$
• Step 4: Adjust weights and bias
$w_i(\text{new}) = w_i(\text{old}) + x_i\, y$
$b(\text{new}) = b(\text{old}) + y$
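The Hebb net steps above translate almost line for line into Python. In this sketch the function name `train_hebb` and the choice of the AND function with bipolar inputs and targets are assumptions for illustration; the net makes a single pass over the training pairs.

```python
def train_hebb(pairs):
    n = len(pairs[0][0])
    w, b = [0] * n, 0                      # Step 0: zero weights and bias
    for s, t in pairs:                     # Step 1: one pass over the pairs
        x, y = s, t                        # Steps 2-3: clamp input and output
        w = [wi + xi * y for wi, xi in zip(w, x)]   # Step 4: Hebb update
        b += y
    return w, b

# Logical AND with bipolar inputs and targets (illustrative data).
and_pairs = [((1, 1), 1), ((1, -1), -1), ((-1, 1), -1), ((-1, -1), -1)]
w, b = train_hebb(and_pairs)
print(w, b)

# The learned weights reproduce every target when the sign of the net input is taken.
for s, t in and_pairs:
    net = b + sum(xi * wi for xi, wi in zip(s, w))
    print(s, t, 1 if net > 0 else -1)
```

With binary (0/1) inputs or targets the same rule can fail to learn AND, because a zero activation produces no weight change; the bipolar encoding avoids this.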
TEXT / REFERENCE BOOKS
1. Laurene Fausett, "Fundamentals of Neural Networks: Architectures, Algorithms and Applications", 2008.
2. Timothy J. Ross, "Fuzzy Logic with Engineering Applications", McGraw-Hill International Editions, 2004.
3. Jang J.S.R., Sun C.T. and Mizutani E., "Neuro-Fuzzy and Soft Computing", Pearson Education, 2003.
4. Rajasekaran S. and Pai G.A.V., "Neural Networks, Fuzzy Logic and Genetic Algorithms", Prentice Hall of India, 2003.
Part- A
1. What is meant by evolutionary computation
2. What is meant by ANN
3. Define Architecture in NN
4. Define Training algorithm
5. Give some applications of ANN
6. Give the properties of biological neurons
7. Give the different types of ANN
8. Write the types of learning rules
9. What is meant by supervised learning
10. What is meant by competitive learning
11. Give the properties of ANN
12. What is meant by activation function
13. What is meant by Hebb Net.
14. What is meant by linear separability
15. Give the limitations of linear separability
Part-B
1. Explain the architecture, Learning rule and algorithm of Perceptron Model
2. Explain in detail about Biological Neural Network and compare with ANN.
3. Explain in detail about the different types of architecture
4. Elaborate on different learning rules
5. Explain in detail about Hebb Net .
6. Write notes on different types of activation function
7. Train logical AND gate with binary input & bipolar target using perceptron algorithm with initial
weights and bias as
w1=2 , w2=2 and b=-4
SCHOOL OF ELECTRICAL AND ELECTRONICS
DEPARTMENT OF ELECTRONICS AND INSTRUMENTATION ENGINEERING
DETERMINISTIC AND STATISTICAL NETWORKS
Back Propagation Training Algorithm - Practical Difficulties - Counter Propagation Network - Structure & Operation
- Training of Kohonen and Grossberg Layer - Applications of BPN & CPN - Statistical Method - Training
Application - Boltzmann Training - Cauchy Training - Hopfield Network and Boltzmann Machine - Speed Energy
Function - Network Capacity - RBF Network, BAM, Architecture of SOM, ANN based water level controller.
Introduction
Back propagation is a way of learning the internal representation of a multilayered
network. It is based on a simple idea. We input a vector of values
$p_i$ into the network and get out a corresponding output $a^3_i$ (see fig on preceding slide). We
compare this output to our target (desired) output $t_i$ to determine the cumulative error among the
different output units. But the output units are themselves connected with the hidden units in
the network. We don't know what the hidden units ought to do, but we can compute how fast the
error changes as we change a hidden activity. So instead of using the desired activities to train
the hidden units, we use error derivatives with respect to the hidden activities. Each hidden
activity can affect many output units and can therefore have many separate effects on the error.
These effects must be combined. The point of the back propagation algorithm is to show how
we can compute the error derivatives for all the hidden units efficiently. Once we have the error
derivatives for the hidden activities, it's easy to get the error derivatives for the weights going
into a hidden unit.
Algorithm:
The training involves three stages:
- The feedforward of the input training pattern
- The back propagation of the associated error
- The adjustment of the weights
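Activation function:
An activation function for a backpropagation net should have several important characteristics:
- Continuous
- Differentiable
- Monotonically non-decreasing
One of the most typical activation functions is the binary sigmoid function, which has a range of
(0,1) and is defined as
$$f_1(x) = \frac{1}{1 + e^{-x}}, \qquad f_1'(x) = f_1(x)\,[1 - f_1(x)]$$
Another common activation function is the bipolar sigmoid, which has a range of (-1,1) and
is defined as
$$f_2(x) = \frac{2}{1 + e^{-x}} - 1, \qquad f_2'(x) = \tfrac{1}{2}\,[1 + f_2(x)]\,[1 - f_2(x)]$$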
Step 0. Initialize weights.
(Set to small random values.)
[The remaining algorithm steps are illegible in the source; one surviving fragment refers to units summing their weighted input signals.]
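The three training stages (feedforward, back propagation of the error, weight adjustment) can be sketched for a tiny network with two inputs, two hidden units and one output, all using the binary sigmoid. Everything specific here is an assumption for illustration: the network size, the starting weights, the learning rate of 0.5 and the single training pattern. The derivative of the binary sigmoid is written as f(x)(1 - f(x)).

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def forward(x, v, v0, w, w0):
    """Feedforward pass: returns hidden activations z and output y."""
    z = [sigmoid(v0[j] + sum(x[i] * v[i][j] for i in range(len(x))))
         for j in range(len(v0))]
    y = sigmoid(w0 + sum(z[j] * w[j] for j in range(len(z))))
    return z, y

def backprop_step(x, t, v, v0, w, w0, alpha=0.5):
    # Stage 1: feedforward of the input pattern
    z, y = forward(x, v, v0, w, w0)
    # Stage 2: back propagation of the error
    delta_out = (t - y) * y * (1.0 - y)
    delta_hid = [delta_out * w[j] * z[j] * (1.0 - z[j]) for j in range(len(z))]
    # Stage 3: adjustment of the weights
    new_w = [w[j] + alpha * delta_out * z[j] for j in range(len(z))]
    new_w0 = w0 + alpha * delta_out
    new_v = [[v[i][j] + alpha * delta_hid[j] * x[i] for j in range(len(z))]
             for i in range(len(x))]
    new_v0 = [v0[j] + alpha * delta_hid[j] for j in range(len(z))]
    return new_v, new_v0, new_w, new_w0

# Hypothetical starting weights and a single training pattern.
x, t = [1.0, 0.0], 1.0
v = [[0.1, -0.2], [0.3, 0.4]]    # v[i][j]: input i -> hidden j
v0 = [0.0, 0.0]                  # hidden biases
w, w0 = [0.2, -0.1], 0.0         # hidden -> output weights and output bias

_, y_before = forward(x, v, v0, w, w0)
v, v0, w, w0 = backprop_step(x, t, v, v0, w, w0)
_, y_after = forward(x, v, v0, w, w0)
print(round(y_before, 4), round(y_after, 4))
```

After one step the output moves toward the target; training in practice repeats this over all patterns for many epochs.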
Counter propagation network
They are multilayer networks based on a combination of input, clustering and output
layers.
Counter propagation networks are trained in two stages:
First stage – the input vectors are clustered
- No topology was assumed for the cluster units.
Second stage – the weights from the cluster units to the output units are adapted to produce the
desired response
There are two types of counter propagation nets:
- Full counter propagation
- Forward only counter propagation
Architecture:
Full counter propagation network:
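First phase:
The units in the X input, cluster and Y input layers are active.
The learning rule for weight updates on the winning cluster unit $Z_J$ is
$$v_{iJ}(\text{new}) = (1 - \alpha)\, v_{iJ}(\text{old}) + \alpha\, x_i, \quad i = 1, \dots, n$$
$$w_{kJ}(\text{new}) = (1 - \beta)\, w_{kJ}(\text{old}) + \beta\, y_k, \quad k = 1, \dots, m$$
Second phase:
Only the winning unit $Z_J$ remains active in the cluster layer.
The weight updates for the units in the Y output and X output layers are
$$u_{Jk}(\text{new}) = (1 - a)\, u_{Jk}(\text{old}) + a\, y_k, \quad k = 1, \dots, m$$
$$t_{Ji}(\text{new}) = (1 - b)\, t_{Ji}(\text{old}) + b\, x_i, \quad i = 1, \dots, n$$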
Algorithm:
In steps 4 and 11:
- In case of a tie, take the unit with the smallest index.
- To use the dot product metric, find the cluster unit $Z_J$ with the largest net input:
$$z_{in,j} = \sum_i x_i v_{ij} + \sum_k y_k w_{kj}$$
The weight vectors and input vectors should be normalized to use the dot product metric.
- To use the Euclidean distance metric, find the cluster unit $Z_J$, the square of whose distance
from the input vectors is smallest:
$$D_j = \sum_i (x_i - v_{ij})^2 + \sum_k (y_k - w_{kj})^2$$
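The two ways of choosing the winning cluster unit can be compared with a short Python sketch. For brevity it treats the input as a single vector (the x and y parts could simply be concatenated); the function names and the sample vectors are assumptions for illustration. Ties go to the smallest index because `max` and `min` return the first best candidate.

```python
def winner_by_dot_product(x, weight_vectors):
    """Index of the cluster unit with the largest net input (dot product)."""
    nets = [sum(xi * wi for xi, wi in zip(x, w)) for w in weight_vectors]
    return max(range(len(nets)), key=lambda j: nets[j])

def winner_by_euclidean(x, weight_vectors):
    """Index of the cluster unit with the smallest squared distance to x."""
    dists = [sum((xi - wi) ** 2 for xi, wi in zip(x, w)) for w in weight_vectors]
    return min(range(len(dists)), key=lambda j: dists[j])

# Normalized input and three unit-length weight vectors (illustrative values).
x = [0.6, 0.8]
units = [[1.0, 0.0], [0.0, 1.0], [0.6, 0.8]]
print(winner_by_dot_product(x, units), winner_by_euclidean(x, units))
```

When the vectors are normalized, as here, both metrics select the same unit; without normalization a long weight vector can win the dot product comparison even if it points in a poor direction.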
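Algorithm:
Learning rule for weights from input units to cluster units:
$$v_{iJ}(\text{new}) = (1 - \alpha)\, v_{iJ}(\text{old}) + \alpha\, x_i$$
Learning rule for weights from cluster units to output units:
$$w_{Jk}(\text{new}) = (1 - a)\, w_{Jk}(\text{old}) + a\, y_k$$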
In steps 4 and 11:
- In case of a tie, take the unit with the smallest index.
- To use the dot product metric, find the cluster unit $Z_J$ with the largest net input, $z_{in,j} = \sum_i x_i v_{ij}$.
- To use the Euclidean distance metric, find the cluster unit $Z_J$, the square of whose distance from
the input vector, $D_j = \sum_i (x_i - v_{ij})^2$, is smallest.
Training of Kohonen:
Algorithm:
To obtain approximate value of y for x = 0.12.
Step 0: Initialize weights.
Step 1: For the input x = 0.12, y = 0.0, do steps 2-4.
Step 2: Set X input layer activations to vector x.
Set Y input layer activations to vector y.
Step 3: Find the index J of the winning cluster unit, the squares of the distances from the
input to each of the cluster units are,
Step 5: Find the index J of the winning cluster unit, the squares of the distances from the
input to each of the cluster units are,
Thus based on the input from x only, the closest cluster unit is J = 1. Step
Boltzmann machine:
The states of the units are binary valued, with probabilistic state transitions. The
machine described in this section has fixed weights $w_{ij}$, which express the degree of desirability
that units $X_i$ and $X_j$ both be "ON". In applying Boltzmann machines to constrained optimization
problems, the weights represent the constraints of the problem and the quantity to be
optimized.
The objective of the neural net is to maximize the consensus function,
$$C = \sum_i \sum_{j \le i} w_{ij}\, x_i\, x_j$$
The sum runs over all units of the net. In the sequential Boltzmann machine, the change in
consensus if unit $X_i$ were to change its state is
$$\Delta C(i) = [1 - 2x_i]\Big[w_{ii} + \sum_{j \ne i} w_{ij}\, x_j\Big]$$
The probability of the net accepting a change in state for unit $X_i$ is
$$A(i, T) = \frac{1}{1 + \exp\left(-\dfrac{\Delta C(i)}{T}\right)}$$
The control parameter $T$, called the temperature, is gradually reduced as the net searches for a
maximal consensus.
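The consensus function, the change in consensus for a single unit, and the acceptance probability can be checked numerically. The sketch below assumes the usual fixed-weight Boltzmann machine definitions: consensus as the sum of w_ij x_i x_j over pairs with j <= i, and a logistic acceptance probability 1 / (1 + exp(-delta_C / T)). The two-unit weight matrix and the state are illustrative values.

```python
import math

def consensus(w, x):
    """C = sum over i and j <= i of w[i][j] * x[i] * x[j]."""
    n = len(x)
    return sum(w[i][j] * x[i] * x[j] for i in range(n) for j in range(i + 1))

def delta_consensus(w, x, i):
    """Change in consensus if unit i flipped its binary state."""
    others = sum(w[i][j] * x[j] for j in range(len(x)) if j != i)
    return (1 - 2 * x[i]) * (w[i][i] + others)

def acceptance_probability(delta_c, T):
    """Probability of accepting the flip at temperature T."""
    return 1.0 / (1.0 + math.exp(-delta_c / T))

# Symmetric weights: self-connections reward a unit being on,
# the negative cross weight penalises both being on together.
w = [[1, -2], [-2, 1]]
x = [1, 0]

dc = delta_consensus(w, x, 1)    # effect of switching unit 1 on
print(dc, consensus(w, [1, 1]) - consensus(w, x))
print(acceptance_probability(dc, 10.0), acceptance_probability(dc, 0.1))
```

At a high temperature the unfavourable flip is accepted almost half the time; as the temperature is lowered the probability falls toward zero, so the net settles into a state of high consensus.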
Architecture:
The architecture of a Boltzmann machine for units arranged in a two dimensional array.
Hopfield Network:
The net is a fully interconnected neural net, in the sense that each unit is connected to every
other unit. The net has symmetric weights with no self-connections,
$w_{ij} = w_{ji}$ and $w_{ii} = 0$.
Only one unit updates its activation at a time, and each unit continues to receive an external signal
in addition to the signal from the other units in the net. The asynchronous updating of the units
allows a function, known as an energy or Lyapunov function, to be found for the net.
Architecture:
An expanded form of a common representation of the Hopfield net,
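Algorithm:
To store a set of binary patterns $s(p)$, $p = 1, \dots, P$, where $s(p) = (s_1(p), \dots, s_n(p))$, the weight matrix $W = \{w_{ij}\}$ is given by
$$w_{ij} = \sum_p [2 s_i(p) - 1][2 s_j(p) - 1] \quad \text{for } i \ne j$$
and $w_{ii} = 0$.
To store a set of bipolar patterns $s(p)$, $p = 1, \dots, P$, where $s(p) = (s_1(p), \dots, s_n(p))$, the weight matrix $W = \{w_{ij}\}$ is given by
$$w_{ij} = \sum_p s_i(p)\, s_j(p) \quad \text{for } i \ne j$$
and $w_{ii} = 0$.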
Example:
Testing a discrete Hopfield net: mistakes in the first and second components of the stored vector.
The vector to be stored is (1, 1, 1, 0), or (1, 1, 1, -1) in bipolar form.
Input vector with mistakes: (0, 0, 1, 0)
Update order (, , , )
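The example can be worked through with a short Python sketch of a discrete Hopfield net. It assumes the standard storage rule for binary patterns (products of the bipolar equivalents, zero diagonal) and asynchronous updates in which each unit adds its external input to the signals from the other units. The update order used below (units 1, 4, 3, 2, written as zero-based indices) is an assumption for illustration, since the order is not legible in the notes.

```python
def hopfield_weights(patterns):
    """Weights for binary patterns: sum of products of bipolar equivalents, zero diagonal."""
    n = len(patterns[0])
    w = [[0] * n for _ in range(n)]
    for s in patterns:
        for i in range(n):
            for j in range(n):
                if i != j:
                    w[i][j] += (2 * s[i] - 1) * (2 * s[j] - 1)
    return w

def recall(w, x, order, theta=0, max_sweeps=10):
    """Asynchronous recall: update one unit at a time until nothing changes."""
    y = list(x)
    for _ in range(max_sweeps):
        changed = False
        for i in order:
            y_in = x[i] + sum(y[j] * w[j][i] for j in range(len(y)))
            new = 1 if y_in > theta else (0 if y_in < theta else y[i])
            if new != y[i]:
                y[i], changed = new, True
        if not changed:
            break
    return y

w = hopfield_weights([[1, 1, 1, 0]])
result = recall(w, [0, 0, 1, 0], order=[0, 3, 2, 1])
print(w)
print(result)
```

The net corrects both mistaken components and settles on the stored vector.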
Self-Organizing Map (SOM)
The Self-Organizing Map is one of the most popular neural network models. It belongs to the category
of competitive learning networks. The Self-Organizing Map is based on unsupervised learning, which
means that no human intervention is needed during the learning and that little needs to be known about
the characteristics of the input data. We could, for example, use the SOM for clustering data without
knowing the class memberships of the input data. The SOM can be used to detect features inherent to
the problem and thus has also been called SOFM, the Self-Organizing Feature Map.
The Self-Organizing Map was developed by Professor Kohonen. The SOM has proven useful in
many applications. For closer review of the applications published in the open literature, see section .
The SOM algorithm is based on unsupervised, competitive learning. It provides a topology preserving
mapping from the high dimensional space to map units. Map units, or neurons, usually form a two-
dimensional lattice and thus the mapping is a mapping from high dimensional space onto a plane. The
property of topology preserving means that the mapping preserves the relative distance between the
points. Points that are near each other in the input space are mapped to nearby map units in the SOM.
The SOM can thus serve as a cluster analyzing tool of high-dimensional data. Also, the SOM has the
capability to generalize. Generalization capability means that the network can recognize or characterize
inputs it has never encountered before. A new input is assimilated with the map unit it is mapped to.
This is illustrated in Figure 1. Each neuron is represented by a vector called the codebook vector.
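One training step of a Self-Organizing Map can be sketched as follows: find the map unit whose codebook vector is closest to the input, then move that unit and its grid neighbours toward the input. The details here are assumptions for illustration and are not taken from the notes: a 2 x 2 map, a fixed learning rate of 0.5, and a simple "bubble" neighbourhood of Manhattan radius 1.

```python
def squared_distance(a, b):
    return sum((ai - bi) ** 2 for ai, bi in zip(a, b))

def best_matching_unit(codebook, x):
    """Grid position of the unit whose codebook vector is closest to x."""
    return min(codebook, key=lambda pos: squared_distance(codebook[pos], x))

def som_update(codebook, x, learning_rate=0.5, radius=1):
    """Move the winner and its grid neighbours toward x; return the winner."""
    br, bc = best_matching_unit(codebook, x)
    for (r, c), m in codebook.items():
        if abs(r - br) + abs(c - bc) <= radius:   # inside the neighbourhood
            codebook[(r, c)] = [mi + learning_rate * (xi - mi)
                                for mi, xi in zip(m, x)]
    return (br, bc)

# A 2 x 2 map with two-dimensional codebook vectors (illustrative values).
codebook = {(0, 0): [0.0, 0.0], (0, 1): [0.0, 1.0],
            (1, 0): [1.0, 0.0], (1, 1): [1.0, 1.0]}
bmu = som_update(codebook, [0.9, 0.8])
print(bmu, codebook)
```

Updating neighbours along with the winner is what gives the map its topology-preserving behaviour: nearby units end up with similar codebook vectors. In practice both the learning rate and the radius shrink over time.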
ANN based water level controller.
Part- A
1. Write the three stages of operation in BPN
2. State the important characteristics of BPN
3. Give the practical consideration of BPN
4. What is meant by CPN
5. Give the application of CPN
6. What is meant by speed energy function
7. What is meant by RBF network
8. List out statistical methods of training
9. What is Boltzmann training
10. What is BAM
11. Write about network capacity
12. Write about the Grossberg layer
13. What is meant by Cauchy training
Part-B
1. Explain about back propagation algorithm in detail and its limitations
2. Explain the architecture and algorithm of Full CPN
3. Explain the architecture and algorithm of Forward CPN
4. Elaborate on the training of Hopfield Network
5. Explain in detail about Discrete BAM
6. With neat diagram explain the ANN based water level controller
7. Explain the Architecture of Self Organizing Map algorithm.
Introduction to Fuzzy Set Theory - Basic Concepts of Fuzzy Sets - Classical Set Vs Fuzzy Set - Properties of Fuzzy Set -
Fuzzy Logic Operation on Fuzzy Sets - Fuzzy Logic Control Principles - Fuzzy Relations - Fuzzy Rules - Defuzzification -
Fuzzy Inference Systems - Fuzzy Expert Systems - Fuzzy Decision Making
[Figure: Venn diagrams of set operations; surviving panel titles: Complement, Difference]
Two special properties of set operations are known as the excluded middle axioms and De
Morgan's principles. These properties are enumerated here for two sets A and B. The excluded
middle axioms are very important because these are the only set operations described here that
are not valid for both classical sets and fuzzy sets. There are two excluded middle axioms. The
first, called the axiom of the excluded middle, deals with the union of a set A and its
complement; the second, called the axiom of contradiction, represents the intersection of a set A
and its complement. For classical sets, $A \cup \bar{A} = X$ and $A \cap \bar{A} = \emptyset$.
De Morgan's principles are important because of their usefulness in proving tautologies and
contradictions in logic, as well as in a host of other set operations and proofs. De Morgan's
principles are displayed in the shaded areas of the Venn diagrams.
FUZZY SETS
In classical, or crisp, sets the transition for an element in the universe between membership and
nonmembership in a given set is abrupt and well defined (said to be "crisp"). For an element in a
universe that contains fuzzy sets, this transition can be gradual. This transition among various
degrees of membership can be thought of as conforming to the fact that the boundaries of the
fuzzy sets are vague and ambiguous. Hence, membership of an element from the universe in this
set is measured by a function that attempts to describe vagueness and ambiguity.
A notation convention for fuzzy sets when the universe of discourse, X, is discrete and finite, is
as follows for a fuzzy set A:
$$A = \left\{ \frac{\mu_A(x_1)}{x_1} + \frac{\mu_A(x_2)}{x_2} + \cdots \right\} = \left\{ \sum_i \frac{\mu_A(x_i)}{x_i} \right\}$$
When the universe X is continuous and infinite, the fuzzy set A is denoted by
$$A = \left\{ \int \frac{\mu_A(x)}{x} \right\}$$
In both notations, the horizontal bar is not a quotient but rather a delimiter. The numerator in
each term is the membership value in set A associated with the element of the universe
indicated in the denominator. In the first notation, the summation symbol is not for algebraic
summation, but rather denotes the collection or aggregation of each element; hence, the "+"
signs in the first notation are not the algebraic "add" but are an aggregation or collection
operator. In the second notation, the integral sign is not an algebraic integral but a continuous
function-theoretic aggregation operator for continuous variables.
Fuzzy Set Operations
Define three fuzzy sets A, B, and C on the universe X. For a given element x of the
universe, the following function-theoretic operations for the set-theoretic operations of
union, intersection, and complement are defined for A, B, and C on X:
Union: $\mu_{A \cup B}(x) = \mu_A(x) \vee \mu_B(x) = \max(\mu_A(x), \mu_B(x))$
Intersection: $\mu_{A \cap B}(x) = \mu_A(x) \wedge \mu_B(x) = \min(\mu_A(x), \mu_B(x))$
Complement: $\mu_{\bar{A}}(x) = 1 - \mu_A(x)$
De Morgan's principles for classical sets also hold for fuzzy sets, as denoted by the following
expressions:
$$\overline{A \cap B} = \bar{A} \cup \bar{B}, \qquad \overline{A \cup B} = \bar{A} \cap \bar{B}$$
As enumerated before, all other operations on classical sets also hold for fuzzy sets, except for
the excluded middle axioms. These two axioms do not hold for fuzzy sets since they do not
form part of the basic axiomatic structure of fuzzy sets. Since fuzzy sets can overlap, a set and
its complement can also overlap. The excluded middle axioms, extended for fuzzy sets, are
expressed as
$$A \cup \bar{A} \ne X, \qquad A \cap \bar{A} \ne \emptyset$$
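The standard fuzzy set operations (union as max, intersection as min, complement as one minus the membership) are easy to try out on a small discrete universe. In the sketch below a fuzzy set is represented as a Python dictionary from element to membership value; the representation, the function names and the membership values are assumptions for illustration.

```python
def fuzzy_union(a, b):
    return {x: max(a[x], b[x]) for x in a}

def fuzzy_intersection(a, b):
    return {x: min(a[x], b[x]) for x in a}

def fuzzy_complement(a):
    return {x: 1.0 - m for x, m in a.items()}

# Two fuzzy sets on the universe {x1, x2, x3} (illustrative memberships).
A = {"x1": 0.25, "x2": 0.5, "x3": 1.0}
B = {"x1": 0.75, "x2": 0.5, "x3": 0.0}

# De Morgan's principles hold for fuzzy sets.
lhs = fuzzy_complement(fuzzy_union(A, B))
rhs = fuzzy_intersection(fuzzy_complement(A), fuzzy_complement(B))
print(lhs == rhs)

# The excluded middle axioms do not: A and its complement overlap.
print(fuzzy_union(A, fuzzy_complement(A)))         # not all 1.0
print(fuzzy_intersection(A, fuzzy_complement(A)))  # not all 0.0
```

Only the element with full membership (x3) behaves as it would in a crisp set; the partially included elements are what break the excluded middle axioms.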
7. De Morgan's principle $\overline{A \cup B} = \bar{A} \cap \bar{B}$ asserts that the loads that are safe for
neither material D nor material B are the intersection of those that are unsafe for material D
with those that are unsafe for material B.
Fuzzy sets vs. crisp sets
Crisp sets are the sets that we have used most of our life. In a crisp set, an element is either a
member of the set or not. For example, a jelly bean belongs in the class of food known as
candy. Mashed potatoes do not.
Fuzzy sets, on the other hand, allow elements to be partially in a set. Each element is given a
degree of membership in a set. This membership value can range from 0 (not an element of the
set) to 1 (a member of the set). If one allowed only the extreme membership values
of 0 and 1, this would be equivalent to crisp sets. A membership function is the
relationship between the values of an element and its degree of membership in a set. An example
of membership functions is shown in the figure. In this example, the sets (or classes) are
numbers that are negative large, negative medium, negative small, near zero, positive small,
positive medium, and positive large. The value, µ, is the amount of membership in the set.
CARTESIAN PRODUCT
An ordered sequence of r elements, written in the form (a1, a2, a3,...,ar), is called an
ordered r-tuple; an unordered r-tuple is simply a collection of r elements without restrictions on
order. In the ubiquitous special case where r = 2, the r-tuple is referred to as an ordered pair. For
crisp sets A1, A2,..., Ar, the set of all r-tuples (a1, a2, a3,...,ar), where a1 ∈ A1, a2 ∈ A2, ..., and
ar ∈ Ar, is called the Cartesian product of A1, A2,..., Ar, and is denoted by A1 × A2 ×···× Ar. The
Cartesian product of two or more sets is not the same thing as the arithmetic product of two or
more sets. The latter is dealt with in Chapter 12, when the extension principle is introduced.
When all the Ar are identical and equal to A, the Cartesian product A1 × A2 × ···× Ar can be
denoted as $A^r$.
FUZZY RELATIONS
Fuzzy relations also map elements of one universe, say X, to those of another universe,
say Y, through the Cartesian product of the two universes. However, the "strength" of the relation
between ordered pairs of the two universes is not measured with the characteristic function, but
rather with a membership function expressing various "degrees" of strength of the relation on the
unit interval [0,1]. Hence, a fuzzy relation R is a mapping from the Cartesian space X × Y to
the interval [0,1], where the strength of the mapping is expressed by the membership function of
the relation for ordered pairs from the two universes, or µR (x, y).
As seen in the foregoing expressions, the excluded middle axioms for fuzzy relations do not
result, in general, in the null relation, O, or the complete relation, E.
Fuzzy Cartesian Product and Composition
Because fuzzy relations in general are fuzzy sets, we can define the Cartesian product to
be a relation between two or more fuzzy sets. Let A be a fuzzy set on universe X and B be a
fuzzy set on universe Y; then the Cartesian product between fuzzy sets A and B will result in
a fuzzy relation R, which is contained within the full Cartesian product space, or
$$A \times B = R \subset X \times Y$$
where the fuzzy relation R has membership function
$$\mu_R(x, y) = \mu_{A \times B}(x, y) = \min(\mu_A(x), \mu_B(y))$$
Example .
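Then, for a fuzzy relation R on X × Y and a fuzzy relation S on Y × Z, the resulting relation, T, which relates
elements of universe X to elements of universe Z, that is, defined on Cartesian space X × Z, can be found by
max–min composition:
$$\mu_T(x, z) = \bigvee_{y \in Y} \big( \mu_R(x, y) \wedge \mu_S(y, z) \big)$$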
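Both the fuzzy Cartesian product and max-min composition reduce to a few lines of Python when relations are stored as nested lists (rows for the first universe, columns for the second). The function names and all membership values below are assumptions for illustration.

```python
def cartesian_product(a, b):
    """Fuzzy relation A x B with membership min(mu_A(x), mu_B(y))."""
    return [[min(ma, mb) for mb in b] for ma in a]

def max_min_composition(r, s):
    """T = R o S, where T[i][j] = max over k of min(R[i][k], S[k][j])."""
    return [[max(min(r[i][k], s[k][j]) for k in range(len(s)))
             for j in range(len(s[0]))]
            for i in range(len(r))]

# Fuzzy sets on X (3 elements) and Y (2 elements), illustrative values.
A = [0.2, 0.5, 1.0]
B = [0.3, 0.9]
print(cartesian_product(A, B))

# R relates X to Y (2 x 2), S relates Y to Z (2 x 3), illustrative values.
R = [[0.7, 0.5],
     [0.8, 0.4]]
S = [[0.9, 0.6, 0.2],
     [0.1, 0.7, 0.5]]
T = max_min_composition(R, S)
print(T)
```

The composition has the same shape as a matrix product, with multiplication replaced by min and addition replaced by max.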
FEATURES OF THE MEMBERSHIP FUNCTION
Since all information contained in a fuzzy set is described by its membership function, it
is useful to develop a lexicon of terms to describe various special features of this function. For
purposes of simplicity, the functions shown in the figures will all be continuous, but the terms
apply equally for both discrete and continuous fuzzy sets.
The core of a membership function for some fuzzy set A is defined as that region of the
universe that is characterized by complete and full membership in the set A. That is, the core
comprises those elements x of the universe such that µA (x) = 1.
The support of a membership function for some fuzzy set A is defined as that region
of the universe that is characterized by nonzero membership in the set A. That is, the support
comprises those elements x of the universe such that µA (x) > 0.
The boundaries of a membership function for some fuzzy set A are defined as that
region of the universe containing elements that have a nonzero membership but not complete
membership. That is, the boundaries comprise those elements x of the universe such that 0 < µA
(x) < 1. These elements of the universe are those with some degree of fuzziness, or only partial
membership in the fuzzy set A.
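For a discrete fuzzy set, the core, support and boundaries defined above can be computed directly from the membership values. In this sketch a fuzzy set is a dictionary from element to membership; the function names and the sample set are assumptions for illustration.

```python
def core(mu):
    """Elements with complete membership (mu = 1)."""
    return {x for x, m in mu.items() if m == 1.0}

def support(mu):
    """Elements with nonzero membership (mu > 0)."""
    return {x for x, m in mu.items() if m > 0.0}

def boundary(mu):
    """Elements with partial membership (0 < mu < 1)."""
    return {x for x, m in mu.items() if 0.0 < m < 1.0}

# Illustrative fuzzy set on the universe {0, 1, 2, 3, 4, 5}.
A = {0: 0.0, 1: 0.5, 2: 1.0, 3: 1.0, 4: 0.25, 5: 0.0}
print(core(A), support(A), boundary(A))
```

The boundary is exactly the part of the support that lies outside the core.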
FUZZIFICATION
Fuzzification is the process of making a crisp quantity fuzzy. We do this by simply
recognizing that many of the quantities that we consider to be crisp and deterministic are
actually not deterministic at all; they carry considerable uncertainty. If the form of
uncertainty happens to arise because of imprecision, ambiguity, or vagueness, then the
variable is probably fuzzy and can be represented by a membership function.
DEFUZZIFICATION TO CRISP SETS
We begin by considering a fuzzy set A, then define a lambda-cut set, $A_\lambda$, where $0 \le \lambda \le 1$.
The set $A_\lambda$ is a crisp set called the lambda ($\lambda$)-cut (or alpha-cut) set of the fuzzy set A,
where $A_\lambda = \{x \mid \mu_A(x) \ge \lambda\}$. Note that the $\lambda$-cut set $A_\lambda$ does not have a tilde underscore; it is a
crisp set derived from its parent fuzzy set, A. Any particular fuzzy set A can be transformed
into an infinite number of $\lambda$-cut sets, because there are an infinite number of values $\lambda$ on the
interval [0, 1].
Any element $x \in A_\lambda$ belongs to A with a grade of membership that is greater than or
equal to the value $\lambda$.
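A lambda-cut of a discrete fuzzy set is a simple filter on the membership values. The sketch below represents the fuzzy set as a dictionary from element to membership; the function name and the sample memberships are assumptions for illustration.

```python
def lambda_cut(mu, lam):
    """Crisp set of elements whose membership is at least lam."""
    return {x for x, m in mu.items() if m >= lam}

# Illustrative fuzzy set on the universe {a, b, c, d, e}.
A = {"a": 1.0, "b": 0.9, "c": 0.6, "d": 0.3, "e": 0.0}

for lam in (1.0, 0.9, 0.6, 0.3, 0.0):
    print(lam, sorted(lambda_cut(A, lam)))
```

The cuts are nested: raising lambda can only remove elements, so the cut at 0.9 is a subset of the cut at 0.6, and the cut at 0 is the whole universe.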
DEFUZZIFICATION TO SCALARS
As mentioned in the introduction, there may be situations where the output of a fuzzy
process needs to be a single scalar quantity as opposed to a fuzzy set. Defuzzification is the
conversion of a fuzzy quantity to a precise quantity, just as fuzzification is the conversion of a
precise quantity to a fuzzy quantity. The output of a fuzzy process can be the logical union of two
or more fuzzy membership functions defined on the universe of discourse of the output variable.
For example, suppose a fuzzy output comprises two parts:
(1) C1, a trapezoidal shape
(2) C2, a triangular membership shape.
The union of these two membership functions, that is, $C = C_1 \cup C_2$, involves the max operator,
which graphically is the outer envelope of the two shapes.
Of course, a general fuzzy output process can involve many output parts (more than two), and the
membership function representing each part of the output can have shapes other than triangles
and trapezoids.
Among the many methods that have been proposed in the literature in recent years, seven are
described here for defuzzifying fuzzy output functions (membership functions).
1. Max membership principle: Also known as the height method, this scheme is limited to
peaked output functions. This method is given by the algebraic expression
μC(z*) ≥ μC(z) for all z ∈ Z,
where z* is the defuzzified value.
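2. Centroid method: This procedure (also called center of area or center of gravity) is the
most prevalent and physically appealing of all the defuzzification methods. It is given by
z* = ∫ μC(z) · z dz / ∫ μC(z) dz,
where ∫ denotes algebraic integration and z* is the defuzzified value.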
3. Weighted average method: The weighted average method is the most frequently used in
fuzzy applications since it is one of the more computationally efficient methods.
Unfortunately, it is usually restricted to symmetrical output membership functions. It is
given by the algebraic expression
z* = ∑ μC(z̄) · z̄ / ∑ μC(z̄),
where ∑ denotes the algebraic sum, z̄ is the centroid of each symmetric
membership function, and z* is the defuzzified value.
The weighted average method is formed by weighting each membership function in the
output by its respective maximum membership value.
4. Mean max membership: This method (also called middle-of-maxima) is closely related
to the first method, except that the locations of the maximum membership can be
nonunique (i.e., the maximum membership can be a plateau rather than a single point).
It is given by z* = (a + b) / 2, where a and b are the end points of the plateau and z* is
the defuzzified value.
5. Center of sums: This is faster than many defuzzification methods that are currently in
use, and the method is not restricted to symmetric membership functions. This process
involves the algebraic sum of individual output fuzzy sets, say C1 and C2, instead of
their union. Two drawbacks to this method are that the intersecting areas are added
twice, and the method also involves finding the centroids of the individual membership
functions. The defuzzified value z* is given as follows:
z* = ∫ z · ∑k μCk(z) dz / ∫ ∑k μCk(z) dz,
with the sums taken over k = 1, ..., n and the integrals over Z.
6. Center of largest area: If the output fuzzy set has at least two convex subregions, then the
center of gravity (i.e., z is calculated using the centroid method, Equation 4.5) of the
convex fuzzy subregion with the largest area is used to obtain the defuzzified value z of
the output.
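7. First (or last) of maxima: This method uses the overall output or union of all individual
output fuzzy sets Ck to determine the smallest value of the domain with maximized
membership degree in Ck. The equations for z* are as follows.
First, the largest height in the union (denoted hgt(Ck)) is determined:
hgt(Ck) = sup μCk(z) over z ∈ Z.
Then the first of the maxima is found:
z* = inf {z ∈ Z | μCk(z) = hgt(Ck)}.
An alternative to this method is the last of the maxima:
z* = sup {z ∈ Z | μCk(z) = hgt(Ck)}.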
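Several of the defuzzification methods listed above can be sketched in Java for an output membership function sampled on a uniform grid. This is a minimal illustration under stated assumptions, not a reference implementation: the class name Defuzzify, the helper shapes, the sampling grid and the tolerance EPS are all hypothetical choices, and the integrals of the centroid method are approximated by sums over the samples.

```java
class Defuzzify {
    static final double EPS = 1e-9; // tolerance when comparing membership grades

    // Triangular membership function with feet a, c and peak b (assumes a < b < c).
    static double triangle(double z, double a, double b, double c) {
        if (z <= a || z >= c) return 0.0;
        return z <= b ? (z - a) / (b - a) : (c - z) / (c - b);
    }

    // Trapezoidal membership function with feet a, d and plateau [b, c].
    static double trapezoid(double z, double a, double b, double c, double d) {
        if (z <= a || z >= d) return 0.0;
        if (z < b) return (z - a) / (b - a);
        if (z <= c) return 1.0;
        return (d - z) / (d - c);
    }

    // Largest membership grade of the sampled output.
    static double height(double[] mu) {
        double h = 0.0;
        for (double m : mu) h = Math.max(h, m);
        return h;
    }

    // Centroid method on uniform samples: sum(mu * z) / sum(mu).
    // Assumes at least one sample has positive membership.
    static double centroid(double[] z, double[] mu) {
        double num = 0.0, den = 0.0;
        for (int i = 0; i < z.length; i++) {
            num += mu[i] * z[i];
            den += mu[i];
        }
        return num / den;
    }

    // Weighted average method: each part's centroid weighted by its peak membership.
    static double weightedAverage(double[] centers, double[] peaks) {
        double num = 0.0, den = 0.0;
        for (int i = 0; i < centers.length; i++) {
            num += peaks[i] * centers[i];
            den += peaks[i];
        }
        return num / den;
    }

    // First of maxima: smallest z with maximal membership (z sorted ascending).
    static double firstOfMaxima(double[] z, double[] mu) {
        double h = height(mu);
        for (int i = 0; i < z.length; i++) {
            if (mu[i] >= h - EPS) return z[i];
        }
        throw new IllegalArgumentException("empty universe");
    }

    // Last of maxima: largest z with maximal membership (z sorted ascending).
    static double lastOfMaxima(double[] z, double[] mu) {
        double h = height(mu);
        for (int i = z.length - 1; i >= 0; i--) {
            if (mu[i] >= h - EPS) return z[i];
        }
        throw new IllegalArgumentException("empty universe");
    }

    // Mean max membership: midpoint (a + b) / 2 of the plateau of maxima.
    static double meanOfMaxima(double[] z, double[] mu) {
        return (firstOfMaxima(z, mu) + lastOfMaxima(z, mu)) / 2.0;
    }

    public static void main(String[] args) {
        // Union (max) of a trapezoid and a triangle, sampled on [0, 8].
        int n = 801;
        double[] z = new double[n];
        double[] mu = new double[n];
        for (int i = 0; i < n; i++) {
            z[i] = i * 0.01;
            mu[i] = Math.max(trapezoid(z[i], 0, 1, 3, 4), triangle(z[i], 3, 5, 7));
        }
        System.out.println("centroid        = " + centroid(z, mu));
        System.out.println("mean of maxima  = " + meanOfMaxima(z, mu));
        System.out.println("first of maxima = " + firstOfMaxima(z, mu));
        System.out.println("last of maxima  = " + lastOfMaxima(z, mu));
    }
}
```

The maxima-based methods only inspect where the membership is largest, whereas the centroid uses the whole shape, which is why the two families can give noticeably different crisp values for the same output.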
First-generation (nonadaptive) simple fuzzy controllers can generally be depicted by a
block diagram such as that shown in the figure. The knowledge-base module in the figure contains
knowledge about all the input and output fuzzy partitions. It will include the term set and the
corresponding membership functions defining the input variables to the fuzzy rule-base system
and the output variables, or control actions, to the plant under control.
The steps in designing a simple fuzzy control system are as follows:
1. Identify the variables (inputs, states, and outputs) of the plant.
2. Partition the universe of discourse or the interval spanned by each variable into a
number of fuzzy subsets, assigning each a linguistic label (subsets include all the elements in
the universe).
3. Assign or determine a membership function for each fuzzy subset.
4. Assign the fuzzy relationships between the inputs' or states' fuzzy subsets on the one
hand and the outputs' fuzzy subsets on the other hand, thus forming the rule-base.
5. Choose appropriate scaling factors for the input and output variables in order to
normalize the variables to the [0, 1] or the [−1, 1] interval.
6. Fuzzify the inputs to the controller.
7. Use fuzzy approximate reasoning to infer the output contributed from each
rule.
8. Aggregate the fuzzy outputs recommended by each rule.
9. Apply defuzzification to form a crisp output.
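The design steps above can be traced in a deliberately small Java sketch of a single-input, single-output fuzzy controller. Everything specific here is an assumption for illustration only: the class name SimpleFuzzyController, the three linguistic labels (Negative, Zero, Positive), the triangular partitions on [−1, 1], the rule table, the output centers, and the use of the weighted average method for defuzzification.

```java
class SimpleFuzzyController {
    // Step 3: triangular membership function with feet a, c and peak b.
    static double tri(double x, double a, double b, double c) {
        if (x <= a || x >= c) return 0.0;
        return x <= b ? (x - a) / (b - a) : (c - x) / (c - b);
    }

    // Steps 2 and 6: fuzzify a normalized input into grades for the
    // hypothetical labels Negative, Zero and Positive on [-1, 1].
    static double[] fuzzify(double e) {
        return new double[] {
            tri(e, -2.0, -1.0, 0.0), // Negative
            tri(e, -1.0, 0.0, 1.0),  // Zero
            tri(e, 0.0, 1.0, 2.0)    // Positive
        };
    }

    // Step 4: hypothetical rule base. Input label i fires output label RULE_OUTPUT[i]:
    // Negative -> Positive, Zero -> Zero, Positive -> Negative.
    static final int[] RULE_OUTPUT = {2, 1, 0};

    // Centers of the symmetric output sets Negative, Zero, Positive.
    static final double[] OUTPUT_CENTERS = {-1.0, 0.0, 1.0};

    static double control(double rawInput, double inputScale) {
        // Step 5: scale and clamp the input to the normalized interval [-1, 1].
        double e = Math.max(-1.0, Math.min(1.0, rawInput / inputScale));

        // Step 6: fuzzify the input.
        double[] grades = fuzzify(e);

        // Step 7: the firing strength of each rule is the grade of its antecedent.
        // Steps 8 and 9: aggregate the rule outputs and defuzzify them with the
        // weighted average method.
        double num = 0.0, den = 0.0;
        for (int i = 0; i < grades.length; i++) {
            num += grades[i] * OUTPUT_CENTERS[RULE_OUTPUT[i]];
            den += grades[i];
        }
        return num / den; // den is 1 on [-1, 1] because the partitions sum to one
    }

    public static void main(String[] args) {
        for (double input : new double[] {-20.0, -5.0, 0.0, 5.0, 20.0}) {
            System.out.println("input " + input + " -> output " + control(input, 10.0));
        }
    }
}
```

With these particular partitions and rules the controller reduces to a saturated negative-feedback law; changing the membership shapes or the rule table is what gives a fuzzy controller its nonlinear behaviour.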
Part- A
1. Differentiate between crisp and fuzzy logic
2. Define fuzzy set
3. Define membership function
4. What is fuzzification
5. Define defuzzification
6. What is the advantage of fuzzy logic
7. What is FLC
8. What are De Morgan's laws
9. What is meant by decision making
10. Define fuzzy inference system
11. What is meant by linguistic phrases
12. What is meant by alpha cut
13. Give the different types of membership functions
14. What are the basic elements of a fuzzy logic control system
Part-B
1. Discuss in detail about the fuzzy set theory
2. Write the operations and properties of fuzzy set
3. Differentiate classical set theory and fuzzy set using examples
4. Discuss in detail about the Fuzzy logic controller
5. Write in detail about the fuzzy relations and fuzzy rules
6. Explain in detail about the Defuzzification methods
7. Find the max-min composition of the fuzzy relations R and T:
         y1   y2   y3
    x1   0.7  0.8  0.5
R = x2   0.2  0.4  0.6
    x3   0.1  0.9  0.4
         z1   z2   z3
    y1   0.3  0.2  0.5
T = y2   0.2  0.8  0.6
    y3   0.6  0.9  0.1
SCHOOL OF ELECTRICAL AND ELECTRONICS
Fuzzy Logic Controller - Fuzzification Interface - Knowledge Base- Decision Making Logic –
Defuzzification Interface- Application of Fuzzy Logic to Water Level Controller - Temperature
Controller - Control of Blood Pressure during Anaesthesia. Introduction to Neuro - Fuzzy Systems
- Fuzzy System Design Procedures - Fuzzy Sets and Logic Background - Fuzzy / ANN Design and
Implementation
General fuzzy logic controller:
The principal design elements in a general fuzzy logic control system are:
1. Fuzzification strategies and the interpretation of a fuzzification operator, or fuzzifier.
2. Knowledge base:
Discretization/normalization of the universe of discourse.
Fuzzy partitions of the input and output spaces.
Completeness of the partitions.
Choice of membership functions of a primary fuzzy set.
3. Rule base:
Choice of process state (input) variables and control (output) variables.
Source of derivation of fuzzy control rules.
Types of fuzzy control rules.
Consistency, interactivity and completeness of fuzzy control rules.
4. Decision making logic:
Definition of a fuzzy implication.
Interpretation of the sentence connective "and".
Inference mechanism.
5. Defuzzification strategies and the interpretation of a defuzzification operator (defuzzifier).
Fuzzy-Logic Patient Deterioration Index
[The text and equations of this subsection and of the one preceding it are illegible in the source scan.]
Fig. 6.5. Patient Deterioration Index function.
Application Of Fuzzy Logic To Control Of Blood Pressure During Anaesthesia – Cardio
Vascular Signals
TEXT / REFERENCE BOOKS
1. Laurene Fausett, "Fundamentals of Neural Networks: Architectures, Algorithms and Applications", 2008.
2. Timothy J. Ross, "Fuzzy Logic with Engineering Applications", McGraw-Hill International Editions, 2004.
3. Jang J.S.R., Sun C.T. and Mizutani E., "Neuro-Fuzzy and Soft Computing", Pearson Education, 2003.
4. Rajasekaran S., Pai G.A.V., "Neural Networks, Fuzzy Logic and Genetic Algorithms", Prentice Hall of India, 2003.
Part- A
1. What is a fuzzy logic controller
2. What is fuzzy decision making
3. What is meant by defuzzification interface
4. Write about knowledge base
5. What is meant by neuro fuzzy system
6. What is meant by fuzzification interface
Part-B
1. Explain FLC with neat diagram and discuss advantages and disadvantages
2. Elaborate in detail about application of FLC for water level Controller
3. Write in detail about FLC for Temperature Controller
4. Explain in detail about application of FLC for the control of Blood Pressure during
Anesthesia
5. Explain in detail about the design and Implementation of Neuro fuzzy system
SCHOOL OF ELECTRICAL AND ELECTRONICS
DEPARTMENT OF ELECTRONICS AND INSTRUMENTATION ENGINEERING
Introduction - Robustness of Traditional Optimization and Search Techniques - The goals of optimization - Survival
of the Fittest - Fitness Computations - Cross over - Mutation -Reproduction- Rank method- Rank space method
1. Introduction
The genetic algorithm (GA), developed by John Holland and his collaborators in the 1960s and
1970s (Holland, 1975; De Jong, 1975), is a model or abstraction of biological evolution based on
Charles Darwin's theory of natural selection. Holland was probably the first to use crossover
and recombination, mutation, and selection in the study of adaptive and artificial systems. These
genetic operators form the essential part of the genetic algorithm as a problem-solving strategy.
Since then, many variants of genetic algorithms have been developed and applied to a wide range
of optimization problems, from graph coloring to pattern recognition, from discrete systems (such
as the travelling salesman problem) to continuous systems (e.g., the efficient design of airfoil in
aerospace engineering), and from financial markets to multi-objective engineering optimization.
There are many advantages of genetic algorithms over traditional optimization algorithms. The two
most notable are the ability to deal with complex problems and parallelism. Genetic algorithms
can deal with various types of optimization, whether the objective (fitness) function is stationary or
non-stationary (changing with time), linear or nonlinear, continuous or discontinuous, or subject to
random noise. Because multiple offspring in a population act like independent agents, the
population (or any subgroup) can explore the search space in many directions simultaneously. This
feature makes the algorithms well suited to parallel implementation. Different parameters and
even different groups of encoded strings can be manipulated at the same time.
However, genetic algorithms also have some disadvantages. The formulation of the fitness function,
the choice of population size, the choice of important parameters such as the rates of mutation and
crossover, and the selection criteria for the new population should be carried out carefully. Any
inappropriate choice will make it difficult for the algorithm to converge, or it will simply produce
meaningless results. Despite these drawbacks, genetic algorithms remain one of the most widely
used optimization algorithms in modern nonlinear optimization.
2. Robustness of Traditional Optimization and Search Techniques
GA is a stochastic search algorithm based on principles of natural competition between
individuals for appropriating limited natural resources. The success of the winners normally depends on
their genes, and reproduction by such individuals causes the spread of their genes. By successive
selection of superior individuals and reproducing them, the population will be led to obtain more
natural resources. The GA simulates this process and calculates the optimum of objective
functions. In general, the standard GA is not convenient for finding the solutions to complex
problems. The VSP method (Gholizadeh and Salajegheh, 2010; Gholizadeh and Samavati, 2011;
Salajegheh and Gholizadeh, 2005) is an alternative to overcome this shortcoming of GA.
In this modified GA, an initial population with a small number of individuals is selected; this
population is much smaller than that in standard GA. Then, all the necessary operations of the
standard GA are carried out and the optimal solution is achieved. As the size of the population is
small, the method converges to a premature solution. In each generation, the best individual is
saved. Then, the best solution is repeatedly copied to create a new population and the remaining
members of the population are randomly selected. Thereafter, the optimization process is repeated
using standard GA with a reduced population to achieve a new solution.
During the selection, chromosomes form pairs of parents for breeding. Each child takes
characteristics from its parents. Basically, the child represents a recombination of characteristics
from its parents: some of the characteristics are taken from one parent and some from the other. In
addition to the recombination, some of the characteristics can mutate. Because fitter chromosomes
produce more children, each subsequent generation tends to have better fitness. At some point, a
generation will contain a chromosome that represents a good enough solution for our problem.
GA is powerful and broadly applicable for complex problems. There is a large class of
optimization problems that are quite hard to solve by conventional optimization techniques.
Genetic algorithms are efficient algorithms whose solution is approximately optimal. The well-
known applications include scheduling, transportation, routing, group technologies, layout design,
neural network training, and many others.
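4. Selection
// Needs java.util.{Arrays, List}, java.util.stream.{DoubleStream, IntStream} and a static import of Collectors.toList.
// fitness is assumed to be a ToDoubleFunction<T> field of the enclosing class.
private List<T> selection(List<T> population) {
    final double[] fitnesses = population.stream().mapToDouble(fitness).toArray();
    final double totalFitness = DoubleStream.of(fitnesses).sum();
    double sum = 0;
    final double[] probabilities = new double[fitnesses.length];
    for (int i = 0; i < fitnesses.length; i++) {
        sum += fitnesses[i] / totalFitness;
        probabilities[i] = sum; // cumulative share of the total fitness
    }
    probabilities[probabilities.length - 1] = 1; // guard against rounding error
    return IntStream.range(0, probabilities.length).mapToObj(i -> {
        int index = Arrays.binarySearch(probabilities, Math.random());
        if (index < 0) index = -(index + 1); // "not found" result encodes the insertion point
        return population.get(index);
    }).collect(toList());
}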
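The idea behind this implementation is the following: the population is represented as consecutive
ranges on the numerical axis, and the whole population lies between 0 and 1. Each individual
occupies a range whose width is proportional to its fitness, so a uniformly distributed random
number in [0, 1] falls into the range of a fitter individual more often.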
5. Crossover
.toArray();
shuffle(Arrays.asList(indexes));
final T value2 = population.get(index2);
With the predefined probability crossoverProbability, we select parents for breeding. The selected
parents are shuffled, allowing any combination to occur. We take pairs of parents and apply
the crossover operator. We apply the operator twice for each pair, once in each order of the
parents, because we need to keep the size of the population the same. The children replace their
parents in the population.
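The crossover step described above can be sketched as a self-contained Java method. This is one possible reading of the description, not the author's code: the class name CrossoverStep, the explicit Random parameter, and the midpointCrossover operator on equal-length strings are assumptions introduced for illustration.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Random;
import java.util.function.BinaryOperator;

class CrossoverStep {
    static <T> void crossover(List<T> population, BinaryOperator<T> crossover,
                              double crossoverProbability, Random rng) {
        // Select each individual as a parent with probability crossoverProbability.
        List<Integer> indexes = new ArrayList<>();
        for (int i = 0; i < population.size(); i++) {
            if (rng.nextDouble() < crossoverProbability) {
                indexes.add(i);
            }
        }

        // Shuffle the selected parents so that any pairing can occur.
        Collections.shuffle(indexes, rng);

        // Take the parents in pairs; an unpaired last parent is left unchanged.
        for (int i = 0; i + 1 < indexes.size(); i += 2) {
            int index1 = indexes.get(i);
            int index2 = indexes.get(i + 1);
            T value1 = population.get(index1);
            T value2 = population.get(index2);
            // Apply the operator twice so that two children replace two parents.
            population.set(index1, crossover.apply(value1, value2));
            population.set(index2, crossover.apply(value2, value1));
        }
    }

    // Hypothetical operator: single-point crossover at the midpoint of
    // two equal-length strings.
    static String midpointCrossover(String a, String b) {
        int cut = a.length() / 2;
        return a.substring(0, cut) + b.substring(cut);
    }

    public static void main(String[] args) {
        List<String> population = new ArrayList<>(List.of("AAAA", "BBBB", "CCCC", "DDDD"));
        crossover(population, CrossoverStep::midpointCrossover, 0.8, new Random());
        System.out.println(population);
    }
}
```

A boxed List<Integer> is used for the parent indexes on purpose: calling Arrays.asList on an int[] yields a one-element list holding the array itself, so shuffling that list would leave the indexes in their original order.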
Mutation
population.set(i, mutation.apply(population.get(i)));
6. Reproduction - Rank method
Selection is the stage of a genetic algorithm in which individual genomes are chosen from a
population for later breeding (using the crossover operator).
A generic selection procedure may be implemented as follows:
1. The fitness function is evaluated for each individual, providing fitness values, which are
then normalized. Normalization means dividing the fitness value of each individual by the
sum of all fitness values, so that the sum of all resulting fitness values equals 1.
2. Accumulated normalized fitness values are computed: the accumulated fitness value of an
individual is the sum of its own fitness value plus the fitness values of all the previous
individuals; the accumulated fitness of the last individual should be 1, otherwise something
went wrong in the normalization step.
3. A random number R between 0 and 1 is chosen.
4. The selected individual is the first one whose accumulated normalized value is greater than
or equal to R.
For many problems the above algorithm might be computationally demanding. A simpler and
faster alternative uses the so-called stochastic acceptance.
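Stochastic acceptance can be sketched in a few lines of Java: pick an individual uniformly at random and accept it with probability equal to its fitness divided by the maximum fitness, repeating until one is accepted. The class and method names are assumptions, and the sketch assumes non-negative fitness values with at least one positive value.

```java
import java.util.Random;

class StochasticAcceptance {
    // Returns the index of the selected individual.
    // Assumes all fitness values are >= 0 and at least one is > 0.
    static int select(double[] fitnesses, Random rng) {
        double max = 0.0;
        for (double f : fitnesses) {
            max = Math.max(max, f);
        }
        if (max <= 0.0) {
            throw new IllegalArgumentException("need at least one positive fitness value");
        }
        while (true) {
            // Candidate chosen uniformly, without looking at its fitness.
            int candidate = rng.nextInt(fitnesses.length);
            // Accept with probability fitness / max; otherwise try again.
            if (rng.nextDouble() < fitnesses[candidate] / max) {
                return candidate;
            }
        }
    }

    public static void main(String[] args) {
        double[] fitnesses = {1.0, 3.0, 6.0};
        int[] counts = new int[fitnesses.length];
        Random rng = new Random();
        for (int k = 0; k < 10_000; k++) {
            counts[select(fitnesses, rng)]++;
        }
        for (int i = 0; i < counts.length; i++) {
            System.out.println("individual " + i + " selected " + counts[i] + " times");
        }
    }
}
```

The selection probabilities remain proportional to fitness, but no normalization or accumulated sums are needed; only the maximum fitness has to be known, which is what makes the method cheap when the population is large.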
If this procedure is repeated until there are enough selected individuals, this selection method is
called fitness proportionate selection or roulette-wheel selection. If instead of a single pointer spun
multiple times, there are multiple, equally spaced pointers on a wheel that is spun once, it is
called stochastic universal sampling. Repeatedly selecting the best individual of a randomly chosen
subset is tournament selection. Taking the best half, third or another proportion of the individuals
is truncation selection.
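Tournament selection and truncation selection, as described above, can be sketched as follows. The class name, the method signatures, and the choice to draw tournament contestants with replacement are assumptions made for this illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Random;

class TournamentSelection {
    // Tournament selection: the best of tournamentSize randomly chosen
    // individuals wins (contestants are drawn with replacement).
    static int tournament(double[] fitnesses, int tournamentSize, Random rng) {
        int best = rng.nextInt(fitnesses.length);
        for (int k = 1; k < tournamentSize; k++) {
            int challenger = rng.nextInt(fitnesses.length);
            if (fitnesses[challenger] > fitnesses[best]) {
                best = challenger;
            }
        }
        return best;
    }

    // Truncation selection: indexes of the `keep` fittest individuals,
    // ordered from best to worst.
    static List<Integer> truncation(double[] fitnesses, int keep) {
        List<Integer> order = new ArrayList<>();
        for (int i = 0; i < fitnesses.length; i++) {
            order.add(i);
        }
        order.sort((a, b) -> Double.compare(fitnesses[b], fitnesses[a]));
        return new ArrayList<>(order.subList(0, Math.min(keep, order.size())));
    }

    public static void main(String[] args) {
        double[] fitnesses = {1.0, 9.0, 3.0, 7.0};
        Random rng = new Random();
        System.out.println("tournament winner: " + tournament(fitnesses, 2, rng));
        System.out.println("best half: " + truncation(fitnesses, 2));
    }
}
```

The tournament size acts as a selection-pressure knob: a size of 1 is purely random selection, while larger tournaments make it increasingly likely that only the fittest individuals win.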
There are other selection algorithms that do not consider all individuals for selection, but only
those with a fitness value that is higher than a given (arbitrary) constant. Other algorithms select
from a restricted pool where only a certain percentage of the individuals are allowed, based on
fitness value.
Retaining the best individuals in a generation unchanged in the next generation, is
called elitism or elitist selection. It is a successful (slight) variant of the general process of
constructing a new population.
7. Rank Selection
Rank Selection also works with negative fitness values and is mostly used when the individuals in
the population have very close fitness values (this usually happens at the end of the run). Under
fitness proportionate selection, close fitness values give each individual an almost equal share of
the pie, so each individual, no matter how fit relative to the others, has approximately the same
probability of being selected as a parent. This in turn leads to a loss of selection pressure
towards fitter individuals, causing the GA to make poor parent selections in such situations.
Rank selection avoids this by basing the selection probability on each individual's rank in the
population rather than on its raw fitness value.
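A minimal Java sketch of rank selection is given below. It assumes linear ranking, in which the worst individual gets rank 1, the best gets rank n, and the selection probability is the rank divided by n(n + 1)/2; other rank-to-probability mappings are possible. The class and method names are hypothetical.

```java
import java.util.Arrays;
import java.util.Random;

class RankSelection {
    // Selection probability of each individual under linear ranking.
    // Only the ordering of the fitness values matters, so negative and
    // nearly equal fitness values are handled without special cases.
    static double[] probabilities(double[] fitnesses) {
        int n = fitnesses.length;
        Integer[] order = new Integer[n];
        for (int i = 0; i < n; i++) {
            order[i] = i;
        }
        // Sort the indexes by ascending fitness: order[0] is the worst individual.
        Arrays.sort(order, (a, b) -> Double.compare(fitnesses[a], fitnesses[b]));

        double rankSum = n * (n + 1) / 2.0;
        double[] p = new double[n];
        for (int rank = 1; rank <= n; rank++) {
            p[order[rank - 1]] = rank / rankSum;
        }
        return p;
    }

    // Spins a roulette wheel whose slices are sized by rank, not by fitness.
    static int select(double[] fitnesses, Random rng) {
        double[] p = probabilities(fitnesses);
        double r = rng.nextDouble();
        double cumulative = 0.0;
        for (int i = 0; i < p.length; i++) {
            cumulative += p[i];
            if (r < cumulative) {
                return i;
            }
        }
        return p.length - 1; // guard against rounding error in the last slice
    }

    public static void main(String[] args) {
        double[] fitnesses = {-5.0, 100.0, 99.9};
        System.out.println(Arrays.toString(probabilities(fitnesses)));
        System.out.println("selected: " + select(fitnesses, new Random()));
    }
}
```

In the example, the two individuals with fitness 100.0 and 99.9 would receive almost identical shares under fitness proportionate selection, whereas ranking still gives the better one a clearly larger share.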
Part-B