Detailed Notes Unit 1 ASC
Soft Computing
Soft computing differs from conventional (hard) computing in that, unlike hard computing, it is
tolerant of imprecision, uncertainty, partial truth, and approximation. In effect, the role model
for soft computing is the human mind.
The guiding principle of soft computing is: “Exploit the tolerance for imprecision,
uncertainty, partial truth, and approximation to achieve tractability, robustness and low
solution cost.”
The principal constituents of Soft Computing (SC) are Fuzzy Logic (FL), Neural Computing
(NC), Evolutionary Computation (EC), Machine Learning (ML) and Probabilistic Reasoning
(PR), together with chaos theory and parts of learning theory.
Hard Computing
Hard computing deals with precise models where accurate solutions are achieved quickly. It is
the conventional approach to computing and requires an accurate analytical model. The outcome
of a hard computing approach is a guaranteed, deterministic and accurate result. It deals with
binary, crisp logic and needs precise input. Hard computing is not capable of solving
real-world problems that are not well defined mathematically, or whose inputs are imprecise
and depend heavily on the environment.
Difference between Hard Computing and Soft Computing
The main techniques of soft computing are:
1. Artificial Neural Network (ANN): ANNs are inspired by the biological nervous system.
They have the capability of learning and adaptability.
2. Fuzzy Systems: This technique is inspired by human experience. In this approach,
knowledge representation is performed via fuzzy “If-Then” rules.
3. Genetic Algorithm: A genetic algorithm is a method for solving both constrained and
unconstrained optimization problems that is based on natural selection, the process that drives
biological evolution.
Neural Networks
A neural network is a processing device, whose design was inspired by the design and
functioning of human brains and components thereof. The neural networks have the ability to
learn by example which makes them very flexible and powerful. For neural networks, there is
no need to understand the internal mechanisms of the task.
“A neural network is a massively parallel distributed processor made up of simple processing
units that has a natural propensity for storing experiential knowledge and making it available
for use. It resembles the brain in two respects:
1. Knowledge is acquired by the network from its environment through a learning process.
2. Interneuron connection strengths, known as synaptic weights, are used to store the acquired
knowledge.”
The procedure used to perform the learning process is called a learning algorithm, the function
of which is to modify the synaptic weights of the network in an orderly fashion to attain a
desired design objective.
A neural network derives its computing power through, first, its massively parallel distributed
structure and, second, its ability to learn and therefore generalize. Generalization refers to the
neural network’s production of reasonable outputs for inputs not encountered during training
(learning). These two information processing capabilities make it possible for neural networks
to find good approximate solutions to complex (large-scale) problems that are intractable.
An artificial neural network (ANN) may be defined as an information processing model that is
inspired by the way biological nervous systems, such as the brain, process information. This
model tries to replicate only the most basic functions of the brain. The key element of ANN is the
novel structure of its information processing system. An ANN is composed of a large number of
highly interconnected processing elements (neurons) working in unison to solve specific
problems.
Neural networks have the following characteristics:
(i) The NNs exhibit mapping capabilities, that is, they can map input patterns to their
associated output patterns.
(ii) The NNs learn by examples. Thus, NN architectures can be trained with known
examples of a problem before they are tested for their inference capability on
unknown instances of the problem. They can, therefore, identify new objects
on which they were not previously trained.
(iii) The NNs possess the capability to generalize. Thus, they can predict new outcomes
from past trends.
(iv) The NNs are robust systems and are fault tolerant. They can, therefore, recall full
patterns from incomplete, partial or noisy patterns.
(v) The NNs can process information in parallel, at high speed, and in a distributed
manner.
HISTORY OF NEURAL NETWORK RESEARCH
The pioneering work of McCulloch and Pitts (1943) was the foundation stone for the growth of
NN architectures. In their paper, McCulloch and Pitts suggested the unification of
neurophysiology with mathematical logic, which paved the way for some significant results in NN
research. In fact, the McCulloch-Pitts model even influenced von Neumann to try new design
technology in the construction of EDVAC (Electronic Discrete Variable Automatic Computer).
The next significant development arose out of Hebb's book ‘The Organization of Behavior’ (1949).
In this, Hebb proposed a learning rule derived from a model based on synaptic connections
between nerve cells responsible for biological associative memory.
The Hebbian rule was later refined by Rosenblatt in the Perceptron model (Rosenblatt, 1958).
However, a critical assessment of the Perceptron model by Minsky and Papert (1969) stalled
further research in NN. It was much later, in the 1980s, that there was a resurgence of interest
in NN and many major contributions to the theory and application of NN were made.
The human brain is one of the most complicated structures known. However, the concept of neurons
as the fundamental constituent of the brain has made the study of its functioning comparatively
easier. Figure 1 illustrates the physical structure of the human brain.
The brain contains about 10^10 basic units called neurons. Each neuron, in turn, is connected to
about 10^4 other neurons. A neuron is a small cell that receives electro-chemical signals from its
various sources and in turn responds by transmitting electrical impulses to other neurons. An
average brain weighs about 1.5 kg and an average neuron has a weight of 1.5 × 10^-9 g. While
some of the neurons perform input and output operations (referred to as afferent and efferent
cells respectively), the remaining form a part of an interconnected network of neurons which are
responsible for signal transformation and storage of information. However, despite their
different activities, all neurons share common characteristics.
A neuron is composed of a nucleus, a cell body known as the soma (refer Fig. 2).
Attached to the soma are long irregularly shaped filaments called dendrites. The dendrites
behave as input channels, i.e. all inputs from other neurons arrive through the dendrites.
Dendrites look like branches of a tree during winter.
Another type of link attached to the soma is the axon. Unlike the dendritic links, the
axon is electrically active and serves as an output channel. Axons, which mainly appear
on output cells, are non-linear threshold devices which produce a voltage pulse called
Action Potential or Spike that lasts for about a millisecond.
If the cumulative inputs received by the soma raise the internal electric potential of the
cell, known as the Membrane Potential, above a threshold, then the neuron fires by propagating
the action potential down the axon to excite or inhibit other neurons.
The axon terminates in a specialized contact called a synapse or synaptic junction that
connects the axon with the dendritic links of another neuron. The synaptic junction,
which is a very minute gap at the end of the dendritic link, contains a neuro-transmitter
fluid. It is this fluid which is responsible for accelerating or retarding the electric charges
to the soma. Each dendritic link can have many synapses acting on it, thus bringing about
massive interconnectivity.
The human brain no doubt is a highly complex structure viewed as a massive, highly
interconnected network of simple processing elements called neurons. However, the behaviour
of a neuron can be captured by a simple model as shown in Fig. 3. Every component of the
model bears a direct analogy to the actual constituents of a biological neuron and hence is
termed as artificial neuron. It is this model which forms the basis of Artificial Neural Networks.
Fig. 3
Here, x1, x2, x3,….., xn are the n inputs to the artificial neuron. w1, w2, ….., wn are the
weights attached to the input links.
Recollect that a biological neuron receives all inputs through the dendrites, sums them
and produces an output if the sum is greater than a threshold value. The input signals are
passed on to the cell body through the synapse which may accelerate or retard an arriving
signal.
It is this acceleration or retardation of the input signals that is modeled by the weights.
An effective synapse which transmits a stronger signal will have a correspondingly larger
weight while a weak synapse will have smaller weights. Thus, weights here are
multiplicative factors of the inputs to account for the strength of the synapse. Hence, the
total input I received by the soma of the artificial neuron is
$I = w_1 x_1 + w_2 x_2 + \dots + w_n x_n = \sum_{i=1}^{n} w_i x_i$ ……………………(1.1)
To generate the final output y, the sum is passed on to an activation function ϕ, which
provides the output.
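i.e. $y = \phi(I)$ ……………………… (1.2)
A commonly used activation function is the threshold function, in which the sum I is compared
with a threshold value θ:
$\phi(I) = \begin{cases} 1, & I \ge \theta \\ 0, & I < \theta \end{cases}$ ………………….. (1.3)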
Fig. 4 illustrates the Threshold function. This is convenient in the sense that the output signal is
either 1 or 0 resulting in the neuron being on or off.
In the binary threshold function, the value of θ is 0. So the binary threshold activation here is
$f(I) = \begin{cases} 1, & \text{if } I \ge 0 \\ 0, & \text{if } I < 0 \end{cases}$
Signum Function
$\phi(I) = \begin{cases} +1, & I > \theta \\ -1, & I \le \theta \end{cases}$ ……………………… (1.5)
Sigmoidal function
This function is a continuous function that varies gradually between the asymptotic values 0 and
1 or -1 and +1 and is given by
$\phi(I) = \dfrac{1}{1 + e^{-\alpha I}}$ ……….............(1.6)
where α is the slope parameter, which adjusts the abruptness of the function as it
changes between the two asymptotic values. Sigmoidal functions are differentiable, which
is an important feature of NN theory. Figure 6 illustrates the sigmoidal function.
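The weighted sum and the activation functions described above can be sketched in Python as follows. This is a minimal illustration, not a definitive implementation: the function names, the example inputs and weights, and the default values of θ and α are assumptions made for the example, and the signum function is assumed to return +1 only when I is strictly greater than θ.

```python
import math

def net_input(inputs, weights):
    """Weighted sum I = sum of w_i * x_i over all inputs."""
    return sum(w * x for w, x in zip(weights, inputs))

def threshold(i, theta=0.0):
    """Threshold function: 1 if I >= theta, otherwise 0."""
    return 1 if i >= theta else 0

def signum(i, theta=0.0):
    """Signum function (assumed convention): +1 if I > theta, otherwise -1."""
    return 1 if i > theta else -1

def sigmoid(i, alpha=1.0):
    """Sigmoidal function with slope parameter alpha; output lies between 0 and 1."""
    return 1.0 / (1.0 + math.exp(-alpha * i))

# Hypothetical inputs and weights, chosen only for illustration.
I = net_input([1.0, 0.5], [0.4, -0.2])
print(threshold(I), signum(I), sigmoid(I))
```

A larger α makes the sigmoid steeper, so that it approaches the behaviour of the threshold function while remaining differentiable.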
Q1. For the network shown in Figure, calculate the net input to the output neuron.
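[Figure: single-layer network with inputs x1 = 0.3, x2 = 0.5, x3 = 0.6 connected to output neuron Y through weights w1 = 0.2, w2 = 0.1, w3 = -0.3]
The given neural net consists of three input neurons and one output neuron. The inputs and
weights are:
[x1, x2, x3] = [0.3, 0.5, 0.6]
[w1, w2, w3] = [0.2, 0.1, -0.3]
The net input Yin at neuron Y can be calculated as:
Yin = w1x1 + w2x2 + w3x3 = (0.2)(0.3) + (0.1)(0.5) + (-0.3)(0.6) = 0.06 + 0.05 - 0.18 = -0.07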
Q2. For the network shown in Figure, calculate the net input to the output neuron.
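[Figure: network with inputs X1 and X2, a bias input of 1, and output neuron Y]
Here the inputs are [x1, x2] = [0.2, 0.6], the weights are [w1, w2] = [0.3, 0.7] and the bias is b = 0.45.
The net input Yin at neuron Y can be calculated as:
Yin = w1x1 + w2x2 + b = (0.3)(0.2) + (0.7)(0.6) + 0.45 = 0.06 + 0.42 + 0.45 = 0.93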
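Q3. For the network shown in Figure, calculate the output of neuron Y using the binary sigmoidal
activation function.
[Figure: single-layer network with inputs X1, X2, X3, a bias, and output neuron Y]
The given network has three input neurons with bias and one output neuron. These form a
single-layer network. The inputs are given as [x1, x2, x3] = [0.8, 0.6, 0.4] and the weights [w1,
w2, w3] = [0.1, 0.3, -0.2] with bias b = 0.35.
The net input to the output neuron is
$Y_{in} = b + \sum_{i=1}^{3} w_i x_i = 0.35 + (0.1)(0.8) + (0.3)(0.6) + (-0.2)(0.4) = 0.53$
Applying the binary sigmoidal activation function,
$Y = \dfrac{1}{1 + e^{-Y_{in}}} = \dfrac{1}{1 + e^{-0.53}} \approx 0.629$
So the output Y ≈ 0.629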
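The net-input calculations in these problems can be checked numerically. The following is a small sketch under stated assumptions: the helper names net_input and binary_sigmoid are introduced here only for illustration, while the inputs, weights and biases are the ones given in the problem text.

```python
import math

def net_input(inputs, weights, bias=0.0):
    """Net input Yin = b + sum of w_i * x_i."""
    return bias + sum(w * x for w, x in zip(weights, inputs))

def binary_sigmoid(v):
    """Binary sigmoidal activation with unit slope."""
    return 1.0 / (1.0 + math.exp(-v))

# Two-input network with bias: inputs [0.2, 0.6], weights [0.3, 0.7], b = 0.45
y_in_two_inputs = net_input([0.2, 0.6], [0.3, 0.7], bias=0.45)

# Three-input network with bias: inputs [0.8, 0.6, 0.4], weights [0.1, 0.3, -0.2], b = 0.35
y_in_three_inputs = net_input([0.8, 0.6, 0.4], [0.1, 0.3, -0.2], bias=0.35)
y_out = binary_sigmoid(y_in_three_inputs)

print(round(y_in_two_inputs, 2))    # 0.93
print(round(y_in_three_inputs, 2))  # 0.53
print(round(y_out, 4))              # 0.6295
```

Evaluated numerically, the sigmoid of 0.53 is about 0.6295.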
NEURAL NETWORK ARCHITECTURE
Generally, an ANN structure can be represented using a directed graph. A graph G is an ordered
2-tuple (V, E) consisting of a set V of vertices and a set E of edges. When each edge is assigned
an orientation, the graph is directed and is called a directed graph or a digraph. Figure 7
illustrates a digraph. Digraphs assume significance in Neural Network theory since signals in
NN systems are restricted to flow in specific directions.
The vertices of the graph may represent neurons (input/output) and the edges, the synaptic links.
The edges are labelled by the weights attached to the synaptic links.
There are several classes of NN, classified according to their learning mechanisms. However,
there are three fundamentally different classes of networks based on architecture. All the three
classes employ the digraph structure for their representation.
1. SINGLE LAYER FEEDFORWARD NETWORK
This type of network comprises two layers, namely the input layer and the output layer. The
input layer neurons receive the input signals and the output layer neurons compute the output
signals. The synaptic links carrying the weights connect every input neuron to the output neurons
but not vice versa. Such a network is said to be feedforward in type or acyclic in nature.
Despite the two layers, the network is termed single layer since it is the output layer
alone which performs computation. The input layer merely transmits the signals to the output
layer. Hence the name single layer feedforward network. Figure 8 illustrates an example
network.
Fig. 8 Single layer feedforward network
2. MULTILAYER FEEDFORWARD NETWORK
This network, as its name indicates, is made up of multiple layers. Thus, architectures of this
class, besides possessing an input and an output layer, also have one or more intermediary layers
called hidden layers. The computational units of the hidden layer are known as the hidden
neurons or hidden units. The hidden layer aids in performing useful intermediary computations
before directing the input to the output layer. The input layer neurons are linked to the hidden
layer neurons and the weights on these links are referred to as input-hidden layer weights.
Again, the hidden layer neurons are linked to the output layer neurons and the corresponding
weights are referred to as hidden-output layer weights. Figure 9 illustrates a multilayer
feedforward network with a configuration l-m-n (l input neurons, m hidden neurons and n output
neurons).
3. RECURRENT NETWORKS
These networks differ from feedforward network architectures in the sense that there is at least
one feedback loop. Thus, in these networks, for example, there could exist one layer with
feedback connections as shown in Fig. 10. There could also be neurons with self-feedback
links, i.e. the output of a neuron is fed back into itself as input.
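The flow of signals through a feedforward architecture can be sketched as a sequence of layer computations. The example below is only an illustration: the 2-3-1 configuration, all weight and bias values, the choice of a sigmoidal activation, and the names layer_forward, input_hidden and hidden_output are assumptions and do not come from the notes.

```python
import math

def sigmoid(v):
    """Sigmoidal activation with unit slope."""
    return 1.0 / (1.0 + math.exp(-v))

def layer_forward(inputs, weights, biases):
    """Compute one layer; weights[j][i] is the weight from input i to neuron j."""
    return [
        sigmoid(b + sum(w * x for w, x in zip(row, inputs)))
        for row, b in zip(weights, biases)
    ]

# Hypothetical 2-3-1 network: 2 input, 3 hidden and 1 output neuron.
input_hidden = [[0.1, 0.4], [-0.3, 0.2], [0.5, -0.6]]   # input-hidden layer weights
hidden_bias = [0.0, 0.1, -0.1]
hidden_output = [[0.7, -0.2, 0.3]]                      # hidden-output layer weights
output_bias = [0.05]

x = [1.0, 0.5]                                          # input layer only passes signals on
hidden = layer_forward(x, input_hidden, hidden_bias)    # intermediary computation
output = layer_forward(hidden, hidden_output, output_bias)
print(hidden, output)
```

A single layer feedforward network corresponds to one call of layer_forward, while a recurrent network would additionally feed some of the outputs back as inputs at the next time step.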
Learning Processes
1. Supervised learning
2. Unsupervised learning
3. Reinforced learning
1. Supervised learning.
Supervised learning is also referred to as learning with a teacher. In conceptual terms, we may
think of the teacher as having knowledge of the environment, with
that knowledge being represented by a set of input–output examples. The environment is,
however, unknown to the neural network. Suppose now that the teacher and the neural network
are both exposed to a training vector (i.e., example) drawn from the same environment. By
virtue of built-in knowledge, the teacher is able to provide the neural network with a desired
response for that training vector. Indeed, the desired response represents the “optimum” action
to be performed by the neural network. The network parameters (weights) are adjusted under
the combined influence of the training vector and the error signal. The error signal is defined as
the difference between the desired response and the actual response of the network. This
adjustment is carried out iteratively in a step-by-step fashion with the aim of eventually making
the neural network emulate the teacher; the emulation is presumed to be optimum in some
statistical sense. In this way, knowledge of the environment available to the teacher is
transferred to the neural network through training and stored in the form of “fixed” synaptic
weights, representing long-term memory. When this condition is reached, we may then dispense
with the teacher and let the neural network deal with the environment completely by itself.
2. Unsupervised Learning
In unsupervised, or self-organized, learning, there is no external teacher or critic to
oversee the learning process.
3. Reinforced learning
In this method, a teacher, though available, does not present the expected answer but only
indicates whether the computed output is correct or incorrect. The information provided helps the
network in its learning process. A reward is given for a correct answer and a penalty
for a wrong answer. However, reinforced learning is not among the most commonly used forms of
learning.
Neural networks have been successfully applied to the solution of a variety of problems.
Some of the common application domains are listed below:
1. Pattern recognition (PR) / image processing
Neural networks have shown remarkable progress in the recognition of visual images,
handwritten characters, printed characters, speech and other PR based tasks.
2. Optimization/constraint satisfaction
This comprises problems which need to satisfy constraints and obtain optimal solutions.
Examples of such problems include manufacturing scheduling, finding the shortest possible tour
given a set of cities, etc. Several problems of this nature arising out of industrial and
manufacturing fields have found acceptable solutions using NNs.
3. Forecasting and risk assessment
Neural networks have exhibited the capability to predict situations from past trends. They have,
therefore, found ample applications in areas such as meteorology, stock market, banking, and
econometrics with high success rates.
McCulloch-Pitts Neuron
The McCulloch-Pitts neuron, proposed in 1943, was the earliest neural network model. It is usually
called the M-P neuron. M-P neurons are connected by directed weighted paths. It should be
noted that the activation of an M-P neuron is binary, that is, at any time step the neuron may fire
or may not fire. The weights associated with the communication links may be excitatory
(weight is positive) or inhibitory (weight is negative).
The threshold plays a major role in the M-P neuron: there is a fixed threshold for each neuron, and
if the net input to the neuron is greater than or equal to the threshold then the neuron fires.
[Figure: M-P neuron Y with excitatory inputs X1 to X3 (weight w) and inhibitory inputs X4 and X5 (weight -p)]
A simple M-P neuron is shown in the figure above. The M-P neuron has both excitatory and
inhibitory connections. A connection is excitatory with weight w (w > 0) or inhibitory with
weight -p (p > 0). In the figure, inputs X1 to X3 have excitatory weighted connections and inputs
X4 and X5 have inhibitory weighted connections. Since the firing of the output neuron is based
upon the threshold θ, the activation function here is defined as
$f(y_{in}) = \begin{cases} 1, & \text{if } y_{in} \ge \theta \\ 0, & \text{if } y_{in} < \theta \end{cases}$
Here yin is the net input at neuron Y. If the net input is greater than or equal to the
threshold then the neuron fires, else the neuron will not fire.
The M-P neuron has no particular training algorithm. An analysis has to be performed to
determine the values of the weights and the threshold. Here the weights of the neuron are set
along with the threshold to make the neuron perform a simple logic function.
Using the given combination of weights, we try to set the threshold for solving the
problem.
Q4. Implement AND function using McCulloch-Pitts neuron model.
X1 X2 Y
0 0 0
0 1 0
1 0 0
1 1 1
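[Figure: M-P neuron Y with inputs X1 and X2 and weights w1 and w2]
Assume the weights to be w1 = 1 and w2 = 1. With these assumed weights, the net input
yin = x1w1 + x2w2 is calculated for the four inputs:
(0, 0): yin = 0 × 1 + 0 × 1 = 0
(0, 1): yin = 0 × 1 + 1 × 1 = 1
(1, 0): yin = 1 × 1 + 0 × 1 = 1
(1, 1): yin = 1 × 1 + 1 × 1 = 2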
Based on these net inputs, we set the threshold so that the output from neuron Y
matches the desired output of the AND function.
In the AND function, the output is high only for the input (1, 1); otherwise the output
is 0. So we adjust the threshold in such a way that the net input is converted into the
desired output. Here we can set the threshold value θ = 2.
So where the net input is greater than or equal to 2, the neuron outputs 1; otherwise the neuron
gives the output 0.
Using the weights (1, 1) and the threshold θ = 2, the neuron Y provides the output
desired by the AND function. The network for the AND function is:
[Figure: M-P neuron Y with inputs X1 and X2, weights w1 = 1 and w2 = 1, and threshold θ = 2]
If we are not able to set a suitable threshold with these assumed weights, then we can try
another pair of weights such as (1, -1), (-1, 1) or (-1, -1).
Q5. Implement OR function using McCulloch-Pitts neuron model.
X1 X2 Y
0 0 0
0 1 1
1 0 1
1 1 1
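[Figure: M-P neuron Y with inputs X1 and X2 and weights w1 and w2]
Assume the weights to be w1 = 1 and w2 = 1. With these assumed weights, the net input
yin = x1w1 + x2w2 is calculated for the four inputs:
(0, 0): yin = 0 × 1 + 0 × 1 = 0
(0, 1): yin = 0 × 1 + 1 × 1 = 1
(1, 0): yin = 1 × 1 + 0 × 1 = 1
(1, 1): yin = 1 × 1 + 1 × 1 = 2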
In the OR function, the output is 1 for all pairs of inputs except the pair (0, 0). So we
adjust the threshold in such a way that the net input is converted into the desired
output. Here we can set the threshold value θ = 1.
So where the net input is greater than or equal to 1, the neuron outputs 1; otherwise the neuron
gives the output 0.
Using the weights (1, 1) and the threshold θ = 1, the neuron Y provides the output
desired by the OR function.
[Figure: M-P neuron Y with inputs X1 and X2, weights w1 = 1 and w2 = 1, and threshold θ = 1]
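The AND and OR solutions above can be verified with a short Python sketch of the M-P neuron. The function name mp_neuron is an assumption introduced for illustration; the weights (1, 1) and the thresholds 2 and 1 are the values chosen in the worked solutions.

```python
def mp_neuron(inputs, weights, theta):
    """M-P neuron: fires (outputs 1) when the net input reaches the threshold."""
    y_in = sum(w * x for w, x in zip(weights, inputs))
    return 1 if y_in >= theta else 0

patterns = [(0, 0), (0, 1), (1, 0), (1, 1)]

and_outputs = [mp_neuron(p, (1, 1), theta=2) for p in patterns]
or_outputs = [mp_neuron(p, (1, 1), theta=1) for p in patterns]

print(and_outputs)  # [0, 0, 0, 1]
print(or_outputs)   # [0, 1, 1, 1]
```

The same weights serve both functions; only the threshold changes, which reflects the fact that the M-P neuron is configured by analysis rather than by a training algorithm.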
Rosenblatt’s Perceptron
The perceptron is the simplest form of a neural network used for the classification of patterns
said to be linearly separable (i.e., patterns that lie on opposite sides of a hyperplane). Basically,
it consists of a single neuron with adjustable synaptic weights and bias. The algorithm used to
adjust the free parameters of this neural network first appeared in a learning procedure
developed by Rosenblatt (1958, 1962) for his perceptron brain model.
The perceptron convergence theorem (or convergence rule) states that if the patterns (vectors)
used to train the perceptron are drawn from two linearly separable classes, then the perceptron
algorithm converges and positions the decision surface in the form of a hyperplane between the
two classes. The proof of convergence of the algorithm is known as the perceptron
convergence theorem.
Rosenblatt’s perceptron is built around a nonlinear neuron, namely, the McCulloch–Pitts model
of a neuron. Such a neural model consists of a linear combiner followed by a hard limiter
(performing the threshold / signum function), as depicted in the figure. The summing node of the
neural model computes a linear combination of the inputs applied to its synapses, and also
incorporates an externally applied bias. The resulting sum, that is, the induced local field, is
applied to a hard limiter. Accordingly, the neuron produces an output equal to 1 if the hard
limiter input is positive, and 0 or -1 if it is negative (0 for threshold and -1 for signum function).
From the model, we find that the hard limiter input, or induced local field, of the neuron is:
$v = \sum_{i=1}^{m} w_i x_i + b$
The two classes are separated by the decision boundary, the hyperplane defined by
$\sum_{i=1}^{m} w_i x_i + b = 0$
This is illustrated in Fig. below for the case of two input variables x1 and x2, for which the
decision boundary takes the form of a straight line. A point (x1, x2) that lies above the
boundary line is assigned to class C1, and a point (x1, x2) that lies below the boundary line
is assigned to class C2.
The synaptic weights w1, w2, ..., wm of the perceptron can be adapted on an iteration-by-iteration
basis. For the adaptation, we may use an error-correction rule known as the
perceptron convergence algorithm. The perceptron can also be represented by an equivalent signal
flow graph.
[Figure: signal-flow graph of the perceptron]
Problems for practice:
1. Learn the truth table of AND Gate using perceptron. Assume initial weights as
w1 = 0.9 and w2 = 0.9. Assume learning rate = 0.5, bias = 0.5.
2. Learn the truth table of OR Gate using perceptron. Assume initial weights as
w1 = 0 and w2 = 0. Assume learning rate=0.5, bias = 0.5.
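The error-correction rule used to adapt the perceptron weights can be sketched as follows. This is one possible reading of the first practice problem rather than a definitive solution: it assumes that the stated bias of 0.5 is an initial value that is updated along with the weights, that the hard limiter outputs 1 when the induced local field is greater than or equal to 0, and that training stops after an epoch with no errors. The names step and train_perceptron are hypothetical.

```python
def step(v):
    """Hard limiter (binary threshold): 1 if v >= 0, otherwise 0."""
    return 1 if v >= 0 else 0

def train_perceptron(samples, weights, bias, rate, max_epochs=100):
    """Error-correction learning: w <- w + rate * (target - output) * x."""
    weights = list(weights)
    for _ in range(max_epochs):
        errors = 0
        for x, target in samples:
            v = bias + sum(w * xi for w, xi in zip(weights, x))  # induced local field
            error = target - step(v)
            if error != 0:
                errors += 1
                weights = [w + rate * error * xi for w, xi in zip(weights, x)]
                bias += rate * error
        if errors == 0:   # every pattern classified correctly
            break
    return weights, bias

and_gate = [((0, 0), 0), ((0, 1), 0), ((1, 0), 0), ((1, 1), 1)]
w, b = train_perceptron(and_gate, weights=(0.9, 0.9), bias=0.5, rate=0.5)

predictions = [step(b + sum(wi * xi for wi, xi in zip(w, x))) for x, _ in and_gate]
print(predictions)  # [0, 0, 0, 1]
```

Because the AND patterns are linearly separable, the perceptron convergence theorem guarantees that this loop terminates with a separating line; with other conventions for the bias or the hard limiter, the intermediate weights will differ.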
ASSOCIATIVE MEMORY
Associative Memories, one of the major classes of neural networks, are faint imitations of the
human brain's ability to associate patterns. An Associative Memory (AM) which belongs to the
class of single layer feed forward or recurrent network architecture depending on its association
capability, exhibits Hebbian learning.
An associative memory is a storehouse of associated patterns which are encoded in some form.
When a pattern is presented to the storehouse, the associated pattern pair is recalled or
output. The input pattern could be an exact replica of the stored pattern or a distorted or partial
representation of a stored pattern. The figure shown below illustrates the working of an
associative memory.
If the associated pattern pairs (x, y) are different and if the model recalls a ‘y’ given an
‘x’ or vice versa, then it is termed a hetero associative memory.
It is useful for the association of patterns.
Hetero associative correlation memories are known as hetero correlators.
If x and y refer to the same kind of pattern, then the model is termed as auto associative
memory.
Auto associative memories are useful for image refinement, that is, given a distorted or a
partial pattern, the whole pattern stored in its perfect form can be recalled.
Auto associative correlation memories are known as auto correlators.
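The recall of a stored pattern from a distorted one can be illustrated with a small autocorrelator sketch. The notes do not give the storage rule, so the following assumes bipolar (+1/-1) patterns, a Hebbian outer-product rule for the connection matrix, and a sign function for recall; the stored patterns and the names outer_product_memory and recall are made up for the example.

```python
def outer_product_memory(patterns):
    """Build the connection matrix as the sum of outer products p * p^T."""
    n = len(patterns[0])
    weights = [[0] * n for _ in range(n)]
    for p in patterns:
        for i in range(n):
            for j in range(n):
                weights[i][j] += p[i] * p[j]
    return weights

def recall(weights, probe):
    """Recall a bipolar pattern by thresholding each weighted sum at zero."""
    sums = [sum(w * x for w, x in zip(row, probe)) for row in weights]
    return [1 if s >= 0 else -1 for s in sums]

# Two hypothetical stored patterns (chosen to be orthogonal).
stored = [
    [1, 1, 1, 1, -1, -1, -1, -1],
    [1, -1, 1, -1, 1, -1, 1, -1],
]
W = outer_product_memory(stored)

noisy = [-1, 1, 1, 1, -1, -1, -1, -1]   # first stored pattern with one element flipped
print(recall(W, noisy) == stored[0])    # True
```

A hetero associative memory can be built in the same way by summing outer products of the paired patterns x and y instead of a pattern with itself.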
Figure shown below illustrates hetero associative and auto associative memories.
Additional Questions
1. Differentiate between recurrent neural network and multilayer neural network.
RECURRENT NEURAL NETWORK | MULTILAYER NEURAL NETWORK
Contains at least one feedback loop, so data and calculations also flow backward, from the output towards the input layer. | Data and calculations flow in a single direction, from input data to the outputs.
It is used for text data, speech data. | It is used for image data, time series data.
3. Differentiate between Supervised and Unsupervised Learning.
SUPERVISED | UNSUPERVISED
Uses known, labeled data as input | Uses unlabeled data as input
The XOR function is not linearly separable. Therefore, a single layer perceptron does not have
the ability to implement XOR. This gave rise to the need for, and the invention of, multilayer
networks (multilayer perceptrons).
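The XOR limitation can be illustrated with threshold neurons. The sketch below first searches a small, arbitrarily chosen grid of weights and thresholds for a single neuron that reproduces XOR, and then builds XOR from two layers of threshold neurons. The grid, the particular two-layer construction (an OR neuron and an AND neuron feeding an output neuron) and the function names are assumptions made for illustration.

```python
def threshold_neuron(inputs, weights, theta):
    """Fires (outputs 1) when the weighted sum reaches the threshold theta."""
    return 1 if sum(w * x for w, x in zip(weights, inputs)) >= theta else 0

patterns = [(0, 0), (0, 1), (1, 0), (1, 1)]
xor_targets = [0, 1, 1, 0]

# Search a grid of weights and thresholds from -2 to 2 in steps of 0.5.
grid = [k * 0.5 for k in range(-4, 5)]
single_layer_solutions = [
    (w1, w2, theta)
    for w1 in grid for w2 in grid for theta in grid
    if [threshold_neuron(p, (w1, w2), theta) for p in patterns] == xor_targets
]

def xor_two_layer(x1, x2):
    """XOR as (x1 OR x2) AND NOT (x1 AND x2), using one hidden layer."""
    h_or = threshold_neuron((x1, x2), (1, 1), theta=1)
    h_and = threshold_neuron((x1, x2), (1, 1), theta=2)
    return threshold_neuron((h_or, h_and), (1, -1), theta=1)

print(single_layer_solutions)                 # []
print([xor_two_layer(*p) for p in patterns])  # [0, 1, 1, 0]
```

The empty search result is consistent with the argument that no single line separates the XOR classes, while the hidden layer supplies the intermediary computation that makes the problem solvable.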