Unit 1 Part 1

UNIT 1

Soft Computing

Soft computing differs from conventional (hard) computing in that it is tolerant of imprecision,
uncertainty, partial truth, and approximation. In effect, the role model for soft computing is the
human mind.

The guiding principle of soft computing is: “Exploit the tolerance for imprecision, uncertainty, partial truth,
and approximation to achieve tractability, robustness and low solution cost.”

The principal constituents of Soft Computing (SC) are Fuzzy Logic (FL), Neural Computing (NC), Evolutionary
Computation (EC), Machine Learning (ML), and Probabilistic Reasoning (PR), with the latter subsuming belief
networks, chaos theory, and parts of learning theory.

What is important to note is that soft computing is not a melange. Rather, it is a partnership in which each of
the partners contributes a distinct methodology for addressing problems in its domain. In this perspective, the
principal constituent methodologies in SC are complementary rather than competitive. Furthermore, soft
computing may be viewed as a foundation component for the emerging field of computational intelligence.

Importance of Soft Computing

The complementarity of FL, NC, EC, and PR has an important consequence: in many cases a problem can be
solved most effectively by using FL, NC, EC, and PR in combination rather than exclusively. A striking example
of a particularly effective combination is what has come to be known as "neurofuzzy systems." Such systems
are becoming increasingly visible as consumer products ranging from air conditioners and washing machines
to photocopiers and camcorders. Less visible but perhaps even more important are neurofuzzy systems in
industrial applications. What is particularly significant is that in both consumer products and industrial
systems, the employment of soft computing techniques leads to systems which have high MIQ (Machine
Intelligence Quotient).

The conceptual structure of soft computing suggests that students should be trained not just in fuzzy logic,
neurocomputing, genetic programming, or probabilistic reasoning but in all of the associated methodologies,
though not necessarily to the same degree.

Hard Computing

Hard computing deals with precise models through which accurate solutions are achieved quickly. It is the
conventional approach to computing and requires a precisely stated analytical model. The outcome of the
hard computing approach is a guaranteed, deterministic, exact result, and it defines definite control actions
using a mathematical model or algorithm. It deals with binary and crisp logic that require exact input data.
Hard computing is poorly suited to real-world problems, whose behavior is often imprecise and uncertain.
Difference between Hard Computing and Soft Computing

Figure: Hard computing rests on precise models (traditional symbolic-logic reasoning; numerical modeling
and search), whereas soft computing rests on approximate models (functional approximation and randomized
search; approximate reasoning).

1. Soft computing is tolerant of imprecision, uncertainty, partial truth, and approximation; hard computing
needs an exactly stated analytic model.

2. Soft computing relies on fuzzy logic and probabilistic reasoning; hard computing relies on binary logic and
crisp systems.

3. Soft computing is stochastic in nature; hard computing is deterministic in nature.

4. Soft computing works on ambiguous and noisy data; hard computing works on exact data.

5. Soft computing can perform parallel computations; hard computing performs sequential computations.

6. Soft computing produces approximate results; hard computing produces precise results.

7. Soft computing can evolve its own programs; hard computing requires programs to be written.

8. Soft computing incorporates randomness; hard computing is deterministic.

9. Soft computing can use multivalued logic; hard computing uses two-valued logic.

Here we will study Neural Networks, Fuzzy Logic, and Genetic Algorithms.

Neural Networks

A neural network is a processing device, either an algorithm or actual hardware, whose design was inspired
by the design and functioning of animal brains and their components. The computing world has a lot to gain
from neural networks, also known as artificial neural networks or neural nets.

Neural networks have the ability to learn by example, which makes them very flexible and powerful. With
neural networks, there is no need to devise an algorithm to perform a specific task; that is, there is no need to
understand the internal mechanisms of the task.

The human brain computes in an entirely different way from the conventional digital computer. The brain is a
highly complex, nonlinear, and parallel computer (information-processing system). It has the capability to
organize its structural constituents, known as neurons, so as to perform certain computations (e.g., pattern
recognition, perception, and motor control) many times faster than the fastest digital computer in existence
today. Consider, for example, human vision, which is an information-processing task. It is the function of the
visual system to provide a representation of the environment around us and, more important, to supply the
information we need to interact with the environment. To be specific, the brain routinely accomplishes
perceptual recognition tasks (e.g., recognizing a familiar face embedded in an unfamiliar scene) in
approximately 100–200 ms, whereas tasks of much lesser complexity take a great deal longer on a powerful
computer.

“A neural network is a massively parallel distributed processor made up of simple processing units that has a
natural propensity for storing experiential knowledge and making it available for use. It resembles the brain in
two respects:

1. Knowledge is acquired by the network from its environment through a learning process.

2. Interneuron connection strengths, known as synaptic weights, are used to store the acquired knowledge.”

The procedure used to perform the learning process is called a learning algorithm, the function of which is to
modify the synaptic weights of the network in an orderly fashion to attain a desired design objective.

A neural network derives its computing power through, first, its massively parallel distributed structure and,
second, its ability to learn and therefore generalize. Generalization refers to the neural network’s production
of reasonable outputs for inputs not encountered during training (learning). These two information processing
capabilities make it possible for neural networks to find good approximate solutions to complex (large-scale)
problems that are intractable.

An artificial neural network (ANN) may be defined as an information processing model that is inspired by the
way biological nervous systems, such as the brain, process information. This model tries to replicate only the
most basic functions of the brain. The key element of ANN is the novel structure of its information processing
system. An ANN is composed of a large number of highly interconnected processing elements (neurons)
working in unison to solve specific problems.

The Human Brain

The human nervous system may be viewed as a three-stage system, as depicted in the block diagram:

Stimulus → Receptors → Neural net → Effectors → Response

Central to the system is the brain, represented by the neural (nerve) net, which continually receives
information, perceives it, and makes appropriate decisions. Two sets of arrows are shown in the figure. Those
pointing from left to right indicate the forward transmission of information-bearing signals through the
system. The arrows, pointing from right to left, signify the presence of feedback in the system. The receptors
convert stimuli from the human body or the external environment into electrical impulses that convey
information to the neural net (brain). The effectors convert electrical impulses generated by the neural net into
discernible responses as system outputs.

It is estimated that there are approximately 10 billion neurons in the human cortex, and 60 trillion synapses or
connections. The net result is that the brain is an enormously efficient structure. Specifically, the energetic
efficiency of the brain is approximately 10^{-16} joules (J) per operation per second, whereas the corresponding
value for the best computers is orders of magnitude larger.

Synapses, or nerve endings, are elementary structural and functional units that mediate the interactions
between neurons. The most common kind of synapse is a chemical synapse, which operates as follows: A
presynaptic process liberates a transmitter substance that diffuses across the synaptic junction between
neurons and then acts on a postsynaptic process. Thus a synapse converts a presynaptic electrical signal into a
chemical signal and then back into a postsynaptic electrical signal. In electrical terminology, such an element is
said to be a nonreciprocal two-port device. In traditional descriptions of neural organization, it is assumed that
a synapse is a simple connection that can impose excitation or inhibition, but not both on the receptive
neuron.
The Pyramidal Cell

Plasticity permits the developing nervous system to adapt to its surrounding environment. In an adult
brain, plasticity may be accounted for by two mechanisms:

The creation of new synaptic connections between neurons, and the modification of existing synapses.

Axons, the transmission lines, and dendrites, the receptive zones, constitute two types of cell filaments that
are distinguished on morphological grounds; an axon has a smoother surface, fewer branches, and greater
length, whereas a dendrite (so called because of its resemblance to a tree) has an irregular surface and more
branches.

Neurons come in a wide variety of shapes and sizes in different parts of the brain. The pyramidal cell is one of
the most common types of cortical neurons. Like many other types of neurons, it receives most
of its inputs through dendrites. The pyramidal cell can receive 10,000 or more synaptic contacts, and it can
project onto thousands of target cells. The majority of neurons encode their outputs as a series of brief
voltage pulses. These pulses, commonly known as action potentials, or spikes, originate at or close to the cell
body of neurons and then propagate across the individual neurons at constant velocity and amplitude. The
reasons for the use of action potentials for communication among neurons are based on the physics of axons.
The axon of a neuron is very long and thin and is characterized by high electrical resistance and very large
capacitance. Both of these elements are distributed across the axon. Analysis of this propagation mechanism
reveals that when a voltage is applied at one end of the axon, it decays exponentially with distance, dropping
to an insignificant level by the time it reaches the other end.

In the brain, there are both small-scale and large-scale anatomical organizations, and different functions take
place at lower and higher levels. The following figure shows a hierarchy of interwoven levels of organization
that has emerged from the extensive work done on the analysis of local regions in the brain.

Central nervous system

Interregional circuits

Local circuits

Neurons

Dendritic trees

Neural micro circuits

Synapses

Molecules

The synapses represent the most fundamental level, depending on molecules and ions for their action. At the
next levels, we have neural microcircuits, dendritic trees, and then neurons. A neural microcircuit refers to an
assembly of synapses organized into patterns of connectivity to produce a functional operation of interest. A
neural microcircuit may be likened to a silicon chip made up of an assembly of transistors. The smallest size of
microcircuits is measured in micrometers, and their fastest speed of operation is measured in milliseconds.
The neural microcircuits are grouped to form dendritic subunits within the dendritic trees of individual
neurons. The whole neuron, about 100 micrometers in size, contains several dendritic subunits. At the next
level of complexity, we have local circuits (about 1 mm in size) made up of neurons with similar or different
properties; these neural assemblies perform operations characteristic of a localized region in the brain. They
are followed by interregional circuits made up of pathways, columns, and topographic maps, which involve
multiple regions located in different parts of the brain.
Models of a Neuron

A neuron is an information-processing unit that is fundamental to the operation of a neural network. The
block diagram in the following figure shows the model of a neuron, which forms the basis for designing a large
family of neural networks.

Here, we identify three basic elements of the neural model:

1. A set of synapses, or connecting links, each of which is characterized by a weight or strength of its own.
Specifically, a signal xj at the input of synapse j connected to neuron k is multiplied by the synaptic weight wkj.
It is important to make a note of the manner in which the subscripts of the synaptic weight wkj are written.
The first subscript in wkj refers to the neuron in question, and the second subscript refers to the input end of
the synapse to which the weight refers. Unlike the weight of a synapse in the brain, the synaptic weight of an
artificial neuron may lie in a range that includes negative as well as positive values.

2. An adder for summing the input signals, weighted by the respective synaptic strengths of the neuron; the
operations described here constitute a linear combiner.

3. An activation function for limiting the amplitude of the output of a neuron. The activation function is also
referred to as a squashing function, in that it squashes (limits) the permissible amplitude range of the output
signal to some finite value. Typically, the normalized amplitude range of the output of a neuron is written as
the closed unit interval [0, 1], or, alternatively, [-1, 1].

The neural model in the above figure also includes an externally applied bias, denoted by bk. The bias bk has the
effect of increasing or lowering the net input of the activation function, depending on whether it is positive or
negative, respectively.

In mathematical terms, we may describe the neuron k by writing the pair of equations:

uk = Σ_{j=1}^{m} wkj xj        (1)

And

yk = φ(uk + bk)        (2)
where x1, x2, ..., xm are the input signals; wk1, wk2, ..., wkm are the respective synaptic weights of neuron k; uk is
the linear combiner output due to the input signals; bk is the bias; ϕ(·) is the activation function; and yk is the
output signal of the neuron. The use of bias bk has the effect of applying an affine transformation to the
output uk of the linear combiner in the model, as shown by

vk = uk + bk        (3)

Depending on whether the bias bk is positive or negative, the relationship between the induced local field, or
activation potential, vk of neuron k and the linear combiner output uk is modified in the manner illustrated in
the following figure.

Figure: Affine transformation produced by the presence of bias

The bias bk is an external parameter of neuron k. Equivalently, we may formulate the combination of
Equations (1) to (3) as follows:

vk = Σ_{j=0}^{m} wkj xj

and yk = φ(vk)

Here, we have added a new synapse. Its input is x0 = +1 and its weight is wk0 = bk.

The effect of the bias is accounted for by doing two things:

(1) Adding a new input signal fixed at +1, and

(2) Adding a new synaptic weight equal to the bias bk.

Although the two represented models differ in appearance, they are mathematically equivalent.
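The pair of equations above, and the equivalent form with the bias folded in as a synaptic weight wk0 = bk on a fixed input x0 = +1, can be sketched in Python. This is a minimal illustration; the particular inputs, weights, and logistic activation are hypothetical choices, not prescribed by the model:

```python
import math

def neuron_output(x, w, b, phi=lambda v: 1.0 / (1.0 + math.exp(-v))):
    """Neuron k: uk = sum_j wkj*xj (linear combiner), yk = phi(uk + bk)."""
    u = sum(wj * xj for wj, xj in zip(w, x))  # Eq. (1)
    return phi(u + b)                         # Eq. (2), with vk = uk + bk

def neuron_output_folded_bias(x, w, b, phi=lambda v: 1.0 / (1.0 + math.exp(-v))):
    """Equivalent form: bias treated as weight wk0 = bk on a fixed input x0 = +1."""
    v = sum(wj * xj for wj, xj in zip([b] + w, [1.0] + x))
    return phi(v)

# hypothetical input signals, synaptic weights, and bias
x, w, b = [0.5, -1.0, 2.0], [0.4, 0.3, -0.1], 0.2
assert abs(neuron_output(x, w, b) - neuron_output_folded_bias(x, w, b)) < 1e-12
```

Both functions produce the same output for any input, which is the mathematical equivalence noted above.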
Figure: Another nonlinear model of a neuron

Types of Activation Function

The activation function, denoted by ϕ(v), defines the output of a neuron. We identify two basic types of activation
functions:

1. Threshold Function:
For this type of activation function, we have
φ(v) = { 1 if v ≥ 0, 0 if v < 0 }
In engineering, this form of a threshold function is commonly referred to as a Heaviside function.
Correspondingly, the output of neuron k employing such a threshold function is expressed as
yk = { 1 if vk ≥ 0, 0 if vk < 0 }
where vk is the induced local field of the neuron; that is,
vk = Σ_{j=1}^{m} wkj xj + bk

Figure: Threshold function
2. Sigmoid Function:
The sigmoid function, whose graph is “S”-shaped, is by far the most common form of activation
function used in the construction of neural networks. It is defined as a strictly increasing function that
exhibits a graceful balance between linear and nonlinear behavior. An example of the sigmoid function
is the logistic function, defined by
φ(v) = 1 / (1 + exp(−av))
where a is the slope parameter of the sigmoid function. By varying the parameter a, we obtain sigmoid
functions of different slopes, as illustrated in the following figure.

Figure: Sigmoid function
In fact, the slope at the origin equals a/4. In the limit, as the slope parameter approaches infinity, the
sigmoid function becomes simply a threshold function. Whereas a threshold function assumes the
value of 0 or 1, a sigmoid function assumes a continuous range of values from 0 to 1.
It is sometimes desirable to have the activation function range from -1 to +1, in which case, the
activation function is an odd function of the induced local field. Specifically, the threshold function is
now defined as:
φ(v) = { 1 if v > 0, 0 if v = 0, −1 if v < 0 }
which is commonly referred to as the Signum function. For the corresponding form of a sigmoid
function, we may use the hyperbolic tangent function, defined by:
φ(v) = tanh(v)
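The activation functions described above can be written out directly. The following is a minimal Python sketch; the function names are illustrative:

```python
import math

def threshold(v):
    """Heaviside (threshold) function: 1 if v >= 0, else 0."""
    return 1 if v >= 0 else 0

def signum(v):
    """Odd threshold function with range {-1, 0, 1}."""
    return (v > 0) - (v < 0)

def logistic(v, a=1.0):
    """Logistic sigmoid with slope parameter a; its slope at the origin is a/4."""
    return 1.0 / (1.0 + math.exp(-a * v))

# as the slope parameter a grows, the sigmoid approaches the threshold function
assert logistic(0.5, a=100.0) > 0.999 and logistic(-0.5, a=100.0) < 0.001
# tanh is the corresponding sigmoid ranging over (-1, 1)
assert math.tanh(0.0) == 0.0
```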

Neural Network viewed as directed graphs

A signal-flow graph is a network of directed links (branches) that are interconnected at certain points called
nodes. A typical node j has an associated node signal xj. A typical directed link originates at node j and
terminates on node k; it has an associated transfer function, or transmittance, that specifies the manner in
which the signal yk at node k depends on the signal xj at node j. The flow of signals in the various parts of the
graph is dictated by three basic rules:

Rule 1: A signal flows along a link only in the direction defined by the arrow on the link.

Two different types of links may be distinguished:


• Synaptic links, whose behavior is governed by a linear input–output relation. Specifically, the node signal xj is
multiplied by the synaptic weight wkj to produce the node signal yk. This is represented in part (a) of the
following figure.

• Activation links, whose behavior is governed in general by a nonlinear input–output relation. This form of
relationship is illustrated in part (b) of the figure, where φ(·) is the nonlinear activation function.

Rule 2: A node signal equals the algebraic sum of all signals entering the pertinent node via the incoming links.
This second rule is illustrated in part (c) of the above figure for the case of synaptic convergence, or fan-in.

Rule 3: The signal at a node is transmitted to each outgoing link originating from that node, with the
transmission being entirely independent of the transfer functions of the outgoing links. This third rule is
illustrated in the above figure for the case of synaptic divergence, or fan-out.

A neural network is a directed graph consisting of nodes with interconnecting synaptic and activation links and
is characterized by four properties:

1. Each neuron is represented by a set of linear synaptic links, an externally applied bias, and a possibly
nonlinear activation link. The bias is represented by a synaptic link connected to an input fixed at +1.

2. The synaptic links of a neuron weight their respective input signals.


3. The weighted sum of the input signals defines the induced local field of the neuron in question.

4. The activation link squashes the induced local field of the neuron to produce an output.

A directed graph, defined in this manner is complete in the sense that it describes not only the signal flow
from neuron to neuron, but also the signal flow inside each neuron. When, however, the focus of attention is
restricted to signal flow from neuron to neuron, we may use a reduced form of this graph by omitting the
details of signal flow inside the individual neurons. Such a directed graph is said to be partially complete. It is
characterized as follows:

1. Source nodes supply input signals to the graph.

2. Each neuron is represented by a single node called a computation node.

3. The communication links interconnecting the source and computation nodes of the graph carry no weight;
they merely provide directions of signal flow in the graph.

A partially complete directed graph defined in this way is referred to as an architectural graph, describing the
layout of the neural network.

For the simple case of a single neuron with m source nodes and a single node fixed at +1 for the bias,
the computation node representing the neuron is shown shaded, and the source node is shown as a small
square.

Feedback

Feedback is said to exist in a dynamic system whenever the output of an element in the system influences in
part the input applied to that particular element, thereby giving rise to one or more closed paths for the
transmission of signals around the system.

It plays a major role in the study of a special class of neural networks known as recurrent networks.

Signal Flow Graph of a Single-loop feedback system

Network Architectures

In general, we may identify three fundamentally different classes of network architectures:

(i) Single-Layer Feedforward Networks


In a layered neural network, the neurons are organized in the form of layers. In the simplest form
of a layered network, we have an input layer of source nodes that projects directly onto an output
layer of neurons (computation nodes), but not vice versa. This network is strictly of a Feedforward
type.
The following figure shows the case of four nodes in both the input and output layers.
(ii) Multilayer Feedforward Networks
The second class of a Feedforward neural network distinguishes itself by the presence of one or
more hidden layers, whose computation nodes are correspondingly called hidden neurons or
hidden units; the term “hidden” refers to the fact that this part of the neural network is not seen
directly from either the input or output of the network. The function of hidden neurons is to
intervene between the external input and the network output in some useful manner. By adding
one or more hidden layers, the network is enabled to extract higher-order statistics from its input.
The source nodes in the input layer of the network supply respective elements of the activation
pattern (input vector), which constitute the input signals applied to the neurons (computation
nodes) in the second layer (i.e., the first hidden layer). The output signals of the second layer are
used as inputs to the third layer, and so on for the rest of the network. Typically, the neurons in
each layer of the network have as their inputs the output signals of the preceding layer only. The
set of output signals of the neurons in the output (final) layer of the network constitutes the overall
response of the network to the activation pattern supplied by the source nodes in the input (first)
layer.

The neural network in this figure is said to be fully connected in the sense that every node in each layer of
the network is connected to every other node in the adjacent forward layer. If, however, some of the
communication links (synaptic connections) are missing from the network, we say that the network is
partially connected.
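The forward pass of a fully connected multilayer feedforward network can be sketched in plain Python. The 3-4-2 layer sizes, the weights, and the logistic activation below are hypothetical, chosen only to make the sketch concrete:

```python
import math

def layer_forward(x, W, b, phi):
    """One fully connected layer: neuron k outputs phi(sum_j W[k][j]*x[j] + b[k])."""
    return [phi(sum(wkj * xj for wkj, xj in zip(Wk, x)) + bk)
            for Wk, bk in zip(W, b)]

def mlp_forward(x, layers, phi=lambda v: 1.0 / (1.0 + math.exp(-v))):
    """Propagate the activation pattern layer by layer; each layer takes as
    input only the output signals of the preceding layer."""
    for W, b in layers:
        x = layer_forward(x, W, b, phi)
    return x

# hypothetical 3-4-2 network: 3 source nodes, 4 hidden neurons, 2 output neurons
layers = [
    ([[0.2, -0.5, 0.1], [0.4, 0.0, -0.3], [-0.1, 0.2, 0.6], [0.3, 0.3, 0.3]],
     [0.1, -0.2, 0.0, 0.05]),
    ([[0.5, -0.4, 0.2, 0.1], [-0.3, 0.6, 0.0, 0.2]],
     [0.0, 0.1]),
]
output = mlp_forward([1.0, 0.5, -1.0], layers)  # overall response: 2 output signals
```

The set of output signals of the final layer is the network's overall response to the input pattern, as described above.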
(iii) Recurrent Networks
A recurrent neural network distinguishes itself from a Feedforward neural network in that it has at
least one feedback loop. For example, a recurrent network may consist of a single layer of neurons
with each neuron feeding its output signal back to the inputs of all the other neurons. In the
structure depicted in the figure below, there are no self-feedback loops in the network; self-
feedback refers to a situation where the output of a neuron is fed back into its own input. The
recurrent network illustrated in the figure below also has no hidden neurons.

Figure: Recurrent network with no self-feedback loops and no hidden neurons

In next figure, we illustrate another class of recurrent networks with hidden neurons. The feedback
connections shown here originate from the hidden neurons as well as from the output neurons.
The presence of feedback loops, whether in the recurrent structure of the first figure or of the second, has a
profound impact on the learning capability of the network and on its performance. Moreover, feedback loops
involve the use of particular branches composed of unit-time delay elements (denoted by z^{-1}),
which result in nonlinear dynamic behavior, assuming that the neural network contains nonlinear
units.
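A single time step of the single-layer recurrent structure described above, with each neuron fed by the previous outputs of all the other neurons and no self-feedback, might be sketched as follows. The weights, external inputs, and tanh activation are hypothetical:

```python
import math

def recurrent_step(y_prev, W, x, phi=math.tanh):
    """Neuron k receives its external input x[k] plus the fed-back outputs of
    all OTHER neurons from the previous time step (no self-feedback: j != k).
    The one-step delay plays the role of the unit-time delay element z^{-1}."""
    n = len(y_prev)
    return [phi(x[k] + sum(W[k][j] * y_prev[j] for j in range(n) if j != k))
            for k in range(n)]

# hypothetical 3-neuron recurrent layer (diagonal is zero: no self-feedback)
W = [[0.0, 0.5, -0.2],
     [0.3, 0.0, 0.4],
     [-0.1, 0.2, 0.0]]
y = [0.0, 0.0, 0.0]
for _ in range(5):  # iterate the feedback loop for a few time steps
    y = recurrent_step(y, W, [0.1, -0.2, 0.3])
```

Because of the feedback, the outputs evolve over time even for a constant external input, which is the nonlinear dynamic behavior noted above.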

Knowledge Representation:

Knowledge refers to stored information or models used by a person or machine to interpret, predict, and
appropriately respond to the outside world.

A major task for a neural network is to learn a model of the world (environment) in which it is embedded, and
to maintain that model sufficiently consistent with the real world so as to achieve the specified goals of the
application of interest. Knowledge of the world consists of two kinds of information:

1. The known world state, represented by facts about what is and what has been known; this form of
knowledge is referred to as prior information.

2. Observations (measurements) of the world, obtained by means of sensors designed to probe the
environment, in which the neural network is supposed to operate. In any event, the observations so obtained
provide the pool of information, from which the examples used to train the neural network are drawn.

The examples can be labeled or unlabeled. In labeled examples, each example representing an input signal is
paired with a corresponding desired response (i.e., target output). On the other hand, unlabeled examples
consist of different realizations of the input signal all by itself. In any event, a set of examples labeled or
otherwise, represents knowledge about the environment of interest that a neural network can learn through
training.

A set of input–output pairs, with each pair consisting of an input signal and the corresponding desired
response, is referred to as a set of training data, or simply training sample. To illustrate how such a data set
can be used, consider, for example, the handwritten-digit recognition problem. In this problem, the input
signal consists of an image with black or white pixels, with each image representing one of 10 digits that are
well separated from the background. The desired response is defined by the “identity” of the particular digit
whose image is presented to the network as the input signal. Typically, the training sample consists of a large
variety of handwritten digits that are representative of a real-world situation. Given such a set of examples,
the design of a neural network may proceed as follows:

• An appropriate architecture is selected for the neural network, with an input layer consisting of source
nodes equal in number to the pixels of an input image, and an output layer consisting of 10 neurons (one for
each digit). A subset of examples is then used to train the network by means of a suitable algorithm. This
phase of the network design is called learning.

• The recognition performance of the trained network is tested with data not seen before. Specifically, an
input image is presented to the network, but this time the network is not told the identity of the digit which
that particular image represents. The performance of the network is then assessed by comparing the digit
recognition reported by the network with the actual identity of the digit in question. This second phase of the
network operation is called testing, and successful performance on the test patterns is called generalization, a
term borrowed from psychology.

Learning Processes

We may categorize the learning processes through which neural networks function as follows: learning with a
teacher and learning without a teacher. By the same token, the latter form of learning may be subcategorized
into unsupervised learning and reinforcement learning.

Learning with a Teacher

Learning with a teacher is also referred to as supervised learning.

In conceptual terms, we may think of the teacher as having knowledge of the environment, with that
knowledge being represented by a set of input–output examples. The environment is, however, unknown to
the neural network. Suppose now that the teacher and the neural network are both exposed to a training
vector (i.e., example) drawn from the same environment. By virtue of built-in knowledge, the teacher is able
to provide the neural network with a desired response for that training vector. Indeed, the desired response
represents the “optimum” action to be performed by the neural network. The network parameters are
adjusted under the combined influence of the training vector and the error signal. The error signal is defined
as the difference between the desired response and the actual response of the network. This adjustment is
carried out iteratively in a step-by-step fashion with the aim of eventually making the neural network emulate
the teacher; the emulation is presumed to be optimum in some statistical sense. In this way, knowledge of the
environment available to the teacher is transferred to the neural network through training and stored in the
form of “fixed” synaptic weights, representing long-term memory. When this condition is reached, we may
then dispense with the teacher and let the neural network deal with the environment completely by itself.
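The iterative error-correction step described above can be sketched with a single linear neuron trained by the LMS (delta) rule. The target function d = 2*x1 - x2, the learning rate, and the examples are hypothetical choices for illustration:

```python
def train_step(w, b, x, d, eta=0.1):
    """One supervised-learning step: compute the actual response y, form the
    error signal e = d - y, and adjust the weights under the combined
    influence of the training vector and the error signal (LMS rule with a
    linear activation; eta is a hypothetical learning rate)."""
    y = sum(wj * xj for wj, xj in zip(w, x)) + b
    e = d - y
    w = [wj + eta * e * xj for wj, xj in zip(w, x)]
    b = b + eta * e
    return w, b

# the teacher's knowledge: input-output examples of d = 2*x1 - x2 (hypothetical)
examples = [([1.0, 0.0], 2.0), ([0.0, 1.0], -1.0), ([1.0, 1.0], 1.0)]
w, b = [0.0, 0.0], 0.0
for _ in range(1000):  # step-by-step, iterative adjustment
    for x, d in examples:
        w, b = train_step(w, b, x, d)
# w approaches [2, -1] and b approaches 0: the network emulates the teacher,
# with the acquired knowledge stored in the synaptic weights
```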
Learning without a Teacher

In supervised learning, the learning process takes place under the tutelage of a teacher. However, in the
paradigm known as learning without a teacher, as the name implies, there is no teacher to oversee the
learning process. That is to say, there are no labeled examples of the function to be learned by the network.
Under this second paradigm, two subcategories are identified:

1. Reinforcement Learning
In reinforcement learning, the learning of an input–output mapping is performed through continued
interaction with the environment in order to minimize a scalar index of performance.

One form of a reinforcement-learning system is built around a critic that converts a primary reinforcement signal received from the environment into a higher-quality reinforcement signal called the heuristic reinforcement signal; both are scalar inputs. The system is designed to learn under delayed reinforcement, which means that the system observes a temporal sequence of stimuli, also received from the environment, which eventually results in the generation of the heuristic reinforcement signal.
The goal of reinforcement learning is to minimize a cost-to-go function, defined as the expectation of
the cumulative cost of actions taken over a sequence of steps instead of simply the immediate cost. It
may turn out that certain actions taken earlier in that sequence of time steps are in fact the best
determinants of overall system behavior. The function of the learning system is to discover these
actions and feed them back to the environment. Delayed-reinforcement learning is difficult to perform
for two basic reasons:
• There is no teacher to provide a desired response at each step of the learning process.
• The delay incurred in the generation of the primary reinforcement signal implies that the learning
machine must solve a temporal credit assignment problem. By this we mean that the learning machine
must be able to assign credit and blame individually to each action in the sequence of time steps that
led to the final outcome, while the primary reinforcement may only evaluate the outcome.
2. Unsupervised Learning
In unsupervised, or self-organized, learning, there is no external teacher or critic to oversee the
learning process.

Rather, provision is made for a task-independent measure of the quality of representation that the
network is required to learn, and the free parameters of the network are optimized with respect to
that measure. For a specific task-independent measure, once the network has become tuned to the
statistical regularities of the input data, the network develops the ability to form internal
representations for encoding features of the input and thereby to create new classes automatically.

McCulloch-Pitts Neuron

The McCulloch-Pitts neuron, proposed by Warren McCulloch and Walter Pitts in 1943, was the earliest model of an artificial neuron. It is usually called the M-P neuron. M-P neurons are connected by directed weighted paths. It should be noted that the activation of an M-P neuron is binary, that is, at any time step the neuron may fire or may not fire. The weights associated with the communication links may be excitatory (weight is positive) or inhibitory (weight is negative).

The threshold plays a major role in M-P neuron: There is a fixed threshold for each neuron, and if the net input
to the neuron is greater than the threshold then the neuron fires.

[Figure: a simple M-P neuron Y with excitatory inputs X1–X3 (each connected with weight w) and inhibitory inputs X4–X5 (each connected with weight -p).]

A simple M-P neuron is shown in the figure above. The M-P neuron has both excitatory and inhibitory connections: a connection is excitatory with weight w (w > 0) or inhibitory with weight -p (p > 0). In the figure, inputs X1 to X3 possess excitatory weighted connections and inputs X4 and X5 possess inhibitory weighted interconnections. Since the firing of the output neuron is based upon the threshold θ, the activation function here is defined as

f(yin) = 1 if yin ≥ θ; 0 if yin < θ

Here yin is the net input at neuron Y. If the net input is greater than or equal to the threshold θ, the neuron fires; otherwise it does not fire.

The M-P neuron has no particular training algorithm. An analysis has to be performed to determine the values of the weights and the threshold. Here the weights of the neuron are set along with the threshold to make the neuron perform a simple logic function. M-P neurons are used as building blocks with which we can model any function or phenomenon that can be represented as a logic function.
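The firing rule above can be captured in a short helper function. This is an illustrative sketch for the worked examples that follow, not a standard library routine.

```python
# Minimal McCulloch-Pitts neuron: binary output against a fixed threshold theta.
def mp_neuron(inputs, weights, theta):
    """Fire (return 1) when the net input meets or exceeds the threshold."""
    y_in = sum(w * x for w, x in zip(weights, inputs))
    return 1 if y_in >= theta else 0

print(mp_neuron([1, 1], [1, 1], theta=2))  # net input 2 >= 2, so the neuron fires: 1
```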

Examples

Q1. For the network shown in Figure, calculate the net input to the output neuron.

[Figure: inputs X1, X2, X3 (values 0.3, 0.5, 0.6) connected to output neuron Y with weights 0.2, 0.1, -0.3.]

The given neural net consists of three input neurons and one output neuron. The inputs and weights are:

[x1, x2, x3] = [0.3, 0.5, 0.6]

[w1, w2, w3] = [0.2, 0.1, -0.3]

The net input Yin at neuron Y can be calculated as:

Yin = w1x1 + w2x2 + w3x3

Yin = 0.3*0.2 + 0.5*0.1 + 0.6*(-0.3) = -0.07
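This arithmetic can be verified directly in a few lines (a sketch mirroring the calculation above):

```python
# Net input for Q1: weighted sum of three inputs, no bias.
x = [0.3, 0.5, 0.6]
w = [0.2, 0.1, -0.3]
y_in = sum(wi * xi for wi, xi in zip(w, x))
print(round(y_in, 2))  # → -0.07
```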

Q2. For the network shown in Figure, calculate the net input to the output neuron.

[Figure: inputs X1, X2 (values 0.2, 0.6) connected to output neuron Y with weights 0.3, 0.7 and a bias of 0.45.]
Here inputs are [x1, x2] = [0.2, 0.6], weights are [w 1, w2] = [0.3, 0.7] and bias is b=0.45
The net input Yin at neuron Y can be calculated as: Yin = w1x1 + w2x2 + b

Yin = 0.2*0.3 + 0.6*0.7 + 0.45 = 0.93

The net input to neuron Y is 0.93.
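The same check with a bias term (again an illustrative sketch of the calculation above):

```python
# Net input for Q2: weighted sum of two inputs plus a bias.
x = [0.2, 0.6]
w = [0.3, 0.7]
b = 0.45
y_in = b + sum(wi * xi for wi, xi in zip(w, x))
print(round(y_in, 2))  # → 0.93
```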

Q3. For the network shown in Figure, calculate the output of the neuron Y using activation function as

(i) Binary threshold function


(ii) Bipolar threshold function
(iii) Binary sigmoidal function
[Figure: inputs X1, X2, X3 (values 0.8, 0.6, 0.4) connected to output neuron Y with weights 0.1, 0.3, -0.2 and a bias of 0.35.]

The given network has three input neurons with bias and one output neuron. These form a single-layer
network. The inputs are given as [x1, x2, x3] = [0.8, 0.6, 0.4] and the weights [w 1, w2, w3] = [0.1, 0.3, -0.2] with
bias=0.35

The net input to the output neuron is


Yin = b + Σ wi xi   (summing over i = 1 to n)

So here Yin = w1x1 + w2x2 + w3x3 + b

Yin = 0.8*0.1 + 0.6*0.3 + 0.4*(-0.2) + 0.35 = 0.53

(i) Using Binary Threshold function

The threshold activation function is


f(yin) = 1 if yin ≥ θ; 0 if yin < θ

In binary threshold function, the value of θ is 0. So the binary threshold activation here is

f(yin) = 1 if yin ≥ 0; 0 if yin < 0

Here Yin is 0.53, that is greater than 0, so the output Y here is 1.


(ii) Using Bipolar Threshold function
In case of bipolar, 1 is replaced by +1 and 0 is replaced by -1. +1 represents true and -1 represents
false. So the bipolar threshold activation function is
f(yin) = +1 if yin ≥ 0; −1 if yin < 0
Here Yin is 0.53 that is greater than 0, so the output Y here is + 1.

(iii) Using Binary Sigmoidal function


The binary sigmoidal activation function is:

Y = f(Yin) = 1 / (1 + e^(−Yin))

With Yin = 0.53:

Y = 1 / (1 + e^(−0.53))

So the output Y ≈ 0.629
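All three activations for Q3 can be checked in one short script (an illustrative sketch of the calculations above):

```python
import math

# Q3: net input with bias, then three activation functions.
x = [0.8, 0.6, 0.4]
w = [0.1, 0.3, -0.2]
b = 0.35
y_in = b + sum(wi * xi for wi, xi in zip(w, x))   # 0.53

binary  = 1 if y_in >= 0 else 0                   # binary threshold (theta = 0)
bipolar = 1 if y_in >= 0 else -1                  # bipolar threshold
sigmoid = 1 / (1 + math.exp(-y_in))               # binary sigmoidal

print(round(y_in, 2), binary, bipolar, round(sigmoid, 3))  # → 0.53 1 1 0.629
```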

Q4. Implement AND function using McCulloch-Pitts neuron model.

The truth table for AND function is

X1 X2 Y
1 1 1
1 0 0
0 1 0
0 0 0

In the McCulloch-Pitts neuron, only analysis is performed. Hence, assume the weights to be w1 = 1 and w2 = 1.
The network architecture is:

[Figure: inputs X1, X2 connected to output neuron Y with weights w1 and w2.]

With these assumed weights, the net input is calculated for four inputs:

Initially we assume here that w1=1 and w2=1

(1, 1) Yin=w1*x1+ w2*x2 Yin= 1*1 + 1*1 = 2


(1, 0) Yin=w1*x1+ w2*x2 Yin= 1*1 + 1*0 = 1
(0, 1) Yin=w1*x1+ w2*x2 Yin= 1*0 + 1*1 = 1
(0, 0) Yin=w1*x1+ w2*x2 Yin= 1*0 + 1*0 = 0

Based on this net input for all inputs, we will set the threshold so that output from neuron Y will match the
desired output for AND function.
For example here in AND function, the output is high only for input (1, 1) otherwise the output is 0. So we will
adjust the threshold in such a way that this net input will be converted into desired output. Here we can set
the threshold value θ = 2.

So where the net input is greater than or equal to 2, the neuron will output 1 otherwise neuron will give the
output 0.

(1, 1) Yin=w1*x1+ w2*x2 Yin= 1*1 + 1*1 = 2 Output 1


(1, 0) Yin=w1*x1+ w2*x2 Yin= 1*1 + 1*0 = 1 Output 0
θ=2
(0, 1) Yin=w1*x1+ w2*x2 Yin= 1*0 + 1*1 = 1 Output 0
(0, 0) Yin=w1*x1+ w2*x2 Yin= 1*0 + 1*0 = 0 Output 0

Using the weights (1, 1) with threshold θ = 2, neuron Y provides the output desired by the AND function. The network for the AND function is:

[Figure: AND network — inputs X1, X2 connected to Y with weights w1 = 1, w2 = 1 and threshold θ = 2.]

If we are not able to set a suitable threshold with these assumed weights, we can try another pair of weights such as (1, -1), (-1, 1) or (-1, -1).
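The AND analysis can be confirmed by enumerating all four input pairs (using the same illustrative M-P helper sketched earlier):

```python
# M-P neuron implementing AND with weights (1, 1) and threshold 2.
def mp_neuron(inputs, weights, theta):
    y_in = sum(w * x for w, x in zip(weights, inputs))
    return 1 if y_in >= theta else 0

for x1, x2 in [(1, 1), (1, 0), (0, 1), (0, 0)]:
    print(x1, x2, mp_neuron([x1, x2], [1, 1], theta=2))  # only (1, 1) fires
```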

Q5. Implement OR function using McCulloch-Pitts neuron model.

The truth table for OR function is

X1 X2 Y
1 1 1
1 0 1
0 1 1
0 0 0

First, assume the weights to be w1 = 1 and w2 = 1. The network architecture is:

[Figure: inputs X1, X2 connected to output neuron Y with weights w1 and w2.]

With these assumed weights, the net input is calculated for four inputs:

Initially we assume here that w1=1 and w2=1


(1, 1) Yin=w1*x1+ w2*x2 Yin= 1*1 + 1*1 = 2
(1, 0) Yin=w1*x1+ w2*x2 Yin= 1*1 + 1*0 = 1
(0, 1) Yin=w1*x1+ w2*x2 Yin= 1*0 + 1*1 = 1
(0, 0) Yin=w1*x1+ w2*x2 Yin= 1*0 + 1*0 = 0

Here in the OR function, the output is 1 for all pairs of inputs except the pair (0, 0). So we will adjust the threshold so that this net input is converted into the desired output. Here we can set the threshold value θ = 1.

So where the net input is greater than or equal to 1, the neuron will output 1 otherwise neuron will give the
output 0.

(1, 1) Yin=w1*x1+ w2*x2 Yin= 1*1 + 1*1 = 2 Output 1


(1, 0) Yin=w1*x1+ w2*x2 Yin= 1*1 + 1*0 = 1 Output 1
θ=1
(0, 1) Yin=w1*x1+ w2*x2 Yin= 1*0 + 1*1 = 1 Output 1
(0, 0) Yin=w1*x1+ w2*x2 Yin= 1*0 + 1*0 = 0 Output 0

Using the weights (1, 1) with threshold θ = 1, neuron Y provides the output desired by the OR function.

The network for OR function is:

[Figure: OR network — inputs X1, X2 connected to Y with weights w1 = 1, w2 = 1 and threshold θ = 1.]
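As with AND, the OR analysis can be confirmed by enumerating the inputs (again using an illustrative M-P helper):

```python
# M-P neuron implementing OR with weights (1, 1) and threshold 1.
def mp_neuron(inputs, weights, theta):
    y_in = sum(w * x for w, x in zip(weights, inputs))
    return 1 if y_in >= theta else 0

for x1, x2 in [(1, 1), (1, 0), (0, 1), (0, 0)]:
    print(x1, x2, mp_neuron([x1, x2], [1, 1], theta=1))  # only (0, 0) stays off
```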

Q6. Implement XOR function using McCulloch-Pitts neuron model.

The truth table for XOR function is

X1 X2 Y
1 1 0
1 0 1
0 1 1
0 0 0
The output is "ON" only for an odd number of 1s; otherwise it is "OFF." The XOR function cannot be represented by a single simple logic function; it is represented as:

Y=x1x2’ + x1’x2

Y = z 1 + z2

Where z1 = x1x2’ (function 1)

z2 = x1’x2 (function 2) and Y = z1 +(OR) z2


A single-layer net is not sufficient to represent the function. An intermediate layer is necessary.

[Figure: two-layer XOR network — inputs X1, X2 feed hidden neurons Z1 (weights W11, W21) and Z2 (weights W12, W22), which feed output neuron Y (weights V11, V21).]

For first function z1 = x1x2’

The truth table for z1 is,

X1 X2 Z1
1 1 0
1 0 1
0 1 0
0 0 0
1st case: Assume both weights as excitatory,

For z1, weights are W11, W21, assume both equal to 1.

Calculate the net inputs for all the inputs:

(1, 1) Z1in=w11*x1+ w21*x2 Z1in = 1*1 + 1*1 = 2


(1, 0) Z1in=w11*x1+ w21*x2 Z1in = 1*1 + 1*0 = 1
(0, 1) Z1in=w11*x1+ w21*x2 Z1in = 1*0 + 1*1 = 1
(0, 0) Z1in=w11*x1+ w21*x2 Z1in = 1*0 + 1*0 = 0

Here, no threshold yields the desired output for all four inputs: the pairs (1, 0) and (0, 1) both give a net input of 1 but require different outputs. So it is not possible to obtain function z1 using these weights.

2nd case: Assume W11=1 and W21=-1

Calculate the net inputs for all the inputs:

Input Calculate net input Set Threshold Output


(1, 1) Z1in=w11*x1+ w21*x2 Z1in = 1*1 + (-1)*1 = 0 0
(1, 0) Z1in=w11*x1+ w21*x2 Z1in = 1*1 + (-1)*0 = 1 1
θ≥1
(0, 1) Z1in=w11*x1+ w21*x2 Z1in = 1*0 + (-1)*1 = -1 0
(0, 0) Z1in=w11*x1+ w21*x2 Z1in = 1*0 + (-1)*0 = 0 0

For the neuron Z1, the weights are W11=1, W21= -1 and Threshold θ≥1.
For second function z2 = x1’x2

The truth table for Z2 is,

X1 X2 Z2
1 1 0
1 0 0
0 1 1
0 0 0

First, assume the weights W12 = -1 and W22 = 1 and calculate the net input for all the inputs:

Input Calculate net input Set Threshold Output


(1, 1) Z2in=w12*x1+ w22*x2 Z2in = (-1)*1 + 1*1 = 0 0
(1, 0) Z2in=w12*x1+ w22*x2 Z2in = (-1)*1 + 1*0 = -1 0
θ≥1
(0, 1) Z2in=w12*x1+ w22*x2 Z2in = (-1)*0 + 1*1 = 1 1
(0, 0) Z2in=w12*x1+ w22*x2 Z2in = (-1)*0 + 1*0 = 0 0
For the neuron Z2, the weights are W12= -1, W22=1 and Threshold θ≥1.

For third function Y = Z1 + Z2

The truth table for this function is:

X1 X2 Y Z1 Z2
1 1 0 0 0
1 0 1 1 0
0 1 1 0 1
0 0 0 0 0
The net input is calculated as:

Yin = V11*Z1 + V21*Z2

For this function, assume weights V11=1 and V21=1

Input Calculate net input Set Threshold Output


(1, 1) Yin = V11*Z1 + V21*Z2 Yin = 1*0 + 1*0 = 0 0
(1, 0) Yin = V11*Z1 + V21*Z2 Yin = 1*1 + 1*0 = 1 1
θ≥1
(0, 1) Yin = V11*Z1 + V21*Z2 Yin = 1*0 + 1*1 = 1 1
(0, 0) Yin = V11*Z1 + V21*Z2 Yin = 1*0 + 1*0 = 0 0

Now the final network for XOR is:

[Figure: final XOR network — X1 to Z1 with W11 = 1, X2 to Z1 with W21 = -1, X1 to Z2 with W12 = -1, X2 to Z2 with W22 = 1; Z1 and Z2 to Y with V11 = 1, V21 = 1; each of Z1, Z2, and Y uses threshold θ = 1.]
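Putting the three neurons together, the complete two-layer network can be verified end-to-end (the helper is the same illustrative M-P sketch used in the earlier examples):

```python
# Two-layer M-P network for XOR: z1 = x1 AND NOT x2, z2 = NOT x1 AND x2, y = z1 OR z2.
def mp_neuron(inputs, weights, theta):
    y_in = sum(w * x for w, x in zip(weights, inputs))
    return 1 if y_in >= theta else 0

def xor(x1, x2):
    z1 = mp_neuron([x1, x2], [1, -1], theta=1)   # W11 = 1, W21 = -1
    z2 = mp_neuron([x1, x2], [-1, 1], theta=1)   # W12 = -1, W22 = 1
    return mp_neuron([z1, z2], [1, 1], theta=1)  # V11 = 1, V21 = 1

for x1, x2 in [(1, 1), (1, 0), (0, 1), (0, 0)]:
    print(x1, x2, xor(x1, x2))  # fires only for (1, 0) and (0, 1)
```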
