
Soft Computing (RCS-071)

Unit-I : Neural Networks-1


(Introduction & Architecture)

Neuron, Nerve structure and synapse,


Artificial Neuron and its model, activation functions,
Neural network architecture: single layer and multilayer feed
forward networks, recurrent networks.
Various learning techniques;
perceptron and convergence rule,
Auto-associative and hetero-associative memory.
1
Books for Reference

2
Soft Computing Components
Fuzzy Logic

3
How does SC relate to other fields?
[Diagram: Soft Computing shown in relation to neighbouring fields - AI (symbolic manipulation), Machine Learning (automatic improvement with experience), Cognitive Psychology (study of the mind), and Statistics (probability, not possibility; uncertainty and imprecision).]
4
Introduction
• Artificial Intelligence: designing intelligent computer systems, i.e. systems that exhibit the characteristics we associate with intelligence in human behaviour.

• The idea behind soft computing is to model the cognitive behaviour of the human mind.
Cognitive: involves conscious intellectual activity (such as thinking, reasoning, or remembering).

• Soft computing is the foundation of conceptual intelligence in machines.

• Unlike hard computing, soft computing is tolerant of imprecision, uncertainty, partial truth, and approximation.
5
Soft Computing
• In computer science, soft computing is the use of inexact solutions to solve computationally hard tasks, such as NP-complete problems, for which there is no known algorithm that can compute an exact solution in polynomial time.

• “The essence of soft computing is that unlike the traditional, hard computing, soft computing is aimed at an accommodation with the pervasive imprecision of the real world.”

• “Thus, the guiding principle of soft computing is to exploit the tolerance for imprecision, uncertainty, and partial truth to achieve tractability, robustness, low solution cost, and better rapport with reality.” - Lotfi Zadeh (founder of fuzzy mathematics, fuzzy set theory, and fuzzy logic)
6
Soft Computing Characteristics
1. Human Expertise (if-then rules, cases, conventional
knowledge representations)
2. Biologically inspired computing models

3. Model-free learning
4. Fault tolerance (deletion of neuron, rule, or case)

5. Real-world applications (large scale with uncertainties)


6. Computationally intelligent
7. Can adapt to the changing environment and can learn to do
better
8. Can explain their decisions
9. Optimization techniques
7
Hard Vs Soft Computing Paradigms
∙ Hard computing

− Based on the concept of precise modelling and analysis to yield accurate results.
− Works well for simple problems, but is bound by the NP-complete set.

∙ Soft computing
− Aims to surmount NP-complete problems.
− Uses inexact methods to give useful but inexact answers to
intractable problems.
− Represents a significant paradigm shift in the aims of computing, a shift which reflects the human mind.
− Tolerant to imprecision, uncertainty, partial truth, and
approximation.
− Well suited for real world problems where ideal models are not
available. 8
Difference between Soft and Hard Computing

Hard Computing | Soft Computing
Conventional computing requires a precisely stated analytical model. | Soft computing is tolerant of imprecision.
Often requires a lot of computation time. | Can solve some real-world problems in reasonably less time.
Not suited for real-world problems for which an ideal model is not present. | Suitable for real-world problems.
It requires full truth. | Can work with partial truth.
It is precise and accurate. | Imprecise.
High cost for solution. | Low cost for solution.
9


Components of Soft Computing

10
APPLICATIONS OF SOFT COMPUTING
 Handwriting Recognition
 Image Processing and Data Compression
 Automotive Systems and Manufacturing
 Decision-support Systems
 Power Systems
 Neuro Fuzzy systems
 Fuzzy Logic Control
 Machine Learning Applications
 Speech and Vision Recognition Systems
 Process Control
 Medical
 Data Mining
 Document Classification
 Document Analysis
 Multimedia database
“Principles of Soft Computing, 2nd Edition” by S.N. Sivanandam & S.N. Deepa,
Copyright © 2011 Wiley India Pvt. Ltd. All rights reserved. 11
Ex.

Hard Computing | Soft Computing
Conventional computing requires a precisely stated analytical model. | Soft computing is tolerant of ………..
Often requires a …………… | Can solve some real world problems in reasonably less time.
Not suited for real world problems for which ……… is not present. | Suitable for real world problems.
It requires ….. truth | Can work with …… truth
It is precise and accurate | …….
………. cost for solution | …… cost for solution

12
1. Application
• A credit card company receives thousands of applications for
new cards. Each application contains information about an
applicant,
– age
– Marital status
– annual salary
– outstanding debts
– credit rating
– etc.
• Problem: to decide whether an application should be approved,
or to classify applications into two categories, approved and not
approved.

13
An example: data (loan application)
Approved or not

14
An example: the learning task
• Learn a classification model from the data
• Use the model to classify future loan applications
into
– Yes (approved) and
– No (not approved)
• What is the class for the following case/instance?

15
2. Application
• An emergency room in a hospital measures 17
variables (e.g. blood pressure, age, etc) of newly
admitted patients.
• A decision is needed: whether to put a new patient in
an intensive-care unit.
• Due to the high cost of ICU, those patients who may
survive less than a month are given higher priority.
• Problem: to predict high-risk patients and
discriminate them from low-risk patients.

16
Machine learning is
• Like human learning which comes from past experiences.
• A computer does not have “experiences”.
• A computer system learns from data, which represent some
“past experiences” of an application domain.
• Basic Components:
– Class of tasks
– Performance Measure
– Well defined experience
• Our focus: learn a target function that can be used to predict the values of a discrete class attribute, e.g., approved or not-approved, and high-risk or low-risk.
• The task is commonly called: Supervised learning,
classification, or inductive learning. 17
LEARNING ALGORITHMS

18
SUPERVISED LEARNING
 Generally, a set of patterns is given where the class label of each pattern is
known. This is known as the training data.
 Training Data:
– k attributes: A1, A2, … Ak.
– a class/label/category: Each example is labelled with a pre-defined class
 The information in the training data should be used to identify the class of the
test pattern.
 This type of classification where a training set is used is called supervised
learning.
 A supervised learning algorithm analyzes the training data and produces an
inferred function, which is called a classifier (if the output is discrete) or a
regression function (if the output is continuous).
 This inferred function can be used for mapping new examples, i.e. to correctly determine the class labels for unseen instances.
 Learning with the help of a teacher. Example: the learning process of a small child who does not yet know how to read or write; each and every action is supervised by a teacher. 19
Supervised Learning
• In ANN, each input vector
requires a corresponding target
vector, which represents the
desired output.
• The input vector along with target
vector is called training pair.
• The input vector results in output
vector.
• The actual output vector is
compared with desired output
vector.
• If there is a difference, an error signal is generated by the network.
• This error signal is used for adjustment of the weights until the actual output matches the desired output (a minimal sketch of this error-driven update is given below).
20
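Below is a minimal sketch of this error-driven weight adjustment, using a Widrow-Hoff (delta-rule) style update on a single linear neuron; the data, learning rate, and epoch count are hypothetical and only illustrate comparing the actual and desired outputs.

```python
import numpy as np

# Hypothetical training pairs (input vector, desired output); not from the slides.
X = np.array([[0.0, 1.0], [1.0, 0.0], [1.0, 1.0]])   # input vectors
T = np.array([1.0, 1.0, 2.0])                         # desired (target) outputs
w = np.zeros(2)                                       # initial weights
lr = 0.1                                              # learning rate (assumed)

for epoch in range(100):
    for x, t in zip(X, T):
        y = w @ x                 # actual output of the linear neuron
        error = t - y             # error signal = desired output - actual output
        w += lr * error * x       # adjust weights to reduce the error
print(w)                          # approaches [1.0, 1.0]
```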
Generalizations of supervised learning
There are several ways in which the standard supervised
learning problem can be generalized:

• Semi-supervised learning: In this setting, the desired output values are provided only for a subset of the training data. The remaining data is unlabeled.

• Active learning: Instead of assuming that all of the training examples are given at the start, active learning algorithms interactively collect new examples, typically by making queries to a human user. Often, the queries are based on unlabeled data, which is a scenario that combines semi-supervised learning with active learning.
21
Supervised learning process: two steps
 Learning (training): learn a model using the training data.
 Testing: test the model using unseen test data to assess the model accuracy:

Accuracy = (number of correct classifications) / (total number of test cases)

22
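As a tiny illustration of this formula (the predicted and actual labels below are hypothetical):

```python
# Hypothetical predicted vs. actual class labels for five test cases.
predicted = ["yes", "no", "yes", "yes", "no"]
actual    = ["yes", "no", "no",  "yes", "no"]

correct = sum(p == a for p, a in zip(predicted, actual))
accuracy = correct / len(actual)
print(f"Accuracy = {correct}/{len(actual)} = {accuracy:.2f}")   # 4/5 = 0.80
```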
Algorithms of Supervised Learning
• Artificial neural network
• Boosting (meta-algorithm)
• Bayesian statistics
• Case-based reasoning
• Decision tree learning
• Inductive logic programming
• Naive Bayes classifier
• Nearest Neighbor Algorithm
• Support vector machines
• Random Forests
• Ensembles of Classifier
23
Applications of Supervised
Learning
• Bioinformatics
• Database marketing
• Handwriting recognition
• Information retrieval
• Object recognition in computer vision
• Optical character recognition
• Spam detection
• Pattern recognition
• Speech recognition
24
UNSUPERVISED LEARNING
• In unsupervised learning, there is no explicit teacher, and the system forms clusters or “natural groupings” of the input patterns.
• Unsupervised machine learning is the machine learning task of
inferring a function to describe hidden structure from unlabeled
data.
• That is, there is no supervisor telling us what is right or wrong; we
simply observe some data and try to describe it in an efficient way
with our model.
• Approaches to unsupervised learning include: clustering (e.g., k-means, mixture models, hierarchical clustering) and, among neural network models, the self-organizing map (SOM) and adaptive resonance theory (ART).
25
Unsupervised learning
• Example: tadpole – learn to swim by
itself.
• In an ANN, during the training process, the network receives input patterns and organizes them to form clusters.
• It is observed that no feedback is applied from the environment to inform what the output should be or whether it is correct.
• The network itself discovers patterns, regularities, features and categories from the input data, and relations of the input data to the output.
• Exact clusters are formed by discovering similarities and dissimilarities, which is why this process is called self-organizing.
26
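The slides name k-means as a typical clustering approach; the following is a minimal k-means sketch (the function name, data, and parameters are hypothetical, not from the slides).

```python
import numpy as np

def kmeans(X, k, n_iters=100, seed=0):
    """Tiny k-means sketch: X is an (n_samples, n_features) array."""
    rng = np.random.default_rng(seed)
    centroids = X[rng.choice(len(X), size=k, replace=False)]
    for _ in range(n_iters):
        # Assign each point to its nearest centroid (similarity measured by distance).
        dists = np.linalg.norm(X[:, None, :] - centroids[None, :, :], axis=2)
        labels = dists.argmin(axis=1)
        # Recompute each centroid as the mean of the points assigned to it.
        new_centroids = np.array([X[labels == j].mean(axis=0) if np.any(labels == j)
                                  else centroids[j] for j in range(k)])
        if np.allclose(new_centroids, centroids):
            break
        centroids = new_centroids
    return labels, centroids

# Hypothetical 2-D data with two natural groupings.
X = np.array([[1.0, 1.0], [1.2, 0.8], [0.9, 1.1],
              [5.0, 5.0], [5.2, 4.9], [4.8, 5.1]])
print(kmeans(X, k=2))
```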
Reinforcement learning
It is a type of Machine Learning that allows machines and
software agents to automatically determine the ideal
behaviour within a specific context, in order to maximize
its performance.

The basic reinforcement learning model consists of:


• A set of environment states
• A set of actions
• Rules of transitioning between states;
• Rules that determine the scalar immediate reward of a
transition; and
• Rules that describe what the agent observes.
• Examples: Applications to game playing and robot control
27
Reinforcement learning
• Similar to supervised learning.
• Learning based on critic information is called reinforcement
learning & the feedback sent is called reinforcement signal.
• The network receives some feedback from the environment.
• Feedback is only evaluative: a reward is given for a correct answer and a penalty for a wrong one.
• The external reinforcement signal is processed by a critic signal generator, and the resulting critic signal is sent to the ANN so that the weights are adjusted properly and better critic feedback is obtained in future.

28
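A minimal sketch of the reinforcement-learning loop described above (states, actions, a transition rule, a scalar immediate reward, and value updates) on a hypothetical 5-state corridor task; all names and numbers are illustrative, not from the slides.

```python
import random

# Tiny 5-state corridor: the agent earns a reward of 1 only on reaching the
# rightmost state. States, actions, transition rule, and reward are all illustrative.
n_states, actions = 5, [-1, +1]            # actions: move left or right
Q = {(s, a): 0.0 for s in range(n_states) for a in actions}
alpha, gamma, epsilon = 0.5, 0.9, 0.1      # learning rate, discount, exploration

for episode in range(200):
    s = 0
    while s != n_states - 1:
        # Choose an action (mostly greedy, occasionally exploratory).
        if random.random() < epsilon:
            a = random.choice(actions)
        else:
            a = max(actions, key=lambda act: Q[(s, act)])
        s_next = min(max(s + a, 0), n_states - 1)      # rule of transitioning
        r = 1.0 if s_next == n_states - 1 else 0.0     # scalar immediate reward
        # Update the action-value estimate from the evaluative feedback.
        Q[(s, a)] += alpha * (r + gamma * max(Q[(s_next, b)] for b in actions) - Q[(s, a)])
        s = s_next

print(max(actions, key=lambda act: Q[(0, act)]))       # learned best action in state 0: +1
```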
The Nervous System
The human nervous system can be broken down into three stages that
may be represented in block diagram form as

• The receptors collect information from the environment


e.g. photons on the retina.
• The effectors generate interactions with the environment
e.g. activate muscles.
• The flow of information/activation is represented by arrows: feed-forward and feedback.
• Naturally, this module will be primarily concerned with how the
neural network in the middle works. 31
Brain Vs computer
Term | Brain | Computer
Speed | Execution time is a few milliseconds. | Execution time is a few nanoseconds.
Processing | Performs massive parallel operations simultaneously. | Performs several parallel operations simultaneously; it is faster than the biological neuron.
Size and complexity | Number of neurons is about 10^11 and number of interconnections is about 10^15, so the complexity of the brain is higher than that of the computer. | Depends on the chosen application and network designer.
Storage capacity | i) Information is stored in interconnections or in synapse strengths. ii) New information is stored without destroying the old. iii) Sometimes fails to recollect information. | i) Stored in continuous memory locations. ii) Overloading may destroy older locations. iii) Can be easily retrieved.
32
Contd…
Tolerance | i) Fault tolerant. ii) Stores and retrieves information even if interconnections fail. iii) Accepts redundancies. | i) No fault tolerance. ii) Information corrupted if the network connections are disconnected. iii) No redundancies.
Control mechanism | Depends on active chemicals; neuron connections are strong or weak. | CPU; the control mechanism is very simple.

33
A BIOLOGICAL NEURON
• The most basic element of the human brain is a specific type of
cell, called neuron
OR
• The basic computational unit in the nervous system is the nerve
cell, or neuron.
• These neurons provide the abilities to remember, think, and
apply previous experiences to our every action
• Together, these neurons and their connections form a process,
which is not binary, not stable, and not synchronous.
• Basically, a biological neuron receives inputs from other sources,
combines them in some way, performs a generally nonlinear
operation on the result, and then outputs the final result.
• Neurons are responsible for input/output operations, signal
transformations and storage of information. 34
Structure of A BIOLOGICAL NEURON

35
Components of A BIOLOGICAL NEURON
Within humans there are many variations on the basic type of neuron; yet all biological neurons have the same four basic components.
They are known by their biological names –
 cell body (soma),
 dendrites,
 axon,
 and synapses.
Cell body (Soma):The body of neuron cell contains the
nucleus and carries out biochemical transformation
necessary to the life of neurons.
Dendrite: Each neuron has fine, hair-like tubular structures (extensions) around it. They branch out into a tree around the cell body. They accept incoming signals and behave as an input channel.
36
Components of A BIOLOGICAL NEURON
Axon:
 A link attached to the soma. It is a long, thin, tubular structure which works like a transmission line and serves as an output channel.
 It is a non-linear threshold device which produces a voltage pulse called an Action Potential or Spike (lasting about a millisecond).
Synapse:
 At the end of axon are highly complex and specialized structures
called synapses or synaptic junction that connects the axon with
dendritic link of another neuron.
 Connection between two neurons takes place at these synapses.
 Dendrites receive the input through the synapses of other neurons.
 The soma processes these incoming signals over time and
converts that processed value into an output, which is sent out to
other neurons through the axon and the synapses.
37
Neural Signal Processing
The key components of neural signal processing are:
1. Signals from connected neurons are collected by the dendrites.
2. The cell body (soma) sums the incoming signals (spatially and temporally).
3. When sufficient input is received (i.e. a threshold is exceeded), the
neuron generates an action potential or ‘spike’ (i.e. it ‘fires’).
4. That action potential is transmitted along the axon to other
neurons, or to structures outside the nervous systems (e.g., muscles).
5. If sufficient input is not received (i.e. the threshold is not
exceeded), the inputs quickly decay and no action potential is
generated.
6. Timing is clearly important: input signals must arrive together, and strong inputs will generate more action potentials per unit time.

38
Artificial Neuron
A neuron has a set of n synapses associated with its inputs, each characterized by a weight. A signal xi, i = 1, …, n, arriving at the i-th input is multiplied (weighted) by the weight wi, i = 1, …, n.

The weighted input signals are summed, so a linear combination of the input signals, w1·x1 + … + wn·xn, is obtained. A "free weight" (or bias) w0, which does not correspond to any input, is added to this linear combination, forming the weighted sum

z = w0 + w1·x1 + … + wn·xn

A nonlinear activation function φ is applied to the weighted sum. The value of the activation function, y = φ(z) = f(x1, …, xn), is the neuron's output.

39
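A minimal sketch of this computation, assuming a sigmoid as the activation function φ (the inputs, weights, and bias below are hypothetical):

```python
import math

def neuron_output(x, w, bias):
    """Weighted sum z = bias + sum_i w_i * x_i followed by a sigmoid activation."""
    z = bias + sum(wi * xi for wi, xi in zip(w, x))
    return 1.0 / (1.0 + math.exp(-z))    # y = phi(z), here the logistic function

# Hypothetical inputs, weights, and bias.
x = [0.5, -1.0, 2.0]
w = [0.4,  0.6, -0.3]
print(neuron_output(x, w, bias=0.1))
```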
Terminology Relation Between Biological And
Artificial Neuron

Biological Neuron | Artificial Neuron
Cell | Neuron
Dendrites | Weights or interconnections
Soma | Net input
Axon | Output

40
AN ARTIFICIAL NEURON
 In Figure various inputs to the network are represented by the
mathematical symbol, xn. Each of these inputs is multiplied by a
connection weight. The weights are represented by wn.
 In the simplest case, these products are summed, fed to a transfer
function (activation function) to generate a result, and this result is sent
as output.
Seven major components make up an artificial neuron.
Component 1. Weighting Factors:
 A neuron usually receives many simultaneous inputs.
 Each input has its own relative weight, which gives the input the impact
that it needs on the processing element's summation function.
 Weights are adaptive coefficients
 They are a measure of an input's connection strength which can be
modified in response to various training sets and according to a
network's specific topology or its learning rules. 41
AN ARTIFICIAL NEURON
Component 2. Summation Function:

 The inputs and corresponding weights are vectors which can be represented as (x1, x2, …, xn) and (w1, w2, …, wn). The total input signal is the dot product of these two vectors. The result, (x1·w1) + (x2·w2) + … + (xn·wn), is a single number.
 In addition to summing, the summation function can select the minimum, maximum, majority, product, or apply one of several normalizing algorithms.
 Some summation functions have an additional ‘activation function’
applied to the result before it is passed on to the transfer function for the
purpose of allowing the summation output to vary with respect to time.

42
AN ARTIFICIAL NEURON
Component 3. Transfer Function:
 In the transfer function the summation can be compared with some
threshold to determine the neural output. If the sum is greater than the
threshold value, the processing element generates a signal and if it is less
than the threshold, no signal (or some inhibitory signal) is generated.
 Both types of response are significant.
 The threshold, or transfer function, is generally non-linear.
Component 4. Scaling and Limiting:
 After the transfer function, the result can pass through additional
processes, which scale and limit.
 This scaling simply multiplies a scale factor times the transfer value and
then adds an offset.
 Limiting is the mechanism which ensures that the scaled result does not exceed an upper or lower bound.
43
AN ARTIFICIAL NEURON
Component 5. Output Function (Competition):
 Each processing element is allowed one output signal, which it may give
to hundreds of other neurons.
 Some network topologies modify the transfer result to incorporate
competition among neighboring processing elements.
 First, competition determines which artificial neuron will be active or
provides an output. Second, competitive inputs help to determine which
processing element will participate in the learning or adaptation process.
Component 6. Error Function and Back-Propagated Value:
 In most learning networks the difference between the current output and
the desired output is calculated as an error which is then transformed by
the error function to match a particular network architecture.
 The error is propagated backwards to a previous layer.

44
AN ARTIFICIAL NEURON

 This back-propagated value, after being scaled by the learning


function, is multiplied against each of the incoming connection
weights to modify them before the next learning cycle.

Component 7. Learning Function:


Its purpose is to modify the weights on the inputs of each
processing element according to some neural based algorithm.

45
Activation functions
The activation function acts as a squashing function, such that the
output of a neuron in a neural network is between certain values
(usually 0 and 1, or -1 and 1).
 To make the network work more efficiently and produce the exact output, some force or activation is applied.
 In the same way, an activation function is applied over the net input to calculate the output of an ANN.
 Information processing in a processing element has two major parts: input and output.
 An integration function (f) is associated with the input of a processing element.
 Several activation functions are in use.
Refer written notes (given in class)
46
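Since the specific functions are left to the written notes, the sketch below only shows a few commonly used activation functions (binary step, sigmoid, tanh, ReLU); it is illustrative and assumes NumPy is available.

```python
import numpy as np

def binary_step(z, theta=0.0):
    return np.where(z >= theta, 1.0, 0.0)   # hard threshold at theta

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))         # squashes output into (0, 1)

def bipolar(z):
    return np.tanh(z)                       # squashes output into (-1, 1)

def relu(z):
    return np.maximum(0.0, z)               # rectified linear unit

z = np.array([-2.0, -0.5, 0.0, 0.5, 2.0])   # hypothetical net inputs
for name, f in [("step", binary_step), ("sigmoid", sigmoid),
                ("tanh", bipolar), ("relu", relu)]:
    print(name, f(z))
```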
Neural Networks
 Neural networks were inspired by the design and functioning of the human brain and its components.
 ANNs are relatively crude electronic models, or information-processing models, inspired by the way the biological nervous system (i.e. the brain) processes information. The brain stores information as patterns.
 An ANN is composed of a large number of highly interconnected processing elements (neurons) working in unison to solve problems.
 It is configured for specific applications, such as pattern recognition and data classification, through a learning process.
 85-90% accurate.
 ANNs draw inspiration from the following:
the process of storing information as patterns,
utilizing those patterns,
and then solving the problems. 47
DEFINITIONS OF NEURAL NETWORKS
According to the DARPA Neural Network Study (1988, AFCEA
International Press, p. 60):

A neural network is a system composed of many simple processing


elements operating in parallel whose function is determined by network
structure, connection strengths, and the processing performed at computing
elements or nodes.

According to Haykin (1994), p. 2:

A neural network is a massively parallel distributed processor that has a


natural tendency for storing observed knowledge and making it available
for use. It resembles the brain in two respects:
• Knowledge is acquired by the network through a learning process.
• Interneuron connection strengths known as synaptic weights are used
to store the knowledge.
48
DEFINITIONS OF NEURAL NETWORKS

According to Nigrin (1993)

A neural network is a circuit composed of a very large number of


simple processing elements that are neurally based. Each element
operates only on local information.
Furthermore each element operates asynchronously; thus there is no
overall system clock.

According to Zurada (1992):

Artificial neural systems, or neural networks, are physical cellular


systems which can acquire, store and utilize experiential knowledge.
49
Multi disciplinary point of view of Neural
Networks

50
Application Scope of Neural Networks
 Air traffic control
 Appraisal and valuation of property, etc.,
 Betting on horse races, stock markets
 Criminal sentencing
 Complex physical and chemical process
 Data mining, cleaning and validation
 Direct mail advertisers
 Echo patterns
 Employee hiring
 Expert consultants
 Fraud detection
 Handwriting and typewriting
 Machinery controls
 Medical diagnosis
 Music composition
 Photos and finger prints
 Recipes and chemical formulation
 Traffic flows
 Voice prediction
 Weather prediction 51
Advantages of Neural Networks
 A Neural Network can be an “expert” in analyzing the category of
information given to it.
 Answers “ what-if” questions
 Adaptive learning
 Ability to learn how to do tasks based on the data given for training
or initial experience.
 Self organization
 Creates its own organization or representation of information it
receives during learning time.
 Real time operation
 Computations can be carried out in parallel.
 Fault tolerance via redundant information coding
 Partial destruction of a neural network causes degradation of performance.
 In some cases, it can be retained even after major network damage.
52
 Low Energy consumption
Characteristics of ANN:
• It is a neurally implemented mathematical model.
• There exist a large number of processing elements, called neurons, in an ANN.
• Interconnections with weighted linkage hold informative knowledge.
• Input signals arrive at processing elements through connections and
connecting weights.
• Processing elements are able to learn, recall and generalize from the
given data.
• Computational power is determined by the collective behavior of
neurons.
– ANNs are connectionist models, parallel distributed processing models, self-organizing systems, neuro-computing systems and neuromorphic systems.
53
Neural Network Architectures

54
Single Layer Feed-forward Network

55
Multilayer feed-forward network

56
Contd..

57
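A minimal sketch of one forward pass through a two-layer feed-forward network, assuming sigmoid activations and randomly initialized weights (all dimensions and values are hypothetical):

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def forward(x, W1, b1, W2, b2):
    """One forward pass: input layer -> hidden layer -> output layer,
    with no feedback connections (a feed-forward network)."""
    hidden = sigmoid(W1 @ x + b1)       # hidden-layer activations
    return sigmoid(W2 @ hidden + b2)    # output-layer activations

# Hypothetical sizes: 3 inputs, 4 hidden neurons, 2 outputs; random weights.
rng = np.random.default_rng(0)
W1, b1 = rng.normal(size=(4, 3)), np.zeros(4)
W2, b2 = rng.normal(size=(2, 4)), np.zeros(2)
print(forward(np.array([0.2, -0.7, 1.0]), W1, b1, W2, b2))
```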
Feed back network
• If no neuron in the output layer is an input
to a node in the same layer / preceding
layer – feed forward network.
• If outputs are directed back as input to the
processing elements in the same layer/
preceding layer –feedback network.
• If the outputs are directed back to the inputs of the same layer, then it is lateral feedback.
• Recurrent networks are networks with
feedback networks with closed loop.
• Fig 2.8 (A) –simple recurrent neural
network having a single neuron with
feedback to itself.
• Fig 2.9 – single-layer network with feedback: the output can be directed back to the processing element itself, to other processing elements, or to both. 58
Recurrent Network

59
Contd…
• Processing element output can
be directed back to the nodes in
the preceding layer, forming a
multilayer recurrent network.
• Processing element output can
be directed to processing
element itself or to other
processing element in the same
layer.

60
• Simulated annealing (SA) is a probabilistic technique for
approximating the global optimum of a given function
• It is a metaheuristic to approximate global optimization in a
large search space
• It is often used when the search space is discrete (e.g., all tours
that visit a given set of cities)
• For problems where finding an approximate global optimum is
more important than finding a precise local optimum in a fixed
amount of time
• The name and inspiration come from annealing in metallurgy, a
technique involving heating and controlled cooling of a material
to increase the size of its crystals and reduce their defects.

61
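A minimal, generic sketch of simulated annealing for minimizing a function; the cooling schedule, initial temperature, and the toy objective are illustrative assumptions, not taken from the slides.

```python
import math
import random

def simulated_annealing(f, x0, neighbour, T0=10.0, cooling=0.95, n_iters=1000):
    """Generic SA sketch: minimise f starting from x0; `neighbour(x)` proposes
    a random nearby candidate."""
    x, fx = x0, f(x0)
    best, fbest = x, fx
    T = T0
    for _ in range(n_iters):
        y = neighbour(x)
        fy = f(y)
        # Always accept improvements; accept worse moves with probability
        # exp(-delta / T), which shrinks as the temperature is cooled.
        if fy < fx or random.random() < math.exp(-(fy - fx) / T):
            x, fx = y, fy
            if fx < fbest:
                best, fbest = x, fx
        T *= cooling    # controlled "cooling" of the temperature
    return best, fbest

# Toy objective: minimise f(x) = (x - 3)^2 over the reals.
print(simulated_annealing(lambda x: (x - 3.0) ** 2, x0=0.0,
                          neighbour=lambda x: x + random.uniform(-0.5, 0.5)))
```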
McCULLOCH PITTS (M-P) Neuron

72
Architecture

73
Architecture

Examples: Discussed in class

74
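The worked examples were discussed in class; as an illustration only, the sketch below shows a McCulloch-Pitts neuron with binary inputs, fixed weights, and a hard threshold realizing the AND function (the weight and threshold values are the usual textbook choice, assumed here).

```python
def mp_neuron(inputs, weights, threshold):
    """McCulloch-Pitts neuron: binary inputs, fixed weights, hard threshold.
    It fires (outputs 1) only when the weighted sum reaches the threshold."""
    net = sum(w * x for w, x in zip(weights, inputs))
    return 1 if net >= threshold else 0

# AND function: two excitatory inputs with weight 1 each and threshold 2.
for x1 in (0, 1):
    for x2 in (0, 1):
        print(x1, x2, "->", mp_neuron([x1, x2], weights=[1, 1], threshold=2))
```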
Perceptron
Basic model, formulation of learning of weights discussed in class

91
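Since the formulation was covered in class, the following is only a minimal sketch of the perceptron learning (and convergence) rule, trained on the linearly separable AND function; the learning rate, epoch count, and data are hypothetical.

```python
import numpy as np

def train_perceptron(X, t, lr=1.0, epochs=20):
    """Perceptron learning rule sketch: w <- w + lr * (t - y) * x, with the bias
    handled as an extra weight on a constant input of 1."""
    X = np.hstack([X, np.ones((len(X), 1))])    # append the bias input
    w = np.zeros(X.shape[1])
    for _ in range(epochs):
        for x, target in zip(X, t):
            y = 1 if w @ x >= 0 else 0          # hard-limit activation
            w += lr * (target - y) * x          # update only when y != target
    return w

# Hypothetical linearly separable data: the logical AND function.
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
t = np.array([0, 0, 0, 1])
print(train_perceptron(X, t))   # a separating weight vector (convergence rule applies)
```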
