Artificial Neural Network
Genesis of ANN
[Figure: a biological neuron, showing the dendrites, the axon, and the terminal branches of the axon]
The Perceptron
• Binary classifier functions
• Threshold activation function
[Figure: the perceptron. Inputs x1, x2, ..., xn are multiplied by weights w1, w2, ..., wn and fed into a summation unit S; the result is passed along the axon as the output. The diagram mirrors the biological neuron: dendrites, axon, terminal branches of the axon.]
The Perceptron: Threshold Activation Function
• Binary classifier functions
• Threshold activation function
[Figure: step (threshold) activation function]
Linear Activation Functions
• Output is the scaled sum of the inputs:

    y = u = Σ_{n=1}^{N} w_n x_n

[Figure: linear activation function]
Nonlinear Activation Functions
• Sigmoid neuron unit function:

    y_hid(u) = 1 / (1 + e^(-u))

[Figure: sigmoid activation function]
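The three activation functions above can be sketched in Python; a minimal illustration, with function names chosen for this sketch rather than taken from any library:

```python
import math

def step(u, threshold=0.0):
    # Threshold (step) activation: fires 1 once the net input reaches the threshold
    return 1.0 if u >= threshold else 0.0

def linear(u):
    # Linear activation: the output is the weighted sum itself
    return u

def sigmoid(u):
    # Sigmoid activation: y_hid(u) = 1 / (1 + e^(-u))
    return 1.0 / (1.0 + math.exp(-u))

print(step(0.3), linear(0.3), sigmoid(0.0))  # 1.0 0.3 0.5
```

The step function gives the perceptron's binary output; the sigmoid is a smooth, differentiable replacement, which is what later makes backpropagation possible.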
• The ability to learn is a fundamental trait of intelligence.
• Although a precise definition of learning is difficult to formulate, a
learning process in the ANN context can be viewed as the problem
of updating network architecture and connection weights so that a
network can efficiently perform a specific task.
• The network usually must learn the connection weights from
available training patterns.
• Performance is improved over time by iteratively updating the
weights in the network.
• ANNs' ability to automatically learn from examples makes them
attractive and exciting.
• Instead of following a set of rules specified by human experts, ANNs
appear to learn underlying rules (like input-output relationships)
from the given collection of representative examples. This is one of
the major advantages of neural networks over traditional expert
systems.
Learning – what does it mean exactly?
• Learning is essential to most neural network architectures.
• Choice of a learning algorithm is a central issue in network
development.
• What is really meant by saying that a processing element learns?
Learning implies that a processing unit is capable of changing its
input/output behavior as a result of changes in the environment.
Since the activation rule is usually fixed when the network is
constructed and since the input/output vector cannot be changed,
to change the input/output behavior the weights corresponding to
that input vector need to be adjusted. A method is thus needed by
which, at least during a training stage, weights can be modified in
response to the input/output process.
• In a neural network, learning can be supervised, in which the
network is provided with the correct answer for the output during
training, or unsupervised, in which no external teacher is present.
During the learning process…
• At each training step the network computes the direction in
which each bias and link value can be changed to calculate a
more correct output.
• The rate of improvement at that solution state is also known.
A learning rate is user-designated in order to determine how
much the link weights and node biases can be modified based
on the change direction and change rate.
• The higher the learning rate (maximum of 1.0), the faster the
network is trained.
• However, the network then has a greater chance of converging to
a local minimum solution. A local minimum is a point at which
the network stabilizes on a solution that is not the
global optimum.
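A minimal sketch of the weight-update step described above, assuming a plain gradient-descent rule; the function name and the example gradients are illustrative:

```python
def update_weights(weights, gradients, learning_rate=0.5):
    # Each weight moves against its error gradient; the learning rate
    # (at most 1.0, per the slide) scales how large the step is.
    return [w - learning_rate * g for w, g in zip(weights, gradients)]

new_w = update_weights([0.2, -0.4], [0.1, -0.3], learning_rate=0.5)
# new_w is approximately [0.15, -0.25]
```

A larger learning rate takes bigger steps (faster training) but can overshoot and settle into a local minimum, as noted above.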
Learning rules
There are four basic types of learning rules:
• error correction,
• Boltzmann,
• Hebbian,
• and competitive learning.
Parameters affecting prediction quality
• Hidden layers: Both the number of hidden layers and the number of nodes in
each hidden layer can influence the quality of the results. For example, too
few layers and/or nodes may not be adequate for the network to learn
sufficiently, and too many may result in overtraining the network.
• Number of cycles: A cycle is where a training example is presented and the
weights are adjusted.
• The number of examples that get presented to the neural network during the
learning process can be set. The number of cycles should be set to ensure that
the neural network does not overtrain. The number of cycles is often referred
to as the number of epochs.
• Learning rate: Prior to building a neural network, the learning rate should be
set and this influences how fast the neural network learns.
Neural Network topologies
• In the previous section we discussed the properties of the basic processing unit
in an artificial neural network. This section focuses on the pattern of
connections between the units and the propagation of data. As for this pattern
of connections, the main distinction we can make is between:
• Feed-forward neural networks, where the data flow from input to output units
is strictly feedforward. The data processing can extend over multiple (layers of)
units, but no feedback connections are present, that is, connections extending
from outputs of units to inputs of units in the same layer or previous layers.
• Recurrent neural networks that do contain feedback connections. Contrary to
feed-forward networks, the dynamical properties of the network are important.
In some cases, the activation values of the units undergo a relaxation process
such that the neural network will evolve to a stable state in which these
activations do not change anymore. In other applications, the change of the
activation values of the output neurons is itself significant, such that the dynamical
behaviour constitutes the output of the neural network (Pearlmutter, 1990).
• Classical examples of feed-forward neural networks are the Perceptron and
Adaline. Examples of recurrent networks have been presented by Anderson
(Anderson, 1977), Kohonen (Kohonen, 1977), and Hopfield (Hopfield, 1982) .
The human brain:
• Volume: 1400 cm3
• Area: 2000 cm2
• Weight: 1.5 kg
• The cerebral cortex contains about 10^10 neurons
• The number of connections between cells: about 10^15
• The cells send and receive signals; speed of operation about 10^18 operations/sec
Properties of the brain as a computing system:
• Fault-tolerant
• Flexible: easily adapts to a changing environment
• It learns: it does not have to be programmed
• Can deal with fuzzy, random, noisy, or inconsistent information
• Parallel to a high degree
• Small, with very low power consumption
• The neural network is a simplified model of the brain!
Neurons and Synapses
The basic computational unit in the nervous system is the nerve cell, or
neuron. A neuron has:
1. Dendrites (inputs)
2. Cell body
3. Axon (output)
A neuron receives input from other neurons (typically many thousands).
Inputs sum (approximately). Once input exceeds a critical level, the
neuron discharges a spike - an electrical pulse that travels from the body,
down the axon, to the next neuron(s) (or other receptors). This spiking
event is also called depolarization, and is followed by a refractory period,
during which the neuron is unable to fire.
The axon endings (Output Zone) almost touch the dendrites or cell body of the next
neuron. Transmission of an electrical signal from one neuron to the next is effected by
neurotransmitters, chemicals which are released from the first neuron and which bind to
receptors in the second. This link is called a synapse. The extent to which the signal from
one neuron is passed on to the next depends on many factors, e.g. the amount of
neurotransmitter available, the number and arrangement of receptors, the amount of
neurotransmitter reabsorbed, etc.
A Simple Artificial Neuron
Basic computational element (model neuron) is often called a node or unit.
It receives input from some other units, or perhaps from an external
source.
Each input has an associated weight w, which can be modified so as to
model synaptic learning. The unit computes some function f of the
weighted sum of its inputs.
Its output, in turn, can serve as input to other units.
• The weighted sum is called the net input to unit i, often written neti.
Note that wij refers to the weight from unit j to unit i (not the other way
around).
• The function f is the unit's activation function.
• In the simplest case, f is the identity function, and the unit's output is
just its net input. This is called a linear unit.
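The net input and the linear unit described above can be sketched as follows; the function names are chosen for this illustration:

```python
def net_input(weights_i, outputs):
    # net_i = sum over j of w_ij * y_j, where w_ij is the weight
    # FROM unit j TO unit i (note the index order)
    return sum(w_ij * y_j for w_ij, y_j in zip(weights_i, outputs))

def linear_unit(weights_i, outputs):
    # A linear unit uses the identity activation: its output is its net input
    return net_input(weights_i, outputs)

print(linear_unit([0.5, -1.0, 2.0], [1.0, 0.5, 0.25]))  # 0.5
```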
Features of an intelligent system
The ability to learn from examples and to generalize the acquired knowledge to solve
problems posed in a new context:
• Ability to create rules (associations), binding together the separate elements
of the system (object)
• The ability to recognize objects (images features) on the basis of incomplete
information.
Data classification
Data classification is one of the main tasks performed using neural networks.
What is it about?
• NO:
for calculations, multiplication tables, word processing, etc.: applications
where a well-known algorithm can easily be used.
• YES:
where an algorithmic procedure is very difficult to devise, where data are
incomplete or inaccurate, where the phenomena under study are non-linear,
etc., and where there is a lot of data but the method of operation is not
yet known.
Artificial neuron schema:
The inputs receive signals from the network's input layer or from the
previous layer of neurons. Each signal is multiplied by a corresponding
numerical value called a weight. The weight affects how the input
signal is perceived and its share in forming the neuron's output.
A weight can be excitatory (positive) or inhibitory (negative);
if there is no connection between two neurons, the weight is zero. The summed
products of the signals and weights are the argument of the neuron's
activation function.
A simplified model of a neuron, showing its
similarity to the natural model
Formula that describes the working of the neuron:

    y = f(s), where s = Σ_{i=0}^{n} x_i w_i
Approximation function
The principle aim is to approximate a given function (in other words: learn the desired
function by observing examples of its operation).
Number of layers
If Y = 2 and f(-0.087) = 0.478, then the prediction error = (2 - 0.478) = 1.522.
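The numbers on this slide can be checked directly, assuming f is the sigmoid from the earlier slide:

```python
import math

def sigmoid(u):
    # The sigmoid activation from the earlier slide: 1 / (1 + e^(-u))
    return 1.0 / (1.0 + math.exp(-u))

output = sigmoid(-0.087)   # approximately 0.478
error = 2 - output         # desired Y = 2, so error is approximately 1.522
print(round(output, 3), round(error, 3))  # 0.478 1.522
```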
Backpropagation
• One of the most popular techniques in
learning processes for ANN.
Learning process
1. Choose randomly one of the observations.
2. Go through the appropriate procedures to determine the output value.
3. Compare the desired value with the one actually obtained from the network.
Errori = Actuali - Outputi
where:
•Errori is the error from the i-th node,
•Outputi is the value predicted by the network,
•Actuali is the real value (which the network should learn).
Change the weights
• 1 hidden layer: D, E
• Input layer: A, B, C
• Output layer: F
Randomly choosing one observation
etc.…
For better understanding…
the backpropagation learning algorithm can be divided into two phases:
propagation and weight update.
Phase 1: Propagation which involves the following steps:
• Forward propagation of a training pattern's input through the neural network
in order to generate the propagation's output activations.
• Backward propagation of the propagation's output activations through the
neural network using the training pattern's target in order to generate the
deltas of all output and hidden neurons.
Phase 2: Weight update
For each weight-synapse:
• Multiply its output delta and input activation to get the gradient of the weight.
• Bring the weight in the opposite direction of the gradient by subtracting a ratio of
it from the weight.
• This ratio influences the speed and quality of learning; it is called the learning
rate. The sign of the gradient of a weight indicates where the error is increasing,
which is why the weight must be updated in the opposite direction.
• Repeat phases 1 and 2 until the performance of the network is good enough.
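The two phases can be sketched for a tiny one-hidden-layer network with sigmoid units. This is a minimal illustration under simplifying assumptions (no biases, a single output unit); all names are made up for the sketch:

```python
import math

def sigmoid(u):
    return 1.0 / (1.0 + math.exp(-u))

def train_step(x, target, w_hid, w_out, lr=0.5):
    """One backpropagation step. w_hid holds one weight vector per
    hidden unit; w_out holds the output unit's weights."""
    # Phase 1: forward propagation of the training pattern's input
    hidden = [sigmoid(sum(w * xi for w, xi in zip(ws, x))) for ws in w_hid]
    output = sigmoid(sum(w * h for w, h in zip(w_out, hidden)))

    # Phase 1: backward propagation -- deltas for the output and hidden units
    delta_out = (output - target) * output * (1 - output)
    delta_hid = [delta_out * w_out[j] * hidden[j] * (1 - hidden[j])
                 for j in range(len(hidden))]

    # Phase 2: gradient = delta * input activation; each weight moves
    # opposite to its gradient, scaled by the learning rate
    new_w_out = [w - lr * delta_out * h for w, h in zip(w_out, hidden)]
    new_w_hid = [[w - lr * d * xi for w, xi in zip(ws, x)]
                 for ws, d in zip(w_hid, delta_hid)]
    return new_w_hid, new_w_out, output
```

Calling `train_step` repeatedly on a pattern drives the output toward the target, i.e. it repeats phases 1 and 2 until the error is small enough.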
The size of an ANN?
• A big NN has a few thousand neurons, or even
more.
• The number of neurons should depend on
the type of task the network performs.
• The power of the network depends on the
number of neurons, on the density of the
connections between them, and on the
proper choice of the weight values.
How many hidden layers should there be?
• The number of hidden layers is usually not
higher than 2. In the hidden layers the
signals of the network are fused.
• The input layer is usually responsible only for the
initial preparation of the input data.
• The output layer is responsible for aggregating
the final outputs of the hidden-layer neurons
and for presenting the final result of the
network at the outputs of its neurons, which are
at the same time the outputs of the whole network.
Advantages of ANN
1. They can work fine in case of incomplete information.
[Figure: a credit-scoring network. Inputs: income, insurance, age, marital status, employment, …; output: get credit or not?]
As output there will be a decision about granting the credit or not.
Categorical data is a problem… unless…
• continent: {Asia, Europe, America}
3 neurons are necessary:
• Asia: 100
• Europe: 010
• America: 001
• One variable "continent" creates 3 neurons!
• For such cases it would be better to consider merging some values
into a smaller number of categories.
• Usually the number of weights should be 10 times smaller than the
number of cases in the training data set.
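The one-hot coding above can be sketched as (the function name is illustrative):

```python
def one_hot(value, categories):
    # One neuron per category: Asia -> 100, Europe -> 010, America -> 001
    return [1 if value == c else 0 for c in categories]

continents = ["Asia", "Europe", "America"]
print(one_hot("Europe", continents))  # [0, 1, 0]
```

Each category gets its own input neuron, which is why a single three-valued variable already costs three neurons; merging rare categories keeps the number of weights down.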
• The STATISTICA line of software provides a comprehensive
and integrated set of tools and solutions for:
• Data analysis and reporting, data mining and predictive
modeling, business intelligence, simple and multivariate
QC, process monitoring, analytic optimization, simulation,
and for applying a large number of statistical and other
analytic techniques to address routine and advanced data
analysis needs
• Data visualization, graphical data analysis, visual data
mining, visual querying, and simple and advanced scientific
and business graphing; in fact, STATISTICA has been
acknowledged as the "king of data visualization software"
(by the editors of "PC Graphics & Video")
Install Statistica 10 EN
• http://usnet.us.edu.pl/files/statsoft/STATISTIC
A_EN_10_0.zip
Neural networks in Statistica
• Classification analysis (creditRisk.sta)
• Regression analysis (cycling.sta)
Classification for creditRisk.sta
Custom neural network CNN
Increasing neurons from 11 to 20
Automated network search ANS
Regression for cycling.sta
Choosing the variables
5 different neural networks