Unit 5

[Figure: structure of a biological neuron vs. an artificial neural network – dendrites ↔ input, synapse ↔ weight, cell body ↔ hidden layer, axon ↔ output; centralized vs. distributed processing]
The human nervous system has two main parts: the central nervous system (CNS), consisting
of the brain and spinal cord, and the peripheral nervous system, consisting of nerves and
ganglia outside the brain and spinal cord. The CNS integrates all information, in the form of
signals, from the different parts of the body. The peripheral nervous system, on the other hand,
connects the CNS with the limbs and organs.
A neuron has three main parts to carry out its primary functionality of receiving and
transmitting information:
1. Dendrites – receive signals from neighbouring neurons.
2. Soma – the main body of the neuron, which accumulates the signals coming from the different
dendrites. It ‘fires’ when a sufficient amount of signal has accumulated.
3. Axon – the last part of the neuron, which receives the signal from the soma once the neuron
‘fires’ and passes it on to the neighbouring neurons through the axon terminals (to the adjacent
dendrites of the neighbouring neurons).
There is a very small gap between the axon terminal of one neuron and the adjacent dendrite
of the neighbouring neuron. This small gap is known as a synapse. The signals transmitted
through a synapse may be excitatory or inhibitory.
There are many components to a neural network architecture. Each neural network has a few
components in common:
Input - Input is the data fed into the model for learning and training purposes.
Weight - Weights order the input variables by importance and scale their contribution to the
output.
Transfer function - The transfer function sums and combines all the weighted inputs into one
output value.
Activation function - The activation function decides whether or not a specific neuron should
be activated, based on whether that neuron’s input is important to the prediction process.
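The four components above can be sketched for a single neuron. The following is a minimal illustration; the input values, weights, bias, and the choice of a sigmoid activation are assumptions for the example, not values from the text:

```python
import math

def neuron_output(inputs, weights, bias):
    # Transfer function: weighted sum of all inputs plus the bias.
    z = sum(x * w for x, w in zip(inputs, weights)) + bias
    # Activation function: a sigmoid squashes the sum into (0, 1),
    # deciding how strongly the neuron "activates".
    return 1.0 / (1.0 + math.exp(-z))

# Example: two inputs, each scaled by a weight reflecting its importance.
y = neuron_output([0.5, 0.9], [0.8, -0.2], bias=0.1)
print(round(y, 3))
```

Here the weighted sum is 0.5·0.8 + 0.9·(−0.2) + 0.1 = 0.32, and the sigmoid maps it to a value between 0 and 1.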
Types of ANN
• Single Layer Feed Forward Network
• Multiple Layer Feed Forward Network
• Recurrent Network
• Competitive Network
NUMBER OF LAYERS
In the case of a single-layer network, a set of neurons in the input layer receives the signals,
i.e. a single feature per neuron, from the data set. The value of each feature is transformed by
the activation function of the input neuron. The signals processed by the neurons in the input
layer are then forwarded to the neurons in the output layer, which use their own activation
functions to generate the final prediction. More complex networks may be designed with
multiple hidden layers between the input layer and the output layer. Most multi-layer networks
are fully connected.
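A fully connected forward pass through such layers can be sketched as follows. The layer sizes, weight values, and sigmoid activation are illustrative assumptions:

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def layer_forward(inputs, weights, biases):
    # Each row of `weights` holds one neuron's incoming weights;
    # fully connected means every input feeds every neuron in the layer.
    return [sigmoid(sum(x * w for x, w in zip(inputs, row)) + b)
            for row, b in zip(weights, biases)]

# 2 input features -> 3 hidden neurons -> 1 output neuron
x = [0.5, 0.9]
hidden = layer_forward(x, [[0.1, 0.4], [0.3, -0.2], [0.5, 0.2]], [0.0, 0.1, -0.1])
output = layer_forward(hidden, [[0.6, -0.1, 0.2]], [0.05])
print(output)
```

The signal flows strictly from the input layer through the hidden layer to the output layer, which is the feed-forward case described in the next section.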
DIRECTION OF SIGNAL FLOW
In certain networks, termed feed-forward networks, the signal always flows in one direction,
i.e. from the input layer towards the output layer, through the hidden layers if there are any.
However, certain networks, such as recurrent networks, also allow signals to travel back from
the output layer towards the input layer.
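One common form of such feedback can be illustrated with a minimal recurrent step, in which the previous hidden state is fed back as an extra input at the next time step. The weights and tanh activation here are illustrative assumptions:

```python
import math

def rnn_step(x, h_prev, w_x, w_h, b):
    # The previous hidden state h_prev is fed back alongside the new input,
    # which is what distinguishes a recurrent network from a feed-forward one.
    return math.tanh(w_x * x + w_h * h_prev + b)

h = 0.0
for x in [0.2, 0.5, -0.1]:   # process a short input sequence
    h = rnn_step(x, h, w_x=0.7, w_h=0.4, b=0.0)
print(round(h, 3))
```

Because the state is carried forward, the final value of `h` depends on the whole sequence, not just the last input.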
NUMBER OF NODES IN LAYERS
In the case of a multi-layer network, the number of nodes in each layer can vary. However, the
number of nodes or neurons in the input layer is equal to the number of features in the input
data set. Similarly, the number of output nodes depends on the possible outcomes, e.g. the
number of classes in the case of supervised learning. The number of nodes in each hidden
layer, on the other hand, is chosen by the user. A larger number of nodes in the hidden layers
can help improve performance. However, too many nodes may result in overfitting as well as
increased computational expense.
WEIGHT OF INTERCONNECTION BETWEEN NEURONS
For solving a learning problem using an ANN, we start with a set of values for the synaptic
weights and keep adjusting those values over multiple iterations. In the case of supervised
learning, the objective is to reduce the number of misclassifications. Ideally, the weight
updates should continue until there are no misclassifications.
To summarize, the learning process using an ANN is a combination of multiple aspects: deciding
the number of hidden layers, the number of nodes in each hidden layer, the direction of signal
flow, and, last but not least, the connection weights.
BACK PROPAGATION
Backpropagation is the essence of neural network training. It is the method of fine-tuning the
weights of a neural network based on the error rate obtained in the previous epoch (i.e.,
iteration). Proper tuning of the weights allows you to reduce error rates and make the model
reliable by increasing its generalization.
Backpropagation is short for “backward propagation of errors.” It is a standard method of
training artificial neural networks that calculates the gradient of a loss function with respect
to all the weights in the network.
How Backpropagation Algorithm Works
The backpropagation algorithm computes the gradient of the loss function for a single weight
by the chain rule. It efficiently computes one layer at a time, unlike a naive direct
computation. It computes the gradient, but it does not define how the gradient is used. It
generalizes the computation in the delta rule.
Consider the following backpropagation example to understand how the algorithm works:
1. Inputs X, arrive through the preconnected path
2. Input is modeled using real weights W. The weights are usually randomly selected.
3. Calculate the output for every neuron from the input layer, to the hidden layers, to the
output layer.
4. Calculate the error in the outputs
Error = Actual Output – Desired Output
5. Travel back from the output layer to the hidden layer to adjust the weights such that the
error is decreased.
6. Keep repeating the process until the desired output is achieved.
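The steps above can be sketched for a single sigmoid neuron trained by gradient descent. The tiny data set, learning rate, and squared-error loss are illustrative assumptions, not part of the original text:

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

# Tiny training set: one input feature, binary target.
data = [(0.0, 0.0), (1.0, 1.0)]
w, b, lr = 0.5, 0.0, 1.0   # step 2: weights start at arbitrary values

for epoch in range(500):
    for x, target in data:
        y = sigmoid(w * x + b)        # step 3: forward pass
        error = y - target            # step 4: actual output - desired output
        # step 5: chain rule gives dE/dw = error * sigmoid'(z) * x
        grad = error * y * (1.0 - y)
        w -= lr * grad * x            # travel back and adjust the weights
        b -= lr * grad

print(round(sigmoid(w * 1.0 + b), 2))  # prediction for x = 1 after training
```

After repeated epochs the predictions move towards the desired outputs: close to 1 for the input 1 and close to 0 for the input 0.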
One main part of the algorithm is adjusting the interconnection weights. This is done using a
technique termed gradient descent. In simple terms, the algorithm calculates the partial
derivative of the cost function with respect to each interconnection weight to identify the
‘gradient’, i.e. the direction and extent of the change in the weight required to minimize the
cost function.
During the learning phase, the interconnection weights are adjusted on the basis of the errors
generated by the network, i.e. difference in the output signal of the network vis-à-vis the expected
value. These errors generated at the output layer are propagated back to the preceding layers. Because
of the backward propagation of errors which happens during the learning phase, these networks are
also called back-propagation networks.
In this network, X0 is the bias input to the hidden layer and Y0 is the bias input to the output layer.
As a part of the gradient descent algorithm, the partial derivative of the cost function E has to
be taken with respect to each of the interconnection weights w′01, w′02, w′03, …. Mathematically,
it can be represented as follows:

    ∂E/∂w′01, ∂E/∂w′02, ∂E/∂w′03, …

The weights and bias for the interconnections between the input and hidden layers are then
updated in the direction opposite to the gradient, i.e. with learning rate α:

    w′ ← w′ − α · ∂E/∂w′
DEEP LEARNING
There are multiple choices of architectures for neural networks, the multi-layer neural network
being one of the most widely adopted. However, in a multi-layer neural network, the computation
becomes very expensive as we keep increasing the number of hidden layers; going beyond two to
three layers becomes quite difficult computationally. Such intense computation is typically
handled using graphics processing unit (GPU) computing. Deep learning is a more contemporary
branding of deep neural networks, i.e. multi-layer neural networks having more than three
layers.