ANN Unit 2


Unit II

Components of ANN and Network Topologies


Unit II

Components of artificial neural networks

The concept of time in neural networks

Connections

Propagation function

Activation

Threshold value, Activation function

Common activation functions

Output function, Learning strategies


Unit II

• Network Topologies
• Feed forward networks
• Recurrent networks
• Completely linked networks

• Bias Neuron
• Representing Neuron
• Orders of activation
• Communication: in and out of the neural network
Concept of time in neural networks

• The term time refers to the number of cycles of the neural network.
• Time is divided into discrete time steps.
• The current time (present time) is referred to as (t), the next time step as (t + 1), and the preceding one as (t − 1).
• All other time steps are referred to analogously.
• Quantities such as netj or oi therefore refer to a certain point in time; the notation is, for example, netj(t − 1) or oi(t).
Concept of time in brain neurons

• From a biological point of view this is, of course, not very plausible.
• In the human brain a neuron does not wait for another one; however, discrete time steps significantly simplify the implementation.
Components of a neural network

• A technical neural network consists of simple processing units, the neurons, and directed, weighted connections between those neurons.
• Here, the strength of a connection (or the connecting weight) between two neurons i and j is referred to as wi,j.
Definition: Neural Network

• A neural network is a sorted triple (N, V, w) with two sets N, V and a function w, where N is the set of neurons and V is a set {(i, j) | i, j ∈ N} whose elements are called connections between neuron i and neuron j.

• The function w : V → R defines the weights, where w((i, j)), the weight of the connection between neuron i and neuron j, is shortened to wi,j.

• Depending on the point of view, the weight is either undefined or 0 for connections that do not exist in the network.
Definition: Neural Network - Weight matrix

• The weights can be implemented in a square weight matrix W or, optionally, in a weight vector W.
• In the matrix, the row number indicates where the connection begins and the column number indicates which neuron is the target.
• In this representation the numeric 0 marks a non-existing connection.
• This matrix representation is also called a Hinton diagram.
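
As a minimal sketch of this convention (assuming NumPy; the 3-neuron network and its weights are invented for illustration), W[i, j] holds the weight of the connection from neuron i to neuron j, and 0 marks a missing connection:

```python
import numpy as np

# Weight matrix of a hypothetical 3-neuron network.
# Row = neuron where the connection begins, column = target neuron,
# 0 = no connection.
W = np.array([
    [0.0, 0.5, -0.2],   # neuron 0 connects to neurons 1 and 2
    [0.0, 0.0,  0.8],   # neuron 1 connects to neuron 2
    [0.0, 0.0,  0.0],   # neuron 2 has no outgoing connections
])

print(W[0, 2])  # weight w_{0,2} of the connection from neuron 0 to neuron 2
```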
Data processing of a neuron in neural networks

• Connections carry information that is processed by neurons.
• Data are transferred between neurons via connections, with the connecting weight being either excitatory or inhibitory.
Data processing of a neuron in neural networks

[Pipeline: propagation function (weighted sum) → activation function → output function]
Propagation function

• The propagation function converts vector inputs into a scalar network input.
• For a neuron j the propagation function receives the outputs oi1, . . . , oin of other neurons i1, i2, . . . , in (which are connected to j).
• It transforms them, in consideration of the connecting weights wi,j, into the network input netj that can be further processed by the activation function.
Propagation function

• Let I = {i1, i2, . . . , in} be the set of neurons such that ∀z ∈ {1, . . . , n} : ∃wiz,j. Then the network input of j, called netj, is calculated by the propagation function fprop as follows:

netj = fprop(oi1, . . . , oin, wi1,j, . . . , win,j)

• Here the weighted sum is very popular: the output of each neuron i is multiplied by wi,j, and the results are summed:

netj = Σi∈I (oi · wi,j)
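
A minimal sketch of the weighted-sum propagation function (assuming NumPy; the array names outputs and weights are illustrative):

```python
import numpy as np

def propagation(outputs, weights):
    # Weighted sum: net_j = sum over i of o_i * w_{i,j}
    return np.dot(outputs, weights)

o = np.array([1.0, 0.5, -1.0])      # outputs of neurons i1, i2, i3
w_to_j = np.array([0.2, 0.8, 0.4])  # connecting weights w_{i,j}
print(propagation(o, w_to_j))       # 0.2 + 0.4 - 0.4 = 0.2
```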
The activation is the "switching status" of a neuron

• Definition (Activation): Let j be a neuron. The activation state aj, in short activation, is explicitly assigned to j, indicates the extent of the neuron’s activity, and results from the activation function.
• Neurons get activated if the network input exceeds their threshold value.
• Definition (Threshold value): Let j be a neuron. The threshold value Θj is uniquely assigned to j and marks the position of the maximum gradient of the activation function.
Activation function

• Let j be a neuron. The activation function is defined as

aj(t) = fact(netj(t), aj(t − 1), Θj)

• It transforms the network input netj, as well as the previous activation state aj(t − 1), into a new activation state aj(t), with the threshold value Θj playing an important role.
Scope: Activation function and threshold value

• The activation function is often defined globally for all neurons, or at least for a set of neurons, and only the threshold values differ from neuron to neuron.
• The threshold values can be changed by a learning procedure.
• So it can in particular become necessary to relate the threshold value to time and to write, for instance, Θj as Θj(t).
• The activation function is also called the transfer function.
Common activation functions

• The simplest activation function is the binary threshold function, which can only take on two values (also referred to as the Heaviside function).
• If the input is above a certain threshold, the function changes from one value to the other; otherwise it remains constant.
• This implies that the function is not differentiable at the threshold, and its derivative is 0 everywhere else. Due to this fact, backpropagation learning, for example, is impossible with this function.
Common activation functions

• Also very popular is the Fermi function or logistic function:

f(x) = 1 / (1 + e^(−x))

• This function maps the values into the range (0, 1).
• The Fermi function can be expanded by a temperature parameter T into the form

f(x) = 1 / (1 + e^(−x/T))

• The smaller this parameter, the more it compresses the function along the x axis.
• Thus, one can arbitrarily approximate the Heaviside function.
Common activation functions

• The other common activation function is the hyperbolic tangent, which maps into (−1, 1). Both functions are differentiable.
• The hyperbolic tangent y = tanh(x) is defined as

tanh(x) = (e^x − e^(−x)) / (e^x + e^(−x))
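
The three functions above can be sketched in a few lines (a minimal illustration, assuming NumPy; the threshold and temperature values are chosen arbitrarily):

```python
import numpy as np

def heaviside(net, theta=0.0):
    # Binary threshold function: 1 above the threshold, otherwise 0.
    return np.where(net > theta, 1.0, 0.0)

def fermi(net, T=1.0):
    # Logistic (Fermi) function with temperature T; maps into (0, 1).
    return 1.0 / (1.0 + np.exp(-net / T))

x = np.array([-2.0, 0.0, 2.0])
print(heaviside(x))     # [0. 0. 1.]
print(fermi(x))         # [0.119... 0.5 0.880...]
print(fermi(x, T=0.1))  # nearly Heaviside: a small T steepens the curve
print(np.tanh(x))       # [-0.964... 0. 0.964...], maps into (-1, 1)
```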
Output function

• The output function of a neuron j calculates the values which are transferred to the other neurons connected to j.
• Definition (Output function): Let j be a neuron. The output function

fout(aj) = oj

calculates the output value oj of the neuron j from its activation state aj. Generally, the output function is defined globally, too.
• Often this function is the identity, i.e. the activation aj is directly output:

fout(aj) = aj, so oj = aj

• Unless explicitly specified otherwise, the identity is used as the output function.
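
Putting the three components together, one processing step of a single neuron can be sketched as follows (a minimal illustration, assuming NumPy; subtracting Θj from the net input is one common way to apply the threshold, and the identity is used as the output function):

```python
import numpy as np

def neuron_step(outputs, weights, theta, activation):
    net_j = np.dot(outputs, weights)  # propagation: weighted sum
    a_j = activation(net_j - theta)   # activation with threshold theta
    o_j = a_j                         # output function: identity
    return o_j

o_prev = np.array([1.0, 0.0, 1.0])   # outputs of connected neurons
w_to_j = np.array([0.6, -0.3, 0.5])  # connecting weights w_{i,j}
print(neuron_step(o_prev, w_to_j, theta=0.5, activation=np.tanh))
```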
Learning strategy
• Learning strategies adjust a network so that it fits the desired model.
• The learning strategy is an algorithm that can be used to change and thereby train the neural network, so that the network produces a desired output for a given input.
• A learning rule is a method or a mathematical logic. It helps a neural network to learn from the existing conditions and improve its performance. It is an iterative process.
• The following are different learning rules in neural networks:
• Hebbian learning rule – It specifies how to modify the weights of the nodes of a network.
• Perceptron learning rule – The network starts its learning by assigning a random value to each weight.
• Delta learning rule – The modification of the synaptic weight of a node is equal to the multiplication of the error and the input (see the sketch after this list).
• Correlation learning rule – The correlation rule is a supervised learning rule.
• Outstar learning rule – It is used when nodes or neurons in a network are assumed to be arranged in a layer.
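
As an illustration of the delta rule listed above (a minimal sketch; the linear neuron, the learning rate eta, and the training pair are invented for this example), the weight update is Δwi = η · (target − output) · xi:

```python
import numpy as np

def delta_rule_step(w, x, target, eta=0.1):
    # One delta-rule update for a linear neuron with output = w . x:
    # dw_i = eta * (target - output) * x_i
    output = np.dot(w, x)
    error = target - output
    return w + eta * error * x

w = np.array([0.0, 0.0])
x = np.array([1.0, 0.5])
for _ in range(20):                 # repeated updates on one training pair
    w = delta_rule_step(w, x, target=1.0)
print(w, np.dot(w, x))              # the output approaches the target 1.0
```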
Network topologies

• The topology of a neural network refers to its design, i.e. how a neural network model is constructed from its components.
• The following three designs will be discussed in detail,
1. Feedforward network
2. Recurrent network
3. Completely linked networks
Feedforward networks

• Feedforward networks consist of layers and connections directed towards each following layer.
• The neurons are grouped in the following layers:
• One input layer,
• n hidden processing layers (invisible from the outside, that’s why
the neurons are also referred to as hidden neurons) and
• one output layer
• In a feedforward network each neuron in one layer has
only directed connections to the neurons of the next layer
(towards the output layer).
Feedforward networks

• Definition: The neuron layers of a feedforward network are clearly separated: one input layer, one output layer, and one or more processing layers which are invisible from the outside (also called hidden layers).
• In the Hinton diagram of such a network, the highlighted blocks correspond to the connections present in the weight matrix.
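
A minimal sketch of a feedforward pass (assuming NumPy; the layer sizes, the random weights, and the sigmoid activation are illustrative):

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def feedforward(x, layer_weights):
    # Each weight matrix connects one layer to the next
    # (rows = source neurons, columns = target neurons).
    o = x
    for W in layer_weights:
        o = sigmoid(o @ W)
    return o

rng = np.random.default_rng(0)
weights = [rng.normal(size=(3, 4)),  # input layer (3) -> hidden layer (4)
           rng.normal(size=(4, 2))]  # hidden layer (4) -> output layer (2)
print(feedforward(np.array([1.0, 0.5, -0.5]), weights))
```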
Feedforward networks – shortcut connections with skip layers

• Feedforward network with shortcut connections: similar to the feedforward network, but the connections may be directed not only towards the next layer but also towards any other subsequent layer.
• Some feedforward networks permit these so-called shortcut connections.
• These are connections that skip one or more layers.
• These connections may only be directed towards the output layer.
Recurrent networks

• Recurrence is defined as the process of a neuron influencing itself by any means or by any connection.
• Recurrent networks do not always have explicitly defined
input or output neurons.
• There are three types of recurrences possible for a neuron.
They are,
1. Direct recurrences
2. Indirect recurrences
3. Lateral recurrences
Recurrent networks: Direct recurrence

• Some networks allow neurons to be connected to themselves, which is called direct recurrence (or sometimes self-recurrence).
• As a result, neurons inhibit and thereby strengthen themselves in order to reach their activation limits.
• Definition (Direct recurrence): Expand the feedforward network by connecting a neuron j to itself, with the weight of this connection being referred to as wj,j.
• In other words: the diagonal of the weight matrix W may be different from 0.
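
In the weight-matrix representation introduced earlier, direct recurrence simply means a nonzero diagonal (a minimal sketch, assuming NumPy; the weights are invented):

```python
import numpy as np

W = np.zeros((3, 3))
W[0, 1] = 0.5        # ordinary forward connection 0 -> 1
W[1, 1] = 0.9        # direct recurrence: neuron 1 feeds back to itself
print(np.diag(W))    # nonzero entries mark self-connected neurons
```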
Recurrent networks: Indirect recurrence

• Indirect recurrences can influence their starting neuron only by making detours.
• In the network diagram the indirect recurrences are represented by solid lines. As we can see, connections to the preceding layers can exist here, too.
• Definition (Indirect recurrence): This network is based on a feedforward network, now with additional connections between neurons and their preceding layer being allowed. Therefore, the part of W below the diagonal may be different from 0.
Recurrent networks: Lateral recurrence

• Connections between neurons within one layer are called lateral recurrences.
• Here, each neuron often inhibits the other neurons of the layer and strengthens itself.
• As a result only the strongest neuron becomes active (winner-takes-all scheme).
• In the network diagram the lateral recurrences are represented by solid lines; recurrences exist only within the layer.
• In the Hinton diagram, filled squares are concentrated around the diagonal at the height of the feedforward blocks, but the diagonal itself is left uncovered.
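
A minimal winner-takes-all sketch (assuming NumPy; the activation values are invented): lateral inhibition in the limit leaves only the strongest neuron of the layer active:

```python
import numpy as np

def winner_takes_all(activations):
    # Only the strongest neuron of the layer remains active.
    winner = np.argmax(activations)
    result = np.zeros_like(activations)
    result[winner] = activations[winner]
    return result

a = np.array([0.2, 0.9, 0.4])
print(winner_takes_all(a))  # [0.  0.9 0. ]
```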
Completely linked networks

• Completely linked networks allow possibly any connection.
• They permit connections between all neurons, except for direct recurrences. Furthermore, the connections must be symmetric.
• Definition (Complete interconnection): In this case, every neuron is always allowed to be connected to every other neuron – but as a result every neuron can become an input neuron.
• Thus, the matrix W may be unequal to 0 everywhere, except along its diagonal.
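
Such a weight matrix can be sketched as a symmetric matrix with a zero diagonal (a minimal illustration with NumPy; the random weights are arbitrary):

```python
import numpy as np

rng = np.random.default_rng(1)
A = rng.normal(size=(4, 4))
W = (A + A.T) / 2.0           # enforce symmetry: w_{i,j} == w_{j,i}
np.fill_diagonal(W, 0.0)      # no direct recurrences
print(np.allclose(W, W.T), np.diag(W))  # True [0. 0. 0. 0.]
```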
Bias Neuron

• The bias neuron is a technical trick to treat threshold values as connection weights.
• In many network paradigms neurons have a threshold value that indicates when a neuron becomes active.
• Thus, the threshold value is a parameter of the neuron’s activation function.
• However, it is complicated to access the activation function at runtime in order to train the threshold value.
Bias Neuron

• But threshold values Θj1, . . . , Θjn for neurons j1, j2, . . . , jn can also be realized as the connecting weights of a continuously firing neuron.
• For this purpose an additional bias neuron, whose output value is always 1, is integrated into the network and connected to the neurons j1, j2, . . . , jn.
• Definition: A bias neuron is a neuron whose output value is always 1.
• These new connections get the weights −Θj1, . . . , −Θjn, i.e. they get the negative threshold values.
Bias Neuron

• Let j1, j2, . . . , jn be neurons with threshold values Θj1, . . . , Θjn.
• Insert a bias neuron whose output value is always 1, generate connections between this bias neuron and the neurons j1, j2, . . . , jn, and weight these connections wBIAS,j1, . . . , wBIAS,jn with −Θj1, . . . , −Θjn.
• Then set the thresholds to zero, Θj1 = . . . = Θjn = 0, to receive an equivalent neural network whose threshold values are realized by connection weights.
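
The equivalence can be checked numerically (a minimal sketch with NumPy; the weights, the threshold, and the tanh activation are invented): applying the threshold inside the activation gives the same result as a bias neuron of output 1 with weight −Θ:

```python
import numpy as np

x = np.array([1.0, 0.5])      # outputs of the connected neurons
w = np.array([0.6, -0.4])     # connecting weights to neuron j
theta = 0.3                   # threshold value of neuron j

# Variant 1: threshold applied inside the activation function.
a1 = np.tanh(np.dot(x, w) - theta)

# Variant 2: bias neuron with constant output 1 and weight -theta,
# threshold set to zero.
x_b = np.append(x, 1.0)       # the bias neuron's output is always 1
w_b = np.append(w, -theta)    # its connection weight is -theta
a2 = np.tanh(np.dot(x_b, w_b))

print(np.isclose(a1, a2))     # True: the two networks are equivalent
```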
Significance of bias neuron

In effect, a bias value allows you to shift the activation function to the left or right, which may be critical for successful learning.
It might help to look at a simple example: consider a 1-input, 1-output network that has no bias.
The output of the network is computed by multiplying the input (x) by the weight (w0) and passing the result through some kind of activation function (e.g. a sigmoid function).
Significance of bias neuron

• Plotting the function this network computes for various values of w0 shows that changing the weight w0 essentially changes the "steepness" of the sigmoid.
• That's useful, but what if you wanted the network to output 0 for a particular input x? Merely changing the steepness will not work: the whole curve must be shifted.
• For more details visit: https://www.geeksforgeeks.org/effect-of-bias-in-neural-network/
Significance of bias neuron

• If we add a bias to that network, the output of the network becomes sig(w0*x + w1*1.0).
• Plotting this output for various values of w1 shows that the curve keeps its shape but is shifted horizontally: the bias weight w1 moves the sigmoid to the left or right.
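
A minimal numeric sketch of this effect (assuming NumPy; the chosen weights are arbitrary): varying the bias weight w1 shifts the output of the sigmoid at a fixed input x:

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

x = 2.0
w0 = 1.0
for w1 in (-4.0, 0.0, 4.0):   # bias weight times the constant input 1.0
    print(w1, sigmoid(w0 * x + w1 * 1.0))
# w1 = -4 pushes the output near 0 at x = 2; w1 = +4 pushes it near 1.
```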
Representing neurons
Order of activation

• For a neural network it is very important in which order the individual neurons receive and process the input and output the results.
• Based on the order of activation of neurons, two categories of neural network models exist:
• 1. Synchronous activation
• 2. Asynchronous activation
Order of activation – Synchronous activation

• All neurons change their values synchronously, i.e. they simultaneously calculate network inputs, activations and outputs, and pass them on.
• Synchronous activation corresponds most closely to its biological counterpart, but – if it is to be implemented in hardware – it is only useful on certain parallel computers and especially not for feedforward networks.
• This order of activation is the most generic and can be used with networks of arbitrary topology.
Order of activation - Synchronous activation

• Definition: All neurons of a network calculate their network inputs at the same time by means of the propagation function, their activations by means of the activation function, and their outputs by means of the output function.
• After that, the activation cycle is complete.
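
A minimal sketch of one synchronous cycle (assuming NumPy; the recurrent weight matrix and the tanh activation are invented): every neuron computes its network input from the old outputs before any neuron is updated:

```python
import numpy as np

def synchronous_cycle(o, W):
    # All neurons read the OLD output vector o, then update at once.
    net = o @ W              # propagation for all neurons simultaneously
    return np.tanh(net)      # activation; identity output function

rng = np.random.default_rng(2)
W = rng.normal(size=(4, 4))  # arbitrary recurrent weight matrix
o = np.array([1.0, 0.0, 0.0, 0.0])
for _ in range(3):
    o = synchronous_cycle(o, W)
print(o)
```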
Order of activation - Asynchronous activation

• Here, the neurons do not change their values simultaneously but at different points in time.
• For this, different orders exist:
• 1. Random order
• 2. Random permutation
• 3. Topological order
• 4. Fixed order
Order of activation - Asynchronous activation

Random order:
• With random order of activation a neuron i is randomly chosen and its neti, ai and oi are updated.
• For n neurons a cycle is the n-fold execution of this step.
• Obviously, some neurons are repeatedly updated during one cycle while others are not updated at all.
• Hence, this order of activation is not always useful.
Order of activation - Asynchronous activation
Random permutation:
• With random permutation each neuron is chosen exactly once, but in random order, during one cycle.
• Definition: Initially, a permutation of the neurons is calculated randomly, which defines the order of activation. The neurons are then successively processed in this order.
• This order of activation is likewise rarely used because
• firstly, the order is generally useless and,
• secondly, it is very time-consuming to compute a new permutation for every cycle.
• A Hopfield network is a topology nominally having a random or a randomly permuted order of activation.
• But note that in practice, for the previously mentioned reasons, a fixed order of activation is preferred.
Order of activation - Asynchronous activation

Topological order:
• With topological order of activation the neurons are updated during one cycle according to a fixed order, which is defined by the network topology.
• This procedure can only be considered for non-cyclic, i.e. non-recurrent, networks, since otherwise there is no unique order of activation. Thus, in feedforward networks (for which the procedure is very reasonable) the input neurons are updated first, then the inner neurons, and finally the output neurons.
Order of activation - Asynchronous activation

Topological order (continued):
• This type of activation order can save a lot of time.
• Given a synchronous activation order, a feedforward network with n layers of neurons would need n full propagation cycles for the input data to influence the output of the network.
• Given the topological activation order, a single propagation suffices (see the sketch below).
• However, not every network topology allows for finding a special activation order that enables saving time.
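
A minimal sketch of this difference (assuming NumPy; the three-neuron chain, unit weights, and identity activation are invented for the example):

```python
import numpy as np

# Chain of three neuron layers: neuron 0 -> neuron 1 -> neuron 2,
# unit weights, identity activation, external input 1 at neuron 0.
W = np.array([[0., 1., 0.],
              [0., 0., 1.],
              [0., 0., 0.]])
x = np.array([1., 0., 0.])

# Synchronous activation: every neuron reads the OLD outputs, so the
# input needs one full cycle per layer to reach the output neuron.
o = np.zeros(3)
for cycle in range(3):
    o = x + o @ W                 # all neurons update at once
    print("after cycle", cycle + 1, o)

# Topological activation: neurons are updated in topology order,
# so a single pass propagates the input all the way to the output.
o = x.copy()
for j in (1, 2):                  # hidden neuron first, then output neuron
    o[j] = o @ W[:, j]            # predecessors are already updated
print("one topological pass:", o)
```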
Order of activation - Asynchronous activation

Fixed orders of activation:
• When implementing, for instance, feedforward networks, it is very popular to determine the order of activation once according to the topology and to use this order at runtime without further verification. However, this is not necessarily useful for networks that are capable of changing their topology.
Communication with the outside world: input and output of data in and from neural networks

• The input and output components for n input or m output neurons are collected in the vectors x = (x1, x2, . . . , xn) and y = (y1, y2, . . . , ym).
• A network with n input neurons needs n inputs x1, x2, . . . , xn. They are considered as the input vector x = (x1, x2, . . . , xn). As a consequence, the input dimension is referred to as n. Data is put into a neural network by using the components of the input vector as the network inputs of the input neurons.
• A network with m output neurons provides m outputs y1, y2, . . . , ym. They are regarded as the output vector y = (y1, y2, . . . , ym). Thus, the output dimension is referred to as m. Data is output by a neural network by the output neurons adopting the components of the output vector as their output values.
