Unit 2
Components of artificial neural networks - The concept of time in neural networks - Connections -
Propagation function - Activation - Threshold value - Activation function - Common activation functions -
Output function - Learning strategies - Network topologies - Feedforward networks - Recurrent networks -
Completely linked networks - Bias neuron - Representing neurons - Orders of activation - Synchronous
activation - Asynchronous activation - Input and output of data.
Concept of Time in Neural Networks
In technical neural networks, time is divided into discrete time steps: the current time is referred to as (t), the following time step as (t + 1) and the preceding one as (t − 1). All time-dependent quantities, such as the activation aj(t) or the network input netj(t), are indexed by these discrete steps.
Components of Neural Network
• A technical neural network consists of simple processing units, the neurons,
and directed, weighted connections between those neurons.
• The strength of the connection between two neurons i and j is given by its
weight, referred to as wi,j .
• A neural network is a sorted triple (N, V, w) with two sets N, V and a function
w, where
• N is the set of neurons and
• V is a set {(i, j) | i, j ∈ N} whose elements are called connections between neuron i
and neuron j.
• The function w : V → R defines the weights, where w((i, j)), the weight of the
connection between neuron i and neuron j, is shortened to wi,j .
Definition: Neural Network - Weight matrix
The weights of a neural network can be collected in a square weight matrix W, where the entry in row i and column j gives the weight wi,j of the connection from neuron i to neuron j. A value of 0 indicates that the connection does not exist.
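As a minimal sketch (assuming NumPy; the three-neuron network and its weights are made up for illustration), the weight matrix can be stored directly as a square array, with a zero entry marking a non-existing connection:

```python
import numpy as np

# Illustrative 3-neuron network: row i, column j holds w_i,j,
# the weight of the connection from neuron i to neuron j.
# A zero entry means the connection does not exist.
W = np.array([
    [0.0, 0.5, -0.2],   # connections leaving neuron 1
    [0.0, 0.0,  0.8],   # connections leaving neuron 2
    [0.0, 0.0,  0.0],   # neuron 3 has no outgoing connections
])

# The weight of the connection from neuron 1 to neuron 2
# (0-based array indices for neurons 1 and 2):
print(W[0, 1])  # -> 0.5
```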
Data Processing of a Neuron in Neural Networks
A neuron processes data in three stages: the propagation function collects the outputs of the connected neurons into the network input, the activation function transforms the network input (and the previous activation) into the neuron's new activation, and the output function derives from the activation the value that is passed on to other neurons.
Propagation Function
• Let I = {i1, i2, . . . , in} be the set of neurons such that ∀z ∈ {1, . . . , n} : ∃wiz,j ,
i.e. the neurons that have a connection to neuron j.
• Then the network input of j, called netj , is calculated by the propagation
function fprop as
netj = fprop(oi1 , . . . , oin , wi1,j , . . . , win,j ).
• Here the weighted sum is very popular: the output of each connected neuron i
is multiplied by wi,j and the results are summed,
netj = Σ i∈I (oi · wi,j ).
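As a concrete sketch of the weighted sum (assuming NumPy; the neuron outputs and weights are illustrative values), the propagation function reduces to a dot product:

```python
import numpy as np

# Outputs o_i of the neurons i1..i3 feeding into neuron j (illustrative values)
outputs = np.array([0.9, 0.1, 0.4])
# Weights w_i,j of the connections into neuron j
weights_into_j = np.array([0.5, -0.3, 0.8])

# Weighted-sum propagation function: net_j = sum over i of o_i * w_i,j
net_j = np.dot(outputs, weights_into_j)
print(net_j)  # 0.9*0.5 + 0.1*(-0.3) + 0.4*0.8 = 0.74
```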
Activation and Threshold Value
The activation is the "switching status" of a neuron.
Definition (Activation): Let j be a neuron. The activation state aj , in short
activation, is uniquely assigned to j; it indicates the extent of the neuron's
activity and results from the activation function.
Neurons get activated if the network input exceeds their threshold value.
Definition (Threshold value): Let j be a neuron. The threshold value Θj is
uniquely assigned to j and marks the position of the maximum gradient of the
activation function.
Activation function
• Let j be a neuron. The activation function is defined as
aj (t) = fact(netj (t), aj (t − 1), Θj ).
• It transforms the network input netj , as well as the previous activation state
aj (t − 1), into a new activation state aj (t), with the threshold value Θj playing
an important role.
Scope: Activation Function and Threshold value
The activation function is often defined globally for all neurons or at least for a set of
neurons and only the threshold values are different for each neuron.
The threshold values can be changed by means of a learning procedure.
It can therefore become necessary to relate the threshold value to time and to
write, for instance, Θj as Θj(t).
The activation function is also called the transfer function.
Common activation functions
•The simplest activation function is the binary
threshold function, which can only take on two
values (also referred to as Heaviside function).
•If the input is above a certain threshold, the function
changes from one value to another, but otherwise
remains constant.
• This implies that the function is not differentiable at the threshold, and the
derivative is 0 everywhere else. Due to this fact, backpropagation learning,
for example, is impossible.
Common activation functions
• Also very popular is the Fermi function or logistic function,
f(x) = 1 / (1 + e^(−x)).
• It maps values into the range (0, 1).
• The Fermi function can be expanded by a temperature parameter T into the form
f(x) = 1 / (1 + e^(−x/T)).
Common activation functions
The other common activation function is the hyperbolic tangent, which maps
into (−1, 1). Both the Fermi function and the hyperbolic tangent are
differentiable.
The hyperbolic tangent y = tanh(x) is defined as
tanh(x) = (e^x − e^(−x)) / (e^x + e^(−x)).
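Below is a minimal sketch of the three activation functions discussed (assuming NumPy; the function names and the sample inputs are my own choices, and the temperature T defaults to 1):

```python
import numpy as np

def heaviside(x, theta=0.0):
    """Binary threshold function: jumps from 0 to 1 at the threshold theta."""
    return np.where(x >= theta, 1.0, 0.0)

def fermi(x, T=1.0):
    """Fermi (logistic) function, range (0, 1); T is the temperature parameter."""
    return 1.0 / (1.0 + np.exp(-x / T))

def tanh_act(x):
    """Hyperbolic tangent, range (-1, 1)."""
    return np.tanh(x)

x = np.array([-2.0, 0.0, 2.0])
print(heaviside(x))  # [0. 1. 1.]
print(fermi(x))      # approx. [0.119 0.5 0.881]
print(tanh_act(x))   # approx. [-0.964 0. 0.964]
```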
Output function
The output function of a neuron j calculates the values which are transferred to the other
neurons connected to j.
Definition (Output function): Let j be a neuron. The output function
fout(aj ) = oj
calculates the output value oj of the neuron j from its activation state aj . Generally,
the output function is defined globally, too.
Often this function is the identity, i.e. the activation aj is directly output:
fout(aj ) = aj , so oj = aj
Unless explicitly specified differently, the identity is used as output function.
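Putting the pieces together, here is a minimal sketch of one neuron's data processing under the definitions above: propagation (weighted sum), activation (Fermi function with threshold) and the identity output function. The class name and all values are illustrative, and the dependence of the activation on its previous state aj(t − 1) is omitted for simplicity:

```python
import numpy as np

def fermi(x, T=1.0):
    # Logistic activation, range (0, 1)
    return 1.0 / (1.0 + np.exp(-x / T))

class Neuron:
    """Illustrative neuron j with threshold theta and incoming weights w_i,j."""
    def __init__(self, weights, theta):
        self.weights = np.asarray(weights)  # w_i,j for all predecessors i
        self.theta = theta                  # threshold value Θ_j

    def process(self, inputs):
        net = np.dot(inputs, self.weights)  # propagation: net_j = Σ o_i · w_i,j
        a = fermi(net - self.theta)         # activation (previous state ignored)
        return a                            # identity output function: o_j = a_j

j = Neuron(weights=[0.5, -0.3, 0.8], theta=0.2)
print(j.process([0.9, 0.1, 0.4]))  # fermi(0.74 - 0.2) ≈ 0.632
```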
Learning strategy
Learning strategies adjust a network so that it fits the desired model.
The learning strategy is an algorithm that can be used to change and thereby train the
neural network, so that the network produces a desired output for a given input.
A learning rule is a method or a mathematical logic that helps a neural network learn
from existing conditions and improve its performance. Learning is an iterative process.
The following are different learning rules in the Neural network:
• Hebbian learning rule – It specifies how to modify the weights between the nodes of a network (a
minimal sketch follows this list).
• Perceptron learning rule – The network starts its learning by assigning a random value to each weight.
• Delta learning rule – The modification of a node's synaptic weight is equal to the product of the error
and the input.
• Correlation learning rule – The correlation rule is a supervised learning rule.
• Outstar learning rule – It is used when the nodes or neurons of a network are assumed to be arranged
in a layer.
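As a concrete instance of one rule from this list, here is a minimal sketch of a Hebbian update (assuming NumPy; the learning rate η and all values are illustrative): the weight change is proportional to the product of the presynaptic output and the postsynaptic activation.

```python
import numpy as np

def hebbian_update(w, pre_output, post_activation, eta=0.1):
    """Hebbian rule (sketch): delta w_i,j = eta * o_i * a_j."""
    return w + eta * pre_output * post_activation

w = np.array([0.5, -0.3, 0.8])    # weights into neuron j
o = np.array([0.9, 0.1, 0.4])     # outputs of the presynaptic neurons
a_j = 0.63                        # activation of neuron j
print(hebbian_update(w, o, a_j))  # weights grow where pre and post are both active
```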
Network topologies
The topology of a neural network refers to its design, i.e. the way a network model is
constructed from its components.
The following three designs will be discussed in detail:
1. Feedforward network
2. Recurrent network
3. Completely linked networks
Feedforward networks
A feedforward network consists of layers, with connections directed towards each following layer.
The neurons are grouped in the following layers:
◦ One input layer,
◦ n hidden processing layers (invisible from the outside, that’s why the neurons are also
referred to as hidden neurons) and
◦ one output layer
In a feedforward network each neuron in one layer has only directed connections to
the neurons of the next layer (towards the output layer).
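A minimal sketch of one forward pass through such a network (assuming NumPy; the layer sizes, random weights and logistic activation are illustrative assumptions, not part of the definition):

```python
import numpy as np

def fermi(x):
    return 1.0 / (1.0 + np.exp(-x))

rng = np.random.default_rng(0)
# Illustrative topology: 3 input neurons, one hidden layer of 4, 2 output neurons
W_in_hidden = rng.normal(size=(3, 4))   # connections input layer -> hidden layer
W_hidden_out = rng.normal(size=(4, 2))  # connections hidden layer -> output layer

x = np.array([0.9, 0.1, 0.4])           # input vector
hidden = fermi(x @ W_in_hidden)         # activations of the hidden layer
y = fermi(hidden @ W_hidden_out)        # output vector
print(y)
```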
Feedforward networks
Definition: The neuron layers of a feedforward network
are clearly separated: One input layer, one output layer
and one or more processing layers which are invisible
from the outside (also called hidden layers).
The Hinton diagram of such a feedforward network shows this structure: the
highlighted blocks correspond to the connections present in the weight matrix.
Feedforward networks – shortcut connections skipping layers
Some feedforward networks permit so-called shortcut connections: connections
that skip one or more layers.
Definition (Feedforward network with shortcut connections): Similar to the
feedforward network, but the connections may not only be directed towards the
next layer but also towards any other subsequent layer.
These connections, too, may only be directed towards the output layer.
Recurrent networks
Recurrence is defined as the process of a neuron influencing itself by any means or by
any connection.
Recurrent networks do not always have explicitly defined input or output neurons.
There are three types of recurrence possible for a neuron:
1. Direct recurrences
2. Indirect recurrences
3. Lateral recurrences
Recurrent networks: Direct recurrence
Some networks allow for neurons to be connected to themselves, which is called
direct recurrence (or sometimes self-recurrence).
As a result, neurons inhibit and thereby strengthen themselves in order to reach
their activation limits.
Definition (Direct recurrence): Expand the feedforward
network by connecting a neuron j to itself, with the weights of
these connections being referred to as wj,j .
In other words: the diagonal of the weight matrix W may be
different from 0.
Recurrent networks: Indirect recurrence
Indirect recurrences can influence their starting neuron only
by making detours.
The indirect recurrences are represented by solid lines. As
we can see, connections to the preceding layers can exist
here, too.
Definition (Indirect recurrence): This network is based on a feedforward
network, but with additional connections between neurons and their preceding
layers allowed.
Therefore, entries below the diagonal of W may be different from 0.
Recurrent networks: Lateral recurrence
Connections between neurons within one layer are called
lateral recurrences.
Here, each neuron often inhibits the other neurons of the layer
and strengthens itself.
As a result only the strongest neuron becomes active (winner-takes-all scheme).
In the figure, the lateral recurrences are represented by solid lines.
Here, recurrences only exist within the layer.
In the Hinton diagram, filled squares are concentrated around
the diagonal in the height of the feedforward blocks, but the
diagonal is left uncovered.
Completely linked networks
Completely linked networks allow, in principle, any connection.
Completely linked networks permit connections between all
neurons, except for direct recurrences. Furthermore, the
connections must be symmetric.
Definition (Complete interconnection): In this case, every neuron
is always allowed to be connected to every other neuron – but as
a result every neuron can become an input neuron.
Thus, the matrix W may be unequal to 0 everywhere, except
along its diagonal.
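Each topology above constrains which entries of the weight matrix W may be non-zero. The following sketch (assuming NumPy; the six-neuron, three-layer numbering is illustrative) expresses those constraints as boolean masks:

```python
import numpy as np

n = 6
layer = np.array([0, 0, 1, 1, 2, 2])  # illustrative layer of each neuron
i, j = np.meshgrid(np.arange(n), np.arange(n), indexing="ij")

# Entries W[i, j] that may be non-zero under each topology:
feedforward  = layer[j] == layer[i] + 1            # only towards the next layer
direct_rec   = i == j                              # diagonal: neuron to itself
lateral_rec  = (layer[i] == layer[j]) & (i != j)   # within one layer
indirect_rec = layer[j] < layer[i]                 # back towards preceding layers
complete     = i != j                              # everything except direct recurrences

print(feedforward.astype(int))  # 1 marks an entry of W that may be non-zero
```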
Bias Neuron
The bias neuron is a technical trick to consider threshold values as connection weights.
In many network paradigms neurons have a threshold value that indicates when a
neuron becomes active.
Thus, the threshold value is an activation function parameter of a neuron.
It is complicated to access the activation function at runtime in order to train the
threshold value.
Bias Neuron
But threshold values Θj1 , . . . , Θjn for neurons j1, j2, . . . , jn can also be realized as
the connection weights of a continuously firing neuron.
For this purpose an additional bias neuron whose output value is always 1 is integrated
in the network and connected to the neurons j1, j2, . . . , jn.
Definition: A bias neuron is a neuron whose output value is always 1.
These new connections get the weights −Θj1 , . . . , −Θjn , i.e. they get the negative
threshold values.
Bias Neuron
Let j1, j2, . . . , jn be neurons with threshold values Θj1 , . . . , Θjn .
We insert a bias neuron whose output value is always 1, generate connections
between this bias neuron and the neurons j1, j2, . . . , jn, and weight these
connections wBIAS,j1 , . . . , wBIAS,jn with −Θj1 , . . . , −Θjn .
Then we set the thresholds to zero, Θj1 = . . . = Θjn = 0, and receive an equivalent
neural network whose threshold values are realized by connection weights.
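A quick numeric sketch of this equivalence (assuming NumPy; the weights, outputs and the Fermi activation are illustrative): subtracting the threshold Θj inside the activation function gives exactly the same result as a bias neuron with constant output 1 and connection weight −Θj.

```python
import numpy as np

def fermi(x):
    return 1.0 / (1.0 + np.exp(-x))

w = np.array([0.5, -0.3])   # incoming weights of neuron j
o = np.array([0.9, 0.1])    # outputs of the predecessor neurons
theta = 0.2                 # threshold value of neuron j

# Original network: threshold inside the activation function
a_with_threshold = fermi(np.dot(o, w) - theta)

# Equivalent network: bias neuron with output 1, weight w_BIAS,j = -theta, threshold 0
w_bias = -theta
a_with_bias = fermi(np.dot(o, w) + w_bias * 1.0)

print(a_with_threshold == a_with_bias)  # True: both compute fermi(0.42 - 0.2)
```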
Significance of bias neuron
In effect, a bias value allows you to shift the activation function to the left or
right, which may be critical for successful learning.
It might help to look at a simple example: consider a 1-input, 1-output network
that has no bias.
The output of the network is computed by multiplying the input (x) by the weight
(w0) and passing the result through some kind of activation function (e.g. a
sigmoid function), so the output is sig(w0·x).
Significance of bias neuron
For various values of w0, the network computes sig(w0·x): changing w0 only
changes the steepness of the sigmoid, which always passes through 0.5 at x = 0,
so the curve cannot be shifted left or right.
Significance of bias neuron
If we add a bias neuron with constant output 1.0 to that network, connected by
weight w1, the output of the network becomes sig(w0*x + w1*1.0).
Varying w1 now shifts the whole curve to the left or right, which is exactly the
effect of a threshold value.
Representing neurons
Order of activation
For a neural network it is very important in which order the individual neurons receive
and process the input and output the results.
Based on the order of activation of the neurons, two categories of neural network
models exist:
◦ 1. Synchronous activation
◦ 2. Asynchronous activation
Order of activation – Synchronous activation
All neurons change their values synchronously, i.e. they simultaneously calculate
network inputs, activation and output, and pass them on.
Synchronous activation corresponds closest to its biological counterpart, but – if it is
to be implemented in hardware – it is only useful on certain parallel computers, and
especially not for feedforward networks.
This order of activation is the most generic and can be used with networks of arbitrary
topology.
Order of activation - Synchronous activation
Definition: All neurons of a network calculate network inputs at the same time by
means of the propagation function, activation by means of the activation function and
output by means of the output function.
After that the activation cycle is complete.
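A minimal sketch of one synchronous cycle (assuming NumPy; the tiny two-neuron recurrent weight matrix is illustrative): every neuron computes its new activation from the old state, and the whole state is replaced at once.

```python
import numpy as np

def fermi(x):
    return 1.0 / (1.0 + np.exp(-x))

W = np.array([[0.0, 0.6],      # illustrative recurrent 2-neuron network
              [-0.4, 0.0]])
state = np.array([0.2, 0.8])   # activations at time t-1

# Synchronous cycle: every neuron uses the *old* state of all others,
# then all activations are updated simultaneously.
net = state @ W                # net_j = sum over i of state[i] * w_i,j
state = fermi(net)
print(state)
```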
Order of activation - Asynchronous activation
Here, the neurons do not change their values simultaneously but at different points of
time.
For this, several different orders exist:
◦ 1. Random order
◦ 2. Random permutation
◦ 3. Topological order
◦ 4. Fixed order
Order of activation - Asynchronous activation
Random order:
With random order of activation a neuron “i” is randomly chosen and its neti , ai and oi
are updated.
For n neurons a cycle is the n-fold execution of this step.
Obviously, some neurons may be updated repeatedly during one cycle while others
are not updated at all.
Hence, this order of activation is not always useful.
Order of activation - Asynchronous activation
Random permutation:
Definition: Initially, a permutation of the neurons is calculated randomly and therefore
defines the order of activation. Then the neurons are successively processed in this
order.
This order of activation is rarely used as well, because
◦ firstly, the order is generally useless and,
◦ secondly, it is very time-consuming to compute a new permutation for every cycle.
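For contrast with the synchronous sketch above, here is one asynchronous cycle in random-permutation order (same illustrative two-neuron network): each neuron is visited exactly once, but its update takes effect immediately, so neurons later in the permutation already see the new values.

```python
import numpy as np

def fermi(x):
    return 1.0 / (1.0 + np.exp(-x))

rng = np.random.default_rng(0)
W = np.array([[0.0, 0.6],
              [-0.4, 0.0]])
state = np.array([0.2, 0.8])

# One asynchronous cycle in random-permutation order: each neuron is
# visited exactly once, and each update is applied in place.
for j in rng.permutation(len(state)):
    state[j] = fermi(np.dot(state, W[:, j]))
print(state)
```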
Order of activation - Asynchronous activation
Topological order:
With topological order of activation the neurons are updated during one cycle and
according to a fixed order. The order is defined by the network topology.
This procedure can only be considered for non-cyclic, i.e. non-recurrent, networks, since
otherwise there is no order of activation. Thus, in feedforward networks (for which the
procedure is very reasonable) the input neurons would be updated first, then the inner
neurons and finally the output neurons.
Order of activation - Asynchronous activation
Topological order:
This type of activation order may save us a lot of time.
◦ Given a synchronous activation order, a feedforward network with ‘n’ layers of neurons
would need ‘n’ full propagation cycles in order to enable input data to have influence on the
output of the network.
◦ Given the topological activation order, only a single propagation cycle is needed.
◦ However, not every network topology allows for finding a special activation order that
saves time.
Order of activation - Asynchronous activation
Fixed orders of activation:
When implementing, for instance, feedforward networks, it is very popular to determine
the order of activation once according to the topology and to use this order at runtime
without further verification. However, this is not necessarily useful for networks that are
capable of changing their topology.
Communication with the outside world: input and
output of data into and from neural networks
The input and output of a network are handled componentwise, for n input or m output
neurons, within the vectors x = (x1, x2, . . . , xn) and y = (y1, y2, . . . , ym).
A network with n input neurons needs n inputs x1, x2, . . . , xn. They are considered as
input vector x = (x1, x2, . . . , xn). As a consequence, the input dimension is referred to as
n. Data is put into a neural network by using the components of the input vector as
network inputs of the input neurons.
A network with m output neurons provides m outputs y1, y2, . . . , ym. They are
regarded as the output vector y = (y1, y2, . . . , ym). Thus, the output dimension is
referred to as m. Data is output by a neural network when the output neurons adopt
the components of the output vector as their output values.
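As a small sketch of this convention (assuming NumPy; the single weight layer and the dimensions are illustrative): the input vector x has one component per input neuron, and the output vector y has one component per output neuron.

```python
import numpy as np

def fermi(x):
    return 1.0 / (1.0 + np.exp(-x))

rng = np.random.default_rng(1)
n, m = 4, 2                         # input dimension n, output dimension m
W = rng.normal(size=(n, m))         # illustrative single-layer network

x = np.array([0.1, 0.7, 0.3, 0.9])  # input vector x = (x1, ..., xn)
y = fermi(x @ W)                    # output vector y = (y1, ..., ym)
print(y.shape)                      # (2,) -> one component per output neuron
```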