Artificial Neural Networks - Lecture 2
Simple model of a neuron
[Figure: a neuron O receiving inputs y1, y2, y3, …, yi through weights w1j, w2j, w3j, …, wij]
• Each neuron has a threshold value.
• Each neuron has weighted inputs from other neurons.
• The threshold value is subtracted from the weighted sum of the inputs to give the neuron's activation level.
Cont.
• If the weighted sum exceeds the threshold (i.e., the activation level is positive), the neuron “fires”.
Otherwise, it does not fire and its output stays at the inactive value.
• Note: For simplicity, bias may not appear in the following equations
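To make the model concrete, here is a minimal Python sketch; the numeric values and the choice of −1 for the inactive output are assumptions for illustration, not taken from the slides.

    import numpy as np

    def neuron_output(y, w, threshold):
        # Activation level: weighted sum of the inputs minus the threshold value.
        activation = np.dot(w, y) - threshold
        # The neuron "fires" (+1) if the weighted sum exceeds the threshold, else -1.
        return 1 if activation > 0 else -1

    y = np.array([1.0, 0.0, 1.0])   # inputs y1, y2, y3 (assumed values)
    w = np.array([0.4, 0.9, 0.3])   # weights w1j, w2j, w3j (assumed values)
    print(neuron_output(y, w, threshold=0.5))   # 0.4 + 0.3 = 0.7 > 0.5, so the neuron fires (+1)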
Cont.
• The perceptron algorithm uses a smooth approximation of the gradient with respect to
each example, updating the weights as

      w (current) = w (previous) + ∆w
1. Zero initialization: this choice causes all weights to have the same value in
subsequent iterations.
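As an aside, a small sketch of this effect (not from the slides; the tiny 2-2-1 network, sigmoid hidden units, squared-error loss, and the example values are assumptions): with zero initialization, the two hidden units always receive identical updates, so their weights stay identical.

    import numpy as np

    def sigmoid(z):
        return 1.0 / (1.0 + np.exp(-z))

    # Tiny 2-input -> 2-hidden -> 1-output network with every weight set to zero.
    W1 = np.zeros((2, 2))          # input -> hidden weights (one column per hidden unit)
    W2 = np.zeros((2, 1))          # hidden -> output weights

    x = np.array([[1.0, 2.0]])     # one training example (assumed values)
    d = np.array([[1.0]])          # desired output

    for _ in range(10):            # a few gradient-descent steps on squared error
        h = sigmoid(x @ W1)                    # hidden activations
        o = h @ W2                             # linear output
        err = o - d
        grad_W2 = h.T @ err
        grad_W1 = x.T @ ((err @ W2.T) * h * (1 - h))
        W2 -= 0.5 * grad_W2
        W1 -= 0.5 * grad_W1

    print(W1)   # the two columns (hidden units) are still identical: symmetry never breaks
    print(W2)   # both output weights are identical as well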
• The sign function (or bipolar binary function) is used to map to binary outputs
(−1 or +1) at prediction time.
• The sigmoid function (or unipolar continuous function) outputs a real value in the
interval (0, 1), so it can produce probabilistic outputs and supports loss functions
based on maximum-likelihood models.
• The tanh function is similar to the sigmoid function, but it is preferred when the
outputs are desired to be both positive and negative. Furthermore, its larger gradient
makes it easier to train.
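A short Python sketch of these three activation functions (function names are my own; returning +1 at net = 0 for the sign function is an assumption):

    import numpy as np

    def sign(net):
        # Bipolar binary (sign) function: maps to -1 or +1.
        return np.where(net >= 0, 1.0, -1.0)

    def sigmoid(net):
        # Unipolar continuous function: real output in (0, 1).
        return 1.0 / (1.0 + np.exp(-net))

    def tanh(net):
        # Bipolar continuous function: real output in (-1, 1), steeper than the sigmoid.
        return np.tanh(net)

    net = np.array([-2.0, 0.0, 2.0])
    print(sign(net), sigmoid(net), tanh(net))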
Cont.
• Piecewise linear activation functions:
• Both ReLU and hard tanh have largely replaced the sigmoid and soft tanh activation
functions in modern neural networks because they are easier to train.
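A matching sketch of the two piecewise linear activations (again, the names and example values are my own):

    import numpy as np

    def relu(net):
        # ReLU: max(0, net).
        return np.maximum(0.0, net)

    def hard_tanh(net):
        # Hard tanh: clips net to the interval [-1, 1].
        return np.clip(net, -1.0, 1.0)

    net = np.array([-2.0, -0.5, 0.5, 2.0])
    print(relu(net))       # [0.  0.  0.5 2. ]
    print(hard_tanh(net))  # [-1.  -0.5  0.5  1. ]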
Cont.
Learning algorithms
During the learning process, the weights can be updated by different rules. In the rules
below, c is the learning rate, d the desired output, o the actual output, and net = wᵀx
(a code sketch of these rules follows the list).
• Perceptron learning rule:
  ∆wi = c (di − oi) xi
• Hebbian learning rule:
  ∆wi = c oi xi
• Delta learning rule:
  ∆wi = c (di − oi) f ′(net) xi,  with f ′(net) = ½ (1 − o²) for the bipolar continuous activation
• Widrow-Hoff learning rule (the rule is independent of the activation function):
  ∆wi = c (di − net) xi,  since f (net) = net and f ′(net) = 1
• Correlation learning rule (a special case of the Hebbian rule):
  ∆wi = c di xi
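As referenced above, a minimal Python sketch of the five update rules. The function names and example values are assumptions; the delta rule assumes the bipolar continuous activation f(net) = 2 / (1 + e^−net) − 1, which gives f ′(net) = ½ (1 − o²), and the Hebbian sketch uses the sign output.

    import numpy as np

    def bipolar_sigmoid(net):
        # Bipolar continuous activation, output in (-1, 1).
        return 2.0 / (1.0 + np.exp(-net)) - 1.0

    def perceptron_rule(w, x, d, c):
        o = np.sign(w @ x)                     # hard-limiting (sign) output
        return w + c * (d - o) * x             # ∆w = c (d - o) x

    def hebbian_rule(w, x, c):
        o = np.sign(w @ x)                     # bipolar binary output assumed here
        return w + c * o * x                   # ∆w = c o x

    def delta_rule(w, x, d, c):
        o = bipolar_sigmoid(w @ x)
        f_prime = 0.5 * (1.0 - o ** 2)         # f'(net) = 1/2 (1 - o^2)
        return w + c * (d - o) * f_prime * x   # ∆w = c (d - o) f'(net) x

    def widrow_hoff_rule(w, x, d, c):
        net = w @ x                            # f(net) = net, so f'(net) = 1
        return w + c * (d - net) * x           # ∆w = c (d - net) x

    def correlation_rule(w, x, d, c):
        return w + c * d * x                   # ∆w = c d x

    # One update step with assumed values
    w = np.array([1.0, -1.0, 0.5])
    x = np.array([1.0, 2.0, -1.0])
    print(perceptron_rule(w, x, d=1.0, c=0.1))
    print(delta_rule(w, x, d=1.0, c=0.1))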
Example 1 illustrates the perceptron learning rule.
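The worked numbers of Example 1 are not included in this excerpt; as a stand-in, here is a minimal runnable sketch of training a single perceptron with the rule above (the bipolar AND data, learning rate, and initial weights are assumptions):

    import numpy as np

    # Bipolar AND dataset with a constant bias input appended as the last component.
    X = np.array([[-1, -1, 1], [-1, 1, 1], [1, -1, 1], [1, 1, 1]], dtype=float)
    D = np.array([-1, -1, -1, 1], dtype=float)

    w = np.array([0.1, -0.1, 0.0])   # assumed initial weights
    c = 0.5                          # assumed learning rate
    for epoch in range(10):
        errors = 0
        for x, d in zip(X, D):
            o = 1.0 if w @ x >= 0 else -1.0      # sign of the weighted sum
            if o != d:
                w += c * (d - o) * x             # perceptron rule: ∆w = c (d - o) x
                errors += 1
        if errors == 0:                          # stop once every example is classified correctly
            break

    print(w)                                         # a separating weight vector for bipolar AND
    print([1.0 if w @ x >= 0 else -1.0 for x in X])  # reproduces the desired outputs D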