
ADALINE NETWORK

 Proposed by Widrow & Hoff in the 1960s.

 Stands for Adaptive Linear Network.

 Architecturally it is similar to the perceptron network except for the transfer function: Adaline uses a purely linear transfer function, while the perceptron uses a hard-limiting transfer function (see the sketch below).

 Has a large number of applications in signal processing.
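To make the contrast between the two transfer functions concrete, here is a minimal Python sketch. purelin is the name these notes use for the linear transfer function; hardlim is an assumed name for the perceptron's hard limiter.

```python
import numpy as np

def purelin(n):
    """Adaline's transfer function: purely linear, a = n."""
    return n

def hardlim(n):
    """Perceptron's transfer function: hard limiting, a is 0 or 1."""
    return np.where(n >= 0, 1, 0)

n = np.array([-2.0, -0.5, 0.0, 0.5, 2.0])
print(purelin(n))   # [-2.  -0.5  0.   0.5  2. ]
print(hardlim(n))   # [0 0 1 1 1]
```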

[Figure: Adaline network along with the arrangement for training. Inputs weighted by w11, w12, ..., w1r plus a BIAS feed a linear summer to produce the output; the output is compared with the TARGET, and the weights are adjusted either by direct weight adjustment or by an iterative training algorithm.]


DECISION BOUNDARY OF ADALINE NETWORK

Consider a 2-input, 1-output Adaline network.


[Figure: two-input Adaline. The inputs p1 and p2, weighted by w11 and w12, are summed together with the bias b to give n.]

n = w11p1 + w12p2 + b

a = purelin(n) = n

a = w11p1 + w12p2 + b

Limiting case: n = 0

w12p2 = -w11p1 - b

p2 = -(w11/w12)p1 - b/w12

This line is called the decision boundary.


a = 0 along the decision boundary.

How to decide on which side the output is greater than zero?

The direction of the weight vector is the direction in which the output will be positive, as the sketch below illustrates.

Thus the Adaline has the same limitation as the perceptron: it can classify only linearly separable patterns. However, due to its linear transfer function it can be put to other uses.
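A small sketch of the decision boundary computation, assuming hypothetical values w11 = 1, w12 = 2, b = -1:

```python
import numpy as np

# Hypothetical weights and bias for a 2-input Adaline.
w = np.array([1.0, 2.0])          # [w11, w12]
b = -1.0

def adaline_output(p):
    """a = purelin(n) = n = w^T p + b."""
    return w @ p + b

# Decision boundary: n = 0, i.e. p2 = -(w11/w12) p1 - b/w12.
p1 = np.linspace(-2.0, 2.0, 5)
p2_boundary = -(w[0] / w[1]) * p1 - b / w[1]

# The output is positive on the side the weight vector points toward.
print(adaline_output(np.array([2.0, 2.0])))    # 5.0  -> positive side
print(adaline_output(np.array([-2.0, -2.0])))  # -7.0 -> negative side
```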

TRAINING ADALINE USING LMS ALGORITHM.

The network can be considered trained if it produces outputs with acceptable error for the given inputs.

Let X = [w; b] (the weight vector and bias stacked into a single vector)

Z = [p; 1] (the input vector augmented with a 1)

n = wTp + b
a = purelin(n) = n

In matrix notation,

a = [wT  b][p; 1] = XTZ

e = t - a = t - XTZ

Since the error may be positive or negative, we take the square of the error.

e2 = (t - XTZ)2

Mean of the square of errors:

E[e2] = E[(t - a)2] = E[(t - XTZ)2]

where E[·] denotes the statistical expectation operator.


-----------------------
Expectation of a discrete variable x: E(X) = ∑ xip(xi)

xi is the ith discrete value of the variable x.
p(xi) is the probability of occurrence of xi.

Hence, E[e2] = e12p(e12) + e22p(e22) + ...

Assuming that all values of e2 have equal probability of occurrence, p(e2) = 1/n:

E[e2] = e12/n + e22/n + ... + en2/n

Thus E[e2] is the mean of the square of errors, i.e. the mean squared error.
------------------------
Let E[e2] = F(X), the performance function (it reflects how well the network is performing).

F(X) = E[(t - XTZ)2]
     = E[t2 - 2tXTZ + XTZZTX]
     = E(t2) - 2XTE(tZ) + XTE(ZZT)X
     = C - 2XTh + XTRX

where C = E(t2), h = E(tZ), R = E(ZZT)


R is the input correlation matrix (a measure of the similarity of a signal with a delayed version of the same signal).

h is the cross-correlation vector (a measure of the similarity between a signal and a delayed version of another signal).

In order to bring F(X) to the standard quadratic form:

F(X) = 1/2 XT(2R)X - 2hTX + C
     = 1/2 XTAX + dTX + C

where A = 2R, d = -2h

The stationary point (the point at which the gradient is zero) can be found by setting the gradient of F(X) to zero:

∇F(X) = 0

But the gradient of a quadratic function is given by AX + d.

So, 2RX - 2h = 0
    2RX = 2h
    X = R-1h

where R = E(ZZT) and h = E(tZ).

Thus if we could calculate the statistical properties R and h, the value of the vector X, i.e. the weights and biases, could be computed directly, without any iterations. In general it is not convenient to calculate h and R. We can also avoid calculating the inverse of R by using the steepest descent algorithm.
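A minimal sketch of the direct (non-iterative) solution, estimating R and h from training pairs assumed equally probable; the pattern values are the ones assumed in the worked problem later in these notes, with the bias neglected so that Z reduces to p:

```python
import numpy as np

# Training pairs (bias neglected, so Z = p); these are the values
# assumed in the worked problem below.
P = np.array([[1.0, 1.0],
              [1.0, -1.0]])       # rows are p1, p2
t = np.array([1.0, -1.0])

# R = E[Z Z^T] and h = E[t Z], treating each pattern as equally likely.
R = sum(np.outer(p, p) for p in P) / len(P)
h = sum(ti * p for p, ti in zip(P, t)) / len(P)

# Direct solution X = R^{-1} h (np.linalg.solve avoids forming the inverse).
X = np.linalg.solve(R, h)
print(X)                          # [0. 1.]
```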

WIDROW-HOFF ALGORITHM FOR TRAINING ADALINE.

It is an approximate steepest descent algorithm in which the performance index is the mean square error. The performance function to be minimized is taken as e2(k) rather than E[e2]: the error is minimized after each individual pattern is applied. The (k+1)th value of the vector X (the weight vector) is found from the kth value such that F(x(k+1)) < F(x(k)), i.e. we are going downhill on the surface formed by the performance function.

xk+1 = xk - αkgk   (method of steepest descent)

where gk is the gradient at the kth iteration.

For the 2-input network,

gk = ∇F(x) = ∇e2(k) = 2e(k)∇e(k)

where the components of ∇e(k) are ∂e(k)/∂w1j and ∂e(k)/∂b.

e(k) = t(k) - a(k)
e(k) = t(k) - (wTp(k) + b)
e(k) = t(k) - (∑ w1ipi(k) + b)

∂e(k)/∂w11 = 0 - p1(k) - 0 - ... - 0 = -p1(k)

In general, ∂e(k)/∂w1j = -pj(k) and ∂e(k)/∂b = -1.

gk = ∇e2(k) = -2e(k)[p(k); 1]

xk+1 = xk - αgk

[w(k+1); b(k+1)] = [w(k); b(k)] + 2αe(k)[p(k); 1]

or

w(k+1) = w(k) + 2αe(k)p(k)
b(k+1) = b(k) + 2αe(k)

These two equations are the LMS algorithm, also called the Widrow-Hoff learning algorithm or the delta rule.

Widrow-Hoff Algorithm

 Performance function to be minimized is e2(k).

 Minimization method used is the method of steepest descent:

xk+1 = xk - αkgk

 Widrow-Hoff algorithm:

w(k+1) = w(k) + 2αe(k)pT(k)
b(k+1) = b(k) + 2αe(k)

Steps (a minimal training loop based on these steps is sketched after the list):
1. Start with small random weights & biases.
2. Apply the 1st input vector & propagate it forward to find the output.
3. Compute the error.
4. Modify the weights & biases using the formulae:

w(k+1) = w(k) + 2αe(k)pT(k)

b(k+1) = b(k) + 2αe(k)

5. Stop when e(k) drops to an acceptably low value.
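Putting the steps together, a minimal Widrow-Hoff training loop might look as follows. This is a sketch, not a definitive implementation; the weights start at zero here for simplicity, while the notes suggest small random values.

```python
import numpy as np

def train_adaline(P, T, alpha, epochs=100, tol=1e-6):
    """Widrow-Hoff (LMS) training sketch for a single-output Adaline."""
    w = np.zeros(P.shape[1])               # step 1: initial weights
    b = 0.0                                #         and bias
    for _ in range(epochs):
        max_err = 0.0
        for p, t in zip(P, T):             # step 2: apply each input vector
            a = w @ p + b                  #         a = purelin(n) = n
            e = t - a                      # step 3: compute the error
            w = w + 2 * alpha * e * p      # step 4: w(k+1) = w(k) + 2*alpha*e(k)*p(k)
            b = b + 2 * alpha * e          #         b(k+1) = b(k) + 2*alpha*e(k)
            max_err = max(max_err, abs(e))
        if max_err < tol:                  # step 5: stop at acceptably low error
            break
    return w, b
```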


In the classical LMS method, we first apply all the available input patterns and find the individual errors; we then try to minimize the mean of the squared errors. In the Widrow-Hoff method, we proceed in an iterative fashion as each input pattern is applied, thus avoiding the matrix inversion that requires the statistical properties of the input vectors to be known. This saves a large amount of labor in practical-sized problems.

PROBLEM:

The input/target pairs are

p1 = [1; 1], t1 = 1 ;  p2 = [1; -1], t2 = -1

Train the network using the LMS algorithm with the initial guess set to zero and learning rate α = 0.25. Neglect the bias.

a(k) = purelin(w(k)p(k))

w(k+1) = w(k) + 2αe(k)pT(k)

p1 is applied:

a(0) = purelin([0 0][1; 1]) = 0 ;  t(0) = 1

e(0) = t(0) - a(0) = 1

w(1) = w(0) + 2(0.25)(1)pT(0)
     = [0 0] + 0.5[1 1] = [0.5 0.5]

p2 is applied:

a(1) = purelin([0.5 0.5][1; -1]) = 0

e(1) = t(1) - a(1) = -1 - 0 = -1

w(2) = w(1) + 2(0.25)(-1)pT(1)
     = [0.5 0.5] - 0.5[1 -1]
     = [0 1]
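The two iterations can be checked with a few lines of numpy, using the pattern values assumed above:

```python
import numpy as np

w = np.zeros(2)                    # initial guess, bias neglected
alpha = 0.25
pairs = [(np.array([1.0, 1.0]), 1.0),
         (np.array([1.0, -1.0]), -1.0)]

for k, (p, t) in enumerate(pairs):
    a = w @ p                      # a(k) = purelin(w(k) p(k))
    e = t - a                      # e(k) = t(k) - a(k)
    w = w + 2 * alpha * e * p      # LMS update
    print(f"k={k}: a={a}, e={e}, w={w}")

# k=0: a=0.0, e=1.0,  w=[0.5 0.5]
# k=1: a=0.0, e=-1.0, w=[0. 1.]
```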

The Adaline is more widely used than the perceptron.

The major area of application of the Adaline is adaptive filtering.

An adaptive filter is able to separate undesirable components from signals even if the undesirable components fall in the same frequency band as the useful signal.

Adaptive filtering has the following applications:

 Noise cancellation
 System identification
 Inverse system modeling
 Prediction

Noise Cancellation

[Figure: Noise cancellation. The corrupted signal s(t) + f1(n(t)) is formed from the useful signal s(t) and the noise source n(t) passed through the noise path f1. The Adaline filter, fed by n(t), produces f2(n(t)); the training algorithm drives f2 toward f1, so the error, the difference between the corrupted signal and the filter output, becomes the restored signal s(t).]
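A sketch of this arrangement, assuming a hypothetical two-tap FIR noise path f1 and an Adaline filter fed by recent samples of the noise source; since s(t) is uncorrelated with n(t), the error converges toward the restored signal:

```python
import numpy as np

rng = np.random.default_rng(0)
N = 2000
s = np.sin(0.1 * np.arange(N))                 # useful signal s(t)
n0 = rng.normal(size=N)                        # noise source n(t)
f1 = 0.8 * n0 - 0.4 * np.roll(n0, 1)           # hypothetical noise path f1(n(t))
d = s + f1                                     # corrupted signal s(t) + f1(n(t))

taps, alpha = 4, 0.01
w = np.zeros(taps)                             # Adaline filter weights
restored = np.zeros(N)
for k in range(taps - 1, N):
    z = n0[k - taps + 1:k + 1][::-1]           # current and past noise samples
    y = w @ z                                  # filter output f2(n(t))
    e = d[k] - y                               # error: corrupted minus f2
    w = w + 2 * alpha * e * z                  # LMS update drives f2 -> f1
    restored[k] = e                            # error converges to s(t)
```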
System Identification

[Figure: System identification. The same input drives both the system to be identified and the adaptive filter (Adaline). The system's output is the desired signal (target); the LMS algorithm adjusts the filter so that the error between the target and the output of the adaptive filter is minimized.]
Inverse System Modeling

[Figure: Inverse system modeling. The adaptive filter (Adaline) is connected in series after the system whose inverse model is to be found. A delayed version of the system input serves as the target, and the training algorithm minimizes the error between the filter output and the delayed input.]
Prediction

Prediction is required in many situations.

[Figure: Prediction. Past samples of the signal, obtained through delay elements D, feed the adaptive filter (Adaline), which produces the predicted value of the current sample. The actual value of the current sample is the target, and the LMS algorithm minimizes the prediction error.]
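A minimal one-step predictor along the same lines, using a few delayed samples of a hypothetical signal as the filter input and the actual current sample as the target:

```python
import numpy as np

rng = np.random.default_rng(1)
N = 1000
x = np.sin(0.2 * np.arange(N)) + 0.05 * rng.normal(size=N)

taps, alpha = 3, 0.05
w = np.zeros(taps)                     # predictor coefficients
for k in range(taps, N):
    z = x[k - taps:k][::-1]            # past samples of the signal
    y = w @ z                          # predicted value of the current sample
    e = x[k] - y                       # actual current sample is the target
    w = w + 2 * alpha * e * z          # LMS update

print(w)                               # learned predictor coefficients
```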
