Multi Layer Perceptron
CS 476: Networks of Neural Computation

Contents
• MLP Model
• BP Algorithm
• Approximation
• Model Selection
• BP & Optimisation
• Conclusions
Conclusions
• Generalisation
• Model selection through cross-validation
• Conjugate-Gradient method for BP
• And,
$$\frac{\partial y_j(n)}{\partial v_j(n)} = \varphi_j'\big(v_j(n)\big), \qquad \frac{\partial v_j(n)}{\partial w_{ji}(n)} = y_i(n)$$
• Combining all the previous equations we get finally:
$$\Delta w_{ji}(n) = -\eta\,\frac{\partial \mathcal{E}(n)}{\partial w_{ji}(n)} = \eta\, e_j(n)\,\varphi_j'\big(v_j(n)\big)\, y_i(n) = \eta\,\delta_j(n)\, y_i(n)$$
where for an output neuron j the local gradient is:
$$\delta_j(n) = e_j(n)\,\varphi_j'\big(v_j(n)\big) = \big(d_j(n) - y_j(n)\big)\,\varphi_j'\big(v_j(n)\big)$$
• If j is a hidden neuron then δ_j(n) is defined as:
$$\delta_j(n) = -\frac{\partial \mathcal{E}(n)}{\partial y_j(n)}\,\frac{\partial y_j(n)}{\partial v_j(n)} = -\frac{\partial \mathcal{E}(n)}{\partial y_j(n)}\,\varphi_j'\big(v_j(n)\big)$$
• Then we have:
$$\frac{\partial \mathcal{E}(n)}{\partial y_j(n)} = \sum_{k \in C} e_k(n)\,\frac{\partial e_k(n)}{\partial y_j(n)} = -\sum_{k \in C} \delta_k(n)\, w_{kj}(n)$$
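To make the update rule concrete, here is a minimal NumPy sketch (not from the slides; the sigmoid φ, the small 3-2 output stage and the value η = 0.1 are assumptions for illustration only). It evaluates δ_j(n) for the output neurons, the combined hidden-neuron form δ_j(n) = φ'_j(v_j(n)) Σ_k δ_k(n) w_kj(n) that follows from the two equations above, and the update Δw_ji(n) = η δ_j(n) y_i(n):

```python
import numpy as np

def phi(v):                    # sigmoid activation, used for illustration
    return 1.0 / (1.0 + np.exp(-v))

def phi_prime(v):              # phi'(v) for the sigmoid
    s = phi(v)
    return s * (1.0 - s)

eta = 0.1                                  # learning rate (assumed value)
rng = np.random.default_rng(0)

# Illustrative quantities for one training example n
v_hid = np.array([0.1, 0.9, -0.2])         # hidden local fields v_j(n)
y_hid = phi(v_hid)                         # hidden outputs y_j(n)
W_out = rng.standard_normal((2, 3))        # output-layer weights w_kj(n)
v_out = W_out @ y_hid                      # output local fields
y_out = phi(v_out)
d = np.array([1.0, 0.0])                   # desired responses d_j(n)

# Output neurons: delta_j(n) = (d_j - y_j) * phi'(v_j)
delta_out = (d - y_out) * phi_prime(v_out)

# Hidden neurons: delta_j(n) = phi'(v_j) * sum_k delta_k * w_kj
delta_hid = phi_prime(v_hid) * (W_out.T @ delta_out)

# Weight update: Delta w_ji(n) = eta * delta_j(n) * y_i(n)
dW_out = eta * np.outer(delta_out, y_hid)
```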
3. Forward Computation:
• Let the training example in the epoch be denoted by (x(n), d(n)), where x is the input vector and d is the desired vector.
• Compute the local fields by proceeding forward through the network layer by layer. The local field for neuron j at layer l is defined as:
$$v_j^{(l)}(n) = \sum_{i=0}^{m} w_{ji}^{(l)}(n)\, y_i^{(l-1)}(n)$$
where m is the number of neurons of layer l−1 which connect to neuron j.
• If j is in the input layer we simply set:
$$y_j^{(0)}(n) = x_j(n)$$
• If j is in the output layer, compute the error signal:
$$e_j(n) = d_j(n) - o_j(n)$$
where d_j(n) is the desired response for the jth element.
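As an illustration of step 3, here is a minimal sketch (not from the slides; the sigmoid activation, the 2-3-1 layer sizes and the random weights are assumptions) that computes the local fields and outputs layer by layer, with the bias handled as the i = 0 term (y_0 = 1):

```python
import numpy as np

def phi(v):
    """Sigmoid activation, used for illustration."""
    return 1.0 / (1.0 + np.exp(-v))

def forward(x, weights):
    """Forward computation: v_j^(l) = sum_i w_ji^(l) y_i^(l-1).

    `weights` is a list of matrices, one per layer; column 0 of each matrix
    holds the bias weights, multiplying the constant input y_0 = 1.
    """
    y = x                                          # y^(0) = x(n)
    fields, outputs = [], [y]
    for W in weights:
        y_with_bias = np.concatenate(([1.0], y))   # prepend y_0 = 1
        v = W @ y_with_bias                        # local fields v^(l)
        y = phi(v)                                 # outputs y^(l)
        fields.append(v)
        outputs.append(y)
    return fields, outputs

# Illustrative 2-3-1 network
rng = np.random.default_rng(0)
weights = [rng.standard_normal((3, 3)), rng.standard_normal((1, 4))]
x = np.array([0.5, -1.0])
fields, outputs = forward(x, weights)
e = np.array([1.0]) - outputs[-1]                  # error signal e_j(n) = d_j(n) - o_j(n)
```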
4. Backward Computation:
• Compute the δs (local gradients) of the network, defined by:
$$\delta_j^{(l)}(n) = \begin{cases} e_j^{(L)}(n)\,\varphi_j'\big(v_j^{(L)}(n)\big) & \text{for neuron } j \text{ in output layer } L \\[4pt] \varphi_j'\big(v_j^{(l)}(n)\big)\displaystyle\sum_k \delta_k^{(l+1)}(n)\, w_{kj}^{(l+1)}(n) & \text{for neuron } j \text{ in hidden layer } l \end{cases}$$
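A matching sketch of step 4, under the same assumptions as the forward-pass example above (sigmoid φ, bias weights stored in column 0 of each weight matrix). It applies the two δ formulas and then the update Δw_ji = η δ_j y_i:

```python
import numpy as np

def backward(fields, outputs, d, weights, eta=0.1):
    """One backward pass: local gradients (deltas) plus weight updates."""
    def phi_prime(v):                       # derivative of the sigmoid phi
        s = 1.0 / (1.0 + np.exp(-v))
        return s * (1.0 - s)

    # Output layer L: delta_j = e_j * phi'(v_j)
    delta = (d - outputs[-1]) * phi_prime(fields[-1])

    new_weights = [None] * len(weights)
    for l in range(len(weights) - 1, -1, -1):
        y_prev = np.concatenate(([1.0], outputs[l]))                  # y^(l-1), bias as y_0 = 1
        new_weights[l] = weights[l] + eta * np.outer(delta, y_prev)   # w += eta * delta * y
        if l > 0:
            # Hidden layer: delta_j = phi'(v_j) * sum_k delta_k * w_kj
            # (column 0 holds the bias weights, so it is dropped here)
            delta = phi_prime(fields[l - 1]) * (weights[l][:, 1:].T @ delta)
    return new_weights

# Usage with the `forward` sketch above:
#   fields, outputs = forward(x, weights)
#   weights = backward(fields, outputs, d, weights)
```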
• H5: Normalise the inputs (see the sketch below):
  • Create zero-mean variables
  • Decorrelate the variables
  • Scale the variables to have approximately equal covariances
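A minimal sketch of H5 (not from the slides): mean removal, decorrelation via the eigenvectors of the covariance matrix, and rescaling so that the transformed variables have approximately equal (unit) variance:

```python
import numpy as np

def normalise_inputs(X, eps=1e-8):
    """H5-style preprocessing of a data matrix X with one example per row."""
    # 1. Create zero-mean variables
    Xc = X - X.mean(axis=0)
    # 2. Decorrelate the variables (rotate onto the eigenvectors of the covariance)
    cov = np.cov(Xc, rowvar=False)
    eigvals, eigvecs = np.linalg.eigh(cov)
    Xd = Xc @ eigvecs
    # 3. Scale so the covariances (variances) are approximately equal
    return Xd / np.sqrt(eigvals + eps)

X = np.random.default_rng(1).normal(size=(100, 3)) * [1.0, 5.0, 0.1]
Xn = normalise_inputs(X)   # roughly zero-mean, decorrelated, unit variance
```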
• H6: Initialise the weights properly. Use a zero-mean distribution with variance:
$$\sigma_w^2 = \frac{1}{m}$$
where m is the number of connections (fan-in) of the neuron.
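A small sketch of H6 (illustrative layer sizes; the Gaussian distribution is an assumption, any zero-mean distribution with this variance would do):

```python
import numpy as np

def init_weights(m_in, m_out, seed=0):
    """H6: zero-mean weights with variance 1/m, where m is the fan-in."""
    rng = np.random.default_rng(seed)
    sigma = 1.0 / np.sqrt(m_in)              # std dev = sqrt(1/m)
    return rng.normal(0.0, sigma, size=(m_out, m_in))

W_hid = init_weights(m_in=3, m_out=5)         # hidden layer of an illustrative 3-5-1 net
W_out = init_weights(m_in=5, m_out=1)
```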
define:
$$F(x_1, \ldots, x_{m_0}) = \sum_{i=1}^{m_1} \alpha_i\,\varphi\!\left(\sum_{j=1}^{m_0} w_{ij}\, x_j + b_i\right)$$
for all x_1, …, x_{m_0} that lie in the input space.
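The expression above is a one-hidden-layer network. A minimal sketch of evaluating F (illustrative sizes and random parameters; tanh stands in for the sigmoidal φ):

```python
import numpy as np

def F(x, alpha, W, b, phi=np.tanh):
    """F(x_1,...,x_m0) = sum_i alpha_i * phi(sum_j w_ij x_j + b_i).

    x: input vector of size m0; W: (m1, m0) weights; b, alpha: size m1.
    """
    return alpha @ phi(W @ x + b)

rng = np.random.default_rng(2)
m0, m1 = 2, 8                       # input dimension and number of hidden units
W = rng.standard_normal((m1, m0))
b = rng.standard_normal(m1)
alpha = rng.standard_normal(m1)
value = F(np.array([0.3, -0.7]), alpha, W, b)
```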
• To achieve generalisation we should try to approximate the true mechanism that generates the data, not the specific structure of the data. If we learn the specific structure of the data we have overfitting (overtraining).
• R(w) = Es(w) + λEc(w)
where R is the total cost function, Es is the standard performance measure, Ec is the complexity penalty and λ > 0 is a regularisation parameter.
• Typically one imposes smoothness constraints as the complexity term, i.e. we want to co-minimise the smoothing integral of the kth order:
$$\mathcal{E}_c(\mathbf{w}, k) = \frac{1}{2} \int \left\| \frac{\partial^k F(\mathbf{x}, \mathbf{w})}{\partial \mathbf{x}^k} \right\|^2 \mu(\mathbf{x})\, d\mathbf{x}$$
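A minimal sketch of this structure (all names and the mean-squared-error choice of Es are illustrative, not from the slides):

```python
import numpy as np

def total_cost(w, E_s, E_c, lam=0.01):
    """R(w) = E_s(w) + lambda * E_c(w)."""
    return E_s(w) + lam * E_c(w)

# Example: mean-squared error as E_s, squared norm as a placeholder E_c
X = np.array([[0.0], [1.0]]); d = np.array([0.0, 1.0])
E_s = lambda w: 0.5 * np.mean((d - X @ w) ** 2)
E_c = lambda w: float(w @ w)
R = total_cost(np.array([0.3]), E_s, E_c, lam=0.01)
```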
• Weight Decay:
$$\mathcal{E}_c(\mathbf{w}) = \|\mathbf{w}\|^2 = \sum_{i=1}^{W} w_i^2$$
where W is the total number of free parameters in the model.
• Weight Elimination:
$$\mathcal{E}_c(\mathbf{w}) = \sum_{i=1}^{W} \frac{(w_i/w_0)^2}{1 + (w_i/w_0)^2}$$
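A short sketch of both complexity penalties (illustrative weight values; w_0 in weight elimination is a preassigned scale parameter):

```python
import numpy as np

def weight_decay(w):
    """E_c(w) = ||w||^2 = sum_i w_i^2."""
    return float(np.sum(w ** 2))

def weight_elimination(w, w0=1.0):
    """E_c(w) = sum_i (w_i/w0)^2 / (1 + (w_i/w0)^2)."""
    r = (w / w0) ** 2
    return float(np.sum(r / (1.0 + r)))

w = np.array([0.01, 0.5, -2.0])
lam = 0.01
# Either penalty plugs into the total cost R(w) = E_s(w) + lam * E_c(w) shown earlier
penalty = lam * weight_elimination(w)
```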
Eav < Si