WK3 - Multi Layer Perceptron
CS 476: Networks of Neural Computation
WK3 – Multi Layer Perceptron
Dr. Stathis Kasderidis
Dept. of Computer Science
University of Crete
•The instantaneous error energy is defined as:

  E(n) = \frac{1}{2} \sum_{j \in C} e_j^2(n)

where C is the set of all output neurons.
•If we assume that there are N examples in the set then the average squared error is:

  E_{av} = \frac{1}{N} \sum_{n=1}^{N} E(n)
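As a concrete check of these two definitions, a minimal sketch in Python; the array values are made up for illustration:

```python
import numpy as np

def instantaneous_error(d, o):
    """E(n) = 1/2 * sum over output neurons j of e_j(n)^2."""
    e = d - o                      # error signal per output neuron
    return 0.5 * np.sum(e ** 2)

def average_error(D, O):
    """E_av = (1/N) * sum over the N examples of E(n)."""
    return np.mean([instantaneous_error(d, o) for d, o in zip(D, O)])

# Two examples, two output neurons each (illustrative values)
D = np.array([[1.0, 0.0], [0.0, 1.0]])   # desired responses
O = np.array([[0.8, 0.1], [0.2, 0.7]])   # actual outputs
E0 = instantaneous_error(D[0], O[0])      # 0.5*(0.04 + 0.01) = 0.025
E_av = average_error(D, O)                # (0.025 + 0.065)/2 = 0.045
```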
•And,

  \frac{\partial y_j(n)}{\partial v_j(n)} = \varphi_j'(v_j(n)), \qquad
  \frac{\partial v_j(n)}{\partial w_{ji}(n)} = y_i(n)

•Combining all the previous equations we get finally:

  \frac{\partial E(n)}{\partial w_{ji}(n)} = -e_j(n)\,\varphi_j'(v_j(n))\,y_i(n)
•Then we have:

  \frac{\partial E(n)}{\partial y_j(n)} = \sum_{k \in C} e_k(n)\,\frac{\partial e_k(n)}{\partial y_j(n)} = -\sum_{k \in C} \delta_k(n)\,w_{kj}(n)
3. Forward Computation:
• Let the training example in the epoch be denoted by (x(n),d(n)), where x is the input vector and d is the desired vector.
• Compute the local fields by proceeding forward through the network layer by layer. The local field for neuron j at layer l is defined as:

  v_j^{(l)}(n) = \sum_{i=0}^{m} w_{ji}^{(l)}(n)\,y_i^{(l-1)}(n)

where m is the number of neurons which connect to j, y_i^{(l-1)}(n) is the activation of neuron i at layer (l-1), and w_{ji}^{(l)}(n) is the weight connecting neuron i at layer (l-1) to neuron j.
• For neuron j in the output layer, compute the error signal:

  e_j(n) = d_j(n) - o_j(n)
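The layer-by-layer computation of the local fields can be sketched as follows; the network shape, the random weights, and the logistic activation are assumptions made only for illustration:

```python
import numpy as np

def forward_layer(W, y_prev, phi):
    """Local fields and activations for one layer.
    W[j, i] is the weight from neuron i of layer l-1 to neuron j of layer l;
    column 0 holds the biases (the fixed input y_0 = +1)."""
    y_aug = np.concatenate(([1.0], y_prev))   # prepend y_0 = +1
    v = W @ y_aug                             # v_j = sum_i w_ji * y_i
    return v, phi(v)

sigmoid = lambda v: 1.0 / (1.0 + np.exp(-v))

# Hypothetical 2-3-1 network with randomly drawn weights
rng = np.random.default_rng(0)
W1 = rng.normal(size=(3, 3))   # hidden layer: 3 neurons, 2 inputs + bias
W2 = rng.normal(size=(1, 4))   # output layer: 1 neuron, 3 hidden + bias

x = np.array([0.5, -1.0])
v1, y1 = forward_layer(W1, x, sigmoid)
v2, o = forward_layer(W2, y1, sigmoid)
e = np.array([1.0]) - o        # error signal e_j(n) = d_j(n) - o_j(n)
```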
4. Backward Computation:
• Compute the δs of the network defined by:

  \delta_j^{(L)}(n) = e_j^{(L)}(n)\,\varphi_j'(v_j^{(L)}(n))   for neuron j in output layer L

  \delta_j^{(l)}(n) = \varphi_j'(v_j^{(l)}(n)) \sum_k \delta_k^{(l+1)}(n)\,w_{kj}^{(l+1)}(n)   for neuron j in hidden layer l

where \varphi_j'(\cdot) is the derivative of the function \varphi_j wrt the argument.
• Adjust the weights using the generalised delta rule:

  w_{ji}^{(l)}(n+1) = w_{ji}^{(l)}(n) + \alpha\,[w_{ji}^{(l)}(n) - w_{ji}^{(l)}(n-1)] + \eta\,\delta_j^{(l)}(n)\,y_i^{(l-1)}(n)

where α is the momentum constant and η is the learning rate.
• The order of presentation of examples should be randomised from epoch to epoch.
• The momentum and the learning rate parameters typically change (usually decreased) as the number of training iterations increases.
• The following network solves the problem. The perceptron could not do this. (We use the sgn function.)
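Since the network figure itself is not reproduced here, one standard two-layer threshold network that realises XOR is sketched below; the particular weights and thresholds are an assumption and may differ from the figure on the slide:

```python
def sgn(v):
    """Hard-limiter: +1 if v >= 0, else -1."""
    return 1 if v >= 0 else -1

def xor_net(x1, x2):
    """Two-layer sgn network for XOR, inputs/outputs in {-1, +1}.
    (Illustrative weights; the slide's figure may use different ones.)"""
    h1 = sgn(x1 + x2 + 1.0)    # fires unless both inputs are -1 (OR)
    h2 = sgn(x1 + x2 - 1.0)    # fires only if both inputs are +1 (AND)
    return sgn(h1 - h2 - 1.0)  # OR and not AND = XOR
```

The hidden layer carves the input plane with two parallel lines, which is what a single perceptron (one line) cannot do for XOR.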
•Define:

  F(x_1, \ldots, x_{m_0}) = \sum_{i=1}^{m_1} a_i\,\varphi\!\left(\sum_{j=1}^{m_0} w_{ij} x_j + b_i\right)
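The sum-of-sigmoids form F can be evaluated directly; the tanh activation and the hand-picked parameters below are illustrative assumptions (two units roughly reproducing f(x) = x near the origin):

```python
import numpy as np

def F(x, a, W, b, phi=np.tanh):
    """F(x_1..x_m0) = sum_{i=1}^{m1} a_i * phi(sum_j w_ij * x_j + b_i)."""
    return a @ phi(W @ x + b)

# m0 = 1 input, m1 = 2 hidden units; since 2*tanh(0.5x) ~ x for small x,
# these hand-picked parameters approximate the identity near 0.
a = np.array([1.0, -1.0])
W = np.array([[0.5], [-0.5]])
b = np.array([0.0, 0.0])
y = F(np.array([0.2]), a, W, b)   # close to 0.2
```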
where W is the total number of adjustable parameters of the model. There is mathematical support for this observation (but we will not analyse this further!)
•There is the “curse of dimensionality” for approximating functions in high-dimensional spaces.
•It is theoretically justified to use two hidden layers.
•We should try to approximate the true mechanism that generates the data, not the specific structure of the data, in order to achieve generalisation. If we learn the specific structure of the data we have overfitting or overtraining.
•The model is trained on all the subsets except for one, and the validation error is measured by testing it on the subset left out.
•The procedure is repeated for a total of K trials, each time using a different subset for validation.
•The performance of the model is assessed by averaging the squared error under validation over all the trials of the experiment.
•There is a limiting case, K=N, in which case the method is called leave-one-out.
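The K-fold procedure described by these bullets can be sketched as follows; `train_fn` and `error_fn` are hypothetical placeholders, and the toy "model" (the mean of the training targets) is only for illustration:

```python
import numpy as np

def k_fold_cv(X, D, train_fn, error_fn, K):
    """Split the N examples into K subsets; train on K-1 of them, validate
    on the one left out; average the validation error over the K trials.
    K = N gives leave-one-out."""
    folds = np.array_split(np.arange(len(X)), K)
    errs = []
    for k in range(K):
        val = folds[k]
        trn = np.concatenate([folds[j] for j in range(K) if j != k])
        model = train_fn(X[trn], D[trn])
        errs.append(error_fn(model, X[val], D[val]))
    return np.mean(errs)

# Toy illustration: the "model" is just the mean of the training targets
X = np.arange(8.0).reshape(-1, 1)
D = np.arange(8.0)
train_fn = lambda Xt, Dt: Dt.mean()
error_fn = lambda m, Xv, Dv: np.mean((Dv - m) ** 2)
cv_err = k_fold_cv(X, D, train_fn, error_fn, K=4)
```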
R(w) = E_s(w) + E_c(w)
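A common concrete choice for the complexity term E_c(w) is weight decay (the sum of squared weights); the sketch below assumes this penalty and a regularisation parameter λ that is not given on the slide:

```python
import numpy as np

def total_risk(weights, E_s, lam=0.01):
    """R(w) = E_s(w) + lam * E_c(w), assuming the weight-decay penalty
    E_c(w) = sum of squared weights. lam is an assumed regularisation
    parameter, not given on the slide."""
    E_c = sum(np.sum(W ** 2) for W in weights)
    return E_s + lam * E_c

W1 = np.array([[1.0, -2.0], [0.5, 0.0]])
R = total_risk([W1], E_s=0.3, lam=0.01)   # 0.3 + 0.01 * 5.25 = 0.3525
```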
•The optimal brain surgeon procedure (OBS)