Feedforward Neural Networks: An Introduction

Simon Haykin
$T = \{(\mathbf{x}_i, d_i)\}_{i=1}^{N}$   (1.1)
Figure 1.1 Fully connected feedforward network with one hidden layer and one output layer.
$N = O\!\left(\dfrac{W}{\epsilon}\right)$   (1.4)

where $O(\cdot)$ denotes "the order of," $W$ denotes the number of synaptic weights in the network, and $\epsilon$ denotes the fraction of clas-
sification errors permitted on test data. For example, with an error
of 10% the number of training examples needed should be about 10
times the number of synaptic weights in the network.
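As a hedged numerical sketch of this rule of thumb (the function name and the unit constant hidden inside $O(\cdot)$ are our own assumptions, not from the text):

```python
# Sketch of the rule of thumb in Eq. (1.4): N on the order of W / eps,
# taking the constant inside O(.) to be 1 (an assumption for illustration).
def required_examples(num_weights: int, error_fraction: float) -> int:
    """Approximate training-set size for a network with `num_weights`
    synaptic weights and permitted test-error fraction `error_fraction`."""
    return round(num_weights / error_fraction)

# With a 10% permitted error, about 10 examples per synaptic weight:
print(required_examples(100, 0.10))  # 1000
```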
Supposing that we have chosen a multilayer perceptron to be
trained with the back-propagation algorithm, how do we determine
when it is “best” to stop the training session? How do we select the
size of individual hidden layers of the MLP? The answers to these
important questions may be obtained through the use of a statistical
technique known as cross-validation, which proceeds as follows
(Haykin 1999):
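As an illustrative sketch only (not Haykin's steps verbatim), one common hold-out flavour of cross-validation for deciding when to stop training can be written as an early-stopping loop; the split, patience rule, and names (`train_with_early_stopping`, `model_step`, `val_error`) are our own assumptions:

```python
# Hold-out early stopping: train while the error measured on a held-out
# validation subset keeps improving; stop once it has failed to improve
# for `patience` consecutive epochs.
def train_with_early_stopping(model_step, val_error, n_epochs=100, patience=3):
    """Run model_step() once per epoch; return (best validation error,
    epoch at which training stopped)."""
    best, since_best = float("inf"), 0
    for epoch in range(n_epochs):
        model_step()
        err = val_error()
        if err < best:
            best, since_best = err, 0
        else:
            since_best += 1
            if since_best >= patience:
                break
    return best, epoch

# A synthetic validation curve that bottoms out and then overfits:
curve = iter([0.90, 0.70, 0.50, 0.45, 0.44, 0.46, 0.48, 0.50, 0.55, 0.60])
best, stop = train_with_early_stopping(lambda: None, lambda: next(curve),
                                       n_epochs=10, patience=3)
print(best, stop)  # 0.44 7
```

Training halts at epoch 7, three epochs after the validation error last improved, and the weights from the best epoch would be the ones retained.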
$\Delta w_{ji} = \eta y_j (x_i - y_j w_{ji})$   (1.6)

where the term $-\eta y_j^2 w_{ji}$ is added to stabilize the learning process. As
the number of iterations approaches infinity, we find the following:
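As a hedged numerical sketch of this limiting behavior (the toy data, learning rate, and variable names are our own assumptions): under the update of Eq. (1.6), the weight vector of the linear neuron converges to a unit-norm vector along the principal eigenvector of the input correlation matrix.

```python
import numpy as np

# Self-stabilized Hebbian learning (Oja's rule), Eq. (1.6):
#   delta_w_ji = eta * y_j * (x_i - y_j * w_ji)
rng = np.random.default_rng(0)
X = rng.normal(size=(5000, 3)) * np.array([3.0, 1.0, 0.5])  # anisotropic inputs
w = rng.normal(size=3)
eta = 0.005

for x in X:
    y = w @ x                    # linear neuron output y_j
    w += eta * y * (x - y * w)   # Hebbian term plus stabilizer -eta * y_j^2 * w_ji

# In the limit, w approaches a unit-norm vector along the principal
# eigenvector of the input correlation matrix (here the first axis):
print(np.linalg.norm(w), np.abs(w))
```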
$G(z) = \dfrac{\mu z^{-1}}{1 - (1 - \mu)z^{-1}} = \dfrac{\mu}{z - (1 - \mu)}, \qquad 0 < \mu < 2$

$g_p(n) = \dbinom{n-1}{p-1}\, \mu^p (1 - \mu)^{n-p}, \qquad n \ge p$
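The transfer function $G(z)$ can be realized as a cascade of identical first-order sections, one per memory tap. A hedged sketch (function and variable names are our own assumptions) that also cross-checks the closed-form impulse response $g_p(n)$:

```python
from math import comb

# Each gamma-memory tap applies G(z) = mu*z^-1 / (1 - (1-mu)*z^-1) to the
# previous tap, i.e. the recursion x_k(n) = (1-mu)*x_k(n-1) + mu*x_{k-1}(n-1).
def gamma_memory_impulse(mu, p, n_max):
    """Impulse response at tap p of the gamma memory (stable for 0 < mu < 2)."""
    taps = [[1.0] + [0.0] * n_max]        # tap 0 is the input u(n) = delta(n)
    for k in range(1, p + 1):
        prev, cur = taps[-1], [0.0] * (n_max + 1)
        for n in range(1, n_max + 1):
            cur[n] = (1 - mu) * cur[n - 1] + mu * prev[n - 1]
        taps.append(cur)
    return taps[p]

# Cross-check against g_p(n) = C(n-1, p-1) * mu^p * (1-mu)^(n-p) for n >= p:
mu, p, n_max = 0.5, 3, 10
sim = gamma_memory_impulse(mu, p, n_max)
closed = [comb(n - 1, p - 1) * mu**p * (1 - mu)**(n - p) if n >= p else 0.0
          for n in range(n_max + 1)]
```

The simulated recursion and the closed form agree term by term, and the response is identically zero for $n < p$, reflecting the $p$-sample delay through the cascade.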
BIBLIOGRAPHY
Sandberg, I. W., and Xu, L., 1997, “Uniform approximation and gamma net-
works,” Neural Networks, vol. 10, pp. 781–784.
Van Hulle, M. M., 2000, Faithful Representations and Topographic Maps:
From Distortion-to-Information-Based Self Organization (New York:
Wiley).
Wan, E. A., 1994, “Time series prediction by using a connectionist network
with internal delay lines," in A. S. Weigend and N. A. Gershenfeld, eds.,
Time Series Prediction: Forecasting the Future and Understanding the Past
(Reading, MA: Addison-Wesley), pp. 195–217.
Werbos, P. J., 1974, “Beyond regression: New tools for prediction and analy-
sis in the behavioral sciences,” Ph.D. Thesis, Harvard University,
Cambridge, MA.
Werbos, P. J., 1990, “Backpropagation through time: What it does and how
to do it,” Proc. IEEE, vol. 78, pp. 1550–1560.
Yee, P. V., 1998, “Regularized radial basis function networks: Theory and
applications to probability estimation, classification, and time series pre-
diction,” Ph.D. Thesis, McMaster University, Hamilton, Ontario.