
Neural Networks

MICHAEL I. JORDAN
Massachusetts Institute of Technology

CHRISTOPHER M. BISHOP
Aston University

Neural networks have emerged as a field of study within AI and engineering via the collaborative efforts of engineers, physicists, mathematicians, computer scientists, and neuroscientists. Although the strands of research are many, there is a basic underlying focus on pattern recognition and pattern generation, embedded within an overall focus on network architectures. Many neural network methods can be viewed as generalizations of classical pattern-oriented techniques in statistics and the engineering areas of signal processing, system identification, optimization, and control theory. There are also ties to parallel processing, VLSI design, and numerical analysis.

A neural network is first and foremost a graph, with patterns represented in terms of numerical values attached to the nodes of the graph and transformations between patterns achieved via simple message-passing algorithms. Certain of the nodes in the graph are generally distinguished as being input nodes or output nodes, and the graph as a whole can be viewed as a representation of a multivariate function linking inputs to outputs. Numerical values (weights) are attached to the links of the graph, parameterizing the input/output function and allowing it to be adjusted via a learning algorithm.

A broader view of a neural network architecture involves treating the network as a statistical processor, characterized by making particular probabilistic assumptions about data. Patterns appearing on the input nodes or the output nodes of a network are viewed as samples from probability densities, and a network is viewed as a probabilistic model that assigns probabilities to patterns. The problem of learning the weights of a network is thereby reduced to a problem in statistics: that of finding weight values that look probable in the light of observed data.

The links to statistics have proved important in practical applications of neural networks. Real-world problems are often characterized by complexities such as missing data, mixtures of qualitative and quantitative variables, regimes of qualitatively different functional relationships, and highly nonuniform noise. Neural networks are complex statistical models with the flexibility to address many of these complexities, but as with any flexible statistical model one must take care that the complexity of the network is adjusted appropriately to the problem at hand. A network that is too complex is not only hard to interpret but, by virtue of overfitting the random components of the data, can perform worse on future data than a simpler model. This issue is addressed via statistical techniques such as cross-validation, regularization, and averaging, as well as the use of an increasingly large arsenal of Bayesian methods. Other practical statistical issues that arise include the assignment of degrees of confidence to network outputs ("error bars"), the active choice of data points ("active learning"), and the choice between different network structures ("model selection"). Progress has been made on all these issues by applying and developing statistical ideas.

The statistical approach also helps in understanding the capabilities and limitations of network models and in extending their range. Neural networks can be viewed as members of the class of statistical models known as "nonparametric," and the general theory of nonparametric statistics is available to analyze network behavior. It is also of interest to note that many neural network architectures have close cousins in the nonparametric statistics literature; for example, the popular multilayer perceptron network is closely related to a statistical model known as "projection pursuit," and the equally popular radial basis function network has close ties to kernel regression and kernel density estimation.

A more thoroughgoing statistical approach, with close ties to "semiparametric" statistical modeling, is also available, in which not only the input and output nodes of a network but also the intermediate ("hidden") nodes are given probabilistic interpretations. The general notion of a mixture model, or more generally a latent variable model, has proved useful in this regard; the hidden units of a network are viewed as unobserved variables that have a parameterized probabilistic relationship with the observed variables (i.e., the inputs, the outputs, or both). This perspective has clarified the links between neural networks and a variety of graphical probabilistic approaches in other fields; in particular, close links have been forged with hidden Markov models, decision trees, factor analysis models, Markov random fields, and Bayesian belief networks. These links have helped to provide new algorithms for updating the values of nodes and the values of weights in a network; in particular, EM algorithms and stochastic sampling methods such as Gibbs sampling have been used with success.

Neural networks have found a wide range of applications, the majority of which are associated with problems in pattern recognition and control theory. Here we give a small selection of examples, focusing on applications in routine use.

The problem of recognizing handwritten characters is a challenging one that has been widely studied as a prototypical example of pattern recognition. Some of the most successful approaches to this problem are based on neural network techniques and have resulted in several commercial applications. Mass screening of medical images is another area in which neural networks have been widely explored; they form the basis for one of the leading systems for semi-automatic interpretation of cervical smears. As a third example of pattern recognition we mention the problem of verifying handwritten signatures, based on the dynamics of the signature captured during the signing process, where the leading approach to this problem is again based on neural networks.

An example of a control application based on neural networks involves the real-time adjustment of the plasma boundary shape in a tokamak fusion experiment, which requires several interdependent parameters to be controlled on time scales of a few tens of microseconds. Neural networks have also been applied to the real-time control of the mirror segments in adaptive-optics telescopes, used to cancel distortions due to atmospheric turbulence.

The prospects for neural networks seem excellent, given the increasing sophistication of the underlying theory, the increasing range of applicability of the techniques, and the growing scale of the applications that are being undertaken. The interdisciplinary nature of research in the field seems certain to persist and to bring new vigor into allied fields. Finally, future progress in theoretical neuroscience will provide a continuing impetus for the development and understanding of network models of intelligence.

Copyright © 1996, CRC Press.

ACM Computing Surveys, Vol. 28, No. 1, March 1996
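The graph-with-weights view described above can be made concrete with a small sketch. The code below (not from the article; the layer sizes, learning rate, regularization coefficient, and toy data are all illustrative assumptions) builds a two-layer network whose link weights parameterize a multivariate function from input nodes to output nodes, and adjusts those weights by gradient descent on a squared-error loss with a weight-decay penalty, one of the regularization techniques the article mentions for controlling model complexity.

```python
# Minimal sketch of a neural network as a weighted graph (illustrative only).
# Input nodes -> hidden nodes -> output node; weights on the links are adjusted
# by a learning algorithm (gradient descent with weight decay).
import numpy as np

rng = np.random.default_rng(0)

# Toy data: a smooth one-dimensional target observed with noise (assumed here).
X = np.linspace(-1.0, 1.0, 50).reshape(-1, 1)
y = np.sin(3.0 * X) + 0.1 * rng.normal(size=X.shape)

# Weights (numerical values attached to the links of the graph).
W1 = rng.normal(scale=0.5, size=(1, 8)); b1 = np.zeros(8)
W2 = rng.normal(scale=0.5, size=(8, 1)); b2 = np.zeros(1)

def forward(X):
    h = np.tanh(X @ W1 + b1)       # hidden-node values via message passing
    return h, h @ W2 + b2          # output-node values

def loss(pred, lam=1e-3):
    # Squared error plus a weight-decay (regularization) penalty,
    # discouraging overly complex fits to the random component of the data.
    return np.mean((pred - y) ** 2) + lam * (np.sum(W1**2) + np.sum(W2**2))

lr, lam, n = 0.1, 1e-3, len(X)
h, pred = forward(X)
initial = loss(pred, lam)

for _ in range(500):
    h, pred = forward(X)
    g_out = 2.0 * (pred - y) / n               # d(loss)/d(output)
    gW2 = h.T @ g_out + 2 * lam * W2
    gb2 = g_out.sum(axis=0)
    g_h = (g_out @ W2.T) * (1.0 - h ** 2)      # backpropagate through tanh
    gW1 = X.T @ g_h + 2 * lam * W1
    gb1 = g_h.sum(axis=0)
    W1 -= lr * gW1; b1 -= lr * gb1
    W2 -= lr * gW2; b2 -= lr * gb2

h, pred = forward(X)
final = loss(pred, lam)   # training reduces the regularized error
```

In the statistical reading of the article, minimizing this squared error corresponds to finding weight values that look probable under a Gaussian noise assumption, and the weight-decay term plays the role of a simple prior over weights.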

REFERENCES

BISHOP, C. M. 1995. Neural Networks for Pattern Recognition. Oxford University Press, New York.

HAYKIN, S. 1994. Neural Networks: A Comprehensive Foundation. Macmillan, New York.

HERTZ, J., KROGH, A., AND PALMER, R. G. 1991. Introduction to the Theory of Neural Computation. Addison-Wesley, Reading, MA.
