Common Misconceptions About Neural Networks As Approximators
INTRODUCTION
QUALITY OF FIT
Consider a problem with n independent variables, the components of the vector {x} = (x_1, x_2, ..., x_n)'. A total of N sets of the independent variables, referred to as design points, will be considered, {x}_j, j = 1, ..., N. At the design point {x}_j, let y_j be the value of the function to be approximated. The pairing of {x}_j and y_j is referred to as the jth training pair. Let ŷ_j be the value of the approximating function. The approximating function, ŷ, should closely match the function, y, not only at the design points, {x}_j, but over the entire region of interest.
Let ȳ be the average value of y at the design points. In this study, one measure of the closeness of fit to be considered is the nondimensional value v, where

v = \frac{\sqrt{\frac{1}{N}\sum_{j=1}^{N}(y_j - \hat{y}_j)^2}}{\bar{y}} \cdot 100 \qquad (2)
Overall Fit
Just because the approximating function exactly fits the function at the N design points does not guarantee that it gives a good fit over the region of interest. It is therefore desirable to have a measure of the quality of the overall fit over the region of interest. Consider N_G design points, referred to as grid points, scattered throughout the region of interest, and assume that if the approximating function closely matches the exact function at these points, a good approximation has been obtained throughout the entire region of interest. A measure of the quality of overall fit is taken as
v_G = \frac{\sqrt{\frac{1}{N_G}\sum_{j=1}^{N_G}(y_j - \hat{y}_j)^2}}{\bar{y}} \cdot 100 \qquad (3)
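As a small illustration, the error measures in (2) and (3) can be computed as follows (a NumPy sketch; the normalization by the mean ȳ follows the form assumed above):

```python
import numpy as np

def fit_error(y_true, y_pred):
    """Nondimensional error of (2)/(3): the RMS residual expressed as a
    percentage of the mean of the true responses (normalization assumed)."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    rms = np.sqrt(np.mean((y_true - y_pred) ** 2))
    return 100.0 * rms / np.mean(y_true)

# v   : fit_error evaluated at the N design points used for training
# v_G : fit_error evaluated at the N_G grid points covering the region of interest
```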
APPROXIMATIONS
Neural-Network Approximations
While the initial motivation for developing artificial neural nets was to
develop computer models that could imitate certain brain functions, neural
nets can be thought of as another way of developing approximations (these
approximations in many references are referred to as response surfaces).
Different types of neural networks are available (Rumelhart et al. 1986; Anderson et al. 1988), but the type of neural net considered in this paper is the feed-forward network with one hidden layer, as shown in Fig. 1. This type of neural net has been used previously to develop response surfaces (Vanluchene et al. 1990; Hajela et al. 1990; Swift et al. 1991; Berke et al. 1990; Rogers et al. 1992) and is capable, with enough nodes on the hidden layer, of approximating a wide range of functions.
FIG. 1. Neural Network (input, hidden, and output layers)
Output node k then receives inputs w_kj r_j, which are summed and used with a bias θ_k and an activation function to yield an output ŷ_k. Some variation of the delta-error back-propagation algorithm (Rumelhart et al. 1986; Anderson et al. 1988) is then used to adjust the weights on each learning cycle so as to reduce the difference between the predicted and desired outputs. In this investigation, studies were performed using the program NEWNET (Carpenter 1992b). NEWNET minimizes the sum of the squares of the residuals in (1) with respect to the weights and biases of the net. Training of the net is thus formulated as an unconstrained minimization problem. Solution of this minimization problem is performed using the method of Davidon, Fletcher, and Powell, a quasi-Newton method (Reklaitis et al. 1983; Fox 1971). That algorithm performs a series of one-dimensional searches along search directions. Search directions are determined by building an approximation to the inverse Hessian matrix using gradient information. Gradients required by that algorithm are obtained using back-propagation. One-dimensional searches are performed along the search directions using an interval-shortening routine.
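NEWNET itself is not reproduced here, but the training formulation above can be sketched as follows. This is a minimal illustration, assuming NumPy/SciPy, a tanh activation, and SciPy's BFGS routine standing in for the Davidon-Fletcher-Powell method; none of these choices is claimed to match NEWNET.

```python
import numpy as np
from scipy.optimize import minimize

def unpack(p, n_in, n_hid):
    """Split the flat parameter vector into weights and biases."""
    i = n_in * n_hid
    W1 = p[:i].reshape(n_hid, n_in)       # input -> hidden weights
    b1 = p[i:i + n_hid]                   # hidden-node biases
    W2 = p[i + n_hid:i + 2 * n_hid]       # hidden -> output weights w_kj
    b2 = p[-1]                            # output-node bias
    return W1, b1, W2, b2

def predict(p, X, n_hid):
    W1, b1, W2, b2 = unpack(p, X.shape[1], n_hid)
    r = np.tanh(X @ W1.T + b1)            # hidden-node outputs r_j
    return r @ W2 + b2                    # single output node

def sse(p, X, y, n_hid):
    """Sum of squared residuals, the quantity minimized during training."""
    return np.sum((y - predict(p, X, n_hid)) ** 2)

def train(X, y, n_hid, seed=0):
    """Training posed as unconstrained minimization over all weights and biases."""
    rng = np.random.default_rng(seed)
    n_par = X.shape[1] * n_hid + 2 * n_hid + 1
    p0 = rng.normal(scale=0.5, size=n_par)
    return minimize(sse, p0, args=(X, y, n_hid), method="BFGS").x
```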
Overdetermined Approximation
With N > m, (8) can be solved in a least-squares sense thus (Box et al.
1987; Khuri et al. 1987; Myers 1971)
[Z]'{Y} = [Z]'[Z]{b} (9)
or

{b} = ([Z]'[Z])^{-1} [Z]'{Y} (10)
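For example, a least-squares fit of an overdetermined full quadratic in two variables might be sketched as follows (a NumPy illustration; the particular response and design points are hypothetical):

```python
import numpy as np

def quadratic_design_matrix(X):
    """Columns of [Z] for a complete quadratic in two variables:
    1, x1, x2, x1^2, x1*x2, x2^2 (m = 6 undetermined parameters)."""
    x1, x2 = X[:, 0], X[:, 1]
    return np.column_stack([np.ones(len(X)), x1, x2, x1 ** 2, x1 * x2, x2 ** 2])

X = np.random.default_rng(1).uniform(-1.0, 1.0, size=(12, 2))  # N = 12 > m = 6
y = 1.0 + X[:, 0] - 2.0 * X[:, 1] ** 2                         # response to be fit
Z = quadratic_design_matrix(X)
b, *_ = np.linalg.lstsq(Z, y, rcond=None)   # least-squares solution of (9)-(10)
```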
Underdetermined Approximation
When N < m, the approximation is underdetermined. A solution can be obtained by choosing the terms of {b} so as to minimize the square of the residual as defined in (1). However, a direct solution can be obtained by using the concept of the pseudoinverse (Greville 1959; Penrose 1955). Assume that the rank of matrix [Z] is N and define the pseudoinverse of matrix [Z], [Z]*, thus

[Z]* = [Z]'([Z][Z]')^{-1} (11)

Solution of (8) is then

{b} = [Z]*{Y} (12)
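A corresponding sketch of the pseudoinverse solution (illustrative only; it relies on the full-row-rank assumption stated above) is:

```python
import numpy as np

def underdetermined_fit(Z, y):
    """{b} = [Z]*{Y} with [Z]* = [Z]'([Z][Z]')^-1, valid when [Z] has full
    row rank (N < m).  The fit is exact at the N design points, but other
    coefficient vectors also fit them exactly; this is the minimum-norm one."""
    Z_star = Z.T @ np.linalg.inv(Z @ Z.T)   # pseudoinverse of (11)
    return Z_star @ y                       # solution (12)

# np.linalg.pinv(Z) @ y returns the same minimum-norm solution.
```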
Types of Approximations
Overdetermined, exactly determined, and underdetermined approximations have, respectively, more, an equal number of, or fewer training pairs than undetermined parameters associated with the approximation. To have an approximation, ŷ, that closely matches the function, y, not only at the design points but over the region of interest, one should use an overdetermined approximation. Underdetermined and exactly determined approximations may give a good approximation at the design points, but not necessarily a good approximation over the region of interest. Recent studies (Carpenter 1992a) have indicated that approximations that are from 20% to 50% overdetermined tend to be computationally efficient. In other words, they are a good compromise between performing as few function evaluations as possible and still obtaining a good approximation with the chosen approximating function.
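To make this bookkeeping concrete, the following sketch tallies the number of undetermined parameters m for a complete polynomial and for a one-hidden-layer net, together with a 20% to 50% overdetermined training-set size, assuming that phrase means N is roughly 1.2m to 1.5m:

```python
from math import comb

def m_polynomial(n_vars, order):
    """Coefficients in a complete polynomial of the given order in n_vars variables."""
    return comb(n_vars + order, order)

def m_network(n_in, n_hidden, n_out=1):
    """Weights and biases of a one-hidden-layer feed-forward net."""
    return n_hidden * (n_in + 1) + n_out * (n_hidden + 1)

# Two-variable case used in the example below:
print(m_polynomial(2, 2))        # 6  (2nd-order polynomial)
print(m_network(2, 7))           # 29 (net with 7 nodes on the hidden layer)

m = m_network(2, 7)
print(round(1.2 * m), round(1.5 * m))   # roughly 35 to 44 training pairs
```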
MISCONCEPTIONS
A number of presentations and publications on the application of neural
networks to engineering problems indicate that several misconceptions exist
concerning neural networks as approximators. This paper points out these
misconceptions.
FIG. 4. Legend: 2nd-order poly, 3rd-order poly, 4th-order poly, 5th-order poly, 3-node net, 5-node net, 7-node net
Fig. 3 represents the functional relationship between VOL and the x_1 and x_2 coordinates of node 2. Using differing numbers of design points to build the approximations, polynomial approximations of various orders and neural-net approximations with varying numbers of nodes on the hidden layer were developed to approximate VOL. The approximations were then compared to the exact function at N_G = 961 points (a 31 x 31 evenly spaced grid of points), and the parameter v_G as defined in (3) was calculated. Fig. 4 gives the value of v_G versus the number of design points used to build each of the approximations examined. The approximations considered were polynomial approximations of orders two through five and neural nets with three, five, and seven nodes on the hidden layer. The various approximations have different numbers of undetermined parameters; these numbers (the parameter count m) are shown next to the curves in Fig. 4. Notice that the performance of the approximations is directly related to the number of undetermined parameters associated with each approximation. The 2nd-order-polynomial approximation, with six undetermined parameters, performed the poorest. The artificial neural net with seven nodes on the hidden layer, with 29 undetermined parameters, performed the best.
Note that one method of forming an approximation is not inherently superior to the other. On selected test problems, the performance of good polynomial approximations and of good neural-net approximations has been found to be comparable. Obtaining a good approximation for a given function, however, is not a simple task. One must search over a series of network architectures or over a family of polynomial functions to find a good approximation.
Searches over a family of polynomial functions can be accomplished, for example, by starting with a linear function and then examining functions of increasingly higher order. Some may find that such an approach yields more insight into the nature of the function being approximated than neural-network approximations do. Searches over network architecture can be readily accomplished by varying the number of nodes on a hidden layer or layers.
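The search over a family of polynomial functions might be sketched, for a one-variable example, as follows (an illustration only; the exponential test function and the normalization by the mean are assumptions):

```python
import numpy as np

x = np.linspace(0.0, 1.0, 21)          # design points
y = np.exp(x)                          # example function to be approximated
x_grid = np.linspace(0.0, 1.0, 201)    # grid points over the region of interest
y_grid = np.exp(x_grid)

for order in range(1, 6):              # start low, examine increasingly higher orders
    coeffs = np.polyfit(x, y, order)   # least-squares polynomial fit
    y_hat = np.polyval(coeffs, x_grid)
    v_g = 100.0 * np.sqrt(np.mean((y_grid - y_hat) ** 2)) / np.mean(y_grid)
    print(order, v_g)                  # accept the lowest order with acceptable v_G
```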
A misconception addressed here is that it is acceptable to train large nets with only a few training pairs. These nets can be trained so that the approximation exactly fits the function to be approximated at the design points, and thus can be trained to have v = 0, but there can be great variation, from one training to the next, in the approximations over the region of interest (i.e., there will be a large variation in the parameter v_G). The next example reemphasizes this point.
Fox (1971) investigated the function

y = 10x_1^4 - 20x_1^2 x_2 + 10x_2^2 + x_1^2 - 2x_1 + 5 \qquad (15)
which has banana-shaped contours, as seen in Fig. 7. Fig. 8 shows results using various neural-network approximations. In each case there are two input nodes receiving the x_1 and x_2 coordinates of the design point and one output node predicting the response ŷ. One hidden layer was considered, with three, five, or seven nodes on the hidden layer. Each net was trained three times using 16 training pairs. Fig. 8 gives, for each neural net, the lowest value of the parameter v_G obtained in the three trainings and the difference between the highest and lowest values of v_G in the three trainings. The underdetermined nets (five and seven nodes on the hidden layer) did not yield unique approximations. The three training runs for each of these nets all gave v = 0 (an exact fit at the design points) but yielded greatly different results over the region of interest. The net with three nodes on the hidden layer, which is overdetermined, gave far more consistent results.
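The experiment behind Fig. 8 can be sketched as follows. This is an illustration only: the region [-1, 1] x [-1, 1], the random placement of the 16 design points, the tanh activation, and the BFGS optimizer are assumptions, not the choices made in the paper.

```python
import numpy as np
from scipy.optimize import minimize

def banana(X):
    x1, x2 = X[:, 0], X[:, 1]
    return 10 * x1**4 - 20 * x1**2 * x2 + 10 * x2**2 + x1**2 - 2 * x1 + 5   # (15)

def net(p, X, h):
    """One-hidden-layer net: two inputs, h tanh hidden nodes, one output."""
    W1 = p[:2 * h].reshape(h, 2); b1 = p[2 * h:3 * h]
    W2 = p[3 * h:4 * h];          b2 = p[-1]
    return np.tanh(X @ W1.T + b1) @ W2 + b2

rng = np.random.default_rng(0)
X = rng.uniform(-1.0, 1.0, size=(16, 2))     # 16 training pairs
y = banana(X)
g = np.linspace(-1.0, 1.0, 31)
Xg = np.column_stack([a.ravel() for a in np.meshgrid(g, g)])   # grid points
yg = banana(Xg)

for h in (3, 5, 7):                          # 13, 21, and 29 parameters (4h + 1)
    v_gs = []
    for trial in range(3):                   # three trainings per architecture
        p0 = rng.normal(scale=0.5, size=4 * h + 1)
        p = minimize(lambda p: np.sum((y - net(p, X, h)) ** 2), p0, method="BFGS").x
        v_gs.append(100.0 * np.sqrt(np.mean((yg - net(p, Xg, h)) ** 2)) / np.mean(yg))
    print(h, min(v_gs), max(v_gs) - min(v_gs))   # lowest v_G and spread, as in Fig. 8
```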
FIG. 7. Fox's Banana Function
FIG. 8. Fox's Banana Function, Error Parameters for Three, Five, and Seven Nodes on Hidden Layer (lowest v_G and highest-minus-lowest v_G over three trainings; the three-node net is overdetermined, the five- and seven-node nets underdetermined)
CONCLUSION
This paper points out the danger of using underdetermined approximations. This danger is present whether the approximations are underdetermined polynomial approximations or underdetermined neural nets. The authors are not aware of any studies using underdetermined polynomial approximations. However, a number of reported studies have used nets with a large number of nodes on a hidden layer or layers (and thus a large number of associated undetermined parameters) trained with relatively few training pairs. These nets are thus underdetermined approximators. The variability in the approximations that can be obtained with such underdetermined nets has been emphasized. The paper further points out that design-point selection is important when training neural networks. Deficient design-point selection can also introduce variability into an approximation.
For the examples in Carpenter et al. (1993) and those of this study, overdetermined polynomials and neural nets with equivalent numbers of associated coefficients gave comparable results. It should be emphasized that only a limited set of examples, and only feed-forward nets with one hidden layer, were considered. While it is not possible to make a general statement based on these limited studies, it does seem that the selection of one type of approximation over the other can reasonably be based on personal preference. Training times of neural networks and polynomial approximations were discussed. While it was found that neural networks generally take much longer to train than polynomial approximations, it was pointed out that the ease with which a family of network architectures can be examined may more than compensate for the lengthy training times of those networks.
ACKNOWLEDGMENT
Selected examples from this paper were presented at the Third International Conference on the Application of Artificial Intelligence to Civil & Structural Engineering, Edinburgh, Scotland, August 17-19, 1993, and at ANNIE'93, Artificial Neural Networks in Engineering, Rolla, Missouri, November 14-17, 1993.
APPENDIX. REFERENCES
Anderson, J., and Rosenfeld, E. (1988). Neurocomputing: foundations of research.
MIT Press, Cambridge, Mass.
Baffes, P. T. (1989). NETS user's guide. Software Technology Branch, Lyndon B.
Johnson Space Center.
Berke, L., and Hajela, P. (1990). "Application of artificial neural nets in structural
mechanics." Shape and Layout Optimization of Structural Systems, Int. Center for
Mech. Sciences, Udine, Italy.
Box, G. E. P., and Draper, N. R. (1987). Empirical model-building and response
surfaces. John Wiley and Sons, New York, N.Y.
Carpenter, W. C. (1992a). "Effect of design selection on response surface performance."