
Artificial Intelligence in Engineering 9 (1995) 143-151
© 1995 Elsevier Science Limited
Printed in Great Britain.

Back-propagation neural networks for modeling complex systems

A. T. C. Goh
School of Civil & Structural Engineering, Nanyang Technological University, Nanyang Avenue, Singapore 2263

In complex engineering systems, empirical relationships are often employed to estimate design parameters and engineering properties. A complex domain is characterized by a number of interacting factors and their relationships are, in general, not precisely known. In addition, the data associated with these parameters are usually incomplete or erroneous (noisy). The development of these empirical relationships is a formidable task requiring sophisticated modeling techniques as well as human intuition and experience. This paper demonstrates the use of back-propagation neural networks to alleviate this problem. Back-propagation neural networks are a product of artificial intelligence research. First, an overview of the neural network methodology is presented. This is followed by some practical guidelines for implementing back-propagation neural networks. Two examples are then presented to demonstrate the potential of this approach for capturing nonlinear interactions between variables in complex engineering systems.

Keywords: back propagation, complex systems, cone penetration test, geotechnical engineering, modeling, neural networks, pile driving.

1 INTRODUCTION

In complex engineering systems, empirical relationships are often employed to estimate design parameters and engineering properties. Generally, a complex domain is characterized by a number of interacting factors in which the relationship between these factors is not precisely known. In addition, the data associated with these parameters are usually incomplete or erroneous (noisy). The extraction of knowledge from the data to develop these empirical relationships is a formidable task requiring sophisticated modeling techniques as well as human intuition and experience.

This paper demonstrates the use of back-propagation neural networks to alleviate this problem. The back-propagation neural network is a product of artificial intelligence research. The growing interest in neural networks among researchers is due to their excellent performance, in place of conventional techniques, in pattern recognition and in the modeling of nonlinear relationships involving a multitude of variables.

First, an overview of the neural network methodology is presented. This is followed by some practical guidelines for implementing back-propagation neural networks. Two practical examples in geotechnical engineering are then presented to demonstrate the potential of this approach for capturing nonlinear interactions between variables in complex engineering systems. The first example involves the analysis of data obtained from large scale laboratory tests on sand. The other example relates to the prediction of the ultimate load capacity of driven piles from actual field records.

2 NEURAL NETWORKS

2.1 Architecture

A neural network is a computer model whose architecture essentially mimics the knowledge acquisition and organizational skills of the human brain. A thorough treatment of neural network methodology is beyond the scope of this paper. The basic architecture of neural networks has been covered widely.1,2 A neural network consists of a number of interconnected processing elements, commonly referred to as neurons. The neurons are logically arranged into two or more layers as shown in Fig. 1, and interact with each other via weighted connections. These scalar weights determine the nature and strength of the influence between the interconnected neurons.
Each neuron is connected to all the neurons in the next layer. There is an input layer where data are presented to the neural network, and an output layer that holds the response of the network to the input. It is the intermediate layers, also known as hidden layers, that enable these networks to represent and compute complicated associations between patterns.

Fig. 1. A typical neural network architecture.

Neural networks essentially learn through the adaptation of their connection weights. A number of neural network learning strategies have been developed and are described elsewhere.1,2 In recent years, a number of neural network development tools capable of implementing these learning strategies have become commercially available.3 Some recent applications of neural networks in civil engineering include material modeling,4 damage assessment,5,6 structural analysis and design7-10 and seismic liquefaction assessment.11
2.2 Back-propagation algorithm

The neural network paradigm adopted in this study utilizes the back-propagation learning algorithm.12 Back-propagation neural networks with a single hidden layer have been shown to be capable of providing an accurate approximation of any continuous function provided there are sufficient hidden neurons.13 In back-propagation neural networks, the mathematical relationships between the various variables are not specified. Instead, they learn from the examples fed to them. In addition, they can generalize correct responses that only broadly resemble the data in the learning phase.

The basic mathematical concepts of the back-propagation algorithm are found in the literature.14,15 Training of the neural network is essentially carried out through the presentation of a series of example patterns of associated input and target (expected) output values. Each hidden and output neuron processes its inputs by multiplying each input by its weight, summing the product and then passing the sum through a nonlinear transfer function to produce a result. The S-shaped sigmoid curve is commonly used as the transfer function. The neural network learns by modifying the weights of the neurons in response to the errors between the actual output values and the target output values. This is carried out through gradient descent on the sum of squares of the errors for all the training patterns.12 The changes in weights are in proportion to the negative of the derivative of the error term. One pass through the set of training patterns along with the updating of the weights is called a cycle or epoch. Training is carried out by repeatedly presenting the entire set of training patterns (with the weights updated at the end of each cycle) until the average sum squared error over all the training patterns is minimized and within the tolerance specified for the problem. Details of the algorithm for adjusting the weights to minimize the average sum squared error are described in Caudill and Butler14 and Eberhart and Dobbins.15

At the end of the training phase, the neural network should correctly reproduce the target output values for the training data provided the errors are minimal, i.e. convergence occurs. The associated trained weights of the neurons are then stored in the neural network memory. In the next phase, the trained neural network is fed a separate set of data. In this testing phase, the neural network predictions (using the trained weights) are compared with the target output values. This assesses the ability of the neural network to generalize correct responses for testing patterns that only broadly resemble the data in the training set. No additional learning or weight adjustments occur during this phase. Once the training and testing phases are found to be successful, the neural network can then be put to use in practical applications. The neural network will produce almost instantaneous results of the output for the practical inputs provided. The predictions should be reliable provided the input values are within the range used in the training set.
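To make the computations above concrete, the following C fragment sketches one training cycle for a network of this kind. It is a minimal illustration only, not the program used in this study: the layer sizes, array names and learning rate are assumptions, and bias terms are omitted for brevity.

    #include <math.h>

    #define N_IN   3    /* assumed layer sizes, for illustration only */
    #define N_HID  4
    #define N_PAT 73    /* number of training patterns */

    /* S-shaped sigmoid transfer function */
    static double sigmoid(double x) { return 1.0 / (1.0 + exp(-x)); }

    double w_ih[N_HID][N_IN];   /* input-to-hidden weights  */
    double w_ho[N_HID];         /* hidden-to-output weights */

    /* One cycle (epoch): present every training pattern, accumulate the
       error gradient, update the weights once at the end of the cycle,
       and return the average sum squared error over all patterns. */
    double train_cycle(const double in[N_PAT][N_IN],
                       const double target[N_PAT], double rate)
    {
        double g_ih[N_HID][N_IN] = {{0}}, g_ho[N_HID] = {0}, sse = 0.0;

        for (int p = 0; p < N_PAT; p++) {
            /* forward pass: weighted sum, then sigmoid, at each neuron */
            double hid[N_HID], net_out = 0.0;
            for (int i = 0; i < N_HID; i++) {
                double net = 0.0;
                for (int j = 0; j < N_IN; j++)
                    net += w_ih[i][j] * in[p][j];
                hid[i] = sigmoid(net);
                net_out += w_ho[i] * hid[i];
            }
            double out = sigmoid(net_out);

            /* backward pass: deltas from the derivative of the error */
            double err   = target[p] - out;
            double d_out = err * out * (1.0 - out);
            for (int i = 0; i < N_HID; i++) {
                double d_hid = d_out * w_ho[i] * hid[i] * (1.0 - hid[i]);
                g_ho[i] += d_out * hid[i];
                for (int j = 0; j < N_IN; j++)
                    g_ih[i][j] += d_hid * in[p][j];
            }
            sse += err * err;
        }

        /* weight changes proportional to the negative error derivative */
        for (int i = 0; i < N_HID; i++) {
            w_ho[i] += rate * g_ho[i];
            for (int j = 0; j < N_IN; j++)
                w_ih[i][j] += rate * g_ih[i][j];
        }
        return sse / N_PAT;
    }

Repeatedly calling train_cycle until the returned error shows no further improvement corresponds to the training procedure described above.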
3 BACK-PROPAGATION IMPLEMENTATION STRATEGIES

The development of a back-propagation neural network model essentially involves a number of stages. First, the variables to be used as the input parameters for the neural network model have to be identified. This requires an understanding of the problem domain and may require insights from specialists in that field. To minimize the number of input parameters, statistical methods are sometimes used to identify the most significant variables in the model.16

The next stage involves gathering the data for use in training and testing the neural network. This requires a data set of case records containing the input patterns and the expected (target output) solution. The training set must provide a representative sample of the data containing the various distinct characteristics of the problem the neural network is likely to encounter in the finished application. A large training set reduces the risk of undersampling the nonlinear function but increases the training time. A general guide is to have at least five to ten training patterns for each weight.3 As neural networks learn linear relationships more efficiently, to reduce training time, 'one goal of data preparation is to reduce nonlinearity when we know its character and leave the hidden nonlinearities we don't understand for the neural network to resolve'.17 Hence if it is known that input X is inversely related to the output, a more efficient approach would be to use (1/X) as the input.
Preprocessing of the data is usually required before presenting the patterns to the neural network. This is necessary because the sigmoid transfer function modulates the output of each neuron to values between 0 and 1. Various normalization or scaling strategies have been proposed.17,18 The following normalization procedure is commonly adopted and was used in this study. For a variable with maximum and minimum values of Vmax and Vmin respectively, each value V is scaled to its normalized value A using

A = (V - Vmin)/(Vmax - Vmin)    (1)
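As a concrete sketch of this preprocessing step, eqn (1) can be coded as below in C (the language of the program in this study); the function and variable names are assumptions for illustration.

    /* Scale each value of a variable to the range 0-1 using eqn (1):
       A = (V - Vmin) / (Vmax - Vmin). */
    void normalize(const double v[], double a[], int n)
    {
        double vmin = v[0], vmax = v[0];
        for (int k = 1; k < n; k++) {      /* find Vmin and Vmax */
            if (v[k] < vmin) vmin = v[k];
            if (v[k] > vmax) vmax = v[k];
        }
        for (int k = 0; k < n; k++)
            a[k] = (v[k] - vmin) / (vmax - vmin);
    }

The same Vmin and Vmax would be retained so that network outputs can be rescaled back to engineering units.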
The data are then randomly separated into a training set and a testing set. Usually about one-third of the data are used as the testing set.3 Initially, random scalar weights are assigned to the neurons. The neural network is then fed the training patterns and learns through the adjustment of the weights. Training is carried out iteratively until the average sum squared error over all the training patterns is minimized.
There is currently no rule for determining the optimal number of neurons in the hidden layer except through experimentation. Using too few neurons impairs the neural network and prevents the correct mapping of input to output. Using too many neurons impedes generalization and increases training time.19 A common strategy, and the one used in this study, was to replicate the training several times, starting with two neurons and then increasing the number while monitoring the average sum squared error. Training is carried out until there is no significant improvement in the error.

As described earlier, the testing set of patterns is then used to verify the performance of the neural network on the satisfactory completion of the training. The testing phase assesses the quality of the neural network model and determines whether the neural network can generalize correct responses for patterns that only broadly resemble the data in the training set.
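A sketch of this testing phase in code: the trained weights are applied in forward passes only, with no weight adjustment. The fragment below assumes the definitions (N_IN, N_HID, w_ih, w_ho, sigmoid) from the training-cycle sketch in Section 2.2.

    /* Evaluate the trained network on the testing patterns: forward
       passes only, no weight adjustment. Returns the average sum
       squared error over the testing set. */
    double test_error(int n_test, const double in[][N_IN],
                      const double target[])
    {
        double sse = 0.0;
        for (int p = 0; p < n_test; p++) {
            double net_out = 0.0;
            for (int i = 0; i < N_HID; i++) {
                double net = 0.0;
                for (int j = 0; j < N_IN; j++)
                    net += w_ih[i][j] * in[p][j];
                net_out += w_ho[i] * sigmoid(net);
            }
            double err = target[p] - sigmoid(net_out);
            sse += err * err;
        }
        return sse / n_test;
    }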

4 NEURAL NETWORK PROGRAM

The back-propagation neural network program adopted in this study essentially followed the formulations of Eberhart and Dobbins.15 The program was written in the C language. Training was carried out until the average sum squared error over all the training patterns was minimized. This occurred after about 10 000-20 000 cycles of training. Training time on a 80486-33 MHz personal computer was usually between 5 and 10 min.

5 EXAMPLE APPLICATIONS

Two examples are now presented to demonstrate the potential of this approach for capturing nonlinear interactions between various parameters in complex civil engineering systems. The first example involves the analysis of data obtained from calibration chamber tests on sand. The other example relates to the prediction of the ultimate load capacity of driven piles. In both examples, actual field data were used in training the neural network. For brevity, only samples of the training and testing data have been included.

5.1 Example A

Cone penetration test (CPT) measurements are often used to determine the soil engineering parameters for use in foundation design. The relationships between the measured cone stresses and the soil properties are determined from empirical correlations. In sands, these correlations are commonly derived from large scale laboratory calibration chamber tests. The sand sample of known density is prepared in the chamber and then consolidated to the desired stresses. The cone is then pushed into the sample, and the cone tip resistance qc and the sleeve friction fs are measured. The engineering properties of the sample are determined from laboratory testing. The cone measurements are then correlated directly to the engineering properties.

CPT calibration chamber tests have been carried out by a number of researchers including Holden20 and Baldi et al.21 For this study, the experimental results obtained by Baldi et al.21 were used. Their experiments involved a comprehensive study of the behaviour and properties of Ticino sand under different stress and boundary conditions. From statistical analysis, they were able to establish correlations between qc and a number of engineering parameters. In this paper, the correlation between the tangent constrained modulus M0 during compression and qc for normally consolidated sand is considered. The correlation determined by Baldi et al.21 from statistical analysis is shown below:

M0/qc = 14.20 (σ'm/98.1)^-0.416 e^(-1.123 DR)    (2)

The mean effective stress σ'm, M0 and qc are in units of kPa, and the sand relative density DR is in decimals. This statistical correlation is used for comparison with the neural network predictions.
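For comparison purposes, eqn (2) is straightforward to evaluate in code. The C helper below is a sketch of that computation; the function name is an assumption, and the constants are as read from eqn (2):

    #include <math.h>

    /* Estimate the tangent constrained modulus M0 (kPa) from eqn (2):
       M0/qc = 14.20 (sigma_m/98.1)^-0.416 exp(-1.123 DR),
       with sigma_m and qc in kPa and DR in decimals (e.g. 0.75). */
    double constrained_modulus(double sigma_m, double qc, double dr)
    {
        double ratio = 14.20 * pow(sigma_m / 98.1, -0.416)
                     * exp(-1.123 * dr);
        return ratio * qc;
    }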
Table 1. Sample training and testing data for Example A

           DR (%)   σ'm (kPa)   qc (MPa)   M0 (MPa)
Training
            92.4      366.4       46.5      147.6
            92.9      221.2       39.1      119.6
            92.9       80.5       23.9       80.3
            74.6      221.3       26.1      106.1
            74.6      369.1       34.4      131.4
            74.9      516.3       40.7      144.3
            61.8      370.2       20.1      111.4
            63.4      515.3       25         120.5
            91.8       46.7       18.4       71.1
            75.8       46.4       10.9       66.3
Testing
            73.1       80.6       15.6       74.3
            92.9      219.8       36.2      118.1
            57.7      223.1       13.4       87.5
            61.8       81.6        9.1       62.6
            63.4       45.4        5.6       52.1
            55.8      370.1       15.5      112.2
            76.7      224.8       22.1      108.4
            56.4      522.1       19.9      128.4
            77.2      518.8       32.1      137.7
            51.2       85          7.3       60.8

The neural network consisted of three input neurons representing DR, σ'm and qc and a single output neuron representing M0. The DR values ranged from 16% to 96%, σ'm ranged from 26 to 458 kPa, qc was in the range 2-47 MPa, and M0 was in the range 16-150 MPa. A total of 73 training patterns and 29 testing patterns were used. Some sample training and testing patterns are shown in Table 1.
The average sum squared error plotted as a function of the training cycles is shown in Fig. 2 for the neural network with four hidden neurons. Experiments indicated that there was no significant improvement in convergence as the number of hidden neurons increased beyond four.

Fig. 2. Convergence characteristics during training for Example A with four hidden neurons.

The neural network predictions for the training and testing sets are shown in Fig. 3. The scatter of the predicted M0 values versus the measured M0 values was assessed using regression analysis. High coefficients of correlation for the training and testing data were obtained, as shown in Table 2. The results show that the neural network was successful in modeling the nonlinear relationship between M0 and the other parameters. A comparison of the correlation coefficients in Table 2 indicates that the neural network model is more reliable than the statistical model. This is also evident from the plots in Fig. 3.

Fig. 3. Comparison of predicted and measured M0 values.

Table 2. Summary of regression analysis results for Example A

Method            Coefficient of correlation
                  Training data    Testing data
Neural network        0.98             0.94
Eqn (2)               0.91             0.87

Table 3 shows the weights of the hidden-input layer connections and the hidden-output layer connections. The relative importance of the various input factors can be assessed by examining these connection weights of the neurons. This involves partitioning the hidden-output connection weights into components associated with each input neuron.22 The results are summarized in Table 3; an example of the computational process is shown in the Appendix. They indicate that DR is the most important input factor, followed by σ'm, with qc of less importance.
Table 3. Summary of connection weights for Example A

Hidden neurons                Weights
                        DR      σ'm      qc       M0
Hidden 1              -1.68     3.29     1.32     4.58
Hidden 2              -0.52    -0.23    -0.26    -0.49
Hidden 3              -4.02     2.12    -0.08    -5.74
Hidden 4              -1.76    -1.45     0.58    -2.65
Relative importance (%)  47.3   36.9     15.8      -

Since the neural network is capable of generalization, parametric studies can be carried out to evaluate the effects of the various input parameters DR, σ'm and qc on the output M0. This was done at the end of the testing phase, whereby the trained neural network was fed some hypothetical values of DR, σ'm and qc. The results are shown in Fig. 4, demonstrating the correlations between DR, σ'm, qc and the predicted M0 values.

Fig. 4. Results of typical parametric study (predicted constrained modulus (MPa) for qc = 36 MPa with DR = 20%, 40% and 60%).
5.2 Example B

The second example involves the estimation of the load capacity of driven piles. Pile driving formulae are commonly used to estimate the load capacity of driven piles. These formulae are essentially derived from impulse-momentum principles. The formulae assume that there is a correlation between the driving resistance and the ultimate load capacity of the pile Qu. The important factors influencing the load capacity include the hammer characteristics, the properties of the pile and soil, and the pile set s.

A number of pile driving formulae are widely used in actual practice. These include the Engineering News (EN) formula,23 the Hiley formula,24 and the Janbu formula.25 These formulae are summarized in Table 4, where W is the hammer weight, H is the hammer drop, L is the pile length, Wp is the pile weight, A is the pile cross-sectional area, and E is the pile modulus of elasticity. The derivation of these formulae is described by Whitaker,27 and is beyond the scope of this paper. The relative merits and reliability of the various conventional formulae have been discussed by others including Olson and Flaate,28 and will not be covered in this paper.

Table 4. Pile driving formulae

Formula                 Equation for Qu                        Remarks
Engineering News (EN)   Qu = WH/(s + c)                        c = 25 mm for gravity hammer
                                                               c = 2.5 mm for steam hammer
                                                               c = 2.5 Wp/W for steam hammer on very heavy piles
Hiley                   Qu = ef WH/(s + 0.5(c1 + c2 + c3))     ef, c1, c2, c3 and n are tabulated by Chellis26
                             x (W + n^2 Wp)/(W + Wp)
Janbu                   Qu = WH/(ku s)                         ku = Cd{1 + (1 + λe/Cd)^0.5}
                                                               λe = WHL/(AEs^2)
                                                               Cd = 0.75 + 0.15 Wp/W
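As an illustration of how the formulae in Table 4 are applied, they are transcribed into C below under the symbol definitions given in the text. The helper names are assumptions, and consistent units (and the tabulated Hiley coefficients) are the caller's responsibility:

    #include <math.h>

    /* Engineering News formula: Qu = WH/(s + c), with c from Table 4. */
    double qu_en(double w, double h, double s, double c)
    {
        return w * h / (s + c);
    }

    /* Hiley formula:
       Qu = ef WH/(s + 0.5(c1+c2+c3)) x (W + n^2 Wp)/(W + Wp);
       ef, c1, c2, c3 and n are tabulated by Chellis.26 */
    double qu_hiley(double ef, double w, double h, double wp, double n,
                    double s, double c1, double c2, double c3)
    {
        return ef * w * h / (s + 0.5 * (c1 + c2 + c3))
             * (w + n * n * wp) / (w + wp);
    }

    /* Janbu formula: Qu = WH/(ku s), with
       ku = Cd{1 + (1 + Le/Cd)^0.5}, Le = WHL/(A E s^2),
       Cd = 0.75 + 0.15 Wp/W. */
    double qu_janbu(double w, double h, double l, double a, double e,
                    double wp, double s)
    {
        double cd = 0.75 + 0.15 * wp / w;
        double le = w * h * l / (a * e * s * s);
        double ku = cd * (1.0 + sqrt(1.0 + le / cd));
        return w * h / (ku * s);
    }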
Table 5. Summary of range of values for Example B

Property                         Symbol   Range of values
Pile elastic modulus (GPa)       E        9.7-206.8
Pile length (m)                  L        4.4-32.4
Pile cross-sectional area (m2)   A        0.8 x 10^-3 to 0.37
Pile weight (kN)                 Wp       0.7-221
Hammer drop (m)                  H        0.5-4.1
Hammer weight (kN)               W        0.98-48.9
Pile set (mm)                    s        0.76-76.2
Hammer type                      Htype    Gravity or steam hammer

The training and testing data were drawn from actual case records compiled by Flaate29 for piles in cohesionless soils. These consisted of the results of load tests on timber, precast concrete and steel piles driven into sandy soils. Details of the range of values for the records are summarized in Table 5. A total of 59 patterns were randomly selected for the training phase and 35 patterns for the testing phase. The output neuron was the pile load capacity Qu. Some sample training and testing data are shown in Table 6.

Experiments were carried out using a number of combinations of input parameters to determine the most reliable neural network model. Generally the reliability of the model improved as the number of input parameters increased. The neural network model with eight input neurons representing E, L, A, Wp, H, W, s and the hammer type (Htype), and three hidden neurons, was found to be the most reliable. Htype was assigned a binary value of 1 for gravity hammers and a value of 0 for steam hammers.

The average sum squared error plotted as a function of the training cycles is shown in Fig. 5. The results indicate that convergence was achieved for the training phase.

Fig. 5. Convergence characteristics during training for Example B with three hidden neurons.

The neural network predictions for the training and testing sets are shown in Fig. 6. The scatter of the predicted Qu values versus the measured Qu values was assessed using regression analysis. High coefficients of correlation for the training and testing data were obtained, as shown in Table 7. The results from the testing phase suggest that although the model was not explicitly trained for these data, the neural network was capable of generalization and generally gave reasonable predictions. The results indicate that the neural network was successful in modeling the nonlinear relationship between Qu and the other parameters.

The neural network results in Fig. 6 showed less scatter in the data points in comparison to the conventional methods listed in Table 4. For brevity, the plots of the measured and predicted Qu using these conventional methods have been omitted. The coefficients of correlation of predicted versus measured results are summarized in Table 7. They indicate that the neural network predictions are more reliable than the conventional pile driving formulae.

Table 6. Sample training and testing data for Example B

E (GPa)   L (m)   A (x 10^-3 m2)   Wp (kN)   H (m)   W (kN)   s (mm)   Htype*   Qu (kN)
Training
  9.7     13.61       70.33          6.85     1.02    10.76    19.3       1       302.46
  9.7      9.7        50.65          4.89     0.74    13.34     9.65      0       329.15
 17.9      9.3       130.98         26.87     0.61    17.7      3.56      1      1076.42
206.8     32           4.59         16.01     0.91    22.24     0.76      0      1272.13
206.8     11.79        8.26         10.23     0.76    13.34     3.25      0       880.70
206.8     17.65       15.48         19.57     0.81    39.14     1.83      1      2757.76
206.8     17.78       33.42         43.06     0.81    39.14     6.1       1      1805.89
  9.7     13.61       70.33          6.85     1.02    10.76    22.35      1       311.36
  9.7      7.16       72.91          6.85     1.12    26.42    14.22      1       934.08
  9.7     24          232.27        44.04     1.52    39.14     8.13      1      1663.55
Testing
  9.7     19.3        187.11        26.69     1.02    39.14     2.03      1      1094.21
206.8     17.65       15.48         19.57     0.81    39.14     2.24      1      2633.22
  9.7     19.81      111.62         14.95     1.02    29.45    12.19      1       951.87
  9.7      9.3        46.45          3.38     0.64    13.34     8.64      0       329.15
  9.7     16.26       82.59          9.79     0.71    12.72     1.52      1       871.81
206.8     23.37       14.65         28.38     1.02    39.14    12.19      1      1103.10
206.8     17.65       12.9          18.59     1.02    17.61     1.93      1      2072.77
  9.7     23.88      187.11         33.27     1.02    39.14    10.16      1      1076.42
206.8     22.35       13.42         21.53     1.02    39.14     8.13      1      1975.12
 17.9     24.74      370.99        182.63     0.99    41.37     7.62      0      1227.65

* Gravity hammer = 1, steam hammer = 0.
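For illustration, a case record of the kind listed in Table 6 can be held in a plain C struct, with the qualitative hammer type coded as the binary value described in the text (the field names are assumptions):

    /* One pile load-test case record (units as in Table 5). */
    struct pile_record {
        double e;      /* pile elastic modulus, GPa */
        double l;      /* pile length, m */
        double a;      /* cross-sectional area, 10^-3 m2 */
        double wp;     /* pile weight, kN */
        double h;      /* hammer drop, m */
        double w;      /* hammer weight, kN */
        double s;      /* pile set, mm */
        double htype;  /* 1 = gravity hammer, 0 = steam hammer */
        double qu;     /* measured pile load capacity, kN (target) */
    };

    /* First training record from Table 6. */
    static const struct pile_record example = {
        9.7, 13.61, 70.33, 6.85, 1.02, 10.76, 19.3, 1.0, 302.46
    };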
Fig. 6. Comparison of predicted and measured Qu values.

Table 7. Summary of regression analysis results for Example B

Method                  Coefficient of correlation
                        Training data    Testing data
Neural network              0.96             0.97
Engineering News (EN)       0.69             0.61
Hiley                       0.48             0.76
Janbu                       0.82             0.89

6 DISCUSSION

Statistical methods are commonly used in the development of empirical relationships between various interacting factors. This is often complex and circuitous, particularly for nonlinear relationships. Also, to formulate the statistical model, the important parameters must be known. By comparison, the modeling process in back-propagation neural networks is more direct, as there is no necessity to specify a mathematical relationship between the input and output variables. Neural networks can be effective for analysing a system containing a number of variables, to establish patterns and characteristics not previously known. In addition, they can generalize correct responses that only broadly resemble the data in the training set. During training, irrelevant input variables are assigned low connection weights; these variables can then be omitted from the model. In neural networks, quantitative as well as qualitative information can be considered. This was illustrated in Example B, where qualitative information related to the hammer type Htype was incorporated into the model. Since the neural networks are trained on actual test data, they are trained to deal with inherently noisy or imprecise data. As new data become available, the neural network model can be readily updated by retraining with patterns which include these new data.

The main criticism of the neural network methodology is its inability at present to trace and explain the step-by-step logic it uses to arrive at the outputs from the inputs provided. This is expected to be a temporary drawback that will be overcome with further research.

7 SUMMARY

This study demonstrates the feasibility of using neural networks for capturing nonlinear interactions between various parameters in complex civil engineering systems. A simple back-propagation neural network was used to model two problems involving nonlinear variables. Actual field data were used. After learning from a set of selected patterns, the neural network models were able to produce reasonably accurate predictions.

REFERENCES

1. Rumelhart, D. E. & McClelland, J. L. Parallel Distributed Processing - Explorations in the Microstructure of Cognition, Vols 1 and 2. MIT Press, Cambridge, MA, 1986.
2. Lippmann, R. P. An introduction to computing with neural nets. IEEE Acoust. Speech Signal Process. Mag., 4(2) (1987) 4-22.
3. Hammerstrom, D. Working with neural networks. IEEE Spectrum, July (1993) 46-53.
4. Ghaboussi, J., Garrett, J. H. Jr & Wu, X. Knowledge-based modeling of material behavior with neural networks. J. Engng Mech. Div. ASCE, 117(1) (1991) 132-53.
5. Elkordy, M. F., Chang, K. C. & Lee, G. C. Neural networks trained by analytically simulated damage states. J. Comput. Civil Engng ASCE, 7(2) (1993) 130-45.
6. Yeh, Y. C., Kuo, Y. H. & Hsu, D. S. Building KBES for diagnosing PC pile with artificial neural network. J. Comput. Civil Engng ASCE, 7(1) (1993) 71-93.
7. Vanluchene, D. & Sun, R. Neural networks in structural engineering. Microcomp. Civil Engng, 5(3) (1990) 207-15.
8. Hajela, P. & Berke, L. Neurobiological computational models in structural analysis and design. Comput. Struct., 41(4) (1991) 657-67.
9. Hajela, P., Fu, B. & Berke, L. Neural networks in structural analysis and design: an overview. Comput. Syst. Engng, 3(1-4) (1992) 525-38.
10. Gunaratnam, D. J. & Gero, J. S. Effect of representation on the performance of neural networks in structural engineering applications. Microcomp. Civil Engng, 9(2) (1994) 97-108.
11. Goh, A. T. C. Seismic liquefaction potential assessed by neural networks. J. Geot. Engng ASCE, 120(9) (1994) 1467-80.
12. Rumelhart, D. E., Hinton, G. E. & Williams, R. J. Learning internal representations by error propagation. In Parallel Distributed Processing, ed. D. E. Rumelhart & J. L. McClelland. MIT Press, Cambridge, MA, 1986.
13. Hornik, K. Approximation capabilities of multilayer feedforward networks. Neural Networks, 4(2) (1991) 251-7.
14. Caudill, M. & Butler, C. Naturally Intelligent Systems. MIT Press, Cambridge, MA, 1990.
15. Eberhart, R. C. & Dobbins, R. W. Neural Network PC Tools: A Practical Guide. Academic Press, San Diego, 1990.
16. Stein, R. Selecting data for neural networks. AI Expert, 8(2) (1993) 42-7.
17. Crooks, T. Care and feeding of neural networks. AI Expert, 7(9) (1992) 36-41.
18. Masters, T. Practical Neural Network Recipes in C++. Academic Press, San Diego, 1993.
19. Bailey, D. & Thompson, D. How to develop neural network applications. AI Expert, 5(6) (1990) 38-47.
20. Holden, J. The calibration of electrical penetrometers in sand. Internal report 152108-2, Norwegian Geotechnical Institute, Oslo, 1976.
21. Baldi, G., Bellotti, R., Ghionna, V. N., Jamiolkowski, M. & Pasqualini, E. Interpretation of CPTs and CPTUs - 2nd part: drained penetration of sands. Proc. 4th Int. Geotech. Seminar on Field Instrumentation and In-situ Measurements, Nanyang Technological Institute, Singapore, 1986, pp. 143-56.
22. Garson, G. D. Interpreting neural-network connection weights. AI Expert, 6(7) (1991) 47-51.
23. Wellington, A. M. The iron wharf at Fort Monroe, VA. Trans. ASCE, 27 (1892) 129-37.
24. Hiley, A. The efficiency of the hammer blow, and its effects with reference to piling. Engineering, 2 June (1922) 673.
25. Janbu, N. Une analyse energetique du battage des pieux a l'aide de parametres sans dimension. Norwegian Geotechnical Institute, Oslo, 1953, pp. 63-4 (in Norwegian).
26. Chellis, R. D. Pile Foundations, 2nd edn. McGraw-Hill, New York, 1961.
27. Whitaker, T. The Design of Piled Foundations. Pergamon Press, Oxford, 1970.
28. Olson, R. E. & Flaate, K. S. Pile-driving formulas for friction piles in sand. J. Soil Mech. Foundat. Div. ASCE, 93(6) (1967) 279-96.
29. Flaate, K. S. An investigation of the validity of three pile driving formulae in cohesionless material. Norwegian Geotechnical Institute, Oslo, 1964, pp. 11-22.

APPENDIX. EXAMPLE ILLUSTRATING THE PARTITIONING OF WEIGHTS

This appendix details the procedure for partitioning the connection weights to determine the relative importance of the various inputs, using the method proposed by Garson.22 The method essentially involves partitioning the hidden-output connection weights of each hidden neuron into components associated with each input neuron.

Consider, as an example, the neural network with three input neurons, four hidden neurons and one output neuron with the connection weights shown below.

Hidden neurons                    Weights
              Input 1      Input 2      Input 3      Output
Hidden 1     -1.67624      3.29022      1.32466      4.57857
Hidden 2     -0.51874     -0.22921     -0.25526     -0.48815
Hidden 3     -4.01764      2.12486     -0.08168     -5.73901
Hidden 4     -1.75691     -1.44702      0.58286     -2.65221

The computation process is as follows:

(1) For each hidden neuron i, multiply the absolute value of the hidden-output layer connection weight by the absolute value of the hidden-input layer connection weight. Do this for each input variable j. The following products Pij are obtained:

              Input 1                   Input 2                   Input 3
Hidden 1     P11 = 1.67624 x 4.57857   P12 = 3.29022 x 4.57857   P13 = 1.32466 x 4.57857
Hidden 2     P21 = 0.51874 x 0.48815   P22 = 0.22921 x 0.48815   P23 = 0.25526 x 0.48815
Hidden 3     P31 = 4.01764 x 5.73901   P32 = 2.12486 x 5.73901   P33 = 0.08168 x 5.73901
Hidden 4     P41 = 1.75691 x 2.65221   P42 = 1.44702 x 2.65221   P43 = 0.58286 x 2.65221

(2) For each hidden neuron, divide Pij by the sum over all the input variables to obtain Qij. For example, for Hidden 1, Q11 = P11/(P11 + P12 + P13) = 0.266445.

(3) For each input neuron, sum the Qij obtained from the previous computations to form Sj. For example, S1 = Q11 + Q21 + Q31 + Q41.

              Input 1           Input 2           Input 3
Hidden 1     Q11 = 0.266445    Q12 = 0.522994    Q13 = 0.210560
Hidden 2     Q21 = 0.517081    Q22 = 0.228478    Q23 = 0.254441
Hidden 3     Q31 = 0.645489    Q32 = 0.341388    Q33 = 0.013123
Hidden 4     Q41 = 0.463958    Q42 = 0.382123    Q43 = 0.153919
Sum          S1 = 1.892973     S2 = 1.474983     S3 = 0.632044

(4) Divide Sj by the sum for all the input variables. Expressed as a percentage, this gives the relative importance or distribution of all output weights attributable to the given input variable. For example, for input neuron 1, the relative importance is equal to (S1 x 100)/(S1 + S2 + S3) = 47.3%.

                          Input 1   Input 2   Input 3
Relative importance (%)     47.3      36.9      15.8
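The four steps above translate directly into code. The following C sketch is a transcription of the Garson22 procedure for the example weights; run as written it prints 47.3, 36.9 and 15.8, matching the appendix (the function and variable names are assumptions):

    #include <math.h>
    #include <stdio.h>

    #define N_IN  3
    #define N_HID 4

    /* Garson's partitioning of connection weights: fills rel[j] with
       the relative importance (%) of each input neuron j. */
    void garson(const double w_ih[N_HID][N_IN], const double w_ho[N_HID],
                double rel[N_IN])
    {
        double s[N_IN] = {0};

        for (int i = 0; i < N_HID; i++) {
            /* steps 1-2: products P_ij = |w_ih| x |w_ho|, normalized
               per hidden neuron to give Q_ij */
            double p[N_IN], row = 0.0;
            for (int j = 0; j < N_IN; j++) {
                p[j] = fabs(w_ih[i][j]) * fabs(w_ho[i]);
                row += p[j];
            }
            for (int j = 0; j < N_IN; j++)
                s[j] += p[j] / row;          /* step 3: S_j = sum of Q_ij */
        }

        double total = 0.0;
        for (int j = 0; j < N_IN; j++) total += s[j];
        for (int j = 0; j < N_IN; j++)
            rel[j] = 100.0 * s[j] / total;   /* step 4: percentages */
    }

    int main(void)
    {
        const double w_ih[N_HID][N_IN] = {
            {-1.67624,  3.29022,  1.32466},
            {-0.51874, -0.22921, -0.25526},
            {-4.01764,  2.12486, -0.08168},
            {-1.75691, -1.44702,  0.58286}};
        const double w_ho[N_HID] = {4.57857, -0.48815, -5.73901, -2.65221};
        double rel[N_IN];

        garson(w_ih, w_ho, rel);
        /* prints 47.3 36.9 15.8, matching the appendix */
        printf("%.1f %.1f %.1f\n", rel[0], rel[1], rel[2]);
        return 0;
    }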
