ABSTRACT: This paper presents two approaches for the utilization of neural networks in the identification of dynamical systems. In the first approach, a Hopfield network is used to implement a least-squares estimation for time-varying and time-invariant systems. The second approach, which is in the frequency domain, utilizes a set of orthogonal basis functions and Fourier analysis to construct a dynamic system in terms of its Fourier coefficients. Mathematical formulations are presented along with simulation results.

Presented at the 1989 American Control Conference, Pittsburgh, Pennsylvania, June 21-23, 1989. Reynold Chu and Rahmat Shoureshi are with the School of Mechanical Engineering and Manoel Tenorio is with the School of Electrical Engineering, Purdue University, West Lafayette, IN 47907.

Introduction

Artificial neural networks offer the advantage of performance improvement through learning using parallel and distributed processing. These networks are implemented using massive connections among processing units with variable strengths, and they are attractive for applications in system identification and control.

Recently, Hopfield and Tank [1], [2] demonstrated that some classes of optimization problems can be programmed and solved on neural networks. They have been able to show the power of neural networks in solving difficult optimization problems. However, a globally optimal solution is not guaranteed because the shape of the optimization surface can have many local optima. The object of this paper is to indicate how to apply a Hopfield network to the problem of linear system identification. By measuring inputs, state variables, and time derivatives of state variables, a procedure is presented for programming a Hopfield network. The states of the neurons of this network will converge to the values of the system parameters, which are to be identified. Results are presented from computer simulations for the identification of time-invariant and time-varying plants.

Tank and Hopfield [2] have also shown that the network can solve signal decomposition problems in which the goal is to calculate the optimal fit of an integer-coefficient combination of basis functions (possibly a nonorthogonal set) for an analog signal. However, there are advantages in choosing orthogonal basis functions because of the finality of the coefficients [3]. This paper presents how trigonometric functions can be used so that the Fourier transform of the signal can be generated. As shown, it is simpler to pose this problem in the form of an adaptive linear combiner (ALC), which was originally proposed in [4], [5]. The paper describes that, by proper selection of learning constants and the time for updating network weights, we can transform the continuous version of the Widrow-Hoff rule into Newton's method in one step, i.e., weight convergence is achieved in only one update cycle. The convenience provided by this approach is due to the orthogonality of the basis functions. Computer simulations are conducted for two cases: (1) signal decomposition of a flat spectrum with nonzero mean and linear phase delay, and (2) identification of the frequency response of a mass-spring-damper system subject to periodic input pulses.

Hopfield Neural Model

The Hopfield model [6], [7] consists of a number of mutually interconnected processing units called neurons, whose outputs V_i are nonlinear functions g of their states u_i. The outputs can take discrete or continuous values in bounded intervals. For our purposes, we consider only the continuous version of the Hopfield model. In this case, neurons change their states u_i according to the following dynamic equation, where T_ij are the weights, R_i the ith neuron input impedance, and I_i the bias input.

    du_i/dt = Σ_{j=1}^{N} T_ij V_j − u_i/R_i + I_i    (1)

It is assumed that all neurons have the same capacitance, thus C is not included in Eq. (1). The dynamics are influenced by the learning rate λ and the nonlinear function g.

    V_i = g(λ u_i)    (2a)

    u_i = (1/λ) g^{-1}(V_i)    (2b)

The nonlinear function g(·) is called the sigmoid function and takes the values of s and −s as x approaches +∞ and −∞. As λ increases, g approaches a scaled, shifted step function.

Consider the following energy function E.

    E = −(1/2) Σ_i Σ_j T_ij V_i V_j + Σ_i (1/R_i) ∫_0^{V_i} (1/λ) g^{-1}(V) dV − Σ_i I_i V_i    (3)

Hopfield [7] has shown that, if the weights T are symmetric (T_ij = T_ji), then this energy function has a negative time gradient. This means that the evolution of the dynamic system (1) in state space always seeks the minima of the energy surface E. Integration of Eqs. (1) and (3) shows that the outputs V_i do follow gradient descent paths on the E surface.

System Identification Using the Hopfield Network

The mean-square error typically is used as a performance criterion in system identification. Our motivation is to study whether it is possible to express system identification problems in the form of programming a Hopfield optimization network. The system identification discussed here is called parametric identification, which means estimating the parameters of a mathematical model of a system or process. Figure 1 shows the proposed structure for system identification in the time domain. The dynamics of the plant (to be identified) are defined by the usual equations, where A_p and B_p are unknown matrices and x and u are the state and control.

    ẋ = A_p x + B_p u    (4)

The dynamic equation of the adjustable system depends on e, which is the error vector between actual system states x and estimated values y.

    ẏ = A_s(e, t) x + B_s(e, t) u − K e    (5)

Therefore, the error dynamic equation is a function of state and control.

    ė = (A_p − A_s) x + (B_p − B_s) u + K e    (6)
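Before programming the identification network, the gradient-descent property of the continuous Hopfield model, Eqs. (1)-(3), can be checked numerically. The following sketch is an illustration, not the authors' code: the network size, weight values, impedance, bias, and sigmoid gain are arbitrary assumptions. It integrates Eq. (1) by forward Euler for a symmetric weight matrix and records the energy of Eq. (3), which should be non-increasing along the trajectory.

```python
import numpy as np

# Numerical check of the gradient-descent property of the continuous
# Hopfield model, Eqs. (1)-(3). Network size, weights, impedance, bias,
# and sigmoid gain are arbitrary assumptions for this sketch.
rng = np.random.default_rng(0)
N = 4
T = 0.5 * rng.normal(size=(N, N))
T = (T + T.T) / 2                 # symmetric weights, T_ij = T_ji
np.fill_diagonal(T, 0.0)
R = 1.0                           # neuron input impedance R_i (same for all)
I = 0.1 * rng.normal(size=N)      # bias inputs I_i
lam = 1.0                         # learning rate (sigmoid gain) lambda
s = 1.0                           # sigmoid saturation level

g = lambda x: s * np.tanh(x)      # sigmoid: -> +s / -s as x -> +inf / -inf

def int_ginv(v):
    # Closed form of the integral of g^{-1}(V) = arctanh(V/s) from 0 to v.
    return v * np.arctanh(v / s) + (s / 2) * np.log(1.0 - (v / s) ** 2)

def energy(V):
    # Eq. (3): quadratic weight term + input-impedance term + bias term.
    return -0.5 * V @ T @ V + np.sum(int_ginv(V)) / (R * lam) - I @ V

u = 0.1 * rng.normal(size=N)      # initial neuron states
dt = 1e-3
energies = []
for _ in range(5000):
    V = g(lam * u)                       # Eq. (2a)
    energies.append(energy(V))
    u = u + dt * (T @ V - u / R + I)     # forward Euler step of Eq. (1)
print("energy start/end:", energies[0], energies[-1])
```

Any symmetric weight matrix works here; the monotone energy decrease is a property of the dynamics, not of the particular constants chosen.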
Fig. 1. Proposed time-domain system identification scheme.

The goal is to minimize simultaneously the square-error rates of all states utilizing a Hopfield network. To ensure global convergence of the parameters, the energy function of the network must be quadratic in terms of the parameter errors, (A_p − A_s) and (B_p − B_s). However, the error rates ė in Eq. (6) are functions of the parameter errors and the state errors. The state error depends on y, which, in turn, is influenced by A_s and B_s. Hence, an energy function based on e will have a recurrent relation with A_s and B_s. To avoid this, we use the following energy function, where tr defines the trace of a matrix and (·)^T is the transpose of a matrix [8].

    E = (1/T) ∫_0^T (1/2) (ẋ − A_s x − B_s u)^T (ẋ − A_s x − B_s u) dt    (7)

Equation (7) is quadratic in terms of A_s and B_s. Substituting A_p x + B_p u for ẋ in Eq. (7) indicates that E is also a quadratic function of the parameter errors. Based on Eq. (7), we can program a Hopfield network that has neurons with their states representing different elements of the A_s and B_s matrices; expanding Eq. (7), the terms linear in the adjustable parameters are − tr A_s [(1/T) ∫_0^T x ẋ^T dt] − tr B_s [(1/T) ∫_0^T u ẋ^T dt], and matching the expansion term by term with Eq. (3) yields the connection weights and bias inputs of the network. From the convergence properties of the Hopfield network, the equilibrium state is achieved when the partial derivatives ∂E/∂A_s and ∂E/∂B_s are zero. This results in the following, where A_s* and B_s* are optimum solutions of the estimation problem.

    (A_p − A_s*) [(1/T) ∫_0^T x x^T dt] + (B_p − B_s*) [(1/T) ∫_0^T u x^T dt] = 0

    (A_p − A_s*) [(1/T) ∫_0^T x u^T dt] + (B_p − B_s*) [(1/T) ∫_0^T u u^T dt] = 0    (8)

It is assumed that the neuron input impedance R_i is high enough so that the second term in Eq. (1) can be neglected. Therefore, A_s* approaches A_p, and B_s* approaches B_p, if, and only if, the following is true,

    det { (1/T) ∫_0^T [x^T u^T]^T [x^T u^T] dt } ≠ 0    (9)

which holds if, and only if, x(t) and u(t) are linearly independent on [0, T]. Landau [10] gives a detailed discussion. When condition (9) is satisfied, state convergence follows the parameter convergence.

To show applications of the preceding network for system identification, a second-order plant is considered. In the case of the time-invariant system, the following A_p and B_p matrices are to be identified.

    A_p = [ -0.9425    12.56   ]
          [ -12.56    -0.9425  ]

Figure 2 shows the simulation results of the system identification. As shown, it takes about 0.2 sec for the network to learn the system parameters, even though their initial estimates may have opposite polarities. This figure represents only A_p11; similar results may be obtained for other entries of the A_p and B_p matrices.

For the case of time-varying systems, the A matrix has a low-frequency harmonic variation represented by

    A_p = [ -0.9425    12.56   ] (1 + 0.1 sin 2π(0.025)t)
          [ -12.56    -0.9425  ]

Fig. 2. Identification of a time-invariant plant.
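The equilibrium that the identification network settles to can be reproduced without the analog circuit by solving the optimality conditions ∂E/∂A_s = ∂E/∂B_s = 0 of Eq. (7) directly. The sketch below is an illustration under stated assumptions: the A_p matrix is the time-invariant example from the text, but the B_p matrix and the excitation signal are arbitrary choices (the paper does not give them), and the plant is integrated by forward Euler rather than an analog simulation.

```python
import numpy as np

# Solve the least-squares optimality conditions of the identification
# energy function (Eq. (7)) for the time-invariant example plant.
# B_p and the input u(t) are assumed for this sketch; A_p is from the text.
A_p = np.array([[-0.9425, 12.56],
                [-12.56, -0.9425]])
B_p = np.array([[1.0], [0.5]])           # assumed; not given in the paper
dt, T_end = 1e-4, 2.0
steps = int(T_end / dt)

x = np.zeros((2, 1))
X, U, Xdot = [], [], []
for k in range(steps):
    t = k * dt
    # Two-tone excitation so that x(t) and u(t) are linearly independent
    # (the persistent-excitation condition of the text).
    u = np.array([[np.sin(2*np.pi*1.0*t) + 0.3*np.cos(2*np.pi*3.0*t)]])
    xdot = A_p @ x + B_p @ u             # plant dynamics, Eq. (4)
    X.append(x.copy()); U.append(u.copy()); Xdot.append(xdot.copy())
    x = x + dt * xdot                    # forward Euler integration

X = np.hstack(X); U = np.hstack(U); Xdot = np.hstack(Xdot)
Z = np.vstack([X, U])                    # stacked regressor [x; u]

# Normal equations: [A_s* B_s*] (int Z Z^T dt) = (int xdot Z^T dt)
theta = (Xdot @ Z.T) @ np.linalg.inv(Z @ Z.T)
A_s, B_s = theta[:, :2], theta[:, 2:]
print(np.round(A_s, 3))
```

Because the state derivatives are recorded exactly, the recovered A_s and B_s match A_p and B_p to numerical precision; in the paper's scheme the Hopfield network's gradient descent converges to this same solution.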
For the time-invariant case, an infinite memory is used; whereas, for the case of the time-varying system, an exponentially decaying window using a first-order filter is utilized to ensure fast convergence in the presence of changing plant parameters. This window has the effect of emphasizing the most current estimates of error, rather than the past memory. High-frequency fluctuations observed in Fig. 3 may be alleviated by using second- or higher-order filters.

Signal Decomposition and Frequency-Response Identification

Hopfield demonstrated that the network can solve signal decomposition problems in which a signal is approximated by a linear combination of a specific set of basis functions. The basis functions Hopfield used are not necessarily orthogonal. We prefer using orthogonal functions for the sake of simplicity and finality of the coefficients of the combination. By finality, we mean that when the approximation is improved by increasing the number of basis functions from n to n + 1, the coefficients of the original basis functions remain unchanged. In particular, we have used sines and cosines as the basis functions. The computational network then will be a Fourier analysis circuit. To estimate f(t) by means of a finite-order function x_m(t), the following energy function is formulated, where x_m(t) can be expressed by a Fourier series of cos ω_n t and sin ω_n t, and ω_n = 2πn/T and n = 1, 2, . . . , m.

    E = (1/2T) ∫_{t0}^{t0+T} [f(t) − x_m(t)]² dt    (10)

Writing x_m(t) = W^T X(t), where X(t) = {1, cos ω_1 t, sin ω_1 t, . . . , cos ω_m t, sin ω_m t}^T and W is the weight vector, the input correlation matrix R and cross-correlation vector P are

    R = (1/T) ∫_{t0}^{t0+T} X(t) X^T(t) dt

    P = (1/T) ∫_{t0}^{t0+T} f(t) X(t) dt    (11)

These results can be used to program Hopfield's network. However, since orthogonality of the basis functions produces a diagonal R matrix, it would be simpler to use Widrow's ALC than a Hopfield network. The gradient of the mean-square-error surface, ∂E/∂W, is RW − P. The optimal W* is obtained by setting this gradient to zero, resulting in W* = R^{-1}P. The optimal weights we obtained are indeed the Fourier coefficients of the original signal, f(t), which can be rewritten as W* = W − R^{-1}∇, where ∇ is the gradient of the energy surface, ∂E/∂W, and can be represented by

    ∇ = −(1/T) ∫_{t0}^{t0+T} {e(t), e(t) cos ω_1 t, e(t) sin ω_1 t, . . . , e(t) cos ω_m t, e(t) sin ω_m t}^T dt    (12)

The integrand in Eq. (12) is the instantaneous estimation of the gradient vector in the least-mean-squares (LMS) algorithm used to estimate the unknown parameters. Based on orthogonality of the basis functions and application of Eq. (12), W* can be represented by

    W* = W + (1/T) ∫_{t0}^{t0+T} {e(t), 2e(t) cos ω_1 t, 2e(t) sin ω_1 t, . . . , 2e(t) cos ω_m t, 2e(t) sin ω_m t}^T dt    (13)

Therefore, the famous Widrow-Hoff LMS rule for this problem takes the continuous form

    dW/dt = (1/T) {e(t), 2e(t) cos ω_1 t, 2e(t) sin ω_1 t, . . . , 2e(t) cos ω_m t, 2e(t) sin ω_m t}^T    (14)

Then apply Eq. (14) to accumulate the weight changes while keeping all the weights constant until the end of the T seconds. The weight corrections are then based on the average of gradient estimates over a period T. Because of the diagonal R matrix, our method is equivalent to Newton's method of searching the performance surface. By successive increases of T in each cycle (and, hence, the decrease of the learning gain), we are able to eliminate the variations of weights due to low-frequency components in f(t). A suggested sequence of periods for averaging is T, 2T, 3T, . . . . Each time the period is increased, we need to add more weights and neurons. Hence, the frequency resolution is determined by the available resources and is improved as T increases. Moreover, if the initial selection of T happens to be the period of f(t) or its integer multiples, then we reach the correct weights in a single search period. This can be seen by integrating Eq. (14) over a [t0, t0 + T] time interval with the initial weight being W; then we will get Eq. (13).

Figures 4 and 5 show the magnitude and phase results of decomposing a signal having frequency components from direct current to 10 Hz with 0.5-Hz increments. We use a network capable of 1-Hz resolution. The initial guess of T is 1 sec, which does not provide good results. After extending T to 2 sec, we can identify quite accurately all the components within our frequency resolution. Figures 6 and 7 are the simulation results of identifying the frequency response of an unknown plant subject to periodic input pulses with a period of 2 sec. The pulse train is formed by a series of cosine waves with frequency components from direct current to 10 Hz with increments of 0.5 Hz and 0.05 magnitude for all components. The output of the plant is analyzed by the Fourier network.
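The block-averaged update of Eqs. (13)-(14) can be illustrated numerically: hold the weights constant over a period T, accumulate the gradient estimate, and apply one correction. When T matches the signal period, that single correction lands exactly on the Fourier coefficients. In the sketch below, the test signal and its coefficients are arbitrary assumptions, and the integral is approximated by a Riemann sum over one period.

```python
import numpy as np

# One block-averaged Widrow-Hoff update (Eqs. (13)-(14)): with the weights
# held constant over a full period T of f(t), a single correction recovers
# the exact Fourier coefficients. Signal below is an arbitrary assumption.
T = 2.0                          # averaging period, sec (also signal period)
m = 4                            # number of harmonics in the network
w1 = 2 * np.pi / T               # fundamental frequency
t = np.linspace(0.0, T, 20000, endpoint=False)
dt = t[1] - t[0]

# Test signal: dc + two harmonics with known coefficients.
f = 0.3 + 1.0 * np.cos(w1 * t) - 0.5 * np.sin(2 * w1 * t)

def basis(t):
    # X(t) = {1, cos w1 t, sin w1 t, ..., cos wm t, sin wm t}^T
    cols = [np.ones_like(t)]
    for n in range(1, m + 1):
        cols += [np.cos(n * w1 * t), np.sin(n * w1 * t)]
    return np.vstack(cols)

X = basis(t)
W = np.zeros(2 * m + 1)          # weights held constant over the period
e = f - W @ X                    # error signal e(t)
scale = np.r_[1.0, 2.0 * np.ones(2 * m)]   # the 2e(t) factors of Eq. (13)
W = W + scale * (e @ X.T) * dt / T         # accumulate, then correct once
print(np.round(W, 3))
```

Starting from W = 0, the error is f(t) itself, so the update reduces to the Fourier-coefficient integrals; this is the "Newton's method in one step" behavior the text attributes to the diagonal R matrix.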
April 1990
Fig. 4. Amplitude decomposition of a signal with discrete frequency contents.

Fig. 6. Identification of magnitude frequency response of a mass-spring-damper system.

The initial trial uses T = 2 sec and 0.5-Hz frequency resolution. Then we use T = 4 sec and 0.25-Hz frequency resolution. Figures 6 and 7 also show the theoretically calculated frequency response. Figure 6 indicates that the network not only correctly identifies all the existing components, it also marks out the nonexistent components by showing an almost zero magnitude. The phase estimation corresponding to the nonexistent components has been deleted from Fig. 7.

Conclusion

A technique for programming of the Hopfield network for system identification was developed. Simultaneous energy minimization by the Hopfield network is used to minimize the least mean square of the error rates of the estimates of the state variables. In this procedure, we obtain a quadratic error surface by suppressing feedback of the estimation error of the state variations. This eliminates formulation of complex error surfaces caused by recursive dependence of state error rates on adjustable variables. If state-variable convergence is of concern, stable feedback of state errors can be introduced. Simulation results have shown the feasibility of using this system identification scheme for time-varying and time-invariant plants.

It was shown that Widrow's adaptive linear combiner is useful in conducting Fourier analysis of an analog signal. Instead of using a string of delayed signals, we use sines and cosines as inputs. By taking advantage of orthogonal input functions, we can perform Newton's searching method in seeking the minimum point of the performance surface. Simulation results show that this technique can be used to identify the frequency transfer functions of dynamic plants.

References

[1] J. J. Hopfield and D. W. Tank, "Neural Computation of Decisions in Optimization Problems," Biol. Cybern., vol. 52, pp. 141-152, 1985.
[2] D. W. Tank and J. J. Hopfield, "Simple Neural Optimization Networks: An A/D Converter, Signal Decision Circuit, and a Linear Programming Circuit," IEEE Trans. Circ. Syst., vol. CAS-33, pp. 533-541, 1986.
[3] E. Mishkin, "The Analytical Theory of Nonlinear Systems," chap. 8 in Adaptive Control Systems, edited by E. Mishkin and L. Braun, New York: McGraw-Hill, pp. 271-273, 1961.
[4] B. Widrow and M. E. Hoff, Jr., "Adaptive Switching Circuits," IRE WESCON Conv. Rec., pt. 4, pp. 96-104, 1960.
[5] B. Widrow and S. D. Stearns, Adaptive Signal Processing, Englewood Cliffs, NJ: Prentice-Hall, 1985.
[6] J. J. Hopfield, "Neural Networks and Physical Systems with Emergent Collective Computational Abilities," Proc. Natl. Acad. Sci., vol. 79, pp. 2554-2558, 1982.
[7] J. J. Hopfield, "Neurons with Graded Response Have Collective Computational Properties Like Those of Two-State Neurons," Proc. Natl. Acad. Sci., vol. 81, pp. 3088-3092, 1984.
[8] R. Shoureshi, R. Chu, and M. Tenorio,

S. Reynold Chu received his B.S. degree in mechanical engineering from National Cheng Kung University, Tainan, Taiwan, in 1981; the M.S.E. degree in mechanical engineering in 1983; and the M.S.E. in computer information and control engineering in 1984, from the University of Michigan, Ann Arbor. From 1985 to 1987, he was an Engineering Analyst in the Technical Center of Navistar International Transportation Corporation, working in the field of vehicle dynamics, vibration, and control system design. Currently, he is working on his Ph.D. degree in the School of Mechanical Engineering at Purdue University. His research interests include applications of neural networks to system identification and control, and control of flexible structures.

Rahmat Shoureshi is with the School of Mechanical Engineering at Purdue University. He completed his graduate studies at MIT in 1981. His research interests include intelligent control and diagnostic systems using analytical/symbolic processors and neural networks; active and semiactive control of distributed parameter systems, including flexible structures and acoustic plants; and manufacturing automation, including autonomous machines and robotic manipulators.

Manoel F. Tenorio received the B.Sc.E.E. degree from the National Institute of Telecommunication, Brazil, in 1979; the M.Sc.E.E. degree from Colorado State University in 1983; and the Ph.D. degree in computer engineering from the University of Southern California in 1987. He has worked in artificial intelligence at USC, UCLA, and Rockwell International in Los Angeles. Currently, he is an Assistant Professor at the School of Electrical Engineering, Purdue University, where his primary research interests are parallel and distributed systems, artificial intelligence, and neural networks. He is the organizer of the interdisciplinary faculty group at Purdue called the Special Interest Group in Neurocomputing (SIGN) and heads the Parallel Distributed Structures Laboratory (PDSL) in the School of Electrical Engineering.