1. Introduction
Both the theory and practice of non-linear system modelling have advanced considerably in recent years. It is known that a wide class of discrete-time non-linear systems
can be represented by the non-linear autoregressive moving average with exogenous
inputs (NARMAX) model (Leontaritis and Billings 1985, Chen and Billings 1989 b).
The NARMAX model provides a description of the system in terms of a non-linear
functional expansion of lagged inputs, outputs and prediction errors. The mathemat-
ical function describing a real-world system can be very complex and its exact form
is usually unknown so that in practice modelling of a real-world system must be
based upon a chosen model set of known functions. A desirable property for this
model set is the capability of approximating a system to within an arbitrary accuracy.
Mathematically, this requires that the set be dense in the space of continuous func-
tions. Polynomial functions are one choice that have such a completeness property.
This provides the foundation for modelling non-linear systems using the polynomial
NARMAX model and several identification procedures based upon this model have
been developed (Leontaritis and Billings 1988, Chen and Billings 1989 a, Chen et al.
1989). Because the derivation of the NARMAX model was independent of the form
of the non-linear functional, other choices of expansion can easily be investigated
within this framework and neural networks are an obvious alternative. Neural net-
works can therefore be viewed as just another class of functional representations.
Feedforward multi-layered neural networks have been widely used in many areas
of signal processing (see the I.E.E.E. Transactions, 1988). A common feature in these
applications is that neural networks are employed to realize some complex non-linear
decision functions. Recent theoretical work (Cybenko 1989, Funahashi 1989) has
rigorously proved that, even with only one hidden layer, neural networks can uniformly approximate any continuous function. The theoretical basis for modelling
non-linear systems by neural networks is therefore sound.
The present study develops an identification procedure for discrete-time non-
linear systems based on neural networks with a single hidden layer. New batch and
recursive estimation algorithms are derived for the neural network model based on
the prediction error principle. It is shown that the classical back propagation algo-
rithm is a special case of the new prediction error routines, and model validity tests
are introduced as a means of measuring the quality of fit. The results of applying the
neural network model to both simulated and real data are included and a suggestion
for further research is also given.
2. System representation
Under some mild assumptions, a discrete-time multivariable non-linear stochastic
control system with m outputs and r inputs can be represented by the multi variable
NARMAX model (Leontaritis and Billings 1985):
y(t) = f(y(t - 1), ..., y(t - n_y), u(t - 1), ..., u(t - n_u), e(t - 1), ..., e(t - n_e)) + e(t)    (1)
where
y(t) = [y_1(t), ..., y_m(t)]^T,   u(t) = [u_1(t), ..., u_r(t)]^T,   e(t) = [e_1(t), ..., e_m(t)]^T    (2)
are the system output, input and noise vectors, respectively; n_y, n_u and n_e are the
maximum lags in the output, input and noise respectively; e(t) is a zero-mean independent sequence; and f(·) is some vector-valued non-linear function.
The input-output relationship (1) depends upon the non-linear function f(·).
In reality, f(·) is generally very complex and knowledge of the form of this function
is often not available. The solution is to approximate f(·) using some known simpler
function, and in the present study we consider using neural networks to approximate
non-linear systems governed by the model
y(t) = f(y(t - 1), ..., y(t - n_y), u(t - 1), ..., u(t - n_u)) + e(t)    (3)
Notice that (3) is a slightly simplified version of (1) because only additive uncorrelated
noise is considered. Extension of the results to the more general model description
(1) is discussed.
Figure 1. Schematic of a multi-layered neural network: the network inputs feed one or more hidden layers, whose outputs feed the output layer to produce the network outputs.
Each node forms a weighted sum of its inputs, shifts it by a threshold β and passes the result through an activation function a(·):
y = a(Σ_i w_i x_i - β)    (4)
where x_i are the node inputs and y is the node output. The activation function a(·)
for each output node is specifically chosen to be linear, so that each output node simply forms the weighted sum of its inputs
y = Σ_i w_i x_i    (5)
The overall input-output relationship of an n-input m-output network with one
or more hidden layers is described by a function f̂: ℝ^n → ℝ^m. Under very mild assumptions on the activation function a(·), it has been rigorously proved that any continuous function f: D ⊂ ℝ^n → ℝ^m can be uniformly approximated by an f̂ on D, where D
is a compact subset of ℝ^n (Cybenko 1989, Funahashi 1989).
Our aim is to use neural networks with one hidden layer to model non-linear
systems described by (3). Define n = m·n_y + r·n_u,
x(t) = [x_1(t), ..., x_n(t)]^T = [y^T(t - 1), ..., y^T(t - n_y), u^T(t - 1), ..., u^T(t - n_u)]^T    (6)
and introduce the notation:
n_h, the number of hidden nodes;
β_i, the threshold of the ith hidden node;
w_ij^(1), the connection weight from x_j(t) to the ith hidden node;
o_hi(t), the output of the ith hidden node;
w_ki^(2), the connection weight from the ith hidden node to the kth output node.
Let Θ = [θ_1, ..., θ_{n_θ}]^T be all the weights and thresholds of the network ordered in some specified manner.
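As a concrete illustration of the network map just defined, the following sketch (Python with NumPy; the function and variable names are ours, and the common logistic sigmoid is assumed for the hidden-node activation) computes the one-hidden-layer network output:

```python
import numpy as np

def sigmoid(v):
    """Logistic activation, one common choice for the hidden nodes."""
    return 1.0 / (1.0 + np.exp(-v))

def network_output(x, W1, beta, W2):
    """One-hidden-layer network f_hat: R^n -> R^m.

    x    : (n,)     regressor x(t) of lagged outputs and inputs, as in (6)
    W1   : (nh, n)  weights w_ij^(1) from x_j(t) to the ith hidden node
    beta : (nh,)    hidden-node thresholds beta_i
    W2   : (m, nh)  weights w_ki^(2) from the ith hidden node to output k
    """
    o_h = sigmoid(W1 @ x - beta)   # hidden-node outputs o_hi(t)
    return W2 @ o_h                # linear output nodes: weighted sums, cf. (5)
```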
The network model (7) is therefore the one-step-ahead predictor for y(t), and the
prediction error or residual is given as usual by
ε(t, Θ) = y(t) - ŷ(t, Θ)    (10)
The first step in modelling non-linear systems using (3) is therefore to select values
for n_y, n_u and n_h. The next is to determine values of all the weights and thresholds, or
to estimate Θ. The gradient of ŷ(t, Θ),
Ψ(t, Θ) = [dŷ(t, Θ)/dΘ]^T = g(x(t); Θ)    (11)
will be referred to as the extended network model. The stability of (12) is of vital
importance in any implementation. The set of all Θ that each produce a stable
extended network model is denoted D_Θ. Notice that, for the chosen activation
function (9), D_Θ is the whole n_θ-dimensional Euclidean space, and in this sense the
corresponding extended network model is unconditionally stable. Furthermore, the
elements of Ψ(t, Θ) for 1 ≤ i ≤ n_θ and 1 ≤ j ≤ m are given by
∂ŷ_j(t, Θ)/∂θ_i = o_hk(t)  if θ_i is the weight w_jk^(2) from the kth hidden node to the jth output node, 1 ≤ k ≤ n_h
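For the logistic hidden activation assumed above, all of these partial derivatives have simple closed forms. A minimal sketch of the extended network computation follows (our own arrangement; a real implementation would flatten the pieces into the chosen ordering of Θ, and sigmoid() is the helper from the earlier sketch):

```python
import numpy as np

def network_gradient(x, W1, beta, W2):
    """Partial derivatives of y_hat w.r.t. all weights and thresholds.

    Returns the building blocks of Psi(t, Theta) for the network above.
    """
    z = W1 @ x - beta
    o_h = sigmoid(z)
    d_oh = o_h * (1.0 - o_h)                 # sigmoid derivative a'(z)
    m, nh = W2.shape
    # d y_k / d W2[k, i] = o_h[i]                       (hidden-to-output)
    dW2 = np.tile(o_h, (m, 1))
    # d y_k / d W1[i, j] = W2[k, i] * a'(z_i) * x[j]    (input-to-hidden)
    dW1 = W2[:, :, None] * (d_oh[:, None] * x[None, :])[None, :, :]
    # d y_k / d beta[i] = -W2[k, i] * a'(z_i)           (thresholds)
    dbeta = -W2 * d_oh[None, :]
    return dW1, dbeta, dW2
```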
4. Identification algorithm
The network model (7) is non-linear in the parameters. This section applies the
well-known prediction error estimation method to derive both the batch and recursive
algorithms for estimating the parameter vector Θ in (7).
over Θ ∈ D_Θ. Such a method of obtaining Θ̂ is known as the prediction error estimation method.
The minimization of criterion (15) can be performed efficiently using the following
Gauss-Newton algorithm
Θ̂^(k+1) = Θ̂^(k) + s^(k) η(Θ̂^(k), δ)    (16)
where
η(Θ, δ) = -H_1^(-1)(Θ, δ) ∇J_1(Θ)    (17)
is the optimizing direction vector, and
∇J_1(Θ) = -(1/N) Σ_{t=1}^{N} Ψ(t, Θ) Λ^(-1) ε(t, Θ)    (18)
H_1(Θ, δ) = (1/N) Σ_{t=1}^{N} Ψ(t, Θ) Λ^(-1) Ψ^T(t, Θ) + δI    (19)
are the gradient and the approximate Hessian of J_1(Θ), respectively; δ is a non-negative small scalar, I is the identity matrix of appropriate dimension, and Λ is the weighting matrix appearing in the criterion (15). The scalar s^(k) is obtained by minimizing
J_1(Θ̂^(k) + s η(Θ̂^(k), δ))    (20)
over 0 < s < 1 using a linear search technique such as the golden section search.
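A golden-section line search over the step length can be sketched as follows (a generic routine of our own, not the authors' exact implementation):

```python
import numpy as np

def golden_section(phi, a=0.0, b=1.0, tol=1e-4):
    """Minimize a unimodal phi(s) over [a, b], e.g. phi(s) = J1 along eta."""
    g = (np.sqrt(5.0) - 1.0) / 2.0        # golden ratio conjugate, ~0.618
    c, d = b - g * (b - a), a + g * (b - a)
    while b - a > tol:
        if phi(c) < phi(d):               # minimum lies in [a, d]
            b, d = d, c
            c = b - g * (b - a)
        else:                             # minimum lies in [c, b]
            a, c = c, d
            d = a + g * (b - a)
    return 0.5 * (a + b)
```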
In practice, the direction vector η(Θ, δ) is computed as follows. The square-root
decomposition method is first used to factorize the Hessian as
H_1(Θ, δ) = U^T U    (21)
where U is an upper triangular square matrix. η(Θ, δ) is then solved from
U^T (U η(Θ, δ)) = -∇J_1(Θ)    (22)
by the forward and backward substitution algorithms (Bierman 1977).
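In code, this direction computation might look as follows (SciPy's Cholesky factorization and triangular solvers; Λ is taken as the identity purely for brevity, an assumption on our part):

```python
import numpy as np
from scipy.linalg import cholesky, solve_triangular

def gauss_newton_direction(Psis, epss, delta):
    """Solve H1(Theta, delta) * eta = -grad J1(Theta) as in (21)-(22).

    Psis  : list of (n_theta, m) matrices Psi(t, Theta)
    epss  : list of (m,) residual vectors eps(t, Theta)
    delta : small non-negative scalar regularizer
    """
    N, n_theta = len(Psis), Psis[0].shape[0]
    grad = np.zeros(n_theta)
    H = delta * np.eye(n_theta)
    for Psi, eps in zip(Psis, epss):
        grad -= Psi @ eps / N              # (18), with Lambda = I assumed
        H += Psi @ Psi.T / N               # (19), with Lambda = I assumed
    U = cholesky(H)                        # H1 = U^T U, U upper triangular
    w = solve_triangular(U.T, -grad, lower=True)   # forward substitution
    return solve_triangular(U, w)                  # backward substitution
```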
The above Gauss-Newton algorithm is known to converge to at least a local
minimum. Other loss functions can also be employed; an alternative to (15) is
J_2(Θ) = ½ log det (C(Θ))    (23)
with
C(Θ) = (1/N) Σ_{t=1}^{N} ε(t, Θ) ε^T(t, Θ)    (24)
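Evaluating this determinant criterion is straightforward; a small sketch (using the log-determinant directly for numerical stability):

```python
import numpy as np

def loss_J2(epss):
    """J2(Theta) = 0.5 * log det C(Theta), with C(Theta) from (24)."""
    E = np.asarray(epss)                   # (N, m) array of residuals
    C = E.T @ E / len(E)                   # sample covariance of residuals
    _, logdet = np.linalg.slogdet(C)       # stable log-determinant
    return 0.5 * logdet
```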
H_2(Θ, δ) = (1/N) Σ_{t=1}^{N} Ψ(t, Θ) C^(-1)(Θ) Ψ^T(t, Θ) + δI    (26)
ŷ(t) = f̂(x(t); Θ̂(t - 1)),   Ψ(t) = g(x(t); Θ̂(t - 1))    (27)
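The full recursion (28)-(30) updates the estimate and a gain matrix from Ψ(t) and ε(t). For orientation, a textbook recursive prediction error step of the kind analysed by Ljung and Söderström (1983) is sketched below under our own simplifying assumptions: a scalar output, a flattened parameter vector, and a forgetting factor λ; predict_and_gradient is a hypothetical helper returning f̂ and the flattened g of (27).

```python
import numpy as np

def rpe_step(theta, P, x, y, lam=0.99):
    """One recursive prediction-error update (generic textbook form)."""
    y_hat, psi = predict_and_gradient(x, theta)   # (27): prediction and gradient
    eps = y - y_hat                               # prediction error
    K = P @ psi / (lam + psi @ P @ psi)           # gain vector
    theta = theta + K * eps                       # parameter update
    P = (P - np.outer(K, psi @ P)) / lam          # gain-matrix update
    return theta, P
```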
By applying the differential equation method for the analysis of recursive parameter
estimation algorithms, developed by Ljung (1977), the convergence of the algorithm
(27)-(30) can be proved. The underlying ideas of Ljung's method are as follows.
Assume that a projection is employed to keep Θ̂(t) inside the stable region D_Θ.
ŷ(t) = ŷ(t, Θ̂(t - 1), ..., Θ̂(0)) ≈ ŷ(t, Θ̂(t - 1), ..., Θ̂(t - M)),
Ψ(t) = Ψ(t, Θ̂(t - 1), ..., Θ̂(0)) ≈ Ψ(t, Θ̂(t - 1), ..., Θ̂(t - M))    (37)
Furthermore, assumption (35) implies γ(t) → 0 as t → ∞. For sufficiently large t, γ(t)
will be arbitrarily small, and it is seen that {Θ̂(t)} will change more and more slowly,
i.e.
are obtained under assumption (35), which implies γ(t) → 0 as t → ∞ (or λ(t) → 1 as
t → ∞). In order to track time-varying parameters, γ(t) should not tend to zero. It is
reasonable to believe that analysis under condition (35) will have relevance for the
case where γ(t) tends to some small non-zero value. As in any non-linear optimization
problem, the initial conditions have an important influence on convergence and the
speed of convergence. The performance surface (39) for a general network model is
very complex and is known in general to contain many local minima. A study of this
performance surface and the influence of Θ̂(0) on the algorithm (27)-(30) is beyond
the scope of this paper.
Strictly speaking, algorithm (30) or (32) is only a crude approximation of the off-line Gauss-Newton algorithm because -Ψ(t) Λ^(-1) ε(t) is hardly a good approximation of the gradient (18). A modified RPE algorithm is therefore proposed here.
5. Model validation
If modelling is adequate, ε(t, Θ̂) will be unpredictable from (uncorrelated with) all
linear and non-linear combinations of past inputs and outputs. Model validity tests
for other non-linear models (Billings and Voon 1986, Billings and Chen 1989, Billings
et al. 1989, Leontaritis and Billings 1987) were developed based on this principle and
can therefore be applied to the current neural network model. For simplicity, only
single-input (r = 1) single-output model validity tests are briefly summarized.
If the identified model is adequate, the prediction errors should satisfy the following conditions (Billings and Voon 1986, Billings and Chen 1989):
Φ_εε(k) = δ(k);  Φ_uε(k) = 0 for all k;  Φ_ε(εu)(k) = 0 for k ≥ 0;  Φ_u²'ε(k) = 0 for all k;  Φ_u²'ε²(k) = 0 for all k
where Φ_xz(k) indicates the cross-correlation function between x(t) and z(t), εu(t) =
ε(t + 1)u(t + 1), u²'(t) = u²(t) - ū²(t), and ū²(t) represents the time average or mean
value of u²(t). Therefore if these correlation functions are within the 95% confidence
intervals ±1.96/√N, the model is regarded as adequate.
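A sketch of how these checks might be computed (normalized sample correlations against the ±1.96/√N band; the routines and their names are ours):

```python
import numpy as np

def xcorr(x, z, max_lag):
    """Normalized sample cross-correlation Phi_xz(k), k = 0..max_lag."""
    x, z = x - x.mean(), z - z.mean()
    denom = np.sqrt(np.sum(x ** 2) * np.sum(z ** 2))
    return np.array([np.sum(x[: len(x) - k] * z[k:]) / denom
                     for k in range(max_lag + 1)])

def within_band(x, z, max_lag):
    """True if all correlations lie inside the 95% band +/- 1.96/sqrt(N)."""
    return bool(np.all(np.abs(xcorr(x, z, max_lag)) < 1.96 / np.sqrt(len(x))))
```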
Alternatively, a statistical test known as the chi-squared test (Bohlin 1978, Leontaritis and Billings 1987) can be employed to validate the identified model. Let Ω(t) be
an s-dimensional vector-valued function of the past inputs, outputs and prediction
errors, and
Γ = (1/N) Σ_{t=1}^{N} Ω(t) Ω^T(t)    (44)
Θ̂ is the estimate of Θ and σ̂_ε² is the variance of the residuals. Under the null hypothesis
that the data are generated by the model, the statistic ζ is asymptotically chi-squared
distributed with s degrees of freedom. A convenient choice for Ω(t) is
Ω(t) = [ω(t), ω(t - 1), ..., ω(t - s + 1)]^T    (47)
where ω(t) is some chosen (non-linear) function of the past inputs, outputs and
prediction errors. Thus if the values of ζ for several different choices of ω(t) are within
the 95% acceptance region, that is
ζ < χ_s²(α)    (48)
the model can be regarded as adequate, where χ_s²(α) is the critical value of the chi-squared distribution with s degrees of freedom for the given significance level α (0.05).
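The statistic itself (expressions (45)-(46), given in full by Leontaritis and Billings 1987) can be computed along the following lines. This is a standard construction and our own sketch, not a transcription of the paper's equations:

```python
import numpy as np
from scipy.stats import chi2

def chi_squared_test(omega, eps, s, alpha=0.05):
    """Chi-squared validity test for one choice of omega(t), cf. (44), (47), (48)."""
    N = len(eps)
    # Stack Omega(t) = [omega(t), omega(t-1), ..., omega(t-s+1)]^T
    Omega = np.column_stack([omega[s - 1 - i: N - i] for i in range(s)])
    E = eps[s - 1:]
    n = len(E)
    Gamma = Omega.T @ Omega / n            # sample moment matrix, cf. (44)
    mu = Omega.T @ E / n
    sigma2 = np.mean(E ** 2)               # residual variance estimate
    zeta = n * mu @ np.linalg.solve(Gamma, mu) / sigma2
    return zeta, bool(zeta < chi2.ppf(1.0 - alpha, df=s))   # accept if below (48)
```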
To sum up the discussion so far, the identification of a structure-unknown system
described by (3) using a single-hidden-layer neural network involves the following
procedure:
(a) choose values of n_y, n_u and n_h;
(b) estimate Θ;
(c) validate the estimated model. If the model is adequate, the procedure is terminated; otherwise go to step (a).
6. Simulation study
The parameter estimation algorithm used in this simulation study was the off-line prediction error algorithm, and only single-input single-output examples are given.
Example 1
This is a simulated system. 500 points of data were generated by
y(t) = (0.8 - 0.5 exp(-y²(t - 1))) y(t - 1) - (0.3 + 0.9 exp(-y²(t - 1))) y(t - 2) + u(t - 1) + 0.2 u(t - 2) + 0.1 u(t - 1) u(t - 2) + e(t)
where the system noise e(t) was a gaussian white sequence with zero mean and
variance 0.04, and the system input u(t) was an independent uniformly distributed
sequence with zero mean and variance 1.0.
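For reference, the data generation is a direct transcription of the difference equation above (the random seed and the uniform range ±√3, which yields unit variance, are our own choices):

```python
import numpy as np

rng = np.random.default_rng(0)
N = 500
u = rng.uniform(-np.sqrt(3.0), np.sqrt(3.0), N)  # zero mean, variance 1.0
e = rng.normal(0.0, np.sqrt(0.04), N)            # zero mean, variance 0.04
y = np.zeros(N)
for t in range(2, N):
    y[t] = ((0.8 - 0.5 * np.exp(-y[t - 1] ** 2)) * y[t - 1]
            - (0.3 + 0.9 * np.exp(-y[t - 1] ** 2)) * y[t - 2]
            + u[t - 1] + 0.2 * u[t - 2]
            + 0.1 * u[t - 1] * u[t - 2] + e[t])
```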
The input order of the network model was chosen as n = n_y + n_u = 2 + 2. When
the number of hidden nodes was increased to n_h = 5 (n_θ = 30) the model validity tests
were satisfied. Figure 2 shows the system and model response, where the model
deterministic output ŷ_d(t, Θ̂) is defined by
ŷ_d(t, Θ̂) = f̂(ŷ_d(t - 1, Θ̂), ..., ŷ_d(t - n_y, Θ̂), u(t - 1), ..., u(t - n_u); Θ̂)    (49)
and the deterministic error ε_d(t, Θ̂) is given as
ε_d(t, Θ̂) = y(t) - ŷ_d(t, Θ̂)    (50)
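Computing the deterministic output amounts to iterating the fitted network on its own past outputs. A sketch for this example, with n_y = n_u = 2 and a hypothetical scalar-output wrapper f_hat around the fitted network:

```python
import numpy as np

def deterministic_output(u, f_hat, N):
    """Model-predicted output y_d(t) of (49): feed model outputs back in."""
    yd = np.zeros(N)
    for t in range(2, N):
        x = np.array([yd[t - 1], yd[t - 2], u[t - 1], u[t - 2]])  # cf. (6)
        yd[t] = f_hat(x)             # fitted network, scalar output assumed
    return yd
```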
Figure 2. System and model response (Example 1): (a) u(t); (b) y(t); (c) ŷ(t, Θ̂); (d) ε(t, Θ̂); (e) ε_d(t, Θ̂); (f) ŷ_d(t, Θ̂).
Figures 3 and 4 display the correlation tests and some chi-squared tests for the
estimated model.
It can easily be verified that the unforced response (that is, e(t) = 0 and u(t) = 0)
of this simulated system is a stable limit cycle, as illustrated in Fig. 5. The unforced
response of the estimated model with the same initial condition is plotted in Fig. 6,
where it is seen that, although the shape differs from that in Fig. 5, the estimated
model correctly predicts the existence of a limit cycle. The data shown in Fig. 5 were
used to identify a network model with n = n_y = 2 and n_h = 10 (n_θ = 40). The resulting
model produces the limit cycle shown in Fig. 7, which is much closer to that produced
by the unforced system.
Figure 3. Correlation tests (Example 1): (a) Φ_εε(k); (b) Φ_ε(εu)(k); (c) Φ_uε(k); (d) Φ_u²'ε(k); (e) Φ_u²'ε²(k). Dashed line: 95% confidence interval.
Figure 4. Chi-squared tests (Example 1): (a) ω(t) = ε(t - 1, Θ̂); (b) ω(t) = y(t - 1); (c) ω(t) = exp(u(t - 1)); (d) ω(t) = tanh(ε(t - 1, Θ̂)); (e) ω(t) = y²(t - 1) ε²(t - 2, Θ̂); (f) ω(t) = exp(-u²(t - 2)) exp(-y²(t - 2)). Dashed line: 95% confidence limit.
Figure 5. System unforced response (Example 1): initial condition y(-1) = 0.01, y(0) = 0.1.
Figure 6. Control model unforced response (Example 1): initial condition y(-1) = 0.01, y(0) = 0.1.
Figure 7. Time series model unforced response (Example 1): initial condition y(-1) = 0.01, y(0) = 0.1.
Example 2
This is the time series of annual sunspot numbers. Observations for the years
1700 to 1979 can be found in Tong (1983, Appendix A.1). The first 256
observations are plotted in Fig. 8(a).
It has long been noticed that the record of sunspot numbers reveals an intriguing
cyclical phenomenon with an approximate 11-year period. Chen and Billings (1989 c)
fitted a subset polynomial model with n_y = 9 and polynomial degree three to the first
221 observations. The unforced response of this subset polynomial model is a sustained oscillation with an approximate 11-year period, as shown in Fig. 8(c). In the
current study a neural network model with n = n_y = 9 and n_h = 5 (n_θ = 55) was fitted
to the first 221 observations. The unforced response of this neural network model is
illustrated in Fig. 8(b), where it is seen that this time series model also produces a
sustained oscillation with an approximate 11-year period.
Example 3
The data were generated from a heat exchanger and contain 996 points. A
detailed description of this process and the experimental design can be found in
Billings and Fadzil (1985). The first 500 points of the data, depicted in Fig. 9, were
used as the identification set and the rest of the data as the test set.
A neural network model with n_y = n_u = 5 and n_h = 3 (n_θ = 36) was fitted to the
identification data set. Figures 10 and 11 show the correlation tests using the identification and test sets, respectively.
Figure 8. Observations and model unforced response (Example 2): (a) observations; (b) neural network model; (c) subset polynomial model; first nine observations used as initial condition in unforced response.
Figure 9. Identification data set (Example 3): (a) u(t); (b) y(t).
The test set and the model response for this set are
given in Fig. 12. Further increasing the size of the network only slightly improved
the quality of fit.
Previous identification results (Billings and Chen 1989) indicate that this non-linear process can be described better by a model with the form of (1). The
results obtained here are satisfactory considering that no noise model was fitted as
part of the model estimation.
Figure 10. Correlation tests using identification set (Example 3): (a) Φ_εε(k); (b) Φ_ε(εu)(k); (c) Φ_uε(k); (d) Φ_u²'ε(k); (e) Φ_u²'ε²(k). Dashed line: 95% confidence interval.
Figure 11. Correlation tests using test set (Example 3): (a) Φ_εε(k); (b) Φ_ε(εu)(k); (c) Φ_uε(k); (d) Φ_u²'ε(k); (e) Φ_u²'ε²(k). Dashed line: 95% confidence interval.
Figure 12. Test set and model response (Example 3): (a) u(t); (b) y(t); (c) ŷ(t, Θ̂); (d) ε(t, Θ̂); (e) ε_d(t, Θ̂); (f) ŷ_d(t, Θ̂).
8. Conclusions
An identification procedure has been developed for discrete-time non-linear systems based on a neural network approach. Both batch and recursive prediction error
estimation algorithms have been derived for a neural network model with a single
hidden layer and model validation methods have been discussed. Application to some
simulated and real systems has been demonstrated. The results obtained suggest that
modelling non-linear systems by neural networks is an effective approach and further
research in this field is worth pursuing.
ACKNOWLEDGMENTS
This work is supported by the U.K. Science and Engineering Research Council.
The authors are also grateful for information supplied by Dr G. J. Gibson.
REFERENCES
BIERMAN, G. J., 1977, Factorization Methods for Discrete Sequential Estimation (New York:
Academic Press).
BILLINGS, S. A., and CHEN, S., 1989, Identification of non-linear rational systems using a
prediction-error estimation algorithm. International Journal of Systems Science, 20,
467-494.
BILLINGS, S. A., CHEN, S., and KORENBERG, M. J., 1989, Identification of MIMO non-linear
systems using a forward-regression orthogonal estimator. International Journal of Control, 49, 2157-2189.
BILLINGS, S. A., and FADZIL, M. B., 1985, The practical identification of systems with non-linearities. Proceedings of the 7th IFAC Symposium on Identification and System Parameter
Estimation, York, U.K., pp. 155-160.
BILLINGS, S. A., and VOON, W. S. F., 1986, Correlation based model validity tests for non-
linear models. International Journal of Control, 44, 235-244.
BOHLIN, T., 1978, Maximum-power validation of models without higher-order fitting. Automatica, 14, 137-146.
CHEN, S., and BILLINGS, S. A., 1989 a, Recursive prediction error parameter estimator for non-linear models. International Journal of Control, 49, 569-594; 1989 b, Representation of
non-linear systems: the NARMAX model. Ibid., 49, 1013-1032; 1989 c, Modelling and
analysis of non-linear time series. Ibid., 50, 2151-2171.
CHEN, S., BILLINGS, S. A., and LUO, W., 1989, Orthogonal least squares methods and their
application to non-linear system identification. International Journal of Control, 50,
1873-1896.
CYBENKO, G., 1989, Approximation by superpositions of a sigmoidal function. Mathematics of
Control, Signals and Systems, 2, 303-314.
FUNAHASHI, K., 1989, On the approximate realization of continuous mappings by neural
networks. Neural Networks, 2, 183-192.
GOODWIN, G. C., and PAYNE, R. L., 1977, Dynamic System Identification: Experiment Design
and Data Analysis (New York: Academic Press).
I.E.E.E., 1988, I.E.E.E. Transactions on Acoustics, Speech and Signal Processing, 36 (7).
JANECKI, D., 1988, New recursive parameter estimation algorithms with varying but bounded
gain matrix. International Journal of Control, 47, 75-84.
LEONTARITIS, I. J., and BILLINGS, S. A., 1985, Input-output parametric models for non-linear
systems. Part 1: Deterministic non-linear systems; Part 2: Stochastic non-linear systems. International Journal of Control, 41, 303-344; 1987, Model selection and validation methods for non-linear systems. Ibid., 45, 311-341; 1988, Prediction error
estimator for non-linear stochastic systems. International Journal of Systems Science,
19, 519-536.
LJUNG, L., 1977, Analysis of recursive stochastic algorithms. I.E.E.E. Transactions on Automatic
Control, 22, 551-575; 1978, Convergence analysis of parametric identification methods.
Ibid., 23, 770-783; 1981, Analysis of a general recursive prediction error identification
algorithm. Automatica, 17, 89-99.
LJUNG, L., and SODERSTROM, T., 1983, Theory and Practice of Recursive Identification (Cam-
bridge, MA: MIT Press).
RUMELHART, D. E., HINTON, G. E., and WILLIAMS, R. J., 1986, Learning internal representations
by error propagation. In Parallel Distributed Processing: Explorations in the Micro-
structure of Cognition, edited by Rumelhart, D. E., and McClelland, J. L., pp. 318-362
(Cambridge, MA: MIT Press).
SALGADO, M. E., GOODWIN, G. C., and MIDDLETON, R. H., 1988, Modified least squares algorithm incorporating exponential resetting and forgetting. International Journal of Control, 47, 477-491.
SRIPADA, N. R., and FISHER, D. G., 1987, Improved least squares identification. International
Journal of Control, 46, 1889-1913.
TONG, H., 1983, Threshold Models in Non-linear Time Series Analysis. Lecture Notes in Stat-
istics (New York: Springer-Verlag).