
IEEE TRANSACTIONS ON NEURAL NETWORKS, VOL. 6, NO. 4, JULY 1995

Approximation Capability to Functions of Several Variables, Nonlinear Functionals, and Operators by Radial Basis Function Neural Networks

Tianping Chen and Hong Chen

Abstract-The purpose of this paper is to explore the representation capability of radial basis function (RBF) neural networks. The main results are: 1) the necessary and sufficient condition for a function of one variable to be qualified as an activation function in RBF networks is that the function is not an even polynomial, and 2) the capability of RBF networks to approximate nonlinear functionals and operators is revealed, using sample data either in the frequency domain or in the time domain, which can be used in system identification by neural networks.

Manuscript received March 8, 1993; revised November 3, 1993 and March 7, 1994. This work was supported in part by the NSF of China, the Shanghai NSF, the "Climbing" Project in China, Grant NSC 92092, and the Doctoral Program Foundation of the Educational Commission in China.
T. Chen is with the Department of Mathematics, Fudan University, Shanghai, P.R. China.
H. Chen is with Sun Microsystems, Inc., Mountain View, CA 95050 USA.
IEEE Log Number 9404867.

I. INTRODUCTION

THERE have been several recent works concerning the representation capabilities of multilayer feedforward neural networks. For example, in the past few years, several papers ([1]-[7] and many others) related to this topic have appeared. They all claimed that a three-layered neural network with sigmoid units on the hidden layer can approximate continuous or other kinds of functions defined on compact sets in R^n. In many of those papers, the sigmoidal functions need to be assumed to be continuous or monotone. Recently [9], [10], we pointed out that the boundedness of the sigmoidal function plays an essential role in its being an activation function in the hidden layer; i.e., instead of continuity or monotonicity, the boundedness of sigmoidal functions ensures the network capability.

In addition to sigmoidal functions, many others can be used as activation functions of neural networks. For example, in [12] we proved that for a function g ∈ C(R^1) ∩ S'(R^1) to be an activation function in feedforward neural networks, the necessary and sufficient condition is that the function is not a polynomial (also see [5]).

The above papers are all significant advances towards solving the problem of whether a function is qualified as an activation function in neural networks. However, they only dealt with affine-basis-function neural networks (ABF), also called multilayer perceptrons (MLP), and the goal there is to approximate continuous functions by the family

\sum_{i=1}^{N} c_i \, g(y_i \cdot x + \theta_i)

where y_i ∈ R^n, c_i, θ_i ∈ R^1, and y_i · x denotes the inner product of y_i and x.

Fig. 1. A radial basis function network. [Figure: block diagram in which the network inputs feed a layer of kernel nodes whose outputs are linearly combined.]

Among the various kinds of promising neural networks currently under active research, there is another type called radial-basis-function (RBF) networks [13] (also called localized receptive field networks [16]), where the activation functions are radially symmetric and produce a localized response to the input stimulus. For a survey see, for example, [15]. A block diagram of an RBF network is shown in Fig. 1. One of the basis functions that is commonly used is the Gaussian kernel function. Using the Gaussian basis function, RBF networks are capable of forming an arbitrarily close approximation to any continuous function, as shown in [17]-[19].

More generally, the goal here is to approximate functions of a finite number of real variables by

\sum_{i=1}^{N} c_i \, g(\lambda_i \|x - x_i\|_{R^n})

where x ∈ R^n and \|x - x_i\|_{R^n} is the distance between x and x_i in R^n. Here, the activation function g is not necessarily Gaussian.

In this direction, several results concerning RBF neural networks have been obtained [13], [14]. In [14], Park and Sandberg proved the following important theorem:

Let K : R^n → R be a radially symmetric, integrable, bounded function such that K is continuous almost everywhere and \int_{R^n} K(x)\,dx \neq 0. Then the family

\sum_{i=1}^{N} c_i \, g(\lambda_i \|x - x_i\|_{R^n}), \qquad \text{where } g(\|x\|_{R^n}) = K(x),

is dense in L^p(R^n).

In [23], Park and Sandberg discussed several related results on L^1 and L^2 approximation. For example, they proved the following interesting theorem.

Theorem A: Assume that K : R^n → R^1 is a square-integrable function. Then the corresponding family of RBF networks is dense in L^2(R^n) if and only if K is pointable.

In [24], we generalized this result and proved the following.

Theorem B: Suppose that g : R^+ → R^1 is such that g(\|x\|_{R^n}) ∈ L^2(R^n). Then the family of functions

\sum_{i=1}^{N} c_i \, g(\lambda_i \|x - x_i\|_{R^n})

is dense in L^2(R^n).

It is now natural to ask the following questions: 1) What is the necessary and sufficient condition for a function to be qualified as an activation function in RBF neural networks? 2) How can nonlinear functionals be approximated by RBF neural networks? 3) How can nonlinear operators (e.g., the output of a system) be approximated by RBF neural networks, using sample data in the frequency (phase) domain or in the time (state) domain?

The purpose of this paper is to give strong results answering those questions.

This paper is organized as follows. In Section II, we list our symbols and notation and review some definitions. In Section III, we show that the necessary and sufficient condition for a continuous function in S'(R^1) to be qualified as an activation function in RBF networks is that it is not an even polynomial. In Section IV, we show the capability of RBF neural networks to approximate nonlinear functionals and operators on some Banach space as well as on some compact set in C(K), where K is a compact set in any Banach space. Furthermore, we establish the capability of neural networks to approximate nonlinear operators from C(K_1) to C(K_2). Approximations using samples in both the frequency domain and the time domain are discussed. Examples are given, which include the use of wavelet coefficients in the approximation. It is also pointed out that the main results in Section IV can be used in computing the outputs of nonlinear dynamical systems, and thus in identifying the system. We conclude this paper with Section V.

II. NOTATION AND DEFINITIONS

We list here the main symbols and notation that will be used throughout this paper.

X: some Banach space with norm \|\cdot\|_X.
R^n: Euclidean space of dimension n with norm \|\cdot\|_{R^n}.
K: some compact set in a Banach space.
C(K): the Banach space of all continuous functions defined on K, with norm \|f\|_{C(K)} = \max_{x \in K} |f(x)|.
V: some compact set in C(K).
S(R^n): all Schwartz functions in distribution theory, i.e., all infinitely differentiable functions which are rapidly decreasing at infinity.
S'(R^n): all distributions defined on S(R^n), i.e., all linear continuous functionals defined on S(R^n).
C^∞(R^n): all infinitely differentiable functions defined on R^n.
C_0^∞(R^n): all infinitely differentiable functions with compact support in R^n.

We review the following definitions.

Definition 1: A function σ : R^1 → R^1 is called a (generalized) sigmoidal function if it satisfies σ(x) → 1 as x → +∞ and σ(x) → 0 as x → −∞.

Definition 2: Let X be a Banach space with norm \|\cdot\|_X. If there are elements x_n ∈ X, n = 1, 2, ..., such that for every x ∈ X there is a unique real number sequence a_n(x) with

x = \sum_{n=1}^{\infty} a_n(x)\, x_n,

where the series converges in X, then \{x_n\}_{n=1}^{\infty} is called a Schauder basis in X and X is called a Banach space with a Schauder basis.

Definition 3: Suppose that X is a Banach space. A set V ⊂ X is called a compact set in X if for every sequence \{x_n\}_{n=1}^{\infty} with all x_n ∈ V there is a subsequence \{x_{n_k}\} which converges to some element x ∈ V. It is well known that if V ⊂ X is a compact set in X, then for any δ > 0 there is a δ-net N(δ) = \{x_1, ..., x_{n(δ)}\} with all x_i ∈ V, i = 1, ..., n(δ); i.e., for every x ∈ V there is some x_i ∈ N(δ) such that \|x_i - x\|_X < δ.
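As a small illustration of the δ-net in Definition 3 (which reappears as the η_k-nets in the proof of Theorem 4), the following sketch, not taken from the paper, greedily extracts a δ-net from a dense sampling of a compact set in R^2. The greedy selection rule and the example set are assumptions made only for illustration.

```python
import numpy as np

def delta_net(points, delta):
    """Greedily pick a delta-net from a dense sampling of a compact set.

    `points` is an (m, n) array of samples of the set; every sample ends up
    within `delta` of some returned net point, and all net points lie in the set.
    """
    net = [points[0]]
    for p in points:
        # add p only if it is farther than delta from every current net point
        if min(np.linalg.norm(p - q) for q in net) >= delta:
            net.append(p)
    return np.array(net)

# Compact set: the unit circle in R^2, sampled densely.
theta = np.linspace(0.0, 2 * np.pi, 2000, endpoint=False)
circle = np.column_stack([np.cos(theta), np.sin(theta)])

net = delta_net(circle, delta=0.1)
# Check the covering property of Definition 3.
dists = np.min(np.linalg.norm(circle[:, None, :] - net[None, :, :], axis=2), axis=1)
print(len(net), "net points; max distance to net =", dists.max())  # < 0.1
```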


III. CHARACTERISTICS OF CONTINUOUS FUNCTIONS AS ACTIVATION IN RBF NETWORKS

In this section, we show that for a continuous function to be qualified as an activation function in RBF networks, the necessary and sufficient condition is that it is not an even polynomial, and we prove two approximation theorems for RBF networks. More precisely, we prove the following.

Theorem 1: Suppose that g ∈ C(R^1) ∩ S'(R^1), i.e., g is a continuous function such that \int_{R^1} g(x) s(x)\,dx makes sense for all s ∈ S(R^1). Then the family

\sum_{i=1}^{N} c_i \, g(\lambda_i \|x - y_i\|_{R^n})

is dense in C(K) if and only if g is not an even polynomial, where K is a compact set in R^n, y_i ∈ R^n, c_i, λ_i ∈ R^1, i = 1, ..., N.

Theorem 2: Suppose that g ∈ C(R^1) ∩ S'(R^1) and is not an even polynomial, K is a compact set in R^n, and V is a compact set in C(K). Then for any ε > 0 there are a positive integer N, λ_i ∈ R^1, y_i ∈ R^n, i = 1, ..., N, which are all independent of f ∈ V, and constants c_i(f) depending on f, i = 1, ..., N, such that

\left| f(x) - \sum_{i=1}^{N} c_i(f)\, g(\lambda_i \|x - y_i\|_{R^n}) \right| < \varepsilon

holds for all x ∈ K and f ∈ V. Moreover, every c_i(f) is a continuous functional defined on V.

Remark 1: It is worth noting that the λ_i and y_i are all independent of f in V and that the c_i(f) are continuous functionals. This fact will play an important role in the approximation to nonlinear operators by RBF networks.
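The following minimal numerical sketch (not from the paper) illustrates what Theorems 1 and 2 assert: radial units with fixed centers y_i and scales λ_i, refitting only the coefficients c_i(f) for each target function, can approximate continuous functions on a compact set K ⊂ R^2. The Gaussian choice of g, the grid of centers, and the least-squares fit are assumptions made for illustration.

```python
import numpy as np

# Activation g: Gaussian (continuous, in S'(R^1), not an even polynomial).
g = lambda r: np.exp(-r**2)

# Compact set K = [0, 1]^2, sampled on a grid.
xs = np.linspace(0.0, 1.0, 30)
X = np.array([[a, b] for a in xs for b in xs])           # 900 points of K

# Fixed centers y_i and scale lambda, shared by every target f (as in Theorem 2).
cs = np.linspace(0.0, 1.0, 6)
centers = np.array([[a, b] for a in cs for b in cs])     # N = 36 radial units
lam = 4.0

# Design matrix Phi[k, i] = g(lam * ||x_k - y_i||).
Phi = g(lam * np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2))

def fit_coefficients(f_values):
    """Least-squares coefficients c_i(f) for one target f sampled on X."""
    c, *_ = np.linalg.lstsq(Phi, f_values, rcond=None)
    return c

# Two different targets f in V, approximated with the SAME centers and scale.
for f in (lambda p: np.sin(3 * p[:, 0]) * np.cos(2 * p[:, 1]),
          lambda p: np.exp(p[:, 0] - p[:, 1])):
    c = fit_coefficients(f(X))
    err = np.max(np.abs(Phi @ c - f(X)))
    print("sup-norm error on the grid:", err)
```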
To prove Theorem 1, we need the following lemma, which is of significance by itself.

Lemma 1: Suppose that h(x) ∈ C(R^n) ∩ S'(R^n). Then the family

\sum_{i=1}^{N} c_i \, h(\lambda_i \rho_i(x) + y_i)

is dense in C(K) if and only if h is not a polynomial, where the ρ_i are rotations in R^n, y_i ∈ R^n, λ_i ∈ R^1, i = 1, ..., N.

Proof: Sufficiency. Suppose that the linear combinations \sum_{i=1}^{N} c_i h(\lambda_i \rho_i(x) + y_i) are not dense in C(K). Then, by the Hahn-Banach extension theorem for linear functionals and the Riesz representation theorem (see [6]), we conclude that there is a bounded signed Borel measure dμ with supp(dμ) ⊂ K and

\int_K h(\lambda \rho(x) + y)\, d\mu(x) = 0 \qquad (3)

for all λ ∈ R^1, y ∈ R^n, and all rotations ρ. Pick any w ∈ S(R^n); then

\int_{R^n} w(y)\, dy \int_K h(\lambda \rho(x) + y)\, d\mu(x) = 0. \qquad (4)

Changing the order of integration, we have

\int_K d\mu(x) \int_{R^n} h(u)\, w(u - \lambda \rho(x))\, du = 0 \qquad (5)

which is equivalent to

\left( \hat{h}(t),\ \hat{w}(t)\, \widehat{d\mu}(\lambda \rho^{-1}(t)) \right) = 0 \qquad (6)

for all λ ∈ R^1, λ ≠ 0, and all rotations ρ, where \hat{h} represents the Fourier transform of h in the sense of distributions.

For (6) to make sense, we have to show that

\hat{w}(t)\, \widehat{d\mu}(\lambda \rho^{-1}(t)) \in S(R^n).

In fact, \hat{w}(t) ∈ S(R^n). Moreover, since supp(dμ) ⊂ K, it is easy to verify that \widehat{d\mu}(t) = \int_K e^{-i t \cdot x}\, d\mu(x) ∈ C^∞(R^n) and that there are constants C_k, k = 1, 2, ..., such that

\left| D^{\alpha} \widehat{d\mu}(t) \right| \le C_k, \qquad |\alpha| = k. \qquad (7)

Thus \hat{w}(t)\, \widehat{d\mu}(t) ∈ S(R^n).

Since dμ ≠ 0 and \widehat{d\mu}(t) ∈ C^∞(R^n), there are t_0 ∈ R^n, t_0 ≠ 0, and a neighborhood O(t_0, δ) = \{x : \|x - t_0\|_{R^n} < δ\} such that |\widehat{d\mu}(t)| > c > 0 for all t ∈ O(t_0, δ).

Pick t_1 ∈ R^n, t_1 ≠ 0, arbitrarily. Let t_0 = λρ(t_1), where ρ is a rotation in R^n. Then |\widehat{d\mu}(\lambda \rho(t))| > c for all t ∈ O(t_1, δ/λ).

Let w ∈ C_0^∞(O(t_1, δ/λ)); then w(t) / \widehat{d\mu}(\lambda \rho(t)) ∈ C_0^∞(O(t_1, δ/λ)), because |\widehat{d\mu}(\lambda \rho(t))| > c and \widehat{d\mu}(\lambda \rho(t)) ∈ C^∞(R^n). Therefore

\left( \hat{h}(t),\ w(t) \right) = 0. \qquad (8)

The previous argument shows that for any t^* ∈ R^n, t^* ≠ 0, there is a neighborhood O(t^*, η) of t^* such that

\left( \hat{h}(t),\ w(t) \right) = 0 \qquad (9)

holds for all w with support supp(w) ⊂ O(t^*, η), which means that supp(\hat{h}) ⊂ \{0\}. It is well known that a distribution is the Fourier transform of a polynomial if and only if its support is a subset of \{0\} (see [25, Section 7.16]). Thus h is a polynomial.

Necessity. If h is a polynomial of degree m, then all the functions

\sum_{i=1}^{N} c_i \, h(\lambda_i \rho_i(x) + y_i)

are polynomials of x_1, ..., x_n with total degree m, which, of course, are not dense in C(K). Lemma 1 is proved. □

Proof of Theorem 1: Suppose that g ∈ C(R^1) ∩ S'(R^1). Then h(x) = g(\|x\|_{R^n}) ∈ S'(R^n) ∩ C(R^n) and

\sum_{i=1}^{N} c_i \, g(\lambda_i \|x - y_i\|_{R^n}) = \sum_{i=1}^{N} c_i \, g(\lambda_i \|\rho(x) - \rho(y_i)\|_{R^n}) = \sum_{i=1}^{N} c_i \, g(\|\lambda_i \rho(x) - z_i\|_{R^n}) \qquad (10)

where z_i = λ_i ρ(y_i).

From Lemma 1, we see that the family \sum_{i=1}^{N} c_i g(\lambda_i \|x - y_i\|_{R^n}) is dense in C(K) if and only if g(\|x\|_{R^n}) is not a polynomial on R^n, which is equivalent to g not being an even polynomial. Theorem 1 is proved. □


IV. APPROXIMATION TO NONLINEAR FUNCTIONALS AND OPERATORS BY RBF NEURAL NETWORKS

In this section, we show some results concerning the capability of RBF neural networks in approximating nonlinear functionals and operators defined on some Banach space, which can be used to approximate the outputs of dynamical systems using sample data in either the frequency (phase) domain or the time (state) domain.

We first introduce one of our results.

Theorem 3: Suppose that g ∈ C(R^1) ∩ S'(R^1) and g is not an even polynomial, X is a Banach space with Schauder basis \{x_n\}_{n=1}^{\infty}, K ⊂ X is a compact set in X, and f is a continuous functional defined on K. Then for any ε > 0 there are positive integers M, N, points y_i^M ∈ R^M, and constants c_i ∈ R^1 and λ_i ≠ 0, i = 1, ..., N, such that

\left| f(x) - \sum_{i=1}^{N} c_i \, g(\lambda_i \|x^M - y_i^M\|_{R^M}) \right| < \varepsilon

holds for all x ∈ K, where x^M = (a_1(x), ..., a_M(x)) ∈ R^M and x = \sum_{n=1}^{\infty} a_n(x) x_n.

To prove Theorem 3, we need the following two lemmas.

Lemma 3 [21]: Suppose that X is a Banach space with Schauder basis \{x_n\}_{n=1}^{\infty}. Then K ⊂ X is a compact set in X if and only if the following two conditions are satisfied simultaneously: 1) K is a closed set in X, and 2) for any δ > 0 there is a positive integer M such that

\left\| \sum_{n=M+1}^{\infty} a_n(x)\, x_n \right\|_X < \delta

holds for all x ∈ K.

Lemma 4: Suppose that K is a compact set in a Banach space X with Schauder basis \{x_n\}_{n=1}^{\infty}. Define K^M = \{\sum_{i=1}^{M} a_i(x) x_i : x ∈ K\} and K^* = K \cup (\bigcup_{M=1}^{\infty} K^M). Then each K^M is a compact set in X and K^* is a compact set in X.

Proof: It is easy to verify that K^M is a compact set in X, provided that K is a compact set in X. Now suppose that \{u^n\}_{n=1}^{\infty} is a sequence in K^*. Then one of the following two cases occurs: i) there is a subsequence \{u^{n_l}\}_{l=1}^{\infty} of \{u^n\}_{n=1}^{\infty} with all elements lying in K or in some fixed K^M, or ii) there is no such subsequence.

In case i), there obviously is a further subsequence of \{u^{n_l}\}_{l=1}^{\infty} which converges to some element u in K or in K^M, because K and K^M are compact sets in X.

In case ii), there are a sequence v^n = \sum_{i=1}^{\infty} a_i(v^n) x_i in K and integers M_n tending to infinity as n → ∞ such that u^n = \sum_{i=1}^{M_n} a_i(v^n) x_i. By taking a suitable subsequence, without loss of generality we can assume that v^n converges to some v ∈ K as n → ∞. By Lemma 3, u^n − v^n → 0 as n → ∞. Thus u^n converges to v.

Combining the two cases, we conclude that K^* is a compact set in X. Lemma 4 is proved. □

Proof of Theorem 3: By the Tietze extension theorem, we can extend f to a continuous functional f* defined on K^*. Since f* is a continuous functional defined on the compact set K^*, for any ε > 0 there is a δ > 0 such that |f*(x') − f*(x'')| < ε/2 provided that x', x'' ∈ K^* and \|x' - x''\|_X < δ.

By Lemma 3, there is an integer M such that

\|x - x^M\|_X < \delta

for all x ∈ K, where x^M = \sum_{i=1}^{M} a_i(x) x_i. Therefore

\left| f(x) - f^*(x^M) \right| < \varepsilon/2 \qquad (17)

for all x ∈ K. K^M is homeomorphic to some compact set K_R^M in R^M by the map

\sum_{i=1}^{M} a_i(x)\, x_i \ \mapsto\ (a_1(x), \ldots, a_M(x)) \in R^M.

By Theorem 1, there are a positive integer N, constants c_i, λ_i ∈ R^1, and points y_i^M ∈ R^M, i = 1, ..., N, such that

\left| f^*(x^M) - \sum_{i=1}^{N} c_i \, g(\lambda_i \|x^M - y_i^M\|_{R^M}) \right| < \varepsilon/2 \qquad (18)

holds for all x ∈ K, where x^M is identified with its coordinate vector in R^M. Combining (17) and (18), we obtain

\left| f(x) - \sum_{i=1}^{N} c_i \, g(\lambda_i \|x^M - y_i^M\|_{R^M}) \right| < \varepsilon \qquad (19)

for all x ∈ K. Theorem 3 is proved. □

Example 2: Let X = L^2[0, 2π]. Then \{1, \{\cos nx\}_{n=1}^{\infty}, \{\sin nx\}_{n=1}^{\infty}\} is a Schauder basis, and the a_n(x) are just the corresponding Fourier coefficients. Thus, we can approximate every nonlinear functional defined on some compact set in L^2[0, 2π] by RBF neural networks using sample data in the frequency domain.

Example 3: Let X = L^2(R^1) and let \{\psi_{j,k}\}_{j,k=1}^{\infty} be wavelets in L^2(R^1). Then we can approximate continuous functionals defined on a compact set in L^2(R^1) by RBF neural networks using wavelet coefficients as sample data (also in the frequency domain).

Remark 2: Since in Theorem 3 we only require that \{x_n\}_{n=1}^{\infty} be a Schauder basis (no orthogonality requirement is imposed), we do not require that \{\psi_{j,k}\}_{j,k=1}^{\infty} be orthogonal wavelets. This property is a significant advantage, for nonorthogonal wavelets are much easier to construct than orthogonal wavelets.

The following theorem shows the possibility of approximating functionals by RBF neural networks using sample data in the time (or state) domain.

Theorem 4: Suppose that g ∈ C(R^1) ∩ S'(R^1) is not an even polynomial, X is a Banach space, K ⊂ X is a compact set, V is a compact set in C(K), and f is a continuous functional defined on V. Then for any ε > 0 there are positive integers N, M, points x_1, ..., x_M ∈ K, and λ_i, c_i ∈ R^1, ξ_i = (ξ_{i1}, ..., ξ_{iM}) ∈ R^M, i = 1, ..., N, such that

\left| f(u) - \sum_{i=1}^{N} c_i \, g(\lambda_i \|u^M - \xi_i\|_{R^M}) \right| < \varepsilon

for all u ∈ V, where u^M = (u(x_1), ..., u(x_M)) ∈ R^M.

Proof of Theorem 4: We follow a similar line as used in [12]. Pick a sequence ε_1 > ε_2 > ... > ε_k → 0. Then we can find another sequence δ_1 > δ_2 > ... > δ_k → 0 such that |f(u) − f(v)| < ε_k whenever u, v ∈ V and \|u - v\|_{C(K)} < δ_k. Moreover, we can find η_1 > η_2 > ... > η_k → 0 such that |u(x') − u(x'')| < δ_k for all u ∈ V whenever x', x'' ∈ K and \|x' - x''\|_X < η_k.

By induction and rearrangement, we can find a sequence \{x_i\}_{i=1}^{\infty} with each x_i ∈ K and a sequence of positive integers n(η_1) < n(η_2) < ... < n(η_k) → ∞ such that the first n(η_k) elements N(η_k) = \{x_1, ..., x_{n(η_k)}\} form an η_k-net in K.

For each η_k-net, define continuous functions T_{η_k,j}(x), j = 1, ..., n(η_k), with T_{η_k,j}(x) = 0 whenever \|x - x_j\|_X ≥ η_k, which form a partition of unity, i.e.,

\sum_{j=1}^{n(\eta_k)} T_{\eta_k, j}(x) = 1 \qquad \text{for all } x \in K.

For each u ∈ V, define a function

u_{\eta_k}(x) = \sum_{j=1}^{n(\eta_k)} u(x_j)\, T_{\eta_k, j}(x). \qquad (24)

Moreover, let V_{η_k} = \{u_{η_k} : u ∈ V\} and V^* = V \cup (\bigcup_{k=1}^{\infty} V_{η_k}). We then have the following conclusions (for a proof of these three propositions, see Lemma 7 in [12] or the Appendix of this paper):

1) For any fixed k, V_{η_k} is a compact set in a subspace of dimension n(η_k) in C(K).
2) For every u ∈ V, there holds \|u - u_{\eta_k}\|_{C(K)} < \delta_k. \qquad (25)
3) V^* is a compact set in C(K).

Now, similarly to the proof of Theorem 3, we can extend f to a continuous functional f* on V^* such that

f^*(x) = f(x) \quad \text{if } x \in V. \qquad (26)

Now, for any ε > 0 we can find a δ > 0 such that |f*(u) − f*(v)| < ε/2 provided that u, v ∈ V^* and \|u - v\|_{C(K)} < δ. Let k be fixed such that δ_k < δ. Then, by (25), for every u ∈ V

\|u - u_{\eta_k}\|_{C(K)} < \delta \qquad (27)

which implies

\left| f^*(u) - f^*(u_{\eta_k}) \right| < \varepsilon/2 \qquad (28)

for all u ∈ V.

By the argument used in the proof of Theorem 3, letting M = n(η_k), there are an integer N, constants λ_i, c_i ∈ R^1, vectors ξ_i = (ξ_{i1}, ..., ξ_{iM}) ∈ R^M, i = 1, ..., N, and the M points x_1, ..., x_M ∈ K of the η_k-net such that

\left| f^*(u_{\eta_k}) - \sum_{i=1}^{N} c_i \, g(\lambda_i \|u^M - \xi_i\|_{R^M}) \right| < \varepsilon/2 \qquad (29)

holds for all u ∈ V, where u^M = (u(x_1), ..., u(x_M)). Combining (28) and (29), we obtain

\left| f(u) - \sum_{i=1}^{N} c_i \, g(\lambda_i \|u^M - \xi_i\|_{R^M}) \right| < \varepsilon \qquad (30)

for all u ∈ V. Theorem 4 is proved. □


We now turn to the approximation of nonlinear operators by RBF networks.

Theorem 5: Suppose that g ∈ C(R^1) ∩ S'(R^1) is not an even polynomial, X is a Banach space, K_1 ⊂ X and K_2 ⊂ R^n are compact sets, V is a compact set in C(K_1), and G is a nonlinear continuous operator which maps V into C(K_2). Then for any ε > 0 there are positive integers m, M, N, points x_1, ..., x_m ∈ K_1, ω_k ∈ R^n, ξ_i^k ∈ R^m, and constants c_i^k, μ_i^k, λ_k ∈ R^1, i = 1, ..., M, k = 1, ..., N, such that

\left| G(u)(y) - \sum_{k=1}^{N} \sum_{i=1}^{M} c_i^k \, g(\mu_i^k \|u^m - \xi_i^k\|_{R^m}) \, g(\lambda_k \|y - \omega_k\|_{R^n}) \right| < \varepsilon

holds for all u ∈ V and y ∈ K_2, where u^m = (u(x_1), ..., u(x_m)) ∈ R^m.

Proof: Since G is continuous and V is compact, G(V) is a compact set in C(K_2). By Theorem 2 applied to G(V), there are an integer N, constants λ_k ∈ R^1, and points ω_k ∈ R^n, k = 1, ..., N, all independent of u, and continuous functionals c_k(·) on G(V) such that

\left| G(u)(y) - \sum_{k=1}^{N} c_k(G(u)) \, g(\lambda_k \|y - \omega_k\|_{R^n}) \right| < \varepsilon/2 \qquad (32)

holds for all u ∈ V and y ∈ K_2. Each u ↦ c_k(G(u)) is a continuous functional defined on V, so Theorem 4 applies: for each k there are an integer N_k, points x_1, ..., x_m ∈ K_1 (which, by the construction in the proof of Theorem 4, can be taken from a common η-net for all k), and constants c_i^k, μ_i^k ∈ R^1, ξ_i^k ∈ R^m such that

\left| c_k(G(u)) - \sum_{i=1}^{N_k} c_i^k \, g(\mu_i^k \|u^m - \xi_i^k\|_{R^m}) \right| < \frac{\varepsilon}{2N(C+1)}

holds for all k = 1, ..., N and u ∈ V, where C = \max_k \sup_{y \in K_2} |g(\lambda_k \|y - \omega_k\|_{R^n})| and u^m = (u(x_1), ..., u(x_m)).

Let M = \max_k \{N_k\} and let c_i^k = 0 for N_k < i ≤ M. Thus the approximation can be rewritten as the double sum

\sum_{k=1}^{N} \sum_{i=1}^{M} c_i^k \, g(\mu_i^k \|u^m - \xi_i^k\|_{R^m}) \, g(\lambda_k \|y - \omega_k\|_{R^n})

and, combining this with (32), we obtain

\left| G(u)(y) - \sum_{k=1}^{N} \sum_{i=1}^{M} c_i^k \, g(\mu_i^k \|u^m - \xi_i^k\|_{R^m}) \, g(\lambda_k \|y - \omega_k\|_{R^n}) \right| < \varepsilon

for all u ∈ V and y ∈ K_2. Theorem 5 is proved. □
Remark 3: Theorem 5 shows the capability of RBF networks in approximating nonlinear operators using sample data in the time (or state) domain. Likewise, by Theorem 3, we can construct RBF networks using sample data in the frequency (or phase) domain.

Remark 4: We can also construct neural networks where affine basis functions are mixed with radial basis functions. For example, we can approximate G(u)(y) by

\sum_{k=1}^{N} \sum_{i=1}^{M} c_i^k \, g(\mu_i^k \|u^m - \xi_i^k\|_{R^m}) \, \sigma(\omega_k \cdot y + \zeta_k).

The details are omitted here.
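The double-sum form in Theorem 5 (and its mixed affine-radial variant in Remark 4) can be read as one sub-network acting on the sampled input u^m and another acting on the evaluation point y, combined bilinearly. The sketch below, not from the paper, only evaluates such a network with arbitrary parameters; the dimensions, the Gaussian activation, and the parameter values are illustrative assumptions, and no training procedure is implied.

```python
import numpy as np

g = lambda r: np.exp(-r**2)   # radial activation (assumption)

def rbf_operator_net(u_m, y, params):
    """G(u)(y) ~ sum_k sum_i c[k,i] * g(mu[k,i]*||u_m - xi[k,i]||) * g(lam[k]*||y - omega[k]||)."""
    c, mu, xi, lam, omega = params          # shapes: (N,M), (N,M), (N,M,m), (N,), (N,n)
    branch = g(mu * np.linalg.norm(u_m[None, None, :] - xi, axis=2))   # (N, M), depends on u^m
    trunk = g(lam * np.linalg.norm(y[None, :] - omega, axis=1))        # (N,),  depends on y
    return np.sum((c * branch) * trunk[:, None])

# Toy dimensions: m samples of u, evaluation points y in R^2, N x M units.
m, n, N, M = 8, 2, 5, 4
rng = np.random.default_rng(2)
params = (rng.normal(size=(N, M)),          # c_i^k
          rng.uniform(0.5, 2.0, (N, M)),    # mu_i^k
          rng.normal(size=(N, M, m)),       # xi_i^k in R^m
          rng.uniform(0.5, 2.0, N),         # lambda_k
          rng.normal(size=(N, n)))          # omega_k in R^n

x_pts = np.linspace(0.0, 1.0, m)
u = np.sin(2 * np.pi * x_pts)               # u sampled at x_1, ..., x_m
print(rbf_operator_net(u, np.array([0.3, 0.7]), params))
```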
Remark 5: In engineering, a system can be viewed as an operator (linear or nonlinear). Theorem 5 thus shows the capability of RBF neural networks in identifying systems, comparable to the theorems that show the capability of ABF neural networks in pattern recognition.

V. CONCLUSION

In this paper, the problem of approximating functions of several variables, nonlinear functionals, and nonlinear operators by radial basis function neural networks is studied. The necessary and sufficient condition for a continuous function to be qualified as an activation function in RBF networks is given. Results on using RBF neural networks to compute the output of dynamical systems from sample data in the frequency domain or the time domain are also given.

APPENDIX
PROOF OF THE THREE PROPOSITIONS IN THE PROOF OF THEOREM 4

We prove the three propositions individually as follows.

1) For a fixed k, let u_{η_k}^{(j)}, j = 1, 2, ..., be a sequence in V_{η_k} and let u^{(j)} be the corresponding sequence in V such that

u_{\eta_k}^{(j)}(x) = \sum_{i=1}^{n(\eta_k)} u^{(j)}(x_i)\, T_{\eta_k, i}(x). \qquad (36)

Since V is compact, there is a subsequence u^{(j_l)}(x) which converges to some u ∈ V. It then follows that u_{η_k}^{(j_l)}(x) converges to u_{η_k}(x) ∈ V_{η_k}, which means that V_{η_k} is a compact set in C(K).

2) By the definition and the partition-of-unity property, we have

u(x) - u_{\eta_k}(x) = \sum_{j=1}^{n(\eta_k)} \left[ u(x) - u(x_j) \right] T_{\eta_k, j}(x) \qquad (37)

where the jth term vanishes unless \|x - x_j\|_X < η_k, in which case |u(x) − u(x_j)| < δ_k.

Consequently

\|u(x) - u_{\eta_k}(x)\|_X \le \delta_k \sum_{j=1}^{n(\eta_k)} T_{\eta_k, j}(x) = \delta_k \qquad (38)

for all u ∈ V.

3) Suppose \{u^j\}_{j=1}^{\infty} is a sequence in V^*. If there is a subsequence \{u^{j_l}\}_{l=1}^{\infty} of \{u^j\}_{j=1}^{\infty} with all u^{j_l} ∈ V, l = 1, 2, ..., then, by the fact that V is compact, there is a subsequence of \{u^{j_l}\}_{l=1}^{\infty} which converges to some u ∈ V. Otherwise, to each u^j there corresponds a positive integer k(j) such that u^j ∈ V_{η_{k(j)}}. There are two possibilities: i) we can find infinitely many j_l such that u^{j_l} ∈ V_{η_{k_0}} for some fixed k_0; by proposition 1 proved in this Appendix, V_{η_{k_0}} is a compact set, hence there is a subsequence of \{u^{j_l}\} which converges to some v ∈ V_{η_{k_0}}, i.e., a subsequence of \{u^j\} converging to v; and ii) there are sequences j_1 < j_2 < ... → ∞ and k(j_1) < k(j_2) < ... → ∞ such that u^{j_l} ∈ V_{η_{k(j_l)}}. Let v^{j_l} ∈ V be such that

u^{j_l}(x) = \sum_{i=1}^{n(\eta_{k(j_l)})} v^{j_l}(x_i)\, T_{\eta_{k(j_l)}, i}(x). \qquad (39)

Since v^{j_l} ∈ V and V is compact, there is a subsequence of \{v^{j_l}\}_{l=1}^{\infty} which converges to some v ∈ V. By proposition 2, proved in this Appendix, the corresponding subsequence of \{u^{j_l}\}_{l=1}^{\infty} also converges to v. Thus the compactness of V^* is proved.

ACKNOWLEDGMENT

The authors wish to thank Prof. R.-W. Liu of the University of Notre Dame and Prof. I. W. Sandberg of the University of Texas at Austin for bringing some of the papers in this area to their attention.

REFERENCES

[1] A. Wieland and R. Leighton, "Geometric analysis of neural network capacity," in Proc. IEEE 1st ICNN, vol. 1, 1987, pp. 385-392.
[2] B. Irie and S. Miyake, "Capabilities of three-layered perceptrons," in Proc. IEEE ICNN, vol. 1, 1988, pp. 641-648.
[3] S. M. Carroll and B. W. Dickinson, "Construction of neural nets using the Radon transform," in Proc. IJCNN, vol. I, 1989, pp. 607-611.
[4] K. Funahashi, "On the approximate realization of continuous mappings by neural networks," Neural Networks, vol. 2, pp. 183-192, 1989.
[5] H. N. Mhaskar and C. A. Micchelli, "Approximation by superposition of sigmoidal and radial functions," Advances in Applied Mathematics, vol. 13, pp. 350-373, 1992.
[6] G. Cybenko, "Approximation by superpositions of a sigmoidal function," Mathematics of Control, Signals and Systems, vol. 2, no. 4, pp. 303-314, 1989.
[7] Y. Ito, "Representation of functions by superpositions of a step or sigmoidal function and their applications to neural network theory," Neural Networks, vol. 4, pp. 385-394, 1991.
[8] K. Hornik, "Approximation capabilities of multilayer feedforward networks," Neural Networks, vol. 4, pp. 251-257, 1991.
[9] T. Chen, H. Chen, and R.-W. Liu, "Approximation capability in C(R^n) by multilayer feedforward networks and related problems," IEEE Trans. Neural Networks, vol. 6, no. 1, Jan. 1995.
[10] T. Chen, H. Chen, and R.-W. Liu, "A constructive proof of Cybenko's approximation theorem and its extensions," in Computing Science and Statistics: Proc. 22nd Symp. Interface, LePage and Page, Eds., East Lansing, MI, May 1990, pp. 163-168.
[11] T. Chen and H. Chen, "Approximation to continuous functionals by neural networks with application to dynamic systems," IEEE Trans. Neural Networks, vol. 4, no. 6, Nov. 1993.
[12] T. Chen and H. Chen, "Universal approximation to nonlinear operators by neural networks with arbitrary activation functions and its application to dynamic systems," IEEE Trans. Neural Networks, vol. 6, no. 4, July 1995.
[13] R. P. Lippmann, "Pattern classification using neural networks," IEEE Commun. Mag., vol. 27, pp. 47-64, 1989.
[14] J. Park and I. W. Sandberg, "Universal approximation using radial-basis-function networks," Neural Computation, vol. 3, pp. 246-257, 1991.
[15] D. Hush and B. Horne, "Progress in supervised neural networks," IEEE Signal Processing Magazine, Jan. 1993.
[16] J. E. Moody and C. J. Darken, "Fast learning in networks of locally-tuned processing units," Neural Computation, vol. 1, pp. 281-293, 1989.
[17] F. Girosi and T. Poggio, "Networks and the best approximation property," MIT Artificial Intelligence Laboratory, Memo 1164, 1989.
[18] E. J. Hartman, J. D. Keeler, and J. M. Kowalski, "Layered neural networks with Gaussian hidden units as universal approximators," Neural Computation, vol. 2, no. 2, pp. 210-215, 1990.
[19] S. Lee and R. M. Kil, "A Gaussian potential function network with hierarchically self-organizing learning," Neural Networks, vol. 4, pp. 207-224, 1991.
[20] J. Dieudonné, Foundations of Modern Analysis. New York and London: Academic Press, 1969, p. 142.
[21] L. A. Liusternik and V. J. Sobolev, Elements of Functional Analysis, 3rd ed. (translated from the 2nd Russian ed.). New York: Wiley, 1974.
[22] T. Chen, "Approximation to nonlinear functionals in Hilbert space by superposition of sigmoidal functions," Kexue Tongbao, 1992.
[23] J. Park and I. W. Sandberg, "Approximation and radial-basis-function networks," Neural Computation, vol. 5, 1993.
[24] T. Chen and H. Chen, "L^2(R^n) approximation by RBF neural networks," Chinese Annals of Mathematics, in press.
[25] W. Rudin, Functional Analysis. New York: McGraw-Hill, 1973.

Tianping Chen received the graduate degree from Fudan University, Shanghai, China, in 1966.
He is a Professor at Fudan University, Shanghai, China. He is also a Concurrent Professor at Nanjing University of Aeronautics and Astronautics. He has held short-term appointments at several institutions in the U.S. and Europe. His research interests include harmonic analysis, approximation theory, neural networks, and signal processing.
He has published over 80 journal papers and was a recipient of a National Award for Excellence in Scientific Research by the State Education Commission of China in 1985 and 1994.

Hong Chen received the B.S.E.E. degree from Fudan University, Shanghai, P.R. China, in 1988 and the M.S.E.E. and Ph.D. degrees from the University of Notre Dame, Notre Dame, Indiana, in 1991 and 1993, respectively.
He was with VLSI Libraries, Inc., Santa Clara, CA, and is now with Sun Microsystems, Inc., Mountain View, CA. His interests include neural networks, signal processing, and VLSI design.
