A Modified Invasive Weed Optimization Algorithm For Training of Feed Forward Neural Networks
Abstract- Invasive Weed Optimization Algorithm (IWO) is an ecologically inspired metaheuristic that mimics the process of weed colonization and distribution and is capable of solving multi-dimensional, linear and nonlinear optimization problems with appreciable efficiency. In this article a modified version of IWO has been used for training feed-forward Artificial Neural Networks (ANNs) by adjusting the weights and biases of the network. It has been found that the modified IWO performs better than another very competitive real parameter optimizer called Differential Evolution (DE) and a few classical gradient-based optimization algorithms for the weight training of feed-forward ANNs, in terms of both learning rate and solution quality. Moreover, IWO can also be used in the validation of reached optima and in the development of regularization terms and non-conventional transfer functions that do not necessarily provide gradient information.

Keywords: metaheuristics, invasive weed optimization, differential evolution, feed-forward neural networks, classification, back propagation.

I. INTRODUCTION

Since the advent of the Artificial Neural Network (ANN), it has been widely used in the fields of pattern recognition and function approximation. Among various kinds of ANNs, Feed-Forward Artificial Neural Networks (FFANNs) are considered to be powerful tools in the area of pattern classification [1], where universal FFANN approximators, for arbitrary finite-input environment measures, can be constituted by using only a single hidden layer [2]. The technique involves training the FFANN with the dataset to be recognized. The process of training an ANN is concerned with adjusting the weights between each pair of individual neurons and the corresponding biases until a close approximation of the desired output is achieved. Usually an ANN, unless specified otherwise, uses the Back Propagation (BP) algorithm for training purposes [3, 4]. The BP algorithm is a trajectory-driven technique, analogous to an error minimizing process. BP learning requires the neuron transfer function to be differentiable and it also suffers from the possibility of falling into local optima. BP is also known to be sensitive to the initial weight settings, and many weight initialization techniques have been proposed to lessen such a possibility [5, 6, 7]. So, BP is considered to be inefficient in searching for the global minimum of the search space [8].

The computational drawbacks of existing derivative-based numerical methods have forced researchers all over the world to rely on metaheuristic algorithms founded on simulations to solve engineering optimization problems. A common factor shared by the metaheuristics is that they combine rules and randomness to imitate some natural phenomena. Two closely related families of algorithms that primarily constitute this field today are the Evolutionary Algorithms (EAs) [9-11] and the Swarm Intelligence (SI) algorithms [12-14]. While the EAs emulate the processes of Darwinian evolution and natural genetics, the SI algorithms draw inspiration from the collective intelligence emerging from the behavior of a group of social insects (like bees, termites and wasps) and also from the socio-cognition theory of human beings.

To overcome the shortcomings of BP in training the ANN, metaheuristic ANN training models, i.e. the combination of stochastic optimization algorithms like Genetic Algorithms (GAs) [9, 10], Particle Swarm Optimization (PSO) [11, 12], and Differential Evolution (DE) [13] with the ANN learning process, have been proposed. A survey and overview of the evolutionary techniques in evolving ANNs can be found in [10]. Such evolutionary ANN models do not exhibit the inefficiencies of the BP algorithms, like the need for differentiability of the neuron transfer function, the possibility of getting trapped in a local optimum, etc. Further, the search techniques of the evolutionary models are population driven instead of the trajectory driven techniques of BP.

The common evolutionary techniques are biologically inspired stochastic global optimization methods. They have one common underlying idea behind them, which is based on a population of individuals [11]. Environmental pressure causes natural selection that in turn causes a rise in the fitness of the population. An objective (fitness) function represents a heuristic estimation of solution quality, and the variation and selection operators drive the search process. Such a process is iterated until convergence is reached. The best population member is expected to be a near-optimum solution [12].

Using a suitable ANN representation, the process of supervised ANN training using an evolutionary method involves performing several iterations in order to minimize or maximize a certain fitness function [8, 13, 14]. Such an optimization process would usually stochastically generate vectors representing the network's weight values, including
biases, calculate the fitness for the generated vectors, and try to keep those vectors that give better fitness values. It is also possible to include the ANN structure in such a representation, in which case the structure can also evolve [15]. The cycle is repeated to generate new offspring and eventually, after several iterations, the training process is halted based on some criteria.

In the recent past Mehrabian and Lucas proposed Invasive Weed Optimization (IWO) [16], a derivative-free metaheuristic algorithm mimicking the ecological behavior of colonizing weeds. Since its inception, IWO has found successful applications in many practical optimization problems like optimization and tuning of a robust controller [16], optimal positioning of piezoelectric actuators [17], developing a recommender system [18], design of an E-shaped MIMO antenna [19], and design of encoding sequences for DNA computing [20]. In this article IWO, with a modification from its original form, has been used as an evolutionary optimization technique to train artificial neural networks for the purpose of pattern recognition and function approximation. A single case for function approximation and three instances of pattern recognition have been used to illustrate the application of the proposed algorithm. Comparison with the results obtained by another very common and largely used evolutionary algorithm, DE [21], and three common back propagation algorithms, namely gradient descent BP, resilient BP and one step secant BP, establishes the superiority of the proposed method.

The rest of the paper is organized in the following way. Section II outlines the method to construct the FFANN structure and its details, Section III gives a short description of the IWO algorithm along with its modification, Section IV describes the performance indices, Section V presents the results on various datasets obtained by IWO and the comparison with the competing algorithms, and Section VI finally concludes the paper and unfolds some future research directions.

II. PROPOSED METHODOLOGY

A. FFANN Model

Artificial Neural Networks are highly interconnected simple processing units designed in a way to model how the human brain performs a particular task. Each of those units, also called neurons, forms a weighted sum of its inputs, to which a constant term called bias is added. This sum is then passed through a transfer function: linear, sigmoid or hyperbolic tangent. Figure 1 shows the internal structure of a neuron.

Figure 1: Structure of a neuron
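To make the neuron model concrete, the sketch below (not from the paper; the layer sizes, the random parameters and the use of NumPy are assumptions for illustration) computes the weighted sum plus bias for each unit and passes it through a tansig-style transfer function, which is exactly the computation an FFANN repeats layer by layer.

```python
import numpy as np

def tansig(x):
    # Hyperbolic tangent sigmoid transfer function (the "tansig" used later in the paper).
    return np.tanh(x)

def neuron(inputs, weights, bias, transfer=tansig):
    # One unit: weighted sum of the inputs plus the bias, passed through a transfer function.
    return transfer(np.dot(weights, inputs) + bias)

def ffann_forward(x, layers):
    # "layers" is a list of (W, b, transfer) tuples; each layer feeds the next.
    a = x
    for W, b, transfer in layers:
        a = transfer(W @ a + b)
    return a

# Toy 1-5-1 network with random parameters applied to a single scalar input.
rng = np.random.default_rng(0)
layers = [(rng.normal(size=(5, 1)), rng.normal(size=5), tansig),
          (rng.normal(size=(1, 5)), rng.normal(size=1), tansig)]
print(ffann_forward(np.array([0.3]), layers))
```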
Multi-Layer Perceptrons (MLPs) are the best known and most widely used kind of neural network. Networks with interconnections that do not form any loops are called Feed Forward Artificial Neural Networks (FFANNs). The units are organized in a way that defines the network architecture. In feed forward networks, units are often arranged in layers: an input layer, one or more hidden layers and an output layer. The units in each layer may share the same inputs, but are not connected to each other. Typically, the units in the input layer serve only for transferring the input pattern to the rest of the network, without any processing. The information is processed by the units in the hidden and output layers. Figure 2 depicts the architecture of a generic three-layered FFANN model.

The neural network considered is fully connected in the sense that every unit belonging to each layer is connected to every unit belonging to the adjacent layer. In order to find the optimal network architecture, several combinations were evaluated. These combinations included networks with different numbers of hidden layers, different numbers of units in each layer and different types of transfer functions. We converged to a configuration consisting of one input layer, one hidden layer and one output layer. However, the number of neurons in each layer and the transfer function of each layer vary with the different problems of function approximation and pattern recognition. The structure of the neural network for each case is described later with the individual cases.

This configuration has been proven to be a universal mapper, provided that the hidden layer has enough units. On the one hand, if there are too few units, the network will not be flexible enough to model the data well and, on the other hand, if there are too many units, the network may overfit the data. Typically, the number of units in the hidden layer is chosen by trial and error, selecting a few alternatives and then running simulations to find the one with the best results. Training of feed forward networks is normally performed in a supervised manner. One assumes that a training set is available, given by the dataset, containing both inputs and the corresponding desired outputs, which is presented to the network. An evolutionary algorithm has been used in this training to choose appropriate values of the weights and biases of the ANN so as to minimize the training error of the corresponding problem. The error minimization process is repeated until an acceptable criterion for convergence is reached. The knowledge acquired by the neural network through the learning process is tested by applying new data that it has never seen before, called the testing set. The network should be able to generalize and produce accurate outputs for this unseen data. It is undesirable to overtrain the neural network, meaning that the network
would only work well on the training set, and would not generalize well to new data outside the training set. For ANNs the most common learning algorithm is the back propagation algorithm. However, the standard back propagation learning algorithm is not efficient numerically and tends to converge slowly. To improve the training results we have used the ecologically inspired algorithm IWO rather than the BP algorithms, and have found that the proposed algorithm can outperform the others, i.e. DE, traingd, trainoss, trainrp, etc.

III. CLASSICAL IWO AND ITS MODIFICATION

Invasive Weed Optimization (IWO) is a metaheuristic algorithm that mimics the colonizing behavior of weeds. IWO can be summarized as follows.

A. Initialization

A finite number of weeds are initialized randomly in the entire search space. For the training of the FFANN, each weed consists of a string of network weights followed by the network biases. So the i'th weed can be represented as

$\vec{W}_i = [w_{i,1}, w_{i,2}, \ldots, w_{i,n}, b_{i,1}, b_{i,2}, \ldots, b_{i,m}]$

where $w_{i,p}$ is the p'th weight term of the network, $b_{i,q}$ is the q'th bias term of the network, and $n$ and $m$ are the total numbers of weights and biases respectively.
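A concrete, hedged illustration of this encoding is sketched below (the helper names `flatten_params`/`unflatten_params` and the 1-5-1 layer sizes are assumptions for illustration, not from the paper): the weight matrices and bias vectors are laid end to end in one real-valued vector that a weed can carry, and are restored before the network is evaluated.

```python
import numpy as np

def flatten_params(weights, biases):
    # Concatenate all weight matrices followed by all bias vectors into one weed vector.
    parts = [W.ravel() for W in weights] + [b.ravel() for b in biases]
    return np.concatenate(parts)

def unflatten_params(vector, layer_sizes):
    # Rebuild the per-layer weight matrices and bias vectors from a flat weed vector.
    weights, biases, idx = [], [], 0
    for n_in, n_out in zip(layer_sizes[:-1], layer_sizes[1:]):
        weights.append(vector[idx:idx + n_in * n_out].reshape(n_out, n_in))
        idx += n_in * n_out
    for n_out in layer_sizes[1:]:
        biases.append(vector[idx:idx + n_out])
        idx += n_out
    return weights, biases

# Example: a 1-5-1 network has n = 1*5 + 5*1 = 10 weights and m = 5 + 1 = 6 biases.
layer_sizes = [1, 5, 1]
rng = np.random.default_rng(1)
weed = rng.uniform(-500, 500, size=16)          # one weed = 16 parameters
W, b = unflatten_params(weed, layer_sizes)
assert np.allclose(flatten_params(W, b), weed)  # round-trip check
```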
B. Reproduction

Each weed $W_{i,G}$ of the population at generation $G$ is allowed to produce seeds depending on its own fitness, as well as the highest and lowest fitness of the colony, such that the number of seeds produced by a weed increases linearly from the lowest possible number for the weed with the worst fitness to the maximum number of seeds for the weed with the best fitness.

C. Spatial Dispersal

The generated seeds are then randomly distributed over the entire search space as normally distributed random numbers with zero mean but varying variance. This means that the seeds will be randomly distributed in the neighborhood of the parent weed. Here the standard deviation ($\sigma$) of the random function is reduced from a previously defined initial value $\sigma_{initial}$ to a final value $\sigma_{final}$ in every iteration of the algorithm following eq. (1):

$\sigma_{iter} = \dfrac{(iter_{max} - iter)^{pow}}{(iter_{max})^{pow}}\,(\sigma_{initial} - \sigma_{final}) + \sigma_{final}$    (1)

where $iter_{max}$ is the maximum number of iterations, $\sigma_{iter}$ is the standard deviation at the present iteration and $pow$ is the non-linear modulation index. This step ensures that the probability of dropping a seed in a distant area decreases nonlinearly with each iteration, which results in grouping of the fitter plants and elimination of inappropriate plants.
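The schedule of eq. (1) can be sketched in a few lines (illustrative only; the values of sigma_initial and sigma_final follow the percentages given in Table 1 for a [-500, 500] search range, while the modulation index pow = 3 is a common choice in the IWO literature rather than a value stated in this text).

```python
def sigma_schedule(iteration, iter_max, sigma_initial, sigma_final, pow_=3):
    # Eq. (1): the seed-dispersal standard deviation shrinks nonlinearly
    # from sigma_initial down to sigma_final as the iterations progress.
    frac = (iter_max - iteration) ** pow_ / iter_max ** pow_
    return frac * (sigma_initial - sigma_final) + sigma_final

# Search range [-500, 500]: sigma_initial = 10% and sigma_final = 0.01% of its width (Table 1).
sigma_initial, sigma_final = 0.10 * 1000, 0.0001 * 1000
for it in (0, 50, 100, 150, 200):
    print(it, round(sigma_schedule(it, 200, sigma_initial, sigma_final), 4))
```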
E. Modification of IWO

Here we aim at reducing the standard deviation $\sigma$ for a weed when the objective function value of that particular weed nears the minimum objective function value of the current population, so that the weed disperses its seeds within a small neighborhood of the suspected optimum. Eqn. (2) describes the scheme by which the standard deviation $\sigma_i$ of
each weed is adapted, where

$\Delta f_i = |f(\vec{W}_i) - f(\vec{W}_{best})|$    (3)

so that when $\Delta f_i \to 0$, $\sigma_i \to \sigma_{final}$. As $\sigma_{final} \ll \sigma_{initial}$, when $\Delta f_i \to 0$, i.e. the i'th weed is in close proximity of the optimum, the standard deviation of the weed becomes very small, resulting in dispersal of the corresponding seeds within a small neighborhood around the optimum. Thus in this scheme, instead of using a fixed $\sigma$ for all weeds in a particular iteration, we vary the standard deviation of each weed depending on its objective function value. So this scheme on the one hand increases the explorative power of the weeds and on the other creates some probability for the seeds dispersed by the undesirable weeds (the weeds with higher objective function values) to become fitter plants. These features were absent in the classical IWO algorithm. Figure 3 shows the variation of $\sigma_i$ vs $\Delta f_i$ and Figure 4 represents the flowchart of the modified IWO algorithm.

Figure 3: Variation of $\sigma_i$ with $\Delta f_i$ for $\sigma_{final} = 0.001$
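Since eq. (2) itself is not reproduced in this text, the following sketch is only one plausible realization of the idea just described, stated explicitly as an assumption rather than the authors' exact rule: a weed whose objective value is close to the population best receives a spread near sigma_final, while a poor weed keeps a spread near the iteration-level value.

```python
import numpy as np

def per_weed_sigma(f_weed, f_best, sigma_iter, sigma_final, scale=1.0):
    # Illustrative stand-in for eq. (2): delta_f = |f(W_i) - f(W_best)| drives the spread;
    # delta_f -> 0 gives sigma_i -> sigma_final, large delta_f gives sigma_i -> sigma_iter.
    delta_f = abs(f_weed - f_best)
    return sigma_final + (sigma_iter - sigma_final) * (1.0 - np.exp(-delta_f / scale))

f_best = 0.02
for f_weed in (0.02, 0.05, 0.5, 5.0):
    print(f_weed, round(per_weed_sigma(f_weed, f_best, sigma_iter=10.0,
                                       sigma_final=0.001), 4))
```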
IV. INDEX OF PERFORMANCE EVALUATION

In this paper we have used two different classes of problem set: Function Approximation and Pattern Recognition. The indices of performance evaluation for these two classes are elaborated as follows.

A. Index for Function Approximation Problem Set

In the case of the function approximation problem, the regularized Mean Square Error (MSEREG), i.e. the mean of the squared differences between the desired and obtained outputs augmented with a regularization term, is used as the index of performance evaluation. As the neural network may have a large number of solutions of network weights and biases having the same mean square error, the network parameters may grow explosively. So, to allow only network parameters with the lowest numerical values to be selected, a penalty term consisting of the squares of the weights and biases has been added to the mean square error to form the complete fitness function (MSEREG) of the evolutionary optimization algorithms like IWO and DE for training the neural network. Hence, for well trained networks, MSEREG should be as small as possible.
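A minimal sketch of such a regularized fitness is given below (an illustration under assumptions: the 1-5-1 tansig network of the function approximation experiment is hard-coded, and the relative weighting `penalty_ratio` of the error and penalty terms is not specified in the text, so it is introduced here in the spirit of MATLAB's msereg performance measure).

```python
import numpy as np

def msereg_fitness(weed, X, T, penalty_ratio=0.9):
    # Fitness of one weed encoding a 1-5-1 tansig network: mean squared output
    # error plus a penalty on the squared weights and biases (smaller is better).
    W1 = weed[0:5].reshape(5, 1);  W2 = weed[5:10].reshape(1, 5)
    b1 = weed[10:15];              b2 = weed[15:16]
    hidden = np.tanh(X @ W1.T + b1)          # X has shape (samples, 1)
    output = np.tanh(hidden @ W2.T + b2)     # shape (samples, 1)
    mse = np.mean((output - T) ** 2)
    msw = np.mean(weed ** 2)                 # mean squared parameter value
    return penalty_ratio * mse + (1.0 - penalty_ratio) * msw

# Example: fitness of a random weed on 50 samples of sin(x).
rng = np.random.default_rng(2)
X = np.linspace(0, 2 * np.pi, 50).reshape(-1, 1)
T = np.sin(X)
print(msereg_fitness(rng.uniform(-500, 500, size=16), X, T))
```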
B. Index for Pattern Recognition Problem Set

In the case of the pattern recognition problems, the Classification Error Percentage (CEP), as defined in (4), is used as the fitness function of the IWO and DE algorithms:

$CEP = \dfrac{E_p}{P} \times 100$    (4)

where
$E_p$ = total number of incorrectly recognized training or testing patterns,
$P$ = total number of training or testing patterns.

Hence, for well trained networks, CEP should be as small as possible.
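Eq. (4) reduces to a few lines of code; the sketch below (illustrative, assuming the usual one-output-neuron-per-class decoding implied by the network structures in Table 3) counts the misclassified patterns and expresses them as a percentage.

```python
import numpy as np

def cep(network_outputs, targets):
    # Eq. (4): percentage of patterns whose predicted class (the output neuron
    # with the largest activation) differs from the target class.
    predicted = np.argmax(network_outputs, axis=1)
    actual = np.argmax(targets, axis=1)
    E_p = np.count_nonzero(predicted != actual)
    return 100.0 * E_p / len(targets)

# Example: 4 patterns, 2 classes, one misclassified -> CEP = 25%.
outputs = np.array([[0.9, 0.1], [0.2, 0.8], [0.6, 0.4], [0.3, 0.7]])
targets = np.array([[1, 0], [0, 1], [0, 1], [0, 1]])
print(cep(outputs, targets))
```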
V. EXPERIMENTS AND RESULTS

A. Competitor Algorithms

The performance of the modified IWO has been compared with the following algorithms.

• Differential Evolution (DE): a novel evolutionary algorithm first introduced by Storn and Price [21], which is inspired by the theory of evolution. It has been successfully used in many artificial and real optimization problems and applications [22], including training a neural network [23]. In this paper the DE/rand/1/bin variant is used for comparison.
• Back propagation algorithm with an adaptive learning rate (TRAINGDX).
• One step secant learning method (TRAINOSS).
• Resilient back propagation algorithm (TRAINRP).

B. Experimental Results

In this section the performance of IWO is evaluated by experiments. The experiments were conducted with various configurations of FFANN and two commonly used problem domains: Function Approximation and Pattern Recognition.

1) Function Approximation

Here the IWO-trained FFANN has been used to approximate a very simple and conventional function, sin(x). The structure of the selected FFANN consists of one input, one hidden and one output layer containing 1, 5 and 1 neurons respectively. The transfer functions of the network layers are tansig-tansig-tansig (tansig: Hyperbolic Tangent Sigmoid)
respectively. Such a network has been selected after much experimentation. The same network architecture is used for the other competing algorithms.

Table 1: Parametric set-up for IWO and DE/rand/1/bin

IWO:
  Search range: [-500, 500]
  Maximum population size: 200
  Initial population size: 50
  Max seed: 8
  Min seed: 1
  σ_initial: 10% of the entire search range
  σ_final: 0.01% of the entire search range

DE/rand/1/bin:
  Search range: [-500, 500]
  Population size: 200
  Scale factor, F: 0.8
  Crossover probability, Cr: 0.9

Figure 4: Flowchart of the modified IWO algorithm (INITIALIZATION: initialize randomly generated weeds in the entire search space; REPRODUCTION AND SPATIAL DISPERSAL: create the seed population by producing normally distributed seeds with zero mean and standard deviation σ_i for each weed following eqn. (2), depending on the fitness of the weed).
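To tie the pieces together, here is a compact sketch of the training loop suggested by the flowchart (a hedged illustration, not the authors' code: competitive exclusion is realized as simple truncation to the maximum population size, the per-weed spread reuses the illustrative stand-in for eq. (2) from Section III, and the fitness can be any of the measures of Section IV; a simple sphere function keeps the example self-contained).

```python
import numpy as np

def modified_iwo(fitness, dim, iters=200, init_pop=50, max_pop=200,
                 seed_min=1, seed_max=8, lo=-500.0, hi=500.0,
                 sigma_initial=100.0, sigma_final=0.1, pow_=3, rng=None):
    rng = np.random.default_rng(0) if rng is None else rng
    pop = rng.uniform(lo, hi, size=(init_pop, dim))               # initialization
    cost = np.array([fitness(w) for w in pop])
    for it in range(iters):
        sigma_iter = ((iters - it) ** pow_ / iters ** pow_) * \
                     (sigma_initial - sigma_final) + sigma_final  # eq. (1)
        worst, best = cost.max(), cost.min()
        seeds = []
        for w, c in zip(pop, cost):
            # Reproduction: seed count grows linearly from worst to best fitness.
            n_seeds = int(round(seed_min + (seed_max - seed_min) *
                                (worst - c) / (worst - best + 1e-12)))
            # Spatial dispersal with a fitness-dependent spread (illustrative eq. (2)).
            sigma_i = sigma_final + (sigma_iter - sigma_final) * \
                      (1.0 - np.exp(-abs(c - best)))
            for _ in range(n_seeds):
                seeds.append(np.clip(w + rng.normal(0.0, sigma_i, dim), lo, hi))
        pop = np.vstack([pop, seeds])
        cost = np.array([fitness(w) for w in pop])
        keep = np.argsort(cost)[:max_pop]                          # competitive exclusion
        pop, cost = pop[keep], cost[keep]
    return pop[0], cost[0]

# Example: minimize a simple sphere function over 16 parameters.
best_weed, best_cost = modified_iwo(lambda w: float(np.sum(w ** 2)), dim=16, iters=50)
print(best_cost)
```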
On the training data a BP-trained network may occasionally attain a lesser performance index. Its limitation gets exposed when it is tested with some new data points, as evident from the pattern recognition problems to be discussed next. The approximated sine curve obtained by IWO, along with the original one, is shown in Figure 5.

Figure 5: Approximated sine curve along with the original one, obtained by IWO

2) Pattern Recognition Problem

In this paper three datasets, Diabetes, Cancer and Glass, have been used, which are available online from [25]. The properties of the datasets are summarized in Table 3. The last row of Table 3 shows the percentage of the various classes in each dataset. There are two classes in each of the Cancer and Diabetes datasets and six classes in the Glass dataset.

A three layer FFANN was used for each problem to work as a pattern classifier. For all three problems the transfer functions of the layers have been chosen as purelin-tansig-tansig respectively. Such a selection has been made after much experimentation to obtain the best possible results. The network configurations used for each dataset are summarized in Table 3. The number of neurons in the output layer is equal to the number of classes in the corresponding dataset. The parametric set-up for the IWO and DE/rand/1/bin algorithms is the same as shown in Table 1. 80% of each dataset has been used for training and the remaining 20% for testing. Both the training and testing CEPs obtained by IWO and the other competing algorithms are shown in Tables 4, 5 and 6.

The various classes are numbered in the Tables according to Table 3. We have run 50 independent training sessions of the IWO, DE and BP algorithms for each of the selected datasets and report the mean of these runs along with the standard deviation, best and worst CEP. It is evident that in some instances the training CEP obtained by the BP algorithms is better than that obtained by IWO. This again occurs due to overtraining of the network on the training data. However, the testing CEP obtained by IWO is much better than those obtained by the DE and BP algorithms for each dataset, as can be verified from Tables 4, 5 and 6. This fact establishes the claim of "overtraining" of the network, i.e. fitting only the training dataset, by the BP algorithms. Whether a network is overtrained or not can be understood only by comparing the testing CEP, not the training one. Now, as IWO comfortably beats the other competitors where the testing CEP is concerned, these experiments establish the superiority of IWO in training FFANNs for use as pattern classifiers.

Table 3: Properties of the three datasets and the configuration of the neural networks used

CANCER:   FFANN structure 9-8-2, 169 weights, 19 biases; classes: I. Benign (65.14%), II. Malign (34.86%)
DIABETES: FFANN structure 8-7-2, 134 weights, 17 biases; classes: I. Diabetes (33.07%), II. No Diabetes (66.92%)
GLASS:    FFANN structure 9-12-6, 261 weights, 27 biases; classes: I. Building float processed (40.19%), II. Building non-float processed (27.10%), III. Vehicle float processed (6.54%), IV. Containers (8.41%), V. Tableware (5.61%), VI. Headlamps (12.15%)
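The evaluation protocol described above (an 80/20 train/test split and statistics over 50 independent runs) can be sketched as follows; the random shuffling, the seeding and the `train_once` placeholder are assumptions for illustration, not details given in the paper.

```python
import numpy as np

def split_80_20(patterns, targets, rng):
    # Random 80% / 20% split of a dataset into training and testing portions.
    idx = rng.permutation(len(patterns))
    cut = int(0.8 * len(patterns))
    tr, te = idx[:cut], idx[cut:]
    return (patterns[tr], targets[tr]), (patterns[te], targets[te])

def repeated_runs(train_once, patterns, targets, runs=50, seed=0):
    # Run the trainer several times and report mean, std, best and worst test CEP.
    rng = np.random.default_rng(seed)
    ceps = []
    for _ in range(runs):
        (Xtr, Ttr), (Xte, Tte) = split_80_20(patterns, targets, rng)
        ceps.append(train_once(Xtr, Ttr, Xte, Tte))   # returns the testing CEP
    ceps = np.array(ceps)
    return ceps.mean(), ceps.std(), ceps.min(), ceps.max()

# Dummy example: a "trainer" that just returns a random CEP.
X = np.zeros((100, 8)); T = np.zeros((100, 2))
print(repeated_runs(lambda *a: np.random.default_rng().uniform(0, 30), X, T, runs=5))
```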
Table 4: Comparison of the CEP for the CANCER dataset among IWO and the other algorithms
Table 5: Comparison of the CEP for the DIABETES dataset among IWO and the other algorithms
Table 6: Comparison of the CEP for the GLASS dataset among IWO and the other algorithms
VI. CONCLUSION

The training time required for obtaining convergence with this algorithm sometimes becomes intolerable for larger datasets. But as the performance is much better, a trade-off has to be made between time and performance, and future work should consider this trade-off. For larger FFANNs with larger datasets, the intrinsic parallel nature of the FFANN feed-forward calculations would invite the use of a parallel implementation to speed up the fitness function calculations, resulting in a reduction of the overall training time required by our proposed algorithm, as sketched below. Future research may also focus on the recognition of more complex and useful applications like speech, characters, etc.
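As a pointer to that parallelization idea, the following sketch (an assumption-based illustration, not part of the original study) farms the fitness evaluation of a weed population out to worker processes; this is attractive because each weed's fitness needs only the dataset and its own parameter vector.

```python
from multiprocessing import Pool

import numpy as np

def fitness(weed):
    # Placeholder fitness: any of the measures from Section IV could be used here;
    # a simple sphere function keeps this sketch self-contained.
    return float(np.sum(np.asarray(weed) ** 2))

def evaluate_population(population, workers=4):
    # Evaluate all weeds in parallel; each worker handles a slice of the population.
    with Pool(processes=workers) as pool:
        return np.array(pool.map(fitness, population))

if __name__ == "__main__":
    rng = np.random.default_rng(3)
    population = list(rng.uniform(-500, 500, size=(200, 16)))
    print(evaluate_population(population)[:5])
```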
ACKNOWLEDGEMENT

This work was supported by the Czech Science Foundation under grant No. 102/09/1494.

REFERENCES

[1] U. Seiffert, Training of Large-Scale Feed Forward Neural Networks, International Joint Conference on Neural Networks (2006) 5324-5329.
[2] X. Jiang, A.H.K.S. Wah, Constructing and Training Feed-Forward Neural Networks for Pattern Classification, Pattern Recognition 36 (2003) 853-867.
[3] A.T. Chronopoulos, J. Sarangapani, A Distributed Discrete-Time Neural Network Architecture for Pattern Allocation and Control, Proceedings of the International Parallel and Distributed Processing Symposium (IPDPS'02) (2002) 204-211.
[4] K.M. Lane, R.D. Neidinger, Neural Networks from Idea to Implementation, ACM SIGAPL APL Quote Quad 25(3) (1995) 27-37.
[5] L. Fausett, Fundamentals of Neural Networks: Architecture, Algorithms, and Applications, Prentice Hall, New Jersey, 1994.
[6] L.G.C. Hamey, XOR Has No Local Minima: A Case Study in Neural Network Error Surface Analysis, Neural Networks 11 (1998) 669-681.
[7] G. Wei, Study of Evolutionary Neural Network based on Ant Colony Optimization, International Conference on Computational Intelligence and Security Workshops (2007) 3-6.
[8] D. Kim, H. Kim, D. Chung, A Modified Genetic Algorithm for Fast Training Neural Networks, Advances in Neural Networks - ISNN 2005, Springer Berlin/Heidelberg, volume 3496/2005, 660-665.
[9] W. Gao, Evolutionary Neural Network based on New Ant Colony Algorithm, International Symposium on Computational Intelligence and Design (ISCID) (2008) 318-321.
[10] X. Yao, Evolving Artificial Neural Networks, Proceedings of the IEEE 87(9) (1999) 1423-1447.
[11] H. Pierreval, C. Caux, J.L. Paris, F. Viguier, Evolutionary Approaches to Design and Organization of Manufacturing Systems, Computers and Industrial Engineering 44 (2003) 339-364.
[12] M.G.H. Omran, M. Mahdavi, Global Best Harmony Search, Applied Mathematics and Computation 198 (2008) 643-656.
[13] E. Alba, J.F. Chicano, Training Artificial Neural Networks with GA Hybrid Algorithm, Genetic and Evolutionary Computation Conference (GECCO) 2004.
[14] R.E. Dorsey, J.D. Johnson, W.J. Mayer, A Genetic Algorithm for the Training of Feedforward Neural Networks, Advances in Artificial Intelligence in Economics, Finance and Management (1994) 93-111.
[15] J. Yu, S. Wang, L. Xi, Evolving Artificial Neural Networks using an Improved PSO and DPSO, Neurocomputing 71 (2008) 1054-1060.
[16] A.R. Mehrabian, C. Lucas, A Novel Numerical Optimization Algorithm Inspired from Weed Colonization, Ecological Informatics 1 (2006) 355-366.
[17] A.R. Mehrabian, A. Yousefi-Koma, Optimal Positioning of Piezoelectric Actuators on a Smart Fin using Bio-inspired Algorithms, Aerospace Science and Technology 11 (2007) 174-182.
[18] H. Sepehri Rad, C. Lucas, "A Recommender System based on Invasive Weed Optimization Algorithm", IEEE Congress on Evolutionary Computation, CEC 2007, pp. 4297-4304.
[19] A.R. Mallahzadeh, S. Es'haghi, A. Alipour, "Design of an E-shaped MIMO Antenna using IWO Algorithm for Wireless Application at 5.8 GHz", Progress in Electromagnetics Research, PIER 90 (2009) 187-203.
[20] X. Zhang, Y. Wang, G. Cui, Y. Niu, J. Xu, Application of a Novel IWO to the Design of Encoding Sequences for DNA Computing, Comput. Math. Appl. 57 (2009) 2001-2008.
[21] R. Storn, K.V. Price, "Differential Evolution - A Simple and Efficient Adaptive Scheme for Global Optimization over Continuous Spaces", Technical Report TR-95-012, ICSI, 1995.
[22] J. Lampinen, A Bibliography of Differential Evolution Algorithm, http://www.lut.fi
[23] T. Masters, W. Land, A New Training Algorithm for the General Regression Neural Network, IEEE International Conference on Systems, Man and Cybernetics, Computational Cybernetics and Simulation, 3 (1997) 1990-1994.
[24] Y. Liu, J.A. Starzyk, Z. Zhu, Optimized Approximation Algorithm in Neural Networks without Overfitting, IEEE Transactions on Neural Networks 19(6) (2008) 983-995.
[25] ftp://ftp.ira.uka.de/pub/neuron/proben1.tar.gz