
IEEE TRANSACTIONS ON CYBERNETICS, VOL. XX, NO. X, XXXX XXXX

A Competitive Swarm Optimizer for Large Scale Optimization

Ran Cheng and Yaochu Jin, Senior Member, IEEE

Abstract: In this paper, a novel competitive swarm optimizer (CSO) for large scale optimization is proposed. The algorithm is fundamentally inspired by particle swarm optimization (PSO) but is conceptually very different. In the proposed CSO, neither the personal best position of each particle nor the global best position (or neighborhood best positions) is involved in updating the particles. Instead, a pairwise competition mechanism is introduced, where the particle that loses the competition updates its position by learning from the winner. To understand the search behavior of the proposed CSO, a theoretical proof of convergence is provided, together with an empirical analysis of its exploration and exploitation abilities, showing that the proposed CSO achieves a good balance between exploration and exploitation. Despite its algorithmic simplicity, our empirical results demonstrate that the proposed CSO exhibits better overall performance than five state-of-the-art metaheuristic algorithms on a set of widely used large-scale optimization problems, and is able to effectively solve problems of dimensionality up to 5000.

Index Terms: Particle swarm optimization, competition, learning, convergence analysis, competitive swarm optimizer, large scale optimization.

I. INTRODUCTION

PARTICLE swarm optimization (PSO) is a powerful and widely used swarm intelligence paradigm [1] introduced by Kennedy and Eberhart in 1995 [2] for solving optimization problems. The algorithm is based on a simple mechanism that mimics the swarm behavior of social animals such as flocking birds. Due to its simplicity of implementation, PSO has witnessed a rapid increase in popularity over the past decades.

PSO maintains a swarm of particles, each of which has a position and a velocity in an n-dimensional search space, the position representing a candidate solution of the optimization problem to be solved. To locate the global optimum, the velocity and position of each particle are updated iteratively using the following equations:
Vi(t+1) = ω Vi(t) + c1 R1(t)(pbesti(t) − Xi(t)) + c2 R2(t)(gbest(t) − Xi(t)),   (1)

Xi(t+1) = Xi(t) + Vi(t+1),   (2)

where t is the iteration (generation) number; Vi(t) and Xi(t) represent the velocity and position of the i-th particle, respectively; ω is termed the inertia weight [3]; c1 and c2 are the acceleration coefficients [4]; R1(t) and R2(t) are two vectors randomly generated within [0, 1]^n; and pbesti(t) and gbest(t) are the best solution of the i-th particle found so far, often known as the personal best, and the best solution found by all particles so far, known as the global best, respectively. Kennedy referred to c1 R1(t)(pbesti(t) − Xi(t)) and c2 R2(t)(gbest(t) − Xi(t)) as the cognitive component and the social component, respectively [5].

Ran Cheng and Yaochu Jin are with the Department of Computing, University of Surrey, Guildford, Surrey, GU2 7XH, United Kingdom (e-mail: {r.cheng;yaochu.jin}@surrey.ac.uk).
Manuscript received July 20, 2013; revised xxxx, xxxx.
Due to its conceptual simplicity and high search efficiency, PSO has attracted much research interest over the past decades and has been successfully applied to a number of applications, such as water distribution network design [6], parameter optimization in suspension systems [7], resource allocation [8], task assignment [9], DNA sequence compression [10] and many others [11]. However, it has been found that PSO performs poorly when the optimization problem has a large number of local optima or is high-dimensional [12]. These weaknesses can usually be attributed to the premature convergence that often occurs in PSO [13].
As for any population-based optimization technique, convergence speed and global search ability are two critical performance criteria for PSO algorithms. In order to alleviate premature convergence by achieving a good balance between fast convergence and global search ability, a number of PSO variants have been suggested, which can be largely classified into the following four categories [13]:
1) Adaptation of the control parameters. ω, c1 and c2 are the three control parameters in the canonical PSO, as shown in (1). ω, termed the inertia weight, was first proposed by Shi and Eberhart to balance between global search and local refinement [14]. The inertia weight was further modified by linearly decreasing it from 0.9 to 0.4 over the search procedure [15]. Another important modification of ω is the introduction of fuzzy inference [16].
Methods for adapting c1 and c2 (the acceleration coefficients) have also been suggested. Time-varying acceleration coefficients were first introduced by Ratnaweera et al. in [17]. Similarly, a time-varying PSO based on a novel operator was introduced in [18]. Most recently, a multiple-parameter control mechanism was introduced in [19] to adaptively change all three parameters.
2) Modifications in topological structures. The motivation
of introducing topological structures in PSO is to enhance the swarm diversity using neighborhood control
[20], [21]. Several topological structures have been
proposed [22], including the ring topology and the von
Neumann topology. In [23], a fully informed PSO (FIPS)


was developed, where the update of each particle is


based on the positions of several neighbors. Another
important example is the comprehensive learning PSO
(CLPSO) introduced in [24], where particles update
each dimension by learning from different local best
positions. Recently, a distance-based locally informed
particle swarm optimizer was proposed specifically to
tackle multimodal problems in [25].
3) Hybridization with other search techniques. Since different search techniques have different strengths, it is a
natural idea to hybridize PSO with other search methods. One straightforward idea is to combine PSO with
different evolutionary algorithms, such as genetic algorithms [26], differential evolution [27], and ant colony
optimization [28], [29]. Another important idea is to
integrate PSO with local search techniques [30], [31]. In
addition, various search operators based on sociological
or biological concepts have also been proposed, such
as the niching PSO [32], cultural-based PSO [33] and
the aging theory inspired PSO (ALC-PSO) [13]. Other
hybrid PSO variants include PSO with Gaussian mutation [34], PSO with chaos [35], orthogonal learning PSO
(OLPSO) [36], PSO with a moderate-random-search
strategy [37], and a very recently proposed PSO with
a periodic mutation strategy and neural networks [38].
4) Multi-swarm PSO. One early work on multi-swarm PSO
was reported in [39], where the sub-swarms cooperate
to solve large scale optimization problems. In [30],
a dynamic multi-swarm PSO (DMS-PSO) algorithm
is proposed to dynamically change the neighborhood
structure for a higher degree of swarm diversity, even
with a very small swarm size. More multi-swarm PSO
variants can be found in [40], [41].
Since most PSO variants introduce new mechanisms or additional operators, the enhancement of search performance often comes at the cost of increased computational complexity.
Furthermore, due to the strong influence of the global best
position gbest on the convergence speed [39], premature
convergence remains a major issue in most existing PSO
variants. To take a closer look into the influence of gbest on
premature convergence, we rewrite (1) as follows:
Vi(t+1) = ω Vi(t) + θ1 (p1 − Xi(t)),   (3)

where

θ1 = c1 R1(t) + c2 R2(t),
p1 = [c1 R1(t) / (c1 R1(t) + c2 R2(t))] pbesti(t) + [c2 R2(t) / (c1 R1(t) + c2 R2(t))] gbest(t).   (4)

From (3), it can be noticed that the difference between p1 and Xi serves as the main source of diversity. More precisely, the diversity of p1 itself is generated by the difference between pbesti and gbest, refer to (4). However, in practice, due to the global influence of gbest, pbesti is very likely to have a value similar to or even the same as gbest, thus considerably reducing the swarm diversity. In other words, as a source of diversity, p1 largely determines how well PSO is able to balance exploration and exploitation in the search procedure. Having noticed this, Mendes and Kennedy proposed a modified PSO, where p1 is the best particle in the neighborhood of a particle rather than a combination of gbest and pbesti [23]. Out of similar considerations, Liang et al. also introduced a PSO variant without gbest [24], where the update strategy learns from pbesti only.

In order to address premature convergence, a step further might be to get rid of gbest and pbesti completely. An attempt was made along this line in a multi-swarm framework [42], where neither gbest nor pbesti is used. In this multi-swarm framework, the update of particles is driven by a pairwise competition mechanism between particles from the two swarms. After each competition, the loser is updated according to information from the winner swarm, while the winner is updated using a mutation strategy. In the experiments, the framework showed promising performance on relatively high-dimensional problems.

Following the idea in [42], in this work we explore the use of the competition mechanism between particles within one single swarm. In addition, the particle that loses a competition learns from the winner particle instead of from gbest or pbesti. Since the main driving force behind this idea is the pairwise competition mechanism between different particles, and neither gbest nor pbesti is involved in the search process, we term the proposed algorithm Competitive Swarm Optimizer (CSO) to avoid ambiguity. CSO distinguishes itself from the canonical PSO mainly in the following aspects:
1) In the canonical PSO, the dynamic system is driven mostly by the global best position gbest and the personal best positions pbesti, whilst in CSO there is no gbest or pbesti. Instead, the dynamic system is driven by a random competition mechanism, where any particle could be a potential leader;
2) In PSO, the historical best positions are recorded, whilst CSO uses no memory to store historical positions. Instead, the particles that lose a competition learn only from the winners in the current swarm.

The rest of this paper is organized as follows. Section II


presents the details of CSO, followed by an empirical analysis
of search behaviors and a theoretical convergence proof in
Section III. Section IV first presents some statistical results
that compare CSO with a few state-of-the-art algorithms on
the CEC08 benchmark functions of dimensionality up to 1000
[43]. Empirical investigations on the influence of the parameter
settings are also conducted. The search ability of CSO has
been challenged further with the test functions of 2000 and
5000 dimensions, which, to the best of our knowledge, are
the highest dimensions that have ever been reported in the
evolutionary optimization literature. Finally, the influence of
neighborhood control on the search performance has been
investigated. Conclusions will be drawn in Section V.


II. ALGORITHM

Without loss of generality, we consider the following minimization problem:

min f(X)
s.t. X ∈ X,   (5)

where X ⊆ R^n is the feasible solution set, and n denotes the dimension of the search space, i.e., the number of decision variables.
In order to solve the optimization problem above, a swarm P(t) containing m particles is randomly initialized and iteratively updated, where m is known as the swarm size and t is the generation index. Each particle has an n-dimensional position, Xi(t) = (xi,1(t), xi,2(t), ..., xi,n(t)), representing a candidate solution to the above optimization problem, and an n-dimensional velocity vector, Vi(t) = (vi,1(t), vi,2(t), ..., vi,n(t)). In each generation, the particles in P(t) are randomly allocated into m/2 couples (assuming that the swarm size m is an even number), and afterwards a competition is made between the two particles in each couple. As a result of each competition, the particle having the better fitness, hereafter denoted as the winner, is passed directly to the next generation of the swarm, P(t+1), while the particle that loses the competition, the loser, updates its position and velocity by learning from the winner. After learning from the winner, the loser is also passed to swarm P(t+1). This means that each particle participates in exactly one competition: for a swarm size of m, m/2 competitions occur, and the positions and velocities of m/2 particles are updated. Fig. 1 illustrates the main idea of CSO.
Fig. 1. The general idea of CSO. During each generation, particles are pairwise randomly selected from the current swarm for competitions. After each competition, the loser, whose fitness value is worse, is updated by learning from the winner, while the winner is passed directly to the swarm of the next generation. (The diagram shows the winner and loser of each competition in swarm P(t), the learning step applied to the loser, and the resulting swarm P(t+1), with t = t+1 closing the loop.)

Let us denote the position and velocity of the winner and the loser in the k-th round of competition in generation t by Xw,k(t), Xl,k(t) and Vw,k(t), Vl,k(t), respectively, where k = 1, 2, ..., m/2. Accordingly, after the k-th competition, the loser's velocity is updated using the following learning strategy:

Vl,k(t+1) = R1(k,t) Vl,k(t) + R2(k,t)(Xw,k(t) − Xl,k(t)) + φ R3(k,t)(X̄k(t) − Xl,k(t)).   (6)

As a result, the position of the loser can be updated with the new velocity:

Xl,k(t+1) = Xl,k(t) + Vl,k(t+1),   (7)

where R1(k,t), R2(k,t), R3(k,t) ∈ [0, 1]^n are three vectors randomly generated after the k-th competition and learning process in generation t, X̄k(t) is the mean position of the relevant particles, and φ is a parameter that controls the influence of X̄k(t). Specifically, for X̄k(t), a global version and a local version can be adopted:
1) X̄k^g(t) denotes the global mean position of all particles in P(t);
2) X̄l,k^l(t) denotes the local mean position of the particles in a predefined neighborhood of particle l.
It has been found that neighborhood control is able to help improve PSO's performance on multimodal functions by maintaining a higher degree of swarm diversity [21]. Similarly, the motivation for introducing neighborhood control in X̄k(t) is to increase swarm diversity, which potentially enhances the search performance of CSO. In the remainder of this paper, the global version X̄k^g(t) is adopted as the default setup unless otherwise specified. The performance of CSO using the local version X̄l,k^l(t) will be investigated in Section IV.D.

In order to gain a better understanding of the learning strategy in CSO, we provide below more discussion of (6).
1) The first part, R1(k,t) Vl,k(t), is similar to the inertia term in the canonical PSO, refer to (1), and ensures the stability of the search process. The only difference is that the inertia weight ω in PSO is replaced by a random vector R1(k,t) in CSO.
2) The second part, R2(k,t)(Xw,k(t) − Xl,k(t)), is also called the cognitive component, after Kennedy and Eberhart. Different from the canonical PSO, the particle that loses the competition learns from its competitor, instead of from its personal best position found so far. This mechanism may be biologically more plausible in simulating animal swarm behaviors, since it is hard to require that all particles memorize the best position they have experienced in the past.
3) The third part, φ R3(k,t)(X̄k(t) − Xl,k(t)), is termed the social component, again after Kennedy and Eberhart. However, the particle that loses the competition learns from the mean position of the current swarm rather than from gbest, which requires no memory and makes good sense biologically.
With the descriptions and definitions above, the pseudocode of CSO is summarized in Algorithm 1. We can see that CSO maintains the algorithmic simplicity of PSO, which is quite different from most existing PSO variants. Apart from the fitness evaluations, which are problem-dependent [44], [45], the main computational cost in CSO is the update of Xl(t), which is an inevitable operation in most swarm or population based evolutionary algorithms [2], [46]-[48]. Consequently, the computational complexity of CSO is O(mn), where m is the swarm size and n is the search dimensionality.

Algorithm 1 The pseudocode of the Competitive Swarm Optimizer (CSO). t is the generation number. U denotes the set of particles that have not yet participated in a competition. Unless otherwise specified, the terminal condition is the maximum number of fitness evaluations.
1: t = 0;
2: randomly initialize P(0);
3: while terminal condition is not satisfied do
4:   calculate the fitness of all particles in P(t);
5:   U = P(t), P(t+1) = ∅;
6:   while U ≠ ∅ do
7:     randomly choose two particles X1(t), X2(t) from U;
8:     if f(X1(t)) ≤ f(X2(t)) then
9:       Xw(t) = X1(t), Xl(t) = X2(t);
10:    else
11:      Xw(t) = X2(t), Xl(t) = X1(t);
12:    end if
13:    add Xw(t) into P(t+1);
14:    update Xl(t) using (6) and (7);
15:    add the updated Xl(t+1) to P(t+1);
16:    remove X1(t), X2(t) from U;
17:  end while
18:  t = t + 1;
19: end while

III. SEARCH DYNAMICS ANALYSIS AND CONVERGENCE PROOF

In order to better understand the search mechanism of CSO, we carry out empirical studies on its search dynamics by comparing it with the canonical PSO. In addition, a theoretical proof of convergence is given, which shows that CSO, similar to the canonical PSO, converges to an equilibrium. Note, however, that this equilibrium is not necessarily the global optimum.

A. Analysis of search dynamics

1) Exploration: Exploration is desirable in the early stage of optimization to perform global search and locate the optimum regions. To examine the exploration ability of CSO, we reformulate (6) into a form similar to (3):

Vi(t+1) = R1(k,t) Vi(t) + θ2 (p2 − Xi(t)),   (8)

where

θ2 = R2(k,t) + φ R3(k,t),
p2 = [R2(k,t) / (R2(k,t) + φ R3(k,t))] Xw(t) + [φ R3(k,t) / (R2(k,t) + φ R3(k,t))] X̄(t).   (9)

Compared to (4), it can be observed that (9) has a better chance of generating a higher degree of diversity. On the one hand, particle Xw(t) is randomly chosen from the swarm before the competition, whilst gbest(t) is deterministically updated and shared by all particles, and pbesti(t) is also deterministically updated and always used by particle i. On the other hand, although X̄(t) is shared by several particles, it depends on the current mean position of the whole swarm, which is less likely to introduce a bias towards any particular particle. Finally, it is noted that in CSO only half of the particles are updated in each generation, while in PSO all particles are updated.

To illustrate the above intuitive observation, three typical cases are considered below to show how CSO is potentially able to perform more explorative search than the canonical PSO.

Fig. 2. Illustration of the search dynamics of the canonical PSO on a multimodal optimization problem. In this case, gbest(t) is in a local optimum region and the two particles w(t) and l(t) are attracted into this region. (The figure plots f(X) over X, marking Xl(t), pbestl(t), Xw(t), pbestw(t), gbest(t), p1w = (pbestw(t)+gbest(t))/2, p1l = (pbestl(t)+gbest(t))/2, and the global optimum.)

Let us first consider a situation where gbest(t) is trapped in a local optimum, as illustrated in Fig. 2. For particle l(t), since both Xl(t) and pbestl(t) are located inside the local optimum region, the particle will move towards the local optimum position recorded by gbest(t). By contrast, although both Xw(t) and pbestw(t) are located outside the local optimum region, particle w(t) will still move in a wrong direction, towards the local optimum, due to the dynamics driven by gbest(t).

A natural idea to avoid the situation shown in Fig. 2 is to remove gbest(t) from the update strategy, so that particles learn from pbest(t) only. This methodology has already been adopted by Liang et al. [24]. In this way, particle l(t) is able to fly over the local optimum, refer to Fig. 3. Without gbest(t), PSO seems to be in a better position to perform exploration.
However, although gbest(t) is removed, pbest(t) can still attract the particles into a local optimum region, limiting the ability to explore the whole landscape. Let us consider the situation shown in Fig. 4. In iteration t, both particles reside inside the local optimum region, including their pbest(t). In iteration t+1, coincidentally, particle w(t+1) manages to move outside the local optimum region, and its new position is Xw(t+1). However, since the fitness value of pbestw(t+1) is still better than that of Xw(t+1), pbestw(t) is not updated. As a consequence, pbestw(t) continues to serve as an attractor, which pulls particle w(t+1) back into the local optimum region again. In this situation, pbestw(t) has played the role of a local gbest, although no gbest(t) is adopted. In comparison, both the situation shown in Fig. 2 and that in Fig. 4 can be avoided by CSO, because both gbest(t) and pbest(t) are removed, refer to Fig. 5.

Fig. 3. Illustration of a situation where search is performed in a multimodal space using a PSO variant without gbest. In this case, particle l will fly over the global optimum region due to the attraction of pbestw. (The figure plots f(X) over X, marking Xl(t), pbestl(t), Xw(t), pbestw(t) and the global optimum.)

Fig. 4. Illustration of a situation where search is performed in a multimodal space using a PSO variant without gbest(t). In this case, the pbestw(t) of particle w(t) is located in a local optimum region, serving as a local gbest. As a result, both particles l(t) and w(t) will be attracted by pbestw(t). (The figure plots f(X) over X, marking Xl(t), pbestl(t), Xw(t), Xw(t+1), pbestw(t) and the global optimum.)

Fig. 5. Illustration of a situation where search is performed in a multimodal space using a PSO variant with neither gbest(t) nor pbest(t) (CSO). In this case, particle l(t) is only attracted by particle w(t), thereby flying over the local optimum region. (The figure plots f(X) over X, marking Xw(t), Xl(t+1), Xw(t+1), Xl(t) and the global optimum.)

2) Exploitation: Exploitation is required in the later search stage to refine the solution found in the exploration stage. To analyze the exploitation behavior of CSO, we randomly pick two particles w and l from the swarm, for which the following relationship holds:

f(Xw(t)) ≤ f(Xl(t)).   (10)

According to the definitions of gbest and pbest in the canonical PSO, the following relationship can be obtained:

f(gbest(t)) ≤ f(pbestw(t)) ≤ f(Xw(t)),
f(gbest(t)) ≤ f(pbestl(t)) ≤ f(Xl(t)).   (11)

When t becomes very large in the late search stage, the following relationship between pbestw(t), pbestl(t) and gbest(t) holds:

pbestw(t) ≈ gbest(t),
pbestl(t) ≈ gbest(t).   (12)

Let

F1(t) = |f(Xl(t)) − f(gbest(t))|
      = |f(Xl(t)) − f((gbest(t) + gbest(t))/2)|
      ≈ |f(Xl(t)) − f((gbest(t) + pbestl(t))/2)|
      = |f(Xl(t)) − f(p̄1)|,

F2(t) = |f(Xl(t)) − f(Xw(t))|
      = |f(Xl(t)) − f(p̄2)|,   (13)

where p̄1 is the expected value of p1 in (4) and p̄2 is the expected value of p2 in (9) with φ = 0. Then the following relationship can be obtained from (10), (11) and (12):

F2(t) ≤ F1(t).   (14)

The relationship in (14) indicates that CSO, in comparison with the canonical PSO, has a better ability to exploit the small gaps between two positions whose fitness values are very similar.
B. Theoretical convergence proof

Similar to most theoretical convergence analyses of PSO [49]-[51], a deterministic implementation of CSO is considered in order to theoretically analyze its convergence property. It should also be pointed out that the proof does not guarantee convergence to the global optimum.
For any particle i in P(t), there are two possible behaviors after it participates in a competition:
1) Xi(t+1) = Xi(t), if Xi(t) is a winner;
2) Xi(t+1) is updated using (6) and (7), if Xi(t) is a loser.

In case Xi(t) is a winner, the particle is not updated. Therefore, we only need to consider the case where Xi(t) is a loser and is then updated. Without loss of generality, we can rewrite (6) and (7) by considering a one-dimensional deterministic case:

Vi(t+1) = (1/2) Vi(t) + (1/2)(Xw(t) − Xi(t)) + (φ/2)(X̄(t) − Xi(t)),
Xi(t+1) = Xi(t) + Vi(t+1),   (15)

where 1/2 is the expected value of R1, R2 and R3, and Xw(t) is the position of the winner in the competition with the i-th particle.

Theorem 1. For any given φ ≥ 0, the dynamic system described by (15) will converge to an equilibrium.

Proof. Let

θ = (1 + φ)/2,
p = (1/(1 + φ)) Xw(t) + (φ/(1 + φ)) X̄(t),   (16)

then (15) can be simplified to:

Vi(t+1) = (1/2) Vi(t) + θ (p − Xi(t)),
Xi(t+1) = Xi(t) + Vi(t+1).   (17)

The search dynamics described by (17) can be seen as a dynamical system, and the convergence analysis of the system can be conducted using the well-established theory of stability analysis for dynamical systems. To this end, we rewrite system (17) in the following form:

y(t+1) = A y(t) + B p,   (18)

where

y(t) = [Vi(t), Xi(t)]^T,   A = [[1/2, −θ], [1/2, 1 − θ]],   B = [θ, θ]^T,   (19)

where A is called the state matrix in dynamical system theory, p is called the external input that drives the particle to a specific position, and B is called the input matrix that controls the external effect on the dynamics of the particle.

If there exists an equilibrium y* that satisfies y*(t+1) = y*(t) for any t, it can be calculated from (18) and (19):

y* = [0, p]^T,   (20)

which means that the particles will finally stabilize at the same position, provided that p is constant, i.e., an optimum (local or global) has been found, so that no more updates of p will happen.

Convergence means that the particles will eventually settle down at the equilibrium point y*. From dynamical system theory, we know that the convergence property depends on the eigenvalues of the state matrix A:

λ² − (3/2 − θ) λ + 1/2 = 0,   (21)

where the eigenvalues are:

λ1 = [(3/2 − θ) + sqrt((3/2 − θ)² − 2)] / 2,
λ2 = [(3/2 − θ) − sqrt((3/2 − θ)² − 2)] / 2.   (22)

The necessary and sufficient condition for convergence, i.e., for the equilibrium point to be a stable attractor, is that |λ1| < 1 and |λ2| < 1, leading to the result:

θ > 0,   (23)

and if θ is substituted using (16), the condition for convergence in terms of φ is:

φ > −1.   (24)

Therefore, φ ≥ 0 is a sufficient condition for the convergence of the system.

From Theorem 1, we can conclude that the algorithm will converge to an equilibrium regardless of the exact value of φ, as long as φ > −1. In this work, only non-negative values, φ ≥ 0, will be adopted.

IV. EXPERIMENTAL STUDIES
In this section, we perform a set of experiments on the seven benchmark functions proposed in the CEC08 special session on large scale optimization problems [43]. Among the seven functions, f1 (Shifted Sphere), f4 (Shifted Rastrigin) and f6 (Shifted Ackley) are separable, while the other four functions, f2 (Schwefel Problem), f3 (Shifted Rosenbrock), f5 (Shifted Griewank) and f7 (Fast Fractal), are non-separable. Note that f5 becomes more separable as the dimension increases, because the product component of f5 becomes increasingly less significant with increasing dimension [52]. Therefore, in the following experiments, if the dimension is equal to or higher than 500, f5 is regarded as a separable function.

At first, experiments are conducted to empirically understand the influence of the two parameters in CSO, namely, the swarm size m and the social factor φ. Then, CSO is compared with a few recently proposed algorithms for large scale optimization on 100-D, 500-D, and 1000-D benchmark functions. Afterwards, to assess its scalability with the search dimension, CSO is further challenged on 2000-D and 5000-D functions. Finally, the influence of neighborhood control on CSO's swarm diversity and search performance is investigated.

The experiments are run on a PC with an Intel Core i5-2500 3.3 GHz CPU and the Microsoft Windows 7 Enterprise SP1 64-bit operating system, and CSO is written in C using Microsoft Visual Studio 2010 Enterprise.


All statistical results, unless otherwise specified, are averaged over 25 independent runs. For each independent run, the maximum number of fitness evaluations (FEs), as recommended in [43], is set to 5000 × n, where n is the search dimension of the test functions. In the comparisons between different statistical results, two-tailed t-tests are conducted at a significance level of α = 0.05.
A. Parameter settings

12

160

10

140
10

Optimization result

Optimization result

10
120
100
80
60

10

10

10
40
20

100

200

300

400

500

10

600

100

200

Swarm size

300

400

500

600

400

500

600

Swarm size

(a) f2

(b) f3

10

10

10

Optimization result

Optimization result

10
5

10

10

10

15

10

10

10

10
20

10

25

10

15

100

200

300

Swarm size

(c) f1

400

500

600

10

100

200

300
Swarm size

(d) f6

Fig. 6. Statistical results of optimization errors obtained CSO on 2 nonseparable functions f2 , f3 and 2 separable functions f1 , f6 of 500 dimensions
with different swarm sizes m varying from 25 to 300.

that CSO performs well on 500-D functions with a swarm size


around 150, which is smaller than the swarm sizes adopted in
CCPSO2, DMS-PSO and other PSO variants, though CSO is
a single-swarm algorithm without any ad hoc mechanism for
large scale optimization. More interestingly, when the swarm
size is bigger than some specific values (e.g. 100 for f1 ), the
optimization performance begin to deteriorate. The reason is
that with a bigger swarm size, more FEs (fitness evaluations)
have to be performed in each generation. Since the terminal
condition in this experiment is the maximal number of FEs,
a larger swarm size means a smaller number of generations.
This also implies that the performance of CSO does not rely
much on a large swarm size. From Fig. 6, we can also see
that a swarm size smaller than 100 might be too small for
500-dimensional problems. Based on the observations and
discussions above, the swarm size should not be smaller
than 200 for real-world large scale (D 500) optimization
problems.
2) Social factor: In the following, we investigate the influence of the social component by varying the social factor φ. To this end, simulations have been conducted on the four functions with the swarm size m varying from 200 to 1000 and φ varying from 0 to 0.3.
From the statistical results summarized in Table I, we can see that the best statistical results lie on the diagonal of the table, which implies that there exists a correlation between m and φ. Additionally, it can also be noticed that the non-separable functions (f2 and f3) require a smaller φ than the separable functions (f1 and f6) to achieve good performance. The reason might be that separable functions are easier to optimize; as a result, a bigger φ works better because it leads to faster convergence. The best combinations observed from Table I are summarized in Table II.
1) Swarm size: Like most swarm optimization algorithms, the swarm size is an indispensable parameter. With a small swarm size, the particles tend to converge very fast before the search space is well explored, thus leading to premature convergence; however, if the swarm size is too big, a large number of FEs will be required during each generation, which may become impractical for computationally expensive problems.
Generally, the swarm size is specified empirically. For example, in CCPSO2 [52], multiple swarms were adopted with a swarm size of 30 for each swarm, and for 500-D functions, the number of swarms varied from 4 to 50, creating an average size of around 240; in DMS-PSO [30], a larger swarm size of 450 was adopted for the optimization of 500-D functions.
To gain empirical insight into the influence of the swarm size on the search performance of CSO, the swarm size has been varied from 25 to 300 for the four CEC08 functions f1, f2, f3 and f6 of search dimension 500. Among the four functions, f1 and f6 are separable and the other two are non-separable. To remove the influence of the social component, φ is set to 0 in this set of experiments.
Fig. 6 shows the statistical results of the optimization errors obtained by CSO with different swarm sizes m.
Fig. 7. Fitting curves describing the relationship between the social factor φ and the swarm size m that lead to the best search performance, obtained using logarithmic linear regression analysis.

For a deeper insight into the relationship between the optimal pair of φ and m, a logarithmic linear regression analysis has been performed to model the relationship between m and φ using the data in Table II, as shown in Fig. 7. Based on the regression analysis result, the following empirical setup for φ as a function of the swarm size m is recommended:

    φ(m) = 0,                       if m ≤ 100,
    φ(m) ∈ [φ_L(m), φ_U(m)],        otherwise,        (25)

where φ_L(m) and φ_U(m) are the lower and upper bounds of the recommended social factor, determined by the regression model in (26).

IEEE TRANSACTIONS ON CYBERNETICS, VOL. XX, NO. X, XXXX XXXX

TABLE I
STATISTICAL RESULTS (MEAN VALUES AND STANDARD DEVIATIONS) OF OPTIMIZATION ERRORS OBTAINED BY CSO ON 2 NON-SEPARABLE FUNCTIONS f2, f3 AND 2 SEPARABLE FUNCTIONS f1, f6 OF 500 DIMENSIONS, WITH THE SWARM SIZE m VARYING FROM 200 TO 1000 AND φ FROM 0 TO 0.3.

Swarm size | Function | φ = 0              | φ = 0.1            | φ = 0.2            | φ = 0.3
m = 200    | f2       | 4.79E+01(1.97E+00) | 8.26E+01(2.85E+00) | 8.58E+01(3.48E+00) | 8.45E+01(1.27E+00)
m = 200    | f3       | 5.95E+02(1.55E+02) | 8.25E+02(5.28E+01) | 1.07E+07(4.57E+06) | 4.33E+09(6.06E+08)
m = 200    | f1       | 1.08E-09(2.26E-10) | 4.73E-23(8.70E-25) | 1.51E+01(1.77E+01) | 3.50E+04(3.77E+03)
m = 200    | f6       | 3.24E-06(2.96E-07) | 3.57E-13(1.02E-14) | 2.78E+00(1.75E-01) | 1.04E+01(7.18E-01)
m = 400    | f2       | 6.09E+01(1.06E+00) | 5.47E+01(3.46E+00) | 7.41E+01(7.92E-01) | 6.09E+01(1.06E+00)
m = 400    | f3       | 1.31E+06(1.22E+05) | 4.90E+02(1.27E-01) | 5.01E+03(6.21E+02) | 2.75E+08(3.96E+07)
m = 400    | f1       | 3.17E+00(3.86E-01) | 4.38E-16(5.76E-17) | 3.22E-22(2.41E-23) | 1.29E+03(2.69E+02)
m = 400    | f6       | 2.94E-01(1.76E-02) | 1.49E-09(4.36E-11) | 8.82E-13(1.42E-14) | 4.00E+00(1.85E-01)
m = 600    | f2       | 6.52E+01(7.68E-01) | 2.74E+01(3.76E+00) | 6.72E+01(1.26E+00) | 7.17E+01(8.32E-01)
m = 600    | f3       | 2.75E+08(3.96E+07) | 4.92E+02(4.00E-01) | 1.39E+03(3.84E+02) | 5.28E+07(1.09E+07)
m = 600    | f1       | 3.26E+02(2.84E+01) | 2.57E-08(1.69E-09) | 5.46E-22(2.72E-23) | 3.73E+01(1.61E+01)
m = 600    | f6       | 3.33E+00(3.07E-02) | 1.27E-05(6.24E-07) | 1.10E-12(1.48E-14) | 1.91E+00(8.08E-02)
m = 1000   | f2       | 7.12E+01(7.47E-01) | 3.22E+01(4.68E-01) | 6.11E+01(1.61E+00) | 6.89E+01(7.57E-01)
m = 1000   | f3       | 1.40E+09(1.26E+08) | 6.43E+02(1.68E+01) | 5.97E+02(6.90E+01) | 8.34E+06(1.76E+06)
m = 1000   | f1       | 5.74E+03(4.94E+02) | 1.19E-02(5.35E-04) | 7.26E-12(2.89E-13) | 2.01E-18(7.69E-19)
m = 1000   | f6       | 6.75E+00(4.61E-02) | 9.58E-03(3.70E-04) | 1.60E-07(8.03E-09) | 2.55E-11(5.88E-12)

Two-tailed t-tests have been conducted between the statistical results in each row. If one result is significantly better than all the other errors, it is highlighted. Note that the statistical results are shown in the order of f2, f3 (non-separable functions) and f1, f6 (separable functions), to clearly show the different φ values required by non-separable and separable functions.
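The t values reported in Table I and in the comparison tables below can be reproduced from the tabulated means and standard deviations. A minimal sketch follows; the sample size of n = 25 independent runs is an assumption (a common CEC08 practice), and the sign convention, negative when CSO's error is smaller, follows the tables.

```python
import math

def t_from_stats(mean_cso, std_cso, mean_other, std_other, n=25):
    """Two-sample t value from summary statistics (Welch-style
    standard error with equal sample sizes). Negative when CSO's
    mean error is smaller, matching the tables' sign convention.
    n = 25 runs is an assumption, not stated in this excerpt."""
    se = math.sqrt((std_cso ** 2 + std_other ** 2) / n)
    return (mean_cso - mean_other) / se

# Example: CSO vs CCPSO2 on 100-D f2 (values from Table IV)
t = t_from_stats(3.35e1, 5.38e0, 6.08e0, 7.83e0)
print(round(t, 1))  # close to the tabulated 1.44E+01
```

Plugging in other rows of Tables IV-VI reproduces the tabulated t values to the printed precision, which supports the assumed sample size.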

TABLE II
THE BEST COMBINATIONS OF THE SWARM SIZE m AND THE SOCIAL FACTOR φ IN THE OPTIMIZATION OF 500-D f1, f2, f3 AND f6.

       | m = 200 | m = 400 | m = 600 | m = 1000
φ_min  | 0       | 0.1     | 0.1     | 0.1
φ_max  | 0.1     | 0.2     | 0.2     | 0.3

φ_max and φ_min denote the maximal and minimal φ that perform best with the corresponding m, respectively.

TABLE III
PARAMETER SETTINGS FOR THE SEVEN FUNCTIONS OF 100-D, 500-D AND 1000-D.

Parameter | Dimension | Separable functions | Non-separable functions
m         | 100-D     | 100                 | 100
m         | 500-D     | 250                 | 250
m         | 1000-D    | 500                 | 500
φ         | 100-D     | 0                   | 0
φ         | 500-D     | 0.1                 | 0.05
φ         | 1000-D    | 0.15                | 0.1

Separable functions include f1, f4, f5, f6. Non-separable functions include f2, f3, f7. Note that f5, though non-separable, is grouped with the separable functions because its product component becomes less significant as the dimension increases [52].

B. Benchmark comparisons
In order to verify the performance of CSO for large scale optimization, CSO has been compared with a number of state-of-the-art algorithms tailored for large scale optimization on the CEC08 test functions with dimensions of 100, 500 and 1000. The compared algorithms include CCPSO2 [52], MLCC [53], sep-CMA-ES [54], EPUS-PSO [55] and DMS-PSO [30].

The lower and upper bounds of the recommended social factor can be determined as follows:

    φ_L(m) = 0.14 log(m) - 0.30,
    φ_U(m) = 0.27 log(m) - 0.51,        (26)

subject to φ_L(m), φ_U(m) ≥ 0.
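The recommendation in (25) and (26) can be sketched in a few lines of code. This is a minimal sketch, assuming the logarithm in (26) is base-10, which is the base consistent with the best (m, φ) combinations tabulated in Table II; the function names are illustrative only.

```python
import math

def phi_bounds(m):
    """Lower and upper bounds of the recommended social factor phi
    for swarm size m, per Eq. (26); base-10 log is an assumption
    consistent with Table II. Bounds are clipped at 0."""
    lo = max(0.14 * math.log10(m) - 0.30, 0.0)
    hi = max(0.27 * math.log10(m) - 0.51, 0.0)
    return lo, hi

def recommended_phi(m):
    """Eq. (25): phi = 0 for small swarms, otherwise any value
    within [phi_L(m), phi_U(m)] is recommended."""
    if m <= 100:
        return (0.0, 0.0)
    return phi_bounds(m)

lo, hi = recommended_phi(1000)
print(lo, hi)  # roughly (0.12, 0.30), matching Table II for m = 1000
```

With base-10 logs the model reproduces the tabulated φ_min = 0.1 and φ_max = 0.3 at m = 1000 to within rounding, which is why that base is assumed here.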

Fig. 8. The convergence profiles of CSO, CCPSO2 and MLCC on 500-D f1 and f5.

The evaluation criteria proposed in the CEC08 special session on large scale optimization [43] have been adopted.
Among the compared algorithms, CCPSO2 [52] and MLCC [53] are designed in the cooperative coevolution (CC) framework [56], which has been proposed to solve high-dimensional problems by automatically implementing the divide-and-conquer strategy [57]. Specifically, in both algorithms, a random grouping technique is used to divide the whole decision vector into different subcomponents, each of which is solved independently. In CCPSO2, a modified PSO using Cauchy and Gaussian distributions for sampling around the personal best and the neighborhood best positions is adopted as the core algorithm to evolve the CC framework, whilst in MLCC, a self-adaptive neighborhood search differential evolution (SaNSDE) is adopted.
The sep-CMA-ES is a simple modification of the original CMA-ES algorithm [58], which has been shown to be more efficient, and to scale surprisingly well on some high-dimensional test functions up to 1000 dimensions [54]. EPUS-PSO and DMS-PSO are two other PSO variants, where the former adjusts the population size according to the search results [55] and the latter adopts a dynamically changing neighborhood structure for each particle [30]; both of them have participated in the CEC 2008 competition on Large Scale Global Optimization (LSGO) [43].
RAN CHENG et al.: A COMPETITIVE PARTICLE SWARM OPTIMIZER FOR LARGE SCALE OPTIMIZATION

TABLE IV
THE STATISTICAL RESULTS (MEAN(STD)) AND THE t VALUES OF OPTIMIZATION ERRORS ON 100-D TEST FUNCTIONS.

100-D | CSO | CCPSO2 | MLCC | sep-CMA-ES | EPUS-PSO | DMS-PSO
f1 | 9.11E-29(1.10E-28) | 7.73E-14(3.23E-14), t=-1.20E+01 | 6.82E-14(2.32E-14), t=-1.47E+01 | 9.02E-15(5.53E-15), t=-8.16E+00 | 7.47E-01(1.70E-01), t=-2.20E+01 | 0.00E+00(0.00E+00), t=4.14E+00
f2 | 3.35E+01(5.38E+00) | 6.08E+00(7.83E+00), t=1.44E+01 | 2.53E+01(8.73E+00), t=4.00E+00 | 2.31E+01(1.39E+01), t=3.49E+00 | 1.86E+01(2.26E+00), t=1.28E+01 | 3.65E+00(7.30E-01), t=2.75E+01
f3 | 3.90E+02(5.53E+02) | 4.23E+02(8.65E+02), t=-1.61E-01 | 1.50E+02(5.72E+01), t=2.16E+00 | 4.31E+00(1.26E+01), t=3.49E+00 | 4.99E+03(5.35E+03), t=-4.28E+00 | 2.83E+02(9.40E+02), t=4.91E-01
f4 | 5.60E+01(7.48E+00) | 3.98E-02(1.99E-01), t=3.74E+01 | 4.39E-13(9.21E-14), t=3.74E+01 | 2.78E+02(3.43E+01), t=-3.16E+01 | 4.71E+02(5.94E+01), t=-3.47E+01 | 1.83E+02(2.16E+01), t=-2.78E+01
f5 | 0.00E+00(0.00E+00) | 3.45E-03(4.88E-03), t=-3.53E+00 | 3.41E-14(1.16E-14), t=-1.47E+01 | 2.96E-04(1.48E-03), t=-1.00E+00 | 3.72E-01(5.60E-02), t=-3.32E+01 | 0.00E+00(0.00E+00), t=0.00E+00
f6 | 1.20E-14(1.52E-15) | 1.44E-13(3.06E-14), t=-2.15E+01 | 1.11E-13(7.87E-15), t=-6.18E+01 | 2.12E+01(4.02E-01), t=-2.64E+02 | 2.06E+00(4.40E-01), t=-2.34E+01 | 0.00E+00(0.00E+00), t=3.95E+01
f7 | -7.28E+05(1.88E+04) | -1.50E+03(1.04E+01), t=-1.93E+02 | -1.54E+03(2.52E+00), t=-1.93E+02 | -1.39E+03(2.64E+01), t=-1.93E+02 | -8.55E+02(1.35E+01), t=-1.93E+02 | -1.14E+03(8.48E+00), t=-1.93E+02
w/t/l | - | 4/1/2 | 4/0/3 | 4/1/2 | 6/0/1 | 2/2/3

TABLE V
THE STATISTICAL RESULTS (MEAN(STD)) AND THE t VALUES OF OPTIMIZATION ERRORS ON 500-D TEST FUNCTIONS.

500-D | CSO | CCPSO2 | MLCC | sep-CMA-ES | EPUS-PSO | DMS-PSO
f1 | 6.57E-23(3.90E-24) | 7.73E-14(3.23E-14), t=-1.20E+01 | 4.30E-13(3.31E-14), t=-6.50E+01 | 2.25E-14(6.10E-15), t=-1.84E+01 | 8.45E+01(6.40E+00), t=-6.60E+01 | 0.00E+00(0.00E+00), t=8.42E+01
f2 | 2.60E+01(2.40E+00) | 5.79E+01(4.21E+01), t=-3.78E+00 | 6.67E+01(5.70E+00), t=-3.29E+01 | 2.12E+02(1.74E+01), t=-5.29E+01 | 4.35E+01(5.51E-01), t=-3.55E+01 | 6.89E+01(2.01E+00), t=-6.85E+01
f3 | 5.74E+02(1.67E+02) | 7.24E+02(1.54E+02), t=-3.30E+00 | 9.25E+02(1.73E+02), t=-7.30E+00 | 2.93E+02(3.59E+01), t=8.23E+00 | 5.77E+04(8.04E+03), t=-3.55E+01 | 4.67E+07(5.87E+06), t=-3.98E+01
f4 | 3.19E+02(2.16E+01) | 3.98E-02(1.99E-01), t=7.38E+01 | 1.79E-11(6.31E-11), t=7.38E+01 | 2.18E+03(1.51E+02), t=-6.10E+01 | 3.49E+03(1.12E+02), t=-1.39E+02 | 1.61E+03(1.04E+02), t=-6.08E+01
f5 | 2.22E-16(0.00E+00) | 1.18E-03(4.61E-03), t=-1.28E+00 | 2.13E-13(2.48E-14), t=-4.29E+01 | 7.88E-04(2.82E-03), t=-1.40E+00 | 1.64E+00(4.69E-02), t=-1.75E+02 | 0.00E+00(0.00E+00), t=7.85E+84
f6 | 4.13E-13(1.10E-14) | 5.34E-13(8.61E-14), t=-6.97E+00 | 5.34E-13(7.01E-14), t=-8.53E+00 | 2.15E+01(3.10E-01), t=-3.47E+02 | 6.64E+00(4.49E-01), t=-7.39E+01 | 2.00E+00(9.66E-02), t=-1.04E+02
f7 | -1.97E+06(4.08E+04) | -7.23E+03(4.16E+01), t=-2.41E+02 | -7.43E+03(8.03E+00), t=-2.41E+02 | -6.37E+03(7.59E+01), t=-2.41E+02 | -3.51E+03(2.10E+01), t=-2.41E+02 | -4.20E+03(1.29E+01), t=-2.41E+02
w/t/l | - | 5/1/1 | 6/0/1 | 5/1/1 | 7/0/0 | 5/0/2

TABLE VI
THE STATISTICAL RESULTS (MEAN(STD)) AND THE t VALUES OF OPTIMIZATION ERRORS ON 1000-D TEST FUNCTIONS.

1000-D | CSO | CCPSO2 | MLCC
f1 | 1.09E-21(4.20E-23) | 5.18E-13(9.61E-14), t=-2.70E+01 | 8.46E-13(5.01E-14), t=-8.44E+01
f2 | 4.15E+01(9.74E-01) | 7.82E+01(4.25E+01), t=-4.32E+00 | 1.09E+02(4.75E+00), t=-6.96E+01
f3 | 1.01E+03(3.02E+01) | 1.33E+03(2.63E+02), t=-6.04E+00 | 1.80E+03(1.58E+02), t=-2.46E+01
f4 | 6.89E+02(3.10E+01) | 1.99E-01(4.06E-01), t=1.11E+02 | 1.37E-10(3.37E-10), t=1.11E+02
f5 | 2.26E-16(2.18E-17) | 1.18E-03(3.27E-03), t=-1.80E+00 | 4.18E-13(2.78E-14), t=-7.51E+01
f6 | 1.21E-12(2.64E-14) | 1.02E-12(1.68E-13), t=5.59E+00 | 1.06E-12(7.68E-14), t=9.24E+00
f7 | -3.83E+06(4.82E+04) | -1.43E+04(8.27E+01), t=-3.96E+02 | -1.47E+04(1.51E+01), t=-3.96E+02
w/t/l | - | 4/1/2 | 5/0/2

Based on the previous empirical analysis of the two parameters in CSO, the parameter settings used in the benchmarks are summarized in Table III. The optimization errors on 100-D, 500-D and 1000-D functions are summarized in Tables IV, V and VI, respectively. In all three tables, t values are listed together with the mean values and standard deviations. A negative t value means that the statistical results of the optimization errors obtained by CSO are relatively smaller, and vice versa. If the difference is statistically significant, the corresponding t value is highlighted. w/t/l in the last row means that CSO wins in w functions, ties in t functions, and loses in l functions.

TABLE VI (CONTINUED)
1000-D | sep-CMA-ES | EPUS-PSO | DMS-PSO
f1 | 7.81E-15(1.52E-15), t=-2.57E+01 | 5.53E+02(2.86E+01), t=-9.67E+01 | 0.00E+00(0.00E+00), t=1.30E+02
f2 | 3.65E+02(9.02E+00), t=-1.78E+02 | 4.66E+01(4.00E-01), t=-2.42E+01 | 9.15E+01(7.14E-01), t=-2.07E+02
f3 | 9.10E+02(4.54E+01), t=9.17E+00 | 8.37E+05(1.52E+05), t=-2.75E+01 | 8.98E+09(4.39E+08), t=-1.02E+02
f4 | 5.31E+03(2.48E+02), t=-9.24E+01 | 7.58E+03(1.51E+02), t=-2.24E+02 | 3.84E+03(1.71E+02), t=-9.07E+01
f5 | 3.94E-04(1.97E-03), t=-1.00E+00 | 5.89E+00(3.91E-01), t=-7.53E+01 | 0.00E+00(0.00E+00), t=5.18E+01
f6 | 2.15E+01(3.19E-01), t=-3.37E+02 | 1.89E+01(2.49E+00), t=-3.80E+01 | 7.76E+00(8.92E-02), t=-4.35E+02
f7 | -1.25E+04(9.36E+01), t=-3.96E+02 | -6.62E+03(3.18E+01), t=-3.97E+02 | -7.50E+03(1.63E+01), t=-3.97E+02
w/t/l | 5/1/1 | 7/0/0 | 5/0/2

The statistical results of the optimization errors show that CSO has significantly better overall performance than the five compared algorithms on the 500-D and 1000-D functions. CSO and DMS-PSO have similar performance on the 100-D functions, and both outperform the other four algorithms. It seems that DMS-PSO is always able to find the global optimum of f1 and f5, regardless of the number of search dimensions, but performs poorly on the other five functions in comparison with CSO, especially when the dimensionality becomes higher. In comparison, MLCC has yielded the best results on f4, which is a shifted Rastrigin function with a large number of local optima. Such outstanding

performance on f4 can be attributed to the differential evolution variant (SaNSDE) used in MLCC.
In addition, the convergence profiles on one typical separable function (f1) and one typical non-separable function (f5) are plotted in Fig. 8. It can be seen that, although the convergence of the proposed CSO is not as fast as that of CCPSO2 or MLCC at the very beginning, CSO converges more consistently and continuously improves the solution quality.

C. Scalability to higher dimensionality
From the statistical results of the optimization errors summarized in Tables IV, V and VI, it can be noticed that CSO shows very good scalability with respect to the search dimension, i.e., its performance does not deteriorate seriously as the dimension increases.
To further examine the search ability of CSO on functions of even higher dimensionality, e.g., 2000-D or even 5000-D, additional experiments have been performed on f1 to f6 of 2000 and 5000 dimensions. f7 is excluded from this experiment because its global optimum is dimension dependent, which makes it difficult to evaluate scalability.
It must be stressed that the optimization of problems of 2000 and 5000 dimensions is very challenging for CSO, since it does not adopt any particular strategy tailored for large scale optimization, e.g., the divide-and-conquer strategy. Furthermore, to the best of our knowledge, optimization of problems of a dimension larger than 1000 has only been reported by Li and Yao in [52], where 2000-dimensional f1, f3 and f7 were employed to test their proposed CCPSO2.
TABLE VII
PARAMETER SETTINGS OF CSO ON 2000-D AND 5000-D FUNCTIONS.

Parameter | Dimension | Separable | Non-separable
m         | 2000-D    | 1000      | 1000
m         | 5000-D    | 1500      | 1500
φ         | 2000-D    | 0.2       | 0.15
φ         | 5000-D    | 0.2       | 0.15

Separable functions include f1, f4, f5 and f6. Non-separable functions include f2 and f3. Note that f5, though non-separable, is grouped with the separable functions because its product component becomes less significant as the dimension increases [52].

TABLE VIII
STATISTICAL RESULTS OF THE OPTIMIZATION ERRORS OBTAINED BY CSO ON 2000-D AND 5000-D FUNCTIONS.

Function | D = 2000           | D = 5000
f1       | 1.66E-20(3.36E-22) | 1.43E-19(3.33E-21)
f2       | 6.17E+01(1.31E+00) | 9.82E+01(9.78E-01)
f3       | 2.10E+03(5.14E+01) | 7.30E+03(1.26E+02)
f4       | 2.81E+03(3.69E+01) | 7.80E+03(8.73E+01)
f5       | 3.33E-16(0.00E+00) | 4.44E-16(0.00E+00)
f6       | 3.26E-12(5.43E-14) | 6.86E-12(5.51E-14)

Since the time cost of a single run on a 5000-D function is extremely high, the statistical results of the optimization errors are averaged over 10 independent runs.

The parameter settings are listed in Table VII and the statistical results of the optimization errors are listed in Table VIII. It can be seen that CSO continues to perform well even when the dimension is higher than 1000, especially on the separable functions f1, f4 and f6, together with f5.
In order to get an overall picture of the scalability of CSO and the five compared algorithms, the mean optimization errors obtained by the six algorithms on all test functions of dimensions 100, 500 and 1000 are plotted in Fig. 9, together with the mean optimization errors obtained by CCPSO2 and sep-CMA-ES on 2000-D f1 and f3 [52], as well as the mean optimization errors obtained by CSO on 2000-D and 5000-D f1 to f6. Unfortunately, we are not able to obtain the results of the compared algorithms on all 2000-D and 5000-D functions due to the prohibitively high computational cost of optimizing such large scale problems. Nevertheless, it can be seen that CSO shows the best scalability on f1 and f5, where the mean optimization errors obtained by CSO on the 2000-D and 5000-D test problems are much better than those obtained by the compared algorithms. Meanwhile, CSO shows similar scalability to CCPSO2 and MLCC on f3 and f6.
D. Influence of neighborhood control
In CSO, as introduced in Section II, there exist two versions of the mean position X̄_k(t) used in the learning strategy: a global version X̄_k^g(t) and a local version X̄_{l,k}^l(t), where the calculation of X̄_{l,k}^l(t) is based on a neighborhood instead of the whole swarm, refer to (6). Although the effectiveness of X̄_k^g(t) has already been verified by the empirical results above, it is still very interesting to investigate the influence of the neighborhood control used in X̄_{l,k}^l(t) on the swarm diversity and thus on the search performance.
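For readers following along, one generation of the global-version CSO can be sketched as below. The loser-update rule is paraphrased from the algorithm description earlier in the paper (not reproduced in this section), so the exact form of the random coefficients r1, r2, r3 should be treated as an assumption, and the function names are illustrative.

```python
import random

def cso_generation(X, V, fitness, phi=0.1):
    """One generation of the global-version CSO (a sketch): particles
    are paired at random; in each pair the loser learns from the
    winner and from the swarm's global mean position, while the
    winner passes to the next generation unchanged."""
    m, n = len(X), len(X[0])
    mean = [sum(x[j] for x in X) / m for j in range(n)]  # global mean position
    order = list(range(m))
    random.shuffle(order)
    for a, b in zip(order[::2], order[1::2]):
        w, l = (a, b) if fitness(X[a]) < fitness(X[b]) else (b, a)
        for j in range(n):
            r1, r2, r3 = random.random(), random.random(), random.random()
            V[l][j] = (r1 * V[l][j]
                       + r2 * (X[w][j] - X[l][j])
                       + phi * r3 * (mean[j] - X[l][j]))
            X[l][j] += V[l][j]

# Usage: minimize the 10-D sphere function
random.seed(0)
sphere = lambda x: sum(v * v for v in x)
X = [[random.uniform(-5, 5) for _ in range(10)] for _ in range(40)]
V = [[0.0] * 10 for _ in range(40)]
before = min(sphere(x) for x in X)
for _ in range(200):
    cso_generation(X, V, sphere, phi=0.0)
after = min(sphere(x) for x in X)
print(after < before)
```

Because winners are never modified, the best fitness in the swarm can only improve from one generation to the next, which is a simple way to see why no gbest or pbest memory is needed.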
With neighborhood control, the whole swarm is dynamically divided into several neighborhoods, each of which maintains a local mean position vector. This enhances the swarm diversity in comparison with the original CSO, where the whole swarm shares a global mean position. Intuitively, a higher degree of swarm diversity may help alleviate premature convergence, but it can also slow down convergence to a certain extent.
For simplicity, the commonly used ring topology [22], [23], [59], which has been shown to be an effective neighborhood structure [60], is adopted here. In this topology, each particle takes its two immediate neighbors to form a neighborhood [61].
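The local mean position under the ring topology can be sketched as follows; a minimal sketch, assuming the neighborhood of particle i consists of the particle itself and its two immediate neighbors with wrap-around indexing.

```python
def local_mean(X, i):
    """Mean position over the ring neighborhood of particle i:
    the particle plus its two immediate neighbors (a sketch of
    the neighborhood control used in CSO-n)."""
    m, n = len(X), len(X[0])
    idx = [(i - 1) % m, i, (i + 1) % m]
    return [sum(X[k][j] for k in idx) / 3 for j in range(n)]

# Usage: on a swarm of four 2-D particles, particle 0's
# neighborhood wraps around to particle 3
X = [[0.0, 0.0], [3.0, 0.0], [6.0, 6.0], [0.0, 3.0]]
print(local_mean(X, 0))  # [1.0, 1.0]
```

Replacing the global mean with this local mean in the loser update is the only change between CSO and CSO-n.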
First, we investigate the influence of the neighborhood control on the swarm diversity. In order to obtain measurable observations, a diversity measure introduced in [62], [63] is adopted here to indicate the change of diversity during the search process:

    D(P) = (1/m) Σ_{i=1}^{m} sqrt( Σ_{j=1}^{n} (x_i^j - x̄^j)^2 ),        (27)

with

    x̄^j = (1/m) Σ_{i=1}^{m} x_i^j,

where D(P) denotes the diversity of the swarm P, m is the swarm size, and n is the dimension of the decision space.
Fig. 9. The statistical results of optimization errors obtained by CSO, CCPSO2, MLCC, sep-CMA-ES, EPUS-PSO and DMS-PSO on 100-D, 500-D and 1000-D f1 to f6, together with the statistical results obtained by CCPSO2 and sep-CMA-ES on 2000-D f1, f3, and those obtained by CSO on 2000-D and 5000-D f1 to f6. Note that due to the logarithmic scale used in the plots, errors of 0 cannot be shown.

Here x_i^j is the value of the j-th dimension of particle i, and x̄^j is the average value of the j-th dimension over all particles.
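The diversity measure in (27) is a direct computation; the sketch below implements it as the mean Euclidean distance of the particles from the swarm's centroid.

```python
import math

def swarm_diversity(X):
    """Diversity measure D(P) of Eq. (27): the mean Euclidean
    distance of the m particles from the swarm's centroid."""
    m, n = len(X), len(X[0])
    centroid = [sum(x[j] for x in X) / m for j in range(n)]
    return sum(
        math.sqrt(sum((x[j] - centroid[j]) ** 2 for j in range(n)))
        for x in X
    ) / m

# A collapsed swarm has zero diversity; a spread-out one does not
print(swarm_diversity([[1.0, 1.0], [1.0, 1.0]]))  # 0.0
print(swarm_diversity([[0.0, 0.0], [2.0, 0.0]]))  # 1.0
```

Logging this quantity once per generation reproduces diversity profiles of the kind shown in Fig. 10.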
It can be seen from Fig. 10 that the overall swarm diversity of CSO with neighborhood control (denoted as CSO-n) is higher than that of the original CSO, which is consistent with the expectation above.
To further assess whether the enhanced swarm diversity has a positive influence on the search performance of CSO-n, additional numerical experiments have been conducted on 500-D and 1000-D functions. A two-tailed t-test is performed at a significance level of α = 0.05 between the statistical results of the optimization errors obtained by CSO-n and CSO. A negative t value means that the statistical results obtained by CSO-n are relatively smaller, and vice versa. The smaller statistical results are highlighted.
TABLE IX
STATISTICAL RESULTS OF OPTIMIZATION ERRORS OBTAINED BY CSO-N AND CSO ON 500-D FUNCTIONS.

m = 250 | CSO-n               | CSO                 | t value
f1      | 2.71E-11(5.77E-12)  | 6.57E-23(3.90E-24)  | 2.35E+01
f2      | 4.61E+01(1.02E+00)  | 2.60E+01(2.40E+00)  | 3.85E+01
f3      | 5.37E+02(4.00E+01)  | 5.74E+02(1.67E+02)  | -1.08E+00
f4      | 3.95E+03(1.32E+02)  | 3.19E+02(2.16E+01)  | 1.36E+02
f5      | 4.04E-12(7.00E-13)  | 2.22E-16(0.00E+00)  | 2.89E+01
f6      | 4.90E-07(5.98E-08)  | 4.13E-13(1.10E-14)  | 4.10E+01
f7      | -1.68E+06(8.17E+03) | -1.97E+06(4.08E+04) | 3.48E+01

In the first set of experiments, the same parameter settings as in the original CSO have been used for CSO-n (refer to Table III).
TABLE X
STATISTICAL RESULTS OF OPTIMIZATION ERRORS OBTAINED BY CSO-N AND CSO ON 1000-D FUNCTIONS.

m = 500 | CSO-n               | CSO                 | t value
f1      | 7.77E-01(2.30E-02)  | 1.09E-21(4.20E-23)  | 1.69E+02
f2      | 8.11E+01(6.48E-01)  | 4.15E+01(9.74E-01)  | 1.69E+02
f3      | 1.31E+07(7.74E+05)  | 1.01E+03(3.02E+01)  | 8.46E+01
f4      | 1.02E+04(5.51E+01)  | 6.89E+02(3.10E+01)  | 7.52E+02
f5      | 4.22E-02(1.77E-03)  | 2.26E-16(2.18E-17)  | 1.19E+02
f6      | 5.95E-02(6.32E-03)  | 1.21E-12(2.64E-14)  | 4.71E+01
f7      | -2.58E+06(3.06E+04) | -3.83E+06(4.82E+04) | 1.09E+02

The experimental results shown in Table IX and Table X indicate that CSO-n is outperformed by the original CSO. As discussed above, the neighborhood control is expected to generate a higher degree of swarm diversity, and the experimental results in Fig. 10 have empirically confirmed this expectation. Therefore, one possible reason for the inferior performance of CSO with neighborhood control is that, on these test functions, the diversity of the global CSO is already sufficient, so that additional diversity merely slows down convergence.
In the global version of CSO, the main source of swarm diversity is the random pairwise competitions, where the swarm size can be an important factor in determining the amount of diversity. More specifically, a bigger swarm size provides more combinations of random pairwise competitions, thus generating a higher degree of swarm diversity, and vice versa.


(b) f3

Fig. 10. The swarm diversity profiles, over 500,000 fitness evaluations (FEs), of CSO with neighborhood control (denoted as CSO-n) and the original CSO on a 500-D separable function f1 and a 500-D non-separable function f3, respectively.

Following this line of thought, the performance of CSO-n may be improved by reducing the swarm size. Therefore, a second set of experiments has been conducted using a different parameter setup, where the swarm size is set to m = 150 for 500-D functions and m = 200 for 1000-D functions, respectively.
TABLE XI
STATISTICAL RESULTS OF OPTIMIZATION ERRORS OBTAINED BY CSO-N AND CSO ON 500-D FUNCTIONS.

m = 150 | CSO-n               | CSO                 | t value
f1      | 1.51E-25(3.21E-27)  | 4.10E-23(9.28E-25)  | -2.20E+02
f2      | 5.23E+01(1.11E+01)  | 8.20E+01(4.53E+00)  | -1.24E+01
f3      | 7.93E+02(1.03E+02)  | 9.32E+02(4.15E+02)  | -1.63E+00
f4      | 4.18E+02(3.04E+01)  | 6.45E+02(2.66E+01)  | -2.81E+01
f5      | 3.11E-16(4.44E-17)  | 2.46E-03(4.93E-03)  | -2.49E+00
f6      | 4.09E-14(1.74E-15)  | 1.08E+00(1.41E-01)  | -3.83E+01
f7      | -1.79E+06(1.28E+04) | -2.10E+06(5.73E+03) | 1.11E+02

TABLE XII
STATISTICAL RESULTS OF OPTIMIZATION ERRORS OBTAINED BY CSO-N AND CSO ON 1000-D FUNCTIONS.

m = 200 | CSO-n               | CSO                 | t value
f1      | 3.60E-18(9.38E-19)  | 5.22E-13(3.70E-13)  | -7.05E+00
f2      | 6.50E+01(1.10E+00)  | 1.03E+02(3.24E+00)  | -5.55E+01
f3      | 1.61E+03(7.96E+01)  | 1.95E+03(2.08E+02)  | -7.63E+00
f4      | 1.04E+03(4.85E+01)  | 2.14E+03(7.51E+01)  | -6.15E+01
f5      | 7.77E-16(0.00E+00)  | 2.46E-03(4.93E-03)  | -2.49E+00
f6      | 1.37E-10(1.84E-11)  | 3.03E+00(2.67E-01)  | -5.67E+01
f7      | -2.90E+06(1.69E+04) | -4.24E+06(3.05E+04) | 1.92E+02

As shown by the statistical results of the optimization errors in Table XI and Table XII, after reducing the swarm size, CSO-n is able to outperform the global CSO on most of the test functions studied in this work, except for f7. Interestingly, the performance of CSO on f7 is always better than that of CSO-n. As described in [43], f7 is a very special function that contains a large amount of random noise and whose global optimum is unknown. One possible explanation is that f7, as a noisy function, is very sensitive to the swarm diversity, and the original CSO, which maintains less swarm diversity, is therefore able to achieve better performance on it.
To summarize, since the random pairwise competitions are able to generate a sufficient amount of diversity in the swarm, the original CSO can work properly without using neighborhood control, provided a relatively large swarm size is used for large scale optimization problems. However, neighborhood control, which can further enhance diversity, enables the use of a smaller swarm size even for large scale problems, which is very attractive in practice.

V. CONCLUSION
In this paper, we have introduced a new swarm algorithm termed competitive swarm optimizer (CSO). The algorithm is based on a pairwise competition mechanism and adopts a novel update strategy in which neither gbest nor pbest is used. A theoretical proof of convergence and an empirical analysis of the search dynamics are given to understand the search mechanisms of CSO. Despite its simplicity in algorithmic implementation, CSO has been shown to perform surprisingly well on large scale optimization problems, outperforming many state-of-the-art meta-heuristics tailored for large scale optimization. Our comparative studies conducted on the 100-D, 500-D and 1000-D CEC08 benchmark problems demonstrate that CSO performs consistently well on those test functions, especially in the high-dimensional cases. The performance of CSO has been further demonstrated on 2000-D and 5000-D functions, and the experimental results show that CSO has reasonably good scalability on these extremely high-dimensional problems. In addition, we have empirically investigated the influence of neighborhood control on the swarm diversity and search performance of CSO, which suggests that neighborhood control can enhance diversity and therefore enables us to use a smaller swarm size for large scale optimization problems.
In the future, we will investigate the application of CSO to other challenging optimization problems, such as multi-objective problems [64] and many-objective problems [65]. Application of CSO to complex real-world problems is another important direction for future work.
ACKNOWLEDGEMENT

This work was supported in part by Honda Research Institute Europe.


R EFERENCES
[1] J. Kennedy, Swarm Intelligence. Springer, 2006.
[2] J. Kennedy and R. Eberhart, Particle swarm optimization, in Proceedings of the IEEE International Conference on Neural Networks, vol. 4.
IEEE, 1995, pp. 19421948.
[3] Y. Shi and R. Eberhart, A modified particle swarm optimizer, in
Proceedings of IEEE International Conference on Evolutionary Computation. IEEE, 1998, pp. 6973.
[4] R. Eberhart and J. Kennedy, A new optimizer using particle swarm
theory, in Proceedings of International Symposium on Micro Machine
and Human Science. IEEE, 1995, pp. 3943.
[5] J. Kennedy, The particle swarm: social adaptation of knowledge, in
Proceedings of IEEE International Conference on Evolutionary Computation. IEEE, 1997, pp. 303308.
[6] I. Montalvo, J. Izquierdo, R. Perez, and P. L. Iglesias, A diversityenriched variant of discrete pso applied to the design of water distribution networks, Engineering Optimization, vol. 40, no. 7, pp. 655668,
2008.
[7] A. Alfi and M.-M. Fateh, Parameter identification based on a modified
pso applied to suspension system, Journal of Software Engineering &
Applications, vol. 3, pp. 221229, 2010.

RAN CHENG et al.: A COMPETITIVE PARTICLE SWARM OPTIMIZER FOR LARGE SCALE OPTIMIZATION

[8] Y.-J. Gong, J. Zhang, H. Chung, W.-n. Chen, Z.-H. Zhan, Y. Li, and
Y.-h. Shi, An efficient resource allocation scheme using particle swarm
optimization, IEEE Transactions on Evolutionary Computation, vol. 16,
no. 6, pp. 801816, 2012.
[9] S.-Y. Ho, H.-S. Lin, W.-H. Liauh, and S.-J. Ho, Opso: Orthogonal
particle swarm optimization and its application to task assignment
problems, IEEE Transactions on Systems, Man and Cybernetics, Part
A: Systems and Humans, vol. 38, no. 2, pp. 288298, 2008.
[10] Z. Zhu, J. Zhou, Z. Ji, and Y.-H. Shi, Dna sequence compression using
adaptive particle swarm optimization-based memetic algorithm, IEEE
Transactions on Evolutionary Computation, vol. 15, no. 5, pp. 643658,
2011.
[11] Y. Shi et al., Particle swarm optimization: developments, applications
and resources, in Proceedings of IEEE Congress on Evolutionary
Computation, vol. 1. IEEE, 2001, pp. 8186.
[12] Y. Yang and J. O. Pedersen, A comparative study on feature selection
in text categorization, in Proceedings of International Conference on
Machine Learning. Morgan Kaufmann Publishers, 1997, pp. 412420.
[13] W.-N. Chen, J. Zhang, Y. Lin, and e. Chen, Particle swarm optimization
with an aging leader and challengers, IEEE Transactions on Evolutionary Computation, vol. 17, no. 2, pp. 241258, 2013.
[14] Y. Shi and R. Eberhart, Parameter selection in particle swarm optimization, in Evolutionary Programming VII. Springer, 1998, pp. 591600.
[15] Y. Shi and R. C. Eberhart, Empirical study of particle swarm optimization, in Proceedings of IEEE Congress on Evolutionary Computation.
IEEE, 1999, pp. 19451950.
[16] , Fuzzy adaptive particle swarm optimization, in Proceedings of
IEEE Congress on Evolutionary Computation, vol. 1. IEEE, 2001, pp.
101106.
[17] A. Ratnaweera, S. K. Halgamuge, and H. C. Watson, Self-organizing
hierarchical particle swarm optimizer with time-varying acceleration
coefficients, IEEE Transactions on Evolutionary Computation, vol. 8,
no. 3, pp. 240255, 2004.
[18] R. Cheng and M. Yao, Particle swarm optimizer with time-varying
parameters based on a novel operator, Applied Mathematics and Information Sciences, vol. 5, no. 2, pp. 3338, 2011.
[19] M. Hu, T. Wu, and J. D. Weir, An adaptive particle swarm optimization
with multiple adaptive methods, IEEE Transactions on Evolutionary
Computation, vol. 17, no. 5, pp. 705720, 2013.
[20] P. N. Suganthan, "Particle swarm optimiser with neighbourhood operator," in Proceedings of IEEE Congress on Evolutionary Computation,
vol. 3. IEEE, 1999, pp. 1958–1962.
[21] J. Kennedy, "Small worlds and mega-minds: effects of neighborhood
topology on particle swarm performance," in Proceedings of IEEE
Congress on Evolutionary Computation, vol. 3. IEEE, 1999, pp. 1931–
1938.
[22] J. Kennedy and R. Mendes, "Population structure and particle swarm
performance," in Proceedings of IEEE Congress on Evolutionary Computation, vol. 2. IEEE, 2002, pp. 1671–1676.
[23] R. Mendes, J. Kennedy, and J. Neves, "The fully informed particle
swarm: simpler, maybe better," IEEE Transactions on Evolutionary
Computation, vol. 8, no. 3, pp. 204–210, 2004.
[24] J. J. Liang, A. Qin, P. N. Suganthan, and S. Baskar, "Comprehensive
learning particle swarm optimizer for global optimization of multimodal
functions," IEEE Transactions on Evolutionary Computation, vol. 10,
no. 3, pp. 281–295, 2006.
[25] B. Qu, P. Suganthan, and S. Das, "A distance-based locally informed
particle swarm model for multimodal optimization," IEEE Transactions
on Evolutionary Computation, vol. 17, no. 3, pp. 387–402, 2013.
[26] J. Robinson, S. Sinton, and Y. Rahmat-Samii, "Particle swarm, genetic
algorithm, and their hybrids: optimization of a profiled corrugated horn
antenna," in Proceedings of IEEE Antennas and Propagation Society
International Symposium, vol. 1. IEEE, 2002, pp. 314–317.
[27] C.-F. Juang, "A hybrid of genetic algorithm and particle swarm optimization for recurrent network design," IEEE Transactions on Systems,
Man, and Cybernetics, Part B: Cybernetics, vol. 34, no. 2, pp. 997–1006,
2004.
[28] N. Holden and A. A. Freitas, "A hybrid particle swarm/ant colony
algorithm for the classification of hierarchical biological data," in
Proceedings of the IEEE Swarm Intelligence Symposium. IEEE, 2005,
pp. 100–107.
[29] P. Shelokar, P. Siarry, V. K. Jayaraman, and B. D. Kulkarni, "Particle
swarm and ant colony algorithms hybridized for improved continuous
optimization," Applied Mathematics and Computation, vol. 188, no. 1,
pp. 129–142, 2007.
[30] J. Liang and P. Suganthan, "Dynamic multi-swarm particle swarm
optimizer," in Proceedings of IEEE Swarm Intelligence Symposium.
IEEE, 2005, pp. 124–129.
[31] Z.-H. Zhan, J. Zhang, Y. Li, and H.-H. Chung, "Adaptive particle swarm
optimization," IEEE Transactions on Systems, Man, and Cybernetics,
Part B: Cybernetics, vol. 39, no. 6, pp. 1362–1381, 2009.
[32] R. Brits, A. P. Engelbrecht, and F. van den Bergh, "Locating multiple
optima using particle swarm optimization," Applied Mathematics and
Computation, vol. 189, no. 2, pp. 1859–1883, 2007.
[33] M. Daneshyari and G. G. Yen, "Cultural-based multiobjective particle
swarm optimization," IEEE Transactions on Cybernetics, vol. 41, no. 2,
pp. 553–567, 2011.
[34] N. Higashi and H. Iba, "Particle swarm optimization with Gaussian
mutation," in Proceedings of IEEE Swarm Intelligence Symposium.
IEEE, 2003, pp. 72–79.
[35] B. Liu, L. Wang, Y.-H. Jin, F. Tang, and D.-X. Huang, "Improved particle
swarm optimization combined with chaos," Chaos, Solitons & Fractals,
vol. 25, no. 5, pp. 1261–1271, 2005.
[36] Z.-H. Zhan, J. Zhang, Y. Li, and Y.-H. Shi, "Orthogonal learning particle
swarm optimization," IEEE Transactions on Evolutionary Computation,
vol. 15, no. 6, pp. 832–847, 2011.
[37] H. Gao and W. Xu, "A new particle swarm algorithm and its globally
convergent modifications," IEEE Transactions on Cybernetics, vol. 41,
no. 5, pp. 1334–1351, 2011.
[38] Y. V. Pehlivanoglu, "A new particle swarm optimization method enhanced with a periodic mutation strategy and neural networks," IEEE
Transactions on Evolutionary Computation, vol. 17, no. 3, pp. 436–452,
2013.
[39] F. Van den Bergh and A. P. Engelbrecht, "A cooperative approach
to particle swarm optimization," IEEE Transactions on Evolutionary
Computation, vol. 8, no. 3, pp. 225–239, 2004.
[40] S. Baskar and P. N. Suganthan, "A novel concurrent particle swarm
optimization," in Proceedings of the IEEE Congress on Evolutionary
Computation, vol. 1. IEEE, 2004, pp. 792–796.
[41] G. G. Yen and M. Daneshyari, "Diversity-based information exchange
among multiple swarms in particle swarm optimization," International
Journal of Computational Intelligence and Applications, vol. 7, no. 1,
pp. 57–75, 2008.
[42] R. Cheng, C. Sun, and Y. Jin, "A multi-swarm evolutionary framework
based on a feedback mechanism," in Proceedings of IEEE Congress on
Evolutionary Computation. IEEE, 2013, pp. 718–724.
[43] K. Tang, X. Yao, P. N. Suganthan, C. MacNish, Y.-P. Chen, C.-M. Chen,
and Z. Yang, "Benchmark functions for the CEC'2008 special session
and competition on large scale global optimization," Nature Inspired
Computation and Applications Laboratory, USTC, China, 2007.
[44] Y. Jin and B. Sendhoff, "Fitness approximation in evolutionary
computation – a survey," in Genetic and Evolutionary Computation Conference (GECCO), 2002, pp. 1105–1112.
[45] D. Lim, Y. Jin, Y.-S. Ong, and B. Sendhoff, "Generalizing surrogate-assisted evolutionary computation," IEEE Transactions on Evolutionary
Computation, vol. 14, no. 3, pp. 329–355, 2010.
[46] M. Dorigo and C. Blum, "Ant colony optimization theory: A survey,"
Theoretical Computer Science, vol. 344, no. 2, pp. 243–278, 2005.
[47] D. Karaboga and B. Akay, "A survey: algorithms simulating bee swarm
intelligence," Artificial Intelligence Review, vol. 31, no. 1-4, pp. 61–85,
2009.
[48] R. Storn and K. Price, "Differential evolution – a simple and efficient
heuristic for global optimization over continuous spaces," Journal of
Global Optimization, vol. 11, no. 4, pp. 341–359, 1997.
[49] M. Clerc and J. Kennedy, "The particle swarm – explosion, stability, and
convergence in a multidimensional complex space," IEEE Transactions
on Evolutionary Computation, vol. 6, no. 1, pp. 58–73, 2002.
[50] I. C. Trelea, "The particle swarm optimization algorithm: convergence analysis and parameter selection," Information Processing Letters,
vol. 85, no. 6, pp. 317–325, 2003.
[51] J. L. Fernandez-Martinez and E. Garcia-Gonzalo, "Stochastic stability
analysis of the linear continuous and discrete PSO models," IEEE
Transactions on Evolutionary Computation, vol. 15, no. 3, pp. 405–423,
2011.
[52] X. Li and Y. Yao, "Cooperatively coevolving particle swarms for large
scale optimization," IEEE Transactions on Evolutionary Computation,
vol. 16, no. 2, pp. 1–15, 2011.
[53] Z. Yang, K. Tang, and X. Yao, "Multilevel cooperative coevolution
for large scale optimization," in Proceedings of IEEE Congress on
Evolutionary Computation. IEEE, 2008, pp. 1663–1670.
[54] R. Ros and N. Hansen, "A simple modification in CMA-ES achieving
linear time and space complexity," Parallel Problem Solving from
Nature – PPSN X, pp. 296–305, 2008.
[55] S.-T. Hsieh, T.-Y. Sun, C.-C. Liu, and S.-J. Tsai, "Solving large
scale global optimization using improved particle swarm optimizer," in
Proceedings of IEEE Congress on Evolutionary Computation. IEEE,
2008, pp. 1777–1784.
[56] M. Potter and K. De Jong, "A cooperative coevolutionary approach to
function optimization," Parallel Problem Solving from Nature, PPSN III,
pp. 249–257, 1994.
[57] Z. Yang, K. Tang, and X. Yao, "Large scale evolutionary optimization
using cooperative coevolution," Information Sciences, vol. 178, no. 15,
pp. 2985–2999, 2008.
[58] N. Hansen and A. Ostermeier, "Completely derandomized self-adaptation in evolution strategies," Evolutionary Computation, vol. 9,
no. 2, pp. 159–195, 2001.
[59] M. A. Montes de Oca, T. Stutzle, M. Birattari, and M. Dorigo, "Frankenstein's PSO: a composite particle swarm optimization algorithm," IEEE
Transactions on Evolutionary Computation, vol. 13, no. 5, pp. 1120–
1132, 2009.
[60] X. Li, "Niching without niching parameters: particle swarm optimization
using a ring topology," IEEE Transactions on Evolutionary Computation,
vol. 14, no. 1, pp. 150–169, 2010.
[61] X. Li and K. Deb, "Comparing lbest PSO niching algorithms using
different position update rules," in Proceedings of the IEEE Congress
on Evolutionary Computation. IEEE, 2010, pp. 1–8.
[62] O. Olorunda and A. Engelbrecht, "Measuring exploration/exploitation
in particle swarms using swarm diversity," in Proceedings of IEEE
Congress on Evolutionary Computation. IEEE, 2008, pp. 1128–1134.
[63] A. Ismail and A. Engelbrecht, "Measuring diversity in the cooperative
particle swarm optimizer," Swarm Intelligence, vol. 7461, pp. 97–108,
2012.
[64] K. Deb, Multi-Objective Optimization Using Evolutionary Algorithms.
John Wiley & Sons, Hoboken, NJ, 2001.
[65] H. Ishibuchi, N. Tsukamoto, and Y. Nojima, "Evolutionary many-objective optimization: A short review," in Proceedings of IEEE
Congress on Evolutionary Computation. IEEE, 2008, pp. 2419–2426.