Choice of Benchmark Optimization Problems Does Matter

Keywords: Evolutionary algorithms; Swarm intelligence; Metaheuristics; Benchmarking; Differential evolution; Particle swarm optimization

Abstract: Various benchmark sets have already been proposed to facilitate comparison between metaheuristics, or Evolutionary Algorithms. During a competition, algorithms are typically allowed either to run until the allowed number of function calls is exhausted (and one compares the quality of the solutions found), or until a required objective function value is obtained (one compares the speed in reaching the required solution). During the last 20 years several problem sets were defined using the first approach. In this study, we test 73 optimization algorithms proposed between the 1960s and 2022 in nine competitions based on four sets of problems (CEC 2011, CEC 2014, CEC 2017, and CEC 2020) with different dimensionalities. We test the original versions of the 73 algorithms "as they are", with the control parameters proposed by the authors of each method. The most recent benchmark set, CEC 2020, includes fewer problems and allows many more function calls than the former sets. As a result, one group of algorithms performs best on the older benchmark sets and a different one on the more recent (CEC 2020) set. Almost all algorithms that perform best on the CEC 2020 set achieve moderate-to-poor performance on the older sets, including the real-world problems from CEC 2011. Algorithms that perform best on the older sets are more flexible than those that perform best on the CEC 2020 benchmark. The choice of the benchmark may have a crucial impact on the final ranking of algorithms. The lack of tuning may affect the results obtained in this study, hence it is highly recommended to repeat a similar large-scale comparison with the control parameters of each algorithm tuned, best by different methods, separately for each benchmark set.

* Corresponding author.
E-mail address: [email protected] (A.P. Piotrowski).
https://doi.org/10.1016/j.swevo.2023.101378
Received 20 October 2022; Received in revised form 31 July 2023; Accepted 8 August 2023
Available online 11 August 2023
2210-6502/© 2023 Elsevier B.V. All rights reserved.
1. Introduction

[...] applications. In the second one, the comparison is performed on versatile benchmarks that should mimic "general" problems that may be interesting in practice. The benchmark sets may be composed of real-world problems [25–27], but much more frequently benchmarks include collections of mathematical functions with known properties [20,28-31].

How the comparison is organized also affects the results [20]. There are two main approaches in the literature designed to compare metaheuristics on single-objective unconstrained numerical optimization problems [16]. In Black-Box Optimization Benchmarks (BBOB), algorithms aim to reach an assumed value of the objective function within as few objective function calls as possible [29,32]. In other words, the algorithm that finds a solution with the desired precision more quickly is considered better. On the contrary, in the majority of Competition on Evolutionary Computation (CEC) benchmarks [28,33-35] the number of function calls is fixed, and algorithms are compared on how good solutions they can find within the limited computational budget. Algorithms that find better solutions within an assumed number of function calls are considered better. Neither of the two approaches may be regarded as "more appropriate"; they simply differ in defining the goals and setting the comparison rules. Very recently a trial-based dominance measure has been proposed to bridge the gap between these two methods and order two-dimensional data sets, with both the time needed to reach some solutions and the quality of solutions [12]. However, practitioners use the CEC approach more frequently than the BBOB one. Setting the maximum number of function calls is, alongside direct computational time (if all algorithms are coded and implemented in the same hardware and software, which is rarely applicable), found to be the fairest stopping criterion according to [16]. Hence, in the present study we focus on CEC-type comparisons.
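To make the two comparison conventions concrete, the sketch below contrasts the two stopping rules. It is only an illustration: the helper names (`f`, `ask_candidate`, `budget`, `target_error`) are ours and do not come from the BBOB or CEC tool-chains.

```python
# Minimal sketch of the two stopping conventions discussed above.
# `f` is the objective function and `ask_candidate` a hypothetical
# stand-in for one sampling step of any optimizer.

def run_fixed_budget(f, ask_candidate, budget):
    """CEC-style run: stop after a fixed number of function calls and
    report the best objective value found (solution quality)."""
    best = float("inf")
    for _ in range(budget):
        best = min(best, f(ask_candidate()))
    return best

def run_to_target(f, ask_candidate, f_opt, target_error, max_calls):
    """BBOB-style run: stop once the target precision is reached and
    report how many function calls were needed (speed)."""
    for calls in range(1, max_calls + 1):
        if f(ask_candidate()) - f_opt <= target_error:
            return calls
    return None  # target precision not reached within max_calls
```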
Many CEC benchmarks are very popular among researchers working on new metaheuristics or studying the behavior of existing algorithms [14,19,20,36,37]. Unfortunately, each single benchmark suite is composed of no more than a few dozen problems, and in most papers dealing with Evolutionary Algorithms no more than one or two benchmark sets are used to test the performance. However, in [15] it was clearly shown that the number of problems on which algorithms are tested has a major impact on the results. Hence, the conclusions may easily be biased by the choice of the benchmark sets. Nonetheless, the question of how different the ranking of algorithms would be depending on which benchmark set is used has rarely been asked and even more rarely answered (e.g. [7,15,38]). This question is particularly important, as in recent CEC benchmark sets proposed for single-objective numerical optimization a clear shift in comparison criteria has been made [28,35,38]. Many older CEC benchmarks (CEC 2005 [39]; CEC 2013 [33]; CEC 2014 [40]; CEC 2017 [34]) were composed of 20–30 problems defined in 10- to 100-dimensional spaces and allowed up to 10000D function calls per problem, where D is the problem dimensionality. Recently [28,35] such settings were thoroughly modified. For example, the CEC 2020 [35] set, which has also been used in competitions held in 2021, is composed of only ten 5- to 20-dimensional problems for which, apart from the 5-dimensional version, the number of allowed function calls is much higher than in the older benchmarks (up to 10,000,000 function calls are allowed for the 20-dimensional case, and 3,000,000 for the 15-dimensional case). This changes the expectations from competing algorithms – slower and more explorative ones would be favored over quicker and more exploitative ones [41]. Again, there is nothing wrong with changing the rules. However, the question arises of how much the change of expectations would affect the ranking of algorithms, and would the results based on new benchmark sets be comparable with the former ones? How would they relate to the performances of algorithms on real-world problems?

The present paper is devoted to a large-scale comparison among many metaheuristics on different sets of single-objective, static, numerical, and low-to-moderate dimensional optimization problems. Our goal is not to look for the best algorithms. This work aims to explore how well top algorithms in the competitions, if applied without parameter tuning, would perform on different benchmarks, and to analyze the differences in rankings of algorithms on different benchmark sets. We consider each algorithm as it is, and do not perform any tuning [7] or other initial research to modify its control parameters or operators. This choice may affect the results to a large degree. However, in [15] it was found that the issue of parameter tuning is less important than the number of problems that are used for comparison. Although this finding could be premature, being based on a limited number of algorithms, benchmarks and tests, the study [15] seems to be the most detailed comparison between tuned and non-tuned algorithms performed so far. The other reason for using untuned algorithms in the present paper is the computational complexity. As discussed in [18,42], tuning is not a simple task. First, the control parameters for tuning should be selected separately for each algorithm. Secondly, to obtain statistically sound results, a huge number of tests often needs to be done for each considered algorithm. It has been pointed out that "the tuning with these kinds of tools [for automatic parameter tuning] can be computationally unaffordable in real-world problems" [42]. This is a great challenge for the present research, in which many algorithms are involved. Nonetheless, although unaffordable in this study, tuning of the control parameters of the algorithms on the selected benchmarks, following the procedures given in [18,42], is highly advisable. The tuning would be especially important when a particular study is focused on a single, or a few, specific applications, for which the suggested values of the control parameters may be inappropriate. However, one should remember that in the case of real-world problems, even a slight modification of the mathematical definition of the problem may affect the results. This is exemplified in [43], where it is shown that in some comparison papers different equations are given for problems with identical names and identical original references. The reason for that is unclear, but the impact of such seemingly small modifications on the comparison may be noticeable.

In the present study we mainly focus on the differences noted on benchmarks composed of mathematical functions against benchmarks composed of real-world problems, and on benchmarks with a lower number of allowed function calls against benchmarks with a very high number of allowed function calls. Our goal is to give a clear message to practitioners looking for the appropriate Evolutionary Algorithm that would be best suited for their specific optimization problem. Should practitioners focus on algorithms tested on numerical benchmarks with a higher or lower number of function calls? Should they consider algorithms tested on benchmarks composed of mathematical functions, or focus mainly on those tested on real-world problems?

To address such issues a large number of algorithms needs to be tested on at least a few benchmark sets. In the present paper we analyze the performance of 73 metaheuristics on four sets of CEC benchmark problems: real-world problems proposed for CEC 2011, mathematical functions defined for the CEC 2014 and CEC 2017 benchmarks, and the newly proposed cases for CEC 2020. The number of function calls is always kept to the values suggested in the particular set of problems. We compare the 73 algorithms according to their ranks averaged over all problems included in the particular set. We test different dimensionalities of the mathematical functions included in the CEC 2014, CEC 2017, and CEC 2020 problem sets.

We are unaware of any study that would apply so many algorithms together on various benchmark sets. This alone justifies the research. However, the sheer number of compared algorithms may also affect some conclusions, as discussed in [15]. Using too many algorithms may flatten the averaged performance of many tested methods, as even the best algorithms may be outperformed by the majority of others on some specific problems. To address this potential problem, we also aim to study how the number of compared algorithms affects the rankings based on averaged ranks. To do so, on each benchmark set we create another ranking of algorithms that includes only the selected best methods. The detailed results obtained by each algorithm used in the ranking based on the limited number of methods are the same as the
results used in rankings based on all 73 approaches. The only change is the number of algorithms, which affects the worst possible rank that may be obtained on a particular problem. By comparing the best results obtained in rankings based on all 73 optimizers and on a limited number of best algorithms, we should be able to determine to what extent the sheer number of compared algorithms affects the choice of the best optimizer for each considered benchmark set.

2. Methods

2.1. Test problems

In the present paper we analyze the performances of 73 Swarm Intelligence and Evolutionary Algorithms in nine competitions performed on four benchmark sets. We focus on low-to-moderate dimensional problems; large-scale optimization problems [44] are not addressed. All considered sets are composed of numerical single-objective non-dynamic minimization problems. The four benchmark sets include real-world problems from CEC 2011 [25], and numerical benchmarks from CEC 2014 [40], CEC 2017 [34], and CEC 2020 [35]. The collection of CEC 2011 includes 1- to 216-dimensional real-world problems from very different fields of science and engineering. For each of the 22 problems from this collection the dimensionality is fixed. The number of allowed function calls for CEC 2011 problems is set to 150,000 [25]. The CEC 2014 and CEC 2017 sets are composed of 30 numerical functions defined in 2-, 10-, 30-, 50-, and 100-dimensional versions. The functions highly differ in difficulty [34,40]. The maximum number of function calls in both the CEC 2014 and CEC 2017 collections of problems is set to 10000D, where D is the problem dimensionality. The CEC 2020 set is composed of only 10 numerical problems, which are similar to some problems from CEC 2014 and CEC 2017 (see [35]). However, CEC 2020 problems are defined in 5-, 10-, 15-, and 20-dimensional versions, and the allowed number of function calls is set to 50,000, 1,000,000, 3,000,000, and 10,000,000 for the 5-, 10-, 15-, and 20-dimensional cases, respectively [35]. As may be seen, apart from the 5-dimensional case, much more time is given to solve CEC 2020 problems than was allowed in the CEC 2011, CEC 2014, or CEC 2017 cases.
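These budgets can be summarized compactly; the sketch below is ours (the function name `max_function_calls` is not part of any benchmark definition) and simply restates the values listed above.

```python
def max_function_calls(benchmark: str, D: int) -> int:
    """Maximum number of function calls (MNFC) per problem, as specified
    in the benchmark definitions cited above."""
    if benchmark == "CEC2011":
        return 150_000                      # fixed; dimensionality varies per problem
    if benchmark in ("CEC2014", "CEC2017"):
        return 10_000 * D                   # 10000*D for every tested dimensionality
    if benchmark == "CEC2020":
        return {5: 50_000, 10: 1_000_000,
                15: 3_000_000, 20: 10_000_000}[D]
    raise ValueError(f"unknown benchmark: {benchmark}")
```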
There are some major similarities and differences between the various benchmark problems. CEC 2011 is the only set composed of real-world problems, hence it is obviously very different from the other three benchmarks. CEC 2011 problems come from various fields of science and industry, but the majority come from physics, chemistry, power engineering, and space mission planning. In contrast to artificially created mathematical functions, each of the 22 CEC 2011 problems has a unique dimensionality related to the real-world case it addresses, which cannot be changed without making some additional assumptions about the problem to be solved. As may be seen from the literature and the results published in the present paper, for some problems (e.g. problem 3 or 8) the same solutions are found by almost every metaheuristic used. On the other hand, for some other problems (e.g. problems 1 or 13) the solutions found by different algorithms may be highly diversified.

The remaining three benchmarks (CEC 2014, CEC 2017, and CEC 2020) are composed of four kinds of mathematical functions [34,35,40]. Unimodal functions are theoretically the simplest, and should pose only moderate difficulty to well-designed algorithms. Three unimodal functions are included in the CEC 2014 benchmark (they compose 10% of problems), two in CEC 2017 (7% of problems), and just one in CEC 2020 (10% of problems). Simple multimodal functions have multiple optima and may be rotated and shifted, but their fitness landscape often has a relatively regular shape, which facilitates the search for many kinds of metaheuristics. There are as many as thirteen such functions in CEC 2014 (43% of problems), seven in CEC 2017 (23%), and three in the CEC 2020 benchmark (30%). The more difficult kinds of functions are the so-called hybrid functions, which are composed of different basic functions, and the weighted impact of each basic function on the final shape of the problem differs in various parts of the search space. As a result, the fitness landscape of such functions frequently has a different shape in various parts of the search space. There are six hybrid functions in CEC 2014 (20% of problems), ten in CEC 2017 (33% of problems), and three in CEC 2020 (30% of problems). Finally, composition functions are considered the hardest for optimizers, as they mix the properties of sub-functions to a higher degree, and also use hybrid functions as sub-components. There are eight composition functions in CEC 2014 (27% of problems), eleven in CEC 2017 (37% of problems), and three in the CEC 2020 benchmark (30% of problems).

Different functions are generally used in the CEC 2014 and CEC 2017 benchmarks, even though their properties are obtained in a similar way. In CEC 2020, though, nine out of ten functions are ones that were already used either in CEC 2014 or CEC 2017. The difference of the CEC 2020 set is, hence, mainly due to the lower number of problems, the lower dimensionality of the functions, and the higher number of allowed function calls.

As may be inferred from the above, among these three benchmarks based on mathematical functions CEC 2017 seems to be the most difficult, as it contains the lowest percentage of unimodal functions (7%), and 70% of its problems are hybrid or composition functions. CEC 2020 may be considered the simplest to fit, as it contains much fewer problems (hence it is easier to fit to them well when constructing a novel algorithm), these problems are low-dimensional, and each algorithm has much more time to solve the problem than in the case of the other benchmarks (hence wasting time on unsuccessful steps is a much weaker issue for CEC 2020 than for the other benchmarks).

In the present paper we use all four versions of CEC 2020 (5- to 20-dimensional), the only available version of CEC 2011, and two versions of CEC 2014 and CEC 2017 (10- and 50-dimensional). Together this gives nine competitions. We have selected the 10-dimensional versions of CEC 2014 and CEC 2017 because these are the only choices that agree in dimensionality with the available CEC 2020 versions. In the further text we use a specific nomenclature that defines both the benchmark set and the dimensionality, such as CEC 2020_15, in which CEC 2020 denotes the test suite and _15 refers to the dimensionality. Following Awad et al. [34] we have run each algorithm 51 times on every problem in each competition. The lowest value of the objective function from each run is remembered.
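The overall experimental protocol described above can be sketched as follows. The helper names (`problems_of`, `run_algorithm`) are hypothetical stand-ins; `run_algorithm` represents executing one of the 73 optimizers with its published control parameters and returning the lowest objective value found in that run.

```python
import numpy as np

# The nine competitions: (benchmark, dimensionality); CEC 2011 problem
# dimensionalities are fixed per problem, hence None.
COMPETITIONS = [("CEC2011", None),
                ("CEC2014", 10), ("CEC2014", 50),
                ("CEC2017", 10), ("CEC2017", 50),
                ("CEC2020", 5), ("CEC2020", 10), ("CEC2020", 15), ("CEC2020", 20)]
N_RUNS = 51

def collect_mean_results(algorithms, problems_of, run_algorithm):
    """For every competition, problem and algorithm, average the lowest
    objective values of 51 independent runs (used later for ranking)."""
    mean_perf = {}
    for bench, dim in COMPETITIONS:
        for problem in problems_of(bench, dim):
            for alg in algorithms:
                finals = [run_algorithm(alg, problem) for _ in range(N_RUNS)]
                mean_perf[(bench, dim, problem, alg)] = np.mean(finals)
    return mean_perf
```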
2.2. Algorithms compared

Our idea is to present a large-scale comparison between various algorithms proposed in different periods of Swarm Intelligence and Evolutionary Computation-based research. The main aim is to analyze similarities and differences in the rankings of algorithms obtained for different benchmark sets of problems. Altogether 73 algorithms proposed between the 1960s and 2022 are tested "as they are", without control parameter tuning [18]. These 73 algorithms are summarized in Table 1. Because algorithms belong to various families of methods, including hybrids, and may be known under different names or abbreviations – even if they behave similarly [121] – we have decided to avoid sorting or classifying them alphabetically or according to any possible similarities (e.g. [122]). In Table 1 we sort the algorithms historically, from the oldest to the newest, according to the year of the final publication. The choice of algorithms is of course subjective. We include a number of historical algorithms but mainly focus on methods published within the last 15 years. We have included many state-of-the-art, widely cited algorithms, as well as some less-known ones. However, we have tried to avoid those with disputable novelty [19,123-125]. We apologize that many popular and reputable variants are nonetheless not included. Apart from some necessary modifications discussed in Table 1, the control parameters of the algorithms are set to the values suggested in the source papers. In Table 1 we specify the population size of each algorithm, as this parameter is important for almost every method and is sometimes used differently by different authors. In Table 1 we also give detailed information on the benchmarks on which particular algorithms were tested in their source papers, including the number of problems included in each benchmark and the dimensionality of the problems. We also specify the maximum number of function calls (MNFC) with which each algorithm was tested in the source paper. This information is essential, as it may suggest for which kind of problems, and competition settings, the algorithm was originally tuned by its authors. For example, in 2009 SFMDE was tested on fewer than ten 7- to 12-dimensional problems, with 100,000 function calls. This is in sharp contrast to DEGL, which at the same time (the year 2009) was tested on 26 problems of various dimensionalities (up to 100), with (depending on the problem) up to a few million function calls. We may hence expect that the control parameters of SFMDE are rather fitted to some low-dimensional problems and may hardly generalize to other applications. On the contrary, the control parameter values proposed for DEGL may be of more general use. This information may be critical for more recent algorithms, among which some were tested in their initial paper on a single specific benchmark (e.g. HIP-DE, OLSHADE-CS, MaDE), whereas the performance of others was verified on versatile kinds of problems (HARD-DE, PSO-sono, N-L-SHADE).
Table 1
Algorithms compared. D – problem dimensionality. * – marks algorithms whose codes were obtained from the authors of the source paper, or from their web pages. The middle columns of the Table refer to the benchmark problems and comparison settings (number of problems in the benchmark (if a function is used with different dimensionalities, it is counted just once), dimensionalities, maximum number of function calls (MNFC)) used in the initial paper that introduced the particular algorithm. These may be assumed to be the benchmark and settings for which the particular algorithm was tuned by its authors in the original paper. However, the degree of tuning, and the tuning method used, is often unreported in the source papers, hence unknown. *1 – algorithms were run until the optimum was achieved, and the number of function calls and/or success rates were compared.

Chronological number | Short name | Long/descriptive name | Reference | Year | MNFC | Benchmarks | Nr of problems | Dimensionality | Comments
(MNFC, Benchmarks, Nr of problems and Dimensionality describe the settings with which the initial version of the algorithm was tested in its original paper.)
12 | AdapSS-JADE | JADE with adaptive strategy selection | [58] | 2011 | 1.5·10^5–5·10^5 | basic and CEC2005 functions | 22 | 30 | Population size = 100.
13 | CDE | Clustering DE | [59] | 2011 | 10^4–5·10^5 | basic and CEC2005 functions | 30 | 2–30 | Population size = 100.
14 | EPSDE | DE with ensemble of strategies and parameters | [60] | 2011 | 10^5–3·10^5 | basic and CEC2005 functions | 14 | 10, 30 | Population size = 50.
15 | jDElscop | Multi-strategies self-adaptive DE | [61] | 2011 | 5·10^3·D | basic functions | 19 | 50–1000 | Population size is initialized at 10D (but within [50,500]) and is gradually reduced during the run.
16 | SspDE | DE with self-adaptive strategy and parameters | [62] | 2011 | 10^4·D | basic and CEC2005 functions | 19 | 10–100 | Population size = 100.
17 | MDE_pBX* | Memetic adaptive DE with new mutation and crossover | [63] | 2012 | 3·10^5–10^6 | CEC2005 | 25 | 30, 50, 100 | Population size = 100. The MDE_pBX code has been obtained from its authors.
18 | DE-SG | Differential Evolution with separated groups | [64] | 2012 | 10^5·D | CEC2005 | 19 | 10, 30, 50 | Population size = 2D (but within [20,500]). An algorithm proposed by some authors of the present papers.
19 | SapsDE | DE with adaptive resizing mechanism | [65] | 2013 | 10^5·D | basic and CEC2005 functions | 17 | 30, 50, 100 | Population size is variable, initialized with 1D (but within [10,500]).
20 | PMS | Parallel Memetic Structures | [66] | 2013 | 5·10^5·D | BBOB2010, CEC2005, CEC2008, CEC2010 | 76 | 30, 100, 1000 | Non-population-based EA inspired and partly based on RA.
21 | AM-DEGL | Adaptive memetic DEGL | [67] | 2013 | 10^5·D | CEC2005 | 25 | 10, 30, 50 | Population size = 5D (but within [10,500]). The algorithm also uses NMA as a sub-procedure. An algorithm proposed by some authors of the present papers.
22 | ALC-PSO | PSO with aging leader | [68] | 2013 | 2·10^5 | basic and CEC2005 functions | 17 | 30 | Population size = 20.
23 | ATPS-DE | JADE with adaptive population tuning | [69] | 2013 | 10^5·D | CEC2005 | 25 | 30, 100 | Population size is adaptive during the run, initialized with 5D (but within [50,200]).
24 | L-SHADE* | SHADE with linear population size reduction | [70] | 2014 | 10^5·D | CEC2014 | 30 | 10, 30, 50, 100 | A version of Successful History Adaptive DE with the population size linearly reduced during the run from 18D at the beginning to 4 at the end. State-of-the-art algorithm. The code of L-SHADE has been provided by Prof. Suganthan.
25 | CoBiDE | DE based on Covariance Matrix learning | [71] | 2014 | 10^5·D | CEC2005 | 25 | 30 | Population size = 60. DE algorithm that adaptively rotates the coordinate system during crossover and uses a bimodal distribution setting of parameters. State-of-the-art algorithm.
26 | LBBO* | Linearized Biogeography-based Optimization | [72] | 2014 | 10^5·D; 1.5·10^5 | CEC2005, CEC2011 | 25+22 | 10, 30, 1–216 | Population size = 50. The code of LBBO has been downloaded from Prof. [...]
[...]
38 | GLPSO | Genetic learning PSO | [84] | 2016 | 10^5·D | basic functions and CEC2013 | 24+28 | 30 | Population size = 50. The method mixes properties of PSO and the Genetic Algorithm.
39 | HMJCDE | Hybrid memetic CoDE and JADE | [85] | 2016 | 10^5·D | CEC2014 | 30 | 30, 50, 100 | Population size = 100. The hybrid algorithm that merges two state-of-the-art DE variants.
40 | L-SHADE-SPACMA* | Hybrid algorithm that merges L-SHADE and CMA-ES | [86] | 2017 | 10^5·D | CEC2017 | 30 | 10, 30, 50, 100 | Population size is linearly decreased from 18D at the beginning of the search to 4 at the end. The code has been obtained from https://github.com/P-N-Suganthan/CEC2017-BoundContrained/blob/master/Codes-of-Top-Methods-and-results.zip
41 | L-SHADE-cnEpSin* | Ensemble sinusoidal parameter adaptation L-SHADE | [97] | 2017 | 10^5·D | CEC2017 | 30 | 10, 30, 50 | Population size is linearly decreased from 18D at the beginning of the search to 4 at the end. The algorithm is a hybrid of DE variants merged into the L-SHADE framework. The code has been obtained from https://github.com/P-N-Suganthan/CEC2017-BoundContrained/blob/master/Codes-of-Top-Methods-and-results.zip
42 | EPSO* | Ensemble of PSO variants | [88] | 2017 | 10^5·D | CEC2005 | 25 | 10, 30 | Population size = 40, divided into two uneven swarms. The code has been obtained from https://github.com/P-N-Suganthan/CODES/blob/master/2017-ASOC-EPSO.zip
43 | ETI-SHADE* | SHADE with event-triggered scheme | [89] | 2017 | 10^5·D | CEC2014 | 30 | 30, 50, 100 | Population size = 150. The code has been obtained from its authors.
44 | HIVBBO* | Hybrid Invasive Weed and Biogeography-based optimization | [90] | 2017 | 10^5·D | CEC2005, BBOB2015 and 2 real-world problems | 25+24+2 | 30, 10, 36–47 | Population size = 100. The code has been obtained from http://embeddedlab.csuohio.edu/BBO/IWO.html.
45 | L-JADE* | JADE with linear population size reduction | [53,91] | 2018 | 5·10^3, 10^5, 5·10^5, 2.5·10^6, 5·10^3, 3·10^4, 1.5·10^5, 7.5·10^5 | CEC2014, CEC2011 | 30+22 | 50, 1–216 | Population size is linearly decreased from 18D at the beginning of the search to 4 at the end. This is a JADE algorithm with linear population size reduction added. The code has been obtained from the authors of L-MPEDE.
46 | L-MPEDE* | MPEDE with linear population size reduction | [81,91] | 2018 | 5·10^3, 10^5, 5·10^5, 2.5·10^6, 5·10^3, 3·10^4, 1.5·10^5, 7.5·10^5 | CEC2014, CEC2011 | 30+22 | 50, 1–216 | Population size is linearly decreased from 18D at the beginning of the search to 4 at the end. This is a MPEDE algorithm with linear population size reduction added. The code has been obtained from its authors.
47 | DSO* | Drone Squadron Optimization | [92] | 2018 | 10^5 | basic functions and CEC2005 | 13+25 | 10 | Population size = 100, divided into 4 groups. The code has been obtained from https://github.com/melovv/DSO-MATLAB.
48 | EFADE* | Enhanced fitness-adaptive DE | [93] | 2018 | 10^5·D | CEC2013 | 28 | 10, 30, 50 | Population size = 50. The code has been obtained from its authors.
49 | L-SHADE-50 | Simplification of L-SHADE variants | [94] | 2018 | 10^5·D; 1.5·10^5 | CEC2014, CEC2011 | 30+22 | 50, 1–216 | Population size is linearly decreased from 18D at the beginning of the search to 4 at the end. In contradiction to the majority of new variants, this algorithm is simpler than the algorithms on which it is based. An algorithm proposed by some authors of the present papers.
50 | L-SHADE-50-PWI | L-SHADE-50 with PSO-based inertia weight | [95] | 2018 | 10^5·D; 1.5·10^5 | CEC2014, CEC2017, CEC2011 | 30+30+22 | 50, 1–26 | Population size is linearly decreased from 18D at the beginning of the search to 4 at the end. An algorithm proposed by some authors of the present papers.
51 | HSES* | Hybrid Sampling Evolution Strategy | [96] | 2018 | 10^5·D | CEC2018 | 29 | 10, 30, 50, 100 | Population size = 200. The winner of the IEEE Competition on Evolutionary Computation in 2018. The code has been obtained from https://github.com/P-N-Suganthan?tab=repositories.
52 | EnsDE* | Ensemble of DE algorithms | [97] | 2018 | 10^5·D | CEC2005 | 25 | 30, 50 | Population size = 100. State-of-the-art ensemble of DE variants. The code has been obtained from https://github.com/P-N-Suganthan?tab=repositories.
53 | DEPSO* | Dual environmental PSO | [98] | 2019 | 10^5·D | CEC2013 | 28 | 50 | Population size = 50. The code has been obtained from its authors.
54 | HARD-DE* | Hierarchical archive-based DE | [99] | 2019 | 10^5·D; 1.5·10^5 | CEC2013, CEC2017, 2 real-world problems from CEC2011 | 28+30+2 | 10, 30, 50, 6–20 | Population size is parabolically (quicker at the end of the search) decreased during a run from 25·ln(D)·√D to 4. The code has been obtained from https://sites.google.com/view/zhenyumeng/
55 | jDE* | DE with diversity and adaptive population size | [100] | 2019 | 10^5·D | CEC2014 | 30 | 10, 30, 50, 100 | Population size is initialized with 50 but varies adaptively during search within the [8,5D] range. The algorithm is based on [101], but with a diversity-based mechanism. The code [...]
[...]
[...] However, a comparison against state-of-the-art algorithms is given only for the CEC2020 benchmark.
71 | PSO-sono* | PSO for single-objective problems | [118] | 2022 | 10^5·D | CEC2013, CEC2014, CEC2017 | 28+30+30 | 10, 30, 50, 100 | Population size = 100. A new fully informed search scheme is proposed, and control parameters are adaptively modified during the search. The code has been obtained from https://sites.google.com/view/zhenyumeng/.
72 | AHA* | Artificial hummingbird algorithm | [119] | 2022 | 5·10^4, 2.5·10^4, 1.5·10^4 | basic functions, CEC2014, 10 real-world problems | 50+30+10 | 2–30 | Population size = 50. This is a new bio-inspired algorithm. The code has been obtained from https://seyedalimirjalili.com/projects.
73 | N-L-SHADE | Spatial-neighborhoods-based L-SHADE | [120] | 2022 | 10^5·D; 1.5·10^5 | CEC2013, CEC2014, CEC2015, CEC2017, CEC2011 | 28+30+15+30+22 | 10, 30, 50, 100, 1–216 | Population size is linearly decreased from 18D at the beginning of the search to 4 at the end. Control parameters F and CR are adaptively modified based on successful sets of parameters in the neighborhood of a particular individual.
2.3. Comparison criteria

In this study we follow a classical comparison between algorithms according to rankings based on averaged performance. First, for every algorithm we compute the averaged performance from 51 runs on a particular problem. For every problem we rank the algorithms from the best one (rank 1) to the worst (rank 73 in the case of rankings based on all tested algorithms), according to the averaged performance. When the difference between the performances obtained by some algorithms is lower than 10^−8 [34], these algorithms are given an equal rank, namely the average of the neighboring positions (e.g. if the two best algorithms differ by less than 10^−8, they are both given a rank of 1.5). Finally, we average the ranks over all problems in the particular competition. As nine competitions are performed in this study, we obtain nine averaged rankings of 73 algorithms.
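A minimal sketch of this ranking procedure is given below (our own illustration, not the authors' code). It assumes a dictionary `mean_perf[alg][problem]` holding the 51-run averaged objective values; ties are grouped transitively, which slightly generalizes the pairwise 10^−8 rule described above.

```python
import numpy as np
from scipy.stats import rankdata

TOL = 1e-8  # performance differences below this threshold count as ties

def problem_ranks(perf_by_alg):
    """Rank algorithms on one problem; algorithms whose averaged
    performances differ by less than TOL share the average of the tied
    positions (e.g. two tied leaders both receive rank 1.5)."""
    algs = list(perf_by_alg)
    vals = np.array([perf_by_alg[a] for a in algs])
    order = np.argsort(vals)
    group = np.empty(len(vals), dtype=int)
    g = 0
    group[order[0]] = 0
    for prev, cur in zip(order[:-1], order[1:]):
        if vals[cur] - vals[prev] >= TOL:
            g += 1
        group[cur] = g
    return dict(zip(algs, rankdata(group, method="average")))

def averaged_ranks(mean_perf, problems):
    """Average each algorithm's rank over all problems of one competition."""
    per_problem = [problem_ranks({a: mean_perf[a][p] for a in mean_perf})
                   for p in problems]
    return {a: np.mean([r[a] for r in per_problem]) for a in mean_perf}
```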
As discussed earlier, there is a clear difference between the competition settings defined for CEC 2020 and those specified for the earlier CEC benchmark sets. We would like to learn how such differences affect the rankings of algorithms. To focus our discussion on the differences in rankings of algorithms obtained on CEC 2020 and the former sets, in the presentation of the results (see Table 1) we have decided to mark in green the ten algorithms that performed best in the CEC 2014_50 competition, and mark in red the ten algorithms that performed best in the CEC 2020_20 competition, out of the 73 algorithms tested. These selected algorithms are then marked in the same color for any other competition, to see how the best methods according to the CEC 2014_50 or CEC 2020_20 sets perform in other tests. For simplicity, we refer to these algorithms as greens and reds in the further part of the paper.

Finally, to see how the number of algorithms used may affect the comparison between the best methods, we prepare another ranking for all nine competitions in which only these marked (greens and reds) algorithms are considered. Let us discuss the importance of that on an example of ten CEC 2020 problems. In a large crowd of 73 algorithms, a poor performance on a single problem out of ten may highly increase an algorithm's average rank. Failure on a specific problem in a less crowded competition affects the averaged ranking much less harmfully. For example, if algorithm A wins nine out of ten problems, but is the poorest on the tenth problem, its averaged rank would be 8.2 when we have 73 competitors ((9·1+73)/10 = 8.2), but only 2.9 when we have 20 algorithms in the competition ((9·1+20)/10 = 2.9). As a result, if some other algorithm B were always 3rd to 8th on each problem, B would be considered better than A in a crowded competition among 73 algorithms, but A would be regarded as better than B in a competition with just 20 best algorithms. Note that the results obtained by A and B do not change at all – what changes is just the number of competitors. No one may say which ranking (based on the lower, or on the larger number of competitors) is fairer, hence we prefer to discuss them both.

2.4. Statistical tests

When many algorithms are compared on multiple problems, a multiple pair-wise statistical comparison that considers the interrelations between all algorithms should be performed [37,126-128]. However, multiple pair-wise tests are too computationally demanding to be applied for comparisons among many more than 30 algorithms (e.g. [129]). As a result, we have used two separate ways of testing the statistical hypotheses. For all 73 algorithms, we have used the Wilcoxon rank sum test [70,71,130] to verify the statistical significance of the differences between the best algorithm on the particular benchmark and the remaining 72 algorithms. Note that in this case a different algorithm may be chosen as the control method in each of the nine competitions. On the other hand, we performed an additional statistical comparison between the algorithms that were among the ten best methods either for CEC 2014_50 or for CEC 2020_20 (greens and reds). As the number of such algorithms is sufficiently small, we verify the statistical significance of the multiple pair-wise comparisons among all these algorithms on each problem set by means of Friedman's test with the post-hoc Shaffer's static procedure at α = 0.05 [131], using the codes available from [129].
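As a rough, hedged illustration of the two testing set-ups (it relies on SciPy's stock implementations; Shaffer's static post-hoc procedure is not reproduced here – in the study itself the codes of [129] were used):

```python
import numpy as np
from scipy.stats import ranksums, friedmanchisquare

ALPHA = 0.05

def better_equal_worse(runs_best, runs_other):
    """Wilcoxon rank-sum test on the 51 final values of the best algorithm
    vs. another algorithm on one problem; classifies the outcome as
    'b' (best is significantly better), 'w' (worse) or 'e' (no significant
    difference), as reported in Table 3."""
    _, p = ranksums(runs_best, runs_other)
    if p >= ALPHA:
        return "e"
    return "b" if np.mean(runs_best) < np.mean(runs_other) else "w"

def friedman_omnibus(mean_perf_matrix):
    """Friedman's test over a (problems x algorithms) array of 51-run
    averaged performances; pair-wise post-hoc decisions (Shaffer's static
    procedure) would follow this omnibus test."""
    return friedmanchisquare(*np.asarray(mean_perf_matrix).T)
```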
3. Results and discussion

In the present section we first analyze the averaged-performance-based rankings of all 73 algorithms in nine competitions. We focus on the differences in rankings of algorithms obtained for the four different benchmark sets but briefly discuss also the performance of particular algorithms. After that, we focus on rankings created solely for a limited number of algorithms, namely only those that were either among the ten best methods for CEC 2014_50 or among the ten best methods for CEC 2020_20 problems. We discuss the differences observed in rankings based on a large and a small number of algorithms.

3.1. Comparison between 73 algorithms on nine competitions

The nine rankings that are based on 73 algorithms are given in Table 2. Every ranking is based on a different competition. The detailed results are shown in the Suppl. Material, including the 51-runs-based mean performance of the 73 algorithms on each specific problem (named: performance), detailed ranks of the 73 algorithms on each specific problem (named: ranks), ranks of the 73 algorithms averaged over all problems within a particular competition (named: avg_ranks), and the order of algorithms from the best to the worst for each competition (named: order); these results are given separately for each competition. Note that apart from the "order", all the other results from the 73 algorithms in the Suppl. Material are arranged in columns according to the chronological numbers given in the first column of Table 1. Ranks and performances are provided for each algorithm and every problem. For example, in the table "ranks" of size 30 × 73 in the RESULTS_CEC_2014_50 file one finds the ranks of the 73 algorithms obtained on the thirty 50-dimensional CEC 2014 problems. "Order" provides the names of the algorithms arranged from the best to the worst according to the specific competition. The statistical significance of the differences between the best algorithm in each competition and the remaining 72 algorithms is given in Table 3.
Table 2
Averaged-ranks-based ranking of 73 Swarm Intelligence and Evolutionary Algorithms on nine different tests. The nine best algorithms on the 50-dimensional version of CEC 2014 are given in green, the nine best algorithms on the 20-dimensional version of CEC 2020 are shown in red, and the one algorithm that was among the ten best methods in both CEC 2014_50 and CEC 2020_20 is marked in yellow. Notation: nr – position in the particular ranking; ar – averaged rank obtained by a particular algorithm in the specific ranking.
The nine final rankings of the 73 algorithms are given in Table 2, in which we have marked in green the ten algorithms that performed best on the CEC 2014_50 benchmark suite, and marked in red the ten algorithms that performed best on the CEC 2020_20 suite. As one algorithm (L_SHADE_cnEpSin) was among the ten best methods in both rankings, it is marked in yellow in Table 2. As a result, 19 different algorithms are marked as "winners" of either the CEC 2014_50 or the CEC 2020_20 competition. These 19 algorithms are given in the same color in each of the nine competitions, to facilitate finding the positions of the best algorithms from CEC 2014_50 and CEC 2020_20 in all comparisons.

First, let us discuss whether the comparison results between so many algorithms are diversified. According to the averaged ranks given in Table 2, some remarkable performance differences are observed among the 73 algorithms. In particular competitions, the best algorithms reached averaged ranks between 7.65 and 17.73, and the worst algorithms reached averaged ranks between 65.41 and 71.77. Hence, the best algorithms achieve at least 3.5 times better averaged ranks than the worst algorithms – these lowest differences are noted for the real-world problems. In the case of the CEC 2017_50 benchmark, the differences are even 9-fold (7.65 against 71.77). This may explain why the CEC 2017 benchmark is very popular among researchers for wide-scale comparisons between metaheuristics (e.g. [132,133]). One may, hence, note that the results obtained by many metaheuristics on each benchmark are indeed diversified, and the large crowd of competitors does not hamper the exceptional performance of some algorithms – both the best and the worst algorithms may easily be determined. Also, the differences between the best algorithm for a particular benchmark and the vast majority of the remaining algorithms are statistically significant for the majority of problems (see Table 3).

According to Table 2, the ten best algorithms in the CEC 2014_50 competition differ from those in the CEC 2020_20 competition. L_SHADE_cnEpSin is the only method that appears within the ten best algorithms in both rankings, even though it is not in the top positions (L_SHADE_cnEpSin is ranked 7th in the CEC 2014_50 competition, and 10th in the CEC 2020_20 competition). Interestingly, L_SHADE_cnEpSin has been tested by its authors [87] only on a single CEC 2017 benchmark with 10000D function calls (see Table 1), hence it has not been fitted to a variety of problems and computational budgets. Among the red algorithms, six were proposed very recently, between 2020 and 2022. In other words, these algorithms were introduced when the CEC 2020 problems were already known. Of course, at that time CEC 2011, CEC 2014, and CEC 2017 had also been known for years. However, three other red algorithms (AMALGAM, SADE, and jDElscop) are over ten years old; they were proposed during the 2009–2011 period, before the CEC 2014, CEC 2017, and CEC 2020 benchmark sets were developed. The relatively poorer performance on CEC 2020_20 problems achieved by algorithms proposed during the 2012–2019 period (among the best ten, only L_SHADE_cnEpSin is from that period – namely from the year 2017) may be coincidental. However, it may also indicate that algorithms proposed during that period were fitted to conditions that are unfavorable for the CEC 2020 set. The most obvious differences between the conditions defined for CEC 2020 and the previous benchmark sets are the number of allowed function calls (much higher for CEC 2020), the dimensionality (low for CEC 2020), and the number of problems (low for CEC 2020). We may expect, hence, that algorithms proposed in the 2010s were fitted rather to higher-dimensional problems that were to be solved with a relatively limited number of function calls.

Red algorithms perform similarly well in the CEC 2020_10 and CEC 2020_15 competitions as in the CEC 2020_20 competition. The only red algorithm that looks disappointing in the CEC 2020_15 and CEC 2020_10 rankings is SADE from 2009, which is both times ranked below the average. However, the performances of the red algorithms in the remaining six, non-CEC 2020-based competitions are much poorer and more diversified.

Some of the red algorithms perform well on the 10-dimensional versions of the CEC 2017 and, to a lesser extent, CEC 2014 sets. Especially jDElscop, MaDE, and APGSK_IMODE have to be mentioned, as they are among the seven best methods in the CEC 2017_10 competition. The good performance of MaDE and APGSK_IMODE may be justified because they were tested in their initial studies on the low-dimensional CEC 2021 benchmark. On the contrary, jDElscop has been tested on very high-dimensional problems (see Table 1), hence its performance cannot be explained in a similar way. However, some other red algorithms are among the bottom half of the 73 algorithms for these two 10-dimensional competitions (CEC 2017_10 and CEC 2014_10). Surprisingly, IMODE shows the weakest performance among the reds on 10-dimensional CEC 2014 and CEC 2017 problems, even though it was the winner on the 10-dimensional CEC 2020 set! This clearly indicates a lack of strong relations between performances on CEC 2020 on the one side, and the CEC 2014 and CEC 2017 sets on the other, even if the problem dimensionality is the same.

When the ranking is based on 73 algorithms, the performance of the reds on 50-dimensional CEC 2014, CEC 2017, and on real-world problems (CEC 2011) is very disappointing. Apart from L-SHADE-cnEpSin (yellow), no red algorithm is among the first ten methods for any of these three competitions, but some (OLSHADE_CS, IMODE, MaDE) are among the ten worst methods; OLSHADE_CS is even ranked 72nd out of 73 algorithms on the CEC 2011 real-world problems, despite being the winner of the CEC 2020_20 and CEC 2020_15 competitions. OLSHADE_CS also performs very poorly on CEC 2017_50 (66th out of 73 algorithms), and only marginally better on CEC 2014_50 (61st out of 73). In general, over 50% of the reds are located in the worse half of the algorithms in the CEC 2011, CEC 2014_50, and CEC 2017_50 competitions. On real-world problems all greens are better than all reds (skipping yellow) – the worst green is ranked 18th, and the best red is ranked 25th. This shows that, when 73 algorithms are considered, the best algorithms according to the CEC 2020 set perform below the average on the higher-dimensional CEC 2014 and CEC 2017 benchmarks, and especially on the various-dimensional CEC 2011 real-world problems. This may again be due to the much different number of allowed function calls, to differences in dimensionality, or both. However, the observed failure on the CEC 2011 real-world problems, on which the performance of the reds is the poorest among all nine competitions, may suggest that CEC 2020 problems are not appropriate for choosing algorithms for practical applications. The huge variability in the positions obtained in different rankings by OLSHADE_CS and IMODE, the winners of the 10- to 20-dimensional CEC 2020 competitions, may indicate that these algorithms are fitted to some kinds of problems when the allowed number of function calls is large (see Table 1). Such fitting may be disastrous for some other problems, when the search must be performed more quickly in a higher-dimensional domain.

The algorithms marked green show less variable performance than the red ones. Greens are among the best methods both for 10- and 50-dimensional CEC 2017 problems, as well as for the real-world CEC 2011 ones. On the CEC 2011 real-world problems greens occupy the highest positions, including the winner (HARD_DE). The poorest green is ranked 18th on CEC 2011 problems while the poorest red is ranked 72nd. Greens also performed very well in the CEC 2014_10 competition, where the green CS-DE is the winner, and the weakest green is ranked 16th.
Table 3
Statistical significance of the differences between the best algorithm on a particular benchmark set and all remaining algorithms by means of Wilcoxon rank sum test at α = 0.05. The chronological order of algorithms is
retained, as in Table 1, to facilitate reading. Note that the best algorithm is different for each benchmark set. b / e / w – the number of problems on which the best algorithm is statistically better, equal, or worse than the
algorithm in the particular line.
benchmark CEC 2011 CEC 2014_10 CEC 2014_50 CEC 2017_10 CEC 2017_50 CEC 2020_5 CEC 2020_10 CEC 2020_15 CEC 2020_20
best algorithm HARD-DE CS-DE L-SHADE-50 SPS-L-SHADE-EIG ELSHADE-SPACMA APGSK-IMODE IMODE OLSHADE-CS OLSHADE-CS
b e w b e w b e w b e w b e w b e w b e w b e w b e w
1 RA 19 0 3 27 3 0 28 0 2 25 3 2 29 1 0 9 1 0 9 0 1 9 0 1 9 0 1
2 NMA 18 2 2 29 0 1 26 1 3 28 1 1 29 1 0 10 0 0 10 0 0 10 0 0 10 0 0
3 DE 20 1 1 26 3 1 30 0 0 26 3 1 30 0 0 8 2 0 9 1 0 7 2 1 9 1 0
4 PSO 17 5 0 30 0 0 30 0 0 29 1 0 30 0 0 10 0 0 10 0 0 10 0 0 10 0 0
5 CLPSO 17 3 2 26 2 2 27 1 2 25 3 2 30 0 0 9 1 0 8 1 1 9 1 0 9 0 1
6 AMALGAM 14 5 3 18 4 8 17 2 11 18 7 5 15 4 11 8 1 1 5 2 3 4 1 5 5 2 3
7 JADE 15 5 2 22 8 0 26 1 3 18 12 0 28 2 0 7 3 0 7 3 0 6 4 0 6 3 1
8 DEGL 19 2 1 26 4 0 26 0 4 22 8 0 27 2 1 9 1 0 9 1 0 9 1 0 9 1 0
9 SADE 13 7 2 20 8 2 28 0 2 13 15 2 28 2 0 6 4 0 6 2 2 6 3 1 4 1 5
10 SFMDE 18 4 0 26 3 1 28 1 1 27 3 0 30 0 0 9 1 0 8 2 0 9 1 0 9 1 0
11 GA-MPC 14 3 5 19 9 2 26 2 2 19 10 1 28 2 0 6 4 0 9 1 0 8 2 0 8 2 0
12 AdapSS-JADE 17 3 2 23 7 0 24 2 4 20 9 1 27 3 0 6 4 0 7 2 1 6 3 1 7 1 2
13 CDE 17 2 3 19 10 1 25 2 3 15 9 6 29 0 1 5 5 0 8 1 1 5 5 0 9 1 0
14 EPSDE 19 1 2 22 6 2 26 2 2 17 9 4 29 1 0 7 3 0 8 1 1 7 1 2 8 1 1
15 jDElscop 16 3 3 17 10 3 25 2 3 11 11 8 28 2 0 5 4 1 5 3 2 3 4 3 7 2 1
16 SspDE 17 4 1 22 8 0 26 1 3 16 12 2 30 0 0 6 4 0 6 3 1 6 3 1 9 1 0
17 MDE_pBX 13 7 2 20 9 1 29 1 0 15 12 3 28 2 0 8 2 0 8 2 0 8 2 0 9 1 0
18 DE-SG 21 1 0 20 8 2 29 0 1 15 9 6 29 1 0 5 4 1 7 3 0 6 3 1 7 2 1
19 SapsDE 17 3 2 22 7 1 27 2 1 16 12 2 27 3 0 6 3 1 7 2 1 8 2 0 10 0 0
20 PMS 17 2 3 30 0 0 27 1 2 29 1 0 29 1 0 10 0 0 10 0 0 10 0 0 10 0 0
21 AM-DEGL 16 3 3 19 9 2 23 2 5 16 11 3 29 1 0 7 3 0 8 2 0 6 4 0 9 1 0
22 ALC-PSO 18 2 2 30 0 0 30 0 0 29 1 0 30 0 0 9 1 0 10 0 0 10 0 0 10 0 0
23 ATPS-DE 9 8 5 23 7 0 23 3 4 19 9 2 27 3 0 8 1 1 7 2 1 9 1 0 9 1 0
24 L-SHADE 9 7 6 11 18 1 15 8 7 9 18 3 19 4 7 4 5 1 7 2 1 5 3 2 8 1 1
25 CoBiDE 11 4 7 16 8 6 22 5 3 6 17 7 28 2 0 4 6 0 7 1 2 5 4 1 8 2 0
26 LBBO 14 4 4 23 5 2 27 0 3 26 4 0 29 1 0 9 1 0 8 1 1 10 0 0 10 0 0
27 Rcr-JADE 14 5 3 14 15 1 26 1 3 11 16 3 27 3 0 4 5 1 7 2 1 6 2 2 7 1 2
28 SPS-L-SHADE-EIG 14 3 5 7 15 8 13 3 14 0 30 0 20 3 7 4 5 1 6 2 2 5 3 2 7 2 1
29 JADE-EIG 18 1 3 22 7 1 26 2 2 20 9 1 29 1 0 7 2 1 9 1 0 6 4 0 8 2 0
30 HCLPSO 12 6 4 26 3 1 25 2 3 24 2 4 28 2 0 9 1 0 10 0 0 9 0 1 9 0 1
31 IDE 15 3 4 22 7 1 27 2 1 20 9 1 29 1 0 8 1 1 4 2 4 5 2 3 8 1 1
32 JADE-AEPD 13 5 4 23 7 0 26 1 3 22 8 0 28 2 0 10 0 0 7 2 1 6 2 2 6 2 2
33 JADEEP 16 3 3 26 4 0 25 2 3 19 9 2 27 2 1 7 3 0 8 1 1 8 2 0 8 2 0
34 MPADE 7 6 9 23 5 2 27 1 2 21 7 2 30 0 0 10 0 0 6 3 1 6 2 2 9 1 0
[...]

Greens perform poorer on the CEC 2020 set. However, the weakest position of a green algorithm in any CEC 2020 competition is 45th (on CEC [...] competitions held on CEC 2011, CEC 2014, or CEC 2017 problems. Interestingly, [...] benchmarks: on CEC 2020_20 the worst green is just the 25th. This suggests that greens are more robust algorithms than reds. In other words, algorithms that win on CEC 2014_50 perform much better through all nine competitions than algorithms that win on CEC 2020_20.

CEC 2020 problems are similar to selected low-dimensional problems from earlier competitions (see the discussion in Section 2.1), and are less numerous (CEC 2020 includes just ten problems, whereas the other benchmarks had 22–30 problems each), but many more function calls are allowed to solve them. Hence, the difference in performance is probably [...] function calls and the decreased number of problems for CEC 2020 [...] the CEC 2020 benchmark when they are applied to other problems.

[...] 2020 set) or green (which win on all competitions held on CEC 2014, CEC 2017, and CEC 2011). We may even note that, apart from CEC [...] the winner on both the CEC 2020_20 and CEC 2020_15 tests, no other algorithm [...] worst approach for CEC 2011 real-world problems, performs among the worst ten on CEC 2017_50, and is the 61st method, out of 73, on CEC 2014_50. It seems that OLSHADE_CS is the most over-fitted algorithm [...] when plenty of time is allowed, but very poorly in some other cases. The [...] five best methods on the other four competitions, and is at worst ranked 16th – in the 15-dimensional CEC 2020 competition. It must be noted that HARD_DE, in its original paper [99], has been tested on 60 problems [...] scratch.

[...] (CEC 2011, CEC 2014, and CEC 2017) sets of problems, especially real-world [...] competitions, than on the newer CEC 2020 set. However, those that perform best on the former CEC benchmarks are still ranked above the average on the newer CEC 2020 set. This cannot be said about the winners of CEC 2020 [...] and CEC 2017 sets. Rankings based on real-world CEC 2011 problems, or [...] algorithms than the CEC 2020 set. On the other hand, CEC 2020 benchmarks, due to their low number (just ten), low dimensionality (up to 20), and the very high number of allowed function calls (which would enable competitors to waste plenty of time), may lure algorithms to overfit [...]
Practitioners looking for the appropriate algorithm for a specific application should hence consider using the best methods proposed before 2020 if their computational budget is limited and the problem dimensionality is high, or when their problem is much different from the problems on which algorithms are typically tested. Algorithms tested initially on CEC 2020 are suggested to be used if one has plenty of time, and the problem is relatively low-dimensional and shares some similarities with the problems included in that benchmark set.

3.2. Comparison among 19 selected algorithms on nine competitions

In this section we focus on the new nine rankings based solely on 19 algorithms, namely those that were among the first ten methods in either the CEC 2014_50 or the CEC 2020_20 competition. In the new, 19-methods-based rankings, the algorithm with the lowest 51-runs-averaged objective function for the particular problem is ranked 1st, and the algorithm with the highest 51-runs-averaged objective function is ranked 19th (instead of 73rd, as was the case for rankings based on 73 algorithms). Such ranks are then averaged over all problems considered within the particular competition. 19 algorithms are used, as one method, L_SHADE_cnEpSin, was among the ten best algorithms in both the CEC 2020_20 and CEC 2014_50 competitions. The results obtained by each of the 19 algorithms on every problem are the same as those that were used to construct the 73-algorithms-based rankings. Note that the chosen 19 algorithms include all the winners of the 73-algorithms-based rankings, in any of the nine competitions (see Table 2). Such nine new, 19-algorithms-based rankings are given in Table 4. All algorithms are given in green, red, or yellow in the new rankings.
After skipping the large flock of algorithms, only some moderately statistically significant.
important changes are observed in the rankings of specific methods. The We do not present Tables with the results obtained for CEC 2020
overall picture of the relationship between greens and reds remains competitions, as the differences between all pairs among 19 algorithms
almost the same: reds perform much better than greens on CEC 2020 are not statistically significant in any of the four CEC 2020 competitions
problems, but greens are overall clearly better in the remaining com (hence the tables would be filled only with 0′s). This may again confirm
petitions. Especially, in the case of real-world CEC 2011 problems, all that the CEC 2020 benchmark set is less reliable than CEC 2011, CEC
greens outperform all reds. This is observed in Fig. 1, where the 51-runs 2014, or CEC 2017, at least due to the fact that CEC 2020 set includes too
averaged performance of all greens (left sub-plots) and reds (right sub- few problems.
plots) on 22 real-world CEC 2011 problems is illustrated. In Fig 1 we As seen from Tables 5-9, also for CEC 2011, CEC 2014, and CEC 2017
have repeated the results from L_SHADE_cnEpSin both in green and red competitions many inter-comparisons are not statistically significant.
sub-plots, as this algorithm has been classified within both subsets of What is important, the differences between the vast majority of reds and
algorithms; the L_SHADE_cnEpSin results in both red and green sub-plots greens are frequently statistically significant. For example, on CEC
are of course identical. As seen in Fig. 1, in almost every considered real- 2014_50 the differences between seven reds (APGSK_IMODE,
world problem the majority of greens perform better than the reds, or at TbL_SHADE, AGSK, SADE, OLSHADE_CS, MaDE and IMODE) and eight
least the performances of both the best greens and the best reds are very greens (L_SHADE_50, L_SHADE_50_PWI, L_SHADE_SPACMA, ELSHA
comparable. Although some failures (much poorer performance than DE_SPACMA, SPS_L_SHADE_EIG, HARD_DE, SC_DE and L_SHADE) are
noted by competing algorithms) do appear among greens on specific statistically significant. The weakest reds may even be statistically
real-world problems, they happen much rarer than the failures observed significantly inferior to all greens: on CEC 2011 real-world problems,
among reds. The visual representation of performances achieved by both OLSHADE_CS is statistically significantly inferior to all green algorithms
groups of algorithms leaves little doubt: greens clearly outperform reds (see Table 5); similar statistically significant differences between
on real-world problems. OLSHADE_CS and greens are observed for CEC 2017_50 set (Table 9).
Table 4
Averaged-ranks-based ranking, in each of the nine competitions, of the 19 algorithms that were among the ten best methods for CEC 2014_50 or CEC 2020_20. The nine best algorithms on the 50-dimensional version of CEC 2014 are shown in green, the nine best algorithms on the 20-dimensional version of CEC 2020 are given in red, and the one algorithm that was among the ten best methods in both CEC 2014_50 and CEC 2020_20 is marked in yellow. Notation: nr – position of the algorithm in the particular ranking; ar – averaged rank obtained by the algorithm in the specific ranking.
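The averaged ranks ("ar") and positions ("nr") reported in Tables 2 and 4 follow the procedure described in Section 3.2. The following is a minimal Python sketch of that computation, given only for illustration; the matrix mean_obj of 51-runs averaged objective values is a hypothetical placeholder, not data from this study.

import numpy as np
from scipy.stats import rankdata  # assigns average ranks in case of ties

# mean_obj[i, j]: 51-runs averaged objective value of algorithm j on problem i
# (hypothetical placeholder data; lower values are better)
rng = np.random.default_rng(0)
mean_obj = rng.random((22, 19))  # e.g. 22 problems, 19 algorithms

per_problem_ranks = np.vstack([rankdata(row) for row in mean_obj])  # rank 1 = best
averaged_ranks = per_problem_ranks.mean(axis=0)   # the "ar" values of one competition
ranking_order = np.argsort(averaged_ranks)        # positions ("nr"), from best to worst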
Fig. 1. 51-runs averaged performances obtained on 22 real-world problems by the ten best algorithms in the CEC 2014_50 competition (numbered 1–10, illustrated in green on the left sub-plots) and the ten best algorithms in the CEC 2020_20 competition (numbered 11–20, illustrated in red on the right sub-plots). For a particular problem, the scale of the sub-plots showing the winners of both competitions is identical; however, the scale varies between problems. One algorithm, L-SHADE-cnEpSin, was among the ten best methods in both competitions, hence its results are repeated in both sub-plots. On the vertical axis the value of the objective function is given (the lower, the better). Numbers represent the following algorithms: winners of CEC 2014_50: 1 – L-SHADE; 2 – SPS-L-SHADE-EIG; 3 – L-SHADE-SPACMA; 4 – L-SHADE-cnEpSin; 5 – L-SHADE-50; 6 – L-SHADE-50-PWI; 7 – HARD-DE; 8 – CS-DE; 9 – ELSHADE-SPACMA; 10 – HIP-DE; winners of CEC 2020_20: 11 – AMALGAM; 12 – SADE; 13 – jDElscop; 14 – L-SHADE-cnEpSin; 15 – TbL-SHADE; 16 – IMODE; 17 – AGSK; 18 – APGSK-IMODE; 19 – MaDE; 20 – OLSHADE-CS.
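Per-problem panels of this kind can be imitated with a few lines of matplotlib. The sketch below is purely illustrative: the arrays green_means and red_means (51-runs averaged objective values of the ten green and ten red algorithms for one problem) are hypothetical placeholders, not the data behind the published figure.

import numpy as np
import matplotlib.pyplot as plt

# Hypothetical placeholder data for a single real-world problem
rng = np.random.default_rng(2)
green_means = rng.random(10)   # algorithms 1-10 (best on CEC 2014_50)
red_means = rng.random(10)     # algorithms 11-20 (best on CEC 2020_20)

fig, (ax_g, ax_r) = plt.subplots(1, 2, sharey=True, figsize=(8, 3))
ax_g.bar(np.arange(1, 11), green_means, color="green")
ax_r.bar(np.arange(11, 21), red_means, color="red")
ax_g.set_ylabel("objective function value (lower is better)")
ax_g.set_title("best on CEC 2014_50")
ax_r.set_title("best on CEC 2020_20")
fig.suptitle("Problem 1")
plt.show()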
The largest number of statistically significant differences in performances between pairs of algorithms is observed for the higher-dimensional benchmarks CEC 2014_50 and CEC 2017_50 that are composed of 30 problems (see Tables 7 and 9). This is in striking contrast to the lack of significant differences observed for the CEC 2020 competitions, and it confirms that a comparison based on a low number of low-dimensional problems, as in CEC 2020, is not an appropriate way to choose the best algorithm. It again shows that the CEC 2011, CEC 2014, and CEC 2017 benchmarks are much more selective than the CEC 2020 set.
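The tables that follow rely on Friedman's test with Shaffer's static post-hoc procedure. That exact post-hoc correction is not available in common Python libraries, so the sketch below is only a rough stand-in: it combines scipy's Friedman omnibus test with Holm-corrected pairwise Wilcoxon signed-rank tests to obtain a 0/1 significance matrix of the kind shown in Tables 5-9. The array results is a hypothetical placeholder, not the data of this study.

import numpy as np
from itertools import combinations
from scipy.stats import friedmanchisquare, wilcoxon

# results[i, j]: 51-runs averaged objective of algorithm j on problem i
rng = np.random.default_rng(1)
results = rng.random((30, 19))                     # hypothetical placeholder data

stat, p_omnibus = friedmanchisquare(*results.T)    # omnibus test over all algorithms

pairs = list(combinations(range(results.shape[1]), 2))
pvals = np.array([wilcoxon(results[:, a], results[:, b]).pvalue for a, b in pairs])

alpha, m = 0.05, len(pvals)
significant = np.zeros(m, dtype=bool)
for k, idx in enumerate(np.argsort(pvals)):        # Holm's step-down procedure
    if pvals[idx] <= alpha / (m - k):
        significant[idx] = True
    else:
        break

sig_matrix = np.zeros((19, 19), dtype=int)         # 0/1 matrix analogous to Tables 5-9
for (a, b), s in zip(pairs, significant):
    sig_matrix[a, b] = sig_matrix[b, a] = int(s)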
Table 5
Pair-wise comparison between algorithms – CEC 2011. 0 – means that the difference between results from the two algorithms is not statistically significant according to Friedman’s test with the post-hoc Shaffer’s static
procedure at α = 0.05; 1 – means that the difference is statistically significant.
Columns, in order: AMALGAM, SADE, jDElscop, L-SHADE, SPS-L-SHADE-EIG, L-SHADE-SPACMA, L-SHADE-cnEpSin, L-SHADE-50, L-SHADE-50-PWI, HARD-DE, TbL-SHADE, IMODE, AGSK, APGSK-IMODE, CS-DE, ELSHADE-SPACMA, HIP-DE, MaDE, OLSHADE-CS.
AMALGAM – 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
SADE 0 – 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1
jDElscop 0 0 – 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
L-SHADE 0 0 0 – 0 0 0 0 0 0 0 0 1 1 0 0 0 0 1
SPS-L-SHADE-EIG 0 0 0 0 – 0 0 0 0 0 0 0 0 0 0 0 0 0 1
L-SHADE-SPACMA 0 0 0 0 0 – 0 0 0 0 0 0 0 0 0 0 0 0 1
L-SHADE-cnEpSin 0 0 0 0 0 0 – 0 0 0 0 1 1 1 0 0 0 1 1
L-SHADE-50 0 0 0 0 0 0 0 – 0 0 0 0 0 0 0 0 0 0 1
L-SHADE-50-PWI 0 0 0 0 0 0 0 0 – 0 0 0 1 1 0 0 0 1 1
HARD-DE 0 0 0 0 0 0 0 0 0 – 0 0 1 1 0 0 0 1 1
TbL-SHADE 0 0 0 0 0 0 0 0 0 0 – 0 0 0 0 0 0 0 1
IMODE 0 0 0 0 0 0 1 0 0 0 0 – 0 0 0 0 0 0 0
AGSK 0 0 0 1 0 0 1 0 1 1 0 0 – 0 1 0 1 0 0
APGSK-IMODE 0 0 0 1 0 0 1 0 1 1 0 0 0 – 0 0 0 0 0
CS-DE 0 0 0 0 0 0 0 0 0 0 0 0 1 0 – 0 0 0 1
ELSHADE-SPACMA 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 – 0 0 1

Table 6
Pair-wise comparison between algorithms – CEC 2014_10; notation and column order as in Table 5.
AMALGAM – 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0
SADE 0 – 0 0 1 0 0 0 0 1 0 0 0 0 1 1 1 0 0
jDElscop 0 0 – 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
L-SHADE 0 0 0 – 0 0 0 0 0 0 1 1 0 0 0 0 0 0 1
SPS-L-SHADE-EIG 0 1 0 0 – 0 0 0 0 0 1 1 1 0 0 0 0 0 1
L-SHADE-SPACMA 0 0 0 0 0 – 0 0 0 0 1 1 0 0 0 0 0 0 1
L-SHADE-cnEpSin 0 0 0 0 0 0 – 0 0 0 0 1 0 0 0 0 0 0 0
L-SHADE-50 0 0 0 0 0 0 0 – 0 0 0 1 0 0 0 0 0 0 0
L-SHADE-50-PWI 0 0 0 0 0 0 0 0 – 0 0 0 0 0 0 0 0 0 0
HARD-DE 0 1 0 0 0 0 0 0 0 – 1 1 1 0 0 0 0 0 1
TbL-SHADE 0 0 0 1 1 1 0 0 0 1 – 0 0 0 1 1 1 0 0
IMODE 0 0 0 1 1 1 1 1 0 1 0 – 0 0 1 1 1 0 0
AGSK 0 0 0 0 1 0 0 0 0 1 0 0 – 0 1 1 1 0 0
APGSK-IMODE 0 0 0 0 0 0 0 0 0 0 0 0 0 – 0 0 0 0 0
CS-DE 1 1 0 0 0 0 0 0 0 0 1 1 1 0 – 0 0 0 1
ELSHADE-SPACMA 0 1 0 0 0 0 0 0 0 0 1 1 1 0 0 – 0 0 1

Table 7
Pair-wise comparison between algorithms – CEC 2014_50; notation and column order as in Table 5.
AMALGAM – 1 0 0 0 0 0 0 0 0 1 1 1 1 0 0 0 1 1
SADE 1 – 0 1 1 1 1 1 1 1 0 0 0 0 1 1 1 0 0
jDElscop 0 0 – 0 1 1 1 1 1 0 0 0 0 0 0 1 0 0 0
L-SHADE 0 1 0 – 0 0 0 0 0 0 1 1 1 1 0 0 0 1 1
SPS-L-SHADE-EIG 0 1 1 0 – 0 0 0 0 0 1 1 1 1 0 0 0 1 1
L-SHADE-SPACMA 0 1 1 0 0 – 0 0 0 0 1 1 1 1 0 0 0 1 1
L-SHADE-cnEpSin 0 1 1 0 0 0 – 0 0 0 1 1 1 1 0 0 0 1 1
L-SHADE-50 0 1 1 0 0 0 0 – 0 0 1 1 1 1 0 0 0 1 1
L-SHADE-50-PWI 0 1 1 0 0 0 0 0 – 0 1 1 1 1 0 0 0 1 1
HARD-DE 0 1 0 0 0 0 0 0 0 – 1 1 1 1 0 0 0 1 1
TbL-SHADE 1 0 0 1 1 1 1 1 1 1 – 0 0 0 1 1 0 0 0
IMODE 1 0 0 1 1 1 1 1 1 1 0 – 0 0 1 1 1 0 0
AGSK 1 0 0 1 1 1 1 1 1 1 0 0 – 0 1 1 1 0 0
APGSK-IMODE 1 0 0 1 1 1 1 1 1 1 0 0 0 – 1 1 0 0 0
CS-DE 0 1 0 0 0 0 0 0 0 0 1 1 1 1 – 0 0 1 1
ELSHADE-SPACMA 0 1 1 0 0 0 0 0 0 0 1 1 1 1 0 – 0 1 1

Table 8
Pair-wise comparison between algorithms – CEC 2017_10; notation and column order as in Table 5.
AMALGAM – 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0
SADE 0 – 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
jDElscop 0 0 – 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
L-SHADE 0 0 0 – 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
SPS-L-SHADE-EIG 1 0 0 0 – 0 0 0 0 0 1 1 0 0 0 0 0 0 1
L-SHADE-SPACMA 0 0 0 0 0 – 0 0 0 0 0 0 0 0 0 0 0 0 0
L-SHADE-cnEpSin 0 0 0 0 0 0 – 0 0 0 0 0 0 0 0 0 0 0 0
L-SHADE-50 0 0 0 0 0 0 0 – 0 0 0 0 0 0 0 0 0 0 0
L-SHADE-50-PWI 0 0 0 0 0 0 0 0 – 0 0 0 0 0 0 0 0 0 0
HARD-DE 0 0 0 0 0 0 0 0 0 – 0 0 0 0 0 0 0 0 0
TbL-SHADE 0 0 0 0 1 0 0 0 0 0 – 0 0 0 0 0 0 0 0
IMODE 0 0 0 0 1 0 0 0 0 0 0 – 0 0 0 0 0 0 0
AGSK 0 0 0 0 0 0 0 0 0 0 0 0 – 0 0 0 0 0 0
APGSK-IMODE 0 0 0 0 0 0 0 0 0 0 0 0 0 – 0 0 0 0 0
CS-DE 0 0 0 0 0 0 0 0 0 0 0 0 0 0 – 0 0 0 0
ELSHADE-SPACMA 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 – 0 0 0

Table 9
Pair-wise comparison between algorithms – CEC 2017_50; notation and column order as in Table 5.
AMALGAM – 1 1 0 0 0 0 0 0 0 1 1 1 1 0 0 0 1 1
SADE 1 – 0 1 1 1 1 1 1 1 0 0 0 0 1 1 1 0 0
jDElscop 1 0 – 1 1 1 1 1 1 1 0 0 0 0 0 1 1 0 1
L-SHADE 0 1 1 – 0 0 0 0 0 0 1 1 1 1 0 0 0 1 1
SPS-L-SHADE-EIG 0 1 1 0 – 0 0 0 0 0 1 1 1 1 0 0 0 1 1
L-SHADE-SPACMA 0 1 1 0 0 – 0 0 0 0 1 1 1 1 0 0 0 1 1
L-SHADE-cnEpSin 0 1 1 0 0 0 – 0 0 0 1 1 1 1 0 0 0 1 1
L-SHADE-50 0 1 1 0 0 0 0 – 0 0 1 1 1 1 0 0 0 1 1
L-SHADE-50-PWI 0 1 1 0 0 0 0 0 – 0 1 1 1 1 0 0 0 1 1
HARD-DE 0 1 1 0 0 0 0 0 0 – 1 1 1 1 0 0 0 1 1
TbL-SHADE 1 0 0 1 1 1 1 1 1 1 – 0 0 0 1 1 1 0 0
IMODE 1 0 0 1 1 1 1 1 1 1 0 – 0 0 1 1 1 0 0
AGSK 1 0 0 1 1 1 1 1 1 1 0 0 – 0 1 1 1 0 0
APGSK-IMODE 1 0 0 1 1 1 1 1 1 1 0 0 0 – 1 1 1 0 0
CS-DE 0 1 0 0 0 0 0 0 0 0 1 1 1 1 – 0 0 1 1
ELSHADE-SPACMA 0 1 1 0 0 0 0 0 0 0 1 1 1 1 0 – 0 1 1
Table 10
Time, in seconds, needed by the three selected algorithms to solve each problem from each of the nine competitions once.
3.3. Computational burden of different benchmark sets

So far, we have focused on the comparison between performances obtained by various algorithms on different benchmark problems. However, the computational burden associated with each benchmark may also be important for practitioners, hence in this sub-section we briefly compare the computational cost required by each benchmark set. This comparison is based on three chosen algorithms (IILPSO, LBBO, OLSHADE_CS) that are run just once on each problem from each competition. Each run is performed separately on the same machine, without parallel computing. We measure the time, in seconds, required by these three algorithms to solve particular problems from each benchmark set. The only reason for this comparison is to give some general information on the computational time needed to use the various benchmarks.

As the codes of the three chosen algorithms were implemented in different years, by different authors, using diversified machines and with different programming skills, the computational time needed by each optimizer is different. These results should not be used to compare the computational speed of the various algorithms.

The number of seconds needed by the algorithms to solve each problem in each competition once is given in Table 10. Table 10 shows that the real-world CEC 2011 problems are the most computationally demanding. However, among the 22 real-world problems, just a single one (nr 3), despite being 1-dimensional, consumes about 75% of the computational time. Hence, skipping this problem may highly limit the required time.

Among the benchmarks composed of mathematical functions, the 10-dimensional versions of CEC 2014 and CEC 2017 are quicker than the 10-dimensional version of CEC 2020. Moreover, even the 50-dimensional versions of CEC 2014 and CEC 2017 are relatively fast, compared to the 15- or 20-dimensional CEC 2020 benchmark. This is easily explained by the exponential increase in the allowed number of function calls in the CEC 2020 benchmarks, contrary to the linear relation between the number of function calls and the dimensionality that was used in the CEC 2014 and CEC 2017 benchmarks. Considering that CEC 2014 and CEC 2017 include 30 problems, while CEC 2020 has just 10, clearly the CEC 2020 benchmark is much more time-consuming. Although the specific time needed to use various algorithms on a particular benchmark may be much different, the relative differences in computational burden between different benchmarks are almost the same for all three algorithms.

To summarize, the CEC 2020 benchmark not only leads to a much different ranking of algorithms than the older benchmarks, but also requires much more time to be executed, despite having a few times fewer problems. The benchmark composed of CEC 2011 real-world problems is also time-consuming, but may be made 3–4 times quicker if a single 1-dimensional problem is removed.

4. Conclusions

In the present paper we show that the choice of the set of benchmarks used for comparison may highly affect the ranking of optimizers. The research is based on 73 algorithms proposed between 1960 and 2022 that are tested in nine different competitions. The control parameters of the algorithms were not tuned, but set to the values suggested in the literature, which may affect the outcome of this study. Each competition is composed of 10 to 30 problems. The nine competitions are based on the CEC 2011 real-world problems, the 10- and 50-dimensional CEC 2014 and CEC 2017 benchmark sets, and the 5-, 10-, 15-, and 20-dimensional benchmarks from CEC 2020. The CEC 2020 benchmark differs from the other collections, as it includes much fewer problems (10 instead of 22 or 30), allows many more function calls to be performed, and includes solely lower-dimensional problems.

To construct rankings of algorithms, first the 51-runs averaged performance of each algorithm on every problem in each competition is obtained, and algorithms are ranked for the particular problem from the best one (rank 1) to the worst one. Then, ranks are averaged over all problems in the particular competition. We perform two different rankings: one is based on all 73 algorithms, the other on the 19 selected methods that performed best in two selected competitions – the 50-dimensional CEC 2014 and the 20-dimensional CEC 2020; these 19 algorithms also include the winners of each of the nine competitions according to the rankings based on 73 algorithms.

We have found that very different algorithms perform best on CEC 2020 problems than on CEC 2011 real-world problems, or on the CEC 2014 and CEC 2017 benchmarks. According to the rankings based on 73 algorithms, only one algorithm (L_SHADE_cnEpSin [87]) is within the ten best methods for both the 50-dimensional CEC 2014 benchmarks and the 20-dimensional CEC 2020 tests. The ten best algorithms in the 10-dimensional CEC 2020 competition are also different from the ten best algorithms in the 10-dimensional CEC 2014 and CEC 2017 competitions. Among the ten best algorithms on the 20-dimensional CEC 2020 version (hence the one with the highest number of allowed function calls), six were proposed most recently, in the years 2020–2022. Among the four best algorithms on the 10-, 15-, and 20-dimensional CEC 2020 benchmarks, all were proposed between 2020 and 2022. On the contrary, the vast majority of the best performing algorithms on the other benchmark sets were proposed in the 2015–2019 period (even though some are newer, proposed in 2020–2022). This shows that the best performing algorithms on CEC 2020 are mainly methods that were developed to solve this kind of problems when one has plenty of time. We may advise practitioners to use the best methods proposed before 2020 if their computational time is limited and the problem dimensionality is high, or to choose the most recent algorithms that were initially tested on CEC 2020 – if they have plenty of time and the problem is relatively low-dimensional.

Of noticeable importance is, however, that algorithms that ranked high on the CEC 2011 real-world problems are also highly ranked on the CEC 2014 and CEC 2017 benchmarks, but not on the CEC 2020 set. Moreover, the algorithms that performed best on the CEC 2011, CEC 2014, or CEC 2017 benchmarks show much better performance on the CEC 2020 set than the algorithms that performed best on the CEC 2020 benchmark show on the former sets of problems. Hence, the CEC 2020 set is incompatible with the earlier benchmark sets, and choosing an algorithm for real-world applications based on the CEC 2020 benchmarks may be inappropriate unless a large number of function calls is available.

One algorithm, OLSHADE_CS [117], shows especially diversified performances on different problems. It is the best method on the 15- and 20-dimensional CEC 2020 problems, but it is also the second poorest method on the CEC 2011 real-world problems and is within the worst ten algorithms on the 50-dimensional CEC 2017 benchmark.

We have also shown that the number of algorithms tested may have a limited impact on the ranking of the best-performing algorithms. For example, in the case of the real-world problems, the algorithm that was ranked the best when 73 algorithms were compared may become only the 3rd best if the ranking is based on only nineteen selected best algorithms. This is because occasional failures are more critical when the number of competitors is higher than when fewer algorithms are compared. Nonetheless, in most comparisons, the ranking of algorithms obtained for a large and a small number of competitors is similar.

The fact that we have tested algorithms that were not tuned for the specific benchmarks would probably affect the results. Although tuning so many algorithms on multiple problems is difficult and time consuming, we recommend performing a similar large-scale study with algorithms whose control parameters were tuned for each problem separately. The information on how different the conclusions could then be may be of high importance for future research on metaheuristics.

Author contributions

AEP and APP designed the research; APP performed research; APP and JNN analyzed data and results; APP, JNN and AEP wrote the paper.

Declaration of Competing Interest

Authors declare that they have no competing interests.

Data availability

The authors do not have permission to share data.

Acknowledgments

This work was supported within statutory activities No 3841/E-41/S/2022 of the Ministry of Science and Higher Education of Poland.

Supplementary materials

Supplementary material associated with this article can be found, in the online version, at doi:10.1016/j.swevo.2023.101378.

References
[20] M. Hellwig, H.G. Beyer, Benchmarking evolutionary algorithms for single
objective real-valued constrained optimization – a critical review, Swarm. Evol.
Comput. 44 (2019) 927–944.
[21] D.H. Wolpert, W.G. Macready, No free lunch theorems for optimization, IEEE
Trans. Evol. Comput. 1 (1) (1997) 67–82.
[22] M. Köppen, D.H. Wolpert, W.G. Macready, Remarks on a recent paper on the “No
free lunch” theorems, IEEE Trans. Evol. Comput. 5 (3) (2001) 295–296.
[23] Y.C. Ho, D.L. Pepyne, Simple explanation of the No-Free-Lunch Theorem and its
[23] Y.C. Ho, D.L. Pepyne, Simple explanation of the No-Free-Lunch Theorem and its
[24] T. Joyce, J.M. Herrmann, A review of no free lunch theorems, and their
implications for metaheuristic optimisation, in: X.-S. Yang (ed.), Nature-Inspired
Algorithms and Applied Optimization, Studies in Computational Intelligence 744,
https://doi.org/10.1007/978-3-319-67669-2_2.
[25] S. Das, P.N. Suganthan, Problem Definitions and Evaluation Criteria For CEC
2011 Competition on Testing Evolutionary Algorithms on Real World
Optimization Problems. Technical Report, Jadavpur Univ., Nanyang Technol.
Univ., Kolkata, India, 2010.
[26] K. Hussain, M.N.M. Salleh, S. Cheng, Y. Shi, Metaheuristic research: a
comprehensive survey, Artif. Intell. Rev. 52 (2019) 2191–2233.
[27] A. Kumar, G. Wu, M.Z. Ali, R. Mallipeddi, P.N. Suganthan, S. Das, A test-suite of
non-convex constrained optimization problems from the real-world and some
baseline results, Swarm. Evol. Comput. 56 (2020), 100693.
[28] A. Kumar, K.V. Price, A.W. Mohamed, A.A. Hadi, P.N. Suganthan, Problem
Definitions and Evaluation Criteria for the CEC 2022 Special Session and
Competition on Single Objective Bound Constrained Numerical Optimization,
Technical Report, December 2021.
[29] K. Varelas, O.A. El Hara, D. Brockhoff, N. Hansen, D.M. Nguyen, T. Tušar,
A. Auger, Benchmarking large-scale continuous optimizers: the bbob-largescale
testbed, a COCO software guide and beyond, Applied Soft Computing Journal 97
(2020), 106737.
[30] A.E. Ezugwu, O.J. Adeleke, A.A. Akinyelu, S. Viriri, A conceptual comparison of
[1] T. Bartz-Beielstein, C. Doerr, D. van den Berg, J. Bossek, S. Chandrasekaran, T.
several metaheuristic algorithms on continuous optimisation problems, Neural
Eftimov, A. Fischbach, P. Kerschke, W. La Cava, M. Lopez-Ibanez, K.M. Malan, J.
Computing and Applications 32 (2020) 6207–6251.
H. Moore, B. Naujoks, P. Orzechowski, V. Volz, M. Wagner, T. Weise,
[31] Y. Lou, S.Y. Yuen, On constructing alternative benchmark suite for evolutionary
Benchmarking in optimization: best practice and open issues, 2020. arXiv:
algorithms, Swarm. Evol. Comput. 44 (2019) 287–292.
2007.03488v2.
[32] O. Mersmann, M. Preuss, H. Trautmann, B. Bischl, C. Weihs, Analyzing the BBOB
[2] A. LaTorre, D. Molina, E. Osaba, J. Poyatos, J. Del Ser, F. Herrera, A prescription
results by means of benchmarking concepts, Evol. Comput. 23 (1) (2015)
of methodological guidelines for comparing bio-inspired optimization algorithms,
161–185.
Swarm. Evol. Comput. 67 (2021), 100973.
[33] J.J. Liang, B.Y. Qu, P.N. Suganthan, A.G. Hernández-Díaz, Problem Definitions
[3] E. Osaba, E. Villar-Rodriguez, J. Del Ser, A.J. Nebro, D. Molina, A. LaTorre, P.
and Evaluation Criteria For the CEC 2013 Special Session On Real-Parameter
N. Suganthan, C.A. Coello Coello, F. Herrera, A Tutorial On the design,
Optimization. Technical Report 201212, Computational Intelligence Laboratory,
experimentation and application of metaheuristic algorithms to real-world
Zhengzhou University, Zhengzhou China and Technical Report, Nanyang
optimization problems, Swarm. Evol. Comput. 64 (2021), 100888.
Technological University, Singapore, 2013.
[4] J. Swan, S. Adriaensen, A.E.I. Brownlee, K. Hammond, C.G. Johnson, A. Kheiri,
[34] N.H. Awad, M.Z. Ali, J.J. Liang, B.Q. Qu, P.N. Suganthan, Problem Definitions
F. Krawiec, J.J. Merelo, L.L. Minku, E. Ozcan, G.L. Pappa, P. Garcia-Sanchez,
and Evaluation Criteria for the CEC 2017 Special Session and Competition on
K. Sorensen, S. Voss, M. Wagner, D.R. White, Metaheuristics “In the Large, Eur. J.
Single Objective Real-Parameter Numerical Optimization, Nanyang
Oper. Res. 29 (2022) 393–406.
Technological University, Singapore, 2016. Technical Report.
[5] A.H. Halim, I. Ismail, S. Das, Performance assessment of the metaheuristic
[35] C.T. Yue, K.V. Price, P.N. Suganthan, J.J. Liang, M.Z. Ali, B.Y. Qu, N.H. Awad, P.
optimization algorithms: an exhaustive review, Artif. Intell. Rev. 54 (2021)
P. Biswas, Problem Definitions and Evaluation Criteria For the CEC 2020 Special
2323–2409.
Session and Competition on Single Objective Bound Constrained Numerical
[6] M. Fox, S. Yang, F. Caraffini, A new moving peaks benchmark with attractors for
Optimization, Technical Report, Zhengzhou University, China and Nanyang
dynamic evolutionary algorithms, Swarm. Evol. Comput. 74 (2022), 101125.
Technological University, Singapore, 2019.
[7] T. Liao, D. Molina, T. Stützle, Performance evaluation of automatically tuned
[36] P. Bujok, J. Tvrdik, R. Polakova, Comparison of nature-inspired population-based
continuous optimizers on different benchmark sets, Appl. Soft Comput. 27 (2015)
algorithms on continuous optimization problems, Swarm. Evol. Comput. 50
490–503.
(2019), 100490.
[8] A.P. Piotrowski, Review of Differential Evolution population size, Swarm. Evol.
[37] J. Carrasco, S. García, M.M. Rueda, S. Das, F. Herrera, Recent trends in the use of
Comput. 32 (2017) 1–24.
statistical tests for comparing swarm and evolutionary computing algorithms:
[9] T.N. Huynh, D.T.T. Do, J. Lee, Q-Learning-based parameter control in differential
practical guidelines and a critical review, Swarm. Evol. Comput. 54 (2020),
evolution for structural optimization, Appl. Soft Comput. 107 (2021), 107464.
100665.
[10] A.P. Piotrowski, M.J. Napiorkowski, J.J. Napiorkowski, P.M. Rowinski, Swarm
[38] K.V. Price, N.H. Awad, M.Z. Ali, P.N. Suganthan, The 2019 100-digit Challenge
Intelligence and Evolutionary Algorithms: performance versus speed, Inf Sci (Ny)
On real-parameter, Single-Objective op-timization: Analysis of Results, Nanyang
384 (2017) 34–85.
Technological University, Singapore, 2019. Tech. Rep. http://www.ntu.edu.sg/h
[11] A. Kazikova, M. Pluhacek, R. Senkerik, How does the number of objective
ome/epnsugan.
function evaluations impact our understanding of metaheuristics behavior? IEEE
[39] P.N. Suganthan, N. Hansen, J.J. Liang, K. Deb, Y.P. Chen, A. Auger, S. Tiwari,
Access 9 (2021) 44032–44048.
Problem Definitions and Evaluation Criteria For the CEC 2005 Special Session On
[12] K.V. Price, A. Kumar, P.N. Suganthan, Trial-based dominance for comparing both
Real-Parameter Optimization, Nanyang Technol. Univ., Singapore, 2005. Tech.
the speed and accuracy of stochastic optimizers with standard non-parametric
Rep. KanGAL #2005005, IIT Kanpur, India.
tests, Swarm. Evol. Comput. 78 (2023), 101287.
[40] J.J. Liang, B.Y. Qu, P.N. Suganthan, Problem definitions and
[13] A. Nabaei, M. Hamian, M.R. Parsaei, R. Safdari, T. Samad-Soltani, H. Zarrabi,
evaluation criteria for the CEC 2014 special session and competition on single
A. Ghassemi, Topologies and performance of intelligent algorithms: a
objective real-parameter numerical optimization. Technical Report 201311,
comprehensive review, Artif. Intell. Rev. 49 (2018) 79–103.
Computational Intelligence Laboratory, Zhengzhou University, Zhengzhou China
[14] J. Peng, Y. Li, H. Kang, Y. Shen, X. Sun, Q. Chen, Impact of population topology
and Technical Report, Nanyang Technological University, Singapore, 2013.
on particle swarm optimization and its variants: an information propagation
[41] M. Crepinsek, S.H. Liu, M. Mernik, Exploration and exploitation in evolutionary
perspective, Swarm. Evol. Comput. 69 (2022), 100990.
algorithms: a survey, ACM Comput. Surv. 45 (3) (2013) 35.
[15] N. Vecek, M. Crepinsek, M. Mernik, On the influence of the number of algorithms,
[42] E. Osaba, E. Villar-Rodriguez, J. Del Ser, A.J. Nebro, D. Molina, A. LaTorre, P.
problems, and independent runs in the comparison of evolutionary algorithms,
N. Suganthan, C.A. Coello Coello, F. Herrera, A Tutorial On the design,
Appl. Soft Comput. 54 (2017) 23–45.
experimentation and application of metaheuristic algorithms to real-World
[16] M. Ravber, S.H. Liu, M. Mernik, M. Crepinsek, Maximum number of generations
optimization problems, Swarm. Evol. Comput. 64 (2021), 100888.
as a stopping criterion considered harmful, Appl. Soft Comput. 128 (2022),
[43] A.S. Eesa, M.M. Hassan, W.K. Arabo, Letter: application of optimization
109478.
algorithms to engineering design problems and discrepancies in mathematical
[17] A.E. Eiben, S.K. Smit, Parameter tuning for configuring and analyzing
formulas, Appl. Soft Comput. 140 (2023), 110252.
evolutionary algorithms, Swarm. Evol. Comput. 1 (2011) 19–31.
[44] Z.H. Zhan, L. Shi, K.C. Tan, J. Zhang, A survey on evolutionary computation for
[18] C. Huang, Y. Li, X. Yao, A survey of automatic parameter tuning methods for
complex continuous optimization, Artif. Intell. Rev. 55 (2022) 59–110.
metaheuristics, IEEE Trans. Evol. Comput. 24 (2) (2020) 201–216.
[45] H.H. Rosenbrock, An automated method for finding the greatest or least value of
[19] J. del Ser, E. Osaba, D. Molina, X.S. Yang, S. Salcedo-Sanz, D. Camacho, S. Das, P.
a function, Comput. J. 3 (3) (1960) 175–184.
N. Suganthan, C.A. Coello Coello, F. Herrera, Bio-inspired computation: where we
stand and what’s next, Swarm. Evol. Comput. 48 (2019) 220–250.
[46] J.A. Nelder, R.A. Mead, simplex-method for function minimization, Comput. J. 7 [78] M. Yang, C.H. Li, Z.H. Cai, J. Guan, Differential Evolution with auto-enhanced
(4) (1965) 308–313. population diversity, IEEE Trans. Cybern. 45 (2) (2015) 302–315.
[47] R. Storn, K.V. Price, Differential Evolution – A Simple and Efficient Adaptive [79] Y.L. Li, Z.H. Zhan, Y.J. Gong, W.N. Chen, J. Zhang, Y. Li, Differential evolution
Scheme for Global Optimization Over Continuous Spaces. Tech. Report TR-95- with an evolution path: a DEEP evolutionary algorithm, IEEE Trans. Cybern. 45
012, International Computer Sciences Institute, Berkeley, California, USA, 1995. (9) (2015) 1798–1810.
[48] R. Storn, K.V. Price, Differential evolution – a simple and efficient heuristic for [80] L.Z. Cui, G.H. Li, Q.Z. Lin, J.Y. Chen, N. Lu, Adaptive differential evolution
global optimization over continuous spaces, J. Global Optimiz. 11 (1997) algorithm with novel mutation strategies in multiple sub-populations, Comput.
341–359. Oper. Res. 67 (2016) 155–173.
[49] J. Kennedy, R.C. Eberhart, Particle swarm optimization. Proceedings of the IEEE [81] G.H. Wu, R. Mallipeddi, P.N. Suganthan, R. Wang, H.K. Chen, Differential
International Conference On Neural Networks, Perth, Australia. IEEE, Piscataway, Evolution with multi-population based ensemble of mutation strategies, Inf. Sci.
NJ, USA, 1995 pp. IV 1942-1948. (Ny) 329 (2016) 329–345.
[50] Y. Shi, R.C. Eberhart, A modified particle swarm optimizer, in: Proceeding in IEEE [82] Q. Qin, S. Cheng, Q.Y. Zhang, L. Li, Y.H. Shi, Particle Swarm Optimization with
Congress on Evolutionary Computation (CEC), 1998, pp. 69–73. interswarm interactive learning strategy, IEEE Trans. Cybern. 46 (10) (2016)
[51] J.J. Liang, A.K. Qin, P.N. Suganthan, S. Baskar, Comprehensive learning particle 2238–2251.
swarm optimizer for global optimization of multi-modal functions, IEEE Trans. [83] G.H. Wu, Across neighborhood search for numerical optimization, Inf. Sci. (Ny)
Evol. Comput. 10 (3) (2006) 281–295. 329 (2016) 597–618.
[52] J.A. Vrugt, B.A. Robinson, J.M. Hyman, Self-adaptive multimethod search for [84] Y.J. Gong, J.J. Li, Y. Zhou, Y. Li, H.S.H. Chung, Y.H. Shi, J. Zhang, Genetic
global optimization in real-parameter spaces, IEEE Trans. Evol. Comput. 13 (2) learning particle swarm optimization, IEEE Trans. Cybern. 46 (10) (2016)
(2009) 243–259. 2277–2290.
[53] J. Zhang, Z.C. Sanderson, JADE: adaptive differential evolution with optional [85] G. Li, Q. Lin, L. Cui, Z. Du, Z. Liang, J. Chen, N. Lu, Z. Ming, A novel hybrid
external archive, IEEE Trans. Evol. Comput. 13 (5) (2009) 945–958. differential evolution algorithm with modified CoDE and JADE, Appl. Soft
[54] S. Das, A. Abraham, U.K. Chakraboty, A. Konar, Differential Evolution using a Comput. 47 (2016) 577–599.
neighborhood-based mutation operator, IEEE Trans. Evol. Comput. 13 (3) (2009) [86] A.W. Mohamed, A.A. Hadi, A.M. Fattouh, K.M. Jambi, LSHADE with
526–553. semiparameter adaptation hybrid with CMA-ES for solving CEC 2017 benchmark
[55] A.K. Qin, V.L. Huang, P.N. Suganthan, Differential Evolution algorithm with problems, in: 2017 IEEE Congress on Evolutionary Computation (CEC), San
strategy adaptation for global numerical optimization, IEEE Trans. Evol. Comput. Sebastian, Spain, 2017, https://doi.org/10.1109/CEC.2017.7969307.
13 (2) (2009) 398–417. [87] N.H. Awad, M.Z. Ali, P.N. Suganthan, Ensemble sinusoidal differential covariance
[56] A. Caponio, F. Neri, V. Tirronen, Super-fit control adaptation in memetic matrix adaptation with Euclidean neighborhood for solving CEC2017 benchmark
differential evolution frameworks, Soft Comput. 13 (8–9) (2009) 811–831. problems, in: 2017 IEEE Congress on Evolutionary Computation (CEC), 2017,
[57] S.M. Elsayed, R.A. Sarker, D.L. Essam, GA with a new multi-parent crossover for https://doi.org/10.1109/CEC.2017.7969336.
solving IEEE-CEC2011 competition problems. IEEE Congress of Evolutionary [88] N. Lynn, P.N. Suganthan, Ensemble particle swarm optimizer, Appl. Soft Comput.
Computation (CEC), New Orleans (LA), USA, 2011, pp. 1034–1040. 55 (2017) 533–548.
[58] W.Y. Gong, A. Fialho, Z.H. Cai, H. Li, Adaptive strategy selection in Differential [89] W. Du, S.Y.S. Leung, Y. Tang, A.V. Vasilakos, Differential Evolution with event
Evolution for numerical optimization: an empirical study, Inf Sci (Ny) 181 (24) triggered impulsive control, IEEE Trans. Cybern. 47 (1) (2017) 244–257.
(2011) 5364–5386. [90] G. Khademi, H. Mohammadi, D. Simon, Hybrid invasive weed/biogeography-
[59] Z.H. Cai, W.Y. Gong, C.X. Ling, H. Zhang, A clustering-based Differential based optimization, Eng. Appl. Artif. Intell. 64 (2017) 213–231.
Evolution for global optimization, Appl. Soft Comput. 11 (1) (2011) 1363–1379. [91] A.P. Piotrowski, J.J. Napiorkowski, Step-by-step improvement of JADE and
[60] R. Mallipeddi, P.N. Suganthan, Q.K. Pan, M.F. Tasgetiren, Differential Evolution SHADE-based algorithms: success or failure? Swarm. Evol. Comput. 43 (2018)
algorithm with ensemble of parameters and mutation strategies, Appl. Soft 88–108.
Comput. 11 (2) (2011) 1679–1696. [92] Z.V.V. de Melo, W. Banzhaf, Drone Squadron Optimization: a novel self-adaptive
[61] J. Brest, M.S. Maucec, Self-adaptive differential evolution algorithm using algorithm for global numerical optimization, Neural. Comput. Appl. 30 (2018)
population size reduction and three strategies, Soft Comput. 15 (11) (2011) 3117–3144.
2157–2174. [93] A.W. Mohamed, P.N. Suganthan, Real-parameter unconstrained optimization
[62] Q.K. Pan, P.N. Suganthan, L. Wang, L. Gao, R. Mallipeddi, A Differential based on enhanced fitness-adaptive differential evolution algorithm with novel
Evolution algorithm with self-adapting strategy and control parameters, Comput. mutation, Soft comput 22 (2018) 3215–3235.
Oper. Res. 38 (2011) 394–408. [94] A.P. Piotrowski, J.J. Napiorkowski, Some metaheuristics should be simplified, Inf.
[63] S.M. Islam, S. Das, S. Ghosh, S. Roy, P.N. Suganthan, An adaptive differential Sci. (Ny) 427 (2018) 32–62.
evolution algorithm with novel mutation and cross-over strategies for global [95] A.P. Piotrowski, L-SHADE optimization algorithms with population-wide inertia,
numerical optimization, IEEE Trans. Syst., Man Cybernetics. Part B – Cybernetics Inf. Sci. (Ny) 468 (2018) 117–141.
42 (2) (2012) 482–500. [96] G. Zhang, Y. Shi, Hybrid Sampling Evolution Strategy for Solving Single Objective
[64] A.P. Piotrowski, J.J. Napiorkowski, A. Kiczko, Differential Evolution algorithm Bound Constrained Problems. In: 2018 IEEE Congress on Evolutionary
with separated groups for multi-dimensional optimization problems, Eur. J. Oper. Computation (CEC), Rio de Janeiro, Brazil, 2018, https://doi.org/10.1109/
Res. 216 (2012) 33–46. CEC.2018.8477908.
[65] X. Wang, S.G. Zhao, Differential Evolution algorithm with self-adaptive [97] G. Wu, X. Shen, H. Li, H. Chen, A. Lin, P.N. Suganthan, Ensemble of differential
population resizing mechanism, Math. Problems Eng. 2013 (2013), 419372. evolution variants, Inf. Sci. (Ny) 423 (2018) 172–186.
[66] F. Caraffini, F. Neri, G. Iacca, A. Mol, Parallel memetic structures, Inf Sci (Ny) 227 [98] J.Q. Zhang, X.X. Zhu, Y.H. Wang, M.C. Zhou, Dual-environmental particle swarm
(2013) 60–82. optimizer in noisy and noise-free environments, IEEE Trans. Cybern. 49 (6)
[67] A.P. Piotrowski, Adaptive Memetic Differential Evolution with global and local (2019) 2011–2021.
neighborhood-based mutation operators, Inf. Sci. (Ny) 241 (2013) 164–194. [99] Z. Meng, S.J. Pan, HARD-DE: hierarchical Archive Based Mutation Strategy With
[68] W.N. Chen, J. Zhang, Y. Lin, N. Chen, Z.H. Zhan, H.S.H. Chung, Y. Li, Y.H. Shi, Depth Information of Evolution for the Enhancement of Differential Evolution on
Particle Swarm Optimization with an aging leader and challengers, IEEE Trans. Numerical Optimization, IEEE Access 7 (2019) 12832–12854.
Evol. Comput. 17 (2) (2013) 241–258. [100] R. Polakova, J. Tvrdik, P. Bujok, Differential evolution with adaptive mechanism
[69] W. Zhu, Y. Tang, J.A. Fang, W.B. Zhang, Adaptive population tuning scheme for of population size according to current population diversity, Swarm. Evol.
differential evolution, Inf. Sci. (Ny) 223 (2013) 164–191. Comput. 50 (2019), 100519.
[70] R. Tanabe, A. Fukunaga, Improving the search performance of SHADE using [101] J. Brest, S. Greiner, B. Boškovic, M. Mernik, V. Žumer, Self-adapting control
linear population size reduction, in: Proc. IEEE Congress on Evolutionary parameters in differential evolution: a comparative study on numerical
Computation, Bejing, China, 2014, pp. 1658–1665. benchmark problems, IEEE Trans. Evol. Comput. 10 (2006) 646–657.
[71] Y. Wang, H.X. Li, T.W. Huang, L. Li, Differential Evolution based on covariance [102] Q.B. Diep, I. Zelinka, S. Das, R. Senkerik, SOMA T3A for solving the 100-digit
matrix learning and bimodal distribution parameter setting, Appl. Soft Comput. challenge, in: Proceedings of the 2019 Swarm, Evolutionary and Memetic
18 (2014) 232–247. Computing Conference, Maribor, Slovenia, 2019.
[72] D. Simon, M.G.H. Omran, M. Clerc, Linearized biogeography-based optimization [103] L. Skanderova, Self-organizing migrating algorithm: review, improvements and
with re-initialization and local search, Inf. Sci. (Ny) 267 (2014) 140–157. comparison, Artif. Intell. Rev. (2022), https://doi.org/10.1007/s10462-022-
[73] W.Y. Gong, Z.H. Cai, Y. Wang, Repairing the crossover rate in adaptive differentia 10167-8.
evolution, Appl. Soft Comput. 15 (2014) 149–168. [104] X. Xia, L. Gui, F. Yu, H. Wu, B. Wei, Y.L. Zhang, Z.H. Zhan, Triple Archives
[74] S.M. Guo, J.S.H. Tsai, C.C. Yang, P.H. Hsu, A self-optimization approach for Particle Swarm Optimization, IEEE Trans. Cybern. 50 (12) (2020) 4862–4875.
LSHADE incorporated with eigenvector-based crossover and successful-parent [105] X. Sun, L. Jiang, Y. Shen, H. Kang, Q. Chen, Success history-based adaptive
selecting framework on CEC 2015 benchmark set, in: Proc. IEEE Congress on differential evolution using turning-based mutation, Mathematics 8 (2020) 1565.
Evolutionary Computation, Sendai, Japan, 2015, pp. 1003–1010. [106] K.M. Sallam, S.M. Elsayed, R.K. Chakrabortty, M.J. Ryan, Improved multi-
[75] S.M. Guo, C.C. Yang, Enhancing Differential Evolution utilizing eigenvector- operator differential evolution algorithm for solving unconstrained problems.
based crossover operator, IEEE Trans. Evol. Comput. 19 (1) (2015) 31–49. 2020 IEEE Congress on Evolutionary Computation (CEC), Glasgow, UK, 2020,
[76] N. Lynn, P.N. Suganthan, Heterogeneous comprehensive learning Particle Swarm https://doi.org/10.1109/CEC48606.2020.9185577.
Optimization with enhanced exploration and exploitation, Swarm. Evol. Comput. [107] A.W. Mohamed, A.A. Hadi, A.K. Mohamed, N.H. Awad, Evaluating the
24 (2015) 11–24. performance of adaptive gaining- sharing knowledge based algorithm on CEC
[77] L. Tang, Y. Dong, J. Liu, Differential evolution with an individual-dependent 2020 benchmark problems. 2020 IEEE Congress on Evolutionary Computation
mechanism, IEEE Trans. Evol. Comput. 19 (4) (2015) 560–574. (CEC), Glasgow, UK, 2020, https://doi.org/10.1109/CEC48606.2020.9185901.
[108] J.S. Pan, N. Liu, S.C. Chu, A hybrid differential evolution algorithm and its [120] A. Ghosh, S. Das, A.K. Das, R. Senkerik, A. Viktorin, I. Zelinka, A.D. Masegosa,
application in unmanned combat aerial vehicle path planning, IEEE Access 8 Using spatial neighborhoods for parameter adaptation: an improved success
(2020) 17691–17712. history based differential evolution, Swarm. Evol. Comput. 71 (2022), 101057.
[109] Z. Meng, C. Yang, X. Li, Y. Chen, Di-DE: depth Information-based differential [121] G. Ochoa, K.M. Malan, C. Blum, Search trajectory networks: a tool for analysing
evolution with adaptive parameter control for numerical optimization, IEEE and visualising the behaviour of metaheuristics, Appl. Soft Comput. 109 (2021),
Access 8 (2020) 40809–40827. 107492.
[110] Z.H. Zhan, Z.J. Wang, H. Jin, J. Zhang, Adaptive Distributed Differential [122] A.E. Ezugwu, A.K. Shukla, R. Nath, A.A. Akinyelu, J.O. Agushaka, H. Chiroma, P.
Evolution, IEEE Trans. Cybern. 50 (11) (2020) 4633–4647. K. Muhuri, Metaheuristics: a comprehensive overview and classification along
[111] A.W. Mohamed, A.A. Hadi, P. Agrawal, K.M. Sallam, A.K. Mohamed, Gaining- with bibliometric analysis, Artif. Intell. Rev. 54 (2021) 4237–4316.
sharing knowledge based algorithm with adaptive parameters hybrid with imode [123] K. Sorensen, Metaheuristics—the metaphor exposed, Int. Trans. Operat. Res. 22
algorithm for solving CEC 2021 Benchmark Problems. 2021 IEEE Congress on (2015) 3–18.
Evolutionary Computation (CEC), Kraków, Poland, 2021, https://doi.org/ [124] D. Molina, J. Poyatos, J. del Ser, S. Garcia, A. Hussain, F. Herrera, Comprehensive
10.1109/CEC45853.2021.9504814. taxonomies of nature- and bio-inspired optimization: inspiration versus
[112] Z. Meng, Y. Zhong, C. Yang, CS-DE: cooperative strategy based differential algorithmic behavior, critical analysis recommendations, Cognit. Comput. 12
evolution with population diversity enhancement, Inf. Sci. (Ny) 577 (2021) (2020) 897–939.
663–696. [125] A. Tzanetos, G. Dounias, Nature inspired optimization algorithms or simply
[113] A.A. Hadi, A.W. Mohamed, K.M. Jambi, Single-objective real-parameter variations of metaheuristics? Artif. Intell. Rev. 54 (2021) 1841–1862.
optimization: enhanced LSHADE-SPACMA algorithm, Heurist. Optimiz. Learning, [126] J. Demsar, Statistical comparisons of classifiers over multiple data sets, J. Mach.
Stud. Comput. Intell. 906 (2021) 103–121. Learn. Res. 7 (2006) 1–30.
[114] S. Aras, E. Gedikli, H.T. Kahraman, A novel stochastic fractal search algorithm [127] J. Derrac, S. Garcia, D. Molina, F. Herrera, A practical tutorial on the use of
with fitness-distance balance for global numerical optimization, Swarm. Evol. nonparametric statistical tests as a methodology for comparing evolutionary and
Comput. 61 (2021), 100821. swarm intelligence algorithms, Swarm. Evol. Comput. 1 (2011) 3–18.
[115] Z. Meng, C. Yang, Hip-DE: historical population based mutation strategy in [128] S. Garcia, F. Herrera, An extension on “Statistical comparisons of classifiers over
differential evolution with parameter adaptive mechanism, Inf. Sci. (Ny) 562 multiple data sets” for all pairwise comparisons, J. Mach. Learn. Res. 9 (2008)
(2021) 44–77. 2677–2694.
[116] S. Biswas, D. Saha, S. De, A.D. Cobb, S. Das, Improving differential evolution [129] A. Ulas, O.T. Yildiz, E. Alpaydin, Cost-conscious comparison of supervised
through bayesian hyperparameter optimization. 2021 IEEE Congress on learning algorithms over multiple data sets, Pattern Recognit. 45 (2012)
Evolutionary Computation (CEC), Kraków, Poland, 2021, https://doi.org/ 1772–1781.
10.1109/CEC45853.2021.9504792. [130] F. Wilcoxon, Individual comparisons by ranking methods, Biometrics Bulletin 1
[117] A. Kumar, P.P. Biswas, P.N. Suganthan, Differential evolution with orthogonal (6) (1945) 80–83.
array-based initialization and a novel selection strategy, Swarm. Evol. Comput. 68 [131] J. Shaffer, Modified sequentially rejective multiple test procedures, J. Am. Statis.
(2022), 101010. Assoc. 81 (1986) 826–831.
[118] Z. Meng, Y. Zhong, G. Mao, Y. Liang, PSO-sono: a novel PSO variant for single- [132] R. Biedrzycki, Handling bound constraints in CMA-ES: an experimental study,
objective numerical optimization, Inf. Sci. (Ny) 586 (2022) 176–191. Swarm. Evol. Comput. 52 (2020), 100627.
[119] W. Zhao, L. Wang, S. Mirjalili, Artificial hummingbird algorithm: a new bio- [133] Z. Ma, G. Wu, P.N. Suganthan, A. Song, Q. Luo, Performance assessment and
inspired optimizer with its engineering applications, Comput. Methods Appl. exhaustive listing of 500+ nature-inspired metaheuristic algorithms, Swarm.
Mech. Eng. 388 (2022), 114194. Evol. Comput. 77 (2023), 101248.