A Tutorial On The Design, Experimentation and Application of Metaheuristic Algorithms To Real-World Optimization Problems

Survey Paper
2010 MSC: 00-01, 99-00

Keywords: Metaheuristics; Real-world optimization; Good practices; Methodology; Tutorial

Abstract

In the last few years, the formulation of real-world optimization problems and their efficient solution via metaheuristic algorithms has been a catalyst for a myriad of research studies. In spite of decades of historical advancements in the design and use of metaheuristics, large difficulties still remain with regard to the understandability, algorithmic design uprightness, and performance verifiability of new technical achievements. A clear example stems from the scarce replicability of works dealing with metaheuristics used for optimization, which is often infeasible due to ambiguity and lack of detail in the presentation of the methods to be reproduced. Additionally, in many cases, the statistical significance of their reported results is questionable. This work aims at providing the audience with a proposal of good practices which should be embraced when conducting studies about metaheuristic methods used for optimization, in order to provide scientific rigor, value and transparency. To this end, we introduce a step-by-step methodology covering every research phase that should be followed when addressing this scientific field. Specifically, frequently overlooked yet crucial aspects and useful recommendations will be discussed with regard to the formulation of the problem, solution encoding, implementation of search operators, evaluation metrics, design of experiments, and considerations for real-world performance, among others. Finally, we outline important considerations, challenges, and research directions for the success of newly developed optimization metaheuristics in their deployment and operation over real-world application environments.
∗ Corresponding author.
E-mail address: [email protected] (E. Osaba).
https://doi.org/10.1016/j.swevo.2021.100888
Received 17 August 2020; Received in revised form 14 January 2021; Accepted 15 April 2021
Available online 28 April 2021
2210-6502/© 2021 Elsevier B.V. All rights reserved.
E. Osaba, E. Villar-Rodriguez, J. Del Ser et al., Swarm and Evolutionary Computation 64 (2021) 100888

1. Introduction

The formulation and solution of optimization problems through the use of metaheuristics has gained increasing popularity over the last decades within the Artificial Intelligence community [1,2]. This momentum has been propelled by the emergence and progressive maturity of new paradigms related to problem modeling (e.g., large scale optimization, transfer optimization), as well as by the vibrant activity achieved in the Swarm Intelligence and Evolutionary Computation fields [3–5]. In this regard, there are several crucial aspects and phases that define a high-quality research work within these specific areas. Each of these aspects deserves painstaking attention for reaching the always desirable replicability and algorithmic understanding. Moreover, these efforts should be intensified if the conducted research has the goal of being deployed in real-world scenarios or applications.

This last aspect unveils one of the most backbreaking challenges that researchers face these days. It is relatively easy to find in the literature really meaningful studies around theoretical and synthetic applications of optimization problems and their solution using metaheuristic algorithms [6–8]. However, it is less frequent to find thoughtful and comprehensive studies focused on real-world deployments of optimization systems and applications.

Besides that, the challenge is even more arduous when the main goal of the research is to put into practice a previously published theoretical and experimental study. There are two main reasons behind this complicated situation. First, it is difficult to find studies that allow the work carried out to be transferred to practical environments without requiring a significant previous effort. Second, there is a lack of a practical guide helping researchers to outline all the steps that a research work should meet for being reproducible, such that it contemplates both theoretical and real-world deployment aspects.

With this in mind, it is appropriate to claim that the gap between theoretical and real-world oriented research is still evident in the research on metaheuristics for optimization conducted nowadays. This gap is precisely the main motivation for the present study, in which we propose a methodology for the design, development, experimentation, and final deployment of metaheuristic algorithms oriented to the solution of real-world problems. The guidelines provided here tackle different pivotal aspects that a metaheuristic solver should efficiently address to enhance its replicability and to facilitate its practical application.

To rigorously meet the objective proposed in this paper, each of the phases that define high-quality research is analyzed. This analysis is conducted from a critical but constructive standpoint, towards amending misconceptions and bad methodological habits, with the aim of ultimately achieving valuable research of practical utility. To this end, the analysis made for each phase incorporates a prescription of application-agnostic guidelines and recommendations that should be followed by the community to foster actionable metaheuristic algorithms, namely, metaheuristic methods designed and tested in a principled way, with a view towards ensuring their actual use in real-world applications.

Over the years, several efforts have been made by renowned researchers to establish firm foundations that guide practitioners in conducting rigorous research studies [9–12]. All these previous studies have significantly contributed to the standardization of some important concepts. However, the majority of these works focus on some specific steps or phases of the whole optimization process, while others focus on very specific knowledge domains. These remarkable studies will be analyzed in upcoming sections. Such works certainly helped us to highlight the main novelty of the methodology proposed here, which is the consideration of each step that makes up a real-world oriented optimization research effort, from the early phase of problem modeling to the validation and practical operation of the developed algorithm. Therefore, the main contributions and analysis of this tutorial are focused on the following issues:

• Problem Modeling and Mathematical Formulation: this first step is devoted to the modeling and mathematical formulation of the optimization problem, guided by a previously conducted conceptualization.
• Algorithmic Design, Solution Encoding and Search Operators: this phase is entirely dedicated to the design and implementation of the metaheuristic algorithm.
• Performance Assessment, Comparison and Replicability: this is a crucial step within the optimization problem solving process, devoted to the correct evaluation of the developed algorithms and to the replicability and consistency of the research.
• Algorithmic Deployment for Real-World Applications: once the metaheuristic is developed and properly tested, this last phase is dedicated to the deployment of the method in a real environment.

The remainder of the paper is structured as follows. In Section 2, the history of problem solving through metaheuristics is briefly outlined, underscoring the related inherent methodological uncertainties. In Section 3, we introduce a reference workflow which will guide the whole methodology. We also highlight in this section some of the most important related works already published in the literature and their connection with the methodology proposed in the present paper. Our proposed practical procedure is described in detail in Sections 4, 5, 6, and 7. Additionally, a summary of good practices at each specific phase of the complete problem solving process is provided in Section 8. The tutorial ends with a discussion on future research lines of interest for the scope of this tutorial (Section 9), followed by our concluding remarks in Section 10.

2. Problem solving using metaheuristics: A long history with methodological uncertainties

Optimization problems and their efficient handling have received extensive attention throughout the years. The appropriate solution of extraordinarily complex problems usually entails the use of significant computational resources [13–15]. This computational complexity, along with their ease of application to real-world situations, has made the optimization field one of the most intensively studied by the current artificial intelligence community. This scientific interest has led to the proposal of a plethora of solution approaches by a considerable number of researchers and practitioners. Arguably, the most successful methods can be grouped into three different categories: (1) exact methods, (2) heuristics, and (3) metaheuristics. As stated previously, this study sharpens its focus on the last of these categories.

Metaheuristics can be divided into different categories depending on their working philosophy and inspiration [16,17]. For a better understanding of the situation described in this paper, it is interesting to put emphasis on a specific branch of knowledge related to metaheuristics and optimization problem solving: bio-inspired computation [18]. In the last two decades, a myriad of bio-inspired approaches have been applied to different problems, some of which have shown remarkable performance. This growing attention has led to an extraordinary increase in the amount of relevant published material, usually focused on the adaptation, improvement, and analysis of a variety of methods previously reported in the specialized literature.

Several reasons have contributed to this situation. Probably the most important cornerstone was the birth of the branches known today as Evolutionary Computation and Swarm Intelligence [19,20]. The main representative techniques within these streams are the genetic algorithm (GA, [21,22]), particle swarm optimization (PSO, [23]), and ant colony optimization (ACO, [24]). Being more specific, it was PSO, thanks to its overwhelming success and novelty, that decisively influenced the creation of a plethora of bio-inspired methods which clearly inherit its main philosophy [25].

In spite of the existence of an ample collection of classical and sophisticated solvers proposed in both past and recent literature, an important segment of the research community continues scrutinizing the natural world seeking to formulate new metaheuristics that mimic new biological phenomena. This fact has entailed the seeding of three different problems in the community, which are now deeply entrenched. We list these problems below:

• Usually, the proposed novel methods are not only unable to offer a step forward for the community, but also augment the skepticism of critical researchers. These practitioners are continuously questioning the need for new methods which apparently are very similar to previously published ones. Some studies that have discussed this problem are [26], [27] or [5].
• The uncontrolled development of metaheuristics contributes to the growth of an already overcrowded literature, which is prone to generate ambiguities and insufficiently detailed research contributions. This uncontrolled growth is flooding the research community with a large number of articles whose contents are not replicable and, in some cases, may even be unreliable. The reason is the ambiguity and lack of detail in the presentation of the methods to be replicated, and the questionable statistical significance of their reported results.
• Most of the proposed methods are tested over synthetic datasets and generally compared with classical and/or representative metaheuristics. This entails two disadvantages. First of all, the sole comparison with classical techniques has led to unreliable and questionable findings. Second, the approaches proposed in these publications are usually difficult to deploy in real-world environments, requiring huge amounts of time and effort to make them work. Finally, being aware of the rich related literature currently available, today's scientific community must turn towards the
Table 1. Summary of the literature review, and comparison with our proposed methodology.

proposal of practical and real-world applications of metaheuristic algorithms. This goal cannot be reached if part of the community continues delving into the proposal of new solution schemes which, in most cases, do not seem to be fully justified.

For reversing this non-desirable situation, we provide in this work a set of good practices for the design, experimentation, and application of metaheuristic algorithms to real-world optimization problems. Our main goal with the methodology proposed in this paper is to guide researchers in conducting fair, accurate, and shareable applied studies, covering the whole spectrum of steps and phases from the inception of the research idea to the final real-world deployment.

As pointed out in the introduction, some dedicated efforts have been conducted before with similar purposes. Some of these papers are currently cornerstones for the community, guiding and inspiring the development of many high-quality studies. In [28], for instance, a tutorial on the use of non-parametric statistical tests for the comparison of evolutionary and swarm intelligence metaheuristics is presented. In that paper, some essential non-parametric procedures for conducting both pairwise and multiple comparisons are detailed and surveyed. A similar research effort is introduced in [9], in which a procedure for statistically comparing heuristics is presented. The goal of that paper is to introduce a methodology to carry out a statistically correct and bias-free analysis. In [29], a detailed study on the Vehicle Routing Problem with Time Windows is presented, in which several guides are offered for the proper design of solutions and operators, among other remarkable aspects. In any case, one of the most valuable parts of this research is the in-depth discussion on how heuristic and metaheuristic methods should be assessed and compared. An additional interesting paper is [30], which proposes a procedure to introduce new techniques and their results in the field of routing problems and combinatorial optimization problems. Furthermore, in a previously cited paper, Sörensen [26] also provides some good research practices to follow in the implementation of novel algorithms.

The difficulty of finding standards in optimization research in terms of significant laboratory practices is the main focus of the work proposed in [10]. Thus, the authors of that work suggest some valuable recommendations for properly conducting rigorous and replicable experiments. A similar research effort is proposed in the technical report published by Chiarandini et al. [12]. In that report, the authors formalize several scenarios for the assessment of metaheuristics through laboratory tests. More specific is the study presented in [31], focused on highlighting the many pitfalls in algorithm configuration and on introducing a unified interface for efficient parameter tuning.

It is also interesting to mention the work proposed in [32], which introduces some good practices in experimental research within evolutionary computation. Also focused on evolutionary computation, the authors of [33] highlight some of the most common pitfalls researchers make when performing computational experiments in this field, and they provide a set of guidelines for properly conducting replicable and sound computational tests. A similar effort is made in [34], but focused on bio-inspired optimization. The literature contemplates additional works of this sort, such as [35].

The methodologies mentioned up to now revolve around two key aspects in optimization: efficient algorithmic development and rigorous assessment of techniques. In addition to that, it is also possible to find in the literature good practices about the modeling and formulation of the optimization problem itself. This issue is equally important to the others previously mentioned, and not dealing properly with it usually becomes a source of multiple uncertainties and inefficiencies. In [36], for example, Edmonds provides a complete guide for properly formulating mathematical optimization problems. The author of that paper highlights the importance of analyzing the complexity of problems, which is crucial for choosing and justifying the use of a solution method. He also stresses the importance of carefully defining the three different ingredients that make up an optimization problem: instances, solutions, and costs.

Also related are the works conducted in [11] and [37], both dedicated to multi-objective problems. Moreover, in his successful book [38], Kumar dedicates a complete section to guiding researchers in the proper definition of optimization problems. This book is especially valuable for newcomers in the area due to its informative nature. Apart from these generic approaches, valuable works of this sort can be found in the literature devoted to some specific knowledge domains, such as the ones presented in [39] and [40].

As indicated before, the community has made remarkable efforts to establish some primary lines which should guide the development of high-quality, transparent, and replicable research. The main original contribution of the methodology proposed in this paper is the consideration of the full procedure related to real-world oriented optimization research, covering from the problem modeling to the validation and practical operation of the developed systems. Finally, Table 1 summarizes the state of the art outlined in this section. We also depict the main contribution of our proposal in comparison with each of the works described there.

Fig. 1. Phase 1 of the reference workflow for solving optimization problems with metaheuristic algorithms.

Fig. 2. Phase 2 of the reference workflow for solving optimization problems with metaheuristic algorithms.

3. Solving optimization problems with metaheuristic algorithms: A reference workflow

In this section, we introduce the reference workflow that describes our methodological proposal. Our main intention is to establish this procedure as a reference, considering its adoption a must for properly conducting rigorous, thorough, and significant studies, both theoretical and practical, related to metaheuristic optimization. Thus, Fig. 1 and Fig. 2 represent this reference workflow, which will serve as a guide for the remaining sections of this paper.

We have used two different high-level schemes to describe our methodology graphically. The first one (Fig. 1) is conceived as the general scheme, and it contemplates the problem description ➂, the analysis and development of the selected solution approach (➄–➅), and the deployment of the solution ➃. On the other hand, the second scheme (Fig. 2) is devoted purely to the research activity (stage ➅ in Fig. 1). At a glance, we can also see that we have devised two different development environments. Specifically, problem description, baseline analysis, and research activity are conducted in a laboratory environment ➀, while the algorithmic deployment is conducted in an application environment ➁.

Focusing our attention on the first workflow, the whole activity starts with the existence of a real problem that should be efficiently tackled. The detection of this problem and the necessity of addressing it trigger the beginning of the research, whose first steps are the conceptual definition of the problem and the definition and analysis of both functional and non-functional requirements ➂. It should be clarified here that this first description of the problem is made at a high level, focusing on purely conceptual issues.
Due to the nature of this first step, the presence of final stakeholders is highly recommended, in addition to researchers and developers.

Regarding functional requirements, it is hard to find a canonical definition [41], but they can be referred to as what the product must do [42] or what the system should do [43]. Furthermore, the establishment of functional requirements involves the definition of the objective function (or objectives, in the case of multi-objective problems) to be optimized, and of the equality and inequality constraints (in the case of a problem with side constraints). On the other hand, there is no such consensus for non-functional requirements. Davis defines them as the required overall attributes of the system, including portability, reliability, efficiency, human engineering, testability, understandability and modifiability [44]. Robertson and Robertson describe them as a property, or quality, that the product must have, such as an appearance, or a speed or accuracy property [42]. More definitions can be found in [41]. In any case, these objectives are crucial for the proper selection of the solution approach, and not considering them can lead to the re-design of the whole research, involving both economic and time costs. This paramount importance is the reason why, in this work, we put special attention on highlighting the impact of considering or not considering these non-functional objectives (that is, of a fair and comprehensive description of the non-functional requirements). In fact, many of the research contributions available in the literature are focused on the pure fulfillment of functional requisites, making them hard to deploy properly in the real world. Thus, we can see the meeting of non-functional objectives as the key for efficiently transitioning from the laboratory ➀ to the application environment ➁.

After this first conceptual phase, it is necessary to scrutinize the related literature and scientific community for finding an appropriate baseline ➄. The main objective of this process is to find a publicly shared library or baseline that fits the previously fixed functional requirements. In the positive case, the next step is to analyze whether these findings are theoretically compliant with all the outlined non-functional requirements. Published research activity is usually carried out under trivial and unofficial laboratory specifications, with a short-sighted design mostly concentrated on the “what” (functional objectives) but not on the feasibility of “the how”. The recommended good practice is filtering out research that has allegedly gone from the lab hypothesis to the demanding real-world conditions. On the contrary, when it is concluded that the baseline does not satisfy or reckon with these non-functional requirements, the research activity will first include procedures to evaluate the baseline viability, so as to decide whether the baseline is still a potential workaround or has to be discarded ➅. Finally, if both actions are positively solved, the investigation is considered ready to go through the deployment phase ➃.

At this point, it is important to highlight that the so-called Algorithmic Deployment for Real-World Application phase ➃ (detailed in Section 7), considered a cornerstone of our methodology, can receive as input an algorithm directly drawn from a public library ➄, or a method developed ad hoc as a result of a thorough research procedure ➅. At this phase, it is possible to face the emergence of new non-functional objectives, implying the re-analysis of the problem (going back to ➂) for the sake of deeming all the newly generated necessities.

On the contrary, if all the non-functional requirements are considered but not fully met, further re-adjustments are necessary. In this scenario, additional minor adaptations should be made to the metaheuristic if further configurations are left to test ➆. Nevertheless, if these minor adjustments do not result in a desirable performance of the algorithm, the process should re-iterate starting from the Algorithmic Design, Solution Encoding and Search Operators phase (part of Workflow 2, Fig. 2, and detailed in Section 5), which may involve a re-design and re-implementation of our metaheuristic solution, or even a new one ➅. Finally, if none of the above deviations occur and the performance of the metaheuristic meets the initially established objectives, the problem can be considered solved and the research completely finished after the final deployment of the algorithm in a real environment.

In another vein, Fig. 2 depicts the second part of our workflow, which is devoted to the work related to research development. As can be easily seen in this graphic, this workflow has three different entry points, depending on the status of the whole activity. Furthermore, this phase is divided into three different and equally important sequential stages. These phases, and how they are reached along the development process, are detailed next:

• Problem Modeling and Mathematical Formulation (Section 4): This first step should be entirely devoted to the modeling and mathematical formulation of the optimization problem, which should be guided by the previously conducted conceptualization. This part of the research should be entered if the problem to solve has not been tackled in the literature before, or in case of the non-existence of an adapted baseline or library.
• Algorithmic Design, Solution Encoding and Search Operators (Section 5): This second stage should be devoted to the design and implementation of the metaheuristic method. It should also be highlighted that another research branch could be conducted here, namely the refinement of a baseline or library already found in the scientific community.
• Performance Assessment, Comparison and Replicability (Section 6): Once the algorithmic approach is developed (or refined), the performance analysis of the technique should be carried out. This is a crucial phase within the optimization problem solving process, and the replicability and consistency of the research clearly depend on the good conduction of this step. Furthermore, once the quality of the algorithms has been tested over the theoretical problem, the method should be deployed in a real environment (Algorithmic Deployment for Real-World Application phase, Fig. 1).

Once we have introduced and described our envisioned reference workflow, we outline in the following sections all the good practices that researchers and practitioners should follow for conducting high-quality, real-world oriented research.

4. Problem modeling and mathematical formulation

Once the analyst and domain expert have agreed upon the conceptual definition and the requirements to be met by the solution, the research activity gets started. All these inputs (conceptual description and functional/non-functional requirements) will be tracked along the whole workflow and become more approachable depending on the specific stage. At the problem modeling phase, the “what” contained in the conceptualization and functional requirements has to be perfectly clear and comprehensive enough to be fairly translated into a mathematical formulation.

4.1. Mathematical formulation

The key steps to be followed, aimed at adequately translating a problem conception on paper into a precise mathematical formulation of an optimization problem, are depicted in Fig. 3:

• Clearly state the objective/cost function f(x), in charge of covering the functional requirements and of measuring the quality and success of each assignment or solution. Try to infer as well the main characteristics of f(x): linear/nonlinear, unimodal/multimodal, or continuous/discontinuous. In multi-objective scenarios, when the decision functions are conflicting, multiple criteria will play a part in the decision making process. Nonetheless, the stakeholder might narrow down the Pareto optimal/non-dominated solutions by imposing some preferences.
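As a minimal illustration of the multi-objective case in the step above, the sketch below filters a set of candidate objective vectors down to the Pareto non-dominated ones. The function names and the toy data are illustrative assumptions of ours, not part of this tutorial's methodology:

```python
from typing import List, Sequence

def dominates(a: Sequence[float], b: Sequence[float]) -> bool:
    """True if objective vector `a` Pareto-dominates `b` (minimization):
    a is no worse in every objective and strictly better in at least one."""
    return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))

def pareto_front(objs: List[Sequence[float]]) -> List[Sequence[float]]:
    """Keep only the non-dominated objective vectors."""
    return [a for a in objs if not any(dominates(b, a) for b in objs if b is not a)]

# Three candidate solutions evaluated on two conflicting objectives
# (e.g., cost and delivery time, both to be minimized):
candidates = [(1.0, 5.0), (2.0, 2.0), (3.0, 4.0)]
print(pareto_front(candidates))  # (3.0, 4.0) is dominated by (2.0, 2.0)
```

Stakeholder preferences can then be applied as a further filter over the surviving front, e.g., keeping only vectors whose first objective stays below an agreed budget.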
Fig. 3. Phase 1 of the reference workflow for solving optimization problems with metaheuristic algorithms.
• Characterize the decision variables (x = {x_1, x_2, …, x_n}) to be tuned (the order of the cities to visit, the community in which a node should be placed, the value of some input parameters, ...) and their domain.
• Determine the constraints of the problem, as well as the natural or imposed restrictions, whose intersection will yield the feasible region of solutions. For Constrained Optimization Problems, hereafter called COPs, the nature of such constraints (equality, inequality or both) may be decisive in the selection of the algorithmic approach.

Also in Fig. 3, and for the sake of understandability, we have depicted in the upper right corner the placement of this phase in the Research Activity workflow (Fig. 2). The researcher will strive to accommodate the itemized list of functional and non-functional requirements into the mathematical formulation, since setting boundaries to the automatic solution generation relieves subsequent efforts in modeling them at a later stage. We propose here a list of the most common real-world non-functional requirements and some good practices to be assessed, if applicable, by the researcher:

• Time consumption. It is often the most relevant non-functional requirement to take into account from the beginning of the mathematical conceptualization.
  • The fitness/objective function f(x) evaluation might be extremely time-consuming, specifically when equations are large and must be assessed through heavy computer-based simulations. Reformulations such as approaches based on approximation-preserving reduction, i.e., relaxing the goal from finding the optimal solution to obtaining solutions within some bounded distance from it [45], surrogate objective functions [46], or dimension reduction procedures (introduced right below) might be practical alternatives.
  • Dimension reduction relates to the decision variables, the parameters on which the algorithm will perform the decision-making procedure. The length of this list, n = |x|, and the variables' flexibility are strictly related to the time required by the metaheuristic to explore the search space and run evaluations of f(x). Therefore, a preliminary study on the selection of input parameters, similar to Attribute Selection in Machine Learning, is strongly advocated in realistic scenarios oriented to real-world deployment. A parameterized complexity analysis might trigger a mathematical reformulation after delving into both the sensitivity of the objective function with respect to parameters [47] (analogously to Information Gain in Machine Learning) and the inter-relation/correlation of each pair of input variables. The major concern about time consumption is likely to entice the researcher to pay close attention to the balance between problem dimensionality reduction and solution quality.
  • Constraints may contribute to a faster convergence by narrowing the search in the feasible space. Nevertheless, the number of constraints (and their complexity) can also have a big impact on the existence of a solution and/or on the capacity of a numerical solver to find it. In fact, for real-life optimization problems, inequality constraints (physical limitations, operating modes, ...) can be quite numerous in comparison to the decision variables x, hence causing the feasible space to shrink to the point of eliminating any available solution. In such a case, the COP goal will be mathematically reformulated as finding the least infeasible vector of variable values.
• Accuracy of the solution. Generally tightly related to the time-consumption requirement: once the mathematical formulation has been inferred, the optimization problem can be categorized as convex (i.e., the objective function f(x) is a convex function and the feasible search space is a convex set) or non-convex, which will mostly lead the algorithm selection process and its design. Researchers must strike a balance between the aforementioned time consumption and the accuracy of the solution, especially in large scale non-convex spaces: are local optima acceptable results in favor of lighter computation? Are global optima achievable and verifiable in the real-world environment? These questions will also flourish in the subsequent stages.
• Unexpected algorithm interruptions must return feasible solutions. In real-world environments, many unforeseen events may justify the need for a solution before the algorithm meets the stopping criteria that finish the search process. The solution, albeit premature, must be complete and fully compliant with the hard constraints. In such circumstances, the tendency to convert non-linear constraints into penalties (soft constraints) in the objective function, so as to bias the solutions towards the frontiers, is not a viable option.

With such an enumeration of requirements in hand, researchers should check those regarded at this initial stage and those not plau-
6
E. Osaba, E. Villar-Rodriguez, J. Del Ser et al. Swarm and Evolutionary Computation 64 (2021) 100888
sible for being satisfied by the mathematical formulation, which will be the real-world application under study? Unless these discussions are
consequently transferred to the following adjacent phase. elaborated at this point of the reference workflow, design choices in
subsequent phases can be made on the basis of a problem statement
4.2. Analyze problem complexity - Justify the use of metaheuristics uncoupled from the requirements of the real-world problem under
consideration.
Once the mathematical formulation has been completed, the re- • Is the problem complex enough to discard simpler heuristics? An equally
search team involved in the work at hand should justify with solid relevant aspect of real-world problems is its complexity, which has
grounds the need for metaheuristics to solve it efficiently. Mathemat- been lately studied in the literature under the concept of fitness land-
ical optimization is a long-standing discipline, which has been tradi- scape [50–52]. In the context of optimization, fitness landscape com-
tionally focused on convex objective functions and feasible sets [48]. prises three essential elements of study: search space, fitness func-
Fortunately, there are already specific solvers (either exact or heuris- tion, and neighborhood among solutions. Interestingly, since the
tics) suited to deal with this family of optimization problems, even to search space and the definition of a neighborhood depend stringently
optimality when some specific conditions hold (e.g., linearity). Convex- on how solutions (optimization variables) are represented, the math-
ity ensures that every local optimum is a global optimum, hence avoid- ematical statement of the optimization problem and the algorithmic
ing common issues that motivate the use of heuristic and metaheuristic design of the solver to address it become entangled with each other
alternatives. [53]. In other words, a single problem statement can span differ-
Unfortunately, the majority of contributions addressing real-world ent fitness landscapes depending on how solutions are represented,
optimization problems just neglect any discussion on the convexity and even if dealing with continuous search spaces. The point is that only
properties of their mathematical formulations. Instead, they directly by assessing all these elements jointly, one can find solid reasons to
resort to the use of metaheuristics, without any major discussion on opt for simpler heuristics, such as implicitly enumerative methods
whether they are really needed [27]. In this context, any prospective that rely extensively on the domains of study of landscape analysis
work along this line should pause at the following research questions: (e.g., exhaustive search, Montecarlo sampling, neighborhood search,
A∗ and branch and bound among others). Besides, landscape anal-
• Are the objective function(s) and constraint(s) analytically defined? In- ysis can unveil other features with important implications that can
tuitively, certain real-world optimization scenarios do not allow for be equally identified, such as ruggedness, basins of attraction, and
an analytical formulation of the optimization problem itself. Indeed, funnels, to mention a few. When addressing real-world optimization
the complexity of systems and assets to be optimized (as occurs in scenarios with analytically defined problem formulations, we advo-
e.g., industrial machinery) jeopardizes the derivation of closed-form cate for a closer look at these tools that, unfortunately, are often
formulae for the objectives and constraints to be dealt with. How- overseen in the literature related to real-world optimization.
ever, this does not imply that quality and feasibility cannot be eval- • Is there expert knowledge about the problem/asset that should be consid-
uated for any potential solution, but rather that the system/asset at ered in the definition of the problem? In real-world settings, years of
hand must be considered as a black-box model that can be queried unassisted problem solving by users often accumulate expert knowl-
for any given input (solution). In this case, when the use of algo- edge that can be exploited in the design of efficient heuristics, as
rithms that do not depend or rely on the problem’s properties be- typically done by local search methods in memetic algorithms. How-
comes properly justified, it paves the way for the use of metaheuris- ever, we emphatically underscore the relevance of expert knowledge
tic algorithms. in terms of problem analysis. For instance, large regions of the search
• Can the problem be modified or reformulated without compromising the space can be discarded as per the experience of the user consuming
imposed functional requirements? When the problem can be analyti- the solution provided by the algorithm (implicit experience-based
cally defined, it might fail to comply with the mathematical prop- constraints). Likewise, the usability of the output in real application
erties that could allow exact methods and ad-hoc heuristics to be contexts can give valuable hints about how the problem can be re-
applied. For instance, even if the convexity of the objective(s) can laxed, either in terms of formulation or in what refers to aspects
be guaranteed, their linear or quadratic nature with respect to the impacting on its landscape (e.g., solution encoding, or how solu-
optimization variables plays a crucial role in the adoption of exact tions can be compared to each other – neighborhood). Section 5 will
linear and quadratic programming methods rather than optimization later revolve on the capital role of expert knowledge in the design
heuristics (e.g., gradient-based methods). At this point, it is strongly of optimization algorithms for real-world problems. However, this
advised to examine strategies to reformulate (relax) the problem relevance also permeates to the definition of the problem itself and
and mathematically simplify its objective(s) and constraint(s). These its eventual reformulations.
strategies include, among others, quadratic and linear transforma-
tions, constraint approximation via trust regions or Lagrangian re- Summarizing the above points: metaheuristics must not be simply re-
laxation [49]. garded as a swiss knife for solving real-world problems, nor should this
When considered and successfully applied to the problem at hand, family of solvers be unduly applied to problems that can be simplified
the compliance of the reformulated problem concerning functional or tackled with simpler optimization methods. Instead, metaheuristics
requirements should be analyzed. For instance, if the objective(s) are are powerful algorithmic enablers to deal efficiently with those cases of
modified, a quantitative analysis of the implications of such mod- study whose complexity calls for their adoption. The provision of unde-
ifications in the landscape of the original problem should be per- niable arguments for the necessity of metaheuristics should be enforced
formed, particularly in regards to quality degradation (fitness value) in prospective studies.
and feasibility (constraint satisfaction). Depending on the chosen
problem relaxation strategy, the reformulation could just penalize 5. Algorithmic design, solution encoding and search operators
with an additive objective term those solutions as per their compli-
ance with the imposed constraints. This reformulation is a crucial The second phase to analyze after the problem modeling is the one
aspect that can be detected and it must be held in mind in subse- devoted to the pure algorithmic design and development. Fig. 4 summa-
quent design phases, as there is no mathematical guarantee that a rizes the main aspects of this activity. As in the previous subsection, we
feasible solution will be obtained. A similar conclusion can be drawn have shown in the upper right corner the placement of this phase in the
with linear relaxation strategies: is the quality (fitness) of the global Research Activity workflow (Fig. 2). As can be seen in this scheme, and
optima of the relaxed problem far away from that of the original, following the guidelines highlighted in the previous section, this step re-
unrelaxed problem? If there is an optimality gap, is it relevant for ceives as input an optimization problem, formulated adequately as one
7
or more objective functions, a set of decision variables, and a group of constraints. Furthermore, a group of non-functional requirements that must be fulfilled is also provided as input, which undoubtedly influences the designs and developments conducted in this phase.

Fig. 4. Summary of the methodology on Algorithmic Design, Solution Encoding, and Search Operators.

This specific stage of the research can be reached from four different points:

1. Following the natural flow of the research methodology depicted in Fig. 1, the metaheuristic design and implementation are conducted after the mathematical formulation of the problem (Section 4).
2. If researchers have found a baseline that meets the same functional requirements of the problem at hand, but the theoretical compliance of all non-functional requirements established (step ➄ in Fig. 1) cannot be verified.
3. The re-design and re-implementation of the selected metaheuristic is necessary if the experiments carried out in the Lab Environment using a previously implemented solver do not verify the compliance of the defined functional requirements (Section 6).
4. From the Algorithmic Deployment for Real-World Applications step (Section 7 and ➆ in Fig. 1), only if the previously deployed solver does not meet the established non-functional requirements.

Thus, these are the most important aspects a researcher or a practitioner should consider regarding the algorithmic design, solution encoding, and search operator development:

• Solution encoding. This is the first crucial decision to take in the algorithmic design [54,55]. The type of encoding for representing the candidate solution(s) should be decided (real or discrete; binary [56], permutation [57], random keys [58], etc.). Its length (understood as the number of parameters that compose the solution) is also an essential choice. This length can be dependent on the size of the problem, or on the number of parameters to optimize. Thus, depending on these choices, encoded solutions can adopt different meanings. For example, the candidate can represent the problem solution itself (when genotype = phenotype [59]), as in the case of the permutation encoding for the TSP [60], or a partial solution, as normally happens when using Ant Colony Optimization [61,62]. Nonetheless, the candidate can also represent a set of values acting as input for a specific system, or a configuration of a defined set of preferences [63] which will subsequently play a part in the complete problem solution. Taking this particularity into account, it is important not only to match the encoding to the problem (genotype vs phenotype) but also to clearly detail it. For this reason, two important questions a researcher should answer are: “Do we need to encode an individual that represents the problem's solution in a straightforward manner? Or do we need an intermediate encoding better suited to test different heuristic operators?”
  Focusing our attention on solutions encoded as parameters that act as inputs for an external system, researchers should bear in mind that the length of the candidate solutions and the domain of their variables are strictly related to the running times needed by the metaheuristic to modify and evaluate them. This impact on the running times is the reason why, as mentioned in the previous section, a preliminary study on the input parameters to be considered is required for studies oriented to real-world deployment. This way, researchers can decide which parameters should be part of the solution encoding, balancing both time consumption and influence on the solution quality. A remarkable number of studies delving into this topic have been published in the literature [64], the restricted search mechanism [65] and the compression expansion [66] being two representative strategies of this sort.
  Furthermore, the importance of solution encoding is twofold. On the one hand, it defines the solution space in which the solver works. On the other hand, the movement/variation operators to consider are dependent on this encoding. Consequently, different operators
should be used depending on the encoding (e.g., real numbers, binary or discrete). Ideally, this representation should be wide and specific enough to represent all the feasible solutions to the problem. Additionally, it should fit as best as possible the search domain of the problem, avoiding the use of representations that unnecessarily enlarge this domain. In any case, and taking into account that this methodology is oriented to real-world deployments, non-functional requirements should be decisive for deciding which encoding is the most appropriate to deal with the problem at hand. For example, if the real environment contemplates unexpected algorithm interruptions (a concept defined in Section 4.1), encoding strategies allowing for partial solutions should be completely discarded. Moreover, if the execution time is a critical factor in the real system, representations that require complex and time-consuming transformations or translations should also be avoided. An example of such translations is the random-keys based encoding, often used in Transfer Optimization environments [67,68].
• Population. On the one hand, if the number of candidate solutions to optimize is just one, as in Simulated Annealing (SA, [69]) and Tabu Search (TS, [70]), we can consider the metaheuristic a trajectory-based method. On the other hand, if we deal with a group of solutions, the technique is classified as population-based. Examples of these solvers are GA and PSO. An additional consideration is the number of populations, which can also be more than one. These methods can be called multi-population, multi-meme, or island-based methods, depending on their nature [71,72]. Instances of these approaches are the Imperialist Competitive Algorithm [73] or distributed and parallel GAs [74]. In these specific cases, the way in which individuals are introduced in each sub-population should be clearly specified, and the way in which solutions migrate from one deme to another must also be formulated [75]. Finally, well-known methods such as Artificial Bee Colony [76] and Cuckoo Search (CS, [77]) are characterized by being multi-agent, meaning that each individual of the community can behave differently.
  Summarizing, the number of solutions to consider, the structure of the population, and the behavior of the individuals are three aspects that must be thoroughly studied. As in the previous case, non-functional requirements need to be carefully analyzed to make the right decision. For example, if the solver is run in a distributed environment, a multi-population method or a distributed master-slave approach (either synchronous or asynchronous) could be a promising choice. Moreover, if the running time is a critical aspect and the problem is not expensive to evaluate, a single-point search algorithm could be considered. In this regard, functional requirements must also be analyzed for choosing the proper alternative. For instance, if the solution space is non-convex and the number of local optima is high, a population-based metaheuristic should be selected, since it enhances the exploration of the search space. This aspect can be particularly observed in multimodal optimization [78,79].
• Operators. The design and implementation of operators is an important step that should also be carefully conducted. A priori, there is no strict guideline for the development of these functions. Furthermore, there are different kinds of operators, such as selection, successor, or replacement functions, among others [16,55,80]. In any case, and in order to avoid any ambiguity related to the terminology used [26,81], the way in which individuals evolve along the execution should be detailed using a standard mathematical language [5]. To do that, each operator's inputs and outputs should be described using both algorithmic descriptions and standard mathematical notation. We should also describe the nature of the operators (search-based, constructive, ...) and the way in which they operate. Furthermore, and with ambiguity avoidance in mind, it is advisable to anticipate possible resemblances with other algorithms from the literature and highlight differences (if any) by using, once again, mathematically defined concepts. For example, a mutation operator of a GA can be mathematically formulated as:

  𝐱𝑡+1 = 𝑓𝑖(𝐱𝑡, 𝑍) ∈ 𝒳, (1)

  where 𝐱𝑡+1 is the output solution, 𝒳 is the solution space, and 𝑍 denotes the number of times one of the functions 𝑓𝑖(·) in the pool of available operators is applied to the input 𝐱𝑡. Following the same notation, a crossover could be denoted as 𝐳𝑡+1 = 𝑔𝑖({𝐱𝑡, 𝐲𝑡}, 𝑍) ∈ 𝒳.
  Again, non-functional requirements should be carefully studied to accurately choose or design all the operators that will be part of the whole algorithm. For example, some operators allow the eventual generation of incomplete and/or non-feasible solutions (i.e., solutions which do not meet all the constraints) to enhance the exploration capacity of the method. In any case, these alternatives should be avoided if the real-world scenario considers unexpected algorithm interruptions. Additionally, in case the running time is a critical issue, operators that favor the convergence of the algorithm should be prioritized (understanding convergence as the computational effort that the algorithm requires to reach a final solution(s) [82]).
• Algorithmic Design. Briefly explained, the algorithmic design dictates how operators are applied to the solution or groups of solutions. It could be said that this design determines the type of metaheuristic developed. At this point, it should be mandatory to provide overall details of the algorithm. To do this, several alternatives are useful, such as a flow diagram, a mathematical description or a pseudocode of the method. Furthermore, if the modeled technique incorporates any novel ingredient, it is highly desirable to conduct this overall description of the method using references to other algorithmic schemes made for similar purposes. Furthermore, the number of possible alternatives for building a solution metaheuristic is really immense, making it impossible to point out here all the aspects that should be highlighted. In any case, some of the facets that must be described are the selection criterion, the criterion for the interaction among solutions (in terms of recombination in GAs, or migration in multi-population metaheuristics), the acceptance criterion (replacement) and the termination criterion.
  Probably, the first good practice to follow when deciding the algorithmic design of a real-world oriented metaheuristic is to take a detailed look at recent related scientific competitions. Tournaments such as the ones held at reference conferences such as the IEEE Congress on Evolutionary Computation¹ and the Genetic and Evolutionary Computation Conference should guide the selection of the candidate algorithm. To make this decision, it should be checked whether the real-world problem belongs to a class of problems with a similar competition benchmark, in which case it is meaningful to focus the attention on those algorithms that have shown a remarkable performance at recent competitions.
  Once again, researchers should thoroughly consider both functional and non-functional requirements to properly choose the design of the metaheuristic. For example, computationally demanding designs could be acceptable only in situations in which the running time is not critical. On the contrary, if we want to reduce the execution time by sacrificing some quality in the solution, the termination criterion would be a cornerstone for reaching a proper and desirable convergence. Interaction between candidate solutions would also be of paramount importance if the implemented algorithms will be deployed in a distributed environment, requiring advanced and carefully designed communication mechanisms.
  Regarding the problem complexity, if this is remarkably high, automated algorithm selection mechanisms can be an appropriate alternative.

¹ https://www.ntu.edu.sg/home/epnsugan/index_files/CEC2020/CEC2020-1.htm
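To make the operator notation of Eq. (1) concrete, the sketch below instantiates the unary operator 𝑓𝑖 as a Gaussian perturbation applied 𝑍 times, and the binary operator 𝑔𝑖 as a uniform recombination. Both operator choices, the parameter names and the use of a seeded random generator are illustrative assumptions of ours, not implementations prescribed by the methodology:

```python
import random

# Hypothetical instances of the notation of Eq. (1): a unary operator
# f_i applied Z times to an input solution x_t, and a binary operator
# g_i recombining two parents {x_t, y_t}.

def mutate(x_t, Z, sigma=0.1, rng=None):
    """f_i(x_t, Z): perturb one randomly chosen coordinate per application."""
    rng = rng or random.Random(0)
    x = list(x_t)
    for _ in range(Z):
        j = rng.randrange(len(x))      # coordinate picked for this application
        x[j] += rng.gauss(0.0, sigma)  # Gaussian perturbation as f_i
    return x

def crossover(x_t, y_t, rng=None):
    """g_i({x_t, y_t}, 1): uniform recombination of two parent solutions."""
    rng = rng or random.Random(0)
    return [a if rng.random() < 0.5 else b for a, b in zip(x_t, y_t)]
```

Writing operators with explicit inputs and outputs, as above, mirrors the recommendation to describe each operator both algorithmically and in standard mathematical notation.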
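A minimal sketch of such per-instance automated algorithm selection is shown below. The portfolio names and the hand-made score table standing in for a learned performance model are purely hypothetical; in a real system the predictor would be trained on landscape features and past runs:

```python
# Hypothetical sketch of per-instance automated algorithm selection:
# within a predefined portfolio, pick the solver whose predicted
# performance (lower is better) is best for the instance's features.

def select_algorithm(instance_features, portfolio, predicted_score):
    """Return the portfolio member with the best (lowest) predicted score."""
    return min(portfolio,
               key=lambda name: predicted_score(instance_features, name))

# Toy stand-in for a learned performance model: a hand-made score table.
scores = {("rugged", "GA"): 0.3, ("rugged", "SA"): 0.7,
          ("smooth", "GA"): 0.6, ("smooth", "SA"): 0.2}

best = select_algorithm("rugged", ["GA", "SA"],
                        lambda feats, name: scores[(feats, name)])
# best == "GA": the candidate with the lowest predicted score for this instance
```

The design choice here is simply to decouple the selection rule from the performance predictor, so that the same mechanism works whether the scores come from a lookup table, a regression model, or competition results.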
native [83]. This concept sinks its roots in the well-known no-free- imental benchmark 6.1, evaluation score 6.2, fair comparisons among
lunch theorem [84]. This theorem particularly applies in computa- techniques 6.3, statistical testing 6.4, and replicability 6.5.
tionally demanding problems, in which no single algorithm defines
the baseline. On the contrary, there is a group of alternatives with 6.1. Experimental benchmark
complementary strengths. In this context, automated algorithm se-
lection mechanisms can, within a predefined group of algorithms, This is usually the first decision researchers must face when design-
decide which one can be expected to perform best on each instance ing the experimental evaluation of their proposal. In this regard, de-
of the problem. pending on the kind of contribution (more theoretical or more applied)
Another interesting aspect to consider for the algorithmic design is the nature of the problems used in the experimentation might be differ-
the whole complexity of the technique. Usually, the development ent. In the first case, researchers would probably use synthetic bench-
of complex algorithmic schemes is unnecessary, if not detrimental. mark functions to assess the performance or the advantages of the differ-
Some influential authors have proposed the bottom-up building of ent proposals or considered algorithms. In the second case, the authors
metaheuristics, using the natural trial and error procedure. It has would normally propose a set of instances related to the problem they
been demonstrated how, in practice, robust optimizers can be built, are trying to solve. Sometimes, especially when they are dealing with a
which can compete with complex and computationally expensive novel or a very specific problem with strong requirements, one has to de-
methods. This concept, based on the philosophical concept of Oc- sign his own benchmark. In all the cases, those problem instances must
cam’s Razor, is the focus of some interesting studies such as [85,86]. comply with several conditions to ensure that the experiments succeed
An additional consideration that should be taken into account for in assessing the significance of the decision of selecting one or another
properly choosing the algorithmic design is the expertise of the fi- technique. In order to do that, benchmarks should be designed not only
nal user. In this sense, if the user who will use the deployed method to allow the evaluation of functional but also of non-functional require-
in the real environment has no experience with these kinds of tech- ments, in order to analyze the degree to which all of them are met. Ad-
niques, it is recommended to implement techniques needing a slight ditionally, it should also be considered specific conditions allowing or
parameterization. Examples of these methods are the basic versions encouraging the eventual application of statistical tests, such as a large
of the Cuckoo Search or Differential Evolution. Other promising al- number of problems and/or an odd number of them to reduce potential
ternatives for these types of situations are the solvers known as adap- problems in the comparative analysis, due to the cycle ranking or the sur-
tive [87], or the automated design methods [88,89]. On the con- vival of the nonfittest paradoxes [90]. Since the methodology introduced
trary, if the final user is familiar with the topic, the researcher could in this paper is oriented to the deployment of real-world metaheuris-
deploy a flexible solver configurable by several control parameters tics, we recommend conducting laboratory tests using datasets as real
to allow refinements in the future. Well-known examples of these as possible, even if they are synthetically generated.
methods are the Genetic Algorithm (with its crossover and mutation In this regard, it is widely agreed in the community that real-world
probabilities, population size, replacement strategy, and many other benchmarks have been traditionally scarce. However, significant efforts
parameters) or the Bat Algorithm (with its loudness, pulse rate, fre- have been recently conducted to overcome this lack. In [91] an easy-
quency or wavelength, among other parameters). to-use multi-objective optimization suite is introduced, consisting of 16
Furthermore, and although the interest in providing theoretical guar- bound-constrained real-world problems. Similar studies have been also
antees of newly developed metaheuristics is in crescendo, we should contributed in [92] and [93], focused on realistic many-objective opti-
also explicitly call in this methodological paper for an effort to in- mization problems. In [94], a generic framework is proposed to design
corporate theoretical reasons for new algorithmic designs. In other geared electro-mechanical actuators. The proposed framework is uti-
words, we should progressively shift from a performance-driven ra- lized for constructing realistic multi-objective optimization test suite,
tionale (look, my algorithm works) to a theory-/intuition-driven de- with an emphasis placed on constraint handling. Another work along
sign rationale (look, my algorithm will work because,...). Of course, this this line can be found in [95], which describes a benchmark suite com-
trend should also be extended not only to the algorithmic design, but posed by 57 real-world constrained optimization problems. An addi-
also to the generation of new operators and operation mechanisms. tional proposal published in [96] revolves on data-driven evolution-
Finally, and referring to the proposal of new metaheuristics, opera- ary multi-objective optimization. Additional studies of this kind can be
tors, or mechanisms, we want to highlight the importance of prop- found in [97], [98], and [99]. In any case, for the generation of valuable
erly describing all the aspects involved in a solver using a stan- synthetic problem instances, all the variables that compose the real sit-
dard language. In other words, all metaphoric features should be uation must be thoroughly studied in order to build reliable test cases.
left apart, or contextualized using openly accepted methods as ref- Thus, newly built datasets should be adapted to these variables. Lastly,
erences. In fact, the lack of depth in these descriptions is the main if any real instance is available, the generation of additional synthetic
reason for lots of ambiguities generated in the literature [5,27]. For test cases is recommended, using the real one as inspiration.
example, it is perfectly valid to name the individuals of a population Furthermore, when the experimental benchmark is made up of syn-
as Raindrops, Colonies, Bees or Particles, but they must be notated thetic functions, such functions should be a challenge for optimization
using a standard mathematical language, and it should be clarified algorithms. Thus, the benchmark should comprise functions of different
that they are similar to an individual of a Genetic Algorithm (if we nature and challenging features, such as a different number of local op-
use the GA as a reference). tima, shifting of the global optimum, rotation of the coordinate system,
non-separability of (sub-)components, noise, several problem sizes, etc.,
depending on the expected features of the problem to tackle. Designing
6. Performance assessment, comparison and replicability such a benchmark can be a difficult task, so a good recommendation
is to use some of the existing benchmarks in the literature. The use of
When the selection of the algorithms is carried out by considering well-known benchmarks also facilitates the selection of the reference al-
previous reports and studies in the literature, this step is indeed not gorithms to be included in the comparison. It should be finally pointed
needed. However, it is quite frequent that good comparisons do not ex- out that, although a technique will be deployed in a real environment
ist in the literature to make a reasonable decision. This implies that we solving a real-world problem, conducting tests with this kind of datasets
have to conduct our own comparisons in order to select the algorithm is of great importance for measuring the quality of the developed pro-
that better meets our requirements. This section discusses on several posal.
aspects that must be considered to conduct a rigorous and fair exper- On the other hand, when the experimental benchmark includes real-
imentation to make that decision. Specifically, these topics are: exper- world problems that have not been tackled before, authors should care-
10
E. Osaba, E. Villar-Rodriguez, J. Del Ser et al. Swarm and Evolutionary Computation 64 (2021) 100888
fully select an appropriate set of instances to evaluate their proposal, or even generate them. This last case is especially frequent in situations where the problem is being solved for the first time by the community, as often happens when dealing with real environments. Thus, the testbed should include a broad set of instances covering all the relevant characteristics of the problem under consideration in order to resemble, as much as possible, the real-world scenarios being modeled. These datasets should be chosen or generated to efficiently test each functional and non-functional requirement of the problem. Lastly, instances should be described in detail and, whenever possible, should be made available to the community, so that other authors can use them to evaluate their own contributions.

6.2. Evaluation measure

An optimization algorithm can be assessed from different points of view and based on many features. Traditionally, the main measures are related to performance (a fitness function or an error measure). However, there are many other possible measures of interest in a real-world context:

• Processing time, which depends on computational complexity. In many real-world contexts this response time is crucial. In this context, it is recommended to generate a record containing the execution times presented by all the considered metaheuristics. This record should be associated with the computational environment in which the techniques have been run. Thus, this logbook will be useful in the subsequent deployment phase, and it is especially crucial for properly measuring the impact of the system migration on the algorithm’s running time.
• Memory requirements: this is especially important when the algorithm is expected to run on hardware with limited resources.
• For distributed algorithms there are special measures such as the communication latency, or the achieved speedup (relative to the number of nodes). Additionally, other non-functional requirements can also be considered and measured, such as robustness when a node fails, redundancy, etc.
• The required time to obtain a reasonably good solution, especially in problems in which each evaluation requires significant computational resources. In these scenarios, algorithms often apply surrogate models to reduce their execution time.

As a general rule, the assessment of the performance of an optimization algorithm cannot be guaranteed if the measure of just a single run is reported. Robust estimators of an evaluation metric can only be computed if enough information is available. In this sense, multiple runs should be considered so that the statistical methods described below can deliver significant conclusions. Special attention should also be paid to the fact that multiple runs must be independent, i.e., no information is fed from one run to another.

6.3. Rich comparisons from multiple perspectives

A rigorous assessment of an optimization method should focus on different aspects of the method’s behavior, and it should explore different perspectives for gathering meaningful insights. For example, aligned tables with min, max and mean results should only be considered for informative purposes. Nevertheless, much richer visualizations should be analyzed to dive into the data and highlight the most important findings of the research. One possible approach to visualize the comparisons between algorithms is the use of data profile techniques like the one proposed in [100], which was later extended in [101]. The modified data profile technique proposed in these studies allows comparing several optimization algorithms by adopting a two-step methodology: a comparison of the means in the first step, and a comparison of confidence bounds in the second phase.

In some real-world problems, specific visualizations can be helpful to ease the interpretation of the results of the optimization algorithm. For example, in a routing problem, the visualization of the routes can be examined by an expert in mobility who will be able to assess the convenience of using the solutions provided by the optimization algorithm. In some cases, a longer route that complies with some additional constraints that were not available when the problem was defined might be more appropriate than a shorter route which violates such constraints. This can be easily spotted by the expert with this kind of visualization.

Something similar is recommended for real-parameter optimization problems. In this case, it is also useful to depict different solutions, but an alternative visualization should be considered. This visualization should make it possible to represent not only the solutions themselves, but also the fitness value associated with each solution, in order to identify promising regions of the solution space. A direct approach for visualizing continuous variables would be to use 2D or 3D scatter plots, in the case of very small problems. If the problem has 4 variables or more, it is not possible to represent solutions without the use of dimensionality reduction techniques (PCA, t-SNE, UMAP, etc.). An alternative approach is the use of parallel coordinate plots.

In the case of multi-objective optimization problems, specific visualization techniques are also required. In this context, it is of great interest to be able to represent the Pareto front of the optimization problem to allow the user to choose from among all the available non-dominated solutions.

Another important issue that should be the subject of study is how algorithms manage the exploration vs. exploitation ratio [102,103]. In most cases, authors do not pay significant attention to how the components of the developed techniques contribute to exploration/exploitation; moreover, the analysis needed to support any claim in this regard is normally not carried out, even though such an analysis should be mandatory [104].

Another crucial aspect, which has also been mentioned in previous sections, deals with the complexity of the algorithms. In this sense, an intuitive approach is to compare the running times of the algorithms under study. However, this measure is only meaningful in certain real-world situations. Other elements could also affect this performance measure: differences in the computing platform, availability of a parallel implementation, the application of the code, etc. For this reason, other language-agnostic measures such as the Cyclomatic Complexity (or Conditional Complexity, or McCabe’s Complexity) [105] are normally preferred. More concretely, Cyclomatic Complexity is a software metric that measures the number of independent paths in a program’s source code. The higher the number of independent paths, the more complex the program is and, thus, the higher the complexity value obtained. Nonetheless, the efficiency of the algorithm, in terms of its consumption of computing resources, can be of utmost importance for real-world oriented research.

The last fundamental feature pointed out in this subsection relates to the adjustment of the parameter values of each algorithm. In this sense, it makes sense to adjust the parameter values to adapt the search to the complexity of the instance/problem, given that this complexity can be directly inferred from the information that we have about the instance/problem (such as, for example, its size), without the need for additional processing to identify it. If a parameter tuning algorithm has been employed (which is highly recommended, see [5]), the tuned values should also be analyzed. An additional aspect to consider is to clearly analyze the influence of each parameter on the fulfillment of the established functional and non-functional requirements, and to analyze the impact of the fine-grained tuning of each parameter value. A deep comprehension of this influence is of great value for providing a sort of understandability framework to non-familiarized stakeholders. In this regard, algorithm developers should prioritize techniques and systems that can be parameterized externally, so that such parameterization can be carried out by non-experts in the field.
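The multiple-run protocol advocated in Section 6.2 can be sketched in a few lines. The snippet below is only illustrative: the random-search solver and the sphere objective are hypothetical placeholders, not methods proposed in this tutorial. It launches 30 independent runs, each with its own seed and random number generator, and reports robust summary statistics instead of a single-run value.

```python
import random
import statistics

def random_search(fitness, dim, evals, seed):
    """Illustrative baseline solver: pure random search with its own RNG,
    so that runs launched with different seeds are fully independent."""
    rng = random.Random(seed)  # per-run generator: no state leaks across runs
    best = float("inf")
    for _ in range(evals):
        candidate = [rng.uniform(-5.0, 5.0) for _ in range(dim)]
        best = min(best, fitness(candidate))
    return best

def sphere(x):  # illustrative objective, not tied to any real problem
    return sum(v * v for v in x)

# 30 independent runs, one distinct seed each
results = [random_search(sphere, dim=10, evals=2000, seed=s) for s in range(30)]
q1, _, q3 = statistics.quantiles(results, n=4)
print(f"median={statistics.median(results):.3f}  IQR={q3 - q1:.3f}")
```

Since no state is shared between runs, the resulting sample satisfies the independence assumption required by the statistical tests discussed later in this section.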
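Regarding the exploration vs. exploitation analysis advocated above, a simple starting point is to track the diversity of the population along the search. The following sketch (synthetic populations generated for illustration only) computes the average pairwise distance of two population snapshots, a rough proxy for how exploratory the search is at each stage.

```python
import math
import random

def population_diversity(population):
    """Average pairwise Euclidean distance between individuals: a simple
    proxy for exploration (high diversity) vs. exploitation (low)."""
    n = len(population)
    total, pairs = 0.0, 0
    for i in range(n):
        for j in range(i + 1, n):
            total += math.dist(population[i], population[j])
            pairs += 1
    return total / pairs

rng = random.Random(42)
# Hypothetical snapshots: an early, spread-out population vs. a late,
# converged one (both synthetic, for illustration only).
early = [[rng.uniform(-5, 5) for _ in range(10)] for _ in range(20)]
late = [[rng.gauss(1.0, 0.05) for _ in range(10)] for _ in range(20)]
print(population_diversity(early) > population_diversity(late))  # True
```

Plotting this quantity per iteration, for each component of the algorithm, is a lightweight way to substantiate claims about the exploration/exploitation balance.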
6.4. Statistical testing

Statistical comparisons of results should be considered mandatory, especially when the algorithms used in the experimentation are stochastic. However, even if the statistical comparisons are made, they are not always correctly carried out. There are some popular methods in inferential hypothesis testing, such as the t-test or the ANOVA family. Nonetheless, these tests, called parametric tests, assume a series of hypotheses on the data to which they are applied (normality, homoscedasticity, etc.). If those assumptions do not hold, their reliability is not guaranteed, and alternative approaches should be considered. This is the case of non-parametric tests, such as Wilcoxon’s test, which do not assume any particular characteristic of the distribution of the underlying data [106]. Consequently, these tests can be more generally applied. However, they are less powerful than parametric tests, as they consider the relative ranking instead of the real error values of the different proposals.

Additionally, when several comparisons are done, the cumulative error should be carefully considered. For instance, the popular Wilcoxon’s test is designed for comparing two data samples (usually coming from the errors of the algorithms subject to comparison). When more than two samples are compared among them, the cumulative error could increase [28]. In these cases, a post-hoc treatment such as Holm’s (or others) should be used to keep this cumulative error under control in the overall comparison.

It should be noted, however, that the use of statistical tests does not guarantee that errors in the interpretation of results will not occur. Indeed, the concept of p-value can lead to several misinterpretations. This same problem could also arise when using confidence interval methods, but it has been proven that it happens on a smaller scale [107]. Also, in [90] two popular comparison strategies are analyzed, identifying several paradoxes that could lead to different misinterpretations of results. In particular, comparing by pairs of algorithms, as is done when using the well-known t-test or Wilcoxon’s test, could produce the cycle ranking paradox, concluding that none of the compared algorithms could be identified as the best one. Furthermore, methods like ANOVA, which compare multiple algorithms, may lead to the survival of the non-fittest paradox, by which the identified winner could differ from the one obtained through statistical comparisons of pairs of algorithms.

The above inferential tests are based on frequentist statistics, and present several problems, the most obvious being the degree of dependence between the p-value and the confidence intervals with respect to the size of the sample. Generally, when enough data is available, it is very simple to obtain a small p-value. Since the sample size is arbitrarily chosen by the researcher and the null hypothesis (samples come from the same distribution) is usually wrong, the researcher can reject it by testing the algorithms with a larger amount of data. On the contrary, considerable differences might not yield small p-values if there are not enough data (as datasets) for testing the method [108]. More recently, the use of Bayesian statistical tests is attracting more and more interest, as they are considered to be more stable and the interpretation of their results is more appropriate to what researchers want to analyze [108].

6.5. Replicability of the experiments

As the last point of this section, replicability is one of the standard criteria used to assess the scientific value of research. With replication, different and independent researchers can address a scientific hypothesis and build up evidence for or against it. In this section, we are going to describe different considerations that should be taken into account to make the replicability of experiments possible.

A good practice for comparing multiple algorithms is to ensure that all of them have been configured by following the same approach, with the same exhaustiveness, so that all of them are run on the same environment and under experimental conditions that guarantee that none of them has an advantage over the others. This is a crucial aspect for determining which approach will perform better in the real environment. In order to accomplish that, it is important to use a benchmark that does not have an unfair bias which favors some algorithms over the others. This is particularly important when the algorithms are tested using a synthetic benchmark, because it could have some characteristics that are uncommon in real-world problems. Since we are concerned with real-world problems, any specific benchmark feature that could favor a particular optimization algorithm deserves particular attention.

Regarding the experimental conditions, another important issue that should be taken into account is the maximum processing time, which is strictly determined by the real-world problem to be solved. A good practice in this context is the allocation of a dedicated budget of objective function evaluations for each of the algorithms in the experimentation. This budget should be determined by an estimation of the time complexity required by each algorithm in the benchmark. In turn, the estimated time complexity of a method should be subject to the implementation that would be subsequently deployed over the real environment. Also, it is advisable to perform quantitative time complexity assessments for each of the stages that comprise the whole metaheuristic technique. It should also be clear that the time complexity can be influenced by multiple factors, including the hardware on which the experiments are run and the software of the implementation in use, such as the operating system, the programming language and/or the compiler/interpreter. A change in any of these factors could significantly alter the performance estimation of the algorithms under comparison. In any case, the maximum processing time of an algorithm is an important decision driver that has to be considered when designing the algorithmic solution. Otherwise, the selection of one metaheuristic approach over other possibilities could be of no practical use when applied to the real-world scenario under study.

All the previous requirements are needed to guarantee the replicability of the experimental conditions. However, we cannot talk about replicability if the specific instances/problems used in the experiments are not readily available to external researchers. In the specific case of this methodology, which is oriented to real-world applications, it could be possible that the instance/problem used contains internal and private data, which should not be shared publicly. In these situations, researchers should comply with the corresponding legal limitations before the public sharing of the data (such as anonymizing private data, for example). Additionally, it could be interesting to provide connectors for different languages and/or frameworks. Furthermore, if the problem datasets have been generated synthetically (as mentioned in Section 6.1), it is highly recommended to publicly share the instance generator that was adopted.

Finally, and although it is usually not considered a requirement, making the source code of a new algorithm freely available to facilitate replicating the results is highly recommended. In this regard, and depending on the context, confidentiality issues can arise between the algorithm developer and the stakeholder. In this case, several actions can be conducted, such as the anonymization of the code, or the generalization of the method. The principal reason for encouraging the sharing of the source code is that, very often [109], many details of the implementation that have a strong influence on the results are not included in the descriptions provided. Thus, without a reference implementation, many implementations of the same algorithm could deeply differ in their results. Hence, the source code should be shared in a permanent and public repository, such as GitHub, GitLab, Bitbucket, etc., to name a few. If confidentiality is a problem, a contact e-mail could be shared for code sharing requests.

The conjunction of the availability of both the data and the source code of the algorithm is what is called “Open Science” [110–112], and it is an increasingly popular approach to ensuring replicability in science, so that we can make better and better science.
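As an illustration of the post-hoc treatment mentioned in Section 6.4, the following sketch implements the Holm step-down correction over a family of raw p-values (e.g., obtained from pairwise Wilcoxon tests; the values below are hypothetical). Libraries such as statsmodels provide equivalent functionality, but the procedure itself is short:

```python
def holm_adjust(pvalues):
    """Holm step-down correction for a family of m comparisons: the i-th
    smallest p-value is multiplied by (m - i), and monotonicity is enforced
    so adjusted values never decrease along the ordering."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_max = 0.0
    for rank, idx in enumerate(order):
        value = min(1.0, (m - rank) * pvalues[idx])
        running_max = max(running_max, value)  # keep adjusted p-values monotone
        adjusted[idx] = running_max
    return adjusted

# Hypothetical raw p-values from three pairwise Wilcoxon tests
raw = [0.01, 0.04, 0.03]
print(holm_adjust(raw))  # [0.03, 0.06, 0.06]
```

Each adjusted value is then compared against the significance level (e.g., 0.05), keeping the family-wise cumulative error under control as discussed above.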
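The fixed budget of objective function evaluations recommended in Section 6.5 can be enforced with a thin wrapper around the objective function, so that every compared algorithm spends exactly the same number of evaluations regardless of its internal logic. A minimal sketch, with illustrative names not taken from any particular framework:

```python
class BudgetExhausted(Exception):
    """Raised once the agreed evaluation budget has been spent."""

class BudgetedObjective:
    """Wraps an objective function and counts calls, stopping the search
    when the shared budget of function evaluations is exhausted."""
    def __init__(self, fn, max_evals):
        self.fn = fn
        self.max_evals = max_evals
        self.used = 0

    def __call__(self, x):
        if self.used >= self.max_evals:
            raise BudgetExhausted(f"budget of {self.max_evals} evaluations spent")
        self.used += 1
        return self.fn(x)

# Usage: every algorithm under comparison receives an identically sized wrapper
objective = BudgetedObjective(lambda x: sum(v * v for v in x), max_evals=3)
print(objective([1, 2]), objective([0, 1]), objective([3, 0]))  # 5 1 9
# A fourth call would raise BudgetExhausted, ending the run.
```

Counting evaluations rather than wall-clock time makes the comparison independent of the hardware and software factors enumerated above.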
7. Algorithmic deployment for real-world applications

Once we have completed the steps of the Lab Environment phase of Workflow 1, it is time to proceed with the second part of Fig. 1, namely the Application Environment, which is focused on the algorithmic deployment for the real application at hand. As pointed out in Section 3, this phase receives as input either an algorithm implementation taken from an existing software package or an ad-hoc method defined in the algorithmic design step of Workflow 2. In both cases, this implementation should go through a verification process to determine whether it fulfills the functional and non-functional requirements to be deployed in a real environment. If this is not the case, then a new implementation should be addressed.

When facing the development of a metaheuristic to be deployed in a real-world environment, several factors can lead to taking one of the following approaches: to implement the algorithm from scratch or to choose an existing optimization framework. Among these factors, we can consider:

• Programming skills. If the development team has a high level of expertise, then one choice is to make an implementation from scratch, provided the team can afford it. In that case, the written code can be highly optimized, and thus it is more likely to meet the non-functional requirements, particularly those related to performance. The counterpart of this approach is that the code may be difficult to update, extend, and reuse by other people (including the development team itself).
• Using an existing optimization framework. The most productive approach to develop a metaheuristic is to build on existing frameworks. This way, most of the needed algorithmic components may be already provided, so there is no need to reinvent the wheel, and they can offer additional functionality (e.g., visualization, analysis tools, etc.). If a goal of the metaheuristic to be developed is to offer it to the community (in principle, as an open-source contribution), integrating it into a framework is probably the best choice. However, as a possible negative point, using a framework imposes the use of a set of existing base components, so the resulting implementation could not be as efficient as one developed ad-hoc, and thus non-functional requirements related to performance could be affected.
• Corporate development platform. Many companies have a preferred software platform to develop their products, e.g., Java, .NET (C#, Visual Basic), etc., which can impose constraints affecting the implementation of the algorithm, both in the sense of the optimization frameworks that could be used and the availability of third-party libraries. In this sense, programming languages such as Python are becoming very popular due to the large number of existing libraries for data analysis, visualization, and parallel execution.
• Software license. An important issue to consider when using third-party software is the licensing policy. Some licenses, such as GPL (GNU General Public License) or LGPL (GNU Lesser General Public License), can be too restrictive and thus can hinder the adoption of software packages in non-open-source applications. Others, including MIT and Apache, are less restrictive.
• Project activity. If an existing software package is attractive to use, it is important to determine whether the project is still active, which ensures, at least in theory, the possibility of contacting the authors to report bugs found or to ask questions that are not covered in the project’s documentation. There is also the choice of requesting support from the project developers.

Table 2 contains a summary of the main features of a representative set of metaheuristic optimization frameworks. The characteristics reported include the programming language used in the project, the main focus of the framework (most of them include single- and multi-objective algorithms, but they are usually centered on one of them), the software license, and the current version and last update date (at the time of writing this paper).

Attending to the programming language, we observe that Java, Python, and C++ are popular choices, but we also find HeuristicLab and PlatEMO, which are developed in C# and MATLAB, respectively. At first glance, it might be assumed a priori that Python-based frameworks would be computationally inefficient, so if this is a non-functional requirement, then others based on C++ or even Java could be more appropriate. However, Pygmo is in fact based on Pagmo (it is basically a Python wrapper of that package, which becomes a drawback to Python users if they intend to use Pygmo to develop new algorithms), so it can be very competitive in terms of performance. The other frameworks written in Python are considerably slower; for example, if we consider jMetal (Java) and jMetalPy (Python), it can be seen that running the same algorithm with identical settings (e.g., the default NSGA-II algorithm provided in both packages) can take up to fifteen times more computing time in Python than in Java. In return, the benefits of Python for fast prototyping and the large number of libraries available for data analysis and visualization make the frameworks written in this language ideal for testing and fine-tuning.

The orientation of the frameworks towards single- or multi-objective optimization can be a stronger reason to choose a particular package than the programming language. Thus, if the problem at hand is single-objective, then ECJ, HeuristicLab, Pagmo/Pygmo, ParadisEO, or NiaPy offer a wide range of features and algorithms to deal with it. The same applies to the other frameworks concerning multi-objective optimization; in this regard, it is worth mentioning jMetal, which started in 2006 and is still an ongoing project which is continuously evolving, and PlatEMO, which appeared a few years ago and offers more than 100 multi-objective algorithms and more than 200 benchmark problems.

The type of software license can be a key feature that may rule out the choice of a particular package. For example, PlatEMO is free to be used in research works according to its authors, so it is not clear whether it can be used in industrial or commercial applications. In this regard, the first release of jMetal had a GPL license, which was changed a few years later to LGPL and, more recently, to MIT upon request of researchers working in companies that wanted to use the framework in their projects.

When the metaheuristic has been implemented, it is advisable to perform a fine-tuning to improve its performance as much as possible. This process has two dimensions. First, the code can be optimized by applying profiling tools to determine how the available computational resources are distributed among the functions to be optimized. This way, code parts consuming considerable time fractions can be detected, and they can be refactored by rewriting them to make them more efficient. We have to note that metaheuristics consist of a loop where several steps (e.g., selection, variation, evaluation, and replacement in the case of evolutionary algorithms) are repeated thousands or millions of times, so any small improvement in a part of the code can have a high impact on the total computing time.

The second dimension is to adjust the parameter settings of the algorithm to improve its efficacy, which can be carried out by following two main approaches: ad-hoc pilot tests and automatic configuration. The first approach is the most widely used in practice, and it is advisable when having a high degree of expertise; otherwise, it usually turns into a loop of trial and error steps lacking rigor and leading to a waste of much time. The second alternative implies the use of tools for automatic parameter tuning of metaheuristics [126], such as irace [127] and ParamILS [128], although it must be taken into account that the tuning with these kinds of tools can be computationally unaffordable in real-world problems.

At this point, the new implementation should again be verified against the non-functional requirements, which could imply reviewing the implementation in case some of them are not fulfilled. Even then, the metaheuristic may still not be ready to be used in a real environment because of the potential appearance of new non-functional requirements. This situation can happen due to a number of facts, such as the following:
Table 2
Main features of representative multi-objective optimization frameworks. “SO/MO” in column Algorithms stands for single-objective/multi-objective algorithms. If a framework provides both types of algorithms but is more focused on one of them, it is highlighted in boldface.
• Changes in the deployment environment. The real system was not specified in detail when the problem was defined (e.g., the target computing system is not as powerful as previously expected), so there can be a requirement fulfillment degradation that was not observed in the in-lab development.
• The client is satisfied with the results obtained by the metaheuristic, so it is applied to more complex scenarios than expected. Consequently, the quality of the solutions may not be satisfactory, or time constraints can be violated.
• Once the algorithm is running, the domain expert notices new situations that were not taken into account when the functional and non-functional requirements were defined.
• The algorithm is not robust enough, and there may be significant differences in the obtained solutions under similar conditions, which can be confusing for the user.
• In the case of multi-objective problems, providing an accurate Pareto front approximation with a high number of solutions can overwhelm the decision maker if it is merely presented as-is. The algorithm could then be empowered with a high-level visualization component to assist in choosing a particular solution (a posteriori decision making). Even a dynamic preference articulation mechanism could be incorporated to guide the search during the optimization process (interactive decision making).

If the metaheuristic is not compliant with all the new non-functional requirements, it must be analyzed whether they can be fulfilled by readjusting the parameter settings or by carrying out a new implementation; otherwise, it can be necessary to go back again to the research activity or even to the problem description.

8. Summary of lessons learned and recommendations

The final purpose of the methodology discussed heretofore is to avoid several problems, poor practices and practical issues often encountered in projects dealing with real-world optimization problems. As a prescriptive summary of the phases in which the methodology is divided, we herein provide a set of synthesized recommendations that should help even further when following them in prospective studies. Such recommendations are conceptually sketched in Fig. 5, and are listed next:

1. Problem modeling and mathematical formulation:
• It should be strictly mandatory to clearly state the problem objectives, variables and constraints, considering all the practical aspects of the scenario at hand (e.g. users consuming the output, contextual factors affecting the validity of the solution, etc.).
• All non-functional requirements should be exhaustively listed, such as time/memory consumption, accuracy of the solution, the chance to undergo unexpected early interruptions, the usability of the produced solution(s), etc.
• Objectives and/or functional/non-functional requirements should be prioritized as per the criteria of the user.
• The complexity of the problem should be analyzed towards substantiating the need for metaheuristics.
2. Algorithmic design, solution encoding and search operators:
• Baseline models should first be searched for in the literature, past experiences, project reports or any other source of information. If they exist, baseline models should be used first:
  • If any baseline model meets the functional and non-functional requirements, the problem is solved. There is no need for iterating any further.
  • If no baseline model meets the requirements, they must be considered as a starting point to incrementally improve their compliance with the requirements.
• It is advisable to quantify and trace which requirements benefit the most from each algorithmic modification, so that insights are gained about which changes can be more promising in order to improve the compliance with every requirement.
• The complexity of the algorithm must be kept to the minimum required for guaranteeing the requirements, even if the computing technology is capable of running it efficiently. This allows minimizing risks during the deployment of the algorithm.
• When designing the encoding strategy, population structure and search operators, it is necessary to gauge, when possible, their impact on the degree of fulfillment of the imposed requirements, so that their design becomes coupled to them.
• Validated algorithmic design templates should always be preferred rather than overly sophisticated algorithmic components.
• Expert knowledge acquired over years of observation of the system/asset to be optimized should always be leveraged in the algorithmic design.
3. Performance assessment, comparison and replicability:
• Baseline models selected in the previous phase should always be included in the benchmark.
• Quantitative metrics must be defined and measured for all functional and non-functional requirements.
• Variability of scenarios: when the problem at hand can be configured as per a number of parameters, as many problem configurations as possible should be created and evaluated to account for the diversity of scenarios that the algorithm(s) can encounter in practice.
• For the sake of fairness in the comparisons, parameter tuning must be enforced in all the algorithms of the benchmark (including the baseline ones). Furthermore, statistical tests should be applied to ensure that the gaps among the performance of the algorithms are indeed relevant.
• User in the loop: results should be reported comprehensively to ease the decision making process of the end user. It is better to
Fig. 5. Main recommendations given for every phase of our proposed methodology.
provide several solutions at this phase than in deployment. Furthermore, new requirements often emerge when the user evaluates the results by him/herself.
   • When soft constraints are considered, the level of constraint fulfillment of the solutions should also be reported to the user.
   • If confidentiality allows it, it is always good and enriching to publish code and results in public repositories.

4. Algorithmic deployment for real-world applications:
   • Parameter tuning of the selected metaheuristic algorithm is a must before proceeding further, so that any eventual performance degradation between the laboratory and the real environment is only due to contextual factors.
   • The degradation in the fulfillment of the requirements when in-lab developments are deployed on the production environment must be quantified and carefully assessed. If needed, a redesign of the algorithm can be enforced to reduce this risk, always departing from the identified cause of the observed degradation.
   • Good programming skills (optimized, modular code, with comments and exception handling) are key for an easy update, extension, and reuse of the developed code for future purposes.
   • When possible, open-source software frameworks should be selected for the development of the algorithm to be deployed, in order to ensure productivity and community support.
   • Hard constraints imposed by corporate development platforms on the implementation language should be taken into account.
   • Straightforward mechanisms to change the parameters of the algorithm should be implemented.
   • Efforts should be devoted to the visualization of the algorithm's output. How can the solution be made more valuable for the user? Unless a proper answer to this question is given as per the expertise and cognitive profile of the user, this can be a major issue in real-world optimization problems, especially when the user at hand has no technical background whatsoever.

9. Research trends in real-world optimization with metaheuristics

Although the optimization research field has dealt with real-world problems throughout its long life, the diversity and increasing complexity of the scenarios in which such problems are formulated in practice have stimulated a plethora of new research directions over the years aimed to manage their different particularities efficiently. In this section, we highlight several challenges and research directions that, given the current state of the art, we consider of utmost relevance for prospective studies in the confluence of real-world optimization and metaheuristics. Our envisioned future for the field is summarized in Fig. 6, and elaborated in what follows.

Fig. 6. Challenges and research directions foreseen for real-world optimization with metaheuristics.

9.1. Robust optimization and worst-case analysis

In real-world optimization scenarios, many sources of uncertainty may arise, from exogenous variables of the environment that cannot be measured and are not considered in the formulation, to the collected data that can participate in the definition of the objective function(s) and/or constraint(s). Furthermore, it is often the case that, in practice, the user consuming the solution given to the problem is willing to impose worst-case constraints, assuming that such sources of uncertainty cannot be counteracted. In fact, identifying the worst conditions under which the optimization problem can be formulated is usually much easier for the user than deriving efficient strategies to accommodate the uncertainty of the setup. This issue amplifies when tackling the problem at hand with metaheuristics, since the optimization algorithm itself induces an additional source of epistemic uncertainty that may compromise the requisites imposed on the worst case.

This situation unleashes a formidable future for robust optimization, which aims at the design of metaheuristic solvers for problems in which uncertainty is considered explicitly in the formulation [129]. Initially addressed with tools from mathematical programming, robust optimization has also been studied with metaheuristics, with different approximations to account for uncertainty during the search [130,131]. In this context, a core concept in robust optimization is the level of conservativeness demanded by the user, namely, the level of protection of the solution against the uncertainty of the problem [132]. This issue is crucial in real-world optimization, especially for its connection with the notion of risk in circumstances in which decision variables relate to assets that require human intervention. Depending on the implications of implementing the solution in practice, the user might prefer less optimal, albeit safer solutions. For instance, in manufacturing, it is often advisable to be conservative when operating a human-intervened drilling machine. If this operation were to be automated via a metaheuristic
algorithm, the solution should ensure a high level of conservativeness with respect to the normal operation of the machine, so as not to engender risks for the operator and/or exposed persons.

Besides advances reported in this line, we advocate for the analysis of solutions with different quality and levels of conservativeness with respect to the sources of uncertainty existing in the real-world scenario. This analysis can be realized by considering conservativeness as an additional objective function to be minimized, so that multi-objective metaheuristics can be designed to yield an approximation of the Pareto front by considering quality and risk [133,134]. When provided with this Pareto front approximation, the user can appraise the implications of the uncertainty on the quality of solutions for the problem, and select the solution with the most practical utility bearing in mind both objectives. We definitely foresee an increasing relevance of risk in real-world optimization problems, considering the progressively higher prevalence of automated means to solve them efficiently in real life.

9.2. Translating real requirements into optimization problems

A few works have lately revolved around the methodological procedure for formulating an optimization problem, including the definition of its variables and constraints. Assorted tools have very recently been proposed for this purpose, including directed questionnaires [135] and algebraic modeling languages to describe optimization problems (see [136] and references therein). Despite these tools, in practical cases there is a large semantic gap between what the user consuming the solution to the problem truly needs and what the scientist designing the optimization algorithm understands. Leaving aside non-technicalities that could potentially open this gap, the two factors that contribute most to widening it are 1) the capability of the user and the scientist to plunge into each other's discourse, progressively coming to a point of agreement on what is needed; and 2) the capability of the scientist to effectively translate requirements from the application domain into algorithmic clauses.

The first factor depends roughly on both parties' social and empathetic skills, especially those of the scientist, who must understand the overall application context in which the problem is framed. A proper understanding of the problem, along with discussions held with the practitioner, can eventually unveil useful insights and hints that help in the problem formulation and the design of the algorithm. For this to occur, the scientist must assimilate all the details concerning the asset/system to be optimized, especially when the objective to be optimized and the imposed constraints cannot be analytically determined.

The second factor promoting the aforementioned semantic gap is more related to the methodology used for the translation between requirements and problem formulation. The issue emerging at this point is whether this process can be enclosed within a unified methodology that comprises all questions and decision steps to be followed for formulating a real-world optimization problem. Unfortunately, the question remains unanswered in the literature, and current practices evince that the definition of a real-world optimization problem is largely ad-hoc and subject to the expertise and modeling skills of the scientist. For instance, in most cases, the number of constraints imposed on a particular problem restricts the search space severely, to the point of modeling it as a constraint satisfaction problem in which the only goal of the solver is to produce a feasible solution. Metaheuristics suitable to deal with constraint satisfaction problems differ from those used for the optimization of an objective function (both single- and multi-objective), as the potential sparsity of the space where feasible solutions are located may call for an extensive use of explorative search operators and diversity-inducing mechanisms. However, there is no clear criterion for shifting in practice to this paradigm. Furthermore, the presence of multiple global optima (the so-called multi-modality of the problem's landscape) can be a critical factor for the design of the algorithm. Unless carefully considered from the very inception of the problem, multi-modality can give rise to solutions of no practical use due to non-modeled externalities that discriminate which solutions can be found in practice.

All in all, prospective literature works on real-world optimization should not restrict their coverage to just the presentation of the problem and the design and validation of their algorithms. Explanations should also be given on the process by which the formulation of the problem was inferred from the scenario under analysis. In real-world optimization problems, information about the process is almost as valuable as the result itself, inasmuch as the community can largely benefit from innovative methodological practices that can be adopted in other problems.

9.3. Hybridization of mathematical tools with metaheuristic algorithms

When addressing real-world optimization problems with metaheuristics, another relevant direction is the hybridization of these algorithms with methods from other disciplines for improved performance of the search process. Such an opportunity arises when the conditions under which the problem is formulated allow for the consideration of additional tools towards enhancing the convergence and/or quality of the solutions elicited by the metaheuristic algorithms. Therefore, the chance to opt for hybrid metaheuristic algorithms is bounded to the case under study and the functional and non-functional requirements imposed thereon, as the incorporation of new search steps in the algorithm might penalize the computational time, increase the memory consumption, entail the purchase of third-party software, or impose other similar demands.
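To make the trade-off above tangible, the sketch below couples a plain evolutionary loop with a greedy coordinate-descent refinement of the incumbent solution, i.e., the kind of extra search step whose benefit must be weighed against its additional runtime. This is a minimal illustrative sketch under our own naming (`memetic_search`, `local_search`, the toy sphere objective), not a solver taken from the surveyed literature.

```python
import random

def sphere(x):
    """Toy objective with a known analytical form (illustrative only)."""
    return sum(v * v for v in x)

def local_search(fitness, x, step=0.25, iters=20):
    """Greedy coordinate descent: a cheap refinement step that becomes
    available when the objective can be evaluated explicitly."""
    best, best_f = list(x), fitness(x)
    for _ in range(iters):
        improved = False
        for i in range(len(best)):
            for delta in (step, -step):
                cand = list(best)
                cand[i] += delta
                f = fitness(cand)
                if f < best_f:
                    best, best_f, improved = cand, f, True
        if not improved:
            step /= 2.0  # shrink the neighbourhood once no move improves
    return best, best_f

def memetic_search(fitness, dim=4, pop_size=16, generations=40, seed=1):
    """Hybrid (memetic) loop: evolutionary search plus local refinement
    of the incumbent at every generation."""
    rng = random.Random(seed)
    pop = [[rng.uniform(-5, 5) for _ in range(dim)] for _ in range(pop_size)]
    for _ in range(generations):
        # Gaussian mutation produces the offspring
        offspring = [[v + rng.gauss(0, 0.3) for v in ind] for ind in pop]
        ranked = sorted(pop + offspring, key=fitness)
        pop = ranked[:pop_size]  # truncation (mu + lambda) survival
        # Hybrid step: refine the best individual with local search
        refined, _ = local_search(fitness, pop[0])
        pop[0] = refined
    return pop[0], fitness(pop[0])

best, value = memetic_search(sphere)
```

Removing the `local_search` call recovers the pure metaheuristic, which makes it easy to quantify whether the hybrid step pays for its extra fitness evaluations on a given problem.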
One example of this hybridization is the exploitation of explicit formulae defining the objectives and/or constraints. There are plenty of programming methods that can be utilized when the definition of the fitness and constraints complies with certain assumptions, such as a linear or quadratic relationship with the optimization variables. When this is the case, swarm and evolutionary methods for real-world optimization should make use of the aforementioned tools, even if a mathematical formulation of the optimization problem is available. Indeed, if the requirements of the real-world problem under analysis aim at the computational efficiency of the search process, the scientist should do his/her best to benefit from the equations. Unfortunately, this hybridization is not effectively done as per the current state of the art in Evolutionary Algorithms and Swarm Intelligence. Prior work can be found around the exploitation of gradient knowledge of the optimization problem to accelerate local search and ensure feasibility more efficiently in continuous optimization problems [137]. Domain-specific knowledge is also key for a tailored design of the encoding strategy and other elements of the metaheuristic algorithm [138], which in some cases can be inspired by the mathematical foundations of the problem. Search methods capitalizing on the combination of mathematical programming techniques and metaheuristics have been collectively referred to as matheuristics [139], spawning a flurry of academic contributions in recent years over a series of dedicated workshops.

In this context, an interesting research path to follow is variable reduction, which can alleviate the computational complexity of the search process by inferring relationships among the system of equations describing a given problem [140]. As pointed out in this and other related works, a large gap is still to be bridged to extrapolate these findings to real-world optimization problems lacking properties such as differentiability and continuity. Nevertheless, workarounds can be adopted to infer such relationships and enable variable reduction during the search process, such as approximate means to detect such relationships (via e.g., neural or Bayesian networks). Interestingly, reducing part of the variables involved in an optimization problem can bring along an increased complexity of the remaining variables. All this paves the way to integrating variable reduction with traditional mathematical programming methods for constrained optimization, such as Newton or interior-point methods.

We certainly identify a promising future for the intersection between metaheuristics and traditional mathematical programming methods, especially when solving real-world problems with accurate mathematical equations available. As a matter of fact, several competitions are organized nowadays for the community to share and elaborate on new approaches along this research line. For instance, the competitions on real-world single-objective constrained optimization held at different venues (CEC 2020, SEMCCO 2020, and GECCO 2020) consider a set of 57 real-world constrained problems [95]. In these competitions, participants are allowed to use the constraint equations to design the search algorithm. Another example around real-world bound constrained problems can be found in [141]. In short, we foresee that metaheuristic algorithms hybridized with mathematical programming techniques will become central in future studies related to real-world optimization.

9.4. Meta-modeling for real-world optimization

When dealing with physical assets/systems, the evaluation of the quality and/or feasibility of solutions produced over the metaheuristic search can be realized by complex simulation environments mapping the decision variables at their input to the values dictating their fitness/compliance with constraints. The use of digital twins in manufacturing or the design of structures in civil engineering are two recent examples of simulation environments that serve as a computational representation of large-scale complex systems, for which an analytical formulation of all their components and interrelations cannot be easily stated.

From an optimization perspective, the use of simulators for problem solving (simulation-based optimization or simheuristics [142]) constitutes a straightforward approach to circumvent an issue that appears recurrently in real-world problems: the impossibility of formulating objective functions and constraints in mathematical form. Furthermore, depending on its faithfulness with respect to the modeled asset/system, the use of simheuristics in real-world optimization can also account for the uncertainty present in non-deterministic application scenarios under analysis in a scalable fashion. Consequently, the adoption of simulation-based optimization can ease the quantification of the risk incurred by candidate solutions and alternative hypotheses [143], which connects back to our prospects around the importance of risk in real-world optimization (Subsection 9.1).

In this context, a research line with a long history in metaheuristic optimization is the use of machine learning surrogates [144]. Solvers under this paradigm resort to data-based regression techniques to infer the relationship between the decision variables and their objective function value, so that, once learned, the evaluation of new candidates for the problem at hand can be efficiently performed by querying the trained regression model [145]. Although alleviating the computational complexity of the solver is arguably the most extended use of surrogates in metaheuristic optimization, another vein of literature has stressed the valuable information that surrogates can feed to the search algorithm for improving its convergence. Possibilities for this purpose are diverse, including the evaluation and removal of poor solutions when initializing the population of the metaheuristic algorithm, or the implementation of informed operators that reduce their level of randomness with respect to naïve metaheuristic implementations [146].

Disregarding the specific model combined with the metaheuristic algorithm (simulation or machine learning surrogates), several problems arise when resorting to these meta-modeling approaches in real-world problems. To begin with, very few works have elaborated on scalable meta-modeling approaches capable of implementing different modeling granularities, each balancing differently between the fidelity of the meta-model with respect to the modeled asset/system and the computational complexity of the model when queried with a certain input. This trade-off and the challenges stemming therefrom have been widely identified in the related literature [143]. We herein underscore the need for further strategies to develop scalable meta-models with varying levels of complexity and fidelity. New advances in this line should marry up with achievements in asynchronous parallel computing, especially when several meta-models are considered jointly, each requiring different complexity levels. This is actually another reason why the prescription of non-functional requirements is of utmost importance in real-world optimization: unless properly accounted for from the very beginning, sophisticated meta-models can be of no use if the available computing resources do not fulfill such requirements in practice.

Another issue that remains insufficiently addressed to date is how to prevent surrogates from overfitting, especially in problems characterized by many decision variables that are tackled by using complex modeling approaches (e.g., Deep Learning). Under such conditions, and depending on the availability of evaluated examples at the beginning of the search, the learning algorithm might have only a few high-dimensional examples available for training the surrogate. This could eventually dominate the learning process and hinder the generalization of the trained model to unseen candidate solutions. Regularization approaches have been extensively suggested to deal with this problem, especially with linear models and neural networks [147,148]. However, we feel that further research can be pursued towards regularization approaches that, besides overfitting, provide a countermeasure for another serious problem derived from overly complex surrogates: the existence of virtual optima, i.e., optima that do not exist in the original problem under analysis. When this is the case, regularized ensembles and archiving strategies can be effective solutions to both overfitting and virtual optima.

Finally, we briefly pause at the explainability of machine learning models, which currently capitalizes most research contributions
reported under the eXplainable Artificial Intelligence (XAI) paradigm [149]. XAI refers to all techniques aimed at eliciting interpretable information about the knowledge learned by a model. In the case of black-box surrogate models, XAI must be conceived not only as a driver for acceptability but also as a tool to provide hints for the design of the optimization algorithm. For instance, post-hoc XAI methods can be used to unveil what the surrogate observes in its input (decision variables) to produce a given output (estimated fitness value), so that a global understanding of the decision variables that most correlate with the fitness value can be obtained. This augmented information about the search process can boost the acceptability of the solution provided by the overall surrogate-assisted metaheuristic by a non-expert user. Causal analysis for machine learning models [150,151] can take a step further in this direction, discriminating which decision variables, when modified throughout the search, lead to major changes in the fitness value. These studies can unchain new forms of informed search operators that seize implicit causal relationships between variables and fitness during the search.

9.5. Automating algorithm selection and parameter tuning

We round up our prospects with a mention of parameter tuning, which is arguably among the main reasons for the differences appearing between the in-lab design of a metaheuristic algorithm and its deployment in a real-world environment. Indeed, the complexity of real-world scenarios can lead to incomplete/oversimplified formulations that do not fully represent the diversity of contextual factors affecting the problem. Furthermore, the problem itself can be dynamic, so that fitness and/or constraints can evolve over time. If this variability is neither considered in the definition of the problem nor resolved during the design of the metaheuristic algorithm (by means of, e.g., dynamic optimization approaches), differences will likely emerge when deploying the metaheuristic in practice. The methodology presented herein contemplates this issue by enforcing a fine-grained tuning of parameters right at the start of deployment in the application environment, so that the effects of any contextual bias on the compliance with functional and non-functional requirements can be minimized. However, accounting for parameter tuning in our methodology does not play down the fact that parameter tuning is a time-consuming process, especially when the search for a satisfactory parametric configuration of the solver takes into account a mixture of functional and non-functional requirements.

Fortunately, in this context, the research community has left behind ancient practices in parameter tuning, wherein metaheuristic algorithms were configured in a trial-and-error fashion, or by reusing the configurations utilized in similar studies. This procedure is by no means acceptable in academic works comparing metaheuristics, nor should it be the case in real-world optimization. Vigorous research is nowadays concentrated on the derivation of new algorithmic means to automate the process of adjusting the parameters of metaheuristic solvers, either during the search process (self-adaptation mechanisms for parameter control) or as a separate off-line process performed before the configured metaheuristic is actually executed (parameter tuning). Both approaches can actually be applied to real-world problems, yet the recent activity noted in the field is steering more notably towards parameter tuning approaches due to their independence with respect to the metaheuristic algorithm to be adjusted. In any case, when used in real-world optimization, automated parameter tuning methods can not only ease this process for non-expert users, but also perform it more efficiently than grid search methods.

Automated parameter tuning has so far provided a rich substrate of methods and software frameworks, mature enough for their early adoption to cope with real-world problems of realistic complexity [152]. Our claim in this matter is that the flexibility of current automated parameter tuning frameworks is limited, and leaves aside non-functional requirements that often emerge in real-world environments. Most of them focus on optimality, i.e., on finding a configuration of the metaheuristic algorithm that performs best as per the objective function(s) of the problem at hand. In many situations, this goal suffices for the interest of the user. However, we utterly believe that other aspects should also be reflected in this process, such as implementation complexity (time, memory), simplicity of search operators, and robustness of the configured algorithm against factors inducing uncertainty in the definition of the problem. All in all, functional and non-functional requirements of real-world problems are also affected by the parametric configuration of metaheuristics, so there is a pressing need for embedding metrics that quantify such non-functional requirements in existing frameworks.

Finally, we dedicate some closing words to the field of meta-learning, which is understood as the family of methods aimed at inferring a potentially good algorithm for a given problem without actually addressing it, namely, just by the similarities of the problem with others tackled in the past [153,154]. For this purpose, meta-learning methods for optimization problems usually hinge on the extraction of meta-features from the given problem, which are then used as inputs of a supervised learning model that recommends the best algorithm [155]. In other words, meta-learning approaches automate the same task as automated parameter tuning methods, but without the computational complexity required by the latter to evaluate multiple candidate solutions representing the algorithm and/or its parameter values. Studies on meta-learning for the recommendation of metaheuristic algorithms have so far been centered on instances of a few classical optimization problems (e.g., traveling salesman [156], vehicle routing [157] or flow-shop scheduling [158]). In those works, meta-features are extracted from graph representations of the problem under analysis, or from the analysis of their fitness landscapes. However, such meta-features are largely problem-dependent, which leaves an open question on whether such meta-features can attain a good generalization performance of the meta-learner when facing real-world problems, in which problem formulations can be much more diverse in practice. Furthermore, the discovery of alternative meta-feature extraction methods can pave the way to the consideration of meta-learning methods as a first step towards the automated construction of optimization ensembles, which are known to be less sensitive to the parametric configuration of their constituent solvers than single metaheuristics. Moreover, ensembles of methods can also be used to identify which operator, parameter value or algorithmic component is effective for a particular problem in a competitive way, and in just one step [159]. We see a fascinating opportunity for meta-learning in real-world optimization with metaheuristics, sparking many research directions for achieving higher degrees of intelligent design automation such as the ones reviewed heretofore.

10. Conclusions and outlook

In this tutorial, we have proposed an end-to-end methodology for addressing real-world optimization problems with metaheuristic algorithms. Our methodology covers from the identification of the optimization problem itself to the deployment of the metaheuristic algorithm, including the determination of functional and non-functional requirements, the design of the metaheuristic itself, validation, and benchmarking. Each step comprising our methodology has been explained in detail, along with an enumeration of the technical aspects that should be considered by both the scientist designing the algorithm and the user consuming its output. Recommendations are also given for newcomers to avoid misconceptions and bad practices observed in the literature related to real-world optimization.

We have complemented our prescribed methodology with a set of challenges and research directions which, according to our experience and assessment of the current status of the field, should drive efforts in years to come. Specifically, our vision gravitates around four different domains:

• The consideration of risk as an additional objective to be minimized, and the massive adoption of robust optimization techniques, given
[29] O. Bräysy, M. Gendreau, Vehicle routing problem with time windows, part I: route construction and local search algorithms, Transportation Science 39 (1) (2005) 104–118.
[30] E. Osaba, R. Carballedo, F. Diaz, E. Onieva, A.D. Masegosa, A. Perallos, Good practice proposal for the implementation, presentation, and comparison of metaheuristics for solving routing problems, Neurocomputing 271 (2018) 2–8.
[31] K. Eggensperger, M. Lindauer, F. Hutter, Pitfalls and best practices in algorithm configuration, Journal of Artificial Intelligence Research 64 (2019) 861–893.
[32] A.E. Eiben, M. Jelasity, A critical note on experimental research methodology in EC, in: Proceedings of the 2002 Congress on Evolutionary Computation, CEC'02 (Cat. No. 02TH8600), volume 1, IEEE, 2002, pp. 582–587.
[33] M. Črepinšek, S.-H. Liu, M. Mernik, Replication and comparison of computational experiments in applied evolutionary computing: common pitfalls and guidelines to avoid them, Appl Soft Comput 19 (2014) 161–170.
[34] A. LaTorre, D. Molina, E. Osaba, J. Del Ser, F. Herrera, Fairness in bio-inspired optimization research: a prescription of methodological guidelines for comparing meta-heuristics, arXiv preprint arXiv:2004.09969 (2020).
[35] N. Hansen, A. Auger, D. Brockhoff, D. Tušar, T. Tušar, COCO: performance assessment, arXiv preprint arXiv:1605.03560 (2016).
[36] J. Edmonds, Definition of Optimization Problems, Cambridge University Press, 2008, pp. 171–172.
[37] V. Huang, A.K. Qin, K. Deb, E. Zitzler, P.N. Suganthan, J. Liang, M. Preuss, S. Huband, Problem definitions for performance assessment of multi-objective optimization algorithms, Technical Report, School of EEE, Nanyang Technological University, 2007.
[38] R. Kumar, Research Methodology: A Step-by-Step Guide for Beginners, Sage Publications Limited, 2010.
[39] W. Jie, J. Yang, M. Zhang, Y. Huang, The two-echelon capacitated electric vehicle routing problem with battery swapping stations: formulation and efficient methodology, Eur J Oper Res 272 (3) (2019) 879–904.
[40] M. Delorme, M. Iori, S. Martello, Bin packing and cutting stock problems: mathematical models and exact algorithms, Eur J Oper Res 255 (1) (2016) 1–20.
[41] M. Glinz, On non-functional requirements, in: 15th IEEE International Requirements Engineering Conference (RE 2007), IEEE, 2007, pp. 21–26.
[42] S. Robertson, J. Robertson, Mastering the Requirements Process: Getting Requirements Right, Addison-Wesley, 2012.
[43] I. Sommerville, Software Engineering, Addison-Wesley, Harlow, UK (2001).
[44] M. Davis, Software requirements: objects, functions & states (1993).
[45] E. Coffman, M. Garey, D. Johnson, Approximation Algorithms for NP-Hard Problems, 1996, pp. 46–93.
[46] K. Lange, D.R. Hunter, I. Yang, Optimization transfer using surrogate objective
[64] S. Salcedo-Sanz, M. Prado-Cumplido, F. Pérez-Cruz, C. Bousoño-Calzón, Feature selection via genetic optimization, in: International Conference on Artificial Neural Networks, Springer, 2002, pp. 547–552.
[65] S. Salcedo-Sanz, G. Camps-Valls, F. Pérez-Cruz, J. Sepúlveda-Sanchis, C. Bousoño-Calzón, Enhancing genetic feature selection through restricted search and Walsh analysis, IEEE Transactions on Systems, Man, and Cybernetics, Part C (Applications and Reviews) 34 (4) (2004) 398–406.
[66] S. Salcedo-Sanz, J. Su, Improving metaheuristics convergence properties in inductive query by example using two strategies for reducing the search space, Computers & Operations Research 34 (1) (2007) 91–106.
[67] A. Gupta, Y.-S. Ong, L. Feng, Multifactorial evolution: toward evolutionary multitasking, IEEE Trans. Evol. Comput. 20 (3) (2015) 343–357.
[68] A. Gupta, Y.-S. Ong, L. Feng, Insights on transfer optimization: because experience is the best teacher, IEEE Transactions on Emerging Topics in Computational Intelligence 2 (1) (2017) 51–64.
[69] S. Kirkpatrick, C.D. Gelatt, M.P. Vecchi, Optimization by simulated annealing, Science 220 (4598) (1983) 671–680.
[70] F. Glover, M. Laguna, Tabu search, in: Handbook of Combinatorial Optimization, Springer, 1998, pp. 2093–2229.
[71] E. Alba, Parallel Metaheuristics: A New Class of Algorithms, volume 47, John Wiley & Sons, 2005.
[72] E. Alba, G. Luque, S. Nesmachnow, Parallel metaheuristics: recent advances and new trends, International Transactions in Operational Research 20 (1) (2013) 1–48.
[73] E. Atashpaz-Gargari, C. Lucas, Imperialist competitive algorithm: an algorithm for optimization inspired by imperialistic competition, in: 2007 IEEE Congress on Evolutionary Computation, IEEE, 2007, pp. 4661–4667.
[74] G. Luque, E. Alba, Parallel Genetic Algorithms: Theory and Real World Applications, volume 367, Springer, 2011.
[75] E. Cantú-Paz, A survey of parallel genetic algorithms, Calculateurs paralleles, reseaux et systems repartis 10 (2) (1998) 141–171.
[76] D. Karaboga, B. Basturk, Artificial bee colony (ABC) optimization algorithm for solving constrained optimization problems, in: International Fuzzy Systems Association World Congress, Springer, 2007, pp. 789–798.
[77] X.-S. Yang, S. Deb, Cuckoo search via Lévy flights, in: 2009 World Congress on Nature & Biologically Inspired Computing (NaBIC), IEEE, 2009, pp. 210–214.
[78] S. Das, S. Maity, B.-Y. Qu, P.N. Suganthan, Real-parameter evolutionary multimodal optimization: a survey of the state-of-the-art, Swarm Evol Comput 1 (2) (2011) 71–88.
[79] X.-S. Yang, Firefly algorithms for multimodal optimization, in: International Symposium on Stochastic Algorithms, Springer, 2009, pp. 169–178.
[80] R. Sivaraj, T. Ravichandran, A review of selection methods in genetic algorithm, In-
functions, Journal of computational and graphical statistics 9 (1) (2000) 1–20. ternational journal of engineering science and technology 3 (5) (2011) 3792–3797.
[47] A. Spagnol, R.L. Riche, S.D. Veiga, Global sensitivity analysis for optimization with [81] A. Prakasam, N. Savarimuthu, Metaheuristic algorithms and probabilistic be-
variable selection, SIAM/ASA Journal on Uncertainty Quantification 7 (2) (2019) haviour: a comprehensive analysis of ant colony optimization and its variants, Artif
417–443, doi:10.1137/18m1167978. Intell Rev 45 (1) (2016) 97–130.
[48] S. Boyd, S.P. Boyd, L. Vandenberghe, Convex optimization, Cambridge university [82] S. Ólafsson, Metaheuristics, Handbooks in operations research and management
press, 2004. science 13 (2006) 633–654.
[49] B. Ponton, A. Herzog, A. Del Prete, S. Schaal, L. Righetti, On time optimization [83] P. Kerschke, H.H. Hoos, F. Neumann, H. Trautmann, Automated algorithm selec-
of centroidal momentum dynamics, in: 2018 IEEE International Conference on tion: survey and perspectives, Evol Comput 27 (1) (2019) 3–45.
Robotics and Automation (ICRA), 2018, pp. 5776–5782. [84] D.H. Wolpert, W.G. Macready, et al., No free lunch theorems for search, Technical
[50] S. Wright, The roles of mutation, inbreeding, crossbreeding, and selection in evo- Report, Technical Report SFI-TR-95-02-010, Santa Fe Institute, 1995.
lution, volume 1, na, 1932. [85] G. Iacca, F. Neri, E. Mininno, Y.-S. Ong, M.-H. Lim, Ockham’S razor in memetic
[51] C.M. Reidys, P.F. Stadler, Combinatorial landscapes, SIAM Rev. 44 (1) (2002) 3–54. computing: three stage optimal memetic exploration, Inf Sci (Ny) 188 (2012)
[52] E. Pitzer, M. Affenzeller, A Comprehensive Survey on Fitness Landscape Anal- 17–43.
ysis, in: Recent advances in intelligent engineering systems, Springer, 2012, [86] F. Caraffini, G. Iacca, F. Neri, E. Mininno, Three variants of three stage optimal
pp. 161–191. memetic exploration for handling non-separable fitness landscapes, in: 2012 12th
[53] P. Merz, B. Freisleben, et al., Fitness landscapes and memetic algorithm design, UK Workshop on Computational Intelligence (UKCI), IEEE, 2012, pp. 1–8.
New ideas in optimization (1999) 245–260. [87] C. Cotta, M. Sevaux, K. Sörensen, Adaptive and multilevel metaheuristics, volume
[54] S. Ronald, Robust encodings in genetic algorithms: A survey of encoding issues, in: 136, Springer, 2008.
Proceedings of 1997 IEEE International Conference on Evolutionary Computation [88] J.R. Woodward, J. Swan, Automatically designing selection heuristics, in: Proceed-
(ICEC’97), IEEE, 1997, pp. 43–48. ings of the 13th annual conference companion on Genetic and evolutionary com-
[55] E.-G. Talbi, Metaheuristics: from design to implementation, volume 74, John Wiley putation, 2011, pp. 583–590.
& Sons, 2009. [89] J.R. Woodward, J. Swan, The automatic generation of mutation operators for ge-
[56] U.K. Chakraborty, C.Z. Janikow, An analysis of gray versus binary encoding in netic algorithms, in: Proceedings of the 14th annual conference companion on Ge-
genetic search, Inf Sci (Ny) 156 (3–4) (2003) 253–269. netic and evolutionary computation, 2012, pp. 67–74.
[57] C. Bierwirth, D.C. Mattfeld, H. Kopfer, On permutation representations for schedul- [90] Q. Liu, W.V. Gehrlein, L. Wang, Y. Yan, Y. Cao, W. Chen, Y. Li, Paradoxes in nu-
ing problems, in: International Conference on Parallel Problem Solving from Na- merical comparison of optimization algorithms, IEEE Trans. Evol. Comput. 24 (4)
ture, Springer, 1996, pp. 310–318. (2020) 777–791, doi:10.1109/TEVC.2019.2955110.
[58] J.C. Bean, Genetic algorithms and random keys for sequencing and optimization, [91] R. Tanabe, H. Ishibuchi, An easy-to-use real-world multi-objective optimization
ORSA journal on computing 6 (2) (1994) 154–160. problem suite, Appl Soft Comput 89 (2020) 106078.
[59] F. Rothlauf, Representations for Genetic and Evolutionary Algorithms, in: Repre- [92] R. Cheng, M. Li, Y. Tian, X. Zhang, S. Yang, Y. Jin, X. Yao, A benchmark test suite
sentations for Genetic and Evolutionary Algorithms, Springer, 2006, pp. 9–32. for evolutionary many-objective optimization, Complex & Intelligent Systems 3 (1)
[60] P. Larranaga, C.M.H. Kuijpers, R.H. Murga, I. Inza, S. Dizdarevic, Genetic algo- (2017) 67–81.
rithms for the travelling salesman problem: a review of representations and oper- [93] W. Chen, H. Ishibuchi, K. Shang, Proposal of a realistic many-objective test suite,
ators, Artif Intell Rev 13 (2) (1999) 129–170. in: International Conference on Parallel Problem Solving from Nature, Springer,
[61] M. Dorigo, G. Di Caro, Ant colony optimization: a new meta-heuristic, in: Pro- 2020, pp. 201–214.
ceedings of the 1999 congress on evolutionary computation-CEC99 (Cat. No. [94] C. Picard, J. Schiffmann, Realistic constrained multi-objective optimization bench-
99TH8406), volume 2, IEEE, 1999, pp. 1470–1477. mark problems from design, IEEE Trans. Evol. Comput. (2020).
[62] C. Blum, M. Sampels, Ant colony optimization for fop shop scheduling: a case study [95] A. Kumar, G. Wu, M.Z. Ali, R. Mallipeddi, P.N. Suganthan, S. Das, A test-suite
on different pheromone representations, in: Proceedings of the 2002 Congress on of non-convex constrained optimization problems from the real-world and some
Evolutionary Computation. CEC’02 (Cat. No. 02TH8600), volume 2, IEEE, 2002, baseline results, Swarm Evol Comput (2020) 100693.
pp. 1558–1563. [96] C. He, Y. Tian, H. Wang, Y. Jin, A repository of real-world datasets for data-driven
[63] E. Osaba, J. Del Ser, A.J. Nebro, I. Laña, M.N. Bilbao, J.J. Sanchez-Medina, evolutionary multiobjective optimization, Complex & Intelligent Systems (2019)
Multi-objective optimization of bike routes for last-mile package delivery with 1–9.
drop-offs, in: 2018 21st International Conference on Intelligent Transportation Sys- [97] Y. Lou, S.Y. Yuen, On constructing alternative benchmark suite for evolutionary
tems (ITSC), IEEE, 2018, pp. 865–870. algorithms, Swarm Evol Comput 44 (2019) 287–292.
E. Osaba, E. Villar-Rodriguez, J. Del Ser et al. Swarm and Evolutionary Computation 64 (2021) 100888
[98] H. Ishibuchi, Y. Peng, K. Shang, A scalable multimodal multiobjective test problem, in: 2019 IEEE Congress on Evolutionary Computation (CEC), IEEE, 2019, pp. 310–317.
[99] M.N. Omidvar, X. Li, K. Tang, Designing benchmark problems for large-scale continuous optimization, Inf Sci (Ny) 316 (2015) 419–436.
[100] J.J. Moré, S.M. Wild, Benchmarking derivative-free optimization algorithms, SIAM J. Optim. 20 (1) (2009) 172–191, doi:10.1137/080724083.
[101] Q. Liu, W.-N. Chen, J.D. Deng, T. Gu, H. Zhang, Z. Yu, J. Zhang, Benchmarking stochastic algorithms for global optimization problems by visualizing confidence intervals, IEEE Trans Cybern 47 (9) (2017) 2924–2937, doi:10.1109/TCYB.2017.2659659.
[102] A. LaTorre, S. Muelas, J.M. Peña, A MOS-based dynamic memetic differential evolution algorithm for continuous optimization: a scalability test, Soft Computing - A Fusion of Foundations, Methodologies and Applications 15 (11) (2010) 2187–2199, doi:10.1007/s00500-010-0646-3.
[103] A. Herrera-Poyatos, F. Herrera, Genetic and memetic algorithm with diversity equilibrium based on greedy diversification, CoRR abs/1702.03594 (2017).
[104] M. Črepinšek, S.H. Liu, M. Mernik, Exploration and exploitation in evolutionary algorithms: a survey, ACM Comput Surv 45 (3) (2013) 1–33, doi:10.1145/2480741.2480752.
[105] T.J. McCabe, A complexity measure, IEEE Trans. Software Eng. SE-2 (4) (1976) 308–320.
[106] J. Demšar, Statistical comparisons of classifiers over multiple data sets, The Journal of Machine Learning Research 7 (2006) 1–30.
[107] S. Greenland, S. Senn, K. Rothman, J. Carlin, C. Poole, S. Goodman, D. Altman, Statistical tests, p values, confidence intervals, and power: a guide to misinterpretations, Eur. J. Epidemiol. 31 (4) (2016) 337–350, doi:10.1007/s10654-016-0149-3.
[108] A. Benavoli, G. Corani, J. Demšar, M. Zaffalon, Time for a change: a tutorial for comparing multiple classifiers through Bayesian analysis, The Journal of Machine Learning Research 18 (1) (2017) 2653–2688.
[109] R. Biedrzycki, On equivalence of algorithm's implementations: The CMA-ES algorithm and its five implementations, in: Proceedings of the Genetic and Evolutionary Computation Conference Companion, GECCO '19, Association for Computing Machinery, Prague, Czech Republic, 2019, pp. 247–248, doi:10.1145/3319619.3322011.
[110] P. Killeen, Predict, control, and replicate to understand: how statistics can foster the fundamental goals of science, Perspectives on Behavior Science 42 (1) (2019) 109–132, doi:10.1007/s40614-018-0171-8.
[111] R.D. Peng, Reproducible research in computational science, Science 334 (6060) (2011) 1226–1227, doi:10.1126/science.1213847.
[112] Open Science Collaboration, The Reproducibility Project: A Model of Large-Scale Collaboration for Empirical Research on Reproducibility, SSRN Scholarly Paper, Social Science Research Network, Rochester, NY, 2013, doi:10.2139/ssrn.2195999.
[113] E.O. Scott, S. Luke, ECJ at 20: Toward a general metaheuristics toolkit, in: Proceedings of the Genetic and Evolutionary Computation Conference Companion, GECCO '19, Association for Computing Machinery, New York, NY, USA, 2019, pp. 1391–1398.
[114] S. Wagner, G. Kronberger, A. Beham, M. Kommenda, A. Scheibenpflug, E. Pitzer, S. Vonolfen, M. Kofler, S. Winkler, V. Dorfer, M. Affenzeller, Advanced Methods and Applications in Computational Intelligence, in: Topics in Intelligent Engineering and Informatics, vol. 6, Springer, 2014, pp. 197–261.
[115] J.J. Durillo, A.J. Nebro, jMetal: a Java framework for multi-objective optimization, Adv. Eng. Software 42 (2011) 760–771.
[116] A.J. Nebro, J.J. Durillo, M. Vergne, Redesigning the jMetal multi-objective optimization framework, in: Proceedings of the Companion Publication of the 2015 Annual Conference on Genetic and Evolutionary Computation, GECCO Companion '15, Association for Computing Machinery, New York, NY, USA, 2015, pp. 1093–1100.
[117] E. López-Camacho, M.J. García Godoy, A.J. Nebro, J.F. Aldana-Montes, jMetalCpp: optimizing molecular docking problems with a C++ metaheuristic framework, Bioinformatics 30 (3) (2013) 437–438.
[118] A. Benítez-Hidalgo, A.J. Nebro, J. García-Nieto, I. Oregi, J. Del Ser, jMetalPy: a Python framework for multi-objective optimization with metaheuristics, Swarm Evol Comput 51 (2019) 100598.
[119] D. Hadka, MOEA Framework: A Free and Open Source Java Framework for Multiobjective Optimization, 2020. http://moeaframework.org/.
[120] G. Vrbančič, L. Brezočnik, U. Mlakar, D. Fister, I. Fister Jr., NiaPy: Python microframework for building nature-inspired algorithms, Journal of Open Source Software 3 (2018).
[121] F. Biscani, D. Izzo, pagmo, 2020. https://esa.github.io/pagmo2/.
[122] S. Cahon, N. Melab, E.-G. Talbi, ParadisEO: a framework for the reusable design of parallel and distributed metaheuristics, Journal of Heuristics (2004).
[123] Y. Tian, R. Cheng, X. Zhang, Y. Jin, PlatEMO: a MATLAB platform for evolutionary multi-objective optimization, IEEE Comput Intell Mag 12 (4) (2017) 73–87.
[124] F. Biscani, D. Izzo, pygmo, 2020. https://esa.github.io/pygmo2/.
[125] D. Hadka, Platypus - Multiobjective Optimization in Python, 2020. https://platypus.readthedocs.io/.
[126] C. Huang, Y. Li, X. Yao, A survey of automatic parameter tuning methods for metaheuristics, IEEE Trans. Evol. Comput. 24 (2) (2020) 201–216.
[127] M. López-Ibáñez, J. Dubois-Lacoste, L. Pérez Cáceres, M. Birattari, T. Stützle, The irace package: iterated racing for automatic algorithm configuration, Oper. Res. Perspect. 3 (2016) 43–58, doi:10.1016/j.orp.2016.09.002. http://www.sciencedirect.com/science/article/pii/S2214716015300270
[128] F. Hutter, H.H. Hoos, K. Leyton-Brown, T. Stützle, ParamILS: an automatic algorithm configuration framework, J. Artif. Int. Res. 36 (1) (2009) 267–306.
[129] V. Gabrel, C. Murat, A. Thiele, Recent advances in robust optimization: an overview, Eur J Oper Res 235 (3) (2014) 471–483.
[130] Y. Jin, J. Branke, Evolutionary optimization in uncertain environments: a survey, IEEE Trans. Evol. Comput. 9 (3) (2005) 303–317.
[131] I. Paenke, J. Branke, Y. Jin, Efficient search for robust solutions by means of evolutionary algorithms and fitness approximation, IEEE Trans. Evol. Comput. 10 (4) (2006) 405–420.
[132] A. Ben-Tal, A. Nemirovski, Robust solutions of uncertain linear programs, Operations Research Letters 25 (1) (1999) 1–13.
[133] Y. Jin, B. Sendhoff, Trade-off between performance and robustness: An evolutionary multiobjective approach, in: International Conference on Evolutionary Multi-Criterion Optimization, Springer, 2003, pp. 237–251.
[134] K. Deb, S. Gupta, D. Daum, J. Branke, A.K. Mall, D. Padmanabhan, Reliability-based optimization using evolutionary algorithms, IEEE Trans. Evol. Comput. 13 (5) (2009) 1054–1074.
[135] K. van der Blom, T.M. Deist, T. Tušar, M. Marchi, Y. Nojima, A. Oyama, V. Volz, B. Naujoks, Towards realistic optimization benchmarks: a questionnaire on the properties of real-world problems, arXiv preprint arXiv:2004.06395 (2020).
[136] I. Dunning, J. Huchette, M. Lubin, JuMP: a modeling language for mathematical optimization, SIAM Rev. 59 (2) (2017) 295–320.
[137] M.M. Noel, A new gradient based particle swarm optimization algorithm for accurate computation of global minimum, Appl Soft Comput 12 (1) (2012) 353–359.
[138] P.P. Bonissone, R. Subbu, N. Eklund, T.R. Kiehl, Evolutionary algorithms + domain knowledge = real-world evolutionary computation, IEEE Trans. Evol. Comput. 10 (3) (2006) 256–280.
[139] M. Fischetti, M. Fischetti, Matheuristics, in: Handbook of Heuristics, Springer, 2018, pp. 121–153.
[140] G. Wu, W. Pedrycz, P.N. Suganthan, R. Mallipeddi, A variable reduction strategy for evolutionary algorithms handling equality constraints, Appl Soft Comput 37 (2015) 774–786.
[141] S. Das, P.N. Suganthan, Problem definitions and evaluation criteria for CEC 2011 competition on testing evolutionary algorithms on real world optimization problems, Jadavpur University, Nanyang Technological University, Kolkata (2010) 341–359.
[142] A.A. Juan, J. Faulin, S.E. Grasman, M. Rabe, G. Figueira, A review of simheuristics: extending metaheuristics to deal with stochastic combinatorial optimization problems, Oper. Res. Perspect. 2 (2015) 62–72.
[143] M. Chica, J. Pérez, A. Angel, O. Cordon, D. Kelton, Why simheuristics? Benefits, limitations, and best practices when combining metaheuristics with simulation (January 1, 2017) (2017).
[144] Y. Jin, Surrogate-assisted evolutionary computation: recent advances and future challenges, Swarm Evol Comput 1 (2) (2011) 61–70.
[145] Y. Jin, A comprehensive survey of fitness approximation in evolutionary computation, Soft Comput 9 (1) (2005) 3–12.
[146] K. Rasheed, H. Hirsh, Informed operators: Speeding up genetic-algorithm-based design optimization using reduced models, in: Proceedings of the 2nd Annual Conference on Genetic and Evolutionary Computation, 2000, pp. 628–635.
[147] Y. Jin, M. Olhofer, B. Sendhoff, A framework for evolutionary optimization with approximate fitness functions, IEEE Trans. Evol. Comput. 6 (5) (2002) 481–494.
[148] A. Bhosekar, M. Ierapetritou, Advances in surrogate based modeling, feasibility analysis, and optimization: a review, Computers & Chemical Engineering 108 (2018) 250–267.
[149] A.B. Arrieta, N. Díaz-Rodríguez, J. Del Ser, A. Bennetot, S. Tabik, A. Barbado, S. García, S. Gil-López, D. Molina, R. Benjamins, R. Chatila, F. Herrera, Explainable artificial intelligence (XAI): concepts, taxonomies, opportunities and challenges toward responsible AI, Information Fusion 58 (2020) 82–115.
[150] R. Guo, L. Cheng, J. Li, P.R. Hahn, H. Liu, A survey of learning causality with data: problems and methods, arXiv preprint arXiv:1809.09337 (2018).
[151] R. Moraffah, M. Karami, R. Guo, A. Raglin, H. Liu, Causal interpretability for machine learning: problems, methods and evaluation, ACM SIGKDD Explorations Newsletter 22 (1) (2020) 18–33.
[152] C. Huang, Y. Li, X. Yao, A survey of automatic parameter tuning methods for metaheuristics, IEEE Trans. Evol. Comput. 24 (2) (2020) 201–216.
[153] K.A. Smith-Miles, Towards insightful algorithm selection for optimisation using meta-learning concepts, in: 2008 IEEE International Joint Conference on Neural Networks (IEEE World Congress on Computational Intelligence), IEEE, 2008, pp. 4118–4124.
[154] L. Kotthoff, Algorithm Selection for Combinatorial Search Problems: A Survey, in: Data Mining and Constraint Programming, Springer, 2016, pp. 149–190.
[155] K. Smith-Miles, J. van Hemert, Discovering the suitability of optimisation algorithms by learning from evolved instances, Ann Math Artif Intell 61 (2) (2011) 87–104.
[156] J. Kanda, A. de Carvalho, E. Hruschka, C. Soares, P. Brazdil, Meta-learning to select the best meta-heuristic for the traveling salesman problem: a comparison of meta-features, Neurocomputing 205 (2016) 393–406.
[157] A.E. Gutierrez-Rodríguez, S.E. Conant-Pablos, J.C. Ortiz-Bayliss, H. Terashima-Marín, Selecting meta-heuristics for solving vehicle routing problems with time windows via meta-learning, Expert Syst Appl 118 (2019) 470–481.
[158] L.M. Pavelski, M.R. Delgado, M.-É. Kessaci, Meta-learning on flowshop using fitness landscape analysis, in: Proceedings of the Genetic and Evolutionary Computation Conference, 2019, pp. 925–933.
[159] G. Wu, R. Mallipeddi, P.N. Suganthan, Ensemble strategies for population-based optimization algorithms: a survey, Swarm Evol Comput 44 (2019) 695–711.