A Novel Pareto-optimal Ranking Method for Comparing Multi-objective Optimization Algorithms
Amin Ibrahim, Azam Asilian Bidgoli, Shahryar Rahnamayan, and Kalyanmoy Deb

Author1 is with the Faculty of Business and IT, Ontario Tech University, 2000 Simcoe Street North, Oshawa, Ontario, L1G 0C5, Canada. Corresponding author's e-mail: [email protected]
Author2 is with Computer Science, Wilfrid Laurier University, 75 University Avenue, Waterloo, Ontario, N2L 3C5, Canada.
Author3 is with the Department of Engineering, Brock University, 1812 Sir Isaac Brock Way, St. Catharines, Ontario, L2S 3A1, Canada.
Author4 is with the Department of Electrical and Computer Engineering, Michigan State University, East Lansing, MI 48824, USA.

arXiv:2411.17999v1 [cs.AI] 27 Nov 2024

Abstract—As the interest in multi- and many-objective optimization algorithms grows, the performance comparison of these algorithms becomes increasingly important. A large number of performance indicators for multi-objective optimization algorithms have been introduced, each of which evaluates these algorithms based on a certain aspect. Therefore, assessing the quality of multi-objective results using multiple indicators is essential to guarantee that the evaluation considers all quality perspectives. This paper proposes a novel multi-metric comparison method to rank the performance of multi-/many-objective optimization algorithms based on a set of performance indicators. We utilize the Pareto optimality concept (i.e., the non-dominated sorting algorithm) to create the rank levels of algorithms by simultaneously considering multiple performance indicators as criteria/objectives. As a result, four different techniques are proposed to rank algorithms based on their contribution at each Pareto level. This method allows researchers to utilize a set of existing or newly developed performance metrics to adequately assess and rank multi-/many-objective algorithms. The proposed methods are scalable and can accommodate any newly introduced metric in their comprehensive scheme. The method was applied to rank 10 competing algorithms in the 2018 CEC competition solving 15 many-objective test problems. The Pareto-optimal ranking was conducted based on 10 well-known multi-objective performance indicators, and the results were compared to the final ranks reported by the competition, which were based on the inverted generational distance (IGD) and hypervolume (HV) measures. The techniques suggested in this paper have broad applications in science and engineering, particularly in areas where multiple metrics are used for comparisons; examples include machine learning and data mining.

Index Terms—Multi-objective optimization, Performance indicator, Pareto optimality, Multi-metric, Ranking method, Comparative studies.

I. INTRODUCTION

Since many real-world problems are modeled as multi-objective optimization problems, a large number of multi-objective algorithms have been developed to tackle them [1]–[3]. In practical settings, multi-objective algorithms are required to deal with a set of conflicting objectives, and thus the optimal solutions may not be easily identifiable [4]. Accordingly, the conflicting objectives give rise to a set of trade-off solutions, the Pareto front of the multi-objective optimization problem. On the other hand, the evaluation of the solution sets produced by the multi-objective algorithms, which present various trade-offs among the objectives, must be quantitatively appraised using a variety of measurement metrics [5]. While the quality evaluation of a single-objective solution is trivial (i.e., for minimization, the smaller the objective value, the better), measuring the quality of the Pareto fronts resulting from multi-objective algorithms is complex. A Pareto front should be evaluated using aspects such as diversity, distribution, and closeness to the true Pareto front [6]. In order to assess the developed algorithms and analyze the results, a vast number of performance metrics have been proposed [7]. Li et al. [8] presented a study of 100 multi-objective optimization metrics that have been used in the specialized literature. Their paper discusses the usage, trends, benefits, and drawbacks of the most popular metrics to provide researchers with essential knowledge when selecting performance metrics.

Audet et al. [9] classified multi-objective performance indicators into four main categories: cardinality indicators, convergence indicators, distribution and spread indicators, and convergence-and-distribution indicators. Cardinality indicators such as two-set coverage [10] evaluate the quality of the Pareto front based on the number of non-dominated points generated by the corresponding algorithm. Convergence indicators quantify the closeness of the resulting Pareto front to the true Pareto front; generational distance (GD) [11] and inverted GD (IGD) [12] are two well-known metrics in this category. Distribution and spread indicators measure the distribution and the extent of spacing among non-dominated solutions; spacing [13] is an instance of this category. Finally, convergence-and-distribution indicators capture both convergence and distribution properties; one of the popular metrics in this category is the hypervolume (HV) indicator [14].

In addition to the categorization based on the factors that each metric captures, performance indicators can be divided into unary and binary indicators [15]. Unary metrics provide a real value after taking into account one or more of the aforementioned factors, whereas binary metrics focus primarily on the relationship between two approximation sets resulting from two algorithms, to determine which one is better.

In general, there is no universally supreme metric, since each performance indicator has its own strengths and limitations [16]. Moreover, we require at least as many indicators as the number of objectives in order to determine whether an approximate solution is better than another (i.e., whether one objective vector dominates or weakly dominates another) [17].
Each developed metric captures one or sometimes several of the aforementioned factors at the same time and assigns a score to the result of each algorithm. This indicates that no single metric can assess all characteristics of a Pareto front and, consequently, considering several metrics simultaneously is crucial. In such cases, the scores of the algorithms based on each metric are computed and an overall final rank is determined. However, this becomes more complicated if the metrics conflict with each other, which makes the overall ranking more challenging, because each performance indicator may yield a distinct ranking of the competing algorithms. For instance, [18] investigated the contradictions between IGD and HV indicator values when evaluating concave Pareto fronts. In the literature, a few publications have studied the application of multiple performance metrics [19], [20]. Yen et al. [19] introduced an ensemble method to rank algorithms by combining several evaluation metrics using double-elimination tournament selection. As an alternative approach, some techniques combine several metrics into a single composite indicator that yields one ranking result [21]. However, finding a combination technique that avoids a negative impact on the individual metrics is another challenging issue. Furthermore, even with a combination technique, only a limited number (two or three) of indicators can be combined. From these investigations, it is evident that a multi-metric evaluation of algorithms has more benefits than any stand-alone single performance metric, and the set of quality indicators has to be large enough to ensure the reliability of the comparison [22].

To address the aforementioned concerns and to ensure a fair assessment, this paper proposes a multi-metric method to rank multi-objective algorithms based on a set of performance indicators. The proposed approach utilizes the Pareto optimality concept to tackle the issue of possible conflicts among the measurements. Each performance metric is treated as an objective in an objective space and, consequently, algorithms are ranked based on the scores they achieve on the individual metrics. Since Pareto optimality leads to different Pareto levels, four techniques are proposed to rank algorithms based on their contribution at each Pareto level. This technique provides a reliable ranking at the end of the process, regardless of the objectives, metrics, and algorithms employed. However, it is crucial to choose appropriate performance metrics that align with the ultimate goal, because the chosen metrics play a vital role in accurately ranking the competing algorithms. Furthermore, any newly developed metric can be included as part of the assessment. A great benefit of this method is that it is parameter-free, and algorithms can be evaluated on various factors, with the overall ranking generated at the end of the process.

The remaining sections of this paper are organized as follows: a background review of multi-objective optimization is provided in Section II. Section III reviews some well-known multi-objective performance indicators. Section IV presents a detailed description of the proposed multi-metric method. Section V investigates the performance of the proposed method using the 2018 CEC competition's test problems and algorithms. Finally, the paper is concluded in Section VI.

II. BACKGROUND REVIEW ON MULTI- AND MANY-OBJECTIVE OPTIMIZATION

Multi-objective optimization targets handling two or three conflicting objectives, while many-objective algorithms aim to tackle more than three conflicting objectives. In recent years, multi-objective optimization algorithms have been greatly extended to tackle many-objective problems. The use of evolutionary algorithms has been very promising for solving such problems: as population-based approaches, they generate a set of solutions at each run, with each solution potentially interacting with the others to create even better solutions.

Definition 1. Multi-objective Optimization [2]

Min/Max F(x) = [f_1(x), f_2(x), ..., f_M(x)],   s.t.  L_i \le x_i \le U_i,  i = 1, 2, ..., d     (1)

subject to the following equality and/or inequality constraints:

g_j(x) \le 0,  j = 1, 2, ..., J
h_k(x) = 0,  k = 1, 2, ..., K     (2)

where M is the number of objectives, d is the number of decision variables (i.e., the dimension), and the value of each variable x_i lies in the interval [L_i, U_i] (i.e., box constraints). f_i represents the i-th objective function, which should be minimized or maximized. The hard constraints that must be satisfied are g_j(x) \le 0, j = 1, 2, ..., J and h_k(x) = 0, k = 1, 2, ..., K.

In multi- or many-objective optimization problems, finding an optimal solution set is far more complex than in the single-objective case, as a trade-off must be made between the different objectives. One way to compare candidate solutions is the concept of dominance, which assesses whether one solution is better than another with regard to all objectives.

Definition 2. Dominance Concept [23] If x = (x_1, x_2, ..., x_d) and x′ = (x′_1, x′_2, ..., x′_d) are two vectors in a minimization problem's search space, x dominates x′ (x ≺ x′) if and only if

∀i ∈ {1, 2, ..., M}: f_i(x) \le f_i(x′)  ∧  ∃j ∈ {1, 2, ..., M}: f_j(x) < f_j(x′)     (3)

This concept defines the optimality of a solution in a multi-objective space. Candidate solution x is better than x′ if it is not worse than x′ in any objective and has a better value in at least one objective. Solutions that are not dominated by any other solution are called non-dominated solutions; they form the Pareto front set. Multi-objective algorithms attempt to find these solutions by utilizing generating strategies/operators and selection schemes. The non-dominated sorting (NDS) algorithm [23] is one of the popular selection strategies based on the dominance concept. It ranks the solutions of the population into different levels of optimality, called Pareto levels. The algorithm starts by determining all non-dominated solutions, which form the first rank. To identify the second rank, the non-dominated vectors are removed from the set and the remaining candidate solutions are processed in the same way; the non-dominated solutions of this step form the second level (the second Pareto front). Thereafter, the second-ranked individuals are removed to identify the third Pareto front. This process continues until all individuals are grouped into different Pareto levels.
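To make this level construction concrete, the following minimal Python sketch groups a set of objective vectors into Pareto levels. It assumes minimization of all objectives and uses illustrative names only; it is the plain definition above rather than the faster NSGA-II bookkeeping of [23].

```python
from typing import List, Sequence

def dominates(a: Sequence[float], b: Sequence[float]) -> bool:
    """True if vector a Pareto-dominates vector b (minimization of all objectives)."""
    return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))

def non_dominated_sort(points: List[Sequence[float]]) -> List[List[int]]:
    """Group point indices into Pareto levels: the first level holds the
    non-dominated points, the second level the points dominated only by
    first-level points, and so on."""
    remaining = set(range(len(points)))
    levels: List[List[int]] = []
    while remaining:
        front = [i for i in sorted(remaining)
                 if not any(dominates(points[j], points[i])
                            for j in remaining if j != i)]
        levels.append(front)
        remaining -= set(front)
    return levels

if __name__ == "__main__":
    pts = [(1.0, 4.0), (2.0, 2.0), (3.0, 3.0), (4.0, 1.0), (4.0, 4.0)]
    print(non_dominated_sort(pts))  # [[0, 1, 3], [2], [4]]
```

This quadratic-time variant is sufficient for the modest numbers of points handled later (A × R metric vectors); a faster implementation produces the same levels.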
III. LITERATURE REVIEW ON PERFORMANCE INDICATORS

There are several performance metrics for assessing the quality of multi- and many-objective algorithms. These metrics evaluate the performance of the algorithms using aspects such as convergence, distribution, and coverage, and each metric has its own advantages and drawbacks. In this section, we review the well-known metrics that we have utilized in our experiments to design the multi-metric ranking method. Table I provides an overview of the ten metrics employed in this study, detailing what they measure, the required number of parameters, and their respective advantages and disadvantages. The table primarily addresses three key aspects of a solution set: convergence (proximity to the theoretical Pareto-optimal front), diversity (distribution and spread), and cardinality (number of solutions).

Hypervolume (HV) indicator [14]: HV is a very popular indicator that evaluates multi-objective algorithms in terms of both the distribution of the Pareto front and its closeness to the true Pareto front, i.e., it captures the diversity and convergence of a multi-objective algorithm. It calculates the volume of the M-dimensional region that is dominated by a set of solution points A and bounded by a reference point r = (r_1, r_2, ..., r_M), where M is the number of objectives of the problem; the reference point is a point with worse values than the nadir point. The HV measure is defined in Eq. 4:

HV(A) = vol\left( \bigcup_{a \in A} [f_1(a), r_1] \times [f_2(a), r_2] \times ... \times [f_M(a), r_M] \right)     (4)

where f_i represents the i-th objective function, a ∈ A is a candidate solution, and r is a reference point that is weakly dominated by all candidate solutions. Larger values of HV indicate that the Pareto front covers a wider region of the objective space, i.e., the solutions are more diverse and closer to the optimal Pareto front.

Generational Distance (GD) [11]: GD measures the average minimum distance between each obtained objective vector in the set S and the closest objective vector on the representative Pareto front P, and is calculated as follows:

GD(S, P) = \frac{\sqrt{\sum_{i=1}^{|S|} dist(i, P)^2}}{|S|}     (5)

where dist(i, P) is the Euclidean distance from the i-th approximate solution to the nearest solution on the true Pareto front (i.e., q = 2 in the general formulation). A smaller GD value indicates a smaller distance to the true Pareto front and consequently better performance.

Inverted Generational Distance (IGD) [12]: The only difference between GD and IGD is that the average minimum distance is measured from each point of the true Pareto front to those of the approximate Pareto front, so that:

IGD(S, P) = \frac{\sqrt{\sum_{i=1}^{|P|} dist(i, S)^2}}{|P|}     (6)

where dist(i, S) is the Euclidean distance between a point of the true Pareto front and the nearest solution of the approximate set.

Two-set Coverage (C) [10]: This metric indicates the fraction of solutions on the Pareto front of one algorithm that are dominated by solutions of another algorithm:

C(A, B) = \frac{ |\{ b \in B \mid \exists\, a \in A : a \preceq b \}| }{ |B| }     (7)

For example, C(A, B) = 0.25 means that the approximate solutions from algorithm A dominate 25% of the solutions resulting from algorithm B. Obviously, both C(A, B) and C(B, A) should be calculated for a comparison.

Coverage over the Pareto front (CPF) [24]: This is a measure of the diversity of the solutions projected through a mapping from the M-dimensional objective space to an (M - 1)-dimensional space. In this process, a large set of reference points is sampled on the Pareto front, and each solution on the resulting Pareto front is replaced by its closest reference point. Thus, a new point set P′ is generated as follows:

P' = \{ \arg\min_{r \in R} \| r - f(x) \| \mid x \in P \}     (8)

where R is the set of reference points and P denotes the set of approximate candidate solutions. After transforming P′ and R (i.e., projection, translation, and normalization) to project the points onto a unit simplex, the ratio of the volumes of P′ and R is calculated as the CPF:

CPF = \frac{Vol(P')}{Vol(R)}     (9)

The details of this metric and the way the volume is calculated are given in [24].

Hausdorff Distance to the Pareto front (∆p) [25]: This indicator combines the two metrics GD and IGD. ∆p is defined as follows:

\Delta_p(S, P) = \max( GD(S, P), IGD(S, P) )     (10)

This metric has stronger properties than the two individual indicators since it combines GD and IGD; for instance, ∆p can handle outliers efficiently by considering the averaged result.

Pure Diversity (PD) [26]: Given an approximate solution set A, PD measures the sum of the dissimilarities of each solution in A to the rest of A. For this purpose, the solution with the maximal dissimilarity has the highest priority to accumulate its dissimilarity. The higher the PD value, the greater the diversity among the solutions. PD is calculated using the recursive Eq. 11:

PD(A) = \max_{s_i \in A} \left( PD(A - s_i) + d(s_i, A - s_i) \right)     (11)

where

d(s, A) = \min_{s_i \in A} \left( dissimilarity(s, s_i) \right)     (12)

and d(s_i, A - s_i) denotes the dissimilarity d from one solution s_i to the community A - s_i.
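To make the convergence-type indicators above concrete, here is a small Python sketch of GD (Eq. 5), IGD (Eq. 6), and the two-set coverage C(A, B) (Eq. 7). It assumes the true front P is available as a finite set of sampled points; all function names are illustrative rather than taken from any particular library.

```python
import math
from typing import List, Sequence

Point = Sequence[float]

def euclid(a: Point, b: Point) -> float:
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def gd(S: List[Point], P: List[Point]) -> float:
    """Generational distance (Eq. 5): nearest-reference distances of the
    obtained set S, aggregated with q = 2 and divided by |S|."""
    return math.sqrt(sum(min(euclid(s, p) for p in P) ** 2 for s in S)) / len(S)

def igd(S: List[Point], P: List[Point]) -> float:
    """Inverted GD (Eq. 6): the same computation with the roles of S and P swapped."""
    return math.sqrt(sum(min(euclid(p, s) for s in S) ** 2 for p in P)) / len(P)

def coverage(A: List[Point], B: List[Point]) -> float:
    """Two-set coverage C(A, B) (Eq. 7): fraction of B weakly dominated by A."""
    weakly_dominates = lambda a, b: all(x <= y for x, y in zip(a, b))
    return sum(any(weakly_dominates(a, b) for a in A) for b in B) / len(B)

if __name__ == "__main__":
    true_front = [(0.0, 1.0), (0.5, 0.5), (1.0, 0.0)]
    approx     = [(0.1, 1.0), (0.6, 0.6), (1.1, 0.1)]
    print(round(gd(approx, true_front), 4), round(igd(approx, true_front), 4))
    print(coverage(true_front, approx))  # 1.0: every point of approx is covered
```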
Spacing (SP) [13]: This indicator measures the distribution of the non-dominated points of the approximate Pareto front. SP is computed as follows:

SP(S) = \sqrt{ \frac{1}{|S|-1} \sum_{i=1}^{|S|} (\bar{d} - d_i)^2 }     (13)

where d_i = \min_{s_j \in S,\, s_j \neq s_i} \| F(s_i) - F(s_j) \|_1 is the l1 distance of the i-th point of the approximate Pareto front to the closest point on the same front, and \bar{d} is the mean of the d_i values.

Overall Pareto Spread (OS) [27]: This metric measures the extent of the front covered by the approximate Pareto front; a higher value indicates better front coverage. It is calculated as follows:

OS(S) = \prod_{i=1}^{m} \frac{ | \max_{x \in S} f_i(x) - \min_{x \in S} f_i(x) | }{ | f_i(P_B) - f_i(P_G) | }     (14)

where P_B is the nadir point and P_G is the ideal point.

Distribution Metric (DM) [28]: This metric also characterizes the distribution of the approximate Pareto front produced by an algorithm. DM is given by:

DM(S) = \frac{1}{|S|} \sum_{i=1}^{m} \frac{\sigma_i}{\mu_i} \cdot \frac{ | f_i(P_G) - f_i(P_B) | }{ R_i }     (15)

where \sigma_i and \mu_i are the standard deviation and mean of the distances relative to the i-th objective, R_i = \max_{s \in S} f_i(s) - \min_{s \in S} f_i(s), |S| is the number of points of the approximate Pareto front, and f_i(P_G) and f_i(P_B) are the objective values of the ideal and nadir points, respectively. A lower value of DM indicates well-distributed solutions.
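As one concrete instance of the distribution-type indicators above, the following short Python sketch computes the spacing metric SP exactly as in Eq. 13 (l1 nearest-neighbour distances and their sample standard deviation). The names are illustrative.

```python
import math
from typing import List, Sequence

def spacing(front: List[Sequence[float]]) -> float:
    """Spacing SP (Eq. 13) of a front with at least two points."""
    def l1(a, b):
        return sum(abs(x - y) for x, y in zip(a, b))
    # d_i: l1 distance of each point to its nearest neighbour on the same front.
    d = [min(l1(front[i], front[j]) for j in range(len(front)) if j != i)
         for i in range(len(front))]
    d_bar = sum(d) / len(d)
    return math.sqrt(sum((d_bar - di) ** 2 for di in d) / (len(d) - 1))

print(round(spacing([(0.0, 1.0), (0.4, 0.6), (1.0, 0.0)]), 4))
```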
IV. PROPOSED PARETO-OPTIMAL RANKING METHOD

In this section, we present the details of the multi-metric ranking method. The proposed method ranks competing multi- or many-objective algorithms based on a variety of performance metrics simultaneously. First, each algorithm is independently evaluated using a set of M performance indicators, hence forming M objectives. Then, the method combines these M-dimensional points and groups them into Pareto dominance levels using the non-dominated sorting (NDS) algorithm. Finally, the rank of each algorithm is determined using one of the four proposed ranking methods described in this section. The steps of the proposed method to rank A algorithms (in our experiments, A = 10) when solving many-objective optimization problems are as follows:

1) Select M multi-objective performance indicators (e.g., HV, IGD, GD, etc.).
2) Run each algorithm R times.
3) For each run, calculate the performance scores based on the M metrics. As a result, each algorithm yields an R × M matrix of performance scores.
4) Concatenate the performance scores of all A algorithms to obtain A × R points, each being an M-dimensional (objectives/criteria) vector.
5) Apply the NDS algorithm to the A × R M-dimensional vectors to generate the different rank levels. Note that, in order to apply the NDS algorithm, we reverse some of the scores (e.g., since a higher HV value indicates a better algorithm, we replace this value with its inverse, or change its sign in the case of zeros) so that every metric is treated as a minimization rather than a maximization criterion.
6) Evaluate the rank of each algorithm based on its contribution at each Pareto level using the proposed ranking techniques discussed below (a short code sketch of steps 4 to 6 follows the matrices below).

Suppose that we want to compare the performance of A multi-/many-objective algorithms on a benchmark test problem. We run each algorithm R times and compute its performance scores using M existing multi-objective metrics (e.g., HV, IGD, GD, etc.). After these steps, we obtain A × R M-dimensional points (performance scores). The matrix of these scores can be illustrated as follows:

            m_1                m_2                ...   m_M
a_{1,1}     ps_{a_{1,1},m_1}   ps_{a_{1,1},m_2}   ...   ps_{a_{1,1},m_M}
a_{1,2}     ps_{a_{1,2},m_1}   ps_{a_{1,2},m_2}   ...   ps_{a_{1,2},m_M}
...         ...                ...                ...   ...
a_{1,R}     ps_{a_{1,R},m_1}   ps_{a_{1,R},m_2}   ...   ps_{a_{1,R},m_M}
...         ...                ...                ...   ...
a_{A,1}     ps_{a_{A,1},m_1}   ps_{a_{A,1},m_2}   ...   ps_{a_{A,1},m_M}
a_{A,2}     ps_{a_{A,2},m_1}   ps_{a_{A,2},m_2}   ...   ps_{a_{A,2},m_M}
...         ...                ...                ...   ...
a_{A,R}     ps_{a_{A,R},m_1}   ps_{a_{A,R},m_2}   ...   ps_{a_{A,R},m_M}

where ps_{a_{i,j},m_k} indicates the performance score computed for the j-th run of the i-th algorithm based on the k-th metric.

Next, the proposed method applies the NDS algorithm to the A × R M-dimensional vectors. This process yields a set of Pareto levels and the corresponding points in these levels. Suppose that the NDS algorithm results in L levels, say l_1, l_2, l_3, ..., l_L. Then, for each algorithm, we count the number of points associated with each Pareto level (this step allows us to quantify the quality of each algorithm when solving a given problem). The resulting matrix can be illustrated as follows:

        l_1            l_2            l_3            ...   l_L
a_1     n_{a_1,l_1}    n_{a_1,l_2}    n_{a_1,l_3}    ...   n_{a_1,l_L}
a_2     n_{a_2,l_1}    n_{a_2,l_2}    n_{a_2,l_3}    ...   n_{a_2,l_L}
a_3     n_{a_3,l_1}    n_{a_3,l_2}    n_{a_3,l_3}    ...   n_{a_3,l_L}
...     ...            ...            ...            ...   ...
a_A     n_{a_A,l_1}    n_{a_A,l_2}    n_{a_A,l_3}    ...   n_{a_A,l_L}

where n_{a_i,l_j} indicates the number of points (i.e., M-dimensional metric-score vectors) of the i-th algorithm on the j-th Pareto level. Lastly, the rank of each algorithm is determined using one (or more, in case of a tie) of the four proposed ranking techniques described below.
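The following is a minimal Python sketch, with illustrative names only, of steps 4 to 6 and the two matrices above: it pools the A × R metric-score vectors, negates the metrics that are to be maximized (negation is order-preserving, so for dominance purposes it is equivalent to the inversion mentioned in step 5), applies non-dominated sorting, and returns the per-algorithm level counts n_{a_i,l_j}. It is a sketch of the idea, not the authors' implementation.

```python
from collections import Counter
from typing import Dict, List, Sequence

def non_dominated_sort(points: List[Sequence[float]]) -> List[List[int]]:
    # Same minimization-based level construction as the sketch in Section II.
    def dom(a, b):
        return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))
    remaining, levels = set(range(len(points))), []
    while remaining:
        front = [i for i in sorted(remaining)
                 if not any(dom(points[j], points[i]) for j in remaining if j != i)]
        levels.append(front)
        remaining -= set(front)
    return levels

def level_counts(scores: Dict[str, List[Sequence[float]]],
                 maximize: Sequence[bool]) -> Dict[str, Counter]:
    """scores[alg] holds R metric vectors (one per run); maximize[k] marks
    metric k as 'larger is better' so that it is negated before sorting."""
    labels, pooled = [], []
    for alg, runs in scores.items():
        for vec in runs:
            labels.append(alg)
            pooled.append([-v if mx else v for v, mx in zip(vec, maximize)])
    counts = {alg: Counter() for alg in scores}
    for lvl, members in enumerate(non_dominated_sort(pooled), start=1):
        for idx in members:
            counts[labels[idx]][lvl] += 1
    return counts

if __name__ == "__main__":
    # Two toy algorithms, three runs each, two metrics (HV, IGD); HV is maximized.
    toy = {"alg_1": [(0.90, 0.20), (0.80, 0.10), (0.70, 0.30)],
           "alg_2": [(0.85, 0.15), (0.95, 0.40), (0.60, 0.05)]}
    print(level_counts(toy, maximize=[True, False]))
    # alg_1 -> Counter({1: 2, 2: 1}), alg_2 -> Counter({1: 3}) for this toy data
```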
TABLE I
ADVANTAGES AND DISADVANTAGES OF MULTI-OBJECTIVE PERFORMANCE METRICS

Hypervolume (HV) | Measures: accuracy, diversity | Type: unary
  Advantages: provides a single scalar value representing the volume of the objective space covered by the Pareto front.
  Disadvantages: computationally expensive for high-dimensional problems.

Generational Distance (GD) | Measures: accuracy | Type: unary
  Advantages: measures the average distance from solutions in the population to the Pareto front; tends to be robust to variations in the shape and complexity of the Pareto front.
  Disadvantages: favors solutions that are close to the true Pareto front, potentially overlooking the diversity or spread of solutions in the population.

Inverted Generational Distance (IGD) | Measures: accuracy, diversity | Type: unary
  Advantages: measures the average distance from solutions on the Pareto front to solutions in the population.
  Disadvantages: similar to GD, IGD may favor solutions that are close to the true Pareto front, potentially overlooking the convergence or quality of individual solutions. It can also be sensitive to outliers, particularly if extreme solutions significantly affect the average distance calculations.

Two-set Coverage (C) | Measures: accuracy, diversity, cardinality | Type: binary
  Advantages: offers an objective measure of solution quality, enabling comparisons between different algorithms or parameter settings; emphasizes diversity by assessing the extent to which the solutions cover the Pareto front.
  Disadvantages: requires a reference set for comparison, and the quality and characteristics of the reference set can significantly affect the metric; small changes or inaccuracies in the reference set can lead to misleading results.

Coverage over the Pareto front (CPF) | Measures: diversity | Type: unary
  Advantages: measures the proportion of the Pareto front covered by a set of solutions, providing a quantitative assessment of how well the solutions represent the Pareto front.
  Disadvantages: sensitive to the density of solutions along the Pareto front; as with two-set coverage, the quality and characteristics of the reference set can significantly affect the metric, and small changes or inaccuracies in the reference set can lead to misleading results.

Hausdorff Distance to the Pareto front (∆p) | Measures: accuracy, diversity | Type: unary
  Advantages: provides a measure of the maximum distance from any point on the Pareto front to the closest point in a set of solutions, which helps in assessing the quality of the solutions in relation to the Pareto front.
  Disadvantages: can be sensitive to outliers, particularly if extreme solutions significantly affect the distance calculations.

Pure Diversity (PD) | Measures: diversity | Type: unary
  Advantages: quantifies the diversity of solutions in a population without considering their quality or proximity to the Pareto front, which is essential for maintaining a well-distributed set of solutions.
  Disadvantages: does not consider the quality of individual solutions or their proximity to the true Pareto front; it may therefore prioritize diversity over convergence or solution quality.

Spacing (SP) | Measures: diversity | Type: unary
  Advantages: provides a measure of the dispersion or spread of solutions in a population, quantifying how evenly solutions are distributed throughout the objective space.
  Disadvantages: does not directly consider the quality of individual solutions or their proximity to the true Pareto front; it may therefore prioritize spread over convergence or solution quality.

Overall Pareto Spread (OS) | Measures: diversity | Type: unary
  Advantages: provides a comprehensive measure of the spread of solutions across the entire Pareto front, considering the distribution of solutions along both the objective space and the Pareto front.
  Disadvantages: does not directly consider the quality of individual solutions or their proximity to the true Pareto front; it may therefore prioritize spread over convergence or solution quality.

Distribution Metric (DM) | Measures: diversity | Type: unary
  Advantages: provides a measure of the distribution or spread of solutions in a population, quantifying how evenly solutions are distributed throughout the objective space.
  Disadvantages: does not directly consider the quality of individual solutions or their proximity to the true Pareto front; it may therefore prioritize spread over convergence or solution quality. It can also be sensitive to the density of solutions in certain regions of the objective space.

Olympic method: The best algorithm is determined by evaluating the number of points each algorithm has on the first level. If a tie occurs between two algorithms, the second level is considered and the algorithm with more points is selected; if there is still a tie, the third level is considered, and so on.

First rank = \arg\max_{i} ( n_{a_i,l_1} )     (16)

Suppose that two algorithms, a1 and a2, have the following numbers of points on each level after the NDS step:

        l1   l2   l3
a1      20   10    1
a2      15   14    2

That is, the numbers of points of algorithm a1 on the first, second, and third levels are 20, 10, and 1, respectively, and 15, 14, and 2 for algorithm a2. According to the Olympic ranking, algorithm a1 outperforms a2, as the number of points on the first level (i.e., l1) for a1 is higher than for a2.
Linear method: This technique takes all points into account when calculating an algorithm's ranking, rather than only the top Pareto level as in the Olympic method. The weighted score of each algorithm is calculated by multiplying the number of points it has at each level by linearly decreasing weights: if the NDS algorithm produces L levels, the first level has a weight of L, the second level L - 1, and so on. Once the weighted sums are determined for all competing algorithms, their ranks are assigned based on these weights (the algorithm with the highest weighted sum is ranked first). In this way, every point on every level contributes to the ranking of an algorithm. Eq. 17 gives the linear score of algorithm a_i:

Linear Score(a_i) = n_{a_i,l_1} × L + n_{a_i,l_2} × (L - 1) + ... + n_{a_i,l_L} × 1     (17)

Given the previous example, the linear scores of algorithms a1 and a2 are:

Linear Score(a_1) = 20 × 3 + 10 × 2 + 1 × 1 = 81
Linear Score(a_2) = 15 × 3 + 14 × 2 + 2 × 1 = 75

Accordingly, algorithm a1 has the higher rank.

Exponential method: Similar to the linear ranking method, the exponential technique assigns a weight to each Pareto level; however, the weights decrease exponentially rather than linearly. Specifically, the weights are 2^0, 2^{-1}, 2^{-2}, ..., 2^{-(L-1)} for levels 1, 2, 3, ..., L, respectively. The weighted sum then gives the score of each algorithm. Eq. 18 gives the exponential score of algorithm a_i:

Exponential Score(a_i) = n_{a_i,l_1} × 2^{0} + n_{a_i,l_2} × 2^{-1} + ... + n_{a_i,l_L} × 2^{-(L-1)}     (18)

Given the previous example, the exponential scores of algorithms a1 and a2 are:

Exponential Score(a_1) = 20 × 2^{0} + 10 × 2^{-1} + 1 × 2^{-2} = 25.25
Exponential Score(a_2) = 15 × 2^{0} + 14 × 2^{-1} + 2 × 2^{-2} = 22.5

Accordingly, algorithm a1 outperforms a2.

Adaptive method: The score of each algorithm is calculated based on the cumulative number of points distributed over all levels. For each algorithm, the total number of points at levels 1 and 2 is the cumulative weight of level 2, the total number of points at levels 1, 2, and 3 is the cumulative weight of level 3, and, correspondingly, the total number of points over all levels is the cumulative weight of level L of that algorithm. Eq. 19 gives the cumulative weight of level l for algorithm a_i:

CW(a_i, l) = \sum_{j=1}^{l} n_{a_i,l_j}     (19)

Similarly, the total cumulative weight of level l is defined as:

Total CW(l) = \sum_{i=1}^{A} CW(a_i, l)     (20)

Using this definition, the sum of the cumulative-weight ratios over all levels is the score of each algorithm; in this way, the ratio of the contribution of each algorithm at each level determines its rank. Eq. 21 gives the adaptive score of algorithm a_i:

Adaptive Score(a_i) = \sum_{l=1}^{L} \frac{CW(a_i, l)}{Total CW(l)}     (21)

For the previous example, the cumulative weights of algorithms a1 and a2 are:

        CW1   CW2   CW3
a1       20    30    31
a2       15    29    31

The total cumulative weights of the Pareto levels are:

Total CW(l_1) = 20 + 15 = 35
Total CW(l_2) = 30 + 29 = 59
Total CW(l_3) = 31 + 31 = 62

Consequently, the adaptive scores of algorithms a1 and a2 are:

Adaptive Score(a_1) = 20/35 + 30/59 + 31/62 = 1.58
Adaptive Score(a_2) = 15/35 + 29/59 + 31/62 = 1.42

Thus, based on the adaptive score, algorithm a1 is better than a2.

It is worth mentioning that, in the event of a tie (i.e., identical ranks for two or more algorithms) when using one of these ranking techniques, any of the other ranking methods can be applied to break the tie. Additionally, if the user has a preference for ranking in scenarios involving diverse complexities, such as varying numbers of objectives, it is feasible to assign weights to the scores based on their respective complexities following the non-dominated sorting (NDS) step.
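A compact Python sketch of the four scoring rules, reproducing the worked example above (a1 with 20/10/1 points and a2 with 15/14/2 points over L = 3 levels). The Olympic rule is implemented here as a lexicographic comparison of the per-level counts, which matches the tie-breaking description; all names are illustrative.

```python
from typing import Dict, List

def linear_score(counts: List[int], L: int) -> float:
    # Level 1 weighted L, level 2 weighted L-1, ..., level L weighted 1 (Eq. 17).
    return sum(n * (L - lvl) for lvl, n in enumerate(counts))

def exponential_score(counts: List[int]) -> float:
    # Weights 2^0, 2^-1, ..., 2^-(L-1) (Eq. 18).
    return sum(n * 2.0 ** (-lvl) for lvl, n in enumerate(counts))

def adaptive_scores(table: Dict[str, List[int]]) -> Dict[str, float]:
    # Cumulative weights per algorithm, normalised by the level totals (Eqs. 19-21).
    L = len(next(iter(table.values())))
    cw = {a: [sum(c[:l + 1]) for l in range(L)] for a, c in table.items()}
    totals = [sum(cw[a][l] for a in table) for l in range(L)]
    return {a: sum(cw[a][l] / totals[l] for l in range(L)) for a in table}

table = {"a1": [20, 10, 1], "a2": [15, 14, 2]}
print(sorted(table, key=lambda a: table[a], reverse=True))             # Olympic: ['a1', 'a2']
print(linear_score(table["a1"], 3), linear_score(table["a2"], 3))      # 81 75
print(exponential_score(table["a1"]), exponential_score(table["a2"]))  # 25.25 22.5
print({a: round(s, 2) for a, s in adaptive_scores(table).items()})     # {'a1': 1.58, 'a2': 1.42}
```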
V. EXPERIMENTAL VALIDATION: CONDUCTING COMPREHENSIVE COMPARISONS

A. Experimental settings

As a practical application, the proposed method is used to rank ten well-known evolutionary multi-objective algorithms submitted to the 2018 IEEE CEC competition. In this competition, participants were asked to develop novel many-objective optimization algorithms to solve the 15 MaF many-objective test problems listed in Table II. The competing algorithms are AGE-II [29], AMPDEA, BCE-IBEA [30], CVEA3 [31], fastCAR [32], HHcMOEA [33], KnEA [34], RPEA [35], RSEA [36], and RVEA [37]. All experiments were conducted on the 5-, 10-, and 15-objective MaF test problems, and the number of decision variables was set according to the settings used in [38]. Each algorithm was run independently 20 times. The maximum number of fitness evaluations was set to max(100000, 10000 × D), and the maximum population size was set to 240.

In order to assess the efficacy of the proposed ranking method, we obtained the approximated Pareto fronts of these ten algorithms from [38].
TABLE II
PROPERTIES OF THE 15 MaF BENCHMARK PROBLEMS. M IS THE NUMBER OF OBJECTIVES.

Test function   Properties                                      Dimension
MaF1            Linear                                          M + 9
MaF2            Concave                                         M + 9
MaF3            Convex, multimodal                              M + 9
MaF4            Concave, multimodal                             M + 9
MaF5            Convex, biased                                  M + 9
MaF6            Concave, degenerate                             M + 9
MaF7            Mixed, disconnected, multimodal                 M + 19
MaF8            Linear, degenerate                              2
MaF9            Linear, degenerate                              2
MaF11           Convex, disconnected, nonseparable              M + 9
MaF12           Concave, nonseparable, biased deceptive         M + 9
MaF13           Concave, unimodal, nonseparable, degenerate     5
MaF14           Linear, partially separable, large scale        20 × M
MaF15           Convex, partially separable, large scale        20 × M

The competition utilized the IGD and HV scores to rank these algorithms. However, we have utilized the ten many-objective metrics (including HV and IGD) listed in Section III, in order to take advantage of the different aspects these performance metrics capture.

B. Ranking algorithms when solving one specific test problem with a particular number of objectives

Our first experiment ranks the ten algorithms on a specific test problem with a specific number of objectives. Since the number of independent runs is set to 20, the input to our proposed method is 10 × 20 M-dimensional performance-metric score vectors, one per run.

Fig. 1 shows the results of the proposed scheme when solving the 5-objective MaF1 and 15-objective MaF10 test problems. The NDS algorithm resulted in 7 Pareto levels for MaF1 and 6 Pareto levels for MaF10. The top table in Fig. 1 shows the number of points contributed by each algorithm at each Pareto level. For instance, the scores of AGE-II on the 5-objective MaF1 resulted in 17 points at the first level, while BCE-IBEA and CVEA3 have 10 points at this level. Correspondingly, the sum of the elements in each row is 20, the total number of runs. Fig. 1 also illustrates the distribution of these points over the different Pareto levels using 2-D and 3-D RadViz visualization [39]. From the figure, the density of points is higher at the lower levels and decreases at the higher Pareto levels. The middle table shows the rank of each algorithm based on the four ranking methods discussed in Section IV. For instance, using the Olympic method, the CVEA3 and HHcMOEA algorithms share the highest rank when solving the 15-objective MaF10, as all of their points are on the first Pareto level. Although the four proposed ranking methods generally provide the same ranking, they may sometimes result in minor conflicts. For example, the AMPDEA algorithm is ranked 10th by the Olympic method, but 9th by the other three ranking methods, on the 5-objective MaF1 test problem.

Table III shows the Olympic ranking of each algorithm when addressing the 15-objective MaF1 to MaF15 benchmark test problems. The table reveals instances where the Olympic strategy led to ties among some algorithms. For instance, AMPDEA, CVEA3, and HHcMOEA share the top rank when solving MaF1. In such cases, we can resort to one or more of the other proposed ranking methods, or employ the average ranking across all four methods, to resolve these ties whenever feasible. However, if two algorithms contribute equally across all Pareto levels, their rankings will remain identical irrespective of the ranking method employed.

C. Ranking algorithms when solving a set of test problems with a particular number of objectives

In the previous experiment, we examined the ranking of algorithms based on individual test problems with a specific number of objectives. In this experiment, we consider the overall ranking of algorithms across a set of test problems with the same number of objectives. To establish the overall rankings, as in the previous experiment, we first compute the contribution of each algorithm to each specific test problem. Subsequently, for each algorithm, we aggregate the number of points at each level, level by level, across all test problems. For instance, if Algorithm 1 accumulates 20 points in level 1 when addressing test problem 1, and 12 and 8 points in levels 1 and 2, respectively, when addressing test problem 2, then the total number of points for Algorithm 1 is 32 in level 1 and 8 in level 2. The table in Fig. 2 shows each algorithm's total number of points on each Pareto level when solving the 10- and 15-objective MaF benchmark test problems. For instance, AGE-II contributed 171 points on the first level across all test problems when solving the 10-objective MaF test problems. From the table, we can also observe that the sum of the points in each row is 300 (i.e., 300 = 20 runs × 15 test problems).

Fig. 2 also shows the overall distribution of points over the different Pareto levels using 2-D and 3-D RadViz visualization. Additionally, the rankings of each algorithm over all test problems using the four proposed ranking schemes can be seen in the middle table of Fig. 2. For instance, when solving the 15-objective MaF test problems, the fastCAR algorithm has the maximum number of points on the first level compared to the other algorithms. As a result, it is ranked first according to the Olympic method as well as the other ranking methods. On the other hand, the HHcMOEA algorithm is ranked second by the Olympic method on the 15-objective test problems, as it has the second-highest number of NDS contributions at the first level; however, this algorithm is ranked 6th when the linear method is used. Hence, the ranking method should be selected carefully to address the needs of each user. For example, the Olympic method is useful when we are only interested in algorithms that place the majority of their contribution in the first Pareto level(s).

D. Determining the overall rankings of algorithms

In order to determine the overall rankings of the algorithms, the results of all algorithms when solving all test problems with different numbers of objectives must be considered. From the previous experiment, we have the contribution of each algorithm after the NDS step when solving the 5-, 10-, and 15-objective MaF test problems separately.
MaF1, M = 5 (number of points per Pareto level):
              L1   L2   L3   L4   L5   L6   L7
AGE-II        17    3    0    0    0    0    0
AMPDEA         4    7    4    4    1    0    0
BCE-IBEA      10    5    4    1    0    0    0
CVEA3         10    5    4    1    0    0    0
fastCAR       15    3    0    0    2    0    0
HHcMOEA       19    1    0    0    0    0    0
KnEA           6    4    5    4    1    0    0
RPEA          10    5    4    1    0    0    0
RSEA           9    4    4    2    1    0    0
RVEA           5    3    4    1    2    4    1

MaF10, M = 15 (number of points per Pareto level):
              L1   L2   L3   L4   L5   L6
AGE-II        16    2    2    0    0    0
AMPDEA        12    4    2    1    1    0
BCE-IBEA      12    3    3    1    1    0
CVEA3         20    0    0    0    0    0
fastCAR       19    1    0    0    0    0
HHcMOEA       20    0    0    0    0    0
KnEA          12    3    3    2    0    0
RPEA          12    3    3    1    1    0
RSEA          13    3    3    1    0    0
RVEA           9    4    2    3    1    1

[Figure: 2-D and 3-D Pareto-RadVis visualizations of these point distributions for MaF1 (M = 5) and MaF10 (M = 15).]

Fig. 1. Outcome of the NDS algorithm and the ranks of the algorithms for the 5-objective MaF1 and 15-objective MaF10 benchmark test problems. The tables show the number of points associated with each Pareto level. The diagrams show the distribution of these points using the Pareto-RadVis visualization, along with the ranks of these algorithms under the four ranking techniques proposed in this paper.
M = 10 (total number of points per Pareto level over all 15 test problems):
              L1    L2   L3   L4   L5   L6   L7   L8
AGE-II       171    63   40   14   10    2    0    0
AMPDEA       254    38    8    0    0    0    0    0
BCE-IBEA     272    23    5    0    0    0    0    0
CVEA3        274    26    0    0    0    0    0    0
fastCAR      250    28    9   11    1    1    0    0
HHcMOEA      264    20   13    3    0    0    0    0
KnEA         246    46    8    0    0    0    0    0
RPEA         202    61   23    4    6    1    2    1
RSEA         233    61    6    0    0    0    0    0
RVEA         235    22   23    9   11    0    0    0

M = 15 (total number of points per Pareto level over all 15 test problems):
              L1    L2   L3   L4   L5   L6   L7   L8   L9  L10  L11  L12
AGE-II       165    32   17   21    4   32   17    2    0    0    9    1
AMPDEA       190    52   29   16   10    3    0    0    0    0    0    0
BCE-IBEA     194    46   31   16   10    2    1    0    0    0    0    0
CVEA3        207    36   30   14   10    1    1    1    0    0    0    0
fastCAR      238    43   17    1    0    1    0    0    0    0    0    0
HHcMOEA      232    16    8    4    0   10   12    5    0    0   10    3
KnEA         217    37   25   12    7    2    0    0    0    0    0    0
RPEA         151    41   28   20   10   18    8    8    7    6    2    1
RSEA         164    40   38   18   12    9    6    4    8    1    0    0
RVEA         144    48   45   19   15   10    4    8    4    2    1    0

[Figure: 2-D and 3-D Pareto-RadVis visualizations of these point distributions for M = 10 and M = 15.]

Fig. 2. The overall outcome of the NDS algorithm and the ranks of the ten algorithms when solving the 10- and 15-objective MaF benchmark test problems. The tables show the total number of points associated with each Pareto level. The diagrams show the distribution of these points using the Pareto-RadVis visualization, along with the ranks of these algorithms under the four ranking techniques proposed in this paper.
TABLE III
RANKS OF THE ALGORITHMS WHEN SOLVING THE 15-OBJECTIVE MaF TEST PROBLEMS USING THE OLYMPIC RANKING TECHNIQUE.

M = 15
AGE-II AMPDEA BCE-IBEA CVEA3 fastCAR HHcMOEA KnEA RPEA RSEA RVEA
MaF1 6 1 7 1 10 1 5 4 8 9
MaF2 6 4 8 2 1 7 5 10 3 9
MaF3 6 9 4 7 5 2 1 10 3 8
MaF4 2 5 7 2 2 1 8 6 9 9
MaF5 8 5 3 6 1 10 4 2 9 7
MaF6 5 7 4 7 3 1 1 7 6 7
MaF7 10 4 4 4 3 1 1 4 4 9
MaF8 8 7 1 1 5 9 6 10 3 3
MaF9 10 4 4 4 2 1 4 3 8 9
MaF10 4 6 8 1 3 1 7 8 5 10
MaF11 10 1 9 4 1 3 6 6 8 5
MaF12 6 8 7 2 1 9 2 10 2 2
MaF13 3 5 5 5 1 1 5 5 5 4
MaF14 5 6 4 8 3 1 1 9 10 7
MaF15 1 1 1 1 9 1 7 1 8 10

Now, we combine these contributions by adding, level by level, the numbers of points obtained for the three different numbers of objectives (M = 5, 10, and 15). Hence, we have a total of nine hundred 10-dimensional points (15 test problems × 3 numbers of objectives × 20 runs). For example, suppose we are interested in ranking algorithms based on their performance for M = 5 and M = 10, and suppose Algorithm 1 has 120 points in level 1 and 80 points in level 2 when solving the test problems with M = 5, and 100 points in level 1, 55 points in level 2, and 45 points in level 3 when solving the test problems with M = 10. Then the total numbers of points for Algorithm 1 would be 220 in level 1, 135 in level 2, and 45 in level 3.

From Table IV we see that the combined results span 18 Pareto levels and each algorithm contributes a total of 900 points. For some of the algorithms, such as AMPDEA, most of the points are located on the top levels (i.e., the first 5 levels), while for others, such as RVEA, the points are spread over all 18 levels. Table V shows the overall ranking of all algorithms using the four ranking techniques discussed in Section IV, along with the average ranking over the four techniques. For instance, the AGE-II algorithm is ranked 5th by the Olympic method, while it takes 7th place if the linear ranking method is applied. However, for some algorithms, such as RSEA, the same rank is assigned by all techniques. Based on the results obtained from the four proposed ranking methods, the fastCAR algorithm is ranked 1st among the 10 state-of-the-art many-objective optimization algorithms with respect to the ten considered performance indicators, while RPEA, conversely, is ranked last.

It is important to highlight that all four proposed ranking methods produce satisfactory results. However, we suggest employing the average ranking across these methods to evaluate the algorithms impartially, based on a comprehensive assessment; using the average ranking also helps mitigate discrepancies that arise when different methods assign different rankings to the same algorithm. Alternatively, if one prefers to use a single ranking method, we recommend the adaptive ranking method for the following reasons. Firstly, unlike the linear ranking method, it considers the cumulative points distributed across all levels rather than only the top-level points. Secondly, although all the proposed ranking methods demonstrated comparable ranking performance, the adaptive ranking method exhibited a higher average pair-wise correlation value than the other methods, indicating that its results are the most consistent with the rankings generated by the other methods. The average pair-wise correlation values for the Olympic, linear, exponential, and adaptive ranking methods are 0.947, 0.923, 0.956, and 0.960, respectively. Another consideration is that the Olympic method is particularly useful for identifying algorithms whose primary contributions lie within the top Pareto level(s).
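The paper does not state which correlation coefficient underlies these pair-wise values; as one possible reading, the sketch below computes the average pair-wise Spearman rank correlation of each method's rank vector from Table V with the other methods' vectors. Because Table V contains a single set of overall ranks, the resulting averages are only illustrative and are not expected to reproduce the reported 0.947, 0.923, 0.956, and 0.960, which may be computed over a different collection of rankings.

```python
def spearman(r1, r2):
    """Spearman correlation for two tie-free rank vectors of equal length."""
    n = len(r1)
    d2 = sum((a - b) ** 2 for a, b in zip(r1, r2))
    return 1.0 - 6.0 * d2 / (n * (n * n - 1))

# Rank vectors taken from Table V (order: AGE-II, AMPDEA, BCE-IBEA, CVEA3,
# fastCAR, HHcMOEA, KnEA, RPEA, RSEA, RVEA).
ranks = {
    "Olympic":     [5, 7, 4, 3, 1, 2, 6, 10, 8, 9],
    "Linear":      [7, 5, 2, 4, 1, 3, 6, 10, 8, 9],
    "Exponential": [6, 7, 4, 3, 1, 2, 5, 10, 8, 9],
    "Adaptive":    [7, 5, 3, 4, 1, 2, 6, 10, 8, 9],
}
avg = {m: sum(spearman(ranks[m], ranks[o]) for o in ranks if o != m) / (len(ranks) - 1)
       for m in ranks}
print({m: round(v, 3) for m, v in avg.items()})
```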
E. Comparison of the rankings from the competition and the proposed method

In this section, we compare the results of the proposed ranking method with the official rankings published for the CEC 2018 competition, in which ten evolutionary multi- and many-objective optimization algorithms were compared on 15 MaF benchmark problems with 5, 10, and 15 objectives. Each algorithm was run independently 20 times for each test problem and each number of objectives, generating 900 results (15 test problems × 3 numbers of objectives × 20 runs). The committee ranked the 10 algorithms based on two multi-objective metrics, IGD and HV: they sorted the means of each performance indicator value on each problem with each number of objectives (i.e., 90 ranks), and the score achieved by each algorithm was then computed as the sum of the reciprocal values of its ranks.

For a fair comparison with the official ranking provided by the CEC 2018 committee, in this experiment we used only the HV and IGD metrics (as opposed to the ten metrics used in the previous experiments) to rank these algorithms with the proposed ranking methods. Table VI presents the rankings based on the competition scores and on the proposed method. From this table we see that, since both approaches use the HV and IGD metrics, they result in comparable rankings, with CVEA3 ranked 1st and RPEA ranked last.
TABLE IV
DISTRIBUTION OF THE 900 POINTS FROM EACH ALGORITHM OVER THE 18 PARETO LEVELS.

L1 L2 L3 L4 L5 L6 L7 L8 L9 L10 L11 L12 L13 L14 L15 L16 L17 L18


AGE-II 601 127 59 35 15 34 17 2 0 0 9 1 0 0 0 0 0 0
AMPDEA 571 166 97 40 19 6 1 0 0 0 0 0 0 0 0 0 0 0
BCE-IBEA 618 134 89 37 14 5 2 1 0 0 0 0 0 0 0 0 0 0
CVEA3 619 136 86 33 13 4 4 3 2 0 0 0 0 0 0 0 0 0
fastCAR 747 83 31 15 4 3 0 3 0 3 7 1 0 0 3 0 0 0
HHcMOEA 747 53 30 12 8 15 13 8 1 0 10 3 0 0 0 0 0 0
KnEA 595 142 86 37 13 4 4 4 4 4 3 2 1 1 0 0 0 0
RPEA 467 152 102 60 43 29 13 13 9 8 3 1 0 0 0 0 0 0
RSEA 517 165 99 48 28 17 11 6 8 1 0 0 0 0 0 0 0 0
RVEA 513 135 125 44 29 14 5 8 4 2 1 5 2 2 4 4 2 1

TABLE V
THE OVERALL RANKING OF THE ALGORITHMS USING THE FOUR RANKING TECHNIQUES, BASED ON THEIR CONTRIBUTIONS PRESENTED IN TABLE IV.

              Olympic   Linear   Expo   Adaptive   Average Rank
AGE-II              5        7      6          7              7
AMPDEA              7        5      7          5              6
BCE-IBEA            4        2      4          3              3
CVEA3               3        4      3          4              4
fastCAR             1        1      1          1              1
HHcMOEA             2        3      2          2              2
KnEA                6        6      5          6              5
RPEA               10       10     10         10             10
RSEA                8        8      8          8              8
RVEA                9        9      9          9              9

However, when comparing the results obtained by the proposed method when utilizing the ten performance metrics with the official CEC 2018 results provided by the committee, we see a significant difference in the rankings of these algorithms. This is expected, as the proposed ranking method uses several metrics to evaluate the ranking of each algorithm based on its performance in all aspects (convergence, distribution, spread, etc.) of many-objective quality measures. For instance, the proposed method ranks the fastCAR algorithm 1st when evaluating the overall performance of the algorithms based on the ten performance metrics, while RPEA is ranked last. From the above experiments, we see the importance of incorporating several performance metrics to properly assess the performance of multi-/many-objective algorithms in all aspects of their quality measures, for the following reasons:

• A single metric cannot capture all aspects of algorithm performance: different metrics measure different aspects of algorithm performance, and no single metric can capture all of them. Using multiple metrics gives a more comprehensive picture of how the algorithms perform in different areas.
• We need at least as many indicators as the number of objectives in order to determine whether an approximate solution is better than another [17].
• Metrics may contradict each other: different metrics may indicate different levels of performance for the same algorithm and, in some cases, metrics may even contradict each other. By comparing algorithms with multiple metrics, one gets a more nuanced understanding of their strengths and weaknesses.
• It can help avoid bias: using a single metric to compare algorithms can introduce a bias, especially if the metric is not representative of the problem being solved. Using multiple metrics reduces the risk of bias and gives a fairer, more objective evaluation of the algorithms.
• The proposed scheme is flexible: it allows researchers to include or exclude existing or future metrics as part of the contributing ranking metrics.
• The proposed scheme is scalable: it is possible to rank an increased number of algorithms, with more metrics and more runs, without any change and without introducing additional parameters.

Overall, using multiple metrics to compare algorithms provides a more comprehensive and accurate evaluation of their performance and supports more informed decisions when choosing between them. Therefore, the proposed ranking method can incorporate several performance metrics to adequately rank multi-/many-objective algorithms based on their overall achievement in several categories of performance measures. While our experiments have not revealed any significant problem with the proposed ranking schemes, it is essential to recognize that the suggested ranking method could be exploited if someone customizes a competing algorithm to excel in one specific indicator, thus becoming the top algorithm according to the Pareto dominance definition used in this paper. In order to mitigate this concern, we recommend employing impartial domination relations, such as ε-dominance [40], to ensure that none of the competing algorithms exploits this limitation inherent in the NDS algorithm. ε-dominance is defined as follows.

Given two vectors x = (x_1, x_2, ..., x_d) and x′ = (x′_1, x′_2, ..., x′_d) in a minimization problem's search space, we say that x ε-dominates x′ (x ≺_ε x′) if and only if

(B_t - W_s = ε > 0)  ∧  ( \| F(x) \| < \| F(x′) \| )

where

B_t(x, x′) = |\{ i ∈ \{1, ..., M\} : F_i(x) < F_i(x′) \}|
W_s(x, x′) = |\{ i ∈ \{1, ..., M\} : F_i(x) > F_i(x′) \}|
\| F(x) \| = \sqrt{ \sum_{i=1}^{M} F_i(x)^2 }

and when W_s = 0, ε-dominance reduces to Pareto dominance.
(Fi (x
objectives in order to determine whether an approximate i=0
solution is better than another [17]. and when Ws = 0, ε-dominance is equivalent to Pareto
• Metrics may contradict each other: different metrics may dominance.
indicate different levels of performance for the same
algorithm, and, in some cases, metrics may even contra-
dict each other. By comparing algorithms with multiple VI. C ONCLUSION
metrics, you can get a more nuanced understanding of In this study, we have proposed a novel ranking technique to
their strengths and weaknesses. assess the quality of many-objective optimization algorithms
• It can help avoid bias: using a single metric to compare using a set of performance metrics. Since there is no one
algorithms can introduce a bias, especially if the metric is many-objective performance indicator that can capture all
TABLE VI
COMPARISON BETWEEN THE RANKING RESULTS PROVIDED BY THE COMPETITION AND THE PROPOSED TECHNIQUES BASED ON THE HV AND IGD MEASURES.

Algorithm      Competition ranking      Proposed method: Olympic   Linear   Expo   Adaptive   Average Rank
AGE-II 4 5 4 6 4 5
AMPDEA 2 2 2 2 2 2
BCE-IBEA 3 4 3 3 3 3
CVEA3 1 1 1 1 1 1
fastCAR 7 3 7 4 5 6
HHcMOEA 10 7 10 8 9 8
KnEA 6 9 6 7 7 7
RPEA 9 10 8 10 10 10
RSEA 5 6 5 5 5 4
RVEA 8 8 9 9 8 9

Since no single many-objective performance indicator can capture all quality aspects of an algorithm (convergence, distribution, spread, cardinality, etc.), it is important to incorporate several multi-/many-objective performance metrics to properly assess the performance of multi-/many-objective algorithms in all aspects of many-objective quality measures. The proposed multi-metric approach gives users the ability to incorporate as many performance indicators as required to properly compare the quality of several competing algorithms. The proposed ranking method uses the NDS algorithm to categorize the level of contribution of each algorithm and applies four ranking techniques, namely the Olympic, linear, exponential, and adaptive techniques, to rank the algorithms based on several performance metrics.

Our experimental results indicate that the proposed ranking method can effortlessly incorporate several performance metrics to adequately rank multi-/many-objective algorithms based on their overall achievements in several categories of performance measures. Moreover, it can also be used as a general ranking technique for any application in which the evaluation of multiple metrics is required. This includes domains such as machine learning (e.g., multiple losses), data mining (e.g., multiple quality metrics), business (e.g., revenue, profitability, customer satisfaction, employee engagement, market share), sport (e.g., scoring, assists, rebounds, blocks, tackles), healthcare (e.g., blood pressure, cholesterol levels, body mass index, heart rate variability, cognitive function), education (e.g., grades, standardized test scores, attendance), and the environment (e.g., air quality, water quality, biodiversity, and climate change).

REFERENCES

[1] Y. Hua, Q. Liu, K. Hao, and Y. Jin, "A survey of evolutionary algorithms for multi-objective optimization problems with irregular Pareto fronts," IEEE/CAA Journal of Automatica Sinica, vol. 8, no. 2, pp. 303–318, 2021.
[2] A. Asilian Bidgoli, S. Rahnamayan, B. Erdem, Z. Erdem, A. Ibrahim, K. Deb, and A. Grami, "Machine learning-based framework to cover optimal Pareto-front in many-objective optimization," Complex & Intelligent Systems, vol. 8, no. 6, pp. 5287–5308, 2022.
[3] S. Sharma and V. Kumar, "A comprehensive review on multi-objective optimization techniques: Past, present and future," Archives of Computational Methods in Engineering, vol. 29, no. 7, pp. 5605–5633, 2022.
[4] K.-J. Du, J.-Y. Li, H. Wang, and J. Zhang, "Multi-objective multi-criteria evolutionary algorithm for multi-objective multi-task optimization," Complex & Intelligent Systems, pp. 1–18, 2022.
[5] T. Okabe, Y. Jin, and B. Sendhoff, "A critical survey of performance indices for multi-objective optimisation," in The 2003 Congress on Evolutionary Computation, 2003. CEC'03., vol. 2. IEEE, 2003, pp. 878–885.
[6] E. Zitzler, K. Deb, and L. Thiele, "Comparison of multiobjective evolutionary algorithms: Empirical results," Evolutionary Computation, vol. 8, no. 2, pp. 173–195, 2000.
[7] N. Riquelme, C. Von Lücken, and B. Baran, "Performance metrics in multi-objective optimization," in 2015 Latin American Computing Conference (CLEI), 2015, pp. 1–11.
[8] M. Li and X. Yao, "Quality evaluation of solution sets in multiobjective optimisation: A survey," ACM Computing Surveys (CSUR), vol. 52, no. 2, pp. 1–38, 2019.
[9] C. Audet, J. Bigeon, D. Cartier, S. Le Digabel, and L. Salomon, "Performance indicators in multiobjective optimization," European Journal of Operational Research, vol. 292, no. 2, pp. 397–422, 2021.
[10] E. Zitzler and L. Thiele, "Multiobjective optimization using evolutionary algorithms—a comparative case study," in International Conference on Parallel Problem Solving from Nature. Springer, 1998, pp. 292–301.
[11] D. A. Van Veldhuizen, Multiobjective Evolutionary Algorithms: Classifications, Analyses, and New Innovations. Air Force Institute of Technology, 1999.
[12] C. A. C. Coello and N. C. Cortés, "Solving multiobjective optimization problems using an artificial immune system," Genetic Programming and Evolvable Machines, vol. 6, no. 2, pp. 163–190, 2005.
[13] J. R. Schott, "Fault tolerant design using single and multicriteria genetic algorithm optimization," Ph.D. dissertation, Massachusetts Institute of Technology, 1995.
[14] E. Zitzler and L. Thiele, "Multiobjective evolutionary algorithms: a comparative case study and the strength Pareto approach," IEEE Transactions on Evolutionary Computation, vol. 3, no. 4, pp. 257–271, 1999.
[15] N. Riquelme, C. Von Lücken, and B. Baran, "Performance metrics in multi-objective optimization," in 2015 Latin American Computing Conference (CLEI). IEEE, 2015, pp. 1–11.
[16] J. A. Nuh, T. W. Koh, S. Baharom, M. H. Osman, and S. N. Kew, "Performance evaluation metrics for multi-objective evolutionary algorithms in search-based software engineering: Systematic literature review," Applied Sciences, vol. 11, no. 7, p. 3117, 2021.
[17] E. Zitzler, L. Thiele, M. Laumanns, C. M. Fonseca, and V. G. Da Fonseca, "Performance assessment of multiobjective optimizers: An analysis and review," IEEE Transactions on Evolutionary Computation, vol. 7, no. 2, pp. 117–132, 2003.
[18] S. Jiang, Y.-S. Ong, J. Zhang, and L. Feng, "Consistencies and contradictions of performance metrics in multiobjective optimization," IEEE Transactions on Cybernetics, vol. 44, no. 12, pp. 2391–2404, 2014.
[19] G. G. Yen and Z. He, "Performance metric ensemble for multiobjective evolutionary algorithms," IEEE Transactions on Evolutionary Computation, vol. 18, no. 1, pp. 131–144, 2013.
[20] E. Zitzler, L. Thiele, and J. Bader, "On set-based multiobjective optimization," IEEE Transactions on Evolutionary Computation, vol. 14, no. 1, pp. 58–79, 2009.
[21] J. Yan, C. Li, Z. Wang, L. Deng, and D. Sun, "Diversity metrics in multi-objective optimization: Review and perspective," in 2007 IEEE International Conference on Integration Technology. IEEE, 2007, pp. 553–557.
[22] M. Ravber, M. Mernik, and M. Črepinšek, "The impact of quality indicators on the rating of multi-objective evolutionary algorithms," Applied Soft Computing, vol. 55, pp. 265–275, 2017.
[23] K. Deb, A. Pratap, S. Agarwal, and T. Meyarivan, "A fast and elitist multiobjective genetic algorithm: NSGA-II," IEEE Transactions on Evolutionary Computation, vol. 6, no. 2, pp. 182–197, 2002.
[24] Y. Tian, R. Cheng, X. Zhang, M. Li, and Y. Jin, "Diversity assessment of multi-objective evolutionary algorithms: Performance metric and benchmark problems [research frontier]," IEEE Computational Intelligence Magazine, vol. 14, no. 3, pp. 61–74, 2019.
[25] O. Schutze, X. Esquivel, A. Lara, and C. A. C. Coello, "Using the averaged Hausdorff distance as a performance measure in evolutionary multiobjective optimization," IEEE Transactions on Evolutionary Computation, vol. 16, no. 4, pp. 504–522, 2012.
[26] H. Wang, Y. Jin, and X. Yao, "Diversity assessment in many-objective optimization," IEEE Transactions on Cybernetics, vol. 47, no. 6, pp. 1510–1522, 2016.
[27] Y.-N. Wang, L.-H. Wu, and X.-F. Yuan, "Multi-objective self-adaptive differential evolution with elitist archive and crowding entropy-based diversity measure," Soft Computing, vol. 14, no. 3, pp. 193–209, 2010.
[28] K. Deb and S. Jain, "Running performance metrics for evolutionary multi-objective optimization," 2002.
[29] M. Wagner and F. Neumann, "A fast approximation-guided evolutionary multi-objective algorithm," in Proceedings of the 15th Annual Conference on Genetic and Evolutionary Computation, 2013, pp. 687–694.
[30] M. Li, S. Yang, and X. Liu, "Pareto or non-Pareto: Bi-criterion evolution in multiobjective optimization," IEEE Transactions on Evolutionary Computation, vol. 20, no. 5, pp. 645–665, 2015.
[31] J. Yuan, H.-L. Liu, and F. Gu, "A cost value based evolutionary many-objective optimization algorithm with neighbor selection strategy," in 2018 IEEE Congress on Evolutionary Computation (CEC). IEEE, 2018, pp. 1–8.
[32] M. Zhao, H. Ge, H. Han, and L. Sun, "A many-objective evolutionary algorithm with fast clustering and reference point redistribution," in 2018 IEEE Congress on Evolutionary Computation (CEC). IEEE, 2018, pp. 1–6.
[33] G. Fritsche and A. Pozo, "A hyper-heuristic collaborative multi-objective evolutionary algorithm," in 2018 7th Brazilian Conference on Intelligent Systems (BRACIS). IEEE, 2018, pp. 354–359.
[34] X. Zhang, Y. Tian, and Y. Jin, "A knee point-driven evolutionary algorithm for many-objective optimization," IEEE Transactions on Evolutionary Computation, vol. 19, no. 6, pp. 761–776, 2014.
[35] Y. Liu, D. Gong, X. Sun, and Y. Zhang, "Many-objective evolutionary optimization based on reference points," Applied Soft Computing, vol. 50, pp. 344–355, 2017.
[36] C. He, Y. Tian, Y. Jin, X. Zhang, and L. Pan, "A radial space division based evolutionary algorithm for many-objective optimization," Applied Soft Computing, vol. 61, pp. 603–621, 2017.
[37] R. Cheng, Y. Jin, M. Olhofer, and B. Sendhoff, "A reference vector guided evolutionary algorithm for many-objective optimization," IEEE Transactions on Evolutionary Computation, vol. 20, no. 5, pp. 773–791, 2016.
[38] R. Cheng, M. Li, Y. Tian, X. Zhang, S. Yang, Y. Jin, and X. Yao, "A benchmark test suite for evolutionary many-objective optimization," Complex & Intelligent Systems, vol. 3, no. 1, pp. 67–81, 2017.
[39] M. Nasrolahzadeh, A. Ibrahim, S. Rahnamayan, and J. Haddadnia, "Pareto-RadVis: A novel visualization scheme for many-objective optimization," in 2020 IEEE International Conference on Systems, Man, and Cybernetics (SMC). IEEE, 2020, pp. 3868–3873.
[40] Z. Kang, L. Kang, X. Zou, M. Liu, C. Li, M. Yang, Y. Li, Y. Chen, and S. Zeng, "A new evolutionary decision theory for many-objective optimization problems," in Advances in Computation and Intelligence: Second International Symposium, ISICA 2007, Wuhan, China, September 21-23, 2007, Proceedings 2. Springer, 2007, pp. 1–11.
