
A Scalable Multi-objective Test Problem Toolkit

(corrected version: 22 June 2005)

Simon Huband¹, Luigi Barone², Lyndon While², and Phil Hingston¹

¹ Edith Cowan University, Mount Lawley WA 6050, Australia, {s.huband,p.hingston}@ecu.edu.au
² The University of Western Australia, Crawley WA 6009, Australia, {luigi,lyndon}@csse.uwa.edu.au

Abstract. This paper presents a new toolkit for creating scalable multi-
objective test problems. The WFG Toolkit is flexible, allowing char-
acteristics such as bias, multi-modality, and non-separability to be in-
corporated and combined as desired. A wide variety of Pareto optimal
geometries are also supported, including convex, concave, mixed con-
vex/concave, linear, degenerate, and disconnected geometries.
All problems created by the WFG Toolkit are well defined, are scal-
able with respect to both the number of objectives and the number of
parameters, and have known Pareto optimal sets. Nine benchmark multi-
objective problems are suggested, including one that is both multi-modal
and non-separable, an important combination of characteristics that is
lacking among existing (scalable) multi-objective problems.

1 Introduction

There have been several attempts to define test suites and toolkits for test-
ing multi-objective evolutionary algorithms (MOEAs) [1–4]. However, existing
multi-objective test problems do not test a wide range of characteristics, and
are often poorly designed. Typical defects include not being scalable and be-
ing susceptible to simple search strategies. Moreover, many problems are poorly
constructed, with unknown Pareto optimal sets, or featuring parameters with
poorly located optima.
As suggested for single-objective problems by Whitley et al. [5] and Bäck and
Michalewicz [6], test suites should include scalable problems that are resistant to
hill climbing strategies, are non-linear, non-separable¹, and multi-modal. Such
requirements are also a good start for multi-objective test suites, but unfortu-
nately are poorly represented in the literature.
Addressing this problem, this paper presents the Walking Fish Group (WFG)
Toolkit, which places an emphasis on allowing test problem designers to construct scalable test problems with any number of objectives, where features such as modality and separability can be customised as required. Test problems in the WFG Toolkit are defined in terms of a simple underlying problem that defines the fitness space and a series of composable, configurable transformations that allow the test problem designer to add arbitrary levels of complexity to the test problem. Problems created by the WFG Toolkit are well defined, are scalable with respect to both the number of objectives and the number of parameters, and have known Pareto optimal sets.

¹ Separable problems can be optimised by considering each parameter in turn, independently of one another. A non-separable problem is thus characterised by parameter dependencies, is more difficult, and is more representative of real world problems.

The next section of the paper introduces the multi-objective terminology
used throughout. Section 3 briefly examines previous multi-objective test suites,
highlighting the deficiencies with them. Section 4 specifies our new WFG Toolkit,
generalising the concepts introduced in these previous test suites to produce a
configurable toolkit that allows for the construction of scalable, well-behaved
test problems. Section 5 then describes how the WFG Toolkit can be used to
construct an example test problem. Some experimental results are presented in
Section 6. A suite of nine test problems that exceeds the functionality of
previous test suites is proposed in Section 7. Section 8 concludes the paper.

2 Terminology

Consider a multi-objective optimisation problem given in terms of a search space of allowed values of n parameters $x_1, \ldots, x_n$, and a vector of M objective functions $\{f_1, \ldots, f_M\}$ mapping parameter vectors into fitness space. The mapping
from the search space to fitness space defines the fitness landscape.
In multi-objective optimisation, we aim to find the set of optimal trade-off
solutions known as the Pareto optimal set. The Pareto optimal set is the set
of all Pareto optimal parameter vectors, and the corresponding set of objective
vectors is the Pareto optimal front. The Pareto optimal set is a subset of the
search space, whereas the Pareto optimal front is a subset of the fitness space.
The following types of relationships are useful because they allow us to sep-
arate the convergence and spread aspects of sets of solutions for a problem. A
distance parameter is one that when modified only ever results in a dominated,
dominating, or equivalent parameter vector. A position parameter is one that
when modified only ever results in an incomparable or equivalent parameter
vector. All other parameters are mixed parameters.
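
The definitions above rest on the usual Pareto dominance relation between objective vectors. The following is a minimal sketch, assuming minimisation; the helper name `compare` is hypothetical and not part of the paper. Under these definitions, a distance parameter is one whose modification only ever yields "dominates", "dominated", or "equivalent", and a position parameter is one whose modification only ever yields "incomparable" or "equivalent".

```python
from typing import Sequence


def compare(a: Sequence[float], b: Sequence[float]) -> str:
    """Relate objective vector a to objective vector b (minimisation assumed)."""
    better = any(x < y for x, y in zip(a, b))  # a is strictly better somewhere
    worse = any(x > y for x, y in zip(a, b))   # a is strictly worse somewhere
    if better and worse:
        return "incomparable"
    if better:
        return "dominates"
    if worse:
        return "dominated"
    return "equivalent"
```

Classifying a parameter in practice would require checking this relation over all modifications of that parameter, which the sketch does not attempt.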
When the projection of the Pareto optimal set onto the domain of a single
parameter, the parameter optima, is a single value at the edge of the domain,
then we call the parameter an extremal parameter. If instead the parameter
optima cluster around the middle of the domain, then it is a medial parameter.
Extremal parameters can be unduly favoured by truncation based mutation
correction strategies, whereas medial parameters can be favoured by EAs that
employ intermediate recombination [7].

3 Previous Multi-objective Test Problems


Deb’s toolkit [1] for constructing two-objective problems is the only toolkit for
multi-objective problems of which we are aware. Deb’s toolkit segregates param-
eters into distance and position parameters — mixed parameters are atypical.
Three functionals are used that control the shape of and position on the trade-
off surface, and the distance to the Pareto optimal front. Deb’s toolkit provides
a number of functionals, including multi-modal and biased functions, most of
which are scalable parameter-wise.
Deb’s toolkit has various limitations: it was designed for two-objective prob-
lems, no real-valued deceptive functions are suggested, the suggested functions
do not facilitate the construction of problems with degenerate Pareto optimal
front geometries², only one non-separable function is suggested (but it scales
poorly and has but weak parameter dependencies), and position and distance
parameters are always independent of one another³.
Related by authorship to Deb’s toolkit are the DTLZ test problems [4, 8],
which, unlike the majority of multi-objective problems, are scalable objective-
wise. This important characteristic has facilitated several recent investigations
into what are commonly called “many” objective problems. Like Deb’s toolkit,
the DTLZ problems have distinct distance and position components, have known
Pareto optimal fronts, and are simple to employ. The DTLZ problems also ad-
dress a variety of problem characteristics, including multi-modality, bias, and
several Pareto optimal front geometries.
However, the DTLZ test suite has serious limitations: none of its problems
is deceptive, none of its problems is non-separable⁴, and the number of position
parameters is always fixed relative to the number of objectives. DTLZ5 and
DTLZ6 also deserve special mention, as they are both meant to be problems
with degenerate Pareto optimal fronts. However, we have found that this is
untrue for instances with four or more objectives (due to space limitations, we
omit the proof). As DTLZ5 and DTLZ6 do not behave as expected, their Pareto
optimal fronts are unclear beyond three objectives.
Whilst other test problems exist, including those employed by Van Veld-
huizen [9], Zitzler et al. [10], and others [11, 12], they tend to be of limited scope,
and are often ad hoc and consequently difficult to analyse (but by the same
token, some also have unusual Pareto optimal geometries). Many are restricted
to three or fewer parameters or objectives, some have poorly located parame-
ter optima, few are non-separable (and even fewer are both non-separable and
multi-modal), and those that are non-separable are either not scalable, or have
unknown Pareto optimal fronts.
² A degenerate front is a front that is of lower dimension than the objective space in which it is embedded, less one.
³ Deb does suggest a way of making position and distance parameters mutually non-separable. However, the suggested approach can lead to cyclical dependencies, potentially causing unwanted side effects on the fitness landscape.
⁴ Technically speaking, the majority of the DTLZ problems are non-separable, but only marginally so.

Despite the variety of existing test problems, there is a clear need for additional
work. At present there is no toolkit for creating problems with an arbitrary
number of objectives, where desirable features can easily be incorporated or
omitted as desired. We remedy this problem with our WFG Toolkit.

4 The WFG Toolkit

The WFG Toolkit defines a problem in terms of an underlying vector of parameters x. The vector x is always associated with a simple underlying problem
that defines the fitness space. The vector x is derived, via a series of transition
vectors, from a vector of working parameters z. Each transition vector adds com-
plexity to the underlying problem, such as multi-modality and non-separability.
The EA directly manipulates z, through which x is indirectly manipulated.
Unlike previous test suites in which complexity is “hard-wired” in an ad-hoc
manner, the WFG Toolkit allows a test problem designer to control, via a se-
ries of composable transformations, which features will be present in the test
problem. To create a problem, the test problem designer selects several shape
functions to determine the geometry of the fitness space, and employs a num-
ber of transformation functions that facilitate the creation of transition vectors.
Transformation functions must be designed carefully such that the underlying
fitness space (and Pareto optimal front) remains intact with a relatively easy to
determine Pareto optimal set. The WFG Toolkit provides a variety of predefined
shape and transformation functions to help ensure this is the case.
For convenience, working parameters are labelled as either distance- or pos-
ition-related parameters (even if they are actually mixed parameters), depending
on the type of the underlying parameter being mapped to.
All problems created by the WFG Toolkit conform to the following format:

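$$
\begin{aligned}
\text{Given} \quad & \mathbf{z} = \{z_1, \ldots, z_k, z_{k+1}, \ldots, z_n\} \\
\text{Minimise} \quad & f_{m=1:M}(\mathbf{x}) = x_M + S_m h_m(x_1, \ldots, x_{M-1}) \\
\text{where} \quad & \mathbf{x} = \{x_1, \ldots, x_M\} = \{\max(t^p_M, A_1)(t^p_1 - 0.5) + 0.5, \ldots, \\
& \qquad \max(t^p_M, A_{M-1})(t^p_{M-1} - 0.5) + 0.5, \; t^p_M\} \\
& \mathbf{t}^p = \{t^p_1, \ldots, t^p_M\} \mapsfrom \mathbf{t}^{p-1} \mapsfrom \cdots \mapsfrom \mathbf{t}^1 \mapsfrom \mathbf{z}_{[0,1]} \\
& \mathbf{z}_{[0,1]} = \{z_{1,[0,1]}, \ldots, z_{n,[0,1]}\} = \{z_1/z_{1,\max}, \ldots, z_n/z_{n,\max}\}
\end{aligned}
$$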

where M is the number of objectives, x is a set of M underlying parameters (where $x_M$ is an underlying distance parameter, and $x_{1:M-1}$ are underlying position parameters), z is a set of $k + l = n \geq M$ working parameters (the first k and the last l working parameters are position- and distance-related parameters respectively), $A_{1:M-1} \in \{0, 1\}$ are degeneracy constants (for each $A_i = 0$, the dimensionality of the Pareto optimal front is reduced by one), $h_{1:M}$ are shape functions, $S_{1:M} > 0$ are scaling constants, and $\mathbf{t}^{1:p}$ are transition vectors, where the "maps from" arrow ($\mapsfrom$) indicates that each transition vector is created from another vector via transformation functions. The domain of all $z_i \in \mathbf{z}$ is $[0, z_{i,\max}]$ (the lower bound is always zero for convenience), where all $z_{i,\max} > 0$. Note that all $x_i \in \mathbf{x}$ will have domain [0, 1].
Some observations can be made about the above formalism: substituting in
xM = 0 and disregarding all transition vectors provides a parametric equation
that covers and is covered by the Pareto optimal front of the actual problem,
working parameters can have dissimilar domains (which would encourage EAs
to normalise parameter domains), and employing dissimilar scaling constants
results in dissimilar Pareto optimal front tradeoff ranges (this is more represen-
tative of real world problems, and encourages EAs to normalise fitness values).
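
To make the format above concrete, the following Python sketch shows one way the final steps could be coded: normalising the working parameters, mapping the last transition vector to the underlying parameters x using the degeneracy constants, and evaluating the objectives. The function names (`normalise`, `underlying_parameters`, `objectives`) are illustrative assumptions, not names used by the authors, and the transition vectors themselves are taken as given.

```python
from typing import Callable, List, Sequence


def normalise(z: Sequence[float], z_max: Sequence[float]) -> List[float]:
    """z_[0,1]: scale each working parameter from [0, z_i,max] to [0, 1]."""
    return [zi / zmax for zi, zmax in zip(z, z_max)]


def underlying_parameters(t_p: Sequence[float], A: Sequence[int]) -> List[float]:
    """Map the final transition vector t^p (length M) to x.

    A holds the M - 1 degeneracy constants, each 0 or 1.
    """
    t_last = t_p[-1]
    x = [max(t_last, a) * (t - 0.5) + 0.5 for t, a in zip(t_p[:-1], A)]
    x.append(t_last)  # x_M is the underlying distance parameter
    return x


def objectives(
    x: Sequence[float],
    S: Sequence[float],
    h: Sequence[Callable[[Sequence[float]], float]],
) -> List[float]:
    """f_m(x) = x_M + S_m * h_m(x_1, ..., x_{M-1}) for m = 1..M."""
    position = list(x[:-1])
    return [x[-1] + s * h_m(position) for s, h_m in zip(S, h)]
```

For example, with M = 2, the linear shapes h1 = x1 and h2 = 1 - x1, S = (2, 4), A1 = 1 and t^p = (0.25, 0), the sketch gives x = (0.25, 0) and f = (0.5, 3.0). Setting A1 = 0 with the same t^p collapses x1 to 0.5, which illustrates how a zero degeneracy constant removes a dimension from the front.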

4.1 Shape Functions

Shape functions determine the nature of the Pareto optimal front, and map pa-
rameters with domain [0, 1] onto the range [0, 1]. Each of h1:M must be associated
with a shape function. For example, letting h1 = linear1 , hm=2:M −1 = convexm ,
and hM = mixedM indicates that h1 uses the linear shape function, hM uses the
mixed shape function, and all of h2:M −1 use convex shape functions.
Table 1 presents five different types of shape functions. Example Pareto op-
timal fronts constructed using these shape functions are given in Fig. 1.

Table 1. Shape functions. In all cases, $x_1, \ldots, x_{M-1} \in [0, 1]$. $A$, $\alpha$, and $\beta$ are constants.

Linear
$$
\begin{aligned}
\mathrm{linear}_1(x_1, \ldots, x_{M-1}) &= \prod_{i=1}^{M-1} x_i \\
\mathrm{linear}_{m=2:M-1}(x_1, \ldots, x_{M-1}) &= \Big(\prod_{i=1}^{M-m} x_i\Big)(1 - x_{M-m+1}) \\
\mathrm{linear}_M(x_1, \ldots, x_{M-1}) &= 1 - x_1
\end{aligned}
$$
When $h_{m=1:M} = \mathrm{linear}_m$, the Pareto optimal front is a linear hyperplane, where $\sum_{m=1}^{M} h_m = 1$.

Convex
$$
\begin{aligned}
\mathrm{convex}_1(x_1, \ldots, x_{M-1}) &= \prod_{i=1}^{M-1} (1 - \cos(x_i \pi/2)) \\
\mathrm{convex}_{m=2:M-1}(x_1, \ldots, x_{M-1}) &= \Big(\prod_{i=1}^{M-m} (1 - \cos(x_i \pi/2))\Big)(1 - \sin(x_{M-m+1} \pi/2)) \\
\mathrm{convex}_M(x_1, \ldots, x_{M-1}) &= 1 - \sin(x_1 \pi/2)
\end{aligned}
$$
When $h_{m=1:M} = \mathrm{convex}_m$, the Pareto optimal front is purely convex.

Concave
$$
\begin{aligned}
\mathrm{concave}_1(x_1, \ldots, x_{M-1}) &= \prod_{i=1}^{M-1} \sin(x_i \pi/2) \\
\mathrm{concave}_{m=2:M-1}(x_1, \ldots, x_{M-1}) &= \Big(\prod_{i=1}^{M-m} \sin(x_i \pi/2)\Big)\cos(x_{M-m+1} \pi/2) \\
\mathrm{concave}_M(x_1, \ldots, x_{M-1}) &= \cos(x_1 \pi/2)
\end{aligned}
$$
When $h_{m=1:M} = \mathrm{concave}_m$, the Pareto optimal front is purely concave, and a region of the hyper-sphere of radius one centred at the origin, where $\sum_{m=1}^{M} h_m^2 = 1$.

Mixed convex/concave ($\alpha > 0$, $A \in \{1, 2, \ldots\}$)
$$
\mathrm{mixed}_M(x_1, \ldots, x_{M-1}) = \Big(1 - x_1 - \frac{\cos(2A\pi x_1 + \pi/2)}{2A\pi}\Big)^{\alpha}
$$
Causes the Pareto optimal front to contain both convex and concave segments, the number of which is controlled by $A$. The overall shape is controlled by $\alpha$: when $\alpha > 1$ or when $\alpha < 1$, the overall shape is convex or concave respectively. When $\alpha = 1$, the overall shape is linear.

Disconnected ($\alpha, \beta > 0$, $A \in \{1, 2, \ldots\}$)
$$
\mathrm{disc}_M(x_1, \ldots, x_{M-1}) = 1 - (x_1)^{\alpha} \cos^2(A (x_1)^{\beta} \pi)
$$
Causes the Pareto optimal front to have disconnected regions, the number of which is controlled by $A$. The overall shape is controlled by $\alpha$ (when $\alpha > 1$ or when $\alpha < 1$, the overall shape is concave or convex respectively, and when $\alpha = 1$, the overall shape is linear), and $\beta$ influences the location of the disconnected regions (larger values push the location of disconnected regions towards larger values of $x_1$, and vice versa).
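
As an illustration only, the shape functions of Table 1 could be written in Python as below. This is a sketch under stated assumptions rather than the authors' implementation: `x` is taken to be the list of M - 1 position parameters, `m` is the 1-based objective index, the helper `_shape` is hypothetical, and the clamp inside `mixed` is an added guard against rounding error.

```python
import math
from typing import Callable, Sequence


def _shape(x: Sequence[float], m: int,
           inner: Callable[[float], float],
           last: Callable[[float], float]) -> float:
    """Shared pattern: product over x_1..x_{M-m}, times a term in x_{M-m+1} if m > 1."""
    M = len(x) + 1
    value = math.prod(inner(xi) for xi in x[: M - m])
    if m > 1:
        value *= last(x[M - m])  # x_{M-m+1} in the paper's 1-based indexing
    return value


def linear(x: Sequence[float], m: int) -> float:
    return _shape(x, m, lambda v: v, lambda v: 1.0 - v)


def convex(x: Sequence[float], m: int) -> float:
    return _shape(x, m,
                  lambda v: 1.0 - math.cos(v * math.pi / 2),
                  lambda v: 1.0 - math.sin(v * math.pi / 2))


def concave(x: Sequence[float], m: int) -> float:
    return _shape(x, m,
                  lambda v: math.sin(v * math.pi / 2),
                  lambda v: math.cos(v * math.pi / 2))


def mixed(x: Sequence[float], alpha: float, A: int) -> float:
    """mixed_M; only x_1 is used."""
    base = 1.0 - x[0] - math.cos(2 * A * math.pi * x[0] + math.pi / 2) / (2 * A * math.pi)
    return max(0.0, base) ** alpha  # clamp: rounding can give a tiny negative base


def disc(x: Sequence[float], alpha: float, beta: float, A: int) -> float:
    """disc_M; only x_1 is used."""
    return 1.0 - x[0] ** alpha * math.cos(A * x[0] ** beta * math.pi) ** 2
```

A quick sanity check consistent with Table 1 is that the linear shapes sum to one and the squared concave shapes sum to one for any x in the unit hypercube.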

[Figure 1: six three-dimensional plots, panels (a) to (f).
(a) $h_{m=1:3} = \mathrm{linear}_m$.
(b) $h_{m=1:3} = \mathrm{convex}_m$.
(c) $h_{m=1:3} = \mathrm{concave}_m$.
(d) $h_{m=1:2} = \mathrm{linear}_m$, $h_3 = \mathrm{mixed}_3$ ($\alpha = 0.4$, $A = 3$).
(e) $h_{m=1:2} = \mathrm{concave}_m$, $h_3 = \mathrm{disc}_3$ (dominated regions not removed, $\alpha = 0.4$, $\beta = 0.4$, $A = 3$).
(f) $h_{m=1:3} = \mathrm{concave}_m$, degenerate on $x_2$, shown for distances 0, 0.5, and 1.]

Fig. 1. Example three-objective Pareto optimal fronts.

4.2 Transformation Functions

Transformation functions map input parameters with domain [0, 1] onto the
range [0, 1]. All transformation functions take a vector of parameters (called the
primary parameters) and map them to a single value. Transformation functions
may also employ constants and secondary parameters that further influence the
mapping. Primary parameters allow us to qualify working parameters as being
position- and distance-related.

There are three types of transformation functions: bias, shift, and reduction
functions. Bias and shift functions only ever employ one primary parameter,
whereas reduction functions can employ many.
Bias transformations have a natural impact on the search process by bias-
ing the fitness landscape. Shift transformations move the location of optima.
In the absence of any shift, all distance-related parameters would be extremal
parameters, with optimal value at zero. Shift transformations can be used to set
the location of parameter optima (subject to skewing by bias transformations),
which is useful if medial and extremal parameters are to be avoided. We rec-
ommend that all distance-related parameters be subjected to at least one shift
transformation.
The transformation functions are specified in Table 2 and plotted in Fig. 2.
To ensure problems are well designed, some restrictions apply as given in Table 3.
For brevity, we have omitted a weighted product reduction function (analogous
to the weighted sum reduction function).
By incorporating secondary parameters via a reduction function, b_param can create dependencies between distinct parameters, including position- and distance-related parameters. Moreover, when employed before any shift transformation, b_param can create objectives that are effectively non-separable: a separable optimisation approach would fail unless given multiple iterations, or a specific order of parameters to optimise.
The deceptive and multi-modal shift transformations make the corresponding problem deceptive and multi-modal respectively⁵. When applied to position-related parameters, some regions of the Pareto optimal set can become difficult to find, and the mapping from the Pareto optimal set to the Pareto optimal front will be many-to-one (even when k = M − 1)⁶. When applied to distance-related parameters, finding any Pareto optimal solution becomes more difficult.
The flat region transformation can have a significant impact on the fitness landscape⁷, and can also be used to create a stark many-to-one mapping from the Pareto optimal set to the Pareto optimal front.

5 Building an Example Test Problem

Creating problems with the WFG Toolkit involves three main steps: specifying
values for the underlying formalism (including scaling constants and parameter
domains), specifying the shape functions, and specifying transition vectors. To
aid in construction, a computer-aided design tool or meta-language could be used
to help select and connect together the different components making up the test
⁵ Multi-modal problems are difficult because an optimiser can become stuck in local optima. Deceptive problems (as defined by Deb [1]) exacerbate this difficulty by placing the global optimum in an unlikely place.
⁶ Many-to-one mappings from the Pareto optimal set to the Pareto optimal front present difficulties to the optimiser, as choices must be made between two otherwise equivalent parameter vectors.
⁷ Optimisers can have difficulty with flat regions due to a lack of gradient information.

Table 2. Transformation functions. The primary parameters $y$ and $y_1, \ldots, y_{|\mathbf{y}|}$ always have domain [0, 1]. $A$, $B$, $C$, $\alpha$, and $\beta$ are constants. For b_param, $\mathbf{y}'$ is a vector of secondary parameters (of domain [0, 1]), and $u$ is a reduction function.

Bias: Polynomial ($\alpha > 0$, $\alpha \neq 1$)
$$
\mathrm{b\_poly}(y, \alpha) = y^{\alpha}
$$
When $\alpha > 1$ or when $\alpha < 1$, $y$ is biased towards zero or towards one respectively.

Bias: Flat Region ($A, B, C \in [0, 1]$, $B < C$, $B = 0 \Rightarrow A = 0 \wedge C \neq 1$, $C = 1 \Rightarrow A = 1 \wedge B \neq 0$)
$$
\mathrm{b\_flat}(y, A, B, C) = A + \min(0, \lfloor y - B \rfloor)\frac{A(B - y)}{B} - \min(0, \lfloor C - y \rfloor)\frac{(1 - A)(y - C)}{1 - C}
$$
Values of $y$ between $B$ and $C$ (the area of the flat region) are all mapped to the value $A$.

Bias: Parameter Dependent ($A \in (0, 1)$, $0 < B < C$)
$$
\begin{aligned}
\mathrm{b\_param}(y, \mathbf{y}', A, B, C) &= y^{B + (C - B)\, v(u(\mathbf{y}'))} \\
v(u(\mathbf{y}')) &= A - (1 - 2u(\mathbf{y}'))\,\big|\lfloor 0.5 - u(\mathbf{y}') \rfloor + A\big|
\end{aligned}
$$
$A$, $B$, $C$, and the secondary parameter vector $\mathbf{y}'$ together determine the degree to which $y$ is biased by being raised to an associated power: values of $u(\mathbf{y}') \in [0, 0.5]$ are mapped linearly onto $[B, B + (C - B)A]$, and values of $u(\mathbf{y}') \in [0.5, 1]$ are mapped linearly onto $[B + (C - B)A, C]$.

Shift: Linear ($A \in (0, 1)$)
$$
\mathrm{s\_linear}(y, A) = \frac{|y - A|}{|\lfloor A - y \rfloor + A|}
$$
$A$ is the value for which $y$ is mapped to zero.

Shift: Deceptive ($A \in (0, 1)$, $0 < B \ll 1$, $0 < C \ll 1$, $A - B > 0$, $A + B < 1$)
$$
\mathrm{s\_decept}(y, A, B, C) = 1 + (|y - A| - B) \times \left(\frac{\lfloor y - A + B \rfloor \big(1 - C + \frac{A - B}{B}\big)}{A - B} + \frac{\lfloor A + B - y \rfloor \big(1 - C + \frac{1 - A - B}{B}\big)}{1 - A - B} + \frac{1}{B}\right)
$$
$A$ is the value at which $y$ is mapped to zero, and the global minimum of the transformation. $B$ is the "aperture" size of the well/basin leading to the global minimum at $A$, and $C$ is the value of the deceptive minima (there are always two deceptive minima).

Shift: Multi-modal ($A \in \{1, 2, \ldots\}$, $B \geq 0$, $(4A + 2)\pi \geq 4B$, $C \in (0, 1)$)
$$
\mathrm{s\_multi}(y, A, B, C) = \frac{1 + \cos\!\Big[(4A + 2)\pi\Big(0.5 - \frac{|y - C|}{2(\lfloor C - y \rfloor + C)}\Big)\Big] + 4B\Big(\frac{|y - C|}{2(\lfloor C - y \rfloor + C)}\Big)^2}{B + 2}
$$
$A$ controls the number of minima, $B$ controls the magnitude of the "hill sizes" of the multi-modality, and $C$ is the value for which $y$ is mapped to zero. When $B = 0$, $2A + 1$ values of $y$ (one at $C$) are mapped to zero, and when $B \neq 0$, there are $2A$ local minima, and one global minimum at $C$. Larger values of $A$ and smaller values of $B$ create more difficult problems.

Reduction: Weighted Sum ($|\mathbf{w}| = |\mathbf{y}|$, $w_1, \ldots, w_{|\mathbf{y}|} > 0$)
$$
\mathrm{r\_sum}(\mathbf{y}, \mathbf{w}) = \Big(\sum_{i=1}^{|\mathbf{y}|} w_i y_i\Big) \Big/ \sum_{i=1}^{|\mathbf{y}|} w_i
$$
By varying the constants of the weight vector $\mathbf{w}$, EAs can be forced to treat parameters differently.

Reduction: Non-separable ($A \in \{1, \ldots, |\mathbf{y}|\}$, $|\mathbf{y}| \bmod A = 0$)
$$
\mathrm{r\_nonsep}(\mathbf{y}, A) = \frac{\sum_{j=1}^{|\mathbf{y}|}\Big(y_j + \sum_{k=0}^{A-2} \big|y_j - y_{1 + (j + k) \bmod |\mathbf{y}|}\big|\Big)}{\frac{|\mathbf{y}|}{A}\,\lceil A/2 \rceil\,(1 + 2A - 2\lceil A/2 \rceil)}
$$
$A$ controls the degree of non-separability (noting that $\mathrm{r\_nonsep}(\mathbf{y}, 1) = \mathrm{r\_sum}(\mathbf{y}, \{1, \ldots, 1\})$).
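
The transformation functions of Table 2 translate fairly directly into code. The Python sketch below is illustrative, not the authors' implementation, and makes a few assumptions that are not in the paper: `b_param` receives the already reduced value u(y') as the argument `u`, `b_flat` skips a term when its denominator would be zero (the term is zero in that case by the stated restrictions), and no clamping of rounding error to [0, 1] is applied.

```python
import math
from typing import Sequence


def b_poly(y: float, alpha: float) -> float:
    """Polynomial bias."""
    return y ** alpha


def b_flat(y: float, A: float, B: float, C: float) -> float:
    """Flat-region bias: every y in [B, C] maps to A."""
    low = min(0, math.floor(y - B)) * A * (B - y) / B if B > 0 else 0.0
    high = min(0, math.floor(C - y)) * (1 - A) * (y - C) / (1 - C) if C < 1 else 0.0
    return A + low - high


def b_param(y: float, u: float, A: float, B: float, C: float) -> float:
    """Parameter-dependent bias; u is the reduced secondary value u(y')."""
    v = A - (1 - 2 * u) * abs(math.floor(0.5 - u) + A)
    return y ** (B + (C - B) * v)


def s_linear(y: float, A: float) -> float:
    """Linear shift: y == A maps to zero."""
    return abs(y - A) / abs(math.floor(A - y) + A)


def s_decept(y: float, A: float, B: float, C: float) -> float:
    """Deceptive shift: global minimum at A, deceptive minima of value C."""
    left = math.floor(y - A + B) * (1 - C + (A - B) / B) / (A - B)
    right = math.floor(A + B - y) * (1 - C + (1 - A - B) / B) / (1 - A - B)
    return 1 + (abs(y - A) - B) * (left + right + 1 / B)


def s_multi(y: float, A: int, B: float, C: float) -> float:
    """Multi-modal shift: global minimum at C."""
    t = abs(y - C) / (2 * (math.floor(C - y) + C))
    return (1 + math.cos((4 * A + 2) * math.pi * (0.5 - t)) + 4 * B * t * t) / (B + 2)


def r_sum(y: Sequence[float], w: Sequence[float]) -> float:
    """Weighted-sum reduction."""
    return sum(wi * yi for wi, yi in zip(w, y)) / sum(w)


def r_nonsep(y: Sequence[float], A: int) -> float:
    """Non-separable reduction; len(y) must be divisible by A."""
    n = len(y)
    total = 0.0
    for j in range(n):
        total += y[j]
        for k in range(A - 1):
            total += abs(y[j] - y[(j + 1 + k) % n])  # 0-based form of y_{1+(j+k) mod |y|}
    half = math.ceil(A / 2)
    return total / ((n / A) * half * (1 + 2 * A - 2 * half))
```

With the constants used elsewhere in the paper, `s_decept(0.35, 0.35, 0.001, 0.05)` and `s_multi(0.35, 30, 95, 0.35)` both evaluate to approximately zero, matching the statement that the shift constant marks the value mapped to zero.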

Table 3. Transformation function restrictions.

Restriction | Comment
Constants | Must be fixed (not tied to the value of any parameters).
Primary parameters | For any given transition vector, all parameters of the originating transition vector must be employed exactly once as a primary parameter (counting parameters that appear independently as primary parameters), and in the same order in which they appear in the originating transition vector.
Secondary parameters | Care must be taken to avoid cyclical dependencies in b_param. Consider the following terminology: if a is a primary parameter of b_param, and b is one of the secondary parameters, then we say that a depends on b. If b likewise depends on c, then a also (indirectly) depends on c. When a depends on some parameter b, then there is an associated dependency between the corresponding working parameters. To prevent cyclical dependencies, no two working parameters should be dependent on one another. In addition, a parameter should not depend on itself.
Shifts | Parameters should not be subjected to more than one shift transformation.
Reductions | Reduction transformations should belong to transition vectors that are closer to the underlying parameter vector than any shift transformation.
b_flat | When A = 0, b_flat should only belong to transition vectors that are further away from the underlying parameter vector than any shift or reduction transformation.

[Figure 2: eight plots, panels (a) to (h).
(a) b_poly ($\alpha = 20$).
(b) b_flat ($A = 0.7$, $B = 0.4$, $C = 0.5$).
(c) b_param plotted against $u$ ($A = 0.5$, $B = 2$, $C = 10$).
(d) s_linear ($A = 0.35$).
(e) s_decept ($A = 0.35$, $B = 0.005$, $C = 0.05$).
(f) s_multi ($A = 5$, $B = 10$, $C = 0.35$).
(g) r_sum for two parameters ($w_1 = 1$, $w_2 = 5$).
(h) r_nonsep for two parameters ($A = 2$).]

Fig. 2. Example transformations. Each example plots the value of the input primary
parameter(s) versus the result of the transformation.

problem. With the use of sensible default values, the test problem designer then
need only specify which features of interest they desire in the test problem. An
example scalable test problem is specified in Table 4 and expanded in Fig. 3.

Table 4. An example test problem. The number of position-related parameters, $k$, must be divisible by the number of underlying position parameters, $M - 1$ (this simplifies $\mathbf{t}^3$). The number of distance-related parameters, $l$, can be set to any positive integer. To enhance readability, for any transition vector $\mathbf{t}^i$, we let $\mathbf{y} = \mathbf{t}^{i-1}$. For $\mathbf{t}^1$, let $\mathbf{y} = \mathbf{z}_{[0,1]} = \{z_1/2, \ldots, z_n/(2n)\}$.

Type: Constants
$$
S_{m=1:M} = 2m, \qquad A_{1:M-1} = 1
$$
The settings for $S_{1:M}$ ensure the Pareto optimal front will have dissimilar trade-off magnitudes, and the settings for $A_{1:M-1}$ ensure the Pareto optimal front is not degenerate.

Type: Domains
$$
z_{i=1:n,\max} = 2i
$$
The working parameters have domains of dissimilar magnitude.

Type: Shape
$$
h_{m=1:M} = \mathrm{concave}_m
$$
The purely concave Pareto optimal front facilitates the use of some performance metrics, where the distance of a solution to the nearest point on the Pareto optimal front must be determined.

Type: $\mathbf{t}^1$
$$
\begin{aligned}
t^1_{i=1:n-1} &= \mathrm{b\_param}\big(y_i, \mathrm{r\_sum}(\{y_{i+1}, \ldots, y_n\}, \{1, \ldots, 1\}), \tfrac{0.98}{49.98}, 0.02, 50\big) \\
t^1_n &= y_n
\end{aligned}
$$
By employing the parameter dependent bias transformation, this transition vector ensures that distance- and position-related working parameters are inter-dependent and somewhat non-separable.

Type: $\mathbf{t}^2$
$$
\begin{aligned}
t^2_{i=1:k} &= \mathrm{s\_decept}(y_i, 0.35, 0.001, 0.05) \\
t^2_{i=k+1:n} &= \mathrm{s\_multi}(y_i, 30, 95, 0.35)
\end{aligned}
$$
This transition vector makes some parts of the Pareto optimal front more difficult to determine (due to the deceptive transformation), and also makes it more difficult to converge to the Pareto optimal front (due to the multi-modal transformation). The multi-modality is similar to Rastrigin's function, with many local optima ($61^l - 1$), and one global optimum, where the "hill size" between adjacent local optima is relatively small.

Type: $\mathbf{t}^3$
$$
\begin{aligned}
t^3_{i=1:M-1} &= \mathrm{r\_nonsep}(\{y_{(i-1)k/(M-1)+1}, \ldots, y_{ik/(M-1)}\}, k/(M-1)) \\
t^3_M &= \mathrm{r\_nonsep}(\{y_{k+1}, \ldots, y_n\}, l)
\end{aligned}
$$
This transition vector ensures that all objectives are non-separable, and also reduces the number of parameters down to $M$, as required by the framework.

This example problem is scalable both objective- and parameter-wise, where the number of distance- and position-related parameters can be scaled independently. For a solution to be Pareto optimal, every distance-related working parameter must satisfy:

$$
z_{i=k+1:n} = 2i \times
\begin{cases}
0.35^{\left(0.02 + 1.96\, \mathrm{r\_sum}(\{z_{i+1}, \ldots, z_n\}, \{1, \ldots, 1\})\right)^{-1}}, & i \neq n \\
0.35, & i = n
\end{cases}
$$

which can be found by first determining $z_n$, then $z_{n-1}$, and so on, until the required value for $z_{k+1}$ is determined. Once the optimal values for $z_{k+1:n}$ are determined, the position-related parameters can be varied arbitrarily to obtain different Pareto optimal solutions.
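
The back-substitution described above can be sketched in Python as follows. This is an illustration under assumptions, not the authors' code: the function names are hypothetical, the reduction is applied to the normalised values z_j/(2j) (as in the expanded form of the problem), and the exponent is computed from the full b_param definition with the Table 4 constants, which reduces to 0.02 + 1.96u whenever u ≤ 0.5.

```python
import math
from typing import Dict, List

# b_param constants from Table 4
A, B, C = 0.98 / 49.98, 0.02, 50.0


def b_param_exponent(u: float) -> float:
    """Exponent applied by b_param for a reduced secondary value u."""
    v = A - (1 - 2 * u) * abs(math.floor(0.5 - u) + A)
    return B + (C - B) * v


def optimal_distance_parameters(k: int, l: int) -> List[float]:
    """Return the Pareto optimal z_{k+1}, ..., z_n, working backwards from z_n."""
    n = k + l
    normalised: Dict[int, float] = {}  # i -> z_i / (2i)
    for i in range(n, k, -1):
        if i == n:
            normalised[i] = 0.35
        else:
            # unweighted mean of the already determined later parameters
            u = sum(normalised[j] for j in range(i + 1, n + 1)) / (n - i)
            normalised[i] = 0.35 ** (1.0 / b_param_exponent(u))
    return [2 * i * normalised[i] for i in range(k + 1, n + 1)]
```

Each returned value, once normalised and passed back through b_param, should come out as 0.35, which is the value that s_multi maps to zero in this example problem.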
The example problem has a distinct many-to-one mapping from the Pareto optimal set to the Pareto optimal front due to the deceptive transformation of the position-related parameters. All objectives are non-separable, deceptive, and multi-modal, the latter with respect to the distance component. The problem is also biased in a parameter dependent manner.
This example constitutes a well designed scalable problem that is both non-separable and multi-modal; we are not aware of any problem in the literature with comparable characteristics.

$$
\begin{aligned}
\text{Given} \quad & \mathbf{z} = \{z_1, \ldots, z_k, z_{k+1}, \ldots, z_n\} \\
\text{Minimise} \quad & f_1(\mathbf{x}) = x_M + 2 \prod_{i=1}^{M-1} \sin(x_i \pi/2) \\
& f_{m=2:M-1}(\mathbf{x}) = x_M + 2m \Big(\prod_{i=1}^{M-m} \sin(x_i \pi/2)\Big) \cos(x_{M-m+1} \pi/2) \\
& f_M(\mathbf{x}) = x_M + 2M \cos(x_1 \pi/2) \\
\text{where} \quad & x_{i=1:M-1} = \mathrm{r\_nonsep}(\{y_{(i-1)k/(M-1)+1}, \ldots, y_{ik/(M-1)}\}, k/(M-1)) \\
& x_M = \mathrm{r\_nonsep}(\{y_{k+1}, \ldots, y_n\}, l) \\
& y_{i=1:k} = \mathrm{s\_decept}(y'_i, 0.35, 0.001, 0.05) \\
& y_{i=k+1:n} = \mathrm{s\_multi}(y'_i, 30, 95, 0.35) \\
& y'_{i=1:n-1} = \mathrm{b\_param}\Big(z_i/(2i), \sum_{j=i+1}^{n} z_j/(2j(n-i)), \tfrac{0.98}{49.98}, 0.02, 50\Big) \\
& y'_n = z_n/(2n)
\end{aligned}
$$

Fig. 3. The expanded form of the problem defined in Table 4. $|\mathbf{z}| = n = k + l$, $k \in \{M-1, 2(M-1), 3(M-1), \ldots\}$, $l \in \{1, 2, \ldots\}$, and the domain of all $z_i \in \mathbf{z}$ is $[0, 2i]$.

6 Implications

In this section, we consider the performance of NSGA-II [13] on five related problem instances with varying levels of complexity (see Table 5). I1 is separable and hence the simplest problem. I2 adds complexity by making position-related parameters depend on distance-related parameters (and other position-related parameters), whereas I3 makes distance-related parameters depend on position-related parameters (and other distance-related parameters). I4 instead employs a non-separable reduction. I5 combines the difficulties of both I3 and I4. I1–I5 all have concave Pareto optimal fronts, and are all uni-modal.
To facilitate analysis, I1–I5 are tested only for two objectives, with k = 1 and l = 10. The following (un-optimised) NSGA-II settings were used: population size 100, 250 generations, crossover probability 0.9, real parameter mutation probability 1/n, SBX parameter 10, and mutation parameter 50. NSGA-II was run 35 times on each of I1–I5. The 50% attainment surfaces [14] (the “median” non-dominated fronts) are plotted in Fig. 4.
These are the first results we know of for NSGA-II on scalable, non-separable problems with known Pareto optimal sets, and clearly show the effects different types of problem complexity can have. For I1, NSGA-II effectively finds the Pareto optimal front, whereas having distance parameters depend on position parameters, as with I3 and I5, causes much difficulty. Moreover, I1–I5 are all uni-modal, and as the tests are only in two objectives, more challenging problem instances are easily envisaged. NSGA-II can clearly be challenged by a variety of problem characteristics.
Table 5. Test problems I1–I5. The number of position-related parameters, k, must
be divisible by the number of underlying position parameters, M − 1 (this simplifies
reductions). To enhance readability, for any transition vector t^i, we let y = t^{i−1}.
For t^1, let y = z_[0,1] = z.

Problem  Type       Setting

I1       Constants  S_{1:M} = 1
                    A_{1:M−1} = 1
         Domains    z_{i=1:n,max} = 1
         Shape      h_{m=1:M} = concave_m
         t^1        t^1_{i=1:n} = y_i
         t^2        t^2_{i=1:k} = y_i
                    t^2_{i=k+1:n} = s_linear(y_i, 0.35)
         t^3        t^3_{i=1:M−1} = r_sum({y_{(i−1)k/(M−1)+1}, ..., y_{ik/(M−1)}}, {1, ..., 1})
                    t^3_M = r_sum({y_{k+1}, ..., y_n}, {1, ..., 1})
I2       As I1, except the following replaces t^1:
         t^1        t^1_{i=1:n−1} = b_param(y_i, r_sum({y_{i+1}, ..., y_n}, {1, ..., 1}), 0.98/49.98, 0.02, 50)
                    t^1_n = y_n
I3       As I1, except the following replaces t^1:
         t^1        t^1_1 = y_1
                    t^1_{i=2:n} = b_param(y_i, r_sum({y_1, ..., y_{i−1}}, {1, ..., 1}), 0.98/49.98, 0.02, 50)
I4       As I1, except the following replaces t^3:
         t^3        t^3_{i=1:M−1} = r_nonsep({y_{(i−1)k/(M−1)+1}, ..., y_{ik/(M−1)}}, k/(M − 1))
                    t^3_M = r_nonsep({y_{k+1}, ..., y_n}, l)
I5       As I1, except use t^1 from I3, and t^3 from I4.
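To make the settings in Table 5 concrete, the following is a minimal sketch of I1 and I3 for two objectives. The definitions of s_linear, r_sum, b_param, the concave shape functions, and the final mapping to objectives are taken to be those given in the earlier sections of this paper and are restated here as assumptions; the function `evaluate` and its `dependent` flag are hypothetical names.

```python
import math

def s_linear(y, a):
    """Linear shift: moves the optimum of y to y = a."""
    return abs(y - a) / abs(math.floor(a - y) + a)

def r_sum(ys, ws):
    """Weighted sum reduction."""
    return sum(w * y for w, y in zip(ws, ys)) / sum(ws)

def b_param(y, u, a, b, c):
    """Parameter dependent bias: y raised to an exponent controlled by u."""
    v = a - (1.0 - 2.0 * u) * abs(math.floor(0.5 - u) + a)
    return y ** (b + (c - b) * v)

def evaluate(z, k, dependent=False):
    """Two-objective I1 (dependent=False) or I3 (dependent=True)."""
    n = len(z)
    y = list(z)  # all domains are [0, 1], so z_[0,1] = z
    if dependent:
        # I3: each y_i (i >= 2) is biased by the mean of y_1, ..., y_{i-1}.
        t1 = [y[0]] + [
            b_param(y[i], r_sum(y[:i], [1.0] * i), 0.98 / 49.98, 0.02, 50.0)
            for i in range(1, n)
        ]
    else:
        t1 = y  # I1: identity
    # t^2: shift the distance-related parameters so that 0.35 is optimal.
    t2 = t1[:k] + [s_linear(v, 0.35) for v in t1[k:]]
    # t^3 with M = 2: one position value and one distance value.
    pos = r_sum(t2[:k], [1.0] * k)
    dist = r_sum(t2[k:], [1.0] * (n - k))
    # With A_1 = 1 this reduces to x_1 = pos; S_1 = S_2 = 1.
    x1 = max(dist, 1.0) * (pos - 0.5) + 0.5
    f1 = dist + math.sin(x1 * math.pi / 2.0)
    f2 = dist + math.cos(x1 * math.pi / 2.0)
    return f1, f2

z = [0.3] + [0.35] * 10           # k = 1, l = 10, as used in the experiments
on_front = evaluate(z, k=1)       # I1: distance value is 0, so f1^2 + f2^2 = 1
off_front = evaluate(z, k=1, dependent=True)  # I3: the same z is not optimal
```

The same vector z is Pareto optimal for I1 but not for I3, because in I3 the value each distance-related parameter must take depends on the parameters before it.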

7 A Suggested Test Suite

In this section, we propose a test suite that consists of nine scalable, multi-
objective test problems (WFG1–WFG9) that focuses on some of the more perti-
nent problem characteristics. Table 6 specifies WFG1–WFG9, the properties of
which are summarised in Table 7.
We make the following additional observations: WFG1 skews the relative
significance of different parameters by employing dissimilar weights in its weighted
sum reduction; only WFG1 and WFG7 are both separable and uni-modal; the
non-separable reduction of WFG6 and WFG9 is more difficult than that of
WFG2 and WFG3; the multi-modality of WFG4 has larger “hill sizes” (and is
thus more difficult) than that of WFG9; the deceptiveness of WFG5 is more
difficult than that of WFG9 (WFG9 is only deceptive on its position parameters);
the position-related parameters of WFG7 are dependent on its distance-related
parameters (and other position-related parameters), and WFG9 employs a similar
type of dependency, but its distance-related parameters also depend on other
distance-related parameters; the distance-related parameters of WFG8 are
dependent on its position-related parameters (and other distance-related
parameters), and as a consequence the problem is non-separable; and the predominance
of concave Pareto optimal fronts facilitates the use of performance metrics that
require knowledge of the distance to the Pareto optimal front.
For WFG1–WFG7, a solution is Pareto optimal iff all z_{i=k+1:n} = 2i × 0.35,
noting WFG2 is disconnected. For WFG8, it is required that all of:

$$z_{i=k+1:n} = 2i \times 0.35^{\left(0.02 + 49.98\left(\frac{0.98}{49.98} - (1 - 2u)\left|\lfloor 0.5 - u \rfloor + \frac{0.98}{49.98}\right|\right)\right)^{-1}}$$

$$u = \mathrm{r\_sum}(\{z_1, \ldots, z_{i-1}\}, \{1, \ldots, 1\})$$



Table 6. The WFG test suite. The number of position-related parameters, k, must
be divisible by the number of underlying position parameters, M − 1 (this simplifies
reductions). The number of distance-related parameters, l, can be set to any positive
integer, except for WFG2 and WFG3, for which l must be a multiple of two (due to the
nature of their non-separable reductions). To enhance readability, for any transition
vector t^i, we let y = t^{i−1}. For t^1, let y = z_[0,1] = {z_1/2, ..., z_n/(2n)}.

Problem  Type       Setting

All      Constants  S_{m=1:M} = 2m
                    A_1 = 1
                    A_{2:M−1} = 0 for WFG3, 1 otherwise
                    The settings for S_{1:M} ensure the Pareto optimal fronts have dissimilar
                    trade-off magnitudes, and the settings for A_{1:M−1} ensure the Pareto optimal
                    fronts are not degenerate, except in the case of WFG3, which has a
                    one-dimensional Pareto optimal front.
All      Domains    z_{i=1:n,max} = 2i
                    The working parameters have domains of dissimilar magnitude.
WFG1     Shape      h_{m=1:M−1} = convex_m
                    h_M = mixed_M (with α = 1 and A = 5)
         t^1        t^1_{i=1:k} = y_i
                    t^1_{i=k+1:n} = s_linear(y_i, 0.35)
         t^2        t^2_{i=1:k} = y_i
                    t^2_{i=k+1:n} = b_flat(y_i, 0.8, 0.75, 0.85)
         t^3        t^3_{i=1:n} = b_poly(y_i, 0.02)
         t^4        t^4_{i=1:M−1} = r_sum({y_{(i−1)k/(M−1)+1}, ..., y_{ik/(M−1)}},
                                          {2((i − 1)k/(M − 1) + 1), ..., 2ik/(M − 1)})
                    t^4_M = r_sum({y_{k+1}, ..., y_n}, {2(k + 1), ..., 2n})
WFG2     Shape      h_{m=1:M−1} = convex_m
                    h_M = disc_M (with α = β = 1 and A = 5)
         t^1        As t^1 from WFG1. (Linear shift.)
         t^2        t^2_{i=1:k} = y_i
                    t^2_{i=k+1:k+l/2} = r_nonsep({y_{k+2(i−k)−1}, y_{k+2(i−k)}}, 2)
         t^3        t^3_{i=1:M−1} = r_sum({y_{(i−1)k/(M−1)+1}, ..., y_{ik/(M−1)}}, {1, ..., 1})
                    t^3_M = r_sum({y_{k+1}, ..., y_{k+l/2}}, {1, ..., 1})
WFG3     Shape      h_{m=1:M} = linear_m (degenerate)
         t^{1:3}    As t^{1:3} from WFG2. (Linear shift, non-separable reduction, and weighted
                    sum reduction.)
WFG4     Shape      h_{m=1:M} = concave_m
         t^1        t^1_{i=1:n} = s_multi(y_i, 30, 10, 0.35)
         t^2        t^2_{i=1:M−1} = r_sum({y_{(i−1)k/(M−1)+1}, ..., y_{ik/(M−1)}}, {1, ..., 1})
                    t^2_M = r_sum({y_{k+1}, ..., y_n}, {1, ..., 1})
WFG5     Shape      h_{m=1:M} = concave_m
         t^1        t^1_{i=1:n} = s_decept(y_i, 0.35, 0.001, 0.05)
         t^2        As t^2 from WFG4. (Weighted sum reduction.)
WFG6     Shape      h_{m=1:M} = concave_m
         t^1        As t^1 from WFG1. (Linear shift.)
         t^2        t^2_{i=1:M−1} = r_nonsep({y_{(i−1)k/(M−1)+1}, ..., y_{ik/(M−1)}}, k/(M − 1))
                    t^2_M = r_nonsep({y_{k+1}, ..., y_n}, l)
WFG7     Shape      h_{m=1:M} = concave_m
         t^1        t^1_{i=1:k} = b_param(y_i, r_sum({y_{i+1}, ..., y_n}, {1, ..., 1}), 0.98/49.98, 0.02, 50)
                    t^1_{i=k+1:n} = y_i
         t^2        As t^1 from WFG1. (Linear shift.)
         t^3        As t^2 from WFG4. (Weighted sum reduction.)
WFG8     Shape      h_{m=1:M} = concave_m
         t^1        t^1_{i=1:k} = y_i
                    t^1_{i=k+1:n} = b_param(y_i, r_sum({y_1, ..., y_{i−1}}, {1, ..., 1}), 0.98/49.98, 0.02, 50)
         t^2        As t^1 from WFG1. (Linear shift.)
         t^3        As t^2 from WFG4. (Weighted sum reduction.)
WFG9     As the example in Section 5.
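As a worked illustration of how a row of Table 6 translates into an evaluation procedure, the following is a minimal sketch of WFG4 for M objectives. The formulas for s_multi, the concave shape functions, and the final mapping from the last transition vector to objectives are assumed to be those defined in the earlier sections of this paper and are restated here as assumptions; `wfg4`, `concave`, and `mean` are hypothetical helper names.

```python
import math

def s_multi(y, a, b, c):
    """Multi-modal shift whose global minimum (value 0) is at y = c."""
    tmp = abs(y - c) / (2.0 * (math.floor(c - y) + c))
    return (1.0 + math.cos((4.0 * a + 2.0) * math.pi * (0.5 - tmp))
            + 4.0 * b * tmp * tmp) / (b + 2.0)

def mean(values):
    """r_sum with unit weights."""
    return sum(values) / len(values)

def concave(x, m):
    """Concave shape function h_m, 1 <= m <= M, where len(x) == M - 1."""
    M = len(x) + 1
    h = 1.0
    for i in range(M - m):
        h *= math.sin(x[i] * math.pi / 2.0)
    if m > 1:
        h *= math.cos(x[M - m] * math.pi / 2.0)
    return h

def wfg4(z, k, M):
    n = len(z)
    # Normalise: z_i has domain [0, 2i], so y_i = z_i / (2i).
    y = [z[i] / (2.0 * (i + 1)) for i in range(n)]
    # t^1: multi-modal shift applied to every parameter.
    t1 = [s_multi(v, 30.0, 10.0, 0.35) for v in y]
    # t^2: reduce k position values to M - 1 values, and the rest to one.
    g = k // (M - 1)
    t2 = [mean(t1[i * g:(i + 1) * g]) for i in range(M - 1)]
    t2.append(mean(t1[k:]))
    # A_{1:M-1} = 1 for WFG4; S_m = 2m scales the m-th objective.
    x = [max(t2[-1], 1.0) * (t - 0.5) + 0.5 for t in t2[:-1]]
    d = t2[-1]
    return [d + 2.0 * m * concave(x, m) for m in range(1, M + 1)]

# Distance-related parameters set to 2i * 0.35, so the point is Pareto optimal.
z = [0.5, 1.0, 2.0, 3.0] + [2.0 * i * 0.35 for i in range(5, 9)]
f = wfg4(z, k=4, M=3)
```

For a Pareto optimal input the distance value d is zero, so the objectives, once divided by their scales 2m, lie on the unit sphere.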
[Figure: plot of f2 against f1, both axes ranging from 0 to 1.2, with one curve for each of I1, I2, I3, I4, and I5.]

Fig. 4. The 50% attainment surfaces obtained by NSGA-II on I1–I5.


Table 7. Properties of the WFG problems. All WFG problems are scalable, have
no extremal nor medial parameters, have dissimilar parameter domains and Pareto
optimal tradeoff magnitudes, have known Pareto optimal sets, and can be made to
have a distinct many-to-one mapping from the Pareto optimal set to the Pareto optimal
front by scaling the number of position parameters.

Problem  Obj.       Separability   Modality          Bias                 Geometry

WFG1     f_{1:M}    separable      uni               polynomial, flat     convex, mixed
WFG2     f_{1:M−1}  non-separable  uni               −                    convex, disconnected
         f_M        non-separable  multi
WFG3     f_{1:M}    non-separable  uni               −                    linear, degenerate
WFG4     f_{1:M}    separable      multi             −                    concave
WFG5     f_{1:M}    separable      deceptive         −                    concave
WFG6     f_{1:M}    non-separable  uni               −                    concave
WFG7     f_{1:M}    separable      uni               parameter dependent  concave
WFG8     f_{1:M}    non-separable  uni               parameter dependent  concave
WFG9     f_{1:M}    non-separable  multi, deceptive  parameter dependent  concave

To obtain a Pareto optimal solution, the position should first be determined by
setting z_{1:k} appropriately. The required distance-related parameter values can
then be calculated by first determining z_{k+1} (which is trivial given z_{1:k} have been
set), then z_{k+2}, and so on, until z_n has been calculated. Unlike the other WFG
problems, different Pareto optimal solutions will have different distance-related
parameter values, making WFG8 a difficult problem.
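The sequential calculation described above can be sketched as follows. This is a minimal illustration, not a reference implementation: it assumes that u is computed over the normalised parameters y_j = z_j/(2j), since the transformations of Table 6 operate on y = z_[0,1], and the names `b_param_exponent` and `wfg8_optimal` are hypothetical.

```python
import math

# b_param constants used by WFG8 in Table 6.
A, B, C = 0.98 / 49.98, 0.02, 50.0

def b_param_exponent(u):
    """Exponent that b_param applies to y_i for a given dependency value u."""
    v = A - (1.0 - 2.0 * u) * abs(math.floor(0.5 - u) + A)
    return B + (C - B) * v

def wfg8_optimal(z_position, l):
    """Extend z_1..z_k with l Pareto optimal distance-related parameters."""
    k = len(z_position)
    # Normalise the chosen position-related parameters (assumed u is over y).
    y = [z_position[i] / (2.0 * (i + 1)) for i in range(k)]
    for _ in range(l):
        # u depends on every parameter fixed so far (unit weights).
        u = sum(y) / len(y)
        # Choose y_i so that b_param maps it to 0.35, the optimum of s_linear.
        y.append(0.35 ** (1.0 / b_param_exponent(u)))
    # Scale back to the working domains [0, 2i].
    return [2.0 * (i + 1) * y[i] for i in range(k + l)]

z = wfg8_optimal([1.0, 3.0], l=4)
```

Changing any position-related parameter changes u for every later index, which is why each Pareto optimal solution of WFG8 has its own set of distance-related parameter values.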
The WFG test suite exceeds the functionality of existing test suites.
In particular, it includes a number of problems that exhibit properties not
evident in the commonly-used DTLZ test suite. These include: non-separable
problems, deceptive problems, a truly degenerate problem, a mixed shape Pareto
front problem, problems scalable in the number of position-related parameters^8,
and problems with dependencies between position- and distance-related
parameters. The WFG test suite provides a fairer means of assessing the true
performance of optimisation algorithms on a wider range of different problems.

^8 The DTLZ test suite uses a fixed (relative to the number of objectives) number of
position parameters.

8 Conclusions

The WFG Toolkit offers a substantial range of features. Test problem designers
can construct problems with a diverse range of Pareto optimal geometries and
can incorporate a variety of important features in the manner of their choosing. A
suite of nine test problems is presented that exceeds the functionality of existing
test suites. Significantly, the WFG Toolkit allows for the construction of scalable
problems that are both non-separable and multi-modal. Given the relevance of
both characteristics to real world problems and the corresponding lack of such
problems in the literature, the WFG Toolkit offers an important contribution in
assessing the quality of optimisation algorithms on these types of problems.

Acknowledgments

This work was partly supported by an Australian Research Council linkage grant.

References

1. Deb, K.: Multi-objective genetic algorithms: Problem difficulties and construction
of test problems. Evolutionary Computation 7 (1999) 205–230
2. Van Veldhuizen, D.A.: Multiobjective Evolutionary Algorithms: Classifications,
Analyses, and New Innovations. PhD thesis, Air Force Institute of Technology,
Wright-Patterson AFB, Ohio (1999)
3. Zitzler, E., Deb, K., Thiele, L.: Comparison of multiobjective evolutionary algo-
rithms: Empirical results. Evolutionary Computation 8 (2000) 173–195
4. Deb, K., Thiele, L., Laumanns, M., Zitzler, E.: Scalable test problems for evolu-
tionary multi-objective optimization. KanGAL Report 2001001, Kanpur Genetic
Algorithms Laboratory, Indian Institute of Technology, Kanpur, India (2001)
5. Whitley, D., Mathias, K., Rana, S., Dzubera, J.: Building better test functions. 6th
International Conference on Genetic Algorithms, Morgan Kaufmann Publishers
(1995) 239–246
6. Bäck, T., Michalewicz, Z.: Test landscapes. Handbook of Evolutionary Computa-
tion. Institute of Physics Publishing (1997) B2.7 14–20
7. Fogel, D.B., Beyer, H.G.: A note on the empirical evaluation of intermediate
recombination. Evolutionary Computation 3 (1995) 491–495
8. Deb, K., Thiele, L., Laumanns, M., Zitzler, E.: Scalable multi-objective optimiza-
tion test problems. CEC’02. Volume 1., IEEE (2002) 825–830
9. Van Veldhuizen, D.A., Lamont, G.B.: Multiobjective evolutionary algorithm test
suites. 1999 ACM Symposium on Applied Computing, ACM (1999) 351–357
10. Zitzler, E., Laumanns, M., Thiele, L.: SPEA2: Improving the strength Pareto
evolutionary algorithm for multiobjective optimization. EUROGEN 2001, CIMNE,
Barcelona, Spain (2001) 95–100

11. Bentley, P.J., Wakefield, J.P.: Finding acceptable solutions in the pareto-optimal
range using multiobjective genetic algorithms. Soft Computing in Engineering
Design and Manufacturing, Springer-Verlag (1998) 231–240
12. Knowles, J.D., Corne, D.W.: Approximating the nondominated front using the
Pareto archived evolution strategy. Evolutionary Computation 8 (2000) 149–172
13. Deb, K., Pratap, A., Agarwal, S., Meyarivan, T.: A fast and elitist multiobjective
genetic algorithm: NSGA-II. IEEE Transactions on Evolutionary Computation 6
(2002) 182–197
14. Fonseca, C.M., Fleming, P.J.: On the performance assessment and comparison
of stochastic multiobjective optimizers. Parallel Problem Solving from Nature —
PPSN IV, Springer-Verlag (1996) 584–593
