The Computation of The Expected Improvement in Dominated Hypervolume of Pareto Front Approximations
Introduction
(2)
Figure 1: Multivariate distributions of three uncertain predictions and confidence boxes around them, shown as probability density over the objectives f1 and f2. The red line indicates the current approximation to the Pareto front.
for some input space $X$. We may think of $X$ as $\mathbb{R}^n$ for some $n$, but this is not essential for the following discussion. In addition, let us assume that a prediction tool (e.g. Kriging) provides a prediction for the function values $f_i(x)$, $i = 1, \dots, m$, together with an uncertainty measure. The prediction and its uncertainty are defined by a multivariate normal distribution. Whenever we deal with independent models of the $m$ objectives, the multivariate distribution is defined by a vector of mean values $\vec{\mu}$ of the marginal distributions and the vector of their corresponding standard deviations $\vec{\sigma}$.
In the context of efficient global optimization and metamodel-assisted local search it is desirable to know the expected improvement of a point. By maximizing the expected improvement over a selection of points from $X$, we can use the prediction tool to detect promising solutions, which may subsequently be evaluated by means of a real experiment or a costly analysis tool.
In this paper we are particularly interested in the expected improvement in hypervolume. The improvement in hypervolume is measured relative to a population of solution vectors $P = \{\vec{y}^{(1)}, \dots, \vec{y}^{(k)}\}$ in the objective space $\mathbb{R}^m$ by:
$$I(\vec{y}, P) = S(P \cup \{\vec{y}\}) - S(P). \tag{3}$$
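The improvement in Equation 3 can be illustrated with a minimal Python sketch. It assumes two objectives, minimization, and that $S$ is the hypervolume dominated by a point set relative to a reference point; the names `hypervolume_2d` and `improvement` are hypothetical and not taken from the paper.

```python
from typing import Sequence, Tuple

Point = Tuple[float, float]


def hypervolume_2d(points: Sequence[Point], ref: Point) -> float:
    """Area dominated by `points` and bounded by `ref` (assumes minimization)."""
    # Only points that are strictly better than the reference point contribute.
    pts = sorted(p for p in points if p[0] < ref[0] and p[1] < ref[1])
    area = 0.0
    best_f2 = ref[1]  # lowest f2 value seen so far while sweeping f1 upward
    for f1, f2 in pts:
        if f2 < best_f2:
            # New strip: from f1 to the reference in f1, from f2 up to best_f2.
            area += (ref[0] - f1) * (best_f2 - f2)
            best_f2 = f2
    return area


def improvement(y: Point, pop: Sequence[Point], ref: Point) -> float:
    """I(y, P) = S(P united with {y}) - S(P), as in Equation 3."""
    return hypervolume_2d(list(pop) + [y], ref) - hypervolume_2d(pop, ref)


if __name__ == "__main__":
    P = [(1.0, 3.0), (3.0, 1.0)]
    reference = (4.0, 4.0)
    print(improvement((2.0, 2.0), P, reference))   # non-dominated candidate
    print(improvement((3.5, 3.5), P, reference))   # dominated candidate
```

A candidate that is dominated by the population yields zero improvement, which is why large parts of the objective space drop out of the expected improvement integral.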
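The expected improvement in hypervolume is now defined as
$$\mathrm{ExI}(x) = \int_{\vec{y} \in \mathbb{R}^m} I(\vec{y}, P)\, \mathrm{PDF}_x(\vec{y})\, d\vec{y}, \tag{4}$$
where $\mathrm{PDF}_x$ denotes the probability density function of the predictive distribution at $x$.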
In [1] this integral is proposed for the purpose of metamodel-assisted multiobjective optimization, and a Monte Carlo integration method was described for its computation. However, this method has limited accuracy, and a direct computation would be desirable. This paper proposes, for the first time, such a direct method for computing the expected improvement integral as a closed-form expression for $m > 1$, thereby generalizing the direct computation of Jones et al. [2], which is only defined for the single-objective case, i.e. $m = 1$.
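For comparison with the direct method, a Monte Carlo estimate of the expected improvement can be sketched as follows. This is an illustrative sketch only, not the procedure of [1]: it assumes two objectives, minimization, independent normal predictions with means `mu` and standard deviations `sigma`, and hypothetical function names.

```python
import random
from typing import Sequence, Tuple

Point = Tuple[float, float]


def hypervolume_2d(points: Sequence[Point], ref: Point) -> float:
    """Area dominated by `points` and bounded by `ref` (assumes minimization)."""
    pts = sorted(p for p in points if p[0] < ref[0] and p[1] < ref[1])
    area, best_f2 = 0.0, ref[1]
    for f1, f2 in pts:
        if f2 < best_f2:
            area += (ref[0] - f1) * (best_f2 - f2)
            best_f2 = f2
    return area


def exi_monte_carlo(mu: Point, sigma: Point, pop: Sequence[Point], ref: Point,
                    n_samples: int = 10000, seed: int = 0) -> float:
    """Average hypervolume improvement over samples of the predictive distribution."""
    rng = random.Random(seed)  # fixed seed so that the sketch is reproducible
    base = hypervolume_2d(pop, ref)
    total = 0.0
    for _ in range(n_samples):
        # Independent normal marginals, one per objective.
        y = (rng.gauss(mu[0], sigma[0]), rng.gauss(mu[1], sigma[1]))
        total += hypervolume_2d(list(pop) + [y], ref) - base
    return total / n_samples


if __name__ == "__main__":
    P = [(1.0, 3.0), (3.0, 1.0)]
    print(exi_monte_carlo((2.0, 2.0), (0.5, 0.5), P, (4.0, 4.0)))
```

The statistical error of such an estimate shrinks only with the square root of the number of samples, which is the accuracy limitation that motivates a closed-form computation.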
Computation Procedure
As with the probability of non-domination, i.e. $1 - \int_{\vec{y} \in \mathbb{R}^m} T(P \preceq \vec{y})\, d\vec{y}$, described in [1], the integration region $\mathbb{R}^m$ can be partitioned into a set of interval boxes, and the integral can then be computed directly by piecewise integration. Fig. 3 provides some intuition for the grid variables and areas introduced in the following.
Let $b_i^{(1)}, b_i^{(2)}, \dots, b_i^{(k)}$ denote the sorted list of all the $i$-th coordinates of the vectors $\vec{y}^{(1)}, \dots, \vec{y}^{(k)}$. For technical reasons we define $b_i^{(0)} = -\infty$, $b_i^{(k+1)} = y_i^{\mathrm{ref}}$, and $b_i^{(k+2)} = \infty$.
The grid coordinates $b_j^{(i)}$ give rise to a partitioning into grid cells. We can enumerate these grid cells as follows. For each $(i_1, \dots, i_m)$, where each $i_s \in \{0, \dots, k+1\}$, the grid cell named $C(i_1, \dots, i_m)$ is determined by $(b_1^{(i_1)}, \dots, b_m^{(i_m)})^T$ and $(b_1^{(i_1+1)}, \dots, b_m^{(i_m+1)})^T$ as the half-open (from below) interval box, i.e.
$$(b_1^{(i_1)}, b_1^{(i_1+1)}] \times (b_2^{(i_2)}, b_2^{(i_2+1)}] \times \dots \times (b_m^{(i_m)}, b_m^{(i_m+1)}].$$
We call $(b_1^{(i_1)}, \dots, b_m^{(i_m)})^T$ the lower bound of $C(i_1, \dots, i_m)$, denoted by $\vec{l}(i_1, \dots, i_m)$; likewise we call $(b_1^{(i_1+1)}, \dots, b_m^{(i_m+1)})^T$ the upper bound of $C(i_1, \dots, i_m)$, denoted by $\vec{u}(i_1, \dots, i_m)$. With this notation we will also denote $C(i_1, \dots, i_m)$ by $(\vec{l}(i_1, \dots, i_m), \vec{u}(i_1, \dots, i_m)]$. See Figure 2 for an example.
It can be directly observed that integration over many of these cells contributes zero to the integral. These are cells that
1. have lower bounds $\vec{l}(i_1, \dots, i_m)$ that are dominated by or equal to points in $P$, i.e. $P \preceq \vec{l}(i_1, \dots, i_m)$, or
2. have upper bounds $\vec{u}(i_1, \dots, i_m)$ that do not dominate the reference point, i.e. $\vec{u}(i_1, \dots, i_m) \nprec \vec{y}^{\mathrm{ref}}$.
The second criterion is fulfilled by grid cells $C(i_1, \dots, i_m)$ with $i_1 = k+1$ or $i_2 = k+1$ ... or $i_m = k+1$, i.e. at least one of their coordinates has index $k+1$.
All cells that fulfill criterion (1) or (2) will be called inactive cells, while the other cells will be termed active cells. Active cells are cells whose inner points dominate the reference point and are not dominated by any point in $P$.
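The grid construction and the two inactivity criteria can be sketched in Python as follows. The sketch assumes minimization, grid coordinates sorted in ascending order, and a population whose points all dominate the reference point; the function names are hypothetical.

```python
import itertools
import math
from typing import List, Sequence, Tuple

Vector = Tuple[float, ...]


def grid_coordinates(pop: Sequence[Vector], ref: Vector) -> List[List[float]]:
    """Per objective i: [-inf, sorted i-th coordinates of P, ref_i, +inf]."""
    return [[-math.inf] + sorted(p[i] for p in pop) + [ref[i], math.inf]
            for i in range(len(ref))]


def weakly_dominates(a: Vector, b: Vector) -> bool:
    """True if a is at least as good as b in every objective (minimization)."""
    return all(x <= y for x, y in zip(a, b))


def active_cells(pop: Sequence[Vector], ref: Vector):
    """Return (index tuple, lower bound, upper bound) for every active cell."""
    b = grid_coordinates(pop, ref)
    k, m = len(pop), len(ref)
    cells = []
    for idx in itertools.product(range(k + 2), repeat=m):
        # Criterion 2: a coordinate index of k + 1 lies beyond the reference point.
        if any(i == k + 1 for i in idx):
            continue
        lower = tuple(b[j][i] for j, i in enumerate(idx))
        upper = tuple(b[j][i + 1] for j, i in enumerate(idx))
        # Criterion 1: the lower bound is dominated by or equal to a point in P.
        if any(weakly_dominates(p, lower) for p in pop):
            continue
        cells.append((idx, lower, upper))
    return cells


if __name__ == "__main__":
    for idx, lower, upper in active_cells([(1.0, 3.0), (3.0, 1.0)], (4.0, 4.0)):
        print(idx, lower, upper)
```

Enumerating all $(k+2)^m$ index tuples is the simplest way to mirror the definitions; it is meant to illustrate the classification, not to be efficient.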
Obviously, the expected improvement integral is the sum of the contributions obtained by integrating the improvement over each cell in the set of active cells $C^+$, i.e.
$$\mathrm{ExI}(x) = \sum_{C(i_1, \dots, i_m) \in C^+} \delta(i_1, \dots, i_m) \tag{5}$$
Figure 3: Schematic drawing of the integration area and grid in the bi-objective case. ($S^- = \{\vec{z} \in \mathbb{R}^m \mid P(\vec{u}) \preceq \vec{z} \preceq \vec{v}\}$, where $P(\vec{u}) = \{\vec{p} \in P \mid \vec{u} \preceq \vec{p}\}$)
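where
$$\delta(i_1, \dots, i_m) = \int_{\vec{y} \in (\vec{l}(i_1, \dots, i_m),\, \vec{u}(i_1, \dots, i_m)]} I(\vec{y}, P)\, \mathrm{PDF}_x(\vec{y})\, d\vec{y} \tag{6}$$
$$= \Big(\prod_{j=1}^{m} \delta_j(i_1, \dots, i_m)\Big) - \mathrm{Vol}(S^-) \prod_{i=1}^{m} \Big(\Phi\Big(\frac{u_i - \mu_i}{\sigma_i}\Big) - \Phi\Big(\frac{l_i - \mu_i}{\sigma_i}\Big)\Big) \tag{7}$$
$$\delta_j(i_1, \dots, i_m) = \Psi(v_j, u_j, \mu_j, \sigma_j) - \Psi(v_j, l_j, \mu_j, \sigma_j) \tag{8}$$
$$\Psi(a, b, \mu, \sigma) = \sigma\, \varphi\Big(\frac{b - \mu}{\sigma}\Big) + (a - \mu)\, \Phi\Big(\frac{b - \mu}{\sigma}\Big) \tag{10}$$
$$\varphi(x) = \frac{1}{\sqrt{2\pi}} \exp(-x^2/2), \qquad \Phi(x) = \frac{1}{2}\big(1 + \mathrm{erf}(x/\sqrt{2})\big) \tag{11}$$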
Derivation
(12)
(in the following, we always assume the grid indices $i_1, \dots, i_m$). In summary, for points $\vec{y}$ within the cell $C(i_1, \dots, i_m)$ it holds:
$$I(\vec{y}) = \mathrm{Vol}([\vec{y}, \vec{v}] - [\vec{u}, \vec{v}] + L^+) \tag{13}$$
An important observation is that only the first term, namely $[\vec{y}, \vec{v}]$, depends on $\vec{y}$.
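After these preliminaries, let us recall the expected improvement integral (expression 4). The contribution of cell $C(i_1, \dots, i_m)$ is given by:
$$\int_{l_1}^{u_1} \cdots \int_{l_m}^{u_m} I(\vec{y}) \prod_{i=1}^{m} \frac{1}{\sigma_i} \varphi\Big(\frac{y_i - \mu_i}{\sigma_i}\Big)\, dy_m \cdots dy_1 = A_1 - A_2 + A_3,$$
where
$$A_1 = \int_{l_1}^{u_1} \cdots \int_{l_m}^{u_m} \prod_{i=1}^{m} (v_i - y_i) \prod_{i=1}^{m} \frac{1}{\sigma_i} \varphi\Big(\frac{y_i - \mu_i}{\sigma_i}\Big)\, dy_m \cdots dy_1, \tag{16}$$
$$A_2 = \int_{l_1}^{u_1} \cdots \int_{l_m}^{u_m} \prod_{i=1}^{m} (v_i - u_i) \prod_{i=1}^{m} \frac{1}{\sigma_i} \varphi\Big(\frac{y_i - \mu_i}{\sigma_i}\Big)\, dy_m \cdots dy_1, \tag{17}$$
and
$$A_3 = \int_{l_1}^{u_1} \cdots \int_{l_m}^{u_m} \mathrm{Vol}(L^+) \prod_{i=1}^{m} \frac{1}{\sigma_i} \varphi\Big(\frac{y_i - \mu_i}{\sigma_i}\Big)\, dy_m \cdots dy_1. \tag{18}$$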
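Moreover, the expression $\mathrm{Vol}(L^+) - \prod_{i=1}^{m} (v_i - u_i)$ is the negative hypervolume measure for the set of points in $P$ dominated by or equal to $\vec{u}$, with reference point $\vec{v}$ (i.e., $\mathrm{Vol}(L^+) - \prod_{i=1}^{m} (v_i - u_i) = -\mathrm{Vol}(S^-)$, where $S^- = \{\vec{z} \in \mathbb{R}^m \mid P(\vec{u}) \preceq \vec{z} \preceq \vec{v}\}$ with $P(\vec{u}) = \{\vec{p} \in P \mid \vec{u} \preceq \vec{p}\}$, see also Figure 3).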
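In addition, the integral of the density over the cell is
$$\int_{l_1}^{u_1} \cdots \int_{l_m}^{u_m} \prod_{i=1}^{m} \frac{1}{\sigma_i} \varphi\Big(\frac{y_i - \mu_i}{\sigma_i}\Big)\, dy_m \cdots dy_1 = \prod_{i=1}^{m} \Big(\Phi\Big(\frac{u_i - \mu_i}{\sigma_i}\Big) - \Phi\Big(\frac{l_i - \mu_i}{\sigma_i}\Big)\Big),$$
and the integral over the term that depends on $\vec{y}$ factorizes as
$$\int_{l_1}^{u_1} \cdots \int_{l_m}^{u_m} \prod_{i=1}^{m} (v_i - y_i)\, \frac{1}{\sigma_i} \varphi\Big(\frac{y_i - \mu_i}{\sigma_i}\Big)\, dy_m \cdots dy_1 = \prod_{i=1}^{m} \big(\Psi(v_i, u_i, \mu_i, \sigma_i) - \Psi(v_i, l_i, \mu_i, \sigma_i)\big)$$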
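The probability mass that independent normal predictions assign to a cell, i.e. the product of differences of the standard normal cumulative distribution function, can be checked with a short Python sketch. The function names are hypothetical, and the cell bounds may be infinite as for the outermost grid cells.

```python
import math
from typing import Sequence


def std_normal_cdf(x: float) -> float:
    """Standard normal cumulative distribution function via the error function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def cell_probability(lower: Sequence[float], upper: Sequence[float],
                     mu: Sequence[float], sigma: Sequence[float]) -> float:
    """Product over objectives of Phi((u_i - mu_i)/sigma_i) - Phi((l_i - mu_i)/sigma_i)."""
    prob = 1.0
    for l, u, m, s in zip(lower, upper, mu, sigma):
        prob *= std_normal_cdf((u - m) / s) - std_normal_cdf((l - m) / s)
    return prob


if __name__ == "__main__":
    inf = math.inf
    # Lower-left quadrant around the mean of a bivariate prediction.
    print(cell_probability((-inf, -inf), (0.0, 0.0), (0.0, 0.0), (1.0, 1.0)))
```

Because the marginals are independent, the multi-dimensional integral reduces to one evaluation of the cumulative distribution function per bound and objective.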
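The last step is motivated by the following equality and the definition in Equation 10:
$$\int_{-\infty}^{b} (a - z)\, \frac{1}{\sigma} \varphi\Big(\frac{z - \mu}{\sigma}\Big)\, dz = \sigma\, \varphi\Big(\frac{b - \mu}{\sigma}\Big) + (a - \mu)\, \Phi\Big(\frac{b - \mu}{\sigma}\Big). \tag{20}$$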
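The equality in Equation 20 can be checked numerically. The following Python sketch compares the closed-form right-hand side with a midpoint-rule approximation of the left-hand side; the names `psi` and `psi_numeric`, the truncation of the lower limit at 12 standard deviations, and the number of integration steps are assumptions made for illustration.

```python
import math


def phi(x: float) -> float:
    """Standard normal probability density function."""
    return math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)


def Phi(x: float) -> float:
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def psi(a: float, b: float, mu: float, sigma: float) -> float:
    """Right-hand side of Equation 20."""
    t = (b - mu) / sigma
    return sigma * phi(t) + (a - mu) * Phi(t)


def psi_numeric(a: float, b: float, mu: float, sigma: float,
                n: int = 20000, width: float = 12.0) -> float:
    """Midpoint-rule approximation of the left-hand side of Equation 20."""
    # The lower limit -inf is truncated far into the tail of the density.
    lo = min(mu, b) - width * sigma
    h = (b - lo) / n
    total = 0.0
    for i in range(n):
        z = lo + (i + 0.5) * h
        total += (a - z) * phi((z - mu) / sigma) / sigma
    return total * h


if __name__ == "__main__":
    print(psi(1.0, 0.5, 0.0, 1.0), psi_numeric(1.0, 0.5, 0.0, 1.0))
```

The difference of two such evaluations, with $b = u_i$ and $b = l_i$, gives the one-dimensional integral over a cell edge.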
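The computation of $\Psi$ reduces to a one-dimensional integration, the details of which are as follows: Compute $\int_{-\infty}^{b} (a - y)\, \mathrm{pdf}(y)\, dy$, where $\mathrm{pdf}(y) = \frac{1}{\sigma\sqrt{2\pi}} \exp\big(-\frac{1}{2}\big(\frac{y - \mu}{\sigma}\big)^2\big)$. Answer:
$$\int_{-\infty}^{b} (a - y)\, \frac{1}{\sigma\sqrt{2\pi}}\, e^{-\frac{1}{2}\left(\frac{y - \mu}{\sigma}\right)^2} dy = \sigma\, \varphi\Big(\frac{b - \mu}{\sigma}\Big) + (a - \mu)\, \Phi\Big(\frac{b - \mu}{\sigma}\Big)$$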
Conclusion
References
[1] M. Emmerich: Single- and Multiobjective Evolutionary Design Optimization Using Gaussian Random Field Metamodels, PhD Thesis, FB Informatik, University of Dortmund, 2005.
[2] D. Jones, M. Schonlau, W. Welch: Efficient Global Optimization of Expensive Black-Box Functions, Journal of Global Optimization, Vol. 13, pp. 455-492, 1998.
[3] J.W. Klinkenberg, M. Emmerich, A. Deutz: Expected Improvement of the S-Metric for Finite Pareto Front Approximations, Proc. of MCDM 2008, Auckland, NZ.