
IEEE TRANSACTIONS ON EVOLUTIONARY COMPUTATION, VOL. 24, NO. 3, JUNE 2020

Offline Data-Driven Multiobjective Optimization:
Knowledge Transfer Between Surrogates and
Generation of Final Solutions

Cuie Yang, Jinliang Ding, Senior Member, IEEE, Yaochu Jin, Fellow, IEEE, and Tianyou Chai, Fellow, IEEE

Abstract—In offline data-driven optimization, only historical data are available for optimization, making it impossible to validate the obtained solutions during the optimization. To address these difficulties, this paper proposes an evolutionary algorithm assisted by two surrogates, one coarse model and one fine model. The coarse surrogate (CS) aims to guide the algorithm to quickly find a promising subregion in the search space, whereas the fine one focuses on leveraging good solutions according to the knowledge transferred from the CS. Since the obtained Pareto optimal solutions have not been validated using the real fitness function, a technique for generating the final optimal solutions is suggested. All solutions achieved during the whole optimization process are grouped into a number of clusters according to a set of reference vectors. Then, the solutions in each cluster are averaged and output as the final solution of that cluster. The proposed algorithm is compared with its three variants and two state-of-the-art offline data-driven multiobjective algorithms on eight benchmark problems to demonstrate its effectiveness. Finally, the proposed algorithm is successfully applied to an operational indices optimization problem in beneficiation processes.

Index Terms—Knowledge transfer, multiobjective evolutionary algorithms (MOEAs), multisurrogate, offline data-driven optimization.

Manuscript received August 24, 2018; revised December 17, 2018 and February 28, 2019; accepted June 18, 2019. Date of publication July 1, 2019; date of current version May 29, 2020. This work was supported in part by the National Natural Science Foundation of China under Grant 61525302 and Grant 61590922, in part by the Project of Ministry of Industry and Information Technology of China under Grant 20171122-6, in part by the Projects of Shenyang under Grant Y17-0-004, in part by the Fundamental Research Funds for the Central Universities under Grant N160801001 and Grant N161608001, in part by the National Key Research and Development Program of China under Grant 2018YFB1701104, in part by the Royal Society under Grant IEC\NSFC\170279, and in part by the Outstanding Student Research Innovation Project of Northeastern University under Grant N170806003. (Corresponding authors: Jinliang Ding; Yaochu Jin.)

C. Yang, J. Ding, and T. Chai are with the State Key Laboratory of Synthetical Automation for Process Industries, Northeastern University, Shenyang 110819, China (e-mail: [email protected]; [email protected]; [email protected]).

Y. Jin is with the Department of Computer Science, University of Surrey, Guildford GU2 7XH, U.K., and also with the State Key Laboratory of Synthetical Automation for Process Industries, Northeastern University, Shenyang 110819, China (e-mail: [email protected]).

This paper has supplementary downloadable material available at http://ieeexplore.ieee.org, provided by the author. Color versions of one or more of the figures in this paper are available online at http://ieeexplore.ieee.org.

Digital Object Identifier 10.1109/TEVC.2019.2925959

1089-778X © 2019 IEEE. Personal use is permitted, but republication/redistribution requires IEEE permission. See https://www.ieee.org/publications/rights/index.html for more information.

I. INTRODUCTION

IN SOLVING real-world optimization problems, computationally intensive numerical simulations or physical experiments often need to be conducted to evaluate the objective functions, e.g., in integrated circuit design [1], antenna design [2], hybrid vehicle control [3], or aerodynamic design optimization [4], [5]. In many other situations, only historical data are available for optimization [6]–[10], where the optimization problem is solved on the basis of collected data without resorting to any physical models. These optimization problems are often referred to as data-driven optimization problems [6], [11], which can be divided into online and offline data-driven optimization problems [6]. In online data-driven optimization, a small number of candidates can be validated during the optimization with the real objective functions (obtained by either performing numerical simulations or experiments), and the newly generated data can also be used to update the surrogates. By contrast, in offline data-driven optimization, no new data can be generated and only the historical data can be exploited to manage the surrogates [7], [9].

Most data-driven optimization relies on surrogate models to assist the optimizer in guiding the search [11]–[13]. Many machine learning methods can be employed to construct surrogates, including artificial neural networks (ANNs) [14], [15], polynomial regression (PR) [16], support vector machines (SVMs) [17], radial basis function (RBF) networks [18]–[20], and Gaussian processes (GPs); GPs are also known as Kriging or design and analysis of computer experiment models [21]–[23].

Multisurrogate methods have been investigated in data-driven optimization due to their promising performance. One intuitive idea of using multiple models is ensembles, which have been shown to be able to improve the accuracy and reliability of the estimated fitness [13], [24]. Ensembles may consist of homogeneous [25]–[27] or heterogeneous base models [28], [29]. Different from ensembles, multiple surrogates have also been used to exploit the balance between the "curse of uncertainty" and the "bless of uncertainty," typically with the help of a global model and a local model. The global surrogate model aims to capture the global profile of the fitness landscape by smoothing out the local optima, thereby helping the optimizer explore the search space.

By contrast, the local surrogate model is constructed around the promising region found by the current population to exploit the local details of the fitness landscape [28], [30], [31]. The global and local surrogates can also be used in parallel during the optimization. For instance, in [20], each of the surrogates is trained separately using different training sets, and a new candidate is then evaluated by the surrogate with the least prediction error. A global RBF surrogate model was combined with a local fitness estimation method [32] in a hierarchical particle swarm optimization (PSO) algorithm [33]. Most recently, a surrogate-assisted cooperative PSO was suggested to solve high-dimensional data-driven optimization problems [34], where a social learning PSO assisted by a global RBF works together with a PSO assisted by the local fitness estimation method.

Despite the great success achieved in online data-driven optimization, little work has been dedicated to the more challenging offline data-driven optimization problems, with few exceptions. In [7], a low-order PR model and a GP model are constructed, where the low-order PR is used as the real fitness function to produce new data during the search process for model management, while the GP model is employed as the surrogate to assist the evolutionary search. In [9], a large number of base learners are built offline and then a subset of these base learners is adaptively selected to make sure that the surrogate is able to best approximate the local fitness landscape. Note that the algorithm reported in [6] is developed for optimization driven by a large amount of data and is meant for reducing the computation time by using less data by means of adaptive clustering.

This paper aims to address the following two main challenges in offline data-driven multiobjective optimization where very limited historical data are available.
1) How to design surrogate models, using the limited historical data only, that are able to correctly guide the search? In offline data-driven optimization, building reliable surrogates plays an essential role in exploiting promising solutions because the optimization is guided by the surrogates only and no new data can be generated during the search for updating or validating the surrogates.
2) How to select solutions to be implemented when the optimization is completed? Recall that the surrogates built offline have not been updated during the search, and the "optimal" solutions achieved at the end of the optimization may not be really optimal due to the approximation errors in evaluating the objectives. Compared to conventional multiobjective optimization problems (MOPs), it is even trickier for the user to select solutions to be implemented at the end of offline data-driven multiobjective optimization.

To tackle the above challenges, this paper proposes a multisurrogate approach with knowledge being transferred between the surrogates to improve the reliability of the surrogates. In addition, a method for selecting final solutions with the help of a set of reference vectors (RVs) is designed to improve the reliability of the solutions. The main contributions of this paper can be summarized as follows.
1) A multisurrogate model consisting of a coarse surrogate (CS) and a fine surrogate (FS) is proposed to guide the evolutionary optimization. The CS, which is constructed in a low-dimensional subspace using a simple structure, is meant to capture the global profile of the fitness landscape. The FS, by contrast, aims to exploit promising solutions by reusing the knowledge transferred from the coarse model. It should be noted that the proposed multisurrogate model is different from most existing multisurrogate models [25], [28] in that the coarse model in this paper is constructed in a subspace of the original search space. The potential benefits of constructing a low-dimensional surrogate are twofold. First, the low-dimensional surrogate is expected to improve the model accuracy given limited training data, especially for high-dimensional problems. Second, the low-dimensional surrogate is able to smooth out some local minima so that it can more easily capture the main profile of the fitness landscape.
2) A knowledge transfer technique is introduced to transfer the knowledge acquired by the CS to the FS. After the coarse model locates the promising areas, the fine model exploits this knowledge to more accurately identify the optima and accelerate the convergence, thereby enhancing the search efficiency.
3) An RV-based method is proposed to generate the final solution set. First, all candidate solutions achieved in each generation of the fine search are grouped into different clusters according to a set of RVs. Then, one solution is generated by averaging those in each cluster, and all these averaged solutions form the final solution set. Our empirical results indicate that the quality of these solutions is better than that of the nondominated solutions obtained in the last generation.

The rest of this paper is organized as follows. Section II briefly introduces offline data-driven optimization problems and knowledge transfer techniques in evolutionary optimization. In Section III, the proposed offline data-driven multiobjective optimization algorithm is described in detail. The comparative experimental results on the benchmark problems are presented in Section IV, followed by an application of the proposed algorithm to operational indices optimization of a beneficiation process in Section V. Section VI concludes this paper with a summary and a discussion of future work.

II. RELATED WORK

A. Offline Data-Driven Optimization Problems

Offline data-driven optimization problems widely exist in the real world [11], [35], such as trauma systems design [6], performance optimization of fused magnesium furnaces [7], and operational indices optimization of beneficiation processes [8], in which the objective functions cannot be directly calculated using mathematical equations and only data are available for fitness evaluations [11], [35]. Offline data-driven optimization starts with a certain amount of collected data, which are used to construct surrogates for searching for optimal solutions [9]. During the optimization, surrogate models can be updated to improve the search efficiency either by using generated synthetic data from other surrogates, reusing

knowledge collected from the optimization, or newly collected data that are not under the control of the optimizer. An example of model management techniques using information gathered during the optimization can be found in [9], which adaptively selects a subset of the base learners of an ensemble at each iteration according to the location of the best individual. Once the surrogate-assisted optimization is completed, the best solution (set) will be implemented to solve the real-world problem.

B. Multiobjective Optimization Problems

Many real-world problems involve multiple conflicting objectives to be optimized simultaneously; such problems are commonly referred to as MOPs. This paper considers the following box-constrained MOP:

    min F(x) = (f1(x), ..., fM(x))
    s.t. x ∈ [l, u]                                   (1)

where l and u are the lower and upper bounds of the search space, (f1(x), ..., fM(x)) is the objective vector containing M objectives, and x = (x1, ..., xn) are the decision variables.

Due to the conflicting nature of the objectives, no single solution is able to optimize all objectives at the same time. Instead, a set of tradeoff optimal solutions can be found, which is called the Pareto set (PS); its image in the objective space is known as the Pareto front (PF). Evolutionary algorithms (EAs) are well suited for MOPs because maintaining a population of candidates enables EAs to explore a set of diversified optimal solutions to approximate the PF. A large number of multiobjective EAs (MOEAs) have been proposed in the last decades, which can generally be classified into Pareto domination-based approaches [36], indicator-based approaches [37], and decomposition-based approaches [38].
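To make the dominance relation underlying (1) concrete, the following minimal Python sketch (an illustration added here, not part of the original paper) checks Pareto dominance between two objective vectors under minimization and extracts the nondominated subset of an objective matrix.

import numpy as np

def dominates(a, b):
    # a Pareto-dominates b (minimization): no worse in every objective,
    # strictly better in at least one
    a, b = np.asarray(a), np.asarray(b)
    return bool(np.all(a <= b) and np.any(a < b))

def nondominated_indices(F):
    # Indices of the nondominated rows of an (N, M) matrix of objective values
    F = np.asarray(F)
    return [i for i in range(len(F))
            if not any(dominates(F[j], F[i]) for j in range(len(F)) if j != i)]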
C. Knowledge Transfer in Optimization

Knowledge transfer refers to sharing knowledge and providing inputs to improve problem solving [39]. Acquiring knowledge from the evolutionary optimization process and reusing the acquired knowledge in EAs have been shown to be effective in speeding up convergence as well as in enhancing the quality of the obtained optimal solutions [40], [41]. One class of knowledge transfer techniques transfers the knowledge of tasks similar to the current task to prevent a cold start of EAs. For example, one early work on knowledge transfer uses the optimal solutions of past similar problems as the shared knowledge and then incorporates them into the initial population when solving the current task [42]–[44]. Furthermore, techniques for reusing knowledge from heterogeneous problems have been investigated [45]–[48]; the main idea is to transfer the structured knowledge learned from previous tasks to the current task. In addition, in model-based optimization algorithms, such as estimation of distribution algorithms (EDAs), the distance distribution information about similar previous tasks is considered as knowledge and combined with the current task to improve the efficiency of model construction [49], [50]. In genetic algorithms, the building blocks of related problems are reused in the current optimization task [51]–[53].

Most recently, multifactorial EAs (MFEAs) [54]–[56] have been proposed to transfer knowledge among tasks to be solved in parallel. MFEAs allow knowledge sharing via two techniques, namely, assortative mating and vertical cultural transmission, to ensure that the tasks can not only maintain their own specific knowledge but also acquire knowledge from other tasks [54].

Inspired by the success of knowledge transfer in evolutionary optimization, this paper employs a knowledge transfer technique to transfer knowledge between two surrogates to enhance the capability of exploiting promising solutions. Both surrogates are trained using the historical data and thus naturally share knowledge.

III. PROPOSED ALGORITHM

A. Overall Framework

[Fig. 1. Framework of the proposed offline data-driven multiobjective optimization.]

A generic diagram of the proposed algorithm is presented in Fig. 1. Before the optimization starts, a data set of the problem to be solved, (X, Y), is collected to train the surrogates, where X represents the decision variables and Y the corresponding objective values. The algorithm starts with the construction of the FS, which remains unchanged in the following optimization process. The widely used RBF model [18]–[20] is employed to build the FS in this paper. Afterwards, a population consisting of N individuals is randomly initialized, denoted as PF0, which will be evaluated using the FS.
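The paper builds the FS with the RBF toolbox in [62]; as a rough stand-in, the following Python sketch (illustrative only) fits one fixed RBF interpolant to the historical data (X, Y) and uses it to predict the objectives of a population without any real fitness evaluations. The data here are random placeholders.

import numpy as np
from scipy.interpolate import RBFInterpolator

rng = np.random.default_rng(0)
X = rng.random((100, 10))    # historical decision vectors: 10*D samples of a 10-D problem
Y = rng.random((100, 2))     # corresponding objective values (M = 2)
fs = RBFInterpolator(X, Y, kernel='thin_plate_spline')  # the fine surrogate, fixed offline
PF0 = rng.random((50, 10))   # randomly initialized population of N = 50 individuals
fitness = fs(PF0)            # surrogate-predicted objectives, shape (50, 2)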

At each iteration (Iter) of the main optimization loop, a CS, which is a PR model, is constructed in a subspace of the search space, and a search assisted by the CS is carried out to find the promising regions of the search space. Then a fine search exploiting the knowledge transferred from the coarse search is conducted. The nondominated solutions found during the fine search at each iteration are stored in a database DB.

The coarse-fine search process continues until some termination condition is met. Then, an RV-based solution generation strategy is performed to produce a set of final solutions for the user to implement, using the solutions stored in DB. In the following sections, we will detail the three main components of the proposed algorithm, i.e., the coarse search, the fine search, and the RV-based final solution set generation.

B. Coarse Search

In the coarse search, the CS is built in a subspace of the original search space. As shown in Fig. 1, the coarse search is composed of four steps, namely, determination of the subspace of the CS, construction of the CS, population initialization, and a number of iterations of search assisted by the CS. Note that the search space of the coarse search is the input space of the CS, whose dimension DL is controlled by a coefficient rv as follows:

    DL = rv × D                                   (2)

where D is the dimension of the original optimization problem, i.e., DL ≤ D.

For the sake of simplicity, the DL decision variables are randomly chosen from X and denoted as XL. Then the new data set (XL, Y) is used to train the CS. In this paper, a second-order PR is adopted for the CS, which aims to smooth out local optima. It is worth noting that the search space of the CS changes at each iteration of the main loop, so that the coarse search is able to find promising regions in different subspaces during the optimization.

Any MOEA assisted by the CS can be employed for the coarse search. The maximum number of generations TC is taken as the termination condition of the coarse search. When the coarse search terminates, the individuals of the last generation will be stored in DB. Algorithm 1 details the coarse search procedure.

Algorithm 1 Coarse Search
1: Input: rv ≤ 1: a coefficient controlling the dimension of the coarse model; (X, Y): the historical data set; TC: the maximum number of generations for the coarse search;
2: Output: PTL: the population at the last generation of the coarse search; Dind: index of the selected decision variables for the coarse model;
3: Calculate the dimension of the coarse model: DL = rv × D;
4: Randomly select DL decision variables from the variable vector and record their index as Dind;
5: XL = X(:, Dind); // take out the selected variables, denoted as XL
6: Train the PR model on (XL, Y) as the CS;
7: Initialize the population for the coarse search: create the initial population P0 with N randomized individuals, each of dimension DL;
8: /* Coarse-surrogate-assisted optimization */
9: while t < TC do
10:   Generate the parent population from Pt using tournament selection;
11:   Create the offspring population Qt by applying crossover and mutation to the parent population;
12:   Combine the parent population Pt and the offspring population Qt as P′t;
13:   Perform environmental selection on P′t to select the parent population for the next generation Pt+1;
14:   t = t + 1;
15: end while
16: PTL = Pt+1;
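Lines 3–6 of Algorithm 1 together with (2) can be sketched in Python as follows (helper and variable names are illustrative, not from the paper; the original fits the second-order PR with a standard regression toolbox).

import numpy as np

def build_coarse_surrogate(X, Y, rv=0.3, rng=None):
    # Pick DL = rv * D variables at random and fit a second-order polynomial
    # regression on the subspace data (XL, Y) by ordinary least squares.
    rng = rng or np.random.default_rng(0)
    D = X.shape[1]
    DL = max(1, int(rv * D))                        # Eq. (2): DL = rv x D
    dind = rng.choice(D, size=DL, replace=False)    # indices of the selected variables
    def features(Z):                                # quadratic feature map: 1, x_i, x_i*x_j
        cols = [np.ones(len(Z))]
        cols += [Z[:, i] for i in range(DL)]
        cols += [Z[:, i] * Z[:, j] for i in range(DL) for j in range(i, DL)]
        return np.column_stack(cols)
    beta, *_ = np.linalg.lstsq(features(X[:, dind]), Y, rcond=None)
    predict = lambda ZL: features(ZL) @ beta        # ZL holds only the DL subspace variables
    return predict, dind

Because Dind is redrawn at every iteration of the main loop, each coarse model smooths the fitness landscape in a different random subspace, which is exactly what lets the coarse search explore different promising regions.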

C. Fine Search

In this paper, the FS is trained to accurately approximate the real fitness functions. However, an accurate surrogate model is not necessarily always desirable in optimization, since the real fitness functions typically have multiple local optima, making the search more challenging. Thus, the knowledge, specifically the solutions, acquired during the coarse search is expected to enhance the search performance of the FS by reducing the likelihood of getting stuck in a local optimum, since we hypothesize that the CS and the FS are correlated yet the CS is smoother. For this reason, knowledge transfer is employed to transfer the knowledge of the CS to the FS during the search. This paper borrows the transfer strategy proposed in [54], which realizes knowledge transfer among optimization tasks during offspring production. Details of the fine search are described in Algorithm 2.

Algorithm 2 Fine Search
1: Input: PFiter: the current population of the fine search; PTL: the population of the coarse search; Dind: index of the selected decision variables for the coarse search; N: population size;
2: Output: PFiter+1: the population of the fine search in the next iteration;
3: Set PL = PFiter;
4: for i = 1 : N do
5:   Set PL(i, Dind) = PTL(i, :);
6: end for
7: P = PL ∪ PFiter;
8: P = randomly shuffled P;
9: Q = crossover + mutation(P);
10: P = Q ∪ PFiter;
11: PFiter+1 = environmental selection(P);
12: Add PFiter+1 to DB;

Steps 3–9 of Algorithm 2 describe the knowledge transfer between the CS and the FS. As mentioned above, the dimension of the FS is the same as that of the original optimization problem, while the CS is built in a subspace of the original problem, which changes from generation to generation of the coarse search. For this reason, we first convert the population PTL of the coarse search into an intermediate population PL whose dimension is the same as that of the fine search, as shown in steps 3–6 of Algorithm 2. Specifically, population PL takes the decision variables of PTL of the coarse search and copies the remaining variables from the current population of the fine search, PFiter. After that, PL and PFiter are combined into the parent population P (step 7), which serves as the mating pool after being randomly shuffled (step 8). In the next step, the new offspring population Q is generated by applying crossover and mutation to the parent individuals. Once the fitness of all offspring individuals in Q has been calculated using the FS, the popular environmental selection method of NSGA-II [36] is employed to select N individuals from the combination of Q and the current population PFiter. The selected solutions are passed to the next generation and also stored in DB.

As described above, the combined population contains individuals from two different populations, PL and PFiter. Thus, in reproduction, one of the following three cases may occur for the two parent individuals chosen for crossover: both are from population PL, both are from population PFiter, or one is from PL and the other from PFiter. In the last case, the solutions of the CS are implicitly transferred to the offspring of the FS; thus, the FS is able to automatically gain useful knowledge from the CS during environmental selection. In this way, the fine search is able to exploit beneficial knowledge from the coarse search to accelerate the search process.
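The knowledge-transfer steps 3–8 of Algorithm 2 amount to a few array operations. The sketch below (an added illustration; pf_iter and ptl are assumed to be NumPy arrays with N rows each, ptl holding only the Dind columns) builds the shuffled mating pool from which crossover then implicitly mixes PL and PFiter parents.

import numpy as np

def transfer_mating_pool(pf_iter, ptl, dind, rng=None):
    # Steps 3-8: implant the coarse-search variables into a copy of the fine
    # population (PL), then merge and shuffle PL and PFiter into one mating pool.
    rng = rng or np.random.default_rng(0)
    pl = pf_iter.copy()
    pl[:, dind] = ptl            # PL: subspace variables from PTL, the rest from PFiter
    pool = np.vstack((pl, pf_iter))
    rng.shuffle(pool)            # shuffling rows mixes PL and PFiter parents for crossover
    return pool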
[Fig. 2. Illustrative example of RV-based clustering, where eight solutions are grouped into five clusters. A solid arrow indicates that a solution is associated with an RV in the first stage, a dashed arrow means that a solution is added to the cluster of an RV in the second step, and a circle means that a solution is removed from a cluster in the second step.]

D. Reference Vector-Based Final Solution Set Generation

In online data-driven optimization, the final solution (set) can be determined relatively easily, since all solutions are evaluated using the real objective functions. In offline optimization, however, the observed objective values of candidate solutions are all predicted by the surrogates, which may not be correct due to the approximation errors introduced by the surrogates. Our empirical results indicate that selecting only the nondominated solutions for final implementation may not be the best choice in offline data-driven optimization. Instead, we propose an RV-based final solution set generation strategy to obtain a solution set for final implementation. The main idea is to group the solutions achieved at each generation of the fine search and then generate one solution by averaging the solutions in each group. Recall that all solutions obtained at each generation of the fine search are stored in the database DB. The underlying motivation of clustering is to reduce the errors introduced by the surrogate models by averaging a number of similar solutions within a cluster.

The question now is how to group the solutions stored in DB. For diversity of the solutions, this paper adopts a set of RVs for final solution generation, which have been widely used to guide the search process in many-objective optimization [57], [58]. The final solution generation method starts with categorizing the solutions in DB into a number of groups. We assume that for each group, ρ solutions will be used for averaging, where ρ is a parameter to be specified.

Grouping the solutions in DB consists of two steps. In the first step, a set of RVs is generated and each solution in DB is associated with one RV, as proposed in RVEA [58]. The resulting clusters are denoted as Cj, j = 1, 2, ..., K, where K is the number of RVs predefined by the user to indicate a set of preferred solutions. Note that, similar to RVEA, all objective values are normalized to [0, 1] before being clustered. The second step aims to ensure that each cluster contains ρ solutions.

The second step is carried out because there might be too few or too many solutions in a cluster created in the first step. If a cluster contains too few solutions, i.e., fewer than the predefined number ρ, solutions associated with the neighboring RVs will be added to the cluster until ρ are found. If a cluster contains too many solutions, it is desirable to remove the inferior candidates from the cluster so that they are not used in averaging; here, for a minimization problem, the solutions that are farthest from the origin in the objective space are removed.

Fig. 2 presents an illustrative example of the clustering process, where eight solutions, d1–d8, need to be divided into five clusters represented by five RVs, R1–R5. In the figure, a solid arrow shows the association of a solution with an RV as determined in the first step. For example, solution d1 is associated with R1, d2 and d3 are associated with R2, and d6–d8 are associated with R4; however, no solution is associated with R5. Suppose the required number of solutions in each cluster is ρ = 2; then the cluster for R1 has one solution too few and the cluster for R5 lacks two solutions. Consequently, solution d2, which is the closest to R1 in terms of its angle with the RV, is added to the cluster for R1. Similarly, solutions d7 and d8 are added to the cluster for R5. By contrast, the cluster for R4 has too many solutions and one of them should be deleted; since the Euclidean distance of solution d8 to the origin is the largest among the three solutions in the cluster for R4, d8 is removed.

Once ρ solutions are associated with each cluster, the average of the ρ solutions in each cluster is taken as a final solution. The whole process of final solution generation is presented in Algorithm 3.

In the proposed final solution set generation method, all solutions achieved during the optimization participate in the generation of the final solutions, although some of them may eventually be discarded and others may contribute more than once. In addition to the RV-based method suggested in


this paper, other ideas, e.g., using other clustering methods rather than the reference-based method, or using a better subset of the solutions for clustering, can also be adopted. Therefore, in the empirical studies, we compare the RV-based clustering method with the k-means clustering method, and also test a strategy that only uses the nondominated solutions to generate the final solution set. The results, as listed in Tables S.B and S.C in Appendix A of the supplementary material, demonstrate that the RV-based clustering method using all solutions in clustering achieves the best overall performance.

Algorithm 3 RV-Based Solution Set Generation
1: Input: K: the number of reference vectors; AF: objective values of the solutions stored in DB; AS: decision variables of the solutions in DB; Cj, j = 1, 2, ..., K: the initial clusters created by reference vector association; ρ: predefined number of solutions in each cluster;
2: Output: the generated solution set TS for final implementation;
3: /* Cluster construction */
4: for j = 1 : K do
5:   /* The first case: too few solutions */
6:   if |Cj| is smaller than ρ then
7:     I = descendsort_{i ∈ {1,2,...,|AF|}} cos θi,j;
8:     Construct the cluster around the j-th reference vector: Cj = {I1, I2, ..., Iρ};
9:   /* The second case: too many solutions */
10:  else
11:    for i = 1 : |Cj| do
12:      Calculate the Euclidean distance to the origin: di = ‖Cj,i‖;
13:    end for
14:    I = ascendsort_{i ∈ {1,2,...,|Cj|}} di;
15:    Keep the individuals with the smallest Euclidean distances: Cj = {I1, I2, ..., Iρ};
16:  end if
17: end for
18: /* Generate the final solutions */
19: for j = 1 : K do
20:   Average the decision variables: TS = TS ∪ mean(AS{Cj});
21: end for
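A compact Python reading of Algorithm 3 is given below (an added sketch, minimization assumed): solutions are associated with the RV of largest cosine, deficient clusters borrow the solutions closest in angle, oversized clusters drop the solutions farthest from the origin, and each cluster is averaged in decision space.

import numpy as np

def rv_final_solutions(AS, AF, V, rho=20):
    # AS: (N, D) decision vectors in DB; AF: (N, M) surrogate objective values;
    # V: (K, M) reference vectors.
    F = (AF - AF.min(0)) / (AF.max(0) - AF.min(0) + 1e-12)       # normalize to [0, 1]
    Fu = F / (np.linalg.norm(F, axis=1, keepdims=True) + 1e-12)  # unit objective vectors
    Vu = V / np.linalg.norm(V, axis=1, keepdims=True)
    cos = Fu @ Vu.T                                # cosine of the angle to every RV
    assign = cos.argmax(axis=1)                    # step 1: associate each solution to an RV
    finals = []
    for j in range(len(V)):
        members = np.where(assign == j)[0]
        if len(members) < rho:                     # too few: borrow the closest by angle
            members = np.argsort(-cos[:, j])[:rho]
        else:                                      # too many: drop those far from the origin
            d = np.linalg.norm(F[members], axis=1)
            members = members[np.argsort(d)[:rho]]
        finals.append(AS[members].mean(axis=0))    # one averaged final solution per cluster
    return np.asarray(finals)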
IV. SIMULATION RESULTS AND DISCUSSION

In this section, empirical studies are conducted to verify the performance of the proposed offline data-driven multiobjective optimization algorithm, MS-RV for short. To this end, eight test problems are adopted for comparison, as listed in Table S.C in Appendix B of the supplementary material, of which four are taken from the DTLZ test suite [59] and the remaining four (F1–F4) from [60]. Similar to [22], the value 20π inside the cosine in the original DTLZ1 and DTLZ3 is changed to 2π, and the resulting problems are denoted as DTLZ1a and DTLZ3a; meanwhile, the parameter α in DTLZ4 is changed from 100 to 10, and the resulting problem is named DTLZ4a, to reduce the complexity of the problem. Among the test problems, the decision variables of the DTLZ problems are independent of each other, while the variables in the F test suite are linearly correlated. Regarding the number of decision variables, D = 10, 30, and 50 are considered for all the instances. Regarding the number of objectives of the DTLZ test problems, M = 2 and M = 3 are considered.

[TABLE I. Definition of the three variants of MS-RV.]

In the experiments, we first examine the efficiency of the coarse-fine search using the RV-based final solution set generation strategy (MS-RV). Then, we compare MS-RV with two offline surrogate-assisted multiobjective optimization algorithms, which use NSGAII_GP [7] and K-RVEA [10] (proposed in [61]) as the basic search method, respectively. To investigate the importance of the two strategies in MS-RV, MS-RV is also compared with its three variants, FS-RV, MS-ND, and FS-ND, as listed in Table I.

In all algorithms, the RBF is constructed using the toolbox in [62] and the GP is built using the toolbox in [63].

A. Experimental Settings

The general and the specific parameters for each algorithm used in the experiments are summarized as follows.
1) All the compared algorithms adopt NSGA-II [36] as the basic search method. The distribution indexes of both crossover and mutation in NSGA-II are set to 20. The crossover probability and mutation probability are set to pc = 1.0 and pm = 1/D, respectively, where D is the number of decision variables of the original optimization problem.
2) The initial training data set (historical data set) for each experiment is sampled using the Latin hypercube sampling method [64] (see the sketch after this list), and its size is set to 10D, where D is the number of decision variables. The population size N is set to 50. For each algorithm, 100 final solutions are generated for bi-objective and 105 for tri-objective instances. The number of candidates in each cluster in the RV-based final solution set generation is set to ρ = 20. Twenty independent runs are performed for each algorithm on each test instance.
3) The termination condition of each run is the maximum number of generations for the fine search, TF = 40.
4) The maximum number of generations for the coarse search in each generation of the fine search is set to TC = 15. The coefficient rv, which determines the dimension of the CS, is set to 0.3.
5) In NSGAII_GP, the parameters are the same as in the original algorithm except for the maximum number of

[Fig. 3. Comparison of the HV values of the solution set over the generations, calculated based on the real objective functions and the FS, respectively.]

generations, which is set to 20 for bi-objective and 21 for tri-objective problems in order to obtain the same number of final solutions.
6) The parameters of K-RVEA are set the same as in [10].
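The Latin hypercube sampling of setting 2) can be reproduced, for example, with SciPy's qmc module; the sketch below assumes a 30-D problem with unit box bounds.

import numpy as np
from scipy.stats import qmc

D = 30                                            # number of decision variables
l, u = np.zeros(D), np.ones(D)                    # box bounds of the problem in (1)
sampler = qmc.LatinHypercube(d=D, seed=1)
X0 = qmc.scale(sampler.random(n=10 * D), l, u)    # the 10D offline samples in [l, u]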
Inverted generational distance (IGD) [60] and hypervolume (HV) [65], [66] are adopted to measure the quality of the solution sets in terms of both convergence and diversity. In calculating the IGD, 500 uniformly distributed reference solutions are sampled from the PF. To calculate the HV, all solution sets are combined and their objective values are normalized to [0, 1]; then y* = (1.1, 1.1) and y* = (1.1, 1.1, 1.1) are set as the reference points for the bi-objective and tri-objective test instances, respectively.
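For reference, both quality indicators are straightforward to compute; the sketch below (added for illustration) implements the IGD and, for the bi-objective case, the HV with respect to the reference point y* = (1.1, 1.1).

import numpy as np

def igd(ref, approx):
    # Mean distance from each of the uniformly sampled PF reference points
    # to its nearest obtained solution (smaller is better)
    d = np.linalg.norm(ref[:, None, :] - approx[None, :, :], axis=2)
    return d.min(axis=1).mean()

def hv_2d(F, ref=(1.1, 1.1)):
    # Hypervolume of a bi-objective minimization set w.r.t. the reference point,
    # computed by sweeping the points in ascending order of the first objective
    pts = sorted([p for p in F if p[0] <= ref[0] and p[1] <= ref[1]], key=lambda p: p[0])
    hv, prev_y = 0.0, ref[1]
    for x, y in pts:
        if y < prev_y:
            hv += (ref[0] - x) * (prev_y - y)
            prev_y = y
    return hv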
B. Experimental Studies

1) Effectiveness of the Proposed Coarse-Fine Search: In this section, we examine the efficiency of the coarse-fine search component of MS-RV in solving offline data-driven problems by comparing it with the single fine search (optimization assisted by the single FS) on eight test instances: DTLZ1a (M = 2, D = 30), DTLZ2 (M = 2, D = 30), DTLZ3a (M = 2, D = 30), DTLZ4a (M = 2, D = 30), F1 (D = 30), F2 (D = 30), F3 (D = 30), and F4 (D = 30). For the solutions of each generation, we calculate the HV of the solutions according to the objective values evaluated using the surrogates and the real objective functions, respectively. The mean and standard deviation of the HV values over 30 runs over the generations on each test problem are plotted in Fig. 3. Note that we do not compare the exact HV between the surrogate and its corresponding real objective functions, as they are normalized into different regions in calculating the HV. Instead, we just want to note that the HV values calculated according to the real and estimated fitness values during the optimization are strongly correlated.

As can be observed from Fig. 3, the HV values of the solution set in the last generation are better (validated by the real objective functions) than those of the initial population on all the test problems. These results indicate that the search assisted by an FS or a multisurrogate constructed on the historical data is able to find solutions better than those in the historical data. In addition, we can find that the HV values of the solution sets according to the surrogates and the real objective functions show a consistent trend on all test instances except for F3, which exhibits a slight fluctuation in the later generations. The above finding implies that the better solutions evaluated by the surrogates will usually have better objective values if evaluated using the real objective functions. It can also be found in the figure that the HV values of the solution sets achieved by the coarse-fine search are significantly better than those found by the single fine model according to both the FSs and the real fitness functions, indicating the fast convergence achieved by the coarse-fine search compared to the single FS. The encouraging convergence speed of the coarse-fine search might be attributed to the knowledge transferred from the CS to the fine model.

2) Effectiveness of Reference Vector-Based Final Solution Set Generation: As mentioned above, the nondominated solutions in DB may not be the best solutions when evaluated using the real objective functions because of the approximation errors introduced by the surrogates. For this reason, we should not directly use those nondominated solutions in DB. Instead, we propose an RV-based strategy to generate a set of final solutions, as described in Section III-D. This section investigates the effectiveness of the RV-based final solution generation method by comparing the quality of the solutions generated using the RV-based method with that of the nondominated solutions in DB on the eight test instances.

In this experiment, the solutions generated during the optimization over the 60 iterations are stored in DB. Then two sets of solutions, each consisting of 100 solutions,


[Fig. 4. Visualization of the solutions generated by the ND and RV-based generation strategies, where the solutions generated by ND are denoted by circles and the solutions generated by the RV-based strategy are denoted by squares.]

[TABLE II. Statistical results of the IGD of the compared algorithms on the F test instances over 30 independent runs, where the best result on each test instance is highlighted.]

are generated by the nondominated sorting (ND) and the RV-based generation methods, respectively. The objective values of these solutions, validated by the real fitness functions and the FSs, respectively, are plotted in Fig. 4. From the figure, we can make the following observations. First, the quality of these two sets of solutions is very different according to the surrogates and the real fitness functions. More specifically, the solutions achieved by RV are no worse than those obtained by ND on all test problems when validated by the real fitness functions, although they appear to be worse according to the surrogates. This observation suggests that it might not be a good idea to directly present the nondominated solutions found (according to the surrogates) in the last generation of the coarse-fine search to the user for possible implementation. Second, the solutions obtained by RV on the F problems and the DTLZ problems are either comparable to or much better than those obtained by ND. The promising performance of the solutions achieved by RV may be attributed to the fact that averaging the solutions in each cluster may reduce the approximation error introduced by the surrogates, thereby enhancing the quality of the solutions.

From the results in Fig. 4, we can also see that the performance of the solutions obtained by RV on the DTLZ test functions is consistently much better than that on F1–F4


[TABLE III. Statistical IGD of the compared algorithms on the DTLZ test instances over 30 independent runs, where the best result on each test instance is highlighted.]

in comparison with the quality of the ND solutions. The reason might be that the decision variables of F1–F4 are linearly correlated, while those of the DTLZ test instances are independent of each other.

3) Comparison With Offline Data-Driven Evolutionary Algorithms: Tables II and III present the statistical results of the IGD values obtained by the six compared algorithms on F1–F4 and the DTLZ test suite over 30 independent runs, where the best result on each test instance is highlighted. The Wilcoxon rank sum test is also adopted, at a significance level of 0.05, where the symbols "+," "−," and "≈" indicate that the result obtained by another algorithm is significantly better than, significantly worse than, or not significantly different from that obtained by the proposed algorithm MS-RV, respectively. Note that the solutions of each algorithm are evaluated using the real objective functions before the IGD values are calculated. From these two tables, we can draw the following conclusions.
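The significance labels can be produced, for instance, with SciPy's rank-sum test; the per-run IGD values in the sketch below are hypothetical placeholders.

import numpy as np
from scipy.stats import ranksums

igd_msrv  = np.array([0.021, 0.019, 0.023, 0.020, 0.022])   # MS-RV, one IGD per run
igd_other = np.array([0.034, 0.031, 0.036, 0.029, 0.033])   # a compared algorithm
_, p = ranksums(igd_other, igd_msrv)
if p >= 0.05:
    symbol = "≈"                 # no significant difference at the 0.05 level
else:
    symbol = "+" if np.median(igd_other) < np.median(igd_msrv) else "−"
print(symbol)                    # "−" here: the other algorithm is significantly worse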
First, the proposed algorithm MS-RV achieves the best overall performance against K-RVEA, NSGAII_GP, and its three variants. In particular, MS-RV significantly outperforms K-RVEA and NSGAII_GP on the higher-dimensional problems, i.e., D = 30 and D = 50, of both the DTLZ and F test instances. These results demonstrate that MS-RV is competitive in solving high-dimensional problems in comparison with K-RVEA and NSGAII_GP.

Second, regarding the performance of the surrogates, the comparative results in terms of the IGD values between the two ND algorithms, FS-ND (single FS) and MS-ND (multisurrogates),


[TABLE IV. Statistical IGD results of the compared algorithms with different FS models on eight test instances over 30 independent runs, where the best result on each test instance is highlighted.]

demonstrate that MS-ND performs better than FS-ND on 25 out of 36 test instances and is never outperformed by the single-FS-assisted algorithm.

4) Scalability of the Proposed Algorithm With Different Models: In this section, we compare FS-ND, MS-ND, FS-RV, and MS-RV on the eight test problems when using SVR and GP, respectively, as the FS. For each test problem, the number of decision variables D is set to 10 due to the prohibitive computational cost of GP on high-dimensional problems.

The IGD values of the solution sets obtained by the algorithms under comparison, averaged over 30 independent runs when using the different FS models, are presented in Table IV. We can find from these results that the four algorithms assisted by SVR and GP, respectively, perform similarly to when they are assisted by the RBF model. Furthermore, MS-RV achieves the best overall performance among the four compared algorithms on the eight test problems. These findings indicate that the knowledge transfer strategy used in MS-RV also works well when models other than the RBF are used as the surrogate.

[Fig. 5. Mean and standard deviation of the IGD values obtained by FS-ND, MS-ND, FS-RV, and MS-RV using different sizes of historical data.]

5) Sensitivity of the Proposed Algorithm to Data Size: In this section, we evaluate the sensitivity of the performance of FS-ND, MS-ND, FS-RV, and MS-RV to the data size (5D, 10D, 15D, 20D, and 25D, where D is the number of decision variables) on four test problems with M = 2 and D = 30.

The mean and standard deviation of the IGD values of the solution sets obtained by each algorithm over 20 independent runs for the different data sizes are plotted in Fig. 5. On F1 and F3, the IGD values of all algorithms vary little when the size of the offline data changes. On DTLZ2 and DTLZ4a, the IGD values of MS-RV and MS-ND improve as the amount of data increases, although the performance of FS-RV and FS-ND generally remains unchanged. To better understand the above observations, we calculate the root mean square error (RMSE) of each fine model trained using the different offline data sizes; the results are plotted in Fig. S.A in Appendix C of the supplementary material. The results in the figure show that the approximation error of the fine models on F1, DTLZ2, and DTLZ4a gradually decreases as the size of the data increases, while the error on F3 remains nearly unchanged. These results indicate that the quality of the fine models generally becomes better as the amount of data increases. However, only the coarse-fine search is able to make use of the improved accuracy of the surrogates on the DTLZ problems.

C. Comparison With Online Algorithms

In general, an offline data-driven optimization algorithm can be seen as one step of an online algorithm once the final solution set is generated and validated. In this section, the proposed MS-RV is modified into an online algorithm. To this end, six generated solutions from each run of MS-RV are evaluated by the real fitness functions and used for updating the surrogate model. To evaluate the performance of the online version of MS-RV, we compare it with three online multiobjective data-driven optimization algorithms, K-RVEA [61], MOEA/D-EGO [23], and ParEGO [22], on the eight test problems (M = 3 for the DTLZ problems) adopted in this paper. In the experiments, the initial training data size is set to 11D − 1 and the maximum number of evaluations for each compared algorithm is set to 150. The three online algorithms are implemented in PlatEMO [67].

[TABLE V. Statistical IGD results of the online version of MS-RV and three online algorithms on eight test instances over 30 independent runs, where the best result on each test instance is highlighted.]

The IGD results of the four compared algorithms are presented in Table V. We can see from the table that the online version of MS-RV significantly outperforms the three online data-driven optimization algorithms, especially on high-dimensional problems. The results imply that MS-RV is potentially also promising for solving high-dimensional online data-driven optimization problems.

[Fig. 6. Averaged IGD values obtained by MS-RV using different coefficients rv.]

D. Parameter Sensitivity Analysis

In the proposed MS-RV algorithm, there are three parameters that may influence the performance of the algorithm: the coefficient rv, which controls the dimension of the coarse model; the maximum number of generations TC of the search assisted by the CS in each generation; and the number of solutions ρ averaged in each cluster. In this section, we analyze the sensitivity of the performance of MS-RV to the three parameters on four test functions, namely, F1 (D = 30), F3 (D = 30), DTLZ2 (M = 2 and D = 30), and DTLZ4a (M = 2 and D = 30).

1) Coefficient rv: The coefficient rv determines the dimension of the CSs, which influences the search assisted by the CSs as well as the knowledge to be transferred to the FSs, and thus eventually influences the performance of MS-RV. Fig. 6 presents the results obtained within 40 generations on the above four instances with 10, 30, and 50 dimensions, in which the average IGD values for different rv are plotted. From this figure, we can find that F1 and F3 are more sensitive to rv than DTLZ2 and DTLZ4a, probably because the decision variables of F1 and F3 are correlated. It can also be seen that MS-RV achieves the best overall IGD values when rv is around 0.3. Therefore, we use rv = 0.3 in the experiments for comparisons.

2) Maximum Number of Generations of the Coarse Search (TC): As discussed in Section III, the proposed multisurrogate method reuses the knowledge acquired by the CSs to enhance the convergence of the fine search. TC directly influences the knowledge acquired from the coarse search, thereby influencing the performance of MS-RV. Fig. 7 plots the average IGD values for different TC on the above four test problems. As can be seen in the figure, MS-RV is slightly more sensitive to TC on F1 and F3 than on the two DTLZ test problems.

[Fig. 7. Averaged IGD values obtained by MS-RV using different numbers of generations TC in the coarse search.]

Nevertheless, the performance is satisfactory in general when TC is set between 10 and 20. Accordingly, we use TC = 15.

3) Number of Solutions in Each Cluster (ρ): The purpose of averaging over ρ solutions in each cluster is to reduce the influence of the errors introduced by the surrogates. If ρ is too small, averaging will not work properly; however, a too large ρ may also lead to poor performance, as many very different solutions will participate in the averaging. For this reason, we investigate a proper ρ for generating high-quality solutions on the 30-D F1, F3, DTLZ2, and DTLZ4a instances. The comparative results for ρ = 3, 5, 10, 20, and 50 are shown in Fig. 8. From these results, we can see that the algorithm shows the best performance when ρ is around 20. Therefore, we set ρ = 20.

[Fig. 8. Averaged IGD values obtained by MS-RV with different numbers of solutions in each cluster ρ.]

V. CASE STUDY

In this section, we consider the application of the proposed MS-RV algorithm to a real-world operational indices optimization problem in a beneficiation process. This is a typical offline data-driven problem in that mathematical equations of the objective functions cannot be obtained due to the complex physical and chemical reactions in the process, and only a small amount of historical data can be used. A brief introduction to the operational indices optimization problem is presented in Appendix D of the supplementary material.

A. Problem Description

Operational indices optimization in beneficiation processes aims to improve the concentrate grade (G) and the concentrate yield (Y), as well as to decrease the energy consumption (E), which includes the costs of the roasting unit and the grinding units, by properly coordinating the operating states of all units. In this paper, we consider the following 15 operational indices: the particle sizes of the raw ore entering the LMPL and HMPL (pl, ph); the grades of the raw ore entering the LMPL and HMPL (gl, gh); the capacity and run time of the shaft furnace roasting (sc, st); the grade of waste ore (gw); the grades of the feed ore of the grindings in the LMPL and HMPL (gfl, gfh); the capacities of the grindings in the LMPL and HMPL (gcl, gch); the running times of the grindings in the LMPL and HMPL (gtl, gth); and the grades of the tailings from the LMPL and HMPL (tl, th). These 15 operational indices are taken as the decision variables and denoted as X = (pl, ph, gl, gh, sc, st, gw, gfl, gfh, gcl, gch, gtl, gth, tl, th). The optimization problem can be formulated as follows:

    min (−G, −Y, E)
    s.t. G = φ1(X), Y = φ2(X)
         E = sc + 0.3 st + gcl + gch + gtl + gth          (3)

where φ1 and φ2 represent the unknown correlations between the objectives and the decision variables in the operational indices optimization problem.
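Of the three objectives in (3), only E is known analytically; G and Y must be predicted by surrogates trained on the historical samples. A small sketch of the decision-vector layout and the energy objective (added for illustration):

import numpy as np

NAMES = ["pl", "ph", "gl", "gh", "sc", "st", "gw", "gfl",
         "gfh", "gcl", "gch", "gtl", "gth", "tl", "th"]
IDX = {name: i for i, name in enumerate(NAMES)}   # layout of the 15-D decision vector X

def energy(X):
    # E = sc + 0.3*st + gcl + gch + gtl + gth, the analytic objective in (3)
    x = np.asarray(X)
    return (x[..., IDX["sc"]] + 0.3 * x[..., IDX["st"]] + x[..., IDX["gcl"]]
            + x[..., IDX["gch"]] + x[..., IDX["gtl"]] + x[..., IDX["gth"]])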
[Fig. 9. HV of the solution sets obtained by the proposed algorithm over the generations.]

B. Optimization Results

In this real-world application, we are not able to validate the obtained solutions, as no "real objective functions" are available for validation. For this reason, we evaluate the search ability of the coarse-fine search strategy in the proposed MS-RV by comparing it with the FS and with the multiform optimization approach [68], which is the most recently proposed algorithm for solving the operational indices optimization problem. Note that the target accurate model in the multiform approach is replaced with an RBF for a fair comparison. Each of the compared algorithms is employed to solve the operational indices optimization problem, for which 150 pairs of collected historical data are available. We first conduct 30 independent runs of each algorithm, each run performing 40 generations of fine search.

The mean and standard deviation of the HV values of the solution sets obtained by the three compared algorithms over the

Fig. 10. Parallel coordinate plot of a set of 50 nondominated solutions obtained by each algorithm.

The mean and standard deviation of the HV values of the solution sets obtained by the three compared algorithms over the generations are plotted in Fig. 9. We can see from the figure that the HV values gradually increase over the generations, indicating that all compared algorithms converge on the operational indices optimization problem in general, as discussed in Section IV-B. We can also see from Fig. 9 that the coarse-fine search strategy maintains the best HV values among the compared algorithms, demonstrating the best search ability of the three. To further examine the quality of the solutions found by each algorithm, a set of 50 nondominated solutions is plotted using parallel coordinates in Fig. 10. The figure shows that the solutions of each production index achieved by the coarse-fine strategy are distributed in a larger region. This result confirms that the coarse-fine search strategy is able to achieve a more diverse solution set.
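As an aside on how such statistics can be aggregated, the following is a minimal sketch for a two-objective minimization case, with synthetic placeholder fronts and a hypothetical reference point; it is a simple slab-based computation rather than the faster exact HV algorithms of [65] and [66]:

    import numpy as np

    def hv_2d(front, ref):
        # Hypervolume of a 2-D minimization front w.r.t. a reference point,
        # accumulated as horizontal slabs of the dominated region.
        pts = np.asarray([p for p in front if p[0] < ref[0] and p[1] < ref[1]])
        if pts.size == 0:
            return 0.0
        pts = pts[np.argsort(pts[:, 0])]  # ascending first objective
        hv, prev = 0.0, ref[1]
        for f1, f2 in pts:
            if f2 < prev:                 # dominated points contribute nothing
                hv += (ref[0] - f1) * (prev - f2)
                prev = f2
        return hv

    rng = np.random.default_rng(0)
    runs = [rng.random((50, 2)) for _ in range(30)]  # placeholder final fronts
    hvs = [hv_2d(f, ref=(1.1, 1.1)) for f in runs]
    print(f"HV: {np.mean(hvs):.4f} +/- {np.std(hvs):.4f}")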
VI. CONCLUSION

In this paper, an offline data-driven optimization algorithm, called MS-RV, has been proposed for solving offline data-driven multiobjective optimization problems. The developed algorithm builds a CS and an FS using the historical data set. The CS, which is dynamically constructed in a subspace of the original search space, is used for quick exploration of a promising region, whereas the FS aims to help the optimizer exploit the promising solutions. Meanwhile, a knowledge transfer method is employed to transfer knowledge from the CS to the FS to enhance the convergence of the optimization process assisted by the FS. After that, an RV-based final solution set generation strategy is proposed to reduce the influence of the approximation errors introduced by the surrogates, thereby generating high-quality solutions.

We compare the MS-RV algorithm with two state-of-the-art algorithms, K-RVEA and NSGAII_GP, and with three variants of MS-RV on eight MOPs of 10, 30, and 50 dimensions. Empirical results demonstrate that MS-RV achieves the best overall performance among the compared algorithms. Finally, MS-RV is applied to solve a real-world operational indices optimization problem, which is a typical offline data-driven optimization problem.

Despite the encouraging performance of the proposed algorithm on the test problems, we find from the experiments that the algorithm is not able to well approximate the true PF for hard optimization problems, such as F3 and DTLZ4. One reason may be that the offline data of these problems are ill-distributed in the objective space. In the future, we are interested in improving the performance of the proposed algorithm for addressing ill-distributed historical data by developing new surrogates, surrogate-management strategies, and final solution generation methods.

REFERENCES

[1] D. J. Allstot, K. Choi, and J. Park, Parasitic-Aware Optimization of CMOS RF Circuits. New York, NY, USA: Springer, 2003.
[2] T. A. Milligan, Modern Antenna Design. Hoboken, NJ, USA: Wiley, 2005.
[3] R. Cheng, T. Rodemann, M. Fischer, M. Olhofer, and Y. Jin, “Evolutionary many-objective optimization of hybrid electric vehicle control: From general optimization to preference articulation,” IEEE Trans. Emerg. Topics Comput. Intell., vol. 1, no. 2, pp. 97–111, Apr. 2017.
[4] T. Chugh, K. Sindhya, K. Miettinen, Y. Jin, T. Kratky, and P. Makkonen, “Surrogate-assisted evolutionary multiobjective shape optimization of an air intake ventilation system,” in Proc. IEEE Congr. Evol. Comput. (CEC), 2017, pp. 1541–1548.
[5] H. Wang, J. Doherty, and Y. Jin, “Hierarchical surrogate-assisted evolutionary multi-scenario airfoil shape optimization,” in Proc. Congr. Evol. Comput., 2018, pp. 8–13.
[6] H. Wang, Y. Jin, and J. O. Jansen, “Data-driven surrogate-assisted multiobjective evolutionary optimization of a trauma system,” IEEE Trans. Evol. Comput., vol. 20, no. 6, pp. 939–952, Dec. 2016.
[7] D. Guo, T. Chai, J. Ding, and Y. Jin, “Small data driven evolutionary multi-objective optimization of fused magnesium furnaces,” in Proc. IEEE Symp. Series Comput. Intell. (SSCI), 2016, pp. 1–8.
[8] J. Ding, T. Chai, H. Wang, and X. Chen, “Knowledge-based global operation of mineral processing under uncertainty,” IEEE Trans. Ind. Informat., vol. 8, no. 4, pp. 849–859, Nov. 2012.
[9] H. Wang, Y. Jin, C. Sun, and J. Doherty, “Offline data-driven evolutionary optimization using selective surrogate ensembles,” IEEE Trans. Evol. Comput., vol. 23, no. 2, pp. 203–216, Apr. 2019, doi: 10.1109/TEVC.2018.2834881.
[10] T. Chugh, N. Chakraborti, K. Sindhya, and Y. Jin, “A data-driven surrogate-assisted evolutionary algorithm applied to a many-objective blast furnace optimization problem,” Mater. Manuf. Process., vol. 32, no. 10, pp. 1172–1178, 2017.
[11] Y. Jin, “Data driven evolutionary optimization of complex systems: Big data versus small data,” in Proc. ACM Genet. Evol. Comput. Conf. Companion, 2016, pp. 1281–1282.
[12] Y. Jin and J. Branke, “Evolutionary optimization in uncertain environments—A survey,” IEEE Trans. Evol. Comput., vol. 9, no. 3, pp. 303–317, Jun. 2005.
[13] Y. Jin, “Surrogate-assisted evolutionary computation: Recent advances and future challenges,” Swarm Evol. Comput., vol. 1, no. 2, pp. 61–70, 2011.
[14] Y. Jin, M. Olhofer, and B. Sendhoff, “A framework for evolutionary optimization with approximate fitness functions,” IEEE Trans. Evol. Comput., vol. 6, no. 5, pp. 481–494, Oct. 2002.
[15] A. Gaspar-Cunha and A. Vieira, “A hybrid multi-objective evolutionary algorithm using an inverse neural network,” in Hybrid Metaheuristics. Valencia, Spain: Springer, 2004, pp. 25–30.
[16] Y. Lian and M.-S. Liou, “Multiobjective optimization using coupled response surface model and evolutionary algorithm,” AIAA J., vol. 43, no. 6, pp. 1316–1325, 2005.
[17] M. Herrera, A. Guglielmetti, M. Xiao, and R. F. Coelho, “Metamodel-assisted optimization based on multiple kernel regression for mixed variables,” Struct. Multidiscipl. Optim., vol. 49, no. 6, pp. 979–991, 2014.
[18] Y. S. Ong, P. B. Nair, and A. J. Keane, “Evolutionary optimization of computationally expensive problems via surrogate modeling,” AIAA J., vol. 41, no. 4, pp. 687–696, 2003.
[19] S. Z. Martínez and C. A. Coello Coello, “MOEA/D assisted by RBF networks for expensive multi-objective optimization problems,” in Proc. ACM 15th Annu. Conf. Genet. Evol. Comput., 2013, pp. 1405–1412.
[20] A. Isaacs, T. Ray, and W. Smith, “An evolutionary algorithm with spatially distributed surrogates for multiobjective optimization,” in Proc. Aust. Conf. Artif. Life, 2007, pp. 257–268.
[21] D. Buche, N. N. Schraudolph, and P. Koumoutsakos, “Accelerating evolutionary algorithms with Gaussian process fitness function models,” IEEE Trans. Syst., Man, Cybern. C, Appl. Rev., vol. 35, no. 2, pp. 183–194, May 2005.
[22] J. Knowles, “ParEGO: A hybrid algorithm with online landscape approximation for expensive multiobjective optimization problems,” IEEE Trans. Evol. Comput., vol. 10, no. 1, pp. 50–66, Feb. 2006.
[23] Q. Zhang, W. Liu, E. Tsang, and B. Virginas, “Expensive multiobjective optimization by MOEA/D with Gaussian process model,” IEEE Trans. Evol. Comput., vol. 14, no. 3, pp. 456–474, Jun. 2010.
[24] D. Lim, Y.-S. Ong, Y. Jin, and B. Sendhoff, “A study on metamodeling techniques, ensembles, and multi-surrogates in evolutionary computation,” in Proc. ACM 9th Annu. Conf. Genet. Evol. Comput., 2007, pp. 1288–1295.
[25] Y. Jin and B. Sendhoff, “Reducing fitness evaluations using clustering techniques and neural network ensembles,” in Proc. Genet. Evol. Comput. Conf., 2004, pp. 688–699.
[26] A. Rosales-Pérez, C. A. Coello Coello, J. A. Gonzalez, C. A. Reyes-Garcia, and H. J. Escalante, “A hybrid surrogate-based approach for evolutionary multi-objective optimization,” in Proc. IEEE Congr. Evol. Comput. (CEC), 2013, pp. 2548–2555.
[27] T. Goel, R. T. Haftka, W. Shyy, and N. V. Queipo, “Ensemble of surrogates,” Struct. Multidiscipl. Optim., vol. 33, no. 3, pp. 199–216, 2007.
[28] H. Wang, Y. Jin, and J. Doherty, “Committee-based active learning for surrogate-assisted particle swarm optimization of expensive problems,” IEEE Trans. Cybern., vol. 47, no. 9, pp. 2664–2677, Sep. 2017.
[29] D. Guo, Y. Jin, J. Ding, and T. Chai, “Heterogeneous ensemble based infill criterion for evolutionary multi-objective optimization of expensive problems,” IEEE Trans. Cybern., vol. 49, no. 3, pp. 1012–1025, Mar. 2019, doi: 10.1109/TCYB.2018.2794503.
[30] Z. Zhou, Y. S. Ong, M. H. Nguyen, and D. Lim, “A study on polynomial regression and Gaussian process global surrogate model in hierarchical surrogate-assisted evolutionary algorithm,” in Proc. IEEE Congr. Evol. Comput., vol. 3, 2005, pp. 2832–2839.
[31] D. Lim, Y. Jin, Y.-S. Ong, and B. Sendhoff, “Generalizing surrogate-assisted evolutionary computation,” IEEE Trans. Evol. Comput., vol. 14, no. 3, pp. 329–355, Jun. 2010.
[32] C. Sun, J. Zeng, J. Pan, S. Xue, and Y. Jin, “A new fitness estimation strategy for particle swarm optimization,” Inf. Sci., vol. 221, pp. 355–370, Feb. 2013.
[33] C. Sun, Y. Jin, J. Zeng, and Y. Yu, “A two-layer surrogate-assisted particle swarm optimization algorithm,” Soft Comput., vol. 19, no. 6, pp. 1461–1475, 2015.
[34] C. Sun, Y. Jin, R. Cheng, J. Ding, and J. Zeng, “Surrogate-assisted cooperative swarm optimization of high-dimensional expensive problems,” IEEE Trans. Evol. Comput., vol. 21, no. 4, pp. 644–660, Aug. 2017.
[35] Y. Jin, H. Wang, T. Chugh, D. Guo, and K. Miettinen, “Data-driven evolutionary optimization: An overview and case studies,” IEEE Trans. Evol. Comput., vol. 23, no. 3, pp. 442–458, Jun. 2019, doi: 10.1109/TEVC.2018.2869001.
[36] K. Deb, A. Pratap, S. Agarwal, and T. Meyarivan, “A fast and elitist multiobjective genetic algorithm: NSGA-II,” IEEE Trans. Evol. Comput., vol. 6, no. 2, pp. 182–197, Apr. 2002.
[37] N. Beume, B. Naujoks, and M. Emmerich, “SMS-EMOA: Multiobjective selection based on dominated hypervolume,” Eur. J. Oper. Res., vol. 181, no. 3, pp. 1653–1669, 2007.
[38] Q. Zhang and H. Li, “MOEA/D: A multiobjective evolutionary algorithm based on decomposition,” IEEE Trans. Evol. Comput., vol. 11, no. 6, pp. 712–731, Dec. 2007.
[39] I. Koprić, Citizens as Partners: Information, Consultation and Public Participation in Policy-Making. Org. Econ. Co-Oper. Develop., Paris, France, 2001.
[40] Y. Jin, Ed., Knowledge Incorporation in Evolutionary Computation. Berlin, Germany: Springer, 2006.
[41] A. Gupta, Y.-S. Ong, and L. Feng, “Insights on transfer optimization: Because experience is the best teacher,” IEEE Trans. Emerg. Topics Comput. Intell., vol. 2, no. 1, pp. 51–64, Feb. 2018.
[42] S. J. Louis and J. McDonnell, “Learning with case-injected genetic algorithms,” IEEE Trans. Evol. Comput., vol. 8, no. 4, pp. 316–328, Aug. 2004.
[43] P. Cunningham and B. Smyth, “Case-based reasoning in scheduling: Reusing solution components,” Int. J. Prod. Res., vol. 35, no. 11, pp. 2947–2962, 1997.
[44] A. Moshaiov and A. Tal, “Family bootstrapping: A genetic transfer learning approach for onsetting the evolution for a set of related robotic tasks,” in Proc. IEEE Congr. Evol. Comput. (CEC), 2014, pp. 2801–2808.
[45] L. Feng, Y.-S. Ong, A.-H. Tan, and I. W. Tsang, “Memes as building blocks: A case study on evolutionary optimization + transfer learning for routing problems,” Memetic Comput., vol. 7, no. 3, pp. 159–180, 2015.
[46] L. Feng, Y.-S. Ong, M.-H. Lim, and I. W. Tsang, “Memetic search with interdomain learning: A realization between CVRP and CARP,” IEEE Trans. Evol. Comput., vol. 19, no. 5, pp. 644–658, Oct. 2015.
[47] L. Feng, Y.-S. Ong, I. W.-H. Tsang, and A.-H. Tan, “An evolutionary search paradigm that learns with past experiences,” in Proc. IEEE Congr. Evol. Comput. (CEC), 2012, pp. 1–8.
[48] L. Feng, Y.-S. Ong, S. Jiang, and A. Gupta, “Autoencoding evolutionary search with learning across heterogeneous problems,” IEEE Trans. Evol. Comput., vol. 21, no. 5, pp. 760–772, Oct. 2017.
[49] M. Pelikan and M. W. Hauschild, “Learn from the past: Improving model-directed optimization by transfer learning based on distance-based bias,” Missouri Estim. Distrib. Algorithms Lab., Univ. Missouri at St. Louis, St. Louis, MO, USA, Rep. 2012007, 2012.
[50] R. Santana, A. Mendiburu, and J. A. Lozano, “Structural transfer using EDAs: An application to multi-marker tagging SNP selection,” in Proc. IEEE Congr. Evol. Comput. (CEC), 2012, pp. 1–8.
[51] E. Haslam, B. Xue, and M. Zhang, “Further investigation on genetic programming with transfer learning for symbolic regression,” in Proc. IEEE Congr. Evol. Comput. (CEC), 2016, pp. 3598–3605.
[52] M. Iqbal, W. N. Browne, and M. Zhang, “Reusing building blocks of extracted knowledge to solve complex, large-scale Boolean problems,” IEEE Trans. Evol. Comput., vol. 18, no. 4, pp. 465–480, Aug. 2014.
[53] D. O’Neill, H. Al-Sahaf, B. Xue, and M. Zhang, “Common subtrees in related problems: A novel transfer learning approach for genetic programming,” in Proc. IEEE Congr. Evol. Comput. (CEC), 2017, pp. 1287–1294.
[54] A. Gupta, Y.-S. Ong, and L. Feng, “Multifactorial evolution: Toward evolutionary multitasking,” IEEE Trans. Evol. Comput., vol. 20, no. 3, pp. 343–357, Jun. 2016.
[55] A. Gupta, Y.-S. Ong, L. Feng, and K. C. Tan, “Multi-objective multifactorial optimization in evolutionary multitasking,” IEEE Trans. Cybern., vol. 47, no. 7, pp. 1652–1665, Jul. 2017.
[56] J. Ding, C. Yang, Y. Jin, and T. Chai, “Generalized multitasking for evolutionary optimization of expensive problems,” IEEE Trans. Evol. Comput., vol. 23, no. 1, pp. 44–58, Feb. 2019, doi: 10.1109/TEVC.2017.2785351.
[57] K. Deb and H. Jain, “An evolutionary many-objective optimization algorithm using reference-point-based nondominated sorting approach, part I: Solving problems with box constraints,” IEEE Trans. Evol. Comput., vol. 18, no. 4, pp. 577–601, Aug. 2014.
[58] R. Cheng, Y. Jin, M. Olhofer, and B. Sendhoff, “A reference vector guided evolutionary algorithm for many-objective optimization,” IEEE Trans. Evol. Comput., vol. 20, no. 5, pp. 773–791, Oct. 2016.
[59] K. Deb, L. Thiele, M. Laumanns, and E. Zitzler, “Scalable multi-objective optimization test problems,” in Proc. IEEE Congr. Evol. Comput. (CEC), vol. 1, 2002, pp. 825–830.
[60] Q. Zhang, A. Zhou, and Y. Jin, “RM-MEDA: A regularity model-based multiobjective estimation of distribution algorithm,” IEEE Trans. Evol. Comput., vol. 12, no. 1, pp. 41–63, Feb. 2008.
[61] T. Chugh, Y. Jin, K. Miettinen, J. Hakanen, and K. Sindhya, “A surrogate-assisted reference vector guided evolutionary algorithm for computationally expensive many-objective optimization,” IEEE Trans. Evol. Comput., vol. 22, no. 1, pp. 129–142, Feb. 2018.
[62] G. Jekabsons, RBF: Radial Basis Function Interpolation for MATLAB/OCTAVE, vol. 1, Riga Tech. Univ., Riga, Latvia, 2009.
[63] S. Lophaven, H. Nielsen, and J. Søndergaard, DACE—A MATLAB Kriging Toolbox, Version 2.0, Tech. Univ. Denmark, Lyngby, Denmark, 2002.
[64] R. L. Iman, “Latin hypercube sampling,” in Encyclopedia of Quantitative Risk Analysis and Assessment. Chichester, U.K.: Wiley, 2008.
[65] L. While, P. Hingston, L. Barone, and S. Huband, “A faster algorithm for calculating hypervolume,” IEEE Trans. Evol. Comput., vol. 10, no. 1, pp. 29–38, Feb. 2006.
[66] L. M. S. Russo and A. P. Francisco, “Quick hypervolume,” IEEE Trans. Evol. Comput., vol. 18, no. 4, pp. 481–502, Aug. 2014.
[67] Y. Tian, R. Cheng, X. Zhang, and Y. Jin, “PlatEMO: A MATLAB platform for evolutionary multi-objective optimization [educational forum],” IEEE Comput. Intell. Mag., vol. 12, no. 4, pp. 73–87, Nov. 2017.
[68] C. Yang, J. Ding, Y. Jin, C. Wang, and T. Chai, “Multitasking multiobjective evolutionary operational indices optimization of beneficiation processes,” IEEE Trans. Autom. Sci. Eng., vol. 16, no. 3, pp. 1046–1057, Jul. 2019, doi: 10.1109/TASE.2018.2865593.
[69] J. Park and I. W. Sandberg, “Universal approximation using radial-basis-function networks,” Neural Comput., vol. 3, no. 2, pp. 246–257, Jun. 1991.

Cuie Yang received the B.Sc. degree from Henan Polytechnic University, Jiaozuo, China, in 2014, and the M.Sc. degree from Northeastern University, Shenyang, China, in 2016, where she is currently pursuing the Ph.D. degree in control theory and control engineering with the State Key Laboratory of Synthetical Automation for Process Industry.
Her current research interests include multitasking evolutionary optimization, data-driven evolutionary optimization, and their applications.

Jinliang Ding (M’09–SM’14) received the Ph.D. degree in control theory and control engineering from Northeastern University, Shenyang, China, in 2012.
He is a Professor with the State Key Laboratory of Synthetical Automation for Process Industry, Northeastern University. He has authored or coauthored over 100 refereed journal papers and refereed papers at international conferences. He is also the inventor or co-inventor of 17 patents. His current research interests include modeling, plant-wide control and optimization for complex industrial systems, stochastic distribution control, and multiobjective evolutionary algorithms and their applications.
Dr. Ding was a recipient of the Young Scholars Science and Technology Award of China in 2016, the National Science Fund for Distinguished Young Scholars in 2015, the National Technological Invention Award in 2013, two First Prizes of the Science and Technology Award of the Ministry of Education in 2006 and 2012, and the IFAC Control Engineering Practice 2011–2013 Paper Prize.

Yaochu Jin (M’98–SM’02–F’16) received the B.Sc., M.Sc., and Ph.D. degrees from Zhejiang University, Hangzhou, China, in 1988, 1991, and 1996, respectively, and the Dr.-Ing. degree from Ruhr-University Bochum, Bochum, Germany, in 2001.
He is currently a Distinguished Chair Professor of Computational Intelligence with the Department of Computer Science, University of Surrey, Guildford, U.K., where he heads the Nature Inspired Computing and Engineering Group. He was a Finland Distinguished Professor and a Changjiang Distinguished Visiting Professor. He has (co)authored over 300 peer-reviewed journal and conference papers and holds eight patents on evolutionary optimization.
Dr. Jin was a recipient of the 2014 and 2016 IEEE Computational Intelligence Magazine Outstanding Paper Awards, the 2018 IEEE TRANSACTIONS ON EVOLUTIONARY COMPUTATION Outstanding Paper Award, and the Best Paper Award of the 2010 IEEE Symposium on Computational Intelligence in Bioinformatics and Computational Biology. He is the Editor-in-Chief of the IEEE TRANSACTIONS ON COGNITIVE AND DEVELOPMENTAL SYSTEMS and of Complex and Intelligent Systems. He is also an Associate Editor or an Editorial Board Member of the IEEE TRANSACTIONS ON EVOLUTIONARY COMPUTATION, the IEEE TRANSACTIONS ON CYBERNETICS, the IEEE TRANSACTIONS ON NANOBIOSCIENCE, Evolutionary Computation, BioSystems, Soft Computing, and Natural Computing. He is an IEEE Distinguished Lecturer for the period from 2017 to 2019.

Tianyou Chai (M’90–SM’97–F’08) received the Ph.D. degree in control theory and engineering from Northeastern University, Shenyang, China, in 1985.
He has been with the Research Center of Automation, Northeastern University, since 1985, where he became a Professor in 1988 and a Chair Professor in 2004. He is also the Founder and the Director of the Center of Automation, which became a National Engineering and Technology Research Center in 1997. He has made a number of important contributions in control technologies and applications. He has authored and coauthored two monographs, 84 peer-reviewed international journal papers, and around 219 international conference papers. He has been invited to deliver over 20 plenary speeches at international conferences of IFAC and the IEEE. His current research interests include adaptive control, intelligent decoupling control, integrated plant control and systems, and the development of control technologies with applications to various industrial processes.
Prof. Chai was a recipient of three prestigious awards of National Science and Technology Progress, the 2002 Technological Science Progress Award from the Ho Leung Ho Lee Foundation, the 2007 Industry Award for Excellence in Transitional Control Research from the IEEE Control Systems Society, and the 2010 Yang Jia-Chi Science and Technology Award from the Chinese Association of Automation. He is a member of the Chinese Academy of Engineering, an Academician of the International Eurasian Academy of Sciences, and an IFAC Fellow. He is a Distinguished Visiting Fellow of the Royal Academy of Engineering (U.K.) and an Invitation Fellow of the Japan Society for the Promotion of Science.