Matrix and Learning-Assisted Distributed Dual-Space Memetic Algorithm for Customized Distributed Blocking Flowshop Scheduling Problem
Authorized licensed use limited to: Hong Kong Polytechnic University. Downloaded on December 22,2024 at 08:24:51 UTC from IEEE Xplore. Restrictions apply.
© 2024 IEEE. Personal use is permitted, but republication/redistribution requires IEEE permission. See https://www.ieee.org/publications/rights/index.html for more information.
This article has been accepted for publication in IEEE Transactions on Evolutionary Computation. This is the author's version which has not been fully edited and
content may change prior to final publication. Citation information: DOI 10.1109/TEVC.2024.3519774
plan. If customers have additional personalized needs, these products can be further fine-processed by differentiation technology. Examples of such fine-processing technologies can be seen in many industries, such as furniture manufacturing [6]. Combining the DBFSP with assembly and differentiation fine-manufacturing stages leads to a novel and more practical scheduling problem, denoted as DBFSP-AD in this paper. This scheduling model exists in many scenarios, such as recycling plants of green supply chain industries [6] and manufacturing plants of PC and mobile devices [7, 8].

In this paper, we address the DBFSP-AD with the objective of minimizing makespan. The DBFSP has been proven to be NP-hard [9]. Since the DBFSP-AD is a further extension of the DBFSP, it can be concluded that the DBFSP-AD is also NP-hard. For this hard combinatorial optimization problem, it is critical to build powerful solving methods. Evolutionary algorithms (EAs) have been successfully used to solve different DFSPs and DBFSPs. Thus, an EA is also selected to address the DBFSP-AD to obtain satisfactory scheduling solutions in a reasonable computing time. In this paper, we propose a novel evolutionary framework called the distributed dual-space memetic framework, which can serve as a unified evolutionary model to build EAs for different optimization problems and consists of three modules: exploration in continuous space, exploitation in discrete space, and knowledge migration across continuous and discrete spaces. Based on this evolutionary framework and a problem-specific design of each module, we build an efficient EA for the DBFSP-AD, which is called the matrix and learning assisted distributed dual-space memetic algorithm (DDMA).

The proposed framework and algorithm are devised based on the following three considerations. First, distributed EA (DEA) [10] and memetic algorithm (MA) [11] are two well-known evolutionary frameworks for constructing specific EAs. DEA shows superior global exploration ability by evolving multiple populations in a distributed way, while MA has powerful local exploitation ability by introducing local refinement into an EA sequentially. This inspires us to establish a unified framework that fuses DEA's distributed coevolution and MA's memetic evolution to possess their respective evolutionary advantages. Second, a continuous EA cannot directly solve a scheduling problem with a combinatorial solution space. To handle this issue, a common method is to discretize the continuous EA to establish a discrete EA. Because of their different evolutionary mechanisms, the problem-solving capabilities of continuous and discrete EAs are different and time-varying. This makes continuous-discrete space coevolution within the same algorithm desirable. Third, a heavy computational burden [12] and insufficient use of historical data [13] are two main factors that cause an EA to perform poorly. For these two issues, using matrix computation and learning strategies to assist the construction and evolution of an EA is a promising choice. For the DDMA, the specific innovations are summarized as follows.

1) The exploratory population is represented as a real-valued matrix, where individuals are assigned different identities that are dynamically adjusted during the evolutionary process. Based on identity differences, the exploratory population is heterogeneously evolved in the continuous search space by a matrix-assisted continuous evolutionary optimizer.

2) Individuals of the exploitative population are represented as discrete permutations. The entire population is configured in parallel with the exploratory population and is evolved in the discrete search space by an evolutionary optimizer that includes a reinforcement learning-based multi-neighborhood local search and a statistical learning-based enhanced local search.

3) To communicate superior evolutionary information between exploration and exploitation, an adaptive knowledge migration across continuous and discrete search spaces is proposed based on the impact of the individual migrations on the dispersion of the exploitative population.

4) The DDMA is constructed by fusing these special designs into our proposed distributed dual-space memetic framework. Superior performance of DDMA is guaranteed because of the advanced features of continuous-discrete space coevolution, distributed evolution, memetic evolution, and joint assistance of matrix computation and learning strategies to the algorithm.

The rest of this paper is organized as follows. In Section II, a literature review is provided. In Section III, the DBFSP-AD is detailed. Section IV introduces the evolutionary framework and details of DDMA. Section V shows computational experiments and comparisons. Finally, conclusions and directions for future research are presented in Section VI.

II. RELATED WORK

A. DFSP, DBFSP and Fine-Manufacturing Scheduling

The DFSP is one of the most studied distributed production scheduling problems. In the pioneering work on the DFSP, Naderi and Ruiz [14] proposed six mathematical models and multiple constructive heuristics and metaheuristics. On this basis, many scholars have continued to develop new approaches, mainly metaheuristics such as iterated greedy (IG) [15] and the artificial bee colony algorithm [16]. To model and schedule more practical production scenarios, many scholars have explored the DFSP with various constraints [17]. For reasons such as insufficient funds and special processing techniques, blocking constraints are widely present in the manufacturing industry. Currently, the blocking DFSP, i.e., DBFSP, has become one of the most widely studied topics in the distributed production scheduling field. Many algorithms have been proposed for this problem. Ying and Li [18] presented three hybrid algorithms by integrating IG with different Tabu list operators. Since then, many variants of IG have been developed, such as those of Han et al. [19], Miyata and Nagano [20], and Zhao et al. [21]. Apart from the IG algorithm, Duan et al. [22] proposed a probabilistic memetic framework based on the estimation of distribution algorithm. Zhang et al. [9] built a discrete differential evolution algorithm, which devised discrete mutation operators, a biased selection operator, and an elite retention strategy. Zhao et al. [23] also built a discrete differential evolution by fusing the problem knowledge and discretizing a differential mutation operator. More work on the DFSP and DBFSP can be found in the literature review [17].

Presently, in response to the challenges associated with customization, fine-manufacturing has become popular among modern enterprises. The DFSP with fine-processing technology
has received widespread attention from scholars. Hatami et al. [24] handled the DFSP with an assembly fine-manufacturing stage for the first time. Since then, a considerable number of EA-based methods have emerged for this problem. Wang and Wang [25] presented a new EA based on the estimation of distribution algorithm. Song et al. [26] proposed a hyperheuristic-based EA. Ying and Lin [27] proposed an IG algorithm based on reinforcement learning. More relevant work can be found in the literature [2, 17]. To model customized manufacturing scenarios, Zhang et al. [28] tackled the DFSP with a flexible assembly fine-manufacturing stage and designed a constructive heuristic and a hybrid particle swarm optimization algorithm. More recently, the DFSP with both fine-manufacturing stages of assembly and differentiation has also been investigated [7]. On this basis, Zhang et al. [29] further considered the integrated scheduling of distributed manufacturing and fine manufacturing to investigate more real-world production scenarios. To the best of our knowledge, the DFSP with blocking constraints and the assembly and differentiation fine-manufacturing stages, i.e., the studied DBFSP-AD, has not been reported in the literature.

B. DEA and MA

Unlike a traditional sequential EA, DEA configures multiple populations in a distributed way. Each population is evolved by an operator, independently of the others. Among populations, communication is carried out by a migration mechanism to achieve the coevolution of populations. This kind of distributed framework can effectively enhance global exploration and has been widely applied in studies across various domains. In terms of continuous optimization, Zhan et al. [30] proposed a DEA by incorporating differential evolution and an adaptive mechanism into a multi-population evolutionary model. Herrera and Lozano [31] built a gradual distributed real-coded genetic algorithm based on an island distributed model. In this model, populations are adequately connected by a hypercube topology and communicate in a gradual manner. Alba and Dorronsoro [32] proposed a DEA based on a genetic algorithm and a cellular distributed model. This model has only one population but places the individuals on a grid. Communication is achieved by a network topology, in which each individual can only compete with its neighbors. In recent years, research on DEA has mainly focused on applying it to solve many complex problems, such as database fragmentation [33], supply chain configuration [34], big data optimization [35], power electronic circuit design [36], and production scheduling [37]. More relevant theoretical and applied research can be found in the literature [10].

The MA is another widely applied evolutionary framework that sequentially combines population-based global EA and problem-based local refinement to achieve a balance between exploration and exploitation, resulting in superior optimization performance [38]. Several investigations have been conducted to explore this promising evolutionary computation domain [11, 38]. Like DEA, MA has also achieved great success in dealing with complex problems, especially combinatorial optimization problems. Wang et al. [39] devised an estimation of distribution algorithm-based MA for the QoS-aware automated semantic web service composition, where different types of local search strategies are used to improve the exploitation. Du et al. [40] proposed a high-performance MA for the inter-satellite link scheduling problem in the BeiDou navigation satellite system. To refine search efficiency, a data-driven heuristic is devised to assist the algorithm. Recently, adopting MA to tackle various production scheduling problems has become a popular topic. Accordingly, many efficient MA variants have been proposed, such as dual-space coevolutionary MA [41], network MA [42], and cooperative MA [43]. In [44], Ishibuchi et al. showed that the performance of an EA can be significantly improved by hybridization with problem-specific local searches within the MA framework when solving complex combinatorial optimization problems. Due to its excellent ability to balance exploration and exploitation, it can be foreseen that MA will be applied across a wider range of scientific fields.

C. Summary

A review of the relevant scheduling problems and algorithms reveals that the subproblems of the DBFSP-AD, i.e., the DFSP, DBFSP, and fine-manufacturing scheduling, have been widely surveyed. However, the DBFSP-AD has not been reported although it is more practical and has many applications in modern industries. For the DBFSP-AD, developing an efficient solving method is critical. We can also see that many studies have verified the respective advantages of DEA and MA in solving different complex optimization problems. These two evolutionary frameworks are also highly complementary [45]. This motivates us to build an evolutionary model by integrating them to make full use of their advantages. To this end, a distributed dual-space memetic algorithm with advanced evolutionary features of heterogeneity, distributed coevolution, and continuous-discrete space coevolution is proposed for the DBFSP-AD. To ensure the algorithm has a low computational cost and is driven as much as possible by the historical data, matrix computation and learning technologies are further used to assist the construction and evolution of the algorithm. The studied scheduling problem and the proposed algorithm in this paper will significantly benefit the production scheduling and evolutionary computing communities.

III. DESCRIPTION OF DBFSP-AD

The DBFSP-AD is an integration of the traditional DFSP, blocking constraints, and two additional fine-processing stages of assembly and differentiation. In this problem, manufacturing a set of $n$ jobs $J = \{J_j \mid j = 1, 2, \ldots, n\}$ into a set of $v$ customized products $P = \{P_l \mid l = 1, 2, \ldots, v\}$ goes through a three-stage manufacturing system. At the first stage, a set of $f$ identical factories $F = \{F_k \mid k = 1, 2, \ldots, f\}$ is responsible for fabricating the jobs. Each factory is a permutation flowshop formed by a set of $m$ fabrication machines $M_1 = \{M_{1,i} \mid i = 1, 2, \ldots, m\}$. Between two adjacent machines $M_{1,i}$ and $M_{1,i+1}$ in any factory, there are no intermediate buffers in which jobs can wait for subsequent operations. That is, blocking constraints are assumed [9]. This means that a job completed on $M_{1,i}$ has to be blocked on $M_{1,i}$ until $M_{1,i+1}$ becomes idle. For other constraints, we can refer to the permutation flowshop [46] and the DFSP [3]. This stage needs to assign the jobs to factories and determine their fabricating order in each factory. At the second stage, there is
of each factory, which adopts a similar symbol as the departure time for the convenience of recursive calculation. Then, $C_{1,j,0}$ and $C_{1,j,i}$ can be calculated by,

$$C_{1,\pi_k(j),0} = \begin{cases} 0, & j = 1 \\ C_{1,\pi_k(j-1),1}, & j = 2, 3, \ldots, n_k \end{cases} \tag{1}$$

Finally, $C_{\max}(\gamma)$ is defined as the maximal completion time of all products at the third stage, which can be calculated by Eq. (5). The DBFSP-AD is to find a scheduling scheme $\gamma^*$ from the entire solution space $\Gamma$ such that Eq. (6) is satisfied.

$$C_{\max}(\gamma) = \max_{r = 1, 2, \ldots, g} C_{3,\xi_r(t_r),r} \tag{5}$$
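To make the role of Eq. (1) concrete, the following is a minimal Python sketch of a first-stage calculation for one blocking flowshop. Only the first-machine rule comes from Eq. (1); the rest of the recursion is the standard departure-time recursion for blocking flowshops, assumed here for illustration and not necessarily identical in notation to the authors' formulation. The function names and the processing-time table `p` are hypothetical.

```python
from typing import Sequence


def blocking_makespan(p: Sequence[Sequence[int]], seq: Sequence[int]) -> int:
    """Departure time of the last job of `seq` from the last machine.

    p[j][i] is an assumed processing time of job j on machine i (0-based).
    prev[0] is a job's start on the first machine; prev[i] (i >= 1) is its
    departure time from machine i.
    """
    m = len(p[0])
    prev = [0] * (m + 1)
    for job in seq:
        cur = [0] * (m + 1)
        # Eq. (1): a job starts on the first machine when its predecessor
        # departs from it (0 for the first job in the sequence).
        cur[0] = prev[1]
        for i in range(1, m):
            # Blocking: the job leaves machine i only after it is finished
            # AND the predecessor has left machine i + 1.
            cur[i] = max(cur[i - 1] + p[job][i - 1], prev[i + 1])
        # The last machine is never blocked.
        cur[m] = cur[m - 1] + p[job][m - 1]
        prev = cur
    return prev[m]


def first_stage_makespan(p: Sequence[Sequence[int]],
                         pi: Sequence[Sequence[int]]) -> int:
    """Largest fabrication completion time over all factory sequences."""
    return max(blocking_makespan(p, seq) for seq in pi)


if __name__ == "__main__":
    times = [[3, 2], [1, 4]]  # two jobs, two machines
    print(blocking_makespan(times, [0, 1]))  # 9
    print(blocking_makespan(times, [1, 0]))  # 7
```

In the first call, job 1 finishes on the first machine at time 4 but stays blocked there until job 0 leaves the second machine at time 5. The sketch covers the fabrication stage only; the assembly and differentiation stages that determine $C_{\max}(\gamma)$ in Eq. (5) are not modeled.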
be calculated.

To initialize the entire population $CP_0$ simultaneously based on matrix computation, two special matrices $Ones$ and $Rand$ are defined. $Ones$ is a matrix whose elements are all 1, and $Rand$ is a matrix whose elements are uniformly and randomly generated from the range [0, 1]. Based on these two matrices, population $CP_0$ as a whole is initialized randomly by Eq. (8), ensuring that each $x_{ij} \in [1, f + 1)$.

$$CP_0 = Ones_{N \times n} + f * Rand_{N \times n} = \begin{bmatrix} 1 + f r_{11} & \cdots & 1 + f r_{1n} \\ \vdots & \ddots & \vdots \\ 1 + f r_{N1} & \cdots & 1 + f r_{Nn} \end{bmatrix} \tag{8}$$

where $Rand_{N \times n} = (r_{ij})_{N \times n}$, $r_{ij}$ is a random number in [0, 1], and operation "$*$" is the scalar multiplication of a number and a matrix. Unlike traditional EAs in combinatorial optimization, which represent population individuals as discrete permutations and initialize them one by one, herein we represent the entire exploratory population as a real-valued matrix. The parallel computation of matrices can greatly accelerate the evolutionary operators.

2) Identity definition and adjustment. For population $CP$, we define a specific identity for each individual, which represents the quality of this individual, i.e., the performance of the related solution. At the beginning of evolution, individuals in $CP$ are ranked in ascending order of the makespans in $Fit$. Define the top 20% of individuals as superior individuals, the next 30% (ranked between 20% and 50%) as ordinary individuals, and the remaining 50% as inferior individuals. They are extracted from $CP$ into three matrices or subpopulations $Sup$, $Ord$, and $Inf$, respectively. Therefore, the initial ratios of $Sup$, $Ord$, and $Inf$ in $CP$ are 20%, 30%, and 50%, respectively. We address these concerns as in [30]. The ratio of $Sup$ is expected to be small to reduce the probability of premature convergence, so it is fixed at 20% during the entire evolutionary process. In addition, population diversity should receive more attention in the early evolutionary stage, while fast convergence is more urgent in the later stage of evolution. To this end, the ratio of $Inf$ gradually decreases with evolution, while the ratio of $Ord$ increases accordingly. The ratio of these three identities is modeled via Eq. (9) and shown in Fig. 2 (a).

$$Ratio = \begin{cases} 0.2, & \text{for } Sup \in CP \\ 0.5 - 0.003 \times 10^{2T_{cur}/T_{max}}, & \text{for } Inf \in CP \\ 0.8 - Ratio_{Inf \in CP}, & \text{for } Ord \in CP \end{cases} \tag{9}$$

where $T_{cur}$ and $T_{max}$ are the current runtime and the maximum allowed runtime, respectively.

3) Matrix-assisted heterogeneous evolution. At each generation, $CP$ is evolved heterogeneously based on identity differences in the continuous search space. We propose a novel matrix-assisted continuous space exploration optimizer to evolve $Sup$, $Ord$, and $Inf$. Fig. 2 (b) shows the evolutionary process. $Sup$ is evolved as a whole by an operator called matrix crossover. If none of the superior individuals in $Sup$ is improved, $Sup$ is further evolved by an operator called matrix mutation. Otherwise, matrix mutation is not performed. $Ord$ is evolved by a matrix crossover operator similar to that of $Sup$. The difference is that the crossover operator is executed between $Ord$ and $Sup$. This way, superior individuals can guide the evolution of ordinary individuals. Besides, matrix mutation is not used in the evolution of $Ord$. $Inf$ is directly reinitialized by a matrix restart operator to introduce evolutionary diversity and save computational cost. This restart operator is performed by the matrix operation in Eq. (8). After $Sup$, $Ord$, and $Inf$ are evolved, they are recombined into $CP$ to prepare for the next round of evolution. Next, we introduce the adopted matrix crossover and mutation operators.

4) Matrix crossover. Denote the size of the superior matrix $Sup$ as $N_S \times n$, i.e., $N_S = \lfloor 0.2 \times N \rfloor$. Let $Off$ be the offspring matrix of $Sup$ after matrix crossover. Initialize $Off$ as an $N_S \times n$ empty matrix. The matrix crossover is performed on $Sup$ by the following four steps.

Step 1: Generate a matrix $\delta_{(N_S/2) \times 1}$ randomly, each element of which is an integer from the range $[1, n]$ and represents a crossover position. Generate a matrix $\varphi_{1 \times n}$, which is a random permutation of all jobs.

Step 2: Calculate the matmul products $\delta_{(N_S/2) \times 1} \times Ones_{1 \times n}$ and $Ones_{(N_S/2) \times 1} \times \varphi_{1 \times n}$ to obtain matrices $\delta_{(N_S/2) \times n}$ with all the columns the same and $\varphi_{(N_S/2) \times n}$ with all the rows the same.

Step 3: Perform the logical operation $\delta_{(N_S/2) \times n} \le \varphi_{(N_S/2) \times n}$ to create the mask matrix $\Delta_{(N_S/2) \times n}$, where $\Delta_{ij} = 1$ if $\delta_{ij} \le \varphi_{ij}$, and $\Delta_{ij} = 0$ otherwise.

Step 4: Let $a = \{1, \cdots, N_S/2\}$ and $b = \{N_S/2 + 1, \cdots, N_S\}$. Denote $Sup_{a \times n}$ and $Sup_{b \times n}$ as the matrices composed of the first $N_S/2$ rows and the second $N_S/2$ rows of $Sup$, respectively. $Off_{a \times n}$ and $Off_{b \times n}$ have the same meaning and are generated by Eqs. (10) and (11), respectively. This way, $Sup$ is evolved as $Off$ by matrix crossover.

$$Off_{a \times n} = Sup_{a \times n} \circ \Delta_{(N_S/2) \times n} + Sup_{b \times n} \circ \left(Ones_{(N_S/2) \times n} - \Delta_{(N_S/2) \times n}\right) \tag{10}$$

$$Off_{b \times n} = Sup_{a \times n} \circ \left(Ones_{(N_S/2) \times n} - \Delta_{(N_S/2) \times n}\right) + Sup_{b \times n} \circ \Delta_{(N_S/2) \times n} \tag{11}$$

where operation "$\circ$" is the Hadamard product. For example, if three matrices satisfy $C = A \circ B$, then $C_{ij} = A_{ij} \times B_{ij}$.

For the ordinary matrix $Ord$ with size $N_O \times n$, where $N_O = \lfloor 0.3 \times N \rfloor$, matrix crossover is carried out between $Sup$ and $Ord$. Since the dimension (row number) of $Sup$ is smaller than that of $Ord$, we first expand the dimension of $Sup$ to that of $Ord$. This is achieved by randomly selecting $N_O - N_S$ rows of $Sup$ and inserting them into $Sup$. Then, we perform the previous Steps 3 and 4 to obtain the offspring by having $Sup$ and $Ord$ play the roles of $\delta$ and $\varphi$, respectively.

5) Matrix mutation. Herein, we devise two mutation operators called matrix probability mutation and matrix swap mutation. At each generation, one of them is randomly selected to evolve $Sup$.

For the matrix probability mutation operator, mutation is achieved by reinitializing the values of a few elements in each row of $Sup$ according to a probability.

Step 1: Generate a matrix $Rand_{N_S \times n}$, denoted as $R_{N_S \times n}$. Denote $P_m \in [0, 1]$ as the mutation probability of each $Sup$ element. That is, $Sup_{ij}$ will mutate if $R_{ij} < P_m$; otherwise, it is unchanged.

Step 2: Extend $P_m$ to matrix $PM_{N_S \times n}$ using $PM_{N_S \times n} = P_m * Ones_{N_S \times n}$. Make the logical operation $R_{N_S \times n} \le PM_{N_S \times n}$ to create
a mask matrix $\nabla_{N_S \times n}$ whose elements 1 and 0 are used to decide the positions of those mutated and unchanged elements of $Sup$, respectively.

Step 3: Generate a matrix $R\_Sup_{N_S \times n} = Ones_{N_S \times n} + f * Rand_{N_S \times n}$ by Eq. (8) and perform Eq. (12) to mutate $Sup$ to offspring $Off$.

$$Off_{N_S \times n} = Sup_{N_S \times n} \circ \left(Ones_{N_S \times n} - \nabla_{N_S \times n}\right) + R\_Sup_{N_S \times n} \circ \nabla_{N_S \times n} \tag{12}$$

where the first and second Hadamard products control $Off$ to inherit those unchanged elements from $Sup$ and those mutated elements from $R\_Sup$, respectively.

Unlike the matrix probability mutation operator, the matrix swap mutation operator randomly swaps two different elements in each row of $Sup$ to achieve mutation. Thus, it does not introduce new values to $Sup$. The procedure is as follows.

Step 1: Generate matrices $\delta_{N_S \times 1}$ and $\varphi_{N_S \times 1}$, where $\delta_{i1}$ and $\varphi_{i1}$ are random integers in $[1, n]$, representing the positions of each pair of swap elements. Extend them as $N_S \times n$ matrices by $\delta_{N_S \times n} = \delta_{N_S \times 1} \times Ones_{1 \times n}$ and $\varphi_{N_S \times n} = \varphi_{N_S \times 1} \times Ones_{1 \times n}$. All elements in each row of these matrices are clearly the same.

Step 2: Generate an ascending permutation of the integers from 1 to $n$, $Per_{1 \times n} = [1, 2, \cdots, n]$. Extend it as an $N_S \times n$ matrix with all the rows the same by $Per_{N_S \times n} = Ones_{N_S \times 1} \times Per_{1 \times n}$.

Step 3: Make the logical operation $\Delta_{N_S \times n} = (\delta_{N_S \times n} == Per_{N_S \times n})$, where $\Delta_{ij} = 1$ if $\delta_{ij} = Per_{ij}$, and $\Delta_{ij} = 0$ otherwise. Similarly, $\nabla_{N_S \times n} = (\varphi_{N_S \times n} == Per_{N_S \times n})$.

Step 4: Perform the Hadamard products $Sup_{N_S \times n} \circ \Delta_{N_S \times n}$ and $Sup_{N_S \times n} \circ \nabla_{N_S \times n}$ to generate two matrices $S1_{N_S \times n}$ and $S2_{N_S \times n}$, where the swapped elements remain and the other elements are zero.

Step 5: Use the matmul product $S1_{N_S \times n} \times Ones_{n \times n}$ to create matrix $S3_{N_S \times n}$, in which all the elements of each row are the same and equal to the swapped element. Similarly, $S4_{N_S \times n} = S2_{N_S \times n} \times Ones_{n \times n}$.

Step 6: Perform Eq. (13) to generate matrix $S_{N_S \times n}$, where the swap operations are achieved and the non-swapped elements become zero.

$$S_{N_S \times n} = S3_{N_S \times n} \circ \nabla_{N_S \times n} + S4_{N_S \times n} \circ \Delta_{N_S \times n} \tag{13}$$

Step 7: Replace all zero elements of $S_{N_S \times n}$ by those at the same positions of $Sup$ to generate offspring $Off$. This is done by Eq. (14).

$$Off_{N_S \times n} = Sup_{N_S \times n} \circ \left(S_{N_S \times n} < Ones_{N_S \times n}\right) + S_{N_S \times n} \tag{14}$$

C. Learning-Assisted Discrete Space Exploitation

As shown in Fig. 1, discrete space exploitation is performed in parallel with continuous space exploration, aiming to exploit the promising areas found during the evolutionary process. This exploitation optimizer has two characteristics to achieve high performance. 1) Individuals adopt a discrete representation and evolve directly in the discrete search space, which, together with continuous space exploration, generates a continuous-discrete dual-space coevolution. 2) Historical data are adaptively mined based on learning strategies, including reinforcement learning and statistical learning, which makes the exploitative campaign more efficient.

1) Discrete representation and heuristic initialization. The size of the exploitative population $DP$ is set at 10% of the $CP$ size, that is, $\lfloor 0.1 \times N \rfloor$. Each individual is represented as a set of multiple permutations, i.e., $\pi = \{\pi_k \mid k = 1, 2, \ldots, f\}$, where each permutation $\pi_k$ is a discrete job sequence, indicating the processing order of the jobs assigned to factory $F_k$. Clearly, this representation is incomplete because it can only represent the fabrication schedule of jobs at multiple factories and misses the assembly and differentiation schedules of the products. The complete schedule $\gamma = \{\pi, \lambda, \xi\}$ can be obtained by using the procedure of Section III. The individuals of $DP$ are expected to be as high-quality as possible. Thus, we heuristically initialize $DP_0$ based on the method in [29].

At each generation, $DP$ is directly evolved in the discrete search space by using a learning-assisted discrete space exploitation optimizer. This optimizer includes two stages: a reinforcement learning based multi-neighborhood local search (RL-MLS) at the first stage and a statistical learning based enhanced local search (SL-ELS) at the second stage.

2) RL-MLS. RL-MLS does not evolve the entire $DP$ but only evolves one individual of $DP$ to reduce the computational cost. This individual is selected based on a probability of 0.5. Specifically, if a random number $rand$ between 0 and 1 is less than 0.5, then RL-MLS is performed on a randomly selected individual from $DP$. Otherwise, it is executed on the best individual of $DP$. Denote the selected individual as $\pi = \{\pi_k \mid k = 1, 2, \ldots, f\}$, and let $\pi_a$ and $\pi_b$ be randomly selected from $\pi$. Because there is no unified operator that suits all exploitative situations, we propose a set of four neighborhood operators in RL-MLS. They are denoted as $Ope = \{Ope_j \mid j = 1, 2, 3, 4\}$ and defined as follows.

$Ope_1$: Insert a job of $\pi_a$ to another position of this sequence.
$Ope_2$: Swap the positions of two different jobs in $\pi_a$.
$Ope_3$: Insert a job of $\pi_a$ to a position of $\pi_b$.
$Ope_4$: Swap the positions of two jobs obtained from $\pi_a$ and $\pi_b$, respectively.

The procedure of RL-MLS is simple. It repeatedly recalls these four neighborhood operators to greedily improve $\pi$. This recall process is iterated $LS$ times. In each iteration, fully leveraging the historical evolutionary data to recall the most suitable operator is a key issue, which greatly affects the performance of RL-MLS. Given that reinforcement learning (RL)-based EAs have emerged to solve complex optimization problems [47, 48], we propose a data-driven recall mechanism based on an RL method [49] called enhanced Q-learning (EQL) in this paper. Unlike traditional Q-learning, which establishes the learning model by gradually learning the knowledge in the early evolutionary stage, EQL pretrains a good initial learning model before application. This enables it to provide more accurate and reasonable decisions immediately at the beginning of evolution.

Next, we detail the proposed EQL-based recall mechanism, including the definitions of EQL components, the procedure of the recall mechanism, and the pretraining of the EQL learning model.

(1) Definition of EQL components. The components of EQL need to be defined based on the problem under consideration. The agent in EQL is the evolved individual $\pi$ of RL-MLS. Let $\pi^{t+1}$ and $\pi^t$ represent the agents at the $(t + 1)$-th and $t$-th generations. Let $\Delta C_{\max}(\pi)$ represent the makespan increment
between $\pi^{t+1}$ and $\pi^{t}$. That is, $\Delta C_{max}(\pi) = C_{max}(\pi^{t+1}) - C_{max}(\pi^{t})$, which indicates whether the agent has improved. Based on this makespan increment, we define a set of three states $Sta = \{s_i \mid i = 1, 2, 3\}$, in which states $s_1$, $s_2$ and $s_3$ correspond to $\Delta C_{max}(\pi) < 0$, i.e., positive improvement; $\Delta C_{max}(\pi) = 0$, i.e., zero improvement; and $\Delta C_{max}(\pi) > 0$, i.e., negative improvement. The four neighborhood operators in $Ope = \{Ope_j \mid j = 1, 2, 3, 4\}$ are defined as the alternative actions $Act = \{a_j \mid j = 1, 2, 3, 4\}$, where each $a_j = Ope_j$.
The reward rew after applying an action to a state is set as the negative value of the makespan increment, i.e., $rew = -\Delta C_{max}(\pi)$. In this way, states $s_1$, $s_2$ and $s_3$ are given positive, zero and negative reward values, respectively.
According to the definition of states and actions, the adopted Q-table includes three rows and four columns, denoted as $Q = [Q(s_i, a_j)]_{3 \times 4}$, as follows.

$$
Q = \begin{bmatrix}
Q(s_1, a_1) & Q(s_1, a_2) & Q(s_1, a_3) & Q(s_1, a_4) \\
Q(s_2, a_1) & Q(s_2, a_2) & Q(s_2, a_3) & Q(s_2, a_4) \\
Q(s_3, a_1) & Q(s_3, a_2) & Q(s_3, a_3) & Q(s_3, a_4)
\end{bmatrix} \tag{15}
$$

where $Q(s_i, a_j)$ represents the Q-value of the state-action pair $(s_i, a_j)$, which reflects the evolutionary capacity of applying action $a_j$ at state $s_i$.

Algorithm 1 Pretraining of EQL learning model
Input: Act, $\alpha$, $\beta$
Output: Q
$\pi$, $C_{max}(\pi)$ ← Initialize agent by the heuristic in Section C-1)
Q ← Initialize Q-table with all values set to 0
$ES_1$, $ES_2$ ← Generate two empty sets
// Action independent training stage
for each action $a_j$ of Act in parallel do
    t ← 1, $s^{t}$ ← $s_1$, $\pi^{t}$ ← $\pi$, $C_{max}(\pi^{t})$ ← $C_{max}(\pi)$
    while t < $10^3$ do
        $\pi^{t+1}$ ← Perform $a_j$ on $\pi^{t}$
        $C_{max}(\pi^{t+1})$ ← Evaluate $\pi^{t+1}$
        $r^{t}$, $s^{t+1}$ ← Determine reward and next state
        $Q(s^{t}, a_j)$ ← $(1 - \alpha) Q(s^{t}, a_j) + \alpha (r^{t} + \beta Q(s^{t+1}, a_j))$
        t ← t + 1
    end while
    $ES_1$ ← $ES_1 \cup \{s^{t}\}$, $ES_2$ ← $ES_2 \cup \{\pi^{t}\}$
end for
// Action collaborative training stage
$s^{t}$ ← random($ES_1$), $\pi^{t}$ ← random($ES_2$)
while t < $1.5 \times 10^3$ do
    $a^{t}$ ← Select an action by Eq. (16)
    $\pi^{t+1}$ ← Perform $a^{t}$ on $\pi^{t}$
    $C_{max}(\pi^{t+1})$ ← Evaluate $\pi^{t+1}$
    $r^{t}$, $s^{t+1}$ ← Determine reward and next state
    $Q(s^{t}, a^{t})$ ← Update $Q(s^{t}, a^{t})$ by Eq. (17)
    t ← t + 1
end while
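The state and reward definitions, together with the selection and update rules of Eqs. (16) and (17) and the two training stages of Algorithm 1, can be illustrated with a minimal Python sketch. It is not the authors' implementation: the value of `EPSILON`, the four operators (`swap`, `insert`, `reverse`, `adjacent_swap`) and the objective `toy_cost` are hypothetical stand-ins for the paper's neighborhood operators and makespan evaluation, and the "in parallel" loop is run sequentially.

```python
import random

ALPHA, BETA = 0.5, 0.7   # learning rate and discount factor stated in the text
EPSILON = 0.9            # ASSUMED greedy-selection probability, not given in this section

def state_and_reward(c_old, c_new):
    """Map the makespan increment to (state index, reward); 0, 1, 2 stand for s1, s2, s3."""
    delta = c_new - c_old
    state = 0 if delta < 0 else (1 if delta == 0 else 2)
    return state, -delta

def select_action(q, s, rng):
    """Eq. (16): greedy action with probability EPSILON, otherwise a random action."""
    if rng.random() < EPSILON:
        return max(range(len(q[s])), key=lambda j: q[s][j])
    return rng.randrange(len(q[s]))

def pretrain(actions, evaluate, agent, rng, t_ind=1000, t_end=1500):
    """Two-stage pretraining in the spirit of Algorithm 1; returns a 3 x len(actions) Q-table."""
    q = [[0.0] * len(actions) for _ in range(3)]
    end_states, end_agents = [], []
    # Action independent stage: column j is trained with action j only.
    for j, act in enumerate(actions):
        s, pi, c = 0, list(agent), evaluate(agent)
        for _ in range(1, t_ind):
            pi_next = act(pi, rng)
            c_next = evaluate(pi_next)
            s_next, r = state_and_reward(c, c_next)
            q[s][j] = (1 - ALPHA) * q[s][j] + ALPHA * (r + BETA * q[s_next][j])
            s, pi, c = s_next, pi_next, c_next
        end_states.append(s)
        end_agents.append(pi)
    # Action collaborative stage: select by Eq. (16), update by Eq. (17).
    s, pi = rng.choice(end_states), rng.choice(end_agents)
    c = evaluate(pi)
    for _ in range(t_ind, t_end):
        a = select_action(q, s, rng)
        pi_next = actions[a](pi, rng)
        c_next = evaluate(pi_next)
        s_next, r = state_and_reward(c, c_next)
        q[s][a] = (1 - ALPHA) * q[s][a] + ALPHA * (r + BETA * max(q[s_next]))
        s, pi, c = s_next, pi_next, c_next
    return q

# Hypothetical operators and objective, only to make the sketch runnable.
def swap(p, rng):
    p = list(p)
    i, j = rng.randrange(len(p)), rng.randrange(len(p))
    p[i], p[j] = p[j], p[i]
    return p

def insert(p, rng):
    p = list(p)
    job = p.pop(rng.randrange(len(p)))
    p.insert(rng.randrange(len(p) + 1), job)
    return p

def reverse(p, rng):
    i, j = sorted((rng.randrange(len(p)), rng.randrange(len(p))))
    return p[:i] + p[i:j + 1][::-1] + p[j + 1:]

def adjacent_swap(p, rng):
    p = list(p)
    i = rng.randrange(len(p) - 1)
    p[i], p[i + 1] = p[i + 1], p[i]
    return p

def toy_cost(p):
    """Toy objective standing in for the makespan of a schedule."""
    return sum(abs(job - pos) for pos, job in enumerate(p))

q_table = pretrain([swap, insert, reverse, adjacent_swap], toy_cost,
                   list(range(7, -1, -1)), random.Random(0))
```

As in Algorithm 1, the sketch always moves to the newly generated agent; the independent stage bootstraps from the same column of the Q-table, whereas the collaborative stage bootstraps from the row maximum.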
(2) Procedure of the recall mechanism. Denote the Q-table as Q at generation t. Note that the initial Q-table will be pretrained before application. The proposed pretraining process is detailed in the next section. The procedure of the recall mechanism is as follows. For the current state $s^{t}$, the action (corresponding to a neighborhood operator) is selected based on Q and the $\varepsilon$-greedy strategy in Eq. (16). After an action $a^{t}$ is selected, the corresponding neighborhood operator evolves agent $\pi^{t}$ into $\pi^{t+1}$. Based on the makespan of $\pi^{t}$ and $\pi^{t+1}$, the reward $rew^{t}$ (written $r^{t}$ in Eq. (17)) and the next state $s^{t+1}$ can be determined. Further, the Q-value $Q(s^{t}, a^{t})$ in Q can be updated by Eq. (17). The new state and Q-table will be used to recall the neighborhood operator in the next round of iteration.

$$
a^{t} = \begin{cases}
\arg\max_{j = 1,2,3,4} Q(s^{t}, a_j), & \text{if } r < \varepsilon \\
\arg\operatorname{random}_{j = 1,2,3,4} Q(s^{t}, a_j), & \text{otherwise}
\end{cases} \tag{16}
$$

$$
Q(s^{t}, a^{t}) = (1 - \alpha)\, Q(s^{t}, a^{t}) + \alpha \left( r^{t} + \beta \max_{j = 1,2,3,4} Q(s^{t+1}, a_j) \right) \tag{17}
$$

where r is a random number in [0, 1], $\varepsilon$ is the greedy selection probability, and $\alpha$ and $\beta$ are the learning rate and discount factor, respectively. We set $\alpha$ = 0.5 and $\beta$ = 0.7 in the next experiments. The operator random represents randomly selecting a value.
(3) Pretraining of the EQL learning model. The purpose of pretraining is to generate a superior initial Q-table so that the operator can be recalled more accurately from the very beginning of RL-MLS. It contains an action independent training stage and an action collaborative training stage. The former trains each action independently to maximize its own search potential, and in this case, the Q-table is updated column by column. The latter trains all actions collaboratively to enhance the overall search potential of the learning model, in which case the Q-table is updated as a whole. To avoid overfitting, the test instance used in this pretraining process is regenerated by the heuristic proposed in DP initialization. In addition, action collaborative training is carried out based on the results of the action independent training stage. Algorithm 1 shows the pretraining procedure of the EQL learning model.
3) SL-ELS. After RL-MLS, SL-ELS is used to further improve the best individual of DP, still denoted as $\pi = \{\pi_k \mid k = 1, 2, \ldots, f\}$, by job moves. To more accurately determine the factory of the moved jobs in $\pi$, a statistical learning method is proposed based on the analysis of the statistical distribution of jobs among all factories of all individuals in DP. The basic idea is that all individuals of DP are high-quality; hence, if a job appears at the same factory in most individuals of DP, then the probability of improving $\pi$ by moving such a job to other factories will be very small. That is, the jobs of a factory that do not have this characteristic should be moved as much as possible. The steps of SL-ELS are as follows.
Step 1: Let $\tau_{jkl} = 1$ if job $J_j$ appears at factory $F_k$ in the l-th individual of DP, otherwise $\tau_{jkl} = 0$. Calculate matrix $\phi = (\phi_{jk})_{n \times f}$, where $\phi_{jk} = \sum_{l=1}^{\lfloor 0.1 \times N \rfloor} \tau_{jkl}$ is the number of times that $J_j$ appears at factory $F_k$ over all individuals in DP.
Step 2: For each factory $F_k$, calculate the standard deviation of the k-th column in $\phi$. Rank the obtained standard deviations in descending order to generate the sequence $std_{[1]}, std_{[2]}, \ldots, std_{[f]}$, where $std_{[k]}$ corresponds to factory $F_{[k]}$.
Step 3: Following the order from $\pi_{[2]}$ to $\pi_{[f]}$, insert each job of $\pi_{[1]}$ into all possible positions of these sequences. Once $\pi$ is improved by an insertion, update $\pi$, skip the remaining insertions, and terminate SL-ELS.
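Steps 1 and 2 of SL-ELS (the job-factory count matrix and the ranking of factories by column standard deviation) can be sketched in Python as follows. This is only an illustration under stated assumptions: the function name `rank_factories` and the list-of-lists representation of an individual are hypothetical, and the population standard deviation (`statistics.pstdev`) is assumed because the text does not say whether the population or sample form is used. Step 3 is omitted since it needs the makespan evaluation.

```python
from statistics import pstdev

def rank_factories(dp, n_jobs, n_factories):
    """Return (phi, stds, order) for a population dp.

    dp is a list of individuals; each individual is a list of n_factories
    job sequences, with jobs numbered 1..n_jobs.
    """
    # Step 1: phi[j][k] counts how often job j+1 sits in factory k over DP.
    phi = [[0] * n_factories for _ in range(n_jobs)]
    for individual in dp:
        for k, sequence in enumerate(individual):
            for job in sequence:
                phi[job - 1][k] += 1
    # Step 2: standard deviation of each column, then factories in descending order.
    stds = [pstdev([phi[j][k] for j in range(n_jobs)]) for k in range(n_factories)]
    order = sorted(range(n_factories), key=lambda k: stds[k], reverse=True)
    return phi, stds, order

dp_example = [
    [[1, 2], [3], [4, 5]],
    [[2, 1], [3], [5, 4]],
    [[1, 2], [3, 4], [5]],
]
phi, stds, order = rank_factories(dp_example, n_jobs=5, n_factories=3)
```

In this small example, `order[0]` plays the role of $F_{[1]}$, the factory whose jobs are tried first in Step 3.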
the decoding process in Section IV-B-1), and vice versa through the following steps.
Step 1: Suppose there are $n_k$ jobs in $\pi_k$. Generate $n_k$ random numbers from the interval [k, k + 1) and rank them in ascending order to obtain vector $R_k$.
Step 2: Generate sequence πT = [$\pi_1, \pi_2, \ldots, \pi_f$] and vector R = [$R_1, R_2, \ldots, R_f$] by arranging all $\pi_k$ and $R_k$ in factory order, respectively.
Step 3: Sort the jobs in πT into the order from job $J_1$ to $J_n$ and apply the same reordering to R; the reordered R is output as $x_i = [x_{i1}, x_{i2}, \ldots, x_{in}]$.
Preliminary studies have shown that random or unreasonable individual migration accelerates the search into local optima due to rapid homogenization of the exploitative population. To handle this issue, the proposed dual-space knowledge migration is devised based on the impact of migration on the dispersity of the exploitative population.
Definition 1: Dispersity of population DP. Denote $C_{max}(\pi)$ as the makespan of individual $\pi$, and $\bar{C}_{max}$ as the average

$$
D_{DP} = \frac{\sum_{\pi \in DP} \left( C_{max}(\pi) - \bar{C}_{max} \right)}{0.1 N \, \bar{C}_{max}} \tag{18}
$$

Definition 2: Impact degree of migration on the dispersity of DP. Let $DP - \pi$ be the population after removing $\pi$ from DP, and let $DP + x$ be the population after adding x of CP into DP. After performing these two migrations, the impact on the dispersity of DP is defined by Eqs. (19) and (20).

$$
Impact_{DP-\pi} = D_{DP-\pi} - D_{DP} \tag{19}
$$

$$
Impact_{DP+x} = D_{DP+x} - D_{DP} \tag{20}
$$

Based on the above definitions, the knowledge migration between CP and DP is described as follows.
Step 1: Calculate $D_{DP}$, $D_{DP-\pi}$ for each $\pi \in DP$, and $D_{DP+x}$ for each $x \in CP$. Calculate sets $Impact_1 = \{Impact_{DP-\pi} \mid \pi \in DP\}$ and $Impact_2 = \{Impact_{DP+x} \mid x \in CP\}$.
Step 2: Extract $\pi$ from DP by Eq. (21) and extract x from CP by Eq. (22).
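The conversion of a discrete individual into a continuous one (Steps 1 to 3 of the encoding procedure in this part) can be sketched in Python as follows. The function names are hypothetical. `decode` is only an assumed inverse, used here to check the round trip: the actual decoding of Section IV-B-1) is not shown in this part, so the rule "integer part gives the factory, ascending key gives the order" is an assumption.

```python
import math
import random

def encode(pi, rng):
    """Turn per-factory job sequences (jobs 1..n) into a real vector indexed by job."""
    key_of = {}
    for k, sequence in enumerate(pi, start=1):
        upper = math.nextafter(k + 1, k)   # largest float below k + 1, keeps keys in [k, k + 1)
        # Step 1: n_k random numbers from [k, k + 1), sorted ascending (vector R_k).
        r_k = sorted(min(k + rng.random(), upper) for _ in sequence)
        # Step 2: pair the jobs of pi_k with the keys of R_k position by position.
        for job, key in zip(sequence, r_k):
            key_of[job] = key
    # Step 3: reorder the keys so that position j holds the key of job J_j.
    return [key_of[job] for job in sorted(key_of)]

def decode(x, n_factories):
    """ASSUMED inverse: floor(key) selects the factory, ascending keys give the job order."""
    pi = [[] for _ in range(n_factories)]
    for key, job in sorted((key, job) for job, key in enumerate(x, start=1)):
        pi[int(key) - 1].append(job)
    return pi

pi_example = [[3, 1], [4, 2, 5]]            # two factories, five jobs
x_example = encode(pi_example, random.Random(0))
```

Because the keys of factory k all lie in [k, k + 1) and are assigned in ascending order along $\pi_k$, both the factory assignment and the within-factory order survive the conversion.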
[8] F. L. Xiong, K. Y. Xing, and F. Wang, "Scheduling a hybrid assembly- [29] G. H. Zhang, B. Liu, L. Wang, and K. Y. Xing, "Distributed
differentiation flowshop to minimize total flow time," European Journal heterogeneous co-evolutionary algorithm for scheduling a multistage
of Operational Research, vol. 240, no. 2, pp. 338-354, 2015. fine-manufacturing system with setup constraints," IEEE Transactions on
[9] G. H. Zhang, K. Y. Xing, and F. Cao, "Discrete differential evolution Cybernetics, vol. 54, no. 3, pp. 1497-1510, 2024.
algorithm for distributed blocking flowshop scheduling with makespan [30] Z. H. Zhan, Z. J. Wang, H. Jin, and J. Zhang, "Adaptive distributed
criterion," Engineering Applications of Artificial Intelligence, vol. 76, pp. differential evolution," IEEE Transactions on Cybernetics, vol. 50, no. 11,
96-107, 2018. pp. 4633-4647, 2020.
[10] Y.-J. Gong et al., "Distributed evolutionary algorithms and their models: [31] F. Herrera and M. Lozano, "Gradual distributed real-coded genetic
A survey of the state-of-the-art," Applied Soft Computing, vol. 34, pp. algorithms," IEEE Transactions on Evolutionary Computation, vol. 4, no.
286-300, 2015. 1, pp. 43-63, 2000.
[11] X. Chen, Y. S. Ong, M. H. Lim, and K. C. Tan, "A multi-facet survey on [32] E. Alba and B. Dorronsoro, "The exploration/exploitation tradeoff in
memetic computation," IEEE Transactions on Evolutionary Computation, dynamic cellular genetic algorithms," IEEE Transactions on Evolutionary
vol. 15, no. 5, pp. 591-607, 2011. Computation, vol. 9, no. 2, pp. 126-142, 2005.
[12] Z. H. Zhan et al., "Matrix-based evolutionary computation," IEEE [33] Y. F. Ge et al., "Distributed memetic algorithm for outsourced database
Transactions on Emerging Topics in Computational Intelligence, vol. 6, fragmentation," IEEE Transactions on Cybernetics, vol. 51, no. 10, pp.
no. 2, pp. 315-328, 2022. 4808 - 4821, 2021.
[13] X. Wang and L. Tang, "A machine-learning based memetic algorithm for [34] X. Zhang, Z. H. Zhan, W. Fang, P. Qian, and J. Zhang, "Multi population
the multi-objective permutation flowshop scheduling problem," ant colony system with knowledge based local searches for multiobjective
Computers & Operations Research, vol. 79, pp. 60-77, 2017. supply chain configuration," IEEE Transactions on Evolutionary
[14] B. Naderi and R. Ruiz, "The distributed permutation flowshop scheduling Computation, vol. 26, no. 3, pp. 512-526, 2022.
problem," Computers & Operations Research, vol. 37, no. 4, pp. 754-768, [35] N. R. Sabar, J. Abawajy, and J. Yearwood, "Heterogeneous cooperative
2010. co-evolution memetic differential evolution algorithm for big data
[15] R. Ruiz, Q.-K. Pan, and B. Naderi, "Iterated greedy methods for the optimization problems," IEEE Transactions on Evolutionary
distributed permutation flowshop scheduling problem," Omega, vol. 83, Computation, vol. 21, no. 2, pp. 315-327, 2017.
pp. 213-222, 2019. [36] X. F. Liu, Z. H. Zhan, and J. Zhang, "Resource-aware distributed
[16] J.-P. Huang, Q.-K. Pan, Z.-H. Miao, and L. Gao, "Effective constructive differential evolution for training expensive neural-network-based
heuristics and discrete bee colony optimization for distributed flowshop controller in power electronic circuit," IEEE Transactions on Neural
with setup times," Engineering Applications of Artificial Intelligence, vol. Networks and Learning Systems, vol. 33, no. 11, pp. 6286-6296, 2022.
97, p. 104016, 2021. [37] I. Abu Doush, M. A. Al-Betar, M. A. Awadallah, Z. A. A. Alyasseri, S. N.
[17] P. Perez-Gonzalez and J. M. Framinan, "A review and classification on Makhadmeh, and M. El-Abd, "Island neighboring heuristics harmony
distributed permutation flowshop scheduling problems," European search algorithm for flow shop scheduling with blocking," Swarm and
Journal of Operational Research, vol. 312, no. 1, pp. 1-21, 2024. Evolutionary Computation, vol. 74, p. 101127, 2022.
[18] K. C. Ying and S. W. Lin, "Minimizing makespan in distributed blocking [38] F. Neri and C. Cotta, "Memetic algorithms and memetic computing
flowshops using hybrid iterated greedy algorithms," IEEE Access, Article optimization: A literature review," Swarm and Evolutionary Computation,
vol. 5, pp. 15694-15705, 2017. vol. 2, pp. 1-14, 2012.
[19] X. Han et al., "An effective iterative greedy algorithm for distributed [39] C. Wang, H. Ma, G. Chen, and S. Hartmann, "Memetic EDA-based
blocking flowshop scheduling problem with balanced energy costs approaches to QoS-aware fully automated semantic web service
criterion," Applied Soft Computing, p. 109502, 2022. composition," IEEE Transactions on Evolutionary Computation, vol. 26,
[20] H. H. Miyata and M. S. Nagano, "An iterated greedy algorithm for no. 3, pp. 570-584, 2022.
distributed blocking flow shop with setup times and maintenance [40] Y. Du, L. Wang, L. Xing, J. Yan, and M. Cai, "Data-driven heuristic
operations to minimize makespan," Computers & Industrial Engineering, assisted memetic algorithm for efficient inter-satellite link scheduling in
vol. 171, p. 108366, 2022. the BeiDou navigation satellite system," IEEE/CAA Journal of
[21] F. Zhao, H. Bao, L. Wang, T. Xu, N. Zhu, and Jonrinaldi, "A heuristic and Automatica Sinica, vol. 8, no. 11, pp. 1800-1816, 2021.
meta-heuristic based on problem-specific knowledge for distributed [41] G. H. Zhang, L. Wang, and K. Y. Xing, "Dual-space co-evolutionary
blocking flow-shop scheduling problem with sequence-dependent setup memetic algorithm for scheduling hybrid differentiation flowshop with
times," Engineering Applications of Artificial Intelligence, vol. 116, p. limited buffer constraints," IEEE Transactions on Systems, Man, and
105443, 2022. Cybernetics: Systems, vol. 52, no. 11, pp. 6822-6836, 2022.
[22] W. Duan, Z. Li, Y. Yang, B. Liu, and K. Wang, "EDA based probabilistic [42] W. Shao, Z. Shao, and D. Pi, "A network memetic algorithm for energy
memetic algorithm for distributed blocking permutation flowshop and labor-aware distributed heterogeneous hybrid flow shop scheduling
scheduling with sequence dependent setup time," in 2017 IEEE Congress problem," Swarm and Evolutionary Computation, p. 101190, 2022.
on Evolutionary Computation (CEC), 2017, pp. 992-999. [43] J. J. Wang and L. Wang, "A cooperative memetic algorithm with
[23] F. Zhao, L. Zhao, L. Wang, and H. Song, "An ensemble discrete learning-based agent for energy-aware distributed hybrid flow-shop
differential evolution for the distributed blocking flowshop scheduling scheduling," IEEE Transactions on Evolutionary Computation, vol. 26,
with minimizing makespan criterion," Expert Systems with Applications, no. 3, pp. 461-475, 2022.
vol. 120, p. 113678, 2020. [44] H. Ishibuchi, T. Yoshida, and T. Murata, "Balance between genetic search
[24] S. Hatami, R. Ruiz, and C. Andrés-Romano, "The distributed assembly and local search in memetic algorithms for multiobjective permutation
permutation flowshop scheduling problem," International Journal of flowshop scheduling," IEEE Transactions on Evolutionary Computation,
Production Research, vol. 51, no. 17, pp. 5292-5308, 2013. vol. 7, no. 2, pp. 204-223, 2003.
[25] S. Y. Wang and L. Wang, "An estimation of distribution algorithm-based [45] G. H. Zhang, W. J. Ma, K. Y. Xing, L. N. Xing, and K. S. Wang,
memetic algorithm for the distributed assembly permutation flow-shop "Quantum-inspired distributed memetic algorithm," Complex System
scheduling problem," IEEE Transactions on Systems, Man, and Modeling and Simulation, vol. 2, no. 4, pp. 334-353, 2022.
Cybernetics: Systems, vol. 46, no. 1, pp. 139-149, 2016. [46] M. Amirghasemi and R. Zamani, "An effective evolutionary hybrid for
[26] H.-B. Song, Y.-H. Yang, J. Lin, and J.-X. Ye, "An effective hyper solving the permutation flowshop scheduling problem," Evolutionary
heuristic-based memetic algorithm for the distributed assembly Computation, vol. 25, no. 1, pp. 87-111, 2017.
permutation flow-shop scheduling problem," Applied Soft Computing, p. [47] L. Wang, Z. Pan, and J. Wang, "A review of reinforcement learning based
110022, 2023. intelligent optimization for manufacturing scheduling," Complex System
[27] K.-C. Ying and S.-W. Lin, "Reinforcement learning iterated greedy Modeling and Simulation, vol. 1, no. 4, pp. 257-270, 2021.
algorithm for distributed assembly permutation flowshop scheduling [48] D. J. Mankowitz et al., "Faster sorting algorithms discovered using deep
problems," Journal of Ambient Intelligence and Humanized Computing, reinforcement learning," Nature, vol. 618, no. 7964, pp. 257-263, 2023.
vol. 14, pp. 11123-11138, 2023. [49] R. Fakoor, P. Chaudhari, S. Soatto, and A. J. Smola, "Meta-Q-learning,"
[28] G. H. Zhang, K. Y. Xing, and F. Cao, "Scheduling distributed flowshops in International Conference on Learning Representation (ICLR 2020),
with flexible assembly and setup time to minimise makespan," 2020, pp. 1-17.
International Journal of Production Research, vol. 56, no. 9, pp. [50] X. L. Jing, Q. K. Pan, L. Gao, and L. Wang, "An effective iterated greedy
3226-3244, 2018. algorithm for a robust distributed permutation flowshop problem with
Authorized licensed use limited to: Hong Kong Polytechnic University. Downloaded on December 22,2024 at 08:24:51 UTC from IEEE Xplore. Restrictions apply.
© 2024 IEEE. Personal use is permitted, but republication/redistribution requires IEEE permission.See https://www.ieee.org/publications/rights/index.html for more information.
This article has been accepted for publication in IEEE Transactions on Evolutionary Computation. This is the author's version which has not been fully edited and
content may change prior to final publication. Citation information: DOI 10.1109/TEVC.2024.3519774
carryover sequence-dependent setup time," IEEE Transactions on Systems, [53] S. García, D. Molina, M. Lozano, and F. Herrera, "A study on the use of
Man, and Cybernetics: Systems, vol. 52, no. 9, pp. 5783-5794, 2022. non-parametric tests for analyzing the evolutionary algorithms' behaviour:
[51] J. Mao, Q. Pan, Z. Miao, and L. Gao, "An effective multi-start iterated a case study on the CEC'2005 Special Session on Real Parameter
greedy algorithm to minimize makespan for the distributed permutation Optimization," Journal of Heuristics, vol. 15, no. 6, pp. 617-644, 2009.
flowshop scheduling problem with preventive maintenance," Expert
Systems with Applications, vol. 169, p. 114495, 2021.
[52] B. Naderi and R. Ruiz, "A scatter search algorithm for the distributed
permutation flowshop scheduling problem," European Journal of
Operational Research, vol. 239, no. 2, pp. 323-334, 2014.
Guanghui Zhang received the Ph.D. degree in control science and engineering from Xi’an Jiaotong University, Xi’an, China, in 2018. Since 2019, he has been with the School of Information Science of Technology, Hebei Agricultural University, Baoding, China.
His current research interests include control and scheduling of flexible manufacturing systems. He has authored refereed academic articles in IEEE Transactions and other international journals, such as IEEE Transactions on Evolutionary Computation, IEEE Transactions on Cybernetics, IEEE Transactions on Systems, Man, and Cybernetics: Systems, IEEE Transactions on Emerging Topics in Computational Intelligence, IEEE Internet of Things Journal, etc.

Juan Wang received the B.S. degree in information management and information systems from Xinyang Normal University, Xinyang, China, in 2022. She is currently pursuing the M.S. degree with the School of Information Science of Technology, Hebei Agricultural University, Baoding, China.
Her current research interests include the scheduling of distributed manufacturing systems and intelligent optimization methods.

Qianlong Dang received the Ph.D. degree in mathematics from Xidian University, Xi’an, China, in 2022.
He is currently an Associate Professor with Northwest A&F University, Yangling, China. His research interests include machine learning, deep neural networks, evolutionary computation, and optimization algorithms. He has authored more than ten journal and conference papers.

Ling Wang received the B.S. degree in automation and the Ph.D. degree in control theory and control engineering from Tsinghua University, Beijing, China, in 1995 and 1999, respectively.
Since 1999, he has been with the Department of Automation, Tsinghua University, where he became a Full Professor in 2008. His research interests include intelligent optimization and production scheduling. Professor Wang is a recipient of the National Natural Science Fund for Distinguished Young Scholars of China. Professor Wang is currently the Editor-in-Chief of Swarm and Evolutionary Computation, Expert Systems with Applications, and International Journal of Automation and Control, and the Associate Editor of IEEE Transactions on Evolutionary Computation, etc.