A Parallel Monte-Carlo Tree Search-Based Metaheuristic For Optimal

Fleet Composition Considering Vehicle Routing Using Branch & Bound

T.M.J.T. Baltussen¹, M. Goutham¹, M. Menon², S.G. Garrow², M. Santillo², S. Stockar¹

¹ Tren Baltussen, Mithun Goutham and Stephanie Stockar are with the Center for Automotive Research, The Ohio State University, Columbus, OH 43212, USA. {baltussen.1, goutham.1, stockar.1}@osu.edu
² Meghna Menon, Sarah Garrow and Mario Santillo are with the Ford Motor Company, Dearborn, MI 48109, USA. {mmenon8, sgarrow1, msantil3}@ford.com

2023 IEEE Intelligent Vehicles Symposium (IV) | 979-8-3503-4691-6/23/$31.00 ©2023 IEEE | DOI: 10.1109/IV55152.2023.10186562
Authorized licensed use limited to: Universitas Brawijaya. Downloaded on November 29,2023 at 13:02:45 UTC from IEEE Xplore. Restrictions apply.

Abstract: Autonomous mobile robots enable increased flexibility of manufacturing systems. The design and operating strategy of such a fleet of robots requires careful consideration of both fixed and operational costs. In this paper, a Monte-Carlo Tree Search (MCTS)-based metaheuristic is developed that guides a Branch & Bound (B&B) algorithm to find the globally optimal solution to the Fleet Size and Mix Vehicle Routing Problem with Time Windows (FSMVRPTW). The metaheuristic and exact algorithms are implemented in a parallel hybrid optimization algorithm where the metaheuristic rapidly finds feasible solutions that provide candidate upper bounds for the B&B algorithm. The MCTS additionally provides a candidate fleet composition to initiate the B&B search. Experiments show that the proposed approach results in significant improvements in computation time and convergence to the optimal solution.

Keywords: Fleet composition, Vehicle Routing, Branch & Bound, Monte-Carlo Tree Search, Metaheuristic

I. INTRODUCTION

In the industrial sector, reconfigurable manufacturing systems are increasingly being adopted because of their ability to scale and diversify production by supporting the adaptability of process controls, functions, and operations [1]. A key enabler is the added production flexibility provided by the adoption of fleets of autonomous mobile robots (AMRs) that move material within a plant [2]. In particular, multi-load AMRs enhance efficiency by picking up and dropping off multiple items in a single mission [3]. The design of such a fleet is a strategic problem and involves considerable capital investment [4]. Therefore, all costs related to acquisition and operation should be considered. Although [5] and [6] have recently shown the relevance of combining vehicle routing and component design of the vehicles in the fleet, combined vehicle routing and fleet composition has generally received insufficient attention [4]. In this paper, the Vehicle Routing Problem with Time Windows (VRPTW), with capacity constraints on cargo mass, cargo volume and vehicle range, is used to obtain operational costs. The combination of the VRPTW with the heterogeneous fleet composition problem is called the Fleet Size and Mix Vehicle Routing Problem with Time Windows (FSMVRPTW). This problem accommodates a heterogeneous fleet and considers both fixed and operational costs [4].

Fleet composition optimization problems are typically posed as a capacitated VRPTW where the fleet size can be varied [7]. Exact algorithms that guarantee optimality for this combinatorial optimization problem structure it as a tree exploration problem that is solved using Branch & Bound (B&B) methods [7]. However, due to the NP-hard nature of the problem, the application of exact algorithms is restricted to small problem instances [8]. Real-life VRPTW applications are considerably larger in scale [8] and finding the optimal solution to such a problem is computationally expensive. Therefore, most VRPTWs are solved using metaheuristic methods due to their ability to find near-optimal solutions in a limited time [7], [8]. However, such approximate methods do not provide guarantees on the optimality of the solution [7].

Hybrid optimization methods can improve the performance and efficiency of the optimizer by combining the strengths of metaheuristics and exact algorithms. Successful metaheuristics provide a balance between exploration and exploitation of the search space [9]. Monte-Carlo Tree Search (MCTS) is a reinforcement learning algorithm that strikes this balance and is well suited to large-scale combinatorial optimization problems [7], [10], [11]. In fact, MCTS has already been used in the literature as a metaheuristic that guides a CPLEX solver toward the optimal solution [12]. Moreover, it is frequently hybridized with other optimization algorithms [11]. MCTS has been found to obtain state-of-the-art results in resource allocation problems (RAP) [13] and in single-vehicle instances of the VRPTW, called Travelling Salesperson Problems with Time Windows (TSPTW) [14]. It has also been used to solve VRPs with variable fleet sizes [13]–[15]. However, MCTS has not yet been used to solve FSMVRPTWs that permit different types of vehicles.

The first contribution of this paper is the development of an exact incremental B&B algorithm for the FSMVRPTW. This algorithm employs a divide and conquer approach where the VRPTW is partitioned into an RAP that first assigns tasks to each robot using a parallel B&B algorithm, and then finds the optimal sequence in which the assigned tasks are completed by solving a nested TSPTW, using another B&B algorithm. The second contribution is a hybrid MCTS-based metaheuristic (UCT-MH) that uses the Upper Confidence bounds applied to Trees algorithm [16] in the fleet composition levels to guide its search and solves the nested TSPTW using a B&B algorithm. The third contribution is the hybrid optimization framework where the UCT-MH guides the incremental B&B to find the optimal solution to the FSMVRPTW. When possible, this B&B is initialized with a fleet composition identified by the rapid search space exploration enabled by the UCT-MH. Additionally, the best solutions found by the UCT-MH update the upper bound used by the incremental B&B, which allows sub-optimal solutions
to be pruned earlier. The performance of the proposed method is verified on various real-life case studies. Results show a significant reduction in computation time when the incremental B&B algorithm is guided by the proposed UCT-MH, especially for large problem sizes.

II. PROBLEM FORMULATION & METHODOLOGY

Consider a manufacturing plant with a known layout that comprises various spatial constraints, and a set of material handling tasks $\mathcal{T}$. Each task involves picking up certain cargo items at inventory locations and dropping them off at their designated drop-off locations within defined time windows. The objective of the optimization is to find the optimal fleet of multi-load capacitated AMRs that completes all the defined tasks $\mathcal{T}$ while minimizing fixed and operational costs.

Let the set $\mathcal{H} := \{1, 2, ..., h\}$ identify the $h$ different AMR types available, each with specific traveling speeds, energy efficiency, cargo capacity, driving range, charge time, etc. Let $k_i \le k_i^{max} : i \in \mathcal{H}$ denote the number of each type of AMR that forms a fleet, so that any fleet composition can be fully defined by a vector $k \in \mathbb{N}_0^h$. This fleet is associated with a fixed cost $J^f(k)$ composed of purchase costs, depreciation, etc., that can be captured by $J^f(k) = c^\top k$ for some $c \in \mathbb{R}^h$. For completing all the tasks in $\mathcal{T}$, the operational cost $J^o(k)$ can be any combination of relevant metrics to be minimized such as energy, slack time, number of turns, asset depreciation, etc. [17]–[19]. The total cost to be minimized is:

$$\min_{k} J = c^\top k + J^o(k) \tag{1}$$

The fleet operational cost $J^o(k)$ is posed as an RAP that finds the optimal partition of tasks to be assigned to AMRs that minimizes total operational cost. If the total number of robots in the heterogeneous fleet $k$ is given by $m = \sum_{i=1}^{h} k_i$, every robot in this fleet can be identified by $r \in \mathcal{R}_k := \{1, 2, ..., m\}$. Let the set $\mathcal{T}_r \subseteq \mathcal{T}$ denote the tasks assigned to robot $r$ by the partitioning of $\mathcal{T}$, denoted by $\mathbb{T} := \{\mathcal{T}_r : r \in \mathcal{R}_k\}$, meaning $\bigcup_{r \in \mathcal{R}_k} \mathcal{T}_r = \mathcal{T}$ and $\forall r, s \in \mathcal{R}_k : r \neq s,\ \mathcal{T}_r \cap \mathcal{T}_s = \emptyset$. The optimal partition of task set $\mathcal{T}$ minimizes $J^o(k)$ in Eq. (2).

$$J^o(k) = \min_{\mathbb{T}} \sum_{r \in \mathcal{R}_k} J^r(r, \mathcal{T}_r) \tag{2}$$

The minimum operational cost $J^r(r, \mathcal{T}_r)$ for each robot in fleet $k$ depends on the AMR type, and is also affected by the sequence in which task locations are visited, as it is possible for the robot to pick up multiple items before dropping them off so long as each pickup is visited before the corresponding drop-off. The objective function and constraints that yield $J^r(r, \mathcal{T}_r)$ are defined in Eq. (3)–(15).

Let robot $r$ of the fleet be assigned $n_r = |\mathcal{T}_r|$ tasks. The sets of pickup and drop-off locations are defined as $\mathcal{V}_P := \{1, 2, ..., n_r\}$ and $\mathcal{V}_D := \{n_r + 1, n_r + 2, ..., 2n_r\}$ respectively, so that an item picked up at location $i$ must be dropped off at location $n_r + i$. The origin and final destination locations of the robot are identified by $\{0, 2n_r + 1\}$. Let $\mathcal{V} := \{\mathcal{V}_P \cup \mathcal{V}_D \cup \{0, 2n_r + 1\}\}$ be the set of all locations in a graph representation $G := (\mathcal{V}, \mathcal{A})$ where $\mathcal{A} := \{(i, j) \in \mathcal{V} \times \mathcal{V}\}$ is the arc set. Between every pair of nodes $(i, j) \in \mathcal{A}$, the operational costs $D_{ij} \in \mathbb{R}_+$, energy consumed $\delta e_{ij} \in \mathbb{R}_+$ and travel time $\delta t_{ij} \in \mathbb{R}_+$ are pre-computed before initializing the optimization by solving a path planning problem between every two locations $i, j \in \mathcal{V} : (i, j) \in \mathcal{A}$.

$$J^r(r, \mathcal{T}_r) = \min_{x_{ij}\ \forall (i,j) \in \mathcal{A}} \sum_{i=0}^{2n_r} \sum_{j=1}^{2n_r+1} D_{ij} x_{ij} \tag{3}$$

$$\text{s.t.}\quad x_{ij} \in \{0, 1\} \quad \forall (i, j) \in \mathcal{A} \tag{4}$$

$$\sum_{j=1}^{n_r} x_{0j} = 1 \tag{5}$$

$$\sum_{i=0}^{2n_r} x_{il} = \sum_{j=1}^{2n_r+1} x_{lj} = 1, \quad \forall l \in \{\mathcal{V} \setminus \{0, 2n_r + 1\}\} \tag{6}$$

$$\sum_{i=n_r+1}^{2n_r} x_{i,2n_r+1} = 1 \tag{7}$$

$$z_j = \begin{cases} z_i - \delta e_{ij} & \text{if } x_{ij} = 1 \wedge z_i - \delta e_{ij} > 0 \\ 1 - \delta e_{0j} & \text{if } x_{ij} = 1 \wedge z_i - \delta e_{ij} \le 0 \end{cases} \quad \forall (i, j) \in \mathcal{A} \tag{8}$$

$$z_0 = 1;\quad 0 \le z_i \le 1 \quad \forall i \in \mathcal{V} \tag{9}$$

$$T_{ij} = \begin{cases} \delta t_{ij} & \text{if } x_{ij} = 1 \wedge z_i - \delta e_{ij} > 0 \\ \delta t_{0i} + (1 - z_i - \delta e_{i0})\,p^{-1} + \delta t_{0j} & \text{if } x_{ij} = 1 \wedge z_i - \delta e_{ij} \le 0 \end{cases} \quad \forall (i, j) \in \mathcal{A} \tag{10}$$

$$x_{ij} = 1 \rightarrow t_i + s_i + T_{ij} \le t_j \quad \forall (i, j) \in \mathcal{A} \tag{11}$$

$$t_i + s_i + T_{i,n+i} \le t_{n+i} \quad \forall i \in \mathcal{V} \tag{12}$$

$$e_i \le t_i \le l_i \quad \forall i \in \mathcal{V} \tag{13}$$

$$x_{ij} = 1 \rightarrow y_j = y_i + q_j \quad \forall (i, j) \in \mathcal{A} \tag{14}$$

$$y_0 = 0;\quad 0 \le y_i \le Q_r \quad \forall i \in \mathcal{V} \tag{15}$$

The binary flow variable $x_{ij} = 1$ signifies that the robot uses directed arc $(i, j) \in \mathcal{A}$. Constraints related to the robot starting from the depot 0, visiting every location once and terminating the sequence at $2n_r + 1$ are enforced by Eq. (4)–(7).

The battery states of charge $z_j$ in Eq. (8)–(10) are updated as the robot goes about its mission. Whenever the battery is depleted, the robot heads to the depot where it is fully charged at a constant recharging rate $p$. The variable $T_{ij}$ in Eq. (10)–(12) updates the travel time between locations $i$ and $j$ based on whether a recharge is required between the two locations. Time variables $t_i$ in Eq. (11)–(13) denote the arrival time of the robot at location $i \in \mathcal{V}$. Each location is associated with a time $s_i$ for material handling and a time window $[e_i, l_i]$ which represents the earliest and latest time at which material handling can start. Cargo constraints are captured in Eq. (14) and (15), where payload variables $y_i$ capture the cargo mass being carried by the robot as it leaves location $i \in \mathcal{V}$. Each robot $r$ has a cargo capacity limitation of $Q_r$ and each location $i \in \mathcal{V}$ is associated with a cargo load $q_i \in \mathbb{R}$ such that $q_i + q_{n+i} = 0$. Volumetric constraints are modeled similarly.
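To make the role of the time and cargo constraints concrete, the following Python sketch checks one fixed visiting order against Eq. (11) and (13)–(15) and sums the arc costs of Eq. (3). It is an illustration under stated assumptions, not the authors' implementation: the function name `evaluate_route`, its argument layout and the tiny one-task instance are hypothetical, the robot is assumed to wait when it arrives before a window opens, and the battery and recharging logic of Eq. (8)–(10) is omitted so that $T_{ij} = \delta t_{ij}$.

```python
def evaluate_route(route, travel_time, arc_cost, service, window, load, capacity):
    """Check a fixed visiting order against Eq. (11) and (13)-(15).

    Simplifying assumption: no recharging, so T_ij equals the travel time.
    Returns the summed arc cost D_ij of Eq. (3), or None if infeasible.
    """
    t = window[route[0]][0]   # leave the depot at its earliest time
    y = 0.0                   # payload when leaving the depot, y_0 = 0
    total = 0.0
    for i, j in zip(route, route[1:]):
        arrival = t + service[i] + travel_time[i][j]   # Eq. (11)
        earliest, latest = window[j]
        t = max(arrival, earliest)                     # assumed: wait if early
        if t > latest:                                 # Eq. (13) violated
            return None
        y += load[j]                                   # Eq. (14)
        if y < 0.0 or y > capacity:                    # Eq. (15) violated
            return None
        total += arc_cost[i][j]
    return total


# Hypothetical instance with one task: 0 = depot, 1 = pickup,
# 2 = drop-off, 3 = final destination.
travel_time = [[0, 1, 1, 1], [1, 0, 1, 1], [1, 1, 0, 1], [1, 1, 1, 0]]
arc_cost = travel_time
service = [0, 1, 1, 0]
window = [(0, 100), (0, 10), (0, 10), (0, 100)]
load = [0, 5, -5, 0]   # q_1 + q_2 = 0

ok = evaluate_route([0, 1, 2, 3], travel_time, arc_cost, service, window, load, capacity=10)
bad_order = evaluate_route([0, 2, 1, 3], travel_time, arc_cost, service, window, load, capacity=10)
too_heavy = evaluate_route([0, 1, 2, 3], travel_time, arc_cost, service, window, load, capacity=4)
```

In this sketch a drop-off visited before its pickup drives the payload negative, so the payload bound also rejects orders that break the pickup-before-drop-off precedence; this only holds when pickup loads are positive.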

A. Exact Algorithm: Incremental Branch & Bound

The incremental B&B systematically partitions the search space into subsets that are arranged in a tree structure. The root of the tree is the original problem and the leaves of the tree are its individual candidate solutions. Between the root and the leaves are intermediate nodes that represent subproblems obtained by recursively partitioning the original problem by a process called branching. B&B algorithms are used to solve these subproblems. The order in which these subproblems are examined is determined by a best-first selection criterion, i.e. exploitation, that first explores the subproblem with the cheapest cost.

For minimization problems, the upper bound is the incumbent solution, which is the cheapest candidate solution to the original problem found at a leaf node so far. The upper bound is continuously updated as the tree is explored, and is used to prune sub-optimal branches without recursively evaluating their solutions up to the leaf node. Thus, as the algorithm searches from the root to the leaves, branching is conducted only if the cost at the node is lower than that of the incumbent solution, so that branching can potentially find a better solution than the incumbent. Following this process, the B&B algorithm recursively decomposes the original problem until further branching is futile because the solution cannot be improved, or until the original problem has been solved because every feasible branch has been evaluated.

The NP-hard RAP described by Eq. (2) is solved by the B&B algorithm implemented in a parallel framework that uses $p$ processing cores, as shown in Fig. 1, where robots in the fleet are identified by subscript $r \in \mathcal{R} := \{1, 2, ..., m\}$. Thus, by splitting the arborescence at some task assignment level and assigning the emanating sub-trees to the available processors, several subproblems are explored simultaneously. During each processor's exploration, updated incumbent solutions are instantaneously made available to every processor in an asynchronous information sharing method using a shared work pool. For each processor, this RAP B&B algorithm is implemented by a recursive function to minimize memory and computational requirements as the tree is explored. Further, since the computation time of B&B algorithms increases with the number of feasible branches at each node, the fleet is initiated with a smaller candidate fleet $f^1 \in \mathbb{N}_0^h$ than the maximal fleet $k_{max} \in \mathbb{N}_0^h$. After evaluating the total cost of this candidate fleet, the number of robots is incrementally raised until further increments do not reduce the total cost or additional robots remain idle. For each fleet increment, only RAP subproblems that include at least one of the newly added robots are evaluated, since other solutions are guaranteed to have been evaluated already.

[Figure 1 image: arborescence with a root node and one level per task (Task 1, Task 2, ..., Task $|\mathcal{T}|$), each node branching over robots $r_1, r_2, ..., r_m$; the sub-trees are distributed over Processor 1, Processor 2, ..., Processor $p$, which share pooled best costs.]
Figure 1. Parallel implementation of the RAP B&B algorithm for the Resource Allocation Problem formulation.

For $h$ different AMR types available, the fleet is initiated with a candidate fleet $f^1 \le k_{max}$, which is chosen based on problem parameters and prior experience so that feasible solutions exist. The RAP of fleet $f^1$ is then solved using the described parallel B&B algorithm, and its minimum total cost $J^1$ is found, which utilizes robots $k^1 \in \mathbb{N}_0^h : k^1 \le f^1$. In the increment step, only robot types $i$ that satisfy $k^1_i = f^1_i$ are incremented by 1 for the next candidate fleet $f^2$. These increments are conducted so long as both $J^{i+1} \le J^i$ and $\exists\, i : k^1_i = f^1_i$. The optimal fleet that completes all tasks while minimizing total cost is then $k^i$ when $J^{i+1} > J^i$ or when $\nexists\, i : k^{i+1}_i = f^{i+1}_i$, i.e., the additional robots were idle.

Each time the RAP subproblem is solved at a node in the arborescence shown in Fig. 1, the TSPTW defined by Eq. (3)–(15) is solved to find the cost at that node. This TSPTW is solved using the recursive Algorithm 1, which employs another B&B to find the optimal sequence of task completion for each robot. In summary, the B&B incrementally increases the fleet size while minimizing the total cost $J$ of Eq. (1), which includes the operational cost $J^o(k)$ of Eq. (2) found using the RAP B&B algorithm and the cost $J^r(r, \mathcal{T}_r)$ of Eq. (3)–(15) found using the TSPTW B&B algorithm.

B. Metaheuristic: Monte-Carlo Tree Search

Each iteration of MCTS involves four steps [20]. Selection: at every node $v$ in the arborescence, the tree policy selects the next node $v'$. This node selection is initiated at the root node $v_0$ and is used for navigation until the leaf node $v_l$ is reached. Expansion: at the leaf node $v_l$, a random action is taken to expand the tree. Simulation: a Monte-Carlo simulation is performed starting from the expansion node to complete the solution. Backpropagation: the cost/reward of the expansion and simulation is propagated back to the root node $v_0$.

The Upper Confidence bounds applied to Trees algorithm [16] was the first variant and formal introduction of MCTS. The proposed metaheuristic (UCT-MH) uses this algorithm to guide the exact incremental B&B algorithm to the optimal solution. While in typical VRPTWs the fleet size is a free variable [7], the proposed metaheuristic selects a fleet size $m$ and a composition $k$ in the fleet sizing and composition stages and tries to solve the VRPTW optimally, fully utilizing that composition. By doing so, the algorithm finds an estimate of the expected total cost associated with a particular fleet
size and composition. This estimate serves as a measure for the quality of that branch and can be used by the MCTS to navigate the search. MCTS is most effective as a heuristic at the early stages of the decision problem [12]. Moreover, for smaller problem instances, B&B algorithms are often more suitable than MCTS [14]. As such, the proposed hybrid MCTS algorithm aims to utilize the strengths of the different algorithms and combine them into an effective hybrid MCTS-based metaheuristic.

Algorithm 1 Recursive TSPTW B&B
Cost = B&B(State, taskList, currentLocation)
  Find feasible next locations based on completed pickups, time, cargo, battery constraints
  Sort feasible next locations by operational cost of branching to that location (Best First Search)
  for i in feasibleLocations do
    branchCost = tourCost + operationalCost(i)
    if branchCost ≥ State.bestCost then
      continue {skip to next location i+}
    else if branchCost < State.bestCost then
      State+ = Update stateOfTime, stateOfCharge, finalPosition, remainingLocations
      if remainingLocations > 0 then
        Cost = B&B(State+, taskList, location(i))
      else
        State.bestCost = Cost
      end if
    end if
  end for

Although MCTS was originally designed to solve Markov Decision Processes, without loss of generality, MCTS can be used to solve a design problem by formulating it as a deterministic Markov Decision Process [11]. The optimization problem is modeled as a 3-tuple $\langle S, A, g \rangle$, where $S$ is a set of states, $A$ is a set of actions and $g(s, a) : S \times A \rightarrow [0, g_{max}]$ is a scalar cost function for taking action $a$ at state $s$. The state $s(v)$ contains the parameters that follow from the decisions up to node $v$. At the root node $v_0$, the fleet size $m$ is determined by action $a_0$, where $g_1(s_0(v_0), a_0) := 0$, since the fleet cost is determined by its composition. Subsequently, the fleet composition $k$ is determined by $a_1 \in A_1(m)$, with fixed cost $g_2(s_1(m), a_1) = J^f(k)$. Fig. 2 provides a schematic overview of the problem and the proposed metaheuristic.

Figure 2. Overview of the multi-stage design problem, with the FSMVRPTW (red) and the nested VRPTW (blue), and the proposed UCT-MH Algorithm.

At the fleet sizing and composition stages, the UCT-MH utilizes the UCB1 tree policy [16] for the selection step at node $v$ of the search tree:

$$\mathrm{UCB1}(v) = \arg\max_{v' \in \text{children of } v} \frac{Q(v')}{N(v')} + \sqrt{\frac{2 \ln N(v)}{N(v')}} \tag{16}$$

Here, $Q(v')$ is the total reward of all plays through child node $v'$, $N(v')$ denotes the number of visits of child node $v'$, and $N(v)$ is the number of visits of the parent node $v$. The policy function depends on the quality of the node being considered as well as the number of evaluations of that node, balancing the exploration and exploitation of the search space [20]. In order to apply the UCB1 policy and have a proper balance between exploration and exploitation, the problem is transformed such that the stage reward $R_i(v) \in [0, 1]$ [16]:

$$R_i(v') = 1 - \frac{g_i(v')}{g_{max}} \tag{17}$$

where $R_i(v')$ is the reward of the transition from state $s_{i-1}(v)$ to state $s_i(v')$ and $v' \in$ children of $v$. It follows that $Q(v')$ is the sum of all rewards of all $N(v')$ plays through node $v'$ back to the root node $v_0$:

$$Q(v') = \sum_{i=1}^{N(v')} R_i(v') + R_i(v) + ... + R_i(v_0) \tag{18}$$

Considering that the number of permutations of the RAP is exponential in the number of tasks, it is deemed sufficient to determine the task assignment by a random rollout $(\xi_1, ..., \xi_n)$. In order to prevent any bias toward another fleet size, it is ensured that the full fleet size is utilized, i.e. each AMR in the fleet will have at least one assignment. The assigned tasks do not have any associated costs/rewards.

Since many of the TSPTW instances encountered are small problem instances, it is advantageous to use the same recursive B&B algorithm for TSPTW as described in Section II-A to find the optimal sequence in which the assigned tasks are completed by each robot. Each TSPTW B&B is terminated after a one-second time cap since the metaheuristic is not aimed at local convergence. Considering the best-first order of exploration, this still finds reasonably good estimates for the operational cost $\tilde{J}^o(k)$. The cost that is obtained through the rollout of the RAP and the TSPTW is backpropagated through the tree and assigned to $Q(v)$ at the node $v$ that is associated with a particular fleet size or composition. This is in turn used by the UCB1 policy function to determine the decisions in the next iteration. As a result, at the root node, the term $Q(v')/N(v')$ in (16) is proportional to the total mean cost-to-go for a given fleet size or composition at node $v'$. As the total number of plays at the root node $N(v_0)$ grows to infinity, the UCB1 function converges to the expected value of the total cost for a given fleet size.
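The selection rule of Eq. (16) and the reward scaling of Eq. (17) can be sketched in a few lines of Python. This is a minimal illustration, not the authors' MATLAB code: the names `reward` and `ucb1_select`, the dictionary layout of the children, and the convention that unvisited children are tried first are all assumptions introduced here.

```python
import math


def reward(stage_cost, g_max):
    """Eq. (17): map a stage cost in [0, g_max] to a reward in [0, 1]."""
    return 1.0 - stage_cost / g_max


def ucb1_select(children, parent_visits):
    """Eq. (16): return the child maximizing Q/N + sqrt(2 ln N(v) / N(v')).

    children maps a label to (Q, N), the total reward and visit count.
    Assumed convention (not stated in the paper): a child that has never
    been visited is selected before any scored child.
    """
    best, best_score = None, -math.inf
    for label, (q, n) in children.items():
        if n == 0:
            return label
        score = q / n + math.sqrt(2.0 * math.log(parent_visits) / n)
        if score > best_score:
            best, best_score = label, score
    return best


# Two fleet sizes with equal visit counts: the higher mean reward wins.
children = {"m=2": (3.0, 10), "m=3": (4.5, 10)}
choice = ucb1_select(children, parent_visits=20)
```

With equal visit counts the exploration term is identical for both children, so the choice reduces to the mean reward $Q/N$; a rarely visited child with a lower mean can still be selected because its exploration term is larger.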

C. Hybrid Optimization: Guiding B&B with the UCT-MH

The hybrid optimization framework utilizes the search results of the UCT-MH to guide the exact incremental B&B. Multiple processors are allocated to the B&B algorithm that systematically navigates the tree to solve the problem exactly. Meanwhile, one processor is dedicated to running the UCT-MH, which efficiently samples the entire design space to get an estimate of the associated costs. Considering the parallelization overhead of the parallel B&B algorithm, it can be expected that the UCT-MH already finds a fleet composition candidate $\hat{k}$ by the time the B&B is initiated. If such a composition is available, then it is used as the candidate fleet $f^1 = \hat{k}$ that initializes the B&B algorithm. Moreover, whenever the guiding UCT-MH finds a new best solution, it provides this solution with its associated cost to the guided B&B by adding it to the pooled best cost shown in Fig. 1. This information is used to preemptively prune sub-optimal branches and guide the B&B toward the optimal fleet size and composition, thereby reducing the search space and computation time.
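The effect of an externally supplied upper bound on pruning can be illustrated with a small sequential Python sketch. It is a toy under stated assumptions rather than the paper's parallel implementation: `solve_rap_bnb`, `op_cost` and `fixed_cost` are hypothetical names, each task carries an additive per-robot cost standing in for the nested TSPTW cost, and a one-off cost per robot used is folded into the assignment tree only to make the search non-trivial. The `incumbent` argument plays the role of a solution placed in the pooled best cost by a metaheuristic.

```python
import math


def solve_rap_bnb(op_cost, fixed_cost, incumbent=(math.inf, None)):
    """Depth-first B&B with best-first child ordering over task assignments.

    op_cost[t][r]: assumed additive cost of giving task t to robot r.
    fixed_cost[r]: assumed one-off cost charged when robot r is first used.
    incumbent: (cost, assignment) of a known solution used as upper bound.
    Returns (best_cost, best_assignment, nodes_visited).
    """
    n_tasks, n_robots = len(op_cost), len(fixed_cost)
    best_cost, best_assignment = incumbent
    visited = 0

    def branch(task, partial, assignment):
        nonlocal best_cost, best_assignment, visited
        visited += 1
        if partial >= best_cost:      # prune: cannot beat the incumbent
            return
        if task == n_tasks:           # leaf: new incumbent found
            best_cost, best_assignment = partial, tuple(assignment)
            return

        def marginal(r):
            return op_cost[task][r] + (0 if r in assignment else fixed_cost[r])

        for r in sorted(range(n_robots), key=marginal):
            step = marginal(r)        # evaluate before the robot is marked used
            assignment.append(r)
            branch(task + 1, partial + step, assignment)
            assignment.pop()

    branch(0, 0, [])
    return best_cost, best_assignment, visited


op_cost = [[1, 1], [1, 1], [9, 1]]
fixed_cost = [1, 4]
cold = solve_rap_bnb(op_cost, fixed_cost)                             # (7, (1, 1, 1), 13)
warm = solve_rap_bnb(op_cost, fixed_cost, incumbent=(7, (1, 1, 1)))   # (7, (1, 1, 1), 11)
```

Because all cost increments are non-negative, the partial cost is a valid lower bound, so a tighter incumbent can only prune more nodes. In this instance the warm-started search confirms the supplied solution as optimal after visiting 11 nodes instead of 13.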
III. RESULTS

A. Computational Experiments

To study the performance of the proposed hybrid algorithm, the guiding UCT-MH and the guided B&B are compared against the standalone incremental B&B. Four real-life case studies are conducted in MATLAB 2022a at the Ohio Supercomputer Center [21]. For each experiment, a set of $n$ tasks is defined, each consisting of items of known mass, volume, pick-up and drop-off locations and respective time windows. The fleet size is limited to $m_{max}$, equally distributed over $h = 3$ different AMR types. Each algorithm is run for a limited time $t_{max}$ after which the incumbent solutions are compared. Two smaller problems are studied in detail to illustrate the behavior of the UCT-MH in Fig. 3 and 4. The best cost found by each algorithm is summarized for all case studies in Table I.

B. Case Studies

1) $n = 10$ and $m_{max} = 6$: Figure 3a shows the UCT-MH exploration of the various fleet sizes, where the mean of the cost-to-go starts to converge and the algorithm gains more confidence in particular solutions as the number of evaluations increases. The guiding UCT-MH finds that $m = 6$ is the best candidate and dedicates more visits to these branches as shown in Fig. 3b. As a result, the guided B&B quickly focuses on local convergence (Fig. 3c). As the entire search space is explored, this solution is the guaranteed global optimum.

[Figure 3 image, three panels plotted against time in minutes (0 to 16): (a) The quality of candidate fleet sizes as determined by the UCT-MH; mean cost-to-go [-] (×10^4) for m = 1, ..., 6. (b) The UCT-MH balances exploration and exploitation of candidate solutions; number of visits [-] for m = 1, ..., 6. (c) Performance of UCT-MH and B&B algorithm and their parallelization; best cost found [-] (×10^4) for Standalone B&B: 7 cores, Guided B&B: 6 cores, Guiding UCT-MH: 1 core.]
Figure 3. Case Study 1: The UCT-MH and B&B algorithm, number of tasks $n = 10$, maximum number of AMRs: $m_{max} = 6$, $k_{max}^\top = [2, 2, 2]$.

[Figure 4 image, two panels: (a) The quality of candidate fleet sizes as determined by the UCT-MH, first 60 minutes of simulation; mean cost-to-go [-] (×10^5) versus time [min.] for m = 1, ..., 21. (b) Performance of UCT-MH and B&B algorithm and their parallelization; best cost found [-] (×10^4) versus time [min.] (0 to 240) for Standalone B&B: 7 cores, Guided B&B: 6 cores, Guiding UCT-MH: 1 core.]
Figure 4. Case Study 2: The UCT-MH and B&B algorithm, number of tasks $n = 20$, maximum number of AMRs: $m_{max} = 21$, $k_{max}^\top = [7, 7, 7]$.

2) $n = 20$ and $m_{max} = 21$: In Fig. 4a several patterns are observed. While small fleet sizes yield infeasible solutions, larger fleet sizes initially show a transient behavior due to the stochastic exploration. The largest fleet sizes always yield feasible solutions, irrespective of the lower-level decisions. Here, an increase in fleet size results in an incremental increase of the mean cost-to-go which is associated with the fleet cost. Remarkably, Fig. 4b shows that the standalone B&B is initially faster; however, as the guided B&B already starts from a good candidate branch, the underlying TSPTW is expected to be
more difficult to solve. Consequently, the guided B&B discards suboptimal fleets and focuses on local convergence, thereby reducing the overall computation time of the guided B&B.

Table I
EXPERIMENTAL RESULTS - B&B AND UCT-MH

| n | m_max | t_max [h] | Standalone B&B: Cost | Standalone B&B: t_found [min] | Guided B&B: Cost | Guided B&B: Rel. Gap | Guided B&B: t_found [min] | Guided B&B: Reduction | Guiding UCT-MH: Cost | Guiding UCT-MH: Rel. Gap | Guiding UCT-MH: t_found [min] |
|---|---|---|---|---|---|---|---|---|---|---|---|
| 10 | 6 | 2 | 9633.5∗ | 16 | 9633.5∗ | 0.00% | 2.17 | 86.5% | 10190.8 | 5.79% | 78.52 |
| 20 | 21 | 4 | 24,797.0 | 229.02 | 24,761 | −0.15% | 141.35 | 38.3% | 24,891.5 | 0.53% | 27.88 |
| 50 | 30 | 12 | 38,047.3 | 420.43 | 40,638.3 | +6.81% | 212.30 | 49.5% | 53,161.3 | 39.72% | 2.883 |
| 100 | 60 | 24 | N/A† | − | 74,250.5 | − | 358.33 | − | 103,193.0 | 38.98% | 30.05 |

∗ Globally optimal solution.
† No solution was found after 24 hours.

C. Discussion

The time taken to initialize the parallel B&B algorithm is sufficient for the guiding UCT-MH to find a strong candidate fleet that warm starts the guided B&B. The UCT-MH provides a reduction of computation time ranging from 38.3% up to 86.5%. The local convergence of the UCT-MH depends on the problem size due to the time cap imposed at the TSPTW level. As seen in Table I, for a higher number of tasks where the TSPTW is larger, the gap with the best-known solution is greater (∼40%). However, the guided B&B is able to close this gap since it conducts local searches systematically. Further, for the case with 100 tasks, the standalone B&B was unable to find any feasible solution in 24 hours while the UCT-MH provided multiple solutions through its efficient stochastic exploration of the design space.

IV. CONCLUSIONS

In this paper, a hybrid optimization algorithm was developed that uses a Monte-Carlo Tree Search-based metaheuristic (UCT-MH) to guide an exact incremental Branch & Bound algorithm, which solves a real-life Fleet Size and Mix Vehicle Routing Problem with Time Windows. The UCT-MH yields a significant improvement in the computation time and convergence of the B&B by constantly sharing the expected optimal fleet composition as well as the upper bound on the cost. Although in this study MCTS was only employed at the fleet sizing and composition level, future research needs to determine to what depth MCTS can be effective. Moreover, modifications to the selection policy as well as bi-directional communication between the UCT-MH and the B&B algorithm could further improve computation times.

ACKNOWLEDGMENTS

This research was supported by the Ford Motor Company as part of the Ford-OSU Alliance Program.

REFERENCES

[1] J. Morgan, M. Halton, Y. Qiao, and J. G. Breslin, “Industry 4.0 smart reconfigurable manufacturing machines,” pp. 481–506, 4 2021.
[3] R. Yan, L. Jackson, and S. Dunnett, “A study for further exploring the advantages of using multi-load automated guided vehicles,” Journal of Manufacturing Systems, vol. 57, pp. 19–30, 10 2020.
[4] A. Hoff, H. Andersson, M. Christiansen, G. Hasle, and A. Løkketangen, “Industrial aspects and literature survey: Fleet composition and routing,” Computers and Operations Research, vol. 37, no. 12, pp. 2041–2061, 12 2010.
[5] F. Paparella, T. Hofman, and M. Salazar, “Joint optimization of number of vehicles, battery capacity and operations of an electric autonomous mobility-on-demand fleet,” in IEEE 61st Conference on Decision and Control (CDC), 2022, pp. 6284–6291.
[6] A. Wallar, W. Schwarting, J. Alonso-Mora, and D. Rus, “Optimizing multi-class fleet compositions for shared mobility-as-a-service,” in IEEE Intelligent Transportation Systems Conference, 2019, pp. 2998–3005.
[7] G. Desaulniers, O. B. G. Madsen, and S. Ropke, “The Vehicle Routing Problem with Time Windows,” in Vehicle Routing Problems, Methods, and Applications, 2nd ed., 2014, pp. 119–159.
[8] R. Elshaer and H. Awad, “A taxonomic review of metaheuristic algorithms for solving the vehicle routing problem and its variants,” Computers and Industrial Engineering, vol. 140, 2 2020.
[9] I. Boussaïd, J. Lepagnot, and P. Siarry, “A survey on optimization metaheuristics,” Information Sciences, vol. 237, pp. 82–117, 7 2013.
[10] D. Silver, A. Huang, C. J. Maddison, A. Guez, L. Sifre, G. Van Den Driessche, J. Schrittwieser, I. Antonoglou, V. Panneershelvam, M. Lanctot, S. Dieleman, D. Grewe, J. Nham, N. Kalchbrenner, I. Sutskever, T. Lillicrap, M. Leach, K. Kavukcuoglu, T. Graepel, and D. Hassabis, “Mastering the game of Go with deep neural networks and tree search,” Nature, vol. 529, no. 7587, pp. 484–489, 1 2016.
[11] M. Świechowski, K. Godlewski, B. Sawicki, and J. Mańdziuk, “Monte Carlo Tree Search: a review of recent modifications and applications,” Artificial Intelligence Review, 2022.
[12] A. Sabharwal, H. Samulowitz, and C. Reddy, “Guiding Combinatorial Optimization with UCT,” in International Conference on Integration of Artificial Intelligence (AI) and Operations Research (OR) Techniques in Constraint Programming. Springer, 6 2012, pp. 356–361.
[13] B. Kartal, E. Nunes, J. Godoy, and M. Gini, “Monte Carlo Tree Search for Multi-Robot Task Allocation,” in Proceedings of the AAAI Conference on Artificial Intelligence, 2016, pp. 4222–4223.
[14] S. Edelkamp, M. Gath, C. Greulich, M. Humann, O. Herzog, and M. Lawo, “Monte-Carlo Tree Search for Logistics,” in Lecture Notes in Logistics. Springer Cham, 2015, pp. 427–440.
[15] C. Barletta, W. Garn, C. Turner, and S. Fallah, “Hybrid fleet capacitated vehicle routing problem with flexible Monte–Carlo Tree search,” International Journal of Systems Science: Operations and Logistics, 2022.
[16] L. Kocsis and C. Szepesvári, “Bandit based Monte-Carlo Planning,” in European Conference on Machine Learning, 9 2006, pp. 282–293.
[17] J. Lu, Y. Chen, J.-K. Hao, and R. He, “The time-dependent electric vehicle routing problem: Model and solution,” Expert Systems with Applications, vol. 161, p. 113593, 2020.
[18] I. Kucukoglu, R. Dewil, and D. Cattrysse, “The electric vehicle routing problem and its variations: A literature review,” Computers & Industrial Engineering, vol. 161, p. 107650, 2021.
[19] M. Goutham, S. Boyle, M. Menon, S. Mohan, S. Garrow, and S. Stockar, “Optimal path planning through a sequence of waypoints,” IEEE Robotics and Automation Letters, vol. 7, no. 4, pp. 8566–8573, 2022.
[20] C. B. Browne, E. Powley, D. Whitehouse, S. M. Lucas, P. I. Cowling, P. Rohlfshagen, S. Tavener, D. Perez, S. Samothrakis, and S. Colton, “A
[2] Z. Ghelichi and S. Kilaru, “Analytical models for collaborative au- survey of Monte Carlo tree search methods,” pp. 1–43, 3 2012.
tonomous mobile robot solutions in fulfillment centers,” Applied Math- [21] O. S. Center, “Ohio supercomputer center,” 1987. [Online]. Available:
ematical Modelling, vol. 91, pp. 438–457, 3 2021. http://osc.edu/ark:/19495/f5s1ph73
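
The guidance mechanism summarized in the conclusions (a UCT-based search that proposes promising fleet compositions and shares an upper bound on the cost with the B&B) can be illustrated with a minimal sketch. This is not the authors' implementation: the function names `uct_select` and `prune`, the dictionary keys, and the exploration constant are hypothetical, and costs are assumed to be scaled to [0, 1] so that the exploration bonus is on a comparable scale. The first function applies a UCB1-style rule adapted to cost minimization; the second discards B&B nodes whose lower bound cannot improve on the shared incumbent.

```python
import math


def uct_select(children, c=math.sqrt(2)):
    """Return the index of the child to explore next (UCB1 adapted to costs).

    children: list of dicts with keys 'visits' and 'total_cost' (hypothetical
    layout). Costs are assumed to be normalized to [0, 1].
    """
    parent_visits = sum(ch["visits"] for ch in children)
    best_index, best_score = None, math.inf
    for i, ch in enumerate(children):
        if ch["visits"] == 0:
            # Try every option once before comparing statistics.
            return i
        mean_cost = ch["total_cost"] / ch["visits"]
        bonus = c * math.sqrt(math.log(parent_visits) / ch["visits"])
        # Lower is better: subtract the bonus so rarely visited children
        # look optimistic.
        score = mean_cost - bonus
        if score < best_score:
            best_index, best_score = i, score
    return best_index


def prune(nodes, upper_bound):
    """Keep only B&B nodes whose lower bound is below the shared upper bound."""
    return [n for n in nodes if n["lower_bound"] < upper_bound]


if __name__ == "__main__":
    fleets = [
        {"visits": 10, "total_cost": 5.0},  # mean cost 0.5
        {"visits": 10, "total_cost": 2.0},  # mean cost 0.2
    ]
    print("selected child:", uct_select(fleets))
    open_nodes = [{"lower_bound": 3.0}, {"lower_bound": 5.0}, {"lower_bound": 7.0}]
    print("remaining nodes:", prune(open_nodes, upper_bound=5.0))
```

With equal visit counts the exploration bonus is identical for both children, so the choice falls to the lower mean cost. A tighter upper bound from the metaheuristic removes more open nodes, which is the intuition behind the reported improvement in B&B convergence.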

