The Knapsack Problem With Forfeit Sets

This document summarizes a research paper that introduces an extension of the knapsack problem called the Knapsack Problem with Forfeit Sets (KPFS). In KPFS, forfeit costs are associated not with pairs of items, but with subsets of items of arbitrary sizes called forfeit sets. For each forfeit set, there is an allowance parameter that determines how many items can be chosen before incurring a penalty cost. The paper presents a mathematical model for KPFS, proves a special case is polynomially solvable, and introduces three heuristic algorithms - a greedy approach, an extension using Carousel Greedy, and a hybrid memetic/Carousel Greedy algorithm. Computational tests on random


Computers & Operations Research 151 (2023) 106093


The Knapsack Problem with forfeit sets


Ciriaco D’Ambrosio a,∗, Federica Laureana a, Andrea Raiconi b, Gaetano Vitale a
a Department of Mathematics, University of Salerno, Via Giovanni Paolo II 132, 84084 Fisciano (SA), Italy
b Institute for Applied Mathematics "Mauro Picone" (IAC), CNR, Via Pietro Castellino 111, 80131 Naples (NA), Italy

ARTICLE INFO

Keywords: Knapsack Problem; Conflicts; Forfeit sets; Carousel Greedy; Memetic algorithm; Hybrid metaheuristic

ABSTRACT

This work introduces a novel extension of the 0/1 Knapsack Problem in which we consider the existence of so-called forfeit sets. A forfeit set is a subset of items of arbitrary cardinality, such that including a number of its elements that exceeds a predefined allowance threshold implies some penalty costs to be paid in the objective function value. A global upper bound on these allowance violations is also considered. We show that the problem generalizes both the Knapsack Problem with conflicts among item pairs and the Knapsack Problem with forfeit pairs, which have been previously introduced in the literature. We present a polynomial subcase by proving the integrality of its LP relaxation polytope, and we introduce three heuristic approaches, namely a constructive greedy, an algorithm based on the recently introduced Carousel Greedy paradigm and a hybrid Memetic/Carousel Greedy algorithm. Finally, we validate the performance of the proposed algorithms on a set of benchmark instances that consider both random and correlated data.

∗ Corresponding author.
E-mail addresses: [email protected] (C. D’Ambrosio), [email protected] (F. Laureana), [email protected] (A. Raiconi), [email protected] (G. Vitale).

https://doi.org/10.1016/j.cor.2022.106093
Received 14 July 2022; Received in revised form 10 October 2022; Accepted 19 November 2022
Available online 23 November 2022
0305-0548/© 2022 Elsevier Ltd. All rights reserved.

1. Introduction

Motivated by real-world applications, many classical optimization problems have been adapted to take into account the possible occurrence of contrasting choices. In most cases, this occurrence has been modeled through conflicts, that is, pairs of mutually exclusive decisions. Among problems in this area, we recall the Minimum Spanning Tree with Conflicts (MSTC), first introduced in Darmann et al. (2009). In this problem, a set of conflicting edge pairs is considered, and at most one edge can be part of the solution for each pair. Unlike the classical MST problem, MSTC is strongly NP-Hard, except for some polynomial cases, i.e. disjoint conflict pairs or conflict pairs for which the transitive property holds (Darmann et al., 2011; Zhang et al., 2011). The problem was solved heuristically in Zhang et al. (2011), while branch-and-cut approaches were proposed in Samer and Urrutia (2015), Carrabs et al. (2021). Another classical problem with conflicting pairs is the Maximum Flow Problem. In Pferschy and Schauer (2011) two generalizations of the problem were presented, with either negative or positive disjunctive constraints. In the latter case, at least one of the edges belonging to each pair must be used in the solution. The Bin Packing Problem with Conflicts was studied in Gendreau et al. (2004), Epstein and Levin (2008), Muritiba et al. (2010), Sadykov and Vanderbeck (2013), while the 0/1 Knapsack Problem with Conflicts was instead studied, for instance, in Bettinelli et al. (2017), Hifi and Otmani (2012), Pferschy and Schauer (2009, 2017). In the latter problem, a collection of item pairs is considered, such that for each pair at most one item can be included. It is usually referred to as Knapsack Problem with Conflict Graph (KPCG), given that the incompatibilities between items can be represented through an auxiliary graph. In Pferschy and Schauer (2009, 2017), the authors proved the problem to be strongly NP-Hard and presented approximation results for some special cases. In the latter work, the Knapsack Problem with Forcing Graph (in which at least one item for each pair must be included) is also studied. Facet-defining valid inequalities were proposed in Ben Salem et al. (2018), Luiz et al. (2021). Among the proposed resolution approaches, we recall a scatter search metaheuristic (Hifi and Otmani, 2012), branch-and-bound (Bettinelli et al., 2017; Coniglio et al., 2021) and branch-and-cut approaches (Ben Salem et al., 2018). A different variant of the problem considering multiple knapsacks, where conflicting items cannot be included in the same one, is proposed in Basnet (2018), where the author compares the performances of different heuristics. In the quadratic knapsack problem with conflicts, instead, an additional profit is generated whenever two compatible items are chosen. Heuristics for this problem were proposed in Dahmani and Hifi (2021), Shi et al. (2017).

A different variant of the 0/1 Knapsack Problem introducing soft conflicts, or forfeits, was recently introduced in Cerulli et al. (2020). In the Knapsack Problem with Forfeits (KPF), a collection of item pairs (called forfeit pairs), each with an associated penalty (forfeit cost) is considered. For each forfeit pair, the related cost must be deducted from the objective function value (total profit) if both items composing the pair are chosen. In Cerulli et al. (2020), the authors propose a
mathematical model and two heuristic approaches for the problem. In Capobianco et al. (2022), the authors proposed a hybrid metaheuristic approach that combines a genetic algorithm with the Carousel Greedy paradigm (see Cerrone et al., 2017).

In this work, we introduce and study an extension of KPF called Knapsack Problem with Forfeit Sets (KPFS). In this variant, forfeit costs are associated to subsets of items rather than to pairs. The subsets are not necessarily disjoint, and have arbitrary cardinalities.

In this more general scenario, we introduce an allowance parameter representing how many items can be chosen from each set before incurring penalty costs. In more detail, for each set a penalty cost is paid for each violation of the allowance parameter. We also consider a global upper bound on the number of allowance violations.

Both KPF and KPFS model scenarios in which some items represent contrasting objectives, or decisions that would generate a cost if taken together, but in which avoiding all these choices rather than accepting some of them and paying the associated penalties would lead to poor solutions. The more general setting of KPFS can be closer to some real-world cases.

Among the many applications connected with Knapsack problems, KPFS may arise, for instance, in business planning and scheduling scenarios. The most obvious is the selection of specific tasks among a larger set of available ones that may be processed. Each item would correspond to a task whose completion would lead to a profit. Each forfeit set may be seen as a machine needed to process the items it contains, or a specific resource needed for their completion. Each machine may have a basic set-up that makes it able to process a given number of tasks without further costs, modeled by the forfeit set allowance. The allowance may also correspond to the number of tasks that can be processed by the machine in the considered time horizon, without having to pay extras on the salaries for overtime work. Similarly, if forfeit sets model resources, the allowance may represent the initial resource units availability. In the mentioned scenarios, it is easy to understand that violations would induce costs, and that it may be necessary or advisable to have bounds on the number of violations. Furthermore, as will be shown in Section 2, KPFS generalizes both KPCG and KPF, and is therefore of interest as a generalized problem.

Some other variants that introduce penalty costs when specific items are chosen, and that consider item combinations, have been studied as well in the literature. For instance, in the Penalized Knapsack Problem (Ceselli and Righini, 2006; Della Croce et al., 2019), a penalty is associated to each item in addition to its profit, and the maximum penalty among the chosen items must be paid. In the Fixed-Charge Knapsack Problem (Akinc, 2006; Yamada and Takeoka, 2009), items are partitioned in disjoint sets, and each set has an associated penalty (set-up cost) to be paid once if at least one of its elements belongs to the solution. In the Set-Union Knapsack Problem (Goldschmidt et al., 1994; He et al., 2018; Wu and He, 2020) each item is a subset of elements, profits are associated to items and weights to elements. The aim is to maximize the profit associated to the chosen items, and the weight of the solution is given by the sum of the weight of the elements belonging to the union of these items. It is clear, however, that these variants cannot be used to model scenarios such as the one described above. Indeed, in the case of the Penalized Knapsack Problem, a single penalty is paid, and penalties are specific to each item, that is, they do not model relations among items. In the Fixed-Charge Knapsack Problem a single penalty per set is considered, and since the sets are disjoint it is not possible to consider multiple relations (and therefore multiple penalties) associated to a single item. Non-disjoint sets of elements may exist in the Set-Union Knapsack Problem, but each of these sets is taken entirely when an item is chosen, and no penalties are taken into account. For further variants of the Knapsack Problem, the reader may refer to the recent surveys Cacchiani et al. (2022a) and Cacchiani et al. (2022b).

The paper is organized as follows. In Section 2 we formally define the problem by presenting a mixed-integer linear programming formulation for it, along with some properties that allow us to strengthen the bounds on the variables representing violations. Furthermore, we consider a number of problems corresponding to subcases of KPFS. For one of these cases, we prove that the problem is polynomially solvable by showing the integrality of the associated LP polytope (see Theorem 2.1). In Section 3 we present three heuristic algorithms for the general problem. In particular, we propose a constructive greedy, an extension of the greedy based on the recently introduced Carousel Greedy paradigm, and finally a hybrid metaheuristic that embeds the Carousel Greedy approach within a memetic algorithm. Such algorithms are tested on a set of benchmark instances, considering both random and correlated data, as well as instances for the KPF problem; the results of our computational tests are presented and discussed in Section 4. Finally, our concluding remarks are contained in Section 5.

2. Mathematical formulations

Let $X$ be the set of items. We identify each item with a numerical index $j \in \{1, \dots, n\}$. Each item has an associated profit $p_j > 0$ in the set $P$ and weight $w_j \ge 0$ in the set $W$, $j = 1, \dots, n$. As for the classical Knapsack Problem, a solution $S \subseteq X$ is composed of a subset of items. Let $b \ge 0$ be the budget of the knapsack, that is, a bound on the sum of the weights of the chosen items. Furthermore, let $C = \{C^i\}_{i=1,\dots,l}$ be a collection of $l$ forfeit sets, such that $C^i \subseteq X$ and $|C^i| \ge 2$ for $i = 1, \dots, l$. Each set $C^i$ has an associated cost $d_i > 0$ composing the set $D$, and an integer allowance $h_i$ in the set $H$, such that $0 \le h_i \le |C^i|$. Let $n^i_S = |C^i \cap S|$ for a given solution $S$. If $n^i_S > h_i$, a penalty equal to $(n^i_S - h_i) \times d_i$ must be paid. In this case, we say that $n^i_S - h_i$ violations are associated to $C^i$ for solution $S$.

Finally, let $k \ge 0$ be an integer upper bound on the number of violations that can be incurred in a solution. That is, $S$ is feasible if and only if

$$\sum_{i \in \{1, \dots, l\} \,:\, n^i_S > h_i} (n^i_S - h_i) \le k$$

To better understand the problem, consider the example reported in Fig. 1. The first table reports the data related to a 0/1 Knapsack instance with 6 items. For each of them, we report its index $j \in X$, profit $p_j \in P$ and weight $w_j \in W$, along with the budget $b$. The second table reports the additional data needed to define a KPFS instance. In particular, for each of the 3 forfeit sets $C^i \in C$, the table contains the $d_i$ and $h_i$ values, in addition to the threshold $k$.

Consider $S = \{2, 3, 4, 5, 6\}$. This is a feasible solution for the 0/1 Knapsack Problem. Indeed, $\sum_{j \in S} w_j = 15$, which does not exceed the budget $b = 16$. The objective function value of this solution is $\sum_{j \in S} p_j = 19$.

$S$ is also a feasible solution for KPFS. We observe that $n^1_S = 2$, $n^2_S = 4$ and $n^3_S = 2$. Since $h_1 = 1$, $h_2 = 3$ and $h_3 = 1$, we incur 3 violations, which are allowed given the threshold value $k = 3$. For KPFS, the objective function value is $\sum_{j \in S} p_j - (1 \times d_1 + 1 \times d_2 + 1 \times d_3) = 19 - 12 = 7$.

We now consider $S' = \{1, 2, 3, 4\}$, which is also a feasible solution for the two problems. For the 0/1 Knapsack problem, we note that $\sum_{j \in S'} w_j = 16$ and that the objective function value is $\sum_{j \in S'} p_j = 10$. Regarding KPFS, 2 violations are associated to $S'$, both related to $C^1$ ($n^1_{S'} = 3$, $n^2_{S'} = 3$ and $n^3_{S'} = 0$). The objective function value is $\sum_{j \in S'} p_j - (2 \times d_1) = 10 - 2 = 8$. Therefore, despite being a worse solution than $S$ for the classical problem, $S'$ is a better solution for KPFS.

KPFS can be formulated as follows:

$$\max \sum_{j=1}^{n} p_j x_j - \sum_{i=1}^{l} d_i v_i \qquad (1)$$
s.t.
$$\sum_{j=1}^{n} w_j x_j \le b \qquad (2)$$
$$\sum_{i=1}^{l} v_i \le k \qquad (3)$$
$$\sum_{j \in C^i} x_j - v_i \le h_i \qquad \forall i = 1, \dots, l \qquad (4)$$
$$x_j \in \{0, 1\} \qquad \forall j = 1, \dots, n \qquad (5)$$
$$v_i \in \{0, \dots, |C^i| - h_i\} \qquad \forall i = 1, \dots, l \qquad (6)$$

where:

• For each $j \in X$, the binary variable $x_j$ is equal to 1 if $j$ is chosen, and 0 otherwise;
• For each $C^i \in C$, the integer variable $v_i$ represents the number of violations associated to the set.

Fig. 1. Example instance.

The objective function (1) maximizes the profit, which is the sum of the profits of all selected items minus the associated costs. Constraint (2) ensures that the sum of the weights of such items does not exceed the budget. Similarly, Constraint (3) enforces the bound on the number of allowed violations. Constraints (4) represent the relation among the chosen items, the allowance value and the resulting violations for each forfeit set. Finally, Constraints (5)–(6) state the domain of the decision variables.

In the following, we introduce some properties regarding the $v_i$ variables that allow us to relax integrality on them and strengthen their upper bounds.

Proposition 2.1. In Formulation (1)–(6), integrality on $v_i$ variables can be relaxed without compromising the feasibility of the optimal solution.

Proof. Assume that integrality on the $v_i$ variables is relaxed. By effect of Constraints (4), if $\sum_{j \in C^i} x_j > h_i$ for any $i \in \{1, \dots, l\}$, $v_i$ has to be set to a value greater than or equal to $\sum_{j \in C^i} x_j - h_i$. Therefore, given the relaxation of Constraints (6) and Objective Function (1), which forces each $v_i$ to its lowest feasible value, in the optimal solution $v_i$ will be equal to $\max(0, \sum_{j \in C^i} x_j - h_i)$ $\forall i \in \{1, \dots, l\}$. □

Proposition 2.2. For each forfeit set $C^i$ ($i \in \{1, \dots, l\}$), let $\bar{C}^i \subseteq C^i$ be the subset composed of the $h_i$ items with highest profits, with possible ties broken arbitrarily ($|\bar{C}^i| = h_i$). Furthermore, let $\hat{C}^i = \{j \in C^i \setminus \bar{C}^i \mid p_j > d_i\}$. In any optimal solution $S^*$, $n^i_{S^*} \le h_i + |\hat{C}^i|$, and therefore $|\hat{C}^i|$ is a valid upper bound for variable $v_i$ in Formulation (1)–(6).

Proof. Let us suppose that $n^i_{S^*} > h_i + |\hat{C}^i|$. Since $n^i_{S^*} > h_i$, there is at least one violation associated to $C^i$. Furthermore, by construction, there is at least one item $j \in C^i \cap S^*$ such that $p_j \le d_i$. It follows that, by removing $j$ from $S^*$, we obtain a solution that is either equivalent or better. □

From Propositions 2.1 and 2.2, it follows that we can substitute (6) with

$$0 \le v_i \le |\hat{C}^i| \qquad \forall i = 1, \dots, l \qquad (7)$$

Observe that by construction $|\hat{C}^i| \le |C^i| - h_i$, hence the new upper bounds are always either stricter or equal to the ones expressed by (6). In particular, if $\hat{C}^i = \emptyset$, the new bound sets the related $v_i$ variable to 0.

As will be shown in the next subsection, KPFS generalizes KPCG and is, therefore, strongly NP-Hard as well in the general case.

2.1. Subcases

In this section, we present some KPFS subcases that result from appropriate parameter choices, including a polynomially solvable case.

KPCG (Pferschy and Schauer, 2009). We recall that this variant corresponds to a Knapsack Problem with strict conflicts on item pairs. KPFS reduces to KPCG iff $|C^i| = 2$, $h_i = 1$ $\forall i \in \{1, \dots, l\}$ and $k = 0$; indeed:

• if $|C^i| = 2$ $\forall i \in \{1, \dots, l\}$, then $C$ is a set of item pairs;
• if $h_i = 1$ $\forall i \in \{1, \dots, l\}$, then a single item is allowed for each pair before inducing a violation;
• if $k = 0$, no violation is actually allowed.

k-violated Knapsack Problem with Conflict Graph (kKPCG). We may consider a generalization of KPCG resulting from the parameter setting $|C^i| = 2$, $h_i = 1$ $\forall i \in \{1, \dots, l\}$ and $k \ge 0$. In this case, up to $k$ conflict violations are allowed.

k-Violated Knapsack Problem with Conflict Sets (kKPCS). We may further generalize kKPCG by relaxing the assumption on the size of the $C^i$ sets, i.e. $|C^i| \ge 2$, $h_i = 1$ $\forall i \in \{1, \dots, l\}$ and $k \ge 0$.

KPF (Cerulli et al., 2020). Like KPCG, KPF is a special case of kKPCG. In particular, it is equivalent to it in the case in which the number of violations is unbounded. Since each forfeit pair induces at most one violation, KPF corresponds to KPFS with parameter setting $|C^i| = 2$, $h_i = 1$ $\forall i \in \{1, \dots, l\}$ and $k = l$.

Knapsack Problem with Disjoint Forfeit Sets (KPDFS). Let us consider a KPFS instance for which the following condition holds:

$$C^{i_1} \cap C^{i_2} = \emptyset \qquad \forall i_1, i_2 \in \{1, \dots, l\},\ i_1 \ne i_2$$

In this case, with some additional assumptions it is possible to provide a result on the integrality of the associated polytope, which generalizes (Stefanov, 2013, Theorem 14.1). Let us recall the definition of totally unimodular matrix. A matrix $A$ is called totally unimodular iff each square submatrix has determinant equal to 0, +1 or −1.

Theorem 2.1. Given a KPDFS instance, the polytope of the linear relaxation of Formulation (1)–(6) is integral if it is nonempty, $w_j = 1$ $\forall j \in \{1, \dots, n\}$ and $b$ is an integer.

Proof. Let us consider the coefficient matrix $A$. $A$ can be partitioned in $A_1$ and $A_2$, where $A_1$ refers to Constraint (2) while $A_2$ refers to Constraints (3) and (4). Each entry in $A$ belongs to $\{0, +1, -1\}$ and, since the elements of $C$ are disjoint, each column has at most two nonzero entries. Moreover, we observe that for each column of $A$ with two nonzero entries, such entries have the same sign if one of them is in $A_1$ and the other in $A_2$, or opposite signs if both are in $A_2$. Therefore, $A$ is totally unimodular (see Heller and Tompkins, 1956). We can conclude that the polytope $\{\mathbf{y} \in \mathbb{R}^{n+l} \mid A\mathbf{y} = \mathbf{b}\}$ is integral if $\mathbf{b}$ has integer components. □
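
The total unimodularity argument used for Theorem 2.1 can be checked numerically on small instances. The following is a minimal sketch, not part of the original paper: it builds the coefficient matrix of Constraints (2)–(4) with unit weights and tests every square submatrix by brute force. The function names and the two toy instances are assumptions introduced here for illustration, and the enumeration is only practical for very small matrices.

```python
from itertools import combinations


def det(m):
    """Integer determinant by Laplace expansion along the first row."""
    if len(m) == 1:
        return m[0][0]
    total = 0
    for c, entry in enumerate(m[0]):
        if entry == 0:
            continue
        minor = [row[:c] + row[c + 1:] for row in m[1:]]
        total += (-1) ** c * entry * det(minor)
    return total


def is_totally_unimodular(a):
    """Brute force: every square submatrix must have determinant in {-1, 0, 1}."""
    rows, cols = len(a), len(a[0])
    for size in range(1, min(rows, cols) + 1):
        for r in combinations(range(rows), size):
            for c in combinations(range(cols), size):
                sub = [[a[i][j] for j in c] for i in r]
                if det(sub) not in (-1, 0, 1):
                    return False
    return True


def constraint_matrix(n, forfeit_sets):
    """Rows of (2), (3), (4) with w_j = 1; columns are x_1..x_n, v_1..v_l."""
    l = len(forfeit_sets)
    budget_row = [1] * n + [0] * l              # Constraint (2)
    violation_row = [0] * n + [1] * l           # Constraint (3)
    set_rows = []
    for i, members in enumerate(forfeit_sets):  # Constraints (4)
        row = [1 if j in members else 0 for j in range(1, n + 1)] + [0] * l
        row[n + i] = -1
        set_rows.append(row)
    return [budget_row, violation_row] + set_rows


# Hypothetical toy instances: disjoint sets versus pairwise-overlapping pairs.
disjoint = constraint_matrix(5, [{1, 2}, {3, 4, 5}])
overlapping = constraint_matrix(3, [{1, 2}, {1, 3}, {2, 3}])
print(is_totally_unimodular(disjoint))     # True
print(is_totally_unimodular(overlapping))  # False
```

In the second instance each item column has three nonzero entries, so the two-entries-per-column argument of the proof no longer applies, and the check finds a 3 × 3 submatrix with determinant −2.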

3
C. D’Ambrosio et al. Computers and Operations Research 151 (2023) 106093

It is straightforward to observe that the following corollary also holds:

Corollary 2.1. Given a KPDFS instance, the polytope of the linear relaxation of Formulation (1)–(6) is integral if it is nonempty, $w_j = w$ $\forall j \in \{1, \dots, n\}$ for a given value $w > 0$, and $\frac{b}{w}$ is an integer.

It is also easy to find counterexamples on the total unimodularity of $A$ if the forfeit sets are not disjoint. For instance, if we have the forfeit sets $C^{i_1} = \{j_1, j_2\}$, $C^{i_2} = \{j_1, j_3\}$ and $C^{i_3} = \{j_2, j_3\}$, they induce in $A$ the submatrix

$$M = \begin{bmatrix} 1 & 1 & 0 \\ 1 & 0 & 1 \\ 0 & 1 & 1 \end{bmatrix}, \quad \text{with } \det(M) = -2.$$

To the best of our knowledge, kKPCG, kKPCS and KPDFS have never been previously introduced in the scientific literature. Studying these problems in greater detail may represent a promising line of future research.

3. Heuristic algorithms

In this section, we present three heuristic algorithms for the problem, namely a constructive greedy, an enhancement of such approach based on the Carousel Greedy paradigm, and finally a hybrid algorithm combining a memetic metaheuristic with Carousel Greedy.

Algorithm 1 GreedyFS

Input: $X, C, D, H, P, W, b, k, \tau$ ⊳ instance data
Output: $S$ ⊳ solution (subset of chosen items)

1: $S \leftarrow \emptyset$, $b_{res} \leftarrow b$, $k_{curr} \leftarrow 0$
2: $n^i_S \leftarrow 0$ $\forall C^i \in C$, $N^S \leftarrow \{n^i_S \mid C^i \in C\}$
3: while $X \setminus S \ne \emptyset$ do
4:     $X_{iter} \leftarrow \emptyset$
5:     for $j \in X \setminus S \mid w_j \le b_{res}$ do
6:         $(fv_j, fc_j) \leftarrow FutureCosts(j, C, D, H, N^S)$
7:         if $(k_{curr} + fv_j \le k)$ & $(p_j > fc_j)$ then
8:             $X_{iter} \leftarrow X_{iter} \cup \{j\}$
9:         end if
10:     end for
11:     if $X_{iter} = \emptyset$ then
12:         return $S$
13:     end if
14:     $ratio_j \leftarrow \frac{p_j - fc_j}{w_j + \tau fv_j}$ $\forall j \in X_{iter}$
15:     $j^* \leftarrow argmax(ratio)$
16:     $S \leftarrow S \cup \{j^*\}$, $b_{res} \leftarrow b_{res} - w_{j^*}$, $k_{curr} \leftarrow k_{curr} + fv_{j^*}$
17:     $n^i_S \leftarrow n^i_S + 1$ $\forall C^i \in C \mid j^* \in C^i$
18: end while
19: return $S$
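
As an illustration, Algorithm 1 and its auxiliary FutureCosts procedure (Algorithm 2) can be sketched in Python as follows. This is a hedged sketch rather than the authors' implementation: the 0-based item indexing, the function and argument names, the `objective` helper, the treatment of a zero denominator and the toy instance are all assumptions made here for illustration.

```python
def future_costs(j, sets, costs, allowances, counts):
    """Violations and penalty that adding item j would cause (FutureCosts)."""
    fv, fc = 0, 0
    for i, members in enumerate(sets):
        if j in members and counts[i] >= allowances[i]:
            fv += 1
            fc += costs[i]
    return fv, fc


def greedy_fs(profits, weights, sets, costs, allowances, b, k, tau):
    """Greedy construction; items are indexed 0..n-1 (assumption of this sketch)."""
    n = len(profits)
    solution = []                # kept in insertion order
    b_res, k_curr = b, 0
    counts = [0] * len(sets)     # number of chosen items in each forfeit set
    while len(solution) < n:
        best = None              # (ratio, item, future violations)
        for j in range(n):
            if j in solution or weights[j] > b_res:
                continue
            fv, fc = future_costs(j, sets, costs, allowances, counts)
            if k_curr + fv > k or profits[j] <= fc:
                continue
            denom = weights[j] + tau * fv
            # Assumption: a zero denominator is treated as an infinite ratio.
            ratio = (profits[j] - fc) / denom if denom > 0 else float("inf")
            if best is None or ratio > best[0]:
                best = (ratio, j, fv)
        if best is None:         # stop conditions (a)-(c)
            return solution
        _, j_star, fv_star = best
        solution.append(j_star)
        b_res -= weights[j_star]
        k_curr += fv_star
        for i, members in enumerate(sets):
            if j_star in members:
                counts[i] += 1
    return solution              # stop condition (d)


def objective(solution, profits, sets, costs, allowances):
    """Total profit minus d_i for every chosen item beyond the allowance h_i."""
    total = sum(profits[j] for j in solution)
    for members, d, h in zip(sets, costs, allowances):
        total -= d * max(0, len(members & set(solution)) - h)
    return total


# Hypothetical toy instance (not the one of Fig. 1).
profits = [10, 7, 6, 3]
weights = [4, 4, 3, 2]
sets = [{0, 1}, {1, 2, 3}]
costs = [3, 2]
allowances = [1, 1]
chosen = greedy_fs(profits, weights, sets, costs, allowances, b=10, k=1, tau=1)
print(chosen, objective(chosen, profits, sets, costs, allowances))  # [0, 2, 3] 17
```

On this instance the third pick accepts one violation of the second set, because the net profit of item 3 (3 − 2 = 1) is still positive and the bound $k = 1$ is not exceeded.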

3.1. Greedy heuristic

The proposed GreedyFS algorithm is a constructive greedy that iteratively adds to a partial solution new items showing advantageous ratios between profit and weight. However, values are dynamically updated on the basis of the new violations and their related costs that would be introduced by the items. In the following, we describe the heuristic referring, when appropriate, to its pseudocode contained in Algorithm 1.

The input consists of the problem data: the set of items $X$ along with the associated profits $P$ and weights $W$, the set of forfeit sets $C$ along with the associated costs $D$ and allowances $H$, the budget $b$, the bound on total number of allowed violations $k$ and a weighting parameter $\tau \ge 0$.

As mentioned, during the algorithm execution we iteratively build a solution set $S \subseteq X$, representing a feasible solution. One item is added in each iteration of the main loop (lines 3–18), until a stop condition occurs. The values $b_{res}$, $k_{curr}$ and $n^i_S$, $\forall C^i \in C$, represent the residual budget, the current number of violations and the current number of items belonging to $C^i \cap S$, respectively. The set $N^S$ contains all $n^i_S$ values. The initialization of $S$ and the related values is reported in lines 1–2.

Four different stop conditions are considered:

(a) no item can be added without exceeding the budget;
(b) no item can be added without exceeding the bound $k$;
(c) no item would improve the objective function value if added;
(d) $S = X$.

In each iteration, we evaluate the set $X_{iter} \subseteq X \setminus S$, composed of the items that can be added to $S$ without reaching any stop condition. $X_{iter}$ is initialized with the empty set (line 4). For each item $j \in X \setminus S$ that can be added to $S$ without violating the residual budget, we evaluate its future violations $fv_j$ and future costs $fc_j$ (lines 5–6). These are equal to the number of additional violations that would incur by adding $j$ to $S$, and to their overall cost, respectively, and are computed using the auxiliary procedure FutureCosts (Algorithm 2). The procedure simply checks any forfeit set $C^i$ such that $j \in C^i$ and such that $n^i_S$ is greater than or equal to the allowance $h_i$, and computes $fv_j$ and $fc_j$ accordingly. We then verify whether the number of current violations $k_{curr}$ plus $fv_j$ does not violate the threshold $k$, and whether the item profit $p_j$ is greater than $fc_j$; if both conditions hold, $j$ is added to $X_{iter}$ (lines 7–8). For each item $j \in X_{iter}$, we refer to $p_j - fc_j$ as the updated profit of $j$, as it corresponds to the net profit that would be obtained by adding $j$ to $S$ during the current iteration.

Algorithm 2 FutureCosts

Input: $j, C, D, H, N^S$ ⊳ considered item, forfeit sets data
Output: $fv_j, fc_j$ ⊳ future violations and costs associated to item $j$

1: $fv_j \leftarrow 0$, $fc_j \leftarrow 0$
2: for $C^i \in C \mid (j \in C^i)$ & $(n^i_S \ge h_i)$ do
3:     $fv_j \leftarrow fv_j + 1$, $fc_j \leftarrow fc_j + d_i$
4: end for
5: return $(fv_j, fc_j)$

If $X_{iter}$ is empty, it means that at least one of the stop conditions (a)–(c) was reached; therefore, $S$ is returned and the algorithm ends its execution (lines 11–12). Otherwise, for each item $j \in X_{iter}$, we compute the ratio among its updated profit and $w_j + \tau fv_j$, and choose the element $j^* \in X_{iter}$ that maximizes this ratio (lines 14–15). The item $j^*$ is added to $S$, and $b_{res}$, $k_{curr}$ and $n^i_S$ values are updated accordingly (lines 16–17).

Finally, if the algorithm exits from the loop in lines 3–18 because $X \setminus S = \emptyset$, then the trivial stop condition (d) was reached; in this case, $S$ is returned and the algorithm ends (line 19).

3.2. Carousel Greedy

As for many constructive greedy heuristics, the main drawback of GreedyFS is that, due to lack of information about the optimal solution structure, "early" locally optimal choices may result to be disadvantageous, and significantly compromise the final solution quality. For this reason, we decided to enhance GreedyFS using the Carousel Greedy (CG) paradigm (Cerrone et al., 2017). The main intuition behind CG is to iteratively reconsider the greedy choices, starting from the oldest
4
C. D’Ambrosio et al. Computers and Operations Research 151 (2023) 106093

ones, and to eventually allow to substitute them with better ones. CG allows to expand the solution space explored by greedy approaches, within a computational time increase that is usually much smaller than what is requested by metaheuristics. In Cerrone et al. (2017) the approach has been applied successfully by the authors to the minimum cardinality and the minimum weight set cover problems, the minimum label spanning tree problem and the maximum independent set problem, while in Cerulli et al. (2020) it was applied to KPF. Other applications of the method can be found in Carrabs et al. (2017), Cerrone et al. (2019, 2018), Cerulli et al. (2022), Hadi et al. (2019), Kong et al. (2019).

The pseudocode of the proposed CG algorithm, CarouselFS, is provided in Algorithm 3. The algorithm embeds GreedyFS as well as the following two variants:

• InitializedGreedyFS: A (feasible) set of items is provided in input along with the problem parameters, and it is used to initialize the set $S$ instead of the empty set ($k_{curr}$, $b_{res}$ and $N^S$ are initialized accordingly).
• SingleItGreedyFS: As for InitializedGreedyFS, an initialization is provided; furthermore, a single iteration of the main loop (Algorithm 1, lines 4–15) is executed, and the chosen item $j^*$ (if it exists) is returned.

Algorithm 3 CarouselFS

Input: $X, C, D, H, P, W, b, k, \tau, \alpha, \beta$ ⊳ instance data, CG parameters
Output: $S$ ⊳ solution (subset of chosen items)

1: $S \leftarrow GreedyFS(X, C, D, H, P, W, b, k, \tau)$
2: $S \leftarrow RemoveLast(S, \beta)$
3: $size \leftarrow |S|$
4: for $i = 1 \rightarrow \alpha \times size$ do
5:     $S \leftarrow RemoveFirst(S)$
6:     $j^* \leftarrow SingleItGreedyFS(S, X, C, D, H, P, W, b, k, \tau)$
7:     $S \leftarrow S \cup \{j^*\}$
8: end for
9: $S \leftarrow InitializedGreedyFS(S, X, C, D, H, P, W, b, k, \tau)$
10: return $S$

The input of CarouselFS is the same as GreedyFS, with the addition of two parameters, $\alpha \ge 1$ (integer) and $0 \le \beta \le 1$, according to the Carousel Greedy paradigm. In the first step of the algorithm, the solution $S$ is initialized using GreedyFS; the last $\beta \times |S|$ elements are then removed from it (lines 1–2). Let $size$ be the cardinality of $S$ at this point; the main CarouselFS loop is composed of $\alpha \times size$ iterations.

At each iteration, the item that was first added among the ones currently in $S$ is removed, and then replaced with an item chosen using SingleItGreedyFS (lines 5–7). Note that this item may eventually correspond to the one that has just been removed if it still appears to be the most profitable choice.

Finally, at the end of the main loop, we check if the current solution can be improved with the addition of other items using InitializedGreedyFS (line 9).

3.3. Hybrid Memetic/CG algorithm

In this section, we present a metaheuristic that hybridizes the memetic paradigm with the CG approach presented in Section 3.2.

We recall that genetic algorithms (GAs) are metaheuristic approaches that take inspiration from natural selection and population genetics. Given a population of individuals (chromosomes), each representing a feasible problem solution, new members are iteratively generated by merging the features of parent chromosomes using an appropriately designed crossover operator. The fitness of each chromosome to the environment (the considered optimization problem) is evaluated through a fitness function.

Memetic Algorithms (MAs) extend the GA paradigm by including local refinement mechanisms aimed at improving selected individuals. Thus, while the GA scheme provides a global search mechanism (exploration phase) in order to identify promising areas of the search space, the embedded local search executions (exploitation phase) allow to further explore these subregions. For an overview about evolutionary inspired algorithms and their variants, the reader can refer to Cerrone et al. (2016) and Eiben and Smith (2015).

More specifically, in our approach, each new individual is first initialized by combining information coming from two promising parent chromosomes, and then completed using CG. In this sense, the exploration phase of the algorithm may be seen as a multi-start CG in which the initializations are the result of an evolutive process. Finally, in the exploitation phase, we apply to the resulting child chromosome a local search procedure composed of three different neighborhoods. To our knowledge, no approaches combining MA and CG have been previously proposed in the literature.

The pseudocode of our proposed algorithm, called MemeticFS, is reported in Algorithm 4. We present here a general overview of the algorithm, while detailed descriptions of its components are reported in the next subsections. In the following, we use the terms chromosome, individual or solution interchangeably.

The input is composed of all the CarouselFS parameters, plus 5 additional ones, namely:

• $popsize > 0$, the population size;
• $max\_it > 0$, $max\_it\_impr > 0$ ($max\_it \ge max\_it\_impr$), two parameters controlling the number of global iterations;
• $0 \le \gamma \le 1$, $0 \le \delta \le 1$ ($\gamma \ge \delta$), two parameters representing probabilities, used by the crossover operator.

Algorithm 4 MemeticFS

Input: $X, C, D, H, P, W, b, k, \tau, \alpha, \beta, popsize, max\_it, max\_it\_impr, \gamma, \delta$ ⊳ instance data, CG parameters, MA parameters
Output: $S^*$ ⊳ solution (subset of chosen items)

1: $\mathcal{P} \leftarrow RandomPopulation(X, C, D, H, P, W, b, k, popsize)$
2: $S^* \leftarrow BestSol(\mathcal{P}, C, D, H, P, W)$
3: $it\_no\_impr \leftarrow 0$
4: for $i = 1 \rightarrow max\_it$ do
5:     $(S_1, S_2) \leftarrow ChooseParents(\mathcal{P}, C, D, H, P, W)$
6:     $S_3 \leftarrow ChooseReplaced(\mathcal{P}, C, D, H, P, W)$
7:     $S \leftarrow Crossover(S_1, S_2, \gamma, \delta, W, b, k)$
8:     $S \leftarrow InitializedCarouselFS(S, X, C, D, H, P, W, b, k, \tau, \alpha, \beta)$
9:     $S \leftarrow LocalSearch(S, X, C, D, H, P, W, b, k)$
10:     if $ObjF(S) \ge ObjF(S^*)$ then
11:         $S^* \leftarrow S$, $it\_no\_impr \leftarrow 0$
12:     else
13:         $it\_no\_impr \leftarrow it\_no\_impr + 1$
14:     end if
15:     if $it\_no\_impr > max\_it\_impr$ then
16:         return $S^*$
17:     end if
18:     if $S \notin \mathcal{P}$ then
19:         $\mathcal{P} \leftarrow \mathcal{P} \setminus \{S_3\} \cup \{S\}$
20:     end if
21: end for
22: return $S^*$

In the first step of the algorithm (line 1), the population $\mathcal{P}$ is initialized with $popsize$ randomly generated chromosomes. This process is
C. D’Ambrosio et al. Computers and Operations Research 151 (2023) 106093

described in Section 3.3.1. As previously mentioned, each chromosome 𝑆 ∈ 𝒫 corresponds to a feasible solution for the problem and is, therefore, a subset of items that violate neither the budget nor the violations limit. A binary vector encoding is used for the chromosomes. In the algorithm pseudocode and in the following, 𝑂𝑏𝑗𝐹(𝑆) is the objective function value associated with 𝑆. After the population initialization, the incumbent chromosome 𝑆∗ (that is, the chromosome that maximizes 𝑂𝑏𝑗𝐹) is saved (line 2). During the algorithm execution, 𝑆∗ is updated to represent the best solution encountered, while 𝑖𝑡_𝑛𝑜_𝑖𝑚𝑝𝑟 (initialized with value 0 in line 3) stores the number of iterations elapsed since the last 𝑆∗ update.
There are two termination criteria, namely an overall maximum number of iterations (𝑚𝑎𝑥_𝑖𝑡) and a maximum number of iterations since the last improvement of the incumbent best (𝑚𝑎𝑥_𝑖𝑡_𝑖𝑚𝑝𝑟). As soon as one of the two stopping criteria is reached, the algorithm ends and 𝑆∗ is returned.
In each iteration of the main loop (lines 4–21) a new chromosome is generated. Two parent chromosomes 𝑆1 and 𝑆2 are first selected (line 5) through a randomized process aimed at choosing favorable individuals according to the fitness function. Similarly, an unfavorable chromosome 𝑆3 to be replaced is chosen (line 6). The two selection mechanisms and the fitness function used are described in Section 3.3.2. The new individual 𝑆 is initialized by combining 𝑆1 and 𝑆2 through the crossover operator (line 7), which is described in detail in Section 3.3.3. 𝑆 is then refined by running InitializedCarouselFS (line 8), a version of CarouselFS in which the initial greedy solution contains the current elements of 𝑆; that is, in line 1 of Algorithm 3, InitalizedGreedyFS is used instead of GreedyFS. Subsequently, we attempt to further refine 𝑆 by running a local search (LS) algorithm (line 9). This algorithm is described in Section 3.3.4.
We then check if 𝑆 improves 𝑆∗ and, if this is the case, the incumbent solution is updated; the 𝑖𝑡_𝑛𝑜_𝑖𝑚𝑝𝑟 value is also updated accordingly (lines 10–14). If the 𝑚𝑎𝑥_𝑖𝑡_𝑖𝑚𝑝𝑟 termination criterion was not reached (lines 15–17), we check whether 𝑆 was already in 𝒫; if this is the case, it is discarded, otherwise it is added to the population, substituting 𝑆3 (lines 18–20). Given the updated population, a new iteration starts, unless the 𝑚𝑎𝑥_𝑖𝑡 stopping criterion is reached.

3.3.1. Population initialization
As discussed, the population 𝒫 is initialized with 𝑝𝑜𝑝𝑠𝑖𝑧𝑒 randomly generated individuals. They are iteratively generated one by one, using the following procedure:

1. the new chromosome 𝑆 is initialized with ∅;
2. a random permutation 𝜋1, …, 𝜋𝑛 of the set of items is considered;
3. for 𝑗 = 1, …, 𝑛, item 𝜋𝑗 is added to 𝑆 if the resulting chromosome is still feasible and if 𝑂𝑏𝑗𝐹(𝑆) > 0, meaning that it is better than the trivial solution in which no item is chosen. Otherwise, 𝜋𝑗 is discarded.

Each newly generated chromosome 𝑆 is added to 𝒫, unless an identical one already exists in the population, in which case it is discarded. As soon as 𝑝𝑜𝑝𝑠𝑖𝑧𝑒 distinct chromosomes are added to 𝒫, the population initialization phase ends.

3.3.2. Chromosome selection mechanisms and fitness function
The fitness function is used to rank chromosomes belonging to the current population. We decided to rank chromosomes according to their number of violations, with ties broken by taking into account their objective function values. Formally, given a chromosome 𝑆, and recalling that 𝑛^𝑖_𝑆 = |𝐶^𝑖 ∩ 𝑆|, the fitness function value 𝐹𝑖𝑡𝐹(𝑆) is evaluated as follows:

\[ FitF(S) = \frac{1}{ObjF(S)} + \sum_{C^i \in C \,:\, n^i_S > h_i} \left( n^i_S - h_i \right) \]

The lower the 𝐹𝑖𝑡𝐹(𝑆) value, the better the chromosome is ranked. The main idea underlying our fitness function choice is to provide to the CG algorithm embedded in our approach (see Algorithm 4, line 8) initializations containing a limited amount of penalizations. These solutions are then completed by CG and refined by LS, which uses the main problem objective function. In a preliminary test phase, we observed that this choice works better than 𝐹𝑖𝑡𝐹(𝑆) = 𝑂𝑏𝑗𝐹(𝑆).
The two parent chromosomes 𝑆1 and 𝑆2 to be used by the crossover operator are chosen using a tournament selection mechanism, as follows:

1. Two random individuals 𝑆^𝐼 ∈ 𝒫, 𝑆^𝐼𝐼 ∈ 𝒫 (𝑆^𝐼 ≠ 𝑆^𝐼𝐼) are chosen;
2. 𝑆^𝐼 is chosen to be 𝑆1 if 𝐹𝑖𝑡𝐹(𝑆^𝐼) ≤ 𝐹𝑖𝑡𝐹(𝑆^𝐼𝐼), while 𝑆^𝐼𝐼 is chosen to be 𝑆1 otherwise;
3. Two new random individuals 𝑆^𝐼𝐼𝐼 ∈ 𝒫, 𝑆^𝐼𝑉 ∈ 𝒫 (𝑆^𝐼𝐼𝐼 ≠ 𝑆^𝐼𝑉, 𝑆^𝐼𝐼𝐼 ≠ 𝑆1, 𝑆^𝐼𝑉 ≠ 𝑆1) are chosen;
4. 𝑆^𝐼𝐼𝐼 is chosen to be 𝑆2 if 𝐹𝑖𝑡𝐹(𝑆^𝐼𝐼𝐼) ≤ 𝐹𝑖𝑡𝐹(𝑆^𝐼𝑉), while 𝑆^𝐼𝑉 is chosen to be 𝑆2 otherwise.

The chromosome 𝑆3 to be replaced by the newly generated individual in each iteration is instead chosen at random among the 𝑝𝑜𝑝𝑠𝑖𝑧𝑒/2 individuals with the worst ranking according to the fitness function.

3.3.3. Crossover
The crossover operator uses the items belonging to parent chromosomes 𝑆1 and 𝑆2 to initialize the newly generated individual 𝑆. We implemented a randomized crossover working in two steps, as follows:

1. For each item 𝑗 ∈ 𝑆1 ∩ 𝑆2, 𝑗 is added to 𝑆 with probability 𝛾 and discarded otherwise;
2. For each item 𝑗 such that 𝑗 ∈ 𝑆1 ⧵ 𝑆2 or 𝑗 ∈ 𝑆2 ⧵ 𝑆1, 𝑗 is chosen for inclusion in 𝑆 with probability 𝛿. If 𝑗 is chosen, it is actually added to 𝑆 only if the resulting chromosome is feasible.

The second step is performed after all items corresponding to the first step have been considered. Note that since 𝑆1 and 𝑆2 are feasible by construction, no feasibility check is needed for the first step.

3.3.4. Local search
As previously described, we attempt to further improve new individuals resulting from the Crossover and the CG phases using an LS algorithm. In each iteration, LS explores three different neighborhoods of the current solution 𝑆, which are built as follows:

1. Every possible feasible solution 𝑆 ∪ {𝑗}, with 𝑗 ∈ 𝑋 ⧵ 𝑆;
2. Every possible solution 𝑆 ⧵ {𝑗}, with 𝑗 ∈ 𝑆;
3. Every possible feasible solution 𝑆 ⧵ {𝑗′} ∪ {𝑗″}, with 𝑗′ ∈ 𝑆 and 𝑗″ ∈ 𝑋 ⧵ 𝑆.

Note that, by construction, no feasibility check is needed for neighborhoods of the second type. A solution belonging to this neighborhood could improve 𝑆 if removing 𝑗 leads to a reduction in the costs to be paid that exceeds 𝑝𝑗.
The three neighborhoods are explored exhaustively and the solution 𝑆^+ with the best objective function value is saved along the process. At the end of the current iteration, if 𝑂𝑏𝑗𝐹(𝑆^+) > 𝑂𝑏𝑗𝐹(𝑆), we set 𝑆 = 𝑆^+ and a new iteration starts. Otherwise, LS ends its execution and 𝑆 is returned.

4. Computational results

In this section, we summarize the results of our computational tests on CarouselFS and MemeticFS. From now on, for the sake of conciseness, we will refer to these approaches with the names CG and MA, respectively. Furthermore, we will refer to the approach consisting in solving the mathematical formulation proposed in Section 2 using the IBM ILOG CPLEX solver as CPLEX. The next subsections contain the description of the considered benchmark instances (Section 4.1), information about the testing environment and the chosen parameter values


(Section 4.2), a description of the procedure used to compute upper bound values for the 𝑣𝑖 variables (Section 4.3), and the obtained results along with comments on them (Section 4.4). Finally, in Section 4.5, we compare MA with a previously proposed approach to solve the KPF problem.

4.1. Instances

Given the novelty of the proposed problem, we performed tests on newly generated instances. Our instances have a number of items 𝑛 equal to 300, 500, 700, 800 or 1000, and belong to 4 different scenarios, each further composed of instances of 3 different types. The scenarios are the following:

• Scenario 1: 𝑙 = 5 × 𝑛, |𝐶𝑖| chosen at random in the interval [2, …, 𝑛/50] ∀𝑖 = 1, …, 𝑙, ℎ𝑖 = 1 ∀𝑖 = 1, …, 𝑙;
• Scenario 2: 𝑙 = 3 × 𝑛, |𝐶𝑖| chosen at random in the interval [2, …, 𝑛/20] ∀𝑖 = 1, …, 𝑙, ℎ𝑖 = 1 ∀𝑖 = 1, …, 𝑙;
• Scenario 3: 𝑙 = 5 × 𝑛, |𝐶𝑖| chosen at random in the interval [2, …, 𝑛/50] ∀𝑖 = 1, …, 𝑙, ℎ𝑖 chosen at random in the interval [1, …, (2/3)|𝐶𝑖|] ∀𝑖 = 1, …, 𝑙;
• Scenario 4: 𝑙 = 3 × 𝑛, |𝐶𝑖| chosen at random in the interval [2, …, 𝑛/20] ∀𝑖 = 1, …, 𝑙, ℎ𝑖 chosen at random in the interval [1, …, (2/3)|𝐶𝑖|] ∀𝑖 = 1, …, 𝑙.

Scenarios 1 and 3 have a higher number of generally smaller forfeit sets, compared to Scenarios 2 and 4. Hence, Scenarios 1 and 3 model more specific properties involving fewer items each, while the opposite is true in the other two cases. Furthermore, in Scenarios 1–2 a single item is allowed for each set without incurring costs, bringing the concept closer to classical conflicts; indeed, in this case the problem corresponds to the kKPCS problem defined in Section 2. A larger range of allowance values is considered for Scenarios 3–4.
Instance types define the relation between profits, weights and forfeit costs. They are defined as follows:

• Not Correlated (NC): each item weight 𝑤𝑗 ∈ 𝑊 and item profit 𝑝𝑗 ∈ 𝑃 is chosen uniformly at random in the interval [1, …, 30], while each forfeit cost 𝑑𝑖 ∈ 𝐷 is chosen at random in the interval [1, …, 20].
• Correlated (C): Weights and costs are chosen randomly as for the NC instances, while each profit 𝑝𝑗 ∈ 𝑃 is equal to 𝑤𝑗 + 10, where 𝑤𝑗 is the weight referring to the same item 𝑗 ∈ 𝑋.
• Fully Correlated (FC): Weights are chosen randomly as for the previous instance types. Profits are correlated to weights as for the C instances. Each cost 𝑑𝑖 ∈ 𝐷 is computed according to the following formula:

\[ d_i = \left\lfloor \frac{\sum_{j \in \bar{C}_i^{+}} w_j}{|C^i|} \right\rfloor \]

where \bar{C}_i^{+} is the subset composed of the ℎ𝑖 + 1 items with highest profits in 𝐶^𝑖.

Knapsack instances with correlated profits are known to be harder than random ones, and have been used in the literature to test KPCG approaches as well (see for instance Bettinelli et al., 2017). With FC instances, we extend the correlation concept to forfeit costs.
The other instance parameters have been chosen as follows:

• 𝑏 = (𝑤_𝑚𝑖𝑛 + 𝑤_𝑚𝑎𝑥)/2 × 𝑛/10, where 𝑤_𝑚𝑖𝑛 and 𝑤_𝑚𝑎𝑥 are the extremes of the interval for 𝑤𝑗 values (1 and 30 respectively, in our case);
• For Scenarios 1–2, 𝑘 = 𝑛/15, 𝑛/25, 𝑛/35, 𝑛/45, 𝑛/55 (rounded to the nearest integer) when 𝑛 = 300, 500, 700, 800 or 1000, respectively. For Scenarios 3–4, 𝑘 = 𝑛/15 rounded to the nearest integer for all values of 𝑛. Values were chosen after verifying experimentally that for Scenarios 1–2 the cardinality of solutions and, accordingly, the number of violations tend to decrease as 𝑛 increases, while the opposite happens for Scenarios 3–4.

For each scenario, instance type and value of 𝑛, 10 different instances were generated. Furthermore, to have a fair comparison among instances of different types, the ones corresponding to the same scenario and value of 𝑛 have equal item weights and the same forfeit sets. Overall, each scenario has 50 different instances for each instance type, for a total of 600 instances. In the following, we refer to each subset of 10 instances generated according to the same features as an instance group. Instance files can be downloaded from http://www.dipmat2.unisa.it/people/cdambrosio/www/DataSet/KPFS_instances.zip.

4.2. Test environment and parameter values

All our algorithms were coded using the C++ programming language. For the CPLEX approach, the Concert Library of IBM ILOG CPLEX 12.10 was used. All tests were run on a workstation with an Intel Xeon CPU E5-2650 v3 processor running at 2.3 GHz, with 128 GB of RAM.
A preliminary tuning phase was run to choose the GreedyFS, CG and MA parameters. GreedyFS has a single parameter 𝜏 which, for each item that is a candidate for inclusion, balances its weight against the number of violations it would incur in the future; the value 𝜏 = 5 gave the best results in our preliminary tests. For CG, we set 𝛼 = 2 since further iterations did not bring significant improvements, while within the range of values suggested in Cerrone et al. (2017), 𝛽 = 0.05 provided the best results. Finally, for MA, parameters 𝛾 = 0.7, 𝛿 = 0.3 were chosen for the randomized crossover, since they provided a good trade-off between inheritance of good features from both parents and diversification. Higher values for 𝛾 and lower values for 𝛿 resulted in a faster convergence and worse results. For the population size, according to our tests, the value 𝑝𝑜𝑝𝑠𝑖𝑧𝑒 = 50 was used, since it appeared to maintain a good level of diversity in the population.
Furthermore, for Scenarios 1–2 the value 𝑚𝑎𝑥_𝑖𝑡_𝑖𝑚𝑝𝑟 = 250 was chosen, while 𝑚𝑎𝑥_𝑖𝑡_𝑖𝑚𝑝𝑟 = 150 was chosen for Scenarios 3–4, with a global upper bound 𝑚𝑎𝑥_𝑖𝑡 = 1000 on the number of iterations for both cases. As will be shown, as an effect of the stricter allowances, Scenarios 1–2 have solutions with fewer items, therefore the local search step for these solutions generally explores smaller neighborhoods. For this reason, we increased the 𝑚𝑎𝑥_𝑖𝑡_𝑖𝑚𝑝𝑟 value to prevent the premature convergence of MA. As will be shown, despite this choice, MA is usually faster for Scenarios 1–2 than for Scenarios 3–4.

4.3. Upper bounds computation for the 𝑣𝑖 variables

For each test instance, we computed in preprocessing the cardinality of Ĉ^𝑖 for each forfeit set 𝐶^𝑖 ∈ 𝐶, that is, the values of the bounds (7) in the MILP formulation solved by CPLEX.
The preprocessing procedure is composed of two steps. In the first step, given each forfeit set, the procedure computes a list of its items, sorted by profit in descending order. Iterated for all the 𝑙 sets, the complexity of the first part is 𝑂(𝑙 ⋅ 𝑛 log 𝑛). Since in our instances 𝑙 = 𝑂(𝑛), it can be rewritten as 𝑂(𝑛² log 𝑛).
Given each sorted list obtained in the first step, corresponding to the generic set 𝐶𝑖, the algorithm then scans it starting from the (ℎ𝑖 + 1)-th position, looking for the first item 𝑗 such that 𝑝𝑗 ≤ 𝑑𝑖. The number of items encountered before reaching this condition (possibly 0, or |𝐶𝑖| − ℎ𝑖 if it is never reached) corresponds to |Ĉ^𝑖|. Iterated for all forfeit sets, the complexity of the second step is 𝑂(𝑙 ⋅ 𝑛), which in our case corresponds to 𝑂(𝑛²).
In our tests, the actual time needed to compute the bounds for each instance was never higher than 0.01 s, therefore we considered it negligible and do not discuss it further in this section.
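The two-step preprocessing of Section 4.3 can be sketched as follows. This is a minimal illustration in Python (the authors' implementation is in C++ and is not shown in the text); the function name `violation_upper_bound`, its argument names and the toy data are assumptions introduced here, not taken from the paper.

```python
def violation_upper_bound(set_items, profits, allowance, cost):
    """Return the bound |C-hat^i| for one forfeit set (hypothetical helper).

    set_items: item indices of the forfeit set C_i
    profits:   mapping item index -> profit p_j
    allowance: h_i, number of items allowed without paying
    cost:      d_i, forfeit cost of the set
    """
    # Step 1: profits of the set's items, sorted in descending order.
    ranked = sorted((profits[j] for j in set_items), reverse=True)
    # Step 2: scan from the (h_i + 1)-th position and stop at the first
    # item whose profit does not exceed the forfeit cost (p_j <= d_i).
    count = 0
    for p in ranked[allowance:]:
        if p <= cost:
            break
        count += 1
    return count


# Toy data (invented for illustration only).
profits = {0: 30, 1: 25, 2: 12, 3: 8, 4: 3}
forfeit_set = [0, 1, 2, 3, 4]
print(violation_upper_bound(forfeit_set, profits, allowance=1, cost=10))  # prints 2
```

The returned value ranges from 0 (the item in position ℎ𝑖 + 1 is already not worth the cost) to |𝐶𝑖| − ℎ𝑖 (the condition is never reached), matching the two extreme cases mentioned above.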


Table 1
Heuristics comparison for Scenarios 1–2.
            Scenario 1                                 Scenario 2
            CG               MA                        CG               MA
Type  n     Sol     Time(s)  Sol     Time(s)  Imp.(%)  Sol     Time(s)  Sol     Time(s)  Imp.(%)
NC    300   588.1   0.01     681.4   3.86     16.25    247.2   0.01     299.1   1.37     22.24
NC    500   468.7   0.01     560.8   11.95    20.39    177.8   0.01     226.7   2.45     29.21
NC    700   365.0   0.01     492.0   19.85    35.39    140.8   0.01     186.3   3.24     33.98
NC    800   368.2   0.01     457.1   32.62    24.58    136.5   0.01     174.3   3.80     30.45
NC    1000  300.4   0.01     404.6   45.30    35.63    103.3   0.01     144.0   4.31     41.31
C     300   696.9   0.01     768.7   2.59     10.51    329.1   0.01     443.9   1.89     36.50
C     500   621.3   0.01     842.2   18.96    36.13    230.3   0.01     346.5   3.16     54.07
C     700   511.2   0.01     727.5   43.53    44.42    193.2   0.01     273.7   4.29     44.65
C     800   475.7   0.01     667.3   54.60    41.58    169.8   0.01     249.8   4.99     56.25
C     1000  396.7   0.01     598.2   73.40    51.94    144.2   0.01     213.7   6.56     50.17
FC    300   667.7   0.01     751.9   3.03     12.73    360.1   0.01     465.4   2.21     29.91
FC    500   637.6   0.01     819.2   17.38    29.42    320.8   0.01     406.5   4.61     28.13
FC    700   576.1   0.01     742.7   41.45    29.43    270.6   0.01     348.9   5.25     30.97
FC    800   505.2   0.01     706.0   55.45    40.28    261.0   0.01     325.9   7.09     26.61
FC    1000  468.4   0.01     641.9   78.17    38.41    226.3   0.01     289.9   11.30    30.75
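The Imp.(%) columns of Table 1 are described in the text as averages of per-instance improvements, not as the improvement between the averaged Sol columns. The sketch below shows the difference between the two readings; the function names and the two-instance example are assumptions made for illustration, since per-instance values are not reported in the paper.

```python
from statistics import mean


def improvement_percent(sol_ma, sol_cg):
    """Percentage improvement of MA over CG on one instance."""
    return 100.0 * (sol_ma - sol_cg) / sol_cg


def group_improvement(ma_values, cg_values):
    """Average of per-instance improvements (the Imp.(%) definition)."""
    return mean(improvement_percent(m, c) for m, c in zip(ma_values, cg_values))


def improvement_of_means(ma_values, cg_values):
    """Improvement computed on the averaged solution values instead."""
    return improvement_percent(mean(ma_values), mean(cg_values))


# Invented two-instance group: the two aggregations disagree.
cg = [100.0, 400.0]
ma = [150.0, 420.0]
print(group_improvement(ma, cg))     # average of 50% and 5%
print(improvement_of_means(ma, cg))  # improvement from 250 to 285

# Table 1, Scenario 1, Type NC, n = 300: the averaged Sol values give about
# 15.86%, while the reported Imp.(%) is 16.25, as expected when the
# improvement is averaged instance by instance.
print(round(improvement_percent(681.4, 588.1), 2))
```

This is why the Imp.(%) entries cannot in general be recomputed exactly from the Sol columns of the same row.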

4.4. Results discussion

We start with a comparison of CG and MA. Since CG is a direct extension of GreedyFS and is already (as will be shown) very fast, we omit the results of the base heuristic. On the other hand, while MA embeds CG and is therefore expected to provide better results, it is a significantly more complex algorithm. We therefore find it of interest to compare the performances of the two approaches, in order to evaluate their trade-offs in terms of solution quality and computational time.
The results of this comparison are contained in Table 1 for Scenarios 1–2, and in Table 2 for Scenarios 3–4. For each scenario, the tables report averages computed over the 10 instances corresponding to the same group. For each approach, headings Sol and Time(s) refer to solution values and computational times in seconds, respectively. The columns under the Imp.(%) heading contain the percentage improvements obtained by MA with respect to CG. In more detail, let 𝑆𝑜𝑙(𝐻, 𝑎) be the solution value computed by algorithm 𝐻 for instance 𝑎; each reported value is an average of the values 100 × (𝑆𝑜𝑙(𝑀𝐴, 𝑎) − 𝑆𝑜𝑙(𝐶𝐺, 𝑎)) / 𝑆𝑜𝑙(𝐶𝐺, 𝑎) computed on the related 10 instances.
CG is an extremely fast heuristic, never averaging more than 0.05 s on any instance group. In particular, for Scenarios 1–2 the computational time was always within or below 0.01 s; for the sake of clarity, we rounded some of these values to 0.01 even if the actual value was below 0.005 s. MA is a significantly more complex algorithm; however, its computational times remain very reasonable. Overall, the highest average time is 200.31 s, corresponding to Scenario 3, Type NC, 𝑛 = 1000. However, average times are below 100 s for 57 out of 60 instance groups, and below 50 s for 47 groups.
We can note some trends with respect to computational times. First of all, comparing the solution values in Table 1 with those in Table 2, we note that scenarios with higher allowances have higher computational times. These scenarios have higher objective function values as well, which generally correspond (as will be later shown in the comparison with CPLEX results) to solutions composed of more items. This could be easily expected given the higher thresholds before having to pay costs, and corresponds to a higher number of iterations for CG, as well as larger local search neighborhoods. Furthermore, among scenarios within the same table, we note for MA that the ones with a higher number of smaller forfeit sets (1 and 3) have higher computational times than the other two. Again, we may note that these scenarios have higher solution values, and their solutions contain more items, as it is less likely for them to belong to the same forfeit sets.
We note that in all cases MA brings significant improvements with respect to CG. In particular, such improvement is between 10.51% and 56.25% for Scenarios 1–2 (above 25% for 24 out of 30 groups), and between 4.45% and 18.58% for Scenarios 3–4 (above 7% for 17 groups). In general, for all scenarios and instance types, larger improvements are usually connected to instances with higher values of 𝑛. These results clearly show that CG, although fast, cannot be considered effective as a stand-alone approach.
While being significantly slower, MA maintains very good computational times. In the following, in order to assess the quality of these heuristic solutions, we compare MA with solutions obtained by CPLEX. In particular, two different time limit settings were considered for the solver:

• In the first setting, we consider a long time limit of 3 h for each instance. Most of the instances corresponding to Scenarios 2 and 4 and Type NC could be solved to certified optimality within this time limit; therefore, we ran tests on all the corresponding 10 groups. In all other cases, tests were only run for instances with 𝑛 ≤ 500. The comparisons referring to this setting are reported in Tables 3 and 4 for Scenarios 1–2 and 3–4, respectively.
• In the second setting, we aim to compare the quality of the solutions provided by CPLEX and MA within a similar running time. To this end, for all 10 instances composing each group, we considered as time limit for CPLEX the highest computational time required by MA on these instances. We ran these tests on all 600 instances, and the results are contained in Tables 5 and 6 for Scenarios 1–2 and 3–4, respectively.

Let us start with the analysis for the first time limit setting. In Tables 3 and 4, we report the average solution value provided by CPLEX (Sol.), while for each of the two approaches, we report average values for number of violations (Viol.), number of items in the solutions (Size) and computational times in seconds (Time(s)).
Furthermore, for CPLEX, the Opt Gap(%) column contains average optimality gaps for instances that are not solved to optimality within the time limit. When 0 (respectively 10) instances belonging to a group are solved to optimality, the "–" symbol is reported under Time(s) (respectively Opt Gap(%)). When a partial set of instances is solved, the average computational time is computed on this subset, and the optimality gap on the remaining instances. In this case, the number of unsolved instances is also reported in brackets under Opt Gap(%). For MA, the CPLEX Gap(%) column reports average percentage gaps from CPLEX solutions. CPLEX gaps are computed as 100 × (𝑆𝑜𝑙(𝐶𝑃𝐿𝐸𝑋, 𝑎) − 𝑆𝑜𝑙(𝑀𝐴, 𝑎)) / 𝑆𝑜𝑙(𝐶𝑃𝐿𝐸𝑋, 𝑎) for each instance 𝑎. Optimality gaps are computed as 100 × ((𝐵𝐵, 𝑎) − 𝑆𝑜𝑙(𝐶𝑃𝐿𝐸𝑋, 𝑎)) / (𝐵𝐵, 𝑎), where (𝐵𝐵, 𝑎) is the best bound found by CPLEX for 𝑎 within the time limit.
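The two gap measures and the per-group reporting convention described above can be sketched as follows. This is an illustrative Python fragment under stated assumptions: the record layout (`sol_cplex`, `best_bound`, `sol_ma`, `time`, `solved`) and the function names are hypothetical, and the sample numbers are invented rather than taken from the experiments.

```python
from statistics import mean


def cplex_gap(sol_cplex, sol_ma):
    """Percentage gap of MA from CPLEX; negative when MA is better.
    Assumes sol_cplex > 0."""
    return 100.0 * (sol_cplex - sol_ma) / sol_cplex


def optimality_gap(best_bound, sol_cplex):
    """Percentage gap between the CPLEX best bound and the CPLEX solution."""
    return 100.0 * (best_bound - sol_cplex) / best_bound


def summarize_group(runs):
    """Aggregate one instance group following the stated convention:
    time is averaged over solved instances, the optimality gap over
    unsolved ones, and the CPLEX gap over all instances."""
    solved = [r for r in runs if r["solved"]]
    unsolved = [r for r in runs if not r["solved"]]
    return {
        "avg_time_solved": mean(r["time"] for r in solved) if solved else None,
        "avg_opt_gap_unsolved": (
            mean(optimality_gap(r["best_bound"], r["sol_cplex"]) for r in unsolved)
            if unsolved else None
        ),
        "unsolved": len(unsolved),
        "avg_cplex_gap": mean(cplex_gap(r["sol_cplex"], r["sol_ma"]) for r in runs),
    }


# Invented two-instance group: one solved to optimality, one hitting the limit.
runs = [
    {"sol_cplex": 200.0, "best_bound": 200.0, "sol_ma": 198.0, "time": 10.0, "solved": True},
    {"sol_cplex": 100.0, "best_bound": 125.0, "sol_ma": 102.0, "time": 10800.0, "solved": False},
]
print(summarize_group(runs))
```

A `None` entry in the summary plays the role of the "–" symbol in the tables, and the `unsolved` count corresponds to the number reported in brackets under Opt Gap(%).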


Table 2
Heuristics comparison for Scenarios 3–4.
            Scenario 3                                 Scenario 4
            CG               MA                        CG               MA
Type  n     Sol     Time(s)  Sol     Time(s)  Imp.(%)  Sol     Time(s)  Sol     Time(s)  Imp.(%)
NC    300   953.0   0.01     1027.7  1.23     7.96     829.3   0.01     903.0   1.32     8.90
NC    500   1265.5  0.01     1386.8  8.03     9.76     1017.7  0.01     1168.3  13.01    15.02
NC    700   1458.9  0.02     1676.0  57.38    15.08    1191.8  0.02     1401.9  36.40    17.85
NC    800   1537.0  0.02     1813.1  124.10   18.09    1243.0  0.02     1449.4  55.01    16.70
NC    1000  1676.3  0.04     1949.8  200.31   16.37    1280.1  0.04     1513.2  89.64    18.58
C     300   919.4   0.01     960.2   1.23     4.45     855.0   0.01     896.4   1.16     4.88
C     500   1363.3  0.01     1439.4  5.16     5.62     1220.4  0.01     1308.1  5.91     7.24
C     700   1762.7  0.02     1859.0  25.19    5.47     1595.2  0.02     1726.4  16.28    8.33
C     800   1949.8  0.03     2068.7  30.53    6.11     1742.8  0.02     1879.3  35.92    7.88
C     1000  2234.7  0.05     2426.4  86.82    8.63     1844.2  0.04     2150.4  111.68   16.68
FC    300   894.8   0.01     952.0   1.54     6.43     842.3   0.01     889.7   1.23     5.66
FC    500   1343.4  0.01     1423.9  6.12     6.00     1250.7  0.01     1313.6  4.80     5.05
FC    700   1741.1  0.02     1843.9  20.94    5.93     1673.3  0.02     1759.9  15.40    5.26
FC    800   1941.5  0.03     2054.0  34.51    5.83     1805.3  0.03     1937.2  23.37    7.52
FC    1000  2284.1  0.05     2433.1  83.24    6.55     1971.4  0.04     2302.3  77.22    16.99

Table 3
CPLEX (long time limit) - MA comparison for Scenarios 1–2.
            CPLEX                                       MA
Type  n     Sol.    Viol.  Size   Time(s)   Opt Gap(%)  Viol.  Size   Time(s)  CPLEX Gap(%)
Scenario 1
NC    300   684.0   18.1   32.0   3325.66   2.92(1)     18.2   32.2   3.86     0.38
NC    500   561.7   18.1   26.5   –         32.34       18.2   26.5   11.95    0.14
C     300   769.5   9.5    32.9   –         8.35        7.9    32.6   2.59     0.10
C     500   834.1   19.2   29.0   –         32.60       19.5   29.7   18.96    −0.99
FC    300   751.3   0.7    29.5   –         9.45        0.8    29.6   3.03     −0.09
FC    500   802.6   18.9   29.4   –         34.40       19.2   30.4   17.38    −2.08
Scenario 2
NC    300   299.6   12.2   14.2   357.50    –           12.8   14.1   1.37     0.17
NC    500   227.6   9.9    10.4   2642.25   –           9.5    10.1   2.45     0.41
NC    700   186.9   5.1    7.8    5133.72   –           5.5    7.8    3.24     0.31
NC    800   174.3   5.7    7.4    5837.46   0.13(1)     5.4    7.3    3.80     0.00
NC    1000  145.9   6.0    6.5    5869.85   –           7.2    6.3    4.31     1.29
C     300   443.9   18.7   16.6   3158.58   –           18.7   16.6   1.89     0.00
C     500   343.2   14.5   12.2   –         18.19       14.6   12.5   3.16     −0.96
FC    300   464.4   19.7   17.3   5003.56   5.73(4)     19.8   17.3   2.21     −0.22
FC    500   403.4   19.3   13.4   –         26.16       19.2   13.7   4.61     −0.79

Focusing on Scenarios 1–2 (Table 3), we observe that the first one, containing more forfeit sets, is more challenging for CPLEX. Furthermore, within the same scenario, correlated instances are harder to solve. For Scenario 1, CPLEX solved to certified optimality 9 out of 10 instances of Type NC and 𝑛 = 300, with an average computational time of 3325.66 s. For the remaining instance, the returned solution has an optimality gap of 2.92%. No other Scenario 1 instance was solved within 3 h. For Type C and FC, with 𝑛 = 300, the average optimality gap rises to 8.35% and 9.45%, respectively. The effectiveness of CPLEX also degrades with respect to MA for correlated instances. Indeed, while gaps are small but positive for NC instances (up to 0.38% for 𝑛 = 300), they are negative in 3 out of 4 cases for Type C and FC, meaning that MA solutions are usually better, with a peak equal to −2.08% for Type FC, 𝑛 = 500. For the Scenario 1 groups reported in Table 3, computational times for MA are all below 19 s. The algorithm found 5 of the 9 optimal solutions provided by CPLEX for Type NC and 𝑛 = 300.
On Scenario 2, CPLEX found 49 out of 50 certified optimal NC solutions within the time limit, with the remaining one having a gap equal to 0.13%. Computational times grow from 357.50 s for 𝑛 = 300 to 5869.85 s for 𝑛 = 1000. CPLEX gaps are always below 0.5%, except for 𝑛 = 1000, where the gap is equal to 1.29%. Overall, on NC instances MA found the same solution value as CPLEX in 39 out of 50 cases. Regarding correlated instances, CPLEX was able to solve to optimality all 10 instances of Type C and 6 instances of Type FC for 𝑛 = 300, with average computational times of around 3000 and 5000 s, respectively. It is worth noting that in the only case where all optimal solutions are known, the CPLEX gap for MA is 0, while in the remaining 3 cases it is negative, down to −0.96% for Type C, 𝑛 = 500. Indeed, all 16 optimal solutions provided by CPLEX on correlated instances were found by MA. For these Scenario 2 groups, MA always runs within 5 s.
Comparing the size of the solutions and the number of violations for Scenarios 1–2, we note that the two approaches identify solutions with a very similar structure, with a maximum difference of 1.6 violations (Scenario 1, Type C, 𝑛 = 300) and 1 item (Scenario 1, Type FC, 𝑛 = 500). The difference is within 1 violation for 13 out of 15 groups and within 1 item for all groups. We also note that on these scenarios the solution size usually decreases as 𝑛 increases. While this may seem counterintuitive, we recall that for these instances ℎ𝑖 = 1 for each forfeit set 𝐶𝑖, and that the number of forfeit sets is equal to either 5 × 𝑛 or 3 × 𝑛. Hence, as 𝑛 grows, it is more likely that each new item insertion would lead to multiple additional costs to be paid.
We now focus on the results for Scenarios 3–4, contained in Table 4. Again, we observe that correlated instances are harder than uncorrelated ones, and that the scenario with more forfeit sets (Scenario 3) is harder than the other. For Scenario 3, CPLEX solves all NC instances


Table 4
CPLEX (long time limit) - MA comparison for Scenarios 3–4.
            CPLEX                                       MA
Type  n     Sol.    Viol.  Size   Time(s)   Opt Gap(%)  Viol.  Size   Time(s)  CPLEX Gap(%)
Scenario 3
NC    300   1033.3  16.3   48.4   3.04      –           15.6   48.8   1.23     0.55
NC    500   1404.5  28.7   64.8   2086.83   0.50(1)     27.5   65.3   8.03     1.25
C     300   968.3   8.3    52.3   369.69    –           5.0    51.1   1.23     0.84
C     500   1453.8  11.7   70.1   –         2.60        8.4    68.5   5.16     0.99
FC    300   955     0.6    49.5   675.73    –           0.2    48.9   1.54     0.32
FC    500   1435.6  1.8    67.3   –         3.20        2.5    66.8   6.12     0.81
Scenario 4
NC    300   908.7   16.7   41.2   4.85      –           15.3   41.6   1.32     0.63
NC    500   1178.7  29.4   54.7   119.36    –           30.0   55.5   13.01    0.89
NC    700   1428.3  44.6   67.3   4134.69   –           41.8   66.9   36.40    1.86
NC    800   1475.4  47.4   70.1   3538.67   1.35(2)     46.1   69.8   55.01    1.74
NC    1000  1544.4  53.8   73.8   7089.20   1.67(7)     49.1   72.6   89.64    2.02
C     300   906.2   11.3   46.4   165.61    –           6.1    44.6   1.16     1.08
C     500   1321.5  12.4   57.3   8882.10   1.16(9)     8.8    55.7   5.91     1.01
FC    300   895.8   2.5    44.4   224.82    –           3.0    44.1   1.23     0.68
FC    500   1323.4  12.3   59.0   –         1.13        10.9   57.8   4.80     0.74

with 𝑛 ≤ 500 except one, while no instances with 𝑛 = 500 are solved for Type C and FC. Average computational times for 𝑛 = 300 grow from 3.04 s for NC instances to 369.69 and 675.73 s for C and FC ones, respectively. However, looking at optimality gaps, we note that the solutions found by CPLEX are in all cases close to the optimal ones, since the gap is at most 3.20% (Type FC, 𝑛 = 500). The solution gaps for MA, while positive, are all quite small, being equal to 1.25% in the worst case (Type NC, 𝑛 = 500) and below 1% in the other 5 cases. Computational times are within 1.54 s for 𝑛 = 300 and 8.03 s for 𝑛 = 500. Overall, MA found 7 of the optimal solutions provided by CPLEX on Scenario 3.
Regarding Scenario 4, CPLEX solves all NC instances with 𝑛 ≤ 700, 8 out of 10 instances for 𝑛 = 800 and 3 instances with 𝑛 = 1000. Computational times range from 4.85 s to 7089.20 s. With respect to C and FC instances, all instances with 𝑛 = 300 are solved, with an average computational time equal to 165.61 and 224.82 s, respectively. A single instance of type C and 𝑛 = 500 is solved in 8882.10 s. However, for instance groups with unsolved instances the optimality gaps are even lower than the ones observed for Scenario 3, being 1.67% in the worst case (Type NC, 𝑛 = 1000) and below 1.5% in the remaining 3 cases. The solution gaps for MA are again very small, being below 2% in 8 out of 9 cases (with a peak equal to 2.02%) and below 1% in 4 cases. Average computational times grow up to 89.64 s for Type NC, 𝑛 = 1000. Overall, MA found 2 of the optimal solutions provided by CPLEX on Scenario 4.
We observe that, contrary to Scenarios 1–2, the solution size grows with 𝑛, as an effect of the higher allowances. In terms of solution structure, there is again little difference between the two approaches, although there is a higher variability since solutions contain more items. The highest differences in terms of number of violations and items are 5.2 (Scenario 4, Type C, 𝑛 = 300) and 1.8 (Scenario 4, Type C, 𝑛 = 300), respectively. The difference is within 2 violations in 9 out of 15 cases, and within 1.8 items for all groups.
Finally, we analyze the results obtained for the CPLEX runs with the second time limit setting. Results are contained in Table 5 for Scenarios 1–2 and Table 6 for Scenarios 3–4. For each group, we report the average CPLEX solution value (Sol.), the time limit (TimeLim.) and the average optimality gap. We recall that time limits correspond to the highest MA computational time for the related group. Furthermore, we report the average gaps of MA solutions from the CPLEX ones. Gap values are computed as previously discussed. For Scenario 2, in the case of 8 instances belonging to 3 different groups with 𝑛 ≥ 800, CPLEX provided a solution with value 0 (no item chosen). For each of these groups, we computed the average CPLEX gap on the subset of instances with a positive solution value. In these 3 cases, the number of instances used to compute the average is reported in brackets in Table 5, next to the CPLEX gap value.
For Scenarios 1–2, optimality gaps are consistently above 50% for 𝑛 ≥ 700, and range between 22.98% and 69.42% for Scenario 1 and between 38.67% and 88.79% for Scenario 2. Accordingly, the solutions returned by MA are always better (all gaps are negative), often by a noticeable margin.
For Scenario 1, the gap is below −10% in 12 out of 15 cases. Overall, these gaps range from −4.85% to −21.93%. The gaps are even more noticeable for Scenario 2, where they are below −10% in 14 out of 15 cases, and equal to −9.19% in the remaining one. Gaps are always below −25% for 𝑛 ≥ 700, and always below −50% for 𝑛 = 1000.
For Scenarios 3–4, the optimality gaps are significantly smaller, similarly to what was observed with the first time limit setting. In Scenario 3, these gaps are below 10% in 13 out of 15 cases, growing up to 12.87% for instances of Type NC with 𝑛 = 1000. The gaps from MA are again all negative, but CPLEX and MA solutions are much closer than in the previous cases. In particular, we note that all gaps for the NC instances are above −1%, while they are all below this threshold for the harder correlated instances. In more detail, they range between −1.31% and −2.00% for C instances, and between −1.90% and −2.35% for FC instances. For Scenario 4, all optimality gaps are below 5%, and within 4.03% for the NC instances. CPLEX and MA solutions are always very close, with gaps ranging between 1.59% and −0.68%. It is interesting to observe that they are always positive for NC instances, while they become negative in 6 out of 10 cases for the correlated ones, and in particular in 4 out of 5 cases for the instances of type FC. We can conclude that for the settings of Scenario 4 (fewer forfeit sets and higher allowances), although failing to converge, CPLEX quickly finds near-optimal solutions, and the benefits of a well-performing heuristic are less obvious than in the other scenarios.

4.5. Tests on KPF instances

In this section we compare the performance of MA with that of GA-CG (Capobianco et al., 2022), the most effective heuristic approach known to date for KPF. For the comparison, we consider the dataset proposed in the latter work, composed of three instance types, 𝑂, 𝐿𝐾 and 𝑀𝐹. An instance group of size 10 was generated for each instance type and each value of 𝑛 in {500, 700, 800, 1000}, for a total of 120 individual instances. For 𝑂 and 𝑀𝐹 instances, 𝑏 = 3𝑛, while it is equal to 5𝑛 for 𝐿𝐾 instances. Moreover, for 𝑂 and 𝐿𝐾 instances 𝑙 = 6𝑛,

C. D’Ambrosio et al. Computers and Operations Research 151 (2023) 106093

Table 5
CPLEX (short time limit) - MA comparison for Scenarios 1–2.

Inst. group Scenario 1                                     Scenario 2
            CPLEX                          MA              CPLEX                          MA
Type  n     Sol.    TimeLim.  Opt Gap (%)  CPLEX Gap (%)   Sol.    TimeLim.  Opt Gap (%)  CPLEX Gap (%)
NC    300   650.0   4.89      22.98        −4.85           274.9   2.43      42.68        −9.19
NC    500   510.1   18.10     50.06        −10.16          196.0   4.16      60.54        −16.02
NC    700   450.5   31.45     58.36        −9.56           139.5   4.52      72.81        −35.47
NC    800   405.1   61.31     62.72        −13.12          139.0   6.07      73.11        −26.95
NC    1000  342.2   70.27     69.34        −18.55          57.9    6.16      88.79        −51.92(6)
C     300   675.1   3.90      26.74        −14.27          402.8   2.66      41.51        −10.89
C     500   728.3   34.79     45.00        −16.24          279.5   4.31      61.48        −26.77
C     700   624.1   61.71     60.43        −17.10          203.5   6.00      72.15        −35.45
C     800   588.5   68.73     63.21        −13.61          169.6   6.44      77.05        −37.58(9)
C     1000  497.5   93.03     69.42        −20.32          86.3    10.32     88.27        −142.33(7)
FC    300   696.6   4.56      24.22        −7.98           421.0   2.90      38.67        −10.67
FC    500   675.4   33.68     48.93        −21.93          355.9   5.92      52.47        −16.56
FC    700   640.5   68.04     53.42        −16.13          275.5   5.94      66.38        −29.18
FC    800   606.3   78.63     53.77        −16.82          252.0   11.01     65.85        −33.04
FC    1000  536.8   109.52    54.19        −19.92          200.8   15.35     65.94        −51.54

Table 6
CPLEX (short time limit) - MA comparison for Scenarios 3–4.

Inst. group Scenario 3                                     Scenario 4
            CPLEX                          MA              CPLEX                          MA
Type  n     Sol.    TimeLim.  Opt Gap (%)  CPLEX Gap (%)   Sol.    TimeLim.  Opt Gap (%)  CPLEX Gap (%)
NC    300   1026.7  1.83      2.48         −0.09           905.7   2.09      1.24         0.29
NC    500   1380.3  10.57     6.87         −0.50           1173.9  22.32     2.49         0.48
NC    700   1662.0  93.66     9.62         −0.86           1419.2  49.75     3.62         1.22
NC    800   1812.1  199.42    9.05         −0.05           1463.2  78.53     3.70         0.91
NC    1000  1935.9  277.56    12.87        −0.72           1537.5  157.43    4.03         1.59
C     300   945.8   2.33      6.81         −1.53           895.6   1.63      3.89         −0.09
C     500   1420.9  8.62      7.67         −1.31           1299.3  10.09     4.94         −0.68
C     700   1832.7  40.71     9.26         −1.44           1729.3  33.68     4.21         0.17
C     800   2035.6  43.14     9.36         −1.63           1879.5  58.84     4.70         0.01
C     1000  2379.1  114.60    10.56        −2.00           2166.2  193.56    3.96         0.74
FC    300   934.2   2.36      7.53         −1.92           884.9   2.00      4.30         −0.54
FC    500   1391.2  8.39      9.14         −2.35           1305.7  6.24      4.61         −0.61
FC    700   1806.2  30.56     9.29         −2.09           1757.4  29.18     4.17         −0.14
FC    800   2014.7  64.39     9.31         −1.96           1933.9  30.06     4.24         −0.17
FC    1000  2387.9  122.78    9.40         −1.90           2302.6  116.23    4.28         0.01
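
As a quick arithmetic check, the summary figures quoted in the discussion of Scenarios 3–4 can be recomputed from Table 6. The following Python sketch is only illustrative: the values are copied from the table, while the dictionary layout and the helper name `values` are assumptions introduced here and are not part of the original work.

```python
# CPLEX optimality gaps (%) and MA-vs-CPLEX gaps (%) copied from Table 6.
# Keys are (instance type, scenario); each list follows n = 300, 500, 700, 800, 1000.
opt_gap = {
    ("NC", 3): [2.48, 6.87, 9.62, 9.05, 12.87],
    ("C", 3): [6.81, 7.67, 9.26, 9.36, 10.56],
    ("FC", 3): [7.53, 9.14, 9.29, 9.31, 9.40],
    ("NC", 4): [1.24, 2.49, 3.62, 3.70, 4.03],
    ("C", 4): [3.89, 4.94, 4.21, 4.70, 3.96],
    ("FC", 4): [4.30, 4.61, 4.17, 4.24, 4.28],
}
ma_gap = {
    ("NC", 3): [-0.09, -0.50, -0.86, -0.05, -0.72],
    ("C", 3): [-1.53, -1.31, -1.44, -1.63, -2.00],
    ("FC", 3): [-1.92, -2.35, -2.09, -1.96, -1.90],
    ("NC", 4): [0.29, 0.48, 1.22, 0.91, 1.59],
    ("C", 4): [-0.09, -0.68, 0.17, 0.01, 0.74],
    ("FC", 4): [-0.54, -0.61, -0.14, -0.17, 0.01],
}


def values(table, scenario, types=("NC", "C", "FC")):
    """Hypothetical helper: flatten the entries of one scenario for the given types."""
    return [v for t in types for v in table[(t, scenario)]]


# Scenario 3: number of groups with optimality gap below 10%.
s3_below_10 = sum(g < 10 for g in values(opt_gap, 3))
# Scenario 4: largest optimality gap overall and for NC instances only.
s4_max_opt = max(values(opt_gap, 4))
s4_max_opt_nc = max(values(opt_gap, 4, ("NC",)))
# Scenario 4: groups where the MA gap is negative (correlated types, then FC only).
s4_neg_corr = sum(g < 0 for g in values(ma_gap, 4, ("C", "FC")))
s4_neg_fc = sum(g < 0 for g in values(ma_gap, 4, ("FC",)))

print("Scenario 3, opt gap < 10%:", s3_below_10, "of", len(values(opt_gap, 3)))
print("Scenario 4, max opt gap:", s4_max_opt, "(NC only:", s4_max_opt_nc, ")")
print("Scenario 4, negative MA gaps (C and FC):", s4_neg_corr, "of 10")
print("Scenario 4, negative MA gaps (FC):", s4_neg_fc, "of 5")
```

The counts obtained this way agree with the text: 13 of 15 groups below 10% in Scenario 3, and 6 of 10 negative MA gaps for the correlated types in Scenario 4, 4 of them for type FC.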

Table 7
Comparison on KPF instances.

Inst. group GA-CG                              MA
Type  n     Sol.     Time (s)  CPLEX Gap (%)   Sol.     Time (s)  CPLEX Gap (%)
O     500   2555.4   163.90    1.53            2579.1   14.38     0.62
O     700   3535.3   506.22    1.70            3563.8   56.05     0.90
O     800   4046.5   798.20    1.60            4075.1   111.30    0.91
O     1000  5005.7   1586.59   1.82            5057.3   312.24    0.80
LK    500   2636.1   223.44    1.87            2676.4   33.64     0.37
LK    700   3686     684.33    1.65            3731.5   139.39    0.43
LK    800   4181.6   1093.46   1.67            4241.3   318.64    0.27
LK    1000  5203.6   2098.57   1.61            5279.4   719.10    0.18
MF    500   2248.9   253.96    1.84            2279.4   40.20     0.51
MF    700   3092.7   740.86    1.92            3142.4   171.98    0.34
MF    800   3583.1   1115.93   1.82            3638.8   236.00    0.29
MF    1000  4471.6   2106.27   1.49            4547.5   551.51    −0.19
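
The running times in Table 7 can be turned into speed-up factors with a few lines of Python. This is a minimal sketch: the numbers are copied from the table, while the data layout and the variable names (`table7`, `speedup`) are illustrative assumptions, not part of the original experiments.

```python
# (GA-CG, MA) average times in seconds and CPLEX gaps (%) copied from Table 7.
table7 = {
    ("O", 500): {"time": (163.90, 14.38), "gap": (1.53, 0.62)},
    ("O", 700): {"time": (506.22, 56.05), "gap": (1.70, 0.90)},
    ("O", 800): {"time": (798.20, 111.30), "gap": (1.60, 0.91)},
    ("O", 1000): {"time": (1586.59, 312.24), "gap": (1.82, 0.80)},
    ("LK", 500): {"time": (223.44, 33.64), "gap": (1.87, 0.37)},
    ("LK", 700): {"time": (684.33, 139.39), "gap": (1.65, 0.43)},
    ("LK", 800): {"time": (1093.46, 318.64), "gap": (1.67, 0.27)},
    ("LK", 1000): {"time": (2098.57, 719.10), "gap": (1.61, 0.18)},
    ("MF", 500): {"time": (253.96, 40.20), "gap": (1.84, 0.51)},
    ("MF", 700): {"time": (740.86, 171.98), "gap": (1.92, 0.34)},
    ("MF", 800): {"time": (1115.93, 236.00), "gap": (1.82, 0.29)},
    ("MF", 1000): {"time": (2106.27, 551.51), "gap": (1.49, -0.19)},
}

# Speed-up of MA over GA-CG: ratio of average running times per instance group.
speedup = {key: row["time"][0] / row["time"][1] for key, row in table7.items()}

for (inst_type, n), factor in sorted(speedup.items()):
    print(f"{inst_type:>2} n={n:<5} MA is {factor:.1f}x faster than GA-CG")

# Ranges of the CPLEX gaps reported for the two algorithms.
ga_gaps = [row["gap"][0] for row in table7.values()]
ma_gaps = [row["gap"][1] for row in table7.values()]
print("GA-CG gap range:", min(ga_gaps), "to", max(ga_gaps))
print("MA gap range:", min(ma_gaps), "to", max(ma_gaps))
```

Note that these are ratios of group averages, so they only approximate per-instance speed-ups.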

while 𝑀𝐹 instances contain 8𝑛 forfeit pairs. For further details on these instances, the reader can refer to Capobianco et al. (2022).

In Table 7 we report average results for each instance group. For each algorithm, we report the solution value, the computational time in seconds and the gap from the solutions found by CPLEX within a 3 h time limit. GA-CG and CPLEX solutions are taken from Capobianco et al. (2022). The tests were run on the same hardware used for the current work.

Regarding MA, the tests were run using the same parameters described in the previous section, without any further tuning for the specific problem, choosing 𝑘 = 𝑙 (that is, the number of violations is unbounded, in agreement with the definition of the KPF problem), ℎ_𝑖 = 1 for each forfeit pair 𝑖 and 𝑚𝑎𝑥_𝑖𝑡_𝑖𝑚𝑝𝑟 = 150.

We can observe that MA is in all cases faster and more effective than GA-CG. Indeed, the CPLEX gaps are always between 1% and 2%


for GA-CG, while they are always below 1% for MA. In particular, for the 𝐿𝐾 and 𝑀𝐹 instances they are always within 0.51%.

For 𝑛 = 1000, MA is around 5, 3 and 4 times faster than GA-CG for 𝑂, 𝐿𝐾 and 𝑀𝐹, respectively. The performance gap grows for smaller instances; for 𝑛 = 500, MA is between 6 and 11.5 times faster.

In comparison with the tests on KPFS instances, MA generally has higher computational times; we believe that this depends both on the problem structure and the specific instance features. Indeed, as mentioned, for KPF the bound on the maximum number of violations is missing; furthermore, both 𝑙 and 𝑏 are usually higher in the dataset proposed in Capobianco et al. (2022). Nevertheless, as discussed, MA is always significantly faster than GA-CG without the need for specific tuning for the KPF problem.

5. Conclusions

In this work we proposed the Knapsack Problem with Forfeit Sets, a novel generalization of the 0/1 Knapsack Problem taking into account multiple and possibly overlapping sets of contrasting decisions, that is, decisions that lead to penalties if taken together. An allowance threshold is considered for each set, as well as a global upper bound on the number of allowance violations. The problem includes as subcases previously studied variants that take into account either strict or soft conflicts on item pairs, and with respect to them it allows modeling a larger number of real-world applications. We presented a mathematical formulation, showed the existence of a polynomial subcase and proposed three heuristic approaches. In particular, our proposed hybrid metaheuristic, combining a memetic algorithm with Carousel Greedy, proved able to produce good-quality solutions on the considered benchmark instances, and to outperform the best previously known algorithm for KPF. In the future, we intend to develop new heuristic and exact approaches for the problem and to extend the concept of forfeit sets to other optimization problems.

CRediT authorship contribution statement

Ciriaco D’Ambrosio: Conceptualization, Methodology, Software, Writing – review & editing. Federica Laureana: Conceptualization, Methodology, Software, Writing – review & editing. Andrea Raiconi: Conceptualization, Methodology, Software, Writing – review & editing. Gaetano Vitale: Conceptualization, Methodology, Software, Writing – review & editing.

Data availability

A link to the data is included in the manuscript.

Acknowledgments

• Ciriaco D’Ambrosio has been supported by the Italian Ministry of University and Research (MUR) and European Union with the program PON “Ricerca e Innovazione” 2014–2020, Azione 1.2 “Mobilità dei Ricercatori” (AIM “Attraction and International Mobility”-LINEA 1), POC R&I 2014–2020.
• Gaetano Vitale has been partially supported by MIUR-PRIN 2017, project ‘Stochastic Models for Complex Systems’, No. 2017JFFHSH.

References

Akinc, U., 2006. Approximate and exact algorithms for the fixed-charge Knapsack problem. European J. Oper. Res. 170, 363–375.
Basnet, C., 2018. Heuristics for the multiple Knapsack problem with conflicts. Int. J. Oper. Res. 32 (4), 514–525.
Ben Salem, M., Taktak, R., Mahjoub, A.R., Ben-Abdallah, H., 2018. Optimization algorithms for the disjunctively constrained Knapsack problem. Soft Comput. 22, 2025–2043.
Bettinelli, A., Cacchiani, V., Malaguti, E., 2017. A branch-and-bound algorithm for the Knapsack problem with conflict graph. INFORMS J. Comput. 29 (3), 457–473.
Cacchiani, V., Iori, M., Locatelli, A., Martello, S., 2022a. Knapsack problems — An overview of recent advances. Part I: Single Knapsack problems. Comput. Oper. Res. 143, 105692.
Cacchiani, V., Iori, M., Locatelli, A., Martello, S., 2022b. Knapsack problems — An overview of recent advances. Part II: Multiple, multidimensional, and quadratic Knapsack problems. Comput. Oper. Res. 143, 105693.
Capobianco, G., D’Ambrosio, C., Pavone, L., Raiconi, A., Vitale, G., Sebastiano, F., 2022. A hybrid metaheuristic for the Knapsack problem with forfeits. Soft Comput. 26, 749–762.
Carrabs, F., Cerrone, C., D’Ambrosio, C., Raiconi, A., 2017. Column generation embedding carousel greedy for the maximum network lifetime problem with interference constraints. In: Springer Proceedings in Mathematics and Statistics, vol. 217, pp. 151–159.
Carrabs, F., Cerulli, R., Pentangelo, R., Raiconi, A., 2021. Minimum spanning tree with conflicting edge pairs: A branch-and-cut approach. Ann. Oper. Res. 298, 65–78.
Cerrone, C., Cerulli, R., Gaudioso, M., 2016. OMEGA one multi ethnic genetic approach. Optim. Lett. 10 (2), 309–324.
Cerrone, C., Cerulli, R., Golden, B., 2017. Carousel greedy: A generalized greedy algorithm with applications in optimization. Comput. Oper. Res. 85, 97–112.
Cerrone, C., D’Ambrosio, C., Raiconi, A., 2019. Heuristics for the strong generalized minimum label spanning tree problem. Networks 74 (2), 148–160.
Cerrone, C., Gentili, M., D’Ambrosio, C., Cerulli, R., 2018. An efficient and simple approach to solve a distribution problem. In: AIRO Springer Series, vol. 1, pp. 151–159.
Cerulli, R., D’Ambrosio, C., Iossa, A., Palmieri, F., 2022. Maximum network lifetime problem with time slots and coverage constraints: Heuristic approaches. J. Supercomput. 78 (1), 1330–1355.
Cerulli, R., D’Ambrosio, C., Raiconi, A., Vitale, G., 2020. The Knapsack problem with forfeits. In: Combinatorial Optimization. 5th International Symposium, ISCO 2020. In: Lecture Notes in Computer Science, vol. 12176, pp. 263–272.
Ceselli, A., Righini, G., 2006. An optimization algorithm for a penalized Knapsack problem. Oper. Res. Lett. 34, 394–404.
Coniglio, S., Furini, F., San Segundo, P., 2021. A new combinatorial branch-and-bound algorithm for the Knapsack problem with conflicts. European J. Oper. Res. 289 (2), 435–455.
Dahmani, I., Hifi, M., 2021. A modified descent method-based heuristic for binary quadratic Knapsack problems with conflict graphs. Ann. Oper. Res. 298, 125–147.
Darmann, A., Pferschy, U., Schauer, J., 2009. Determining a minimum spanning tree with disjunctive constraints. In: Lecture Notes in Computer Science (Including Subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics), vol. 5783 LNAI, pp. 414–423.
Darmann, A., Pferschy, U., Schauer, J., Woeginger, G., 2011. Paths, trees and matchings under disjunctive constraints. Discrete Appl. Math. 159 (16), 1726–1735.
Della Croce, F., Pferschy, U., Scatamacchia, R., 2019. New exact approaches and approximation results for the Penalized Knapsack problem. Discrete Appl. Math. 253, 122–135.
Eiben, A.E., Smith, J.E., 2015. Introduction to Evolutionary Computing, second ed. In: Natural Computing Series, Springer-Verlag Berlin Heidelberg, Berlin, Heidelberg.
Epstein, L., Levin, A., 2008. On bin packing with conflicts. SIAM J. Optim. 19 (3), 1270–1298.
Gendreau, M., Laporte, G., Semet, F., 2004. Heuristics and lower bounds for the bin packing problem with conflicts. Comput. Oper. Res. 31 (3), 347–358.
Goldschmidt, O., Nehme, D., Yu, G., 1994. Note: On the set-union Knapsack problem. Nav. Res. Logist. 41, 833–842.
Hadi, K., Lasri, R., El Abderrahmani, A., 2019. An efficient approach for sentiment analysis in a big data environment. Int. J. Eng. Adv. Technol. 8 (4), 263–266.
He, Y., Xie, H., Wong, T.-L., Wang, X., 2018. A novel binary artificial bee colony algorithm for the set-union Knapsack problem. Future Gener. Comput. Syst. 78, 77–86.
Heller, I., Tompkins, C., 1956. An extension of a theorem of Dantzig’s. In: Linear Inequalities and Related Systems, vol. 38. Princeton University Press, Princeton, pp. 247–254.
Hifi, M., Otmani, N., 2012. An algorithm for the disjunctively constrained Knapsack problem. Int. J. Oper. Res. 13 (1), 22–43.
Kong, H., Kang, Q., Li, W., Liu, C., Kang, Y., He, H., 2019. A hybrid iterated carousel greedy algorithm for community detection in complex networks. Physica A 536, 122124.
Luiz, T.A., Santos, H.G., Uchoa, E., 2021. Cover by disjoint cliques cuts for the Knapsack problem with conflicting items. Oper. Res. Lett. 49 (6), 844–850.
Muritiba, A.E.F., Iori, M., Malaguti, E., Toth, P., 2010. Algorithms for the bin packing problem with conflicts. INFORMS J. Comput. 22 (3), 401–415.
Pferschy, U., Schauer, J., 2009. The Knapsack problem with conflict graphs. J. Graph Algorithms Appl. 13 (2), 233–249.
Pferschy, U., Schauer, J., 2011. The maximum flow problem with conflict and forcing conditions. In: International Conference on Network Optimization. Springer, pp. 289–294.
Pferschy, U., Schauer, J., 2017. Approximation of Knapsack problems with conflict and forcing graphs. J. Comb. Optim. 33 (4), 1300–1323.


Sadykov, R., Vanderbeck, F., 2013. Bin packing with conflicts: A generic branch-and-price algorithm. INFORMS J. Comput. 25 (2), 244–255.
Samer, P., Urrutia, S., 2015. A branch and cut algorithm for minimum spanning trees under conflict constraints. Optim. Lett. 9 (1), 41–55.
Shi, X., Wu, L., Meng, X., 2017. A new optimization model for the sustainable development: Quadratic Knapsack problem with conflict graphs. Sustainability 9 (2), 236.
Stefanov, S.M., 2013. Separable Programming: Theory and Methods. Springer New York, New York.
Wu, C., He, Y., 2020. Solving the set-union Knapsack problem by a novel hybrid Jaya algorithm. Soft Comput. 24 (3), 1883–1902.
Yamada, T., Takeoka, T., 2009. An exact algorithm for the fixed-charge multiple Knapsack problem. European J. Oper. Res. 192, 700–705.
Zhang, R., Kabadi, S.N., Punnen, A.P., 2011. The minimum spanning tree problem with conflict constraints and its variations. Discrete Optim. 8 (2), 191–205.
