1 Introduction

The Maximally Diverse Grouping Problem (MDGP) is a well-known and well-investigated combinatorial optimization problem. Given a set of items \(i \in I\) with a pairwise diversity \(d_{ij} \ge 0\), the task is to assign the items to groups \(g \in G\) such that each group g receives at least \(l_g\) and at most \(u_g\) items and the within-group diversity, summed over all groups, is maximized. The MDGP is an important combinatorial optimization problem for two reasons: First, it has a wide field of applications such as the assignment of students to project groups (Beheshtian-Ardekani & Mahmood, 1986) or teams (Dias & Borges, 2017), the assignment of pupils to tutor groups (Baker & Benn, 2001) or of children to equally strong sport teams (Rubin & Bai, 2015). Moreover, there are applications in final exam scheduling, VLSI design (Weitz & Lakshminarayanan, 1998), and anticlustering. Like the MDGP, anticlustering aims to partition items into disjoint groups such that groups are similar but within-group heterogeneity is high (Brusco et al., 2020; Papenberg, 2024). Applications include the assignment of participants to groups (Batista et al., 2023) and the division of data sets for cross validation (Papenberg & Klau, 2021). Second, the MDGP is NP-hard to solve (Feo & Khellaf, 1990) although it can be formulated as a short integer program, which can easily be linearized (compare, e.g., Gallego et al., 2013):

$$\begin{aligned}&\max \sum _{g \in G} \sum _{i \in I} \sum _{j \in I:j>i} d_{ij} x_{ig} x_{jg} \end{aligned}$$
(1)
subject to
$$\begin{aligned}&\quad \sum _{g \in G} x_{ig} = 1&\forall i \in I \end{aligned}$$
(2)
$$\begin{aligned}&\quad l_g \le \sum _{i \in I} x_{ig} \le u_g&\forall g \in G \end{aligned}$$
(3)
$$\begin{aligned}&x_{ig} \in \{0,1\}&\forall i \in I, g \in G \end{aligned}$$
(4)
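To make the model concrete, the following minimal sketch evaluates objective (1) and checks the size bounds (3) for a solution given as a group vector. It assumes a symmetric diversity matrix `d` with zero diagonal and 0-based group indices; the function names and the small instance are illustrative and not taken from the paper.

```python
from itertools import combinations


def objective(d, y):
    """Objective (1): sum of d[i][j] over all item pairs i < j in the same group."""
    return sum(d[i][j] for i, j in combinations(range(len(y)), 2) if y[i] == y[j])


def is_feasible(y, lower, upper):
    """Check the size bounds (3); (2) and (4) hold by construction of the vector y."""
    sizes = [0] * len(lower)
    for g in y:
        sizes[g] += 1
    return all(lower[g] <= sizes[g] <= upper[g] for g in range(len(lower)))


# Illustrative instance: four items, two groups of exactly two items each.
d = [[0, 3, 1, 4],
     [3, 0, 2, 5],
     [1, 2, 0, 6],
     [4, 5, 6, 0]]
y = [0, 0, 1, 1]  # y[i] is the group of item i
value = objective(d, y)
feasible = is_feasible(y, lower=[2, 2], upper=[2, 2])
```

Encoding a solution as a group vector makes constraint (2) implicit, since every item has exactly one entry.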

In the study by Gallego et al. (2013), only instances with up to 12 items could be solved to optimality. Given that \(d_{ij}\) values often have a certain structure in practice, e.g. they are the difference of attribute values (Schulz, 2021) or binary values (Mingers & O’Brien, 1995), instances of up to 30–70 items can be solved to proven optimality (Schulz, 2022). If at most two attributes are considered, Schulz (2021) proved that even large instances can be solved efficiently.

However, if we want to solve large instances or instances with general \(d_{ij} \ge 0\) (\(d_{ii} = 0\) for all \(i \in I\)), efficient heuristic solution methods are required. While earlier approaches focussed on construction and local search improvement heuristics (compare the overview by Weitz and Lakshminarayanan (1998)), later papers focussed on different metaheuristic solution approaches. In recent years, Brimberg et al. (2015) developed a skewed general variable neighborhood search, Palubeckis et al. (2015) an iterated tabu search approach, Lai and Hao (2016) an iterated maxima search heuristic, Lai et al. (2021a) a neighborhood decomposition based variable neighborhood search and tabu search, and Yang et al. (2022) a three-phase approach with a dynamic population size. These are the most recent and successful approaches. Please see Lai and Hao (2016) for a more in-depth review of metaheuristic solution approaches for the MDGP.

The focus of the paper at hand is not to develop a new advanced metaheuristic solution approach but to consider the evaluation of neighborhoods, especially insertions and swaps, within these approaches. We present a new method to evaluate these neighborhoods more efficiently. In doing so, the paper is inspired by the neighborhood decomposition approach by Lai et al. (2021a).

Given a feasible solution for the MDGP, i.e. an assignment of each item to exactly one group such that each group has a number of items between \(l_g\) and \(u_g\) assigned, the insertion neighborhood contains all feasible solutions in which exactly one item is assigned to a different group. Thus, given solution y, where \(y_i\) indicates the group of item i, the neighborhood includes all solutions \({\bar{y}}\) such that (2)–(4) are fulfilled and \(y_i = {\bar{y}}_i\) for all but exactly one item \(i \in I\). Correspondingly, the swap neighborhood of y includes all solutions \({\bar{y}}\) such that (2)–(4) are fulfilled, \(y_i = {\bar{y}}_i\) for all items \(i \in I\backslash \{j,j'\}\), \({\bar{y}}_{j'} = y_j\), and \({\bar{y}}_j = y_{j'}\). These two neighborhoods are used in most of the advanced solution methods for the MDGP, including the five advanced algorithms mentioned before as well as, for example, Baker and Powell (2002), Chen et al. (2011), Fan et al. (2011), Palubeckis et al. (2011), Rodriguez et al. (2013), Urošević (2014), and Schulz (2023). In the paper at hand, we present a new efficient method that enhances the neighborhood decomposition (ND) method described in Lai et al. (2021a). It evaluates the two neighborhoods faster than both the standard implementation, which simply evaluates the entire neighborhood, and the ND implementation.

The presented neighborhood evaluation can also be applied to other grouping or clustering problems. These include clustering problems like the capacitated clustering problem (Lai et al., 2021b) and the capacitated p-median problem (Zheng et al., 2021). It can also be applied to the ratio cut and normalized cut graph partitioning problem (Palubeckis, 2022), to vehicle routing problems (e.g. Pfeiffer and Schulz (2022) or Zhou et al. (2023)) or to parallel machine scheduling (Yalaoui & Chu, 2002).

The paper is structured as follows: All three implementations are introduced in Sect. 2. Section 3 presents the general framework used in the computational study, in which all three implementations are evaluated on benchmark instances (Sect. 4). The paper closes with a conclusion (Sect. 5).

2 Implementation of neighborhoods

In this section, we present the three implementations standard, ND, and efficient ND to implement the insertion and the swap neighborhood.

2.1 Standard implementation

In the standard implementation, simply all solutions of the neighborhoods are evaluated. Thereby, a solution is encoded by parameters \(y_i\) and additionally by sets \(I_g = \{ i \in I: y_i = g \}\), \(g \in G\), i.e. \(I_g\) is the set of items assigned to group g in the current solution. The pseudo-code for the insertion neighborhood can be found in Algorithm 1.

Algorithm 1 Standard insertion (pseudo-code, figure a)

By using \(I_g\), we have to check only once for each pair of groups whether they are identical (Line 3). Authors often replace the three for-loops starting in Lines 2–4 by a for-loop over all items, a for-loop over all groups, and an if-statement checking whether the item is in the group or not. Thus, the if-check needs to be done \(|I|\cdot |G|\) times while in the above implementation the check in Line 3 is only done \(|G|^2\) times, where \(|G| \ll |I|\) typically holds. We also use this implementation for Algorithms 1 and 2 (swap neighborhood) to ensure that the neighborhoods are always evaluated in exactly the same way in the three different implementations presented in this paper. By this, we ensure that the search is the same for all three implementations. Thus, if one implementation leads to a higher number of performed iterations due to its more efficient implementation, the best found solution cannot be worse than the best one found with the other two implementations.
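The difference between the two loop structures can be illustrated with a small sketch that enumerates all insertion moves in both ways and counts the identity checks. This is not the paper's pseudo-code. The function names are hypothetical, and the size bounds \(l_g\) and \(u_g\) are deliberately ignored to keep the comparison focused on the loop order.

```python
def insertion_moves_by_group(groups):
    """Loop over group pairs first, then over the items of the source group."""
    checks = 0
    moves = []
    for g in range(len(groups)):
        for g2 in range(len(groups)):
            checks += 1  # one identity check per ordered pair of groups
            if g == g2:
                continue
            for i in groups[g]:
                moves.append((i, g, g2))  # move item i from group g to group g2
    return moves, checks


def insertion_moves_by_item(y, num_groups):
    """Loop over items first, then over groups, checking membership each time."""
    checks = 0
    moves = []
    for i, g in enumerate(y):
        for g2 in range(num_groups):
            checks += 1  # one check per item and group
            if g2 != g:
                moves.append((i, g, g2))
    return moves, checks


# Illustrative solution: six items in three groups of two.
y = [0, 0, 1, 1, 2, 2]
groups = [[0, 1], [2, 3], [4, 5]]  # groups[g] plays the role of I_g
moves_g, checks_g = insertion_moves_by_group(groups)
moves_i, checks_i = insertion_moves_by_item(y, 3)
```

Both variants produce the same set of moves, but the group-wise variant needs \(|G|^2 = 9\) checks here while the item-wise variant needs \(|I|\cdot |G| = 18\).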

Algorithm 2 presents the pseudo-code for a full evaluation of the swap neighborhood.

Algorithm 2 Standard swap (pseudo-code, figure b)

It is well-known in the literature (see e.g. Brimberg et al. (2015)) that inserts and swaps can be evaluated effectively by using matrix \((D_{ig})_{i \in I, g \in G}\) with

$$\begin{aligned} D_{ig} = \sum _{j \in I_g} d_{ij} \end{aligned}$$

which indicates the sum of diversities of item i with all items assigned to group g. By this, the change in the objective value due to a move of item i from group g to group \(g'\) can directly be computed as

$$\begin{aligned} D_{ig'} - D_{ig}. \end{aligned}$$
(5)

For a swap of the groups of items i and j currently assigned to groups g and \(g'\) we obtain the change in the objective value by

$$\begin{aligned} D_{ig'} - D_{ig} + D_{jg} - D_{jg'} - 2 \cdot d_{ij}. \end{aligned}$$
(6)

As \(d_{ij}\) is included in \(D_{ig'}\) and \(D_{jg}\), but i is removed from g and j is removed from \(g'\), we have to subtract \(d_{ij}\) twice. After realizing an insert or a swap, \(D_{ig}\) needs to be updated for all items i and the two involved groups g and \(g'\) by subtracting the diversity with the removed item and adding the diversity with the added item.
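The following minimal sketch shows how matrix \(D\) and the move values (5) and (6) might be implemented and kept up to date. It assumes a symmetric diversity matrix with zero diagonal; all function names are illustrative, and the brute-force `objective` is included only to cross-check the incremental values.

```python
from itertools import combinations


def objective(d, y):
    """Brute-force objective (1), used here only for cross-checking."""
    return sum(d[i][j] for i, j in combinations(range(len(y)), 2) if y[i] == y[j])


def build_D(d, y, num_groups):
    """D[i][g] = sum of d[i][j] over all items j currently assigned to group g."""
    D = [[0] * num_groups for _ in y]
    for i in range(len(y)):
        for j, g in enumerate(y):
            D[i][g] += d[i][j]
    return D


def insert_delta(D, y, i, target):
    """Change in the objective value (5) if item i moves to group `target`."""
    return D[i][target] - D[i][y[i]]


def swap_delta(D, d, y, i, j):
    """Change in the objective value (6) if items i and j exchange their groups."""
    g, g2 = y[i], y[j]
    return D[i][g2] - D[i][g] + D[j][g] - D[j][g2] - 2 * d[i][j]


def apply_insert(D, d, y, i, target):
    """Move item i to `target` and update the two affected columns of D."""
    source = y[i]
    for k in range(len(y)):
        D[k][source] -= d[k][i]
        D[k][target] += d[k][i]
    y[i] = target


def apply_swap(D, d, y, i, j):
    """Swap the groups of items i and j as two consecutive inserts."""
    gi, gj = y[i], y[j]
    apply_insert(D, d, y, i, gj)
    apply_insert(D, d, y, j, gi)


d = [[0, 3, 1, 4],
     [3, 0, 2, 5],
     [1, 2, 0, 6],
     [4, 5, 6, 0]]
y = [0, 0, 1, 1]
D = build_D(d, y, 2)
```

With this bookkeeping, a move is evaluated in constant time and an applied move costs \(O(|I|)\) for the update, instead of recomputing the objective from scratch.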

2.2 Neighborhood decomposition implementation

Lai et al. (2021a) recognized that it is not necessary to evaluate the entire neighborhood in every iteration. In every iteration, the assignment to only two groups is changed. Thus, if we found out that there is no promising insert of an item from group g into group \(g'\) or no promising swap between items of groups g and \(g'\), we do not need to evaluate these inserts or swaps again until at least one of the two groups is changed by removing or adding an item.

As evaluating all inserts of items of group g into group \(g'\) or all swaps between items of groups g and \(g'\) is independent of the evaluation of all inserts/swaps of all other group pairs, Lai et al. (2021a) call the part of the neighborhood containing these inserts/swaps the neighborhood block of groups g and \(g'\). Note that the neighborhood block of g and \(g'\) is identical to the one of \(g'\) and g for the swap neighborhood, but the two blocks differ for the insertion neighborhood. In both cases, all neighborhood blocks \(g,g' \in G\) (for swap with \(g<g'\)) are disjoint and their union is the entire neighborhood.

Lai et al. (2021a) introduced two zero–one matrices \(W^1 = (W^1_{gg'})_{g,g'\in G}\) and \(W^2 = (W^2_{gg'})_{g,g' \in G}\) to indicate whether the neighborhood block including groups g and \(g'\) needs to be evaluated. If an entry of the matrices is 1, the inserts/swaps between the corresponding groups need to be evaluated. If an entry is 0, the neighborhood block can be skipped, as we already know that it does not contain any promising insert/swap.

Note that \(W^2\) is symmetric while \(W^1\) is not. It might be that there is no promising insert of an element of group g into group \(g'\), but there is one in the opposite direction. If an item is added to or removed from a group g, \(W^1_{gg'}\), \(W^1_{g'g}\), \(W^2_{gg'}\), and \(W^2_{g'g}\) are set to 1 for all \(g' \in G\backslash \{g\}\), i.e. they have to be re-evaluated (last line of Algorithm 3 and 4, respectively). Lai et al. (2021a) call the procedure neighborhood decomposition, as the neighborhoods are decomposed for each pair of groups \(g,g' \in G\) into one independent block (swap) and two independent blocks (insert), respectively. We write for short ND instead of neighborhood decomposition in the following. The pseudo-code for the insertion and the swap neighborhood using the ND implementation are presented in Algorithms 3 and 4, respectively.

Algorithm 3 ND insertion (pseudo-code, figure c)

Algorithm 4 ND swap (pseudo-code, figure d)

It can clearly be seen that the neighborhoods can be evaluated more efficiently than in the standard implementation if the block of a group pair g and \(g'\) does not need to be evaluated (Line 4 in both algorithms). However, the benefit depends on the number of blocks which can be skipped. In contrast, there is the drawback that matrices \(W^1\) and \(W^2\) need to be updated after every change in the solution (Lines 13 and 15, respectively), although this requires only linear effort (\(W^1_{g{\bar{g}}}\), \(W^1_{{\bar{g}}g}\), \(W^2_{g{\bar{g}}}\), \(W^2_{{\bar{g}}g}\), \(W^1_{g'{\bar{g}}}\), \(W^1_{{\bar{g}}g'}\), \(W^2_{g'{\bar{g}}}\), and \(W^2_{{\bar{g}}g'}\) are set to 1 for all \({\bar{g}} \in G\backslash \{g,g'\}\), where g and \(g'\) are the two groups with removed/added items).
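The ND idea for the insertion neighborhood can be sketched as follows. This is a simplified illustration rather than a transcription of Algorithm 3: the function names, the 0-based indexing, and the way the size bounds are checked are assumptions, and `D` is the matrix of summed diversities defined in Sect. 2.1.

```python
def nd_best_insert(D, groups, W1, lower, upper):
    """Return (delta, item, source, target) of the best improving insert, or None.

    W1[g][g2] == 0 means the block (g -> g2) is known to hold no improving insert.
    """
    best = None
    for g, items in enumerate(groups):
        if len(items) <= lower[g]:
            continue  # removing an item would violate the lower bound of g
        for g2 in range(len(groups)):
            if g2 == g or len(groups[g2]) >= upper[g2] or not W1[g][g2]:
                continue
            block_best = None
            for i in items:
                delta = D[i][g2] - D[i][g]  # value of (5)
                if delta > 0 and (block_best is None or delta > block_best[0]):
                    block_best = (delta, i, g, g2)
            if block_best is None:
                W1[g][g2] = 0  # nothing promising: skip until g or g2 changes
            elif best is None or block_best[0] > best[0]:
                best = block_best
    return best


def mark_changed(W, g):
    """Flag every block that involves group g for re-evaluation."""
    for h in range(len(W)):
        if h != g:
            W[g][h] = W[h][g] = 1


# Illustrative data: four items, two groups, D computed for some diversity matrix.
groups = [[0, 1], [2, 3]]
D = [[3, 5], [3, 7], [3, 6], [9, 6]]
W1 = [[1, 1], [1, 1]]  # diagonal entries are unused
best = nd_best_insert(D, groups, W1, lower=[1, 1], upper=[3, 3])
```

Note that a block with a promising insert keeps its flag at 1, so it is scanned again in the next call even if a move from a different block was realized.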

Moreover, the approach still has the disadvantage that a block is evaluated in an iteration and again in the next iteration if there is a promising insert/swap, but another one comprising two other groups was realized.

2.3 Efficient neighborhood decomposition implementation

We now present an improved version that overcomes this drawback. To this end, we replace matrices \(W^1\) and \(W^2\) by new three-dimensional matrices \(M^1 = (M^1_{gg'h})_{g,g' \in G, h = 1,2}\) and \(M^2 = (M^2_{gg'h})_{g,g' \in G, h = 1,2,3}\), respectively. For each pair of groups g and \(g'\) we evaluate all inserts of an item of group g into group \(g'\) (analogous to Lines 6–8 of Algorithm 3). If there is a promising one, we save the change in the objective value that the best insert would cause in \(M^1_{gg'1}\) (value of (5)) and the corresponding item number i in \(M^1_{gg'2}\). If there is no promising insert, we simply set \(M^1_{gg'1} = M^1_{gg'2} = 0\). Thus, \(M^1\) gives us the best insert of an item of group g into group \(g'\) if there is one, which shortens the neighborhood evaluation as can be seen in Lines 8–10 of Algorithm 5.

Algorithm 5 Efficient ND insertion (pseudo-code, figure e)

Whenever a change occurs in groups g or \(g'\), we need to update \(M^1_{gg'h}\) and \(M^1_{g'gh}\) analogously to the ND method. We simply set \(M^1_{gg'1} = M^1_{g'g1} = -1\), indicating that the neighborhood block containing groups g and \(g'\) needs to be re-evaluated (Line 14 of Algorithm 5). We cannot set \(M^1_{gg'1}\) and \(M^1_{g'g1}\) to 0, as 0 indicates that the neighborhood block has been evaluated, but no promising insert was found. Matrix \(M^2\) is updated analogously. If \(M^1_{gg'1} = -1\), all inserts of items of group g into group \(g'\) are evaluated and the best promising one is again saved in \(M^1_{gg'1}\) and \(M^1_{gg'2}\) (Lines 5–7).

As we can replace Lines 4–9 of Algorithm 3 by Lines 5–10 of Algorithm 5, evaluating the neighborhood is more efficient now. We only need to evaluate neighborhood blocks which have not been evaluated since the last change in their group assignment (Line 5), but these neighborhood blocks would also be evaluated in the ND implementation. Additionally, further promising neighborhood blocks might be re-evaluated in the ND implementation but are not in the efficient ND implementation.
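The caching idea can be sketched for the insertion neighborhood as follows. This is a simplified illustration under stated assumptions, not a transcription of Algorithm 5: names are hypothetical, items are 0-based (so a stored change of 0 rather than item number 0 signals that no promising insert exists), and the size bounds are checked when a block is evaluated.

```python
UNEVALUATED = -1  # block must be (re-)evaluated before its entry can be used


def efficient_nd_best_insert(D, groups, M1, lower, upper):
    """Return (delta, item, source, target) of the best improving insert, or None.

    M1[g][g2] = [delta, item] caches the best insert of block (g -> g2);
    delta == 0 means evaluated without a promising insert.
    """
    best = None
    for g, items in enumerate(groups):
        for g2 in range(len(groups)):
            if g2 == g:
                continue
            if M1[g][g2][0] == UNEVALUATED:
                entry = [0, 0]
                if len(items) > lower[g] and len(groups[g2]) < upper[g2]:
                    for i in items:
                        delta = D[i][g2] - D[i][g]  # value of (5)
                        if delta > entry[0]:
                            entry = [delta, i]
                M1[g][g2] = entry
            delta, i = M1[g][g2]
            if delta > 0 and (best is None or delta > best[0]):
                best = (delta, i, g, g2)
    return best


def invalidate(M1, g):
    """Mark every block that involves group g as unevaluated."""
    for h in range(len(M1)):
        if h != g:
            M1[g][h] = [UNEVALUATED, 0]
            M1[h][g] = [UNEVALUATED, 0]


# Illustrative data: four items, two groups, D computed for some diversity matrix.
groups = [[0, 1], [2, 3]]
D = [[3, 5], [3, 7], [3, 6], [9, 6]]
M1 = [[[UNEVALUATED, 0] for _ in groups] for _ in groups]
best = efficient_nd_best_insert(D, groups, M1, lower=[1, 1], upper=[3, 3])
```

A second call without any change in the solution only reads the cached entries, which is the case discussed for local optima in Sect. 2.4. After a move, `invalidate` would be called for both changed groups.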

For the swap neighborhood we analogously save the change in the objective value in entry \(M^2_{gg'1}\) (value of (6)) and the item i removed from group g in entry \(M^2_{gg'2}\). Additionally, we save the item j removed from group \(g'\) in entry \(M^2_{gg'3}\). Again, all three entries are 0 if there is no promising swap between groups g and \(g'\), and \(M^2_{gg'1} = -1\) if the neighborhood block has not been evaluated since the last change in the items assigned to group g or \(g'\). The swap neighborhood can then be evaluated by Algorithm 6.

Algorithm 6 Efficient ND swap (pseudo-code, figure f)

2.4 Comparison of the three implementations

Comparing the three implementations, the standard implementation fully evaluates the neighborhoods in every iteration. The ND implementation saves some computations by evaluating only those parts which might be promising but always evaluates a neighborhood block if anything has changed in the assignment of one of the two involved groups. The efficient ND implementation saves the best insert/swap which can be realized between two groups such that the neighborhood block only needs to be re-evaluated if anything changes in one of the two groups. In other words, every part of the neighborhood which needs to be evaluated in the efficient ND implementation needs also to be evaluated in the ND implementation and every part of the neighborhood which needs to be evaluated in the ND implementation needs to be evaluated in the standard implementation. Thus, we have a clear hierarchy in the efficiency of the implementations.

Concretely, we need to evaluate \(|I| \cdot (|G|-1) = (|I_1| + \ldots + |I_{|G|}|) \cdot (|G|-1)\) inserts in the standard implementation while we only need to evaluate up to \((|I_{g_1}| + |I_{g_2}|) \cdot (|G|-1)\) inserts in the efficient ND implementation, where \(g_1\) and \(g_2\) are the two groups which were changed by an insert/swap in the previous iteration. All items of these two groups (\(|I_{g_1}| + |I_{g_2}|\)) need to be reinserted into all remaining \(|G|-1\) groups. It can clearly be seen that the efficient ND implementation requires fewer evaluated inserts than the standard implementation if \(|G|>2\), and the difference grows with |G|. The ND implementation is in the best case as efficient as the efficient ND implementation, but in the worst case as inefficient as the standard implementation.

If the swap neighborhood is considered, we need to evaluate \(\sum _{g \in G} |I_g| \cdot (|I| - |I_g|)/2\) swaps in the standard implementation, i.e. for every item of a group (\(|I_g|\)) the swap with any item assigned to another group (\(|I| - |I_g|\)). As the swap of two items only needs to be evaluated once, we divide the result by two. In the efficient ND implementation, we only need to update swaps of the items assigned to a group g with the items assigned to all other groups if group g was changed in the previous iteration. As in one iteration at most two groups \(g_1\) and \(g_2\) are changed, we need to evaluate at most \(\sum _{g \in \{ g_1,g_2\}} |I_g| \cdot (|I| - |I_g|) - |I_{g_1}| \cdot |I_{g_2}|\) swaps, where \(|I_{g_1}| \cdot |I_{g_2}|\) subtracts the swaps between the two groups \(g_1\) and \(g_2\) which are otherwise counted twice. Again, the efficient ND implementation is more efficient than the standard implementation if \(|G|>2\), and the difference grows with |G|. Moreover, the ND implementation is again in the best case as efficient as the efficient ND implementation, but in the worst case as inefficient as the standard implementation.
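To get a feeling for the size of these counts, the expressions stated above can be evaluated numerically. The sketch below simply implements the four formulas; the function names are illustrative, and the example of 24 equal-sized groups with 40 items each is an assumption chosen to resemble the largest benchmark size.

```python
def standard_inserts(sizes):
    """|I| * (|G| - 1) inserts in the standard implementation."""
    return sum(sizes) * (len(sizes) - 1)


def efficient_inserts(sizes, g1, g2):
    """Upper bound (|I_g1| + |I_g2|) * (|G| - 1) for the efficient ND implementation."""
    return (sizes[g1] + sizes[g2]) * (len(sizes) - 1)


def standard_swaps(sizes):
    """Sum over all groups of |I_g| * (|I| - |I_g|), divided by two."""
    n = sum(sizes)
    return sum(s * (n - s) for s in sizes) // 2


def efficient_swaps(sizes, g1, g2):
    """Swaps involving g1 or g2, with swaps between g1 and g2 counted only once."""
    n = sum(sizes)
    return (sizes[g1] * (n - sizes[g1]) + sizes[g2] * (n - sizes[g2])
            - sizes[g1] * sizes[g2])


sizes = [40] * 24  # assumed example: 960 items in 24 equal-sized groups
counts = {
    "standard_inserts": standard_inserts(sizes),
    "efficient_inserts": efficient_inserts(sizes, 0, 1),
    "standard_swaps": standard_swaps(sizes),
    "efficient_swaps": efficient_swaps(sizes, 0, 1),
}
```

For this assumed example, the formulas give 22,080 versus at most 1,840 inserts and 441,600 versus at most 72,000 swaps per neighborhood evaluation.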

The difference is even clearer if one of the neighborhoods is called without any change in the group assignment since the last call of the algorithm. Then, matrices \(M^1\) and \(M^2\), respectively, are up to date such that the efficient ND implementation does not need to re-evaluate any part of the neighborhood. It only has an effort of up to \(|G| \cdot (|G|-1)\) to find the best neighboring solution. This is clearly less effort than in the standard implementation, which requires \(|I| \cdot (|G| - 1)\) for the insertion neighborhood and \(|I|^2\) for the swap neighborhood. We are certainly in this situation if we are in a local optimum for both neighborhoods. Then, we need to re-evaluate both neighborhoods before we know that we are in a local optimum, but for the one leading to the locally optimal solution nothing has changed since the last call.

Of course, we do not want to stay in a local optimum. To escape from it, we introduce a variable neighborhood search (VNS) based framework with perturbation in the next section, which is used to evaluate the introduced neighborhood implementations in the computational study.

3 Variable neighborhood search framework

Before we introduce the overall framework, we first introduce the perturbation methods used to avoid getting stuck in locally optimal solutions. We use the weak and strong perturbation methods presented in Lai and Hao (2016). They are presented in Algorithms 7 and 8. As in Lai and Hao (2016) we set \(\eta _w = 3\) and \(\eta _s = \Theta \cdot |I|/|G|\), where \(\Theta = 1\) if \(|I| \le 400\) and 1.5 otherwise. In Algorithm 7, we determine 5 solutions in the for-loop starting in Line 4. Lai and Hao (2016) determined |I| solutions in the for-loop. However, in our preliminary evaluations this resulted in a very significant share of the time being spent on the weak perturbation. Therefore, we decreased the value such that the algorithm spends much more time on the neighborhood evaluation and we can perform clearly more iterations.

Algorithm 7 Weak perturbation (pseudo-code, figure g)

Algorithm 8 Strong perturbation (pseudo-code, figure h)

With the weak and the strong perturbation we have all components for the variable neighborhood search framework in Algorithm 9.

Algorithm 9 VNS framework (pseudo-code, figure i)

Algorithm 9 summarizes the entire procedure. The algorithm starts with an initial solution. In line with the literature (Lai & Hao, 2016), we add uniformly \(l_g\) items to all groups g in the first step before the remaining items are uniformly distributed over all groups such that no group exceeds its capacity \(u_g\). Then, we improve this solution with the swap neighborhood (Algorithm 6) until no improvement is found any more. We do this ten times, i.e. generate ten initial solutions. The best one of them is used in Line 1 of Algorithm 9 as initial solution.
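The random construction of an initial solution described above might look like the following sketch. It assumes that the bounds satisfy \(\sum _g l_g \le |I| \le \sum _g u_g\) and interprets "uniformly" as uniform random choice. The function name and the use of Python's `random.Random` are illustrative, and the subsequent improvement with the swap neighborhood is omitted.

```python
import random


def random_initial_solution(num_items, lower, upper, rng):
    """Assign l_g random items to every group, then spread the rest within u_g."""
    items = list(range(num_items))
    rng.shuffle(items)
    y = [None] * num_items
    sizes = [0] * len(lower)
    pos = 0
    # Step 1: fill every group g up to its lower bound l_g.
    for g, l in enumerate(lower):
        for i in items[pos:pos + l]:
            y[i] = g
        sizes[g] = l
        pos += l
    # Step 2: assign each remaining item to a random group with free capacity.
    for i in items[pos:]:
        open_groups = [g for g in range(len(upper)) if sizes[g] < upper[g]]
        g = rng.choice(open_groups)
        y[i] = g
        sizes[g] += 1
    return y


rng = random.Random(0)  # fixed seed so that runs are reproducible
y0 = random_initial_solution(10, lower=[2, 2, 2], upper=[4, 4, 4], rng=rng)
```

Passing the random number generator explicitly makes it straightforward to start all three implementations from the same seed.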

The rest of the algorithm also closely follows the procedure by Lai and Hao (2016). The core of the algorithm comprises two while-loops. The outer one (Lines 4–31) controls the strong perturbation (Line 30). Strong perturbation is performed if no improved solution was found for \(\alpha \) iterations of the inner while-loop (objective_value_weak; Lines 23–26). The value of \(\alpha \) is set to 5 if \(|I| \le 400\) or \(|I|/|G| \le 10\). Otherwise, it is set to 3 (compare Lai and Hao (2016)). In the inner while-loop, we call the insertion and the swap neighborhood alternately (Lines 8–14). If the solution has not improved after calling both (Lines 7, 15–17), i.e. if we are in a local optimum of both neighborhoods, we perform a weak perturbation (Lines 22–28). To avoid getting stuck in local optima, we first perform a weak perturbation to still evaluate the nearer environment of the current solution (intensification). If we could not find a better solution for \(\alpha \) runs of the weak perturbation procedure, the strong perturbation procedure is executed to reach different areas of the solution space (diversification). Whenever a new best solution is found, it is saved as \(y_{best}\) in Lines 19–21. When the algorithm terminates after the time limit is reached, the best solution \(y_{best}\) is returned (Line 32).
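The parameter settings stated in this section can be collected in one small helper. The function name is hypothetical, and since the text does not say how \(\eta _s\) is rounded, the sketch returns the unrounded value.

```python
def perturbation_parameters(num_items, num_groups):
    """Return (eta_w, eta_s, alpha) following the settings stated in the text."""
    theta = 1.0 if num_items <= 400 else 1.5
    eta_w = 3                                # weak perturbation strength
    eta_s = theta * num_items / num_groups   # strong perturbation strength (unrounded)
    # alpha: number of non-improving weak perturbations before a strong one.
    alpha = 5 if num_items <= 400 or num_items / num_groups <= 10 else 3
    return eta_w, eta_s, alpha


params_small = perturbation_parameters(240, 12)
params_large = perturbation_parameters(960, 24)
```

For 960 items in 24 groups, for instance, this yields \(\eta _s = 1.5 \cdot 40 = 60\) and \(\alpha = 3\).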

In line with the aim of the paper to evaluate the different implementations of the insertion and the swap neighborhood, Algorithm 9 is a basic variable neighborhood search evaluating both neighborhoods alternately, as is done in Lai and Hao (2016). In contrast to their paper, we fully evaluate the neighborhoods and accept the best solution found instead of accepting every improvement. If we accepted every improved solution, matrices \(W^1\), \(W^2\), \(M^1\), and \(M^2\) would be updated more often, meaning that the ND and efficient ND implementations could not fully use their advantage. Thus, we would not be able to evaluate their full potential.

4 Computational study

In this section, we evaluate the different implementations of the insertion and the swap neighborhood. All algorithms were implemented in C++. The code was executed on a single AMD EPYC 7542 32-core processor with 2.90 GHz. All three implementations (call Algorithms 1, 3, and 5 in Line 9 and Algorithms 2, 4, and 6 in Line 12 of Algorithm 9, respectively) were evaluated on the three standard benchmark sets Geo, RanInt, and RanReal, each containing 160 instances of which half have equal-sized (ss) groups and half do not (ds). The number following the letter n in the instance's name indicates the number of items considered. In the Geo set, diversities are Euclidean distances between pairs of points with random coordinates in the interval [0,10]. In the RanReal set, diversities are uniformly generated real numbers in (0,100). In the RanInt set, diversities are uniformly generated integers in the interval [0,100]. The instances are available under the following link: https://grafo.etsii.urjc.es/optsicom/mdgp.html. For a single run the computation time was set to 3s for \(n \le 120\), 20s for \(n = 240\), 120s for \(n = 480\), and 600s for \(n = 960\). We conducted 20 runs with different seeds for all three implementations. The code for the efficient ND implementation is available under the following link: http://doi.org/10.25592/uhhfdm.14613. The results presented in the following are average values over all 20 runs.

The aim of the computational study is to evaluate the different implementations of the insertion and the swap neighborhood. As the focus of our algorithm is on the evaluation of the neighborhood implementations, it is not as advanced as other approaches in the literature (compare Yang et al. (2022)). Moreover, the different implementations were always started with the same seed such that all of them conduct the same search. This means that an implementation leading to a larger number of executed iterations cannot lead to a worse objective value. Taken together, objective values are of minor interest for this study. We therefore do not report them in detail. However, note that the best objective value we found for an instance is on average 0.3% worse than the best found in the study by Yang et al. (2022). The maximum difference was 0.85%.

Table 1 Comparison of number of iterations

As mentioned above, given a seed, the search is the same for all three implementations. Thus, a higher number of iterations cannot lead to a worse objective value but gives the chance to evaluate new and possibly better solutions. Hence, it is desirable to increase the number of performed iterations. Table 1 presents the average number of iterations over all 20 runs and all 10 instances of the instance type performed by the three implementations as well as the percentage increase if ND was superior in comparison to standard and if efficient ND was superior in comparison to one of the other two. The results show that the standard implementation is superior for smaller instance sizes of up to 60 items while the ND implementation is superior for medium-sized instances with unequal group sizes. Finally, the efficient ND implementation is clearly superior for the large instances.

One reason for the good performance of the standard implementation on small instances is that a smaller number of items goes along with a smaller number of groups such that the other two implementations cannot use the full potential of their advantage. As an example, consider the smallest instances with ten items. They only have two groups. Thus, any insert or swap always changes both groups and all groups have to be evaluated in all three implementations. Moreover, matrices \(W^1\), \(W^2\), \(M^1\), \(M^2\) need to be updated, which leads to more memory accesses for the ND and the efficient ND implementation. More memory accesses are also a reason for the better performance of ND in comparison to efficient ND for the smaller instances. For larger instances with a higher number of groups, the advantage of having fewer parts of the neighborhoods to evaluate outweighs the disadvantage of more memory accesses. Thus, ND and efficient ND clearly outperform standard. Moreover, the advantage of efficient ND to only evaluate a neighborhood block once until a change is made becomes relevant such that efficient ND clearly outperforms ND. This results in an increase of performed iterations of up to 170% in comparison to standard and up to 76% in comparison to ND.

Table 2 Evaluation of neighborhood evaluations

The extent of skipped neighborhood blocks is shown in Table 2. For smaller instances only a small number of neighborhood blocks can be skipped without evaluation due to the small number of groups. If we consider the example of the smallest instances with ten items and two groups again, we can only skip them in an iteration where no improvement was found, i.e. if we are in a local optimum. In contrast, for the instances with 960 items assigned to 24 groups only 45 of the 276 neighborhood blocks need to be evaluated after a change. If we use the ND implementation, we moreover need to evaluate all of the others if there is a promising insert/swap. Consequently, the share of skipped block evaluations increases to more than 82% for the swap neighborhood and over 70% for the insertion neighborhood if we use the efficient ND implementation.
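The two block counts quoted for the instances with 24 groups can be reproduced by a short calculation, assuming that a block is counted as an unordered pair of groups (as for the swap neighborhood) and that exactly two groups \(g_1\) and \(g_2\) were changed. With \({\bar{g}}\) denoting any of the 22 unchanged groups:

$$\begin{aligned} \binom{24}{2} = \frac{24 \cdot 23}{2} = 276, \qquad \underbrace{22}_{\{g_1,{\bar{g}}\}} + \underbrace{22}_{\{g_2,{\bar{g}}\}} + \underbrace{1}_{\{g_1,g_2\}} = 45. \end{aligned}$$

Under these assumptions, 231 of the 276 blocks, i.e. roughly 84%, are untouched by a single move.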

The values are smaller if the swap as well as the insertion neighborhood are used, i.e. for the instances with unequal-sized groups. One reason is that both neighborhoods are evaluated alternately. Thus, up to four groups were changed instead of up to two before the same neighborhood is evaluated next (can be more if perturbation is executed meanwhile).

Table 3 Evaluation of algorithm design

Table 3 shows the success rates of the two neighborhoods as well as the average number of iterations between two consecutive calls of the perturbation algorithms 7 and 8, respectively. The success rate is the percentage of calls of the neighborhood in which an improved solution was found. It is not surprising that more iterations were performed before the algorithm reaches a local optimum, i.e. perturbation is required, if the instance size is larger. Thus, the success rates of both neighborhoods increase with the instance size. Especially the swap neighborhood with a success rate of up to 91% is very effective.

The insertion neighborhood reaches a success rate of only about 50%. Thus, there are more iterations without any change in the current solution, which explains why the ND implementation is more effective for instances with unequal group sizes. If an iteration is unsuccessful, no promising neighborhood block exists. This also means that in the next call of the insertion neighborhood only those neighborhood blocks need to be evaluated which were changed due to a swap in the swap neighborhood which was called in the meantime. Thus, ND is as effective as efficient ND and clearly superior to standard in this case. Our implementation with the alternate calls of both neighborhoods follows Lai and Hao (2016). The low success rate of the insertion neighborhood, however, might be an argument to follow another policy in future approaches.

5 Conclusion

The paper compares the three implementations called standard, ND, and efficient ND of the insertion and the swap neighborhood for the MDGP. The efficient ND implementation is newly introduced and based on the ND implementation. Both implementations use the idea that the neighborhoods can be divided into independent blocks containing the inserts/swaps between two groups. While the ND implementation re-evaluates a block whenever it contains a promising, i.e. improving, insert/swap, the efficient ND implementation evaluates each block only if the assignment to at least one of its groups has changed.

All three neighborhood implementations were compared in an extensive computational study on the classical benchmark sets. The results show that the standard implementation is superior for small instance sizes while the efficient ND implementation is superior for large instance sizes. The ND implementation performed best for medium-sized instances with unequal group sizes. One reason is that the insertion neighborhood found an improved solution in only around half of the cases. For the large instances the efficient ND implementation could perform up to 160% more iterations in comparison to standard and up to 76% more iterations in comparison to ND.

Our results lead to several directions for future research. First, the new neighborhood implementation can be used for other grouping or clustering problems. As mentioned in the introduction, these include problems like the capacitated clustering problem or the capacitated p-median problem as well as the ratio cut and normalized cut graph partitioning problem, vehicle routing problems, or parallel machine scheduling (Yalaoui & Chu, 2002). Second, we found that the insertion neighborhood has a success rate of only about 50% if both neighborhoods, insertion and swap, are called alternately. Future research could evaluate other policies to increase the neighborhoods' success rates. Finally, future research can extend our approach by using the matrices \(M^1\) and \(M^2\) to determine more than one promising insert/swap per iteration. The matrices indicate the best insert/swap between two groups. As long as group pairs are disjoint, one could also realize additional neighborhood moves in the same iteration. The best selection can be determined by a maximum weight matching.